Adaptation: The Missing Layer Between Apps and Foundation Models

Sudip Roy on Last-Mile Reliability, Adaptation, Gradient-Free Tuning, and Enterprise AI.

Subscribe: Apple • Spotify • Overcast • Pocket Casts • YouTube • AntennaPod • Podcast Addict • Amazon • RSS.

Ben Lorica talks with Sudip Roy (Co-founder & CTO, Adaption Labs) about why enterprise AI adoption stalls in the “last 5%” of reliability — and why waiting for the next frontier model release is usually the wrong bet. They unpack “adaptation” as something broader than post-training, including gradient-free, inference-time techniques that can sit above models to route, combine, and continuously improve behavior. The conversation also covers proportional compute allocation (so simple tasks don’t trigger expensive reasoning), how adaptation intersects with RAG/agents, and what transparency, compliance, and observability should look like in an adaptive stack.

Subscribe to the Gradient Flow Newsletter






Transcript

Below is a polished and edited transcript.

Ben Lorica: All right, today we have Sudip Roy. He is the co-founder and CTO at Adaption Labs. That’s Adaption with an “A,” so Adaptionlabs.ai. Their taglines are: “The days of monolithic AI are over,” “Most AI is frozen in place; it doesn’t adapt,” and “People around the world are looking for AI that adapts to them, not the other way around.” With that, Sudip, welcome to the podcast.

Sudip Roy: Thank you, Ben, for having me. I’m looking forward to an exciting conversation.

Ben Lorica: You are a seasoned veteran in this highly hyped-up AI space, with battle scars from Google to Cohere. What did you see firsthand that convinced you that this “bigger is better” approach, loosely called scaling, was hitting a wall?

Sudip Roy: Having worked with many enterprises, we noticed that people still fail in their AI adoption journey because AI fails in the last 5% of cases. This is somewhat maskable in the consumer space, but in enterprises, that last-mile reliability is what blocks them from productizing AI.

They attempt to solve it through rigorous prompt tuning, where they spend hundreds of hours, but those prompt structures are very specific to a particular model or version. The alternative is fine-tuning, which requires high-quality data and is an expensive, slow process that can take weeks or months. Across all these themes is the desire to exercise more control, but the unit cost of adaptation—both in terms of expense and time—is very high. It takes weeks to months depending on whether you intervene at the pre-training or post-training phase.

That is fundamentally the problem we are solving at Adaption Labs: lowering the unit cost of adaptation significantly so it can happen seamlessly. We are betting on gradient-free approaches—inference-time strategies that enable users to exercise control and bridge that last-mile quality gap.

Ben Lorica: Let me drill down on that. You mentioned that last mile—the additional 5% of cases where the application doesn’t quite work. The problem is that if you knew which 5% of requests would fail, you could deploy around them—but you don’t know which ones they are.

Sudip Roy: Exactly. That is one aspect of the lack of adaptability. There are two others. Second, over time, the distribution of the workload you send to the AI might change, and you want the AI to seamlessly adapt to those changes. Third is the proportional allocation of compute based on task complexity. Right now, the dominant way of deploying AI is to use one big model for everything, regardless of how complex the task is. We need to move toward a world where compute allocation is proportional to the complexity of the task. These are the three elements of control we want to provide.

Ben Lorica: Regarding scaling laws, some might think, “I can’t deploy now because it doesn’t work 5% of the time, but maybe next month the foundation model providers will improve it.” Is that “throwing it over the wall” strategy a good one?

Sudip Roy: There is a general consensus that the publicly available data used to scale these larger models is almost exhausted. Much of the remaining high-value data is locked within enterprises. To leverage that data for last-mile customization, you need AI that continually learns in a secure environment.

Waiting for frontier labs to produce the next version of a model is reaching a point of diminishing returns, especially considering the cost and compute required for marginal quality improvements. There are alternative ways to bridge that quality gap seamlessly without waiting six months for a new model.

Ben Lorica: I had a hard time pronouncing the company name, but “Adaption” is actually a synonym for “adaptation,” right?

Sudip Roy: Yes, Adaption is a legitimate word and it means the same thing as adaptation.

Ben Lorica: Let’s drill down on that word. People often use the term “post-training.” Is adaptation just post-training renamed?

Sudip Roy: No, I wouldn’t say that. Post-training is a technique, or a collection of techniques like fine-tuning, RLHF, and distillation. Post-training usually implies an approach requiring gradient updates, where model weights are the central artifact you are trying to update.

Our approach is broader. While we explore post-training, we believe a broader set of gradient-free approaches can be quite powerful. The advantage of gradient-free approaches is that they can be implemented very fast to provide a much more interactive experience.

Ben Lorica: Can you give our listeners an intuition for why these gradient-free approaches work?

Sudip Roy: In simple language, gradient-free methods allow for interactivity. A user can try something, and the system can change its behavior in the next moment based on instructions.

Foundation models have their own “personalities” and specific ways of being instructed. A prompt written for one model doesn’t necessarily transfer to another. By using gradient-free approaches, we remove that complexity from the user and move it into an algorithmic layer that sits on top of one or more models. That layer can choose the appropriate model, use a combination of models, or even dynamically merge models to accomplish a task. It sits between the user and the models to handle the complexity.
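The layer Sudip describes can be pictured, very loosely, as a thin piece of logic between the caller and a pool of models. The sketch below is purely illustrative: the stub models, the scoring function, and the multiplicative-weight update are our assumptions, not Adaption Labs’ actual techniques. It samples candidates from several models, picks the best under a task-specific score, and folds user feedback back in without any gradient updates.

```python
def model_a(prompt):
    # Stand-in for one foundation model with its own "personality".
    return f"A:{prompt.upper()}"

def model_b(prompt):
    # Stand-in for a second model that behaves differently on the same prompt.
    return f"B:{prompt[::-1]}"

class AdaptiveLayer:
    """Sits between the user and several models: samples candidates,
    scores them for the task at hand, and learns per-model preferences
    from feedback -- all without touching model weights."""

    def __init__(self, models):
        self.models = models
        # Per-model preference weights, updated gradient-free.
        self.weights = {name: 1.0 for name in models}

    def generate(self, prompt, score_fn):
        # Query every candidate model, then pick the output that scores
        # best under the task-specific score, biased by learned weights.
        outputs = {name: m(prompt) for name, m in self.models.items()}
        best = max(outputs, key=lambda n: score_fn(outputs[n]) * self.weights[n])
        return best, outputs[best]

    def feedback(self, model_name, reward):
        # Multiplicative-weights style update: a thumbs-up makes this
        # model more likely to be chosen next time. No gradients needed.
        self.weights[model_name] *= (1.0 + reward)

layer = AdaptiveLayer({"a": model_a, "b": model_b})
# Toy score: prefer outputs with more uppercase characters.
name, out = layer.generate("hello", score_fn=lambda s: sum(c.isupper() for c in s))
layer.feedback(name, reward=0.5)  # user feedback shifts future routing
```

Because the selection and the update both happen at inference time, the behavior can change from one request to the next—the interactivity Sudip contrasts with weeks-long fine-tuning cycles.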

Ben Lorica: It sounds like a combination of routing, orchestration, and aggregation.

Sudip Roy: Those are elements of it, but there is a much richer set of techniques beyond those static methods, including techniques that allow for continuous learning over time.

Ben Lorica: In post-training, you usually need access to model weights. What kind of access do you need for adaptation, especially if the model is proprietary?

Sudip Roy: We want to move away from relying solely on proprietary models for that reason. An average enterprise today uses at least three models and usually has a gateway to route between providers because they run into walls with black-box models.

Adaptation makes it easier to abstract that away. You don’t necessarily need access to the weights themselves. There are many algorithmic interventions possible without direct weight access. Having access to weights opens up certain strategies, but weight access is largely required for gradient updates rather than gradient-free methods.

Ben Lorica: So, you can do some adaptation even if the model is proprietary, though open weights are better. Regarding cost, fine-tuning is expensive because of the dataset, and RLHF is complicated and compute-intensive. Where does adaptation fall in terms of cost and complexity?

Sudip Roy: The cost is fairly low because it happens at inference time—it’s happening in real-time rather than through a separate training process. In terms of complexity, while the underlying techniques may be complex, our products will abstract that away so the end user doesn’t have to worry about it.

Ben Lorica: In fine-tuning, the user knows they need to collect sample data. In the adaptation case, what does the user need to do?

Sudip Roy: You should focus on what you want the AI to accomplish and let the system figure out how to deliver it.

Ben Lorica: Do I need to give the system feedback?

Sudip Roy: Yes, it can certainly learn from feedback. This is why one of our pillars is “adaptive interfaces.” We believe the right interface is crucial for gathering feedback that can be folded back into the algorithmic layer for continuous improvement.

Our three pillars are: Adaptive Data (solving the lack of data for fine-tuning), Adaptable Intelligence (continuous learning), and Adaptive Interfaces (seamless feedback loops).

Ben Lorica: Adaption Labs launched recently. If we fast-forward 12 months, how would a non-programmer use your product to improve an AI task?

Sudip Roy: We will be releasing products soon, initially targeted toward developers. The first product focuses on “adaptive data”—helping users synthesize or improve data quality for post-training. Enterprises will be able to leverage this to produce high-quality data for many different needs. Later, we will release products that are more seamlessly usable by a consumer audience as well.

Ben Lorica: Can you give a concrete example of how the data product works?

Sudip Roy: We haven’t launched it yet, so I’ll hold back on specific details, but it is coming soon. It follows the spirit of helping you get from, say, 100 examples to the thousands needed for high-quality fine-tuning.
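As a purely generic illustration of that “100 examples to thousands” idea—not the unreleased product, whose details Sudip is holding back—a small seed set can be crossed with rewrite templates to multiply it into a larger candidate pool, which a real system would then generate with an LLM and filter for quality. The `expand` helper and templates here are hypothetical:

```python
import itertools

def expand(seed_examples, templates):
    """Cross each seed (instruction, answer) pair with rewrite templates
    to multiply a small seed set into a larger candidate pool. A real
    pipeline would paraphrase with an LLM and filter low-quality rows."""
    expanded = []
    for (instruction, answer), template in itertools.product(seed_examples, templates):
        expanded.append((template.format(instruction), answer))
    return expanded

seeds = [("summarize this ticket", "SUMMARY"), ("classify the intent", "LABEL")]
templates = ["{}", "Please {}", "{} briefly", "As a support agent, {}"]
pool = expand(seeds, templates)
print(len(pool))  # 2 seeds x 4 templates = 8 candidates
```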

Ben Lorica: How much of your work will be open source?

Sudip Roy: We are building products that encapsulate these technologies, and we aren’t looking to open source the immediate roadmap. However, we are a frontier AI lab and take pride in our research. We will share our findings with the community through blog posts and research papers.

Ben Lorica: You could also provide an API for adaptation, similar to how OpenAI is the default for inference or other labs are becoming popular for fine-tuning.

Sudip Roy: Yes, we definitely want to get feedback from users, and if an API is what they want, we’d be happy to provide that.

Ben Lorica: What was the moment you decided to build this yourselves? You could have done this at your previous companies. Why start a new company and chase customers and developers from scratch?

Sudip Roy: Both Sarah [co-founder] and I have always been inspired by challenging problems. This requires a single-minded focus that is hard to maintain within a larger organization with established focus areas. We wanted to create that space. We’ve built a great team aligned with this mission, and we are full steam ahead.

Ben Lorica: You’ve seen the challenges enterprises face with RAG and agents. How does adaptation help an enterprise already deep into those technologies?

Sudip Roy: We are complementary to the ecosystem. RAG attempts to solve the context problem by getting the right information to the model. Agents try to make actionable decisions, but the “brain” is still the model layer. Our focus is on making that modeling layer better and tying it to the objective you want to accomplish, regardless of whether that objective is set by an agent or an application layer.

Ben Lorica: You benefit from a healthy ecosystem of open-weights models. Currently, many of the best open-weights models are Chinese, which can be a “no-fly zone” for some enterprises or defense companies. Where will we get high-quality open-weights models that aren’t from China?

Sudip Roy: It’s an interesting question. There is a lot of emphasis on open weights in Europe and elsewhere. While some of the best today are from China, hopefully that changes. Furthermore, gradient-free approaches are not necessarily tied to model weights. If weight access ever becomes a fundamental limitation for us, we have the capability to build our own foundation models from the ground up.

Ben Lorica: That sounds expensive.

Sudip Roy: It is, which is why it’s not the first thing we want to do, but it is an option if it makes sense.

Ben Lorica: Reasoning and multimodal models are great, but the token economics are tough. People use routing to ensure they only use reasoning when necessary. I assume adaptation helps save money here?

Sudip Roy: Yes. One dimension of adaptation is the proportional allocation of compute. If I’m writing a one-paragraph copy, I shouldn’t trigger a reasoning model that thinks for two minutes. Adaptation abstracts that decision away from the user, taking care of the compute allocation based on task complexity.
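That routing decision can be sketched in a few lines. Everything below is a hypothetical toy—the complexity heuristic, threshold, and model tiers are our assumptions, not a production router: a cheap complexity estimate gates whether a request ever reaches the expensive reasoning model.

```python
def small_model(prompt):
    # Stand-in for a fast, cheap model for simple tasks.
    return f"small:{prompt}"

def reasoning_model(prompt):
    # Stand-in for an expensive model that "thinks" for a long time.
    return f"reasoning:{prompt}"

def estimate_complexity(prompt):
    # Cheap proxy: longer prompts and reasoning cues earn more compute.
    cues = ("why", "prove", "plan", "step by step")
    score = len(prompt.split()) / 50.0
    score += sum(cue in prompt.lower() for cue in cues)
    return score

def route(prompt, threshold=1.0):
    """Allocate compute proportional to task complexity: only requests
    above the threshold reach the expensive reasoning model."""
    if estimate_complexity(prompt) >= threshold:
        return reasoning_model(prompt)
    return small_model(prompt)

print(route("Write a one-paragraph product blurb."))
print(route("Prove this invariant and plan a migration step by step."))
```

The one-paragraph copywriting request stays on the small model, while the proof-and-planning request crosses the threshold—the proportional allocation Sudip describes, abstracted away from the user.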

Ben Lorica: Foundation models and routers are often black boxes. How does adaptation impact transparency, diagnostics, and compliance?

Sudip Roy: Proprietary models are black boxes, but the open ecosystem is different. Adaptation solves a user problem by abstracting complexity. Regarding regulated industries, we are very aware of the need for control.

Trust can be solved through deployment strategies—if no information leaves the enterprise perimeter while the model improves, the trust problem is partly solved. Control is solved by enabling users to have more power through the right knobs at the right level of abstraction.

Ben Lorica: Will the system be able to tell me, “I used a smaller model here, but a more expensive one there”?

Sudip Roy: The idea is that you shouldn’t have to worry about it. However, if the system makes a wrong choice, there should be a feedback loop for correction. Visibility and diagnostics are important for trust, and that can be solved from a product perspective by exposing the appropriate internals.

Ben Lorica: You mentioned “adaptive interfaces.” Is that something other than a chatbot?

Sudip Roy: It is. The thesis is that different tasks require different interfaces. We want to present information in the manner most appropriate to the task. That interface is also the point where users provide feedback for continuous learning.

Ben Lorica: If the system evolves and improves continuously, does that mean I don’t need to invest as much in monitoring and observability?

Sudip Roy: Those are orthogonal concerns. You still want to monitor and observe the system. We view our work as complementary to the companies solving observability and evaluation, and we will integrate with those partners.

Ben Lorica: Are there practical use cases of adaptation in production today, perhaps inside places like Google?

Sudip Roy: Yes, some of these techniques are already implemented within frontier research labs, often behind APIs. They aren’t yet available to a broader class of users. Part of our mission is to productize a much richer set of these techniques for a wider audience.

Ben Lorica: Most people think that when they type a prompt, there is some routing happening. You’re saying it’s more than just routing?

Sudip Roy: Some labs are doing forms of adaptation, but there is a much richer set of techniques and a lot of “green field” for innovation that can improve both performance and cost.

Ben Lorica: What about memory? There is chat memory, but also “operational memory” in enterprises—like a software engineering runbook that worked before. Will adaptation take advantage of that?

Sudip Roy: There is a notion of memory there, but we aren’t a substitute for RAG, which has its place for indexing massive corpora. Many of these things are on our future roadmap as we move into our second and third pillars.

Ben Lorica: I hear the company is hiring like crazy.

Sudip Roy: Hopefully not “crazy,” but yes, we are hiring.

Ben Lorica: Are these positions remote, or do they need to be in San Francisco?

Sudip Roy: We want to be a global-first company. We realize the need for adaptation comes from diverse languages and cultural contexts, so it’s important to have a diverse workforce. Most of our open positions are “work from anywhere.” We are based in San Francisco and will have a presence here, but we don’t want to limit ourselves to Bay Area talent.

Ben Lorica: And with that, thank you, Sudip.

Sudip Roy: Thank you so much for having me.