Introducing: PMF /evals ◎✨ + the wedge in AI support with RunLLM
A new newsletter /evaluating how AI-native products find PMF. Ft. Vikram Sreekanti, co-founder of RunLLM, in edition #1.
Editor’s note: The SaaS Baton is now PMF /evals. A change driven by what’s next: AI-native PMF, examined with the same rigor you’ve come to expect. (The “from” on this email says “The SaaS Baton” one last time; please update your contacts to PMF /evals :))
“For me there was a visceral sense of your thing taking on a life of its own and lurching forward, like getting pressed into the back of your seat by a fast car or a plane taking off.” — Drew Houston, Dropbox
Product-market fit used to feel a certain way.
Forever ambiguous, but it had some familiar edges: usage spikes in unexpected pockets of users, ballooning waitlists, overzealous bug reports sweeping Slack. Some math. Some signals. Something.
Not anymore, though. AI-native products are hitting record ARR with impossibly lean teams. Entire enterprise workflows have collapsed into simple prompts.
Pricing models (not just price points) now shift mid-flight. Savvy, not-new-to-disruption incumbents are layering new S-curves on top of old moats. And the LLM makers you build on? They’re launching apps that look a lot like yours.
It’s a fascinating flux state. ◎
So at Chargebee, we decided to do something about it. Because we’ve always loved telling builder stories, and you’ve been generous in loving them back (Thanks, as ever, for supporting Relay!).
Introducing PMF /evals ◎✨
a bi-weekly newsletter where we /evaluate AI-native PMF through first-hand stories from early-stage founders, operators scaling AI-native second acts, and investors watching the world reshape itself around this tech, plus some research and curation from us.
Here we will document what PMF really looks like today: the impossibly-hard-to-pin-down market pull, the enthralling new logic of monetization, the strange new shapes this thing is taking.
The aim? To surface truths that help the ecosystem make better bets; a second brain for everyone navigating this shift.
Subscribe if you’re building, backing, or just trying to keep up!
In this edition 1️⃣ we feature Vikram Sreekanti, the co-founder of RunLLM, an AI support startup trusted by teams at MotherDuck, Databricks, and LlamaIndex.
A software engineer and researcher with a PhD in Computer Science from UC Berkeley, he brings a builder’s eye — and a realist’s skepticism — to one of the most hyped categories in AI. Vikram also documents his observations in real time on his Substack here.
Here is a TL;DR of what surfaced:
why enterprises won’t put just any chatbot in front of a 7-figure deal,
why you can’t bypass integrations-building if you are selling to enterprises,
that time when someone said usage-based pricing has bad incentives,
how incumbents are tripping over the “least common denominator”, and more.
Vikram’s words, lightly edited for clarity and flow, appear below. ⤵
#1:
AI for support is one of the hottest areas in enterprise. Look at the news: Decagon recently raised at a $1.5 billion valuation.
What people miss, though, is that most of the focus so far has been on high-volume, low-expertise support, the kind that would typically be outsourced to a high-turnover, unreliable call center.
We aren’t doing that. We are doing B2B enterprise support.
Even though the market calls it all “support”, enterprise support is distinctly different in nature.
You can’t do support for a seven-figure customer buying a technical product the way you would do support for an e-commerce product.
If you look at some of the other, larger support products, I think they pretty explicitly tell customers that they're not oriented toward the kind of advanced-knowledge support that a more technical product is going to require.
That's something we have heard from the companies we talk to, all the way from Series B to growth stage to public companies.
They have evaluated a bunch of products, but none of them really hit the mark in terms of being able to handle the complexity and the depth of the product that B2B enterprises are selling. This is what drives us.
#2:
All of us are technical people, so building the core product is a fun thing for us to do. But in building for enterprises, we’ve learned that in addition to the core use case, we also need to build for scale, maturity, and customer workflows.
There are a lot of cool AI things to build: figuring out how to ask thoughtful questions of our end users, reorienting them if needed, helping them get unstuck, doing all the things that a good support engineer would do. Those are the magic bits of building with AI.
But there's also stuff we need to do that's not cool or sexy or fun, like building a deployment for Salesforce, Zendesk, Intercom, and the 25 other support tools that people use, because this is what it takes to build a good enterprise-oriented product.
We do focus a lot on answer quality in general because we know that our customers aren't going to put us out in front of their customers unless they really trust us with the actual answers.
And then over time we've just been sort of slowly knocking down yet another integration, yet another connector, whatever it is.
#3:
We don't really view the Decagons and the Sierras of the world as competition. We're not coming across them in customer conversations.
Their support workflows are very action-oriented. They're trying to issue an order refund, send a password reset, rebook a flight, those types of actions. Our support is very knowledge-oriented.
In terms of competition, the closest is probably the incumbents: Zendesk AI, Intercom’s Fin, Salesforce’s Agentforce, etc.
They certainly have the data advantage because they have decades of support tickets that they have answered. But from what we've heard and seen from customers, they don't have the expertise and incentives to go after the highly technical B2B support challenges.
I'll give you one funny anecdote. We were on a call with two Directors of Support at a Series C company that sells a database. They're Salesforce customers and they did a demo with the Agentforce team to see how they could use Agentforce for their support.
The Director of Support looks at us and asks, “Hey, by the way, what kind of products do you guys work with?” I thought he was asking about integrations so I started telling him about the different integrations, but he wanted to know the types of products our customers sell.
He then paused for a second and shared the full story of that Agentforce demo. It was about how they could automate support if, let’s say, a coffee machine stopped working.
What really got me was when this person said that “you could unplug a coffee machine if it doesn’t work, you can’t unplug a globally distributed database.”
The incentive for these larger companies is to go after the least common denominator for their customer base and Salesforce's customer base is everyone in the world, all the way from coffee machines to databases.
Whereas we focus very exclusively on those more advanced problems, which means that our product just feels right to this target audience.
The last part of the competitive landscape is of course startups.
Most of the startups we see are building generic RAG applications, which is a good starting point but largely ineffective.
Generic RAG doesn't understand the complexity of the product, so it's likely to provide vague/incorrect/generic answers to questions. And it doesn't enable you to support more advanced interactions like asking follow-up questions, analyzing logs, etc.
They also fall short of the bar in terms of integrating with every support tool, every documentation system, and every wiki system.
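To make “generic RAG” concrete, here’s a minimal sketch of what such an application boils down to. This is illustrative toy code under our own assumptions (a bag-of-words stand-in for embeddings, a stubbed LLM call), not any particular startup’s implementation. The point is the shape: one transactional retrieve-and-answer pass, with no follow-up questions and no understanding of the product’s structure.

```python
# Minimal "generic RAG" loop: embed docs, retrieve the top-k most similar,
# stuff them into a prompt, return one answer.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def call_llm(prompt: str) -> str:
    # Stub for a real chat-completions call.
    return f"[answer conditioned on {len(prompt)} chars of context]"

def answer(query: str, docs: list[str], k: int = 2) -> str:
    # One shot: retrieve, stuff, answer. No follow-up questions, no log
    # analysis, no model of how the product's pieces fit together.
    q = embed(query)
    top = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
    prompt = "Answer using only this context:\n" + "\n".join(top) + f"\n\nQ: {query}"
    return call_llm(prompt)

docs = ["How to configure replication.", "Troubleshooting connection timeouts."]
print(answer("Why does my connection time out?", docs))
```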
#4:
I think monetization is an experimental process.
You start with some reasonable assumptions and then you do some back-of-the-envelope calculation.
Like, “How long would it take a person? Okay, if it took a person X amount of time, what does that mean for how much time or money it's going to save?” And then you take some percentage of that and you start there.
That's been a pretty good place for us to start.
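As a worked illustration of that back-of-the-envelope logic, here's a sketch with entirely hypothetical numbers (none of these figures are RunLLM's actual pricing): estimate the human time a question takes, convert it to money saved, then charge some percentage of that.

```python
# Back-of-the-envelope pricing from time saved. All numbers are made up
# for illustration; the shape of the calculation is the point.
questions_per_month = 4_000      # customer's expected usage
minutes_per_question = 20        # how long a human support engineer takes
loaded_cost_per_hour = 75.0      # fully loaded cost of that engineer, USD

hours_saved = questions_per_month * minutes_per_question / 60
value_created = hours_saved * loaded_cost_per_hour      # $100,000/month here

capture_rate = 0.15              # "take some percentage of that"
monthly_price = value_created * capture_rate            # $15,000/month

print(f"value created: ${value_created:,.0f}/mo")
print(f"starting price: ${monthly_price:,.0f}/mo "
      f"(${monthly_price / questions_per_month:.2f}/question)")
```

The same formula works in both directions, which is why, in the story below, plugging a customer's expected usage back into the model can land on exactly the number a flat fee would have produced.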
I wouldn't say it's been met with universal acclaim, but it has given us a reasonable set of assumptions that help customers understand the right way to think about us.
To give you a counterpoint, we just closed a reasonably well-established Series D company. They did a fairly extensive round of testing and gave us a bunch of feedback; we iterated on all of it, they were pretty happy with the result, and they decided to commit.
They asked for pricing, and we gave them our standard usage-based pricing model. Their Head of Support reached out with a bunch of objections to it: bad incentives, the feeling that we wanted them to use us more, et cetera, et cetera.
And so we got on a call with them. I asked about how much they thought was reasonable, what the expectation was, how they would make the case to the CFO, and so on. And we came to a number that we thought was mutually agreeable.
At the end of the call, I asked them to share a little bit about their expected usage. I took those numbers and plugged them back into our model, and it came out to exactly the number they would've paid under usage-based pricing; we simply packaged it as a different model for them in this case.
I think there's a little bit of a cultural shift that still needs to happen when it comes to usage-based pricing.
Many customers like it. But some have a negative reaction to it. We're still at a stage where getting customers on board takes priority over having principles about the right way to do pricing. But I do think usage-based pricing aligns everyone's incentives in a cleaner way.
Even ticket resolution is a difficult thing to measure, because oftentimes customers don't close tickets or give any feedback, especially to an AI system.
So we’ve started with number of questions answered as our baseline.
That, plus some of the other work we do around updating documentation and surfacing issues, has been acceptable to basically everyone we work with, except in a few specific cases.
#5:
I think with an AI company, there are three kinds of moats:
data,
UX, and
integrations.
We answer, for our largest user bases, probably 3000 questions a week. We're gathering tons and tons of examples of areas where we're able to solve user problems, areas where we miss the mark, areas where we could be better. We're always looking to gather and categorize that data.
I would be lying if I told you I had a master plan for every data set that we're collecting. But we know this data will matter over time, whether it’s refining the service itself or helping customers improve metrics like resolution and deflection.
Enterprises are wary of anyone training on their data, and so there are some restrictions in terms of what we can do with it, but in the grand scheme of things, there are still a lot of opportunities.
On the UX part, the main thing that we see as an opportunity is to continue to push the balance in terms of what we expose as capabilities to our users.
One criticism we received a lot, even as recently as six months ago, was that we present our product as “just another chatbot”.
A chatbot has a transactional nature: a question goes in and an answer comes out. We think there's a lot more we can do beyond that, so we have deliberately focused on doing more, proactively.
Let's say you ask a question. Oftentimes there isn't just one answer to it, because it's a technical product and there are multiple ways to get to the same result.
So we don't just stop at one answer. We'll follow up and say, “Hey, this was the first answer, the one we thought was most relevant, but in case that doesn't work for you, by the time you come back, we're going to proactively surface another way of doing things.”
And the other part of UX is acknowledging that this is end-user facing: our customers' customers see these things, so it is important to give our customers visibility into, and control over, what's going on with our systems. That's a part of UX we focus a lot on too.
The integrations bit is straightforward: when you talk to a new customer, being able to say “yes” to every integration question gives you an incredible amount of credibility and maturity. It's no one's idea of fun to build yet another Salesforce integration, but it makes the product usable for more and more customers, especially when selling to enterprises.
#6:
The most common objection that we hear is “It's great that you solve company X's problem”, even if company X is 10x as big as the prospect that we're talking to, “but our problems are different.”
What this really boils down to is a lack of trust. We all remember Facebook's weird foray into chatbots directly in Messenger, which left a bad taste in people's mouths. Chatbots of the past were goofy, robotic, and not really trustworthy. That mistrust still lingers.
So when people think, “Would I put a chatbot in front of my seven figure customer?” They're like, “Of course not. That sounds insane. Why would I do that?”
They are skeptical, and you have to build trust with them. You have to sit down with them and say, “Look, I understand that this is the way you solve problems. Here are three examples of similar things we've seen with companies in your market, or of your size, or whatever it is.”
And ultimately what it will boil down to is them being able to see it work for them.
What we do is we show up to every customer call with a product demo pre-built on their documentation. We show up and say, “Hey, why don't you go to your Zendesk, pull in the last three support tickets, put them in the chat and try us out?”
90% of the time they'll say something like, “Oh, that's actually the answer I would've given.” And that's the aha moment that wins people.
Another aspect of the trust is trust in the team. We know that every time we start talking to a new customer, we will need to build trust in the team because inevitably we're going to make mistakes every now and then.
We don't try to pretend otherwise. We accept when we make a mistake, we don't set unrealistic expectations, and we learn from those mistakes both systematically and programmatically.
From a customer experience perspective as well, whenever a user gives us feedback, we take it very seriously and we proactively follow up with them. We tell them why something didn’t work, the improvements we can make, or if there are issues with their documentation, we surface them too.
It comes down to trust—in the product, and in the team behind it.
#7:
2025 is when we've really started to see a growth in inbound interest.
A common thing that we hear from customers we talk to is, “I have tried a bunch of chatbots and they all suck, and yours was the first one I came across, in company X's support system, that I thought was actually giving high-quality answers.”
If you go to the MotherDuck site, there's an “Ask AI” chat widget, and in the corner of that it says “Powered by RunLLM”, which is a great source of marketing for us.
SEO has also turned out to be a really productive area for us; there's very little in the way of people talking about support engineering right now. There's a lot of AI call center automation, very little AI support engineering.
If you Google “AI support engineering” right now, you actually find that most of it is job postings. We're the only product that shows up on the first page of Google's results for “AI support engineer”.
And then of course, there's always going to be outbound work.
What is also working well with outbound is that this year, people are starting to really look at AI and say, “It's cool that there was science-project budget for AI, but this is the year that we actually have to move some KPIs. The board is going to kick our butts at the end of the year if we don't feel like we moved this metric with AI.”
And support is still an obvious application area.
PMF /evals ◎ is just getting started. Tell us about how you're approaching AI-native building in the wild! Or what you’d like us to cover. Hit reply.
Brought to you by Chargebee. Chargebee helps AI-native, recurring revenue businesses scale with billing and monetization infrastructure built for speed, flexibility, and rapid iteration.