Baseten is raising $1.5 billion at an $11 billion to $13 billion valuation because inference has stopped being a back-room AI cost. If you run models at real volume, the bill is now the product problem.
The Wall Street Journal reported today that Baseten is finalizing a $1.5 billion round with a split valuation: some investors are coming in at $11 billion, while others are paying $13 billion. Altimeter Capital, Conviction, Spark Capital, Sands Capital, and Wellington Management are co-leading the round, according to the Journal. That is a sharp move for a company that was valued at $5 billion in January, when Nvidia joined a $300 million round alongside existing backers including IVP and CapitalG.
You should pay attention to the structure, not just the headline number. A normal venture round has one price. Baseten's deal has two. When late money accepts a higher valuation rather than walking away, it tells you the allocation was tight and investors wanted in badly enough to pay up. Wellington's participation is also useful context, since the Journal reported this is the firm's first investment in the AI inference market.
The revenue makes the deal easier to understand. Sacra has tracked Baseten's annualized revenue run-rate rising from about $200 million in December 2025 to roughly $600 million by March 2026, with year-over-year growth around 1,900%. That kind of jump is not a neat software milestone. It's a sign that companies are putting far more AI traffic into production, then discovering that serving model responses all day is expensive, technical, and hard to manage.
Baseten sells the software and computing layer that helps companies run models after they have been trained. The company says it works across 20 cloud providers, which matters if you don't want your AI workload trapped inside one vendor's console. Its customers include Cursor, Mercor, and OpenEvidence, according to the Journal. Those are not casual use cases. Cursor has to handle developer queries at scale, Mercor depends on AI in hiring workflows, and OpenEvidence serves clinicians who need fast medical information rather than a demo that works once on a conference stage.
Training got most of the attention because it was dramatic. The largest labs raised billions, bought huge clusters, and turned Nvidia GPUs into the scarcest resource in technology. But training happens in bursts. Inference happens every time a user asks the model for an answer. If your product succeeds, that cost doesn't go away. It grows with every additional user and request.
That is why Baseten's promise is so direct. The company says some customers can cut costs by as much as 30% compared with closed-source APIs from companies such as OpenAI or Anthropic by routing workloads to open-source models and running them through Baseten's infrastructure. Take that claim seriously, but don't treat it as magic. Thirty percent only matters when the base bill is already painful, and the customers Baseten is chasing are exactly the ones with enough volume to feel it.
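To see why the percentage matters more at volume, here is a minimal back-of-the-envelope sketch. The monthly bills are hypothetical figures chosen only for illustration, and the 30% rate is the upper bound Baseten claims for some customers, not a guaranteed result.

```python
def monthly_savings(base_bill_usd: float, savings_rate: float = 0.30) -> float:
    """Dollars saved per month at a given discount off the base bill.

    savings_rate defaults to 0.30, the "as much as 30%" ceiling cited
    above; actual savings for any given workload may be lower.
    """
    return base_bill_usd * savings_rate


# Hypothetical monthly inference bills (assumed, not from the reporting).
for bill in (5_000, 500_000, 5_000_000):
    saved = monthly_savings(bill)
    print(f"${bill:>9,} bill -> ${saved:>9,.0f} saved per month")
```

The same rate that barely registers on a small bill turns into a seven-figure line item at the largest hypothetical volume, which is the segment the paragraph above says Baseten is chasing.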
The open-source model wave helps Baseten's case. Meta's Llama family, Mistral, DeepSeek, Moonshot AI's Kimi, and others have given teams more options than they had two years ago. The model is only part of the work, though. Someone still has to handle GPU availability, scaling, monitoring, latency, billing, and deployment. Baseten is betting that many companies would rather buy that layer than build it themselves.
Benchling gives the story a sharper edge than another developer-tool customer would. The life sciences R&D platform announced a Baseten partnership earlier this year to bring AI inference into biotech research workflows. That is a tougher market than a coding assistant. Data sensitivity, reliability, and workflow fit all matter more when the customer is using AI around research operations, not just generating another block of text.
Competition will come quickly. Fireworks AI raised money last year at a $4 billion valuation, and the Journal noted that Factory reached a $1.5 billion valuation in April. The hyperscalers will not ignore a market where customers are spending heavily just to make trained models useful. Amazon, Google, and Microsoft already own much of the cloud relationship, and they have every reason to pull inference spending closer to their own platforms.
Baseten does not need a monopoly to justify the attention. It needs to prove that inference is a large enough category for a specialist to keep winning even as the big cloud platforms crowd the field. The $1.5 billion round says investors believe that argument today. The harder test comes after the money lands, when customers decide whether Baseten is still cheaper, faster, and easier than doing the work themselves.