Amazon raises GPU reservation prices 20 percent as the era of cheap cloud AI comes to an end

AWS is raising EC2 Capacity Block prices by about 20 percent in July, and that tells you something useful: cheap, guaranteed cloud GPU capacity is no longer the assumption AI startups can build around.

AWS has made one of its key AI cloud products more expensive, and the timing is hard to miss. Business Insider reported Friday that prices for EC2 Capacity Blocks for Machine Learning will rise by roughly 20 percent starting in July. Customers use this product to reserve GPU capacity in advance for large training and fine-tuning jobs, the kind of work you can’t leave to chance when a model run is already expensive before the first token moves.

The company’s explanation is plain enough. AWS says Capacity Block prices are updated periodically based on supply and demand. That’s true, but it doesn’t fully capture what customers now have to deal with. AWS already raised Capacity Block prices by about 15 percent in January, when ITPro reported that a p5e.48xlarge instance with eight Nvidia H200 GPUs moved from $34.61 to $39.80 per hour across most regions. Now another increase is coming, and if it lands on top of January’s rates, the two together work out to roughly 38 percent. If your infrastructure plan was built around last year’s reservation prices, your spreadsheet is already out of date.

This is not a small accounting problem. A p5e.48xlarge instance is not the kind of machine a founder spins up casually. It is the kind of system used when the workload needs serious GPU memory and predictable access, and DevZero’s analysis of the January increase put the extra cost from that hike alone at more than $3,700 per month for one continuously running instance. Add a second price rise in July and the bill starts to look very different from the version investors saw twelve months ago.
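The arithmetic behind that figure is easy to rerun against your own numbers. What follows is a rough sketch, not AWS’s billing logic: the two hourly rates are the ones ITPro reported, while the 730-hour month and the idea that July’s 20 percent applies evenly to this particular instance are assumptions made purely for illustration.

```python
# Back-of-the-envelope cost of the January increase for one p5e.48xlarge
# running around the clock, using the hourly rates cited above.

HOURS_PER_MONTH = 730  # assumption: 365 * 24 / 12, a common billing approximation

old_rate = 34.61  # USD per hour before January (as reported by ITPro)
new_rate = 39.80  # USD per hour after January (as reported by ITPro)

# Percentage size of the January increase.
january_increase_pct = (new_rate - old_rate) / old_rate * 100

# Extra monthly cost for a single continuously running instance.
extra_per_month = (new_rate - old_rate) * HOURS_PER_MONTH

# Hypothetical: if the July rise of about 20 percent applied uniformly
# to this instance type (not confirmed by the source).
projected_july_rate = new_rate * 1.20
projected_monthly_bill = projected_july_rate * HOURS_PER_MONTH

print(f"January increase: {january_increase_pct:.1f}%")
print(f"Extra cost per month: ${extra_per_month:,.2f}")
print(f"Projected July rate: ${projected_july_rate:.2f}/hour")
print(f"Projected monthly bill: ${projected_monthly_bill:,.2f}")
```

Under those assumptions the January hike alone comes to a little under $3,800 a month per instance, consistent with DevZero’s “more than $3,700.” Swap in your own instance count and hours to see what the July change could do to your bill.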

Capacity Blocks are meant to solve a real problem. You pay upfront to reserve accelerator capacity for a fixed time window, often for training or other machine-learning jobs where losing access halfway through a run is unacceptable. The guarantee has value. The trouble is that the guarantee is now being repriced while demand for Nvidia GPUs remains brutally tight. Reuters reported earlier this year that Chinese companies had placed orders for more than 2 million Nvidia H200 chips against inventory of about 700,000 units. When supply looks like that, cloud providers don’t need to discount the scarce part of the stack.

Frankly, founders should stop treating GPU prices as a background variable. They are now part of the product economics. If you charge customers for AI features, your margin depends on the cost of training, fine-tuning and inference. If you sell an agent, a coding assistant, a video tool or a data product built on heavy model use, AWS’s price page is not some procurement detail. It is sitting inside your gross margin.

The hyperscalers have the leverage now

AWS is not moving in a weak market. Synergy Research Group data has consistently put AWS at the front of the cloud infrastructure market, with Microsoft and Google behind it. When the biggest cloud provider raises the price of reserved AI capacity, Azure and Google Cloud don’t have a strong reason to start a race to the bottom. They have the same bottleneck, the same customer urgency and the same investor pressure to turn massive AI spending into returns.

The numbers around that spending are huge. Business Insider reported in February that Amazon told Wall Street it expected about $200 billion in capital expenditure this year, much of it tied to AI infrastructure. Google Cloud revenue rose 63 percent year over year in the first quarter, according to MarketWatch, while Microsoft said AI contributed 12 percentage points to Azure’s 31 percent growth. You can read those figures as proof that enterprise AI demand is real. You should also read them as a warning that the platforms have no reason to give away capacity they can sell at a premium.

For AI-native startups, the practical work starts close to the ground. Audit which workloads truly need reserved H100 or H200-class capacity. Move anything tolerant of interruption away from the most expensive guaranteed blocks. Revisit inference paths, especially where a smaller or distilled model can do the job. Don’t pretend every product feature deserves the same compute budget just because it looked impressive in a demo.
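One way to start that audit is to write the triage down as a simple rule. The sketch below is illustrative only: the `Workload` fields, the example workloads and the order of the checks are hypothetical, and a real review would also weigh deadlines, data locality and contract terms.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of one GPU workload for triage purposes."""
    name: str
    interruption_tolerant: bool  # can it be paused or restarted cheaply?
    smaller_model_ok: bool       # would a smaller or distilled model do the job?


def suggest_action(workload: Workload) -> str:
    """Return a rough recommendation following the steps in the text."""
    if workload.smaller_model_ok:
        return "evaluate a smaller or distilled model"
    if workload.interruption_tolerant:
        return "move off guaranteed reserved blocks"
    return "keep on reserved capacity"


# Made-up examples, not taken from any real deployment.
workloads = [
    Workload("foundation-model pretraining", False, False),
    Workload("nightly batch evaluation", True, False),
    Workload("autocomplete inference", False, True),
]

for w in workloads:
    print(f"{w.name}: {suggest_action(w)}")
```

The point is less the code than the discipline: every workload gets an explicit answer instead of inheriting the most expensive tier by default.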

None of this means AWS is doing something irrational. It has expensive chips to buy, data centers to build and customers who need capacity badly enough to pay for certainty. But you should be honest about the shift. The cheap cloud AI story was always going to run into power, memory, networking and GPU supply. AWS has simply put the new reality into the invoice.
