AWS has announced another round of EC2 Capacity Block price hikes, effective July 1, pushing cumulative GPU reservation costs up as much as 50 percent this year alone and forcing AI-native companies to rethink unit economics built on assumptions that no longer hold.
The announcement landed on Thursday, and the timing was not subtle. Six months into 2026, AWS has now raised prices on its ML GPU reservation product for three consecutive quarters. The latest increase, reported by Investing.com, is approximately 20 percent across most GPU generations, including Nvidia B200, B300, H100, and H200 instances. That follows a 15 percent hike at the start of the year and a 10 percent rise in Q2. Add it up and AWS GPU reservation rates are somewhere between 20 and 50 percent more expensive than they were in January, depending on the instance type.
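For readers who want to check the top of that range, here is a minimal sketch of the compounding arithmetic. It assumes all three reported hikes apply in full to the same instance type, which the article's own range suggests is not true everywhere; the function name is illustrative.

```python
from math import prod

# Reported increases for 2026, as fractions: January, Q2, and July 1.
# Assumption: each hike applies in full to the same instance type.
HIKES = [0.15, 0.10, 0.20]

def compound_increase(hikes):
    """Cumulative fractional increase when hikes stack multiplicatively."""
    return prod(1 + h for h in hikes) - 1

cumulative = compound_increase(HIKES)
print(f"Cumulative increase: {cumulative:.1%}")
```

Under that assumption the stacked increase comes out slightly above 50 percent, because each hike is applied to an already-raised base rather than to the January price.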
AWS calls it a periodic adjustment based on supply and demand. That's technically accurate. Nvidia received orders for roughly 2 million H200 chips for 2026 against available inventory of around 700,000 units, a supply gap that hands every hyperscaler enormous pricing leverage. But calling this a market correction undersells what's actually happening. This is a deliberate shift in strategy, from capturing AI workload market share at subsidized rates to extracting margin from customers already locked into AWS infrastructure. The land-grab phase is over.
A p5e.48xlarge instance featuring eight Nvidia H200 GPUs ran at roughly $34.61 per hour in US East at the start of the year. After January's increase it moved to $39.80. The July 1 increase pushes costs higher still. For teams running continuous GPU workloads, DevZero's analysis put the incremental cost from January's 15 percent hike alone at more than $3,700 per month per instance. Compound that across the year's full stack of increases and you're looking at infrastructure bills that bear little resemblance to the spreadsheets AI startups filed with investors twelve months ago.
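The monthly figure can be roughly reproduced from the two hourly rates quoted above. This is a back-of-the-envelope sketch, not DevZero's method: it assumes a 30-day month and an instance that runs around the clock, and the helper name is hypothetical.

```python
# Assumption: 30-day month, instance running 24 hours a day.
HOURS_PER_MONTH = 24 * 30

def monthly_increment(old_rate, new_rate, hours=HOURS_PER_MONTH):
    """Extra monthly cost, in dollars, from an hourly rate change."""
    return (new_rate - old_rate) * hours

old_rate = 34.61  # p5e.48xlarge, US East, start of year (from the article)
new_rate = 39.80  # after January's increase (from the article)

increment = monthly_increment(old_rate, new_rate)
print(f"Incremental cost: ${increment:,.2f} per month per instance")
```

The $5.19 hourly difference over 720 hours lands just above $3,700, in line with the figure cited.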
EC2 Capacity Blocks are reservation products: customers pay upfront to lock in scarce GPU capacity for a defined window, typically for large-scale model training. The premise is that the guarantee of availability justifies a premium over spot rates. What's changed is that the premium itself keeps moving, and it's moving in one direction. Founders who priced their per-inference or per-training-run economics against 2025 reservation rates are now operating with a structural cost assumption that is quietly going stale.
Wells Fargo analysts were bullish on Amazon stock after the announcement, which tells you something. What's bad for the customer is good for the margin. AWS operating margins have been running between 35 and 38 percent recently, and price increases on inelastic, reservation-based GPU demand are exactly how you hold that number while spending $200 billion in capital expenditure on AI infrastructure this year.
Whether Azure and Google Cloud follow is not really a question
AWS setting the price on cloud GPU reservations is not a coincidence. AWS holds roughly 30 percent of global cloud infrastructure spend as of Q1 2026. When the market leader reprices, particularly on a product category this constrained, it creates room for every competitor to do the same. Azure and Google Cloud have no rational incentive to undercut aggressively when demand outstrips supply across all three providers. Google Cloud grew 63 percent year over year in Q1 2026 with AI services accounting for a meaningful share. Azure attributed roughly 12 percentage points of its 31 percent growth to AI. Neither company is in a price war right now. They are in a capacity war, and repricing is how they signal that the capacity is worth fighting over.
Frankly, the more pressing question for founders isn't whether to expect a matching price move from Azure or GCP. It's whether their model training and inference architecture was ever built to absorb this kind of volatility. Most weren't. The standard playbook for AI-native startups over the past two years assumed GPU costs would trend down as more capacity came online. That assumption looked reasonable in 2024. It looks naive now.
Operators who want to limit exposure have a few real options: lock in any capacity still available at pre-July 1 rates immediately, audit whether workloads running on expensive reserved GPU instances actually need that guarantee or could tolerate spot interruption, and pressure-test whether inference costs in particular can shift toward more cost-efficient architectures or smaller, distilled models. None of that is a cure. It's damage control for a pricing environment that the hyperscalers have made clear they intend to manage in their favor. The era of cheap cloud AI didn't end with a warning. It ended with a Saturday night pricing page update in January and has been confirmed with every quarterly renewal since.