The AI chip benchmark wars are back, and this time Nvidia's rivals have real numbers to show

Nvidia still sets the pace in AI chips, but the benchmark fight has become useful again because buyers are no longer shopping for one kind of compute.

For years, AI chip benchmarks had a dull rhythm. Nvidia posted the fastest numbers, rivals argued about software maturity or price, and the largest cloud buyers kept ordering Nvidia systems anyway. You could call that dominance. You could also call it a market with no real suspense.

That is changing, not because Nvidia has suddenly lost the crown. It hasn't. The change is that inference has become too expensive, too frequent, and too central to be treated as a leftover from the training race. When every model response has a cost attached to it, buyers start caring about the thing that looks boring until the bill arrives: how many useful tokens a system can produce, at what latency, on how much power, for how much money.

MLCommons describes its datacenter inference benchmark as a measure of how fast systems can process inputs and produce results using a trained model. That sounds dry, but it's exactly where the market is now. Training gets the headlines because it produces the frontier model. Inference decides whether the business works after the model ships.

As Tom's Hardware reported when Nvidia released its latest MLPerf inference results, the company's GB300 NVL72 rack-scale system beat its own GB200 platform by 45% on DeepSeek R1 inference tests and led across workloads including Llama 3.1 405B, Llama 3.1 8B and Whisper. Nvidia also pointed to upgraded tensor cores, NVFP4 quantization and the 130 TB/s NVLink fabric across the 72-GPU rack. Those are not small advantages. If you're buying for maximum throughput and the software stack already runs on CUDA, Nvidia remains the obvious answer.

But obvious is not the same as uncontested.

AMD has made the most serious move at the edge of Nvidia's territory. OpenAI's October 2025 agreement with AMD covered 6 gigawatts of AMD Instinct GPUs, starting with a 1 gigawatt MI450 deployment in the second half of 2026, according to reports on the company announcement. Meta followed in February with its own 6 gigawatt AMD deal, with the first deployments also expected in the second half of 2026. The Wall Street Journal reported that Meta's arrangement is worth more than $100 billion and includes warrants that could give Meta up to 10% of AMD if milestones are met.

That's not a benchmark chart. It's better than one.

A buyer can tolerate a slower alternative if the economics work and the roadmap is real. AMD doesn't need to win every MLPerf column to matter. It needs enough performance, enough supply, and enough confidence from customers such as OpenAI and Meta to become a credible negotiating weapon. Frankly, that alone changes the Nvidia conversation. The threat of a second supplier is valuable even before the second supplier becomes the first choice.

The ASIC story is harder for Nvidia

The bigger pressure comes from custom silicon. Amazon has Trainium, Google has TPU, Microsoft has Maia, and Meta has MTIA. These chips don't have to serve the whole market. They have to serve one company very well. That is a different contest, and Nvidia can't dismiss it as a science project anymore.

Broadcom is the quiet name sitting behind much of that shift. It has long been tied to Google's TPU program, and in October 2025 OpenAI announced a 10 gigawatt custom accelerator agreement with Broadcom, with deployments beginning in the second half of 2026 and running through 2029. This month, The Wall Street Journal reported that Broadcom, Apollo and Blackstone launched a $35 billion AI XPV Platform intended to support more than 20 gigawatts of compute capacity through 2028, including Anthropic's previously announced expansion of more than 1 gigawatt.

If you're Nvidia, that is the part you watch closely. Hyperscalers are still buying Nvidia GPUs in vast numbers, but the same customers are building the tools to buy less of them later. They want control over cost, power, networking and availability. They also want leverage. You don't spend billions on internal silicon because you enjoy semiconductor complexity. You do it because the existing option is expensive enough to make the pain worthwhile.

The benchmark fight is useful because it gives these buyers a language for comparison: tokens per second, power consumption, latency under load, rack-level throughput, software support, availability. None of those measures tells the whole story alone, and vendors will always choose the chart that flatters them. Still, the numbers force a more serious conversation than the old one, where Nvidia won training and everyone else argued from the margins.
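To see how those measures combine into a buying decision, here is a rough Python sketch of the arithmetic. Every figure in it is hypothetical and chosen only for illustration, not taken from MLPerf or any vendor; the names SystemProfile, cost_per_million_tokens and tokens_per_joule are likewise made up for this example. It assumes sustained throughput, a flat electricity price and an hourly cost that already amortizes hardware and hosting.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Hypothetical inference system; all values are illustrative assumptions."""
    name: str
    tokens_per_second: float  # sustained output throughput under load
    power_kw: float           # average power draw under load, in kilowatts
    hourly_cost_usd: float    # amortized hardware and hosting cost, excluding power


def cost_per_million_tokens(system: SystemProfile, electricity_usd_per_kwh: float) -> float:
    """Total hourly cost (hosting plus energy) divided by hourly token output."""
    tokens_per_hour = system.tokens_per_second * 3600
    energy_cost_per_hour = system.power_kw * electricity_usd_per_kwh
    total_cost_per_hour = system.hourly_cost_usd + energy_cost_per_hour
    return total_cost_per_hour / tokens_per_hour * 1_000_000


def tokens_per_joule(system: SystemProfile) -> float:
    """Energy efficiency: tokens produced per joule (1 kW = 1000 joules per second)."""
    return system.tokens_per_second / (system.power_kw * 1000)


# Two made-up systems: one faster, one slower but cheaper and more frugal.
fast = SystemProfile("fast-system", tokens_per_second=10_000, power_kw=10.0, hourly_cost_usd=30.0)
frugal = SystemProfile("frugal-system", tokens_per_second=6_000, power_kw=4.0, hourly_cost_usd=12.0)

ELECTRICITY_USD_PER_KWH = 0.10  # assumed flat rate

for s in (fast, frugal):
    cost = cost_per_million_tokens(s, ELECTRICITY_USD_PER_KWH)
    print(f"{s.name}: ${cost:.2f} per million tokens, {tokens_per_joule(s):.2f} tokens per joule")
```

With these invented inputs, the slower system comes out cheaper per token, which is the kind of result a throughput-only chart would hide. The sketch ignores latency targets, utilization, software porting cost and supply, all of which can reverse the ranking in practice.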

The current market is not a clean upset story. Nvidia still has CUDA, the most complete systems, the best supply relationships and the habit of winning. Its GB300 results show a company still improving from a position of strength. But AI spending has become large enough that even small percentage savings matter, and inference is large enough to reward chips that are not general-purpose winners.

So the benchmark wars are back, but not because AMD, Broadcom, Google or anyone else has knocked Nvidia over. They are back because buyers finally have enough alternatives to make the comparison worth having. If you're paying for the compute, that's the first real change in years.
