Etched bets $800 million that transformer silicon will outlast the GPU era

Etched has moved from a bold transformer-only chip pitch to a harder test: shipping rack-scale inference systems backed by $800 million raised and more than $1 billion in customer contracts.

Etched didn't arrive on June 30 as a mystery Nvidia challenger. Reuters reported in June 2024 that the startup had raised $120 million to develop Sohu, a custom AI chip built for transformer inference. The new claim is bigger and more concrete: Etched now says its A0 silicon has come back from TSMC's N4P process, its first racks ship this summer, and production has started for more than $1 billion in customer contracts.

That is the point where a chip startup stops being an interesting deck and starts becoming an operations company. You can announce a faster chip with a few benchmark slides. You can't fulfill rack-scale orders without packaging, power delivery, cooling, validation, software, and manufacturing partners that survive contact with a data center floor.

Etched says it has raised $800 million across four previously unannounced financings, including backing from VentureTech Alliance, the TSMC-linked investment firm. On its own site, the company says it now has more than 400 engineers drawn from Nvidia, Google's TPU team, Broadcom, SK Hynix, TSMC and others. That matters less as a trophy list than as a clue to what Etched is actually trying to sell. The company isn't just selling a PCIe card to drop into someone else's machine. It is selling frontier inference clusters: chips, racks, software, and the manufacturing method around them.

The original Sohu pitch was wonderfully blunt. Etched wanted to hardwire the transformer architecture into silicon and give up the flexibility that makes Nvidia GPUs useful across almost every AI workload. In 2024, the company claimed an eight-chip Sohu server could generate more than 500,000 tokens per second on Llama 70B, while an eight-GPU Nvidia H100 system would sit around 23,000 to 25,000 tokens per second. If that claim holds in real customer deployments, it isn't a marginal efficiency gain. It is a different kind of machine.
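To put the size of that claimed gap in plain numbers, here is a minimal sketch of the arithmetic. It uses only the figures Etched gave in 2024, which have not been independently verified here. The variable and function names are illustrative, and the per-chip figure assumes throughput divides evenly across the eight chips, which real systems may not do.

```python
# Figures as claimed by Etched in 2024; treat them as vendor claims, not measurements.
SOHU_TOKENS_PER_SEC = 500_000            # eight-chip Sohu server on Llama 70B ("more than")
H100_TOKENS_PER_SEC = (23_000, 25_000)   # eight-GPU H100 system, low and high figures
CHIPS_PER_SERVER = 8


def speedup_range(claimed, baseline_range):
    """Return (smallest, largest) speedup implied by a baseline range."""
    low_base, high_base = baseline_range
    # The smallest speedup comes from the fastest baseline, and vice versa.
    return claimed / high_base, claimed / low_base


low, high = speedup_range(SOHU_TOKENS_PER_SEC, H100_TOKENS_PER_SEC)

# Assumption: throughput splits evenly across the chips in a server.
per_chip = SOHU_TOKENS_PER_SEC / CHIPS_PER_SERVER

print(f"Claimed speedup: {low:.1f}x to {high:.1f}x")
print(f"Claimed per-chip throughput: {per_chip:,.0f} tokens/s")
```

On the company's own numbers, that works out to roughly a 20x to 22x advantage at the server level, which is why the claim reads as a change in kind rather than degree.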

The bet is still narrow by design. A general-purpose GPU pays for its flexibility: the silicon that lets it run almost any workload is overhead on any single one. Sohu is faster because it gives that flexibility up. That trade is attractive when transformer inference is the money-making workload, but it leaves customers exposed if model architectures move in ways the silicon can't follow. Hardware cycles are long. Model fashion is not.

Etched's answer is that transformers have won enough of the AI market to justify dedicated infrastructure. You don't have to agree with that forever to see why buyers are listening now. Inference demand is where the bill keeps arriving, especially as companies move from demos to agents, long-context workflows and high-volume user products. Nvidia still owns the market, but dependence on one supplier is expensive, and every serious cloud buyer knows it.

The new details make Etched more credible, but they also make the burden heavier. The company says it has opened a Taiwan factory and built a data center, test house and NPI prototyping lab at its San Jose headquarters. Its website lists Etched HQ at 3155 Olsen Drive in San Jose, not Cupertino, and the company says customer validation is already under way. First-pass silicon on a 4-nanometer-class process is hard. Turning that into reliable racks is harder.

The investor names are less interesting than the customer contracts. Andrej Karpathy, Geoffrey Hinton, Fei-Fei Li and Peter Thiel give the story sparkle. More than $1 billion in contracted demand gives it teeth. If those contracts convert into working deployments, Etched has a real claim on the inference bottleneck. If they slip, the company becomes another reminder that AI hardware is full of beautiful demos that never become boring infrastructure.

So the real question isn't whether Etched can embarrass an H100 on a transformer benchmark. The real question is whether it can ship enough racks, keep them stable, support the software stack, and prove to customers that a transformer-locked system is worth the risk. The chip works in the lab. The orders are on paper. Now Etched has to do the part that usually kills hardware startups.
