Jun 30, 2026 · 9:14 PM

Etched bets $800 million that transformer silicon will outlast the GPU era

Etched emerged from stealth on June 30, 2026, announcing $800 million raised, a working transformer-specific chip called Sohu, and over $1 billion in signed customer contracts. The startup's bet is that hardcoding the transformer architecture into silicon delivers inference performance that general-purpose Nvidia GPUs cannot match, with early benchmarks showing 500,000 tokens per second on Llama 70B versus roughly 25,000 from an equivalent H100 configuration.

Dave Barr

The chip startup, founded by Harvard dropouts, emerged from stealth on June 30 with a working chip, $800 million raised, and more than $1 billion in signed customer contracts, making it one of the most consequential Nvidia challengers yet.

Four years into the transformer era, the question isn't whether purpose-built silicon can beat general-purpose GPUs on inference. It's whether anyone can actually build and sell the stuff. Etched, the Cupertino-based startup founded in 2022 by Harvard dropouts Gavin Uberti, Chris Zhu, and Robert Wachen, came out of stealth today with an answer: a working chip called Sohu, $800 million raised across multiple rounds including a $500 million close at a $5 billion valuation, and over $1 billion in signed contracts for what the company calls full frontier inference clusters, covering chips, custom racks, and software.

The core claim is stark. A single eight-chip Sohu server, Etched says, can run Llama 70B at 500,000 tokens per second. Eight Nvidia H100s in the same configuration push roughly 25,000. Eight Blackwell B200s reach about 43,000. That is not a marginal efficiency gain. It is a different class of machine, and the reason is architectural: Sohu hardcodes the transformer attention mechanism directly into silicon rather than routing it through programmable matrix multiply units that were designed to handle everything. Where a GPU uses 30 to 40 percent of its available compute on transformer workloads, Etched claims 90 percent utilization. The chip is built on TSMC's N4P four-nanometer process and carries 144 GB of HBM3E memory.
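To put those throughput numbers side by side, here is a minimal Python sketch of the arithmetic. It uses only the figures quoted above, which are Etched's own claims rather than independent benchmarks; the dictionary and the `speedup` helper are hypothetical names introduced for illustration.

```python
# Tokens per second on Llama 70B for eight-chip servers, as quoted in the article.
# These are vendor-claimed figures, not independently measured results.
CLAIMED_TOKENS_PER_SEC = {
    "8x Sohu": 500_000,
    "8x H100": 25_000,
    "8x B200": 43_000,
}


def speedup(baseline: str, candidate: str = "8x Sohu") -> float:
    """Return how many times faster `candidate` is than `baseline` (hypothetical helper)."""
    return CLAIMED_TOKENS_PER_SEC[candidate] / CLAIMED_TOKENS_PER_SEC[baseline]


if __name__ == "__main__":
    for name in ("8x H100", "8x B200"):
        print(f"Sohu vs {name}: {speedup(name):.1f}x")
```

On the claimed figures, that works out to 20 times the H100 server's throughput and roughly 11.6 times the B200 server's.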

The investor list reflects just how seriously the market is taking this. Jane Street, Hudson River Trading, Jump Trading, and Two Sigma joined the latest round alongside Stripes, Peter Thiel, Ribbit Capital, and Radical Ventures. VentureTech Alliance, which is linked to TSMC, participated in earlier financing. The presence of quant trading firms, which run some of the most latency-sensitive inference workloads in the world, is not incidental. They understand tokens-per-second in a way that most enterprise software buyers don't. Andrej Karpathy, Geoffrey Hinton, and Fei-Fei Li are also listed as backers.

Here's the thing: Etched's advantage and its vulnerability are the same fact. Sohu cannot run DeepSeek V4 or Qwen3-235B-A22B, two of the most commercially significant open-weight models in production right now. Hardcoding the transformer architecture means any model that deviates from that pattern is simply incompatible. For a startup selling to hyperscalers and enterprises with heterogeneous workloads, that is not a footnote. It is a live constraint customers have to price into every contract they sign.

The counterargument from Etched is straightforward: transformers won, and they aren't going anywhere. Uberti and Zhu dropped out of Harvard on that conviction, and the $1 billion in contracts suggests at least some major buyers agree. Those contracts cover not just chips but full rack-scale inference clusters, which means Etched is selling infrastructure, not components, and that changes the commercial relationship considerably. You don't swap out a full inference cluster the way you swap out a GPU.

As TechCrunch reported today, the company has already stood up a Taiwan factory, built out a data center, and established an NPI prototyping lab at its San Jose headquarters. First-pass silicon success, which the industry calls an A0 spin, is legitimately hard on a process like N4P. Most startups take two or three spins before they have a working chip. Etched says it got there on the first try, though customers are still in the validation phase rather than full production deployment.

The path to gigawatt-scale production that the company has laid out for 2027 is aggressive. But the $1 billion in pre-revenue contracts gives them a funded runway to get there, and it signals that hyperscaler appetite for Nvidia alternatives heading into the second half of 2026 is real enough to bet on. Nvidia declined to comment, according to Bloomberg's coverage of the announcement.

Whether transformer-locked silicon proves durable over a five-year hardware cycle is genuinely uncertain. Architecture shifts happen, and they happen faster than chip development cycles. But the more immediate question for Etched is simpler: can they ship? The contracts are signed. The chip works in the lab. Now they have to build the thing at scale, and that's where most hardware startups have died before them.

Dave Barr is a professional marketing strategist with over six years of experience in PR. His primary areas of expertise are public relations and social branding. Dave regularly works on content projects around the world and has been associated with major news networks. He contributes to Startup Fortune in the Business, Marketing and Technology sections.