Groq raises $650 million for its neocloud second act after selling its soul to Nvidia for $20 billion

After licensing its core LPU technology to Nvidia in a $20 billion deal that paid out shareholders and gutted its founding team, Groq is back raising fresh capital to reinvent itself as an AI inference neocloud, and existing backers are committed enough to backstop the entire round.

Groq, the AI chip startup once positioned as the most credible challenger to Nvidia's GPU dominance, is in advanced discussions to raise $650 million in new funding, according to a report by Axios published May 28. The raise is being backstopped by existing investors Disruptive and Infinitum, meaning those firms have committed to cover any unfilled portions of the round. It's a notable vote of confidence in a company that, by most measures, has just gone through a full reset.

The backstory matters here. In December 2025, Nvidia signed a $20 billion non-exclusive licensing agreement to acquire rights to Groq's language processing unit architecture and brought on founder and CEO Jonathan Ross along with much of the core engineering team. Groq technically retained its independence and kept its intellectual property, but the deal functionally transferred the people who built the company. By February 2026, Groq had distributed $7.6 billion to shareholders, roughly $64 per share, as the first major payout under that agreement. Investors got paid. The original company, as constituted, was mostly over.

What's being funded now is Groq 2.0. The reconstituted company is led by company veterans Adam Winter as CEO and Matt Eng as CFO, and the strategic pivot is sharp: away from silicon design and toward what the company is calling an AI inference neocloud. Rather than competing with Nvidia by building chips, Groq intends to compete at the infrastructure layer, running AI inference workloads at scale on GroqCloud, its token-as-a-service platform that already counts nearly two million developers and teams as users.

The timing is deliberate. The AI industry is bifurcating. Training, which runs for months on massive GPU clusters like those that built GPT-4 and Gemini, is largely Nvidia's game and likely to stay that way. Inference is a different market. It runs continuously, under latency constraints that end users feel directly, and the economics of serving a trillion tokens per day are genuinely different from the economics of a training run. Groq's LPU architecture was purpose-built for exactly this workload: deterministic, low-latency, memory-bandwidth-optimized inference. That niche is precisely where demand is exploding.

Nvidia's own deployment of Groq's LPU confirms the thesis. At GTC 2026 in San Jose, Nvidia unveiled what it's calling the Groq 3, the first chip to emerge from the December licensing deal. Specs are aggressive: 150 terabytes per second of on-chip SRAM bandwidth (seven times the bandwidth of the Vera Rubin GPU's HBM), 315 petaflops FP8 per rack, and 35x the throughput per megawatt of Blackwell for trillion-parameter models. It's built on Samsung 4nm and set to ship in the third quarter of 2026. The fact that Nvidia itself is commercializing LPU technology is, counterintuitively, a validation of the entire inference-optimized silicon segment, including Groq's plan to offer that capability as a service.

There's also a regulatory subplot worth watching. Senators Elizabeth Warren and Richard Blumenthal opened a formal inquiry in March 2026, arguing that the Nvidia-Groq deal is a reverse acqui-hire structured deliberately to avoid triggering Hart-Scott-Rodino antitrust filing thresholds. They set an April 3 deadline for Nvidia to respond and urged the DOJ and FTC to investigate. The regulatory outcome is uncertain, but if the deal faces forced restructuring, Groq's independent standing could become either a complication or an asset depending on how the chips fall.

What $650 million buys in the neocloud race

Groq's last equity event before all this was a Series E in September 2025 that raised $750 million at a $6.9 billion post-money valuation. The $7.6 billion shareholder distribution from the Nvidia deal eclipsed that figure entirely, effectively returning more capital than the company's peak private valuation in a single payout. Now the $650 million raise is seeding a structurally different business, one that doesn't carry the capital intensity of chip fabrication but does require significant investment in data center infrastructure, interconnect, and go-to-market to compete with the likes of CoreWeave, Lambda Labs, and the hyperscalers themselves.

The neocloud space is not empty. CoreWeave has gone public and is building out Nvidia GPU capacity at scale. Lambda Labs, Together AI, and Fireworks AI are all competing for enterprise inference contracts. What differentiates Groq's pitch is a proprietary stack built on LPU-based hardware that it still operates, GroqCloud's existing developer base, and the credibility of the company that invented the architecture Nvidia just paid $20 billion to license. That's a legitimate moat, assuming the new leadership can execute on the infrastructure buildout.

For investors watching the AI capital stack, the Groq raise is a signal worth parsing carefully. The money flowing into inference infrastructure is no longer theoretical. It's showing up in multibillion-dollar licensing agreements, public market debuts, and now a $650 million round for a company that has essentially shed its hardware origins and is betting entirely on serving the demand layer. If the inference economy grows as fast as current token consumption trends suggest, the infrastructure companies that own the plumbing will matter as much as the model makers sitting on top of them. Groq, in its second act, is trying to be that plumbing.
