The Times filed a third amended complaint on June 25 targeting Microsoft directly, alleging its supercomputer platform was purpose-built to scrape and replicate copyrighted works at scale. If courts buy it, the liability map for AI infrastructure changes entirely.
For two and a half years, Microsoft has largely played a background role in the New York Times copyright case: the deep-pocketed infrastructure partner supplying the cloud and the compute, not the one making the editorial decisions about what to train on. That framing just got harder to sustain. The third amended complaint, filed June 25, accuses Microsoft of actively architecting the infringement, not just hosting it. The filing, reported by Bloomberg Law and MLex, alleges that Microsoft provided OpenAI with a supercomputing platform specifically designed to download, store, and replicate copyrighted works for AI training, and that it directly supplied internet scraping software and unauthorized content.
That's a different legal theory than passive infrastructure liability. The Times isn't arguing that Microsoft's servers happened to hold infringing data. It's arguing Microsoft built the tool, knew what it was for, and profited from the output when those GPT models were folded into Bing Chat and Copilot. Contributory infringement requires showing that a defendant knew about the infringement and materially contributed to it. By naming the supercomputer as the instrument, the Times is trying to satisfy both prongs at once.
The timing compounds the pressure on Microsoft. On June 24, a day before the Times filed, a coalition of nearly 400 local and regional newspapers filed its own lawsuit in the Southern District of New York, alleging that OpenAI and Microsoft "systematically and secretly crawled" hundreds of news websites, copied content from behind paywalls, and stripped copyright management information from articles to use in training ChatGPT and Copilot. That case was filed by the Local News Copyright Alliance and represents one of the largest consolidated copyright actions the news industry has mounted against AI companies. Two major copyright suits landing within 24 hours, both naming the same two defendants, is not background noise.
The legal exposure doesn't stop at Microsoft. Every frontier model builder relying on hyperscaler compute faces a version of the same question the Times just put in front of a federal judge. If the court accepts the argument that a supercomputer configured to acquire training data at scale is itself an infringement tool, the liability analysis for cloud infrastructure providers shifts from hosting to participation. Azure is the backbone for OpenAI's training runs. It's also the platform Microsoft is betting its entire AI product roadmap on: the company has raised its 2026 capex plan to roughly $190 billion and committed to doubling its AI capacity, measured in gigawatts, within two years. That investment thesis assumes Azure's role as AI infrastructure doesn't carry direct copyright liability. The Times is arguing it does.
For venture-backed AI labs, the implications are practical and immediate. Data provenance has already become a procurement gate: buyers' legal teams increasingly want to know where training data came from before signing enterprise agreements. A ruling that treats purpose-built scraping infrastructure as a contributory infringement vehicle would push that question upstream, from the lab to the cloud provider, and from the cloud provider to every contract in between. Microsoft has in recent months made a point of emphasizing clean data lineage for its own MAI model family, specifically flagging commercially licensed datasets as a competitive differentiator. That positioning looks less like marketing and more like legal preparation given what the Times filed Thursday.
The case itself still has a long road. Judge Sidney H. Stein ruled in March 2025 that the Times's core copyright infringement claims survive, while dismissing the unfair competition claims. A preservation order issued by Magistrate Judge Ona T. Wang in May 2025 requires OpenAI to retain all ChatGPT conversation logs, a mandate affecting over 400 million users. The third amended complaint now needs court approval to proceed, and Microsoft will push back hard on the active-architect framing. But the trajectory of this litigation has consistently moved toward expanding liability, not contracting it, and the discovery process has clearly handed the Times enough specifics about Microsoft's infrastructure role to sharpen the theory.
The broader point is this: the AI industry built its training pipelines during a window when copyright law hadn't caught up with what scraping at supercompute scale actually means. That window is closing. The Times case, the 400-newspaper coalition suit, and the growing docket of similar actions are not separate events. They're the same underlying reckoning arriving from different directions. Any lab, investor, or cloud provider that has been treating training data liability as a manageable compliance footnote should now treat it as a first-order business risk.