Discord sleuths breach Anthropic's zero-day hunter in a wake-up call for contained AI security

A group of Discord researchers gained unauthorised access to Anthropic's Claude Mythos cybersecurity model for weeks in April, exploiting a third-party vendor preview environment through URL guessing and data patterns , a breach that exposed the fragility of even the most restricted frontier AI deployments.

The incident is a masterclass in how AI security failures happen outside the core lab walls. Anthropic's Project Glasswing is a restricted initiative that deploys Claude Mythos, a model trained specifically to hunt zero-day vulnerabilities in software, hardware, and network systems. Access was limited to a handful of elite government and enterprise partners under strict NDAs and air-gapped environments. The breach did not come from a direct attack on Anthropic's infrastructure. It came from a vendor's preview instance , a staging environment that was publicly accessible due to a misconfigured authentication layer.

The Discord sleuths, as they called themselves, combined URL enumeration with data patterns from a prior supply-chain leak to identify the endpoint. Once inside, they ran mundane coding tasks rather than attempting to extract model weights or misuse the vulnerability-hunting capabilities. Anthropic confirmed no core weights were compromised and that the breach was contained to the vendor environment. The company has since terminated the vendor relationship, hardened its preview workflows, and committed to publishing a full incident report. That response is textbook. The vulnerability it exposed is structural.

Project Glasswing was designed as a contained deployment: Mythos runs in isolated environments with no external data flows, no API endpoints, and no model weights leaving Anthropic's control. The Discord breach did not violate that containment. It violated the vendor's containment. The preview environment was a shared staging site where the vendor tested integration with Anthropic's API before handing off to the partner. URL guessing , a technique as old as the web , combined with leaked data patterns from a prior incident, was sufficient to gain access. The fact that the sleuths used the model for coding rather than exploits does not change the risk profile. Anyone with the same access could have done the same with more malicious intent.

The incident reinforces a pattern that has repeated across frontier AI labs: the core model is secure, but the deployment ecosystem is not. OpenAI's 2024 breach involved a Slack integration that leaked API keys. Google's 2025 incident exposed a preview endpoint through a misconfigured cloud bucket. Anthropic's Mythos breach is the cybersecurity model version of the same problem. The supply chain , vendors, partners, preview environments , is where the attack surface expands.

What contained AI security actually requires

The Mythos breach produces three concrete takeaways for AI companies building restricted deployments. First, preview environments must be treated as production environments from a security standpoint. Vendor staging sites are not sandboxes. They are attack surfaces. Second, URL guessing and data pattern matching are low-barrier techniques that defeat any security model assuming competent adversaries. Third, incident response matters more than prevention in these cases. Anthropic's transparency , confirming the breach, containing it, and committing to a report , minimised reputational damage. The incident still produced a major embarrassment for a company positioning itself as the safety-first lab.

For the broader AI industry, the Discord sleuths story is a reminder that even the most sensitive models are only as secure as their weakest third-party link. Anthropic learned that lesson the hard way. The next lab to deploy a contained model will build their vendor agreements, preview workflows, and access controls with the Discord sleuths in mind. Containment is not a technical problem. It is a process problem, and the process is only as strong as its weakest participant.

Also read: Sereact raises $110 million to build robots that predict what happens before they act • AI agents can now cost more than the humans they were supposed to replace • Hipfire is a Rust-native AMD inference engine that beats llama.cpp on consumer GPUs