Related ToolsClaude

Anthropic Withheld Claude Mythos for Safety - Then Unauthorized Users Got It Anyway

Anthropic
Image: Anthropic

Three weeks ago, Anthropic was telling the world that Claude Mythos - its most capable model - was so dangerous it couldn't be released publicly. The cybersecurity capabilities were too potent, the risks too real. Now, according to Bloomberg, a small group of unauthorized users has had access to that same model anyway.

The timing makes this particularly awkward. Anthropic's entire justification for withholding Claude Mythos from general availability rested on the premise that controlling access was essential for safety. If the company can't control who gets the model through its own preview or internal channels, that argument becomes much harder to sustain.

Anthropologic has been one of the loudest voices in AI about responsible deployment and safety-first release practices. Losing control of the specific model you held back for safety reasons is exactly the kind of incident that undermines credibility on those claims.

The details from Bloomberg are thin: a "small group of unauthorized users" had access. What they did with it, how long they had access, and how the breach occurred aren't disclosed. Anthropic hasn't publicly explained whether this was an insider issue, an access control failure, or something else entirely.

This doesn't mean Mythos will cause the cybersecurity disasters Anthropic warned about. But it does mean the company's public safety posture - we're being careful so you don't have to worry - now has a visible crack in it. For a company that built much of its brand on being the responsible adult in the AI room, that's a real reputational cost, separate from whatever the security implications turn out to be.