Anthropic Caught DeepSeek, Moonshot AI, and MiniMax Distilling Claude at Scale

February 23, 2026 · 2 min read

Source: Announcements, "Detecting and preventing distillation attacks"

What Happened

Anthropic published evidence on February 23, 2026 that three AI companies ran large-scale distillation attacks against Claude, using its outputs to train their own models more cheaply and quickly than independent research would allow.

The numbers are significant:

DeepSeek: 150,000+ exchanges targeting reasoning and chain-of-thought data
Moonshot AI: 3.4+ million exchanges focused on agentic reasoning, tool use, and coding
MiniMax: 13+ million exchanges targeting agentic coding and tool orchestration

These weren't casual API users. The operations used commercial proxy services running "hydra cluster architectures" with sprawling networks of fraudulent accounts. One proxy alone managed 20,000+ fake accounts simultaneously. Moonshot AI used hundreds of fraudulent accounts across multiple pathways. MiniMax was caught while its campaign was still active, before its model shipped. When a new Claude model launched, MiniMax pivoted within 24 hours, redirecting nearly half its traffic to the updated version.

Why It Matters

If you use AI tools daily, this matters for two reasons.

First, distilled models can lose safety guardrails. Anthropic specifically flags that these extracted capabilities could be deployed without the safeguards built into Claude. That means models fine-tuned on stolen reasoning traces could end up powering tools with none of the alignment work that went into the original.

Second, this explains a pattern many practitioners have noticed: new models from smaller labs showing suspiciously strong performance in specific areas like coding and tool use, exactly the domains targeted in these campaigns. If your "budget alternative" AI tool is powered by a distilled model, you're getting capability without the safety engineering.

For Anthropic specifically, this is a direct threat to their business model. They invest heavily in alignment research and safety testing. Competitors extracting those capabilities for pennies on the dollar undermines the economic case for doing safety work.

Our Take

This is the most detailed public accounting of model theft we've seen from any AI lab. Anthropic naming names (DeepSeek, Moonshot AI, MiniMax) is a deliberate escalation.

The detection methods are worth noting: behavioral fingerprinting, traffic pattern analysis, shared payment method tracking, and intelligence sharing with industry partners. Anthropic built what amounts to a fraud detection system for AI model theft.
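To make one of those signals concrete, here is a minimal sketch of how shared payment method tracking could link accounts into clusters. It is an illustration only, not Anthropic's actual system: the account IDs, the payment fingerprint field, and the function name are all hypothetical, and a real pipeline would combine many more signals.

```python
from collections import defaultdict


def cluster_by_shared_payment(accounts):
    """Group account IDs linked, directly or transitively, by a shared payment fingerprint.

    `accounts` maps account_id -> set of payment fingerprints.
    Both the field and the data shape are hypothetical, for illustration only.
    """
    parent = {acct: acct for acct in accounts}

    def find(x):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # fingerprint -> first account observed using it
    for acct, fingerprints in accounts.items():
        for fp in fingerprints:
            if fp in first_seen:
                union(first_seen[fp], acct)
            else:
                first_seen[fp] = acct

    clusters = defaultdict(set)
    for acct in accounts:
        clusters[find(acct)].add(acct)

    # Report only groups of two or more accounts, largest first.
    linked = [sorted(group) for group in clusters.values() if len(group) > 1]
    return sorted(linked, key=lambda group: (-len(group), group))


# Hypothetical sample data: a, b, and c are chained together via two cards.
accounts = {
    "acct_a": {"card_1"},
    "acct_b": {"card_1", "card_2"},
    "acct_c": {"card_2"},
    "acct_d": {"card_9"},
}
print(cluster_by_shared_payment(accounts))
```

Note that acct_a and acct_c share no card directly; they end up in the same cluster only because acct_b links them. That transitive linking is why a network of accounts that looks unrelated one pair at a time can still surface as a single operation.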

The practical takeaway for tool users: be skeptical of models that show strong performance in narrow domains (especially coding and agentic tasks) without clear documentation of how they were trained. The cheapest model isn't always the best deal if it was built by extracting capabilities from a more carefully engineered system.

Anthropic's response includes model-level safeguards that reduce extraction efficacy and strengthened account verification. Whether that's enough to stop state-backed operations with essentially unlimited resources remains an open question.

Source

Announcements: Detecting and preventing distillation attacks
