What Anthropic's Red Team Found Analyzing 832 Malicious AI Accounts

832 accounts. That's how many Anthropic banned for malicious cyber activity between March 2025 and March 2026 - and the company's Frontier Red Team spent that year mapping what those attackers actually did with AI, publishing the full findings this week.

The headline number: the share of accounts classified as medium-risk or higher jumped from 33% to 56% over the study period - a 1.7x increase in roughly twelve months. Attackers aren't just using AI more; they're using it for increasingly dangerous operations.
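
The arithmetic behind that multiplier is worth spelling out, since a share rising from 33% to 56% can be read as either a relative or an absolute change. A minimal check in Python, using only the two percentages quoted above (the variable names are illustrative):

```python
# Shares of banned accounts rated medium-risk or higher, as quoted above.
start_share = 0.33
end_share = 0.56

# Relative change: how many times larger the final share is.
ratio = end_share / start_share

# Absolute change, expressed in percentage points.
point_change = (end_share - start_share) * 100

print(f"{ratio:.1f}x increase, {point_change:.0f} percentage points")
```

The same shift is therefore a 1.7x relative increase and a 23-point absolute one.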

The Shift Toward Post-Compromise Attacks

Early fears about AI-enabled attacks focused on phishing - mass-personalized deceptive emails. The data suggests that threat vector is already mature and declining in relative importance. AI use for phishing dropped 8.6% across the study period. Meanwhile, AI use for account discovery (finding and mapping systems after an attacker already has a foothold inside a network) rose 8.9%.

The read here: for attackers, phishing is largely a solved problem, and they have moved on. The harder work - what to do once you're inside a network, which systems to target, how to avoid detection while moving between them - is where AI now provides the biggest boost.

67% of analyzed accounts used AI for malware creation. Only 6.5% used it for lateral movement (the technique of pivoting from one compromised machine to others within the same network). That gap looks likely to narrow: the trajectory throughout the year was consistently toward AI handling more of the attack chain, with less human direction required at each stage.

The clearest example from the report: a state-sponsored operation disrupted in November 2025 used 30 distinct techniques across 13 different tactic categories. Multi-stage operations at that scale historically required substantial human coordination. The alarming part of Anthropic's assessment wasn't the technique count - it was the autonomous decision-making capability the operation demonstrated.

MITRE ATT&CK Has No Category for Autonomous Attack Chaining

The practical problem this creates for defenders: MITRE ATT&CK, the standard framework security teams use to classify attacker behavior and structure their defenses, doesn't have categories for what makes AI-enabled attackers dangerous.

MITRE ATT&CK documents known techniques and maps them to tactics. It was built around the assumption that humans are making the decisions at each step. But when an AI agent is chaining attack stages together with minimal human direction - deciding what to do next based on what it finds inside a network - the dangerous capability isn't any specific technique. It's the orchestration layer that sequences those techniques autonomously.
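
A small sketch can make the gap concrete. It tags two chains of activity with ATT&CK technique and tactic names and compares how they look through that lens alone. The three technique IDs are real ATT&CK entries matching the activities mentioned earlier; the `ObservedAction` record and its `decided_by` field are hypothetical additions for illustration, not part of ATT&CK or of Anthropic's report.

```python
from dataclasses import dataclass

# Real ATT&CK technique IDs mapped to (technique name, tactic name).
ATTACK_TECHNIQUES = {
    "T1566": ("Phishing", "Initial Access"),
    "T1087": ("Account Discovery", "Discovery"),
    "T1021": ("Remote Services", "Lateral Movement"),
}


@dataclass
class ObservedAction:
    """Hypothetical record of one attacker action."""
    technique_id: str
    decided_by: str  # assumed field: "human" or "agent"; not an ATT&CK concept


def attack_view(actions):
    """Return what a technique/tactic classification records, and nothing else."""
    return [ATTACK_TECHNIQUES[a.technique_id] for a in actions]


chain = ["T1566", "T1087", "T1021"]
manual = [ObservedAction(t, "human") for t in chain]
autonomous = [ObservedAction(t, "agent") for t in chain]

# Both chains map to the same techniques and tactics.
print(attack_view(manual) == attack_view(autonomous))
```

In this toy model the two operations are indistinguishable once reduced to techniques and tactics, even though one of them needed no human to pick the next step.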

Anthropic puts it directly: traditional danger indicators like technique diversity and platform choice no longer reliably predict risk levels. An attacker using two techniques with full autonomous chaining may represent more risk than one using thirty techniques under manual direction. The November 2025 state-sponsored case was classified as maximum-risk not because of how many tactics it used, but because of how little human guidance it required.
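
To illustrate that comparison, here is a toy scoring sketch in Python. It is not Anthropic's classification method, which is not described here; the `Operation` fields, the `autonomy` scale from 0 to 1, the cap of 30 techniques, and the weights are all assumptions chosen only to show a score dominated by autonomy rather than technique count.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """Hypothetical summary of an observed intrusion (not a real schema)."""
    name: str
    technique_count: int  # distinct techniques observed
    autonomy: float       # assumed scale: 0.0 = fully manual, 1.0 = fully autonomous


def risk_score(op: Operation, autonomy_weight: float = 0.8) -> float:
    """Toy score in [0, 1] that weights autonomy above technique diversity.

    The default weight and the cap of 30 techniques are illustrative assumptions.
    """
    diversity = min(op.technique_count, 30) / 30
    return autonomy_weight * op.autonomy + (1 - autonomy_weight) * diversity


manual_broad = Operation("thirty techniques, manual", technique_count=30, autonomy=0.0)
autonomous_narrow = Operation("two techniques, autonomous", technique_count=2, autonomy=1.0)

for op in (manual_broad, autonomous_narrow):
    print(f"{op.name}: {risk_score(op):.2f}")
```

Under these assumed weights, the narrow autonomous operation scores well above the broad manual one. A scheme that counted techniques alone would rank them the other way around.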

The company says it's working with MITRE to update the framework to account for AI-specific attacker behaviors. Until that happens, security teams are measuring threats against a model that no longer matches what attackers are actually doing. Detection windows are shorter when attacks adapt mid-operation, and the indicators most SOC teams monitor were calibrated for human-paced, manually orchestrated intrusions - not AI-accelerated ones.