Anthropic Publishes Election Safeguard Benchmarks Ahead of US Midterms

April 24, 2026 | 2 min read | Source: Anthropic Blog

Image: Hand casting a vote into a ballot box (credit: Anthropic)

96%. That's the political neutrality score Anthropic's Sonnet 4.6 model hit in the company's latest internal evaluations; Opus 4.7 scored 95%. The company published a detailed update on April 24 outlining how Claude is being prepared for the 2026 US midterm cycle.

The numbers on election misuse prevention are more striking. Across 600 election-related test prompts, Opus 4.7 responded appropriately 100% of the time; Sonnet 4.6 came in at 99.8%. On harder multi-turn tests simulating coordinated influence operations, where an attacker might try to gradually steer the model toward helping create propaganda, the scores were somewhat lower: 90% for Sonnet 4.6 and 94% for Opus 4.7. Anthropic says newer models with safeguards enabled refused nearly every task in autonomous influence operation testing.

What Claude Does When You Ask About Elections

For users hitting Claude with midterm candidate questions, election banners now direct them to TurboVote, a nonpartisan resource from Democracy Works. Web search activates 92-95% of the time for those queries. Anthropic tested over 200 distinct prompts with three variations each to validate this behavior.

On the policy side, Claude's usage terms explicitly prohibit deceptive campaign content, fake persona creation, voter fraud assistance, and misleading voting information. Automated classifiers flag potential violations, and a dedicated threat intelligence team investigates coordinated abuse patterns.

Third-party partners involved in the work include The Future of Free Speech, the Foundation for American Innovation, and the Collective Intelligence Project.

This kind of public benchmark release is useful: it gives researchers and journalists something concrete to test against rather than taking safety claims on faith. The 90-94% scores on the influence operation simulations are the most candid part of the report: they show the problem is hard and not fully solved, which is more credible than a sheet of perfect scores.

Source

Anthropic Blog, Announcements: An update on our election safeguards
