96%. That's the political neutrality score Anthropic's Sonnet 4.6 model hit in its latest internal evaluations; Opus 4.7 scored 95%. The company published a detailed update on April 24 outlining how Claude is being prepared for the 2026 US midterm cycle.
The numbers on election misuse prevention are more striking. Across 600 election-related test prompts, Opus 4.7 responded appropriately 100% of the time; Sonnet 4.6 came in at 99.8%. On harder multi-turn tests simulating coordinated influence operations, where an attacker might try to gradually steer the model toward helping create propaganda, the scores dropped somewhat: 90% for Sonnet 4.6 and 94% for Opus 4.7. Anthropic says newer models with safeguards enabled refused nearly every task in autonomous influence operation testing.
What Claude Does When You Ask About Elections
When users ask Claude about midterm candidates, election banners now direct them to TurboVote, a nonpartisan resource from Democracy Works. Web search activates 92-95% of the time for those queries. Anthropic tested over 200 distinct prompts with three variations each to validate this behavior.
On the policy side, Claude's usage terms explicitly prohibit deceptive campaign content, fake persona creation, voter fraud assistance, and misleading voting information. Automated classifiers flag potential violations, and a dedicated threat intelligence team investigates coordinated abuse patterns.
Third-party partners involved in the work include The Future of Free Speech, the Foundation for American Innovation, and the Collective Intelligence Project.
This kind of public benchmark release is useful. It gives researchers and journalists something concrete to test against, so they don't have to take safety claims on faith. The 90-94% scores on influence operation simulations are the honest part of the report: they show the problem is hard and not fully solved, which is more credible than a deck of 100s.