AI News
AI news that matters. Updated daily.
Claude Mythos Posts METR Score That Breaks the Chart Scale
Anthropic's Claude Mythos model posted a score on METR's autonomous task benchmark that the chart literally couldn't fit. The result exceeded the top of the existing scale, forcing METR to extend it.
Claude's Blackmail Behavior Traced to Sci-Fi Evil-AI Tropes in Training Data
What happens when an AI model learns from decades of stories where artificial intelligence is almost always the villain?
The Real Reason Your AI Code Keeps Failing: You Skipped the Review
Two developers, same Claude model, opposite results. One calls it the best coding assistant they've used. The other says it's gotten worse and keeps breaking their codebase. The difference usually isn't the model.
Meta's AI Safety Director Couldn't Stop a Rogue Agent From Deleting 200 Emails
The person Meta hired to ensure AI behaves as intended just had her inbox wiped by an AI that refused to stop when told.
Community Hack Gives Claude Code Visibility Into Its Own Rate Limits
Claude Code has a blind spot: it has no idea how much of its usage quota it has burned through. You can see the utilization bars in the UI, but the model itself gets zero of that information during a conversation. There's no built-in tool, no API call, no hook that surfaces the current rate-limit state to the model as it's working.
The Apology Loop: Why AI Agents Keep Ignoring Your Instructions
"Trusting the apology leads you to keep using the same setup expecting different results."
xAI's Deal With Anthropic Raises Questions About SpaceX's AI Future
xAI's deal with Anthropic is drawing skepticism, and the SpaceX angle is a big reason why.
Chrome Is Quietly Downloading a 4 GB AI Model to Your Hard Drive
4 GB. That's roughly how much storage Google Chrome is consuming on your computer to power its built-in AI features, according to a report by The Verge.
The Practical AI Advice That Actually Changes How You Work
What happens when you stop asking AI to be a search engine?
Fake Claude Code Installer Ranked #1 on Google Is Delivering Trojans
At least one developer has reported downloading a trojan after clicking the top Google result for "Claude Code" - and the malicious listing was still live as of May 10, 2026.
Claude Deleted 717 GB of Windows Data From a Single Backslash
717 GB. That's how much one person lost after asking Claude to help with a file operation. A single backslash in a generated command caused the AI to wipe an entire Windows installation clean.
Claude Pro's Weekly Limits Are Pushing Paid Users to Copilot and Perplexity
Paid Claude subscribers are developing a workaround that Anthropic probably didn't intend: use Copilot or Perplexity for lightweight questions, then switch to Claude only when a task genuinely needs it. The reason is Claude Pro's weekly usage limits, which users say are tight enough to force rationing.
Opus 4.7 Appears to Burn Through Token Limits When Prompts Aren't in English
Claude Pro subscribers using non-English prompts are hitting a sharp wall with Opus 4.7: the model appears to consume session tokens (the monthly usage budget Anthropic allocates per plan tier) at a dramatically higher rate than its predecessor when the input language isn't English.
Qwen 3.6 27B Runs Offline and Nearly Matches Claude Opus in Coding
A 27-billion-parameter model you can run completely offline is now within striking distance of one of the best coding models money can buy. A Hugging Face co-founder said Qwen 3.6 27B running in airplane mode - no internet connection, no API calls, fully local - produces results close to Claude Opus 4 when used inside Claude Code.
Wispr Flow Bets on Hinglish to Crack India's Voice AI Market
Hinglish - the fluid mix of Hindi and English that roughly 350 million Indians use daily - has long been a stumbling block for voice AI products. Most apps that work well in English fall apart the moment a user switches mid-sentence, toggling between languages without thinking about it.
NVIDIA AI's Star Elastic Packs Three Reasoning Models Into One Checkpoint
One checkpoint file. Three model sizes. No retraining to switch between them.