
New Site Tracks Whether Claude Code and GitHub Copilot Are Getting Worse


A developer built a site called diditgetdumber.com to track one specific question: have Claude Code and OpenAI's Codex - the model powering GitHub Copilot - gotten worse over time?

The site aggregates community reports on perceived quality changes in these AI coding assistants. It's trying to create a record of when users notice output getting buggier, less consistent, or less accurate - changes that companies rarely announce but that can directly affect anyone using these tools for paid work.

Quality regression in AI coding tools is genuinely hard to document. These models aren't fully deterministic, meaning the same prompt can produce different outputs from one run to the next. That makes it difficult to tell whether a single bad result is a one-off or part of a pattern. Official benchmarks - standardized tests used to measure AI capability - don't capture "it keeps generating code that doesn't compile." Aggregated user reports can.
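To make the one-off-versus-pattern problem concrete, here is a minimal sketch of how repeated pass/fail observations from two periods could be compared. It is not how diditgetdumber.com works; the function name `pass_rate_shift`, the "did the generated code compile" outcome, and the sample numbers are all hypothetical and made up for illustration. It uses a standard two-proportion z-test.

```python
import math


def pass_rate_shift(before: list[bool], after: list[bool]) -> tuple[float, float, float]:
    """Compare pass rates of two samples with a two-proportion z-test.

    Each list holds one True/False outcome per attempt (for example,
    "did the generated code compile?"). Returns (rate_before, rate_after, z).
    """
    n1, n2 = len(before), len(after)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    p1 = sum(before) / n1
    p2 = sum(after) / n2

    # Pooled pass rate under the assumption that nothing changed.
    pooled = (sum(before) + sum(after)) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

    # If every outcome is identical, se is 0 and there is no shift to measure.
    z = 0.0 if se == 0 else (p2 - p1) / se
    return p1, p2, z


# Hypothetical, made-up data: 20 attempts in each period.
before = [True] * 18 + [False] * 2
after = [True] * 12 + [False] * 8

p1, p2, z = pass_rate_shift(before, after)
print(f"pass rate {p1:.0%} -> {p2:.0%}, z = {z:.2f}")
```

A single failed attempt says almost nothing, but a z value well below about -2 across many attempts suggests the drop is unlikely to be run-to-run noise alone. Self-reported user complaints are far messier than controlled pass/fail samples, so this only illustrates why many observations are needed, not how to score them.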

The site is new and thin on data. But it addresses a real gap: there's no official changelog for when a model gets marginally worse at something, and users who've noticed shifts in Claude Code or Copilot behavior have no authoritative place to check whether others saw the same thing.