Anthropic Is Silently A/B Testing Claude Code's Plan Mode on Paying Users

$200 a month. That's what the top tier of Anthropic's Max plan, which includes Claude Code, costs. And Anthropic is quietly running A/B tests that change how the tool behaves without telling the people paying for it.

A developer building a plan visualization tool stumbled onto something buried in Claude Code's binary: an A/B test called tengu_pewter_ledger, managed through GrowthBook (an experimentation platform). The test targets plan mode, the feature where Claude Code outlines its approach before writing code, and it has four variants that progressively strip down what users see.
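For readers unfamiliar with how such tests bucket users, here is a minimal sketch of deterministic variant assignment. Only the experiment key and the four variant names come from the reporting; the hashing scheme, the equal split, and the function name `assign_variant` are assumptions for illustration, and this is not GrowthBook's or Claude Code's actual code.

```python
import hashlib

# Variant names as reported; the order and the equal split are assumptions.
VARIANTS = ["null", "trim", "cut", "cap"]


def assign_variant(user_id: str, experiment_key: str = "tengu_pewter_ledger") -> str:
    """Map a user to a variant deterministically (hypothetical scheme)."""
    # Hash the experiment key together with the user id so the same user
    # always lands in the same bucket for this experiment.
    digest = hashlib.sha256(f"{experiment_key}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(VARIANTS)
    return VARIANTS[bucket]
```

The point of a scheme like this is that assignment is stable and invisible: a user sees the same variant on every session and has no signal that a colleague was bucketed differently.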

What the Variants Actually Do

The four variants are null, trim, cut, and cap. The most aggressive one, cap, limits plans to 40 lines, forbids context sections and background information, eliminates prose paragraphs entirely, and instructs the model to "delete prose, not file paths" when a plan exceeds the limit. Instead of a thoughtful breakdown of an approach, you get terse bullet points.
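To make the cap constraint concrete, the sketch below applies the same rule mechanically to a plan. This is purely illustrative: as described, the limit is an instruction given to the model, not post-processing code, and the path-detection regex and the `trim_plan` helper are hypothetical.

```python
import re

MAX_PLAN_LINES = 40  # limit reported for the "cap" variant

# Hypothetical heuristic: a line "mentions a file path" if it contains
# something like dir/file or name.ext.
PATH_PATTERN = re.compile(r"[\w.-]+/[\w./-]+|\b[\w-]+\.[A-Za-z]{1,5}\b")


def trim_plan(plan: str, max_lines: int = MAX_PLAN_LINES) -> str:
    """Drop prose lines (never path lines) until the plan fits the limit."""
    lines = [line for line in plan.splitlines() if line.strip()]
    i = len(lines) - 1
    # Walk backwards, deleting lines without a file path first.
    while len(lines) > max_lines and i >= 0:
        if not PATH_PATTERN.search(lines[i]):
            del lines[i]
        i -= 1
    # If the plan is still too long, every remaining line names a path;
    # truncate as a last resort.
    return "\n".join(lines[:max_lines])
```

Run against a 50-line plan with 30 file-path steps and 20 lines of explanation, a rule like this keeps every step and discards half of the reasoning, which is exactly the trade the variant makes.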

Telemetry logged when you exit plan mode includes plan length, whether you approved or denied the plan, and which variant you were assigned. Those fields suggest Anthropic is measuring whether shorter plans lead to faster approvals, which amounts to testing how much context it can strip before users push back.
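Those three fields are all an analyst would need to compare variants. The sketch below shows one way such a comparison could look; the `PlanExitEvent` record, its field names, and the assumption that plan length is counted in lines are hypothetical, not taken from Claude Code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanExitEvent:
    """Hypothetical record holding the three reported telemetry fields."""
    variant: str      # "null", "trim", "cut", or "cap"
    plan_lines: int   # plan length; the unit (lines) is an assumption
    approved: bool    # True if the user approved the plan


def approval_rate_by_variant(events):
    """Return the fraction of approved plans for each variant seen."""
    totals = defaultdict(int)
    approvals = defaultdict(int)
    for event in events:
        totals[event.variant] += 1
        approvals[event.variant] += int(event.approved)
    return {variant: approvals[variant] / totals[variant] for variant in totals}
```

Note what a metric like this cannot capture: whether the approved plan was actually a good one, or whether the user simply had less to object to.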

The Transparency Problem

A/B testing is standard practice for consumer products. Netflix tests thumbnail images. Spotify tests playlist layouts. But Claude Code is a professional development tool where the plan is the primary mechanism for human oversight. When a developer reviews a plan before approving it, they're exercising judgment about whether the AI's approach is correct. Silently degrading that plan's detail level undermines the entire human-in-the-loop workflow.

There's no toggle to opt out. No notification that your experience differs from another user's. Two developers on the same team could get meaningfully different plan quality and never know it, leading to confusion about whether Claude Code is "getting worse" or working as intended.

A Configuration Problem Disguised as an Experiment

Some developers genuinely prefer shorter plans. Others want exhaustive detail before letting an AI modify their codebase. This is a preference that should be a user setting, not something decided by a random variant assignment. The information is there in the binary for anyone who decompiles it, but most paying customers will never know they're part of an experiment that directly affects the quality of AI-generated code they ship.
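The alternative is simple to express: an explicit user preference wins, and the experiment only applies when the user has not chosen. The following is a sketch of that precedence rule under stated assumptions; no such setting exists in Claude Code as far as the reporting shows, and `resolve_plan_detail` and its reuse of the variant names as setting values are invented for illustration.

```python
from typing import Optional

# Hypothetical: reuse the reported variant names as user-selectable levels.
VALID_DETAIL_LEVELS = {"null", "trim", "cut", "cap"}


def resolve_plan_detail(user_setting: Optional[str], experiment_variant: str) -> str:
    """Pick the plan detail level, letting an explicit user choice win."""
    if user_setting is not None:
        if user_setting not in VALID_DETAIL_LEVELS:
            raise ValueError(f"unknown plan detail level: {user_setting!r}")
        return user_setting
    # No preference expressed: fall back to the assigned variant.
    return experiment_variant
```

With a rule like this, the experiment can still run on users who have no opinion, while anyone who wants full plans can guarantee they get them.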

Anthropic has built its brand on safety, transparency, and responsible AI deployment. Silent experimentation on the feature most central to human control over an AI coding agent sits uncomfortably with that positioning.