Anthropic just shipped Code Review for Claude Code, and the pitch is straightforward: a squad of AI agents reviews your pull requests in parallel, cross-checks each other's findings to kill false positives, then drops a summary comment plus inline annotations ranked by severity.
The numbers from Anthropic's internal use are worth unpacking. Among large PRs (1,000+ lines), 84% get findings, averaging 7.5 issues per review. Small PRs (under 50 lines) get findings only 31% of the time, averaging 0.5 issues. Reviews take about 20 minutes. The standout stat: engineers marked less than 1% of findings as incorrect.
What It Actually Catches
Anthropic shared two examples that go beyond style nits. In one case, a single-line change to a production service looked routine, but Code Review flagged it as a breaking change to authentication. In another, an open-source TrueNAS PR had a pre-existing type mismatch that was silently wiping the encryption key cache. That second one is the interesting case - it found a bug that was already in the codebase, not just in the diff.
Before Anthropic rolled this out internally, only 16% of its PRs received substantive review comments. Afterward, that number jumped to 54%. The company says code output per engineer has grown 200% in the last year, which partly explains why it needed automated review in the first place - more AI-generated code means more surface area for bugs.
Pricing and Access
This is a beta research preview, available only on Team and Enterprise plans. Reviews average $15-25 each, scaling with PR complexity, and are billed by token usage. Admins get a monthly spending cap, per-repo controls, and an analytics dashboard tracking review volume, acceptance rates, and costs.
Setup is admin-side: enable it in Claude Code settings, install the GitHub App, pick your repositories. Developers don't configure anything - reviews just start appearing on PRs.
At $15-25 per review, this is priced for teams where a missed bug in production costs real money. A team doing 100 PRs a month is looking at roughly $1,500-2,500 in review costs. That math works if it catches even one authentication-breaking change that would have hit production. For solo developers or small teams doing mostly low-risk changes, the existing open-source Claude Code GitHub Action is still available as a lighter alternative.
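To run the same back-of-the-envelope math for your own team, here is a minimal Python sketch. It assumes every PR triggers exactly one review and that the $15-25 average holds for your codebase. The function names and the spending-cap helper are hypothetical illustrations, not part of any Anthropic tooling.

```python
def monthly_review_cost(prs_per_month, low_per_review=15.0, high_per_review=25.0):
    """Estimate the (low, high) monthly spend in dollars.

    Assumption: one review per PR, priced at the $15-25 average
    quoted in the article. Real costs are billed by token usage.
    """
    if prs_per_month < 0:
        raise ValueError("prs_per_month must be non-negative")
    return prs_per_month * low_per_review, prs_per_month * high_per_review


def reviews_within_cap(monthly_cap, cost_per_review):
    """Hypothetical helper: how many whole reviews fit under a spending cap."""
    if cost_per_review <= 0:
        raise ValueError("cost_per_review must be positive")
    return int(monthly_cap // cost_per_review)


low, high = monthly_review_cost(100)
print(f"${low:,.0f}-${high:,.0f} per month")  # $1,500-$2,500 per month

# Worst case: every review costs $25, with a $2,000 monthly cap.
print(reviews_within_cap(2000, 25))  # 80
```

The second helper shows the flip side: if an admin sets a $2,000 monthly cap and reviews land at the high end of the range, the budget covers about 80 reviews, fewer than the 100 PRs in the example above.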