
How One CTO Uses Claude Code to Semi-Automate Every Code Review

Code review is still the bottleneck on most engineering teams, and it has been for decades. The tools change but the problem persists: someone has to read the diff, understand the context, run the tests, and decide whether the code is good enough to ship. Grove CTO Daniel Olshansky thinks he has found a workflow that gets him most of the way there using Claude Code.

The 6-Step Pipeline

Olshansky manages three repositories at Grove (backend API, frontend app, Chrome extension) and built custom slash commands for each: /grove_app_review, /grove_api_review, and /grove_extension_review. Here is how the pipeline works:

  1. Kick off the review with the appropriate slash command
  2. Claude Code reviews the diff, runs checks, and spins up the local environment (including the database via Docker)
  3. End-to-end test flows run automatically - happy paths, sad paths, and what Olshansky calls "chaotic paths"
  4. Failures come with context - no bare "test failed" messages. The agent explains why something broke
  5. A PR sweep catches regressions and tech debt via /cmd-pr-sweep
  6. PR descriptions are generated automatically via /cmd-pr-description
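The shape of this pipeline, ordered steps where every failure carries an explanation, can be sketched in a few lines of Python. This is an illustration only, not Olshansky's implementation: his pipeline is driven by Claude Code slash commands, and the names below (StepResult, run_pipeline, the step functions) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    context: str = ""  # why the step failed; empty when it passed


def run_pipeline(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run every step in order and keep the reason for each failure."""
    results = []
    for name, step in steps:
        try:
            step()
            results.append(StepResult(name, ok=True))
        except Exception as exc:
            # Record the exception type and message, not a bare "failed".
            results.append(
                StepResult(name, ok=False, context=f"{type(exc).__name__}: {exc}")
            )
    return results


# Hypothetical stand-ins for the real review and test steps.
def review_diff() -> None:
    pass


def chaotic_path_test() -> None:
    raise RuntimeError("checkout returned 500 after session expiry")


results = run_pipeline([
    ("review diff", review_diff),
    ("chaotic path e2e", chaotic_path_test),
])
for result in results:
    status = "ok" if result.ok else f"FAILED ({result.context})"
    print(f"{result.name}: {status}")
```

The sketch keeps running after a failure so that one report covers every step. Whether the real commands stop early or continue is not stated in the source.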

The custom slash commands and agent skills are open source, installable via npx skills add from his olshansk/agent-skills repository.

What Stays Manual

Olshansky is clear about the boundaries: "I still review the core logic line-by-line, but I haven't written a single line of it myself." The agent handles the mechanical work - running tests, checking for regressions, writing descriptions - while the human focuses on whether the logic is correct and fits the product direction.

He also cross-references Claude Code's output with Gemini for frontend review and OpenAI's Codex for architecture review. Using multiple models as checks on each other is a pattern more teams are adopting, especially when the stakes of a bad merge are high.
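One way to picture models checking each other is to treat each reviewer's findings as a set and separate the issues several reviewers raised from those only one did. The article does not say how Olshansky combines the outputs, so this is a sketch under assumptions: cross_check, its min_votes parameter, and the sample findings are all hypothetical, and real model output is free text that would need normalizing first.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def cross_check(
    findings: Dict[str, Set[str]], min_votes: int = 2
) -> Tuple[List[str], List[str]]:
    """Split issues into those raised by at least min_votes reviewers and the rest."""
    counts = Counter(issue for issues in findings.values() for issue in issues)
    corroborated = sorted(issue for issue, n in counts.items() if n >= min_votes)
    single_source = sorted(issue for issue, n in counts.items() if n < min_votes)
    return corroborated, single_source


# Hypothetical, already-normalized findings keyed by reviewer.
findings = {
    "claude-code": {"missing null check", "unbounded retry loop"},
    "gemini": {"missing null check"},
    "codex": {"leaky abstraction in api layer"},
}
corroborated, single_source = cross_check(findings)
print("corroborated:", corroborated)
print("single source:", single_source)
```

Single-source findings are not necessarily wrong. They are the ones a human reviewer would most want to look at by hand.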

The Real Advantage Is Customization

Olshansky mentions alternatives like CodeRabbit, Claude CodeReview, and PropelCode, but says they "get you halfway there." The gap, in his view, is domain-specific knowledge: "It's the part that's specific to your domain, your product, your tech stack, your culture, and your taste."

This matches what we have seen across AI coding tools. Generic code review catches generic problems. The value shows up when you can encode your team's conventions, your database schema quirks, and your deployment gotchas into the agent's context. Claude Code's CLAUDE.md file and custom slash commands make that possible without building a platform from scratch.

One practical detail stands out: Olshansky uses inline TODO comments as context for future agent runs. Instead of tracking tech debt in a separate system, the debt lives next to the code where agents will encounter it naturally. It is a small decision, but it reflects how workflows change when your reviewer is a language model that reads every file.
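To make the TODO idea concrete, here is a minimal sketch of collecting inline TODO comments so they can be passed to an agent as context. An agent that reads the files sees these comments without any tooling; the helper is only an illustration. The names (find_todos, Todo, TODO_PATTERN) are hypothetical, and the pattern assumes comments that start with # or //.

```python
import re
from typing import List, NamedTuple

# Matches "# TODO ..." and "// TODO ..." and captures the text after the marker.
TODO_PATTERN = re.compile(r"(?:#|//)\s*TODO\b[:\s]*(.*)")


class Todo(NamedTuple):
    line: int
    text: str


def find_todos(source: str) -> List[Todo]:
    """Return each inline TODO with its 1-based line number."""
    todos = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TODO_PATTERN.search(line)
        if match:
            todos.append(Todo(number, match.group(1).strip()))
    return todos


sample = (
    "x = 1\n"
    "# TODO: drop legacy column after migration\n"
    "y = 2  // TODO retry on timeout\n"
)
for todo in find_todos(sample):
    print(f"line {todo.line}: {todo.text}")
```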