Some developers spend entire five-hour sessions in Claude Code without touching a usage cap. Others burn through their quota in under an hour. Both groups are paying for the same Pro plan.
The gap isn't about how much code you write. It's about how you write it.
The Workflow Split
Developers who rarely hit limits tend to share a few habits. They write detailed CLAUDE.md project files that give the model context upfront, reducing back-and-forth. They use focused, specific prompts rather than vague requests that force the model to guess and retry. And they lean on Claude Code's agentic features - letting it read files, run tests, and iterate autonomously - rather than manually copy-pasting code blocks in and out of chat windows.
The developers burning through tokens fastest often fall into a different pattern: rapid-fire short prompts, frequent context resets, or using the model as a rubber duck rather than a collaborator. Each new message in a long conversation costs more tokens because the model re-reads the entire history. Kill the conversation and start fresh too often, and you lose context but still pay for rebuilding it.
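To make the history-rereading cost concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simplified model in which every turn adds a fixed number of tokens and the full history is read again on each turn. It ignores prompt caching, output tokens, and however Claude Code actually meters usage. The function name, the turn sizes, and the `rebuild_tokens` figure are all hypothetical and chosen only for illustration.

```python
def thread_cost(turns, tokens_per_turn, rebuild_tokens=0):
    """Input tokens read across one thread if every turn re-reads the full history.

    rebuild_tokens is a hypothetical one-off cost of re-establishing context
    at the start of a fresh thread.
    """
    history = rebuild_tokens
    total = 0
    for _ in range(turns):
        history += tokens_per_turn  # this turn is appended to the history
        total += history            # and the whole history is read again
    return total


# Illustrative numbers only: 20 turns of about 500 tokens each.
one_long_thread = thread_cost(20, 500)
split_cheap_rebuild = thread_cost(10, 500) + thread_cost(10, 500, rebuild_tokens=3000)
split_costly_rebuild = thread_cost(10, 500) + thread_cost(10, 500, rebuild_tokens=6000)

print(one_long_thread)       # 105000
print(split_cheap_rebuild)   # 85000
print(split_costly_rebuild)  # 115000
```

Under these assumptions, splitting a long thread pays off when the context is cheap to rebuild and backfires when it is expensive. The real cost of a restart also includes whatever the model no longer knows, which this arithmetic does not capture.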
There's also the model tier factor. Opus 4.6, the most capable (and most expensive per-token) model, eats through quotas faster than Sonnet. Developers who default to Opus for every task - including simple refactors that Sonnet handles fine - will hit walls sooner than those who match model power to task complexity.
Practical Lessons for Staying Under the Cap
- Front-load context. A well-written CLAUDE.md file or detailed first prompt saves dozens of clarification rounds later.
- Use the right model. Reserve Opus for complex architecture decisions and multi-file refactors. Use Sonnet (or the /fast toggle in Claude Code) for straightforward edits, test writing, and simple debugging.
- Let agentic loops run. When Claude Code reads files, runs commands, and iterates on its own, that's generally more token-efficient than you manually breaking the work into tiny steps.
- Avoid conversation bloat. If a thread has grown past its useful life, start a new one. But don't restart every five minutes - find the balance.
- Batch related work. Five small prompts about the same file cost more than one detailed prompt covering all five changes.
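The batching point can be sketched with the same kind of rough arithmetic. This is a simplified model, not a description of how Claude Code actually counts tokens: it assumes each request re-reads a shared context (the file plus prior turns) and that each reply is appended to the history. The function name and all token counts are hypothetical.

```python
def prompts_cost(context_tokens, prompt_sizes, reply_tokens):
    """Input tokens read across a series of prompts that share one context."""
    history = context_tokens
    total = 0
    for prompt in prompt_sizes:
        history += prompt        # the new prompt joins the history
        total += history         # the model reads everything so far
        history += reply_tokens  # the reply is carried into the next turn
    return total


# Illustrative numbers only: a 2,000-token file context and 300-token replies.
five_small = prompts_cost(2000, [50] * 5, reply_tokens=300)
one_batched = prompts_cost(2000, [250], reply_tokens=300)

print(five_small)   # 13750
print(one_batched)  # 2250
```

The five small prompts carry the same 250 tokens of instructions as the batched one, but under these assumptions they re-read the file context five times and drag every earlier reply along with them.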
The broader pattern here is straightforward: developers with 10+ years of experience using IDEs and command-line tools tend to adapt their existing workflow discipline to AI coding. Newer developers, or those coming from chat-based AI interactions, sometimes treat Claude Code like a chatbot rather than a development environment. The tool rewards the same things good engineering always has - clear requirements, structured context, and knowing when to use which tool for the job.