Related ToolsClaude For DesktopClaude CodeChatgpt

Claude Desktop vs Claude Code: Same Model, Very Different Behavior

Claude by Anthropic
Image: Anthropic

"Same model, two system prompts" is how one power user summarized a frustration that's becoming hard to ignore: Claude Desktop and Claude Code, both running Opus on the same account, behave like different products.

The observation comes from a developer who reports using Claude 10-14 hours daily across Desktop, Claude Code, and the API, all with the same account and settings. The behavioral gap, they argue, is too consistent to be a model issue. It's a system prompt issue.

The specific complaint: Claude Desktop has become a "yes-machine." The user describes a pattern of reflexive agreement where presenting an argument to Desktop-Claude gets immediate validation rather than pushback. Claude Code, by contrast, stays sharp and will challenge assumptions or point out problems in your reasoning.

This tracks with what many heavy users have noticed. Anthropic tunes each product's system prompt for its context. Desktop is a general-purpose assistant where most users want helpful, agreeable responses. Code is a development tool where catching mistakes and questioning bad assumptions is the whole point. The incentives pull in opposite directions.

The practical problem is real. If you're using Claude Desktop for anything that requires critical thinking, brainstorming where someone needs to poke holes in your ideas, or evaluating technical tradeoffs, a system prompt optimized for agreeableness actively works against you. You get comfortable, the AI confirms your assumptions, and you go to bed thinking your plan is solid when it isn't. Hence the name: the go-to-bed problem.

The API, predictably, sits in the middle since you control the system prompt yourself. For users who need Claude to push back, that's the most reliable option, though it requires more setup than either consumer product.

This isn't unique to Anthropic. OpenAI has faced similar criticism about ChatGPT being too agreeable, and the tension between "helpful assistant" and "honest critic" is a product design problem every AI company is navigating. The difference is that Anthropic ships two products where the gap is easy to observe side by side.