Anthropic just shipped computer use for Claude Code, letting the AI open apps, click through interfaces, and take screenshots on your Mac, all from the same terminal session where it writes your code.
The feature, available as a built-in MCP server in Claude Code v2.1.85+, fills a specific gap: tasks that need a graphical interface but don't have a CLI or API. Think testing a native macOS app after building it, reproducing a layout bug at a specific window size, or driving the iOS Simulator without writing XCTest code.
How It Works
Claude follows a tool hierarchy. It tries MCP servers first, then shell commands, then Chrome automation, and only falls back to screen control when nothing else fits. When it does take over your screen, it hides other apps so it only interacts with the ones you've approved, and your terminal stays visible so you can watch.
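To make the fallback order concrete, here is a minimal Python sketch of a "first tool that fits" selection. It is an illustration only, not Claude Code's actual implementation: the `Tool` class, the `pick_tool` function, and the example task names are all hypothetical. Only the ordering (MCP servers, then shell, then Chrome automation, then screen control) comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Tool:
    """Hypothetical stand-in for one way of carrying out a task."""
    name: str
    can_handle: Callable[[str], bool]


# Ordered from most to least preferred, following the article's hierarchy.
# The task names are invented for illustration.
TOOL_ORDER: List[Tool] = [
    Tool("mcp_server", lambda task: task in {"query_issue_tracker"}),
    Tool("shell", lambda task: task in {"build_app", "run_tests"}),
    Tool("chrome_automation", lambda task: task in {"check_web_page"}),
    # Screen control is the last resort, so it accepts anything left over.
    Tool("computer_use", lambda task: True),
]


def pick_tool(task: str) -> str:
    """Return the name of the first tool in the hierarchy that fits the task."""
    for tool in TOOL_ORDER:
        if tool.can_handle(task):
            return tool.name
    raise LookupError(f"no tool can handle {task!r}")


print(pick_tool("build_app"))             # handled by the shell
print(pick_tool("click_native_app_tab"))  # nothing else fits, so screen control
```

The point of the ordering is that screen control is the slowest and most intrusive option, so it is tried only after every cheaper route has declined the task.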
Setup takes about a minute: run /mcp in a Claude Code session, enable the computer-use server, and grant macOS Accessibility and Screen Recording permissions when prompted. After that, you can ask Claude to do things like:
- Build a Swift app, launch it, click through every tab, and screenshot any errors
- Resize a window until a CSS bug reproduces, capture it, then patch the fix
- Open the iOS Simulator and tap through an onboarding flow
Each app needs per-session approval. Apps with broad access (terminals, Finder, System Settings) show extra warnings so you know what you're granting.
Limitations to Know
This is macOS only. No Linux, no Windows. You need a Pro or Max subscription through claude.ai directly (not Bedrock or Vertex). Only one Claude session can hold the screen lock at a time. And it only works in interactive mode, so you can't script it with the -p flag.
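Those requirements can be summarized as a small preflight check. The sketch below is a hypothetical helper, not part of Claude Code: the function name, the plan strings ("pro", "max"), and the assumption that versions are plain dotted numbers are all made up for illustration. The minimum version, the macOS requirement, and the interactive-mode requirement come from the article.

```python
import platform
from typing import List, Tuple

# Minimum Claude Code version mentioned in the article.
MIN_VERSION: Tuple[int, ...] = (2, 1, 85)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.1.85' into a comparable tuple.

    Assumes purely numeric components; suffixes like '-beta' are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(system: str, version: str,
                       interactive: bool, plan: str) -> List[str]:
    """Return a list of reasons computer use would be unavailable."""
    problems: List[str] = []
    if system != "Darwin":  # platform.system() reports 'Darwin' on macOS
        problems.append("macOS only")
    if parse_version(version) < MIN_VERSION:
        problems.append("needs Claude Code v2.1.85 or later")
    if not interactive:
        problems.append("interactive mode only (not the -p flag)")
    if plan not in {"pro", "max"}:  # hypothetical plan identifiers
        problems.append("needs a Pro or Max subscription")
    return problems


# Example: check the current machine with assumed values for the rest.
print(unmet_requirements(platform.system(), "2.1.85", True, "pro"))
```

The single-session screen lock is not modeled here, since how Claude Code enforces it is not described.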
Browsers and trading platforms are view-only. Terminals are click-only. Everything else gets full control. Press Escape at any time to immediately stop Claude and get your screen back.
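One way to picture the access tiers, together with the per-session approval described earlier, is as a lookup from app category to permitted actions. This Python sketch is an assumption-laden model, not the real permission system: the category names, the action names ("screenshot", "click", "type"), and the exact set of actions each tier allows are hypothetical readings of "view-only", "click-only", and "full control".

```python
from enum import Enum
from typing import Set


class Access(Enum):
    VIEW_ONLY = "view-only"
    CLICK_ONLY = "click-only"
    FULL = "full control"


# Assumed interpretation of each tier; the article does not list actions.
ALLOWED_ACTIONS = {
    Access.VIEW_ONLY: {"screenshot"},
    Access.CLICK_ONLY: {"screenshot", "click"},
    Access.FULL: {"screenshot", "click", "type"},
}

# Categories named in the article; anything else gets full control.
RESTRICTED_CATEGORIES = {
    "browser": Access.VIEW_ONLY,
    "trading_platform": Access.VIEW_ONLY,
    "terminal": Access.CLICK_ONLY,
}


def is_allowed(category: str, action: str, approved: Set[str]) -> bool:
    """Check an action against session approval first, then the access tier."""
    if category not in approved:  # every app needs per-session approval
        return False
    tier = RESTRICTED_CATEGORIES.get(category, Access.FULL)
    return action in ALLOWED_ACTIONS[tier]


approved_this_session = {"browser", "terminal", "notes_app"}
print(is_allowed("browser", "click", approved_this_session))   # view-only tier
print(is_allowed("terminal", "click", approved_this_session))  # click-only tier
print(is_allowed("notes_app", "type", approved_this_session))  # full control
```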
The practical use case here is narrow but real: if you build native apps and want to validate them without manually clicking through every screen, or if you're debugging visual issues that only appear under specific conditions, this saves a lot of alt-tabbing. Web developers who already use browser-based testing tools probably won't need it often.