Open Source Notable

IronCurtain Sandboxes AI Agents So They Can Never Touch Your Credentials

March 7, 2026

What Happened

Niels Provos released IronCurtain, an open-source framework that applies a chokepoint security architecture to AI agents. Every action an agent takes funnels through a single MCP (Model Context Protocol) proxy that enforces policy before anything reaches external systems. You can run it with npx @provos/ironcurtain.

The framework runs agents in two sandbox modes. Code Mode executes TypeScript in a V8 isolate with zero filesystem or network access - the only way out is through the proxy's typed API. Docker Mode runs agents in containers with networking completely disabled (--network=none), communicating solely through Unix domain sockets.
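To make the Docker Mode description concrete, here is a minimal sketch of how a launcher might assemble the container arguments. Only the --network=none flag comes from the description above. The function name, image name, socket directory, and mount point are hypothetical and are not taken from IronCurtain itself.

```typescript
// Hypothetical helper: builds arguments for `docker run` so that the container
// has no network and can reach the host only through a mounted socket directory.
// Only --network=none is from the article; everything else is an assumption.
function buildDockerArgs(image: string, hostSocketDir: string): string[] {
  return [
    "run",
    "--rm",                              // remove the container when it exits
    "--network=none",                    // no network interfaces except loopback
    "-v", hostSocketDir + ":/run/proxy", // assumed mount point for the Unix domain socket
    image,
  ];
}

// Example values, both made up for illustration.
const args = buildDockerArgs("agent-image:latest", "/tmp/ironcurtain-sockets");
console.log("docker " + args.join(" "));
```

With networking disabled, the mounted socket directory is the only channel left, which is what makes the proxy a true chokepoint.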

The credential handling is the standout design choice. Agents never see real API keys. In Docker Mode, agents receive fake credentials that pass format validation but grant no access. A TLS-terminating man-in-the-middle proxy intercepts outbound requests and swaps in real credentials before forwarding upstream. The agent cannot leak what it does not have.
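The swap step can be illustrated with a short sketch. This is not IronCurtain's implementation. The header shape, the mapping table, and both key values are assumptions chosen to show the idea: the proxy holds the only copy of the real secret and rewrites requests on the way out.

```typescript
// Hypothetical sketch of the fake-to-real credential swap performed by the proxy.
type Headers = Record<string, string>;

// Assumed mapping from the placeholder handed to the agent to the real secret.
// Both values are invented for illustration.
const credentialMap: Record<string, string> = {
  "sk-fake-0000": "sk-real-example",
};

// Return a copy of the headers with every known fake credential replaced.
// The input object is left untouched.
function swapCredentials(headers: Headers, mapping: Record<string, string>): Headers {
  const out: Headers = {};
  for (const name of Object.keys(headers)) {
    let value = headers[name];
    for (const fake of Object.keys(mapping)) {
      value = value.split(fake).join(mapping[fake]);
    }
    out[name] = value;
  }
  return out;
}

// What the agent sends (it only ever knows the fake key).
const agentRequest: Headers = {
  authorization: "Bearer sk-fake-0000",
  accept: "application/json",
};

// What the proxy forwards upstream.
const upstreamRequest = swapCredentials(agentRequest, credentialMap);
console.log(upstreamRequest.authorization);
```

If a prompt injection convinces the agent to print or send its key, the only value it can disclose is the placeholder.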

Policy enforcement uses plain English "constitutions" compiled into deterministic rules. Instead of writing complex policy DSLs, you write something like: "the agent may read and write files in the project directory, may perform read-only git operations without approval, and must ask before pushing to any remote." Actions get categorized as Allow, Escalate (requires human approval), or Deny (blocked with reasoning sent back to the model).
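The article does not describe the compiled rule format, so the following is only a sketch of what deterministic rules derived from the example constitution could look like. The tool names, the rule shape, the project path, and the default-deny fallback are all assumptions.

```typescript
// Hypothetical rule model; not IronCurtain's actual compiled format.
type Decision = "allow" | "escalate" | "deny";

interface ToolCall {
  tool: string;  // assumed tool names such as "fs.write" or "git.push"
  path?: string; // filesystem target, when the tool has one
}

interface Rule {
  matches: (call: ToolCall) => boolean;
  decision: Decision;
  reason: string;
}

const projectDir = "/workspace/project"; // assumed project directory

// True only for paths under dir that contain no ".." segment.
// A real implementation would also need to resolve symlinks.
function isInside(path: string, dir: string): boolean {
  if (path.split("/").indexOf("..") !== -1) return false;
  return path === dir || path.indexOf(dir + "/") === 0;
}

const rules: Rule[] = [
  {
    matches: (c) =>
      (c.tool === "fs.read" || c.tool === "fs.write") &&
      c.path !== undefined && isInside(c.path, projectDir),
    decision: "allow",
    reason: "file access inside the project directory",
  },
  {
    matches: (c) => ["git.status", "git.log", "git.diff"].indexOf(c.tool) !== -1,
    decision: "allow",
    reason: "read-only git operation",
  },
  {
    matches: (c) => c.tool === "git.push",
    decision: "escalate",
    reason: "pushing to a remote requires human approval",
  },
];

// First matching rule wins; anything unmatched is denied with a reason
// that can be sent back to the model.
function evaluate(call: ToolCall): { decision: Decision; reason: string } {
  for (const rule of rules) {
    if (rule.matches(call)) return { decision: rule.decision, reason: rule.reason };
  }
  return { decision: "deny", reason: "no rule permits this action" };
}

console.log(evaluate({ tool: "git.push" }));
```

Because the rules are ordinary predicates rather than model output, the same tool call always gets the same decision.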

Current capabilities include filesystem operations, git, web fetching with HTML-to-markdown conversion, web search via Brave and SerpAPI, and Signal integration for encrypted bot access.

Why It Matters

AI agents are getting more autonomous. Claude Code can run shell commands. Cursor can modify files across your project. The security model for most of these tools is essentially "trust the model and hope for the best." IronCurtain takes the opposite approach: assume the model will do something unexpected and contain the blast radius.

The fake-credentials pattern is particularly relevant. Prompt injection remains an unsolved problem - if an agent processes untrusted content (a webpage, a document, a message), it could be manipulated into exfiltrating secrets. Because the agent never holds real credentials, IronCurtain blocks credential exfiltration at the architecture level rather than relying on the prompt.

Our Take

This is the right approach to agent security, and it is notable that it builds on MCP rather than inventing a proprietary protocol. The plain-English constitution for policy is clever - it meets developers where they are instead of requiring them to learn yet another configuration language.

The honest acknowledgment that IronCurtain cannot prevent prompt injection is refreshing. Too many security tools oversell their capabilities. What it can do is contain the damage, and for most real-world scenarios, containment is what matters. Your agent might get confused by a malicious prompt in a webpage, but it cannot leak your AWS keys because it never had them.

The main limitation is friction. Running everything through a proxy with human escalation for sensitive actions slows things down. But that is exactly the tradeoff most teams should be making right now. We are in the "move carefully and build trust" phase of AI agent adoption, not the "let it run unsupervised" phase. IronCurtain is built for the world we are actually in.


