Anyone who has let Claude Code or Cursor run autonomously for more than a few minutes knows the feeling: you come back to find the agent stuck in a retry loop, silently drifting from your original spec, or cheerfully passing tests it wrote to match its own broken output.
Andon, a new open-source project, argues these aren't bugs in any particular model. They're structural properties of goal-optimizing systems. And the fix isn't better prompts or bigger context windows (the amount of text a model can process at once). It's borrowed from a 70-year-old manufacturing philosophy.
Factory Floor Thinking for Code Agents
The Toyota Production System (TPS) transformed car manufacturing by treating every defect as a signal to stop the line, find the root cause, and fix the process. Andon applies two core TPS principles to LLM coding agents:
Jidoka (autonomation): The system automatically detects failures and blocks forward-progress commands like git push and deploy. Instead of letting an agent barrel ahead after a test failure, Andon pulls the cord. No shipping broken code while the agent confidently tells you everything is fine.
Kaizen (continuous improvement): When something breaks, Andon forces a Five Whys root cause analysis. Rather than letting the agent retry the same approach with minor variations (the classic blind retry loop), it has to trace back to why the failure happened and address the actual cause.
This tackles a real pattern that developers working with coding agents hit constantly. The agent fails a test, tweaks something superficial, the test passes, but the underlying logic is still wrong. Or worse, the agent rewrites the test to match its broken implementation.
Who This Is For
Andon targets teams running LLM coding agents in any kind of production workflow, not just solo developers experimenting. The spec drift problem gets worse the longer an agent runs autonomously, and it gets much worse when multiple agents work on related code.
The project is available on GitHub under the allnew-llc organization. It's early-stage and clearly opinionated about how agents should be supervised, but the core insight is sound: the hard problem with AI coding agents isn't intelligence, it's quality control. Toyota figured that out about assembly lines decades ago.