A detailed field report from a platform engineer makes a sharp argument: AI agents are nowhere close to handling real infrastructure work, and the industry is kidding itself by pretending otherwise.
The post, written by an SRE who tested multiple models including Claude Opus, Sonnet, and Gemini on actual Kubernetes deployments, identifies three specific barriers that make infrastructure a uniquely bad fit for current AI agents.
YAML and HCL Are Almost Invisible to AI
Infrastructure-as-code languages like HCL (used by Terraform) and YAML (used by Kubernetes) have extremely low "context density" compared to programming languages like Python or Go. There is very little implicit meaning for an AI model to latch onto. A Python function name hints at what it does. A Kubernetes YAML manifest is just keys and values that mean nothing without deep knowledge of the specific provider, version, and deployment context.
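A rough illustration of that version dependence, as a minimal sketch rather than anything taken from the original post: the two manifests below differ only in their apiVersion string, yet whether a cluster accepts them depends on its release. The lookup table is a hand-written excerpt covering only Ingress, based on the Kubernetes deprecation guide (extensions/v1beta1 Ingress stopped being served in v1.22; networking.k8s.io/v1 Ingress arrived in v1.19). The names INGRESS_API_WINDOWS and is_served are hypothetical.

```python
from typing import Optional

# Hand-written excerpt for illustration only, not a complete or authoritative table.
# (apiVersion, kind) -> (first 1.x minor serving it, first 1.x minor no longer serving it)
# The lower bound for extensions/v1beta1 is approximate.
INGRESS_API_WINDOWS: dict[tuple[str, str], tuple[int, Optional[int]]] = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1", "Ingress"): (19, None),
}


def is_served(manifest: dict, cluster_minor: int) -> bool:
    """Return True if the manifest's apiVersion/kind is served by a 1.<cluster_minor> cluster."""
    key = (manifest.get("apiVersion"), manifest.get("kind"))
    window = INGRESS_API_WINDOWS.get(key)
    if window is None:
        # Unknown to this tiny table: treat as not served rather than guess.
        return False
    introduced, removed = window
    return introduced <= cluster_minor and (removed is None or cluster_minor < removed)


# Nothing in the keys themselves says which of these a given cluster will accept.
old_ingress = {"apiVersion": "extensions/v1beta1", "kind": "Ingress", "metadata": {"name": "web"}}
new_ingress = {"apiVersion": "networking.k8s.io/v1", "kind": "Ingress", "metadata": {"name": "web"}}

print(is_served(old_ingress, 21), is_served(old_ingress, 22))
print(is_served(new_ingress, 18), is_served(new_ingress, 22))
```

The knowledge that decides the outcome lives entirely in the table, outside the manifest, which is the kind of context a model has to supply on its own.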
The result: agents regularly hallucinate realistic-sounding CLI flags that don't exist, confuse forked tools with their parents (like OpenBao vs. HashiCorp Vault), and burn through tokens retrying random permutations. In testing, agents struggled with anything beyond single-cluster Kubernetes deployments and frequently defaulted to tearing everything down and rebuilding from scratch.
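One possible guard against invented flags, sketched here under stated assumptions and not described in the original post: before running a command an agent proposes, compare its long flags against a set of flags known to exist for that tool. The tool name deployctl, its flags, and the function unknown_flags are all hypothetical; in practice the known set would have to come from the real tool's help output or documentation.

```python
def unknown_flags(argv: list[str], known_flags: set[str]) -> list[str]:
    """Return the long flags in argv that are absent from known_flags.

    Only tokens starting with "--" are checked. A bare "--" ends flag parsing,
    following the common CLI convention.
    """
    unknown = []
    for token in argv:
        if token == "--":
            break
        if token.startswith("--"):
            # Strip any "=value" suffix so "--namespace=prod" matches "--namespace".
            name = token.split("=", 1)[0]
            if name not in known_flags:
                unknown.append(name)
    return unknown


# Hypothetical tool and flag set, used purely for illustration.
KNOWN = {"--namespace", "--dry-run"}
proposed = ["deployctl", "apply", "--namespace=prod", "--force-sync"]

rejected = unknown_flags(proposed, KNOWN)
if rejected:
    print("refusing to run; unrecognized flags:", rejected)
```

A check like this only catches flags that do not exist at all. It does nothing for a flag that exists but means something different in a fork or a different version.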
The Agentic SRE Prerequisite Problem
The post takes aim at the growing wave of "agentic SRE" startups promising AI-powered incident response. The core problem: these tools require unified observability, current documentation, system dependency maps, and proper instrumentation. Most organizations have none of these.
The author's blunt assessment: "If there is already sufficient data access for an agent to function effectively, you likely don't need an agent for 99% of outages." In other words, by the time you've done the prerequisite work to make an AI agent useful, you've already solved most of the problems the agent was supposed to handle.
Documentation as the New Enterprise Moat
One prediction stands out. As AI makes documentation instantly searchable and actionable, enterprise software companies will start restricting access to their technical docs. Open-source community editions may get stripped-down documentation while detailed guides become a paid feature. Knowledge itself becomes the product differentiator.
The post also notes that Go appears to be the best language for AI-assisted coding because of its simplicity, strong typing, and fast compile times, which give agents quick feedback loops that languages with longer build cycles can't match.
For anyone managing infrastructure teams being pitched "AI-powered operations" tools, this is a useful reality check. The technology will get there eventually, but the gap between demo and production is measured in years, not months.