DDR5 RAM prices jumped from $110 to $415 between May and December 2025. That single data point tells you more about the state of AI infrastructure than any corporate earnings call.
A detailed analysis from researcher Martin Alderson argues that the AI compute shortage has shifted from a forecasted problem to a present one, with demand from agentic AI workflows outpacing what current infrastructure can deliver.
The Numbers Behind the Squeeze
Claude Code, Anthropic's coding agent, has roughly 2-3 million users. That sounds like a lot until you realize it represents about 1% of knowledge workers across OECD countries (a group of 38 mostly high-income nations). The product is already generating an estimated $2.5 billion in annualized revenue, or around $200 million per month.
Now imagine what happens when penetration goes from 1% to 5% or 10%. The infrastructure is not ready.
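To make that jump concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above, takes 2.5 million as the midpoint of the user estimate, and assumes each user consumes the same compute as today. Both are simplifying assumptions, not forecasts.

```python
# Figures quoted above; the 2.5 million midpoint is an assumption.
CURRENT_USERS = 2_500_000      # midpoint of the 2-3 million estimate
CURRENT_PENETRATION_PCT = 1    # about 1% of OECD knowledge workers

# Implied size of the knowledge-worker population.
KNOWLEDGE_WORKERS = CURRENT_USERS * 100 // CURRENT_PENETRATION_PCT


def users_at(penetration_pct: int) -> int:
    """Users implied by a given penetration, in whole percent."""
    return KNOWLEDGE_WORKERS * penetration_pct // 100


def demand_multiplier(penetration_pct: int) -> float:
    """Compute demand relative to today, assuming constant per-user usage."""
    return penetration_pct / CURRENT_PENETRATION_PCT


for pct in (1, 5, 10):
    print(f"{pct:>2}% penetration: {users_at(pct):,} users, "
          f"{demand_multiplier(pct):.0f}x today's demand")
```

If agentic usage also raises compute per user, the real multiplier would be higher than this linear estimate.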
Current data center capacity sits at approximately 15 GW and is unlikely to grow significantly before 2027, largely because DRAM supply (the memory chips AI servers need), rather than electrical power or GPUs alone, is the binding constraint.
Visible Cracks in the System
The evidence is not just theoretical. Anthropic's uptime recently degraded to what engineers call "one nine" of reliability (conventionally, availability of around 90%), and staff have attributed this to "unprecedented growth" in usage. The company has made visible product trade-offs in response: reducing the default compute effort for Opus 4.6 to medium rather than high, removing access to older models, and disabling prompt suggestions. These are not feature decisions. They are capacity management.
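For readers unfamiliar with the "nines" shorthand, the short Python sketch below converts a number of nines into an availability percentage and the downtime it would allow over a year. It uses the conventional definition (one nine is 90%, two nines is 99%, and so on). The article does not give Anthropic's exact measured uptime, so these are illustrative ceilings, not reported figures.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def availability_pct(nines: int) -> float:
    """Availability implied by a number of nines, e.g. 2 -> 99.0."""
    return 100 - 100 / 10 ** nines


def downtime_hours_per_year(nines: int) -> float:
    """Maximum yearly downtime that still meets the given number of nines."""
    return HOURS_PER_YEAR / 10 ** nines


for n in (1, 2, 3):
    print(f"{n} nine(s): {availability_pct(n):.1f}% available, "
          f"up to {downtime_hours_per_year(n):.2f} hours down per year")
```

By this convention, a service at one nine can be unavailable for up to 876 hours a year, against 87.6 hours at two nines.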
It is not just a Western problem either. Alibaba Cloud's CEO said in November 2025 that the company was "not able to keep pace" with deployment capacity. Data from OpenRouter shows Alibaba's flagship model achieving only 6 tokens per second in throughput, a crawl compared to what users expect.
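To put 6 tokens per second in perspective, the Python sketch below estimates how long a response would take to generate at that rate. The 1,000-token response length and the 60 tokens-per-second comparison rate are hypothetical values chosen for illustration; neither comes from Alderson's analysis or from OpenRouter.

```python
def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time to stream a response of the given length at a steady rate."""
    return tokens / tokens_per_second


REPORTED_RATE = 6        # tokens per second, as cited from OpenRouter
COMPARISON_RATE = 60     # hypothetical faster rate, for contrast only
RESPONSE_TOKENS = 1_000  # hypothetical response length

slow = seconds_to_generate(RESPONSE_TOKENS, REPORTED_RATE)
fast = seconds_to_generate(RESPONSE_TOKENS, COMPARISON_RATE)

print(f"At {REPORTED_RATE} tokens/s: {slow / 60:.1f} minutes")
print(f"At {COMPARISON_RATE} tokens/s: {fast:.0f} seconds")
```

Under these assumptions, a single 1,000-token answer takes close to three minutes at the reported rate.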
Agentic AI Is the Multiplier
The real concern is what happens as AI usage shifts from single-prompt interactions to agentic workflows, where an AI runs multiple steps autonomously, consuming far more compute per task. A developer using Claude Code for a coding session might generate hundreds of thousands of tokens (words and code fragments the model processes) in an hour. Multiply that across enterprise "Cowork" pilots where entire teams adopt these tools, and current infrastructure starts to look very thin.
Alderson's practical advice: lock in annual contracts with AI providers now while pricing is still competitive, and avoid betting everything on a single provider. Switching costs between AI platforms remain low today, but that could change fast if providers start rationing access during capacity crunches.
For daily AI tool users, the takeaway is concrete. If you have been noticing slower response times, more rate limits, or degraded output quality during peak hours, it is not your imagination. The pipes are getting full, and new capacity is at least a year away.