Last year's AI pilots looked cheap. This year's production deployments don't.
A new Axios report puts numbers to something finance teams at large companies are increasingly living with: the gap between what AI was supposed to cost and what it actually costs once you roll it out across a real organization.
The pattern is consistent. A company runs a 90-day pilot with a small team, gets useful results, and gets budget approval to expand. Then the real numbers arrive. Inference costs - the per-query fees charged every time an employee sends a prompt to an AI model - multiply with headcount. Integration work, connecting AI tools to existing software like CRMs, databases, and document systems, takes months longer than anticipated. And the human oversight layer - the people who review AI outputs before they reach clients or get acted on - turns out to be non-trivial.
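A rough sketch shows how per-query fees grow with headcount. Every figure here (team sizes, queries per day, price per query, working days) is a hypothetical placeholder chosen for illustration, not a number from the Axios report, and a flat per-query price is itself a simplifying assumption.

```python
def monthly_inference_cost(employees, queries_per_employee_per_day,
                           cost_per_query_usd, working_days=21):
    """Estimate monthly inference spend, assuming a flat per-query price."""
    return (employees * queries_per_employee_per_day
            * working_days * cost_per_query_usd)


# Hypothetical inputs, for illustration only.
pilot = monthly_inference_cost(
    employees=10, queries_per_employee_per_day=40, cost_per_query_usd=0.02)
rollout = monthly_inference_cost(
    employees=500, queries_per_employee_per_day=40, cost_per_query_usd=0.02)

print(f"pilot:   ${pilot:,.2f} per month")
print(f"rollout: ${rollout:,.2f} per month")
print(f"ratio:   {rollout / pilot:.0f}x")
```

Under these assumptions the bill scales linearly with the number of employees, so a line item that looked negligible in the pilot grows by the same factor as the rollout.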
Pilot Economics Don't Scale
Small teams can hand-tune their AI workflows. At 500 employees, you need documentation, training, guardrails, and someone responsible when the model generates an incorrect contract clause or misreads a customer complaint. These aren't software costs. They're people costs, and they don't show up on a vendor's pricing page.
Fine-tuning - the process of training an AI model on a company's specific data so it behaves more consistently with internal terminology and processes - sounds like a one-time expense. In practice, it requires ongoing maintenance as products, policies, and language change. The model calibrated to your Q3 product catalog doesn't automatically learn your January rebrand.
ROI is also genuinely hard to measure in a way that satisfies a CFO. If a content team of eight produces 30% more material with AI assistance, does that mean you can cut two people? Some companies are making that bet. Others are finding productivity gains don't translate cleanly to headcount reduction because the work expands to fill the new capacity.
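The arithmetic behind the "cut two people" question can be sketched as follows. It assumes the 30% gain applies evenly to every team member and that the amount of work stays fixed, which is exactly the assumption that breaks when work expands to fill the new capacity. The function name is illustrative.

```python
def equivalent_headcount(team_size, productivity_gain):
    """People needed to match the old output if each person
    produces (1 + productivity_gain) times as much as before."""
    return team_size / (1 + productivity_gain)


team_size = 8
gain = 0.30  # the 30% figure from the example in the text

needed = equivalent_headcount(team_size, gain)
freed = team_size - needed

print(f"people needed for the old output: {needed:.2f}")
print(f"capacity freed, in people: {freed:.2f}")
```

The freed capacity comes out a little under two people, which is why the bet is tempting on paper. It only pays off if output demand really does stay flat.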
Where the Numbers Actually Hold
Some categories have clear economics. Customer support automation, where AI handles high volumes of repetitive queries with human review on edge cases, produces measurable cost-per-ticket comparisons. Document summarization, data extraction from PDFs and scanned forms, and developer coding assistance all have countable outputs.
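A cost-per-ticket comparison of this kind might look like the sketch below. The dollar amounts and the escalation rate are hypothetical, and the model assumes every escalated ticket incurs the full human handling cost on top of the AI cost.

```python
def blended_cost_per_ticket(ai_cost, human_cost, escalation_rate):
    """Average cost per ticket when AI handles every ticket first and
    a share (escalation_rate) is escalated to a human for review."""
    return ai_cost + escalation_rate * human_cost


# Hypothetical figures, for illustration only.
human_only = 6.00       # cost of a fully human-handled ticket, USD
ai_handling = 0.50      # AI cost per ticket, USD
escalation_rate = 0.20  # share of tickets that still need a human

blended = blended_cost_per_ticket(ai_handling, human_only, escalation_rate)

print(f"human-only: ${human_only:.2f} per ticket")
print(f"AI with human review on edge cases: ${blended:.2f} per ticket")
```

Because tickets are countable and the escalation rate can be measured, each input to this comparison can be checked against real operational data.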
The harder cases involve judgment, relationships, or creative work. AI-assisted sales emails that need brand review before sending. AI-generated reports that require a senior analyst to verify before a client presentation. In those cases, oversight cost frequently eats the productivity gain.
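A minimal sketch of how oversight can eat the gain: time saved by the person drafting is valued at one hourly rate, while review time is billed at a more senior rate. All task counts, minutes, and rates are hypothetical.

```python
def net_saving_usd(tasks, minutes_saved_per_task, author_rate_per_hour,
                   review_minutes_per_task, reviewer_rate_per_hour):
    """Value of drafting time saved minus the cost of review time added."""
    saved = tasks * (minutes_saved_per_task / 60) * author_rate_per_hour
    review = tasks * (review_minutes_per_task / 60) * reviewer_rate_per_hour
    return saved - review


# Hypothetical figures: 100 AI-drafted reports, each saving the author
# 30 minutes but requiring 15 minutes of a senior analyst's time.
net = net_saving_usd(
    tasks=100,
    minutes_saved_per_task=30, author_rate_per_hour=60,
    review_minutes_per_task=15, reviewer_rate_per_hour=150,
)

print(f"net saving: ${net:,.2f}")
```

With these inputs the result is negative: the reviewer spends half as long as the author saves, but at a higher rate, so the review cost exceeds the value of the drafting time recovered.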
The sticker shock isn't always about the AI tools themselves. It's about discovering that making AI reliably useful inside a complex organization is an infrastructure project, not a software subscription.