AI Model Inference Prices Keep Dropping Into Spring 2026

March 24, 2026

A year ago, pushing a million tokens through a top-tier AI API cost roughly $30 on input alone. In early 2026, the same volume often costs a fraction of that, and the floor keeps dropping.

The latest pricing data shows the slide continuing into spring. OpenAI, Google, and Anthropic have all cut API prices multiple times since early 2025, and each new model generation tends to deliver better performance at a lower cost per token. A token is the basic unit AI models use to process text, roughly three-quarters of a word.
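To make the per-token arithmetic concrete, here is a minimal sketch that converts a word count into an estimated token count and an input bill. It assumes the rough three-quarters-of-a-word rule above (real tokenizers vary by model and language). The function names are illustrative, and the $3 price is a hypothetical placeholder, not any provider's actual rate.

```python
# Rule of thumb from the article: one token is roughly 0.75 words.
# Real tokenizers differ by model and language, so treat this as an estimate.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of `word_count` words would use."""
    return round(word_count / WORDS_PER_TOKEN)


def input_cost_usd(tokens: int, price_per_million_usd: float) -> float:
    """Input cost for `tokens` tokens at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million_usd


# About 750,000 words works out to about a million tokens under this rule.
tokens = estimate_tokens(750_000)

# $30 per million input tokens is the year-ago figure cited in the article.
cost_year_ago = input_cost_usd(tokens, 30.0)

# Hypothetical lower price, used only to show how the bill scales.
cost_hypothetical = input_cost_usd(tokens, 3.0)

print(f"{tokens:,} tokens: ${cost_year_ago:.2f} then, ${cost_hypothetical:.2f} at the hypothetical rate")
```

Because cost scales linearly with the per-million price, any percentage cut in the price shows up as the same percentage cut in the bill for a fixed workload.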

For people running AI tools in their daily workflow, the downstream effects are already showing up. Subscription services that rely on these APIs can either drop their own prices or offer more generous usage limits. Several mid-tier AI writing and coding tools have expanded their free tiers in recent months, a direct result of their infrastructure costs shrinking.

Open-weight models add pressure from the other direction. When capable models can run on consumer hardware, commercial API providers have to keep prices competitive or lose the cost-conscious segment entirely. The recent wave of efficient mixture-of-experts models (architectures that activate only a portion of their parameters per query, cutting compute costs) has made local inference more practical than ever.
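The "only a portion of the parameters" idea can be illustrated with a toy top-k router. This is a simplified sketch, not how any particular model implements it: production mixture-of-experts layers typically route each token inside each layer using a learned gating network, while here the gate scores are hard-coded and the "experts" are hypothetical scaling functions.

```python
import math


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts only (max subtracted for numerical stability).
    peak = max(gate_scores[i] for i in chosen)
    exps = [math.exp(gate_scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k routed experts and blend their outputs by weight."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(gate_scores, k))


calls = []  # records which toy experts actually ran


def make_expert(scale):
    """Build a hypothetical 'expert' that just multiplies its input."""
    def expert(x):
        calls.append(scale)
        return scale * x
    return expert


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, -1.0, 2.0]  # hard-coded here; learned in a real model

output = moe_forward(1.0, experts, gate_scores, k=2)
print(f"output={output}, experts run: {len(calls)} of {len(experts)}")
```

With four experts and k=2, half of the expert computation is skipped for this input, which is the source of the compute savings the paragraph describes. All expert parameters still have to be stored, so memory requirements do not shrink the same way.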

None of this means AI tools are about to become free. But the trajectory is clear: a workload that cost $20 a month in API calls a year ago costs significantly less today. For small businesses and freelancers building AI into their workflows, every price cut widens the margin between what these tools cost and what they produce.
