Most AI API bills are inflated by laziness, not necessity. The same GPT-5.2 endpoint that handles nuanced legal analysis is also formatting CSV headers and answering "what time zone is Tokyo in?" A new open-source tool called CostRouter targets exactly that waste.
CostRouter sits between your application and your AI providers as an API gateway. It scores each incoming request's complexity on a 0-100 scale, then routes it to the cheapest model capable of handling it. Simple text extraction and basic Q&A go to Llama 4 Scout at $0.0001 per 1,000 tokens. Mid-range tasks hit Gemini 3 Flash at $0.0005 per 1,000 tokens. Only genuinely complex reasoning stays on premium models like GPT-5.2 or Claude Opus.
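The routing step described above can be sketched in a few lines. This is a minimal illustration, not CostRouter's actual code: the `Tier` class, the `route` function, the lowercase model labels, and especially the score cutoffs (30 and 70) are assumptions made for the example, since the project's real thresholds and scoring logic are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    model: str      # label for the model (hypothetical identifier strings)
    max_score: int  # highest complexity score this tier is trusted with (assumed cutoff)


# Ordered cheapest first. Model names come from the article; the cutoffs are invented.
TIERS = [
    Tier("llama-4-scout", 30),
    Tier("gemini-3-flash", 70),
    Tier("gpt-5.2", 100),
]


def route(score: int) -> str:
    """Return the cheapest tier whose cutoff covers a 0-100 complexity score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for tier in TIERS:
        if score <= tier.max_score:
            return tier.model
    return TIERS[-1].model  # unreachable while the last cutoff is 100


if __name__ == "__main__":
    for s in (12, 55, 91):
        print(s, "->", route(s))
```

Keeping the tiers in a cheapest-first list means the first match is always the least expensive model judged capable, so adding a new tier is a one-line change. The hard part, producing the score itself, is deliberately left out.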
The pitch is straightforward: the developer behind the project found that 70-80% of production API calls in their own workflow were being sent to top-tier models unnecessarily. By matching task difficulty to model capability, CostRouter claims to cut API costs by roughly 60%.
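To see how a savings figure like this falls out of the arithmetic, here is a rough back-of-the-envelope sketch. The two budget prices are the ones quoted above; the premium price of $0.01 per 1,000 tokens and the 50/25/25 split of token volume are purely hypothetical inputs, and `savings_fraction` is a name invented for this example.

```python
def savings_fraction(shares, prices, baseline_price):
    """Fraction of spend saved versus sending every token to the baseline model.

    shares: share of total token volume routed to each model (must sum to 1)
    prices: price per 1,000 tokens for each model
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    blended = sum(share * prices[model] for model, share in shares.items())
    return 1.0 - blended / baseline_price


# Budget prices are from the article; the premium price is an assumption.
PRICES = {"llama-4-scout": 0.0001, "gemini-3-flash": 0.0005, "premium": 0.01}

# Hypothetical split: 75% of token volume leaves the premium tier.
SHARES = {"llama-4-scout": 0.50, "gemini-3-flash": 0.25, "premium": 0.25}

savings = savings_fraction(SHARES, PRICES, PRICES["premium"])

if __name__ == "__main__":
    print(f"estimated savings: {savings:.1%}")
```

With these made-up inputs the saving works out to a little over 73%. The result is driven almost entirely by how much token volume, not how many calls, leaves the premium tier, so a workload whose expensive requests are also its longest would save noticeably less.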
The concept is not new. Several startups and internal tools at larger companies have experimented with model routing, and providers themselves have pushed users toward tiered model families (OpenAI's mini series, Google's Flash models, Anthropic's Haiku). What CostRouter adds is an open-source, drop-in gateway that handles the routing logic automatically rather than requiring developers to manually decide which model to call for each use case.
The obvious risk is quality degradation on edge cases where the complexity scorer misjudges a request. A query that looks simple but requires deep contextual understanding could get routed to a budget model and return garbage. How well the scoring algorithm handles ambiguous requests will determine whether CostRouter saves money or just creates debugging headaches.
For teams running significant AI API volume, though, even imperfect routing beats the status quo of sending everything to the most expensive endpoint by default.