26.5%. That is how much further a Claude Pro subscription stretches when routed through Edgee's token compression layer, according to the startup's own head-to-head test. Same plan, same tasks, one session with raw Claude Code and one through Edgee's AI gateway.
Edgee works as a proxy that sits between you and the language model API. Before your prompt reaches Claude (or any other model), Edgee's compressor strips redundant tokens at the edge, reducing the total token count without changing the meaning of your request. The company claims this can cut LLM costs by up to 50% in broader usage, with the 26.5% figure being the specific result for Claude Pro's usage-limited subscription.
The gateway also offers model routing, meaning it can redirect requests to cheaper models when the task does not require a top-tier one, plus an observability layer for tracking usage across providers. It exposes a single OpenAI-compatible API, so switching over does not require rewriting your integration.
Edgee was co-founded by Sacha Morard, previously CTO of Le Monde Group for six years. A free tier is available with paid plans for heavier usage.
The 26.5% number is self-reported and comes from a single benchmark, so take it accordingly. But the underlying idea is sound: Claude Pro's rate limits are based on token throughput, and if you can say the same thing in fewer tokens, you get more turns before hitting the wall. For developers burning through Claude Code sessions on large codebases, even a 15-20% real-world improvement would meaningfully change the daily workflow.