Mnemosyne Indexes Your Codebase to Cut Claude Code Token Usage by 73%

April 3, 2026 2 min read

If you've hit Claude Code's usage limits mid-session, Mnemosyne is worth a look. Built by Rocco Castoro, this open-source context engine sits between your codebase and your LLM, indexing everything into SQLite so the AI reads less but finds more.

How It Works

Instead of letting Claude Code read raw source files (which burns through tokens fast on large repos), Mnemosyne pre-indexes your codebase and retrieves only the relevant chunks when you ask a question. It combines six retrieval signals using Reciprocal Rank Fusion (a technique that merges multiple ranked lists into one): BM25 keyword matching, TF-IDF term weighting, symbol search for function and class names, usage frequency tracking, predictive prefetch, and, if you install onnxruntime, optional dense embeddings.
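
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is not Mnemosyne's actual code: the function name, the chunk identifiers, and the three sample rankings are hypothetical, and the constant k=60 is the value commonly used with RRF, not one confirmed for this project. Each chunk earns 1 / (k + rank) from every list it appears in, and the totals decide the final order.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default; assumed here).
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, chunk_id in enumerate(ranked, start=1):
            # A chunk near the top of any list gets a larger share.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical results from three of the retrieval signals.
bm25_hits = ["auth.py::login", "auth.py::logout", "db.py::connect"]
symbol_hits = ["auth.py::login", "db.py::connect"]
usage_hits = ["db.py::connect", "auth.py::login", "utils.py::retry"]

fused = reciprocal_rank_fusion([bm25_hits, symbol_hits, usage_hits])
print(fused)
```

Because RRF only looks at rank positions, signals with very different score scales (BM25 scores, usage counts, embedding similarities) can be merged without normalizing them first.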

The key technical trick is AST-aware chunking. Rather than splitting code at arbitrary line breaks, Mnemosyne parses the actual syntax tree (the structural representation of your code) for Python, Go, Rust, C#, Java, Kotlin, JavaScript, and TypeScript. This means chunks align with real code boundaries like functions and classes, not random 500-line blocks.
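
As a rough illustration of the idea, the sketch below chunks a Python file along its top-level function and class boundaries using the standard library's ast module. This is an assumption-laden example, not Mnemosyne's parser: the function name, the chunk dictionary layout, and the sample source are all hypothetical, and it covers only Python, not the other listed languages.

```python
import ast


def chunk_by_definitions(source):
    """Split Python source into one chunk per top-level def or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # Start at the first decorator, if any, so it stays with its target.
            start = min([node.lineno] + [d.lineno for d in node.decorator_list])
            end = node.end_lineno  # available on Python 3.8+
            chunks.append({
                "name": node.name,
                "start": start,
                "end": end,
                "text": "\n".join(lines[start - 1:end]),
            })
    return chunks


SAMPLE = '''import os

def load(path):
    with open(path) as fh:
        return fh.read()

class Index:
    def add(self, chunk):
        self.chunks.append(chunk)
'''

chunks = chunk_by_definitions(SAMPLE)
for c in chunks:
    print(c["name"], c["start"], c["end"])
```

Each chunk carries a whole definition, so a retrieved result never begins or ends halfway through a function body.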

The Numbers

Testing on an 829-file production codebase, Mnemosyne delivered 73% token savings compared to direct source reading. Query latency sits under 500ms on a cold start and under 200ms warm. The creator notes that repos under roughly 50 files won't see much benefit since Claude Code can handle those without token pressure.

The recommended workflow is hybrid: let Mnemosyne find the relevant files first, then have Claude Code read those specific files. This combination reportedly produces better results than either approach alone.

Setup

To install, run pip install mnemosyne-engine, then run mnemosyne init && mnemosyne ingest in your project directory. Integration with Claude Code works through instructions in CLAUDE.md that tell the agent to query the Mnemosyne index before reading files. No API keys, no cloud services, no Docker. Everything runs locally.

Licensed under AGPL-3.0 with a commercial license option, the source is on GitHub at castnettech/mnemosyne.
