AI coding agents write tests the same way they write code: confidently, quickly, and with a consistent blind spot for failure modes. Optinum is a new open-source tool built to catch what those agents miss.
The problem it targets is specific. When tools like Cursor or GitHub Copilot generate code and then write tests for it, they cover the expected cases well. What they skip: null inputs, network timeouts, rate limit responses, and empty arrays where the code expects at least one item. A human reviewer would flag these. An AI agent (a large language model, or LLM, generating code automatically) writing tests for its own code usually doesn't.
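To make those skipped cases concrete, here is a minimal Python sketch. Everything in it is hypothetical and chosen for illustration: the `fetch_titles` function, the `RateLimited` exception, and the injected `get` callable are not taken from Optinum or any real project. The first test is the kind an agent reliably writes; the other three are the kind that tend to be missing.

```python
class RateLimited(Exception):
    """Raised when the (hypothetical) API answers with HTTP 429."""


def fetch_titles(get):
    """Return item titles; `get` is an injected callable returning (status, items)."""
    status, items = get()
    if status == 429:
        raise RateLimited("retry later")
    # Treat a null payload the same as an empty list.
    return [item["title"] for item in (items or [])]


# The expected case, which agents cover well.
def test_returns_titles():
    assert fetch_titles(lambda: (200, [{"title": "a"}])) == ["a"]


# The cases that tend to be skipped.
def test_null_payload():
    assert fetch_titles(lambda: (200, None)) == []


def test_empty_array():
    assert fetch_titles(lambda: (200, [])) == []


def test_rate_limit():
    try:
        fetch_titles(lambda: (429, None))
    except RateLimited:
        return
    raise AssertionError("expected RateLimited")
```

Injecting `get` keeps the sketch free of real network calls, so the failure responses can be simulated with one-line lambdas.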
Optinum analyzes pull requests and maps the gap between what the code actually handles and what the tests cover. It's not a standard code coverage tool; those already exist, and AI agents know how to satisfy them. Optinum looks for semantic gaps: situations where a line of code is technically "covered" but the test never exercises the condition that would cause it to fail.
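A small Python sketch of what such a semantic gap can look like. It illustrates the concept only and does not show how Optinum detects anything; the function `first_active_user` and both tests are hypothetical. The happy-path test executes every line of the function, so a line coverage report would show it fully covered, yet the same final line fails as soon as no user is active.

```python
def first_active_user(users):
    """Return the name of the first active user (hypothetical example)."""
    active = [u for u in users if u.get("active")]
    return active[0]["name"]  # raises IndexError when no user is active


def test_happy_path():
    # Runs every line of first_active_user, so line coverage looks complete.
    users = [{"name": "ada", "active": True}, {"name": "bob", "active": False}]
    assert first_active_user(users) == "ada"


def test_empty_list_exposes_gap():
    # Same "covered" line, but under the condition the first test never exercises.
    try:
        first_active_user([])
    except IndexError:
        return
    raise AssertionError("expected IndexError for an empty list")
```

The second test documents the failure rather than fixing it. In a real codebase the follow-up would be deciding what the function should return when nothing matches.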
The project is early-stage, hosted on GitHub, and built for teams who've already adopted AI coding agents and started noticing patterns in what gets missed. The README frames it as a complement to existing testing infrastructure, not a replacement.
Who this is for: Engineering teams using Cursor, Aider, Claude Code, or similar tools on production code who want a second check before merging. Optinum fits into the CI pipeline as a step before human review, surfacing the test gaps worth looking at.
What it won't do: Replace thorough human review, catch logic bugs in the code itself, or work well on small projects where tests are still written manually.
It's a tool that makes more sense the more you've leaned on AI agents for code. If you're still writing most tests by hand, the problem it solves isn't yours yet.