Here's a number that should give every engineering manager pause: AI-assisted code erodes a team's ability to catch bugs roughly 12 times faster than human-written code.
That finding comes from a new preprint by researcher Antonio Mennillo, published March 12, 2026. The study analyzed 1,594,764 file-touch events across 27 datasets spanning seven programming languages (Python, JavaScript, Java, Go, C++, Ruby, and TypeScript) to model what happens when teams adopt AI coding tools without adjusting their quality assurance processes.
The Math Behind the Slowdown
The paper uses differential equations to model how AI-generated code affects a team's "validation capacity," basically how well the team can review code and catch problems. The core finding: there's a critical QA threshold below which teams fall into what the paper calls "unrecoverable technical debt." Cross that line and, in the model, no amount of effort pulls you back.
Without additional QA investment, the model puts net velocity for teams using AI coding tools at 0.85x. That's not a speedup. That's a slowdown. The code comes out faster, sure, but the rework cycle eats those gains and then some.
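To make the idea of a critical threshold concrete, here is a toy model. It is not the paper's system of equations, which this article does not reproduce. The sketch assumes debt arrives at a constant rate and that QA repairs it at a rate that saturates as debt piles up. The names `inflow` and `qa_capacity`, the saturating form, and the numbers are all assumptions made for illustration.

```python
def simulate_debt(inflow, qa_capacity, steps=20_000, dt=0.5):
    """Euler-integrate the toy ODE  dD/dt = inflow - qa_capacity * D / (1 + D).

    Hypothetical model for illustration only; D is accumulated debt.
    """
    debt = 0.0
    for _ in range(steps):
        # Repair effort grows with debt but can never exceed qa_capacity.
        repair = qa_capacity * debt / (1.0 + debt)
        debt += dt * (inflow - repair)
    return debt


# QA capacity above the inflow: debt levels off near inflow / (qa_capacity - inflow).
stable = simulate_debt(inflow=0.03, qa_capacity=0.05)

# QA capacity below the inflow: repair can never keep up, so debt keeps climbing.
runaway = simulate_debt(inflow=0.03, qa_capacity=0.02)

print(f"stable debt: {stable:.2f}, runaway debt: {runaway:.2f}")
```

In this toy version the threshold sits exactly where `qa_capacity` equals `inflow`. Where the threshold sits in the paper, and what sets it, would have to come from the paper itself.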
One Tester Changes Everything
The practical takeaway is surprisingly concrete. Adding a single dedicated tester to an AI-assisted team pushes net velocity up to 1.32x, with an estimated ROI of 18:1 on that hire. The model predicts that the investment in QA doesn't just prevent collapse; it's where the actual productivity gains from AI tools come from.
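The two velocity figures are easier to compare with a little arithmetic. The sketch below only divides and subtracts the numbers quoted above. Treating both as multiples of the same baseline is an assumption, since the baseline is not defined here.

```python
# Figures as quoted in the article; the shared baseline is assumed.
velocity_without_qa = 0.85
velocity_with_tester = 1.32

# How much faster the team with a tester is than the team without one.
relative_gain = velocity_with_tester / velocity_without_qa

# Absolute difference between the two multipliers.
swing = velocity_with_tester - velocity_without_qa

print(f"relative gain: {relative_gain:.2f}x, swing: {swing:.2f}")
```

Under that assumption, the team with a tester moves about 1.55 times as fast as the same team without one.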
The researcher also built a classifier that can predict whether a project is heading toward debt collapse just by analyzing git log data. No complex calibration needed: the tool reads commit patterns and flags when a codebase is trending toward trouble.
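The article doesn't say which features the classifier uses, so the following is not the paper's tool. It is a minimal sketch of the raw ingredient mentioned earlier, file-touch events, pulled from `git log --name-only` output. The `@@` commit marker and both function names are choices made for this example.

```python
import subprocess
from collections import Counter

COMMIT_MARKER = "@@"  # arbitrary prefix chosen for this sketch


def read_git_log(repo_path="."):
    """Return git log text: one marker line per commit, then the touched paths."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--name-only",
         f"--pretty=format:{COMMIT_MARKER}%H"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def count_file_touches(log_text):
    """Count commits and per-file touch events in read_git_log-style text."""
    commits = 0
    touches = Counter()
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank separator between commits
        if line.startswith(COMMIT_MARKER):
            commits += 1
        else:
            touches[line] += 1  # one file-touch event
    return commits, touches


# Hand-written sample in the same shape as the git output above.
sample_log = "@@a1\nsrc/app.py\nREADME.md\n\n@@b2\nsrc/app.py\n"
sample_commits, sample_touches = count_file_touches(sample_log)
print(sample_commits, sample_touches.most_common(1))
```

On a real repository, `count_file_touches(read_git_log("path/to/repo"))` would give the same counts for the full history. Turning those counts into a collapse prediction is the part the paper's classifier handles.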
What This Means for Teams Using Copilot, Cursor, and Friends
This isn't an argument against AI coding tools. It's an argument against adopting them and assuming the job is done. The pattern the paper describes matches what many senior engineers have been saying anecdotally: AI-generated code looks right, passes a quick glance, but carries subtle issues that compound over time.
The 12x erosion rate (gamma_AI of 0.028 vs. gamma_human of 0.002) suggests that giving AI code the same review process as human code isn't enough. It needs more scrutiny, not less. Teams that respond to AI speed by relaxing code review are doing exactly the wrong thing.
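As a quick sanity check, the ratio of the two coefficients can be computed directly. Dividing the values as printed gives 14 rather than 12. The headline multiplier may reflect unrounded coefficients or a different comparison in the paper, but that is a guess and not something stated here.

```python
# Erosion coefficients as quoted in the article.
gamma_ai = 0.028
gamma_human = 0.002

# Ratio of the two rates as printed.
erosion_ratio = gamma_ai / gamma_human

print(f"erosion ratio: {erosion_ratio:.1f}x")
```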
The full paper runs 38 pages with all extraction scripts published for reproducibility. Every prediction is designed to be falsifiable, which is refreshing for a field where bold claims often come with no way to test them.