Anthropic's Mythos AI made headlines recently for finding 271 vulnerabilities in Firefox and being described as "too dangerous" for broad public release. Now Daniel Stenberg - the developer who created curl, the data-transfer tool built into virtually every operating system, web browser, and server on the planet - has run Mythos against his own code. His verdict: "primarily marketing."
Stenberg pointed Mythos at curl's approximately 176,000 lines of C code. The tool flagged 5 "confirmed" vulnerabilities. On closer inspection: 3 were already-documented limitations that aren't actual security flaws, 1 was what Stenberg classified as "just a bug," and 1 was a genuine vulnerability - rated low severity, scheduled to be patched in the June release. One real find from 176,000 lines isn't zero. But it's far from what the "too dangerous to release" positioning implied.
What Curl Has Already Survived
Context matters here. Curl is one of the most security-scrutinized open-source projects in existence. It has been through years of fuzzing (automated testing that throws random or malformed inputs at code looking for crashes or unexpected behavior), multiple professional security audits, and prior AI-assisted reviews. Those earlier tools already found "a dozen or more" CVEs - assigned identifiers that track real, confirmed security issues. Stenberg's argument isn't that Mythos found nothing. It's that Mythos performed about as well as the tools that came before it, on a codebase where most of the accessible vulnerabilities have already been extracted.
What the Firefox Number Actually Tells You
The 271 Firefox vulnerabilities figure dominated early Mythos coverage. Firefox is a vastly larger and more complex codebase than curl, which explains some of the volume difference. But the initial announcements didn't fully detail how many of those 271 findings were novel versus already known, high severity versus informational, or exploitable without user interaction. Big numbers in security research require that breakdown to be meaningful.
Stenberg's curl test provides exactly that context in miniature - and the ratio (5 flagged, 1 genuine vulnerability, rated low severity) suggests Mythos's false-positive rate and practical yield are closer to the existing industry baseline than the launch framing implied. His published findings conclude there is "no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos."
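To make that ratio concrete, here is a minimal Python sketch of the arithmetic, using only the figures reported above. It rests on one assumption that is ours, not Stenberg's: only the single genuine vulnerability counts as a true positive, and the "just a bug" finding and the three documented limitations count as non-vulnerabilities. The variable names are illustrative.

```python
# Triage breakdown of the 5 findings Mythos flagged in curl, as reported above.
findings = {
    "documented_limitation": 3,  # already-documented behavior, not a security flaw
    "ordinary_bug": 1,           # what Stenberg classified as "just a bug"
    "vulnerability": 1,          # genuine vulnerability, rated low severity
}
LINES_OF_CODE = 176_000  # approximate size of curl's C code

flagged = sum(findings.values())
confirmed = findings["vulnerability"]  # assumption: only this counts as a true positive

# Share of flagged findings that turned out to be real vulnerabilities.
precision = confirmed / flagged

# Real vulnerabilities found per 100,000 lines of code.
per_100k_lines = confirmed / LINES_OF_CODE * 100_000

print(f"flagged={flagged}, confirmed={confirmed}")
print(f"precision={precision:.0%}")
print(f"vulnerabilities per 100k lines={per_100k_lines:.2f}")
```

Under that assumption, 20% of the flagged findings were real vulnerabilities, which works out to roughly 0.57 per 100,000 lines. Counting the ordinary bug as a useful find would raise the first figure to 40%, so the result depends on how "real" is defined.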
Anthropic has not publicly responded to Stenberg's results. Mythos remains in limited access. For teams evaluating AI security tools, Stenberg's test is a useful calibration point: on a mature, heavily audited codebase, Mythos performed like a competent tool, not a category-defining one.