What Happened
Developer Tim Kamanin tested five security-focused skills for Claude Code - Anthropic's terminal-based AI coding agent - and found that only one delivered practical value. The review, published on March 6, 2026, evaluated each skill against criteria including architecture depth, false-positive handling, language awareness, and data flow analysis.
The five skills tested:
- sickn33/antigravity-awesome-skills@security-review (1,600+ installs) - Turned out to be a redistributed copy bundled in a skills collection. Install count reflects the bundle, not individual quality.
- affaan-m/everything-claude-code@security-review - A static checklist approach that flags patterns without understanding context. It misidentified server configuration as security vulnerabilities.
- sergiodxa/agent-skills@owasp-security-check - Well-structured with 20 rules across 5 OWASP priority categories, but limited to TypeScript examples with no data-flow tracing.
- alirezarezvani/claude-skills@senior-security - Actually a threat modeling toolkit using STRIDE/DREAD frameworks, not a code review tool at all.
- getsentry/skills@security-review - The winner. Built by Sentry's team with confidence classification (HIGH/MEDIUM/LOW), framework-specific awareness, and 17 vulnerability-specific guides.
The Sentry skill stood out for understanding framework conventions - it knows Django auto-escapes templates, so it won't flag every template variable as an XSS risk. It also outputs structured results with file locations and concrete fix recommendations.
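To make the contrast with a checklist-style scanner concrete, here is a minimal sketch, assuming a toy scanner of our own invention. It is not the Sentry skill's logic, and every name in it (`Finding`, `naive_check`, `framework_aware_check`) is hypothetical. It relies on only one fact from above: Django templates auto-escape variables by default, so the sketch flags only variables that opt out through the `safe` filter.

```python
import re
from dataclasses import dataclass

# Illustrative toy only: hypothetical names, not the Sentry skill's implementation.

@dataclass
class Finding:
    file: str
    line: int
    confidence: str  # "HIGH", "MEDIUM", or "LOW"
    message: str
    fix: str

# Matches a Django-style template variable such as {{ user.name }} or {{ bio|safe }}.
TEMPLATE_VAR = re.compile(r"\{\{\s*([^}]+?)\s*\}\}")

def naive_check(file, source):
    """Checklist style: flag every template variable as possible XSS."""
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for match in TEMPLATE_VAR.finditer(text):
            findings.append(Finding(
                file, lineno, "HIGH",
                "Possible XSS in template variable: " + match.group(1),
                "Escape the value before rendering.",
            ))
    return findings

def framework_aware_check(file, source):
    """Flag only variables that opt out of Django's default auto-escaping."""
    findings = []
    for lineno, text in enumerate(source.splitlines(), start=1):
        for match in TEMPLATE_VAR.finditer(text):
            expr = match.group(1)
            # Everything after the first "|" is a filter chain, e.g. "bio|safe".
            filters = [part.strip() for part in expr.split("|")[1:]]
            if "safe" in filters:
                findings.append(Finding(
                    file, lineno, "HIGH",
                    "Auto-escaping disabled by the safe filter: " + expr,
                    "Remove the safe filter or sanitize the value first.",
                ))
    return findings

TEMPLATE = """<h1>{{ user.name }}</h1>
<p>{{ user.bio|safe }}</p>
<span>{{ user.city }}</span>"""

if __name__ == "__main__":
    print(len(naive_check("profile.html", TEMPLATE)), "findings from the naive check")
    for finding in framework_aware_check("profile.html", TEMPLATE):
        print(f"{finding.file}:{finding.line} [{finding.confidence}] {finding.message}")
```

On this three-line template the naive check reports all three variables, while the framework-aware check reports only the one on line 2. A real reviewer would also need to handle cases this sketch ignores, such as `{% autoescape off %}` blocks.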
Why It Matters
Claude Code skills are still a new ecosystem, and quality varies wildly. The skills system lets anyone publish a SKILL.md instruction file that shapes how Claude Code behaves during specific tasks. For security reviews, getting this wrong isn't just annoying - it's dangerous. A skill that generates false positives trains developers to ignore warnings. A skill that misses real vulnerabilities gives false confidence.
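The false-positive problem can be put in numbers with a quick precision calculation. This is a minimal sketch: the counts below are hypothetical values chosen for illustration, not measurements from the review.

```python
def precision(true_positives, false_positives):
    """Fraction of reported findings that are real vulnerabilities."""
    reported = true_positives + false_positives
    return true_positives / reported if reported else 0.0

# Hypothetical numbers for illustration only; not taken from the review.
noisy = precision(true_positives=3, false_positives=47)   # 50 warnings, 3 real
focused = precision(true_positives=3, false_positives=2)  # 5 warnings, 3 real

print(f"noisy: {noisy:.0%}, focused: {focused:.0%}")
```

Both hypothetical skills find the same three real issues, but in the noisy one a developer has to read 50 warnings to reach them, which is the habit-forming cost the paragraph above describes.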
In this review, install counts did not track quality: the most-installed skill was a redistributed copy inside a bundle, so its count says nothing about the skill itself. Developers who pick security tools on popularity metrics alone risk ending up with one that adds little.
Our Take
This review matches what we've seen across the AI tooling space: the skills and plugins ecosystem for coding agents is still the Wild West. Many offerings are thin wrappers around generic prompts, not purpose-built tools with real domain expertise.
Sentry's skill works because Sentry understands security tooling. They built confidence scoring to reduce alert fatigue, framework-specific rules to avoid false positives, and structured output that fits into existing development workflows. That's the difference between a company shipping a product and someone publishing a prompt.
If you're using Claude Code for anything touching production security, install Sentry's skill and skip the rest. But don't treat any AI-powered security review as a replacement for proper security practices. These skills are a first-pass filter, not a security audit.
The broader lesson: when evaluating AI coding skills and plugins, ignore install counts. Look at who built it, whether they have domain expertise, and whether the skill handles edge cases intelligently. A security skill that doesn't understand your framework's built-in protections is worse than no security skill at all.