Related ToolsClaude CodeCursorChatgptAiderCody

Security Researchers Find Prompt Injection in Over a Third of AI Agent Skills

AI news: Security Researchers Find Prompt Injection in Over a Third of AI Agent Skills

Over a third of publicly available AI agent skills contain security vulnerabilities. That's the picture emerging from a wave of security audits targeting the rapidly growing ecosystem of third-party skills, plugins, and extensions that power AI coding agents and assistants.

SafeSkill, a new security scanning tool, reports analyzing 10,000 AI skills for code exploits and prompt injection - attacks where hidden instructions manipulate an AI agent into doing something its user never intended. It joins a crowded and urgent field: Snyk's ToxicSkills audit, Cisco's open-source skill-scanner, and several independent projects are all racing to catalog the same problem.

The Numbers Are Bad

Snyk's ToxicSkills study, which examined 3,984 skills from the ClawHub and skills.sh registries, offers the most detailed public data so far. The findings:

  • 13.4% of all skills (534) contain at least one critical-level security issue
  • 36.82% (1,467 skills) have at least one security flaw of any severity
  • 91% of confirmed malicious skills use prompt injection
  • 10.9% have hardcoded credentials sitting in plain text
  • 8.7% have direct access to financial systems

A separate audit covered by The New Stack examined 22,511 skills across four public registries and logged 140,963 individual security findings.

The attack techniques are creative and concerning. Researchers found base64-encoded commands designed to steal AWS credentials, installation scripts that download password-protected executables from unknown sources, and DAN-style jailbreaks (instructions that tell the AI to ignore its safety rules). Some skills use Unicode smuggling - hiding malicious characters that look invisible to humans but get executed by the AI.

Why AI Skills Are Uniquely Dangerous

Traditional software packages can contain malware too. NPM supply chain attacks aren't new. But AI agent skills have a problem that conventional packages don't: they run with the AI's permissions, which often include file system access, shell commands, and API keys.

When you install a malicious NPM package, it can do what your Node process can do. When you install a malicious AI skill, it can do what your AI agent can do - which increasingly means reading your codebase, executing terminal commands, and accessing cloud services. The blast radius is larger.

The other issue is detection difficulty. Prompt injection doesn't look like traditional malware. There's no suspicious binary or known exploit signature to match against. It's natural language instructions hidden in tool descriptions or response templates, and automated scanners are playing catch-up.

Scanning Helps, But It's Not Enough

Snyk itself has noted the fundamental limitation: denylist-based scanning that looks for "bad words" or forbidden patterns can't enumerate every possible way to instruct an AI to do something dangerous. The search space is essentially infinite.

Still, automated scanning catches the low-hanging fruit. Hardcoded API keys, obvious exfiltration patterns, and known jailbreak templates are all detectable. Tools like SafeSkill and Cisco's skill-scanner at least raise the floor.

For anyone building with AI agents that accept third-party skills: treat skill installation like you'd treat running untrusted code, because that's exactly what it is. Review permissions, sandbox execution where possible, and run whatever scanning tools are available before letting a skill near your production environment.