Miasma Traps AI Scrapers in Infinite Loops of Poisoned Content

Most website owners fighting AI scrapers play defense: block user agents, rate-limit requests, throw up CAPTCHAs. Miasma takes the opposite approach. Instead of keeping scrapers out, it invites them in and wastes their time with an infinite maze of garbage content.

The open-source tool, written in Rust, works as a lightweight web server that sits behind your existing reverse proxy (like Nginx). You embed invisible anchor tags on your real pages, hidden with the CSS declaration display: none and the HTML attribute aria-hidden="true". Human visitors never see these links, and screen readers skip them. But scrapers that parse raw HTML without rendering CSS follow them straight into Miasma's trap.
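To make the hidden-link idea concrete, here is a minimal sketch in Python. The /bots prefix, the link text, and the helper names are assumptions for illustration, not Miasma's actual defaults. It builds a hidden anchor and shows that a naive parser, which never evaluates CSS or ARIA, still picks it up.

```python
from html.parser import HTMLParser

# Hypothetical trap prefix; the real path is whatever you configure.
TRAP_PREFIX = "/bots"


def hidden_link(href: str) -> str:
    """Return an anchor hidden from sighted users and from assistive tech."""
    return (
        f'<a href="{href}" style="display: none" '
        f'aria-hidden="true" tabindex="-1">archive</a>'
    )


class NaiveLinkExtractor(HTMLParser):
    """Collects every href it sees, ignoring CSS and ARIA entirely."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


page = f"""<html><body>
<p>Real article text.</p>
<a href="/about">About</a>
{hidden_link(TRAP_PREFIX + "/start")}
</body></html>"""

extractor = NaiveLinkExtractor()
extractor.feed(page)
print(extractor.links)  # ['/about', '/bots/start']
```

A browser would never display the second link, but the extractor treats it like any other, which is the behavior the trap relies on.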

How the Trap Works

Once a scraper follows one of those hidden links, Miasma serves it content from a configurable "poison fountain" source, along with more links that point back into Miasma itself. The scraper follows those links, gets more recycled junk, follows more links, and so on. It's stuck in an infinite loop of worthless data, burning its crawl budget on content that will only degrade whatever model it's training.
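The loop can be simulated in a few lines of Python. This is a rough sketch of the idea, not Miasma's implementation: the link prefix, the filler text, the hash-based link tokens, and the three-links-per-page figure are all assumptions. Each generated page links to fresh trap URLs, so a naive crawler's queue grows faster than it drains.

```python
import hashlib
import re
from collections import deque

TRAP_PREFIX = "/bots"  # hypothetical link prefix
POISON_TEXT = "Recycled filler text standing in for the poison source."
LINKS_PER_PAGE = 3  # assumed value for illustration


def trap_page(path: str) -> str:
    """Build a page whose links are derived from its own path,
    so every page leads to URLs the crawler has not seen yet."""
    links = []
    for i in range(LINKS_PER_PAGE):
        token = hashlib.sha256(f"{path}:{i}".encode()).hexdigest()[:12]
        links.append(f'<a href="{TRAP_PREFIX}/{token}">more</a>')
    return f"<p>{POISON_TEXT}</p>" + "".join(links)


def naive_crawl(start: str, max_fetches: int):
    """Breadth-first crawl; returns (pages fetched, URLs still queued)."""
    queue, seen, fetched = deque([start]), {start}, 0
    while queue and fetched < max_fetches:
        html = trap_page(queue.popleft())
        fetched += 1
        for href in re.findall(r'href="([^"]+)"', html):
            if href not in seen:
                seen.add(href)
                queue.append(href)
    return fetched, len(queue)


fetched, pending = naive_crawl(TRAP_PREFIX + "/start", max_fetches=50)
print(fetched, pending)
```

Without the max_fetches cap the crawl would never finish: every fetch removes one URL from the queue and adds several new ones.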

The memory footprint is roughly 1 MB per active connection, so running Miasma won't stress your server even if multiple scrapers are trapped simultaneously. Optional gzip compression keeps bandwidth costs low.
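As back-of-the-envelope arithmetic, the figures above can be sketched as follows. The 1 MB value is the article's rough estimate rather than a measurement, and the filler text is an invented stand-in that is more repetitive than real poison data is likely to be, so the compression gain here is optimistic.

```python
import gzip

PER_CONNECTION_MB = 1.0  # the article's rough figure, not a measured value


def worst_case_memory_mb(max_connections: int) -> float:
    """Upper bound on memory if every allowed connection is in use."""
    return max_connections * PER_CONNECTION_MB


# Highly repetitive filler compresses well; real content will compress less.
body = ("Recycled filler paragraph for trapped crawlers. " * 200).encode()
compressed = gzip.compress(body)

print(worst_case_memory_mb(50))  # 50.0
print(len(body), len(compressed))
```

Capping max concurrent connections therefore also caps memory use: at the assumed figure, 50 trapped scrapers cost about 50 MB.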

Surgical, Not Scorched Earth

The design is deliberately careful about collateral damage. The setup works with robots.txt: once the trap path is disallowed there, legitimate search engine crawlers like Googlebot, which honor the file, won't wander into the trap. The hidden links use accessibility attributes that ensure assistive technologies ignore them too. Only bots that scrape raw HTML indiscriminately will take the bait.
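The robots.txt safeguard can be checked with Python's standard library. This sketch assumes the trap lives under a hypothetical /bots/ path that the site owner has disallowed; the rule and the example.com URLs are illustrative, not taken from Miasma's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule: keep all compliant crawlers out of the trap path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bots/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that honors robots.txt checks before fetching.
print(parser.can_fetch("Googlebot", "https://example.com/bots/start"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/articles/1"))  # True
```

A scraper that skips this check is exactly the kind of client the trap is meant to catch.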

Installation is a single command: cargo install miasma. Configuration covers port binding, max concurrent connections, link prefixes, and your poison data source.

Miasma belongs to a growing category of anti-scraping tools that have emerged as website owners push back against AI companies hoovering up content without permission. Rather than the legal route (which moves slowly) or the blocking route (which scrapers adapt to quickly), poisoning takes a third path: make the scraped data actively harmful to the scraper. If enough sites adopt tools like this, the cost of unauthorized scraping goes up for everyone doing it.