
AI Can Now Unmask Anonymous Social Media Users for $4 Per Profile

$2,000. That's what researchers spent in total to correctly identify 226 of 338 anonymous online accounts - a 67% success rate at 90% precision, meaning roughly nine in ten of the identifications the system committed to were correct.
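To make those headline numbers concrete, here is a back-of-the-envelope calculation. It is only a sketch: it assumes precision is defined the standard way (correct identifications divided by identifications attempted) and that every target account received either one guess or an abstention. The article does not break the results down this way, so the derived counts are estimates.

```python
# Figures quoted in the article.
targets = 338      # anonymous accounts the system was asked to identify
correct = 226      # accounts it identified correctly
precision = 0.90   # share of the system's guesses that were right

# Success rate (recall): correct identifications over all targets.
recall = correct / targets

# Assumption: precision = correct / guesses_made, so we can back out
# roughly how many guesses were made and how many of them were wrong.
guesses_made = round(correct / precision)
wrong_guesses = guesses_made - correct
abstentions = targets - guesses_made  # assumes one guess or abstention per target

print(f"recall: {recall:.1%}")
print(f"guesses: ~{guesses_made}, wrong: ~{wrong_guesses}, abstained: ~{abstentions}")
```

Under those assumptions, the system would have committed to a name for about 251 accounts, been wrong on about 25 of them, and declined to guess on the remaining 87 or so.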

A team from ETH Zurich, MATS Research, and Anthropic published "Large-Scale Online Deanonymization with LLMs" (arXiv:2602.16800), and the results should make anyone who posts under a pseudonym uncomfortable. The system works by having an LLM agent piece together clues from your public posts - your city, your job, conferences you attended, niche hobbies - then cross-referencing those details against real identities on platforms like LinkedIn.

The per-profile cost? Between $1 and $4.

How the Attack Works

The pipeline is deceptively simple. First, the system pulls public posts from an anonymous account and builds a structured profile. Then it uses embedding-based search (a way of converting text into numerical patterns that capture meaning) to find the 100 most promising real-identity candidates from a platform like LinkedIn. Finally, an LLM reasons through the matches, either confirming an identity or abstaining when confidence is low.
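The retrieve-then-verify shape of that pipeline can be sketched in a few lines. This is a toy illustration, not the paper's code: the three-number vectors are made up (real embeddings have hundreds or thousands of dimensions), the candidate names are placeholders, and `match_or_abstain` with its two thresholds is a hypothetical stand-in for the LLM reasoning step, which the article does not describe in detail.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_candidates(query_vec, candidates, k=100):
    """Rank candidates by similarity to the query and keep the best k."""
    scored = [(cosine(query_vec, vec), name) for name, vec in candidates.items()]
    scored.sort(reverse=True)
    return scored[:k]

def match_or_abstain(ranked, min_score=0.9, min_margin=0.05):
    """Hypothetical stand-in for the final reasoning step.

    Commits to the top candidate only when it scores highly and is clearly
    ahead of the runner-up; otherwise abstains by returning None.
    """
    if not ranked:
        return None
    best_score, best_name = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_name
    return None

# Made-up embeddings for illustration only.
anonymous_profile = [0.9, 0.1, 0.8]
candidates = {
    "candidate_a": [0.88, 0.12, 0.79],
    "candidate_b": [0.1, 0.9, 0.2],
    "candidate_c": [0.5, 0.5, 0.5],
}

ranked = top_k_candidates(anonymous_profile, candidates, k=2)
print(ranked)
print(match_or_abstain(ranked))
```

The abstention branch is the part that matters for precision: a system that declines to guess when the top two candidates are nearly tied gives up some matches in exchange for fewer wrong ones.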

What makes this hard to detect: each individual step looks benign. Summarizing a profile, computing embeddings, ranking candidates - none of these operations would trigger alarm bells on their own.

The researchers tested primarily on Hacker News accounts cross-referenced with LinkedIn, but also demonstrated Reddit account matching and even identified 9 out of 125 AI researchers from Anthropic's anonymized interview transcripts.

The $4 Problem

This is not a theoretical concern. The economics have fundamentally shifted. Deanonymizing someone used to require a team of skilled investigators spending hours or days. Now it costs less than a coffee.

Lead researcher Simon Lermen put it plainly: "Ask yourself: could a team of smart investigators figure out who you are from your posts? If yes, LLM agents can likely do the same, and the cost of doing so is only going down."

The researchers estimate the approach could scale to 100 million users with sufficient compute. Performance degrades gradually as candidate pools grow into the tens of thousands, but it doesn't collapse.

This builds on Latanya Sweeney's 2002 research showing 87% of Americans could be uniquely identified from just three data points (five-digit ZIP code, gender, and full date of birth). LLMs automate what used to require human intuition, and they do it at a price point that makes mass surveillance economically viable.

Defenses Are Thin

The paper outlines possible countermeasures, and none are reassuring. Platform-level defenses like API rate limiting and scraping detection help but don't stop determined actors. LLM provider guardrails - the "I can't help with that" responses - can be bypassed through prompt modification, or avoided entirely by using open-source models with safety features removed.

The practical takeaway: every specific detail you share online narrows the set of people you could be. Your combination of interests, location hints, and professional details is often a unique fingerprint. The researchers found that LLM-based reasoning significantly outperformed older statistical approaches that relied on activity pattern metadata alone - these models understand context and can make the kinds of inferences a human investigator would.
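The narrowing effect is easy to quantify with a rough model. The sketch below assumes each revealed detail is statistically independent of the others and is shared by a fixed fraction of the population. The fractions are invented for illustration, not taken from the paper, and real attributes are correlated, so treat the output as an order-of-magnitude intuition only.

```python
import math

def bits_to_single_out(population):
    """Bits of information needed to pick one person out of a population."""
    return math.log2(population)

def bits_revealed(fractions):
    """Bits revealed by details shared by the given fractions of people.

    Assumes the details are independent of one another.
    """
    return sum(-math.log2(f) for f in fractions)

def expected_anonymity_set(population, fractions):
    """Expected number of people who still match after every detail is applied."""
    remaining = float(population)
    for f in fractions:
        remaining *= f
    return remaining

population = 100_000_000  # the scale the researchers say the approach could reach

# Hypothetical details and the (invented) share of people each one fits.
details = {
    "lives in a particular mid-sized city": 1 / 500,
    "works in a particular specialist job": 1 / 2000,
    "attended a particular niche conference": 1 / 1000,
}

print(f"bits needed:   {bits_to_single_out(population):.1f}")
print(f"bits revealed: {bits_revealed(details.values()):.1f}")
print(f"expected matches left: {expected_anonymity_set(population, details.values()):.2f}")
```

With these made-up numbers, three details reveal more bits than are needed to single out one person in 100 million, and the expected number of remaining matches drops below one.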

For journalists, activists, and anyone who relies on pseudonymity for safety, this research is a concrete warning. The assumption that "nobody would bother" to identify a low-profile anonymous account no longer holds when the cost is a few dollars and the process is automated.