What Happened
Researchers from ETH Zurich, Anthropic, and MATS Research published a study showing that LLMs can strip anonymity from pseudonymous online accounts at scale, cheaply, and with high accuracy. The paper, titled "Large-scale online deanonymization with LLMs," tested whether AI agents could link anonymous accounts to real identities across platforms like Hacker News, Reddit, and LinkedIn.
The numbers are stark. Working with a pool of 89,000 candidate profiles, the system correctly matched 67% of Hacker News users to their real LinkedIn identities - after all direct identifiers like names, URLs, and social handles had been removed. Precision at the chosen confidence threshold was 90%: when the system was confident enough to commit to a match, it was right about nine times out of ten.
The total cost for the experiment: under $2,000. That works out to $1 to $4 per profile identified.
The attack pipeline works in three stages. First, the LLM extracts identity-relevant features from posts - writing style, technical interests, career details, opinions. Second, semantic embeddings narrow down candidate matches. Third, the LLM reasons over the top candidates to verify matches and filter out false positives.
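The three stages can be illustrated with a toy sketch. This is not the researchers' code: a keyword counter stands in for LLM feature extraction, bag-of-words cosine similarity stands in for semantic embeddings, and a score-and-margin rule stands in for LLM verification. All function names, thresholds, posts, and profiles below are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny stopword list; a real system would not work this way.
STOPWORDS = {"i", "a", "the", "and", "at", "in", "on", "of", "to", "my", "for", "with", "is"}

def extract_features(text):
    """Stage 1 stand-in: the study uses an LLM here; this just counts keywords."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def shortlist(query, candidates, k=2):
    """Stage 2 stand-in: rank all candidates by similarity and keep the top k."""
    scored = [(cosine(query, feats), name) for name, feats in candidates.items()]
    scored.sort(reverse=True)
    return scored[:k]

def verify(ranked, min_score=0.3, min_margin=0.1):
    """Stage 3 stand-in: accept the best candidate only if it is clearly ahead,
    otherwise abstain (return None) to avoid a false positive."""
    if not ranked:
        return None
    best_score, best_name = ranked[0]
    runner_up = ranked[1][0] if len(ranked) > 1 else 0.0
    if best_score >= min_score and best_score - runner_up >= min_margin:
        return best_name
    return None

def link_account(posts, profiles, k=2):
    """Run all three stages for one pseudonymous account."""
    query = extract_features(" ".join(posts))
    candidates = {name: extract_features(bio) for name, bio in profiles.items()}
    return verify(shortlist(query, candidates, k))

# Made-up example data.
posts = [
    "Spent the weekend tuning PostgreSQL indexes for our Django app",
    "Remote work from Lisbon has been great for our small fintech team",
]
profiles = {
    "profile_a": "Backend engineer at a fintech startup in Lisbon. Django, PostgreSQL, remote team.",
    "profile_b": "Frontend developer in Berlin. React, TypeScript, design systems.",
    "profile_c": "Data scientist in Toronto. PyTorch, computer vision, robotics.",
}

print(link_account(posts, profiles))              # profile_a
print(link_account(["I like coffee"], profiles))  # None (abstains)
```

The abstain path in the last stage is what trades recall for precision: the stricter the acceptance rule, the fewer accounts get linked, but the more reliable each link is.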
The research team includes Simon Lermen (MATS Research); Daniel Paleka, Joshua Swanson, and Michael Aerni (ETH Zurich); Nicholas Carlini (Anthropic); and Florian Tramèr (ETH Zurich). Across the paper's experiments, LLM-based methods achieved up to 68% recall at 90% precision, compared to near 0% for the best non-LLM baselines.
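To make "recall at 90% precision" concrete, here is a minimal sketch of how the two metrics move as the confidence threshold changes. The data is invented, and the sketch assumes every account has exactly one true match in the candidate pool; the paper's own evaluation protocol may differ.

```python
def precision_recall_at(predictions, threshold):
    """predictions: list of (confidence, is_correct) pairs, one per account.

    Precision = correct matches / matches the system committed to.
    Recall    = correct matches / all accounts (assumes each has a true match).
    Returns (None, 0.0) if nothing clears the threshold.
    """
    accepted = [ok for conf, ok in predictions if conf >= threshold]
    if not accepted:
        return None, 0.0
    correct = sum(accepted)
    return correct / len(accepted), correct / len(predictions)

# Ten made-up accounts: (model confidence, whether the top match was right).
preds = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False), (0.70, True),
    (0.60, True), (0.50, False), (0.40, False), (0.30, True), (0.20, False),
]

for t in (0.85, 0.60, 0.20):
    print(t, precision_recall_at(preds, t))
```

Raising the threshold makes the system commit to fewer matches, which pushes precision up and recall down. A figure like "68% recall at 90% precision" reports the recall at the threshold where precision reaches 90%.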
Why It Matters
The assumption that has protected pseudonymous users for decades - that deanonymization is technically possible but too expensive to do at scale - just broke. At roughly $1 to $4 per identity, anyone with a few thousand dollars can unmask an entire online community.
This matters directly to anyone who uses AI tools under a pseudonym, participates in anonymous forums, or assumes that separating their work and personal identities online provides meaningful protection. It doesn't anymore. An LLM can connect the dots between your Hacker News comments about specific Python libraries, your opinions on remote work, and your LinkedIn job history.
The researchers identified specific threat scenarios: governments targeting journalists and activists, corporations building advertising profiles, and social engineers gathering data for targeted attacks. These aren't hypothetical. At a few dollars per profile, unmasking a single target costs less than a monthly SaaS subscription.
Our Take
This study builds on Latanya Sweeney's 2000 research showing that 87% of Americans could be uniquely identified from just ZIP code, gender, and date of birth. The difference is that LLMs automate the reasoning step. You no longer need a skilled analyst spending hours per profile. You need an API key and a script.
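The intuition behind Sweeney's result is that attributes which are harmless alone become identifying in combination. A minimal sketch, using six invented records and hypothetical field names, counts how many people are unique under each set of attributes:

```python
from collections import Counter

def unique_fraction(records, keys):
    """Fraction of records whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Made-up records; not real data.
records = [
    {"zip": "02139", "gender": "F", "dob": "1985-03-14"},
    {"zip": "02139", "gender": "F", "dob": "1990-07-02"},
    {"zip": "02139", "gender": "M", "dob": "1985-03-14"},
    {"zip": "02139", "gender": "M", "dob": "1972-11-30"},
    {"zip": "94110", "gender": "F", "dob": "1985-03-14"},
    {"zip": "94110", "gender": "F", "dob": "1985-03-14"},
]

print(unique_fraction(records, ["zip"]))                   # 0.0
print(unique_fraction(records, ["zip", "gender"]))         # 0.0
print(unique_fraction(records, ["zip", "gender", "dob"]))  # 4 of 6 records are unique
```

In this toy set nobody is unique by ZIP code or by ZIP code plus gender, but four of the six become unique once date of birth is added. The features an LLM pulls from posts (employer, city, tech stack, opinions) behave the same way, only with far more dimensions.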
The fact that Anthropic co-authored this paper is worth noting. They're publishing research that demonstrates a serious misuse potential of their own technology. That's the right move - better to surface these risks in a controlled study than to pretend they don't exist.
For practical purposes: if you maintain separate online identities, assume they can be linked. Don't share specific technical details, career milestones, or strong opinions across accounts you want to keep separate. The writing style analysis alone is enough to narrow candidates dramatically. The era of practical anonymity through pseudonyms is ending.