Your Reddit Alt Account Isn’t as Anonymous as You Think — and LLMs Just Proved It
That throwaway account you use to ask embarrassing questions, rant about your boss, or discuss niche hobbies? A team of researchers just showed that large language models can link it back to your real identity with startling accuracy. The paper, posted to arXiv in February 2026 by a team that includes an Anthropic researcher, demonstrates that pseudonymity on platforms like Reddit and Hacker News is effectively broken.
What the Study Actually Found
The paper, titled “Large-scale online deanonymization with LLMs” (arXiv:2602.16800), outlines a three-step attack pipeline:
- Extract identity features from a user’s post history — things like writing style, mentioned locations, profession, hobbies, conferences attended.
- Search for candidate matches using semantic embeddings across platforms.
- Verify matches with reasoning, using the LLM to cross-reference details and filter out false positives.
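The three steps can be sketched as a toy pipeline. Everything below is illustrative: the paper's system uses LLM summarization and semantic embeddings, while this sketch substitutes bag-of-words similarity and a rare-word overlap check so it runs with no dependencies.

```python
import math
import re
from collections import Counter

def extract_features(posts):
    """Step 1: pull candidate identity features (here: just content words).
    The real pipeline prompts an LLM to summarize location, job, hobbies."""
    words = re.findall(r"[a-z']+", " ".join(posts).lower())
    return Counter(w for w in words if len(w) > 3)

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(target_posts, candidates):
    """Step 2: rank candidate accounts by feature similarity.
    The real pipeline searches with semantic embeddings, not word counts."""
    target = extract_features(target_posts)
    scored = [(name, cosine(target, extract_features(posts)))
              for name, posts in candidates.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)

def verify(target_posts, candidate_posts, required_overlap=2):
    """Step 3: crude stand-in for LLM verification -- require several
    shared rare details before accepting a match."""
    rare = lambda posts: {w for w, c in extract_features(posts).items() if c == 1}
    return len(rare(target_posts) & rare(candidate_posts)) >= required_overlap

# Toy data: which candidate account matches the target's writing?
candidates = {
    "alice_hn": ["Backend engineer in Lisbon, into bouldering and espresso."],
    "bob_hn": ["I mostly post about gardening in Ohio."],
}
target = ["Moved to Lisbon for a backend engineering job; bouldering on weekends."]
ranked = rank_candidates(target, candidates)
```

Even this crude version ranks the right account first. Swapping in real embeddings and an LLM verifier is what turns the toy into the 99%-precision system the paper describes.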
The results are sobering. When linking Hacker News usernames to LinkedIn profiles, the system achieved 99% precision. In cross-platform Reddit matching — connecting different accounts owned by the same person based solely on their movie discussions — it hit 68% recall at 90% precision. Classical (non-LLM) methods scored near 0% on the same task.
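For readers who don't work with these metrics daily: 68% recall at 90% precision means the system found about two-thirds of the true account pairs, and roughly nine of every ten pairs it reported were correct. The counts below are hypothetical, chosen only to match those headline figures:

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: of the matches reported, how many were right.
    Recall: of the true matches that exist, how many were found."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall

# Hypothetical: 100 true account pairs exist; the system reports 76
# matches, of which 68 are correct.
p, r = precision_recall(true_positives=68, false_positives=8, false_negatives=32)
# p is about 0.89, r is 0.68
```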
This isn’t about cracking encryption or exploiting a bug. The attack works on raw, public content — the very things that make online communities useful.
Why This Is Different From Old-School Stalking
Yes, a determined human investigator has always been able to piece together someone’s identity from forum posts. People have been deanonymized before through painstaking manual work. The Netflix Prize deanonymization in the late 2000s proved that much.
The difference now is scale and cost. What once took a trained investigator hours per target — cross-referencing writing patterns, checking dates, building profiles — an LLM agent can do in seconds, across thousands of accounts simultaneously. The marginal cost of deanonymizing one more user is nearly zero.
As co-author Joshua Swanson put it: the basic capabilities already exist in current models, and it’s not straightforward for LLM providers to prevent misuse. The techniques involve common summarization and search tasks — exactly the kind of thing these models are designed to do well.
Who’s Actually at Risk
Not everyone needs to panic. But certain groups face real, immediate danger:
- Abuse survivors hiding from violent ex-partners who use pseudonymous accounts as a social lifeline.
- Whistleblowers and activists in authoritarian countries who rely on anonymity to organize or share information.
- People in marginalized communities (LGBTQ+, religious minorities, political dissidents) who discuss their lives under assumed identities.
- Professionals who maintain separate accounts for industry discussion and personal expression — doctors, lawyers, teachers, military personnel.
The Substack writeup on this study included a chilling example: a woman fleeing an abusive husband who was tracked down through an MMO username she’d casually mentioned on Facebook. That took manual detective work. An LLM makes the same process trivial.
The Uncomfortable Platform Problem
Here’s the catch-22 that the researchers highlight: the data that enables deanonymization is the same data that makes online communities valuable. What’s the point of Reddit if you can’t discuss your favorite movies, your career frustrations, or your local restaurant scene?
Reddit, Hacker News, and similar platforms are built on the assumption that pseudonyms provide a meaningful privacy layer. That assumption is now unreliable. And it’s not clear what platforms can do about it without fundamentally changing how they work.
Some potential platform-level responses:
- Rate-limiting bulk profile access to make large-scale scraping harder (though the paper shows attacks work with relatively sparse data).
- Default private post histories, so strangers can’t casually browse years of your writing.
- Stricter API controls, though public web scraping remains a workaround.
- Allowing users to selectively delete old content more easily.
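Of the options above, rate limiting is the most mechanical to implement. A minimal token-bucket sketch of per-client profile-access throttling (the class and its parameters are illustrative, not any platform's actual logic):

```python
import time

class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` per
    second. Bulk scrapers drain the bucket quickly and get throttled;
    casual readers browsing a few profiles never notice."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=5)
results = [bucket.allow() for _ in range(10)]  # burst of 10 rapid requests
```

In a rapid burst, only the first five requests pass; the rest wait on the refill rate. As the paper notes, though, the attack works with relatively sparse data, so throttling raises cost without restoring anonymity.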
None of these are silver bullets. The fundamental problem is that humans are identifiable through the aggregate of what they say, and LLMs are exceptionally good at pattern matching across that aggregate.
Practical Steps You Can Take Right Now
Lead author Simon Lermen puts it plainly: “Each piece of specific information you share — your city, your job, a conference you attended, a niche hobby — narrows down who you could be. The combination is often a unique fingerprint.”
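Lermen's point can be made concrete with back-of-the-envelope arithmetic. The population and attribute fractions below are invented for illustration, and the independence assumption is generous to the user; real attributes correlate, which shrinks the candidate pool even faster:

```python
# Anonymity-set estimate: each shared attribute multiplies down the
# number of people you could plausibly be. All figures are made up.
population = 500_000  # say, a mid-size metro area

fractions = {
    "works in tech": 0.05,
    "backend engineer (of those in tech)": 0.3,
    "attends a particular niche conference": 0.02,
    "mentions a specific uncommon hobby": 0.05,
}

remaining = population
for attribute, fraction in fractions.items():
    remaining *= fraction

# Four casual details reduce half a million people to a handful.
print(f"candidate pool: ~{remaining:.0f} people")
```

Four offhand facts, each harmless alone, leave a pool small enough to finish off by hand.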
Here’s what you can actually do:
- Compartmentalize your identities. Use separate accounts for different topics. Don’t discuss your job on the same account where you talk about your city and hobbies.
- Use new accounts for sensitive questions. A one-off throwaway for that health question or relationship advice is better than your main account.
- Reduce specificity. “A mid-size European city” instead of “Lisbon.” “I work in tech” instead of “I’m a backend engineer at a fintech startup.”
- Audit your post history. Look at your last 50 posts across accounts and ask: could someone who knows me recognize this person? If yes, you’ve got a problem.
- Assume your pseudonymous accounts could be linked. Post accordingly. If you wouldn’t say it under your real name, think twice about saying it at all.
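The audit step can be partly automated. Here's a crude self-audit sketch; the regex patterns are illustrative stand-ins for what an LLM-based attacker would extract far more thoroughly:

```python
import re

# Illustrative patterns only -- a real audit would use an NER model or an LLM.
RISKY_PATTERNS = {
    "location": r"\b(live in|moved to|based in|my city)\b",
    "employment": r"\b(my (job|boss|company|team)|i work (at|in|for))\b",
    "events": r"\b(conference|meetup|attended)\b",
    "timeline": r"\b(since (19|20)\d{2}|years? ago)\b",
}

def audit(posts):
    """Return (post, matched categories) for each post that leaks a detail."""
    flagged = []
    for post in posts:
        hits = [name for name, pattern in RISKY_PATTERNS.items()
                if re.search(pattern, post.lower())]
        if hits:
            flagged.append((post, hits))
    return flagged

history = [
    "I work at a fintech startup, backend stuff mostly.",
    "Anyone else think season 3 fell off?",
    "Moved to Lisbon in 2019, happy to answer questions.",
]
for post, categories in audit(history):
    print(f"{categories}: {post[:50]}")
```

Two of the three sample posts get flagged. The point isn't the regexes; it's forcing yourself to see your history the way an aggregator would.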
What This Means for the Broader Privacy Landscape
This study forces a reassessment of online threat models. The “practical obscurity” that protected pseudonymous users — the idea that your data was technically public but too hard to aggregate and analyze — is gone. When LLMs can process and cross-reference millions of posts in minutes, obscurity is no longer a defense.
The implications extend beyond individual privacy. Companies that promise anonymity in research surveys, feedback platforms, or employee reporting tools may need to reconsider those guarantees. Academic IRBs that approved studies under the assumption of pseudonymity should take note. And legislators working on privacy regulation need to understand that “public but anonymous” is becoming an oxymoron.
The researchers themselves acknowledge they’re walking a fine line by publishing this work. But as co-author Swanson notes, raising awareness is a major reason they published — the capabilities are already out there, and pretending otherwise doesn’t help anyone.
The Bottom Line
Pseudonymity online has been a useful fiction for years. We all sort of knew that a sufficiently motivated adversary could unmask us, but the effort required made it a theoretical concern for most people. That calculation has changed. LLMs have collapsed the cost of deanonymization to near-zero, and the techniques involved are difficult for AI companies to block without crippling their products’ core functionality.
If you’re posting under a pseudonym, operate under the assumption that your identity is linkable. Adjust your behavior accordingly. The tools aren’t getting worse at this — they’re getting better, cheaper, and faster.
References
- Lermen, S., Paleka, D., et al. “Large-scale online deanonymization with LLMs.” arXiv:2602.16800, February 2026. arxiv.org/abs/2602.16800
- Au, W.J. “Warning: LLMs Able to De-Anonymize User Accounts on Reddit, Hacker News & Other ‘Pseudonymous’ Platforms.” New World Notes, 2026. wjamesau.substack.com
- Reddit discussion thread in r/artificial.

