What happens when a disease doesn't exist, but an AI says it does?
Researchers invented a fictitious medical condition and then asked ChatGPT about it. The chatbot treated the fake diagnosis as real - a finding that, when scaled to the chatbot's user base, suggests roughly 40 million people could have received false health information about a condition that never existed.
The Study Design and What It Found
The researchers created a fake disease from scratch - a condition with no clinical basis, no medical literature, and no actual patients. They then queried ChatGPT to see how it would respond to questions about symptoms, diagnosis, and treatment of this invented illness.
Rather than saying it couldn't find reliable information, or flagging uncertainty, ChatGPT generated confident, detailed responses treating the fake condition as legitimate. This is a specific flavor of AI hallucination - the tendency of large language models (AI systems trained on massive amounts of text to predict likely next words) to generate plausible-sounding text even when no factual basis exists.
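To make the setup concrete, here is a minimal sketch of how a probe like this could be automated. It is not the researchers' actual method, which isn't described in detail. The `ask_model` callable, the phrase list, the prompt wording, and the placeholder condition name are all hypothetical, and simple phrase matching is only a rough stand-in for human review of each answer.

```python
from typing import Callable, Dict, List

# Hypothetical hedging phrases (an assumption, not taken from the study).
UNCERTAINTY_MARKERS = (
    "i don't have reliable information",
    "i couldn't find",
    "i'm not aware of",
    "not a recognized",
    "no record of",
)


def flags_uncertainty(response: str) -> bool:
    """Return True if the response contains any of the hedging phrases."""
    text = response.lower()
    return any(marker in text for marker in UNCERTAINTY_MARKERS)


def probe(ask_model: Callable[[str], str], condition: str) -> List[Dict[str, object]]:
    """Ask about symptoms, diagnosis, and treatment of an invented condition."""
    results = []
    for topic in ("symptoms", "diagnosis", "treatment"):
        prompt = f"Tell me about the {topic} of {condition}."
        response = ask_model(prompt)
        results.append(
            {
                "topic": topic,
                "response": response,
                "flagged_uncertainty": flags_uncertainty(response),
            }
        )
    return results


def overconfident_stub(prompt: str) -> str:
    """Stand-in for a chatbot that answers confidently no matter what."""
    return "This condition is typically managed with rest and medication."


if __name__ == "__main__":
    # "Placeholder Syndrome" is a made-up name used only for this example.
    for row in probe(overconfident_stub, "Placeholder Syndrome"):
        print(row["topic"], "->", row["flagged_uncertainty"])
```

Swapping the stub for a real chatbot client would turn this into a crude check: any answer that comes back without a hedge is a candidate for the kind of confident fabrication described above.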
The 40 million figure comes from the overlap between ChatGPT's active user base and people who might plausibly ask health questions. It's a modeled estimate, not an observed count. But even as an estimate, it illustrates the scale problem: a single confident wrong answer repeated across millions of interactions becomes a health misinformation event in its own right.
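The arithmetic behind an estimate like this can be sketched in a few lines. The study's actual model isn't spelled out here, so the sketch assumes the simplest version: multiply the active user count by the share of users who ask health questions. The function name and both input values are hypothetical placeholders, not figures from the study.

```python
def estimate_exposed_users(active_users: int, health_share: float) -> int:
    """Estimate how many users might ask health questions.

    Assumes a simple product of user count and share, which is an
    illustrative simplification and not the study's documented method.
    """
    if active_users < 0:
        raise ValueError("active_users must be non-negative")
    if not 0.0 <= health_share <= 1.0:
        raise ValueError("health_share must be between 0 and 1")
    return round(active_users * health_share)


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the calculation.
    print(estimate_exposed_users(1_000_000, 0.05))  # 50000
```

The output is only as good as the two inputs, which is why a figure produced this way is a modeled estimate and not an observed count.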
The Core Problem Isn't New, But the Scale Is
AI chatbots confirming false medical information isn't a new finding. Studies going back to 2023 have shown that ChatGPT, Claude, and similar tools regularly produce inaccurate medical guidance on real conditions - wrong drug dosages, outdated treatment protocols, symptoms misattributed to the wrong diseases.
This study sharpens that concern by removing any ambiguity about the source. With real diseases, you can argue the AI pulled from partially accurate literature. With a fabricated condition, there is no literature. The AI is generating plausibility theater - text that sounds like medical fact because it follows the patterns of medical writing.
For everyday users, this is a practical warning. Many people use ChatGPT as a first stop for health questions precisely because it gives immediate, articulate answers. That articulateness is the problem. A hesitant, uncertain answer signals unreliability. A confident, structured answer feels authoritative even when it's completely invented.
What It Means for Anyone Using AI for Health Questions
The researchers' goal was presumably to push AI developers toward better uncertainty signaling - systems that say "I don't have reliable information on this" rather than filling the gap with fabricated confidence.
For now, the practical takeaway is blunt: ChatGPT is not a diagnostic tool, and its confident tone is not a proxy for accuracy. If a chatbot confirms a health concern you're worried about, that confirmation is worth nothing on its own. Cross-check against actual medical sources - not because AI is always wrong, but because it's wrong in ways that are hard to detect from the output alone.