What Happened
More than 40 million people now ask ChatGPT health-related questions every day, according to a March 6, 2026 report from Axios. One in four of ChatGPT's roughly 800 million regular users submits a health prompt every week.
The scale is staggering, but the accuracy is concerning. A study published in Nature found that ChatGPT under-triaged approximately half of health care emergencies in researcher testing. In other words, the model told people their situation was less urgent than it actually was, a dangerous failure mode when someone is deciding whether to go to the ER.
Newer models do significantly better, though. OpenAI's GPT-5 correctly refers emergency cases nearly 99% of the time, according to the report. American Medical Association CEO John Whyte weighed in with a warning: "Too often people are using this as an expert and not as an assistant."
Why It Matters
This is the largest unofficial medical consultation system ever created, and it emerged without any regulatory framework, clinical validation process, or informed consent. Forty million daily health queries is more than most national health systems handle.
The core problem is not that AI gives bad medical advice; it is that most people using it cannot tell when the advice is wrong. If you are a doctor, you can spot when ChatGPT suggests something inappropriate. If you are the average person looking up symptoms at 2 AM, you probably cannot.
The under-triage finding from Nature is particularly alarming. Over-triaging (telling someone to go to the ER when they did not need to) wastes time and money. Under-triaging (telling someone they are fine when they are not) can kill people. Getting it wrong in that direction in roughly half of emergency cases is not a rounding error.
The GPT-5 improvement to nearly 99% accuracy on emergency referrals is meaningful but raises its own questions. Most ChatGPT users are not on the latest model. Free-tier users and those on older plans are still getting the less accurate versions.
Our Take
We use ChatGPT daily for productivity work, and this report reinforces something we have been saying: these tools are assistants, not authorities. The same model that helps you draft emails and summarize documents is now a go-to health information source for tens of millions of people, and it was never designed for that role.
The nearly 99% emergency accuracy on GPT-5 sounds good until you do the math. Only a fraction of the 40 million daily health questions involve genuine emergencies, but at this scale even a small fraction is a large number. If roughly 1% of those emergency cases get under-triaged, that is still a significant number of people getting potentially dangerous advice every single day.
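To make that math concrete, here is a rough back-of-the-envelope sketch. The 40 million daily queries, the roughly 50% under-triage rate, and the roughly 1% miss rate implied by GPT-5's near-99% figure all come from the reporting above. The share of queries that are true emergencies is not in the report, so the 1% used here is purely a hypothetical placeholder, and the function name is ours.

```python
# Back-of-the-envelope estimate of under-triaged emergencies per day.
# ASSUMPTION: emergency_share is hypothetical; the report gives no such figure.

DAILY_HEALTH_QUERIES = 40_000_000   # from the Axios report
HYPOTHETICAL_EMERGENCY_SHARE = 0.01  # assumed: 1% of health queries are emergencies


def missed_emergencies_per_day(daily_queries, emergency_share, under_triage_rate):
    """Return the estimated number of emergencies under-triaged each day."""
    return round(daily_queries * emergency_share * under_triage_rate)


# Roughly 1% of emergencies missed (GPT-5's near-99% referral rate).
newer_model = missed_emergencies_per_day(
    DAILY_HEALTH_QUERIES, HYPOTHETICAL_EMERGENCY_SHARE, 0.01
)

# Roughly 50% of emergencies missed (the Nature under-triage finding).
older_model = missed_emergencies_per_day(
    DAILY_HEALTH_QUERIES, HYPOTHETICAL_EMERGENCY_SHARE, 0.50
)

print(f"Near-99% model: about {newer_model:,} missed emergencies per day")
print(f"~50% under-triage: about {older_model:,} missed emergencies per day")
```

Under that assumed 1% emergency share, the near-99% model would still miss about 4,000 emergencies a day, and a model that under-triages half of them would miss about 200,000. Change the placeholder share and the totals scale in proportion, but the gap between the two models stays a factor of 50.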
For AI tool users, the takeaway is clear. Use ChatGPT and Claude for brainstorming, research starting points, and understanding medical terminology. Do not use them as a replacement for calling your doctor, especially for anything that feels urgent. The model does not know your medical history or your medications, and it cannot see what is happening on the other side of the screen.
The AMA's framing is right: assistant, not expert. That applies to health queries and honestly to most things these models do.