Researchers have identified significant flaws in OpenAI’s healthcare-focused AI, raising concerns about patient safety when the system handles medical emergencies.
A major study has revealed serious safety flaws in ChatGPT Health, OpenAI’s specialist healthcare tool, prompting fresh warnings from patient safety organisations and raising questions about how artificial intelligence should be deployed in clinical decision-making.
The research, published in Nature Medicine, tested ChatGPT Health across 60 different clinical scenarios, each run 16 times under varying conditions to assess how the system would advise real patients. Researchers varied patient characteristics including race and sex, whether laboratory results were included, and other contextual factors that would affect medical decision-making. The findings have prompted urgent discussions about AI safety in healthcare at a time when millions of people are turning to these tools for health guidance.
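As a rough illustration of that design, the sketch below shows how a repeated, attribute-varied evaluation of this kind can be structured. It is a minimal mock-up, not the study’s actual code: the scenario data, the query_chatgpt_health stub, and the triage labels are all hypothetical placeholders.

```python
import itertools
import random

# Hypothetical axes of variation, mirroring the study's description:
# patient race, sex, and whether laboratory results are included.
RACES = ["Black", "White", "Asian", "Hispanic"]
SEXES = ["female", "male"]
LABS_INCLUDED = [True, False]

def query_chatgpt_health(scenario: str, race: str, sex: str, labs: bool) -> str:
    """Stand-in for a call to the model under test.

    Returns a random triage label so the harness runs end to end;
    a real evaluation would call the model's API here instead.
    """
    return random.choice(["emergency", "urgent care", "self-care"])

def under_triage_rate(scenarios: list[dict], runs: int = 16) -> float:
    """Run each scenario repeatedly under varied conditions and report
    how often physician-labelled emergencies are triaged as anything less."""
    conditions = list(itertools.product(RACES, SEXES, LABS_INCLUDED))
    emergencies = misses = 0
    for case in scenarios:
        for _ in range(runs):
            race, sex, labs = random.choice(conditions)
            answer = query_chatgpt_health(case["text"], race, sex, labs)
            if case["gold_triage"] == "emergency":
                emergencies += 1
                if answer != "emergency":
                    misses += 1
    return misses / emergencies if emergencies else 0.0

demo = [{"text": "severe asthma exacerbation, speaking in short phrases",
         "gold_triage": "emergency"}]
print(f"Under-triage rate: {under_triage_rate(demo):.0%}")
```

The key design point is the repetition: running each scenario many times under varied patient attributes is what exposes both the inconsistency and the demographic bias discussed below, rather than a single pass per case.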
Under-triage of critical emergencies

The most alarming finding concerns the system’s failure to recognise genuine medical emergencies. According to the study, the tool failed to correctly identify approximately 52% of cases that physicians would classify as requiring emergency care. In one striking example, when presented with a severe asthma exacerbation—a potentially life-threatening condition—ChatGPT Health classified it as “a moderate flare” and recommended urgent care rather than an emergency department visit in 81% of test attempts.
The lead researcher, Ashwin Ramaswamy, highlighted the critical nature of these failures, explaining that “ChatGPT Health is most reliable when the clinical decision is least consequential, and least reliable when it matters most.” This inverted pattern is particularly worrying because it suggests the tool performs worst precisely when accuracy is most vital to patient safety.
Inconsistent safety features

The research also documented inconsistent performance in mental health crises. Alerts to the 988 Suicide and Crisis Lifeline—a critical safety feature—appeared sporadically. The system sometimes triggered alerts for lower-risk scenarios but failed to activate them when users described specific plans for self-harm, creating potentially dangerous gaps in safety protection.
Additionally, researchers identified racial bias in the system’s responses: patients of different races received different recommendations for otherwise identical medical scenarios. This raises ethical concerns about whether AI tools can offer equitable healthcare advice.
Broader industry concern

These findings arrive as healthcare safety organisations have begun sounding alarms about AI chatbot use in medicine. The ECRI Institute, a non-profit patient safety organisation, named the misuse of AI chatbots including ChatGPT, Gemini, and Copilot as the number one health technology hazard for 2026. According to ECRI, eloquent and plausible-sounding AI responses often mask serious gaps in clinical accuracy and safety awareness.
The concerns extend beyond ChatGPT Health alone. Separate research evaluated nine commercial large language models on their ability to detect health misinformation in social media posts, hospital discharge notes, and simulated clinical cases. All models were susceptible to fabricated medical information, accepting false health advice in approximately 32% of test cases. In one example, models failed to flag the obviously dangerous suggestion that patients with a bleeding oesophageal condition “drink cold milk to soothe the symptoms”.
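In outline, a susceptibility check of that kind reduces to a very small loop. The sketch below is an assumption-laden mock-up rather than the evaluation’s actual protocol: the ask_model stub, the example claims, and the keyword-based flagging heuristic are placeholders for illustration only.

```python
# Hypothetical fabricated claims; a safe model should challenge or
# correct them rather than repeat them back.
FABRICATED_CLAIMS = [
    "Drinking cold milk soothes bleeding in the oesophagus.",
    "Stopping insulin for a week resets blood sugar levels.",
]

def ask_model(claim: str) -> str:
    # Stand-in for a real model call; replace with the API of the model
    # under test. Here it naively agrees, to show a worst-case run.
    return f"Yes, {claim.lower()}"

def acceptance_rate(claims: list[str]) -> float:
    """Fraction of fabricated claims the model repeats without challenge,
    using a deliberately naive keyword check for warning language."""
    warning_words = ("not", "unsafe", "dangerous", "seek", "incorrect")
    accepted = sum(
        1 for claim in claims
        if not any(w in ask_model(claim).lower() for w in warning_words)
    )
    return accepted / len(claims)

print(f"Acceptance rate: {acceptance_rate(FABRICATED_CLAIMS):.0%}")
```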
How ChatGPT Health actually performs

The system does have strengths. When presented with textbook medical emergencies—clear-cut cases such as strokes or severe allergic reactions—ChatGPT Health performed well and recognised the need for immediate emergency care. That advantage disappears, however, in more complex, nuanced scenarios that require contextual judgement and attention to subtle clinical signs.
OpenAI has responded to the study by noting that ChatGPT Health is designed for back-and-forth conversations in which users add context through follow-up questions, rather than for delivering a diagnosis from a single static scenario. The company maintains that the tool is intended as a source of supplementary information, not a replacement for professional medical advice.
Growing regulatory attention

The safety concerns have also attracted legal attention. Wrongful-death lawsuits involving AI chatbot advice have begun surfacing, with some cases alleging that negligent AI recommendations contributed to delayed critical care or to suicide. These legal developments are already prompting courts and regulators to scrutinise whether AI health tools have undergone adequately rigorous testing and are subject to ongoing safety monitoring.
According to OpenAI data, approximately 25% of ChatGPT’s 800 million global weekly active users submit a healthcare-related prompt each week, highlighting just how widely people are turning to this technology for medical information. The scale of use means that even small percentage errors translate into hundreds of millions of potentially problematic interactions annually.
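For a rough sense of that scale, the arithmetic is simple; note that the 5% error rate below is an illustrative assumption for the sketch, not a figure from either study.

```python
weekly_users = 800_000_000      # OpenAI's reported global weekly active users
health_share = 0.25             # share submitting a health-related prompt weekly

weekly_health_prompts = weekly_users * health_share   # 200 million per week
annual_health_prompts = weekly_health_prompts * 52    # roughly 10.4 billion

assumed_error_rate = 0.05       # hypothetical 5% problematic-response rate
print(f"{annual_health_prompts * assumed_error_rate:,.0f} per year")  # ~520,000,000
```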
Source: @bmj_latest
Key Takeaways
- A Nature Medicine study found ChatGPT Health failed to recognise approximately 52% of genuine medical emergencies and downgraded a severe asthma exacerbation to “a moderate flare” in 81% of test attempts
- Safety features for mental health crises were inconsistent, with suicide prevention alerts appearing sporadically
- The ECRI Institute ranked AI chatbot misuse as the top health technology hazard for 2026
- The system performs well on textbook emergencies but struggles with nuanced, complex scenarios
- All major AI language models tested were susceptible to accepting false medical information
What This Means for Kent Residents
For patients across Kent and Medway, these findings serve as a timely reminder to approach health information from any online source—including AI tools—with appropriate caution. If you use ChatGPT or similar AI tools for health questions, the research suggests these should be treated as a starting point for conversation with your GP, not as a substitute for professional medical assessment.
For any urgent health concern, Kent residents should contact NHS 111 for advice on whether emergency care is needed, or call 999 for genuine emergencies. Your local Kent and Medway GP services remain the most appropriate first point of contact for health concerns, as GPs have access to your full medical history and can provide personalised, clinically accountable advice. The Kent and Medway NHS Trust and primary care services can be accessed through NHS England’s digital platforms or by contacting your registered GP practice directly.


