
AI Medical Chatbots Excel at Diagnosis but Struggle with Critical Risk Assessment, New Research Shows

Multiple studies reveal artificial intelligence systems achieve impressive diagnostic accuracy yet fail to judge patient urgency and clinical appropriateness, raising safety concerns for medical advice seekers.

Patients across Kent are increasingly turning to AI chatbots for medical advice, but new research reveals a troubling gap between technical accuracy and clinical judgment. While these systems can diagnose conditions with remarkable precision, they’re failing at something equally critical – determining how serious a patient’s condition actually is.

The numbers tell a complex story. ERNIE Bot, developed by Baidu, achieved 77.3% diagnostic accuracy and correctly prescribed medications 94.3% of the time. Yet the same system ordered unnecessary laboratory tests in 91.9% of cases and prescribed unneeded medications in 57.8% of consultations.

Even more advanced systems show similar patterns. ChatGPT-4o reached 92.5% diagnostic accuracy with perfect prescription accuracy, but recommended inappropriate treatments in 67.5% of cases. DeepSeek achieved perfect scores for both diagnosis and prescriptions, yet still suggested unnecessary medications 60% of the time.

The Accuracy Paradox

These findings highlight what researchers call the “accuracy paradox” in AI medical systems. The chatbots can identify what’s wrong with impressive precision, but they can’t judge whether immediate action is needed or if a patient can safely wait.

Dr Helen Salisbury, writing in the BMJ, points to the core issue: “The problem isn’t solely—or even mostly—about reaching a correct diagnosis but also about judging risk and acuity.”

A separate BMJ Open study found that about half of all answers from five major AI chatbots – Gemini, DeepSeek, Meta AI, ChatGPT, and Grok – contained inaccurate or incomplete medical information when tested with real medical questions.

When AI Meets Reality

Oxford University researchers discovered that AI models performing brilliantly on benchmark tests struggled when interacting with actual human users making medical decisions. The systems provide technically correct answers that become medically inappropriate because they lack understanding of individual patient circumstances.

This context-blindness becomes particularly problematic when patients ask leading questions based on their own suspected diagnoses. The AI systems tend to confirm these suspicions rather than properly assess the clinical situation.

In the same comparative studies, Chinese frontline doctors achieved only 25% diagnostic accuracy and 10% correct prescription rates, suggesting AI systems can outperform some human physicians on specific metrics. But these comparisons used different evaluation methods, making direct performance assessments difficult.

Source: @bmj_latest

Key Takeaways

  • AI chatbots achieve high diagnostic accuracy (77-100%) but frequently recommend unnecessary tests and treatments
  • Technical precision doesn’t translate to appropriate clinical judgment about urgency and severity
  • Around half of AI medical responses contain inaccurate or incomplete information according to BMJ research

What This Means for Kent Residents

Kent residents should treat AI medical chatbots as preliminary information tools rather than diagnostic authorities, especially when assessing whether symptoms require immediate attention. If you're using AI systems for health queries, always follow up with NHS 111 for urgent concerns or book a GP appointment through your local practice for non-emergency issues. The impressive accuracy statistics don't account for the careful clinical judgment that determines whether you need same-day treatment or can safely monitor symptoms at home – that assessment still requires human medical expertise.

Transparency Notice: This article was produced with AI assistance and reviewed by our editorial team before publication. Kent Local News uses artificial intelligence tools to help deliver fast, accurate local news. For more information, see our Editorial Policy.
KLN Staff Reporter
The KLN Staff Reporter desk covers breaking news, crime alerts, traffic updates, and council news across Kent. Our reporting team works around the clock to bring you the latest developments from communities across the county.