Artificial Intelligence (AI) in Medicine. Why it will not replace empirical medicine any time soon

Posted by:

Ariel

On:

September 15, 2025

At last year’s American College of Physicians conference in Boston, Dr Eric Topol presented the keynote address on Artificial Intelligence (AI) in Medicine. In his speech, he foresaw advances in medicine through AI along two lines: a unimodal approach and a multimodal approach. In the unimodal approach, AI can interpret data differently than we can as humans. Examples of this include using a mammogram to determine your risk of heart disease and using EKGs to determine whether you will develop an arrhythmia in the next few years. In the multimodal approach, Dr Topol sees the strength of AI in its ability to encourage slow thinking, document our medical findings, diagnose rare diseases, and be more empathic as a clinician.

In 2023, ChatGPT passed the medical licensing exams. Since then, Alan, an AI platform in France, successfully delivered medical advice with 95% of the messages rated positively. While these systems are helping doctors deliver better care, they will not replace doctors any time soon. The main reason doctors still have one in the win column is the sacred nature of the patient-doctor relationship. In addition, AI can produce potentially dangerous inaccuracies (hallucinations). More importantly, AI in medicine will not beat us until it starts thinking about the empirical nature of disease.

On average, I am about 5% more accurate at diagnosing disease than my resident doctors. This is not because I am any smarter than they are. It is because I have seen a lot more medicine than they have. In doing so, I have done what any animal is so good at doing: observing with my senses what others may have overlooked. Knowing which disease features are relevant is what allows me to diagnose more accurately.

Take, for example, a patient I saw with the residents a few months ago. She presented to our clinic with abdominal pain. The pain was primarily in the right upper abdomen and seemed to worsen with food. She described the pain as cramping. My resident suspected that the patient had gallbladder disease and wondered if I could use our Point of Care Ultrasound machine to look at her gallbladder. I examined the gallbladder, and the exam was normal.

I then asked her one question that completely changed the diagnostic process:

“Has the pain moved at all since it started in the upper abdomen?”

She said, “As a matter of fact, yes, it is now moving to the lower abdomen.”

Those words changed the entire diagnostic process from ruling out gallstones to ruling out appendicitis. The next morning, she underwent a CT scan, which revealed early appendicitis.

As you can see, one data point (the fact that the pain moved) completely changed my diagnosis. I learned this key feature from one of my favorite surgeons, Dr Mark Nolan Hill. He used to teach my medical students that the pain in appendicitis moves as the disease progresses. This finding is uniquely related to where the nerve fibers in the midgut originate in the primitive embryo. As the infection progresses from being contained in the inner abdominal cavity (appendix viscera) and invades through to the outer abdominal cavity (peritoneum), the pain changes from a dull, cramping mid-abdominal pain to a sharper and more severe lower abdominal pain.

If you ask ChatGPT, “What are the causes of right upper quadrant abdominal pain?” it will go through the same differential as most doctors would: gallbladder disease, peptic ulcer disease, and disease originating from the heart or diaphragm. But if you ask it, “What causes right upper quadrant abdominal pain that moves to the lower abdomen?” it will answer appendicitis as the leading diagnosis. Unless you ask ChatGPT the right question, it will not produce the right answer.

If AI is ever to compete with the diagnostic reasoning of doctors, it needs to think more empirically about medicine. Using illness scripts, where differentials are based on 5-10 data points, should help AI produce a correct diagnosis, and in time it will outrival our ability to generate differentials. This will require developers to reach out to expert diagnosticians, who are a dying breed now in ‘Modern Medicine’.
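For readers who like to see ideas in code, here is a rough sketch of what an illness script might look like to a developer. It is only an illustration: the names (`IllnessScript`, `rank_differential`) are hypothetical, the two scripts are drastically simplified, and the weights are made up for the example rather than taken from any clinical source. It simply shows how one extra data point, the pain moving, can reorder a differential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IllnessScript:
    """A hypothetical, simplified illness script: findings mapped to weights."""
    name: str
    features: dict  # finding -> illustrative weight (negative = argues against)


# Illustrative scripts only. The weights are assumptions, not clinical values.
SCRIPTS = [
    IllnessScript("gallbladder disease", {
        "right upper abdominal pain": 2,
        "worse with food": 2,
        "cramping pain": 1,
        "normal gallbladder ultrasound": -3,
    }),
    IllnessScript("appendicitis", {
        "cramping pain": 1,
        "pain migrated to lower abdomen": 4,
    }),
]


def rank_differential(findings, scripts=SCRIPTS):
    """Score each script by summing the weights of the findings present,
    then return (diagnosis, score) pairs from highest to lowest score."""
    present = set(findings)
    scored = [
        (script.name, sum(w for f, w in script.features.items() if f in present))
        for script in scripts
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# The patient from the story, before and after the key question.
initial_findings = [
    "right upper abdominal pain",
    "worse with food",
    "cramping pain",
    "normal gallbladder ultrasound",
]
before = rank_differential(initial_findings)
after = rank_differential(initial_findings + ["pain migrated to lower abdomen"])

print("Before asking:", before)
print("After asking: ", after)
```

With these made-up weights, gallbladder disease leads the list until the migrating pain is added, at which point appendicitis moves to the top. The code is the easy part. Knowing which 5-10 findings belong in each script, and how much each one matters, is exactly what has to come from experienced diagnosticians.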
