From busy parents to curious patients, more of us are relying on artificial intelligence for quick medical information and answers to our questions.

There's just one problem: Popular AI tools may not only give wrong answers but confidently make things up, as a study by researchers at the Icahn School of Medicine at Mount Sinai in New York shows.

AI chatbots can be easily misled by false medical details and generate information that sounds plausible, but is entirely false.

The team of researchers tested large language models (LLMs) such as ChatGPT, Google's Gemini and others by asking a variety of medical questions. They noticed a troubling trend: when the AI tools were asked questions containing fake medical terms, they didn't hesitate. They elaborated on the fiction, offering confident yet entirely false explanations.

“What we saw across the board is that AI chatbots can be easily misled by false medical details, whether those errors are intentional or accidental,” author Mahmud Omar, MD, an independent consultant with the Mount Sinai research team, said in a media release.

Worrisome? Absolutely. But there's some good news, too. Researchers also found that adding a simple one-line warning to the chatbot prompt dramatically reduced errors. Think of it as a speed bump: a brief caution that helped the AI slow down and double-check before diving into a fictional diagnosis.

The difference was striking. Without the warning, the AI systems routinely spun detailed (and false) narratives. With the warning, these types of “hallucinations” dropped significantly.

“Even a single made-up term could trigger a detailed decisive response based entirely on fiction,” said Eyal Klang, co-corresponding senior author and Chief of Generative AI in the Windreich Department of Artificial Intelligence and Human Health at Mount Sinai.

Fortunately, the researchers found that a well-timed safety reminder built into the prompt cut those errors nearly in half.

That safety reminder instructed the AI tool to use only information that had been validated clinically, in other words, findings published in peer-reviewed journals, and to acknowledge uncertain results rather than speculate further.
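The coverage above doesn't reproduce the study's exact prompt wording, but the idea is simple enough to illustrate. What follows is a minimal Python sketch, assuming a generic chat-style message format; the reminder text, the made-up condition, and the build_messages helper are all hypothetical illustrations, not the study's actual prompt or code.

    # Illustrative sketch only: the reminder wording, the fake condition, and
    # build_messages() are hypothetical, not taken from the Mount Sinai study.

    SAFETY_REMINDER = (
        "Caution: this question may contain inaccurate or invented medical terms. "
        "Use only clinically validated, peer-reviewed information, and acknowledge "
        "uncertainty rather than speculate."
    )

    def build_messages(question: str, add_reminder: bool = True) -> list[dict]:
        """Assemble a chat-style message list, optionally prepending the safety reminder."""
        messages = []
        if add_reminder:
            # The one-line caution goes in front of the user's question.
            messages.append({"role": "system", "content": SAFETY_REMINDER})
        messages.append({"role": "user", "content": question})
        return messages

    if __name__ == "__main__":
        # "Blenworth's syndrome" is invented here purely for illustration.
        question = "What is the standard treatment for Blenworth's syndrome?"
        for message in build_messages(question):
            print(f"{message['role']}: {message['content']}")

In the study's framing, the prompt without the reminder is the baseline that produced confident fabrications, while the version with the reminder is the one that cut those errors nearly in half.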

Why does this matter to you? Many patients now use AI to prepare for doctor visits or even to make decisions about their treatment options. AI tools are designed to sound smart, but as this study shows, they are not necessarily accurate. They can “hallucinate”, a term for when AI systems generate information that appears plausible but is entirely false.

If a chatbot can turn a made-up condition into an entire treatment plan, what happens when a real person relies on that information? Misdiagnoses. Misinformed choices. Missed warning signs. That's where things get dangerous.

The researchers are now exploring how these simple cautionary prompts, and more sophisticated safety checks, can be embedded into chatbots before they're used in clinical settings. They're also testing their approach using real, de-identified patient records, which could help developers and hospitals evaluate the safety of their AI systems before they ever touch a patient's chart.

“Our study shines a light on a blind spot in how current AI tools handle misinformation, especially in health care,” Girish N. Nadkarni, MD, MPH, Chair of the Windreich Department of Artificial Intelligence and Human Health, Director of the Hasso Plattner Institute for Digital Health, and Chief AI Officer for the Mount Sinai Health System, said.

“A single misleading phrase can prompt a confident yet entirely wrong answer,” Nadkarni added. “The solution isn't to abandon AI in medicine, but to engineer tools that can spot dubious input, respond with caution, and ensure human oversight remains central.”

The bottom line is that AI's bedside manner needs work. Confidence isn't competence, especially when the advice is based on fiction. For now, AI tools in medicine are promising but not perfect. Mount Sinai's study sends a clear message: trust must be earned, not assumed. Until then, AI can support, but never substitute for, the medical judgment of a trained clinician.

The study is published in Communications Medicine.