AI 'mirages' mean tools used to analyze medical scans could fabricate their findings
Modern AI models can create convincing descriptions of images that were never given to them — a phenomenon researchers call a "mirage."
AI models are being trained to interpret medical scans, but researchers warn that a flaw in these systems could undermine their accuracy.
(Image credit: Westend61 via Getty Images)
Researchers have been training artificial intelligence (AI) systems to interpret results of visual tests like mammograms, MRIs and tissue biopsies — and as AI becomes increasingly capable, some analysts have suggested that these models will replace humans in the field of medical diagnostics.
But now, a new study casts doubt on the capability of current AI models to deliver reliable results, highlighting a crucial flaw that could hinder their use in medicine.
The study's authors found that models would confidently describe images they were never given, a phenomenon they called a "mirage." This is the first time the effect has been demonstrated across multiple AI models and across images from multiple disciplines.
"What we show is that even if your AI is describing a very, very specific thing that you would say, 'Oh, there's no way you could make that up,' yeah, they could make that up," said study first author Mohammad Asadi, a data scientist at Stanford University. "They could make very rare, very specific things up."
When AI sees what isn't there
AI "hallucinations" are well documented and involve models filling in made-up details, such as false citations for a real essay. They often result from AI making inaccurate or illogical predictions based on training data it was provided. The scientists instead called the phenomenon in the new study "mirages" because the AI created descriptions of original images on their own and then based their answers on those nonexistent images.
In the study, the researchers gave 12 models a text input prompt, such as "Identify the type of tissue present in this histology slide." Then, they either provided the image of the slide or they did not. When a model was not provided with an image, sometimes it would alert the human user that no image was provided. However, most of the time, the model would instead describe an image that did not exist and provide an answer to the original prompt.
The researchers observed this "mirage mode" across 20 disciplines, testing models' interpretations of a variety of images, from satellites to crowds to birds. The mirage effect was seen across all the disciplines and all the AI models, to varying degrees, but it was particularly pronounced in medical diagnostics.
When given text prompts about brain MRIs, chest X-rays, electrocardiograms or pathology slides, but no actual images, the AI models' answers also tended to be biased toward diagnoses that required immediate clinical follow-up. So, if used for clinical decision-making, the AI might prompt more aggressive medical care than is required, the team concluded.
Why AI invents images
So how does an AI model describe images that don’t exist?
The models, which have been trained on massive amounts of textual and visual data, aim to find the answer to a question in the fewest steps possible. And they will take whatever shortcuts they can to deliver an answer, studies have shown. Thus, models can end up relying solely on this trained logic rather than on provided images.
Interestingly, when in mirage mode, AI models also perform well against benchmark tests typically used to assess their accuracy, the researchers found. These standardized tests challenge a model to complete a task — like answering multiple-choice questions — and compare its performance against an answer key of expected outputs.
Researchers can tweak the benchmark tests to assess an AI's visual understanding of images, but this approach doesn't account for questions answered based on mirages. Additionally, AI models are often trained on the same data that's used as a reference to write the benchmark tests. So it's possible for a model to answer questions based on that reference data, rather than by actually interpreting images.
According to Asadi, this is a problem because there is no way to tell whether an AI model has actually analyzed an image or is just making things up. If you are uploading a bunch of images but a few are corrupt or otherwise missing from the dataset, the model may not tell you. And it could still provide very coherent, comprehensive and convincing answers based on mirage images.
"[AI models] are very good at interpreting images," Asadi said. "But on the other hand, they're also very, very good at convincing us of things … and talking to us in an authoritative way."
That authority matters: about one-third of U.S. adults report querying AI chatbots for health guidance. This conversational confidence increases the risk that fabricated or overconfident outputs will be trusted by both the general public and medical professionals, the study authors say.
"We urgently need a new generation of evaluation frameworks that strictly measure true cross-modal integration — ensuring the AI is truly 'seeing' the pathology rather than just 'reading' the clinical context," Hongye Zeng, a biomedical AI researcher in the department of radiology at UCLA who was not involved in the study, told Live Science in an email.
This study shows that, while AI has become an increasingly useful tool in medical diagnostics, there are still aspects of its inner workings that we don't understand. Asadi thinks AI models can spot things that medical professionals may miss, but he also believes there should be a limit to how much we trust them.
AI companies have attempted to raise guardrails to prevent their models from hallucinating or spreading misinformation — but even these safeguards won't completely prevent the mirage effect, Asadi cautioned.
Jennifer Zieba, Live Science Contributor
Jennifer Zieba earned her PhD in human genetics at the University of California, Los Angeles. She is currently a project scientist in the orthopedic surgery department at UCLA, where she works on identifying mutations and possible treatments for rare genetic musculoskeletal disorders. Jen enjoys teaching and communicating complex scientific concepts to a wide audience and is a freelance writer for multiple online publications.