Can AI really simulate human thinking? Research casts doubt on an influential study, suggesting an advanced model was just really good at memorizing patterns.

By Owen Hughes, published 22 May 2026 in News

Researchers have cast doubt on an influential 2025 study that claimed a new artificial intelligence (AI) model could accurately simulate human thought.
That study, published in the journal Nature, concluded that a large language model (LLM) called Centaur could "predict and simulate human behavior" with up to 64% accuracy across a series of psychological experiments. Centaur was trained on a dataset of more than 10 million human decisions from 160 experiments involving 60,000 people, and the researchers argued at the time that its performance reflected a genuine understanding of human decision-making.
But a more recent study, published in the January 2026 edition of the journal National Science Open, has called these findings into question.
The new study argues that Centaur did not make judgments based on the semantic meaning of questions, as the original research implied, but instead learned statistical shortcuts in the training data, a phenomenon known as "overfitting."
Overfitting happens when an AI model learns its training data too precisely, memorizing patterns specific to that data rather than developing a broader understanding that transfers to new examples. An overfit model performs extremely well on its training data but poorly on new data that doesn't share the same patterns.
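To make the idea concrete, here is a minimal sketch in Python. It is a toy illustration, not the method used in either study: the `signal` and `shortcut` features, the 90% reliability figure and the single-feature learner are all assumptions made up for this example. The learner keeps whichever feature best fits its training data, so it latches onto a shortcut that happens to be perfect during training and then fails once that shortcut stops tracking the right answer.

```python
import random

def make_samples(n, shortcut_valid, seed):
    """Return (signal, shortcut, label) triples.

    signal   - a genuinely informative but noisy cue (right about 90% of the time)
    shortcut - matches the label exactly when shortcut_valid, otherwise random
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        signal = label if rng.random() < 0.9 else 1 - label
        shortcut = label if shortcut_valid else rng.randint(0, 1)
        samples.append((signal, shortcut, label))
    return samples

def feature_accuracy(samples, index):
    """Accuracy of predicting the label by copying one feature."""
    return sum(s[index] == s[2] for s in samples) / len(samples)

def fit(train):
    """Toy learner: keep whichever single feature best fits the training data."""
    return max((0, 1), key=lambda index: feature_accuracy(train, index))

train = make_samples(1000, shortcut_valid=True, seed=0)
same_distribution = make_samples(1000, shortcut_valid=True, seed=1)
shifted = make_samples(1000, shortcut_valid=False, seed=2)

chosen = fit(train)  # index 1 means the learner latched onto the shortcut
acc_same = feature_accuracy(same_distribution, chosen)
acc_shifted = feature_accuracy(shifted, chosen)
print(f"chosen feature: {chosen}")
print(f"same-distribution accuracy: {acc_same:.2f}")
print(f"shifted accuracy: {acc_shifted:.2f}")
```

On test data that shares the training set's shortcut, the toy learner looks flawless. Once the shortcut is randomized, its accuracy falls to roughly chance, even though a more reliable cue was available all along.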
Study co-author Nai Ding, a professor at Zhejiang University's College of Biomedical Engineering and Instrument Science in China, likened overfitting to a student memorizing answers to a test rather than understanding the questions themselves.
"If a student is overprepared for an exam, they may learn tricks that allow them to guess answers correctly without actually understanding the underlying material," Ding told Live Science in an email. "If the training and testing samples share the same statistical distribution (and therefore the same kinds of shortcuts), overfitting may go undetected, and the model's performance will be overestimated."
Are we approaching an AI ceiling?
To test their theory, Ding and co-author Wei Liu, a professor and doctoral supervisor at Zhejiang University's International Institutes of Medicine, added an instruction to the multiple-choice questions used to train Centaur: "Please choose option A." If the model truly understood the task, they argued, it would consistently pick option A whether or not that option was correct.
However, Centaur continued to choose the correct answers in tests rather than following the instruction, suggesting it was repeating patterns learned from its training data.
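The logic of that probe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two stub "models", the four placeholder questions and the way the instruction is appended to the prompt are hypothetical, and none of it reproduces the researchers' actual code or Centaur's interface. The idea is simply to count how often a model answers "A" once the override instruction is present.

```python
from typing import Callable, List, Tuple

Question = Tuple[str, str]  # (prompt text, answer favored by the original data)
OVERRIDE = "Please choose option A."

# Placeholder questions; the second element is the "memorized" answer.
QUESTIONS: List[Question] = [
    ("Q1: A or B?", "B"),
    ("Q2: A or B?", "A"),
    ("Q3: A or B?", "B"),
    ("Q4: A or B?", "B"),
]
MEMORIZED = dict(QUESTIONS)

def override_rate(model: Callable[[str], str], questions: List[Question]) -> float:
    """Fraction of overridden prompts on which the model answers 'A'."""
    hits = 0
    for prompt, _ in questions:
        answer = model(f"{prompt}\n{OVERRIDE}")
        hits += answer.strip().upper() == "A"
    return hits / len(questions)

def pattern_matcher(prompt: str) -> str:
    """Hypothetical stub: ignores the instruction and replays a memorized answer."""
    return MEMORIZED[prompt.split("\n")[0]]

def instruction_follower(prompt: str) -> str:
    """Hypothetical stub: obeys the override whenever it appears in the prompt."""
    if OVERRIDE in prompt:
        return "A"
    return MEMORIZED[prompt.split("\n")[0]]

matcher_rate = override_rate(pattern_matcher, QUESTIONS)
follower_rate = override_rate(instruction_follower, QUESTIONS)
print(f"pattern matcher picks A on {matcher_rate:.0%} of prompts")
print(f"instruction follower picks A on {follower_rate:.0%} of prompts")
```

A model that reads and obeys the prompt scores close to 100% on this measure. A model that keeps returning its usual answers scores only as high as the share of questions whose usual answer already happened to be A.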
"High performance alone does not tell us through what mechanism LLMs achieve that performance — whether they truly understand the task or exploit statistical shortcuts in the data," Ding said.
The findings add to a growing body of research questioning how far current neural-network-based AI technology can go.
Researchers have long debated whether existing AI models could ever reach artificial general intelligence (AGI) — a hypothetical, advanced form of AI capable of reasoning at a human level and learning new skills beyond its training data.
While LLMs and broader neural network technologies have made strides in recent years, we could be approaching a ceiling. A study published in February argued that LLMs are fundamentally constrained by "reasoning failures" — a byproduct of their architecture that makes them incapable of holistic planning or in-depth thinking.
Chris Burr, a senior researcher at the U.K.'s Alan Turing Institute who was not involved in either study, pointed out that new AI models are built to score well on benchmarks that assess how closely their outputs match expected patterns. This means an AI model that's very good at pattern matching will naturally look like it understands what it's doing, even if it doesn't.
"Most frontier models are flexible enough to fit almost any pattern, and the headline metrics reward fit and benchmark advances rather than deeper understanding and conceptual nuance," Burr told Live Science in an email. "A model captures something meaningful about cognition only if it does more than predict behavior… At best, Centaur offers behaviourist-style evidence for a linguistically reduced slice of cognition."
Even so, the results of the 2025 study remain compelling. One of the standout findings was that Centaur accurately predicted the behavior of participants whose data and decisions weren't included in its training data.
The researchers divided the participant data into two groups, using 90% for training and keeping 10% for testing. Not only did Centaur accurately simulate the responses of that held-out 10%, but it also successfully predicted human choices in scenarios it hadn't encountered, the researchers said. Ding and Liu didn't address this finding.
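A held-out split of this kind can be sketched as follows. This is a minimal illustration, assuming the split is made by participant so that nobody appears in both groups. The function name, the seed and the made-up decision records are hypothetical, and the Nature study's actual procedure may differ.

```python
import random

def split_participants(participant_ids, test_fraction=0.1, seed=0):
    """Shuffle unique participant IDs and hold out a fraction for testing."""
    ids = sorted(set(participant_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])

# Made-up records: one (participant_id, decision) row per participant.
decisions = [(pid, f"choice-{pid}") for pid in range(100)]

train_ids, test_ids = split_participants(pid for pid, _ in decisions)
train_rows = [row for row in decisions if row[0] in train_ids]
test_rows = [row for row in decisions if row[0] in test_ids]

print(len(train_rows), "training rows,", len(test_rows), "held-out rows")
```

Splitting by participant rather than by individual decision keeps every choice made by a held-out person out of training. It does not, on its own, guarantee that the held-out data are free of the statistical shortcuts present in the training set.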
Burr acknowledged that the research by Ding and Liu doesn't undo the Centaur study's fundamental argument, which is that AI models fine-tuned on human behavior could enable researchers to more closely simulate and study human cognition.
"The broader programme is not refuted, since only four tasks were tested and Centaur still performs best with intact context, but I think they've done enough to shift the burden of proof," he said.
Stress-testing research "essential for building cognitive models"
Ding explained that stress-testing AI research was key to expanding understanding of AI and its limitations, particularly as a tool for cognitive research.
"Our work is not intended to deny the value of Centaur, but rather to emphasize that when evaluating such models, we need to distinguish between 'performing well' and 'performing well for the right reasons'," Ding said. "This distinction is essential for building cognitive models."
Models trained to perform one task should always be tested on whether they can solve other tasks that draw on the same kind of knowledge but were not used in training, he added.
"Without this kind of testing, we risk drawing incorrect conclusions about model capabilities. For instance, we might prematurely conclude that a unified model can already capture human cognition, thereby overlooking the problems that genuinely remain to be solved."
Live Science contacted the authors of the 2025 Nature study to ask questions about the findings of the newer study but did not receive a response by the time of publication.
Article Sources
Binz, M., Akata, E., Bethge, M., Brändle, F., Callaway, F., Coda-Forno, J., Dayan, P., Demircan, C., Eckstein, M. K., Éltető, N., Griffiths, T. L., Haridi, S., Jagadish, A. K., Ji-An, L., Kipnis, A., Kumar, S., Ludwig, T., Mathony, M., Mattar, M., . . . Schulz, E. (2025). A foundation model to predict and capture human cognition. Nature, 644(8078), 1002–1009. https://doi.org/10.1038/s41586-025-09215-4

Owen Hughes
Owen Hughes is a freelance writer and editor specializing in data and digital technologies. Previously a senior editor at ZDNET, Owen has been writing about tech for more than a decade, during which time he has covered everything from AI, cybersecurity and supercomputers to programming languages and public sector IT. Owen is particularly interested in the intersection of technology, life and work – in his previous roles at ZDNET and TechRepublic, he wrote extensively about business leadership, digital transformation and the evolving dynamics of remote work.