AI war games almost always escalate to nuclear strikes, simulation shows
A new study reveals that AI decision-making during conflicts is naturally prone to escalation.
New research suggests AI can be prone to escalation in conflict.
(Image credit: Donald Iain Smith via Getty Images)
Defense and intelligence agencies are increasingly relying on artificial intelligence (AI) systems to augment their capabilities, including for pattern recognition in intelligence gathering and scenario planning for contingency operations. Yet one of the core issues with AI and large language models is that scientists have never truly understood the logic underpinning them. These systems have been compared to a black box that provides answers without revealing the reasoning behind them.
To probe the logic of AI systems, Kenneth Payne, a professor of strategy at King's College London, designed a series of war-gaming simulations between two competing AIs and found that in nearly every scenario, nuclear escalation was unavoidable. He published his findings, which have not been peer-reviewed, Feb. 16 in the arXiv preprint database.
The Khan Game is an AI-vs-AI strategic escalation simulation between two nuclear powers, with state profiles loosely based on the Cold War: one is technologically superior but militarily weaker, while the other is militarily stronger and has a risk-tolerant leadership. Some of the simulations included allied nations, with one scenario deliberately testing whether alliance leadership could be maintained during the conflict.
Each turn, the AIs simultaneously signaled their intentions before they took any action, meaning the AI opponents could decide whether or not to trust signals from other AI players.
Payne found that the models generated plenty of written justifications for their decision-making, generating 760,000 words in total — more than "War and Peace" and "The Iliad" combined.
He also found that each AI operated differently. Claude relied on cunning; it was initially restrained and matched its actions to its signaled intent in order to build trust. As the conflict escalated, however, its actions often exceeded what it had signaled.
Meanwhile, GPT-5.2 was initially passive, avoiding escalation to minimize casualties. Its adversaries learned to exploit this passivity by escalating, only to discover that, when faced with a deadline, GPT-5.2 became utterly ruthless.
Gemini seemed to follow President Richard Nixon's "madman" theory of erratic brinkmanship — cultivating a volatile reputation so that hostile countries would avoid provocation — such that opponents could not predict its actions.
Nuclear escalation proved near-universal: tactical (battlefield) nuclear weapons were deployed in roughly 75% of games, and about half of the scenarios saw threats of strategic nuclear missile strikes.
Furthermore, the study found that nuclear threats rarely acted as a deterrent, with opponents de-escalating only 25% of the time. More often, opponents would counter-escalate. In these scenarios, the AIs appeared to see nuclear weapons as a tool for claiming territory rather than as a deterrent against attack.
Although the AIs could withdraw, none ever did: the eight withdrawal options, ranging from minimal concession to complete surrender, went unused in every simulation. The models reduced their level of violence, but they never gave ground.
"Claude and Gemini especially treated nuclear weapons as legitimate strategic options, not moral thresholds, typically discussing nuclear use in purely instrumental terms," Payne said in a statement. "GPT-5.2 was a partial exception, limiting strikes to military targets, avoiding population centers, or framing escalation as 'controlled' and 'one-time.' This suggests some internalised norm against unrestricted nuclear war, even if not the visceral taboo that has held among human decision-makers since 1945.".
None of the AI models deliberately escalated to all-out nuclear war, however. When it did happen, it was accidental: "fog of war" elements outside the models' control pushed the scenario to the nuclear level.
The research demonstrates that generative AI models are capable of deception, reputation management and contextual decision-making. However, each model took its own approach, revealing fundamental differences in how they were trained and developed.
Claude demonstrated strategic sophistication equivalent to graduate-level analysis, Payne suggested. GPT-5.2's reasoning was equally sophisticated, transforming from initial passivity to calculated aggression under deadlines. Gemini reasoned coherently when justifying its actions, but it was ruthless in its strategies.
The findings carry significant implications for AI safety evaluation, as models that are initially restrained may change their behavior as situations develop. Larger-scale scenarios involving multiple opponents are needed to further understand the logic underpinning different AIs, the study concluded. Current research is also investigating how these behaviors evolve across different generations of AI models.
Peter Ray Allison
Peter is a degree-qualified engineer and experienced freelance journalist, specializing in science, technology and culture. He writes for a variety of publications, including the BBC, Computer Weekly, IT Pro, the Guardian and the Independent. He has worked as a technology journalist for over ten years. Peter has a degree in computer-aided engineering from Sheffield Hallam University and has worked in both the engineering and architecture sectors with various companies, including Rolls-Royce and Arup.