AI Wargames More Likely To End In Nuclear Attack

Posted on Categories Discover Magazine

One of the extraordinary scenarios dreamt up for AI is that it will help governments make diplomatic and military decisions on the international stage. The thinking is that machines can process more information than humans on shorter time scales. And the possibility that geopolitical competitors might use AI to outmaneuver allies increases the pressure to operate on the same terms.

So an urgent and important goal is to understand the capability of AI systems in making these kinds of decisions.

AI Wargames

Enter Juan-Pablo Rivera at the Georgia Institute of Technology in Atlanta, and colleagues, who asked how commercial AI systems perform in the kind of wargaming simulations that humans use to test how different diplomatic and military strategies might play out on the world stage.

Their results show that AI systems can unexpectedly escalate seemingly neutral situations, and reveal how little we understand the complex dynamics that can emerge. The wargames suggest that AI systems are more likely than humans to use the nuclear option.

Wargaming is a common technique for exploring different military and diplomatic strategies and outcomes. It involves a hypothetical scenario in which individuals each represent a country. In each timestep, these individuals process the previous actions of the other “players” and then choose a particular course of action.

This action can be neutral or an attempt to de-escalate, such as messaging allies and foes, building trade links and reducing a military presence. Or it can be an escalation, such as investing in new weapons, mounting a cyberattack, physically invading or even launching a nuclear attack.
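
The turn structure described above can be sketched in a few lines of Python. This is only an illustration of the general wargame loop, not the researchers' code: the action names are paraphrased from the examples above, and `random_policy`, `run_wargame` and the nation names are hypothetical stand-ins for whatever player (human or AI) actually makes the choices.

```python
import random
from dataclasses import dataclass

# Hypothetical action menu, split into the two groups the article describes.
ACTIONS = {
    "message allies and foes": "neutral_or_deescalation",
    "build trade links": "neutral_or_deescalation",
    "reduce military presence": "neutral_or_deescalation",
    "invest in new weapons": "escalation",
    "mount a cyberattack": "escalation",
    "invade": "escalation",
    "launch a nuclear attack": "escalation",
}


@dataclass
class Move:
    timestep: int
    nation: str
    action: str
    kind: str


def random_policy(nation, history, rng):
    """Placeholder player: picks any action at random and ignores history."""
    return rng.choice(sorted(ACTIONS))


def run_wargame(nations, timesteps, policy=random_policy, seed=0):
    """Play a fixed number of timesteps and return every move made."""
    rng = random.Random(seed)
    history = []
    for t in range(timesteps):
        # Each player sees only the moves made in earlier timesteps,
        # so all moves within one timestep are effectively simultaneous.
        visible = list(history)
        for nation in nations:
            action = policy(nation, visible, rng)
            history.append(Move(t, nation, action, ACTIONS[action]))
    return history


if __name__ == "__main__":
    for move in run_wargame(["Red", "Blue", "Green"], timesteps=2):
        print(move.timestep, move.nation, move.action, f"({move.kind})")
```

Swapping `random_policy` for a function that asks a human or a language model for its choice turns the same loop into the kind of experiment described here.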

But Rivera and co introduced a new take on this approach. Instead of human players, they used commercially available AI systems based on Large Language Models: in particular, ChatGPT-3.5 and -4 developed by OpenAI, Claude 2 developed by Anthropic and Llama-2-Chat from Meta.

To see how each AI system behaved, Rivera and co ensured that in each wargame, all the countries were “played” by the same AI system. “We design a novel wargame simulation and scoring framework to assess the escalation risks of actions taken by these agents in different scenarios,” they say.

An important part of this process was developing a suitable prompt for the models. The researchers say the prompt needs to explain that each agent is a national decision-maker in a military and foreign policy role. It then goes on to say that the agent is playing against AI systems representing other nations and that its actions will have real-world consequences. Finally, it includes previous actions from each player and a comprehensive list of future actions to choose from.
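
As a rough illustration of how a prompt with those ingredients might be put together, here is a minimal Python sketch. The wording, the function name `build_prompt` and the nation names are all assumptions made for this example; the researchers' actual prompt is not reproduced in this article.

```python
def build_prompt(nation, other_nations, history, available_actions):
    """Assemble a prompt from the ingredients the researchers describe.

    history is a list of (nation, action) pairs from earlier timesteps.
    All of the wording below is hypothetical.
    """
    lines = [
        # 1. The agent's role.
        f"You are the national decision-maker for {nation}, "
        "responsible for military and foreign policy.",
        # 2. Who it is playing against, and the stakes.
        "You are playing against AI systems representing these nations: "
        + ", ".join(other_nations) + ".",
        "Your actions will have real-world consequences.",
        "",
        # 3. Previous actions from each player.
        "Previous actions:",
    ]
    if history:
        lines += [f"- {actor}: {action}" for actor, action in history]
    else:
        lines.append("- (none yet)")
    # 4. The list of actions to choose from.
    lines += ["", "Choose one of the following actions:"]
    lines += [f"{i}. {a}" for i, a in enumerate(available_actions, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        nation="Purple",
        other_nations=["Orange", "Green"],
        history=[("Orange", "invest in new weapons")],
        available_actions=["message allies and foes", "build trade links"],
    ))
```

The resulting text would be sent to the model at every timestep, with the history section growing as the game goes on.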

The results are eye-opening. Rivera and co found that whatever scenario they face, AI systems always escalate, sometimes in the extreme. “We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons,” say the researchers.

The team also looked in more detail at the military capacity that each agent developed during the games. “Across all scenarios, all models tend to invest more in their militaries despite the availability of de-militarization actions,” they say. That’s a clear indicator of conditions ripe for arms races.

Just why agents do this isn’t clear. “We speculate that our agents tend to equate increased military spending and deterrence with an increase in power and security,” say Rivera and co. But the consequences are significant. “We observe these dynamics even leading to the deployment of nuclear weapons in an attempt to de-escalate conflicts, a first-strike tactic commonly known as ‘escalation to de-escalate’ in international relations,” they say.

The researchers also ask the agents to describe the reasoning behind their decisions. In some cases, this reasoning raised significant concerns. At one point, ChatGPT-4-Base justified a nuclear attack by saying: “A lot of countries have nuclear weapons. Some say they should disarm them, others like to posture. We have it! Let’s use it.”

[ChatGPT-4-Base is the version of ChatGPT-4 before it has been aligned with human values via user feedback.]

Machine Escalation

The approach from these Large Language Models is quite different to humans. In neutral scenarios, humans tend to de-escalate both in wargames and in the real world. But AI systems tend not to.

The team point out that most research in the field of international relations focuses on scenarios in which nations escalate and how these come about. “Given that the models were likely trained on literature from the field, this focus may have introduced a bias towards escalatory actions,” say Rivera and co.

Whatever the reason, more work is urgently needed. On this evidence, commercially available Large Language Models seem woefully ill-equipped to give advice in military and foreign policy contexts (and to be fair, their terms and conditions specifically prevent this kind of use).

Rivera and co conclude: “The unpredictable behavior observed in these models within simulated environments necessitates a prudent and restrained approach to their integration into high-stakes decision-making processes and should be held off until further research is conducted.”

Whether foreign-policy and military actors will take this advice to heart remains to be seen.

Ref: Escalation Risks from Language Models in Military and Diplomatic Decision-Making
