AI systems from DeepMind and Meta are now beating human players at the strategy board games Stratego and Diplomacy. It is a new milestone for the technology, but some experts fear the risk of serious abuses.
For many years now, artificial intelligence has been outperforming humans at chess and Go. Until now, however, it had been unable to master more complex board games.
AI has now reached a new milestone by mastering Stratego and Diplomacy. These two strategy games are based on the notion of “imperfect information”. This is in contrast to chess and Go, where players see all the pieces on the board.
In Stratego, the identity of each piece is hidden until another piece encounters it. Diplomacy involves establishing agreements and alliances, the nature of which is kept secret.
So these games don’t just require calculating paths to victory; they demand more subtle skills, such as the ability to guess an opponent’s thinking and adjust one’s strategy to thwart their plans. Bluffing and persuasion are necessary.
These two board games were mastered a few days apart by two AI models, one developed by DeepMind and the other by Meta (formerly Facebook).
DeepMind’s DeepNash wins at Stratego
The model that plays Stratego, developed by DeepMind, is called DeepNash. Rather than focusing on making intelligent moves, it is designed to play unpredictably.
The characteristics of this game make it more complicated than chess, Go or poker, all of which AI has already mastered. Two players each place 40 pieces on a board, but cannot see the identity of the opponent’s pieces. The aim is to move the pieces to eliminate those of the opponent and capture their flag.
In total, a game of Stratego can unfold in 10^535 different ways. By comparison, this number is 10^360 for the game of Go. Similarly, in terms of imperfect information at the start of the game, Stratego has 10^66 possible hidden positions, against 10^6 in a poker game.
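To put these figures side by side, the short Python sketch below compares the orders of magnitude involved. Only the four exponents (535, 360, 66 and 6) come from the text; the constant and function names are illustrative assumptions.

```python
# Sizes quoted in the article, written as exact powers of ten.
STRATEGO_GAME_TREE = 10 ** 535
GO_GAME_TREE = 10 ** 360
STRATEGO_HIDDEN_SETUPS = 10 ** 66
POKER_HIDDEN_SETUPS = 10 ** 6


def orders_of_magnitude_apart(larger, smaller):
    """Return how many powers of ten separate two exact powers of ten."""
    # The quotient is itself a power of ten; its digit count minus one
    # is the exponent.
    return len(str(larger // smaller)) - 1


tree_gap = orders_of_magnitude_apart(STRATEGO_GAME_TREE, GO_GAME_TREE)
hidden_gap = orders_of_magnitude_apart(STRATEGO_HIDDEN_SETUPS, POKER_HIDDEN_SETUPS)

print(f"Stratego's game tree is 10^{tree_gap} times larger than Go's.")
print(f"Stratego has 10^{hidden_gap} times more hidden starting positions than poker.")
```

In other words, the gap between Stratego and Go is itself far larger than the number of atoms usually estimated for the observable universe, which is why brute-force search is not an option here.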
This AI is sometimes audacious. In one game against a human, it sacrificed several high-ranking pieces and found itself outnumbered. This was in fact a calculated risk to push the player to bring out their strongest pieces. It then won by building its strategy around this information.
The DeepNash model is good enough at Stratego to beat all other systems every time, and to win 84% of games against experienced humans. Over 50 games on the Gravon online gaming platform, it became the third-best Stratego player on the platform since 2002.
To achieve this level of performance, DeepMind could not rely on the same algorithms as for chess and Go, which were not at all suited to this game. The researchers therefore invented a new algorithmic method called Regularised Nash Dynamics.
The DeepNash model combines a reinforcement learning algorithm with a deep neural network. To find the ideal action to perform for each state of a game, this AI played 5.5 billion games against itself.
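The idea of learning through self-play can be illustrated on a much smaller game. The sketch below is not DeepNash and does not implement Regularised Nash Dynamics: it swaps in regret matching, a simpler classic technique, on rock-paper-scissors, and every name in it is an assumption made for illustration. Two copies of the same learner play each other, and their average strategy approaches the unpredictable one-third mix that no opponent can exploit.

```python
# Toy self-play with regret matching on rock-paper-scissors.
# This is a stand-in for illustration, NOT DeepMind's algorithm.

# PAYOFF[a][b] is the reward for playing action a against action b.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
NUM_ACTIONS = 3


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / NUM_ACTIONS] * NUM_ACTIONS
    return [p / total for p in positive]


def self_play(iterations=20000):
    """Return the average strategy of both players after self-play."""
    # Deliberately biased starting regrets so the players begin far
    # from the balanced strategy.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    strategy_sums = [[0.0] * NUM_ACTIONS, [0.0] * NUM_ACTIONS]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(NUM_ACTIONS))
                for a in range(NUM_ACTIONS)
            ]
            expected = sum(
                strategies[player][a] * action_values[a]
                for a in range(NUM_ACTIONS)
            )
            for a in range(NUM_ACTIONS):
                # Regret: how much better action a would have done.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = self_play()
print(average_strategies)
```

The point of the toy is the same as in the article: in a game with hidden choices, the strongest strategy is one the opponent cannot predict, and it can be found by having the program play against itself many times.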
Cicero: Meta’s AI masters the game Diplomacy
For its part, the AI that masters Diplomacy was developed by Meta and CSAIL and is called Cicero. Despite the difficulty of this game, the model is able to compete with human players.
In Diplomacy, up to seven players compete, each representing a European power before the First World War. The aim is to control supply centers by moving fleets and armies.
This game requires a flair for scheming, a talent for betrayal and false promises, and a genuinely Machiavellian streak. The complexity lies not in the world map or the pieces, but in the strategy around the agreements made. What’s more, players have to communicate verbally and convince others of the sincerity of their intentions.
Here again, it is not just a question of computing power. To beat humans at this game, Cicero follows a multi-step process.
First, the AI uses the current state of the board and the ongoing discussions to make an initial prediction of each player’s actions. It then refines this prediction and uses it to form an intention for itself and its partners.
It then generates several candidate messages based on the state of the board, the dialogue and its intentions. The candidate messages are then filtered to reduce nonsense, maximize value and ensure consistency with those intentions.
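The final filtering step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Meta’s implementation: the `Candidate` class, the function names, the move notation and the numeric values are all hypothetical, and a blank-text check stands in for the real nonsense filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate message (hypothetical representation)."""
    text: str
    promised_moves: frozenset  # moves the message commits the sender to
    value: float               # toy estimate of how useful the message is


def filter_candidates(candidates, intent, min_value=0.0):
    """Drop nonsense and inconsistent messages, then rank the rest by value."""
    kept = [
        c for c in candidates
        if c.text.strip()               # crude stand-in for a nonsense filter
        and c.promised_moves <= intent  # only promises moves that are planned
        and c.value >= min_value
    ]
    return sorted(kept, key=lambda c: c.value, reverse=True)


def choose_message(candidates, intent):
    """Return the best surviving message, or None if none survives."""
    ranked = filter_candidates(candidates, intent)
    return ranked[0].text if ranked else None


# Illustrative intent: the moves the player actually plans to make.
intent = frozenset({"A PAR - BUR", "F BRE - MAO"})
candidates = [
    Candidate("I will move Paris to Burgundy to cover you.",
              frozenset({"A PAR - BUR"}), 0.7),
    Candidate("I am sending my fleet to the Channel.",
              frozenset({"F BRE - ENG"}), 0.9),  # contradicts the intent
    Candidate("   ", frozenset(), 0.1),          # blank, treated as nonsense
    Candidate("Good luck this turn.", frozenset(), 0.2),
]

best = choose_message(candidates, intent)
print(best)
```

Here the highest-scoring candidate is rejected because it promises a move that is not part of the plan, which mirrors the consistency check described above.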
This artificial intelligence was trained on data from 125,261 games played on the online version of Diplomacy, combined with data from games played against itself. Its strategic reasoning module (SRM) learned to predict players’ actions and to choose an optimal action accordingly.
Its dialogue module, used to communicate its intentions to allies, is based on a 2.7 billion-parameter language model pre-trained on text from the Internet and then fine-tuned on messages from games of Diplomacy played by humans. Based on the SRM’s intentions, this module generates a conversational message.
On webDiplomacy.net, Cicero managed to hold its own against its human opponents. It finished second in a ranking of 19 players and outscored most of them.
An AI capable of starting a war?
According to Michael Wellman of the University of Michigan, “the speed at which different game features have been conquered or mastered by AI in recent years is quite remarkable.” This computer scientist studies strategic reasoning and game theory.
As he points out, “Stratego and Diplomacy are quite different from one another, and also feature challenges significantly different from those of games where similar successes have been achieved”.
According to Meta AI researcher Noam Brown, these game AIs capable of interacting with humans and taking into account non-optimal or even irrational actions could pave the way for real-world applications.
In his words, “if you build an autonomous car, you don’t want to assume that all the other drivers on the road are perfectly rational or will behave optimally. Cicero is a big step in this direction.”
He believes that this technology could help virtual assistants to better understand what consumers want, or make the virtual beings of the metaverse more engaging and realistic. The aim of these researchers is not to create AIs capable of beating humans at games, but AIs capable of cooperating with them in the real world.
However, some experts are far less optimistic. According to Kentaro Toyama, an artificial intelligence expert at the University of Michigan, “these AIs are frightening and could be used for evil”. Just as generative AIs worry artists, this type of artificial intelligence also represents a threat.
He fears that their ability to hide information, to think several turns ahead of their opponents and to surpass human intelligence represents a risk. In his eyes, this technology could be used to create more convincing scams or more realistic deepfakes.
The Cicero code is open to the public, and malicious actors could copy it and use its negotiation and communication skills to write persuasive emails and extort their victims.
Worse still, if someone were to train this language model on data such as the diplomatic documents revealed by WikiLeaks, Toyama fears that the system could impersonate a diplomat and initiate communication with a foreign power.
According to this specialist, “AI is like the nuclear power of this age. It has colossal potential for both good and evil, but… I think if we don’t start regulating evil, all the science fiction about dystopian AI will become scientific fact”.