It's been reported that an artificial intelligence (AI) has learned to adopt “highly aggressive” strategies when it senses it's about to lose a simulated game. While that certainly sounds a little scary, the study behind the headlines was investigating a larger social question far more fascinating and enlightening than mere aggression.
Google’s in-house AI, machine learning, and neural network development teams are working on some truly remarkable projects at present: from AlphaGo Zero, the AI that learned 3,000 years’ worth of Go tactics in a matter of days, to AutoML, a system that produces self-correcting AI “children” designed to perform specific tasks, their output is nothing short of impressive.
First, the researchers copied the AI and assigned the two versions to different teams, Red and Blue. Arming them with virtual lasers and placing them in a virtual environment full of green “apples”, they tasked each agent with gathering more apples than the other while fending its rival off.
When there were plenty of apples, both gathered a roughly equal amount and were somewhat peaceful. When resources were few and far between, they tended to attack each other.
This is likely where the “highly aggressive” reference comes from. The in-house study notes that, in this game, “agents learn aggressive policies in environments that combine a scarcity of resources with the possibility of costly action,” while noting that “less aggressive policies emerge from learning in relatively abundant environments with less possibility for costly action.”
Red and Blue try to get the green "apples" while shooting each other with yellow blasts.
In short, when times are tough, an AI acts more aggressively against those with the same objective in order to secure supremacy. Curiously, larger and more complex – perhaps more intelligent – AIs were more likely to attack their competitors than smaller, simpler neural networks.
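The trade-off the agents learned can be sketched as a back-of-the-envelope payoff comparison. The numbers below are entirely hypothetical – they are not from the study – and exist only to mirror the reported trend: tagging a rival costs time, so it only pays off when apples are scarce enough to be contested.

```python
# Toy illustration (not DeepMind's code) of when "tag" beats "gather".
# respawn_rate: fraction of cells holding an apple (abundance).
# pick_cap: most apples one agent can collect per step.
# aim_cost: apples forgone while aiming the laser instead of gathering.
# All three parameters are made-up for illustration.

def expected_apples(policy, respawn_rate, pick_cap=0.3, aim_cost=0.05):
    """Rough expected apples collected per step under a fixed policy."""
    if policy == "gather":
        # Both agents gather peacefully, splitting whatever respawns,
        # each limited by how fast it can pick.
        return min(respawn_rate / 2, pick_cap)
    if policy == "tag":
        # Remove the rival briefly and collect alone, minus the aim cost.
        return min(respawn_rate, pick_cap) - aim_cost
    raise ValueError(policy)

# Abundant apples: gathering wins, since the agent is already saturated.
assert expected_apples("gather", 0.8) > expected_apples("tag", 0.8)
# Scarce apples: monopolizing the field outweighs the cost of aiming.
assert expected_apples("tag", 0.15) > expected_apples("gather", 0.15)
```

With abundant apples, removing the rival gains nothing because the agent is already collecting as fast as it can; under scarcity, every contested apple makes aggression worthwhile – the same qualitative pattern the study reports.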
The study, however, is more complex than one of mere aggressiveness: It’s an insight into the evolution of cooperation.
Take the study's second scenario, which featured a mobile form of prey that Red and Blue had to capture. Again, they could work cooperatively or competitively.
“Two cooperative policies emerged from these experiments,” the study notes. The agents either met up first and chased down their target together, or one found the prey first and waited for the other to catch up. In this case, the more complex AIs learned to cooperate faster.
Both games are versions of the Prisoner’s Dilemma, a classic game of strategy that demonstrates why, even when it’s best for two players to cooperate, they sometimes choose not to. The general moral of the game design is that two individuals acting in self-interest do not always produce a favorable outcome for either of them – and this study investigates this notion using AI, not people.
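The dilemma's logic can be made concrete with the standard textbook payoff matrix (these values are the conventional ones, not figures from the study). Each entry gives (row player's payoff, column player's payoff):

```python
# Classic Prisoner's Dilemma payoffs: temptation 5, reward 3,
# punishment 1, sucker's payoff 0 (standard textbook values).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def best_response(opponent_move):
    """The move that maximizes my own payoff given the opponent's move."""
    return max(("cooperate", "defect"),
               key=lambda me: PAYOFFS[(me, opponent_move)][0])

# Whatever the opponent does, defecting pays more individually...
assert best_response("cooperate") == "defect"
assert best_response("defect") == "defect"

# ...yet mutual defection (1 + 1) totals less than mutual
# cooperation (3 + 3): self-interest leaves both players worse off.
```

Defection is each player's best response to anything the other does, so two purely self-interested players end up at (1, 1) instead of the (3, 3) they could have shared – exactly the tension the researchers probed with their learning agents.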
They found that real-life situations resembling the first game, “Gathering”, require less coordination, so cooperation is easier to learn. “Wolfpack” scenarios, typified by the second game, require more coordination, and that cooperation is harder to learn.
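The Wolfpack incentive structure can be sketched the same way. In the study's setup, every wolf close to the prey at the moment of capture shares in the reward; the hypothetical numbers below simply encode the idea that a joint capture can pay each wolf more than hunting alone, which is what makes the harder-to-learn coordination worth it.

```python
# Toy per-wolf payoff for a Wolfpack-style capture (hypothetical numbers,
# not the paper's actual reward scheme).

def capture_payoff(wolves_nearby, lone_reward=1.0, team_bonus=1.5):
    """Per-wolf reward when `wolves_nearby` wolves are in capture radius."""
    if wolves_nearby <= 1:
        # A lone capture earns only the base reward.
        return lone_reward
    # A joint capture yields a larger prize, split evenly among the pack;
    # cooperation pays only if the bonus outweighs the split.
    return (lone_reward + team_bonus * wolves_nearby) / wolves_nearby

# Hunting as a pair beats hunting alone under these numbers.
assert capture_payoff(2) > capture_payoff(1)
```

Unlike Gathering, where an agent can do well ignoring its rival, here the bigger payoff only exists if both agents arrive together – which is why the study found this kind of cooperation takes longer to learn.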
Red wolves chase the Blue prey.
Success, however, isn’t determined by ease of strategy alone. This in-house study clearly shows that environmental conditions – scarcity and type of resources, as well as perceived risk – trigger different cooperative or defection-based responses.
The point is that, in the future, when an AI is tasked with something that affects the real world, you can’t just assume it’ll be cooperative or competitive with other systems, even if you give it a simple objective.