Artificial Intelligence Finally Learns To Beat Classic 1980s Video Games

AI is now a gamesmaster. Image: Atmosphere1/Shutterstock.com

An artificial intelligence (AI) system has succeeded in mastering classic video games from the 1980s, including iconic Atari titles such as Montezuma’s Revenge, Pitfall, and Freeway. According to its creators, the algorithms upon which the AI is based could one day be used to help robots navigate real-world environments such as disaster zones.

Like disaster zones, many "hard-exploration" games present a series of obstacles that must be avoided and paths that must be navigated in order to reach a destination or goal. Previous attempts to create an AI capable of solving such games have failed, due to the complexities of free exploration.

For instance, many AIs use reinforcement learning – which involves rewarding successful actions – in order to complete a task. The problem with this approach is that rewards tend to be very sparse, making it difficult for a system to achieve its objective.

Suppose a robot must perform a series of complex actions to reach a specified location and is rewarded only upon arrival. It then receives no feedback on the many individual steps it must take along the way. Researchers can offer more "dense" rewards – such as rewarding each step a robot takes in the right direction – but this may cause it to make a beeline for its goal and fail to avoid any hazards in the way.
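The difference between the two reward schemes can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the grid, the `GOAL` and `HAZARD` positions, and the function names do not come from the researchers' work. It shows a sparse reward that stays at zero until the goal is reached, and a greedy agent that follows a dense "getting closer" reward straight into a hazard.

```python
# Toy grid world. Positions are (row, column); all names are hypothetical.
GOAL = (0, 4)
HAZARD = (0, 2)  # sits directly between the start and the goal
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def manhattan(a, b):
    """Grid distance between two positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sparse_reward(pos):
    """Reward only on arrival: no feedback anywhere else."""
    return 1.0 if pos == GOAL else 0.0


def dense_reward(old, new):
    """Positive when a step reduces the distance to the goal."""
    return float(manhattan(old, GOAL) - manhattan(new, GOAL))


def greedy_path(start, max_steps=10):
    """Always take the step with the highest dense reward.

    The walk stops at the goal, at the hazard, or after max_steps.
    """
    pos = start
    path = [pos]
    for _ in range(max_steps):
        if pos in (GOAL, HAZARD):
            break
        here = pos
        candidates = [(here[0] + dr, here[1] + dc) for dr, dc in MOVES]
        pos = max(candidates, key=lambda p: dense_reward(here, p))
        path.append(pos)
    return path


if __name__ == "__main__":
    print(sparse_reward((0, 1)))   # no signal one step from the start
    print(greedy_path((0, 0)))     # the beeline ends on the hazard
```

In this toy setup the dense reward never tells the agent that the shortest route is blocked, which is the failure mode described above.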

The only way to solve this is to create an AI that can actively explore its environment. Writing in the journal Nature, however, the creators of this new AI explain that “two major issues have hindered the ability of previous algorithms to explore.”

The first of these is known as detachment, occurring when a system doesn’t keep a record of areas it has neglected to explore. For instance, when a robot reaches a fork in the road, it must choose one path and discard the other. Detachment refers to the inability of a system to later recall that there was an alternative path that might still be worth exploring.

Even if an AI could remember such missed opportunities, it would still run into a problem called derailment, whereby it continually becomes side-tracked by its own impulse to keep exploring. Rather than heading straight back to that promising fork in the road, it investigates each side-street that it encounters on the way, and therefore never actually makes it back to the fork.

To overcome both of these issues, the researchers created a “family of algorithms” which they have called Go-Explore. In a nutshell, this system works by continually archiving every state it encounters, thereby allowing it to remember the paths it chose to discard at each point in the video game. It is then able to return immediately to any one of these promising saved states and resume exploring from there, thus overcoming both detachment and derailment.
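The archive-then-return loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the researchers' implementation: the maze, the `step`, `replay` and `go_explore` functions, and the rule of picking the least-visited cell are all hypothetical. Returning to a saved state is done here by replaying stored moves, which only works because the toy world is deterministic, and exploration simply tries each move once so that the run is reproducible.

```python
# Toy maze: '#' is a wall, 'S' the start, 'G' the goal. Entirely hypothetical.
MAZE = [
    "S.#.",
    ".##.",
    "...G",
]
START = (0, 0)
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}


def step(cell, move):
    """Deterministic transition: a blocked move leaves the cell unchanged."""
    r = cell[0] + MOVES[move][0]
    c = cell[1] + MOVES[move][1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return cell


def replay(path, start=START):
    """'Go' phase: get back to a saved state by replaying its stored moves."""
    cell = start
    for move in path:
        cell = step(cell, move)
    return cell


def go_explore(iterations=50):
    archive = {START: ""}  # cell -> shortest known string of moves to reach it
    visits = {START: 0}    # how many times each cell has been selected
    for _ in range(iterations):
        # Select: prefer archived cells that have been returned to least often,
        # so no remembered branch is forgotten (addresses detachment).
        cell = min(archive, key=lambda k: visits[k])
        visits[cell] += 1
        # Go: return directly to that cell before exploring any further
        # (addresses derailment).
        assert replay(archive[cell]) == cell
        # Explore: try each move once and archive anything new or shorter.
        for move in MOVES:
            new = step(cell, move)
            path = archive[cell] + move
            if new not in archive or len(path) < len(archive[new]):
                archive[new] = path
                visits.setdefault(new, 0)
    return archive


if __name__ == "__main__":
    found = go_explore()
    print(found[(2, 3)])  # stored route from the start to the goal cell
```

The dead-end branch to the right of the start stays in the archive even after the agent moves on, so it can always be revisited later rather than being lost.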

As a consequence, Go-Explore was able to surpass the average human score on Pitfall, a game in which previous algorithms failed to score any points. It also achieved a score of 1.7 million on Montezuma’s Revenge, smashing the puny human world record of 1.2 million points.
