August 12, 2017 05:19 GMT by dailymail.co.uk

Google's DeepMind to train AI to beat StarCraft II

The research lab teamed up with video game company Blizzard to open StarCraft II as an AI research environment the firms hope will give insight into the most complex problems related to AI.

Google's DeepMind AI has mastered Atari arcade classics and beaten human world champions at board games, and now it's set to take on a much bigger challenge - StarCraft II.

The research lab has teamed up with video game company Blizzard Entertainment to open StarCraft II as an AI research environment the firms hope will give insight into the most complex problems related to artificial intelligence.

Together, they are releasing a set of tools to accelerate AI research in the strategy game, in the hope that their algorithms can eventually beat it.

Google's DeepMind research lab has teamed up with video game company Blizzard Entertainment to open StarCraft II as an AI research environment the firms hope will give insight into the most complex problems related to artificial intelligence

WHY STARCRAFT WORKS PERFECTLY FOR AI TRAINING

DeepMind has tackled games like Atari Breakout, but StarCraft II presents new challenges because it contains multiple layers and sub-goals.

Players must accomplish smaller goals along the way, such as gathering resources or building structures. 

Also increasing complexity is the fact that the map is not fully shown at all times, meaning players must use memory and planning. 

Often, actions taken early in the game may not pay off for a long time. 

While Atari games have about 10 basic actions, StarCraft has over 300.

These actions are also hierarchical and can be modified and augmented.

It also has a massive pool of talented players and a lot of data for an AI to learn from.

The game can also be broken down into 'mini-games,' enabling the AI agents to test techniques on smaller scales in order to train for the full game. 

'Testing our agents in games that are not specifically designed for AI research, and where humans play well, is crucial to benchmark agent performance,' reads the DeepMind announcement.

'That is why we, along with our partner Blizzard Entertainment, are excited to announce the release of SC2LE, a set of tools that we hope will accelerate AI research in the real-time strategy game StarCraft II.'

The hope is that training machines to play the game will help to develop more advanced AI algorithms capable of learning, reasoning, remembering and adapting complex strategies to win.

Google’s DeepMind has already taught AI agents to play a range of video games, with the machines learning the rules as they go, or mastering ancient board games.

In tests with Atari arcade classic Breakout, its AI algorithm learned to play like a pro in just two hours and developed the most efficient strategy to beat the game after just four hours of play.

Similarly, DeepMind’s AlphaGo agent learned strategies for playing the ancient Chinese board game Go – beating human champion Lee Sedol in a man vs machine challenge this year.

While Atari games have about 10 basic actions, StarCraft has more than 300. As shown above, actions available to both humans and agents depend on the units selected

THE SC2LE RELEASE

DeepMind and Blizzard teamed up to release a set of tools to accelerate AI research in the strategy game.

The set of tools, called SC2LE, includes:

 - An open source version of DeepMind's PySC2 toolset that allows other researchers to train their AI agents to play StarCraft

 - A series of simple 'mini-games' that break down the game into smaller pieces for easier learning

 - A dataset of game replays, which will increase from 65k to more than half a million in the coming weeks

 - A joint paper outlining the progress so far and the rest of the toolset

While DeepMind has tackled Atari games such as Breakout and the board game Go, StarCraft II presents new challenges because it contains multiple layers and sub-goals.

While all the aforementioned games have a main objective of beating the opponent, StarCraft also requires players to accomplish smaller goals along the way, such as gathering resources or building structures.

Also increasing complexity is the fact that the map is not fully shown at all times, meaning players must use memory and planning.  

Additionally, the game length can vary from minutes to an hour, meaning actions taken early in the game may not pay off for a long time.

'Part of StarCraft’s longevity is down to the rich, multi-layered gameplay, which also makes it an ideal environment for AI research,' says DeepMind, noting the game's first and second iterations are among the most successful games of all time, with players competing in tournaments for more than 20 years.

'Even StarCraft’s action space presents a challenge with a choice of more than 300 basic actions that can be taken. Contrast this with Atari games, which only have about 10 (e.g. up, down, left, right),' says DeepMind.

Additionally, the actions in StarCraft are hierarchical and can be modified and augmented, with many of them requiring a point on the screen. 

'Even assuming a small screen size of 84x84, there are roughly 100 million possible actions available,' the firm says.
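
That figure can be reproduced with a hedged back-of-envelope calculation. The assumption below is ours, not DeepMind's: that the count is dominated by actions parameterised by one or two screen coordinates.

```python
# Back-of-envelope check of the "~100 million possible actions" figure.
# Assumption (ours, not stated by DeepMind): the count is dominated by
# actions that take one or two screen coordinates on an 84x84 grid.
points = 84 * 84                  # 7,056 possible screen targets
single_point = 300 * points       # every base action aimed at one pixel
two_point = points ** 2           # e.g. a drag-select needs two corner pixels

print(f"{points:,} targets, {single_point:,} one-point actions, "
      f"{two_point:,} two-point actions")
```

A single two-point action family alone contributes roughly 50 million combinations, so only a handful of them is needed to reach the order of 10 to the power of 8.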

StarCraft is also an ideal game for this next step in AI research because it has a massive pool of avid players who compete online every day, ensuring there is a large quantity of replay data to learn from.

This also guarantees there are plenty of top-notch opponents for AI agents to battle. 

This release means researchers can now tackle some of these challenges using Blizzard’s own tools to build their own tasks and models. 

WHAT IS STARCRAFT? 

Set in a futuristic world in which three alien species battle for dominance across worlds

StarCraft is a popular real-time strategy game, first released in 1998.

It is set in a futuristic world in which three alien species battle for dominance across worlds.

It was first released for Windows and there have been eight official releases in the series since it began.

Gameplay involves a complex mix of skill and strategy, as players mine resources to pay for structures and military units as they explore an unknown map.

Players need to balance available resources with aggressive or defensive strategies, while adapting to what other players are doing.

The hope is that training machines to play the game will help to develop more advanced AI algorithms capable of learning, remembering and adapting complex strategies to win.

The environment wrapper, for example, offers a flexible and easy-to-use interface through which reinforcement learning (RL) agents can play the game.

'In this initial release, we break the game down into 'feature layers', where elements of the game such as unit type, health and map visibility are isolated from each other, whilst preserving the core visual and spatial elements of the game,' DeepMind says.
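
The feature-layer idea can be sketched in plain NumPy: each property of the game gets its own 2D plane, and the agent observes the stack as a single tensor. The layer names and tiny map size below are illustrative guesses at the concept, not the actual PySC2 layer list.

```python
import numpy as np

# Illustrative only: a (tiny) 4x4 screen as stacked feature layers, one
# plane per isolated property, as DeepMind describes. Names are our own.
H = W = 4
unit_type  = np.zeros((H, W), dtype=np.int32)   # which unit occupies each cell
health     = np.zeros((H, W), dtype=np.int32)   # hit points of that unit
visibility = np.zeros((H, W), dtype=np.int32)   # 0 = fog of war, 1 = visible

unit_type[1, 2] = 45      # e.g. a unit at (1, 2); the ID is arbitrary here
health[1, 2] = 40
visibility[:2, :] = 1     # only the top half of the map has been scouted

# An agent sees the whole stack as one (num_layers, H, W) tensor.
observation = np.stack([unit_type, health, visibility])
print(observation.shape)
```

Isolating each property into its own plane is what lets an agent reason about, say, visibility independently of unit health, while the spatial layout of the map is preserved in every plane.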

To make things even easier, the release contains a series of 'mini-games' that break down the game into manageable chunks for testing specific tasks.

This could be used to work on moves such as collecting minerals or moving the camera. 

'We hope that researchers can test their techniques on these as well as propose new mini-games for other researchers to compete and evaluate on,' says DeepMind. 
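
The mini-game idea can be pictured with a toy stand-in environment. This is not the PySC2 API (running that requires a StarCraft II install); the class and method names below are invented purely to illustrate the observe-act-reward loop such environments expose.

```python
class ToyCollectMineralsEnv:
    """Toy stand-in for a 'collect minerals' mini-game (illustrative only,
    not PySC2). Reward is +1 per mineral; the episode ends when all minerals
    are collected or the step budget runs out."""

    def __init__(self, minerals=5, max_steps=50):
        self.minerals, self.max_steps = minerals, max_steps

    def reset(self):
        self.collected, self.steps = 0, 0
        return {"minerals_left": self.minerals}

    def step(self, action):
        self.steps += 1
        reward = 1 if (action == "harvest" and self.collected < self.minerals) else 0
        self.collected += reward
        done = self.collected == self.minerals or self.steps >= self.max_steps
        return {"minerals_left": self.minerals - self.collected}, reward, done


def run_episode(env, policy=lambda obs: "harvest"):
    """Standard agent loop: observe, act, accumulate reward until done."""
    obs, total, done = env.reset(), 0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total


print(run_episode(ToyCollectMineralsEnv()))
```

The point of mini-games is exactly this kind of narrow, scored task: an agent's policy can be evaluated on one skill at a time before facing the full game.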

A simple mini-game that will allow researchers to test the performance of agents on specific tasks

So far, the AI agents have had success with the mini-games, but the same can't be said for the game in its entirety.  

'Our initial investigations show that our agents perform well on these mini-games, but when it comes to the full game, even strong baseline agents, such as A3C, cannot win a single game against even the easiest built-in AI.'

Below, the video shows an agent in an early training stage (left) failing to keep its workers mining - a task human players would find trivial. 

'After training (right), the agents perform more meaningful actions, but if they are to be competitive, we will need further breakthroughs in deep RL and related areas,' says DeepMind. 

The firm has had success with 'imitation learning' and says, 'this kind of training will soon be far easier thanks to Blizzard, which has committed to ongoing releases of hundreds of thousands of anonymized replays gathered from the StarCraft II ladder.'

'These will not only allow researchers to train supervised agents to play the game, but also opens up other interesting areas of research such as sequence prediction and long-term memory.'
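
Training supervised agents from replays can be pictured with a deliberately tiny sketch. The states, actions and majority-vote "policy" below are invented for illustration and bear no relation to the actual training pipeline: the idea is simply that logged (state, action) pairs from human games become supervised training data.

```python
from collections import Counter, defaultdict

# Toy imitation-learning sketch (our illustration, not DeepMind's method):
# given (state, action) pairs logged from human replays, the simplest
# possible "agent" just imitates the most common human action per state.
replays = [
    ("low_minerals", "harvest"), ("low_minerals", "harvest"),
    ("low_minerals", "build_worker"),
    ("enemy_sighted", "attack"), ("enemy_sighted", "retreat"),
    ("enemy_sighted", "attack"),
]

counts = defaultdict(Counter)
for state, action in replays:
    counts[state][action] += 1

# Majority vote per state stands in for a learned classifier.
policy = {state: c.most_common(1)[0][0] for state, c in counts.items()}
print(policy)
```

A real system would replace the lookup table with a learned model over the feature-layer observations, but the supervision signal, human actions paired with game states, is the same.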

DeepMind and video game firm Blizzard Entertainment first teamed up in November 2016 to turn the game into a learning environment for AI.

Announced at BlizzCon 2016 in Anaheim, California, the partnership has focused on opening up the wildly popular real-time strategy game StarCraft 2 as a sandbox for teaching and testing machines. 

Google’s DeepMind has teamed up with games-maker Blizzard Entertainment to turn one of its hit video games into a learning environment for AI. The popular real-time strategy game StarCraft 2 (still pictured) will be used to teach and test machine agents

The approach is intended to bring benefits far beyond the games industry, enabling researchers to build and test smarter AI algorithms that could transfer to the real world. 

In a blog post, the firm explained: ‘DeepMind is on a scientific mission to push the boundaries of AI, developing programs that can learn to solve any complex problem without needing to be told how.’

‘Games are the perfect environment in which to do this, allowing us to develop and test smarter, more flexible AI algorithms quickly and efficiently, and also providing instant feedback on how we’re doing through scores.’

StarCraft 2 is set in a futuristic universe in which multiple alien races fight for dominance.

Gameplay involves a complex mix of skill and strategy, as players mine resources to pay for structures and military units as they explore an unknown map.

THE HISTORY OF GO - AND HOW TO PLAY 

The game of Go originated in China more than 2,500 years ago.

Confucius wrote about the game, and it is considered one of the four essential arts required of any true Chinese scholar.

Played by more than 40 million people worldwide, the rules of the game are simple.

Players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory.

The game is played primarily through intuition and feel and because of its beauty, subtlety and intellectual depth, it has captured the human imagination for centuries.

But as simple as the rules are, Go is a game of profound complexity.

There are more than 10 to the power of 170 possible positions - that's more than the number of atoms in the universe, and more than a googol (10 to the power of 100) times larger than chess.
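
These scale claims can be sanity-checked with a short calculation, under the standard assumptions that 3 to the power of 361 (each of the 361 board points is empty, black or white) upper-bounds the number of Go positions, and that chess is commonly estimated at around 10 to the power of 47 positions.

```python
from math import log10

# Upper bound on Go positions: 3 choices for each of 19x19 = 361 points.
go_upper_exponent = 361 * log10(3)        # log10 of 3**361
digits = len(str(3 ** 361))               # decimal digits in that bound
orders_over_chess = go_upper_exponent - 47  # vs a ~10^47 chess estimate

print(f"~10^{go_upper_exponent:.0f} placements ({digits} digits), "
      f"about 10^{orders_over_chess:.0f} times the chess estimate")
```

The gap to chess comes out well over 100 orders of magnitude, consistent with the "more than a googol times larger" comparison.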

This complexity is what makes Go hard for computers to play and therefore an irresistible challenge to artificial intelligence researchers, who use games as a testing ground to invent smart, flexible algorithms that can tackle problems, sometimes in ways similar to humans.

While arcade classics and turn-based board games are impressive, the StarCraft 2 universe will provide a new set of challenges for AI.

AI agents will need to respond to the actions of other players as well as work out where they are on an obfuscated map – which will mean scouts must be sent out to chart the way ahead.

The machines will also need to learn how to best spend their resources on units and structures, which expert human players learn from hours of experience and head to head battles in different scenarios.

In effect, the machines will need to reason and weigh up available options in order to complete the task.  

A statement from DeepMind said: ‘StarCraft is an interesting testing environment for current AI research because it provides a useful bridge to the messiness of the real-world.

‘The skills required for an agent to progress through the environment and play StarCraft well could ultimately transfer to real-world tasks.’ 

DeepMind has already worked with Blizzard to create an interface for AI to control gameplay, as well as a computer view of the map – transforming the complicated terrain into a simple pixelated colour view.

However, it states that AI could take a long time to catch up with human players.

‘While we’re still a long way from being able to challenge a professional human player at the game of StarCraft II, we hope that the work we have done with Blizzard will serve as a useful testing platform for the wider AI research community.’ 

 

In tests with Atari arcade classic Breakout, DeepMind's AI algorithm learned to play like a pro in just two hours and developed the most efficient strategy to beat the game after just four hours of play (still pictured)
