DeepMind and Blizzard release new tools to train AI using Starcraft

Teaching computers to play games has always been a useful (if somewhat crude) measure of their intelligence. But as our machines have gotten smarter, we’ve had to find new challenges for them. First it was chess, then Atari, then the board game Go, and now they’re taking on their biggest challenge yet: Starcraft.

To be precise, Starcraft II, which researchers at Google’s AI subsidiary DeepMind say is the perfect environment for teaching computers advanced skills like memory and planning. Last year, DeepMind said it was going to work with Starcraft creator Blizzard to turn the space-based strategy game into a proper research environment for AI engineers, and today, that software is being released to the public.

The toolkit from DeepMind and Blizzard bundles in various aids, including a large dataset of Starcraft II replays collected from professional matches (which AI can watch to learn human tactics) and a set of mini-games that isolate certain gameplay elements (like map exploration and resource collection) and can be used to hone particular skills. The most important bit of kit, though, is an API that lets AI agents play the game like a human would and feed back data to researchers. This means that the agents can be given the same constraints as humans (so they can’t see all of the map at once and can’t click the mouse infinitely fast) while learning through trial and error, a process known as “reinforcement learning” in AI.
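To make "learning through trial and error" concrete, here is a minimal sketch of tabular Q-learning, one common reinforcement learning method. It does not use the DeepMind and Blizzard API. The corridor mini-game, the reward values, and the names `step`, `train`, `greedy_policy` and `MAX_ACTIONS` are all assumptions made up for illustration, with the per-episode action cap standing in loosely for the limit on how fast an agent may click.

```python
import random

# Hypothetical toy mini-game (NOT the DeepMind/Blizzard API): a corridor of
# N_CELLS cells with a resource in the last cell. The agent starts in cell 0.
N_CELLS = 6
ACTIONS = (-1, +1)   # step left, step right
MAX_ACTIONS = 20     # per-episode action cap, a loose stand-in for a click-rate limit


def step(pos, action):
    """Apply an action and return (new_pos, reward, done)."""
    new_pos = min(max(pos + action, 0), N_CELLS - 1)
    done = new_pos == N_CELLS - 1
    # Small penalty per move, larger reward for reaching the resource.
    reward = 1.0 if done else -0.01
    return new_pos, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by repeatedly playing the mini-game."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]  # q[cell][action index]
    for _ in range(episodes):
        pos = 0
        for _ in range(MAX_ACTIONS):
            # Trial: mostly take the best-known action, sometimes a random one.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[pos][i])
            new_pos, reward, done = step(pos, ACTIONS[a])
            # Error: nudge the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[new_pos]))
            q[pos][a] += alpha * (target - q[pos][a])
            pos = new_pos
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for each cell."""
    return [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q]
```

Calling `greedy_policy(train())` returns the action the agent has come to prefer in each cell. Nobody tells it that the resource lies to the right; it works that out from the rewards alone.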

But why is Starcraft such a good way to train artificial intelligence? It’s not because we want computers to learn military tactics, but because we need to teach them certain abstract skills, and video games happen to be a good way of doing so. Video games are virtual environments, which means games can quickly be repeated over and over; there’s lots of training data available, helpfully generated by humans playing the game; and Starcraft itself has a number of gameplay mechanics that are particularly challenging for computers.

Oriol Vinyals, a researcher at DeepMind who’s working on the project (and who happens to be a former top-ranked Starcraft player himself), explains that one of the interesting constraints the game offers is the “fog of war” mechanic, which covers up the map and forces players to explore to find out what their enemy is up to. “So it might be critical for an AI agent to remember ‘Ah, I saw a unit over there before, but I don’t see it now, so I should go back and scout and see if they have a base near that location,’” Vinyals tells The Verge.

To a human, this is such an obvious idea that it’s barely worth thinking about, but it’s the sort of common-sense insight that AI needs to learn in order to be useful. In Starcraft, thinking about what a player can’t see is essential to winning, and it’s a challenge that doesn’t exist in games like chess or Go, where both players have complete knowledge of their environment at all times.
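The "I saw a unit over there before" reasoning can be sketched in a few lines of code. This is only an illustration of the idea, not how DeepMind’s agents work: a learning agent would have to acquire this kind of memory itself rather than have it written by hand. The `ScoutMemory` class, its methods, and the unit names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScoutMemory:
    """Hypothetical helper: remembers where enemy units were last seen."""

    # unit id -> (position, time step of the last sighting)
    last_seen: dict = field(default_factory=dict)

    def observe(self, visible_units, t):
        """Record every enemy unit that is visible at time step t."""
        for unit_id, pos in visible_units.items():
            self.last_seen[unit_id] = (pos, t)

    def scout_targets(self, visible_units):
        """Positions worth revisiting: where now-hidden units were last seen."""
        return [
            pos
            for unit_id, (pos, _) in self.last_seen.items()
            if unit_id not in visible_units
        ]


memory = ScoutMemory()
memory.observe({"enemy_worker": (3, 4)}, t=0)  # a unit is spotted
memory.observe({}, t=1)                        # the fog closes in again
print(memory.scout_targets({}))                # the remembered position
```

In chess or Go a structure like this would be pointless, because the whole board is always in view.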

Vinyals says that this sort of memory skill can then be applied in all sorts of environments, and gives the example of a computer managing power in a data center to decrease electricity costs. “It might see that on a Sunday there’s a power spike for whatever reason, and it will have to remember this information next Sunday to account for it,” he says. “Memory plays a key role here, and teaching computers to infer what the state of the world might be is super interesting for us.”

As well as teaching AI certain skills, the newly released API sets the stage for a human vs. computer Starcraft showdown. Neither Blizzard nor DeepMind has said it plans to stage matches similar to those AlphaGo played against human champions, but Starcraft II’s finest players are certainly keen. Speaking to MIT Technology Review earlier this year, pro Starcraft player Byun Hyun Woo was fairly confident about his chances. “I don’t think AI can beat [a professional player], at least not in my lifetime,” he said.

The problem is that artificial intelligence has a way of surprising humans, as when DeepMind’s AlphaGo AI made moves that commentators thought nonsensical during its matches with Go master Lee Sedol (but that later turned out to be crucial to its success).

So will DeepMind’s AI surprise Starcraft players? Vinyals says it’s already happening, and gives the example of an agent that was tasked with exploring a section of a map as quickly as possible using just two units. Usually, says Vinyals, a human player would select the units and use the “move” command to cover the ground as quickly as possible. “But it turns out that instead of using ‘move’ you can use another command called ‘patrol,’” he explains. Unlike ‘move,’ this forces the units to keep their distance from one another “and in that way they covered more of the map and collected resources faster.”

It’s not a breakthrough, but it shows how computers can get the upper hand just by taking new approaches to familiar problems. “I thought it was funny,” says Vinyals. “I just didn’t remember — or maybe didn’t know — this behavior.” It’s likely there’ll be more surprises to come.