Google unveils AI that learns on its own



PNS: Google’s artificial intelligence group, DeepMind, has unveiled the latest incarnation of its Go-playing program, AlphaGo – an AI so powerful that it derived thousands of years of human knowledge of the game before inventing better moves of its own, all in the space of three days.

Named AlphaGo Zero, the AI program has been hailed as a major advance because it mastered the ancient Chinese board game from scratch, with no human help beyond being told the rules. In games against the 2015 version, which famously beat Lee Sedol, the South Korean grandmaster, the following year, AlphaGo Zero won 100 to 0.

The feat marks a milestone on the road to general-purpose AIs that can do more than thrash humans at board games. Because AlphaGo Zero learns on its own from a blank slate, its talents can now be turned to a host of real-world problems.

At DeepMind, which is based in London, AlphaGo Zero is working out how proteins fold, a massive scientific challenge that could give drug discovery a sorely needed shot in the arm.

“For us, AlphaGo wasn’t just about winning the game of Go,” said Demis Hassabis, CEO of DeepMind and a researcher on the team. “It was also a big step for us towards building these general-purpose algorithms.”

Most AIs are described as “narrow” because they perform only a single task, such as translating languages or recognising faces, but general-purpose AIs could potentially outperform humans at many different tasks.

Hassabis believes that in the next decade AlphaGo’s descendants will work alongside humans as scientific and medical experts.

Previous versions of AlphaGo learned their moves by training on thousands of games played by strong human amateurs and professionals. AlphaGo Zero had no such help. Instead, it learned purely by playing itself millions of times over. It began by placing stones on the Go board at random but swiftly improved as it discovered winning strategies.

“It’s more powerful than previous approaches because by not using human data, or human expertise in any fashion, we’ve removed the constraints of human knowledge and it is able to create knowledge itself,” said David Silver, AlphaGo’s lead researcher.

The program amasses its skill through a procedure called reinforcement learning. It is the same method by which balance on the one hand, and scuffed knees on the other, help humans master the art of bike riding. When AlphaGo Zero plays a good move, it is more likely to be rewarded with a win. When it makes a bad move, it edges closer to a loss.

At the heart of the program is a group of software “neurons” that are connected together to form an artificial neural network. For each turn of the game, the network looks at the positions of the pieces on the Go board and calculates which moves might be made next and the probability of each leading to a win.
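For readers who want to see what "calculating which moves might be made next and the probability of a win" can look like in code, here is a minimal sketch in Python. It is not DeepMind's network: it is a single untrained layer on a hypothetical 3x3 board, and the names `evaluate_position`, `weights_policy` and `weights_value` are invented for illustration. It shows only the shape of the two outputs, a probability for each legal move and a single win estimate.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def evaluate_position(board, weights_policy, weights_value):
    """Return (move_probs, win_prob) for a flat board.

    board: list of numbers, +1 own stone, -1 opponent stone, 0 empty.
    weights_policy: one row of weights per board point (hypothetical).
    weights_value: one weight per board point (hypothetical).
    Assumes at least one point on the board is still empty.
    """
    # One raw score per board point, from a single linear layer.
    scores = [sum(w * x for w, x in zip(row, board)) for row in weights_policy]
    # Occupied points are not legal moves, so give them zero probability.
    scores = [s if cell == 0 else float("-inf") for s, cell in zip(scores, board)]
    move_probs = softmax(scores)
    # Squash a second linear score into (-1, 1), then rescale to a 0..1 win estimate.
    value = math.tanh(sum(w * x for w, x in zip(weights_value, board)))
    win_prob = (value + 1) / 2
    return move_probs, win_prob
```

With random weights the outputs are meaningless; the point of training is to adjust those weights until the move probabilities and win estimate become useful.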

After each game, it updates its neural network, making it a stronger player for the next bout. Though far better than previous versions, AlphaGo Zero is a simpler program and mastered the game faster despite training on less data and running on a smaller computer. Given more time, it could have learned the rules for itself too, Silver said.
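The loop described above, play yourself, see who won, adjust, play again, can be sketched on a far smaller game. The Python example below is an illustration only and makes several assumptions: it swaps Go for a toy game of Nim (10 stones, take one to three per turn, whoever takes the last stone wins), and it swaps the neural network for a simple lookup table of estimated win rates. All function names and parameter values are hypothetical, and this is not AlphaGo Zero's actual algorithm.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """A player may take 1, 2 or 3 stones, but not more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def best_move(q, stones):
    """The move with the highest estimated win rate so far."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


def choose_move(q, stones, epsilon, rng):
    """Mostly play the best known move, but sometimes try a random one."""
    if rng.random() < epsilon:
        return rng.choice(legal_moves(stones))
    return best_move(q, stones)


def self_play_game(q, start, epsilon, rng):
    """Both sides use the same table. Returns the move history and the winner."""
    stones, player, history = start, 0, []
    while stones > 0:
        move = choose_move(q, stones, epsilon, rng)
        history.append((player, stones, move))
        stones -= move
        winner = player  # overwritten each turn; the last mover took the final stone
        player = 1 - player
    return history, winner


def train(games=20000, start=10, epsilon=0.2, step=0.1, seed=0):
    """Learn from self-play alone, starting with no knowledge beyond the rules."""
    rng = random.Random(seed)
    q = defaultdict(lambda: 0.5)  # every move starts as a coin flip
    for _ in range(games):
        history, winner = self_play_game(q, start, epsilon, rng)
        for player, stones, move in history:
            reward = 1.0 if player == winner else 0.0
            # Nudge the estimate towards the result of this game.
            q[(stones, move)] += step * (reward - q[(stones, move)])
    return q
```

Calling `q = train()` and then `best_move(q, 3)` should favour taking all three remaining stones, a move the table was never told about and found only by playing itself.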

PNS/Tamanna Khan
