How to Build an Unbeatable Poker-Playing Robot Source: Adam Kucharski
Each summer, the computer-science researchers behind the world’s best poker-playing robots bring their creations together for a tournament. In recent years, three competitors have dominated: First is a group from the University of Alberta, which currently has about a dozen people working on poker programs. Next is a team from Carnegie Mellon University and their champion bot “Tartanian.” And finally, there is Eric Jackson, an independent researcher who has created a program named “Slumbot.”
The tournament consists of several different competitions, with teams tailoring their bots’ personalities to each one. Some competitions are knockouts, meaning two bots go head-to-head in each round, and the one with the smallest pile of chips at the end gets eliminated. To win these competitions, bots need strong survival instincts. They need to win only enough to get through to the next round—greed, as it were, is not good. In other matches, however, the winning bot is the one that gathers the most cash overall. In these competitions, bots need to squeeze as much as they can out of their opponents; to do so, they need to go on the offensive and find ways to take advantage of the others.
Most of the bots in the tournament have spent years in development, training over millions if not billions of games. Yet there are no big prize pots waiting for the winners. The creators might get bragging rights, but they won’t leave with Vegas-sized rewards. So what’s the lure of working on these programs?
Whenever a computer plays poker, it is solving a problem that’s familiar to all of us: how to deal with missing information.
In some games, like chess, information is not an issue. Players can see everything. They know where the pieces are and what moves their opponent has made. Luck creeps into the game not because players can’t observe events, but because they are unable to process all the available information. That is why there is always a chance (albeit a tiny one) that a grandmaster could lose to a monkey picking random moves.
With a good game-playing algorithm—and a lot of computer power—it’s possible to get around that information-processing problem. That’s how the computer scientist Jonathan Schaeffer and his colleagues at the University of Alberta found the perfect strategy for checkers, and developed their theories for how a computer might one day solve chess. It’s possible for machines to beat their opponents with pure brute force, crunching through every possible set of moves. But poker is different. No matter how good players are, each has to cope with the fact that opponents’ cards are hidden. Although the game has rules and limits, there are always unknown factors.
“Poker is a perfect microcosm of many situations we encounter in the real world.”
The same problem crops up in many aspects of life. Negotiations, auctions, bargaining—all are incomplete information games. “Poker is a perfect microcosm of many situations we encounter in the real world,” Schaeffer said.
* * *
In 2015, the Alberta researchers unveiled their unbeatable poker program—named Cepheus—in the journal Science. The paper was titled “Heads-Up Limit Hold’em Poker Is Solved.”
Cepheus was able to develop its mastery of the game after plenty of practice. To build experience, it played again and again, at a rate of about two thousand games per second. Over time, it got better at exploring the vast combination of possible moves, which meant there were fewer weak links in its strategy for opponents to target. Eventually, the bot learned how to avoid losing in the long run, even to a perfect player. Cepheus perfected its game by employing a “regret minimization” algorithm: After each game it would look back and consider what might have happened had it done things differently, and use this information to learn from its mistakes.
Cepheus has shown that, even in complex situations, it can be possible to find an optimal strategy. The researchers point to a range of other scenarios in which such algorithms could be useful, from designing coast-guard patrols to developing medical treatments.
And then, of course, there’s the less practical reason for their research. The team ended their Science paper with a quote from Alan Turing: “It would be disingenuous of us to disguise the fact that the principal motive which prompted the work was the sheer fun of the thing.”
With computer programs cleaning up at chess, checkers, and now poker, it might be tempting to argue that humans can no longer compete at such games. Turing once noted that if a man tried to pretend to be a machine, “he would clearly make a very poor showing.” Ask the human to perform a calculation, and he’d be much slower, not to mention more error prone, than the computer.
Even so, there are still some situations that bots struggle with. When playing Jeopardy!, IBM’s Watson found the short clues the most difficult. If the host read out a single category and a name—such as “first ladies” and Ronald Reagan—Watson would take too long to search through its database to find the correct response (“Who is Nancy Reagan?”). Watson would beat a human contestant in a race to solve a long, complicated clue, but the human would prevail if there were only a few words to go by. In quiz shows, it seems that brevity is the enemy of machines.
The same is true of poker. Bots need time to study their opponents, learning—and then learning to exploit—their betting styles. In contrast, human professionals are able to evaluate other players much more quickly. “Humans are good at making assumptions about an opponent with very little data,” Schaeffer said.
The University of Alberta poker group has also found that humans are particularly susceptible to strong-arm tactics. “In general, a lot of the knowledge that human poker pros have about how to beat other humans revolves around aggression,” said Michael Johanson, a computer-science researcher at the University of Alberta. “An aggressive strategy that puts a lot of pressure on opponents, making them make tough decisions, tends to be very effective.” When playing humans, the bots try to mimic this behavior and push opponents into making mistakes.
It seems, in other words, that bots have a lot to gain by copying the behavior of humans. Sometimes, it even pays to copy their flaws.
| }
|