Poker Pros Lose to Artificial Intelligence

Carnegie Mellon University is one of the most reputed universities in the world. Its researchers recently developed an artificial intelligence called Libratus. Poker players were shocked when it defeated four top poker pros in a 20-day event called “Brains Vs. Artificial Intelligence: Upping the Ante,” held at the Rivers Casino in Pittsburgh. The pros lost to Libratus at heads-up no-limit Texas Hold’em (NLHE) over a combined 120,000 hands, and Libratus finished with a collective chip lead of $1,766,250.

Dong Kim, Jimmy Chou, Daniel McAulay and Jason Les were the pros who participated in the event. They shared a $200,000 prize, split according to their respective performances. Measured in milli-big blinds per hand (mbb/hand), a standard used by imperfect-information game AI researchers, Libratus beat the humans by 147 mbb/hand, or 14.7 big blinds per 100 hands.
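The unit conversion is simple arithmetic. The minimal sketch below relates the chip lead and hand count reported above to the mbb/hand figure. The $100 big blind is an assumption, since the blind size is not stated in this article, and the function name is only illustrative.

```python
def mbb_per_hand(chips_won: float, hands: int, big_blind: float) -> float:
    """Win rate in milli-big blinds per hand."""
    return chips_won / big_blind / hands * 1000

# Chip lead and hand count are taken from the article.
CHIPS_WON = 1_766_250
HANDS = 120_000
BIG_BLIND = 100  # assumption: not stated in the article

rate = mbb_per_hand(CHIPS_WON, HANDS, BIG_BLIND)
bb_per_100 = rate / 1000 * 100  # 1,000 mbb = 1 big blind

print(f"{rate:.1f} mbb/hand = {bb_per_100:.1f} big blinds per 100 hands")
# prints "147.2 mbb/hand = 14.7 big blinds per 100 hands"
```

Under that assumption the result matches the reported figure of roughly 147 mbb/hand.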

The Pittsburgh Supercomputing Center’s Bridges computer was used to compute Libratus’ strategies. Tuomas Sandholm, professor of computer science, and Noam Brown, a Ph.D. student in computer science, are the developers of Libratus. “The techniques in Libratus do not use expert domain knowledge or human data and are not specific to poker,” Sandholm and Brown write in the paper. “Thus, they apply to a host of imperfect-information games.” They also noted that it was not a matter of luck that their AI could win at a game like NLHE, which has more than 10 raised to the power of 161 (1 followed by 161 zeroes) information sets. To give some perspective, that is more than the number of atoms in the universe.

Libratus

AI programs have defeated top human players at a number of games in the past, such as chess, Jeopardy!, checkers, and Go. All these games have an immense number of information sets, but at any given point, both players know the exact state of the game. Poker, however, is different: there is hidden information as well as bluffing. “The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans,” Sandholm said.

Frank Pfenning, head of the Computer Science Department in CMU’s School of Computer Science, said, “The computer can’t win at poker if it can’t bluff. This new milestone in artificial intelligence has implications for any realm in which information is incomplete and opponents sow misinformation. Business negotiation, military strategy, cybersecurity, and medical treatment planning could all benefit from automated decision-making using a Libratus-like AI.” Pfenning added, “Developing an AI that can do that successfully is a tremendous step forward scientifically and has numerous applications. Imagine that your smartphone will someday be able to negotiate the best price on a new car for you. That’s just the beginning.”

Libratus includes the following three main modules:

  1. Blueprint Strategy:

The first module works on a simplified version of the game, since the full game has in excess of 10 to the power of 161 information sets. It creates a detailed strategy for the early streets of the hand and a rudimentary strategy for the later streets. This is called the blueprint strategy. “There is little difference between a king-high flush and a queen-high flush. Treating those hands as identical reduces the complexity of the game and thus makes it computationally easier,” said Brown. Libratus can also group similar bet sizes.
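The grouping idea can be illustrated with a toy sketch. This is not Libratus’ actual abstraction: the bet sizes, the tier names, and both function names are hypothetical and chosen only to show how similar hands and similar bets can collapse into one bucket.

```python
# Hypothetical abstract bet sizes, expressed as fractions of the pot.
ABSTRACT_BETS = [0.5, 1.0, 2.0]

def nearest_abstract_bet(bet_fraction: float) -> float:
    """Map an observed bet to the closest size kept in the abstraction."""
    return min(ABSTRACT_BETS, key=lambda size: abs(size - bet_fraction))

def flush_bucket(high_card: str) -> str:
    """Group flushes into coarse strength tiers by their high card."""
    tiers = {"A": "nut", "K": "strong", "Q": "strong"}
    return tiers.get(high_card, "weak")

# A king-high and a queen-high flush land in the same bucket,
# so the solver only needs one strategy for both.
print(flush_bucket("K") == flush_bucket("Q"))  # True
print(nearest_abstract_bet(0.6))               # 0.5
```

Fewer buckets mean fewer distinct situations to solve, at the cost of treating slightly different situations as identical.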

  2. Detailed Computational Abstraction:

The second module builds a more detailed abstraction based on the state of the hand being played. It computes a strategy for the current subgame in real time, balancing strategies across different subgames and using the blueprint strategy for guidance.

  3. Improvement Strategy:

The third module improves the existing blueprint strategy as the competition progresses. “AIs use machine learning to find mistakes in the opponent’s strategy and exploit them. But that also opens the AI to exploitation if the opponent shifts strategy. Instead, Libratus’ self-improver module analyzes opponents’ bet sizes to detect potential holes in Libratus’ blueprint strategy. Libratus then adds these missing decision branches, computes strategies for them, and adds them to the blueprint,” said Sandholm.
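The hole-detection step Sandholm describes can be pictured with a toy sketch. It is an illustration under stated assumptions, not Libratus’ self-improver: the distance measure, the `threshold` parameter, and both function names are hypothetical, and the sketch stops at choosing which bet size to add without computing a strategy for it.

```python
def largest_gap_bet(observed_bets, abstract_bets):
    """Return the observed bet size that is farthest from any abstract size."""
    def distance(bet):
        return min(abs(bet - size) for size in abstract_bets)
    return max(observed_bets, key=distance)

def extend_abstraction(observed_bets, abstract_bets, threshold=0.2):
    """Add the worst-covered observed bet if its gap exceeds the threshold."""
    candidate = largest_gap_bet(observed_bets, abstract_bets)
    gap = min(abs(candidate - size) for size in abstract_bets)
    if gap > threshold:
        return sorted(abstract_bets + [candidate])
    return list(abstract_bets)

# Opponents often bet 1.5x the pot, which the abstraction covers poorly.
observed = [0.55, 1.5, 0.9]
print(extend_abstraction(observed, [0.5, 1.0, 2.0]))
# prints [0.5, 1.0, 1.5, 2.0]
```

The point of the sketch is that the update depends on which bet sizes opponents use, not on guessing and exploiting their mistakes.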

Approximately 600 of Bridges’ 846 compute nodes were used for Libratus. Bridges’ total speed is 1.35 petaflops, about 7,250 times as fast as a high-end laptop, and its memory is 274 terabytes, compared with 16 GB in a typical high-end laptop.
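These comparisons can be checked with quick arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes), and the variable names are illustrative only.

```python
BRIDGES_FLOPS = 1.35e15   # 1.35 petaflops
SPEEDUP = 7250            # Bridges vs. a high-end laptop
BRIDGES_MEMORY = 274e12   # 274 terabytes, in bytes
LAPTOP_MEMORY = 16e9      # 16 GB, in bytes

# Laptop speed implied by the stated speed-up, in gigaflops.
laptop_gflops = BRIDGES_FLOPS / SPEEDUP / 1e9
# How many laptops' worth of memory Bridges holds.
memory_ratio = BRIDGES_MEMORY / LAPTOP_MEMORY
# Share of Bridges' compute nodes used for Libratus.
node_share = 600 / 846

print(f"{laptop_gflops:.0f} gigaflops, {memory_ratio:.0f}x memory, "
      f"{node_share:.0%} of nodes")
# prints "186 gigaflops, 17125x memory, 71% of nodes"
```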

Lastly, Sandholm and Brown concluded: “The techniques that we developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including non-recreational applications. Due to the ubiquity of hidden information in real-world strategic interactions, we believe the paradigm introduced in Libratus will be critical to the future growth and widespread application of AI.”