Is AI the new poker champion? Everything you need to know about Libratus!
With the rapid pace of development in artificial intelligence, machines have started to outperform humans in a growing number of fields. But how does the idea of an AI poker champion sound? Ridiculous, right? Well, we now have an AI that has beaten top professionals at poker. One might think that algorithms have no place in a competition where bluffing and experience prevail over the “pure and simple” calculation of probabilities, yet the AI has learned to adapt and win. The AI that got the better of the human mind at poker is known as Libratus.
After AI beat humans at board games (chess, Go), a new artificial intelligence, Libratus, beat four of the best poker players in the world!
An AI dedicated to Poker
Unlike traditional board games, where both players can see the whole position, poker has two major difficulties:
- The concealment of information (we do not see the opponent's cards)
- The falsification of information (players bluff by pretending to hold cards they do not have)
To work in this type of environment, a dedicated artificial intelligence is therefore needed, one whose scalable algorithm can exploit the little information it has.
Libratus, from the Latin for “balanced”, took 15 million core-hours of computation on a supercomputer in the United States to train. As with any learning-based AI (especially one based on reinforcement learning), the strategy is not determined in advance. Here, the chosen method was counterfactual regret minimization (CFR). CFR can be explained in a few key points:
It is a self-play algorithm, which essentially means that the AI learns by playing against itself. What gets recorded as regret is how much more we would have gained by always playing a given action at a given decision point in all the previous games, compared with what we actually did.
For example, in rock-paper-scissors, if we play rock and our opponent plays paper, we lose. We therefore regret not having played scissors. The question behind the regret is: “across all my previous games, would I have won more by always playing scissors than by following my current strategy?” If we have played only one game, the answer is a clear yes. But if our opponent has also often played rock, then the answer could be no, and shifting toward scissors might not pay off. This is how the AI learns to maximize its winnings by avoiding losses as well as draws.
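The rock-paper-scissors example can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names (payoff, regrets, cumulative_regrets) and the scoring of +1 for a win, 0 for a draw and -1 for a loss are assumptions made here, not details taken from Libratus.

```python
ACTIONS = ("rock", "paper", "scissors")
# BEATS[x] is the action that x defeats.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def payoff(ours, theirs):
    """Assumed scoring: +1 for a win, 0 for a draw, -1 for a loss."""
    if ours == theirs:
        return 0
    return 1 if BEATS[ours] == theirs else -1


def regrets(played, opponent):
    """Regret of each action: what it would have earned minus what we earned."""
    actual = payoff(played, opponent)
    return {action: payoff(action, opponent) - actual for action in ACTIONS}


def cumulative_regrets(history):
    """Sum the per-game regrets over a list of (our action, opponent action)."""
    totals = {action: 0 for action in ACTIONS}
    for played, opponent in history:
        for action, value in regrets(played, opponent).items():
            totals[action] += value
    return totals


# One game: we play rock, the opponent plays paper.
print(regrets("rock", "paper"))
# {'rock': 0, 'paper': 1, 'scissors': 2}

# Three games: the opponent then plays rock twice.
print(cumulative_regrets([("rock", "paper"), ("rock", "rock"), ("rock", "rock")]))
# {'rock': 0, 'paper': 3, 'scissors': 0}
```

After a single game the regret for scissors is the largest, but once the opponent has also played rock twice, that regret drops back to zero while the regret for paper keeps growing.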
At the start, the strategy is random, but after each game, every decision taken is reviewed in light of its regrets:
- If the regret for an action is positive, the machine should choose that action more often than it currently does
- If it is negative, then the machine believes it has done the best thing possible and should continue as before
- The strategy is then revised to take positive regrets into account (the probability of taking an action therefore depends on how much it helps us win overall)
To put it simply, the machine does not try to win a single game but to win as much as possible over a large number of games.
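The review loop described above can be sketched with regret matching, the update rule at the core of CFR, applied to rock-paper-scissors in self-play. This is a minimal illustration and not Libratus's actual implementation: the names (strategy_from_regrets, action_values, train), the use of expected payoffs instead of sampled games, and the initial regret that nudges the first player toward rock are all assumptions made for this example.

```python
# PAYOFF[a][b]: payoff to a player choosing action a against action b
# (0 = rock, 1 = paper, 2 = scissors). Assumed scoring: win +1, draw 0, loss -1.
PAYOFF = [
    [0, -1, 1],   # rock: draws with rock, loses to paper, beats scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret anywhere: fall back to a uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def action_values(opponent_strategy):
    """Expected payoff of each of our actions against the opponent's mix."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(3))
        for a in range(3)
    ]


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Self-play; returns the average strategy of both players."""
    # Hypothetical starting point: player 0 begins biased toward rock.
    regrets = [list(initial_regrets), [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(3):
                # Positive regret: this action would have done better.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average_strategies = train(20000)
print(average_strategies)
```

The strategy played in any single iteration keeps swinging between actions; it is the average strategy over all iterations that settles close to playing each action about one third of the time.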
The Nash Equilibrium
In general, the algorithm described above is not guaranteed to converge to an optimal strategy, but in two-player zero-sum games such as heads-up poker it approaches what is called a “Nash equilibrium”: a strategy that, if followed perfectly, cannot lose on average over a large number of games. Here is how it works:
- If the opponent also follows a Nash strategy, we end up tied
- If the opponent plays a perfect counter-strategy against our Nash strategy, we also end up tied
- Otherwise, at the slightest error from the opponent, we win
The AI therefore wins through the best defense there is: it plays a strategy that cannot be beaten and scores points from the opponent's mistakes.
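These three cases can be checked on a toy game. The sketch below uses a hypothetical variant of rock-paper-scissors in which the opponent also has a clearly bad “forfeit” move that always loses; this variant, the payoff table and the name expected_payoff are assumptions made purely for illustration. In this game, playing rock, paper and scissors with probability one third each is a Nash strategy for us.

```python
from fractions import Fraction

# Payoff to us. Rows are our actions (rock, paper, scissors); columns are the
# opponent's actions (rock, paper, scissors, forfeit). "forfeit" is a
# hypothetical bad move that always hands us a win.
PAYOFF = [
    [0, -1, 1, 1],
    [1, 0, -1, 1],
    [-1, 1, 0, 1],
]

THIRD = Fraction(1, 3)
NASH = [THIRD, THIRD, THIRD]


def expected_payoff(our_strategy, opponent_strategy):
    """Average payoff per game when both sides play the given mixes."""
    return sum(
        our_strategy[a] * opponent_strategy[b] * PAYOFF[a][b]
        for a in range(3)
        for b in range(4)
    )


# Case 1: the opponent also plays the Nash mix, so the result is a tie.
vs_nash = expected_payoff(NASH, [THIRD, THIRD, THIRD, 0])
# Case 2: the opponent plays a best response (here, always rock): still a tie.
vs_rock_only = expected_payoff(NASH, [1, 0, 0, 0])
# Case 3: the opponent makes the bad move 10% of the time, so we come out ahead.
tenth = Fraction(1, 10)
vs_mistakes = expected_payoff(NASH, [3 * tenth, 3 * tenth, 3 * tenth, tenth])

print(vs_nash, vs_rock_only, vs_mistakes)  # 0 0 1/10
```

Our strategy is the same in all three cases; only the opponent's mistakes change the result, which is the sense in which the Nash strategy wins by defense.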
The 2017 tournament that brought together AI, Poker and humans
From January 11 to 31, 2017, Libratus faced four of the best professional poker players, Jimmy Chou, Dong Kim, Jason Les, and Daniel McAulay, in heads-up (two-player) No-Limit Texas Hold'em. The rules were simple, and the goal of the event, Brains vs. Artificial Intelligence, was to assess the progress of artificial intelligence in poker.
As Libratus continued to play, it improved its strategy using the results of the previous day, to take better account of its opponents and its own flaws. After 120,000 hands, the machine finally got the better of the human players.
In the end, the results were unequivocal: the AI finished about 1.7 million dollars in chips ahead.
The astonishing win of Libratus raises numerous questions about the future of the game. Will the game cease to be played the old-fashioned way? Will it be played entirely by AI on behalf of humans? Was the creation of Libratus just a way to test the power of AI? Who knows what the future holds for poker. For now, let us wait and watch how the game evolves with the advent of such technology; what unfolds ahead of us will surely be worth the wait.