Experimenting with a Neural Network-based Poker Bot

This is post #11 in an ongoing series of articles about my work as a poker bot developer.

At one point or another most poker bot developers have an epiphany. Their eyes open wide and they excitedly shout to the next stranger they see: “I’ll do it with a neural network!” Seems like such a good idea, right?

I tested it out and my definitive conclusion is “Maybe.”

A neural network (NN) is an AI technique that intelligently maps input values to output values. “Huh?” you say? Here’s the idea for poker: you give a NN data from previous hands you played (position, card values, hand rank, etc.) and the decisions you made in those situations (call, raise, fold, bet, check), and the NN will learn how to mimic those decisions. For a poker bot, this is a pretty appealing idea: you find the hand history of a winning, high stakes player, train the NN, and then set your poker bot loose to win a boatload of money.

Designing a Simple Test

A little background: My original goal for the poker bot was a full ring shortstacking bot. Shortstacking is a nasty little poker strategy that advocates aggressive play with a relatively small amount of chips. You see, when you don’t have a lot of chips to play with, you wind up making a lot of all-in decisions preflop and relatively few postflop, and most opponents do not adjust correctly to your strategy; they play like they’re up against someone with a normal stack, which is the absolute worst thing you can do against a talented shortstacker. This made shortstacking the perfect strategy for my fledgling poker bot.

At that time, the shortstacking bot made its decisions based on some elaborate conditional statements (e.g., if you have QQ, KK, or AA and are in early position, then raise). I had been testing it for several weeks when I decided to try out the neural network idea, so I had plenty of data to work with.

To test it out, I picked a very specific situation that the shortstacking bot had faced many times in the past: everyone folds to you preflop at a full ring (8-9 players) table; do you raise or fold? The bot never called in those situations, so I didn’t have to factor that in.

There were a total of 10,461 hands that met that criterion. For the NN, I used 7 input values:

1. The numeric value of the first hole card scaled from 0 to 1. So, for example, 2 = 2/14 and Ace = 14/14.
2. The numeric value of the second hole card scaled from 0 to 1.
3. Whether or not they were suited. 0 = unsuited, 1 = suited
4. My position at the table where 0 = first to act, 7/9 = 0.778 = Dealer
5. The average value of the two hole cards
6. The difference between the first card’s value and the average
7. The difference between the second card’s value and the average

The output was simply a 1 if it had raised and a 0 if it had folded.
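To make the encoding concrete, here is a minimal Python sketch of how a single hand could be turned into those seven inputs and one output. The function and its scaling conventions (particularly for inputs 5-7) are my own illustration of the description above, not the bot’s original code:

```python
def encode_decision(card1, card2, suited, position, num_players, raised):
    """Turn one preflop raise-or-fold decision into 7 NN inputs and 1 output.

    card1, card2: numeric card values, 2-14 (14 = Ace)
    suited:       True if both hole cards share a suit
    position:     seat index where 0 = first to act; divided by the number of
                  players (the post's example: dealer = 7/9 = 0.778 at 9-handed)
    raised:       True if the decision was a raise, False if a fold
    """
    avg = (card1 + card2) / 2.0
    inputs = [
        card1 / 14.0,               # 1. first hole card, scaled 0-1
        card2 / 14.0,               # 2. second hole card, scaled 0-1
        1.0 if suited else 0.0,     # 3. suited or not
        position / num_players,     # 4. table position, 0 = first to act
        avg / 14.0,                 # 5. average of the two cards (scaling assumed)
        (card1 - avg) / 14.0,       # 6. first card minus the average (scaling assumed)
        (card2 - avg) / 14.0,       # 7. second card minus the average (scaling assumed)
    ]
    output = 1.0 if raised else 0.0
    return inputs, output

# Example row: A-K suited in the dealer position of a 9-handed table, and the bot raised.
row, label = encode_decision(14, 13, True, 7, 9, True)
```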

Here’s what the data looks like in an Excel sheet:

[Screenshot: Poker Bot Neural Network training data]

You can also download the spreadsheet by clicking here.

The Results

The predictions of the neural network were stunningly accurate:

Correctly predicted Raise: 766/865 = 88.6%
Correctly predicted Fold: 9475/9596 = 98.7%
Overall Accuracy: 10241/10461 = 97.9%

The predictions indicate that it is possible to make quality decisions based on the output of a neural network.

However, and this is a big however: the conditional statements that controlled the decisions in the training data were not very complicated so it wasn’t very hard for the NN to learn the pattern. Training it based on a human’s behavior may have led to very different results because a human’s thought process is much more complicated than “if this then that”. Normal decisions are not simply based on your hole cards and position at the table. You also have to take into account your stack size, your image, your opponents, the dynamics at the table, and a host of other factors. But, interestingly, this is exactly what a NN is good at: learning how a wide range of variables affect a decision.

Despite the success of this test, I ultimately decided not to pursue a neural network based poker bot. The problem is that you don’t have much control over the decision-making process. You can’t, for example, look back at a hand and analyze why it made a specific decision. The neural network will spit out a number and the bot acts accordingly. There is no why; it’s merely a feeling it had. It’s also difficult to be precise. Say you want to always raise with AA, raise with KK half the time, and always call with QQ. It’s not a trivial task to adjust a neural network to make those types of decisions if the training data indicates you did something else.
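For contrast, that kind of precise control is trivial with explicit rules. Here’s a toy Python sketch of the AA/KK/QQ example above, layered on top of a model’s output; the hand notation, threshold, and structure are purely illustrative:

```python
import random

def preflop_action(hand, nn_raise_probability):
    """Explicit overrides for specific hands, falling back to a model's output."""
    if hand == "AA":
        return "raise"                                        # always raise
    if hand == "KK":
        return "raise" if random.random() < 0.5 else "call"   # raise half the time
    if hand == "QQ":
        return "call"                                         # always call
    # Anything else: trust the network's prediction.
    return "raise" if nn_raise_probability > 0.5 else "fold"
```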

One final note: When I first started developing the poker bot in late 2006 I spoke with someone online who claimed to have built a profitable Heads Up No Limit Sit-n-go bot based solely on the predictions of a neural network that he had trained on his own hand histories. Legit? Who knows. Makes you wonder though…

Modeling Human Clicking Behavior on PokerStars

This is post #10 in an ongoing series of articles about my work as a poker bot developer.

There’s a lot of conjecture and speculation about what the online poker sites look at to detect bots. As bot developers, all we can really do is make educated guesses and hope that we fly under the radar long enough to make a profit.

If I had to guess on what they look at, mouse click location is near the top of the list.

We know that PokerStars records where you click because there is a log file (aptly named PokerStars.log) in the PokerStars directory which includes that information:

It’s not clear though whether they use this information to identify bots or as supplemental data in the event you need technical assistance from PokerStars. Presumably, if they were looking at mouse click location to detect bots, they wouldn’t keep it in a log file which the user can edit.

Regardless, it’d be relatively easy for Jeff and team at PokerStars to do some statistical analysis on this data to flag suspicious activity, so if you’re going to develop a bot, you should try to make it act as human-like as possible.

But how do you know what’s normal activity?

For one, when the bot performs an action such as raising, don’t have it click the same exact location every time.

Here’s what I did:

That PokerStars log file contains the locations where you click, right? So why not take advantage of it.

I deleted the log file to reset the data, then joined a couple of tables and played for a few hours. When I finished, I extracted the coordinates from the log file and plotted them on a screenshot of one of the tables I had just finished playing at.

The end results show exactly where I clicked:

Most of the locations should be clear: Fold, Call, Raise, marking the Check box, viewing the hand history, and clicking the “Chat” and “Stats” tabs. The clicks around the center surprised me at first, but then I realized that when I’m multitabling I click the center of a table to bring it into focus.

I based my bot’s click locations on this visualization. For example, the location of the clicks on the third button can be approximated by two overlapping normal distributions, one vertical and one horizontal with their intersection at the center of the button.

Here’s the idea in code form (the listing below is a minimal Python sketch, not the bot’s original source; the button coordinates and spreads are illustrative):
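```python
import random

def sample_click(button_left, button_top, button_width, button_height):
    """Pick a human-looking click point near the center of a button.

    x and y are drawn from independent normal distributions centered on the
    button, so clicks cluster around the middle but never land on exactly the
    same pixel twice. The spreads and clamping margins here are illustrative.
    """
    center_x = button_left + button_width / 2.0
    center_y = button_top + button_height / 2.0
    x = random.gauss(center_x, button_width / 6.0)
    y = random.gauss(center_y, button_height / 6.0)
    # Clamp so the click always stays within the button.
    x = min(max(x, button_left + 2), button_left + button_width - 2)
    y = min(max(y, button_top + 2), button_top + button_height - 2)
    return int(round(x)), int(round(y))

# Example: a hypothetical 100x30 pixel "Raise" button with its top-left corner at (640, 480).
x, y = sample_click(640, 480, 100, 30)
```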

Is that the key to avoiding detection? Who knows.

I ran the bot for nearly two years, and while it may not have helped, it certainly did not hurt.

Best of luck —

PokerShark: Gaining an Edge at Online Poker

This is post #9 in an ongoing series of articles about my work as a poker bot developer.

Imagine you’re an online poker player and you suddenly have the ability to only play bad opponents. Say goodbye to the tough, aggressive, profitable players and say hello to the legions of loose, passive fish that make up the online poker community.

With a little bit of work, that’s exactly the situation I found myself in.

Some background: When I started playing online poker in early 2005 I was immediately drawn to one vs one No Limit Hold’em tournaments, known as Heads Up Sit-n-Go’s (HUSNGs). They’re fast, exciting, and they require an increased attention to psychology that you don’t get as much of at the full ring (9 players) tables.

Here’s how you join a HUSNG (this will become important later):

1. Find an upcoming tournament.

The PokerStars lobby helps you find exactly the type of game you want to play.

2. Open up the Tournament Lobby.

Once you’ve found a game that you like, double click it to open up the tournament lobby:

The lobby displays important information about the tournament such as how much it costs to play (the buyin plus the rake), the payout, the blind structure, etc.

It also shows you how many other people are registered in the tournament. If you’re the first one to register,  the list on the right will be blank, otherwise it will show the name of the person who already signed up.

3. Register. When you’re ready to play, click Register and join the tournament.

As soon as the table fills up with two players the game begins.

You can join a tournament any time you’d like. As soon as one fills up, another one is automatically created. During peak hours it can actually be a challenge to sign up for one because so many people are trying to register at once.

With thousands of people playing at the low and medium stakes, you can literally play for weeks without facing the same opponent twice.

This is fantastic news if you’re up against a tough opponent because you know that you’ll probably never face him or her again. BUT, if you’re playing some donk (a donkey–a bad player), it’s frustrating for the exact same reason: you’ll likely never play him again. And you want to. You really want to.

One important point: For a few minutes after a HUSNG ends, you can still locate it and check the results:

Why PokerStars lets you access this is beyond me.

Anyway, after a few months of playing the HUSNG opponent lottery, I decided to see what I could do to improve the situation. I set out to create a program that would record the results of every HUSNG played on PokerStars and then use those results to determine which opponents to play and which to skip.

Several months and many iterations later, it was built. I called it PokerShark. (The original version was called PokerSanta but I decided that was kind of girly and changed it to PokerShark.)

24/7 I had a program running that recorded the winners and losers of every HUSNG, and when I was ready to play, I ran a second program which opened up tournament lobbies and waited for a player to register. As soon as someone registered, the program checked the player’s previous results and determined if he was mediocre enough to play. If he was, the program would automatically register me for that tournament.
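In outline, that opponent check worked something like the Python sketch below. The threshold, minimum sample size, data structure, and screen name are illustrative guesses, not the actual values PokerShark used:

```python
MIN_GAMES = 20       # need enough recorded HUSNGs to judge a player (illustrative)
MAX_WIN_RATE = 0.45  # only play opponents winning less than this (illustrative)

def should_register(opponent, results_db):
    """Decide whether an opponent's record is mediocre enough to play them."""
    games = results_db.get(opponent, [])
    if len(games) < MIN_GAMES:
        return False  # not enough data on this player: skip
    wins = sum(1 for outcome in games if outcome == "won")
    return wins / len(games) < MAX_WIN_RATE

# results_db maps a screen name to the list of "won"/"lost" outcomes
# collected around the clock by the first program.
results_db = {"SomeDonk99": ["lost"] * 18 + ["won"] * 7}
should_register("SomeDonk99", results_db)  # -> True (7/25 = 28% win rate)
```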

The results were simply incredible:

Here’s what the software looked like: (click to expand the screenshots)

July 13, 2005:

At first, I focused on the software that recorded the results.

The window on the left was my attempt at concisely visualizing the recorded results.

The small, busy window below the PokerStars lobby displayed collection statistics, and the large “Tournament Intercept Window” intercepted the Completed tournament windows as they were opened so that they didn’t steal the focus away from anything else I was doing.

December 16, 2005

The ugly maroon window on the bottom left was my first shot at automating the registration process.

February 16, 2006

Eventually I added support for multiple buyins and made the interception window much smaller.

There were a lot of interesting graphs…

July 7, 2006

Over time I improved the design (note the little icons next to the buyins) and added extra analysis criteria such as jump ratio, which measured the stakes a player was currently playing compared to his average stake. Higher jump ratio = more tilt = more I want to play.
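The jump ratio itself boils down to a simple calculation. A quick Python sketch of the idea (the averaging window and exact formula here are illustrative):

```python
def jump_ratio(current_buyin, recent_buyins):
    """Current stake relative to the player's average stake.

    A player who usually sits in $20 games but suddenly registers for a $100
    game has a jump ratio of roughly 5 -- a possible sign of tilt.
    """
    average_buyin = sum(recent_buyins) / len(recent_buyins)
    return current_buyin / average_buyin

jump_ratio(100, [20, 20, 30, 20])  # -> ~4.4
```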

October 5, 2006

Notable on this one are the statistics overlays on each of the tables, which came from another program I wrote that helped me make better decisions.

Using this software was a trade-off. On one hand, I didn’t have to face any tough opponents, so the money came easily and I didn’t have to deal with a lot of the stress typically associated with heads up games. However, because I was playing against terrible opponents, I picked up a lot of bad habits which actually made me a worse poker player. It’s funny how things work out.

By October 2006 I had acquired a ton of experience programming add-ons for the PokerStars software, and I decided to take it one step further and try to build a bot.

I mean, how hard could it be? =)

No Limit Hold’em Poker Bot Profits by Effective Stack Size

This is post #8 in an ongoing series of articles about my work as a poker bot developer.

Two weeks ago I posted a chart of the poker bot’s net income for its last full month of play, September 2008. As a poker player, your net income is a crucial figure because it determines what you can actually buy with your hard earned profits. That’s the number you would tell your non-poker playing friends if they asked how you were doing. However, net income only paints part of the picture. To truly measure your results and your progress, you have to break down net income and analyze exactly where your profits and losses are coming from.

As a bot developer, identifying and eliminating weaknesses in the bot’s play were crucial to making it profitable. One way I did this was to break down its results into groups based on its effective stack size.

Effective stack size is best explained by example: Say you’re playing No Limit Hold’em with one opponent and you have $25 and he has $10. The important thing to realize is that your opponent cannot risk more than $10 in a single hand because that’s all he has in front of him. If he can’t risk more than $10, you can’t win or lose more than $10 either. Your effective stacks, therefore, are $10. It’s what you’re effectively playing with.

Put another way, your extra $15 does you no good playing against an opponent who can only risk $10. Because that extra $15 is not in play, you should devise your strategy based on the effective stack size, not your actual stack size (stack size should play a central role in your poker strategy).

Furthermore, you shouldn’t measure the size of your stack in chips or in dollars; you should measure it in terms of big blinds.

Consider this: you’re playing $1/$2 Heads Up No Limit Hold’em with a $200 stack. You’re the small blind and dealt pocket tens. All other things being equal, you should make the same move as if you were playing $100/$200 with a $20,000 stack, because in both cases you’re playing with a 100 big blind stack ($200/$2 = 100; $20,000/$200 = 100). You could also measure your stack in terms of how many small blinds you have, but the standard is usually big blinds.

Q. You’re playing a Heads Up No Limit Hold’em Sit-n-Go (HUSNG) and the blinds are 25/50. You have 2,000 chips and your opponent has 1,000. What’s your effective stack size in big blinds?

A. You can’t risk more than 1,000 chips in a hand, so your effective stack is 1,000 chips, or 1,000/50 = 20 big blinds (bb).

When you start a HUSNG you’re given 1,500 chips and the blinds are 10/20, so you have a 75bb effective stack, meaning that in a HUSNG, you can never have more than a 75bb effective stack. If the blinds jump to 15/30 and you still have 1500 chips, you have a 1500/30 = 50bb stack.
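Put in code, the effective stack in big blinds is just the smaller of the two stacks divided by the big blind. A minimal Python sketch:

```python
def effective_stack_bb(hero_stack, villain_stack, big_blind):
    """Effective stack size measured in big blinds."""
    return min(hero_stack, villain_stack) / big_blind

effective_stack_bb(2000, 1000, 50)   # -> 20.0, as in the Q&A above
effective_stack_bb(1500, 1500, 20)   # -> 75.0, the start of a HUSNG
```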

The bot’s strategy changed depending on the effective stack size, which is why I broke down the net income into the different groups. The groups might seem strange (i.e., why “22-35 bb”?) but there’s a method to the madness. Some of it is based on postflop stack-to-pot ratios and some of it is simply based on preflop stack size.

Finally, “5 bb/100” means that the bot won, on average, 5 big blinds every 100 hands. Measuring bb/100 is the standard way to gauge a player’s ability, and what counts as a good number varies with the stakes and type of game.
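And bb/100 is simply net winnings, converted to big blinds, per hundred hands. A minimal sketch with illustrative numbers:

```python
def bb_per_100(net_won_dollars, big_blind, hands_played):
    """Win rate in big blinds per 100 hands."""
    return (net_won_dollars / big_blind) / hands_played * 100

bb_per_100(500, 2.0, 5000)  # winning $500 at $1/$2 over 5,000 hands -> 5.0 bb/100
```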

This is all a long way of saying here are the results of my poker bot for its last three months of play broken down by effective stack size.

50 – 75 Big Blinds

21,479 hands @ 31 bb/100

~

35 – 50 Big Blinds

12,279 hands @ 14 bb/100

~

22 – 35 Big Blinds

11,615 hands @ 4 bb/100

~

15 – 22 Big Blinds

5,688 hands @ 5 bb/100

~

10 – 15 Big Blinds

6,609 hands @ 5 bb/100

~

0 – 10 Big Blinds

5,076 hands @ 3 bb/100