Winning Blackjack through Algorithms

Rene Plowden and Joseph Libby

Abstract - In this paper we explore algorithms that attempt to show profitable results over 1,000,000 hands of blackjack. While blackjack is a seemingly simple game, strategies based on the dealer's presumed hand value and the player's current hand value have been published in the past. We try to improve on these ideas and implement our own set of algorithms and strategies to solve the riddle of winning at blackjack. At best, with a simple strategy, a player is expected to win about half of the time. This paper follows the work done over the entirety of the project, successes and failures alike, with an eye towards the thought process behind our conclusions.

Key Words: blackjack, algorithm analysis, naïve probabilistic, Monte Carlo, combinatorial analysis.

1. Introduction

There are many successful examples on which to base our research of blackjack algorithms. Previous research implements evolutions of the genetic algorithm [1]; neural networks utilizing three different networks (one for splitting, one for doubling down, and one for standing/hitting) [2]; and another neural network working with the basic strategies and multiple players [3]. Our first implementation was a naïve approach that made decisions based upon the chance of a successful hit. Our more rigorous algorithms included a Monte Carlo based optimization and a combinatorial analysis. This research simulates a marathon game of blackjack played between one player and the dealer. All testing was performed on a set of 250,000 decks, with 4 hands played into each deck. A formal statement of our problem is thus: improve the profit margin of blackjack by optimizing the win to loss ratio, through the use of various strategies and algorithmic computations, given a standardized test set of 250,000 decks in which 4 hands are played into each deck.
2. Game play

Blackjack, also known as twenty-one, is a constraint-based casino card game in which players compete against the house dealer. The total of card values in a hand may not exceed 21, or else the hand is "busted" and immediately disqualified. Players place bets before the hand is dealt; if they win the hand, the house pays them an amount equal to their bet. The game is named for a special case in which a hand, as initially dealt, consists of an Ace and any card with a value of 10. When blackjack is dealt to a player, they immediately receive winnings equal to 150% of their bet and the hand is over. If the dealer has blackjack, the player immediately loses their initial bet and the hand concludes. If both the player and dealer have blackjack, a "push" is declared and bets are returned. If blackjack is not present in either hand, the player can potentially "hit", "stand", "double-down", or "split". Double-downs and splits are not discussed in this research, as they were not implemented in any of our three algorithms. When a player hits, their hand receives an additional card from the top of the shoe in an attempt to form a hand total greater than the dealer's. If the total of the hand exceeds 21, the player "busts" and loses the hand. The player may hit until they bust or decide to "stand", which ends their turn without drawing a card and begins the dealer's turn. The dealer plays the game using a fixed strategy: if the total of their hand is less than 17, they must hit. The hand totals that a dealer stands on, from 17 to 26, are referred to as absorbing states. Absorbing states are states that cannot be left once reached; any other hand state is referred to as a transitioning state. Once a dealer reaches an absorbing state, their hand is compared to the player's hand; the hand with the greater total wins.

Figure 2.1 shows a brief flowchart of the game used in creating a simulation of actual play.
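The dealer's fixed strategy described above can be sketched in a few lines of Java. This is a minimal illustration under our own simplifying assumptions, not the paper's implementation: the shoe is modeled as a queue of card values, aces are counted as 1, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.List;

public class DealerPlay {
    // Sketch of the dealer's fixed strategy: hit while the total is below 17.
    // Assumption (ours): the shoe is a queue of card values, aces count as 1.
    static int playOut(ArrayDeque<Integer> shoe, int dealerTotal) {
        while (dealerTotal < 17) {        // transitioning states: the dealer must hit
            dealerTotal += shoe.poll();   // draw from the top of the shoe
        }
        return dealerTotal;               // absorbing state: 17-21 stands, 22-26 is a bust
    }

    public static void main(String[] args) {
        // Up card of 6, shoe dealing 10 then 5: 6 + 10 = 16, must hit; 16 + 5 = 21, stand.
        ArrayDeque<Integer> shoe = new ArrayDeque<>(List.of(10, 5, 9));
        System.out.println(playOut(shoe, 6)); // prints 21
    }
}
```

Any total from 17 through 26 ends the loop, matching the range of absorbing states given above.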
3. Perfect Game and Benchmarks

A deck consists of a standard 52-card playing deck without any Jokers or other specialty cards. By standardizing on 250,000 pre-shuffled decks, we ensured that the decks are exactly the same for each testing instance. Even though the hands may be handled differently, allowing for slight variation in deck distributions, this test provided an upper bound for expected profitability. Each deck is played through four times to complete a million simulated games, providing data for the hypothesis of whether an algorithm will improve its win rate as more of the deck is revealed.

To get an idea of what an exceptional win to loss rate in blackjack is, we employed a Java class that played through our test set with perfect knowledge of what the next card would be. This algorithm forms the highest possible hand by looking at the next card and deciding whether or not it will cause a bust state. If the hand would bust, the player stands and play proceeds to the dealer's turn; otherwise, the player hits and restarts this decision-making process. This eliminates failure due to busts, yet still allows the unavoidable loss of a hand where the player must stand and the dealer has a higher hand value. This algorithm does not take into account what cards or hand value the dealer has. As shown in Figure 3.1, the number of winning hands was just below 600,000.

4. Naïve approach

When attempting to solve blackjack, a reasonable approach to winning the game or gaining an advantage over the dealer is examining probability. Analyzing the distribution of cards left in the deck and gauging whether the deck state is favorable is known as card counting. A famous card counter, Kenneth Senzo Usui, devised a system that won him millions and allowed him to write books on the subject of beating the house.
This prompted casinos to add more decks and increase the edge that dealers have over players. Although we do not play multiple decks in these simulations, the results will be similar to those of shoes with multiple decks. This naïve method was implemented to observe how aggressively each hand should be played to optimize strategy. Aggressiveness is defined as how much risk is taken to maximize hand value. If the player hits when the chance of a successful hit is too low, the player will hit too often. If the player only hits with a high chance of success, they may be too cautious and lose too often when the dealer has a higher hand value. The chance of a successful hit was calculated (see Algorithm 4.1) by counting all of the cards that would not bust the player and dividing by the number of cards remaining in the deck. Since we had no basis for this percentage, it was set as a variable threshold and incremented by ten percent on each iteration over the test set. A zero percent threshold feigns a reckless player, bent on having a hand total of 21: the player hits even with a zero percent chance of surviving the hit, and can only win a hand by reaching 21 or being dealt blackjack. A hundred percent threshold is the opposite, where the player only hits when there is no chance of busting. As Figure 4.1 illustrates, a threshold of 60 percent provided the most wins of the trials, and this became a baseline for subsequent experiments. Although this was a basic implementation, it still achieved a 0.45 win to loss ratio, just short of our breakeven mark. Improvements such as adding basic strategy or other heuristics that incorporate the dealer's state (the dealer's up card) should allow this implementation to evolve into a more viable algorithm.

Figure 3.1 shows the win/lost/push rate with a player having the highest hand value; this is the upper bound provided for the studied algorithms.
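The successful-hit calculation and the 60 percent threshold can be sketched as follows. The deck representation and method names are our own assumptions (an array of counts per card value, with aces counted as 1 for brevity), not the paper's code.

```java
public class NaiveStrategy {
    // counts[v] = number of undealt cards with blackjack value v (1..10);
    // 10s, Jacks, Queens and Kings all share value 10. Aces count as 1 here
    // for simplicity (our assumption).
    static double successfulHitChance(int[] counts, int playerTotal) {
        int safe = 0, remaining = 0;
        for (int v = 1; v <= 10; v++) {
            remaining += counts[v];
            if (playerTotal + v <= 21) safe += counts[v]; // card would not bust the hand
        }
        return remaining == 0 ? 0.0 : 100.0 * safe / remaining;
    }

    // Hit when the chance clears the 60% threshold found in the trials.
    static boolean shouldHit(int[] counts, int playerTotal) {
        return successfulHitChance(counts, playerTotal) >= 60.0;
    }
}
```

For example, with a full deck and a hand total of 11, every remaining card is safe (chance 100%), so the player hits; at a total of 20, only the four aces are safe (about 7.7%), so the player stands.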
successfulHit = (safeCards / remainingCards) * 100

Algorithm 4.1 was used to calculate the probability of a successful hit (in pseudocode).

Figure 4.1 shows the win/loss/push rate of the naïve approach across the different successful-hit thresholds; 0% means hitting even with no chance of success and 100% means hitting only when success is certain.

5. Modified Monte Carlo

The next evolution of our strategy was to simulate card games and outcomes using the Monte Carlo method. In layman's terms, this means making an uncertain outcome tractable. Outcomes are weighted by heuristics from most likely to least likely, then placed on a number line with spans proportional to those probabilities. Random numbers are then drawn repeatedly, and the more prevalent outcomes emerge from the resulting distribution. Thus, the logical course of action, in this case whether to hit or stand, eventually becomes clear. The method is named after Monte Carlo, a famous gambling destination, which reflects the nature of the algorithm. In 1946, the scientist Stanislaw Ulam worked with John von Neumann to devise a way to statistically evaluate a game of solitaire, when ordinary combinatorial calculations were not sufficient to work out solutions. With the help of von Neumann and computers, this new style of calculation was used to predict neutron diffusion in the now famous Manhattan Project. Though not as quaint as the original, we implemented the same type of randomness and calculation of random parts to mimic the Monte Carlo method. Method 1 is the closest to a true Monte Carlo: it randomly chooses cards for the dealer's down card (1,000 guesses) and, taking that into consideration, determines whether the hand can take a card without busting.
Since the probability of drawing any particular next card can be expressed as 1/(remaining cards), each of the probable hands has the same chance of receiving the next card. If the hand was successful and beat the dealer's current value, a counter was incremented. Once the 1,000 simulations of the hand were finished, an average was taken from this counter. If the resulting probability was higher than 0.55 (roughly the optimal aggressiveness adopted from our naïve algorithm), a hit was taken and the process began again with the new card added to the player's hand total. This gave an overall win percentage of 0.4566. Methods 2 and 3, however, are less faithful to the Monte Carlo method, aside from the use of randomness. In Method 2, the algorithm calculates whether it is more advantageous (wins more often, according to a random set of cards) to stay or to take a hit. The dealer's cards are randomized as well, to ensure the algorithm does not lose by always standing on low hand values. The probability of taking a hit and surviving was set to 0.5, an earlier estimate based on preliminary results from our naïve approach. Since there was no real value in pursuing this method other than for comparison, its probabilities were not updated, though further work may refine it at a later date. The final method compares the hit versus non-hit values: if staying evaluated more favorably than hitting, the higher-valued action was taken regardless of how close the two values were (no weighting). Though each of these methods was crude, they gave us insight into how a truly random multidimensional problem can be rationalized into a statistically plausible conclusion. This is why the Monte Carlo method is used in stock market prediction: it takes into account many measurable factors and gives a statistically viable answer to an outrageously random question.
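As a rough sketch of Method 1 under stated assumptions (uniform random card guesses rather than deck-aware draws, aces counted as 1, and hypothetical names, so not the paper's actual class), the 1,000-guess estimate might look like:

```java
import java.util.Random;

public class MonteCarloHit {
    // Guess the dealer's down card and the player's next card at random 1000
    // times and estimate how often taking the hit still wins the hand.
    // Assumption (ours): cards are drawn uniformly from values 1..10.
    static double estimateWinRate(int playerTotal, int dealerUpCard, long seed) {
        Random rng = new Random(seed);
        int wins = 0;
        final int trials = 1000;
        for (int i = 0; i < trials; i++) {
            int newTotal = playerTotal + 1 + rng.nextInt(10);     // player's hit
            int dealerTotal = dealerUpCard + 1 + rng.nextInt(10); // guessed down card
            while (dealerTotal < 17) dealerTotal += 1 + rng.nextInt(10); // dealer's fixed strategy
            if (newTotal <= 21 && (dealerTotal > 21 || newTotal > dealerTotal)) wins++;
        }
        return (double) wins / trials;
    }

    // Hit when the estimate clears the 0.55 aggressiveness threshold.
    static boolean shouldHit(int playerTotal, int dealerUpCard, long seed) {
        return estimateWinRate(playerTotal, dealerUpCard, seed) > 0.55;
    }
}
```

A fixed seed makes the simulation repeatable for testing; in actual play the generator would be left unseeded.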
count = 0;
for each of the 1000 simulations:
    if (howManyHits > 1) then count++;
    else if (howManyHits == 1 && playerTotal < 22) count++;

Algorithm 5.1 shows the pseudocode for the Monte Carlo success count, based on each card having a similar probability of occurrence, for hands that do not start in an absorbing state.

Figure 5.1 shows the win/lost/push rate for each of the methods used with the Monte Carlo algorithm.

6. Combinatorial Analysis

The combinatorial analysis algorithm attempts to maximize profit by comparing the probability of losing a hand by standing against the probability of losing when hitting. The probability of losing when hitting is calculated by counting the cards left in the deck that would cause the player's hand to exceed 21, and dividing by the number of remaining cards. The probability of losing when standing is calculated by exploring all legal combinations of the dealer's hand, given the dealer's up card, the player's hand, and the current deck state. Each possible dealer end-state is generated and assigned a probability, which is summed with the probabilities of other combinations that reach the same absorbing state. The algorithm minimizes loss by hitting when standing is less profitable and standing when hitting is less profitable; if standing and hitting have the same expectation, the player stands.

In this implementation, all possible dealer hand combinations are generated using a depth-first tree traversal. Backtracking is supported by a stack, which contains the deck state and the probability of reaching the current hand combination. Each time a dealer hand with a total less than 17 is visited, thirteen branches representing each card rank are generated, and the deck state and current probability are pushed onto the stack. These branches are traversed one at a time, from left to right, until a leaf is reached.
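This traversal can also be expressed recursively, which is equivalent to the explicit stack: the call stack holds the deck state and running probability, and returning from a call performs the backtrack. The sketch below is our own illustration under simplifying assumptions, not the paper's code: ten card values instead of thirteen ranks (10/J/Q/K share value 10) and aces fixed at 1.

```java
import java.util.Arrays;

public class DealerOutcomes {
    // Depth-first enumeration of dealer hands. counts[v] holds the number of
    // undealt cards with value v (1..10). Returns probs[t] = probability that
    // the dealer's absorbing state is total t (17..26, where 22-26 are busts).
    static double[] finalTotals(int[] counts, int dealerTotal) {
        double[] probs = new double[27];
        if (dealerTotal >= 17) {            // leaf: absorbing state reached
            probs[dealerTotal] = 1.0;
            return probs;
        }
        int remaining = Arrays.stream(counts).sum();
        for (int v = 1; v <= 10; v++) {     // one branch per card value
            if (counts[v] == 0) continue;
            double p = (double) counts[v] / remaining;
            counts[v]--;                    // deal card v (push deck state)
            double[] sub = finalTotals(counts, dealerTotal + v);
            counts[v]++;                    // backtrack (pop deck state)
            for (int t = 17; t <= 26; t++) probs[t] += p * sub[t];
        }
        return probs;
    }
}
```

The stand-loss probability is then the sum of probs[t] over the non-busting totals t (17 through 21) that exceed the player's hand total.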
A leaf of the tree is reached when the dealer's hand total is 17 or greater. Once a leaf is reached, the probability of the leaf occurring is added to the corresponding end-state and the last dealt card is removed from the dealer's hand. The algorithm then pops the previous deck state and probability from the stack and backtracks to examine the adjacent branch. Once the probability of the dealer's hand reaching each absorbing state is calculated, the player's decision is made. The chance of the player losing when standing is the sum of the probabilities of the dealer reaching a non-busting absorbing state with a total greater than the player's. This algorithm did not show profitable results, with a win to loss rate of 0.4746 over 1,000,000 hands. While these were the most favorable results of the algorithms implemented, the processing time of this algorithm was also the longest.

if (hitLoss < standLoss)
    hit
else
    stand

Algorithm 6.1 expresses the decision logic: hit when the probability of losing on a hit is lower than the probability of losing on a stand.

Figure 6.1 shows the win/loss/push rate of the implemented combinatorial analysis algorithm.

7. Future Work

The algorithms examined in this research have provided an understanding that will aid the future evolution of blackjack algorithms. Moving forward, we plan to implement double-downs and splits in our algorithms to approach an optimal solution. The results of our algorithms have made it clear that a blackjack player must exploit every legal operation to exhibit a profitable strategy.

8. Conclusion & Analysis

In conclusion, we found that no individual implementation wins in a way that would produce a distinct advantage over the house or dealer. That being said, more study needs to be done on combining the algorithms to create a hybrid of multiple algorithms trying to solve the same problem.
Such a hybrid might give us the advantage that we are looking for. As to which algorithms performed best, Figure 7.1 shows that the Monte Carlo based Method 1 did just as well as the combinatorial analysis, given that our goal was to maximize profit by maximizing wins.

As for the Big-O of each algorithm, the breakdown is as follows; the variable n in these computations is one million, the number of games each algorithm simulates. The benchmark, or perfect game, although not the most efficient code, ran in O(n), since all the lookups were single operations. The naïve approach, which makes some calculations about which cards will not bust the hand but otherwise performs single lookups, was also O(n). Monte Carlo Method 1, even after the class was made more efficient, still took longer than the previous two; its analysis is O(1000nx + 1000), where x is the number of guesses it takes to reach an absorbing state. Finally, the largest complexity belonged to the combinatorial analysis, which even reduced is O(n log n), because it must perform a depth-first traversal to complete the algorithm.

Our final analysis of the algorithms was a time trial. Since our code was not optimized during implementation, except for some work on the Monte Carlo class, each time is reflective of how the algorithm performs. Reading in the file used to keep hand data took an average of 23.4 seconds, which was subtracted from the run times of the algorithms. The naïve implementation, with all ten trials (0% - 100%) encapsulated within it, took an average of approximately 12.1 seconds. Monte Carlo Method 1 was a bit longer at 90.4 seconds. Finally, the time-consuming combinatorial analysis clocked in at 3,347.3 seconds, a little less than an hour.
Intuitively, the classes with fewer computations and lower Big-O have far less run time than the others, and we confirmed this in our analysis. The other hypothesis, that an algorithm will become better as more of the deck is revealed, holds weight for the algorithms that take the probability of cards appearing into account. The combinatorial analysis is the most relevant, since it relies heavily on calculations based on which cards can show up in the future: as fewer cards remain, the narrowing of possible outcomes brings the prediction in line with what really occurs. The only other such algorithm is the naïve approach, since it computes the successful-hit rate from the remaining deck. The remaining methods do not deal with any probability related to card counting and, as expected, showed no noticeable gain as play went further into the individual decks.

Figure 7.1 shows the wins of each of the algorithms used during this study; the perfect game was the benchmark, and all fell short of this ultimate goal.

Figure 7.2 shows the win percentage of hands 1 through 4 within each deck for each of the algorithms; the scale was narrowed to show more of the difference in wins.

9. References

[1] Perez, Andres & Sanchez, Eduardo. "Blackjack as a Test Bed for Learning Strategies in Neural Networks." Logic Systems Laboratory, Computer Science Department, CH-1015 Lausanne, Switzerland. 0-7803-4859-1. 1998.
[2] Kendall, Graham & Smith, Craig. "The Evolution of Blackjack Strategies." University of Nottingham, School of Computer Science and IT, Nottingham NG8 1BB, UK. 0-7830-7804-0. 2003.
[3] Fogel, David. "Evolving Strategies in Blackjack." Natural Selection, Inc., 3333 N. Torrey Pines Ct., Suite 200, La Jolla, CA 92037. 0-7830-8515-2. 2004.