Winning Blackjack through Algorithms
Rene Plowden and Joseph Libby
Abstract - In this paper we explore algorithms
that attempt to show profitable results over
1,000,000 hands of blackjack. While blackjack is a
seemingly simple game, strategies based on the
presumed dealer’s hand value and the player’s
current hand value have been published in the past.
We will try to improve on these ideas and
implement our own set of algorithms and strategies
to solve the riddle of winning at blackjack. At best,
with a simple strategy, a player is expected to win
about half of the time. This paper follows the work
done over the course of the project, its successes and
failures, with an eye towards the thought process
behind our conclusions.
Key Words: blackjack, algorithm analysis, Naïve
probabilistic, Monte Carlo, combinatorial analysis.
1. Introduction
There have been many successful examples on which to
base our research of blackjack algorithms. Previous
research implements evolutions of the genetic
algorithm [1]; neural networks utilizing three different
networks (one for splitting, one for doubling down, and
one for standing/hitting) [2]; and another neural
network working with the basic strategies and multiple
players [3]. Our first implementation consisted of a
naïve approach that made decisions based upon the
chance of a successful hit. Our more computationally
demanding algorithms included a Monte Carlo based
optimization and a combinatorial analysis. This research simulates a
marathon game of blackjack played between one player
and the dealer. All testing was performed on a set of
250,000 decks, in which 4 hands were played into each
deck. A formal statement of our problem is thus:
Improving the profit margin of blackjack by optimizing
the win to loss ratio, through the use of various
strategies and algorithmic computations; given a
standardized test set of 250,000 decks, in which 4
hands are played into each deck.
2. Game play
Blackjack, also known as twenty-one, is a
constraint-based casino card game in which players
compete against the house dealer. The total of card
values in a hand may not exceed 21, or else the hand is
“busted” and immediately disqualified. Players place
bets before the hand is dealt; if they win the hand, the
house must reward them with an amount equal to their
bet. The game is named for a special case in which a
hand consists of an Ace and any card with a value of
10, when it is initially dealt. When blackjack is dealt to
a player they immediately receive winnings equal to
150% of their bet and the hand is over. If the dealer
has blackjack, the player immediately loses their initial
bet and the hand concludes. If both the player and
dealer have blackjack, a “push” is declared and bets are
returned. If blackjack is not present in either hand, the
player can potentially “hit”, “stand”, “double-down”, or
“split”. Double-downs and splits will not be discussed
in this research as they were not implemented in any of
our three algorithms. When a player hits, their hand
receives an additional card from the top of the shoe in
an attempt to form a hand total greater than the
dealer’s. If the total of the hand exceeds 21, the player
“busts” and loses the hand. The player may hit until
they bust or decide to “stand”, which ends their turn
without drawing a card and begins the dealer’s turn.
The dealer plays the game using a fixed strategy – if
the total of their hand is less than 17 they must hit. The
range of hand totals at which the dealer stops drawing,
from 17 to 26, is referred to as the set of absorbing
states. Absorbing states are states that cannot be left
once reached; any other hand state is referred to as a
transitioning state. Once the dealer reaches an
absorbing state, their hand is compared to the player's
hand, and the hand with the greater total wins.
Figure 2.1 shows a brief flowchart of the game used in
creating a simulation of actual play.
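As a concrete illustration of the fixed dealer strategy described above, a minimal Java sketch might look like the following. The Deque-based shoe, the playDealer name, and the hard-total simplification are our own assumptions for illustration, not the simulator's actual code.

import java.util.Deque;

public final class DealerTurn {

    // Dealer hits while below 17; totals of 17-21 stand and 22-26 bust,
    // and both ranges are absorbing states. Aces are treated as fixed
    // values here for brevity; a full simulator would handle soft totals.
    static int playDealer(int dealerTotal, Deque<Integer> shoe) {
        while (dealerTotal < 17 && !shoe.isEmpty()) {
            dealerTotal += shoe.pop(); // draw the next card value from the top of the shoe
        }
        return dealerTotal;
    }
}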
3. Perfect Game and Benchmarks
A deck consists of a standard 52-card playing deck
without any Jokers or other specialty cards. By
standardizing on a set of 250,000 already-shuffled
decks, we ensured that the decks are exactly the same
for each testing instance. Even though the hands may
be handled differently, allowing for slight variation in
deck distributions, this test set provided an upper bound
for expected profitability. Each deck is played through
four times to complete a million simulated games,
providing data for the hypothesis that an algorithm will
improve its win rate as more of the deck is revealed.
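As a hypothetical illustration of this setup, a Java harness might look like the following sketch; the Strategy interface, method names, and deck representation are our own assumptions rather than the project's actual classes.

import java.util.List;

public final class Harness {

    // Hypothetical strategy interface: plays one hand against the dealer using
    // the remaining cards of the deck; returns +1 for a win, -1 for a loss, 0 for a push.
    interface Strategy {
        int playHand(List<Integer> deck);
    }

    // Replays each of the 250,000 pre-shuffled decks four times: 1,000,000 hands in total.
    static int[] run(List<List<Integer>> decks, Strategy strategy) {
        int[] winLossPush = new int[3];
        for (List<Integer> deck : decks) {
            for (int hand = 0; hand < 4; hand++) {
                int result = strategy.playHand(deck);
                if (result > 0) winLossPush[0]++;
                else if (result < 0) winLossPush[1]++;
                else winLossPush[2]++;
            }
        }
        return winLossPush;
    }
}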
To get an idea of what an exceptional win to loss
rate in blackjack is, we employed a Java class that
played through our test set with perfect knowledge of
what the next card would be. This algorithm forms the
highest possible hand by looking at the next card and
deciding whether or not it will cause a bust. If the
hand would bust, the player "stays" and play proceeds
to the dealer's turn; otherwise, the player takes the
"hit" and restarts this decision-making process. This
eliminates failure due to busts, yet still allows the
unavoidable loss of a hand where the player must stand
and the dealer has a higher hand value. This algorithm
does not take into account what cards or hand value the
dealer has. As shown in Figure 3.1, the number of
winning hands was just below 600,000.
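A minimal Java sketch of this perfect-knowledge decision rule is shown below; the class, method, and parameter names are our own illustrative assumptions.

import java.util.List;

public final class PerfectPlayer {

    // Peeks at the next card and hits only while that card cannot bust the hand;
    // card values are assumed to be already resolved to 1-11 for brevity.
    static int playHand(int playerTotal, List<Integer> deck, int nextIndex) {
        while (nextIndex < deck.size() && playerTotal + deck.get(nextIndex) <= 21) {
            playerTotal += deck.get(nextIndex++); // safe hit: take the known next card
        }
        return playerTotal; // otherwise stay and hand play over to the dealer
    }
}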
Figure 3.1 shows the win/lost/push rate of the
perfect-knowledge player, which always forms the highest
possible hand; this provides the upper bound for the
studied algorithms.
4. Naïve approach
When attempting to solve blackjack, a reasonable
approach to winning the game or gaining an advantage
over the dealer is examining probability. Analyzing the
distribution of cards left in the deck and gauging
whether the deck state is favorable is commonly called
card counting. A famous card counter, Kenneth Senzo
Usui, devised a system that won him millions and
allowed him to write books on the subject of beating
the house. This prompted casinos to add more decks and
increase the edge that dealers hold over players.
Although we do not play multiple decks in these
simulations, the results should be similar to those of
shoes with multiple decks.
This naïve method was implemented to observe
how aggressively each hand should be played to optimize
strategy. Aggressiveness is defined as how much risk is
taken to maximize hand value. If the player hits when
the chance of a successful hit is too low, the player will
hit too often; if the player only hits with a high chance
of success, they may be too cautious and lose too often
when the dealer has a higher hand value. The success
percentage was calculated (see Algorithm 4.1) by
counting all of the cards that would not bust the player
and dividing by the number of cards remaining in the
deck. Since we had no basis for choosing a threshold, it
was set as a variable and incremented by ten percent on
each iteration of the test set. A zero percent threshold
feigns a reckless player, bent on reaching a hand total
of 21: the player hits even with a zero percent chance of
surviving a hit and can only win a hand by reaching 21
or being dealt blackjack. A hundred percent threshold is
the opposite, where the player only hits when there is
no chance of busting. As Figure 4.1 illustrates, a
threshold of 60 percent provided the most wins of the
trials, and this served as a basis for subsequent trials.
Although this was a basic implementation, it was
still able to achieve a 0.45 win to loss ratio, just short
of our break-even mark. Improvements such as adding
basic strategy or other heuristics that incorporate the
dealer's state (the dealer's up card) should allow this
implementation to evolve into a more viable algorithm.
successfulHit=(safeCards/remainingCards) * 100
Algorithm 4.1 shows the pseudocode used to calculate
the probability of a successful hit.
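To make Algorithm 4.1 and the threshold sweep concrete, a minimal Java sketch might look like the following; the class, method, and parameter names are our own assumptions, not the project's code.

import java.util.List;

public final class NaivePlayer {

    // Algorithm 4.1: percentage (0-100) of remaining cards that will not bust the hand.
    static double successfulHit(int playerTotal, List<Integer> remainingCards) {
        if (remainingCards.isEmpty()) return 0.0;
        long safeCards = remainingCards.stream()
                .filter(card -> playerTotal + card <= 21)
                .count();
        return 100.0 * safeCards / remainingCards.size();
    }

    // Hit whenever the survival chance meets the aggressiveness threshold,
    // which the trials swept from 0% to 100% in steps of ten.
    static boolean shouldHit(int playerTotal, List<Integer> remainingCards, double threshold) {
        return playerTotal < 21 && successfulHit(playerTotal, remainingCards) >= threshold;
    }
}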
Figure 4.1 shows the win/loss/push rate of the naïve
approach across the different successful-hit thresholds:
at 0% the player hits even with no chance of a successful
hit, and at 100% the player hits only when a successful
hit is guaranteed.
5. Modified Monte Carlo
The next evolution of our strategy was to simulate
card games and outcomes through the use of the Monte
Carlo method. In layman's terms, this amounts to
making an uncertain outcome more certain through
repeated random sampling. Outcomes are weighted by
heuristics, from what is most likely to happen down to
the least likely, and each outcome is assigned a span on
a number line proportional to its probability. Random
numbers are then drawn repeatedly, and the more
prevalent outcomes emerge from the resulting
distribution. By finding out what should occur in a
logical course of action, in this case whether to hit or
stand, the better choice eventually becomes clear. The
method's name reflects its nature: Monte Carlo is a
famous gambling destination. In 1946, the scientist
Stanislaw Ulam coordinated with John von Neumann to
devise a way to statistically calculate a game of
solitaire when normal combinatorial calculations were
not sufficient to work out solutions. With the help of
von Neumann and computers, this new wave of
calculations could be used to predict neutron diffusion
in the now famous Manhattan Project.
Though not as quaint as the original, we tried to
implement the same type of randomness and calculation
over random parts to mimic the Monte Carlo method.
Method 1 is the closest to a true Monte Carlo: it
randomly chooses the dealer's down card (1,000
guesses). Taking each guess into consideration, the
algorithm determines whether the hand will be able to
take a card without busting. Since the probability of the
next card can be expressed as 1/(remaining cards), each
of the probable hands has the same chance of receiving
the next card. If the simulated hand was successful and
beat the dealer's current value, a counter was
incremented. Once the 1,000 simulations of the hand
were finished, the counter was averaged into a
probability. If the probability was higher than 0.55
(roughly the optimal aggressiveness adopted from our
naïve algorithm), a hit was taken and the process began
again with the new card added to the player's total hand
value. This gave an overall win percentage of 0.4566.
Methods 2 and 3, though, are less based upon the Monte
Carlo method, aside from their use of randomness. In
Method 2, the algorithm calculates whether it is more
advantageous, according to a random set of cards, to
stay or to take a hit. The dealer's cards are randomized
as well, to make sure the player does not lose by always
staying on low hand values. The probability of taking a
hit and surviving was set to 0.5, an earlier estimate
based on preliminary results from our naïve approach.
Since there was no real value in pursuing this line of
method other than for comparison, updated
probabilities were not computed, though further work
may revisit it. The final method compares the hit versus
non-hit values: if the algorithm found staying more
favorable than hitting, the higher-valued action was
taken regardless of whether the two values were close
(they were not weighted). Though each of these
methods was crude, they gave us insight into how a
truly random, multidimensional problem can be
rationalized into a statistically grounded conclusion.
This is why the Monte Carlo method is used in stock
market prediction: it takes into account many
measurable factors and gives a statistically viable
answer to an outrageously random question.
count = 0;
if (howManyHits > 1)
    count++;
else if (howManyHits == 1 && playerTotal < 22)
    count++;
Algorithm 5.1 shows the pseudocode for counting a
successful simulation in the Monte Carlo method, based
on each card having a similar probability of occurrence
and assuming the hand does not start in an absorbing state.
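As a rough Java sketch of how Method 1's simulation loop might look, the code below runs the 1,000 trials and applies the 0.55 cutoff taken from the description above; the helper names and the exact per-trial success test are our own assumptions, not the project's code.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public final class MonteCarloMethod1 {

    private static final Random RNG = new Random();

    // Runs 1,000 randomized trials: each trial guesses the dealer's down card and the
    // player's next card from the remaining deck (assumed to hold at least two cards),
    // then checks whether the hit survives and beats the dealer's current value.
    static boolean shouldHit(int playerTotal, int dealerUpCard, List<Integer> remainingCards) {
        int successes = 0;
        for (int trial = 0; trial < 1000; trial++) {
            List<Integer> shuffled = new ArrayList<>(remainingCards);
            Collections.shuffle(shuffled, RNG);
            int hitCard = shuffled.get(0);        // random next card for the player
            int dealerDownCard = shuffled.get(1); // random guess at the dealer's hole card
            int newTotal = playerTotal + hitCard;
            if (newTotal <= 21 && newTotal > dealerUpCard + dealerDownCard) {
                successes++;                      // hit survived and beats dealer's current value
            }
        }
        return successes / 1000.0 > 0.55; // aggressiveness cutoff adopted from the naive trials
    }
}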
Figure 5.1 shows the win/lost/push rate for each of the
methods implemented with the Monte Carlo algorithm in
mind.
6. Combinatorial Analysis
The combinatorial analysis algorithm attempts to
maximize profit by examining the probability of a
player losing a hand by standing versus the probability
of losing when hitting. The probability of losing a
hand when hitting is calculated by counting the cards
left in the deck that will cause the player’s hand to
exceed 21, and dividing the resulting integer by the
count of remaining cards. The probability of losing a
hand when standing is calculated by exploring all legal
combinations of the dealer’s hand, given: the dealer upcard, the player’s hand, and the current deck state.
Each possible dealer end-state is generated along with
its probability, which is summed with the probabilities
of other combinations that reach the same absorbing
state. This algorithm minimizes loss
by hitting when standing is less profitable and standing
when hitting is less profitable – if standing and hitting
have the same expectation, the player will stand.
In this implementation, all possible dealer
hand combinations are generated using a depth-first
tree traversal. Backtracking is supported in this
algorithm by a stack, which contains the deck state and
probability of reaching the current hand combination.
Each time that a dealer hand with a total less than 17 is
visited, thirteen branches representing each card rank
are generated and the deck state and current probability
are pushed on to the stack. These branches are
traversed one at a time, from left to right, until a leaf is
reached. A leaf of the tree is reached when the dealer's
hand total is 17 or greater. Once a leaf is reached,
the probability of reaching the leaf is added to
the corresponding end-state and the last dealt card is
popped from the dealer’s hand. The algorithm then
pops the deck state and probability of the last state and
backtracks to observe the adjacent branch. Once the
probability of the dealer’s hand reaching each
absorbing state is calculated, the player’s decision is
made. The chance of the player losing when standing
is a summation of the probability of the dealer reaching
a non-busting absorbing state that has a total greater
than the player’s.
This algorithm did not show profitable results, with
a win to loss rate of 0.4746 over 1,000,000 hands.
While these were the most favorable results of the
algorithms implemented, the processing time of this
algorithm was also the longest.
If (hitLoss < standLoss)
    Hit
Else
    Stand
Algorithm 6.1 expresses the hit/stand decision made
once the probability of losing by hitting and the
probability of losing by standing have been calculated.
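A condensed Java sketch of the dealer end-state enumeration is shown below. It uses recursion with explicit undo in place of the stack described above, merges the ten-valued ranks into a single value for brevity, and all names are illustrative assumptions rather than the project's actual code; the caller would then sum the relevant end-state probabilities and apply Algorithm 6.1.

public final class DealerOutcomes {

    // counts[v] holds how many cards of value v (1-10, ten-valued ranks merged) remain
    // in the deck; endState[t] accumulates the probability of the dealer finishing on
    // total t and must have at least 27 entries (absorbing totals run from 17 to 26).
    static void enumerate(int dealerTotal, double probability, int[] counts, double[] endState) {
        if (dealerTotal >= 17) {          // absorbing state reached: 17-21 stand, 22-26 bust
            endState[dealerTotal] += probability;
            return;
        }
        int remaining = 0;
        for (int c : counts) remaining += c;
        if (remaining == 0) return;
        for (int value = 1; value <= 10; value++) {
            if (counts[value] == 0) continue;
            double branchProb = probability * counts[value] / remaining;
            counts[value]--;              // deal the card to the dealer
            enumerate(dealerTotal + value, branchProb, counts, endState);
            counts[value]++;              // backtrack: restore the deck state
        }
    }
}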
Figure 6.1 shows the win/loss/push rate of the
implemented combinatorial analysis algorithm.
7. Future Work
The algorithms examined in this research have
provided an understanding that will aid in the future
evolution of blackjack algorithms. Moving forward, we
plan to implement double-downs and splits in our
algorithms to approach an optimal solution. The
results of our algorithms have made it clear that
blackjack players must exploit every legal operation
to exhibit a profitable strategy.
8. Conclusion & Analysis
In conclusion, we found that no individual
implementation wins in a way that would produce a
distinct advantage over the house or dealer. That being
said, more study needs to be done on combining the
algorithms into a hybrid that attacks the same problem
from several directions; this might give us the
advantage we are looking for. As to which algorithms
performed the best, Figure 7.1 shows that the Monte
Carlo based Method 1 did just as well as the
Combinatorial Analysis, given that our goal was to
maximize profits through maximizing wins. As for the
Big-O of each, the breakdown is as follows. The
benchmark, or perfect game, although not the most
efficient code, runs in O(n), since all of its lookups are
single operations; the variable n in these computations
is one million, the number of games the algorithm
simulates. The naïve approach follows: although it
makes some calculations to determine which cards will
not bust the hand, it mostly involves single lookups, so
it is also O(n). Next was Monte Carlo Method 1, which,
even after making the class more efficient in the
implementation, still took longer than the previously
discussed algorithms; its analysis is O(1000n(x) + 1000),
where x is the number of guesses it takes to reach an
absorbing state. Finally, the largest complexity belongs
to the Combinatorial Analysis, which even when
reduced is O(n log n), because it must perform a
depth-first traversal in order to complete the algorithm.
Our final analysis of the algorithms was a time trial.
Since our code was not optimized throughout
implementation, except for a little work on the Monte
Carlo class, each time is reflective of how the algorithm
performs. Reading in the file used to keep hand data
took an average of 23.4 seconds, which was subtracted
from the run times of the algorithms. The naïve
implementation, with all ten trials (0% - 100%)
encapsulated within it, took an average of
approximately 12.1 seconds. Monte Carlo Method 1
was a bit longer at 90.4 seconds. Finally, the
time-consuming Combinatorial Analysis clocked in at
3,347.3 seconds, a little less than an hour. Intuitively,
the classes with fewer computations and lower Big-O
have far less run time than the others, and we
confirmed this in our analysis.
The other hypothesis, that an algorithm will improve
as more of the deck is revealed, holds weight for the
algorithms that take into account the probability of
specific cards appearing. The Combinatorial Analysis is
the most relevant, since it relies heavily on calculations
of which cards can show up in the future; as fewer
cards remain, the narrowing of possible outcomes
brings its predictions in line with what really occurs.
The only other such algorithm is the naïve approach,
since it determines the successful hit rate from the
remaining deck. The other methods do not deal with any
probability related to card counting and, as expected,
showed no noticeable gain as play moved further and
further into the individual decks.
Figure 7.1 shows the wins of each of the algorithms
used during this study. The perfect game was the
benchmark, and all of the algorithms fell short of this
ultimate goal.
Figure 7.2 shows the win percentage for each of the
four hands played into the deck, for each of the
algorithms. The scale was narrowed to show more of
the difference in wins.
9. References
[1] Perez, Andres & Sanchez, Eduardo. "Blackjack as a
Test Bed for Learning Strategies in Neural Networks."
Logic System Laboratory, Computer Science
Department, CH-1015 Lausanne, Switzerland.
0-7803-4859-1. 1998.
[2] Kendall, Graham & Smith, Craig. "The Evolution of
Blackjack Strategies." University of Nottingham,
School of Computer Science and IT, Nottingham NG8
1BB, UK. 0-7830-7804-0. 2003.
[3] Fogel, David. "Evolving Strategies in Blackjack."
Natural Selection, Inc., 3333 N. Torrey Pines Ct., Suite
200, La Jolla, CA 92037. 0-7830-8515-2. 2004.