Conditional Outcomes in Boxing Matches

Aaron Abstatz, Whelan Boyd, and Nate Davis
Math 20: Term Project
Zachary Hamaker
Introduction:
Predicting the outcomes of boxing matches has sparked the interest of many,
becoming a hobby or pastime for some and a means of wealth appropriation for bookies
and gamblers. Instead of analyzing the Las Vegas trends involving bookies’ favorites or
betting patterns, we hoped to identify patterns within the match itself that might be helpful in
predicting the outcome during the fight. Hence, we set out to examine various pools of data
regarding specific statistics found in boxing matches in order to derive an algorithm that
could be used to predict the outcomes. In particular, we analyze statistics regarding the
number of jabs and power punches thrown and landed per round and per fight, focusing on
the victorious fighter. We observe that these compilations of data often fit a normal
distribution and thus we can use the various instruments at our disposal to compute
confidence intervals regarding the probability that a winner falls within our statistical
range. Furthermore, we observe that the rate of punches thrown and landed per round is
relatively stable and thus lends itself to the use of Poisson approximations to predict
conditional outcomes of the fight.
Regarding our data set, we attempted to use fights from many different fighters in
order to achieve some semblance of randomness. However, there were various
constraints. Notably, the statistical match data available to us is weighted somewhat
unevenly to a few particular fighters, namely Floyd Mayweather and Manny Pacquiao
among other well-known boxers, because of the higher number of their fights that are
televised. In order to account for this, we selected randomly for all heavyweight fighters,
purposefully excluding a proportional number of the aforementioned boxers’ fights.
Our principal conclusion is that early rounds have very little effect on the outcome of the
match. This provides insight into how bettors ought to take advantage of ostensibly
lopsided early rounds, because the trailing fighter's probability of winning is higher than one might
intuitively perceive.
Methodologies:
We obtained our data set through a website called www.compubox.com as well as
personal viewings of boxing matches. Our data set includes jabs thrown, jabs landed,
power shots thrown, and power shots landed per fight for as close to a random subset of
boxers as our means allowed. Calculations of the probability of a successful punch easily
follow from these data. We began by entering these data into an Excel spreadsheet. We
derived a system of points to be used in determining the total number of successful
punches needed to win a match. We assigned a coefficient value of 1 to jabs and 2 to
power punches. That is, if a fighter lands 10 jabs and 5 power punches, he would achieve
20 points.
1*(10 jabs)+2*(5 power shots)=20 points
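The scoring rule above can be written as a short function. This is only an illustrative sketch: the function and constant names are ours, and the weights are the coefficients of 1 and 2 stated in the text.

```python
# Weights taken from the text: 1 point per landed jab, 2 per landed power punch.
JAB_WEIGHT = 1
POWER_WEIGHT = 2


def points(jabs_landed: int, power_landed: int) -> int:
    """Return the point total for a given number of landed punches."""
    return JAB_WEIGHT * jabs_landed + POWER_WEIGHT * power_landed


# The worked example from the text: 10 jabs and 5 power punches.
print(points(10, 5))  # 20
```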
We noticed that the total number of points achieved by the victor loosely followed a normal
distribution when aggregated. Hence, we used the formulas from class to determine the
variance, both per fight and per round, of points as well as an expected value of points both
per fight and per round. For variance, we used the summation formula, adding the
percentage of punches landed multiplied by the square of the difference between the
expected number of points and the observed number of points: $\sum_X \mathrm{prob}(X)\,(X - m)^2$, where $m$ is the expected number of points. We
then calculated various confidence intervals. For example, our 95% confidence interval
means that of all victorious fighters, we can say with 95% confidence that the victor of a
particular fight has achieved a point total within our interval.
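A minimal sketch of the interval calculation, assuming the interval is built as the mean plus or minus z standard deviations of the winners' point totals under a normal model. That construction is our reading of the method rather than something the spreadsheet confirms. The function name is hypothetical, and the sample totals are placeholders, not data from our set.

```python
from statistics import NormalDist, mean, stdev


def winner_point_interval(totals, coverage=0.95):
    """Range expected to contain `coverage` of winners' point totals,
    assuming the totals are roughly normally distributed."""
    m = mean(totals)
    s = stdev(totals)  # sample standard deviation
    # z is about 1.96 for 95% coverage
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return m - z * s, m + z * s


# Placeholder totals for illustration only (not real fight data).
example_totals = [100, 120, 140]
low, high = winner_point_interval(example_totals)
print(f"lower limit: {low:.1f}, upper limit: {high:.1f}")
```

Only the lower limit is used in the later Poisson step, since a fighter needs only to reach the range.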
Next, for the active component of the project, we set out to determine the number
of points the fighter must achieve by a given round in order to have a given probability of
winning. We calculated the variance per round by dividing the total variance by twelve, the
number of rounds in a title bout. This provides a figure for the “average” boxer. We used
the expected number of points per round to calculate the parameter lambda and then
utilized a Poisson approximation. To determine the parameter k, we subtracted the given
number of previously achieved points from the lower limit of our expected value confidence
interval. We then divided this figure by the number of rounds remaining to find an average
number of points per round necessary for the fighter to achieve in order to reach the
range. We can disregard the upper limit of the range because the fighter must only
achieve a minimum number of points to enter the interval. Tallying a point total greater than
the range only leads to a higher (though quite marginally higher) probability of winning.
Lastly, since the figures we used in our Poisson approximation refer to the “average”
boxer, it follows that his probability of winning is 50%. The Poisson approximation gives the probability that a
boxer will achieve enough points to be in the 95% confidence interval:
$\sum_{k=x}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!}$, where $x$ is the average number of
points per round needed to reach the lower limit of the range
of points that 95% of winning boxers achieve. We multiply this figure by 0.5 in order to
obtain the probability that the fighter will win. Then, we calculated the necessary number of
points through a given round to have roughly a 50%, 33%, and 25% chance of winning the
fight.
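The procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the spreadsheet we used: the function names are ours, the required points per round are rounded up to a whole number (an assumption, since the Poisson tail needs an integer), and the values of lambda and the lower limit in the usage lines are hypothetical placeholders.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def win_probability(points_so_far, rounds_completed, lower_limit, lam,
                    total_rounds=12):
    """0.5 times the chance of averaging enough points per remaining round
    to reach the lower limit of the winners' range."""
    remaining = total_rounds - rounds_completed
    if remaining <= 0:
        raise ValueError("no rounds remaining")
    shortfall = max(0, lower_limit - points_so_far)
    needed_per_round = math.ceil(shortfall / remaining)  # assumption: round up
    return 0.5 * poisson_tail(needed_per_round, lam)


# Hypothetical inputs: lambda = 30 points per round, lower limit = 200 points.
for scored in (0, 50, 100):
    p = win_probability(scored, rounds_completed=6, lower_limit=200, lam=30.0)
    print(scored, round(p, 3))
```

A fighter who has already reached the lower limit gets exactly 0.5, which matches the "average boxer" assumption in the text.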
Results:
Our findings (active component) show that the total number of points scored in
rounds 1-2 has no perceivable effect on the probability of winning the fight. At the start of
round 4, with 0 points scored, the fighter has a 49.7% chance of winning. With 19 points
scored, he has a 50% chance of winning. Having scored 0 points by end of round 4, the
fighter’s probability then drops significantly to a 39.2% chance of winning the fight. By
scoring 36 points by the end of round 4, the fighter’s probability of winning increases to
50%. The accumulation of points begins to have more than marginal effects on the
outcome at the start of round 5, steadily influencing the result more strongly each round
thereafter. See spreadsheet for empirical data.
Discussion:
It is quite surprising and important that the first two rounds have virtually no effect
on the outcome of the match (assuming there is no KO, which is something we would take
into account if we went more in depth on this project). However, as the fight gets into later
rounds, the point totals needed for a given probability of winning vary more
widely. For example, at the start of round 8, a boxer needs to have already scored 85
points to have a 50% chance of winning, only 15 points to have a 33% chance of winning,
and only 5 points to have a 26% chance of winning! These results may seem startling;
however, they make sense because a common boxing strategy is to conserve energy and
tire your opponent. Furthermore, there is a 50% chance that a boxer goes on a streak of at least 32
points in one round (boxers often have flurries of punches, as our research suggests).
Lastly, some fights produce a victor who has achieved fewer points than his
opponent. This does not contradict our results. Our results show the probability of winning,
given a fighter achieves a point total within the necessary range. That is not to say that he
will win every time he reaches this total. For example, see attached stat sheet for Williams
defeats Lara. This happens in real scenarios because of variability in the actual punches’
effects and thus the judges’ perception of the fight (e.g., a head shot trumps a body
shot). For future analysis, a detailed examination of punch location would benefit our
findings.
Our research does have limitations because the scope of this project is not large enough to
support more accurate conclusions. One of our major assumptions,
which is obviously not always true, is that all boxers are equal. We could solve this problem
by going round by round for each boxer and computing his individual ratio of power
punches to jabs, as well as a lambda (rate of points scored) for each individual boxer.
Furthermore, these statistics can never be fully accurate because the classification of jab
versus power and hit versus miss is made by human observers, which is obviously
susceptible to error. Our choice of coefficients is admittedly somewhat arbitrary. Probably
the most significant change to our research methods in the future would be to compute
more accurate coefficients for the jab and power punch.
Another problem with this research is that we do not have a great sample.
First, we have too few observations. Second, the sample is not completely
random because there are more statistics available for famous boxers (for example, there are more stats
for Floyd Mayweather, who is known for his power boxing style). If we had a larger data
pool, the standard deviations and hence the 95% confidence intervals would tend towards the
mean. As a result, the number of points accumulated at the start of a round would need to be
higher in order to maintain our given chances of winning. Consequently, the earlier rounds
would matter more; however, their effect would be very small compared to later rounds.
Furthermore, if we wanted to include matches that ended with a KO, it might be
more appropriate to use a fat-tailed distribution and treat KOs as black swans (very rare
events under a normal distribution). Such a distribution is very similar to the normal distribution;
however, there is a greater chance that outcomes far from the mean occur (as shown by the
picture below).
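To make the fat-tail idea concrete, the sketch below compares upper-tail probabilities of the standard normal distribution with those of the standard Cauchy distribution. The Cauchy is our own stand-in, chosen only because its tail has a simple closed form; it is a deliberately extreme example of fat tails and not necessarily the distribution one would fit to KO data.

```python
import math
from statistics import NormalDist


def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable."""
    return 1.0 - NormalDist().cdf(z)


def cauchy_upper_tail(z: float) -> float:
    """P(X > z) for a standard Cauchy variable (an illustrative fat-tailed case)."""
    return 0.5 - math.atan(z) / math.pi


# Far from the center, the fat-tailed distribution keeps much more probability.
for z in (1, 2, 3, 4):
    print(z, normal_upper_tail(z), cauchy_upper_tail(z))
```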
Conclusions:
Our results provide information for bettors who may choose to place their bets during the
fight given certain conditions from previous rounds. Punching statistics only begin to affect
the outcome of the match by the end of round 4 or the start of round 5. That is, how many
punches a boxer lands in the first 3 rounds has very little statistical effect on the outcome of
the match, and only begins to positively influence the result in round 4. Thus, while many
observers may view lopsided early rounds as reason to bet against a particular fighter, our
results show that a bettor ought to keep in mind that the results of early rounds do not
statistically matter. For example, if a boxer conserves his energy and does not land a single
punch by the start of round 7, many people will believe that this boxer will lose the match
(and the announcers will most likely fuel that fire). However, our results show that this
fighter still has a 39% chance of winning! Therefore, as long as the odds on that fighter are
longer than about 3-to-2 (break-even at a 39% win probability is roughly 1.56-to-1), a bettor should place a bet on him. Although these results are somewhat
surprising, it is not hard to see how a boxer could accumulate very few points in
early rounds, allowing his opponent to tire, and still make a comeback by tallying 40-50
points or more in each of the remaining rounds. Muhammad Ali provides strong anecdotal
evidence of this with the rope-a-dope strategy he championed in the 1974 fight, “The
Rumble in the Jungle,” versus George Foreman.
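The betting rule in the conclusions can be checked with a short expected-value calculation. This is a sketch only: the function names are ours, odds are expressed as profit per unit staked (so 3-to-2 is 1.5), and the 39% figure is the one quoted above.

```python
def break_even_odds(p_win: float) -> float:
    """Odds (profit per unit staked) at which a bet has zero expected profit."""
    if not 0.0 < p_win < 1.0:
        raise ValueError("p_win must be strictly between 0 and 1")
    return (1.0 - p_win) / p_win


def expected_profit(p_win: float, odds_to_one: float) -> float:
    """Expected profit per unit staked: win odds_to_one with probability p_win,
    lose the stake otherwise."""
    return p_win * odds_to_one - (1.0 - p_win)


p = 0.39  # win probability quoted in the text
print(break_even_odds(p))       # about 1.56
print(expected_profit(p, 1.5))  # slightly negative at exactly 3-to-2
print(expected_profit(p, 2.0))  # positive at 2-to-1
```

At exactly 3-to-2 the bet is marginally unfavorable, so the odds must be somewhat longer than 3-to-2 for the bet to pay on average.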