William Strauss, PhD FutureMetrics November 4, 2010

Is there a game with a probability of winning that is greater than 50%?
Yes. But is there a way to make money on those situations?
Only if the Return On Investment (ROI) is greater than zero.
For example, a baseball team is highly favored. The probability of them winning is perhaps 65%.
But the moneyline* will reflect this likelihood and will make the probability of winning money less than 50%.
*With a “20‐cent” line, the favorite might be ‐200 and the underdog +180. So, if you see a money line at ‐200, this means you have to bet $200 to win $100. If your bet is on an underdog, then at a line of +180 you would win $180 on a $100 bet.
This is illustrated by the red line
The spread is how the sports books make money
Note that moneylines do not exist between ‐100 and +100.
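The moneyline arithmetic in the footnote can be sketched in a few lines. This is only an illustration: the function names `moneyline_profit` and `break_even_probability` are hypothetical and are not part of MB4 or any sports book's software.

```python
def moneyline_profit(line: int, stake: float) -> float:
    """Profit on a winning bet of `stake` dollars at an American moneyline."""
    if -100 < line < 100:
        raise ValueError("moneylines do not exist between -100 and +100")
    if line < 0:
        # Favorite: risk |line| dollars to win 100.
        return stake * 100.0 / abs(line)
    # Underdog: risk 100 dollars to win `line`.
    return stake * line / 100.0


def break_even_probability(line: int) -> float:
    """Win probability at which a bet at this line has zero expected profit."""
    if -100 < line < 100:
        raise ValueError("moneylines do not exist between -100 and +100")
    if line < 0:
        return abs(line) / (abs(line) + 100.0)
    return 100.0 / (line + 100.0)


favorite, underdog = -200, 180
print(moneyline_profit(favorite, 200))   # bet $200 on the favorite to win $100
print(moneyline_profit(underdog, 100))   # bet $100 on the underdog to win $180
total = break_even_probability(favorite) + break_even_probability(underdog)
print(f"break-even probabilities sum to {total:.4f}")
```

At -200 the break-even probability is 2/3, so a favorite with a 65% chance of winning is still a losing bet on average. The two break-even probabilities add up to more than 1, and that excess is the book's margin.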
But if you went to Clay Graham’s presentation, you would know that there is a way to do better than the average bettor.
The system described in Graham’s presentation is being used by BaseballWin, LLC and is called MB4.
The model is very complex, but essentially it works out a probability distribution of the expected run production. That probability distribution will be important later…
[Diagram: model inputs Batter, Pitcher, Park, and Ump, with RUNS at the center]
A key part of the model is that outcomes of the simulation engine are compared to boundary conditions that are updated daily (the model has a “settling in” period in early season) to optimize the actual historical return on investment.
Outcome of the 2010 baseball season
Using this data for wagering…
Using the data from MB4, we can see how an imaginary bet of $5000 on each of the 530 games selected for the 2010 season (out of more than 2000 games) would have paid off over the season.
Note that no games are picked in the first 30 to 40 days or after August 31. Also no games are picked during interleague play. The model “settles” in during the early season and teams play differently when they have either clinched, lost, or are fighting for a playoff spot.
For the 2010 season, early season was pretty good. Interleague was okay. September (mostly not shown) was not good (that downward slope continued!).
So for all of the following analysis, early-season, late-season, and interleague games are excluded.
So how did $5000 per pick do in the constrained sample?
[Figure: 2010 Daily Pick Returns, flat bet of $5000 per game. Histogram of daily returns (x-axis: values in thousands) with a fitted Lognorm curve. Input: mean 2561.9818, std dev 10397.6538. Lognorm fit: mean 2561.8000, std dev 10329.5000. 90.0% of the input (90.4% of the fit) lies between -14.5 and 19.6.]
Can we do better?
The fundamental problem in gambling is to find positive betting opportunities. Just like with investing, one wants to find investments with excess risk-adjusted expected rates of return.
We have identified that opportunity.
Now the problem is how much capital to bet. It is an asset allocation problem!
One can minimize the probability of total loss or maximize the probability of reaching a fixed goal on or before N trials.
But an approach studied by economists is to value money with a utility function.
Once the utility function is defined, the object is to maximize the expected value of the utility of wealth!
In 1956, J.L. Kelly* wrote a paper that started a literature that has centered on studying the “Kelly Rule”.
*J.L. Kelly, Jr., A new interpretation of information rate, Bell System Tech. Jour. (1956), 917–26.
In 1997, E.O. Thorp wrote “The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market*”. That paper provides a deep theoretical analysis of how the Kelly Rule will maximize expected logarithmic utility. The paper also shows how the author used the Rule in blackjack (while card counting) and in the stock market.
However, to date, we have not seen an application of the Kelly Rule that uses stochastic inputs to understand the actual risk of ruin.
We do that here.
*Presented at the 10th International Conference on Gambling and Risk Taking, Montreal, June, 1997.
A simple explanation of the Kelly Rule
Suppose that you have the opportunity to gamble with a 2/3 probability of winning and a 1/3 probability of losing. The amount you bet is either doubled or lost. You have twenty trials.
The betting system that maximizes the expected value of your fortune on day twenty is to bet all you have each day.
There is a BIG problem.
If $X_k$ is the amount of your fortune after the $k$th trial and on the $k$th trial you can bet $b_k$, then the max fortune strategy will lead to:
$$X_{20} = \begin{cases} 2^{20} X_0 & \text{with probability } (2/3)^{20} \approx .0003 \\ 0 & \text{with probability } .9997 \end{cases}$$
Not a good rule!
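The arithmetic behind the bet-everything strategy can be checked directly. The short sketch below uses only the numbers from the example (2/3 win probability, twenty trials, stake doubled or lost). The variable names are illustrative.

```python
p_win = 2 / 3
trials = 20

# Betting the whole fortune each day: it doubles on every win
# and is wiped out by the first loss.
prob_survive = p_win ** trials              # chance of winning all twenty trials
surviving_multiple = 2 ** trials            # fortune multiple if every bet wins
expected_multiple = (2 * p_win) ** trials   # E[X_20] / X_0

print(f"P(winning all {trials} trials) = {prob_survive:.6f}")
print(f"P(ruin)                   = {1 - prob_survive:.6f}")
print(f"fortune multiple if never losing = {surviving_multiple}")
print(f"expected fortune multiple        = {expected_multiple:.1f}")
```

The expected fortune is large only because a very rare outcome carries an enormous payoff. Almost every path ends at zero.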
To avoid the likelihood of ruin, a proportional betting system is employed in which you would bet a proportion of your fortune at each trial.

Using $\gamma$ to denote the proportion invested, and you bet that proportion at each trial, after $n$ stages if you have won $Z_n$ times and lost $n - Z_n$ times, your fortune will be
$$X_n = (1+\gamma)^{Z_n}(1-\gamma)^{n-Z_n} X_0$$
We can expand that and determine the rate of convergence to infinity. This is the limit, as $n \to \infty$, of
$$\frac{1}{n}\log\left((1+\gamma)^{Z_n}(1-\gamma)^{n-Z_n}\right)$$

The value of $\gamma$ that maximizes the rate of convergence can be derived by setting the derivative of the equation above to zero and solving for $\gamma$.

Solving the problem of how to bet with a 2/3 probability of winning yields the solution that $\gamma = 1/3$.
This says that if two bettors are wagering on the same event and one uses $\gamma = 1/3$ and the other uses $\gamma \ne 1/3$, then the bankroll of the first will eventually be larger than the other’s and will stay greater from that trial on. Generalizing the probability to $p$, the rate of convergence is maximized with $\gamma = 2p - 1$.
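The maximization step can be written out as follows. This is a sketch of the standard argument, not taken from the slides. It writes $\gamma$ for the proportion bet (a symbol chosen here for the derivation), $p$ for the probability of winning, and $G$ for the growth rate, and it uses the law of large numbers, $Z_n/n \to p$.

```latex
\begin{align*}
G(\gamma) &= \lim_{n\to\infty}\frac{1}{n}\log\left((1+\gamma)^{Z_n}(1-\gamma)^{n-Z_n}\right)
           = p\log(1+\gamma) + (1-p)\log(1-\gamma) \\
G'(\gamma) &= \frac{p}{1+\gamma} - \frac{1-p}{1-\gamma} = 0 \\
p(1-\gamma) &= (1-p)(1+\gamma) \\
\gamma &= 2p - 1
\end{align*}
```

With $p = 2/3$ this gives $\gamma = 1/3$. Because $G''(\gamma) < 0$ for $0 \le \gamma < 1$, the stationary point is a maximum.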
Thus the Kelly Rule
$$\gamma(p) = \begin{cases} 0 & \text{if } p \le .5 \\ 2p - 1 & \text{if } p > .5 \end{cases}$$
This can be interpreted as the optimal rule of an investor who has the logarithm of their fortune as their utility function and wants to maximize the expectation of the utility of their fortune at each trial.
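A small numerical check of the rule, assuming even-money bets as in the example above. The function names `kelly_fraction` and `growth_rate` are illustrative. The growth rate used is the expected logarithm of the per-trial fortune multiple, which is the quantity a log-utility investor maximizes.

```python
import math


def kelly_fraction(p: float) -> float:
    """Even-money Kelly Rule: bet 2p - 1 of the fortune if p > 0.5, else nothing."""
    return 2 * p - 1 if p > 0.5 else 0.0


def growth_rate(fraction: float, p: float) -> float:
    """Expected log growth per trial: p*log(1 + f) + (1 - p)*log(1 - f)."""
    return p * math.log(1 + fraction) + (1 - p) * math.log(1 - fraction)


p = 2 / 3
best = kelly_fraction(p)

# Brute-force search over proportions 0.000, 0.001, ..., 0.999 as a cross-check.
grid = [i / 1000 for i in range(1000)]
grid_best = max(grid, key=lambda f: growth_rate(f, p))

print(f"Kelly fraction for p = 2/3: {best:.4f}")
print(f"best fraction on the grid:  {grid_best:.3f}")
print(f"growth rate at Kelly:       {growth_rate(best, p):.5f}")
print(f"growth rate at 0.9:         {growth_rate(0.9, p):.5f}")
```

Betting far more than the Kelly proportion makes the growth rate negative, so overbetting a favorable game still loses money in the long run.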
A single Kelly value applied to daily portfolio allocation given the expectations of MB4
Next we define a single annual Kelly value that will maximize the baseball season’s return on investment. Since there is more than one pick per day but the bankroll is a function of a single day’s events, the Kelly value has to be allocated over several picks.
Using 2009 data from MB4 to determine the P(w), we derive a Kelly number of 2.78% per pick. That is based on a probability of correctly picking a winner of 58.61%.
This is a large improvement over the flat bet.
Increasing the proportion to 12.0% yields a much higher return. However, there is a danger! How many times have we been cautioned when investing that “historical data does not guarantee future results”?
There is the likelihood of consecutive days of winning and losing
So the problem is to find an optimal wagering strategy that yields good Kelly results AND is robust to uncertainty.
Since the probability of one of the teams winning (and losing) varies for each game, determining Kelly criteria for each individual game is a starting point.
[Figure: estimated win probabilities by game, two series with Mean = 0.5497 and Mean = 0.4503]
(Note the home team advantage!)
It is important to note that a “pick” is not just due to P(w). A function of the P(w) and the moneyline determines an estimated value of the return on investment. Some high P(w)’s will not be picked due to the low expected return.
EVROI = Expected value of return on investment
If Line < 0 then EVROI = {P(w)*A - [1-P(w)]*Abs(line)}/A
If Line > 0 then EVROI = {P(w)*line - [1-P(w)]*A}/A
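The two EVROI formulas translate directly into code. The sketch below follows them as written and assumes that A is the standard $100 moneyline unit, which the slides do not state. The function name `evroi` is illustrative.

```python
def evroi(p_win: float, line: int, unit: float = 100.0) -> float:
    """Expected value of return on investment, as given by the two formulas.

    `unit` plays the role of A and is ASSUMED to be the $100 moneyline unit.
    """
    if -100 < line < 100:
        raise ValueError("moneylines do not exist between -100 and +100")
    if line < 0:
        # Favorite: win A with probability P(w), lose |line| otherwise.
        # The formula divides by A, the amount to win, in this branch too.
        return (p_win * unit - (1 - p_win) * abs(line)) / unit
    # Underdog: win `line` with probability P(w), lose A otherwise.
    return (p_win * line - (1 - p_win) * unit) / unit


print(f"65% favorite at -200: EVROI = {evroi(0.65, -200):+.3f}")
print(f"40% underdog at +180: EVROI = {evroi(0.40, 180):+.3f}")
```

The first case shows why a high P(w) alone does not make a pick: at -200 a 65% favorite has a negative expected return. An underdog with a modest P(w) can have a positive one.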
[Figure: Distribution of Winning Pick Probabilities (2010 season). Histogram with a fitted LogLogistic curve. Input: mean 0.6420, median 0.6320, std dev 0.0797. LogLogistic fit: mean 0.6425, median 0.6336, std dev 0.0826. 90.0% of the input (89.2% of the fit) lies between 0.526 and 0.775.]
After some experimentation, a good solution is found when the “basis” for each game pick is the amount of the original bankroll. The basis is the value to which the Kelly number is applied. Note that this result is achieved over 93 days!
If the basis is allowed to grow as the bankroll grows (set by a parameter), the optimal solution that minimizes the coefficient of variation* is to hold the basis to the starting bankroll (that is, the optimal solution is with the growth parameter equal to zero). But there is nothing to prohibit a larger starting basis! Also note that none of the original bankroll is used after the first days. This may not be the case in other seasons but over the course of the season, as long as the model continues to identify mispricings in moneylines, over time the fortune should grow at the optimal Kelly rate.
*The standard deviation of all of the game returns divided by the average of those returns. The coefficient of variation is a normalized measure of the variability of a probability distribution.
Starting with a $1,000,000 bankroll
How Robust is this Solution?
We can use the loglogistic distribution that describes the estimated probabilities of winning to drive a simulation in which each game’s estimated probability of winning varies inside that distribution. The terminal value at the end of the season will reflect the range of potential outcomes.
The distribution of outcomes from 5000 iterations of the simulation shows that the worst case is a terminal value of $350,000.
The same results charted over the season.
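The structure of such a simulation can be sketched as below. This is a simplified illustration under loudly stated assumptions, not the model behind the charts. It uses a two-parameter log-logistic whose scale is the fitted median and whose shape value of 14 is a placeholder rather than the fitted parameter. It assumes even-money payoffs instead of actual moneylines and a fixed basis equal to the starting bankroll, and it runs 1000 iterations rather than 5000 to keep it quick. Its output will therefore not match the figures quoted above.

```python
import random
import statistics


def sample_loglogistic(rng: random.Random, scale: float, shape: float) -> float:
    """Inverse-CDF draw from a two-parameter log-logistic distribution."""
    u = rng.random()                       # uniform on [0, 1)
    return scale * (u / (1.0 - u)) ** (1.0 / shape)


def simulate_season(rng: random.Random,
                    start_bankroll: float = 1_000_000.0,
                    kelly_per_pick: float = 0.0278,
                    games: int = 530,
                    scale: float = 0.6336,   # fitted median used as the scale
                    shape: float = 14.0) -> float:  # ASSUMED placeholder shape
    """Terminal bankroll after one season of fixed-basis Kelly bets."""
    basis = start_bankroll                 # fixed basis: it never grows
    bankroll = start_bankroll
    for _ in range(games):
        # Each game's win probability varies inside the distribution.
        p_win = min(sample_loglogistic(rng, scale, shape), 0.99)
        stake = min(kelly_per_pick * basis, bankroll)
        if stake <= 0:
            break                          # ruined: nothing left to bet
        if rng.random() < p_win:
            bankroll += stake              # ASSUMED even-money payoff
        else:
            bankroll -= stake
    return bankroll


rng = random.Random(2010)                  # fixed seed for repeatability
terminal = [simulate_season(rng) for _ in range(1000)]
print(f"worst terminal value:  {min(terminal):,.0f}")
print(f"median terminal value: {statistics.median(terminal):,.0f}")
print(f"mean terminal value:   {statistics.mean(terminal):,.0f}")
```

Replacing the even-money payoff with the moneyline payout for each pick, and the placeholder shape with the fitted parameters, would bring the sketch closer to the analysis described here.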
We can “tweak” the parameter that allows the basis to grow with the bankroll. In this example the bankroll can grow by 40% of the previous trial’s bankroll. Note that the risk of ruin is 28.5%. There is a 50% probability that the ending bankroll will be less than $11 million. But the mean value of the last day’s fortune is $25,428,531! (There is a 10% chance the ending value will be greater than $63.76 million.)
How strong is your @RISK appetite?
Thank you,
William Strauss, PhD
FutureMetrics
WilliamStrauss@FutureMetrics.com