Dr. B.J. Dean's Baseball Talk (Extension of Dr. May's model)

Are Seven-game Baseball Playoffs
Fairer Than Five-game Series?
Dr. Brian Dean
The Conventional Wisdom
Teams that have earned home-field advantage over the
course of a 162-game regular season prefer longer, seven-game
playoff series to five-game series, feeling that the
“better” team is more likely to win in a longer series.
Question: Is the difference between seven-game and five-game
series so great that baseball should consider changing
the Division Series round to a best-of-seven format?
Goal: Create a mathematical model to analyze this situation.
Dr. Lee May (1992):
Seven-game series are not significantly fairer than five-game
series (where significantly fairer is defined to mean that the
better team's probability of winning a seven-game series is at
least four percentage points higher than its probability of
winning a five-game series).
May’s model: Let p denote the probability that the better
team will win a given game. Since he’s looking at things from
the point of view of the better team, p must lie in the interval
[0.5, 1]. (For example, p = 0.7 if the better team has a 70%
probability of winning a given game.) Note that May’s model
treats each game of the series the same.
Probability that the better team will win a five-game series
In May’s model, the probability of each W for the better team is p, so the probability of each
L for the better team is 1-p. There are ten total scenarios of victory for the better team in a
five-game series. The probability of each scenario is the product of the probabilities of each
individual game.
Result    Probability
WWW       p³
LWWW      p³(1-p)
WLWW      p³(1-p)
WWLW      p³(1-p)
LLWWW     p³(1-p)²
LWLWW     p³(1-p)²
LWWLW     p³(1-p)²
WLLWW     p³(1-p)²
WLWLW     p³(1-p)²
WWLLW     p³(1-p)²
Adding these, the total probability that the better team would win a five-game series is
6p⁵ - 15p⁴ + 10p³
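The ten scenarios and their sum can be checked mechanically. The following is a minimal Python sketch, not part of the original talk; the names winning_scenarios, series_win_prob, and may_five are illustrative choices.

```python
from itertools import combinations

def winning_scenarios(wins_needed):
    """Yield every W/L string in which the team clinches with the final game."""
    for losses in range(wins_needed):
        n_before = wins_needed - 1 + losses  # games played before the clincher
        for loss_slots in combinations(range(n_before), losses):
            games = ["W"] * n_before
            for i in loss_slots:
                games[i] = "L"
            yield "".join(games) + "W"

def series_win_prob(p, wins_needed):
    """Sum the scenario probabilities, each a product of per-game probabilities."""
    return sum(p ** s.count("W") * (1 - p) ** s.count("L")
               for s in winning_scenarios(wins_needed))

def may_five(p):
    """Closed form quoted in the talk for a five-game series."""
    return 6 * p**5 - 15 * p**4 + 10 * p**3

scenarios = list(winning_scenarios(3))
print(len(scenarios), sorted(scenarios))
for p in (0.5, 0.6, 0.7):
    print(p, series_win_prob(p, 3), may_five(p))
```

The two printed probabilities on each line should agree up to floating-point rounding.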
Probability that the better team will win a seven-game series
There are a total of 35 different scenarios in which the better team would win a seven-game
series. Rather than listing each individually, we will summarize the probabilities of the
different scenarios under May’s model:
Series length    # of Scenarios    Probability of Each
4 games          1                 p⁴
5 games          4                 p⁴(1-p)
6 games          10                p⁴(1-p)²
7 games          20                p⁴(1-p)³
Adding the probabilities of the 35 scenarios, the total probability that the better team would
win a seven-game series is
-20p⁷ + 70p⁶ - 84p⁵ + 35p⁴
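The scenario counts 1, 4, 10, and 20 are binomial coefficients: a series won in 4 + k games has its k losses placed somewhere among the first 3 + k games. A short Python sketch under that observation (the function names are hypothetical, chosen for this check only):

```python
from math import comb

def best_of_prob(p, wins_needed):
    """Win probability when every game is won independently with probability p."""
    return sum(comb(wins_needed - 1 + k, k) * p**wins_needed * (1 - p)**k
               for k in range(wins_needed))

def may_seven(p):
    """Closed form quoted in the talk for a seven-game series."""
    return -20 * p**7 + 70 * p**6 - 84 * p**5 + 35 * p**4

# Number of winning scenarios that end in 4, 5, 6, and 7 games.
counts = [comb(3 + k, k) for k in range(4)]
print(counts, sum(counts))

for p in (0.5, 0.6, 0.7):
    print(p, best_of_prob(p, 4), may_seven(p))
```

Calling best_of_prob(p, 3) gives the five-game probability from the same formula.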
Comparing five-game and seven-game series
To compare five-game and seven-game series in May’s model, let f(p) denote the
probability that the better team would win a seven-game series, minus the
probability that it would win a five-game series:
f(p) = (-20p⁷ + 70p⁶ - 84p⁵ + 35p⁴) - (6p⁵ - 15p⁴ + 10p³)
     = -20p⁷ + 70p⁶ - 90p⁵ + 50p⁴ - 10p³,   0.5 ≤ p ≤ 1
The maximum value of this function is ≈ 0.0372 (when p ≈ 0.689), and the
minimum value is 0 (when p = 0.5 or p = 1). In other words, under May’s model, the
better team’s probability of winning a seven-game series exceeds its probability of
winning a five-game series by at most about 3.72 percentage points. Since this is
below the four percent threshold in May’s definition, a seven-game series is not
significantly fairer than a five-game series.
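The location and size of the maximum can also be checked by hand. One way, offered here as a supplementary sketch rather than as part of May's argument, is to factor f(p) and substitute p = 1/2 + x, where x is a symbol introduced only for this check:

```latex
\begin{align*}
f(p) &= -20p^7 + 70p^6 - 90p^5 + 50p^4 - 10p^3 = 10\,p^3(1-p)^3(2p-1), \\
f\!\left(\tfrac12 + x\right) &= 20x\left(\tfrac14 - x^2\right)^3, \qquad 0 \le x \le \tfrac12, \\
\frac{d}{dx}\left[20x\left(\tfrac14 - x^2\right)^3\right]
  &= 20\left(\tfrac14 - x^2\right)^2\left(\tfrac14 - 7x^2\right) = 0
  \;\Longrightarrow\; x = \frac{1}{2\sqrt7}, \\
p &= \tfrac12 + \frac{1}{2\sqrt7} \approx 0.689, \qquad
f(p) = \frac{10}{\sqrt7}\left(\frac{3}{14}\right)^3 \approx 0.0372.
\end{align*}
```

The factored form also shows directly that f vanishes at p = 0.5 and at p = 1.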
What are some possible ways to modify May’s
model?
My model, the subject of the rest of this talk, will attempt to
take home-field advantage into account. That is, the
probabilities of victory/defeat in road games will be different
from those in home games.
Another possible modification, which we won’t discuss, would
be to account for the effects of momentum/morale. That is,
would the status of the series after each game affect the
probabilities of victory/defeat in the next game? For example,
would the probability of victory in game 2 differ depending on
whether the team won or lost game 1?
Model taking home-field advantage into account
Let Team H be the team with home-field advantage in the
series, and let p be the probability that Team H will win a
given home game. (Since Team H is not necessarily the
“better” team, our model does not imply that p ≥ 0.5 like
May’s did. Instead, we will allow p to be anything in the
interval [0,1], though it seems unlikely that it would ever be
much less than 0.5 in practice.)
We will take the probability that Team H will win a given road
game to be rp, where r is a parameter we will call the road
multiplier.
Road Multiplier
For a given team, we define the road multiplier as the ratio of its road winning
percentage to its home winning percentage. If a team’s road multiplier were 0.9, for
example, the team would be 90% as likely to win a given road game as a given home game.
For the 112 playoff teams of the first 14 years of the wildcard era (1995-2008), the average
road multiplier has been (to three decimal places) 0.883. The three highest and three lowest
road multipliers, rounded to three decimal places, have been:
Team             Home     Road     Road Multiplier
’01 Braves       40-41    48-33    1.200
’97 Orioles      46-35    52-29    1.130
’01 Astros       44-37    49-32    1.114
’03 Athletics    57-24    39-42    0.684
’05 Astros       53-28    36-45    0.679
’08 White Sox    54-28    35-46    0.656
24 of the 112 road multipliers have been 1.000 or higher, and 17 have been 0.750 or lower.
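The multipliers in the table follow directly from the home and road records. As a minimal Python sketch (the function name road_multiplier and the dictionary layout are illustrative, not from the talk), using only the six records listed above:

```python
def road_multiplier(home_wins, home_losses, road_wins, road_losses):
    """Ratio of road winning percentage to home winning percentage."""
    home_pct = home_wins / (home_wins + home_losses)
    road_pct = road_wins / (road_wins + road_losses)
    return road_pct / home_pct

# (home wins, home losses, road wins, road losses), copied from the table.
records = {
    "'01 Braves": (40, 41, 48, 33),
    "'97 Orioles": (46, 35, 52, 29),
    "'01 Astros": (44, 37, 49, 32),
    "'03 Athletics": (57, 24, 39, 42),
    "'05 Astros": (53, 28, 36, 45),
    "'08 White Sox": (54, 28, 35, 46),
}

for team, record in records.items():
    print(f"{team:14s} {road_multiplier(*record):.3f}")
```

Because the White Sox record shows 82 home games and 81 road games, the sketch divides winning percentages rather than raw win totals.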
Probability that Team H will win a five-game series
In the current five-game series format, Team H plays games one, two, and five at home, and
games three and four on the road. Let W and L denote home wins and losses (with
probabilities p and 1-p, respectively) and let w and l denote road wins and losses (with
probabilities rp and 1-rp, respectively.) The ten scenarios for victory for Team H are as
follows:
Result    Probability
WWw       p²(rp)
LWww      p(rp)²(1-p)
WLww      p(rp)²(1-p)
WWlw      p²(rp)(1-rp)
LLwwW     p(rp)²(1-p)²
LWlwW     p²(rp)(1-p)(1-rp)
LWwlW     p²(rp)(1-p)(1-rp)
WLlwW     p²(rp)(1-p)(1-rp)
WLwlW     p²(rp)(1-p)(1-rp)
WWllW     p³(1-rp)²
The total probability of victory for Team H in a five-game series would therefore be
6r²p⁵ - (9r²+6r)p⁴ + (3r²+6r+1)p³
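The closed form can be cross-checked numerically. The sketch below is an illustration rather than the method used in the talk: instead of listing the ten clinching scenarios, it plays out all five games and adds up every outcome in which Team H wins at least three, which gives the same series probability. The venue string "HHRRH" and the function names are assumptions of this sketch, and it presumes rp ≤ 1.

```python
from itertools import product

def team_h_win_prob(p, r, venues):
    """Series win probability for Team H.

    venues lists Team H's games in order: 'H' for home (win prob p),
    'R' for road (win prob r*p).
    """
    wins_needed = len(venues) // 2 + 1
    total = 0.0
    for outcome in product((True, False), repeat=len(venues)):
        if sum(outcome) < wins_needed:
            continue  # Team H loses the series in this outcome
        prob = 1.0
        for won, venue in zip(outcome, venues):
            win_p = p if venue == "H" else r * p
            prob *= win_p if won else 1.0 - win_p
        total += prob
    return total

def five_game_formula(p, r):
    """Closed form quoted in the talk."""
    return (6 * r**2 * p**5
            - (9 * r**2 + 6 * r) * p**4
            + (3 * r**2 + 6 * r + 1) * p**3)

for p, r in [(0.6, 0.883), (0.7, 1.0), (0.5, 0.75)]:
    print(p, r, team_h_win_prob(p, r, "HHRRH"), five_game_formula(p, r))
```

With r = 1 the result should reduce to May's five-game polynomial.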
Probability that Team H will win a seven-game series
In a seven-game series format, Team H plays games 1, 2, 6, and 7 at home, and 3, 4, 5 on the
road.
Series Result         # of Scenarios    Probability of Each
2 W, 2 w              1                 p²(rp)²
1 W, 3 w, 1 L         2                 p(rp)³(1-p)
2 W, 2 w, 1 l         2                 p²(rp)²(1-rp)
1 W, 3 w, 2 L         1                 p(rp)³(1-p)²
2 W, 2 w, 1 L, 1 l    6                 p²(rp)²(1-p)(1-rp)
3 W, 1 w, 2 l         3                 p³(rp)(1-rp)²
2 W, 2 w, 2 L, 1 l    9                 p²(rp)²(1-p)²(1-rp)
3 W, 1 w, 1 L, 2 l    9                 p³(rp)(1-p)(1-rp)²
1 W, 3 w, 3 L         1                 p(rp)³(1-p)³
4 W, 3 l              1                 p⁴(1-rp)³
Adding the probabilities of the 35 scenarios, the total probability that Team H would
win a seven-game series is
-20r³p⁷ + (40r³+30r²)p⁶ - (24r³+48r²+12r)p⁵ + (4r³+18r²+12r+1)p⁴
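The scenario counts in the table can be reproduced by enumeration. The following Python sketch (names such as clinching_scenarios and tally are hypothetical) generates every result string in which Team H clinches, groups the strings by their numbers of W, w, L, and l, and compares the summed probability with the polynomial above. It presumes rp ≤ 1.

```python
from collections import Counter
from itertools import product

VENUES = "HHRRRHH"  # Team H at home in games 1, 2, 6, 7; on the road in 3, 4, 5

def clinching_scenarios(venues, wins_needed=4):
    """Yield strings (W/L at home, w/l on the road) ending with Team H's clinching win."""
    for n_games in range(wins_needed, len(venues) + 1):
        for outcome in product((True, False), repeat=n_games):
            if outcome[-1] and sum(outcome) == wins_needed:
                yield "".join(
                    ("W" if won else "L") if v == "H" else ("w" if won else "l")
                    for won, v in zip(outcome, venues)
                )

# Key: (home wins, road wins, home losses, road losses) -> number of scenarios.
tally = Counter(tuple(s.count(c) for c in "WwLl")
                for s in clinching_scenarios(VENUES))

def seven_game_prob(p, r):
    return sum(n * p**W * (r * p)**w * (1 - p)**L * (1 - r * p)**l
               for (W, w, L, l), n in tally.items())

def seven_game_formula(p, r):
    """Closed form quoted in the talk."""
    return (-20 * r**3 * p**7
            + (40 * r**3 + 30 * r**2) * p**6
            - (24 * r**3 + 48 * r**2 + 12 * r) * p**5
            + (4 * r**3 + 18 * r**2 + 12 * r + 1) * p**4)

print(sum(tally.values()), dict(tally))
print(seven_game_prob(0.6, 0.883), seven_game_formula(0.6, 0.883))
```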
Comparing five-game and seven-game series
For each fixed value of r, let f(r,p) denote the probability that Team H would win
a seven-game series, minus the probability that it would win a five-game series:
f(r,p) = [-20r³p⁷ + (40r³+30r²)p⁶ - (24r³+48r²+12r)p⁵ + (4r³+18r²+12r+1)p⁴]
- [6r²p⁵ - (9r²+6r)p⁴ + (3r²+6r+1)p³]
= -20r³p⁷ + (40r³+30r²)p⁶ - (24r³+54r²+12r)p⁵ + (4r³+27r²+18r+1)p⁴ - (3r²+6r+1)p³,
0≤p≤1
Note that, if we take r = 1 (that is, treat road games as having the same probability
of victory for Team H as home games), then we get
f(1,p) = -20p⁷ + 70p⁶ - 90p⁵ + 50p⁴ - 10p³,
the same function as in May’s model, with the only difference being that we’re no
longer requiring p ≥ 0.5.
Let’s first take r = 0.883, the average road multiplier of the 112 playoff teams from
1995-2008. The maximum value of f(0.883,p) is ≈ 0.0339 (when p ≈ 0.728), and the
minimum value is ≈ -0.0387 (when p ≈ 0.332). In other words, under our model, using
this average road multiplier as our value of r, Team H’s probability of winning a
seven-game series exceeds its probability of winning a five-game series by at most
about 3.39 percentage points (and Team H is actually more likely to win the shorter
series if its home-win probability p is low enough).
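These extreme values can be approximated without calculus by evaluating f(r,p) on a fine grid of p values. The sketch below is one simple way to do that and is not necessarily how the figures in the talk were obtained; the function name extremes and the grid size are assumptions of this sketch.

```python
def f(r, p):
    """Seven-game minus five-game win probability for Team H (formula from the talk)."""
    return (-20 * r**3 * p**7
            + (40 * r**3 + 30 * r**2) * p**6
            - (24 * r**3 + 54 * r**2 + 12 * r) * p**5
            + (4 * r**3 + 27 * r**2 + 18 * r + 1) * p**4
            - (3 * r**2 + 6 * r + 1) * p**3)

def extremes(r, steps=100_000):
    """Return ((max value, p), (min value, p)) of f(r, .) on an even grid over [0, 1]."""
    values = [(f(r, i / steps), i / steps) for i in range(steps + 1)]
    return max(values), min(values)

(max_val, p_max), (min_val, p_min) = extremes(0.883)
print(f"max {max_val:.4f} at p = {p_max:.3f}")
print(f"min {min_val:.4f} at p = {p_min:.3f}")
```

Calling extremes with other values of r gives the corresponding maximum and minimum for that road multiplier.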
Maximum/Minimum Values of f(r,p) for different values of r
We will consider values of r between 0.650 and 1.200, since the road multipliers of all 112
playoff teams from 1995-2008 have fallen in that interval. All of the max./min. values are
rounded off to four decimal places:
r        Max.      Min.
0.650    0.0255    -0.0427
0.700    0.0276    -0.0417
0.750    0.0295    -0.0408
0.800    0.0313    -0.0400
0.850    0.0329    -0.0392
0.900    0.0344    -0.0385
0.950    0.0358    -0.0378
1.000    0.0372    -0.0372
1.050    0.0385    -0.0366
1.100    0.0398    -0.0360
1.150    0.0411    -0.0355
1.200    0.0424    -0.0350
In general, both the maximum and the minimum values of f(r,p) increase as r increases.
(Though this is not shown in the table, the values of p at which the maximum and
minimum occur decrease as r increases.)
The value of r for which the maximum value of f(r,p) is 0.0400 is r ≈ 1.107. For r below this
value, Team H is not significantly likelier to win a seven-game series than a five-game series.
Of the 112 playoff teams from 1995-2008, 109 have had road multipliers below 1.107.
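The threshold value of r can be located numerically. Since the table shows the maximum of f(r,p) rising steadily with r, a bisection search between r = 1.0 and r = 1.2 is one reasonable approach. This is a sketch under that monotonicity assumption, with hypothetical function names and a grid-based maximum over 0 ≤ p ≤ 1; it is not necessarily the method behind the figure quoted above.

```python
def f(r, p):
    """Seven-game minus five-game win probability for Team H (formula from the talk)."""
    return (-20 * r**3 * p**7
            + (40 * r**3 + 30 * r**2) * p**6
            - (24 * r**3 + 54 * r**2 + 12 * r) * p**5
            + (4 * r**3 + 27 * r**2 + 18 * r + 1) * p**4
            - (3 * r**2 + 6 * r + 1) * p**3)

def max_f(r, steps=2000):
    """Largest value of f(r, p) on an even grid of p values in [0, 1]."""
    return max(f(r, i / steps) for i in range(steps + 1))

def solve_r(target=0.04, lo=1.0, hi=1.2, tol=1e-6):
    """Bisection for the r at which max_f(r) reaches target.

    Assumes max_f is increasing in r on [lo, hi], as the table suggests.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if max_f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

r_star = solve_r()
print(f"max f(r, p) reaches 0.04 near r = {r_star:.3f}")
```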
Conclusion and Acknowledgements
Though our model shows that, under certain circumstances, the team with
home-field advantage may be significantly likelier to win a seven-game
series than a five-game series, a general statement that seven-game series
are significantly fairer than five-game series is incorrect.
Thank you to all of the participating schools, students, and teachers in this
year’s Eastern Shore High School Mathematics Competition, and to the
contest’s co-chairs, Dr. Kurt Ludwick and Dr. Barbara Wainwright.