An Analysis of Lineup Optimization in Baseball

Thaker 1
AN ANALYSIS OF LINEUP OPTIMIZATION IN BASEBALL
Raj Thaker
Math 20: Discrete Probability
November 30, 2011
If nothing else, the 2011 MLB season taught us the importance of every single victory.
This year’s World Series Champions, the St. Louis Cardinals, made the playoffs as a wild card
by a margin of a single game over the Atlanta Braves. Had they won even one fewer of their 162
regular season games, history could have been altered. This offseason, managers across the
league should be scrambling to find any advantage that can give their team even one extra win.
One such way to get that extra win can be through the use of a completely optimized lineup.
Estimates vary on the exact value of an optimal lineup, but most opinions state that it could add
about 10 runs per season, which is equivalent to about one win in sabermetric circles. Of course,
managers must first answer the question of how to optimize their batting orders.
To begin, it is first necessary to examine the conventional approach to setting a lineup
and determine where or if flaws exist. Our average manager, let’s call him Ozzie Guillen just for
fun, generally starts by finding the fastest one of his starters and penciling him into the leadoff
spot (think of a Juan Pierre-type player). Next, he takes a player with a little bit of speed who can
also hit fairly well and bats him second. The third and fourth spots contain the team’s two best
hitters, with the more powerful hitter usually batting fourth. Guillen then ranks his remaining
players in terms of hitting ability, using traditional metrics like batting average, home runs, and
RBIs, and puts them into the 5-9 slots in that order. For another interpretation of this approach,
see the cited ehow.com article (Coondin). Guillen then gives his lineup card to the umpire, goes
back into the dugout, is presumably ejected later on for arguing with said umpire, and watches
his team lose from the clubhouse.
As we can see, it does not take much time or effort to create something that resembles an
average team’s batting order. We must now ask ourselves if we can find a better approach to this
problem. Intuitively, it is pretty easy to realize that the number of at bats by position in the
batting order is decreasing (see table 1) (Tango, Lichtman, and Dolphin 122). So, if we batted
our best hitters sequentially starting from the leadoff spot, we would ensure that our best hitters
got the most at bats, while our worst hitters got the fewest. However, according to table 2, we
see that we would be giving the most at bats with runners on base to our fourth best hitter and
giving the least to our best hitter (Tango, Lichtman, and Dolphin 124). So, this approach clearly
has flaws. The famous sabermetrician Bill James once theorized that since the most important
factor in scoring runs is getting the leadoff batter on base, we should have a player with a low
OBP bat third since this spot is least likely to lead off an inning (Keri 36). This theory was based
on James’ observation that the most runs are scored in the first inning, the fewest in the
second inning, and equal amounts in each subsequent inning (Keri 36). While this theory makes
some sense, it neglects the fact that a team is more likely to score when its best hitters are at the
plate and they are guaranteed to be there in the first inning and highly unlikely to be there in the
second inning. Of course, there are also stories in baseball lore of managers choosing lineups
completely randomly, like when Billy Martin drew names out of a hat in an attempt to turn
around his slumping Tigers team. This idea may be the most foolish of all. If we
examine all 9! = 362,880 possible batting orders, the expected runs from a completely random
order equal the average over all of those orders. In fact, since the worst
lineups fall further below the median lineup than the best lineups rise above it,
that average is actually lower than the median, making this a terrible idea.
That said, it did work for Martin as the Tigers won, but they managed only three runs in their
victory, suggesting that the random order did little to help the offense (Keri 35).
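The point about at bats decreasing by lineup position can be illustrated with a short sketch. The function below is a hypothetical helper of my own, not something taken from the cited sources, and it assumes only that the batting order cycles from the leadoff spot in every game. The per-game totals in the usage lines are made-up inputs, not the figures from table 1.

```python
def plate_appearances_by_slot(team_pa_per_game):
    """Split each game's team plate appearances across the nine lineup slots.

    Assumes every game starts with the leadoff hitter and the order simply
    cycles, so the top slots pick up any leftover trips to the plate.
    """
    totals = [0] * 9
    for team_pa in team_pa_per_game:
        full_turns, extra = divmod(team_pa, 9)
        for slot in range(9):
            # Slots 0..extra-1 bat one more time than the rest in this game.
            totals[slot] += full_turns + (1 if slot < extra else 0)
    return totals


# Hypothetical three-game sample of team plate appearances.
sample_games = [38, 41, 35]
print(plate_appearances_by_slot(sample_games))
```

Because leftover plate appearances always go to the top of the order, the totals can never increase from one slot to the next, which is the pattern the paragraph above relies on.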
Before we can determine where to place our best hitters, we must take a step back and
first figure out a way to determine a team’s best hitters. Guillen, for instance, would probably
solve this problem using a balance of player reputations, egos, and traditional metrics. However,
there are many different ways to quantify a hitter’s performance and we must carefully consider
which metrics will provide us with the best lineup. First off, RBIs are one of the worst
indicators of a batter’s skill, as they are extremely dependent on factors over which the hitter has
no control, like base runners and quality of teammates. Batting average is also not a great
indicator as it fails to include walks, which for some hitters can account for half of their trips to
the base paths, and cannot differentiate between home runs and singles. Statistics like on-base
percentage (OBP) and slugging percentage (SLG) are more useful, as OBP tells us how often a
player reaches base and SLG gives us a good sense of how much power a hitter possesses. That
said, the statistic OPS (on-base plus slugging) is not very useful in lineup optimization because it
weights OBP and SLG evenly, when it has been well established in baseball circles for some
time that OBP is much more valuable than SLG. Even worse in this scenario are statistics like
OPS+, which is just OPS adjusted for park and league factors, essentially evening the playing
field. However, when optimizing a specific team’s batting order, it is not desirable to adjust for
these factors, because a team will play all of its games in certain parks in a certain league, and
leveling this playing field eliminates the fact that some players have surroundings better suited to
their skill sets (think Babe Ruth and Yankee Stadium’s short porch in right field). wOBA
(weighted on-base average) is by far the best metric when it comes to optimizing a lineup, since
it weights each outcome by its run value and then adjusts the number to resemble on-base
percentage (Tango, Lichtman, and Dolphin 30). Although it is not necessary to adjust the
number to resemble on-base percentage, it in no way changes the optimization and allows us to
work with a more intuitive number. One final thing to note is that there exists no good way to
quantify a player’s speed. We could obviously use stolen bases and success rates, but those
largely fail to capture a player’s true speed as they are very dependent on situations, managers,
pitch counts, pitch types, etc. In addition, players don’t attempt very many steals, giving us a
very small sample size to work with. Likewise, there are measures of how often a player is able
to take an extra base on a single or double, but again those are very dependent on situations and
context, which are different for all players. That said, speed is certainly useful in baseball, can
be easily observed through an eye test, and is the same in all situations (meaning a player always
runs at the same speed). Thus, it works best to classify players as either fast, average, or slow,
based on simple eye tests, rather than to perform a complex analysis that will include inherent
flaws. Tango, Lichtman, and Dolphin establish the stolen base success rates necessary for a
player to have a positive expected impact on runs scored (134). Obviously, managers can
optimize the impact of stolen bases by allowing only their best base runners to attempt
steals, and only in favorable situations.
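To make the idea behind wOBA more concrete, here is a minimal sketch of a weighted on-base calculation. The weights are assumed, illustrative values of the kind published for this metric; they are not quoted from this paper, and the function name, dictionary keys, and sample stat line are placeholders of my own.

```python
# Assumed, illustrative run-value weights per event (scaled to look like OBP).
ASSUMED_WEIGHTS = {
    "walk": 0.72,
    "hit_by_pitch": 0.75,
    "single": 0.90,
    "double": 1.24,
    "triple": 1.56,
    "home_run": 1.95,
}


def weighted_on_base(counts, plate_appearances, weights=ASSUMED_WEIGHTS):
    """Weight each offensive event by its assumed run value, per plate appearance."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    weighted_sum = sum(w * counts.get(event, 0) for event, w in weights.items())
    return weighted_sum / plate_appearances


# Hypothetical season line for one hitter.
season = {"walk": 60, "hit_by_pitch": 5, "single": 100,
          "double": 30, "triple": 3, "home_run": 25}
print(round(weighted_on_base(season, 600), 3))
```

Unlike OPS, every event here carries its own weight, so a walk, a single, and a home run are no longer lumped together or counted on two different scales.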
Two interesting topics to consider in lineup optimization are hot/cold streaks and the idea
of protecting your best batter. Tango, Lichtman, and Dolphin examine the concept of streaks and
using them to predict near future performance. Their study of this particular concept is very well
done and uses data from the 2000-2003 seasons to form a model. Their model shows that after a
five-game hot streak, players performed about 5 points of wOBA above their expected
performance, and a five-game cold streak led to about a 5-point decline from expected wOBA.
This shows that by the time a streak becomes apparent, a player’s performance has nearly
regressed back to normal, eliminating the need to change the batting order to accommodate their
change in performance (53-68). James Click of Baseball Prospectus examines the concept of
protection in the batting order, using Barry Bonds’ performance from the 2001-2004 seasons as his
case study. While it does not impact the model he later forms, the use of Bonds as a case study
is rather foolish. His performance in those seasons was record-shattering and is unlikely to ever be
replicated. Click uses the run expectations of certain events to determine when walking a hitter
can actually be advantageous to a team, or in other words, when protection is needed. The table
included shows how much worse off the “protecting” hitter must be for the pitching team to
consider walking the superior hitter (see table 3) (Keri 46). From this, we can easily see that
protection is really not necessary in the lineup, as most hitters can protect just about anyone,
excluding Bonds’ superhuman seasons.
Now that we have examined the most relevant statistics available to us, we must decide
how to apply them to create an optimal batting order. The sabermetrician Mark Pankin created a
Markov Chain to model a baseball lineup and also used a regression analysis to determine which
statistics were most valuable by position in the batting order. The regression analysis makes a
necessary assumption about batters hitting the same in all situations. However, it only uses data
from one season to determine how players would perform. One season is not a terribly large
sample size and ideally we would like to use more data if it were available. Personally, I would
examine this on a player-by-player basis to account for extenuating circumstances such as injury,
change of teams, and major league service time, but I would certainly use two or three years of
data if it were available and relevant. One final thing of note is that in Pankin’s regression model,
stolen base attempts have a negative correlation with runs scored for the leadoff hitter. On the
surface, this would seem to suggest that the leadoff runner need not have any speed, lest he be
tempted to steal. However, since Pankin’s data comes from actual games, this most likely
indicates that managers simply do not utilize speed properly (Pankin “Batting Orders”).
Pankin also creates a Markov model based on the 24 different base running and out
combinations and includes 4 additional scenarios under which the third out is made. There are
essentially two problems with Pankin’s model. First, he presents the model in its most basic
state, where all batters are deemed to be of equal skill. Obviously, this does not even remotely
hold in real life, meaning we would have to form a different 28 x 28 matrix for each hitter in
baseball. It is easy to see why Pankin does this and it is hard to fault him for not wanting to use
hundreds of large matrices, but it means that all results from his model only apply when all
batters are deemed perfectly equal. Pankin’s other flaw is his failure to actually apply the model
and interpret the results. By not performing calculations himself or giving the reader the
transition matrix used, Pankin eliminates many applications of his model.
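Since Pankin does not supply a transition matrix, the following is only a rough sketch of how a base-out Markov model with identical batters could be set up. Everything specific in it is an assumption of mine: the event probabilities are invented, runners advance exactly as many bases as the hit, and the four end-of-inning states are collapsed into a single terminal value of zero rather than modeled separately.

```python
from itertools import product

# Assumed per-plate-appearance probabilities for one "average" batter.
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.16,
               "double": 0.05, "triple": 0.005, "homer": 0.025}

# The 24 transient states: 8 base configurations x 0, 1, or 2 outs.
STATES = [(bases, outs) for outs in range(3)
          for bases in product((0, 1), repeat=3)]


def advance(bases, event):
    """Return (new_bases, runs) under simplified advancement rules."""
    first, second, third = bases
    if event == "walk":  # runners move only when forced
        runs = 1 if (first and second and third) else 0
        return (1, int(first or second), int(third or (first and second))), runs
    if event == "single":
        return (1, first, second), third
    if event == "double":
        return (0, 1, first), second + third
    if event == "triple":
        return (0, 0, 1), first + second + third
    return (0, 0, 0), first + second + third + 1  # homer


def expected_runs(probs, sweeps=200):
    """Expected runs to the end of the inning from each state (fixed-point iteration)."""
    value = {state: 0.0 for state in STATES}
    for _ in range(sweeps):
        for bases, outs in STATES:
            total = 0.0
            for event, p in probs.items():
                if event == "out":
                    # The third out ends the inning, which is worth zero more runs.
                    total += p * (value[(bases, outs + 1)] if outs < 2 else 0.0)
                else:
                    new_bases, runs = advance(bases, event)
                    total += p * (runs + value[(new_bases, outs)])
            value[(bases, outs)] = total
    return value


run_table = expected_runs(EVENT_PROBS)
print(round(run_table[((0, 0, 0), 0)], 3))  # bases empty, nobody out
```

Dropping the equal-batters assumption would mean giving each hitter his own probabilities and tracking who is due up, which is exactly the bookkeeping burden described above.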
Cyril Morong of the baseball blog Beyond the Box Score further uses Pankin’s
regression model to attempt to assign weights to the value of OBP and SLG for each spot in the
lineup (Morong). The main problem with this approach is that it fails to differentiate between
different types of extra-base hits. James Click of Baseball Prospectus poses a related
hypothetical question (Keri 42-43): Is it better to have a player who hits a triple every at-bat or a
player who doubles half the time and homers half the time? The answer is that it depends on
context; in some situations, the triples player is worth far more and in others, the half-and-half player is
preferred. Had Morong weighted each outcome by the runs it created and the probability of the
situation, he would have created a much more accurate model. That said, his model should be
fairly accurate as most players have somewhat similar singles rates, doubles rates, etc. relative to
their SLG.
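Click's hypothetical is worth working through, because the two players are indistinguishable by SLG. The short sketch below uses a helper of my own naming and 100 invented at-bats per player to show that both slug 3.000 even though their hits are distributed very differently.

```python
def slugging(singles, doubles, triples, homers, at_bats):
    """Slugging percentage: total bases divided by at-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


# Hypothetical players from the thought experiment, 100 at-bats each.
triples_hitter = slugging(0, 0, 100, 0, 100)
half_and_half = slugging(0, 50, 0, 50, 100)
print(triples_hitter, half_and_half)
```

Any model that takes only OBP and SLG as inputs must treat these two players as identical, which is the limitation of Morong's approach noted above.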
Baseball Prospectus actually created a program called Batting Lineup Order Optimization
Program (BLOOP). BLOOP made the most intuitive sense of any of the models researched. It
works by simply breaking down OBP into each different event, finding the probability of each
event, and then randomly simulating through multiple seasons for each possible lineup.
Assuming that the data used in BLOOP is accurate, it should produce the most accurate results,
despite its failure to include any sort of stolen base mechanism. BLOOP actually produces some
startling results by stating that the difference between a team’s best and worst lineups is only
about 26 runs a season, a very small number given the disparity between the best and worst
orders. It also says that a team’s optimal lineup order is always in descending order by OBP
(Keri 35-47). This suggests inherent flaws within BLOOP, as we would expect SLG to at least
play a small role in lineup optimization. Unfortunately, BLOOP is not available to the public,
making it impossible to further analyze the program.
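Since BLOOP itself cannot be inspected, the sketch below is not a reconstruction of it. It is only a minimal Monte Carlo simulation in the same spirit, written under my own assumptions: each batter is a dictionary of invented event probabilities, runners advance exactly as many bases as the hit, and there are no steals, errors, or double plays. All names are hypothetical.

```python
import random

EVENTS = ("out", "walk", "single", "double", "triple", "homer")
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}

# Two invented hitter profiles (probabilities per plate appearance).
STRONG = {"out": 0.60, "walk": 0.12, "single": 0.16,
          "double": 0.06, "triple": 0.01, "homer": 0.05}
WEAK = {"out": 0.74, "walk": 0.05, "single": 0.16,
        "double": 0.03, "triple": 0.005, "homer": 0.015}


def play_event(bases, event):
    """Advance runners under simplified rules; return (new_bases, runs)."""
    first, second, third = bases
    if event == "walk":  # runners move only when forced
        runs = 1 if (first and second and third) else 0
        return (1, int(first or second), int(third or (first and second))), runs
    # Position 0 is the batter; positions 1-3 are the occupied bases.
    occupied = [0] + [i + 1 for i, on in enumerate(bases) if on]
    moved = [pos + HIT_BASES[event] for pos in occupied]
    runs = sum(1 for pos in moved if pos >= 4)
    return tuple(1 if base in moved else 0 for base in (1, 2, 3)), runs


def simulate_game(lineup, rng, innings=9):
    """Play one game; the batting order carries over between innings."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, (0, 0, 0)
        while outs < 3:
            probs = lineup[batter % len(lineup)]
            event = rng.choices(EVENTS, weights=[probs[e] for e in EVENTS])[0]
            if event == "out":
                outs += 1
            else:
                bases, scored = play_event(bases, event)
                runs += scored
            batter += 1
    return runs


def average_runs(lineup, games=2000, seed=0):
    """Average runs per game over many simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games


top_heavy = [STRONG] * 3 + [WEAK] * 6
bottom_heavy = [WEAK] * 6 + [STRONG] * 3
print(average_runs(top_heavy, games=500), average_runs(bottom_heavy, games=500))
```

Evaluating every one of the 9! orders this way would require a great many simulated games per order to separate lineups that differ by only a few runs a season, so a real program would need far more computation than this toy comparison of two orders.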
The final study researched regarding the optimization of a batting order comes from
Tango, Lichtman, and Dolphin in their sabermetric study entitled The Book: Playing the
Percentages in Baseball. This group also uses Markov Chains to study lineup optimization, but
uses them in a manner that combines both math and intuition. They start by comparing data from
actual games with a basic Markov model assuming all batters are equal. Surprisingly enough,
their results show that the opportunities afforded to certain lineup positions occur as a result of
their position, not simply because the best hitters are at the top of the order, leading to more RBI
opportunities for batters near the top (see table 4) (125). They then go on to examine how often
different lineup positions see runners on-base and how many runners each hitter can expect to
have on-base. They analyze the data to conclude that the optimal lineup has a team’s three best
hitters in the #1, #2, and #4 slots in the batting order. Then, they outline the skill set that
determines how these players are positioned. The leadoff hitter should reach base most often and
less home run power is preferred. Meanwhile, the #2 hitter should walk more than the #4 hitter,
who ideally has the most extra-base hits. Next, they reason that the #3 and #5 hitters should be
relatively equal with the slightly better hitter batting fifth, as he has a run expectation advantage
in everything but home runs. Lastly, they state that the team’s remaining hitters should bat sixth
through ninth in order of skill. This seems like a very accurate conclusion, as they show the
opportunities afforded to each hitter and analyze the importance of each event in creating runs to
determine where each player fits best (Tango, Lichtman, and Dolphin 121-151).
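The ordering rules summarized above can be sketched as a small function. This is my own simplification, not the authors' procedure: it ranks hitters by wOBA, uses OBP alone to pick the leadoff man, uses extra-base hits alone to separate the #2 and #4 hitters, and ignores walks, speed, handedness, and double play tendencies. The Hitter class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    name: str
    woba: float
    obp: float
    extra_base_hits: int


def book_style_order(hitters):
    """Arrange nine hitters following a simplified reading of the rules above."""
    if len(hitters) != 9:
        raise ValueError("a lineup needs exactly nine hitters")
    ranked = sorted(hitters, key=lambda h: h.woba, reverse=True)
    top_three, next_two, rest = ranked[:3], ranked[3:5], ranked[5:]

    # Among the three best hitters, the one who reaches base most leads off.
    leadoff = max(top_three, key=lambda h: h.obp)
    others = [h for h in top_three if h is not leadoff]
    # Of the remaining two, the one with more extra-base hits bats fourth.
    cleanup = max(others, key=lambda h: h.extra_base_hits)
    second = next(h for h in others if h is not cleanup)

    # The fourth and fifth best hitters take #5 and #3, the better one fifth.
    fifth, third = next_two
    # Everyone else bats sixth through ninth in descending order of skill.
    return [leadoff, second, third, cleanup, fifth] + rest
```

Given a nine-man roster, the function returns the hitters in batting order, so printing the name field of each element gives the lineup card.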
The same group further explores some topics like base running in the lineup, the role of
the pitcher, and alternating between the handedness of batters. From their data, they reason that
the best place to leverage a base-stealer in a lineup is in the fifth or sixth position. While this
may be used as an all-else-equal tiebreaker, it makes no sense to bat a player above his optimal
spot or include him in the lineup in an attempt to leverage his speed. There are few situations in
which leveraging speed pays dividends, and they mostly occur in the late innings of close games.
Instead of setting a lineup for the entire game hoping to capitalize on a rare scenario, a manager would
be better off simply inserting a pinch runner if he absolutely needed to leverage a base stealer.
Essentially, the rules of baseball negate the advantage they claim here, something which was not
taken into account. They also note the importance of avoiding the double play in certain spots,
like #3, where the batter faces a high number of double play situations (see table 5) (142), but are
careful to note that this should only be a consideration for players with extreme double play
tendencies (121-151).
The next part of their study deals with the role of the pitcher for National League teams.
Using their Markov model, Tango, Lichtman, and Dolphin conclude that a team can gain a few
extra runs by batting the pitcher eighth. Intuition would suggest that having a regular hitter ninth
would create more RBI opportunities for the best hitters at the top of the lineup (commonly
called the second leadoff hitter theory), and Tango, Lichtman, and Dolphin’s model shows that
this is indeed the case for the typical pitcher. It should be noted, however, that there are a few
pitchers who are capable hitters, and this theory would not apply to them. They further
reason that avoiding the pitcher’s spot entirely by pinch-hitting for him every time could yield
the relative production of having another superstar in the lineup. While this may be true from an
offensive standpoint, it would be almost impossible to implement this strategy as it would wreak
havoc upon a team’s rotation, bullpen, and bench (121-151).
One final aspect that the group considers is the effect of platoon splits for players, or how
they perform against pitchers of each handedness. They estimate the number of plate appearances
necessary to judge if a hitter has a discernible platoon split. The problem with their study is that
the results call for so many plate appearances that there are few players the results can be applied
to (Tango, Lichtman, and Dolphin 152-182). It would be better for a manager to consider all the
available data regarding platoon splits and conclude that it was accurate despite the small sample
size. This stems from a highly-regarded and presumably provable baseball fact that, in general,
left-handed batters hit LHP worse and right-handed batters hit RHP worse because it is harder to
see the pitcher release the baseball. Thus, it makes more sense to assume that any evidence of a
platoon split is indeed true and formulate two separate lineups, one for left-handed pitchers and
one for right-handed pitchers.
Lastly, I would like to mention an aside about how managers tinker with their lineup. All
of the optimization techniques discussed provide a small advantage in runs per game that only
becomes apparent over an entire 162-game season. Often, managers will slightly alter their
lineup in an attempt to “jump-start” a stagnant offense. While this will almost certainly reduce
their expected runs per game, through sheer luck, it is possible that this technique could work
over a period of one or two games. Thus, if a manager feels the need to alter his batting order in
an attempt to change his team’s attitude or “send a message” to his players, it is not very harmful
in incredibly small doses and could in fact pay dividends.
From the models and studies analyzed, a few things became very apparent. First, the
importance of wOBA becomes very clear. It is the only statistic widely available that
incorporates the actual value of each event on how many runs a team will score. While you may
not be able to find an optimal lineup using wOBA alone, it certainly makes the process much
easier. Next, the BLOOP method of lineup simulation is probably the best method researched.
By running many simulations using actual outcomes as probabilities, it should give us the best
estimates of the optimal lineup. Some of BLOOP’s findings were suspect, but the logic behind it
seems correct. If I were attempting to write a lineup simulating program from scratch, which I
did theorize, it would closely resemble the BLOOP approach. That said, BLOOP really takes a
lot of the fun out of creating a lineup as it just takes in all the data and spits out an answer. The
Tango, Lichtman, and Dolphin method was more sequential and included many layers of
thought, making it the best study available. When attempting to create a lineup for the case
study, I will use their findings to formulate optimal lineups. Despite all the research I
encountered, it appears that there is still much work to do in this field and many more interesting
concepts to consider. That said, I now have a very good idea of what goes into an optimal
lineup.
Tables
Table 1
Table 2
Table 3
Table 4
Table 5 (Double Play Situations Relative to Other Batting Order Positions)
Works Cited
Coondin, David. “How to Set a Batting Order in Baseball.” eHow. Demand Media Inc., 2011.
Web.
Keri, Jonah, ed. Baseball Between the Numbers: Why Everything You Know About the Game Is
Wrong. New York: Basic, 2006. Print.
Morong, Cyril. “Value of OBP and SLG by Lineup Position.” Beyond the Box Score. Vox
Media, Inc. 12 Feb. 2006. Web.
Pankin, Mark. “Finding Better Batting Orders.” Mark Pankin Baseball Page. N.p., 1991. Web.
13 Nov. 2011.
Pankin, Mark. “Markov Chain Models: Theoretical Models.” Mark Pankin Baseball Page.
N.p. Web. 13 Nov. 2011.
Tango, Tom, Mitchel Lichtman, and Andrew Dolphin. The Book: Playing the Percentages in
Baseball. Dulles, Virginia: Potomac, 2007. Print.