AN ANALYSIS OF LINEUP OPTIMIZATION IN BASEBALL
Raj Thaker
Math 20: Discrete Probability
November 30, 2011
If nothing else, the 2011 MLB season taught us the importance of every single victory. This year’s World Series champions, the St. Louis Cardinals, made the playoffs as a wild card by a margin of a single game over the Atlanta Braves. Had they won even one fewer of their 162 regular season games, history could have been altered. This offseason, managers across the league should be scrambling to find any advantage that can give their team even one extra win. One way to get that extra win is a completely optimized lineup. Estimates of the exact value of an optimal lineup vary, but most put it at about 10 runs per season, which is equivalent to about one win in sabermetric circles. Of course, managers must first answer the question of how to optimize their batting orders.
To begin, it is necessary to examine the conventional approach to setting a lineup and determine whether and where flaws exist. Our average manager, let’s call him Ozzie Guillen just for fun, generally starts by finding the fastest of his starters and penciling him into the leadoff spot (think a Juan Pierre type player). Next, he takes a player with a little bit of speed who can also hit fairly well and bats him second. The third and fourth spots contain the team’s two best hitters, with the more powerful hitter usually batting fourth. Guillen then ranks his remaining players in terms of hitting ability, using traditional metrics like batting average, home runs, and RBIs, and puts them into the 5-9 slots in that order. For another interpretation of this approach, see the cited ehow.com article (Coondin). Guillen then gives his lineup card to the umpire, goes back into the dugout, is presumably ejected later on for arguing with said umpire, and watches his team lose from the clubhouse.
As we can see, it does not take much time or effort to create something that resembles an average team’s batting order. We must now ask ourselves if we can find a better approach to this problem. Intuitively, it is easy to see that the number of at bats decreases with each successive position in the batting order (see table 1) (Tango, Lichtman, and Dolphin 122). So, if we batted our best hitters sequentially starting from the leadoff spot, we would ensure that our best hitters got the most at bats, while our worst hitters got the fewest. However, according to table 2, we would be giving the most at bats with runners on base to our fourth best hitter and the fewest to our best hitter (Tango, Lichtman, and Dolphin 124). So, this approach clearly has flaws.
The famous sabermetrician Bill James once theorized that since the most important factor in scoring runs is getting the leadoff batter on base, we should have a player with a low OBP bat third, since this spot is least likely to lead off an inning (Keri 36). This theory was based on James’ observation that the most runs are scored in the first inning, the fewest in the second inning, and equal amounts in each subsequent inning (Keri 36). While this theory makes some sense, it neglects the fact that a team is more likely to score when its best hitters are at the plate, and they are guaranteed to be there in the first inning and highly unlikely to be there in the second inning.
Of course, there are also stories in baseball lore of managers choosing lineups completely at random, like when Billy Martin drew names out of a hat in an attempt to turn around his slumping Tigers team. This idea may be the most foolish of all, because if we examine all 9 factorial (362,880) possible batting orders, the expected runs from a completely random order would fall somewhere around that of the median batting order.
In fact, since the worst lineups fall further below the median lineup in runs than the best lineups rise above it, we would actually expect a random order to score fewer runs than the median, making this a terrible idea. That said, it did work for Martin in that the Tigers won, but they managed only three runs in their victory, suggesting that Martin’s method deserved little of the credit (Keri 35).
Before we can determine where to place our best hitters, we must take a step back and first figure out a way to determine a team’s best hitters. Guillen, for instance, would probably solve this problem using a balance of player reputations, egos, and traditional metrics. However, there are many different ways to quantify a hitter’s performance, and we must carefully consider which metrics will provide us with the best lineup. First off, RBIs are one of the worst indicators of a batter’s skill, as they are extremely dependent on factors over which the hitter has no control, like base runners and quality of teammates. Batting average is also not a great indicator, as it fails to include walks, which for some hitters can account for half of their trips to the base paths, and it cannot differentiate between home runs and singles. Statistics like on-base percentage (OBP) and slugging percentage (SLG) are more useful, as OBP tells us how often a player reaches base and SLG gives us a good sense of how much power a hitter possesses. That said, the statistic OPS (on-base plus slugging) is not very useful in lineup optimization because it weights OBP and SLG evenly, when it has been well established in baseball circles for some time that OBP is much more valuable than SLG. Even worse in this scenario are statistics like OPS+, which is just OPS adjusted for park and league factors, essentially evening the playing field.
However, when optimizing a specific team’s batting order, it is not desirable to adjust for these factors, because a team will play all of its games in certain parks in a certain league, and leveling this playing field hides the fact that some players have surroundings better suited to their skill sets (think Babe Ruth and Yankee Stadium’s short porch in right field). wOBA (weighted on-base average) is by far the best metric when it comes to optimizing a lineup, since it weights each outcome by its run value and then scales the number to resemble on-base percentage (Tango, Lichtman, and Dolphin 30). Although it is not necessary to scale the number to resemble on-base percentage, doing so in no way changes the optimization and allows us to work with a more intuitive number.
One final thing to note is that there exists no good way to quantify a player’s speed. We could obviously use stolen bases and success rates, but those largely fail to capture a player’s true speed, as they are very dependent on situations, managers, pitch counts, pitch types, etc. In addition, players don’t attempt very many steals, giving us a very small sample size to work with. Likewise, there are measures of how often a player is able to take an extra base on a single or double, but again those are very dependent on situations and context, which are different for all players. That said, speed is certainly useful in baseball, can be easily observed through an eye test, and is the same in all situations (meaning a player always runs at the same speed). Thus, it works best to classify players as either fast, average, or slow, based on simple eye tests, rather than to perform a complex analysis that will include inherent flaws. Tango, Lichtman, and Dolphin establish the stolen base success rates necessary for a player to have a positive expected impact on runs scored (134).
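The idea of a break-even success rate can be made concrete with run expectancy. If an attempt succeeds with probability p, the steal is worthwhile only when p times the run expectancy after a success plus (1 - p) times the run expectancy after a caught stealing exceeds the run expectancy of staying put. The sketch below solves that equation for p. The function name and the three run expectancy values are my own illustrative assumptions, not figures from The Book.

```python
def break_even_rate(re_before: float, re_success: float, re_caught: float) -> float:
    """Return the steal success rate at which an attempt leaves expected runs unchanged.

    Solves p * re_success + (1 - p) * re_caught = re_before for p.
    """
    gain = re_success - re_before  # runs added by a successful steal
    loss = re_before - re_caught   # runs lost when the runner is thrown out
    if gain <= 0 or loss < 0:
        raise ValueError("expected re_caught <= re_before < re_success")
    return loss / (gain + loss)


# Illustrative (hypothetical) run expectancies, NOT taken from The Book:
# runner on first with 0 outs, runner on second with 0 outs, bases empty with 1 out.
rate = break_even_rate(re_before=0.90, re_success=1.15, re_caught=0.28)
print(f"break-even success rate: {rate:.3f}")
```

With these made-up inputs the break-even point comes out to about 71 percent; the real threshold depends on the base-out situation and on the run expectancy table used.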
Obviously, managers can optimize the impact of stolen bases by allowing only their best base runners to attempt steals, and only in favorable situations.
Two interesting topics to consider in lineup optimization are hot/cold streaks and the idea of protecting your best batter. Tango, Lichtman, and Dolphin examine the concept of streaks and whether they can be used to predict near-future performance. Their study of this particular concept is very well done and uses data from the 2000-2003 seasons to form a model. Their model shows that after a five game hot streak, players performed about 5 points of wOBA above their expected performance, and a five game cold streak led to about a 5 point decline from expected wOBA. This shows that by the time a streak becomes apparent, a player’s performance has nearly regressed back to normal, eliminating the need to change the batting order to accommodate the change in performance (53-68).
James Click of Baseball Prospectus examines the concept of protection in the batting order using Barry Bonds’ performance from the 2001-2004 seasons as his case study. While it does not impact the model he later forms, the use of Bonds as a case study is rather foolish. Bonds’ performance in those seasons was record-shattering and is unlikely to ever be replicated. Click uses the run expectations of certain events to determine when walking a hitter can actually be advantageous to the pitching team, or in other words, when protection is needed. The table included shows how much worse the “protecting” hitter must be for the pitching team to consider walking the superior hitter (see table 3) (Keri 46). From this, we can easily see that protection is really not necessary in the lineup, as most hitters can protect just about anyone, excluding Bonds’ superhuman seasons.
Now that we have examined the most relevant statistics available to us, we must decide how to apply them to create an optimal batting order.
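Before turning to the models, it may help to see how wOBA, the statistic favored above, is actually computed: each way of reaching base is multiplied by a run-value weight and the total is divided by plate appearances. The sketch below uses the weights as they are commonly quoted for the version of wOBA in The Book; I treat them as an assumption to be checked against the source, and the dictionary keys, the function name, and the sample season are hypothetical.

```python
# Linear weights as commonly quoted for wOBA in The Book (an assumption here;
# verify against the source before relying on them).
WOBA_WEIGHTS = {
    "nibb": 0.72,    # non-intentional walks
    "hbp": 0.75,     # hit by pitch
    "single": 0.90,
    "rboe": 0.92,    # reached base on error
    "double": 1.24,
    "triple": 1.56,
    "homer": 1.95,
}


def woba(counts: dict, plate_appearances: int) -> float:
    """Weighted on-base average: weighted sum of events per plate appearance."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    total = sum(weight * counts.get(event, 0) for event, weight in WOBA_WEIGHTS.items())
    return total / plate_appearances


# Hypothetical season line, invented purely for illustration.
season = {"nibb": 60, "hbp": 5, "single": 100, "double": 30, "triple": 5, "homer": 20}
print(f"wOBA: {woba(season, plate_appearances=650):.3f}")
```

Because the weights are scaled so that the result resembles OBP, a value near .340 reads roughly like an average on-base percentage, which is what makes the number intuitive to work with.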
The sabermetrician Mark Pankin created a Markov chain to model a baseball lineup and also used a regression analysis to determine which statistics were most valuable by position in the batting order. The regression analysis makes the necessary assumption that batters hit the same in all situations. However, it only uses data from one season to determine how players would perform. One season is not a terribly large sample size, and ideally we would like to use more data if it were available. Personally, I would examine this on a player-by-player basis to account for extenuating circumstances such as injury, change of teams, and major league service time, but I would certainly use two or three years of data if they were available and relevant. One final thing of note is that in Pankin’s regression model, stolen base attempts have a negative correlation with runs scored for the leadoff hitter. On the surface, this would seem to suggest that the leadoff hitter need not have any speed, lest he be tempted to steal. However, since Pankin’s data comes from actual games, this most likely indicates that managers simply do not utilize speed properly (Pankin “Batting Orders”).
Pankin also creates a Markov model based on the 24 different combinations of base runners and outs, and includes 4 additional states under which the third out is made. There are essentially two problems with Pankin’s model. First, he presents the model in its most basic state, where all batters are deemed to be of equal skill. Obviously, this does not even remotely hold in real life, meaning we would have to form a different 28 x 28 matrix for each hitter in baseball. It is easy to see why Pankin does this, and it is hard to fault him for not wanting to use hundreds of large matrices, but it means that all results from his model only apply when all batters are deemed perfectly equal. Pankin’s other flaw is his failure to actually apply the model and interpret the results.
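To show what applying such a model could look like, here is a minimal sketch of a base-out Markov chain for identical batters. It is not Pankin’s model: the event probabilities are hypothetical, the runner advancement rules are deliberately crude (a walk only forces runners, and every hit moves each runner exactly as many bases as the batter), and the four third-out states are collapsed into a single end state worth zero further runs. It computes the expected runs for the rest of the half-inning from each of the 24 states by repeatedly applying the transition probabilities.

```python
# Hypothetical per-plate-appearance probabilities for one "average" batter.
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.15,
               "double": 0.05, "triple": 0.005, "homer": 0.035}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases: int, event: str):
    """Return (new_bases, runs) for a non-out event.

    bases is a bitmask: bit 0 = first, bit 1 = second, bit 2 = third.
    """
    if event == "walk":
        if not bases & 1:
            return bases | 1, 0  # first base open: nobody is forced
        if not bases & 2:
            return bases | 2, 0  # runner on first is forced to second
        if not bases & 4:
            return 7, 0          # first and second are forced up one base
        return 7, 1              # bases loaded: a run is forced in
    n = HIT_BASES[event]
    moved = ((bases << 1) | 1) << n  # bit 0 is the batter at home plate
    runs = bin(moved >> 4).count("1")  # anyone pushed past third has scored
    return (moved >> 1) & 7, runs


def expected_runs(probs: dict, sweeps: int = 500) -> dict:
    """Expected runs in the rest of the half-inning for each (bases, outs) state."""
    value = {(b, o): 0.0 for b in range(8) for o in range(3)}
    for _ in range(sweeps):  # fixed-point iteration; converges since P(out) > 0
        for bases, outs in value:
            total = 0.0
            for event, p in probs.items():
                if event == "out":
                    # Runners hold on an out; the third out ends the inning.
                    total += p * (value[(bases, outs + 1)] if outs < 2 else 0.0)
                else:
                    new_bases, runs = advance(bases, event)
                    total += p * (runs + value[(new_bases, outs)])
            value[(bases, outs)] = total
    return value


table = expected_runs(EVENT_PROBS)
print(f"bases empty, 0 outs: {table[(0, 0)]:.3f} expected runs")
```

Giving each lineup slot its own probability table, and tracking which slot is due up, is what turns this single table into the much larger per-hitter model described above.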
By not performing calculations himself or giving the reader the transition matrix used, Pankin eliminates many applications of his model.
Cyril Morong of the baseball blog Beyond the Box Score extends Pankin’s regression model in an attempt to assign weights to the value of OBP and SLG for each spot in the lineup (Morong). The main problem with this approach is that it fails to differentiate between different types of extra-base hits. James Click of Baseball Prospectus poses a related hypothetical question (Keri 42-43): Is it better to have a player who hits a triple every at-bat or a player who doubles half the time and homers half the time? The answer is that it depends on context; in some situations, the triples player is worth far more, and in others, the half-and-half player is preferred. Had Morong weighted each outcome by the runs it created and the probability of the situation, he would have created a much more accurate model. That said, his model should be fairly accurate, as most players have somewhat similar singles rates, doubles rates, etc. relative to their SLG.
Baseball Prospectus actually created a program called the Batting Lineup Order Optimization Program (BLOOP). BLOOP made the most intuitive sense of any of the models researched. It works by simply breaking down OBP into each different event, finding the probability of each event, and then randomly simulating multiple seasons for each possible lineup. Assuming that the data used in BLOOP is accurate, it should produce the most accurate results, despite its failure to include any sort of stolen base mechanism. BLOOP actually produces some startling results: it states that the difference between a team’s best and worst lineups is only about 26 runs a season, a very small number given the disparity between the best and worst orders. It also says that a team’s optimal lineup order is always in descending order by OBP (Keri 35-47).
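Since BLOOP itself is not something I can run, the following is only a sketch of the general simulation idea described above, not a reconstruction of the program. Each batter is a tuple of hypothetical event probabilities, games are simulated plate appearance by plate appearance, and the average runs for a given order are reported. The advancement rules are a crude assumption (a walk only forces runners, and hits move every runner as many bases as the batter), and there is no stolen base mechanism.

```python
import random

EVENTS = ("out", "walk", "single", "double", "triple", "homer")
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases: int, event: str):
    """Return (new_bases, runs); bases is a bitmask (bit 0 = first base)."""
    if event == "walk":
        if not bases & 1:
            return bases | 1, 0
        if not bases & 2:
            return bases | 2, 0
        if not bases & 4:
            return 7, 0
        return 7, 1
    moved = ((bases << 1) | 1) << HIT_BASES[event]  # bit 0 is the batter
    return (moved >> 1) & 7, bin(moved >> 4).count("1")


def simulate_game(lineup, rng: random.Random, innings: int = 9) -> int:
    """Play one team's half-innings; lineup is a list of weight tuples over EVENTS."""
    runs, slot = 0, 0
    for _ in range(innings):
        outs, bases = 0, 0
        while outs < 3:
            weights = lineup[slot % len(lineup)]
            slot += 1  # the batting order carries over between innings
            event = rng.choices(EVENTS, weights=weights)[0]
            if event == "out":
                outs += 1
            else:
                bases, scored = advance(bases, event)
                runs += scored
    return runs


def average_runs(lineup, games: int = 2000, seed: int = 0) -> float:
    """Mean runs per game over many simulated games with a fixed random seed."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games


# Two hypothetical hitter profiles (weights in the order given by EVENTS).
STRONG = (0.60, 0.12, 0.16, 0.06, 0.01, 0.05)
WEAK = (0.75, 0.05, 0.15, 0.04, 0.00, 0.01)
```

Calling `average_runs([STRONG] * 3 + [WEAK] * 6)` and `average_runs([WEAK] * 6 + [STRONG] * 3)` compares two orders of the same nine hitters. Checking all 362,880 orders this way would take far more games per order than this sketch runs, since the differences between lineups are small relative to the noise in a single season.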
This suggests inherent flaws within BLOOP, as we would expect SLG to play at least a small role in lineup optimization. Unfortunately, BLOOP is not available to the public, making it impossible to analyze the program further.
The final study researched regarding the optimization of a batting order comes from Tango, Lichtman, and Dolphin in their sabermetric study entitled The Book: Playing the Percentages in Baseball. This group also uses Markov chains to study lineup optimization, but uses them in a manner that combines both math and intuition. They start by comparing data from actual games with a basic Markov model assuming all batters are equal. Surprisingly enough, their results show that the opportunities afforded to certain lineup positions occur as a result of the position itself, not simply because the best hitters are at the top of the order, leading to more RBI opportunities for batters near the top (see table 4) (125). They then go on to examine how often different lineup positions see runners on base and how many runners each hitter can expect to have on base. They analyze the data to conclude that the optimal lineup has a team’s three best hitters in the #1, #2, and #4 slots in the batting order. Then, they outline the skill set that determines how these players are positioned. The leadoff hitter should reach base most often, and less home run power is preferred. Meanwhile, the #2 hitter should walk more than the #4 hitter, who ideally has the most extra-base hits. Next, they reason that the #3 and #5 hitters should be relatively equal, with the slightly better hitter batting fifth, as he has a run expectation advantage in everything but home runs. Lastly, they state that the team’s remaining hitters should bat sixth through ninth in order of skill.
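These rules are simple enough to write down as a procedure. The sketch below is my own simplified reading of them, not code from The Book: it ranks hitters by wOBA, gives the leadoff spot to the top-three hitter with the highest OBP, bats the remaining top-three hitter with more extra-base hits fourth and the other second, puts the fourth and fifth best hitters in the #5 and #3 slots, and fills #6 through #9 in descending order. The `Hitter` fields and the function name are hypothetical, and walk rate and home run power are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hitter:
    name: str
    woba: float  # overall hitting skill
    obp: float   # how often he reaches base
    xbh: int     # extra-base hits


def book_style_order(hitters):
    """Arrange nine hitters using a simplified version of the rules above."""
    if len(hitters) != 9:
        raise ValueError("a batting order needs exactly nine hitters")
    ranked = sorted(hitters, key=lambda h: h.woba, reverse=True)
    top3, next2, rest = ranked[:3], ranked[3:5], ranked[5:]

    leadoff = max(top3, key=lambda h: h.obp)          # reaches base most often
    others = [h for h in top3 if h is not leadoff]
    cleanup = max(others, key=lambda h: h.xbh)        # most extra-base hits
    second = others[1] if others[0] is cleanup else others[0]

    fifth, third = next2  # the slightly better of the two bats fifth
    return [leadoff, second, third, cleanup, fifth, *rest]
```

For example, if the three best hitters are A (most extra-base hits), B (highest OBP), and C, followed by D through I in descending wOBA, the function returns the order B, C, E, A, D, F, G, H, I.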
This seems like a very accurate conclusion, as they show the opportunities afforded to each hitter and analyze the importance of each event in creating runs to determine where each player fits best (Tango, Lichtman, and Dolphin 121-151).
The same group further explores topics like base running in the lineup, the role of the pitcher, and alternating the handedness of batters. From their data, they reason that the best place to leverage a base-stealer in a lineup is in the fifth or sixth position. While this may be used as an all-things-equal tiebreaker, it makes no sense to bat a player above his optimal spot, or to include him in the lineup at all, in an attempt to leverage his speed. There are few situations in which leveraging speed pays dividends, and they mostly occur in the late innings of close games. Instead of setting a lineup for the entire game in the hope of capitalizing on a rare scenario, a manager would be better off simply inserting a pinch runner if he absolutely needed to leverage a base stealer. Essentially, the rules of baseball negate the advantage they claim here, something which was not taken into account. They also note the importance of avoiding the double play in certain spots, like #3, where the batter faces a high number of double play situations (see table 5) (142), but are careful to note that this should only be a consideration for players with extreme double play tendencies (121-151).
The next part of their study deals with the role of the pitcher for National League teams. Using their Markov model, Tango, Lichtman, and Dolphin conclude that a team can gain a few extra runs by batting the pitcher eighth. Intuition would suggest that having a regular hitter ninth would create more RBI opportunities for the best hitters at the top of the lineup (commonly called the second leadoff hitter theory), and Tango, Lichtman, and Dolphin’s model confirms that this is indeed the case for the normal pitcher.
It should be noted, however, that there are a few pitchers who are capable hitters, and thus this theory would not apply to them. They further reason that avoiding the pitcher’s spot entirely, by pinch-hitting for him every time, could yield the relative production of having another superstar in the lineup. While this may be true from an offensive standpoint, it would be almost impossible to implement this strategy, as it would wreak havoc upon a team’s rotation, bullpen, and bench (121-151).
One final aspect that the group considers is the effect of platoon splits for players, or how they perform against pitchers of each handedness. They estimate the number of plate appearances necessary to judge whether a hitter has a discernible platoon split. The problem with their study is that the results call for so many plate appearances that there are few players to whom the results can be applied (Tango, Lichtman, and Dolphin 152-182). It would be better for a manager to consider all the available data regarding platoon splits and treat it as accurate despite the small sample size. This stems from a highly-regarded and presumably provable baseball fact that, in general, left-handed batters hit LHP worse and right-handed batters hit RHP worse because it is harder to see the pitcher release the baseball. Thus, it makes more sense to assume that any evidence of a platoon split is indeed true and formulate two separate lineups, one for left-handed pitchers and one for right-handed pitchers.
Lastly, I would like to mention an aside about how managers tinker with their lineups. All of the optimization techniques discussed provide a small advantage in runs per game that only becomes apparent over an entire 162 game season. Often, managers will slightly alter their lineup in an attempt to “jump-start” a stagnant offense.
While this will almost certainly reduce their expected runs per game, it is possible that through sheer luck this technique could work over a period of one or two games. Thus, if a manager feels the need to alter his batting order in an attempt to change his team’s attitude or “send a message” to his players, it is not very harmful in incredibly small doses and could in fact pay dividends.
From the models and studies analyzed, a few things became very apparent. First, the importance of wOBA becomes very clear. It is the only widely available statistic that incorporates the actual value of each event in terms of how many runs a team will score. While you may not be able to find an optimal lineup using wOBA alone, it certainly makes the process much easier. Next, the BLOOP method of lineup simulation is probably the best method researched. By running many simulations using actual outcomes as probabilities, it should give us the best estimates of the optimal lineup. Some of BLOOP’s findings were suspect, but the logic behind it seems correct. If I were attempting to write a lineup simulating program from scratch, which I did theorize, it would closely resemble the BLOOP approach. That said, BLOOP really takes a lot of the fun out of creating a lineup, as it just takes in all the data and spits out an answer. The Tango, Lichtman, and Dolphin method was more sequential and included many layers of thought, making it the best study available. When attempting to create a lineup for the case study, I will use their findings to formulate optimal lineups. Despite all the research I encountered, it appears that there is still much work to do in this field and many more interesting concepts to consider. That said, I now have a very good idea of what goes into an optimal lineup.
Tables
Table 1
Table 2
Table 3
Table 4
Table 5 (Double Play Situations Relative to Other Batting Order Positions)
Works Cited
Coondin, David.
“How to Set a Batting Order in Baseball.” eHow. Demand Media Inc., 2011. Web.
Keri, Jonah, ed. Baseball Between the Numbers: Why Everything You Know About the Game Is Wrong. New York: Basic, 2006. Print.
Morong, Cyril. “Value of OBP and SLG by Lineup Position.” Beyond the Box Score. Vox Media, Inc., 12 Feb. 2006. Web.
Pankin, Mark. “Finding Better Batting Orders.” Mark Pankin Baseball Page. N.p., 1991. Web. 13 Nov. 2011.
Pankin, Mark. “Markov Chain Models: Theoretical Models.” Mark Pankin Baseball Page. N.p. Web. 13 Nov. 2011.
Tango, Tom, Mitchel Lichtman, and Andrew Dolphin. The Book: Playing the Percentages in Baseball. Dulles, Virginia: Potomac, 2007. Print.