Chapter 15 Conditional Probability, Expected Value, and Strategy in Sports

Objectives
Students will be able to:
1) Use the general addition rule to calculate probability (union and intersection of events)
2) Calculate conditional probability for dependent events
3) Use tree diagrams to organize events and calculate probability using the general multiplication rule
4) Find expected value of random variables
5) Use expected value to make strategy decisions in sports

• On November 15, 2009, the New England Patriots were playing the Indianapolis Colts. New England had the football at their own 28 yard line with 2:08 left on the clock, and they led 34-28. It was 4th down and 2. New England had no time outs left, and the Colts had 1 time out left.
• The conventional move would be to punt and play defense. However, if they go for it and pick up 2 yards they will essentially win the game. If they go for it and don’t pick up the 2 yards, Manning will have a good chance to throw for the game winning touchdown, as he was on a roll in the second half.
• If you were Bill Belichick, what would you do?

Two-Way Tables and the General Addition Rule
• In Chapter 2, we introduced two-way tables as a way to organize information about the distribution of a categorical variable in two different contexts.
• Example: the outcomes of regular season games for the 2008 Arizona Cardinals.
• Two-way tables can also be used to summarize the relationship between two categorical variables.
• Example: Let’s say the Tampa Bay Rays had a promotion for home games in 2010. If the team scored 7 or more runs, each fan will get a free taco (they scored 7 or more runs 15 times that season). The only thing better than getting a free taco would be getting a free taco and watching the Rays win at the same time.
• Here is a two-way table to show the relationship between taco status and the outcome of the game for the Rays’ 81 regular season home games in 2010.
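The table itself did not survive in this text. As a minimal sketch, the Python below rebuilds its four cells from counts quoted in the chapter (81 home games, 49 wins, 13 games with both a taco and a win). It assumes the 15 games with 7 or more runs were all home games. The variable names are illustrative only.

```python
# Cell counts for the Rays' 81 home games in 2010.
# Assumption: the 15 "7 or more runs" games were all home games.
TOTAL_GAMES = 81
TACO_GAMES = 15
WINS = 49
TACO_AND_WIN = 13

counts = {
    ("taco", "win"): TACO_AND_WIN,
    ("taco", "loss"): TACO_GAMES - TACO_AND_WIN,
    ("no taco", "win"): WINS - TACO_AND_WIN,
    ("no taco", "loss"): TOTAL_GAMES - TACO_GAMES - WINS + TACO_AND_WIN,
}

rows = ["taco", "no taco"]
cols = ["win", "loss"]

# Marginal totals: sum across each row and down each column.
row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}
total = sum(counts.values())

# Print the two-way table with its margins.
print(f"{'':10}{'win':>6}{'loss':>6}{'total':>7}")
for r in rows:
    cells = "".join(f"{counts[(r, c)]:>6}" for c in cols)
    print(f"{r:10}{cells}{row_totals[r]:>7}")
bottom = "".join(f"{col_totals[c]:>6}" for c in cols)
print(f"{'total':10}{bottom}{total:>7}")
```

Each probability in this section is one of these counts (or a sum of them) divided by the grand total of 81.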
• 13 games yielded the ideal combination of free tacos and a win. If we randomly select a game, the probability that a fan got a free taco and saw a win is P(taco and win) = 13/81 = 0.16, or 16%.

The General Addition Rule
• What if we want to know the probability that a fan saw a win or got a free taco?
• For this to occur, at least one of the two events needs to take place (both events taking place also counts).
• Keep in mind there is some overlap between the two events, as 13 games produced both a free taco and a win.
• We cannot just add the probability of getting a taco and the probability of getting a win, because the overlap (taco and win) would be counted twice. We have to account for that overlap.
• Looking at the two-way table, we can add three separate mutually exclusive events (events that can’t happen at the same time) to get the probability of a taco or a win:
  P(taco or win) = P(taco and win) + P(taco and loss) + P(no taco and win) = 13/81 + 2/81 + 36/81 = 51/81 = 0.63
• It would be incorrect to just add the probability of a taco and the probability of a win:
  P(taco) + P(win) = 15/81 + 49/81 = 64/81 = 0.79
• The calculation can also be done by adding the probability of a taco and the probability of a win, and then subtracting the overlap (taco and win):
  P(taco or win) = P(taco) + P(win) - P(taco and win) = 15/81 + 49/81 - 13/81 = 51/81 = 0.63
• This new rule is called the general addition rule:
  P(A or B) = P(A) + P(B) - P(A and B)

STAT 101
• Instead of using the words “and” and “or” to describe probability situations, some more traditional statistics books use set theory notation.
• The word “or” is replaced by the union symbol (∪) and the word “and” is replaced by the intersection symbol (∩).
• Let’s try another example. Mr. Falcicchio is a big fan of the NJ Jackals for a variety of reasons, one of which is the Jackals’ firework promotions. Every time the Jackals score 6 runs in a game, they shoot off fireworks at the completion of the game. On the next slide is a two-way table summarizing the 2013 results for the NJ Jackals, including wins and losses, and the number of times fireworks were shot and not shot. Find the following.
a) P(fireworks and win)
b) P(fireworks or win)

Conditional Probability and Independence
• Let’s revisit our taco example.
• If we randomly select one of the Rays’ home victories from 2010, what is the probability that a fan at that game received a free taco?
• Probabilities like the “probability that free tacos were distributed, given that the Rays won the game” are called conditional probabilities.
• Conditional probability describes the probability that an event occurs, given that we know that a different event has already occurred.
• Just looking at the win column makes this probability easy to see: 13 of the 49 wins were taco games.
• Conditional probability has the following formula:
  P(A | B) = P(A and B) / P(B)
• For our example, the probability of both events occurring was 13/81 and the probability of a win occurring was 49/81, so:
  P(taco | win) = (13/81) / (49/81) = 13/49 = 0.27
• Let’s try another example. Find the probability that the Rays won the game, given that free tacos were given away.
• This is quite easy to see if we limit our attention to the “taco” row of the two-way table:
  P(win | taco) = 13/15 = 0.87

Independence
• Scenario: Kobe steps to the free-throw line for 2 shots.
• On his first shot, he has an 85% chance of making the free-throw.
• Make or miss, on his second shot he still has an 85% chance of making the free-throw.
• If this is true, the outcomes of his free-throw attempts are independent, meaning Kobe’s ABILITY to make a free-throw is the same following a make as it is following a miss. In other words, knowing that he makes the first shot doesn’t help us predict the outcome of his second shot.
• Using conditional probability notation:
  P(make 2nd | make 1st) = P(make 2nd | miss 1st) = 0.85
• In general, two events are independent if knowing the outcome of one event does not affect the probability of the other event.
• Events A and B are independent if:
  P(A | B) = P(A | not B)
  – This means event A has the same probability of happening whether or not event B happens.
• Let’s go back to our taco example.
• Are the events “taco” and “win” independent?
If so, then knowing the outcome of the game would not provide any additional information about the probability of getting a free taco.
• However, if knowing the outcome of the game changes the probability of getting a free taco, then the events “taco” and “win” are not independent.
• If the events are independent, then the following relationship should exist:
  P(taco | win) = P(taco | loss)
• Let’s investigate.
  P(taco | win) = 13/49 = 0.27, but P(taco | loss) = 2/32 = 0.06
• Clearly knowing the outcome of the game changes the probability of getting a taco. Therefore, the events “taco” and “win” are not independent.

Tree Diagrams and the General Multiplication Rule
• In tennis, the player serving has two chances to get a serve into play.
• Generally, the player is more aggressive on the first-serve.
• If the first-serve is a fault, the player will be more conservative on the second-serve.
• Since the player is more conservative, they tend to win a smaller percentage of points on second-serves than on successful first-serves.
• On the 2011 Association of Tennis Professionals (ATP) tour, Roger Federer made 63% of his first-serves. When he made his first-serve, he won 78% of points. When he missed his first-serve, he only won 57% of points.
• Using probability notation:
  P(make first-serve) = 0.63
  P(win point | make first-serve) = 0.78
  P(win point | miss first-serve) = 0.57
• Because the probability of winning a point changes based on the outcome of the first-serve, the outcome of the point is not independent of the outcome of the first-serve.
• This information can also be expressed in a tree diagram.
• To do this:
  – Show the outcome of the first-serve as one set of “branches” and the outcome of the point with a second set of “branches”.
  – Include the probability of each branch.
  – Label the outcomes at the end of the branches.
  – Note: The probabilities that go on the second set of branches are conditional probabilities because the outcome of the point depends on the outcome of the first-serve.
• Let’s make a tree diagram.
• What is the probability Federer makes the first-serve and wins the point?
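One way to answer this question is to multiply along the branches of the tree. The Python below is a minimal sketch using the Federer figures quoted above. The function name `joint` and the variable names are illustrative, not from the original slides.

```python
# Branch probabilities quoted in the text (2011 ATP tour, Federer serving).
p_first_in = 0.63        # P(make first-serve)
p_win_given_in = 0.78    # P(win point | make first-serve)
p_win_given_miss = 0.57  # P(win point | miss first-serve)


def joint(p_a, p_b_given_a):
    """Multiply along one path of the tree: P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a


# Path: first-serve in, then point won.
p_in_and_win = joint(p_first_in, p_win_given_in)
print(f"P(make first-serve and win point) = {p_in_and_win:.4f}")
```

The same function applies to every other path through the tree: pass the probability of the first branch and the conditional probability on the second branch.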
• The previous calculation was an example of the general multiplication rule, which is used to find the probability that two events both occur.
• The general multiplication rule says that for any two events A and B:
  P(A and B) = P(A) × P(B | A)
• Find the remaining probabilities.
  P(make first-serve and lose point) = (0.63)(0.22) = 0.1386
  P(miss first-serve and win point) = (0.37)(0.57) = 0.2109
  P(miss first-serve and lose point) = (0.37)(0.43) = 0.1591
• Now we can replace the “outcome” section of the tree diagram with the probabilities.
• When Federer is serving, what is the probability that he wins the point?
  P(win point) = (0.63)(0.78) + (0.37)(0.57) = 0.4914 + 0.2109 = 0.7023

Reversing the Conditioning
• Let’s say you are watching Federer serve. Brennan Huff sends you a text message and you get distracted. You look back just in time to see that Federer won a point. How likely is it that he made his first-serve? In other words, what is the probability that he made the first-serve, given that he wins a point?
• To find the probability, we have to work in reverse.
• Use our conditional probability formula:
  P(make first-serve | win point) = P(make first-serve and win point) / P(win point) = 0.4914 / 0.7023 = 0.70

Random Variables and Expected Value
• One of the most exciting times for sports fans is a game 7 in a playoff series.
• Unfortunately, not all best-of-seven series make it to a 7th game. Instead, one team might win the series in 4, 5, or 6 games.
• In 2003, a New York Times article suggested that in baseball, a 7-game World Series is unusually common. Is this true?
• A random variable takes on numerical values that describe the outcomes of a chance process.
• Let’s define the random variable X as the number of games played in a randomly selected World Series.
• A probability distribution lists the possible values of a random variable and how likely they are to occur.
• The table below uses the results of the World Series from 1945 to 2010 to estimate the probability distribution of X. This probability distribution lists the possible number of games and how often those values occurred.
• It is also possible to display the probability distribution using a graph, such as a histogram.

The Mean (Expected Value) of a Random Variable
• On average, how many games does a World Series last?
In other words, what is the mean of the random variable X?
• One way to estimate the mean value of X is to locate the balancing point of the histogram displaying the probability distribution of X.
• Finding the balancing point can be done a few ways.
• We can find the average as we did in Chapter 4.
• Needless to say, this could be a bit tedious. There is a more efficient way to do it.
• We know how many times each value occurs, so we can rewrite the numerator.
• Now, rewrite the fraction as four separate fractions.
• Finally, rearrange each fraction to reveal a helpful pattern.
• Each term of the sum has two factors:
  – The numbers in front of the parentheses are the possible values of the random variable X.
  – The numbers in the parentheses are the corresponding probabilities.
• In general, for a random variable X, the mean value of X (also called the expected value of X) can be found by multiplying each value of X by its probability and then adding together the products:
  μ_X = E(X) = Σ x_i p_i
  – The sigma symbol (Σ) means “add them up”.
  – E(X) represents the expected value of X.
  – This is saying that the mean value of X is equal to the expected value of X, which is equal to the sum of the X values times their probabilities.
• The expected value of X is 5.86 games. How do we interpret this value?
  – If we were to randomly select World Series over and over, the average number of games in the selected Series would be about 5.86.

Ex. 2: Hole #13 at the Augusta National golf course is one of the most famous holes in golf. Lined with the course’s signature azaleas, this hole is also a favorite of players for its relative ease. The hole is a par 5, meaning that professional golfers would be expected to complete the hole in 5 strokes. Let X = the score on hole #13 for a randomly selected golfer on day 1 of the 2011 Masters. The probability distribution of X is shown in the table on the next slide.
1) Calculate the expected value of X.
2) Interpret the expected value of X.
If we randomly select golfers over and over on day 1 of the 2011 Masters, their average score on hole #13 would be about 4.627.

Expected Values and Strategy in Sports
• On April 15, 1947, Jackie Robinson, of the Brooklyn Dodgers, became the first black player in MLB since the 1880s (Moses Fleetwood Walker played for the Toledo Blue Stockings of the American Association).
• He had many career accomplishments, including Rookie of the Year in 1947, NL MVP in 1949, and six World Series appearances.
• Robinson was extremely aggressive on the bases. He stole home 19 times in his career.
• However, he was caught attempting to steal home 11 times.
• While sometimes he provided an additional run, other times he cost his team potential runs.
• The question becomes: overall, was Robinson’s aggressive base running a good strategy?
• One way to evaluate the value of stealing home is by examining run expectancy for various combinations of base runners and outs.
• In baseball, a team’s run expectancy (expected number of runs scored) in a particular situation is the average number of additional runs that the team would score if they could keep playing in that context over and over.
• Based on data from Robinson’s playing years, when there was a runner on third with 2 outs, teams could expect to score an additional 0.36 runs that inning.
• If a runner on third could steal home, his team would score 1 run. This represents a “gain” of 0.64 runs, because 1 actual run is 0.64 more than 0.36 potential runs.
• Additionally, if the steal was successful, the inning would still be alive, still with 2 outs, but now with no runners on base.
• With 2 outs and no one on base, teams could expect to score an additional 0.10 runs.
• To recap: A successful steal of home with a runner on third and 2 outs gives a team 1.10 expected runs (1 run scored plus 0.10 expected additional runs), compared to the 0.36 expected runs if the runner did not try to steal home.
• With 2 outs and a runner on third, an unsuccessful steal ends the inning, reducing run expectancy from 0.36 to 0.
• Let’s now look at the expected value of this situation.
• Suppose that a base runner in this context has an 80% chance of successfully stealing home.
• Let X = run expectancy when attempting to steal home.
• There are then two possible values for X:
  – x = 1.10 and x = 0, with corresponding probabilities of 0.80 and 0.20.
  E(X) = 1.10(0.80) + 0(0.20) = 0.88
• This means that if a team has a runner on third with two outs and the runner has an 80% chance of successfully stealing home, the team would score 0.88 runs, on average, if they followed this strategy in many, many innings.
• Because the expected number of runs is greater than 0.36, attempting to steal home in this circumstance is a good strategy.
• What if the base runner only had a 50% chance of successfully stealing home?
  E(X) = 1.10(0.50) + 0(0.50) = 0.55
• Because the expected number of runs is still greater than 0.36, attempting to steal home in this circumstance is still a good strategy.
• When would attempting to steal home become a bad strategy?
• In other words, for what probabilities of success will the expected number of runs be less than 0.36?
• Here is the probability distribution of X, with p representing the probability of success and (1-p) representing the probability of failure.
  Value of X: 1.10 and 0
  Probability: p and 1-p
• To find out if stealing is a good strategy, we want to know what value of p results in an expected value greater than 0.36.
  1.10(p) + 0(1-p) > 0.36, so p > 0.36/1.10 = 0.327
• If the base runner has a greater than 32.7% chance of successfully stealing home with a runner on third and 2 outs, then the expected number of runs is greater than 0.36.
• Thus, if a base runner has a greater than 32.7% chance of stealing home, then attempting to steal home is a good strategy.
• If the base runner has a less than 32.7% chance, then attempting to steal home is a bad strategy.
• So how did Robinson PERFORM with 2 outs and a runner at third?
• He was successful in 7 of his 14 attempts (50%).
• Because 50% is greater than 32.7%, attempting to steal home was a good strategy for Robinson.

End of Game Strategy: Win Probability
• Another useful concept in evaluating strategy in sports is win probability.
• A team’s win probability measures the proportion of games a team would win if they could replay the game over and over again in the same context.
• Using historical data, it is possible to estimate the probability that a team will win a game based on the context of the game at the time.
• Example: A baseball team playing at home, down by 1 run, with runners at second and third with 1 out in the bottom of the 9th has a 54.0% chance of winning the game. However, if the next hitter strikes out, leaving the runners in the same position with 2 outs, the win probability goes down to 24.7%.
• The crucial strikeout reduced the win probability by 29.3 percentage points.
• Many websites show up-to-the-minute win probabilities.
• Here is an example from www.live.advancednflstats.com.
• Let’s now return to the Patriots-Colts example from the beginning of the chapter.
• To recap:
  – New England had the football at their own 28 yard line with 2:08 left on the clock, and they led 34-28. It was 4th down and 2. New England had no time outs left, and the Colts had 1 time out left.
  – The conventional move would be to punt and play defense. However, if they go for it and pick up 2 yards they will essentially win the game. If they go for it and don’t pick up the 2 yards, Manning will have a good chance to throw for the game winning touchdown, as he was on a roll in the second half.
  – If you were Bill Belichick, what would you do?
• Historically, when teams go for it on 4th down with 2 yards to go, they successfully gain the 2 yards 60% of the time.
  – If the Patriots get the 2 yards, their win probability is 100%.
  – If the Patriots don’t get the 2 yards, their win probability is 47%.
• The other option would be to punt. This would have given the Patriots a win probability of about 70%.
• If the Patriots go for it, there are two ways they can win the game:
  – Get the necessary 2 yards.
  – Fail to get the 2 yards but prevent the Colts from scoring a TD.
• If the Patriots punt, they can win the game by preventing the Colts from scoring.
• Going for it on 4th and 2 results in a win probability of 0.788, as opposed to punting, which results in a win probability of 0.70.
• Therefore, going for it on 4th down would be a better strategy, statistically speaking.
• Unfortunately for Belichick, he went for it and the Patriots did not get the first down (it sure was close though!). They consequently lost the game.
• The play
• Sports Nation debate
• Just because the Patriots did not get the 1st down doesn’t mean Belichick’s decision was wrong.
• Win probability tells us that if they were able to replay this context 1000 times (for example), the Patriots would win about 788 times and the Colts would win about 212 times.
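The 0.788 figure is an expected value over the two outcomes of the 4th-down attempt. The Python below is a minimal sketch that reproduces it from the probabilities quoted above and then reuses the same helper for the steal-home example from earlier in the chapter. The helper name `expected_value` and all variable names are illustrative, not part of the original slides.

```python
def expected_value(outcomes):
    """Sum of value * probability over (value, probability) pairs."""
    total_p = sum(p for _, p in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(v * p for v, p in outcomes)


# 4th and 2: 60% chance to convert; win probability is 1.00 after a
# conversion and 0.47 after a failure (figures quoted in the text).
go_for_it = expected_value([(1.00, 0.60), (0.47, 0.40)])
punt = 0.70  # win probability after a punt, as quoted in the text

decision = "go for it" if go_for_it > punt else "punt"
print(f"go for it: {go_for_it:.3f}   punt: {punt:.2f}   -> {decision}")

# Expected wins if this situation could be replayed 1000 times.
print(f"wins per 1000 replays: about {round(1000 * go_for_it)}")

# Same helper, steal-home example: 1.10 expected runs on success,
# 0 on failure, compared with 0.36 runs for staying at third.
steal_80 = expected_value([(1.10, 0.80), (0.0, 0.20)])
break_even = 0.36 / 1.10  # success rate where stealing matches staying put
print(f"steal at 80%: {steal_80:.2f} runs   break-even p: {break_even:.3f}")
```

Both decisions have the same structure: compute the expected value of the risky option and compare it with the value of the conservative one.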