Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Part 3a – Interesting Probability Puzzles
2 Classic Problems and 1 Intriguing One

The birthday problem
The Monty Hall problem
Halftime winner

[Figure: title-slide collage of sample course plots: pie chart of Percent vs Type; scatterplot and marginal plot of Listing vs IncomePC; histogram, boxplot, empirical CDF, and normal probability plot of Listing]

The Birthday Problem

What is the probability that everyone in this room has a different birthday? (50 people)

$$
P = \left(1-\frac{1}{365}\right)\left(1-\frac{2}{365}\right)\cdots\left(1-\frac{50-1}{365}\right)
  = \frac{365 \cdot 364 \cdots (365-50+1)}{365^{50}}
  = \frac{365!}{365^{50}\,(365-50)!}
  = 0.029626
$$

(only about 1 chance in 30).
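The product in the birthday calculation is easy to check numerically. The sketch below is only an illustration: the function name `prob_all_distinct` is hypothetical, and it assumes 365 equally likely birthdays that are independent across people (no leap day).

```python
from math import prod


def prob_all_distinct(n: int, days: int = 365) -> float:
    """Probability that n people all have different birthdays.

    Assumes every one of `days` birthdays is equally likely and that
    people's birthdays are independent (an idealization).
    """
    if n > days:
        return 0.0  # more people than days: a shared birthday is certain
    # Person k+1 must avoid the k birthdays already taken: (days - k) / days.
    return prod((days - k) / days for k in range(n))


# 50 people, as on the slide
print(f"{prob_all_distinct(50):.6f}")
```

Calling the same function with other group sizes shows how quickly the probability falls; under these assumptions it drops below one half at 23 people.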
The Monty Hall Problem

Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

Answer: It definitely pays to switch. Your first pick is right only 1/3 of the time, and because the host always reveals a goat, switching wins whenever your first pick was wrong, which is 2/3 of the time. See “Notes for this class” or browse the 1,000,000 plus hits you’ll find if you search for this problem on the web.
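A quick simulation makes the advantage of switching concrete. This is a minimal sketch, not taken from the class notes: the names `play` and `win_rate` are hypothetical, and it assumes the standard rules, namely that the host always opens a goat door other than the contestant's pick and chooses at random when two such doors are available.

```python
import random


def play(switch: bool, rng: random.Random) -> bool:
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the car is and opens a goat door
    # that is not the contestant's pick (assumed standard rules).
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the winning probability over many simulated games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

With enough trials the two estimates should settle near 1/3 for staying and 2/3 for switching.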
Throwing in the Towel

Consider a sporting event with two halves. Scores accumulate in each half, and the winner is the team with the higher total over the two halves (baseball, hockey, football, basketball, rugby). There are no ties, so this doesn’t work well for soccer. The two teams are evenly matched, and they play exactly as hard in the second half as in the first.

Given that a team is ahead at halftime, what is the probability that it will win the game?

Intuition (incorrectly) says .5: if team A wins the first half, it’s just as likely that team B will win the second half.

The correct answer is .75! The simple intuition is that it is not sufficient for team B to win the second half. Team B must win the second half by a larger margin than team A had in the first half. Since the teams are evenly matched, that probability is only .25.

Formally, suppose team A leads at halftime by a margin M. There is a 50% chance that team A will win the second half outright. If team B wins the second half instead, A's first-half margin and B's second-half margin come from the same distribution, because the teams are evenly matched and play equally hard in both halves, so each margin is equally likely to be the larger. A therefore wins the game in all of the cases when A wins the second half and in half of the cases when B wins the second half: .5 + (.5)(.5) = .75.

See Jeffrey Simonoff, “Probability – the language of randomness,” pp. 10-12.
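The .75 answer can also be checked by simulation. The sketch below is an illustration under stated assumptions, not the method from Simonoff's text: the name `leader_win_rate` is hypothetical, and each half's scoring margin (A's points minus B's points) is modeled as an independent standard normal draw, which gives evenly matched teams, identical play in both halves, and no ties.

```python
import random


def leader_win_rate(trials: int = 200_000, seed: int = 0) -> float:
    """Estimate P(halftime leader wins the game) for evenly matched teams.

    Assumption: the margin in each half (A minus B) is an independent
    draw from the same distribution, symmetric about zero.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        first = rng.gauss(0.0, 1.0)   # first-half margin, A minus B
        second = rng.gauss(0.0, 1.0)  # second-half margin, A minus B
        total = first + second        # final margin, A minus B
        # The halftime leader is A if first > 0, otherwise B.
        leader_won = total > 0 if first > 0 else total < 0
        wins += leader_won
    return wins / trials


print(leader_win_rate())
```

Any other continuous distribution that is symmetric about zero should give an estimate near .75 as well, since the argument depends only on the symmetry and independence of the two margins.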