Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Part 3a – Interesting Probability Puzzles
2 Classic Problems and 1 Intriguing One
The birthday problem
The Monty Hall problem
Halftime winner
The Birthday Problem
What is the probability that everyone in this room
has a different birthday? (50 people)
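\[
\begin{aligned}
P &= 1 \cdot \left(1-\frac{1}{365}\right)\left(1-\frac{2}{365}\right)\cdots\left(1-\frac{50-1}{365}\right) \\
  &= \frac{365 \times 364 \times \cdots \times (365-50+1)}{365^{50}} \\
  &= \frac{365!}{365^{50}\,(365-50)!} \\
  &= 0.029626
\end{aligned}
\]
(only about 1 chance in 34).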
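The product above is easy to check numerically. The following is a minimal sketch, not part of the original slides; the function name `prob_all_distinct` is made up for illustration, and it assumes 365 equally likely birthdays (no leap day, no seasonal effects).

```python
from math import prod


def prob_all_distinct(n, days=365):
    """Probability that n people all have different birthdays.

    Assumes each of `days` birthdays is equally likely and that
    people's birthdays are independent of one another.
    """
    if n > days:
        return 0.0  # more people than days: a shared birthday is certain
    # Person k+1 must avoid the k birthdays already taken.
    return prod((days - k) / days for k in range(n))


p_50 = prob_all_distinct(50)
print(round(p_50, 6))  # 0.029626
```

Under the same assumptions, `prob_all_distinct(23)` already falls below one half, which is why a shared birthday is more likely than not in a group of only 23 people.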
The Monty Hall Problem
Suppose you're on a game show, and
you're given the choice of three doors:
Behind one door is a car; behind the
others, goats. You pick a door, say No. 1,
and the host, who knows what's behind
the doors, opens another door, say No. 3,
which has a goat. He then says to you,
"Do you want to pick door No. 2?" Is it to
your advantage to switch your choice?
Answer: It definitely pays to switch. Switching wins
the car with probability 2/3; staying wins with
probability only 1/3. See “Notes for this class” or
browse the 1,000,000 plus hits you’ll find if you
search for this problem on the web.
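The answer can be verified by enumerating every case exactly. The sketch below is an illustration rather than the argument from the class notes. It assumes the car is equally likely to be behind each door, that the host always opens a goat door other than the one you picked, and that he chooses at random when two such doors are available. The name `win_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)


def win_probabilities():
    """Return (P(win if you stay), P(win if you switch)) as exact fractions."""
    stay = switch = Fraction(0)
    # Assumption: car location and first pick are independent and uniform.
    weight = Fraction(1, 9)
    for car, pick in product(DOORS, DOORS):
        # The host may open any door that is neither your pick nor the car.
        openable = [d for d in DOORS if d != pick and d != car]
        for opened in openable:
            w = weight / len(openable)  # assumption: host picks at random
            # The one door left closed that you did not pick.
            other = next(d for d in DOORS if d not in (pick, opened))
            stay += w * (pick == car)
            switch += w * (other == car)
    return stay, switch


stay, switch = win_probabilities()
print(stay, switch)  # 1/3 2/3
```

Staying wins only when the first pick was already the car (probability 1/3); in every other case the host's move leaves the car behind the remaining door.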
Throwing in the Towel
Consider a sporting event with two halves. Scores accumulate in each half, and the
winner is the team with the higher total over the two halves (baseball, hockey,
football, basketball, rugby). There are no ties, so this doesn’t work well for soccer.
The two teams are evenly matched, and they play exactly as hard in the second half
as in the first. Given that a team is ahead at halftime, what is the probability that
it will win the game?

Intuition (incorrectly) says .5: if team A wins the first half, it is just as likely
that team B will win the second half. The correct answer is .75. The simple intuition
is that it is not sufficient for team B to win the second half; team B must win it by
a larger margin than team A had in the first half. Since the teams are evenly matched,
that probability is only .25.

Formally, suppose team A leads at halftime by a margin M. There is a 50% chance that
team A wins the second half outright, in which case A wins the game. In the other 50%
of cases team B wins the second half, and because the teams are evenly matched and
play the same way in both halves, B's second-half margin is equally likely to fall
short of M as to exceed it. So A wins the game in all of the cases in which A wins the
second half and in half of the cases in which B wins it: .5 + (.5)(.5) = .75.
See Jeffrey Simonoff, “Probability – the language of randomness,” pp. 10-12.
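A quick simulation illustrates the .75 result. This is a sketch under assumptions that are not in the slides: each half's scoring margin is drawn independently from the same zero-mean normal distribution, which makes the teams evenly matched and rules out ties. The function name `halftime_leader_win_rate` and the choice of distribution are purely illustrative.

```python
import random


def halftime_leader_win_rate(n_games=200_000, seed=0):
    """Estimate P(halftime leader wins the game) by simulation.

    Assumption: team A's margin over team B in each half is an
    independent draw from a standard normal distribution.
    """
    rng = random.Random(seed)
    led = won = 0
    for _ in range(n_games):
        first = rng.gauss(0.0, 1.0)   # A's margin in the first half
        second = rng.gauss(0.0, 1.0)  # A's margin in the second half
        if first == 0.0 or first + second == 0.0:
            continue  # exact ties have probability zero; skipped for safety
        led += 1
        a_led_at_half = first > 0
        a_won_game = first + second > 0
        won += (a_led_at_half == a_won_game)
    return won / led


rate = halftime_leader_win_rate()
print(f"Halftime leader wins about {rate:.3f} of simulated games")
```

Any symmetric, continuous distribution for the margins should give an estimate close to .75, since the argument depends only on the two halves being independent and identically distributed.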