What is the margin of error?

Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
The Margin of Error
The CNN/Opinion Research Corp. said 51 percent of those polled thought Biden
did the best job, while 36 percent thought Palin did the best job.
On the question of the candidates' qualifications to assume the presidency, 87
percent of those polled said Biden is qualified and 42 percent said Palin is
qualified.
The poll had a margin of error of plus or minus 4
percentage points.
http://www.cnn.com/2008/POLITICS/10/03/debate.poll/index.html (9:30 AM, Friday,
October 3, 2008)
What Does the “Margin of Error” Tell You?

Did Biden do the better job? If we
could ask every individual all over the
world who had an opinion, the
proportion who think yes would be θ.
We assume that such a value exists.
 It has to be as of a moment in time.
The next day, the same question
might get a different answer.

The Margin of Error
We can’t ask everyone, so we ask a
sample of people, i = 1,…,n.
 Do you think Biden did the better job?
x_i = 1 if person i answers yes,
x_i = 0 if they answer no.
 51% said yes, so P = (1/n) Σ_i x_i = 0.51.
 Is θ = 0.51? No. We didn’t ask
everyone, we just asked a sample.
0.51 is an estimate of θ.
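
As a minimal sketch of the calculation above, the sample proportion is just the average of the 0/1 answers. The list of responses here is invented for illustration (51 yes answers out of 100); it is not the poll's actual data.

```python
# Hypothetical responses: 1 = "yes, Biden did the better job", 0 = "no".
# These 100 answers are made up for illustration, not the real poll data.
responses = [1] * 51 + [0] * 49

n = len(responses)
P = sum(responses) / n  # sample proportion: P = (1/n) * sum of the x_i

print(f"n = {n}, P = {P:.2f}")
```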

Why the 4% Margin of Error?
We acknowledge that the 0.51 might
be inaccurate because it is based on
a sample.
 We assume that whatever n is, the
sample was drawn randomly.
 We use our empirical rule to figure out
what the real value of θ might be.

Some Theory
P = (1/n) Σ_i x_i = 0.51
 P is a random variable. It is the sum of
n Bernoulli random variables, divided
by n.
 E[x_i] = the probability that person i will
answer yes. This is θ. So, the
expected value of P is (1/n) Σ_i θ = θ.
 That is why they treat P as a good estimate of θ.
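
The claim that the expected value of P equals the true proportion can be checked with a small simulation. This is only a sketch: the true proportion (0.51), the sample size (500), the number of simulated polls (2000) and the helper name `sample_proportion` are all assumptions chosen for illustration.

```python
import random

def sample_proportion(theta, n, rng):
    """Draw n Bernoulli(theta) answers and return their mean, P."""
    return sum(1 if rng.random() < theta else 0 for _ in range(n)) / n

rng = random.Random(0)   # fixed seed so the run is repeatable
theta = 0.51             # assumed true proportion (illustrative only)
n = 500                  # assumed sample size (illustrative only)

# Repeat the "poll" many times and average the resulting values of P.
draws = [sample_proportion(theta, n, rng) for _ in range(2000)]
average_P = sum(draws) / len(draws)

print(f"average of P over {len(draws)} simulated polls: {average_P:.4f}")
```

Any single poll can miss the true value, but the average over many simulated polls should land very close to it.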

Theory Continued
Since P is a random variable, it has a
variance.
 Var[P] = (1/n²) Σ_i θ(1 - θ) = θ(1 - θ)/n
 The standard deviation is the square
root, sqrt(θ(1 - θ)/n).
 Use P to estimate this. The estimated
standard deviation is sqrt(0.51(0.49)/n).
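
The estimated standard deviation is easy to tabulate for different sample sizes. In this sketch the function name `estimated_sd` and the sample sizes are hypothetical; only P = 0.51 comes from the poll.

```python
import math

def estimated_sd(p, n):
    """Estimated standard deviation of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# P = 0.51 is the poll result; the sample sizes below are hypothetical.
for n in (100, 400, 1600):
    print(f"n = {n:5d}: estimated sd = {estimated_sd(0.51, n):.4f}")
```

Because n sits under a square root, quadrupling the sample size only halves the standard deviation.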

That Margin of Error
They use the same empirical rule we do.
 The margin of error, ±0.04, is
±2 standard deviations. So, one
standard deviation is 0.02.
 0.02² = P(1 - P)/n. If P is 0.51, n = 625.
 (According to David Gregory, they asked
560).
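
The back-calculation of the sample size can be written out step by step. This sketch assumes, as the slide does, that the reported margin is exactly two estimated standard deviations; the variable names are illustrative.

```python
P = 0.51                 # reported sample proportion
margin = 0.04            # reported margin of error
sd = margin / 2          # empirical rule: margin = 2 standard deviations

n = P * (1 - P) / sd**2  # solve sd^2 = P(1 - P)/n for n
print(f"implied n = {n:.2f}")
```

The arithmetic gives 0.2499 / 0.0004 = 624.75, which rounds to the 625 used above.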

What They Found
Based on a survey of 625 people, we
believe the proportion of people who think
Biden did a better job is between 47% and
55%.

Based on the same logic, the proportion
for Palin is between 32% and 40% (36% plus or minus 4 points).
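
The ranges follow from adding and subtracting the margin of error from each reported percentage. A minimal sketch, using the 51% and 36% figures and the 4-point margin quoted from the poll; the helper name `interval` is hypothetical.

```python
def interval(p, margin=0.04):
    """Return (low, high) for p plus or minus the margin of error."""
    return p - margin, p + margin

# Reported shares who thought each candidate did the best job.
for name, p in (("Biden", 0.51), ("Palin", 0.36)):
    low, high = interval(p)
    print(f"{name}: {low:.0%} to {high:.0%}")
```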

What would these ranges be if they had
asked 6,250 people instead of 625?

Is this CERTAIN?
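
One way to explore that question is to simulate many polls and count how often the interval of P plus or minus 2 standard deviations actually contains the true proportion. This is a sketch under stated assumptions: the true proportion (0.51), the sample size (625), the number of simulated polls (2000) and the helper name `covers` are all chosen for illustration.

```python
import math
import random

def covers(theta, n, rng):
    """Simulate one poll; report whether P +/- 2 sd contains theta."""
    p = sum(rng.random() < theta for _ in range(n)) / n
    half_width = 2 * math.sqrt(p * (1 - p) / n)
    return p - half_width <= theta <= p + half_width

rng = random.Random(1)              # fixed seed so the run is repeatable
theta, n, polls = 0.51, 625, 2000   # theta is an assumed true value
coverage = sum(covers(theta, n, rng) for _ in range(polls)) / polls

print(f"share of intervals containing theta: {coverage:.3f}")
```

The share should come out close to, but below, 1: the empirical rule promises roughly 95% of such intervals, not all of them.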
Regression: θ|State ≠ θ Overall