Even Statisticians Love Geometry

Charles Burd, April 16, 2014
Advisor: Dr. Chauhan
Objective
• There may be multiple ways of estimating an unknown value.
• The results obtained from different methods may not be the same.
• In such a situation, is there a way to determine which method is more appropriate, and under what conditions?
Background
Estimating a proportion
Population proportion $p$: unknown
Random sample proportion $\hat{p}$: known
95% Confidence Interval (CI) for $p$:
$\hat{p} \pm \text{margin of error} = \hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$
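The interval formula above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `wald_ci` and the sample of 60 successes out of 100 are hypothetical.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Return (p_hat, standard_error, lower, upper) for a sample proportion.

    Implements p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n).
    """
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return p_hat, se, p_hat - margin, p_hat + margin

# Hypothetical sample: 60 successes in 100 trials.
p_hat, se, lower, upper = wald_ci(60, 100)
print(f"p_hat = {p_hat:.2f}, SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The default `z=1.96` corresponds to the 95% level used throughout the slides; a different confidence level would need a different multiplier.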
Comparing proportions of two populations
Unknown: $p_1$, $p_2$
Objective: Estimate $p_1 - p_2$
Estimation of $p_1 - p_2$
Two approaches: Which is better?
Overlap: Compute a CI for each sample, one for $p_1$ and one for $p_2$.
Decision rule: If the two intervals overlap, the population proportions may be the same.
Standard: Compute one CI for the difference $p_1 - p_2$.
Decision rule: If the interval contains zero, the proportions may be the same.
Example
Overlap Approach
Population 1: $n_1 = 200$, $\hat{p}_1 = 112/200 = 0.56$
Population 2: $n_2 = 200$, $\hat{p}_2 = 88/200 = 0.44$
margin of error $= 1.96\,E_i$, where $E_i = \sqrt{\hat{p}_i(1-\hat{p}_i)/n_i} = 0.0351$ for both samples
95% CI for $p_1$: $0.56 \pm 1.96(0.0351) = (0.49, 0.63)$
95% CI for $p_2$: $0.44 \pm 1.96(0.0351) = (0.37, 0.51)$
[Figure: the intervals for $p_1$ and $p_2$ drawn on a number line from 0.3 to 0.7]
Overlap Approach: The intervals overlap, so the proportions may be the same.
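The overlap decision for this example can be reproduced with a short script. It is a sketch under the slide's assumptions (Wald intervals, 1.96 multiplier); the helper names `wald_ci` and `intervals_overlap` are illustrative, not from the presentation.

```python
import math

def wald_ci(successes, n, z=1.96):
    """Return the (lower, upper) Wald interval for a sample proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def intervals_overlap(ci_a, ci_b):
    """True when two closed intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

ci1 = wald_ci(112, 200)  # Population 1: 112 of 200
ci2 = wald_ci(88, 200)   # Population 2: 88 of 200
overlap = intervals_overlap(ci1, ci2)

print(f"CI for p1: ({ci1[0]:.2f}, {ci1[1]:.2f})")
print(f"CI for p2: ({ci2[0]:.2f}, {ci2[1]:.2f})")
print("Intervals overlap:", overlap)
```

The intervals share the narrow band between roughly 0.49 and 0.51, which is why the overlap rule does not report a difference here.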
Example Continues
Standard Approach
Population 1 – Population 2
$n_1 = n_2 = 200$
$\hat{p}_1 - \hat{p}_2 = 0.56 - 0.44 = 0.12$
margin of error $= 1.96\sqrt{E_1^2 + E_2^2} = 1.96\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2} = 1.96(0.0496)$
95% CI for $p_1 - p_2$: $0.12 \pm 1.96(0.0496) = (0.02, 0.22)$
[Figure: the interval for $p_1 - p_2$ drawn on a number line from −0.1 to 0.3]
Standard Approach: The interval does not contain zero, so the proportions are judged to be different.
Result: The overlap method concludes that the population proportions are not different, while the standard method finds a difference.
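The standard interval for the same data can be checked the same way. This is a minimal sketch; the function name `diff_ci` is an assumption made for illustration.

```python
import math

def diff_ci(x1, n1, x2, n2, z=1.96):
    """Return (lower, upper) for p1 - p2 using the pooled-variance-free
    standard error sqrt(E1^2 + E2^2)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

lower, upper = diff_ci(112, 200, 88, 200)
contains_zero = lower <= 0 <= upper

print(f"95% CI for p1 - p2: ({lower:.2f}, {upper:.2f})")
print("Interval contains zero:", contains_zero)
```

With the same counts as the overlap example, the lower limit stays above zero, so the two rules disagree on identical data.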
A Closer Look
Individual intervals of $p_1$, $p_2$ overlap iff:
$(\hat{p}_1 - \hat{p}_2) \pm 1.96(E_1 + E_2)$ contains zero
Standard approach interval:
$(\hat{p}_1 - \hat{p}_2) \pm 1.96\sqrt{E_1^2 + E_2^2}$
The only difference between the two methods, then, is the width of the intervals.
narrower width → less chance zero is included → proportions judged different
[Figure: right triangle with legs $E_1$ and $E_2$ and hypotenuse $\sqrt{E_1^2 + E_2^2}$; the ratio of interest is $(E_1 + E_2)/\sqrt{E_1^2 + E_2^2}$]
A difference found by the overlap method implies a difference by the standard method, but not vice versa (the ratio is greater than 1).
• Overlap method is more conservative and less powerful.
• If two populations differ, the standard method is more likely to detect it.
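The two facts used on this slide can be written out step by step. The derivation below is a sketch in which $z$ stands for the 1.96 multiplier and $E_1, E_2 > 0$ are the two standard errors; the notation $\hat{p}_i$ for the sample proportions is assumed.

```latex
\begin{align*}
\text{intervals overlap}
&\iff \hat{p}_1 - zE_1 \le \hat{p}_2 + zE_2
      \ \text{ and } \
      \hat{p}_2 - zE_2 \le \hat{p}_1 + zE_1 \\
&\iff -z(E_1 + E_2) \le \hat{p}_1 - \hat{p}_2 \le z(E_1 + E_2) \\
&\iff 0 \in \big[(\hat{p}_1 - \hat{p}_2) - z(E_1 + E_2),\
                 (\hat{p}_1 - \hat{p}_2) + z(E_1 + E_2)\big], \\[4pt]
(E_1 + E_2)^2 &= E_1^2 + E_2^2 + 2E_1E_2 \;>\; E_1^2 + E_2^2
\quad\Longrightarrow\quad
E_1 + E_2 > \sqrt{E_1^2 + E_2^2}.
\end{align*}
```

The last line is the triangle inequality for the right triangle with legs $E_1$ and $E_2$: the two legs together are longer than the hypotenuse, so the overlap method's implied interval is always the wider one.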
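A Closer Look
What does this geometric relationship tell us about the overlap method's deficiencies?
[Figure: right triangle from the example, with legs $E_1 = E_2 = 0.0351$ and hypotenuse $\sqrt{E_1^2 + E_2^2} = 0.0496$]
Ratio $(E_1 + E_2)/\sqrt{E_1^2 + E_2^2}$:
• Max of $\sqrt{2}$ when $E_1 = E_2$
• Min → 1 as one leg → 0
So, the overlap approach is expected to be most deficient when $E_1$ and $E_2$ are nearly equal.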
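A quick numerical check of the ratio of the two interval half-widths, $(E_1 + E_2)/\sqrt{E_1^2 + E_2^2}$, shows how it moves between its limits. This is an illustrative sketch: the function name `width_ratio` and the smaller values of the second standard error are hypothetical, while 0.0351 is the standard error from the example.

```python
import math

def width_ratio(e1, e2):
    """Ratio of the overlap method's implied half-width to the standard one."""
    return (e1 + e2) / math.hypot(e1, e2)

e1 = 0.0351  # standard error from the example
for e2 in (0.0351, 0.02, 0.005, 0.0005):
    print(f"E1 = {e1}, E2 = {e2}: ratio = {width_ratio(e1, e2):.4f}")
```

The ratio is largest, at the square root of 2 (about 1.414), when the two standard errors are equal, and it drifts toward 1 as one of them shrinks relative to the other.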
Simulation
Do simulation results confirm analytical expectations?
$p_1 = 55\%$, sample size = 100, 4000 draws
Percentage of time finding a difference between populations:

              Overlap     Standard
p2 = 53%      0.65%       6.5%
p2 = 23%      98.275%     99.9%
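A simulation along these lines could be set up as follows. The slides do not describe how the original simulation was implemented, so everything here is an assumption: independent Bernoulli sampling, Wald standard errors, a fixed seed, and the function names `rejects` and `simulate`. The percentages it prints will vary with the seed and should not be expected to match the table exactly.

```python
import math
import random

def rejects(x1, x2, n, z=1.96):
    """Return (overlap_finds_difference, standard_finds_difference)."""
    p1, p2 = x1 / n, x2 / n
    e1 = math.sqrt(p1 * (1 - p1) / n)
    e2 = math.sqrt(p2 * (1 - p2) / n)
    gap = abs(p1 - p2)
    # Overlap rule: intervals are disjoint when the gap exceeds z*(E1 + E2).
    # Standard rule: difference CI excludes zero when gap exceeds z*sqrt(E1^2 + E2^2).
    return gap > z * (e1 + e2), gap > z * math.hypot(e1, e2)

def simulate(p1, p2, n=100, draws=4000, seed=0):
    """Percentage of draws in which each method finds a difference."""
    rng = random.Random(seed)
    overlap_hits = standard_hits = 0
    for _ in range(draws):
        x1 = sum(rng.random() < p1 for _ in range(n))
        x2 = sum(rng.random() < p2 for _ in range(n))
        by_overlap, by_standard = rejects(x1, x2, n)
        overlap_hits += by_overlap
        standard_hits += by_standard
    return 100 * overlap_hits / draws, 100 * standard_hits / draws

results = {p2: simulate(0.55, p2) for p2 in (0.53, 0.23)}
for p2, (overlap_pct, standard_pct) in results.items():
    print(f"p2 = {p2:.2f}: overlap {overlap_pct:.2f}%, standard {standard_pct:.2f}%")
```

Because both rules are evaluated on the same pair of samples in each draw, the overlap percentage can never exceed the standard percentage in this setup.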
Conclusion
• The standard method is always at least as powerful as the overlap method.
• Overlap method is at its worst when the two margins of error are close.
• Overlap is simple and convenient to use, but for formal testing, use the standard method.
Reference
Nathaniel Schenker and Jane F. Gentleman, "On Judging the Significance of Differences by Examining the Overlap Between Confidence Intervals," The American Statistician 55 (Aug. 2001), 182-186.