Even Statisticians Love Geometry
Charles Burd, April 16, 2014
Advisor: Dr. Chauhan

Objective
• There may be multiple ways of estimating an unknown value.
• The results obtained from different methods may not agree.
• In such a situation, is there a way to determine which method is more appropriate, and under what conditions?

Background
Estimating a proportion $p$
• $p$: population proportion (unknown)
• $\hat{p}$: random sample proportion (known)
95% confidence interval (CI) for $p$: $\hat{p} \pm$ margin of error, that is,
$\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$

Comparing Proportions of Two Populations
• Unknown: $p_1$ and $p_2$
• Objective: estimate $p_1 - p_2$

Estimation of $p_1 - p_2$
Two approaches: which is better?
• Overlap: compute a CI for each sample ($\hat{p}_1$ and $\hat{p}_2$). Decision rule: if the two intervals overlap, the population proportions may be the same.
• Standard: compute one CI for the difference $p_1 - p_2$. Decision rule: if the interval contains zero, the proportions may be the same.

Example: Overlap Approach
Population 1: $n_1 = 200$, $\hat{p}_1 = 112/200 = 0.56$
Population 2: $n_2 = 200$, $\hat{p}_2 = 88/200 = 0.44$
Margin of error $= 1.96\,E_1$, where $E_1 = \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1} = 0.0351$; likewise $E_2 = \sqrt{\hat{p}_2(1-\hat{p}_2)/n_2} = 0.0351$.
95% CI for $p_1$: $0.56 \pm 1.96(0.0351) = (0.49, 0.63)$
95% CI for $p_2$: $0.44 \pm 1.96(0.0351) = (0.37, 0.51)$
[Figure: the two intervals for $p_1$ and $p_2$ drawn on a number line from 0.3 to 0.7.]
Overlap approach: the intervals overlap, so the proportions may be the same.

Example Continued: Standard Approach
Population 1 − Population 2: $n = 200$ per sample, $\hat{p}_1 - \hat{p}_2 = 0.56 - 0.44 = 0.12$
Margin of error $= 1.96\sqrt{E_1^2 + E_2^2}$, where $\sqrt{E_1^2 + E_2^2} = \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2} = 0.0496$
95% CI for $p_1 - p_2$: $0.12 \pm 1.96(0.0496) = (0.02, 0.22)$
[Figure: the interval for $p_1 - p_2$ drawn on a number line from −0.1 to 0.3.]
Standard approach: the interval does not contain zero, so the proportions are not the same.
Result: the overlap method concludes that the population proportions are not different, while the standard method finds a difference.

A Closer Look
The individual intervals for $p_1$ and $p_2$ overlap iff $(\hat{p}_1 - \hat{p}_2) \pm 1.96(E_1 + E_2)$ contains zero.
Standard approach interval: $(\hat{p}_1 - \hat{p}_2) \pm 1.96\sqrt{E_1^2 + E_2^2}$
The only difference between the two methods, then, is the width of the intervals.
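The worked example above can be checked numerically. The following is a minimal Python sketch, not code from the talk; the function names (`std_error`, `overlap_intervals`, `difference_interval`, and so on) are illustrative assumptions, and it uses the same 1.96 critical value and plug-in standard errors as the slides.

```python
import math

Z = 1.96  # critical value for a 95% interval, as used in the slides


def std_error(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def overlap_intervals(p1, n1, p2, n2):
    """Overlap approach: one 95% CI per sample, each as a (low, high) tuple."""
    e1, e2 = std_error(p1, n1), std_error(p2, n2)
    return (p1 - Z * e1, p1 + Z * e1), (p2 - Z * e2, p2 + Z * e2)


def difference_interval(p1, n1, p2, n2):
    """Standard approach: a single 95% CI for the difference p1 - p2."""
    e1, e2 = std_error(p1, n1), std_error(p2, n2)
    half_width = Z * math.sqrt(e1**2 + e2**2)
    diff = p1 - p2
    return (diff - half_width, diff + half_width)


def intervals_overlap(ci_a, ci_b):
    """True when two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def contains_zero(ci):
    """True when a (low, high) interval includes zero."""
    return ci[0] <= 0 <= ci[1]


# Numbers from the example: 112/200 and 88/200.
ci1, ci2 = overlap_intervals(0.56, 200, 0.44, 200)
ci_diff = difference_interval(0.56, 200, 0.44, 200)

print("CI for p1:", ci1)
print("CI for p2:", ci2)
print("Intervals overlap:", intervals_overlap(ci1, ci2))
print("CI for p1 - p2:", ci_diff)
print("Contains zero:", contains_zero(ci_diff))
```

With the example's inputs, the two individual intervals overlap while the interval for the difference excludes zero, which is the disagreement between the two approaches described above.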
Narrower width → less chance that zero is included → proportions judged different.
[Figure: right triangle with legs $E_1$ and $E_2$ and hypotenuse $\sqrt{E_1^2 + E_2^2}$; the sum of the legs, $E_1 + E_2$, is longer than the hypotenuse.]
Because $E_1 + E_2 \ge \sqrt{E_1^2 + E_2^2}$ (ratio greater than 1), the overlap interval is the wider one. A difference found by the overlap method therefore implies a difference by the standard method, but not vice versa; equivalently, equal proportions by the standard method implies equal by the overlap method.
• Overlap method is more conservative and less powerful.
• If two populations differ, the standard method is more likely to detect it.

A Closer Look
What does this geometric relationship tell us about the overlap method's deficiencies?
[Figure: the example's triangle, with legs 0.0351 and 0.0351 and hypotenuse 0.0496.]
Ratio $(E_1 + E_2)/\sqrt{E_1^2 + E_2^2}$:
• Max of $\sqrt{2}$ when $E_1 = E_2$
• Min → 1 as one adjacent side → 0
So, the overlap approach is expected to be most deficient when $E_1$ and $E_2$ are nearly equal.

Simulation
Do simulation results confirm analytical expectations?
$p_1 = 55\%$, sample size = 100, 4000 draws
Percentage of time finding a difference between populations:

                Overlap     Standard
$p_2 = 53\%$    0.65%       6.5%
$p_2 = 23\%$    98.275%     99.9%

Conclusion
• The standard method finds a difference at least as often as the overlap method.
• Overlap method is at its worst when the two margins of error are close.
• Overlap is simple and convenient to use, but for formal testing, use the standard method.

Reference
Nathaniel Schenker and Jane F. Gentleman, On judging the significance of differences by examining the overlap between confidence intervals, The American Statistician 55 (Aug. 2001), 182-186.
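The width ratio and the simulation described in the slides can be explored with a short script. This is a rough Python sketch under stated assumptions, not the code used for the talk: the helper names (`width_ratio`, `sample_proportion`, `rejection_rates`), the seed, and the use of independent binomial samples of equal size are all assumptions, so the simulated percentages will not match the table exactly.

```python
import math
import random

Z = 1.96  # critical value for a 95% interval, as used in the slides


def width_ratio(e1, e2):
    """Ratio of the overlap half-width (E1 + E2) to the standard one."""
    return (e1 + e2) / math.sqrt(e1**2 + e2**2)


def sample_proportion(p, n, rng):
    """Draw n Bernoulli(p) trials and return the observed proportion."""
    return sum(rng.random() < p for _ in range(n)) / n


def rejection_rates(p1, p2, n=100, draws=4000, seed=0):
    """Fraction of draws in which each method finds a difference.

    Returns (overlap_rate, standard_rate). The seed is an arbitrary choice.
    """
    rng = random.Random(seed)
    overlap_hits = standard_hits = 0
    for _ in range(draws):
        ph1 = sample_proportion(p1, n, rng)
        ph2 = sample_proportion(p2, n, rng)
        e1 = math.sqrt(ph1 * (1 - ph1) / n)
        e2 = math.sqrt(ph2 * (1 - ph2) / n)
        diff = abs(ph1 - ph2)
        # Overlap method: the two individual intervals do not overlap.
        if diff > Z * (e1 + e2):
            overlap_hits += 1
        # Standard method: the interval for the difference excludes zero.
        if diff > Z * math.sqrt(e1**2 + e2**2):
            standard_hits += 1
    return overlap_hits / draws, standard_hits / draws


print("ratio when E1 = E2:", width_ratio(0.0351, 0.0351))
print("ratio when E2 is tiny:", width_ratio(0.0351, 1e-9))

# Same settings as the slides: p1 = 55%, sample size 100, 4000 draws.
results = {p2: rejection_rates(0.55, p2) for p2 in (0.53, 0.23)}
for p2, (overlap_rate, standard_rate) in results.items():
    print(f"p2 = {p2:.2f}: overlap {overlap_rate:.2%}, standard {standard_rate:.2%}")
```

Because every draw that the overlap rule flags is also flagged by the standard rule, the overlap rate can never exceed the standard rate in this sketch, whatever seed is used.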