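Chapter 22: Comparing two proportions
Math2200

Are men more intelligent?
• Gallup poll
• A random sample of 520 women and 506 men
• 28% of the men thought men were more intelligent
• 14% of the women agreed
• Is there a gender gap in opinions about which sex is smarter?

Difference between two proportions
• To assess the significance of the difference between the two sample proportions, we need its standard deviation (SD) or standard error (SE)
• Recall that "the variance of the sum or difference of two independent random variables is the sum of their variances"
  – If X and Y are independent, Var(X - Y) = Var(X) + Var(Y)

SD of the difference between two sample proportions
• $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, where $q_i = 1 - p_i$
• Because $p_1$ and $p_2$ are unknown, we estimate it with $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$

Assumptions and conditions
• Independence assumptions
  – Randomization
    • Data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment
  – 10% condition
    • When the data are sampled without replacement, the sample should not exceed 10% of the population
  – Independent samples
    • The two groups must be independent of each other
• Sample size condition
  – Success/failure condition for each sample: each group should contain at least 10 successes and at least 10 failures

Sampling distribution of $\hat{p}_1 - \hat{p}_2$
• Using the normal approximation,
  – $\hat{p}_1$ is approximately normal
  – $\hat{p}_2$ is approximately normal
  – $\hat{p}_1 - \hat{p}_2$ is also approximately normal because of independence
• Mean: $p_1 - p_2$
• Standard deviation: $\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

A two-proportion z-interval
• $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$
• $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$ (not pooled)
• $z^*$ is the standard normal critical value for the chosen confidence level

Are men more intelligent? (cont'd)
• Goal: estimate the gap
• Parameter of interest: $p_M - p_F$, the difference between the proportions of men and of women who think men are more intelligent
• Conditions
  – Randomization
  – 10% condition
  – Independent samples
  – Success/failure condition
• 95% confidence interval: $0.14 \pm 1.96 \times \sqrt{\frac{(0.28)(0.72)}{506} + \frac{(0.14)(0.86)}{520}} \approx 0.14 \pm 0.049$, or about (0.091, 0.189)

Example: ZZzzzz
• Study on snoring by the National Sleep Foundation
• Out of 995 respondents,
  – Overall, 37% reported they snored at least a few nights a week
  – 26% of the 184 people under age 30
  – 39% of the 811 people over age 30
• Is the difference of 13 percentage points real, or due to natural fluctuations in the sample?

Two-proportion z-test
• $H_0: p_1 - p_2 = 0$
  – Additional information under the null hypothesis: the two proportions are equal!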
  – A pooled estimate of this common proportion is $\hat{p}_{pooled} = \frac{x_1 + x_2}{n_1 + n_2}$
  – The corresponding SE is $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}_{pooled}\,\hat{q}_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

Two-proportion z-test (cont'd)
• Test statistic: $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$
• The P-value is then found from the standard normal model (and it depends on whether the alternative is one-sided or two-sided!)

ZZzzzz (cont'd)
• Hypotheses: $H_0: p_1 - p_2 = 0$ vs. $H_A: p_1 - p_2 \neq 0$, where $p_1$ is the proportion of snorers among people under 30 and $p_2$ the proportion among people over 30
• Conditions
  – Randomization
  – 10% condition
  – Independent samples
  – Success/failure condition
• Two-proportion z-test: z ≈ -3.33
• P-value ≈ 0.00086

ZZzzzz (cont'd)
• On the calculator: STAT, TESTS, 6: 2-PropZTest
  – x1: 48 (# of younger respondents who snore)
  – n1: 184 (# of younger respondents)
  – x2: 318 (# of older respondents who snore)
  – n2: 811 (# of older respondents)
  – p1: ≠ p2 (two-sided alternative)
  – Calculate
• Output: 2-PropZTest
  – p1 ≠ p2
  – z = -3.3329
  – p = 8.5944146E-4
  – p1_hat = .2608695652 (x1/n1)
  – p2_hat ≈ .3921085 (x2/n2)
  – p_hat = .367839196 ((x1+x2)/(n1+n2))
  – n1 = 184
  – n2 = 811

What Can Go Wrong?
• Don't use two-sample proportion methods when the samples aren't independent.
  – These methods give wrong answers when the independence assumption is violated.
• Don't apply inference methods when there was no randomization.
  – Our data must come from representative random samples or from a properly randomized experiment.
• Don't interpret a significant difference in proportions causally.
  – Be careful not to jump to conclusions about causality.

What have we learned?
• We've now looked at the difference in two proportions.
• Perhaps the most important thing to remember is that the concepts and interpretations are essentially the same as for a single proportion; only the mechanics have changed slightly.

What have we learned? (cont'd)
• Hypothesis tests and confidence intervals for the difference in two proportions are based on Normal models.
  – Both require us to find the standard error of the difference in two proportions.
    • We do that by adding the variances of the two sample proportions, assuming our two groups are independent.
    • When we test a hypothesis that the two proportions are equal, we pool the sample data; for confidence intervals we don't pool.
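
The pooled-versus-unpooled distinction in the summary can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names `two_prop_z_interval` and `two_prop_z_test` are hypothetical, and it assumes Python 3.8 or later for `statistics.NormalDist`. It uses the snoring counts from the calculator slide (48 of 184 younger respondents, 318 of 811 older respondents) and assumes the conditions listed earlier have already been checked.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal model: mean 0, sd 1


def two_prop_z_interval(x1, n1, x2, n2, level=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Variances of the two independent sample proportions add.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = STD_NORMAL.inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided test of H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the proportions are equal, so pool the two samples.
    p_pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled
    p_value = 2 * STD_NORMAL.cdf(-abs(z))  # both tails
    return z, p_value


# Snoring example: group 1 = under 30, group 2 = over 30
z, p_value = two_prop_z_test(48, 184, 318, 811)
low, high = two_prop_z_interval(48, 184, 318, 811)
print(f"z = {z:.4f}, P-value = {p_value:.6f}")
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

The test statistic and P-value should agree with the calculator output above (z = -3.3329, P ≈ 8.59E-4). The interval uses the unpooled standard error because it does not assume the two proportions are equal.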