Section 12 Part 1 Confidence Intervals and Hypothesis Tests of Proportions from Two groups Section 12 Part 1 overview • Often researchers are interested in estimating the difference between two independent groups. • When the variable of interest is continuous, appropriate inference methods are – Confidence Intervals of the difference between two means – Hypothesis tests (t-tests) of equal means – Nonparametric when sample size is small and normality assumption is not met (not covered in PH6414) • When the variable of interest is categorical (binary), appropriate inference methods are – Confidence Intervals of the difference in proportions (Section 12 Part 1) – Hypothesis tests (z-tests) of equal proportions (Section 12 Part 1) – Chi-square tests of Independence (Section 13) PubH 6414 Section 12 Part 1 2 Section 12 Part 1 Outline • Two sample Z-test of proportions to test the null hypothesis of equal proportions • Confidence Interval for the difference of two proportions • Both inference methods require that the normal approximation to the binomial distribution are met for both groups – n*π > 5 and n*(1-π) ≥ 5 (for CI) – n*π > 5 and n*(1-π) ≥ 5 (for Hypothesis test) PubH 6414 Section 12 Part 1 3 Two sample z-test of proportions: Notation • We have 2 distinct (independent) populations of some binary variable. • The population proportions (π) are the true proportions in each population, the sample proportions (p) are estimates of the population proportions. Population Population Sample proportion size Number in sample with success 1 π1 n1 x1 2 π2 n2 x2 PubH 6414 Section 12 Part 1 Sample proportion x1 p1 = n1 x p2 = 2 n2 4 Steps in Hypothesis Testing Comparing Two Proportions 1. State the null hypothesis H0 and the alternative hypothesis Ha. 2. Calculate the value of the test statistic on which the test will be based. 3. Find the p-value for the observed data. 4. State a conclusion. PubH 6414 Section 12 Part 1 5 Hypothesis Testing Steps: Comparing two Proportions • Step 1: State your hypotheses – H0: π1-π2= π0 – Ha: π1-π2≠ π0 or Ha: π1-π2 < π0 or Ha: π1-π2 > π0 PubH 6414 Section 12 Part 1 6 Hypothesis Testing Steps: Comparing two Proportions Step 2 : Calculate your test statistic (z statistic is appropriate provided you have a SRS that meet this condition n1 p1 ≥ 5 and n1 (1 − p1 ) ≥ 5 and n2 p2 ≥ 5 and n2 (1 − p2 ) ≥ 5 We are testing under the null hypothesis that π1=π2, so we create and then, X1 + X 2 p= n1 + n2 z= ( p1 − p2 ) 1 1 p (1 − p )( + ) n 1 n2 PubH 6414 Section 12 Part 1 7 Hypothesis Testing Steps: Comparing Two Proportions • Step 3: Calculate the p-value One or two-sided. • Step 4: Make your conclusion. If p-value is small, you have evidence against the null. If p-value is large, you do not have evidence against the null. PubH 6414 Section 12 Part 1 8 Birth-Weight and Respiratory Tract Infections • A study was designed to investigate whether low birth weight (LBW: wt < 2500 gm) have the same proportion of lower respiratory tract infection (LRTI) in the first 2 years as normal birth weight infants (NBW). • A random sample of 1000 infants from several urban clinics were enrolled in the study. • Infants were identified as LBW or NBW and followed for 2 years PubH 6414 Section 12 Part 1 9 Birth-Weight and Respiratory Tract Infections Scientific Question of Interest: Is there a difference in the proportion of infants with any LRTI in the first 2 years between LBW and NBW infants? What are the null and alternative hypotheses to test this scientific question? PubH 6414 Section 12 Part 1 10 Birth-Weight and Respiratory Tract Infections Use α = 0.05 as the significance level What are the critical values are from the standard normal distribution? PubH 6414 Section 12 Part 1 11 Birth-Weight and Respiratory Tract Infections 1000 infants were enrolled in the study • 64 of the infants were LBW, 936 were NBW • After 2 years of follow-up – 15 of the LBW infants had at least one LRTI – 112 of the NBW infants had at least one LRTI Group LBW NBW Sample Size Number with LRTI n1= 64 x1= 15 Proportion with LRTI p1= 0.234 n2= 936 x2= 112 p2= 0.120 PubH 6414 Section 12 Part 1 12 Birth-Weight and Respiratory Tract Infections • Is the normal approximation is valid for both groups? • Calculate the overall proportion (p) of LRTI: x1 + x2 p= = n1 + n2 PubH 6414 Section 12 Part 1 = 0.127 13 Birth-Weight and Respiratory Tract Infections Calculate the z-statistic: the difference between the sample proportions divided by the SE of the difference. Use the overall p to calculate the SE (diff). z= ( p1 − p2 ) = 1 1 p(1 − p )( + ) n 1 n2 = 2.67 PubH 6414 Section 12 Part 1 14 Birth-Weight and Respiratory Tract Infections What is the p-value of the is hypothesis test? > 2*pnorm(-2.67, 0, 1) [1] 0.007585125 PubH 6414 Section 12 Part 1 15 Birth-Weight and Respiratory Tract Infections In this study 23% of low birth weight infants had at least one LRTI in the first two years of life and 12% of normal birth weight infants had at least one LRTI in the first two years of life. A significantly different proportion of low birth weight infants had at least one LRTI by age 2 than normal birth weight infants (p = 0.008). PubH 6414 Section 12 Part 1 16 CI for the Difference in Population Proportions • Confidence interval for the difference in population proportions: ( p1 − p2 ) ± Z * SE ( p1 − p2 ) where : SE ( p1 − p2 ) = p1 (1 − p1 ) p2 (1 − p2 ) + n1 n2 n1 p1 ≥ 5 and n1 (1 − p1 ) ≥ 5 and n2 p2 ≥ 5 and n2 (1 − p2 ) ≥ 5 PubH 6414 Section 12 Part 1 17 SE (diff) and coefficient for CI of difference between proportions • Your book is incorrect in saying the SE for the confidence interval is SE ( p1 − p2 ) = 1 1 p (1 − p ) + n1 n2 • To do this you must assume (π1=π2). PubH 6414 Section 12 Part 1 18 Binge Drinking and Gender • Binge drinking on college campuses is a public health concern. • A survey of 17, 096 students revealed the following results. Group N X p Men 7180 1630 0.227 Women 9916 1684 0.170 PubH 6414 Section 12 Part 1 19 Binge Drinking and Gender • Is the sample large enough to construct the CI? • Construct the 95% CI for the difference ( p1 − p2 ) ± Z * p1 (1 − p1 ) p2 (1 − p2 ) + n1 n2 (0.227 − 0.170) ± 1.96 0.227 (0.773) 7180 + 0.170(0.830) 9916 0.057 ± 0.012 (0.045,0.069) PubH 6414 Section 12 Part 1 20 Binge Drinking and Gender The data suggest that men are more likely to be frequent binge drinkers with an estimated risk difference (men – women) of 5.7% and 95% confidence interval from 4.5% to 6.9%. PubH 6414 Section 12 Part 1 21 Confidence Interval and z-test • Typically, if the 95% confidence interval does not contain 0, the two-tailed z-test will result in the decision to reject the null hypothesis of equal proportions. • However; the standard error is different for the confidence interval and the hypothesis test, just as with the one sample proportion. PubH 6414 Section 12 Part 1 22