8.4: TWO-PROPORTION INFERENCE TESTS

Objective: To test claims and make inferences about two proportions, under specific conditions.

THE STANDARD DEVIATION OF THE DIFFERENCE BETWEEN TWO PROPORTIONS
• Proportions observed in independent random samples are independent. Thus, we can add their variances. So…
• The standard deviation of the difference between two sample proportions is
  $$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
• Thus, the standard error is
  $$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
  Remember: it's always a +.

ASSUMPTIONS & CONDITIONS
• Independence Assumptions:
  • Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
  • The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
  • Independent Groups Assumption: The two groups we're comparing must be independent of each other.
• Sample Size Assumption:
  • Each of the groups must be big enough…
  • Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.

TWO-PROPORTION Z-INTERVAL
• When the conditions are met, we are ready to find the confidence interval for the difference of two proportions.
• The confidence interval is
  $$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$$
  where
  $$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
• The critical value $z^*$ depends on the particular confidence level, C, that you specify.

STEPS FOR TWO-PROPORTION Z-INTERVAL
1. Check Conditions and show that you have checked these!
• Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
• The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
• Independent Groups Assumption: The two groups we're comparing must be independent of each other.
• Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
  $$n_1 \hat{p}_1 \ge 10 \qquad n_2 \hat{p}_2 \ge 10 \qquad n_1 \hat{q}_1 \ge 10 \qquad n_2 \hat{q}_2 \ge 10$$

STEPS FOR TWO-PROPORTION Z-INTERVAL (CONT.)
2. State the test you are about to conduct.
   Ex) Two-Proportion z-Interval
4. Calculate your z-interval.
   $$(\hat{p}_1 - \hat{p}_2) \pm z^* \times \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
5. State your conclusion IN CONTEXT.
   Ex) We are 95% confident that the support group program could raise the proportion of smokers who manage to quit using the patch by between 2 and 22 percentage points.

TWO-PROPORTION Z-INTERVAL EXAMPLE
The table below describes the effect of preschool on later use of social services:
Set up a 95% confidence interval. Interpret your results.

POOLING P
• The typical hypothesis test for the difference in two proportions is the one of no difference (when they are equal). In symbols, $H_0: p_1 - p_2 = 0$.
• Since we are hypothesizing that there is no difference between the two proportions, that means that the standard deviations for each proportion are the same.
• Since this is the case, we combine (pool) the counts to get one overall proportion.

POOLING P (CONT.)
• The pooled proportion is
  $$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$
  where $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$.
• If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)

POOLING P (CONT.)
• We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:
  $$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$$
• We'll reject our null hypothesis if we see a large enough difference in the two proportions.
• How can we decide whether the difference we see is large?
• Just compare it with its standard deviation.
• Unlike previous hypothesis testing situations, the null hypothesis doesn't provide a standard deviation, so we'll use a standard error (here, pooled).

TWO-PROPORTION Z-TEST
• The conditions for the two-proportion z-test are the same as for the two-proportion z-interval.
• We are testing the hypothesis $H_0: p_1 - p_2 = 0$, or, equivalently, $H_0: p_1 = p_2$.
• Because we hypothesize that the proportions are equal, we pool them to find
  $$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$

TWO-PROPORTION Z-TEST (CONT.)
• We use the pooled value to estimate the standard error:
  $$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$$
• Now we find the test statistic:
  $$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$$
• When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.

STEPS FOR TWO-PROPORTION Z-TESTS
1. Check Conditions and show that you have checked these!
• Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
• The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
• Independent Groups Assumption: The two groups we're comparing must be independent of each other.
• Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
  $$n_1 \hat{p}_{pooled} \ge 10 \qquad n_2 \hat{p}_{pooled} \ge 10 \qquad n_1 \hat{q}_{pooled} \ge 10 \qquad n_2 \hat{q}_{pooled} \ge 10$$

STEPS FOR TWO-PROPORTION Z-TESTS (CONT.)
2. State the test you are about to conduct.
   Ex) Two-proportion z-test
3. Set up your hypotheses.
   $H_0$:
   $H_A$:
4. Calculate your test statistic.
   $$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}}$$
5. Draw a picture of your desired area under the Normal model, and calculate your P-value.

STEPS FOR TWO-PROPORTION Z-TESTS (CONT.)
6. Make your conclusion.
| P-Value | Action | Conclusion |
| Low | Reject $H_0$ | The difference in sample proportions is sufficient evidence to conclude $H_A$ in context. |
| High | Fail to reject $H_0$ | The difference in sample proportions does not provide us with sufficient evidence to conclude $H_A$ in context. |

CALCULATOR TIPS
• STAT → TESTS
• 6: 2-PropZTest
• Enter values
• Calculate

TWO-PROPORTION Z-TEST EXAMPLE
High levels of cholesterol in the blood are associated with higher risk of heart attacks. Will using a drug to lower blood cholesterol reduce heart attacks? The Helsinki Heart Study looked at this question. Middle-aged men were assigned at random to one of two treatments: 2051 men took the drug gemfibrozil to reduce their cholesterol levels, and a control group of 2030 men took a placebo. During the next five years, 56 men in the gemfibrozil group and 84 men in the placebo group had heart attacks. What are the proportions, and is the benefit of the drug statistically significant?

ASSIGNMENTS
• Day 1: 8.4 Book Page # 1, 7, 9, 18
• Day 2: 8.4 Book Page # 3, 10, 20, 22
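
The pooled z-test and the unpooled z-interval from this section can be checked by hand or with a short script. The sketch below is one possible way to do it in Python, using the counts from the Helsinki Heart Study example (56 of 2051 on gemfibrozil, 84 of 2030 on placebo). The function names are made up for illustration, and the one-sided alternative $H_A: p_1 < p_2$ (the drug lowers the heart attack rate) is an assumption about how the hypotheses would be set up, not something the notes state.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_prop_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test of H0: p1 - p2 = 0.

    Returns (z, lower_tail_p). The lower-tail P-value matches the
    assumed alternative HA: p1 < p2.
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)      # combine the counts
    q_pooled = 1 - p_pooled
    se_pooled = sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    z = ((p1_hat - p2_hat) - 0) / se_pooled
    return z, STANDARD_NORMAL.cdf(z)


def two_prop_z_interval(x1, n1, x2, n2, conf=0.95):
    """Two-proportion z-interval for p1 - p2 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = STANDARD_NORMAL.inv_cdf(1 - (1 - conf) / 2)  # critical value z*
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Helsinki Heart Study counts: group 1 = gemfibrozil, group 2 = placebo
z, p_value = two_prop_z_test(56, 2051, 84, 2030)
low, high = two_prop_z_interval(56, 2051, 84, 2030)

print(f"p1_hat = {56 / 2051:.4f}, p2_hat = {84 / 2030:.4f}")
print(f"z = {z:.2f}, one-sided P-value = {p_value:.4f}")
print(f"95% interval for p1 - p2: ({low:.4f}, {high:.4f})")
```

With these counts the test statistic comes out to about z = -2.47, with a one-sided P-value of about 0.007. The conditions (randomization, independent groups, success/failure) still have to be checked separately; the script only does the arithmetic.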