Comparing Two Proportions

Case Study
Recall the question that was actually asked in the CPR study reported in the NEJM.
• Do we need to give mouth-to-mouth ventilation and chest compression?
• Or will chest compression alone be just as effective?

Summary
• In the Seattle study, heart-attack victims were randomly assigned to two groups: full CPR or chest compression alone.
• They found a 10.4% survival rate for those receiving full CPR (x = 29, n = 278) and a 14.6% survival rate for those receiving chest compression alone (x = 35, n = 240).
• The trial was designed to detect a 3.5% improvement of chest compression alone over full CPR.

Question
• Is there any difference in the survival proportions of dispatcher-instructed, bystander-administered CPR depending on whether mouth-to-mouth ventilation is used or not?

Steps for Hypothesis Testing
Phase 1: State the Question
1. Evaluate and describe the data
2. Review the assumptions
3. State the question in the form of hypotheses
Phase 2: Decide How to Answer the Question
4. Decide on a summary number (a statistic) that reflects the question
5. How could random variation affect that statistic?
6. State a decision rule, using p-values, to answer the question
Phase 3: Answer the Question
7. Calculate the statistic
8. Make a statistical decision
9. State the substantive conclusion
Phase 4: Communicate the Answer to the Question
10. Document our understanding with text, tables, or figures

The NEJM CPR Results
How do these steps get applied in the case of comparing two proportions?

Phase 1: State the Question
1. Evaluate and describe the data

Contingency Table
• The number of patients in each group and the number of survivors (but not the nonsurvivors) are shown in Table 4.
• This form of tabular display is somewhat like a contingency table.
• The contingency table corresponding to these results is:

                   Chest compression        Chest compression    Row total
                   and mouth-to-mouth       alone
Survived                   29                      35                 64
Did not survive           249                     205                454
Column total              278                     240                518

This form of tabular display is called:
• a contingency table
• a cross-tabulation table
• a two-way classification
• a 2 x 2 table

Observations
• We observed n = 278 full-CPR patients whose bystanders received instructions by phone, of whom x = 29 survived to hospital discharge.
• We observed n = 240 chest-compression-alone patients, of whom x = 35 survived.
Overall (ignoring group membership), there were 64 survivors out of a total of 518.

Histogram
A histogram visually compares the two things that should be compared: the proportions.

Tabular Display of Proportions
• Notice that the columns sum to 1.
• Each proportion was calculated separately for each population or treatment group.

Questions?
• Of those receiving chest compression plus mouth-to-mouth ventilation, what proportion survived to hospital discharge?
• Of those receiving chest compression alone, what proportion survived to discharge?
• Of those receiving chest compression plus mouth-to-mouth ventilation, what proportion did not survive?
• Of those receiving chest compression alone, what proportion did not survive?

Which display?
The intent is to compare the two survival proportions. So which display(s) are best? Use a display (or displays) that describes the sample and the statistic being compared.

Example
Population proportion (N)

Variable           Chest compression        Chest compression    Row total
                   and mouth-to-mouth       alone
Survived               0.104 (29)              0.146 (35)         0.124 (64)
Did not survive        0.896 (249)             0.854 (205)        0.876 (454)
Column total           1.00 (278)              1.00 (240)         1.00 (518)

2. Review assumptions
As in the case where we're interested in a single proportion, with two proportions we must also meet three assumptions:
• representativeness,
• independence, and
• sample size

Representativeness
Are the subjects in each group representative of some population of interest?
If the study subjects were chosen as a simple random sample from a larger population and if these subjects were randomly assigned to the two groups, then we can be comfortable that the information in this sample is representative of the population of interest.

Independence
Does the response of one subject depend on the response of another? If not, then the subjects are independent.

Sufficient size
In order for the test statistic to follow the normal distribution, n must be large enough that we expect at least 5 survivals and at least 5 non-survivals in each group. As in the single-population case, we are not asking whether you observed at least 5 subjects in each cell. To check this, we must calculate the expected number of subjects under the null hypothesis. But we have not yet stated the hypotheses. Let's do that and then come back.

3. State the question
Are the proportions in the two groups the same? The alternative is that the two groups have different survival proportions.
$H_0: p_{CPR} = p_{chest}$
$H_A: p_{CPR} \neq p_{chest}$

If Null is True
• The two groups are said to be homogeneous (of uniform nature, similar in kind).
• The two proportions are the same.
• If they are the same, it's convenient to think of the proportion as a single number, p.
• So, another way to think of the null hypothesis is: $H_0: p_{CPR} = p_{chest} = p$

What is p?
What is the best estimate of p, the survival proportion under the null hypothesis? We observed a total of 64 survivors out of 518 people, so we'll use this pooled proportion, called p-bar:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{29 + 35}{278 + 240} = \frac{64}{518} = 0.124$$

Revisit sample size
If the true proportion is the same for both groups, we should use p-bar to determine whether there is sufficient size. If the null hypothesis is true, how many people do we expect to see in each of the four cells? We keep the number of subjects in each group fixed and use p-bar.

Survival groups
If you have 278 people and a proportion of 0.124 survive, how many do you expect to survive?
$$\bar{p} \, n_{CPR} = 278 \times 0.124 = 34.3$$
If you have 240 people and a proportion of 0.124 survive, how many do you expect to survive?
$$\bar{p} \, n_{chest} = 240 \times 0.124 = 29.7$$
(Both expected counts are computed with the unrounded $\bar{p} = 64/518$.)

Non-survival groups?

Variable           Chest compression        Chest compression    Row total
                   and mouth-to-mouth       alone
Survived                  34.3                    29.7                64
Did not survive          243.7                   210.3               454
Column total             278                     240                 518

• Is this assumption for our statistical test met? (Are the expected counts in all cells at least 5?) Yes: the smallest expected count is 29.7.
• If it is, then we can trust that the sample proportion will be approximately normally distributed. If we can trust that the sample proportion is normally distributed, then we can calculate a p-value.
• If we can calculate a p-value we trust, then we can make a decision with understandable risk.

Phase 2: Decide How to Answer the Question
4. Decide on a summary statistic that reflects the question
• We want to know if the two proportions are the same: $H_0: p_{CPR} = p_{chest} = p$
• This is equivalent to asking if the difference between the two is zero: $H_0: p_{CPR} - p_{chest} = 0$

One versus Two Proportions
• Recall that when looking at one proportion there were three possibilities for null hypotheses.
• When we're looking at two proportions, we're almost always interested in the null hypothesis "same proportions" and the alternative hypothesis "different proportions."

Generic Test Statistic
• From our earlier discussion, recall that the generic test statistic is:
$$z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 (1 - p_0)}{n}}}$$
• For $H_0: p_{CPR} - p_{chest} = 0$, with group 1 the full-CPR group and group 2 the chest-compression-alone group, it becomes:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_0} = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$$

5. How could random variation affect that statistic?
• If the null hypothesis is true, then the expected value of z is zero. Since the assumptions are met, z is approximately normally distributed.
• Extreme values of z reflect larger differences and thus favor the alternative hypothesis.

6. State a decision rule, using p-values, to answer the question
• Just like in the first case study, if we want to reject a true null hypothesis only 5% of the time, our decision rule is to choose to believe:
$H_0: p_{CPR} - p_{chest} = 0$. Choose this if p-value ≥ α (usually 0.05).
$H_A: p_{CPR} - p_{chest} \neq 0$. Choose this if p-value < α (usually 0.05).

Phase 3: Answer the Question
7. Calculate the statistic
We've already calculated $\hat{p}_{CPR} = 29/278 = 0.104$. For chest compression alone, $n_{chest} = 240$ and
$$\hat{p}_{chest} = \frac{35}{240} = 0.146$$
The pooled proportion is
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = 0.124$$

Z-score
$$z = \frac{(0.104 - 0.146) - 0}{\sqrt{\dfrac{0.124 (1 - 0.124)}{278} + \dfrac{0.124 (1 - 0.124)}{240}}} = \frac{-0.042}{0.029} = -1.432$$
(The final value uses unrounded intermediate values.)

8. Make a Statistical Decision
• Determine the p-value.
• To calculate a p-value, use the "two-tail" method, where we are interested in the probability of differences between the two proportions as large as or larger than the one we observed.

Using a p-value calculator for z = -1.43:

In words
• The p-value = 0.1521.
• Since p-value > α = 0.05, we fail to reject the null hypothesis.

9. State the Substantive Conclusion
There is insufficient evidence to conclude that the two survival proportions are different.

Phase 4: Communicate the Answer to the Question
10. Document our understanding with text, tables, or figures
For a dispatcher-instructed, bystander-administered intervention after a cardiac arrest, is the survival proportion for full CPR different from the survival proportion with chest compression alone? In this study, n = 278 patients were randomized to the chest-compression and mouth-to-mouth ventilation group, of whom a proportion $\hat{p} = 0.104$ (x = 29) survived until hospital discharge. And n = 240 patients were randomized to the chest-compression-alone group, of whom a proportion $\hat{p} = 0.146$ (x = 35) survived until hospital discharge. Thus, there was a nominal improvement in survival of 4.2%, but the two proportions were compared and found to be not significantly different (z = 1.4, p-value = 0.1521).

Question: Why did we report a positive z value?
When what we are testing is "is A different from B?", we could just as well have phrased the question as "is B different from A?". Thus, the sign does not matter, and by convention we report the positive value.

Question: Why is our p-value different from the one reported in the NEJM paper?
On page 1547 of the paper, in the last paragraph of the methods, it says: "The primary analysis consisted of a simple comparison of proportions by Fisher's exact test."

Fisher's Exact Test
Fisher's exact test determines the exact probability of obtaining the observed results or results that are more extreme. The z test gives an asymptotic (approximate) probability based on large samples, and it requires that the normality assumption be met.

Advantage to Fisher's
• We can use it even if the sample sizes are too small for the normal approximation assumptions to be met.
• That is, we can use it even if we don't expect to see at least 5 responses in each cell.

Fisher's method
• Fisher's idea was that with small samples we don't have to approximate the distribution with z to calculate p-values.
• We can enumerate (count) all the possible outcomes and calculate p-values exactly.

Enumeration
Let's look at a simple example. Fisher used an example of a woman tasting tea.
• A British woman claimed to be able to distinguish whether milk or tea was added to the cup first.
• The null hypothesis is that she has no such ability.
• Let's use a more up-to-date question: can you tell the difference between Coke and Pepsi?

Two cups
• I pour, hidden from you, two soft-drink cups: one with Coke and one with Pepsi.
• Then I ask you: "Which is Coke? And which is Pepsi?"
• What are the possible outcomes of this experiment?

Possible Results
• We can look at the exact distribution of the number of correct answers.
• Thus we can determine the p-value we'd report for each of the possibilities.

Four Cups
Assuming an equal number of Cokes and Pepsis, the next larger experiment would be 4 cups.

Results?
If someone is guessing randomly, these 6 possibilities are equally likely.
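The enumeration idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `exact_guess_distribution` and the cup numbering are hypothetical, and it assumes that half the cups hold Coke and that the taster knows this, so a guess amounts to naming which half of the cups are Coke.

```python
from fractions import Fraction
from itertools import combinations


def exact_guess_distribution(n_cups):
    """Exact distribution of the number of correctly named cups under pure guessing.

    Assumption: n_cups is even, half the cups hold Coke, and the taster
    names exactly n_cups // 2 cups as Coke.
    """
    k = n_cups // 2
    truth = set(range(k))  # hypothetical labelling: cups 0..k-1 hold Coke
    guesses = list(combinations(range(n_cups), k))  # every way to name k cups "Coke"
    counts = {}
    for guess in guesses:
        coke_hits = len(truth & set(guess))
        # With equal numbers of each drink, every Coke cup named correctly
        # is matched by one Pepsi cup named correctly.
        correct = 2 * coke_hits
        counts[correct] = counts.get(correct, 0) + 1
    # Under random guessing each possible guess is equally likely.
    return {c: Fraction(m, len(guesses)) for c, m in sorted(counts.items())}


for cups in (2, 4):
    dist = exact_guess_distribution(cups)
    # The p-value for "all correct" is the probability of that single outcome.
    print(cups, "cups:", dist, "P(all correct) =", float(dist[cups]))
```

With 4 cups there are 6 equally likely guesses, only one of which names every cup correctly, so the probability of a perfect score by chance is 1/6.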
Conclusion
• So if someone got all 4 right, we would be able to conclude that this person could "… tell the difference between Coke and Pepsi, p-value = 0.1667."
• Would this be convincing?

Calculation of Fisher's exact p-values
• How are we going to use this exact test in practice?
• Fortunately, software can calculate these p-values easily.
• So how do you interpret the output?

Software reports all p-values

Which one?
• The most conservative p-value to report is the "2-tail" one.
• In this case that's what they did in the NEJM paper.

Short cut: Comparing Two Proportions
We start by labeling the four cells with the letters a through d:

                   Chest compression        Chest compression
                   and mouth-to-mouth       alone
Survived                   a                       b
Did not survive            c                       d

The Statistic
It's actually the square of the z statistic we have already seen:
$$\chi^2 = \frac{n (ad - bc)^2}{(a + c)(b + d)(a + b)(c + d)}$$

CPR Example
$$\chi^2 = \frac{518 (29 \cdot 205 - 35 \cdot 249)^2}{(29 + 249)(35 + 205)(29 + 35)(249 + 205)} = 2.05$$
→ notice that $\sqrt{2.05} = 1.43$

Decision Rule
• The decision rule is straightforward.
• Take the square root of the $\chi^2$ value (it is the absolute value of z) and look up the p-value.

Confidence Interval
• Similar to the one-proportion CI, but use both observed proportions and an "average" SE:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{1 - \alpha/2} \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$

CPR Example
$$(0.104 - 0.146) \pm 1.96 \sqrt{\frac{0.104 (1 - 0.104)}{278} + \frac{0.146 (1 - 0.146)}{240}}$$
$$-0.042 \pm 1.96 \, (0.029)$$
$$[-0.099, \; 0.016]$$

Interpretation
We're 95% confident that the interval -9.9% to 1.6% covers the true difference in the population survival proportion from full CPR versus chest compression alone.

Notice
Note: The 95% CI includes zero, meaning that, using the confidence interval alone to test the difference, we would not reject the hypothesis that the difference is zero; that is, there is insufficient evidence of a difference between the treatment groups. If you find a significant difference, you should add the confidence interval about the observed difference to step 10 of the hypothesis testing steps.
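The calculations in this case study can be checked with a short Python sketch using only the standard library. The function names `compare_two_proportions` and `fisher_two_sided` are hypothetical, chosen for illustration. The first function follows the formulas above: pooled z, two-tail p-value, the chi-square shortcut with cells a through d, and the 95% interval with 1.96. The second is an assumption-laden sketch of an exact test: it fixes all margins and sums the probabilities of every table no more likely than the observed one, which is one common two-sided convention; statistical software may define the two-sided p-value differently.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def compare_two_proportions(x1, n1, x2, n2, z_crit=1.96):
    """Pooled z test, chi-square shortcut, and unpooled CI for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Pooled estimate p-bar and the standard error under H0: p1 = p2.
    p_bar = (x1 + x2) / (n1 + n2)
    se0 = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = diff / se0
    p_value = 2 * (1 - normal_cdf(abs(z)))  # two-tail

    # Chi-square shortcut: a, b are the survivors; c, d the non-survivors.
    a, b, c, d = x1, x2, n1 - x1, n2 - x2
    n = n1 + n2
    chi2 = n * (a * d - b * c) ** 2 / ((a + c) * (b + d) * (a + b) * (c + d))

    # The confidence interval uses the two observed proportions, not p-bar.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return {"z": z, "p_value": p_value, "chi2": chi2, "ci": ci}


def fisher_two_sided(x1, n1, x2, n2):
    """Exact two-sided p-value with all margins fixed (hypergeometric counts)."""
    total_x = x1 + x2
    denom = comb(n1 + n2, total_x)
    lo, hi = max(0, total_x - n2), min(n1, total_x)
    # Probability of each possible count of group-1 survivors.
    probs = {k: comb(n1, k) * comb(n2, total_x - k) / denom for k in range(lo, hi + 1)}
    p_obs = probs[x1]
    # Sum every table that is no more likely than the observed one.
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9)))


result = compare_two_proportions(29, 278, 35, 240)
print("z = %.3f, two-tail p = %.4f" % (result["z"], result["p_value"]))
print("chi-square = %.2f, sqrt = %.2f" % (result["chi2"], sqrt(result["chi2"])))
print("95%% CI = [%.3f, %.3f]" % result["ci"])
print("exact two-sided p = %.4f" % fisher_two_sided(29, 278, 35, 240))
```

Because the functions work from the raw counts rather than the rounded proportions, the printed values may differ from the slide arithmetic in the last digit.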
Review
We have applied the ten steps of hypothesis testing to comparing a single observed proportion to an assumed proportion and to comparing two observed proportions. We tested the two observed proportions by testing whether the difference between them is equal to zero. We will continue to apply the 10 steps of hypothesis testing to other types of hypothesis tests, such as comparing a single mean to an assumed mean, comparing two means, and comparing several means.