STAT 405 - BIOSTATISTICS
Handout 11 – Comparing Two Binomial Proportions for Matched-Pair Data (McNemar’s Test)

This handout covers material found in Section 10.4 of your text.

EXAMPLE: Cancer (Example 10.21 of your text, page 408)

Suppose we want to compare two different chemotherapy regimens for breast cancer after mastectomy. The two treatment groups should be as comparable as possible on other prognostic factors. To accomplish this goal, a matched study is set up such that a random member of each matched pair gets treatment A (chemotherapy perioperatively, i.e., within 1 week after mastectomy, and for an additional 6 months), whereas the other member gets treatment B (chemotherapy only perioperatively). The patients are assigned to pairs matched on age (within 5 years) and clinical condition. The patients are followed for 5 years, with survival as the outcome variable. The data are shown below.

                           Treatment
    Outcome                A       B       Total
    Survive for 5 years    526     515     1041
    Die within 5 years      95     106      201
    Total                  621     621     1242

One could naively analyze these data using a chi-square test as in the previous handout. For example, we could use SAS to analyze this contingency table as follows:

    data a;
      input Trt$ Outcome$ count;
      datalines;
    A survive 526
    A die 95
    B survive 515
    B die 106
    ;
    /* Chi-square test that treats the 1242 patients as independent */
    proc freq;
      tables Trt*Outcome / all;
      weight count;
    run;

However, this test is valid only if the two samples are independent! In this example, the two members of each pair were matched according to age and clinical condition, so the samples are dependent. Therefore, we need to consider a new method for comparing proportions from two dependent samples: McNemar’s test.

McNemar’s Test

To begin, let’s construct a contingency table which represents the data in a slightly different way. Note that in this example, the observational unit is really a matched pair and not an individual person. So, we will construct a contingency table with 621 total units as follows.
                                      Outcome of treatment B patient
    Outcome of treatment A patient    Survive for 5 years    Die within 5 years    Total
    Survive for 5 years               510                     16                   526
    Die within 5 years                  5                     90                    95
    Total                             515                    106                   621

Now, we define the following terms.

A concordant pair is a matched pair in which the outcome is the same for each member of the pair.

A discordant pair is a matched pair in which the outcomes differ for the members of the pair.

Note that the concordant pairs provide no information about the differences between the treatments; therefore, they will NOT be used in the analysis. Instead, we will focus on only the discordant pairs. We have 5 pairs in which the A patient died and the B patient survived. We have 16 pairs in which the A patient survived and the B patient died.

Questions:

1. If the treatments are equally effective, in about how many discordant pairs do you expect to see the A patient die and the B patient survive? Explain.

2. Again, if the treatments are equally effective, in about how many discordant pairs do you expect to see the A patient survive and the B patient die? Explain.

Now, let 𝑝𝐴 represent the probability that a discordant pair has the A patient die and the B patient survive (or vice-versa). Note that our interest simply lies in testing the following set of hypotheses:

    Ho: ______
    Ha: ______

McNemar’s test can now be viewed from the standpoint of a chi-square goodness-of-fit test:

                      Discordant pair has          Discordant pair has
                      A patient die and            A patient survive and
                      B patient survive            B patient die
    Observed Count    5                            16
    Expected Count    ______                       ______

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

When the null hypothesis is true, this test statistic follows the chi-square distribution with df = 1. To find the p-value, you can use the following SAS code.
    data ChiSquareprob;
      /* Upper-tail area of the chi-square (df = 1) distribution beyond 5.7619 */
      CumProb = 1 - CDF('ChiSquare', 5.7619, 1);
      output;
    proc print;
    run;

Carrying Out McNemar’s Test in SAS PROC FREQ

You can request this test with the following code:

    data a;
      input Aoutcome$ Boutcome$ count;
      datalines;
    Survive Survive 510
    Survive Die 16
    Die Survive 5
    Die Die 90
    ;
    proc freq;
      /* AGREE requests McNemar's test; EXACT MCNEM adds the exact p-value */
      tables Aoutcome*Boutcome / agree;
      exact mcnem;
      weight count;
    run;

Exact Test

Note that this chi-square test relies on the normal approximation to the binomial distribution. Therefore, for small samples, this test may not be reliable. Your text gives the following rule of thumb: if the number of discordant pairs is less than 20, then a test based on exact binomial probabilities should be used instead. The details of this test are similar to methods discussed in Handout 3.

Note that we have 21 discordant pairs, and we let 𝑝𝐴 represent the probability that a discordant pair has the A patient die and the B patient survive (or vice-versa). Recall that we are testing the following:

    Ho: ______
    Ha: ______

Therefore, we define the following for the binomial distribution:

    n = ______
    𝑝𝐴 = ______

Now, we can find the following probabilities, which represent situations at least as extreme (i.e., at least as contradictory to the null) as our observed data:

P(16 or more discordant pairs in which A patient survives and B patient dies)

    data BinomialProbabilities;
      /* P(X >= 16) = 1 - P(X <= 15) for X ~ Binomial(n = 21, p = .5) */
      prob = 1 - cdf('Binomial', 15, .5, 21);
    proc print data=BinomialProbabilities;
    run;

P(5 or fewer discordant pairs in which A patient dies and B patient survives)

    data BinomialProbabilities;
      /* P(X <= 5) for X ~ Binomial(n = 21, p = .5) */
      prob = cdf('Binomial', 5, .5, 21);
    proc print data=BinomialProbabilities;
    run;

The two-sided exact p-value is the sum of these two probabilities. Note that SAS PROC FREQ has already provided us with this exact p-value:

[Figure: PROC FREQ output for McNemar’s test]

The exact McNemar’s test requires only binomial probabilities, so R can easily be used to find exact p-values. For this analysis we would simply use the pbinom command:

    > pbinom(5,size=21,p=.5)
    [1] 0.01330185
    > 1 - pbinom(15,size=21,p=.5)
    [1] 0.01330185

Sample size and power formulae are found in Section 10.5, Equations 10.16 and 10.17 respectively (pgs. 384-85). Their use requires prior assumptions about the proportion of discordant pairs (𝑝𝐷) and the proportion of discordant pairs of “type A” (𝑝𝐴). It should be fairly easy to code these formulae in R.

In JMP

Data to be entered:

                                      Outcome of treatment B patient
    Outcome of treatment A patient    Survive for 5 years    Die within 5 years    Total
    Survive for 5 years               510                     16                   526
    Die within 5 years                  5                     90                    95
    Total                             515                    106                   621

In JMP we would enter these data as shown below:

[Figure: JMP data table]

Then select Analyze > Fit Y by X. To conduct McNemar’s test for these data, select Agreement Statistic from the Contingency Analysis pull-down menu. The resulting output is shown below:

[Figure: JMP output]
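The chi-square and exact calculations in this handout can also be checked in R. The following is a minimal sketch, not part of the text's presentation. The variable names are made up for illustration. It assumes the test statistic is computed without a continuity correction (which reproduces the value 5.7619 used in the SAS code) and that the two-sided exact p-value is twice the lower-tail binomial probability, which is the same as the sum of the two tails because the null distribution is symmetric.

```r
# Discordant-pair counts from the paired contingency table in this handout
n_A_dies_B_survives <- 5    # A patient died, B patient survived
n_A_survives_B_dies <- 16   # A patient survived, B patient died
n_discordant <- n_A_dies_B_survives + n_A_survives_B_dies

# Under the null hypothesis, each type of discordant pair is equally likely
observed <- c(n_A_dies_B_survives, n_A_survives_B_dies)
expected <- rep(n_discordant / 2, 2)

# Goodness-of-fit statistic: sum of (Observed - Expected)^2 / Expected
chisq_stat <- sum((observed - expected)^2 / expected)
p_chisq <- 1 - pchisq(chisq_stat, df = 1)

# Exact two-sided p-value (assumption: double the lower binomial tail)
p_exact <- 2 * pbinom(n_A_dies_B_survives, size = n_discordant, prob = 0.5)

# Cross-check with mcnemar.test() from the stats package;
# correct = FALSE turns off the continuity correction it applies by default
paired <- matrix(c(510, 5, 16, 90), nrow = 2,
                 dimnames = list(A = c("Survive", "Die"),
                                 B = c("Survive", "Die")))
fit <- mcnemar.test(paired, correct = FALSE)

print(c(chisq = chisq_stat, p_chisq = p_chisq, p_exact = p_exact))
print(fit)
```

Note that mcnemar.test() uses only the two off-diagonal (discordant) cells of the table, which matches the point made earlier that concordant pairs do not enter the analysis.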