Section 11

STAT 405 - BIOSTATISTICS
Handout 11 – Comparing Two Binomial Proportions for Matched-Pair Data
(McNemar’s Test)
This handout covers material found in Section 10.4 of your text.
EXAMPLE: Cancer (Example 10.21 of your text, page 408).
Suppose we want to compare two different chemotherapy regimens for breast cancer after
mastectomy. The two treatment groups should be as comparable as possible on other
prognostic factors. To accomplish this goal, a matched study is set up such that a random
member of each matched pair gets treatment A (chemotherapy) perioperatively (within 1 week
after mastectomy) and for an additional 6 months, whereas the other member gets treatment B
(chemotherapy only perioperatively). The patients are assigned to pairs matched on age (within
5 years) and clinical condition. The patients are followed for 5 years, with survival as the
outcome variable. The data are shown below.
                        Treatment
Outcome                 A       B       Total
Survive for 5 years     526     515     1041
Die within 5 years      95      106     201
Total                   621     621     1242
One could naively analyze these data using a chi-square test as in the previous handout. For
example, we could use SAS to analyze this contingency table as follows:
data a;
input Trt$ Outcome$ count;
datalines;
A survive 526
A die 95
B survive 515
B die 106
;
proc freq;
tables Trt*Outcome / all;
weight count;
run;
However, the use of this test is valid only if the two samples are independent! In this example,
the two members of each pair were matched according to age and clinical condition. Therefore,
we need to consider a new method for comparing proportions from two dependent samples:
McNemar’s test.
McNemar’s Test
To begin, let’s construct a contingency table which represents the data in a slightly different way.
Note that in this example, the observational unit is really a matched pair and not an individual
person. So, we will construct a contingency table with 621 total units as follows.
                                  Outcome of treatment B patient
Outcome of treatment A patient    Survive for 5 years    Die within 5 years    Total
Survive for 5 years               510                    16                    526
Die within 5 years                5                      90                    95
Total                             515                    106                   621
Now, we define the following terms.
A concordant pair is a matched pair in which the outcome is the same for each member of the
pair.
A discordant pair is a matched pair in which the outcomes differ for the members of the pair.
Note that the concordant pairs provide no information about the differences between the
treatments; therefore, they will NOT be used in the analysis. Instead, we will focus on only the
discordant pairs.
- We have 5 pairs in which the A patient died and the B patient survived.
- We have 16 pairs in which the A patient survived and the B patient died.
Questions:
1. If the treatments are equally effective, in about how many discordant pairs do you expect
to see the A patient die and the B patient survive? Explain.
2. Again, if the treatments are equally effective, in about how many discordant pairs do you
expect to see the A patient survive and the B patient die? Explain.
Now, let 𝑝𝐴 represent the probability that a discordant pair has the A patient die and the B
patient survive (or vice-versa). Note that our interest simply lies in testing the following set of
hypotheses:
Ho:
Ha:
McNemar’s test can now be viewed from the standpoint of a chi-square goodness-of-fit test:
Type of discordant pair                     Observed Count    Expected Count
A patient dies and B patient survives       5
A patient survives and B patient dies       16

$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
When the null hypothesis is true, this test-statistic follows the chi-square distribution with df=1.
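As a worked check of the value 5.7619 that appears in the SAS code for the p-value, the statistic can be computed by hand. This sketch assumes the null hypothesis that the two types of discordant pair are equally likely, so each expected count is half of the 21 discordant pairs:

```latex
\begin{align*}
E_1 = E_2 &= \frac{5 + 16}{2} = 10.5 \\
\chi^2 &= \frac{(5 - 10.5)^2}{10.5} + \frac{(16 - 10.5)^2}{10.5}
        = \frac{30.25 + 30.25}{10.5}
        = \frac{60.5}{10.5} \approx 5.7619
\end{align*}
```

The same value results from the shortcut $(16 - 5)^2 / (16 + 5) = 121/21$.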
To find the p-value, you can use the following SAS code.
data ChiSquareprob;
/* upper-tail probability: P(chi-square with df=1 exceeds 5.7619) */
CumProb=1-CDF('ChiSquare',5.7619,1); output;
run;
proc print data=ChiSquareprob;
run;
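The same calculation can be sketched in R, starting from the two discordant counts rather than a precomputed statistic. This is a minimal sketch; the variable names are illustrative and are not from the handout or the text.

```r
# Discordant-pair counts taken from the matched-pair table
n_A_dies <- 5    # A patient dies, B patient survives
n_B_dies <- 16   # A patient survives, B patient dies

# Under the null, each type of discordant pair is equally likely,
# so the expected count is half of the discordant pairs for each type
n_disc   <- n_A_dies + n_B_dies
expected <- n_disc / 2

# Goodness-of-fit statistic: sum of (Observed - Expected)^2 / Expected
chi_sq <- sum((c(n_A_dies, n_B_dies) - expected)^2 / expected)

# Upper-tail p-value from the chi-square distribution with df = 1
p_value <- pchisq(chi_sq, df = 1, lower.tail = FALSE)

print(c(chi_sq = chi_sq, p_value = p_value))
```

Using `lower.tail = FALSE` plays the same role as subtracting the CDF from 1 in the SAS code.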
Carrying Out McNemar’s Test in SAS PROC FREQ
You can request this test with the following code:
data a;
input Aoutcome$ Boutcome$ count;
datalines;
Survive Survive 510
Survive Die 16
Die Survive 5
Die Die 90
;
proc freq;
tables Aoutcome*Boutcome;
exact mcnem;
weight count;
run;
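For readers working in R rather than SAS, a comparable analysis can be sketched with the built-in `mcnemar.test` function from the `stats` package. The continuity correction is switched off here on the assumption that the goal is to reproduce the uncorrected goodness-of-fit statistic used in this handout; the object names are illustrative.

```r
# Matched-pair table: rows = outcome for A patient, columns = outcome for B patient
paired <- matrix(c(510, 16,
                     5, 90),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(A = c("Survive", "Die"),
                                 B = c("Survive", "Die")))

# correct = FALSE gives the uncorrected statistic (b - c)^2 / (b + c),
# which uses only the two discordant cells
res <- mcnemar.test(paired, correct = FALSE)
print(res)
```

With the default `correct = TRUE`, R applies a continuity correction, so the statistic would be somewhat smaller than the uncorrected value.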
Exact Test
Note that this chi-square test relies on the normal approximation to the binomial distribution.
Therefore, for small samples, this test may not be reliable. Your text gives the following rule of
thumb: if the number of discordant pairs is less than 20, then a test based on exact binomial
probabilities should be used instead. The details of this test are similar to methods discussed in
Handout 3.
Note that we have 21 discordant pairs, and we let 𝑝𝐴 represent the probability that a discordant
pair has the A patient die and the B patient survive (or vice-versa). Recall that we are testing the
following:
Ho:
Ha:
Therefore, we define the following for the binomial distribution:
n=
𝑝𝐴 =
Now, we can find the following probabilities, which represent situations at least as extreme (i.e.,
at least as contradictory to the null) as our observed data:
- P(16 or more discordant pairs in which A patient survives and B patient dies)
data BinomialProbabilities;
prob = 1-cdf('Binomial', 15, .5, 21);
proc print data=BinomialProbabilities;
run;
- P(5 or fewer discordant pairs in which A patient dies and B patient survives)
data BinomialProbabilities;
prob = cdf('Binomial', 5, .5, 21);
proc print data=BinomialProbabilities;
run;
Note that SAS PROC FREQ has already provided us with this exact p-value:
Exact McNemar’s Test only requires the calculation of binomial probabilities, thus R
could easily be used to find exact p-values, e.g. for this analysis we would simply use the
pbinom command.
> pbinom(5,size=21,p=.5)
[1] 0.01330185
> 1 - pbinom(15,size=21,p=.5)
[1] 0.01330185
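Assuming a two-sided alternative, the exact p-value adds the two tail probabilities above. The sketch below does this directly and compares the result with R's built-in `binom.test`; the object names are illustrative.

```r
# Two tail probabilities under Binomial(n = 21, p = 0.5)
lower_tail <- pbinom(5, size = 21, prob = 0.5)        # P(X <= 5)
upper_tail <- 1 - pbinom(15, size = 21, prob = 0.5)   # P(X >= 16)
two_tails  <- lower_tail + upper_tail

# Built-in exact binomial test on the 5 "A dies, B survives" pairs
exact <- binom.test(x = 5, n = 21, p = 0.5, alternative = "two.sided")

print(c(sum_of_tails = two_tails, binom_test = exact$p.value))
```

Because the null distribution is symmetric when p = 0.5, the two tails are equal and the two-sided p-value is twice either one.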
Sample size and power formulae are found in Section 10.5, Equations 10.16 and 10.17
respectively (pgs. 384-85). Their use requires prior assumptions about the proportion of
discordant pairs (𝑝𝐷 ) and the proportion of discordant pairs of “type A” (𝑝𝐴 ). It should be fairly
easy to code these formulae in R.
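One possible R sketch follows. It assumes the text's formulas take the usual normal-approximation form for a two-sided McNemar test, namely n = (z_{1-α/2} + 2 z_{1-β} sqrt(𝑝𝐴 𝑞𝐴))² / (4 (𝑝𝐴 − 0.5)² 𝑝𝐷) matched pairs, with the matching power expression. That form should be checked against Equations 10.16 and 10.17 before use. The function names and the input values in the example call are hypothetical and are not taken from the text.

```r
# Hypothetical helper: matched pairs needed for a two-sided test.
# ASSUMPTION: normal-approximation formula; verify against the text.
mcnemar_n_pairs <- function(p_D, p_A, alpha = 0.05, power = 0.90) {
  q_A <- 1 - p_A
  numer <- (qnorm(1 - alpha / 2) + 2 * qnorm(power) * sqrt(p_A * q_A))^2
  numer / (4 * (p_A - 0.5)^2 * p_D)
}

# Hypothetical helper: approximate power for n matched pairs.
# ASSUMPTION: same normal approximation as above.
mcnemar_power <- function(n, p_D, p_A, alpha = 0.05) {
  q_A <- 1 - p_A
  z <- (qnorm(alpha / 2) + 2 * abs(p_A - 0.5) * sqrt(n * p_D)) /
       (2 * sqrt(p_A * q_A))
  pnorm(z)
}

# Illustrative (made-up) planning values: 10% of pairs discordant,
# 30% of discordant pairs of "type A"
n_exact <- mcnemar_n_pairs(p_D = 0.10, p_A = 0.30)
n_pairs <- ceiling(n_exact)   # round up to a whole number of pairs
print(c(n_pairs = n_pairs,
        power_at_n = mcnemar_power(n_pairs, p_D = 0.10, p_A = 0.30)))
```

Rounding the sample size up means the achieved power is at least the target under these assumptions.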
In JMP
Data to be entered:
                                  Outcome of treatment B patient
Outcome of treatment A patient    Survive for 5 years    Die within 5 years    Total
Survive for 5 years               510                    16                    526
Die within 5 years                5                      90                    95
Total                             515                    106                   621
In JMP we would enter these data as shown below:
Then select Analyze > Fit Y by X.
To conduct McNemar’s test for these data, select Agreement Statistic from the Contingency
Analysis pull-down menu as shown above. The resulting output is shown below: