Biostatistical Methods (BioMeth2009), SoSe2009
Chapter 5: Case-control and matched studies
Michael Höhle
Department of Statistics, Ludwig-Maximilians-Universität München
8th lecture, 8 June 2009

Outline of the chapter
1. 5.3 Tests of association for matched pairs
2. 5.4 Measures of association for matched pairs
3. 5.6 Power function of McNemar’s test

5.3 Test of association for matched pairs
The likelihood for the 2 × 2 table of matched pairs is quadrinomial.
Consider tests of the hypothesis of association.
Here, the following two hypotheses are equivalent:
\[ H_0: P(Y_E = D) = P_m(Y_{\bar E} = D) \;\Leftrightarrow\; \pi_{1\bullet} = \pi_{\bullet 1} \]
\[ H_0: \pi_{12} = \pi_{21}, \]
where $P_m(Y_{\bar E} = D)$ denotes the conditional probability under matching.
The first is called the hypothesis of marginal homogeneity under matching; the second is the hypothesis of symmetry with respect to the discordant pairs.

Exact test
To test hypotheses about the parameters $\pi_{12}$ and $\pi_{21}$ we eliminate the nuisance parameters $\pi_{11}$ and $\pi_{22}$ by conditioning.
One can show that the conditional distribution of $f$ given $M = f + g$ (the number of discordant pairs) is
\[ f \mid M \sim \mathrm{Bin}\left(M, \frac{\pi_{12}}{\pi_d}\right), \]
where $\pi_d = \pi_{12} + \pi_{21}$ is the probability of a discordant pair.
Thus under $H_0: \pi_{12} = \pi_{21}$ we have a binomial test, $f \mid M \sim \mathrm{Bin}(M, \tfrac{1}{2})$, for which we can compute exact p-values.
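
The "one can show" step for the conditional distribution can be sketched as follows. This is one standard route, assuming only the quadrinomial model for $(e, f, g, h)$; it is not necessarily the argument used in the lecture. Collapsing the two concordant cells gives a trinomial distribution for $(f, g, N - M)$, and $M$ itself is binomial with success probability $\pi_d$:

```latex
\begin{align*}
P(f, g, N - M) &= \frac{N!}{f!\, g!\, (N - M)!}\,
                  \pi_{12}^{f}\, \pi_{21}^{g}\, (1 - \pi_d)^{N - M}, \\
P(M)           &= \binom{N}{M}\, \pi_d^{M}\, (1 - \pi_d)^{N - M}, \\
P(f \mid M)    &= \frac{P(f, g, N - M)}{P(M)}
                = \binom{M}{f} \left(\frac{\pi_{12}}{\pi_d}\right)^{f}
                  \left(\frac{\pi_{21}}{\pi_d}\right)^{M - f},
                  \qquad g = M - f.
\end{align*}
```

The nuisance parameters $\pi_{11}$ and $\pi_{22}$ cancel, which is what conditioning on $M$ is meant to achieve.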
McNemar’s test (1)
To test the hypothesis $H_0: \pi_{12} = \pi_{21}$ we look at
\[ Z_M = \frac{p_{12} - p_{21}}{\sqrt{\widehat{\mathrm{Var}}(p_{12} - p_{21} \mid H_0)}}. \]
One can show that
\[ Z_M = \frac{p_{12} - p_{21}}{\sqrt{(p_{12} + p_{21})/N}} = \frac{f - g}{\sqrt{f + g}}. \]
We have asymptotically $Z_M \overset{d}{\approx} N(0, 1)$ or $Z_M^2 \overset{d}{\approx} \chi^2(1)$.

Aside: Asymptotic normal distribution of $\hat{\pi}$
Let $(e, f, g, h)' \sim M_4(N, \pi)$, where $\pi = (\pi_{11}, \pi_{12}, \pi_{21}, \pi_{22})'$.
Because the multinomial distribution can be approximated by a multivariate normal distribution we have
\[ p = (p_{11}, p_{12}, p_{21}, p_{22})' \overset{d}{\approx} N_4(\pi, \Sigma), \]
where
\[ \Sigma = \frac{1}{N}
\begin{pmatrix}
\pi_{11}(1 - \pi_{11}) & -\pi_{11}\pi_{12} & -\pi_{11}\pi_{21} & -\pi_{11}\pi_{22} \\
-\pi_{11}\pi_{12} & \pi_{12}(1 - \pi_{12}) & -\pi_{12}\pi_{21} & -\pi_{12}\pi_{22} \\
-\pi_{11}\pi_{21} & -\pi_{12}\pi_{21} & \pi_{21}(1 - \pi_{21}) & -\pi_{21}\pi_{22} \\
-\pi_{11}\pi_{22} & -\pi_{12}\pi_{22} & -\pi_{21}\pi_{22} & \pi_{22}(1 - \pi_{22})
\end{pmatrix}. \]
Note that this distribution is degenerate, because $\sum_{i=1}^{4} p_i = 1$.
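
The variance in the denominator of $Z_M$ can be read off the covariance matrix $\Sigma$. The following is a sketch of that step, using only the entries of $\Sigma$ for $p_{12}$ and $p_{21}$ and the plug-in estimates $p_{12} = f/N$, $p_{21} = g/N$:

```latex
\begin{align*}
\mathrm{Var}(p_{12} - p_{21})
  &= \mathrm{Var}(p_{12}) + \mathrm{Var}(p_{21}) - 2\,\mathrm{Cov}(p_{12}, p_{21}) \\
  &= \frac{\pi_{12}(1 - \pi_{12}) + \pi_{21}(1 - \pi_{21}) + 2\,\pi_{12}\pi_{21}}{N}
   = \frac{\pi_d - (\pi_{12} - \pi_{21})^2}{N}, \\
\mathrm{Var}(p_{12} - p_{21} \mid H_0)
  &= \frac{\pi_d}{N}, \qquad
\widehat{\mathrm{Var}}(p_{12} - p_{21} \mid H_0) = \frac{p_{12} + p_{21}}{N} = \frac{f + g}{N^2}, \\
Z_M &= \frac{(f - g)/N}{\sqrt{(f + g)/N^2}} = \frac{f - g}{\sqrt{f + g}}.
\end{align*}
```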
Example: oral contraceptives and blood clotting (1)
Example in Sartwell et al. (1969): Suspicion that oral contraceptives might predispose women towards thromboembolism.
175 cases from American hospitals during a 3-year period, with individually matched controls.
Matching criteria: female, discharged alive from the same hospital in the same 6-month time interval as the case, same age (5-year span), marital status, race, etc.
Cases and controls were then asked about their use of oral contraceptives.

Example: oral contraceptives and blood clotting (2)
The data:
> (sartwell <- matrix(c(10, 13, 57, 95), 2, 2, dimnames = list(c("D-E",
+     "D-notE"), c("notD-E", "notD-notE"))))
       notD-E notD-notE
D-E        10        57
D-notE     13        95
Exact test in R:
> binom.test(x = sartwell[1, 2], n = sartwell[1, 2] + sartwell[2,
+     1], p = 0.5, alternative = "two.sided")
Exact binomial test
data: sartwell[1, 2] and sartwell[1, 2] + sartwell[2, 1]
number of successes = 57, number of trials = 70, p-value = 1.029e-07
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
0.7033852 0.8972389
sample estimates:
probability of success
0.8142857
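
As a cross-check, the exact two-sided p-value can be recomputed directly from the Bin(M, 1/2) distribution. The sketch below is an illustration and not part of the original analysis: the names `f`, `g`, `M` and `p.manual` are chosen here, and doubling one tail relies on the symmetry of the binomial distribution for p = 0.5 (and on f being larger than M/2).

```r
# Discordant pairs in the Sartwell data:
# f = case exposed / control unexposed, g = case unexposed / control exposed.
f <- 57
g <- 13
M <- f + g

# Under H0, f | M ~ Bin(M, 1/2). The distribution is symmetric around M/2,
# so the two-sided p-value is twice the upper tail P(X >= f).
p.manual <- 2 * pbinom(f - 1, size = M, prob = 0.5, lower.tail = FALSE)

# p-value computed by binom.test() for comparison.
p.binom <- binom.test(x = f, n = M, p = 0.5)$p.value

print(c(manual = p.manual, binom.test = p.binom))
```

Both numbers should agree up to floating-point error.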
Example: oral contraceptives and blood clotting (3)
McNemar’s test in R:
> mcnemar.test(sartwell, correct = FALSE)

        McNemar's Chi-squared test

data:  sartwell
McNemar's chi-squared = 27.6571, df = 1, p-value = 1.448e-07

Hence we reject $H_0$ at almost every significance level: cases and controls appear to differ in the presence of the exposure factor.

Conditional odds ratio (1)
Under matching, the marginal odds ratio $OR_m$ is not equivalent to the population odds ratio:
\[ OR_m = \frac{P(Y_E = D)/P(Y_E = \bar D)}{P_m(Y_{\bar E} = D)/P_m(Y_{\bar E} = \bar D)} = \frac{\pi_{1\bullet}/\pi_{2\bullet}}{\pi_{\bullet 1}/\pi_{\bullet 2}} \neq OR. \]
Instead we look at the conditional OR, conditioned on a specific value $z$ of the matching covariate $Z$:
\[ OR_z = \frac{P(Y_E = D \mid z)}{P(Y_E = \bar D \mid z)} \cdot \frac{P(Y_{\bar E} = \bar D \mid z)}{P(Y_{\bar E} = D \mid z)}. \]
Although $P(Y_E = D \mid z)$ and $P(Y_{\bar E} = D \mid z)$ may vary with $z$, we assume a constant $OR_z$ for all values of $z$, i.e.
\[ OR_z = OR_C \quad \forall z. \]

Conditional odds ratio (2)
Basically we construct a 2 × 2 table for each matched pair:
\[ \begin{array}{l|cc|c}
 & \bar E: D & \bar E: \bar D & \\ \hline
E: D & \pi_{11|z} & \pi_{12|z} & \pi_{1\bullet|z} \\
E: \bar D & \pi_{21|z} & \pi_{22|z} & \pi_{2\bullet|z} \\ \hline
 & \pi_{\bullet 1|z} & \pi_{\bullet 2|z} & 1
\end{array} \]
In this setting one can show (→ blackboard)
\[ OR_C = \frac{\pi_{12}}{\pi_{21}}, \quad \text{estimated by} \quad \widehat{OR}_C = \frac{p_{12}}{p_{21}} = \frac{f}{g}, \]
where $\pi_{12}$ and $\pi_{21}$ are the population average discordant probabilities.

Confidence limits for the conditional odds ratio
Exact confidence interval
Exact confidence limits for $OR_C$ can be based on the conditional binomial distribution of the exact test (slide 4), i.e. $f \mid M \sim \mathrm{Bin}(M, \pi_f)$, where $\pi_f = \pi_{12}/\pi_d$ and $\pi_d = \pi_{12} + \pi_{21}$.
Then $OR_C = \pi_f/(1 - \pi_f)$, and CIs are computed by logit-transformation of the borders of a CI for $\pi_f$.
Large sample confidence interval
A large sample $(1 - \alpha) \cdot 100\%$ CI for $\theta = \log OR_C$ is obtained by using $\hat\theta \pm z_{1-\alpha/2}\, \mathrm{se}(\hat\theta)$, where $\mathrm{se}(\hat\theta) = \sqrt{M/(fg)}$.
A back-transformation provides a CI for $OR_C$.
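
Both intervals can be tried out on the discordant counts of the Sartwell example (f = 57, g = 13). The following is a minimal sketch rather than code from the lecture: the variable names are chosen here, and it assumes that the Clopper-Pearson limits returned by `binom.test()` are an acceptable choice for the exact CI of $\pi_f$.

```r
# Discordant pair counts from the Sartwell example.
f <- 57
g <- 13
M <- f + g

# Point estimate of the conditional odds ratio.
or.hat <- f / g

# Exact interval: confidence limits for pi_f from binom.test()
# (Clopper-Pearson), mapped to the odds scale via pi_f / (1 - pi_f).
ci.pi <- binom.test(x = f, n = M)$conf.int
ci.exact <- as.numeric(ci.pi / (1 - ci.pi))

# Large-sample interval: Wald limits for log(OR_C), then back-transformed.
alpha <- 0.05
se.log <- sqrt(M / (f * g))
ci.wald <- exp(log(or.hat) + c(-1, 1) * qnorm(1 - alpha / 2) * se.log)

print(round(c(OR = or.hat, exact = ci.exact, wald = ci.wald), 3))
```

The two intervals need not coincide. The exact interval is usually somewhat wider because it is conservative.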
Mantel-Haenszel Analysis (1)
Consider the sample of matched pairs as $N$ independent samples, each consisting of one member in each exposure group $(E, \bar E)$ (prospective) or in each disease group $(D, \bar D)$ (retrospective).
The $i$’th pair provides an unmatched 2 × 2 table
\[ \begin{array}{l|cc|c}
 & E & \bar E & \\ \hline
D & a_i & b_i & m_{1i} \\
\bar D & c_i & d_i & m_{2i} \\ \hline
 & n_{1i} = 1 & n_{2i} = 1 & 2
\end{array} \]
where $m_{1i} = 0, 1, 2$.

Mantel-Haenszel Analysis (2)
We have that (→ blackboard)
\[ X^2_{C(MH)} = X_M^2, \]
i.e. the Mantel-Haenszel test statistic equals the squared McNemar test statistic.
However, from our Mantel-Haenszel derivations we also get an estimator for the OR, not just a test.
One can show that (→ blackboard)
\[ \widehat{OR}_{MH} = \frac{f}{g} = \widehat{OR}_C. \]
Confidence intervals for $\widehat{OR}_{MH}$ can be computed by using the large sample variance $\widehat{\mathrm{Var}}(\log \widehat{OR}_{MH})$.
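
The two blackboard results can be sketched as follows. The sketch assumes the usual Mantel-Haenszel ingredients (hypergeometric expectation and variance of $a_i$, and the estimator $\sum_i a_i d_i / N_i \big/ \sum_i b_i c_i / N_i$) with $N_i = 2$ and $n_{1i} = n_{2i} = 1$. It may differ in detail from the derivation given in the lecture.

```latex
\begin{align*}
E(a_i) &= \frac{n_{1i}\, m_{1i}}{N_i} = \frac{m_{1i}}{2}, \qquad
V(a_i) = \frac{n_{1i}\, n_{2i}\, m_{1i}\, m_{2i}}{N_i^2 (N_i - 1)} = \frac{m_{1i}\, m_{2i}}{4}. \\
\intertext{Concordant pairs ($m_{1i} \in \{0, 2\}$) have $a_i - E(a_i) = 0$ and $V(a_i) = 0$.
Discordant pairs ($m_{1i} = 1$) have $E(a_i) = 1/2$, $V(a_i) = 1/4$, with
$a_i = d_i = 1$ for the $f$ pairs and $b_i = c_i = 1$ for the $g$ pairs. Hence}
X^2_{C(MH)} &= \frac{\left[\sum_i \bigl(a_i - E(a_i)\bigr)\right]^2}{\sum_i V(a_i)}
             = \frac{\left[(f - g)/2\right]^2}{(f + g)/4}
             = \frac{(f - g)^2}{f + g} = Z_M^2, \\
\widehat{OR}_{MH} &= \frac{\sum_i a_i d_i / N_i}{\sum_i b_i c_i / N_i}
                   = \frac{f/2}{g/2} = \frac{f}{g}.
\end{align*}
```

Only the discordant pairs contribute, which is why the stratified analysis reproduces McNemar’s test.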
Mantel-Haenszel Analysis (3)
Additional remarks:
We have $\widehat{OR}_{C(MH)} = \widehat{OR}_C$.
Furthermore, a CI for $\log(\widehat{OR}_{C(MH)})$ based on the expression in Chapter 4 corresponds to the large-sample confidence interval for $\log(\widehat{OR}_C)$ from slide 12 (Confidence limits for the conditional odds ratio).
In Chapter 6 we will see that the estimator for $OR_C$ also arises from a conditional logit model for pair-matched data.

Example: Mantel-Haenszel Analysis (1)
A 1:1 matched case-control study was performed to investigate the association between tonsillectomy and Hodgkin’s disease.
Results of the 85 pairs in the study:
\[ \begin{array}{l|cc}
 & \bar D: E & \bar D: \bar E \\ \hline
D: E & 26 & 15 \\
D: \bar E & 7 & 37
\end{array} \]
Using mantelhaen.test for the analysis in R:
> hodgkin <- matrix(c(26, 7, 15, 37), 2, 2, dimnames = list(c("D-E",
+     "D-notE"), c("notD-E", "notD-notE")))
> tables <- array(c(1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0,
+     1), dim = c(2, 2, 4))
> hodgkin.strata <- tables[, , rep(1:4, times = t(hodgkin))]

Example: Mantel-Haenszel Analysis (2)
> mantelhaen.test(hodgkin.strata, correct = FALSE)

        Mantel-Haenszel chi-squared test without continuity correction

data:  hodgkin.strata
Mantel-Haenszel X-squared = 2.9091, df = 1, p-value = 0.08808
alternative hypothesis: true common odds ratio is not equal to 1
95 percent confidence interval:
 0.8737077 5.2555753
sample estimates:
common odds ratio
         2.142857

5.6 Unconditional power function of McNemar’s test (1)
McNemar’s test investigates $H_0: \pi_{12} = \pi_{21}$ vs. $H_1: \pi_{12} \neq \pi_{21}$.
Assume the test statistic is $T = p_{12} - p_{21}$, which is distributed as
\[ T_{H_0} \overset{d}{\approx} N\left(0, \frac{\pi_d}{N}\right), \qquad \pi_d = \pi_{12} + \pi_{21}, \]
\[ T_{H_1} \overset{d}{\approx} N\left(\pi_{12} - \pi_{21}, \frac{\pi_d - (\pi_{12} - \pi_{21})^2}{N}\right). \]
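Using the formula from Chapter 3 to compute the necessary sample size of $N$ matched pairs, one obtains
\[ N = \left( \frac{z_{1-\alpha}\sqrt{\pi_d} + z_{1-\beta}\sqrt{\pi_d - (\pi_{12} - \pi_{21})^2}}{\pi_{12} - \pi_{21}} \right)^2. \]
For a two-sided test, $z_{1-\alpha}$ is replaced by $z_{1-\alpha/2}$.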
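
The sample-size formula can also be read in the other direction, as a power function of $N$. The sketch below assumes the two normal approximations for $T$ stated above and ignores the negligible rejection probability in the opposite tail. The function name `power.mcnemar` and the parameter values are illustrative choices made here, not part of the lecture material.

```r
# Approximate power of McNemar's test with N matched pairs, based on the
# normal approximations of T = p12 - p21 under H0 and H1.
power.mcnemar <- function(N, pi.12, pi.21, alpha = 0.05, sided = 2) {
    pi.d <- pi.12 + pi.21
    delta <- abs(pi.12 - pi.21)
    # Critical value for T, using the H0 standard deviation sqrt(pi.d / N).
    crit <- qnorm(1 - alpha / sided) * sqrt(pi.d / N)
    # Standard deviation of T under H1.
    sd1 <- sqrt((pi.d - delta^2) / N)
    # Probability that T exceeds the critical value under H1.
    pnorm((delta - crit) / sd1)
}

# Illustrative values: pi.12 = 0.25, pi.21 = 0.125, two-sided alpha = 0.05.
N.grid <- c(100, 150, 200, 250)
pw <- power.mcnemar(N.grid, pi.12 = 0.25, pi.21 = 0.125)
print(round(pw, 3))
```

Solving `power.mcnemar(N, ...) = 1 - beta` for `N` gives back the sample-size formula above.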
Unconditional power function of McNemar’s test (2)
To calculate a sample size we need to specify $\pi_{21}$ and $OR = \pi_{12}/\pi_{21}$, i.e. $\pi_{12} = OR \cdot \pi_{21}$.
Assume $\pi_{21} = 0.125$ and we wish to detect an odds ratio of $OR = 2$. Hence $\pi_{12} = 0.25$ and $\pi_d = 0.375$.
For $\alpha = 0.05$ the necessary sample size for a two-sided test with power $1 - \beta = 0.9$ (the default beta = 0.1 of the function) is
> samsize.mcnemar <- function(pi.12, pi.21, alpha = 0.05, beta = 0.1,
+     sided = 1) {
+     pi.d <- (pi.12 + pi.21)
+     N <- (qnorm(1 - alpha/sided) * sqrt(pi.d) + qnorm(1 - beta) *
+         sqrt(pi.d - (pi.12 - pi.21)^2))^2/(pi.12 - pi.21)^2
+     return(ceiling(N))
+ }
> samsize.mcnemar(pi.12 = 0.25, pi.21 = 0.125, sided = 2)
[1] 248

Literature
Sartwell, P., Masi, A., Arthes, F., Greene, G., and Smith, H. (1969). Thromboembolism and oral contraceptives: an epidemiologic case-control study. Am J Epidemiol, 90(5):365–380.