lecture3

Analysis of matched data; plus,
diagnostic testing
Correlated Observations

• Correlated data arise when pairs or clusters of observations are related and thus are more similar to each other than to other observations in the dataset.
• Ignoring correlations will:
– overestimate p-values for within-person or within-cluster comparisons
– underestimate p-values for between-person or between-cluster comparisons
Pair Matching: Why match?

• Pairing can control for extraneous sources of variability and increase the power of a statistical test.
• Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.
Example

Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:

               Tonsillectomy   None
Hodgkin’s           41          44
Sib control         33          52

OR=1.47; chi-square=1.53 (NS)
From John A. Rice, “Mathematical Statistics and Data Analysis.”
Example

But several letters to the editor pointed out that the investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data like this:

                          Control
                     Tonsillectomy   None
Case  Tonsillectomy       26          15
      None                 7          37

OR=2.14*; chi-square=2.91 (p=.09)
From John A. Rice, “Mathematical Statistics and Data Analysis.”
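The two analyses of the Hodgkin’s data can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names are illustrative (not from the source), and the chi-square statistics are computed without a continuity correction, an assumption that reproduces the quoted values.

```python
from math import erfc, sqrt

def unpaired_or_chi2(a, b, c, d):
    """Odds ratio and Pearson chi-square for a 2x2 table [[a, b], [c, d]],
    treating the two rows as independent samples (no continuity correction)."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2

def paired_or_chi2(b, c):
    """Matched-pair odds ratio and McNemar chi-square from the discordant counts:
    b = pairs where only the case is exposed, c = pairs where only the control is."""
    return b / c, (b - c) ** 2 / (b + c)

def chi2_1df_pvalue(x):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(x / 2))

# Unpaired table: Hodgkin's 41/44, sibling controls 33/52 (tonsillectomy / none)
or_unpaired, chi2_unpaired = unpaired_or_chi2(41, 44, 33, 52)
# Paired table: 15 pairs with only the case exposed, 7 with only the control exposed
or_paired, chi2_paired = paired_or_chi2(15, 7)

print(round(or_unpaired, 2), round(chi2_unpaired, 2))
print(round(or_paired, 2), round(chi2_paired, 2),
      round(chi2_1df_pvalue(chi2_paired), 2))
```

The paired analysis uses only the 22 discordant sibling pairs, which is why its odds ratio and test statistic differ from the unpaired ones.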
Pair Matching: example
Match each MI case to a control (a person without MI) based on age and gender.
Ask about history of diabetes to find out whether diabetes increases the risk of MI.
Pair Matching: example
Which cells are informative? Just the discordant cells are informative!

                            MI controls
                        Diabetes   No diabetes   Total
MI cases  Diabetes          9          37          46
          No diabetes      16          82          98
          Total            25         119         144
Pair Matching
OR estimate comes only from discordant pairs!
The question is: among the discordant pairs, what
proportion are discordant in the direction of the
case vs. the direction of the control. If more
discordant pairs “favor” the case, this indicates
OR>1.
Let b = 37 be the number of pairs in which only the case has diabetes and c = 16 the number of pairs in which only the control has diabetes.
P(“favors” case / discordant pair) =
\[ \hat{p} = \frac{b}{b+c} = \frac{37}{37+16} = \frac{37}{53} \]
odds(“favors” case / discordant pair) =
\[ OR = \frac{b}{c} = \frac{37}{16} \]
OR estimate comes only from discordant pairs!!
OR= 37/16 = 2.31
Makes Sense!
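As a quick check of why the odds ratio reduces to the ratio of the discordant counts, the following sketch (variable names are illustrative) converts the proportion of discordant pairs that favor the case into odds and compares it with 37/16.

```python
b, c = 37, 16  # discordant pairs: only the case diabetic, only the control diabetic

p_hat = b / (b + c)         # P("favors" case | discordant pair)
odds = p_hat / (1 - p_hat)  # odds of favoring the case among discordant pairs
matched_or = b / c          # the same quantity written directly

print(round(p_hat, 3), round(odds, 2), round(matched_or, 2))
```

The 9 + 82 concordant pairs never enter the calculation.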
McNemar’s Test
                            MI controls
                        Diabetes   No diabetes
MI cases  Diabetes          9          37
          No diabetes      16          82
Null hypothesis: P(“favors” case / discordant pair) = .5
(note: equivalent to OR=1.0 or cell b=cell c)
\[ p\text{-value} = \binom{53}{37}(.5)^{37}(.5)^{16} + \binom{53}{38}(.5)^{38}(.5)^{15} + \binom{53}{39}(.5)^{39}(.5)^{14} + \dots \]
McNemar’s Test
By normal approximation to binomial:
\[ Z = \frac{37 - \frac{53}{2}}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01 \]
McNemar’s Test: generally
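                   controls
                 exp     No exp
cases  exp        a        b
       No exp     c        d

By normal approximation to binomial:
\[ Z = \frac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \frac{\frac{b-c}{2}}{\sqrt{\frac{b+c}{4}}} = \frac{b-c}{\sqrt{b+c}} \]
Equivalently:
\[ \chi^2_1 = \left(\frac{b-c}{\sqrt{b+c}}\right)^2 = \frac{(b-c)^2}{b+c} \]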
McNemar’s Test
McNemar’s Test:
\[ \chi^2_1 = \frac{(37-16)^2}{53} = \frac{21^2}{53} = 8.32 = 2.88^2; \quad p < .01 \]
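The normal-approximation form of McNemar’s test is short enough to compute by hand, but a small sketch makes the equivalence of the Z and chi-square versions explicit. The function name is illustrative, and no continuity correction is applied (an assumption that matches the arithmetic above).

```python
from math import erfc, sqrt

def mcnemar_normal_approx(b, c):
    """Return (Z, chi-square, two-sided p-value) for discordant counts b and c,
    using the normal approximation to the binomial without continuity correction."""
    z = (b - c) / sqrt(b + c)
    chi2 = (b - c) ** 2 / (b + c)
    p_two_sided = erfc(abs(z) / sqrt(2))  # P(|N(0,1)| > |z|)
    return z, chi2, p_two_sided

z, chi2, p = mcnemar_normal_approx(37, 16)
print(round(z, 2), round(chi2, 2), p < 0.01)
```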
Example: McNemar’s EXACT
test

Split-face trial:
– Researchers assigned 56 subjects to apply SPF
85 sunscreen to one side of their faces and SPF
50 to the other prior to engaging in 5 hours of
outdoor sports during mid-day. The outcome is
sunburn (yes/no).
– Unit of observation = side of a face
– Are the observations correlated? Yes.
Russak JE et al. JAAD 2010; 62: 348-349.
Results ignoring correlation:
Table I -- Dermatologist grading of sunburn after an average of 5 hours of skiing/snowboarding (P = .03; Fisher’s exact test)

Sun protection factor   Sunburned   Not sunburned
85                          1            55
50                          8            48

Fisher’s exact test compares the following proportions: 1/56 versus 8/56. Note that individuals are being counted twice!
Correct analysis of data:
Table 1. Correct presentation of the data (P = .016; McNemar’s exact test).

                                SPF-50 side
                           Sunburned   Not sunburned
SPF-85 side  Sunburned         1            0
             Not sunburned     7           48

McNemar’s exact test:
Null hypothesis: X~binomial (n=7, p=.5), where X counts the discordant pairs in one direction
\[ P(X = 0) = \binom{7}{0}(.5)^0(.5)^7 = .0078 \]
\[ P(X = 7) = \binom{7}{7}(.5)^7(.5)^0 = .0078 \]
Two-sided p-value = .0156
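The exact test is a binomial tail sum over the discordant pairs, so it can be sketched with `math.comb` alone. The function name is illustrative, and doubling the smaller tail is one common convention for the two-sided p-value (an assumption; it reproduces the .0156 above).

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value. Under the null hypothesis the number of
    discordant pairs in one direction is Binomial(b + c, 0.5); the smaller tail
    is doubled and capped at 1."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, x) for x in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_sunscreen = mcnemar_exact(0, 7)  # split-face trial: 0 versus 7 discordant faces
p_mi = mcnemar_exact(37, 16)       # MI/diabetes pairs from the earlier example
print(round(p_sunscreen, 4), p_mi < 0.01)
```

With only 7 discordant pairs the normal approximation is unreliable, which is why the exact version is used for the sunscreen data.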
RECALL: 95% confidence
interval for a difference in
INDEPENDENT proportions
Standard error can be estimated by:
\[ \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
Standard error of the difference of two proportions =
\[ \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]
95% confidence interval for the difference between two proportions:
\[ (\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \]
95% CI for difference in
dependent proportions
Variance of the difference of two random variables is the sum of their variances minus 2*covariance:
\[ Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) - 2\,Cov(\hat{p}_1, \hat{p}_2) \]
Here n is the number of case-control pairs, and the joint proportions in the covariance are the four cell proportions of the paired table:
\[ Var(p_{E/D}) = \frac{p_{E/D}(1 - p_{E/D})}{n} \]
\[ Var(p_{E/\sim D}) = \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} \]
\[ Cov(p_{E/\sim D}, p_{E/D}) = \frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n} \]
\[ Var(p_{E/D} - p_{E/\sim D}) = \frac{p_{E/D}(1 - p_{E/D})}{n} + \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} - 2\left(\frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n}\right) \]
95% CI for difference in
dependent proportions
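                            MI controls
                        Diabetes   No diabetes   Total
MI cases  Diabetes          9          37          46
          No diabetes      16          82          98
          Total            25         119         144

\[ p_{E/D} - p_{E/\sim D} = \frac{46}{144} - \frac{25}{144} = .32 - .17 = .15 \]
\[ Var(p_{E/D} - p_{E/\sim D}) = \frac{p_{E/D}(1 - p_{E/D})}{n} + \frac{p_{E/\sim D}(1 - p_{E/\sim D})}{n} - 2\left(\frac{p_{E\&D}\, p_{\sim E\&\sim D} - p_{\sim E\&D}\, p_{E\&\sim D}}{n}\right) \]
\[ = \frac{\left(\frac{46}{144}\right)\left(1 - \frac{46}{144}\right) + \left(\frac{25}{144}\right)\left(1 - \frac{25}{144}\right) - 2\left(\frac{9}{144} \cdot \frac{82}{144} - \frac{37}{144} \cdot \frac{16}{144}\right)}{144} = .0024 \]
\[ 95\%\ CI: 0.15 \pm 1.96\sqrt{.0024} = 0.05 \text{ to } 0.24 \]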
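The variance with the covariance term can be sketched directly from the four cells of the paired table. The function name and argument order (a, b, c, d as in the general McNemar table) are assumptions made for illustration.

```python
from math import sqrt

def paired_difference_ci(a, b, c, d, z=1.96):
    """Difference in exposure proportions (cases minus controls) for pair-matched
    data, with a Wald confidence interval that accounts for the covariance.
    a = both exposed, b = only case exposed, c = only control exposed, d = neither."""
    n = a + b + c + d                      # number of case-control pairs
    p_case = (a + b) / n                   # proportion of cases exposed
    p_control = (a + c) / n                # proportion of controls exposed
    cov = ((a / n) * (d / n) - (b / n) * (c / n)) / n
    var = p_case * (1 - p_case) / n + p_control * (1 - p_control) / n - 2 * cov
    diff = p_case - p_control
    half_width = z * sqrt(var)
    return diff, var, (diff - half_width, diff + half_width)

diff, var, (lower, upper) = paired_difference_ci(9, 37, 16, 82)
print(round(diff, 2), round(var, 4), round(lower, 2), round(upper, 2))
```

Dropping the covariance term would give the independent-samples variance, which is larger here because exposure within pairs is positively correlated.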

The connection between McNemar
and Cochran-Mantel-Haenszel Tests
View each pair as its own “age-gender” stratum.

Example: a stratum concordant for exposure (cell “a” from before)

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0

The 144 pairs form four types of strata:

               Case (MI)   Control
Diabetes           1          1        x 9
No diabetes        0          0

Diabetes           1          0        x 37
No diabetes        0          1

Diabetes           0          1        x 16
No diabetes        1          0

Diabetes           0          0        x 82
No diabetes        1          1
Mantel-Haenszel for pair-matched data
We want to know the relationship between diabetes and
MI controlling for age and gender (the matching
variables).
Mantel-Haenszel methods apply.
RECALL: The Mantel-Haenszel
Summary Odds Ratio
\[ OR_{MH} = \frac{\sum_{i=1}^{k} \frac{a_i d_i}{T_i}}{\sum_{i=1}^{k} \frac{b_i c_i}{T_i}} \]

               Case   Control
Exposed          a       b
Not Exposed      c       d

               Case (MI)   Control
Diabetes           1          1        ad/T = 0      x 9
No diabetes        0          0        bc/T = 0

Diabetes           1          0        ad/T = 1/2    x 37
No diabetes        0          1        bc/T = 0

Diabetes           0          1        ad/T = 0      x 16
No diabetes        1          0        bc/T = 1/2

Diabetes           0          0        ad/T = 0      x 82
No diabetes        1          1        bc/T = 0
Mantel-Haenszel Summary OR
\[ OR_{MH} = \frac{\sum_{i=1}^{144} \frac{a_i d_i}{2}}{\sum_{i=1}^{144} \frac{b_i c_i}{2}} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16} \]
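To see the stratified formula reduce to b/c, the sketch below builds one 2x2 table per pair and applies the Mantel-Haenszel summary odds ratio. The tuple layout (a, b, c, d) follows the Case/Control table above; the function name is illustrative.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over 2x2 strata given as (a, b, c, d):
    a = exposed case, b = exposed control, c = unexposed case, d = unexposed control."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# One stratum per matched pair (144 pairs in total)
strata = (
    [(1, 1, 0, 0)] * 9     # both diabetic
    + [(1, 0, 0, 1)] * 37  # only the case diabetic
    + [(0, 1, 1, 0)] * 16  # only the control diabetic
    + [(0, 0, 1, 1)] * 82  # neither diabetic
)

or_mh = mantel_haenszel_or(strata)
print(len(strata), round(or_mh, 2))
```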
Mantel-Haenszel Test Statistic
(same as McNemar’s)
\[ \frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1 \]
recall:
\[ E(a_k) = \frac{(a_k + b_k)(a_k + c_k)}{n_k} \]
\[ Var(a_k) = \frac{(a_k + b_k)(c_k + d_k)(a_k + c_k)(b_k + d_k)}{n_k^2 (n_k - 1)} \]
Concordant cells contribute nothing to Mantel-Haenszel statistic (observed=expected)

recall:
\[ E(a_k) = \frac{(row_1)(col_1)}{n_k} \qquad Var(a_k) = \frac{(row_1)(row_2)(col_1)(col_2)}{n_k^2 (n_k - 1)} \]

               Case (MI)   Control
Diabetes           1          1
No diabetes        0          0

\[ E(a_k) = \frac{(2)(1)}{2} = 1 \qquad a_k - E(a_k) = 1 - 1 = 0 \qquad Var(a_k) = \frac{(2)(0)(1)(1)}{2^2 (2 - 1)} = 0 \]

               Case (MI)   Control
Diabetes           0          0
No diabetes        1          1

\[ E(a_k) = \frac{(0)(1)}{2} = 0 \qquad a_k - E(a_k) = 0 - 0 = 0 \qquad Var(a_k) = \frac{(0)(2)(1)(1)}{2^2 (2 - 1)} = 0 \]
Discordant cells
recall:
\[ E(a_k) = \frac{(row_1)(col_1)}{n_k} \qquad Var(a_k) = \frac{(row_1)(row_2)(col_1)(col_2)}{n_k^2 (n_k - 1)} \]

               Case (MI)   Control
Diabetes           1          0
No diabetes        0          1

\[ E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2} \qquad a_k - E(a_k) = 1 - \frac{1}{2} = \frac{1}{2} \qquad Var(a_k) = \frac{(1)(1)(1)(1)}{2^2 (2 - 1)} = \frac{1}{4} \]

               Case (MI)   Control
Diabetes           0          1
No diabetes        1          0

\[ E(a_k) = \frac{(1)(1)}{2} = \frac{1}{2} \qquad a_k - E(a_k) = 0 - \frac{1}{2} = -\frac{1}{2} \qquad Var(a_k) = \frac{(1)(1)(1)(1)}{2^2 (2 - 1)} = \frac{1}{4} \]
\[ \chi^2_1 = \frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} = \frac{[37(.5) - 16(.5)]^2}{(37 + 16)(.25)} = \frac{[.5(37 - 16)]^2}{(53)(.25)} \]
\[ = \frac{.5^2 (37 - 16)^2}{.25(53)} = \frac{(37 - 16)^2}{53} = 8.32; \quad p < .01 \]
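\[ CMH = \frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} = \frac{\left[\sum_{\text{case disc. cells}} .5 - \sum_{\text{control disc. cells}} .5\right]^2}{\sum_{\text{disc. cells}} .25} \]
\[ = \frac{[.5(b) - .5(c)]^2}{(b + c)(.25)} = \frac{.5^2 (b - c)^2}{.25(b + c)} = \frac{(b - c)^2}{b + c} = \text{McNemar's} \sim \chi^2_1 \]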
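The algebra above can be confirmed numerically by running the Cochran-Mantel-Haenszel statistic over one stratum per pair and comparing it with McNemar’s formula. This is a sketch with illustrative names; each stratum is a tuple (a, b, c, d) in the Case/Control layout used earlier.

```python
def cmh_statistic(strata):
    """Cochran-Mantel-Haenszel chi-square (1 df, no continuity correction)
    over 2x2 strata given as (a, b, c, d)."""
    deviation_sum = 0.0
    variance_sum = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        deviation_sum += a - row1 * col1 / n                      # observed - expected
        variance_sum += row1 * row2 * col1 * col2 / (n ** 2 * (n - 1))
    return deviation_sum ** 2 / variance_sum

b, c = 37, 16
strata = (
    [(1, 1, 0, 0)] * 9     # concordant, both exposed: contributes 0 and 0
    + [(1, 0, 0, 1)] * b   # only the case exposed: +0.5 and 0.25
    + [(0, 1, 1, 0)] * c   # only the control exposed: -0.5 and 0.25
    + [(0, 0, 1, 1)] * 82  # concordant, neither exposed: contributes 0 and 0
)

cmh = cmh_statistic(strata)
mcnemar = (b - c) ** 2 / (b + c)
print(round(cmh, 2), round(mcnemar, 2))
```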
Example: Salmonella
Outbreak in France, 1996
From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.
[Figure: Epidemic Curve]
Matched Case Control Study
Case = Salmonella gastroenteritis.
Community controls (1:1) matched for:
• age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years)
• gender
• city of residence
Results
In 2x2 table form: any goat’s cheese

                             Controls
                      Goat’s cheese   None   Total
Cases  Goat’s cheese       23          23      46
       None                 6           7      13
       Total               29          30      59

\[ OR = \frac{b}{c} = \frac{23}{6} = 3.8 \]
In 2x2 table form: Brand A goat’s cheese

                                     Controls
                              Brand A goat’s cheese   None   Total
Cases  Brand A goat’s cheese            8              24      32
       None                             2              25      27
       Total                           10              49      59

\[ OR = \frac{b}{c} = \frac{24}{2} = 12.0 \]
           Case   Control
Brand A      1       1        x 8
None         0       0

Brand A      1       0        x 24
None         0       1

Brand A      0       1        x 2
None         1       0

Brand A      0       0        x 25
None         1       1
Using Agresti notation here!

8 concordant exposed:
\[ \mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\, n_{+1k}}{n_{++k}} = \frac{2 \cdot 1}{2} = 1 \]
\[ \text{Observed}(n_{11k}) - \mu_{11k} = 1 - 1 = 0 \]
\[ Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{2 \cdot 1 \cdot 0 \cdot 1}{4(2 - 1)} = 0 \]
Summary: 8 concordant-exposed pairs (=strata) contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).

25 concordant unexposed:
\[ \mu_{11k} = E(n_{11k}) = \frac{n_{1+k}\, n_{+1k}}{n_{++k}} = \frac{0 \cdot 1}{2} = 0 \]
\[ \text{Observed}(n_{11k}) - \mu_{11k} = 0 - 0 = 0 \]
\[ Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{0 \cdot 1 \cdot 2 \cdot 1}{4(2 - 1)} = 0 \]
Summary: 25 concordant-unexposed pairs contribute nothing to the numerator (observed-expected=0) and nothing to the denominator (variance=0).
2 discordant cells favor control:
\[ \mu_{11k} = \frac{(1)(1)}{2} = \frac{1}{2} \]
\[ \text{Observed}(n_{11k}) - \mu_{11k} = 0 - .5 = -.5 \]
\[ Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2 - 1)} = \frac{1}{4} \]
Summary: 2 discordant “control-exposed” pairs contribute -.5 each to the numerator (observed-expected= -.5) and .25 each to the denominator (variance= .25).

24 discordant cells favor case:
\[ \mu_{11k} = \frac{(1)(1)}{2} = \frac{1}{2} \]
\[ \text{Observed}(n_{11k}) - \mu_{11k} = 1 - .5 = .5 \]
\[ Var(n_{11k}) = \frac{n_{1+k}\, n_{+1k}\, n_{2+k}\, n_{+2k}}{n_{++k}^2 (n_{++k} - 1)} = \frac{1 \cdot 1 \cdot 1 \cdot 1}{4(2 - 1)} = \frac{1}{4} \]
Summary: 24 discordant “case-exposed” pairs contribute +.5 each to the numerator (observed-expected= +.5) and .25 each to the denominator (variance= .25).
\[ \chi^2_{CMH} = \frac{[8(0) + 25(0) + 24(.5) + 2(-.5)]^2}{0 + 0 + 24(.25) + 2(.25)} = \frac{22^2 (.25)}{26(.25)} = \frac{22^2}{26} = \frac{(24 - 2)^2}{26} = \frac{(b - c)^2}{b + c} \]
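The contribution-by-pair-type bookkeeping for the Brand A table can be tallied in a few lines. This sketch simply multiplies each pair type's count by its (observed minus expected) and variance contributions as listed above; the dictionary keys are illustrative labels.

```python
# pair type -> (number of pairs, observed - expected, variance) per stratum
pair_types = {
    "concordant exposed": (8, 0.0, 0.0),
    "concordant unexposed": (25, 0.0, 0.0),
    "case-exposed discordant": (24, 0.5, 0.25),
    "control-exposed discordant": (2, -0.5, 0.25),
}

numerator = sum(count * dev for count, dev, _ in pair_types.values()) ** 2
denominator = sum(count * var for count, _, var in pair_types.values())
cmh = numerator / denominator

b, c = 24, 2
mcnemar = (b - c) ** 2 / (b + c)  # should agree with the CMH statistic
matched_or = b / c

print(round(cmh, 2), round(mcnemar, 2), matched_or)
```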
Diagnostic Testing and
Screening Tests
Characteristics of a diagnostic test
Sensitivity= Probability that, if you truly have
the disease, the diagnostic test will catch it.
Specificity=Probability that, if you truly do
not have the disease, the test will register
negative.
Calculating sensitivity and
specificity from a 2x2 table
                          Screening Test
                          +       -      Total
Truly have disease   +    a       b      a+b
                     -    c       d      c+d

\[ \text{Sensitivity} = \frac{a}{a+b} \]
Among those with true disease, how many test positive?
\[ \text{Specificity} = \frac{d}{c+d} \]
Among those without the disease, how many test negative?
Hypothetical Example
                                 Mammography
                                 +       -      Total
Breast cancer (on biopsy)   +    9       1       10
                            -  109     881      990

Sensitivity=9/10=.90
1 false negative out of 10 cases
Specificity= 881/990 =.89
109 false positives out of 990
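A small sketch of the two definitions, using the hypothetical mammography counts above. The function name and argument order (rows are true disease status, columns are test result) are assumptions made for illustration.

```python
def sensitivity_specificity(a, b, c, d):
    """Rows = true disease status (+, -); columns = test result (+, -).
    a = true positives, b = false negatives, c = false positives, d = true negatives."""
    sensitivity = a / (a + b)  # among the diseased, fraction testing positive
    specificity = d / (c + d)  # among the non-diseased, fraction testing negative
    return sensitivity, specificity

sens, spec = sensitivity_specificity(9, 1, 109, 881)
print(round(sens, 2), round(spec, 2))
```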
What factors determine the
effectiveness of screening?

• The prevalence (risk) of disease.
• The effectiveness of screening in preventing illness or death.
– Is the test any good at detecting disease/precursor (sensitivity of the test)?
– Is the test detecting a clinically relevant condition?
– Is there anything we can do if disease (or pre-disease) is detected (cures, treatments)?
– Does detecting and treating disease at an earlier stage really result in a better outcome?
• The risks of screening, such as false positives and radiation.
Positive predictive value

• The probability that if you test positive for the disease, you actually have the disease.
• Depends on the characteristics of the test (sensitivity, specificity) and the prevalence of disease.
Example: Mammography

• Mammography utilizes ionizing radiation to image breast tissue.
• The examination is performed by compressing the breast firmly between a plastic plate and an x-ray cassette that contains special x-ray film.
• Mammography can identify breast cancers too small to detect on physical examination.
• Early detection and treatment of breast cancer (before metastasis) can improve a woman’s chances of survival.
• Studies show that, among 50-69 year-old women, screening results in 20-35% reductions in mortality from breast cancer.
Mammography

• Controversy exists over the efficacy of mammography in reducing mortality from breast cancer in 40-49 year old women.
• Mammography has a high rate of false positive tests that cause anxiety and necessitate further costly diagnostic procedures.
• Mammography exposes a woman to some radiation, which may slightly increase the risk of mutations in breast tissue.
Example

• A 60-year-old woman has an abnormal mammogram; what is the chance that she has breast cancer? I.e., what is the positive predictive value?
Calculating PPV and NPV
from a 2x2 table
                          Screening Test
                          +       -
Truly have disease   +    a       b
                     -    c       d
Total                    a+c     b+d

\[ PPV = \frac{a}{a+c} \]
Among those who test positive, how many truly have the disease?
\[ NPV = \frac{d}{b+d} \]
Among those who test negative, how many truly do not have the disease?
Hypothetical Example
                                 Mammography
                                 +       -
Breast cancer (on biopsy)   +    9       1
                            -  109     881
Total                          118     882

PPV=9/118=7.6%
NPV=881/882=99.9%
Prevalence of disease = 10/1000 =1%
What if disease was twice as
prevalent in the population?
                                 Mammography
                                 +       -      Total
Breast cancer (on biopsy)   +   18       2       20
                            -  108     872      980

sensitivity=18/20=.90
specificity=872/980=.89
Sensitivity and specificity are characteristics of the test, so they don’t change!
What if disease was more
prevalent?
                                 Mammography
                                 +       -
Breast cancer (on biopsy)   +   18       2
                            -  108     872
Total                          126     874

PPV=18/126=14.3%
NPV=872/874=99.8%
Prevalence of disease = 20/1000 =2%
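The dependence of predictive values on prevalence follows from Bayes' rule, and can be sketched as a function of sensitivity, specificity, and prevalence. The function name is illustrative; the sketch holds specificity fixed at 881/990, whereas the 2% table above uses whole-number counts (872/980), so results agree only to rounding.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

sens, spec = 9 / 10, 881 / 990  # from the hypothetical mammography table

ppv_1, npv_1 = predictive_values(sens, spec, 0.01)  # prevalence 1%
ppv_2, npv_2 = predictive_values(sens, spec, 0.02)  # prevalence 2%

for prevalence, ppv, npv in [(0.01, ppv_1, npv_1), (0.02, ppv_2, npv_2)]:
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.1%}  NPV={npv:.1%}")
```

Doubling the prevalence roughly doubles the PPV here, while the NPV barely moves because the disease is rare in both scenarios.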
Conclusions

• Positive predictive value increases with increasing prevalence of disease
• Or if you change the diagnostic tests to improve their accuracy.