Lecture 6: Misclassification

Lecture 7: Misclassification
Matthew Fox
Advanced
Epidemiology
Non-differential misclassification biases …
What does that mean?
How bad does the non-differential misclassification have to be?
Could it ever go past the null, or only to it?
We know what to do about measured confounding. What do we do about information bias (and selection bias, and unmeasured confounders for that matter)?
Last class

3 concepts of interaction
– Effect measure modification
   If no EMM on one scale, often EMM on another
– Interdependence
   Risk in double exposed isn't explained by the sum of the two exposure effects alone
– Statistical interaction
   In logistic regression = multiplicative interaction
Today: Misclassification





Exposure misclassification
Disease misclassification
Rules on the impact of misclassifications
Misclassification of covariates
What to do about it?
The F test



Take 2 minutes and count all the Fs
The necessity of training farm hands for first class
farms in the fatherly handling of farm livestock is
foremost in the minds of effective farm owners. Since
the forefathers of the farm owners trained the farm
hands for first class farms in the fatherly handling of
farm livestock, the farm owners feel they should carry
on with the former family tradition of training farmhands
of first class farms in the effective fatherly handling of
farm live stock, however futile, because of their belief
that it forms the basis of effective farm management
efforts.
Answer: 48 or 49
Exposure — terms (1a)

Predictive values
– Truth is in the numerator
– E is the truth, T is the test

Positive predictive value
– Probability of being truly exposed, given a + test
– Pr(E+|Test+)

Negative predictive value
– Probability of being truly unexposed, given a - test
– Pr(E-|Test-)
Exposure — terms (1b)

Classification values
– Truth is the denominator
– E is the truth, T is the test

Sensitivity
– Probability of being correctly classified as E+
– Pr(T+|E+)

False negative
– Probability of being wrongly classified as unexposed
– 1-sensitivity
Exposure — terms (1b)

Classification values
– Truth is the denominator
– E is the truth, T is the test

Specificity
– Probability of being correctly classified as E-
– Pr(T-|E-)

False positive
– Probability of being incorrectly classified as E+
– 1-specificity
Relation between predictive and classification measures
(p = prevalence of true exposure)
$$\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}$$
$$\mathrm{NPV} = \frac{Sp \cdot (1 - p)}{(1 - Se) \cdot p + Sp \cdot (1 - p)}$$
When prevalence is 100%, the PPV is 1 and the NPV is 0.
When prevalence is 0%, the NPV is 1 and the PPV is 0.
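The relation between predictive and classification measures can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name and the example inputs (Se = Sp = 0.9, prevalence 10%) are hypothetical.

```python
def predictive_values(se: float, sp: float, p: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence p."""
    test_pos = se * p + (1 - sp) * (1 - p)   # Pr(T+): true plus false positives
    test_neg = (1 - se) * p + sp * (1 - p)   # Pr(T-): false plus true negatives
    ppv = se * p / test_pos                  # Pr(E+ | T+)
    npv = sp * (1 - p) / test_neg            # Pr(E- | T-)
    return ppv, npv


# Hypothetical inputs: a fairly good test applied to an uncommon exposure.
ppv, npv = predictive_values(se=0.9, sp=0.9, p=0.1)
print(round(ppv, 2), round(npv, 2))
```

With these inputs, half of the positive tests are false even though Se and Sp are both 90%, which is why predictive values cannot be carried from one population to another with a different prevalence. The function divides by Pr(T+) and Pr(T-), so it is undefined when either is zero.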
Exposure Misclassification
Exposure — terms (2)

Non-differential exposure misclassification
Rates of E misclassification do not depend on D
– Se of E classification is the same in the D+ and D-
AND
– Sp of E classification is the same in the D+ and D-

Non-differential misclassification of a dichotomous exposure creates an expected bias of effect estimates towards the null
Exposure — terms (3)

Differential exposure misclassification
Rates of E misclassification do depend on D
– Se of E classification is not the same in the D+ and D-
OR
– Sp of E classification is not the same in the D+ and D-

Differential exposure misclassification of a dichotomous exposure creates an unpredictable bias of the effect estimates
Exposure — terms (4)

Exposure misclassification
– Non-differential CANNOT explain a non-null result
– Differential CAN explain a non-null result

Exposure classification errors are inevitable
– Incomplete knowledge of dose, duration, induction
– Errors in interview, data coding, data entry
– Mistakes in inference

Strive to make the errors non-differential
Some misclassification is easily seen as structural: Recall bias
[Diagram with nodes E, D, and ER]
E = alcohol use during pregnancy, D = birth defect, ER = measure of alcohol use after giving birth
[Diagram with nodes E, D, and EM]
E = blood lipids, D = cancer, EM = measure of blood lipids after cancer occurs
Exposure Misclassification (5)

Truth          X=1          X=0
D+             A            B
D-             C            D
Total          N1 (A+C)     N0 (B+D)

Observation    X=1                        X=0
D+             se1·A + (1-sp1)·B          sp1·B + (1-se1)·A
D-             se0·C + (1-sp0)·D          sp0·D + (1-se0)·C
Total          sum of the two cells       sum of the two cells
               above                      above

se1, sp1 = sensitivity and specificity of exposure classification among D+; se0, sp0 = the same among D-.
Non-differential misclassification requires that se0=se1=se and sp0=sp1=sp.
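The cell formulas on this slide translate directly into code. The sketch below is illustrative only: the function name is hypothetical, and the true counts and the choice Se = 0.9, Sp = 1.0 are example inputs, not a recommendation.

```python
def misclassify_exposure(A, B, C, D, se1, sp1, se0, sp0):
    """Expected observed 2x2 cells given the true cells.

    A, B = truly exposed / unexposed among D+; C, D = the same among D-.
    se1, sp1 apply to D+; se0, sp0 apply to D-.
    """
    a = se1 * A + (1 - sp1) * B   # D+, classified exposed
    b = sp1 * B + (1 - se1) * A   # D+, classified unexposed
    c = se0 * C + (1 - sp0) * D   # D-, classified exposed
    d = sp0 * D + (1 - se0) * C   # D-, classified unexposed
    return a, b, c, d


# Example inputs: non-differential errors (same Se and Sp in D+ and D-).
a, b, c, d = misclassify_exposure(400, 100, 600, 900,
                                  se1=0.9, sp1=1.0, se0=0.9, sp0=1.0)
observed_rr = (a / (a + c)) / (b / (b + d))
print(round(a), round(b), round(c), round(d), round(observed_rr, 2))
```

The true risk ratio for these counts is 4, and the expected observed risk ratio is closer to the null. Passing different values for the D+ and D- parameters gives the differential case.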
Exposure Misclassification (6)

Observation    X=1          X=0
D+             a            b
D-             c            d
Total          n1 (a+c)     n0 (b+d)

Truth          X=1                                        X=0
D+             A = [a - (1-sp1)·D+] / [se1 - (1-sp1)]     B = D+ - A
D-             C = [c - (1-sp0)·D-] / [se0 - (1-sp0)]     D = D- - C
Total          A + C                                      B + D

Here D+ = a+b and D- = c+d are the total numbers of diseased and undiseased subjects.
Given an observation and estimates of sensitivities and specificities, recalculate truth
Obtain estimates of Se and Sp from literature, pilot studies, or substudy with gold standard measurement
Se and Sp not necessarily non-differential
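The back-calculation can be sketched as follows. This is a minimal illustration with a hypothetical function name; the observed counts and the assumed Se and Sp are example inputs.

```python
def correct_exposure(a, b, c, d, se1, sp1, se0, sp0):
    """Back-calculate the true 2x2 cells from the observed cells.

    a, b = classified exposed / unexposed among D+; c, d = the same among D-.
    Requires se + sp != 1 within each disease group.
    """
    n_dis, n_undis = a + b, c + d                           # D+ and D- totals
    A = (a - (1 - sp1) * n_dis) / (se1 - (1 - sp1))         # truly exposed, D+
    C = (c - (1 - sp0) * n_undis) / (se0 - (1 - sp0))       # truly exposed, D-
    return A, n_dis - A, C, n_undis - C


# Example inputs: observed counts with assumed Se = 0.9 and Sp = 1.0.
A, B, C, D = correct_exposure(360, 140, 540, 960,
                              se1=0.9, sp1=1.0, se0=0.9, sp0=1.0)
print(round(A), round(B), round(C), round(D))
```

If the assumed Se and Sp are incompatible with the observed data, the corrected cells can come out negative, which signals that the assumed values cannot be right for this table.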
Ex. 1: Non-differential (1)
Truth
                  Exposed   Unexposed
cases             400       100
undiseased        600       900
total             1000      1000
risk              0.4       0.1
risk difference   0.3
risk ratio        4
Ex. 1: Non-differential (2)
Exposed (90% = sensitivity)
                  Truth   correct   false   total
cases             400     360       0       360
undiseased        600     540       0       540
total                     900       0       900
risk                      0.4               0.4

Unexposed (100% = specificity)
                  Truth   correct   false   total
cases             100     100       40      140
undiseased        900     900       60      960
total                     1000      100     1100
risk                      0.1       0.4     0.13

risk difference   0.27
risk ratio        3.14
Exception #1 to the mantra

Misclassification is haphazard, not random
– Random implies intent, but these mistakes are made without intent, or haphazardly. We model haphazard mistakes as occurring at random.

Misclassification operates on individuals
– With some probability, ND misclassification may bias AWAY from the null
– But the EXPECTATION is towards the null
Example of Exception #1

Study truth as shown, Se=Sp=0.9 (non-differential)
– Expectation is as shown

Apply misclassification probabilities
– Apply probabilities to each individual
– Calculate RR
– Repeat 10,000 times
– Back-calculate truth given Se and Sp

                Cases           Controls        RR
                E+      E-      E+      E-
Truth           40      60      20      80      2.7
Expectation     42      58      26      74      2.1

[Figure: Distribution of observed OR, with regions labelled "Towards null", "Truth", and "Away from null"]
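The simulation described on this slide can be sketched with the standard library alone. This is an illustrative reconstruction under stated assumptions, not the code behind the figure: it treats the table as 40 exposed and 60 unexposed cases and 20 exposed and 80 unexposed controls, computes an odds ratio in each replicate, and omits the final back-calculation step. The function name and the seed are hypothetical.

```python
import random


def simulate_ors(cases_exp, cases_unexp, ctrls_exp, ctrls_unexp,
                 se, sp, reps=10_000, seed=1):
    """Misclassify each individual independently and return the observed ORs."""
    rng = random.Random(seed)

    def classified_exposed(n_exp, n_unexp):
        # A truly exposed person is classified exposed with probability se;
        # a truly unexposed person is classified exposed with probability 1 - sp.
        true_pos = sum(rng.random() < se for _ in range(n_exp))
        false_pos = sum(rng.random() < 1 - sp for _ in range(n_unexp))
        return true_pos + false_pos

    ors = []
    for _ in range(reps):
        a = classified_exposed(cases_exp, cases_unexp)
        c = classified_exposed(ctrls_exp, ctrls_unexp)
        b = cases_exp + cases_unexp - a
        d = ctrls_exp + ctrls_unexp - c
        if b > 0 and c > 0:               # skip replicates with an undefined OR
            ors.append(a * d / (b * c))
    return ors


true_or = (40 * 80) / (60 * 20)
ors = simulate_ors(40, 60, 20, 80, se=0.9, sp=0.9)
away = sum(o > true_or for o in ors) / len(ors)
print(f"share of replicates farther from the null than the truth: {away:.3f}")
```

Most replicates land between the null and the truth, but a minority land beyond the truth, which is the point of the exception: the bias towards the null holds in expectation, not in every realised study.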
Exception #2 example

Truth
             E-      E(low)   E(high)
cases        100     200      600
controls     100     100      100
RR           1       2        6

Misclassified (40% of high to low)
             E-      E(low)   E(high)
cases        100     440      360
controls     100     140      60
RR           1       3.1      6

Misclassified (20% of high to low, 20% of low to high)
             E-      E(low)   E(high)
cases        100     280      520
controls     100     100      100
RR           1       2.8      5.2
Exception #2 to the mantra

When exposure has two or more categories
– Bias from non-differential exposure misclassification for a given comparison may be AWAY from the null
– The estimates of effect within the categories will be biased towards one another
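The three-level example can be reproduced with a short sketch. The helper names are hypothetical, and the ratios the slide labels RR are computed here as case/control odds ratios relative to the unexposed, which is an assumption about how the slide's figures were obtained.

```python
def reclassify(counts, moves):
    """Move a fraction of each source level's TRUE count to a destination level."""
    out = dict(counts)
    for src, dst, frac in moves:
        shifted = counts[src] * frac
        out[src] -= shifted
        out[dst] += shifted
    return out


def ratios(cases, controls, ref="none"):
    """Case/control odds of each level relative to the reference level."""
    ref_odds = cases[ref] / controls[ref]
    return {k: (cases[k] / controls[k]) / ref_odds for k in cases}


cases = {"none": 100, "low": 200, "high": 600}
controls = {"none": 100, "low": 100, "high": 100}

one_way = [("high", "low", 0.4)]                         # 40% of high to low
two_way = [("high", "low", 0.2), ("low", "high", 0.2)]   # 20% each direction

# The same moves are applied to cases and controls (non-differential).
obs_one = ratios(reclassify(cases, one_way), reclassify(controls, one_way))
obs_two = ratios(reclassify(cases, two_way), reclassify(controls, two_way))
print({k: round(v, 1) for k, v in obs_one.items()})
print({k: round(v, 1) for k, v in obs_two.items()})
```

In the one-way case the estimate for the low category moves away from the null (from 2 to about 3.1) even though the errors are non-differential, because highly exposed subjects are mixed into it.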
Disease Misclassification
Ex. 1: Non-differential (1)
Truth
                  Exposed   Unexposed
cases             400       100
undiseased        600       900
total             1000      1000
risk              0.4       0.1
risk difference   0.3
risk ratio        4
Ex. 1: Non-differential Disease
Cohort, 90% = sensitivity, 90% = specificity

Exposed
                  Truth   correct   false   total
cases             400     360       60      420
undiseased        600     540       40      580
total                     900       100     1000
risk                      0.4       0.6     0.42

Unexposed
                  Truth   correct   false   total
cases             100     90        90      180
undiseased        900     810       10      820
total                     900       100     1000
risk                      0.1       0.9     0.18

risk difference   0.24
risk ratio        2.33
Ex. 3: Non-Differential Disease Misc
Cohort, 50% = sensitivity, 100% = specificity

Exposed
                  Truth   correct   false   total
cases             400     200       0       200
undiseased        600     600       200     800
total                     800       200     1000
risk                      0.25      0       0.2

Unexposed
                  Truth   correct   false   total
cases             100     50        0       50
undiseased        900     900       50      950
total                     950       50      1000
risk                      0.05      0       0.05

risk difference   0.15
risk ratio        4.00
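Equations (1)
$$\hat{I} = Se \cdot I + (1 - Sp)$$
$$\hat{I}_E = Se_E \cdot I_E + (1 - Sp_E)$$
$$\hat{I}_{\bar{E}} = Se_{\bar{E}} \cdot I_{\bar{E}} + (1 - Sp_{\bar{E}})$$
Here $\hat{I}$ is the observed risk and $I$ the true risk, in the exposed ($E$) and unexposed ($\bar{E}$). For a risk, the false-positive term strictly applies only to the truly undiseased, $(1 - Sp)(1 - I)$; the form shown is its rare-disease approximation.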
Equations (2)
If $(1 - Sp_E) = (1 - Sp_{\bar{E}}) = 0$ and $Se_E = Se_{\bar{E}} = Se$, then
$$\hat{I}_E = Se \cdot I_E$$
$$\hat{I}_{\bar{E}} = Se \cdot I_{\bar{E}}$$
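Exception #3 to the mantra:
Equations (3)
$$\widehat{RR} = \frac{\hat{I}_E}{\hat{I}_{\bar{E}}} = \frac{Se \cdot I_E}{Se \cdot I_{\bar{E}}} = \frac{I_E}{I_{\bar{E}}} = RR$$
$$\widehat{RD} = \hat{I}_E - \hat{I}_{\bar{E}} = Se \cdot (I_E - I_{\bar{E}}) = Se \cdot RD$$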
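A short sketch makes the exception concrete. It is illustrative only, with hypothetical function names, and it uses the exact expression for a risk, in which false positives arise only among the truly undiseased; the true risks of 0.4 and 0.1 are the ones used in the lecture's examples.

```python
def observed_risk(true_risk, se, sp):
    """Expected observed risk under disease misclassification."""
    return se * true_risk + (1 - sp) * (1 - true_risk)


def observed_measures(risk_exp, risk_unexp, se, sp):
    """Return (observed RR, observed RD) for non-differential Se and Sp."""
    r1 = observed_risk(risk_exp, se, sp)
    r0 = observed_risk(risk_unexp, se, sp)
    return r1 / r0, r1 - r0


rr_a, rd_a = observed_measures(0.4, 0.1, se=0.9, sp=0.9)   # imperfect specificity
rr_b, rd_b = observed_measures(0.4, 0.1, se=0.5, sp=1.0)   # perfect specificity
print(round(rr_a, 2), round(rd_a, 2))
print(round(rr_b, 2), round(rd_b, 2))
```

With perfect specificity the risk ratio is unchanged however poor the sensitivity, while the risk difference shrinks by the factor Se. With imperfect specificity both measures are biased towards the null.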
Ex. 3: Revisited
Cohort, 50% = sensitivity, 100% = specificity

Exposed
                  true    false   total
cases             200     0       200
undiseased        600     200     800
total             800     200     1000
risk              0.25    0       0.2

Unexposed
                  true    false   total
cases             50      0       50
undiseased        900     50      950
total             950     50      1000
risk              0.05    0       0.05

risk difference   0.15   = 0.5*(0.4-0.1)
risk ratio        4.00   = 0.4/0.1
Design         Se, Sp        RR     95% CI       UCL/LCL
truth          100%, 100%    4.0    3.27, 4.89   1.49
cohort         90%, 90%      2.33   2.01, 2.71   1.35
case-control   90%, 90%      2.33   1.90, 2.87   1.52
cohort         50%, 100%     4.0    2.97, 5.38   1.81
case-control   50%, 100%     4.0    2.80, 5.71   2.04
With imperfect Sp, the interval becomes narrower, but the RR is biased towards the null
With case-control, sampling of controls increases the width of the interval
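The truth and cohort rows of this table can be reproduced if one assumes the intervals are Wald intervals on the log risk ratio. That is an assumption, since the slides do not say how the intervals were computed, and the function name below is hypothetical. The case-control rows depend on how many controls were sampled, which is not given, so they are not attempted here.

```python
import math


def rr_wald_ci(cases_exp, n_exp, cases_unexp, n_unexp, z=1.96):
    """Risk ratio with a Wald interval on the log scale, plus the UCL/LCL ratio."""
    rr = (cases_exp / n_exp) / (cases_unexp / n_unexp)
    se_log = math.sqrt(1 / cases_exp - 1 / n_exp + 1 / cases_unexp - 1 / n_unexp)
    lower = rr * math.exp(-z * se_log)
    upper = rr * math.exp(z * se_log)
    return rr, lower, upper, upper / lower


# Observed case counts per 1000 exposed and 1000 unexposed in each scenario.
scenarios = {
    "truth": (400, 100),
    "cohort Se=90% Sp=90%": (420, 180),
    "cohort Se=50% Sp=100%": (200, 50),
}
for label, (a, c) in scenarios.items():
    rr, lo, hi, ratio = rr_wald_ci(a, 1000, c, 1000)
    print(f"{label}: RR {rr:.2f} ({lo:.2f}, {hi:.2f}), UCL/LCL {ratio:.2f}")
```

The ratio of upper to lower limit is a convenient summary of precision: false positives add observed cases and narrow the interval, while low sensitivity removes cases and widens it.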
Misclassification of a confounder

Non-differential misclassification of a confounder yields residual confounding
– The estimate of effect is biased away from the truth in the direction of the confounding

For weak effects, resources may be better spent accurately measuring a strong confounder than accurately measuring the index and reference conditions
Covariate misclassification

Truth (crude)
                  Exposed   Unexposed
cases             400       100
undiseased        600       900
total             1000      1000
risk              0.4       0.1
risk difference   0.3
risk ratio        4

Truth (stratified)
                  C+                      C-
                  Exposed   Unexposed     Exposed   Unexposed
cases             300       40            100       60
undiseased        400       260           200       640
total             700       300           300       700
risk              0.43      0.13          0.33      0.09
risk ratio        3.2                     3.9
SMR               3.4
RRc = 4 / 3.4 = 1.19
Covariate misclassification (3)
Confounder classified with 90% = sensitivity, 90% = specificity

Classified C+ (stratified, misclassified)
Exposed
                  True C+   Misc C+   total
cases             270       10        280
undiseased        360       20        380
total             630       30        660
risk              0.43      0.33      0.42
Unexposed
                  True C+   Misc C+   total
cases             36        6         42
undiseased        234       64        298
total             270       70        340
risk              0.13      0.09      0.12
risk ratio        3.43

Classified C- (stratified, misclassified)
Exposed
                  True C-   Misc C-   total
cases             90        30        120
undiseased        180       40        220
total             270       70        340
risk              0.33      0.43      0.35
Unexposed
                  True C-   Misc C-   total
cases             54        4         58
undiseased        576       26        602
total             630       30        660
risk              0.09      0.13      0.09
risk ratio        4.02
Covariate misclassification (4)
Crude RR = 4

Stratified, truth
                  men (C+)                women (C-)
                  Exposed   Unexposed     Exposed   Unexposed
cases             300       40            100       60
undiseased        400       260           200       640
total             700       300           300       700
risk              0.43      0.13          0.33      0.09
risk ratio        3.2                     3.9
SMR               3.4

Stratified, misclassified
                  men (C+)                women (C-)
                  Exposed   Unexposed     Exposed   Unexposed
cases             280       42            120       58
undiseased        380       298           220       602
total             660       340           340       660
risk              0.42      0.12          0.35      0.09
risk ratio        3.4                     4.0
SMR               3.6
RRc = 4 / 3.6 = 1.11

Crude RR = 4, Adjusted = 3.4, Misc adjusted = 3.6
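The residual confounding in this example can be reproduced with a short sketch. The function names are hypothetical, and the SMR is computed as observed exposed cases divided by the cases expected if the exposed had the stratum-specific risks of the unexposed, which is an assumption about how the slide's SMR was obtained.

```python
def smr(strata):
    """strata: (exposed cases, exposed total, unexposed cases, unexposed total) per level."""
    observed = sum(a for a, _, _, _ in strata)
    expected = sum(n1 * c / n0 for _, n1, c, n0 in strata)
    return observed / expected


def misclassify_confounder(pos, neg, se, sp):
    """Mix the true C+ and C- strata cell by cell, non-differentially."""
    classified_pos = tuple(se * p + (1 - sp) * n for p, n in zip(pos, neg))
    classified_neg = tuple((1 - se) * p + sp * n for p, n in zip(pos, neg))
    return classified_pos, classified_neg


true_pos = (300, 700, 40, 300)   # true C+ stratum
true_neg = (100, 300, 60, 700)   # true C- stratum
obs_pos, obs_neg = misclassify_confounder(true_pos, true_neg, se=0.9, sp=0.9)

smr_true = smr([true_pos, true_neg])
smr_misc = smr([obs_pos, obs_neg])
print(round(smr_true, 1), round(smr_misc, 1))
```

Adjusting for the misclassified confounder moves the estimate only part of the way from the crude value of 4 towards the correctly adjusted value, so some confounding remains.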
Exception #4: Non-differential misclassification and R(I)

truth
                  B                 A                 Crude
                  R       I         R       I         R       I
cases             20      90        40      40        60      130
undiseased        280     810       560     360       840     1170
total             300     900       600     400       900     1300
risk              0.067   0.100     0.067   0.100     0.067   0.100
risk difference   0.033             0.033             0.033
R(I)              0

E misclassified (90% sens., 100% spec)
                  B                 A                 Crude
                  R       I         R       I         R       I
cases             29      81        44      36        73      117
undiseased        361     729       596     324       957     1053
total             390     810       640     360       1030    1170
risk              0.074   0.100     0.069   0.100     0.071   0.100
risk difference   0.026             0.031             0.029
R(I)              0.006
Methodology/Principal Findings

2003 National Survey of Children's Health
– Parental report of whether child has ever been diagnosed with asthma by a physician was D
– Parental report of perception of neighborhood safety was E

In unadjusted models
– OR for reporting asthma associated with living in neighborhoods perceived sometimes/never safe was 1.36 (95% CI: 1.21, 1.53) vs. neighborhoods perceived always safe.

Adjusting for covariates attenuated OR
– OR 1.25, 95% CI 1.08, 1.43
Exception #5: Dependent errors
Exception #6: Dependent errors
Misclassification can be shown as
a structural problem
Non-differential, non-dependent
misclassification of A and D
Note that we study A* and Y*, so there is an unblocked backdoor path
Differential, non-dependent
misclassification of A and D
Non-differential, dependent
misclassification of A and D
Differential, dependent
misclassification of A and D
The NO-SHOTS Study

In kids with WHO-defined severe pneumonia, is
treatment failure at 48 hours when given oral
amoxicillin equivalent to injectable penicillin?






– Non-blinded, Equivalency RCT
– Among children aged 3-59 months
– Half in hospital, half at home
– 1,702 children randomized 1:1
– Equivalence defined as the 95% CI of the RD lying within +/- 5%

Results:
– RD: -0.4% (95% CI: -4.2% to 3.3%)
Baseline comparison between treatment groups

Parameter                                Hospital Care (N=1012)   Home Care (N=1025)
Male                                     602 (60%)                630 (62%)
Age
  3-11 months                            658 (65%)                653 (64%)
  12-59 months                           354 (35%)                372 (36%)
Breastfeeding                            654/873 (75%)            642/861 (75%)
History of Fever                         933 (92%)                937 (91%)
Difficulty breathing                     997 (99%)                978 (95%)
Vomiting                                 161 (16%)                103 (10%)
Diarrhea                                 104 (10%)                55 (5%)
Audible wheeze                           150 (15%)                111 (11%)
Antibiotics in previous 7 days           216 (21%)                167 (16%)
Up-to-date immunization status           870 (86%)                923 (90%)
Weight-for-age Z-score                   -1.0 (-2.1 to -0.0)      -0.9 (-1.9 to 0.1)
Positive urine antibacterial activity    151/425 (36%)            165/436 (38%)
Cumulative treatment failure (TF) by specific causes by Day 6 and relapse by Day 14

Cumulative TF by Day 6
Variable                      Inject. (N=1012)   Oral (N=1025)   RD (95% CI)
Total                         87 (8.6%)          77 (7.5%)       1.1% (-1.3-3.5)
Any danger sign               36 (3.6%)          20 (2.0%)       1.6% (0.2-3.0)
Hospitalization               46 (4.5%)          29 (2.8%)       1.7% (0.1-3.4)
Temp > 38°C/persistent LCI    32 (3.2%)          57 (5.6%)       -2.4% (-4.2-0.6)
New comorbid condition        6 (0.6%)           1 (0.1%)        0.5% (-0.0-1.0)

Relapse by Day 14
Variable                      Inject. (N=925)    Oral (N=948)    RD (95% CI)
Total                         31 (3.4%)          25 (2.6%)       0.7% (-0.8-2.3)
Any danger sign               3 (0.3%)           0 (0.0%)        0.3% (-0.0-0.7)
Hospitalization               1 (0.1%)           0 (0.0%)        0.1% (0.1-0.3)
Temp > 38°C/persistent LCI    10 (1.1%)          2 (0.2%)        0.9% (0.1-1.6)
New comorbid condition        3 (0.3%)           5 (0.5%)        -0.2% (-0.8-0.4)

Intention-to-treat
                              Inject.            Oral            RD (95% CI)
Cumulative TF by Day 6        105 (10.0%)        89 (8.5%)       1.6% (-0.9-4.0)
  (N=1048 vs. N=1052)
Relapse by Day 14             31 (3.3%)          26 (2.7%)       0.6% (-0.9-2.1)
  (N=943 vs. N=963)
Lancet Reviewer 1

There seems to be a selection bias and
possibility of failure of randomization to
this open labeled trial
Lancet Reviewer 3

The imbalance in baseline characteristics - Table 1
shows some alarming discrepancies - 16% vs 10%
vomiting, 10% vs 5% diarrhoea... These look odd for a
trial with 1000+ in each arm. The authors somewhat
opportunistically comment that the imbalances were in
covariates unrelated to the severity of pulmonary
disease – with respect, this misses the point. We need
to have reassurance that these differences were not
indicative of some failure of the randomisation process,
a failure which itself may be the symptom of a wider
malaise - specifically, bias in outcome assessment.
The Concern

Residual confounding
– Easy to address, but loses randomization

Unmeasured Confounding

Outcome misclassification
– Treatment failure for pneumonia is subjective
– If we did a bad job of assigning subjects, we could also have done a bad job of outcome ascertainment

Misclassification of outcome
– Unlikely in equivalency trial, but reasonable to suspect
https://sites.google.com/site/biasanalysis/
Corrected RR given Se and Sp
Rows: sensitivity of treatment failure in hospital arm. Columns: sensitivity of treatment failure in home arm.

Hospital \ Home   1      0.95   0.9    0.85   0.8    0.75   0.7
1                 0.87   0.92   0.97   1.03   1.09   1.17   1.25
0.95                     0.87   0.92   0.98   1.04   1.11   1.19
0.9                             0.87   0.93   0.98   1.05   1.12
0.85                                   0.87   0.93   0.99   1.06
0.8                                           0.87   0.93   1.00
0.75                                                 0.87   0.94
0.7                                                         0.87
0.65
0.6
Because failure rates were so low, it is unlikely that specificity of treatment
failure was a problem (i.e. falsely concluding a subject was a treatment
failure when they were not). More likely would be that some subjects who
were true treatment failures were classified as successes and this was
preferentially done in the home arm.
Outcome misclassification would have to be extreme (i.e. perfect sensitivity in the hospital arm and 0.7 in the home arm) to substantially alter the conclusions of the study. We consider this case unlikely, and note that if there were even slightly imperfect non-differential specificity, it would take an even lower sensitivity in the home arm to substantially bias the results.
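The corrected risk ratios can be sketched as follows, assuming perfect specificity in both arms so that each true risk equals the observed risk divided by that arm's sensitivity. The function name is hypothetical, and using the Day 6 counts of 77/1025 (oral, home arm) and 87/1012 (injectable, hospital arm) as inputs is an assumption, although those counts do appear to reproduce the tabulated values.

```python
def corrected_rr(obs_risk_home, obs_risk_hosp, se_home, se_hosp):
    """Home vs. hospital risk ratio after correcting each arm for its sensitivity.

    Assumes specificity is 1 in both arms (no false treatment failures).
    """
    return (obs_risk_home / se_home) / (obs_risk_hosp / se_hosp)


risk_home = 77 / 1025   # observed cumulative treatment failure, home arm
risk_hosp = 87 / 1012   # observed cumulative treatment failure, hospital arm

for se_hosp in (1.0, 0.9, 0.8):
    row = [round(corrected_rr(risk_home, risk_hosp, se_home, se_hosp), 2)
           for se_home in (1.0, 0.9, 0.8, 0.7) if se_home <= se_hosp]
    print(f"hospital Se {se_hosp}: {row}")
```

Only the ratio of the two sensitivities matters under this assumption, which is why every cell on the diagonal of the table equals the uncorrected estimate.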
Unmeasured Confounding

Prevalence of confounder in:    RR(confounder-treatment failure)
Home arm    Hospital arm        2      2.5    3      3.5    4      4.5
0.010       0.085               0.94   0.97   1.00   1.03   1.06   1.10
0.060       0.135               0.94   0.96   0.99   1.02   1.04   1.06
0.110       0.185               0.93   0.96   0.98   1.00   1.02   1.04
0.160       0.235               0.93   0.95   0.97   0.99   1.01   1.02
0.210       0.285               0.93   0.95   0.97   0.98   0.99   1.01
0.260       0.335               0.93   0.94   0.96   0.97   0.98   0.99
0.310       0.385               0.92   0.94   0.95   0.97   0.98   0.98
0.360       0.435               0.92   0.94   0.95   0.96   0.97   0.98
0.410       0.485               0.92   0.93   0.95   0.95   0.96   0.97
0.460       0.535               0.92   0.93   0.94   0.95   0.96   0.96
0.510       0.585               0.92   0.93   0.94   0.95   0.95   0.96
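The tabulated values are consistent with simple external adjustment for a binary unmeasured confounder, sketched below. The function name is hypothetical, and the choice of the Day 6 counts (77/1025 home, 87/1012 hospital) as the observed risk ratio is an assumption that appears to reproduce the table.

```python
def externally_adjusted_rr(rr_obs, p_home, p_hosp, rr_conf):
    """Adjust an observed RR for a binary confounder with assumed prevalences.

    p_home, p_hosp = confounder prevalence in each arm;
    rr_conf = risk ratio relating the confounder to treatment failure.
    """
    bias = (p_home * (rr_conf - 1) + 1) / (p_hosp * (rr_conf - 1) + 1)
    return rr_obs / bias


rr_obs = (77 / 1025) / (87 / 1012)   # observed home vs. hospital risk ratio

for rr_conf in (2.0, 3.0, 4.5):
    adjusted = externally_adjusted_rr(rr_obs, p_home=0.010, p_hosp=0.085,
                                      rr_conf=rr_conf)
    print(rr_conf, round(adjusted, 2))
```

Even with a confounder that is strongly related to treatment failure and several times more common in the hospital arm, the adjusted estimate stays close to 1, which is the argument the table makes.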
The result: The publication and change in policy
Web Appendix
Conclusion

Misclassification is common in research
– Impact can be great and affects precision

ND misclassification creates an EXPECTATION of bias towards null
– Numerous exceptions exist
– Still strive to make errors ND

Expectation of impact can be quantified
– Much better than mere speculation