Lecture 7: Misclassification
Matthew Fox
Advanced Epidemiology

Non-differential misclassification biases … What does that mean?
- How bad does the non-differential misclassification have to be?
- Could it ever go past the null, or only to it?
- We know what to do about measured confounding. What do we do about information bias (and selection bias and unmeasured confounders, for that matter)?

Last class
3 concepts of interaction
- Effect measure modification
  - If there is no EMM on one scale, there is often EMM on another
- Interdependence
  - Risk in the doubly exposed isn't explained by the sum of the two exposure effects alone
- Statistical interaction
  - In logistic regression = multiplicative interaction

Today: Misclassification
- Exposure misclassification
- Disease misclassification
- Rules on the impact of misclassifications
- Misclassification of covariates
- What to do about it?

The F test
Take 2 minutes and count all the Fs

The necessity of training farm hands for first class farms in the fatherly handling of farm livestock is foremost in the minds of effective farm owners. Since the forefathers of the farm owners trained the farm hands for first class farms in the fatherly handling of farm livestock, the farm owners feel they should carry on with the former family tradition of training farmhands of first class farms in the effective fatherly handling of farm live stock, however futile, because of their belief that it forms the basis of effective farm management efforts.
Answer: 48 or 49

Exposure: terms (1a)
Predictive values
- Truth is in the numerator
- E is the truth, T is the test
Positive predictive value
- Probability of being truly exposed, given a positive test
- Pr(E+|T+)
Negative predictive value
- Probability of being truly unexposed, given a negative test
- Pr(E-|T-)

Exposure: terms (1b)
Classification values
- Truth is the denominator
- E is the truth, T is the test
Sensitivity
- Probability of being correctly classified as E+
- Pr(T+|E+)
False negative
- Probability of being wrongly classified as unexposed
- 1 - sensitivity

Exposure: terms (1b)
Classification values
- Truth is the denominator
- E is the truth, T is the test
Specificity
- Probability of being correctly classified as E-
- Pr(T-|E-)
False positive
- Probability of being incorrectly classified as E+
- 1 - specificity

Relation between predictive and classification measures
With p the prevalence of true exposure:

$$PPV = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}$$

$$NPV = \frac{Sp\,(1 - p)}{(1 - Se)\,p + Sp\,(1 - p)}$$

When prevalence is 100%, the PPV is 1 and the NPV is 0. When prevalence is 0%, the NPV is 1 and the PPV is 0.
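The relation between predictive values and classification values can be checked numerically. The following is a minimal sketch of the two formulas above; the function name `predictive_values` and the example inputs are illustrative assumptions, not part of the lecture.

```python
def predictive_values(se, sp, p):
    """Return (PPV, NPV) given sensitivity se, specificity sp,
    and prevalence p of true exposure (all as proportions)."""
    true_pos = se * p                # truly exposed, test positive
    false_pos = (1 - sp) * (1 - p)   # truly unexposed, test positive
    true_neg = sp * (1 - p)          # truly unexposed, test negative
    false_neg = (1 - se) * p         # truly exposed, test negative
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative inputs (assumed): Se = Sp = 0.9 at several prevalences.
for prevalence in (1.0, 0.5, 0.0):
    print(prevalence, predictive_values(0.9, 0.9, prevalence))
```

At the two extremes the function returns the limits stated on the slide, provided Se and Sp are both below 1 so that neither denominator is zero.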
Exposure Misclassification

Exposure: terms (2)
Non-differential exposure misclassification
- Rates of E misclassification do not depend on D
  - Se of E classification is the same in the D+ and D-, AND
  - Sp of E classification is the same in the D+ and D-
- Non-differential misclassification of a dichotomous exposure creates an expected bias of effect estimates towards the null

Exposure: terms (3)
Differential exposure misclassification
- Rates of E misclassification do depend on D
  - Se of E classification is not the same in the D+ and D-, OR
  - Sp of E classification is not the same in the D+ and D-
- Differential exposure misclassification of a dichotomous exposure creates an unpredictable bias of the effect estimates

Exposure: terms (4)
Exposure misclassification
- Non-differential CANNOT explain a non-null result
- Differential CAN explain a non-null result
Exposure classification errors are inevitable
- Incomplete knowledge of dose, duration, induction
- Errors in interview, data coding, data entry
- Mistakes in inference
Strive to make the errors non-differential

Some misclassification is easily seen as structural: Recall bias
[Diagram: nodes E, D, ER]
E = alcohol use during pregnancy, D = birth defect, ER = measure of alcohol use after giving birth
[Diagram: nodes E, D, EM]
E = blood lipids, D = cancer, EM = measure of blood lipids after cancer occurs

Exposure Misclassification (5)
Subscript 1 denotes classification among the D+ and subscript 0 among the D-.

Truth
|       | X=1      | X=0      |
| D+    | A        | B        |
| D-    | C        | D        |
| Total | N1 (A+C) | N0 (B+D) |

Observation
|       | X=1                                   | X=0                                   |
| D+    | se1·A + (1-sp1)·B                     | sp1·B + (1-se1)·A                     |
| D-    | se0·C + (1-sp0)·D                     | sp0·D + (1-se0)·C                     |
| Total | se1·A + (1-sp1)·B + se0·C + (1-sp0)·D | sp1·B + (1-se1)·A + sp0·D + (1-se0)·C |

Non-differential misclassification requires that se0=se1=se and sp0=sp1=sp.
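The expected observed cells in the table above can be computed directly from the true cells. This is a minimal sketch under stated assumptions: the function name `expected_observed` and the example counts (40/60 among D+, 20/80 among D-) are chosen for illustration.

```python
def expected_observed(A, B, C, D, se1, sp1, se0, sp0):
    """Expected observed 2x2 cells from true cells.

    A, B: true exposed / unexposed counts among D+
    C, D: true exposed / unexposed counts among D-
    se1, sp1: sensitivity / specificity of exposure classification in D+
    se0, sp0: sensitivity / specificity of exposure classification in D-
    """
    a = se1 * A + (1 - sp1) * B   # D+, classified exposed
    b = sp1 * B + (1 - se1) * A   # D+, classified unexposed
    c = se0 * C + (1 - sp0) * D   # D-, classified exposed
    d = sp0 * D + (1 - se0) * C   # D-, classified unexposed
    return a, b, c, d


# Illustrative (assumed) true counts with non-differential Se = Sp = 0.9.
a, b, c, d = expected_observed(40, 60, 20, 80, 0.9, 0.9, 0.9, 0.9)
true_or = (40 * 80) / (60 * 20)
observed_or = (a * d) / (b * c)
print(round(true_or, 2), round(observed_or, 2))
```

Because misclassification only moves subjects between exposure columns, the row totals (a + b and c + d) equal the true totals; only the split between X=1 and X=0 changes.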
Exposure Misclassification (6)
Here D+ = a + b and D- = c + d are the total numbers of diseased and undiseased subjects.

Observation
|       | X=1      | X=0      |
| D+    | a        | b        |
| D-    | c        | d        |
| Total | n1 (a+c) | n0 (b+d) |

Truth
|       | X=1                                      | X=0      |
| D+    | A = [a - (1-sp1)·D+] / [se1 - (1-sp1)]   | D+ - A   |
| D-    | C = [c - (1-sp0)·D-] / [se0 - (1-sp0)]   | D- - C   |
| Total | N1 (A+C)                                 | N0 (B+D) |

- Given an observation and estimates of sensitivities and specificities, recalculate truth
- Obtain estimates of Se and Sp from literature, pilot studies, or a substudy with gold standard measurement
- Se and Sp are not necessarily non-differential

Ex. 1: Non-differential (1)

Truth
|                 | Exposed | Unexposed |
| cases           | 400     | 100       |
| undiseased      | 600     | 900       |
| total           | 1000    | 1000      |
| risk            | 0.4     | 0.1       |
| risk difference | 0.3     |           |
| risk ratio      | 4       |           |

Ex. 1: Non-differential (2)
The "false" column counts subjects who are truly in the other exposure group.

Exposed (90% = sensitivity)
|            | Truth | correct | false | total |
| cases      | 400   | 360     | 0     | 360   |
| undiseased | 600   | 540     | 0     | 540   |
| total      | 1000  | 900     | 0     | 900   |
| risk       | 0.4   |         |       | 0.4   |

Unexposed (100% = specificity)
|            | Truth | correct | false | total |
| cases      | 100   | 100     | 40    | 140   |
| undiseased | 900   | 900     | 60    | 960   |
| total      | 1000  | 1000    | 100   | 1100  |
| risk       | 0.1   |         | 0.4   | 0.13  |

risk difference 0.27
risk ratio 3.14

Exception #1 to the mantra
Misclassification is haphazard, not random
- Random implies intent, but these mistakes are made without intent, or haphazardly. We model haphazard mistakes as occurring at random.
Misclassification operates on individuals
- With some probability, ND misclassification may bias AWAY from the null
- But the EXPECTATION is towards the null

Example of Exception #1
Study truth as shown, Se=Sp=0.9 (non-differential)
- Expectation is as shown
Apply misclassification probabilities
- Apply probabilities to each individual
- Calculate RR
- Repeat 10,000 times
- Back-calculate truth given Se and Sp

|             | Exposure | Cases | Controls | RR  |
| Truth       | E+       | 40    | 20       | 2.7 |
|             | E-       | 60    | 80       |     |
| Expectation | E+       | 42    | 26       | 2.1 |
|             | E-       | 58    | 74       |     |

[Figure: distribution of observed OR across the repetitions, marking the regions towards the null, the truth, and away from the null]

Exception #2 example

Truth
|          | E-  | E(low) | E(high) |
| cases    | 100 | 200    | 600     |
| controls | 100 | 100    | 100     |
| RR       | 1   | 2      | 6       |

Misclassified (40% of high to low)
|          | E-  | E(low) | E(high) |
| cases    | 100 | 440    | 360     |
| controls | 100 | 140    | 60      |
| RR       | 1   | 3.1    | 6       |

Misclassified (20% of high to low, 20% of low to high)
|          | E-  | E(low) | E(high) |
| cases    | 100 | 280    | 520     |
| controls | 100 | 100    | 100     |
| RR       | 1   | 2.8    | 5.2     |

Exception #2 to the mantra
When exposure has more than two categories
- Bias from non-differential exposure misclassification for a given comparison may be AWAY from the null
- The estimates of effect within the categories will be biased towards one another

Disease Misclassification

Ex. 1: Non-differential (1)

Truth
|                 | Exposed | Unexposed |
| cases           | 400     | 100       |
| undiseased      | 600     | 900       |
| total           | 1000    | 1000      |
| risk            | 0.4     | 0.1       |
| risk difference | 0.3     |           |
| risk ratio      | 4       |           |

Ex. 1: Non-differential Disease, Cohort
90% = sensitivity, 90% = specificity

Exposed
|            | Truth | correct | false | total |
| cases      | 400   | 360     | 60    | 420   |
| undiseased | 600   | 540     | 40    | 580   |
| total      | 1000  | 900     | 100   | 1000  |
| risk       | 0.4   |         | 0.6   | 0.42  |

Unexposed
|            | Truth | correct | false | total |
| cases      | 100   | 90      | 90    | 180   |
| undiseased | 900   | 810     | 10    | 820   |
| total      | 1000  | 900     | 100   | 1000  |
| risk       | 0.1   |         | 0.9   | 0.18  |

risk difference 0.24
risk ratio 2.33

Ex. 3: Non-Differential Disease Misclassification, Cohort
50% = sensitivity, 100% = specificity

Exposed
|            | Truth | correct | false | total |
| cases      | 400   | 200     | 0     | 200   |
| undiseased | 600   | 600     | 200   | 800   |
| total      | 1000  | 800     | 200   | 1000  |
| risk       | 0.4   | 0.25    | 0     | 0.2   |

Unexposed
|            | Truth | correct | false | total |
| cases      | 100   | 50      | 0     | 50    |
| undiseased | 900   | 900     | 50    | 950   |
| total      | 1000  | 950     | 50    | 1000  |
| risk       | 0.1   | 0.05    | 0     | 0.05  |

risk difference 0.15
risk ratio 4.00

Equations (1)
With I the true risk and \hat{I} the observed risk:

$$\hat{I} = Se \cdot I + (1 - Sp)(1 - I)$$

$$\hat{I}_E = Se_E \cdot I_E + (1 - Sp_E)(1 - I_E)$$

$$\hat{I}_{\bar{E}} = Se_{\bar{E}} \cdot I_{\bar{E}} + (1 - Sp_{\bar{E}})(1 - I_{\bar{E}})$$

Equations (2)
If $(1 - Sp_E) = (1 - Sp_{\bar{E}}) = 0$ and $Se_E = Se_{\bar{E}} = Se$, then

$$\hat{I}_E = Se \cdot I_E, \qquad \hat{I}_{\bar{E}} = Se \cdot I_{\bar{E}}$$

Exception #3 to the mantra: Equations (3)

$$\widehat{RR} = \frac{\hat{I}_E}{\hat{I}_{\bar{E}}} = \frac{Se \cdot I_E}{Se \cdot I_{\bar{E}}} = \frac{I_E}{I_{\bar{E}}} = RR$$

$$\widehat{RD} = \hat{I}_E - \hat{I}_{\bar{E}} = Se\,(I_E - I_{\bar{E}}) = Se \cdot RD$$

Ex. 3: Revisited
Cohort, 50% = sensitivity, 100% = specificity

|            | Exposed: true | Exposed: false | Exposed: total | Unexposed: true | Unexposed: false | Unexposed: total |
| cases      | 200           | 0              | 200            | 50              | 0                | 50               |
| undiseased | 600           | 200            | 800            | 900             | 50               | 950              |
| total      | 800           | 200            | 1000           | 950             | 50               | 1000             |
| risk       | 0.25          | 0              | 0.2            | 0.05            | 0                | 0.05             |

risk difference 0.15 = 0.5*(0.4-0.1)
risk ratio 4.00 = 0.4/0.1

| Design       | Se, Sp     | RR   | 95% CI     | UCL/LCL |
| truth        | 100%, 100% | 4.0  | 3.27, 4.89 | 1.49    |
| cohort       | 90%, 90%   | 2.33 | 2.01, 2.71 | 1.35    |
| case-control | 90%, 90%   | 2.33 | 1.90, 2.87 | 1.52    |
| cohort       | 50%, 100%  | 4.0  | 2.97, 5.38 | 1.81    |
| case-control | 50%, 100%  | 4.0  | 2.80, 5.71 | 2.04    |

- With imperfect Sp, the interval becomes narrower, but the RR is biased to the null
- With case-control, sampling of controls increases the width of the interval

Misclassification of a confounder
Non-differential misclassification of a confounder yields residual confounding
- The estimate of effect is biased away from the truth in the direction of the confounding
For weak effects, resources may be better spent accurately measuring a strong confounder than accurately measuring the index and reference conditions

Covariate misclassification

Truth
|                 | Exposed | Unexposed |
| cases           | 400     | 100       |
| undiseased      | 600     | 900       |
| total           | 1000    | 1000      |
| risk            | 0.4     | 0.1       |
| risk difference | 0.3     |           |
| risk ratio      | 4       |           |

Stratified truth
|            | C+: Exposed | C+: Unexposed | C-: Exposed | C-: Unexposed |
| cases      | 300         | 40            | 100         | 60            |
| undiseased | 400         | 260           | 200         | 640           |
| total      | 700         | 300           | 300         | 700           |
| risk       | 0.43        | 0.13          | 0.33        | 0.09          |
| risk ratio | 3.2         |               | 3.9         |               |
| SMR        | 3.4         |               |             |               |

RRc = 4 / 3.4 = 1.19
Covariate misclassification (3)
Classification of C: 90% = sensitivity, 90% = specificity

C+ stratum, misclassified
|            | Exposed: True C+ | Exposed: Misc C+ | Exposed: total | Unexposed: True C+ | Unexposed: Misc C+ | Unexposed: total |
| cases      | 270              | 10               | 280            | 36                 | 6                  | 42               |
| undiseased | 360              | 20               | 380            | 234                | 64                 | 298              |
| total      | 630              | 30               | 660            | 270                | 70                 | 340              |
| risk       | 0.43             | 0.33             | 0.42           | 0.13               | 0.09               | 0.12             |

risk ratio 3.43

C- stratum, misclassified
|            | Exposed: True C- | Exposed: Misc C- | Exposed: total | Unexposed: True C- | Unexposed: Misc C- | Unexposed: total |
| cases      | 90               | 30               | 120            | 54                 | 4                  | 58               |
| undiseased | 180              | 40               | 220            | 576                | 26                 | 602              |
| total      | 270              | 70               | 340            | 630                | 30                 | 660              |
| risk       | 0.33             | 0.43             | 0.35           | 0.09               | 0.13               | 0.09             |

risk ratio 4.02

Covariate misclassification (4)
Crude RR = 4

Stratified truth
|            | men: Exposed | men: Unexposed | women: Exposed | women: Unexposed |
| cases      | 300          | 40             | 100            | 60               |
| undiseased | 400          | 260            | 200            | 640              |
| total      | 700          | 300            | 300            | 700              |
| risk       | 0.43         | 0.13           | 0.33           | 0.09             |
| risk ratio | 3.2          |                | 3.9            |                  |
| SMR        | 3.4          |                |                |                  |

Stratified misclassified
|            | men: Exposed | men: Unexposed | women: Exposed | women: Unexposed |
| cases      | 280          | 42             | 120            | 58               |
| undiseased | 380          | 298            | 220            | 602              |
| total      | 660          | 340            | 340            | 660              |
| risk       | 0.42         | 0.12           | 0.35           | 0.09             |
| risk ratio | 3.4          |                | 4.0            |                  |
| SMR        | 3.6          |                |                |                  |

RRc = 4 / 3.6 = 1.11
Crude RR = 4, Adjusted = 3.4, Misc adjusted = 3.6

Exception #4: Non-differential misclassification and R(I)

Truth
|                 | B: R  | B: I  | A: R  | A: I  | Crude: R | Crude: I |
| cases           | 20    | 90    | 40    | 40    | 60       | 130      |
| undiseased      | 280   | 810   | 560   | 360   | 840      | 1170     |
| total           | 300   | 900   | 600   | 400   | 900      | 1300     |
| risk            | 0.067 | 0.100 | 0.067 | 0.100 | 0.067    | 0.100    |
| risk difference | 0.033 |       | 0.033 |       | 0.033    |          |
| R(I)            | 0     |       |       |       |          |          |

E misclassified (90% sens., 100% spec.)
|                 | B: R  | B: I  | A: R  | A: I  | Crude: R | Crude: I |
| cases           | 29    | 81    | 44    | 36    | 73       | 117      |
| undiseased      | 361   | 729   | 596   | 324   | 957      | 1053     |
| total           | 390   | 810   | 640   | 360   | 1030     | 1170     |
| risk            | 0.074 | 0.100 | 0.069 | 0.100 | 0.071    | 0.100    |
| risk difference | 0.026 |       | 0.031 |       | 0.029    |          |
| R(I)            | 0.006 |       |       |       |          |          |

Methodology/Principal Findings
2003 National Survey of Children's Health
- Parental report of whether the child has ever been diagnosed with asthma by a physician was D
- Parental report of perception of neighborhood safety was E
In unadjusted models
- OR for reporting asthma associated with living in neighborhoods perceived as sometimes/never safe was 1.36 (95% CI: 1.21, 1.53) vs. neighborhoods perceived as always safe
Adjusting for covariates attenuated the OR
- OR 1.25, 95% CI 1.08, 1.43

Exception #5: Dependent errors
Exception #6: Dependent errors

Misclassification can be shown as a structural problem
- Non-differential, non-dependent misclassification of A and D
  - Note that we study A* and Y*, so there is an unblocked backdoor path
- Differential, non-dependent misclassification of A and D
- Non-differential, dependent misclassification of A and D
- Differential, dependent misclassification of A and D

The NO-SHOTS Study
In kids with WHO-defined severe pneumonia, is treatment failure at 48 hours when given oral amoxicillin equivalent to injectable penicillin?
- Non-blinded, equivalency RCT
- Among children aged 3-59 months
- Half in hospital, half at home
- 1,702 children randomized 1:1
- Equivalence defined as the RD 95% CI falling within +/- 5%
- Results: RD: -0.4% (95% CI: -4.2% to 3.3%)

Baseline comparison between treatment groups
| Parameter                             | Hospital Care (N=1012) | Home Care (N=1025) |
| Male                                  | 602 (60%)              | 630 (62%)          |
| Age 3-11 months                       | 658 (65%)              | 653 (64%)          |
| Age 12-59 months                      | 354 (35%)              | 372 (36%)          |
| Breastfeeding                         | 654/873 (75%)          | 642/861 (75%)      |
| History of fever                      | 933 (92%)              | 937 (91%)          |
| Difficulty breathing                  | 997 (99%)              | 978 (95%)          |
| Vomiting                              | 161 (16%)              | 103 (10%)          |
| Diarrhea                              | 104 (10%)              | 55 (5%)            |
| Audible wheeze                        | 150 (15%)              | 111 (11%)          |
| Antibiotics in previous 7 days        | 216 (21%)              | 167 (16%)          |
| Up-to-date immunization status        | 870 (86%)              | 923 (90%)          |
| Weight-for-age Z-score                | -1.0 (-2.1 to -0.0)    | -0.9 (-1.9 to 0.1) |
| Positive urine antibacterial activity | 151/425 (36%)          | 165/436 (38%)      |

Cumulative treatment failure (TF) by specific causes by Day 6 and relapse by Day 14
| Variable                   | TF by Day 6: Inject. (N=1012) | TF by Day 6: Oral (N=1025) | RD (95% CI)       | Relapse by Day 14: Inject. (N=925) | Relapse by Day 14: Oral (N=948) | RD (95% CI)       |
| Total                      | 87 (8.6%)                     | 77 (7.5%)                  | 1.1% (-1.3-3.5)   | 31 (3.4%)                          | 25 (2.6%)                       | 0.7% (-0.8-2.3)   |
| Any danger sign            | 36 (3.6%)                     | 20 (2.0%)                  | 1.6% (0.2-3.0)    | 3 (0.3%)                           | 0 (0.0%)                        | 0.3% (-0.0-0.7)   |
| Hospitalization            | 46 (4.5%)                     | 29 (2.8%)                  | 1.7% (0.1-3.4)    | 1 (0.1%)                           | 0 (0.0%)                        | 0.1% (0.1-0.3)    |
| Temp > 38°C/persistent LCI | 32 (3.2%)                     | 57 (5.6%)                  | -2.4% (-4.2-0.6)  | 10 (1.1%)                          | 2 (0.2%)                        | 0.9% (0.1-1.6)    |
| New comorbid condition     | 6 (0.6%)                      | 1 (0.1%)                   | 0.5% (-0.0-1.0)   | 3 (0.3%)                           | 5 (0.5%)                        | -0.2% (-0.8-0.4)  |

| Variable           | TF by Day 6: Inject. (N=1048) | TF by Day 6: Oral (N=1052) | RD (95% CI)     | Relapse by Day 14: Inject. (N=943) | Relapse by Day 14: Oral (N=963) | RD (95% CI)     |
| Intention-to-treat | 105 (10.0%)                   | 89 (8.5%)                  | 1.6% (-0.9-4.0) | 31 (3.3%)                          | 26 (2.7%)                       | 0.6% (-0.9-2.1) |

Lancet Reviewer 1
There seems to be a selection bias and possibility of failure of randomization to this open labeled trial

Lancet Reviewer 3
The imbalance in baseline characteristics - Table 1 shows some alarming discrepancies - 16% vs 10% vomiting, 10% vs 5% diarrhoea...
These look odd for a trial with 1000+ in each arm. The authors somewhat opportunistically comment that the imbalances were in covariates unrelated to the severity of pulmonary disease – with respect, this misses the point. We need to have reassurance that these differences were not indicative of some failure of the randomisation process, a failure which itself may be the symptom of a wider malaise - specifically, bias in outcome assessment.

The Concern
- Residual confounding
  - Easy to address, but loses randomization
- Unmeasured confounding
- Outcome misclassification
  - Treatment failure for pneumonia is subjective
  - If we did a bad job of assigning subjects, we could also have done a bad job of outcome ascertainment
  - Misclassification of outcome is unlikely in an equivalency trial, but reasonable to suspect

https://sites.google.com/site/biasanalysis/

Corrected RR given Se and Sp
Rows: sensitivity of treatment failure in the hospital arm. Columns: sensitivity of treatment failure in the home arm.

| Hospital Se \ Home Se | 1    | 0.95 | 0.9  | 0.85 | 0.8  | 0.75 | 0.7  |
| 1                     | 0.87 | 0.92 | 0.97 | 1.03 | 1.09 | 1.17 | 1.25 |
| 0.95                  |      | 0.87 | 0.92 | 0.98 | 1.04 | 1.11 | 1.19 |
| 0.9                   |      |      | 0.87 | 0.93 | 0.98 | 1.05 | 1.12 |
| 0.85                  |      |      |      | 0.87 | 0.93 | 0.99 | 1.06 |
| 0.8                   |      |      |      |      | 0.87 | 0.93 | 1.00 |
| 0.75                  |      |      |      |      |      | 0.87 | 0.94 |
| 0.7                   |      |      |      |      |      |      | 0.87 |
| 0.65                  |      |      |      |      |      |      |      |
| 0.6                   |      |      |      |      |      |      |      |

Because failure rates were so low, it is unlikely that specificity of treatment failure was a problem (i.e., falsely concluding a subject was a treatment failure when they were not). More likely would be that some subjects who were true treatment failures were classified as successes, and that this was preferentially done in the home arm.

Outcome misclassification would have to be extreme (i.e., perfect sensitivity in the hospital arm and 0.7 in the home arm) to substantially alter the conclusions of the study. We consider this case unlikely, and note that if there was even slightly imperfect non-differential specificity, it would take an even lower sensitivity in the home arm to substantially bias the results.

Unmeasured Confounding
Corrected RR by prevalence of the confounder in each arm and RR(confounder-treatment failure)

| Prevalence: Home arm | Prevalence: Hospital arm | RR=2 | RR=2.5 | RR=3 | RR=3.5 | RR=4 | RR=4.5 |
| 0.010                | 0.085                    | 0.94 | 0.97   | 1.00 | 1.03   | 1.06 | 1.10   |
| 0.060                | 0.135                    | 0.94 | 0.96   | 0.99 | 1.02   | 1.04 | 1.06   |
| 0.110                | 0.185                    | 0.93 | 0.96   | 0.98 | 1.00   | 1.02 | 1.04   |
| 0.160                | 0.235                    | 0.93 | 0.95   | 0.97 | 0.99   | 1.01 | 1.02   |
| 0.210                | 0.285                    | 0.93 | 0.95   | 0.97 | 0.98   | 0.99 | 1.01   |
| 0.260                | 0.335                    | 0.93 | 0.94   | 0.96 | 0.97   | 0.98 | 0.99   |
| 0.310                | 0.385                    | 0.92 | 0.94   | 0.95 | 0.97   | 0.98 | 0.98   |
| 0.360                | 0.435                    | 0.92 | 0.94   | 0.95 | 0.96   | 0.97 | 0.98   |
| 0.410                | 0.485                    | 0.92 | 0.93   | 0.95 | 0.95   | 0.96 | 0.97   |
| 0.460                | 0.535                    | 0.92 | 0.93   | 0.94 | 0.95   | 0.96 | 0.96   |
| 0.510                | 0.585                    | 0.92 | 0.93   | 0.94 | 0.95   | 0.95 | 0.96   |

The result: The publication and change in policy
Web Appendix

Conclusion
Misclassification is common in research
- ND misclassification creates an EXPECTATION of bias towards the null
- Impact can be great and affects precision
- Numerous exceptions exist
Still strive to make errors ND
Expectation of impact can be quantified
- Much better than mere speculation
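As an illustration of how the expected impact can be quantified, the corrected-RR grid from the NO-SHOTS outcome misclassification analysis can be approximated in a few lines. This is a sketch under stated assumptions, not the spreadsheet from the bias analysis site: it assumes specificity is 1 in both arms, so each arm's corrected risk is its observed risk divided by that arm's sensitivity, and it uses the Day 6 treatment failure counts from the trial table (77/1025 oral at home, 87/1012 injectable in hospital). The function name `corrected_rr` is hypothetical.

```python
def corrected_rr(cases_home, n_home, cases_hosp, n_hosp, se_home, se_hosp):
    """Risk ratio (home vs. hospital) after correcting each arm's observed
    risk for imperfect sensitivity, assuming specificity = 1 in both arms."""
    risk_home = (cases_home / n_home) / se_home
    risk_hosp = (cases_hosp / n_hosp) / se_hosp
    return risk_home / risk_hosp


# Sensitivities explored in the grid (home Se no higher than hospital Se).
sensitivities = [1.0, 0.95, 0.9, 0.85, 0.8, 0.75, 0.7]
for se_hosp in sensitivities:
    row = [corrected_rr(77, 1025, 87, 1012, se_home, se_hosp)
           for se_home in sensitivities if se_home <= se_hosp]
    print(se_hosp, [round(rr, 2) for rr in row])
```

With equal sensitivity in both arms the correction cancels and the RR stays at its observed value; it moves only when sensitivity is assumed lower in the home arm than in the hospital arm.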