Using multiple imputation and delta adjustment to implement sensitivity analyses for time-to-event data

Michael O’Kelly, Quintiles
Ilya Lipkovich, Quintiles

Copyright © 2013 Quintiles

Acknowledgements
• DIA Scientific Working Group (SWG) for Missing Data
> This presentation stems from work with Bohdana Ratitch (inVentiv Health).
> The authors of these slides are members of the SWG.
> Chair: Craig Mallinckrodt, Eli Lilly.
> James Roger and Mouna Akacha, speakers at this session, are also members.
> Great downloadable SAS macros for control-based imputation and other MNAR approaches are available at the SWG webpage, www.missingdata.org.uk.
> SWG members have a growing interest in discrete endpoints with missing data.
• Gary Koch (University of North Carolina)
> regular advice;
> in press, with Zhao and others: describes the approach used in this presentation.
• Taylor and others (2002) showed how to implement multiple imputation for time-to-event outcomes.
• Michael Hughes (Harvard School of Public Health) kindly shared the example time-to-event data.

ACTG 175: HIV study*
• Subjects were randomized to four antiretroviral regimens in equal proportions.
• Primary event analysed: 50% decline in CD4 count or death.
• Study start Dec 1991; enrolment ended Oct 1992; follow-up until end Nov 1994
> max follow-up just 4 years.
• For this presentation, we examine two treatment arms
> zidovudine
> zidovudine+didanosine.
* Lu and Tsiatis (2008); Hammer et al. (1996); the analyses are by O’Kelly and are not the responsibility of the authors of the cited papers.

ACTG 175: HIV study*
                              Zidovudine   Zidovudine+Didanosine
Enrolled                      619          613
Event: 50% decline in CD4     182          98
Censored                      437          515
  Completed study             313          384
  Other reasons               124          131

ACTG 175: HIV study (figure)

Kaplan-Meier analysis
Logrank statistic   46.12
Standard error      7.726
p-value             <0.0001
Assumes censoring at random (CAR). (CAR is analogous to missing at random.)

How robust is this result?
• How robust is this result to the assumption of CAR?
• One way to assess this: tipping point analysis.
• Tipping point for a continuous variable:
> Add an unfavourable quantity δ to the efficacy score when it is imputed for the experimental arm;
> Make δ more extreme until the p-value from the primary analysis is no longer significant: this is the “tipping point”.
> Was the “tipping point” δ clinically plausible for subjects who withdrew early?
> If not, the primary result may be judged robust to the missing-at-random assumption.

Tipping point for time to event, Kaplan-Meier (KM) analysis
• Impute time of event using some hazard worse by δ than that estimated by Kaplan-Meier.
• Make δ more extreme until the p-value from the primary analysis is no longer significant: this is the “tipping point”.
> Was the “tipping point” δ clinically plausible for subjects who withdrew early?
> If not, the primary result may be judged robust to the CAR assumption.
• Note the non-statistical terminology in the following slides:
> “p(no event)” = p(T>t)
> “p(event)” = p(T<=t)

How to make p(event) worse than KM in a statistically principled way?
• Inversion method
• Case 1: assuming CAR
> The event time of a censored subject is missing. To impute, first calculate p(no event) associated with the time of censoring. Interpolate between events, if necessary.
> p(event) = 1 - p(no event)
> Imputed p(event | T>t) = U(1 - p(no event), 1)

ACTG 175: HIV study (figure sequence)
• Impute event for this censored subject.
• Imputed p(no event) = 1 - U(1 - p(no event), 1)
• Imputed time of event, case 1
• Imputed time of event, case 2
• Case 3: imputation results in censoring

How to make p(event) worse than KM in a statistically principled way?
• Inversion method
• Case 2: assuming CAR + some δ
> The event time of a censored subject is missing. To impute, first calculate p(no event) associated with the time of censoring. Interpolate between events, if necessary.
> p(event) = 1 - p(no event)
> Imputed p(event | T>t) = U(1 - p(no event)^δ, 1)

ACTG 175: HIV study (figure sequence)
• p(no event)
• Reference line for p(no event)^δ, δ = 2
• Imputed time of event, δ = 2: 1 - U(1 - p(no event)^δ, 1)
• Imputed time of event, no δ: 1 - U(1 - p(no event), 1)
• Imputed event times tend to be shorter as δ increases.
• Note: this is just single imputation!
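
The inversion method above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' SAS implementation: the function names (`km_curve`, `surv_at`, `time_at`, `impute`) and the toy data are hypothetical, the Kaplan-Meier curve is linearly interpolated between event times, and one reading of the slides is assumed, namely that the drawn probability is inverted on the δ-adjusted curve p(no event)^δ. A draw that falls below the end of the curve is treated as censored at the maximum follow-up (case 3).

```python
import bisect
import random

def km_curve(times, events):
    """Kaplan-Meier estimate as two lists (event times, S(t)), starting at (0, 1)."""
    grid_t, grid_s, surv = [0.0], [1.0], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if e and x == t)
        surv *= 1.0 - n_events / at_risk
        grid_t.append(float(t))
        grid_s.append(surv)
    return grid_t, grid_s

def surv_at(t, grid_t, grid_s):
    """p(no event) at time t, linearly interpolated between event times."""
    if t >= grid_t[-1]:
        return grid_s[-1]
    i = bisect.bisect_right(grid_t, t)  # grid_t[i-1] <= t < grid_t[i]
    frac = (t - grid_t[i - 1]) / (grid_t[i] - grid_t[i - 1])
    return grid_s[i - 1] + frac * (grid_s[i] - grid_s[i - 1])

def time_at(s, grid_t, grid_s):
    """Invert the interpolated curve: time with S(t) = s, or None if s is below the curve."""
    if s < grid_s[-1]:
        return None
    for i in range(1, len(grid_s)):
        if s >= grid_s[i]:  # first segment whose lower end is at or below s
            frac = (grid_s[i - 1] - s) / (grid_s[i - 1] - grid_s[i])
            return grid_t[i - 1] + frac * (grid_t[i] - grid_t[i - 1])
    return grid_t[0]

def impute(c, grid_t, grid_s, delta=1.0, max_follow_up=None, rng=random):
    """Impute (time, event_flag) for a subject censored at time c."""
    s_c = surv_at(c, grid_t, grid_s)
    # Imputed p(event | T > c) = U(1 - p(no event)^delta, 1); 1 - U is p(no event).
    u = 1.0 - rng.uniform(1.0 - s_c ** delta, 1.0)
    # Assumption: invert on the delta-adjusted curve, i.e. solve S(t)^delta = u.
    t = time_at(u ** (1.0 / delta), grid_t, grid_s) if u > 0 else None
    if t is None:  # draw falls beyond the observed curve: subject stays censored
        return (max_follow_up if max_follow_up is not None else c), False
    return max(t, c), True

# Toy data (not the ACTG 175 data): follow-up times and event indicators.
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
events = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
grid_t, grid_s = km_curve(times, events)
rng = random.Random(2013)
print(impute(3, grid_t, grid_s, delta=1.0, max_follow_up=10, rng=rng))
print(impute(3, grid_t, grid_s, delta=2.0, max_follow_up=10, rng=rng))
```

With δ = 1 this draws from the CAR conditional distribution given T > c; larger δ pushes the draws towards earlier event times. Each call produces a single imputation only.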

How to use multiple imputation here?
• Bootstrap the original data set.
• Calculate p(no event)^δ associated with the time of censoring, using the bootstrap KM estimates of p(no event).
• Use inversion to find the corresponding time on the original data set.

ACTG 175: HIV study (figure sequence)
• Bootstrapped data sets #1 to #4
• Bootstrap approximates the variability of draws from the posterior distribution needed for MI: p(no event) = 0.958, 0.947, 0.952, 0.950 across the bootstrapped data sets.
• Imputations, 1 - U(1 - p(no event), 1), include variability from U() and from the differences in bootstrapped data sets.
• Imputed p(no event) is applied to the original data set.
• With δ: 1 - U(1 - p(no event)^δ, 1). Imputed p(no event) is applied to the original data set, with δ applied.
• Sample imputations with and without δ: 1 - U(1 - p(no event), 1) and 1 - U(1 - p(no event)^δ, 1).

Result of tipping point analysis for HIV study
δ      Logrank statistic*   Standard error+   p-value
1      5.58                 1.031             <0.0001
1.5    5.17                 1.057             <0.0001
2      4.82                 1.102             <0.0001
2.5    4.50                 1.090             <0.0001
3      4.06                 1.121             0.0003
3.5    3.68                 1.108             0.0010
4      3.53                 1.131             0.0019
4.5    3.27                 1.211             0.0076
5      2.90                 1.130             0.0105
5.5    2.54                 1.233             0.0413
6      2.35                 1.199             0.0516
6.5    2.17                 1.183             0.0674
7      1.99                 1.210             0.1019
* chi-squared statistic transformed to normal using the Wilson-Hilferty transformation
+ transformed statistic has variance = 1; standard error includes between-imputation variability

What if the primary analysis is Cox proportional hazards or parametric?
• Implementation of the MI version of Cox proportional hazards is similar to that of KM.
• Other implementations of MI for time-to-event analysis are in progress by Lipkovich and Ratitch:
> logistic regression (suggested by Carpenter and Kenward (2013));
> piecewise exponential.
• SAS macros for all four approaches are planned to be available at the DIA SWG web page at www.missingdata.org.uk.
> tasks undertaken as part of the DIA SWG “New Tools” subgroup.
• The above methods can also be used to implement “control-based imputation” for missing time-to-event outcomes.

References
• Carpenter J and Kenward M (2013) Multiple imputation and its application. Chichester: Wiley.
• Hammer S, Katzenstein D, Hughes M, Gundaker H, Schooley R, Haubrich R, Henry W, Lederman M, Phair J, Niu M, Hirsch M, and Merigan T, for the AIDS Clinical Trials Group Study 175 Study Team (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 counts from 200 to 500 per cubic millimeter. The New England Journal of Medicine 335 1081-1089.
• Lu X and Tsiatis A (2008) Improving the efficiency of the log-rank test using auxiliary covariates. Biometrika 95 679-694.
• Taylor J, Murray S, and Hsu C-H (2002) Survival estimation and testing via multiple imputation. Statistics and Probability Letters 6 77-91.
• Zhao Y, Herring A, Zhou H, Ali M, and Koch G (submitted) A multiple imputation method for sensitivity analyses of time-to-event data with possibly informative censoring.

Questions?
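
Backup: the footnotes to the tipping point results table mention a Wilson-Hilferty transformation and a standard error that includes between-imputation variability. A minimal sketch of that combination step follows, assuming Rubin's rules are applied to the transformed statistics with within-imputation variance fixed at 1. The function names and the five chi-squared values are hypothetical; only the δ and p-value columns come from the table.

```python
import math
import statistics

def wilson_hilferty(chi2, df=1):
    """Approximate standard-normal score for a chi-squared statistic."""
    k = 2.0 / (9.0 * df)
    return ((chi2 / df) ** (1.0 / 3.0) - (1.0 - k)) / math.sqrt(k)

def combine(chi2_stats, df=1):
    """Combine per-imputation statistics; returns (estimate, standard error)."""
    z = [wilson_hilferty(x, df) for x in chi2_stats]
    between = statistics.variance(z)               # between-imputation variance
    total = 1.0 + (1.0 + 1.0 / len(z)) * between   # within-imputation variance is 1
    return statistics.mean(z), math.sqrt(total)

def tipping_point(deltas, p_values, alpha=0.05):
    """Smallest delta whose p-value is no longer below alpha, or None."""
    for d, p in zip(deltas, p_values):
        if p >= alpha:
            return d
    return None

# Hypothetical logrank chi-squared statistics from five imputed data sets.
est, se = combine([30.1, 27.4, 33.0, 29.2, 25.8])
print(round(est, 2), round(se, 3))

# delta and p-value columns from the results table; "<0.0001" entered as 0.0001.
deltas = [1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7]
p_values = [0.0001, 0.0001, 0.0001, 0.0001, 0.0003, 0.0010, 0.0019,
            0.0076, 0.0105, 0.0413, 0.0516, 0.0674, 0.1019]
print(tipping_point(deltas, p_values))  # prints 6
```

At the conventional 0.05 level, the table's p-values first reach non-significance at δ = 6. A p-value for the combined statistic would normally use a t reference distribution, which is omitted here to keep the sketch free of third-party dependencies.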