An Analytic Road Map for Incomplete Longitudinal Clinical Trial Data

Craig Mallinckrodt
Graybill Conference
June 12, 2008
Fort Collins, CO

Acknowledgements
• PhRMA Expert Team on Missing Data
  – Peter Lane, GSK
  – Craig Mallinckrodt, Lilly
  – James Mancuso, Pfizer
  – Yahong Peng, Merck
  – Dan Schnell, P&G
• Geert Molenberghs
• Ray Carroll
• Many Lilly colleagues

Outline
• Why do we care
• What do we know
• Theory
• Application
• What we should do

Medical Needs
Every hour we expect
• 195 deaths due to cancer
• 1950 new diagnoses of anxiety disorders
• 15 new diagnoses of schizophrenia
• 30 osteoporosis-related hip fractures
• 1500 surgeries requiring pain treatment
• 70 deaths due to cardiovascular disease
Alan Breier, Nov 2006

Need for More Effective Medicines

Therapeutic area           Efficacy rate (%)
Alzheimer's                30
Analgesics (Cox-2)         80
Asthma                     60
Cardiac arrhythmias        60
Depression (SSRI)          62
Diabetes                   57
HCV                        47
Incontinence               40
Migraine (acute)           52
Migraine (prophylaxis)     50
Oncology                   25
Osteoporosis               48
Rheumatoid arthritis       50
Schizophrenia              60

There is an efficacy gap between customer expectations and the drugs we prescribe.
Trends in Molecular Medicine 7(5):201-204, 2001

R&D Productivity Decreasing
[Figure: industry R&D expense ($ billions, axis $0 to $50) and annual NME and biologics approvals (axis 0 to 200), by year, 1980 to 2007]
Source: PhRMA, FDA, Lehman Brothers; [Dr.
Robert Ruffolo]

Outline
• Why do we care
• What do we know
• Theory
• Application
• What we should do

Starting Point
• No universally best method for analyzing longitudinal data
• Analysis must be tailored to the specific situation at hand
• Consider the hypothesis to be tested, the desired attributes of the analysis, and the characteristics of the data

Missing Data Mechanisms
• MCAR - missing completely at random
  – Conditional on the independent variables in the model, neither observed nor unobserved outcomes of the dependent variable explain dropout
• MAR - missing at random
  – Conditional on the independent variables in the model, observed outcomes of the dependent variable explain dropout, but unobserved outcomes do not
• MNAR - missing not at random
  – Conditional on the independent variables in the model and the observed outcomes of the dependent variable, the unobserved outcomes of the dependent variable explain dropout

Consequences
• The missing data mechanism is a characteristic of the data AND the model
• Differential dropout by treatment indicates covariate dependence, not mechanism
• The mechanism can vary from one outcome to another in the same dataset

Missing Data in Clinical Trials
• Efficacy data in clinical trials are seldom MCAR because the observed outcomes typically influence dropout (e.g., discontinuation (DC) for lack of efficacy)
• Trials are designed to observe all the relevant information, which minimizes MNAR data
• Hence, in the highly controlled scenario of clinical trials, missing data may be mostly MAR
• MNAR can never be ruled out

Implications
• All analyses rely on missing data assumptions
• Any options in the trial design to minimize dropout should be strongly considered

Assumptions
• ANOVA with BOCF / LOCF assumes
  – MCAR and a constant profile after dropout
• MAR is always more plausible than MCAR
• MAR methods will be valid in every case where BOCF / LOCF is valid
• BOCF / LOCF will not be valid in every scenario where MAR methods are valid

Research Showing MAR Is Useful And / Or Better Than LOCF
1. Arch. Gen. Psych. 50: 739-750.
2. Arch. Gen. Psych. 61: 310-317.
3. Biol. Psychiatry. 53: 754-760.
4. Biol. Psychiatry. 59: 1001-1005.
5. Biometrics. 52: 1324-1333.
6. Biometrics. 57: 43-50.
7. Biostatistics. 5: 445-464.
8. BMC Psychiatry. 4: 26-31.
9. Clinical Trials. 1: 477-489.
10. Computational Statistics and Data Analysis. 37: 93-113.
11. Drug Information J. 35: 1215-1225.
12. J. Biopharm. Stat. 8: 545-563.
13. J. Biopharm. Stat. 11: 9-21.
14. J. Biopharm. Stat. 12: 207-212.
15. J. Biopharm. Stat. 13: 179-190.
16. J. Biopharm. Stat. 16: 365-384.
17. Neuropsychopharmacol. 6: 39-48.
18. Obesity Reviews. 4: 175-184.
19. Pharmaceutical Statistics. 3: 161-170.
20. Pharmaceutical Statistics. 3: 171-186.
21. Pharmaceutical Statistics. 4: 267-285.
22. Pharmaceutical Statistics (2007 early view) DOI: 10.1002/pst.267
23. Statist. Med. 11: 2043-2061.
24. Statist. Med. 14: 1913-1925.
25. Statist. Med. 22: 2429-2441.

Why Is LOCF Still Popular
• LOCF perceived to be conservative
• Concern over how MAR methods perform under MNAR
• More explicit modeling choices needed in MAR methods
• LOCF thought to measure something more valuable

Conservatism Of LOCF
• Bias in LOCF has been shown analytically and empirically to be influenced by many factors
• Direction and magnitude of bias are highly situation dependent and difficult to anticipate
• Summary of a recent NDA showed LOCF yielded a lower p-value than MMRM in 34% of analyses
Biostatistics. 5: 445-464.
BMC Psychiatry. 4: 26-31.

Performance Of MAR With MNAR Data
• Studies showing MAR methods provide better control of Type I and Type II error than LOCF
Arch. Gen. Psych. 61: 310-317.
Clinical Trials. 1: 477-489.
Drug Information J. 35: 1215-1225.
J. Biopharm. Stat. 11: 9-21.
J. Biopharm. Stat. 12: 207-212.
Pharmaceutical Statistics (2007 early view) DOI: 10.1002/pst.267
JSM Proceedings. 2006. pp. 668-676.
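
The point made under "Conservatism Of LOCF", that the direction of LOCF bias depends on the situation, can be illustrated with a small deterministic sketch. Everything in it is a hypothetical assumption chosen for illustration (linear trajectories, four patients per arm, half of the drug arm leaving after visit 3); it is not data or code from any trial cited here.

```python
from statistics import mean

N_VISITS = 6  # hypothetical number of post-baseline visits


def locf_change(slope, last_visit):
    """Change from baseline carried forward from the last attended visit.

    Assumes a hypothetical linear trajectory: change at visit v = slope * v.
    """
    return slope * min(last_visit, N_VISITS)


def locf_arm_mean(slope, last_visits):
    """Mean LOCF endpoint change for one arm, given each patient's last visit."""
    return mean(locf_change(slope, v) for v in last_visits)


# Hypothetical dropout pattern: half of the drug arm leaves after visit 3,
# while every placebo patient completes the trial.
drug_last = [6, 6, 3, 3]
placebo_last = [6, 6, 6, 6]


def contrasts(drug_slope, placebo_slope):
    """Return (true endpoint contrast, LOCF endpoint contrast), drug minus placebo."""
    true_diff = (drug_slope - placebo_slope) * N_VISITS
    locf_diff = (locf_arm_mean(drug_slope, drug_last)
                 - locf_arm_mean(placebo_slope, placebo_last))
    return true_diff, locf_diff


# Scenario A: patients improve over time (larger change = better).
improving = contrasts(drug_slope=2.0, placebo_slope=1.0)
# Scenario B: patients deteriorate over time and the drug slows the decline.
worsening = contrasts(drug_slope=-1.0, placebo_slope=-2.0)

print("improving (true, LOCF):", improving)
print("worsening (true, LOCF):", worsening)
```

With the same dropout pattern, LOCF understates the true contrast when patients improve over time and overstates it when patients deteriorate, so LOCF cannot be assumed to be conservative.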
More Explicit Modeling Choices Needed
• MMRM 6 lines of code, LOCF 5 lines of code
• Convergence and choice of correlation structure are not difficult in MMRM
Clinical Trials. 1: 477-489.

LOCF Thought To Measure Something More Valuable
• LOCF is "effectiveness", MAR is "efficacy"
• LOCF is what is actually observed
• MAR is what is estimated to happen if patients stayed on study
• Non-longitudinal interpretation of LOCF
• LO, LAV
• Dropout is an outcome

Non-longitudinal Interpretation Of LOCF
• An LOCF result can be interpreted as an index of rate of change times duration on study drug: a composite of efficacy, safety, and tolerability
• An index with unknown weightings
• The same estimate of mean change via LOCF can imply different clinical profiles
• The LOCF penalty is not necessarily proportional to the risk
• The result can be manipulated by design

Completion Rates in Depression Trials
[Figure: proportion of completers (axis 0 to 1) by study, drug versus placebo]

Dropout Rates Influenced by Design in a Recent MDD NDA

Trial        1      2      3      4      5      6      7      8
% DC-AE      4.3    6.7    3.3    9.0    3.2    1.0    2.5    4.3
% Dropout    34.3   41.3   31     42     19     9      29.5   35.3

Trials 5 and 6 had titration dosing and extension phases.
Lillytrials.com

Outline
• Why do we care
• What do we know
• Theory
• Application
• What we should do

Modeling Philosophies
• Restrictive modeling
  – Simple models with few independent variables
  – Often include only the design factors of the experiment
• Inclusive modeling
  – Auxiliary variables are included to improve the performance of the missing data procedure; this expands the scope of MAR
  – Baseline covariates
  – Time-varying post-baseline covariates: must be careful not to dilute the treatment effect. Including time-varying post-baseline covariates in the analysis model can be dangerous; it may be better to use them via imputation (or propensity scoring or weighted analyses)
Psychological Methods, 6, 330-351.
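
One way to write down how auxiliary variables "expand the scope of MAR" is sketched below. The notation is an assumption introduced here, not taken from the slides: R_i is the dropout pattern for patient i, Y_i^obs and Y_i^mis are the observed and unobserved outcomes, X_i are the design covariates, and A_i are the auxiliary variables.

```latex
\begin{align}
% Restrictive model: MAR has to hold given the design covariates alone
P(R_i \mid Y_i^{\mathrm{obs}}, Y_i^{\mathrm{mis}}, X_i)
  &= P(R_i \mid Y_i^{\mathrm{obs}}, X_i) \\
% Inclusive model: MAR only has to hold after also conditioning on A_i
P(R_i \mid Y_i^{\mathrm{obs}}, Y_i^{\mathrm{mis}}, X_i, A_i)
  &= P(R_i \mid Y_i^{\mathrm{obs}}, X_i, A_i)
\end{align}
```

The second condition can hold when the first fails, namely when the part of dropout that would otherwise depend on the unobserved outcomes is captured by A_i.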
Rationale For Inclusive Modeling
• MAR: conditional on the dependent and independent variables in the analysis, unobserved values of the dependent variable are independent of dropout
• Hence adding more variables that explain dropout can make missingness MAR that would otherwise be MNAR

Analytic Road Map
• MAR with restrictive modeling as primary
• Use MAR with inclusive modeling and MNAR methods as sensitivity analyses
• Use local influence to investigate the impact of influential patients
Pharmaceutical Statistics. 4: 267-285.
J. Biopharm. Stat. 16: 365-384.

Why Not MNAR As Primary
• Can do better than MAR only via assumptions
• Assumptions are untestable
• Sensitivity to violations of assumptions and model misspecification is more severe in MNAR
• MNAR methods lack some desired attributes of a primary analysis in a confirmatory trial
  – No standard software
  – Complex

Implementing The Road Map: Example From A Depression Trial
• 259 patients, randomized in a 1:1 ratio to drug and placebo
• Response: change of HAMD17 score from baseline
• 6 post-baseline visits (Weeks 1, 2, 3, 5, 7, 9)
• Primary objective: test the difference of mean change in HAMD17 total score between drug and placebo at the endpoint
• Primary analysis: likelihood-based mixed-effects model (LB-MEM)

Patient Disposition

                     Drug     Placebo
Protocol complete    60.9%    64.7%
Adverse event        12.5%    4.3%
Lack of efficacy     5.5%     13.7%

Differential rates, timing, and/or reasons for dropout do not necessarily distinguish between MCAR, MAR, and MNAR.

Primary Analysis: LB-MEM

  proc mixed;
    class subject treatment time site;
    model Y = baseline treatment time site treatment*time;
    repeated time / sub = subject type = un;   /* unstructured within-patient covariance */
    lsmeans treatment*time / cl diff;          /* visit-wise treatment contrasts with confidence limits */
  run;

This is a full multivariate model, with unstructured modeling of time and correlation.
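
To give a sense of what "unstructured" costs in this example, the sketch below counts covariance parameters for 6 post-baseline visits under three common structures. The helper name and the choice of comparison structures are illustrative assumptions; only `type = un` appears in the primary analysis.

```python
def n_unstructured(n_visits):
    """Distinct variances and covariances in an unstructured n x n covariance matrix."""
    return n_visits * (n_visits + 1) // 2


N_VISITS = 6  # post-baseline visits in the depression example

cov_params = {
    "un (unstructured)": n_unstructured(N_VISITS),  # one per variance and covariance
    "cs (compound symmetry)": 2,                    # common variance + common covariance
    "ar(1)": 2,                                     # variance + lag-1 correlation
}

for name, k in cov_params.items():
    print(f"{name}: {k} covariance parameters")
```

The unstructured matrix makes no assumption about how correlation decays across visits, at the price of many more parameters than the simpler structures.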
More parsimonious approaches may be useful in other scenarios.
Treatment contrast 2.17, p = .024

Inclusive Modeling in MI: Including Auxiliary AE Data
• Imputation models
  – $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \varepsilon_{ih}$
  – $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \gamma_1 \mathrm{AE}_{i1} + \dots + \gamma_{h-1} \mathrm{AE}_{i(h-1)} + \varepsilon_{ih}$
  – $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \gamma_1 \mathrm{AE}_{i1} + \dots + \gamma_{h-1} \mathrm{AE}_{i(h-1)} + \delta_1 (Y_{i1} \mathrm{AE}_{i1}) + \dots + \delta_{h-1} (Y_{i(h-1)} \mathrm{AE}_{i(h-1)}) + \varepsilon_{ih}$
• Analysis model
  – MMRM as previously described

Result
• MI results were not sensitive to the different imputation models

Method            Endpoint contrast
MMRM              2.2
MI Y+AE           2.3
MI Y+AE+Y*AE      2.1

• Including AE data might be important in other scenarios. There are many ways to define AE

MNAR Modeling
• Implement a selection model
  – Had to simplify the model: modeled time as linear + quadratic, and used ar(1) correlation
• Compare results from assuming MAR and MNAR
• Also obtain local influence to assess the impact of influential patients on treatment contrasts and non-random dropout

Selection Model Results

                      MAR         MNAR
Contrast              2.20        2.18
(p-value)             (0.0179)    (0.0177)

Missingness parameters    $\psi_0$    $\psi_1$    $\psi_2$
Estimate                  -2.46       0.11        -0.08
SE                        0.27        0.05        0.06

Local Influence: Influential Patients
[Figure: local influence measure $C_i$ plotted against patient number (0 to 250); labeled patients: #6, #30, #50, #154, #179]

Individual Profiles with Influential Patients Highlighted
[Figure: individual profiles of change in HAMD17 by week, in two panels (duloxetine, placebo), with influential patients highlighted, including #30]

Investigating The Influential Patients
• The most influential patient was #30, a drug-treated patient who had the unusual profile of a big improvement but dropped out at week 1
• This patient was in his/her first MDD episode when enrolled
• This patient dropped out by his/her own decision, claiming that the MDD was caused by a high carbon monoxide level in his/her house
• This patient was of dubious value for assessing the efficacy of the drug

Selection Model: Influential Patients Removed

Removed subjects               (30, 191)                        (6, 30, 50, 154, 179, 191)
                               MAR             MNAR             MAR             MNAR
Diff. at endpoint (p-value)    2.07 (0.0241)   2.07 (0.0237)    2.40 (0.0082)   2.40 (0.0083)
Missingness parameters
  $\psi_0$                     -2.22 (0.14)    -2.44 (0.27)     -2.23 (0.15)    -2.47 (0.28)
  $\psi_1$                     0.05 (0.02)     0.11 (0.05)      -0.05 (0.02)    0.11 (0.06)
  $\psi_2$                                     -0.07 (0.06)                     -0.08 (0.06)

Implications
• Comforting that no subjects had a huge influence on results. The impact would be bigger in a smaller trial
• Similar to other depression trials we have investigated, results were not influenced by MNAR data
• We can be confident in the primary result

Discussion
• MAR with restrictive modeling was a reasonable choice for the primary analysis
• MAR with inclusive modeling and MNAR methods were useful in assessing sensitivity
• Sensitivity analyses promote the appropriate level of confidence in the primary result and lead us to an alternative analysis in which we can have the greatest possible confidence

Opinions
• Inclusive modeling has been underutilized
• More research to understand dropout would be useful
• Did not discuss pros and cons of various ways to implement inclusive modeling. Use the one you know? Be careful not to dilute the treatment effect
• The road map for analyses used in the example data is specific to that scenario

Conclusions
• No universally best method for analyzing longitudinal data
• Analysis must be tailored to the specific situation at hand
• Considering the missingness mechanism and the modeling philosophy provides the framework in which to choose an appropriate primary analysis and appropriate sensitivity analyses
• LOCF and BOCF are not acceptable choices for the primary analysis
• MAR is a reasonable choice for the primary analysis in the highly controlled situation of confirmatory clinical trials
• MNAR can never be ruled out
• Sensitivity analyses and efforts to understand and lower rates of dropout are essential