Introduction: Challenges in Analyzing and Interpreting Multiple (Censored) Outcomes in Chronic Disease Trials
Dianne M. Finkelstein, Ph.D., MGH, Harvard University, Boston, MA, USA
5 August, 2014

Multiple Outcomes In Clinical Trials
Trials often have several outcomes that are monitored for treatment benefit
• Survival is primary in a life-threatening disease but may take too long to show benefit
• Other measures of disease progression can be more rapidly sensitive in a trial
How should these be handled in the design, analysis, and report of the trial results?

Examples of Multiple Outcomes
Breast cancer clinical trial
• Survival, recurrence (distant, local), quality of life
• Commonly use progression-free survival as primary
ALS (Lou Gehrig’s Disease)
• ALS functional rating scale (ALSFRS)
• Survival not likely to be affected by Rx
• ALSFRS commonly chosen as primary

Examples of Multiple Outcomes (continued)
Heart Failure (CHARM trial)
• Chronic heart failure (death, cardiovascular event), hospitalizations
• Commonly use time to first of cardiac events/death
Scleroderma (interstitial lung disease)
• Pulmonary function (FVC) at 12 months and death
• Commonly combined FVC and death as primary
• Disability and quality of life as secondary

How are multiple endpoints usually handled?
Report univariate analyses with correction for multiplicity (Bonferroni, for example)
Select a primary endpoint that guides the decision about the treatment
Co-primary outcomes or hierarchy

Alternatively: Test Based on Data From Two or More Endpoints
“Time to first” of two or more events
• Failure-time analysis
• Failure is the first event observed
• Example: Progression-free survival in cancer
Global test on first as well as subsequent endpoints
• Can compare patients pair-wise on many events
• Base the test on the sum of scores from one group

Advantage of Tests Based on a Combined Outcome
Considering “time to first event” or “global test”
Allows simultaneous evaluation of multiple outcomes without adjustment for multiplicity
Accounts for confounding among the outcome measures that are combined
Recovers information from censored or missing data
Can increase power
Co-primary endpoints would require a larger N, as both must have sufficient power

Problems With “Time to First Outcome” Test
Treats all events equally
Loses information from subsequent or repeated events
• CHF: time to death or first cardiac event loses information on recurrences
• PFS loses information on death for patients with progression observed before death
• Sometimes early events are less important clinically
Could lose power if treatment primarily affects progression, and survival dilutes the test

A Global Test to Combine Mortality and Longitudinal Outcomes in a Clinical Trial
Joint Rank Test of Finkelstein & Schoenfeld 1999
All patients are compared pair-wise on time to death and a longitudinal outcome
If a pair cannot be compared on death (due to censoring), compare on the longitudinal outcome at the latest time point with data from both subjects
Each comparison is assigned a score of +1 (better), -1 (worse), or 0 (can’t compare)
Test based on the sum of scores for each patient in the treated group.
Wilcoxon test applied

A Global Test to Combine Mortality and Longitudinal Outcomes in a Clinical Trial (continued)
The F-S Joint Rank Test was first proposed by Finkelstein and Schoenfeld in Statistics in Medicine (SIM) in 1999
Called the Generalized Gehan-Wilcoxon (GGW) Test in the paper, or the Combined Test in the literature
Other papers have re-named the test
• CAFS (Combined Assessment of Function and Survival) in Berry et al. (2013) for ALS
• Win Ratio in Pocock et al. (2012) for cardiac trials
Ritesh Ramchandani will discuss a generalization of this later in this session

Other Global Tests: PC O’Brien Rank Sum Test (1984)
In the pooled sample, patients are ranked on each outcome
Outcome-specific ranks are summed, giving a total rank for every patient
Conduct ANOVA, t-test, or rank-sum test on total ranks
Viewed as a supplement to univariate results

Estimate Derived from the Global Test
Pocock et al. Win Ratio
As for the F-S Joint Rank test, count the number of pairs where the new Rx patient did better (win) or worse (loss)
Win Ratio = number of winners / number of losers
Get estimate of proportion who won from WR
Allows calculation of confidence bounds and graphical display of results

Examples of published global endpoint analyses (Barry)
• 1984: Non-parametric and parametric approaches to analyzing multiple endpoints; illustrated using a diabetes trial
• 1999: Non-parametric evaluation of time to event (e.g. mortality) and longitudinal measure (e.g. change over time in CD4 lymphocyte count in HIV/AIDS)
• 2000: Analysis of longitudinal measure and event history or survival (schizophrenia trial)
• 2006: Repeated measurements and event time data (shared parameter model)
• 2007, 2011: Continuous measure over time (e.g. disability index in scleroderma) and time-to-event outcomes (e.g. renal crisis and death in scleroderma)
• 2007: Longitudinal measurement (e.g. change in percent forced vital capacity) and time to treatment failure or death (in scleroderma-associated lung disease)
• 2008: Death and another outcome such as stroke, myocardial infarction, time to reintervention, angina, or hospitalization in cardiovascular clinical trials
• 2010: Time to loss of virological response; incorporates virological failure assessments, loss to follow-up, new treatment initiation, and death (HIV)
• 2013: Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration

Suggestions in Using Global Test
Can use a hierarchy to give more importance to some endpoints
Caution about including endpoints that don’t reflect the Rx effect, as they can dilute results
Best to use in the setting where all intermediate endpoints are on the same path (to death)
• Example: tumor progression is predictive of earlier death
• However, salvage Rx can diminish the impact of protocol Rx on death

Suggestions on Global Analyses (continued)
Put the global test analysis in the Analysis Plan
Include a component-wise analysis plan
Calculate power for the global analysis

Future Research is Needed
Refine the estimate associated with the test
• Win ratio is affected by the total follow-up time of the study
Graphical display of results
• Depict global estimate over time
• Show what outcomes dominate the score
Testing and estimation of univariate outcomes
• Joint test can be significant when the univariate outcomes are not
• Hierarchical testing

Conclusions
A global test of multiple outcomes can increase power and assess multi-dimensional treatment outcomes.
Use of joint tests is gaining in popularity in many disease settings (Pharma as well as FDA)
Needs to be considered in more disease settings
New methods and approaches are needed to handle the interpretation of global outcome analyses
If you propose a new test, give it a memorable name

Session
1. Andrew Strahs: Illustration of issues that can arise with “time to first” outcome in a clinical trial
2. Ritesh Ramchandani: Generalization of Global test
3. Marc Buyse: Approach to summarizing Global test outcomes in a trial
4. David Schoenfeld: Discussion
SEE OUR WEBSITE FOR THE TALKS AND CONTACTS
Google “MGH Biostatistics” for hedwig.mgh.harvard.edu/

References
1. Finkelstein DM, Schoenfeld DA. Combining mortality and longitudinal measures in clinical trials. Statistics in Medicine 1999;18:1341-1354.
2. O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics 1984;40(4):1079-1087.
3. Berry JD, Miller R, Moore DH, Cudkowicz ME, Van Den Berg LH, Kerr DA, Dong Y, Ingersoll EW, Archibald D. The combined assessment of function and survival (CAFS): a new endpoint for ALS clinical trials. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration 2013;14(3):162-168.
4. Pocock SJ, Ariti CA, Collier TJ, Wang D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. European Heart Journal 2012;33:176-182.
5. Sun H, Davison BA, Cotter G, Pencina MJ, Koch GG. Evaluating treatment efficacy by multiple end points in phase II acute heart failure clinical trials: analyzing data using a global method. Circulation: Heart Failure 2012;5:742-749.

Thank you!
Visit our website: hedwig.mgh.harvard.edu/biostatistics/

Examples of Multiple Outcomes in ALS
Phase II Trial of two doses of dexpramipexole for ALS
• Mortality alone p=.071
• ALSFRS slope p=.177
• Joint Rank test p=.046
Simulations
• If one endpoint moderate and one significant, Joint Rank test better
• Moderate on each may not be enough

Activity 282 on Aug 5 at 8:30am Upload in 256 B CC BC Healy Chair Session 210146
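
Sketch: pair-wise scoring for the Joint Rank Test and the Win Ratio
The pair-wise comparison described on the Joint Rank Test and Win Ratio slides can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Patient` fields, the four toy patients, the rule that a higher longitudinal value is better, and the handling of ties. The variance is the usual permutation variance for a sum of scores in one group, and the win ratio is counted over all unmatched treated-versus-control pairs. This is not the authors' implementation.

```python
from dataclasses import dataclass, field
from math import sqrt

@dataclass
class Patient:
    treated: bool
    time: float      # follow-up time: death time if died, else censoring time
    died: bool
    visits: dict = field(default_factory=dict)  # visit time -> value (assumed: higher is better)

def compare(a, b):
    """Return +1 if a did better than b, -1 if worse, 0 if the pair can't be compared."""
    # Survival first: the pair is ordered only when the earlier time is an observed death.
    if a.died and a.time < b.time:
        return -1
    if b.died and b.time < a.time:
        return 1
    # Otherwise use the longitudinal outcome at the latest visit both patients share.
    shared = set(a.visits) & set(b.visits)
    if not shared:
        return 0
    t = max(shared)
    return (a.visits[t] > b.visits[t]) - (a.visits[t] < b.visits[t])

def joint_rank(patients):
    """Per-patient scores, treated-group sum, and a permutation z statistic."""
    n = len(patients)
    scores = [sum(compare(p, q) for q in patients if q is not p) for p in patients]
    n_trt = sum(p.treated for p in patients)
    stat = sum(s for p, s in zip(patients, scores) if p.treated)
    # Scores sum to zero, so the permutation variance needs only the sum of squares.
    var = n_trt * (n - n_trt) / (n * (n - 1)) * sum(s * s for s in scores)
    z = stat / sqrt(var) if var > 0 else 0.0
    return scores, stat, z

def win_ratio(patients):
    """Count wins and losses for treated patients over all treated-vs-control pairs."""
    wins = losses = 0
    for p in (x for x in patients if x.treated):
        for q in (x for x in patients if not x.treated):
            r = compare(p, q)
            wins += r == 1
            losses += r == -1
    return wins, losses

# Toy data, invented for illustration only.
patients = [
    Patient(True, 12, False, {3: 40, 6: 38}),
    Patient(True, 9, True, {3: 35, 6: 30}),
    Patient(False, 5, True, {3: 30}),
    Patient(False, 12, False, {3: 41, 6: 35}),
]
scores, stat, z = joint_rank(patients)
wins, losses = win_ratio(patients)
print(scores, stat, round(z, 3), wins / losses)
```

With these four toy patients, the two censored patients cannot be ordered on survival, so that pair is decided by the longitudinal value at month 6. Every other pair is decided by an observed death.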