Summarizing Quality of Life in the Presence of Limited Survival
Gerhardt Pohl, Li Li
Eli Lilly and Company

Objectives
In this talk we focus on a special case of informative missing data: patients who cease to provide longitudinal patient-reported outcomes (PROs) because of death. We compare several commonly used methods for analyzing such data and propose an approach based on the proportion of patients at various score levels.

Overview
• Problem Statement
• Limitations of Current Methods
• (Brief Aside on Plotting Individual Patient Profiles)
• Proposal
• Summary and Discussion of Future Directions

Problem Statement
• Consider quality of life (QoL) or other PROs collected over time.
• We want to summarize the mean profile of the scores at the various time points.
• However, patients often fail to complete all assessments due to early death.
• Informative censoring complicates the situation further: patients' scores decline as they approach death, and patients often stop reporting scores as they decline.

Non-Informative vs. Informative Censoring
[Figure: two panels, Non-Informative and Informative. Solid lines indicate observed data; dotted lines, missing data.]

Commonly Used Methods
• Mixed Model Repeated Measures (MMRM)
• Area Under the Curve (AUC)
• Survival Methods
• Latent Effects Models

Limitations of MMRM
• Consider the model with a unique mean and between-patient variability at each time point, with possible within-patient correlation in the scores over time.
• The underlying assumption is that patients share the same trajectory of score over time, with some patients contributing only a portion of the profile.
• Variability is modeled only in the outcome and not in the time of observation, which is assumed fixed with a common mean outcome at each observation time.
• In reality, however, patients experience accelerated time to failure with informative censoring.

MMRM Unbiased in the Non-Informative Setting
[Figure: two panels, Non-Informative and Informative. Solid lines indicate observed data; dotted lines, missing data.]
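
The contrast between the two censoring mechanisms can be illustrated with a small simulation. This is only a sketch under assumed settings that do not come from the talk: the normal score distribution, the 50% random dropout rate, the cutoff of 2.0, and the function name `simulate` are all hypothetical. It compares the mean of the observed scores at a single visit with the mean over all patients, first when dropout is unrelated to the score and then when low-scoring patients stop reporting.

```python
import random
from statistics import mean


def simulate(n_patients=5000, cutoff=2.0, seed=1):
    """Compare observed means under two hypothetical dropout mechanisms."""
    rng = random.Random(seed)
    # Hypothetical true scores at a single visit (higher = better).
    scores = [rng.gauss(2.0, 1.0) for _ in range(n_patients)]

    # Non-informative: each patient is observed with probability 0.5,
    # regardless of the score.
    observed_random = [s for s in scores if rng.random() < 0.5]

    # Informative: patients whose score has fallen below the cutoff
    # do not report.
    observed_informative = [s for s in scores if s >= cutoff]

    return {
        "all": mean(scores),
        "non_informative": mean(observed_random),
        "informative": mean(observed_informative),
    }


result = simulate()
for label, value in result.items():
    print(f"{label:>16}: {value:.3f}")
```

In this sketch the informative mean lies above the all-patient mean by construction, because only scores at or above the cutoff are reported. The non-informative mean should differ from the all-patient mean only by sampling noise.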
Complex Profile of Real-Life PROs in Oncology Chemotherapy
[Figure: PRO burden over time for Untreated and Treated patients. Axes: Burden vs. Time.]

Complex Profile Befuddles Time to Worsening Analyses
• Time-to-event analysis appears ideal for handling right-censored data.
• However, worsening in treated patients occurs immediately at the outset of cytotoxic chemotherapy.

Causal Diagram
[Figure: causal diagram linking Treatment, Disease, AE, and PRO, together with the disease and AE measures, repeated for Cycle 1, Cycle 2, etc.]

Plotting Individual Patient Profiles: Spaghetti Plots
[Figure: spaghetti plot of symptom scores (discrete 0-4) for 300 patients versus time. Axes: Symptom Score vs. Week (0 to 7).]

Plotting Individual Patient Profiles: Lasagna Plots
• Bruce J. Swihart, Brian Caffo, Bryan D. James, Matthew Strand, Brian S. Schwartz, and Naresh M. Punjabi. "Lasagna Plots: A Saucy Alternative to Spaghetti Plots." Epidemiology, Vol. 21, No. 5, Sept. 2010.
• Remap the intensity of the score from the vertical axis to a color, and use the location on the vertical axis to denote the individual patient.

Each Row is a Patient
[Figure: lasagna plot, one row per patient, columns Week 1 to Week 7.]

Each Row is a Patient
[Figure: the same lasagna plot, annotated "Hint of early tolerability burden".]

Sorted by Treatment Group
[Figure: lasagna plot, Week 1 to Week 7, rows grouped into Control and Treated.]

Sorted by Treatment Group and Duration of Follow-Up
[Figure: two lasagna plots, Week 1 to Week 7, rows sorted by Duration.]

Sorted by Treatment Group and Duration of Follow-Up
[Figure: the same two plots, annotated "Poorer scores near termination".]

Additional Features
• Sorting by characteristics of the plotted data and/or by external characteristics.
• Annotation of discrete events and events with duration.
• Filtering rows to subsets of patients.
• Automated aggregation of patients with similar profiles to allow more than one patient per horizontal band.
• Side panels showing related data, e.g., Kaplan-Meier plots, proportion of data plotted.
• Special thanks to Wei Wang, Eli Lilly and Co., Advanced Analytics Visualization.

Limitations of AUC Methods
• One approach to compensating for varying lengths of survival is to calculate the area under the curve, i.e., score × time values (cf. QALY).
• Note that death is mapped to a score of zero.
• AUC yields a complete ordering of score and survival.
• Exchangeability of quality and time is questionable.
• AUC induces a linearity in the PRO scale that may not be realistic.

Two AUC-Equivalent Patients
[Figure: score profiles of Patient 1 and Patient 2 with equal AUC. Axes: Score (0 to 1.0) vs. Time (0 to 2).]

Probability-Based Methods
• Rather than averaging scores, summarize the data as the proportion of patients at each level at each time point.
[Figure: average score vs. time (left) compared with stacked percent of patients at Score=4, 3, 2, 1 vs. time (right). Time runs from 0 to 4; n = 10, 10, 8, 6, 6 at the successive time points in both panels.]

Incorporating Survival
• Death can be appended to the low end of the score.
[Figure: average score vs. time (n = 10, 10, 8, 6, 6) compared with stacked percent of patients at Score=4, 3, 2, 1, and Dead vs. time (n = 10 at every time point), plus a panel of survival probability vs. time.]

Summaries of Categorical Probabilities
• Cumulative Proportion of Time in Category
  – One can "integrate" over time to obtain the cumulative proportion of time the group spends in each PRO level.
[Figure: stacked percent of patients (Score=4, 3, 2, 1, Dead) vs. time, alongside a stacked bar of the marginal proportion of group time.]

Underlying Nature of Data

  Patient       Time in Category
  1             (1.0, 2.0, 1.0, 1.0, 0.0)
  2             (0.0, 0.0, 1.0, 4.0, 0.0)
  3             (0.0, 1.0, 1.0, 3.0, 0.0)
  4             (0.0, 0.0, 1.0, 1.0, 3.0)
  Group-Level   (1.0, 3.0, 4.0, 9.0, 3.0)

Group-level proportion of person-time spent in each category: (1/20, 3/20, 4/20, 9/20, 3/20)

Treatment: $p = (p_1, p_2, p_3, p_4, p_5)$; Control: $q = (q_1, q_2, q_3, q_4, q_5)$, with $\sum_{i=1}^{5} p_i = 1$ and $\sum_{i=1}^{5} q_i = 1$.
$p_5$ = proportion of time spent in Death.

Need an Ordering Metric for Ranking Which Summary Vectors Are "Better"
Three possible methods, each with pros and cons.

1. Majorization Order
• Introduce a majorization order over all cumulative levels of the state categories (states ordered from best to worst, with death last).
• The treatment vector $p$ is better than the control vector $q$ iff
  $$\sum_{i=1}^{j} q_i \le \sum_{i=1}^{j} p_i, \quad j = 1, \dots, n-1,$$
  with at least one strict inequality, where $n$ is the number of states.
• Pro: no need to choose weights; includes the requirement that patients in the treated group survive longer than those in the control group.
• Con: the condition is too strong.

2. Utility or Cost Function
• Introduce a utility or cost function for each category.
• The treatment group, $p$, is better than the control group, $q$, iff
  $$\sum_{i=1}^{n} c_i p_i < \sum_{i=1}^{n} c_i q_i,$$
  where $c_i$ is the cost or preference of category $i$ (with utility weights, where larger is better, the inequality reverses).
• Pro: reduces the comparison of vectors to a single dimension.
• Con:
  – One needs to assign weights between PRO states.
  – There is no explicit requirement that survival be better for treated than for control.

3. Pseudo Increasing Convex Order (P-ICX order)
– The treatment vector $p$ is better than the control vector $q$ iff, given $n > 1$, for all $j = 1, \dots, n-1$,
  $$\Delta_j = c_j \sum_{i=1}^{j} (p_i - q_i) + \sum_{i=j+1}^{n-1} c_i (p_i - q_i) \ge 0,$$
  with $c_1 > \dots > c_{n-1} > 0$ and at least one strict inequality.
Pro: (1) a weaker condition than majorization; (2) no need to weight death against the PRO states; (3) includes the requirement that patients in the treated group survive longer than those in the control group.
Con: One needs to assign weights between PRO states.

A 3-State Example

  Proportion of time spent in each state   Good    Bad     Death
  Treatment (p)                            p1      p2      p3
  Control (q)                              q1      q2      q3
  Difference (d = p - q)                   d1      d2      d3

• $d_1 > 0$: the treatment group is better in the good state.
• $d_2 > 0$: the treatment group is better in the bad state.
• $d_1 + d_2 > 0$: the treatment group is better in survival (a smaller proportion of time spent in the state of death), because
  $(p_1 + p_2) - (q_1 + q_2) > 0 \iff (1 - p_3) - (1 - q_3) > 0 \iff p_3 < q_3$.

Connecting Each Approach with d
Treatment group is better than control group:
• Cost function ($f$), with $c_1 > c_2 > c_3 = 0$:
  $f_{treatment} > f_{control}$, i.e., $\sum_{i=1}^{3} c_i p_i > \sum_{i=1}^{3} c_i q_i$
  $\iff c_1 p_1 + c_2 p_2 > c_1 q_1 + c_2 q_2$
  $\iff c_1 d_1 + c_2 d_2 > 0$
  $\iff c_1' d_1 + d_2 > 0$, where $c_1' = c_1 / c_2 > 1$ (written $c$ below).
• Cost function: $c\,d_1 + d_2 > 0$, where $c > 1$
• Majorization: $d_1 > 0$ and $d_1 + d_2 > 0$
• P-ICX: $c\,d_1 + d_2 > 0$, where $c > 1$, and $d_1 + d_2 > 0$

Majorization: reject that treatment is better than control
P-ICX: accept that treatment is better than control at c=2

  Proportion of time spent in each state   Good     Bad      Death
  Treatment (p)                            0.29     0.71     0.0
  Control (q)                              0.30     0.5      0.5
  Difference (d = p - q)                   -0.01    0.21     -0.5

P-ICX: reject that treatment is better than control at c=2
Cost function: accept that treatment is better than control at c=2

  Proportion of time spent in each state   Good     Bad      Death
  Treatment (p)                            0.49     0        0.51
  Control (q)                              0.0      0.97     0.03
  Difference (d = p - q)                   0.49     -0.97    0.48

The P-ICX order lies between majorization and the cost function method in how readily it accepts good PRO performance. Both the majorization order and the P-ICX order take survival benefit into account.

Example
• Simulated Data
  – Two arms: treatment vs. control (1:1).
  – Sample size: 300.
  – Survival: the treatment arm has longer survival than control (to show contrast, treatment-arm survival rate ~ 1).
  – Planned visits: 6 bi-monthly visits. PROs are followed until death or completion of visits; patients are followed until death or completion of the study (720 days).
[Figure: survival curve]
• Simulated Data (continued)
  – Longitudinal categorical QoL scores
    • True trend:
      – The treatment arm has a worse QoL score than control during the first 2-3 cycles, then moves to a more tolerable score than control as time goes on.
      – The control arm has an increasing trend over time.
      – Health status declines faster (PRO score increases) as patients approach death.
    • Observed trend (average of available data):
      – Data are missing due to death or inability to complete the survey as death approaches.
[Figure: True Curve]

Method 1: Naïve Estimator
• Average of the scores at each visit among available patients.
• The observed curve gives the impression that the control arm is better than the treatment arm.

Method 2: MMRM
– Treat the score as a continuous dependent variable.
– Model separate means at each visit (treatment-by-visit interaction) with exchangeable covariance within patient and independence between patients.
– The profile is similar to that of the naïve estimator.

Method 3: AUC Method
• Area under the curve up to 14 months.
• [Conclusion]

  x1          N     Mean    Std Dev   Std Err
  Control     142   278.3   123.0     10.3236
  Treatment   158   542.4   126.1     10.0342

  P-value (two-sample t-test): Pr > |t| < .0001

Proposed Method
• Select a time period of interest, e.g., 14 months.
• Collapse the 5 raw categories (0-4) into 2 categories (0, or a single category for scores of 1-4).
• Incorporate death as the worst PRO level.
• Integrate over time to obtain the proportion of time the group spends in each PRO level.
• Adopt the P-ICX order to compare PROs, and select the weight of the treatment effect at each level: weights (2, 1) for states (0, 1-4).

Results
• Proportion of time spent in each level during 12 months

  Quality of Life Status   0       1-4     Death
  Treatment                0.26    0.73    0.0001
  Control                  0.32    0.53    0.15
  Difference               -0.06   0.20    -0.15

  d = (-0.06, 0.20)
  Weight for raw states: (5, 4, 3, 2, 1)
  – One can conduct a formal hypothesis test: $H_0: p = q$ vs. $H_1$: $p, q$ satisfy the P-ICX order, excluding $H_0$.
  – Calculate the vector statistic $\Delta = (0.26, 0.29, 0.25, 0.18, 0.16)$.
  – $\Delta_j > 0$ for $j = 1, 2, 3, 4, 5$ gives the impression that the treatment arm is better than the control arm within 12 months of baseline.
  – P-value: chi-square test.

Summary and Discussion
• The proposed probability-based method for comparing PROs between treatments may avoid the need for weighting scores in some cases (majorization).
• Three possible ranking methods for comparing vectors:
  – Majorization: strongest condition
  – Cost function: simple concept
  – P-ICX order: cost function + improved-survival requirement
• Future research: how should the weights be chosen? P-ICX shares this question with the cost function.

A Formal Definition of ICX
• The distribution of a random variable $Y$ is larger than the distribution of a random variable $X$ in the increasing convex order, i.e., $X \le_{ICX} Y$, if and only if $E\{f(X)\} \le E\{f(Y)\}$ holds for all non-decreasing convex functions $f$ for which the expectations are defined.
• Applications in insurance and actuarial science.
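
The group-level time-in-category summary and the three ranking rules can be checked numerically on the small examples from the talk. The code below is a minimal sketch, not the authors' implementation. All function names are hypothetical. It assumes that states are ordered from best to worst with death last, that the cost-function comparison uses "larger is better" weights with death weighted 0 (as in the 3-state example), and that the P-ICX statistic is Delta_j = c_j * sum_{i<=j} d_i + sum_{j<i<=n-1} c_i * d_i with d = p - q. The weights (2, 1) correspond to c = 2 in the 3-state example.

```python
def time_proportions(patient_times):
    """Pool per-patient time in each category into group-level proportions."""
    totals = [sum(column) for column in zip(*patient_times)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def partial_sums(values):
    """Running sums of a sequence."""
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out


def majorization_better(p, q):
    """True if cumulative time in the best j states is never lower for p
    (j = 1..n-1) and is strictly higher for at least one j."""
    diffs = [a - b for a, b in zip(partial_sums(p)[:-1], partial_sums(q)[:-1])]
    return all(x >= 0 for x in diffs) and any(x > 0 for x in diffs)


def cost_better(p, q, weights):
    """True if the weighted sum is larger for p; death (last state) gets 0."""
    w = list(weights) + [0.0]
    score_p = sum(wi * pi for wi, pi in zip(w, p))
    score_q = sum(wi * qi for wi, qi in zip(w, q))
    return score_p > score_q


def picx_deltas(p, q, weights):
    """P-ICX statistics Delta_1..Delta_{n-1} as assumed in the lead-in."""
    d = [a - b for a, b in zip(p, q)]
    m = len(weights)  # number of non-death states, n - 1
    return [
        weights[j] * sum(d[: j + 1])
        + sum(weights[i] * d[i] for i in range(j + 1, m))
        for j in range(m)
    ]


def picx_better(p, q, weights):
    """True if every Delta_j is non-negative and at least one is positive."""
    deltas = picx_deltas(p, q, weights)
    return all(x >= 0 for x in deltas) and any(x > 0 for x in deltas)


# Four-patient example: time spent in each of five categories.
patients = [
    (1.0, 2.0, 1.0, 1.0, 0.0),
    (0.0, 0.0, 1.0, 4.0, 0.0),
    (0.0, 1.0, 1.0, 3.0, 0.0),
    (0.0, 0.0, 1.0, 1.0, 3.0),
]
print("group proportions:", time_proportions(patients))

# The two 3-state examples (Good, Bad, Death), values as given in the talk.
examples = {
    "example 1": ((0.29, 0.71, 0.0), (0.30, 0.5, 0.5)),
    "example 2": ((0.49, 0.0, 0.51), (0.0, 0.97, 0.03)),
}
weights = (2.0, 1.0)  # c = 2 for Good relative to Bad
for name, (p, q) in examples.items():
    print(
        name,
        "majorization:", majorization_better(p, q),
        "cost:", cost_better(p, q, weights),
        "P-ICX:", picx_better(p, q, weights),
    )
```

Under these assumptions the first example is rejected by majorization but accepted by P-ICX, and the second is rejected by P-ICX but accepted by the cost function, matching the verdicts stated in the 3-state examples.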