Introduction to medical survival analysis

John Pearson
Biostatistics consultant
University of Otago Canterbury
7 October 2008

Objectives

• Describe survival data
• Define survival analysis terms
• Compare survival of groups
• Describe study design

Acknowledgement: Thanks to Colm Fahy for providing the example data.

Omissions

• Not covered:
  – most methodology issues
  – mathematical justification
• See
  – Collett: Modelling Survival Data in Medical Research
  – Hosmer & Lemeshow: Applied Survival Analysis
  – Many other good texts.

Example: Metastatic Parotid SCC

• Disease risk factors:
  – >50 yo
  – Male
  – Exposure to sun
  – Caucasian ancestry
• 61 patients operated on since 1990
• Audit done 1/6/2008
• 14 patients died from SCCMP, 20 died from other causes, 1 couldn't be found

Example: Patient data

Died        OpDate      Status  Preserved  RadioTx  ICOMP
            7/05/2002   ALIVE   PARTIAL    YES      N
            15/11/2007  ALIVE   NO         YES      N
1/03/2008   12/10/2007  DOC     YES        YES      N
1/08/1993   17/04/1992  DOD     YES        YES      Y
1/04/1997   7/10/1996   DOC     NO         YES      N
            1/05/1991   LOST    YES        YES      N
1/05/2005   12/03/2003  DOC     YES        YES      Y

Status: DOD = dead of disease, DOC = dead of other causes.
Only 7 patients shown. Dates have been confidentialized.

Example: Patient data

[Figure: Parotidectomy patient medical records. One line per patient (1 to 7) on a calendar axis from 1990 to the audit at 6/2008. Legend: Alive, Dead OC, Dead OD, ? = lost to follow up.]

Example: Survival Data

[Figure: Parotidectomy patient survival data. The same 7 patients on an axis of years post operation (0 to 15). Legend: Alive, Dead OC, Dead OD, ? = lost to follow up.]

Date formats and manipulation can cause headaches. Check what happens when your software subtracts dates to get survival time.

[Figure: Parotidectomy patient survival data, as above, with observations marked "censored" and the patient lost to follow up (?) marked "Missing data".]
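As a minimal sketch of the date check suggested above, assume dates are stored as day/month/year strings like those in the patient table. The function name `survival_days` and the two example strings are illustrative only, not taken from the audit. Parsing the same strings month-first gives a very different survival time without raising any error, which is why the format is worth stating explicitly.

```python
from datetime import datetime

def survival_days(start: str, end: str, fmt: str = "%d/%m/%Y") -> int:
    """Days from start to end, parsing both strings with an explicit format."""
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 < t0:
        raise ValueError(f"end date {end} is before start date {start}")
    return (t1 - t0).days

# Illustrative strings in the slide's day/month/year style.
op_date, end_date = "12/10/2007", "1/03/2008"

# Read as day/month/year: 12 Oct 2007 to 1 Mar 2008.
day_first = survival_days(op_date, end_date)
# Read as month/day/year: 10 Dec 2007 to 3 Jan 2008.
month_first = survival_days(op_date, end_date, fmt="%m/%d/%Y")

print(day_first, month_first)
```

Both calls succeed, so only a deliberate check (for example, comparing a few computed times against hand calculations) would reveal a wrong date convention.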
Example: Survival Data

[Figure: Parotidectomy patient survival data, years post operation (0 to 15). Five patients are marked "censored"; the patient lost to follow up (?) is marked "Missing data". Legend: Alive, Dead OC, Dead OD.]

Censored data is explicitly addressed by survival analysis; using simple linear regression is not recommended.

Software options:
1. SPSS
2. SAS
3. R
4. Other software

Missing data can have a large effect on results and requires careful management.

Options:
1. Omit
2. Impute
3. Model

What is survival analysis

• Time to event data
  – Continuous
  – Right skewed, ≥0, not normal
  – Censored
  – Analyse risk (hazard function)
• Examples
  – Time to death
  – Time to onset/relapse of disease
  – Length of stay in hospital

[Figure: Post operative survival. Histogram of number of patients (0 to 15) by years (0 to 10).]

Censoring

• Right censoring
• Left censoring
• Interval censoring

Censoring is also categorised by
1. Fixed study length
2. Fixed number of events
3. Random entry to study

Censoring

• Right censoring
  – Observed survival time is less than actual
  – Study ends before event

[Figure: Parotidectomy patient medical records, 1990 to the audit at 6/2008. Legend: Alive, Dead OC, Dead OD, ? = lost to follow up.]
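One way to turn audit records into survival data is to store, for each patient, an observed time and an event indicator. The sketch below assumes disease specific survival (only DOD counts as an event) and the audit date of 1 June 2008 from the example. The names `to_survival_record` and `AUDIT` are hypothetical, and the example dates are invented for illustration. A patient lost to follow up is returned as missing rather than guessed.

```python
from datetime import date
from typing import Optional, Tuple

AUDIT = date(2008, 6, 1)  # audit date from the example; change for other studies

def to_survival_record(op: date, status: str,
                       died: Optional[date] = None) -> Optional[Tuple[float, int]]:
    """Return (years observed, event), or None when follow up is missing.

    event = 1 for death from disease (DOD). event = 0 means right censored:
    alive at audit, or dead of other causes (DOC) at the date of death.
    """
    if status == "LOST":
        return None                      # missing data: must be handled explicitly
    if status == "ALIVE":
        end, event = AUDIT, 0            # study ended before the event
    elif status in ("DOD", "DOC"):
        if died is None:
            raise ValueError("death date required for DOD/DOC")
        end, event = died, int(status == "DOD")
    else:
        raise ValueError(f"unknown status: {status}")
    return ((end - op).days / 365.25, event)

# Invented example patients (not the audit data).
alive = to_survival_record(date(2006, 6, 1), "ALIVE")
dod = to_survival_record(date(2000, 1, 1), "DOD", died=date(2001, 1, 1))
doc = to_survival_record(date(2000, 1, 1), "DOC", died=date(2001, 1, 1))
lost = to_survival_record(date(1995, 3, 1), "LOST")
print(alive, dod, doc, lost)
```

For all causes survival the same sketch would count DOC as an event too; which deaths count as events is a study design choice, not something the code can decide.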
Censoring

• Left censoring
  – Example: time to relapse. Surgery at time 0; recurrence at time t is already present at the 3 month exam.
  – Time to event is less than observed: t < 3

Censoring

• Interval censoring
  – Example: time to relapse. Surgery at time 0; free of disease at the 3 month exam; recurrence found at the 6 month exam.
  – 3 < t < 6

Censoring

Independent censoring
Survival time is independent of the censoring process. A censored patient is representative of those at risk at the censoring time. The methods described here assume independent censoring.

Informative censoring
Example: patients are removed from the study if their condition deteriorates.

Censoring example

How are the SCCMP patients censored?
• Enter study on surgery date
• Last known status is at audit
Random right censoring.

Survival function

The survival function S(t) is the probability of surviving longer than time t:

\[ S(t) = P(T > t) \]

where T is the survival time. When every survival time is observed, it is estimated by

\[ \hat{S}(t) = \frac{\text{number of patients surviving longer than } t}{\text{total number of patients}} \]

Hazard function

The hazard function λ(t) is the risk of dying "at" time t, given survival up to time t:

\[ \lambda(t) = \frac{f(t)}{S(t)} \]

where f(t) is the probability density of T. Also called the instantaneous failure rate and force of mortality.

Usually plotted is the cumulative hazard function, that is, the accumulated hazard until time t:

\[ \Lambda(t) = -\log S(t) \]

Survival function

For censored data the survival function can only be estimated.
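When some patients are censored, the simple proportion above is no longer available, because censored patients leave the risk set without an observed death. A minimal sketch of one standard estimator, the product-limit (Kaplan Meier) method covered later in this talk, is given here. The function name `kaplan_meier` and the toy data are illustrative assumptions, not the SCCMP data.

```python
from typing import List, Tuple

def kaplan_meier(data: List[Tuple[float, int]]) -> List[Tuple[float, int, int, float]]:
    """Product-limit estimate from (time, event) pairs; event 1 = death, 0 = censored.

    Returns one row (time, n at risk, deaths, S(t)) per distinct death time.
    """
    rows = []
    surv = 1.0
    for t in sorted({time for time, event in data if event == 1}):
        # Patients still under observation just before t (censored at t counts as at risk).
        n = sum(1 for time, _ in data if time >= t)
        # Deaths observed exactly at t.
        d = sum(1 for time, event in data if time == t and event == 1)
        # Multiply by the conditional probability of surviving past t.
        surv *= (n - d) / n
        rows.append((t, n, d, surv))
    return rows

# Toy data: deaths at 2, 5 and 9 months; censored at 3 and 7 months.
toy = [(2, 1), (3, 0), (5, 1), (7, 0), (9, 1)]
for t, n, d, s in kaplan_meier(toy):
    print(f"t={t:>4}  n={n}  d={d}  S(t)={s:.3f}")
```

With no censored observations the estimate reduces to the simple proportion surviving longer than t, which is a useful sanity check on any implementation.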
[Figure: Parotidectomy patient survival data for the first 3 years post operation, patients 1 to 7. Legend: Alive, Dead OC, Dead OD.]

Survival function

Life table estimates

[Figure: All causes mortality. Percent surviving (0 to 100) by age (0 to 100) for NZ, Australia and Chad. Sources: WHO, StatsNZ.]

Survival function

Kaplan Meier estimates

1. Order data by time to event (death)
2. Number at risk of event (n) is number surviving less number censored
3. Estimate the probability of surviving to the next event: (n-d)/n
4. Multiply the probabilities to estimate survival S(t)

      Months    n   d   (n-d)/n   S(t)
 1      2.20   57   1    0.982    0.982
 2      6.12   51   1    0.980    0.963
 3     10.32   46   1    0.978    0.942
 4     10.78   45   1    0.978    0.921
 5     10.88   44   1    0.977    0.900
 6     13.08   41   1    0.976    0.878
 7     13.35   39   1    0.974    0.856
 8     16.11   37   1    0.973    0.833
 9     26.20   34   1    0.971    0.808
10     29.42   31   1    0.968    0.782
11     37.48   26   1    0.962    0.752
12     45.86   23   1    0.957    0.719
13     59.08   19   1    0.947    0.682
14     65.33   14   1    0.929    0.633

Kaplan Meier plot

[Figure: SCCMP Kaplan Meier estimate. Estimated survivor function (0.0 to 1.0) by months (0 to 120).]

Standard errors and 95% CIs are calculated by most software (SPSS, R, SAS), usually with Greenwood's or Tsiatis' formula, depending on the software.

Cumulative Hazard

[Figure: SCCMP Cumulative Hazard Function. Cumulative hazard (0.0 to 0.4) by months (0 to 120).]

Summary statistics

1. Median survival: time when S(t) = 0.5
   • Must have enough data
2. Mean survival: area under the survival curve
3. 5 year survival: the survival rate at 5 years

Kaplan Meier estimate

KM and life tables are non-parametric methods: no assumptions are made about the distribution of the survival times.

Parametric methods assume a distribution for the survival times; typical choices are exponential and Weibull. They are more powerful but can be sensitive to getting the distribution right.

Disease specific survival

[Figure: SCCMP survival. Estimated survivor function (0.0 to 1.0) by months (0 to 120) for disease specific and all causes survival.]

Comparing 2 groups

Log rank test
• Computed in SPSS, SAS, R
• Most popular method (Bland & Altman, BMJ 2004;328:1073, 1 May)
• Limitations
  – No estimate of the size of the difference
  – Unlikely to detect a difference when risk is not consistent over time

Immuno compromised

[Figure: SCCMP survival: Immuno Compromised. Estimated survivor function (0.0 to 1.0) by months (0 to 140) for ICOMP = No and ICOMP = Yes.]

Case Processing Summary

ICOMP     Total N   N of Events   Censored N   Censored Percent
N            53          9            44           83.0%
Y             7          5             2           28.6%
Overall      60         14            46           76.7%

Means and Medians for Survival Time (months)

            Mean (a)                Median
ICOMP     Estimate   Std. Error   Estimate   Std. Error
N          101.048      7.616        .           .
Y           22.978      7.653      16.110      3.293
Overall     91.761      7.842        .           .

a. Estimation is limited to the largest survival time if it is censored.

Overall Comparisons

                         Chi-Square   df   Sig.
Log Rank (Mantel-Cox)      19.579      1   .000

Test of equality of survival distributions for the different levels of ICOMP.
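The log rank statistic reported by SPSS above can be sketched in a few lines. This is a minimal two-group version under stated assumptions: observations are (time, event) pairs with event 1 = death and 0 = censored, the variance is the usual hypergeometric one, and the statistic is referred to a chi-square distribution with 1 degree of freedom. The names `logrank_test` and `chi2_sf_1df` and the toy data are illustrative; for real analyses the tested routines in SPSS, SAS or R are the safer choice.

```python
import math
from typing import List, Tuple

Obs = Tuple[float, int]  # (time, event): event 1 = death, 0 = censored

def chi2_sf_1df(x: float) -> float:
    """Upper tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def logrank_test(g1: List[Obs], g2: List[Obs]) -> Tuple[float, float]:
    """Two-group log rank test; returns (chi-square statistic, p value)."""
    pooled = g1 + g2
    observed = expected = variance = 0.0
    for t in sorted({time for time, event in pooled if event == 1}):
        n1 = sum(1 for time, _ in g1 if time >= t)       # at risk in group 1
        n = sum(1 for time, _ in pooled if time >= t)    # at risk overall
        d1 = sum(1 for time, e in g1 if time == t and e == 1)
        d = sum(1 for time, e in pooled if time == t and e == 1)
        observed += d1
        expected += d * n1 / n       # group 1 deaths expected if survival is equal
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("variance is zero: no comparable death times")
    chi2 = (observed - expected) ** 2 / variance
    return chi2, chi2_sf_1df(chi2)

# Toy data only, not the SCCMP patients.
group_a = [(1, 1), (3, 1)]
group_b = [(2, 1), (4, 1)]
chi2, p = logrank_test(group_a, group_b)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

The test returns only a statistic and a p value, which illustrates the limitation noted on the log rank slide: it gives no estimate of the size of the difference between groups.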
Age group

[Figure: SCCMP survival: Age group. Estimated survivor function (0.0 to 1.0) by months (0 to 140) for <75 and 75+.]

Call:
survdiff(formula = Surv(mths, Status == "DOD") ~ Age75)

            N  Observed  Expected  (O-E)^2/E  (O-E)^2/V
Age75=<75  24         7      5.63      0.332      0.557
Age75=75+  36         7      8.37      0.224      0.557

Chisq= 0.6 on 1 degrees of freedom, p= 0.455

Facial Nerve

[Figure: SCCMP survival: Facial Nerve Preserved. Estimated survivor function (0.0 to 1.0) by months (0 to 140) for YES, PARTIAL and NO.]

Log rank p value: 0.09

Multiple independent variables

Cox proportional hazards model
• Most common model
• Linear model for the log of the hazard ratio

\[ \frac{h_1(t)}{h_0(t)} = e^{B_1 Z_1 + B_2 Z_2} \]

• Baseline hazard h_0(t) unspecified

SCCMP example

CPH model: Survival ~ Preserved + Age + ICOMP
• Preserved and ICOMP categorical
• Age continuous

Plot survival for patients with each level of nerve preservation (yes, no, partial), adjusted for age and immuno compromised status.

SCCMP example - SPSS

Analyze > Survival > Cox Regression

COXREG Months
  /STATUS=Status('DEAD')
  /PATTERN BY Preserved
  /CONTRAST (Preserved)=Indicator
  /CONTRAST (ICOMP)=Indicator(1)
  /METHOD=ENTER Preserved Age ICOMP
  /PLOT SURVIVAL
  /SAVE=PRESID XBETA
  /PRINT=CI(95) CORR SUMMARY BASELINE
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) .

SCCMP example - SPSS

Variables in the Equation

                                                            95.0% CI for Exp(B)
                 B      SE     Wald   df   Sig.   Exp(B)     Lower      Upper
Preserved                     8.493    2   .014
  No          2.535    .871   8.470    1   .004   12.617     2.288     69.564
  Partial     2.091   1.110   3.549    1   .060    8.093      .919     71.279
ICOMP         3.588    .918  15.274    1   .000   36.166     5.981    218.676
Age           -.011    .028    .149    1   .700     .989      .936      1.046

Patients whose facial nerve was not preserved have 12.6 times the hazard of patients whose nerve was preserved (hazard ratio 12.6, 95% CI 2.3 to 69.6). Preserving the facial nerve significantly reduces patients' risk (p = 0.004, CPH model; p = 0.014 for Preserved overall).
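The Exp(B) and confidence interval columns in the SPSS output above follow directly from B and SE. A minimal sketch, assuming the usual Wald interval exp(B ± 1.96 SE) and a Wald chi-square of (B/SE)^2 on 1 degree of freedom; the helper names `hazard_ratio_ci` and `wald_p_value` are illustrative. The inputs are the B = 2.535 and SE = .871 reported for Preserved = No, so the result can be compared with the printed Exp(B), Lower and Upper values.

```python
import math
from typing import Tuple

def hazard_ratio_ci(b: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Hazard ratio exp(b) with a Wald confidence interval exp(b - z*se), exp(b + z*se)."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

def wald_p_value(b: float, se: float) -> float:
    """Two-sided p value for the Wald chi-square (b/se)^2 on 1 degree of freedom."""
    return math.erfc(abs(b / se) / math.sqrt(2.0))

# Coefficient and standard error for Preserved = No, copied from the SPSS output.
hr, lower, upper = hazard_ratio_ci(2.535, 0.871)
p = wald_p_value(2.535, 0.871)
print(f"HR = {hr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, p = {p:.3f}")
```

Small differences from the printed table are expected because B and SE are rounded to three decimals before exponentiating.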
SCCMP CPH model

[Figure: SCCMP survival: Facial nerve preserved. Estimated survivor function (0.0 to 1.0) by months (0 to 70) for YES, PARTIAL and NO, adjusted for age and immuno compromised status.]

Next Steps:

• Check proportional hazards assumption
  – Residual plots for groups
• Time dependent covariates
• More complex models
• Power calculations (not done here)

Summary

• Survival analysis accounts for censoring in time to event data
• Log rank test: difference in survival between 2 groups
• Cox proportional hazards model: multiple independent variables
• More complex/powerful models available
• SPSS, R, SAS, Stata