Kaplan-Meier Estimation & Log-Rank Test

Survival of Ventilated and Control Flies (Old Falmouth Line 107)

R. Pearl and S.L. Parker (1922). “Experimental Studies on the Duration of Life. V. On the Influence of Certain Environmental Factors on Duration of Life in Drosophila,” The American Naturalist, Vol. 56, #646, pp. 385-405

Data Description
• 946 flies are ventilated on Day 1
• 931 flies are not ventilated on Day 1 (Controls)
• Counts of survivors and non-survivors are reported every 6 days subsequently (Days 7, 13, …, 85)
• Goal 1: Estimate the survival function for the two groups: $S(t) = P(T > t)$, where $T$ is the (random) survival time of a fly
• Goal 2: Test whether the (population) survival functions differ for the 2 groups

Data

Days        1    7   13   19   25   31   37   43   49   55   61   67   73   79   85
VentLive  946  933  918  895  851  761  659  557  454  351  263  175   97   11    0
VentDie     0   13   15   23   44   90  102  102  103  103   88   88   78   86   11
CntlLive  931  916  902  865  788  680  573  452  370  272  182  111   41    2    0
CntlDie     0   15   14   37   77  108  107  121   82   98   90   71   70   39    2

VentDie and CntlDie give the numbers dying in the period just prior to each day.

Kaplan-Meier Estimation - Notation
• For each fly we observe the pair $(t_i, \delta_i)$, where $t_i$ is the $i$th fly's time to death or censoring and $\delta_i$ is a censoring indicator (1 = censored, 0 = not censored (actual death))
• Observed failure times: $t_{(1)} < \dots < t_{(k)}$
  – Number of failures at $t_{(i)}$: $d_i$
  – Subjects censored at $t_{(i)}$ are treated as if censored between $t_{(i)}$ and $t_{(i+1)}$.
    Number censored in $[t_{(i)}, t_{(i+1)})$: $m_i$
  – Subjects at risk prior to $t_{(i)}$: $n_i = (d_i + m_i) + \dots + (d_k + m_k)$ (this removes all who have died or been censored prior to $t_{(i)}$)
• Note: For this dataset, there is no censoring (all flies were observed to die)

Kaplan-Meier Estimates

Estimated hazard at time $t_{(i)}$:

$$\hat{\lambda}_i = \frac{d_i}{n_i} = \frac{\#\ \text{dying at}\ t_{(i)}}{\#\ \text{at risk just prior to}\ t_{(i)}}$$

Estimated survival function at time $t$:

$$\hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \left(1 - \hat{\lambda}_i\right)$$

Days  VentLive  VentDie  CntlLive  CntlDie  Vlambda  Clambda   Vsurv   Csurv
   1       946        0       931        0   0.0000   0.0000  1.0000  1.0000
   7       933       13       916       15   0.0137   0.0161  0.9863  0.9839
  13       918       15       902       14   0.0161   0.0153  0.9704  0.9689
  19       895       23       865       37   0.0251   0.0410  0.9461  0.9291
  25       851       44       788       77   0.0492   0.0890  0.8996  0.8464
  31       761       90       680      108   0.1058   0.1371  0.8044  0.7304
  37       659      102       573      107   0.1340   0.1574  0.6966  0.6155
  43       557      102       452      121   0.1548   0.2112  0.5888  0.4855
  49       454      103       370       82   0.1849   0.1814  0.4799  0.3974
  55       351      103       272       98   0.2269   0.2649  0.3710  0.2922
  61       263       88       182       90   0.2507   0.3309  0.2780  0.1955
  67       175       88       111       71   0.3346   0.3901  0.1850  0.1192
  73        97       78        41       70   0.4457   0.6306  0.1025  0.0440
  79        11       86         2       39   0.8866   0.9512  0.0116  0.0021
  85         0       11         0        2   1.0000   1.0000  0.0000  0.0000

Note: For the ventilated flies, 13 died at day 7 (actually between days 1 and 7), out of 946 that were at risk prior to that period: 13/946 = .0137. At day 13, the proportion dying was 15/933 = .0161. The survival function at day 7 is obtained as (1 - .0137) = .9863.
At day 13, the survival function is (1 - .0137)(1 - .0161) = .9704.

[Figure: Kaplan-Meier Survival Estimates - Ventilated and Control Flies. Vertical axis: P(Survive), from 0 to 1; horizontal axis: Days, from 1 to 85; series: Vsurv and Csurv.]

Log-Rank Test
• Used to test whether two (or more) survival functions are equal
• Involves obtaining the “expected” number of deaths in (say) the treatment group at time $t_{(i)}$ if the hazard functions for the two groups were equal:
  – Let $n_{1i}$ and $n_{2i}$ be the numbers at risk just prior to $t_{(i)}$ for the 2 groups, with total at risk $n_i = n_{1i} + n_{2i}$
  – Let $d_{1i}$ and $d_{2i}$ be the numbers dying at $t_{(i)}$ for the 2 groups, with total deaths $d_i = d_{1i} + d_{2i}$
  – Then the expected deaths for group 1 is $e_{1i} = d_i (n_{1i}/n_i)$, which is the total deaths at $t_{(i)}$ times the fraction of the total at risk that are in group 1

Log-Rank Test

Compute the following quantities:

$$e_{1i} = \frac{n_{1i} d_i}{n_i} \qquad v_{1i} = \frac{n_{1i} n_{2i} d_i (n_i - d_i)}{n_i^2 (n_i - 1)}$$

$$O_1 - E_1 = \sum_{i=1}^{k} (d_{1i} - e_{1i}) \qquad V_1 = \sum_{i=1}^{k} v_{1i}$$

Compute the “Z”-statistic (software packages often square this to get a chi-square):

$$T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}} \sim N(0,1) \quad \text{under } H_0\text{: no differences in survival functions}$$

Alternative (less preferred, but computationally easier) method:

$$O_2 = \sum_{i=1}^{k} d_{2i} \qquad E_2 = O_1 + O_2 - E_1$$

Compute the chi-square statistic:

$$X^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} \sim \chi^2_1 \quad \text{under } H_0\text{: no differences in survival functions}$$

Log-Rank Test (Fly Data)

Days  VentLive  VentDie  CntlLive  CntlDie       e1      v1    d1-e1
   1       946        0       931        0     0.00    0.00     0.00
   7       933       13       916       15    14.11    6.90    -1.11
  13       918       15       902       14    14.63    7.14     0.37
  19       895       23       865       37    30.26   14.51    -7.26
  25       851       44       788       77    61.53   28.18   -17.53
  31       761       90       680      108   102.81   43.48   -12.81
  37       659      102       573      107   110.37   44.56    -8.37
  43       557      102       452      121   119.28   45.47   -17.28
  49       454      103       370       82   102.13   37.40     0.87
  55       351      103       272       98   110.75   37.64    -7.75
  61       263       88       182       90   100.29   31.32   -12.29
  67       175       88       111       71    93.97   24.76    -5.97
  73        97       78        41       70    90.56   17.02   -12.56
  79        11       86         2       39    87.86    2.48    -1.86
  85         0       11         0        2    11.00    0.00     0.00
 Sum         -      946         -      931  1049.55  340.86  -103.55

$O_1 = 946$, $E_1 = 1049.55$, $V_1 = 340.86$, $O_2 = 931$, $E_2 = 946 + 931 - 1049.55 = 827.45$

$$T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}} = \frac{946 - 1049.55}{\sqrt{340.86}} = -5.61 \qquad T_{MH}^2 = (-5.61)^2 = 31.47$$

$$X^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} = \frac{(946 - 1049.55)^2}{1049.55} + \frac{(931 - 827.45)^2}{827.45} = 23.17$$

Both tests are highly significant: each chi-square statistic far exceeds 3.84, the 0.05 critical value for a chi-square distribution with 1 degree of freedom. The negative sign of $T_{MH}$ ($O_1 < E_1$ at most time points) indicates that the ventilated flies died more slowly than expected under equal hazards.
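
The Kaplan-Meier estimates and both log-rank statistics can be checked with a short script. What follows is a minimal sketch in plain Python, not the spreadsheet used for the slides. The function names (`at_risk`, `kaplan_meier`, `log_rank`) are hypothetical, and the at-risk counts are rebuilt from the death counts alone, which assumes there is no censoring, as is the case for this dataset.

```python
import math

# Deaths reported at days 1, 7, ..., 85, copied from the data table.
days = [1, 7, 13, 19, 25, 31, 37, 43, 49, 55, 61, 67, 73, 79, 85]
vent_die = [0, 13, 15, 23, 44, 90, 102, 102, 103, 103, 88, 88, 78, 86, 11]
cntl_die = [0, 15, 14, 37, 77, 108, 107, 121, 82, 98, 90, 71, 70, 39, 2]


def at_risk(deaths):
    """Number at risk just prior to each time, assuming no censoring:
    n_i = d_i + ... + d_k (every m_i is zero)."""
    remaining = sum(deaths)
    counts = []
    for d in deaths:
        counts.append(remaining)
        remaining -= d
    return counts


def kaplan_meier(deaths):
    """Return (hazard, survival) lists: lambda_i = d_i / n_i and the
    running product of (1 - lambda_i)."""
    hazard = [d / n if n else 0.0 for d, n in zip(deaths, at_risk(deaths))]
    survival, s = [], 1.0
    for h in hazard:
        s *= 1.0 - h
        survival.append(s)
    return hazard, survival


def log_rank(d1, d2):
    """Return (E1, V1, z, x2) for two groups of death counts."""
    n1, n2 = at_risk(d1), at_risk(d2)
    e1_total = v1_total = 0.0
    for a, b, r1, r2 in zip(d1, d2, n1, n2):
        d, n = a + b, r1 + r2
        e1_total += d * r1 / n                      # expected deaths in group 1
        if n > 1:                                   # variance term needs n - 1 > 0
            v1_total += r1 * r2 * d * (n - d) / (n ** 2 * (n - 1))
    o1, o2 = sum(d1), sum(d2)
    e2_total = o1 + o2 - e1_total
    z = (o1 - e1_total) / math.sqrt(v1_total)       # "Z"-statistic T_MH
    x2 = ((o1 - e1_total) ** 2 / e1_total
          + (o2 - e2_total) ** 2 / e2_total)        # alternative chi-square
    return e1_total, v1_total, z, x2


hazard_v, surv_v = kaplan_meier(vent_die)
hazard_c, surv_c = kaplan_meier(cntl_die)
E1, V1, z, x2 = log_rank(vent_die, cntl_die)

for day, sv, sc in zip(days, surv_v, surv_c):
    print(f"{day:3d}  {sv:.4f}  {sc:.4f}")
print(f"E1={E1:.2f}  V1={V1:.2f}  T_MH={z:.2f}  T_MH^2={z * z:.2f}  X^2={x2:.2f}")
```

Because the script carries full precision through every step, its last digits can differ slightly from the table, where the per-day values were rounded to two decimals before summing.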