Kaplan-Meier Estimation and Log-Rank Test for Ventilated and Control Flies

Kaplan-Meier Estimation & Log-Rank Test
Survival of Ventilated and Control Flies
(Old Falmouth Line 107)
R. Pearl and S.L. Parker (1922). “Experimental Studies on the Duration of Life. V. On
the Influence of Certain Environmental Factors on Duration of Life in Drosophila,” The
American Naturalist, Vol. 56, #646, pp. 385-405
Data Description
• 946 Flies are Ventilated on Day 1
• 931 Flies are not Ventilated on Day 1 (Controls)
• Counts of Survivors and Non-Survivors Reported every
6 Days Subsequently (Days 7,13,…,85)
• Goal 1: Estimate the Survival Function for the two
groups: S(t) = P(T > t) where T is the (random) Survival
time of a fly
• Goal 2: Test whether the (population) Survival
Functions differ for the 2 Groups
Data
Days  VentLive  VentDie  CntlLive  CntlDie
   1       946        0       931        0
   7       933       13       916       15
  13       918       15       902       14
  19       895       23       865       37
  25       851       44       788       77
  31       761       90       680      108
  37       659      102       573      107
  43       557      102       452      121
  49       454      103       370       82
  55       351      103       272       98
  61       263       88       182       90
  67       175       88       111       71
  73        97       78        41       70
  79        11       86         2       39
  85         0       11         0        2
VentDie and CntlDie are the numbers dying in the period just prior to the given day (for example, the 13 ventilated deaths listed at day 7 occurred between days 1 and 7)
Kaplan-Meier Estimation - Notation
• For each fly we observe the pair (ti, di) where ti is the ith
fly’s time to death or censoring and di is a censoring
indicator (1=censored, 0=not censored (actual death))
• Observed failure times: t(1) < … < t(k)
– Number of failures at t(i): di
– Subjects censored at t(i) are treated as if censored between
t(i) and t(i+1). Number censored in [t(i), t(i+1)): mi
– Subjects at risk prior to t(i): ni = (di+mi)+…+(dk+mk)
(this removes all who have died or been censored prior to t(i))
• Note: For this dataset, there is no censoring (all flies
were observed to die)
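The risk-set formula above can be sketched as a reverse cumulative sum. This is a minimal illustration, not part of the original slides: the function name `at_risk` and the all-zero censoring list are assumptions made for the example, and the death counts are the ventilated-fly counts at days 7 through 85.

```python
def at_risk(deaths, censored):
    """n_i = (d_i + m_i) + ... + (d_k + m_k), built as a reverse cumulative sum."""
    n = []
    running = 0
    for d, m in zip(reversed(deaths), reversed(censored)):
        running += d + m
        n.append(running)
    return n[::-1]


# Ventilated deaths at the failure times (days 7, 13, ..., 85).
vent_die = [13, 15, 23, 44, 90, 102, 102, 103, 103, 88, 88, 78, 86, 11]
# No fly is censored in this dataset, so every m_i is 0.
no_censoring = [0] * len(vent_die)

n = at_risk(vent_die, no_censoring)
print(n[:3])  # [946, 933, 918]
```

With no censoring, the numbers at risk are simply the VentLive counts from the previous observation day.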
Kaplan-Meier Estimates
Estimated Hazard at time t(i):
$$\hat{\lambda}_i = \frac{d_i}{n_i} = \frac{\#\text{ dying at } t_{(i)}}{\#\text{ at risk just prior to } t_{(i)}}$$
Estimated Survival Function at time t:
$$\hat{S}(t) = \prod_{i:\, t_{(i)} \le t} \left(1 - \hat{\lambda}_i\right)$$
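As a rough sketch of the two estimators, the hazard is the ratio of deaths to the number at risk, and the survival estimate is the running product of one minus each hazard. The function name `kaplan_meier` is hypothetical; the inputs are the ventilated-fly counts for days 7 and 13.

```python
def kaplan_meier(deaths, at_risk):
    """Return (hazards, survival) at each ordered failure time.

    hazards[i]  = d_i / n_i
    survival[i] = product of (1 - hazards[j]) over j <= i
    """
    hazards, survival = [], []
    s = 1.0
    for d, n in zip(deaths, at_risk):
        lam = d / n
        s *= 1.0 - lam
        hazards.append(lam)
        survival.append(s)
    return hazards, survival


# Ventilated flies: 13 deaths out of 946 at risk (day 7),
# then 15 deaths out of 933 at risk (day 13).
hazards, survival = kaplan_meier([13, 15], [946, 933])
print([round(h, 4) for h in hazards])   # [0.0137, 0.0161]
print([round(s, 4) for s in survival])  # [0.9863, 0.9704]
```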
Kaplan-Meier Estimates
Days  VentLive  VentDie  CntlLive  CntlDie  Vlambda  Clambda   Vsurv   Csurv
   1       946        0       931        0   0.0000   0.0000  1.0000  1.0000
   7       933       13       916       15   0.0137   0.0161  0.9863  0.9839
  13       918       15       902       14   0.0161   0.0153  0.9704  0.9689
  19       895       23       865       37   0.0251   0.0410  0.9461  0.9291
  25       851       44       788       77   0.0492   0.0890  0.8996  0.8464
  31       761       90       680      108   0.1058   0.1371  0.8044  0.7304
  37       659      102       573      107   0.1340   0.1574  0.6966  0.6155
  43       557      102       452      121   0.1548   0.2112  0.5888  0.4855
  49       454      103       370       82   0.1849   0.1814  0.4799  0.3974
  55       351      103       272       98   0.2269   0.2649  0.3710  0.2922
  61       263       88       182       90   0.2507   0.3309  0.2780  0.1955
  67       175       88       111       71   0.3346   0.3901  0.1850  0.1192
  73        97       78        41       70   0.4457   0.6306  0.1025  0.0440
  79        11       86         2       39   0.8866   0.9512  0.0116  0.0021
  85         0       11         0        2   1.0000   1.0000  0.0000  0.0000
Note:
For the ventilated flies, 13 died at day 7 (actually between days 1 and 7), out of
946 that were at risk prior to that period: 13/946=.0137. At day 13, 15/933=.0161
was the proportion dying
The Survival function at day 7 is obtained as (1-.0137)=.9863. At day 13, the
survival function is (1-.0137)(1-.0161) = .9704
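One consequence worth spelling out: because no fly is censored, the number at risk at the next time is the current number at risk minus the current deaths, so the product telescopes and the estimate reduces to the fraction of the original cohort still alive. The notation n_{i+1} for "alive after t(i)" is introduced here for the derivation only.

```latex
\hat{S}(t_{(i)})
  = \prod_{j \le i} \left(1 - \frac{d_j}{n_j}\right)
  = \prod_{j \le i} \frac{n_j - d_j}{n_j}
  = \prod_{j \le i} \frac{n_{j+1}}{n_j}
  = \frac{n_{i+1}}{n_1},
\qquad \text{e.g. day 13, ventilated: } \frac{918}{946} = 0.9704 .
```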
[Figure: Kaplan-Meier Survival Estimates - Ventilated and Control Flies. P(Survive), from 0 to 1, plotted against Days (1 to 85) for the two series Vsurv and Csurv.]
Log-Rank Test
• Used to test whether two (or more) survival functions
are equal
• Involves obtaining the “expected” number of deaths in
(say) the treatment group at time t(i) if the hazard
functions for the two groups were equal:
– Let n1i and n2i be the numbers at risk just prior to t(i) for
the 2 groups, with total at risk ni = n1i + n2i
– Let d1i and d2i be the numbers dying at t(i) for the 2
groups, with total deaths di = d1i + d2i
– Then the expected number of deaths for group 1 is e1i = di(n1i/ni),
which represents the total deaths at t(i) times the fraction
of the total at risk that are in group 1
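To make the expected-deaths formula concrete, here is a small sketch for a single time point. The function name `expected_deaths_group1` is hypothetical; the numbers are the day 7 counts from the fly data, with the ventilated flies taken as group 1.

```python
def expected_deaths_group1(n1, n2, d1, d2):
    """e1 = d * (n1 / n), with d = d1 + d2 and n = n1 + n2."""
    d = d1 + d2
    n = n1 + n2
    return d * n1 / n


# Day 7: 946 ventilated and 931 control flies at risk; 13 and 15 deaths.
e1 = expected_deaths_group1(n1=946, n2=931, d1=13, d2=15)
print(round(e1, 2))  # 14.11
```

The 28 total deaths are split in proportion to the risk sets, so the ventilated group, with slightly more than half of the flies at risk, is expected to contribute slightly more than 14 deaths.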
Log-Rank Test
Compute the following quantities:
$$e_{1i} = \frac{n_{1i}\, d_i}{n_i} \qquad v_{1i} = \frac{n_{1i}\, n_{2i}\, d_i\, (n_i - d_i)}{n_i^2\, (n_i - 1)}$$
$$O_1 - E_1 = \sum_{i=1}^{k} \left(d_{1i} - e_{1i}\right) \qquad V_1 = \sum_{i=1}^{k} v_{1i}$$
Compute the "Z"-Statistic (software packages often square this to get a Chi-Square):
$$T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}} \sim N(0,1) \quad \text{under } H_0: \text{No differences in Survival Functions}$$
Alternative (less preferred, but computationally easier) method:
$$O_2 = \sum_{i=1}^{k} d_{2i} \qquad E_2 = O_1 + O_2 - E_1$$
Compute the Chi-Square statistic:
$$X^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} \sim \chi^2_1 \quad \text{under } H_0: \text{No differences in Survival Functions}$$
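The quantities above can be accumulated in one pass over the failure times. The following is a sketch under stated assumptions rather than a reference implementation: the function name `log_rank` and the returned dictionary keys are hypothetical, and the guard for a risk set of size 1 is an added safeguard that the fly data never triggers. The inputs are the fly counts, with the numbers at risk taken as the Live counts from the previous observation day.

```python
import math


def log_rank(n1, d1, n2, d2):
    """Two-group log-rank quantities from per-time risk-set sizes and deaths."""
    o1, o2 = sum(d1), sum(d2)
    e1 = 0.0
    v1 = 0.0
    for n1i, d1i, n2i, d2i in zip(n1, d1, n2, d2):
        ni = n1i + n2i
        di = d1i + d2i
        e1 += n1i * di / ni
        if ni > 1:  # the variance term is undefined for a single subject at risk
            v1 += n1i * n2i * di * (ni - di) / (ni ** 2 * (ni - 1))
    e2 = o1 + o2 - e1
    z = (o1 - e1) / math.sqrt(v1)
    x2 = (o1 - e1) ** 2 / e1 + (o2 - e2) ** 2 / e2
    return {"O1": o1, "E1": e1, "V1": v1, "O2": o2, "E2": e2, "Z": z, "X2": x2}


# Numbers at risk just prior to days 7, 13, ..., 85 and deaths at those days.
vent_n = [946, 933, 918, 895, 851, 761, 659, 557, 454, 351, 263, 175, 97, 11]
vent_d = [13, 15, 23, 44, 90, 102, 102, 103, 103, 88, 88, 78, 86, 11]
cntl_n = [931, 916, 902, 865, 788, 680, 573, 452, 370, 272, 182, 111, 41, 2]
cntl_d = [15, 14, 37, 77, 108, 107, 121, 82, 98, 90, 71, 70, 39, 2]

result = log_rank(vent_n, vent_d, cntl_n, cntl_d)
for name, value in result.items():
    print(name, round(value, 2))
```

Because the code sums unrounded terms, its output may differ slightly in the last digit from totals built out of per-day values rounded to two decimals.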
Log-Rank Test (Fly Data)
Days  VentLive  VentDie  CntlLive  CntlDie       e1      v1    d1-e1
   1       946        0       931        0     0.00    0.00     0.00
   7       933       13       916       15    14.11    6.90    -1.11
  13       918       15       902       14    14.63    7.14     0.37
  19       895       23       865       37    30.26   14.51    -7.26
  25       851       44       788       77    61.53   28.18   -17.53
  31       761       90       680      108   102.81   43.48   -12.81
  37       659      102       573      107   110.37   44.56    -8.37
  43       557      102       452      121   119.28   45.47   -17.28
  49       454      103       370       82   102.13   37.40     0.87
  55       351      103       272       98   110.75   37.64    -7.75
  61       263       88       182       90   100.29   31.32   -12.29
  67       175       88       111       71    93.97   24.76    -5.97
  73        97       78        41       70    90.56   17.02   -12.56
  79        11       86         2       39    87.86    2.48    -1.86
  85         0       11         0        2    11.00    0.00     0.00
 Sum       n/a      946       n/a      931  1049.55  340.86  -103.55
$$O_1 = 946 \qquad E_1 = 1049.55 \qquad V_1 = 340.86$$
$$O_2 = 931 \qquad E_2 = 946 + 931 - 1049.55 = 827.45$$
$$T_{MH} = \frac{O_1 - E_1}{\sqrt{V_1}} = \frac{946 - 1049.55}{\sqrt{340.86}} = -5.61 \qquad T_{MH}^2 = (-5.61)^2 = 31.47$$
$$X^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} = \frac{(946 - 1049.55)^2}{1049.55} + \frac{(931 - 827.45)^2}{827.45} = 23.17$$
Both tests are highly significant
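To put a number on "highly significant", the tail probabilities can be approximated with the standard library alone. This sketch is an addition, not part of the original slides: the helper names are hypothetical, and it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so both p-values come from the complementary error function.

```python
import math


def two_sided_normal_p(z):
    """P(|Z| >= |z|) for Z ~ N(0, 1)."""
    return math.erfc(abs(z) / math.sqrt(2))


def chi2_1df_p(x2):
    """P(X >= x2) for X ~ chi-square with 1 df, using X = Z^2."""
    return math.erfc(math.sqrt(x2 / 2))


p_z = two_sided_normal_p(-5.61)   # Z-statistic from the fly data
p_x2 = chi2_1df_p(23.17)          # alternative chi-square statistic
print(f"{p_z:.2e}", f"{p_x2:.2e}")
```

Both p-values come out far below 0.001, and both statistics are well beyond the usual 5% cutoffs (|Z| > 1.96, chi-square with 1 df > 3.84).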