Clinical Trials A short course

Survival Analysis for
Randomized Clinical Trials
Ziad Taib
March 9, 2012
I The log-rank test
1. INTRODUCTION TO SURVIVAL TIME DATA
2. ESTIMATING THE SURVIVAL FUNCTION
3. THE LOG-RANK TEST
Survival Analysis
Survival Analysis in RCT
• For survival analysis, the best observation
plan is prospective. In clinical investigation,
that is a randomized clinical trial (RCT).
• Random treatment assignments.
• Well-defined starting points.
• Substantial follow-up time.
• Exact time records of the events of interest.
Survival Analysis in
Observational Studies
• Survival analysis can be used in observational
studies (cohort, case-control, etc.) as long as you
recognize its limitations.
• Lack of causal interpretation.
• Unbalanced subject characteristics.
• Determination of the starting points.
• Loss to follow-up.
• Ascertainment of event times.
Elements of Survival Experiments
• Event Definition (death, adverse events, …)
• Starting time
• Length of follow-up (equal length of follow-up,
common stop time)
• Failure time (observed time of event since start of
trial)
• Unobserved event time (censoring, no event
recorded in the follow-up, early termination, etc)
[Diagram: follow-up timeline from start to end of follow-up; one subject's line ends with an observed event, another is cut short by early termination (censoring) before the end of follow-up.]
When to use survival analysis
• Examples
– Time to death or clinical endpoint
– Time in remission after treatment of CA
– Recidivism rate after alcohol treatment
• When one believes that one or more explanatory
variables explain the differences in time
to an event
• Especially when follow-up is incomplete or
variable
Standard Notation for Survival Data
• Ti – survival (failure) time
• Ci – censoring time
• Xi = min(Ti, Ci) – observed time
• Δi = I(Ti ≤ Ci) – failure indicator: if the ith subject had an event before being censored, Δi = 1; otherwise Δi = 0.
• Zi(t) – covariate vector at time t
• Data: {Xi, Δi, Zi(·)}, i = 1, 2, …, n.
Describing Survival Experiments
• Central idea: the event times are realizations of an unobserved stochastic process that can be described by a probability distribution.
• A probability distribution can be described by any of:
1. Cumulative distribution function
2. Survival function
3. Probability density function
4. Hazard function
5. Cumulative hazard function
Relationships Among Different
Representations
• Given any one of these, we can recover the others:

S(t) = P(T > t) = 1 − F(t) = exp{ −∫₀ᵗ h(u) du }

h(t) = lim_{Δt→0} Pr(t ≤ T < t + Δt | T ≥ t) / Δt = f(t) / S(t) = −(d/dt) log S(t)

H(t) = ∫₀ᵗ h(u) du = −log S(t)

f(t) = h(t) · exp{ −∫₀ᵗ h(u) du }
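As a quick numerical check of these relationships, here is a minimal Python sketch (not part of the original slides) using the exponential distribution with rate λ, for which h(t) = λ, S(t) = e^(−λt) and f(t) = λe^(−λt):

```python
import numpy as np

# Exponential model with constant hazard lam: all five representations
# are available in closed form.
lam = 0.3
t = np.linspace(0.01, 10, 200)

S = np.exp(-lam * t)            # survival function S(t) = exp(-lam * t)
f = lam * np.exp(-lam * t)      # density f(t) = h(t) * S(t)
h = f / S                       # hazard f(t)/S(t); equals lam everywhere
H = -np.log(S)                  # cumulative hazard H(t) = -log S(t)

# Check the identities numerically
assert np.allclose(h, lam)
assert np.allclose(H, lam * t)
assert np.allclose(S, np.exp(-H))
```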
Descriptive statistics
• Average survival
– Can we calculate this with censored data?
• Average hazard rate
– Total # of failures divided by the total observed
survival time (units are therefore 1/time, e.g. 1/person-years)
– An incidence rate, with a higher value
indicating lower survival probability
• Provides an overall statistic only
Estimating the survival function
There are two slightly different
methods to create a survival curve.
• With the actuarial method, the x
axis is divided up into regular
intervals, perhaps months or years,
and survival is calculated for each
interval.
• With the Kaplan-Meier method,
survival is recalculated every time a
patient dies. This method is
preferred, unless the number of
patients is huge.
The term life-table analysis is used
inconsistently, but usually includes
both methods.
Life Tables (no censoring)
In survival analysis, the object of primary interest
is the survival function S(t). Therefore we need
good methods for estimating it. The most obvious
estimate is the empirical survival function:
[Table: for each time t = 0, 1, …, 10, the number of patients (out of 10) with survival times larger than t.]

Ŝ(t) = (# patients with survival times larger than t) / (total # of patients)

For example:

Ŝ(0) = 10/10 = 1    Ŝ(1) = 10/10 = 1    Ŝ(2) = 9/10 = 0.9    Ŝ(5) = 6/10 = 0.6
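A minimal Python sketch of this empirical estimate (illustrative only; the ten survival times below are made up, not the values behind the slide's table):

```python
import numpy as np

# Hypothetical complete (uncensored) survival times for 10 patients
times = np.array([2, 3, 3, 5, 6, 6, 7, 8, 9, 10])

def empirical_survival(times, t):
    """Proportion of patients with survival time strictly larger than t."""
    return np.mean(times > t)

for t in [0, 1, 2, 5]:
    print(t, empirical_survival(times, t))
```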
Example: A RAT SURVIVAL
STUDY
In an experiment, 20 rats exposed to a
particular type of radiation, were followed over
time. The start time of follow-up was the same
for each rat. This is an important difference
from clinical studies where patients are
recruited into the study over time and at the
date of the analysis had been followed for
different lengths of time. In this simple
experiment all individuals have the same
potential follow-up time. The potential follow-up time for each of the 20 rats is 5 days.
Survival Function for Rats

[Table: proportion of rats dying on each of the 5 days, together with the estimated survival function Ŝ(t) = P̂[T ≥ t].]

[Figure: survival curve for the rat study.]
Confidence Intervals for
Survival Probabilities
From above we see that the "cumulative"
probability of surviving three days in the rat
study is 0.25. We may want to report this
probability along with its standard error. This
sample proportion of 0.25 is based on 20 rats
that started the study. If we assume that (i) each
rat has the same unknown probability of
surviving three days, S(3), and (ii) assume that
the probability of one rat dying is not influenced
by whether or not another rat dies, then we can
use results associated with the binomial
probability distribution to obtain the variance
of this proportion. The variance is given by
Var[Ŝ(3)] = S(3)·[1 − S(3)] / n

Substituting the estimate Ŝ(3) = 0.25 and n = 20:

Vâr[Ŝ(3)] = Ŝ(3)·[1 − Ŝ(3)] / 20 = (0.25 × 0.75) / 20 = 0.009375,   so SE[Ŝ(3)] = √0.009375 ≈ 0.0968.

For large n,

Z = ( Ŝ(3) − S(3) ) / √Vâr[Ŝ(3)]  ≈  N(0, 1).
•This can be used to test hypotheses about the theoretical
probability of surviving three days as well as to construct
confidence intervals.
•For example, the 95% confidence interval for S(3) is given by
0.25 ± 1.96 × 0.0968
or (0.060, 0.440).
We are 95% confident that the probability of
surviving 3 days, meaning THREE OR MORE DAYS,
lies between 0.060 and 0.440.
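The same calculation as a short Python sketch (it assumes the 20-rat example above):

```python
import numpy as np

n = 20          # rats that started the study
s_hat = 0.25    # estimated probability of surviving three days, S(3)

var = s_hat * (1 - s_hat) / n      # binomial variance of the proportion
se = np.sqrt(var)                  # standard error, about 0.0968

# 95% normal-approximation confidence interval
lower, upper = s_hat - 1.96 * se, s_hat + 1.96 * se
print(round(se, 4), (round(lower, 3), round(upper, 3)))   # 0.0968 (0.06, 0.44)
```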
In general

This approach has many drawbacks:
1. Patients are recruited at different time periods
2. Some observations are censored
3. Patients can differ w.r.t. many covariates
4. Continuous data are discretised
Kaplan-Meier survival curves
• Also known as product-limit formula
• Accounts for censoring
• Generates the characteristic “stair step”
survival curves
• Does not account for confounding or
effect modification by other covariates
– Is that a problem?
In general

Ŝ(2) = P̂[T > 1] · P̂[T > 2 | T > 1] = (16/20) · (9/16) = 9/20 = 0.45

The same as before! Similarly,

Ŝ(3) = P̂[T > 1] · P̂[T > 2 | T > 1] · P̂[T > 3 | T > 2]
     = 0.80 · 0.5625 · 0.5556
     = (16/20) · (9/16) · (5/9) = 5/20 = 0.25
Censored Observations (Kaplan-Meier)
We proceed as in the case without censoring
Pi = Pr(T > i | T > i−1)

S(k) = P1 · P2 · … · Pk

Pi stands for the proportion of patients who survive day i among those who survive day i−1. Therefore it can be estimated according to

P̂1 = (total number of patients − total number of events during day 1) / (total number of patients)

P̂i = (number of patients at risk on day i − total number of events during day i) / (number of patients at risk on day i)
K-M Estimate: General Formula
•Rank the survival times as t(1)≤t(2)≤…≤t(n).
•Formula
Ŝ(t) = ∏_{t(i) ≤ t} (ni − di) / ni

•ni patients at risk just before t(i)
•di failures at t(i)

For example:

P1 = 19/20 = 0.95        P3 = 17/19 ≈ 0.89474

S(1) = S(0) · P1 = 1 × 0.95 = 0.95
S(3) = S(0) · P1 · P3 = 1 × (19/20) × (17/19) = 1 × 0.95 × 0.89474 ≈ 0.85

In SAS this estimate is produced by PROC LIFETEST.
Confidence Intervals
Using SAS
Using SAS
Comparing Survival Functions
• Question: Did the treatment make a difference in the survival experience of the two groups?
• Hypothesis: H0: S1(t) = S2(t) for all t ≥ 0.
• Two tests are often used:
1. Log-rank test (Mantel-Haenszel test)
2. Cox regression
A numerical Example
The Log-rank test
During the t-th interval, let
• Nt be the number of patients at risk in the drug
group at the beginning of the interval.
• Mt be the number of patients at risk in the placebo
group at the beginning of the interval.
• At be the number of events during the interval in the drug group.
• Ct be the number of events during the interval in the placebo group.
• Dt = At + Ct be the total number of events during the interval, and Tt = Nt + Mt.
The Log-rank test
Under H0 (random allocation):

[2×2 table at interval t: At events out of Nt at risk in the drug group, Ct events out of Mt at risk in the placebo group; margins Dt = At + Ct and Tt = Nt + Mt.]

• This contingency table is a way of summarising the data at hand. Notice though that the marginals Nt and Mt are fixed.
• In principle the problem can be formulated as a formal test of
H0: Drug has no effect: λD(t) = λC(t)
against
H1: Drug is effective: λD(t) = θ·λC(t), with θ ≠ 1.
This situation is similar to one where we have a total of Tt
balls in an urn. These balls are of two different kinds D and
C. Dt balls are drawn at random (those who experience
events).
• We denote by At the number of balls of type D among the
Dt drawn.
We can thus calculate probabilities of events like {At = a}
using the hypergeometric distribution.
P(At = a) = C(Nt, a) · C(Mt, Dt − a) / C(Tt, Dt)

where C(n, k) denotes the binomial coefficient "n choose k".
The corresponding mean and variance are

μt = E[At] = Dt·Nt / Tt

σt² = Var[At] = Dt·(Tt − Dt)·Nt·Mt / ( Tt²·(Tt − 1) )
The above can be used to derive the following test statistic, where Nt and Mt are supposed to be large, which is often the case in an RCT:

Zt = (At − μt) / σt ≈ N(0, 1)   and   Zt² ≈ χ²(1)
Assume now that we want to combine data from k successive such intervals. We can then define U according to

U = Σ_{t=1}^{k} (At − μt)

and use the statistic

Q_{M-H} = U² / Var(U) ≈ χ²(1),   with Var(U) = Σ_{t=1}^{k} σt².

Reject H0 for large values of Q_{M-H}.
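A minimal sketch of this computation in Python; the per-interval counts (Nt, Mt, At, Ct) below are made up for illustration:

```python
import numpy as np
from scipy.stats import chi2

# One row per interval/event time: [N_t, M_t, A_t, C_t]
# N_t, M_t: at risk in drug / placebo group; A_t, C_t: events in each group.
tables = np.array([
    [20, 20, 1, 2],
    [19, 18, 0, 1],
    [19, 17, 2, 1],
    [17, 16, 1, 3],
])

U, varU = 0.0, 0.0
for N, M, A, C in tables:
    D, T = A + C, N + M                           # total events and total at risk
    mu = D * N / T                                # E[A_t] under H0 (hypergeometric)
    var = D * (T - D) * N * M / (T**2 * (T - 1))  # Var[A_t]
    U += A - mu
    varU += var

Q = U**2 / varU                                   # Mantel-Haenszel / log-rank statistic
p = chi2.sf(Q, df=1)
print(round(Q, 3), round(p, 4))
```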
Some comments
1. The log-rank test is a significance test. It
does not say anything about the size of the
difference between the two groups.
2. The parameter θ can be used as a measure of the size of that difference. It is also possible to compute confidence intervals for θ.
Further comments
3. Instead of Q_{M-H} we can formulate variants with weights wt:

Q = [ Σ_{t=1}^{k} wt·(At − μt) ]² / [ Σ_{t=1}^{k} wt²·Var(At) ] ≈ χ²(1) under H0.

Common choices of weights (see the sketch below):
1. wt = 1        Mantel-Haenszel (log-rank)
2. wt = Tt       Gehan (1965)
3. wt = √Tt      Tarone and Ware (1977)
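Continuing the Python sketch above, the weighted variants only change the accumulation step; the weight names below simply illustrate the formula and are not code from the slides:

```python
def weighted_logrank(tables, weight="logrank"):
    """Weighted log-rank statistic.
    weight: 'logrank' (w=1), 'gehan' (w=T), 'tarone-ware' (w=sqrt(T))."""
    num, den = 0.0, 0.0
    for N, M, A, C in tables:
        D, T = A + C, N + M
        w = {"logrank": 1.0, "gehan": float(T), "tarone-ware": T**0.5}[weight]
        mu = D * N / T
        var = D * (T - D) * N * M / (T**2 * (T - 1))
        num += w * (A - mu)
        den += w**2 * var
    return num**2 / den
```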
A numerical example
Using SAS
Limitation of Kaplan-Meier
curves
• What happens when you have several covariates that
you believe contribute to survival?
• Example
– Smoking, hyperlipidemia, diabetes, hypertension, contribute to
time to myocardial infarct
• Can use stratified K-M curves – but the combinatorial
complexity of more than two or three covariates prevents
practical use
• Need another approach – multivariate Cox proportional
hazards model is most commonly used
– (think multivariate regression or logistic regression)
II Cox Regression
• Introduction to the proportional hazard
model (PH)
• Partial likelihood
• Comparing two groups
• A numerical example
• Comparison with the log-rank test
The model

hi(t) = h0(t) · exp( β1·xi1 + … + βk·xik )

Understanding the "baseline hazard" h0(t): for two subjects i and j it cancels from the ratio

hi(t) / hj(t) = exp[ β1·(xi1 − xj1) + … + βk·(xik − xjk) ]
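A tiny numeric illustration of this ratio in Python (the coefficients and covariate values are invented for the example):

```python
import numpy as np

beta = np.array([0.7, -0.3])   # hypothetical log-hazard-ratio coefficients
x_i = np.array([1.0, 2.0])     # covariates for subject i
x_j = np.array([0.0, 2.0])     # covariates for subject j

# The baseline hazard h0(t) cancels: the ratio depends only on the covariates.
hazard_ratio = np.exp(beta @ (x_i - x_j))
print(hazard_ratio)            # exp(0.7) ~ 2.01: subject i has about twice the hazard
```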
Cox Regression Model
• Proportional hazard.
• No specific distributional assumptions (but
includes several important parametric
models as special cases).
• Partial likelihood estimation (Semiparametric in nature).
• Easy implementation (SAS procedure
PHREG).
• Parametric approaches are an alternative,
but they require stronger assumptions
about h(t).
Cox proportional hazards model,
continued
• Can handle both continuous and categorical
predictor variables (think: logistic, linear
regression)
• Without knowing baseline hazard ho(t), can still
calculate coefficients for each covariate, and
therefore hazard ratio
• Assumes multiplicative risk—this is the
proportional hazard assumption
– Can be compensated in part with interaction terms
Cox Regression
• In 1972 Cox suggested a model for survival data that makes it possible to take covariates into account. Up to then it was customary to discretise continuous variables and build subgroups.
• Cox's idea was to model the hazard rate function through

P(t ≤ T < t + Δt | T ≥ t) ≈ h(t)·Δt

where h(t) is to be understood as an intensity, i.e. a probability per time unit. Multiplied by time we get a probability. Think of the analogy with speed as distance per time unit: multiplied by time we get distance.
Cox's suggestion is to model h using

hi(t) = h0(t) · e^{βᵀ xi}

where
βᵀ = (β1, …, βk) is the parameter vector,
xi = (xi1, …, xik) is the covariate vector,

and each parameter is a measure of the importance of the corresponding variable.
A consequence of this is that two individuals with
different covariate values will have hazard rate
functions which differ by a multiplicative term which is
the same for all values of t.
hi(t) = h0(t) · e^{βᵀ xi},   i = 1, 2;

h1(t) / h2(t) = [ h0(t)·e^{βᵀ x1} ] / [ h0(t)·e^{βᵀ x2} ] = e^{βᵀ (x1 − x2)} = C,

so that h1(t) = h2(t)·C for all t.
The covariates can be discrete, continuous
or even categorical. It is also possible to
generalise the model to allow time varying
covariates.
Example
Assume we have a situation with one
covariate that takes two different values 0
and 1. This is the case when we wish to
compare two treatments
k  1; β   1   ;
x 1  0; x 2  1;
h1 ( t )  h 0 ( t );
h 2 ( t )  h 0 ( t ). e
h2(t)
h1(t)

t
Comments

From the definition we see immediately that

Si(t) = exp{ −∫₀ᵗ hi(u) du }
      = exp{ −[ ∫₀ᵗ h0(u) du ] · e^{βᵀ xi} }
      = [ exp{ −∫₀ᵗ h0(u) du } ]^{ e^{βᵀ xi} }
      = [ S0(t) ]^{ e^{βᵀ xi} }

• The baseline can be interpreted as the case corresponding to an individual with covariate values zero.
• The name semi-parametric is due to the fact that we do not model h0 explicitly. Usual likelihood theory does not apply.
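A quick numerical check of this identity in Python (the Weibull baseline is chosen only for illustration):

```python
import numpy as np

# Hypothetical baseline: Weibull cumulative hazard H0(t) = (t/2)**1.5
t = np.linspace(0.1, 5, 50)
H0 = (t / 2) ** 1.5
S0 = np.exp(-H0)

beta, x = 0.5, 1.0                           # one covariate
Si_direct = np.exp(-H0 * np.exp(beta * x))   # exp{ -H0(t) * exp(beta*x) }
Si_power = S0 ** np.exp(beta * x)            # [S0(t)]^{exp(beta*x)}

assert np.allclose(Si_direct, Si_power)
```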
Maximum likelihood
Partial likelihood
“Partial” Likelihood estimates
Let L() stand for the likelihood function and  for some
parameter (vector) of interest. Then, the maximum
likelihood estimates are found by solving the set of
equations
 log L (  )
 i
 0 , i  1, 2 ,...,
Define the information matrix, I() to have elements
  2 log L (  ) 
.
I ij (  )   E 
  

i
j


It can be shown that the estimate to have an asymptotically normal distribution
with means i and variance/covariance matrix
I (  ) 
1
• We write log Lj = lj. The partial log-likelihood is

log L(β) = Σ_{j=1}^{J} log Lj(β) = Σ_{j=1}^{J} [ βᵀ xj − log( Σ_{i∈Rj} e^{βᵀ xi} ) ] = Σ_{j=1}^{J} lj(β)

where the sum runs over the J distinct failure times and Rj denotes the risk set at the j-th failure time.
Score and information

Uk(β) = Σ_{j=1}^{J} ∂lj/∂βk,   with

∂lj/∂βk = xkj − [ Σ_{i∈Rj} xki·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ]

∂²lj/∂βk² = − { [ Σ_{i∈Rj} xki²·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ] − ( [ Σ_{i∈Rj} xki·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ] )² }
• The two derivatives can be used to get parameter estimates and information (see the sketch below).
• One difficulty we have avoided so far is the occurrence of ties.
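As an illustration of how these derivatives are used, here is a minimal Newton-Raphson sketch for a single covariate with no ties; the data are hypothetical, and real software (e.g. PROC PHREG) additionally handles ties, multiple covariates and numerical safeguards:

```python
import numpy as np

# Hypothetical data: observed times, event indicator (1 = failure), one covariate x
time = np.array([5., 8., 12., 14., 20., 21., 25., 30.])
event = np.array([1, 1, 0, 1, 1, 0, 1, 1])
x = np.array([0., 1., 1., 0., 1., 0., 0., 1.])

beta = 0.0
for _ in range(25):                       # Newton-Raphson iterations
    score, info = 0.0, 0.0
    for j in np.where(event == 1)[0]:     # loop over failure times
        risk = time >= time[j]            # risk set R_j
        w = np.exp(beta * x[risk])        # e^{beta * x_i} over the risk set
        xbar = np.sum(x[risk] * w) / np.sum(w)
        x2bar = np.sum(x[risk] ** 2 * w) / np.sum(w)
        score += x[j] - xbar              # d l_j / d beta
        info += x2bar - xbar ** 2         # minus d^2 l_j / d beta^2
    step = score / info
    beta += step
    if abs(step) < 1e-8:
        break

se = 1.0 / np.sqrt(info)                  # asymptotic standard error from I(beta)^-1
print(round(beta, 4), round(se, 4))
```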
Ties
• dj = the # of failures at tj.
• sj = the sum of the covariate values for patients who die at
time tj.
• Notice that some (arbitrary) order must be imposed on the ties.
log L(β) = Σ_{j=1}^{J} [ βᵀ sj − dj·log( Σ_{i∈Rj} e^{βᵀ xi} ) ] = Σ_{j=1}^{J} lj

∂lj/∂βk = skj − dj·[ Σ_{i∈Rj} xki·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ]

∂²lj/∂βk² = − dj·{ [ Σ_{i∈Rj} xki²·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ] − ( [ Σ_{i∈Rj} xki·e^{βᵀ xi} ] / [ Σ_{i∈Rj} e^{βᵀ xi} ] )² }
A numerical Example
TIME TO RELIEF OF ITCH SYMPTOMS FOR PATIENTS USING
A STANDARD AND EXPERIMENTAL CREAM
What about a t-test?
The mean difference in the time to “cure” of 1.2 days
is not statistically significant between the two groups.
Using SAS
PROC PHREG ;
MODEL RELIEFTIME * STATUS(0) = DRUG ;
Note that the estimate for the DRUG variable is −1.3396, with a p-value of 0.0346. The negative sign
indicates a negative association between the
hazard of being cured and the DRUG variable. But
the variable DRUG is coded 1 for the new drug and
coded 2 for the standard drug. Therefore the hazard
of being cured is lower in the group given the
standard drug. This is an awkward but accurate
way of saying that the new drug tends to produce a
cure more quickly than the standard drug. The
mean time to cure is lower in the group given the
new drug. There is an inverse relationship between
the average time to an event and the hazard of that
event.
The ratio of the hazards is given by

h2(t) / h1(t) = e^{β̂·2} / e^{β̂·1} = e^{(2 − 1)·β̂} = e^{−1.3396} ≈ 0.262
• At each time point the cure rate of the standard drug is about a quarter (0.262) of that of the new drug. Put more positively, we might state that the cure rate is about 3.8 times (1/0.262) higher in the group given the experimental cream than in the group given the standard cream.
Comparing two groups in
general
Let
• nAj be the number at risk in group A at time tj
• nBj be the number at risk in group B at time tj

then

U(β) = Σ_{j=1}^{J} [ sj − dj·nAj·e^β / ( nAj·e^β + nBj ) ]

I(β) = Σ_{j=1}^{J} dj·nAj·nBj·e^β / ( nAj·e^β + nBj )²

Q = U(0)² / I(0) ≈ χ²(1)

The score test.
A numerical example

• Assume that a study aiming at comparing two groups resulted in the following data:
• U(0) = −10.2505
• I(0) = 6.5957
• β̂ = −1.509192

hi(t) = h0(t)·e^{β·xi},   i = 1, 2,   with x1 = 1 (treatment) and x2 = 0 (placebo);

h1(t) / h2(t) = [ h0(t)·e^{β̂·1} ] / [ h0(t)·e^{β̂·0} ] = e^{β̂} = e^{−1.509192} ≈ 0.2211

The treatment reduces the hazard rate to almost ¼ of that under placebo.
Log-rank vs the score test
(PHREG)
• The two methods give very similar answers,
why?
• On one hand, the score test can be based on the statistic

Q = U(0)² / I(0) ≈ χ²(1),   where

U(0) = Σ_{j=1}^{J} ( dAj − dj·nAj / nj )

I(0) = Σ_{j=1}^{J} dj·nAj·nBj / nj²

and, at each distinct event time tj,

dj = dAj + dBj = # events in both groups;
nj = nAj + nBj = # at risk in both groups;
sj = dAj.

The data at tj can be laid out in a 2×2 table:

            CURE      NO CURE        TOTAL
DRUG        dAj       nAj − dAj      nAj
CONTROL     dBj       nBj − dBj      nBj
TOTAL       dj        nj − dj        nj
• Compare this to the log-rank statistic

Q_{M-H} = U² / Var(U) ≈ χ²(1),   where

U = Σ_{j=1}^{J} ( dAj − μj ),   μj = E[dAj] = dj·nAj / nj

σj² = Var(dAj) = dj·(nj − dj)·nAj·nBj / ( (nj − 1)·nj² ),   Var(U) = Σ_{j=1}^{J} σj²

(these are the general expressions μt = E[At] = Dt·Nt/Tt and σt² = Dt·(Tt − Dt)·Nt·Mt / (Tt²·(Tt − 1)) written in the two-group notation).
SCORE TEST

Q = [ Σ_{j=1}^{J} ( dAj − dj·nAj / nj ) ]² / [ Σ_{j=1}^{J} dj·nAj·nBj / nj² ] ≈ χ²(1)

LOG-RANK TEST

Q_{M-H} = [ Σ_{j=1}^{J} ( dAj − dj·nAj / nj ) ]² / [ Σ_{j=1}^{J} dj·(nj − dj)·nAj·nBj / ( (nj − 1)·nj² ) ] ≈ χ²(1)
1. When dj = 1 for all j (no ties), the two formulas are identical. In general the two statistics are approximately the same when nj is large.
2. For the example, in a case with ties there is a slight difference (see the sketch below):
- Q = 15.930
- Q_{M-H} = 16.793
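A small Python sketch contrasting the two denominators on made-up grouped data (not the slide's data, so the numbers will differ from Q = 15.930 and Q_{M-H} = 16.793):

```python
import numpy as np

# One row per distinct event time: [n_Aj, n_Bj, d_Aj, d_Bj]
data = np.array([
    [10, 10, 2, 0],
    [8, 10, 1, 2],
    [7, 8, 2, 1],
    [5, 7, 0, 2],
])

U, i0, varU = 0.0, 0.0, 0.0
for nA, nB, dA, dB in data:
    n, d = nA + nB, dA + dB
    U += dA - d * nA / n
    i0 += d * nA * nB / n**2                          # score-test denominator
    varU += d * (n - d) * nA * nB / ((n - 1) * n**2)  # log-rank (hypergeometric) variance

print("score test Q   =", round(U**2 / i0, 3))
print("log-rank Q_M-H =", round(U**2 / varU, 3))
```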
Generalizations of Cox
regression
1. Time-dependent covariates
2. Stratification
3. General link functions
4. Likelihood ratio tests
5. Sample size determination
6. Goodness of fit
7. SAS
References
• Cox & Oakes (1984) “Analysis of survival data”.
Chapman & Hall.
• Fleming & Harrington (1991) “Counting
processes and survival analysis”. Wiley & Sons.
• Allison (1995). “Survival analysis using the SAS
System”. SAS Institute.
• Therneau & Grambsch (2000) “Modeling
Survival Data”. Springer.
• Hougaard (2000) “Analysis of Multivariate
survival data”. Springer.
Questions or Comments?