Roy N. Tamura, Ph.D.
Eli Lilly and Company
Indianapolis, Indiana
2001 Purdue University
Department of Statistics Seminar
I. Background on Depression and Depression Clinical Trials
II. Cure Model for Time to Response
III. Test of Latency for Cure Model
IV. Proportional Hazards Cure Model
• Lifetime risk: Women: 10-25% Men: 5-12%
• Average age at onset: mid-20s
• Course of illness:
– 50-60% of patients will have a 2nd episode
• Estimated Annual Costs to Business in the US: >40 billion dollars
– Absenteeism
– Lost Productivity
– Suicides
– Treatment/Rehabilitation
MIT Sloan School of Management Study
1. Medication
Tricyclics (Imipramine, Amitriptyline)
SSRIs (Prozac, Zoloft, Paxil)
Others (Wellbutrin)
2. Therapy
Cognitive Behavioral
Interpersonal
3. Electroshock
• Patients meeting diagnostic criteria for depression are randomly assigned to treatment groups
• Patients are scheduled for visits to a psychiatrist at prespecified visit intervals up to some time point (usually 6-8 weeks)
• At each visit, severity of depression is assessed using a structured interview and depression rating scale
1. Response Rate
2. Time to Response
Response is usually defined by change on a rating scale such as the CGI or the Hamilton Depression Scale (HAMD).
Cure Model:
H(t) = p S(t) + (1-p)
H(t) is the probability that time to response > t
p is the probability of response
S(t) is the probability that time to response > t among patients who respond
• Proportion of responders (p): incidence
• Time to response for responders (S(t)): latency
Nonparametric Generalized Maximum Likelihood Estimates: u is the endpoint of the trial.
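A minimal sketch of how such estimates could be computed, assuming they take the usual Kaplan-Meier form: with H_hat the Kaplan-Meier estimate of H, setting S(u) = 0 in the cure model gives p_hat = 1 - H_hat(u) and S_hat(t) = (H_hat(t) - (1 - p_hat)) / p_hat. The function names and the data are hypothetical and serve only as illustration.

```python
import numpy as np

def kaplan_meier(times, events):
    """Distinct response times and the Kaplan-Meier estimate of H just after each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    h = 1.0
    h_values = []
    for t in event_times:
        at_risk = np.sum(times >= t)            # patients still followed at t
        responded = np.sum((times == t) & events)
        h *= 1.0 - responded / at_risk
        h_values.append(h)
    return event_times, np.array(h_values)

def cure_model_estimates(times, events):
    """Assumed form: p_hat = 1 - H_hat(u), S_hat = (H_hat - (1 - p_hat)) / p_hat."""
    event_times, h_hat = kaplan_meier(times, events)
    p_hat = 1.0 - h_hat[-1]                     # H_hat is flat from the last response to u
    s_hat = (h_hat - (1.0 - p_hat)) / p_hat
    return p_hat, event_times, s_hat

# Hypothetical data: day of response, or day 42 (trial endpoint u) without response.
days = [7, 10, 14, 14, 21, 28, 42, 42, 42, 42]
responded = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
p_hat, t_grid, s_hat = cure_model_estimates(days, responded)
print(p_hat, t_grid, s_hat)
```

The sketch assumes at least one observed response; with none, p_hat would be zero and S_hat undefined.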
Six-Week Trial of Fluoxetine vs Fluoxetine + Pindolol in Major Depressive Disorder
One Hundred Eleven Randomized Patients
Twice-Weekly Visits for First Three Weeks, Weekly Visits Thereafter
Response: 50% or greater reduction from baseline in the 17-item HAMD score
[Figure: curves for Fluox + Plac and Fluox + Pin; vertical axis 0 to 1, horizontal axis 0 to 50]
[Figure: curves for Fluox + Plac and Fluox + Pin; vertical axis 0 to 1, horizontal axis 0 to 45]
Antidepressant A: 50% response rate
Everyone responds at exactly two weeks
Antidepressant B: 90% response rate
Everyone responds at exactly two weeks
Antidepressant B is more effective than Antidepressant A but does not exhibit faster onset of action.
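The two hypothetical drugs can be put through the cure model H(t) = p S(t) + (1 - p) to see why incidence and latency have to be separated. This is only an illustrative sketch; the function names are made up for the example.

```python
def latency_two_weeks(t_days):
    """S(t): probability that a responder has not yet responded by day t.
    Every responder responds at exactly day 14."""
    return 1.0 if t_days < 14 else 0.0

def overall_curve(t_days, p):
    """H(t) = p S(t) + (1 - p), the probability that time to response > t."""
    return p * latency_two_weeks(t_days) + (1.0 - p)

for day in (7, 14, 42):
    h_a = overall_curve(day, 0.5)   # Antidepressant A, 50% response rate
    h_b = overall_curve(day, 0.9)   # Antidepressant B, 90% response rate
    print(day, h_a, h_b)
```

From day 14 onward the overall curves differ (about 0.5 versus 0.1), although the latency function is identical for both drugs, so a comparison of the overall curves alone cannot tell higher incidence from faster onset.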
Suppose we want to compare incidence and latency between 2 drugs in a clinical trial
• Incidence: several tests available (Laska and Meisner, 1992)
• Latency: few tests in the literature until this past year
[Figure: curves for Fluox + Plac and Fluox + Pin; vertical axis 0 to 1, horizontal axis 0 to 45]
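A Two-Sample Cramer-von Mises Test Statistic:
W^2 = -\frac{(n_1 \hat{p}_1)(n_2 \hat{p}_2)}{n_1 \hat{p}_1 + n_2 \hat{p}_2} \int [\hat{S}_1(t) - \hat{S}_2(t)]^2 \, d\hat{S}(t)
Here \hat{p}_i and \hat{S}_i are the estimates for Group i, and \hat{S} is the estimate of the common latency distribution.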
Tamura, Faries & Feng, 2000
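A minimal sketch of evaluating a statistic of this form, assuming W^2 is the weight (n_1 p_1)(n_2 p_2) / (n_1 p_1 + n_2 p_2) times minus the integral of the squared difference between the two latency estimates with respect to a common latency estimate. The latency estimates are taken to be step functions on a shared grid of response times, so the integral reduces to a sum over the jumps of the common estimate. Evaluating the integrand at each jump time is one convention among several; the function name and inputs are hypothetical.

```python
import numpy as np

def cvm_statistic(n1, p1, n2, p2, s1, s2, s_common):
    """Two-sample Cramer-von Mises type statistic for latency.

    s1, s2, s_common: latency estimates evaluated at the ordered jump
    times of s_common (all start from 1 before the first jump).
    """
    s1, s2, s_common = (np.asarray(a, dtype=float) for a in (s1, s2, s_common))
    jumps = np.diff(np.concatenate(([1.0], s_common)))   # dS(t_j), each <= 0
    weight = (n1 * p1) * (n2 * p2) / (n1 * p1 + n2 * p2)
    return -weight * np.sum((s1 - s2) ** 2 * jumps)

# Hypothetical inputs for illustration only.
w2 = cvm_statistic(
    n1=50, p1=0.6, n2=50, p2=0.6,
    s1=[0.80, 0.50, 0.20, 0.0],
    s2=[0.90, 0.70, 0.30, 0.0],
    s_common=[0.85, 0.60, 0.25, 0.0],
)
print(w2)
```

The leading minus sign offsets the negative jumps of a decreasing function, so the statistic is non-negative and equals zero when the two latency estimates coincide.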
Bootstrap for Cramer von Mises statistic
From the sample data, construct C_1, C_2, p_1, p_2, S, where C_1 and C_2 are the Kaplan-Meier estimates of the censoring distributions for Groups 1 and 2.
1. Generate Z, a Bernoulli(p_i) random variable.
2. If Z = 1, then generate response time T* from S. If Z = 0, then set T* arbitrarily large.
3. Generate censoring time U* from C_i.
4. Construct the pair (Y*, δ*), where Y* is the minimum of U* and T*, and δ* is the indicator variable taking the value 1 if T* is less than U*.
Repeat Steps 1-4 for the sample sizes of the trial and construct a bootstrap test statistic value W^2*.
Use the empirical distribution of W^2* to determine p-values for the observed value of W^2.
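Steps 1-4 can be sketched for one treatment group as follows. This is an illustration under stated assumptions, not the authors' implementation: the latency and censoring estimates are represented as discrete distributions (support points with probabilities summing to 1, with any leftover censoring mass placed at the trial endpoint), and all names and numbers are hypothetical.

```python
import numpy as np

def bootstrap_group(n, p, resp_times, resp_probs, cens_times, cens_probs, rng):
    """Generate one bootstrap sample of size n for a single group."""
    # Step 1: Z ~ Bernoulli(p) decides whether the patient is a responder.
    z = rng.random(n) < p
    # Step 2: responders draw T* from S; non-responders get T* = infinity.
    t_star = np.where(z, rng.choice(resp_times, size=n, p=resp_probs), np.inf)
    # Step 3: censoring time U* from the group's censoring distribution.
    u_star = rng.choice(cens_times, size=n, p=cens_probs)
    # Step 4: observed time and response indicator (1 if T* < U*).
    y_star = np.minimum(t_star, u_star)
    delta_star = (t_star < u_star).astype(int)
    return y_star, delta_star

rng = np.random.default_rng(0)
y, d = bootstrap_group(
    n=75, p=0.6,
    resp_times=[7, 14, 21, 28], resp_probs=[0.2, 0.4, 0.3, 0.1],
    cens_times=[14, 28, 42], cens_probs=[0.1, 0.1, 0.8],
    rng=rng,
)
print(y[:10], d[:10])
```

Running this for both groups and recomputing the test statistic on the resampled data gives one bootstrap value; one natural choice of p-value is the fraction of bootstrap values at or above the observed statistic.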
Fluoxetine / Pindolol Case Study
• Cramer-von Mises Statistic: W^2 = .247, bootstrap p = .204
• Proportions Test of Equality of Incidence: Z = 2.33, p = .020
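As a quick arithmetic check, the incidence p-value can be recomputed from Z, assuming it is a two-sided p-value against a standard normal reference (the slide does not state this). The function name is hypothetical.

```python
import math

def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_incidence = two_sided_normal_p(2.33)
print(round(p_incidence, 3))
```

Under that assumption the result agrees with the reported p = .020 to three decimals.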
Simulation Study of CvM/bootstrap procedure:
Seven Scheduled Visits
Sample Sizes: 50 - 100 per group
Response Rates: 0.6 - 0.9 (equal and unequal across groups)
Censoring Rates: 0 - 50%
Proportional Hazards: S_2(t) = {S_1(t)}^β, with S_1 chosen as Weibull (median time to response 17 days)
1000 realizations, each realization uses 1000 bootstrap repetitions
Simulation Results for n_1 = n_2 = 75 (Nominal level = 0.05)

p_1   p_2   Censoring        β     Rejection Rate
.6    .6    None             2     .764
.6    .6    Moderate (35%)   2     .710
.6    .6    Heavy (52%)      2     .533
.6    .6    None             2.5   .938
.6    .6    Moderate         2.5   .904
.6    .6    Heavy            2.5   .805
.6    .6    None             1     .049
.6    .6    Moderate         1     .048
.6    .6    Heavy            1     .073
.6    .6    None             1.5   .322
.6    .6    Moderate         1.5   .279
.6    .6    Heavy            1.5   .195

NOTE: β = 2.5 corresponds to a shift in median time to response from 17 days to 11 days.
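The median shifts behind the other values of the proportional hazards parameter (written β here) can be approximated from the note. This sketch assumes a two-parameter Weibull, S_1(t) = exp(-(t / λ)^k), for which raising S_1 to the power β divides the median by β^(1/k). The shape k is not given on the slides; it is backed out from the stated 17-to-11-day shift at β = 2.5, so the results are only as accurate as those rounded figures.

```python
import math

MEDIAN_1 = 17.0   # median time to response under S_1, in days (from the slides)

# Shape implied by the stated shift to 11 days at beta = 2.5 (an inferred value).
shape_k = math.log(2.5) / math.log(17.0 / 11.0)

def median_under_beta(beta, median_1=MEDIAN_1, k=shape_k):
    """Median of S_2 = S_1 ** beta when S_1 is Weibull with shape k."""
    return median_1 * beta ** (-1.0 / k)

for beta in (1.0, 1.5, 2.0, 2.5):
    print(beta, round(median_under_beta(beta), 1))
```

Under these assumptions the implied shape is a little above 2, and the smaller values of β correspond to shifts of only a few days.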
1. Active comparator antidepressant trials usually have low drop-out rates.
2. Simulations of weekly assessments versus instantaneous observation of response suggest little effect on level or power of Cramer-von Mises test.
3. Typical antidepressant clinical trials have power to detect a 5-7 day shift in median time to response.
H(t) = p(x) S(t) + (1 - p(x))
where p(x) = Pr(Response; x) = exp(x'b) / (1 + exp(x'b))
and S(t) = (S_0(t))^{exp(z'β)}
Kuk and Chen, 1992
Sy and Taylor, 2000
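The model can be evaluated directly once the coefficients and the baseline latency are given. The sketch below is only illustrative: the logistic coefficients are written b and the latency coefficients beta, the function name is hypothetical, and the inputs are made-up values rather than fitted estimates.

```python
import numpy as np

def ph_cure_curve(s0_t, x, b, z, beta):
    """H(t) = p(x) S(t) + (1 - p(x)) for baseline latency value s0_t = S_0(t)."""
    # Logistic incidence: exp(x'b) / (1 + exp(x'b)) = 1 / (1 + exp(-x'b)).
    p_x = 1.0 / (1.0 + np.exp(-np.dot(x, b)))
    # Proportional hazards latency: S(t) = S_0(t) ** exp(z'beta).
    s_t = s0_t ** np.exp(np.dot(z, beta))
    return p_x * s_t + (1.0 - p_x)

# Hypothetical inputs: intercept plus treatment indicator for incidence,
# treatment indicator alone for latency.
h_control = ph_cure_curve(0.4, x=[1, 0], b=[0.0, 1.0], z=[0], beta=[0.5])
h_treated = ph_cure_curve(0.4, x=[1, 1], b=[0.0, 1.0], z=[1], beta=[0.5])
print(h_control, h_treated)
```

A positive incidence coefficient raises p(x), and a positive latency coefficient makes S(t) fall faster; both lower H(t) for the treated patient in this example.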
Proportional hazards cure model
• Estimate b, β, and S_0(t) using maximum likelihood. Inference about parameters b and β is based on the observed information matrix.
• Constraining S_0(t) to zero after the last observed response time leads to better estimation.
Sy and Taylor, 2000
Parameter                 Estimate   S.E.   p
Incidence
  Treatment                 1.01     0.45   .026
Latency
  Treatment                -0.01     0.27   .968
Baseline covariates: melancholia diagnosis (yes/no) and HAMD 17 score.

Parameter                 Estimate   S.E.   p
Incidence
  Treatment                 0.97     0.46   .035
  Melancholia Diagnosis     0.30     0.60   .671
  HAMD 17 Score            -0.09     0.07   .203
Latency
  Treatment                -0.08     0.28   .772
  Melancholia Diagnosis    -0.19     0.37   .602
  HAMD 17 Score            -0.03     0.05   .540
1. Attractive to be able to adjust for covariates.
2. Computationally intensive. Can't ignore S_0(t).
3. Increased Type I error for latency parameter β in presence of heavy censoring.
1. Examining time to response increasing in importance in tests of new antidepressants.
2. Cure model is a simple way to separate incidence from latency.
3. Tests of latency possible using CvM statistic or cure model PH analyses.
4. Both CvM and PH analyses of latency need low censoring to preserve nominal level.
References
• Laska, EM, Meisner, MJ. Nonparametric estimation and testing in a cure model. Biometrics 1992; 48: 1223-1234.
• Tamura, RN, Faries, DE, Feng, J. Comparing time to onset of response in antidepressant clinical trials using the cure model and the Cramer-von Mises test. Statistics in Medicine 2000; 19: 2169-2184.
• Kuk, AYC, Chen, CH. A mixture model combining logistic regression with proportional hazards regression. Biometrika 1992; 79: 531-541.
• Sy, JP, Taylor, JMG. Estimation in a Cox proportional hazards cure model. Biometrics 2000; 56: 227-236.