An Analytic Road Map for
Incomplete Longitudinal Clinical
Trial Data
Craig Mallinckrodt
Graybill Conference
June 12, 2008
Fort Collins, CO
Acknowledgements
PhRMA Expert Team on Missing Data
Peter Lane, GSK
Craig Mallinckrodt, Lilly
James Mancuso, Pfizer
Yahong Peng, Merck
Dan Schnell, P&G
Geert Molenberghs
Ray Carroll
Many Lilly colleagues
Outline
Why do we care
What do we know
• Theory
• Application
What we should do
Medical Needs
Every hour we expect
• 195 deaths due to cancer
• 1950 new diagnoses of anxiety disorders
• 15 new diagnoses of schizophrenia
• 30 osteoporosis-related hip fractures
• 1500 surgeries requiring pain treatment
• 70 deaths due to cardiovascular disease
Alan Breier – Nov 2006
Need for More Effective Medicines
Therapeutic Area          Efficacy rate (%)
Alzheimer’s               30
Analgesics (Cox-2)        80
Asthma                    60
Cardiac Arrhythmias       60
Depression (SSRI)         62
Diabetes                  57
HCV                       47
Incontinence              40
Migraine (acute)          52
Migraine (prophylaxis)    50
Oncology                  25
Osteoporosis              48
Rheumatoid arthritis      50
Schizophrenia             60
There is an efficacy gap in terms of customer expectations and the drugs we prescribe
Trends in Molecular Medicine 7(5):201-204, 2001
R&D Productivity Decreasing
[Figure: industry R&D expense ($ billions) and annual NME and biologics approvals, by year, 1980 to 2007]
Source: PhRMA, FDA, Lehman Brothers; [Dr. Robert Ruffolo]
Starting Point
• No universally best method for analyzing
longitudinal data
• Analysis must be tailored to the specific
situation at hand
• Consider the hypothesis to be tested, desired
attributes of the analysis, and the
characteristics of the data
Missing Data Mechanisms
MCAR - missing completely at random
• Conditional on the independent variables in
the model, neither observed nor unobserved
outcomes of the dependent variable explain
dropout
MAR - missing at random
• Conditional on the independent variables in
the model, observed outcomes of the
dependent variable explain dropout, but
unobserved outcomes do not
Missing Data Mechanisms
MNAR - missing not at random
• Conditional on the independent variables in
the model and the observed outcomes of the
dependent variable, the unobserved
outcomes of the dependent variable explain
dropout
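The three definitions can be written compactly. The notation below is not from the slides and is assumed here for illustration only: R is the dropout indicator, X the independent variables in the model, and Y_obs and Y_mis the observed and unobserved outcomes of the dependent variable.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid X) \\
\text{MAR:}\quad  & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid X, Y_{\mathrm{obs}}) \\
\text{MNAR:}\quad & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ still depends on } Y_{\mathrm{mis}}
\end{align*}
```

Each line conditions on X, which is why the mechanism is tied to the model as well as to the data.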
Consequences
• Missing data mechanism is a characteristic
of the data AND the model
• Differential dropout by treatment indicates
covariate dependence, not mechanism
• Mechanism can vary from one outcome to
another in the same dataset
Missing Data in Clinical Trials
• Efficacy data in clinical trials are seldom MCAR
because the observed outcomes typically
influence dropout (discontinuation, DC, for lack of efficacy)
• Trials are designed to observe all the relevant
information, which minimizes MNAR data
• Hence in the highly controlled scenario of
clinical trials missing data may be mostly MAR
• MNAR can never be ruled out
Implications
• All analyses rely on missing data assumptions
• Any options in the trial design to minimize
dropout should be strongly considered
Assumptions
• ANOVA with BOCF / LOCF assumes MCAR and a
constant profile after dropout
• MAR always more plausible than MCAR
• MAR methods will be valid in every case
where BOCF/ LOCF is valid
• BOCF / LOCF will not be valid in every
scenario where MAR methods are valid
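To make the "constant profile" assumption concrete, here is a minimal Python sketch (not from the slides) of how LOCF and BOCF fill in visits after dropout. The function names `locf` and `bocf` and the patient data are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def locf(baseline: float, visits: List[Optional[float]]) -> List[float]:
    """Last observation carried forward: each missing visit takes the most
    recent observed value (the baseline if nothing has been observed yet)."""
    filled = []
    last = baseline
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(baseline: float, visits: List[Optional[float]]) -> List[float]:
    """Baseline observation carried forward: each missing visit takes the
    baseline value."""
    return [baseline if value is None else value for value in visits]


# Hypothetical patient: baseline score 20, observed at visits 1-3, then dropout.
scores = [18.0, 15.0, 13.0, None, None, None]
locf_scores = locf(20.0, scores)
bocf_scores = bocf(20.0, scores)
print(locf_scores)  # [18.0, 15.0, 13.0, 13.0, 13.0, 13.0]
print(bocf_scores)  # [18.0, 15.0, 13.0, 20.0, 20.0, 20.0]
```

Both rules freeze the patient's profile at a single value after dropout, which is the constant-profile assumption the slide refers to.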
Research Showing MAR Is Useful And /
Or Better Than LOCF
1. Arch. Gen. Psych. 50: 739-750.
2. Arch. Gen. Psych. 61: 310-317.
3. Biol. Psychiatry. 53: 754-760.
4. Biol. Psychiatry. 59: 1001-1005.
5. Biometrics. 52: 1324-1333.
6. Biometrics. 57: 43-50.
7. Biostatistics. 5: 445-464.
8. BMC Psychiatry. 4: 26-31.
9. Clinical Trials. 1: 477–489.
10. Computational Statistics and Data Analysis. 37: 93-113.
11. Drug Information J. 35: 1215-1225.
12. J. Biopharm. Stat. 8: 545-563.
13. J. Biopharm. Stat. 11: 9-21.
Research Showing MAR Is Useful And /
Or Better Than LOCF
14. J. Biopharm. Stat. 12: 207-212.
15. J. Biopharm. Stat. 13: 179-190.
16. J. Biopharm. Stat. 16: 365-384.
17. Neuropsychopharmacol. 6: 39-48.
18. Obesity Reviews. 4: 175-184.
19. Pharmaceutical Statistics. 3: 161-170.
20. Pharmaceutical Statistics. 3: 171-186.
21. Pharmaceutical Statistics. 4: 267-285.
22. Pharmaceutical Statistics (2007 early view) DOI: 10.1002/pst.267
23. Statist. Med. 11: 2043-2061.
24. Statist. Med. 14: 1913-1925.
25. Statist. Med. 22: 2429-2441.
Why Is LOCF Still Popular
• LOCF perceived to be conservative
• Concern over how MAR methods perform under
MNAR
• More explicit modeling choices needed in MAR
methods
• LOCF thought to measure something more
valuable
Conservatism Of LOCF
• Bias in LOCF has been shown analytically and
empirically to be influenced by many factors
• Direction and magnitude of bias highly situation
dependent and difficult to anticipate
• A summary of a recent NDA showed that LOCF yielded
a lower p-value than MMRM in 34% of analyses
Biostatistics. 5:445-464.
BMC Psychiatry. 4: 26-31.
Performance Of MAR With MNAR Data
• Studies showing MAR methods provide better control of
Type I and Type II error than LOCF:
Arch. Gen. Psych. 61: 310-317.
Clinical Trials. 1: 477–489.
Drug Information J. 35: 1215-1225.
J. BioPharm. Stat. 11: 9-21.
J. Biopharm. Stat. 12: 207-212.
Pharmaceutical Statistics (2007 early view) DOI: 10.1002/pst.267
JSM Proceedings. 2006. pp. 668-676.
More Explicit Modeling Choices Needed
• MMRM 6 lines of code, LOCF 5 lines of code
• Convergence and choice of correlation not
difficult in MMRM
Clinical Trials. 1: 477–489.
LOCF Thought To Measure Something
More Valuable
• LOCF is “effectiveness”, MAR is “efficacy”
• LOCF is what is actually observed
• MAR is what is estimated to happen if patients
stayed on study
• Non-longitudinal interpretation of LOCF
• LO, LAV
• Dropout is an outcome
Non-longitudinal Interpretation Of
LOCF
• An LOCF result can be interpreted as an index
of rate of change times duration on study drug:
a composite of efficacy, safety, and tolerability
• An index with unknown weightings
• The same estimate of mean change via LOCF
can imply different clinical profiles
• The LOCF penalty is not necessarily
proportional to the risk
• Result can be manipulated by design
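The "rate of change times duration" reading can be illustrated with a small Python sketch. It is not from the slides: the linear-change assumption, the function name `locf_change`, and both patients are hypothetical.

```python
def locf_change(rate_per_week: float, weeks_on_drug: float) -> float:
    """Endpoint change under LOCF for a patient whose score changes linearly
    until dropout and is then carried forward unchanged."""
    return rate_per_week * weeks_on_drug


# Hypothetical patients (negative change = improvement).
# Patient A improves quickly but drops out at week 4.
fast_but_early_dropout = locf_change(rate_per_week=-2.0, weeks_on_drug=4)
# Patient B improves half as fast but completes 8 weeks.
slow_but_completed = locf_change(rate_per_week=-1.0, weeks_on_drug=8)

print(fast_but_early_dropout, slow_but_completed)  # -8.0 -8.0
```

The two patients get the same LOCF value even though their clinical profiles differ, which is the sense in which the same LOCF estimate can imply different profiles.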
Completion Rates in Depression
Trials
[Figure: proportion of completers (0 to 1) by study, drug vs. placebo]
Placebo Dropout Rates Influenced
by Design In a Recent MDD NDA
Trial    % DC-AE    % Dropout
1        4.3        34.3
2        6.7        41.3
3        3.3        31
4        9.0        42
5        3.2        19
6        1.0        9
7        2.5        29.5
8        4.3        35.3
Trials 5 and 6 had titration
dosing and extension phases
Lillytrials.com
Modeling Philosophies
• Restrictive modeling
• Simple models with few independent
variables
• Often include only the design factors of the
experiment
Psychological Methods, 6, 330-351.
Modeling Philosophies
• Inclusive modeling
• Auxiliary variables included to improve
performance of the missing data
procedure – expand the scope of MAR
• Baseline covariates
• Time-varying post-baseline covariates:
Be careful not to dilute the treatment effect.
Including time-varying post-baseline covariates in
the analysis model can be dangerous; it may be
better to use them via imputation (or
propensity scoring or weighted analyses)
Psychological Methods, 6, 330-351.
Rationale For Inclusive Modeling
• MAR: conditional on the dependent and
independent variables in the analysis,
unobserved values of the dependent variable
are independent of dropout
• Hence adding more variables that explain
dropout can make missingness MAR that
would otherwise be MNAR
Analytic Road Map
• MAR with restrictive modeling as primary
• Use MAR with inclusive modeling and
MNAR methods as sensitivity analyses
• Use local influence to investigate impact of
influential patients
Pharmaceutical Statistics. 4: 267–285.
J. Biopharm. Stat. 16: 365-384.
Why Not MNAR As Primary
• Can do better than MAR only via assumptions
• Assumptions untestable
• Sensitivity to violations of assumptions and
model misspecification more severe in MNAR
• MNAR methods lack some desired attributes of a
primary analysis in a confirmatory trial
• No standard software
• Complex
Implementing The Road Map:
Example From A Depression Trial
259 patients, randomized in a 1:1 ratio to drug or placebo
Response: change in HAMD17 total score from baseline
6 post-baseline visits (Weeks 1, 2, 3, 5, 7, 9)
Primary objective: test the difference in mean change in
HAMD17 total score between drug and placebo at the
endpoint
Primary analysis: LB-MEM (likelihood-based mixed-effects model)
Patient Disposition
                     Drug      Placebo
Protocol complete    60.9%     64.7%
Adverse event        12.5%     4.3%
Lack of efficacy     5.5%      13.7%
Differential rates, timing, and/or reasons for
dropout do not necessarily distinguish
between MCAR, MAR, MNAR
Primary Analysis: LB-MEM
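proc mixed;
  class subject treatment time site;        /* categorical effects */
  model Y = baseline treatment time site
            treatment*time;                 /* visit-specific treatment effects */
  repeated time / sub = subject type = un;  /* unstructured within-patient covariance */
  lsmeans treatment*time / cl diff;         /* visit-wise contrasts with confidence limits */
run;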
This is a full multivariate model, with unstructured modeling
of time and correlation. More parsimonious approaches
may be useful in other scenarios
Treatment contrast 2.17, p = .024
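Inclusive Modeling in MI: Including Auxiliary AE Data
• Imputation Models
• $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \varepsilon_{ih}$
• $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \gamma_1 AE_{i1} + \dots + \gamma_{h-1} AE_{i(h-1)} + \varepsilon_{ih}$
• $Y_{ih} = \mu + \beta_1 Y_{i1} + \dots + \beta_{h-1} Y_{i(h-1)} + \gamma_1 AE_{i1} + \dots + \gamma_{h-1} AE_{i(h-1)}$
$\quad + \delta_1 (Y_{i1} \cdot AE_{i1}) + \dots + \delta_{h-1} (Y_{i(h-1)} \cdot AE_{i(h-1)}) + \varepsilon_{ih}$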
• Analysis Model
• MMRM as previously described
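The slides do not show how the analyses of the imputed datasets were combined. A common choice is Rubin's rules, and the Python sketch below assumes that approach. The function name `pool_rubin` and all numbers are hypothetical and are not results from the trial.

```python
from statistics import mean, variance
from typing import List, Tuple


def pool_rubin(estimates: List[float], std_errors: List[float]) -> Tuple[float, float]:
    """Pool per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its standard error, where the total
    variance is within + (1 + 1/m) * between for m imputations.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average squared SE
    between = variance(estimates)                  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical treatment contrasts from m = 5 imputed datasets.
contrasts = [2.1, 2.3, 2.2, 2.4, 2.0]
ses = [0.9, 0.9, 0.9, 0.9, 0.9]
pooled_contrast, pooled_se = pool_rubin(contrasts, ses)
print(round(pooled_contrast, 3), round(pooled_se, 3))
```

The between-imputation term is what carries the extra uncertainty due to the missing data into the pooled standard error.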
Result
• MI results were not sensitive to the different
imputation models
Endpoint contrast
MMRM             2.2
MI Y+AE          2.3
MI Y+AE+Y*AE     2.1
• Including AE data might be important in other
scenarios. Many ways to define AE
MNAR Modeling
• Implement a selection model
– Had to simplify model: modeled time as linear + quadratic, and
used AR(1) correlation
• Compare results from assuming MAR, MNAR
• Also obtain local influence to assess impact
of influential patients on treatment contrasts
and non-random dropout
Selection Model Results
             MAR         MNAR
Contrast     2.20        2.18
(p-value)    (0.0179)    (0.0177)
Missingness Parameters
Parameter    Estimate    SE
0            -2.46       0.27
1            0.11        0.05
2            -0.08       0.06
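The slides do not state the form of the dropout model. As a sketch only, assume a logistic dropout model in which missingness parameter 0 is an intercept, parameter 1 multiplies the previous outcome, and parameter 2 multiplies the current, possibly unobserved outcome. Under that assumption, setting parameter 2 to zero gives a MAR dropout model. The function name, the argument names, and the outcome values below are hypothetical.

```python
import math


def dropout_probability(prev_outcome: float, current_outcome: float,
                        psi0: float, psi1: float, psi2: float) -> float:
    """Probability of dropout under the ASSUMED logistic form
    logit(p) = psi0 + psi1 * prev_outcome + psi2 * current_outcome."""
    linear = psi0 + psi1 * prev_outcome + psi2 * current_outcome
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical outcomes for one patient; the parameter values are the
# reported estimates, plugged into the assumed model form.
p_mnar = dropout_probability(-5.0, -8.0, psi0=-2.46, psi1=0.11, psi2=-0.08)
# Same patient with the parameter on the current outcome set to zero (MAR).
p_mar = dropout_probability(-5.0, -8.0, psi0=-2.46, psi1=0.11, psi2=0.0)
print(round(p_mnar, 4), round(p_mar, 4))
```

In this assumed form, an estimate for parameter 2 that is small relative to its standard error is consistent with little evidence of non-random dropout.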
Local Influence: Influential Patients
[Figure: local influence measure Ci by patient number (0 to 250); labeled patients: #6, #30, #50, #154, #179]
Individual Profiles with Influential
Patients Highlighted
[Figure: individual profiles of change in HAMD17 by week, in two panels (Duloxetine and placebo), with influential patients highlighted, including #30]
Investigating The Influential
Patients
The most influential patient was #30, a drug-treated
patient who had the unusual profile of a big
improvement but dropped out at week 1
This patient was in his/her first MDD episode when
s/he was enrolled
This patient dropped out based on his/her own
decision claiming that the MDD was caused by high
carbon monoxide level in his/her house
This patient was of dubious value for assessing the
efficacy of the drug
Selection Model: Influential
Patients Removed
Removed Subjects     (30, 191)                     (6, 30, 50, 154, 179, 191)
                     MAR            MNAR           MAR            MNAR
Diff. at endpoint    2.07           2.07           2.40           2.40
(p-value)            (0.0241)       (0.0237)       (0.0082)       (0.0083)
Missingness Parameters, estimate (SE)
0                    -2.22 (0.14)   -2.44 (0.27)   -2.23 (0.15)   -2.47 (0.28)
1                    0.05 (0.02)    0.11 (0.05)    -0.05 (0.02)   0.11 (0.06)
2                                   -0.07 (0.06)                  -0.08 (0.06)
Implications
Comforting that no subjects had a huge
influence on results. The impact would be
bigger in a smaller trial
Similar to other depression trials we have
investigated, results not influenced by MNAR
data
We can be confident in the primary result
Discussion
MAR with restrictive modeling was a
reasonable choice for the primary analysis
MAR with inclusive modeling and MNAR methods
were useful in assessing sensitivity
Sensitivity analyses promote the appropriate
level of confidence in the primary result and
lead us to an alternative analysis in which we
can have the greatest possible confidence
Opinions
• Inclusive modeling has been underutilized
• More research to understand dropout would be
useful
• Did not discuss pros and cons of various ways
to implement inclusive modeling. Use the one
you know? Be careful to not dilute the treatment effect
• The road map for analyses used in the example
data is specific to that scenario
Conclusions
• No universally best method for analyzing
longitudinal data
• Analysis must be tailored to the specific
situation at hand
• Considering the missingness mechanism and
the modeling philosophy provides the framework
in which to choose an appropriate primary
analysis and appropriate sensitivity analyses
Conclusion
• LOCF and BOCF are not acceptable choices for the
primary analysis
• MAR is a reasonable choice for the primary
analysis in the highly controlled situation of
confirmatory clinical trials
• MNAR can never be ruled out
• Sensitivity analyses and efforts to understand
and lower rates of dropout are essential