Bayesian Adaptive Methods

DESIGN AND ANALYSIS OF CLINICAL TRIALS
Bayesian Adaptive Methods
An Application to Phase I Clinical Trials
Carrie Ann Deis, Nadine Dewdney
May 2012
Bayesian adaptive methods are used increasingly often in clinical trials. Because of their
flexibility, they are best suited to Phase I and II trials. Applying Bayesian adaptive methods is
shown to be more favorable than using the traditional approach. The benefits of Bayesian adaptive
methods include smaller sample sizes, increased power and a lower incidence of ineffective dosing. A
hybridization of the traditional and the Bayesian adaptive designs improves upon the traditional design.
The FDA has addressed the increasing use of Bayesian adaptive designs in guidance documents.
This paper discusses Bayesian adaptive methods and how they compare to the more traditional
approach to drug development.
Table of Contents
I. Introduction
A. Background Information and Research
B. Adaptive Designs
C. Bayesian Approach
D. Prior Distributions
E. Traditional vs. Bayesian
F. Hybridization
G. FDA Guidance – Medical Devices
H. Conclusion
I. References
I. Introduction
A. Background Information and Research
Phase I clinical trials are conducted to determine the toxicity of a drug and the appropriate
dosing of the new intervention. It is the first time that the drug is tested in humans. The
sample size is relatively small, typically 20 to 50 patients. Depending on the therapeutic nature
of the drug, Phase I trials may start with healthy volunteers. In the case of cancer therapy drugs,
Phase I trials are sometimes conducted on cancer patients who have failed to respond to
conventional therapy. In drug development, it is assumed that the effectiveness of a drug
increases as the dose level increases. However, the risk of toxicity also increases with dose,
so Phase I trials seek the maximum tolerated dose (MTD).
Phase I trials have some important attributes. Before the trial starts, there must be a defined
starting dose, a toxicity profile with a definition of dose-limiting toxicity (DLT), a target
toxicity level and a dose-escalation scheme. The starting dose is commonly chosen as one-tenth of
the LD10 in mice, or one-third of the lowest toxic dose in dogs. Dose escalation is done
incrementally. In most studies, the increments are predetermined. A modified Fibonacci sequence is
often used for dose escalation, with the rate of increase diminishing as the doses get higher.
Standard designs assign patients to dose levels according to predefined rules. No assumption is
made about the shape of the dose-toxicity curve. These designs are classified as up-and-down
designs because they allow for both escalation and de-escalation of doses.
The traditional approach to determining the MTD is to find the dose at which one-third of the
subjects would develop toxicity. The doses $D_1, \dots, D_K$ would be selected so that they are
close to the MTD. The subjects would be randomized to the doses and the number of subjects
developing toxicity at dose $D_i$, $r_i$, would be observed. The proportion exhibiting toxicity at
each dose would be estimated as $p_i = r_i / n_i$, where $n_i$ is the number of subjects treated at
$D_i$. The dose-response relationship would be modeled from these probabilities of toxicity, and
the MTD would be estimated from the fitted model. This method is known as the frequentist
approach.
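As a minimal sketch of the arithmetic in this traditional approach, the Python fragment below computes the observed toxicity proportion at each dose and picks the dose whose proportion is closest to one-third. The dose levels and counts are hypothetical, and choosing the nearest observed proportion is only a crude stand-in for fitting a dose-response model.

```python
# Hypothetical data (not from the paper): subjects treated and subjects with
# toxicity at each of four illustrative dose levels.
doses = [25, 50, 100, 150]   # dose levels D_1..D_K
n = [6, 6, 6, 6]             # n_i: subjects treated at each dose
r = [0, 1, 2, 4]             # r_i: subjects who developed toxicity

def observed_toxicity(r, n):
    """Return the observed toxicity proportions p_i = r_i / n_i."""
    return [ri / ni for ri, ni in zip(r, n)]

p = observed_toxicity(r, n)

# Crude MTD estimate: the dose whose observed proportion is closest to 1/3.
target = 1 / 3
closest = min(range(len(doses)), key=lambda i: abs(p[i] - target))
print(doses[closest], p)
```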
The frequentist approach focuses the design of the clinical trial on the probability of the results
of the trial. The probability is that of the observed data, calculated under the assumption that a
particular hypothesis is true. The P-value, which is used to assess the hypothesis, is the
probability of observing results as extreme as or more extreme than the observed results if that
hypothesis is true. There are many issues with the frequentist approach.
There are ethical concerns with the traditional approach. Too many patients might be treated at
doses that are either too low or too high, and it is highly likely that most subjects are treated
at very low doses. It is not clear that the chosen MTD is the correct dose. The approach is rigid
and inflexible. It limits innovation in the design and analysis of clinical trials.
Adaptive designs were developed in an effort to improve on the frequentist approach. The adaptive
design concept was introduced as early as the 1970s. In an adaptive design, adjustments and
modifications can be made after the trial has started without undermining the integrity of the
trial. Interim adjustments are made to the study design as data accumulate. There are several
adaptive designs, namely group sequential designs, sample-size adjustable designs, drop-the-losers
designs, adaptive treatment allocation designs, adaptive dose-escalation designs and Bayesian
adaptive methods.
B. Adaptive Designs
In an adaptive design study, trial and statistical procedures can be modified while the trial is in
progress. These changes are based on the review of interim data. The goal of the modifications
is to improve the probability of success of the trial and to correctly identify the clinical
benefits of the intervention under investigation. Three types of modification can be made in an
adaptive design. Prospective adaptations are planned changes such as stopping a trial early for
safety or lack of efficacy, dropping the losers or re-estimating the sample size. Ad hoc
modifications are changes to hypotheses, inclusion/exclusion criteria, dose or regimen, treatment
duration and endpoints; they are usually not recognized as candidates for modification at the
outset but become necessary as the trial progresses. Retrospective adaptations are changes made to
the statistical analysis plan prior to database lock or unblinding of treatment codes.
Group sequential designs allow a trial to be stopped for safety and/or efficacy reasons
based on the analysis of interim data. Stopping boundary functions such as the Pocock and
O’Brien-Fleming boundaries are used to control the overall type I error rate.
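The need for stopping boundaries can be illustrated with a small Monte Carlo sketch. It simulates trials under the null hypothesis with five equally spaced interim looks and estimates how often the cumulative z-statistic crosses a boundary at any look. The critical values 2.413 (Pocock) and 2.040 (O'Brien-Fleming constant) are commonly tabulated figures for five looks at a two-sided 0.05 level and are quoted here as assumptions, not taken from the paper; the number of looks, the seed and the function name are likewise illustrative.

```python
import math
import random

def rejection_rate(critical_values, n_sims=20000, seed=1):
    """Estimate the overall two-sided type I error under the null hypothesis
    when the cumulative z-statistic is tested at each of K equally spaced looks."""
    rng = random.Random(seed)
    k_looks = len(critical_values)
    rejections = 0
    for _ in range(n_sims):
        cumulative = 0.0
        for k in range(1, k_looks + 1):
            cumulative += rng.gauss(0.0, 1.0)   # standardized increment from stage k
            z = cumulative / math.sqrt(k)       # z-statistic on all data so far
            if abs(z) >= critical_values[k - 1]:
                rejections += 1                 # boundary crossed: trial stops early
                break
    return rejections / n_sims

K = 5
# Naive approach: reuse the fixed-sample critical value 1.96 at every look.
naive = rejection_rate([1.96] * K)
# Pocock: one constant boundary (assumed tabulated value for K = 5).
pocock = rejection_rate([2.413] * K)
# O'Brien-Fleming: boundary C * sqrt(K / k), strict early and relaxed late
# (assumed tabulated constant C = 2.040 for K = 5).
obrien_fleming = rejection_rate([2.040 * math.sqrt(K / k) for k in range(1, K + 1)])

print(naive, pocock, obrien_fleming)
```

Under these assumptions the naive rule rejects far more often than the nominal 5%, while both boundary functions keep the overall rate close to it.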
A design that allows the sample size to be adjusted based on interim data is called a sample size
re-estimation design. This method has disadvantages. The practice of starting with a small number
and then adjusting the sample size could lead to losing sight of the clinically meaningful
difference that the trial was originally intended to detect.
Drop-the-losers designs allow inferior treatments to be dropped from the study. They also allow
additional arms to be added. This two-stage design is used in Phase II clinical development.
Adaptive dose-finding designs are used during Phase I studies to determine the minimum effective
dose (MED) and/or the maximum tolerated dose (MTD). In this design, the continual reassessment
method with the Bayesian approach is usually used to estimate the dose-response curve.
C. Bayesian Approach
The more traditional way of designing and analyzing clinical trials is known as the frequentist
approach. The frequentist approach defines probability in terms of the data, as described above.
The Bayesian approach is based on Bayes' theorem: it specifies a prior distribution for the unknown
parameters and then updates that distribution as data become available. The updated distribution
is the posterior distribution. Both approaches use probability in their analyses, but they use
different inferential methods.
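The updating step can be written compactly. In the sketch below, θ stands for the unknown parameter, y for the observed data, f for the likelihood and π for the prior and posterior densities; this notation is assumed here for illustration and is not taken from the paper.

```latex
\pi(\theta \mid y)
  = \frac{f(y \mid \theta)\,\pi(\theta)}{\int f(y \mid \theta')\,\pi(\theta')\,d\theta'}
  \;\propto\; f(y \mid \theta)\,\pi(\theta)
```

In words, the posterior is proportional to the likelihood multiplied by the prior, and the denominator only rescales the result so that it integrates to one.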
As previously mentioned, in the Bayesian approach, designs can be adjusted at interim points to
adapt to the changing course of the trial. Information from multiple sources can be combined, as
in the case of seamless adaptive designs. A Bayesian analysis can also incorporate data from
nonrandomized trials, which frequentist designs generally do not allow.
In the Bayesian approach, all unknowns have probability distributions. Probabilities are
assigned to the parameters, and the information on the parameters is summarized in a prior
distribution before data collection. As the data are collected, the information on the parameters
is updated and the posterior distribution is used for statistical inference. Through hierarchical
modeling, a Bayesian design can borrow information from previous studies that are similar but
independent. This gives the Bayesian approach greater strength in parameter estimation. However,
the inferences depend only on the current study and use only data that were actually observed. The
inferences are flexible; that is, they can be updated as the data are gathered.
Sample size does not have to be chosen in advance, but is determined as the trial progresses. The
main decision at the onset of the Bayesian design is when to start. Decisions on the continuation
of the trial are made as data accumulates and the sample size projection can be determined as
information becomes available. The population definition can be altered and the drug of interest
can be changed midcourse. These changes could interfere with analysis and result in weakened
conclusions, unless they were specified beforehand.
Traditionally, clinical trials are randomized regardless of the statistical approach. Randomization
is paramount in reducing the possibility of selection bias and in balancing covariates across the
treatment groups. Because the Bayesian approach is based on subjective probability, randomization
is not strictly required. The Bayesian approach calculates the predictive probability that a
patient will respond to the treatment. In the frequentist approach, probabilities of future
observations can be calculated only by conditioning on particular values of the unknown
parameters. In the Bayesian approach, these conditional probabilities are averaged over the
unknown parameters, using the fact that an unconditional probability is the expected value of
conditional probabilities.
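The averaging described above can be stated as a formula. The symbols are assumptions introduced for illustration: y denotes the data observed so far, ỹ a future observation and θ the unknown parameters.

```latex
P(\tilde{y} \mid y)
  = \int P(\tilde{y} \mid \theta)\,\pi(\theta \mid y)\,d\theta
  = E_{\theta \mid y}\!\left[\, P(\tilde{y} \mid \theta) \,\right]
```

The left-hand side is the predictive probability; the integrand is the probability of the future observation conditional on a particular parameter value, weighted by the posterior distribution of that value.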
The strength of the Bayesian approach lies in decision making. It is designed for drawing
conclusions from studies based on costs and public health benefits. A given decision problem in
a clinical trial gives rise to possible future observations. Each observation has an associated
cost and benefit with a corresponding predictive probability. The costs and benefits can be
weighed by these probabilities and the optimal decision chosen. In the more traditional approach,
this is not possible since there is no way to find predictive probabilities.
When using the Bayesian approach, the basic requirements for starting a trial are a stopping rule
and a prior distribution. The sample size does not need to be specified in advance, but it is
common to have a predetermined maximum size. A trial that is Bayesian in its approach must still
have a protocol and guidelines by which it abides, so the anticipated types of adaptation
need to be specified in advance.
In the case of a Phase I study, adaptation is necessary from an ethical standpoint. For example, if
the treatment is an anti-cancer agent and the subjects are gravely ill, an increase in dose would be
beneficial to the patients as well as to the outcome of the study, since the main objective is
dose-finding. Similarly, adaptation is desirable in Phase II. The focus is efficacy without excess
toxicity, so the ability to alter a trial if efficacy is subpar or if there is excess toxicity
makes the Bayesian approach well suited. In Phase III and beyond, adaptation is not necessary and
the calculations for the posterior distribution become more cumbersome. There is a greater risk
of errors when the Bayesian approach is applied to clinical trials in Phase III and beyond.
D. Prior Distributions
Choosing a prior distribution is paramount in any Bayesian adaptive design. The prior
distribution provides information about the treatment before there are experimental results. The
selection of a prior distribution is sometimes based on historical data such as what is known
about the family of compounds in treating the targeted disease. Historical data may not be
available, so the prior distribution could be based on what is known about the biological nature
of the disease, data from investigational and related treatments and the preclinical results of these
treatments.
The following series of graphs depicts the progression of a hypothetical prior distribution as it
is updated.
Figure 1: Sequence of probability distributions for success rate p
The first graph shows the initial distribution selected for p. The prior is assumed to be uniform
between 0 and 1. The first treatment is a success and the distribution shifts to the right. After
each observation, the distribution is updated based on the result: a success shifts the
distribution to the right and a failure shifts it to the left. The graph labeled ‘Final’ depicts
the posterior distribution after ten treatments. The last graph in the sequence, labeled ‘Next
observation’, shows the two possible posterior distributions after an eleventh treatment. After a
success, the distribution would follow the previous trend and shift to the right; after a failure,
it would shift to the left.
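The sequence of updates in Figure 1 can be sketched with the conjugate Beta-binomial model, in which a uniform prior is the Beta(1, 1) distribution and each success or failure adds one to the first or second parameter. The ten outcomes below are hypothetical, since the paper does not list the outcomes behind the figure, and the function name is illustrative.

```python
def update_beta(a, b, outcomes):
    """Return the Beta(a, b) parameters after each outcome (1 = success, 0 = failure)."""
    history = [(a, b)]
    for y in outcomes:
        a, b = a + y, b + (1 - y)   # a success raises a, a failure raises b
        history.append((a, b))
    return history

# Hypothetical results of ten treatments (7 successes, 3 failures).
outcomes = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
history = update_beta(1, 1, outcomes)       # uniform prior = Beta(1, 1)

a_final, b_final = history[-1]
posterior_mean = a_final / (a_final + b_final)

# Predictive probability that an eleventh treatment is a success: for a
# Beta posterior this equals the posterior mean.
prob_next_success = posterior_mean

for step, (a, b) in enumerate(history):
    print(step, (a, b), round(a / (a + b), 3))
```

The posterior mean moves up after each success and down after each failure, which is the rightward and leftward shift described for the graphs.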
E. Traditional vs. Bayesian
The 3+3 design is a traditional design with no modeling of the dose-toxicity curve. The only
assumption is that toxicity increases with dose. The dose levels are fixed in advance. The first
three patients are treated at the starting dose. If none of them experiences a dose-limiting
toxicity (DLT), the next three patients are treated at the next higher dose level. If one of the
three experiences a DLT, another three subjects are treated at the same dose level, and the dose
is escalated only if no further DLT occurs among them. Dosing is escalated in this way until at
least two patients at a dose level experience a DLT. The MTD is then usually defined as the highest
dose level at which six or more patients have been treated and fewer than 33% of them exhibited
toxicity.
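A minimal simulation sketch of the commonly described 3+3 rule follows: escalate after 0 DLTs in 3 patients, expand the cohort to 6 after 1 DLT in 3, and stop once 2 or more DLTs occur at a dose level, declaring the next lower level the MTD. The true DLT probabilities are hypothetical, and the sketch simplifies practice by not adding patients at the declared MTD when only three were treated there.

```python
import random

def three_plus_three(true_dlt_probs, rng):
    """Simulate one 3+3 trial.

    Returns (mtd_index, n_patients, n_dlts); mtd_index is -1 when even the
    lowest dose level is too toxic.
    """
    n_patients = 0
    n_dlts = 0
    level = 0
    while level < len(true_dlt_probs):
        p = true_dlt_probs[level]
        dlts = sum(rng.random() < p for _ in range(3))       # first cohort of three
        n_patients += 3
        if dlts == 1:                                        # expand to six patients
            dlts += sum(rng.random() < p for _ in range(3))
            n_patients += 3
        n_dlts += dlts
        if dlts >= 2:                                        # too toxic: stop here
            return level - 1, n_patients, n_dlts
        level += 1                                           # 0/3 or 1/6: escalate
    return len(true_dlt_probs) - 1, n_patients, n_dlts

# Hypothetical true DLT probabilities at five increasing dose levels.
true_probs = [0.02, 0.08, 0.20, 0.35, 0.50]
rng = random.Random(2012)
results = [three_plus_three(true_probs, rng) for _ in range(2000)]
mean_patients = sum(r[1] for r in results) / len(results)
mean_dlts = sum(r[2] for r in results) / len(results)
print(mean_patients, mean_dlts)
```

Averaging over many simulated trials gives operating characteristics such as the mean number of patients and the mean number of DLTs.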
Another method is pharmacologically guided dose escalation. This method assumes that
DLTs can be predicted from drug plasma concentrations, so pharmacokinetic data are used to
determine the next dose as the study progresses. Before the study, a target plasma exposure level
is specified in terms of the area under the concentration-time curve (AUC). The target AUC is
extrapolated from preclinical studies. The next dose is assigned as long as the predefined plasma
exposure level has not been reached. Dosing is escalated one patient at a time, usually in 100%
dose increments. Once the plasma exposure reaches the prespecified AUC level, the design switches
to the traditional 3+3 design with smaller dose increments. This method has given good results for
anti-cancer agents such as anthracyclines and some platinum compounds. Its limitations include the
difficulty of obtaining real-time pharmacokinetic data from patients.
Another design is the accelerated titration design. It is a variation of the 3+3 design. In this
design, dose escalation is allowed in multiple cycles in the same patient. This helps in reducing
the number of patients treated at low dose levels. Escalating the dose in the same patient allows a
patient the opportunity to be treated at a higher dose. On the other hand, the true results might be
masked by the cumulative effect of multiple doses. It would be challenging to determine if the
results were due to chronic or delayed toxicity.
There are other traditional designs such as the 2+4, 3+3+3 and 3+1+1 designs. In the 2+4
design, if one of the first two patients exhibits toxicity then four additional patients are
added. As in the 3+3 design, the study is stopped when at least two patients exhibit DLT. In the
3+3+3 design, an additional group of three patients is added if at least two of the first six
patients exhibit toxicity. If three out of nine patients display toxicity, the trial is stopped.
The 3+1+1 design, also known as the best-of-five design, is the most aggressive of these designs.
If DLT is observed in one or two of the first three patients, another patient is added. If two of
the four patients exhibit toxicity, another patient is added. The trial is terminated if three or
more patients exhibit toxicity.
The above designs are considered rule-based. Bayesian designs fall into the category of
model-based designs. These designs assume a monotonic dose-response relationship between the dose
and the probability of DLT for patients treated at that dose level. A dose-toxicity curve and a
targeted toxicity level (TTL) are clearly defined in this class of designs. Through dose
escalation, the design sets out to find the dose at which the probability of DLT equals the
prespecified target toxicity level.
The continual reassessment method (CRM) is a Bayesian model-based Phase I design. The
dose-toxicity relationship is characterized by a one-parameter parametric model, such as the
logistic model, the power model or the hyperbolic tangent model.
Logistic model:
$$p(d) = \frac{\exp(3 + a d)}{1 + \exp(3 + a d)}$$
Power model:
$$p(d) = d^{\exp(a)}$$
Hyperbolic tangent model:
$$p(d) = \left[ \frac{\exp(d)}{\exp(d) + \exp(-d)} \right]^{a}$$
Where:
p(d) is the probability of DLT
d is the dose
a is a model parameter
The CRM initially assumes a vague prior distribution for the parameter a, and the first patient is
treated at the dose level closest to the prior estimate of the MTD. The toxicity outcome is
assessed and the distribution of a is updated by calculating its posterior distribution, which is
proportional to the chosen prior distribution multiplied by the likelihood:
$$L(a; d, y) \propto \prod_{i=1}^{n} p(d_i)^{y_i} \, [1 - p(d_i)]^{1 - y_i}$$
Where:
$d_i$ and $y_i$ are the dose level and toxicity outcome for patient i
$y_i = 1$ if a DLT is observed and 0 if none is observed
The posterior distribution can be calculated using statistical software.
Once the posterior distribution has been calculated, the next patient is treated at the dose level
that is closest to the revised MTD based on the distribution of a. This sequence of steps is
repeated until a precise estimate of a is obtained or the sample size has been exhausted.
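The CRM steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used for the paper: it uses the power model with doses rescaled to a hypothetical "skeleton" of prior toxicity guesses in (0, 1), a normal prior on a with an assumed standard deviation of 1.0, and a simple grid approximation in place of dedicated statistical software. All function names are illustrative.

```python
import math

def power_model(d, a):
    """One-parameter power model p(d) = d ** exp(a), for a dose label d in (0, 1)."""
    return d ** math.exp(a)

def crm_posterior_toxicity(skeleton, doses_given, dlt_outcomes,
                           prior_sd=1.0, a_max=5.0, grid_size=2001):
    """Posterior mean DLT probability at each dose level, by grid integration over a."""
    grid = [-a_max + 2.0 * a_max * k / (grid_size - 1) for k in range(grid_size)]
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2                  # log normal prior (unnormalized)
        for level, y in zip(doses_given, dlt_outcomes):     # add the log-likelihood
            p = power_model(skeleton[level], a)
            log_w += math.log(p) if y == 1 else math.log(1.0 - p)
        weights.append(math.exp(log_w))                     # prior times likelihood
    total = sum(weights)
    return [sum(w * power_model(d, a) for w, a in zip(weights, grid)) / total
            for d in skeleton]

def recommend_dose(posterior_toxicity, target):
    """Index of the dose level whose estimated DLT probability is closest to the target."""
    return min(range(len(posterior_toxicity)),
               key=lambda i: abs(posterior_toxicity[i] - target))

# Hypothetical skeleton for five dose levels; the target rate of 0.25 is from the text.
skeleton = [0.05, 0.10, 0.25, 0.40, 0.55]
target = 0.25

prior_tox = crm_posterior_toxicity(skeleton, [], [])
after_dlt = crm_posterior_toxicity(skeleton, [2], [1])            # one DLT at level 2
after_ok = crm_posterior_toxicity(skeleton, [2, 2, 2], [0, 0, 0]) # three patients, no DLT

print(recommend_dose(after_dlt, target), recommend_dose(after_ok, target))
```

A DLT pulls every estimated toxicity probability up and so tends to move the recommendation to a lower dose, while toxicity-free patients do the opposite.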
Here is an example of a dose-escalation design from an oncology trial. The main objective is to
determine the MTD of a new drug. Using the results from animal studies, the dose-limiting toxicity
rate was estimated to be 1% at the starting dose of 25 mg/m², which is one-tenth of the lethal
dose. The MTD is estimated to be 150 mg/m² and the target dose-limiting toxicity rate is 0.25. The
traditional approach can be compared with the Bayesian approach by simulating trials under the 3+3
traditional escalation rule (TER) and the continual reassessment method (CRM). A logistic toxicity
model was selected. The dose sequence was generated from the starting dose with dose-escalation
factors = 2, 1.67, 1.33, 1.33, 1.33, 1.33, 1.33, 1.33, 1.33. The designs are evaluated on safety,
accuracy and efficiency. The simulation results are shown in Table 1.
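The dose levels implied by these factors can be worked out directly, assuming each factor is the ratio of a dose to the one before it (the text does not state this explicitly, so treat the interpretation as an assumption).

```python
def dose_sequence(start_dose, factors):
    """Build dose levels by multiplying the previous dose by each factor in turn."""
    doses = [start_dose]
    for factor in factors:
        doses.append(doses[-1] * factor)
    return doses

# Starting dose and factors as quoted in the text.
factors = [2, 1.67] + [1.33] * 7
doses = dose_sequence(25.0, factors)

for level, dose in enumerate(doses, start=1):
    print(level, round(dose, 1))
```

Under this reading the sequence starts 25, 50, 83.5 mg/m² and then grows by 33% per level, so the increments shrink in relative terms as the dose rises.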
Table 1: Summary of simulation results for the designs
Method     Assumed true MTD   Mean predicted MTD   Mean number    Mean number
           (mg/m²)            (mg/m²)              of patients    of DLTs
3+3 TER    100                86.7                 14.9           2.8
CRM        100                99.2                 13.4           2.8
3+3 TER    150                125                  19.4           2.9
CRM        150                141                  15.5           2.5
3+3 TER    200                169                  22.4           2.8
CRM        200                186                  16.8           2.2
The results show that if the true MTD is 100 mg/m², the TER approach estimates the MTD to be
86.7 mg/m² and the CRM estimates it to be 99.2 mg/m². The average number of patients required is
14.9 for the TER design and 13.4 for the CRM. The mean number of DLTs is the same for both, 2.8.
At 100 mg/m², the differences between the two methods are modest. The differences are greater when
the true MTD is 150 mg/m². The mean predicted MTDs for TER and CRM are 125 mg/m² and 141 mg/m²
respectively. Both designs underestimate the true MTD, but the TER estimate falls much further
below it. With regard to safety, the mean number of DLTs is 2.9 for TER and 2.5 for CRM. The
estimates show the same trend when the assumed true MTD is 200 mg/m².
Both approaches underestimate the true MTD; however, the Bayesian approach was much closer
to the true value in all three scenarios. In every scenario, the mean number of patients required
was lower with the Bayesian design. The mean number of DLTs for the Bayesian approach was less
than or equal to that of the traditional design in all scenarios. Based on these observations, the
Bayesian CRM approach is more favorable.
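The size of the underestimation can be made explicit by computing the relative bias of each mean predicted MTD. The figures are those reported in Table 1; the helper name is illustrative.

```python
# (method, assumed true MTD, mean predicted MTD) as reported in Table 1, in mg/m².
table1 = [
    ("3+3 TER", 100, 86.7),
    ("CRM", 100, 99.2),
    ("3+3 TER", 150, 125),
    ("CRM", 150, 141),
    ("3+3 TER", 200, 169),
    ("CRM", 200, 186),
]

def percent_bias(true_mtd, predicted_mtd):
    """Relative bias of the prediction, in percent (negative = underestimate)."""
    return 100.0 * (predicted_mtd - true_mtd) / true_mtd

bias = {(method, true): percent_bias(true, pred) for method, true, pred in table1}

for (method, true), value in bias.items():
    print(f"{method:8s} true MTD {true}: {value:+.1f}%")
```

On these numbers the TER underestimates the MTD by roughly 13% to 17%, and the CRM by roughly 1% to 7%.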
F. Hybridization
The Bayesian approach can be used alone or as a hybrid with the classic approach. The following
example shows the Bayesian approach in a classic design. A two-arm parallel design comparing a
test treatment with a control draws on data from three earlier clinical trials of similar size.
The three trials suggest effect sizes of 0.1, 0.25 and 0.4, and each is assigned a prior
probability of 1/3. For a total sample size n, the power as a function of the effect size ε is:
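$$\text{power}(\varepsilon) = \Phi\!\left( \frac{\sqrt{n}\,\varepsilon}{2} - z_{1-\alpha} \right)$$
where Φ is the c.d.f. of the standard normal distribution.
If the prior π(ε) describes the uncertainty about ε, the expected power, P_exp, is
$$P_{\text{exp}} = \int \Phi\!\left( \frac{\sqrt{n}\,\varepsilon}{2} - z_{1-\alpha} \right) \pi(\varepsilon)\, d\varepsilon$$
Assume a one-sided α = 0.025, so that $z_{1-\alpha} = 1.96$, and the prior
$$\pi(\varepsilon) = \begin{cases} 1/3, & \varepsilon = 0.1,\ 0.25,\ 0.4 \\ 0, & \text{otherwise} \end{cases}$$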
In the classic approach, the mean of the effect size, $\bar{\varepsilon} = 0.25$, is used as ε to
calculate the sample size. For the design with β = 0.2, the total sample size would be
$$n = \frac{4\,(z_{1-\alpha} + z_{1-\beta})^2}{\varepsilon^2} = 502$$
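If the Bayesian approach is used with n = 502,
$$P_{\text{exp}} = \frac{1}{3} \left[ \Phi\!\left( \frac{0.1\sqrt{n}}{2} - z_{1-\alpha} \right) + \Phi\!\left( \frac{0.25\sqrt{n}}{2} - z_{1-\alpha} \right) + \Phi\!\left( \frac{0.4\sqrt{n}}{2} - z_{1-\alpha} \right) \right] = 0.66$$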
The Bayesian approach accounts for the uncertainty in the effect size, so the expected power is the
average of the three powers calculated using the three different effect sizes. The expected power
is found to be 66%, which is lower than the 80% power stated by the frequentist approach. To reach
an expected power of 80%, the sample size needs to be increased. In this hybrid example, the
Bayesian approach is used to increase the probability of success given that the final criterion is
the frequentist one, p ≤ α = 0.025.
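The calculation in this example can be reproduced with the Python standard library. The sketch below evaluates the power formula, the classic sample size and the expected power under the three-point prior, then searches for the smallest total sample size whose expected power reaches 80%. The function names are illustrative, and the search step is an addition that the paper describes only in words.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal distribution

def power(effect_size, n, alpha=0.025):
    """Power for total sample size n: Phi(sqrt(n) * eps / 2 - z_{1-alpha})."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(n ** 0.5 * effect_size / 2 - z_alpha)

def classic_sample_size(effect_size, alpha=0.025, beta=0.2):
    """Total sample size n = 4 (z_{1-alpha} + z_{1-beta})^2 / eps^2."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return 4 * (z_alpha + z_beta) ** 2 / effect_size ** 2

def expected_power(n, prior):
    """Average the power over a discrete prior {effect size: probability}."""
    return sum(prob * power(eps, n) for eps, prob in prior.items())

prior = {0.1: 1 / 3, 0.25: 1 / 3, 0.4: 1 / 3}

n_classic = classic_sample_size(0.25)    # classic design at the mean effect size
p_exp = expected_power(502, prior)       # expected power at n = 502

# Smallest n (searching upward from 502) with expected power of at least 80%.
n_needed = 502
while expected_power(n_needed, prior) < 0.80:
    n_needed += 1

print(round(n_classic), round(p_exp, 2), n_needed)
```

The required sample size is much larger than 502 because the smallest plausible effect size, 0.1, contributes very little power at moderate n.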
G. FDA Guidance – Medical Devices
With the growing use of the Bayesian approach in clinical trials, the FDA has issued guidance
which specifically addresses the use of Bayesian methods in medical devices and drug development.
In addition to the standard protocol, the FDA would like to receive the prior information and all
the assumptions that will be made during the study. The criterion for success of the study should
be clearly stated and the proposed sample size should be justified.
In addition, the FDA recommends providing tables of the probability of satisfying the study claim
given certain “true” parameter values, such as event rates, for various sample sizes of the trial.
Such a table provides an estimate of the probability of a type I error when the true parameter
values are consistent with the null hypothesis, and of the power when the parameter values are
consistent with the alternative. Simulations that are used to calculate the sample size and other
study parameters should adequately reflect the study design.
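A table of this kind can be sketched for a very simple hypothetical design. The example below is not from the guidance or the paper: it assumes a single-arm study with n = 40 patients, a uniform Beta(1, 1) prior on the response rate p, and a success criterion that the posterior probability of p > 0.2 is at least 0.975. For each assumed "true" rate it sums the binomial probabilities of all outcomes that meet the criterion, so no random simulation is needed.

```python
import math

def posterior_prob_exceeds(x, n, p0, a=1.0, b=1.0, grid_size=2000):
    """P(p > p0 | x responses in n patients) under a Beta(a, b) prior,
    approximated by midpoint integration of the unnormalized posterior density."""
    above = total = 0.0
    for k in range(grid_size):
        p = (k + 0.5) / grid_size
        density = p ** (a + x - 1) * (1 - p) ** (b + n - x - 1)
        total += density
        if p > p0:
            above += density
    return above / total

def prob_claim_met(true_p, n, p0, threshold=0.975):
    """Probability that the success criterion is met when the true rate is true_p."""
    total = 0.0
    for x in range(n + 1):
        if posterior_prob_exceeds(x, n, p0) >= threshold:
            total += math.comb(n, x) * true_p ** x * (1 - true_p) ** (n - x)
    return total

n, p0 = 40, 0.2   # hypothetical sample size and null response rate
operating_characteristics = {tp: prob_claim_met(tp, n, p0) for tp in (0.2, 0.3, 0.4, 0.5)}

for true_p, prob in operating_characteristics.items():
    label = "type I error" if true_p == p0 else "power"
    print(f"true p = {true_p}: P(claim met) = {prob:.3f} ({label})")
```

The entry at the null value plays the role of the type I error rate, and the entries at larger true rates play the role of power.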
The FDA suggests that the prior probability of the study claim be evaluated thoroughly before the
study commences. The prior should not be too informative. What counts as too high a prior
probability depends on the individual case, but the prior probability of the claim should
definitely not be as large as the success criterion for the posterior probability. The prior
should not overwhelm the data; this could lead to inaccurate results and a loss of control of the
type I error. The effective sample size quantifies the efficiency gained from using the prior
information and gauges whether the prior is too informative.
Data can be simulated using the prior distribution through Markov chain Monte Carlo
simulations. An electronic copy of the study data and of the program code used to generate the
simulated results should be provided to the FDA.
H. Conclusion
The Bayesian adaptive method can be used fully or as a hybrid with the classic approach.
A fully Bayesian approach is most beneficial in Phase I and Phase II studies because of the
inherently adaptive nature of the design. In Phase I trials, conditions are more dynamic than in
other phases and the flexible nature of the Bayesian approach accommodates unexpected changes.
Clinical trials using this method tend to be smaller and more informative. Data can be assessed as
they accumulate, so decisions to modify the trial can be made more quickly.
In applying the Bayesian method, it is imperative that the validity and integrity of
the study are maintained. For adaptations such as changes to endpoints or hypotheses, the
feasibility should be thoroughly evaluated in order to prevent abuse of the method. Protocol
amendments need to be evaluated carefully and sufficient information about the proposed study
should be provided to the FDA according to the guidance documents.
Bayesian adaptive methods are more favorable than traditional methods because of their flexibility,
greater efficiency and lower sample size requirements. Their implementation benefits the
pharmaceutical industry, as it helps to reduce costs and improve drug development.
Although in June 2003 the FDA approved Pravigard Pac (Bristol-Myers Squibb) based on analyses
using the Bayesian approach, the agency is cautious about the growing trend of adaptive designs.
It has released guidance documents which specifically address Bayesian clinical trials. The
recommendations from the FDA provide guidance on the selection of the prior distribution and on
the use of simulated data to make such a selection.
Although the Bayesian approach has more flexibility than the frequentist approach, it also has
drawbacks. Data analysis has to be conducted after treating each patient; this can become
overwhelming. The selection of a prior probability can be challenging. There might not be
historical data from which a prior distribution can be modeled. The selected prior distribution
might be too informative, resulting in inaccurate conclusions regarding the new treatment.
Computation can be cumbersome for larger trials and the chance of erroneous decision making is
increased.
Regardless of the drawbacks, the Bayesian approach remains more favorable than the traditional
approach. In the long run, it will lead to faster drug development which would in turn make drug
development more economical.
I. References
1. Chang, Mark (2008). Adaptive Design Theory and Implementation Using SAS and R.
Boca Raton: Chapman & Hall/CRC
2. Berry, Scott M., Carlin, Bradley P., Lee, J. Jack and Muller, Peter (2011). Bayesian
Adaptive Methods for Clinical Trials. Boca Raton: Chapman & Hall/CRC
3. Chow, Shein-Chung and Chang, Mark (2008). Adaptive Design Methods in Clinical
Trials – A Review. Orphanet Journal of Rare Diseases, 3:11
4. Cook, Thomas D. and DeMets, David L. (2008). Introduction to Statistical Methods
for Clinical Trials. Boca Raton: Chapman & Hall/CRC
5. The FDA Center for Drug Evaluation and Research (CDER) and Center for Biologics
Evaluation and Research (CBER). Guidance for Industry: Adaptive Design Clinical
Trials for Drugs and Biologics.
www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM201790.pdf
6. Guidance for the Use of Bayesian Statistics.
www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/GuidanceDocuments/ucm071072.htm
7. Berry, Donald A. (2006). Bayesian Clinical Trials. Nature Reviews Drug Discovery, 5:27-36