Shu-Fang Hsu Schmitz

University of Bern, Bern, Switzerland
PHASE I AND PHASE II TRIAL DESIGN
IN ONCOLOGY
Phase I trials
Features
• First in human
– First ever in human, or first for a specific indication,
combination treatment, or dosing schedule
• A lot of uncertainties
• Focus on acute endpoints, usually adverse events
• Participating patients may be heterogeneous
– Various indications
• Basis for planning subsequent phase II trial(s)
– Adverse events, efficacy, pharmacokinetics,
pharmacodynamics, patient selection, etc.
Phase I trials
Objectives
• Assess acute adverse events
• Determine the maximum tolerated dose (MTD)
and/or the recommended dose for phase II trial
• Assess pharmacokinetics (PK)
• Assess pharmacodynamics (PD)
• Collect early evidence of efficacy
• Identify suitable patient populations
Phase I design example 1: 3+3 design
Temsirolimus (mTOR inhibitor) in HCC
• Primary endpoint: DLT
– What: Gr 4 haematological toxicity, Gr 3 non haematological
toxicity, treatment delay > 2 weeks
– When: Cycle 1 (21 days)
– How: NCI-CTC v3
• Thresholds for primary endpoint: Fixed
– #DLT/#pts < 2/3 in 1 cohort or < 2/6 in 2 cohorts
• Between-cohort accrual suspension: Yes
• Dose levels: Study-specific, fixed
Level             -2     -1     1 (starting)     2      3
Dose (mg/week)    10     15     20               25     30
Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.
Phase I design example 1: 3+3 design
Temsirolimus (mTOR inhibitor) in HCC
• Dose escalation/de-escalation scheme: Fixed
Step 1: 3 new pts at a new dose level i
– No DLT: 3 new pts at higher dose level i+1, repeat Step 1
– 1/3 DLT: 3 new pts at same dose level i
• 1/6 DLT: 3 new pts at higher dose level i+1, repeat Step 1
• ≥ 2/6 DLT: Stop at dose level i
– ≥ 2/3 DLT: Stop at dose level i
After stopping at dose level i: MTD is dose level i−1 (in ≥ 6 pts)
Actual #DLTs/#pts
Cohort          1       2       3       4       5
Dose (mg/wk)    20      25      30      30      25
#DLTs/#pts      0/3     0/3     1/3     1/3     0/7
Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.
Phase I design example 1: 3+3 design
Temsirolimus (mTOR inhibitor) in HCC
• Number of subjects per cohort:
Fixed, 3 pts/cohort
• Number of cohorts per dose level:
Fixed, with max 2 cohorts
• Trial stopping rule: Fixed
For a given dose level,
≥ 2/3 DLTs in 1 cohort
or ≥ 2/6 DLTs in 2 cohorts
• Selection criteria for MTD:
Fixed, 1 dose level lower than that stopped
→ 25 mg/wk
• Dose expansion part:
Treated additional pts at MTD, for a total of 10 pts
Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.
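The fixed 3+3 rules on the preceding slides can be sketched in a few lines. This is a simplified illustration, not the trial's protocol: the function names are hypothetical, de-escalation to levels -1 and -2 is not handled, and the requirement that the MTD be confirmed in a minimum number of patients is left out.

```python
def three_plus_three_decision(dlts, patients):
    """Decision at the current dose level after 3 or 6 evaluable patients."""
    if patients == 3:
        if dlts == 0:
            return "escalate"          # no DLT: go to next dose level
        if dlts == 1:
            return "expand"            # 1/3 DLT: 3 more pts at same dose
        return "stop"                  # >= 2/3 DLT
    if patients == 6:
        return "escalate" if dlts <= 1 else "stop"   # 1/6 vs >= 2/6
    raise ValueError("3+3 decisions are defined after 3 or 6 patients only")


def replay_three_plus_three(doses, start_index, cohort_dlts):
    """Replay observed DLT counts (one entry per 3-patient cohort).

    Returns the MTD (one level below the stopping level), or None if the
    sequence ends without a stopping decision or no lower level exists.
    """
    level, dlts, patients = start_index, 0, 0
    for observed in cohort_dlts:
        dlts += observed
        patients += 3
        decision = three_plus_three_decision(dlts, patients)
        if decision == "stop":
            return doses[level - 1] if level > 0 else None
        if decision == "escalate":
            if level == len(doses) - 1:
                return None            # top dose tolerated, no MTD in range
            level, dlts, patients = level + 1, 0, 0
    return None


# Dose levels from the temsirolimus example; start at level 1 (20 mg/week).
doses = [10, 15, 20, 25, 30]
print(replay_three_plus_three(doses, 2, [0, 0, 1, 1]))
```

Replaying the four escalation cohorts of the example (0/3, 0/3, 1/3, 1/3) stops at 30 mg/week with 2/6 DLTs and returns the dose level below it.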
Phase I design example 1: 3+3 design
Design features
• Algorithm-based design
• Advantages
– Easy to apply
– Commonly used and well accepted
• Disadvantages
– No flexibility: Not suitable for certain scenarios
– Non-adaptive: Inefficient use of accumulated data
– Higher risk of treating pts above the true MTD
– Lower chance of selecting correct MTD
Ji Y, Wang S-J. JCO 2013;31:1785-1791.
Phase I design example 2: Modified toxicity probability
interval (mTPI) design
Nitroglycerin (NTG) patch in operable rectal cancer
• Primary endpoint: DLT
– What: 2 instance of Gr 3 toxicity, Gr 4 NTG related toxicity
– When: During neoadjuvant chemoradiation therapy
– How: CTCAE v4.02
• Threshold for primary endpoint
– Study-specific, 30%, with proper dosing interval [25%, 35%]
• Between-cohort accrual suspension: Yes
• Dose levels: Study-specific, fixed
Level           1 (starting)     2       3
Dose (mg/hr)    0.2              0.4     0.6
Illum H, et al. Surgery 2015 (in press).
Phase I design example 2: Modified toxicity probability
interval (mTPI) design
Nitroglycerin (NTG) patch in operable rectal cancer
• Dose escalation/de-escalation scheme: Adaptive
Total #DLTs     Total #pts at current dose
                3      4      5      6      7
0               E      E      E      E      E
1               S      S      S      E      E
2               D      S      S      S      S
3               DU     DU     D      S      S
4               -      DU     DU     DU     D
5               -      -      DU     DU     DU
E = escalate, S = stay at current dose,
D = deescalate, U = unacceptable
Ji Y, Wang S-J. JCO 2013;31:1785-1791.
Actual #DLTs/#pts
Cohort          1       2       3       4
Dose (mg/hr)    0.2     0.4     0.6     0.6
#DLTs/#pts      0/3     0/3     1/3     0/4
Phase I design example 2: Modified toxicity probability
interval (mTPI) design
Nitroglycerin (NTG) patch in operable rectal cancer
• Number of subjects per cohort:
Flexible, 3-4 pts
• Number of cohorts per dose level:
Adaptive, updated with accumulated data
• Trial stopping rule:
Study specific, after total 13 pts
• Selection criteria for MTD:
– Statistical rule not yet defined
– Study specific
– 0.6 mg/hr was chosen
Illum H, et al. Surgery 2015 (in press).
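A decision table of the kind shown for the mTPI design can be generated in advance. The sketch below follows the general mTPI idea (Ji and Wang): with a Beta(1, 1) prior, compare the posterior unit probability mass of the under-dosing, proper-dosing and over-dosing intervals, and flag a dose as unacceptable when the posterior probability of exceeding the target is high. The 0.95 safety cut-off and the function names are assumptions for illustration; the target of 30% and interval [25%, 35%] come from the example.

```python
from math import comb


def beta_cdf(t, a, b):
    """CDF of Beta(a, b) at t for positive integers a, b.

    Uses the identity P(Beta(a, b) <= t) = P(Binomial(a + b - 1, t) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * t**j * (1 - t) ** (n - j) for j in range(a, n + 1))


def mtpi_decision(dlts, patients, target=0.30, eps1=0.05, eps2=0.05,
                  safety=0.95):
    """Return 'E', 'S', 'D' or 'DU' for the current dose level."""
    a, b = 1 + dlts, 1 + patients - dlts      # posterior under Beta(1, 1)
    low, high = target - eps1, target + eps2
    # Safety rule (assumed cut-off): dose is unacceptably toxic.
    if 1 - beta_cdf(target, a, b) > safety:
        return "DU"
    # Unit probability mass = posterior mass of interval / interval length.
    upm = {
        "E": beta_cdf(low, a, b) / low,
        "S": (beta_cdf(high, a, b) - beta_cdf(low, a, b)) / (high - low),
        "D": (1 - beta_cdf(high, a, b)) / (1 - high),
    }
    return max(upm, key=upm.get)


# Print the decision table for 3 to 7 patients at the current dose.
for dlts in range(6):
    row = [mtpi_decision(dlts, n) if dlts <= n else "-" for n in range(3, 8)]
    print(dlts, *row)
```

Under these assumptions the printed table can be compared cell by cell with the scheme on the slide.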
Phase I design example 2: Modified toxicity probability
interval (mTPI) design
Design features
• Model-based design
• Advantages
– Easy to apply: Free software (Excel or R),
dose escalation/de-escalation scheme generated in advance
– Study-specific: Thresholds, trial stopping rule
– Flexible: #pts/cohort
– Adaptive: Dose escalation/de-escalation scheme,
#cohorts/dose level
– Lower risk of treating pts above the true MTD
– Higher chance of selecting correct MTD
• Disadvantages
– Statistical criteria for MTD selection not yet defined
Ji Y, Wang S-J. JCO 2013;31:1785-1791.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Primary endpoint: DLT
– What: ANC<750/mm3, neutropenic fever, or platelets
< 25,000/mm3, Gr 3 non haematological toxicity
– When: Cycle 1 (acute toxicity) and Cycle 2-3 (late toxicity)
– How: CTCAE v3.0
• Threshold for primary endpoint:
– Study-specific, 33%
• Between-cohort accrual suspension: No
• Dose levels:
– Starting dose level: Pre-specified, 60 mg BID (120 mg/d)
– Further dose levels: Adaptive, flexible
• Dose expansion part: Treated additional 12 pts at MTD
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Dose escalation/de-escalation scheme: Adaptive
Step 1: Apply initial dose to first cohort
Step 2: Update model using accumulated data
Step 3: Estimate DLT risk at different doses
Step 4: Identify doses that will give DLT risk in targeted range
Step 5: Apply additional constraints to select a dose
Step 6: Apply selected dose to next cohort, then return to Step 2
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Actual dose-DLT data
[Figure: dose (mg/d; levels 120, 150, 180, 200, 230, 260, 290, 320) assigned to each of patients 1 to 24; markers distinguish no DLT from acute DLT (A)]
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Final estimation
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Number of subjects per cohort:
– Cohort 1: 3 pts specified by protocol
– Further cohorts: 1 pt (continual reassessment)
• Number of cohorts per dose level: Adaptive
• Trial stopping rule:
Study specific, after total 24 pts
• Selection criteria for MTD:
– Study specific, dose at which 33% of pts experience DLTs by 12 weeks
– 320 mg/d was chosen
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
Phase I design example 3: Modified time-to-event
continual reassessment method (TITE-CRM) design
Design features
• Model-based design
• Advantages
– Study-specific: Threshold, trial stopping rule
– Adaptive: Dose escalation/de-escalation scheme, dose levels
– Higher chance of selecting correct MTD
– Statistical criteria for MTD selection well defined
– Without accrual suspension between cohorts
– Account for late toxicities
• Disadvantages
– Require intensive statistician support prior to and during trial
– Require continuous data update
Tevaarwerk A, et al. Invest New Drugs 2012 ;30(3):1039–1045.
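The update-and-estimate loop of a TITE-CRM can be illustrated with a generic one-parameter sketch. It is not the modified design used in the MKC-1 trial: the power model, the normal prior, the skeleton of prior DLT guesses, the function names and the patient data in the usage lines are all hypothetical. The key idea shown is the time-to-event weight: a patient without DLT contributes in proportion to the fraction of the observation window completed, so accrual need not be suspended.

```python
from math import exp


def tite_crm_estimates(skeleton, observations, window, prior_sd=1.34,
                       grid=2001):
    """Posterior mean DLT probability at each dose level.

    Model (assumed): p_i = skeleton_i ** exp(a), a ~ Normal(0, prior_sd^2).
    observations: list of (dose_index, had_dlt, follow_up_time).
    """
    low, high = -5.0 * prior_sd, 5.0 * prior_sd
    step = (high - low) / (grid - 1)
    total = 0.0
    means = [0.0] * len(skeleton)
    for k in range(grid):                      # numerical integration over a
        a = low + k * step
        prior = exp(-0.5 * (a / prior_sd) ** 2)
        probs = [s ** exp(a) for s in skeleton]
        likelihood = 1.0
        for dose, had_dlt, follow_up in observations:
            if had_dlt:
                likelihood *= probs[dose]
            else:
                # Weight = fraction of the DLT window already observed.
                weight = min(follow_up / window, 1.0)
                likelihood *= 1.0 - weight * probs[dose]
        total += prior * likelihood
        for i, p in enumerate(probs):
            means[i] += prior * likelihood * p
    return [m / total for m in means]


def recommend_dose(skeleton, observations, window, target=0.33):
    """Index of the dose whose estimated DLT risk is closest to the target."""
    est = tite_crm_estimates(skeleton, observations, window)
    return min(range(len(est)), key=lambda i: abs(est[i] - target))


# Hypothetical skeleton for 8 dose levels and a 12-week observation window.
skeleton = [0.05, 0.08, 0.12, 0.17, 0.22, 0.27, 0.33, 0.40]
data = [(0, False, 12.0), (0, False, 9.0), (0, False, 3.0), (1, True, 5.0)]
print(recommend_dose(skeleton, data, window=12.0))
```

In practice further constraints (for example, no skipping of dose levels) would be applied before the recommended dose is given to the next patient.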
Phase I design example 4: Trivariate continual
reassessment method (TriCRM) design
Motivation
• Toxicity is not always a good indicator for efficacy
Mandrekar S, et al. Statist Med 2010;29:1077–1083.
Phase I design example 4: Trivariate continual
reassessment method (TriCRM) design
Design aspects
• Primary endpoints: Toxicity and efficacy → 3 categories
                  Toxicity: No     Toxicity: Yes
Efficacy: No      No response      Toxicity
Efficacy: Yes     Success          Toxicity
– Limitation: Efficacy endpoint (e.g. biomarker) should allow
timely assessment within the same timeframe as the toxicity
endpoint
• Threshold for toxicity: Study-specific
• Between-cohort accrual suspension: Yes
• Dose levels: Fixed
Zhang W et al. Statist Med 2006;25:2365–2383.
Phase I design example 4: Trivariate continual
reassessment method (TriCRM) design
Design aspects
• Dose escalation/de-escalation scheme: Adaptive
Step 1: Apply initial dose to first cohort
Step 2: Update model using accumulated data
Step 3: Estimate probabilities for the 3 categories at different doses
Step 4: Identify doses that give toxicity probability under threshold
Step 5: Select the dose maximizing success probability & satisfying other constraints
Step 6: Apply selected dose to next cohort, then return to Step 2
Zhang W et al. Statist Med 2006; 25:2365–2383.
Phase I design example 4: Trivariate continual
reassessment method (TriCRM) design
Design aspects
• Number of subjects per cohort:
– Study specific, C (≥ 1) pts/cohort
• Number of cohorts per dose level: Adaptive
• Trial stopping rule: Study specific
– Total  n1 pts treated, including  n1 pts with success
– Or the maximum of n2 (> n1) pts treated
• Selection criteria for biologically optimal dose (BOD):
– Study specific, the model recommended dose at trial end
Zhang W et al. Statist Med 2006; 25:2365–2383.
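The dose-selection step of a TriCRM-type design can be sketched as follows, assuming the model has already produced estimated probabilities for the three outcome categories at each dose. The function name, the threshold value and the probability estimates in the usage lines are hypothetical; the model fitting itself and any further constraints are omitted.

```python
def select_dose(estimates, tox_threshold):
    """Pick the dose index with the highest success probability among
    doses whose estimated toxicity probability is under the threshold.

    estimates: list of (p_no_response, p_success, p_toxicity) per dose.
    Returns None when no dose is admissible.
    """
    admissible = [i for i, (_, _, p_tox) in enumerate(estimates)
                  if p_tox < tox_threshold]
    if not admissible:
        return None
    return max(admissible, key=lambda i: estimates[i][1])


# Hypothetical model estimates for four dose levels.
estimates = [
    (0.70, 0.25, 0.05),
    (0.50, 0.40, 0.10),
    (0.30, 0.50, 0.20),
    (0.20, 0.45, 0.35),   # toxicity above threshold, excluded
]
print(select_dose(estimates, tox_threshold=0.30))
```

Because success probability need not increase with dose, the selected dose can lie below the highest admissible dose, which is the motivation for considering efficacy alongside toxicity.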
Phase I design example 4: Trivariate continual
reassessment method (TriCRM) design
Design features
• Model-based design
• Advantages
– Study-specific: Threshold, #pts/cohort, trial stopping rule
– Adaptive: Dose escalation/de-escalation scheme
– Consider both toxicity and efficacy
• Disadvantages
– Require intensive statistician support prior to and during trial
– Require continuous toxicity and efficacy data update
Zhang W et al. Statist Med 2006; 25:2365–2383.
Phase I trial summary
• Phase I trials are necessary to determine MTDs, assess
acute AEs, pharmacokinetics and pharmacodynamics and
plan subsequent phase II trials
• Many trial designs available, e.g. 3+3 design, mTPI,
TITE-CRM, TriCRM, etc.
• Choice of trial design depends on the requirements for
– Toxicity threshold: Fixed or study-specific?
– Adaptive features: For dose levels, number of patients/cohort,
number of cohorts/dose level, or dose escalation rule?
– Patient enrollment between cohorts: Suspend or continue?
– Consideration of efficacy?
Phase II trials
Features
• Initial evaluation with focus on efficacy
• Relatively short trial duration
• Focus on short-term endpoints
• Small sample size
• Participating patients more homogeneous
• Basis for planning subsequent phase III trial(s)
Phase II trials
Objectives
• Assess short-term efficacy
– Traditionally ORR
– Recently also PFS based endpoints
• Assess common short-term adverse events
• Characterise dose-response relation
• Determine the dosing ranges
• Decision for further development in phase III trial
– Screening: Is the new agent promising enough for further
investigations?
– Selection: Among several promising new agents/doses,
which one should be selected for the subsequent phase III
trial?
Phase II trials
Control arm
• When is a control arm needed?
– Historical data are not reliable
• Variation between institutions in subject selection, care of
subjects, diagnosis procedure, reporting habit, assessment
interval (especially important for PFS), etc.
• Long time lag
– Historical data for the intended endpoint(s) are not available
– The target population is heterogeneous
– The target population is unclear
– Often for combination interventions,
e.g. standard treatment + new drug
Phase II trials
Control arm
• The purpose of a control arm
– For calibration
• No comparison between arms
• Each arm is considered as a single-arm trial
• If the result of the control arm (standard treatment) is very different
from that of historical data, the result of the new treatment should be
interpreted with caution
Phase II trials
Control arm
• The purpose of a control arm
– For “rough” comparisons
• This is not a substitute for a proper phase III trial
• All arms need to be considered in the same statistical design
• Use the statistical approach of phase III design, but with more
liberal parameters, e.g.
                                   Phase II                    Phase III
Effect size                        Slightly “exaggerated”      Min. clinical relevance
Alternative hypothesis             1 sided                     2 sided
Alpha                              5-20%                       5%
Power                              80%                         80-90%
Conclusion when p-value ≤ alpha    Promising for phase III     Statistically significant
Phase II design example 1a: Fleming’s single-stage
design
Axitinib + standard chemotherapy in advanced squamous NSCLC
• Primary endpoint:
ORR (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: No
• Historical data: ORR 17-38% with standard chemotherapy
• Design input parameters
– 1-sided type I error prob. α = 0.1
– Power = 0.85
– Uninteresting prob. of “success” P0 = 0.4
– Promising prob. of “success” P1 = 0.6
Bondarenko I, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1350-6
Phase II design example 1a: Fleming’s single-stage
design
Axitinib + standard chemotherapy in advanced squamous NSCLC
• Planned sample size: N = 36
• Decision rule:
Declare promising if > 18/36 (50%) “successes”
• Result
– Total 38 pts
– 1 CR, 14 PR → ORR 39%
• Conclusion?
• Flaws of design?
Bondarenko I, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1350-6
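The operating characteristics of a single-stage decision rule can be checked with exact binomial probabilities. The sketch below evaluates the rule from the axitinib example (declare promising if more than 18 of 36 patients respond) at P0 = 0.4 and P1 = 0.6. The function names are hypothetical, and whether the original design was derived from exact or approximate calculations is not stated on the slide.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def single_stage_characteristics(n, min_successes, p0, p1):
    """Return (type I error at p0, power at p1) for the rule
    'declare promising if successes >= min_successes'."""
    return (binomial_tail(n, min_successes, p0),
            binomial_tail(n, min_successes, p1))


# "> 18/36" means at least 19 successes.
alpha, power = single_stage_characteristics(36, 19, p0=0.4, p1=0.6)
print(f"type I error = {alpha:.3f}, power = {power:.3f}")
```

The same function can be used to explore how the error rates change when the final number of evaluable patients differs from the planned N, as in this example (38 instead of 36).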
Phase II design example 1b: Fleming’s single-stage
design
Temsirolimus (mTOR inhibitor) in HCC
• Primary endpoint:
PFS at 3 months (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: No
• Historical data: Efficacy of everolimus has not been confirmed
by phase III study (EVOLVE-1, NCT01035229)
• Design input parameters
– 1-sided type I error prob. α = 0.2
– Power = 0.85
– Uninteresting prob. of “success” P0 = 0.5
– Promising prob. of “success” P1 = 0.66
Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.
Phase II design example 1b: Fleming’s single-stage
design
Temsirolimus (mTOR inhibitor) in HCC
• Planned sample size: N = 30
• Decision rule:
Declare promising if > 17/30 (57%) “successes”
• Result
– Total 36 pts
– # “successes” not reported!
– Patients received 1-6 cycles, median 3.5 cycles
– Median follow-up 8.9 months
– Kaplan-Meier estimate PFS at 3 months = 0.47
• Conclusion?
• Flaws of design?
Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.
Phase II design example 2: Simon’s two-stage design
Metronomic oral vinorelbine as first-line in elderly advanced NSCLC
• Primary endpoint:
ORR (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: 1, for futility
• Design input parameters
– 1-sided type I error prob. α = 0.05
– Power = 0.8
– Uninteresting prob. of “success” P0 = 0.1
– Promising prob. of “success” P1 = 0.25
Camerini A, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1354-2
Phase II design example 2: Simon’s two-stage design
Metronomic oral vinorelbine as first-line in elderly advanced NSCLC
• Planned sample size and decision rules
Stage 1: N1 (18) patients, then suspend accrual for stage-1 analysis
– ≤ R1 (2) responses: Trt not promising → Stop
– > R1 (2) responses: Continue to total N (43) patients
Stage 2: Stage-2 analysis on total N (43) patients
– ≤ R (7) responses: Trt not promising → Stop
– > R (7) responses: Trt promising → Further investigation
Camerini A, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1354-2
Phase II design example 2: Simon’s two-stage design
Metronomic oral vinorelbine as first-line in elderly advanced NSCLC
• Simon’s two-stage design variations
– Optimal design: N1=18, R1=2, N=43, R=7
– Minimax design: N1=22, R1=2, N=40, R=7
Compared with optimal design:
• Larger N1: Stage-1 analysis at later time
• Smaller N: Saves resources
Note: The paper specified that the minimax design was used.
In fact, the N1, R1, N and R were from optimal design!
Camerini A, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1354-2
Phase II design example 2: Simon’s two-stage design
Metronomic oral vinorelbine as first-line in elderly advanced NSCLC
• Result
– Total 43 pts
– Stage 1 results not reported!
– Final: 1 CR, 7 PR → ORR = 18.6%
• Conclusion?
• Flaws of design?
Camerini A, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1354-2
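The properties of a two-stage rule can be computed directly from binomial probabilities, which also helps to tell the optimal and minimax variants apart. The sketch below evaluates a design (N1, R1, N, R) at a given true response probability; the function names are hypothetical, and the design parameters in the usage lines are those quoted on the slides.

```python
from math import comb


def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(n, k, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0 for k < 0."""
    return sum(binom_pmf(n, j, p) for j in range(0, k + 1))


def two_stage_characteristics(n1, r1, n, r, p):
    """Return (P(declare promising), P(early termination), expected N)
    when the true response probability is p."""
    early_stop = binom_cdf(n1, r1, p)
    n2 = n - n1
    promising = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        # Need more than r responses in total: x2 >= r - x1 + 1.
        needed = max(r - x1 + 1, 0)
        promising += binom_pmf(n1, x1, p) * (1 - binom_cdf(n2, needed - 1, p))
    expected_n = n1 + (1 - early_stop) * n2
    return promising, early_stop, expected_n


for label, design in [("optimal", (18, 2, 43, 7)), ("minimax", (22, 2, 40, 7))]:
    alpha, pet, en = two_stage_characteristics(*design, p=0.10)
    power, _, _ = two_stage_characteristics(*design, p=0.25)
    print(f"{label}: alpha={alpha:.3f} power={power:.3f} "
          f"PET(P0)={pet:.2f} E[N|P0]={en:.1f}")
```

The optimal design minimises the expected sample size under P0, while the minimax design minimises the maximum sample size N; the output makes that trade-off visible.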
Phase II design example 3: Design by 1-sample t-test
Neoadjuvant metronomic therapy in triple-negative breast cancer
• Primary endpoint:
Change in percentage of Ki-67+ cells between baseline biopsy
and surgical resection specimens
(continuous scale)
• Nr of arms: 1
• Purpose: Screening (Selection?)
• Interim analysis: No
• Design input parameters:
– 2-sided type I error prob. α = 0.05
– Power = 0.8
– Change in percentage of Ki-67+ cells = 10 (pre 20 vs post 10)
– Standard deviation = 20
Cancello G, et al. Clinical Breast Cancer 2015 (in press).
Phase II design example 3: Design by 1-sample t-test
Neoadjuvant metronomic therapy in triple-negative breast cancer
• Planned sample size: N = 32
• Decision rule: Declare promising if p-value ≤ 0.05
• Result
– Total 30/34 pts evaluable
– Mean reduction in percentage
of Ki-67+ cells = 41%,
95% confidence interval
= (30%, 51%)
– P-value < 0.0001
• Conclusion?
• Flaws of design?
Cancello G, et al. Clinical Breast Cancer 2015 (in press).
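The planned sample size for a one-sample design on a continuous endpoint can be approximated from the input parameters. The sketch below uses the normal approximation to the one-sample t-test; whether the investigators used this approximation or an exact t-based calculation is not stated, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def one_sample_size(delta, sd, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n for detecting a mean change `delta` with standard
    deviation `sd` (normal approximation to the one-sample t-test)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    return ceil(((z_alpha + z_beta) * sd / delta) ** 2)


# Inputs from the Ki-67 example: change = 10, SD = 20, 2-sided alpha = 0.05.
print(one_sample_size(delta=10, sd=20, alpha=0.05, power=0.80))
```

An exact t-based calculation typically adds a few patients to the normal-approximation result, so the two should not be expected to agree exactly.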
Phase II design example 4: Design by 2-sample t? test
Sipuleucel-T with abiraterone acetate in mCRPC
• Primary endpoint:
Cumulative antigen presenting cell (APC) activation
(continuous scale, positively correlated with OS in other trials)
• Nr of arms: 2 (concurrent vs sequential abiraterone acetate)
• Purpose: Screening
• Interim analysis: No
• Design input parameters:
– t-test? (not reported!)
– 2-sided type I error prob. α = 0.05? (not reported!)
– Power = 0.85
– Fold change 1.3 for ratio of means between arms
– Coefficient of variation (= SD/mean) = 0.325
– Allocation ratio = 1:1
Small E, et al. Clinical Cancer Research 2015;
DOI: 10.1158/1078-0432.CCR-15-0079
mCRPC: Metastatic castration resistant prostate cancer
Phase II design example 4: Design by 2-sample t? test
Sipuleucel-T with abiraterone acetate in mCRPC
• Planned sample size: N = 28/arm? (not reported!)
• Result
– Total 35 pts in concurrent arm and 34 pts in sequential arm
– Median 1.83 vs 1.46 × 10^9
– Means, ratio and p-value not reported!
• Conclusion
Immunologic effects of
Sipuleucel-T are not blunted or
altered in concurrent treatment
• Flaws of design?
Small E, et al. Clinical Cancer Research 2015;
DOI: 10.1158/1078-0432.CCR-15-0079
Phase II design example 5: Design by log-rank test
Apricoxib + erlotinib in biomarker-selected advanced NSCLC
• Primary endpoint: TTP (time-to-event endpoint)
• Nr of arms: 2 (Apricoxib + erlotinib vs. placebo + erlotinib)
• Purpose: Screening
• Interim analysis: No
• Design input parameters:
– Log-rank test
– 1-sided type I error prob. α = 0.2
– Power = 0.8
– Hazard ratio (placebo/apricoxib) = 1.4
– Accrual duration and study duration not reported!
– Allocation ratio = 2:1
– Blinding = Double blind
Gitlitz B, et al. J Thorac Oncol 2014;9:577-582.
Phase II design example 5: Design by log-rank test
Apricoxib + erlotinib in biomarker-selected advanced NSCLC
• Planned sample size: N = 115
• Result
– Evaluable pts: 75 in apricoxib and 39 in placebo
– Follow-up: ≥ 5 months
if not progressed earlier
– TTP:
Median 1.8 vs. 2.1 months,
hazard ratio=1,
p-value = 0.438
• Conclusion?
• Flaws of design?
Gitlitz B, et al. J Thorac Oncol 2014;9:577-582.
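For a log-rank design, the input parameters determine the required number of events rather than the number of patients. The sketch below applies Schoenfeld's approximation to the inputs of the apricoxib example; it is an illustration under that assumed formula, with a hypothetical function name, and it does not reproduce the conversion from events to patients, which depends on the unreported accrual and study durations.

```python
from math import ceil, log
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha, power, allocation=(1, 1),
                      two_sided=False):
    """Approximate number of events needed for a log-rank test."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    total = sum(allocation)
    p1, p2 = allocation[0] / total, allocation[1] / total
    return ceil((z_alpha + z_beta) ** 2 / (p1 * p2 * log(hazard_ratio) ** 2))


# Inputs from the example: HR = 1.4, 1-sided alpha = 0.2, power = 0.8, 2:1.
print(schoenfeld_events(1.4, alpha=0.2, power=0.8, allocation=(2, 1)))
```

Unequal allocation such as 2:1 increases the required number of events relative to 1:1, because the product of the allocation fractions is smaller.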
Phase II trial summary
• Phase II trials are focused on efficacy and short-term
endpoints and are the basis for planning phase III trials
• Many trial designs available, e.g. Fleming’s single stage
design, Simon’s two-stage design, 1-sample t-test design,
2-sample design, etc.
• Choice of trial design depends on
– Purpose: Screening or selection?
– Type of endpoint: Binary, continuous, or time-to-event?
– Need for a control arm: Depending on the historical data, the
target population and if the trial is for a combination therapy
– Need for interim analyses