Shu-Fang Hsu Schmitz
University of Bern, Bern, Switzerland

PHASE I AND PHASE II TRIAL DESIGN IN ONCOLOGY

Phase I trials
Features
• First in human
  – First ever in human, or for a specific indication, or for a specific combination treatment, or for a specific dosing schedule
• A lot of uncertainties
• Focus on acute endpoints, usually adverse events
• Participating patients may be heterogeneous
  – Various indications
• Basis for planning subsequent phase II trial(s)
  – Adverse events, efficacy, pharmacokinetics, pharmacodynamics, patient selection, etc.

Phase I trials
Objectives
• Assess acute adverse events
• Determine the maximum tolerated dose (MTD) and/or the recommended dose for phase II trial
• Assess pharmacokinetics (PK)
• Assess pharmacodynamics (PD)
• Collect early evidence of efficacy
• Identify suitable patient populations

Phase I design example 1: 3+3 design
Temsirolimus (mTOR inhibitor) in HCC
• Primary endpoint: DLT
  – What: Gr 4 haematological toxicity, Gr 3 non-haematological toxicity, treatment delay > 2 weeks
  – When: Cycle 1 (21 days)
  – How: NCI-CTC v3
• Thresholds for primary endpoint: Fixed
  – #DLT/#pts < 2/3 in 1 cohort or < 2/6 in 2 cohorts
• Between-cohort accrual suspension: Yes
• Dose levels: Study-specific, fixed

  Level            -2    -1    1 (starting)    2     3
  Dose (mg/week)   10    15    20              25    30

Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.

• Dose escalation/de-escalation scheme: Fixed
  – Step 1: 3 new pts at a new dose level i
  – No DLT: 3 new pts at higher dose level i+1 (repeat Step 1)
  – 1/3 DLT: 3 new pts at same dose level i
      1/6 DLT (in 6 pts): 3 new pts at higher dose level i+1 (repeat Step 1)
      ≥ 2/6 DLT (in 6 pts): Stop at dose level i; MTD is dose level i−1
  – ≥ 2/3 DLT: Stop at dose level i; MTD is dose level i−1
• Actual #DLTs/#pts
  [Figure: #DLTs/#pts by cohort (1 to 5) and dose (20, 25, 30 mg/wk); values shown: 0/3, 0/3, 1/3, 1/3, 0/7]
• Number of subjects per cohort: Fixed, 3 pts/cohort
• Number of cohorts per dose level: Fixed, with max 2 cohorts
• Trial stopping rule: Fixed
  – For a given dose level, ≥ 2/3 DLTs in 1 cohort or ≥ 2/6 DLTs in 2 cohorts
• Selection criteria for MTD: Fixed, 1 dose level lower than that stopped → 25 mg/wk
• Dose expansion part: Treated additional pts at MTD, for a total of 10 pts

Phase I design example 1: 3+3 design
Design features
• Algorithm-based design
• Advantages
  – Easy to apply
  – Commonly used and well accepted
• Disadvantages
  – No flexibility: Not suitable for certain scenarios
  – Non-adaptive: Inefficient use of accumulated data
  – Higher risk of treating pts above the true MTD
  – Lower chance of selecting correct MTD

Ji Y, Wang S-J. JCO 2013;31:1785-1791.

Phase I design example 2: Modified toxicity probability interval (mTPI) design
Nitroglycerin (NTG) patch in operable rectal cancer
• Primary endpoint: DLT
  – What: 2 instances of Gr 3 toxicity, Gr 4 NTG-related toxicity
  – When: During neoadjuvant chemoradiation therapy
  – How: CTCAE v4.02
• Threshold for primary endpoint
  – Study-specific, 30%, with proper dosing interval [25%, 35%]
• Between-cohort accrual suspension: Yes
• Dose levels: Study-specific, fixed

  Level           1 (starting)    2      3
  Dose (mg/hr)    0.2             0.4    0.6

Illum H, et al. Surgery 2015 (in press).

• Dose escalation/de-escalation scheme: Adaptive

                   Total #pts at current dose
  Total #DLTs      3     4     5     6     7
  0                E     E     E     E     E
  1                S     S     S     E     E
  2                D     S     S     S     S
  3                DU    DU    D     S     S
  4                      DU    DU    DU    D
  5                            DU    DU    DU

  E = escalate, S = stay at current dose, D = de-escalate, U = unacceptable

Ji Y, Wang S-J. JCO 2013;31:1785-1791.
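The decision table above can be regenerated from the posterior distribution of the DLT rate. The following is a minimal sketch, not the software used in the trial. It assumes a Beta(1, 1) prior and an exclusion cut-off xi = 0.95 for the "U" flag (both are assumptions, not stated on the slides); the target of 30% and the interval [25%, 35%] are taken from the slides. The function names `beta_cdf` and `mtpi_decision` are illustrative.

```python
from math import comb


def beta_cdf(p, a, b):
    """CDF of Beta(a, b) at p, valid for positive integer a and b.

    Uses the identity I_p(a, b) = P(Binomial(a + b - 1, p) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(a, n + 1))


def mtpi_decision(n_dlt, n_pts, target=0.30, lo=0.25, hi=0.35, xi=0.95):
    """Return 'E', 'S', 'D' or 'DU' for n_dlt DLTs among n_pts patients."""
    # Posterior under the assumed Beta(1, 1) prior.
    a, b = 1 + n_dlt, 1 + n_pts - n_dlt
    c_lo, c_hi = beta_cdf(lo, a, b), beta_cdf(hi, a, b)
    # Unit probability mass: posterior mass of each interval / interval width.
    upm = {
        "E": c_lo / lo,                  # under-dosing interval (0, lo)
        "S": (c_hi - c_lo) / (hi - lo),  # proper dosing interval [lo, hi]
        "D": (1 - c_hi) / (1 - hi),      # over-dosing interval (hi, 1)
    }
    # Dose flagged unacceptable if P(rate > target | data) exceeds xi (assumed 0.95).
    if 1 - beta_cdf(target, a, b) > xi:
        return "DU"
    return max(upm, key=upm.get)


if __name__ == "__main__":
    sizes = [3, 4, 5, 6, 7]
    print("DLTs " + " ".join(f"{n:>3}" for n in sizes))
    for x in range(6):
        cells = [mtpi_decision(x, n) if x <= n else "" for n in sizes]
        print(f"{x:>4} " + " ".join(f"{c:>3}" for c in cells))
```

Because the whole table depends only on the target, the interval and the prior, it can be printed before the first patient is enrolled, which is why the scheme is available "in advance".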
• Actual #DLTs/#pts
  [Figure: #DLTs/#pts by cohort (1 to 4) and dose (0.2, 0.4, 0.6 mg/hr); values shown: 0/3, 0/3, 1/3, 0/4]
• Number of subjects per cohort: Flexible, 3-4 pts
• Number of cohorts per dose level: Adaptive, updated with accumulated data
• Trial stopping rule: Study specific, after total 13 pts
• Selection criteria for MTD:
  – Statistical rule not yet defined
  – Study specific
  – 0.6 mg/hr was chosen

Phase I design example 2: Modified toxicity probability interval (mTPI) design
Design features
• Model-based design
• Advantages
  – Easy to apply: Free software (Excel or R), dose escalation/de-escalation scheme generated in advance
  – Study-specific: Thresholds, trial stopping rule
  – Flexible: #pts/cohort
  – Adaptive: Dose escalation/de-escalation scheme, #cohorts/dose level
  – Lower risk of treating pts above the true MTD
  – Higher chance of selecting correct MTD
• Disadvantages
  – Statistical criteria for MTD selection not yet defined

Phase I design example 3: Modified time-to-event continual reassessment method (TITE-CRM) design
Continuous MKC-1 in advanced or metastatic solid malignancies
• Primary endpoint: DLT
  – What: ANC < 750/mm³, neutropenic fever, or platelets < 25,000/mm³, Gr 3 non-haematological toxicity
  – When: Cycle 1 (acute toxicity) and Cycle 2-3 (late toxicity)
  – How: CTCAE v3.0
• Threshold for primary endpoint:
  – Study-specific, 33%
• Between-cohort accrual suspension: No
• Dose levels:
  – Starting dose level: Pre-specified, 60 mg BID (120 mg/d)
  – Further dose levels: Adaptive, flexible
• Dose expansion part: Treated additional 12 pts at MTD

Tevaarwerk A, et al. Invest New Drugs 2012;30(3):1039–1045.
• Dose escalation/de-escalation scheme: Adaptive
  – Step 1: Apply initial dose to first cohort
  – Step 2: Update model using accumulated data
  – Step 3: Estimate DLT risk at different doses
  – Step 4: Identify doses that will give DLT risk in targeted range
  – Step 5: Apply additional constraints to select a dose
  – Step 6: Apply selected dose to next cohort, then return to Step 2
• Actual dose-DLT data
  [Figure: dose (120, 150, 180, 200, 230, 260, 290, 320 mg/d) by patient (1 to 24); markers distinguish "No DLT" from "A" = acute DLT]
• Final estimation
  [Figure]
• Number of subjects per cohort:
  – Cohort 1: 3 pts specified by protocol
  – Further cohorts: 1 pt (continual reassessment)
• Number of cohorts per dose level: Adaptive
• Trial stopping rule: Study specific, after total 24 pts
• Selection criteria for MTD:
  – Study specific, 33% pts experience DLTs by 12 weeks
  – 320 mg/d was chosen
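The "update model, estimate DLT risk, select dose" loop can be illustrated with a generic TITE-CRM calculation. This is a sketch under stated assumptions, not the model of the MKC-1 trial: it assumes a one-parameter power model p_d = skeleton_d^exp(a), a normal prior on a with SD 1.34, linear time weights, and a hypothetical skeleton of prior DLT risk guesses. The dose list, the 33% target and the 12-week window come from the slides; all function names are illustrative.

```python
import math


def tite_crm_estimates(skeleton, patients, window, prior_sd=1.34, grid_size=2001):
    """Posterior mean DLT risk per dose, computed on a grid over parameter a.

    patients: list of (dose_index, had_dlt, follow_up_time) tuples.
    A patient without DLT contributes weight w = follow_up_time / window (max 1);
    a patient with DLT contributes full weight.
    """
    grid = [-6 + 12 * i / (grid_size - 1) for i in range(grid_size)]
    weights = []
    for a in grid:
        log_post = -0.5 * (a / prior_sd) ** 2  # unnormalised N(0, prior_sd^2) prior
        for dose, had_dlt, t in patients:
            log_p = math.exp(a) * math.log(skeleton[dose])  # log of p_d(a)
            if had_dlt:
                log_post += log_p
            else:
                w = min(t / window, 1.0)
                log_post += math.log(1 - w * math.exp(log_p))
        weights.append(math.exp(log_post))
    total = sum(weights)
    return [
        sum(wt * math.exp(math.exp(a) * math.log(s)) for wt, a in zip(weights, grid)) / total
        for s in skeleton
    ]


def recommend(estimates, target):
    """Index of the dose whose estimated DLT risk is closest to the target."""
    return min(range(len(estimates)), key=lambda i: abs(estimates[i] - target))


if __name__ == "__main__":
    doses = [120, 150, 180, 200, 230, 260, 290, 320]             # mg/d, from the slides
    skeleton = [0.05, 0.08, 0.12, 0.18, 0.25, 0.33, 0.42, 0.52]  # hypothetical
    # Hypothetical data: three patients at 120 mg/d, one still under observation.
    data = [(0, False, 12.0), (0, False, 12.0), (0, False, 5.0)]
    est = tite_crm_estimates(skeleton, data, window=12.0)
    print([round(e, 3) for e in est])
    print("model-recommended dose before constraints:", doses[recommend(est, 0.33)], "mg/d")
```

The time weights are what allow enrolment to continue without suspension: a patient who is only partly through the observation window still contributes partial information. A real protocol would add constraints such as limits on dose skipping before applying the recommendation.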
Phase I design example 3: Modified time-to-event continual reassessment method (TITE-CRM) design
Design features
• Model-based design
• Advantages
  – Study-specific: Threshold, trial stopping rule
  – Adaptive: Dose escalation/de-escalation scheme, dose levels
  – Higher chance of selecting correct MTD
  – Statistical criteria for MTD selection well defined
  – Without accrual suspension between cohorts
  – Account for late toxicities
• Disadvantages
  – Require intensive statistician support prior to and during trial
  – Require continuous data update

Phase I design example 4: Trivariate continual reassessment method (TriCRM) design
Motivation
• Toxicity is not always a good indicator for efficacy

Mandrekar S, et al. Statist Med 2010;29:1077–1083.

Phase I design example 4: Trivariate continual reassessment method (TriCRM) design
Design aspects
• Primary endpoints: Toxicity and efficacy
  – 3 categories

    Toxicity    Efficacy     Category
    No          No           No response
    No          Yes          Success
    Yes         No or Yes    Toxicity

  – Limitation: Efficacy endpoint (e.g. biomarker) should allow timely assessment within the same timeframe as the toxicity endpoint
• Threshold for toxicity: Study-specific
• Between-cohort accrual suspension: Yes
• Dose levels: Fixed

Zhang W et al. Statist Med 2006;25:2365–2383.

• Dose escalation/de-escalation scheme: Adaptive
  – Step 1: Apply initial dose to first cohort
  – Step 2: Update model using accumulated data
  – Step 3: Estimate probabilities for the 3 categories at different doses
  – Step 4: Identify doses that give toxicity probability under threshold
  – Step 5: Select the dose maximizing success probability & satisfying other constraints
  – Step 6: Apply selected dose to next cohort, then return to Step 2
• Number of subjects per cohort:
  – Study specific, C (≥ 1) pts/cohort
• Number of cohorts per dose level: Adaptive
• Trial stopping rule: Study specific
  – Total n1 pts treated, including n1 pts with success
  – Or the maximum of n2 (> n1) pts treated
• Selection criteria for biologically optimal dose (BOD):
  – Study specific, the model recommended dose at trial end

Phase I design example 4: Trivariate continual reassessment method (TriCRM) design
Design features
• Model-based design
• Advantages
  – Study-specific: Threshold, #pts/cohort, trial stopping rule
  – Adaptive: Dose escalation/de-escalation scheme
  – Consider both toxicity and efficacy
• Disadvantages
  – Require intensive statistician support prior to and during trial
  – Require continuous toxicity and efficacy data update

Phase I trial summary
• Phase I trials are necessary to determine MTDs, assess acute AEs, pharmacokinetics and pharmacodynamics, and plan subsequent phase II trials
• Many trial designs available, e.g. 3+3 design, mTPI, TITE-CRM, TriCRM, etc.
• Choice of trial design depends on preferences for
  – Toxicity threshold: Fixed or study-specific?
  – Adaptive features: For dose levels, number of patients/cohort, number of cohorts/dose level, or dose escalation rule?
  – Patient enrollment between cohorts: Suspend or continue?
  – Consideration of efficacy?
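The last point, consideration of efficacy, is what distinguishes TriCRM from the other designs. Its dose-selection step ("identify doses with toxicity probability under threshold, then maximize success probability") can be sketched as follows. The model fitting itself and the "other constraints" are omitted, and the per-dose probabilities and the threshold of 0.33 in the usage lines are hypothetical placeholders, not values from Zhang et al.

```python
def select_dose(estimates, tox_threshold):
    """Pick the dose index with the highest success probability among safe doses.

    estimates: one (p_no_response, p_success, p_toxicity) tuple per dose level.
    Returns None if no dose has toxicity probability below the threshold.
    """
    admissible = [
        i for i, (_, _, p_tox) in enumerate(estimates) if p_tox < tox_threshold
    ]
    if not admissible:
        return None
    return max(admissible, key=lambda i: estimates[i][1])


if __name__ == "__main__":
    # Hypothetical model estimates for four fixed dose levels.
    estimates = [
        (0.80, 0.15, 0.05),
        (0.55, 0.35, 0.10),
        (0.30, 0.45, 0.25),
        (0.15, 0.40, 0.45),  # best efficacy among the toxic doses, but excluded
    ]
    print("selected dose level index:", select_dose(estimates, tox_threshold=0.33))
```

In the hypothetical data the success probability peaks below the highest dose, which illustrates why the biologically optimal dose need not coincide with the MTD.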
Phase II trials
Features
• Initial evaluation with focus on efficacy
• Relatively short trial duration
• Focus on short-term endpoints
• Small sample size
• Participating patients more homogeneous
• Basis for planning subsequent phase III trial(s)

Phase II trials
Objectives
• Assess short-term efficacy
  – Traditionally ORR
  – Recently also PFS based endpoints
• Assess common short-term adverse events
• Characterise dose-response relation
• Determine the dosing ranges
• Decision for further development in phase III trial
  – Screening: Is the new agent promising enough for further investigations?
  – Selection: Among several promising new agents/doses, which one should be selected for the subsequent phase III trial?

Phase II trials
Control arm
• When is a control arm needed?
  – Historical data are not reliable
    • Variation between institutions in subject selection, care of subjects, diagnosis procedure, reporting habit, assessment interval (especially important for PFS), etc.
    • Long time lag
  – Historical data for the intended endpoint(s) are not available
  – The target population is heterogeneous
  – The target population is unclear
  – Often for combination interventions, e.g. standard treatment + new drug
• The purpose of a control arm
  – For calibration
    • No comparison between arms
    • Each arm is considered as a single-arm trial
    • If the result of the control arm (standard treatment) is very different from that of historical data, the result of the new treatment should be interpreted with caution
  – For “rough” comparisons
    • This is not a substitute for a proper phase III trial
    • All arms need to be considered in the same statistical design
    • Use the statistical approach of phase III design, but with more liberal parameters, e.g.

                                        Phase II                    Phase III
      Effect size                       Slightly “exaggerated”      Min. clinical relevance
      Alternative hypothesis            1 sided                     2 sided
      Alpha                             5-20%                       5%
      Power                             80%                         80-90%
      Conclusion when p-value ≤ alpha   Promising for phase III     Statistically significant

Phase II design example 1a: Fleming’s single-stage design
Axitinib + standard chemotherapy in advanced squamous NSCLC
• Primary endpoint: ORR (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: No
• Historical data: ORR 17-38% with standard chemotherapy
• Design input parameters
  – 1-sided type I error prob. = 0.1
  – Power = 0.85
  – Uninteresting prob. of “success” P0 = 0.4
  – Promising prob. of “success” P1 = 0.6

Bondarenko I, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1350-6

• Planned sample size: N = 36
• Decision rule: Declare promising if > 18/36 (50%) “successes”
• Result
  – Total 38 pts
  – 1 CR, 14 PR → ORR 39%
• Conclusion?
• Flaws of design?

Phase II design example 1b: Fleming’s single-stage design
Temsirolimus (mTOR inhibitor) in HCC
• Primary endpoint: PFS at 3 months (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: No
• Historical data: Efficacy of everolimus has not been confirmed by phase III study (EVOLVE-1, NCT01035229)
• Design input parameters
  – 1-sided type I error prob. = 0.2
  – Power = 0.85
  – Uninteresting prob. of “success” P0 = 0.5
  – Promising prob. of “success” P1 = 0.66

Yeo W, et al. BMC Cancer 2015; doi:10.1186/s12885-015-1334-6.

• Planned sample size: N = 30
• Decision rule: Declare promising if > 17/30 (57%) “successes”
• Result
  – Total 36 pts
  – # “successes” not reported!
  – Patients received 1-6 cycles, median 3.5 cycles
  – Median follow-up 8.9 months
  – Kaplan-Meier estimate PFS at 3 months = 0.47
• Conclusion?
• Flaws of design?

Phase II design example 2: Simon’s two-stage design
Metronomic oral vinorelbine as first-line in elderly advanced NSCLC
• Primary endpoint: ORR (binary, “success”/“failure”)
• Nr of arms: 1
• Purpose: Screening
• Interim analysis: 1, for futility
• Design input parameters
  – 1-sided type I error prob. = 0.05
  – Power = 0.8
  – Uninteresting prob. of “success” P0 = 0.1
  – Promising prob. of “success” P1 = 0.25

Camerini A, et al. BMC Cancer 2015; DOI 10.1186/s12885-015-1354-2

• Planned sample size and decision rules
  – Stage 1: N1 (18) patients, then suspend accrual for the stage-1 analysis
    • ≤ R1 (2) responses: Trt not promising → Stop
    • > R1 (2) responses: Continue to a total of N (43) patients
  – Stage 2: Stage-2 analysis on total N (43) patients
    • ≤ R (7) responses: Trt not promising → Stop
    • > R (7) responses: Trt promising → Further investigation
• Simon’s two-stage design variations
  – Optimal design: N1=18, R1=2, N=43, R=7
  – Minimax design: N1=22, R1=2, N=40, R=7
    Compared with optimal design:
    • Larger N1: Stage-1 analysis at later time
    • Smaller N: Save resource
  Note: The paper specified that the minimax design was used. In fact, the N1, R1, N and R were from the optimal design!
• Result
  – Total 43 pts
  – Stage 1 results not reported!
  – Final: 1 CR, 7 PR → ORR = 18.6%
• Conclusion?
• Flaws of design?

Phase II design example 3: Design by 1-sample t-test
Neoadjuvant metronomic therapy in triple-negative breast cancer
• Primary endpoint: Change in percentage of Ki-67+ cells between baseline biopsy and surgical resection specimens (continuous scale)
• Nr of arms: 1
• Purpose: Screening (Selection?)
• Interim analysis: No
• Design input parameters:
  – 2-sided type I error prob. = 0.05
  – Power = 0.8
  – Change in percentage of Ki-67+ cells = 10 (pre 20 vs post 10)
  – Standard deviation = 20

Cancello G, et al. Clinical Breast Cancer 2015 (in press).

• Planned sample size: N = 32
• Decision rule: Declare promising if p-value ≤ 0.05
• Result
  – Total 30/34 pts evaluable
  – Mean reduction in percentage of Ki-67+ cells = 41%, 95% confidence interval = (30%, 51%)
  – P-value < 0.0001
• Conclusion?
• Flaws of design?

Phase II design example 4: Design by 2-sample t? test
Sipuleucel-T with abiraterone acetate in mCRPC
• Primary endpoint: Cumulative antigen presenting cell (APC) activation (continuous scale, positively correlated with OS in other trials)
• Nr of arms: 2 (concurrent vs sequential abiraterone acetate)
• Purpose: Screening
• Interim analysis: No
• Design input parameters:
  – t-test? (not reported!)
  – 2-sided type I error prob. = 0.05? (not reported!)
  – Power = 0.85
  – Fold change 1.3 for ratio of means between arms
  – Coefficient of variation (= SD/mean) = 0.325
  – Allocation ratio = 1:1

Small E, et al. Clinical Cancer Research 2015; DOI: 10.1158/1078-0432.CCR-15-0079
mCRPC: Metastatic castration resistant prostate cancer

• Planned sample size: N = 28/arm? (not reported!)
• Result
  – Total 35 pts in concurrent arm and 34 pts in sequential arm
  – Median 1.83 vs 1.46 × 10^9
  – Means, ratio and p-value not reported!
• Conclusion
  – Immunologic effects of Sipuleucel-T are not blunted or altered in concurrent treatment
• Flaws of design?

Phase II design example 5: Design by log-rank test
Apricoxib + erlotinib in biomarker-selected advanced NSCLC
• Primary endpoint: TTP (time-to-event endpoint)
• Nr of arms: 2 (Apricoxib + erlotinib vs. placebo + erlotinib)
• Purpose: Screening
• Interim analysis: No
• Design input parameters:
  – Log-rank test
  – 1-sided type I error prob. = 0.2
  – Power = 0.8
  – Hazard ratio (placebo/apricoxib) = 1.4
  – Accrual duration and study duration not reported!
  – Allocation ratio = 2:1
  – Blinding = Double blind

Gitlitz B, et al. J Thorac Oncol 2014;9:577-582.

• Planned sample size: N = 115
• Result
  – Evaluable pts: 75 in apricoxib and 39 in placebo
  – Follow-up: ≥ 5 months if not progressed earlier
  – TTP: Median 1.8 vs. 2.1 months, hazard ratio = 1, p-value = 0.438
• Conclusion?
• Flaws of design?

Phase II trial summary
• Phase II trials are focused on efficacy and short-term endpoints and are the basis for planning phase III trials
• Many trial designs available, e.g. Fleming’s single-stage design, Simon’s two-stage design, 1-sample t-test design, 2-sample design, etc.
• Choice of trial design depends on
  – Purpose: Screening or selection?
  – Type of endpoint: Binary, continuous, or time-to-event?
  – Need for a control arm: Depending on the historical data, the target population and whether the trial is for a combination therapy
  – Need for interim analyses
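For the binary-endpoint designs (examples 1a, 1b and 2), the stated decision rules can be checked against their design inputs by exact binomial calculation. The sketch below assumes nothing beyond the N, R, P0 and P1 values quoted on the slides; the function names are illustrative, and the calculation is a plain binomial evaluation rather than the software used by the trial authors.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sf(k, n, p):
    """P(X > k) for X ~ Binomial(n, p); equals 1 when k is negative."""
    return sum(binom_pmf(j, n, p) for j in range(max(k + 1, 0), n + 1))


def single_stage_prob_promising(n, r, p):
    """Single-stage rule: declare promising if more than r successes out of n."""
    return binom_sf(r, n, p)


def two_stage_prob_promising(n1, r1, n, r, p):
    """Two-stage rule: continue if > r1 responses in n1, promising if > r in n."""
    n2 = n - n1
    return sum(
        binom_pmf(x1, n1, p) * binom_sf(r - x1, n2, p)
        for x1 in range(r1 + 1, n1 + 1)
    )


def early_stop_prob(n1, r1, p):
    """Probability of stopping after stage 1 (at most r1 responses in n1)."""
    return 1 - binom_sf(r1, n1, p)


if __name__ == "__main__":
    # Example 1a: N = 36, promising if > 18 successes, P0 = 0.4, P1 = 0.6
    print("1a type I error:", round(single_stage_prob_promising(36, 18, 0.4), 4))
    print("1a power       :", round(single_stage_prob_promising(36, 18, 0.6), 4))
    # Example 2 (optimal design): N1 = 18, R1 = 2, N = 43, R = 7, P0 = 0.1, P1 = 0.25
    print("2 type I error :", round(two_stage_prob_promising(18, 2, 43, 7, 0.10), 4))
    print("2 power        :", round(two_stage_prob_promising(18, 2, 43, 7, 0.25), 4))
    pet = early_stop_prob(18, 2, 0.10)
    print("2 P(early stop | P0):", round(pet, 4))
    print("2 expected N under P0:", round(18 + (1 - pet) * (43 - 18), 2))
```

The same functions can be applied to example 1b (N = 30, promising if > 17, P0 = 0.5, P1 = 0.66) or to the minimax variant of example 2 to compare their operating characteristics with the stated inputs.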