Assessing Immune Correlates of Protection Via Estimation of the Vaccine Efficacy Curve
Peter Gilbert
Fred Hutchinson Cancer Research Center and University of Washington, Department of Biostatistics
ISCB Vaccines Sub-Committee Web Seminar Series
November 7, 2012

Outline
1. Introduction: Concepts and definitions of immune correlates/surrogate endpoints
2. Evaluating an immune correlate of protection via the vaccine efficacy curve
3. Statistical methods

Context: Preventive Vaccine Efficacy Trial
[Trial schematic: randomize to vaccine or placebo; receive inoculations; measure immune response; follow for clinical endpoint (pathogen-specific disease)]
• Primary Objective
  – Assess VE: Vaccine Efficacy to prevent pathogen-specific disease
• Secondary Objective
  – Assess vaccine-induced immune responses as correlates of protection

Importance of an Immune Correlate
• Finding an immune correlate is a central goal of vaccine research
  – One of the 14 ‘Grand Challenges of Global Health’ of the NIH & Gates Foundation (for HIV, TB, Malaria)
• Immune correlates are useful for:
  – Shortening trials and reducing costs
  – Guiding iterative development of vaccines between basic and clinical research
  – Guiding regulatory decisions
  – Guiding immunization policy
  – Bridging efficacy of a vaccine observed in a trial to a new setting
• Pearl (2011, International Journal of Biostatistics) suggests that bridging is the reason for a surrogate endpoint

Two Major Concepts/Paradigms for Surrogate Endpoints
• Causal agent paradigm (e.g., Plotkin, 2008, Clin Infect Dis)
  – Causal agent of protection = marker that mechanistically causes vaccine efficacy against the clinical endpoint
• Prediction paradigm (e.g., Qin et al., 2007, J Infect Dis)
  – Predictor of protection = marker that reliably predicts the level of vaccine efficacy against the clinical endpoint
• Both are extremely useful for vaccine development, but are assessed using different approaches
• For the goal of statistical assessment of surrogate endpoint validity in an efficacy
trial, the prediction paradigm is used
• As in the statistical literature, a good surrogate endpoint allows predicting VE from the vaccine effect on the surrogate

Immune Correlates Terminology: Contradictions
Qin et al. (2007)
• Correlate (of risk) = measured immune response that predicts infection in the vaccine group
• Surrogate = measured immune response that can be used to reliably predict VE (may or may not be a mechanism of protection)
Plotkin (2008)
• Correlate (of protection) = measured immune response that actually causes protection (mechanism of protection)
• Surrogate = measured immune response that can be used to reliably predict VE (is definitely not a mechanism of protection)
Comparison
• Qin et al. correlate vs. Plotkin correlate [very different]
• Qin et al. surrogate vs. Plotkin surrogate

Reconciliation of Terminology: Plotkin and Gilbert (2012, Clin Infect Dis)
Term: CoP (Correlate of Protection)
  Synonyms: Predictor of Protection; Good Surrogate Endpoint
  Definition: An immune marker statistically correlated with vaccine efficacy (equivalently, predictive of vaccine efficacy)* that may or may not be a mechanistic causal agent of protection
Term: mCoP (Mechanistic Correlate of Protection)
  Synonyms: Causal Agent of Protection; Protective Immune Function
  Definition: A CoP that is mechanistically causally responsible for protection
Term: nCoP (Non-Mechanistic Correlate of Protection)
  Synonyms: Correlate of Protection Not Causal; Predictor of Protection Not Causal
  Definition: A CoP that is not a mechanistic causal agent of protection
*A CoP can be used to accurately predict the level of vaccine efficacy conferred to vaccine recipients (individuals or subgroups defined by the immune marker level). Thus a CoP is a surrogate endpoint in the statistical literature, and may be assessed with the Prentice framework or the principal stratification framework.

A Predictive Surrogate/CoP May or May Not be a Mechanism of Protection*
Definition of a CoP: An endpoint that can be used to reliably predict the vaccine effect on the clinical endpoint
*Plotkin and Gilbert (2012, Clin Infect Dis), Figure 1: A correlate of protection (CoP) may either be a mechanism of protection, termed mCoP, or a non-mechanism of protection, termed nCoP, which predicts vaccine efficacy through its (partial) correlation with another immune response(s) that mechanistically protects.

Many Ways for a CoR to Fail to be a CoP
“A Correlate Does Not a Surrogate Make” – Tom Fleming
1. The biomarker is not in the pathway of the intervention's effect, or is insensitive to its effect
   – E.g., the immunological assay is noisy
2. The biomarker is not in the causal pathway of the exposure/infection/disease process
   – E.g., the antibody response neutralizes serotypes of the pathogen that rarely expose trial participants but fails to neutralize the predominantly exposing serotypes
3. The intervention has mechanisms of action independent of the disease process
   – E.g., other immunological functions not measured by the assay are needed for protection

Catastrophic Failure of a CoR to be a CoP: the ‘Surrogate Paradox’
• Surrogate Paradox: The vaccine induces an immune response, the immune response is inversely correlated with disease risk in vaccinees, but VE < 0%
Three Causes of the Surrogate Paradox*
1. Confounding of the association between the potential surrogate and the clinical endpoint
2. The vaccine positively affects both the surrogate and the clinical endpoint, but for different sets of subjects
3. The vaccine may have a negative clinical effect in ways not involving the potential surrogate
*From Tyler VanderWeele

“There is a plague on Man, the opinion that he knows something.” − Michel de Montaigne (1580, Essays)

Gilbert, Hudgens, Wolfson (2011, Int J Biostatistics) discussed the scientific value of the vaccine efficacy curve for vaccine development

Two Frameworks for Assessing a CoP from a Single Vaccine Efficacy Trial: Prentice & Principal Stratification (PS)
Key Issue: Do trial participants have prior exposure to the pathogen under study?
• If Yes, immune responses vary for both vaccine and placebo recipients
  – In this case, the Prentice and PS frameworks both apply
• If No, immune responses vary for vaccinees only, and the Prentice framework does not apply (Chan et al., 2002, Stats Med)
  – In this case, only the PS framework applies
• In this talk we consider the PS approach in both settings

Concept of PS Framework: Assess Association of Individual-Level Vaccine Effects on the Surrogate and Clinical Endpoint
[Figure: vaccine effect on the clinical endpoint (probability an individual is protected) versus vaccine effect on the immune response marker for an individual]

Definition of a Principal Surrogate/Principal CoP
• Define the vaccine efficacy surface as
  VE(s1, s0) = 1 – [Risk of clinical endpoint for vaccinees for subgroup with marker effect (s1, s0)] / [Risk of clinical endpoint for placebos for subgroup with marker effect (s1, s0)]
• Interpretation: Percent reduction in clinical risk for a vaccinated subject with markers (s1, s0) compared to if s/he had not been vaccinated
• Definition: A principal CoP is a marker with large variability of VE(s1, s0) in (s1, s0)
• Another useful property is VE(s1 = s0) = 0
  – This property is Average Causal Necessity: No vaccine effect on the marker implies no vaccine efficacy

Marker Useless as a CoP
[Figure: vaccine efficacy surface VE(s1, s0) for a marker that is useless as a CoP]

Marker that is an Excellent CoP
[Figure: vaccine efficacy surface VE(s1, s0) for a marker that is an excellent CoP]

Simplest Way to Think About the PS Framework for Assessing a CoP: It’s Simply Subgroup Analysis
• Conceptually the analysis assesses VE in subgroups defined by the vaccine effect on
the marker
  – Evaluate if and how VE varies with ‘baseline’ subgroups defined by (S1, S0)
  – Principal stratification makes (S1, S0) equivalent to a baseline covariate
• A useful CoP will have strong effect modification, i.e., VE(s1, s0) varies widely in (s1, s0)
• It would be even more valuable to identify actual baseline covariates that predict VE well, but it’s much more likely that a response to vaccination predicts VE well

Simplified Definition of a Principal Surrogate/Principal CoP: Ignore the Immune Response under Placebo, S0
• Define the vaccine efficacy curve as
  VE(s1) = 1 – [Risk of clinical endpoint for vaccinees for subgroup with marker s1] / [Risk of clinical endpoint for placebos for subgroup with marker s1]
• Interpretation: Percent reduction in clinical risk for a vaccinated subject with marker s1 compared to if s/he had not been vaccinated
• Definition: A principal CoP is a marker with large variability of VE(s1) in s1
• The vaccine efficacy curve is useful in both settings, whether or not participants have prior exposure to the pathogen
• If no prior exposure, then VE(s1, s0) = VE(s1), such that the vaccine efficacy surface simplifies to the vaccine efficacy curve

Vaccine Efficacy Curve: Assess How VE Varies in the Marker Under Vaccination
[Figure: VE(s1) versus marker level s1 for three markers. Black marker: worthless as surrogate. Green and blue markers satisfy causal necessity. Blue marker: very good surrogate.]

Excellent CoP: Sets the Target for Improving the Vaccine
Target: Improve the vaccine regimen by increasing the percentage of vaccinees with high immune responses
[Figure: VE(s1) versus marker level s1. Black marker: worthless as surrogate. Green and blue markers satisfy causal necessity. Blue marker: very good surrogate.]

Knowledge of a CoP Guides Future Research to Develop Improved Vaccines
• A good CoP identified in an efficacy trial is the ideal primary endpoint in follow-up Phase I/II trials of refined vaccines
• It also generates a bridging hypothesis: If a future vaccine is identified
that generates higher marker levels in more vaccinated subjects, then it will have improved overall VE

Using the CoP for Improving the Vaccine Regimen
[Figure: marker levels for the Original Vaccine, New Vaccine 1, and New Vaccine 2]

Using the CoP for Improving the Vaccine Regimen
• Suppose each new vaccine is tested in an efficacy trial
• Under the bridging hypothesis we expect the following efficacy results:
[Figure: estimated VE versus marker level for the Original Vaccine, New Vaccine 1, and New Vaccine 2; overall TE values shown: 75%, 50%, 31%]
• This is the idealized model for using a CoP to iteratively improve a vaccine regimen

Challenge to Evaluating a Principal CoP: The Immune Responses to Vaccine are Missing for Subjects Assigned Placebo
• Accurately filling in the unknown immune responses is needed to evaluate a principal CoP
• Two approaches to filling in the missing data (Follmann, 2006, Biometrics):
  – BIP (Baseline immunogenicity predictor): At baseline, measure a predictor(s) of the immune response in both vaccinees and placebos
  – CPV (Close-out placebo vaccination): At study closeout, vaccinate disease-free placebo recipients and measure the immune response

Example of a Good BIP: Antibody Responses to Hepatitis A and B Vaccines*
[Figure: Spearman rank r = 0.85; no cross-reactivity; N = 75 subjects]
*Czeschinski et al. (2000, Vaccine 18:1074-1080)

Baseline Immunogenicity Predictor

Schematic of Baseline Immunogenicity Predictor (BIP) & Closeout Placebo Vaccination (CPV) Trial Designs*
[Schematic of the BIP approach and the CPV approach, with labels Vx, W, S = S(1), and Sc]
*Proposed by Follmann (2006, Biometrics)

Literature on Statistical Methods for Estimating the Vaccine Efficacy Curve via BIP and/or CPV
Article: Comment
1.
Follmann (2006, Biometrics): Binary outcome; BIP&CPV; Estimated likelihood
2. Gilbert and Hudgens (2008, Biometrics): Binary outcome; BIP; Estimated likelihood; 2-phase sampling
3. Qin, Gilbert, Follmann, Li (2008, Ann Appl Stats): Time-to-event outcome (Cox model); BIP&CPV; Estimated likelihood; 2-phase sampling
4. Wolfson and Gilbert (2010, Biometrics): Binary outcome; BIP&CPV; Estimated likelihood; 2-phase sampling; relaxed assumptions
5. Huang and Gilbert (2011, Biometrics): Binary outcome; BIP&CPV; Estimated likelihood; 2-phase sampling; relaxed assumptions; compare markers
6. Huang, Gilbert, Wolfson (2012, under revision): Binary outcome; BIP&CPV; Pseudolikelihood; 2-phase sampling; relaxed assumptions; marker sampling design
7. Miao, Li, Gilbert, Chan (2012, under revision): Time-to-event outcome (Cox model); BIP; Estimated likelihood with multiple imputation; 2-phase sampling
8. Gabriel and Gilbert (2012, submitted): Time-to-event outcome (Weibull model); BIP+CPV; Estimated likelihood and pseudolikelihood; 2-phase sampling; threshold models

Summary of One of the Principal Stratification Methods: Gilbert and Hudgens (2008, Biometrics) [GH]

Notation (Observed and Potential Outcomes)
Z = vaccination assignment (0 or 1; placebo or vaccine)
W = baseline immunogenicity predictor of S
S = candidate surrogate endpoint/immune CoP measured at time τ after randomization
Y = clinical endpoint (0 or 1; 1 = experience event during follow-up)
S(Z) = potential surrogate endpoint under assignment Z, for Z = 0, 1
Y(Z) = potential clinical endpoint under assignment Z, for Z = 0, 1

Assumptions
A1 Stable Unit Treatment Value Assumption (SUTVA): (Si(1), Si(0), Yi(1), Yi(0)) is independent of the treatment assignments Zj of other subjects
  − A1 implies “consistency”: (Si(Zi), Yi(Zi)) = (Si, Yi)
A2 Ignorable Treatment Assignment: Zi is independent of (Si(1), Si(0), Yi(1), Yi(0))
  − A2 holds for randomized blinded trials
A3 Equal individual clinical risk up to time τ that S is measured (zero
vaccine efficacy for any individual up to time τ)

Definition of a Principal Surrogate/Principal CoP (Frangakis and Rubin, 2002; Gilbert and Hudgens, 2008)
• Define
  risk(1)(s1, s0) = Pr(Y(1) = 1 | S(1) = s1, S(0) = s0)
  risk(0)(s1, s0) = Pr(Y(0) = 1 | S(1) = s1, S(0) = s0)
• A contrast in risk(1)(s1, s0) and risk(0)(s1, s0) is a causal effect on Y for the population {S(1) = s1, S(0) = s0}
• VE(s1, s0) = 1 - risk(1)(s1, s0) / risk(0)(s1, s0)
• A good CoP has VE(s1, s0) varying widely in (s1, s0) [i.e., a large amount of effect modification]
• Also, with VE(s1) = 1 - risk(1)(s1) / risk(0)(s1), a good CoP has VE(s1) varying widely in s1
• These definitions allow for a spectrum of principal CoPs, some more useful than others, depending on the degree of effect modification

Statistical Methods: Build on Two-Phase Sampling Methods
• Case-cohort or case-control sampling (Ignore S0)
  − (W, S(1)) measured in
    o All infected vaccinees
    o Sample of uninfected vaccinees
  − W measured in
    o All infected placebos
    o Sample of uninfected placebos
• 2-Phase designs (e.g., Prentice, 1986, Biometrika; Kulich and Lin, 2004, JASA; Breslow et al., 2009, AJE, Stat Biosciences)
  − Phase 1: Measure inexpensive covariates in all subjects
  − Phase 2: Measure expensive covariates X in a sample of subjects
• Our application
  − Vaccine Group: Exactly like 2-phase design with X = (W, S(1))
  − Placebo Group: Like 2-phase design with X = (W, S(1)) and S(1) missing

IPW Case-Cohort Methods Do Not Apply: Hence we use a Full Likelihood-Based Method
• Most of the published 2-phase sampling/case-cohort failure time methods cannot be extended to estimate the VE curve
  – This is because they are inverse probability weighted (IPW) methods, using partial likelihood score equations that sum over subjects with phase-2 data only, which assume that every subject has a positive probability that S(1) is observed
  – However, all placebo subjects have zero probability that S(1) is observed
• To deal with this problem, the
published methods all use full likelihood, using score equations that sum over all subjects

Maximum Estimated Likelihood* with BIP
• Posit models for risk(1)(s1, 0; β) and risk(0)(s1, 0; β)
• Vaccine arm likelihood contributions:
  − (Wi, Si(1)) measured: risk(1)(Si(1), 0; β)
  − (Wi, Si(1)) not measured: ∫ risk(1)(s1, 0; β) dF(s1)
• Placebo arm likelihood contributions:
  − Wi measured: ∫ risk(0)(s1, 0; β) dFS|W(s1 | Wi)
  − Wi not measured: ∫ risk(0)(s1, 0; β) dF(s1)
• Likelihood, with δi the indicator that subject i is in the subcohort:

\[
\begin{aligned}
L(\beta, F_{S|W}, F) = \prod_i
&\Big\{ \Big[ \mathrm{risk}^{(1)}(S_i(1), 0; \beta)^{Y_i} \big(1 - \mathrm{risk}^{(1)}(S_i(1), 0; \beta)\big)^{1-Y_i} \Big]^{Z_i}
&& \text{[Vx subcohort]} \\
&\times \Big[ \Big(\int \mathrm{risk}^{(0)}(s_1, 0; \beta)\, dF_{S|W}(s_1 \mid W_i)\Big)^{Y_i} \Big(1 - \int \mathrm{risk}^{(0)}(s_1, 0; \beta)\, dF_{S|W}(s_1 \mid W_i)\Big)^{1-Y_i} \Big]^{1-Z_i} \Big\}^{\delta_i}
&& \text{[Plc subcohort]} \\
&\times \Big\{ \Big[ \Big(\int \mathrm{risk}^{(1)}(s_1, 0; \beta)\, dF(s_1)\Big)^{Y_i} \Big(1 - \int \mathrm{risk}^{(1)}(s_1, 0; \beta)\, dF(s_1)\Big)^{1-Y_i} \Big]^{Z_i}
&& \text{[Vx not subcohort]} \\
&\times \Big[ \Big(\int \mathrm{risk}^{(0)}(s_1, 0; \beta)\, dF(s_1)\Big)^{Y_i} \Big(1 - \int \mathrm{risk}^{(0)}(s_1, 0; \beta)\, dF(s_1)\Big)^{1-Y_i} \Big]^{1-Z_i} \Big\}^{1-\delta_i}
&& \text{[Plc not subcohort]}
\end{aligned}
\]

*Pepe and Fleming (1991): an early article on estimated likelihood

Maximum Estimated Likelihood Estimation (MELE)
• Likelihood L(β, FS|W, F)
  − β is parameter of interest [VE curve depends only on β]
  − FS|W and F are nuisance parameters
Step 1: Choose models for FS|W and F and estimate them based on vaccine arm data
Step 2: Plug the consistent estimates of FS|W and F into the likelihood, and maximize it in β
  − e.g., EM algorithm
Step 3: Estimate the variance of the MELE of β, accounting for the uncertainty in the estimates of FS|W and F
  − Bootstrap

Example: Nonparametric Categorical Models
• Assume:
  − S and W categorical with J and K levels; Si(0) = 1 for all i [No prior exposure scenario: category 1 = negative response]
  − Nonparametric models for P(S(1) = j, W = k)
  − A4-NP: Structural models for risk(z) (for z = 0, 1)
    risk(z)(j, 1, k; β) = β_zj + β’_k for j = 1, …, J; k = 1, …, K
    Constraint: 0 ≤ β_zj + β’_k ≤ 1 and Σ_k β’_k = 0 for identifiability
  − A4-NP asserts no interaction: W has the same effect on risk for the 2 study groups (untestable)

Vaccine Efficacy Curve for Categorical Marker
CEPrisk(j, 1) = log(avg-risk(1)(j, 1) / avg-risk(0)(j, 1))
where avg-risk(z)(j, 1) = (1/K) Σ_k risk(z)(j, 1, k; β)
VE(j, 1) = 1 –
exp{CEPrisk(j, 1)}
The vaccine efficacy curve is VE(j, 1) at each level j of S(1)

Tests for the Vaccine Efficacy Curve VE(j, 1) Varying in j
• Wald tests for whether a biomarker has any surrogate value
  − Under the null, PAE(w) = 0.5 and AS = 0
  − Z = (Est. PAE(w) – 0.5) / s.e.(Est. PAE(w))
  − Z = Est. AS / s.e.(Est. AS)
    o Estimates obtained by MELE; bootstrap standard errors
• For nonparametric case A4-NP, test H0: CEPrisk(j, 1) = 0 vs H1: CEPrisk(j, 1) increases in j (like Breslow-Day trend test)
  T = Σ_{j>1} (j - 1) {Est. β_0j – (Est. β_0j + Est. β_1j)(Est. β̄_0 / (Est. β̄_0 + Est. β̄_1))} divided by bootstrap s.e.
  where Est. β̄_z = (1/J) Σ_j Est. β_zj

Simulation Study: Vax004 HIV Vaccine Efficacy Trial*
• Step 1: For all N = 5403 subjects, generate (Wi, Si(1)) from a bivariate normal with means (0.41, 0.41), sds (0.55, 0.55), correlation ρ = 0.5, 0.7, or 0.9
  − sd of 0.55 chosen to achieve the observed 23% rate of left-censoring
  − Values of Wi, Si(1) < 0 set to 0; values > 1 set to 1
• Step 2: Bin Wi and Si(1) into quartiles
  − Under model A4-NP generate Yi(Z) from a Bernoulli(β_zj + β’_k) with the parameters set to achieve:
    o P(Y(1) = 1) = 0.067 and P(Y(0) = 1) = 0.134 (overall VE = 50%)
    o The biomarker has either (i) no or (ii) high surrogate value
*Flynn et al. (2005, JID), Gilbert et al.
(2005, JID)

Simulation Plan
• Scenario (i) (no surrogate value)
  − CEPrisk(j, 1) = -0.69 for j = 1, 2, 3, 4
  − i.e., VE(j, 1) = 0.50 for j = 1, 2, 3, 4
• Scenario (ii) (high surrogate value)
  − CEPrisk(j, 1) = -0.22, -0.51, -0.92, -1.61 for j = 1, 2, 3, 4
  − i.e., VE(j, 1) = 0.2, 0.4, 0.6, 0.8 for j = 1, 2, 3, 4

Simulation Plan
• Step 3: Create case-cohort sampling (3:1 control:case)
  − Vaccine group: (W, S(1)) measured in all infected (n = 241) and a random sample of 3 x 241 uninfected
  − Placebo group: W measured in all infected (n = 127) and a random sample of 3 x 127 uninfected
• The data were simulated to match the real VaxGen trial as closely as possible

Model A4-NP Simulation Results: Bias and Coverage Probabilities
[Table 1 of GH]

Model A4-NP Simulation Results: Power to Detect the VE(j, 1) Curve Varying in j
[Table 2 of GH]
Trend tests for VE(j, 1) increasing in j: Power 0.83, 0.99, > 0.99 for ρ = 0.5, 0.7, 0.9

Conclusions of Simulation Study
• The MELE method of Gilbert and Hudgens performs well for realistically-sized Phase 3 vaccine efficacy trials, with accuracy, precision, and power improving sharply with the strength of the BIP (desire high ρ)
• This shows the importance of developing good BIPs
• R code for the nonparametric method available at the Biometrics website and at http://faculty.washington.edu/peterg/SISMID2011.html

Remarks on Power for Evaluating a Principal Surrogate Endpoint (For All Methods, Beyond GH)
• Crossing over more placebo subjects improves power of CPV and BIP + CPV designs
• There is no point of diminishing returns: steady improvement with more crossed over, out to complete cross-over
• If the BIP is high quality (e.g., ρ > 0.50), then the BIP design is quite powerful with only modest incremental gain by adding CPV
• However, CPV has additional value beyond efficiency improvement:
  – Helps in diagnostic tests of structural modeling assumptions (A4)
  – May help accrual and enhance ethics
  – May adaptively initiate crossover,
e.g., as soon as the lower 95% confidence limit for VE exceeds 30%
• Pseudoscore method superior to estimated likelihood method (Huang, Gilbert, Wolfson, 2012, under revision); recommend this method in practice
  – Happy to provide the code for this method (for BIP, CPV, BIP+CPV)

Concluding Remarks
• Opportunity to improve assessment of immune CoPs by increasing research into developing BIPs
  – The better the BIP, the greater the accuracy and precision for estimating the vaccine efficacy curve

Some Avenues for Identifying Good BIPs
• Demographic factors
  – E.g., age, gender, BMI, immune status
• Host immune genetics
  – E.g., HLA type and MHC binding prediction machine learning methods for predicting T cell responses
• Add beneficial licensed vaccines to efficacy trials and use known correlates of protection as BIPs (Follmann’s [2006] original proposal)
  – The HVTN is exploring this strategy in a Phase 1 trial in preparation for efficacy trials
• Develop ‘pathogen exposure history’ chip
• In efficacy trials where participants have prior exposure to the pathogen, measure the potential CoP at baseline and use it as the baseline predictor
  – E.g., Varicella Zoster vaccine trials: baseline gpELISA titers strongly predict post-immunization titers
  – Miao, Li, Gilbert, Chan (2012, under review) and Gabriel and Gilbert (2012, in preparation) estimate the Zoster vaccine efficacy curve using this excellent BIP (will constitute an excellent example when published)
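
Supplementary sketch: Checking the Categorical Vaccine Efficacy Curve Numerically
The categorical vaccine efficacy curve used in the simulation plan, VE(j, 1) = 1 – exp{CEPrisk(j, 1)} with CEPrisk(j, 1) the log ratio of average risks, can be checked with a few lines of arithmetic. The Python sketch below is only an illustration under stated assumptions: the function names ve_from_cep and cep_from_risks are hypothetical, and this is not the R code of Gilbert and Hudgens. It maps the CEPrisk values of scenarios (i) and (ii) back to their VE(j, 1) values.

```python
import math


def cep_from_risks(risk_vaccine, risk_placebo):
    """Compute CEPrisk(j, 1) = log(avg-risk(1)(j, 1) / avg-risk(0)(j, 1)).

    risk_vaccine[j][k] and risk_placebo[j][k] hold risk(z)(j, 1, k) for
    marker level j and BIP level k (hypothetical input layout); the
    average is taken over the K levels of the BIP, as on the slide.
    """
    cep = []
    for row_v, row_p in zip(risk_vaccine, risk_placebo):
        avg_v = sum(row_v) / len(row_v)
        avg_p = sum(row_p) / len(row_p)
        cep.append(math.log(avg_v / avg_p))
    return cep


def ve_from_cep(cep_risk):
    """Map CEPrisk(j, 1) values to VE(j, 1) = 1 - exp{CEPrisk(j, 1)}."""
    return [1.0 - math.exp(c) for c in cep_risk]


# Scenario (i): no surrogate value, CEPrisk constant in j.
scenario_i = ve_from_cep([-0.69] * 4)
# Scenario (ii): high surrogate value, CEPrisk varies in j.
scenario_ii = ve_from_cep([-0.22, -0.51, -0.92, -1.61])

print([round(v, 2) for v in scenario_i])
print([round(v, 1) for v in scenario_ii])
```

After rounding, the two printed lists reproduce the VE(j, 1) values stated for scenarios (i) and (ii) in the simulation plan, which is a quick consistency check on the CEPrisk settings.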