Targeted Maximum Likelihood Super Learning: Applications to Assessing Effects in RCTs, Observational Studies, and Genomics
Mark van der Laan (works.bepress.com/mark_van_der_laan)
Division of Biostatistics, University of California, Berkeley
Workshop Brad Efron, December 2009

Outline
• Super Learning and Targeted Maximum Likelihood Learning
• Causal effects in observational studies
• Causal effects in RCTs
• Variable importance analysis in genomics
• Multiple testing
• Case-control data

Motivation
• Avoid reliance on human art and parametric models
• Adapt the model fit to the data
• Target the fit to the parameter of interest
• Provide valid statistical inference

TMLE/SL: Targeted Maximum Likelihood coupled with Super Learner methodology

TMLE/SL Toolbox
Targeted effects:
• Effects of static or dynamic multiple-time-point treatments (e.g., on survival time)
• Direct and indirect effects
• Variable importance analysis in genomics
Types of data:
• Point treatment
• Longitudinal/repeated measures
• Censoring/missingness/time-dependent confounding
• Case-control
• Randomized clinical trials and observational data

Two-Stage Methodology: SL/TMLE
1. Super Learning
• Works on a library of model fits
• Builds a data-adaptive composite model by assigning weights
• Weights are optimized by loss-function-specific cross-validation to guarantee the best overall fit
2. Targeted Maximum Likelihood Estimation
• Zooms in on one aspect of the model fit: the target
• Removes bias for the target

Loss-Based Super Learning in Semiparametric Models
• Allows one to combine many data-adaptive (e.g., maximum likelihood) estimators into one improved estimator
• Grounded by oracle results for loss-function-based cross-validation (van der Laan & Dudoit, 2003); the loss function needs to be bounded
• Performs asymptotically as well as the best (oracle) weighted combination, or achieves the parametric rate of convergence
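The weighting scheme in stage 1 can be sketched in a few lines. This is an illustrative toy version only, not the authors' SuperLearner software: the two hand-rolled learners, the grid search over convex weights, and the squared-error loss are all assumptions made for the sake of a self-contained example.

```python
# Toy super learner sketch: combine a library of two regression fits with
# convex weights chosen by V-fold cross-validation under squared-error loss.
import numpy as np

rng = np.random.default_rng(0)
n = 200
W = rng.normal(size=n)
Y = np.sin(W) + 0.3 * rng.normal(size=n)

def fit_linear(Wtr, Ytr):
    b = np.polyfit(Wtr, Ytr, 1)          # degree-1 polynomial fit
    return lambda w: np.polyval(b, w)

def fit_cubic(Wtr, Ytr):
    b = np.polyfit(Wtr, Ytr, 3)          # degree-3 polynomial fit
    return lambda w: np.polyval(b, w)

library = [fit_linear, fit_cubic]
V = 5
folds = np.arange(n) % V
cv_pred = np.zeros((n, len(library)))    # cross-validated predictions per learner
for v in range(V):
    tr, te = folds != v, folds == v
    for k, fit in enumerate(library):
        cv_pred[te, k] = fit(W[tr], Y[tr])(W[te])

# Choose convex weights on a grid (a real implementation optimizes exactly,
# e.g. by non-negative least squares on the simplex).
alphas = np.linspace(0, 1, 101)
cv_mse = [np.mean((Y - a * cv_pred[:, 0] - (1 - a) * cv_pred[:, 1]) ** 2)
          for a in alphas]
a_star = alphas[int(np.argmin(cv_mse))]
weights = np.array([a_star, 1 - a_star])  # sums to 1 by construction
```

Because the grid includes the endpoints 0 and 1, the selected combination can do no worse, in cross-validated risk, than either library member alone, which is the finite-sample shadow of the oracle property cited above.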
Super Learner Flow Chart in Prediction; Targeted Maximum Likelihood Estimation Flow Chart
[Flow chart] Inputs: the model (the set of possible probability distributions of the data), the user's dataset of observations O(1), O(2), …, O(n) drawn from the true probability distribution P_TRUE, and the target feature map Ψ(·) with true value Ψ(P_TRUE). An initial P-estimator P̂ of the probability distribution of the data yields the initial feature estimator Ψ(P̂); the targeted P-estimator P̂* yields the targeted feature estimator Ψ(P̂*). Better estimates are closer to Ψ(P_TRUE).

Targeted Maximum Likelihood
• MLE/SL aims to do a good job of estimating the whole density
• The targeted MLE aims to do a good job on the parameter of interest
• General decrease in bias for the parameter of interest
• Fewer false positives
• Honest p-values, inference, and multiple testing

(Iterative) Targeted MLE
1. Identify the optimal strategy for "stretching" the initial P̂: a small "stretch" should yield the maximum change in the target
2. Given this strategy, identify the optimal amount of stretch by MLE
3. Apply the optimal stretch to P̂ using the optimal stretching function, yielding the first-step targeted maximum likelihood estimator
4. Repeat until the incremental "stretch" is zero
5.
In some important cases, convergence occurs in one step. The final probability distribution solves the efficient influence curve equation, and the (iterative) T-MLE is double robust and locally efficient.

Example: Targeted MLE of the Causal Effect of a Point Treatment on an Outcome
Impact of Treatment on Disease

Likelihood of a Point Treatment with a Single-Endpoint Outcome
• Draw baseline characteristics
• Draw treatment
• Draw the missingness indicator
• If not missing, draw the outcome
• Counterfactual outcome distributions are defined by intervening on treatment and enforcing no missingness
• Causal effects are defined as a user-supplied function of these counterfactuals

TMLE for the Average Causal Effect
• Observe predictors W, treatment A, missingness indicator Δ, and outcome Y
• Target is the additive causal effect E[Y(1) − Y(0)]
• Regress Y on treatment A and W among observations with Δ = 1 (e.g., by super learning), and add the clever covariate
h(A,W) = [A/g(1|W) − (1−A)/g(0|W)] / P(Δ=1|A,W),
where g(a|W) is the treatment mechanism
• Then average the updated regression over W for fixed treatment a: E_n Y_a
• Evaluate the average effect: E_n Y_1 − E_n Y_0

TMLE is Collaborative Double Robust
• Suppose the initial fit minus the true outcome regression is a function of W only through S
• Suppose the treatment mechanism correctly adjusts for a set of variables that includes S
• Then the targeted MLE is consistent
• Thus the treatment mechanism only needs to adjust for covariates whose effect has not yet been captured by the initial fit

TMLE/SL: More Accurate Information from Less Data
Simulated safety analysis of Epogen (Amgen)

Example: Targeted MLE in an RCT
Impact of Treatment on Disease

The Gain in Relative Efficiency in an RCT Is a Function of the Gain in R² Relative to the Unadjusted Estimator
• We observe (W, A, Y) on each unit
• A is randomized, with P(A=1) = 0.5
• Suppose the target parameter is the additive causal effect E[Y(1) − Y(0)]
• The relative efficiency of the targeted MLE to the unadjusted estimator equals 1 minus the R² of the regression 0.5·Q(1,W) + 0.5·Q(0,W), where Q(A,W) is the regression of Y on (A,W) obtained with the targeted MLE
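The clever-covariate update described above can be sketched for the simplest case: binary Y, no missingness (Δ = 1 for everyone), and plain Newton-Raphson logistic fits standing in for super learning. The data-generating process and the fitter below are illustrative assumptions, not the authors' implementation.

```python
# Sketch of TMLE for the additive effect E[Y(1)] - E[Y(0)] with binary Y.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
W = rng.normal(size=n)
g1_true = 1 / (1 + np.exp(-0.4 * W))            # true P(A=1|W)
A = rng.binomial(1, g1_true)
Y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + A + 0.8 * W))))

def logistic_fit(X, y, offset=None, iters=50):
    """Newton-Raphson logistic regression (stand-in for super learning)."""
    off = np.zeros(len(y)) if offset is None else offset
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(X @ b + off)))
        H = (X * (p * (1 - p))[:, None]).T @ X
        b = b + np.linalg.solve(H + 1e-8 * np.eye(len(b)), X.T @ (y - p))
    return b

# Step 1: initial outcome regression Q(A, W)
bq = logistic_fit(np.column_stack([np.ones(n), A, W]), Y)
def Q(a):
    return 1 / (1 + np.exp(-(bq[0] + bq[1] * a + bq[2] * W)))

# Step 2: treatment mechanism g(1|W) and the clever covariate
Xg = np.column_stack([np.ones(n), W])
g1hat = 1 / (1 + np.exp(-(Xg @ logistic_fit(Xg, A))))
H = A / g1hat - (1 - A) / (1 - g1hat)

# Step 3: fluctuate the initial fit; epsilon is fit by MLE with logit(Q) offset
QA = np.where(A == 1, Q(1), Q(0))
eps = logistic_fit(H[:, None], Y, offset=np.log(QA / (1 - QA)))[0]

# Step 4: plug in the updated fit and average over W
expit = lambda x: 1 / (1 + np.exp(-x))
logit = lambda p: np.log(p / (1 - p))
ate_tmle = np.mean(expit(logit(Q(1)) + eps / g1hat)
                   - expit(logit(Q(0)) - eps / (1 - g1hat)))
```

By construction, the MLE over epsilon forces the updated fit to solve the efficient-influence-curve equation in the clever-covariate direction, which is what removes the bias for the target.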
TMLE in Actual Phase IV RCT
• Study: RCT aimed at evaluating safety, based on mortality due to drug-to-drug interaction among patients with severe disease
• Data obtained by random sampling from the original real RCT FDA dataset
• Goal: estimate the risk difference (RD) in survival at 28 days (binary outcome) between treated and placebo groups

TMLE in Phase IV RCT

              Estimate   p-value   (RE)
  Unadjusted  0.034      0.085     (1.000)
  TMLE        0.043      0.009     (1.202)

• TMLE adjusts for a small amount of empirical confounding (imbalance in the AGE covariate)
• TMLE exploits the covariate information to gain efficiency, and thus power, over the unadjusted estimator
• The TMLE result is significant at the 0.05 level

TMLE in RCT: Summary
• The TMLE approach handles censoring and improves efficiency over standard approaches by measuring strong predictors of the outcome
• Implications:
– Unbiased estimates under informative censoring
– Improved power for clinical trials
– Smaller sample sizes needed
– Possibility of earlier stopping rules
– Less need for homogeneity in the sample: more representative sampling, and expanded opportunities for subgroup analyses

Targeted MLE Analysis of Genomic Data
Biomarker discovery; impact of mutations on disease or on response to treatment

The Need for Experimentation
• Estimation of variable importance/causal effects requires an assumption not needed for prediction
• "Experimental Treatment Assignment" (ETA): there must be some variation in the treatment variable A within every stratum of the confounders W
– W must not perfectly predict/determine A
– g(a|W) > 0 for all (a, W)

Biomarker Discovery: HIV Resistance Mutations
• Goal: rank a set of genetic mutations based on their importance for determining an outcome
– Mutations (A) in the HIV protease enzyme, measured by sequencing
– Outcome (Y) = change in viral load 12 weeks after starting a new regimen containing saquinavir
• How important is each mutation for viral resistance to this specific protease inhibitor drug?
– Inform genotypic scoring systems

Stanford Drug Resistance Database
• All Treatment Change Episodes (TCEs) in the Stanford Drug Resistance Database
– Patients drawn from 16 clinics in Northern California
[Timeline] Baseline viral load measured within 24 weeks before the change in regimen (the TCE: change of >= 1 drug); final viral load measured 12 weeks after the change.
• 333 patients on a saquinavir regimen

Parameter of Interest
• Need to control for a range of other covariates W, including past treatment history, baseline clinical characteristics, non-protease mutations, and other drugs in the regimen
• Parameter of interest: variable importance
ψ_j = E[ E(Y | A_j=1, W) − E(Y | A_j=0, W) ]
– for each protease mutation (indexed by j)

Parameter of Interest
• Assuming no unmeasured confounders (W sufficient to control for confounding), the causal effect equals the W-adjusted variable importance:
E(Y_1) − E(Y_0) = E[ E(Y | A=1, W) − E(Y | A=0, W) ] = ψ
– with the same advantages for the T-MLE

Targeted Maximum Likelihood Estimation of the Adjusted Effect of HIV Mutations on Resistance to Lopinavir

  mutation    score   estimate   95% CI
  p50V          20      1.703    ( 0.760,  2.645)*
  p82AFST       20      0.389    ( 0.084,  0.695)*
  p54VA         11      0.505    ( 0.241,  0.770)*
  p54LMST       11      0.369    ( 0.002,  0.735)*
  p84AV         11      0.099    (-0.130,  0.329)
  p46ILV        11      0.046    (-0.222,  0.315)
  p48VM         10      0.306    (-0.162,  0.774)
  p47V          10      0.805    ( 0.282,  1.328)*
  p32I          10      0.544    ( 0.312,  0.777)*
  p90M          10      0.209    (-0.058,  0.476)
  p82MLC        10      1.610    ( 1.330,  1.890)*
  p84C          10      0.602    ( 0.471,  0.734)*
  p33F           5      0.300    (-0.070,  0.669)
  p53LY          3      0.214    (-0.266,  0.695)
  p73CSTA        2      0.635    ( 0.278,  0.992)*
  p24IF          2      0.229    (-0.215,  0.674)
  p10FIRVY       2     -0.266    (-0.522, -0.011)*
  p71TVI         2      0.019    (-0.243,  0.281)
  p30N           0     -0.440    (-0.853, -0.028)*
  p88S           0     -0.474    (-0.840, -0.108)*
  p88DTG         0     -0.426    (-0.842, -0.010)*
  p36ILVTA       0      0.272    (-0.001,  0.544)
  p20IMRTVL      0      0.178    (-0.111,  0.467)
  p23I           0      0.822    (-0.050,  1.694)
  p16E           0      0.239    (-0.156,  0.633)
  p63P           0     -0.131    (-0.392,  0.131)

Stanford mutation score: http://hivdb.stanford.edu, accessed September, 1997.

Multiple Testing: Combining Targeted MLE with Type I Error Control

Hypothesis Testing Ingredients
• Data (X_1, …, X_n)
• Hypotheses
• Test statistics
• Type I error
• Null distribution: marginal (p-values) or joint distribution of the test statistics
• Rejection region
• Adjusted p-values

Type I Error Rates
• FWER: control the probability of at least one Type I error (V_n): P(V_n > 0) ≤ α
• gFWER: control the probability of more than k Type I errors (V_n): P(V_n > k) ≤ α
• TPPFP: control the proportion of Type I errors (V_n) among total rejections (R_n) at a user-defined level q: P(V_n/R_n > q) ≤ α
• FDR: control the expectation of the proportion of Type I errors among total rejections: E(V_n/R_n) ≤ α

Multivariate Normal Null Distribution
• Suppose the null hypotheses involve testing of target parameters, H_0: ψ(j) ≤ 0
• We estimate the target parameters with TMLE and use t-statistics for testing
• The T-MLE, as a vector, is asymptotically linear with known influence curve IC
• A valid joint null distribution for multiple testing is N(0, Σ = E[IC ICᵀ])
• This null distribution can be used as input to any multiple testing procedure (Dudoit and van der Laan, 2009, Springer)

GENERAL JOINT NULL DISTRIBUTION
Let Q_0j be a marginal null distribution such that, for every j in the set S_0 of true nulls,
Q_0j^{-1}( Q_nj(x) ) > x for all x,
where Q_nj is the j-th marginal distribution of the true distribution Q_n(P) of the test statistic vector T_n.

JOINT NULL DISTRIBUTION
We propose as null distribution the distribution Q_0n of
T_n*(j) = Q_0j^{-1}( Q_nj( T_n(j) ) ), j = 1, …, J.
This joint null distribution Q_0n(P) indeed satisfies the desired multivariate asymptotic domination condition of Dudoit, van der Laan, and Pollard (2004).

BOOTSTRAP-BASED JOINT NULL DISTRIBUTION
We estimate this null distribution Q_0n(P) with the bootstrap analogue
T_n#(j) = Q_0j^{-1}( Q_nj#( T_n#(j) ) ),
where # denotes the analogue based on a bootstrap sample O_1#, …, O_n# from an approximation P_n of the true distribution P.

Case-Control Weighted Targeted MLE
• Case-control weighting in the targeted MLE successfully maps an estimation method designed for prospective sampling into a method for case-control sampling.
• This technique relies on knowledge of the true prevalence probability P(Y=1) = q_0 to eliminate the bias of the case-control sampling design.
• The procedure is double robust and locally efficient: it produces efficient estimators whenever its prospective-sample counterpart is efficient.

Comparison to Existing Methodology
Case-control weighted targeted MLE differs from other approaches in that it can estimate any type of parameter, incorporates q_0, and is double robust and locally efficient.

Case-Control Weighted Targeted MLE: Simulation Results
• We showed striking improvements in efficiency and bias of our case-control weighted method over the IPTW estimator (Mansson 2007; Robins 1999), which does not utilize q_0. (The table results are for a sample of 500 cases and 1000 controls taken from a population of 120,000 with q_0 = 0.035.)
• Our complete simulation results bolster our theoretical arguments that gains in efficiency and reductions in bias can be obtained by knowing q_0 and using a targeted estimator.

Closing Remarks
• True knowledge is embodied by semiparametric or nonparametric models
• Semiparametric models require fully automated, state-of-the-art machine learning (super learning)
• Targeted bias removal is essential and is achieved by the targeted MLE
• Statistical inference is now sensible
• The machine learning algorithms are (super) efficient for the target parameters

Closing Remarks
• (Randomized controlled) clinical trials and observational studies can be analyzed with TMLE
• TMLE outperforms current standards in the analysis of clinical trials and observational studies, including double robust methods
• It is the only targeted method that is collaborative double robust, efficient, and naturally incorporates machine learning

Acknowledgements
• UC Berkeley: Oliver Bembom, Susan Gruber, Kelly Moore, Maya Petersen, Dan Rubin, Cathy Tuglus, Sherri Rose, Michael Rosenblum, Eric Polley; P.I. Ira Tager (Epidemiology)
• Stanford University: Robert Shafer
• Kaiser: Dr.
Jeffrey Fessel
• FDA: Thamban Valappil, Greg Soon
• Harvard: David Bangsberg, Victor DeGruttola

References
• Oliver Bembom, Maya L. Petersen, Soo-Yon Rhee, W. Jeffrey Fessel, Sandra E. Sinisi, Robert W. Shafer, and Mark J. van der Laan, "Biomarker Discovery Using Targeted Maximum Likelihood Estimation: Application to the Treatment of Antiretroviral Resistant HIV Infection" (August 2007). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 221. http://www.bepress.com/ucbbiostat/paper221
• Mark J. van der Laan and Susan Gruber, "Collaborative Double Robust Targeted Penalized Maximum Likelihood Estimation" (April 2009). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 246. http://www.bepress.com/ucbbiostat/paper246
• Mark J. van der Laan, Eric C. Polley, and Alan E. Hubbard, "Super Learner" (July 2007). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 222. http://www.bepress.com/ucbbiostat/paper222
• Mark J. van der Laan and Daniel Rubin, "Targeted Maximum Likelihood Learning" (October 2006). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 213. http://www.bepress.com/ucbbiostat/paper213
• Yue Wang, Maya L. Petersen, David Bangsberg, and Mark J. van der Laan, "Diagnosing Bias in the Inverse Probability of Treatment Weighted Estimator Resulting from Violation of Experimental Treatment Assignment" (September 2006). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 211. http://www.bepress.com/ucbbiostat/paper211
• Oliver Bembom and Mark van der Laan (2008), "A practical illustration of the importance of realistic individualized treatment rules in causal inference," Electronic Journal of Statistics.
• Mark J. van der Laan, "Statistical Inference for Variable Importance" (August 2005). U.C. Berkeley Division of Biostatistics Working Paper Series, Working Paper 188. http://www.bepress.com/ucbbiostat/paper188

Collaborative T-MLE: Building the Propensity Score Based on Outcome Data
• Obtain the initial outcome regression with super learning
• Construct a rich set of one-dimensional dimension reductions of W, to be used as main terms below
• Select main terms in the propensity score using forward selection, based on the empirical fit (e.g., log-likelihood) of the resulting T-MLE
• If no main term increases the empirical fit of the TMLE, carry out a T-MLE update of the initial outcome regression
• Proceed to generate a sequence of T-MLEs using increasingly nonparametric treatment mechanisms
• Select the desired T-MLE by cross-validation

The Likelihood for Right-Censored Survival Data
• It starts with the marginal probability distribution of the baseline covariates.
• Then follows the treatment mechanism.
• Then follows a product over time points t: at each time point t, one writes down the likelihood of censoring at t and of death at t, stopping at the first event.
• Counterfactual survival distributions are obtained by intervening on treatment and censoring.
• This defines the causal effects of interest as parameters of the likelihood.

TMLE with Survival Outcome
• Suppose one observes baseline covariates and treatment, and follows each subject until end of follow-up or death
• One wishes to estimate the causal effect of treatment A on survival time T
• The targeted MLE uses covariate information to adjust for confounding and informative dropout, and to gain efficiency

TMLE with Survival Outcome
• Target ψ_1(t_0) = Pr(T_1 > t_0) and ψ_0(t_0) = Pr(T_0 > t_0), and thereby treatment effects such as (1) the difference Pr(T_1 > t_0) − Pr(T_0 > t_0), or (2) the log relative hazard
• Obtain an initial conditional hazard fit (e.g., super learner for discrete survival) and add two time-dependent clever covariates
• Iterate until convergence, then use the updated conditional hazard from the final step and average the corresponding conditional survival over W for the fixed treatments 0 and 1

TMLE Analogue to the Log-Rank Test
• The parameter corresponds to the Cox proportional hazards parameter, and thus to the log-rank parameter
• The targeted MLE targeting this parameter is double robust

TMLE in RCT with Survival Outcome: Difference at a Fixed End Point

Independent censoring:
          % Bias   Power   95% Coverage   Relative Efficiency
  KM      <1%      0.79    0.95           1.00
  TMLE    <1%      0.91    0.95           1.44
TMLE: gain in power over KM.

Informative censoring:
          % Bias   Power   95% Coverage   Relative Efficiency
  KM      13%      0.88    0.92           1.00
  TMLE    <1%      0.92    0.95           1.50
TMLE: unbiased.

TMLE in RCT with Survival Outcome: Log-Rank Analogue

Independent censoring:
                        % Bias   Power   95% Coverage   Relative Efficiency
  Log-rank              <2%      0.13    0.95           1.00
  TMLE (correct λ)      <1%      0.22    0.95           1.48
  TMLE (mis-spec. λ)    <1%      0.19    0.95           1.24
TMLE: gain in power over log-rank.

Informative censoring:
                                    % Bias   Power   95% Coverage   Relative Efficiency
  Log-rank                          32%      0.20*   0.93           1.00
  TMLE (correct λ, correct G)       <1%      0.18    0.95           1.44
  TMLE (mis-spec. λ, correct G)     <1%      0.15    0.95           1.24
TMLE: unbiased.

Tshepo Study
The Tshepo Study is an open-label, randomized, 3x2x2 factorial design study conducted at Princess Marina Hospital in Gaborone, Botswana, to evaluate the efficacy, tolerability, and development of drug resistance of six different first-line cART regimens.
Analysis team: Mark van der Laan, Kelly Moore, Ori Stitelman, Victor De Gruttola.
Tshepo Study: C. William Wester, Ann Muir Thomas, Hermann Bussmann, Sikhulile Moyo, Joseph Makhema, Tendani Gaolathe, Vladimir Novitsky, Max Essex, Victor deGruttola, and Richard G. Marlink.

Tshepo Study
• Causal effect of treatment (NVP vs. EFV) on time to death and on time to virologic failure/death/treatment discontinuation, among other endpoints
• Effect modification by baseline CD4 and gender
• Causal effect modification by baseline CD4

Simulation Study
• Generated 500 data sets of 500 observations each per scenario
• Varied how well the hazard was specified (well/badly)
• Varied the level of ETA violation (low/medium/high)
• Observed the performance of: IPCW, G-computation, double robust augmented IPCW, TMLE, C-TMLE, and TMLE excluding the variable that causes the ETA violation in the treatment mechanism
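The varying levels of ETA violation in this simulation matter because inverse-probability-weighted estimators degrade sharply as g(1|W) approaches 0. The sketch below is a toy illustration of that phenomenon, not the authors' simulation design: a single normal W and a logistic treatment mechanism whose coefficient controls the severity of the near-positivity violation are assumptions made for the example.

```python
# Toy illustration: IPTW variance explodes as the ETA (positivity)
# assumption g(a|W) > 0 is pushed toward violation.
import numpy as np

rng = np.random.default_rng(3)

def iptw_ate(coef, n=2000):
    """IPTW estimate of the additive effect; larger coef = more extreme g."""
    W = rng.normal(size=n)
    g1 = 1 / (1 + np.exp(-coef * W))     # P(A=1|W); near 0/1 for large coef
    A = rng.binomial(1, g1)
    Y = A + W + rng.normal(size=n)       # true additive effect = 1
    return np.mean((A / g1 - (1 - A) / (1 - g1)) * Y)

# Mild vs. severe treatment-assignment mechanism over repeated trials
low = np.array([iptw_ate(0.5) for _ in range(200)])
high = np.array([iptw_ate(4.0) for _ in range(200)])
```

Across the repeated trials, the estimator stays roughly unbiased in both regimes, but its sampling variance is far larger under the near-violation, mirroring the inflated IPCW MSE in the high-ETA columns of the table that follows.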
MSE (and Relative Efficiency) of Estimating P(T_{A=1} > t_0)

                    Low ETA                          Medium ETA                       High ETA
Method              Well spec. λ    Badly spec. λ    Well spec. λ    Badly spec. λ    Well spec. λ    Badly spec. λ
IPCW                0.00086 (1.1)   0.00086 (1.1)    0.00117 (1.2)   0.00117 (1.2)    0.00698 (1.9)   0.00698 (1.9)
G-comp              0.00071 (0.9)   0.00085 (1.1)    0.00086 (1.1)   0.00088 (1.1)    0.00073 (0.7)   0.00106 (1.1)
DR-IPCW             0.00161 (1.6)   0.00116 (1.2)    0.00071 (0.2)   0.00455 (1.3)    0.00231 (0.6)   0.00819 (2.3)
TMLE                0.00085 (1.1)   0.00088 (1.1)    0.00105 (1.1)   0.00111 (1.1)    0.00347 (1.0)   0.00375 (1.0)
TMLE w/o ETA var.   0.00084 (1.0)   0.00088 (1.1)    0.00083 (0.8)   0.00087 (1.2)    0.00079 (0.2)   0.00085 (0.2)
c-TMLE              0.00085 (1.1)   0.00104 (1.3)    0.00085 (0.9)   0.00122 (1.2)    0.00080 (0.2)   0.00101 (0.3)

Percent of Time the Influence-Curve-Based 95% CI Includes the Truth

                    Low ETA                 Medium ETA              High ETA
Method              Well λ     Badly λ      Well λ     Badly λ      Well λ     Badly λ
IPCW                98.4%      98.4%        97.2%      97.2%        89.8%      89.8%
Double Robust       94.2%      95.4%        95.0%      96.4%        94.8%      95.8%
TMLE                94.0%      95.2%        95.4%      95.6%        84.8%      87.0%
TMLE w/o ETA var.   94.2%      94.8%        94.4%      95.6%        94.6%      95.0%
c-TMLE              94.6%      93.6%        96.0%      95.0%        94.8%      93.2%

Properties of the c-TMLE Algorithm

                    Low ETA                 Medium ETA              High ETA
                    Well λ     Badly λ      Well λ     Badly λ      Well λ     Badly λ
Mean moves          2.7        9.1          2.1        7.3          2.9        7.1
% no moves          68.8%      0.2%         74.2%      6.4%         62.6%      1.6%
% ETA variable      27.8%      60.4%        1.2%       6.0%         0.0%       0.4%

Case-Control Weighted Targeted MLE: Data Analysis
The full-cohort analysis sample size was 2066. The case-control analysis was repeated in 100 simulations with 269 cases and 538 randomly selected controls; "% Rej" indicates the average percentage of rejected tests among the 100 simulations.
• Nested case-control studies are quite common, and can be particularly beneficial in biomarker studies.
• We simulated a nested case-control study within a cohort study examining the impact of higher physical activity on death in an aging population.
• With known q_0, we obtained point estimates in the case-control analysis similar to those from the full cohort.

Description of Simulation
– 100 subjects, each with one random X (say, a SNP) uniform over {0, 1, 2}
– For each subject, 100 binary Y's, (Y_1, …, Y_100), generated from a model such that:
• the first 95 are independent of X
• the last 5 are associated with X
• all Y's are correlated, via a random-effects model
– 100 hypotheses of interest, where the null is the independence of X and Y_i
– Test statistic is Pearson's χ² test, where the null distribution is χ² with 2 df

Description of Simulation, cont.
– Simulated data 1000 times
– Performed the following MTPs to control the FWER at 5%:
• Bonferroni
• Null-centered, re-scaled bootstrap (NCRB), based on 5000 bootstrap samples
• Quantile-Function-Based Null Distribution (QFBND)
– Results:
• NCRB is anti-conservative (inaccurate)
• Bonferroni is very conservative (actual FWER is 0.005)
• QFBND is both accurate (FWER 0.04) and powerful (10 times the power of Bonferroni)

Empirical Bayes/Resampling TPPFP Method
• We also devised another resampling-based multiple testing procedure, controlling the proportion of false positives among total rejections.
• This procedure involves:
– Randomly sampling a guessed set of true null hypotheses, H_0(j) ~ Bernoulli( Pr(H_0(j)=1 | T_j) = p_0 f_0(T_j)/f(T_j) ), based on the empirical Bayes model: T_j | H_0=1 ~ f_0; T_j ~ f; p_0 = P(H_0(j)=1) (taking p_0 = 1 is conservative)
– Our joint bootstrap null distribution of test statistics

Empirical Bayes TPPFP Method
1. Grab a column T̃_n from the null distribution, of length M.
2. Draw a length-M binary vector corresponding to S_0n.
3. For a vector of c values, calculate the proportion of false positives r_n(c).
4. Repeat steps 1 and 2 10,000 times and average over the iterations.
5. Choose the c value where P(r_n(c) > q) ≤ α.

Examples/Simulations

Bacterial Microarray Example
• Airborne bacterial levels in specific cities over a span of several weeks are being collected and compared.
• A specific Affymetrix array was constructed to quantify the actual bacterial levels in these air samples.
• We will compare the average (over 17 weeks) strain-specific intensity in San Antonio versus Austin, Texas.

420 airborne bacterial levels, 17 time points, San Antonio vs. Austin

  Procedure                   Number rejected (α = 0.05)   Number rejected (α = 0.10)
  Bonferroni FWE              5                            6
  Augmentation TPPFP          14                           13
  E. Bayes/Bootstrap TPPFP    21                           11

CGH Arrays and Tumors in Mice
• 11 comparative genomic hybridization (CGH) arrays from cancer tumors of 11 mice.
• DNA from test cells is directly compared to DNA from normal cells using bacterial artificial chromosomes (BACs), which are small DNA fragments placed on an array.
• With CGH:
– differentially labeled test [tumor] and reference [healthy] DNA are co-hybridized to the array;
– fluorescence ratios on each spot of the array are calculated;
– the location of each BAC in the genome is known, so the ratios can be compiled into a genome-wide copy number profile.

[Figure] Plot of adjusted p-values for the three procedures vs. rank of BAC (ranked by magnitude of t-statistic).

Acknowledgements
• UC Berkeley: Susan Gruber, Ori Stitelman, Kelly Moore, Maya Petersen, Dan Rubin, Cathy Tuglus, Sherri Rose, Michael Rosenblum, Eric Polley; P.I. Ira Tager (Epidemiology)
• Stanford: Dr. Robert Shafer
• Kaiser: Dr. Jeffrey Fessel
• Harvard: Dr. David Bangsberg
• UCSF: Dr. Steve Deeks
• FDA: Thamban Valappil, Greg Soon

TargetDiscovery Demo

Targeted Maximum Likelihood in Semiparametric Regression
• Implementation just involves adding a covariate h(A,W) to the regression model:
h(A,W) = (d/db) m(A,W|b) − E[ (d/db) m(A,W|b) | W ]
– When m(A,W|b) is linear: h(A,W) = (A − E[A|W]) W
• Requires estimating E(A|W), e.g.,
the expected value of A given the confounders W
• Robust: the estimate is consistent if either E(A|W) or E(Y|A,W) is estimated consistently

Model-Based Variable Importance
• When the variable of interest (A) is continuous:
– Given observed data O = (A, W, Y) ~ P_0, with W* = {possible biomarkers, demographics, etc.}, A = W*_j (the current biomarker of interest), and W = W*_{−j}
– Measure of importance: given m(A,W|b) = E_P(Y | A=a, W) − E_P(Y | A=0, W), define ψ(a) = E_W[m(a,W|b)], estimated by (1/n) Σ_{i=1}^n m(a, W_i | b)
– If m is linear: ψ(a) = a·b·E[W]; in the simplest (marginal) case: ψ(a) = a·b_0

Evaluation of Biomarker Methods: Simulation
[Figure: minimal list length needed to obtain all 10 "true" variables, plotted against correlation ρ (0 to 0.9), for linear regression, variable importance with LARS, RF1, and RF2.]
• No appreciable difference between ranking by importance measure and by p-value; the plot is with respect to ranked importance measures
• List lengths for linear regression and randomForest increase with increasing correlation; variable importance with LARS stays near the minimum (10) through ρ = 0.6, with only small decreases in power
• The linear regression list length is 2x the variable importance list length at ρ = 0.4 and 4x at ρ = 0.6
• The randomForest (RF2) list length is consistently shorter than linear regression's, but is still 50% longer than the variable importance list length at ρ = 0.4, and twice as long at ρ = 0.6
• Variable importance coupled with LARS estimates the true causal effect and outperforms both linear regression and randomForest

Example: Biomarker Discovery (HIV Resistance Mutations), Accounting for ETA Violations

[Figure: unadjusted estimates vs. Δ-adjusted T-MLE estimates.]

Realistic Causal Effect of Physical Activity Level
• Given an elderly population (SONOMA), we wish to establish the effect of activity on 5-year mortality.
• A realistic individualized treatment rule indexed by activity level a is defined as, given an individual's covariates W, the realistic activity level closest to the assigned level a.
• Causal relative risk: E[Y(d(a))] / E[Y(d(0))]

[Figures: realistic rules indexed by 5 activity levels; sparse-data (ETA) bias for different causal effects; estimates of realistic/intention-to-treat/static causal relative risks.]

Evaluation of Biomarker Discovery Methods
> Univariate linear regression
• Importance measure: coefficient value with associated p-value
• Measures marginal association
> RandomForest (Breiman 2001)
• Importance measures (no p-values):
RF1: variable's influence on error rate
RF2: mean improvement in node splits due to the variable
> Variable importance with TMLE based on LARS
• Importance measure: causal effect E_P(Y | A=a, W) − E_P(Y | A=0, W)
• Formal inference, p-values provided
• LARS used to fit the initial E[Y|A,W] estimate, with W = {marginally significant covariates}
All p-values are FDR adjusted.

Simulation Study
> Test the methods' ability to find the "true" variables under increasing correlation
• Ranking by measure and by p-value
• Minimal list necessary to get all "true" variables?
> Variables
• Block-diagonal correlation structure: 10 independent sets of 10
• Multivariate normal distribution with constant ρ and variance 1
• ρ = {0, 0.1, 0.2, 0.3, …, 0.9}
> Outcome
• Main-effect linear model
• 10 "true" biomarkers, one variable from each set of 10
• Equal coefficients
• Noise term with mean 0 and sigma = 10 ("realistic noise")

THEOREM
Consider any generalized linear regression model from the normal, binomial, Poisson, Gamma, or inverse Gaussian family with canonical link function, in which the linear part contains the treatment variable as a main term and contains an intercept. Let r be continuously differentiable. Then the MLE of r(EY(0), EY(1)) is asymptotically consistent and locally efficient.

TargetImpact Advantage

Viral load below 50 copies/mL:
  Patients                                500      1000     3000
  Power: Crude                            14.6%    31.7%    70.1%
  Power: ANCOVA                           x        x        x
  Power: TargetImpact                     32.3%    60.4%    97.0%
  Median false positives: Crude           2.4      2.6      2.2
  Median false positives: ANCOVA          x        x        x
  Median false positives: TargetImpact    1.4      1.5      1.5
  TargetImpact advantage                  27.5     31.2     30.5

Change in CD4 counts:
  Patients                                500      1000     3000
  Power: Crude                            13.1%    23.9%    56.3%
  Power: ANCOVA                           97.5%    100%     100%
  Power: TargetImpact                     99.4%    100%     100%

Average number of patients needed in a GST:
• 2,593 with a crude analysis
• 1,980 with TargetImpact
• $15M, or 23%, savings (given the typical cost of $25K per patient)
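The GLM theorem stated above can be checked numerically. The data-generating process below is an assumption for illustration: a simulated RCT in which the working logistic model is deliberately misspecified in W, yet the plug-in estimators of EY(0) and EY(1) remain consistent because the linear part contains an intercept and the treatment as a main term.

```python
# Numerical sketch of the GLM plug-in theorem in a simulated RCT.
import numpy as np

rng = np.random.default_rng(4)
n = 40000
W = rng.normal(size=n)
A = rng.binomial(1, 0.5, size=n)                               # randomized
pY = 1 / (1 + np.exp(-(-0.3 + 0.8 * A + np.sin(2 * W))))       # nonlinear in W
Y = rng.binomial(1, pY)

# Working model (misspecified in W): logit P(Y=1) = b0 + b1*A + b2*W
X = np.column_stack([np.ones(n), A, W])
b = np.zeros(3)
for _ in range(50):                                            # Newton-Raphson MLE
    p = 1 / (1 + np.exp(-(X @ b)))
    b = b + np.linalg.solve((X * (p * (1 - p))[:, None]).T @ X, X.T @ (Y - p))

def plugin_mean(a):
    """Plug-in treatment-specific mean: average predictions with A set to a."""
    Xa = np.column_stack([np.ones(n), np.full(n, a), W])
    return np.mean(1 / (1 + np.exp(-(Xa @ b))))

ey0, ey1 = plugin_mean(0), plugin_mean(1)
risk_diff = ey1 - ey0

# Monte Carlo truth under the same data-generating process
truth1 = np.mean(1 / (1 + np.exp(-(-0.3 + 0.8 + np.sin(2 * W)))))
truth0 = np.mean(1 / (1 + np.exp(-(-0.3 + np.sin(2 * W)))))
```

Despite the working model getting the W-dependence wrong, the averaged predictions track the true treatment-specific means, which is exactly the consistency the theorem asserts for r(EY(0), EY(1)).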