Fitting Marginal Structural Models
Eleanor M Pullenayegum
Asst Professor, Dept of Clin. Epi & Biostatistics
pullena@mcmaster.ca

Outline
• Causality and observational data
• Inverse-Probability weighting and MSMs
• Fitting an MSM
• Goodness-of-fit
• Assumptions/Interpretation

Causality in Medical Research
• Often want to establish a causal association between a treatment/exposure and an event
• Difficult to do with observational data due to confounding
• Gold standard for causal inference is the randomized trial
  • Randomize half the patients to receive the treatment/exposure, and half to receive usual care
  • Deals with measured and unmeasured confounders

Randomized trials are not always possible
• Sometimes, they are unethical
  • cannot do a randomized trial on the effects of second-hand smoke on lung cancer
  • or a randomized trial of the effects of living near power stations
• Sometimes, they are not feasible
  • Study of a rare disease (funding is an issue)

Observational Studies
• Observe rather than experiment (or interfere!)
  • Recruit some people who are exposed to second-hand smoke and some who are not
  • Study communities living close to power lines vs. those who don't
• Confounding is a major concern
  • For 1st example, are workplace environment, home environment, age, gender, income similar between exposed and unexposed?
  • For 2nd example, are education, family income, air pollution similar between cases and controls?

Handling Confounding
• Match exposed and unexposed on key confounders
  • e.g. for every family living close to a power station, attempt to find a control family living in a similar neighbourhood with a similar income
• Adjust for confounders
  • e.g. for the smoking example, adjust for age, gender, level of education, income, type of work, family history of cancer etc.
• Cannot deal with unmeasured confounders

Causal Pathways
• There are some things we cannot adjust for
• When studying the effect of a lipid-lowering drug on heart disease, we can't adjust for LDL-cholesterol level

Causal Pathways
[Diagram: Drug → LDL-cholesterol → Heart Disease]
• LDL-cholesterol mediates the effect of the drug
• Cannot adjust for variables that are on the causal pathway between exposure and outcome

Motivating Example
• Juvenile Dermatomyositis (JDM) is a rare but serious skin/muscle disease in children
• Standard treatment is with steroids (Prednisone), however these have unpleasant side-effects
• Intravenous immunoglobulin (IVIg) is a possible alternative treatment
• DAS measures disease activity

JDM Dataset
• 81 kids, 7 on IVIg at baseline, 23 on IVIg later
• Outcome is time to quiescence
• Quiescence happens when DAS = 0
• IVIg tends to be given when the child is doing particularly badly (high DAS)
• DAS is a confounder

Causal Pathway for JDM study
[Diagram: nodes DAS_t, DAS_{t+1}, …; IVIg_t, IVIg_{t+1}, …; Time-to-Quiescence]
• DAS confounds IVIg and outcome
• DAS is on the causal pathway

A Thought Experiment
• Suppose that at each time t, we could create an identical copy of each child i
• Then if the real child received IVIg, we would give the copy control, and vice versa
• We could then compare the child to its copy
• Solves confounding by matching: the child is matched with the copy
• If treatment varies on a monthly basis and we follow for 5 years (60 months), we would have 2^60 − 1 copies

Counterfactuals
• Clearly, this is impossible. But we can use the idea
• Define the counterfactuals for child i to be the outcomes for each of the 2^60 − 1 imaginary copies
• Idea: treat the counterfactuals as missing data

Inverse-Probability Weighting
• Inverse-Probability Weighting (IPW) is a way of re-weighting the dataset to account for selective observation
• E.g. if we have missing data, then we weight the observed data by the inverse of the probability of being observed
• Why does this work?
• Suppose we have a response $Y_{ij}$, treatment indicator $x_{ij}$, and $R_{ij} = 1$ if $Y_{ij}$ is observed, 0 otherwise

Inverse-Probability Weighting
• Suppose we want to fit the marginal model
\[ E(Y_{ij} \mid x_i) = x_{ij}\beta \]
• Usually, we solve the GEE equation
\[ \sum_{i=1}^{n} x_i' V_i^{-1} (Y_i - x_i\beta) = 0 \]
• If we use just the observed data, we solve
\[ \sum_{i=1}^{n} x_i' V_i^{-1} \Delta_i (Y_i - x_i\beta) = 0; \qquad \Delta_{ijj} = R_{ij} \]
  where $\Delta_i$ is diagonal with $j$th diagonal entry $\Delta_{ijj}$
• LHS does not have mean 0

Inverse-Probability Weighting
• If we replace $\Delta_{ijj} = R_{ij}$ by $\Delta_{ijj} = R_{ij}/p_{ij}$, with $p_{ij}$ the conditional probability of observing $Y_{ij}$, then
\[
\begin{aligned}
E\big(\Delta_{ijj}(Y_{ij} - x_{ij}\beta) \mid x\big)
&= E\Big( E\big(R_{ij}/p_{ij} \mid x, Y_{ij}, \ldots, Y_{i1}\big)\,(Y_{ij} - x_{ij}\beta) \,\Big|\, x \Big) \\
&= E\Big( \tfrac{1}{p_{ij}}\, E\big(R_{ij} \mid x, Y_{ij}, \ldots, Y_{i1}\big)\,(Y_{ij} - x_{ij}\beta) \,\Big|\, x \Big) \\
&= E\big(Y_{ij} - x_{ij}\beta \mid x\big) = 0
\end{aligned}
\]
  because $p_{ij} = P(R_{ij} = 1 \mid x, Y_{ij}, \ldots, Y_{i1})$

What to condition on?
• Must condition on $Y_{ij}$
• If MAR, then $R_{ij}$ and $Y_{ij}$ are conditionally independent given previous Y

Marginal Structural Models
• MSMs use inverse-probability weighting to deal with the unobserved ("missing") counterfactuals
• We cannot adjust for confounders…
• …but using IPW, can re-weight the dataset so that treatment and covariates are unconfounded
  • i.e. mean covariate levels are the same between treated and untreated patients
• So can do a simple marginal analysis

Probability-of-Treatment Model
• Weighting is based on the Probability-of-Treatment model
• Treatment is longitudinal
• For each child at each time, need probability of receiving the observed treatment trajectory
• Probability is conditional on past responses and confounders
• Assume independent of current response

JDM Example
• Probability of being on IVIg at baseline (logistic regression)
• Probability of transitioning onto IVIg (Cox PH)
• Probability of transitioning off IVIg (Cox PH)
• Suppose a child initiates IVIg at 8 months and is still on IVIg at 12 months. What is the probability of the observed treatment pattern?
Treatment probability =
  P(not on IVIg at baseline)
  × P(no transition before month 8)
  × P(transition at month 8)
  × P(no transition off before month 12)

Timeline: month 0, no IVIg; month 8, initiate IVIg; month 12, still on IVIg

Model Fitting
• First identified covariates univariately
• Then entered those that were significant into the model and refined (by removing those that were no longer significant)
• IVIg at baseline: functional status (any vs. none) OR 11.6, 95% CI 1.94 to 69.7; abnormal swallow/voice OR 6.28, 95% CI 0.983 to 40.2
• IVIg termination: no covariates

Assessing goodness-of-fit
• If the IPT weights are correct, then in the re-weighted population, treatment and covariates are unconfounded
• This property is crucial and testable…
• …so we should test it!

Goodness-of-fit in the JDM study
• Biggest concern is that kids are doing badly when they start IVIg
• If inverse-probability weights are correct, then at each time t, amongst patients previously IVIg-naïve, IVIg is not associated with covariates
• Will look at differences in mean covariate values by current IVIg status amongst patients previously IVIg-naïve
• Data are longitudinal, so use a GEE analysis, adjusting for time

Model 1 – HRs for Treatment Initiation
Covariate     W1
Skin rash     3.48 (0.99 to 12.17)
CHAQ          1.99 (1.10 to 3.66)
Prednisone    4.01 (1.35 to 11.90)
Hazard ratios and 95% confidence intervals for initiating treatment

[Figure: goodness-of-fit plots by weighting scheme (UW, W1, W2, W3, W4); panels: DAS, Missing DAS, Prednisone, Methotrexate]

Model 2 – Revised Treatment Initiation
Covariate     W1                      W2
Skin rash     3.48 (0.99 to 12.17)    3.33 (0.92 to 12.1)
CHAQ          1.99 (1.10 to 3.66)     1.97 (1.06 to 3.64)
Prednisone    4.01 (1.35 to 11.90)    3.96 (1.33 to 11.8)
DAS           -                       1.03 (0.82 to 1.30)
Hazard ratios and 95% confidence intervals for initiating treatment

New goodness-of-fit
[Figure: goodness-of-fit plots by weighting scheme (UW, W1, W2, W3, W4); panels: DAS, Missing DAS, Prednisone, Methotrexate]

Back to basics
• Some patients
start IVIg because they are steroid-resistant (early-starters)
• Others start because they are steroid-dependent (late-starters)
• Repeat model-fitting process separately for early and late starters

Covariate                 W3
Abnormal ALT & t < 230    5.44 (1.29 to 22.9)
CHAQ & t < 230            4.27 (1.70 to 10.7)
Prednisone & t > 230      4.92 (1.39 to 17.4)

[Figure: goodness-of-fit plots by weighting scheme (UW, W1, W2, W3, W4); panels: DAS, Missing DAS, Prednisone, Methotrexate]

Refined two-stage model
Covariate                 W3                     W4
Abnormal ALT & t < 230    5.44 (1.29 to 22.9)    5.27 (0.98 to 28.3)
CHAQ & t < 230            4.27 (1.70 to 10.7)    4.22 (1.63 to 10.9)
Prednisone & t > 230      4.92 (1.39 to 17.4)    5.22 (1.44 to 19.0)
Missing DAS & t < 230     -                      0.994 (0.77 to 1.28)
Missing DAS & t > 230     -                      0.939 (0.80 to 1.11)

[Figure: goodness-of-fit plots by weighting scheme (UW, W1, W2, W3, W4); panels: DAS, Missing DAS, Prednisone, Methotrexate]

Efficacy Results
Weighting Scheme    Hazard Ratio (95% CI)
Unweighted          0.646 (0.342, 1.22)
W1                  0.825 (0.394, 1.73)
W2                  0.851 (0.402, 1.80)
W3                  0.703 (0.340, 1.44)
W4                  0.756 (0.378, 1.51)

Other concerns with MSMs
• Format of treatment effect (e.g. constant over time, PH etc.)
• Unmeasured confounders
• Lack of efficiency
• Experimental Treatment Assignment

Efficiency
• IPW reduces bias but also reduces efficiency
• The further the weights are from 1, the worse the efficiency
• Can stabilise the weights:
  • Estimating equations will still be zero-mean if we multiply $\Delta_{ijj}$ by a factor depending on j and treatment
  • In the JDM study, we used $\Delta_{ijj} = R_{ij}\, P(\text{Rx history}) / P(\text{Rx history} \mid \text{confounders})$

Efficiency – other techniques
• Doubly robust methods (Bang & Robins)
• Could have used a more information-rich outcome
  • Did a secondary analysis using DAS as the outcome: got far more precise (and more positive) results

Experimental Treatment Assignment
• In order for MSMs to work, there must be some experimentality in the way treatment is assigned
• Intuitively, if we can predict perfectly who will get what treatment, then we have complete confounding
• Mathematically, if $p_{ij}$ is 0 then we're in trouble!
• Actually, we get into trouble if $p_{ij}$ = 0 or 1

Testing the ETA – simple checks
• At each time j, review the distribution of covariates amongst those who are on treatment vs. those who are not
• Review the distribution of the weights
  • check p is bounded away from 0 and 1
• In the JDM example, also check the distribution of transition probabilities

Testing the ETA – more advanced methods
• Bootstrapping
• Wang Y, Petersen ML, Bangsberg D, van der Laan MJ. Diagnosing bias in the inverse probability of treatment weighted estimator resulting from violation of experimental treatment assignment. UC Berkeley Division of Biostatistics working paper series, 2006.
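The stabilised weights and the simple ETA checks described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: it uses a single time point rather than a full treatment history (in the longitudinal case the numerator and denominator would each be products over time), it takes the fitted treatment probabilities as given, and the function names, the `eps` threshold and the input values are all hypothetical rather than taken from the JDM analysis.

```python
import numpy as np


def stabilised_weights(treated, propensity):
    """Stabilised IPT weights for a single time point (simplified sketch).

    treated    : 0/1 values, the treatment actually received
    propensity : P(treated = 1 | confounders), assumed to come from a fitted model
    """
    treated = np.asarray(treated, dtype=int)
    propensity = np.asarray(propensity, dtype=float)
    # Denominator: probability of the treatment actually received, given confounders.
    p_obs = np.where(treated == 1, propensity, 1.0 - propensity)
    if np.any(p_obs <= 0.0):
        raise ValueError("observed treatment has probability 0; weight undefined")
    # Numerator: marginal probability of the treatment actually received.
    p_marginal = treated.mean()
    numerator = np.where(treated == 1, p_marginal, 1.0 - p_marginal)
    return numerator / p_obs


def eta_checks(propensity, weights, eps=0.05):
    """Count treatment probabilities within eps of 0 or 1 and summarise the weights."""
    propensity = np.asarray(propensity, dtype=float)
    weights = np.asarray(weights, dtype=float)
    near_boundary = (propensity < eps) | (propensity > 1.0 - eps)
    return {
        "n_near_0_or_1": int(near_boundary.sum()),
        "weight_min": float(weights.min()),
        "weight_mean": float(weights.mean()),
        "weight_max": float(weights.max()),
    }


# Made-up illustrative values, not JDM data.
treated = [1, 0, 1, 0]
propensity = [0.8, 0.5, 0.02, 0.4]
weights = stabilised_weights(treated, propensity)
checks = eta_checks(propensity, weights)
print(checks)
```

In this toy input the third patient was treated despite a fitted treatment probability of 0.02, so the stabilised weight is about 25. A weight that far from 1 is the kind of value a review of the weight distribution is meant to surface.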
Implementing MSMs
• For time-to-event outcome, can do weighted PH regression in R
  • Used the svycoxph function from the survey package
• For continuous (or binary) outcome, use weighted GEE
  • Used proc genmod in SAS with scwgt
  • Weighted GEEs are not straightforward in R
• STATA could probably handle either type of outcome

MSMs – potentials
• Often good observational databases exist
• Should do what we can with them before using large amounts of money to do trials
• Can deal with a time-varying treatment
• Conceptually fairly straightforward
• Do not have to model correlation structure in responses

MSMs – limitations
• There may always be unmeasured confounders
• Relies heavily on the probability-of-treatment model being correct
• ETA violations can often occur (particularly with small sample sizes)
• Somewhat inefficient
  • Doubly robust methods may help
• Not a replacement for an RCT

Key points
• MSMs can help to establish causal associations from observational data
• Make some strong assumptions
• Need goodness-of-fit for measured confounders
• Will never find the right model
• Aim to find good models

References
• Robins JM, Hernan MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology 2000; 11: 550–560.
• Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics 2005; 61(4): 962–973.
• Pullenayegum EM, Lam C, Manlhiot C, Feldman BM. Fitting marginal structural models: estimating covariate-treatment associations in the re-weighted dataset can guide model fitting. Journal of Clinical Epidemiology.
• Wang Y, Petersen ML, Bangsberg D, van der Laan MJ. Diagnosing bias in the inverse probability of treatment weighted estimator resulting from violation of experimental treatment assignment. UC Berkeley Division of Biostatistics working paper series, 2006.