Summary of Causality
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/courses
May 16 H.S.

Contents
• Concepts
  – Statistics and Causality
  – Counterfactuals, Actions
• Causal models
  – DAGs, Pies, SEM, (MSM)
• Causal inference
  – Exchangeability, Positivity, Consistency
• Methods of adjustment

Concepts: Statistics and Causality (J. Pearl)

Traditional Statistics
[Figure: Data, P (joint distribution), Q(P) (aspects of P)]
• Inference
  – Infer whether customers who buy product A will also buy product B
  – Q = P(B|A)

From Statistical to Causal analysis
[Figure: Data, P (joint distribution), change, P' (joint distribution), Q(P') (aspects of P')]
• Intervention: P changes to P'
  – Infer whether customers who buy product A will also buy product B when we double the price
  – Statistics deals with static relations; P does not tell us how it ought to change: P'(v) ≠ P(v | price=2)
  – Need assumptions about aspects of P that stay invariant under intervention (change)

Statistical and Causal concepts
• Statistical and causal concepts do not mix

  Statistical                      Causal
  Association                      Randomization / intervention
  Controlling for / Conditioning   Confounding
  Independence                     Instrumental variable
  Collapsibility                   Endogeneity

• No causes in, no causes out
  – Causal assumptions + Statistical assumptions + Data = causal conclusions
• Standard mathematics
  – Causal assumptions cannot be expressed
• Non-standard mathematics
  – Structural Equation Models (Wright 1920, Simon 1969)
  – Counterfactuals (Neyman-Rubin)
  – Do-operator (Pearl)

Concepts: Potential outcome, Counterfactual

Individual causal effect
• Individual causal effect
  – Outcome if exposed: Y1
  – Outcome if unexposed: Y0
  – Causal effect if: Y1 ≠ Y0
  – Y1, Y0: potential outcomes, counterfactual
• Important
  – Clear definition
  – Notation for mathematical proofs
  – Notation for new methods
• Estimate individual effect?
  – No (but crossover design)
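The potential-outcome notation for the individual causal effect can be made concrete with a small simulation. The sketch below is only an illustration: the risks (0.2 if unexposed, 0.5 if exposed), the sample size and all variable names are assumptions, not taken from the presentation. It shows that the individual effect Y1 - Y0 is well defined for every person, while the data reveal only one of the two potential outcomes.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical individuals: each has BOTH potential outcomes (1 = disease),
# y1 = outcome if exposed, y0 = outcome if unexposed.
people = []
for _ in range(8):
    y0 = int(random.random() < 0.2)   # assumed risk if unexposed
    y1 = int(random.random() < 0.5)   # assumed risk if exposed
    exposed = random.random() < 0.5   # exposure actually received
    people.append({"y1": y1, "y0": y0, "exposed": exposed})

for p in people:
    # Individual causal effect: nonzero whenever y1 != y0 ...
    p["effect"] = p["y1"] - p["y0"]
    # ... but the data reveal only the outcome under the exposure received.
    p["observed"] = p["y1"] if p["exposed"] else p["y0"]
    # The other potential outcome stays counterfactual (unobserved).
    p["counterfactual"] = "y0" if p["exposed"] else "y1"

for p in people:
    print(p)
```

In real data the "effect" column cannot be computed, because the counterfactual outcome is missing for every individual; this is why the individual effect is not estimable outside special designs such as crossover trials.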
Population causal effect
• Average causal effect
  – Expected outcome if all exposed: E(Y1)
  – Expected outcome if all unexposed: E(Y0)
  – Causal effect if: E(Y1) ≠ E(Y0)
• Causal effect measures
  – RDcausal = E(Y1) - E(Y0)
  – RRcausal = E(Y1) / E(Y0)
• Estimate average effect?
  – Yes, Randomized Controlled Trial

Actions
• Modifiable risk factors
  – Examples: Smoking, Radon
  – Actions: Reduce prevalence of smoking from 15% to 10%
• Not modifiable risk factors
  – Examples: Sex, Age
  – Actions: ?
• Causal effects are strictly speaking only defined for actions

Causal models: DAGs, Pies and SEM

Causal models
• Four models
  – Causal graphs (DAGs)
  – Causal Pies, Sufficient Component Cause (SCC)
  – Structural Equation Models (SEM)
  – Potential outcome models
    • Marginal Structural Models (MSM)

Causal graphs, DAGs
[Figure: DAG with nodes C, E and D]
• Causal assumptions
• Units: individuals (also other units)
• E->D reads E causes D, any definition of cause
• Qualitative (non-parametric): the E->D may be linear, threshold, U-shaped, …
• Simple, only 4 rules needed for analysis
• No estimation, only qualitative results: confounding yes/no
• Non-action (immutable) variables as exogenous, all rules apply
• New understanding: collider

Causal Pies (SCC)
[Figure: causal pies with components U, A and B; 1 DAG (nodes A, B, D) compared with 5 SCCs]
• Causal assumptions
• Units: causal mechanisms
• Any definition of cause
• No estimation, only qualitative results: interaction yes/no
• Logically finer than DAGs
• Additive scale (interaction)
• New understanding: sufficient and necessary cause, interaction

Structural Equation Models, SEM
• Causal assumptions + statistical model + data
• Units: individuals (also other units)
• Any definition of cause
• Quantitative (parametric): linear
• Estimation: direct and indirect effects
• Ordinary regression: association of actual covariates with actual outcomes
• SEM: effect of actions on potential outcomes
• SEM: parametric DAG
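The distinction between ordinary regression (association of actual covariates with actual outcomes) and a structural model (effect of actions) can be sketched with a simulated linear system. Everything here is an assumption made for illustration: a single confounder C, the structural equation D = 1.0*E + 2.0*C + noise, standard normal noise terms, and the names EFFECT_E, EFFECT_C, slope and outcome. The same structural equation for D is used in both worlds; only the way E is generated changes.

```python
import random

random.seed(0)
N = 20000
EFFECT_E = 1.0   # assumed causal effect of E on D
EFFECT_C = 2.0   # assumed effect of the confounder C on D

def slope(x, y):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def outcome(e, c):
    """Structural equation for D: D = EFFECT_E*E + EFFECT_C*C + noise."""
    return EFFECT_E * e + EFFECT_C * c + random.gauss(0, 1)

c = [random.gauss(0, 1) for _ in range(N)]

# Observational world: C also causes E (C -> E, C -> D, E -> D).
e_obs = [ci + random.gauss(0, 1) for ci in c]
d_obs = [outcome(ei, ci) for ei, ci in zip(e_obs, c)]

# Action: E is set from outside, so the arrow C -> E is removed.
e_set = [random.gauss(0, 1) for _ in range(N)]
d_set = [outcome(ei, ci) for ei, ci in zip(e_set, c)]

slope_assoc = slope(e_obs, d_obs)    # association; theory gives 2.0 here
slope_action = slope(e_set, d_set)   # effect of action; theory gives 1.0 here
print(round(slope_assoc, 2), round(slope_action, 2))
```

Under these assumptions the crude regression slope is cov(E, D) / var(E) = (1*2 + 2*1) / 2 = 2, twice the structural coefficient, while the slope under intervention recovers the coefficient 1. The gap is exactly the confounding that the joint distribution P alone cannot reveal.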
Causal Inference

Causal inference question
• Counterfactual definition of cause
  – Cannot estimate individual causal effects
  – Can estimate average causal effects from RCTs
• Can we estimate average causal effects from observational data?
  – Find conditions needed for causal inference
    • Examine RCTs for conditions
    • Apply to observational studies

Randomized Controlled Trial, RCT
• Observational study
  – Suffers from unmeasured confounders
  [Figure: DAG with nodes U, E and D]
• Randomized trial
  – If full compliance:
    • R=E
    • No arrow from U to E
  [Figure: DAG with nodes R, U, E and D]
• Three (trivial) conditions in RCTs:
  – Exchangeability: exposed and unexposed may be switched
  – Positivity: have both exposed and unexposed
  – Consistency: well defined treatment

RCTs versus Observational studies
[Figure: RCT DAG with nodes C, R, E and D; observational DAG with nodes C, E and D]

  RCT get           Observational need             strength   test
  Exchangeability   Conditional exchangeability    weaker     untestable
  Positivity        Conditional positivity         stronger   testable
  Consistency       Consistency                    -          -

Exchangeability, Positivity and Consistency
[Figure: DAGs with nodes C, E, D, M, K and U1, marking the non-causal paths, the causal paths and the sufficient causes for E]
Conditions for estimating causal effect:
1. Cond. Exchangeability: No open non-causal paths
2. Cond. Positivity: Arrows into E not deterministic
3. Consistency: Causal paths well defined

Conditional positivity example
• Prior knowledge
  – Dose response is linear
• Positivity problem
  – Estimate dose response for each sex?
[Figure: three panels (All, Females, Males) of Response (0 to 40) against Dose (low, high)]

Conditional positivity, Common support
• Conditional positivity = exposed and unexposed for all values of C
• Parametric assumption: linear "dose response"
[Figure: exposed (E=1) and unexposed (E=0) along the confounder C, with the region of positivity marked; three panels (C<40, C=40 to 55, C>55) of D against Exposure for E=0 and E=1]
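Because conditional positivity is testable, it can be checked directly by tabulating exposure within strata of the confounder. The sketch below is a minimal illustration: the records, the stratum labels and the helper names positivity_table and violations are all hypothetical, and the strata only loosely mirror the C<40, C=40 to 55 and C>55 panels of the common support figure.

```python
from collections import defaultdict

# Hypothetical records: (stratum of confounder C, exposure E).
records = [
    ("C<40", 0), ("C<40", 0), ("C<40", 0),
    ("C=40-55", 0), ("C=40-55", 1), ("C=40-55", 1),
    ("C>55", 1), ("C>55", 1),
]

def positivity_table(rows):
    """Count unexposed (E=0) and exposed (E=1) within each stratum of C."""
    counts = defaultdict(lambda: [0, 0])
    for stratum, e in rows:
        counts[stratum][e] += 1
    return dict(counts)

def violations(table):
    """Strata where either the exposed or the unexposed are missing."""
    return sorted(s for s, (n0, n1) in table.items() if n0 == 0 or n1 == 0)

table = positivity_table(records)
for stratum, (n0, n1) in table.items():
    status = "ok" if n0 > 0 and n1 > 0 else "positivity violated"
    print(f"{stratum}: E=0 n={n0}, E=1 n={n1} -> {status}")
```

In this made-up data only the middle stratum has common support. Estimating an effect in the other strata would require a parametric assumption, such as the linear dose response mentioned on the slide, to extrapolate beyond the data.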
Consistency
Consistency = Well defined intervention and contrast

Air pollution
Excess mortality from air pollution?
Standard method: estimate attributable fraction
Implicit contrast: current levels versus zero
Implicit intervention: non-existent
No consistency

Body Mass Index
Excess mortality from obesity?
Standard method: estimate attributable fraction
Implicit contrast: 30 versus <25
Implicit intervention: Exercise, Diet, Smoking
[Figure: Exercise, Diet, Smoking and Mortality]
No consistency

Methods of adjustment: G-methods versus Stratification based methods

Adjusting for confounding
[Figure: one DAG with nodes C, E and D for each approach]

                   G-methods                                      Stratification-methods
  Population       Simulated population with exchangeability      Sub population with C constant
  Validity         Causal effect valid for entire population      Causal effect valid for sub population
  Non-parametric   IPW, standardization                           Stratification, matching
  Parametric       MSM, NSM                                       regression

Stratification based adjustment
[Figure: DAG with nodes H (chocolate), O (coffee), E (tea), C (caffeine), U (low carb) and D (depression)]
• We want the direct effect of tea on depression
• Try stratification based adjustment
• Fails: one non-causal path is left open

Inverse probability weighting
[Figure: DAG with nodes H (chocolate), O (coffee), E (tea), C (caffeine), U (low carb) and D (depression)]
• We want the direct effect of tea on depression
• Try adjustment by IPW:
  – Choose a variable V and weight by the inverse of P(V | direct causes)
  – Try C
• Works: all non-causal paths are closed, only direct effect left

Summing up
• Concepts
  – Causal definition: counterfactual (potential outcome)
  – Causal conclusion requires causal assumptions
• Models
  – DAGs, Pies: causal assumptions
  – SEM, MSM: statistical + causal assumptions
• Causal inference
  – Exchangeability: comparable E+ and E-
  – Positivity: E+ and E- in all strata
  – Consistency: well defined intervention and contrast
• Adjustment
  – Stratification based: stratification, matching, regression
  – G-methods: IPW, MSM

Recommended reading
• Books
  – Hernan, M. A.
    and J. Robins. Causal Inference. Web:
  – Rothman, K. J., S. Greenland, and T. L. Lash. Modern Epidemiology. 2008.
• Papers
  – Greenland, S., J. Pearl, and J. M. Robins. "Causal diagrams for epidemiologic research." Epidemiology 1999
  – Hernandez-Diaz, S., E. F. Schisterman, and M. A. Hernan. "The birth weight "paradox" uncovered?" Am J Epidemiol 2006
  – Hernan, M. A., S. Hernandez-Diaz, and J. M. Robins. "A structural approach to selection bias." Epidemiology 2004
  – Greenland, S. and B. Brumback. "An overview of relations among causal modelling methods." Int J Epidemiol 2002
  – Weinberg, C. R. "Can DAGs clarify effect modification?" Epidemiology 2007

References
• Chen, L., et al. "Alcohol intake and blood pressure: a systematic review implementing a Mendelian randomization approach." PLoS Med 2008
• Greenland, S. and B. Brumback. "An overview of relations among causal modelling methods." Int J Epidemiol 2002
• Hernan, M. A., S. Hernandez-Diaz, and J. M. Robins. "A structural approach to selection bias." Epidemiology 2004
• Hernan, M. A. and S. R. Cole. "Causal diagrams and measurement bias." Am J Epidemiol 2009
• Sheehan, N. A., et al. "Mendelian randomisation and causal inference in observational epidemiology." PLoS Med 2008
• VanderWeele, T. J. and J. M. Robins. "Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect." Am J Epidemiol 2007
• VanderWeele, T. J., M. A. Hernan, and J. M. Robins. "Causal directed acyclic graphs and the direction of unmeasured confounding bias." Epidemiology 2008
• VanderWeele, T. J. "The sign of the bias of unmeasured confounding." Biometrics 2008
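Worked example: G-methods on a toy cohort

As a closing illustration of the non-parametric G-methods named in the summary (IPW and standardization), the sketch below computes a causal risk difference on a small cohort. It is only a sketch under stated assumptions: the counts are invented, there is a single binary confounder C, conditional exchangeability given C is assumed to hold, and the function names are hypothetical. It is not the tea and depression example from the slides.

```python
# Hypothetical cohort as counts: (C, E) -> (number of people, number with D=1).
cohort = {
    (0, 1): (100, 20), (0, 0): (300, 30),
    (1, 1): (300, 150), (1, 0): (100, 40),
}
total = sum(n for n, _ in cohort.values())

def crude_risk(e):
    """P(D=1 | E=e), ignoring the confounder C."""
    n = sum(cohort[(c, e)][0] for c in (0, 1))
    d = sum(cohort[(c, e)][1] for c in (0, 1))
    return d / n

def standardized_risk(e):
    """Standardization: sum over c of P(C=c) * P(D=1 | E=e, C=c)."""
    risk = 0.0
    for c in (0, 1):
        n_c = cohort[(c, 0)][0] + cohort[(c, 1)][0]
        n, d = cohort[(c, e)]
        risk += (n_c / total) * (d / n)
    return risk

def ipw_risk(e):
    """IPW: weight each person by 1 / P(E=e | C=c), then take the weighted risk."""
    w_n = w_d = 0.0
    for c in (0, 1):
        n_c = cohort[(c, 0)][0] + cohort[(c, 1)][0]
        n, d = cohort[(c, e)]
        weight = 1.0 / (n / n_c)
        w_n += weight * n
        w_d += weight * d
    return w_d / w_n

rd_crude = crude_risk(1) - crude_risk(0)
rd_std = standardized_risk(1) - standardized_risk(0)
rd_ipw = ipw_risk(1) - ipw_risk(0)
print(round(rd_crude, 3), round(rd_std, 3), round(rd_ipw, 3))
```

With these counts the crude risk difference is 0.25, while standardization and IPW both give 0.10, the risk difference within each stratum of C. The two G-methods agree exactly here because no parametric model is involved; each reweights the data to a simulated population in which C no longer predicts E.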