DAGs intro without exercises (1h)

Directed Acyclic Graph
Hein Stigum
http://folk.uio.no/heins/ courses
May-16 H.S.

Motivating example
• Want the effect of E on D (E precedes D)
• Observe the two associations C-E and C-D
• Not enough information! Different analyses for:
  – Confounder: E ← C → D
  – Mediator: E → C → D
• Need causal information from outside the data to do a proper analysis!

Agenda
• Background
• DAG concepts
  – Association, Cause
  – Confounder, Collider
  – Paths
• Analyzing DAGs
  – Examples

Why causal graphs?
• Problem
  – Association measures are biased
• Causal graphs help:
  – Understanding
    • Confounding, mediation, selection bias
  – Analysis
    • Adjust or not
  – Discussion
    • Precise statement of prior assumptions

CONCEPTS
Causal versus casual

god-DAG
Causal graph:
• Node = variable
• Arrow = cause
• E = exposure, D = disease
• DAG = Directed Acyclic Graph
Reading the DAG:
• Causality = arrows
• Associations = paths
• Independencies = no paths
Estimation: the E-D association has two parts:
• E→D: causal effect, keep open
• E-C-U-D: bias, try to close
• E-[C]-U-D: condition (adjust) on C to close

Association and Cause
An association between E (yellow fingers) and D (lung cancer) has 3 possible causal structures:
1. Cause: Yellow fingers → Lung cancer (or reversed cause)
2. Confounder: a common cause (Smoke) of yellow fingers and lung cancer
3. Collider: a common effect (U) of yellow fingers and lung cancer
+ more complicated structures

Confounder idea
[Figure: two panels. "A common cause": Smoking → Yellow fingers (+) and Smoking → Lung cancer (+), giving a positive Yellow fingers - Lung cancer association. "Adjust for smoking": the same DAG with Smoking conditioned on.]
• A confounder induces an association between its effects
• Conditioning on a confounder removes the association
• Condition = (restrict, stratify, adjust)
• Simplest form
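The confounder idea can be checked with a small simulation. The sketch below is illustrative only: the probabilities, the sample size and the function names (`simulate`, `risk_difference`) are assumptions made for this example and do not come from the slides. Smoking causes both yellow fingers and lung cancer, and yellow fingers have no effect on lung cancer.

```python
import random


def simulate(n=200_000, seed=1):
    """Simulate (smoke, yellow, cancer); all probabilities are made-up values."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        smoke = rng.random() < 0.30
        # Yellow fingers and lung cancer each depend on smoking only,
        # so there is no arrow from yellow fingers to lung cancer.
        yellow = rng.random() < (0.70 if smoke else 0.10)
        cancer = rng.random() < (0.20 if smoke else 0.02)
        data.append((smoke, yellow, cancer))
    return data


def risk_difference(rows):
    """Cancer risk among yellow-fingered minus cancer risk among the rest."""
    exposed = [cancer for yellow, cancer in rows if yellow]
    unexposed = [cancer for yellow, cancer in rows if not yellow]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


data = simulate()
# Crude association: ignores smoking.
crude = risk_difference([(y, c) for s, y, c in data])
# Conditioning by stratifying on the confounder.
among_smokers = risk_difference([(y, c) for s, y, c in data if s])
among_nonsmokers = risk_difference([(y, c) for s, y, c in data if not s])

print(f"crude risk difference:       {crude:.3f}")
print(f"among smokers (stratified):  {among_smokers:.3f}")
print(f"among non-smokers:           {among_nonsmokers:.3f}")
```

Under these assumed numbers the crude risk difference is clearly positive, while the stratum-specific differences are close to zero: the association is induced by the common cause and disappears when smoking is conditioned on.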
Collider idea
[Figure: two panels. "Two causes for selection to study": Yellow fingers → Selected (+) and Lung cancer → Selected (+). "Selected subjects": conditioning on Selected induces a Yellow fingers - Lung cancer association, negative (−) under "or" selection and positive (+) under "and" selection.]
• Conditioning on a collider induces an association between its causes
• "And" and "or" selection lead to different bias
• Simplest form

Data driven analysis
[Figure: undirected graph with nodes C, E and D, with C linked to E and to D.]
• Want the effect of E on D (E precedes D)
• Observe the two associations C-E and C-D
• Assume statistical criteria dictate adjusting for C (likelihood ratio, Akaike (赤池 弘次) or 10% change in estimate)
The undirected graph above is compatible with three DAGs:

DAG                     Correct analysis
Confounder (E ← C → D)  1. Adjust
Mediator (E → C → D)    2. Direct effect: adjust
                        3. Total effect: do not adjust
Collider (E → C ← D)    4. Do not adjust

Conclusion:
• The data driven method is correct in 2 out of 4 situations
• Need information from outside the data to do a proper analysis

Paths
The Path of the Righteous

Path definitions
• Path: any trail from E to D (without repeating itself)
• Type: causal, non-causal
• State: open, closed
Four paths:

Path
1  E→D
2  E→M→D
3  E←C→D
4  E→K←D

Goal:
• Keep causal paths of interest open
• Close all non-causal paths

Four rules
1. Causal path: E→D (all arrows in the same direction), otherwise non-causal
Before conditioning:
2. Closed path: →K← (closed at a collider, otherwise open)
Conditioning on:
3. a non-collider closes: [M] or [C]
4. a collider opens: [K] (or a descendant of a collider)

ANALYZING DAGs

Confounding examples

Physical activity and Coronary Heart Disease (CHD)
We want the total effect of Physical Activity on CHD. What would we adjust for?

Unconditional
Path         Type        Status
1  E→D       Causal      Open
2  E←C1→D    Non-causal  Open
3  E←C2→D    Non-causal  Open
Result: bias. This is an example of confounding.

Conditioning on C1 and C2
Path          Type        Status
1  E→D        Causal      Open
2  E←[C1]→D   Non-causal  Closed
3  E←[C2]→D   Non-causal  Closed
Result: no bias.

Intermediate variables
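The four rules and the path tables of the physical activity example can be sketched as a small path checker. This is a minimal illustration, not part of the original slides: the function names (`is_causal`, `colliders`, `is_open`) and the list-based path encoding are assumptions, and the sketch ignores the "descendant of a collider" part of rule 4.

```python
def is_causal(arrows):
    """Rule 1: a path is causal if all arrows point from E towards D."""
    return all(a == "->" for a in arrows)


def colliders(nodes, arrows):
    """Interior nodes with both neighbouring arrows pointing into them."""
    return {nodes[i] for i in range(1, len(nodes) - 1)
            if arrows[i - 1] == "->" and arrows[i] == "<-"}


def is_open(nodes, arrows, conditioned=frozenset()):
    """Rules 2-4, ignoring descendants of colliders (a simplification)."""
    hit = colliders(nodes, arrows)
    for node in nodes[1:-1]:
        if node in hit and node not in conditioned:
            return False  # rule 2: closed at an unconditioned collider
        if node not in hit and node in conditioned:
            return False  # rule 3: conditioning on a non-collider closes
    return True  # rule 4: a conditioned collider does not block the path


# Paths of the physical activity example; arrow directions are the
# confounder structure E <- C -> D assumed for this sketch.
paths = {
    "1 E->D": (["E", "D"], ["->"]),
    "2 E<-C1->D": (["E", "C1", "D"], ["<-", "->"]),
    "3 E<-C2->D": (["E", "C2", "D"], ["<-", "->"]),
}

for adjust in (set(), {"C1", "C2"}):
    print("Conditioning on:", sorted(adjust) or "nothing")
    for label, (nodes, arrows) in paths.items():
        kind = "causal" if is_causal(arrows) else "non-causal"
        state = "open" if is_open(nodes, arrows, adjust) else "closed"
        print(f"  {label:12s} {kind:11s} {state}")
```

The same functions can be applied to any path written as a node list plus an arrow list, for example a collider path `["E", "K", "D"]` with arrows `["->", "<-"]`, which is closed until K is conditioned on.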
Tea and depression
Nodes: E = tea, D = depression, C = caffeine, O = coffee

Path           Type        Status  Effect
1  E→D         Causal      Open    direct
2  E→C→D       Causal      Open    indirect
3  E←O→C→D     Non-causal  Open
Paths 1 and 2 together give the total effect.

• Total effect: adjust for O
• Direct effect: adjust for C (and O)
• Caffeine intermediate or confounder? Caffeine is both intermediate and part of a confounder path.

Statin and CHD
• We want the total effect of statin on CHD. What would you adjust for? Nothing.
• Can we estimate the direct effect of statin on CHD (not mediated through cholesterol)? No, U is unmeasured.

Unconditional
Path           Type        Status
1  E→D         Causal      Open
2  E→C→D       Causal      Open
3  E→C←U→D     Non-causal  Closed
No adjustment gives the total effect.

Conditioning on C
Path            Type        Status
1  E→D          Causal      Open
2  E→[C]→D      Causal      Closed
3  E→[C]←U→D    Non-causal  Open
Adjusting for C will close path 2 but will open path 3 and give bias! C is a collider on path 3.

Selection bias
Two concepts

Convenience sample, homogeneous sample
[Figure: DAG E → H ← D, where H = hospital; the nodes E and D carry the labels "fractures" and "diabetes".]
1. Convenience sample: Conduct the study among hospital patients?
2. Homogeneous sample: Population data, exclude hospital patients?

               Path         Type        Status
Unconditional  1  E→D       Causal      Open
               2  E→H←D     Non-causal  Closed
Conditional    2  E→[H]←D   Non-causal  Open

• H is a collider; conditioning on it gives selection bias
• Collider stratification bias: at least one stratum is biased
• This type of selection bias can be adjusted for!

Summing up
• Data driven analyses do not work. Need (causal) information from outside the data.
• DAGs are intuitive and accurate tools to display that information.
• Paths show the flow of causality and of bias and guide the analysis.
• DAGs clarify concepts like confounding and selection bias, and show that we can adjust for both.
Better discussion based on DAGs
Draw your assumptions before your conclusions

Recommended reading
• Books
  – Hernan, M. A. and J. Robins. Causal Inference. Web:
  – Rothman, K. J., S. Greenland, and T. L. Lash. Modern Epidemiology, 2008.
  – Morgan and Winship. Counterfactuals and Causal Inference, 2009.
  – Pearl, J. Causality: Models, Reasoning and Inference, 2009.
  – Veierød, M. B., S. Lydersen, and P. Laake. Medical Statistics, 2012.
• Papers
  – Greenland, S., J. Pearl, and J. M. Robins. Causal diagrams for epidemiologic research, Epidemiology 1999
  – Hernandez-Diaz, S., E. F. Schisterman, and M. A. Hernan. The birth weight "paradox" uncovered? Am J Epidemiol 2006
  – Hernan, M. A., S. Hernandez-Diaz, and J. M. Robins. A structural approach to selection bias, Epidemiology 2004
  – Berk, R. A. An introduction to selection bias in sociological data, Am Soc R 1983
  – Greenland, S. and B. Brumback. An overview of relations among causal modeling methods, Int J Epidemiol 2002
  – Weinberg, C. R. Can DAGs clarify effect modification? Epidemiology 2007