Causal Graphs, epi forum
Hein Stigum
http://folk.uio.no/heins/ talks
Apr-15 H.S.

Agenda
• Motivating examples
• Concepts
  – Confounder, Collider
• Analyzing DAGs
  – Paths
• Examples
  – Confounding
  – Mixed (confounders and mediators)
  – Selection bias

Why causal graphs?
• Problem
  – Association measures are biased
• Understanding
  – Confounding, selection bias, mediators
• Analysis
  – Adjust or not
• Discussion
  – Precise statement of prior assumptions

Motivating examples
• Statins and coronary heart disease
  – Disease risk: lifestyle, cholesterol
• Diabetes and fractures
  – Disease risk: fall, bone density
  – Exposure risk: BMI, physical activity
  Adjust or not?
• Diabetes and fractures
  – Analyze among hospital patients
  – Exclude hospital patients
  Exclude or not?

Causal versus casual
Concepts

god-DAG
DAG = Directed Acyclic Graph
Node = variable
Arrow = cause (at least one individual effect)

Nodes in the DAG: C (age), U (obesity), E (vitamin), D (birth defects)

Read off the DAG:
  Causality = arrows
  Associations = paths
Questions on the DAG:
  E-D effect biased?
  Adjust for age?

Association and Cause
Association: Yellow fingers - Lung cancer
Possible causal structures:
  Cause:      Yellow fingers → Lung cancer
  Confounder: Yellow fingers ← Smoke → Lung cancer
  Collider:   Yellow fingers → Hospital ← Lung cancer

Confounder idea
A common cause:     Yellow fingers ←(+) Smoking (+)→ Lung cancer, giving a (+) association between Yellow fingers and Lung cancer
Adjust for smoking: Yellow fingers ←(+) [Smoking] (+)→ Lung cancer
• A confounder induces an association between its effects
• Conditioning on a confounder removes the association
• Condition = (restrict, stratify, adjust)

Collider idea
Two causes for coming to hospital: Yellow fingers (+)→ Hospital ←(+) Lung cancer
Select subjects in hospital:       Yellow fingers (+)→ [Hospital] ←(+) Lung cancer, giving an association between Yellow fingers and Lung cancer: "-" with "or" selection, "+" with "and" selection
• Conditioning on a collider induces an association between its causes
• "And" and "or" selection lead to different bias
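The confounder and collider ideas can be checked with a small exact calculation. The sketch below is only an illustration: all probabilities, the variable names and the helper functions `bern` and `risk_ratio` are assumptions made for this example and do not come from the talk. It enumerates a joint distribution instead of simulating, so the results are exact.

```python
from itertools import product


def bern(p, value):
    """Probability that a 0/1 variable with P(1) = p takes the given value."""
    return p if value == 1 else 1.0 - p


def risk_ratio(joint, exposure, outcome, keep=lambda row: True):
    """Risk ratio of outcome by exposure, restricted to rows where keep(row) is true."""
    def risk(level):
        rows = [(r, p) for r, p in joint if keep(r) and r[exposure] == level]
        return sum(p for r, p in rows if r[outcome] == 1) / sum(p for _, p in rows)
    return risk(1) / risk(0)


# Confounder: smoking causes both yellow fingers and lung cancer (assumed numbers).
confounded = []
for s, y, c in product((0, 1), repeat=3):
    p = bern(0.3, s) * bern(0.8 if s else 0.1, y) * bern(0.2 if s else 0.02, c)
    confounded.append(({"smoking": s, "yellow": y, "cancer": c}, p))

crude_rr = risk_ratio(confounded, "yellow", "cancer")
rr_smokers = risk_ratio(confounded, "yellow", "cancer", keep=lambda r: r["smoking"] == 1)
rr_nonsmokers = risk_ratio(confounded, "yellow", "cancer", keep=lambda r: r["smoking"] == 0)


# Collider: yellow fingers and lung cancer are independent causes of hospital admission.
def hospital_joint(p_hospital):
    joint = []
    for y, c, h in product((0, 1), repeat=3):
        p = bern(0.3, y) * bern(0.1, c) * bern(p_hospital(y, c), h)
        joint.append(({"yellow": y, "cancer": c, "hospital": h}, p))
    return joint


def in_hospital(row):
    return row["hospital"] == 1


or_like = hospital_joint(lambda y, c: 0.6 if (y or c) else 0.1)    # either cause admits
and_like = hospital_joint(lambda y, c: 0.6 if (y and c) else 0.1)  # both causes needed

rr_population = risk_ratio(or_like, "yellow", "cancer")
rr_or = risk_ratio(or_like, "yellow", "cancer", keep=in_hospital)
rr_and = risk_ratio(and_like, "yellow", "cancer", keep=in_hospital)

print(f"confounder: crude RR = {crude_rr:.2f}, within smoking strata = "
      f"{rr_smokers:.2f} and {rr_nonsmokers:.2f}")
print(f"collider: population RR = {rr_population:.2f}, in hospital RR = "
      f"{rr_or:.2f} ('or' selection) and {rr_and:.2f} ('and' selection)")
```

With these assumed numbers the crude risk ratio is well above 1 but equals 1 within each smoking stratum, while selecting hospital patients turns a null association into a negative one under "or"-like selection and a positive one under "and"-like selection.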
Data driven analysis
Undirected graph: E - C, C - D, E - D
Want the effect of E on D (E precedes D)
Observe the two associations E-C and D-C
Assume a statistical criterion dictates adjusting for C (likelihood ratio, Akaike or change in estimate)

The undirected graph above is compatible with three DAGs:
  Confounder (E ← C → D):  1. Adjust
  Mediator   (E → C → D):  2. Adjust (direct effect)
                           3. Not adjust (total effect)
  Collider   (E → C ← D):  4. Not adjust

Conclusion:
The data driven method is correct in 2 out of 4 situations
Need information from outside the data to do a proper analysis

The Path of the Righteous
Analyzing DAGs: Paths

Path definitions
Path: any trail from E to D (without repeating or crossing itself)
Type: causal, non-causal
State: open, closed

DAG nodes: E, D, C, M, K

Four paths:
  Path
  1  E→D
  2  E→M→D
  3  E←C→D
  4  E→K←D

Goal:
  Keep causal paths of interest open
  Close all non-causal paths

Four rules
1. Causal path: E→D (all arrows in the same direction), otherwise non-causal
   In the DAG: the path via M is causal, the path via C is non-causal
Before conditioning:
2. Closed path: closed at a collider (K), otherwise open
   In the DAG: the path via K is closed, the paths via C and M are open
Conditioning on:
3. a non-collider closes the path: [M] or [C]
4. a collider (or a descendant of a collider) opens the path: [K]

Confounding

Physical activity and Coronary Heart Disease (CHD)
Nodes: C1 (age), C2 (sex), E (Phys. Act.), D (CHD)
1. We want the total effect of Physical Activity on CHD. What should we adjust for?

Unconditional
  Path         Type        Status
  1  E→D       Causal      Open
  2  E←C1→D    Non-causal  Open
  3  E←C2→D    Non-causal  Open
  Result: Bias

Conditioning on C1 and C2
  Path          Type        Status
  1  E→D        Causal      Open
  2  E←[C1]→D   Non-causal  Closed
  3  E←[C2]→D   Non-causal  Closed
  Result: No bias

Vitamin and birth defects
Nodes: C (age), U (obesity), E (vitamin), D (birth defects)
Bias in E-D? Adjust for C?

Unconditional
  Path          Type        Status
  1  E→D        Causal      Open
  2  E←C→U→D    Non-causal  Open
  Result: Bias

Conditioning on C
  Path           Type        Status
  1  E→D         Causal      Open
  2  E←[C]→U→D   Non-causal  Closed
  Result: No bias

This example and the previous slide are both confounding

Confounders and mediators
Mixed
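Before the mixed examples, the four path rules from the "Analyzing DAGs" part can be sketched in code. This is a minimal illustration and not the author's software: the function names (`simple_paths`, `descendants`, `is_causal`, `is_open`, `classify`) are hypothetical, and the edge list is an assumed encoding of the rules-slide DAG with a confounder C, a mediator M and a collider K.

```python
# Assumed encoding of the rules-slide DAG as (cause, effect) pairs.
EDGES = {("E", "D"), ("E", "M"), ("M", "D"),
         ("C", "E"), ("C", "D"),
         ("E", "K"), ("D", "K")}


def simple_paths(edges, start, end):
    """Yield every non-repeating trail from start to end, ignoring arrow direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in sorted(neighbours[path[-1]]):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])


def descendants(edges, node):
    """All nodes reachable from node by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for a, b in edges:
            if a == current and b not in found:
                found.add(b)
                stack.append(b)
    return found


def is_causal(edges, path):
    """Rule 1: causal if all arrows point in the same direction, from start to end."""
    return all((a, b) in edges for a, b in zip(path, path[1:]))


def is_open(edges, path, conditioned=()):
    """Rules 2-4: a path is closed at an unconditioned collider or a conditioned non-collider."""
    conditioned = set(conditioned)
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edges and (nxt, mid) in edges
        if collider:
            # Rule 2 and 4: closed unless the collider or one of its descendants is conditioned on.
            if not ({mid} | descendants(edges, mid)) & conditioned:
                return False
        elif mid in conditioned:
            # Rule 3: conditioning on a non-collider closes the path.
            return False
    return True


def classify(conditioned=()):
    """Return {path: (type, state)} for all paths from E to D."""
    result = {}
    for path in simple_paths(EDGES, "E", "D"):
        kind = "causal" if is_causal(EDGES, path) else "non-causal"
        state = "open" if is_open(EDGES, path, conditioned) else "closed"
        result["-".join(path)] = (kind, state)
    return result


for label, cond in [("unconditional", ()), ("conditioning on C", ("C",)),
                    ("conditioning on K", ("K",))]:
    print(label, classify(cond))
```

The sketch checks each path separately, which mirrors how the path tables on the slides are filled in. For larger DAGs a dedicated tool would be a safer choice than this hand-rolled check.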
Diabetes and Fractures
Nodes: E (diabetes), D (fracture), F (prone to fall), B (bone density), V (BMI), P (physical activity)
We want the total effect of diabetes on fractures

Unconditional
  Path           Type        Status
  1  E→D         Causal      Open
  2  E→F→D       Causal      Open      (mediator)
  3  E→B→D       Causal      Open      (mediator)
  4  E←V→B→D     Non-causal  Open      (confounder)
  5  E←P→B→D     Non-causal  Open      (confounder)

Conditional (on V and P)
  Path            Type        Status
  1  E→D          Causal      Open
  2  E→F→D        Causal      Open      (mediator)
  3  E→B→D        Causal      Open      (mediator)
  4  E←[V]→B→D    Non-causal  Closed    (confounder)
  5  E←[P]→B→D    Non-causal  Closed    (confounder)

Statin and CHD
Nodes: U (lifestyle), C (cholesterol), E (statin), D (CHD)
1. We want the total effect of statin on CHD. What would we adjust for?
2. Can we estimate the direct effect of statin on CHD (not mediated through cholesterol)?

Unconditional
  Path          Type        Status
  1  E→D        Causal      Open
  2  E→C→D      Causal      Open
  3  E→C←U→D    Non-causal  Closed
  No adjustment gives the total effect

Is C a collider?

Conditioning on C
  Path            Type        Status
  1  E→D          Causal      Open
  2  E→[C]→D      Causal      Closed
  3  E→[C]←U→D    Non-causal  Open
  Adjusting for C opens the collider path; we must also adjust for U to get the direct effect

Selection bias

Diabetes and Fractures
Nodes: H (hospital), E (diabetes), D (fracture)
1. Convenience: Conduct the study among hospital patients?
2. Homogeneous sample: Exclude hospital patients?

  Path (unconditional)   Path (conditional)   Type        Status (unconditional)   Status (conditional)
  1  E→D                 E→D                  Causal      Open                     Open
  2  E→H←D               E→[H]←D              Non-causal  Closed                   Open

Collider, selection bias
Collider stratification bias: at least one stratum is biased

Selection bias: size and direction
DAG: E→D, E→H, D→H, with risk ratios on the arrows: E→D 2.0, E→H 2.0, D→H 3.0

Hospital risk:
          E=1    E=0
  D=1     0.6    0.3
  D=0     0.2    0.1
Response = 16 %

Population
          E=1    E=0
  D=1      36    164
  D=0      64    736
  sum     100    900    1000
RR = 2.0

Hospital
          E=1    E=0
  D=1      22     49
  D=0      13     74
  sum      35    123     157
RR = 1.6

No hospital
          E=1    E=0
  D=1      15    115
  D=0      51    663
  sum      65    777     843
RR = 1.5

Adjusting for selection bias
Nodes: F (prone to fall), H (hospital), E (diabetes), D (fracture)

  Path            Type        Status
  1  E→D          Causal      Open
  2  E→F→[H]←D    Non-causal  Open
  Adjust for F to close this path

Summing up
• Data driven analyses do not work.
  Need (causal) information from outside the data.
• DAGs are intuitive and accurate tools to display that information.
• Paths show the flow of causality and of bias and guide the analysis.
• DAGs clarify concepts like confounding and selection bias, and show that we can adjust for both.
Better discussion based on DAGs

References
1 Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
2 Hernan MA, Hernandez-Diaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004; 15: 615-25.
3 Hernandez-Diaz S, Schisterman EF, Hernan MA. The birth weight "paradox" uncovered? Am J Epidemiol 2006; 164: 1115-20.
4 Schisterman EF, Cole SR, Platt RW. Overadjustment Bias and Unnecessary Adjustment in Epidemiologic Studies. Epidemiology 2009; 20: 488-95.
5 VanderWeele TJ, Hernan MA, Robins JM. Causal directed acyclic graphs and the direction of unmeasured confounding bias. Epidemiology 2008; 19: 720-8.
6 VanderWeele TJ, Robins JM. Four types of effect modification - A classification based on directed acyclic graphs. Epidemiology 2007; 18: 561-8.
7 Weinberg CR. Can DAGs clarify effect modification? Epidemiology 2007; 18: 569-72.
• Hernan and Robins, Causal Inference (coming)
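Worked check of the "Selection bias: size and direction" slide: the sketch below recomputes the risk ratios from the population counts and the hospital risks shown there. The dictionary layout, keyed by (D, E), and the helper name `risk_ratio` are assumptions made for this example. Expected counts are left unrounded, so single cells can differ by one from the rounded counts on the slide.

```python
# Population counts from the slide, keyed by (D, E): D = fracture, E = diabetes.
population = {(1, 1): 36, (1, 0): 164, (0, 1): 64, (0, 0): 736}

# Probability of being a hospital patient for each (D, E) combination, from the slide.
hospital_risk = {(1, 1): 0.6, (1, 0): 0.3, (0, 1): 0.2, (0, 0): 0.1}


def risk_ratio(table):
    """Risk of D among the exposed (E=1) divided by the risk among the unexposed (E=0)."""
    risk_exposed = table[(1, 1)] / (table[(1, 1)] + table[(0, 1)])
    risk_unexposed = table[(1, 0)] / (table[(1, 0)] + table[(0, 0)])
    return risk_exposed / risk_unexposed


# Expected counts inside and outside the hospital.
hospital = {cell: n * hospital_risk[cell] for cell, n in population.items()}
no_hospital = {cell: n * (1 - hospital_risk[cell]) for cell, n in population.items()}

# Share of the population that ends up in the hospital sample.
response = sum(hospital.values()) / sum(population.values())

rr_population = risk_ratio(population)
rr_hospital = risk_ratio(hospital)
rr_no_hospital = risk_ratio(no_hospital)

print(f"response    = {response:.0%}")
print(f"population  RR = {rr_population:.1f}")
print(f"hospital    RR = {rr_hospital:.1f}")
print(f"no hospital RR = {rr_no_hospital:.1f}")
```

Both strata of the collider H give a risk ratio below the population value of about 2.0, which matches the slide's point that analyzing only hospital patients and excluding hospital patients are both biased in this setup.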