Causal Graphs,
epi forum
Hein Stigum
http://folk.uio.no/heins/
talks
Apr-15
H.S.
1
Agenda
• Motivating examples
• Concepts
– Confounder, Collider
• Analyzing DAGs
– Paths
• Examples
– Confounding
– Mixed (confounders and mediators)
– Selection bias
Why causal graphs?
• Problem
– Association measures are biased
• Understanding
– Confounding, selection bias, mediators
• Analysis
– Adjust or not
• Discussion
– Precise statement of prior assumptions
Motivating examples
• Statins and coronary heart disease
– Disease risk: lifestyle, cholesterol
• Diabetes and fractures
– Disease risk: fall, bone density
– Exposure risk: BMI, Physical activity
Adjust or not?
• Diabetes and fractures
– Analyze among hospital patients
– Exclude hospital patients
Exclude or not?
Causal versus casual
Concepts
god-DAG
DAG = Directed Acyclic Graph
Node = variable
Arrow = cause (at least one individual effect)
[Figure: DAG with nodes C (age), U (obesity), E (vitamin), D (birth defects)]
Read off the DAG:
Causality = arrows
Associations = paths
Questions on the DAG:
E-D effect biased?
Adjust for age?
Association and Cause
Association: Yellow fingers and Lung cancer
Possible causal structure:
• Cause: Yellow fingers → Lung cancer
• Confounder: Yellow fingers ← Smoke → Lung cancer
• Collider: Yellow fingers → Hospital ← Lung cancer
Confounder idea
A common cause: Yellow fingers ← Smoking → Lung cancer (both effects +); induced association between Yellow fingers and Lung cancer: +
Adjust for smoking: Yellow fingers ← [Smoking] → Lung cancer (both effects +)
• A confounder induces an association between its effects
• Conditioning on a confounder removes the association
• Condition = (restrict, stratify, adjust)
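The confounder idea can be checked with a small exact calculation. The sketch below is illustrative only: the prevalence of smoking and the two conditional risks are hypothetical numbers, not values from the slides, and yellow fingers are given no effect on lung cancer at all.

```python
from itertools import product

# Hypothetical probabilities (assumptions for illustration, not from the slides).
P_SMOKE = 0.3
P_YELLOW = {1: 0.6, 0: 0.1}   # P(yellow fingers | smoking)
P_CANCER = {1: 0.2, 0: 0.02}  # P(lung cancer | smoking); no arrow from yellow fingers


def joint():
    """Exact joint distribution over (smoke, yellow, cancer)."""
    dist = {}
    for s, y, c in product((0, 1), repeat=3):
        p_s = P_SMOKE if s else 1 - P_SMOKE
        p_y = P_YELLOW[s] if y else 1 - P_YELLOW[s]
        p_c = P_CANCER[s] if c else 1 - P_CANCER[s]
        dist[(s, y, c)] = p_s * p_y * p_c
    return dist


def risk_ratio(dist, smoke=None):
    """RR of cancer for yellow vs. non-yellow fingers, optionally within one smoking stratum."""
    def risk(y):
        rows = [(k, p) for k, p in dist.items()
                if k[1] == y and (smoke is None or k[0] == smoke)]
        total = sum(p for _, p in rows)
        cases = sum(p for k, p in rows if k[2] == 1)
        return cases / total
    return risk(1) / risk(0)


dist = joint()
crude_rr = risk_ratio(dist)               # confounded: clearly above 1
rr_smokers = risk_ratio(dist, smoke=1)    # stratified on the confounder: 1
rr_nonsmokers = risk_ratio(dist, smoke=0)
print(f"crude RR = {crude_rr:.2f}")
print(f"RR among smokers = {rr_smokers:.2f}, among non-smokers = {rr_nonsmokers:.2f}")
```

Stratifying is used here as the form of conditioning because it needs no model; restriction or regression adjustment would follow the same logic.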
Collider idea
Two causes for coming to hospital: Yellow fingers → Hospital ← Lung cancer (both effects +)
Select subjects in hospital: Yellow fingers → [Hospital] ← Lung cancer (both effects +); induced association between Yellow fingers and Lung cancer: − with "or" selection, + with "and" selection
• Conditioning on a collider induces an association between its causes
• “And” and “or” selection lead to different bias
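A minimal sketch of the collider idea, again with exact arithmetic. Two independent causes A and B (standing in for yellow fingers and lung cancer) both raise the chance of hospital admission. All prevalences and admission probabilities are hypothetical; "or-like" and "and-like" are one possible reading of the two selection mechanisms named on the slide.

```python
from itertools import product


def odds_ratio_among_selected(p_a, p_b, p_hosp):
    """Odds ratio between A and B among hospital patients (H = 1).

    A and B are independent in the population; p_hosp[(a, b)] = P(H=1 | A=a, B=b).
    """
    cell = {}
    for a, b in product((0, 1), repeat=2):
        pa = p_a if a else 1 - p_a
        pb = p_b if b else 1 - p_b
        cell[(a, b)] = pa * pb * p_hosp[(a, b)]  # P(A=a, B=b, H=1)
    return (cell[(1, 1)] * cell[(0, 0)]) / (cell[(1, 0)] * cell[(0, 1)])


# Hypothetical admission probabilities (assumptions, not from the slides).
OR_LIKE = {(0, 0): 0.05, (1, 0): 0.50, (0, 1): 0.50, (1, 1): 0.75}   # either cause is enough
AND_LIKE = {(0, 0): 0.05, (1, 0): 0.10, (0, 1): 0.10, (1, 1): 0.90}  # both causes needed

or_selected = odds_ratio_among_selected(0.2, 0.1, OR_LIKE)
and_selected = odds_ratio_among_selected(0.2, 0.1, AND_LIKE)
print(f"'or' selection:  OR among selected = {or_selected:.2f}")   # below 1: negative association
print(f"'and' selection: OR among selected = {and_selected:.2f}")  # above 1: positive association
```

The odds ratio in the whole population is 1 by construction, so any departure from 1 among the selected is produced by conditioning on the collider.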
Data driven analysis
[Figure: undirected graph linking C, E and D]
Want the effect of E on D (E precedes D)
Observe the two associations E-C and D-C
Assume a statistical criterion dictates adjusting for C
(likelihood ratio, Akaike (赤池 弘次) or change in estimate)
The undirected graph above is compatible with three DAGs:
• Confounder (E ← C → D): 1. Adjust
• Mediator (E → C → D): 2. Adjust (direct), 3. Not adjust (total)
• Collider (E → C ← D): 4. Not adjust
Conclusion:
The data driven method is correct in 2 out of 4 situations
Need information from outside the data to do a proper analysis
The Path of the Righteous
Analyzing DAGs: Paths
Path definitions
Path: any trail from E to D (without repeating or crossing itself)
Type: causal, non-causal
State: open, closed
[Figure: DAG with nodes E, D, M, C, K]
Four paths:
Path
1  E→D
2  E→M→D
3  E←C→D
4  E→K←D
Goal:
Keep causal paths of interest open
Close all non-causal paths
Four rules
[Figure: DAG with nodes E, D, M, C, K; paths labelled causal / non-causal and open / closed]
1. Causal path: E→D
(all arrows in the same direction), otherwise non-causal
Before conditioning:
2. Closed path: K
(closed at a collider, otherwise open)
Conditioning on:
3. a non-collider closes: [M] or [C]
4. a collider opens: [K]
(or a descendant of a collider)
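The four rules are mechanical enough to be sketched in code. The function below is a hypothetical helper, not part of any library: it lists every path between two nodes of a DAG and applies rules 1 to 4. The edge list is an assumed reading of the example DAG (M as mediator, C as confounder, K as collider).

```python
def analyze_paths(edges, start, end, conditioned=()):
    """Return (path, type, status) for every path between start and end.

    edges is an iterable of (cause, effect) pairs describing a DAG.
    """
    edges = set(edges)
    conditioned = set(conditioned)
    children, neighbours = {}, {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def descendants(node):
        seen, stack = set(), [node]
        while stack:
            for child in children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def simple_paths(node, visited):
        # Walk the graph ignoring arrow direction, never revisiting a node.
        if node == end:
            yield visited
            return
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in visited:
                yield from simple_paths(nxt, visited + [nxt])

    results = []
    for path in simple_paths(start, [start]):
        # Rule 1: causal if every arrow points from start towards end.
        causal = all((a, b) in edges for a, b in zip(path, path[1:]))
        is_open = True
        for prev, node, nxt in zip(path, path[1:], path[2:]):
            if (prev, node) in edges and (nxt, node) in edges:
                # Rules 2 and 4: a collider closes the path unless it,
                # or one of its descendants, is conditioned on.
                if not ({node} | descendants(node)) & conditioned:
                    is_open = False
            elif node in conditioned:
                # Rule 3: conditioning on a non-collider closes the path.
                is_open = False
        parts = [path[0]]
        for a, b in zip(path, path[1:]):
            arrow = "→" if (a, b) in edges else "←"
            parts.append(arrow + (f"[{b}]" if b in conditioned else b))
        results.append(("".join(parts),
                        "causal" if causal else "non-causal",
                        "open" if is_open else "closed"))
    return results


# Assumed arrows for the example DAG with nodes E, D, M, C, K.
EDGES = [("E", "D"), ("E", "M"), ("M", "D"),
         ("C", "E"), ("C", "D"), ("E", "K"), ("D", "K")]

for row in analyze_paths(EDGES, "E", "D"):
    print(*row)
for row in analyze_paths(EDGES, "E", "D", conditioned={"C"}):
    print(*row)
```

Enumerating paths is fine for slide-sized DAGs; for large graphs a dedicated d-separation routine would be the better choice.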
Confounding
Physical activity and Coronary Heart Disease (CHD)
[Figure: DAG with nodes C1 (age), C2 (sex), E (Phys. Act.), D (CHD)]
1. We want the total effect of Physical Activity on CHD. What should we adjust for?
Unconditional
Path          Type        Status
1 E→D         Causal      Open
2 E←C1→D      Non-causal  Open
3 E←C2→D      Non-causal  Open
Bias
Conditioning on C1 and C2
Path          Type        Status
1 E→D         Causal      Open
2 E←[C1]→D    Non-causal  Closed
3 E←[C2]→D    Non-causal  Closed
No bias
Vitamin and birth defects
[Figure: DAG with nodes C (age), U (obesity), E (vitamin), D (birth defects)]
Bias in E-D?
Adjust for C?
Unconditional
Path          Type        Status
1 E→D         Causal      Open
2 E←C→U→D     Non-causal  Open
Bias
Conditioning on C
Path          Type        Status
1 E→D         Causal      Open
2 E←[C]→U→D   Non-causal  Closed
No bias
This example and the previous slide are both confounding
Confounders and mediators
Mixed
Diabetes and Fractures
[Figure: DAG with nodes E (diabetes), D (fracture), F (prone to fall), B (bone density), V (BMI), P (physical activity)]
We want the total effect of diabetes on fractures
Unconditional
Path            Type        Status
1 E→D           Causal      Open
2 E→F→D         Causal      Open
3 E→B→D         Causal      Open
4 E←V→B→D       Non-causal  Open
5 E←P→B→D       Non-causal  Open
Conditional (on V and P)
Path            Type        Status
1 E→D           Causal      Open
2 E→F→D         Causal      Open
3 E→B→D         Causal      Open
4 E←[V]→B→D     Non-causal  Closed
5 E←[P]→B→D     Non-causal  Closed
Mediators: F, B
Confounders: V, P
Statin and CHD
[Figure: DAG with nodes U (lifestyle), C (cholesterol), E (statin), D (CHD)]
1. We want the total effect of statin on CHD. What would we adjust for?
2. Can we estimate the direct effect of statin on CHD (not mediated through cholesterol)?
Unconditional
Path          Type        Status
1 E→D         Causal      Open
2 E→C→D       Causal      Open
3 E→C←U→D     Non-causal  Closed
No adjustment gives the total effect
Is C a collider?
Conditioning on C
Path          Type        Status
1 E→D         Causal      Open
2 E→[C]→D     Causal      Closed
3 E→[C]←U→D   Non-causal  Open
Adjusting for C opens the collider path; must also adjust for U to get the direct effect
Selection bias
Diabetes and Fractures
[Figure: DAG with nodes E (diabetes), D (fracture), H (hospital)]
1. Convenience: Conduct the study among hospital patients?
2. Homogeneous sample: Exclude hospital patients
Unconditional
Path          Type        Status
1 E→D         Causal      Open
2 E→H←D       Non-causal  Closed
Conditional (on H)
Path          Type        Status
1 E→D         Causal      Open
2 E→[H]←D     Non-causal  Open
Collider, selection bias
Collider stratification bias: at least one stratum is biased
Selection bias: size and direction
[Figure: DAG E→D (RR 2.0), E→H (RR 2.0), D→H (RR 3.0)]
Hospital risk:
        D=1   D=0
E=1     0.6   0.2
E=0     0.3   0.1
Response = 16 %
Population
        D=1   D=0   sum
E=1      36    64   100
E=0     164   736   900
                   1000
RR = 2.0
Hospital
        D=1   D=0   sum
E=1      22    13    35
E=0      49    74   123
                    157
RR = 1.6
No hospital
        D=1   D=0   sum
E=1      15    51    65
E=0     115   663   777
                    843
RR = 1.5
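The numbers on this slide can be reproduced from the population table and the hospital risks alone. The sketch below multiplies each population cell by its hospital risk to get expected counts; these are not rounded, so individual cells can differ slightly from the integers shown on the slide.

```python
POPULATION = {  # counts by (E, D), from the Population table
    (1, 1): 36, (1, 0): 64,
    (0, 1): 164, (0, 0): 736,
}
HOSPITAL_RISK = {  # P(H=1 | E, D), from the Hospital risk table
    (1, 1): 0.6, (1, 0): 0.2,
    (0, 1): 0.3, (0, 0): 0.1,
}


def risk_ratio(counts):
    """Risk of D among E=1 divided by risk of D among E=0."""
    risk_exposed = counts[(1, 1)] / (counts[(1, 1)] + counts[(1, 0)])
    risk_unexposed = counts[(0, 1)] / (counts[(0, 1)] + counts[(0, 0)])
    return risk_exposed / risk_unexposed


# Expected counts inside and outside hospital.
hospital = {k: n * HOSPITAL_RISK[k] for k, n in POPULATION.items()}
no_hospital = {k: n * (1 - HOSPITAL_RISK[k]) for k, n in POPULATION.items()}
response = sum(hospital.values()) / sum(POPULATION.values())

rr_population = risk_ratio(POPULATION)
rr_hospital = risk_ratio(hospital)
rr_no_hospital = risk_ratio(no_hospital)

print(f"Response      = {response:.0%}")
print(f"RR population = {rr_population:.1f}")
print(f"RR hospital   = {rr_hospital:.1f}")
print(f"RR no hospital= {rr_no_hospital:.1f}")
```

Both strata of H give a risk ratio below the population value, which illustrates that neither restricting to hospital patients nor excluding them avoids the bias.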
Adjusting for selection bias
[Figure: DAG with nodes E (diabetes), D (fracture), F (prone to fall), H (hospital)]
Path              Type        Status
1 E→D             Causal      Open
2 E→F→[H]←D       Non-causal  Open
Adjust for F to close this path
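A small numeric sketch of why adjusting for F helps, under stated assumptions. All probabilities are hypothetical, the DAG is read as E→D, E→F, F→H, D→H with no arrow from F to D, and the odds ratio is used as the association measure because, under this DAG, selection within a stratum of F depends on D only, which leaves the odds ratio unchanged.

```python
from itertools import product

# Hypothetical parameters (assumptions for illustration, not from the slides).
P_E = 0.1
P_D = {1: 0.36, 0: 0.18}  # P(D=1 | E)
P_F = {1: 0.5, 0: 0.2}    # P(F=1 | E)
P_H = {(1, 1): 0.8, (1, 0): 0.4, (0, 1): 0.3, (0, 0): 0.05}  # P(H=1 | F, D)


def hospital_cells(f_stratum=None):
    """P(E=e, D=d, H=1), optionally restricted to one stratum of F."""
    cells = {(e, d): 0.0 for e, d in product((0, 1), repeat=2)}
    for e, d, f in product((0, 1), repeat=3):
        if f_stratum is not None and f != f_stratum:
            continue
        p_e = P_E if e else 1 - P_E
        p_d = P_D[e] if d else 1 - P_D[e]
        p_f = P_F[e] if f else 1 - P_F[e]
        cells[(e, d)] += p_e * p_d * p_f * P_H[(f, d)]
    return cells


def odds_ratio(cells):
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


population_or = (P_D[1] / (1 - P_D[1])) / (P_D[0] / (1 - P_D[0]))
crude_or = odds_ratio(hospital_cells())          # hospital patients, F ignored
or_f1 = odds_ratio(hospital_cells(f_stratum=1))  # hospital patients, F = 1
or_f0 = odds_ratio(hospital_cells(f_stratum=0))  # hospital patients, F = 0

print(f"population OR          = {population_or:.2f}")
print(f"hospital, unadjusted   = {crude_or:.2f}")
print(f"hospital, F=1 and F=0  = {or_f1:.2f}, {or_f0:.2f}")
```

With these numbers the unadjusted hospital estimate is biased, while both F strata return the population odds ratio.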
Summing up
• Data driven analyses do not work. Need (causal) information from outside the data.
• DAGs are intuitive and accurate tools to display that information.
• Paths show the flow of causality and of bias and guide the analysis.
• DAGs clarify concepts like confounding and selection bias, and show that we can adjust for both.
Better discussion based on DAGs