OUTLINE THE SYMBIOTIC APPROACH TO CAUSAL INFERENCE

hendricks
12:00 pm, Apr 28, 2010
THE SYMBIOTIC APPROACH
TO CAUSAL INFERENCE
Judea Pearl
University of California
Los Angeles
www.cs.ucla.edu/~judea/
Causality blog: www.mii.ucla.edu/causality/
OUTLINE
• Inference: Statistical vs. Causal, distinctions, and mental barriers
• Unified conceptualization of counterfactuals, structural-equations, and graphs
• The logical equivalence of SEM and potential outcomes
• How to use the best features of both
• Direct and Indirect effects
• The Mediation Formula and its ramifications
TRADITIONAL STATISTICAL
INFERENCE PARADIGM
[Diagram: Joint Distribution P → Data → Inference → Q(P) (Aspects of P)]
e.g.,
Infer whether customers who bought product A
would also buy product B.
Q = P(B | A)
FROM STATISTICAL TO CAUSAL ANALYSIS:
1. THE DIFFERENCES
Probability and statistics deal with static relations
What happens when P changes?
e.g.,
Infer whether customers who bought product A
would still buy A if we were to double the price.
FROM STATISTICAL TO CAUSAL ANALYSIS:
1. THE DIFFERENCES (CONT)
[Diagram: Joint Distribution P → (change) → Joint Distribution P′; Data; Inference → Q(P′) (Aspects of P′)]
What remains invariant when P changes, say, to satisfy
P′(price = 2) = 1?
Note: P′(v) ≠ P(v | price = 2)
P does not tell us how it ought to change
e.g. Curing symptoms vs. curing diseases
e.g. Analogy: mechanical deformation
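The inequality P′(v) ≠ P(v | price = 2) can be checked on a toy model. The sketch below is only an illustration under stated assumptions: a hypothetical hidden "demand" variable influences both the price a seller sets and the chance of a purchase, and every number in the tables is made up. Conditioning on price = 2 and setting price = 2 then give different purchase probabilities.

```python
# Hypothetical toy model (all numbers are illustrative assumptions):
# demand -> price, demand -> buy, price -> buy.
P_DEMAND = {0: 0.5, 1: 0.5}               # P(demand)
P_PRICE2_GIVEN_DEMAND = {0: 0.2, 1: 0.8}  # P(price = 2 | demand)
# P(buy = 1 | demand, price), keyed by (demand, price)
P_BUY_GIVEN = {(0, 1): 0.5, (0, 2): 0.1, (1, 1): 0.9, (1, 2): 0.6}


def p_buy_given_price(price):
    """Observational quantity P(buy = 1 | price), computed from the joint P."""
    num = den = 0.0
    for d, pd in P_DEMAND.items():
        p2 = P_PRICE2_GIVEN_DEMAND[d]
        p_price = p2 if price == 2 else 1 - p2
        num += pd * p_price * P_BUY_GIVEN[(d, price)]
        den += pd * p_price
    return num / den


def p_buy_do_price(price):
    """Interventional quantity P'(buy = 1) after the price is set by fiat.

    Setting the price cuts the demand -> price dependence, so demand keeps
    its original distribution P(demand).
    """
    return sum(pd * P_BUY_GIVEN[(d, price)] for d, pd in P_DEMAND.items())


if __name__ == "__main__":
    print("P(buy | price = 2)     =", p_buy_given_price(2))
    print("P(buy | do(price = 2)) =", p_buy_do_price(2))
```

In this toy model the two quantities differ because seeing a high price is evidence of high demand, while imposing a high price is not.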
FROM STATISTICAL TO CAUSAL ANALYSIS:
2. MENTAL BARRIERS
1. Causal and statistical concepts do not mix.
CAUSAL
Spurious correlation
Randomization / Intervention
Confounding / Effect
Instrumental variable
Exogeneity / Ignorability
Mediation
STATISTICAL
Regression
Association / Independence
“Controlling for” / Conditioning
Odds and risk ratios
Collapsibility / Granger causality
Propensity score
2. No causes in – no causes out (Cartwright, 1989)
statistical assumptions + data + causal assumptions ⇒ causal conclusions
3. Causal assumptions cannot be expressed in the mathematical
language of standard statistics.
4. Non-standard mathematics:
a) Structural equation models (Wright, 1920; Simon, 1960)
b) Counterfactuals (Neyman-Rubin ($Y_x$), Lewis ($x \,\Box\!\!\rightarrow Y$))
THE STRUCTURAL MODEL
PARADIGM
[Diagram: Data Generating Model M → Joint Distribution → Data → Inference → Q(M) (Aspects of M)]
M – Invariant strategy (mechanism, recipe, law,
protocol) by which Nature assigns values to
variables in the analysis.
• “Think Nature, not experiment!”
STRUCTURAL
CAUSAL MODELS
Definition: A structural causal model is a 4-tuple
⟨V, U, F, P(u)⟩, where
• V = {V1,...,Vn} are endogenous variables
• U = {U1,...,Um} are background variables
• F = {f1,..., fn} are functions determining V,
vi = fi(v, u), e.g., y = α + βx + uY
• P(u) is a distribution over U
P(u) and F induce a distribution P(v) over
observable variables
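The 4-tuple ⟨V, U, F, P(u)⟩ can be written down directly as data and functions. The following is a minimal sketch, assuming a hypothetical two-variable model with binary background variables; the names `P_U`, `F`, `ORDER`, `solve`, and `induced_distribution` are illustrative and not part of the slides. It shows how P(u) and F together induce P(v) by enumerating every u.

```python
from itertools import product

# Hypothetical toy SCM; every name and number here is an illustrative assumption.
# U: independent binary background variables, given as P(U_i = 1).
P_U = {"u_x": 0.5, "u_y": 0.2}

# F: one function per endogenous variable, v_i = f_i(v, u).
F = {
    "X": lambda v, u: u["u_x"],
    "Y": lambda v, u: v["X"] ^ u["u_y"],  # Y equals X unless u_y flips it
}
ORDER = ["X", "Y"]  # evaluation order of this recursive (acyclic) model


def prob_u(u):
    """P(u) for one assignment of the independent background variables."""
    p = 1.0
    for name, val in u.items():
        p *= P_U[name] if val == 1 else 1 - P_U[name]
    return p


def solve(u):
    """Solve the structural equations for V given the input U = u."""
    v = {}
    for name in ORDER:
        v[name] = F[name](v, u)
    return v


def induced_distribution():
    """P(v): push P(u) through F by enumerating every value of u."""
    dist = {}
    for vals in product([0, 1], repeat=len(P_U)):
        u = dict(zip(P_U, vals))
        v = solve(u)
        key = tuple(v[name] for name in ORDER)
        dist[key] = dist.get(key, 0.0) + prob_u(u)
    return dist


if __name__ == "__main__":
    for (x, y), p in sorted(induced_distribution().items()):
        print(f"P(X={x}, Y={y}) = {p:.2f}")
```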
CAUSAL MODELS AND
COUNTERFACTUALS
Definition:
The sentence: “Y would be y (in unit u), had X been x,”
denoted $Y_x(u) = y$, means:
The solution for Y in a mutilated model $M_x$ (i.e., the equations
for X replaced by X = x) with input U = u, is equal to y.
[Diagram: model M with input U and outputs X(u), Y(u); mutilated model $M_x$ with input U, X = x, and output $Y_x(u)$]
The Fundamental Equation of Counterfactuals:
$Y_x(u) = Y_{M_x}(u)$
CAUSAL MODELS AND
COUNTERFACTUALS
• Joint probabilities of counterfactuals:
$P(Y_x = y, Z_w = z) = \sum_{u:\, Y_x(u) = y,\, Z_w(u) = z} P(u)$
In particular:
$P(y \mid do(x)) \triangleq P(Y_x = y) = \sum_{u:\, Y_x(u) = y} P(u)$
$PN(Y_{x'} = y' \mid x, y) = \sum_{u:\, Y_{x'}(u) = y'} P(u \mid x, y)$
TWO PARADIGMS FOR
CAUSAL INFERENCE
Observed: P(X, Y, Z,...)
Conclusions needed: $P(Y_x = y)$, $P(X_y = x \mid Z = z)$...
How do we connect observables, X, Y, Z,…
to counterfactuals $Y_x, X_z, Z_y$,…?
N-R model
• Counterfactuals are primitives, new variables
• Super-distribution $P^*(X, Y, \ldots, Y_x, X_z, \ldots)$
• X, Y, Z constrain $Y_x, Z_y, \ldots$
Structural model
• Counterfactuals are derived quantities
• Subscripts modify the model and distribution: $P(Y_x = y) = P_{M_x}(Y = y)$
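In the structural reading, counterfactual probabilities are sums of P(u) over the units u whose mutilated-model solution has the required value. The sketch below illustrates this on a hypothetical binary model (the equations, the names `solve`, `p_counterfactual`, `prob_necessity`, and all numbers are assumptions made for the example). It evaluates P(Y_x = y) and the quantity PN(Y_x′ = y′ | x, y) by enumeration.

```python
from itertools import product

# Hypothetical toy model; equations and numbers are illustrative assumptions.
# Independent binary background variables, given as P(U_i = 1).
P_U = {"u_x": 0.3, "u_a": 0.5, "u_b": 0.2}


def p_u(u):
    """P(u) for one assignment of the background variables."""
    p = 1.0
    for name, val in u.items():
        p *= P_U[name] if val else 1 - P_U[name]
    return p


def all_u():
    """Enumerate every assignment u of the background variables."""
    for vals in product([0, 1], repeat=len(P_U)):
        yield dict(zip(P_U, vals))


def solve(u, do=None):
    """Solve M for U = u, or the mutilated model M_x when do={'X': x}.

    An equation is replaced by a constant only for the variables in `do`.
    """
    do = do or {}
    x = do["X"] if "X" in do else u["u_x"]
    y = do["Y"] if "Y" in do else (x & u["u_a"]) | u["u_b"]
    return {"X": x, "Y": y}


def p_counterfactual(x, y):
    """P(Y_x = y): sum of P(u) over the u with Y_x(u) = y."""
    return sum(p_u(u) for u in all_u() if solve(u, do={"X": x})["Y"] == y)


def prob_necessity(x=1, y=1, x_alt=0, y_alt=0):
    """PN: P(Y_x' = y' | X = x, Y = y), summing P(u | x, y) over matching u."""
    evidence = [u for u in all_u() if solve(u) == {"X": x, "Y": y}]
    p_evidence = sum(p_u(u) for u in evidence)
    hits = sum(p_u(u) for u in evidence
               if solve(u, do={"X": x_alt})["Y"] == y_alt)
    return hits / p_evidence


if __name__ == "__main__":
    print("P(Y_1 = 1) =", p_counterfactual(1, 1))
    print("P(Y_0 = 1) =", p_counterfactual(0, 1))
    print("PN         =", prob_necessity())
```

Note that the conditioning on (x, y) uses the unmutilated model M, while the counterfactual clause uses M_x′ with the same u; this is what "subscripts modify the model" amounts to in code.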
ARE THE TWO
PARADIGMS EQUIVALENT?
• Yes (Galles and Pearl, 1998; Halpern 1998)
• In the N-R paradigm, $Y_x$ is defined by consistency:
$Y = xY_1 + (1 - x)Y_0$
• In SCM, consistency is a theorem.
• Moreover, a theorem in one approach is a theorem in the other.
THE FIVE NECESSARY STEPS
OF CAUSAL ANALYSIS
Define: Express the target quantity Q as a function Q(M) that can be computed from any model M.
Assume: Formulate causal assumptions A using some formal language.
Identify: Determine if Q is identifiable given A.
Estimate: Estimate Q if it is identifiable; approximate it, if it is not.
Test: Test the testable implications of A (if any).
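The statement that consistency is a theorem in SCM can be made concrete with a short argument. What follows is a sketch only, under the added assumption that the model is recursive, so that M and every mutilated model have a unique solution for each u.

```latex
% Sketch, assuming a recursive SCM M with a unique solution for every u.
\begin{align*}
&\text{Suppose } X(u) = x \text{ in } M. \\
&\text{The solution of } M \text{ at } u \text{ satisfies every equation of } M
  \text{ and has } X = x, \\
&\text{so it also satisfies every equation of } M_x
  \text{ (the same equations, with the one for } X \text{ replaced by } X = x\text{)}. \\
&\text{By uniqueness of the solution of } M_x:\quad
  Y_x(u) = Y_{M_x}(u) = Y(u). \\
&\text{Hence, for binary } X \in \{0, 1\}:\quad
  Y(u) = X(u)\,Y_1(u) + \bigl(1 - X(u)\bigr)\,Y_0(u).
\end{align*}
```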
THE FIVE NECESSARY STEPS FOR
EFFECT ESTIMATION
Define: Express the target quantity Q as a function Q(M) that can be computed from any model M.
$ATE \triangleq E(Y \mid do(x_1)) - E(Y \mid do(x_0))$
Assume: Formulate causal assumptions A using some formal language.
Identify: Determine if Q is identifiable given A.
Estimate: Estimate Q if it is identifiable; approximate it, if it is not.
Test: Test the testable implications of A (if any).
FORMULATING ASSUMPTIONS
THREE LANGUAGES
1. English: Smoking (X), Cancer (Y), Tar (Z), Genotypes (U)
2. Counterfactuals:
$Z_x(u) = Z_{yx}(u)$,
$X_y(u) = X_{zy}(u) = X_z(u) = X(u)$,
$Y_z(u) = Y_{zx}(u)$,
$Z_x \perp\!\!\!\perp \{Y_z, X\}$
Not too friendly:
consistent? complete? redundant? arguable?
3. Structural:
[Diagram: causal graph with nodes X, Z, Y]
IDENTIFYING CAUSAL EFFECTS
IN POTENTIAL-OUTCOME FRAMEWORK
Define: Express the target quantity Q as a counterfactual formula, e.g., E(Y(1) – Y(0))
Assume: Formulate causal assumptions using the distribution:
P* = P(X | Y, Z, Y(1), Y(0))
Identify: Determine if Q is identifiable using P* and Y = x Y(1) + (1 – x) Y(0).
Estimate: Estimate Q if it is identifiable; approximate it, if it is not.
IDENTIFICATION IN SCM
Find the effect of X on Y, P(y|do(x)), given the
causal assumptions shown in G, where Z1,..., Zk
are auxiliary variables.
[Diagram: graph G with nodes Z1, Z2, Z3, Z4, Z5, Z6, X, Y]
Can P(y|do(x)) be estimated if only a subset, Z,
can be measured?
ELIMINATING CONFOUNDING BIAS
THE BACK-DOOR CRITERION
P(y | do(x)) is estimable if there is a set Z of
variables such that Z d-separates X from Y in Gx.
[Diagram: graphs G and Gx, each with nodes Z1, Z2, Z3, Z4, Z5, Z6, X, Y; the set Z is marked]
Moreover, $P(y \mid do(x)) = \sum_z P(y \mid x, z)\, P(z) = \sum_z \frac{P(x, y, z)}{P(x \mid z)}$
• (“adjusting” for Z) → Ignorability
EFFECT OF WARM-UP ON INJURY
(After Shrier & Platt, 2008)
[Diagram: causal graph from Warm-up Exercises (X) to Injury (Y), annotated “Watch out!”, “???”, “Front Door”, “No, no!”]
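The adjustment formula from the back-door slide, P(y | do(x)) = Σ_z P(y | x, z) P(z) = Σ_z P(x, y, z) / P(x | z), can be evaluated from any joint table over (X, Y, Z). The sketch below assumes a hypothetical model with a single binary confounder Z that satisfies the back-door criterion (Z → X, Z → Y, X → Y); the helper names and all numbers are made up for illustration.

```python
from itertools import product


def build_joint():
    """Joint P(x, y, z) for a hypothetical model Z -> X, Z -> Y, X -> Y.

    All numbers are illustrative assumptions.
    """
    p_z = {0: 0.6, 1: 0.4}
    p_x1_given_z = {0: 0.3, 1: 0.7}
    # P(Y = 1 | x, z), keyed by (x, z)
    p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.7}
    joint = {}
    for x, y, z in product([0, 1], repeat=3):
        px = p_x1_given_z[z] if x else 1 - p_x1_given_z[z]
        py = p_y1_given_xz[(x, z)] if y else 1 - p_y1_given_xz[(x, z)]
        joint[(x, y, z)] = p_z[z] * px * py
    return joint


def marginal(joint, **fixed):
    """Sum the joint over every cell that matches the fixed x / y / z values."""
    idx = {"x": 0, "y": 1, "z": 2}
    return sum(p for key, p in joint.items()
               if all(key[idx[name]] == val for name, val in fixed.items()))


def backdoor_adjustment(joint, x, y):
    """sum_z P(y | x, z) P(z)"""
    total = 0.0
    for z in (0, 1):
        p_y_given_xz = marginal(joint, x=x, y=y, z=z) / marginal(joint, x=x, z=z)
        total += p_y_given_xz * marginal(joint, z=z)
    return total


def inverse_probability_form(joint, x, y):
    """sum_z P(x, y, z) / P(x | z)"""
    total = 0.0
    for z in (0, 1):
        p_x_given_z = marginal(joint, x=x, z=z) / marginal(joint, z=z)
        total += marginal(joint, x=x, y=y, z=z) / p_x_given_z
    return total


if __name__ == "__main__":
    joint = build_joint()
    print("adjusted   :", backdoor_adjustment(joint, 1, 1))
    print("1/P(x|z)   :", inverse_probability_form(joint, 1, 1))
    print("unadjusted :", marginal(joint, x=1, y=1) / marginal(joint, x=1))
```

The two forms of the formula agree with each other, and both differ from the unadjusted P(y | x) whenever Z confounds X and Y.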
GRAPHICAL – COUNTERFACTUALS
SYMBIOSIS
Every causal graph expresses counterfactual
assumptions, e.g., X → Y → Z
1. Missing arrows Y ← Z: $Y_{x,z}(u) = Y_x(u)$
2. Missing arcs between Y and Z: $Y_x \perp\!\!\!\perp Z_y$
consistent, and readable from the graph.
• Express assumption in graphs
• Derive estimands by graphical or algebraic
methods
EFFECT DECOMPOSITION
(direct vs. indirect effects)
1. Why decompose effects?
2. What is the definition of direct and indirect
effects?
3. What are the policy implications of direct and
indirect effects?
4. When can direct and indirect effects be
estimated consistently from experimental and
nonexperimental data?
WHY DECOMPOSE EFFECTS?
1. To understand how Nature works
2. To comply with legal requirements
3. To predict the effects of new types of interventions:
Signal routing, rather than variable fixing
LEGAL IMPLICATIONS
OF DIRECT EFFECT
Can data prove an employer guilty of hiring discrimination?
[Diagram: causal graph with nodes (Gender) X, Z (Qualifications), Y (Hiring)]
What is the direct effect of X on Y?
$E(Y \mid do(x_1), do(z)) - E(Y \mid do(x_0), do(z))$
(averaged over z) Adjust for Z? No! No!
NATURAL INTERPRETATION OF
AVERAGE DIRECT EFFECTS
Robins and Greenland (1992) – “Pure”
[Diagram: causal graph with nodes X, Z, Y]
z = f (x, u)
y = g (x, z, u)
Natural Direct Effect of X on Y: $DE(x_0, x_1; Y)$
The expected change in Y, when we change X from x0 to
x1 and, for each u, we keep Z constant at whatever value it
attained before the change.
$E[Y_{x_1 Z_{x_0}} - Y_{x_0}]$
In linear models, DE = Controlled Direct Effect = $\beta(x_1 - x_0)$
DEFINITION OF
INDIRECT EFFECTS
[Diagram: causal graph with nodes X, Z, Y]
z = f (x, u)
y = g (x, z, u)
Indirect Effect of X on Y: $IE(x_0, x_1; Y)$
The expected change in Y when we keep X constant, say
at x0, and let Z change to whatever value it would have
attained had X changed to x1.
$E[Y_{x_0 Z_{x_1}} - Y_{x_0}]$
In linear models, IE = TE - DE
POLICY IMPLICATIONS
OF INDIRECT EFFECTS
What is the indirect effect of X on Y?
The effect of Gender on Hiring if sex discrimination
is eliminated.
[Diagram: causal graph with nodes GENDER X, QUALIFICATION Z, HIRING Y; labels “IGNORE” and “f”]
Blocking a link – a new type of intervention
MEDIATION FORMULAS
1. The natural direct and indirect effects are
identifiable in models with no unobserved
confounders, and are given by:
$DE = \sum_z [E(Y \mid do(x_1, z)) - E(Y \mid do(x_0, z))]\, P(z \mid do(x_0))$
$IE = \sum_z E(Y \mid do(x_0, z))\, [P(z \mid do(x_1)) - P(z \mid do(x_0))]$
2. Applicable to linear and non-linear models,
continuous and discrete variables, regardless of
distributional form.
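As a check on the linear-model remarks (DE = β(x1 − x0) and IE = TE − DE), the mediation formulas can be worked through for a linear model. This is a sketch under assumptions not stated on the slides: the coefficients γ and δ are hypothetical names, the disturbances are taken to be zero-mean and independent of X and of each other, and sums over z stand in for integrals when Z is continuous.

```latex
% Assumed linear model (gamma, delta, u_Z, u_Y are illustrative names):
%   z = \gamma x + u_Z,   y = \beta x + \delta z + u_Y
\begin{align*}
E(Y \mid do(x, z)) &= \beta x + \delta z, \qquad E(Z \mid do(x)) = \gamma x \\[4pt]
DE &= \sum_z \bigl[(\beta x_1 + \delta z) - (\beta x_0 + \delta z)\bigr]\, P(z \mid do(x_0))
    = \beta (x_1 - x_0) \\[4pt]
IE &= \sum_z (\beta x_0 + \delta z)\,\bigl[P(z \mid do(x_1)) - P(z \mid do(x_0))\bigr] \\
   &= \delta \bigl[E(Z \mid do(x_1)) - E(Z \mid do(x_0))\bigr]
    = \gamma \delta (x_1 - x_0) \\[4pt]
TE &= E(Y \mid do(x_1)) - E(Y \mid do(x_0))
    = (\beta + \gamma \delta)(x_1 - x_0) = DE + IE
\end{align*}
```

The term βx0 drops out of IE because the two distributions over z each sum to one.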
MEDIATION FORMULAS
IN UNCONFOUNDED MODELS
[Diagram: causal graph with nodes X, Z, Y]
$DE = \sum_z [E(Y \mid x_1, z) - E(Y \mid x_0, z)]\, P(z \mid x_0)$
$IE = \sum_z E(Y \mid x_0, z)\, [P(z \mid x_1) - P(z \mid x_0)]$
$TE = E(Y \mid x_1) - E(Y \mid x_0)$
IE = Fraction of responses explained by mediation
TE − DE = Fraction of responses owed to mediation
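The unconfounded mediation formulas only need E(Y | x, z) and P(z | x), so they apply to any discrete mediator. The sketch below is an illustration with a hypothetical three-level mediator and made-up numbers; the function name `mediation_formula` and the table layout are assumptions of the example. It also shows a case where TE − DE differs from IE.

```python
def mediation_formula(e_y, p_z, x0=0, x1=1):
    """Return (DE, IE, TE) for a discrete mediator Z in an unconfounded model.

    e_y[x][z] = E(Y | x, z) and p_z[x][z] = P(z | x).
    """
    zs = list(p_z[x0])
    de = sum((e_y[x1][z] - e_y[x0][z]) * p_z[x0][z] for z in zs)
    ie = sum(e_y[x0][z] * (p_z[x1][z] - p_z[x0][z]) for z in zs)
    # TE = E(Y | x1) - E(Y | x0), each term expanded over z
    te = (sum(e_y[x1][z] * p_z[x1][z] for z in zs)
          - sum(e_y[x0][z] * p_z[x0][z] for z in zs))
    return de, ie, te


# Hypothetical inputs with a three-level mediator (illustrative numbers only).
E_Y = {0: {0: 0.1, 1: 0.2, 2: 0.4},
       1: {0: 0.3, 1: 0.5, 2: 0.9}}
P_Z = {0: {0: 0.5, 1: 0.3, 2: 0.2},
       1: {0: 0.2, 1: 0.3, 2: 0.5}}

DE, IE, TE = mediation_formula(E_Y, P_Z)

if __name__ == "__main__":
    print(f"DE = {DE:.3f}, IE = {IE:.3f}, TE = {TE:.3f}, TE - DE = {TE - DE:.3f}")
```

With these inputs X and Z interact in their effect on Y, which is why TE − DE and IE come out different.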
COMPUTING THE
MEDIATION FORMULA
Count  X  Z  Y
n1     0  0  0
n2     0  0  1
n3     0  1  0
n4     0  1  1
n5     1  0  0
n6     1  0  1
n7     1  1  0
n8     1  1  1
$E(Y \mid x, z) = g_{xz}$:
$g_{00} = \frac{n_2}{n_1 + n_2}$, $g_{01} = \frac{n_4}{n_3 + n_4}$, $g_{10} = \frac{n_6}{n_5 + n_6}$, $g_{11} = \frac{n_8}{n_7 + n_8}$
$E(Z \mid x) = h_x$:
$h_0 = \frac{n_3 + n_4}{n_1 + n_2 + n_3 + n_4}$, $h_1 = \frac{n_7 + n_8}{n_5 + n_6 + n_7 + n_8}$
$DE = (g_{10} - g_{00})(1 - h_0) + (g_{11} - g_{01})\, h_0$
$IE = (h_1 - h_0)(g_{01} - g_{00})$
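The count-based recipe for binary X, Z, Y can be sketched as a small function. The cell counts below are hypothetical and chosen only so the arithmetic is easy to follow; the function name and the dictionary layout keyed by (x, z, y) are assumptions of this example.

```python
def mediation_from_counts(n):
    """Estimate DE, IE, TE from cell counts n[(x, z, y)] for binary X, Z, Y."""

    def g(x, z):
        # g_xz = E(Y | x, z): share of Y = 1 among units with X = x, Z = z
        return n[(x, z, 1)] / (n[(x, z, 0)] + n[(x, z, 1)])

    def h(x):
        # h_x = E(Z | x): share of Z = 1 among units with X = x
        z1 = n[(x, 1, 0)] + n[(x, 1, 1)]
        z0 = n[(x, 0, 0)] + n[(x, 0, 1)]
        return z1 / (z0 + z1)

    de = (g(1, 0) - g(0, 0)) * (1 - h(0)) + (g(1, 1) - g(0, 1)) * h(0)
    ie = (h(1) - h(0)) * (g(0, 1) - g(0, 0))
    # TE = E(Y | x1) - E(Y | x0), each expanded over the two mediator levels
    te = ((g(1, 0) * (1 - h(1)) + g(1, 1) * h(1))
          - (g(0, 0) * (1 - h(0)) + g(0, 1) * h(0)))
    return {"DE": de, "IE": ie, "TE": te}


# Hypothetical counts n1..n8, keyed by (x, z, y); illustrative only.
COUNTS = {
    (0, 0, 0): 48, (0, 0, 1): 12,   # n1, n2
    (0, 1, 0): 28, (0, 1, 1): 12,   # n3, n4
    (1, 0, 0): 15, (1, 0, 1): 10,   # n5, n6
    (1, 1, 0): 15, (1, 1, 1): 60,   # n7, n8
}

if __name__ == "__main__":
    for name, value in mediation_from_counts(COUNTS).items():
        print(f"{name} = {value:.3f}")
```

These counts give g00 = 0.2, g01 = 0.3, g10 = 0.4, g11 = 0.8, h0 = 0.4 and h1 = 0.75.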
RAMIFICATIONS OF THE
MEDIATION FORMULA
• Effect measures should not be defined in terms of ML
estimates of regression coefficients (e.g., polynomial,
logistic, or probit) but in terms of population distributions.
• The difference and product heuristics resemble TE-DE
and IE, but:
DE should be averaged over mediator levels,
IE should NOT be averaged over exposure levels.
• TE-DE need not equal IE
TE-DE = proportion for whom mediation is necessary
IE = proportion for whom mediation is sufficient.
• TE-DE informs interventions on indirect pathways
IE informs intervention on direct pathways.
CONCLUSIONS
I TOLD YOU CAUSALITY IS SIMPLE
• Formal basis for causal and counterfactual
inference (complete)
• Unification of the graphical, potential-outcome
and structural equation approaches
• Friendly and formal solutions to
century-old problems and confusions.
He is wise who bases causal inference
on an explicit causal structure that is
defensible on scientific grounds.
(Aristotle 384-322 B.C.)
From Charlie Poole
QUESTIONS???
They will be answered