Statistics and Causal Inference by Holland, JASA 1986

I. Association
II. Potential Responses Framework
III. Fundamental Problem of Causal Inference
IV. Scientific and Statistical Solutions to the Fundamental Problem of Causal Inference
V. Some of the many complications and being careful!
a) What is the population?
b) What is the cause?
c) SUTVA
I am not making any attempt to cover all of the
issues raised by Holland; rather I'm covering
issues I currently think are important.
I. Association
U= the population= All preschool children in US with
parental income <= 2* poverty level.
The u's are children-- they are the units in the
population.
S(u) = t if u attended HeadStart
S(u) = c if u did not attend HeadStart
Y(u) = total years of education achieved
An associational question:
Do children who attend HeadStart achieve a higher level
of education as compared to children who do not attend
HeadStart?
Answer:
Compare the average value of Y among children who
attend HeadStart to the average value of Y among children
who do not attend HeadStart.
Quantitatively:
E[Y | S=t] - E[Y | S=c]
How might we try to answer the questions:
Jack did not attend HeadStart. What would Jack's
educational achievement be if he had attended
HeadStart relative to his present educational
achievement?
or
What are the educational consequences for children who
attend HeadStart as compared to children who do not
attend HeadStart? That is, does "Attending HeadStart"
cause a greater level of educational achievement as
compared to "Not Attending HeadStart?"
or
What are the educational consequences for girls who
attend HeadStart as compared to girls who do not attend
HeadStart?
II. Potential Responses Framework
U= the population = All preschool children in US with
parental income <= 2* poverty level.
Each unit (child) has two potential responses:
Yt(u) and Yc(u)
Yt(u) is unit u's response if unit u attends HeadStart
Yc(u) is unit u's response if unit u does not attend
HeadStart
u        Yt(u)   Yc(u)   individual effect of HeadStart
Jack     14      6        8
Jane     6       5        1
Nancy    12      12       0
Tonya    11      6        5
David    10      11      -1
Causal Effects:
The effect of HeadStart on Jack is 14-6=8 years of educational achievement. The average effect
of HeadStart is (8+1+0+5-1)/5= 2.6 additional years of
educational achievement. The average effect of HeadStart
on girls is (1+0+5)/3=2 additional years of educational
achievement.
Quantitative expression of the Causal Effects:
The effect of HeadStart on Jack is Yt(u)-Yc(u) where Jack
is unit u. The average effect of HeadStart can be written
as E[Yt]-E[Yc]. The average effect of HeadStart on girls
is E[Yt | girl]- E[Yc | girl].
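The arithmetic above can be checked with a short Python sketch. The potential responses are taken from the table in this section; the `girl` flag is an assumption inferred from the notes' own calculation (1+0+5)/3, which treats Jane, Nancy and Tonya as the girls. Names such as `units` and `mean` are illustrative only.

```python
# Potential responses from the table in Section II (years of education).
# Both columns are shown here only because this is the idealized table;
# in real data one of the two is always missing.
units = {
    "Jack":  {"Yt": 14, "Yc": 6,  "girl": False},
    "Jane":  {"Yt": 6,  "Yc": 5,  "girl": True},   # girl flags inferred from (1+0+5)/3
    "Nancy": {"Yt": 12, "Yc": 12, "girl": True},
    "Tonya": {"Yt": 11, "Yc": 6,  "girl": True},
    "David": {"Yt": 10, "Yc": 11, "girl": False},
}

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

# Individual effect: Yt(u) - Yc(u)
effects = {u: r["Yt"] - r["Yc"] for u, r in units.items()}

# Average effect: E[Yt] - E[Yc]
ace = mean(effects.values())

# Average effect on girls: E[Yt | girl] - E[Yc | girl]
ace_girls = mean(effects[u] for u, r in units.items() if r["girl"])

print(effects)
print(ace, ace_girls)
```

Because the effect is a difference taken unit by unit, averaging the individual effects gives the same number as subtracting the two column means.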
III. The Fundamental Problem of Causal Inference
We observe ONLY one of unit u's potential responses!
u        Yt(u)   Yc(u)   S(u)
Jack     14      ?       t
Jane     ?       5       c
Nancy    12      ?       t
Tonya    11      ?       t
David    ?       11      c
Without further assumptions we certainly cannot
ascertain the effect of HeadStart on Jack, nor can we
estimate the average effect of HeadStart, nor can we
estimate the average effect of HeadStart on girls.
We observe S(u); recall S(u) is the cause to which unit
u is exposed, or in our example S(u) tells us whether or
not unit u attends HeadStart. Quantitatively we get to
observe S and YS, where YS(u) is unit u's response under
the exposure S(u) actually received, but we wish to
estimate Yt(u)-Yc(u) or E[Yt]-E[Yc] or
E[Yt | girl]- E[Yc | girl].
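To make the problem concrete, here is a minimal Python sketch that uses only what is observable in the table of this section: S(u) and the single response YS(u). It computes the difference in observed group means and sets it beside the average effect of 2.6 from the full table in Section II. The names `observed` and `group_mean` are illustrative.

```python
# Observed data from the table in Section III: (unit, S(u), YS(u)).
# The other potential response of each unit is simply not available.
observed = [
    ("Jack",  "t", 14),
    ("Jane",  "c", 5),
    ("Nancy", "t", 12),
    ("Tonya", "t", 11),
    ("David", "c", 11),
]

def group_mean(data, s):
    """Average observed response among units with exposure s."""
    ys = [y for _, exposure, y in data if exposure == s]
    return sum(ys) / len(ys)

# Difference in observed means: E[YS | S=t] - E[YS | S=c]
naive_diff = group_mean(observed, "t") - group_mean(observed, "c")

# Average effect from the full (normally unobservable) table in Section II.
true_ace = 2.6

print(round(naive_diff, 2), true_ace)
```

In this table the observed difference (37/3 - 8, about 4.33) does not match the average effect, so the observed data alone do not pin down the causal quantity.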
IV. Some Solutions to the Fundamental
Problem of Causal Inference.
a) Scientific/Structural: These are assumptions
concerning the relationship between Yt(u1), Yt(u2),
Yc(u1), Yc(u2) for the units (the u's) in the
population.
b) Statistical: These are assumptions regarding the
relationship between S and (Yt,Yc).
Except in very special cases these assumptions are
unverifiable. The assumptions must stem from a priori
scientific knowledge or experimental design.
b) Statistical:
Restrict interest to the average effect of HeadStart,
E[Yt]-E[Yc] (or the average effect of HeadStart in a
subpopulation, E[Yt | girl]- E[Yc | girl]). Assumption:
The selection of exposure, S is INDEPENDENT of the
potential responses, (Yt,Yc) (or the selection of
exposure, S is INDEPENDENT of the potential responses,
(Yt,Yc) within the subpopulation of girls).
If the assumption is true, then we can use our data,
(S, YS) on our sample of units, to estimate E[Yt]-E[Yc].
This is the case because
E[YS | S=t] = E[Yt | S=t] = E[Yt]
(since Yt is independent of S) and
E[YS | S=c] = E[Yc | S=c] = E[Yc]
(since Yc is independent of S).
Thus estimating the average causal effect, E[Yt]-E[Yc] is
the same as estimating E[YS | S=t] - E[YS | S=c] and we
can do the latter with our sample means!
Note on Holland's paper: Regardless of whether the
independence assumption holds, Holland calls
TPF = E[YS | S=t] - E[YS | S=c]
the prima facie causal effect of t relative to c.
We have seen that under the independence assumption the
prima facie causal effect is equal to the average causal
effect.
Note that the independence assumption holds when S is a
randomization indicator, that is the units are assigned
at random to t or c. This is what makes randomized
trials so special--we KNOW the independence assumption
holds in randomized trials.
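A small Python sketch of why randomization helps, using the five-unit table from Section II. It assumes one particular design that the notes do not specify: exactly 3 of the 5 children are assigned to t, with every such choice equally likely. Under that assumed design, the difference in group means averaged over all possible assignments equals the average effect. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

# Potential responses from the table in Section II.
Yt = {"Jack": 14, "Jane": 6, "Nancy": 12, "Tonya": 11, "David": 10}
Yc = {"Jack": 6, "Jane": 5, "Nancy": 12, "Tonya": 6, "David": 11}
names = list(Yt)

def diff_in_means(treated):
    """Observed difference in means if exactly `treated` receive t."""
    control = [u for u in names if u not in treated]
    mean_t = Fraction(sum(Yt[u] for u in treated), len(treated))
    mean_c = Fraction(sum(Yc[u] for u in control), len(control))
    return mean_t - mean_c

# Assumed design: 3 of 5 units get t, all 10 choices equally likely.
assignments = list(combinations(names, 3))
diffs = [diff_in_means(a) for a in assignments]

# Expectation of the difference in means over the randomization.
expected_diff = sum(diffs) / len(diffs)

# Average causal effect E[Yt] - E[Yc] over the five units.
ace = Fraction(sum(Yt[u] - Yc[u] for u in names), len(names))

print(expected_diff, ace)
```

Any single assignment can still give a difference far from 2.6; the equality holds on average over the randomization, not for each realized sample.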
a) Scientific/Structural:
Unit Homogeneity: Assume that Yt(u) is the same for all
u. Assume Yc(u) is the same for all u.
This is clearly not true in our last table (Jane and
David do not have the same Yc). However if this were
true, then to ascertain the causal effect of HeadStart we
find a unit exposed to t, say u1, and a unit exposed to
c, say u2. The effect of HeadStart is then Yt(u1)-Yc(u2).
Note that the average effect of HeadStart, E[Yt]-E[Yc],
is equal to Yt(u1)-Yc(u2) under unit homogeneity. Thus it
is no problem to estimate either the average effect of
HeadStart or the effect of HeadStart from observations of
S and YS for a sample of units. This is because we can
estimate E[Yt] by Yt(u1) or even the average of YS over
the units with S=t (E[YS| S=t]=E[Yt|S=t]=Yt(u)).
This assumption is frequently found in a more general
form. For example you might assume that Yt(u) and Yc(u) are
the same for all girls (units) who live in East Detroit,
who live with a single parent, and whose mother is
alcoholic. Then you find two girls of this type, one of
whom attended HeadStart and one of whom did not attend
HeadStart. The difference in their responses is then
equal to the effect of HeadStart on girls of this type.
Constant Effect: This is another more sophisticated, yet
weaker, version of Unit Homogeneity. Here we only assume
that for all units in the population Yt(u)= Yc(u) + T
(So T does not vary by unit). T is the causal effect of t
relative to c. The Constant Effect assumption must be
combined with additional assumptions in order to
estimate T=Yt(u)-Yc(u) or T= E[Yt]-E[Yc] or
T= E[Yt | girl]- E[Yc | girl].
In particular, the prima facie causal effect of t
relative to c,
TPF = E[YS | S=t] - E[YS | S=c]
    = E[Yt | S=t] - E[Yc | S=c]
    = E[Yc + T | S=t] - E[Yc | S=c]
    = T + {E[Yc | S=t] - E[Yc | S=c]}
Without further assumptions there is no reason to believe
that the mean of Yc among children who attended HeadStart
should be equal to the mean of Yc among children who did
not attend HeadStart.
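The decomposition TPF = T + {E[Yc | S=t] - E[Yc | S=c]} can be illustrated numerically. The sketch below uses purely hypothetical numbers (they are not from Holland's paper or from the tables in these notes): a constant effect T = 2 and an exposure pattern in which children with higher Yc are the ones who attend.

```python
# Hypothetical numbers for illustration only.
T = 2                              # assumed constant effect: Yt(u) = Yc(u) + T
Yc = [5, 6, 8, 11, 12]             # hypothetical control responses
S = ["c", "c", "t", "t", "t"]      # hypothetical exposure: higher-Yc children attend

def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

# Observed response YS: Yc + T for attendees, Yc for non-attendees.
Ys = [yc + T if s == "t" else yc for yc, s in zip(Yc, S)]

# Prima facie effect: E[YS | S=t] - E[YS | S=c]
tpf = (mean(y for y, s in zip(Ys, S) if s == "t")
       - mean(y for y, s in zip(Ys, S) if s == "c"))

# Selection term: E[Yc | S=t] - E[Yc | S=c]
bias = (mean(yc for yc, s in zip(Yc, S) if s == "t")
        - mean(yc for yc, s in zip(Yc, S) if s == "c"))

print(tpf, T + bias)
```

With these made-up values the selection term is far from zero, so the prima facie effect overstates T even though the effect is the same for every unit.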
c) Combinations of Scientific/Structural and
Statistical Assumptions:
Researchers often combine these types of assumptions with
great success. Recall that in general the data (as in
the last table) cannot be used to prove or disprove the
assumptions.
One way to combine assumptions: Assume Constant Effect
(possibly within a subpopulation). T is the causal
effect of HeadStart. Then assume that the mean of Yc among
children who attended HeadStart is equal to the mean of
Yc among children who did not attend HeadStart (possibly
within a subpopulation) or assume that Yc is independent
of S. This yields TPF=T so that you can use sample means
to estimate T.
OR
Assume Unit Homogeneity. Because of measurement error we
do not observe YS(u); rather, we observe
YES(u)= YS(u) + errorS(u)
for S = t or c on each unit. Thus Jane and David do not
have the same response in the last table because we are
observing YEt or YEc rather than the true response.
Next make the statistical assumption that measurement
error (errort, errorc) has mean zero and is independent
of S. Then again we have that
E[YES | S=t] - E[YES | S=c]
= E[YEt | S=t] - E[YEc | S=c]
= Yt(u) - Yc(u) + E[errort | S=t] - E[errorc | S=c]
= Yt(u) - Yc(u) + 0.
Again we can use sample averages to estimate the causal
effect of HeadStart.
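The measurement-error argument can be sketched as a small simulation. Everything numeric here is hypothetical: the homogeneous responses (12 and 10), the error standard deviation, the normal error distribution, the sample size and the seed. The only features taken from the notes are that the true responses are the same for every unit and that the error has mean zero and is drawn independently of S.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical homogeneous responses: the same for every unit u.
Yt_true, Yc_true = 12.0, 10.0
n = 20000  # hypothetical sample size

def simulate_unit():
    """Return (S, YES) for one unit: exposure and error-laden response."""
    s = random.choice(["t", "c"])
    error = random.gauss(0.0, 1.5)  # mean zero, drawn without reference to s
    y_true = Yt_true if s == "t" else Yc_true
    return s, y_true + error

sample = [simulate_unit() for _ in range(n)]

def group_mean(s):
    ys = [y for exposure, y in sample if exposure == s]
    return sum(ys) / len(ys)

# Sample analogue of E[YES | S=t] - E[YES | S=c]
estimate = group_mean("t") - group_mean("c")

print(round(estimate, 2), Yt_true - Yc_true)
```

The estimate is close to Yt(u) - Yc(u) = 2 because the error terms average out within each exposure group; if the error depended on S, the two error means would no longer cancel.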
V. Some of the many complications!
a) What is the population?
b) What is the cause?
c) SUTVA
a) What is the population?
Suppose you make the population U= all preschool children
in the US. Does it make sense to think about Yt(u), Yc(u) for
all u? Is it relevant?
b) What is the cause?
This is incredibly difficult and scientists have been
arguing about this for ages. One must be very careful
here.
Racial discrimination in graduate school admission.
Michigan has been charged with reverse racial
discrimination. How might we think about this problem?
(For simplicity, I will pretend that there are only two
races.)
A possible set of causes:
Putting African-American on application papers
vs.
Putting Caucasian on application papers
The population of units:
all applications by students for admission to graduate school
The response:
acceptance vs denial
The question:
Do you want to make inference about the effect of
African-American on Jane's acceptance
or
do you want to make inference about the average effect of
African-American on acceptance
or
do you want to make inference about the average effect of
African-American on acceptance for poor women?
Another possible set of causes:
The school uses the current set of admission criteria
vs.
The school uses an alternate set of admission criteria
that does not involve racial designation.
The population of units:
all applications by students for admission to graduate school
The response:
acceptance vs denial
The question: Do you want to make inference about the effect
of current admission criteria on Jane's acceptance or do
you want to make inference about the average effect of
current admission criteria on acceptance or do you want
to make inference about the average effect of current
admission criteria on acceptance for poor women?
c) SUTVA:
We have been implicitly making the assumption SUTVA,
"Stable Unit Treatment Value Assumption." However it is
an important assumption which may not hold if we are
vague in specifying the cause, or the response or the
population.
Rubin in his discussion of Holland's paper, JASA pg. 961-962, vol 81, 1986 writes:
SUTVA is
a) The value of Y for unit u when exposed to treatment t
will be the same no matter what mechanism is used to
assign treatment to unit u
and
b) The value of Y for unit u when exposed to treatment t
will be the same no matter what treatments the other
units receive.
Thus there do not exist unrepresented versions of the
treatment (Yt(u) does not depend on whether the mother
volunteered the child for HeadStart or on whether the
child was randomized to HeadStart) and there does not
exist interference between units (Yt(u) does not depend on
whether unit u' received treatment t or c).