Holland on Rubin’s Model - Wharton Statistics Department

Holland on Rubin’s Model
Part II
Formalizing These Intuitions.
In the 1920’s and 30’s Jerzy Neyman, a Polish statistician,
developed a mathematical model that allowed him to make sense
of intuitions like those I have been discussing.
Neyman applied this model in the analysis of results of randomized
experiments, which had only recently been invented by R. A.
Fisher, a British statistician.
In the 1970’s Donald Rubin, an American statistician, expanded
this model to cover the more complicated cases of nonrandomized “observational” studies.
I will give a brief introduction to the Neyman-Rubin model and
show its connection to the ideas I have been discussing.
It is easiest to talk about experiments and observational studies
where there are only two causes or treatment conditions,
t and c.
In this setting there is a sample of “units” (these are samples of
material, people, parts of agricultural fields, etc.)
each of which is “subjected to” one of the two treatment
conditions, t or c.
Denote by x the treatment given to a unit,
xi = t or c for unit i.
Later on we record the value of some outcome, yi, for unit i.
So the data are pairs, (yi, xi), for each unit, i.
That is all we get to observe.
Data Analysis
In a situation like this, about the only thing that a sensible person knows
how to do is to
compute the mean value of y
for those units for which xi = t and
compare it to the mean value of y
for those units for which xi = c.
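The comparison just described can be sketched in a few lines of Python. This is only an illustration: the function name difference_in_means and the six (y, x) pairs are made up for the example and do not come from any real study.

```python
from statistics import mean

def difference_in_means(data):
    """Mean of y among units with x == "t" minus mean of y among units with x == "c".

    data is a list of (y, x) pairs, where x is the string "t" or "c".
    """
    y_t = [y for y, x in data if x == "t"]
    y_c = [y for y, x in data if x == "c"]
    return mean(y_t) - mean(y_c)

# Hypothetical observed pairs (y_i, x_i), invented for illustration only.
pairs = [(7.0, "t"), (5.0, "c"), (9.0, "t"), (4.0, "c"), (8.0, "t"), (6.0, "c")]

diff = difference_in_means(pairs)
print(diff)  # mean of the t group (8.0) minus mean of the c group (5.0)
```

Nothing in this calculation says whether the difference is causal; that is the question taken up next.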
At the population level (i.e., big samples) this is a comparison of
E(y| x = t) and E(y| x = c). (1)
When does the difference,
E(y| x = t) - E(y| x = c), (2)
have an interpretation as a Causal Effect?
Return to the idea of a
Minimal Ideal Comparative Experiment.
It has three parts:
• Two identical units of study.
• Two precisely defined and executed experimental
conditions.
• Precisely measured outcome observed on each unit an
appropriate time after exposure to the experimental
conditions.
We never thought seriously that we could find “Two
identical units,” but the Neyman-Rubin model replaces this
impossible idea with one that is possible to think about.
We can imagine a unit being exposed to one of two treatment
conditions. If we do, then there are two Potential Outcomes
that we might observe for unit i:
Yti = outcome for unit i if it is exposed to t,
Yci = outcome for unit i if it is exposed to c.
Once we go this far, it is not hard to realize that the observed
outcome, yi, is actually the realization of one of two different
potential outcomes.
If xi = t, then yi = Yti, and
If xi = c, then yi = Yci.
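The rule linking the observed outcome to the two potential outcomes can be written as a small function. This is a sketch only: the name observed_outcome and the three units are hypothetical, and listing both potential outcomes for a unit is possible only in a made-up example, never in real data.

```python
def observed_outcome(x_i, Yt_i, Yc_i):
    """Return the potential outcome that the assignment x_i reveals for one unit."""
    if x_i == "t":
        return Yt_i
    if x_i == "c":
        return Yc_i
    raise ValueError("x_i must be 't' or 'c'")

# Hypothetical potential outcomes (Yt_i, Yc_i) for three units.
potential = [(10.0, 7.0), (6.0, 6.0), (8.0, 5.0)]
# Hypothetical treatment assignment x_i for the same three units.
assignment = ["t", "c", "c"]

observed = [
    observed_outcome(x_i, Yt_i, Yc_i)
    for x_i, (Yt_i, Yc_i) in zip(assignment, potential)
]
print(observed)  # one revealed value per unit; the other stays unobserved
```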
Thus, the observed outcome, yi, is not the simple datum it might
first appear to be. This is all a result of thinking about
causation and goes way beyond the data to the
causal interpretation of the data.
Now let’s go back to (1) and (2), specified previously.
In terms of the Potential Outcomes, Yt and Yc, the difference in
(2) is
E(Yt| x = t) - E(Yc| x = c). (3)
We are not done yet. We need to define the
Average Causal Effect (ACE) of t relative to c.
Since we envision every real unit having two potential values, Yti,
Yci, from the two Potential Outcomes, we can certainly
entertain the idea of their difference,
Yti - Yci. (4)
This difference is the
Causal Effect of t relative to c on Y for unit i.
If we average this difference over all of the units in the
population we get the
Average Causal Effect (ACE) of t relative to c on y, i.e.,
ACE = E(Yt – Yc) (5)
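As a sketch of definitions (4) and (5), the unit-level causal effects and their average can be computed for a small invented population. The four units below are hypothetical; in practice only one of the two potential outcomes is ever observed for a unit, so this calculation is a thought experiment, not a data analysis.

```python
from statistics import mean

# Hypothetical population of four units, each listed as (Yt_i, Yc_i).
population = [(10.0, 6.0), (9.0, 5.0), (5.0, 3.0), (4.0, 2.0)]

# (4): causal effect of t relative to c for each unit.
unit_effects = [Yt_i - Yc_i for Yt_i, Yc_i in population]

# (5): average causal effect over all units in the population.
ace = mean(unit_effects)

print(unit_effects)
print(ace)
```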
For reasons of simplicity, I will introduce the idea of the
ACE on the treated group, or the effect of the
treatment on the treated,
that is,
ACE(x = t) = E(Yt – Yc|x = t), (6)
so that we may re-express (6) as
ACE(x = t) = E(Yt|x = t) - E(Yc|x = t). (7)
Returning now to (3), we see that
ACE(x = t) = E(Yt| x = t) - E(Yc| x = c) (8)
           + E(Yc| x = c) - E(Yc| x = t). (9)
(8) is the difference between the means of the t and c groups, at
the population level.
But (9) involves something we can know, i.e. E(Yc| x = c)
and something that is impossible to know directly,
i.e., E(Yc| x = t).
E(Yc| x = t) is an example of a counterfactual expected value.
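The decomposition into (8) and (9) can be checked numerically on an invented table of units. This is a sketch under stated assumptions: the four units and the helper cond_mean are hypothetical, and the counterfactual mean E(Yc| x = t) can be computed here only because the example lists both potential outcomes for every unit.

```python
from statistics import mean

# Hypothetical units as (Yt_i, Yc_i, x_i). Units with larger Yc ended up in t,
# so assignment is deliberately NOT independent of Yc in this example.
units = [
    (10.0, 6.0, "t"),
    (9.0, 5.0, "t"),
    (5.0, 3.0, "c"),
    (4.0, 2.0, "c"),
]

def cond_mean(units, outcome, group):
    """Mean of potential outcome 'Yt' or 'Yc' among units with x_i == group."""
    index = 0 if outcome == "Yt" else 1
    return mean([u[index] for u in units if u[2] == group])

# Line (8): the difference between the t and c group means, estimable from data.
diff_means = cond_mean(units, "Yt", "t") - cond_mean(units, "Yc", "c")

# Line (9): E(Yc| x = c) - E(Yc| x = t); the second term is counterfactual.
line_9 = cond_mean(units, "Yc", "c") - cond_mean(units, "Yc", "t")

# ACE on the treated, as in (7).
ace_treated = cond_mean(units, "Yt", "t") - cond_mean(units, "Yc", "t")

print(diff_means, line_9, ace_treated)
```

In this made-up table the difference in group means overstates the ACE on the treated, and line (9) is exactly the correction that closes the gap.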
Counterfactuals come up in discussions of causation by
those who don’t have a model that lets them be very concrete
about them.
The difference
E(Yc| x = c) - E(Yc| x = t),
will be zero if E(Yc| x = c) = E(Yc| x = t).
A condition that ensures this is that Yc is statistically
independent of x.
How can this occur?
The effect of randomization.
When we randomize units to treatments we use an external
mechanism to assign treatments to units,
like the toss of a coin or a table of random numbers.
This has the effect of making the assignment variable, x,
statistically independent of any variable defined on the units
including Yc.
Thus, under random assignment, at the level of the population we
have
E(Yc| x = c) = E(Yc| x = t).
Hence, we have the equality
ACE(x = t) = E(Yt| x = t) - E(Yc| x = c), (10)
or using the original notation
E(y| x = t) - E(y| x = c) = ACE(x = t). (11)
Equation (11) is very important. It shows that a causal
parameter, the ACE on the treated group, is equal to something
that we can estimate with data, and thus do statistical inference
about.
From this point of view, causal inference is
statistical inference about causal parameters
and not something about ethereal quantities
that have little reality.
Note also that the admonition about a “cause”
being something that could be a treatment in
some experiment is given more force by the
identity between an ACE and a difference
between means that can be estimated by data.
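The effect of randomization on (11) can be illustrated with a small simulation. Everything in it is an assumption made for the sketch: the normal distributions for Yc and for the unit-level effect, the population size, the seed, and the two assignment rules coin_toss and self_selected. It is not an analysis of real data.

```python
import random
from statistics import mean

def simulate(assign, n=20000, seed=0):
    """Build a hypothetical population, assign treatments, and return
    (difference in group means, ACE on the treated)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        Yc_i = rng.gauss(50.0, 10.0)        # assumed distribution of Yc
        Yt_i = Yc_i + rng.gauss(5.0, 1.0)   # assumed unit-level effect, mean 5
        if assign(rng, Yc_i) == "t":
            treated.append((Yt_i, Yc_i))
        else:
            control.append((Yt_i, Yc_i))
    # Left side of (11): uses only outcomes that would actually be observed.
    diff_means = mean([Yt_i for Yt_i, _ in treated]) - mean([Yc_i for _, Yc_i in control])
    # ACE on the treated: needs both potential outcomes, known only in a simulation.
    ace_treated = mean([Yt_i - Yc_i for Yt_i, Yc_i in treated])
    return diff_means, ace_treated

def coin_toss(rng, Yc_i):
    """External mechanism: assignment ignores every property of the unit."""
    return "t" if rng.random() < 0.5 else "c"

def self_selected(rng, Yc_i):
    """Non-random assignment: units with high Yc end up in the t group."""
    return "t" if Yc_i > 50.0 else "c"

diff_rand, ace_rand = simulate(coin_toss)
diff_sel, ace_sel = simulate(self_selected)

print("randomized:   ", diff_rand, ace_rand)
print("self-selected:", diff_sel, ace_sel)
```

Under the coin toss the two quantities should agree up to sampling error, while under the self-selected rule the difference in means is far from the ACE on the treated, because x is then not independent of Yc.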
Causal Models
A causal model is an assumption about the Potential Outcomes.
Here are two very common ones.
Homogeneous units:
Yti = Yt, and Yci = Yc, for all i.
So it does not matter what i we look at, we get the same outcome
under t or c.
This is the basic tool of most of lab science where the units are
carefully created samples of material for study.
Constant Causal Effects:
Yti - Yci = k, for any i.
The effect of t relative to c is the same for all units.
Clearly Homogeneous Units implies Constant Causal Effect, but
the converse is not necessarily true.
So Constant Causal Effect is a weaker assumption than
Homogeneous Units.
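The relation between the two causal models can be seen in a tiny example. This is a sketch with invented numbers: the helper names is_homogeneous and has_constant_effect and both lists of (Yt_i, Yc_i) pairs are hypothetical.

```python
def is_homogeneous(units):
    """True if every unit has the same Yt and every unit has the same Yc."""
    return len({Yt_i for Yt_i, _ in units}) == 1 and len({Yc_i for _, Yc_i in units}) == 1

def has_constant_effect(units):
    """True if Yt_i - Yc_i takes the same value k for every unit."""
    return len({Yt_i - Yc_i for Yt_i, Yc_i in units}) == 1

# Hypothetical homogeneous units: identical potential outcomes for all i.
identical_units = [(8.0, 5.0), (8.0, 5.0), (8.0, 5.0)]

# Hypothetical units that differ in level but share the same effect k = 3.
shifted_units = [(8.0, 5.0), (11.0, 8.0), (4.0, 1.0)]

print(is_homogeneous(identical_units), has_constant_effect(identical_units))
print(is_homogeneous(shifted_units), has_constant_effect(shifted_units))
```

The second list satisfies Constant Causal Effect without satisfying Homogeneous Units, which is the sense in which the converse fails.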
Constant Causal Effect may be thought of as a formalization of
Hume’s “constant conjunction” condition for causality.
It is also the reason you learn about statistical models in which
the distributions have the same shape but different means.
Without going any further, let me just
end by saying that
the Neyman-Rubin model can be used to
illuminate any causal discussion or idea
and should be part of any scientist’s tool
kit.