Causal Inference from Indirect Experiments

From: AAAI Technical Report SS-94-01. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.
Causal
Inference
from Indirect
Judea
Pearl
Cognitive
Systems Laboratory
Computer Science Department
University of California,
Los Angeles,
judea@cs.ucla.edu
SUMMARY
Standard experimental studies in the biological, medical, and behavioral sciences invariably invoke the instrument of randomizedcontrol, that is, subjects are assigned
at random to various groups (Mso called "treatments" or
"programs") and the mean differences between participants in different groups are taken as measures of the efficacies of the associated programs. Indirect experiments
are studies in which randomized control is either infeasible or undesirable, and randomized encouragement is
instituted instead, that is, subject are still assigned at
random to various groups, but members of each group
are encouraged, rather than forced to receive the program
associated with the group, leaving final selection among
programs to individual choice.
The purpose of this note is to bring to the attention of experimental researchers simple mathematical results that enable us to assess, from indirect experiments,
the strength with which causal influences operate among
variables of interest. The results reveal that despite the
laxity in the encouraging instrument, indirect experimentation can yield significant and sometimesaccurate information on the impact of a treatment on the population as
a whole, as well as on the treated subjects in particular.
Experiments
CA 90024
Such imperfect compliance introduces appreciable
bias into the conclusions that researchers draw from
the data, and this bias cannot be corrected unless detailed models of compliance are constructed
[Efron and Feldman, 1991].
2. Denying subjects assigned to certain control groups
the benefits of the best available treatment has moral
and legal ramifications. For example, it is difficult to
justify placebo programs in AIDSresearch because
those patients assigned to the placebo group would
be denied access to potentially life saving treatment
[Palca, 1989].
3. Randomization, by its very presence, may influence
participation as well as behavior [tteckman, 1992].
For example, the awareness that admission criteria
in a given school are deliberately randomized, may
make eligible candidates wary of applying to that
school. Likewise, Kramer and Shapiro [1984] note
that subjects in drug trims were less likely to participate in randomized trials than in nonexperimental
studies, even when the treatments were equally nonthreatening.
Altogether, mounting evidence exists that mandated
randomization may undermine the reliability
of experimental
evidence
and
that
experimentation
with
human
1 INTRODUCTION
subjects should include an element of self selection. This
Recently, there have been severM objections to the note concerns the drawing of inferences from studies in
use of randomization in social and medical experimenta- which subjects are indeed given final choice of program,
while randomization is confined to an indirect instrument
tion. These objections fall into three major categories:
which merely encourages or discourages participation in
the various programs. For example, in evaluating the effi1. Perfect control is often hard to achieve or ascertain. Studies in which treatment is assumed to be cacy of a given training program, notices of eligibility may
randomized may turn out to be marred by uncon- be sent to a randomly selected group of students .or, altrolled imperfect compliance. For example, sub- ternatively, eligible candidates maybe selected at random
jects experiencing adverse reaction to an experimen- to receive scholarships for participating in the program.
tal drug would tend to reduce the assigned dosage. Similarly, in drug trials, subjects maybe given randomly
i05
chosen advice on recommendeddosage level, yet the final
choice of dosage will be determined by the subjects to fit
individual needs.
The question we attempt to answer in this investigation is whether such indirect randomization can provide sufficient information to allow accurate assessment
of the intrinsic merit of a program, as would be measured, for example, if the program were to be extended
and mandated uniformly to the population. The analysis
presented shows that, given a minimal set of assumptions,
such inferences are indeed possible, albeit in the form of
bounds, rather than precise point estimates, for the causal
effect of the program or treatment. These bounds can be
used by the analyst to guarantee that the causal impact
of a given program must be higher than one measurable
quantity and lower than another.
Our most crucial assumption is that, for any given
person, the encouraging instrument can only influence the
treatment chosen by that person, but has no effect on how
that person would respond to the treatment chosen. The
second assumption, one which is always made in experimental studies, is that subjects respond to treatment independently of each other. Other than these two assumptions, our model places no constraints on how tendencies
to respond to treatments mayinteract with choices among
treatments.
2
PROBLEM
notation, we let z, d, and y represent, respectively, the
values taken by the variables Z, D, and Y, with the following interpretation:
z E {z0, zx}, zl asserts that treatment has been assigned
(z0, its negation);
d E {do, dl}, dl asserts that treatment has been administered (do, its negation); and
y E {Y0, yl}, Yl asserts a positive observed response (Y0,
its negation).
The domain of U remains unspecified and may, in general, combinethe spaces of several randomvariables, both
discrete and continuous.
The graphical model reflects two assumptions:
1. The assigned treatment Z does not influence Y directly but rather through the actual treatment D. In
practice, any direct effect Z might have on Y would
be adjusted for through the use of a placebo.
2. Z and U are marginally independent, as ensured
through the randomization of Z, which rules out a
commoncause for both Z and U.
1These assumptions impose on the joint distribution
the decomposition
STATEMENT
P(y, d, z, u) = P(yld, u) P(dlz, u) P(z) (1)
The basic experimental setting associated with indirect experimentation is shown below; the background
and the methodology used in this analysis are described
in [Pearl, 1993]. To focus the discussion, we have considered a prototypical clinical trial with partial compliance
although, in general, the model applies to any study in
which a randomized instrument encourages subjects to
choose one program over another.
which, of course, cannot be observed directly because
U is unobserved. However, the marginal distribution
P(y, d, z) and, in particular, the conditional distributions
P(y, dlz), z {z0, zx }, ar e observed2, and th e challenge is
to assess from these distributions the average change in
Y, due to treatment. The average causal effec~ of D on
Y, c~, is defined as the difference
c~ = E[P(ylldl,u)-P(yl[do,
Treatment/~
AssignmentReceived~k,~
[1 ~Factors
Latent
when E stands for the expectation taken over u.
If compliance is perfect, then D and U are independent and c~ can be measured by the observed mean difference between treated and untreated subjects
Treatment/ /-~
A(Y) P( yxldl) NN~ i/"~
(2)
Observed
Response
P( ylldo)
(3)
However, when compliance is not perfect high values of
A(Y) may correspond to low, or even negative values
OZ.
Figure 1: Graphical representation of causal dependencies
in a randomizedclinical trial with partial compliance.
We assume that Z, D, and Y are observed binary
variables where: Z represents the (randomized) treatment assignment, D is the treatment actually received,
and Y is the observed response. U represents all factors,
both observed and unobserved that may influence the outcome Y and the choice of treatment D. To facilitate
the
Muchof the statistical
the parameter of interest,
literature assumes that a is
since it predicts the impact
1Only the expectation over U will enter our analysis, hence we
take the liberty of denoting the distribution
of U by P(u), even
though U may consist of continuous variables.
2In practice, of course, only a finite sample of P(y, d[z) will be
observed, but since our task is one of identification,
not estimation, we make the large-sample assumption and consider P(y,
dlz
)
as given.
106
of applying the treatment uniformly (or randomly) over
the population. However,if future treatment policies will
involve selection decisions by the agents, the parameter
of interest should measure the impact of the treatment
on the treated:
a* = E[P(ylldl,
u) - P(yl[do, u)[D = dl] (4)
instrument. The significance in the a* measure’ emerges
primarily in studies whereit is desired to evaluate the efficacy of an existing programon its current participants. In
such studies, assuming the encouragement is randomized,
one can simply measure the mean response difference between the encouraged and non-encouraged populations,
divided by the rate of participation P(dllZl).
namely, the change of the mean response of the treated
subjects compared to the mean response of these same
subjects had they not been treated [Heckman, 1992].
3
SUMMARY OF RESULTS
Analysis shows [Robins, 1989, Manski, 1990, Pearl,
1993] that the expression for a (Sq. (2)) can be bounded
by two simple formulas, each made up of observed parameters of P(y, d[z) (see Appendix):
a > P(Yl [zl) - P(Yl [z0) - P(Yl, d0lzl) - P(Yo, dl [z0)
a <_P(Yl [Zl) - P(Yl [z0) + P(Yo, do [Zl) + P(Yl, dl [z0) (5)
Due to their simplicity and wide range of applicability,
the hounds of Eq. (5) were named the natural bounds
[Balke and Pearl, 1993]. The natural bounds guarantee that the causal effect of the actual treatment cannot be lower than that of the encouragement by more
than the sum of two measurable quantities, P(Yl, do Izx)+
P(yo,dl[ZO); they also guarantee that the causal effect
of treatment cannot exceed that of the encouragement
by more than the sum of two other measurable quantities, P(Yo, d0[zl) + P(Yl, dl]Zo). The width of the natural
bound, not surprisingly, is given by the rate of noncompliance, P(dl[zo) + P(do IZl). This width can be narrowed
further using linear programming[Balke and Pearl, 1993]
which shows that, even under condition of imperfect compliance, someexperimental data (i.e., P(x, ylz)) can permit the precise evaluation of a.
The analysis also shows that a* can be assessed
with greater accuracy than a. More remarkably, under conditions of "no intrusion" (namely, P(dllZO) =
as in most clinical trials) a* can be identified precisely
[Angrist and Imbens, 1991].
The bounds governing a* are (see Appendix):
4
To demonstrate
by example how the bounds
for a can be used to provide meaningful information about causal effects, consider the Lipid Research
Clinics Coronary Primary Prevention Trial data (see
[Lipid Research Clinic Program, 1984]). A portion of
this data consisting of 337 subjects was analyzed in
[Efron and Feldman, 1991] and is the focus of this example. Subjects were randomized into two treatments
groups; in the first group all subjects were prescribed
cholestyramine (Zl), while the subjects in the other group
were prescribed a placebo (z0). During several, years
treatment, each subject’s cholesterol level was measured
multiple times, and the average of these measurements
was used as the post-treatment cholesterol level (continuous variable CF). The compliance of each subject was
determined by tracking the quantity of prescribed dosage
consumed (a continuous quantity).
In order to apply our analysis to this study, the continuous data is first transformed, using thresholds, to binary variables representing treatment assignment ( Z), received treatment (D), and treatment response (Y). The
threshold for dosage consumption was selected as roughly
the midpoint between minimum and maximumconsumption, while the threshold for cholesterol level reduction
was selected at 28 units.
The data samples after thresholding gives rise to the
following eight probabilities3:
P(yo, dllzo)
P(dl)
a* > P(yl[Zl)- P(yl[zo)
P(dl[Zl)
_
a* < P(yl[zl)- P(yl[zo)
P(dllzl)
+ P(yl,dl[zo)
P(dx)
EXAMPLE
P(yo,do[zo) -- 0.919
P(yo,dolzl) =
P(yo,dllzo)
P(yl,dolzo)
P(yl,dl]zo)
P(yo, dx[zl)
P(yl,d0[zi)
P(yl,dllZl)
=
-=
0.000
0.081
0.000
=
=
--
0.315
0.139
0.073
0.473
This data represents a compliance rate of ’
P(dl[zl) = 0.139 + 0.473 = 0.61,
(6)
a mean difference of
Clearly, in situations where treatment may only be obtained by those encouraged (by assignment), (~* is identifiable and is given by:
A(y)= P(u Ida)- p(ulld0)=
and an encouragement effect of
a* = P(yl[Zl)P(yx[zo)
P(dl [zl)
if P(dllzo)
= 0 (7)
P(Yl[Zl) - P(Yl Iz0) = 0.465
Unlike the a-measure, a* is not an intrinsic prop3We makethe large-sampleassumptionand take the samplefreerty of the treatment, as it varies with the encouraging quenciesas representingP(y, d[z).
107
According to Eq. (5), a can be bounded by:
o~ >
<
0.465-0.0730.000= 0.392
0.465 + 0.315 + 0.000 = 0.780
References
[Angrist and Imbens, 1991] J.D. Angrist and G.W. Imbens. Source of Identifying Information in Evaluation Models. Discussion Paper 1568, Department of Economics, Harvard University, Cambridge, MA,1991.
These are remarkably informative bounds: Although
38.8%of the subjects deviated from their treatment protocol, the experimenter can categorically state that when [Balke and Pearl, 1993] Alexander Balke and Judea
Pearl. Nonparametric Bounds on Causal Efapplied uniformly to the population, the treatment is
fects
from Partial Compliance Data. Technical
guaranteed to improve by at least 39.2% the probability
Report No. 199, Cognitive Systems Laboratory,
of reducing the level of cholesterol by 28 points or more.
UCLAComputer Science Department, Los AnThis guarantee is purely mathematical and does not rest
geles,
CA, September 1993. Submitted.
on any assumed model of subject behavior.
The impact of treatment "on the treated" is equally [Bowden and Turkington, 1984] Roger J. Bowden and
revealing. Using Eq. (7) a* can be evaluated precisely
Darrell A. Turkington. Instrumental Variables.
(since P(dllzo) = 0) giving
Cambridge University Press, Cambridge, UK,
1984.
0.465
c~* =
= 0.762
[Efron and Feldman, 1991] B. Efron and D. Feldman.
0.610
Compliance as an explanatory variable in clinical
trials. Journal of the AmericanStatistical
In other words, those subjects who stayed in the program
Association,
86(413):9-26, March 1991.
are muchbetter off than they would have been otherwise.
The treatment can be credited with reduced cholesterol
[Heckman, 1992] James J. Heckman. Rando~nization
levels (of at least 28 units) in precisely 76.2%of these
and Social Policy Evaluation.
In C. Mansubjects.
ski and I. Garfinkle, eds., Evaluations Welfare
and Training Programs, pages 201-230, Harvard
University Press, 1992.
5
CONCLUDING
REMARKS
[Lipid Research Clinic Program, 1984] The Lipid Research Clinics Coronary Primary Prevention
Trial results, parts I and II. Journal of ~he
Indirect experiments have been considered by many
American Medical Association, 251(3):351-374,
researchers, mostly in the context of linear regression
January
1984.
models in econometrics [Bowden and Turkington, 1984].
Alternative non-parametric treatments of noncompliNonparametric
[Manski, 1990] Charles F. Manski.
ance can be found in [Robins, 1989, Manski, 1990,
bounds on treatment effects. American EcoAngrist and Imbens, 1991]. However, the languages used
nomic Review, Papers and Proceedings, 80:319in these treatments render them extremely unlikely to
323, May 1990.
reach the audience of this symposium. The analyses
in [Pearl, 1993] and [Balke and Pearl, 19931 are cast in [Palca, 1989] J. Palca. AIDSDrug Trials Enter NewAge.
graphical models and contain new and tighter bounds.
Science Magazine, pages 19-21, October 1989.
Wehope that the availability of these findings would encourage the use of indirect experimentation wheneverran- [Pearl, 1993] J. Pearl. From Bayesian Networks to Causal
Networks. Proceedings of the Adaptive Comdomized controlled experiments are infeasible or undesirputing and Information Processing Seminar,
able.
Brunel Conference Centre, London, January 25Anotherset of results of possible interest to this audi27, 1994. See also Stalistical Science, 8(3), 266ence are those concerning the deduction of causal effects
269, 1993.
from purely observational studies. Given an arbitrary
causal graph of the type described in Figure 1, some [Robins, 1989] J.M. Robins. The Analysis of Randomof whose nodes are observable and some unobservable,
ized and Non-randomized AIDS Treatment Triit is now possible to determine by graphical techniques
als Using a New Approach to Causal Inferwhether the causal effect of one variable on another can
ence in Longitudinal Studies. In L. Sechrest, H.
be computed from non-experimantal data over the obFreeman, and A. Mulley (Eds.), Health Serservables [Pearl, 1993]. If the answer is positive, then
vice Research Methodology: A Focus on AIDS,
randomized experiments are not necessary and one can
NCHSR,U.S. Public Health Service, 113-159,
predict the effect of interventions by symbolic manipula1989.
tions of graphs and probabilities.
108
Appendix
To evaluate
To prove (5), we write
¢~
Ol
(8)
P(y, dlz) = ~ P(yld, u) P(dlz, u)
tl
E{[P(yl [dl, u) - P(yl [do, u)][n = .dl
we define
-
and define the following four functions:
go(u)--P(dxlu,
gl(U)=P(dl]U,
fo(u) P(ytldo, u)
fl (u) - P(yl]dl,
zo) (9)
Zl) (10)
This permits us to express six independent components
of P(y, dlz ) as expectations of these functions:
P(yl,do[zo)
P(yl,do[zl)
P(dl[zo)
P(dl[zl)
P(yl,dl[ZO)
P(yl,dl[Zl)
--=
=
=
=
E[f0(1-g0)] -Elf0(1- gl)] -E(go) =
E(gl) =
E[fl "go] = e
E[fl .gl] = h
For any two random variables
q _--
= ~ ~u A(u)P(u)[P(Zl)gl(u)
+ P(zo)go(u)]
= p-~E{[fl(u) - fo(u)][qgl(u) (1- q)go(u)]}
- p ,~ E[qflgl + (1 - q)flgo - qfogl - (1 - q)fogo]
= p--(-~[qh + (1 - q)e - qE(fogl) (1- q)E(fogo)]
--"l -~p-yEb[qh +(1 - q)e - q( E(fo ) - b) (1- q)( E(fo) - a)]
="l"p_.(.a~[q(h+ b) + (1 q)(e + a)E(f0)
(11)
Substituting the expressions for (h + b) and (e + a)
(11), and using:
1 T E(XY) - E(Y) ~_ E(X) ~_
since E[(1 - X)(1 - Y)] >_ 0. This inequality holds
any pair of f, g functions (since they lie between0 and 1)
and we can write:
a < E(fo)
or,
1
P(dl) [P(yt) P(dllzo) - P(Yl, dolzo)] _< a*
1
P(dl) [P(Yl) P(Yl, d0]z0)] > c~* (1 5)
Alternatively, collecting
sions of (15), we get
max[h; e] < E(fl) < min[(1 + e - c); (1 + h (12)
Lower bounding E(fl) and upper bounding E(fo) provides a lower boundfor their difference
P(yo,dllZo)
P(dl)
commonterms in both expres-
< ~._P(yllzl)P(yllzo)
P(dl[zl)
< P(yl,dllzo)
- P(dl)
Thus,
a*=
P(yl[zl)
- P(yl[zo)
ifP(dl[zo)
P(dl[zl)
(13)
Substituting
back the P(y,d[z)
expressions
from
Eqs. (9)- (11), yields the lower bound of nq. (5).
larly, the difference can be upper bounded by
< a+c
from (12), we obtain upper and lower bounds on a*:
> E(fl) > E(flgo)
> E(fl) > E(flgl)
> E(fo)>_Elf0(1-go)]
> E(fo)>E[f0(1- g~)]
E(fl) - E(fo) > max[e; h] - min[(a -4- c); (b +
> h - (a + c)
P(zl)
a* = E[A(u)ID = dl]
= E~ A(u)P(u[dl)
= p~ E,, A(u)P(dlIu)P(u)
="l"p_~ ~’~t, ~’~z A(u)P(dllu, z)P(z)P(u)
X and Y such that
max[a;b] _< E(fo) <_min[(a + c); (b +
u) = fl(u)
and write
o<x_<l,O <Y<lwehave
l+E(flgo)
-E(go)
l+E(flgl)-E(gl)
l+E[fo(1-go)]
-E(1 - go)
l+E[f0(1
-gl)]-E(1-gl)
P(yl[dl,u)-P(ylldo,
which proves (6) and (7).
E(fl) - E(fo) < min[(1 + e - c); (1 -4- h - d)] - max[a;
<l+h-d-a
thus proving (5).
109
=
(14)