From: AAAI Technical Report SS-94-01. Compilation copyright © 1994, AAAI (www.aaai.org). All rights reserved.

Causal Inference from Indirect Experiments

Judea Pearl
Cognitive Systems Laboratory
Computer Science Department
University of California, Los Angeles, CA 90024
judea@cs.ucla.edu

SUMMARY

Standard experimental studies in the biological, medical, and behavioral sciences invariably invoke the instrument of randomized control, that is, subjects are assigned at random to various groups (also called "treatments" or "programs") and the mean differences between participants in different groups are taken as measures of the efficacies of the associated programs. Indirect experiments are studies in which randomized control is either infeasible or undesirable, and randomized encouragement is instituted instead, that is, subjects are still assigned at random to various groups, but members of each group are encouraged, rather than forced, to receive the program associated with the group, leaving final selection among programs to individual choice. The purpose of this note is to bring to the attention of experimental researchers simple mathematical results that enable us to assess, from indirect experiments, the strength with which causal influences operate among variables of interest. The results reveal that despite the laxity in the encouraging instrument, indirect experimentation can yield significant and sometimes accurate information on the impact of a treatment on the population as a whole, as well as on the treated subjects in particular.

1 INTRODUCTION

Recently, there have been several objections to the use of randomization in social and medical experimentation. These objections fall into three major categories:

1. Perfect control is often hard to achieve or ascertain. Studies in which treatment is assumed to be randomized may turn out to be marred by uncontrolled imperfect compliance. For example, subjects experiencing adverse reaction to an experimental drug would tend to reduce the assigned dosage. Such imperfect compliance introduces appreciable bias into the conclusions that researchers draw from the data, and this bias cannot be corrected unless detailed models of compliance are constructed [Efron and Feldman, 1991].

2. Denying subjects assigned to certain control groups the benefits of the best available treatment has moral and legal ramifications. For example, it is difficult to justify placebo programs in AIDS research because those patients assigned to the placebo group would be denied access to potentially life saving treatment [Palca, 1989].

3. Randomization, by its very presence, may influence participation as well as behavior [Heckman, 1992]. For example, the awareness that admission criteria in a given school are deliberately randomized may make eligible candidates wary of applying to that school. Likewise, Kramer and Shapiro [1984] note that subjects in drug trials were less likely to participate in randomized trials than in nonexperimental studies, even when the treatments were equally nonthreatening.

Altogether, mounting evidence exists that mandated randomization may undermine the reliability of experimental evidence and that experimentation with human subjects should include an element of self selection. This note concerns the drawing of inferences from studies in which subjects are indeed given final choice of program, while randomization is confined to an indirect instrument which merely encourages or discourages participation in the various programs. For example, in evaluating the efficacy of a given training program, notices of eligibility may be sent to a randomly selected group of students or, alternatively, eligible candidates may be selected at random to receive scholarships for participating in the program.
Similarly, in drug trials, subjects may be given randomly chosen advice on recommended dosage level, yet the final choice of dosage will be determined by the subjects to fit individual needs.

The question we attempt to answer in this investigation is whether such indirect randomization can provide sufficient information to allow accurate assessment of the intrinsic merit of a program, as would be measured, for example, if the program were to be extended and mandated uniformly to the population. The analysis presented shows that, given a minimal set of assumptions, such inferences are indeed possible, albeit in the form of bounds, rather than precise point estimates, for the causal effect of the program or treatment. These bounds can be used by the analyst to guarantee that the causal impact of a given program must be higher than one measurable quantity and lower than another.

Our most crucial assumption is that, for any given person, the encouraging instrument can only influence the treatment chosen by that person, but has no effect on how that person would respond to the treatment chosen. The second assumption, one which is always made in experimental studies, is that subjects respond to treatment independently of each other. Other than these two assumptions, our model places no constraints on how tendencies to respond to treatments may interact with choices among treatments.

2 PROBLEM STATEMENT

The basic experimental setting associated with indirect experimentation is shown in Figure 1; the background and the methodology used in this analysis are described in [Pearl, 1993]. To focus the discussion, we have considered a prototypical clinical trial with partial compliance although, in general, the model applies to any study in which a randomized instrument encourages subjects to choose one program over another.

[Figure 1: Graphical representation of causal dependencies in a randomized clinical trial with partial compliance. Nodes: Z (Treatment Assignment), D (Treatment Received), Y (Observed Response), U (Latent Factors). Arrows: Z → D → Y, U → D, U → Y.]

We assume that Z, D, and Y are observed binary variables where: Z represents the (randomized) treatment assignment, D is the treatment actually received, and Y is the observed response. U represents all factors, both observed and unobserved, that may influence the outcome Y and the choice of treatment D. To facilitate the notation, we let z, d, and y represent, respectively, the values taken by the variables Z, D, and Y, with the following interpretation: $z \in \{z_0, z_1\}$, $z_1$ asserts that treatment has been assigned ($z_0$, its negation); $d \in \{d_0, d_1\}$, $d_1$ asserts that treatment has been administered ($d_0$, its negation); and $y \in \{y_0, y_1\}$, $y_1$ asserts a positive observed response ($y_0$, its negation). The domain of U remains unspecified and may, in general, combine the spaces of several random variables, both discrete and continuous.

The graphical model reflects two assumptions:

1. The assigned treatment Z does not influence Y directly but rather through the actual treatment D. In practice, any direct effect Z might have on Y would be adjusted for through the use of a placebo.

2. Z and U are marginally independent, as ensured through the randomization of Z, which rules out a common cause for both Z and U.

These assumptions impose on the joint distribution the decomposition^1

$$P(y, d, z, u) = P(y \mid d, u)\, P(d \mid z, u)\, P(z)\, P(u) \tag{1}$$

which, of course, cannot be observed directly because U is unobserved. However, the marginal distribution $P(y, d, z)$ and, in particular, the conditional distributions $P(y, d \mid z)$, $z \in \{z_0, z_1\}$, are observed^2, and the challenge is to assess from these distributions the average change in Y due to treatment.

The average causal effect of D on Y, $\alpha$, is defined as the difference

$$\alpha = E[P(y_1 \mid d_1, u) - P(y_1 \mid d_0, u)] \tag{2}$$

where E stands for the expectation taken over u. If compliance is perfect, then D and U are independent and $\alpha$ can be measured by the observed mean difference between treated and untreated subjects

$$\Delta(Y) = P(y_1 \mid d_1) - P(y_1 \mid d_0) \tag{3}$$

However, when compliance is not perfect, high values of $\Delta(Y)$ may correspond to low, or even negative, values of $\alpha$.

Much of the statistical literature assumes that $\alpha$ is the parameter of interest, since it predicts the impact of applying the treatment uniformly (or randomly) over the population. However, if future treatment policies will involve selection decisions by the agents, the parameter of interest should measure the impact of the treatment on the treated:

$$\alpha^* = E[P(y_1 \mid d_1, u) - P(y_1 \mid d_0, u) \mid D = d_1] \tag{4}$$

namely, the change of the mean response of the treated subjects compared to the mean response of these same subjects had they not been treated [Heckman, 1992].

^1 Only the expectation over U will enter our analysis, hence we take the liberty of denoting the distribution of U by P(u), even though U may consist of continuous variables.

^2 In practice, of course, only a finite sample of $P(y, d \mid z)$ will be observed, but since our task is one of identification, not estimation, we make the large-sample assumption and consider $P(y, d \mid z)$ as given.

3 SUMMARY OF RESULTS

Analysis shows [Robins, 1989, Manski, 1990, Pearl, 1993] that the expression for $\alpha$ (Eq. (2)) can be bounded by two simple formulas, each made up of observed parameters of $P(y, d \mid z)$ (see Appendix):

$$\begin{aligned}
\alpha &\ge P(y_1 \mid z_1) - P(y_1 \mid z_0) - P(y_1, d_0 \mid z_1) - P(y_0, d_1 \mid z_0) \\
\alpha &\le P(y_1 \mid z_1) - P(y_1 \mid z_0) + P(y_0, d_0 \mid z_1) + P(y_1, d_1 \mid z_0)
\end{aligned} \tag{5}$$

Due to their simplicity and wide range of applicability, the bounds of Eq. (5) were named the natural bounds [Balke and Pearl, 1993]. The natural bounds guarantee that the causal effect of the actual treatment cannot be lower than that of the encouragement by more than the sum of two measurable quantities, $P(y_1, d_0 \mid z_1) + P(y_0, d_1 \mid z_0)$; they also guarantee that the causal effect of treatment cannot exceed that of the encouragement by more than the sum of two other measurable quantities, $P(y_0, d_0 \mid z_1) + P(y_1, d_1 \mid z_0)$. The width of the natural bound, not surprisingly, is given by the rate of noncompliance, $P(d_1 \mid z_0) + P(d_0 \mid z_1)$.

This width can be narrowed further using linear programming [Balke and Pearl, 1993], which shows that, even under conditions of imperfect compliance, some experimental data (i.e., $P(y, d \mid z)$) can permit the precise evaluation of $\alpha$.

The analysis also shows that $\alpha^*$ can be assessed with greater accuracy than $\alpha$. More remarkably, under conditions of "no intrusion" (namely, $P(d_1 \mid z_0) = 0$, as in most clinical trials) $\alpha^*$ can be identified precisely [Angrist and Imbens, 1991]. The bounds governing $\alpha^*$ are (see Appendix):

$$\begin{aligned}
\alpha^* &\ge \frac{P(y_1 \mid z_1) - P(y_1 \mid z_0)}{P(d_1)/P(z_1)} - \frac{P(y_0, d_1 \mid z_0)}{P(d_1)} \\
\alpha^* &\le \frac{P(y_1 \mid z_1) - P(y_1 \mid z_0)}{P(d_1)/P(z_1)} + \frac{P(y_1, d_1 \mid z_0)}{P(d_1)}
\end{aligned} \tag{6}$$

Clearly, in situations where treatment may only be obtained by those encouraged (by assignment), $P(d_1) = P(z_1) P(d_1 \mid z_1)$, so $\alpha^*$ is identifiable and is given by:

$$\alpha^* = \frac{P(y_1 \mid z_1) - P(y_1 \mid z_0)}{P(d_1 \mid z_1)} \quad \text{if } P(d_1 \mid z_0) = 0 \tag{7}$$

Unlike the $\alpha$-measure, $\alpha^*$ is not an intrinsic property of the treatment, as it varies with the encouraging instrument. The significance of the $\alpha^*$ measure emerges primarily in studies where it is desired to evaluate the efficacy of an existing program on its current participants. In such studies, assuming the encouragement is randomized, one can simply measure the mean response difference between the encouraged and non-encouraged populations, divided by the rate of participation $P(d_1 \mid z_1)$.

4 EXAMPLE

To demonstrate by example how the bounds for $\alpha$ can be used to provide meaningful information about causal effects, consider the Lipid Research Clinics Coronary Primary Prevention Trial data (see [Lipid Research Clinic Program, 1984]). A portion of this data consisting of 337 subjects was analyzed in [Efron and Feldman, 1991] and is the focus of this example. Subjects were randomized into two treatment groups; in the first group all subjects were prescribed cholestyramine ($z_1$), while the subjects in the other group were prescribed a placebo ($z_0$). During several years of treatment, each subject's cholesterol level was measured multiple times, and the average of these measurements was used as the post-treatment cholesterol level (continuous variable CF). The compliance of each subject was determined by tracking the quantity of prescribed dosage consumed (a continuous quantity).

In order to apply our analysis to this study, the continuous data is first transformed, using thresholds, to binary variables representing treatment assignment (Z), received treatment (D), and treatment response (Y). The threshold for dosage consumption was selected as roughly the midpoint between minimum and maximum consumption, while the threshold for cholesterol level reduction was selected at 28 units. The data samples after thresholding give rise to the following eight probabilities^3:

$$\begin{aligned}
P(y_0, d_0 \mid z_0) &= 0.919 & P(y_0, d_0 \mid z_1) &= 0.315 \\
P(y_0, d_1 \mid z_0) &= 0.000 & P(y_0, d_1 \mid z_1) &= 0.139 \\
P(y_1, d_0 \mid z_0) &= 0.081 & P(y_1, d_0 \mid z_1) &= 0.073 \\
P(y_1, d_1 \mid z_0) &= 0.000 & P(y_1, d_1 \mid z_1) &= 0.473
\end{aligned}$$

This data represents a compliance rate of

$$P(d_1 \mid z_1) = 0.139 + 0.473 = 0.61,$$

a mean difference of

$$\Delta(Y) = P(y_1 \mid d_1) - P(y_1 \mid d_0),$$

and an encouragement effect of

$$P(y_1 \mid z_1) - P(y_1 \mid z_0) = 0.465$$

According to Eq. (5), $\alpha$ can be bounded by:

$$\begin{aligned}
\alpha &\ge 0.465 - 0.073 - 0.000 = 0.392 \\
\alpha &\le 0.465 + 0.315 + 0.000 = 0.780
\end{aligned}$$

These are remarkably informative bounds: although 38.8% of the subjects deviated from their treatment protocol, the experimenter can categorically state that, when applied uniformly to the population, the treatment is guaranteed to improve by at least 39.2% the probability of reducing the level of cholesterol by 28 points or more. This guarantee is purely mathematical and does not rest on any assumed model of subject behavior.

The impact of treatment "on the treated" is equally revealing. Using Eq. (7), $\alpha^*$ can be evaluated precisely (since $P(d_1 \mid z_0) = 0$), giving

$$\alpha^* = \frac{0.465}{0.610} = 0.762$$

In other words, those subjects who stayed in the program are much better off than they would have been otherwise. The treatment can be credited with reduced cholesterol levels (of at least 28 units) in precisely 76.2% of these subjects.

^3 We make the large-sample assumption and take the sample frequencies as representing $P(y, d \mid z)$.

5 CONCLUDING REMARKS

Indirect experiments have been considered by many researchers, mostly in the context of linear regression models in econometrics [Bowden and Turkington, 1984]. Alternative non-parametric treatments of noncompliance can be found in [Robins, 1989, Manski, 1990, Angrist and Imbens, 1991]. However, the languages used in these treatments render them extremely unlikely to reach the audience of this symposium. The analyses in [Pearl, 1993] and [Balke and Pearl, 1993] are cast in graphical models and contain new and tighter bounds. We hope that the availability of these findings will encourage the use of indirect experimentation whenever randomized controlled experiments are infeasible or undesirable.

Another set of results of possible interest to this audience are those concerning the deduction of causal effects from purely observational studies. Given an arbitrary causal graph of the type described in Figure 1, some of whose nodes are observable and some unobservable, it is now possible to determine by graphical techniques whether the causal effect of one variable on another can be computed from non-experimental data over the observables [Pearl, 1993]. If the answer is positive, then randomized experiments are not necessary and one can predict the effect of interventions by symbolic manipulations of graphs and probabilities.

References

[Angrist and Imbens, 1991] J.D. Angrist and G.W. Imbens. Source of Identifying Information in Evaluation Models. Discussion Paper 1568, Department of Economics, Harvard University, Cambridge, MA, 1991.

[Balke and Pearl, 1993] Alexander Balke and Judea Pearl. Nonparametric Bounds on Causal Effects from Partial Compliance Data. Technical Report No. 199, Cognitive Systems Laboratory, UCLA Computer Science Department, Los Angeles, CA, September 1993. Submitted.

[Bowden and Turkington, 1984] Roger J. Bowden and Darrell A. Turkington. Instrumental Variables. Cambridge University Press, Cambridge, UK, 1984.

[Efron and Feldman, 1991] B. Efron and D. Feldman. Compliance as an explanatory variable in clinical trials. Journal of the American Statistical Association, 86(413):9-26, March 1991.

[Heckman, 1992] James J. Heckman. Randomization and Social Policy Evaluation. In C. Manski and I. Garfinkel, eds., Evaluating Welfare and Training Programs, pages 201-230, Harvard University Press, 1992.

[Lipid Research Clinic Program, 1984] The Lipid Research Clinics Coronary Primary Prevention Trial results, parts I and II. Journal of the American Medical Association, 251(3):351-374, January 1984.

[Manski, 1990] Charles F. Manski. Nonparametric bounds on treatment effects. American Economic Review, Papers and Proceedings, 80:319-323, May 1990.

[Palca, 1989] J. Palca. AIDS Drug Trials Enter New Age. Science Magazine, pages 19-21, October 1989.

[Pearl, 1993] J. Pearl. From Bayesian Networks to Causal Networks. Proceedings of the Adaptive Computing and Information Processing Seminar, Brunel Conference Centre, London, January 25-27, 1994. See also Statistical Science, 8(3), 266-269, 1993.

[Robins, 1989] J.M. Robins. The Analysis of Randomized and Non-randomized AIDS Treatment Trials Using a New Approach to Causal Inference in Longitudinal Studies. In L. Sechrest, H. Freeman, and A. Mulley (Eds.), Health Service Research Methodology: A Focus on AIDS, NCHSR, U.S. Public Health Service, 113-159, 1989.
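Appendix

To prove (5), we write

$$P(y, d \mid z) = \sum_u P(y \mid d, u)\, P(d \mid z, u)\, P(u) \tag{8}$$

and define the following four functions:

$$f_0(u) = P(y_1 \mid d_0, u), \qquad f_1(u) = P(y_1 \mid d_1, u) \tag{9}$$

$$g_0(u) = P(d_1 \mid u, z_0), \qquad g_1(u) = P(d_1 \mid u, z_1) \tag{10}$$

This permits us to express six independent components of $P(y, d \mid z)$ as expectations of these functions:

$$\begin{aligned}
P(y_1, d_0 \mid z_0) &= E[f_0 (1 - g_0)] = a \\
P(y_1, d_0 \mid z_1) &= E[f_0 (1 - g_1)] = b \\
P(d_1 \mid z_0) &= E(g_0) = c \\
P(d_1 \mid z_1) &= E(g_1) = d \\
P(y_1, d_1 \mid z_0) &= E[f_1 \cdot g_0] = e \\
P(y_1, d_1 \mid z_1) &= E[f_1 \cdot g_1] = h
\end{aligned} \tag{11}$$

For any two random variables X and Y such that $0 \le X \le 1$, $0 \le Y \le 1$ we have

$$1 + E(XY) - E(Y) \ge E(X) \ge E(XY)$$

since $E[(1 - X)(1 - Y)] \ge 0$. This inequality holds for any pair of f, g functions (since they lie between 0 and 1) and we can write:

$$\begin{aligned}
1 + E(f_1 g_0) - E(g_0) &\ge E(f_1) \ge E(f_1 g_0) \\
1 + E(f_1 g_1) - E(g_1) &\ge E(f_1) \ge E(f_1 g_1) \\
1 + E[f_0 (1 - g_0)] - E(1 - g_0) &\ge E(f_0) \ge E[f_0 (1 - g_0)] \\
1 + E[f_0 (1 - g_1)] - E(1 - g_1) &\ge E(f_0) \ge E[f_0 (1 - g_1)]
\end{aligned}$$

or,

$$\begin{aligned}
\max[h; e] &\le E(f_1) \le \min[(1 + e - c); (1 + h - d)] \\
\max[a; b] &\le E(f_0) \le \min[(a + c); (b + d)]
\end{aligned} \tag{12}$$

Lower bounding $E(f_1)$ and upper bounding $E(f_0)$ provides a lower bound for their difference

$$E(f_1) - E(f_0) \ge \max[e; h] - \min[(a + c); (b + d)] \ge h - (a + c) \tag{13}$$

Substituting back the $P(y, d \mid z)$ expressions from Eqs. (9)-(11) yields the lower bound of Eq. (5). Similarly, the difference can be upper bounded by

$$E(f_1) - E(f_0) \le \min[(1 + e - c); (1 + h - d)] - \max[a; b] \le 1 + h - d - a$$

thus proving (5).

To evaluate

$$\alpha^* = E\{[P(y_1 \mid d_1, u) - P(y_1 \mid d_0, u)] \mid D = d_1\}$$

we define

$$\Delta(u) = P(y_1 \mid d_1, u) - P(y_1 \mid d_0, u) = f_1(u) - f_0(u)$$

and write, with $q = P(z_1)$,

$$\begin{aligned}
\alpha^* &= E[\Delta(u) \mid D = d_1] = \sum_u \Delta(u) P(u \mid d_1) \\
&= \frac{1}{P(d_1)} \sum_u \Delta(u) P(d_1 \mid u) P(u) \\
&= \frac{1}{P(d_1)} \sum_u \sum_z \Delta(u) P(d_1 \mid u, z) P(z) P(u) \\
&= \frac{1}{P(d_1)} \sum_u \Delta(u) P(u) [P(z_1) g_1(u) + P(z_0) g_0(u)] \\
&= \frac{1}{P(d_1)} E\{[f_1(u) - f_0(u)][q g_1(u) + (1 - q) g_0(u)]\} \\
&= \frac{1}{P(d_1)} E[q f_1 g_1 + (1 - q) f_1 g_0 - q f_0 g_1 - (1 - q) f_0 g_0] \\
&= \frac{1}{P(d_1)} [q h + (1 - q) e - q E(f_0 g_1) - (1 - q) E(f_0 g_0)] \\
&= \frac{1}{P(d_1)} [q h + (1 - q) e - q (E(f_0) - b) - (1 - q)(E(f_0) - a)] \\
&= \frac{1}{P(d_1)} [q (h + b) + (1 - q)(e + a) - E(f_0)]
\end{aligned} \tag{14}$$

Substituting the expressions for $(h + b)$ and $(e + a)$ from (11), which give $q(h + b) + (1 - q)(e + a) = P(y_1)$, and using $a \le E(f_0) \le a + c$ from (12), we obtain upper and lower bounds on $\alpha^*$:

$$\begin{aligned}
\frac{1}{P(d_1)} [P(y_1) - P(d_1 \mid z_0) - P(y_1, d_0 \mid z_0)] &\le \alpha^* \\
\frac{1}{P(d_1)} [P(y_1) - P(y_1, d_0 \mid z_0)] &\ge \alpha^*
\end{aligned} \tag{15}$$

Alternatively, collecting common terms in both expressions of (15), we get

$$-\frac{P(y_0, d_1 \mid z_0)}{P(d_1)} \le \alpha^* - \frac{P(y_1 \mid z_1) - P(y_1 \mid z_0)}{P(d_1)/P(z_1)} \le \frac{P(y_1, d_1 \mid z_0)}{P(d_1)}$$

Thus, since $P(d_1) = P(z_1) P(d_1 \mid z_1)$ whenever $P(d_1 \mid z_0) = 0$,

$$\alpha^* = \frac{P(y_1 \mid z_1) - P(y_1 \mid z_0)}{P(d_1 \mid z_1)} \quad \text{if } P(d_1 \mid z_0) = 0$$

which proves (6) and (7).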
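
The natural bounds of Eq. (5) and the identification formula of Eq. (7) are simple enough to check numerically. The following is a minimal Python sketch, not part of the original analysis: the function names, the dictionary layout keyed by (y, d, z), and the constant EXAMPLE are assumptions introduced here for illustration, and the eight probabilities are those listed in the example of Section 4.

```python
# Illustrative sketch; all names below are hypothetical, not from the paper.
# p[(y, d, z)] holds the observed conditional probability P(y, d | z)
# for binary y, d, z (1 = y1/d1/z1, 0 = y0/d0/z0).

def marginal_y1(p, z):
    """P(y1 | z): probability of a positive response under assignment z."""
    return p[(1, 0, z)] + p[(1, 1, z)]


def marginal_d1(p, z):
    """P(d1 | z): probability of receiving treatment under assignment z."""
    return p[(0, 1, z)] + p[(1, 1, z)]


def natural_bounds(p):
    """Lower and upper natural bounds of Eq. (5) on the causal effect alpha."""
    enc = marginal_y1(p, 1) - marginal_y1(p, 0)  # encouragement effect
    lower = enc - p[(1, 0, 1)] - p[(0, 1, 0)]
    upper = enc + p[(0, 0, 1)] + p[(1, 1, 0)]
    return lower, upper


def effect_on_treated(p):
    """alpha* from Eq. (7); only valid under 'no intrusion', P(d1 | z0) = 0."""
    if marginal_d1(p, 0) != 0:
        raise ValueError("Eq. (7) requires P(d1 | z0) = 0")
    return (marginal_y1(p, 1) - marginal_y1(p, 0)) / marginal_d1(p, 1)


# The eight thresholded probabilities from the cholestyramine example.
EXAMPLE = {
    (0, 0, 0): 0.919, (0, 1, 0): 0.000, (1, 0, 0): 0.081, (1, 1, 0): 0.000,
    (0, 0, 1): 0.315, (0, 1, 1): 0.139, (1, 0, 1): 0.073, (1, 1, 1): 0.473,
}

if __name__ == "__main__":
    lo, hi = natural_bounds(EXAMPLE)
    print(f"{lo:.3f} <= alpha <= {hi:.3f}")
    print(f"alpha* = {effect_on_treated(EXAMPLE):.3f}")
```

On the example data this reproduces the bounds 0.392 and 0.780, whose difference equals the noncompliance rate 0.388. For the effect on the treated it gives approximately 0.760 rather than the 0.762 quoted in the text, because the sketch divides by the unrounded compliance rate 0.612 instead of 0.610.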