Comorbid science?1
doi:10.1017/S0140525X10000609
David Danks,a,b Stephen Fancsali,a Clark Glymour,a,b and Richard Scheinesa
a Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213, and b Institute for
Human and Machine Cognition, Pittsburgh, PA 15213.
ddanks@cmu.edu
http://www.hss.cmu.edu/philosophy/faculty-danks.php
sfancsal@andrew.cmu.edu
cg09@andrew.cmu.edu
http://www.hss.cmu.edu/philosophy/faculty-glymour.php
scheines@cmu.edu
http://www.hss.cmu.edu/philosophy/faculty-scheines.php
Abstract: We agree with Cramer et al.’s goal of the discovery of causal relationships, but we argue
that the authors’ characterization of latent variable models (as deployed for such purposes)
overlooks a wealth of extant possibilities. We provide a preliminary analysis of their data, using
existing algorithms for causal inference and for the specification of latent variable models.
We agree with the view that Cramer et al. develop in the target article: that naïve latent variable
models often fall woefully short of ideal. Unfortunately, their proposed solution and accompanying
test case suffer from a number of flaws.
Cramer et al. begin with a straw man: They assume that, in a latent variable model, symptoms
cannot also influence one another. Unless we define “latent variable model” to exclude such effects,
there is no reason to impose such a constraint on our models. Mathematically, it is straightforward
for latent variable models to have both latent common causes of measured variables and direct
influences of measured variables on other measured variables. This is often the case for actual
causal structures; for example, when there is confounding in observational or quasi-experimental
studies.
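The point that a single model can contain both a latent common cause and a direct influence between measured variables can be illustrated with a small simulation. The following is a minimal sketch, not a model of Cramer et al.'s data: the linear Gaussian form, the loading of 0.8, the direct effect of 0.5, and all function names are assumptions chosen for illustration.

```python
import random

LOADING = 0.8  # hypothetical loading of each symptom on the latent variable


def simulate(n, direct_effect, seed=0):
    """Draw n cases of three symptoms that share one latent common cause.

    direct_effect is the (hypothetical) coefficient of x1 in the equation
    for x2; setting it to 0 gives a pure common-cause model.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        x1 = LOADING * latent + rng.gauss(0, 1)
        x2 = LOADING * latent + direct_effect * x1 + rng.gauss(0, 1)
        x3 = LOADING * latent + rng.gauss(0, 1)
        rows.append((x1, x2, x3))
    return rows


def covariance(rows, i, j):
    """Sample covariance of columns i and j."""
    n = len(rows)
    mean_i = sum(r[i] for r in rows) / n
    mean_j = sum(r[j] for r in rows) / n
    return sum((r[i] - mean_i) * (r[j] - mean_j) for r in rows) / (n - 1)


def implied_cov_x1_x2(direct_effect):
    """Population covariance of x1 and x2 implied by the equations above."""
    var_x1 = LOADING ** 2 + 1.0
    return LOADING ** 2 + direct_effect * var_x1


pure = simulate(20000, direct_effect=0.0)
mixed = simulate(20000, direct_effect=0.5)
print("pure:  sample", covariance(pure, 0, 1), "implied", implied_cov_x1_x2(0.0))
print("mixed: sample", covariance(mixed, 0, 1), "implied", implied_cov_x1_x2(0.5))
```

In this sketch the covariance of x1 and x2 has two sources, the shared latent variable and the direct path, and both are ordinary parameters of one well-defined model.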
Cramer et al. further claim that a “latent variable model renders all symptoms equally central and
thus exchangeable” (sect. 5, para. 3). This claim is difficult to understand. “Central” is neither a
causal nor a statistical notion; “exchangeable” is a statistical notion that, if meant, would be quite
inappropriate in this usage. Cramer et al. might mean that in latent variable models all symptoms
have the same variance, or the same dependence on any latent variables, or the same probability
distributions conditional on values of latent variables, or the same probabilities conditional on one
another. Each of these claims is violated in many latent variable models in the social sciences and
elsewhere, and all of these claims are false unless “latent variable model” is arbitrarily defined so as
to satisfy them. But that would be to focus on a model class that no one ought to accept a priori in
the first place.
Cramer et al. focus on a stiff but appropriate standard for (their version of) latent variable models:
The models should get the causal relations right. Unfortunately, they do not apply that same
standard to their favored alternative. Instead, they resort to representing simple associations, with
occasional suggestions that the relationships they so specify could be causal (e.g., sleep deprivation
causes tiredness). Simple associations cannot, in general, be used reliably to estimate causal
relations: they ignore possible screening off (conditional independence) relations, measurement
errors, and latent confounding, and they give no direction to causal relations when they exist.
Cramer et al. focus on two extreme model classes and ignore the enormous space of (learnable and
estimable) models that lie between these poles; those models can include all of latent variables,
direct causal connections, and feedback cycles (of varying speeds).
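Why simple associations mislead when screening off is ignored can be seen in a three-variable chain. The sketch below assumes a hypothetical linear Gaussian chain x -> y -> z (not taken from the target article's data) and uses the standard first-order partial correlation formula: x and z are clearly associated, yet the association vanishes once y is conditioned on, so an association network would draw an x-z edge that corresponds to no direct causal relation.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, zs, ys):
    """Correlation of xs and zs after conditioning on ys (first-order formula)."""
    r_xz = correlation(xs, zs)
    r_xy = correlation(xs, ys)
    r_zy = correlation(zs, ys)
    return (r_xz - r_xy * r_zy) / math.sqrt((1 - r_xy ** 2) * (1 - r_zy ** 2))


# Hypothetical chain: x causes y, y causes z, and x has no direct effect on z.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(20000)]
y = [xi + rng.gauss(0, 1) for xi in x]
z = [yi + rng.gauss(0, 1) for yi in y]

marginal = correlation(x, z)
conditional = partial_correlation(x, z, y)
print("corr(x, z)     =", marginal)
print("corr(x, z | y) =", conditional)
```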
Consider instead searching for graphical causal models from their data. Absent latent variables,
graphical causal models – both cyclic and acyclic – specify conditional independence relations. These
relations can be used to search for cyclic and acyclic causal models. The theory of cyclic graphical
models is difficult and underdeveloped, and for binary variables no adequate search procedure is
available, but the target article does not engage what is known (e.g., Lacerda et al. 2008; Pearl &
Dechter 1996; Richardson 1996). For acyclic graphs, there are many correct search algorithms (e.g.,
PC, Spirtes & Glymour 1991; FCI, Spirtes et al. 1993; Conservative PC or CPC, Ramsey et al. 2006;
Greedy Equivalence Search [GES], Meek 1997). The PC algorithm, for example, is an asymptotically
pointwise consistent search procedure under independent and identically distributed (i.i.d.)
sampling when there are no latent variables and the structure is acyclic. The PC algorithm will
sometimes return double-headed arrows; asymptotically, the appearance of such structures in the PC
output indicates latent common causes of the connected variables. We can at least begin to explore
the possibilities with PC.
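The adjacency (skeleton) phase of a PC-style search can be sketched briefly. This is a simplified illustration under stated assumptions, not the implementation used for the analysis reported here: it assumes continuous Gaussian data and a Fisher z test on partial correlations (the data at issue are binary and would need a different test), it omits the orientation phase entirely, and every function name is hypothetical.

```python
import itertools
import math
import random


def correlation_matrix(rows):
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
            for b in range(p)] for a in range(p)]
    return [[cov[a][b] / math.sqrt(cov[a][a] * cov[b][b]) for b in range(p)]
            for a in range(p)]


def partial_corr(corr, i, j, cond):
    """Partial correlation of i and j given cond, via the recursive formula."""
    if not cond:
        return corr[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def independent(corr, n, i, j, cond, alpha):
    """Fisher z test: True if independence of i and j given cond is not rejected."""
    r = max(min(partial_corr(corr, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return p_value > alpha


def pc_skeleton(rows, alpha=0.05):
    """Start from a complete graph and remove edges between variables that
    test as independent given some subset of current neighbours."""
    n, p = len(rows), len(rows[0])
    corr = correlation_matrix(rows)
    adj = {i: set(range(p)) - {i} for i in range(p)}
    sepsets = {}
    depth = 0
    while any(len(adj[i]) - 1 >= depth for i in range(p)):
        for i in range(p):
            for j in sorted(adj[i]):
                for cond in itertools.combinations(sorted(adj[i] - {j}), depth):
                    if independent(corr, n, i, j, cond, alpha):
                        adj[i].discard(j)
                        adj[j].discard(i)
                        sepsets[frozenset((i, j))] = set(cond)
                        break
        depth += 1
    return adj, sepsets


# Hypothetical chain 0 -> 1 -> 2 with Gaussian noise.
rng = random.Random(2)
data = []
for _ in range(5000):
    a = rng.gauss(0, 1)
    b = a + rng.gauss(0, 1)
    c = b + rng.gauss(0, 1)
    data.append((a, b, c))
adjacency, separating_sets = pc_skeleton(data, alpha=0.05)
print(adjacency, separating_sets)
```

The recorded separating sets are what a full PC implementation would go on to use when orienting edges; that step, and the handling of binary variables, are deliberately left out of this sketch.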
The data that Cramer et al. use to illustrate their approach are missing more than 70% of the
possible values, and there is no explanation in their article of how those missing values are treated.
There are several possibilities: A search can be conducted using only cases with no missing values,
but that would include only about 10% of the cases; missing values can be replaced at random
according to some prior distribution; missing values for a variable can be replaced using a probability
distribution equal to the frequency distribution of that variable in the available data; or, as in the PC
algorithm, relevant statistics can be computed using available data and ignoring missing values. For
the data Cramer et al. provide, and a .05 alpha value for conditional independence decisions, the PC
algorithm yields the graph in Figure 1:
The variables form three distinct clusters: The larger two correspond to the two focal diagnostic
categories, connected only by the two measures of sleep. The MDD (major depressive disorder)
measures form two disconnected components. Four variables (gSleep, gConc, gFatig, gIrri)
form a chain of double-headed arrows, suggesting that there may be an unobserved common cause.
The BPC (Build Pure Clusters) search algorithm (Silva et al. 2006) estimates whether a set of variables
shares a latent common cause; BPC is asymptotically correct for binary variables whose values are
two-valued projections of a Gaussian distribution. BPC finds that gSleep, gConc, gFatig, and gIrri do
have a latent common cause, as does a separate cluster of MDD measures (mRep, mRest, mSuic).
Similar results are found if missing values are replaced using the base frequency of each variable
value.
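Three of the missing-value treatments mentioned above (complete cases only, statistics computed from available data, and replacement according to each variable's observed frequency distribution) can be contrasted on a toy example. The sketch below is illustrative only: the six-case binary data set is invented, `None` marks a missing value, the phi coefficient stands in for whatever statistic a search would actually use, and all function names are hypothetical.

```python
import math
import random


def complete_cases(rows):
    """Listwise deletion: keep only cases with no missing values."""
    return [r for r in rows if None not in r]


def available_pairs(rows, i, j):
    """Available-case data for one statistic: cases observed on both i and j."""
    return [(r[i], r[j]) for r in rows if r[i] is not None and r[j] is not None]


def phi_coefficient(pairs):
    """Phi coefficient of association for pairs of binary (0/1) values."""
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    n00 = sum(1 for a, b in pairs if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom


def impute_from_marginals(rows, seed=0):
    """Replace each missing value with a draw from that variable's observed
    frequency distribution (assumes every variable has at least one observed value)."""
    rng = random.Random(seed)
    observed = [[v for v in col if v is not None] for col in zip(*rows)]
    return [tuple(v if v is not None else rng.choice(observed[j])
                  for j, v in enumerate(r)) for r in rows]


# Invented toy data: six cases, three binary symptoms, None = missing.
toy = [
    (1, 1, None),
    (1, 1, 1),
    (0, 0, 0),
    (0, None, 0),
    (None, 0, 1),
    (1, 0, None),
]

print("complete cases:", len(complete_cases(toy)), "of", len(toy))
pairs_01 = available_pairs(toy, 0, 1)
print("cases available for variables 0 and 1:", len(pairs_01))
print("phi(0, 1) from available cases:", phi_coefficient(pairs_01))
print("imputed:", impute_from_marginals(toy))
```

Even in this toy case the available-case statistic uses twice as many cases as listwise deletion, which is the practical reason for preferring it when most values are missing.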
This analysis is scarcely complete. For example, were the data good enough to warrant it, one could
apply the FCI (Fast Causal Inference) algorithm (Spirtes et al. 1993), which is correct when there are
both latent common causes and direct influences of measured variables on one another.
In our view, Cramer et al. unnecessarily restrict the modeling options, do not offer a plausible,
reliable method for causal inference, and fail to explore what the data from their own example
might reveal.
ACKNOWLEDGMENTS
David Danks is partially supported by a James S. McDonnell Foundation Scholar Award. Clark Glymour
is partially supported by a grant from the James S. McDonnell Foundation.
NOTE
1. Authors are listed in alphabetical order; all contributed equally to this commentary.
Figure 1 (Danks et al.). PC output for Cramer et al.’s data
Commentary/Cramer et al.: Comorbidity: A network perspective
154 BEHAVIORAL AND BRAIN SCIENCES (2010) 33:2/3