Soc. 504 Causal Analysis
10/24/06, Outline 9
A. Logic of causal analysis—Chapter 10 except for pp. on statistical inference
1. What do social scientists mean by “causality”?
a. association versus causality
b. we speak of causality in a probabilistic (or “stochastic”), not deterministic, sense
c. when we say that “X causes Y” we mean that X increases or reduces the
likelihood of Y
d. we rarely believe that the presence of X necessarily leads to Y, or that the
absence of X always prevents Y
e. nor do we expect the correlation between X and Y to equal 1
f. We will see that although sociologists want to be able to conclude that X has
an effect on Y, we can almost never prove that this is true. So we avoid the
“c” word, using “influence” or “affect” instead
2. three conditions necessary for researchers to conclude that X causes Y
a. X and Y must be associated or covary
(1) but association/covariation do not establish causation
b. X precedes Y in time
(1) temporal priority must be established based on empirical knowledge or
theory; statistical analyses cannot establish X’s temporal priority
(2) the temporal ordering of two variables may be reciprocal
(a) this makes temporal priority ambiguous (chicken and egg problem)
(b) moreover, reciprocal processes can vary in speed of cycle
i. we may not have the data necessary to observe the reciprocal
feedback between X and Y
ii. e.g., occupations’ sex composition and average pay
c. The association between X and Y is not spurious. An association is spurious
when it stems from some other variable T that causes both X and Y (textbook
calls T “X2”)
(1) unless we have a genuine experimental design (which we rarely do), we
can only guard against spuriousness by ruling out other possible causes
of the association between X and Y
(2) we do this by controlling for other possible causes of Y (confounding
variables)
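The spuriousness condition in 2.c can be illustrated with a small simulation. This is only a sketch under invented assumptions: the data are generated so that T causes both X and Y while X has no effect on Y, and the seed, sample size, and coefficients are arbitrary choices, not values from the textbook. It compares the bivariate correlation with the first-order partial correlation that holds T constant.

```python
import math
import random


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def partial_r(r_xy, r_xt, r_yt):
    """First-order partial correlation of X and Y, holding T constant."""
    return (r_xy - r_xt * r_yt) / math.sqrt((1 - r_xt ** 2) * (1 - r_yt ** 2))


rng = random.Random(504)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data: T causes both X and Y; X has no effect on Y at all.
t = [rng.gauss(0, 1) for _ in range(n)]
x = [ti + rng.gauss(0, 1) for ti in t]
y = [ti + rng.gauss(0, 1) for ti in t]

r_xy = pearson_r(x, y)
r_xy_given_t = partial_r(r_xy, pearson_r(x, t), pearson_r(y, t))

print(f"bivariate r(X, Y):   {r_xy:.2f}")
print(f"partial r(X, Y | T): {r_xy_given_t:.2f}")
```

In data built this way the bivariate correlation should be clearly positive while the partial correlation should sit near zero, which is the signature of a spurious association.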
B. Controlling for other variables
1. Experimental control
a. Experiments allow us to establish temporal priority and assess covariation
b. Experiments go a long way toward ruling out other possible explanations
c. True experiments include random assignment (randomization), in which the
experimenter randomly assigns subjects to control or experimental conditions;
thus, the only source of initial difference between groups is chance
d. Experimenter ensures that nothing varies except X, the experimental stimulus,
and experimenter controls the value of the experimental stimulus (experimental
manipulation);
(1) in the simplest experiment, experimental group is exposed to X and
control group is not
e. After exposing the experimental group to the experimental stimulus, the
researcher compares the values of Y for the two groups to see if they differ
enough that the difference is very unlikely to have resulted from the chance
involved in random assignment
f. given random assignment, there are only two possible sources of differences
on Y between experimental and control groups
(1) effect of X
(2) chance difference between control and experimental groups resulting from
random assignment, for which we can test using statistical inference
g. sociologists seldom do experiments because it is difficult to randomly assign
units to experimental and control groups or to control the experimental
manipulation; exceptions include status expectations research and some
policy experiments
(1) Bertrand and Mullainathan
(2) Moving to Opportunity
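The logic of points c through f can be sketched in code. Everything here is an illustrative assumption: the subjects, the background trait, the effect size of 2.0, and the seed are invented. The sketch randomly assigns subjects, compares mean Y across groups, and then uses a permutation test (re-running the random assignment many times) to ask how often chance alone would produce a difference at least as large as the one observed.

```python
import random

rng = random.Random(2006)  # fixed seed so the run is repeatable
n = 200
true_effect = 2.0  # hypothetical effect of X on Y, chosen for illustration

# Each subject has a background trait that also affects Y.
traits = [rng.gauss(0, 1) for _ in range(n)]

# Random assignment: shuffle subject indices; the first half gets the stimulus.
order = list(range(n))
rng.shuffle(order)
treated = set(order[: n // 2])

y = [
    traits[i] + (true_effect if i in treated else 0.0) + rng.gauss(0, 1)
    for i in range(n)
]


def mean_difference(outcomes, treated_ids):
    """Mean of Y in the experimental group minus mean of Y in the control group."""
    exp = [v for i, v in enumerate(outcomes) if i in treated_ids]
    ctl = [v for i, v in enumerate(outcomes) if i not in treated_ids]
    return sum(exp) / len(exp) - sum(ctl) / len(ctl)


observed = mean_difference(y, treated)

# Permutation test: redo the random assignment with Y held fixed and count
# how often chance alone gives a difference at least as large as observed.
reps = 2000
extreme = 0
for _ in range(reps):
    rng.shuffle(order)
    relabeled = set(order[: n // 2])
    if abs(mean_difference(y, relabeled)) >= abs(observed):
        extreme += 1
p_value = (extreme + 1) / (reps + 1)

print(f"observed difference in means: {observed:.2f}")
print(f"permutation p-value: {p_value:.4f}")
```

Because assignment is random, the background trait is balanced across groups on average, so the difference in means estimates the effect of X and the permutation test addresses the remaining explanation, chance.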
2. Control in “observational” research
a. observational research refers to all research in which we can observe but not
literally control the values of the independent variables
b. in observational research, sociologists try to observe relationships between
potentially confounding variables and Y
c. Because we cannot manipulate X, we instead “control for” possible confounding
variables in several ways
(1) holding possibly confounding variables constant by exclusion; i.e., by
excluding all but one category of a potentially confounding variable T, so that
there is no variation in T’s value and it cannot explain the relationship
between X and Y.
(a) e.g., study of effect of aspirin on heart attack held sex constant by
studying only men
(b) Many studies of status attainment did the same thing: studied only
men; sometimes only white men
d. statistically controlling for possibly confounding variables
(1) statistical method of control depends on level of measurement
(a) If T (a possible confounding variable) is a categorical variable, we
can hold it constant through contingency tables
i. contingency tables: partial tables; Lazarsfeld
(b) If T is an interval- or ratio-level variable, we hold it constant by
partialling out its effects through multivariate analysis
i. multiple regression statistically controls for the variation in Y that
can be “explained” by T
ii. but if you control for variables that are correlated with X, you will also
control for some of the variation in X that is associated with Y
(e.g., race and class)
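The partial-table approach for a categorical T can be sketched as follows. The counts are invented for illustration, and X, Y, and T are all coded 0/1. The code computes the percentage difference in Y between categories of X in the zero-order table, and then again within each category of T.

```python
# Hypothetical counts keyed by (T, X, Y); all three variables are coded 0/1.
counts = {
    (0, 0, 0): 64, (0, 0, 1): 16,
    (0, 1, 0): 16, (0, 1, 1): 4,
    (1, 0, 0): 4,  (1, 0, 1): 16,
    (1, 1, 0): 16, (1, 1, 1): 64,
}


def pct_y(x_value, t_value=None):
    """Percent with Y = 1 among cases with X = x_value (optionally within one T)."""
    yes = total = 0
    for (t, x, y), k in counts.items():
        if x != x_value or (t_value is not None and t != t_value):
            continue
        total += k
        yes += k * y
    return 100 * yes / total


def pct_difference(t_value=None):
    """Percentage-point difference in Y = 1 between X = 1 and X = 0."""
    return pct_y(1, t_value) - pct_y(0, t_value)


print(f"zero-order table: {pct_difference():.0f} points")
for level in (0, 1):
    print(f"partial table, T = {level}: {pct_difference(level):.0f} points")
```

With these counts the zero-order difference is 36 percentage points and the difference in both partial tables is 0, so holding T constant removes the X-Y association entirely.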
3. How do we decide what potentially confounding variables we should control?
a. previous research or theory—the primary reason we do literature reviews
before we begin doing research
b. conceptualize the mechanisms that we believe explain the relationship between
X and Y and draw causal models
4. With statistical (as opposed to experimental) controls, we can never be sure that
we have ruled out all alternative explanations for an association between X and Y
a. often don’t have appropriate data
b. so we hesitate to use term “cause” (although we sometimes use less
inflammatory synonyms like “influence”, “shape”, or even “affect”)
5. In sum, multivariate analysis improves our ability to make causal inferences about
the association between X and Y when we have nonexperimental data by statistically
controlling for the values of other independent variables, but this never offers
decisive proof of causality
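A minimal sketch of statistical control for an interval-level T, under invented assumptions: the data are simulated so that the true effect of X on Y is 0.5 and T affects both X and Y. The coefficients, seed, and sample size are arbitrary. The code "partials out" T by regressing both X and Y on T and then regressing the Y residuals on the X residuals, which gives the same slope for X as a multiple regression of Y on X and T.

```python
import random


def slope_and_residuals(pred, out):
    """Bivariate OLS of out on pred: return the slope and the residuals."""
    n = len(pred)
    mean_p, mean_o = sum(pred) / n, sum(out) / n
    b = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, out)) / sum(
        (p - mean_p) ** 2 for p in pred
    )
    a = mean_o - b * mean_p
    return b, [o - (a + b * p) for p, o in zip(pred, out)]


rng = random.Random(10)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data: T affects both X and Y; the true effect of X on Y is 0.5.
t = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * ti + rng.gauss(0, 1) for ti in t]
y = [0.5 * xi + 1.0 * ti + rng.gauss(0, 1) for xi, ti in zip(x, t)]

# Bivariate slope: mixes the effect of X with the confounding path through T.
naive_slope, _ = slope_and_residuals(x, y)

# Partial out T from both X and Y, then relate what is left over.
_, x_resid = slope_and_residuals(t, x)
_, y_resid = slope_and_residuals(t, y)
controlled_slope, _ = slope_and_residuals(x_resid, y_resid)

print(f"slope of Y on X, no controls:     {naive_slope:.2f}")
print(f"slope of Y on X, controlling T:   {controlled_slope:.2f}")
```

The controlled slope recovers the simulated effect only because T is the sole confounder by construction; with real observational data an unmeasured confounder would leave the estimate biased, which is the point of items 4 and 5.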
C. Causal structures in the relationships between three or more variables
1. the model X → Y is often misspecified
a. Spuriousness: T → X and T → Y
b. Suppressor relationship: X and Y have no bivariate association, but they are
associated within categories of T, and the associations run in different
directions
(1) example: how much state spends on primary and secondary education is
negatively correlated with states’ mean SAT scores
c. statistical interaction exists when the nature, strength, or direction of the X-Y
association depends on (is conditional on, interacts with, is moderated by) a
third variable, T
(1) the effect of X on Y depends on the value of T (T can be a nominal,
ordinal or interval-level variable)
(2) easiest to understand statistical interaction by looking at contingency
tables (Lazarsfeld)
(3) we can include statistical interactions in regression equations.
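The patterns in b and c can be sketched with simulated data. The assumptions are invented: T is a binary variable, and the effect of X on Y is +1 when T = 0 and -1 when T = 1. The pooled slope then hides an association whose direction is conditional on T.

```python
import random


def slope(x, y):
    """Bivariate OLS slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )


rng = random.Random(9)  # fixed seed so the run is repeatable
n = 4000

# Hypothetical data: the effect of X on Y is +1 when T = 0 and -1 when T = 1.
t = [rng.randint(0, 1) for _ in range(n)]
x = [rng.gauss(0, 1) for _ in range(n)]
y = [(1.0 if ti == 0 else -1.0) * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

pooled = slope(x, y)

# Estimate the X-Y slope separately within each category of T.
within = {}
for level in (0, 1):
    xs = [xi for xi, ti in zip(x, t) if ti == level]
    ys = [yi for yi, ti in zip(y, t) if ti == level]
    within[level] = slope(xs, ys)

print(f"pooled slope of Y on X: {pooled:.2f}")
for level, b in within.items():
    print(f"slope within T = {level}:   {b:.2f}")
```

In a regression equation the same pattern would be captured by adding a product term for X and T, whose coefficient measures how much the slope of X changes across categories of T.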
2. chain relationship such that X affects Y directly and indirectly through T:
a. T intervenes between X and Y and thus interprets effect of X on Y
[X → T → Y]
b. to the extent that the effect of X diminishes/disappears when we control for
T, we’ve explained the mechanism through which X affects Y
c. e.g., parents’ SES → child’s educational attainment
parents’ SES → encyclopedia in the house → child’s educational attainment
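A chain relationship can be sketched with simulated data. All numbers are invented: X stands in for parents' SES, T for the intervening variable, and Y for the child's attainment, with an assumed direct effect of 0.2 and an indirect path of 0.7 times 0.6. The code estimates the total effect of X (no controls), the direct effect (controlling for T), and the indirect effect through T.

```python
import random


def center(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


rng = random.Random(130)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical chain: X affects T, and both X and T affect Y.
x = [rng.gauss(0, 1) for _ in range(n)]
t = [0.7 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.6 * ti + rng.gauss(0, 1) for xi, ti in zip(x, t)]

xc, tc, yc = center(x), center(t), center(y)
sxx, stt, sxt = dot(xc, xc), dot(tc, tc), dot(xc, tc)
sxy, sty = dot(xc, yc), dot(tc, yc)

total = sxy / sxx  # slope of Y on X with no controls
a = sxt / sxx      # slope of T on X

# Two-predictor OLS of Y on X and T, solved from the normal equations.
det = sxx * stt - sxt ** 2
direct = (stt * sxy - sxt * sty) / det  # slope of X, controlling for T
b = (sxx * sty - sxt * sxy) / det       # slope of T, controlling for X
indirect = a * b

print(f"total effect of X:       {total:.2f}")
print(f"direct effect of X:      {direct:.2f}")
print(f"indirect effect via T:   {indirect:.2f}")
```

For least-squares estimates the total effect equals the direct effect plus the indirect effect, so the amount by which the coefficient of X shrinks once T is controlled is the part of the effect that T interprets.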