Instrumental Variable Models

A. When would you want to use instruments? Whenever an explanatory variable is
endogenous, and thus X’e does not equal zero in theory. This means that one of your
independent variables is not fixed, and that it is potentially correlated with your errors. This
can happen in a number of different ways. In the medical literature, instrumental variables
are often used when there is no random assignment of patients into treatment and control
groups, meaning that whether or not a patient receives the treatment will depend on his or
her characteristics. If we do not or cannot measure all of these characteristics, we have an
omitted variable that helps to explain the level of treatment and potentially the medical
outcome.
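To see why a nonzero X’e matters, it can help to write out the OLS estimator. The lines below are a standard textbook sketch; the coefficient vector β, the omitted variable W, its coefficient γ, and the leftover noise u are notation introduced here for illustration and do not appear elsewhere in these notes.

```latex
\begin{aligned}
Y &= X\beta + e, \\
\hat{\beta}_{OLS} &= (X'X)^{-1}X'Y = \beta + (X'X)^{-1}X'e, \\
e &= W\gamma + u \;\Longrightarrow\; X'e = X'W\gamma + X'u .
\end{aligned}
```

If the omitted W is correlated with X and γ is not zero, the term X’Wγ does not vanish, so the OLS estimate does not center on β.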
In political science, the classic example of instrumental variable use can also be
thought of as necessitated by an omitted variable. Suppose we want to estimate the effects
of campaign contributions received by a candidate upon that candidate’s vote share. A
fundamental challenge for scholars estimating the impact of money on votes is that both
variables may be influenced by perceptions about the threat posed by a challenger, a factor
that is notoriously difficult to measure (leaving us with an omitted variable). Even when one
accounts for past performance of the challenger’s party and candidate quality, a link between
the unexplained variance in dollars and votes remains. This shared error most likely reflects
the unmeasured perceptions of the challenger’s chances. Researchers have attempted to deal
with this simultaneity by using two-stage least squares estimations (Jacobson, 1978, 1980,
1985, 1990; Green and Krasno, 1988; Gerber, 1998). These procedures first predict
candidate finances from factors that are (in theory) not directly related to election outcomes,
and then use these systematic figures – purged of their candidate-specific information – to
explain vote totals.
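The two-step logic described above can be sketched on simulated data. This is a minimal illustration, not a reconstruction of any of the cited studies: the variable names, the coefficients, and the sample size are all made up, and there is only one instrument and one endogenous regressor. Here x stands in for campaign money, y for votes, and `threat` for the unmeasured perception that drives both.

```python
import random


def ols_line(x, y):
    """Bivariate OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n=20000, true_effect=2.0, seed=1):
    """Hypothetical data with an unmeasured confounder called threat."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        threat = rng.gauss(0, 1)  # omitted variable: drives both x and y
        zi = rng.gauss(0, 1)      # instrument: moves x, no direct path to y
        xi = zi + threat + rng.gauss(0, 1)
        yi = true_effect * xi - 3.0 * threat + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def two_stage(z, x, y):
    """2SLS by hand: predict x from z, then regress y on the prediction."""
    a0, a1 = ols_line(z, x)             # first stage
    x_hat = [a0 + a1 * zi for zi in z]  # systematic part of x
    return ols_line(x_hat, y)[1]        # second-stage slope


z, x, y = simulate()
b_ols = ols_line(x, y)[1]
b_2sls = two_stage(z, x, y)
print(f"OLS slope:  {b_ols:.2f}")   # pulled away from the true effect of 2
print(f"2SLS slope: {b_2sls:.2f}")  # should land close to 2
```

With these made-up parameters the OLS slope converges to about 1 rather than 2, because the confounder enters the error. Note that running the second stage by hand like this gives the right coefficient but not the right standard errors, which is one reason to let a dedicated routine handle both stages.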
B. What makes good instruments? A set of valid instruments Z should be:
i. exogenous: uncorrelated with the errors, so that Z’e = 0
ii. correlated with the endogenous X (this is tested in the first-stage regression)
iii. correlated with Y only through X (this is another way of stating i., but it
is easier for me to think about in theory if I can draw a causal path diagram).
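These conditions map directly onto the algebra of the estimator. The sketch below uses the standard 2SLS formulas; β and the fitted values X̂ are notation added here, and Z is taken to include any exogenous regressors that also appear in the second stage.

```latex
\begin{aligned}
\hat{X} &= Z(Z'Z)^{-1}Z'X
  && \text{(first stage: fitted values of } X\text{)} \\
\hat{\beta}_{2SLS} &= (\hat{X}'\hat{X})^{-1}\hat{X}'Y
  && \text{(second stage)} \\
&= \beta + (\hat{X}'\hat{X})^{-1}X'Z(Z'Z)^{-1}Z'e
\end{aligned}
```

Condition i. (Z’e = 0) is what makes the last term drop out, and condition ii. is what keeps X̂’X̂ invertible: if Z barely predicts X, that matrix is close to singular and the estimates become very noisy.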
Political scientists have struggled to find valid instruments for campaign dollars:
exogenous factors that are unrelated to vote totals controlling for actual funding levels.
Gerber (1998) uses the unique circumstances of US Senate races to produce a set of
instruments which are plausibly exogenous and which pass a formal test outlined by
Hausman and Wu (Hausman, 1983). The higher the R-square in your first stage model, the
better your instruments are (because your two-stage least squares estimates will be more
efficient), but in practice first-stage R-squares of around 0.25 are common.
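As a rough illustration of what "first-stage R-square" measures, the sketch below computes it, along with the matching F statistic, for the simplest case of one instrument and one endogenous variable. The function name and the four data points are hypothetical. With several instruments and additional exogenous regressors, the relevant quantities come from the full first-stage regression rather than from this bivariate shortcut.

```python
def first_stage_strength(z, x):
    """R-squared and F statistic from regressing x on a single instrument z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    r2 = szx ** 2 / (szz * sxx)         # squared correlation of z and x
    f_stat = r2 / (1.0 - r2) * (n - 2)  # one regressor plus a constant
    return r2, f_stat


# Tiny made-up example, chosen so the arithmetic is easy to check by hand:
# szx = 3, szz = 5, sxx = 2, so r2 = 9 / 10 and F = 9 * 2 = 18.
r2, f_stat = first_stage_strength([0, 1, 2, 3], [0, 1, 1, 2])
print(round(r2, 3), round(f_stat, 1))
```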
C. How do you do this in Stata? Use the ivreg command (ivregress 2sls in more recent
versions of Stata) to first use a set of
instruments Z to generate predicted values for your endogenous X, and then use these
predicted values in a second stage regression (thus, 2SLS) to explain variation in Y. You can
put independent variables other than your instruments into the second stage model (and if
they exert an independent influence on Y, that’s where they should be). Note that you can
use the same set of instruments (such as district characteristics and campaign finance laws)
to instrument for two endogenous variables (such as fundraising by the incumbent and
fundraising by the challenger). Also note that if you want to look at the effects of, say,
logged campaign spending on vote totals, you should predict the log of spending in your
first stage regression. (See Kelejian, 1971.)
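A small numerical sketch of why the logged variable itself should be the first-stage outcome. This is only one way to illustrate the advice, with made-up numbers: spending here is constructed so that its log is exactly linear in a hypothetical instrument z.

```python
import math


def fit_line(z, v):
    """Bivariate OLS of v on z; returns (intercept, slope)."""
    n = len(z)
    mean_z = sum(z) / n
    mean_v = sum(v) / n
    slope = (sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, v))
             / sum((zi - mean_z) ** 2 for zi in z))
    return mean_v - slope * mean_z, slope


z = [0.0, 1.0, 2.0, 3.0]                # hypothetical instrument values
spending = [math.exp(zi) for zi in z]   # made-up spending levels

# Recommended: the first-stage outcome is the log of spending itself.
a0, a1 = fit_line(z, [math.log(s) for s in spending])
fitted_logs = [a0 + a1 * zi for zi in z]

# Tempting shortcut: predict spending in levels and take logs afterwards.
b0, b1 = fit_line(z, spending)
fitted_levels = [b0 + b1 * zi for zi in z]

print("fitted logs:  ", [round(v, 3) for v in fitted_logs])
print("fitted levels:", [round(v, 3) for v in fitted_levels])
```

In this example the fitted logs track z closely, while the smallest fitted level is negative, so its log is not even defined. More generally, the log of a predicted level is not the same thing as a predicted log.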
Below is Stata output for a 2SLS model that first uses political context as an
instrument for fundraising, and then measures the effects of these predicted values of
fundraising, along with other district characteristics, to explain vote totals. I have included a
table that displays these results, a description of what we learned by using 2SLS rather than
OLS, and the relevant citations.
ivreg chvote (ln_inc ln_chall = lead_inc inc_top resident compet ilim itopty
ptopty paclim ptylim comp_ind comp_pac alaska illinois oregon utah wyoming
carolina colorado tennesse georgia kentucky) partych chpast upper contest pro
if inc_race==1
Instrumental variables (2SLS) regression

  Source |       SS       df       MS                  Number of obs =     838
---------+------------------------------               F(  7,   830) =   50.31
   Model |  34342.3606     7  4906.05151               Prob > F      =  0.0000
Residual |  51149.5525   830  61.6259668               R-squared     =  0.4017
---------+------------------------------               Adj R-squared =  0.3967
   Total |  85491.9131   837  102.140876               Root MSE      =  7.8502

------------------------------------------------------------------------------
  chvote |      Coef.  Std. Err.          t   P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  ln_inc |  -2.056137   .5876623     -3.499   0.000      -3.209616   -.9026584
ln_chall |   2.571718    .425545      6.043   0.000       1.736447    3.406988
 partych |   2.088629    .576459      3.623   0.000       .9571398    3.220117
  chpast |   .1735927   .0238404      7.281   0.000       .1267982    .2203872
   upper |   1.903438   .6852125      2.778   0.006       .5584852    3.248391
 contest |   -.044639   .0197262     -2.263   0.024      -.0833581     -.00592
     pro |   .9897339   2.820606      0.351   0.726      -4.546625    6.526093
   _cons |   33.00756   3.315507      9.956   0.000        26.4998    39.51533
------------------------------------------------------------------------------
Instrumented:  ln_inc ln_chall
Instruments:   partych chpast upper contest pro lead_inc inc_top resident
               compet ilim itopty ptopty paclim ptylim comp_ind comp_pac alaska
               illinois oregon utah wyoming carolina colorado tennesse georgia
               kentucky
------------------------------------------------------------------------------
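As a reading aid, several of the summary numbers in the ivreg output can be recomputed from other entries in the same output. The sketch below simply redoes that arithmetic with values copied from the output; the variable names are mine. The reported F statistic is left out because it does not equal the ratio of the two mean squares shown in the table.

```python
import math

# Values copied from the ivreg output.
model_ss, resid_ss, total_ss = 34342.3606, 51149.5525, 85491.9131
resid_df, total_df = 830, 837
coef_ln_inc, se_ln_inc = -2.056137, 0.5876623

t_ln_inc = coef_ln_inc / se_ln_inc          # t = coefficient / standard error
r_squared = model_ss / total_ss             # share of total SS in "Model"
adj_r_squared = 1 - (resid_ss / resid_df) / (total_ss / total_df)
root_mse = math.sqrt(resid_ss / resid_df)   # square root of residual MS

print(f"t for ln_inc:  {t_ln_inc:.3f}")
print(f"R-squared:     {r_squared:.4f}")
print(f"Adj R-squared: {adj_r_squared:.4f}")
print(f"Root MSE:      {root_mse:.4f}")
```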
Table 3 provides the estimates of the effect of money on electoral performance in
districts in which an incumbent runs, while first-stage regressions (predicting fundraising
totals) are contained in Appendix B. The first model displays the results of an ordinary least
squares model that uses actual contribution levels, along with other factors, to explain the
challenger’s electoral performance. This approach does nothing to correct for unmeasured
perceptions of the threat posed by any challenger. Coefficients for the contribution levels
seem to show that, holding constant salient characteristics of the district and state,
challengers but not incumbents see a return on their investment of campaign dollars.
Logged challenger contributions have a positive impact on the race while incumbent
contributions do not exert a statistically significant impact.
Table 3
Model of Electoral Outcomes, for races with incumbents

Variable                                  OLS Model        2SLS Model
Challenger Contributions (natural log)    2.2** (0.14)     2.6** (0.4)
Incumbent Contributions (natural log)     -0.28 (0.32)     -2.1** (0.6)
Strength of Challenger’s Party            0.17** (0.02)    0.17** (0.02)
Challenger is a Democrat                  1.6** (0.5)      2.1** (0.6)
Upper House                               2.0** (0.7)      1.9** (0.7)
Contested % of Races                      -0.04 (0.02)     -0.044* (0.019)
Professionalism                           -2.5 (2.4)       0.9 (2.8)
Constant                                  23.5** (2.3)     33.0** (3.3)
Adjusted R-Square                         0.42             0.40

Entries are regression coefficients with standard errors in parentheses; the dependent variable is the
challenger’s percentage of the major party vote. N = 828. A “*” indicates a coefficient that is statistically
significant in a two-tailed test at the 0.05 level; “**” indicates significance at the 0.01 level.
Examining the results of the two-stage least squares model, which takes the
simultaneity between dollars and votes seriously, yields quite different lessons. Incumbents
and challengers benefit almost equally from additional fundraising. Logged challenger
contributions appear to have roughly 25% more impact on the race than logged incumbent
contributions. This difference may have some substantive importance, but it is about the
same size as either coefficient’s standard error. The near equivalence of challenger and
incumbent money is consistent with Gerber’s (1998) analysis of US Senate races. It differs
from Jacobson’s (1978) classic finding that congressional incumbents capture far fewer votes
per dollar spent, perhaps because state legislators – unlike members of Congress – still need
to work hard to establish name recognition in their districts.
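The comparison of the two spending effects can be checked with the 2SLS coefficients from the ivreg output. The sketch below compares magnitudes only, since the incumbent coefficient is negative for the challenger's vote share. It is not a formal test: that would need the covariance between the two estimates, which the output does not report.

```python
# 2SLS coefficients and standard errors copied from the ivreg output.
b_chall, se_chall = 2.571718, 0.425545
b_inc, se_inc = -2.056137, 0.5876623

ratio = abs(b_chall) / abs(b_inc)   # relative size of the two effects
gap = abs(b_chall) - abs(b_inc)     # absolute difference in magnitudes

print(f"ratio of magnitudes: {ratio:.2f}")
print(f"gap: {gap:.2f}; standard errors: {se_chall:.2f}, {se_inc:.2f}")
```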
Jacobson, Gary C. 1978. The Effects of Campaign Spending in Congressional Elections.
American Political Science Review 72:469-91.
Kelejian, Harry H. 1971. Two-Stage Least Squares and Econometric Systems Linear in
Parameters but Nonlinear in the Endogenous Variables. Journal of the American Statistical
Association 66:373-374.
Green, Donald P., and Jonathan S. Krasno. 1988. Salvation for the Spendthrift Incumbent.
American Journal of Political Science 32:844-907.
Gerber, Alan. 1998. Estimating the Effect of Campaign Spending on Senate Election
Outcomes Using Instrumental Variables. American Political Science Review 92:401-12.
Hausman, J.A. 1983. Specification and Estimation of Simultaneous Equation Models. In
Handbook of Econometrics, vol. 1, ed. Z. Griliches and Michael Intriligator, pp. 391-448.