Massachusetts Institute of Technology
Department of Economics
Working Paper Series

IV ESTIMATION WITH VALID AND INVALID INSTRUMENTS

Jinyong Hahn
Jerry Hausman

Working Paper 03-26, First Draft
Please do not cite without authors' permission
July 15, 2003

Room E52-251
50 Memorial Drive
Cambridge, MA 02142

This paper can be downloaded without charge from the
Social Science Research Network Paper Collection at
http://papers.ssrn.com/paper.taf?abstract_id=430061

IV Estimation with Valid and Invalid Instruments
Jinyong Hahn and Jerry Hausman
UCLA and MIT
Abstract

While 2SLS is the most widely used estimator for simultaneous equation models, OLS may do better in finite samples. Here we demonstrate analytically that for the widely used simultaneous equation model with one jointly endogenous variable and valid instruments, 2SLS has smaller MSE, up to second order, than OLS unless the $R^2$, or the F statistic of the reduced form equation, is extremely low. We then consider the relative estimators when the instruments are invalid, i.e. the instruments are correlated with the stochastic disturbance. Here, both 2SLS and OLS are biased in finite samples and inconsistent. We investigate conditions under which the approximate finite sample bias or the MSE of 2SLS is smaller than the corresponding statistics for the OLS estimator. We again find that 2SLS does better than OLS under a wide range of conditions. We then present a method of sensitivity analysis, which calculates the maximal asymptotic bias of 2SLS under small violations of the exclusion restrictions. For a given correlation between invalid instruments and the error term, we derive the maximal asymptotic bias. We apply our results to IV estimation of the returns to education. We derive the bias in the estimated standard errors of 2SLS for the first time. This derivation also has implications for the test of over-identifying restrictions.

Keywords: instrumental variables, 2SLS, weak instruments, returns to education
JEL codes: C1, C3
First Draft
Please do not quote without permission

IV Estimation with Valid and Invalid Instruments
Jinyong Hahn and Jerry Hausman
July 15, 2003
While 2SLS is the most widely used estimator for simultaneous equation models, OLS may do better in finite samples. Econometricians have recognized this possibility, and many Monte Carlo studies were undertaken in the early years of econometrics to attempt to determine conditions when OLS might do better than 2SLS. Here we demonstrate analytically that for the widely used simultaneous equation model with one jointly endogenous variable and valid instruments, 2SLS has smaller MSE, up to second order, than OLS unless the $R^2$, or the F statistic of the reduced form equation, is extremely low. We do a calculation based on observable statistics with one unknown parameter that should give valuable information about the relative MSEs of OLS and 2SLS.

We then consider the relative estimators when the instruments are invalid, i.e. the instruments are correlated with the stochastic disturbance. Here, both 2SLS and OLS are biased in finite samples and inconsistent. We investigate conditions under which the approximate finite sample bias or the MSE of 2SLS is smaller than the corresponding statistics for the OLS estimator. We again find that 2SLS does better than OLS under a wide range of conditions, which we characterize as functions of observable statistics and one unobservable statistic.

We then present a method of sensitivity analysis, which calculates the maximal asymptotic bias of 2SLS under small violations of the exclusion restrictions. For a given correlation between invalid instruments and the error term, we derive the maximal asymptotic bias. We demonstrate how such maximal asymptotic bias can be estimated in practice.

Next, we turn to inference. In the "weak instruments" situation the bias in the 2SLS estimator creates a problem, since it is biased towards the OLS estimator, which is also biased. The other problem that arises is that the estimated standard errors of the 2SLS estimator are often much too small to signal the problem of imprecise estimates.¹ Here we derive the bias in the estimated standard errors for the first time, which turns out to cause the problem. This derivation also has implications for the test of over-identifying restrictions.

1 We do not survey the weak instruments literature. For recent surveys see Stock et al. (2002) and Hahn and Hausman (2003).
I Model Specification

We begin with the model specification with one right hand side (RHS) jointly endogenous variable so that the left hand side (LHS) variable depends only on the single jointly endogenous RHS variable. This model specification accounts for other RHS predetermined (or exogenous) variables, which have been "partialled out" of the specification. We will assume that

(1.1)  $y_1 = \beta y_2 + \varepsilon_1$
(1.2)  $y_2 = z\pi_2 + v_2$

where $\dim(\pi_2) = K$. Thus, the matrix $z$ is the matrix of all predetermined variables, and equation (1.2) is the reduced form equation for $y_2$ with coefficient vector $\pi_2$. We also assume homoscedasticity:

(1.3)  $(\varepsilon_{1i}, v_{2i})' \sim N(0, \Omega), \quad \Omega = \begin{pmatrix} \sigma_{\varepsilon\varepsilon} & \sigma_{\varepsilon v} \\ \sigma_{\varepsilon v} & \sigma_{vv} \end{pmatrix}$

We use the following notation: $y_1 = (y_{11}, \ldots, y_{1n})'$, $y_2 = (y_{21}, \ldots, y_{2n})'$, $\sigma_{\varepsilon\varepsilon} = Var(\varepsilon_{1i})$, $\sigma_{vv} = Var(v_{2i})$, and $\sigma_{\varepsilon v} = Cov(\varepsilon_{1i}, v_{2i})$. Without loss of generality we normalize the data such that $y_2$ has zero mean. We initially assume the presence of valid instruments, $E[z'\varepsilon_1/n] = 0$ and $\pi_2 \neq 0$.
II Estimation with Valid Instruments

A Properties of the 2SLS Estimator

From previous papers, e.g. Hahn and Hausman (2002a, 2002b), we know the bias and MSE of 2SLS up to second order. The bias of 2SLS is

(2.1)  $E[b_{2SLS}] - \beta \approx \dfrac{K\sigma_{\varepsilon v}}{n\Theta}$

where $\Theta = \pi_2' z'z \pi_2 / n$, which is assumed to be fixed, is the theoretical value from the second (reduced form) equation, and $y_2$ is normalized to have mean zero. We assume:

Condition 1: $K/\sqrt{n} = \mu + o(1)$ for some $\mu$ as $n \to \infty$.

As a special case of Theorem 3 in Section 3, we obtain that:

Theorem 1: $\sqrt{n}\,(b_{2SLS} - \beta) \Rightarrow N\!\left(\dfrac{\mu\sigma_{\varepsilon v}}{\Theta},\ V_{2SLS}\right)$

where $V_{2SLS} = \sigma_{\varepsilon\varepsilon}/\Theta$ is the usual first order asymptotic variance. As a consequence, we obtain the approximate MSE of 2SLS:

(2.2)  $MSE(2SLS) = M_2 = \left(\dfrac{K\sigma_{\varepsilon v}}{n\Theta}\right)^2 + \dfrac{\sigma_{\varepsilon\varepsilon}}{n\Theta}$

Note that, with $R^2$ fixed, both terms in equation (2.2) approach zero as $(y_2'y_2)$ increases with increasing sample size. The first term, bias squared, approaches zero more quickly, as expected, since 2SLS is "root n" consistent.

We now simplify the expression for 2SLS. Without loss of generality we use the normalization (rescaling of units) $\sigma_{\varepsilon\varepsilon} = \sigma_{vv} = 1$, so that $Var(y_2) = 1/(1-R^2)$ and $\sigma_{\varepsilon v} = \rho$. Using this normalization we find:

(2.3)  $MSE(2SLS) = M_2 = \dfrac{\left(K^2\rho^2(1-R^2) + nR^2\right)(1-R^2)}{n^2R^4}$

These parameters are theoretical values from the underlying model specifications for given parameter values. The convergence of MSE to zero in terms of the sample size n becomes quite evident with this normalization.
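The decomposition behind (2.2) and (2.3) can be checked numerically. The following is a minimal sketch, assuming the normalization $\sigma_{\varepsilon\varepsilon} = \sigma_{vv} = 1$ (so $\sigma_{\varepsilon v} = \rho$ and $\Theta = R^2/(1-R^2)$). The function names and the parameter values are illustrative choices, not part of the paper.

```python
def theta_from_r2(r2):
    """Theta implied by R^2 when sigma_vv = 1, since R^2 = Theta / (Theta + 1)."""
    return r2 / (1.0 - r2)


def bias_2sls(n, k, rho, r2):
    """Approximate 2SLS bias K * sigma_ev / (n * Theta), with sigma_ev = rho."""
    return k * rho / (n * theta_from_r2(r2))


def mse_2sls(n, k, rho, r2):
    """Approximate 2SLS MSE in the closed form of (2.3)."""
    return (k**2 * rho**2 * (1.0 - r2) + n * r2) * (1.0 - r2) / (n**2 * r2**2)


# Illustrative values only: the approximate MSE shrinks as n grows.
for n in (100, 1000, 10000):
    print(n, mse_2sls(n, k=5, rho=0.5, r2=0.1))
```

The closed form should equal squared bias plus variance over n, with variance $1/\Theta$ under the normalization, which is one way to confirm the algebra of (2.3).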
B Properties of the OLS Estimator

We now calculate the bias and MSE of the OLS estimator. The approximate bias is:

(2.4)  $E[b_{OLS}] - \beta \approx \dfrac{\sigma_{\varepsilon v}}{\Theta + \sigma_{vv}} = \dfrac{\sigma_{\varepsilon v}}{Var(y_2)}$

The approximate variance is defined as:

(2.5)  $V_{OLS} = \dfrac{\sigma_{\varepsilon\varepsilon}}{Var(y_2)} - \dfrac{(1 + 2R^2)\,\sigma_{\varepsilon v}^2}{Var^2(y_2)}$

As a special case of Theorem 4 in Section 3, we obtain the distribution for the OLS estimator:

Theorem 2: $\sqrt{n}\left(b_{OLS} - \beta - \dfrac{\sigma_{\varepsilon v}}{\Theta + \sigma_{vv}}\right) \Rightarrow N(0,\ V_{OLS})$

This result is the same as in Hausman (1978). Thus, the approximate MSE of OLS is

(2.6)  $MSE(OLS) = M_O = \dfrac{\sigma_{\varepsilon v}^2}{(\Theta + \sigma_{vv})^2} + \dfrac{V_{OLS}}{n}$

The inconsistency of OLS is evident from equation (2.6) because while the second term goes to zero as n becomes large, the first term is not a function of n. The MSE of OLS under the normalization used above becomes:

(2.7)  $M_O = \dfrac{(1-R^2)}{n}\left[\rho^2 (n - 1 - 2R^2)(1-R^2) + 1\right]$

We see that the first term in the OLS MSE does not go to zero as n becomes large since OLS is inconsistent. It is the usual first order bias term squared. From the 2SLS MSE calculations, as the sample size grows large the denominator of the 2SLS MSE calculation in equation (2.3) dominates, and the MSE goes to zero. To the contrary, for the OLS MSE in equation (2.7) the numerator also grows with n, as the bias of the OLS estimator does not go to zero with the sample size. Thus, for large samples 2SLS is consistent and OLS is not. We now consider how the estimators do in finite samples.

C Bias Comparisons of the 2SLS and OLS Estimators
We compare the approximate finite sample bias of 2SLS
to the
approximate
MSE of
OLS:
^=
B^
(2.8)
where the F
Thus,
if
F
variance so
'^
'
nR''
\-R-
statistic is
»
1
v^'e
,
2SLS
F
the "theoretical" F-statistic from the first-stage reduced form.
has
compare
less bias.
the
MSEs
However
tTj
we
- aj ^n
.
also consider
(2.9)
2SLS
r£[6,,,J-/?
SLS
J
=
to
is less
than the
2SLS
what happens when we are
where the vector has dimension K. Thus,
the reduced fomi coefficients are "local to zero".
predicts the bias of
variance
below.
Before leaving the bias comparisons,
close to being unidentified so that
OLS
the
With
;r,
=
cr/v'J
,
equation (2.1)
be
^
K
where equation
(2.9) is
asymptotics where
tt^
an approximation to the asymptotic bias of 2SLS under the
= aj^n Here,
.
predicts the approximate bias for
OLS
Y
=
a' z'
to be:
za
.
On
the other hand, equation (2.4)
(2.10)
E[baJ-P-
""
n
Taking the
(2.11)
ratio
of the biases under local
to zero asymptotics:
-"
^"
K'
From equation
(2.1 1),
it
follows that the bias of 2SLS
K « n ,a condition which will
D
is
smaller than
OLS
as long as
always be satisfied in practice.
MSE Comparisons of the 2SLS and OLS Estimators

We next compare the MSE of 2SLS to the MSE of OLS using the normalization (and non-local asymptotics):

(2.12)  $\dfrac{M_2}{M_O} = \dfrac{K^2\rho^2(1-R^2) + nR^2}{(nR^4)\left[(n - 1 - 2R^2)\rho^2(1-R^2) + 1\right]}$

where $R^4 = (R^2)^2$. The correlation parameter $\rho$ is the key parameter in simultaneous equation analysis because if it is zero the OLS estimator is the unbiased Gauss-Markov estimator and the ratio of MSEs in equation (2.12) equals $1/R^2 > 1$, but OLS is biased and inconsistent if the parameter value of $\rho$ is not zero. Which estimator to use will depend on whether equation (2.12) is less than or greater than unity. We can solve for the "critical value" of $\rho^2$ which causes the MSE of the 2 estimators to be equal. The solution for this "critical value" has a remarkably simple form:

(2.13)  $\rho^2 = \dfrac{nR^2}{nR^4(n - 1 - 2R^2) - K^2}$

As n becomes large the "critical value" of $\rho^2$ goes to zero. In any particular sample $R^2$ and F can typically be accurately estimated from the unbiased estimates of the reduced form so that only $\rho^2$ is unknown. While this parameter value is typically unknown, the applied econometrician will often have a good (a priori) knowledge of $\rho$ so that she will be able to determine whether the square of the correlation coefficient² is below the critical value. As we now demonstrate, the critical value is often so low that 2SLS will have a lower MSE than OLS, even for situations with a relatively low F statistic.

In Figure 1 we calculate the critical value of $\rho$ (using the absolute value) for a range of values of $R^2$ for K of 5, 10, and 30 and for sample sizes of n = 100.³ The results of Figure 1 demonstrate that for K=5, if $R^2 > 0.1$ then the critical value of $\rho$ is sufficiently small that 2SLS should typically be used in terms of the MSE comparison. For K=10, 2SLS will typically be better if $R^2 > 0.2$. However, for K=30 we find that 2SLS will typically require $R^2 > 0.4$. In Table 1 we repeat the calculations for n=500 and n=1000. Here we find that if $R^2 > 0.1$ then 2SLS typically will have a lower MSE. Thus, except in the case of weak instruments, which can arise when both $R^2$ is low and the number of instruments is high, 2SLS is typically the preferred estimator based on an approximate finite sample comparison of MSEs.
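The MSE ratio in (2.12) and the critical value in (2.13) are easy to tabulate. The sketch below is one way to do so under the paper's normalization. The function names are hypothetical, and returning `None` when the denominator of (2.13) is not positive is a convention chosen here (in that case the ratio exceeds one for every $\rho$, so no critical value exists).

```python
import math


def mse_ratio(n, k, rho, r2):
    """Ratio M_2 / M_O from (2.12); values below 1 favor 2SLS."""
    num = k**2 * rho**2 * (1.0 - r2) + n * r2
    den = n * r2**2 * ((n - 1.0 - 2.0 * r2) * rho**2 * (1.0 - r2) + 1.0)
    return num / den


def critical_rho(n, k, r2):
    """Absolute critical rho from (2.13), or None if no positive solution exists."""
    den = n * r2**2 * (n - 1.0 - 2.0 * r2) - k**2
    if den <= 0.0:
        return None  # the ratio in (2.12) stays above 1 for all rho
    return math.sqrt(n * r2 / den)


# Illustrative grid in the spirit of Figure 1 (n = 100).
for k in (5, 10, 30):
    for r2 in (0.1, 0.2, 0.4):
        print(k, r2, critical_rho(100, k, r2))
```

A computed critical value above one in absolute value would mean no admissible correlation makes 2SLS preferable on this criterion, so it is worth comparing the result against the range of $\rho$ considered plausible a priori.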
2 The parameter $\rho$ is also estimated from the 2SLS estimation, but a good estimate may be difficult to achieve in a "weak instrument" situation.
3 The curves for increasing K lie to the right of each other.

III Estimation with Invalid Instruments

Up to this point we have assumed that the instruments are valid so that they are orthogonal to the stochastic disturbance $\varepsilon_1$. However, the econometrician may not be certain that the instruments satisfy the orthogonality condition. We now consider the situation where the orthogonality condition on the instruments fails so that $E[z'\varepsilon_1/n] \neq 0$. We first consider the "large sample bias" of 2SLS:

(3.1)  $\mathrm{plim}[b_{2SLS}] - \beta = \dfrac{\sigma_{W\varepsilon}}{R^2\,\sigma_{y_2y_2}}$

where $W = z\pi_2$. We compare this with the analogous expression for OLS:

(3.2)  $\mathrm{plim}[b_{OLS}] - \beta = \dfrac{\sigma_{y_2\varepsilon}}{\sigma_{y_2y_2}}$

In general either estimator may be preferred on this criterion depending on circumstances. The numerator of equation (3.1) would likely be smaller ("less correlation" in the instrument) than the numerator of equation (3.2), but the denominator of equation (3.1) is always smaller since $R^2 < 1$. Indeed, if $R^2$ is very small, the OLS estimator may do better in terms of inconsistency.

A Invalid Instrument Specification

To do asymptotic approximations we need to specify the correlation of the instrument with the stochastic disturbance in the structural equation (1.1). We use a local specification similar to the approach in Hausman (1978, Theorem 2.1):

(3.3)  $\varepsilon_1 = z(\gamma/\sqrt{n}) + e$ for $\gamma \neq 0$.

We assume that $(e, v)$ is homoscedastic and zero mean normally distributed with covariance matrix:

$(e_i, v_{2i})' \sim N(0, \Omega), \quad \Omega = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix}$

B Properties of the 2SLS Estimator with Invalid Instruments

We derive the asymptotic distribution of the 2SLS estimator with locally invalid instruments in Appendix A:

Theorem 3: $\sqrt{n}\,(b_{2SLS} - \beta) \Rightarrow N\!\left(\dfrac{\Xi + \mu\sigma_{12}}{\Theta},\ V_{2SLS}\right)$

where $\Xi = \pi_2' z'z \gamma / n$, which is assumed to be fixed. The first term in the numerator of the mean arises from failure of the orthogonality condition. The second term is the usual finite sample bias term and it decreases with the sample size. The variance continues to be $V_{2SLS}$ under instrument invalidity because of the local departure in equation (3.3), similar to Hausman (1978, p. 1256).

We use Theorem 3 to calculate the approximate bias of the 2SLS estimator with invalid instruments:

(3.4)  $E[b_{2SLS}] - \beta \approx \dfrac{\Xi/\sqrt{n} + K\sigma_{12}/n}{\Theta}$

Using Theorem 3 we find the MSE of 2SLS to be:

(3.5)  $MSE_2 = \dfrac{1-R^2}{nR^2} + \left(\dfrac{1-R^2}{R^2}\right)^2\left(\dfrac{a\rho}{\sqrt{n}} + \dfrac{K\rho}{n}\right)^2$

where we use the previous normalizations and set $\sigma_{W\varepsilon} = \Xi/\sqrt{n} = a\rho/\sqrt{n}$ for $a < 1$.

C Distribution of the OLS Estimator with Invalid Instruments

We derive the asymptotic distribution of the OLS estimator with locally invalid instruments in Appendix A:
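Theorem 4: $\sqrt{n}\left(b_{OLS} - \beta - \dfrac{\sigma_{12}}{\Theta + \sigma_{22}}\right) \Rightarrow N\!\left(\dfrac{\Xi}{\Theta + \sigma_{22}},\ V_{OLS}\right)$

The distribution is centered around the usual OLS bias, as before, and the numerator of the mean of the distribution arises from the instrument invalidity. Again, the variance continues to be $V_{OLS}$ under instrument invalidity because of the local departure in equation (3.3). Using Theorem 4 we find the MSE of OLS to be:

(3.6)  $MSE_O = \left(\dfrac{\sigma_{12}}{\Theta + \sigma_{22}} + \dfrac{\Xi}{\sqrt{n}\,(\Theta + \sigma_{22})}\right)^2 + \dfrac{V_{OLS}}{n} = (1-R^2)^2\left(\dfrac{a\rho}{\sqrt{n}} + \rho\right)^2 + \dfrac{(1-R^2)}{n}\left[1 - \rho^2(1 + 2R^2)(1-R^2)\right]$

The first term in parentheses is the "usual" simultaneous equation bias of OLS that does not decrease with the sample size.

We consider a special situation which makes the formulae easier to interpret. Let $\gamma = \tau\pi_2$ for some $\tau$. Under this proportionality assumption, the asymptotic distributions take the form:

$\sqrt{n}\,(b_{2SLS} - \beta) \Rightarrow N\!\left(\tau + \dfrac{\mu\sigma_{12}}{\Theta},\ V_{2SLS}\right)$

and

$\sqrt{n}\left(b_{OLS} - \beta - \dfrac{\sigma_{12}}{\Theta + \sigma_{22}}\right) \Rightarrow N\!\left(\tau R^2,\ V_{OLS}\right)$

where we have used the normalization to derive the final expression for the distribution of OLS.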
D Bias Comparison of 2SLS with OLS

We now compare the bias of 2SLS under instrument invalidity with the bias of OLS given similar circumstances. We now re-write the bias of OLS using the normalization:

(3.7)  $B_O = E[b_{OLS}] - \beta \approx (1-R^2)\left(\dfrac{a\rho}{\sqrt{n}} + \rho\right)$

As before, we take the ratio of (3.4) and (3.7):

(3.8)  $\dfrac{B_2}{B_O} = \dfrac{a\rho/\sqrt{n} + K\rho/n}{R^2\left(a\rho/\sqrt{n} + \rho\right)}$

The ratio of the biases is homogeneous of degree zero in the correlation coefficient $\rho$, so we can simplify terms. We find that the 2SLS bias is less than the OLS bias if:

(3.9)  $a < \dfrac{nR^2 - K}{\sqrt{n}\,(1 - R^2)}$

Equation (3.9) remains very easy to interpret. We plot the ratio of the biases in Figure 2 for the case of n=100, K=5 and a = 0.1. We calculate a "critical alpha" in Figure 3, and note that it increases quite rapidly, so that the bias of 2SLS with invalid instruments is less than the bias of OLS so long as F exceeds 1.0 by a small amount. The straightforward relationship of equation (3.9) allows for an easy interpretation on which the econometrician may well have some a priori knowledge.

In certain situations it may be reasonable to consider a relationship between the covariance and $R^2$ such that when $a$ is less so is $R^2$. If we totally differentiate equation (3.9), we find that

$\dfrac{dR^2}{da} = \dfrac{n - K}{n^{1/2}\,(a + \sqrt{n})^2}$

Thus, for a given increase in $a$ we find that the required increase in $R^2$ is approximately at the rate of one over the square root of n to keep the ratio of the biases approximately the same. However, the required change is inversely related to the covariance between the instrument and the stochastic term, $\sigma_{W\varepsilon}$.

Note that the common empirical finding that the 2SLS coefficient is larger than the OLS coefficient can also arise because of an improper instrument. Thus, even when the instruments are valid or if the instrument is "almost uncorrelated", substantial bias can still arise because $R^2$ is often quite small in the weak instruments situation. Thus, comparing equation (3.4) to the bias of OLS in equation (3.7), the empirical finding that the 2SLS estimate increases compared to the OLS estimate may indicate that the instrument is not orthogonal to the stochastic disturbance. The resulting bias can be substantial. Indeed, it could exceed the OLS bias, leading to an increase in the estimated 2SLS coefficient over the estimated OLS coefficient.
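The bias ratio of (3.8) and the threshold on $a$ in (3.9) can be sketched as follows. This assumes the normalization used in the text and a positive $a/\sqrt{n} + 1$. The function names are hypothetical, and $\rho$ is omitted because it cancels from the ratio.

```python
import math


def bias_ratio_invalid(n, k, a, r2):
    """Ratio B_2 / B_O from (3.8); rho cancels (homogeneity of degree zero)."""
    s = a / math.sqrt(n)
    return (s + k / n) / (r2 * (s + 1.0))


def critical_a(n, k, r2):
    """Largest a for which the 2SLS bias stays below the OLS bias, per (3.9)."""
    return (n * r2 - k) / (math.sqrt(n) * (1.0 - r2))


# Illustrative values matching the case plotted in the text: n = 100, K = 5.
for r2 in (0.06, 0.1, 0.2, 0.4):
    print(r2, critical_a(100, 5, r2), bias_ratio_invalid(100, 5, 0.1, r2))
```

At `a = critical_a(n, k, r2)` the ratio equals one, which gives a quick consistency check between (3.8) and (3.9).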
E MSE Comparison of 2SLS and OLS with Invalid Instruments

Returning to the general situation and using the normalizations, the ratio of the MSEs is

$\dfrac{M_2}{M_O} = \dfrac{(1-R^2)\left(a\rho + K\rho/\sqrt{n}\right)^2/R^4 + 1/R^2}{(1-R^2)\left(a\rho + \sqrt{n}\,\rho\right)^2 + 1 - (1-R^2)\rho^2 - 2(1-R^2)\rho^2R^2}$

No straightforward condition can be derived where the ratio of MSEs is below 1.0. We graph the ratio of the MSEs for a = 0.1 and K=5, n=100 in Figure 4 (please note the inverted vertical axis). Note that the ratio is less than one except in the situation where $R^2$ becomes quite small (as with weak instruments) and $\rho$ becomes small (which decreases OLS bias). To yield a better understanding of what can happen in this situation, we increase $a$ to 0.3 in Figure 5. The situation remains essentially the same when the F statistic exceeds 1.0 by a large amount. However, we plot the situation in Figure 6 where $R^2 < 0.2$ and $\rho < 0.3$ for a = 0.1. Figure 6 demonstrates that the 2SLS estimator can do quite poorly compared to the OLS estimator, even though a = 0.1 is not large. The reason for the poor relative performance is the small size of $\rho$, which makes OLS a relatively good estimator. Note that this situation is typically not a situation where the absolute performance of the 2SLS estimator with valid instruments would be poor under weak instruments, because $\rho$ is small. It is the presence of invalid instruments, with only a "small amount" of correlation with the stochastic disturbance, that creates the problem.
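Because no simple condition bounds the MSE ratio, evaluating it on a grid is the natural way to explore it. The sketch below builds the ratio from the normalized bias and variance terms of (3.5) and (3.6). The function names and the grid values are illustrative assumptions.

```python
import math


def mse_2sls_invalid(n, k, rho, r2, a):
    """Approximate 2SLS MSE with locally invalid instruments, as in (3.5)."""
    q = 1.0 - r2
    bias = (q / r2) * (a * rho / math.sqrt(n) + k * rho / n)
    return bias**2 + q / (n * r2)


def mse_ols_invalid(n, rho, r2, a):
    """Approximate OLS MSE with locally invalid instruments, as in (3.6)."""
    q = 1.0 - r2
    bias = q * (a * rho / math.sqrt(n) + rho)
    variance = q * (1.0 - rho**2 * (1.0 + 2.0 * r2) * q)
    return bias**2 + variance / n


def mse_ratio_invalid(n, k, rho, r2, a):
    """Ratio M_2 / M_O; values above 1 favor OLS."""
    return mse_2sls_invalid(n, k, rho, r2, a) / mse_ols_invalid(n, rho, r2, a)


# Illustrative grid for n = 100, K = 5, a = 0.1.
for r2 in (0.05, 0.2, 0.5):
    for rho in (0.1, 0.3, 0.5):
        print(r2, rho, mse_ratio_invalid(100, 5, rho, r2, 0.1))
```

With `a = 0` the ratio reduces to the valid-instrument expression (2.12), which is a useful check when adapting the code.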
F. Comparison of a (Second order) Unbiased Estimator

In our comparisons of 2SLS with OLS, two sources of bias arise. The first source of bias is from the use of estimated parameters, $\pi_2$ in equation (1.2), in forming the instruments. This source of bias disappears as the sample becomes large. The second source of bias is from the use of invalid instruments, $\gamma \neq 0$ in equation (3.3). This source of bias does not disappear sufficiently fast with the sample size to cause 2SLS to be consistent. An interesting question would be about how the comparison of IV to OLS would change if the first source of bias were eliminated. We can eliminate this source of bias (to second order) by using the Nagar estimator. We derive the asymptotic distribution of the Nagar estimator with locally invalid instruments in Appendix A:

Theorem 5: $\sqrt{n}\,(b_N - \beta) \Rightarrow N\!\left(\dfrac{\Xi}{\Theta},\ V_{2SLS}\right), \quad \dfrac{\Xi}{\Theta} = \dfrac{\sqrt{n}\,\sigma_{W\varepsilon}}{R^2\,\sigma_{y_2y_2}}$

where $W = z\pi_2$ is the instrument, $\Xi = \pi_2' z'z\gamma/n$, which is assumed to be fixed, and as before $V_{2SLS} = \sigma_{11}/\Theta$. Thus we see that the variance of the two estimators is the same, but that the bias differs as explained above. To compare the MSE of the Nagar estimator to the MSE of the 2SLS estimator with invalid instruments, we compare the bias squared of 2SLS from equation (3.4) with the Nagar estimator. However, when we compare we find that

(3.11)  $\left(\dfrac{\Xi}{\sqrt{n}} + \dfrac{K\sigma_{12}}{n}\right)^2 - \left(\dfrac{\Xi}{\sqrt{n}}\right)^2$

can be less than or greater than zero. Thus, we cannot conclude that using the Nagar estimator to compare with OLS would make the comparison more favorable to an IV estimator.⁴
4 The Nagar estimator may perform poorly with weak instruments because of its lack of moments. See Hahn, Hausman, and Kuersteiner (2002).

IV Sensitivity Analysis

Card (2001) discusses possible concerns that the instruments may be invalid in discussing the empirical literature that estimates the return to additional education. The use of instrumental variables in this situation began with Griliches' (1977) well known paper. To investigate the possibility of invalid instruments, we consider the specification:

(5.1)  $y_1(\theta) = \beta y_2 + z\theta + \varepsilon = \beta y_2 + \varepsilon^*, \quad \varepsilon^* = z\theta + \varepsilon$

Note that we have added $z\theta$ to the error $\varepsilon$. We derive the maximal asymptotic bias for a small violation of the exclusion restriction in Appendix B, where $\psi$ is the correlation between $z\pi_2$ and $\varepsilon^*$ so that $\psi^2$ is the $R^2$ between $z\pi_2$ and $\varepsilon^*$. We find the maximal asymptotic bias to be:

Theorem 6: $\max\ \mathrm{plim\ bias}\ b_{2SLS}(\theta) = \left(\dfrac{\psi^2}{1-\psi^2}\cdot\dfrac{\mathrm{plim}\ n^{-1}\sum_i \hat{\varepsilon}_i^2}{R^2\ \mathrm{plim}\ n^{-1}\sum_i y_{2i}^2}\right)^{1/2}$

Note that the maximal asymptotic bias can be consistently estimated by

(5.2)  $\left(\dfrac{\psi^2}{1-\psi^2}\cdot\dfrac{n^{-1}\sum_i \hat{\varepsilon}_i^2}{R^2\ n^{-1}\sum_i y_{2i}^2}\right)^{1/2}$

Imbens (2003) suggested a different sensitivity analysis in a program evaluation model with binary explanatory variable, extending Rosenbaum and Rubin's (1983) analysis. It is well known that the omitted variable bias can be related to two parameters, the coefficient of the omitted variable and the correlation of the omitted variable with other observed variables in the model, e.g. Griliches (1957). With some simplification, it can be said that he considers a parametric model where an omitted variable bias is suspected. The sensitivity analysis of Imbens (2003) is based on manipulation of these two parameters.⁵

5 Imbens (2003) considers the question of sensitivity analysis, but not in the context of instrumental variables.

We now consider the effect of invalid instruments in an empirical example. Estimating the return to education has been a well-researched problem over the past 25 years. Griliches (1977) is a seminal paper that uses IV to estimate returns to schooling. The usual result is that researchers find the OLS estimate to be smaller than the 2SLS estimate by approximately 25%-50%, e.g. Card (2001). This result arises from a tradeoff between two potential sources of bias: (1) an omitted variable, call it "spunk", in the stochastic disturbance may be correlated with the amount of education. Thus, people with more spunk achieve higher education levels and also higher earnings, because they work harder both in school and on the job. This left out variable would lead to an upward bias in the least squares estimate of the schooling coefficient. (2) errors in variables (EIV) that arise because years of schooling are a noisy measure of "useful knowledge" attained with more years of school that leads to higher earnings. Here the EIV would lead to a downward bias in the least squares estimate of the schooling coefficient. The typical results find that the EIV effect is larger than the left out variable effect, so the IV estimates typically exceed the OLS results by a significant amount.

We now consider the Angrist and Krueger (AK) results, which have an extremely large sample and use quarter of birth to form instruments, which may be more likely to be orthogonal to the stochastic disturbance than more widely used family background and other types of instruments typically used in the empirical returns to schooling literature. However, the AK instruments have an extremely low $R^2$ that could help create a weak instruments situation. Angrist and Krueger (1991) used a sample of n = 329,509 observations to estimate the returns to education. Using the AK data we estimate the 2SLS return to education to be 0.0891 (.016) using K=30 after partialling out the other right hand side variables. This estimate is closer to the OLS estimate of 0.071 (.0003) than expected given other empirical results. After partialling out, we find that the average squared residuals equal 0.41, the average of the partialled out right hand side endogenous variable (education) equals 10.8, and $R^2$ = .00044662. For $\psi^2 = 0.0001$, we find that the solution to equation (5.2) is 0.0925. This maximal bias exceeds the 2SLS estimate of 0.0891, so a small amount of bias could either eliminate any estimated return to education or double the estimate.

Our finding that the returns to education coefficient could be over two times the OLS estimate contrasts with the results of Manski and Pepper (2000), who apply Manski's (1990, 2003) non-parametric bounds approach. Manski and Pepper (on a different sample) find that the upper bound is substantially less than two times the usual OLS estimates of the returns to schooling. However, the Manski approach does not allow for errors in variables. This omission may significantly limit the empirical relevance of the Manski approach to this problem.

This result demonstrates that use of a "weak identification" strategy such as the AK approach is extremely sensitive to very small departures from the IV orthogonality assumption. Note from the result in Theorem 6 that this extreme sensitivity does not decrease with increasing sample size. Thus, the AK estimate of the returns to schooling is very sensitive despite an extremely large sample size of n = 329,509. Our results caution against using a weak identification strategy that has become widely used in applied econometrics.
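The sensitivity calculation of equation (5.2) takes only a few lines. The sketch below is an illustration that plugs in the summary statistics quoted in the text. The function and argument names are hypothetical, and the formula is read here as the square root of $\psi^2/(1-\psi^2)$ times the ratio of the average squared residual to $R^2$ times the average squared endogenous regressor.

```python
import math


def max_asymptotic_bias(psi2, mean_sq_resid, r2, mean_sq_y2):
    """Estimated maximal asymptotic bias of 2SLS for a given psi^2, per (5.2)."""
    return math.sqrt(psi2 / (1.0 - psi2) * mean_sq_resid / (r2 * mean_sq_y2))


# Inputs as reported in the text for the Angrist-Krueger data.
bias = max_asymptotic_bias(psi2=0.0001, mean_sq_resid=0.41,
                           r2=0.00044662, mean_sq_y2=10.8)
print(round(bias, 4))
```

With these rounded inputs the result should land close to the 0.0925 reported in the text, with any small gap plausibly reflecting rounding of the inputs. The function has no sample-size argument, which mirrors the point that the sensitivity does not shrink as n grows.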
V Bias in Estimated Standard Errors

We have previously discussed the biased 2SLS estimator in equation (2.1) and Theorem 1. In the "weak instruments" situation this bias may be quite large. A further problem arises in that the 2SLS estimator is biased in the same direction as the OLS estimator, as equation (2.4) and Theorem 2 demonstrate. Thus, a Hausman (1978) specification type test will be biased towards not rejecting the null hypothesis of lack of orthogonality between $\varepsilon_1$ and $v_2$ in equations (1.1) and (1.2). More generally, since the bias in the OLS estimate when EIV exists depends on the variance of the measurement error, or alternatively the $R^2$ of the regression, typically no bounds exist for the estimated coefficient unless some judgment is made regarding the unknown variance in the EIV problem. For further discussion, see Hausman (2001).

However, another problem has been recognized in the weak instruments situation. The estimated standard errors for the 2SLS estimator are downward biased, sometimes leading to the mistaken inference that the 2SLS estimates are much more precise than they actually are. From an analysis based on first order asymptotics the usual conclusion would be that with "weak instruments" the reported standard error of the 2SLS estimator would be sufficiently large to signal the finding that so much uncertainty exists with the estimate that it would not be of much use. However, researchers have found that, to the contrary, often the 2SLS estimator in the presence of weak instruments leads to a reasonably small standard error. Thus, the researcher may be unaware of the weak instruments problem, although Hahn-Hausman (2002, 2003) propose a test that is useful in identifying when weak instruments are causing a problem. The source of the problem of small reported standard errors of the 2SLS estimator has not been discussed in the literature. Here we derive the source of the problem and offer a possible approach to fixing it.

The variance of 2SLS is derived in Theorem 1 and takes the usual form of $V_{2SLS} = \sigma_{\varepsilon\varepsilon}/\Theta$, where $\Theta = \pi_2' z'z \pi_2 / n$ is assumed to be fixed. Now it is not difficult to derive the bias. The downward bias in the estimated standard errors must arise from a downward biased estimate of $\sigma_{\varepsilon\varepsilon}$, since unbiased estimates of $\Theta$ follow from OLS on equation (1.2). The intuition follows from the fact that the 2SLS estimator of $\beta$ is biased towards the OLS estimator, which minimizes the estimate of $\sigma_{\varepsilon\varepsilon}$. Thus, the bias in the 2SLS estimator of $\beta$ creates a bias in the 2SLS estimate of $\sigma_{\varepsilon\varepsilon}$. Thus, we find that the 2SLS estimate of $\sigma_{\varepsilon\varepsilon}$ is downward biased. We find the bias to be:
Theorem?;
Note
E[<J
isls
1
= cr„
-^^
''
cT
e
n
As
estimator from equation (2.1).
e
n
that the leading term in the bias calculation of
2SLS
-^-^
+
«
n
Theorem
5 is 2 times the bias
number of instruments grows
either the
of the
or the
covariance between the structural and reduced term stochastic disturbances becomes
of a^^ will also become
large, the bias in the estimation
normalization that
E[(J
we
2SLS
]
=
;
1
^^
"
"
P
\[(2K-4)p'-\]{\-R')
R
n
+-"
I
quite substantial as demonstrated
varies
from
Thus,
we
.01
p = 0.9
and
-70%
to
-80%
that the
as
it
The Monte-Carlo design
is
the
(5.1).
The
final
bias of the
in
Monte-Carlo
2SLS
results
when K =
when weak instruments
as in
Hahn-Hausman
17
we
find
estimate of the variance
K, the number of instruments, increases from
same
term in
can be ignored. Equation (5.1
note that the bias in the estimation even
finding explains the result that
'
mean
^
by equafion
demonstrates that the downward bias can be substantial;
R =
^
^'
n
equation (4.2) will typically be small so that
that for
apply the
used above to find:
(5.1)
The bias can be
We now
large.
5
can be quite
5 to 30.
large.
This
are present, the estimated standard errors of 2SLS can appear to be near those of OLS and small enough to allow the researcher to make conclusions about the likely true parameter value. However, with weak instruments these conclusions could be erroneous because of the substantial bias in the estimated standard error of the 2SLS estimator. Kleibergen (2002) also proposes an alternative approach to modify inferential procedures, but his approach is based on the LIML estimator rather than the 2SLS estimator. Hahn-Hausman (2003) and Hahn, Hausman, and Kuersteiner (2003) discuss problems that this approach may have because of non-existence of moments of the LIML estimator.

We now consider the finding that the often used test of over-identifying restrictions (OID test) rejects "too often" when weak instruments are present, i.e. the actual size of the test is considerably larger than the nominal size. See Hahn-Hausman (2002a), Table xx where the nominal size is 0.05 while the actual size is sometimes greater than 0.5. The OID test can be quite important since it tests the economic theory embodied in the model, as discussed by e.g. Hausman (1983). It is of increased importance given the substantial bias in the 2SLS estimator and the large MSE that we calculate may arise in the weak instrument situation. In the model in equations (3.1), (3.3) and (3.4), from Hausman (1983) we write the OID test as:

\[ W = \frac{\hat\varepsilon' P_Z \hat\varepsilon}{\hat\sigma_{\varepsilon\varepsilon}} \qquad (5.2) \]

W is distributed as chi-square with K-1 degrees of freedom. From equation (5.2), we see that a downward biased estimate of $\sigma_{\varepsilon\varepsilon}$ can lead to substantial over-rejection and an upward biased size of the OID test. Thus, correcting for this problem can have an important effect on test results.

Conclusions

We derive second order approximations for the bias and MSE of 2SLS (and the Nagar estimator) with both valid and invalid instruments. The derivation for invalid instruments is new, to the best of our knowledge. We find that substantial finite sample bias can occur when weak instruments exist, which arises when the $R^2$ of the reduced form regression is low, the number of instruments is high, or the correlation $\rho$ between the structural and reduced form stochastic terms is high.

We then compare the bias and MSE of 2SLS with OLS. The OLS estimator is biased and inconsistent, but its smaller variance may make it preferable to 2SLS in a weak instruments situation. We determine straightforward and easily checked conditions under which 2SLS has smaller bias than OLS. These bias conditions carry over, in large part, to the MSE comparisons because changes in the bias term are quite important in changes in the MSE term. We find that for $R^2 > 0.1$, 2SLS is generally the preferred estimator given typical sample sizes of n=100 or larger. However, the econometrician can use our formulae to check the expected performance of 2SLS and OLS in a given situation given some a priori knowledge about likely parameter values.

We also demonstrate that a substantial bias exists in the 2SLS estimator for the variance of the stochastic disturbance, which leads to downward biased 2SLS standard errors and over-rejection of the test of over-identifying restrictions. We derive a formula for the bias that would allow for correction of the bias.
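To make the quantities discussed above concrete, the following is a minimal Python sketch, not the authors' code, that computes OLS, 2SLS, the 2SLS variance estimate, and an over-identification statistic of the form in (5.2) for a model with one endogenous regressor and no intercept. The function names, the simulated design, and the parameter values (`pi`, `rho`) are assumptions made for illustration only.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def orthonormal_basis(columns):
    """Gram-Schmidt on the instrument columns; spans the same space as Z."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-12:  # skip numerically collinear columns
            basis.append([vi / norm for vi in v])
    return basis

def proj_quad(a, b, basis):
    """a' P_Z b, with P_Z the projection onto the span of the instruments."""
    return sum(dot(q, a) * dot(q, b) for q in basis)

def ols_2sls_oid(y, x, z_columns):
    """Return (b_OLS, b_2SLS, sigma2_hat, W) for a scalar regressor x."""
    n = len(y)
    basis = orthonormal_basis(z_columns)
    b_ols = dot(x, y) / dot(x, x)
    b_2sls = proj_quad(x, y, basis) / proj_quad(x, x, basis)
    resid = [yi - xi * b_2sls for yi, xi in zip(y, x)]
    sigma2 = dot(resid, resid) / n                  # 2SLS variance estimate
    w = proj_quad(resid, resid, basis) / sigma2     # statistic as in (5.2)
    return b_ols, b_2sls, sigma2, w

def simulate(n, k, beta, pi, rho, seed):
    """Hypothetical design: k instruments with equal coefficient pi,
    and corr(eps, u) = rho."""
    rng = random.Random(seed)
    z_columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(k)]
    x, y = [], []
    for i in range(n):
        u = rng.gauss(0, 1)
        eps = rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xi = pi * sum(col[i] for col in z_columns) + u
        x.append(xi)
        y.append(xi * beta + eps)
    return y, x, z_columns

y, x, z_columns = simulate(n=100, k=5, beta=1.0, pi=0.1, rho=0.5, seed=0)
b_ols, b_2sls, sigma2, w = ols_2sls_oid(y, x, z_columns)
print(b_ols, b_2sls, sigma2, w)
```

Because the statistic divides by the estimated disturbance variance, any downward bias in that estimate inflates W, which is the mechanism behind the over-rejection described in the text. Repeating the simulation over many seeds would give a rough picture of the actual rejection rate for a given design.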
References
Angrist, J. and A. Krueger (1991): "Does Compulsory School Attendance Affect Schooling and Earnings," Quarterly Journal of Economics, 106, 979-1014.

Card, D. (2001), "Estimating the Return to Schooling," Econometrica, 1127-1152.

Griliches, Z. (1957), "Specification Bias in Estimates of Production Function," Journal of Farm Economics, 38, 8-20.

Griliches, Z. (1997), "Estimating the Returns to Schooling: Some Econometric Problems," Econometrica, 45, 1-22.

Hahn, J., and J. Hausman (2002a): "A New Specification Test for the Validity of Instrumental Variables," Econometrica, 70, 163-189.

Hahn, J., and J. Hausman (2002b): "Notes on Bias in Estimators for Simultaneous Equation Models", Economics Letters, 75, 237-241.

Hahn, J., and J. Hausman (2003): "Weak Instruments: Diagnosis and Cures in Empirical Econometrics", American Economic Review.

Hahn, J., J. Hausman, and G. Kuersteiner (2003): "Estimation with Weak Instruments: Accuracy of Higher Order Bias and MSE Approximations," revised MIT mimeo.

Hausman, J. A. (1978): "Specification Tests in Econometrics," Econometrica, 46, 1251-1271.

Hausman, J.A. (1983): "Specification and Estimation of Simultaneous Equation Models," in Z. Griliches and M. Intriligator, eds., Handbook of Econometrics, Vol. 1, Amsterdam: North Holland.

Hausman, J. (2001), "Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left", Journal of Economic Perspectives, 2001.

Imbens, G. (2003), "Sensitivity to Exogeneity Assumptions in Program Evaluation," American Economic Review.

Kleibergen, F. (2002): "Pivotal Statistics for Testing Structural Parameters in IV Regression," Econometrica, 70, 1781-1803.

Manski, C. (1990): "Nonparametric Bounds on Treatment Effects," American Economic Review Papers and Proceedings, 80, 319-323.

Manski, C. (2003), Partial Identification of Probability Distributions, New York: Springer-Verlag.

Manski, C. and R. Pepper (2000), "Monotone Instrumental Variables: With an Application to the Returns to Schooling," Econometrica, 68, 997-1010.

Rosenbaum, P., and D. Rubin (1983), "Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome", Journal of the Royal Statistical Society, Series B, 45, 212-218.

Staiger, D., and J.H. Stock (1997): "IV Regression with Weak Instruments," Econometrica, 65, 557-586.

Stock, J.H., J. Wright and M. Yogo (2002): "A Survey of Weak Instruments and Weak Identification in GMM," Journal of Business and Economic Statistics, 20, 518-529.
Figure 1: Critical Values for Rho, n=100 and K=5, 10, 30
Table 1: Critical Values of ρ

                               R^2
  K      n     0.01     0.1      0.2      0.3      0.5      0.7      0.9
  5     100     **     0.3677   0.2323   0.1863   0.1432   0.1210   0.1070
  5     500     **     0.1423   0.1002   0.0818   0.0634   0.0536   0.0473
  5    1000   0.3654   0.1002   0.0708   0.0578   0.0448   0.0378   0.0334
 10     100     **       **     0.2601   0.1949   0.1455   0.1220   0.1075
 10     500     **     0.1445   0.1006   0.0819   0.0634   0.0536   0.0473
 10    1000     **     0.1006   0.0708   0.0578   0.0448   0.0378   0.0334
 30     100     **       **       **       **     0.1789   0.1339   0.1135
 30     500     **     0.1771   0.1050   0.0834   0.0638   0.0538   0.0474
 30    1000     **     0.1049   0.0716   0.0581   0.0448   0.0379   0.0334

** denotes no critical value of ρ less than 1.0 exists
Figure 2: Ratio of 2SLS Bias to OLS Bias with Invalid Instruments, N=100, K=5, α = 0.1
Figure 3: Critical Values for Alpha, n=100 and K=5, 10, 30
Figure 4: MSE 2SLS/MSE OLS, Alpha=0.1 and K=5
Figure 5: MSE 2SLS/MSE OLS, Alpha=0.3 and K=5 (axis label: rho)
Figure 6: MSE 2SLS/MSE OLS, Alpha=0.1 and K=5
A Bekker Asymptotic Distribution of 2SLS, OLS, and Nagar under Misspecification
Suppose that
=
Vu
+
y2iP
ei
=
P + uu
{z[^2)
where
TV
Following
Lenima
is
Lemma
the
1 Let
U=
S = U'P.U and
S-L
\
f^l,!
'^1,2
'^1,2
'^2,2
0,
reproduced from Hahn and Hausman (2001);
y\
y2
\-
Assume
= U'M.U. We
that
^
^t a-ir o (n"''/^), and that
-k^z' z-k^Iti is fixed at
then have
e+^
n-'^S 22
/
-UJ2,2
M
'
A
0,
A^
V
n-'S^2
leJ-
\ ri-'Si2 I
where
A and
A"*"
(l-^)-2.2
\
3x3
denote symmetric
matrices such that
A]_2
=
2t^i,i0/3
+ W^Q^\,2 +
A] .3
=
4/3eu.'i,2
+ 2qwJ2
A2.2
—
fj-"!,!©
A2,3
=
2(^2,20/5
As.s
=
4cJ2,20
+
/3
+
J J
+
01^2, 2
200;]
+
,2
Sawi, 1(^1,2
20a;i.2/?
+
+
Qa;j_iu;2,2
2qu;2,2'^1,2
2Qa;^_2
and
Aj
2
A^,3
A5
= 2(1-
= 2(l-Q)a;'[2
(1
-
q)cJ],]U,'2,2 4- (1
A^3 = 2(1 H,Z
q)u/'i_ilji,2
q)u;2,2I^'i,2
2(1 -q)u;^_2
-
Q)t
'1,2
+
ct<^\
2
0. Let
Remark
1
TTie u)s in the
the above with structural
Lemma
form
correspond
u
we can
"reduced form'". It would be convenient to rewrite
to the
parameters. Because
=
£
+
I3v
see that
Lemma
=
2 Suppose that -j^
/i
+
^1,1
=
ffi.i
+
2/3ai,2
t^l,2
=
cri,2
+
/3l72,2
t^2,2
=
0-2,2
+
/3V2,2
o(l). Then we have
\/ri
y'oPzV^
r^ UP^y\ - ^v'^M.vl _
^ \y',P-^y2-^^y'2M,y2
=>
N[Q,V2SLS)
=>
A' (n,
C^l,2
Vy2y2
Proof. Suppose that
q =
0.
©-Z^ +
n^'S,2 \
f
+
V
Vols)
t^2,2
Using the previous Lemma, we obtain
f
-^1,2
(,d-u.'2,2
.V
2
n^' (S,2 - ^S^,)
+
2u.'].-2,'3
+
((^2.2/^
(/./-u,'2,2
AA
+
+
2
(cjo.i/^
+
'^'1,1)0
2
(a;2,2,ff
+
5^2)
"^' (S22 +
52^2)
J^
B
f
/3
•
+
(/3 u-'2.2
+
2aJi_2/?
0,
2 (u;2,2/3
+
'^'1,2)
+
+
u.'i
2
u-'2,2
iJ-"],])
+
1^1 2
+ '^'1.1 "''2. 2
tlie
+ 1^1,2) © +
,2
20 + 2u;?
^9 9
(^,'2.2,/?
4^2
2u,'2.2i^'l.2
Therefore, using Delta method, we obtain
2
2u;2.2'^'l,
following
yv
0+-^-c^2,2
V2/2^=2/2
/^
/ y2-P.yi -
^y2^-^.~Vl•
?/2J/l
,J/2?/2
where we used the
fact
e p
0/3 + a;i,
© + '^2,2
N
'],2
(^i.i
+
"^2.2;
that
J
=
C],]
+
23a]
(^'1,2
=
cri.2
+
l3ff 2,2
0^2,2
--
02,2
ti,']
^2,02
I
(0
2
+
/3 '^2,2
(© +
>
'^1,2)
+
^'] ,2)
4u;2.20
^^'1.2)
and
n-i (S,2
+
4a'2,20
'^'1,2)©
2^;, ,2/^
2 (^^2,2/3
+ '^1,1)
(0+^2.2)
Q
Because
-^ =/i +
o(l),
we can
see that
/e-j0+f
^^1.2
e-/3\
•a;i,2
M'^1.2
and
^"U^^.y2
e
- ^"VyiF.2/2
;
e+f
+ f '^1.2^
+ V '^2,2 /
©-Z^
'yz^zyl
^y^P,j/2
V
e
e
'
j+^"l^ e +
a;^,^
,
^.c.2,2
M^i,2^^,j)
y
.
A.1 Asymptotic Distribution of 2SLS under Misspecification
Note that
y'2P^y2
_
y'2Pzy2
y'2Pzy\
n~^
1
+
(271-2
,
V2)'-z7
n-iy^P^y2
yjn
y'2Pzy2
But
n
^
+ U2)
(27r2
= —+n
27
=
n-^y'2P.y2
''j;227
=~+
=6+
n-1522
Op
(1)
Op (1)
so that
n
+
(271-2
'
-
V2)'27
It follows
that
7:;;^^
rz
K\
i
,,V
y'i^--yl
y'2Pzy2
A. 2
,1^
,
e
"-^2/2^.^2
Asymptotic Distribution of
OLS
" M27r2+V2)'z7
n
J
^y'2Pzy2
under Misspecification
Note that
"OLS
=
—=
y',y,
y'2
—
{yl
;
2/2^2
^22/2
=
^22/1
^2^2
+ 7^n)
1
,
Vn
71'^ (27r2
+
z;2)'a7
n-Jy^j/2
•
e
But
n
^
=
{ZTT2+V2)' zj
=.
+n
=^+
^i4z7
Op (1)
= n"^522 +""^5^2 = © + (^2,2 +
"~^2/22/2
Op(l)
so that
n
^
{zTT2
+
5
V2)' z-y
It follows
,,.
Op(l)
that
0-1,2
\^
=
fy'2yl
(3
Ky2y2
\
V^l
6+tJ2,2/y'
'"'^
1
f + W2,2
(y'2y\
=
V^\
^ n(
'"''
(P
1
+ UJ2^2
12/2^2
\
r\
,VoLs)
^9 +
A. 3
+
,
e + a2,2
'y^P,y2
,
(72,2
Asymptotic Distribution of Nagar under Misspecification
Note that
_
,
y'2P.yi
-
^M.yi
y^Pzj/2
-
f y^A/.y2
_
^'^^^ (2^'
+
7^^"^')
- ^^2^.
{y\
+ :j^n
^^^.^2 - ^y'2M.y2
y'2P^yJ--±y2Mfyi +
y'2Pzy2 - ^y'2M.y2
.
n-'(z7r2
1
V^
+
^2)'^!
n-i (y^F.yz - ^y^A/.yz)
But
n"'
(zTTa
+ r2)'27 = H + Op(l)
"^' ly2P.y2 - ^y'2M,y2]
@ + Op (1)
=
so that
+i^2)'z7
Ti~' (z7r2
It follows
^ S
that
V'n
(6;v
-0)
=
Vn
-777-
k,.,
^''\y'2P..y2-^y'2M.y2
I
-
-P
|
^y'2M..y\
\
\y'2P.y2-^y'2M^.y2
^)
( y'2P.y{
A^l Q,V2SLS
+ __w,/^,
k3 A/.
n-' (y^P.yz - 7-2/2
y2)
+
H
e
+
op(l)
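B Sensitivity Analysis

Consider a model with one endogenous regressor where other included exogenous variables are partialled out. The model takes the form

\[ y_i = x_i \beta + \varepsilon_i, \qquad i = 1, \ldots, n. \]

Denote the available instrument as $z_i$, and write the first stage regression as

\[ x_i = z_i'\pi + u_i. \qquad (1) \]

The 2SLS estimator is obviously given by

\[ b_{2SLS} = \left[ x'z (z'z)^{-1} z'x \right]^{-1} x'z (z'z)^{-1} z'y, \]

where $x = (x_1, \ldots, x_n)'$, $y = (y_1, \ldots, y_n)'$, and $z = (z_1, \ldots, z_n)'$.

What is the property of $b_{2SLS}$ if the exclusion restriction is in fact violated? In order to implement violation of the exclusion restriction, we add a little noise to $\varepsilon_i$, and consider a new model

\[ y_i^*(\theta) = x_i \beta + z_i'\theta + \varepsilon_i = x_i \beta + \varepsilon_i^*, \qquad (2) \]

where $\varepsilon_i^* = z_i'\theta + \varepsilon_i$. Let

\[ \hat\beta_{2SLS}(\theta) = \left[ x'z (z'z)^{-1} z'x \right]^{-1} x'z (z'z)^{-1} z'y^*(\theta) \]

and

\[ b_{2SLS}(\theta) = \operatorname{plim} \hat\beta_{2SLS}(\theta) - \beta. \]

We would like to examine the maximal asymptotic bias $|b_{2SLS}(\theta)|$ for a small violation of the exclusion restriction, i.e., the violation such that the correlation between $z_i'\theta$ and $\varepsilon_i^*$ is some small number $\psi$. We argue that

\[ \sqrt{\frac{n^{-1}\sum_i \hat\varepsilon_i^2}{n^{-1}\sum_i x_i^2}} \cdot \frac{1}{R_f} \cdot \sqrt{\frac{\psi^2}{1-\psi^2}} \qquad (3) \]

provides such a measure of sensitivity. Here, $R_f^2$ denotes the $R^2$ in the first stage.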
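As a rough illustration of how the sensitivity measure in (3) might be evaluated, the sketch below assumes the measure has the form $\sqrt{\text{mean}(\hat\varepsilon^2)/\text{mean}(x^2)} \cdot (1/R_f) \cdot \sqrt{\psi^2/(1-\psi^2)}$. The function name `sensitivity_bound` and its arguments are hypothetical and are not taken from the paper.

```python
import math

def sensitivity_bound(resid, x, r2_first_stage, psi):
    """Empirical maximal asymptotic 2SLS bias, under the assumed form of (3).

    resid          : 2SLS residuals (list of floats)
    x              : endogenous regressor, exogenous variables partialled out
    r2_first_stage : R^2 of the first stage regression, in (0, 1]
    psi            : assumed correlation between z'theta and eps*, in [0, 1)
    """
    n = len(x)
    mean_e2 = sum(e * e for e in resid) / n
    mean_x2 = sum(v * v for v in x) / n
    scale = math.sqrt(mean_e2 / mean_x2)
    violation = math.sqrt(psi ** 2 / (1.0 - psi ** 2))
    return scale * violation / math.sqrt(r2_first_stage)

# Toy inputs with equal second moments of residuals and regressor.
resid = [1.0, -1.0, 1.0, -1.0]
x = [1.0, 1.0, -1.0, -1.0]
for r2 in (0.25, 0.05, 0.01):
    print(r2, sensitivity_bound(resid, x, r2_first_stage=r2, psi=0.1))
```

The loop shows the qualitative message of the measure: for a fixed small violation $\psi$, the bound grows as the first stage $R^2$ falls, so weak instruments make 2SLS more sensitive to a given departure from the exclusion restriction.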
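B.1 Derivation of (3)

It can be shown that^1

\[ b_{2SLS}(\theta) = (\pi'\Phi\pi)^{-1} \pi'\Phi\theta, \]

where $\Phi = \operatorname{plim} n^{-1} Z'Z$. Note that, for a fixed value of $\theta'\Phi\theta$,

\[ |b_{2SLS}(\theta)| = \frac{|\pi'\Phi\theta|}{\pi'\Phi\pi}, \]

which is maximized when $\theta \propto \pi$.^2 Without loss of generality, we therefore focus on the type of violation such that $\theta = \zeta\pi$ for some scalar $\zeta$. We will write

\[ b_{2SLS}(\zeta) = b_{2SLS}(\zeta\pi) = (\pi'\Phi\pi)^{-1} \pi'\Phi(\zeta\pi) = \zeta. \]

Note that the population $R^2$ in the regression of $\varepsilon^*$ on $z$, which is equal to the square of the correlation between $\varepsilon^*$ and $z'\theta$, is equal to

\[ \psi^2 = \frac{\theta'\Phi\theta}{\theta'\Phi\theta + E[\varepsilon^2]} = \frac{\zeta^2 \pi'\Phi\pi}{\zeta^2 \pi'\Phi\pi + E[\varepsilon^2]}. \qquad (4) \]

We can solve (4) for $\zeta^2$, and obtain

\[ \zeta^2 = \frac{\psi^2}{1-\psi^2} \cdot \frac{E[\varepsilon^2]}{\pi'\Phi\pi}. \qquad (5) \]

Now, note that the population $R^2$ in the first stage regression of $x$ on $z$ is

\[ R_f^2 = \frac{\pi'\Phi\pi}{E[x^2]}, \qquad (6) \]

which can be solved for $\pi'\Phi\pi$ as

\[ \pi'\Phi\pi = R_f^2 \, E[x^2]. \qquad (7) \]

Combining (5) and (7), we obtain

\[ \zeta^2 = \frac{E[\varepsilon^2]}{E[x^2]} \cdot \frac{1}{R_f^2} \cdot \frac{\psi^2}{1-\psi^2} \]

or

\[ |\zeta| = |b_{2SLS}(\zeta)| = \sqrt{\frac{E[\varepsilon^2]}{E[x^2]}} \cdot \frac{1}{R_f} \cdot \sqrt{\frac{\psi^2}{1-\psi^2}}. \qquad (8) \]

1 See next subsection for a slightly more general proof.

2 Maximization of $\|b_{2SLS}(\theta)\|$ with respect to $\theta$ fixing $\theta'\Phi\theta$ constant has the purpose of maximizing the asymptotic bias $b_{2SLS}(\theta)$ for a fixed amount of violation of the exclusion restriction. Because $|\pi'\Phi\theta|^2 \le (\pi'\Phi\pi)(\theta'\Phi\theta)$ with equality when $\theta \propto \pi$, we can say that $\theta \propto \pi$ is the direction that maximizes the sensitivity (inconsistency) of $b_{2SLS}$ for a given amount of violation of the exclusion restriction.

We note that (8) can be approximated by the empirical counterpart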
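\[ |\zeta| = |b_{2SLS}(\zeta)| \approx \sqrt{\frac{n^{-1}\sum_i \hat\varepsilon_i^2}{n^{-1}\sum_i x_i^2}} \cdot \frac{1}{R_f} \cdot \sqrt{\frac{\psi^2}{1-\psi^2}}. \qquad (9) \]

B.2 Digression: Robustness of 2SLS

In general, we estimate $\beta$ by

\[ \hat\beta_A = [(ZA)'X]^{-1} (ZA)'Y \]

and the counterpart under small misspecification is

\[ \hat\beta_A(\theta) = [(ZA)'X]^{-1} (ZA)'Y^*(\theta) \]

so that

\[
\begin{aligned}
b_A(\theta) &= \operatorname{plim} \hat\beta_A(\theta) - \beta \\
&= \operatorname{plim} [(ZA)'X]^{-1} (ZA)'Y^*(\theta) - \beta \\
&= \operatorname{plim} [(ZA)'X]^{-1} (ZA)'(X\beta + Z\theta + \varepsilon) - \beta \\
&= \beta + \operatorname{plim} [(ZA)'X]^{-1} (ZA)'Z\theta - \beta \\
&= \operatorname{plim} [A'Z'X]^{-1} A'Z'Z\theta \\
&= \operatorname{plim} \left[ A'Z'Z (Z'Z)^{-1} Z'X \right]^{-1} A'Z'Z\theta \\
&= (A'\Phi\pi)^{-1} A'\Phi\theta.
\end{aligned}
\]

Note that

\[ \frac{\partial b_{2SLS}(\theta)}{\partial\theta'} = (\pi'\Phi\pi)^{-1} \pi'\Phi \]

and

\[ \frac{\partial b_A(\theta)}{\partial\theta'} = (A'\Phi\pi)^{-1} A'\Phi. \]

Instead of dealing with an awkward normalization involving the weight matrix $\Phi$, it is convenient to assume that $\Phi = I$. We then have

\[ \frac{\partial b_{2SLS}(\theta)}{\partial\theta'} = (\pi'\pi)^{-1} \pi' \]

and

\[ \frac{\partial b_A(\theta)}{\partial\theta'} = (A'\pi)^{-1} A'. \]

Remark 2 If there is only one instrument, then $\partial b_{2SLS}(\theta)/\partial\theta = 1/\pi$. Therefore, small $\pi$ indicates that 2SLS is sensitive to misspecification.

Remark 3 If there are multiple components in $\pi$, and if the first component of $\pi$ is small relative to other components of $\pi$, then $\partial b_{2SLS}(\theta)/\partial\theta_1$ would be small, i.e., 2SLS is not very sensitive to the violation of the exclusion restriction in $z_1$.

Remark 4 Note that

\[ \frac{\partial b_{2SLS}(\theta)}{\partial\theta'} \left( \frac{\partial b_{2SLS}(\theta)}{\partial\theta'} \right)' = (\pi'\pi)^{-1} \]

and

\[ \frac{\partial b_A(\theta)}{\partial\theta'} \left( \frac{\partial b_A(\theta)}{\partial\theta'} \right)' = (A'\pi)^{-1} A'A (A'\pi)^{-1}. \]

Therefore, because $(A'\pi)^2 \le \|A\|^2 \|\pi\|^2$, 2SLS is the most robust estimator among the class of IV estimators.

C Higher Order Bias of $\hat\sigma^2$

Our model is given by

\[ y_i = x_i \beta + \varepsilon_i, \]
\[ x_i = f_i + u_i = z_i'\pi + u_i, \qquad i = 1, \ldots, n, \]

where $(\varepsilon_i, u_i)$ is homoscedastic and normal. We consider the 2SLS

\[ b_{2SLS} = \frac{x'Py}{x'Px} \]

and the related estimator for the variance of $\varepsilon_i$,

\[ \hat\sigma^2 = \frac{1}{n} \sum_{i=1}^n (y_i - x_i b_{2SLS})^2. \]

We have the following characterization of $\hat\sigma^2$:

\[
\hat\sigma^2 = \frac{1}{n} \sum_{i=1}^n \left( \varepsilon_i - x_i (b_{2SLS} - \beta) \right)^2
= \frac{\varepsilon'\varepsilon}{n} - 2 \left( \frac{f'\varepsilon}{n} + \frac{u'\varepsilon}{n} \right) (b_{2SLS} - \beta) + \left( H + \frac{2 f'u}{n} + \frac{u'u}{n} \right) (b_{2SLS} - \beta)^2,
\]

where

\[ H = \frac{1}{n} f'f = \frac{1}{n} \pi'Z'Z\pi. \]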
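The algebraic decomposition of the variance estimator into terms in $(b_{2SLS} - \beta)$ can be checked numerically. The following Python sketch is an illustration under assumed settings (a single instrument, so that $x'Py/x'Px$ reduces to $z'y/z'x$, and arbitrary parameter values); the function name `sigma2_two_ways` is hypothetical.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def sigma2_two_ways(n=200, beta=1.0, pi=0.3, seed=0):
    """Compute the 2SLS variance estimate directly and via its decomposition."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    f = [pi * zi for zi in z]                        # f_i = z_i * pi
    u = [rng.gauss(0, 1) for _ in range(n)]
    eps = [0.5 * ui + rng.gauss(0, 1) for ui in u]   # correlated with u
    x = [fi + ui for fi, ui in zip(f, u)]
    y = [xi * beta + ei for xi, ei in zip(x, eps)]

    b = dot(z, y) / dot(z, x)   # 2SLS with one instrument
    d = b - beta

    # Direct definition: average squared 2SLS residual.
    direct = sum((yi - xi * b) ** 2 for yi, xi in zip(y, x)) / n

    # Decomposition in terms of eps, f, u and (b - beta).
    H = dot(f, f) / n
    decomposed = (dot(eps, eps) / n
                  - 2 * (dot(f, eps) + dot(u, eps)) / n * d
                  + (H + 2 * dot(f, u) / n + dot(u, u) / n) * d * d)
    return direct, decomposed

direct, decomposed = sigma2_two_ways()
print(direct, decomposed)
```

The two numbers agree up to floating point error because the decomposition is an exact identity; the higher order analysis in this appendix then takes expectations of the individual terms.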
Lemma
3
for
T,
=
^/'t = 0,(l)
n
=
-2{^]-h(^f'e]=0,
^6
=
-2
T.
-
2^^^\i,(^r.\.o('
m
V
n J
^P
77^^
H\ ^ ) =Op
U
2
)
77.
Proof. Note that 2SLS
is
a special case of the
^^
~
\^
fc-class
estimator
x'Py - k x'My
x'Px - k x'Mx
for
n
and 6
is
the "eigenvalue"
Donald and Newey
We
Note that 2SLS corresponds
.
(1998).
therefore obtain
Lemma
4
a^
=
ol
^(e's
1
1
'
Proof.
We
have
n
n\
2
(\^
to a
=
and
6
=
0.
The
result follows
from
Because
T:
=
Op(l)
and
^n
-
^
=
=
^^4t,^oV^
n
o.(i),
'
v^.
H-
—
"" =
0,
°'(^)'
.
VV"
o,(i)
we obtain
^2
a
—n
e £
=
u'u\ /
/..
1
K\"^-
„\^
1
/I
^^^[n
[tJ^
Now, note that
V^(i£-aM=Op(i),
We
/^f^_a?,
y;:^f£-^-a,..J=o,(i),
)
= o,(i)
therefore obtain
—n = r
e'e
2
al
/-
1
+ -7=V^"
A'f
~2
o.f
'
V^
\
n
/
'
3
1
;^--(^^E^.j+^v/^(v-"^'')(iH^')^«.'
and
n
It
n
\
I
\
H
n
I
H
n
H^
Vn
follows that
?2
=
^2^_Ly^('^_^2
/n
V
n
|--(^^e^^J"^v"(t-""
-^Kv-)(»-^^
;|-U?^'
1
Tf
1
a^T/
^
1
10
=
a
ac
-
(J.
,T,
7=a^
\
Vn
2
¥^^ji''^
H'
r-
n
/
\
«^'"
1
r?
1
n
n
H
n
crin
m
+Ob
Condition
Theorem
1
Assume
that
we can ignore
Lemma
the Op (^j term in
4 in calculation of expectation.
1
1, +
K
n
2[K-2)al^
E\a^\^ot--
,
H
n
'
2^2
lolo
H
n
where
-n'Z'Zn
H= -f'f=
n
n
Proof. From
-2
a
=
Lemma
4,
we have
2
a^
1
Jn
^
\
2
?\
fe'e
/
1
^
n
2.,..(..,,..,)_£^(5;._„J(...)_l2,lz|2
+ 0r,
Because expected values of the Op
Op (-)
in the third line. First,
E[T2]
-j=
terms in the second line are zero,
I
(
we note that
=
E
u'Pe
--Ka,
\/n
E[T^]
=
E
-?)M>
==
—r^^
from which we obtain
2
/
1
^
1
^
2
n
Second, we note that
2
fe'u
n
\
W
n
I
\
due to symmetry. Third, we note that
E [rn = Ha]
11
1
H
{K-2)al
H
it
suffices to consider
the
from which we obtain
We
2T-21
oin
1
T2
1
n
H
n
m
n
therefore obtain
Eld
Remark
5 In order
understand Theorem
to
H
n
'
{P2SLS ~
asymptotic approximation for ^/n
n
'
H
imagine a counter-factual situation where the
1,
/^l
n
first
order
write
w; exact, i.e.,
V^{hsLS-P) =
We
H
n
j^T,
would then have
::?2
a
=
^2
(7,
Jn
-
\
2
r- /e'w
n
\
— v/n
—^Ofu
ttTi
H
yn
n
a fu
n
I
'
H
n
Jj^'^
n
//2
+ Od
and
IP i-~2|
Theorem
Therefore,
than would
Remark
2SLS
IS
1
2
1
2
n
'
,1
1
2
(/^-2)aL
n
H
a
H
smaller by
is
order asymptotic approximation.
can be understood from a different perspective. Note that the approximate bias of
equal to
nH
yjn
Roughly speaking, 2SLS
the
^^f^^
n
implies that the approximate Tnean of
be expected out of first
6 Theorem
_
2SLS 02SLS
^^
i=l
is
biased toward OLS, which minimizes -
close to the
OLS
f^oLS^ then
1
we should
=1
^"-j
—
x^b)
with respect to
expect
i=]
12
{y,
1=1
h.
If
;