Instrument exogeneity

Research Method
Lecture 11-2 (Ch15)
Instrumental Variables Estimation and
Two Stage Least Squares
What would happen when you use IV
method when the suspected endogenous
variable is in fact exogenous?
Consider the following model
Y=β0+β1x+u
If x is exogenous, you do not need the IV method.
The OLS estimators are consistent.
Suppose that you have an instrument for x, called z, which satisfies the instrument conditions (instrument exogeneity and instrument relevance, described in handout 11-1). Then the IV estimators are also consistent.
Then, which one is better, OLS or IV?
The answer is OLS. If x is exogenous, the IV estimators have larger variances, so the IV estimators are imprecise (you tend to get smaller t-statistics in absolute value).
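To see this, notice the following.
$$Var(\hat{\beta}_{1,IV}) = \frac{\sigma^2}{SST_x \cdot R^2_{x,z}}$$
$$Var(\hat{\beta}_{1,OLS}) = \frac{\sigma^2}{SST_x}$$
Here $R^2_{x,z}$ is the R-squared from the regression of x on z. Since $R^2_{x,z}$ is always between 0 and 1 (it equals 1 only in a case such as x=z), the variance of the IV estimator is always bigger (asymptotically).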
Thus, controlling for endogeneity (i.e., using the IV method) when x is actually exogenous is costly in terms of precision.
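As a worked illustration of the two variance formulas for the simple regression model, the ratio of the IV variance to the OLS variance is 1/R² from the regression of x on z. The correlation value 0.2 below is a made-up number chosen for illustration; it does not come from the lecture.

```latex
\frac{Var(\hat{\beta}_{1,IV})}{Var(\hat{\beta}_{1,OLS})}
  = \frac{\sigma^2 / (SST_x \cdot R^2_{x,z})}{\sigma^2 / SST_x}
  = \frac{1}{R^2_{x,z}}

% Hypothetical value: Corr(x,z) = 0.2, so R^2_{x,z} = 0.2^2 = 0.04
\frac{1}{0.04} = 25, \qquad \sqrt{25} = 5
```

Under this assumed correlation, the IV variance would be 25 times the OLS variance, so the IV standard error would be about five times the OLS standard error.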
Poor instruments: What would happen if the
instrumental variable does not satisfy the
instrument conditions?
Consider the following model
Y=β0+β1x+u
This time, suppose that x is endogenous.
But further suppose that your
instrumental variable z does not satisfy
the instrument conditions (i.e., you have a
poor instrument).
Then what would happen?
The answer to this question is the following.
1. IV estimators are inconsistent.
2. The directions of the biases in the IV estimators and the OLS estimators can be opposite.
3. The bias in IV can be worse than the bias in OLS.
To understand 1, notice that
$$plim(\hat{\beta}_{1,IV}) = \beta_1 + \frac{Corr(z,u)}{Corr(z,x)} \cdot \frac{\sigma_u}{\sigma_x}$$
If instrument exogeneity is not satisfied, the second term is not zero, so the IV estimator is inconsistent.
$$plim(\hat{\beta}_{1,OLS}) = \beta_1 + Corr(x,u) \cdot \frac{\sigma_u}{\sigma_x}$$
If x is endogenous, the second term is not zero, so the OLS estimator is inconsistent.
(Proof: See the front board)
So, both IV and OLS are inconsistent.
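The proof is left to the board in the lecture. A sketch of the standard argument for the simple regression model y = β0 + β1x + u is given here; it uses only the law of large numbers and the definition of correlation, and the symbol σ_z (the standard deviation of z) is notation added for this sketch.

```latex
% Simple IV estimator and its probability limit
\hat{\beta}_{1,IV}
  = \frac{\sum_{i=1}^{n}(z_i-\bar{z})(y_i-\bar{y})}{\sum_{i=1}^{n}(z_i-\bar{z})(x_i-\bar{x})}
  \;\xrightarrow{p}\; \frac{Cov(z,y)}{Cov(z,x)}

% Substitute y = \beta_0 + \beta_1 x + u
\frac{Cov(z,y)}{Cov(z,x)}
  = \beta_1 + \frac{Cov(z,u)}{Cov(z,x)}
  = \beta_1 + \frac{Corr(z,u)\,\sigma_z\,\sigma_u}{Corr(z,x)\,\sigma_z\,\sigma_x}
  = \beta_1 + \frac{Corr(z,u)}{Corr(z,x)} \cdot \frac{\sigma_u}{\sigma_x}

% Same steps for OLS, with x playing the role of z
plim(\hat{\beta}_{1,OLS})
  = \beta_1 + \frac{Cov(x,u)}{Var(x)}
  = \beta_1 + Corr(x,u) \cdot \frac{\sigma_u}{\sigma_x}
```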
To understand 2, first suppose that Corr(x,u) is positive. Then OLS has a positive bias.
But it can happen that Corr(z,u)/Corr(z,x) is negative. In such a case, the IV estimator has a negative bias.
This means that, when you have an invalid instrument, you may get very unexpected results.
To understand 3, consider the following scenario.
(i) The instrument exogeneity is almost satisfied but not perfectly satisfied; that is, corr(z,u) is close to 0 but not exactly 0.
(ii) The instrument is not very relevant; i.e., corr(z,x) is very close to 0.
Then, even if instrument exogeneity is almost satisfied, the bias will be magnified by the small corr(z,x).
$$plim(\hat{\beta}_{1,IV}) = \beta_1 + \frac{Corr(z,u)}{Corr(z,x)} \cdot \frac{\sigma_u}{\sigma_x}$$
If Corr(z,x) in the denominator is small, the bias will be magnified.
It is possible that the bias is so magnified
that the extent of bias in IV estimator is
worse than OLS.
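The scenario can be illustrated with a small simulation. This is a minimal sketch on made-up data: the seed, the sample size and every coefficient in the data-generating lines are assumptions chosen for illustration, and the true slope is set to 1. By construction z is only slightly correlated with u and only weakly correlated with x.

```stata
* Hypothetical simulated data; none of these numbers come from the lecture
clear
set seed 12345
set obs 10000
gen u = rnormal()
* z is "almost" exogenous: a small loading on u
gen z = 0.05*u + rnormal()
* x is endogenous (depends on u) and only weakly related to z
gen x = 0.05*z + 0.3*u + rnormal()
* True model: beta0 = 1, beta1 = 1
gen y = 1 + 1*x + u
* OLS: inconsistent because Corr(x,u) is not zero
regress y x
* IV with the poor instrument: inconsistent because Corr(z,u) is not zero
ivregress 2sls y (x = z)
```

Under this design the population bias terms, Cov(z,u)/Cov(z,x) for IV and Cov(x,u)/Var(x) for OLS, work out to roughly 0.77 and 0.28, so the IV estimate should tend to sit further from the true value of 1 than the OLS estimate. Any single run will also differ from these values because of sampling noise.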
IV estimation of the multiple
regression model
I will extend the discussion to the multiple
regression model.
I will explain the following 3 cases, step by step.
Case 1: One endogenous variable, one instrument.
Case 2: One endogenous variable, more than one instrument. (Two stage least squares)
Case 3: More than one endogenous variable, more than one instrument. (Two stage least squares)
Case 1: One endogenous
variable, one instrument.
Consider the following regression.
log(wage) = β0 + β1educ + β2exp + u
Suppose that educ is endogenous but exp
is exogenous.
To explain IV regression for multiple regression, it is often useful to use different notations for endogenous and exogenous variables.
Let us use y for endogenous variables (i.e., correlated with u) and z for exogenous variables (i.e., uncorrelated with u).
Then, we can write the model as:
y1=β0+β1y2+β2z1+u …………………(1)
y1 is log(wage), y2 is educ, and z1 is exp.
This model is called the structural equation to emphasize that this equation shows the causal relationship. Of course, OLS cannot be used to consistently estimate the parameters since y2 is endogenous.
If you have an instrument for y2, you can consistently estimate the model. Let us call this instrument z2.
As before, z2 should satisfy (i) instrument
exogeneity, and (ii) instrument relevance.
For a multiple regression model, these
conditions are written as:
1. The instrument exogeneity
Cov(z2, u)=0 …………………….(2)
2. The instrument relevance
y2=π0+π1z1+π2z2+error …………….(3)
and π2≠0
All the exogenous variables are included in equation (3). This equation is often called the reduced form equation.
In addition, z2 should not be a part of the structural equation (1). This is called the exclusion restriction.
Now, we have the following three
conditions that can be used to obtain the
IV estimators.
E(u)=0
Cov(z1,u)=0
Cov(z2,u)=0
(this is from the instrument exogeneity)
The sample counterparts of these conditions
are given in the next slide.
$$\sum_{i=1}^{n} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
If you divide it by n, this is the sample average of $\hat{u}$.
$$\sum_{i=1}^{n} z_{i1} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
If you divide it by n-1, this is the sample covariance between z1 and $\hat{u}$.
$$\sum_{i=1}^{n} z_{i2} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
If you divide it by n-1, this is the sample covariance between z2 and $\hat{u}$.
• This is a set of three equations with three unknowns: $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$.
• The solutions to these equations are the IV estimators.
• There is a simple matrix expression for IV estimators. However, we will not cover this during the class.
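For reference only, since the class does not cover it, here is a sketch of the standard matrix form for this one-instrument case. The matrices X and Z are notation introduced for this sketch and are not in the slides: row i of X is (1, y_{i2}, z_{i1}), row i of Z is (1, z_{i1}, z_{i2}), and y_1 is the vector of the y_{i1}. The expression assumes Z'X is invertible.

```latex
% The three sample moment conditions, stacked into one matrix equation
Z'(y_1 - X\hat{\beta}) = 0
\quad\Longrightarrow\quad
\hat{\beta}_{IV} = (Z'X)^{-1} Z' y_1
```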
The above method can be easily extended to
the case where there are more explanatory
variables (but only one endogenous
variable).
Consider the following model.
y1 = β0 + β1y2 + β2z1 + β3z2 + β4z3 + ... + βkz_{k-1} + u
Suppose that zk is the instrument for y2.
Then the IV estimators are the solution to
the following equations.
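$$\sum_{i=1}^{n} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \dots - \hat{\beta}_k z_{i,k-1}) = 0$$
$$\sum_{i=1}^{n} z_{i1} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \dots - \hat{\beta}_k z_{i,k-1}) = 0$$
$$\vdots$$
$$\sum_{i=1}^{n} z_{ik} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1} - \dots - \hat{\beta}_k z_{i,k-1}) = 0$$
The solutions to the above equations are the IV estimators when there are many explanatory variables, but only one endogenous variable and one instrument.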
Example
Consider the following model.
Log(wage)=β0+β1(educ)+β2Exper+β3Exper^2
+β4(SMSA)+β5(South)+u
Using the college proximity (nearc4) as an IV
for education, estimate the model. Use
CARD.dta. (nearc4) is a dummy variable for
someone who grew up near a four-year
college.
OLS

. reg lwage educ exper expersq smsa south

      Source |       SS       df       MS              Number of obs =    3010
-------------+------------------------------           F(  5,  3004) =  214.57
       Model |  155.959797     5  31.1919593           Prob > F      =  0.0000
    Residual |  436.681848  3004  .145366794           R-squared     =  0.2632
-------------+------------------------------           Adj R-squared =  0.2619
       Total |  592.641645  3009  .196956346           Root MSE      =  .38127

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0815797    .003499    23.31   0.000     .0747189    .0884405
       exper |   .0838357   .0067735    12.38   0.000     .0705545    .0971169
     expersq |  -.0022021   .0003238    -6.80   0.000    -.0028371   -.0015672
        smsa |   .1508006    .015836     9.52   0.000     .1197501    .1818511
       south |  -.1751761   .0146486   -11.96   0.000    -.2038985   -.1464537
       _cons |   4.611015    .067895    67.91   0.000     4.477889     4.74414
------------------------------------------------------------------------------

IV

. ivregress 2sls lwage exper expersq smsa south (educ=nearc4)

Instrumental variables (2SLS) regression               Number of obs =    3010
                                                       Wald chi2(5)  =  499.36
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.2051
                                                       Root MSE      =  .39562

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |     .13542   .0486085     2.79   0.005     .0401491     .230691
       exper |   .1067727   .0218136     4.89   0.000     .0640188    .1495266
     expersq |  -.0022553   .0003394    -6.64   0.000    -.0029205     -.00159
        smsa |   .1249987   .0284538     4.39   0.000     .0692302    .1807671
       south |  -.1409356   .0343705    -4.10   0.000    -.2083005   -.0735707
       _cons |   3.703427   .8201379     4.52   0.000     2.095986    5.310867
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq smsa south nearc4
. reg educ exper expersq smsa south nearc4, robust

Linear regression                                      Number of obs =    3010
                                                       F(  5,  3004) =  675.83
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4524
                                                       Root MSE      =  1.9825

------------------------------------------------------------------------------
             |               Robust
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.4258437   .0320651   -13.28   0.000    -.4887155    -.362972
     expersq |   .0009774   .0017044     0.57   0.566    -.0023646    .0043194
        smsa |   .3639914   .0863314     4.22   0.000     .1947167    .5332661
       south |   -.582683   .0743531    -7.84   0.000    -.7284712   -.4368948
      nearc4 |   .3456458   .0824092     4.19   0.000     .1840616      .50723
       _cons |   16.68131   .1489113   112.02   0.000     16.38933    16.97329
------------------------------------------------------------------------------
Check whether nearc4 satisfies instrument relevance by looking at its coefficient in this reduced form regression. Using the t-test (t = 4.19), we can reject the null hypothesis that nearc4 is not correlated with educ after controlling for all other exogenous variables.
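As a complement to running the reduced form regression by hand, Stata can report first-stage diagnostics directly after the IV estimation. The sketch below assumes CARD.dta sits in the current working directory.

```stata
* Assumes CARD.dta is in the current working directory
use CARD.dta, clear
ivregress 2sls lwage exper expersq smsa south (educ = nearc4)
* Reports first-stage summary statistics, including an F statistic
* for the excluded instrument nearc4
estat firststage
```

Because ivregress is run here without a robust option, the first-stage statistics are not the robust ones shown in the regression above, so the numbers need not match exactly.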
Case 2: One endogenous variable,
more than one instrument.
Two stage least squares
Consider the following model with one
endogenous variable.
y1=β0+β1y2+β2z1+u
Now, suppose that you have two instruments for y2 that satisfy the instrument conditions. Call them z2 and z3.
You can apply IV method using either z2
or z3. But this produces two different
estimators. Moreover, they are not
efficient.
Now, I will show you a more efficient
estimator.
First, it is important to lay out the
instrument conditions.
For z2 and z3 to be valid instruments, they
have to satisfy the following two
conditions.
1. Instrument exogeneity
Cov(z2, u)=0 and Cov(z3, u)=0
2. Instrument relevance
y2=π0+π1z1+π2z2+π3z3+error
and π2≠0 or π3≠0
(Include all the exogenous variables in this equation.)
In addition, z2 and z3 should not be a part of the structural equation. These are called the exclusion restrictions.
Now, I will explain the estimation
method.
Instead of using only one instrument, we
use a linear combination of z2 and z3 as
the instrument.
Since a linear combination of z2 and z3
also satisfies the instrument conditions,
this is a valid method.
The question is how to find the best linear
combination of z2 and z3.
It turns out that OLS regression of the
following model provides the best linear
combination.
y2=π0+π1z1+ π2z2+ π3z3+error
After you estimate this model, you get the predicted value of y2.
$$\hat{y}_2 = \hat{\pi}_0 + \hat{\pi}_1 z_1 + \hat{\pi}_2 z_2 + \hat{\pi}_3 z_3$$
Since $\hat{y}_2$ is a combination of variables which are not correlated with u, $\hat{y}_2$ is not correlated with u either. At the same time, $\hat{y}_2$ is correlated with y2. Thus, this is a valid instrument.
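The exogeneity claim can be written out in one line. This is a sketch stated for the population version of the fitted value, that is, with the true π's in place of their estimates; the symbol y_2^* is notation added here for that linear combination.

```latex
% y_2^* = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \pi_3 z_3  (population linear combination)
Cov(y_2^*, u)
  = \pi_1 Cov(z_1, u) + \pi_2 Cov(z_2, u) + \pi_3 Cov(z_3, u)
  = \pi_1 \cdot 0 + \pi_2 \cdot 0 + \pi_3 \cdot 0
  = 0
```

Each covariance on the right is zero because z1 is exogenous and z2 and z3 satisfy instrument exogeneity.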
27
Thus, we have the following three
conditions that can be used to derive an
IV estimator.
E(u)=0
Cov(z1,u)=0
Cov( yˆ2 ,u)=0
The sample counterparts of the above equations are given by:
$$\sum_{i=1}^{n} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
$$\sum_{i=1}^{n} z_{i1} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
$$\sum_{i=1}^{n} \hat{y}_{i2} (y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}) = 0$$
This is a set of three equations with three unknowns $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$.
The solutions to these equations are a special type of IV estimator called the two stage least squares estimators.
You can estimate these parameters by
following the above procedure.
There is an alternative and equivalent
procedure to estimate these parameters.
This procedure will give you an idea why
it is called the two stage least squares.
The estimation procedure of two stage least squares (2SLS).
Stage 1. Estimate the following model using OLS and get the predicted value for y2, $\hat{y}_2$.
y2=π0+π1z1+π2z2+π3z3+error
(Make sure to include all the exogenous variables.)
Stage 2. Replace y2 with $\hat{y}_2$, then estimate the following model using OLS.
$$y_1 = \beta_0 + \beta_1 \hat{y}_2 + \beta_2 z_1 + error$$
The OLS estimators of the coefficients are the two stage least squares (2SLS) estimators.
Estimating the standard errors for two stage least squares.
When you exactly follow the two stage procedure explained in the previous slide, you get the correct 2SLS coefficients. But you do not get the correct standard errors.
So, after applying the procedure, you have to do some extra work to estimate the standard errors.
Under the homoskedasticity assumption, the valid standard errors are computed as follows.
1. Estimate the 2SLS coefficients, then estimate the variance of u as
$$\hat{\sigma}^2 = \frac{1}{n-k-1} \sum_{i=1}^{n} \hat{u}_i^2$$
where
$$\hat{u}_i = y_{i1} - \hat{\beta}_0 - \hat{\beta}_1 y_{i2} - \hat{\beta}_2 z_{i1}$$
Note that you use y2, not $\hat{y}_2$, and that the coefficients are the 2SLS estimates.
2. Then the variance for $\hat{\beta}_1$ (the coefficient on y2) is given by
$$\widehat{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}^2}{\widehat{SST}_2 (1 - \hat{R}_2^2)}$$
where $\widehat{SST}_2$ is the total variation of $\hat{y}_2$, and $\hat{R}_2^2$ is the R-squared from regressing $\hat{y}_2$ on all other exogenous variables appearing in the structural equation.
The square root of this variance is the standard error for $\hat{\beta}_1$.
Note
STATA automatically estimates the 2SLS model and calculates the correct standard errors.
In most cases, you should avoid estimating 2SLS “manually” (although it is a good exercise), since this does not provide you with the correct standard errors.
Exercise
Consider the following model.
Log(wage)=β0+β1(educ)+β2Exper+β3Exper^2+u
1.Suppose educ is endogenous but exper and its
square are exogenous. Using mother and
father’s education as instruments, estimate the
2SLS model. Use Mroz.dta.
2.Manually estimate the model to check if you
get the same coefficients. (Note that you will
not get the correct standard errors.)
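OLS

. reg lwage educ exper expersq, robust

Linear regression                                      Number of obs =     428
                                                       F(  3,   424) =   27.30
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.1568
                                                       Root MSE      =  .66642

------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1074896    .013219     8.13   0.000     .0815068    .1334725
       exper |   .0415665    .015273     2.72   0.007     .0115462    .0715868
     expersq |  -.0008112   .0004201    -1.93   0.054    -.0016369    .0000145
       _cons |  -.5220406   .2016505    -2.59   0.010    -.9183996   -.1256815
------------------------------------------------------------------------------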
. ivregress 2sls lwage exper expersq (educ = motheduc fatheduc), first

The “first” option shows both the first stage and the second stage.

First stage regression

First-stage regressions
                                                       Number of obs =     428
                                                       F(  4,   423) =   28.36
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2115
                                                       Adj R-squared =  0.2040
                                                       Root MSE      =  2.0390

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0452254   .0402507     1.12   0.262    -.0338909    .1243417
     expersq |  -.0010091   .0012033    -0.84   0.402    -.0033744    .0013562
    motheduc |    .157597   .0358941     4.39   0.000      .087044    .2281501
    fatheduc |   .1895484   .0337565     5.62   0.000     .1231971    .2558997
       _cons |    9.10264   .4265614    21.34   0.000     8.264196    9.941084
------------------------------------------------------------------------------

2SLS results

Instrumental variables (2SLS) regression               Number of obs =     428
                                                       Wald chi2(3)  =   24.65
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1357
                                                       Root MSE      =  .67155

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0613966   .0312895     1.96   0.050     .0000704    .1227228
       exper |   .0441704   .0133696     3.30   0.001     .0179665    .0703742
     expersq |   -.000899   .0003998    -2.25   0.025    -.0016826   -.0001154
       _cons |   .0481003    .398453     0.12   0.904    -.7328532    .8290538
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq motheduc fatheduc
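With two instruments, the relevance condition (π2≠0 or π3≠0) is a joint hypothesis, so it is natural to check it with a joint test in the first-stage regression rather than with two separate t-tests. A minimal sketch, assuming Mroz.dta is in the working directory; the `!missing(lwage)` filter is an assumption intended to restrict the sample to observations with an observed wage.

```stata
* Assumes Mroz.dta is in the current working directory
use Mroz.dta, clear
* First stage by hand, restricted (by assumption) to observations with a wage
regress educ exper expersq motheduc fatheduc if !missing(lwage)
* Joint test of H0: the coefficients on motheduc and fatheduc are both zero
test motheduc fatheduc
```

Rejecting this null supports instrument relevance for the pair of instruments taken together.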
Estimating 2SLS manually: When you run the first stage regression manually on this data, more observations are used than in the above 2SLS. To use exactly the same observations, first run the 2SLS and find the observations used in the regression.

. ivregress 2sls lwage exper expersq (educ = motheduc fatheduc)

Instrumental variables (2SLS) regression               Number of obs =     428
                                                       Wald chi2(3)  =   24.65
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1357
                                                       Root MSE      =  .67155

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0613966   .0312895     1.96   0.050     .0000704    .1227228
       exper |   .0441704   .0133696     3.30   0.001     .0179665    .0703742
     expersq |   -.000899   .0003998    -2.25   0.025    -.0016826   -.0001154
       _cons |   .0481003    .398453     0.12   0.904    -.7328532    .8290538
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq motheduc fatheduc

e(sample) enables you to create a dummy variable that indicates whether the observation is used.

. gen fullsample= e(sample)
Then, estimate the first stage regression. Note that “if fullsample==1” tells STATA to use observations only if fullsample is 1.

. reg educ exper expersq motheduc fatheduc if fullsample==1

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  4,   423) =   28.36
       Model |  471.620998     4   117.90525           Prob > F      =  0.0000
    Residual |  1758.57526   423  4.15738833           R-squared     =  0.2115
-------------+------------------------------           Adj R-squared =  0.2040
       Total |  2230.19626   427  5.22294206           Root MSE      =   2.039

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0452254   .0402507     1.12   0.262    -.0338909    .1243417
     expersq |  -.0010091   .0012033    -0.84   0.402    -.0033744    .0013562
    motheduc |    .157597   .0358941     4.39   0.000      .087044    .2281501
    fatheduc |   .1895484   .0337565     5.62   0.000     .1231971    .2558997
       _cons |    9.10264   .4265614    21.34   0.000     8.264196    9.941084
------------------------------------------------------------------------------

After estimation, type the following command. This will automatically create the predicted value of educ.

. predict educ_hat, xb
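Finally, estimate the second stage regression. You can see that the coefficients are the same as before, but the standard errors and t-statistics are different.

. reg lwage educ_hat exper expersq if fullsample==1

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  3,   424) =    7.40
       Model |   11.117828     3  3.70594266           Prob > F      =  0.0001
    Residual |  212.209613   424   .50049437           R-squared     =  0.0498
-------------+------------------------------           Adj R-squared =  0.0431
       Total |  223.327441   427  .523015084           Root MSE      =  .70746

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    educ_hat |   .0613966   .0329624     1.86   0.063    -.0033933    .1261866
       exper |   .0441704   .0140844     3.14   0.002     .0164865    .0718543
     expersq |   -.000899   .0004212    -2.13   0.033    -.0017268   -.0000711
       _cons |   .0481003   .4197565     0.11   0.909    -.7769624     .873163
------------------------------------------------------------------------------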
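The "extra work" needed to fix the manual standard errors can be sketched as follows. The idea is to recompute the residuals with the actual educ instead of educ_hat, estimate the variance of u from them, and rescale the manual standard error by the ratio of the two residual standard deviations. The scalar and variable names (rmse_manual, df_resid, se_manual, u_hat, u_hat_sq, sigma_hat) are hypothetical names chosen for this sketch, and it assumes the manual second-stage regression is the most recent estimation command.

```stata
* Assumes the manual second stage is the last estimation, so that
* _b[], _se[], e(rmse) and e(df_r) all refer to it
scalar rmse_manual = e(rmse)
scalar df_resid    = e(df_r)
scalar se_manual   = _se[educ_hat]
* Residuals use the actual educ, not educ_hat, with the 2SLS coefficients
gen double u_hat = lwage - _b[_cons] - _b[educ_hat]*educ ///
    - _b[exper]*exper - _b[expersq]*expersq if fullsample==1
gen double u_hat_sq = u_hat^2
quietly summarize u_hat_sq if fullsample==1
* Variance of u estimated with the n-k-1 degrees-of-freedom divisor
scalar sigma_hat = sqrt(r(sum)/df_resid)
* Rescale the manual standard error
display "corrected s.e. on educ: " se_manual*sigma_hat/rmse_manual
```

The result should be close to, but slightly larger than, the .0312895 reported by ivregress, because ivregress 2sls by default divides the sum of squared residuals by n rather than by n-k-1; adding the `small` option to ivregress requests the degrees-of-freedom adjustment.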
Case 3: More than one endogenous
variable, more than one instrument
Consider the following structural
equation.
y1=β0+β1y2+β2y3+β3z1+β4z2+β5z3+u1
There are two endogenous variables, y2 and y3.
Thus, OLS will be biased. In order to estimate
this model with IV method, you need at least 2
instruments.
When you have multiple endogenous variables,
you need at least the same number of
instruments as the endogenous variables.
Suppose you have 3 instruments: z4 z5 z6.
As usual, these instruments should satisfy
2 conditions. The first is that they should
not be correlated with u1 (Instrument
exogeneity). The second is that they
should be correlated with endogenous
variable (instrument relevance). When
you have multiple endogenous variables,
the second condition has a more complex
expression, and it is called the rank
condition.
The estimation procedure
The 2SLS procedure when there is more than one endogenous variable is shown here.
y1=β0+β1y2+β2y3+β3z1+β4z2+β5z3+u1
Suppose you have three instruments: z4, z5, z6.
First stage: Estimate the following two reduced form regressions.
y2=π10+π11z1+π12z2+π13z3+π14z4+π15z5+π16z6+error
y3=π20+π21z1+π22z2+π23z3+π24z4+π25z5+π26z6+error
Then obtain $\hat{y}_2$ and $\hat{y}_3$.
Second stage: Estimate the following ‘second stage regression’.
$$y_1 = \beta_0 + \beta_1 \hat{y}_2 + \beta_2 \hat{y}_3 + \beta_3 z_1 + \beta_4 z_2 + \beta_5 z_3 + error$$
The estimated coefficients are the 2SLS coefficients.
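In Stata, both stages can be run in one command by listing all endogenous variables and all excluded instruments inside the parentheses. The variable names below simply mirror the notation of the slides and are placeholders, not variables from an actual dataset.

```stata
* Placeholder variable names taken from the slide notation:
*   y1          dependent variable
*   z1 z2 z3    exogenous variables in the structural equation
*   y2 y3       endogenous variables
*   z4 z5 z6    excluded instruments
ivregress 2sls y1 z1 z2 z3 (y2 y3 = z4 z5 z6), first
```

The `first` option prints the reduced form regressions for y2 and y3 before the 2SLS results.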
Note that second stage regression does
not produce correct standard errors. The
derivation of the exact formula for the
standard errors is not the focus of this
course. Stata ivregress command
automatically computes the correct
standard errors.
Testing multiple hypotheses
In the 2SLS method, the F statistic formula we used for OLS is no longer valid.
STATA automatically computes a valid F-type statistic for 2SLS.
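A minimal sketch of such a test, reusing the Mroz exercise and assuming Mroz.dta is in the working directory. The hypothesis tested here (that exper and expersq are jointly zero) is chosen only for illustration.

```stata
* Assumes Mroz.dta is in the current working directory
use Mroz.dta, clear
ivregress 2sls lwage exper expersq (educ = motheduc fatheduc)
* Joint test of H0: coefficients on exper and expersq are both zero
* (reported as a chi-squared statistic after the default ivregress)
test exper expersq
* With the small option, ivregress applies a degrees-of-freedom adjustment
* and the same test is reported as an F statistic
ivregress 2sls lwage exper expersq (educ = motheduc fatheduc), small
test exper expersq
```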