Instrumental variables estimation Ragnar Nymoen 6 March 2009 Department of Economics, UiO

ECON 4610: Lecture 8
Overview
Macro model example: the ILS estimator is equivalent (numerically identical) to IV.
This holds for all exactly identified models.
Over identification: more instruments than included endogenous variables. Many IV estimators can be constructed from sub-sets of the instruments.
All these IV estimators are consistent, but they have larger variance than the optimal IV estimator, which uses the linear combination of the instruments that gives the best predictors of the included endogenous variables.
That linear combination is obtained by estimating the reduced form by OLS, which is step 1 in the 2SLS estimator.
2SLS is the optimal IV estimator for the over identified case (for models with no autocorrelation or heteroscedasticity).
Main references: G Ch 13.5.1, 13.5.2; B Ch 8.2; K Ch 10.3; lecture note WH.
ILS and IV
We saw in the lecture on identification (Lecture 6) that the parameters of an identified structural equation can be estimated consistently.
We discovered two consistent estimators:
In the Keynes model: indirect least squares (ILS).
In the measurement error model: the instrumental variables (IV) estimator.
We start by showing that these two estimators are identical for the parameters of the consumption function in the Keynes model.
This result is completely general: ILS and IV are equivalent for any just identified equation.
The simple Keynes model revisited
Let $Y_t$ denote GDP in period $t = 1, 2, \ldots, T$.
$C_t$ is “endogenous expenditure” and $X_t$ denotes “exogenous expenditure”.
Assume that $C_t$ depends on GDP; then our example model is
$$Y_t = C_t + X_t \tag{1}$$
$$C_t = b_1 + b_2 Y_t + \varepsilon_t, \quad 0 < b_2 < 1 \tag{2}$$
$\varepsilon_t$ is a random disturbance term. We assume that it is white noise, uncorrelated with $X_t$. For simplicity we assume normality: $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$.
The parameter of interest is the marginal propensity to consume, $b_2$.
Identification and the reduced form of the model
(1) and (2) define a simultaneous equations model. The parameters of the consumption function are exactly identified.
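As a sketch of how the model can be solved for the two endogenous variables, substitute (2) into (1), solve for $Y_t$, and then use $C_t = Y_t - X_t$:

```latex
\begin{aligned}
Y_t &= b_1 + b_2 Y_t + \varepsilon_t + X_t \\
(1 - b_2)\, Y_t &= b_1 + X_t + \varepsilon_t \\
Y_t &= \frac{b_1}{1-b_2} + \frac{1}{1-b_2}\, X_t + \frac{1}{1-b_2}\, \varepsilon_t \\
C_t &= Y_t - X_t = \frac{b_1}{1-b_2} + \frac{b_2}{1-b_2}\, X_t + \frac{1}{1-b_2}\, \varepsilon_t
\end{aligned}
```

The slope on $X_t$ is $1/(1-b_2)$ in the GDP equation and $b_2/(1-b_2)$ in the consumption equation, so the ratio of the two slopes is $b_2$.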
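Solution for the two endogenous variables:
$$Y_t = \pi_{11} + \pi_{12} X_t + v_{1t} \tag{3}$$
$$C_t = \pi_{21} + \pi_{22} X_t + v_{2t} \tag{4}$$
where
$$\pi_{11} = \frac{b_1}{1-b_2}, \quad \pi_{12} = \frac{1}{1-b_2}, \quad v_{1t} = \frac{1}{1-b_2}\,\varepsilon_t$$
$$\pi_{21} = \frac{b_1}{1-b_2}, \quad \pi_{22} = \frac{b_2}{1-b_2}, \quad v_{2t} = \frac{1}{1-b_2}\,\varepsilon_t$$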
IV estimation of the marginal propensity to consume
With reference to the discussion of the measurement error model, we see that $X_t$ in the model given by (1) and (2) has the properties of an instrumental variable for the consumption function. It is
1. Correlated with $Y_t$
2. Uncorrelated with $\varepsilon_t$
To form moments with $X_t$ in this case, write (2) as
$$C_t - \bar{C} = b_2 (Y_t - \bar{Y}) + \varepsilon_t - \bar{\varepsilon}$$
then multiply by $X_t$ and sum over all observations:
$$\sum_t (C_t - \bar{C}) X_t = b_2 \sum_t (Y_t - \bar{Y}) X_t + \sum_t (\varepsilon_t - \bar{\varepsilon}) X_t. \tag{5}$$
The IV estimator is defined as:
$$\hat{b}_2^{IV} = \frac{\sum_t (C_t - \bar{C}) X_t}{\sum_t (Y_t - \bar{Y}) X_t} \tag{6}$$
which can be seen as the solution of (5) after first setting $\sum_t (\varepsilon_t - \bar{\varepsilon}) X_t = 0$. The defining equation
$$\sum_t (C_t - \bar{C}) X_t = \hat{b}_2^{IV} \sum_t (Y_t - \bar{Y}) X_t$$
is often referred to as a quasi-normal equation, because it plays the same role in the motivation of the IV estimator as the normal equations do in the derivation of the OLS estimator.
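The ILS estimator for $b_2$ was shown in Lecture 6 to be
$$\hat{b}_2^{ILS} = \frac{\hat{\pi}_{22}}{\hat{\pi}_{12}} = \frac{\sum_t (C_t - \bar{C}) X_t \,/\, \sum_t (X_t - \bar{X})^2}{\sum_t (Y_t - \bar{Y}) X_t \,/\, \sum_t (X_t - \bar{X})^2} = \hat{b}_2^{IV} \tag{7}$$
Lecture 6 showed that $\mathrm{plim}\, \hat{b}_2^{ILS} = b_2$, so we obviously have $\mathrm{plim}\, \hat{b}_2^{IV} = b_2$.
Consistency of $\hat{b}_2^{IV}$ is also straightforward to show directly: start with (6), use (5), and finally use the assumption that $\mathrm{plim}\, \frac{1}{T} \sum_t (\varepsilon_t - \bar{\varepsilon}) X_t = 0$ (instrument validity).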
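The equivalence in (7) can be checked numerically. The following is a minimal sketch on simulated data from the model (1)-(2); the parameter values, sample size and seed are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, illustrative only
T, b1, b2 = 200, 10.0, 0.6               # hypothetical sample size and parameters

X = rng.normal(50.0, 10.0, size=T)       # exogenous expenditure
eps = rng.normal(0.0, 2.0, size=T)       # structural disturbance
Y = (b1 + X + eps) / (1.0 - b2)          # GDP from the reduced form (3)
C = Y - X                                # consumption from the identity (1)

Xc = X - X.mean()

# IV estimator (6): X_t is the instrument for Y_t
b2_iv = np.sum((C - C.mean()) * X) / np.sum((Y - Y.mean()) * X)

# ILS estimator (7): ratio of the OLS slopes of the two reduced-form equations
pi12_hat = np.sum((Y - Y.mean()) * X) / np.sum(Xc ** 2)
pi22_hat = np.sum((C - C.mean()) * X) / np.sum(Xc ** 2)
b2_ils = pi22_hat / pi12_hat

print(b2_iv, b2_ils)
```

The two numbers agree up to floating-point rounding, because the common denominator $\sum_t (X_t - \bar{X})^2$ cancels in the ILS ratio.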
Alternative notation for a single structural equation
Without loss of generality we look at the first equation in a simultaneous equations model. In matrix notation the equation can be written (as an alternative to Greene’s first notation in Lecture 7)
$$y_1 = Y_1 \gamma_1 + X_1 \beta_1 + \varepsilon_1$$
where $y_1$ is $T \times 1$ and $Y_1$ is $T \times (M_1 - 1)$.
$M_1$ is the number of included endogenous variables in the equation. The difference from Greene’s $M_1$ is that here $y_1$ is counted as an included endogenous variable in $M_1$.
$X_1$ is $T \times K_1$, where $K_1$ is the number of included predetermined variables. $\gamma_1$ and $\beta_1$ conform to these definitions.
$$y_1 = Z_1 \delta_1 + \varepsilon_1 \tag{8}$$
$$y_1 = Z_1 \begin{pmatrix} \gamma_1 \\ \beta_1 \end{pmatrix} + \varepsilon_1, \quad Z_1 = (Y_1 : X_1) \tag{9}$$
where $Z_1$ is $T \times (M_1 - 1 + K_1)$.
The general IV estimator
Define a $T \times (M_1 - 1 + K_1)$ matrix $W_1$ of instrumental variables with the following asymptotic properties:
$$\mathrm{plim}\, \frac{1}{T} (W_1' Z_1) = \Sigma_{wz}, \text{ a non-singular square matrix} \tag{10}$$
$$\mathrm{plim}\, \frac{1}{T} (W_1' \varepsilon_1) = 0 \tag{11}$$
$$\mathrm{plim}\, \frac{1}{T} (W_1' W_1) = \Sigma_{ww}, \text{ positive definite} \tag{12}$$
As a direct generalization of the above examples, the general IV estimator is written as
$$\hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' y_1 \tag{13}$$
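The estimator in (13) translates directly into a few lines of matrix code. The sketch below assumes NumPy; the function name `iv_estimate` and the simulated data (one endogenous regressor, a constant, and one excluded instrument) are hypothetical and serve only to illustrate the formula.

```python
import numpy as np

def iv_estimate(W, Z, y):
    """General IV estimator (13): solve (W'Z) delta = W'y for delta."""
    WtZ = W.T @ Z
    if WtZ.shape[0] != WtZ.shape[1]:
        raise ValueError("W and Z must have the same number of columns")
    return np.linalg.solve(WtZ, W.T @ y)

# Hypothetical just-identified example
rng = np.random.default_rng(1)
T = 500
const = np.ones(T)
x_excl = rng.normal(size=T)                        # excluded predetermined variable
eps = rng.normal(size=T)                           # structural disturbance
y2 = 1.0 + 2.0 * x_excl + 0.8 * eps + rng.normal(size=T)   # endogenous regressor
y1 = 0.5 * y2 + 1.5 * const + eps                  # structural equation, delta = (0.5, 1.5)

Z1 = np.column_stack([y2, const])                  # Z1 = (Y1 : X1)
W1 = np.column_stack([x_excl, const])              # instruments, same number of columns as Z1

delta_iv = iv_estimate(W1, Z1, y1)

# Choosing W1 = Z1 reproduces OLS, which is not consistent in this design
delta_ols_via_iv = iv_estimate(Z1, Z1, y1)
delta_ols = np.linalg.lstsq(Z1, y1, rcond=None)[0]
print(delta_iv, delta_ols)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical design choice; it computes the same estimator.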
Exact identification
Let $X_{01}$ denote the matrix with the predetermined variables in the model that are excluded from the first equation. $X_{01}$ has $K - K_1$ columns. The matrix with all the predetermined variables in the model,
$$X = (X_1 : X_{01}),$$
has $K - K_1 + K_1 = K$ columns.
We now consider $X$ as our choice of $W_1$:
$$W_1' Z_1 = X' Z_1, \text{ a } K \times (M_1 - 1 + K_1) \text{ matrix.}$$
In the exact identification case we have
$$K - K_1 = M_1 - 1$$
so $W_1' Z_1 = X' Z_1$ is a square $K \times K$ matrix.
Note that we require $X' Z_1$ to be non-singular (for any given $T$) to be able to calculate the IV estimate.
Exact identification is important for having one unique $X' Z_1$ matrix of moments from which the estimator is constructed.
In sum, in the case where the first equation is exactly identified, the IV estimator is given as
$$\hat{\delta}_1^{IV} = \begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = (X' Z_1)^{-1} X' y_1$$
where $X = (X_1 : X_{01})$ is the $T \times K$ matrix with all the predetermined variables of the model.
Over identification
In this case $X' Z_1$ is not a square matrix, since $K > M_1 - 1 + K_1$.
This means that we can define not one but several $W_1' Z_1$ matrices that are square and invertible. Each one will define an IV estimator of $\delta_1$.
To solve this “problem” we choose $W_1$ as follows:
$$W_1 = (\hat{Y}_1 : X_1)$$
where $\hat{Y}_1$ is the $T \times (M_1 - 1)$ matrix of best predicted values of the included endogenous variables:
$$\hat{y}_j = X \hat{\pi}_j, \quad j = 1, 2, \ldots, M_1 \tag{14}$$
where $\hat{\pi}_j$ is the OLS estimator for each equation in the reduced form of the model.
$$\hat{Y}_1 = (\hat{y}_2 \;\; \hat{y}_3 \;\; \ldots \;\; \hat{y}_{M_1})_{T \times (M_1 - 1)}$$
$$\hat{Y}_1 = X (X'X)^{-1} X' Y_1$$
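$$\hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' y_1 = \left[ (\hat{Y}_1 : X_1)' (Y_1 : X_1) \right]^{-1} (\hat{Y}_1 : X_1)' y_1$$
$$\begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = \begin{pmatrix} \hat{Y}_1' Y_1 & \hat{Y}_1' X_1 \\ X_1' Y_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix} \tag{15}$$
From OLS regression theory (see Lecture 1) we know that
$$\hat{Y}_1 = P Y_1 = X (X'X)^{-1} X' Y_1$$
$$e_1 = M Y_1 = (I - X (X'X)^{-1} X') Y_1$$
where $P$ and $M$ are the projection matrix and the “residual maker” matrix, respectively.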
Intuitively, $X_1' \hat{Y}_1 = X_1' Y_1$, since all the variables in $X_1$ are by definition uncorrelated with the “unpredictable part” of $Y_1$.
This is what is brought out by the algebra:
$$X_1' \hat{Y}_1 = X_1' (Y_1 - e_1) = X_1' (Y_1 - M Y_1) = X_1' Y_1$$
because $X'M = 0$ and hence $X_1'M = 0$: a regression of a sub-set of $X$ on $X$ must (also) give zero residuals.
$$\hat{Y}_1' Y_1 = Y_1' P' (\hat{Y}_1 + e_1) = Y_1' P' (P Y_1 + e_1) = \hat{Y}_1' \hat{Y}_1, \text{ since } PM = 0$$
because the projection matrix and the residual maker are orthogonal to each other.
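The two identities can be verified numerically for any full-rank $X$ that contains $X_1$ as a sub-set of its columns. The sketch below uses random matrices of hypothetical dimensions; `Y1` is just a stand-in for the included endogenous variables, since the identities are purely algebraic.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 40
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])   # included predetermined variables
X01 = rng.normal(size=(T, 3))                             # excluded predetermined variables
X = np.column_stack([X1, X01])                            # all predetermined variables
Y1 = rng.normal(size=(T, 2))                              # stand-in for included endogenous variables

P = X @ np.linalg.solve(X.T @ X, X.T)    # projection matrix X(X'X)^{-1}X'
M = np.eye(T) - P                        # residual maker
Y1_hat = P @ Y1                          # fitted values
e1 = M @ Y1                              # residuals

print(np.abs(X1.T @ Y1_hat - X1.T @ Y1).max())            # numerically zero
print(np.abs(Y1_hat.T @ Y1 - Y1_hat.T @ Y1_hat).max())    # numerically zero
```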
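Using $X_1' \hat{Y}_1 = X_1' Y_1$ and $\hat{Y}_1' Y_1 = \hat{Y}_1' \hat{Y}_1$ we can re-write (15) as
$$\begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = \begin{pmatrix} \hat{Y}_1' \hat{Y}_1 & \hat{Y}_1' X_1 \\ X_1' \hat{Y}_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix} \tag{16}$$
Next, consider (14) and
$$\hat{Y}_1 = X (X'X)^{-1} X' Y_1$$
as the result of the first step in a 2-step OLS procedure. In the second step, we use $\hat{Z}_1 = (\hat{Y}_1 : X_1)$ to estimate the structural equation by OLS, which gives
$$\hat{\delta}_1^{2SLS} = (\hat{Z}_1' \hat{Z}_1)^{-1} \hat{Z}_1' y_1 = \left[ (\hat{Y}_1 : X_1)' (\hat{Y}_1 : X_1) \right]^{-1} (\hat{Y}_1 : X_1)' y_1 = \begin{pmatrix} \hat{Y}_1' \hat{Y}_1 & \hat{Y}_1' X_1 \\ X_1' \hat{Y}_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix}$$
which is identical to (16).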
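The two-step procedure and the IV formula (15) can be compared on simulated data. This is a minimal sketch under assumed values: one endogenous regressor, a constant as the only included predetermined variable, and two excluded predetermined variables, so the equation is over identified. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300
X1 = np.ones((T, 1))                          # included predetermined variable (constant)
X01 = rng.normal(size=(T, 2))                 # two excluded predetermined variables
X = np.column_stack([X1, X01])                # all predetermined variables, K = 3
eps = rng.normal(size=T)                      # structural disturbance
y2 = X01 @ np.array([1.0, -1.0]) + 0.7 * eps + rng.normal(size=T)   # endogenous regressor
y1 = 0.5 * y2 + 2.0 + eps                     # structural equation

Y1 = y2[:, None]
Z1 = np.column_stack([Y1, X1])                # Z1 = (Y1 : X1)

# Step 1: OLS on the reduced form gives the fitted values Y1_hat
Y1_hat = X @ np.linalg.lstsq(X, Y1, rcond=None)[0]
Z1_hat = np.column_stack([Y1_hat, X1])

# Step 2: OLS of y1 on (Y1_hat : X1)
delta_2sls = np.linalg.lstsq(Z1_hat, y1, rcond=None)[0]

# IV estimator (15) with W1 = (Y1_hat : X1) and the original Z1
delta_iv = np.linalg.solve(Z1_hat.T @ Z1, Z1_hat.T @ y1)

print(delta_2sls, delta_iv)
```

The two coefficient vectors coincide up to rounding, which is the numerical counterpart of the identity between (15) and (16).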
2SLS is an IV estimator
The result that 2SLS is identical to IV when IV uses $\hat{Y}_1$ gives immediately that
2SLS is consistent by virtue of being an IV estimator.
No other IV estimator has a better set of instruments than $\hat{Y}_1$. The reason is that $\hat{Y}_1$ is the best linear predictor of $Y_1$. No other sub-set of instruments, or linear combination of instruments, will have higher correlation with the endogenous variables.
By construction the 2SLS instruments are contemporaneously uncorrelated with the disturbance of the structural model: $\hat{Y}_1$ is obtained by regression on the predetermined variables in the structural model.
Consistency of 2SLS
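Since the 2SLS estimator is an IV estimator, the easiest way to prove the consistency property is to use the general assumptions and definitions in (9) and (10)-(12).
$$\hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' (Z_1 \delta_1 + \varepsilon_1) = \delta_1 + (W_1' Z_1)^{-1} W_1' \varepsilon_1$$
$$\mathrm{plim}\, \hat{\delta}_1^{IV} = \delta_1 + \left[ \mathrm{plim}\, \frac{1}{T} W_1' Z_1 \right]^{-1} \mathrm{plim}\, \frac{1}{T} W_1' \varepsilon_1 = \delta_1 + \Sigma_{wz}^{-1} \cdot 0 = \delta_1$$
where $W_1 = (\hat{Y}_1 : X_1)$.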
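A small simulation can illustrate what consistency means in practice. The sketch below returns to the simple Keynes model (1)-(2) with hypothetical parameter values and compares the IV estimator (6) with OLS on the consumption function as $T$ grows. It is an illustration under these assumed values, not a proof.

```python
import numpy as np

def simulate_keynes(T, b1=10.0, b2=0.6, seed=0):
    """Draw (C, Y, X) from the model (1)-(2); all values are hypothetical."""
    rng = np.random.default_rng(seed)
    X = rng.normal(50.0, 10.0, size=T)
    eps = rng.normal(0.0, 2.0, size=T)
    Y = (b1 + X + eps) / (1.0 - b2)
    C = Y - X
    return C, Y, X

def iv_and_ols(C, Y, X):
    """Return (IV estimate with X as instrument, OLS estimate) of b2."""
    c, y = C - C.mean(), Y - Y.mean()
    return np.sum(c * X) / np.sum(y * X), np.sum(c * y) / np.sum(y * y)

results = {T: iv_and_ols(*simulate_keynes(T)) for T in (100, 10_000, 1_000_000)}
for T, (b2_iv, b2_ols) in results.items():
    print(T, b2_iv, b2_ols)
```

In this design the IV estimate settles close to the assumed $b_2 = 0.6$ as $T$ increases, while the OLS estimate stays away from it because $Y_t$ is correlated with $\varepsilon_t$.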
Covariance matrix of 2SLS
Under the assumption of no mis-specification of the model (e.g., no autocorrelation and/or heteroscedasticity) the 2SLS/IV estimator is asymptotically normal.
A way of expressing this result is to write
$$\hat{\delta}_1^{IV} \overset{asy}{\sim} N\left[ \delta_1, \; \sigma_1^2 (W_1' Z_1)^{-1} (W_1' W_1) (Z_1' W_1)^{-1} \right].$$
In doing this, it is to be understood that
$$\sigma_1^2 (W_1' Z_1)^{-1} (W_1' W_1) (Z_1' W_1)^{-1}$$
is an approximation to the asymptotic covariance matrix:
$$\mathrm{Asy.Var}\left[ \hat{\delta}_1^{IV} \right] = \frac{\sigma_1^2}{T} \, \Sigma_{wz}^{-1} \, \Sigma_{ww} \, (\Sigma_{wz}')^{-1}$$
$\sigma_1^2$ can be estimated from the residuals $y_1 - Z_1 \hat{\delta}_1^{IV}$, in direct analogy to the OLS case.
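A sketch of how the covariance approximation could be computed is given below. The function name `iv_with_cov`, the divisor $T$ in the variance estimate (a degrees-of-freedom correction is also common) and the simulated design are assumptions made for illustration.

```python
import numpy as np

def iv_with_cov(W, Z, y):
    """IV estimate (13) with the covariance approximation
    sigma^2 (W'Z)^{-1} (W'W) (Z'W)^{-1}."""
    T = len(y)
    WtZ_inv = np.linalg.inv(W.T @ Z)
    delta = WtZ_inv @ (W.T @ y)
    resid = y - Z @ delta              # residuals use Z1 itself, not the instruments
    sigma2 = resid @ resid / T         # assumption: divisor T
    cov = sigma2 * WtZ_inv @ (W.T @ W) @ WtZ_inv.T   # (Z'W)^{-1} is the transpose of (W'Z)^{-1}
    return delta, sigma2, cov

# Hypothetical over identified design, with 2SLS instruments W1 = (Y1_hat : X1)
rng = np.random.default_rng(4)
T = 400
X1 = np.ones((T, 1))
X01 = rng.normal(size=(T, 2))
X = np.column_stack([X1, X01])
eps = rng.normal(size=T)
y2 = X01 @ np.array([1.0, 0.5]) + 0.6 * eps + rng.normal(size=T)
y1 = 0.5 * y2 + 1.0 + eps

Z1 = np.column_stack([y2, X1])
Y1_hat = X @ np.linalg.lstsq(X, y2, rcond=None)[0]
W1 = np.column_stack([Y1_hat, X1])

delta_hat, sigma2_hat, cov_hat = iv_with_cov(W1, Z1, y1)
std_errors = np.sqrt(np.diag(cov_hat))

# With the 2SLS instruments, W1'Z1 = W1'W1, so the sandwich collapses
cov_2sls_form = sigma2_hat * np.linalg.inv(W1.T @ W1)
print(delta_hat, std_errors)
```

Note that the residuals are formed with the original $Z_1$; taking them from the second-step OLS regression on $\hat{Z}_1$ would give a different variance estimate.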
OLS and IV
Although IV would never be used in a classical regression model, this case nevertheless provides some insight into the small sample properties of IV in more general cases.
Assume that we have a simple regression model. The IV estimator for $\beta_2$ is then:
$$\hat{\beta}_2^{IV} = \frac{\sum_t y_t (w_t - \bar{w})}{\sum_t (x_t - \bar{x}) w_t} = \frac{\sum_t (y_t - \bar{y}) w_t}{\sum_t (x_t - \bar{x}) w_t}$$
For simplicity, regard the regressor as deterministic, and find that
$$\mathrm{Var}[\hat{\beta}_2^{IV}] = \sigma^2 \, \frac{\sum_t (w_t - \bar{w})^2}{\left[ \sum_t (x_t - \bar{x}) w_t \right]^2}.$$
OLS gives
$$\mathrm{Var}[\hat{\beta}_2^{OLS}] = \sigma^2 \, \frac{1}{\sum_t (x_t - \bar{x})^2}.$$
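A sketch of the step behind the IV variance formula, assuming the model is $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ with uncorrelated, homoscedastic disturbances of variance $\sigma^2$, and treating the instrument $w_t$ as fixed along with $x_t$ (the slide does not write the model out, so these are assumptions):

```latex
\begin{aligned}
\sum_t y_t (w_t - \bar{w})
  &= \beta_2 \sum_t x_t (w_t - \bar{w}) + \sum_t (w_t - \bar{w}) \varepsilon_t
   = \beta_2 \sum_t (x_t - \bar{x}) w_t + \sum_t (w_t - \bar{w}) \varepsilon_t \\
\hat{\beta}_2^{IV} - \beta_2
  &= \frac{\sum_t (w_t - \bar{w}) \varepsilon_t}{\sum_t (x_t - \bar{x}) w_t} \\
\mathrm{Var}[\hat{\beta}_2^{IV}]
  &= \frac{\sigma^2 \sum_t (w_t - \bar{w})^2}{\left[ \sum_t (x_t - \bar{x}) w_t \right]^2}
\end{aligned}
```

The first line uses $\sum_t (w_t - \bar{w}) = 0$ and $\sum_t x_t (w_t - \bar{w}) = \sum_t (x_t - \bar{x}) w_t$. Setting $w_t = x_t$ reproduces the OLS variance.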
Weak instruments
$$\frac{\mathrm{Var}[\hat{\beta}_2^{OLS}]}{\mathrm{Var}[\hat{\beta}_2^{IV}]} = \frac{\left[ \sum_t (x_t - \bar{x}) w_t \right]^2}{\sum_t (x_t - \bar{x})^2 \sum_t (w_t - \bar{w})^2} = r_{wx}^2$$
Hence the higher $r_{wx}^2$ is, the more efficient is IV. On the other hand, if $r_{wx}^2$ is low, the large variance of the IV estimator may be regarded as a bigger concern than the bias of the OLS estimator.
This generalizes: if
$$\mathrm{plim}\, \frac{1}{T} (W_1' Z_1) = \Sigma_{wz} \approx 0$$
then IV estimation will yield very poor results due to the problem of weak instruments.
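The variance ratio can be checked against the squared sample correlation directly. The sketch below uses simulated `x` and two hypothetical instruments, one strongly and one weakly correlated with `x`; the function name and all numbers are illustrative assumptions.

```python
import numpy as np

def variance_ratio(x, w, sigma2=1.0):
    """Var[OLS] / Var[IV] from the two variance formulas (sigma2 cancels)."""
    xc, wc = x - x.mean(), w - w.mean()
    s_xw = np.sum(xc * w)                          # equals sum(xc * wc)
    var_iv = sigma2 * np.sum(wc ** 2) / s_xw ** 2
    var_ols = sigma2 / np.sum(xc ** 2)
    return var_ols / var_iv

rng = np.random.default_rng(5)
T = 100
x = rng.normal(size=T)
w_strong = x + 0.3 * rng.normal(size=T)           # instrument highly correlated with x
w_weak = 0.1 * x + rng.normal(size=T)             # instrument weakly correlated with x

ratio_strong = variance_ratio(x, w_strong)
ratio_weak = variance_ratio(x, w_weak)
r2_strong = np.corrcoef(x, w_strong)[0, 1] ** 2
r2_weak = np.corrcoef(x, w_weak)[0, 1] ** 2

print(ratio_strong, r2_strong)
print(ratio_weak, r2_weak)
```

With the weak instrument the ratio is close to zero, meaning the IV variance is many times the OLS variance.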
A test of exogeneity of regressors
Assume a regression model with two explanatory variables, and that the second ($x_3$) is subject to concern about endogeneity.
A test due to Wu and Hausman (see Biorn’s WH note on this) is to
1. regress $x_3$ on $x_2$ and a valid instrumental variable, and
2. include the residuals from that auxiliary regression as a third variable in the regression model.
If the residual variable is significant (use a t-test), exogeneity is rejected.
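The two steps can be sketched with plain OLS algebra. In the simulated data below, `x3` is endogenous by construction, `w` plays the role of the valid instrument, and a constant is included in both regressions. The helper `ols`, the variable names and all numbers are hypothetical.

```python
import numpy as np

def ols(Xmat, y):
    """OLS coefficients, conventional t-ratios and residuals."""
    XtX_inv = np.linalg.inv(Xmat.T @ Xmat)
    beta = XtX_inv @ (Xmat.T @ y)
    resid = y - Xmat @ beta
    dof = Xmat.shape[0] - Xmat.shape[1]
    s2 = resid @ resid / dof
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return beta, beta / se, resid

rng = np.random.default_rng(6)
T = 500
const = np.ones(T)
x2 = rng.normal(size=T)
w = rng.normal(size=T)                               # instrument for x3
eps = rng.normal(size=T)                             # disturbance of the regression model
x3 = 0.5 * x2 + 1.0 * w + 0.8 * eps + rng.normal(size=T)   # correlated with eps
y = 1.0 + 1.0 * x2 + 1.0 * x3 + eps

# Step 1: auxiliary regression of x3 on x2 and the instrument
_, _, v_hat = ols(np.column_stack([const, x2, w]), x3)

# Step 2: add the auxiliary residuals to the regression model
_, t_aug, _ = ols(np.column_stack([const, x2, x3, v_hat]), y)
t_resid = t_aug[-1]                                  # t-ratio of the residual variable

print(t_resid)
```

A large absolute t-ratio on the residual variable, as obtained in this design, leads to rejection of exogeneity of $x_3$.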