Instrumental variables estimation
Ragnar Nymoen, Department of Economics, UiO
6 March 2009
ECON 4610: Lecture 8

Overview

Macro model example: the ILS estimator is equivalent (numerically identical) to IV. This holds for all exactly identified models.

Over-identification: more instruments than included endogenous variables. We can construct many IVs from sub-sets of instruments. All these IVs are consistent, but they will have larger variance than the optimal IV estimator, which uses the linear combination of the instruments that gives the best predictors of the included endogenous variables.

That linear combination is obtained by estimating the reduced form by OLS, which is step 1 in the 2SLS estimator.

2SLS is the optimal IV estimator for the over-identified case (for models with no autocorrelation or heteroscedasticity).

Main reference is G Ch 13.5.1, 13.5.2. B Ch 8.2. K Chap 10.3. Lecture note WH.

ILS and IV

We saw, in the lecture on identification (#6), that the parameters of an identified structural equation can be estimated consistently.

We discovered two consistent estimators:
In the Keynes model: indirect least squares (ILS).
In the measurement error model: the instrumental variables (IV) estimator.

We start by showing that these estimators are identical estimators for the parameters of the consumption function in the Keynes model.

This result is completely general: ILS and IV are equivalent for any just identified equation.

The simple Keynes model revisited

Let $Y_t$ denote GDP in period $t = 1, 2, \dots, T$. $C_t$ is "endogenous expenditure" and $X_t$ denotes "exogenous expenditure". Assume that $C_t$ depends on GDP. Our example model is then

\[ Y_t = C_t + X_t \tag{1} \]
\[ C_t = b_1 + b_2 Y_t + \varepsilon_t, \quad 0 < b_2 < 1 \tag{2} \]

$\varepsilon_t$ is a random disturbance term. We assume that it is white noise, uncorrelated with $X_t$. For simplicity we assume normality, $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$.

The parameter of interest is the marginal propensity to consume, $b_2$.
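The numerical equivalence of ILS and IV claimed in the overview can be checked on simulated data from (1) and (2). The following is a minimal sketch, not part of the lecture: the sample size, the parameter values and the distribution of $X_t$ are assumptions made for illustration, and the function names are hypothetical. ILS is computed as the ratio of the OLS slopes from regressing $C_t$ and $Y_t$ on $X_t$, and IV as the ratio of the cross-moments with $X_t$.

```python
import numpy as np


def simulate_keynes(T, b1, b2, sigma, rng):
    """Draw (Y, C, X) from (1)-(2) by solving the two equations for Y."""
    X = rng.normal(10.0, 2.0, size=T)      # exogenous expenditure (assumed distribution)
    eps = rng.normal(0.0, sigma, size=T)   # white-noise disturbance, independent of X
    Y = (b1 + X + eps) / (1.0 - b2)        # from Y = C + X and C = b1 + b2*Y + eps
    C = b1 + b2 * Y + eps
    return Y, C, X


def iv_slope(C, Y, X):
    """IV estimator of b2 using X as the instrument for Y."""
    return np.sum((C - C.mean()) * X) / np.sum((Y - Y.mean()) * X)


def ils_slope(C, Y, X):
    """ILS estimator of b2: ratio of the OLS slopes of C on X and Y on X."""
    x = X - X.mean()
    slope_c = np.sum((C - C.mean()) * x) / np.sum(x ** 2)
    slope_y = np.sum((Y - Y.mean()) * x) / np.sum(x ** 2)
    return slope_c / slope_y


rng = np.random.default_rng(0)
Y, C, X = simulate_keynes(T=5000, b1=2.0, b2=0.6, sigma=1.0, rng=rng)
b2_iv = iv_slope(C, Y, X)
b2_ils = ils_slope(C, Y, X)
print(b2_iv, b2_ils)
```

The two estimates agree up to floating-point error in any sample, while their distance from the assumed $b_2 = 0.6$ depends on the draw.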
Identification and the reduced form of the model

(1) and (2) define a simultaneous equations model. The parameters of the consumption function are exactly identified.

Solution for the two endogenous variables:

\[ Y_t = \pi_{11} + \pi_{12} X_t + \varepsilon_{1t} \tag{3} \]
\[ C_t = \pi_{21} + \pi_{22} X_t + \varepsilon_{2t} \tag{4} \]

where

\[ \pi_{11} = \frac{b_1}{1 - b_2}, \quad \pi_{12} = \frac{1}{1 - b_2}, \quad \pi_{21} = \frac{b_1}{1 - b_2}, \quad \pi_{22} = \frac{b_2}{1 - b_2}, \]
\[ \varepsilon_{1t} = \frac{1}{1 - b_2}\,\varepsilon_t, \quad \varepsilon_{2t} = \frac{1}{1 - b_2}\,\varepsilon_t. \]

IV estimation of the marginal propensity to consume

With reference to the discussion of the measurement error model, we see that $X_t$ in the model given by (1) and (2) has the property of being an instrumental variable for the consumption function. It is
1. Correlated with $Y_t$
2. Uncorrelated with $\varepsilon_t$

To form moments with $X_t$ in this case, write (2) as

\[ C_t - \bar{C} = b_2 (Y_t - \bar{Y}) + \varepsilon_t - \bar{\varepsilon}, \]

then multiply by $X_t$ and sum over all observations:

\[ \sum_t (C_t - \bar{C}) X_t = b_2 \sum_t (Y_t - \bar{Y}) X_t + \sum_t (\varepsilon_t - \bar{\varepsilon}) X_t. \tag{5} \]

The IV estimator is defined as

\[ \hat{b}_2^{IV} = \frac{\sum_t (C_t - \bar{C}) X_t}{\sum_t (Y_t - \bar{Y}) X_t} \tag{6} \]

which can be seen as the solution of (5) after first setting $\sum_t (\varepsilon_t - \bar{\varepsilon}) X_t = 0$. The defining equation

\[ \sum_t (C_t - \bar{C}) X_t = \hat{b}_2^{IV} \sum_t (Y_t - \bar{Y}) X_t \]

is often referred to as a quasi-normal equation, because it takes the same role in the motivation of the IV estimator as the normal equations do in the derivation of the OLS estimator.

The ILS estimator for $b_2$ was shown in Lecture 6 to be

\[ \hat{b}_2^{ILS} = \frac{\hat{\pi}_{22}}{\hat{\pi}_{12}} = \frac{\sum_t (C_t - \bar{C}) X_t \big/ \sum_t (X_t - \bar{X})^2}{\sum_t (Y_t - \bar{Y}) X_t \big/ \sum_t (X_t - \bar{X})^2} = \hat{b}_2^{IV} \tag{7} \]

Lecture 6 showed that $\operatorname{plim} \hat{b}_2^{ILS} = b_2$, so it follows that $\operatorname{plim} \hat{b}_2^{IV} = b_2$.

Consistency of $\hat{b}_2^{IV}$ is also straightforward to show directly: start with (6), use (5), and finally use the assumption that $\operatorname{plim} \frac{1}{T} \sum_t (\varepsilon_t - \bar{\varepsilon}) X_t = 0$ (instrument validity).

Alternative notation for a single structural equation

Without loss of generality we look at the first equation in a simultaneous equations model.
In matrix notation the equation can be written (as an alternative to Greene's first notation in Lecture 7)

\[ y_1 = Y_1 \gamma_1 + X_1 \beta_1 + \varepsilon_1 \]

where $y_1$ is $T \times 1$ and $Y_1$ is $T \times (M_1 - 1)$. $M_1$ is the number of included endogenous variables in the equation. (The difference from Greene, who also uses $M_1$, is that here $y_1$ is counted as an included endogenous variable in $M_1$.) $X_1$ is $T \times K_1$, where $K_1$ is the number of included predetermined variables. $\gamma_1$ and $\beta_1$ conform to these definitions.

\[ y_1 = [Y_1 : X_1] \begin{pmatrix} \gamma_1 \\ \beta_1 \end{pmatrix} + \varepsilon_1 \tag{8} \]
\[ y_1 = Z_1 \delta_1 + \varepsilon_1, \quad Z_1 = [Y_1 : X_1] \tag{9} \]

where $Z_1$ is $T \times (M_1 - 1 + K_1)$.

The general IV estimator

Define a $T \times (M_1 - 1 + K_1)$ matrix $W_1$ of instrumental variables with the following asymptotic properties:

\[ \operatorname{plim} \frac{1}{T} (W_1' Z_1) = \Sigma_{wz}, \text{ a non-singular square matrix} \tag{10} \]
\[ \operatorname{plim} \frac{1}{T} (W_1' \varepsilon_1) = 0 \tag{11} \]
\[ \operatorname{plim} \frac{1}{T} (W_1' W_1) = \Sigma_{ww}, \text{ positive definite} \tag{12} \]

As a direct generalization of the above examples, the general IV estimator is written as

\[ \hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' y_1 \tag{13} \]

Exact identification

Let $X_{01}$ denote the matrix with the predetermined variables in the model that are excluded from the first equation. $X_{01}$ has $K - K_1$ columns.

The matrix with all the predetermined variables in the model,

\[ X = [X_1 : X_{01}], \]

has $K - K_1 + K_1 = K$ columns.

We now consider $X$ as our choice of $W_1$:

\[ W_1' Z_1 = X' Z_1, \text{ a } K \times (M_1 - 1 + K_1) \text{ matrix.} \]

In the exact identification case we have

\[ K - K_1 = M_1 - 1, \]

so $W_1' Z_1 = X' Z_1$ is a square $K \times K$ matrix.

Note that we require $X' Z_1$ to be non-singular (for any given $T$) to be able to calculate the IV estimate.

Exact identification is important for having one unique $X' Z_1$ matrix of moments from which the estimator is constructed.

In sum, in the case where the first equation is exactly identified, the IV estimator is given as

\[ \hat{\delta}_1^{IV} = \begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = (X' Z_1)^{-1} X' y_1 \]

where $X = [X_1 : X_{01}]$ is the $T \times K$ matrix with all the predetermined variables of the model.

Over identification

In this case $X' Z_1$ is not a square matrix, since $K > M_1 - 1 + K_1$.
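This means that we can define not one but several $W_1' Z_1$ matrices that are square and invertible. Each one will define an IV estimator of $\delta_1$.

To solve this "problem" we choose $W_1$ as follows:

\[ W_1 = [\hat{Y}_1 : X_1] \]

where $\hat{Y}_1$ is the $T \times (M_1 - 1)$ matrix of best predicted values of the included endogenous variables:

\[ \hat{y}_j = X \hat{\pi}_j, \quad j = 2, \dots, M_1 \tag{14} \]

where $\hat{\pi}_j$ is the OLS estimator for each equation in the reduced form of the model. Collecting the columns,

\[ \hat{Y}_1 = [\hat{y}_2 \;\; \hat{y}_3 \;\; \dots \;\; \hat{y}_{M_1}], \quad T \times (M_1 - 1), \]
\[ \hat{Y}_1 = X (X'X)^{-1} X' Y_1. \]

The IV estimator becomes

\[ \hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' y_1 = \left( [\hat{Y}_1 : X_1]' [Y_1 : X_1] \right)^{-1} [\hat{Y}_1 : X_1]' y_1 \]
\[ \begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = \begin{pmatrix} \hat{Y}_1' Y_1 & \hat{Y}_1' X_1 \\ X_1' Y_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix} \tag{15} \]

From OLS regression theory (see Lecture 1) we know that

\[ \hat{Y}_1 = P Y_1 = X (X'X)^{-1} X' Y_1 \]
\[ e_1 = M Y_1 = (I - X (X'X)^{-1} X') Y_1 \]

where $P$ and $M$ are the projection matrix and the "residual maker" matrix.

Intuitively, $X_1' \hat{Y}_1 = X_1' Y_1$, since all the variables in $X_1$ are by definition uncorrelated with the "unpredictable part" of $Y_1$. This is what is brought out by the algebra:

\[ X_1' \hat{Y}_1 = X_1' (Y_1 - e_1) = X_1' (Y_1 - M Y_1) = X_1' Y_1 \]

because $X_1' M = 0$. This follows from $X' M = 0$: a regression of a sub-set of $X$ on $X$ must (also) give zero residuals.

\[ \hat{Y}_1' Y_1 = Y_1' P' (\hat{Y}_1 + e_1) = Y_1' P' (P Y_1 + e_1) = \hat{Y}_1' \hat{Y}_1, \text{ since } PM = 0 \]

because the projection matrix and the residual maker are orthogonal to each other.

Using

\[ X_1' \hat{Y}_1 = X_1' Y_1 \quad \text{and} \quad \hat{Y}_1' Y_1 = \hat{Y}_1' \hat{Y}_1 \]

we can re-write (15) as

\[ \begin{pmatrix} \hat{\gamma}_1^{IV} \\ \hat{\beta}_1^{IV} \end{pmatrix} = \begin{pmatrix} \hat{Y}_1' \hat{Y}_1 & \hat{Y}_1' X_1 \\ X_1' \hat{Y}_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix} \tag{16} \]

Next, consider (14) and

\[ \hat{Y}_1 = X (X'X)^{-1} X' Y_1 \]

as the result of the first step in a 2-step OLS procedure. In the second step, we use $\hat{Z}_1 = [\hat{Y}_1 : X_1]$ to estimate the structural equation by OLS, which gives

\[ \hat{\delta}_1^{2SLS} = (\hat{Z}_1' \hat{Z}_1)^{-1} \hat{Z}_1' y_1 = \left( [\hat{Y}_1 : X_1]' [\hat{Y}_1 : X_1] \right)^{-1} [\hat{Y}_1 : X_1]' y_1 = \begin{pmatrix} \hat{Y}_1' \hat{Y}_1 & \hat{Y}_1' X_1 \\ X_1' \hat{Y}_1 & X_1' X_1 \end{pmatrix}^{-1} \begin{pmatrix} \hat{Y}_1' y_1 \\ X_1' y_1 \end{pmatrix} \]

which is identical to (16).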
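The equality between the IV estimator (13) with $W_1 = [\hat{Y}_1 : X_1]$ and the two-step OLS procedure can also be verified numerically. The sketch below is an illustration under assumptions: the data-generating process, the parameter values and all variable names are made up for the example. There is one included endogenous regressor, two included predetermined variables (a constant and `x1`) and two excluded instruments, so the equation is over-identified.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000

# Predetermined variables: X1 is included in the equation, z is excluded from it.
const = np.ones(T)
x1 = rng.normal(size=T)
z = rng.normal(size=(T, 2))
X1 = np.column_stack([const, x1])
X = np.column_stack([X1, z])           # all K = 4 predetermined variables

# Structural disturbance and an endogenous regressor correlated with it.
eps = rng.normal(size=T)
v = 0.8 * eps + rng.normal(size=T)
Y1 = (1.0 + 0.5 * x1 + z @ np.array([1.0, -1.0]) + v).reshape(-1, 1)

delta_true = np.array([0.7, 1.0, -0.5])  # assumed [gamma_1, beta_const, beta_x1]
Z1 = np.column_stack([Y1, X1])
y1 = Z1 @ delta_true + eps

# Step 1: OLS on the reduced form gives the fitted values Y1_hat = X (X'X)^{-1} X' Y1.
Pi_hat, *_ = np.linalg.lstsq(X, Y1, rcond=None)
Y1_hat = X @ Pi_hat

# IV estimator (13) with W1 = [Y1_hat : X1].
W1 = np.column_stack([Y1_hat, X1])
delta_iv = np.linalg.solve(W1.T @ Z1, W1.T @ y1)

# Step 2: OLS of y1 on Z1_hat = [Y1_hat : X1].
delta_2sls, *_ = np.linalg.lstsq(W1, y1, rcond=None)

print(delta_iv)
print(delta_2sls)
```

The two coefficient vectors coincide up to numerical precision, and the identity $X_1' \hat{Y}_1 = X_1' Y_1$ used in the derivation can be checked on the same arrays.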
2SLS is an IV estimator

The result that 2SLS is identical to IV when IV uses $\hat{Y}_1$ gives immediately that

2SLS is consistent by virtue of being IV.

No other IV estimator has a better set of instruments than $\hat{Y}_1$. The reason is that $\hat{Y}_1$ is the best linear predictor of $Y_1$. No other sub-set of instruments, or linear combination of instruments, will have higher correlation with the endogenous variables.

By construction the 2SLS instruments are contemporaneously uncorrelated with the disturbance of the structural model: $\hat{Y}_1$ is obtained by regression on the predetermined variables in the structural model.

Consistency of 2SLS

Since the 2SLS estimator is an IV estimator, the easiest way to prove the consistency property is to use the general assumptions and definitions in (9) and (10)-(12).

\[ \hat{\delta}_1^{IV} = (W_1' Z_1)^{-1} W_1' (Z_1 \delta_1 + \varepsilon_1) = \delta_1 + (W_1' Z_1)^{-1} W_1' \varepsilon_1 \]
\[ \operatorname{plim} \hat{\delta}_1^{IV} = \delta_1 + \left( \operatorname{plim} \frac{1}{T} W_1' Z_1 \right)^{-1} \operatorname{plim} \frac{1}{T} W_1' \varepsilon_1 = \delta_1 + \Sigma_{wz}^{-1} \cdot 0 = \delta_1 \]

where $W_1 = [\hat{Y}_1 : X_1]$.

Covariance matrix of 2SLS

Under the assumption of no mis-specification of the model (e.g., no autocorrelation and/or heteroscedasticity), the 2SLS/IV estimator is asymptotically normal. A way of expressing this result is to write

\[ \hat{\delta}_1^{IV} \overset{a}{\sim} N\left[ \delta_1, \; \sigma_1^2 (W_1' Z_1)^{-1} (W_1' W_1) (Z_1' W_1)^{-1} \right]. \]

In doing this, it is to be understood that

\[ \sigma_1^2 (W_1' Z_1)^{-1} (W_1' W_1) (Z_1' W_1)^{-1} \]

is an approximation to the asymptotic covariance matrix:

\[ \text{Asy.Var}\left[ \hat{\delta}_1^{IV} \right] = \frac{\sigma_1^2}{T} \, \Sigma_{wz}^{-1} \Sigma_{ww} (\Sigma_{wz}')^{-1} \]

$\sigma_1^2$ can be estimated from the residuals $y_1 - Z_1 \hat{\delta}_1^{IV}$ in direct analogy to the OLS case.

OLS and IV

Although IV would never be used in a classical regression model, this case nevertheless provides some insight into the small sample properties of IV in more general cases.

Assume that we have a simple regression model.
The IV estimator for $\beta_2$ is then:

\[ \hat{\beta}_2^{IV} = \frac{\sum_t (y_t - \bar{y}) w_t}{\sum_t (x_t - \bar{x}) w_t} = \frac{\sum_t y_t (w_t - \bar{w})}{\sum_t (x_t - \bar{x}) w_t} \]

For simplicity, regard the regressor as deterministic, and find that

\[ \text{Var}[\hat{\beta}_2^{IV}] = \sigma^2 \, \frac{\sum_t (w_t - \bar{w})^2}{\left( \sum_t (x_t - \bar{x}) w_t \right)^2}. \]

OLS gives

\[ \text{Var}[\hat{\beta}_2^{OLS}] = \sigma^2 \, \frac{1}{\sum_t (x_t - \bar{x})^2}. \]

Weak instruments

\[ r_{wx}^2 = \frac{\left( \sum_t (x_t - \bar{x}) w_t \right)^2}{\sum_t (x_t - \bar{x})^2 \sum_t (w_t - \bar{w})^2} \]
\[ \frac{\text{Var}[\hat{\beta}_2^{OLS}]}{\text{Var}[\hat{\beta}_2^{IV}]} = r_{wx}^2 \]

Hence the higher $r_{wx}^2$ is, the more efficient is IV. On the other hand, if $r_{wx}^2$ is low, the large variance of the IV estimator may be regarded as a bigger concern than the bias of the OLS estimator.

This generalizes: if

\[ \operatorname{plim} \frac{1}{T} (W_1' Z_1) = \Sigma_{wz} \approx 0 \]

then IV estimation will yield very poor results due to the problem of weak instruments.

A test of exogeneity of regressors

Assume a regression model with two explanatory variables, where the second ($x_3$) is subject to concern about endogeneity. A test due to Wu and Hausman (see Biorn's WH note on this) is to
1. regress $x_3$ on $x_2$ and a valid instrumental variable, and
2. include the residuals from that auxiliary regression as a third variable in the regression model.

If the residual variable is significant (use a t-test), exogeneity is rejected.
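The two steps of the test can be sketched on simulated data. This is an illustration under assumptions, not Biorn's or the lecture's own code: the data-generating process, the coefficient values and the helper `ols` are hypothetical, and $x_3$ is made endogenous by construction so that the test should reject.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients, conventional standard errors and residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    dof = X.shape[0] - X.shape[1]
    s2 = resid @ resid / dof
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    return b, se, resid


rng = np.random.default_rng(2)
T = 1000
const = np.ones(T)
x2 = rng.normal(size=T)
w = rng.normal(size=T)                  # instrument, independent of the disturbance
v = rng.normal(size=T)
eps = 0.8 * v + rng.normal(size=T)      # disturbance correlated with v
x3 = 1.0 + 0.5 * x2 + 1.0 * w + v       # hence x3 is endogenous (assumed design)
y = 1.0 + 2.0 * x2 - 1.0 * x3 + eps

# Step 1: auxiliary regression of x3 on x2 and the instrument; keep the residuals.
_, _, v_hat = ols(x3, np.column_stack([const, x2, w]))

# Step 2: add the residuals to the regression model and t-test their coefficient.
b, se, _ = ols(y, np.column_stack([const, x2, x3, v_hat]))
t_resid = b[3] / se[3]
print(t_resid)
```

A large absolute t-value on the residual variable leads to rejection of exogeneity of $x_3$, which is the expected outcome under this design.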