Identification

Ragnar Nymoen
Department of Economics, UiO
27 February 2009
ECON 4610: Lecture 6

Overview

In this set of notes we address identification in simultaneous equation models and in models with measurement errors.

Identification is defined and then discussed with the aid of an example.

We then present general notation for simultaneous equation models, and give some general rules for identification in such systems.

Identification with measurement errors comes at the end.

Main references: G Ch 13.1; B Ch 7.1–7.6; K Ch 10.2; 6N: D, E and F.

Identification of econometric models

A parameter in an econometric model is identified if it can be estimated consistently (not necessarily by OLS, though!).

Some prefer to express this by saying that an econometric model is identified if its parameters can be estimated consistently in an ideal sample.

A main point is that identification is a logical property of the econometric model, and can be addressed prior to estimation issues.

The simple Keynes model revisited

Let $Y_t$ denote GDP in period $t = 1, 2, \ldots, T$. $C_t$ is “endogenous expenditure” and $X_t$ denotes “exogenous expenditure”. Assuming that $C_t$ depends on GDP, our example model is

$$Y_t = C_t + X_t \tag{1}$$
$$C_t = b_1 + b_2 Y_t + \varepsilon_t, \quad 0 < b_2 < 1 \tag{2}$$

$\varepsilon_t$ is a random disturbance term. We assume that it is white noise, uncorrelated with $X_t$. For simplicity we assume normality: $\varepsilon_t \sim N(0, \sigma_\varepsilon^2)$.

The parameter of interest is the marginal propensity to consume, $b_2$.

The reduced form of the model

(1) and (2) define a simultaneous equations model. Solving for the two endogenous variables gives

$$Y_t = \pi_{11} + \pi_{12} X_t + \epsilon_{1t} \tag{3}$$
$$C_t = \pi_{21} + \pi_{22} X_t + \epsilon_{2t} \tag{4}$$

where

$$\pi_{11} = \frac{b_1}{1 - b_2}, \quad \pi_{12} = \frac{1}{1 - b_2}, \quad \epsilon_{1t} = \frac{1}{1 - b_2}\,\varepsilon_t,$$
$$\pi_{21} = \frac{b_1}{1 - b_2}, \quad \pi_{22} = \frac{b_2}{1 - b_2}, \quad \epsilon_{2t} = \frac{1}{1 - b_2}\,\varepsilon_t.$$

The distribution of Y and C

The reduced form written more compactly:

$$Y_t = \mu_{yt} + \epsilon_{1t} \tag{5}$$
$$C_t = \mu_{ct} + \epsilon_{2t} \tag{6}$$

with $\mu_{yt} = \pi_{11} + \pi_{12} X_t$ and $\mu_{ct} = \pi_{21} + \pi_{22} X_t$, and where

$$\begin{pmatrix} \epsilon_{1t} \\ \epsilon_{2t} \end{pmatrix} \sim N\left( 0, \begin{pmatrix} \sigma_y^2 & \sigma_{cy} \\ \sigma_{cy} & \sigma_c^2 \end{pmatrix} \,\Big|\, X_t \right).$$

The conditional distributions of the stochastic variables $\epsilon_{1t}$ and $\epsilon_{2t}$, given $X_t$, are binormal with zero expectations and variance matrix

$$\begin{pmatrix} \sigma_y^2 & \sigma_{cy} \\ \sigma_{cy} & \sigma_c^2 \end{pmatrix}. \tag{7}$$

Simultaneity bias

In this example the (main) parameter of interest is the marginal propensity to consume, $b_2$.

In lecture 5 we found that the OLS estimator $\hat{b}_2$ was inconsistent:

$$\operatorname{plim}(\hat{b}_2 - b_2) = \frac{1 - b_2}{\dfrac{\operatorname{Var}(X_t)}{\sigma_\varepsilon^2} + 1}$$

The source of this bias is that in the consumption function, because of the logic of the model, the disturbance $\varepsilon_t$ is correlated with income $Y_t$.

Idea: if we obtain OLS estimates from a model where there is no such correlation, can we derive a consistent estimator of $b_2$ from that model?

Consistency by Indirect least squares (ILS)

Note that from the reduced form we have

$$\frac{\pi_{22}}{\pi_{12}} = \frac{b_2/(1 - b_2)}{1/(1 - b_2)} = b_2$$

Let the OLS estimators of $\pi_{12}$ and $\pi_{22}$ be denoted with “hats”. Given the specification of the model it follows that $\operatorname{plim} \hat{\pi}_{22} = \pi_{22}$ and $\operatorname{plim} \hat{\pi}_{12} = \pi_{12}$. Therefore

$$\operatorname{plim} \frac{\hat{\pi}_{22}}{\hat{\pi}_{12}} = b_2$$

as well.

This shows that there exists a consistent estimator of $b_2$ (and of $b_1$ as well; what is it?). By definition, the parameters of the consumption function are identified.

Identification in simultaneous equation models

Looking at the example Keynes model, we see that it is under-determined: there are more economic variables (3) than there are equations (2).

Under-determinedness is a necessary condition for identification in simultaneous equations models.

Why? In the Keynes model, $X_t$ represents observable variation in $Y_t$ that is not due to $C_t$. With no $X_t$ in the model, it is determined, and all we can “infer” from the scatter plot between $C_t$ and $Y_t$ is the 45 degree line. With no instrument the model is not identified.

We say that $X_t$ is the instrumental variable, through which the model becomes identified.

In this case, with one degree of freedom, the simultaneous equations model is exactly identified.
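
The contrast between OLS and indirect least squares can be illustrated with a small simulation. The following is only a sketch under assumed values: the parameter values (b1 = 1, b2 = 0.6, standard deviation 2 for Xt and 1 for the disturbance), the sample size, the seed and all function names are illustrative choices, not part of the notes. It generates data from (1) and (2), computes the OLS slope from regressing Ct on Yt, and compares it with the ratio of the OLS slopes from regressing Ct and Yt on Xt.

```python
import numpy as np


def simulate_keynes(n=100_000, b1=1.0, b2=0.6, sigma_eps=1.0, sigma_x=2.0, seed=0):
    """Draw (X, Y, C) from the Keynes model (1)-(2); all values are assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.normal(10.0, sigma_x, size=n)      # exogenous expenditure
    eps = rng.normal(0.0, sigma_eps, size=n)   # consumption disturbance
    y = (b1 + x + eps) / (1.0 - b2)            # solution of (1)-(2) for Y
    c = y - x                                  # identity (1) rearranged
    return x, y, c


def slope(dep, reg):
    """OLS slope from regressing dep on reg (with an intercept)."""
    return np.cov(dep, reg)[0, 1] / np.var(reg, ddof=1)


def ols_and_ils(x, y, c):
    """Return the direct OLS estimate and the ILS estimate of b2."""
    ols_b2 = slope(c, y)            # inconsistent: Y is correlated with eps
    slope_y_on_x = slope(y, x)      # reduced-form slope in the Y equation
    slope_c_on_x = slope(c, x)      # reduced-form slope in the C equation
    ils_b2 = slope_c_on_x / slope_y_on_x
    return ols_b2, ils_b2


x, y, c = simulate_keynes()
ols_b2, ils_b2 = ols_and_ils(x, y, c)
# Asymptotic OLS bias from the formula: (1 - b2) / (Var(X)/sigma_eps^2 + 1)
bias = (1.0 - 0.6) / (2.0**2 / 1.0**2 + 1.0)
print(f"OLS: {ols_b2:.3f}  ILS: {ils_b2:.3f}  b2 + asymptotic bias: {0.6 + bias:.3f}")
```

With these assumed values the OLS estimate should settle near b2 plus the asymptotic bias, while the ILS ratio should settle near b2 itself.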

Over-identification

What happens if we replace the identity $Y_t = C_t + X_t$ by

$$Y_t = C_t + X_t + Z_t$$

where $Z_t$ is a second exogenous demand component?

Is the model made up of this identity and the consumption function identified?

Yes, because we can use ILS to obtain two consistent estimators of $b_2$.

We say that the model is over-identified because there are more instruments than are strictly needed for identification.

Over-identification is a luxury, not a problem. We will learn methods that use the extra information in an optimal way.

Notation for the simultaneous equation model

$M$ endogenous variables $(y_1, \ldots, y_M)$, $M$ linear equations, and $M$ structural disturbances $(\varepsilon_1, \ldots, \varepsilon_M)$.

$K$ exogenous variables $(x_1, \ldots, x_K)$.

Let $y_t$, $x_t$ denote vectors with observations $(t = 1, \ldots, T)$, and let $e_t$ contain the disturbances:

$$y_t' \Gamma + x_t' B = e_t' \tag{8}$$

where $\Gamma$ is an $M \times M$ coefficient matrix and $B$ is a $K \times M$ coefficient matrix. (8) is called the structural form of the model.

$\Gamma^{-1}$ exists ($\Gamma$ is non-singular). The reduced form is

$$y_t' = -x_t' B \Gamma^{-1} + e_t' \Gamma^{-1} = x_t' \Pi + v_t' \tag{9}$$

$\Pi = -B\Gamma^{-1}$ is the matrix of reduced form coefficients.

The macro model example in matrix notation

$$y_t' = \begin{pmatrix} Y_t & C_t \end{pmatrix}, \quad x_t' = \begin{pmatrix} X_t & 1 \end{pmatrix}, \quad \Gamma = \begin{pmatrix} 1 & -b_2 \\ -1 & 1 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & 0 \\ 0 & -b_1 \end{pmatrix}, \quad e_t' = \begin{pmatrix} 0 & \varepsilon_t \end{pmatrix}$$

Multiplying out according to (8) gives

$$Y_t - C_t - X_t = 0$$
$$-b_2 Y_t + C_t - b_1 = \varepsilon_t$$

which of course can be written in the usual way as (1) and (2).

A partial market equilibrium model

Let $Q_t$ denote the quantity of a good, and $P_t$ the price. A model of partial market equilibrium is, for example,

$$Q_t = -\gamma_{21} P_t - \beta_{11} - \beta_{21} x_{1t} - \beta_{31} x_{2t} + \varepsilon_{1t}, \quad \text{demand} \tag{10}$$
$$\gamma_{12} Q_t = -P_t - \beta_{12} - \beta_{22} x_{1t} - \beta_{32} x_{2t} + \varepsilon_{2t}, \quad \text{supply} \tag{11}$$

where we have adopted the notation in Greene p 358 eq (13-2), noting that $y_{1t} = Q_t$ and $y_{2t} = P_t$.

Because of the interpretation as supply and demand schedules we assume $\gamma_{21} > 0$ and $\gamma_{12} < 0$.

In Greene’s matrix notation the model is (8) with the following specifications:

$$y_t' = \begin{pmatrix} Q_t & P_t \end{pmatrix}, \quad x_t' = \begin{pmatrix} 1 & x_{1t} & x_{2t} \end{pmatrix}, \quad \Gamma = \begin{pmatrix} 1 & \gamma_{12} \\ \gamma_{21} & 1 \end{pmatrix}, \quad B = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \\ \beta_{31} & \beta_{32} \end{pmatrix}, \quad e_t' = \begin{pmatrix} \varepsilon_{1t} & \varepsilon_{2t} \end{pmatrix}$$

Identification in the partial equilibrium model

It is useful to proceed in steps. For the simultaneous equation model (10)-(11) there are 5 cases to consider:

1 No exogenous variables: $\beta_{21} = \beta_{31} = \beta_{22} = \beta_{32} = 0$.
2 One exogenous variable only, in the demand equation: $\beta_{21} \neq 0$, and $\beta_{31} = \beta_{22} = \beta_{32} = 0$.
3 One exogenous variable only, in the supply equation: $\beta_{32} \neq 0$, and $\beta_{21} = \beta_{31} = \beta_{22} = 0$.
4 One separate exogenous variable in each equation: $\beta_{21} \neq 0$, $\beta_{32} \neq 0$ and $\beta_{31} = \beta_{22} = 0$.
5 Both exogenous variables enter both equations.

Case 1

In this case the only source of variation is the random structural shocks. The model will generate a scatter plot like the one below, where there is no way we can “place” one or both of the curves.

[Figure: scatter plot of PRICE against QUANTITY]

Case 2

In this case $x_{1t}$ shifts the demand schedule, but not the supply curve (remember that $x_{2t}$ is not in the model). It looks like the supply curve is now identified!

[Figure: scatter plot of PRICE against QUANTITY]

Case 3

In this case $x_{2t}$ shifts the supply schedule, but not the demand curve ($x_{1t}$ is not in the model). It looks like the demand curve is now identified!

[Figure: scatter plot of PRICE against QUANTITY]

Case 4

$x_{2t}$ shifts the supply schedule, but not the demand curve. $x_{1t}$ shifts the demand curve, but not the supply curve.

$$Q_t = -\gamma_{21} P_t - \beta_{11} - \beta_{21} x_{1t} + \varepsilon_{1t}, \quad \text{demand} \tag{12}$$
$$\gamma_{12} Q_t = -P_t - \beta_{12} - \beta_{32} x_{2t} + \varepsilon_{2t}, \quad \text{supply} \tag{13}$$

In this case we cannot readily represent the situation graphically: since both relationships are shifting, the observations in the P-Q plane will not lie on any particular “visible” curve.

Graphically we are in much the same situation as in Case 1. But in this case the picture tells us nothing about identification.

Case 4, cont’d

Intuitively, since $x_{1t}$ and $x_{2t}$ appear in the equations in the same way as in Case 2 and Case 3, but now jointly, identification is not lost in Case 4.

We consider a linear combination of (12) and (13), and check whether that relation can (falsely) represent either curve.

Let $0 \le h \le 1$. The combined relationship then becomes:

$$[(1 - h) + h\gamma_{12}]\, Q_t = -[(1 - h)\gamma_{21} + h]\, P_t - (1 - h)\beta_{11} - h\beta_{12} - (1 - h)\beta_{21} x_{1t} - h\beta_{32} x_{2t} + (1 - h)\varepsilon_{1t} + h\varepsilon_{2t}$$

We see that for $0 < h < 1$ this equation contains both $x_{1t}$ and $x_{2t}$, so it cannot represent (12), which excludes $x_{2t}$, and it cannot represent (13), which excludes $x_{1t}$.

For $h = 0$ we get back (12).

For $h = 1$ we get back (13).

Hence both are identified.

Case 5

The same check as in Case 4, now applied to (10) and (11), shows that a linear combination of the two equations is consistent with both of the true structures. Hence, neither of them is identified.

A second look shows that Case 1 and Case 5 have a common feature that accounts for the total lack of identification: neither equation omits an exogenous variable that is included in the other equation.

The identified cases are characterized by the omission of one exogenous variable that appears in the other equation.

This motivates the necessary condition for identification called the order condition.

The order condition

In a simultaneous equation model with M equations, an equation can be identified only if the number of exogenous variables excluded from the equation is equal to or larger than the number of endogenous variables included in the equation, less one.

We see that the identified equations in Cases 2, 3 and 4 are exactly (or just) identified according to this order condition.

With the Keynes model we gave an example of over-identification: the case where the model was specified with two separate exogenous demand components in the general budget equation.

Note that when we use this condition (or the equivalent one that follows) we count any identities (such as the general budget equation of the Keynes model) as equations of the model.
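
The counting rule behind the order condition can be sketched in a few lines of code. This is an illustration only: the function and variable names, the labels "fails", "exact" and "over", and the treatment of the constant term as an exogenous variable included in every behavioural equation are assumptions of the sketch. It also checks only the counting rule, which is a necessary condition, so "exact" and "over" do not by themselves guarantee identification.

```python
def order_condition(included_endog, included_exog, model_exog):
    """Compare excluded exogenous variables with included endogenous ones, less one."""
    excluded_exog = len(set(model_exog) - set(included_exog))
    required = len(set(included_endog)) - 1
    if excluded_exog < required:
        return "fails"   # the equation cannot be identified
    if excluded_exog == required:
        return "exact"   # counting rule holds with equality
    return "over"        # more exclusions than strictly needed


# Both market equations contain the two endogenous variables Q and P.
ENDOG = {"Q", "P"}

# Exogenous variables included in (demand, supply) in Cases 1-5 of the market model.
CASES = {
    1: ({"const"}, {"const"}),
    2: ({"const", "x1"}, {"const"}),
    3: ({"const"}, {"const", "x2"}),
    4: ({"const", "x1"}, {"const", "x2"}),
    5: ({"const", "x1", "x2"}, {"const", "x1", "x2"}),
}


def classify_case(case):
    """Return the order-condition verdict for (demand, supply) in a given case."""
    demand_exog, supply_exog = CASES[case]
    model_exog = demand_exog | supply_exog
    return (
        order_condition(ENDOG, demand_exog, model_exog),
        order_condition(ENDOG, supply_exog, model_exog),
    )


for case in CASES:
    demand, supply = classify_case(case)
    print(f"Case {case}: demand {demand}, supply {supply}")

# Keynes consumption function: C and Y included, only the constant among the exogenous.
print(order_condition({"C", "Y"}, {"const"}, {"const", "X"}))
print(order_condition({"C", "Y"}, {"const"}, {"const", "X", "Z"}))
```

The last two calls apply the same count to the consumption function of the Keynes model, first with one exogenous demand component and then with two.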

An equivalent order condition

In a simultaneous equation model with M equations, an equation can be identified only if it excludes at least $M - 1$ of the variables appearing in the model.

To see the equivalence, let $M_i$ and $K_i$ denote the number of included endogenous and exogenous variables in equation $i$. The order condition is then:

$$K - K_i \ge M_i - 1$$

If we add $M - M_i$ on both sides and collect terms we get

$$(K - K_i) + (M - M_i) \ge (M_i - 1) + (M - M_i)$$
$$(M - M_i) + (K - K_i) \ge M - 1$$

which is the equivalent formulation of the order condition.

A sufficient (rank) condition

Looking back at Case 4 above, both equations formally satisfy the order condition. But suppose that $\beta_{21}$ and $\beta_{32}$ were zero after all. Then we would lose identification: the linear combination cannot be told apart from the true structural relationships.

So we actually need to assume $\beta_{21} \neq 0$ and $\beta_{32} \neq 0$.

This generalizes to larger systems, where it can happen that the parameter constellations are exactly such that a linear combination becomes inseparable from one of the structural equations. To rule that out the following sufficient condition has been formulated:

In a model with M equations, an equation is identified if and only if it omits at least $M - 1$ variables, and at least one non-zero $(M - 1) \times (M - 1)$ determinant can be formed from the array of coefficients with which the omitted variables appear in the other equations of the model.

Linear homogeneous restrictions on the parameters

So far we have discussed identification in terms of exclusion restrictions, which is synonymous with omission of variables.

Exclusion restrictions are special cases of linear restrictions on parameters. For example:

$\beta_{31} = 0$ is an exclusion restriction and a linear restriction.
$\beta_{31} - 1 = 0$ is a linear restriction.
$\beta_{21} + \beta_{31} = 0$ is another example of a linear restriction.

A generalization of the order condition is therefore: in a simultaneous equation model with M equations, an equation can be identified only if there are at least $M - 1$ linear restrictions on the parameters of the equation.

Predetermined variables

So far in this set of notes $x_t$ has been referred to as a vector of exogenous variables.

However, just as in the regression equation case, the interpretation of $x_t$ can be extended to predetermined variables.

An explanatory variable is predetermined if it is uncorrelated with the contemporaneous disturbances in $e_t'$ and with all future disturbances $e_{t+j}'$, $j > 0$.

In the same way as in the regression equation case, an important class of predetermined variables are the lags of the endogenous variables, i.e., $y_{t-j}$, $j > 0$.

Therefore, the identification conditions above that are expressed in terms of exogenous variables can be re-expressed in terms of predetermined variables.

Models with an expectations variable (measurement error)

In lecture 5 we had

$$y_i = \beta_1 + \beta_2 x_i^* + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{14}$$

with all the classical assumptions holding. $x_i^*$ is an expectations variable that we as econometricians cannot observe or cannot measure without error:

$$u_i = x_i - x_i^*. \tag{15}$$

If we try to estimate $\beta_2$ using the observable (actual) $x_i$ and OLS, that estimator is inconsistent.

As we now know, that by itself does not mean that $\beta_2$ is unidentified. There may be another estimator which is consistent, and if that is the case, $\beta_2$ is identified.

Identification in the measurement error model

The same substitution as in Lecture 5 gives

$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i - \beta_2 u_i, \tag{16}$$
$$x_i = x_i^* + u_i. \tag{17}$$

Regarded as a simultaneous equation model, (16) is identified since it omits one variable in the model, namely $x_i^*$.

But we also see that this is empty formalism, since the problem is that $x_i^*$ is unobservable (it is a latent exogenous variable).

If we introduce the idea of an instrumental variable, $z_i$, and require of $z_i$ that it is correlated with $x_i$ but uncorrelated with $\varepsilon_i$ and $u_i$, we can consider the possibility that the covariances between $y_i$ and $z_i$ and between $x_i$ and $z_i$ give a consistent estimator.

An extended measurement error model

Write the model with a (third) equation that brings $z_i$ in as an exogenous variable:

$$y_i = \beta_1 + \beta_2 x_i^* + \varepsilon_i$$
$$x_i^* = \delta_1 + \delta_2 z_i + \eta_i, \quad \delta_2 \neq 0$$
$$x_i = x_i^* + u_i$$

$\eta_i$ is a disturbance term with zero mean which is uncorrelated with $z_i$.

Writing the model in terms of the observables:

$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i - \beta_2 u_i, \tag{18}$$
$$x_i = \delta_1 + \delta_2 z_i + \eta_i + u_i. \tag{19}$$

Using the order condition we see that the first equation is identified.

If we can find a way of utilizing the exogeneity of $z_i$, we will also find that estimator to be consistent.

We investigate this by first writing (16) in deviation from means form, and then multiplying that equation by $z_i$:

$$(y_i - \bar{y}) z_i = \beta_2 (x_i - \bar{x}) z_i + (\varepsilon_i - \bar{\varepsilon}) z_i - \beta_2 (u_i - \bar{u}) z_i$$

$$\sum_i (y_i - \bar{y}) z_i = \beta_2 \sum_i (x_i - \bar{x}) z_i + \sum_i (\varepsilon_i - \bar{\varepsilon}) z_i - \beta_2 \sum_i (u_i - \bar{u}) z_i \tag{20}$$

By assumption we have

$$\operatorname{plim} \frac{1}{n} \sum_i (y_i - \bar{y}) z_i = \operatorname{Cov}(y, z) \neq 0, \qquad \operatorname{plim} \frac{1}{n} \sum_i (x_i - \bar{x}) z_i = \operatorname{Cov}(x, z) \neq 0,$$
$$\operatorname{plim} \frac{1}{n} \sum_i (\varepsilon_i - \bar{\varepsilon}) z_i = 0, \qquad \operatorname{plim} \frac{1}{n} \sum_i (u_i - \bar{u}) z_i = 0,$$

so

$$\beta_2 = \frac{\operatorname{Cov}(y, z)}{\operatorname{Cov}(x, z)}$$

implying that the Instrumental Variables estimator

$$\hat{\beta}_2^{IV} = \frac{\sum_i (y_i - \bar{y}) z_i}{\sum_i (x_i - \bar{x}) z_i}$$

is consistent, and $\beta_2$ is identified.

We can also present the argument “back to front” and start by suggesting

$$\hat{\beta}_2^{IV} = \frac{\sum_i (y_i - \bar{y}) z_i}{\sum_i (x_i - \bar{x}) z_i}$$

on the basis that it uses the exogenous variable $z_i$ instead of $x_i$ in the OLS estimator.

We then need to show that $\operatorname{plim} \hat{\beta}_2^{IV} = \beta_2$:

$$\hat{\beta}_2^{IV} = \frac{\frac{1}{n} \sum_i (y_i - \bar{y}) z_i}{\frac{1}{n} \sum_i (x_i - \bar{x}) z_i} = \frac{\frac{1}{n} \sum_i \left[ \beta_2 (x_i - \bar{x}) + (\varepsilon_i - \bar{\varepsilon}) - \beta_2 (u_i - \bar{u}) \right] z_i}{\frac{1}{n} \sum_i (x_i - \bar{x}) z_i}$$

By the exogeneity of $z_i$ we see that the “plims” in the previous slide apply, so that

$$\operatorname{plim} \hat{\beta}_2^{IV} = \beta_2 \frac{\operatorname{Cov}(x_i, z_i)}{\operatorname{Cov}(x_i, z_i)} = \beta_2.$$
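
A small simulation can make the consistency argument concrete. The sketch below rests on assumed values throughout: the intercept and slope of (14) are set to 1 and 2, the latent regressor is generated as the instrument plus noise, all disturbances are standard normal, and the function names, sample size and seed are illustrative choices rather than anything specified in the notes. It compares the OLS slope of y on the observed x with the instrumental variables ratio.

```python
import numpy as np


def simulate_measurement_error(n=200_000, beta1=1.0, beta2=2.0, seed=0):
    """Generate (y, x, z) from the extended measurement error model (assumed values)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)      # instrument
    eta = rng.normal(size=n)    # disturbance in the equation for the latent regressor
    u = rng.normal(size=n)      # measurement error
    eps = rng.normal(size=n)    # disturbance in (14)
    x_star = z + eta            # latent regressor, correlated with z
    x = x_star + u              # observed regressor
    y = beta1 + beta2 * x_star + eps
    return y, x, z


def ols_slope(y, x):
    """OLS slope of y on x; inconsistent here because x contains u."""
    xd = x - x.mean()
    return np.sum((y - y.mean()) * xd) / np.sum(xd * xd)


def iv_slope(y, x, z):
    """IV estimator: sum((y - ybar) z) / sum((x - xbar) z)."""
    return np.sum((y - y.mean()) * z) / np.sum((x - x.mean()) * z)


y, x, z = simulate_measurement_error()
print(f"OLS slope: {ols_slope(y, x):.3f}")
print(f"IV slope:  {iv_slope(y, x, z):.3f}")
```

Under these assumptions the variance of the latent regressor is 2 and that of the measurement error is 1, so the OLS slope should be attenuated towards roughly two thirds of the true slope, while the IV ratio should be close to the true slope.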