Exogeneity, causality and autonomy
Ragnar Nymoen
Department of Economics, UiO
17 April 2009
ECON 4610: Lecture 12

Overview
Many of the topics of this course “turn on” the concept of exogeneity:
- The choice of estimator: when to use OLS, when to use 2SLS or other instrumental variables methods.
- The relevant model for evaluating a shock (policy driven or extraneous) to the economy: derive multipliers from a single equation, or from a larger model.
- The identification issue.
In this lecture we review the econometric concepts of exogeneity, and also explain the relationship to two other important concepts in econometric methodology: causality and autonomy.
Syllabus: G Ch 4.1 B 2.2, 6N: A K Ch 10,11

“Classic” exogeneity concepts
Three different concepts are used in linear models with stochastic regressors. Consider
$$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t, \quad t = 1, 2, \ldots, T,$$
where we use time-series notation because that notation is most relevant when we extend the discussion to causality. Exogeneity of $x_t$ can refer to one of the following definitions:
1. $E[\varepsilon_1 \mid x_t] = E[\varepsilon_2 \mid x_t] = \ldots = E[\varepsilon_T \mid x_t] = 0$, see A3 in Table 2.1 in Greene.
2. $E[\varepsilon_t] = 0$ for $t = 1, 2, \ldots, T$ and $\mathrm{Cov}(\varepsilon_j, x_k) = 0$ for $j, k = 1, 2, \ldots, T$.
3. $E[\varepsilon_t] = 0$ for $t = 1, 2, \ldots, T$, and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_T)'$ stochastically independent of $x = (x_1, x_2, \ldots, x_T)'$.

Relationships between the classic concepts
#3 $\Rightarrow$ #1 $\Rightarrow$ #2.
The stochastic independence part of #3 is a strong assumption. Together with $E[\varepsilon_t] = 0$ it implies that $E[\varepsilon_s \mid x_t] = 0$ for all $s$ and $t$, and therefore definition #1.
Note also that independence works “both ways”: #3 also implies $E[x_s \mid \varepsilon_t] = E[x_s]$ for all $s$ and $t$ (which is zero when $x_t$ is measured as a deviation from its mean).
According to #1, $x_t$ is uncorrelated with all disturbances, past and future: the covariance part of definition #2 is implied by #1. Moreover, $E[\varepsilon_t] = 0$ is implied by the law of iterated expectations.
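The step from definition #1 to definition #2 can be written out in two lines. This is only a sketch of the standard iterated-expectations argument, for arbitrary $j$ and $k$ and using the symbols already defined:

```latex
\begin{align*}
E[\varepsilon_j] &= E\big[\,E[\varepsilon_j \mid x_k]\,\big] = 0, \\
\operatorname{Cov}(\varepsilon_j, x_k)
  &= E[\varepsilon_j x_k] - E[\varepsilon_j]\,E[x_k]
   = E\big[\,x_k\,E[\varepsilon_j \mid x_k]\,\big] - 0 = 0 .
\end{align*}
```

The converse does not hold in general: zero covariance only rules out a linear relationship between $\varepsilon_j$ and $x_k$, while a zero conditional mean rules out any dependence of the mean of $\varepsilon_j$ on $x_k$.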
Most textbooks choose #1, but clearly these concepts are closely related in practice: heuristically, the common assumption is about “unrelatedness” between the explanatory variable and the disturbances.

An extension of the classic concepts: pre-determinedness
When $x_t$ in
$$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t, \quad t = 1, 2, \ldots, T,$$
is uncorrelated with $\varepsilon_t, \varepsilon_{t+1}, \ldots$, i.e. the current and future disturbances, but not with the past disturbances $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, we say that the explanatory variable $x_t$ is pre-determined.
In the discussion of identification, there is no difference between pre-determined variables and exogenous variables.
The difference between exogeneity concept #1 and pre-determinedness has to do with estimation properties. OLS now has a finite sample bias: remember that
$$b_2 = \beta_2 + \frac{\sum_t (x_t - \bar{x})\,\varepsilon_t}{\sum_t (x_t - \bar{x})^2}$$
and that the expectation of the bias term cannot be shown to be zero.

Pre-determinedness, small sample bias and consistency
The classic case of pre-determinedness is when the explanatory variable is $y_{t-1}$ (or higher order lags). Then $y_{t-1}$ is necessarily correlated with the past disturbances:
$$y_{t-1} = \beta_1 (1 + \beta_2 + \beta_2^2 + \ldots) + \varepsilon_{t-1} + \beta_2 \varepsilon_{t-2} + \beta_2^2 \varepsilon_{t-3} + \ldots$$
but not with the current and future disturbances, assuming no autocorrelation among the disturbances.
Then the OLS estimator $b_2$ is consistent (the plim of the numerator of the bias term is then zero).
What drives the consistency result is that, when there is no autocorrelation, each new observation of $y_{t-1}$ contains some new and unique information. Asymptotically this dominates and drives the bias term towards zero.

The lagged regressor case: how large is large?
For the model
$$y_t = \beta_2 y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim N(0, 1), \quad t = 1, \ldots, T,$$
the following has been established in the literature:

Function        Asymptotic      Finite sample
$E[b_2]$        $\beta_2$       $\beta_2 - 2\beta_2/(T-1)$    (1)
$Var[b_2]$      $0$             $(1 - \beta_2^2)/T$

Equation (1) can be used to assess the size of the bias, $-2\beta_2/(T-1)$:

$\beta_2$    Sample       Bias
0.5          T = 51       -0.02
0.5          T = 101      -0.01
0.9          T = 51       -0.036
0.9          T = 101      -0.018

Use of the classic exogeneity definitions
The main use is with regard to the “limitation of OLS”. Exogeneity in the meaning of definition #1 is violated in the presence of
1. Measurement error in the explanatory variable.
2. Simultaneity.
3. Lagged regressor with autocorrelated disturbances:
$$y_t = \beta_2 y_{t-1} + \varepsilon_t, \qquad \varepsilon_t = \rho\,\varepsilon_{t-1} + \eta_t.$$
It has been established in the literature that
$$\mathrm{plim}\; b_2 = \frac{\beta_2 + \rho}{1 + \rho\,\beta_2} \neq \beta_2 .$$

Response to “lack of exogeneity”
1. IV/2SLS estimation.
2. Specification of a simultaneous equation model and IV/2SLS estimation, as when the recursive system of Lecture 7 is replaced by a simultaneous equations model.
3. Estimate by GLS, or re-specify the model by including relevant explanatory variables to obtain a model without residual autocorrelation.

Weak, strong and super exogeneity
There is another group of exogeneity concepts that relate the exogeneity status of an explanatory variable to the parameters of interest, see definition iv) in Biorn’s note.
Consider two variables $x_t$ and $y_t$. In Lecture 11 we defined the likelihood function for a single logistically distributed variable. The likelihood function extends to the bivariate and multivariate case, and to other distributions, the normal distribution in particular.
Let $\theta$ and $\lambda$ denote the parameters of the conditional PDF of $y_t$ given $x_t$, and of the marginal PDF of $x_t$, respectively, so that the likelihood function can be decomposed as
$$L_{x_1,\ldots,x_T,\,y_1,\ldots,y_T}(\theta, \lambda) = L_{y_1,\ldots,y_T \mid x_1,\ldots,x_T}(\theta)\; L_{x_1,\ldots,x_T}(\lambda).$$
In general:
$$\max_{\theta,\lambda} L_{x_1,\ldots,x_T,\,y_1,\ldots,y_T}(\theta, \lambda) \neq \max_{\theta} L_{y_1,\ldots,y_T \mid x_1,\ldots,x_T}(\theta)\,\Big\{ \max_{\lambda} L_{x_1,\ldots,x_T}(\lambda) \Big\}.$$

Weak exogeneity
In some important situations we will have the equality
$$\max_{\theta,\lambda} L_{x_1,\ldots,x_T,\,y_1,\ldots,y_T}(\theta, \lambda) = \max_{\theta} L_{y_1,\ldots,y_T \mid x_1,\ldots,x_T}(\theta)\,\Big\{ \max_{\lambda} L_{x_1,\ldots,x_T}(\lambda) \Big\},$$
saying that the maximum likelihood estimators for $\theta$ based on the joint likelihood and on the conditional likelihood are identical. In this case $x_t$ is weakly exogenous for $\theta$.
An important requirement for weak exogeneity is that $\theta$ and $\lambda$ are “variation free”. This means that the two parameters (vectors) can vary freely within their respective logically admissible “spaces”. Cross restrictions between $\theta$ and $\lambda$ are an example of how variation freeness may be invalidated.
Another important requirement, for weak exogeneity to be a relevant concept, is that $\theta$ contains the parameters of interest, for understanding economic behaviour or for forecasting economic variables.

The bivariate normal model
In Lecture 1, we established that when the joint PDFs of $\{x_t, y_t\}$ are normal and independent across $t$, we have
$$y_t = E[y_t \mid x_t] + \varepsilon_t = \beta_1 + \beta_2 x_t + \varepsilon_t \qquad (2)$$
$$x_t = \mu_x + \varepsilon_{x,t} \qquad (3)$$
where $\varepsilon_t \sim N(0, \sigma^2)$ with $\sigma^2 = \sigma_y^2 (1 - \rho_{xy}^2)$, $\varepsilon_{x,t} \sim N(0, \sigma_x^2)$, $\rho_{xy} = \sigma_{xy} / (\sigma_x \sigma_y)$, and
$$\beta_1 = \mu_y - \beta_2 \mu_x = \mu_y - \frac{\sigma_{xy}}{\sigma_x^2}\,\mu_x \qquad (4)$$
$$\beta_2 = \rho_{xy}\,\frac{\sigma_y}{\sigma_x} = \frac{\sigma_{xy}}{\sigma_x^2} \qquad (5)$$

Exogeneity in the binormal model
$x_t$ is exogenous according to definition #1 above:
$$E[\varepsilon_t \mid x_t] = E[y_t \mid x_t] - E\big[E[y_t \mid x_t] \mid x_t\big] = E[y_t \mid x_t] - E[y_t \mid x_t] = 0,$$
and note that it is not uncommon to refer to this as strict exogeneity.
This holds by construction, since $\varepsilon_t$ only contains the part of $y_t$ that is unexplained by $x_t$. For the same reason, $\varepsilon_t$ and $x_t$ in (2) and (3) are also uncorrelated:
$$E[\varepsilon_t x_t] = E[(y_t - \beta_1 - \beta_2 x_t)\,x_t] = E[(y_t - \mu_y)\,x_t] - \beta_2\,E[(x_t - \mu_x)\,x_t] = \sigma_{xy} - \frac{\sigma_{xy}}{\sigma_x^2}\,\sigma_x^2 = 0.$$
This suggests that in terms of maximum likelihood estimation, which involves minimization of the sum of squared residuals, nothing is lost by maximizing the conditional and marginal likelihoods separately.

Exogeneity in the binormal model, cont’d
Hence we are in the situation that
$$\max_{\theta,\lambda} L_{x_1,\ldots,x_T,\,y_1,\ldots,y_T}(\theta, \lambda) = \max_{\theta} L_{y_1,\ldots,y_T \mid x_1,\ldots,x_T}(\theta)\,\Big\{ \max_{\lambda} L_{x_1,\ldots,x_T}(\lambda) \Big\},$$
meaning that when the parameters of interest are contained in $\theta$, $x_t$ is weakly exogenous for those parameters.
In the binormal case, there is only a subtle difference between “classic” exogeneity and weak exogeneity. But weak exogeneity brings out that in econometrics a variable is not exogenous in itself, but relative to a statistical model and relative to the parameters of interest. This turns out to be a big advantage in more complicated modelling settings.
The discussion can be extended to a dynamic setting: we can then retrieve the assumption that we started with above, independent $x_t$s and $y_t$s, by first conditioning on the lagged $x$s and $y$s.

Strong exogeneity and Granger non-causality
Consider again the bivariate case. $x_t$ is strongly exogenous if it is
1. Weakly exogenous, and
2. not Granger-caused by $y_t$.
Granger causality can be discussed with reference to the following dynamic equation (first order dynamics for simplicity):
$$x_t = \phi_{10} + \phi_{11} x_{t-1} + \phi_{12} y_{t-1} + \varepsilon_{x,t}$$
Granger causality means $\phi_{12} \neq 0$, so strong exogeneity requires $\phi_{12} = 0$ (Granger non-causality).
Strong exogeneity is required for valid forecasting from the conditional model
$$y_t = E[y_t \mid x_t, x_{t-1}, y_{t-1}] + \varepsilon_t,$$
i.e. if we make forecasts for $y$ on the false assumption that there is no feed-back effect of $y$ on $x$, the forecasts will not be optimal and the prediction intervals will be misleading.

Invariance (to structural changes)
The parameters (of interest) in the conditional model are invariant if they are unaffected by a change in the parameters of the marginal model.
The binormal model is again a useful reference. For example, $\beta_2$ is invariant if a change in $\mu_x$ or $\sigma_x$ does not change $\beta_2$.
Since $\beta_2 = \rho_{xy}\,\sigma_y / \sigma_x$, $\beta_2$ is invariant to changes in $\mu_x$ (a change in the level of the explanatory variable).
With respect to a change in the standard deviation of $x$, i.e.
$\sigma_x$, $\beta_2$ may or may not be invariant, depending on how $\sigma_y$ is affected by the structural change to $\sigma_x$.
Invariance is important for the validity of policy analysis based on a regression model (the structural change is then brought about by a change in a policy instrument, or in legislation).

The Lucas critique
Consider a bivariate model (without intercept for simplicity)
$$y_t = \beta_2 x_t^e + \varepsilon_t \qquad (6)$$
$$x_t = \phi_{11} x_{t-1} + \varepsilon_{x,t} \qquad (7)$$
where $x_t^e$ denotes expectations. With expectations formed in accordance with (7), i.e. $x_t^e = \phi_{11} x_{t-1}$, it is straightforward to show that if we regress $y_t$ on $x_t$ we obtain
$$\mathrm{plim}\; b_2 = \beta_2\,\phi_{11}^2,$$
so $b_2$ is biased for $\beta_2$ and not invariant to changes in the marginal model for $x_t$.
This is the famous Lucas critique: policy analysis based on conditional (regression) models is not valid when there are changes in the expectations formation process, represented by (7) above.

Super exogeneity
An explanatory variable $x_t$ is super exogenous if
1. it is weakly exogenous, and
2. the parameters (of interest) of the conditional model are invariant to (structural) changes in the marginal model of $x_t$.
As mentioned, super exogeneity secures the validity of policy analysis with a conditional model. The parameters of interest are then the slope coefficients (the derivatives or elasticities).
In a wider interpretation, we seek econometric models that are invariant to a wide range of potential structural breaks.
Haavelmo (1944) coined the term autonomous relationships, and pointed to the dangers of not paying enough attention to potential sources of structural breaks and lack of invariance. Hence the Lucas critique is a special case of the Haavelmo critique.

Testing exogeneity
Weak exogeneity: the Wu-Hausman test of Lecture 8.
Strong exogeneity: specify an econometric equation for the marginal model of $x_t$ and test the significance of $y_{t-1}$ (or higher order lags).
Invariance: if structural breaks in the marginal model for $x_t$ can be established (use e.g. the tests in Lecture 2), and the dummies that represent these breaks are insignificant in the conditional model, then the Lucas/Haavelmo critique does not apply, and invariance with respect to these breaks is maintained.

Testing exogeneity via inverted regressions
Under the property of super exogeneity, the results for a regression model are not invariant to re-normalization.
From OLS algebra, we have
$$b_2 \, b_2^{*} = r_{yx}^2$$
where $b_2$ is the OLS estimate of the slope coefficient when $y_t$ is the dependent variable, and $b_2^{*}$ is the estimate when $x_t$ is the dependent variable (the inverse regression).
If there are structural changes in the sample period, $r_{yx}^2$ will have changed. Consequently, $b_2$ and $b_2^{*}$ cannot both be stable over the sample period, but either one of them can. Investigate this by recursive estimation of both models.
This extends to any number of variables, with $r_{yx}^2$ then interpreted as a partial correlation coefficient.
Example: conventional Phillips curve versus Lucas’ supply curve.

A typical wage Phillips curve:
$$\Delta w_t = \beta_1 + \beta_2 u_t + \beta_3 \Delta p_{t-1} + \ldots + \varepsilon_t$$
where $w_t$ is the log of the wage rate, $u_t$ is the log of the rate of unemployment and $p_t$ is the log of the price level.
Lucas’s supply curve entails a negative relationship between $u_t$ and inflation (as measured by e.g. $\Delta p_t$ and $\Delta w_t$), which is due to short-term misperceptions of changes in relative prices.
It implies that the inversion of the conventional Phillips curve may be more stable than the conventional Phillips curve itself.
We investigate this for Norwegian data.
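The OLS identity behind the inverted-regression argument, and the recursive estimation it calls for, can be checked on artificial data. The following is a minimal sketch, not the computation behind the lecture's results: the helper names `ols_slope` and `recursive_slopes`, the seed, the sample size and the parameter values are all hypothetical, and the data are simulated rather than Norwegian.

```python
import numpy as np

def ols_slope(dep, reg):
    """OLS slope when dep is regressed on reg and an intercept."""
    reg_c = reg - reg.mean()
    dep_c = dep - dep.mean()
    return float(reg_c @ dep_c / (reg_c @ reg_c))

def recursive_slopes(dep, reg, first=20):
    """Slope re-estimated on the expanding samples 1..first, ..., 1..T."""
    return np.array([ols_slope(dep[:t], reg[:t])
                     for t in range(first, len(dep) + 1)])

rng = np.random.default_rng(0)            # hypothetical seed
T = 200
x = rng.normal(size=T)
y = 1.0 + 0.5 * x + rng.normal(size=T)    # hypothetical parameter values

b2 = ols_slope(y, x)        # y_t is the dependent variable
b2_inv = ols_slope(x, y)    # inverse regression: x_t is the dependent variable
r2 = float(np.corrcoef(y, x)[0, 1] ** 2)

print(b2 * b2_inv, r2)      # the two numbers agree up to rounding error

# Recursive estimates of both normalizations, ready to be plotted against time.
rec_direct = recursive_slopes(y, x)
rec_inverse = recursive_slopes(x, y)
```

Since the product of the two slopes always equals the squared correlation, a shift in the correlation during the sample has to show up in at least one of the two recursive sequences.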
Stability of a Phillips curve for Norway
[Figure: recursive coefficient estimates ($\beta$ with $\pm 2\sigma$ bands), 1980 to 2000. Panel labels as extracted: Intercept, $\Delta p_{t-1}$, $\Delta q_t$, $tu_t$, $IP_t$. Further panels: 1-step residuals with $\pm 2$ se bands, 1 period ahead Chow statistics and break point Chow statistics, each shown against the 1% critical value.]

Instability of the inverted Phillips curve
[Figure: recursive coefficient estimates ($\beta$ with $\pm 2\sigma$ bands), 1980 to 2000. Panel labels as extracted: Intercept, $\Delta p_{t-1}$, $\Delta q_t$, $\Delta q_{t-1}$, $\Delta wc_t - \Delta p_{t-1}$, $IP_t$. Further panels: 1-step residuals with $\pm 2$ se bands, 1 period ahead Chow statistics and break point Chow statistics, each shown against the 1% critical value.]
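The kind of coefficient instability that recursive plots are meant to reveal can be reproduced with the Lucas critique model from earlier in the lecture. The sketch below simulates that model under the assumption that expectations are formed from the marginal model, so that the expected value of $x_t$ is the autoregressive coefficient times $x_{t-1}$. The function name `simulate_slope`, the seed, the sample size and the parameter values are all hypothetical choices for illustration.

```python
import numpy as np

def simulate_slope(beta2, phi11, T, rng):
    """Simulate the expectations model and return the OLS slope of y_t on x_t.

    Assumption (labelled, not from the slides' data): x_t^e = phi11 * x_{t-1}.
    """
    x = np.zeros(T + 1)
    shocks = rng.normal(size=T + 1)
    for t in range(1, T + 1):
        x[t] = phi11 * x[t - 1] + shocks[t]    # marginal model for x_t
    x_exp = phi11 * x[:-1]                     # expectation of x_t formed at t-1
    y = beta2 * x_exp + rng.normal(size=T)     # conditional model in expectations
    x_now = x[1:]
    return float(x_now @ y / (x_now @ x_now))  # regression without intercept

rng = np.random.default_rng(1)                 # hypothetical seed
beta2 = 1.0                                    # hypothetical structural parameter
b_before = simulate_slope(beta2, 0.9, 200_000, rng)   # policy regime 1
b_after = simulate_slope(beta2, 0.5, 200_000, rng)    # policy regime 2

# Compare each estimate with the theoretical limit beta2 * phi11**2.
print(b_before, beta2 * 0.9 ** 2)
print(b_after, beta2 * 0.5 ** 2)
```

The structural parameter is the same in both regimes, yet the regression slope moves with the autoregressive coefficient of the marginal model, which is the lack of invariance that the Lucas critique points to.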