Stat 521 Notes 14
Models for Panel Data that Violate the Strict Exogeneity Assumption
Reading: Wooldridge, Chapter 11.1

I. Review of Fixed Effects Model

$$Y_{it} = X_{it}'\beta + c_i + \varepsilon_{it} \qquad (1.1)$$

This is a structural/causal equation in the sense that $\beta$ is supposed to represent the causal effect of changes in $X_{it}$ holding everything else fixed, and $v_{it} = c_i + \varepsilon_{it}$ represents the effect of all omitted variables on the outcome. $v_{it}$ can be correlated with $X_{it}$. The key assumption of strict exogeneity in the fixed effects model is that the correlation between $v_{it}$ and $X_{it}$ arises only through the time-invariant part $c_i$ of $v_{it}$.

Assumptions of Fixed Effects Model:

1. Assumption 1 (Independence Between Units): The vectors of individual outcomes $(Y_{i1}, \ldots, Y_{iT})$ and $(Y_{j1}, \ldots, Y_{jT})$ are independent for $i \neq j$.

2. Assumption 2 (Strict Exogeneity): $E(\varepsilon_{it} \mid X_{i1}, \ldots, X_{iT}, c_i) = 0$. $\varepsilon_{it}$ can be thought of as a time-varying shock that is independent of the unobserved individual characteristic $c_i$ and the observed characteristics $X_{it}$.

2A. Assumption 2A: The $\varepsilon_{it}$'s are independent and identically distributed with constant variance $\sigma^2$. This assumption is made in the usual fixed effects inferences but can be relaxed by using robust standard errors.

Fixed Effects Model Estimation: Let $Z_j$ be a dummy variable for unit $j$, i.e.,

$$Z_{j,it} = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases}$$

Then we can write model (1.1) as

$$Y_{it} = X_{it}'\beta + c_1 Z_{1,it} + \cdots + c_N Z_{N,it} + \varepsilon_{it}$$

We can estimate $\beta, c_1, \ldots, c_N$ by least squares regression of $Y_{it}$ on $X_{it}, Z_{1,it}, \ldots, Z_{N,it}$, pooling together the data for $i = 1, \ldots, N$, $t = 1, \ldots, T$.

Example: Papke (1994, Journal of Public Economics, "Tax Policy and Urban Development") studies the effect of urban enterprise zones (EZs) on economic outcomes such as unemployment claims. Urban enterprise zones encourage development in blighted neighborhoods by offering entrepreneurs and investors tax and regulatory relief if they start businesses in the area. Papke considers a panel of 22 areas in Indiana from 1980 to 1988. Indiana's EZ program began in 1983.
To qualify for consideration to be an EZ, an area must have an unemployment rate at least 1.5 times the average statewide unemployment rate, and a resident household poverty rate at least 25 percent above the U.S. poverty level. Areas in the panel became EZs at different time points and some areas did not become EZs at all. One model Papke uses is a fixed effects model:

$$Y_{it} = \theta_t + \beta\, EZ_{it} + c_i + \varepsilon_{it}$$

where $Y_{it}$ is the log of unemployment claims in area $i$ in year $t$, $\theta_t$ is a year effect (the coefficient on a dummy variable for year $t$) and $EZ_{it}$ is a dummy variable for whether the $i$th area was an EZ in year $t$. The fixed effects $c_i$ take account of permanent differences across areas such as industrial composition and composition of the labor force.

    ezdata=read.table("ezunem.raw",header=TRUE,sep=",");
    area=ezdata$city;
    year=ezdata$year;
    luclms=ezdata$luclms; # Log of Unemployment Claims
    lag_luclms=ezdata$lag_luclms; # Log of Unemployment Claims in previous year in area
    ez=ezdata$ez; # Whether area is an urban enterprise zone in given year

    # Fixed Effects Model
    femodel=lm(luclms~ez+as.factor(year)+as.factor(area));
    summary(femodel)

    Call:
    lm(formula = luclms ~ ez + as.factor(year) + as.factor(area))

    Residuals:
         Min       1Q   Median       3Q      Max
    -0.57618 -0.10837 -0.00977  0.11364  0.49623

    Coefficients:
                        Estimate Std. Error t value Pr(>|t|)
    (Intercept)         11.67615    0.08008 145.808  < 2e-16 ***
    ez                  -0.10441    0.05542  -1.884 0.061291 .
    as.factor(year)1981 -0.32163    0.06046  -5.320 3.30e-07 ***
    as.factor(year)1982  0.13550    0.06046   2.241 0.026332 *
    as.factor(year)1983 -0.21926    0.06046  -3.627 0.000381 ***
    as.factor(year)1984 -0.57915    0.06232  -9.294  < 2e-16 ***
    as.factor(year)1985 -0.59179    0.06550  -9.036 3.92e-16 ***
    as.factor(year)1986 -0.62126    0.06550  -9.486  < 2e-16 ***
    as.factor(year)1987 -0.88895    0.06550 -13.573  < 2e-16 ***
    as.factor(year)1988 -1.22763    0.06550 -18.744  < 2e-16 ***
    as.factor(area)2    -0.19348    0.09941  -1.946 0.053296 .
    as.factor(area)3    -0.37894    0.09941  -3.812 0.000194 ***
    as.factor(area)4    -0.54117    0.09941  -5.444 1.83e-07 ***
    as.factor(area)5     0.01103    0.09472   0.116 0.907407
    as.factor(area)6     0.55458    0.09452   5.867 2.32e-08 ***
    as.factor(area)7     0.75007    0.09452   7.935 2.90e-13 ***
    as.factor(area)8    -0.05876    0.09472  -0.620 0.535900
    as.factor(area)9     0.35343    0.09472   3.731 0.000261 ***
    as.factor(area)10    1.64501    0.09941  16.548  < 2e-16 ***
    as.factor(area)11   -0.13032    0.09941  -1.311 0.191695
    as.factor(area)12   -0.03498    0.09941  -0.352 0.725392
    as.factor(area)13   -0.83257    0.09941  -8.375 2.15e-14 ***
    as.factor(area)14   -0.87363    0.09472  -9.223  < 2e-16 ***
    as.factor(area)15   -0.23542    0.09941  -2.368 0.019020 *
    as.factor(area)16    0.43574    0.09941   4.383 2.06e-05 ***
    as.factor(area)17   -0.44522    0.09452  -4.710 5.18e-06 ***
    as.factor(area)18   -0.04289    0.09941  -0.431 0.666694
    as.factor(area)19    0.09341    0.09941   0.940 0.348764
    as.factor(area)20   -0.35098    0.09452  -3.713 0.000279 ***
    as.factor(area)21    0.45779    0.09452   4.843 2.90e-06 ***
    as.factor(area)22    0.21864    0.09941   2.199 0.029225 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.2005 on 167 degrees of freedom
    Multiple R-squared: 0.9332, Adjusted R-squared: 0.9212
    F-statistic: 77.75 on 30 and 167 DF,  p-value: < 2.2e-16

There is suggestive but inconclusive evidence that being designated an enterprise zone reduces log unemployment claims (p-value = 0.06); the estimated effect is that being designated an enterprise zone reduces unemployment claims by about 10%.

Comparison of fixed effects vs. random effects model: Hausman test.

Note: I did not describe the Hausman test entirely correctly in Notes 13. We should only consider coefficients on variables that are time-varying, i.e., we should not consider the intercept, the coefficients on the fixed effects or the coefficients on any other time-constant variables. Let $\hat\beta_{FE}$, $\hat\beta_{RE}$ denote the fixed effects and random effects estimates of the coefficients on the time-varying variables.
The Hausman test statistic is

$$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[ \widehat{Var}(\hat\beta_{FE}) - \widehat{Var}(\hat\beta_{RE}) \right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$$

Under the null hypothesis that the effects $c_i$ are uncorrelated with the time-varying variables that are included in the model, $H$ is distributed as $\chi^2_K$, where $K$ is the dimension of $\beta$.

    # Hausman test of fixed effects model vs. random effects model
    # Coefficients 2:10 are ez and the eight year dummies (the time-varying variables)
    betahat.femodel=coef(femodel)[2:10];
    vcov.femodel=vcov(femodel)[2:10,2:10];
    # Random Effects Estimates and Covariance Matrix using the lme4 package
    library(lme4);
    remodel=lmer(luclms~ez+as.factor(year)+(1|area));
    betahat.remodel=fixef(remodel)[2:10];
    vcov.remodel=vcov(remodel)[2:10,2:10];
    # Hausman test statistic
    h=matrix(betahat.femodel-betahat.remodel,nrow=1)%*%solve(vcov.femodel-vcov.remodel)%*%matrix(betahat.femodel-betahat.remodel,ncol=1);
    # Compute p-value
    pval=1-pchisq(as.numeric(h),length(betahat.femodel));

    > h
    1 x 1 Matrix of class "dgeMatrix"
               [,1]
    [1,] 0.07975095
    > pval
    [1] 1

Here, there is no evidence against the null hypothesis of uncorrelated effects under the maintained hypothesis of strict exogeneity.

II. Models for When Strict Exogeneity Fails

The time-varying disturbance $\varepsilon_{it}$ captures shocks to an area that do not represent permanent characteristics of the area, e.g., closure of a business in an area. The closure of a business in year $t-1$ is likely to still affect employment at year $t$, and thus $Y_{i,t-1}$ is likely to be correlated with $\varepsilon_{it}$. Furthermore, because the designation of an area as an enterprise zone depends on previous unemployment, $Y_{i,t-1}$ is likely to be correlated with $EZ_{it}$. This means strict exogeneity fails: $\varepsilon_{it}$ is correlated with $EZ_{it}$ through their mutual correlation with $Y_{i,t-1}$.

We can address this problem by adding $Y_{i,t-1}$ to the model:

$$Y_{it} = \theta_t + \beta\, EZ_{it} + \rho\, Y_{i,t-1} + c_i + \varepsilon_{it} \qquad (1.2)$$

Here $X_{it} = (\text{time } t \text{ dummy}, EZ_{it}, Y_{i,t-1})$. It is not plausible that $X_{it}$ is strictly exogenous in (1.2), because this would mean $E(\varepsilon_{it} \mid EZ_{i1}, \ldots, EZ_{iT}, Y_{i0}, \ldots, Y_{i,T-1}) = 0$, but $\varepsilon_{it}$ is correlated with $Y_{it}$, which belongs to the conditioning set whenever $t \le T-1$.
However, the following sequential moment restriction is plausible:

$$E(\varepsilon_{it} \mid X_{it}, \ldots, X_{i1}, c_i) = 0, \quad t = 1, \ldots, T \qquad (1.3)$$

When assumption (1.3) holds, we say that the $X_{it}$ are sequentially exogenous conditional on the unobserved effect. Given model (1.2), assumption (1.3) is equivalent to

$$E(Y_{it} \mid X_{it}, \ldots, X_{i1}, c_i) = E(Y_{it} \mid X_{it}, c_i) = X_{it}'\beta + c_i.$$

Suppose $Y_{i,t-1}$ is part of $X_{it}$, so we can write $X_{it} = (Z_{it}, Y_{i,t-1})$. Sequential exogeneity implies that after $X_{it}$ and $c_i$ have been controlled for, no past values of $X_{it}$ affect $Y_{it}$. Strict exogeneity requires that after $X_{it}$ and $c_i$ have been controlled for, no values $X_{it'}$, $t' \neq t$, past or future, affect $Y_{it}$.

When sequential exogeneity holds but not strict exogeneity, the fixed effects estimator is inconsistent. Generally (in the moment expressions below, $X_{it}$ is treated as a $1 \times K$ row vector, as in Wooldridge),

$$\operatorname{plim}(\hat\beta_{FE}) = \beta + \left[ T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \ddot X_{it}) \right]^{-1} \left[ T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \varepsilon_{it}) \right],$$

where $\ddot X_{it} = X_{it} - \bar X_i$. Under sequential exogeneity, $E(\ddot X_{it}' \varepsilon_{it}) = E[(X_{it} - \bar X_i)' \varepsilon_{it}] = -E[\bar X_i' \varepsilon_{it}]$ because $E[X_{it}' \varepsilon_{it}] = 0$, and so

$$T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \varepsilon_{it}) = -T^{-1} \sum_{t=1}^{T} E(\bar X_i' \varepsilon_{it}) = -E(\bar X_i' \bar\varepsilon_i).$$

When $X_{it}$ includes $Y_{i,t-1}$, then $T^{-1} \sum_{t=1}^{T} Y_{i,t-1}$ is correlated with $\bar\varepsilon_i$, meaning $E(\bar X_i' \bar\varepsilon_i)$ will not be zero and the fixed effects estimator will be inconsistent for fixed $T$.

We can obtain a consistent estimator of $(\beta, \rho)$ in (1.2) under sequential exogeneity as follows. First, we take first differences to eliminate the $c_i$:

$$Y_{it} - Y_{i,t-1} = \eta_t + \beta (EZ_{it} - EZ_{i,t-1}) + \rho (Y_{i,t-1} - Y_{i,t-2}) + \varepsilon_{it} - \varepsilon_{i,t-1}, \quad t = 2, \ldots, T,$$

where $\eta_t = \theta_t - \theta_{t-1}$. We cannot estimate $(\beta, \rho)$ in the above equation by least squares since $\varepsilon_{it} - \varepsilon_{i,t-1}$ is correlated with $Y_{i,t-1} - Y_{i,t-2}$ (both contain $\varepsilon_{i,t-1}$). But we can use $Y_{i,t-2}$ as an instrumental variable for $Y_{i,t-1} - Y_{i,t-2}$, since $\varepsilon_{it} - \varepsilon_{i,t-1}$ is uncorrelated with $Y_{i,t-2}$ under sequential exogeneity. In other words, we have

$$E^*[Y_{it} - Y_{i,t-1} \mid EZ_{it} - EZ_{i,t-1}, Y_{i,t-2}, t] = \eta_t + \beta (EZ_{it} - EZ_{i,t-1}) + \rho\, E^*(Y_{i,t-1} - Y_{i,t-2} \mid EZ_{it} - EZ_{i,t-1}, Y_{i,t-2}, t), \quad t = 3, \ldots, T,$$

where $E^*$ denotes best linear expectation. We can estimate $(\beta, \rho)$ by two stage least squares:

1. Regress $Y_{i,t-1} - Y_{i,t-2}$ on $EZ_{it} - EZ_{i,t-1}$, $Y_{i,t-2}$ and time dummies by least squares to find $\hat E^*(Y_{i,t-1} - Y_{i,t-2} \mid Y_{i,t-2}, EZ_{it} - EZ_{i,t-1}, t)$.

2. Regress $Y_{it} - Y_{i,t-1}$ on $EZ_{it} - EZ_{i,t-1}$, $\hat E^*(Y_{i,t-1} - Y_{i,t-2} \mid Y_{i,t-2}, EZ_{it} - EZ_{i,t-1}, t)$ and time dummies to estimate $(\beta, \rho)$.

    # Calculate lagged EZ values
    lagez=rep(NA,length(ez));
    for(i in 1:22){
      for(t in 1981:1988){
        lagez[(area==i & year==t)]=ez[(area==i & year==t-1)];
      }
    }
    # Calculate second lag of Y
    second_lag_y=rep(NA,length(luclms));
    for(i in 1:22){
      for(t in 1982:1988){
        second_lag_y[(area==i & year==t)]=luclms[(area==i & year==t-2)];
      }
    }
    # Calculate lag Y - second lag of Y
    first_diff_lag_y=lag_luclms-second_lag_y;
    # Calculate EZ minus lagged EZ
    first_diff_ez=ez-lagez;
    # Subset of observations where we have a second lag of y
    subset=!(is.na(second_lag_y));
    n=sum(subset); # Number of observations used in both stages
    # First stage regression
    fsreg=lm(first_diff_lag_y[subset]~second_lag_y[subset]+first_diff_ez[subset]+as.factor(year[subset]));
    first_diff_lag_y_hat=predict(fsreg);
    # Second stage regression
    first_diff_y=luclms-lag_luclms;
    ssreg=lm(first_diff_y[subset]~first_diff_ez[subset]+first_diff_lag_y_hat+as.factor(year[subset]),x=TRUE);
    # Calculate correct standard errors for two stage least squares
    modelmat.ssreg=ssreg$x;
    # Modify the model matrix so that actual first_diff_lag_y replaces
    # first_diff_lag_y_hat (column 3, after the intercept and first_diff_ez)
    mod.modelmat=modelmat.ssreg;
    mod.modelmat[,3]=first_diff_lag_y[subset];
    # Calculate sigmahatusq where u is error in structural equation
    sigmahatusq=(1/n)*sum((first_diff_y[subset]-mod.modelmat%*%matrix(coef(ssreg),ncol=1))^2);
    # Calculate variance of residuals in second stage regression (sigmahatesq)
    sigmahatesq=deviance(ssreg)/n;
    # Variance is variance from second stage regression times sigmahatusq/sigmahatesq
    tsls.var=vcov(ssreg)*(sigmahatusq/sigmahatesq);
    # CI for beta (effect of EZ)
    betahat=coef(ssreg)[2];
    se.betahat=sqrt(tsls.var[2,2]);
    lci.betahat=betahat-1.96*se.betahat;
    uci.betahat=betahat+1.96*se.betahat;
    # CI for rho (effect of lagged y)
    rhohat=coef(ssreg)[3];
    se.rhohat=sqrt(tsls.var[3,3]);
    lci.rhohat=rhohat-1.96*se.rhohat;
    uci.rhohat=rhohat+1.96*se.rhohat;

    > betahat
    first_diff_ez[subset]
               -0.2613225
    > lci.betahat
    first_diff_ez[subset]
                -0.575849
    > uci.betahat
    first_diff_ez[subset]
               0.05320405
    > rhohat
    first_diff_lag_y_hat
               0.3553252
    > lci.rhohat
    first_diff_lag_y_hat
              -0.8193982
    > uci.rhohat
    first_diff_lag_y_hat
                1.530049

The point estimate for $\beta$ is -0.26, which would mean that being designated an enterprise zone reduces unemployment claims by approximately 26% (about 23% once the log coefficient is exponentiated), but the CI for $\beta$ is pretty wide, (-0.58, 0.05), and does contain 0; there is not strong evidence that enterprise zones reduce unemployment claims under this model.

We will study a way to improve the efficiency of our estimate of $\beta$ in the next class.

More discussion of strict exogeneity vs. sequential exogeneity assumption

The strict exogeneity assumption $E(\varepsilon_{it} \mid X_{iT}, \ldots, X_{i,t+1}, X_{it}, \ldots, X_{i1}) = 0$ implies that there is no feedback between lagged dependent variables and future values of the explanatory variable. The sequential exogeneity assumption $E(\varepsilon_{it} \mid X_{it}, \ldots, X_{i1}) = 0$ implies that current shocks are uncorrelated with past and current values of $X$ but allows for feedback effects from lagged dependent variables (or lagged errors) to current and future values of $X$.

Examples where sequential exogeneity is more plausible than strict exogeneity include:

(1) Rational expectation models of household and firm decisions. In a rational expectations model, it is assumed that a household or firm's current choice of $X$ is optimal given its current information, hence current shocks are uncorrelated with past and current values of $X$, but a shock in period $t$ can affect choices of $X_{t+1}, X_{t+2}, \ldots$.

(2) Effect of children on female labor force participation decisions. Let $Y$ be labor force participation and $X$ be number of children.
Strict exogeneity would require that labor supply decisions have no effect on fertility decisions at any point in the life cycle, which is not realistic.
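
The inconsistency of the fixed effects estimator under sequential exogeneity, and the first-difference instrumental variable fix described in Section II, can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not part of the Papke example: the data are simulated from a pure autoregressive panel $Y_{it} = \rho Y_{i,t-1} + c_i + \varepsilon_{it}$ with no EZ variable or year effects, and the names `n_units`, `n_periods`, `rho_fe` and `rho_iv`, as well as all parameter values, are chosen only for illustration.

```r
# Simulated dynamic panel (illustrative assumption, not the ezunem data):
# Y_it = rho * Y_{i,t-1} + c_i + eps_it, with no other regressors.
set.seed(521)
n_units   <- 5000   # N: large, so sampling noise is small
n_periods <- 6      # T: small, so the fixed effects inconsistency is visible
rho       <- 0.5    # true coefficient on the lagged outcome

c_i <- rnorm(n_units, sd = 0.5)   # unobserved unit effects

# Columns of Y hold t = 0, 1, ..., T; start from the stationary distribution
Y <- matrix(NA_real_, nrow = n_units, ncol = n_periods + 1)
Y[, 1] <- c_i / (1 - rho) + rnorm(n_units, sd = sqrt(1 / (1 - rho^2)))
for (t in 2:(n_periods + 1)) {
  Y[, t] <- rho * Y[, t - 1] + c_i + rnorm(n_units)
}

# Fixed effects (within) estimator: demean Y_it and Y_{i,t-1} within each unit
y_cur    <- Y[, 2:(n_periods + 1)]   # Y_it,      t = 1, ..., T
y_lag    <- Y[, 1:n_periods]         # Y_{i,t-1}, t = 1, ..., T
y_cur_dm <- y_cur - rowMeans(y_cur)
y_lag_dm <- y_lag - rowMeans(y_lag)
rho_fe   <- sum(y_lag_dm * y_cur_dm) / sum(y_lag_dm^2)

# First-difference IV estimator: Y_{i,t-2} instruments Y_{i,t-1} - Y_{i,t-2}
d_y     <- Y[, 3:(n_periods + 1)] - Y[, 2:n_periods]   # Y_it - Y_{i,t-1}
d_y_lag <- Y[, 2:n_periods] - Y[, 1:(n_periods - 1)]   # Y_{i,t-1} - Y_{i,t-2}
z       <- Y[, 1:(n_periods - 1)]                      # Y_{i,t-2}
rho_iv  <- sum(z * d_y) / sum(z * d_y_lag)

cat("true rho:", rho, " FE estimate:", rho_fe, " FD-IV estimate:", rho_iv, "\n")
```

In a run of this sketch the within estimate should fall well below the true $\rho$, while the first-difference IV estimate should be close to it. The IV estimate typically has the larger sampling variance, which is in line with the wide confidence interval for $\rho$ reported in the enterprise zone example.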