Notes 14 - Wharton Statistics Department

Stat 521 Notes 14
Models for Panel Data that Violate the Strict Exogeneity
Assumption
Reading: Wooldridge, Chapter 11.1
I. Review of Fixed Effects Model
$$Y_{it} = X_{it}'\beta + c_i + \varepsilon_{it} \tag{1.1}$$
This is a structural/causal equation in the sense that $\beta$ is
supposed to represent the causal effect of changes in $X_{it}$
holding everything else fixed, and $v_{it} = c_i + \varepsilon_{it}$
represents the effect of all omitted variables on the outcome.
$v_{it}$ can be correlated with $X_{it}$. The key assumption of
strict exogeneity in the fixed effects model is that the correlation
between $v_{it}$ and $X_{it}$ arises only through the time-invariant
part $c_i$ of $v_{it}$.
Assumptions of Fixed Effects Model:
1. Assumption 1 (Independence Between Units): The vectors
of individual outcomes $(Y_{i1}, \ldots, Y_{iT})$ and
$(Y_{j1}, \ldots, Y_{jT})$ are independent for $i \neq j$.
2. Assumption 2 (Strict Exogeneity):
$E(\varepsilon_{it} \mid X_{i1}, \ldots, X_{iT}, c_i) = 0$.
$\varepsilon_{it}$ can be thought of as a time-varying shock that is
independent of the unobserved individual characteristic $c_i$ and
the observed characteristics $X_{it}$ in every period.
2A. Assumption 2A. The $\varepsilon_{it}$'s are independent and
identically distributed with constant variance $\sigma^2$. This
assumption is made in the usual fixed effects inferences but
can be relaxed by using robust standard errors.
Fixed Effects Model Estimation: Let $Z_j$ be a dummy variable
for unit $j$, i.e.,
$$Z_{j,it} = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases}$$
Then we can write model (1.1) as
$$Y_{it} = X_{it}'\beta + c_1 Z_{1,it} + \cdots + c_N Z_{N,it} + \varepsilon_{it}$$
We can estimate $\beta, c_1, \ldots, c_N$ by least squares regression
of $Y_{it}$ on $X_{it}, Z_{1,it}, \ldots, Z_{N,it}$, pooling together
the data for $i = 1, \ldots, N$, $t = 1, \ldots, T$.
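As a rough illustration of the dummy-variable regression just described, the sketch below uses simulated data (not Papke's data; the variable names, sample sizes and true coefficient of 2 are all assumptions made for the example). It checks that least squares with unit dummies gives the same slope as regressing unit-demeaned $Y$ on unit-demeaned $X$, and contrasts both with pooled least squares that ignores $c_i$.

```r
# Simulated panel (hypothetical data): N units observed for T_per periods.
set.seed(1)
N <- 50; T_per <- 6
unit <- rep(1:N, each = T_per)
c_i  <- rnorm(N)[unit]                  # unit effect, repeated over time
x    <- 0.8 * c_i + rnorm(N * T_per)    # regressor correlated with c_i
y    <- 2 * x + c_i + rnorm(N * T_per)  # true beta = 2

# (a) Least squares with one dummy per unit.
beta_lsdv <- unname(coef(lm(y ~ x + factor(unit)))["x"])

# (b) Demean y and x within each unit, then regress without an intercept.
y_dm <- y - ave(y, unit)
x_dm <- x - ave(x, unit)
beta_within <- unname(coef(lm(y_dm ~ x_dm - 1))["x_dm"])

# (c) Pooled least squares that ignores the unit effects, for contrast.
beta_pooled <- unname(coef(lm(y ~ x))["x"])

print(c(lsdv = beta_lsdv, within = beta_within, pooled = beta_pooled))
```

In this simulated setup the first two estimates coincide, while the pooled estimate is pulled away from 2 because $x$ is correlated with $c_i$.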
Example: Papke (1994, Journal of Public Economics, “Tax
Policy and Urban Development”) studies the effect of urban
enterprise zones (EZs) on economic outcomes such as
unemployment claims. Urban enterprise zones encourage
development in blighted neighborhoods by offering
entrepreneurs and investors tax and regulatory relief if they start
businesses in the area. Papke considers a panel of 22 areas in
Indiana from 1980 to 1988. Indiana’s EZ program began in
1983. To qualify for consideration to be an EZ, an area must
have an unemployment rate at least 1.5 times the average
statewide unemployment rate, and a resident household poverty
rate at least 25 percent above the U.S. poverty level. Areas in
the panel became EZs at different time points and some areas
did not become EZs at all.
One model Papke uses is a fixed effects model:
$$Y_{it} = \theta_t + \beta\, EZ_{it} + c_i + \varepsilon_{it}$$
where $Y_{it}$ is the log of unemployment claims in area $i$ in year
$t$, $\theta_t$ is a year effect (the coefficient on a dummy variable
for year $t$) and $EZ_{it}$ is a dummy variable for whether the $i$th
area was an EZ in year $t$. The fixed effects $c_i$ take account of
permanent differences across areas such as industrial composition and
composition of the labor force.
ezdata=read.table("ezunem.raw",header=TRUE,sep=",");
area=ezdata$city;
year=ezdata$year;
luclms=ezdata$luclms; # Log of Unemployment Claims
lag_luclms=ezdata$lag_luclms; # Log of unemployment claims in previous year in area
ez=ezdata$ez; # Whether area is an urban enterprise zone in given year
# Fixed Effects Model
femodel=lm(luclms~ez+as.factor(year)+as.factor(area));
summary(femodel)
Call:
lm(formula = luclms ~ ez + as.factor(year) + as.factor(area))
Residuals:
Min 1Q Median 3Q Max
-0.57618 -0.10837 -0.00977 0.11364 0.49623
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 11.67615 0.08008 145.808 < 2e-16 ***
ez -0.10441 0.05542 -1.884 0.061291 .
as.factor(year)1981 -0.32163 0.06046 -5.320 3.30e-07 ***
as.factor(year)1982 0.13550 0.06046 2.241 0.026332 *
as.factor(year)1983 -0.21926 0.06046 -3.627 0.000381 ***
as.factor(year)1984 -0.57915 0.06232 -9.294 < 2e-16 ***
as.factor(year)1985 -0.59179 0.06550 -9.036 3.92e-16 ***
as.factor(year)1986 -0.62126 0.06550 -9.486 < 2e-16 ***
as.factor(year)1987 -0.88895 0.06550 -13.573 < 2e-16 ***
as.factor(year)1988 -1.22763 0.06550 -18.744 < 2e-16 ***
as.factor(area)2 -0.19348 0.09941 -1.946 0.053296 .
as.factor(area)3 -0.37894 0.09941 -3.812 0.000194 ***
as.factor(area)4 -0.54117 0.09941 -5.444 1.83e-07 ***
as.factor(area)5 0.01103 0.09472 0.116 0.907407
as.factor(area)6 0.55458 0.09452 5.867 2.32e-08 ***
as.factor(area)7 0.75007 0.09452 7.935 2.90e-13 ***
as.factor(area)8 -0.05876 0.09472 -0.620 0.535900
as.factor(area)9 0.35343 0.09472 3.731 0.000261 ***
as.factor(area)10 1.64501 0.09941 16.548 < 2e-16 ***
as.factor(area)11 -0.13032 0.09941 -1.311 0.191695
as.factor(area)12 -0.03498 0.09941 -0.352 0.725392
as.factor(area)13 -0.83257 0.09941 -8.375 2.15e-14 ***
as.factor(area)14 -0.87363 0.09472 -9.223 < 2e-16 ***
as.factor(area)15 -0.23542 0.09941 -2.368 0.019020 *
as.factor(area)16 0.43574 0.09941 4.383 2.06e-05 ***
as.factor(area)17 -0.44522 0.09452 -4.710 5.18e-06 ***
as.factor(area)18 -0.04289 0.09941 -0.431 0.666694
as.factor(area)19 0.09341 0.09941 0.940 0.348764
as.factor(area)20 -0.35098 0.09452 -3.713 0.000279 ***
as.factor(area)21 0.45779 0.09452 4.843 2.90e-06 ***
as.factor(area)22 0.21864 0.09941 2.199 0.029225 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2005 on 167 degrees of freedom
Multiple R-squared: 0.9332, Adjusted R-squared: 0.9212
F-statistic: 77.75 on 30 and 167 DF, p-value: < 2.2e-16
There is suggestive but inconclusive evidence that being
designated an enterprise zone reduces log unemployment claims
(p-value = 0.06); the estimated effect is that being designated an
enterprise zone reduces unemployment claims by about 10%.
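The "about 10%" reading comes from treating a coefficient in a log-outcome regression as a proportional change. A small sketch of the arithmetic, using the ez coefficient from the output above (the variable names are introduced here only for illustration):

```r
# Coefficient on ez from the fixed effects regression output above.
beta_hat <- -0.10441

# Exact proportional change implied by a change of beta_hat in log claims.
pct_exact <- 100 * (exp(beta_hat) - 1)

# Usual approximation: read the coefficient itself as a proportion.
pct_approx <- 100 * beta_hat

print(round(c(exact = pct_exact, approx = pct_approx), 2))
```

For a coefficient this small the two numbers are close; the gap grows as the coefficient moves away from zero.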
Comparison of fixed effects vs. random effects model: Hausman
test.
Note: For the Hausman test, I did not describe it entirely
correctly in Notes 13. We should only consider coefficients on
variables that are time-varying, i.e., we should not consider the
intercept, the coefficients on the fixed effects or the coefficients
on any other time-constant variables. Let $\hat\beta_{FE}$,
$\hat\beta_{RE}$ denote the fixed effects and random effects
estimates of the coefficients on the time-varying variables. The
Hausman test statistic is
$$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[ \mathrm{Var}(\hat\beta_{FE}) - \mathrm{Var}(\hat\beta_{RE}) \right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$$
Under the null hypothesis that the effects $c_i$ are uncorrelated
with the time-varying variables that are included in the model,
$H$ is asymptotically distributed as $\chi^2_K$ where $K$ is the
dimension of $\beta$.
# Hausman test of fixed effects model vs. random effects model
betahat.femodel=coef(femodel)[2:10];
vcov.femodel=vcov(femodel)[2:10,2:10];
# Random Effects Estimates and Covariance Matrix using the lme4 package
library(lme4);
remodel=lmer(luclms~ez+as.factor(year)+(1|area));
betahat.remodel=fixef(remodel)[2:10];
vcov.remodel=vcov(remodel)[2:10,2:10];
# Hausman test statistic
h=matrix(betahat.femodel-betahat.remodel,nrow=1)%*%solve(vcov.femodel-vcov.remodel)%*%matrix(betahat.femodel-betahat.remodel,ncol=1);
# Compute p-value
pval=1-pchisq(as.numeric(h),length(betahat.femodel));
> h
1 x 1 Matrix of class "dgeMatrix"
[,1]
[1,] 0.07975095
> pval
[1] 1
Here, there is no evidence against the null hypothesis of
uncorrelated effects under the maintained hypothesis of strict
exogeneity.
II. Models for When Strict Exogeneity Fails
The time-varying disturbance $\varepsilon_{it}$ captures shocks to an
area that do not represent permanent characteristics of the area,
e.g., closure of a business in an area. The closure of a business in
year $t-1$ is likely to still affect employment at year $t$ and thus
$Y_{i,t-1}$ is likely to be correlated with $\varepsilon_{it}$.
Furthermore, because the designation of an area as an enterprise zone
depends on previous unemployment, $Y_{i,t-1}$ is likely to be
correlated with $EZ_{it}$. This means strict exogeneity fails:
$\varepsilon_{it}$ is correlated with $EZ_{it}$ through their mutual
correlation with $Y_{i,t-1}$. We can address this problem by adding
$Y_{i,t-1}$ to the model:
$$Y_{it} = \theta_t + \beta\, EZ_{it} + \rho\, Y_{i,t-1} + c_i + \varepsilon_{it} \tag{1.2}$$
Here $X_{it} = (\text{time } t \text{ dummy}, EZ_{it}, Y_{i,t-1})$.
It is not plausible that strict exogeneity holds in (1.2) because
this would mean
$E(\varepsilon_{it} \mid EZ_{i1}, \ldots, EZ_{iT}, Y_{i0}, \ldots, Y_{i,T-1}, c_i) = 0$,
but $\varepsilon_{it}$ is correlated with $Y_{it}$, which is in the
conditioning set whenever $t \le T-1$. However, the following
sequential moment restriction is plausible:
$$E(\varepsilon_{it} \mid X_{it}, \ldots, X_{i1}, c_i) = 0, \quad t = 1, \ldots, T \tag{1.3}$$
When assumption (1.3) holds, we say that the $X_{it}$ are
sequentially exogenous conditional on the unobserved effect.
Given model (1.2), assumption (1.3) is equivalent to
$$E(Y_{it} \mid X_{it}, \ldots, X_{i1}, c_i) = E(Y_{it} \mid X_{it}, c_i) = X_{it}\beta + c_i.$$
Suppose $Y_{i,t-1}$ is part of $X_{it}$ so we can write
$X_{it} = (Z_{it}, Y_{i,t-1})$ (here $Z_{it}$ denotes the other
regressors, not the unit dummies used earlier). Sequential exogeneity
implies that after $X_{it}$ and $c_i$ have been controlled for, no
past values of $X_{it}$ affect $Y_{it}$. Strict exogeneity requires
that after $X_{it}$ and $c_i$ have been controlled for, no values
$X_{it'}$ with $t' \neq t$, past or future, affect $Y_{it}$.
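When sequential exogeneity holds but not strict exogeneity, the
fixed effects estimator is inconsistent. Generally,
$$\mathrm{plim}(\hat\beta_{FE}) = \beta + \left[ T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \ddot X_{it}) \right]^{-1} \left[ T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \varepsilon_{it}) \right],$$
where $\ddot X_{it} = X_{it} - \bar X_i$ and
$\bar X_i = T^{-1} \sum_{t=1}^{T} X_{it}$. Under sequential exogeneity,
$E(\ddot X_{it}' \varepsilon_{it}) = E[(X_{it} - \bar X_i)' \varepsilon_{it}] = -E[\bar X_i' \varepsilon_{it}]$
because $E[X_{it}' \varepsilon_{it}] = 0$, and so
$$T^{-1} \sum_{t=1}^{T} E(\ddot X_{it}' \varepsilon_{it}) = -T^{-1} \sum_{t=1}^{T} E(\bar X_i' \varepsilon_{it}) = -E(\bar X_i' \bar\varepsilon_i),$$
where $\bar\varepsilon_i = T^{-1} \sum_{t=1}^{T} \varepsilon_{it}$.
When $X_{it}$ includes $Y_{i,t-1}$, then
$T^{-1} \sum_{t=1}^{T} Y_{i,t-1}$ is correlated with
$\varepsilon_{it}$ (for $t < T$ the average contains $Y_{it}$, which
depends directly on $\varepsilon_{it}$), meaning
$E(\bar X_i' \bar\varepsilon_i)$ will not be zero and the fixed effects
estimator will be biased and inconsistent as $N \to \infty$ with $T$
fixed.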
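A small simulation can make the size of this problem concrete. The sketch below is only an illustration under assumed values (a pure autoregressive model with no EZ regressor, $\rho = 0.5$, $T = 5$, standard normal shocks and unit effects); it computes the fixed effects (within) estimate of the coefficient on the lagged outcome.

```r
# Simulate Y_it = rho * Y_{i,t-1} + c_i + eps_it (assumed design, no other regressors).
set.seed(2)
N <- 1000; T_obs <- 5; rho <- 0.5; burn <- 20
c_i <- rnorm(N)
y <- matrix(0, N, burn + T_obs + 1)
y[, 1] <- c_i / (1 - rho)                 # start each unit at its long-run mean
for (t in 2:ncol(y)) y[, t] <- rho * y[, t - 1] + c_i + rnorm(N)
y <- y[, (burn + 1):ncol(y)]              # keep columns Y_i0, Y_i1, ..., Y_iT

y_cur <- y[, 2:(T_obs + 1)]               # Y_it,      t = 1, ..., T
y_lag <- y[, 1:T_obs]                     # Y_{i,t-1}, t = 1, ..., T

# Within transformation: subtract each unit's time average (row means).
y_cur_dm <- y_cur - rowMeans(y_cur)
y_lag_dm <- y_lag - rowMeans(y_lag)

# Fixed effects estimate of rho: least squares on the demeaned data.
rho_fe <- sum(y_lag_dm * y_cur_dm) / sum(y_lag_dm^2)
print(c(true = rho, fixed_effects = rho_fe))
```

With $T$ this small, the fixed effects estimate in this design falls well below the true value even though $N$ is large, which is the inconsistency described above.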
We can obtain a consistent estimator of $(\beta, \rho)$ in (1.2) under
sequential exogeneity as follows. First, we take first differences
to eliminate the $c_i$:
$$Y_{it} - Y_{i,t-1} = \delta_t + \beta (EZ_{it} - EZ_{i,t-1}) + \rho (Y_{i,t-1} - Y_{i,t-2}) + \varepsilon_{it} - \varepsilon_{i,t-1}, \quad t = 2, \ldots, T,$$
where $\delta_t = \theta_t - \theta_{t-1}$ is the differenced time
effect. We cannot estimate $(\beta, \rho)$ in the above equation by
least squares since $\varepsilon_{it} - \varepsilon_{i,t-1}$ is
correlated with $Y_{i,t-1} - Y_{i,t-2}$. But we can use $Y_{i,t-2}$ as
an instrumental variable for $Y_{i,t-1} - Y_{i,t-2}$ since
$\varepsilon_{it} - \varepsilon_{i,t-1}$ is uncorrelated with
$Y_{i,t-2}$ under sequential exogeneity ($Y_{i,t-2}$ is part of
$X_{i,t-1}$). In other words, we have
$$E^*[Y_{it} - Y_{i,t-1} \mid EZ_{it} - EZ_{i,t-1}, Y_{i,t-2}, t] = \delta_t + \beta (EZ_{it} - EZ_{i,t-1}) + \rho\, E^*(Y_{i,t-1} - Y_{i,t-2} \mid EZ_{it} - EZ_{i,t-1}, Y_{i,t-2}, t), \quad t = 3, \ldots, T,$$
where $E^*$ denotes best linear expectation. We can estimate
$(\beta, \rho)$ by two stage least squares:
1. Regress $Y_{i,t-1} - Y_{i,t-2}$ on $EZ_{it} - EZ_{i,t-1}$,
$Y_{i,t-2}$ and time dummies by least squares to find
$\hat E^*(Y_{i,t-1} - Y_{i,t-2} \mid Y_{i,t-2}, EZ_{it} - EZ_{i,t-1}, t)$.
2. Regress $Y_{it} - Y_{i,t-1}$ on $EZ_{it} - EZ_{i,t-1}$,
$\hat E^*(Y_{i,t-1} - Y_{i,t-2} \mid Y_{i,t-2}, EZ_{it} - EZ_{i,t-1}, t)$
and time dummies to estimate $(\beta, \rho)$.
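The two steps can be tried out on simulated data before applying them to the enterprise zone panel. The sketch below is an illustration under assumed values only: it uses a pure autoregressive model with no EZ regressor and no true time effects, $\rho = 0.5$, and hypothetical variable names. It compares least squares on the differenced equation with the two stage procedure that uses the second lag of the outcome as the instrument.

```r
# Simulate Y_it = rho * Y_{i,t-1} + c_i + eps_it (assumed design).
set.seed(3)
N <- 5000; T_obs <- 6; rho <- 0.5; burn <- 20
c_i <- rnorm(N, sd = 0.5)
y <- matrix(0, N, burn + T_obs)
y[, 1] <- c_i / (1 - rho)
for (t in 2:ncol(y)) y[, t] <- rho * y[, t - 1] + c_i + rnorm(N)
y <- y[, (burn + 1):ncol(y)]                     # columns are periods 1, ..., T_obs

# Stack the differenced equation for t = 3, ..., T_obs (column-major stacking).
tt     <- 3:T_obs
dy     <- as.vector(y[, tt]     - y[, tt - 1])   # Y_it - Y_{i,t-1}
dy_lag <- as.vector(y[, tt - 1] - y[, tt - 2])   # Y_{i,t-1} - Y_{i,t-2}
y_lag2 <- as.vector(y[, tt - 2])                 # instrument Y_{i,t-2}
period <- factor(rep(tt, each = N))              # time dummies

# Least squares on the differenced equation (inconsistent).
rho_ols <- unname(coef(lm(dy ~ dy_lag + period))["dy_lag"])

# Step 1: fitted values of the endogenous difference given the instrument.
dy_lag_hat <- fitted(lm(dy_lag ~ y_lag2 + period))

# Step 2: replace the endogenous difference by its fitted value.
rho_2sls <- unname(coef(lm(dy ~ dy_lag_hat + period))["dy_lag_hat"])

print(c(true = rho, ols_diff = rho_ols, two_stage = rho_2sls))
```

The standard errors reported by the second `lm` call are not valid two stage least squares standard errors, since they are based on residuals computed with the fitted values rather than the actual regressor.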
# Calculate lagged EZ values
lagez=rep(NA,length(ez));
for(i in 1:22){
for(t in 1981:1988){
lagez[(area==i & year==t)]=ez[(area==i & year==t-1)];
}
}
# Calculate second lag of Y
second_lag_y=rep(NA,length(luclms));
for(i in 1:22){
for(t in 1982:1988){
second_lag_y[(area==i & year==t)]=luclms[(area==i & year==t-2)];
}
}
# Calculate lag Y - second lag of Y
first_diff_lag_y=lag_luclms-second_lag_y;
# Calculate EZ minus lagged EZ
first_diff_ez=ez-lagez;
# Subset of observations where we have a second lag of y
subset=!(is.na(second_lag_y));
# First stage regression
fsreg=lm(first_diff_lag_y[subset]~second_lag_y[subset]+first_diff_ez[subset]+as.factor(year[subset]));
first_diff_lag_y_hat=predict(fsreg);
# Second stage regression
first_diff_y=luclms-lag_luclms;
ssreg=lm(first_diff_y[subset]~first_diff_ez[subset]+first_diff_lag_y_hat+as.factor(year[subset]),x=TRUE);
# Calculate correct standard errors for two stage least squares
modelmat.ssreg=ssreg$x;
# Modify the model matrix so that actual first_diff_lag_y replaces
# first_diff_lag_y_hat
mod.modelmat=modelmat.ssreg;
mod.modelmat[,3]=first_diff_lag_y[subset];
# Calculate sigmahatusq where u is error in structural equation
sigmahatusq=(1/length(first_diff_lag_y_hat))*sum((first_diff_y[subset]-mod.modelmat%*%matrix(coef(ssreg),ncol=1))^2);
# Calculate variance of residuals in second stage regression (sigmahatesq)
sigmahatesq=deviance(ssreg)/length(first_diff_lag_y_hat);
# Variance is variance from second stage regression times sigmahatusq/sigmahatesq
tsls.var=vcov(ssreg)*(sigmahatusq/sigmahatesq);
# CI for beta (effect of EZ)
betahat=coef(ssreg)[2];
se.betahat=sqrt(tsls.var[2,2]);
lci.betahat=betahat-1.96*se.betahat;
uci.betahat=betahat+1.96*se.betahat;
# CI for rho (effect of lagged y)
rhohat=coef(ssreg)[3];
se.rhohat=sqrt(tsls.var[3,3]);
lci.rhohat=rhohat-1.96*se.rhohat;
uci.rhohat=rhohat+1.96*se.rhohat;
> betahat
first_diff_ez[subset]
-0.2613225
> lci.betahat
first_diff_ez[subset]
-0.575849
> uci.betahat
first_diff_ez[subset]
0.05320405
> rhohat
first_diff_lag_y_hat
0.3553252
> lci.rhohat
first_diff_lag_y_hat
-0.8193982
> uci.rhohat
first_diff_lag_y_hat
1.530049
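The standard error correction used in the code above (rescaling the second stage covariance matrix by the ratio of the structural error variance to the second stage residual variance) can be checked on a small simulated example. The sketch below is an assumption-laden illustration with one endogenous regressor and one instrument, not the enterprise zone data; it compares the rescaled covariance matrix with the direct formula $\hat\sigma_u^2 (\hat X' \hat X)^{-1}$, using the same degrees-of-freedom divisor as `lm`.

```r
# Simulated cross-section with one endogenous regressor x and one instrument z.
set.seed(4)
n <- 500
z <- rnorm(n)                 # instrument
v <- rnorm(n)
u <- 0.7 * v + rnorm(n)       # structural error, correlated with x through v
x <- 0.8 * z + v              # endogenous regressor
y <- 1 + 0.5 * x + u

# Two stages by hand.
x_hat  <- fitted(lm(x ~ z))
stage2 <- lm(y ~ x_hat)
b      <- coef(stage2)

X     <- cbind(1, x)          # actual regressors
X_hat <- cbind(1, x_hat)      # regressors with x replaced by its fitted value

ssr_u <- sum((y - X %*% b)^2) # structural residuals use the actual x
ssr_e <- deviance(stage2)     # second stage residuals use x_hat

# Rescaling approach: second stage covariance times sigma_u^2 / sigma_e^2.
var_rescaled <- vcov(stage2) * ((ssr_u / n) / (ssr_e / n))

# Direct formula with the residual degrees of freedom used by lm.
var_direct <- (ssr_u / stage2$df.residual) * solve(crossprod(X_hat))

print(sqrt(diag(var_rescaled)))
```

Because `vcov(stage2)` already equals the second stage residual variance times $(\hat X' \hat X)^{-1}$, the rescaling simply swaps in the structural residual variance, so the two matrices agree up to rounding.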
The point estimate for $\beta$ is -0.26, which would mean that being
designated an enterprise zone reduces unemployment claims by
approximately 26% (about 23% using the exact conversion
$\exp(-0.26) - 1$), but the CI for $\beta$ is pretty wide
(-0.58, 0.05) and does contain 0; there is not strong evidence that
enterprise zones reduce unemployment claims under this model.
We will study a way to improve the efficiency of our estimate of
$\beta$ in the next class.
More discussion of strict exogeneity vs. sequential exogeneity
assumption
The strict exogeneity assumption
$$E(\varepsilon_{it} \mid X_{iT}, \ldots, X_{i,t+1}, X_{it}, \ldots, X_{i1}) = 0$$
implies that there is no feedback between lagged dependent
variables and future values of the explanatory variable.
The sequential exogeneity assumption
$$E(\varepsilon_{it} \mid X_{it}, \ldots, X_{i1}) = 0$$
implies that current shocks are uncorrelated with past and
current values of $X$ but allows for feedback effects from lagged
dependent variables (or lagged errors) to current and future
values of $X$.
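The difference between the two assumptions can be seen in a two-period simulation with feedback. The sketch below is a stylized illustration under assumed values (the coefficients and variable names are made up): the period-2 regressor responds to the period-1 outcome, so the period-1 shock is uncorrelated with the current regressor but correlated with the future one.

```r
# Two-period example with feedback from the outcome to the next regressor.
set.seed(5)
N <- 20000
eps1 <- rnorm(N)               # shock in period 1
x1   <- rnorm(N)               # period-1 regressor, set before eps1 is realized
y1   <- 1.0 * x1 + eps1        # period-1 outcome
x2   <- 0.5 * y1 + rnorm(N)    # period-2 regressor responds to the past outcome

cor_current <- cor(eps1, x1)   # near 0: consistent with sequential exogeneity
cor_future  <- cor(eps1, x2)   # clearly nonzero: strict exogeneity fails

print(round(c(current = cor_current, future = cor_future), 3))
```

This mirrors the enterprise zone setting, where designation in a later year depends on earlier unemployment.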
Examples where sequential exogeneity is more plausible than
strict exogeneity include:
(1) Rational expectation models of household and firm
decisions. In a rational expectations model, it is assumed that a
household or firm’s current choice of X is optimal given its
current information, hence current shocks are uncorrelated with
past and current values of $X$, but a shock in period $t$ can affect
choices of $X_{t+1}, X_{t+2}, \ldots$.
(2) Effect of children on female labor force participation
decisions. Let $Y$ be labor force participation and $X$ be
number of children. Strict exogeneity would require that labor
supply decisions have no effect on fertility decisions at any point
in the life cycle, which is not realistic.