Panel Data

Data organization
Time Series
Year    Sales
2005    $10,200
2006    $10,900
2007    $11,000
2008    $8,500
2009    $10,400
Cross-Sectional
Location    Sales
Virginia    $10,400
Florida     $10,300
Colorado    $8,300
Maine       $10,200
Panel
Year    Location    Sales
2005    Virginia    $9,000
2005    Florida     $9,500
2005    Colorado    $9,200
2005    Maine       $8,800
2006    Virginia    $9,200
2006    Florida     $10,500
2006    Colorado    $10,700
2006    Maine       $9,300
2007    Virginia    $8,700
2007    Florida     $8,900
2007    Colorado    $11,000
2007    Maine       $9,700
2008    Virginia    $8,000
2008    Florida     $8,400
2008    Colorado    $9,300
2008    Maine       $9,000
2009    Virginia    $8,000
2009    Florida     $9,700
2009    Colorado    $8,500
2009    Maine       $9,100
Multi-Dimensional Panel
Year    Location    Holiday      Sales
2005    Virginia    Christmas    $9,200
2005    Virginia    July 4       $8,400
2005    Virginia    Labor Day    $8,900
2005    Florida     Christmas    $9,100
2005    Florida     July 4       $8,400
2005    Florida     Labor Day    $10,500
2005    Colorado    Christmas    $10,300
2005    Colorado    July 4       $9,400
2005    Colorado    Labor Day    $10,900
2005    Maine       Christmas    $8,900
2005    Maine       July 4       $9,100
2005    Maine       Labor Day    $8,700
2006    Virginia    Christmas    $8,200
2006    Virginia    July 4       $8,900
2006    Virginia    Labor Day    $8,900
2006    Florida     Christmas    $10,300
2006    Florida     July 4       $11,000
2006    Florida     Labor Day    $8,500
2006    Colorado    Christmas    $8,100
2006    Colorado    July 4       $9,200
2006    Colorado    Labor Day    $10,200
2006    Maine       Christmas    $10,200
2006    Maine       July 4       $8,100
2006    Maine       Labor Day    $8,600
2007    Virginia    Christmas    $9,600
2007    Virginia    July 4       $10,400
2007    Virginia    Labor Day    $10,800
2007    Florida     Christmas    $10,300
2007    Florida     July 4       $9,100
2007    Florida     Labor Day    $10,900
2007    Colorado    Christmas    $10,800
2007    Colorado    July 4       $9,600
2007    Colorado    Labor Day    $10,200
2007    Maine       Christmas    $10,400
2007    Maine       July 4       $9,600
2007    Maine       Labor Day    $11,000
2008    Virginia    Christmas    $8,200
2008    Virginia    July 4       $9,800
2008    Virginia    Labor Day    $8,900
2008    Florida     Christmas    $9,200
2008    Florida     July 4       $10,400
2008    Florida     Labor Day    $9,000
2008    Colorado    Christmas    $10,700
2008    Colorado    July 4       $9,600
2008    Colorado    Labor Day    $8,600
2008    Maine       Christmas    $8,100
2008    Maine       July 4       $8,600
2008    Maine       Labor Day    $8,000
2009    Virginia    Christmas    $9,800
2009    Virginia    July 4       $8,800
2009    Virginia    Labor Day    $10,400
2009    Florida     Christmas    $10,700
2009    Florida     July 4       $8,300
2009    Florida     Labor Day    $9,600
2009    Colorado    Christmas    $9,100
2009    Colorado    July 4       $8,300
2009    Colorado    Labor Day    $9,600
2009    Maine       Christmas    $10,200
2009    Maine       July 4       $9,600
2009    Maine       Labor Day    $8,200
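The four layouts differ only in how many indices identify an observation. A minimal sketch in plain Python, using the 2005 and 2006 rows of the Panel table; the dictionary layout and the helper names `time_series` and `cross_section` are illustrative choices, not something the slides prescribe.

```python
# Panel observations keyed by (year, location); values are sales in dollars.
# Only the 2005 and 2006 rows of the Panel table are copied here.
panel = {
    (2005, "Virginia"): 9000, (2005, "Florida"): 9500,
    (2005, "Colorado"): 9200, (2005, "Maine"): 8800,
    (2006, "Virginia"): 9200, (2006, "Florida"): 10500,
    (2006, "Colorado"): 10700, (2006, "Maine"): 9300,
}

def time_series(data, location):
    """Fix the cross-section and vary time: one location across years."""
    return {year: sales for (year, loc), sales in sorted(data.items())
            if loc == location}

def cross_section(data, year):
    """Fix time and vary the cross-section: all locations in one year."""
    return {loc: sales for (yr, loc), sales in sorted(data.items())
            if yr == year}

virginia = time_series(panel, "Virginia")
sales_2005 = cross_section(panel, 2005)
print(virginia)
print(sales_2005)
```

A multi-dimensional panel would add a third component to the key, for example (year, location, holiday).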
Regression Models
• Time series
$y_t = \alpha + \beta x_t + u_t$
• Cross-sectional
$y_i = \alpha + \beta x_i + u_i$
• Panel
$y_{i,t} = \alpha + \beta x_{i,t} + u_{i,t}$
• Multi-dimensional panel
$y_{i,s,t} = \alpha + \beta x_{i,s,t} + u_{i,s,t}$
Errors in Uni-dimensional Data
In standard time series or cross-sectional data sets, we must adjust
for non-independent errors.
Serial correlation
Errors correlated across time
Spatial correlation
Errors correlated across cross-sections
Heteroskedasticity
Error variance changes over time or cross-sections
Errors in Panel Data
Heterogeneous serial correlation
Errors correlated across time, but differently for different cross-sections.
Heterogeneous spatial correlation
Errors correlated across cross-sections but differently for different
time periods.
Heterogeneous heteroskedasticity
Error variance changes over time, but does so differently for
different cross-sections.
Serial-spatial correlation
Past errors from one cross-section are correlated with future
errors from a different cross-section.
Generalized Least Squares
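For the regression model $y_t = \alpha + \beta x_t + u_t$
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$$
The error covariance matrix $\Omega$ shows the covariances of error terms
across different observations.
$$\Omega = \begin{bmatrix}
\operatorname{var}(u_{t+1}) & \operatorname{cov}(u_{t+1}, u_{t+2}) & \operatorname{cov}(u_{t+1}, u_{t+3}) \\
\operatorname{cov}(u_{t+1}, u_{t+2}) & \operatorname{var}(u_{t+2}) & \operatorname{cov}(u_{t+2}, u_{t+3}) \\
\operatorname{cov}(u_{t+1}, u_{t+3}) & \operatorname{cov}(u_{t+2}, u_{t+3}) & \operatorname{var}(u_{t+3})
\end{bmatrix}$$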
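The GLS formula can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `gls`, the simulated data, and the values alpha = 1.0 and beta = 0.5 are hypothetical. When the error covariance matrix is the identity, the GLS estimate coincides with the OLS estimate, which the last two lines check.

```python
import numpy as np

def gls(X, y, omega):
    """GLS estimate (X' Omega^-1 X)^-1 X' Omega^-1 y.

    X must already contain a column of ones for the intercept.
    """
    omega_inv_X = np.linalg.solve(omega, X)   # Omega^-1 X without forming Omega^-1
    omega_inv_y = np.linalg.solve(omega, y)   # Omega^-1 y
    return np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)

# Simulated data (illustrative values only)
rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = 1.0 + 0.5 * x + rng.normal(size=T)

beta_gls = gls(X, y, np.eye(T))                   # Omega = identity
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # ordinary least squares
print(beta_gls, beta_ols)
```

Solving linear systems with `np.linalg.solve` avoids explicitly inverting the T x T covariance matrix.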
Ordinary Least Squares Assumptions
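For the regression model $y_t = \alpha + \beta x_t + u_t$
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$$
$$\operatorname{cov}(u_t, u_s) = \begin{cases} \sigma_u^2 & \forall\, t = s \\ 0 & \forall\, t \neq s \end{cases}$$
$$\Omega = \begin{bmatrix} \sigma_u^2 & 0 & 0 \\ 0 & \sigma_u^2 & 0 \\ 0 & 0 & \sigma_u^2 \end{bmatrix}$$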
Ordinary Least Squares (Heteroskedasticity)
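For the regression model $y_t = \alpha + \beta x_t + u_t$
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$$
$$\operatorname{cov}(u_t, u_s) = \begin{cases} \sigma_{u,t}^2 & \forall\, t = s \\ 0 & \forall\, t \neq s \end{cases}$$
$$\Omega = \begin{bmatrix} \sigma_{u,t+1}^2 & 0 & 0 \\ 0 & \sigma_{u,t+2}^2 & 0 \\ 0 & 0 & \sigma_{u,t+3}^2 \end{bmatrix}$$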
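With a diagonal error covariance matrix, GLS amounts to weighted least squares: dividing each observation by its error standard deviation and running OLS on the rescaled data gives the same estimate. A minimal NumPy sketch of that equivalence; the standard deviations in `sigma` and the data-generating values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 100
x = rng.normal(size=T)
sigma = np.linspace(0.5, 2.0, T)            # hypothetical error std devs, growing over time
y = 1.0 + 0.5 * x + sigma * rng.normal(size=T)
X = np.column_stack([np.ones(T), x])        # constant plus regressor

# GLS with Omega = diag(sigma_t^2)
omega = np.diag(sigma ** 2)
b_gls = np.linalg.solve(X.T @ np.linalg.solve(omega, X),
                        X.T @ np.linalg.solve(omega, y))

# Equivalent weighted least squares: rescale each row by 1 / sigma_t, then OLS
Xw = X / sigma[:, None]
yw = y / sigma
b_wls = np.linalg.lstsq(Xw, yw, rcond=None)[0]
print(b_gls, b_wls)
```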
Ordinary Least Squares (Serial Correlation)
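For the regression model $y_t = \alpha + \beta x_t + u_t$
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$$
$$\operatorname{cov}(u_t, u_s) = \rho^{|t-s|} \sigma_u^2$$
$$\Omega = \begin{bmatrix} \sigma_u^2 & \rho \sigma_u^2 & \rho^2 \sigma_u^2 \\ \rho \sigma_u^2 & \sigma_u^2 & \rho \sigma_u^2 \\ \rho^2 \sigma_u^2 & \rho \sigma_u^2 & \sigma_u^2 \end{bmatrix}$$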
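A covariance matrix whose (t, s) element is the error variance times rho raised to the power |t - s| can be built directly from the matrix of lags. A short sketch; the function name `ar1_covariance` and the example values (rho = 0.5, variance = 2.0) are illustrative only.

```python
import numpy as np

def ar1_covariance(T, rho, sigma2):
    """T x T matrix with element (t, s) equal to sigma2 * rho**|t - s|."""
    idx = np.arange(T)
    lags = np.abs(idx[:, None] - idx[None, :])   # matrix of |t - s|
    return sigma2 * rho ** lags

omega = ar1_covariance(3, 0.5, 2.0)
print(omega)
```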
Two-Dimensional Panel Data: OLS Assumptions
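For the regression model $y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$$\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$$
$$\operatorname{cov}(v_i, v_j) = \begin{cases} \sigma_v^2 & \forall\, i = j \\ 0 & \text{otherwise} \end{cases}$$
$$\operatorname{cov}(\lambda_t, \lambda_s) = \begin{cases} \sigma_\lambda^2 & \forall\, t = s \\ 0 & \text{otherwise} \end{cases}$$
$$\operatorname{cov}(u_{i,t}, u_{j,s}) = \begin{cases} \sigma_u^2 & \forall\, i = j \text{ and } t = s \\ 0 & \text{otherwise} \end{cases}$$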
Two-Dimensional Panel Data: OLS Assumptions
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$\phantom{y_{i,t}} = \alpha + \beta x_{i,t} + \varepsilon_{i,t}$
With three cross-sections and three time periods, stacked by cross-section:
$$\Omega = \begin{bmatrix} \Sigma & 0 & 0 \\ 0 & \Sigma & 0 \\ 0 & 0 & \Sigma \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 & 0 & 0 \\ 0 & \sigma_\varepsilon^2 & 0 \\ 0 & 0 & \sigma_\varepsilon^2 \end{bmatrix}$$
where each $0$ in $\Omega$ is a $3 \times 3$ block of zeros.
Two-Dimensional Panel Data: OLS (homogeneous serial correlation)
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$\phantom{y_{i,t}} = \alpha + \beta x_{i,t} + \varepsilon_{i,t}$
With three cross-sections and three time periods, stacked by cross-section:
$$\Omega = \begin{bmatrix} \Sigma & 0 & 0 \\ 0 & \Sigma & 0 \\ 0 & 0 & \Sigma \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} \sigma_\varepsilon^2 & \rho \sigma_\varepsilon^2 & \rho^2 \sigma_\varepsilon^2 \\ \rho \sigma_\varepsilon^2 & \sigma_\varepsilon^2 & \rho \sigma_\varepsilon^2 \\ \rho^2 \sigma_\varepsilon^2 & \rho \sigma_\varepsilon^2 & \sigma_\varepsilon^2 \end{bmatrix}$$
where each $0$ in $\Omega$ is a $3 \times 3$ block of zeros.
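When every cross-section shares the same serial correlation pattern and errors are uncorrelated across cross-sections, the panel covariance matrix is block-diagonal with one identical T x T block per cross-section. A sketch using a Kronecker product, assuming observations are stacked by cross-section; the function name `panel_ar1_covariance` and the parameter values are hypothetical.

```python
import numpy as np

def panel_ar1_covariance(N, T, rho, sigma2):
    """(N*T) x (N*T) block-diagonal covariance, rows ordered by cross-section then time."""
    idx = np.arange(T)
    block = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])   # one T x T serial-correlation block
    return np.kron(np.eye(N), block)                               # repeat the block along the diagonal

omega = panel_ar1_covariance(N=3, T=3, rho=0.5, sigma2=1.0)
print(omega.shape)
```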
Two-Dimensional Panel Data: OLS (heterogeneous serial correlation)
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$\phantom{y_{i,t}} = \alpha + \beta x_{i,t} + \varepsilon_{i,t}$
With three cross-sections and three time periods, stacked by cross-section:
$$\Omega = \begin{bmatrix} \Omega_1 & 0 & 0 \\ 0 & \Omega_2 & 0 \\ 0 & 0 & \Omega_3 \end{bmatrix}, \qquad
\Omega_i = \begin{bmatrix} \sigma_i^2 & \rho_i \sigma_i^2 & \rho_i^2 \sigma_i^2 \\ \rho_i \sigma_i^2 & \sigma_i^2 & \rho_i \sigma_i^2 \\ \rho_i^2 \sigma_i^2 & \rho_i \sigma_i^2 & \sigma_i^2 \end{bmatrix}$$
where each $0$ in $\Omega$ is a $3 \times 3$ block of zeros.
Two-Dimensional Panel Data: OLS (serial-spatial correlation)
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$\phantom{y_{i,t}} = \alpha + \beta x_{i,t} + \varepsilon_{i,t}$
With three cross-sections and three time periods, stacked by cross-section:
$$\Omega = \begin{bmatrix} \Omega_1 & \Omega_{1,2} & \Omega_{1,3} \\ \Omega_{1,2} & \Omega_2 & \Omega_{2,3} \\ \Omega_{1,3} & \Omega_{2,3} & \Omega_3 \end{bmatrix}$$
$$\Omega_i = \begin{bmatrix} \sigma_i^2 & \rho_i \sigma_i^2 & \rho_i^2 \sigma_i^2 \\ \rho_i \sigma_i^2 & \sigma_i^2 & \rho_i \sigma_i^2 \\ \rho_i^2 \sigma_i^2 & \rho_i \sigma_i^2 & \sigma_i^2 \end{bmatrix}, \qquad
\Omega_{i,j} = \begin{bmatrix} \sigma_{i,j} & \rho_{i,j} \sigma_{i,j} & \rho_{i,j}^2 \sigma_{i,j} \\ \rho_{i,j} \sigma_{i,j} & \sigma_{i,j} & \rho_{i,j} \sigma_{i,j} \\ \rho_{i,j}^2 \sigma_{i,j} & \rho_{i,j} \sigma_{i,j} & \sigma_{i,j} \end{bmatrix}$$
OLS vs. Panel Estimation
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$v_i \sim IIN(0, \sigma_v^2)$, $\lambda_t \sim IIN(0, \sigma_\lambda^2)$, $u_{i,t} \sim IIN_i(0, \sigma_u^2)$, $u_{i,t} \sim IIN_t(0, \sigma_u^2)$
$N = 35$, $T = 40$
$\beta = 0.5$
Estimation Procedure       Estimate    Standard Error    Regression $R^2$
OLS                        0.482       0.017             0.37
Cross-Sectional Effects    0.499       0.014             0.46
Time Effects               0.486       0.013             0.48
Both Effects               0.505       0.009             0.67
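A comparison of this kind can be sketched with simulated data. The code below uses the stated N = 35, T = 40 and slope 0.5, but everything else is an assumption: unit variances, alpha = 1.0, the random seed, and the estimator itself. Here both effects are removed by two-way demeaning (a within transformation); the slides do not say which estimator produced the table, and this sketch is not expected to reproduce its numbers.

```python
import numpy as np

rng = np.random.default_rng(42)
N, T, beta = 35, 40, 0.5                  # sizes and slope as stated on the slide
v = rng.normal(size=(N, 1))               # cross-sectional effects v_i
lam = rng.normal(size=(1, T))             # time effects
u = rng.normal(size=(N, T))               # idiosyncratic errors u_it
x = rng.normal(size=(N, T))
y = 1.0 + beta * x + v + lam + u          # alpha = 1.0 is an arbitrary choice

def slope(xm, ym):
    """Slope from a simple regression of ym on xm (with intercept)."""
    xd = xm - xm.mean()
    yd = ym - ym.mean()
    return float((xd * yd).sum() / (xd ** 2).sum())

def two_way_demean(a):
    """Subtract cross-section means and time means, add back the grand mean."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

b_pooled = slope(x, y)                                    # ignores both effects
b_both = slope(two_way_demean(x), two_way_demean(y))      # removes both effects
print(b_pooled, b_both)
```

The two-way demeaning removes any additive combination of a cross-sectional effect and a time effect exactly, which is why the second estimate is not contaminated by either.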
Fixed versus Random Effects
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
Under the random effects assumption, $v_i$ and $\lambda_t$ are treated as
stochastic.
Under the fixed effects assumption, they are treated as fixed in repeated
samples.
Random vs. Fixed Effects
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
Random Effects Assumption
Pro: Estimators are more efficient.
Con: Estimators are inconsistent if any of the three errors is not
$IIN(0, \sigma^2)$ across all dimensions.
Fixed Effects Assumption
Pro: Estimators are consistent regardless of $v_i$ and $\lambda_t$.
Con: Estimators are less efficient.
See the Hausman test for endogeneity.
Random vs. Fixed Cross-Sectional Effects
$y_{i,t} = \alpha + \beta x_{i,t} + v_i + \lambda_t + u_{i,t}$
$\lambda_t \sim IIN(0, \sigma_\lambda^2)$, $u_{i,t} \sim IIN_i(0, \sigma_u^2)$, $u_{i,t} \sim IIN_t(0, \sigma_u^2)$
$N = 35$, $T = 40$
$\beta = 0.5$
Estimation Procedure    Estimate    Standard Error    Regression $R^2$
OLS                     0.595       0.004             0.63
Random Effects          0.588       0.004             0.59
Fixed Effects           0.518       0.009             0.65
Test statistic = 22
Alternatives to Panel Techniques
Separate Regressions
For cross-section 1
$y_{1,t} = \alpha_1 + \beta_1 x_{1,t} + u_{1,t}$
For cross-section 2
$y_{2,t} = \alpha_2 + \beta_2 x_{2,t} + u_{2,t}$
etc.
Drawbacks
Less efficient estimators due to lost
information about cross-sectional error
covariance.
Removes the ability to restrict parameter
values across cross-sections.
Alternatives to Panel Techniques
Pooled Regression
Run standard OLS on
$y_{i,t} = \alpha + \beta x_{i,t} + u_{i,t}$
Drawbacks
Less efficient estimators due to lost
information about cross-sectional error
covariance.
Restricts parameter values to be equal
across cross-sections.
Alternatives to Panel Techniques
Pooled Regression with Cross-Sectional Dummies
Run standard OLS on
$y_{i,t} = \alpha_i + \beta x_{i,t} + u_{i,t}$
Drawbacks
This is the fixed effects panel technique.
If the cross-sectional dummies are IIN,
then parameter estimates are less
efficient than under the random effects
panel technique.
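The equivalence between pooled OLS with cross-sectional dummies and the fixed effects (within) estimator can be checked numerically. A small sketch on simulated data; the sizes, seed, and parameter values are hypothetical. The slope from the dummy-variable regression matches the slope from regressing demeaned y on demeaned x.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 5, 8
alpha_i = rng.normal(size=N)                          # one intercept per cross-section
x = rng.normal(size=(N, T))
y = alpha_i[:, None] + 0.5 * x + 0.1 * rng.normal(size=(N, T))

# Pooled OLS with one dummy per cross-section (no common constant).
# Rows of the stacked data are ordered by cross-section, then time.
dummies = np.kron(np.eye(N), np.ones((T, 1)))         # (N*T) x N indicator matrix
X = np.column_stack([x.ravel(), dummies])
coef = np.linalg.lstsq(X, y.ravel(), rcond=None)[0]
b_dummies = coef[0]                                   # slope on x

# Within estimator: subtract each cross-section's own mean, then simple OLS
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_within = (xd * yd).sum() / (xd ** 2).sum()
print(b_dummies, b_within)
```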
Procedures to use with panel data
Generalized least squares (GLS)
Generalized method of moments (GMM)
OLS with “automated” corrections for serial correlation, etc. is GLS.
Extra stuff
Panel data reveals information that is unattainable with non-panel data.
Three-Dimensional Structure of the ASA-NBER Data Set
Shock Occurrence vs. Shock Impact
These shocks all impact inflation in quarter 9 but occur in different quarters.
These shocks all occur in quarter 6 but
impact inflation in different quarters.
Shock Occurrence vs. Shock Impact
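Cumulative shocks
$$\hat{\lambda}_{th} = \frac{1}{N} \sum_{i=1}^{N} \left( F_{ith} - F_{i,t,h+1} \right)$$
Cross-sectional shocks
$$\hat{u}_{th} = \hat{\lambda}_{th} - \hat{\lambda}_{t,h-1}$$
Discrete shocks
$$\hat{v}_{th} = \hat{u}_{th} - \hat{u}_{t-1,h-1}$$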
Shock Occurrence vs. Shock Impact
Shock Measure                          Shocks Occur From                                          Shocks Impact Inflation From
Cumulative shocks $\lambda_{th}$       Beginning of quarter t – h to the end of quarter t.        Beginning of quarter t – h to the end of quarter t.
Cross-sectional shocks $u_{th}$        Beginning of quarter t – h to the end of quarter t – h.    Beginning of quarter t – h to the end of quarter t.
Discrete shocks $v_{th}$               Beginning of quarter t – h to the end of quarter t – h.    Beginning of quarter t to the end of quarter t.