Class 4-A. Minimum Distance Estimation

Part 4A: GMM-MDE[ 1/33]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business
Chamberlain's Model and Minimum Distance Estimation

Chamberlain (1984), "Panel Data," Handbook of Econometrics.
Innovation: treat the panel as a system of equations: SUR models. See Wooldridge, Ch. 7 through p. 172.
Assumptions:
- Balanced panel
- Minimal restrictions on the variances and covariances of the disturbances (zero means, finite fourth moments)
- Model the correlation between the effects and the regressors
Chamberlain (2)

y_it = α_i + x_it'β + ε_it,   each observation
y_i = α_i·i + X_i β + ε_i,    T observations for individual i (i is a T×1 column of ones)

Assuming no time-invariant variables in X_i. (To be picked up when we examine Hausman and Taylor.)
Re: Mundlak's treatment, E[α_i | X_i] ≠ 0.

α_i = δ_0 + Σ_{t=1..T} x_it'δ_t + w_i

This is not a regression; it changes with the next period's data. View it as the projection of α_i on x_i = (1, x_i1', x_i2', ..., x_iT').
Chamberlain (3) - Data

                  Period t = 1    2    ...   T
Individual i=1:   y_11  y_12  ...  y_1T
                  x_11  x_12  ...  x_1T
Individual i=2:   y_21  y_22  ...  y_2T
                  x_21  x_22  ...  x_2T
...
Individual i=N:   y_N1  y_N2  ...  y_NT
                  x_N1  x_N2  ...  x_NT

x_it = K variables
y_i  = T variables
x_i  = TK variables
Y = N×T matrix
X = N×TK matrix
Chamberlain (4) Model

y_it = α_i + x_it'β + ε_it,  E[ε_it | x_i] = 0,  E[ε_it ε_is | x_i] = σ_ts unrestricted
α_i = δ_0 + Σ_{t=1..T} x_it'δ_t + w_i
y_it = δ_0 + Σ_{t=1..T} x_it'δ_t + x_it'β + ε_it + w_i
     = δ_0 + x_i'π_t + v_it,  E[v_it | x_i] = 0,  E[v_it v_is | x_i] = σ_ts + σ_w² still unrestricted
Σ = Ω + σ_w² ii' is an unrestricted T×T covariance matrix.

SEEMINGLY UNRELATED REGRESSIONS
y_i1 = δ_0 + x_i'π_1 + v_i1   (Equation uses year 1 data, N observations)
y_i2 = δ_0 + x_i'π_2 + v_i2
...
y_iT = δ_0 + x_i'π_T + v_iT

Each equation has y for that year regressed on the x's from all years. There is a constant term plus T×K variables in each equation.
Chamberlain (5) SUR Model

The SUR system, arranged in a row:
y_i1 = δ_0 + x_i'π_1 + v_i1,
y_i2 = δ_0 + x_i'π_2 + v_i2,
...
y_iT = δ_0 + x_i'π_T + v_iT

(y_i1 y_i2 ... y_iT) = (1, x_i') Π + (v_i1 v_i2 ... v_iT),

where Π = [ δ_0  δ_0  ...  δ_0 ]
          [ π_1  π_2  ...  π_T ]

y_i' = (1, x_i') Π + v_i',  by rows, i = 1,...,N,
E[v_i | x_i] = 0,  E[v_i v_i' | x_i] = Σ
Chamberlain (6)

The columns of Π:

   Col 1      Col 2      Col 3     ...   Col T
   δ_0        δ_0        δ_0       ...   δ_0         constant, 1 row
   β + δ_1    δ_1        δ_1       ...   δ_1         (t = 1), K rows
   δ_2        β + δ_2    δ_2       ...   δ_2         (t = 2), K rows
   δ_3        δ_3        β + δ_3   ...   δ_3         (t = 3), K rows
   ...        ...        ...       ...   ...
   δ_T        δ_T        δ_T       ...   β + δ_T     (t = T), K rows
Chamberlain (7) Estimation of Σ

Regardless of how the columns of Π are estimated, the estimator of Σ is computed using sums of squares and cross products of the residuals from the T equations:

σ̂_ts = (1/N) Σ_{i=1..N} (y_it − x_i'π̂_t)(y_is − x_i'π̂_s)
Σ̂ = (1/N) Σ_{i=1..N} (y_i − Π̂'x_i)(y_i − Π̂'x_i)'
   = (1/N) [Y − XΠ̂]'[Y − XΠ̂]

The problem to be solved is how to estimate Π.
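A minimal numerical sketch of this computation (simulated data and invented dimensions, not Greene's code or the Cornwell-Rupert data): fit the T cross-section equations by OLS, then form Σ̂ from the residual cross products.

```python
import numpy as np

# Simulated stand-in for the Chamberlain SUR system:
# N individuals, T periods, K regressors per period.
rng = np.random.default_rng(0)
N, T, K = 500, 3, 2

X = rng.normal(size=(N, T * K))            # x_i = all periods' regressors
Z = np.hstack([np.ones((N, 1)), X])        # prepend the constant: (1, x_i')
Pi_true = rng.normal(size=(1 + T * K, T))
Y = Z @ Pi_true + rng.normal(size=(N, T))  # y_i' = (1, x_i') Pi + v_i'

# Equation-by-equation OLS (identical regressors, so OLS = GLS here).
Pi_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)

# Sigma-hat = (1/N) [Y - Z Pi_hat]' [Y - Z Pi_hat]
V = Y - Z @ Pi_hat
Sigma_hat = V.T @ V / N
print(Sigma_hat.shape)
```

With a unit-variance disturbance, the diagonal of `Sigma_hat` should sit near 1 for large N.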
Chamberlain (8) Estimation of Π

FGLS: use the usual two-step GLS estimator.
OLS: the system has an unrestricted covariance matrix and the same regressors in every equation, so GLS = FGLS = equation-by-equation OLS. Denote the T OLS coefficient vectors as
P = [p_1, p_2, p_3, ..., p_T].

Unconstrained OLS is consistent: plim p_t = π_t, t = 1,...,T.
OLS is inefficient: there are T(T−1) different estimates of β in P and T−1 estimates of each δ_t.

(The implied structure is the Π matrix of Chamberlain (6): δ_0 in the first row of every column, β + δ_t in block t of column t, and δ_s in blocks s ≠ t.)
Chamberlain Estimator: Application

Cornwell and Rupert:
lnWage_it = α_i + β_1 Exp_it + β_2 Exp²_it + β_3 Wks_it + ε_it
α_i is projected onto all 7 periods of Exp, Exp², and Wks.
For each of the 7 years, we regress lnWage_it on a constant and the three variables for all 7 years. Each regression has 22 coefficients.
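A small sketch of that setup (simulated stand-ins, not the actual Cornwell and Rupert file): with T = 7 years and K = 3 time-varying regressors, each yearly regression of lnWage on a constant plus all 7 years of the 3 variables has 1 + 7×3 = 22 coefficients.

```python
import numpy as np

# Simulated panel with the dimensions of the application.
rng = np.random.default_rng(0)
N, T, K = 595, 7, 3

X = rng.normal(size=(N, T * K))            # all 7 periods of Exp, Exp^2, Wks
Z = np.hstack([np.ones((N, 1)), X])        # constant + 21 regressors
Y = Z @ rng.normal(size=(1 + T * K, T)) + rng.normal(size=(N, T))

P = np.linalg.lstsq(Z, Y, rcond=None)[0]   # T columns of yearly OLS estimates
print(P.shape)                             # each column has 22 coefficients
```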
Chamberlain Estimator

(The Π coefficient matrix of Chamberlain (6), repeated for reference: δ_0 in the first row of each column, β + δ_t in block t of column t, and δ_s in blocks s ≠ t.)
Efficient Estimation of Π

Minimum Distance Estimation: Chamberlain (1984). (See Wooldridge, pp. 442-446.)
- Asymptotically efficient
- Assumes only finite fourth moments of v_it

Maximum Likelihood Estimation: Joreskog (1981), Greene (1981, 2008)
- Adds a normality assumption
- Identical asymptotic properties to the MDE (!)

Which is more convenient?
MDE-1

Cornwell and Rupert. Pooled, 7 years
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
|Constant|  5.25112359  |  .07128679     | 73.662 | .0000  |          |
|EXP     |   .04010465  |  .00215918     | 18.574 | .0000  |19.8537815|
|EXPSQ   |  -.00067338  |  .474431D-04   |-14.193 | .0000  |514.405042|
|WKS     |   .00421609  |  .00108137     |  3.899 | .0001  |46.8115246|
|OCC     |  -.14000934  |  .01465670     | -9.553 | .0000  | .51116447|
|IND     |   .04678864  |  .01179350     |  3.967 | .0001  | .39543818|
|SOUTH   |  -.05563737  |  .01252710     | -4.441 | .0000  | .29027611|
|SMSA    |   .15166712  |  .01206870     | 12.567 | .0000  | .65378151|
|MS      |   .04844851  |  .02056867     |  2.355 | .0185  | .81440576|
|FEM     |  -.36778522  |  .02509705     |-14.655 | .0000  | .11260504|
|UNION   |   .09262675  |  .01279951     |  7.237 | .0000  | .36398559|
|ED      |   .05670421  |  .00261283     | 21.702 | .0000  |12.8453782|
+--------+--------------+----------------+--------+--------+----------+
MDE-2

Cornwell and Rupert. Year 1
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
|Constant|  5.11054693  |  .13191639     | 38.741 | .0000  |          |
|EXP     |   .03199044  |  .00426736     |  7.497 | .0000  |16.8537815|
|EXPSQ   |  -.00057556  |  .00010715     | -5.372 | .0000  |400.282353|
|WKS     |   .00516535  |  .00183814     |  2.810 | .0050  |46.2806723|
|OCC     |  -.11540477  |  .02987160     | -3.863 | .0001  | .52436975|
|IND     |   .01473703  |  .02447046     |   .602 | .5470  | .39159664|
|SOUTH   |  -.05868033  |  .02588364     | -2.267 | .0234  | .29243697|
|SMSA    |   .18340943  |  .02526029     |  7.261 | .0000  | .66050420|
|MS      |   .07416736  |  .04493028     |  1.651 | .0988  | .82352941|
|FEM     |  -.30678002  |  .05378268     | -5.704 | .0000  | .11260504|
|UNION   |   .11046575  |  .02637235     |  4.189 | .0000  | .36134454|
|ED      |   .04757357  |  .00539679     |  8.815 | .0000  |12.8453782|
+--------+--------------+----------------+--------+--------+----------+
MDE-3

Cornwell and Rupert. Year 7
+--------+--------------+----------------+--------+--------+----------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]| Mean of X|
+--------+--------------+----------------+--------+--------+----------+
|Constant|  5.59009297  |  .19011263     | 29.404 | .0000  |          |
|EXP     |   .02938018  |  .00652410     |  4.503 | .0000  |22.8537815|
|EXPSQ   |  -.00048597  |  .00012680     | -3.833 | .0001  |638.527731|
|WKS     |   .00341276  |  .00267762     |  1.275 | .2025  |46.4521008|
|OCC     |  -.16152170  |  .03690729     | -4.376 | .0000  | .51260504|
|IND     |   .08466281  |  .02916370     |  2.903 | .0037  | .40504202|
|SOUTH   |  -.05876312  |  .03090689     | -1.901 | .0573  | .29243697|
|SMSA    |   .16619142  |  .02955099     |  5.624 | .0000  | .64201681|
|MS      |   .09523724  |  .04892770     |  1.946 | .0516  | .80504202|
|FEM     |  -.32455710  |  .06072947     | -5.344 | .0000  | .11260504|
|UNION   |   .10627809  |  .03167547     |  3.355 | .0008  | .36638655|
|ED      |   .05719350  |  .00659101     |  8.678 | .0000  |12.8453782|
+--------+--------------+----------------+--------+--------+----------+
MDE-4

How to combine two estimates of β_ED?
Year 1: .04757357 = b_1  [consistent]
Year 7: .05719350 = b_7  [consistent]

Minimize: (b_1 − β_ED)² + (b_7 − β_ED)² = (.04757357 − β_ED)² + (.05719350 − β_ED)²

Equivalent to minimizing

  (b_1 − β_ED, b_7 − β_ED) I⁻¹ (b_1 − β_ED, b_7 − β_ED)'

Solution: β̂_ED = w_1 b_1 + w_7 b_7,  with w_1 = 1/2, w_7 = 1 − w_1.
MDE-5

How to combine two estimates of β_ED?
Year 1: .04757357 = b_1, standard error = .00539679 = s_1  [consistent]
Year 7: .05719350 = b_7, standard error = .00659101 = s_7  [consistent]

Minimize the variance-weighted distance:

  ((b_1 − β_ED)/.00539679)² + ((b_7 − β_ED)/.00659101)²

Equivalent to minimizing

  (b_1 − β_ED, b_7 − β_ED) [ s_1²  0 ; 0  s_7² ]⁻¹ (b_1 − β_ED, b_7 − β_ED)'

Solution: β̂_ED = w_1 b_1 + w_7 b_7,  w_1 = (1/s_1²) / (1/s_1² + 1/s_7²),  w_7 = 1 − w_1
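A minimal sketch of the inverse-variance weighting, using the ED coefficients and standard errors from the Year 1 and Year 7 regressions (covariance between the two estimates ignored at this step):

```python
# Year 1 and Year 7 estimates of the ED coefficient with their
# standard errors, as reported on the slides.
b1, s1 = 0.04757357, 0.00539679
b7, s7 = 0.05719350, 0.00659101

w1 = (1 / s1**2) / (1 / s1**2 + 1 / s7**2)   # weight on the more precise estimate
w7 = 1 - w1
b_mde = w1 * b1 + w7 * b7

print(round(w1, 4), round(b_mde, 5))
```

The more precisely estimated Year 1 coefficient gets the larger weight (about 0.6), and the combined estimate lands between the two.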
MDE-6

Seemingly Unrelated Regressions model:
lnWage_i,1 = x_i,1'β + ε_i,1   (Year 1 regression)
lnWage_i,7 = x_i,7'β + ε_i,7   (Year 7 regression)
Same β in both regressions.

Asy.Var[b_t] = σ_tt (X_t'X_t)⁻¹,  t = 1 and 7
Asy.Cov[b_1, b_7] = σ_17 (X_1'X_1)⁻¹ (X_1'X_7) (X_7'X_7)⁻¹
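A sketch of these two formulas in code, with simulated year-1 and year-7 design matrices and assumed disturbance (co)variances (none of these numbers come from the actual data):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 200, 4
X1 = rng.normal(size=(N, K))     # year 1 regressors (simulated)
X7 = rng.normal(size=(N, K))     # year 7 regressors (simulated)
s11, s77, s17 = 1.0, 1.2, 0.6    # assumed sigma_11, sigma_77, sigma_17

V1 = s11 * np.linalg.inv(X1.T @ X1)    # Asy.Var[b1]
V7 = s77 * np.linalg.inv(X7.T @ X7)    # Asy.Var[b7]
# Asy.Cov[b1, b7] = sigma_17 (X1'X1)^-1 (X1'X7) (X7'X7)^-1
C17 = s17 * np.linalg.inv(X1.T @ X1) @ (X1.T @ X7) @ np.linalg.inv(X7.T @ X7)
print(V1.shape, C17.shape)
```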
MDE-7

(Estimated covariance matrix for (b_1, b_7) from the SUR model, partitioned into blocks S_11, S_12, S_21, S_22; the ED elements of this matrix are used in MDE-8.)
MDE-8

How to combine two estimates of β_ED?
Year 1: .04757357 = b_1
Year 7: .05719350 = b_7

Minimize the variance- and covariance-weighted distance from the SUR model. Equivalent to minimizing

  d' V⁻¹ d,   d = (b_1 − β_ED, b_7 − β_ED)',   V = [ .0000291254  .0000189242 ]
                                                   [ .0000189242  .0000434414 ]

Solution: β̂_ED = w_1 b_1 + w_7 b_7,  w_1 = (s_77 − s_17) / (s_11 − 2s_17 + s_77),  w_7 = 1 − w_1
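A sketch of the covariance-weighted (GLS) combination, using the SUR covariance matrix shown above. The weights come from ι'V⁻¹ normalized to sum to one, which reproduces the w_1 formula on the slide:

```python
import numpy as np

b = np.array([0.04757357, 0.05719350])          # (b_1, b_7) for ED
V = np.array([[0.0000291254, 0.0000189242],
              [0.0000189242, 0.0000434414]])    # Var-Cov of (b_1, b_7)

iota = np.ones(2)
Vinv = np.linalg.inv(V)
w = Vinv @ iota / (iota @ Vinv @ iota)          # GLS weights, sum to 1
b_mde = float(w @ b)

print(np.round(w, 4), round(b_mde, 5))
```

Compared with the diagonal weighting of MDE-5, accounting for the positive covariance shifts more weight onto the Year 1 estimate (roughly 0.71 instead of 0.60).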
MDE-9

Two coefficient estimators, b_1 and b_7. Both estimate the same parameter vector, β. How to combine them? Use a minimum distance estimator:

Minimize  [ (b_1 − β)', (b_7 − β)' ] [ W_11  W_17 ]⁻¹ [ b_1 − β ]
                                     [ W_17  W_77 ]   [ b_7 − β ]

Any W may be used as long as the matrix is positive definite.
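A minimal sketch of the vector case: two K×1 estimates of the same β combined through a positive definite weighting matrix W. The data and W are made up for illustration (not the Cornwell-Rupert application); the closed form minimizes (b − Aβ)'W⁻¹(b − Aβ).

```python
import numpy as np

rng = np.random.default_rng(1)
K = 3
beta = np.array([1.0, -0.5, 0.25])
b1 = beta + 0.05 * rng.normal(size=K)            # two noisy, consistent estimates
b7 = beta + 0.05 * rng.normal(size=K)

b = np.concatenate([b1, b7])                     # stacked estimates, 2K x 1
A = np.vstack([np.eye(K), np.eye(K)])            # both subvectors estimate beta
W = np.diag(rng.uniform(0.5, 2.0, size=2 * K))   # any positive definite W works

Winv = np.linalg.inv(W)
beta_mde = np.linalg.solve(A.T @ Winv @ A, A.T @ Winv @ b)
print(np.round(beta_mde, 3))
```

With a diagonal W, each component of the MDE is a convex combination of the corresponding components of b_1 and b_7; a non-diagonal W mixes components as well.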
Minimum Distance Estimation

p = stacked OLS estimates; each subvector p_s,t is K×1:
p = [(a_01, p_1,1', p_1,2', ..., p_1,T'), (a_02, p_2,1', p_2,2', ..., p_2,T'), ..., (a_0T, p_T,1', p_T,2', ..., p_T,T')]'
  = [column 1 of P, column 2 of P, ..., column T of P]
No restrictions were imposed on the T²K + T elements of p.

π = stacked true parameters from the Π matrix:
π = {[δ_0, (β+δ_1)', δ_2', ..., δ_T'], [δ_0, δ_1', (β+δ_2)', ..., δ_T'], ..., [δ_0, δ_1', δ_2', ..., (β+δ_T)']}'
  = [column 1 of Π, column 2 of Π, ..., column T of Π]

The Minimum Distance Estimator (MDE) seeks the (T+1)K + 1 values of θ (i.e., δ_0, β, δ_1, δ_2, ..., δ_T) that are closest to the T²K + T elements of p.
Carey Hospital Cost Model

(Figure omitted in this text version.)

Multiple Estimates (25) of 10 Structural Parameters

(Figure omitted in this text version.)
Appendix I. Chamberlain Model Algebra
MDE (2)

Asy.Var[p] = (1/N) Σ ⊗ Φ_xx⁻¹, estimated with G = (1/N) Σ̂ ⊗ (X'X/N)⁻¹.

The MDE is found by minimizing, with respect to (β, δ_1, δ_2, ..., δ_T),

  [p − π(β, δ_1, δ_2, ..., δ_T)]' G⁻¹ [p − π(β, δ_1, δ_2, ..., δ_T)]

subject to all the restrictions. (There are T²K − K(T+1) of them.) (This is not GMM.)

After estimation, Σ̂ is recomputed. Σ̂_MD will be larger than Σ̂_OLS: (1) least squares is least squares; (2) the restrictions increase the size of the variance matrix. "Larger" means that Σ̂_MD − Σ̂_OLS is nonnegative definite. Est.Asy.Var[π̂_MD] = Σ̂_MD ⊗ (X'X)⁻¹.
MDE (3)

Obtaining the asymptotic covariance matrix for the MDE:

π = {[(β+δ_1)', δ_2', ..., δ_T'], [δ_1', (β+δ_2)', ..., δ_T'], ..., [δ_1', δ_2', ..., (β+δ_T)']}'
  = T²K functions of the parameters
θ = (β, δ_1, δ_2, ..., δ_T) = K(T+1) actual parameters

D = ∂π/∂θ' = T²K × K(T+1) matrix of derivatives, all 1s and 0s.

Est.Asy.Var[θ̂] = [D'(Σ̂⁻¹ ⊗ X'X) D]⁻¹
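A sketch of the derivative matrix D for small T and K (slope coefficients only, constants omitted; dimensions invented for illustration). Since π is linear in θ, D contains only 0s and 1s, and Dθ reproduces π exactly:

```python
import numpy as np

T, K = 3, 2
I_K = np.eye(K)

# theta = (beta, delta_1, ..., delta_T); pi = stacked columns of Pi.
blocks = []
for col in range(T):                 # column t of Pi holds delta_s everywhere,
    for row in range(T):             # plus beta in block row s = t
        d_beta = I_K if row == col else np.zeros((K, K))
        d_delta = [I_K if s == row else np.zeros((K, K)) for s in range(T)]
        blocks.append(np.hstack([d_beta] + d_delta))
D = np.vstack(blocks)                # T^2 K x K(T+1), all 0s and 1s

# Check: D @ theta reproduces pi for an arbitrary beta and deltas.
rng = np.random.default_rng(0)
theta = rng.normal(size=K * (T + 1))
beta, deltas = theta[:K], theta[K:].reshape(T, K)
pi = np.concatenate([deltas[row] + (beta if row == col else 0)
                     for col in range(T) for row in range(T)])
print(D.shape, np.allclose(D @ theta, pi))
```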
Maximum Likelihood Estimation

Maximum likelihood estimation assuming normality:
y_i = Π'x_i + v_i,  v_i ~ N[0, Σ]
σ̂_ts = (1/N) Σ_{i=1..N} (y_it − x_i'π̂_t)(y_is − x_i'π̂_s), given (π̂_t, π̂_s) as estimated; Σ̂ = [σ̂_ts].

Log likelihood:  logL = −(N/2)[T log 2π + log|Σ| + trace(Σ⁻¹Σ̂)]
Proof in Greene (pp. 347-349).

∂logL/∂Σ ∝ Σ⁻¹(Σ − Σ̂)Σ⁻¹ (no surprise), so the ML solution for Σ is Σ̂, as might be expected, whatever the solution for Π is.
MLE (2)

Inserting the solution for Σ back into the log likelihood produces the concentrated log likelihood function

  logL_c = −(N/2)[T + T log 2π + log|Σ̂|]

which is a function only of Π, that is, of β and δ. The function to be minimized is just (1/2) log|Σ̂|. The estimator of the asymptotic covariance matrix for the MLE is identical to that for the MDE.
Rearrange the Panel Data

[ y_i1 ]   [ x_i1'  1  x_i1'  x_i2'  ...  x_iT' ] [ β   ]   [ v_i1 ]
[ y_i2 ] = [ x_i2'  1  x_i1'  x_i2'  ...  x_iT' ] [ δ_0 ] + [ v_i2 ]
[  ... ]   [  ...                               ] [ δ_1 ]   [  ... ]
[ y_iT ]   [ x_iT'  1  x_i1'  x_i2'  ...  x_iT' ] [ δ_2 ]   [ v_iT ]
                                                  [ ... ]
                                                  [ δ_T ]

(T rows; block widths K, 1, K, ..., K, for (T+1)K + 1 columns in all.)
Generalized Regression Model

y_i = X_i0 θ + v_i,  E[v_i | X_i0] = 0,  E[v_i v_i' | X_i0] = Σ

[ y_1 ]   [ X_10 ]     [ v_1 ]
[ ... ] = [  ... ] θ + [ ... ]
[ y_N ]   [ X_N0 ]     [ v_N ]

y = X_0 θ + v,  E[v | X_0] = 0,  E[vv' | X_0] = Ω = [ Σ  0  ...  0 ]
                                                    [ 0  Σ  ...  0 ]
                                                    [ ...        . ]
                                                    [ 0  0  ...  Σ ]
Least Squares

b = [(1/N) Σ_{i=1..N} X_i0'X_i0]⁻¹ (1/N) Σ_{i=1..N} X_i0'y_i

plim b = θ + plim [(1/N) Σ_{i=1..N} X_i0'X_i0]⁻¹ plim [(1/N) Σ_{i=1..N} X_i0'v_i]
       = θ + plim [(1/N) Σ_{i=1..N} X_i0'X_i0]⁻¹ [0]

Asy.Var[b | X] = (1/N) [(1/N) Σ_i X_i0'X_i0]⁻¹ [(1/N) Σ_i X_i0'ΣX_i0] [(1/N) Σ_i X_i0'X_i0]⁻¹

Asymptotics are standard for OLS in a GR model. (Text, Sec. 7.3)
GLS and FGLS

θ̂ = [Σ_{i=1..N} X_i0'Σ⁻¹X_i0]⁻¹ [Σ_{i=1..N} X_i0'Σ⁻¹y_i]

(See Wooldridge, Section 7.4 for properties.)

FGLS: use the OLS residuals:

Σ̂ = (1/N) Σ_{i=1..N} (y_i − X_i0 θ̂_OLS)(y_i − X_i0 θ̂_OLS)'