Part 7: Regression Extensions [ 1/59]
Econometric Analysis of Panel Data
William Greene, Department of Economics, Stern School of Business

Part 7: Regression Extensions [ 2/59]
Regression Extensions
Time Varying Fixed Effects
Heteroscedasticity (Baltagi, 5.1)
Autocorrelation (Baltagi, 5.2)
Measurement Error (Baltagi, 10.1)
Spatial Autoregression and Autocorrelation (Baltagi, 10.5)

Part 7: Regression Extensions [ 3/59]
Time Varying Effects Models: Random Effects
y_it = β′x_it + a_i(t) + ε_it
y_it = β′x_it + u_i g(t,θ) + ε_it
A heteroscedastic random effects model; stochastic frontiers literature – Battese and Coelli (1992).
Var[u_i g(t,θ) + ε_it] = σ_ε² I + σ_u² G, where G has elements g(t,θ)g(s,θ):
G = [ g(1,θ)²        g(1,θ)g(2,θ)   ...  g(1,θ)g(T,θ)
      g(2,θ)g(1,θ)   g(2,θ)²        ...  g(2,θ)g(T,θ)
      ...            ...            ...  ...
      g(T,θ)g(1,θ)   g(T,θ)g(2,θ)   ...  g(T,θ)²      ]

Part 7: Regression Extensions [ 4/59]
Time Varying Effects Models: Time Varying Fixed Effects, Additive
y_it = β′x_it + a_i(t) + ε_it
y_it = β′x_it + a_i + c_t + ε_it, with a_i(t) = a_i + c_t, t = 1,…,T
The two way fixed effects model. Now standard in fixed effects modeling.

Part 7: Regression Extensions [ 5/59]
Time Varying Effects Models: Time Varying Fixed Effects, Additive Polynomial
y_it = β′x_it + a_i(t) + ε_it
y_it = β′x_it + a_i0 + a_i1 t + a_i2 t² + ε_it
Let W_i = [1, t, t²], T×3, and let A_i = the stack of the W_i with 0s inserted elsewhere.
Use OLS and Frisch–Waugh; extend the "within" estimator. Note A_i′A_j = 0 for all i ≠ j.
β̂ = [Σ_{i=1}^N X_i′M_{W,i}X_i]^{-1} [Σ_{i=1}^N X_i′M_{W,i}y_i]
α̂_i = (W_i′W_i)^{-1}W_i′(y_i − X_iβ̂)
See Cornwell, Schmidt and Sickles (1990) (frontiers literature).

Part 7: Regression Extensions [ 6/59]
Time Varying Effects Models: Time Varying Fixed Effects, Multiplicative
y_it = β′x_it + a_i(t) + ε_it
y_it = β′x_it + α_i δ_t + ε_it
Not estimable as written; needs a normalization: δ_1 = 1.
An EM iteration (Chen (2015)):
1. Choose starting values β̂(1), α̂(1) and δ̂(1). (The linear FEM for β and α, and (1,0,…) for δ, for example.)
2.1 β̂(k+1) = [Σ_{i,t} x_it x_it′]^{-1} [Σ_{i,t} x_it (y_it − α̂_i(k) δ̂_t(k))]
2.2 α̂_i(k+1) = Σ_t δ̂_t(k) (y_it − x_it′β̂(k+1)) / Σ_t [δ̂_t(k)]²
2.3 δ̂_t(k+1) = Σ_{i=1}^N α̂_i(k+1) (y_it − x_it′β̂(k+1)) / Σ_{i=1}^N [α̂_i(k+1)]²
3.
Iterate to convergence.
(a) What does this converge to? The MLE under normality.
(b) How to compute standard errors? The Hessian. There is no incidental parameters problem for the linear model.

Part 7: Regression Extensions [ 7/59]
Generalized Regression: Accommodating Autocorrelation (and Heteroscedasticity)
Fixed Effects: y_it = x_it′β + α_i + ε_it; y_i = [X_i D_i](β′,α′)′ + ε_i
Var[ε_i | X_i, D_i] = Σ_i = Ω_i (dimension T_i×T_i), Σ_i positive definite.
Random Effects: y_it = x_it′β + u_i + ε_it; y_i = X_iβ + u_i·i + ε_i
Var[u_i·i + ε_i | X_i] = σ_u² ii′ + Σ_i = Ω_i (dimension T_i×T_i).

Part 7: Regression Extensions [ 8/59]
OLS Estimation
Fixed Effects: y_i = [X_i D_i](β′,α′)′ + ε_i = Z_i^F θ^F + w_i^F
Random Effects: y_i = X_iβ + u_i·i + ε_i = Z_i^R θ^R + w_i^R
Least squares coefficient estimator, M = F or R:
θ̂_M = [Σ_{i=1}^N Z_i^M′Z_i^M]^{-1} [Σ_{i=1}^N Z_i^M′y_i]
Robust covariance matrix based on the White estimator:
Est.Asy.Var[θ̂_M] = [Σ_i Z_i^M′Z_i^M]^{-1} [Σ_i Z_i^M′ŵ_i^M ŵ_i^M′Z_i^M] [Σ_i Z_i^M′Z_i^M]^{-1}

Part 7: Regression Extensions [ 9/59]
GLS Estimation
Fixed Effects: no natural format (yet).
Random Effects: Ω_i = σ_u² ii′ + σ_ε² I = σ_ε² [I + (σ_u²/σ_ε²) ii′] = σ_ε² Φ_i
(Feasible) generalized least squares:
θ̂_R = [Σ_i Z_i^R′Φ̂_i^{-1}Z_i^R]^{-1} [Σ_i Z_i^R′Φ̂_i^{-1}y_i],  Φ̂_i^{-1} = I − [σ̂_u² / (σ̂_ε² + T_i σ̂_u²)] ii′
Est.Asy.Var[θ̂_R] = σ̂_ε² [Σ_i Z_i^R′Φ̂_i^{-1}Z_i^R]^{-1}

Part 7: Regression Extensions [ 10/59]
Heteroscedasticity
Naturally expected in microeconomic data, less so in macroeconomic.
Model platforms:
Fixed effects: y_it = α_i + x_it′β + ε_it, E[ε_it² | X_i] = σ_ε,it²
Random effects: y_it = x_it′β + u_i + ε_it, E[ε_it² | X_i] = σ_ε,it², E[u_i² | X_i] = σ_u,i²
Estimation: OLS with (or without) robust covariance matrices; GLS and FGLS; maximum likelihood.

Part 7: Regression Extensions [ 11/59]
Baltagi and Griffin's Gasoline Data
World Gasoline Demand Data, 18 OECD countries, 19 years. Variables in the file are:
COUNTRY = name of country
YEAR = year, 1960-1978
LGASPCAR = log of consumption per car
LINCOMEP = log of per capita income
LRPMG = log of real price of gasoline
LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data.
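The pooled OLS estimator with the White-style robust covariance matrix of slide 8 can be sketched numerically. This is a minimal numpy illustration of the sandwich formula (Z′Z)⁻¹[Σ_i Z_i′ŵ_iŵ_i′Z_i](Z′Z)⁻¹; the function name and the synthetic panel are my own, not from the course:

```python
import numpy as np

def ols_cluster_robust(y, X, groups):
    """Pooled OLS with the group-clustered White covariance of slide 8:
    Est.Asy.Var = (Z'Z)^-1 [sum_i Z_i' w_i w_i' Z_i] (Z'Z)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        s = X[idx].T @ e[idx]          # Z_i' w_i, the K x 1 score for group i
        meat += np.outer(s, s)
    V = XtX_inv @ meat @ XtX_inv
    return b, V

# small synthetic panel: 18 groups, 19 periods (mimicking the gasoline data shape)
rng = np.random.default_rng(0)
N, T, K = 18, 19, 3
groups = np.repeat(np.arange(N), T)
X = np.column_stack([np.ones(N * T), rng.normal(size=(N * T, K - 1))])
u = np.repeat(rng.normal(size=N), T)   # group effect -> within-group correlation
y = X @ np.array([1.0, 0.5, -0.3]) + u + rng.normal(size=N * T)
b, V = ols_cluster_robust(y, X, groups)
se = np.sqrt(np.diag(V))
```

Because the weighting uses whole-group scores rather than observation-by-observation squares, the estimator remains valid under the within-group correlation that the random effect u_i induces.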
The article on which the analysis is based is Baltagi, B. and Griffin, J., "Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983, pp. 117-137. The data were downloaded from the website for Baltagi's text.

Part 7: Regression Extensions [ 12/59]
Heteroscedastic Gasoline Data
[Figure: LGASPCAR plotted by country — Baltagi–Griffin gasoline data, 18 OECD countries, 19 years]

Part 7: Regression Extensions [ 13/59]
LSDV Residuals
[Figure: country specific residuals from the LSDV regression, plotted by country]

Part 7: Regression Extensions [ 14/59]
Evidence of Country Specific Heteroscedasticity
[Figure: country specific residual variances, plotted by country]

Part 7: Regression Extensions [ 15/59]
Heteroscedasticity in the FE Model
Ordinary least squares: within groups estimation as usual. Standard treatment – this is just a (large) linear regression model.
b = [Σ_{i=1}^N X_i′M_D^i X_i]^{-1} [Σ_{i=1}^N X_i′M_D^i y_i]
Var[b | X] = [Σ_i X_i′M_D^i X_i]^{-1} [Σ_i Σ_{t=1}^{T_i} σ_ε,it² (x_it − x̄_i)(x_it − x̄_i)′] [Σ_i X_i′M_D^i X_i]^{-1}
White robust covariance matrix estimator:
Est.Var[b | X] = [Σ_i X_i′M_D^i X_i]^{-1} [Σ_i Σ_{t=1}^{T_i} e_it² (x_it − x̄_i)(x_it − x̄_i)′] [Σ_i X_i′M_D^i X_i]^{-1}

Part 7: Regression Extensions [ 16/59]
Narrower Assumptions
Constant variance within the group: y_it = α_i + x_it′β + ε_it, E[ε_it² | X_i] = σ_ε,i²
White robust covariance matrix estimator – no change:
Est.Var[b | X] = [Σ_i X_i′M_D^i X_i]^{-1} [Σ_i Σ_{t=1}^{T_i} e_it² (x_it − x̄_i)(x_it − x̄_i)′] [Σ_i X_i′M_D^i X_i]^{-1}
Modified estimator – use the within group constancy of the variance:
Var[b | X] = [Σ_i X_i′M_D^i X_i]^{-1} [Σ_i σ_ε,i² X_i′M_D^i X_i] [Σ_i X_i′M_D^i X_i]^{-1}
Est.Var[b | X] = [Σ_i X_i′M_D^i X_i]^{-1} [Σ_i ((1/T_i) Σ_{t=1}^{T_i} e_it²) X_i′M_D^i X_i] [Σ_i X_i′M_D^i X_i]^{-1}
Does it matter?
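The two covariance matrices for the within estimator — the observation-level White estimator and the modified estimator that exploits within-group constancy of the variance — can be compared side by side. A numpy sketch with illustrative names, not the course software:

```python
import numpy as np

def within_white(y, X, groups):
    """Within (fixed effects) estimator with two covariance matrices:
    (1) the White estimator, sum_it e_it^2 (x-xbar)(x-xbar)';
    (2) the modified estimator using group variances sigma_i^2 * Xd_i'Xd_i."""
    ids = np.unique(groups)
    Xd, yd = X.copy(), y.copy()
    for g in ids:                          # within (group-demeaning) transform
        m = groups == g
        Xd[m] -= Xd[m].mean(axis=0)
        yd[m] -= yd[m].mean()
    A_inv = np.linalg.inv(Xd.T @ Xd)
    b = A_inv @ (Xd.T @ yd)
    e = yd - Xd @ b
    white = Xd.T @ (Xd * e[:, None] ** 2)              # sum e^2 (x-xbar)(x-xbar)'
    grouped = sum(np.mean(e[groups == g] ** 2) *       # sigma_i^2 * Xd_i'Xd_i
                  Xd[groups == g].T @ Xd[groups == g] for g in ids)
    return b, A_inv @ white @ A_inv, A_inv @ grouped @ A_inv

# groupwise-heteroscedastic fixed effects data, 18 groups x 19 periods
rng = np.random.default_rng(6)
N, T = 18, 19
groups = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, 2))
sig = np.repeat(rng.uniform(0.5, 2.0, N), T)   # sigma_eps,i varies by group
alpha = np.repeat(rng.normal(size=N), T)
y = alpha + X @ np.array([0.7, -0.4]) + sig * rng.normal(size=N * T)
b, V_white, V_grouped = within_white(y, X, groups)
```

Comparing the diagonals of the two matrices on data like these gives a direct answer to the slide's "does it matter?" question for a particular sample.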
Part 7: Regression Extensions [ 17/59]
Heteroscedasticity in Gasoline Data
+----------------------------------------------------+
| Least Squares with Group Dummy Variables           |
| LHS=LGASPCAR  Mean               =   4.296242      |
| Fit           R-squared          =   .9733657      |
|               Adjusted R-squared =   .9717062      |
+----------------------------------------------------+
Least Squares - Within
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 LINCOMEP   .66224966      .07338604        9.024    .0000   -6.13942544
 LRPMG     -.32170246      .04409925       -7.295    .0000    -.52310321
 LCARPCAP  -.64048288      .02967885      -21.580    .0000   -9.04180473
+---------+--------------+----------------+--------+---------+----------+
White Estimator
+---------+--------------+----------------+--------+---------+----------+
 LINCOMEP   .66224966      .07277408        9.100    .0000   -6.13942544
 LRPMG     -.32170246      .05381258       -5.978    .0000    -.52310321
 LCARPCAP  -.64048288      .03876145      -16.524    .0000   -9.04180473
+---------+--------------+----------------+--------+---------+----------+
White Estimator using Grouping
+---------+--------------+----------------+--------+---------+----------+
 LINCOMEP   .66224966      .06238100       10.616    .0000   -6.13942544
 LRPMG     -.32170246      .05197389       -6.190    .0000    -.52310321
 LCARPCAP  -.64048288      .03035538      -21.099    .0000   -9.04180473
+---------+--------------+----------------+--------+---------+----------+

Part 7: Regression Extensions [ 18/59]
Feasible GLS
FGLS requires the narrower assumption; estimation of σ_ε,it² is not feasible. (Same as in the cross section model.)
E[ε_it² | X_i] = σ_ε,i²;  Var[ε_i | X_i] = σ_ε,i² I = Ω_i
β̂ = [Σ_i X_i′M_D^i Ω_i^{-1}M_D^i X_i]^{-1} [Σ_i X_i′M_D^i Ω_i^{-1}M_D^i y_i]
  = [Σ_i (1/σ̂_ε,i²) X_i′M_D^i X_i]^{-1} [Σ_i (1/σ̂_ε,i²) X_i′M_D^i y_i]
= weighted within groups LS with constant weights within groups.
σ̂_ε,i² = Σ_{t=1}^{T_i} e_it² / T_i
α̂_i = ȳ_i − x̄_i′β̂ (not a function of σ̂_ε,i²; proof left to the reader).
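The weighted within-groups FGLS just described can be sketched as a two-step procedure: within (LSDV) residuals give the group variances, then the within estimator is recomputed with weight 1/σ̂_ε,i² constant within each group. A hedged numpy sketch (the function name and the synthetic data are my own):

```python
import numpy as np

def fgls_groupwise(y, X, groups):
    """Two-step FGLS of slide 18: (1) within residuals give
    sigma_i^2 = (1/T_i) sum_t e_it^2; (2) weighted within-groups LS.
    A sketch, not the exact software implementation."""
    ids = np.unique(groups)
    Xd, yd = X.copy(), y.copy()
    for g in ids:                          # step 1: within transformation
        m = groups == g
        Xd[m] -= Xd[m].mean(axis=0)
        yd[m] -= yd[m].mean()
    b_within = np.linalg.lstsq(Xd, yd, rcond=None)[0]
    e = yd - Xd @ b_within
    K = X.shape[1]
    A, c = np.zeros((K, K)), np.zeros(K)
    for g in ids:                          # step 2: reweight by group variance
        m = groups == g
        s2 = np.mean(e[m] ** 2)            # sigma_hat_eps,i^2
        A += Xd[m].T @ Xd[m] / s2
        c += Xd[m].T @ yd[m] / s2
    return np.linalg.solve(A, c)

# synthetic panel with group-specific error variances
rng = np.random.default_rng(1)
N, T = 18, 19
groups = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, 2))
sig = np.repeat(rng.uniform(0.5, 2.0, N), T)
alpha = np.repeat(rng.normal(size=N), T)
y = alpha + X @ np.array([0.7, -0.4]) + sig * rng.normal(size=N * T)
b_fgls = fgls_groupwise(y, X, groups)
```

Note that X carries no constant column here: the within transformation annihilates it, exactly as M_D does on the slide.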
Part 7: Regression Extensions [ 19/59]
Does Teaching Load Affect Faculty Size?
Becker, W.E., Greene, W. and Siegfried, J.J., "Do Undergraduate Majors or PhD Students Affect Faculty Size?" American Economist, 56(1), 2011, pp. 69-77.

Part 7: Regression Extensions [ 20/59]
Random Effects Regressions

Part 7: Regression Extensions [ 21/59]
Modeling the Scedastic Function
Suppose σ_ε,i² is a function of z_i, e.g., σ_ε,i² = σ² f(z_i′δ). Any consistent estimator of σ_ε,i² = σ² f(z_i′δ) brings full efficiency to FGLS. E.g., σ_ε,i² = σ² exp(z_i′δ).
Estimate using ordinary least squares applied to
log(e_it²) = log σ² + z_i′δ + w_it
Second step: FGLS using these estimates.

Part 7: Regression Extensions [ 22/59]
Two Step Estimation
Benefit to modeling the scedastic function vs. using the robust estimator:
σ̂_ε,iM² = σ̂² exp(z_i′δ̂)  vs.  σ̂_ε,iR² = Σ_{t=1}^{T_i} e_it² / T_i
The group mean estimator σ̂_ε,iR² is inconsistent if T is fixed, consistent in T. The model based σ̂_ε,iM² is consistent in N; fixed T is irrelevant.
Does it matter? Do the two matrices converge to the same matrix?
(1/N) Σ_i (1/σ̂_ε,iR²) X_i′M_D^i X_i  vs.  (1/N) Σ_i (1/σ̂_ε,iM²) X_i′M_D^i X_i
It is very unlikely to matter very much.
What if we use Harvey's maximum likelihood estimator instead of LS? Unlikely to matter; in Harvey's model, the OLS estimator is consistent in NT.
Downside: what if the specified function is wrong? Probably still doesn't matter much.

Part 7: Regression Extensions [ 23/59]
Heteroscedasticity in the RE Model
Heteroscedasticity in both ε_it and u_i?
y_it = x_it′β + u_i + ε_it,  E[ε_it² | X_i] = σ_ε,i²,  E[u_i² | X_i] = σ_u,i²
OLS: b = [Σ_i X_i′X_i]^{-1} [Σ_i X_i′y_i]
Var[b | X] = (1/Σ_i T_i) [X′X/Σ_i T_i]^{-1} [X′ΩX/Σ_i T_i] [X′X/Σ_i T_i]^{-1}
Ω = diag[Ω_1, Ω_2, …, Ω_N]; each block is T_i×T_i, with Ω_i = σ_ε,i² I + σ_u,i² ii′

Part 7: Regression Extensions [ 24/59]
Ordinary Least Squares
Standard results for OLS in a GR model: consistent, unbiased, inefficient.
The variance does (we expect) converge to zero:
Var[b | X] = (1/Σ_i T_i) [Σ_i f_i (X_i′X_i/T_i)]^{-1} [Σ_i f_i (X_i′Ω_i X_i/T_i)] [Σ_i f_i (X_i′X_i/T_i)]^{-1},
with f_i = T_i/Σ_i T_i, 0 < f_i < 1.
Part 7: Regression Extensions [ 25/59]
Estimating the Variance for OLS
White correction?
Est.Var[b | X] = [Σ_i X_i′X_i]^{-1} [Σ_i Σ_{t=1}^{T_i} e_it² x_it x_it′] [Σ_i X_i′X_i]^{-1}
Does this work? No. Observations are correlated within groups.
Cluster estimator:
Est.Var[b | X] = (X′X)^{-1} [Σ_{i=1}^N (Σ_{t=1}^{T_i} x_it e_it)(Σ_{t=1}^{T_i} x_it e_it)′] (X′X)^{-1}

Part 7: Regression Extensions [ 26/59]
White Estimator for OLS
Var[b | X] = (1/Σ_i T_i) [X′X/Σ_i T_i]^{-1} [X′ΩX/Σ_i T_i] [X′X/Σ_i T_i]^{-1}
X′ΩX/Σ_i T_i = Σ_i f_i (X_i′Ω_i X_i/T_i), where Ω_i = E[w_i w_i′ | X_i]
In the spirit of the White estimator, use ŵ_i = y_i − X_i b:
X′ΩX/Σ_i T_i ≈ Σ_i f_i (X_i′ŵ_i ŵ_i′X_i/T_i)
Hypothesis tests are then based on Wald statistics.

Part 7: Regression Extensions [ 27/59]
Generalized Least Squares
β̂ = [X′Ω^{-1}X]^{-1}[X′Ω^{-1}y] = [Σ_i X_i′Ω_i^{-1}X_i]^{-1}[Σ_i X_i′Ω_i^{-1}y_i]
Ω_i^{-1} = (1/σ_ε,i²) [I − (σ_u,i²/(σ_ε,i² + T_i σ_u,i²)) ii′]
(Depends on i through σ_ε,i², σ_u,i² and T_i.)

Part 7: Regression Extensions [ 28/59]
Estimating the Variance Components: Baltagi
y_it = x_it′β + u_i + ε_it; u_i ~ (0, σ_u,i²), ε_it ~ (0, σ_ε,i²), but here he assumes σ_ε,i² = σ_ε² (homoscedastic).
Use GLS with y_it* = y_it − θ_i ȳ_i, θ_i = 1 − σ_ε/√(T_i σ_u,i² + σ_ε²).
FGLS needs σ̂_ε² and σ̂_u,i². "Requires large T, preferably small N, T >> N."
Use σ̂_ε² = Σ_{i=1}^N Σ_{t=1}^{T_i} e_it,LSDV² / (Σ_i T_i − N − K).
Based on Var[w_it = u_i + ε_it] = σ_u,i² + σ_ε², use σ̂_i² = Σ_t (ŵ_it − w̄_i)² / (T_i − 1).
Then σ̂_u,i² = σ̂_i² − σ̂_ε². Use σ̂_u,i² and σ̂_ε² to compute θ̂_i for FGLS.
"Consistency of the variance estimators requires T → ∞ and finite N."
Invoking Mazodier and Trognon (1978) and Baltagi and Griffin (1988).

Part 7: Regression Extensions [ 29/59]
Estimating the Variance Components: Hsiao
Let σ_u,i² and σ_ε,i² both vary across individuals. We consider FGLS. But there is no way to get consistent estimates of σ_u,i² even if T → ∞. This is because there is but a single realization of u_i.
For example, use σ̂_ε,i² = Σ_{t=1}^{T_i} (e_it − ē_i)² / (T_i − 1) with OLS or LSDV residuals,
or, even if we assume homoscedasticity, pool these estimators in a common σ̂_ε², as Baltagi does. If T is finite, there do not exist consistent estimators of σ_u,i² and σ_ε,i² even if N → ∞. (This is the incidental parameters problem, Neyman and Scott (1948). (No, it isn't.))
So, who's right? Hsiao. This is no longer in Baltagi.

Part 7: Regression Extensions [ 30/59]
Maximum Likelihood
Let σ_ε,i = σ_ε exp(z_i′δ) and σ_u,i = σ_u exp(h_i′γ).
With θ_i = 1/σ_ε,i², γ_i = σ_u,i²/σ_ε,i², R_i = T_i γ_i + 1 and Q_i = γ_i/R_i,
logL_i = −(1/2)[θ_i(ε_i′ε_i − Q_i(T_i ε̄_i)²) + log R_i − T_i log θ_i + T_i log 2π]
Can be maximized using ordinary optimization methods: treat as a standard nonlinear optimization problem and solve with iterative, gradient methods.
Is there much benefit in doing this? Why would one do this?

Part 7: Regression Extensions [ 31/59]
Conclusion: Heteroscedasticity in the Effects
Choose robust OLS or simple FGLS with moments based variances. Note the advantage of panel data – individual specific variances.
As usual, the payoff is a function of the variance of the variances and the extent to which the variances are correlated with the regressors.
MLE and specific models for the variances probably don't pay off much unless the model(s) for the variances is (are) of specific interest.

Part 7: Regression Extensions [ 32/59]
Autocorrelation
Source? Already present in the RE model – equicorrelated.
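The two-step AR(1) FGLS developed in this section — estimate ρ as the average of group-specific residual autocorrelations from an LSDV regression, then apply partial (quasi) differences and re-estimate within groups — can be sketched in numpy. Everything below is my own illustration of those steps, not code from the course:

```python
import numpy as np

def fe_ar1_fgls(y, X, groups):
    """FE-AR(1) FGLS sketch: (1) LSDV residuals give rho_hat as the
    average over groups of sum_t e_t e_{t-1} / sum_t e_t^2;
    (2) partial differences y_t - rho*y_{t-1}, then LSDV again."""
    ids = np.unique(groups)
    Xd, yd = X.copy(), y.copy()
    for g in ids:                          # step 1: within residuals
        m = groups == g
        Xd[m] -= Xd[m].mean(axis=0)
        yd[m] -= yd[m].mean()
    e = yd - Xd @ np.linalg.lstsq(Xd, yd, rcond=None)[0]
    rho = np.mean([e[groups == g][1:] @ e[groups == g][:-1] /
                   (e[groups == g] @ e[groups == g]) for g in ids])
    Xs, ys, gs = [], [], []
    for g in ids:                          # step 2: partial differences
        m = groups == g
        Xs.append(X[m][1:] - rho * X[m][:-1])
        ys.append(y[m][1:] - rho * y[m][:-1])
        gs.append(np.full(m.sum() - 1, g))
    Xs, ys, gs = np.vstack(Xs), np.concatenate(ys), np.concatenate(gs)
    for g in ids:                          # within estimation on transformed data
        m = gs == g
        Xs[m] -= Xs[m].mean(axis=0)
        ys[m] -= ys[m].mean()
    return rho, np.linalg.lstsq(Xs, ys, rcond=None)[0]

# synthetic fixed effects panel with AR(1) disturbances, rho = 0.6
rng = np.random.default_rng(5)
N, T = 18, 19
groups = np.repeat(np.arange(N), T)
X = rng.normal(size=(N * T, 2))
eps = np.zeros(N * T)
for g in range(N):
    e_g, v = np.zeros(T), rng.normal(size=T)
    for t in range(1, T):
        e_g[t] = 0.6 * e_g[t - 1] + v[t]
    eps[groups == g] = e_g
alpha = np.repeat(rng.normal(size=N), T)
y = alpha + X @ np.array([0.7, -0.4]) + eps
rho_hat, b_fgls = fe_ar1_fgls(y, X, groups)
```

With T as small as 19, the residual-based ρ̂ is biased toward zero, which is one reason the slides flag the interpretation of the autoregressive specification.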
Models:
Autoregressive: ε_i,t = ρε_i,t-1 + v_it – how to interpret?
Unrestricted: (already considered)
Estimation requires an estimate of ρ:
ρ̂ = (1/N) Σ_{i=1}^N ρ̂_i,  ρ̂_i = Σ_{t=2}^{T_i} e_i,t e_i,t-1 / Σ_{t=1}^{T_i} e_i,t²
using LSDV residuals in both the RE and FE cases.

Part 7: Regression Extensions [ 33/59]
FGLS – Fixed Effects
y_i,t = α_i + x_i,t′β + ε_i,t,  ε_i,t = ρε_i,t-1 + v_i,t
y_i,t − ρy_i,t-1 = α_i(1 − ρ) + (x_i,t − ρx_i,t-1)′β + v_i,t
y_i,t* = α_i* + x_i,t*′β + v_i,t
Using ρ̂, LSDV estimation estimates α_i* = α_i(1 − ρ), β, and σ_v² = σ_ε²(1 − ρ²).
Estimate α_i with a_i*/(1 − ρ̂); estimate σ_ε² with σ̂_v²/(1 − ρ̂²).

Part 7: Regression Extensions [ 34/59]
FGLS – Random Effects
y_i,t = x_i,t′β + u_i + ε_i,t,  ε_i,t = ρε_i,t-1 + v_i,t
y_i,t − ρy_i,t-1 = u_i(1 − ρ) + (x_i,t − ρx_i,t-1)′β + v_i,t
y_i,t* = x_i,t*′β + u_i* + v_i,t
(1) Step 1: transform the data using ρ̂ to partial differences.
(2) Step 2: use residuals from the transformed data to estimate the variances — an estimator of σ_v² from the LSDV residuals, and an estimator of σ_v² + σ_u²(1 − ρ)² = σ_v² + σ_u*².
(3) Apply FGLS to the transformed data to estimate β and the asymptotic covariance matrix.
(4) Estimates of σ_u² and σ_ε² can be recovered from the earlier results.

Part 7: Regression Extensions [ 35/59]
Microeconomic Data - Wages
+----------------------------------------------------+
| Least Squares with Group Dummy Variables           |
| LHS=LWAGE  Mean                =   6.676346        |
| Model size Parameters          =   600             |
|            Degrees of freedom  =   3565            |
| Estd.
Autocorrelation of e(i,t) = .148641          |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 OCC       -.01722052      .01363100       -1.263    .2065
 SMSA      -.04124493      .01933909       -2.133    .0329
 MS        -.02906128      .01897720       -1.531    .1257
 EXP        .11359630      .00246745       46.038    .0000
 EXPSQ     -.00042619      .544979D-04     -7.820    .0000
+---------+--------------+----------------+--------+---------+

Part 7: Regression Extensions [ 36/59]
Macroeconomic Data – Baltagi/Griffin Gasoline Market
+----------------------------------------------------+
| Least Squares with Group Dummy Variables           |
| LHS=LGASPCAR  Mean  =  4.296242                    |
| Estd. Autocorrelation of e(i,t) = .775557          |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] |
+---------+--------------+----------------+--------+---------+
 LINCOMEP   .66224966      .07338604        9.024    .0000
 LRPMG     -.32170246      .04409925       -7.295    .0000
 LCARPCAP  -.64048288      .02967885      -21.580    .0000
+---------+--------------+----------------+--------+---------+

Part 7: Regression Extensions [ 37/59]
FGLS Estimates
+----------------------------------------------------+
| Least Squares with Group Dummy Variables           |
| LHS=LGASPCAR  Mean                 =  .9412098     |
| Residuals     Sum of squares      =  .6339541      |
| Standard error of e = .4574120E-01                 |
| Fit           R-squared           =  .8763286      |
| Estd.
Autocorrelation of e(i,t) = .775557          |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] |
+---------+--------------+----------------+--------+---------+
 LINCOMEP   .40102837      .07557109        5.307    .0000
 LRPMG     -.24537285      .03187320       -7.698    .0000
 LCARPCAP  -.56357053      .03895343      -14.468    .0000
+---------+--------------+----------------+--------+---------+
+--------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i)     |
| Estimates: Var[e]              = .852489D-02     |
|            Var[u]              = .355708D-01     |
|            Corr[v(i,t),v(i,s)] = .806673         |
+--------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 LINCOMEP   .55269845      .05650603        9.781    .0000
 LRPMG     -.42499860      .03841943      -11.062    .0000
 LCARPCAP  -.60630501      .02446438      -24.783    .0000
 Constant  1.98508335      .17572168       11.297    .0000
+---------+--------------+----------------+--------+---------+

Part 7: Regression Extensions [ 38/59]
Maximum Likelihood
Assuming multivariate normally distributed ε_it and fixed T > N:
(1) Regardless of how β̂ is computed, the MLE of Σ is Σ̂ = (1/T) Σ_{t=1}^T ε̂_t ε̂_t′.
(2) In this model, for any given Σ, the MLE of β is given by GLS (Oberhofer and Kmenta (1974)). Iterate back and forth between β̂ and Σ̂ until convergence. At the solution,
logL = −(NT/2)[1 + log 2π + log|Σ̂|]

Part 7: Regression Extensions [ 39/59]
Baltagi and Griffin's Gasoline Data
World Gasoline Demand Data, 18 OECD countries, 19 years. Variables in the file are:
COUNTRY = name of country
YEAR = year, 1960-1978
LGASPCAR = log of consumption per car
LINCOMEP = log of per capita income
LRPMG = log of real price of gasoline
LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on which the analysis is based is Baltagi, B.
and Griffin, J., "Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983, pp. 117-137. The data were downloaded from the website for Baltagi's text.

Part 7: Regression Extensions [ 40/59]
OLS and PCSE
+--------------------------------------------------+
| Groupwise Regression Models                      |
| Pooled OLS residual variance (SS/nT)   .0436     |
| Test statistics for homoscedasticity:            |
| Deg.Fr. = 17  C*(.95) = 27.59  C*(.99) = 33.41   |
| Lagrange multiplier statistic =  111.5485        |
| Wald statistic                =  546.3827        |
| Likelihood ratio statistic    =  109.5616        |
| Log-likelihood function       =  50.492889       |
+--------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant  2.39132562      .11624845       20.571    .0000
 LINCOMEP   .88996166      .03559581       25.002    .0000
 LRPMG     -.89179791      .03013694      -29.592    .0000
 LCARPCAP  -.76337275      .01849916      -41.265    .0000
+----------------------------------------------------+
| OLS with Panel Corrected Covariance Matrix         |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant  2.39132562      .06388479       37.432    .0000
 LINCOMEP   .88996166      .02729303       32.608    .0000
 LRPMG     -.89179791      .02641611      -33.760    .0000
 LCARPCAP  -.76337275      .01605183      -47.557    .0000

Part 7: Regression Extensions [ 41/59]
FGLS
+--------------------------------------------------+
| Groupwise Regression Models                      |
| Pooled OLS residual variance (SS/nT)   .0436     |
| Log-likelihood function       =  50.492889       |
+--------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant  2.39132562      .11624845       20.571    .0000
 LINCOMEP   .88996166      .03559581       25.002    .0000
 LRPMG     -.89179791      .03013694      -29.592    .0000
 LCARPCAP  -.76337275      .01849916      -41.265    .0000
+--------------------------------------------------+
| Groupwise Regression Models                      |
| Test statistics against the correlation:         |
| Deg.Fr. = 153  C*(.95) = 182.86  C*(.99) = 196.61|
| Likelihood ratio statistic    =  1010.7643       |
+--------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant  2.11399182      .00962111      219.724    .0000
 LINCOMEP   .80854298      .00219271      368.741    .0000
 LRPMG     -.79726940      .00123434     -645.909    .0000
 LCARPCAP  -.73962381      .00074366     -994.570    .0000

Part 7: Regression Extensions [ 42/59]
Aggregation Test
Aggregation: separate equations for each unit; the aggregation hypothesis is that the βs are the same.
H0: β_1 = β_2 = … = β_N;  H1: not H0
The correlation structure (free Σ) is maintained. Approaches:
(1) Wald test using the b_i from separate OLS regressions.
(2) LR test, using NT[log|S_0| − log|S_1|]. S is computed using residuals equation by equation; all equations fit by ML in both cases.
(3) Other strategies based on F statistics.
(4) Other hypotheses related to Σ are based on the likelihood. (See Greene (2012, Section 11.11).)

Part 7: Regression Extensions [ 43/59]
A Test Against Aggregation
Log likelihood from the restricted model = 655.093. Free parameters in β and Σ: 4 + 18(19)/2 = 175.
Log likelihood from the model with separate country dummy variables = 876.126. Free parameters in β and Σ: 21 + 171 = 192.
Chi-squared[17] = 2(876.126 − 655.093) = 442.07; critical value = 27.587.
The homogeneity hypothesis is rejected a fortiori.
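The likelihood ratio arithmetic above can be verified directly; the 27.587 figure is the chi-squared(17) 95th percentile (the deck's slide 40 quotes the same value as 27.59):

```python
# Likelihood ratio test of the aggregation (homogeneity) hypothesis,
# reproducing the slide's arithmetic as a check.
logL_restricted = 655.093     # common beta: 4 + 18*19/2 = 175 free parameters
logL_unrestricted = 876.126   # adds country dummies: 21 + 171 = 192 parameters
df = 192 - 175                # 17 restrictions
lr = 2.0 * (logL_unrestricted - logL_restricted)
critical_95 = 27.587          # chi-squared(17) 95th percentile
reject_homogeneity = lr > critical_95
```

With lr ≈ 442.07 against a critical value of 27.587, the rejection is indeed "a fortiori."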
Part 7: Regression Extensions [ 44/59]
Measurement Error
Standard regression results; general effects model:
y_it = βx*_it + c_i + ε_it
x_it = x*_it + h_it = the measured variable, including measurement error.
b = (x′x)^{-1}x′y = [x′x/Σ_i T_i]^{-1} [(x* + h)′(βx* + c + ε)/Σ_i T_i]
plim b = (β Var[x*_it] + Cov[x*_it, c_i]) / (Var[x*_it] + Var[h_it])
Biased twice, possibly in opposite directions. (Griliches and Hausman (1986).)

Part 7: Regression Extensions [ 45/59]
General Conclusions About Measurement Error
In the presence of individual effects, the inconsistency is in unknown directions.
With panel data, different transformations of the data (first differences, group mean deviations) estimate different functions of the parameters – possible method of moments estimators.
The model may be estimable by minimum distance or GMM.
With panel data, lagged values may provide suitable instruments for IV estimation.
Various applications are listed in Baltagi (pp. 205-208).

Part 7: Regression Extensions [ 46/59]
Application: A Twins Study
"Estimates of the Economic Returns to Schooling from a New Sample of Twins," Ashenfelter, O. and A. Krueger, American Economic Review, 12/1994.
(1) Annual twins study in Twinsburg, Ohio.
(2) (Log) wage equations: y_i,j = log wage of twin j = 1,2 in family i.
(3) Measured data:
(a) Self reported education, sibling reported education, whether twins report the same education, other twin related variables.
(b) Age, race, sex, self employed, union member, married, age of mother at birth.
(4) S_j^k = reported schooling of twin j by twin k. Measurement error: S_j^k = S_j + v_j^k.
Reliability ratio = Var[S_j]/(Var[S_j] + Var[v_j^k]).

Part 7: Regression Extensions [ 47/59]
Wage Equation Structure
y_i1 = x_i′α + z_i1′β + μ_i + ε_i1
y_i2 = x_i′α + z_i2′β + μ_i + ε_i2
μ_i = z_i1′θ + z_i2′θ + x_i′δ + ω_i
Reduced form = a two equation SUR model.
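The slide 44 plim — attenuation from Var[h] in the denominator, plus a term from the correlation between x* and the effect — is easy to confirm by simulation. A small Monte Carlo check of my own (β, the covariances, and the sample size are illustrative choices):

```python
import numpy as np

# Monte Carlo check of plim b = (beta*Var[x*] + Cov[x*,c]) / (Var[x*] + Var[h]):
# biased twice, possibly in opposite directions.
rng = np.random.default_rng(2)
n, beta = 200_000, 1.0
x_star = rng.normal(size=n)                  # true regressor, Var[x*] = 1
c = 0.3 * x_star + rng.normal(size=n)        # effect with Cov[x*, c] = 0.3
h = rng.normal(scale=np.sqrt(0.5), size=n)   # measurement error, Var[h] = 0.5
x = x_star + h                               # observed regressor
y = beta * x_star + c + rng.normal(size=n)
b = (x @ y) / (x @ x)                        # slope regression through the origin
plim_b = (beta * 1.0 + 0.3) / (1.0 + 0.5)    # = 1.3/1.5, about 0.867
```

Here the positive Cov[x*, c] pushes b up while Var[h] attenuates it; with these numbers the net bias is downward.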
y_i1 = x_i′(α + δ) + z_i1′(β + θ) + z_i2′θ + (ε_i1 + ω_i)
y_i2 = x_i′(α + δ) + z_i1′θ + z_i2′(β + θ) + (ε_i2 + ω_i)
First differencing gives the "fixed effects" approach:
y_i1 − y_i2 = (z_i1 − z_i2)′β + (ε_i1 − ε_i2)
y_i1 − y_i2 = (S_1^1 − S_2^2)β + (ε_i1 − ε_i2)
The regressor is measured with error. The first difference gets rid of the family effect, but worsens the measurement error problem. But the difference in cross-sibling reports, S_2^1 − S_1^2, may be used as an instrumental variable.

Part 7: Regression Extensions [ 48/59]

Part 7: Regression Extensions [ 49/59]
Spatial Autocorrelation
Thanks to Luc Anselin, Ag. U. of Ill.

Part 7: Regression Extensions [ 50/59]
Spatially Autocorrelated Data
Per Capita Income in Monroe County, NY. Thanks to Arthur J. Lembo Jr., Geography, Cornell.

Part 7: Regression Extensions [ 51/59]
Hypothesis of Spatial Autocorrelation
Thanks to Luc Anselin, Ag. U. of Ill.

Part 7: Regression Extensions [ 52/59]
Testing for Spatial Autocorrelation
W = spatial weight matrix. Think "spatial distance matrix." W_ii = 0.

Part 7: Regression Extensions [ 53/59]
Modeling Spatial Autocorrelation
(y − μi) = ρW(y − μi) + ε, N observations on a spatially arranged variable.
W = 'contiguity matrix'; W_ii = 0. W must be specified in advance. It is not estimated.
ρ = spatial autocorrelation parameter, −1 < ρ < 1.
Identification problem: ρW = (ρ/k)(kW) for any k ≠ 0. Normalization: rows of W sum to 1.
E[ε] = 0, Var[ε] = σ²I
(y − μi) = [I − ρW]^{-1}ε
E[y] = μi, Var[y] = σ²[(I − ρW)′(I − ρW)]^{-1}

Part 7: Regression Extensions [ 54/59]
Spatial Autoregression
y = ρWy + Xβ + ε.  E[ε | X] = 0, Var[ε | X] = σ²I
y = [I − ρW]^{-1}(Xβ + ε) = [I − ρW]^{-1}Xβ + [I − ρW]^{-1}ε
E[y | X] = [I − ρW]^{-1}Xβ
Var[y | X] = σ²[(I − ρW)′(I − ρW)]^{-1}

Part 7: Regression Extensions [ 55/59]
Generalized Regression
Potentially very large N – GPS data on agriculture plots.
Estimation of ρ: there is no natural residual based estimator.
Complicated covariance structure – no simple transformations.

Part 7: Regression Extensions [ 56/59]
Spatial Autocorrelation in Regression
y = Xβ + (I − ρW)ε,  w_ii = 0.
E[ε | X] = 0, Var[ε | X] = σ²I
E[y | X] = Xβ,  Var[y | X] = σ²(I − ρW)(I − ρW)′
A generalized regression model:
β̂ = [X′{(I − ρW)(I − ρW)′}^{-1}X]^{-1} X′{(I − ρW)(I − ρW)′}^{-1}y
σ̂² = (y − Xβ̂)′{(I − ρW)(I − ρW)′}^{-1}(y − Xβ̂) / N
Estimation of ρ is the subject of much research.

Part 7: Regression Extensions [ 57/59]
Panel Data Application: Spatial Autocorrelation
E.g., N countries, T periods (e.g., the gasoline data):
y_it = x_it′β + c_i + ε_it
ε_t = λWε_t + v_t = N observations at time t.
Similar assumptions. A candidate for a SUR or spatial autocorrelation model.

Part 7: Regression Extensions [ 58/59]

Part 7: Regression Extensions [ 59/59]
Spatial Autocorrelation in a Panel
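The generalized regression of slide 56 can be sketched directly, taking ρ as known (the slides note that estimating it is the subject of much research). A numpy illustration of my own, using a simple row-normalized contiguity matrix for points on a line:

```python
import numpy as np

def spatial_gls(y, X, W, rho):
    """GLS for y = X b + (I - rho*W) eps with Var[y|X] =
    sigma^2 (I - rho*W)(I - rho*W)'. rho is treated as known;
    this is a sketch, not a full spatial estimator."""
    n = len(y)
    A = np.eye(n) - rho * W
    Vinv = np.linalg.inv(A @ A.T)          # inverse of (I - rho W)(I - rho W)'
    b = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    r = y - X @ b
    s2 = (r @ Vinv @ r) / n                # sigma_hat^2 as on slide 56
    return b, s2

# contiguity matrix: each point's neighbors are its immediate left/right
n = 50
W = np.zeros((n, n))
for i in range(n):
    if i > 0:
        W[i, i - 1] = 1.0
    if i < n - 1:
        W[i, i + 1] = 1.0
W /= W.sum(axis=1, keepdims=True)          # rows sum to 1 (slide 53 normalization)

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
rho = 0.4
y = X @ np.array([2.0, 1.0]) + (np.eye(n) - rho * W) @ rng.normal(size=n)
b, s2 = spatial_gls(y, X, W, rho)
```

Equivalently, one can premultiply by (I − ρW)^{-1} and run OLS on the transformed data; the GLS sandwich above avoids forming that inverse transform explicitly for y and X separately.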