Part 10: Time Series Applications [ 1/64]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business

Part 10: Time Series Applications [ 2/64]

Part 10: Time Series Applications [ 3/64]

Part 10: Time Series Applications [ 4/64]
Dear Professor Greene, I have a plan to run an (endogenous or exogenous) switching regression model for a panel data set. To my knowledge, there is no routine for this in other software, and I am not so good at coding a program. Fortunately, I am advised that LIMDEP has a built-in function (or routine) for the panel switching model.

Part 10: Time Series Applications [ 5/64]
Endogenous Switching (ca. 1980)
Regime 0: yi = xi′β0 + εi0
Regime 1: yi = xi′β1 + εi1
Regime switch: di* = zi′δ + ui, di = 1[di* > 0]
Regime 0 governs if d = 0, Probability = 1 − Φ(zi′δ)
Regime 1 governs if d = 1, Probability = Φ(zi′δ)
Endogenous switching: (εi0, εi1, ui)′ is trivariate normal with mean 0, Var(εi0) = σ0², Var(εi1) = σ1², Var(ui) = 1, and nonzero covariances between ui and each of εi0 and εi1. The covariance between εi0 and εi1 is not identified; the regimes do not coexist.
This is a latent class model with different processes in the two classes. There is correlation between the unobservables that govern the class determination and the unobservables in the two regime equations.

Part 10: Time Series Applications [ 6/64]

Part 10: Time Series Applications [ 7/64]

Part 10: Time Series Applications [ 8/64]
Modeling an Economic Time Series
Observed y0, y1, …, yt, …
What is the "sample"?
Random sampling?
The "observation window"

Part 10: Time Series Applications [ 9/64]
Estimators
Functions of sums of observations
Law of large numbers?
Nonindependent observations
What does "increasing sample size" mean?
Asymptotic properties? (There are no finite sample properties.)

Part 10: Time Series Applications [ 10/64]
Interpreting a Time Series
Time domain: a "process," y(t) = ax(t) + by(t−1) + …; a regression-like approach/interpretation.
Frequency domain: a sum of terms, y(t) = Σj βj cos(ωj t) + ε(t); the contribution of different frequencies to the observed series.
("High frequency data" in financial econometrics uses "frequency" slightly differently.)

Part 10: Time Series Applications [ 11/64]
For example, …

Part 10: Time Series Applications [ 12/64]
In parts …

Part 10: Time Series Applications [ 13/64]
Studying the Frequency Domain
Cannot identify the number of terms
Cannot identify the frequencies from the time series
Deconstructing the variance, autocovariances and autocorrelations
Contributions at different frequencies
Apparent large weights at different frequencies
Using Fourier transforms of the data
Does this provide "new" information about the series?

Part 10: Time Series Applications [ 14/64]
Autocorrelation in Regression
Yt = b′xt + εt
Cov(εt, εt−1) ≠ 0
Example: RealConst = a + b RealIncomet + εt
U.S. data, quarterly, 1950–2000

Part 10: Time Series Applications [ 15/64]
Autocorrelation
How does it arise? What does it mean?
Modeling approaches:
Classical – direct, corrective: estimation that accounts for autocorrelation; inference in the presence of autocorrelation.
Contemporary – structural: model the source; incorporate the time series aspect in the model.

Part 10: Time Series Applications [ 16/64]
Stationary Time Series
zt = b1yt−1 + b2yt−2 + … + bPyt−P + et
Autocovariance: γk = Cov[yt, yt−k]
Autocorrelation: ρk = γk/γ0
Stationary series: γk depends only on k, not on t.
Weak stationarity: E[yt] is not a function of t; E[yt yt−s] is not a function of t or s, only of |t−s|.
Strong stationarity: the joint distribution of [yt, yt−1, …, yt−s], for any window of length s periods, is not a function of t or s.
A condition for weak stationarity: the smallest root (in absolute value) of the characteristic polynomial 1 − b1z − b2z² − … − bPz^P = 0 is greater than one, i.e., all roots lie outside the unit circle. (Roots may be complex.)
Example: yt = ρyt−1 + εt; 1 − ρz = 0 has root z = 1/ρ, so |z| > 1 is equivalent to |ρ| < 1.
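As a small illustration of the root condition on slide 16, here is a minimal sketch in Python (my own example, not part of the original slides) that checks whether an AR(P) lag polynomial satisfies weak stationarity by computing the roots of the characteristic polynomial with numpy.

import numpy as np

def is_weakly_stationary(ar_coeffs):
    """Check the root condition for y_t = b1*y_{t-1} + ... + bP*y_{t-P} + e_t.
    The characteristic polynomial is 1 - b1*z - ... - bP*z^P; weak stationarity
    requires every root to lie outside the unit circle."""
    # numpy.roots wants coefficients ordered from the highest power down: -bP, ..., -b1, 1
    poly = np.r_[-np.asarray(ar_coeffs, dtype=float)[::-1], 1.0]
    roots = np.roots(poly)
    return bool(np.all(np.abs(roots) > 1.0)), roots

# AR(1) with rho = 0.95: the single root is 1/0.95 > 1, so the series is stationary.
print(is_weakly_stationary([0.95]))

# AR(2) with 1 - 1.5z + 0.5z^2 = (1 - z)(1 - 0.5z): a root at z = 1, i.e. a unit root.
print(is_weakly_stationary([1.5, -0.5]))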
Part 10: Time Series Applications [ 17/64]
Stationary vs. Nonstationary Series

Part 10: Time Series Applications [ 18/64]
The Lag Operator
Lc = c when c is a constant
Lxt = xt−1
L²xt = xt−2
L^P xt + L^Q xt = xt−P + xt−Q
Polynomials in L: yt = B(L)yt + et
A(L)yt = et
Invertibility: yt = [A(L)]⁻¹et

Part 10: Time Series Applications [ 19/64]
Inverting a Stationary Series
yt = γyt−1 + et
(1 − γL)yt = et
yt = [1 − γL]⁻¹et = et + γet−1 + γ²et−2 + …
1/(1 − γL) = 1 + γL + γ²L² + γ³L³ + …
Stationary series can be inverted
Autoregressive vs. moving average form of series

Part 10: Time Series Applications [ 20/64]
Regression with Autocorrelation
yt = xt′b + et, et = ρet−1 + ut
(1 − ρL)et = ut
et = (1 − ρL)⁻¹ut
E[et] = E[(1 − ρL)⁻¹ut] = (1 − ρL)⁻¹E[ut] = 0
Var[et] = Var[(1 − ρL)⁻¹ut] = (1 + ρ² + ρ⁴ + …)σu² = σu²/(1 − ρ²)
Cov[et, et−1] = Cov[ρet−1 + ut, et−1] = ρCov[et−1, et−1] + Cov[ut, et−1] = ρσu²/(1 − ρ²)

Part 10: Time Series Applications [ 21/64]
OLS vs. GLS
OLS: Unbiased? Consistent (except in the presence of a lagged dependent variable). Inefficient.
GLS: Consistent and efficient.

Part 10: Time Series Applications [ 22/64]
+----------------------------------------------------+
| Ordinary least squares regression                  |
| LHS=REALCONS   Mean                =   2999.436    |
| Autocorrel     Durbin-Watson Stat. =   .0920480    |
|                Rho = cor[e,e(-1)]  =   .9539760    |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   -80.3547488      14.3058515     -5.617    .0000
 REALDPI     .92168567       .00387175     238.054    .0000   3341.47598
| Robust VC  Newey-West, Periods = 10 |
 Constant   -80.3547488      41.7239214     -1.926    .0555
 REALDPI     .92168567       .01503516      61.302    .0000   3341.47598
+---------------------------------------------+
| AR(1) Model:  e(t) = rho * e(t-1) + u(t)    |
| Final value of Rho           =   .998782    |
| Iter= 6, SS= 118367.007, Log-L=-941.371914  |
| Durbin-Watson:   e(t)        =   .002436    |
| Std. Deviation:  e(t)        = 490.567910   |
| Std. Deviation:  u(t)        =  24.206926   |
| Durbin-Watson:   u(t)        =  1.994957    |
| Autocorrelation: u(t)        =   .002521    |
| N[0,1] used for significance levels         |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Constant   1019.32680      411.177156      2.479    .0132
 REALDPI     .67342731       .03972593     16.952    .0000   3341.47598
 RHO         .99878181       .00346332    288.389    .0000

Part 10: Time Series Applications [ 23/64]
Detecting Autocorrelation
Use residuals
Durbin-Watson: d = Σt=2..T (et − et−1)² / Σt=1..T et² ≈ 2(1 − r)
Assumes normally distributed disturbances and strictly exogenous regressors
Variable addition (Godfrey): yt = β′xt + ρεt−1 + ut; use the regression residuals et and test ρ = 0
Assumes consistency of b

Part 10: Time Series Applications [ 24/64]
A Unit Root?
How to test for ρ = 1?
By construction: εt − εt−1 = (ρ − 1)εt−1 + ut
Test for γ = (ρ − 1) = 0 using regression?
Variance goes to 0 faster than 1/T. Need a new table; can't use standard t tables.
Dickey–Fuller tests
Unit roots in economic data. (Are there?)
Nonstationary series
Implications for conventional analysis
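For the corrective approach on slides 20-23, here is a minimal numpy sketch (my own illustration, not code from the slides): compute the OLS residuals, the Durbin-Watson statistic and the estimate of rho, then iterate a Cochrane-Orcutt style quasi-differencing transformation. The simulated data at the end are only for demonstration.

import numpy as np

def ols(y, X):
    """OLS coefficients and residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def ar1_fgls(y, X, iterations=10):
    """FGLS for y = Xb + e with e_t = rho*e_{t-1} + u_t (Cochrane-Orcutt style).
    Returns (b, rho_hat, Durbin-Watson statistic of the initial OLS residuals)."""
    b, e = ols(y, X)
    dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)    # approximately 2(1 - r)
    rho = 0.0
    for _ in range(iterations):
        rho = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
        y_star = y[1:] - rho * y[:-1]                # quasi-differenced data
        X_star = X[1:] - rho * X[:-1]
        b, _ = ols(y_star, X_star)
        e = y - X @ b                                # residuals in original units
    return b, rho, dw

# demonstration on simulated data: y_t = 1 + 0.9 x_t + e_t, e_t = 0.8 e_{t-1} + u_t
rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = 0.8 * e[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 0.9 * x + e
X = np.column_stack([np.ones(T), x])
print(ar1_fgls(y, X))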
Part 10: Time Series Applications [ 25/64]
Reinterpreting Autocorrelation
Regression form: yt = β′xt + εt, εt = ρεt−1 + ut
Error correction form: yt − yt−1 = β′(xt − xt−1) + λ(yt−1 − β′xt−1) + ut, with λ = (ρ − 1).
β′xt is the equilibrium. The model describes the adjustment of yt to equilibrium when xt changes.

Part 10: Time Series Applications [ 26/64]
Integrated Processes
Integration of order P: the Pth differenced series is stationary.
Stationary series are I(0).
Trending series are often I(1); then yt − yt−1 = Δyt is I(0). [Most macroeconomic data series.]
Accelerating series might be I(2); then (yt − yt−1) − (yt−1 − yt−2) = Δ²yt is I(0). [Money stock in hyperinflationary economies. Difficult to find many applications in economics.]

Part 10: Time Series Applications [ 27/64]
Cointegration: Real DPI and Real Consumption

Part 10: Time Series Applications [ 28/64]
Cointegration – Divergent Series?

Part 10: Time Series Applications [ 29/64]
Cointegration
x(t) and y(t) are obviously I(1).
It looks like any linear combination of x(t) and y(t) will also be I(1).
Does a model y(t) = βx(t) + u(t), where u(t) is I(0), make any sense? How can u(t) be I(0)?
In fact, there is a linear combination, [1, −β], that is I(0).
y(t) = .1*t + noise, x(t) = .2*t + noise
y(t) and x(t) have a common trend.
y(t) and x(t) are cointegrated.

Part 10: Time Series Applications [ 30/64]
Cointegration and I(0) Residuals

Part 10: Time Series Applications [ 31/64]
Cross Country Growth Convergence
Solow–Swan growth model:
Yi,t = Ki,t^α (Ai,t Li,t)^(1−α)
Ai,t = index of technology
Ki,t = capital stock = Ii,t−1 + (1 − δ)Ki,t−1
Ii,t−1 = si Yi,t−1 (investment = savings)
Li,t = labor = (1 + ni)Li,t−1
Equilibrium model for steady state income:
log Yi,t = αi + ξi t + γi log Yi,t−1 + εi,t, with ξi = (1 − γi)gi, gi = technological growth rate.
γi = "convergence" parameter; 1 − γi = rate of convergence to the steady state.
Cross country comparisons of convergence rates are the focus of study.

Part 10: Time Series Applications [ 32/64]
Heterogeneous Dynamic Model
log Yi,t = αi + γi log Yi,t−1 + βi xit + εi,t
The long run effect of interest is θi = βi/(1 − γi).
(1) "Fixed effects": separate regressions, then average the results; the average (over countries) effect is θ̄ = (1/N)Σi θi or β̄/(1 − γ̄).
(2) Country means (over time): can be manipulated to produce consistent estimators of the desired parameters.
(3) Time series of means across countries: does not work at all.
(4) Pooled: no way to obtain consistent estimates.
(5) Mixed fixed (Weinhold–Hsiao): build separate αi and γi into the equation as "fixed effects"; treat βi = β + ui as random.

Part 10: Time Series Applications [ 33/64]
"Fixed Effects" Approach
log Yi,t = αi + γi log Yi,t−1 + βi xit + εi,t, θi = βi/(1 − γi)
(1) Separate regressions give γ̂i, β̂i and θ̂i = β̂i/(1 − γ̂i).
(2) Average the estimates, θ̂ = (1/N)Σi θ̂i, or use a function of the averages, θ̂ = [(1/N)Σi β̂i]/[1 − (1/N)Σi γ̂i].
In each case, each term θ̂i has variance O(1/Ti); each average has variance of order (1/N)Σi (1/N)O(1/Ti).
Expect consistency of the estimates of the long run effects.
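Approach (1) on slide 33 (separate country regressions, then averaging) is easy to sketch in code. The following is my own minimal illustration in Python, not code from the slides; the simulated panel at the end stands in for real data.

import numpy as np

def country_regression(logy, x):
    """OLS of log y_t on [1, log y_{t-1}, x_t] for one country.
    Returns (gamma_hat, beta_hat)."""
    Y = logy[1:]
    X = np.column_stack([np.ones(len(Y)), logy[:-1], x[1:]])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return b[1], b[2]

def mean_group_long_run(panel):
    """panel: list of (logy, x) arrays, one pair per country.
    Returns the two averages on slide 33: (1/N) sum of theta_i, and beta_bar/(1 - gamma_bar)."""
    gammas, betas = zip(*(country_regression(ly, x) for ly, x in panel))
    gammas, betas = np.array(gammas), np.array(betas)
    thetas = betas / (1.0 - gammas)          # theta_i = beta_i / (1 - gamma_i)
    return thetas.mean(), betas.mean() / (1.0 - gammas.mean())

# tiny simulated panel: 24 countries, 25 periods each
rng = np.random.default_rng(1)
panel = []
for _ in range(24):
    T = 25
    gamma_i = 0.6 + 0.1 * rng.normal()
    beta_i = 0.3 + 0.05 * rng.normal()
    x = rng.normal(size=T)
    logy = np.zeros(T)
    for t in range(1, T):
        logy[t] = 1.0 + gamma_i * logy[t - 1] + beta_i * x[t] + 0.1 * rng.normal()
    panel.append((logy, x))
print(mean_group_long_run(panel))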
Part 10: Time Series Applications [ 34/64]
Country Means
log Yi,t = αi + γi log Yi,t−1 + βi xit + εi,t, θi = βi/(1 − γi)
Averaging over time within each country, with ȳi the mean of log Yi,t and ȳi,−1 the mean of the lagged values:
ȳi = αi + γi ȳi,−1 + βi x̄i + ε̄i
ε̄i is correlated with ȳi,−1 (it contains the same log Yi,t values), so the estimates are inconsistent. But
Σt=1..Ti log Yi,t = Ti ȳi and Σt=1..Ti log Yi,t−1 = Ti ȳi − (log Yi,Ti − log Yi,0),
so ȳi,−1 = ȳi − ΔT(y)/Ti, where ΔT(y) = log Yi,Ti − log Yi,0. Then
ȳi = αi + γi (ȳi − ΔT(y)/Ti) + βi x̄i + ε̄i
   = αi/(1 − γi) + [βi/(1 − γi)] x̄i − [γi/(1 − γi)] ΔT(y)/Ti + ε̄i/(1 − γi)

Part 10: Time Series Applications [ 35/64]
Country Means (cont.)
ȳi = αi/(1 − γi) + [βi/(1 − γi)] x̄i − [γi/(1 − γi)] ΔT(y)/Ti + ε̄i/(1 − γi)
(1) Let θi0 = αi/(1 − γi) and θi = βi/(1 − γi); expect θi0 − θ0 to be random and θi − θ to be random.
(2) The level variable x̄i should be uncorrelated with the change ΔT(y)/Ti. A regression of ȳi on [1, x̄i] should then give consistent estimates of θ0 and θ.

Part 10: Time Series Applications [ 36/64]
Time Series
Averaging across countries at each t:
log ȳt = (1/N)Σi log yi,t = ᾱ + (1/N)Σi γi log yi,t−1 + (1/N)Σi βi xi,t + (1/N)Σi εi,t
Use γ̄ = (1/N)Σi γi, so γi = γ̄ + (γi − γ̄); then
(1/N)Σi γi log yi,t−1 = γ̄ log ȳt−1 + (1/N)Σi (γi − γ̄) log yi,t−1,
and likewise for (1/N)Σi βi xi,t. Then
log ȳt = ᾱ + γ̄ log ȳt−1 + β̄ x̄t + (ε̄t + (1/N)Σi (γi − γ̄) log yi,t−1 + …)
The disturbance is correlated with the regressor. There is no way out, and by construction no instrumental variable could be correlated with the regressor and not the disturbance.

Part 10: Time Series Applications [ 37/64]
Pooling
Essentially the same as the time series case.
OLS or GLS are inconsistent.
There could be no instrument that would work (by construction).

Part 10: Time Series Applications [ 38/64]
A Mixed/Fixed Approach
log yi,t = αi + Σj=1..N γj dj,t log yi,t−1 + βi xi,t + εi,t
dj,t = country specific dummy variable.
αi and the γj are "fixed effects"; βi is treated as random.
This model can be fit consistently by OLS and efficiently by GLS.

Part 10: Time Series Applications [ 39/64]
A Mixed Fixed Model Estimator
log yi,t = αi + Σj=1..N γj dj,t log yi,t−1 + β xi,t + (wi xi,t + εi,t), with βi = β + wi.
Heteroscedastic: Var[wi xi,t + εi,t] = σw² xi,t² + σε².
Use two step least squares:
(1) Linear regression of log yi,t on the dummy variables, the dummy variables times log yi,t−1, and xi,t.
(2) Regress the squares of the OLS residuals on xi,t² and 1 to estimate σw² and σε².
(3) Return to (1), but now use weighted least squares.
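The two step procedure on slide 39 can be written out compactly; the sketch below is my own illustration in Python (numpy), with a hypothetical (N, T) array layout, not code that accompanies the slides.

import numpy as np

def mixed_fixed_two_step(logy, x):
    """Two step estimator sketched on slide 39.
    logy, x: arrays of shape (N, T), N countries observed for T periods.
    Step 1: OLS of log y_it on country dummies, dummies * lagged log y, and x_it.
    Step 2: regress squared residuals on [1, x_it^2] to get sigma_eps^2, sigma_w^2.
    Step 3: weighted least squares with weights 1/sqrt(fitted variance)."""
    N, T = logy.shape
    rows, yvec = [], []
    for i in range(N):
        for t in range(1, T):
            d = np.zeros(N); d[i] = 1.0                    # country dummies (alpha_i)
            dlag = np.zeros(N); dlag[i] = logy[i, t - 1]   # dummies * lagged log y (gamma_i)
            rows.append(np.r_[d, dlag, x[i, t]])
            yvec.append(logy[i, t])
    X = np.array(rows); Y = np.array(yvec)

    b1, *_ = np.linalg.lstsq(X, Y, rcond=None)             # step 1: OLS
    e2 = (Y - X @ b1) ** 2
    Z = np.column_stack([np.ones_like(e2), X[:, -1] ** 2])  # [1, x_it^2]
    s, *_ = np.linalg.lstsq(Z, e2, rcond=None)              # step 2: sigma_eps^2, sigma_w^2
    var_it = np.clip(Z @ s, 1e-8, None)                     # fitted variances
    w = 1.0 / np.sqrt(var_it)
    b3, *_ = np.linalg.lstsq(X * w[:, None], Y * w, rcond=None)  # step 3: WLS
    return b1, s, b3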
Part 10: Time Series Applications [ 40/64]
Nair-Reichert and Weinhold on Growth
Weinhold (1996) and Nair–Reichert and Weinhold (2001) analyzed growth and development in a panel of 24 developing countries observed for 25 years, 1971–1995. The model they employed was a variant of the mixed fixed model proposed by Hsiao (1986, 2003). In their specification,
GGDPi,t = αi + γi dit GGDPi,t−1 + β1i GGDIi,t−1 + β2i GFDIi,t−1 + β3i GEXPi,t−1 + β4 INFLi,t−1 + εi,t
GGDP = growth rate of gross domestic product,
GGDI = growth rate of gross domestic investment,
GFDI = growth rate of foreign direct investment (inflows),
GEXP = growth rate of exports of goods and services,
INFL = inflation rate.
The constant terms and the coefficients on the lagged dependent variable are country specific. The remaining coefficients are treated as random, normally distributed, with means βk and unrestricted variances, and are modeled as uncorrelated. The model was estimated using a modification of the Hildreth–Houck–Swamy method.

Part 10: Time Series Applications [ 41/64]
Analysis of Macroeconomic Data
Integrated series: the problem with regressions involving nonstationary series, and solutions to the "problem."
Spurious regressions: unit roots and misleading relationships.
Random walks and first differencing: removing common trends.
Cointegration: formal solutions to regression models involving nonstationary data.
Extending these results to panels: large T and small T cases; parameter heterogeneity across countries.

Part 10: Time Series Applications [ 42/64]
Nonstationary Data

Part 10: Time Series Applications [ 43/64]
Integrated Series

Part 10: Time Series Applications [ 44/64]
Stationary Data

Part 10: Time Series Applications [ 45/64]
Unit Root Tests
Δyt = μ + γ yt−1 + δt + Σl=1..L φl Δyt−l + εt
Different restrictions on the parameters produce the different versions of the model. The parameter of interest is γ.
Augmented Dickey–Fuller tests: a unit root test, with the null hypothesis of a unit root, H0: γ = 0 (ρ = 1), against H1: γ < 0 (ρ < 1, stationarity).
KPSS tests: the null hypothesis is stationarity; the alternative is broadly defined as nonstationarity.
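The ADF and KPSS tests on slide 45 are available in most packages. As one illustration (my own sketch, assuming Python's statsmodels library rather than the LIMDEP routines used elsewhere in these notes), applied to a simulated random walk:

import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss

# a simulated random walk, i.e. an I(1) series with a unit root
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))

# Augmented Dickey-Fuller: the null hypothesis is a unit root.
# regression='ct' includes a constant and trend; the lag length L is chosen by AIC.
adf_stat, adf_p, usedlag, nobs, crit, _ = adfuller(y, regression='ct', autolag='AIC')
print('ADF statistic', round(adf_stat, 3), 'p-value', round(adf_p, 3))
# For a random walk we expect NOT to reject the ADF null.

# KPSS: the null hypothesis is (trend) stationarity.
kpss_stat, kpss_p, lags, kpss_crit = kpss(y, regression='ct', nlags='auto')
print('KPSS statistic', round(kpss_stat, 3), 'p-value', round(kpss_p, 3))
# For a random walk we expect to reject the KPSS null.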
Part 10: Time Series Applications [ 46/64]
KPSS Test-1

Part 10: Time Series Applications [ 47/64]
KPSS Test-2

Part 10: Time Series Applications [ 48/64]
Cointegrated Variables?

Part 10: Time Series Applications [ 49/64]
Cointegrating Relationships
Implications: long run vs. short run relationships.
Problems of spurious regressions (as usual).
Problem for existing empirical studies: regressions involving variables of different orders of integration, e.g., regressions of flows on stocks.

Part 10: Time Series Applications [ 50/64]
Money demand example

Part 10: Time Series Applications [ 51/64]
Panel Unit Root Tests

Part 10: Time Series Applications [ 52/64]
Implications
Separate analyses by country.
How to combine data and test statistics.
Cointegrating relationships across countries.

Part 10: Time Series Applications [ 53/64]
Purchasing Power Parity
Cross country purchasing power parity hypothesis: Ei,t = αi + βi Pi,t + εi,t
Ei,t = log of the exchange rate of country i with the U.S.
Pi,t = log of the aggregate consumer expenditure price ratio
The hypothesis of PPP is βi = 1.
Data on numerous countries (large N) and many periods (large T).
Standard simple regressions based on the SUR model? (See Pedroni, P., "Purchasing Power Parity Tests in Cointegrated Panels," ReStat, November 2001, p. 727, and related cited papers.)

Part 10: Time Series Applications [ 54/64]
Application
"Some international evidence on price determination: a non-stationary panel approach," Paul Ashworth and Joseph P. Byrne, Economic Modelling, 20, 2003, pp. 809–838.
80 quarters, 13 OECD countries.
log pi,t = β0 + β1 log(unit labor costi,t) + β2 log(world pricet) + β3 log(intermediate goods pricei,t) + β4 (log output gapi,t) + εi,t
Various tests for unit roots and cointegration.

Part 10: Time Series Applications [ 55/64]
Vector Autoregression
The vector autoregression (VAR) model is one of the most successful, flexible, and easy to use models for the analysis of multivariate time series. It is a natural extension of the univariate autoregressive model to dynamic multivariate time series. The VAR model has proven to be especially useful for describing the dynamic behavior of economic and financial time series and for forecasting. It often provides superior forecasts to those from univariate time series models and elaborate theory-based simultaneous equations models. Forecasts from VAR models are quite flexible because they can be made conditional on the potential future paths of specified variables in the model.
In addition to data description and forecasting, the VAR model is also used for structural inference and policy analysis. In structural analysis, certain assumptions about the causal structure of the data under investigation are imposed, and the resulting causal impacts of unexpected shocks or innovations to specified variables on the variables in the model are summarized. These causal impacts are usually summarized with impulse response functions and forecast error variance decompositions.
Eric Zivot: http://faculty.washington.edu/ezivot/econ584/notes/varModels.pdf

Part 10: Time Series Applications [ 56/64]
VAR
y1(t) = γ11 y1(t−1) + γ12 y2(t−1) + γ13 y3(t−1) + β1 x(t) + ε1(t)
y2(t) = γ21 y1(t−1) + γ22 y2(t−1) + γ23 y3(t−1) + β2 x(t) + ε2(t)
y3(t) = γ31 y1(t−1) + γ32 y2(t−1) + γ33 y3(t−1) + β3 x(t) + ε3(t)
(In Zivot's examples, y(t) = (1) exchange rates; (2) stock returns, interest rates, indexes of industrial production, and the rate of inflation.)

Part 10: Time Series Applications [ 57/64]
VAR Formulation
y(t) = Γ y(t−1) + B x(t) + ε(t)
SUR with identical regressors.
Granger causality: nonzero off diagonal elements in Γ.
y1(t) = γ11 y1(t−1) + γ12 y2(t−1) + γ13 y3(t−1) + β1 x(t) + ε1(t)
y2(t) = γ21 y1(t−1) + γ22 y2(t−1) + γ23 y3(t−1) + β2 x(t) + ε2(t)
y3(t) = γ31 y1(t−1) + γ32 y2(t−1) + γ33 y3(t−1) + β3 x(t) + ε3(t)
Hypothesis: y2 does not Granger cause y1: γ12 = 0.

Part 10: Time Series Applications [ 58/64]
Impulse Response
y(t) = Γ y(t−1) + B x(t) + ε(t)
By backward substitution or using the lag operator (text, p. 943):
y(t) = B x(t) + ΓB x(t−1) + Γ²B x(t−2) + … (ad infinitum) + ε(t) + Γε(t−1) + Γ²ε(t−2) + …
[Γ^P must converge to 0 as P increases; the roots must lie inside the unit circle.]
Consider a one time shock (impulse) in the system, say ε2 in period t.
Consider the effect of the impulse on y1(s), s = t, t+1, …
The effect in period t is 0: ε2 is not in the y1 equation.
ε2 affects y2 in period t, which affects y1 in period t+1; the effect is γ12.
In period t+2, the effect from 2 periods back is (Γ²)12.
… and so on.

Part 10: Time Series Applications [ 59/64]
Zivot's Data

Part 10: Time Series Applications [ 60/64]
Impulse Responses

Part 10: Time Series Applications [ 61/64]
GARCH Models: A Model for Time Series with Latent Heteroscedasticity
Bollerslev/Ghysels, 1974

Part 10: Time Series Applications [ 62/64]
ARCH Model

Part 10: Time Series Applications [ 63/64]
GARCH Model

Part 10: Time Series Applications [ 64/64]
Estimated GARCH Model
----------------------------------------------------------------------
GARCH MODEL
Dependent variable                Y
Log likelihood function          -1106.60788
Restricted log likelihood        -1311.09637
Chi squared [ 2 d.f.]              408.97699
Significance level                    .00000
McFadden Pseudo R-squared           .1559676
Estimation based on N = 1974, K = 4
GARCH Model, P = 1, Q = 1
Wald statistic for GARCH = 3727.503
--------+-------------------------------------------------------------
Variable|  Coefficient   Standard Error   b/St.Er.  P[|Z|>z]  Mean of X
--------+-------------------------------------------------------------
        |Regression parameters
Constant|    -.00619         .00873         -.709     .4783
        |Unconditional Variance
Alpha(0)|     .01076***      .00312         3.445     .0006
        |Lagged Variance Terms
Delta(1)|     .80597***      .03015        26.731     .0000
        |Lagged Squared Disturbance Terms
Alpha(1)|     .15313***      .02732         5.605     .0000
        |Equilibrium variance, a0/[1-D(1)-A(1)]
EquilVar|     .26316         .59402          .443     .6577
--------+-------------------------------------------------------------
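The estimates above come from LIMDEP. As a hedged cross-check of the same GARCH(1,1) specification in open-source software, the sketch below assumes Python's arch package (my assumption, not the software used in the slides); the file name is hypothetical. The last lines echo the equilibrium variance calculation a0/[1 − D(1) − A(1)] = .01076/(1 − .80597 − .15313) ≈ .263 reported in the output.

import pandas as pd
from arch import arch_model

# hypothetical file holding the return series used above (column "Y")
y = pd.read_csv("garch_data.csv")["Y"]

# constant-mean GARCH(1,1), matching the P = 1, Q = 1 model in the LIMDEP output
model = arch_model(y, mean="Constant", vol="GARCH", p=1, q=1)
res = model.fit(disp="off")
print(res.summary())

# equilibrium (unconditional) variance: omega / (1 - alpha - beta)
omega = res.params["omega"]
alpha = res.params["alpha[1]"]
beta = res.params["beta[1]"]
print("equilibrium variance:", omega / (1.0 - alpha - beta))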
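Finally, returning to the VAR, Granger causality, and impulse response material on slides 56-58: the following minimal sketch (my own illustration, assuming Python's statsmodels library) fits a first order VAR to simulated data, tests whether y2 Granger causes y1 (the hypothesis γ12 = 0), and computes impulse responses, whose s-period-ahead values approximate the powers of Γ described on slide 58.

import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

# simulate a 3-variable VAR(1), y(t) = Gamma y(t-1) + e(t), with a known Gamma
rng = np.random.default_rng(0)
Gamma = np.array([[0.5, 0.2, 0.0],
                  [0.0, 0.4, 0.1],
                  [0.1, 0.0, 0.3]])
T = 500
y = np.zeros((T, 3))
for t in range(1, T):
    y[t] = Gamma @ y[t - 1] + rng.normal(scale=0.1, size=3)
data = pd.DataFrame(y, columns=["y1", "y2", "y3"])

results = VAR(data).fit(1)                 # VAR(1), estimated equation by equation
print(results.coefs[0])                    # estimated Gamma

# Granger causality: does y2 help predict y1? (H0: gamma_12 = 0)
print(results.test_causality("y1", ["y2"], kind="f").summary())

# non-orthogonalized impulse responses; irfs[s][i][j] is the response of y_i to a
# one-unit shock to y_j s periods earlier, so irfs[2] is approximately Gamma squared
irf = results.irf(10)
print(irf.irfs[2])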