Lecture 17: Serial correlation
BUEC 333
Professor David Jacks
Three assumptions necessary for unbiasedness:
1.) correct specification;
2.) zero mean error term;
3.) exogeneity of independent variables.
Three assumptions necessary for efficiency:
4.) no perfect collinearity;
5.) no serial correlation;
6.) no heteroskedasticity.
Violatin’ the classical assumptions
Since serial correlation (SC) violates 5.), which
implies that OLS is not BLUE, we want to know:
1.) What is the nature of the problem?
2.) What are the consequences of the problem?
3.) How is the problem diagnosed?
4.) What remedies for the problem are available?
We now consider these in turn…
Violatin’ the classical assumptions
SC occurs when an observation’s error term (εi) is
correlated with another observation’s error term
(εj), or Cov(εi, εj) ≠ 0.
Usually happens because there is an important
relationship—economic or otherwise—between
observations which we are failing to control for.
Serial correlation
SC can also arise from cluster sampling where
observations are of the same variables on
systematically related subjects.
Example: firms operating in the same market;
consumption with sample data from families with
one observation for each family member.
Serial correlation
There are two basic types of serial correlation:
pure and impure.
Pure serial correlation arises if the model is
correctly specified but the errors are serially
correlated (that is, all other assumptions hold).
Example: the DGP is Yt = β0 + β1X1t + εt where
εt = ρεt-1 + ut and ut is a "classical" error term.
Pure serial correlation
Note: here, we use the subscript t (for time,
instead of i) to denote the observation number;
this is standard for models of time series data
where SC arises most frequently.
Further note: this kind of serial correlation is also
called first-order autocorrelation or first-order
autoregression—or AR(1) for short—and ρ is
called the autocorrelation coefficient.
Pure serial correlation
First-order autocorrelation: εt = ρεt-1 + ut.
Requires –1 < ρ < 1. Why? If |ρ| ≥ 1, the error
process explodes (or never dies out) over time;
and if ρ = 0, there is no serial correlation at all.
ρ < 0 is an example of negative serial correlation
where εt and εt-1 tend to have opposite signs.
Pure serial correlation
ρ > 0 is an example of positive serial correlation
where εt and εt-1 tend to have the same sign.
This case is fairly easy to interpret and very
common in economic data, especially time series.
For time series, macroeconomic shocks take time
to work their way fully through the economy.
Pure serial correlation
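These sign patterns are easy to verify by simulation. A minimal numpy sketch (the sample size and ρ values are purely illustrative):

```python
import numpy as np

def simulate_ar1_errors(rho, T=200, seed=0):
    """Simulate eps_t = rho * eps_{t-1} + u_t with a classical error u_t."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + u[t]
    return eps

def sign_agreement(eps):
    """Fraction of adjacent periods in which eps_t and eps_t-1 share a sign."""
    return (np.sign(eps[1:]) == np.sign(eps[:-1])).mean()

pos = sign_agreement(simulate_ar1_errors(0.9))    # positive SC: signs persist
neg = sign_agreement(simulate_ar1_errors(-0.9))   # negative SC: signs alternate
none = sign_agreement(simulate_ar1_errors(0.0))   # no SC: roughly half and half
```

With ρ = 0.9 the agreement fraction is well above one half; with ρ = –0.9 it is well below.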
Example: modeling the price of oranges.
They can only be grown in warm climates but are
consumed almost everywhere.
Dispersion of production and consumption means
they have to be transported by container ships,
trains, and trucks before being sold to consumers.
An example of pure serial correlation
An unexpected shock to the supply of oil leads to an
increase in the price of oil that lasts several months.
A positive shock to oil prices is likely to filter into a
series of positive shocks to the price of oranges.
Here, modeling prices of oranges at daily
frequency virtually guarantees that SC will be a problem.
An example of pure serial correlation
We can also have autocorrelation at higher orders:
1.) εt = ρ1εt-1 + ρ2εt-2 + ut (second-order)
2.) εt = ρ1εt-1 + ρ2εt-2 + ρ3εt-3 + ut (third-order)
3.) …
Autocorrelation can also arise in non-adjacent periods;
e.g., with quarterly data on real estate prices, we
might have εt = ρεt-4 + ut.
Further cases of pure serial correlation
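The quarterly case εt = ρεt-4 + ut can be simulated in the same way; a minimal numpy sketch (ρ = 0.8 and the sample size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
T, rho = 2000, 0.8
u = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(4, T):
    eps[t] = rho * eps[t - 4] + u[t]   # shocks echo four quarters later

def acorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return (x[lag:] * x[:-lag]).sum() / (x * x).sum()

r1, r4 = acorr(eps, 1), acorr(eps, 4)  # lag-4 correlation large, lag-1 near zero
```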
Impure serial correlation arises if the model is
mis-specified due to an omitted variable and the
specification error induces SC.
For example, suppose the DGP is
Yt = β0 + β1X1t + β2X2t + εt
Instead, we estimate Yt = β0 + β1X1t + εt* where:
1.) εt* = β2X2t + εt
2.) X2t = X2t-1 + ut
3.) εt and ut are classical error terms.
Impure serial correlation via omitted variables
Because of the mis-specification error, the error term is:
εt* = β2X2t + εt
    = β2(X2t-1 + ut) + εt
    = β2X2t-1 + β2ut + εt
    = (εt-1* − εt-1) + β2ut + εt
The error term of observation t is, therefore,
correlated with the error term of observation t-1.
Impure serial correlation via omitted variables
Even though the "true" errors satisfy assumptions
of CLRM, leaving out X2 induces SC in the error
term because X2 is serially correlated.
This omitted variables problem does not cause
bias if and only if the omitted variable is uncorrelated
with the included independent variables.
Impure serial correlation via omitted variables
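The mechanism above can be reproduced in a few lines. A minimal numpy sketch, assuming (purely for illustration) that the omitted X2 follows a random walk while X1 and the true error are classical:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500
x1 = rng.standard_normal(T)
x2 = np.cumsum(rng.standard_normal(T))   # random walk: X2t = X2t-1 + ut
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.standard_normal(T)  # true DGP includes X2

# Mis-specified model: regress Y on X1 alone, omitting X2
X = np.column_stack([np.ones(T), x1])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# The residuals inherit X2's serial correlation
ec = e - e.mean()
r1 = (ec[1:] * ec[:-1]).sum() / (ec * ec).sum()   # strongly positive
```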
Imagine your first job is to model consumer
demand for LCD TVs for Samsung.
You know from ECON 201 that relative prices
matter, specifically the price of Samsung TVs and
that of their competitors or PSAM/PSONY.
Being hungover, you forget another lesson from
ECON 201:
Impure serial correlation via omitted variables
Impure serial correlation also arises if the model
is mis-specified due to incorrect functional form.
For example, suppose the DGP is
Yt = β0 + β1X1t + β2X1t² + εt
Instead, we estimate Yt = β0 + β1X1t + εt* where
εt* = β2X1t² + εt.
Impure serial correlation via functional form
Use of the incorrect functional form in this case
tends to group positive and negative residuals.
Impure serial correlation via functional form
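A short simulation shows this grouping of residual signs. A minimal numpy sketch, fitting a straight line when the DGP is quadratic (all coefficient values illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 200
x = np.linspace(0.0, 10.0, T)                             # smoothly evolving regressor
y = 1.0 + 0.5 * x + 0.5 * x**2 + rng.standard_normal(T)   # true DGP is quadratic

# Mis-specified model: fit a straight line
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

# Neighboring residuals share signs: positive at the ends, negative in the middle
same_sign = (np.sign(e[1:]) == np.sign(e[:-1])).mean()
```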
Both forms of SC violate Assumption 5 of the
CLRM, and hence OLS is not BLUE.
What more can we say?
1.) OLS estimates remain unbiased…but only if
the problem is with pure SC.
Consequences of serial correlation
Suppose we have the simple linear regression
Yi = β0 + β1Xi + εi with mean Ȳ = β0 + β1X̄ + ε̄.
Then
β̂1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
   = β1 + Σ(Xi − X̄)(εi − ε̄) / Σ(Xi − X̄)²
⇒ E(β̂1) = β1 since Cov(Xi, εi) = 0.
Consequences of serial correlation
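The unbiasedness claim is easy to check by Monte Carlo. A minimal numpy sketch with pure AR(1) errors and an exogenous regressor (all parameter values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
T, reps, rho, beta1 = 100, 2000, 0.8, 2.0
x = rng.standard_normal(T)                  # fixed, exogenous regressor

estimates = np.empty(reps)
for r in range(reps):
    u = rng.standard_normal(T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + u[t]    # pure AR(1) serial correlation
    y = 1.0 + beta1 * x + eps
    xc, yc = x - x.mean(), y - y.mean()
    estimates[r] = (xc * yc).sum() / (xc * xc).sum()   # OLS slope

bias = estimates.mean() - beta1             # close to zero: OLS still unbiased
```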
OLS estimates, however, will be biased if the
problem is with impure SC brought about by
correlated omitted variables.
Impure SC represents violation of Assumption 1.
In this case, the SC problem is of secondary
importance next to the bias potentially induced by
specification error.
Consequences of serial correlation
2.) Even if unbiased, OLS is no longer best;
that is, no longer exhibits minimum variance.
SC implies that errors are partly predictable: with
positive SC, a positive error today implies
tomorrow’s error is likely to be positive as well.
But OLS ignores this autocorrelation.
Consequences of serial correlation
3.) The formulas derived for the standard errors of
OLS estimates are now incorrect.
These formulas all assume that errors are not
serially correlated.
Relaxing this assumption changes the formulas;
computers can be programmed to handle this.
Consequences of serial correlation
In this case, the "true" standard errors will
typically be larger than those which OLS reports
(when it assumes there is no SC)…this implies that
OLS' standard errors are biased.
And since this standard error is typically larger
than what OLS says, the "true" t-statistic will
typically be smaller than the one OLS reports.
Consequences of serial correlation
There are a number of formal tests available.
However, the simplest way forward and another
good habit to get into is simply looking at a plot of
the residuals from a regression model as before.
If any red flags are set off by this exercise in
ocular econometrics, proceed to formal tests, being
very mindful of the potential problem.
Testing for serial correlation
Most common test is the Durbin-Watson (DW)
Test (sometimes referred to as the d-test).
Some caveats to be aware of:
1.) the model needs to have an intercept term;
2.) the model cannot include a lagged dependent
variable (that is, Yt-1 cannot be one of the
independent variables).
Testing for serial correlation
If we write the error term as εt = ρεt-1 + ut, the
DW test has the following null and alternative hypotheses:
H0 : ρ ≤ 0 (no positive autocorrelation)
H1 : ρ > 0 (positive autocorrelation)
This test is so common that almost every software
package automatically calculates the value of the
DW statistic whenever you estimate a regression.
Testing for serial correlation
Test statistic is based on residuals from OLS,
{e1, e2, …, eT} where T is the sample size:
e  e 
 e
t 2
t 1
t 1 t
One way to think about d when (+) SC is present:
the numerator will tend to be small samples that
The Durbin-Watson test
e  e 
 e
t 2
t 1
t 1 t
where et   et 1  ut
Now, consider these extremes:
1.) if ρ = 1, then et − et-1 = ut, so d ≈ 0
2.) if ρ = −1, then et − et-1 ≈ −2et-1, so d ≈ 4
3.) if ρ = 0, then et and et-1 are uncorrelated, so d ≈ 2
The Durbin-Watson test
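These extremes can be confirmed numerically. A minimal numpy sketch of the d statistic (sample size and ρ values illustrative):

```python
import numpy as np

def durbin_watson(e):
    """d = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    return (np.diff(e) ** 2).sum() / (e ** 2).sum()

rng = np.random.default_rng(5)

def ar1(rho, T=5000):
    u = rng.standard_normal(T)
    e = np.zeros(T)
    for t in range(1, T):
        e[t] = rho * e[t - 1] + u[t]
    return e

d_pos = durbin_watson(ar1(0.95))    # strong positive SC: d near 0
d_neg = durbin_watson(ar1(-0.95))   # strong negative SC: d near 4
d_none = durbin_watson(ar1(0.0))    # no SC: d near 2
```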
Hence, values of the test statistic "far" from 2
indicate that serial correlation is likely present.
Unfortunately, distribution theory for d is wonky
…for some values of d, the test is inconclusive.
For a given significance level, there are
consequently two critical values, 0 < dL < dU < 2.
The Durbin-Watson test
For a one-sided test,
H0 : ρ ≤ 0 (no positive autocorrelation)
H1 : ρ > 0 (positive autocorrelation)
a.) Reject H0 if d < dL.
b.) Do not reject H0 if d > dU.
c.) The test is inconclusive if dL ≤ d ≤ dU.
Decision rules for the DW test
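The decision rule translates directly into code. A minimal sketch; the critical values in the comment are illustrative of the magnitudes found in standard DW tables, not exact table entries:

```python
def dw_decision(d, d_L, d_U):
    """One-sided DW test for positive autocorrelation (0 < d_L < d_U < 2)."""
    if d < d_L:
        return "reject H0"          # evidence of positive serial correlation
    if d > d_U:
        return "do not reject H0"
    return "inconclusive"           # d_L <= d <= d_U

# With, say, d_L = 1.50 and d_U = 1.59 (illustrative 5% values):
# d = 0.80 -> reject H0; d = 1.90 -> do not reject H0; d = 1.55 -> inconclusive
```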
If evidence of pure serial correlation—whether
through a formal test or just by looking at residual
plots—you have several options available to you:
1.) Use OLS and "fix" the standard errors.
We know OLS is unbiased if SC is pure…
but the usual formulas for the standard errors are
wrong (and hence our tests can be misleading).
Remedies for serially correlated errors
This is the approach followed with Newey-West
standard errors which provide consistent
estimates of the standard errors.
What consistency means: estimators get arbitrarily
close to their true value (in a probabilistic sense)
when the sample size goes to infinity.
In Stata, use the "newey" command (as in "newey
salary points, lag(4)"); note that the "robust" option
on "reg" corrects only for heteroskedasticity, not
for serial correlation.
Remedies for serially correlated errors
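The Newey-West correction itself is short enough to sketch by hand. A minimal numpy implementation with Bartlett weights (the lag length and the DGP in the demo are illustrative assumptions, not recommendations):

```python
import numpy as np

def newey_west_se(X, e, L):
    """HAC (Newey-West) standard errors with Bartlett weights and L lags.

    V = (X'X)^-1 Omega (X'X)^-1, where Omega accumulates outer products
    of x_t * e_t, down-weighting covariances at higher lags.
    """
    xe = X * e[:, None]                  # row t is x_t * e_t
    omega = xe.T @ xe                    # lag-0 term
    for l in range(1, L + 1):
        w = 1.0 - l / (L + 1.0)          # Bartlett kernel weight
        gamma = xe[l:].T @ xe[:-l]
        omega += w * (gamma + gamma.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return np.sqrt(np.diag(XtX_inv @ omega @ XtX_inv))

# Demo: persistent regressor plus AR(1) errors inflate the true sampling variance
rng = np.random.default_rng(6)
T = 400
u, v = rng.standard_normal(T), rng.standard_normal(T)
x, eps = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + u[t]
    eps[t] = 0.8 * eps[t - 1] + v[t]
y = 1.0 + 2.0 * x + eps
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
ols_se = np.sqrt((e @ e) / (T - 2) * np.diag(np.linalg.inv(X.T @ X)))
hac_se = newey_west_se(X, e, L=8)        # typically larger than ols_se here
```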
2.) Other times, you may want to try a more
efficient estimator.
OLS is not BLUE in this case, but what is?
The BLUE is now a generalization of OLS called
Generalized Least Squares (GLS).
Suppose we want to estimate the regression:
Yt = β0 + β1Xt + εt
Remedies for serially correlated errors
Then, since εt = ρεt-1 + ut, we could write the model as:
1.) Yt = β0 + β1Xt + ρεt-1 + ut
Multiply by ρ and lag this by one period:
2.) ρYt-1 = ρβ0 + ρβ1Xt-1 + ρ²εt-2 + ρut-1
Subtracting 2.) from 1.) and using εt-1 = ρεt-2 + ut-1,
3.) Yt − ρYt-1 = β0(1 − ρ) + β1(Xt − ρXt-1) + ut
Remedies for serially correlated errors
3.) Yt − ρYt-1 = β0(1 − ρ) + β1(Xt − ρXt-1) + ut
4.) Yt* = β0* + β1Xt* + ut
where Yt* = Yt − ρYt-1, β0* = β0(1 − ρ), and
Xt* = Xt − ρXt-1.
4.) is a Generalized Least Squares (GLS) equation.
Remedies for serially correlated errors
Note that:
1.) The error term is now not serially correlated;
OLS estimation of 4.) will be minimum variance if
we know ρ or can accurately estimate it.
2.) The slope coefficient β1 is the same as the slope
coefficient of the original serially correlated
equation 1.) above.
Remedies for serially correlated errors
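When ρ must be estimated, the quasi-differencing steps above become a short feasible-GLS (Cochrane-Orcutt style) routine. A minimal numpy sketch with illustrative true values β0 = 3, β1 = 2, ρ = 0.7:

```python
import numpy as np

rng = np.random.default_rng(7)
T, rho = 400, 0.7
x = rng.standard_normal(T)
u = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]
y = 3.0 + 2.0 * x + eps

# Step 1: OLS, then estimate rho from the lag-1 relation of the residuals
X = np.column_stack([np.ones(T), x])
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b_ols
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# Step 2: quasi-difference and re-run OLS on the transformed model
y_star = y[1:] - rho_hat * y[:-1]
x_star = x[1:] - rho_hat * x[:-1]
X_star = np.column_stack([np.ones(T - 1), x_star])
b_gls, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
beta1_hat = b_gls[1]                       # slope is unchanged by the transform
beta0_hat = b_gls[0] / (1.0 - rho_hat)     # intercept is beta0 * (1 - rho)
```

Note that the first observation is dropped by the quasi-differencing; some textbook variants (Prais-Winsten) rescale it instead of discarding it.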
3.) The dependent variable has changed compared
to that in original equation; this means that GLS is
not directly comparable to OLS with respect to R2.
4.) GLS is a method of simultaneously estimating
β0 , β1, and ρ (while being the BLUE of β0 and β1);
different ways of calculating the GLS estimator are
discussed in the text and are fairly involved.
Remedies for serially correlated errors
SC is a very common problem in econometrics.
At best, SC presents problems related to the
efficiency of OLS estimators.
At worst, SC presents problems related to both the
bias and efficiency of OLS estimators.