Lecture 14: Heteroskedasticity and Serial Correlation

Heteroskedasticity (Chapter 10.6–10.7)
Serial Correlation (Chapter 11.1–11.3)
Copyright © 2006 Pearson Addison-Wesley. All rights reserved.
Agenda
• Review
• Feasible GLS (Chapter 10.6)
• White Robust Standard Errors (Chapter 10.7)
• Serial Correlation (Chapter 11.1)
• OLS and Serial Correlation (Chapter 11.2)
• Newey–West Estimated Standard Errors
(Chapter 11.2)
• Testing for Serial Correlation (Chapter 11.3)
Review
• In the last lecture, we began relaxing the
Gauss–Markov assumptions, starting with the
assumption of homoskedasticity.
• Under heteroskedasticity, Var(ε_i) = σ²d_i²
– OLS is still unbiased.
– OLS is no longer efficient.
– OLS e.s.e.'s are incorrect, so C.I.'s, t-statistics, and F-statistics are incorrect.
Review (cont.)
• Under heteroskedasticity,
Var(β̂) = σ² Σ w_i² d_i²
• For a straight line through the origin,
Var(β̂_OLS) = σ² Σ X_i² d_i² / (Σ X_i²)²
Review (cont.)
• We can use squared residuals to test
for heteroskedasticity.
• In the White test, we regress the
squared residuals against all
explanators, squares of explanators,
and interactions of explanators. The
nR² of the auxiliary equation is
distributed Chi-squared.
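The auxiliary regression and nR² statistic are easy to compute by hand. Below is a minimal numpy sketch on simulated data (the data-generating process, seed, and all variable names are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(1, 10, n)
y = 2 + 3 * X + rng.normal(0, X)        # error s.d. grows with X: heteroskedastic

# Original regression: OLS of y on a constant and X, saving residuals
Z = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(Z, y, rcond=None)
e = y - Z @ b

# White auxiliary regression: e^2 on a constant, X, and X^2
# (with a single explanator there are no cross-product interaction terms)
A = np.column_stack([np.ones(n), X, X**2])
g, *_ = np.linalg.lstsq(A, e**2, rcond=None)
u = e**2 - A @ g
r2 = 1 - u.var() / (e**2).var()         # R^2 of the auxiliary regression
white_stat = n * r2                     # chi-squared (2 df here) under the null
```

For data generated this way, white_stat far exceeds the 5% chi-squared critical value (5.99 with 2 degrees of freedom), so the test rejects homoskedasticity.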
Review (cont.)
• The Breusch–Pagan test is similar, but
the econometrician chooses the
explanators for the auxiliary equation.
Review (cont.)
• Under heteroskedasticity, the BLUE Estimator
is Generalized Least Squares.
• To implement GLS:
1. Divide all variables by d_i.
2. Perform OLS on the transformed variables.
• If we have used the correct d_i, the transformed data are homoskedastic.
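The two GLS steps can be sketched in a few lines of numpy, assuming Var(ε_i) = σ²d_i² with d_i known; the data and names below are hypothetical:

```python
import numpy as np

def gls_known_d(y, X, d):
    """GLS when Var(eps_i) = sigma^2 * d_i^2 and d_i is known:
    step 1 divides every variable (including the constant) by d_i;
    step 2 runs OLS on the transformed variables."""
    Z = np.column_stack([np.ones(len(y)), X])       # intercept and explanator
    b, *_ = np.linalg.lstsq(Z / d[:, None], y / d, rcond=None)
    return b                                        # (intercept, slope) estimates

# Hypothetical rent-income style data with d_i = income_i
rng = np.random.default_rng(1)
income = rng.uniform(20, 100, 400)
rent = 100 + 3.0 * income + rng.normal(0, income)   # error s.d. proportional to income
b_gls = gls_known_d(rent, income, d=income)
```

Note that dividing the constant column by d_i turns the intercept into a 1/d_i regressor, exactly as in the transformed rent-income equation.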
Review (cont.)
• For example, consider the relationship
rent_i = β₀ + β₁income_i + ε_i
• We are concerned that Var(ε_i) may vary with income.
• We need to make an assumption about how Var(ε_i) varies with income.
Review (cont.)
• An initial guess: Var(ε_i) = σ²·income_i²
• d_i = income_i
• If we have modeled heteroskedasticity correctly, then the BLUE Estimator is
rent_i/income_i = β₀(1/income_i) + β₁ + v_i
Review (cont.)
• If we have the correct model of
heteroskedasticity, then OLS with the
transformed data should be homoskedastic.
rent_i/income_i = β₀(1/income_i) + β₁ + v_i
• Using a White test, we reject the null
hypothesis of homoskedasticity of the model
with transformed data.
GLS: An Example
• Our first guess didn’t work very well.
• Let's try Var(ε_i) = σ²·income_i
rent_i/√income_i = β₀(1/√income_i) + β₁·√income_i + v_i
• This time, we fail to reject the null
hypothesis of homoskedasticity.
Feasible GLS (Chapter 10.6)
• We usually do NOT know d_i, so GLS is infeasible.
• We can, however, ESTIMATE d_i.
• We call GLS with estimates of d_i "Feasible Generalized Least Squares."
Feasible GLS (cont.)
• To begin, we need to assume some
model for the heteroskedasticity.
• Then we estimate the parameter(s) of the model.
Feasible GLS (cont.)
• One reasonable model for the error terms
could be that the variance is proportional to
some power of the explanator.
Var(ε_i) = σ²X_i^h
• For example, in the rent-income example, we tried both
Var(ε_i) = σ²·income_i²  (h = 2)
and Var(ε_i) = σ²·income_i  (h = 1)
Feasible GLS (cont.)
• To implement FGLS, we have assumed
Var(ε_i) = σ²X_i^h
• To estimate this equation using linear
regression methods, we can take
advantage of the properties of logs:
ln(a^b) = b·ln(a) AND ln(ab) = ln(a) + ln(b)
• Regress
ln(e_i²) = ln(σ²) + h·ln(X_i) + ν_i
Feasible GLS (cont.)
1. Estimate the regression with OLS.
2. Regress ln(e_i²) = ln(σ²) + h·ln(X_i) + ν_i
3. Divide every variable by d_i = √(X_i^ĥ) = X_i^(ĥ/2)
4. Apply OLS to the transformed data.
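The four steps translate directly into numpy. In this sketch the simulated data (true h = 2) and every name are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 600
X = rng.uniform(1, 10, n)
y = 5 + 2 * X + X * rng.normal(size=n)      # Var(eps_i) = sigma^2 * X_i^2, so h = 2

# Steps 1-2: OLS, then regress ln(e^2) on a constant and ln(X); the slope estimates h
Z = np.column_stack([np.ones(n), X])
b_ols, *_ = np.linalg.lstsq(Z, y, rcond=None)
e = y - Z @ b_ols
A = np.column_stack([np.ones(n), np.log(X)])
g, *_ = np.linalg.lstsq(A, np.log(e**2), rcond=None)
h_hat = g[1]

# Steps 3-4: divide every variable by d_i = X_i^(h_hat/2), then OLS on transformed data
d = X ** (h_hat / 2)
b_fgls, *_ = np.linalg.lstsq(Z / d[:, None], y / d, rcond=None)
```

The log regression is noisy observation by observation, but its slope still recovers h well enough to make the transformed data roughly homoskedastic.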
Feasible GLS (cont.)
• FGLS is not a mechanistic procedure.
• The econometrician may prefer other methods of estimating d_i.
Feasible GLS (cont.)
• Applying FGLS to the rent-income
example, our estimated h value is 1.21
• We should divide all our variables by
income_i^0.605. This is very close to dividing
by the square root of income, as we did
in the second part of the example.
TABLE 10.7 ln(Squared Residual) vs.
ln(Income) Following RENT vs. INCOME by OLS
White Robust Standard Errors
(Chapter 10.7)
• Heteroskedasticity is a common problem.
• We may not always be happy making the
FGLS assumptions, especially if we don’t
really need that extra efficiency.
• OLS is unbiased. OLS may yield a sufficiently
small standard error to allow reasonably
precise estimates.
White Robust Standard Errors (cont.)
• The main problem in applying OLS
under heteroskedasticity is that our
e.s.e. formula is incorrect.
• White’s brilliant idea: use OLS and fix
the estimated standard errors!
White Robust Standard Errors (cont.)
• For OLS with an intercept and a single explanator, Y_i = β₀ + β₁X_i + ε_i, we have derived the formula for the e.s.e.:
e.s.e.(β̂₁) = √( Σe_i² / [(n−2)Σx_i²] )
• However, we really used the homoskedasticity
assumption only to simplify this formula.
White Robust Standard Errors (cont.)
• If we do not impose homoskedasticity,
we get a slightly more complicated
formula:
White e.s.e.(β̂₁) = √( Σx_i²e_i² / (Σx_i²)² )
• The computer can easily perform
this calculation.
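A minimal numpy version of that calculation, on simulated data (all names hypothetical):

```python
import numpy as np

def white_se_slope(X, y):
    """White robust e.s.e. for the slope of y = b0 + b1*X + eps:
    sqrt( sum(x_i^2 * e_i^2) / (sum x_i^2)^2 ), where x_i = X_i - mean(X)
    and e_i are the OLS residuals."""
    Z = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ b
    x = X - X.mean()
    return np.sqrt(np.sum(x**2 * e**2) / np.sum(x**2) ** 2)

# With homoskedastic errors the robust e.s.e. should nearly match the
# conventional one (about 1/sqrt(n) for standard-normal X and errors here)
rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=n)
y = 1 + 0.5 * X + rng.normal(size=n)
se_white = white_se_slope(X, y)
```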
White Robust Standard Errors (cont.)
• White Heteroskedastic Consistent standard errors (commonly called "robust" standard errors) correct for possible heteroskedasticity.
• Software packages often provide White e.s.e.'s as an option.
• If errors are homoskedastic, White e.s.e.'s are less efficient than OLS e.s.e.'s.
TABLE 10.8 OLS Estimates of the Rent–
Income Relationship with Robust Standard
Errors
White Robust Standard Errors
• Applying White estimated standard errors is a
very easy fix for possible heteroskedasticity.
• Some economists simply use White
e.s.e.’s routinely.
• This fix comes with a cost in efficiency:
– OLS is not BLUE under heteroskedasticity.
– White e.s.e.’s are inefficient under
homoskedasticity.
White Robust Standard Errors (cont.)
• Note: It is CRUCIAL, when you
present your own results, that you clarify
which e.s.e. you have used. If you do use
White standard errors, you MUST say so.
• For example, many tables of results include
the footnote “White standard errors in
parentheses” or “Robust standard errors
in parentheses.”
Heteroskedasticity
• Heteroskedasticity is not, in practice, a
burdensome complication.
• Econometricians have easy-to-apply tests
to detect heteroskedasticity (White tests,
Breusch–Pagan tests, or Goldfeld–
Quandt tests).
• If there is heteroskedasticity, econometricians
have a number of options available.
Heteroskedasticity (cont.)
• If econometricians know the exact nature of the heteroskedasticity (i.e., if they know the d_i), then they can simply divide all variables by d_i and apply GLS.
• If the d_i are unknown, but econometricians are willing to make some assumptions about their functional form, then the d_i can be estimated by FGLS.
Heteroskedasticity (cont.)
• If econometricians are unwilling to
make assumptions about the nature of
the heteroskedasticity, they can
implement OLS to get unbiased, but
inefficient, estimates.
• Then they must correct the estimated
standard errors using White Robust
Standard Errors.
Serial Correlation (Chapter 11.1)
• Now let’s relax a different Gauss–
Markov assumption.
• What if the error terms are correlated with
one another?
• If I know something about the error term for
one observation, I also know something about
the error term for another observation.
• Our observations are NOT independent!
Serial Correlation (cont.)
• Serial Correlation frequently arises when
using time series data (so we will index our
observations with t instead of i).
• The error term includes all variables not
explicitly included in the model.
• If a change occurs to one of these
unobserved variables in 1969, it is quite
plausible that some of that change will still be
evident in 1970.
Serial Correlation (cont.)
• In this lecture, we will consider a
particular form of correlation among
error terms.
• Error terms are correlated more heavily
with “nearby” observations than with
“distant” observations.
• E.g., Cov(ε_1969, ε_1970) > Cov(ε_1969, ε_1990)
Serial Correlation (cont.)
• For example, inflation in the United States
has been positively serially correlated for at
least a century. We expect above average
inflation in a given period if there was above
average inflation in the preceding period.
• Let’s look at DEVIATIONS in US inflation
from its mean from 1923–1952 and from
1973–2002. There is greater serial correlation
in the more recent sample.
Figure 11.1
U.S. Inflation’s
Deviations from
Its Mean
Serial Correlation: A DGP
• We assume that covariances depend only on
the distance between two time periods, |t-t’|
Y_t = β₀ + β₁X_1t + … + β_kX_kt + ε_t
E(ε_t) = 0
Var(ε_t) = σ²
Cov(ε_t, ε_t') = γ_tt',  γ_tt' ≠ 0 for some t ≠ t'
Specifically: γ_tt' = γ_|t−t'| for all t, t'
X's fixed across samples
OLS and Serial Correlation (Chapter 11.2)
• The implications of serial correlation
for OLS are similar to those of
heteroskedasticity:
– OLS is still unbiased.
– OLS is inefficient.
– The OLS formula for estimated standard errors is incorrect.
• "Fixes" are more complicated.
OLS and Serial Correlation (cont.)
• The Gauss–Markov covariance assumption:
Cov(ε_t, ε_t') = 0 for t ≠ t'
• The expectation of a linear estimator is NOT
affected by the covariance assumption.
• The variance of a linear estimator is greatly
affected by the covariance assumption.
OLS and Serial Correlation (cont.)
Var(β̂_S) = Var(Σ w_tY_t)
= Σ_{t=1}^T Var(w_tY_t) + Σ Σ_{t'≠t} Cov(w_tY_t, w_t'Y_t')
= Σ_{t=1}^T w_t² Var(Y_t) + Σ Σ_{t'≠t} w_tw_t' Cov(Y_t, Y_t')
= σ² Σ_{t=1}^T w_t² + Σ Σ_{t'≠t} w_tw_t' γ_|t−t'|
OLS and Serial Correlation (cont.)
• We can write Var(ε_t) as Cov(ε_t, ε_t) = γ_0
• This notation lets us simplify the expression
for the variance of a linear estimator:
Var(β̂) = Σ_{t=1}^T Σ_{t'=1}^T w_tw_t' γ_|t−t'|
• Note the change in the bounds of the sums.
OLS and Serial Correlation (cont.)
• The BLUE Estimator solves
min over the w_t of Σ_{t=1}^T Σ_{t'=1}^T w_tw_t' γ_|t−t'|
such that
Σ w_tX_Rt = 0 for R ≠ S and Σ w_tX_St = 1
• OLS is NOT the solution to this problem.
OLS and Serial Correlation (cont.)
• As with heteroskedasticity, we have
two choices:
1. We can transform the data so that the
Gauss–Markov conditions are met, and
OLS is BLUE; OR
2. We can disregard efficiency, apply OLS
anyway, and “fix” our formula for
estimated standard errors.
Newey–West E.S.E.’s
• We will first consider the strategy of
“fixing” the estimated standard errors
from OLS.
• Newey–West Serial Correlation
Consistent Standard Errors
Newey–West E.S.E.’s (cont.)
Var(β̂) = Σ_{t=1}^T Σ_{t'=1}^T w_tw_t' γ_|t−t'|
For the case Y_t = β₀ + β₁X_t + ε_t,
ŵ_t = x_t / Σ_{s=1}^T x_s²
Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
Newey–West E.S.E.’s (cont.)
To estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|,
we want to replace γ_|t−t'| with an estimate, e_te_t'.
However, there are far too many covariances to estimate.
Newey–West E.S.E.’s (cont.)
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• Whitney Newey and Ken West
suggested a simplification. Instead of
estimating ALL the covariances, Newey
and West suggested estimating only the
most important covariances.
Newey–West E.S.E.’s (cont.)
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• Remember, observations are more correlated
with each other the closer they are.
• Cov(1969 ,1970) > Cov(1969 ,1990)
• As |t-t’| grows large, |t-t’| approaches 0.
Newey–West E.S.E.’s (cont.)
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• As |t−t'| grows large, γ_|t−t'| approaches 0.
• Newey–West e.s.e.'s require econometricians to make a judgment about the distance |t−t'| after which they can ignore γ_|t−t'|.
Newey–West E.S.E.’s (cont.)
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• The first step in estimating Newey–West
Standard Errors is to choose a lag, L
• Then we assume γ_|t−t'| ≈ 0 for all |t−t'| > L
• The choice of L is a judgment call.
Checking Understanding
• You are working with time series data about a
company’s investment and profits. You have
quarterly data (adjusted for seasonality). You
are worried that a shock to the company’s
profitability in one quarter could continue into
the next several quarters, so you decide to
use Newey–West Standard Errors.
• What L should you choose if you believe any
shocks will dissipate within one year? Within
two years?
Checking Understanding (cont.)
• You are working with quarterly data
(adjusted for seasonality).
• If shocks will dissipate within one year:
– With quarterly data, a shock will dissipate within 4 periods, so you need to set L = 4. (You could set L > 4 to be safe, but each added lag makes your e.s.e.'s slightly less efficient.)
• If shocks will dissipate within two years:
– L = 8 (at least)
Newey–West E.S.E.’s
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• We assume γ_|t−t'| ≈ 0 for all |t−t'| > L
• The choice of L is a judgment call.
• L = 4, L = 8, and L = 12 are typical choices.
• Choosing L requires you to consider the
ECONOMICS of the problem.
Newey–West E.S.E.’s (cont.)
To estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|:
replace γ_|t−t'| with e_te_t' if |t − t'| ≤ L,
replace γ_|t−t'| with 0 if |t − t'| > L.
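That replacement rule can be coded directly. The sketch below implements the unweighted truncation described here (production implementations typically also down-weight longer lags, e.g. with Bartlett weights, to guarantee a positive variance estimate); the simulated AR(1) data and names are hypothetical:

```python
import numpy as np

def newey_west_se_slope(X, y, L):
    """Newey-West e.s.e. for the slope of y = b0 + b1*X + eps:
    keep the covariance estimates e_t * e_t' only when |t - t'| <= L
    and replace the rest with 0 (unweighted textbook version)."""
    T = len(y)
    Z = np.column_stack([np.ones(T), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    e = y - Z @ b
    x = X - X.mean()
    v = 0.0
    for t in range(T):
        for tp in range(max(0, t - L), min(T, t + L + 1)):
            v += x[t] * x[tp] * e[t] * e[tp]
    return np.sqrt(v / np.sum(x**2) ** 2)

# Persistent explanator plus first-order serially correlated errors
rng = np.random.default_rng(4)
T = 500
X, eps = np.zeros(T), np.zeros(T)
for t in range(1, T):
    X[t] = 0.8 * X[t - 1] + rng.normal()
    eps[t] = 0.7 * eps[t - 1] + rng.normal()
y = 1 + 0.5 * X + eps
se_nw = newey_west_se_slope(X, y, L=4)    # allow correlation up to 4 lags
se_l0 = newey_west_se_slope(X, y, L=0)    # L = 0 collapses to White's formula
```

With positively autocorrelated X and errors, the L = 4 e.s.e. comes out noticeably larger than the L = 0 (White) e.s.e., which ignores the serial correlation.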
Newey–West E.S.E.’s (cont.)
Newey–West e.s.e.'s correct for serial correlation:
e.s.e.(β̂₁) = √( Σ_{t=1}^T Σ_{t'=t−L}^{t+L} [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] e_te_t' )
White Robust e.s.e.'s correct for heteroskedasticity:
e.s.e.(β̂₁) = √( Σx_t²e_t² / (Σx_t²)² )
Question: How can you correct e.s.e.'s for BOTH serial correlation AND heteroskedasticity?
Newey–West E.S.E.’s (cont.)
Question: How can you correct e.s.e.'s for
BOTH serial correlation AND heteroskedasticity?
Answer: Use Newey–West e.s.e.'s.
Newey–West ALSO corrects for heteroskedasticity!
Checking Understanding
To see that Newey–West e.s.e.'s ALSO correct for heteroskedasticity, consider the case L = 0:
e.s.e.(β̂₁) = √( Σ_{t=1}^T Σ_{t'=t−L}^{t+L} [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] e_te_t' )
= √( Σ_{t=1}^T [ x_t² / (Σ_{s=1}^T x_s²)² ] e_t² )
= √( Σx_t²e_t² / (Σx_s²)² )
which is the formula for White Robust e.s.e.'s!
Newey–West E.S.E.’s
• How do we implement Newey–West
e.s.e.’s using our software?
Newey–West E.S.E.’s (cont.)
• Using Newey–West e.s.e.’s, we simply
conduct OLS as before, but tell the computer
to use the Newey–West formula for
estimating standard errors.
• One drawback: OLS is not efficient. There
exists an unbiased linear estimator with a
lower variance.
• Another drawback: we have to choose the
number of lags, L, to include in the model.
Durbin–Watson Test (Chapter 11.3)
• How do we test for serial correlation?
• As with Newey–West e.s.e.’s, we need to
limit the number of correlations we handle.
• James Durbin and G.S. Watson proposed
testing for correlation in the error terms
between adjacent observations.
• In our DGP, we assume the
strongest correlation exists between
adjacent observations.
Durbin–Watson Test (cont.)
• Correlation between adjacent disturbances is
called “first-order serial correlation.”
• To test for first-order serial correlation, we ask whether adjacent ε's are correlated.
• As usual, we'll use residuals to proxy for the ε's.
• The trick is constructing a test statistic for
which we know the distribution (so we can
calculate the probability of observing the data,
given the null hypothesis).
Durbin–Watson Test (cont.)
• We end up with a somewhat
opaque test statistic for first-order
serial correlation
d = Σ_{t=2}^T (e_t − e_{t−1})² / Σ_{t=1}^T e_t²
Durbin–Watson Test (cont.)
• To interpret the test statistic, it is helpful to expand the numerator:
d = Σ_{t=2}^T (e_t − e_{t−1})² / Σ_{t=1}^T e_t²
= [ Σ_{t=2}^T e_t² + Σ_{t=2}^T e_{t−1}² − 2 Σ_{t=2}^T e_te_{t−1} ] / Σ_{t=1}^T e_t²
Durbin–Watson Test (cont.)
• In large samples, we can divide the numerator by (T−k−2) and the denominator by (T−k−1) without creating much of a bias:
d ≈ [ (1/(T−k−2)) Σ_{t=2}^T e_t² + (1/(T−k−2)) Σ_{t=2}^T e_{t−1}² − (2/(T−k−2)) Σ_{t=2}^T e_te_{t−1} ] / [ (1/(T−k−1)) Σ_{t=1}^T e_t² ]
Durbin–Watson Test (cont.)
In large samples,
(1/(T−k−2)) Σ_{t=2}^T e_t²,  (1/(T−k−2)) Σ_{t=2}^T e_{t−1}²,  and (1/(T−k−1)) Σ_{t=1}^T e_t²
all approximate s², an estimate of the variance of the error term.
Durbin–Watson Test (cont.)
• In large samples,
(1/(T−k−2)) Σ_{t=2}^T e_te_{t−1}
approximately estimates the covariance between adjacent error terms. If there is no first-order serial correlation, this term will collapse to 0.
Durbin–Watson Test (cont.)
• In large samples, the Durbin–Watson statistic approximates
d ≈ [ 2σ² − 2Cov(ε_t, ε_{t−1}) ] / σ² = 2 − 2Cov(ε_t, ε_{t−1}) / σ²
• Under the null hypothesis of no first-order serial correlation, d ≈ 2.
Durbin–Watson Test (cont.)
• When the Durbin–Watson statistic, d, gives a value far from 2, it suggests the covariance term is not 0 after all.
• I.e., a value of d far from 2 suggests the presence of first-order serial correlation.
• At the most extreme, Cov(ε_t, ε_{t−1}) is bounded by −σ² and σ².
• d is bounded between 0 and 4.
Durbin–Watson Test (cont.)
• It seems a bit roundabout to estimate
d ≈ 2 − 2Cov(ε_t, ε_{t−1}) / σ²
when what we care about directly is Cov(ε_t, ε_{t−1}).
• We estimate d because we know something about its distribution.
Durbin–Watson Test (cont.)
• Next time, we will see how to interpret
the Durbin–Watson statistic.
Review (cont.)
• Under heteroskedasticity,
Var(β̂) = σ² Σ w_i² d_i²
• For a straight line with an unknown intercept,
Var(β̂_OLS) = σ² Σ x_i² d_i² / (Σ x_i²)²
Review (cont.)
• We can correct for heteroskedasticity by dividing our variables through by d_i (implementing Generalized Least Squares).
• The catch is that we don't observe d_i.
• We can guess what d_i is, and support our conjecture using White or Breusch–Pagan tests on the GLS model.
Review (cont.)
• Alternatively, we can estimate the d_i through Feasible Generalized Least Squares.
• FGLS requires us to write down a specific model for the heteroskedastic error terms, but we let the data choose the key parameter(s).
• We learned one important FGLS model.
Review (cont.)
• To implement FGLS, we have assumed
Var(ε_i) = σ²X_i^h
• To estimate this equation using linear regression methods, we can take advantage of the properties of logs:
ln(a^b) = b·ln(a) AND ln(ab) = ln(a) + ln(b)
• Regress
ln(e_i²) = ln(σ²) + h·ln(X_i) + ν_i
Review (cont.)
1. Estimate the regression with OLS.
2. Regress ln(e_i²) = ln(σ²) + h·ln(X_i) + ν_i
3. Divide every variable by d_i = √(X_i^ĥ) = X_i^(ĥ/2)
4. Apply OLS to the transformed data.
Review (cont.)
• Heteroskedasticity is pretty common.
• We may not always be happy making the
FGLS assumptions, especially if we don’t
really need that extra efficiency.
• OLS is unbiased. With a reasonable
sample size, OLS may yield a sufficiently
small standard error to allow reasonably
precise estimates, if we use White robust
standard errors.
Review (cont.)
• For example, for the case of a line with
only one explanator and an intercept,
White e.s.e.(β̂₁) = √( Σx_i²e_i² / (Σx_i²)² )
• The computer can easily perform this
calculation instead of the simpler,
homoskedastic version.
Review (cont.)
• Our serial correlation DGP assumes
that covariances depend only on |t-t’|
Y_t = β₀ + β₁X_1t + … + β_kX_kt + ε_t
E(ε_t) = 0
Var(ε_t) = σ²
Cov(ε_t, ε_t') = γ_tt',  γ_tt' ≠ 0 for some t ≠ t'
Specifically: γ_tt' = γ_|t−t'| for all t, t'
X's fixed across samples
Review (cont.)
• The implications of serial correlation
for OLS are similar to those of
heteroskedasticity:
– OLS is still unbiased.
– OLS is inefficient.
– The OLS formula for estimated standard
errors is incorrect.
• “Fixes” are more complicated.
Review (cont.)
• As with heteroskedasticity, we have
two choices:
1. We can transform the data so that the
Gauss–Markov conditions are met, and
OLS is BLUE; OR
2. We can disregard efficiency, apply OLS
anyway, and “fix” our formula for
estimated standard errors.
Review (cont.)
• We first consider the strategy of
“fixing” the estimated standard errors
from OLS.
• We can get “correct” e.s.e.’s by
estimating “Newey–West Serial
Correlation Consistent Standard Errors.”
Review (cont.)
For the case Y_t = β₀ + β₁X_t + ε_t, to estimate
Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
we want to replace γ_|t−t'| with an estimate, e_te_t'.
However, there are far too many covariances to estimate.
Review (cont.)
We need to estimate Var(β̂₁) = Σ_{t=1}^T Σ_{t'=1}^T [ x_tx_t' / (Σ_{s=1}^T x_s²)² ] γ_|t−t'|
• The first step in estimating Newey–West Standard Errors is to choose a lag, L.
• Then we assume γ_|t−t'| ≈ 0 for all |t−t'| > L.
• The choice of L is a judgment call.
• L = 4, L = 8, and L = 12 are typical choices.
Review (cont.)
• The Durbin–Watson test checks for
first-order serial correlation:
d = Σ_{t=2}^T (e_t − e_{t−1})² / Σ_{t=1}^T e_t²
Review (cont.)
• In large samples, the Durbin–Watson statistic approximates
d ≈ [ 2σ² − 2Cov(ε_t, ε_{t−1}) ] / σ² = 2 − 2Cov(ε_t, ε_{t−1}) / σ²
• Under the null hypothesis of no first-order serial correlation, d ≈ 2.