10.3 Time Series Thus Far

-Whereas cross-sectional data needed 3 assumptions to make OLS unbiased, time series data needs only 2
-although the exogeneity assumption (TS.3) is much stronger
-If we omit a relevant variable, we cause bias, as seen and calculated in Chapter 3
-Now all that remains is to derive assumptions that allow us to test the significance of our OLS estimates

Assumption TS.4 (Homoskedasticity)

Conditional on X, the variance of u_t is the same for all t:

Var(u_t | X) = Var(u_t) = σ², t = 1, 2, ..., n

Assumption TS.4 Notes

-essentially, the variance of the error term cannot depend on X; it must be constant
-it is sufficient if:
1) u_t and X are independent
2) Var(u_t) is constant over time (ie: no trending)
-if TS.4 is violated we again have heteroskedasticity
-Chapter 12 shows tests for heteroskedasticity similar to those found in Chapter 8

Assumption TS.4 Violation

Consider the regression:

tuition_t = β0 + β1 inflation_t + β2 politics_t + u_t

-Unfortunately, tuition is often a political rather than an economic decision, leading to tuition freezes (= real tuition decreases) in an attempt to buy votes
-This effect can span time periods
-Since politics can affect the variability of tuition, this regression is heteroskedastic

Assumption TS.5 (No Serial Correlation)

Conditional on X, the errors in two different time periods are uncorrelated:

Corr(u_t, u_s | X) = 0, for all t ≠ s

Assumption TS.5 Notes

-If we assume that X is non-random, TS.5 simplifies to:

Corr(u_t, u_s) = 0, t ≠ s (10.12)

-If this assumption is violated, we say that our time series errors suffer from AUTOCORRELATION, as they are correlated across time
-note that TS.5 assumes nothing about intertemporal correlation among the x variables
-we didn't need this assumption for cross-sectional data, as random sampling ensured no connection between error terms

Assumption TS.5 Violation

Take the regression:

weight_t = β0 + β1 calories_t + β2 exercise_t + u_t

-If actual weight is unexpectedly high in one period (high fat intake), then u_t > 0, and weight can be expected to be high in subsequent periods (u_{t+1} > 0)
-Likewise, if weight is unexpectedly low in one period (liposuction), then u_t < 0, and weight can be expected to be low in subsequent periods (u_{t+1} < 0)
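To see what a violation of TS.5 looks like, here is a minimal sketch in Python with NumPy; the AR(1) error process, the sample size, and ρ = 0.8 are all assumptions chosen for illustration. It simulates errors that carry over from one period to the next, as in the weight example, and checks their first-order correlation:

```python
import numpy as np

rng = np.random.default_rng(0)
n, rho = 200, 0.8

# AR(1) errors: u_t = rho * u_{t-1} + e_t, with e_t iid Normal(0, 1).
# Any rho != 0 violates TS.5.
e = rng.normal(0.0, 1.0, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + e[t]

# Sample Corr(u_t, u_{t-1}): close to rho, not the 0 that TS.5 requires
print(np.corrcoef(u[1:], u[:-1])[0, 1])
```

With ρ = 0 the printed correlation would hover near zero, which is exactly what TS.5 demands.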
10.3 Gauss-Markov Assumptions

-Assumptions TS.1 through TS.5 are our Gauss-Markov assumptions for time series data
-They allow us to estimate OLS variance
-If cross-sectional data is not random, TS.1 through TS.5 can sometimes be used in cross-sectional applications
-with these 5 properties in time series data, variance is calculated and the Gauss-Markov theorem holds the same as with cross-sectional data
-the same finite-sample OLS properties apply in time series as in cross-sectional data:

Theorem 10.2 (OLS Sampling Variances)

Under the time series Gauss-Markov Assumptions TS.1 through TS.5, the variance of β̂_j, conditional on X, is

Var(β̂_j | X) = σ² / [SST_j (1 − R²_j)], j = 1, ..., k

where SST_j is the total sum of squares of x_tj and R²_j is the R-squared from the regression of x_j on the other independent variables

Theorem 10.3 (Unbiased Estimation of σ²)

Under assumptions TS.1 through TS.5, the estimator

σ̂² = SSR / (n − k − 1)

is an unbiased estimator of σ², where df = n − k − 1

Theorem 10.4 (Gauss-Markov Theorem)

Under assumptions TS.1 through TS.5, the OLS estimators are the best linear unbiased estimators conditional on X

10.3 Time Series and Testing

-In order to construct valid standard errors, t statistics, and F statistics, we need to add one more assumption
-TS.6 implies, and is stronger than, TS.3, TS.4 and TS.5
-given these 6 time series assumptions, tests are conducted identically to the cross-sectional case
-time series assumptions are more restrictive than cross-sectional assumptions

Assumption TS.6 (Normality)

The errors u_t are independent of X and are independently and identically distributed as Normal(0, σ²)

Theorem 10.5 (Normal Sampling Distribution)

Under assumptions TS.1 through TS.6, the CLM assumptions for time series, the OLS estimators are normally distributed, conditional on X. Further, under the null hypothesis, each t statistic has a t distribution, and each F statistic has an F distribution. The usual construction of confidence intervals is also valid.

10.4 Time Series Logs

-Logarithms used in time series regressions again refer to percentage changes:

log(U_t) = α0 + δ0 log(sleep_t) + δ1 log(sleep_{t−1}) + otherfactors_t + u_t

-here the impact propensity, δ0, is also called the SHORT-RUN ELASTICITY
-it measures the immediate percentage change in utility given a 1% increase in sleep
-the long-run propensity (δ0 + δ1 in this case) is called the LONG-RUN ELASTICITY
-it measures the percentage change in utility 2 periods after a 1% increase in sleep

10.4 Time Series Dummy Variables

-Time series data can benefit from dummy variables much like cross-sectional data
-DVs can indicate when a characteristic changes (ie: Rain = 1 on days that it rains)
-DVs can also refer to periods of time, to see if there are systematic differences between time periods
-for example, if you suspect base utility to be different during exams:

U_t = β0 + δ0 Exams_t + otherfactors_t + u_t

-where Exams = 1 during exams

10.4 Index Review

-an index number aggregates a vast amount of information into a single quantity
-for example, Econ 399 time can be spent in class, reviewing the text/notes, studying, working on assignments, or working on your paper
-since all these individual factors are highly correlated (and one hour in one area is not necessarily the same as one hour elsewhere) and too numerous to include individually, work on Econ 399 can instead be shown as an index
-An index is generally equal to 100 in the base year. Base years are changed using:

new index_t = 100 × (old index_t / old index_{new base})

-where old index_{new base} is the old value of the index in the new base year
-a special case of indexes is a price index (PI), which is also useful to convert nominal variables to REAL variables:

real = 100 × (nominal / PI)
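As a quick numeric illustration of the re-basing and deflating formulas above, here is a minimal Python/NumPy sketch; the years, index values, and nominal series are all hypothetical:

```python
import numpy as np

# Hypothetical price index with base year 2015 (index = 100 in 2015)
years = np.array([2015, 2016, 2017, 2018])
old_index = np.array([100.0, 103.0, 107.1, 110.3])

# Re-base to 2017: new_index_t = 100 * old_index_t / old_index_{new base}
new_index = 100 * old_index / old_index[years == 2017][0]

# Deflate a (hypothetical) nominal series: real = 100 * nominal / PI
nominal = np.array([5000.0, 5150.0, 5400.0, 5600.0])
real = 100 * nominal / old_index

print(new_index)  # equals 100.0 in 2017, the new base year
print(real)       # the nominal series expressed in base-year dollars
```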
10.4 Index Review

-indexes and dummy variables can be used together for event studies, to test whether an event has a structural impact on a regression
-for example: your favourite character on TV is killed off, and you want to test if this affects your Econ 399 performance. You estimate the regression:

Mark_t = β0 + δ0 Char_t + β1 Work_t + otherfactors_t + u_t

-where Char = 1 in periods after the character is killed off
-To see if the TV event made an impact, test whether δ0 = 0
-one could also include and test multiplicative dummy variables

10.5 Time Trends

-Sometimes economic data has a TIME TREND: a tendency to grow over time
-if two variables are either increasing or decreasing over time, they will appear to be correlated even though they may be independent
-failure to account for trending can lead to errors in a regression
-even one trending variable in a regression can lead to errors, as we shall see

10.5 Linear Time Trend

-The linear time trend is a simple model of trending:

y_t = α0 + α1 t + e_t, t = 1, 2, ..., n (10.24)

-where e_t is an independent, identically distributed sequence with E(e_t) = 0 and Var(e_t) = σ²_e
-the change in y between any two adjacent periods is equal to α1
-if α1 > 0, y is growing over time and has an upward trend
-if α1 < 0, y is shrinking over time and has a downward trend

10.5 Exponential Time Trend

-The linear time trend allows for the same increase in y every period
-An exponential time trend allows for the same PERCENTAGE increase in y each period:

log(y_t) = α0 + α1 t + e_t, t = 1, 2, ..., n (10.26)

-Here each period's change in log(y_t) is equal to α1
-As we've seen previously, if growth is small, the percentage growth rate of y_t each period is approximately 100(α1)%

10.5 Quadratic Time Trend

-While linear and exponential time trends are most common, more complicated trends can occur
-For example, take a quadratic time trend:

y_t = α0 + α1 t + α2 t² + e_t (10.29)

-Using derivatives, the one-period increase in y_t is approximately:

Δy_t / Δt ≈ α1 + 2 α2 t (10.30)

-Although more complicated trends are possible, they run the risk of explaining variation that should be attributed to x and not t

10.5 Spurious Regressions

-Trending variables do not themselves cause a violation of TS.1 through TS.6
-however, if y and at least one x variable appear to be correlated only due to trending, the regression suffers from a SPURIOUS REGRESSION PROBLEM
-if y itself is trending, we have the true regression:

y_t = β0 + β1 x_t1 + β2 x_t2 + β3 t + u_t

-If we omit the valid "variable" t, we have caused bias
-this effect is heightened if the x variables are also trending
-adding a time trend can actually make a variable more significant, if its movement about its trend affects y
-note that including a time trend is also valid if only x (and not y) is trending

10.5 Detrending

-Including a time trend can be seen as similar to "partialling out" the trending of the variables:

1) Regress y and all x variables on the time trend and save the residuals, such that:

ÿ_t = y_t − α̂0 − α̂1 t

-In the above example, y has been linearly detrended using the regression:

y_t = α0 + α1 t + e_t

2) Run the following regression on the detrended variables. Intercepts are not needed, and will be estimated as zero if not omitted:

ÿ_t = β1 ẍ_t1 + β2 ẍ_t2 + ... + β_k ẍ_tk + v_t

-These betas will be identical to those from the regression with a time trend included (a numeric check appears in the sketch below)
-this shows why including a time trend is also important if x is trending; the OLS estimates are still affected by the trend
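The claim that the two-step detrending procedure reproduces the betas from the regression that includes t (the "partialling out" result) can be checked numerically. Below is a minimal Python/NumPy sketch with simulated data; the sample size, trend slope, and all coefficient values are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
t = np.arange(1, n + 1, dtype=float)

# Simulated trending regressor and trending outcome (all values hypothetical)
x = 0.5 * t + rng.normal(0.0, 2.0, n)
y = 2.0 + 0.8 * x + 0.3 * t + rng.normal(0.0, 1.0, n)

# (a) OLS of y on an intercept, x, and the time trend t
X = np.column_stack([np.ones(n), x, t])
b_with_trend = np.linalg.lstsq(X, y, rcond=None)[0]

# (b) Detrend y and x (residuals from regressions on [1, t]),
#     then regress detrended y on detrended x with no intercept
T = np.column_stack([np.ones(n), t])
y_dt = y - T @ np.linalg.lstsq(T, y, rcond=None)[0]
x_dt = x - T @ np.linalg.lstsq(T, x, rcond=None)[0]
b_detrended = np.linalg.lstsq(x_dt[:, None], y_dt, rcond=None)[0]

# The coefficient on x is identical in both approaches
print(b_with_trend[1], b_detrended[0])
```

The two printed numbers agree up to floating-point error, which is the equivalence the detrending procedure relies on.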
10.5 R² and Trending

-The typical R² for time series regressions is artificially high, as SST/(n−1) is no longer an unbiased or consistent estimator of Var(y_t) in the face of trending
-R² cannot account for y's trending
-the simplest solution is to calculate R² from a regression where y has been detrended:

ÿ_t = β0 + β1 x_t1 + β2 x_t2 + ... + β_k x_tk + γ0 t + v_t

-Note that only the y has been detrended; t is still included as an explanatory variable
-This R² can be calculated as:

R² = 1 − SSR / (Σ_t ÿ_t²)

-Note that SSR is the same for both the models with and without t
-this R² will always be less than or equal to the typical R²
-this R² can be adjusted to account for variable inclusion
-when doing F tests, the typical R² is still used

10.5 Seasonality

-Some data may exhibit SEASONALITY: it may naturally vary within the year, across seasons
-ie: housing starts, ice cream sales
-typically, data that exhibits seasonal patterns is seasonally adjusted
-if this is not the case, seasonal dummy variables should be included (11 monthly dummy variables, 3 seasonal dummy variables, etc.)
-significance tests can then be performed to evaluate the seasonality of the data

10.5 Deseasonalizing

Just as data can be detrended, it can also be deseasonalized:

1) Regress each y and x variable on the seasonal dummy variables and obtain the residuals:

ÿ_t = y_t − α̂0 − α̂1 Spring_t − α̂2 Summer_t − α̂3 Fall_t

2) Regress the deseasonalized y (the residuals) on the deseasonalized x's (these two steps are sketched in code below):

ÿ_t = β1 ẍ_t1 + β2 ẍ_t2 + ... + β_k ẍ_tk + v_t

-This deseasonalized model is again a better source for accurate R² values, as it nets out any variation attributed to seasonality
-Note that some regressions may suffer from both trending and seasonality, requiring both detrending and deseasonalizing; this means including both the seasonal dummy variables and a time trend in step 1 above
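To make the two-step deseasonalizing procedure concrete, here is a minimal Python/NumPy sketch using simulated quarterly data; the seasonal effects, coefficients, and sample size are all hypothetical. It deseasonalizes y and x with seasonal dummies, runs the second-stage regression, and computes the R² based on the deseasonalized y:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80                # 20 years of quarterly data (hypothetical)
q = np.arange(n) % 4  # 0 = winter (base), 1 = spring, 2 = summer, 3 = fall

# Intercept plus Spring/Summer/Fall dummies (winter is the base season)
Z = np.column_stack([np.ones(n)] + [(q == s).astype(float) for s in (1, 2, 3)])

# Simulated series in which both x and y shift by season
x = rng.normal(0.0, 1.0, n) + np.array([0.0, 1.0, 2.0, 0.5])[q]
y = 3.0 + 0.7 * x + np.array([0.0, 2.0, 4.0, 1.0])[q] + rng.normal(0.0, 1.0, n)

# Step 1: regress y and x on the seasonal dummies, keep the residuals
y_ds = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
x_ds = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Step 2: regress deseasonalized y on deseasonalized x (no intercept needed)
beta = np.linalg.lstsq(x_ds[:, None], y_ds, rcond=None)[0][0]

# R^2 from the deseasonalized regression: 1 - SSR / sum(y_ds^2)
ssr = np.sum((y_ds - beta * x_ds) ** 2)
print(beta, 1 - ssr / np.sum(y_ds ** 2))
```

The estimated slope lands near the simulated 0.7 once the seasonal variation has been netted out of both series.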