London School of Economics and Political Science
Department of Economics
Marcia Schafgans
Michaelmas 2022

EC2C1: Econometrics II
Problem Set 6
Date due: Wednesday November 16, 2022

1. Consider the following result of OLS estimation:

$$\widehat{wage} = \underset{(.21)}{7.10} - \underset{(.30)}{2.51}\, female, \qquad n = 526, \quad R^2 = .116,$$

where $wage$ is average hourly earnings in dollars and $female$ is a binary variable that is equal to 1 if the person is female and 0 if the person is male.

(a) Test whether the slope is statistically significant at the 1% level. Follow the testing recipe! Interpret your finding.

(b) Suppose we use the same data and consider the population model $wage = \alpha_0 + \alpha_1\, male + u$, where $male$ is a binary variable that is equal to 1 if the person is male and 0 if the person is female. What are the OLS estimates calculated from this regression? What is $R^2$?

(c) In the model $wage = \beta_0 + \beta_1\, female + u$, suppose the variances also differ between females and males: $Var(wage \mid female = 0) \neq Var(wage \mid female = 1)$ (or equivalently $Var(u \mid female = 0) \neq Var(u \mid female = 1)$). What name do we give to this feature of the errors? Is the test result in (a) still reliable? Discuss how you suggest we should proceed.

(d) Again, in the model $wage = \beta_0 + \beta_1\, female + u$, suppose $Var(u \mid female) = \sigma^2 (1 + 0.2 \cdot female)$ for some unknown $\sigma^2$. Suggest an estimator other than OLS that may have better properties.

2. We are interested in estimating a simple housing price equation. This exercise uses the data in HPRICE1, which contains 88 observations.

(a) We estimate the following regression equation and obtain both usual (in parentheses) and heteroskedasticity-robust (in square brackets) standard errors:

$$\begin{array}{rccccccc}
\widehat{price} = & -21.77 & + & .00207\, lotsize & + & .123\, sqrft & + & 13.85\, bdrms \\
 & (29.48) & & (.00064) & & (.013) & & (9.01) \\
 & [36.28] & & [.00122] & & [.017] & & [8.28]
\end{array}$$

$$n = 88, \qquad R^2 = .672.$$

Discuss any important differences between the robust and the usual standard errors.
No derivations expected.

(b) Using the same data, we estimate the regression equation with $\log(price)$ instead of $price$, and $\log(lotsize)$ and $\log(sqrft)$ instead of $lotsize$ and $sqrft$, respectively:

$$\begin{array}{rccccccc}
\widehat{\log(price)} = & -1.30 & + & .168\, \log(lotsize) & + & .700\, \log(sqrft) & + & .037\, bdrms \\
 & (.65) & & (.038) & & (.093) & & (.028) \\
 & [.76] & & [.041] & & [.101] & & [.030]
\end{array}$$

$$n = 88, \qquad R^2 = .643.$$

Discuss any important differences between the robust and the usual standard errors. No derivations expected.

(c) What does this example suggest about heteroskedasticity and the transformation used for the dependent variable?

3. Consider a finite distributed lag model for the relationship between inflation and the interest rate as given by the federal funds rate:

$$y_t = \beta_0 + \beta_1 z_t + \beta_2 z_{t-1} + \beta_3 z_{t-2} + e_t.$$

Here $y_t$ represents inflation and $z_t$ the interest rate in period $t$. Observe that $e_t$ represents inflationary shocks to the economy, e.g. oil price shocks. The theory of rational expectations states that the shock must be exogenous to all the past variables in the economy, i.e.

$$E[e_t \mid (z_t), (e_{t-1}, z_{t-1}), (e_{t-2}, z_{t-2}), \ldots] = 0.$$

Let $\hat{\beta}$ denote the OLS estimator of $\beta = (\beta_0, \beta_1, \beta_2, \beta_3)'$.

(a) Based on economic theory or otherwise, what sign (positive, negative or zero) would you expect for $Cov(z_{t+r}, e_t)$ when $r > 0$? Explain.

(b) What sign (positive, negative or zero) would you expect for $Cov(z_{t+r}, e_t)$ when $r = 0, -1, -2, \ldots$? Explain.

(c) How would you interpret the Gauss-Markov assumption MLR.4, $E[e \mid Z] = 0$, in this context? Do you think this is a reasonable assumption to make here? Explain.

(d) Is $\hat{\beta}$ necessarily an unbiased estimator for $\beta$? Is it BLUE? Clearly explain your answers.

(e) Let us define $\delta = \beta_1 + \beta_2 + \beta_3$. This parameter denotes the long-run effect of interest on inflation; $\beta_1$ is the short-run (instantaneous) effect of interest on inflation. Discuss what regression you should run to automatically get $\hat{\delta}$ and its standard error.
Discuss why you may need to use a robust standard error if we are concerned about the presence of autocorrelation. What does the concept of autocorrelation mean? When we use robust standard errors we call our test, $\hat{\delta}/se(\hat{\delta})$, an asymptotic t-test (it does not require the assumptions of strict exogeneity and normality, as we discuss in the next handout).

4. Consider the MA(2) process

$$u_t = e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2},$$

where $e_t$ is an i.i.d. random variable with mean zero and variance $\sigma^2$. Show that $u_t$ is covariance stationary and weakly dependent.

5. Consider the generalized linear regression model $y = X\beta + u$, which satisfies MLR.1, MLR.3, and MLR.4, with $Var(u \mid X) = \sigma^2 \Omega \neq \sigma^2 I$; that is, we allow for the presence of heteroskedasticity and/or autocorrelation.

(a) Provide the OLS estimator of $\beta$. No derivations expected.

(b) Show that the OLS estimator of $\beta$ remains unbiased in the presence of heteroskedasticity and/or autocorrelation.

(c) Derive $Var(\hat{\beta} \mid X)$, the conditional variance of the OLS estimator of $\beta$.
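
The unbiasedness claim in Question 5(b) can be checked numerically before or after proving it. The following is a minimal Python sketch, not part of the required answer. It holds a regressor matrix fixed, draws heteroskedastic errors many times, and averages the resulting OLS estimates. The function name `simulate_ols`, the true coefficients, and the variance design $\Omega = \mathrm{diag}(1 + x_i^2)$ are all illustrative assumptions, not taken from the problem set.

```python
import numpy as np


def simulate_ols(n=200, reps=5000, seed=0):
    """Monte Carlo check of OLS under heteroskedasticity (illustrative design)."""
    rng = np.random.default_rng(seed)

    # Hypothetical true coefficients (intercept, slope); chosen for illustration.
    beta = np.array([1.0, 2.0])

    # Regressors are drawn once and then held fixed, so we condition on X.
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])

    # Assumed error variances: Var(u_i | X) = sigma^2 * omega_i, omega_i = 1 + x_i^2.
    sigma2 = 1.0
    omega = 1.0 + x**2
    sd = np.sqrt(sigma2 * omega)

    # A = (X'X)^{-1} X', computed with a linear solve instead of an explicit inverse.
    A = np.linalg.solve(X.T @ X, X.T)

    # Each row of U is one draw of the error vector; each row of Y is one sample.
    U = rng.normal(size=(reps, n)) * sd
    Y = X @ beta + U

    # Each row of B is the OLS estimate A @ y for one replication.
    B = Y @ A.T

    return beta, B.mean(axis=0), np.cov(B, rowvar=False)


if __name__ == "__main__":
    beta, mean_hat, cov_hat = simulate_ols()
    print("true beta:          ", beta)
    print("mean of OLS draws:  ", mean_hat)
    print("Monte Carlo Var(beta_hat | X):")
    print(cov_hat)
```

The average of the estimates should sit close to the true coefficients even though the errors are heteroskedastic. The Monte Carlo covariance matrix returned as the third value can be compared against the expression you derive in part (c), evaluated at the same `X` and `omega`.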