Memorial University of Newfoundland
Economics ECON 6002
Roberto Martínez-Espiñeira
Assignment 6

Exam

Name___________________________________

Multiple choice questions are worth [3] each.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

1) The interpretation of the slope coefficient in the model $\ln(Y_i) = \beta_0 + \beta_1 X_i + u_i$ is as follows:
A) a change in X by one unit is associated with a $100\beta_1\%$ change in Y.
B) a 1% change in X is associated with a $\beta_1\%$ change in Y.
C) a change in X by one unit is associated with a $\beta_1$ change in Y.
D) a 1% change in X is associated with a change in Y of $0.01\beta_1$.
1) _______

2) For the polynomial regression model,
A) the techniques for estimation and inference developed for multiple regression can be applied.
B) you need new estimation techniques since the OLS assumptions do not apply any longer.
C) you can still use OLS estimation techniques, but the t-statistics do not have an asymptotic normal distribution.
D) the critical values from the normal distribution have to be changed to $1.96^2$, $1.96^3$, etc.
2) _______

3) By including another variable in the regression, you will
A) decrease the variance of the estimator of the coefficients of interest.
B) decrease the regression $R^2$ if that variable is important.
C) look at the t-statistic of the coefficient of that variable and include the variable only if the coefficient is statistically significant at the 1% level.
D) eliminate the possibility of omitted variable bias from excluding that variable.
3) _______

4) Sample selection bias
A) results in the OLS estimator being biased, although it is still consistent.
B) is only important for finite sample results.
C) is more important for nonlinear least squares estimation than for OLS.
D) occurs when a selection process influences the availability of data and that process is related to the dependent variable.
4) _______

5) The reliability of a study using multiple regression analysis depends on all of the following with the exception of
A) presence of homoskedasticity in the error term.
B) external validity.
C) errors-in-variables.
D) omitted variable bias.
5) _______

6) In the equation $\widehat{TestScore} = 607.3 + 3.85\,Income - 0.0423\,Income^2$, the following income level results in the maximum test score
A) 45.50.
B) 607.3.
C) 91.02.
D) cannot be determined without a plot of the data.
6) _______

7) To decide whether $Y_i = \beta_0 + \beta_1 X_i + u_i$ or $\ln(Y_i) = \beta_0 + \beta_1 X_i + u_i$ fits the data better, you cannot consult the regression $R^2$ because
A) the TSS are not measured in the same units between the two models.
B) $\ln(Y)$ may be negative for $0 < Y < 1$.
C) the regression $R^2$ can be greater than one in the second model.
D) the slope no longer indicates the effect of a unit change of X on Y in the log-linear model.
7) _______

8) Autoregressive distributed lag models include
A) current and lagged values of the residuals.
B) lags and leads of the dependent variable.
C) current and lagged values of the error term.
D) lags of the dependent variable, and lagged values of additional predictor variables.
8) _______

9) Negative autocorrelation in the change of a variable implies that
A) the variable contains only negative values.
B) an increase in the variable in one period is, on average, associated with a decrease in the next.
C) the series is not stable.
D) the data is negatively trended.
9) _______

10) Stationarity means that the
A) forecasts remain within 1.96 standard deviations outside the sample period.
B) error terms are not correlated.
C) time series has a unit root.
D) probability distribution of the time series variable does not change over time.
10) ______

11) To choose the number of lags in either an autoregression or in a time series regression model with multiple predictors, you can use any of the following test statistics with the exception of the
A) Bayes Information Criterion.
B) F-statistic.
C) Augmented Dickey-Fuller test.
D) Akaike Information Criterion.
11) ______

12) The random walk model is an example of a
A) binomial model.
B) stationary model.
C) stochastic trend model.
D) deterministic trend model.
12) ______

13) The Augmented Dickey-Fuller (ADF) t-statistic
A) has a normal distribution in large samples.
B) has the identical distribution whether or not a trend is included.
C) is an extension of the Dickey-Fuller test when the underlying model is AR(p) rather than AR(1).
D) is a two-sided test.
13) ______

14) Departures from stationarity
A) cannot be fixed.
B) can be made to have less severe consequences by using log-log specifications.
C) jeopardize forecasts and inference based on time series regression.
D) occur often in cross-sectional data.
14) ______

15) The main advantage of using panel data over cross-sectional data is that it
A) allows you to control for some types of omitted variables without actually observing them.
B) allows you to analyze behavior across time but not across entities.
C) allows you to look up critical values in the standard normal distribution.
D) gives you more observations.
15) ______

16) Time fixed effects regressions are useful in dealing with omitted variables
A) if these omitted variables are constant across entities but vary over time.
B) even if you only have a cross-section of data available.
C) when there are more than 100 observations.
D) if these omitted variables are constant across entities but not over time.
16) ______

17) With panel data, regression software typically uses an “entity-demeaned” algorithm because
A) the number of estimates to calculate can become extremely large when there are a large number of entities.
B) the OLS formula for the slope in the linear regression model contains deviations from means already.
C) there are typically too many time periods for the regression package to handle.
D) deviations from means sum up to zero.
17) ______

18) The notation for panel data is $(X_{it}, Y_{it})$, $i = 1, \ldots, n$ and $t = 1, \ldots, T$ because
A) the X’s represent the observed effects and the Y the omitted fixed effects.
B) n has to be larger than T for the OLS estimator to exist.
C) we take into account that the entities included in the panel change over time and are replaced by others.
D) there are n entities and T time periods.
18) ______

19) The Fixed Effects regression model
A) has n different intercepts.
B) in a log-log model may include logs of the binary variables, which control for the fixed effects.
C) has “fixed” (repaired) the effect of heteroskedasticity.
D) allows the slope coefficients to differ across entities, but the intercept is “fixed” (remains unchanged).
19) ______

20) The difference between an unbalanced and a balanced panel is that
A) you cannot have both fixed time effects and fixed entity effects regressions.
B) in the former you may not include drivers who have been drinking in the fatality rate/beer tax study.
C) the impact of different regressors is roughly the same for balanced but not for unbalanced panels.
D) an unbalanced panel contains missing observations for at least one time period or one entity.
20) ______

ESSAY. Write your answer on a separate sheet of paper.

21) Indicate whether or not you can linearize the regression functions below so that OLS estimation methods can be applied: [5]
(a) Yi =
(b) Yi = + ui

22) Assume that you had data for a cross-section of 100 households with data on consumption and personal disposable income. If you fit a linear regression function regressing consumption on disposable income, what prior expectations do you have about the slope and the intercept? The slope of this regression function is called the “marginal propensity to consume.” If, instead, you fit a log-log model, then what is the interpretation of the slope? Do you have any prior expectation about its size?
[5]

23) To investigate whether or not there is discrimination against a sub-group of individuals, you regress the log of earnings on determining variables, such as education, work experience, etc., and a binary variable which takes on the value of one for individuals in that sub-group and is zero otherwise. You consider two possible specifications. First, you run two separate regressions, one for the observations that include the sub-group and one for the others. Second, you run a single regression, but allow for a binary variable to appear in the regression. Your professor suggests that the second equation is better for the task at hand, as long as you allow for a shift in both the intercept and the slopes. Explain her reasoning. [5]

24) You have decided to use the Dickey-Fuller (DF) test on the United States aggregate unemployment rate (sample period 1962:I – 1995:IV). As a result, you estimate the following AR(1) model

$\widehat{\Delta UrateUS}_t = 0.114 - 0.024\,UrateUS_{t-1}$, $R^2 = 0.0118$, SER = 0.3417
(0.121) (0.019)

You recall that your textbook mentioned that this form of the AR(1) is convenient because it allows you to test for the presence of a unit root by using the t-statistic of the slope. Being adventurous, you decide to estimate the original form of the AR(1) instead, which results in the following output

$\widehat{UrateUS}_t = 0.114 + 0.976\,UrateUS_{t-1}$, $R^2 = 0.9510$, SER = 0.3417
(0.121) (0.019)

You are surprised to find the constant, the standard errors of the two coefficients, and the SER unchanged, while the regression $R^2$ increased substantially. Explain this increase in the regression $R^2$. Why should you have been able to predict the change in the slope coefficient and the constancy of the standard errors of the two coefficients and the SER? [5]

25) You want to find the determinants of suicide rates in the United States. To investigate the issue, you collect state-level data for ten years.
Your first idea, suggested to you by one of your peers from Southern California, is that the annual amount of sunshine must be important. Stacking the data and using no fixed effects, you find no significant relationship between suicide rates and this variable. (This is good news for the people of Seattle.) However, sorting the suicide rate data from highest to lowest, you notice that the states with the lowest population density dominate the highest suicide rate category. You run another regression, without fixed effects, and find a highly significant relationship between the two variables. Even adding some economic variables, such as state per capita income or the state unemployment rate, does not lower the t-statistic for the population density by much. Adding fixed entity and time effects, however, results in an insignificant coefficient for population density.
(a) What do you think is the cause for this change in significance? Which fixed effect is primarily responsible? Does this result imply that population density does not matter? [5]
(b) Speculate as to what happens to the coefficients of the economic variables when the fixed effects are included. Use this example to make clear what factors entity and time fixed effects pick up. [5]
(c) What other factors might play a role? [5]

26) The following two graphs give you a plot of the United States aggregate unemployment rate for the sample period 1962:I to 1999:IV, and the (log) level of real United States GDP for the sample period 1962:I to 1995:IV. You want to test for stationarity in both cases. Indicate whether or not you should include a time trend in your Augmented Dickey-Fuller test and why. [5]