TEST 2019, questions and answers
Econometrics for minor Finance (Rijksuniversiteit Groningen)

PRACTICE EXAMINATION
ECONOMETRICS FOR MINOR FINANCE 2018-19

INSTRUCTIONS
This is a two-hour exam with both multiple-choice and open questions. There are three sections worth 30 points, 20 points and 50 points, respectively. All questions are compulsory. In section A, you will be awarded 2 points for each correct answer and penalized 0.5 points for each wrong answer. There is no penalty for unanswered questions. Please use the provided grid to mark your answers. For sections B and C, you will be rewarded for concise logical arguments and your understanding of econometric techniques. Use the script to write your answers. Only non-programmable scientific calculators are allowed. You will not be allowed to leave the room during the test. Good luck!

SECTION A: MULTIPLE CHOICE (30 points)
Choose the one alternative that best completes the statement or answers the question.

1) To standardize a variable you
A) subtract its mean and divide by its standard deviation.
B) integrate the area below two points under the normal distribution.
C) divide it by its standard deviation, as long as its mean is 1.
D) add and subtract 1.96 times the standard deviation to the variable.

2) Imagine you regressed earnings of individuals on a constant, a binary variable ("Male") which takes on the value 1 for males and is 0 otherwise, and another binary variable ("Female") which takes on the value 1 for females and is 0 otherwise. Because females typically earn less than males, you would expect
A) the coefficient for Male to have a positive sign, and for Female a negative sign.
B) one of the OLS estimators to not exist because there is perfect multicollinearity.
C) both coefficients to be the same distance from the constant, one above and the other below.
D) this to yield a difference in means statistic.

3) Simultaneous causality bias
A) results in biased estimators if there is heteroskedasticity in the error term.
B) is also called sample selection bias.
C) happens in complicated systems of equations called block recursive systems.
D) arises in a regression of Y on X when, in addition to the causal link of interest from X to Y, there is a causal link from Y to X.

4) Threats to internal validity of quasi-experiments include
A) failure of randomization.
B) attrition.
C) failure to follow the treatment protocol.
D) all of the above with some modifications from true randomized controlled experiments.

5) The AR(p) model
A) can be represented as follows: Y_t = β_0 + β_1 X_t + β_p Y_{t-p} + u_t.
B) can be written as Y_t = β_0 + β_1 Y_{t-1} + u_{t-p}.
C) is defined as Y_t = β_0 + β_p Y_{t-p} + u_t.
D) represents Y_t as a linear function of p of its lagged values.

6) The expected value of a discrete random variable
A) is the outcome that is most likely to occur.
B) is computed as a weighted average of the possible outcomes of that random variable, where the weights are the probabilities of those outcomes.
C) equals the population median.
D) can be found by determining the 50% value in the c.d.f.

7) Using 143 observations, assume that you had estimated a simple regression function and that your estimate for the slope was 0.04, with a standard error of 0.01. You want to test whether or not the estimate is statistically significant.
Which of the following possible decisions is the only correct one:
A) the response of Y given a change in X must be economically important since it is statistically significant
B) you decide that the coefficient is small and hence most likely is zero in the population
C) since the slope is very small, so must be the regression R².
D) the slope is statistically significant since it is four standard errors away from zero

8) An estimator μ̂_Y of the population value μ_Y is unbiased if
A) E(μ̂_Y) = μ_Y
B) μ̂_Y = 0
C) μ̂_Y = μ_Y
D) μ̂_Y > μ_Y

9) Assume that you have 125 observations on the height (H) and weight (W) of your peers in college. Let the covariance s_HW = 68, and the standard deviations be s_H = 3.5, s_W = 29. The sample correlation coefficient is
A) 0.50
B) 0.67
C) cannot be computed since males and females have not been separated out.
D) 1.22

10) Changing the units of measurement of variables, e.g. measuring test scores in 100s, will do all of the following EXCEPT for changing the
A) numerical value of the slope estimate
B) interpretation of the effect that a change in X has on the change in Y
C) residuals
D) numerical value of the intercept

11) The rule-of-thumb for checking for weak instruments is as follows: for the case of a single endogenous regressor,
A) a first stage F must be statistically significant to indicate a strong instrument.
B) a first stage F < 10 indicates that the instruments are weak.
C) a first stage F > 1.96 indicates that the instruments are weak.
D) the t-statistic on each of the instruments must exceed at least 1.64.

12) Estimation of the IV regression model
A) requires exact identification or overidentification.
B) allows only one endogenous regressor, which is typically correlated with the error term.
C) requires exact identification.
D) is only possible if the number of instruments is the same as the number of regressors.

13) Time series variables fail to be stationary when
A) there are no trends.
B) there is no strong seasonal variation in the data.
C) the data is collected more frequently.
D) the population regression has breaks.

14) One of the least squares assumptions in the multiple regression model is that you have random variables which are "i.i.d." This stands for
A) irregularly integrated dichotomies.
B) independently and identically distributed.
C) initially indeterminate differences.
D) identically initiated deltas (as in changes).

15) The intercept in the multiple regression model
A) should be excluded because the population regression function does not go through the origin.
B) should be excluded if one explanatory variable has negative values.
C) is statistically significant if it is larger than 1.96.
D) determines the height of the regression line.

Section B: Short Answer Questions (4 x 5 = 20 points)
Please answer the following questions as concisely as possible. All questions carry equal points.

1. Enumerate the steps you will follow for the hypothesis test for a bivariate regression.
These are the general steps; you need to write a couple of sentences describing each: set the null and alternative hypotheses, calculate the test statistic, compare it with the critical value for the chosen level of significance, and apply the decision rule.

2. Discuss two major issues that we need to check to ascertain that our multivariate regression models are correctly specified and that our estimates are unbiased and efficient.
Discuss any of the following sources in terms of what they are and why they are problematic: multicollinearity, heteroskedasticity, omitted variable (OV) bias, etc.

3. Why is R² a more important statistic in time-series regressions but not for cross-sectional regressions?
External validity is of higher importance in time-series regressions.
The fit of the model is related to the predictive power needed for forecasting, which time-series regressions are often used for. In cross-sectional regressions the focus is more on establishing causal effects than on forecasting.

4. Angrist and Krueger (1991) estimate the returns to education in the US using the quarter of birth as an instrumental variable. Critically evaluate the appropriateness of using quarter of birth as an instrumental variable.
The students are expected to mention that the IV passes all the standard tests but that there are theoretical objections to the instrument. They need to mention at least one such theoretical objection.

Section C: Open Questions (50 points)
Please answer the questions with detailed reasoning. The number of points is shown in brackets.

1. Design possible experimental/observational protocols to examine the following research question: Does Smoking Cause Cancer? In designing the econometric techniques, please ensure that you discuss issues related to the assignment to the treatment and the control groups, control variables, and the internal and external validity of the design. (15 points)

2. You are provided with regression results from a study on the drug expenditures for US elderly (ldrugexp) regressed on
▪ A private health insurance dummy through the employer (hi_empunion) and
▪ Regressors for demographic and medical conditions: totchr (Total Chronic Condition), age, female, blhisp (Black or Hispanic), linc (log of income).

Table 1

a. Identify the endogenous or exogenous regressors from Table 1. (4 points)
b. What would be the problem if we use OLS to estimate regressions with endogenous regressors? (4 points)
c.
In table 2, you are provided information about two possible instrumental variables: ssiratio (fraction of social security income to total income) and multc (an indicator for whether the firm has multiple locations). Would you advise the researchers to use any (or both) of these variables in an instrumental variable regression? Justify your answer. (6 points)

Table 2

d. Next, you are presented with the results from the instrumental variable regressions with one IV (table 3). How does the interpretation for the estimate of the variable of interest change when you compare the IV and the OLS results? What can be the reason for the change in the magnitude and the precision of the estimate? (6 points)

Table 3

e. You are further provided with the regression results using two instrumental variables in table 4. How do the results using two instrumental variables differ from the results in table 3? (3 points)

Table 4

f. How is the fit of the regression models presented in Tables 3 and 4? What inferences should you make from the comparison of the fit of the two regression specifications? (3 points)

g. Would it be sensible to run a two-stage regression model in the following form? Justify your answer. (2 points)
    Stage 1:
        reg hi_empunion ssiratio multc totchr age female blhisp linc, vce(robust)
        predict hi_empunion_hat
    Stage 2:
        reg ldrugexp hi_empunion_hat totchr age female blhisp linc, vce(robust)

h. How do the estimates of the control variables change from the OLS to the IV results in table 4? (2 points)

Answer Keys:

2.
a. Endogenous: hi_empunion
   Exogenous: totchr, age, female, blhisp, linc (ldrugexp is the dependent variable, not a regressor)
b.
The OLS coefficients will be biased and inconsistent, and the standard errors will be wrong.
c. The students should comment on the relevance of these two variables and on the exclusion restriction. They should highlight the correlation between the two instruments.
d. The students are expected to comment on the magnitude and the sign reversal (and comment retrospectively on how the previous results were contaminated by endogeneity bias). They should also comment on the standard error.
e. Here the students are expected to comment on the small gain in precision from using two instruments, and that the qualitative results remain the same.
f. The students should comment on the R-squared from tables 3 and 4 and show that there is a drop in the fit between tables 3 and 4. They should attempt to justify the drop using the correlations they discussed in part c.
g. No. When 2SLS is run as two separate regressions, the second-stage standard errors are not corrected for the fact that hi_empunion_hat is itself an estimate from the first stage.
h. The students are required to observe and report the changes in the sign, magnitude and significance of the control variables from the OLS and any one of the IV regressions.

END OF EXAM
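
Part g of the answer key can be illustrated with a minimal Python sketch. It is not based on the exam's data: the variables z, v, x and y are simulated, the true effect of x is set to -0.5 by assumption, and it uses plain homoskedastic standard errors instead of vce(robust) to keep the arithmetic short. It runs the two stages by hand and compares the naive second-stage standard error with the one computed from the correct 2SLS residuals.

```python
import numpy as np


def ols(X, y):
    """Return OLS coefficients and (X'X)^{-1} for design matrix X."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ y, XtX_inv


# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                      # instrument
v = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + v + rng.normal(size=n)        # endogenous regressor
y = 1.0 - 0.5 * x + 2.0 * v + rng.normal(size=n)

const = np.ones(n)
X = np.column_stack([const, x])             # regressors, including endogenous x
Z = np.column_stack([const, z])             # instruments, including the constant

# OLS ignores the endogeneity of x.
b_ols, _ = ols(X, y)

# Stage 1: regress x on the instruments and keep the fitted values.
gamma, _ = ols(Z, x)
x_hat = Z @ gamma
X_hat = np.column_stack([const, x_hat])

# Stage 2: regress y on the fitted values.
b_2sls, XhXh_inv = ols(X_hat, y)

# With one instrument for one endogenous regressor, the manual two-stage
# point estimate equals the direct IV estimator (Z'X)^{-1} Z'y.
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

k = X.shape[1]
# Naive standard errors: residuals built from x_hat, which is what a
# separately run second-stage regression reports.
resid_naive = y - X_hat @ b_2sls
se_naive = np.sqrt(resid_naive @ resid_naive / (n - k) * np.diag(XhXh_inv))

# Correct 2SLS standard errors: same coefficients, residuals built from x.
resid_correct = y - X @ b_2sls
se_correct = np.sqrt(resid_correct @ resid_correct / (n - k) * np.diag(XhXh_inv))

print("OLS slope:           ", b_ols[1])
print("2SLS slope (manual): ", b_2sls[1])
print("IV slope (direct):   ", b_iv[1])
print("naive stage-2 SE:    ", se_naive[1])
print("correct 2SLS SE:     ", se_correct[1])
```

The point estimates from the manual procedure and the direct IV formula coincide; only the standard errors differ, which is why a dedicated IV routine is preferable to two separate regressions.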