(EC220) - LSE Learning Resources Online

Christopher Dougherty
EC220 - Introduction to econometrics: past examinations and marking schemes
2010 exam and marking scheme

Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics: past examinations and marking schemes. [Teaching Resource]
© 2010 The Author
This version available at: http://learningresources.lse.ac.uk/160/
Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms.
http://creativecommons.org/licenses/by-sa/3.0/
http://learningresources.lse.ac.uk/

Summer 2010 examination
EC220 Introduction to Econometrics
2009/2010 syllabus only. Not for resit candidates.

Instructions to candidates
Time allowed: 3 hours + 15 minutes reading time
This paper contains NINE questions. Answer any FOUR questions. All questions will be given equal weight (25%).
You are supplied with: Graph paper, Statistical tables
Calculators are NOT allowed in this examination.

© LSE 2010/EC220 Page 1 of 13

1. A researcher is considering two regression specifications:

$\log Y = \beta_1 + \beta_2 \log X + u$ (1)

and

$\log \frac{Y}{X} = \alpha_1 + \alpha_2 \log X + u$ (2)

where u is a disturbance term.

(a) [3 marks] Determine whether (2) is a reparameterized or a restricted version of (1). [Note: A mathematical explanation is required.]

(b) [5 marks] Writing $y = \log Y$, $x = \log X$, and $z = \log \frac{Y}{X}$, and using the same sample of n observations, the researcher fits the two specifications using OLS:

$\hat{y} = b_1 + b_2 x$ (3)

and

$\hat{z} = a_1 + a_2 x$ (4)

Using the expressions for the OLS regression coefficients, demonstrate that $b_2 = a_2 + 1$.

(c) [3 marks] Similarly, using the expressions for the OLS regression coefficients, demonstrate that $b_1 = a_1$.
(d) [3 marks] Hence demonstrate that the relationship between the fitted values of y, the fitted values of z, and the actual values of x, is $\hat{y}_i - x_i = \hat{z}_i$.

(e) [3 marks] Hence show that the residuals for regression (3) are identical to those for (4).

(f) [3 marks] Hence show that the standard errors of $b_2$ and $a_2$ are the same.

(g) [3 marks] Determine the relationship between the t statistic for $b_2$ and the t statistic for $a_2$, and give an intuitive explanation for the relationship.

(h) [2 marks] Explain whether $R^2$ would be the same for the two regressions. [Note: A brief intuitive (rather than a mathematical) explanation is sufficient.]

2. Three researchers are investigating the effects of time spent studying on the examination marks earned by students on a certain course. For a sample of 100 students, they have the examination mark, M, total hours spent studying, H, hours on primary study, P, and hours spent on revision, R. By definition, H = P + R. The sample means of H, P, and R are 100 hours, 95 hours, and 5 hours, respectively. The sample correlation coefficients are 0.98 for H and P, 0.10 for H and R, and -0.11 for P and R. The standard deviations of the distributions of H, P, and R are 10.1, 10.1, and 2.1, respectively.

Researcher A decides to regress M on P and R and fits the following regression (standard errors in parentheses; t statistics in square brackets):

M̂ = 45.6   + 0.15 P + 0.21 R      $R^2 = 0.243$   (1)
    (2.8)    (0.03)   (0.14)
    [16.30]  [5.49]   [1.51]

Researcher B decides to regress M on H, P, and R. However, the regression application refuses to fit the regression with all three explanatory variables. Instead, it drops R and the regression output is

M̂ = 45.6   + 0.21 H - 0.05 P      $R^2 = 0.243$   (2)
    (2.8)    (0.14)   (0.14)
    [16.30]  [1.51]   [-0.40]

Note: In answering the following questions, you should give your reasoning in general terms. Detailed mathematical analysis is not required and no credit will be given for it.
(a) [3 marks] Researcher A says that her specification has better explanatory power than that of Researcher B because the coefficient of her main variable, P, has a high t statistic. Explain whether this assertion is correct.

(b) [3 marks] She says that the insignificant coefficient of R in (1) is to be expected because the students, on average, spent much less time on revision than on primary study. Explain whether this assertion is correct.

(c) [3 marks] Researcher B says that, assuming that his specification is in fact correct, not being able to include R in the regression has given rise to omitted variable bias, and this is responsible for the negative coefficient of P. Explain whether this assertion is correct.

(d) [4 marks] A commentator, drawing attention to the high correlation between H and P, says that the real reason for the implausible negative coefficient of P obtained by Researcher B is multicollinearity. Explain whether this assertion is correct. Is the negative coefficient of P implausible?

(e) [6 marks] Researcher C says that it would be better to keep the specification simple and just regress M on H. He has done so, with the following results:

M̂ = 45.8   + 0.15 H      $R^2 = 0.241$   (3)
    (2.8)    (0.03)
    [16.56]  [5.58]

He says that his results are actually more satisfactory than those of Researchers A and B. Explain whether this assertion is correct.

(f) [2 marks] A commentator says that, given the low $R^2$, it is obvious that there are other important determinants of the marks and they may be associated with willingness to spend time studying, causing all the regression results to be distorted. Explain whether this assertion might be correct.

(g) [4 marks] Another commentator says that, since the hours-of-study data are self-reported, it is likely that many of the students have made up the numbers.
He says that this will cause t statistics to be lower than they would have been if the reporting were accurate, and this may be one of the reasons that the $R^2$ is so low. Explain whether this assertion might be correct.

3. As part of a workshop project, four students are investigating the effects of ethnicity and sex on earnings using data for the year 2002 in the National Longitudinal Survey of Youth 1979–. They all start with the same basic specification:

$\log Y = \beta_1 + \beta_2 S + \beta_3 EXP + u$

where Y is hourly earnings, measured in dollars, S is years of schooling completed, and EXP is years of work experience. The sample contains 123 black males, 150 black females, 1,146 white males, and 1,127 white females. (All respondents were either black or white. The Hispanic subsample was dropped.) The output from fitting this basic specification is shown in column 1 of the table (standard errors in parentheses; RSS is residual sum of squares, n is the number of observations in the regression).

            Basic           Student C                       Student D
            (1)             (2)             (3)             (4a)            (4b)            (5a)            (5b)
            All             All             All             Males           Females         Whites          Blacks
S           0.126 (0.004)   0.121 (0.004)   0.121 (0.004)   0.133 (0.006)   0.112 (0.006)   0.126 (0.005)   0.112 (0.012)
EXP         0.040 (0.002)   0.032 (0.002)   0.032 (0.002)   0.032 (0.004)   0.035 (0.003)   0.041 (0.003)   0.028 (0.005)
MALE        -               0.277 (0.020)   0.308 (0.021)   -               -               -               -
BLACK       -               -0.144 (0.032)  -0.011 (0.043)  -               -               -               -
MALEBLAC    -               -               -0.290 (0.063)  -               -               -               -
constant    0.0376 (0.078)  0.459 (0.076)   0.447 (0.076)   0.566 (0.124)   0.517 (0.097)   0.375 (0.087)   0.631 (0.172)
R2          0.285           0.341           0.346           0.287           0.275           0.271           0.320
RSS         659             608             603             452             289             609             44
n           2,546           2,546           2,546           1,269           1,277           2,273           273

Student A divides the sample into the four ethnicity/sex categories. He chooses white females as the reference category and fits a regression that includes three dummy variables BM, WM, and BF. BM is 1 for black males, 0 otherwise; WM is 1 for white males, 0 otherwise; and BF is 1 for black females, 0 otherwise.
Student B simply fits the basic specification separately for the four ethnicity/sex subsamples.

Student C defines dummy variables MALE, equal to 1 for males and 0 for females, and BLACK, equal to 1 for blacks and 0 for whites. She also defines an interactive dummy variable MALEBLAC as the product of MALE and BLACK. She fits a regression adding MALE and BLACK to the basic specification, and a further regression adding MALEBLAC as well. The output from these regressions is shown in columns 2 and 3 in the table.

Student D divides the sample into males and females and performs the regression for both sexes separately, using the basic specification. The output is shown in columns 4a and 4b. She also divides the sample into whites and blacks, and again runs separate regressions using the basic specification. The output is shown in columns 5a and 5b.

(a) Reconstruction of missing output. Students A and B left their output on a bus on the way to the workshop. This is why it does not appear in the table.

(i) [6 marks] State what the missing output of Student A would have been, as far as this can be done exactly, given the results of Students C and D. (Coefficients, standard errors, $R^2$, RSS.)

(ii) [3 marks] Explain why it is not possible to reconstruct any of the output of Student B.

(b) Tests of hypotheses. The approaches of the students allowed them to perform different tests, given the output shown in the table and the corresponding output for Students A and B. Explain the tests relating to the effects of sex and ethnicity that could be performed by each student, giving a clear indication of the null hypothesis in each case. (Remember, all of them started with the basic specification (1), before continuing with their individual regressions.) In the case of F tests, state the test statistic in terms of its components (but do not attempt to evaluate it).
[Note: You are not expected to discuss the outcome of any of these tests and no credit will be given for such discussions.]

(i) [4 marks] Student A (assuming he had found his output)
(ii) [2 marks] Student B (assuming he had found his output)
(iii) [3 marks] Student C
(iv) [4 marks] Student D

(c) [3 marks] If you had been participating in the project and had had access to the data set, what regressions and tests would you have performed?

4. A researcher has data on the number of crimes per 1,000 inhabitants, C, number of police per 1,000 inhabitants, P, and average household income, Y, measured in thousands of pounds, for 30 cities for 2009. She hypothesizes that C is likely to be negatively related to P, and that P may be related to Y:

$C = \alpha_1 + \alpha_2 P + u$

$P = \beta_1 + \beta_2 Y + v$

where u and v are both identically and independently distributed disturbance terms and unrelated to each other. Putting the two relationships together, the researcher thinks C is related to Y:

$C = (\alpha_1 + \alpha_2 \beta_1) + \alpha_2 \beta_2 Y + u + \alpha_2 v$

Accordingly she fits the regressions (standard errors in parentheses):

Ĉ = 3.79 + 0.22P      $R^2 = 0.16$   (1)
    (0.46)  (0.10)

P̂ = 9.50 - 0.08Y      $R^2 = 0.38$   (2)
    (0.80)  (0.02)

Ĉ = 0.59 - 0.03Y      $R^2 = 0.26$   (3)
    (0.49)  (0.01)

Experimenting further, she also fits the regression

Ĉ = 4.20 - 0.08Y - 0.63P      $R^2 = 0.46$   (4)
    (0.60)  (0.02)  (0.20)

(a) (6 marks) The investigator makes the following statement: ‘Crime rates per 1,000 inhabitants can be expected to be lower in cities with more police per 1,000 inhabitants. I am therefore entitled to perform a one-sided test on the coefficient of P in regression (1). The t statistic for the coefficient is 2.20. The critical value of t, using a one-sided test, is about 1.70 at the 5% level and 2.47 at the 1% level.
I can therefore conclude at the 5% level (but not the 1% level) that the number of police per 1,000 inhabitants has a significant effect on crime rates.’ Explain whether you agree or disagree with this statement, giving reasons.

(b) (8 marks) Explain mathematically why the coefficients of Y and P in regression (4) are different from their coefficients in regressions (1) and (3). [Note: You may treat Y and P as nonstochastic variables.]

(c) (5 marks) In regression (3) the residual sum of squares was 3,700. In regression (4) it was 2,700. Perform an F test on the reduction in the residual sum of squares, stating clearly your null hypothesis and conclusion. How is this test related to a one-sided test on the coefficient of P in regression (4)?

(d) (3 marks) A commentator offers the following explanation for the coefficient of Y being less negative in regression (3) than in regression (4). ‘In regression (4), the coefficient is an estimate of the marginal effect of Y, holding P constant. In regression (3), the coefficient estimates the overall effect of Y on C, taking account of the fact that in reality P decreases with Y.’ Explain whether this interpretation is correct.

(e) (3 marks) Another commentator suggests that the real reason for the negative correlation between P and Y is that more police are needed where there is more crime and there is less crime in higher-income areas. If this is correct, how would this affect the researcher’s conclusions?

5. A researcher has data on annual household expenditure on food, F, and total annual household expenditure, E, both measured in dollars, for 400 households in the United States for 2006. The scatter plot for the data is shown as Figure 5.1. The basic model of the researcher is

$F = \beta_1 + \beta_2 E + u$ (1)

where u is a disturbance term. The researcher suspects heteroscedasticity and performs a Goldfeld–Quandt test and a White test.
For the Goldfeld–Quandt test, she sorts the data by size of E and fits the model for the subsample with the 150 smallest values of E and for the subsample with the 150 largest values. The residual sums of squares (RSS) for these regressions are shown in column (1) of the table. She also fits the regression for the entire sample, saves the residuals, and then fits an auxiliary regression of the squared residuals on E and its square. $R^2$ for this regression is also shown in column (1) in the table. She performs parallel tests of heteroscedasticity for two alternative models:

$\frac{F}{A} = \beta_1 \frac{1}{A} + \beta_2 \frac{E}{A} + v$ (2)

$\log F = \gamma_1 + \gamma_2 \log E + w$ (3)

A is household size in terms of equivalent adults, giving each adult a weight of 1 and each child a weight of 0.7. The scatter plot for F / A and E / A is shown as Figure 5.2, and that for log F and log E as Figure 5.3. The data for the heteroscedasticity tests for models (2) and (3) are shown in columns (2) and (3) of the table.

Specification                       (1)            (2)            (3)
Goldfeld–Quandt test
  RSS smallest 150                  200 million    40 million     20.0
  RSS largest 150                   820 million    240 million    21.0
White test
  R2 from auxiliary regression      0.160          0.140          0.001

(a) [4 marks] Explain the intuitive idea behind the Goldfeld–Quandt test.

(b) [5 marks] Perform the Goldfeld–Quandt test for each model and state your conclusions.

(c) [3 marks] Explain why the researcher thought that model (2) might be an improvement on model (1).

(d) [3 marks] Explain why the researcher thought that model (3) might be an improvement on model (1).

(e) [4 marks] When models (2) and (3) are tested for heteroscedasticity using the White test, auxiliary regressions must be fitted. State the specification of this auxiliary regression for model (2).

(f) [3 marks] Perform the White test for the three models.

(g) [3 marks] Explain whether the results of the tests seem reasonable, given the scatter plots of the data.
[Figure 5.1: scatter plot of household expenditure on food ($) against total household expenditure ($)]

[Figure 5.2: scatter plot of household expenditure on food per equivalent adult ($) against total household expenditure per equivalent adult ($)]

[Figure 5.3: scatter plot of log household expenditure on food against log total household expenditure]

6. A researcher using cross-sectional data hypothesizes that two variables Y and X are jointly determined by a simultaneous equations model consisting of the following two relationships:

$Y = \beta_1 + \beta_2 X + \beta_3 Z + u$ (1)

$X = \alpha_1 + \alpha_2 Y + v$ (2)

where Z may be assumed to be an exogenous variable and u and v are identically and independently distributed disturbance terms with zero means. The observations for Z are drawn from a fixed population with finite mean and variance.

(a) [2 marks] Derive the reduced form equation for Y.

(b) [8 marks] Demonstrate that the ordinary least squares (OLS) estimator of $\alpha_2$ is, in general, inconsistent. [Note: You are not required to determine the sign of the large-sample bias and no credit will be given for doing so.] How is your conclusion affected in the special case $\beta_2 = 0$? How is your conclusion affected in the special case $\alpha_2 \beta_2 = 1$?

(c) [3 marks] Demonstrate that the instrumental variables (IV) estimator of $\alpha_2$, using Z as an instrument for Y, is consistent.

(d) [4 marks] Instead of using IV, the researcher decides to use two-stage least squares (TSLS) in the expectation of obtaining a more efficient estimator of $\alpha_2$. He fits the reduced form equation for Y:

$\hat{Y} = h_1 + h_2 Z$ (3)

saves the fitted values, and uses them as an instrument for Y in equation (2). Demonstrate that the TSLS estimator is consistent.

(e) [3 marks] Determine whether the researcher is correct in believing that the TSLS estimator is more efficient than the IV estimator.
(f) [2 marks] Suppose that $\alpha_1 = 0$. Determine whether or not $\bar{X}/\bar{Y}$, where $\bar{X}$ and $\bar{Y}$ are the sample means of X and Y, is an unbiased estimator of $\alpha_2$.

(g) [3 marks] Still supposing $\alpha_1 = 0$, determine whether or not $\bar{X}/\bar{Y}$ is a consistent estimator of $\alpha_2$.

7. (a) You have been given a data set that contains a sample of observations relating to individuals. There are just two variables, Y and X, and for reasons of confidentiality, all values of X greater than or equal to a certain upper limit $X_{max}$ have been recoded as $X_{max}$. The values of Y have not been altered. You believe that Y is related to X by the model

$Y = \beta_1 + \beta_2 X + u$

where u is an independently and identically distributed disturbance term that satisfies the usual regression model assumptions. Explain how the estimates of the parameters and their standard errors will be affected in the following cases.

(i) [5 marks] If you regress Y on X, including the observations where $X = X_{max}$, using ordinary least squares (OLS).

(ii) [5 marks] If you regress Y on X, excluding the observations where $X = X_{max}$, using OLS.

In both cases, you should compare the regression results with those you would have obtained if you had been able to use the original, unmodified data. In both cases, you may find it helpful to illustrate your explanation with a diagram.

(b) You are next given another data set that contains a sample of observations relating to individuals. Again, there are just two variables, Y and X, and for reasons of confidentiality, all values of Y greater than or equal to a certain upper limit $Y_{max}$ have been recoded as $Y_{max}$. The values of X have not been altered. You again believe that Y is related to X by the model

$Y = \beta_1 + \beta_2 X + u$

where u is an independently and identically distributed disturbance term that satisfies the usual regression model assumptions.
Explain how the estimates of the parameters and their standard errors will be affected

(i) [5 marks] If you regress Y on X, including the observations where $Y = Y_{max}$, using OLS.

(ii) [5 marks] If you regress Y on X, excluding the observations where $Y = Y_{max}$, using OLS.

In both cases, you should compare the regression results with those you would have obtained if you had been able to use the original, unmodified data. In both cases, you may find it helpful to illustrate your explanation with a diagram.

(c) [5 marks] Suppose that the time, t, required to complete a certain process has probability density function

$f(t) = \alpha e^{-\alpha (t - \beta)}$ with $t > \beta > 0$

and you have a sample of n observations with times $T_1, ..., T_n$. Determine the maximum likelihood estimate of $\alpha$, assuming that $\beta$ is known.

8. A researcher correctly believes that a time series process can be adequately represented by a deterministic trend,

$Y_t = \beta_1 + \beta_2 t + u_t$ (1)

where t = 1, ..., T is the time trend, and $u_t$ is an identically and independently distributed (IID) random variable with zero mean and finite variance.

(a) (i) [2 marks] Demonstrate that $Y_t$ is nonstationary.

(ii) [3 marks] Demonstrate that the first difference in $Y_t$ is stationary.

(iii) [2 marks] Given a sample of size T, and noting that $\bar{t} = 0.5T$, demonstrate that the ordinary least squares (OLS) estimator of the slope coefficient may be decomposed as

$b_2^{OLS} = \beta_2 + \frac{\sum_{t=1}^{T} (t - 0.5T) u_t}{\sum_{t=1}^{T} (t - 0.5T)^2}$

(iv) [3 marks] Hence demonstrate that $b_2^{OLS}$ is unbiased.

(b) Next consider the case where the time series process is given by (1) but the disturbance term is subject to an AR(1) process

$u_t = \rho u_{t-1} + \varepsilon_t$

where $\varepsilon_t$ is IID with zero mean and finite variance and $|\rho| < 1$.

[Figure: distributions of the estimates of $\beta_2$ obtained using Specification (1) and Specification (2)]

(i) [2 marks] Demonstrate that the OLS estimator of $\beta_2$ remains unbiased.

(ii) [3 marks] Derive a representation of the time series process for $Y_t$ that is free from autocorrelation.
This will be called specification (2).

(iii) [2 marks] Explain whether specification (2) may be fitted using OLS.

(iv) [3 marks] Explain the potential advantages and disadvantages of fitting specification (2) instead of specification (1).

(v) [2 marks] Assume that fitting specification (2) using OLS yields consistent estimates of the parameters. Show how one might obtain an estimate of $\beta_2$ from the regression results and demonstrate that it is consistent.

(vi) [3 marks] The figure compares the distributions of the estimates of $\beta_2$ obtained using specification (1) and those using the method determined in part (b)(v), using a simulation with the true values chosen as follows: $\beta_1 = 10$, $\beta_2 = 5$, and $\rho = 0.9$. Comment on the relationship between the distributions in the light of your answers to earlier parts of this question.

9. A researcher has time series data for aggregate consumption, C, and aggregate disposable personal income, Y, for a certain country. She establishes that the logarithms of both series are I(1) (integrated of order one) and she correctly hypothesizes that the long-run relationship between them may be represented as

$C = \lambda Y v$ (1)

where $\lambda$ is a constant and v is a multiplicative disturbance term. It may be assumed that log v is normally distributed with zero mean and constant variance.

(a) (i) [2 marks] Explain what is meant when a series is described as being integrated of order one.

(ii) [3 marks] Explain what is meant when two time series are described as being cointegrated.

(iii) [5 marks] The researcher believes that log C and log Y are cointegrated. How should she demonstrate this?

(iv) [2 marks] The relationship implies that the long-run growth rate of consumption is equal to that of income. Explain whether it is correct to describe the growth rates as being cointegrated.
(b) The researcher is also interested in the short-run dynamics of the relationship and correctly hypothesizes that they may be represented by the relationship

$\log C_t = \beta_1 + \beta_2 \log C_{t-1} + \beta_3 \log Y_t + \beta_4 \log Y_{t-1} + \varepsilon_t$ (2)

where $\varepsilon_t$ is identically and independently distributed and drawn from a normal distribution with zero mean.

(i) [5 marks] State, with an explanation, the restriction that has to be satisfied by the parameters if the short-run relationship (2) is to be compatible with the long-run relationship (1). [Note: You are not asked to propose a test of the restriction.]

(ii) [5 marks] Show how the restricted version of (2) may be reparameterized as an error-correction model.

(iii) [3 marks] Explain why fitting the error-correction model, rather than (2) directly, avoids a potentially important econometric problem.

EC220 INTRODUCTION TO ECONOMETRICS
Marking Scheme for the 2010 Examination

Notes to examiners:

This marking scheme will in due course be posted on the EC220 website as a resource for future students. For this reason some of the explanations are more detailed than is necessary for a marking scheme.

Please mark each item in each question to the nearest half mark. Perhaps even one quarter, for small items.

Examiners are encouraged to award additional marks if the candidate makes points that are not included in the marking scheme, provided that they are wholly relevant and provided that they are not just a statement of unnecessary detail. A few possible additional marks have been included in the marking scheme.

If, contrary to instructions, a candidate has answered more than four questions, all those after the first four should be disregarded. Candidates have been told that this will be the case.

1. Note: This question is designed so that a candidate who has made a mistake in one part should still be able to answer the remaining parts correctly.
This type of question invariably has a bimodal distribution of marks, with many candidates who know what they are doing giving good answers and many others stumbling through it. Since candidates are told what they should be proving, examiners should expect many fudges, unintentional or otherwise.

(a) [3 marks] (2) may be rewritten

$\log Y = \alpha_1 + (\alpha_2 + 1) \log X + u$

so it is a reparameterized version of (1) with $\beta_1 = \alpha_1$ and $\beta_2 = \alpha_2 + 1$. Give 0 if the candidate then asserts that it is a restricted version, since that shows that he/she does not understand the meaning of a restriction.

(b) [5 marks]

$a_2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(z_i - \bar{z})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})([y_i - x_i] - [\bar{y} - \bar{x}])}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} - \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = b_2 - 1$

(c) [3 marks]

$a_1 = \bar{z} - a_2 \bar{x} = (\bar{y} - \bar{x}) - a_2 \bar{x} = \bar{y} - (a_2 + 1) \bar{x} = \bar{y} - b_2 \bar{x} = b_1$

(d) [3 marks]

$\hat{z}_i = a_1 + a_2 x_i = b_1 + (b_2 - 1) x_i = b_1 + b_2 x_i - x_i = \hat{y}_i - x_i$

(e) [3 marks] Let $e_i$ be the residual in (3) and $f_i$ the residual in (4). Then

$f_i = z_i - \hat{z}_i = (y_i - x_i) - (\hat{y}_i - x_i) = y_i - \hat{y}_i = e_i$

(f) [3 marks] The standard error of $b_2$ is

$\text{s.e.}(b_2) = \sqrt{\frac{\sum e_i^2 / (n - 2)}{\sum (x_i - \bar{x})^2}} = \sqrt{\frac{\sum f_i^2 / (n - 2)}{\sum (x_i - \bar{x})^2}} = \text{s.e.}(a_2)$

(g) [3 marks]

$t_{b_2} = \frac{b_2}{\text{s.e.}(b_2)} = \frac{a_2 + 1}{\text{s.e.}(a_2)}$ (1.5 marks).

The t statistic for $b_2$ is for the test of $H_0$: $\beta_2 = 0$. Given the relationship, it is also for the test of $H_0$: $\alpha_2 = -1$. The tests are equivalent since both of them reduce the model to log Y depending only on an intercept and the disturbance term (1.5 marks; mark flexibly and give these marks for other attempts at an intuitive explanation if sensible).

(h) [2 marks] $R^2$ will be different because it measures the proportion of the variance of the dependent variable explained by the regression, and the dependent variables are different.
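As a numerical sanity check on parts (b) to (g), the identities can be verified on simulated data. The sketch below is illustrative only: the data-generating values (intercept 1.0, slope 0.8, noise standard deviation 0.5, 50 observations) and the helper name `simple_ols` are assumptions made for the example and are not part of the question.

```python
import math
import random


def simple_ols(x, y):
    """Fit y = c1 + c2*x by OLS; return (c1, c2, residuals, s.e. of c2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    c2 = s_xy / s_xx
    c1 = y_bar - c2 * x_bar
    resid = [yi - c1 - c2 * xi for xi, yi in zip(x, y)]
    rss = sum(r ** 2 for r in resid)
    se_c2 = math.sqrt(rss / (n - 2) / s_xx)
    return c1, c2, resid, se_c2


# Simulated data (hypothetical values, chosen only for illustration).
random.seed(0)
x = [random.uniform(0.0, 3.0) for _ in range(50)]          # x = log X
y = [1.0 + 0.8 * xi + random.gauss(0.0, 0.5) for xi in x]  # y = log Y
z = [yi - xi for xi, yi in zip(x, y)]                      # z = log(Y/X) = y - x

b1, b2, e, se_b2 = simple_ols(x, y)  # regression (3)
a1, a2, f, se_a2 = simple_ols(x, z)  # regression (4)

tol = 1e-9
print("b2 = a2 + 1:        ", abs(b2 - (a2 + 1)) < tol)
print("b1 = a1:            ", abs(b1 - a1) < tol)
print("same residuals:     ", all(abs(ei - fi) < tol for ei, fi in zip(e, f)))
print("same standard error:", abs(se_b2 - se_a2) < tol)
print("t(b2) = (a2+1)/s.e.:", abs(b2 / se_b2 - (a2 + 1) / se_a2) < tol)
```

Because the identities are algebraic, each check should hold up to floating-point rounding whatever values are assumed for the simulation.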
2. (a) [3 marks] The assertion is incorrect (1.5 marks, but do not give these marks if the subsequent discussion reveals that the candidate is making this assertion for the wrong reasons). The specifications are equivalent and explanatory power, as measured by $R^2$, is the same (1.5 marks; mark flexibly, rewarding any sensible attempt at an explanation).

(b) [3 marks] The assertion is incorrect (1 mark, but do not give the mark if the subsequent discussion reveals that the candidate is making this assertion for the wrong reasons). The small mean of R has nothing to do with it (0.5 mark). Since the estimated coefficient of R is actually greater than that of P, the relatively low t statistic is attributable to the much greater standard error. This, in turn, is attributable to the fact that the variance of R is smaller than that of P. (See the data on the sample standard deviations.) (1.5 marks)

(c) [3 marks] This is nonsense since his specification cannot be correct. It involves exact multicollinearity.

(d) [4 marks] It is wrong to say that the negative coefficient is implausible. It is measuring the difference in the effects of P and R, not the absolute effect of P (2 marks). However, it is true to say that the estimator will have large variance since H and P are highly correlated (2 marks).

(e) [6 marks] The precision of the estimation of the coefficient of H, as indicated by its standard error, is better than for Researcher B (2 marks), but it is obtained at the cost of the imposition of a restriction that the effects of P and R are the same (2 marks). If the restriction is invalid, imposing it will cause the coefficients to be biased (1 extra mark). The insignificant coefficient of P in the regression of Researcher B indicates that the restriction should not be rejected (2 marks). (If the candidate says, instead, that the very small fall in $R^2$, compared with the other regressions, is evidence in favour of the restriction, give 1 mark.)
(f) [2 marks] The assertion may well be correct, the problem being omitted variable bias.

(g) [4 marks] The measurement error invalidates the t statistics (1 mark). It may reduce them by causing the slope coefficient to be underestimated in absolute terms (1 mark). However, it would also affect the t statistics via the standard errors, but in what way it is not possible to say (1 extra mark). Lower t statistics are indeed likely to be associated with lower $R^2$ (2 marks). (In the case of simple regression analysis, the effect is direct: $F = t^2$, and $R^2$ is a monotonic function of F. Candidates are not expected to say all this, but up to 2 extra marks should be awarded if they do.)

3. (a) (i) [6 marks] As for column (3), coefficients, standard errors, $R^2$, with the following changes:
• the row label MALE should be replaced with WM,
• the row label BLACK should be replaced with BF,
• the row label MALEBLAC should be replaced with BM and the coefficient for that row should be the sum of the coefficients in column (3): 0.308 – 0.011 – 0.290 = 0.007, and the standard error would not be known.
(Give 2 marks for the coefficient of BM, 1 mark for its standard error, and 3 marks for the rest.)

(ii) [3 marks] One could not predict the coefficients of either S or EXP in the four regressions performed by Student B. They will, except by coincidence, be different from any of the estimates of the other students because the coefficients for S and EXP in the other specifications are constrained in some way. As a consequence, one cannot predict exactly any part of the rest of the output, either. (Give 1.5 marks if the candidate makes the correct assertion but with less precise technical reasoning.)

(b) The question states that the tests should be based on the output in the table and the corresponding missing output for Students A and B. Hence tests using information from the variance-covariance matrix of the coefficients are not expected and should not be given credit.

(i) [4 marks] Student A could perform tests of the differences in earnings between white males and white females, black males and white females, and black females and white females, through simple t tests on the coefficients of WM, BM, and BF (2 marks). He could also test the null hypothesis that there are no sex/ethnicity differences with an F test, comparing RSS for his regression with that of the basic regression:

$F(3, 2540) = \frac{(922 - 603)/3}{603/2540}$

This would be compared with the critical value of F with 3 and 2,540 degrees of freedom at the significance level chosen and the null hypothesis of no sex/ethnicity effects would be rejected if the F statistic exceeded the critical value (2 marks).

(ii) [2 marks] In the case of Student B, with four separate subsample regressions, candidates are expected to say that no tests would be possible because no relevant standard errors would be available (2 marks). We have covered Chow tests only for two categories. However, a four-category test could be performed, with

$F(9, 2534) = \frac{(922 - X)/9}{X/2534}$

where RSS = 922 for the basic regression and X is the sum of RSS in the four separate regressions. In the unlikely event of a candidate producing this F test, give 3 extra marks, and be generous to any candidate who even thinks along these lines. [If the candidate suggests mechanical t tests on the coefficients of the individual regressions, give 1 mark.]

(iii) [3 marks] Student C could perform the same t tests and the same F test as Student A, with one difference: the t test of the difference between the earnings of black males and white females would not be available. Instead, the t statistic of MALEBLAC would allow a test of whether there is any interactive effect of being black and being male on earnings.
(iv) [4 marks] Student D could perform a Chow test to see if the wage equations of males and females differed:

$$F(3, 2540) = \frac{(922 - [322 + 289])/3}{(322 + 289)/2540}$$

where RSS = 322 for males and 289 for females (2 marks). This would be compared with the critical value of F with 3 and 2,540 degrees of freedom at the significance level chosen, and the null hypothesis of no sex/ethnicity effects would be rejected if the F statistic exceeded the critical value (1 mark). She could also perform a corresponding Chow test for blacks and whites:

$$F(3, 2540) = \frac{(922 - [609 + 44])/3}{(609 + 44)/2540}$$

(1 mark).

(c) [3 marks] The most obvious development would be to relax the sex/ethnicity restrictions on the coefficients of S and EXP by including appropriate interactive terms. This could be done by interacting these variables with the dummy variables defined by Student A or those defined by Student C. Mark flexibly and generously.

4. (a) [6 marks] 2 marks for saying that, given the highly significant coefficient of Y in (4), the regression is subject to omitted variable bias and hence any test would be invalid. 3 marks for saying that if the researcher were committed to a one-sided test, the fact that the coefficient has the wrong sign means that she would not reject $H_0$. If the candidate does not say this because he/she has already concluded that the test is in any case invalid, give at least 2 of these 3 marks. 1 mark for saying that the justification for the one-sided test is too strong, because it rules out the null hypothesis. The researcher ought to have said that a higher number of police per 1,000 inhabitants ought not to lead to an increased crime rate.
(b) [8 marks] Regression (4) has the specification

$$C = \beta_1 + \beta_2 Y + \beta_3 P + u$$

Assuming that this is the true model, the slope coefficient in an OLS regression of (1), which omits Y, will be given by

$$d_3 = \frac{\sum (P_i - \bar{P})(C_i - \bar{C})}{\sum (P_i - \bar{P})^2} = \frac{\sum (P_i - \bar{P})\left([\beta_1 + \beta_2 Y_i + \beta_3 P_i + u_i] - [\beta_1 + \beta_2 \bar{Y} + \beta_3 \bar{P} + \bar{u}]\right)}{\sum (P_i - \bar{P})^2}$$

$$= \beta_3 + \beta_2 \frac{\sum (P_i - \bar{P})(Y_i - \bar{Y})}{\sum (P_i - \bar{P})^2} + \frac{\sum (P_i - \bar{P})(u_i - \bar{u})}{\sum (P_i - \bar{P})^2}$$

(4 marks). Candidates have been told that they may treat P and Y as if they were nonstochastic. Then

$$E(d_3) = \beta_3 + \beta_2 \frac{\sum (P_i - \bar{P})(Y_i - \bar{Y})}{\sum (P_i - \bar{P})^2} + E\left[\frac{\sum (P_i - \bar{P})(u_i - \bar{u})}{\sum (P_i - \bar{P})^2}\right]$$

$$= \beta_3 + \beta_2 \frac{\sum (P_i - \bar{P})(Y_i - \bar{Y})}{\sum (P_i - \bar{P})^2} + \frac{\sum (P_i - \bar{P})E(u_i - \bar{u})}{\sum (P_i - \bar{P})^2} = \beta_3 + \beta_2 \frac{\sum (P_i - \bar{P})(Y_i - \bar{Y})}{\sum (P_i - \bar{P})^2}$$

(1 mark). Regression (2) reveals that P and Y are negatively correlated in the sample. Hence the second factor in the bias term is negative. One would anticipate $\beta_2 < 0$, and hence that the bias is positive (1 mark). The bias is so large that the coefficient of P in regression (1) is actually positive. The analysis for the case where P is omitted is similar. The coefficient of Y is upwards biased, and the coefficient in (3) is indeed less negative than in (4) (2 marks).

(c) [5 marks] The null hypothesis is $H_0$: $\beta_3 = 0$ and the alternative hypothesis is $H_1$: $\beta_3 \neq 0$.

$$F(1, 27) = \frac{(3700 - 2700)/1}{2700/27} = 10$$

(3 marks). At the 1% significance level, the critical value of F(1,27) is 7.68. Hence the null hypothesis is rejected (1 mark). The test is equivalent to a two-sided t test with the same null and alternative hypotheses. A one-sided t test, which is justified here since you would not expect P to have a positive effect on C, is more powerful (1 mark).

(d) [3 marks] Yes, it is correct.
If one wishes to estimate the marginal effect of Y, controlling for P, then the coefficient of Y in regression (3) is subject to omitted variable bias, but nevertheless it does provide an estimate of the total effect, direct and indirect, taking account of the fact that P is related to Y. [Give only 2 marks if the candidate makes no reference to the apparent problem of omitted variable bias.]

(e) [3 marks] This is a somewhat open-ended question, so mark flexibly and generously. The following is just a suggestion. The model might be re-specified on the following lines:

$$C = \beta_1 + \beta_2 P + \beta_3 Y + u$$

$$P = \alpha_1 + \alpha_2 C + v$$

OLS would no longer be appropriate for either equation. The first would be underidentified and the second would have to be fitted using IV, with Y as an instrument for C.

5. (a) [4 marks] Standard. 1 mark for stating the underlying assumption, 3 marks for explaining the intuitive idea underlying the test of this assumption. Candidates should make some effort at the intuitive level, with the variance of the residuals being taken as an indicator of the variance of the disturbance term. A mechanical statement of the test is not sufficient. A diagram would be useful.

(b) [5 marks] The ratios are 4.1, 6.0, and 1.05 (2 marks). In each case we should look for the critical value of F(148,148). The critical values of F(150,150) at the 5 percent, 1 percent, and 0.1 percent levels are 1.31, 1.46, and 1.66, respectively. Hence we reject the null hypothesis of homoscedasticity at the 0.1 percent level (1 percent is OK) for models (1) and (2) (2 marks). We do not reject it even at the 5 percent level for model (3) (1 mark).
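The comparison in 5(b) is mechanical once the RSS ratios and the tabulated critical values are in hand. The sketch below, in Python, simply encodes the figures quoted in the scheme and reports the most stringent level at which each ratio exceeds the F(150,150) critical value. The function and variable names are illustrative assumptions, not part of the marking scheme.

```python
# Goldfeld-Quandt RSS ratios quoted in 5(b).
RATIOS = {"model (1)": 4.1, "model (2)": 6.0, "model (3)": 1.05}

# Tabulated F(150,150) critical values, most stringent level first.
CRITICAL_VALUES = [(0.001, 1.66), (0.01, 1.46), (0.05, 1.31)]


def strictest_rejection_level(ratio):
    """Return the smallest listed significance level at which H0 is rejected, else None."""
    for level, critical in CRITICAL_VALUES:
        if ratio > critical:
            return level
    return None


results = {name: strictest_rejection_level(ratio) for name, ratio in RATIOS.items()}
for name, level in results.items():
    verdict = "not rejected at 5%" if level is None else f"rejected at {level:.1%}"
    print(f"{name}: homoscedasticity {verdict}")
```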
(c) [3 marks] If the assumption that the standard deviation of the disturbance term is proportional to household size is correct, scaling through by A should eliminate the heteroscedasticity (1.5 marks for the assertion with no maths; if the candidate does the maths, but does not make this verbal statement, give some or all of the credit anyway if it is clear that the candidate understands), since

$$E(v^2) = E\left[\left(\frac{u}{A}\right)^2\right] = \frac{1}{A^2}E(u^2) = \lambda^2$$

if the standard deviation of u is $\lambda A$ (1.5 marks).

(d) [3 marks] It is possible that the (apparent) heteroscedasticity is attributable to mathematical misspecification. If the true model is logarithmic, a homoscedastic disturbance term would appear to have a heteroscedastic effect if the regression is performed in the original units. (A bit more explanation is required to earn the full 3 marks. A diagram would be helpful.)

(e) [4 marks] (Much of this item is actually given in the question, for model (1). Give 0 unless correct or nearly correct.) The dependent variable is the squared residuals from the model regression (1 mark). The explanatory variables are the reciprocal of A and its square, E/A and its square, and the product of the reciprocal of A and E/A (3 marks). (No constant.)

(f) [3 marks] $nR^2$ is 64.0, 56.0, and 0.4 for the three models. Under the null hypothesis of homoscedasticity, this statistic has a chi-squared distribution with degrees of freedom equal to the number of terms on the right side of the regression, minus one. This is two for models (1) and (3). The critical value of chi-squared with two degrees of freedom is 5.99, 9.21, and 13.82 at the 5, 1, and 0.1 percent levels. Hence $H_0$ is rejected at the 0.1 percent level (1 percent is OK) for model (1) (1 mark), and not rejected even at the 5 percent level for model (3) (1 mark). In the case of model (2), there are five terms on the right side of the regression.
The critical value of chi-squared with four degrees of freedom is 18.47 at the 0.1 percent level. Hence $H_0$ is rejected at that level (1 percent, 13.28, is OK) (1 mark).

(g) [3 marks] Absolutely. In Figures 5.1 and 5.2, the variances of the dispersions of the dependent variable clearly increase with the size of the explanatory variable. In Figure 5.3, the dispersion is much more even.

6. (a) [2 marks]

$$Y = \frac{1}{1 - \alpha_2\beta_2}\left(\beta_1 + \alpha_1\beta_2 + \beta_3 Z + u + \beta_2 v\right)$$

(b) [8 marks]

$$a_2^{OLS} = \frac{\sum (Y_i - \bar{Y})(X_i - \bar{X})}{\sum (Y_i - \bar{Y})^2} = \frac{\sum (Y_i - \bar{Y})\left([\alpha_1 + \alpha_2 Y_i + v_i] - [\alpha_1 + \alpha_2 \bar{Y} + \bar{v}]\right)}{\sum (Y_i - \bar{Y})^2} = \alpha_2 + \frac{\sum (Y_i - \bar{Y})(v_i - \bar{v})}{\sum (Y_i - \bar{Y})^2}$$

It is not possible to obtain a closed-form expression for the expectation of the error term because v is a component of Y. Instead, we investigate the limiting value, first dividing the numerator and denominator by n so that they have limits:

$$\operatorname{plim} a_2^{OLS} = \alpha_2 + \operatorname{plim} \frac{\frac{1}{n}\sum (Y_i - \bar{Y})(v_i - \bar{v})}{\frac{1}{n}\sum (Y_i - \bar{Y})^2} = \alpha_2 + \frac{\operatorname{plim} \frac{1}{n}\sum (Y_i - \bar{Y})(v_i - \bar{v})}{\operatorname{plim} \frac{1}{n}\sum (Y_i - \bar{Y})^2} = \alpha_2 + \frac{\operatorname{cov}(Y, v)}{\operatorname{var}(Y)}$$

$$= \alpha_2 + \frac{1}{1 - \alpha_2\beta_2} \cdot \frac{\operatorname{cov}\left([\beta_1 + \alpha_1\beta_2 + \beta_3 Z + u + \beta_2 v], v\right)}{\operatorname{var}(Y)} = \alpha_2 + \frac{\beta_2}{1 - \alpha_2\beta_2} \cdot \frac{\operatorname{var}(v)}{\operatorname{var}(Y)}$$

since cov(Z, v) = cov(u, v) = 0. Thus $a_2^{OLS}$ is, in general, an inconsistent estimator of $\alpha_2$ (3 marks for the proof; 1 mark for explaining why it is not possible to take expectations). If $\beta_2 = 0$, Y is not influenced by X, there is no simultaneity, and OLS will be a consistent estimator (2 marks. Give only 1 if the candidate notes that the large-sample bias is zero, but makes no effort to explain the reason for this.) If $\alpha_2\beta_2 = 1$, the lines are parallel in the {X, Y} dimensions and they do not intersect. The reduced form relationship is undefined (2 marks).
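The large-sample bias in 6(b) can be illustrated by simulation. The sketch below is written in Python and rests on several assumptions that are not in the marking scheme: the structural equations are taken to be X = α1 + α2·Y + v and Y = β1 + β2·X + β3·Z + u (the notation reconstructed from the reduced form), Z, u, and v are independent standard normal draws, and the parameter values are arbitrary. Under those assumptions var(Y) = (β3² + 1 + β2²)/(1 − α2β2)², so the limiting value of the OLS slope can be computed directly and compared with the simulated estimate.

```python
import random


def ols_slope_x_on_y(alpha1, alpha2, beta1, beta2, beta3, n, seed=0):
    """Simulate the two-equation model and return the OLS slope from regressing X on Y."""
    rng = random.Random(seed)
    denom = 1.0 - alpha2 * beta2  # must be nonzero for the reduced form to exist
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        # Reduced form for Y, then the structural equation for X.
        y = (beta1 + alpha1 * beta2 + beta3 * z + u + beta2 * v) / denom
        x = alpha1 + alpha2 * y + v
        xs.append(x)
        ys.append(y)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_yx = sum((y - y_bar) * (x - x_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_yx / s_yy


def limiting_value(alpha2, beta2, beta3):
    """plim of the OLS slope when Z, u and v all have unit variance (an assumption)."""
    var_y = (beta3 ** 2 + 1.0 + beta2 ** 2) / (1.0 - alpha2 * beta2) ** 2
    return alpha2 + beta2 / (1.0 - alpha2 * beta2) * 1.0 / var_y


a2_ols = ols_slope_x_on_y(1.0, 0.5, 2.0, 0.5, 1.0, n=20000)
a2_limit = limiting_value(0.5, 0.5, 1.0)
print(a2_ols, a2_limit)
```

With these illustrative values the true α2 is 0.5, while the estimate settles near the limiting value rather than near 0.5. Setting `beta2` to zero removes the bias term, in line with the no-simultaneity case discussed in the answer.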
(c) [3 marks]

$$a_2^{IV} = \frac{\sum (Z_i - \bar{Z})(X_i - \bar{X})}{\sum (Z_i - \bar{Z})(Y_i - \bar{Y})} = \frac{\sum (Z_i - \bar{Z})\left([\alpha_1 + \alpha_2 Y_i + v_i] - [\alpha_1 + \alpha_2 \bar{Y} + \bar{v}]\right)}{\sum (Z_i - \bar{Z})(Y_i - \bar{Y})} = \alpha_2 + \frac{\sum (Z_i - \bar{Z})(v_i - \bar{v})}{\sum (Z_i - \bar{Z})(Y_i - \bar{Y})}$$

Again, v influences Y and expectations cannot be taken. Instead, taking plims,

$$\operatorname{plim} a_2^{IV} = \alpha_2 + \operatorname{plim} \frac{\frac{1}{n}\sum (Z_i - \bar{Z})(v_i - \bar{v})}{\frac{1}{n}\sum (Z_i - \bar{Z})(Y_i - \bar{Y})} = \alpha_2 + \frac{\operatorname{plim} \frac{1}{n}\sum (Z_i - \bar{Z})(v_i - \bar{v})}{\operatorname{plim} \frac{1}{n}\sum (Z_i - \bar{Z})(Y_i - \bar{Y})} = \alpha_2 + \frac{\operatorname{cov}(Z, v)}{\operatorname{cov}(Z, Y)} = \alpha_2$$

since cov(Z, v) = 0 and cov(Z, Y) ≠ 0.

(d) [4 marks] With first-stage fitted values $\hat{Y}_i = h_1 + h_2 Z_i$,

$$a_2^{TSLS} = \frac{\sum (\hat{Y}_i - \bar{\hat{Y}})(X_i - \bar{X})}{\sum (\hat{Y}_i - \bar{\hat{Y}})(Y_i - \bar{Y})} = \frac{\sum \left([h_1 + h_2 Z_i] - [h_1 + h_2 \bar{Z}]\right)(X_i - \bar{X})}{\sum \left([h_1 + h_2 Z_i] - [h_1 + h_2 \bar{Z}]\right)(Y_i - \bar{Y})} = \frac{h_2 \sum (Z_i - \bar{Z})(X_i - \bar{X})}{h_2 \sum (Z_i - \bar{Z})(Y_i - \bar{Y})} = a_2^{IV}$$

Hence $a_2^{TSLS}$ is equivalent to the IV estimator and, accordingly, consistent. Alternatively, candidates may prove the consistency of TSLS without showing it is equivalent to the IV estimator.

(e) [3 marks] Incorrect, since the estimators are the same. Give 0 if the candidate does not realize that it is equivalent and trots out the standard theory relating to the lower variance of TSLS in an overidentified model. This model is exactly identified and the candidate should realise this.

(f) [2 marks]

$$\frac{\bar{X}}{\bar{Y}} = \frac{\alpha_2 \bar{Y} + \bar{v}}{\bar{Y}} = \alpha_2 + \frac{\bar{v}}{\bar{Y}}$$

It is not possible to take expectations because v is a component of Y.

(g) [3 marks]

$$\operatorname{plim} \frac{\bar{X}}{\bar{Y}} = \alpha_2 + \operatorname{plim} \frac{\bar{v}}{\bar{Y}} = \alpha_2 + \frac{\operatorname{plim} \bar{v}}{\operatorname{plim} \bar{Y}} = \alpha_2$$

since plim $\bar{v}$ = 0, provided plim $\bar{Y}$ ≠ 0. Give up to 2 extra marks if the candidate uses the reduced form equation in (a), writes

$$\operatorname{plim} \bar{Y} = \frac{1}{1 - \alpha_2\beta_2}\left(\beta_1 + \alpha_1\beta_2 + \beta_3 \operatorname{plim} \bar{Z} + \operatorname{plim} \bar{u} + \beta_2 \operatorname{plim} \bar{v}\right) = \frac{1}{1 - \alpha_2\beta_2}\left(\beta_1 + \alpha_1\beta_2 + \beta_3 \mu_Z\right)$$

and asserts that plim $\bar{Y}$ thus exists and in general will not be zero.

7. Note: Candidates will not have encountered the questions in (a).
(b) is standard limited dependent variable with an upper bound rather than a lower bound. In lectures, (b)(ii) is explained in terms of the conditional expectation of the disturbance term not being independent of X. The answer below assumes $\beta_2 > 0$.

(a) (i) [5 marks] The slope coefficient will tend to be overestimated (3 marks), the intercept underestimated (1 mark; give the mark if this is not stated explicitly but there is a diagram showing it), and the standard errors invalidated to some extent (1 mark). An intuitive explanation will suffice to earn these marks, provided that it is reasonably clear. A technical explanation might go as follows. Write

$$Y_i = \beta_1 + \beta_2 Z_i + v_i$$

where $Z_i = X_i$ and $v_i = u_i$ for unconstrained observations, and $Z_i = X^{max}$ and $v_i = u_i + \beta_2 (X_i - X^{max})$ for constrained observations. The OLS slope coefficient may be decomposed as

$$b_2 = \beta_2 + \sum a_i v_i \quad \text{where} \quad a_i = \frac{Z_i - \bar{Z}}{\sum (Z_j - \bar{Z})^2}$$

For the unconstrained observations, $E(v_i) = 0$. For the constrained observations, $E(v_i) > 0$ and $a_i > 0$. The standard error will be invalidated by the non-standard distribution of the disturbance term. Be generous with extra marks if a candidate gets anywhere with a technical explanation.

(ii) [5 marks] There will be no violation of the regression model assumptions (1 mark; give this mark, even if the statement is not made explicitly, if it is sufficiently implicit in the answer) and so the parameter estimators will remain unbiased (1 mark) and the standard errors will be valid (1 mark). However, the variances of the parameter estimators will be larger than those that would have been obtained with the unmodified data (1) because the variance of X will be smaller (1 mark), and (2) because the number of observations will be smaller (1 mark).

(b) (i) [5 marks] The slope coefficient is likely to be underestimated and the intercept overestimated. As in (a)(i), an intuitive explanation is all that should be expected. A good answer is likely to include a diagram.
A technical answer might be provided along the lines of that suggested in (a)(i), and if attempted, should be rewarded with extra marks (4 marks). Standard errors are invalid (1 mark).

(ii) [5 marks] The slope coefficient is likely to be underestimated and the intercept overestimated. A good answer is likely to include a diagram. For values of X around the point where Y hits its ceiling and the observation is unobserved, there will be a negative correlation between X and the conditional expectation of u (4 marks). Standard errors are invalid (1 mark).

(c) [5 marks] The loglikelihood function is

$$\log L(\lambda \mid T_1, \ldots, T_n) = n \log \lambda - \lambda \sum T_i$$

Setting the first derivative with respect to $\lambda$ equal to zero, we have

$$\frac{n}{\hat{\lambda}} - \sum T_i = 0 \quad \text{and hence} \quad \hat{\lambda} = \frac{1}{\bar{T}}$$

(5 marks). The second derivative is $-n/\hat{\lambda}^2$, which is negative, confirming we have maximized the loglikelihood function (1 extra mark).

8. (a) (i) [2 marks] $E(Y_t) = \beta_1 + \beta_2 t$ and is therefore not independent of time.

(ii) [3 marks] $\Delta Y_t = \beta_2 + u_t - u_{t-1}$. $u_t$ is IID with finite variance and is therefore stationary. $u_{t-1}$ is stationary. A linear combination of stationary processes is stationary. (Alternatively, the candidate may show that the three conditions for weak stationarity are satisfied.)

(iii) [2 marks]

$$b_2^{OLS} = \frac{\sum_{t=1}^{T} (t - \bar{t})(Y_t - \bar{Y})}{\sum_{t=1}^{T} (t - \bar{t})^2} = \frac{\sum_{t=1}^{T} (t - \bar{t})\left([\beta_1 + \beta_2 t + u_t] - [\beta_1 + \beta_2 \bar{t} + \bar{u}]\right)}{\sum_{t=1}^{T} (t - \bar{t})^2} = \beta_2 + \frac{\sum_{t=1}^{T} (t - 0.5T)(u_t - \bar{u})}{\sum_{t=1}^{T} (t - 0.5T)^2}$$

(iv) [3 marks] To earn the full 3 marks, this should be done properly. Since t is nonstochastic,

$$E(b_2^{OLS}) = \beta_2 + E\left[\frac{1}{\sum_{t=1}^{T} (t - 0.5T)^2} \sum_{t=1}^{T} (t - 0.5T)(u_t - \bar{u})\right] = \beta_2 + \frac{1}{\sum_{t=1}^{T} (t - 0.5T)^2} E\left[\sum_{t=1}^{T} (t - 0.5T)(u_t - \bar{u})\right]$$

$$= \beta_2 + \frac{1}{\sum_{t=1}^{T} (t - 0.5T)^2} \sum_{t=1}^{T} E\left[(t - 0.5T)(u_t - \bar{u})\right] = \beta_2 + \frac{1}{\sum_{t=1}^{T} (t - 0.5T)^2} \sum_{t=1}^{T} (t - 0.5T) E(u_t - \bar{u}) = \beta_2$$

(b) (i) [2 marks] The unbiasedness proof is unaffected since

$$E(u_t) = E(\varepsilon_t + \rho \varepsilon_{t-1} + \rho^2 \varepsilon_{t-2} + \ldots) = 0$$

(ii) [3 marks] Lagging (1) one time period and multiplying through by $\rho$, one has

$$\rho Y_{t-1} = \beta_1 \rho + \beta_2 \rho (t - 1) + \rho u_{t-1}$$

Subtracting this from (1), one obtains

$$Y_t = \beta_1 (1 - \rho) + \rho \beta_2 + \rho Y_{t-1} + \beta_2 (1 - \rho) t + \varepsilon_t$$

(iii) [2 marks] Yes. [There is no restriction since the terms involving t and t – 1 have merged.]

(iv) [3 marks] The advantage is that the specification is free from autocorrelation (1 mark), which causes a loss of efficiency and invalid standard errors (1 extra mark). The disadvantage is that we have introduced a lagged dependent variable, which can lead to finite-sample bias (1 mark) and invalid finite-sample standard errors (0.5 extra mark), and it may give rise to multicollinearity (1 mark).

(v) [2 marks] Let the fitted model be written

$$\hat{Y}_t = d_1 + d_2 Y_{t-1} + d_3 t$$

Then plim $d_3 = \beta_2 (1 - \rho)$ and plim $d_2 = \rho$, so

$$\operatorname{plim} \frac{d_3}{1 - d_2} = \frac{\operatorname{plim} d_3}{\operatorname{plim} (1 - d_2)} = \frac{\operatorname{plim} d_3}{1 - \operatorname{plim} d_2} = \frac{\beta_2 (1 - \rho)}{1 - \rho} = \beta_2$$

(vi) [3 marks] Specification (1) has a smaller variance than specification (2), suggesting that the adverse effect of multicollinearity is more serious than the loss of efficiency attributable to autocorrelation. Give 1 mark if the candidate points out that the greater variance of the estimator derived via specification (2) is unexpected, but is unable to explain why. [For future readers of this marking scheme who are interested in such matters, both estimators are hyperconsistent, and simulations show that specification (1) maintains a slight advantage as the sample size increases.]
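The algebra in 8(b)(ii) and 8(b)(v) can be checked numerically. The Python sketch below assumes the trend model Y_t = β1 + β2·t + u_t with AR(1) disturbances u_t = ρ·u_{t−1} + ε_t, which is the notation reconstructed for this question, and arbitrary illustrative parameter values. It verifies that the transformed specification reproduces Y_t exactly, and that d3/(1 − d2) returns β2 when d2 and d3 take their limiting values. The function names are hypothetical.

```python
import random


def max_transformation_gap(beta1, beta2, rho, T, seed=0):
    """Largest absolute gap between Y_t and the right side of the transformed model."""
    rng = random.Random(seed)
    u_prev = 0.0   # assumed starting value for the disturbance
    y_prev = None
    max_gap = 0.0
    for t in range(T + 1):
        eps = rng.gauss(0.0, 1.0)
        u = rho * u_prev + eps              # AR(1) disturbance
        y = beta1 + beta2 * t + u           # original trend specification
        if y_prev is not None:
            rhs = (beta1 * (1 - rho) + rho * beta2
                   + rho * y_prev + beta2 * (1 - rho) * t + eps)
            max_gap = max(max_gap, abs(y - rhs))
        u_prev, y_prev = u, y
    return max_gap


def beta2_from_fitted(d2, d3):
    """Indirect estimate of beta2 from the coefficients of Y_{t-1} and t."""
    return d3 / (1.0 - d2)


gap = max_transformation_gap(beta1=10.0, beta2=0.5, rho=0.6, T=200)
recovered = beta2_from_fitted(d2=0.6, d3=0.5 * (1 - 0.6))
print(gap, recovered)
```

The check says nothing about the relative variances discussed in (vi); that would need a Monte Carlo comparison of the two estimators.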
9. (a) (i) [2 marks] The effects of shocks are not attenuated in time (1 mark). The first difference of the series is stationary (1 mark).

(ii) [3 marks] There exists a linear combination of the two series that is stable in the sense that the divergences, represented by the disturbance term, are stationary.

(iii) [5 marks] Taking logarithms,

$$\log C = \log \lambda + \log Y + u$$

where $u = \log v$. The researcher should test the difference (log C – log Y) for stationarity using a standard unit root test. (Give only 3 marks if the candidate suggests regressing log C on log Y and testing the residuals for stationarity using the special critical values for residuals.)

(iv) [2 marks] We are told in the introduction to the question that the series are I(1) and so the growth rates are stationary. Stationary processes cannot be described as being cointegrated.

(b) (i) [5 marks] Putting $\log C_t = \log C_{t-1} = \overline{\log C}$ and $\log Y_t = \log Y_{t-1} = \overline{\log Y}$, (2) implies the long-run relationship

$$\overline{\log C} = \beta_1 + \beta_2 \overline{\log C} + \beta_3 \overline{\log Y} + \beta_4 \overline{\log Y}$$

and hence

$$\overline{\log C} = \frac{\beta_1}{1 - \beta_2} + \frac{\beta_3 + \beta_4}{1 - \beta_2} \overline{\log Y}$$

(3 marks). Comparing this with (1), the restriction is $\beta_2 + \beta_3 + \beta_4 - 1 = 0$ (2 marks).

(ii) [5 marks] Imposing the restriction, (2) may be rewritten

$$\log C_t = \beta_1 + \beta_2 \log C_{t-1} + \beta_3 \log Y_t + (1 - \beta_2 - \beta_3) \log Y_{t-1} + \varepsilon_t$$

Hence

$$\log C_t - \log C_{t-1} = (\beta_2 - 1)\left(\log C_{t-1} - \frac{\beta_1}{1 - \beta_2} - \log Y_{t-1}\right) + \beta_3 (\log Y_t - \log Y_{t-1}) + \varepsilon_t$$

or

$$\Delta \log C_t = (\beta_2 - 1)\left(\log \frac{C_{t-1}}{Y_{t-1}} - \frac{\beta_1}{1 - \beta_2}\right) + \beta_3 \Delta \log Y_t + \varepsilon_t$$

Give only 3.5 marks if the candidate does not impose the restriction and comes up mechanically with the generic ECM.

(iii) [3 marks] The advantage is that the variables in this relationship are all stationary and so the relationship may be fitted with standard OLS.
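The step from the restricted version of (2) to the error-correction form in 9(b)(ii) is pure algebra, so it can be confirmed numerically. The Python sketch below evaluates both forms at randomly drawn values and checks that the change in log C implied by the restricted equation matches the error-correction expression. The coefficient names b1, b2, b3 stand for the parameters of (2) under the restriction; the function names and the ranges of the random draws are illustrative assumptions.

```python
import random


def restricted_adl(log_c_lag, log_y, log_y_lag, b1, b2, b3, eps):
    """log C_t from equation (2) with the coefficient of log Y_{t-1} set to 1 - b2 - b3."""
    return b1 + b2 * log_c_lag + b3 * log_y + (1 - b2 - b3) * log_y_lag + eps


def ecm_change(log_c_lag, log_y, log_y_lag, b1, b2, b3, eps):
    """Change in log C_t from the error-correction form."""
    disequilibrium = (log_c_lag - log_y_lag) - b1 / (1 - b2)
    return (b2 - 1) * disequilibrium + b3 * (log_y - log_y_lag) + eps


rng = random.Random(1)
max_gap = 0.0
for _ in range(1000):
    b1 = rng.uniform(-1.0, 1.0)
    b2 = rng.uniform(0.1, 0.9)   # kept away from 1 so that b1 / (1 - b2) is defined
    b3 = rng.uniform(0.0, 1.0)
    c_lag, y, y_lag, eps = (rng.uniform(-2.0, 2.0) for _ in range(4))
    change_from_adl = restricted_adl(c_lag, y, y_lag, b1, b2, b3, eps) - c_lag
    change_from_ecm = ecm_change(c_lag, y, y_lag, b1, b2, b3, eps)
    max_gap = max(max_gap, abs(change_from_adl - change_from_ecm))
print(max_gap)
```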
