Christopher Dougherty
EC220 - Introduction to econometrics: past examinations and marking schemes
2011 marking scheme

Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics: past examinations and marking schemes. [Teaching Resource]
© 2011 The Author
This version available at: http://learningresources.lse.ac.uk/160/
Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms.
http://creativecommons.org/licenses/by-sa/3.0/
http://learningresources.lse.ac.uk/

EC220 INTRODUCTION TO ECONOMETRICS
Marking Scheme for the 2011 Examination

Notes to examiners:

This marking scheme will in due course be posted on the EC220 website as a resource for future students. For this reason some of the explanations are more detailed than is necessary for a marking scheme.

Please mark each item in each question to the nearest half mark. Perhaps even one quarter, for small items.

Examiners are encouraged to award additional marks if the candidate makes points that are not included in the marking scheme, provided that they are wholly relevant and provided that they are not just a statement of unnecessary detail. A few possible additional marks have been included in the marking scheme.

If, contrary to instructions, a candidate has answered more than four questions, all those after the first four should be disregarded. Candidates have been told that this will be the case.

EC220 examination 2011, marking scheme

1
The first question of the examination is usually of this type and it always has a bimodal distribution of marks. It is pounced on by the stronger candidates who do it first and do it well.
It is also attempted, usually disastrously, as a last resort by weak candidates who are having trouble finding a fourth question. The question has deliberately been designed so that an error in one part should not lead to further errors in the remaining parts. Since candidates know what they need to prove in parts (a) – (g), examiners should be alert for fudges.

(a) [3 marks] $b_2^*$ provides an estimate of the effect on Y, in terms of standard deviations of Y, of a one-standard-deviation change in X. [It is, of course, the beta coefficient, but we have not discussed beta coefficients in the course.]
[Give 2 marks if the candidate says that $b_2^*$ provides an estimate of the effect on standardized Y of a one unit change in standardized X. Give 1 if the candidate mechanically states that $b_2^*$ provides an estimate of the change in $Y^*$ caused by a one unit change in $X^*$.]

(b) [3 marks] We note that, by construction, $\bar{Y}^* = \bar{X}^* = 0$. So $b_1^* = \bar{Y}^* - b_2^* \bar{X}^* = 0$.

(c) [3 marks] Assuming an intercept,
$$b_2^* = \frac{\sum (X_i^* - \bar{X}^*)(Y_i^* - \bar{Y}^*)}{\sum (X_i^* - \bar{X}^*)^2} = \frac{\sum X_i^* Y_i^*}{\sum X_i^{*2}}$$
which is the expression appropriate for the specification without an intercept.

(d) [3 marks] Substituting for $Y^*$ and $X^*$,
$$b_2^* = \frac{\sum X_i^* Y_i^*}{\sum X_i^{*2}} = \frac{\sum \left(\frac{X_i - \bar{X}}{s_X}\right)\left(\frac{Y_i - \bar{Y}}{s_Y}\right)}{\sum \left(\frac{X_i - \bar{X}}{s_X}\right)^2} = \frac{\frac{1}{s_X s_Y} \sum (X_i - \bar{X})(Y_i - \bar{Y})}{\frac{1}{s_X^2} \sum (X_i - \bar{X})^2} = \frac{s_X}{s_Y} b_2$$

(e) [3 marks]
$$\hat{Y}_i^* = b_2^* X_i^* = \frac{s_X}{s_Y} b_2 \frac{X_i - \bar{X}}{s_X} = \frac{1}{s_Y} b_2 (X_i - \bar{X})$$
$$\hat{Y}_i = b_1 + b_2 X_i = (\bar{Y} - b_2 \bar{X}) + b_2 X_i = \bar{Y} + b_2 (X_i - \bar{X})$$
Hence $\hat{Y}_i^* = \frac{1}{s_Y} (\hat{Y}_i - \bar{Y})$.

(f) [2 marks]
$$e_i^* = Y_i^* - \hat{Y}_i^* = \frac{1}{s_Y}(Y_i - \bar{Y}) - \frac{1}{s_Y}(\hat{Y}_i - \bar{Y}) = \frac{1}{s_Y}(Y_i - \hat{Y}_i) = \frac{1}{s_Y} e_i$$

(g) [3 marks]
$$\text{s.e.}(b_2^*) = \sqrt{\frac{\frac{1}{n-2} \sum e_i^{*2}}{\sum (X_i^* - \bar{X}^*)^2}} = \sqrt{\frac{\frac{1}{n-2} \frac{1}{s_Y^2} \sum e_i^2}{\sum \left(\frac{X_i - \bar{X}}{s_X}\right)^2}} = \frac{s_X}{s_Y}\, \text{s.e.}(b_2)$$

(h) [2 marks] Given the results in (d) and (g), the t statistic for $b_2^*$ is the same as that for $b_2$.

(i) [3 marks] $R^2$ will be the same.
$$R^{*2} = \frac{\sum (\hat{Y}_i^* - \bar{Y}^*)^2}{\sum (Y_i^* - \bar{Y}^*)^2} = \frac{\sum \hat{Y}_i^{*2}}{\sum Y_i^{*2}} = \frac{\frac{1}{s_Y^2} \sum (\hat{Y}_i - \bar{Y})^2}{\frac{1}{s_Y^2} \sum (Y_i - \bar{Y})^2} = R^2$$
Alternatively, the candidate may state that $F = t^2$, and so will be the same, and $R^2$ is the same transformation of F.

2.
(a) [4 marks] The marginal effect of a year of training is to raise earnings by 20 percent. (Literally, by a proportion 0.20.) (2 marks) (Note: This is an approximation, but it is enough to earn the four marks. The exact figure would be a proportion $e^{0.20} - 1$. Give 0.5 extra mark for this.)
The underlying mathematical model is
$$Y = e^{\beta_1 + \beta_2 X + \dots}$$
From this, one obtains
$$\frac{\partial Y}{\partial X} = \beta_2 e^{\beta_1 + \beta_2 X + \dots} = \beta_2 Y.$$
Hence
$$\Delta Y = \beta_2 Y \Delta X$$
giving the interpretation. (2 marks) Alternatively, candidates may start with the regression specification
$$\log Y = \beta_1 + \beta_2 X + \dots$$
take the partial differential with respect to X,
$$\frac{1}{Y} \frac{\partial Y}{\partial X} = \beta_2$$
and reach the same result.

(b) [4 marks] Those earning the Diploma tend to earn a proportion 0.30, that is, 30 percent, more than those with no formal training (2 marks). The regression specification implies the model
$$Y = e^{\beta_1 + \beta_2 X + \delta D + \dots} = e^{\delta D} e^{\beta_1 + \beta_2 X + \dots}$$
where D is a dummy variable. For the reference category, D = 0 and $e^{\delta D} = 1$. When D = 1, $e^{\delta D} = e^{\delta} \approx 1 + \delta$, giving the interpretation (2 marks). (Again, the 30 percent is an approximation. The exact figure is derived from $e^{0.30} - 1$. Give 0.5 extra mark for this.)

(c) [3 marks] The coefficient now provides an estimate of the extra earnings of those with a Diploma, controlling for the fact that they have had two years of training. In other words, it is the extra earnings associated with passing the test at the end of the second year.

(d) [3 marks] Define a new dummy variable NQ (no qualification) equal to 1 for those who do not possess either the RM Certificate or the RM Diploma, and equal to 0 if RMC = 1 or RMD = 1. Run the regression with NQ and RMD, dropping RMC. Earning a Certificate is now the reference category and the coefficient of RMD will provide an estimate of the extra proportional earnings from obtaining a Diploma. One would perform a t test on the coefficient. Alternatively, one could make earning a Diploma the reference category and test the coefficient of RMC.
Or, one could merge the Certificate holders and the Diploma holders into one category and perform an F test on the change in RSS.

(e) [4 marks] In general terms the regression specification is
$$\log Y = \beta_1 + \beta_2 EXP + \beta_3 TRAINING + \delta_{RMC} RMC + \delta_{RMD} RMD + u$$
and the hypothesized restriction is
$$\delta_{RMD} - 2\delta_{RMC} = 0$$
Define
$$\theta = \delta_{RMD} - 2\delta_{RMC}$$
Then
$$\delta_{RMD} = \theta + 2\delta_{RMC}$$
Substituting into the regression specification, one has
$$\log Y = \beta_1 + \beta_2 EXP + \beta_3 TRAINING + \delta_{RMC}(RMC + 2RMD) + \theta RMD + u$$
and one performs a t test on the coefficient of RMD. (Alternatively and equivalently, one may reparameterize the specification as
$$\log Y = \beta_1 + \beta_2 EXP + \beta_3 TRAINING + \delta_{RMD}(RMD + 0.5RMC) - 0.5\theta RMC + u$$
and perform a t test on the coefficient of RMC.) (Alternatively, and equivalently, one may fit the restricted model
$$\log Y = \beta_1 + \beta_2 EXP + \beta_3 TRAINING + \delta_{RMC}(RMC + 2RMD) + u$$
and perform an F test comparing RSS for that specification with RSS for the unrestricted model.)

(f) [4 marks] In this specification, the dummy variables are capturing the effect of passing the tests for the award of the qualifications, as opposed to just receiving the training (1 extra mark for this or any other sensible attempt at an interpretation). The appropriate test is an F test comparing RSS for specifications (1) and (3). The null hypothesis is that the coefficients of RMC and RMD in specification (3) are both zero (1 mark). The F statistic is
$$F(2, 95) = \frac{(105 - 95)/2}{95/95} = 5.0$$
(2 marks) The critical value of F(2,100) is 4.82 at the 1 percent level, so one would reject the null hypothesis at that level (1 mark). [If the F statistic has been computed incorrectly, give this mark anyway if the conclusion is correct, conditional on F.]

(g) [3 marks] This question could be answered reasonably in various ways, so mark flexibly and generously. Candidates may focus on any of the three specifications in the table. If (1) is chosen, define a dummy variable for one sector, say PRIVATE for private training institutes, and add it with an interactive dummy variable PRIVATE*TRAINING to the specification.
One would then perform t tests on the coefficients of these two variables and an F test on their joint explanatory power. Some candidates may propose a Chow test. This would in principle not be appropriate because a difference in the wage equations could be due to differences in the experience coefficient. One would have to argue that any such difference would be attributable to differences in private/public training.

3.
Write the original model
$$Y = \beta_1 + \beta_2 X + \beta_3 Z + u \qquad (1)$$
Then, with $X = 0.5(V + W)$, $Z = 0.5(V - W)$, the other specifications are
$$Y = \beta_1 + 0.5(\beta_2 + \beta_3)V + 0.5(\beta_2 - \beta_3)W + u \qquad (2)$$
$$Y = \beta_1 + \beta_2 V + u \qquad (3)$$
with the implicit restriction $\beta_3 = \beta_2$, and, using X = V – Z,
$$Y = \beta_1 + \beta_2 V + (\beta_3 - \beta_2)Z + u \qquad (4)$$

(a) [20 marks] [Give the marks if the numbers are correct, even if the explanation is vestigial.]
(2) and (4) are reparameterizations of (1), so the measures of fit are unchanged: E = L = 0.60, F = M = 200. Give 1 mark for each.
Given the relationships among the parameters, A = 0.70, C = –0.10, J = 0.60, H = 0.20. Give 1 mark for each.
The standard errors B and D cannot be reconstructed because the standard errors of $b_2$ and $b_3$ cannot be used (on their own) to construct standard errors of linear combinations (a loose explanation is acceptable because we have hardly touched on covariances between estimators). Give 1 mark for each.
K = 0.04, since J is the same as the coefficient of X in specification (1) and so has the same standard error (2 marks).
The F statistic for the restriction $\beta_3 = \beta_2$ implicit in specification (3) is
$$F(1, 40) = \frac{(220 - 200)/1}{200/40} = 4.0$$
In terms of $R^2$ it would be
$$F(1, 40) = \frac{(0.60 - G)/1}{0.40/40}$$
Hence G = 0.56 (4 marks).
A two-sided t test on the coefficient of Z in specification (4) provides an equivalent test of the restriction. The t statistic must therefore be $\sqrt{4.0} = 2.0$ and so I = 0.20/2.0 = 0.10 (4 marks).
[Note: One may also compute G using the t statistic for the coefficient of V in specification (3):
$$t^2 = \frac{G}{(1 - G)/41}$$
Give the 4 marks also if the candidate proposes this method, even if he or she cannot do the arithmetic.
Yet another way of computing G is as follows. Since $R^2$ in specification (1) is 0.60, TSS must be 500, using
$$R^2 = 1 - \frac{RSS}{TSS}$$
TSS is the same in specification (3). Hence one obtains G = 1 – 220/500 = 0.56.]

(b) (i) [2 marks] The estimates of $(\beta_2 + \beta_3)$ and $(\beta_2 - \beta_3)$ obtained from specification (2) should be relatively precise compared with those of $\beta_2$ and $\beta_3$ in specification (1). As a consequence, their standard errors should be lower. Give 0.5 extra mark for saying that the coefficient of V in specification (3) will be subject to omitted variable bias if $\beta_3 \neq \beta_2$, and a further extra 0.5 mark for arguing that any such bias is likely to be small. Give 0.5 extra mark for saying that the standard error of the coefficient of V will be invalid.

(ii) [3 marks] If $\beta_2 = \beta_3$, W is a redundant variable in specification (2), and hence the estimate of the coefficient of V will be inefficient. Its standard error should be larger than that in specification (3). Give an extra mark for saying that it would not be much larger because the correlation was low.

4.
[Note: In this answer, CA has been replaced by A.]

(a) (i) [5 marks]
$$b_2 = \frac{\sum (D_i - \bar{D})(Y_i - \bar{Y})}{\sum (D_i - \bar{D})^2} = \frac{\sum (D_i - \bar{D})\left([\beta_1 + \beta_2 D_i + \beta_3 A_i + u_i] - [\beta_1 + \beta_2 \bar{D} + \beta_3 \bar{A} + \bar{u}]\right)}{\sum (D_i - \bar{D})^2}$$
$$= \frac{\beta_2 \sum (D_i - \bar{D})^2 + \beta_3 \sum (D_i - \bar{D})(A_i - \bar{A}) + \sum (D_i - \bar{D})(u_i - \bar{u})}{\sum (D_i - \bar{D})^2}$$
(Give 2 marks for this decomposition.) Hence
$$b_2 = \beta_2 + \beta_3 \sum d_i (A_i - \bar{A}) + \sum d_i (u_i - \bar{u})$$
where
$$d_i = \frac{D_i - \bar{D}}{\sum_j (D_j - \bar{D})^2}$$
Hence
$$E(b_2) = \beta_2 + \beta_3 \sum E\left[d_i (A_i - \bar{A})\right] + \sum E\left[d_i (u_i - \bar{u})\right]$$
Now, since the assignment to the course was random, D is distributed independently of both A and u, and hence
$$E\left[d_i (A_i - \bar{A})\right] = E(d_i)\, E(A_i - \bar{A}) = 0$$
and
$$E\left[d_i (u_i - \bar{u})\right] = E(d_i)\, E(u_i - \bar{u}) = 0$$
Hence $b_2$ is an unbiased estimator of $\beta_2$. (Give 3 marks for the proof of unbiasedness. Be alert for false explanations. Candidates were told that the estimator is unbiased.)

(ii) [3 marks] The researcher is nearly correct. Given the random selection of the sample, A will be distributed independently of D and so it can be treated as part of the disturbance term and the standard error will remain valid.
The requirement that A have a normal distribution is too strong, since the expression for the standard error does not depend on it. However, if the standard error is to be used for t tests, then it is important that the enlarged disturbance term should have a normal distribution, and this will be the case if and only if A has a normal distribution (assuming that u has one). If both A and u have normal distributions, a linear combination will also have one.

(iii) [3 marks] The commentator is correct for the reasons explained in (ii).

(b) (i) [3 marks] Add an interactive term (slope dummy):
$$Y = \beta_1 + \beta_2 D + \beta_3 A + \beta_4 DA + u$$
where DA is the product of D and A.

(ii) [5 marks] Abstracting from the effect of the disturbance term, $\beta_1$ is the literacy test score of an individual not assigned to the extra course possessing zero ability (which may not be possible, depending on the way that ability is measured, but candidates are not expected to say this) (1 mark); $\beta_2$ is the effect of the extra course on the literacy test score, for those with zero ability, sign positive (1 mark); $\beta_3$ is the effect on the test score of a unit increase in ability, for those who were not assigned to the extra course, sign positive (1 mark). Writing the model as
$$Y = \beta_1 + (\beta_2 + \beta_4 A)D + \beta_3 A + u$$
$\beta_4$ may be interpreted as the increase in the effect of the extra course for every unit increase in the measure of ability (1 mark), or rather decrease, since the sign will be negative if the researcher is correct (1 mark). The model may also be rewritten
$$Y = \beta_1 + \beta_2 D + (\beta_3 + \beta_4 D)A + u$$
so that $\beta_4$ may also be interpreted as the difference in the effect of a unit increase in ability for those who took the course compared with those who did not (give 1 extra mark).

(iii) [4 marks] Extending the decomposition of $b_2$,
$$b_2 = \beta_2 + \beta_3 \sum d_i (A_i - \bar{A}) + \beta_4 \sum d_i (DA_i - \overline{DA}) + \sum d_i (u_i - \bar{u}).$$
DA will be positively correlated with D because DA = 0 when D = 0 and DA is some positive number when D = 1. Hence $b_2$ will be a biased estimator of $\beta_2$ (3 marks).
Since DA will be positively correlated with D and the coefficient of DA is likely to be negative, the bias will be downwards (1 mark).

(c) [2 marks] The decision to take the course is now endogenous and likely to be correlated with the disturbance term. In particular, it would be rational for lower ability children to take the course. This should increase the estimated slope coefficient (2 marks). (This is not the only possible answer. Give the marks for any sensible alternative and give up to 2 extra marks for a particularly good answer. One alternative answer might be that there could be a pushy parent effect, with parents of well-performing children making their children take the course, even though they do not need it. This could lead to a bias in the other direction.)

5.
(a) (i) [1 mark]
$$Y_i = \beta_2 Z_i + v_i = \beta_2 (X_i - w_i) + v_i = \beta_2 X_i + u_i$$
where $u_i = v_i - \beta_2 w_i$.

(ii) [1 mark] $w_i$ is a component of both $X_i$ and $Y_i$, and therefore of both the numerator and the denominator of the expression for $b_2^{OLS}$. As a consequence it is not possible to derive a closed-form expression for the expectation.

(iii) [5 marks]
$$b_2^{OLS} = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2} = \frac{\sum_{i=1}^{n} X_i (\beta_2 X_i + u_i)}{\sum_{i=1}^{n} X_i^2} = \beta_2 + \frac{\sum_{i=1}^{n} X_i u_i}{\sum_{i=1}^{n} X_i^2}$$
$$= \beta_2 + \frac{\sum_{i=1}^{n} (X_i - \bar{X})(u_i - \bar{u}) + n\bar{X}\bar{u}}{\sum_{i=1}^{n} (X_i - \bar{X})^2 + n\bar{X}^2} = \beta_2 + \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})(u_i - \bar{u}) + \bar{X}\bar{u}}{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 + \bar{X}^2}$$
Now
$$\text{plim}\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})(u_i - \bar{u}) + \bar{X}\bar{u}\right] = \text{cov}(X, u) + \mu_X \mu_u = \text{cov}(X, u) = \text{cov}(Z + w,\, v - \beta_2 w) = -\beta_2 \sigma_w^2$$
assuming that Z, v and w are distributed independently, and
$$\text{plim}\left[\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2 + \bar{X}^2\right] = \text{var}(X) + \mu_X^2 = \sigma_Z^2 + \sigma_w^2 + \mu_Z^2$$
Hence
$$\text{plim}\, b_2^{OLS} = \beta_2 - \frac{\beta_2 \sigma_w^2}{\sigma_Z^2 + \sigma_w^2 + \mu_Z^2}$$

(iv) [2 marks] (Mark this flexibly. Give the 2 marks for any sensible answer.) For fixed $\sigma_Z^2$, $\sigma_v^2$, and $\sigma_w^2$, the observations in the sample form a cloud around the point $(\mu_Z, \mu_Y)$. The larger is the value of $\mu_Z$, the further this cloud is away from the origin, and the clearer will be the relationship between Y and Z. As a consequence, the bias will attenuate. (The variance will also decrease, but the candidate is not expected to state this. Give an extra mark if it is stated.)
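The large-sample result in (a) (iii), and the attenuation of the bias described in (a) (iv), can be checked numerically. The following is only a minimal sketch: the function names, the normal distributions, and the parameter values ($\beta_2 = 1$, unit variances) are assumptions made for the illustration and are not part of the examination question.

```python
import random


def simulate_b2_ols(beta2, mu_z, sigma_z, sigma_v, sigma_w, n, seed=0):
    """Slope of a regression through the origin of Y on X, where X = Z + w."""
    rng = random.Random(seed)
    sum_xy = 0.0
    sum_xx = 0.0
    for _ in range(n):
        z = rng.gauss(mu_z, sigma_z)   # true regressor Z (normality is an assumption)
        v = rng.gauss(0.0, sigma_v)    # disturbance term in Y = beta2 * Z + v
        w = rng.gauss(0.0, sigma_w)    # measurement error in X
        x = z + w
        y = beta2 * z + v
        sum_xy += x * y
        sum_xx += x * x
    return sum_xy / sum_xx


def plim_b2_ols(beta2, mu_z, sigma_z, sigma_w):
    """Probability limit of the estimator, as derived in (a) (iii)."""
    return beta2 - beta2 * sigma_w ** 2 / (sigma_z ** 2 + sigma_w ** 2 + mu_z ** 2)


if __name__ == "__main__":
    # Illustrative values only: beta2 = 1 and all standard deviations equal to 1.
    for mu_z in (0.0, 2.0, 10.0):
        estimate = simulate_b2_ols(1.0, mu_z, 1.0, 1.0, 1.0, n=100_000)
        limit = plim_b2_ols(1.0, mu_z, 1.0, 1.0)
        print(f"mu_Z = {mu_z:5.1f}  simulated b2 = {estimate:.4f}  plim = {limit:.4f}")
```

With these illustrative values the limit moves towards $\beta_2$ as $\mu_Z$ increases, which is the attenuation of the bias referred to in (a) (iv).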
(v) [4 marks]
$$\frac{\bar{Y}}{\bar{X}} = \frac{\beta_2 \bar{X} + \bar{u}}{\bar{X}} = \beta_2 + \frac{\bar{u}}{\bar{X}}$$
Hence
$$\text{plim}\, \frac{\bar{Y}}{\bar{X}} = \beta_2 + \frac{0}{\mu_Z} = \beta_2$$

(vi) [2 marks] (Mark this flexibly. Give the 2 marks for any sensible answer.) As the size of the sample becomes large, the impact of the disturbance term in the determination of $\bar{Y}$ attenuates and $\bar{Y}$ approaches $\beta_2 \mu_Z$. At the same time, the impact of the measurement error in the determination of $\bar{X}$ attenuates and $\bar{X}$ approaches $\mu_Z$. As a consequence, the estimator approaches $\beta_2$.

(b) (i) [5 marks]
$$Q = \beta_2 Z + v$$
$$X_i = Z_i + w_i$$
$$Y_i = Q_i + \beta_2 w_i$$
Hence
$$Y_i - \beta_2 w_i = \beta_2 (X_i - w_i) + v_i$$
$$Y_i = \beta_2 X_i + v_i$$
There is no violation of any regression model assumption and the estimator will be both unbiased and consistent.

(ii) [2 marks] Variations in the observations on X and Y due to variations in the measurement error will be indistinguishable from those due to variations in Z. Hence there will be no inconsistency. Note: A variety of answers, not necessarily as above, may be offered. Examiners should mark generously.

(iii) [3 marks] [Note: The variance expression given in the examination question is wrong. There should have been an additional term $n\bar{X}^2$ in the denominator. However, since $\bar{X} \approx \bar{Z}$, the following answer remains approximately correct.] Given that the distributions of Z and w are independent, the MSD of X will tend to be larger than that of Z and so the variance of the estimator will be smaller. From part (b) (i) we can see that the estimator will be unbiased as well as consistent. Hence the only effect will be a benefit: a reduction in the variance.

6.
(a) (i) [3 marks] The IV estimator will be
$$b_2^{IV} = \frac{\sum (Z_t - \bar{Z})(C_t - \bar{C})}{\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = \beta_2 + \frac{\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\sum (Z_t - \bar{Z})(Y_t - \bar{Y})}$$
It is not possible to obtain a closed form expression for the expectation of the error term since $u_t$ influences both the numerator explicitly and the denominator implicitly through $Y_t$. Instead we take plims.
We know that
$$\text{plim}\, \frac{1}{n} \sum (Z_t - \bar{Z})(u_t - \bar{u}) = \text{cov}(Z, u) = 0$$
$$\text{plim}\, \frac{1}{n} \sum (Z_t - \bar{Z})(Y_t - \bar{Y}) = \text{cov}(Z, Y) \neq 0$$
and so
$$\text{plim}\, \frac{\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = \text{plim}\, \frac{\frac{1}{n}\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\frac{1}{n}\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = \frac{\text{plim}\, \frac{1}{n}\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\text{plim}\, \frac{1}{n}\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = \frac{0}{\text{cov}(Z, Y)} = 0$$
$\text{cov}(Z, Y) \neq 0$ because Z is a component of Y. $b_2^{IV}$ is therefore a consistent estimator of $\beta_2$.

(ii) [3 marks] Note to examiners: Although this is standard bookwork, it has not previously appeared in an examination. The marks should be given for evidence of conceptual understanding and one should not expect the mathematical precision of the answer provided below. In future years, of course, having seen it once, the students would all be ready with expert mathematical regurgitation if this item reappeared on an examination in any guise or form.
The approximation derives from the use of a central limit theorem to establish that
$$\sqrt{T}\,(b_2^{IV} - \beta_2) \xrightarrow{d} N\left(0,\ \frac{\sigma_u^2}{\sigma_Y^2} \times \frac{1}{\rho_{Y,Z}^2}\right)$$
where $\rho_{Y,Z}$ is the population correlation between Y and Z. The expression applies only in the limit. However, dividing by $\sqrt{T}$, we have, as an approximation,
$$(b_2^{IV} - \beta_2) \sim N\left(0,\ \frac{\sigma_u^2}{T\, \text{MSD}(Y)} \times \frac{1}{r_{Y,Z}^2}\right)$$
in a finite sample, and hence
$$b_2^{IV} \sim N\left(\beta_2,\ \frac{\sigma_u^2}{T\, \text{MSD}(Y)} \times \frac{1}{r_{Y,Z}^2}\right)$$
How large the sample must be for the approximation to be ‘good’ is usually established by simulation. (Give 1 extra mark if the answer is precise.)

(iii) [3 marks] Obviously, one consequence of the approximation is that the estimate of the standard error is only an approximation (1 mark). Probably more important in practice is the fact that the distribution will only approximately be normal and the distortions in the tails may be serious, undermining the tests (2 marks). Give an extra mark for saying that the size of the test is likely to differ from the nominal size, and another extra mark in the unlikely event that the candidate adds that, even allowing for this, asymmetry in the tails may cause the power of the test to be suboptimal.
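The consistency argument in (a) (i) can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the marking scheme: it takes the model to be $C_t = \beta_1 + \beta_2 Y_t + u_t$ with $Y_t = C_t + I_t + G_t$ and $Z_t = I_t + G_t$, and the function names, the normal distributions, and all numerical values are hypothetical choices made for the illustration.

```python
import random


def cross_product(a, b):
    """Sum of products of deviations of a and b from their sample means."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def simulate_model(beta1, beta2, n, sigma_u=5.0, seed=0):
    """Generate C, Y and Z = I + G from the two-equation model (assumed form)."""
    rng = random.Random(seed)
    c, y, z = [], [], []
    for _ in range(n):
        inv = rng.gauss(20.0, 5.0)   # I_t, exogenous (illustrative distribution)
        gov = rng.gauss(30.0, 5.0)   # G_t, exogenous (illustrative distribution)
        u = rng.gauss(0.0, sigma_u)  # disturbance term
        y_t = (beta1 + inv + gov + u) / (1.0 - beta2)  # reduced form for Y_t
        c_t = beta1 + beta2 * y_t + u                  # structural equation
        c.append(c_t)
        y.append(y_t)
        z.append(inv + gov)
    return c, y, z


def ols_slope(c, y):
    """OLS slope in the regression of C on Y."""
    return cross_product(y, c) / cross_product(y, y)


def iv_slope(c, y, z):
    """IV slope using Z as the instrument for Y."""
    return cross_product(z, c) / cross_product(z, y)


if __name__ == "__main__":
    c, y, z = simulate_model(beta1=10.0, beta2=0.8, n=20_000)
    print(f"OLS slope: {ols_slope(c, y):.4f}")
    print(f"IV slope:  {iv_slope(c, y, z):.4f}  (true beta2 = 0.8)")
```

In a large sample the IV slope should lie close to the value of $\beta_2$ used to generate the data, while the OLS slope should lie above it because $Y_t$ is correlated with $u_t$.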
(iv) [1 mark]
$$Y_t = \frac{1}{1 - \beta_2} (\beta_1 + I_t + G_t + u_t)$$

(v) [2 marks] The asymptotic variance of an IV estimator in a simple regression is inversely proportional to the square of the correlation between the instrument and the variable for which it is acting. If we fit the reduced form equation, the coefficients of I and G will be chosen so as to maximize the correlation between the actual and fitted values of $Y_t$. The fitted values, which are linear combinations of I and G, will therefore have higher correlations with Y than I or G individually, and consequently the asymptotic variance of the TSLS estimator will be smaller than those of the IV estimators.

(vi) [3 marks] The assertion is incorrect. The correlation between Y and Z will, in general, be higher than the correlations between Y and I and between Y and G. Therefore the TSLS estimator will be more efficient than the use of either I or G on its own.

(vii) [2 marks] Nonsense.

(viii) [3 marks] Correct. We are now using TSLS with the extra benefit of the imposition of the theoretical restriction that the coefficients of $I_t$ and $G_t$ should be the same.

(b) (i) [3 marks] The estimator in (a) (i) would be inconsistent.
$$Z_t = Y_t - C_t = (1 - \beta_2) Y_t - \beta_1 - u_t$$
Hence
$$\text{plim}\, \frac{1}{n} \sum (Z_t - \bar{Z})(u_t - \bar{u}) = (1 - \beta_2)\, \text{cov}(Y, u) - \text{var}(u) = -\sigma_u^2$$
$$\text{plim}\, \frac{1}{n} \sum (Z_t - \bar{Z})(Y_t - \bar{Y}) = (1 - \beta_2)\, \text{var}(Y) - \text{cov}(u, Y) = (1 - \beta_2)\, \text{var}(Y)$$
and so
$$\text{plim}\, \frac{\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = \frac{\text{plim}\, \frac{1}{n}\sum (Z_t - \bar{Z})(u_t - \bar{u})}{\text{plim}\, \frac{1}{n}\sum (Z_t - \bar{Z})(Y_t - \bar{Y})} = -\frac{\sigma_u^2}{(1 - \beta_2)\, \text{var}(Y)}$$
and the IV estimator will be downwards biased in large samples.

(ii) [2 marks] OLS is unbiased and efficient if Y is exogenous.

7.
(a) (i) [2 marks] The ensemble distribution is the limiting distribution of the cross-section of possible realizations of the process at time t.

(ii) [2 marks] By construction, there are no transitory initial effects and so the ensemble distribution will apply equally to $X_t$ and $X_{t-1}$.
Hence E X t E X t 1 and, from the process E X t 2 E X t 1 E t 0 (iii) [3 marks] For the same reason, var X t var X t 1 , so var X t 22 var X t 1 var t 2 1 22 (iv) [3 marks] The disturbance term t is a determinant of the explanatory variable X s 1 in observation s X s 2 X s 1 s for all s t . The regression model assumption that the all values of the disturbance term in the sample be distributed independently of all values of the explanatory variables in the sample is therefore violated. As a consequence, the estimator will be biased in finite samples. However, it will be consistent because the disturbance term is contemporaneously independent of the regressor. (b) (i) [1 mark] The slope coefficient is X t X Yt Y T b2 t 1 X T X X T 2 2 t t X t t 1 t 1 X T X 2 t t 1 t is a component of Xt and hence the error term is the ratio of two related stochastic terms. Consequently, it is not possible to obtain a closed-form expression for the expectation. (ii) [3 marks] 1 T X t X t n plim b2 2 plim t 1 T 1 2 X t X n t 1 Now 1 plim n X T t t 1 X t cov X t , t 2 from the DGP for Xt, and 1 plim n 2 X t X X2 T t 1 2 1 22 Hence the limits of the numerator and denominator of the error term both exist and so EC220 examination 2011, marking scheme 16 plim plim b2 2 1 n X T t X t t 1 1 plim n X T X 2 2 t 2 2 / 1 22 2 1 22 t 1 (c) (i) [3 marks] When 2 = 0, Xt = t and Yt 5 2 X t t 5 3 X t Hence there is an exact linear relationship between the values of Y and those of X, with slope coefficient 3. (ii) [2 marks] The large sample bias should be 0.75 when 2 = 0.5 and zero when 2 = 1. The distributions support the analysis. (iii) [3 marks] If 2 = 1, Xt is a random walk. Since Yt is a linear function of Xt, Yt will also be nonstationary. The linear relationship is by construction the cointegrating relationship. 
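The large-sample values quoted in (c) (i) and (c) (ii) can be reproduced with a short simulation. This is a hedged sketch only: it assumes that $X_t$ follows a first-order autoregressive process with no intercept, that $Y_t = 5 + 2X_t + \varepsilon_t$ as in (c) (i), and that the innovations are standard normal. The name `ar_coef` for the autoregressive parameter and the function names are choices made for this illustration.

```python
import random


def simulate_slope(ar_coef, T, seed=0):
    """OLS slope from regressing Y on X, where the same innovation drives both."""
    rng = random.Random(seed)
    if abs(ar_coef) < 1.0:
        # Start from the ensemble distribution: variance 1 / (1 - ar_coef^2).
        x_prev = rng.gauss(0.0, (1.0 / (1.0 - ar_coef ** 2)) ** 0.5)
    else:
        x_prev = 0.0  # random walk started at zero (an assumption)
    xs, ys = [], []
    for _ in range(T):
        eps = rng.gauss(0.0, 1.0)      # innovation with unit variance (assumed)
        x = ar_coef * x_prev + eps     # process generating X
        y = 5.0 + 2.0 * x + eps        # process generating Y, as in (c) (i)
        xs.append(x)
        ys.append(y)
        x_prev = x
    mean_x = sum(xs) / T
    mean_y = sum(ys) / T
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def large_sample_slope(ar_coef):
    """Limiting value of the slope: true slope 2 plus the bias 1 - ar_coef^2."""
    return 2.0 + (1.0 - ar_coef ** 2)


if __name__ == "__main__":
    for ar_coef in (0.0, 0.5, 1.0):
        slope = simulate_slope(ar_coef, T=20_000)
        print(f"ar_coef = {ar_coef:.1f}  simulated slope = {slope:.4f}  "
              f"limit = {large_sample_slope(ar_coef):.4f}")
```

When the parameter is zero the fitted slope equals 3 apart from rounding error, because the relationship between Y and X is then exact.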
(iv) [3 marks] When the autoregressive parameter of the process for $X_t$ is 0.5, the distribution should be $\sqrt{T}$ consistent, the variance being inversely proportional to T, the standard deviation being inversely proportional to $\sqrt{T}$, and hence, approximately, the height of the distribution proportional to $\sqrt{T}$. Thus one would expect h100/h25 = h400/h100 = 2. This appears to be the case. When the parameter is 1 and the relationship is a cointegrating relationship, the distribution of the slope coefficient should be superconsistent, the variance being inversely proportional to $T^2$, the standard deviation being inversely proportional to T, and hence, approximately, the height of the distribution proportional to T. Thus one would expect h100/h25 = h400/h100 = 4. Again, this appears to be the case.

8.
(a) (i) [3 marks] Let the fitted model be
$$\hat{Y}_i = b_1$$
Then $e_i$, the residual in observation i, is given by
$$e_i = Y_i - \hat{Y}_i = Y_i - b_1$$
Hence RSS, the residual sum of squares, is given by
$$RSS = \sum_{i=1}^{n} (Y_i - b_1)^2$$
Differentiating this, the first-order condition is
$$\frac{dRSS}{db_1} = -2 \sum_{i=1}^{n} (Y_i - b_1) = 0$$
from which we have $b_1 = \bar{Y}$. The second derivative is 2n, which is positive, confirming that we have reached a minimum. (Give the 3 marks even if the second-order condition is not mentioned. Give an extra 0.5 mark for checking it.)

(ii) [1 mark] $\bar{Y} = \beta_1 + \bar{u}$. $E(\bar{u}) = 0$, so $E(\bar{Y}) = \beta_1$.

(iii) [1 mark] $\text{plim}\, \bar{u} = 0$ since $E(u) = 0$. Hence $\text{plim}\, \bar{Y} = \beta_1$. (This is sufficient, given the instruction on page 2 of the examination paper. Alternatively one could argue that $\bar{Y}$ is a consistent estimator because it is unbiased and its variance tends to zero as n becomes large. The variance of $\bar{Y}$ is the variance of $\bar{u}$, and this is $\sigma_u^2 / n$.)

(b) (i) [2 marks]
$$Y_t = \beta_1 + u_t$$
Hence
$$\rho Y_{t-1} = \rho \beta_1 + \rho u_{t-1}$$
Subtracting from the previous equation and rearranging,
$$Y_t = \beta_1 (1 - \rho) + \rho Y_{t-1} + \varepsilon_t$$
(1 mark) The point is that the specification is now free from autocorrelation (1 mark).

(ii) [2 marks] $\varepsilon_t$ is a determinant of the explanatory variable $Y_{s-1}$ in observation s for all $s > t$.
The regression model assumption that all values of the disturbance term in the sample be distributed independently of all values of the explanatory variables in the sample is therefore violated. As a consequence, the estimator will be biased in finite samples. However, it will be consistent because the disturbance term is contemporaneously independent of the regressor.

(iii) [3 marks] Following the usual analysis, the OLS estimator of $\rho$ can be decomposed as
$$\rho + \frac{\frac{1}{T} \sum_{t=2}^{T} (Y_{t-1} - \bar{Y}_{t-1})(\varepsilon_t - \bar{\varepsilon})}{\frac{1}{T} \sum_{t=2}^{T} (Y_{t-1} - \bar{Y}_{t-1})^2}$$
Since $Y_t$ is a stationary process,
$$\text{plim}\, \frac{1}{T} \sum_{t=2}^{T} (Y_{t-1} - \bar{Y}_{t-1})(\varepsilon_t - \bar{\varepsilon}) = \text{cov}(Y_{t-1}, \varepsilon_t) = 0$$
and
$$\text{plim}\, \frac{1}{T} \sum_{t=2}^{T} (Y_{t-1} - \bar{Y}_{t-1})^2 = \sigma_{Y_{t-1}}^2$$
Hence the OLS estimator of $\rho$ is consistent.

(iv) [2 marks] Suppose that when $Y_t$ is regressed on $Y_{t-1}$, the fitted relationship is
$$\hat{Y}_t = A + B Y_{t-1}$$
Then plim B = $\rho$ and plim A = $\beta_1 (1 - \rho)$, and $\frac{A}{1 - B}$ will be a consistent estimator of $\beta_1$:
$$\text{plim}\, \frac{A}{1 - B} = \frac{\text{plim}\, A}{1 - \text{plim}\, B} = \frac{\beta_1 (1 - \rho)}{1 - \rho} = \beta_1$$

(v) [2 marks] $\text{plim}\, \bar{Y} = \beta_1 + \text{plim}\, \bar{u} = \beta_1$ since $E(u) = 0$ and $\text{plim}\, \bar{u} = E(u)$.

(vi) [2 marks] It is not possible to determine which is preferable, at least without a simulation. Both estimators are consistent. $\bar{Y}$ is also unbiased, but for a finite sample it might have a larger variance than the other estimator.

(c) (i) [1 mark] The equation
$$Y_t = \beta_1 (1 - \rho) + \rho Y_{t-1} + \varepsilon_t$$
becomes
$$Y_t = Y_{t-1} + \varepsilon_t$$
and so $\beta_1$ is washed out.

(ii) [3 marks]
$$\bar{Y} = \beta_1 + \frac{1}{T} \sum_{t=1}^{T} (T + 1 - t)\, \varepsilon_t$$
and $E(\bar{Y}) = \beta_1$ (1 mark). The variance of $\bar{Y}$ is an increasing function of T and so the estimator is inconsistent (2 marks).

(iii) [1 mark] $Y_1 = \beta_1 + \varepsilon_1$. Hence $E(Y_1) = \beta_1$ and its variance is $\sigma_\varepsilon^2$. It is an inconsistent estimator because the variance is independent of the size of the sample and does not tend to zero.

(iv) [2 marks] $Y_1$ will have a smaller variance if T > 1.
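The comparison in (c) (ii) to (c) (iv) can be verified with exact arithmetic. The sketch below assumes that the innovations are independent with a common variance (set to 1 by default) and that $Y_1 = \beta_1 + \varepsilon_1$, as stated in (c) (iii). The function names are chosen for the illustration.

```python
from fractions import Fraction


def variance_of_sample_mean(T, sigma2=Fraction(1)):
    """Variance of the sample mean when Y_t = beta1 + eps_1 + ... + eps_t.

    The sample mean is beta1 plus the sum over t of ((T + 1 - t) / T) * eps_t,
    so with independent innovations its variance is sigma2 times the sum of
    the squared weights.
    """
    weights = [Fraction(T + 1 - t, T) for t in range(1, T + 1)]
    return sigma2 * sum(w * w for w in weights)


def variance_of_first_observation(sigma2=Fraction(1)):
    """Variance of Y_1 = beta1 + eps_1, which does not depend on T."""
    return sigma2


if __name__ == "__main__":
    for T in (1, 2, 10, 100):
        print(T, float(variance_of_sample_mean(T)),
              float(variance_of_first_observation()))
```

Under these assumptions the weights give a variance of $\sigma_\varepsilon^2 (T + 1)(2T + 1)/(6T)$ for $\bar{Y}$, which equals $\sigma_\varepsilon^2$ when T = 1 and increases with T thereafter.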