Econ 4211 – Principles of Econometrics
Problem Set 3 Answer Key

1 Part 1

In empirical work we often get to work with data that is imperfectly measured. Examples are numerous and include people rounding their survey responses up or down, data entry errors, and sometimes simply the inability to measure a certain factor precisely (IQ scores as measures of ability come to mind). Econometricians have long recognized the issues associated with the so-called "measurement error problem".

Consider the following simple regression model:

    y = βx* + u,    (1)

where β is our parameter of interest. We assume that (1) satisfies assumptions MLR1-MLR5. Thus E[u | x*] = 0 and Var[u | x*] = σ²_u. For simplicity we also assume E[y] = E[x*] = 0, so there is no need to include an intercept in the model. (This also implies Var[x*] = E[(x*)²] and Cov[x*, u] = E[x*·u]; go ahead and verify these formulas if you are not sure why this is the case.) Define Var[x*] = σ²_{x*}.

Now the problem is that we do not observe x*; instead, we observe x = x* + ε, where ε is the measurement error. There is no pattern in the error, so E[ε | x*] = 0 and Var[ε] = σ²_ε. We assume that ε is "pure" measurement error that contributes absolutely no information: E[y | x, x*] = E[y | x*], so all the "useful" information is in x*. This implies E[u | x] = 0. The problem is that ε will be correlated with x, and if we try to estimate (1) using x in place of x*, the results will be poor. Define Var[x] = σ²_x.

Do the following:

1. Show that x and ε are correlated, i.e. compute Cov[x, ε].

2. By definition, x = x* + ε, which also implies x* = x − ε. Use this insight to write y as a function of x and some new error term ξ (i.e. what is the expression for ξ going to look like?).

3. Use your results in 1. and 2. above to argue that if you estimate the model y = βx + ξ via OLS, the estimate for β will be biased.

4. Recall the steps we embarked on to prove that the OLS estimator was consistent. For a model with an intercept, we had

    β̂₁ = β₁ + [(1/n) Σᵢ (xᵢ − x̄)(uᵢ − ū)] / [(1/n) Σᵢ (xᵢ − x̄)²]

and used the Law of Large Numbers to argue that

    (1/n) Σᵢ (xᵢ − x̄)(uᵢ − ū) →p Cov[x, u]    and    (1/n) Σᵢ (xᵢ − x̄)² →p Var[x],

and by MLR4 Cov[x, u] = 0, so β̂₁ →p β₁. Use these insights to demonstrate that in our model β̂_OLS ↛p β.

5. Use the fact that x* and ε are independent to show that σ²_x = σ²_{x*} + σ²_ε.

6. Finally, use your results in 5. to prove that measurement error biases the OLS estimate towards zero. In econometrics, this is called "attenuation bias".

1.1 Answer:

Most of this problem is discussed in Chapter 9 of the textbook. The algebra is fairly straightforward in most cases.

1. Straightforward:

    Cov[x, ε] = Cov[x* + ε, ε] = Cov[x*, ε] + Var[ε] = E[x*·ε] + σ²_ε = σ²_ε,

where E[x*·ε] = 0 because E[ε | x*] = 0.

2. Even easier:

    y = βx* + u
    y = β(x − ε) + u
    y = βx + (u − βε)
    y = βx + ξ,

so ξ = u − βε.

3. This part may be a little challenging. Start with the OLS estimator:

    β̂_OLS = Σᵢ xᵢyᵢ / Σᵢ xᵢ²
          = Σᵢ xᵢ(βxᵢ + ξᵢ) / Σᵢ xᵢ²
          = β + Σᵢ xᵢξᵢ / Σᵢ xᵢ².

Thus

    E[β̂_OLS] = E[β + Σᵢ xᵢξᵢ / Σᵢ xᵢ²]
             = β + E[Σᵢ xᵢξᵢ / Σᵢ xᵢ²]
             = β + E[ E( Σᵢ xᵢξᵢ / Σᵢ xᵢ² | x*, ε ) ]
             = β + E[ Σᵢ xᵢ E(uᵢ − βεᵢ | x*, ε) / Σᵢ xᵢ² ]
             = β + E[ −β Σᵢ xᵢεᵢ / Σᵢ xᵢ² ]
             = β − β · E[ Σᵢ (xᵢ* + εᵢ)εᵢ / Σᵢ xᵢ² ]
             = β − β · E[ Σᵢ εᵢ² / Σᵢ xᵢ² ] ≠ β.

In the third line I used the Law of Iterated Expectations, conditioning on (x*, ε); in the fourth line I used the fact that x = x* + ε, so conditional on (x*, ε) every xᵢ is deterministic and I can "drag the expectation through" it, leaving E(uᵢ − βεᵢ | x*, ε) = −βεᵢ. The last two lines just use the definition of x and the fact that x* is uncorrelated with ε (informally, the Σᵢ xᵢ*εᵢ term averages out to zero). The remaining expectation cannot be simplified any further, but it is clearly not going to equal zero in general, because it is a ratio of sums of squared values. Hence β̂_OLS is biased.
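Although not part of the original assignment, a quick Monte Carlo makes the bias in part 3 concrete. Below is a minimal Stata sketch; the program name mcbias, the sample size, the number of replications, and all parameter values are invented for illustration. With σ²_{x*} = σ²_ε = 1 and β = 1, the average of β̂_OLS across replications should sit near 0.5 rather than 1:

    capture program drop mcbias
    program define mcbias, rclass
        drop _all
        set obs 200
        gen xstar = rnormal()          // true regressor, variance 1
        gen x = xstar + rnormal()      // mismeasured regressor, sigma2_eps = 1
        gen y = xstar + rnormal()      // true model with beta = 1
        regress y x, noconstant
        return scalar b = _b[x]
    end
    set seed 4211
    simulate b = r(b), reps(1000) nodots: mcbias
    summarize b                        // mean should be near 0.5, not 1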
4. This is actually fairly easy. The trick is to rewrite the OLS estimator as:

    β̂_OLS = β + Σᵢ xᵢξᵢ / Σᵢ xᵢ²
          = β + [(1/n) Σᵢ xᵢuᵢ] / [(1/n) Σᵢ xᵢ²] − β · [(1/n) Σᵢ xᵢεᵢ] / [(1/n) Σᵢ xᵢ²],

where the last line uses the definition of ξ. Now, the Law of Large Numbers implies:

    (1/n) Σᵢ xᵢ² →p Var[x] = σ²_x
    (1/n) Σᵢ xᵢuᵢ →p Cov[x, u] = 0
    (1/n) Σᵢ xᵢεᵢ →p Cov[x, ε] = σ²_ε,

where the last line uses our result from 1. Thus

    β̂_OLS →p β + 0 − β · σ²_ε/σ²_x = β (1 − σ²_ε/σ²_x).

5. This is also trivial:

    σ²_x = Var[x] = Var[x* + ε] = Var[x*] + Var[ε] + 2 Cov[x*, ε] = σ²_{x*} + σ²_ε,

because the covariance is necessarily zero by independence.

6. Again, this is straightforward:

    β̂_OLS →p β (1 − σ²_ε/(σ²_{x*} + σ²_ε)) = β · σ²_{x*}/(σ²_{x*} + σ²_ε),

so asymptotically β̂_OLS will be equal to something that is smaller in magnitude than the true β, i.e. the estimate is pulled towards zero. That's the attenuation bias.
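The limit β · σ²_{x*}/(σ²_{x*} + σ²_ε) is also easy to check with one large simulated sample. Here is another minimal sketch, with an invented seed, sample size, and variances; with σ²_{x*} = 4 and σ²_ε = 1 the attenuation factor is 4/5, so the slope should settle near 0.8 when β = 1:

    clear
    set seed 1234
    set obs 100000
    gen xstar = rnormal(0, 2)          // sigma2_xstar = 4
    gen eps = rnormal(0, 1)            // sigma2_eps = 1
    gen x = xstar + eps                // mismeasured regressor
    gen y = xstar + rnormal()          // true beta = 1
    regress y x, noconstant            // slope should be close to 0.8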
2 Part 2

Solve the following problems from Wooldridge's textbook (exercise numbers refer to the 4th edition):

• 15.1
• 15.3
• 15.10

2.1 Answer

2.1.1 15.1

1. Other factors that determine GPA (which are assumed to be in u) can be correlated with PC. For example, maybe only very hard-working and determined students buy computers; then their dedication will affect both PC and GPA via u. Or, alternatively, maybe only the students who care little about grades buy computers, to play games and use Facebook. Either way, the PC variable is potentially endogenous.

2. If a student has rich parents, it is more likely this student will end up owning a computer, so this relationship is obvious. What is less obvious is whether the parents' income has no effect on GPA except through the PC variable. This is unlikely: a rich student may spend more on tutors and/or textbooks, which gives income other channels of influence on GPA. Unless we control for these channels, parents' income won't be a great instrument.

3. Here the indicator variable for whether a student got a grant should be a terrific instrument for PC. It will probably be correlated with PC, and we would need to argue that the grants were not related to GPA directly. But if the grants were given out at random, we know this is true, so any effect of the grants on GPA can only come through PC ownership. This is an ideal scenario for IV analysis.

2.1.2 15.3

We know that

    β̂_IV = Σᵢ (zᵢ − z̄)(yᵢ − ȳ) / Σᵢ (zᵢ − z̄)(xᵢ − x̄),

and one way to proceed is to manipulate this equation until it turns into what we need. This may take a while. The easier way is to look at the equations from which the estimate comes:

    Σᵢ (yᵢ − β₀ − β₁xᵢ) = 0
    Σᵢ (yᵢ − β₀ − β₁xᵢ) zᵢ = 0.

The first condition gives us:

    Σᵢ yᵢ = nβ₀ + β₁ Σᵢ xᵢ
    (1/n) Σᵢ yᵢ = β₀ + β₁ (1/n) Σᵢ xᵢ
    ȳ = β₀ + β₁ x̄
    β₀ = ȳ − β₁ x̄,

and, ordering the observations so that the zᵢ = 0 ones come first, the second condition can be rewritten as:

    Σ_{i=1}^{n} (yᵢ − β₀ − β₁xᵢ) zᵢ = Σ_{i=1}^{n₀} (yᵢ − β₀ − β₁xᵢ) · 0 + Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ) = Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ) = 0,

where I denoted by n₀ the number of observations for which zᵢ = 0, and n₁ is defined similarly (and clearly n₀ + n₁ = n). Then take the last equation:

    0 = Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ)
    Σ_{i=n₀+1}^{n} yᵢ = n₁β₀ + β₁ Σ_{i=n₀+1}^{n} xᵢ
    ȳ₁ = β₀ + β₁ x̄₁,

and now plug the expression for β₀ into it and solve for β₁:

    ȳ₁ = (ȳ − β₁x̄) + β₁x̄₁
    ȳ₁ − ȳ = β₁ (x̄₁ − x̄)
    β₁ = (ȳ₁ − ȳ) / (x̄₁ − x̄).

We are not done yet, however. But we're close. Note that

    ȳ = (1/n) Σᵢ yᵢ = (1/n) [ Σ_{i=1}^{n₀} yᵢ + Σ_{i=n₀+1}^{n} yᵢ ] = (1/n) (n₀ȳ₀ + n₁ȳ₁),

and similarly

    x̄ = (1/n) (n₀x̄₀ + n₁x̄₁),

so we can substitute these two expressions:

    β₁ = (ȳ₁ − ȳ) / (x̄₁ − x̄)
       = [ȳ₁ − (1/n)(n₀ȳ₀ + n₁ȳ₁)] / [x̄₁ − (1/n)(n₀x̄₀ + n₁x̄₁)]
       = [(n/n)ȳ₁ − (n₁/n)ȳ₁ − (n₀/n)ȳ₀] / [(n/n)x̄₁ − (n₁/n)x̄₁ − (n₀/n)x̄₀]
       = [((n − n₁)/n)ȳ₁ − (n₀/n)ȳ₀] / [((n − n₁)/n)x̄₁ − (n₀/n)x̄₀]
       = [(n₀/n)(ȳ₁ − ȳ₀)] / [(n₀/n)(x̄₁ − x̄₀)]
       = (ȳ₁ − ȳ₀) / (x̄₁ − x̄₀).

That's it.
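The grouped-means formula can also be verified numerically. Here is a minimal Stata sketch (not from the textbook; the data-generating process and every name in it are invented for illustration); the hand-computed ratio should match the ivregress slope to machine precision:

    clear
    set seed 42
    set obs 1000
    gen z = runiform() < .5            // binary instrument
    gen x = 2*z + rnormal()            // x shifts with z
    gen y = 1 + .7*x + rnormal()
    ivregress 2sls y (x = z)           // IV slope on x
    summarize y if z == 1
    scalar ybar1 = r(mean)
    summarize y if z == 0
    scalar ybar0 = r(mean)
    summarize x if z == 1
    scalar xbar1 = r(mean)
    summarize x if z == 0
    scalar xbar0 = r(mean)
    display "grouped-means estimate: " (ybar1 - ybar0)/(xbar1 - xbar0)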
2.1.3 15.10

1. It is not immediately obvious that the chance of attending college increases if one attends a Catholic high school. Maybe the causality goes the other way: for some reason, the students who are most likely to enter college choose to study at Catholic high schools. Maybe only the most able students go to Catholic high schools. So if there is some unobservable ability variable, it will be contained in u and will be correlated with CathHS.

2. I would include this variable in the main equation. Doing so would partially control for the students' abilities, which may reduce the bias from the endogeneity of CathHS.

3. We need CathRel to be correlated with CathHS; this can be tested directly. This is the relevance condition. We also need CathRel to be uncorrelated with u, i.e. the only effect the student's religious beliefs may have on his/her probability of getting into college is through the CathHS channel. This is the validity condition, which unfortunately cannot be tested.

4. No. Relevance has nothing to do with validity. If for some reason Catholic students have higher ability than non-Catholic students, then CathRel will be correlated with ability and hence with u, which will make it an invalid instrument.

3 Part 3

Solve the following computer problems from Wooldridge's textbook (exercise numbers refer to the 4th edition):

• C15.1
• C15.9

3.1 Answer

3.1.1 C15.1

1. Here are the regression outputs from using sibs as an instrument for educ and from a regression that uses sibs as a right-hand-side variable directly:

. ivregress 2sls lwage (educ = sibs)

Instrumental variables (2SLS) regression          Number of obs   =       935
                                                  Wald chi2(1)    =     21.63
                                                  Prob > chi2     =    0.0000
                                                  R-squared       =         .
                                                  Root MSE        =    .42285

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1224327   .0263224     4.65   0.000     .0708418    .1740236
       _cons |   5.130026   .3547911    14.46   0.000     4.434648    5.825404
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   sibs

. regress lwage sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  1,   933) =   22.31
       Model |  3.86818211     1  3.86818211           Prob > F      =  0.0000
    Residual |  161.788112   933  .173406337           R-squared     =  0.0234
-------------+------------------------------           Adj R-squared =  0.0223
       Total |  165.656294   934  .177362199           Root MSE      =  .41642

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.0279044   .0059082    -4.72   0.000    -.0394992   -.0163096
       _cons |   6.861076   .0220776   310.77   0.000     6.817748    6.904403
------------------------------------------------------------------------------

These two regressions really cannot be compared: they consider the impacts of two different factors on wages. The first regression looks at the impact of getting more education, and the second considers the effect of having more siblings.

2. Yes, clearly there is an effect, as the output below demonstrates. A child born third is less likely to get more education, because being born third means you have at least two siblings, and your parents have to spend money on them, too.

. regress educ brthord

      Source |       SS       df       MS              Number of obs =     852
-------------+------------------------------           F(  1,   850) =   37.29
       Model |  173.087012     1  173.087012           Prob > F      =  0.0000
    Residual |  3945.88364   850  4.64221605           R-squared     =  0.0420
-------------+------------------------------           Adj R-squared =  0.0409
       Total |  4118.97066   851  4.84015353           Root MSE      =  2.1546

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     brthord |  -.2826441   .0462882    -6.11   0.000    -.3734967   -.1917915
       _cons |   14.14945   .1286754   109.96   0.000     13.89689    14.40201
------------------------------------------------------------------------------

3. The results are fairly similar to the first regression from part 1. It appears that brthord is a good instrument if sibs was one, which is reasonable because the two variables exploit pretty much the same idea.

. ivregress 2sls lwage (educ = brthord)

Instrumental variables (2SLS) regression          Number of obs   =       852
                                                  Wald chi2(1)    =     16.67
                                                  Prob > chi2     =    0.0000
                                                  R-squared       =         .
                                                  Root MSE        =    .42101

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1306448   .0320009     4.08   0.000     .0679242    .1933654
       _cons |   5.030397   .4324406    11.63   0.000     4.182829    5.877965
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   brthord

The regression suggests an extra year of education raises wages by about 13 percent. This is a large effect, significant both statistically and economically.

4. Identification requires π₂ ≠ 0 in the reduced form, so we test H₀: π₂ = 0 vs. H₁: π₂ ≠ 0. The reduced form regression results below demonstrate that we are identified: π̂₂ is significantly different from zero at the 1% level.

. regress educ sibs brthord

      Source |       SS       df       MS              Number of obs =     852
-------------+------------------------------           F(  2,   849) =   26.29
       Model |  240.246365     2  120.123183           Prob > F      =  0.0000
    Residual |  3878.72429   849  4.56857985           R-squared     =  0.0583
-------------+------------------------------           Adj R-squared =  0.0561
       Total |  4118.97066   851  4.84015353           Root MSE      =  2.1374

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.1528673   .0398705    -3.83   0.000    -.2311236   -.0746109
     brthord |  -.1526742   .0570764    -2.67   0.008    -.2647017   -.0406467
       _cons |    14.2965   .1332881   107.26   0.000     14.03489    14.55811
------------------------------------------------------------------------------
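As an aside (these commands are not part of the original output), Stata can report first-stage diagnostics automatically after an IV regression. A minimal sketch, assuming the C15.1 dataset is still in memory:

    quietly ivregress 2sls lwage (educ = brthord)
    estat firststage          // first-stage relevance diagnostics

With a single instrument and no other exogenous regressors, the reported first-stage F statistic is just the square of the t statistic on brthord from the regression in part 2 (about 37 here), comfortably above the usual weak-instrument rules of thumb.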
5. Clearly sibs is very imprecisely estimated in the regression below. I would suspect this is because sibs is highly correlated with brthord. The fact that educ is also barely significant now hints at the same problem: Stata cannot disentangle the two effects from each other well enough.

. ivregress 2sls lwage sibs (educ = brthord)

Instrumental variables (2SLS) regression          Number of obs   =       852
                                                  Wald chi2(2)    =     21.87
                                                  Prob > chi2     =    0.0000
                                                  R-squared       =         .
                                                  Root MSE        =    .42623

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    .136994   .0745496     1.84   0.066    -.0091205    .2831085
        sibs |   .0021107   .0173411     0.12   0.903    -.0318772    .0360986
       _cons |   4.938529    1.05383     4.69   0.000      2.87306    7.003998
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   sibs brthord

6. I re-estimated the reduced form from part 4 and used the command predict educhat to generate the fitted values. As suspected, they are highly correlated with sibs:

. quietly regress educ sibs brthord

. predict educhat
(option xb assumed; fitted values)
(83 missing values generated)

. correlate educhat sibs
(obs=852)

             |  educhat     sibs
-------------+------------------
     educhat |   1.0000
        sibs |  -0.9295   1.0000

In the first command I used the quietly prefix to suppress the output. I just need the estimates from that regression to be active so that I can use predict in the next command.

3.1.2 C15.9

1. Here is the output from the two-stage least squares procedure. This is what we will try to replicate by hand in the subsequent parts of the problem.

. ivregress 2sls lwage (educ = sibs) exper tenure black

Instrumental variables (2SLS) regression          Number of obs   =       935
                                                  Wald chi2(4)    =    100.22
                                                  Prob > chi2     =    0.0000
                                                  R-squared       =    0.1685
                                                  Root MSE        =    .38381

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0936325   .0336291     2.78   0.005     .0277206    .1595444
       exper |   .0209216   .0083653     2.50   0.012      .004526    .0373172
      tenure |   .0115482   .0027323     4.23   0.000     .0061929    .0169034
       black |  -.1833285   .0500016    -3.67   0.000    -.2813299   -.0853272
       _cons |   5.215976   .5419965     9.62   0.000     4.153682    6.278269
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper tenure black sibs

2. I estimate the reduced form, use predict to construct the fitted values, and then run the second stage. Here is the output:
. regress educ exper tenure black sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   83.48
       Model |  1190.65488     4   297.66372           Prob > F      =  0.0000
    Residual |  3316.16437   930  3.56576814           R-squared     =  0.2642
-------------+------------------------------           Adj R-squared =  0.2610
       Total |  4506.81925   934  4.82528828           Root MSE      =  1.8883

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.2276516   .0146296   -15.56   0.000    -.2563625   -.1989407
      tenure |   .0259141   .0126151     2.05   0.040     .0011568    .0506715
       black |  -.6226722   .1946812    -3.20   0.001    -1.004738   -.2406069
        sibs |  -.1703301   .0281815    -6.04   0.000    -.2256368   -.1150233
       _cons |   16.49435   .1951037    84.54   0.000     16.11145    16.87724
------------------------------------------------------------------------------

. predict educhat
(option xb assumed; fitted values)

. regress lwage educhat exper tenure black

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   22.75
       Model |  14.7631587     4  3.69078967           Prob > F      =  0.0000
    Residual |  150.893136   930  .162250683           R-squared     =  0.0891
-------------+------------------------------           Adj R-squared =  0.0852
       Total |  165.656294   934  .177362199           Root MSE      =   .4028

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     educhat |   .0936325   .0352931     2.65   0.008     .0243692    .1628959
       exper |   .0209216   .0087792     2.38   0.017     .0036923    .0381509
      tenure |   .0115482   .0028675     4.03   0.000     .0059206    .0171757
       black |  -.1833285   .0524757    -3.49   0.000     -.286313    -.080344
       _cons |   5.215975   .5688148     9.17   0.000     4.099666    6.332285
------------------------------------------------------------------------------

You should notice that the estimates are the same as in part 1, but the standard errors are all wrong. We have discussed this in class.
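Why are the manual standard errors wrong? The second stage computes its residuals with educhat, whereas the 2SLS residuals must be computed with the actual educ. Here is a minimal sketch of the fix (not part of the original answers; the variable name uhat is invented), reusing the manual second-stage coefficients:

    quietly regress lwage educhat exper tenure black
    gen double uhat = lwage - _b[_cons] - _b[educhat]*educ ///
        - _b[exper]*exper - _b[tenure]*tenure - _b[black]*black
    quietly summarize uhat             // 2SLS residuals average to zero
    display "corrected Root MSE: " sqrt(r(Var)*(r(N)-1)/(r(N)-5))

The corrected figure should come out near the .38381 that ivregress reported in part 1, instead of the .4028 above (the two can differ slightly depending on the degrees-of-freedom convention). The reported standard errors are then off by the same factor: for example, .0352931 · (.38381/.4028) ≈ .0336, which is the educ standard error from part 1.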
3. This part is like part 2, but we also use the wrong reduced form. Output follows:

. regress educ sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  1,   933) =   56.67
       Model |  258.055048     1  258.055048           Prob > F      =  0.0000
    Residual |   4248.7642   933  4.55387374           R-squared     =  0.0573
-------------+------------------------------           Adj R-squared =  0.0562
       Total |  4506.81925   934  4.82528828           Root MSE      =   2.134

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.2279164   .0302768    -7.53   0.000     -.287335   -.1684979
       _cons |   14.13879   .1131382   124.97   0.000     13.91676    14.36083
------------------------------------------------------------------------------

. predict eductilde
(option xb assumed; fitted values)

. regress lwage eductilde exper tenure black

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   22.75
       Model |  14.7631579     4  3.69078949           Prob > F      =  0.0000
    Residual |  150.893136   930  .162250684           R-squared     =  0.0891
-------------+------------------------------           Adj R-squared =  0.0852
       Total |  165.656294   934  .177362199           Root MSE      =   .4028

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   eductilde |   .0699749   .0263758     2.65   0.008     .0182119    .1217378
       exper |   -.000394   .0031207    -0.13   0.900    -.0065184    .0057304
      tenure |   .0139746    .002691     5.19   0.000     .0086935    .0192557
       black |  -.2416309    .041528    -5.82   0.000    -.3231303   -.1601315
       _cons |   5.771022   .3603758    16.01   0.000     5.063778    6.478267
------------------------------------------------------------------------------

Now both the standard errors and the estimates themselves are wrong. Bottom line: if a variable is used in the main equation, you must include it in the reduced form equation as well.