Econ 4211 – Principles of Econometrics
Problem Set 3
Answer Key

1 Part 1
In empirical work we often get to work with data that is imperfectly measured. Examples are numerous and include
people rounding up or down their responses in surveys, data entry errors, and sometimes simply the inability to
measure a certain factor precisely (IQ scores as measures of ability come to mind). Econometricians have long
recognized the issues associated with the so-called “measurement error problem”.
Consider the following simple regression model:

   y = βx* + u,                                  (1)

where β is our parameter of interest. We assume that (1) satisfies assumptions MLR1–MLR5. Thus E[u | x*] = 0 and Var[u | x*] = σu². For simplicity we also assume E[y] = E[x*] = 0, so there is no need to include an intercept in the model.¹ Define Var[x*] = σx*².
Now the problem is that we don't observe x*; instead, we observe x = x* + ε, where ε is the measurement error. There is no pattern in the error, so E[ε | x*] = 0 and Var[ε] = σε². We assume that ε is "pure" measurement error that contributes absolutely no information: E[y | x, x*] = E[y | x*], so all the "useful" information is in x*. This implies E[u | x] = 0.
The problem is that ε will be correlated with x, and if we try to estimate (1) using x in place of x*, the results will be poor. Define Var[x] = σx². Do the following:
1. Show that x and ε are correlated, i.e. compute Cov [x, ε].
2. By definition, x = x∗ + ε, which also implies x∗ = x − ε. Use this insight to write y as a function of x and
some new error term ξ (i.e. what’s the expression for ξ going to look like?)
3. Use your results in 1. and 2. above to argue that if you estimate the model y = βx + ξ via OLS, the estimate
for β will be biased.
¹ This also implies Var[x*] = E[(x*)²] and Cov[x*, u] = E[x*·u] (go ahead and verify these formulas if you are not sure why this is the case.)
4. Recall the steps we've embarked on to prove that the OLS estimator was consistent. For a model with an intercept, we had

   β̂₁ = β₁ + [(1/n) Σ_{i=1}^{n} (xᵢ − x̄)(uᵢ − ū)] / [(1/n) Σ_{i=1}^{n} (xᵢ − x̄)²]
and used the Law of Large Numbers to argue that

   (1/n) Σ_{i=1}^{n} (xᵢ − x̄)(uᵢ − ū) →p Cov[x, u],
   (1/n) Σ_{i=1}^{n} (xᵢ − x̄)² →p Var[x],

and by MLR4 Cov[x, u] = 0, so β̂₁ →p β₁.
Use these insights to demonstrate that in our model β̂_OLS does not converge in probability to β.
5. Use the fact that x∗ and ε are independent to show that:
σx² = σx*² + σε².
6. Finally, use your results in 5. to prove that measurement error biases the OLS estimate towards zero. In
econometrics, this is called “attenuation bias”.
1.1 Answer
Most of this problem is discussed in Chapter 9 of the textbook. The algebra is fairly straightforward in most cases.
1. Straightforward:

   Cov[x, ε] = Cov[x* + ε, ε]
             = Cov[x*, ε] + Var[ε]
             = E[x*·ε] + σε²
             = σε².
2. Even easier:

   y = βx* + u
     = β(x − ε) + u
     = βx + (u − βε)
     = βx + ξ,

so ξ = u − βε.
3. This part may be a little challenging. Start with the OLS estimator:

   β̂_OLS = Σ_{i=1}^{n} xᵢyᵢ / Σ_{i=1}^{n} xᵢ²
          = Σ_{i=1}^{n} xᵢ(βxᵢ + ξᵢ) / Σ_{i=1}^{n} xᵢ²
          = β + Σ_{i=1}^{n} xᵢξᵢ / Σ_{i=1}^{n} xᵢ².

Thus

   E[β̂_OLS] = E[ β + Σ_{i=1}^{n} xᵢξᵢ / Σ_{i=1}^{n} xᵢ² ]
             = β + E[ Σ_{i=1}^{n} xᵢξᵢ / Σ_{i=1}^{n} xᵢ² ]
             = β + E[ E[ Σ_{i=1}^{n} xᵢξᵢ / Σ_{i=1}^{n} xᵢ² | x ] ]
             = β + E[ Σ_{i=1}^{n} xᵢ E[uᵢ − βεᵢ | x*, ε] / Σ_{i=1}^{n} xᵢ² ]
             = β + E[ −β Σ_{i=1}^{n} xᵢεᵢ / Σ_{i=1}^{n} xᵢ² ]
             = β − β · E[ Σ_{i=1}^{n} (xᵢ* + εᵢ)εᵢ / Σ_{i=1}^{n} xᵢ² ]
             = β − β · E[ Σ_{i=1}^{n} εᵢ² / Σ_{i=1}^{n} xᵢ² ] ≠ β.

In the second line I used the Law of Iterated Expectations; in the third line I used the fact that x = x* + ε and that, conditional on x, any function of x is deterministic, so I can "drag the expectation through" it. The last two lines just use the definition of x and the fact that x* is uncorrelated with ε. The expectation term cannot be simplified any further, but it is clear that it is not in general equal to zero, because it is a ratio of sums of squared values. Hence β̂_OLS is biased.
4. This is actually fairly easy. The trick is to rewrite the OLS estimator as:

   β̂_OLS = β + Σ_{i=1}^{n} xᵢξᵢ / Σ_{i=1}^{n} xᵢ²
          = β + [(1/n) Σ_{i=1}^{n} xᵢuᵢ] / [(1/n) Σ_{i=1}^{n} xᵢ²] − β·[(1/n) Σ_{i=1}^{n} xᵢεᵢ] / [(1/n) Σ_{i=1}^{n} xᵢ²],
where the last line uses the definition of ξ. Now, the Law of Large Numbers implies:
   (1/n) Σ_{i=1}^{n} xᵢ²  →p  Var[x] = σx²
   (1/n) Σ_{i=1}^{n} xᵢuᵢ →p  Cov[x, u] = 0
   (1/n) Σ_{i=1}^{n} xᵢεᵢ →p  Cov[x, ε] = σε²
where the last line uses our results from 1. Thus

   β̂_OLS →p β + 0 − β·(σε²/σx²) = β·(1 − σε²/σx²).
5. This is also trivial:

   σx² = Var[x] = Var[x* + ε]
       = Var[x*] + Var[ε] + 2·Cov[x*, ε]
       = σx*² + σε²,

because the covariance is necessarily zero by independence.
6. Again, this is straightforward:

   β̂_OLS →p β·(1 − σε²/(σx*² + σε²)) = β·σx*²/(σx*² + σε²),

and since 0 < σx*²/(σx*² + σε²) < 1, asymptotically β̂_OLS will be equal to something strictly smaller in magnitude than the true β, i.e. pulled toward zero. That's the attenuation bias.
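The attenuation result is easy to check numerically. Here is a minimal Monte Carlo sketch in Python (my own illustration, not part of the assignment): with β = 2 and σx*² = σε² = 1, the attenuation factor is 1/2, so the OLS slope computed from the mismeasured x should settle near 1 rather than 2.

```python
# Monte Carlo sketch of attenuation bias. Parameter choices (beta = 2,
# var(x*) = var(eps) = var(u) = 1) are illustrative assumptions.
import random

random.seed(0)
n = 200_000
beta = 2.0
x_star = [random.gauss(0, 1) for _ in range(n)]   # true regressor
eps    = [random.gauss(0, 1) for _ in range(n)]   # measurement error
u      = [random.gauss(0, 1) for _ in range(n)]   # structural error
x = [xs + e for xs, e in zip(x_star, eps)]        # observed, mismeasured x
y = [beta * xs + ui for xs, ui in zip(x_star, u)]

# OLS through the origin (no intercept, as in the model): sum(xy) / sum(x^2)
beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Should be close to beta * 1/2 = 1, not to beta = 2
print(round(beta_hat, 2))
```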
2 Part 2

Solve the following problems from Wooldridge's textbook (exercise numbers refer to the 4th edition):
• 15.1
• 15.3
• 15.10
2.1 Answer

2.1.1 Exercise 15.1
1. Other factors that determine GPA (which are assumed to be in u) can be correlated with PC. For example, maybe only very hard-working and determined students buy computers; then their dedication will affect both PC and GPA via u. Or, alternatively, only students who care little about grades buy computers, to play games and use Facebook. Either way, there is potential for endogeneity of the PC variable.
2. If a student has rich parents, it is more likely this student will end up owning a computer, so this relationship is obvious. What's less obvious is whether parents' income has no effect on GPA except through the PC variable. This is unlikely: a rich student may spend more on tutors and/or textbooks, giving income other channels of influence on GPA. Unless we control for these channels, parents' income won't be a great instrument.
3. Here the indicator for whether a student received a grant should be a terrific instrument for PC. It will probably be correlated with PC (relevance); we would also need to argue that the grants were not directly related to GPA (validity). But if the grants were given out at random, we know this is true. Thus any effect of the grants on GPA can only come through PC ownership – this is an ideal scenario for IV analysis.
2.1.2 Exercise 15.3
We know that:

   β̂_IV = Σ_{i=1}^{n} (zᵢ − z̄)(yᵢ − ȳ) / Σ_{i=1}^{n} (zᵢ − z̄)(xᵢ − x̄),

and the trick is to manipulate this equation and turn it into what we need. This may take a while. The easier way is to look at the equations from which the estimate comes:

   Σ_{i=1}^{n} (yᵢ − β₀ − β₁xᵢ) = 0
   Σ_{i=1}^{n} (yᵢ − β₀ − β₁xᵢ) zᵢ = 0
The first condition gives us:

   Σ_{i=1}^{n} yᵢ = nβ₀ + β₁ Σ_{i=1}^{n} xᵢ
   (1/n) Σ_{i=1}^{n} yᵢ = β₀ + β₁ (1/n) Σ_{i=1}^{n} xᵢ
   ȳ = β₀ + β₁x̄
   β₀ = ȳ − β₁x̄,
and the second condition can be rewritten as:

   Σ_{i=1}^{n} (yᵢ − β₀ − β₁xᵢ) zᵢ = Σ_{i=1}^{n₀} (yᵢ − β₀ − β₁xᵢ)·0 + Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ)
                                   = Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ) = 0,

where I denoted by n₀ the number of observations for which zᵢ = 0, n₁ is defined similarly (clearly n₀ + n₁ = n), and the observations are ordered so that the first n₀ of them have zᵢ = 0. Then take the last equation:

   0 = Σ_{i=n₀+1}^{n} (yᵢ − β₀ − β₁xᵢ)
   Σ_{i=n₀+1}^{n} yᵢ = n₁β₀ + β₁ Σ_{i=n₀+1}^{n} xᵢ
   ȳ₁ = β₀ + β₁x̄₁,
and now plug the expression for β0 into it and solve for β1 :
   ȳ₁ = (ȳ − β₁x̄) + β₁x̄₁
   ȳ₁ − ȳ = β₁(x̄₁ − x̄)
   β₁ = (ȳ₁ − ȳ) / (x̄₁ − x̄).
We are not done yet, however. But we're close. Note that:

   ȳ = (1/n) Σ_{i=1}^{n} yᵢ = (1/n) [ Σ_{i=1}^{n₀} yᵢ + Σ_{i=n₀+1}^{n} yᵢ ] = (1/n) (n₀ȳ₀ + n₁ȳ₁)

and similarly

   x̄ = (1/n) (n₀x̄₀ + n₁x̄₁),
so we can substitute these two expressions:

   β₁ = (ȳ₁ − ȳ) / (x̄₁ − x̄)
      = [ȳ₁ − (1/n)(n₀ȳ₀ + n₁ȳ₁)] / [x̄₁ − (1/n)(n₀x̄₀ + n₁x̄₁)]
      = [(n/n)ȳ₁ − (n₀/n)ȳ₀ − (n₁/n)ȳ₁] / [(n/n)x̄₁ − (n₀/n)x̄₀ − (n₁/n)x̄₁]
      = [((n − n₁)/n)ȳ₁ − (n₀/n)ȳ₀] / [((n − n₁)/n)x̄₁ − (n₀/n)x̄₀]
      = [(n₀/n)(ȳ₁ − ȳ₀)] / [(n₀/n)(x̄₁ − x̄₀)]
      = (ȳ₁ − ȳ₀) / (x̄₁ − x̄₀).

That's it.
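The grouped-means (Wald) form can also be verified numerically. Below is a short Python check on made-up data (the data-generating process is my own choice, not from the exercise): the textbook IV formula and (ȳ₁ − ȳ₀)/(x̄₁ − x̄₀) give the same number, as the algebra above says they must.

```python
# Check that, with a binary instrument z, the IV estimator equals the
# Wald estimator (ybar1 - ybar0) / (xbar1 - xbar0). Toy data only.
import random

random.seed(1)
n = 1000
z = [random.randint(0, 1) for _ in range(n)]      # binary instrument
x = [2.0 * zi + random.gauss(0, 1) for zi in z]   # first stage
y = [0.5 * xi + random.gauss(0, 1) for xi in x]   # outcome

zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n

# Textbook IV formula
beta_iv = (sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
           / sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x)))

# Wald / grouped-means form
y1 = [yi for zi, yi in zip(z, y) if zi == 1]
y0 = [yi for zi, yi in zip(z, y) if zi == 0]
x1 = [xi for zi, xi in zip(z, x) if zi == 1]
x0 = [xi for zi, xi in zip(z, x) if zi == 0]
beta_wald = ((sum(y1) / len(y1) - sum(y0) / len(y0))
             / (sum(x1) / len(x1) - sum(x0) / len(x0)))

print(abs(beta_iv - beta_wald) < 1e-9)   # the two formulas agree
```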
2.1.3 Exercise 15.10
1. It is not immediately obvious that attending a Catholic high school increases the chance of attending college. Maybe the causality goes the other way: students who are most likely to enter college may, for some reason, choose to study at Catholic high schools, or perhaps only the most able students go there. So if there is some unobservable ability variable, it will be contained in u and will be correlated with CathHS.
2. I would include this variable in the main equation. Doing so would partially control for students' abilities, which may reduce the bias from the endogeneity of CathHS.
3. We need CathRel to be correlated with CathHS – this can be tested directly, and it is the relevance condition. We also need CathRel to be uncorrelated with u, i.e. the only effect a student's religious beliefs may have on his/her probability of getting into college is through the CathHS channel. This is the validity condition, which unfortunately cannot be tested.
4. No. Relevance has nothing to do with validity. If for some reason Catholic students have higher ability than non-Catholic students, then CathRel will be correlated with ability and hence with u, which will make it an invalid instrument.
3 Part 3

Solve the following computer problems from Wooldridge's textbook (exercise numbers refer to the 4th edition):
• C15.1
• C15.9
3.1 Answer

3.1.1 Exercise C15.1
1. Here are the regression outputs from using sibs as an instrument for educ and from a regression that uses sibs as a right-hand-side variable directly:
. ivregress 2sls lwage (educ = sibs)

Instrumental variables (2SLS) regression        Number of obs =     935
                                                Wald chi2(1)  =   21.63
                                                Prob > chi2   =  0.0000
                                                R-squared     =       .
                                                Root MSE      =  .42285

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1224327   .0263224     4.65   0.000     .0708418    .1740236
       _cons |   5.130026   .3547911    14.46   0.000     4.434648    5.825404
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   sibs

. regress lwage sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  1,   933) =   22.31
       Model |  3.86818211     1  3.86818211           Prob > F      =  0.0000
    Residual |  161.788112   933  .173406337           R-squared     =  0.0234
-------------+------------------------------           Adj R-squared =  0.0223
       Total |  165.656294   934  .177362199           Root MSE      =  .41642

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.0279044   .0059082    -4.72   0.000    -.0394992   -.0163096
       _cons |   6.861076   .0220776   310.77   0.000     6.817748    6.904403
------------------------------------------------------------------------------
These two regressions really cannot be compared – they consider the impacts of two different factors on wages. The first regression looks at the impact of getting more education, and the second considers the effect of having more siblings.
2. Yes, clearly there is an effect, as the output below demonstrates. A child born third is less likely to get more
education because being third means you have at least 2 siblings, and your parents have to spend money on
them, too.
. regress educ brthord

      Source |       SS       df       MS              Number of obs =     852
-------------+------------------------------           F(  1,   850) =   37.29
       Model |  173.087012     1  173.087012           Prob > F      =  0.0000
    Residual |  3945.88364   850  4.64221605           R-squared     =  0.0420
-------------+------------------------------           Adj R-squared =  0.0409
       Total |  4118.97066   851  4.84015353           Root MSE      =  2.1546

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     brthord |  -.2826441   .0462882    -6.11   0.000    -.3734967   -.1917915
       _cons |   14.14945   .1286754   109.96   0.000     13.89689    14.40201
------------------------------------------------------------------------------

3. Results are fairly similar to the first regression from part 1. It appears that brthord is a good instrument, and this is reasonable if sibs was, because these two variables use pretty much the same idea.
. ivregress 2sls lwage (educ = brthord)

Instrumental variables (2SLS) regression        Number of obs =     852
                                                Wald chi2(1)  =   16.67
                                                Prob > chi2   =  0.0000
                                                R-squared     =       .
                                                Root MSE      =  .42101

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1306448   .0320009     4.08   0.000     .0679242    .1933654
       _cons |   5.030397   .4324406    11.63   0.000     4.182829    5.877965
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   brthord

The regression suggests an extra year of education raises wages by about 13 percent. This is a large effect, significant both statistically and economically.
4. The identification condition is π₂ ≠ 0, where π₂ is the coefficient on brthord in the reduced form; we test H₀: π₂ = 0 against H₁: π₂ ≠ 0 and hope to reject. The reduced-form regression results below demonstrate that we are identified: π̂₂ is significantly different from zero at the 1% level.
. regress educ sibs brthord

      Source |       SS       df       MS              Number of obs =     852
-------------+------------------------------           F(  2,   849) =   26.29
       Model |  240.246365     2  120.123183           Prob > F      =  0.0000
    Residual |  3878.72429   849  4.56857985           R-squared     =  0.0583
-------------+------------------------------           Adj R-squared =  0.0561
       Total |  4118.97066   851  4.84015353           Root MSE      =  2.1374

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.1528673   .0398705    -3.83   0.000    -.2311236   -.0746109
     brthord |  -.1526742   .0570764    -2.67   0.008    -.2647017   -.0406467
       _cons |    14.2965   .1332881   107.26   0.000     14.03489    14.55811
------------------------------------------------------------------------------

5. Clearly sibs is very imprecisely estimated in the regression below. I would suspect this is because sibs is highly correlated with brthord. The fact that educ is also barely significant now also hints at this: Stata cannot disentangle the two effects from each other well enough.
. ivregress 2sls lwage sibs (educ = brthord)

Instrumental variables (2SLS) regression        Number of obs =     852
                                                Wald chi2(2)  =   21.87
                                                Prob > chi2   =  0.0000
                                                R-squared     =       .
                                                Root MSE      =  .42623

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |    .136994   .0745496     1.84   0.066    -.0091205    .2831085
        sibs |   .0021107   .0173411     0.12   0.903    -.0318772    .0360986
       _cons |   4.938529    1.05383     4.69   0.000      2.87306    7.003998
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   sibs brthord
6. I re-estimated the reduced form from part 4 and used the command predict educhat to generate the fitted values. As suspected, they are highly correlated with sibs:

. quietly regress educ sibs brthord

. predict educhat
(option xb assumed; fitted values)
(83 missing values generated)

. correlate educhat sibs
(obs=852)

             |  educhat     sibs
-------------+------------------
     educhat |   1.0000
        sibs |  -0.9295   1.0000

In the first command I used the quietly prefix to suppress the output; I just need the estimates from that regression to be active so that I can use predict in the next command.
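The same mechanics can be mimicked in a few lines of Python on simulated data (the variable names and data-generating process below are my own stand-ins for sibs, brthord, and educ, not the Wooldridge sample): first-stage fitted values are a linear combination of the regressors, so they can be almost collinear with any one of them.

```python
# Toy illustration of why first-stage fitted values end up highly
# correlated with one regressor when the regressors themselves are
# correlated. Simulated data only.
import random
from statistics import mean

random.seed(2)
n = 852
x1 = [random.gauss(0, 1) for _ in range(n)]                # stand-in for sibs
x2 = [0.7 * a + random.gauss(0, 0.5) for a in x1]          # stand-in for brthord
y  = [a + b + random.gauss(0, 1) for a, b in zip(x1, x2)]  # stand-in for educ

# OLS of y on x1 and x2 via the demeaned normal equations (Cramer's rule)
d1 = [a - mean(x1) for a in x1]
d2 = [b - mean(x2) for b in x2]
dy = [c - mean(y) for c in y]
s11 = sum(a * a for a in d1)
s22 = sum(b * b for b in d2)
s12 = sum(a * b for a, b in zip(d1, d2))
s1y = sum(a * c for a, c in zip(d1, dy))
s2y = sum(b * c for b, c in zip(d2, dy))
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s2y * s12) / det
b2 = (s11 * s2y - s12 * s1y) / det

# Demeaned fitted values (adding the intercept back would not change a correlation)
fitted = [b1 * a + b2 * b for a, b in zip(d1, d2)]

def corr(u, v):
    return (sum(a * b for a, b in zip(u, v))
            / (sum(a * a for a in u) ** 0.5 * sum(b * b for b in v) ** 0.5))

print(round(corr(fitted, d1), 2))   # high, like the educhat-sibs correlation
```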
3.1.2 Exercise C15.9
1. Here is the output from the two-stage least squares procedure. This is what we will try to replicate by hand
in the subsequent parts of the problem.
. ivregress 2sls lwage (educ = sibs) exper tenure black

Instrumental variables (2SLS) regression        Number of obs =     935
                                                Wald chi2(4)  =  100.22
                                                Prob > chi2   =  0.0000
                                                R-squared     =  0.1685
                                                Root MSE      =  .38381

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0936325   .0336291     2.78   0.005     .0277206    .1595444
       exper |   .0209216   .0083653     2.50   0.012      .004526    .0373172
      tenure |   .0115482   .0027323     4.23   0.000     .0061929    .0169034
       black |  -.1833285   .0500016    -3.67   0.000    -.2813299   -.0853272
       _cons |   5.215976   .5419965     9.62   0.000     4.153682    6.278269
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper tenure black sibs
2. I estimate the reduced form, use predict to construct fitted values and then run the second stage. Here is
the output:
. regress educ exper tenure black sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   83.48
       Model |  1190.65488     4   297.66372           Prob > F      =  0.0000
    Residual |  3316.16437   930  3.56576814           R-squared     =  0.2642
-------------+------------------------------           Adj R-squared =  0.2610
       Total |  4506.81925   934  4.82528828           Root MSE      =  1.8883

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |  -.2276516   .0146296   -15.56   0.000    -.2563625   -.1989407
      tenure |   .0259141   .0126151     2.05   0.040     .0011568    .0506715
       black |  -.6226722   .1946812    -3.20   0.001    -1.004738   -.2406069
        sibs |  -.1703301   .0281815    -6.04   0.000    -.2256368   -.1150233
       _cons |   16.49435   .1951037    84.54   0.000     16.11145    16.87724
------------------------------------------------------------------------------
. predict educhat
(option xb assumed; fitted values)

. regress lwage educhat exper tenure black

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   22.75
       Model |  14.7631587     4  3.69078967           Prob > F      =  0.0000
    Residual |  150.893136   930  .162250683           R-squared     =  0.0891
-------------+------------------------------           Adj R-squared =  0.0852
       Total |  165.656294   934  .177362199           Root MSE      =   .4028

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     educhat |   .0936325   .0352931     2.65   0.008     .0243692    .1628959
       exper |   .0209216   .0087792     2.38   0.017     .0036923    .0381509
      tenure |   .0115482   .0028675     4.03   0.000     .0059206    .0171757
       black |  -.1833285   .0524757    -3.49   0.000     -.286313    -.080344
       _cons |   5.215975   .5688148     9.17   0.000     4.099666    6.332285
------------------------------------------------------------------------------

You should notice that the estimates are the same as in part 1., but the standard errors are all wrong. We have discussed this in class.
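Why the by-hand second stage reproduces the point estimates but not the standard errors can be sketched in plain Python. This is a simulated toy with one endogenous regressor and no controls (all names and parameters are my own assumptions, not the Wooldridge data):

```python
# By-hand 2SLS gives the IV point estimate exactly, but its naive
# standard error uses residuals from the fitted values x_hat instead
# of residuals from the actual x, so the SE is wrong.
import random

random.seed(3)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]           # instrument
v = [random.gauss(0, 1) for _ in range(n)]           # first-stage error
u = [0.5 * vi + random.gauss(0, 1) for vi in v]      # u correlated with v => x endogenous
x = [zi + vi for zi, vi in zip(z, v)]                # first stage: x depends on z
y = [1.0 * xi + ui for xi, ui in zip(x, u)]

def ols_slope(a, b):
    am, bm = sum(a) / n, sum(b) / n
    return (sum((ai - am) * (bi - bm) for ai, bi in zip(a, b))
            / sum((ai - am) ** 2 for ai in a))

# Direct IV estimate
zm, xm, ym = sum(z) / n, sum(x) / n, sum(y) / n
beta_iv = (sum((zi - zm) * (yi - ym) for zi, yi in zip(z, y))
           / sum((zi - zm) * (xi - xm) for zi, xi in zip(z, x)))

# By-hand 2SLS: first stage, then regress y on the fitted values
pi_hat = ols_slope(z, x)
x_hat = [xm + pi_hat * (zi - zm) for zi in z]
beta_2sls = ols_slope(x_hat, y)
assert abs(beta_iv - beta_2sls) < 1e-9               # identical point estimates

# The naive second stage computes residuals from x_hat; the correct
# 2SLS standard error computes them from the actual x.
alpha = ym - beta_2sls * (sum(x_hat) / n)
ssr_naive   = sum((yi - alpha - beta_2sls * xh) ** 2 for yi, xh in zip(y, x_hat))
ssr_correct = sum((yi - alpha - beta_2sls * xi) ** 2 for yi, xi in zip(y, x))
print(ssr_naive > ssr_correct)   # different residuals => different SEs
```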
3. This part is like part 2., except that now we use the wrong reduced form. Output follows:
. regress educ sibs

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  1,   933) =   56.67
       Model |  258.055048     1  258.055048           Prob > F      =  0.0000
    Residual |   4248.7642   933  4.55387374           R-squared     =  0.0573
-------------+------------------------------           Adj R-squared =  0.0562
       Total |  4506.81925   934  4.82528828           Root MSE      =   2.134

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        sibs |  -.2279164   .0302768    -7.53   0.000     -.287335   -.1684979
       _cons |   14.13879   .1131382   124.97   0.000     13.91676    14.36083
------------------------------------------------------------------------------

. predict eductilde
(option xb assumed; fitted values)
. regress lwage eductilde exper tenure black

      Source |       SS       df       MS              Number of obs =     935
-------------+------------------------------           F(  4,   930) =   22.75
       Model |  14.7631579     4  3.69078949           Prob > F      =  0.0000
    Residual |  150.893136   930  .162250684           R-squared     =  0.0891
-------------+------------------------------           Adj R-squared =  0.0852
       Total |  165.656294   934  .177362199           Root MSE      =   .4028

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   eductilde |   .0699749   .0263758     2.65   0.008     .0182119    .1217378
       exper |   -.000394   .0031207    -0.13   0.900    -.0065184    .0057304
      tenure |   .0139746    .002691     5.19   0.000     .0086935    .0192557
       black |  -.2416309    .041528    -5.82   0.000    -.3231303   -.1601315
       _cons |   5.771022   .3603758    16.01   0.000     5.063778    6.478267
------------------------------------------------------------------------------

Now both the standard errors and the estimates themselves are wrong. Bottom line: if a variable is used in the main equation, you must include it in the reduced-form equation as well.