Uploaded by charles.heouro

A7

Q1: (i) It seems reasonable to assume that dist and u are uncorrelated because classrooms are
not usually assigned with convenience for students in mind.
(ii) The variable dist must be partially correlated with atndrte. More precisely, in the reduced
form
$$atndrte = \pi_0 + \pi_1\, priGPA + \pi_2\, ACT + \pi_3\, dist + v,$$
we must have $\pi_3 \neq 0$. Given a sample of data we can test $H_0: \pi_3 = 0$ against $H_1: \pi_3 \neq 0$ using a t test.
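As a rough illustration of the t test in part (ii), the following Python sketch regresses atndrte on priGPA, ACT, and dist and reports the t statistic on dist. The data are simulated: the sample size, the distributions, and every coefficient in the data-generating process are assumptions made up for this example, and the helper name ols_t_stats is hypothetical. The standard errors are the usual homoskedasticity-only ones.

```python
import numpy as np

def ols_t_stats(X, y):
    """Return OLS coefficients and their homoskedasticity-only t statistics."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # unbiased error-variance estimate
    se = np.sqrt(sigma2 * np.diag(XtX_inv))   # standard errors of the coefficients
    return beta, beta / se

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 500
priGPA = rng.normal(2.6, 0.5, n)
ACT = rng.normal(22.0, 3.5, n)
dist = rng.exponential(2.0, n)
# Assumed reduced form: attendance falls with distance (pi_3 = -2).
atndrte = 60 + 8 * priGPA + 0.3 * ACT - 2.0 * dist + rng.normal(0.0, 10.0, n)

# Reduced-form regression: atndrte on a constant, priGPA, ACT, dist.
X = np.column_stack([np.ones(n), priGPA, ACT, dist])
pi_hat, t_stats = ols_t_stats(X, atndrte)

# Reject H0: pi_3 = 0 at the 5% level if |t| exceeds roughly 1.96.
print("pi3_hat =", pi_hat[3], "t statistic =", t_stats[3])
```

A large |t| on dist is evidence that the instrument is relevant; it says nothing about whether dist is uncorrelated with u, which cannot be tested in this just-identified setting.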
(iii) We now need instrumental variables for atndrte and the interaction term, priGPA⋅atndrte.
(Even though priGPA is exogenous, atndrte is not, and so priGPA⋅atndrte is generally correlated
with u.) Under the exogeneity assumption that E(u|priGPA,ACT,dist) = 0, any function of
priGPA, ACT, and dist is uncorrelated with u. In particular, the interaction priGPA⋅dist is
uncorrelated with u. If dist is partially correlated with atndrte then priGPA⋅dist is partially
correlated with priGPA⋅atndrte. So, we can estimate the equation
$$stndfnl = \beta_0 + \beta_1\, atndrte + \beta_2\, priGPA + \beta_3\, ACT + \beta_4\, priGPA \cdot atndrte + u$$
by 2SLS using IVs dist, priGPA, ACT, and priGPA⋅dist. It turns out this is not generally optimal.
It may be better to add $priGPA^2$ and priGPA⋅ACT to the instrument list. This would give us
overidentifying restrictions to test.
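A minimal sketch of the just-identified 2SLS estimator described in part (iii), using dist and priGPA⋅dist as instruments for atndrte and priGPA⋅atndrte. Everything about the data is an assumption for illustration: the data are simulated, the coefficients in the data-generating process are invented, and the function name two_sls is hypothetical. The code computes point estimates only, not standard errors.

```python
import numpy as np

def two_sls(y, X, Z):
    """2SLS point estimates: project X on Z, then solve (X_hat' X) b = X_hat' y."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first-stage fitted values
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(1)
n = 5000
priGPA = rng.normal(2.6, 0.5, n)
ACT = rng.normal(22.0, 3.5, n)
dist = rng.exponential(2.0, n)
u = rng.normal(0.0, 1.0, n)

# Assumed DGP: atndrte depends on u, so it is endogenous; dist is exogenous.
atndrte = 80 + 4 * priGPA - 3 * dist + 5 * u + rng.normal(0.0, 5.0, n)
stndfnl = (-1 + 0.01 * atndrte + 0.2 * priGPA + 0.03 * ACT
           + 0.005 * priGPA * atndrte + u)

const = np.ones(n)
# Regressors of the structural equation.
X = np.column_stack([const, atndrte, priGPA, ACT, priGPA * atndrte])
# Instrument list: exogenous regressors plus dist and priGPA*dist.
Z = np.column_stack([const, dist, priGPA, ACT, priGPA * dist])

beta_2sls = two_sls(stndfnl, X, Z)
print(beta_2sls)
```

Adding columns such as priGPA**2 and priGPA * ACT to Z would make the equation overidentified; the same function handles that case because it only requires Z to have at least as many columns as X.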
Q3: (i) Family income and background variables, such as parents’ education.
(ii) The population model is
$$score = \beta_0 + \beta_1\, girlhs + \beta_2\, faminc + \beta_3\, meduc + \beta_4\, feduc + u_1,$$
where the variables are self-explanatory.
(iii) Parents who are supportive and motivated to have their daughters do well in
school may also be more likely to enroll their daughters in a girls’ high school. It seems
likely that girlhs and $u_1$ are correlated.
(iv) Let numghs be the number of girls’ high schools within a 20-mile radius of a
girl’s home. To be a valid IV for girlhs, numghs must satisfy two requirements: it must
be uncorrelated with $u_1$ and it must be partially correlated with girlhs. The second
requirement probably holds, and can be tested by estimating the reduced form
$$girlhs = \pi_0 + \pi_1\, faminc + \pi_2\, meduc + \pi_3\, feduc + \pi_4\, numghs + v_2$$
and testing numghs for statistical significance. The first requirement is more
problematic. Girls’ high schools tend to locate in areas where there is a demand, and
this demand can reflect the seriousness with which people in the community view
education. Some areas of a state have better students on average for reasons unrelated to
family income and parents’ education, and these reasons might be correlated with
numghs. One possibility is to include community-level variables that can control for
differences across communities.
Q2: Consider the simple regression model:
y = β0 + β1x + u
and let z be a binary instrumental variable for x. Using equation (15.10) in the
textbook, the IV estimator for β1 can be written as:
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})}$$
where n is the sample size and $\bar{z}$, $\bar{x}$, and $\bar{y}$ are the sample
averages of z, x, and y, respectively.
Since z is a binary instrument, there are two groups of observations: those for which z
= 0 and those for which z = 1. Let $n_0$ and $n_1$ be the sample sizes for these groups,
respectively, so that $n = n_0 + n_1$. Let $\bar{y}_0$ and $\bar{x}_0$ be the sample averages
of y and x for the group with z = 0, and let $\bar{y}_1$ and $\bar{x}_1$ be the sample
averages of y and x for the group with z = 1.
Consider the numerator first. Because $\sum_{i=1}^{n} (y_i - \bar{y}) = 0$, the term involving
$\bar{z}$ drops out, and because $z_i$ is either 0 or 1, only the observations with $z_i = 1$
contribute to the remaining sum:
$$\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y}) = \sum_{i=1}^{n} z_i (y_i - \bar{y}) = \sum_{i:\, z_i = 1} (y_i - \bar{y}) = n_1 (\bar{y}_1 - \bar{y}).$$
The overall average is a weighted average of the two group averages,
$\bar{y} = (n_0 \bar{y}_0 + n_1 \bar{y}_1)/n$, so
$$\bar{y}_1 - \bar{y} = \frac{n_0}{n} (\bar{y}_1 - \bar{y}_0),$$
and the numerator equals $(n_0 n_1 / n)(\bar{y}_1 - \bar{y}_0)$.
The same argument with x in place of y shows that the denominator equals
$$\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x}) = \frac{n_0 n_1}{n} (\bar{x}_1 - \bar{x}_0).$$
Taking the ratio, the common factor $n_0 n_1 / n$ cancels, and we get:
$$\hat{\beta}_1 = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0}.$$
Thus, we have shown that the IV estimator for β1 can be written as the grouping
estimator:
$$\hat{\beta}_1 = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0}$$
where $\bar{y}_0$ and $\bar{x}_0$ are the sample averages of y and x for the group with z = 0, and
$\bar{y}_1$ and $\bar{x}_1$ are the sample averages of y and x for the group with z = 1. This
estimator was first suggested by Wald (1940).
Therefore, we have used equation (15.10) in the textbook to derive the grouping
estimator for the IV estimator of β1 in the simple regression model with a binary
instrumental variable.
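The equivalence can also be checked numerically. The sketch below simulates data with a binary instrument (the sample size, coefficients, and error structure are assumptions chosen only for illustration), computes the IV estimator from the covariance formula, and compares it with the difference in group means.

```python
import numpy as np

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(2)
n = 1000
z = rng.integers(0, 2, n)                      # binary instrument, values 0 or 1
u = rng.normal(0.0, 1.0, n)                    # structural error
# x is endogenous because it depends on u; z shifts x but is independent of u.
x = 1.0 + 2.0 * z + 0.8 * u + rng.normal(0.0, 1.0, n)
y = 0.5 + 1.5 * x + u

# IV estimator: ratio of sample covariances of z with y and of z with x.
iv = (np.sum((z - z.mean()) * (y - y.mean()))
      / np.sum((z - z.mean()) * (x - x.mean())))

# Grouping (Wald) estimator: ratio of differences in group means.
wald = ((y[z == 1].mean() - y[z == 0].mean())
        / (x[z == 1].mean() - x[z == 0].mean()))

print("IV estimate:  ", iv)
print("Wald estimate:", wald)
```

The two numbers agree up to floating-point rounding for any sample in which both groups are non-empty and the group means of x differ, since the identity is algebraic rather than statistical.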