Assignment 2
Empirical methods in economics IIa, HT22
Ellen Bäcklinder & Olivia Campbell
1. The Instrumental Variable Approach
Let’s say that we have an equation of interest: Yi = β0 + β1Xi + ui. However, we think that Xi
is endogenous, i.e. cov(Xi, ui)≠0. We have another variable, Zi, that predicts Xi and only
affects the outcome variable Yi through the endogenous variable, Xi. This means that we
can find the causal effect of X on Y using an Instrumental Variable (IV) approach.
a) Write down the first stage regression.
𝑋𝑖 = 𝛾0 + 𝛾1 𝑍𝑖 + 𝑣𝑖
b) Write down the reduced form equation.
π‘Œπ‘– = 𝛿0 + 𝛿1 𝑍𝑖 + πœ‰π‘–
c) Write down the $\hat{\beta}_1^{IV}$ estimate given by the first stage and reduced form above.
$\hat{\beta}_1^{IV}$ without first stage and reduced form:
$$\hat{\beta}_1^{IV} = \frac{\frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})(Z_i - \bar{Z})}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})(Z_i - \bar{Z})} = \frac{\sum_{i=1}^{n}(Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum_{i=1}^{n}(X_i - \bar{X})(Z_i - \bar{Z})}$$
$\hat{\beta}_1^{IV}$ with first stage and reduced form:
$$\hat{\beta}_1^{IV} = \frac{\delta_1}{\gamma_1} = \frac{C(Y_i, Z_i)/V(Z_i)}{C(X_i, Z_i)/V(Z_i)} = \frac{C(Y_i, Z_i)}{C(X_i, Z_i)}$$
First stage OLS estimate: $\gamma_1 = \frac{C(X_i, Z_i)}{V(Z_i)}$
Reduced form OLS estimate: $\delta_1 = \frac{C(Y_i, Z_i)}{V(Z_i)}$
d) In just a few sentences; What is it we do in IV-regressions that let us claim causal
effects, even though Xi is endogenous?
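IV uses only the part of the variation in X that is driven by the instrument Z. Since Z is
exogenous, this part of X is uncorrelated with the error term u, so its relationship with Y can be
interpreted as the causal effect of X on Y, even though X itself is endogenous.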
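A small simulation can make this concrete. The sketch below uses made-up data (the coefficients 2.0, 0.5 and 0.8, the sample size and the seed are illustrative assumptions, not taken from the assignment) in which X is correlated with u by construction. OLS of Y on X should then be biased upwards, while the ratio of the reduced form to the first stage from c) should land close to the true β1.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (dividing by n)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def ols_slope(y, x):
    """Slope from a bivariate OLS regression of y on x."""
    return cov(y, x) / cov(x, x)

random.seed(0)
n = 20000
beta1 = 2.0  # true causal effect, assumed for the simulation

z = [random.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved error term
v = [random.gauss(0, 1) for _ in range(n)]  # first-stage noise
# x depends on u, so it is endogenous by construction
x = [0.5 * zi + 0.8 * ui + vi for zi, ui, vi in zip(z, u, v)]
y = [1.0 + beta1 * xi + ui for xi, ui in zip(x, u)]

gamma1 = ols_slope(x, z)             # first stage
delta1 = ols_slope(y, z)             # reduced form
beta_ols = ols_slope(y, x)           # biased, since cov(x, u) != 0
beta_iv = delta1 / gamma1            # reduced form over first stage
beta_iv_cov = cov(y, z) / cov(x, z)  # same estimate written as C(Y,Z)/C(X,Z)

print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}  C(Y,Z)/C(X,Z): {beta_iv_cov:.3f}")
```

The two IV expressions agree up to floating-point error because V(Z) cancels in the ratio, which is the simplification shown in c).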
e) How do you test if the instrument is exogenous? That is, that the instrument does not
have a causal effect on Y.
An instrument doesn’t have a direct causal effect on Y, but it does affect X, which in turn affects Y.
If an instrument is exogenous, then C(ui, Zi) = 0. This can’t be tested directly, since ui is
unobserved; you have to argue from intuition that the instrument is exogenous. However, you can
test whether Z is correlated with observed covariates, i.e., whether C(Wi, Zi) = 0, where W is a
covariate. If Z is uncorrelated with the observed covariates, it’s more plausible that Z is also
uncorrelated with the unobserved determinants of Y in u.
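The covariate check described above can be sketched as a comparison of means. The example below is only an illustration: the function name `balance_check` and the toy data (ages and a same-sex dummy) are hypothetical, and the standard error is the usual two-sample one rather than the output of any particular regression command.

```python
import math

def balance_check(w, z):
    """Difference in the mean of covariate w between z == 1 and z == 0,
    together with a two-sample standard error."""
    w1 = [wi for wi, zi in zip(w, z) if zi == 1]
    w0 = [wi for wi, zi in zip(w, z) if zi == 0]
    m1, m0 = sum(w1) / len(w1), sum(w0) / len(w0)
    v1 = sum((wi - m1) ** 2 for wi in w1) / (len(w1) - 1)
    v0 = sum((wi - m0) ** 2 for wi in w0) / (len(w0) - 1)
    se = math.sqrt(v1 / len(w1) + v0 / len(w0))
    return m1 - m0, se

# Hypothetical toy data: a covariate (age) and a binary instrument
age = [24, 31, 28, 35, 22, 30, 27, 34]
samesex = [1, 0, 1, 0, 0, 1, 0, 1]

diff, se = balance_check(age, samesex)
print(f"difference in means = {diff:.2f}, t-statistic = {diff / se:.2f}")
```

A t-statistic close to zero says only that Z is balanced on this observed covariate. It does not prove that C(ui, Zi) = 0.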
Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation
in Family Size, by Angrist and Evans, AER 1998
2. Questions regarding the article
a) Which variable would you say is Angrist and Evans’ endogenous variable (X)?
Which is their instrumental variable (Z)?
Their endogenous variable is fertility.
Their main instrumental variable is same sex, but they also use twins as a second instrument.
Both instruments are dummies.
b) What is the initial endogeneity problem when estimating the effect of childbearing
on labour supply?
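The initial problem is that fertility is endogenous: the decisions to have children and to work are
made jointly, so fertility is correlated with unobserved determinants of labor supply. An
instrument is therefore needed to isolate exogenous variation in fertility and recover the causal
effect of fertility on labor supply.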
c) Do you think their instrument satisfies the exclusion restriction? Why/why not?
The exclusion restriction requires that Z only affects Y through X. I believe that the instrument
same sex satisfies this restriction but that the instrument twins does not. Having twins affects
women’s ability to work to a much larger extent than having non-twins does.
It’s more costly to put two children into childcare, and not everyone will be able to afford it even
if the woman goes back to work right after birth. This might lead to women not going back to work;
hence twins have a direct causal effect on labor supply.
Another reason that I don’t believe that twins satisfy the exclusion restriction is that having twins
is more time-consuming than having non-twins. There might not be enough time to both work
and take care of the twins (as well as the household), which might affect the ability to work,
either by working fewer hours or by not being able to work at all.
d) What is the causal effect Angrist and Evans have measured here? (Is it the labour
supply response of having a child, or is it something more narrow?)
The causal effect that Angrist and Evans have measured is how having more than two children
affects labor supply for women aged 21-35 who already have 2 or more children.
However, Angrist and Evans do argue that their sample is representative and general, which
would support external validity. If the results are externally valid, the causal effect that they
have measured can be read as how childbearing affects labor supply more generally.
3. Replication
Note: Since the output in Table 5 is for the full sample and we only use the sample
consisting of married women in the 1980 census (pums80.dta), I have included tables with the
results you should be able to replicate, see page 4.
a) Replicate the last three rows of Table 3 and show your results in a table. The rows
are:
(1) one boy, one girl, (2) both same sex and difference (2)-(1).
Do this for your sample (Married women, 1980 PUMS), and only show results for
the column: Fraction that had another child.
Hint: The Fraction that had another child for the rows (1) one boy, one girl, (2) both
same sex, are mean values, with standard errors in parenthesis. Use a regression to get the
mean values and the standard errors (instead of standard deviations).
                (1)          (2)          (3)
                morekids     morekids     morekids
samesex                                   0.068***
                                          (0.002)
_cons           0.346***     0.414***     0.346***
                (0.001)      (0.001)      (0.001)
N               125909       128745       254654
b coefficients; Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
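The hint in a) rests on two facts: regressing a variable on a constant alone returns its mean with the standard error of the mean, and regressing it on a constant plus a 0/1 dummy returns the mean of the dummy = 0 group as the intercept and the difference in group means as the slope. The sketch below checks both on hypothetical toy data (not the PUMS sample); the helper names are illustrative only.

```python
import math

def mean(v):
    return sum(v) / len(v)

def mean_and_se(y):
    """Estimate and standard error from regressing y on a constant only."""
    n = len(y)
    m = mean(y)
    s2 = sum((yi - m) ** 2 for yi in y) / (n - 1)  # residual variance, n - 1 d.o.f.
    return m, math.sqrt(s2 / n)

def ols(y, x):
    """Intercept and slope from a bivariate OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

# Hypothetical toy data: 1 = had another child, 1 = first two children same sex
morekids = [1, 0, 0, 1, 1, 1, 0, 1]
samesex = [0, 0, 0, 0, 1, 1, 1, 1]

mixed = [y for y, s in zip(morekids, samesex) if s == 0]
m0, se0 = mean_and_se(mixed)         # row (1): mean and SE for one boy, one girl
cons, diff = ols(morekids, samesex)  # row (3): intercept and difference (2)-(1)

print(f"mean (mixed) = {m0:.3f} (SE {se0:.3f}); _cons = {cons:.3f}, samesex = {diff:.3f}")
```

The same pattern is visible in the table above: the intercept in column (3) equals the mean in column (1), and 0.346 + 0.068 = 0.414 matches column (2).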
b) Replicate rows 1-4 (More than 2 children,..., Weeks worked) in Column 1 (Mean
difference by Same sex) of Table 5. Show your results in a table.
Hint: The independent variable in these regressions is ”samesex”. Again, standard errors
are in parentheses.
                (1)           (2)           (3)           (4)
                morekids      kidcount      workedm       weeksm1
samesex         0.0675***     0.0825***     -0.0093***    -0.4263***
                (0.0019)      (0.0030)      (0.0020)      (0.0867)
_cons           0.3464***     2.4661***     0.5329***     19.2339***
                (0.0013)      (0.0021)      (0.0014)      (0.0618)
N               254654        254654        254654        254654
b coefficients; Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
c) Replicate rows 3 and 4 for column 2 (Wald estimate using as covariate: More than 2
children) in Table 5. Use the following method:
On page 458, equation 2 shows the IV estimate of β. Calculate ȳ1, ȳ0, x̄1, x̄0 and
use these to get βIV in the same way as equation 2. What value do you get for βIV
when y = Worked for pay? And when y = Weeks worked? Show your result in a table
(you don’t need to calculate standard errors).
Hint: First generate y1i (which is yi when zi = 1), then generate y1 (the mean). Proceed with
y0i (which is yi when zi = 0) etc. Compare with your results in b), maybe they are useful?
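With a binary instrument, the differences ȳ1 − ȳ0 and x̄1 − x̄0 are the same numbers as the samesex coefficients from the regressions in b), so the Wald estimate can be sketched as a ratio of those coefficients. The snippet below plugs in the rounded values reported in b). The function name `wald` is only illustrative, and because the inputs are rounded to four decimals the result can differ slightly in the last digit from an estimate computed on the full data.

```python
def wald(reduced_form, first_stage):
    """Wald estimate: (ybar1 - ybar0) / (xbar1 - xbar0)."""
    return reduced_form / first_stage

# samesex coefficients reported in b)
first_stage = 0.0675           # morekids on samesex: xbar1 - xbar0
reduced_form = {
    "workedm": -0.0093,        # worked for pay on samesex: ybar1 - ybar0
    "weeksm1": -0.4263,        # weeks worked on samesex: ybar1 - ybar0
}

beta_iv = {name: wald(rf, first_stage) for name, rf in reduced_form.items()}
for name, b in beta_iv.items():
    print(f"{name}: {b:.3f}")
```

This gives roughly -0.14 for Worked for pay and roughly -6.3 for Weeks worked.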
d) Replicate the same results as in c), using the following method:
Run a first stage regression (using ”morekids” and ”samesex”), predict the values
from the first stage regressions, and then run IV regressions using these predicted
values, for each of the two dependent variables. Show your results in a table
(including standard errors!).
                    (1)          (2)
                    workedm      weeksm1
morekids_samesex    -0.14***     -6.31***
                    (0.029)      (1.284)
_cons               0.58***      21.42***
                    (0.011)      (0.491)
N                   254654       254654
b coefficients; Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
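The two-step procedure in d) can be sketched on simulated data to show that, with a single binary instrument, the slope on the predicted values coincides with the Wald ratio from c). Everything in the data-generating process below (coefficients, threshold, seed, sample size) is an assumption made for illustration and is unrelated to the PUMS data.

```python
import random

def mean(v):
    return sum(v) / len(v)

def ols(y, x):
    """Intercept and slope from a bivariate OLS regression of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

random.seed(1)
n = 10000
z = [random.randint(0, 1) for _ in range(n)]  # binary instrument
u = [random.gauss(0, 1) for _ in range(n)]    # unobserved error term
# Binary endogenous regressor: more likely to be 1 when z == 1 or u is high
x = [1 if 0.3 * zi + 0.5 * ui + random.gauss(0, 1) > 0.5 else 0
     for zi, ui in zip(z, u)]
y = [20.0 - 6.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

# Step 1: first stage, then predicted values of x
g0, g1 = ols(x, z)
x_hat = [g0 + g1 * zi for zi in z]

# Step 2: regress y on the predicted values
_, beta_2sls = ols(y, x_hat)

# Wald ratio from group means, for comparison
def group_mean(values, group):
    return mean([vi for vi, zi in zip(values, z) if zi == group])

beta_wald = ((group_mean(y, 1) - group_mean(y, 0))
             / (group_mean(x, 1) - group_mean(x, 0)))

print(f"manual 2SLS: {beta_2sls:.4f}  Wald: {beta_wald:.4f}")
```

The point estimates agree, but the standard errors from a manual second stage are computed from residuals based on the predicted values rather than the actual X, so they are generally not the correct 2SLS standard errors. That may be why the standard errors in d) differ slightly from those reported by a dedicated IV command.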
e) Replicate the first 2 rows (Worked for pay and Weeks worked) for Column 5 in Table
7. That is, run 2SLS regressions with samesex as an instrument for More than 2
children, with Worked for pay and Weeks worked as dependent variables. Do this with
and without control variables, and show your results in a table. Compare with your
results in d), are they similar?
Hint: Use the Stata command ivregress. See the table footnote for information about which
control variables to include.
With controls
                (1)          (2)
                workedm      weeksm1
morekids        -0.12***     -5.46***
                (0.028)      (1.212)
_cons           0.45***      8.21***
                (0.014)      (0.586)
N               254654       254654
b coefficients; Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
Without controls
                (1)          (2)
                workedm      weeksm1
morekids        -0.14***     -6.31***
                (0.029)      (1.275)
_cons           0.58***      21.42***
                (0.011)      (0.487)
N               254654       254654
b coefficients; Standard errors in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001