Economics 102C Problem Set 3

Due Monday, June 1, at the beginning of class
Question 1
Consider the following panel data regression equation:
$$y_{it} = \beta x_{it} + f_i + u_{it}$$
where $y_{it}$ are log earnings, $x_{it}$ is tenure (number of years spent working for the current firm), $f_i$ is an
individual fixed effect, and $u_{it}$ a disturbance term orthogonal to both $f_i$ and $x_{it}$ (the constant term is
omitted for simplicity). The dimension of the panel is defined by $i = 1, 2, \ldots, N$ and $t = 1, 2$. Note that
there are only two years of data. It is assumed that $f_i$ and $x_{it}$ are correlated: $E(f_i \mid x_{it}) \neq 0$.
A. (5 points) Suppose you want to estimate this model in first differences using OLS:
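$$y_{it} - y_{it-1} = \beta (x_{it} - x_{it-1}) + (u_{it} - u_{it-1})$$
Assume:
$$\operatorname*{plim}_{N \to \infty} \frac{\sum_{i=1}^{N} (x_{it} - x_{it-1})^2}{N} = \Sigma_x$$
$$\operatorname*{plim}_{N \to \infty} \frac{\sum_{i=1}^{N} (x_{it} - x_{it-1})(u_{it} - u_{it-1})}{N} = 0$$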
Prove that the OLS estimator of β is consistent.
B. (10 points) Assume now that tenure is measured with error, i.e. instead of observing the true tenure
$x_{it}$ we observe:
$$z_{it} = x_{it} + \varepsilon_{it}$$
where the measurement error $\varepsilon_{it}$ is classical, orthogonal to $x_{it}$, and also to $f_i$ and $u_{it}$. Suppose you
estimate the model in first differences by OLS, ignoring the measurement error problem. Characterize the
asymptotic bias of the OLS estimator of β. Assume:
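$$\operatorname*{plim}_{N \to \infty} \frac{\sum_{i=1}^{N} (z_{it} - z_{it-1})^2}{N} = \Sigma_x + \Sigma_\varepsilon$$
$$\operatorname*{plim}_{N \to \infty} \frac{\sum_{i=1}^{N} (z_{it} - z_{it-1})(u_{it} - u_{it-1})}{N} = 0$$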
C. (10 points) Suppose that the measurement problem arises only for those who switch firm, i.e.
$$z_{it} - z_{it-1} = x_{it} - x_{it-1}$$
for those who work for the same firm at $t$ and $t - 1$, while:
$$z_{it} - z_{it-1} = x_{it} - x_{it-1} + \varepsilon_{it} - \varepsilon_{it-1}$$
for those who switch firm at $t$. In other words:
$$z_{it} - z_{it-1} = (x_{it} - x_{it-1})(1 - S_{it}) + (x_{it} - x_{it-1} + \varepsilon_{it} - \varepsilon_{it-1}) S_{it}$$
where $S_{it}$ is an indicator that equals 1 if the person switches firm at $t$ and 0 otherwise. Assume
$$\operatorname*{plim}_{N \to \infty} \frac{\sum_{i=1}^{N} S_{it}}{N} = \pi_s < 1$$
Show that the asymptotic bias of the OLS estimator of β in the model in first differences is now
smaller than in the previous case B.
D. (5 points) Discuss the problems you are likely to face if, in the case considered in C., you want to
estimate the return to tenure by using only the observations on those who do not switch firm at t.
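A small simulation can serve as a numerical sanity check on whatever you derive in parts A to C. The sketch below is only an illustration under assumed values: the function name, the distributions, the variances, the value of β, and the switching share are all hypothetical and do not come from the problem set. It also assumes that the true tenure change is drawn independently of the switching indicator, which need not hold in real data. It uses only the Python standard library.

```python
import random


def simulate_first_differences(n=50_000, beta=0.05, sd_eps=0.0,
                               switch_share=1.0, seed=0):
    """Return (levels_ols, fd_ols) slope estimates from one simulated panel.

    All numeric choices here are illustrative assumptions.
    sd_eps = 0 corresponds to part A (no measurement error),
    sd_eps > 0 with switch_share = 1 to part B, and
    sd_eps > 0 with switch_share < 1 to part C.
    """
    rng = random.Random(seed)
    lv_xy = lv_xx = fd_xy = fd_xx = 0.0
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)                    # individual fixed effect
        x1 = 5.0 + 2.0 * f + rng.gauss(0.0, 1.0)   # true tenure, correlated with f
        x2 = x1 + rng.gauss(1.0, 1.0)              # true tenure one period later
        y1 = beta * x1 + f + rng.gauss(0.0, 0.2)
        y2 = beta * x2 + f + rng.gauss(0.0, 0.2)
        # Measurement error affects observed tenure only for switchers.
        switches = rng.random() < switch_share
        e1 = rng.gauss(0.0, sd_eps) if switches else 0.0
        e2 = rng.gauss(0.0, sd_eps) if switches else 0.0
        z1, z2 = x1 + e1, x2 + e2
        # Pooled OLS in levels without a constant, as in the stated model.
        lv_xy += z1 * y1 + z2 * y2
        lv_xx += z1 * z1 + z2 * z2
        # OLS on first differences, which removes the fixed effect.
        dz, dy = z2 - z1, y2 - y1
        fd_xy += dz * dy
        fd_xx += dz * dz
    return lv_xy / lv_xx, fd_xy / fd_xx


if __name__ == "__main__":
    for label, kwargs in [
        ("A: no measurement error", dict(sd_eps=0.0)),
        ("B: error for everyone", dict(sd_eps=1.0, switch_share=1.0)),
        ("C: error for switchers only", dict(sd_eps=1.0, switch_share=0.3)),
    ]:
        levels, fd = simulate_first_differences(**kwargs)
        print(f"{label}: levels OLS = {levels:.4f}, first-difference OLS = {fd:.4f}")
```

Comparing the first-difference estimates across the three settings against the assumed true β gives a quick check on the sign and relative size of the bias you characterize analytically. It is not a substitute for the proofs the questions ask for.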
Question 2
A researcher posits the following panel data model to explain the earnings of individual workers:
$$y_{it} = \alpha_i + \beta_i e_{it} + \gamma x_{it} + u_{it}$$
where $e_{it}$ is the labor market experience of individual $i$ in year $t$ and $x_{it}$ is a dummy for whether the person belongs to a union. This model is sometimes called the "heterogeneous growth model" of earnings:
individuals have different intercepts and slopes of their earnings profiles, by labor market experience.
The assumptions of the model are:
$$E(\alpha_i \mid x_{it}) \neq 0$$
$$E(\beta_i \mid x_{it}) \neq 0$$
(i.e., $\alpha_i$ and $\beta_i$ are "fixed" effects), and
$$E(u_{it} \mid x_{it}) = 0$$
You can also assume
$$E(\alpha_i \mid u_{it}) = E(\beta_i \mid u_{it}) = 0$$
A. (15 points) Assume that the panel is balanced, individuals are followed for three years, and they
all work in these three years (and hence $e_{it+1} = e_{it} + 1$). Suggest an empirical strategy to estimate γ
without bias.
B. (15 points) The researcher runs an OLS regression of $y_{it}$ on a constant and $x_{it}$ using two different
samples: individuals in their first year in the labor market (when experience is 0, Sample 1), and individuals in their second year in the labor market (when experience is 1, Sample 2). She finds the following
results:

                             Sample 1    Sample 2
Estimate $\hat{\gamma}$        0.10        0.18
                              (0.03)      (0.04)

Can you say anything about the correlation between $\beta_i$ and $x_{it}$? Show your work.
Question 3
This problem will lead you through some of the techniques used in the paper, “Missing Women and the
Price of Tea in China: The Effect of Sex-Specific Earnings on Sex Imbalance”. This paper uses a policy change
regarding agricultural prices and regulations to demonstrate the effect of an increase in female-specific
income on the sex ratio and examine potential explanations. Use the dataset DD_data for parts 1–3. There
is a do-file to get you started with the dataset (run these commands prior to attempting the questions).
Then use the dataset Year_data for part 4.
1. (5 points) Exploring the data
(a) These regressions are not run using individual-level observations. What is the unit of analysis
here?
(b) Produce a table like table 2, presenting summary statistics of the data. Comment on any
interesting features of the table.
2. (10 points) Fixed effects
(a) Why might we think that region has a direct effect on the sex ratio?
(b) Why might we think that birth period (cohort) has a direct effect on the sex ratio?
(c) We can control for these direct effects using a fixed effects regression. Explain (briefly) how
you can use dummy variables to implement a fixed effect strategy.
(d) In light of your above explanation and the goal of the paper, do you think you should use the
areg or xi:reg command? Justify your choice.
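As background for parts (c) and (d): including a full set of group dummies and demeaning every variable within group give the same slope on the regressor of interest. The sketch below checks this numerically on a tiny made-up dataset. The group labels, the data values, and all function names are hypothetical, and the code is written in plain Python rather than Stata, so it illustrates only the algebra, not the behavior of any particular Stata command.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def slope_with_dummies(groups, x, y):
    """Regress y on x plus one dummy per group (no separate constant)."""
    labels = sorted(set(groups))
    rows = [[xi] + [1.0 if g == lab else 0.0 for lab in labels]
            for g, xi in zip(groups, x)]
    return ols(rows, y)[0]


def slope_within(groups, x, y):
    """Regress group-demeaned y on group-demeaned x."""
    def demean(values):
        totals, counts = {}, {}
        for g, v in zip(groups, values):
            totals[g] = totals.get(g, 0.0) + v
            counts[g] = counts.get(g, 0) + 1
        return [v - totals[g] / counts[g] for g, v in zip(groups, values)]
    xd, yd = demean(x), demean(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


# Hypothetical data: three regions with three observations each.
G = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
X = [0.0, 1.0, 2.0, 1.0, 2.0, 4.0, 0.5, 1.5, 3.0]
Y = [1.0, 1.2, 1.1, 2.0, 2.3, 2.9, 0.4, 0.5, 1.1]

if __name__ == "__main__":
    print(slope_with_dummies(G, X, Y), slope_within(G, X, Y))
```

Because the two slopes coincide, the practical difference between absorbing a set of effects and expanding it into explicit dummies lies in which coefficients are reported and how many variables are created, which is what part (d) asks you to weigh.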
3. (15 points) Tea and the sex ratio
(a) The paper begins its empirical analysis with DD estimation, using the specification given in
equation (2). Duplicate this regression in Stata. Hint: use the areg command and absorb the
region effects. State the β and δ coefficients.
(b) Give one line interpretations of each of the β and δ coefficients.
(c) We need to use survey weights to weight the regressions. The relevant weight is birpop. Rerun the regression for equation (2) using weights (hint: you want to use aweights in Stata for
this). Discuss the differences between these results and your results for part (a).
(d) State the β and δ coefficients for the weighted DD estimation. Are these significant?
(e) The author uses two different measures of tea and orchards. First, a continuous measure of
how much tea (or orchard) is grown, and second, a dummy variable for whether tea (orchard)
is grown or not. Redo the DD estimation with the dummy variables. Comment on any differences in your results.
(f) Put the results from all of these regressions into a table. Hint: use either the outreg2 or the
estout (esttab) command in Stata for this.
(g) The author’s next step is to control for cohort-region effects. The above specifications controlled for cohort and region effects, but not for effects that differ across region-cohort combinations.
Draw a graph to illustrate how the interaction of cohort (pre- and post-reform) and region
adds to the flexibility of the specification.
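For part (c), it can help to see why weighting changes the point estimates at all. The sketch below computes the slope from a weighted least-squares fit of y on a constant and one regressor, assuming that analytic weights affect point estimates in the same way as weighted least-squares weights. The function name and the four data points are hypothetical and unrelated to DD_data.

```python
def weighted_slope(x, y, w):
    """Slope from a weighted least-squares fit of y on a constant and x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    num = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    return num / den


# Hypothetical cells: the last cell represents many more births than the others.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 6.0]
equal_weights = [1.0, 1.0, 1.0, 1.0]
cell_sizes = [1.0, 1.0, 1.0, 10.0]

if __name__ == "__main__":
    print("unweighted slope:", weighted_slope(x, y, equal_weights))
    print("weighted slope:  ", weighted_slope(x, y, cell_sizes))
```

Rescaling all weights by a common factor leaves the slope unchanged, so only the relative sizes of the cells matter. Cells with large weights pull the fitted line toward themselves, which is one reason weighted and unweighted results can differ.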
4. (10 points) Year by year regressions. For this part use the dataset Yearly_Data.dta. This dataset is
like DD_data, but already contains the interaction terms you will require for the exercise.
(a) The paper also adopts a more flexible difference in difference specification given in equation
(3). Duplicate this regression in Stata. Do this both for the continuous measures of tea and
orchard as well as for the dummy variables for whether or not tea (orchard) is grown.
(b) Make a table showing only the coefficients on the interactions between tea and year of birth.
(c) Draw a picture like figure V of the resulting coefficients.