PS 9/10 - Suffolk University

Department of Economics, SUFFOLK UNIVERSITY
Jonathan Haughton
ECONOMICS 724: COST-BENEFIT ANALYSIS & IMPACT EVALUATION
ASSIGNMENT 9 and 10: Impact Evaluation for Real
These exercises are quite difficult and will take some time, which is why they count as two assignments.
Answers to question 1 are due by April 7, 2015, and answers to the remaining questions by April 21. All
the datasets are available via http://web.cas.suffolk.edu/faculty/jhaughton/ by following the link near the
Handbook on Poverty and Inequality picture.
1. Bangladesh: Evaluating the Impact of Agricultural Extension
Download hh98big7bs.dta. This file has data from Bangladesh that should be adequately labeled. There is a
variable called agextend that indicates whether a household was chosen to participate in a program of
agricultural extension that provides advice and support. [NB: The rest of the dataset is real; this variable is
invented.] We now want to ask a basic question: what was the impact of the agricultural extension program?
a. First, let us look at the raw numbers.
i. Load hh98big7bs.dta, sort by the variable nh, and save.
ii. Now load consume98v7s.dta (or equivalent), sort by nh, and merge nh using hh98big7bs.
iii. Check that the merge worked correctly by looking at the _merge variable.
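Steps i to iii might be sketched as follows. This is only a sketch: it assumes both files sit in the current working directory and that nh identifies each household exactly once in each file, which is what a 1:1 merge requires. In Stata 11 or later the merge is usually written merge 1:1 nh using ...; the older form merge nh using ... shown above does a similar job on sorted files.

```stata
* Step i: sort the household file by the household identifier and save it
use hh98big7bs, clear
sort nh
save hh98big7bs, replace

* Step ii: load the consumption file and merge on nh
use consume98v7s, clear
sort nh
merge 1:1 nh using hh98big7bs

* Step iii: _merge==3 marks households found in both files
tab _merge
```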
b. Now compare per capita income and consumption levels for households that did, and did not, get agricultural
extension help.
i. Hint 1. First create measures of total income per capita, and total consumption per capita.
ii. Hint 2. Sort by agextend and then use the syntax by agextend: sum hh* or equivalent.
Specifically, are households that got agricultural extension poorer? Richer? Larger? Are they more
reliant on farm income?
Assume for now that agricultural extension was provided perfectly randomly.
iii. Compute the unconditional single difference in per capita expenditure and test whether it is
statistically significantly different from zero using a t-test.
iv. An alternative approach is to regress per capita expenditure on the binary treatment variable. Try
this, and determine whether the treatment effect is statistically significant. Compare the test
statistic here with that found in b(iii).
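Parts iii and iv could be run along these lines. The variable pcexp is a placeholder name for the per capita expenditure measure built in Hint 1; it is not in the dataset as supplied.

```stata
* pcexp: per capita expenditure (placeholder name; construct it first, as in Hint 1)

* b.iii: unconditional single difference, equal-variance two-sample t-test
ttest pcexp, by(agextend)

* b.iv: the same comparison written as a regression on the treatment dummy
regress pcexp agextend
```

Note that ttest reports the mean of the agextend==0 group minus the mean of the agextend==1 group, so the sign of its difference is the reverse of the regression coefficient on agextend.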
c. Next, let us assume that agricultural extension was provided randomly, once other variables are held constant
(i.e. “partial randomization”), and then ask what effect the program had.
i. Create dummy variables for each district (“thana”). The tab thana, gen(than) command
will do this nicely.
ii. The common impact model: Run a regression of per capita income (or consumption or farm income)
on agextend, household-level variables (such as gender, age, education, family size, etc.), and district
dummy variables. The coefficient on the agextend variable measures the impact of the program.
You will probably want to run a few regressions, one for each output variable (such as income per
capita) that is of interest.
iii. Are the effects measured in c.ii larger or smaller than in b?
iv. Explain why the effects are larger or smaller than in b. For this you might want to check the
correlations among the right hand variables.
v. Do as in c.ii, but this time do not include the district dummy variables.
vi. Do as in c.v, but this time include, on the right hand side, interactive terms with the treatment
dummy variable. This is a more general specification than the dummy variable model and is more
difficult to program. Comment on the size of the impact in this case relative to that in c.v.
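A sketch of c.i, c.ii, c.v and c.vi follows. Only agextend, thana, sexhead and educhead are names that appear in this assignment; pcexp, agehead and famsize are placeholders for whatever outcome and household variables you choose. The interaction terms are built on demeaned covariates. That is an assumption of this sketch rather than a requirement of the question, made so that the coefficient on agextend can be read as the impact for a household with average characteristics.

```stata
* c.i: one dummy per thana (than1, than2, ...)
tab thana, gen(than)

* c.ii: common impact model with district dummies
* (Stata drops one than* dummy automatically to avoid collinearity with the constant)
regress pcexp agextend sexhead agehead educhead famsize than*

* c.v: the same model without the district dummies
regress pcexp agextend sexhead agehead educhead famsize

* c.vi: interact the treatment dummy with each demeaned covariate
foreach v of varlist sexhead agehead educhead famsize {
    quietly summarize `v'
    generate ax_`v' = agextend * (`v' - r(mean))
}
regress pcexp agextend sexhead agehead educhead famsize ax_*
```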
Ec 724 (Cost-Benefit Analysis and Impact Evaluation), Assignments 9 and 10
Page 1 of 3
2. Using Panel Data for Impact Evaluation
a. Open consume98.dta, keep nh and hhexpfd, rename this latter by appending 98, sort by nh,
and save under a new name such as rconsume98.dta.
b. Open consume91.dta, keep the same variables, sort by nh, merge with rconsume98, check
that the merge has worked (using tab _merge), drop the _merge variable, sort by nh, and save
as rconsume9198.dta.
c. If you have not already done so, open hh98big7bs.dta and rename each variable (except nh) by
suffixing 98. Example:
rename vill vill98.
d. This file has information on income. Sort using nh and save under a new name such as
revhh98.dta.
e. Now open hh91.dta, sort by nh, and merge using revhh98.dta. As usual, check that the two
files have merged, by examining _merge, and then delete this variable.
f. Sort by nh and merge using rconsume9198.dta. Save this file, which is the file with which you
will now work.
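The file-building steps a to f might be scripted roughly as below. The sketch assumes that all files are in the working directory and that nh is unique within each file, so that 1:1 merges are appropriate. The final file name panel9198 is made up for illustration, since the assignment does not name that file.

```stata
* a. 1998 food expenditure
use consume98, clear
keep nh hhexpfd
rename hhexpfd hhexpfd98
sort nh
save rconsume98, replace

* b. 1991 food expenditure, merged with 1998
use consume91, clear
keep nh hhexpfd
sort nh
merge 1:1 nh using rconsume98
tab _merge
drop _merge
sort nh
save rconsume9198, replace

* c. suffix every 1998 household variable except nh with 98
use hh98big7bs, clear
ds nh, not
foreach v in `r(varlist)' {
    rename `v' `v'98
}

* d. sort and save the renamed file
sort nh
save revhh98, replace

* e. merge the 1991 household file with the renamed 1998 file
use hh91, clear
sort nh
merge 1:1 nh using revhh98
tab _merge
drop _merge

* f. add the food expenditure panel and save the working file
sort nh
merge 1:1 nh using rconsume9198
tab _merge
drop _merge
save panel9198, replace   // panel9198 is a made-up name
```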
Note that prices in 1998 were 47% higher than in 1991, so before food expenditures can be compared, they
must be adjusted for the price difference. We will do this below.
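The price adjustment just described, together with the double difference asked for in parts g to j, might be sketched as follows, working in the merged file from step f. The sketch assumes hhexpfd is total household food expenditure. The names famsize and famsize98 are placeholders for the 1991 and 1998 household-size variables, and the controls in the last line are placeholders too; agextend98 is agextend after the renaming in step c.

```stata
* g. 1991 food expenditure per capita, in 1998 prices (prices rose 47%)
generate pce91in98 = 1.47 * hhexpfd / famsize

* h. 1998 food expenditure per capita
generate pce98 = hhexpfd98 / famsize98

* i. unconditional double difference: compare the mean change across groups
generate dpce = pce98 - pce91in98
ttest dpce, by(agextend98)

* j. regression version with controls (placeholder names)
regress dpce agextend98 sexhead98 agehead98 educhead98
```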
g. Construct a measure of household food expenditure per capita for 1991 and multiply it by 1.47 to get
the equivalent in 1998 prices. Call it pce91in98.
h. Construct a measure of household food expenditure per capita for 1998. Call it pce98.
i. Use an unconditional double difference estimator to determine whether agricultural extension had
an impact on food expenditure per capita.
j. Use a regression of the change in (real) food expenditure per capita to test whether agricultural
extension (which we assume was introduced in 1995) had an impact on per capita food expenditure.
Include appropriate control variables in the equation.
k. Justify your choice of variables in j. and write a 100-word commentary on the results.
3. Propensity Score Matching
For this exercise, I assume that you already have a satisfactory copy of rconsume9198.dta. If not, please
follow steps a-f of question 2 above (or go to the web site!).
Open Stata and read in rconsume9198.dta. If you have not already created a measure of household food
expenditure per capita for 1998, do that now and call it pce98.
Now let us run a propensity score analysis. The idea is first to create a “propensity score” that measures the
probability that a household will get agricultural extension; and then to use this score to match each “treated”
household (i.e. household that gets agricultural extension) with an untreated household that is otherwise very
similar (i.e. has a very similar propensity score). Here is how it might work:
i. From within Stata, use the search command to find “pscore” and “attnd” and download the relevant
*.ado files. This is mainly an issue of following the instructions.
ii. Estimate the propensity score equation. This will look something like this:
pscore agextend sexhead … [other variables, including district
dummies] … [iw=weight], pscore(fhat1) comsup
iii. Now make the comparison, using nearest-neighbor matching, using
attnd xxx agextend [iw=weight], pscore(fhat1) comsup
where xxx refers to outcome variables (e.g. per capita income, per capita expenditure, per capita farm
income, etc.) that are of interest. Do the exercise for at least three different outcome variables.
iv. Interpret the results: how big is the effect of agricultural extension on outcomes? Is the balancing
property met? How large are the results compared to those found by double differencing? By single
differencing?
v. Repeat iii but now use Gaussian matching.
vi. Repeat iii, but use nearest neighbor matching with 5 matches.
4. Direct covariate matching [Optional (but Ph.D. students have no excuse for avoiding it!)]
Repeat the exercise in question 3 but use direct matching instead of propensity score matching. To get started,
from within Stata use the search command to find nnmatch, download the relevant files, read the help screen,
and do it!
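The matching variants in steps v and vi of question 3, and the direct matching in question 4, might be run as below. The sketch assumes the propensity score fhat1 from step ii already exists, that pce98 is the outcome of interest, and that the user-written packages are installed. The commands attk and psmatch2 are not mentioned in this assignment; they are swapped in here as one possible way to get kernel matching and 5-neighbor matching. The names agehead and famsize are placeholder covariates.

```stata
* Question 3, v: kernel matching; attk (from the pscore package) uses a
* Gaussian kernel unless the epan option is given
attk pce98 agextend, pscore(fhat1) comsup bootstrap reps(100)

* Question 3, vi: nearest-neighbor matching with 5 matches, here via psmatch2
* (install with: ssc install psmatch2)
psmatch2 agextend, pscore(fhat1) outcome(pce98) neighbor(5) common

* Question 4: direct covariate matching with nnmatch, effect on the treated
nnmatch pce98 agextend sexhead agehead educhead famsize, tc(att) m(1)
```

Bootstrapping in attk is requested here because that is how the command produces standard errors; the number of replications is an arbitrary choice.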
5. Instrumental variables
For this exercise, I assume that you already have a satisfactory copy of rconsume9198.dta. If not, please
follow steps a-f of question 2 above (or go to the web site).
Open Stata and read in rconsume9198.dta. If you have not already created a measure of household
expenditure per capita for 1998, do that now and call it pcex98.
(i) First try a straightforward regression of the outcome (pcex98) on plausible explanatory variables.
As a first pass try using sex, age, family size, whether household has electricity, and whether
household has a sanitary toilet. How large is the apparent impact of agricultural extension?
(ii) The approach in (i) is only reasonable if there is randomization in the design of the experiment
(perhaps conditional on the variables in question). In the absence of randomization, we might want
to use instrumental variables. Assume, for the sake of this exercise, that educated people are more
likely to get agricultural extension visits, but that education does not directly influence pcex98 [yes,
this is not very plausible, but let’s accept it for now]. Do an instrumental variable regression for this.
Hint: use the ivreg2 command. You will need (agextend=educhead) somewhere, and it is
a good idea to use ,first in order to generate the first-stage regression. [Optional: You might
consider using the treatreg command here.]
(iii) Redo (ii) but account for weights and clustering. Use [aw= … ] for the weights, and for the
clustering append gmm cluster(vill) to the command.
(iv) Are your results in (ii) and (iii) plausible? Why?
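Parts (i) to (iii) might look like the following. The names educhead, sexhead, vill and weight are used elsewhere in this assignment; agehead, famsize, electric and toilet are placeholders for the age, family size, electricity and sanitary toilet variables. ivreg2 is user-written and can be installed with ssc install ivreg2.

```stata
* (i) OLS with agextend treated as exogenous
regress pcex98 agextend sexhead agehead famsize electric toilet

* (ii) IV: agextend instrumented by education of the household head
ivreg2 pcex98 sexhead agehead famsize electric toilet (agextend = educhead), first

* (iii) add weights and village-level clustering, as the question suggests
* (recent ivreg2 releases may call the two-step GMM option gmm2s; see help ivreg2)
ivreg2 pcex98 sexhead agehead famsize electric toilet (agextend = educhead) [aw=weight], first gmm cluster(vill)
```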
6. Pulling it all together
We have used the Bangladesh example consistently throughout these assignments. Now it is time to pull all the
results together. Complete the following table (and keep your Stata code, either in a *.log or *.do file):
                                                         | Impact of agricultural extension on per capita expenditure
Method                                                   | Difference | p-value of difference
1. Show the mean levels of per capita expenditure for    |            |
   - households receiving agricultural extension         |            |
   - households not receiving agricultural extension     |            |
2. Assume randomization: simple difference (D)           |            |
3. Assume randomization: common impact model             |            |
4. Assume randomization: more general regression model   |            |
5. Covariate matching model                              |            |
6. Propensity score matching model                       |            |
7. Instrumental variable model                           |            |
8. Instrumental variable model accounting for clustering |            |
9. Simple double difference (DD)                         |            |
10. Double difference: regression model                  |            |
11. Reflexive comparison                                 |            |
Which results do you prefer? Why?