
econ 398 001 midterm 2019W1

ECON 398 - 001
Jonathan L. Graves
Midterm
16 October 2019
Name: ____________________
Student Number: ____________________
TA Name and Section: ____________________
Instructions:
• Answer the questions in the answer books provided. Clearly label which questions you
are answering and where. Answer Question 8 online, by the deadline.
• If you make a mistake, cross it out and continue; crossed-out work will not be graded.
• If a question has multiple parts, answer all of them, being sure to label each part.
• A standard, non-graphing calculator is permitted: all other electronics, phones, dictionaries, etc. are not permitted. This policy does not apply to Question 8.
• This exam is individual: communication with other students or individuals during the
exam is forbidden. This policy does apply to Question 8.
• This exam is closed book: no notes, textbooks, or other outside resources are allowed
except for a single A4 sheet of paper, upon which you can write whatever you wish.
This policy does not apply to Question 8.
• This exam is governed by UBC’s policies on academic integrity and misconduct, as
outlined in the course syllabus. Your signature above indicates you agree to abide by
them, and these rules, during this exam. You also agree to obey any other instruction
given to you by exam invigilators during the course of the exam.
Time Permitted: 1 hour 20 minutes
Structure: This exam has 8 questions, for a total of 125 points and 0 bonus points.
Question:   1    2    3    4    5    6    7    8    Total
Points:     10   10   10   15   20   15   20   25   125
Score:
1. (10 points) Assess the correctness of the following statement, being careful to point out
what is correct and incorrect, with reference to the ideas and terms discussed in class:
A recent study from the American Diabetes Health Study Group, based
on more than 1.6 million adolescents, has concluded that after controlling for
major demographic and health characteristics, US public school children who
receive lunches made at home are less likely to be overweight than those who
purchase lunch at school or rely on school meal programs. The large sample
size and fact that the study controls for important characteristics provides
compelling evidence that US schools are failing to provide nutritional lunch
options for school children.
Solution: The key element which is incorrect is the statement that the large sample
size and controls provide strong(er) evidence in favour of the conclusion.
• The key issue is selection bias: if there is a systematic difference between
children who have homemade lunches and those who do not, and that difference is
correlated with the health outcomes they would have had without a homemade lunch
(e.g. poverty), this conclusion would be dubious.
• The size of the sample makes no difference to this.
• Controlling does make it more credible, but can only correct for some of the
selection bias, not all of it (e.g. B3).
2. (10 points) Suppose I have two treatments, represented by the dummy variables $D_{1,i}$
and $D_{2,i}$, and a potential outcome variable which depends on both: $Y_i(D_{1,i}, D_{2,i})$. If the
“treatment” being considered is both treatment 1 and 2 being applied, how would you
express the average treatment effect on the treated (ATET) in terms of a conditional
expectation using the variables provided above? Define any new terms you introduce,
and explain your reasoning.
Solution: In general,
$ATET = E[Y_i(1) - Y_i(0) \mid D_i = 1]$
Here $D_i$ is a new dummy for receiving both treatments, so $D_i = 1 \iff D_{1,i} = D_{2,i} = 1$, and:
$ATET = E[Y_i(1, 1) - Y_i(0, 0) \mid D_{1,i} = 1, D_{2,i} = 1]$
3. (10 points) In the potential outcomes framework, we consider average treatment effects,
like $E[\Delta_i]$. Why do we consider the average effect? Give at least two reasons, relating
to the ideas we discussed in class.
Solution: There are many possible reasons:
• Because statistical tools (LLN, CLT, regression) are easier to apply to average effects than to individual effects
• In order to find ways of overcoming the FPCI (fundamental problem of causal inference) using similar groups, which is
easier than finding similar people
• To control for or discuss heterogeneity
Other solutions which relate to the ideas in class are also acceptable.
4. Suppose we have two variables, which are related using the conditional expectation
function $m(x)$ (i.e. so that $Y_i = m(X_i) + \varepsilon_i$ where $E[\varepsilon_i \mid X_i] = 0$) and in the population
$X_i$ takes on values between −3 and 3 (inclusive) (i.e. $-3 \le X_i \le 3$). Suppose that, in
the population, the true form of $m(x) = -(x - \beta)(x + \beta)$ where the true value of $\beta = 3$.
(a) (2 points) Sketch the CEF for all valid values of Xi . What kind of function is it?
(b) (3 points) What population regression equation should you use to estimate the
relationship between Yi and Xi ? Explain your answer with reference to the ideas
developed in class.
(c) (5 points) In the equation you chose above, what are the values of the parameters
in your population regression equation? How do you know? Explain.
(d) (5 points) Suppose you (erroneously) assumed the form of the CEF was the population regression equation β0 + β1 x. What is the problem with this assumption,
and what are the consequences? Explain, with reference to your answers above and
the ideas developed in class.
Total for Question 4: 15
Solution: (a) Not included; it’s an upside-down parabola (a quadratic function) with roots at −3 and 3.
(b) We know that $Y_i = m(X_i) + \varepsilon_i = -X_i^2 + 9 + \varepsilon_i$, which is linear in the parameters. So,
we could estimate this using a population regression of the form:
$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$
(c) Matching parameters gives us $\beta_0 = 9$, $\beta_1 = 0$ and $\beta_2 = -1$, since $-(x - 3)(x + 3) = 9 - x^2$.
(d) The problem with this assumption is that the true CEF displays a different
(quadratic) relationship, so the associated population regression equation would be
misspecified. The consequence would be estimating the wrong relationship; however,
it would still be the best linear fit you could make without incorporating higher-order terms.
Note: if in (b) students guess some form of linear function which isn’t quadratic,
they should discuss the best-fit properties and do their best to try and evaluate it.
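As a rough numerical illustration of part (d), the sketch below fits a straight line to the CEF from the question, m(x) = −(x − 3)(x + 3) = 9 − x². The exam does not say how Xi is distributed, so the evenly spaced grid on [−3, 3] is an assumption made purely for illustration, and the helper names (`m`, `linear_fit`) are hypothetical.

```python
# Best linear approximation to the quadratic CEF from Question 4.
# Assumption (not stated in the exam): X is spread evenly over [-3, 3].

def m(x):
    """True CEF from the question with beta = 3: -(x - 3)(x + 3) = 9 - x^2."""
    return -(x - 3.0) * (x + 3.0)

def linear_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    slope = cov_xy / var_x
    return y_bar - slope * x_bar, slope

n_points = 601
xs = [-3.0 + 6.0 * k / (n_points - 1) for k in range(n_points)]
ys = [m(x) for x in xs]  # no noise added, so ys is the CEF itself

b0, b1 = linear_fit(xs, ys)

# The line misses the curvature: its errors depend systematically on x.
resid_centre = m(0.0) - (b0 + b1 * 0.0)
resid_edge = m(3.0) - (b0 + b1 * 3.0)

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"error at x = 0: {resid_centre:.3f}, error at x = 3: {resid_edge:.3f}")
```

Under this symmetric assumption the fitted slope is essentially zero, so the linear equation reports no relationship at all even though Yi clearly depends on Xi. It is still the best linear fit, but it under-predicts near the centre and over-predicts near the ends of the range.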
5. Suppose I am using the 2016 Census, and I decide to compare individuals who were
children when they immigrated to Canada, to those who were not. The variable which
represents this attribute, AGEIMM, is depicted in Figure 1.
Suppose I create a dummy variable to make this comparison, CHILDIMM, as:
gen CHILDIMM = 0
replace CHILDIMM = 1 if AGEIMM < 4
In other words, CHILDIMM is zero except for individuals who have AGEIMM < 4.
Suppose you compared the average market income (MRKINC) between people with
CHILDIMM equal to 0 and those with it equal to 1.
(a) (5 points) What is selection bias B1 in this situation? Be specific, and give a
(possible) example.
(b) (5 points) What is selection bias B2 in this situation? Be specific, and give a
(possible) example.
(c) (5 points) What is selection bias B3 in this situation? Be specific, and give a
(possible) example.
(d) (5 points) Which of the above biases is fixable? How? Explain, and be specific.
Total for Question 5: 20
Solution: (a) B1: bias due to people who appear in only one of the two groups. For
example, citizens by birth are only in group 0, never group 1; if being a citizen by
birth has an impact on income, this creates a bias.
(b) B2: bias due to characteristics which appear in different proportions in the two
groups. For example, citizens are likely to be older, on average, than immigrants, so
the groups have different age distributions; age affects wages, which creates a bias
by the same reasoning.
(c) B3: unobservable selection. Perhaps people who come to Canada young have a
stronger work ethic, due to the expectations of their parents, than those who immigrate
older, which affects wages.
(d) You can fix those in (a) and (b), but not (c). Students should explain how to fix
them (balancing, matching, restricting the sample, etc.).
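One way to picture the fixes mentioned in (d) is with a toy example. The records below are invented purely for illustration: the field names (`immigrant`, `child_imm`, `age`, `income`) and all values are hypothetical and are not taken from the Census. The sketch restricts the sample to address B1, then compares within age groups to address B2.

```python
# Toy records only: every field name and number below is hypothetical.
people = [
    # immigrated as children (CHILDIMM = 1)
    {"immigrant": True, "child_imm": 1, "age": "young", "income": 40},
    {"immigrant": True, "child_imm": 1, "age": "young", "income": 44},
    {"immigrant": True, "child_imm": 1, "age": "old", "income": 58},
    # immigrated later (CHILDIMM = 0)
    {"immigrant": True, "child_imm": 0, "age": "young", "income": 38},
    {"immigrant": True, "child_imm": 0, "age": "old", "income": 54},
    {"immigrant": True, "child_imm": 0, "age": "old", "income": 56},
    # citizens by birth also end up with CHILDIMM = 0
    {"immigrant": False, "child_imm": 0, "age": "old", "income": 70},
    {"immigrant": False, "child_imm": 0, "age": "old", "income": 72},
]

def mean_income(rows):
    return sum(r["income"] for r in rows) / len(rows)

def gap(rows):
    """Mean income of CHILDIMM = 1 minus mean income of CHILDIMM = 0."""
    ones = [r for r in rows if r["child_imm"] == 1]
    zeros = [r for r in rows if r["child_imm"] == 0]
    return mean_income(ones) - mean_income(zeros)

# Naive comparison over everyone in the data.
naive_gap = gap(people)

# B1: restrict the sample to people who could be in either group.
immigrants = [r for r in people if r["immigrant"]]
restricted_gap = gap(immigrants)

# B2: compare within age groups, weighting by the age mix of CHILDIMM = 1.
ones = [r for r in immigrants if r["child_imm"] == 1]
balanced_gap = 0.0
for age in ("young", "old"):
    weight = sum(r["age"] == age for r in ones) / len(ones)
    balanced_gap += weight * gap([r for r in immigrants if r["age"] == age])

print(naive_gap, restricted_gap, balanced_gap)
```

In this made-up data the gap shrinks once citizens by birth are dropped and changes sign once age is balanced. Nothing here addresses B3, since an unobserved trait cannot be restricted on or balanced.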
Figure 1: AGEIMM Variable Coding
6. For each of the following selection mechanisms, explain whether or not they meet the
conditions A1 (independence) and A2 (unconfoundedness), if the goal is to sort a large
number of individuals into treatment and control groups for a medical trial.
(a) (3 points) I randomly generate identification numbers for all the subjects, then take
those with odd numbers and place them into the treatment group. For people with
even numbers, I flip a coin and assign them to the treatment if it’s heads - otherwise,
they are placed in the control group.
(b) (3 points) I give everyone who was born January-July the treatment, and everyone
who was born August-December the control.
(c) (3 points) I give everyone a random number between 1 and 10. I then draw one of
the numbers; the first person who raises their hand with that number, I place in
the treatment group. I repeat this process until half the people are sorted into the
treatment group.
(d) (3 points) I give everyone a random number between 1 and 10. I place everyone
with 1 or 2 in the treatment group, and everyone else in the control group.
(e) (3 points) I give the first person to apply for the study the number 1, the second 2,
etc. I then draw a number between 1 and 10; if a person’s number is drawn, they
are placed in the treatment group. Otherwise, they’re placed in the control group.
Total for Question 6: 15
Solution: (a) This meets both A1 and A2; the original numbers were random, so
the fact that one sub-group of random people was more likely to be assigned to
treatment versus control is not correlated with their potential outcomes.
(b) Neither A1 nor A2; the month in which people are born has a long-lasting
and powerful impact on their development and education and thus their potential
outcomes.
(c) Neither A1 nor A2; while the numbers are randomly assigned, raising your hand
first is not. If speed of hand-raising is correlated with potential outcomes, we have
an issue.
(d) A1 and A2; the numbers are random, and thus independent of the potential
outcomes.
(e) Neither; similar to (c). Speed of application is likely a correlate, and since there
are a “large” number of people in the study, this implicitly assigns treatment more
often to “fast” people.
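The contrast between (a) and (c) can be sketched with a small simulation. Everything in it is an assumption made for illustration: the untreated outcome `y0` is drawn from a standard normal, hand-raising `speed` is assumed to be correlated with `y0`, and mechanism (c) is stylised as "the faster half gets treated". The check is whether the treated and control groups differ in their average untreated outcome.

```python
import random

rng = random.Random(398)  # fixed seed so the sketch is reproducible

def assign_mechanism_a(n):
    """Odd IDs are treated; even IDs are treated on a fair coin flip."""
    ids = list(range(1, n + 1))
    rng.shuffle(ids)  # identification numbers are handed out at random
    return [1 if (i % 2 == 1 or rng.random() < 0.5) else 0 for i in ids]

def mean(values):
    return sum(values) / len(values)

def y0_gap(y0, treated):
    """Difference in mean untreated outcome: treated group minus control group."""
    t = [y for y, d in zip(y0, treated) if d == 1]
    c = [y for y, d in zip(y0, treated) if d == 0]
    return mean(t) - mean(c)

n = 20000
y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]    # hypothetical untreated outcomes
speed = [y + rng.gauss(0.0, 1.0) for y in y0]   # assumed to be correlated with y0

# Mechanism (a): assignment depends only on random numbers and coin flips.
treated_a = assign_mechanism_a(n)
share_a = mean(treated_a)

# Stylised mechanism (c): the faster half of the subjects ends up treated.
cutoff = sorted(speed)[n // 2]
treated_c = [1 if s >= cutoff else 0 for s in speed]

gap_a = y0_gap(y0, treated_a)
gap_c = y0_gap(y0, treated_c)

print(f"(a): share treated = {share_a:.3f}, gap in Y0 = {gap_a:.3f}")
print(f"(c): gap in Y0 = {gap_c:.3f}")
```

Mechanism (a) treats roughly three quarters of subjects rather than half, but the two groups still look alike in `y0`, which is what independence requires. In the stylised version of (c) the groups differ before any treatment is applied, which is exactly the selection problem described in the solution.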
7. Consider the following (condensed) summary of an article published by Marketwatch:
This is the best diet for treating depression (Marketwatch, Oct 12, 2019)¹
Stop feeding into your depression.
Many of us turn to comfort foods when we are feeling down, which are
generally defined as those dishes and snacks that are easy to make (or order
out — thanks, GrubHub and Postmates — or open from a package) that are
filled with nostalgic or sentimental value. (They’re also often loaded with
sugar, salt, fat and/or refined carbs.)
But new research shows that we’re doing comfort food all wrong. In fact,
cutting out processed foods and adding in more fruits, vegetables and fish
doesn’t just make you healthier — it may also make you happier.
A small, randomized trial published in PLOS One this week (just in time
for World Mental Health Day) looked at 76 adults ages 17 to 35, who all scored
“moderate to high” on a scale of depression symptoms used by doctors, and
who also consumed diets that were high in processed foods, saturated fats and
refined carbohydrates.
The subjects were split into two groups. One was encouraged to eat healthier by receiving money for grocery shopping, a small hamper of pantry items,
as well as tips to eating healthier, whole foods. Researchers checked in on
them twice a week for three weeks to see how their diets were going. The control group, on the other hand, didn’t receive any food, money or nutritional
guidance.
¹ By Nicole Lyn Pesce: https://www.marketwatch.com/story/this-is-the-best-diet-for-treating-depression2019-10-10
And at the end of three weeks, those on the diet who ate more fruits,
vegetables and fish — aka a Mediterranean-style diet — saw their moods
significantly improve, and their “moderate to high” depression scores dropped
within a normal range. Those in the control group who had stuck to their less
healthy diets didn’t see change to their moods or scores. Three months later,
the subjects who continued with the healthy eating habits continued to have
elevated moods and more improved life outlooks.
But as study co-author Heather Francis, a nutritional neuroscience researcher from Macquarie University in Sydney, told Live Science, “These findings add to a growing literature to suggest that healthy diet can be recommended as an effective therapy to improve depression symptoms, as an adjunct
to pharmacological and psychological therapy.”
(a) (5 points) Put this model into the potential outcomes framework, being careful to
explicitly explain what the treatment is, the outcome, the treated and untreated
groups, the effect, the population etc.
(b) (5 points) Explain what selection bias is in this context, being specific about the
terms you use.
(c) (5 points) Assess the degree to which this study suffers from selection bias. Relate
your answer to the types of selection bias, and the study.
(d) (5 points) How does the treatment in your framework relate to the author’s
interpretation of the treatment being studied? Does this matter? Explain.
Total for Question 7: 20
Solution: (a) Yi: score on the depression scale; Di: receiving healthy food, money, and
tips; i: people with depression who eat poorly. The effect is an ATE.
(b) Selection bias would be any reason to believe that the people who were given the
healthy treatment would have had a different outcome from the control group had
they instead been given the unhealthy treatment.
(c) They didn’t say how they split them, but it was likely random. This means that
the study is very unlikely to suffer from selection bias (A1 holds); even the worst kind
of selection bias, B3, can be overcome with A1 or A2.
(d) However, the treatment is not obvious here; they think it’s eating healthfully,
but it’s really a bundle of giving people money, giving people food, and giving them
attention. It’s also unclear how you can generalize this beyond the population being
studied. This matters, since although selection bias is not likely to be an issue, the
study’s conclusions are highly specific.
Tear off this page and take it home
Reminder:
• This exam is individual: communication with other students or individuals during
the exam is forbidden. This policy does apply to Question 8.
• This exam is governed by UBC’s policies on academic integrity and misconduct,
as outlined in the course syllabus. Your signature above indicates you agree to
abide by them, and these rules, during this exam. You also agree to obey any other
instruction given to you by exam invigilators during the course of the exam.
8. Consider the question of whether individuals who major in business have a higher average
income than individuals who major in the social sciences, using the scripts we have been
developing in class. Answer the following questions using the 2016 Census Microdata
and STATA (or R, if you wish).
Begin by processing your data using the provided DO file (midterm.do) and following
the instructions. If you’ve been following along in class or lab, this should require very
limited adjustment. If you have any issues getting this to run, do not hesitate to contact
me.
In what follows, do not worry about using weights; you can treat this study as asking
about the sample, rather than the Canadian population.
(a) (5 points) First, what is an appropriate population you should use to answer this
question? Explain briefly, then use the keep or drop commands to select an appropriate sample of individuals in the data. (Hint: Hands-On #3.4 and the OR
condition in STATA is “|”)
(b) (5 points) Next, using the ttest command (Hint: Hands-On #2.4), perform a
naive comparison of the average market income between these two groups.
What is the difference? What is your conclusion, based solely on this naive measure?
Explain.
(c) (5 points) Put your comparison in (b) into the potential outcomes framework, being
careful to explain all the parts, including the treatment, outcome, population, etc.
(d) (5 points) Using the documentation for the census and the generate and replace
commands (Hint: Hands-On #3.1), create a new variable YOUNG based on the AGEGRP
variable, which is a dummy for whether someone is younger than 30 years old.
Using this variable, and the tabulate command (Hint: Hands-On #2.2), determine
what fraction of people with a social science degree are young, and what
fraction of those with a business degree are young.
(e) (5 points) How might this be a problem for your analysis in (b)? Explain, using
the ideas developed in class, then evaluate your suggestion using data, and any of
the commands we have used (there could be many, Hands-On Module 2 might be
useful).
Total for Question 8: 25
Solution: Answers may vary.
End of Exam