Causality and Randomized
Control Trials
Empirical Research
• Three broad types of Empirical papers
– Paper Type I - Descriptive
• CVD mortality over time
• Regional differences in medical care
• How are health insurance premiums changing over
time?
• These papers generally DON’T TRY TO SAY WHY the trend might be changing over time
– Although there is likely to be some speculation
Empirical Research (cont.)
• Paper Type II – Relate variable X to variable Y
– Effect of Price on the quantity of Medical Care
– Effect of race on Income/Health
– Effect of hypertension on risk of CVD
– These papers are making a causal
argument
• The strength of which is up to the reader to
evaluate
Empirical Research (cont.)
• Paper Type III
– Use estimates from the first two types of
papers to make policy recommendations
– For example, some studies find that insurance generosity affects the use of IVF services
• Because of limited opportunities, individuals
maximize the chance of having at least one child
Empirical Research (cont.)
• One unintended consequence of this is multiple births
• Multiple births result in higher costs and lower
infant health
• Using estimates from the IVF papers, someone
else might write a paper about the optimal level of
insurance benefit
Policy Relevance
• We are going to focus on the Second
Paper Type
– All three types of papers influence policy
– But paper type II is generally of most interest
to policy researchers because it provides
magnitudes for the phenomena of interest
• Magnitudes aid policy makers in the decision to
allocate resources
Causation
• What do we mean by causation?
– We are asking a WHAT IF question
– What if, instead of X happening, Z happened? How would that change the outcome of interest?
• Thus one must always state the alternative
• The what if scenario is also called a
COUNTERFACTUAL
Some Notation
• Following Holland (1986)
– Some units U, where U can be a person, city, or school
– Assume for simplicity two treatments T and C
• T-Treatment and C-Control
• Treatment can be a variety of things – Drug,
education, income, textbooks, co-pays
– Y represents the outcome under a given treatment
– So each unit has two potential outcomes: Yt(U) and Yc(U)
Fundamental Problem of
Causation
• CANNOT observe the outcome under both treatment and control for the same unit at the same time
– Unless Temporal Stability AND Causal Transience are assumed
• Temporal Stability (TS) – Effect of T on U is the same now and in the future
• Causal Transience (CT) – Earlier exposure of U to C does not change the later effect of T on U
– Or unit homogeneity is assumed
• Yt(U1)=Yt(U2) and Yc(U1)=Yc(U2)
Fundamental Problem of
Causation (cont.)
• Because Temporal Stability, Causal Transience, and unit homogeneity rarely hold, we can usually only estimate average treatment effects
• Average treatment effect equals
– [E(Yt(U)) – E(Yc(U))]
– This is simply the mean difference of the
outcome across the treatment and control
groups
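A minimal sketch of the idea above, under made-up assumptions: every simulated unit has both potential outcomes, but only one is revealed, depending on a coin flip. The sample size, the outcome distribution, and the constant effect of 2.0 are all hypothetical and are not taken from Holland (1986).

```python
import random
from statistics import mean

def simulate_units(n, true_effect=2.0, seed=0):
    """Build n hypothetical units, each with both potential outcomes (Yc, Yt)."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        yc = rng.gauss(10.0, 3.0)   # outcome under control, Yc(U) (made-up distribution)
        yt = yc + true_effect       # outcome under treatment, Yt(U) (assumed constant effect)
        units.append((yc, yt))
    return units

def difference_in_means(units, seed=1):
    """Reveal only one potential outcome per unit, chosen by a coin flip."""
    rng = random.Random(seed)
    treated, control = [], []
    for yc, yt in units:
        if rng.random() < 0.5:
            treated.append(yt)      # Yc(U) stays unobserved for this unit
        else:
            control.append(yc)      # Yt(U) stays unobserved for this unit
    return mean(treated) - mean(control)

units = simulate_units(20_000)
true_ate = mean(yt - yc for yc, yt in units)   # only computable inside a simulation
estimate = difference_in_means(units)
print(f"true ATE = {true_ate:.2f}, difference in means = {estimate:.2f}")
```

The unit-level difference Yt(U) – Yc(U) can be computed here only because the data are simulated. With real data, the group-level mean difference is what remains estimable.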
Paper Type II-Causality
• Observational Studies – Most are cross-sectional
– Some type of statistical procedure that relates variables X and Y
• Ordinary Least Squares, Logistic Regression
– Propensity Scores
• Quasi Experimental/Natural Experiments
– Regression Discontinuity
– Difference in Difference
– Instrumental Variables
• Randomized Control Trial (RCT) – Gold
Standard
Observational Studies I
• Difficult to show causation purely from
observational data, why?
– An example: Researchers are interested in
whether income is related to health
• Direct effects – Can buy more medical care
• Indirect effects – Able to afford health insurance
– Some researchers believe health insurance affects
health
Observational Studies I (cont.)
• Money can affect level of education
– Education might help you get better information
– Education might help you process information
faster
Observational Studies I (cont.)
• Take data from the cross section (point in time)
• Self-reported health as the dependent variable
and Income as the independent variable
• Also adjust for variables such as education, insurance, geography, age, sex, race, income, family education, etc. and identify an effect
• Can we say this is the true effect of income on
health?
Observational Studies II
• Magnitudes from observational studies are generally biased upwards, especially in cross-sectional studies
– There are some examples where estimates
from observational studies are biased
downward
• These are rare cases in the universe of all
published studies
• Can you think of an association that is biased
downward?
– I.e. An RCT would increase the size of your coefficient
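To make the upward bias concrete, here is a small simulated sketch of the income and health example. It assumes, purely for illustration, that education raises both income and health and that the true effect of income on health is 0.5. All coefficients and variable names are hypothetical. The naive regression of health on income absorbs part of the education effect, and adjusting for education recovers the assumed effect only because education is observed in this simulation.

```python
import random
from statistics import mean

def slope(x, y):
    """OLS slope from regressing y on a single regressor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Part of y that is not linearly explained by x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)
n = 20_000
TRUE_EFFECT = 0.5  # assumed causal effect of income on health (made up)

education = [rng.gauss(0, 1) for _ in range(n)]           # confounder
income = [e + rng.gauss(0, 1) for e in education]         # education raises income
health = [TRUE_EFFECT * inc + 1.0 * e + rng.gauss(0, 1)   # education also raises health
          for inc, e in zip(income, education)]

# Naive cross-sectional estimate: health regressed on income alone.
naive = slope(income, health)

# Adjusted estimate: partial out education from both variables first,
# which gives the same coefficient as including education in the regression.
adjusted = slope(residuals(education, income), residuals(education, health))

print(f"naive = {naive:.2f}, adjusted = {adjusted:.2f}, assumed truth = {TRUE_EFFECT}")
```

If a confounder like this were unobserved, the adjusted estimate would not be available, which is the situation the slides warn about.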
Observational Studies II (cont.)
• In some studies the bias is hard to sign
• For example a researcher is interested in
whether having fire insurance leads to
more fire accidents relative to not having
fire insurance.
• What is the IDEA?
– Fire insurance lowers the cost of having your
place burn down
– Thus individuals have less of an incentive to be
careful, which in turn increases probability of a
fire (also called Ex-Ante Moral Hazard)
Observational Studies II (cont.)
• Look at the correlation between purchasing insurance and having a fire in the next 5 years
• In observational data, individuals for whom fire insurance is more valuable (those more likely to have a fire) will be more likely to buy it. How does this affect the coefficient?
– Not adjusting for this biases the coefficient upwards
• In observational data individuals who are more
“cautious” might also be more likely to buy fire
insurance.
– Cautious people might have fewer fires than risky people
– Not adjusting for this will bias the coefficient downward
Observational Studies II (cont.)
• Conclusion: It is impossible to tell a priori whether the relationship obtained from observational data is above or below the true effect of having fire insurance on having a fire.
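The two opposing biases can be sketched with a small simulation. Everything here is an illustrative assumption: the fire probabilities, the purchase probabilities, and the moral-hazard effect of 0.02 are made up, and `naive_gap` is a hypothetical helper. The point is only that the same true effect produces a naive gap above it under one kind of selection and below it under the other.

```python
import random

TRUE_EFFECT = 0.02  # assumed moral-hazard effect of insurance on fire probability

def naive_gap(selection, n=100_000, seed=0):
    """Insured-minus-uninsured fire rate when purchase depends on a hidden trait.

    selection="risk":    high-risk households are more likely to buy insurance
    selection="caution": cautious (low-risk) households are more likely to buy
    selection="random":  purchase is unrelated to risk (as in an RCT)
    """
    rng = random.Random(seed)
    p_buy_if_high_risk = {"risk": 0.8, "caution": 0.2, "random": 0.5}[selection]
    fires = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        high_risk = rng.random() < 0.5
        base = 0.10 if high_risk else 0.04   # made-up fire probabilities without insurance
        p_buy = p_buy_if_high_risk if high_risk else 1 - p_buy_if_high_risk
        insured = rng.random() < p_buy
        fire = rng.random() < base + (TRUE_EFFECT if insured else 0.0)
        counts[insured] += 1
        fires[insured] += fire
    return fires[True] / counts[True] - fires[False] / counts[False]

for mode in ("risk", "caution", "random"):
    print(f"{mode:8s} naive gap = {naive_gap(mode):+.3f}  (true effect = {TRUE_EFFECT:+.3f})")
```

In real data both kinds of selection can operate at once, so the net direction of the bias is unknown.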
Observational Studies III
• Given the above examples, Observational
studies primarily show associations
– We will talk more about research designs with
observational data that get us closer to
causality
– Why is it important to show that something is
truly causal and not just an association?
Randomized Control Trial
• Randomization is a process used to assign units to either treatment or control
– Randomization guarantees independence between treatment assignment and all the other variables that might affect outcomes of interest
– A simple procedure for randomization – coin flipping
– If randomization is done correctly, the mean difference across treatment and control groups, E(Yt(U)) – E(Yc(U)), is said to be unbiased
– How can we test whether randomization worked?
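One common check is to compare baseline (pre-treatment) characteristics across the two groups: if the coin flips worked, the groups should look alike on average. The sketch below is illustrative only. The age covariate, the sample size, and the helper names are hypothetical.

```python
import random
from statistics import mean, stdev

def assign_by_coin_flip(units, seed=0):
    """Return one True/False treatment flag per unit from independent fair coin flips."""
    rng = random.Random(seed)
    return [rng.random() < 0.5 for _ in units]

def standardized_difference(values, treated_flags):
    """Difference in group means divided by the pooled standard deviation."""
    t = [v for v, f in zip(values, treated_flags) if f]
    c = [v for v, f in zip(values, treated_flags) if not f]
    pooled_sd = ((stdev(t) ** 2 + stdev(c) ** 2) / 2) ** 0.5
    return (mean(t) - mean(c)) / pooled_sd

rng = random.Random(7)
ages = [rng.gauss(45, 12) for _ in range(10_000)]   # hypothetical baseline covariate
flags = assign_by_coin_flip(ages)

share_treated = sum(flags) / len(flags)
balance = standardized_difference(ages, flags)
print(f"share treated = {share_treated:.3f}, standardized age difference = {balance:+.3f}")
```

A standardized difference close to zero on each baseline covariate is consistent with successful randomization. It cannot prove balance on unobserved variables.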
RCT (cont.)
• Without randomization it is very difficult to
guarantee that it is truly the treatment that
is responsible for the outcome
• Most non-experimental procedures are
aimed at finding a control group that is
similar to the treatment group
RCT (cont.)
• If it’s such a good idea, why aren’t there more RCTs?
– Ethical Problems
• Smoking is a good example
– Costs
• RCTs cost a lot of money
• The Rand Health Insurance experiment cost 280 million in 2004 dollars
• This paid for randomizing 7,791 people and following them for approximately 8 years.
RCT (cont.)
• Costs also impact the duration of the experiment – Rand Health Insurance experiment only ran for 8 years
• Attrition can be high
– This is also a problem with non-experimental
designs
– Importantly people who drop out of the
experiment are likely different from people who
stay in the experiment
• Treatment effects could be different for the two
groups
RCT (cont.)
• E.g., suppose the treatment effect is higher for the group that stays in the experiment
• If you used only the people who stayed in the experiment, there would again be an upward bias in the measured treatment effect.
– Even though there is attrition, one strategy is
to estimate the effect as if there was no
attrition.
RCT (cont.)
• Keep everyone in the sample even if some people are no longer taking the treatment
– This is called “intent to treat” analysis
• Intent to treat will dilute the true effect since not all individuals assigned to treatment are taking the drug
• But this preserves the randomization, so the estimate is still a valid estimate of the effect of being assigned to treatment
• In a later lecture we will consider another solution
to the attrition problem
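The dilution and the upward bias can both be seen in a small simulation. The setup is an assumption made for illustration: each unit has its own benefit from the drug (5 on average), units that benefit little stop taking it, and outcomes are still recorded for everyone. None of the numbers come from a real trial.

```python
import random
from statistics import mean

rng = random.Random(3)
n = 40_000
assigned_to_treatment, stayed_on_treatment, control = [], [], []

for _ in range(n):
    baseline = rng.gauss(50, 5)        # outcome without the drug (made-up scale)
    effect = rng.uniform(0, 10)        # unit-specific benefit, 5 on average
    if rng.random() < 0.5:             # coin flip: assigned to treatment
        stays = effect > 5             # assumption: low-benefit units stop taking the drug
        outcome = baseline + (effect if stays else 0.0)
        assigned_to_treatment.append(outcome)
        if stays:
            stayed_on_treatment.append(outcome)
    else:                              # assigned to control
        control.append(baseline)

# Intent to treat: compare groups as randomized, whether or not the drug was taken.
itt = mean(assigned_to_treatment) - mean(control)

# Stayers only: drops the units that stopped, which breaks the randomization.
stayers_only = mean(stayed_on_treatment) - mean(control)

print(f"intent to treat = {itt:.2f}, stayers only = {stayers_only:.2f}, "
      f"average effect if everyone took the drug = 5.00")
```

Under these assumptions the intent-to-treat estimate falls below the average effect of actually taking the drug, while the stayers-only comparison lies above it.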
RCT (cont.)
• Treatment units become controls (the two groups mix)
– Different from attrition
• Difficult to generalize from location to location
– Will experiment in location A reveal the same effect if
done in location B
• Hawthorne Effects
– Observation makes people behave differently
– Thus results might not apply to non-observed setting
RCT (cont.)
• Finally – Some things are not easily manipulated
– How does one randomize Sex?
– How about race?
• Let’s come back to this
In Depth Example-Discrimination
• What is the effect of Sex (Race) on
Income?
– Many studies show differences across the
groups on a variety of outcomes
– For example, some studies report that a woman makes 80 cents for each dollar a man makes
What Does Theory Say?
• Two Theories
– Statistical Discrimination
• Employers have limited resources to get
information about any single individual, but
know something about group averages
• They use information on the group average
to make an inference about a specific
individual
What Does Theory Say (cont.)
• Wide applicability – Physician decision making, product selection, speeding tickets
– This is profiling
– Taste-Based Discrimination
• Employers do not like to employ individuals
from a specific group
What Does Theory Say (cont.)
• The two types of discrimination have very different policy implications
– In a competitive market, firms bear the cost of taste-based discrimination, so it tends to be competed away
– Statistical discrimination will likely never be
competed away
• Why?
• Because using information about the group solves
a problem that the profiler faces
Testing for Discrimination I
• How do we test whether there is discrimination and, if so, what type of discrimination?
– One idea is to simply compare mean wages
across different groups from real world data
– What are the problems with this method?
• Employer observes something that you as a
researcher do not (experience, good looks)
• Cannot separate out two theories with this method
Testing for Discrimination I (cont.)
• Let’s take a step back
– How would one design an experiment to
determine whether there is discrimination?
• In the RCT framework this question amounts to: how does one randomize race?
• Seems impossible to do
• Race is one of those characteristics that cannot be manipulated
Testing for Discrimination II
• Audit Studies – Send in Hispanic, African American, and white applicants for job interviews
• Two Problems:
– Auditors are matched on some observables except race
» Height, weight, age, dialect, dressing style, and hairdo. Is that enough?
– Study is not double-blind – This can affect the estimated treatment effects
Testing for Discrimination III
• Hard to manipulate race in life, but
EASY to manipulate race on paper
• Which name doesn’t belong?
• Chow Yun Phat, Pete Sampras, Srikanth
Kadiyala
– Correct answer is clearly Srikanth because he
is not rich and famous
• Racial groups can have very different
sounding names
Testing for Discrimination III (cont.)
• Manipulate the resume so the only difference is a Black-sounding name vs. a White-sounding name
– Emily Walsh vs. Jamal Jones
– Greg Baker vs. Lakisha Washington
• Find some real Employers from the
newspapers
– Two markets: Chicago/Boston
– 1300 Ads
Testing for Discrimination III (cont.)
• They vary not only the name (two
resumes) but also type of resume
– More experience and Skills vs. Less
experience and Skills
– Typically 4 different types of resumes to each
job advertisement
• Measure Call Back Rate
– Researchers set up fake tel. #s to receive call
backs
Testing for Discrimination III (cont.)
• Results
– African Americans need to send 15 resumes to get 1
call back
– Whites need to send 10 resumes to get 1 call back
– 50% gap in call backs (whites receive 50% more call backs per resume sent)
– Whites with high quality resume receive nearly 30%
more callbacks vs. whites with low quality resumes
– Blacks with high quality resumes don’t experience the
same benefit
• Striking result: experience and some other skills are not being rewarded in the marketplace for blacks
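The 50% figure follows from the two "resumes per call back" numbers above. A short worked calculation, using only those two numbers (the function name is just for illustration):

```python
def callback_rate(resumes_per_callback):
    """Convert 'resumes needed for one call back' into a call back rate."""
    return 1 / resumes_per_callback

white_rate = callback_rate(10)   # 1 call back per 10 resumes
black_rate = callback_rate(15)   # 1 call back per 15 resumes

relative_gap = white_rate / black_rate - 1   # how many more call backs whites receive
point_gap = white_rate - black_rate          # gap in percentage points

print(f"white rate = {white_rate:.1%}, black rate = {black_rate:.1%}")
print(f"relative gap = {relative_gap:.0%}, absolute gap = {point_gap:.1%} points")
```

So call back rates of roughly 10% and 6.7% give a relative gap of 50%, which corresponds to a difference of about 3.3 percentage points.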
Separating Theories
• Does this method separate Statistical from Taste
Based Discrimination?
– YES, Why?
• This study is superior to Audit studies, why?
– Perfect Matching on Treatment and Control
– Unlike audit studies no bias from either participant or
researchers
• This study has quite a few positives in the realm of RCTs. What are they?
– No attrition!
– No mixing of treatment and control!
– Cheap!
Some Common Non-Experimental
Designs
• Designs without control groups (X is the treatment, O an observation)
• X O1 – Observe data only from the post-treatment period
• O1 X O2 – Observe data from the pre- and post-treatment periods
• O1 O2 X O3 – Observe data from pre and post periods; observe a longer pre period
Some Common Problems with
Non-Experimental Design
• Ambiguous Temporal Precedence
– For cross-sectional data
• History- Events occurring concurrently with
intervention affect results
• Maturation – Naturally occurring changes
over time confused with intervention
• Regression to the mean
Some Common Non-Experimental
Designs
• Designs without control groups
• X O1
– No control group
– Causality impossible to show
• O1 X O2
– No true control group; the pre-period is used as one
– History and maturation are problems
– Regression to the mean is also a problem
– Most important thing to remember: treatment timing might not be random
Some Common Non-Experimental
Designs (cont.)
• O1 O2 X O3
– No true control group
– History and maturation are problems
– One can argue against regression to the mean, since there is a longer pre-period
– Most important thing to remember: treatment timing might not be random
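Regression to the mean in a pre/post design, and why a second pre-period observation helps, can be sketched with simulated data. The assumptions are made up for illustration: each unit has a stable level plus independent noise at every reading, units enter the program because their first reading was extreme, and the program has no effect at all.

```python
import random
from statistics import mean

rng = random.Random(11)
n = 20_000
people = []
for _ in range(n):
    level = rng.gauss(100, 10)   # stable underlying level (made-up scale)
    # Three readings of the same unit, each with independent noise.
    o1, o2, o3 = (level + rng.gauss(0, 10) for _ in range(3))
    people.append((o1, o2, o3))

# Units are selected into the program because their first reading was extreme.
selected = [p for p in people if p[0] > 120]

# The program is assumed to have zero effect; the last reading is the post period.
# One pre-period: the pre reading is the same reading that triggered selection.
change_one_pre = mean(o3 - o1 for o1, o2, o3 in selected)

# Two pre-periods: the later pre reading was not used for selection.
change_two_pre = mean(o3 - o2 for o1, o2, o3 in selected)

print(f"selected units = {len(selected)}")
print(f"post minus selection reading = {change_one_pre:+.1f}")
print(f"post minus later pre reading = {change_two_pre:+.1f}")
```

With only the selection reading as the pre-period, the do-nothing program appears to lower the outcome. With a second pre-period reading, the spurious change largely disappears. History and maturation are not addressed by this check.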
Cites
• Free For All? Lessons from the Rand Health Insurance Experiment, Joe Newhouse
• Statistics and Causal Inference, Journal of the American Statistical Association, Vol. 81, no. 396, Dec. 1986, pp. 945-960, Paul Holland
• Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination, American Economic Review, Vol. 94, no. 4, Sept. 2004, pp. 991-1013, Marianne Bertrand, Sendhil Mullainathan