Causality and Randomized Control Trials

Empirical Research
• Three broad types of empirical papers
  – Paper Type I - Descriptive
    • CVD mortality over time
    • Regional differences in medical care
    • How are health insurance premiums changing over time?
    • These papers generally DON’T TRY TO SAY WHY the trend might be changing over time
      – Although there is likely to be some speculation

Empirical Research (cont.)
• Paper Type II
  – Relate variable X to variable Y
  – Effect of price on the quantity of medical care
  – Effect of race on income/health
  – Effect of hypertension on risk of CVD
  – These papers are making a causal argument
    • The strength of which is up to the reader to evaluate

Empirical Research (cont.)
• Paper Type III
  – Use estimates from the first two types of papers to make policy recommendations
  – For example, some studies find that insurance generosity affects the use of IVF services
    • Because of limited opportunities, individuals maximize the chance of having at least one child

Empirical Research (cont.)
• One unintended consequence of this is multiple births
• Multiple births result in higher costs and lower infant health
• Using estimates from the IVF papers, someone else might write a paper about the optimal level of insurance benefit

Policy Relevance
• We are going to focus on the second paper type
  – All three types of papers influence policy
  – But paper type II is generally of most interest to policy researchers because it provides magnitudes for the phenomena of interest
    • Magnitudes aid policy makers in the decision to allocate resources

Causation
• What do we mean by causation?
  – We are asking a WHAT IF question
  – What if, instead of X happening, Z happened? How would that change the outcome of interest?
• Thus one must always state the alternative
• The what-if scenario is also called a COUNTERFACTUAL

Some Notation
• Following Holland (1986)
  – Some units U, where U can be a person, city, school
  – Assume for simplicity two treatments, T and C
    • T = Treatment and C = Control
    • Treatment can be a variety of things
      – Drug, education, income, textbooks, co-pays
  – Y represents the outcome from receiving a treatment
  – So Yt(U) and Yc(U)

Fundamental Problem of Causation
• CANNOT observe both Yt(U) and Yc(U) for the same person
  – Unless Temporal Stability AND Causal Transience hold
    • Temporal Stability (TS): effect of T on U is the same now and in the future
    • Causal Transience (CT): U's response to T is not changed by U's earlier exposure to C
  – Or unit homogeneity holds
    • Yt(U1) = Yt(U2) and Yc(U1) = Yc(U2)

Fundamental Problem of Causation (cont.)
• Because Temporal Stability, Causal Transience, and unit homogeneity usually cannot be assumed, we can generally only estimate average treatment effects
• Average treatment effect equals
  – E(Yt(U)) - E(Yc(U))
  – It is estimated by the mean difference of the outcome across the treatment and control groups

Paper Type II - Causality
• Observational Studies
  – Most are cross-sectional
  – Some type of statistical procedure that relates variables X and Y
    • Ordinary Least Squares, Logistic Regression, Propensity Scores
• Quasi-Experimental/Natural Experiments
  – Regression Discontinuity
  – Difference in Difference
  – Instrumental Variables
• Randomized Control Trial (RCT)
  – Gold Standard

Observational Studies I
• Difficult to show causation purely from observational data. Why?
  – An example: researchers are interested in whether income is related to health
    • Direct effects
      – Can buy more medical care
    • Indirect effects
      – Able to afford health insurance
      – Some researchers believe health insurance affects health

Observational Studies I (cont.)
• Money can affect level of education
  – Education might help you get better information
  – Education might help you process information faster

Observational Studies I (cont.)
• Take data from the cross section (point in time)
• Self-reported health as the dependent variable and income as the independent variable
• Also adjust for variables such as education, insurance, geography, age, sex, race, family education, etc. and identify an effect
• Can we say this is the true effect of income on health?

Observational Studies II
• Magnitudes from observational studies are generally biased upwards
  – Especially from cross-sectional studies
  – There are some examples where estimates from observational studies are biased downward
    • These are rare cases in the universe of all published studies
• Can you think of an association that is biased downward?
  – I.e., one where an RCT would increase the size of your coefficient

Observational Studies II (cont.)
• In some studies the bias is hard to sign
• For example, a researcher is interested in whether having fire insurance leads to more fire accidents relative to not having fire insurance
• What is the IDEA?
  – Fire insurance lowers the cost of having your place burn down
  – Thus individuals have less of an incentive to be careful, which in turn increases the probability of a fire (also called Ex-Ante Moral Hazard)

Observational Studies II (cont.)
• Look at the correlation between purchasing insurance and having a fire in the next 5 years
• In observational data, individuals for whom fire insurance is more valuable (more likely to have a fire) will be more likely to buy fire insurance. How does this affect the coefficient?
  – Not adjusting for this biases the coefficient upwards
• In observational data, individuals who are more “cautious” might also be more likely to buy fire insurance.
  – Cautious people might have fewer fires than risky people
  – Not adjusting for this will bias the coefficient downward

Observational Studies II (cont.)
• Conclusion: a priori, it is impossible to tell whether the relationship obtained from observational data is above or below the true effect of having fire insurance on having a fire.

Observational Studies III
• Given the above examples, observational studies primarily show associations
  – We will talk more about research designs with observational data that get us closer to causality
  – Why is it important to show that something is truly causal and not just an association?

Randomized Control Trial
• Randomization is a process used to assign units to either treatment or control
  – Randomization makes treatment assignment independent of all the other variables that might affect outcomes of interest
  – A simple procedure for randomization: coin flipping
  – If randomization is done correctly, the mean difference across treatment and control groups, E(Yt(U)) - E(Yc(U)), is an unbiased estimate of the average treatment effect
  – How can we test whether randomization worked?

RCT (cont.)
• Without randomization it is very difficult to guarantee that it is truly the treatment that is responsible for the outcome
• Most non-experimental procedures are aimed at finding a control group that is similar to the treatment group

RCT (cont.)
• If it's such a good idea, why aren’t there more RCTs?
  – Ethical problems
    • Smoking is a good example
  – Costs
    • RCTs cost a lot of money
    • The RAND Health Insurance Experiment cost 280 million in 2004 dollars
    • This was to randomize 7,791 people and to follow them for approximately 8 years.

RCT (cont.)
• Costs also impact the duration of the experiment
  – The RAND Health Insurance Experiment only ran for 8 years
• Attrition can be high
  – This is also a problem with non-experimental designs
  – Importantly, people who drop out of the experiment are likely different from people who stay in the experiment
    • Treatment effects could be different for the two groups

RCT (cont.)
• For example, conjecture that the treatment effect is higher for the group that stays in the experiment
• If you only used people who stayed in the experiment, there would again be an upward bias to the measured treatment effect.
  – Even though there is attrition, one strategy is to estimate the effect as if there was no attrition.

RCT (cont.)
• Keep everyone in the sample even if some people are no longer taking the treatment
  – This is called “intent to treat” analysis
    • Intent to treat will dilute the true effect since not all individuals in treatment are taking the drug
    • But this preserves the experiment and any estimates are still valid
• In a later lecture we will consider another solution to the attrition problem

RCT (cont.)
• Treated units become controls
  – Different from attrition
• Difficult to generalize from location to location
  – Will an experiment in location A reveal the same effect if done in location B?
• Hawthorne Effects
  – Observation makes people behave differently
  – Thus results might not apply to non-observed settings

RCT (cont.)
• Finally
  – Some things are not easily manipulated
  – How does one randomize sex?
  – How about race?
• Let's come back to this

In Depth Example - Discrimination
• What is the effect of sex (race) on income?
  – Many studies show differences across the groups on a variety of outcomes
  – For example, some studies report that a woman makes 80 cents for each dollar a man makes

What Does Theory Say?
• Two Theories
  – Statistical Discrimination
    • Employers have limited resources to get information about any single individual, but know something about group averages
    • They use information on the group average to make an inference about a specific individual

What Does Theory Say (cont.)
    • Wide applicability
      – Physician decision making, product selection, speeding tickets
      – This is profiling
  – Taste-Based Discrimination
    • Employers do not like to employ individuals from a specific group

What Does Theory Say (cont.)
• The two types of discrimination have very different policy implications
  – In a competitive market, firms will bear the cost of taste-based discrimination
  – Statistical discrimination will likely never be competed away
    • Why?
    • Because using information about the group solves a problem that the profiler faces

Testing for Discrimination I
• How do we test whether there is discrimination and, if so, what type of discrimination?
  – One idea is to simply compare mean wages across different groups from real-world data
  – What are the problems with this method?
    • The employer observes something that you as a researcher do not (experience, good looks)
    • Cannot separate out the two theories with this method

Testing for Discrimination I (cont.)
• Let’s take a step back
  – How would one design an experiment to determine whether there is discrimination?
    • In the RCT framework this question amounts to: how does one randomize race?
    • Seems impossible to do
    • Race is one of those characteristics that cannot be manipulated

Testing for Discrimination II
• Audit Studies
  – Send in Hispanics, African Americans and whites for job interviews
• Two Problems:
  – Auditors are matched on some observables other than race
    » height, weight, age, dialect, dressing style and hairdo. Is that enough?
  – Study is not double blind
    » This can affect the measured treatment effects

Testing for Discrimination III
• Hard to manipulate race in life, but EASY to manipulate race on paper
• Which name doesn’t belong?
  – Chow Yun-Fat, Pete Sampras, Srikanth Kadiyala
  – Correct answer is clearly Srikanth because he is not rich and famous
• Racial groups can have very different sounding names

Testing for Discrimination III (cont.)
• Manipulate the resume so the only difference is a Black-sounding name vs. a White-sounding name
  – Emily Walsh vs. Jamal Jones
  – Greg Baker vs. Lakisha Washington
• Find some real employers from the newspapers
  – Two markets: Chicago/Boston
  – 1300 ads

Testing for Discrimination III (cont.)
• They vary not only the name (two resumes) but also the type of resume
  – More experience and skills vs. less experience and skills
  – Typically 4 different types of resumes to each job advertisement
• Measure call-back rate
  – Researchers set up fake telephone numbers to receive call backs

Testing for Discrimination III (cont.)
• Results
  – African Americans need to send 15 resumes to get 1 call back
  – Whites need to send 10 resumes to get 1 call back
  – 50% gap in call-back rates (1 in 10 vs. 1 in 15)
  – Whites with high quality resumes receive nearly 30% more callbacks vs. whites with low quality resumes
  – Blacks with high quality resumes don’t experience the same benefit
    • Amazing fact: experience and some other skills are not being rewarded in the marketplace for Blacks

Separating Theories
• Does this method separate Statistical from Taste-Based Discrimination?
  – YES. Why?
• This study is superior to audit studies. Why?
  – Perfect matching on treatment and control
  – Unlike audit studies, no bias from either participants or researchers
• This study has quite a few positives in the realm of RCTs. What are they?
  – No attrition!
  – No mixing of treatment and control!
  – Cheap!
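
The call-back arithmetic from the resume study can be checked with a short sketch. It uses only the two figures quoted on the results slide (1 call back per 15 resumes and 1 per 10); the function and variable names are illustrative assumptions, not anything from the study itself. Because names were assigned to resumes at random, the simple difference in call-back rates is the estimated effect of the name.

```python
from fractions import Fraction


def callback_rate(resumes_per_callback: int) -> Fraction:
    """Call-back rate implied by 'N resumes sent per 1 call back'."""
    return Fraction(1, resumes_per_callback)


# Figures quoted on the results slide (not re-derived from the paper).
white_rate = callback_rate(10)   # 1 call back per 10 resumes
black_rate = callback_rate(15)   # 1 call back per 15 resumes

# Difference in means across the randomized groups (percentage points).
difference = white_rate - black_rate

# Relative gap: how much higher the white rate is than the Black rate.
relative_gap = difference / black_rate

print(f"White call-back rate: {float(white_rate):.1%}")
print(f"Black call-back rate: {float(black_rate):.1%}")
print(f"Difference: {float(difference) * 100:.1f} percentage points")
print(f"Relative gap: {float(relative_gap):.0%}")
```

The relative gap is computed against the lower rate, which is how "10 vs. 15 resumes" turns into a 50% gap.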
Some Common Non-Experimental Designs
• Designs without control groups
  – X O1: observe data only from after the treatment (X)
  – O1 X O2: observe data from the pre- and post-treatment periods
  – O1 O2 X O3: observe data from pre and post; observe a longer pre-period

Some Common Problems with Non-Experimental Design
• Ambiguous Temporal Precedence
  – For cross-sectional data
• History
  – Events occurring concurrently with the intervention affect results
• Maturation
  – Naturally occurring changes over time are confused with the intervention
• Regression to the mean

Some Common Non-Experimental Designs
• Designs without control groups
• X O1
  – No control group
  – Causality impossible to show
• O1 X O2
  – No true control group; the pre-period is used as one
  – History, maturation are problems
  – Regression to the mean is also a problem
  – Most important thing to remember: treatment timing might not be random

Some Common Non-Experimental Designs (cont.)
• O1 O2 X O3
  – No true control group
  – History, maturation are problems
  – Arguments can be made against regression to the mean since you have a longer time period
  – Most important thing to remember: treatment timing might not be random

Cites
• Joe Newhouse, Free For All? Lessons from the RAND Health Insurance Experiment
• Paul Holland, Statistics and Causal Inference, Journal of the American Statistical Association, Vol. 81, no. 396, Dec. 1986, pp. 945-960
• Marianne Bertrand and Sendhil Mullainathan, Are Emily and Greg More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination, American Economic Review, Vol. 94, no. 4, Sept. 2004, pp. 991-1013
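
Backup sketch: the maturation problem in the O1 X O2 design, compared with a coin-flip RCT, can be illustrated with a small simulation. Everything here is an assumption made for illustration: the true effect (2.0), the natural trend (1.0), the noise levels, and the function name `simulate` are invented and do not come from any study cited in these slides.

```python
import random
import statistics


def simulate(n=20000, true_effect=2.0, trend=1.0, seed=0):
    """Return (pre_post_estimate, rct_estimate) for made-up data.

    Every unit drifts upward by `trend` between the pre and post periods
    (maturation). Treatment is assigned by a fair coin flip.
    """
    rng = random.Random(seed)
    pre_post_diffs = []              # O2 - O1 for treated units only
    treated_post, control_post = [], []
    for _ in range(n):
        baseline = rng.gauss(50.0, 5.0)
        pre = baseline + rng.gauss(0.0, 1.0)
        treated = rng.random() < 0.5  # coin-flip randomization
        post = baseline + trend + rng.gauss(0.0, 1.0)
        if treated:
            post += true_effect
            pre_post_diffs.append(post - pre)
            treated_post.append(post)
        else:
            control_post.append(post)
    # O1 X O2: uses the pre-period as the "control", so it absorbs the trend.
    pre_post = statistics.fmean(pre_post_diffs)
    # RCT: mean difference across randomized groups; the trend cancels out.
    rct = statistics.fmean(treated_post) - statistics.fmean(control_post)
    return pre_post, rct


pre_post_estimate, rct_estimate = simulate()
print(f"Pre/post estimate: {pre_post_estimate:.2f}")
print(f"RCT estimate:      {rct_estimate:.2f}")
```

Under these assumptions the pre/post estimate lands near the true effect plus the trend, while the randomized comparison lands near the true effect alone, which is the sense in which maturation biases the O1 X O2 design.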