Econometric Modeling: More on Experimental Design

Angrist and Pischke
• Emphasize the identification of causal effects.
• Ask, "What is your identification strategy?"
• The point is to control for unobserved selection effects.
• They offer several solutions:
  – Randomized Controlled Trials (experiments): we'll talk more about these in the next class.
  – Natural experiments
  – Instrumental variables
  – Selection corrections like the Heckman two-step

Main ideas behind RCTs
• RCTs try to bring the controls of hard-science research to social-science analysis.
• Some treatment is envisioned.
• Participants are assigned randomly to the treatment and control groups.
• Because getting the treatment is random, differences in the outcome, after controlling for covariates, are attributable to the treatment.
• Removes selection effects (a minimal simulation sketch of this point appears just before the ten rules below).

What are selection effects?
• Covariates help control for differences in the way the treatment impacts differ across groups.
• A problem with RCTs is that there is selection into the experiment – people who agree to participate may be different from those not willing to participate.

What is the idea behind natural experiments?
• Basically the same as an RCT, but with less control over assignment to groups.
• We are looking for something natural that randomly assigns people into separate categories for getting the treatment or not.
  – These are rarer than people like to think, because behavior and policy are inherently endogenous.
  – They need to meet a high standard; many seemingly exogenous differences are endogenous.
  – We are looking for something unrelated to the treatment that separates groups.
  – The best are natural disasters, etc. Often different political outcomes are used, but that suffers from the "Tiebout" effect.
• Natural experiments do eliminate the selection-into-the-experiment problem of RCTs.

A caution about "natural experiments" and the Tiebout problem
• Solon (1985) estimated effects of unemployment insurance on the duration of unemployment spells.
• He compared states that recently changed standards.
• This ignores that the changed standards could be endogenous: long-spell states might have purposely tightened standards.
• See Tiebout (1956), "A Pure Theory of Local Expenditures."

We are looking for external validity
• Do the impacts that are observed carry over if the magnitude of the change in the variable used to define the experiment is very different?
• But ...
  – Internal validity (the design) makes experiments narrow and idiosyncratic.
  – Empirical evidence is always local to the data.
  – The underlying variation is never completely representative, so extrapolation is always speculative.
  – This calls for repeated experiments, with a range of values.
  – Accumulate more evidence.

Kennedy's paper addresses similar issues
• Applied econometricians "follow" a set of rules to translate econometric theory into econometric practice.
• So why doesn't theory translate easily into practice?
  – Theory relies on asymptotic properties; applied econometrics works with finite samples.
  – Econometric training focuses on estimation, and has lots of tools to fix estimation problems (i.e., things like sample selection bias) by focusing on technique. But harder problems are likely to occur at the specification stage.
• As a result, applied econometricians "violate" the rules they learn in their classes as they move into practice. Kennedy's paper outlines where violating theory has become acceptable, and how to work around it.
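Before turning to the ten rules, here is the selection-effects sketch promised above. It is a minimal illustration, not material from the slides or from Kennedy's paper: the data-generating process, parameter values, and variable names are all assumptions chosen only to make the contrast between self-selection and random assignment visible.

```python
# Illustrative sketch: why random assignment removes selection effects.
# All numbers and names here are assumptions for demonstration purposes.
import numpy as np

rng = np.random.default_rng(0)
n, true_effect = 5_000, 2.0
ability = rng.normal(size=n)                          # unobserved confounder

# Self-selected "treatment": higher-ability people opt in more often.
opt_in = (ability + rng.normal(size=n)) > 0
y_selected = 1.0 + true_effect * opt_in + 1.5 * ability + rng.normal(size=n)

# Randomized treatment: assignment is unrelated to ability.
assigned = rng.integers(0, 2, size=n).astype(bool)
y_rct = 1.0 + true_effect * assigned + 1.5 * ability + rng.normal(size=n)

# The naive mean comparison is biased upward under self-selection,
# but close to the true effect (2.0) under random assignment.
naive = y_selected[opt_in].mean() - y_selected[~opt_in].mean()
rct = y_rct[assigned].mean() - y_rct[~assigned].mean()
print(f"self-selected difference: {naive:.2f}, randomized difference: {rct:.2f}")
```

In the randomized arm the simple difference in means recovers the effect because assignment is independent of ability; regressing the outcome on the treatment dummy plus covariates would give the same answer with less noise.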
Kennedy: Ten Rules for Applied Econometrics

1. Use common sense and economic theory
   – Use good statistical practices.
   – Match like measured variables.
   – Select functional forms appropriate for your dependent variable (e.g., a beta distribution for a variable with values constrained between 0 and 1).
   – Don't add trends for trendless variables.
   – Don't use a formula for your empirical work; think about what you are doing.
   – My rule: let good theory drive your econometrics.
   – From Angrist and Pischke: know what you identified.
2. Avoid Type III errors (producing the right answer to the wrong question)
   – Corollary: an approximate answer to the right question is worth more than a precise answer to the wrong question.
3. Know the context, which means get the facts
   – How was the data collected and imputed?
   – How were observations selected?
   – These are parts of my "know your data" rule.
   – But also, understand the system you are trying to model.
4. Inspect the data (I need say nothing more on this)
   – But put together graphs of the data to see patterns and anomalies.
5. Keep it sensibly simple
   – Begin with simple models, then make them more complicated (but only if necessary).
   – This is the empirical analog of what I said about theoretical modeling.
   – There is a conflict between complexity (general) and simplicity (specific).
   – Use the simplest method and simplest specification appropriate for your analysis.
6. Use the interocular trauma test (what is this?)
   – Look at the results until the answer hits you between the eyes.
   – Look at them hard until you are comfortable taking ownership (telling someone you did it).
   – Only then should you check that the results make sense with regard to signs, magnitudes, significance, and other statistical properties.
7. Understand the costs and benefits of data mining
   – The goal is not a high R².
   – The significance level is contextual.
   – The specification depends on what data you have, and whether they are relevant.
   – Coase: "If you torture the data long enough, Nature will confess." What does this mean?
   – Do remember, the data can drive theory.
     • You observe something, and try to explain it.
     • Econometrics often is useful in understanding what we observe.
     • Make sure your model focuses on your central question.
8. Be prepared to compromise
   – Understand the gap between the statistical theory underlying your analysis and the actual application you are doing.
   – For example, there are few populations that are truly infinite.
9. Do not confuse statistical significance with meaningful magnitude
   – I have talked enough about this already; think McCloskey and Ziliak.
10. Report a sensitivity analysis (a small sketch follows below)
   – Pay attention to robustness.
   – Confess your errors and shortcomings (know the limitations of what you did, and admit to them).
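To make rule 10 concrete, here is a minimal sensitivity-analysis sketch. It is not from Kennedy's paper; the simulated data, specifications, and variable names are assumptions used only to show the pattern of re-estimating the coefficient of interest under several plausible specifications and reporting the spread.

```python
# Illustrative sketch of a sensitivity analysis: estimate the coefficient on x
# under several plausible specifications and report how much it moves.
# Data, specifications, and names are assumptions for demonstration purposes.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({"x": rng.normal(size=n), "z": rng.normal(size=n)})
df["x_z"] = df["x"] * df["z"]
df["y"] = 1.0 + 0.8 * df["x"] + 0.3 * df["z"] + rng.normal(size=n)

specs = {
    "x only": ["x"],
    "x and z": ["x", "z"],
    "x, z, and x*z": ["x", "z", "x_z"],
}

estimates = {}
for name, cols in specs.items():
    X = sm.add_constant(df[cols])
    estimates[name] = sm.OLS(df["y"], X).fit().params["x"]

# Report the coefficient on x across specifications, not just one estimate.
print(pd.Series(estimates).round(3))
```

If the coefficient of interest is stable across reasonable specifications, report that; if it is not, the instability itself is a shortcoming to confess.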