Regression Discontinuity:
Assumptions for RD to be valid:
o All variables, except treatment status, are smooth in the running variable around the threshold
o Agents cannot “self-sort” at the threshold
Applied to the midterm example (scoring above the median on the midterm, attendance after the midterm, performance on the final):
o (The “first stage” or relevance condition) Conditional on a smooth function of midterm scores, scoring above the median must have a strong correlation with attendance after the midterm, where “strong” specifically requires an F-stat larger than 10 (or a corresponding t-stat larger than the square root of 10).
o Factors besides midterm scores that affect performance on the final must be a smooth function of the midterm score in a neighborhood of the median (in other words, there can be no other reason scores “jump” at the threshold).
o There must be no sorting around the median on the midterm
Fuzzy vs Sharp:
o An RD is “fuzzy” when being to the right of the threshold does not perfectly predict treatment status; instead, the threshold dummy serves as the “first stage” in an IV regression (e.g., birthday relative to a school entry cutoff)
o This is the opposite of a “sharp” RD, in which the side of the threshold perfectly determines treatment status (e.g., legal to drink at age 21)
What does an RD identify under heterogeneous treatment effects?
o Sharp RD: the treatment effect in the neighborhood of the threshold (the effect of legal drinking at age 21). You could also interpret the reduced form of a “fuzzy” RD this way (e.g., the effect of being admitted to charter schools at a particular test score cutoff, i.e., the “intention to treat” effect).
o Fuzzy RD: an IV estimate, so it’s a local average treatment effect (see above). So it is the effect in a neighborhood of the threshold, for compliers (e.g., people who do not hold their kids back in the case of a school entry cutoff).
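One way to make the sharp vs fuzzy distinction concrete is to write each estimand as a jump at the threshold. The notation below is an assumption for illustration and does not come from these notes: X is the running variable, c the threshold, Y the outcome, and D the treatment indicator.

```latex
% Assumed notation: X = running variable, c = threshold, Y = outcome, D = treatment
\begin{align*}
\tau_{\text{sharp}} &= \lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x] \\[4pt]
\tau_{\text{fuzzy}} &= \frac{\lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]}
                           {\lim_{x \downarrow c} E[D \mid X = x] - \lim_{x \uparrow c} E[D \mid X = x]}
\end{align*}
```

The numerator of the fuzzy expression is the reduced form (the jump in the outcome) and the denominator is the first stage (the jump in the probability of treatment). In a sharp RD the denominator equals 1, so the two expressions coincide.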
Estimation equation for RD (how is it normally formed?):
o $M_a = \alpha + \rho D_a + \gamma a + e_a$, where $D_a$ is the threshold dummy and $a$ is the running variable
Slope should be interacted with the threshold dummy, AND include quartic terms
Interaction term is ALWAYS needed so the slope can differ above and below the threshold
Stata implementation:
o Generate the running variable relative to the threshold, and its interaction with the threshold dummy
o Write out the full regression
Remember RD only applies to the area around the threshold
A valid RD does NOT need other controls, beyond a smooth function of the running variable
o Other controls are uncorrelated with the threshold dummy, but might be included to reduce standard errors
If adding other controls, be sure to interact with the WHOLE equation

Difference-in-Differences:

IV Regression:
Monotonicity: no defiers!
LATE applies to compliers
Statistical significance:
o Is the instrument relevant? Check the first stage
o Is the IV estimate statistically significant? Check the reduced form
o Is the IV estimate significantly different from OLS? Check the Hausman test
Remember conditions:
o Strongly correlated with x (t > 3, F > 10)
o Unrelated to the unobserved error term that directly affects y (placebo, balance)
Stata: ivregress 2sls y controls (x = z), vce(cluster clustvar) (or vce(robust))

Other reminders:
If fixed effects come in, the error term becomes $\tau_s + u_{is}$
Placebo test: correlated with the error term, uncorrelated with the x variable or instrument
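The Stata steps listed for the RD estimation equation, and the IV command in the IV section, can be sketched together as a fuzzy RD on simulated data. This is only a rough illustration under stated assumptions: the variable names (score, attend, final), the threshold of 70, the bandwidth of 10, and the data-generating process are all hypothetical and not taken from the notes. It uses a linear specification with the interaction term only, so any higher-order terms would have to be added by hand.

```stata
* Hypothetical simulated data; run as a do-file. All names and numbers are assumptions.
clear
set seed 12345
set obs 2000
scalar cutoff = 70                         // assumed threshold
gen score    = rnormal(70, 10)             // running variable (e.g., midterm score)
gen above    = (score >= cutoff)           // threshold dummy
gen xc       = score - cutoff              // running variable relative to threshold
gen xc_above = xc * above                  // interaction: slope may differ above/below

* Fuzzy treatment: crossing the threshold raises the probability of treatment
gen attend = (runiform() < 0.3 + 0.4*above)
* Assumed true treatment effect of 5 in the simulated outcome
gen final  = 50 + 0.5*xc + 5*attend + rnormal(0, 5)

* Reduced form: jump in the outcome at the threshold, within the bandwidth
regress final above xc xc_above if abs(xc) <= 10, vce(robust)

* First stage: jump in treatment at the threshold; check the F-stat on the dummy
regress attend above xc xc_above if abs(xc) <= 10, vce(robust)
test above

* Fuzzy RD as IV: the threshold dummy instruments for treatment
ivregress 2sls final xc xc_above (attend = above) if abs(xc) <= 10, vce(robust)
estat firststage
```

The coefficient on above in the first regression is the intention-to-treat effect; the coefficient on attend in the ivregress output is the reduced form divided by the first stage, i.e., the LATE for compliers near the threshold.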