ECON 550 Econometrics
March 1, 2021
Instrumental Variables (W Chapter 15)
Drexel University, LeBow College of Business

Outline for Today
Administrative Business:
• Upcoming Assignments
1) Differences-in-Differences (cont.)
2) Panel Estimation Recap – Exercise #5
3) Instrumental Variables
   • Instrument Relevance, Exogeneity, and Monotonicity
   • IV Estimation
   • 2SLS Estimation
   • IV Testing

Due Dates
• Due Monday, 3/8, 6:00 p.m.:
1) Presentations
• Due Friday, 3/12, 11:59 p.m.:
1) FINAL DRAFT
2) Peer Review – Presentations
3) Teammates Survey
• FINAL EXAM: Monday, 3/15, 6:00 p.m. – 8:30 p.m.

Recall: DiD as First-Differencing or Fixed Effects
$$GymVisits_{it} = \beta_0 + \beta_1\, treated_i + \beta_2\,(treated_i \times March_t) + \beta_3\, March_t + u_{it}$$
• In terms of first differences,
$$GymVisits_{i2} - GymVisits_{i1} = \gamma_0\,(March_2 - March_1) + \gamma_1\, treated_i\,(March_2 - March_1) + \Delta u_i$$
• Recognizing that $March_2 = 1$ and $March_1 = 0$:
$$\Delta GymVisits_i = \gamma_0 + \gamma_1\, treated_i + \Delta u_i$$
• $\gamma_0$ captures the average change in gym visits between February and March for the control group.
• $\gamma_1$ captures the DiD treatment effect.

Recall: DiD as First-Differencing or Fixed Effects
$$GymVisits_{it} = \beta_0 + \beta_1\, treated_i + \beta_2\,(treated_i \times March_t) + \beta_3\, March_t + u_{it}$$
• In terms of fixed effects,
$$GymVisits_{it} = \alpha_i + \lambda_t + \beta\,(treated_i \times March_t) + u_{it}$$
• Baseline differences in February gym visits between treatment and control groups are absorbed by the entity fixed effects, $\alpha_i$, while the time fixed effects, $\lambda_t$, absorb the treatment-period effect, $March_t$, for members of the control group.
• The coefficient on the interaction term, $treated_i \times March_t$, captures the DiD treatment effect.

Differences-in-Differences (DiD) (cont.)
➢ Under what assumptions will the DiD estimator capture the causal effect of the policy change or experimental treatment?
➢ A key assumption of the DiD strategy is that of parallel trends.
➢ If not for the policy change or experimental intervention, outcomes would have evolved similarly over time for both treatment and control groups.
➢ This is typically reasonable in the case of experiments, given randomization into treatment, but it remains a potential concern in small samples.

DiD Parallel Trends Assumption
• A DiD regression compares the trend in the outcome in the treatment group to the trend in the outcome in the control group.
• For this comparison to yield a good estimate of the treatment effect, we must rule out any differences in pre-existing trends between the two groups.
• If the pre-existing trends differ, then any difference in differences may simply reflect a continuation of these pre-existing trends rather than a causal effect.

Checking Parallel Trends
• Using data from the pre-period, create a linear time trend, $trend_t = 1, 2, \ldots, T \;\; \forall\, t = 1, 2, \ldots, T$, where period $T$ is the last untreated time period.
• Then, interact it with a treatment-group dummy and estimate the model below on pre-treatment data (see the Stata sketch below):
$$Y_{it} = \beta_0 + \beta_1\, trend_t + \beta_2\, treat_i + \beta_3\,(treat_i \times trend_t) + u_{it}$$
• If the coefficient on this interaction is different from zero ($\beta_3 \neq 0$), the data flunk the parallel trends assumption and the DiD estimate is likely to be biased.

Checking Parallel Trends (cont.)
• Researchers typically plot trends in the outcome variable across treatment and control groups for both the pre-treatment and treatment periods, with a vertical line at the time the treatment is applied.
• If you have too few pre-treatment observations to run the regression on the last slide, you can simply plot the trends for visual inspection.
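As a concrete illustration, here is a minimal Stata sketch of the pre-trend check above. The variable names (y, treat, id, t) and the cutoff for the last untreated period are hypothetical placeholders, not part of the original example.

```stata
* Sketch: testing for parallel pre-trends (hypothetical variable names y, treat, id, t)
preserve
keep if t <= 4                    // keep pre-treatment periods only (here, 4 = last untreated period)
gen trend = t                     // linear time trend over the pre-period
gen treat_trend = treat * trend   // treatment-group-specific trend
regress y trend treat treat_trend, vce(cluster id)
test treat_trend                  // H0: beta3 = 0 (parallel pre-trends)
restore
```

A small p-value on the interaction flags diverging pre-trends; failing to reject is consistent with, though not proof of, parallel trends.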
Continuous DiD
E.g. Airline Full-Fare Advertising Regulations (FFAR)
$$TicketPrice_{it} = \beta_0 + \beta_1\, tax_{it} + \delta_0\, postFFAR_t + \delta_1\,(tax_{it} \times postFFAR_t) + u_{it}$$
• In a continuous DiD set-up, all observations are "treated," albeit to varying degrees (depending on the size of $tax_{it}$).
• $\beta_0$ measures the average pre-FFAR price where $tax = 0$.
• $\delta_0$ measures average price differences pre- versus post-FFAR.
• $\beta_1$ measures the baseline (pre-FFAR) rate of tax pass-through.
• $\delta_1$ captures the change in the rate of tax pass-through post-FFAR.

Continuous DiD (cont.)
$$TicketPrice_{it} = \beta_0 + \beta_1\, tax_{it} + \delta_0\, postFFAR_t + \delta_1\,(tax_{it} \times postFFAR_t) + u_{it}$$
➢ $\delta_1$ is the (continuous) differences-in-differences estimator of the effect of FFAR on the rate of tax pass-through (see the Stata sketch below).
➢ Under what conditions will $\delta_1$ capture the causal effect of the policy change?

Difference in Differences in Differences (Triple-Differencing, a.k.a. DDD)
• Conceptually, the DDD estimator captures the difference between two DiD results in one regression.
• Triple differencing helps strengthen the credibility of the parallel trends assumption.
• The first difference is the one we have already studied.
• The second difference is relative to a group that was exposed to the treatment but should not be affected by it.
• E.g., Outbound vs. Inbound (not subject to FFAR) flights.
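Before turning to instrumental variables, here is a minimal Stata sketch of how the continuous DiD specification from the FFAR example above might be estimated. The variable names (price, tax, postFFAR, market) are hypothetical placeholders.

```stata
* Sketch: continuous DiD for the FFAR example (hypothetical variable names)
* c.tax##i.postFFAR expands to tax, postFFAR, and their interaction
regress price c.tax##i.postFFAR, vce(cluster market)
* The coefficient on the tax x postFFAR interaction is the continuous DiD
* estimate (delta_1): the change in the rate of tax pass-through after FFAR.
```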
Why Instrumental Variables?
• We have discussed many scenarios in which different sources of bias will prevent us from identifying causal effects of X on Y through violation of the OLS zero conditional mean independence assumption:
1) Omitted Variables
2) Functional Form Misspecification
3) Measurement Error (Errors-in-Variables)
4) Simultaneity (Simultaneous Causality)
5) Sample Selection

Why Instrumental Variables? (cont.)
• When more direct solutions are not available (e.g. explicit controls or fixed effects), instrumental variables (IV) regression offers a possible method for mitigating bias due to omitted variables, simultaneity, or measurement error.
• IV may thereby allow us to estimate causal effects.

Intuition for Instrumental Variables
• The problem we want to avoid is having our regressor(s) of interest, X, correlated with the error term, u.
• When $cov(X, u) \neq 0$, one can think of separating the variation in X into two parts:
1) Variation that is correlated with the error term (= endogenous variation)
2) Variation that is uncorrelated with the error term (= exogenous variation)

Intuition for Instrumental Variables (cont.)
• In other words, the effect of X on Y can be decomposed into a causal and a non-causal component.
• IV allows us to decompose this variation in X and estimate only the effect of the variation in X that is uncorrelated with the error term, i.e. the causal effect.

E.g. Concealed Gun Laws and Crime
• Even after accounting for various sources of bias through the inclusion of fixed effects, one might still worry that our estimates of the effect of "shall issue" laws on violent crime will be biased due to the fact that states can choose if and when to implement these laws (simultaneity).
• Variation in shall therefore consists of two parts:
1) Endogenous variation in the timing and geographic distribution of shall issue laws due to states' intentional responses to crime rates.
2) Exogenous variation due to factors having nothing to do with crime rates, such as changes in political representation (driven by voter preferences over other issues).
• We would like to discard the endogenous variation in shall and extract only the variation that is truly exogenous to crime (e.g. as if these laws were randomly assigned) to measure the causal effect of these laws on crime.

Implementing Instrumental Variables
• In order to implement this desired decomposition of the variation in X, we need at least one additional variable, Z, which helps to explain the exogenous variation in X without having any direct effect on Y.
• This additional variable, Z, then serves as an instrument for our X of interest.
E.g. Concealed Gun Laws and Crime
• In our example, a potential instrument for shall issue laws would be a variable which helps to predict where and when these laws are implemented without having any direct relationship to crime rates (i.e. where the only relationship is through the implementation of shall issue laws).

Requirements for a Valid Instrument
(1) Instrument relevance:
• The instrument, Z, successfully explains variation in the endogenous regressor, X: $cov(Z, X) \neq 0$
(2) Instrument exogeneity:
• The instrument, Z, is uncorrelated with the error term from the regression relating X and Y (i.e. Z does not directly influence Y, except through X): $cov(Z, u) = 0$

Instrumental Variables Estimation
• Consider the following basic regression:
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
• If $cov(X, u) \neq 0$, $\hat\beta_1^{OLS}$ will be biased and inconsistent, and OLS will be uninformative or, worse, misleading.

Instrumental Variables Estimation (cont.)
• Now, suppose there exists a valid instrument, Z, for our endogenous regressor such that $cov(Z, u) = 0$ and
$$X_i = \pi_0 + \pi_1 Z_i + v_i$$
➢ How can we test for instrument relevance?
➢ How can we test for instrument exogeneity?

Instrumental Variables Estimation (cont.)
• Given that $Y = \beta_0 + \beta_1 X + u$,
$$cov(Z, Y) = \beta_1\, cov(Z, X) + cov(Z, u)$$
• Hence, provided that $cov(Z, u) = 0$ and $cov(Z, X) \neq 0$,
$$\beta_1 = \frac{cov(Z, Y)}{cov(Z, X)} \;\;\Rightarrow\;\; \hat\beta_1^{IV} = \frac{\sum_i (Z_i - \bar Z)(Y_i - \bar Y)}{\sum_i (Z_i - \bar Z)(X_i - \bar X)}$$

Properties of $\hat\beta_1^{IV}$ – Consistency
$$plim\, \hat\beta_1^{IV} = plim\, \frac{\sum_i (Z_i - \bar Z)(Y_i - \bar Y)}{\sum_i (Z_i - \bar Z)(X_i - \bar X)} = \beta_1 + \frac{cov(Z, u)}{cov(Z, X)} = \beta_1$$

Properties of $\hat\beta_1^{IV}$ – Unbiasedness
$$\hat{\beta}^{IV} = (Z'X)^{-1}Z'Y = \beta + (Z'X)^{-1}Z'u \;\;\Rightarrow\;\; E\big[\hat{\beta}^{IV} \mid X, Z\big] = \beta + E\big[(Z'X)^{-1}Z'u \mid X, Z\big]$$
➢ In finite samples, $\hat\beta^{IV}$ remains generally biased (hence the importance of large samples and the consistency result).

Properties of $\hat\beta_1^{IV}$ – Efficiency
• Assuming homoskedasticity,
$$Avar(\hat\beta_1^{IV}) = \frac{\sigma_u^2}{n\,\sigma_X^2\,\rho_{X,Z}^2}$$
where $\rho_{X,Z}^2$ is the $R^2$ from the regression of X on Z (including a constant).
➢ Unless $\rho_{X,Z}^2 = 1$ (i.e. $Z = X$), $Avar(\hat\beta_1^{IV}) > Avar(\hat\beta_1^{OLS}) = \sigma_u^2/(n\,\sigma_X^2)$.
(Note that this comparison is only sensible if $cov(X, u) = 0$.)

Weak Instruments
➢ Weak correlation between X and the instrument, Z, implies a small $\rho_{X,Z}^2$ and, hence, large standard errors.
➢ Worse, even a very modest failure of instrument exogeneity (i.e., $cov(Z, u) = 0$ does not hold precisely) can lead to severe asymptotic bias and inconsistency if $cov(Z, X)$ is weak:
$$plim\, \hat\beta_1^{IV} = \beta_1 + \frac{cov(Z, u)}{cov(Z, X)} = \beta_1 + \frac{corr(Z, u)}{corr(Z, X)} \cdot \frac{\sigma_u}{\sigma_X}$$
$$plim\, \hat\beta_1^{OLS} = \beta_1 + corr(X, u) \cdot \frac{\sigma_u}{\sigma_X}$$

Weak Instruments (cont.)
• Asymptotic bias for the IV estimator will be more severe than for the OLS estimator if:
$$\left| \frac{corr(Z, u)}{corr(Z, X)} \right| > \left| corr(X, u) \right|$$
➢ Successful application of IV methods depends critically on having a valid (and strong) instrument that satisfies both instrument relevance and instrument exogeneity.

E.g. Instrument Validity
• Suppose that we want to estimate
$$\ln(wage) = \beta_0 + \beta_1\, educ + u$$
For the many reasons discussed before, educ is likely endogenous to wages through unobserved ability, etc.
➢ Which of the following is likely to serve as a valid instrument for educ?
➢ Father's educational attainment?
➢ Number of siblings?
➢ College proximity?
➢ Quarter of birth?
➢ Social security numbers?
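As a quick illustration of how such an IV regression is run in practice, here is a minimal Stata sketch using the college-proximity idea from the list above; the variable names (lwage, educ, nearcollege) are hypothetical placeholders, and the ivregress 2sls command is discussed in the 2SLS material that follows.

```stata
* Sketch: IV estimation of the return to education using college proximity
* (lwage, educ, and nearcollege are hypothetical variable names)
ivregress 2sls lwage (educ = nearcollege), vce(robust)
estat firststage      // reports first-stage statistics (instrument relevance)
```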
Two-Stage Least Squares (2SLS)
• Thus far, it is not altogether transparent how the introduction of Z enables the decomposition of X into endogenous and exogenous components to estimate $\beta_1$.
➢ Two-stage least squares (2SLS) estimation makes this explicit.

2SLS (cont.)
• Recalling our expression for testing instrument relevance,
$$X_i = \pi_0 + \pi_1 Z_i + v_i$$
➢ Estimating this last relationship, we can decompose the variation in X into exogenous and endogenous parts, $X_i = \hat X_i + \hat v_i$:
1) $\hat X_i = \hat\pi_0 + \hat\pi_1 Z_i$ (exogenous)
2) $\hat v_i$ (endogenous)

2SLS Estimation (cont.)
• 2SLS regression thus proceeds in two stages:
1) In the first stage, we regress the endogenous regressor, X, on the instrument(s) and obtain predicted values of the component of X which is uncorrelated with the error term u from the regression of Y on X:
$$X_i = \pi_0 + \pi_1 Z_i + v_i \;\;\Rightarrow\;\; \hat X_i = \hat\pi_0 + \hat\pi_1 Z_i$$
2) In the second stage (i.e. the main or "structural" equation), we regress Y on these predicted values:
$$Y_i = \beta_0 + \beta_1 \hat X_i + u_i$$
➢ Provided Z is a valid instrument, $\hat\beta_1^{2SLS}$ will be a consistent estimate of the true causal effect of X on Y.

2SLS Estimation (cont.)
• Note that $\hat X$ in the second-stage 2SLS regression is a generated regressor, and is therefore measured with some sampling error that depends on $\hat v$.
• Performed separately as one-stage-at-a-time OLS regressions, $SE(\hat\beta_1)$ will be invalid in that it will fail to account for variation in $\hat v$.
• Computing valid standard errors therefore requires more sophisticated adjustments, which ivregress 2sls will perform automatically in Stata.

Multivariate IV vs. 2SLS Estimation
• In a multivariate regression model with a single endogenous regressor, $X_1$, $\hat\beta_1^{IV}$ and $\hat\beta_1^{2SLS}$ will each still consistently estimate the effect of $X_1$ on Y, provided that a valid instrument exists, and IV and 2SLS are synonymous.
• With multiple valid instruments, or exclusion restrictions (i.e. variables that do not appear directly in the 2nd stage equation and satisfy instrument exogeneity), 2SLS estimation is required.

Multivariate IV vs. 2SLS (cont.)
Proof that $\hat\beta_1^{IV} = \hat\beta_1^{2SLS}$:
$$X_i = \pi_0 + \pi_1 Z_i + v_i \quad \text{(First Stage)}$$
$$Y_i = \beta_0 + \beta_1 \hat X_i + u_i \quad \text{(Second Stage)}$$
$$\hat\beta_1^{2SLS} = \frac{cov(\hat X_i, Y_i)}{var(\hat X_i)} = \frac{cov(\hat\pi_0 + \hat\pi_1 Z_i,\, Y_i)}{var(\hat\pi_0 + \hat\pi_1 Z_i)} = \frac{\hat\pi_1\, cov(Z_i, Y_i)}{\hat\pi_1^2\, var(Z_i)} = \frac{cov(Z_i, Y_i)}{\hat\pi_1\, var(Z_i)}$$
and, since $\hat\pi_1 = cov(Z_i, X_i)/var(Z_i)$,
$$\hat\beta_1^{2SLS} = \frac{cov(Z_i, Y_i)}{cov(Z_i, X_i)} = \hat\beta_1^{IV}$$

Structural (IV/2SLS), Reduced Form, and First-Stage Equations
The reduced form equation evaluates the effect of the instrument directly on the outcome.
$$X_i = \pi_0 + \pi_1 Z_i + v_i \quad \text{(First Stage)}$$
$$Y_i = \beta_0 + \beta_1 X_i + u_i \quad \text{(Second Stage Structural Equation)}$$
$$Y_i = \alpha_0 + \alpha_1 Z_i + e_i \quad \text{(Reduced Form Equation)}$$
• Under the assumption that the exclusion restriction is valid, the reduced form effect of the instrument on Y must necessarily operate through X (only).
• Hence, $\alpha_1 = \beta_1 \cdot \pi_1$.

Structural (IV/2SLS), Reduced Form, and First-Stage Equations (cont.)
• By implication,
$$\beta_1 = \frac{\alpha_1}{\pi_1} \equiv \frac{cov(Z_i, Y_i)}{var(Z_i)} \cdot \frac{1}{\pi_1}$$
➢ The causal effect of X on Y is equal to the reduced form effect of the instrument scaled by the first stage coefficient (see the sketch below).
E.g. Returns to education and college proximity.
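A minimal Stata sketch of this indirect least squares logic, using hypothetical variable names from the college proximity example (lwage, educ, nearcollege); it simply verifies numerically that the IV estimate equals the reduced-form coefficient divided by the first-stage coefficient.

```stata
* Sketch: IV estimate = (reduced form) / (first stage); hypothetical variable names
regress educ nearcollege                    // first stage
scalar pi1 = _b[nearcollege]
regress lwage nearcollege                   // reduced form
scalar alpha1 = _b[nearcollege]
display "indirect IV estimate = " alpha1/pi1
ivregress 2sls lwage (educ = nearcollege)   // should match the ratio above
```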
Local Average Treatment Effects (LATE)
• $\hat\beta_1^{2SLS}$ captures a local average treatment effect (LATE).
• To see this, note that you can think of the portion of the variation in X that is explained by Z as capturing the subset of the sample that is induced to "comply" with X, the "treatment."
E.g. Returns to education and college proximity.
• Z = distance to the nearest college
• X = college attendance
➢ 2SLS (IV) compares those who were induced to attend college (treated) due to their proximity to a college to those who chose not to attend (untreated) due to being far away.

LATE (cont.)
• Students who respond to college proximity are called "compliers."
• Those who would go to college regardless of how far they live from a college are called "always takers."
• Those who would never go to college, regardless of how close they live to one, are called "never takers."
➢ IV estimates are based on a comparison of outcomes within a subset of the pool of potential college students (the compliers).

LATE (cont.)
• $\hat\beta_1^{IV}$ is the local effect averaged across the subset of compliers, hence the name "local average treatment effect."
• $\hat\beta_1^{IV}$ does not address the effect of college attendance on always takers or what might happen if you forced never takers to attend college.
• IV estimates are likely to be externally valid for those who are similar to the compliers but may not apply more generally.

Multicollinearity and 2SLS Estimation
• In a multivariate model, imperfect multicollinearity can be even more serious for 2SLS estimation than for OLS. This comes from the fact that:
1) The second-stage regressor, $\hat X$, necessarily has less variation than the original endogenous regressor, X.
2) The correlation between $\hat X$ and the remaining exogenous regressors (which are used in the first stage as well) is generally higher than the correlation between X and those covariates.

2SLS w/ Multiple Endogenous Regressors
• With multiple endogenous regressors, the order condition requires the existence of at least as many valid instruments as endogenous regressors.
• Each endogenous regressor will require a separate first-stage regression involving all instruments and exogenous regressors.

Tests of Endogeneity (i.e. Do we need IV?)
• Suppose that we wish to estimate
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$$
where we suspect $E[u \mid X_1] \neq E[u]$, and $X_2$ is an exogenous control variable.
• Assuming that we have a valid instrument, Z, for $X_1$, we can test whether IV estimation is necessary by comparing OLS and 2SLS estimates.

Tests of Endogeneity (cont.)
➢ Under the null hypothesis that $X_1$ is exogenous, both $\hat\beta^{OLS} \to \beta$ and $\hat\beta^{2SLS} \to \beta$, whereas only $\hat\beta^{2SLS}$ is consistent under the alternative.
➢ Moreover, assuming homoskedasticity, $Var(\hat\beta^{OLS}) \le Var(\hat\beta^{2SLS})$.
Durbin-Wu-Hausman Test:
$$H = (\hat\beta^{2SLS} - \hat\beta^{OLS})' \big[\widehat{Var}(\hat\beta^{2SLS}) - \widehat{Var}(\hat\beta^{OLS})\big]^{-1} (\hat\beta^{2SLS} - \hat\beta^{OLS}) \;\sim\; \chi^2$$
with degrees of freedom equal to the number of regressors being tested for endogeneity.

Tests of Endogeneity (cont.)
Regression-Based Test:
➢ Under the null ($X_1$ is exogenous), the residual from the first-stage regression should have no statistically significant effect if included as an extra regressor in the OLS regression.
1) Estimate $X_1 = \pi_0 + \pi_1 X_2 + \pi_2 Z + v \;\Rightarrow\; \hat v$
• $\hat v$ captures variation in $X_1$ that is orthogonal to $X_2$ and Z and is therefore potentially correlated with u.
2) Estimate $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \delta \hat v + e$
3) Test $H_0: \delta = 0$. Under the null, $cov(X_1, u) = 0 \Leftrightarrow cov(v, u) = 0 \Rightarrow \delta = 0$.

Tests of Endogeneity (cont.)
• Rejection of $\delta = 0$ implies that $X_1$ is endogenous (through correlation between v and u). ➢ Use IV!
• Note that the regression-based test of endogeneity delivers point estimates in the second-step regression that are identical to 2SLS.
• This shows that, instead of the usual IV or 2SLS routine, you could include $\hat v$ in a second-stage regression alongside $X_1$ to control explicitly for the part of $X_1$ that is endogenous.
• This is known as the control function approach to IV estimation (see the sketch below).
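Here is a minimal Stata sketch of this regression-based test and its built-in counterpart; the variable names (y, x1, x2, z) are hypothetical placeholders.

```stata
* Sketch: regression-based (control function) test of endogeneity
* y, x1 (suspected endogenous), x2 (exogenous control), and z are hypothetical names
regress x1 x2 z                 // first stage
predict vhat, residuals         // vhat: variation in x1 not explained by x2 and z
regress y x1 x2 vhat            // second step: include the first-stage residual
test vhat                       // H0: delta = 0 (x1 is exogenous)

* Built-in alternative after 2SLS:
ivregress 2sls y x2 (x1 = z)
estat endogenous                // Durbin / Wu-Hausman endogeneity tests
```

Rejecting the null in either version points toward relying on the IV estimates rather than OLS.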
Overidentification (OID) Tests
• An IV regression is said to be just identified if there are exactly as many instruments as endogenous regressors.
• If you have multiple candidate instruments, you can test whether a subset of these is uncorrelated with the structural error term in the true regression model (i.e. you can test whether instrument exogeneity is satisfied for a subset of instruments).
• Note: For any of these tests to be convincing, you must assert that at least one of your instruments is valid.
• This is an important shortcoming that limits the usefulness of these tests. Nevertheless, they are commonly used.

OID Tests (cont.)
• Suppose we have two candidate instruments, $Z_1$ and $Z_2$, for $X_1$ in
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u.$$
• Intuitively, we can obtain 2SLS estimates of $\beta_1$ using either instrument singly. Under the null that both instruments are exogenous, both 2SLS estimators will be consistent and approximately equal (with differences due only to sampling error).
• We reject this null if $\hat\beta_1^{(Z_1)} - \hat\beta_1^{(Z_2)}$ is statistically significant, and conclude that one or both instruments are invalid.

OID Tests (cont.)
• Note that rejection of the null for the Hausman OID test gives no guidance as to which instrument is invalid.
• Moreover, the OID test might fail to reject if both instruments are invalid but nevertheless yield similar 2SLS coefficient estimates.

OID Tests (cont.)
• Furthermore, the Hausman OID test might also falsely reject due to heterogeneous treatment effects.
• In this case, different instruments might isolate different sources of variation in the endogenous X.
E.g. Returns to Education
• One instrument might explain variation in high school education (e.g. quarter of birth) and another might explain variation in college education (e.g. distance to the nearest 4-year college).
• If the effects of high school and college education on the outcome are different (i.e. different LATEs), the OID test could falsely reject the validity of the instruments.

OID Tests (cont.)
• Assuming homoskedasticity, an alternative OID test with q overidentifying restrictions can be implemented as follows:
1) Estimate the model by 2SLS using all instruments and obtain the residuals, $\hat u$.
2) Regress $\hat u$ on all exogenous regressors and instruments and compute the regression $R^2$.
3) Under the null that all exogenous regressors and instruments are uncorrelated with u, $nR^2 \sim \chi^2_q$.
➢ If $nR^2$ is large, we reject this null and conclude that at least one instrument is not exogenous.

Weak Instruments Tests
• In the simplest case, testing whether an instrument or collection of instruments for a single endogenous regressor is "weak" can be accomplished with an F test of the exclusion restrictions in the first stage.
• Staiger and Stock (1997) suggest, as a rule of thumb, needing $F > 10$ to reject instrument weakness.
• For situations involving multiple endogenous regressors and adjusted (e.g. robust) standard errors, Kleibergen-Paap statistics apply, with critical values drawn from Stock and Yogo (2005).

Assignment
For our last week of class:
• Please read W Ch. 15
• PRESENTATIONS due 3/8, beginning of class
• FINAL DRAFTS and Presentation Feedback due 3/12 @ 11:59 p.m. EST