ECON 6074: Causal Inference
Module 3 - Year 2022/2023
Topic 5. Instrumental Variable 1

Last Lecture - Matching
- Matching is another way to eliminate selection bias by controlling for observable covariates
- This can be accomplished by matching treated and untreated units with the same characteristics
- Matching constructs a comparison sample of untreated units with the same observable characteristics as the sample of treated units
- Identification relies on the Conditional Independence Assumption:
  \[ (Y_{i1}, Y_{i0}) \perp\!\!\!\perp D_i \mid X_i \]
- After matching treated and control groups on observed covariates \( X_i \), treatment assignment is considered as good as random

Example: Effect of MECON program on Monthly Salary
- Want to know the treatment effect of MECON on earnings
- Assume that age is the only confounding factor

MECON                  Matched Sample         Non-MECON
No.  Age   Salary      No.  Age   Salary      No.  Age   Salary
1    29    40,000      7    29    28,000      6    33    30,000
2    28    45,000      15   28    35,000      7    29    28,000
3    33    42,000      6    33    30,000      8    30    32,000
4    34    38,000      10   34    46,000      9    27    26,000
5    27    36,000      9    27    26,000      10   34    46,000
                                              11   23    21,000
                                              12   30    43,000
                                              13   24    29,000
                                              14   23    26,000
                                              15   28    35,000
Avg  30.2  40,200      Avg  30.2  33,000      Avg  28.1  31,600

Review - Propensity Score Matching
- Instead of matching over K dimensions, the method of propensity score matching (PSM) allows the matching problem to be reduced to a single dimension
- The propensity score is defined as the treatment probability conditional on a set of observed variables \( X_i \):
  \[ p(X_i) = E[D_i \mid X_i] = \Pr(D_i = 1 \mid X_i) \]
- Intuitively, the propensity score \( p(X_i) \) summarizes all information in a set of covariates \( X_i \) into a single value
- Then, by controlling for the propensity score \( p(X_i) \) we can eliminate selection bias

Review - Identification with PSM
- Under the conditional independence assumption we can identify the causal effect by controlling for covariates that affect the probability of treatment
- The propensity score theorem tells us that the only covariate you need to control for is the probability
of treatment itself: \( p(X_i) = \Pr(D_i = 1 \mid X_i) \)
- This is similar to identification with the matching estimator; the only difference is that we control for \( p(X_i) \) instead of a set of covariates
- The key condition for identification (≈ eliminating selection bias) is
  \[ E[Y_{i0} \mid p(X_i), D_i = 1] = E[Y_{i0} \mid p(X_i), D_i = 0] \]

Today’s Outline
Topic 5. Instrumental Variable
- Identification Results
- Estimation of LATE
- IV in RCT

Instrumental Variable
- The Instrumental Variable (IV) method exploits an exogenous source of variation that drives the treatment \( D_i \) but is unrelated to other confounding factors \( u_i \) that affect the outcome \( Y_i \)
- Intuitively, IV breaks the variation in the treatment \( D_i \) into two parts:
  1. Correlated with other confounding factors
  2. Uncorrelated with other confounding factors
- We use the variation in \( D_i \) that is uncorrelated with \( u_i \) to estimate the causal effect of treatment

Main Idea of Instrumental Variable
- Y is an outcome (e.g. earnings), Z is the instrument, D is the treatment (e.g. Masters degree), and u is the unobserved confounding factor (e.g. ability)
- Ability might be a confounder affecting both Masters degree and earnings
  - More able people are more likely to attain a Masters degree
  - More able people receive higher earnings
- In this case, we need to find an IV that generates variation in receiving a Masters degree D which is uncorrelated with ability u
- IV is based on a chain of causal reactions

Conditions of Valid IV
To be a valid IV, an instrument needs to satisfy the following conditions:
1. Strong first stage (relevance): Z → D
2. Exclusion Restriction (exogeneity):
   - No direct or indirect effect of the instrument Z on the outcome Y through variables other than the treatment variable D
   - That is, the instrument Z affects the outcome Y only through the treatment variable D
   - The instrument Z is uncorrelated with confounding factors u

Can we test these conditions?
- We can test whether the instrument has relevance - how?
- However, we cannot test whether the instrument is exogenous - why not?

Example: Angrist (1990)
Joshua Angrist (1990) “Lifetime Earnings and the Vietnam Era Draft Lottery: Evidence from Social Security Administrative Records”, American Economic Review. 80(3): 313-336
- The paper examines the effect of military service on lifetime income
- We will use this paper as an example to go through the key concepts of IV methods

Angrist (1990) - Empirical Challenge
- Joining military service is a personal choice
- Is there any selection bias due to unobservable confounding factors in this example?
- Risk preference:
  - Risk-loving people may voluntarily join military service
  - Risk-loving may have a negative impact on lifetime earnings (e.g. risk preference affects other decisions such as human capital investment, job search, etc.)
- Health condition:
  - Healthier people join military service
  - Healthier people also have higher earnings (e.g. more productive, more workdays, etc.)
- We need to find a valid IV for the treatment variable, joining military service, that is uncorrelated with confounding factors

Angrist (1990) - Background
Angrist (1990) uses the Vietnam draft lottery (\( Z_i \)) as an IV for joining military service
- In the 1960s and early 1970s, young American men were drafted for military service in the Vietnam war
- Concerns about the fairness of the conscription policy led to the introduction of a draft lottery in 1970
- From 1970-1972, random sequence numbers were assigned to each birth date in cohorts of 19 year olds
- Men with lottery numbers below a cutoff were eligible to be drafted while men with numbers above the cutoff were not
- The draft did not perfectly determine military service:
  - Many draft-eligible men were exempted for health and academic reasons
  - Draft-ineligible men volunteered for service

Angrist (1990) - Identification Strategy
Next, we discuss whether draft eligibility induced by the random lottery satisfies the IV conditions:
- Relevance
  - Vietnam veteran status (joining military service) was not completely determined by the lottery outcome
  - But draft eligibility (\( Z_i \)) and joining military service (\( D_i \)) are strongly correlated
- Exclusion Restriction
  - Draft eligibility is determined by a randomized lottery
  - Accordingly, eligibility should be uncorrelated with confounding factors affecting lifetime earnings
  - No reason to expect a direct effect of eligibility on earnings

IV - Potential Outcomes Framework
- Treatment Assignment
  \[ Z_i = \begin{cases} 1 & \text{if assigned to treatment group} \\ 0 & \text{if assigned to control group} \end{cases} \]
- In the study example,
  - \( Z_i = 1 \): those who are eligible for the draft (lottery number below cutoff)
  - \( Z_i = 0 \): those who are not eligible for the draft (lottery number above cutoff)

IV - Potential Outcomes Framework
- Potential Treatments
  - \( D_{iz} \): treatment status given the value of Z
  - \( D_{i1} \): treatment status if assigned to treatment group (draft eligible)
  - \( D_{i0} \): treatment status if assigned to control group (draft ineligible)
- Observed Treatment
  \[ D_i = \begin{cases} D_{i1} & \text{if } Z_i = 1 \\ D_{i0} & \text{if } Z_i = 0 \end{cases} \quad \text{or} \quad D_i = Z_i D_{i1} + (1 - Z_i) D_{i0} \]

IV - Potential Outcomes Framework
- The variation in treatment \( D_i \) (veteran status) was not entirely determined by draft eligibility \( Z_i \) but also by individual choice
- Thus, we can have \( D_{i1} \in \{0, 1\} \)
  - Those who were draft eligible could join or not join military service
- Similarly, we have \( D_{i0} \in \{0, 1\} \)
  - Those who were ineligible for the draft could still join military service

IV - Compliers and Non-compliers
- We can define four types of individuals based on whether they follow the draft eligibility results:
1. Compliers: \( D_{i0} = 0 \) and \( D_{i1} = 1 \)
   - Eligible for the draft and joins military service
   - Ineligible for the draft and does not join military service
2. Always-takers: \( D_{i1} = D_{i0} = 1 \)
   - Always joins military service regardless of draft eligibility
3. Never-takers: \( D_{i1} = D_{i0} = 0 \)
   - Never joins military service regardless of draft eligibility
4. Defiers: \( D_{i0} = 1 \) and \( D_{i1} = 0 \)
   - Eligible but does not join military service
   - Ineligible but joins military service

Identification Results for IV
- The IV method characterizes a causal chain from the instrument \( Z_i \) (draft eligibility) to the outcome \( Y_i \) (earnings) through the treatment \( D_i \) (military service)
- Intuitively, we can write this as
  \[ \text{Effect of Instrument (Z) on Outcome (Y)} = \text{Effect of Instrument (Z) on Treatment (D)} \times \text{Effect of Treatment (D) on Outcome (Y)} \]
- Rearranging this equation, we obtain
  \[ \text{Effect of Treatment (D) on Outcome (Y)} = \frac{\text{Effect of Instrument (Z) on Outcome (Y)}}{\text{Effect of Instrument (Z) on Treatment (D)}} \]
- Formally, IV can be represented as
  \[ \alpha_{IV} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} \]

Identification Assumptions for IV
Assumption 1 (Relevance)
\[ E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0] \neq 0 \]
- We need the instrument to be significantly related to the treatment

Assumption 2 (Independence)
\[ (Y_{i1}, Y_{i0}, D_{i1}, D_{i0}) \perp\!\!\!\perp Z_i \]
- The IV is independent of potential outcomes and potential treatments
- This assumption is sufficient for a causal interpretation of the reduced form:
  \[ \begin{aligned} E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0] &= E[Y_i(D_{i1}) \mid Z_i = 1] - E[Y_i(D_{i0}) \mid Z_i = 0] \\ &= E[Y_i(D_{i1})] - E[Y_i(D_{i0})] \end{aligned} \]

Identification Assumptions for IV
- Independence also means that the first stage captures the causal effect of \( Z_i \) on \( D_i \):
  \[ \begin{aligned} E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0] &= E[D_{i1} \mid Z_i = 1] - E[D_{i0} \mid Z_i = 0] \\ &= E[D_{i1} - D_{i0}] \end{aligned} \]

Assumption 3 (Exclusion Restriction)
\[ Y_{id}(Z = 1) = Y_{id}(Z = 0) = Y_{id} \]
- The potential outcomes for each treatment status depend only on the treatment \( D_i \), not the instrument \( Z_i \)
- Vietnam draft lottery example:
  - An individual’s potential earnings as a veteran or non-veteran are assumed to be unchanged by the draft eligibility status

Violation of Exclusion Restriction
- The exclusion restriction would be violated if low lottery numbers affected the decision to enroll in college (e.g.
to avoid the draft)
- If this was the case, the lottery number would be correlated with earnings because:
  1. it affected the likelihood of joining military service
  2. it affected the likelihood of educational attainment
- The fact that the lottery number is randomly assigned (which satisfies the independence assumption) does NOT ensure that the exclusion restriction is satisfied

Identification Assumptions for IV
Assumption 4 (Monotonicity)
\[ D_{i1} \geq D_{i0} \]
- Finally, we need to make one last assumption about the relationship between the instrument and treatment
- This assumption says that the treatment is monotonically increasing in the instrument
- This is sometimes called no defiers
- In the draft lottery example:
  - there is no individual who would join military service if they were draft ineligible but would not join if they were draft eligible

Identification Result - IV
Assumption 5 (IV Identifies Local Average Treatment Effect (LATE))
\[ \alpha_{IV} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} = E[Y_{i1} - Y_{i0} \mid D_{i1} > D_{i0}] \]
- IV identifies the causal effect of treatment for compliers (those who take up the treatment induced by the instrument)
- In the Vietnam Lottery example, the lottery IV can identify the causal effect of military service on lifetime earnings for those who join military service if eligible and do not join if ineligible
- Note that IV cannot say anything about the causal effect for always-takers or never-takers: \( D_{i1} - D_{i0} = 0 \)
- Defiers are ruled out through the monotonicity assumption

Proof: IV identification of LATE
\[ \begin{aligned}
\alpha_{IV} &= \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} \\
&= \frac{E[Y_{i1} D_{i1} + Y_{i0}(1 - D_{i1}) \mid Z_i = 1] - E[Y_{i1} D_{i0} + Y_{i0}(1 - D_{i0}) \mid Z_i = 0]}{E[D_{i1} \mid Z_i = 1] - E[D_{i0} \mid Z_i = 0]} \\
&= \frac{E[Y_{i1} D_{i1} + Y_{i0}(1 - D_{i1})] - E[Y_{i1} D_{i0} + Y_{i0}(1 - D_{i0})]}{E[D_{i1}] - E[D_{i0}]} \\
&= \frac{E[(Y_{i1} - Y_{i0})(D_{i1} - D_{i0})]}{E[D_{i1}] - E[D_{i0}]}
\end{aligned} \]

Proof: IV identification of LATE (cont’d)
- IV cannot identify the causal effect for always-takers or never-takers: \( D_{i1} - D_{i0} = 0 \)
- Only for compliers \( D_{i1} - D_{i0} = 1 \) and defiers \( D_{i1} - D_{i0} = -1 \)
- Monotonicity assumption rules out defiers
- Then, with only compliers we obtain the following (under monotonicity, \( E[D_{i1}] - E[D_{i0}] = P(D_{i1} - D_{i0} = 1) \)):
\[ \begin{aligned}
\alpha_{IV} &= \frac{E[(Y_{i1} - Y_{i0})(D_{i1} - D_{i0})]}{E[D_{i1}] - E[D_{i0}]} \\
&= \frac{E[(Y_{i1} - Y_{i0}) \times 1 \mid D_{i1} - D_{i0} = 1] \, P(D_{i1} - D_{i0} = 1)}{E[D_{i1}] - E[D_{i0}]} \\
&= \frac{E[(Y_{i1} - Y_{i0}) \times 1 \mid D_{i1} - D_{i0} = 1] \, P(D_{i1} - D_{i0} = 1)}{P(D_{i1} - D_{i0} = 1)} \\
&= E[Y_{i1} - Y_{i0} \mid D_{i1} > D_{i0}] = \alpha_{LATE}
\end{aligned} \]

IV identifies ATE for compliers
- Never-takers and always-takers do not change their treatment status regardless of the instrument status
- Only compliers and defiers contribute to the IV estimate
- The monotonicity assumption rules out the effect from defiers (one should think carefully about whether this assumption holds in the research setting)
- Accordingly, IV identifies the average treatment effect for compliers

Local Average Treatment Effect
- LATE is the average treatment effect for individuals who joined military service because of the draft lottery outcome
- ATE among the compliers ≠ ATE for the whole population
- To extrapolate the IV results to never-takers or always-takers we need to make further assumptions (e.g. homogeneous causal effects)
- Importantly, LATE is different when using different instruments
- Whether or not what the IV identifies (LATE) is interesting depends on the instrument

LATE and ATT
Special cases in which \( \alpha_{LATE} = \alpha_{ATT} \)
- If treatment \( D_i \) is randomized (e.g. RCT) and \( Z_i = D_i \) then everybody is a complier
  - That is, no never-taker or always-taker
- With one-sided noncompliance (i.e.
no always-taker), \( D_{i1} \in \{0, 1\} \), \( D_{i0} = 0 \):
  \[ \begin{aligned} E[Y_{i1} - Y_{i0} \mid D_{i1} > D_{i0}] &= E[Y_{i1} - Y_{i0} \mid D_{i1} = 1] \\ &= E[Y_{i1} - Y_{i0} \mid Z_i = 1, D_{i1} = 1] \\ &= E[Y_{i1} - Y_{i0} \mid D_i = 1] \end{aligned} \]
- Thus, \( \alpha_{LATE} = \alpha_{ATT} \)

IV Estimation
We can estimate \( \alpha_{IV} \) by running the following two regressions:
- Reduced form regression: the effect of the lottery draft on earnings
  \[ Y_i = \mu + \alpha_{RF} Z_i + \varepsilon_i, \qquad \alpha_{RF} = \frac{Cov(Y_i, Z_i)}{V(Z_i)} \]
- First stage regression: the effect of the lottery draft on joining military service
  \[ D_i = \theta + \alpha_{FS} Z_i + \nu_i, \qquad \alpha_{FS} = \frac{Cov(D_i, Z_i)}{V(Z_i)} \]
- The IV estimator is:
  \[ \hat{\alpha}_{IV} = \frac{\hat{\alpha}_{RF}}{\hat{\alpha}_{FS}} = \frac{\widehat{Cov}(Y_i, Z_i)}{\widehat{Cov}(D_i, Z_i)} \]

Two Stage Least Squares (TSLS)
- In practice we often estimate IV using Two Stage Least Squares
- If identification assumptions only hold after conditioning on X, covariates are often introduced using TSLS regression
- It is called TSLS because you could estimate it as follows:
  1. Obtain the first stage fitted values:
     \[ \hat{D}_i = \hat{\theta} + \hat{\alpha}_{FS} Z_i + X'\hat{\beta} \]
  2. Plug the first stage fitted values into the “second-stage equation”
     \[ Y_i = X'\gamma + \alpha_{TSLS} \hat{D}_i + u_i^* \]

Two Stage Least Squares (TSLS)
- However, this manual two-step estimation will give you the wrong standard errors
- Following the two-step procedure, we calculate the standard errors based on \( u_i^* \)
  \[ u_i^* = Y_i - X'\gamma - \alpha_{TSLS} \hat{D}_i \]
- But what we really want is the standard errors based on \( u_i \)
  \[ u_i = Y_i - X'\gamma - \alpha_{TSLS} D_i \]
- In practice, Stata and other regression software will return the corrected standard errors

Angrist (1990) - Main Results
- First stage results:
  - Having a low lottery number (being eligible for the draft) increases the probability of veteran status by about 16 percentage points
  - The mean of veteran status is about 27 percent
- Second stage results:
  - Serving in the army lowers earnings by $2,050 - $2,741 per year
- Placebo test on exclusion restriction:
  - Earnings of the 1969 cohort should not be affected by the 1970 draft lottery outcome (the draft lottery should affect earnings only through military service, but the 1969 cohort was not subject to the draft lottery)
  - Author finds no evidence of an
association between draft eligibility and earnings among the 1969 cohort

Empirical Paper: Angrist and Krueger (QJE, 1991)
Joshua Angrist and Alan Krueger (1991). Does Compulsory School Attendance Affect Schooling and Earnings? Quarterly Journal of Economics. 106(4): 979-1014.
- The paper studies the impact of the Compulsory Attendance Law (CAL) on earnings in the U.S.
- Influential study in the education literature
- Uses quarter of birth as an instrument for schooling
  - Some students who are born earlier in the year can legally drop out whereas students born later in the year have to remain in school until the end of the grade
- Finds that educational attainment and earnings are higher for those compelled to stay longer in school because of CAL

AK(1991) - Institutional Setting
Quarter of Birth and Years of Schooling
- Suppose that the probability of dropout is constant across birthdays
- Schools admit students to first grade only if they reach age six by January 1 of the academic year they enter
- Students born earlier in the year are older (i.e. reach their birthday sooner) than students born later
- Compulsory Attendance Law requires students to stay in school until they reach a certain age (e.g.
16)
- Students born in earlier months can legally drop out with less schooling than students born later

Years of Education and Quarter of Birth in Census Data
- Increasing trend in average education for men born between 1930 and 1950
- Average education is higher for individuals born near the end of the year than for individuals born early in the year

De-trended Years of Education
[Detrending equation not recovered from the slide]
where
- \( E_{icj} \): educational outcome of individual i born in year c and quarter j
- \( MA_{cj} = (E_{-2} + E_{-1} + E_{+1} + E_{+2})/4 \)
- \( Q_{icj} \): dummy for quarter of birth

Detrended Years of Education and Quarter of Birth
Black bars indicate that average education of men born in the first quarter is less than that of men born in surrounding quarters

Earnings and Quarter of Birth
- Men born in the first quarter of the year tend to have lower earnings than men born in nearby months

Baseline Estimates - Returns to Education
\[ \text{Wald} = \frac{\Delta \text{Earnings}}{\Delta \text{Education}} \]

AK(1991) - Two Stage Least Squares
- The first stage regression is:
  \[ E_i = X'\pi_{10} + \pi_{11} Q_{1i} + \pi_{12} Q_{2i} + \pi_{13} Q_{3i} + Y_i\theta + \varepsilon_i \]
- The second stage regression is:
  \[ W_i = X'\pi_{20} + \pi_{21} E_i + Y_i\theta + \eta_i \]
- \( E_i \): years of education of individual i
- \( W_i \): weekly wage of individual i
- \( Y_i \): year indicator
- \( Q_i \): quarter of birth indicator
What is the reduced form regression?

First Stage Estimation Results
Earlier birth associated with less education
⇓
Caused by (1) CAL or (2) Other factors

TSLS Estimates - Returns to Education for 1920-1929 Cohort
- An additional year of education increases weekly earnings by 7 percent
- OLS and TSLS provide similar estimates

TSLS Estimates - Returns to Education for 1930-1939 Cohort
- An additional year of education increases weekly earnings by 8 percent
- OLS and TSLS estimates are again quite similar

TSLS Estimates - Returns to Education for Black Men
- Returns to education are lower for black men relative to white men: 5% vs 8%
- Possible explanation for lower returns to education?
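To make the Wald ratio and the first-stage and second-stage regressions above concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the function names (`simulate`, `wald`, `two_step`), the parameter values (a true return of 0.08, a first-stage shift of -0.5 years), and the data itself, which is not the Angrist and Krueger census sample. With one binary instrument and no covariates, the Wald ratio and the manual two-step point estimate should coincide, while OLS is pulled away by the unobserved confounder.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)


def simulate(n=50000, true_return=0.08, seed=0):
    """Hypothetical data: z shifts schooling; ability confounds schooling and wages."""
    rng = random.Random(seed)
    z, educ, wage = [], [], []
    for _ in range(n):
        zi = 1 if rng.random() < 0.25 else 0             # e.g. born in the first quarter
        ability = rng.gauss(0, 1)                         # unobserved confounder
        ei = 12 - 0.5 * zi + ability + rng.gauss(0, 1)    # years of education
        wi = 5 + true_return * ei + 0.3 * ability + rng.gauss(0, 0.5)  # log wage
        z.append(zi)
        educ.append(ei)
        wage.append(wi)
    return z, educ, wage


def wald(z, d, y):
    """Difference in mean outcome divided by difference in mean treatment, by z."""
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    d1 = mean([di for di, zi in zip(d, z) if zi == 1])
    d0 = mean([di for di, zi in zip(d, z) if zi == 0])
    return (y1 - y0) / (d1 - d0)


def two_step(z, d, y):
    """Manual 2SLS point estimate: regress d on z, then y on the fitted values."""
    b = slope(z, d)
    a = mean(d) - b * mean(z)
    d_hat = [a + b * zi for zi in z]
    return slope(d_hat, y)


z, educ, wage = simulate()
print("OLS      :", round(slope(educ, wage), 3))
print("Wald     :", round(wald(z, educ, wage), 3))
print("Two-step :", round(two_step(z, educ, wage), 3))
```

The sketch reproduces only the point estimate. As the TSLS slides note, standard errors taken from a manually run second stage are not the ones we want, so in applied work a dedicated IV routine is the safer choice.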
Weak Instruments
- As you will know, IV is consistent but not unbiased
- For a long time, researchers estimating IV models paid little attention to the small-sample bias
- In the early 1990s, however, a number of papers highlighted that IV can be severely biased, in particular if instruments are weak (i.e. the first stage relationship is weak) and if you use many instruments to instrument for one endogenous variable (i.e. there are many overidentifying restrictions)
- In the worst case, if the instruments are so weak that there is no first stage, the 2SLS sampling distribution is centered on the probability limit of OLS

Weak Instruments - Bias Towards OLS
- Let’s consider a model with a single endogenous regressor and a simple constant treatment effect
- The causal model of interest is:
  \[ Y = \beta X + \varepsilon \qquad (1) \]
- The instrumental variable is Z with the first stage equation:
  \[ X = \gamma Z + \eta \qquad (2) \]
- If \( \varepsilon \) and \( \eta \) are correlated, estimating equation (1) by OLS would lead to biased results
- The OLS bias is:
  \[ E[\hat{\beta}_{OLS} - \beta] = \frac{Cov(\varepsilon, X)}{Var(X)} = \frac{Cov(\varepsilon, \eta)}{Var(X)} \]

Weak Instruments - Bias Towards OLS
- It can be shown that the bias of 2SLS is approximately:
  \[ E[\hat{\beta}_{2SLS} - \beta] \approx \frac{Cov(\varepsilon, \eta)}{Var(\eta)} \cdot \frac{1}{F + 1} \]
  where F is the population analogue of the F-statistic for the joint significance of the instruments in the first stage regression (See MHE pp. 206-208 for derivation)
- If the first stage is weak (i.e. \( F \to 0 \)) the bias of 2SLS approaches \( Cov(\varepsilon, \eta) / Var(\eta) \)
- This is the same as the OLS bias when \( \gamma = 0 \) in equation (2) (i.e. there is no first stage relationship): \( \gamma = 0 \) implies \( Var(X) = Var(\eta) \), and therefore the OLS bias \( Cov(\varepsilon, \eta) / Var(X) \) becomes \( Cov(\varepsilon, \eta) / Var(\eta) \)
- If the first stage is very strong (\( F \to \infty \)) the IV bias goes to 0

Weak Instruments - Adding More Instruments
- Adding more weak instruments will increase the bias of 2SLS.
By adding further instruments without predictive power, the first stage F-statistic goes towards 0 and the bias increases
- If the model is just identified, weak instrument bias is less of a problem
- Bound, Jaeger, and Baker (1995) highlighted this problem for the Angrist and Krueger (1991) study, which uses different sets of instruments:
  - quarter of birth dummies - 3 instruments
  - quarter of birth + quarter of birth × year of birth dummies - 30 instruments
  - quarter of birth + quarter of birth × year of birth + quarter of birth × state of birth - 180 instruments

BJB (1995) - Adding Weak Instruments in AK (1991)
- Adding more weak instruments reduces the first stage F-statistic and moves the IV coefficient towards the OLS coefficient

What can you do if you have weak instruments?
With weak instruments you have the following options:
1. Use a just identified model with your strongest IV. If the instrument is very weak, however, your standard errors will probably be very large
2. Use a limited information maximum likelihood estimator (LIML)
   - This provides the same asymptotic distribution as 2SLS (under constant effects) but provides a finite-sample bias reduction
   - LIML can be estimated in Stata using the ivregress command
3. Find stronger instruments

Empirical Paper: Angrist and Evans (1998)
Joshua Angrist and William Evans (1998) “Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size”, American Economic Review. 88(3): 450-477.
- Paper examines the effect of childbearing on female labor supply
- The relationship between fertility and labor supply has long been of interest to labor economists
- Challenge is selection bias: mothers with low potential earnings may be more likely to decide to have children

AE (1998) - Identification Strategy
- The authors solve the selection bias problem using instrumental variables:
1. Multiple births (twin):
   - The occurrence of a multiple birth is essentially random and unrelated to potential outcomes or demographic characteristics
2. Sibling sex composition:
   - American parents with two children are much more likely to have a third child if the first two are of the same sex than if they are of mixed sex
   - Possible if American parents prefer having a mixed-sex composition
   - The same-sex instrument is based on the claim that sibling sex composition is random and affects mother’s labor supply exclusively through increasing fertility

AE (1998) - Identification Strategy
- They want to estimate the effect of childbearing (\( D_i \)) on female labor supply (\( Y_i \))
  - \( D_i \): having a third birth
  - \( Y_i \): employment, weeks worked
- Two instrumental variables:
  1. Twins instrument (\( Z_{1i} \)) is an indicator for multiple births at second pregnancy in a sample of mothers with at least two children
  2. Sibling sex instrument (\( Z_{2i} \)) is an indicator for the first two children having the same sex
- If they are both valid IVs, would they show similar results?
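One way to build intuition for whether two valid instruments should give similar results is a small simulation. The sketch below is hypothetical throughout: it does not use the Angrist and Evans data, and the type shares, the effect sizes (-2 and -6), and the names `simulate` and `wald` are assumptions chosen for illustration. Units of type A respond only to the first instrument and units of type B only to the second, so monotonicity holds for each instrument but the two instruments have different compliers.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def wald(z, d, y):
    """Wald estimator for a binary instrument z, treatment d, outcome y."""
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    d1 = mean([di for di, zi in zip(d, z) if zi == 1])
    d0 = mean([di for di, zi in zip(d, z) if zi == 0])
    return (y1 - y0) / (d1 - d0)


def simulate(n=40000, seed=1):
    """Hypothetical population with heterogeneous effects and two randomized instruments."""
    effects = {"A": -2.0, "B": -6.0, "N": -4.0}   # assumed treatment effect by type
    rng = random.Random(seed)
    z1, z2, d, y = [], [], [], []
    for _ in range(n):
        u = rng.random()
        kind = "A" if u < 0.3 else ("B" if u < 0.6 else "N")
        a = 1 if rng.random() < 0.5 else 0        # instrument 1, randomly assigned
        b = 1 if rng.random() < 0.5 else 0        # instrument 2, randomly assigned
        # Type A takes treatment only when instrument 1 is on, type B only when
        # instrument 2 is on, type N never takes it.
        di = 1 if (kind == "A" and a == 1) or (kind == "B" and b == 1) else 0
        yi = 20 + effects[kind] * di + rng.gauss(0, 1)   # e.g. weeks worked
        z1.append(a)
        z2.append(b)
        d.append(di)
        y.append(yi)
    return z1, z2, d, y


z1, z2, d, y = simulate()
print("Wald with instrument 1:", round(wald(z1, d, y), 2))
print("Wald with instrument 2:", round(wald(z2, d, y), 2))
```

In this setup both instruments are valid by construction, yet each Wald ratio recovers the average effect for its own compliers (type A for the first instrument, type B for the second), so the two estimates need not agree.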
AE (1998) - Wald Estimates

AE (1998) - TSLS Estimates

AE (1998) - Results
- Twins and same-sex instruments both suggest that the birth of a third child has a large negative impact on mother’s employment status (weeks worked and hours worked)
- Interestingly, the estimate using same sex shows a larger effect of childbearing than the estimate using twins
- Different instruments need not generate similar estimates of the causal effect even when both are valid IVs
- This is because IV only identifies local average treatment effects (ATE for compliers)
- Thus, different instruments have different compliers, leading to different LATE

IV in Randomized Experiment with Partial Compliance
- IV methods can also be useful when evaluating a randomized control trial (RCT)
- In many RCTs, actually receiving treatment is voluntary among those randomly assigned to treatment
- This is called a randomized experiment with partial (imperfect) compliance
  - Those who are assigned to treatment might NOT take up treatment
  - Those who are NOT assigned to treatment might take up treatment
- Selection bias: only those who are particularly likely to benefit from treatment will actually take up treatment

IV in Randomized Experiment with Partial Compliance
- If we just compare means between treated and untreated individuals, OLS estimates will be biased
  - Since some individuals in the treatment (control) group self-select into treatment
- Solution: use randomly assigned treatment \( Z_i \) as an instrument for receiving treatment \( D_i \)
- This would allow us to estimate the Local Average Treatment Effect (LATE) - the average treatment effect for compliers

Empirical Paper: Bloom et al. (1997)
Bloom et al. (1997) “The Benefits and Costs of JTPA Title II-A Programs: Key Findings from the National Job Training Partnership Act Study”, Journal of Human Resources.
32(3): 549-576
- The paper estimates the effect of the job training program on earnings
- The program provides job training to people facing barriers to employment (e.g. dislocated workers, disadvantaged youth)

Bloom et al. (1997) - Partial Compliance
- Among 20,601 sample members, two thirds were assigned to the treatment group and offered enrollment in the job training program (Title II-A), and one third was assigned to the control group
- Only 60 percent of those who were offered the training actually received it
- 2 percent of people in the control group also received training
- Authors evaluate the differences in earnings in the 30-month follow-up period after random assignment

Bloom et al. (1997) - Partial Compliance
- Simply comparing those who participate in the job training program vs. those who do not participate will lead to selection bias
- The authors use random assignment of job training \( Z_i \) as an IV for actual job training \( D_i \)
- Since the compliance problem is largely confined to the treatment group, we have little concern about always-takers, such that \( \alpha_{LATE} \approx \alpha_{ATT} \)

Bloom et al. (1997) - Comparison of Estimation Results

                         Comparisons by      Comparisons by      I.V. Estimates
                         Training Status     Assignment Status
                         (1)       (2)       (3)       (4)       (5)       (6)
Men                      3970      3754      1117      970       1825      1593
                         (555)     (536)     (569)     (546)     (928)     (895)
Women                    2133      2215      1243      1139      1942      1780
                         (345)     (334)     (359)     (341)     (560)     (532)
Individual Covariates?   No        Yes       No        Yes       No        Yes

- Columns (1) and (2) show OLS estimates
  - The difference in earnings between those who are trained and those who are not
  - These estimates could be upward-biased because individuals assigned to treatment could decline to take up treatment

Bloom et al. (1997) - Comparison of Estimation Results
- Columns (3) and (4) compare individuals according to whether or not they were assigned to treatment
- This estimate is called the intention to treat (ITT) or reduced form estimate

Intention to Treat (ITT) Effect
- Since \( Z_i \) is randomly assigned, the ITT effect has a causal interpretation
- It provides us the causal effect of offering treatment (not actually receiving treatment)
- When some of those offered treatment do not take it up, the ITT estimate will be smaller than the average treatment effect on the treated
- To derive the LATE, you divide the ITT by the difference in treatment rates between the original treatment and control group
  \[ \alpha_{IV} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]} \]
- If treatment among the control group is negligible (i.e. \( E[D_i \mid Z_i = 0] = 0 \)) then the denominator is simply the take-up rate (i.e. \( E[D_i \mid Z_i = 1] \))

Bloom et al. (1997) - Comparison of Estimation Results
- Columns (5) and (6) show IV estimates - Local Average Treatment Effect (LATE)
- Here we actually estimate the ATT (not only LATE) because there are almost no always-takers
- The treated population consists (almost) entirely of compliers

Practical Tips for IV Strategy
1. Check relevance between instrument and treatment variables
   - Does this IV make sense?
   - Do the first stage coefficients have the right magnitude and sign?
   - Report the F-statistic from the first stage regression
   - Stock, Wright, and Yogo (2002) suggest that F > 10 indicates that you do not have a weak instrument problem
   - If instruments are weak, then the TSLS estimator is biased and the t-statistic has a non-normal distribution

Practical Tips for IV Strategy
2. Check exclusion restriction
   - The exclusion restriction cannot be tested directly, but it can be falsified
   - Placebo test
     - Test the reduced form effect of \( Z_i \) on \( Y_i \) in situations where it is impossible or extremely unlikely that \( Z_i \) could affect \( D_i \)
     - Because \( Z_i \) cannot affect \( D_i \), the exclusion restriction implies that this placebo test should show no effect
3. If you have many IVs, pick your best instrument and use the just identified model (e.g. use one IV with one endogenous regressor; two IVs with two endogenous regressors, etc.)
4. The reduced form result as a litmus test
   - If you can’t find a significant relationship in the reduced form regression then it is probably not there

Next Lecture
Topic 5. Instrumental Variable II
- Empirical papers:
  - Acemoglu, Johnson, and Robinson (2001) “The Colonial Origins of Comparative Development: An Empirical Investigation”, American Economic Review. 91(5): 1369-1401.
  - Park, S. (2019) “Socializing at Work: Evidence from a Field Experiment with Manufacturing Workers”, American Economic Journal: Applied Economics. 11(3): 424-455.
- Data session with IV
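Ahead of the data session, the first-stage check from the practical tips can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a replacement for a regression package: it handles a single instrument with an intercept, assumes homoskedastic errors (so the F-statistic equals the squared t-statistic on the instrument), and the names `first_stage_f` and `simulate_first_stage` as well as the values of `gamma` are hypothetical.

```python
import random


def first_stage_f(z, d):
    """F-statistic for H0: slope = 0 in a regression of d on one instrument z.

    Assumes an intercept and homoskedastic errors; with a single instrument
    this is the squared t-statistic on z.
    """
    n = len(z)
    z_bar = sum(z) / n
    d_bar = sum(d) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zd = sum((zi - z_bar) * (di - d_bar) for zi, di in zip(z, d))
    slope = s_zd / s_zz
    intercept = d_bar - slope * z_bar
    ssr = sum((di - intercept - slope * zi) ** 2 for zi, di in zip(z, d))
    sigma2 = ssr / (n - 2)                 # residual variance estimate
    return slope ** 2 / (sigma2 / s_zz)    # slope^2 / se(slope)^2


def simulate_first_stage(gamma, n=1000, seed=0):
    """Hypothetical first stage d = gamma * z + eta with standard normal z and eta."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    d = [gamma * zi + rng.gauss(0, 1) for zi in z]
    return z, d


for gamma in (0.5, 0.02):
    z, d = simulate_first_stage(gamma)
    print(f"gamma = {gamma}: first-stage F = {first_stage_f(z, d):.1f}")
```

Comparing the printed F-statistics against the F > 10 rule of thumb shows how the same sample size can give a comfortable or a worrying first stage depending on how strongly the instrument moves the treatment.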