Supplementary material to: Tolerating deance? Identication of treatment eects without monotonicity. ∗ Clément de Chaisemartin April 13, 2016 Abstract This paper gathers the supplementary material to de Chaisemartin (2016). I show that one can estimate quantile treatment eects in the comvivors subpopulation, that one can test the CD condition, and that my results extend to multivariate treatment and instrument, to fuzzy regression discontinuity designs, and to the bounds in Lee (2009) for survey nonresponse. ∗ Department of Economics, University of Warwick, clementdechaisemartin@gmail.com 1 Testability and quantile treatment eects. In this section, I show that random instrument, exclusion restriction, and CD have a testable implication. Then, I discuss how the testable implications of random instrument, exclusion restriction, and ND studied in Huber & Mellace (2012), Kitagawa (2013), and Mourie & Wan (2014) relate to the CD condition. R, let S(R) denote the support of R. Let Udz be a random variable uniformly distributed on [0, 1] denoting the rank of an observation in the distribution of Y |D = d, Z = z . If the distribution of Y |D = d, Z = z is continuous, Udz = FY |D=d,Z=z (Y ). If the distribution of Y |D = d, Z = z is discrete, one can randomly allocate a rank to tied observations For any random variable by setting Udz = FY |D=d,Z=z (Y − ) + V (FY |D=d,Z=z (Y ) − FY |D=d,Z=z (Y − )), V is uniformly distributed on [0, 1] and independent of (Y, D, Z), FS . S(Y |D = d, Z = z) : y < Y }.1 For every d ∈ {0, 1}, let pd = P (D=d|Z=d) and p1 are included between 0 and 1. Finally, let where Y − = sup{y ∈ Notice that both p0 and L = E(Y |D = 1, Z = 1, U11 ≤ p1 ) − E(Y |D = 0, Z = 0, U00 ≥ 1 − p0 ) L = E(Y |D = 1, Z = 1, U11 ≥ 1 − p1 ) − E(Y |D = 0, Z = 0, U00 ≤ p0 ). Theorem S1 If Assumptions 1, 2, and 5 are satised, L ≤ W ≤ L. (29) The intuition of this result goes as follows. Under random instrument, exclusion restriction, and E(Y1 − Y0 |CV ) is point identied: following Theorem 2.1 in de Chaisemartin (2016), it is equal to W . It is also partially identied. CV is included in {D = 0, Z = 0}, and it accounts for p0 % of this population. E(Y0 |CV ) can therefore not be larger than the mean of Y0 of the p0 % of this population with largest Y0 . It can also not be smaller than the mean of Y0 of the p0 % with lowest Y0 (see Horowitz & Manski, 1995 and Lee (2009)). Combining this with a similar reasoning for E(Y1 |CV ) yields worst case bounds for E(Y1 − Y0 |CV ), L and L. The estimand of E(Y1 − Y0 |CV ) should lie within its bounds, hence the testable implication. CD, To implement this test, one can use results from Andrews & Soares (2010). Let two following moment inequality conditions: θ satisfy the 0 0 YZ Y (1 − Z) Y DZ1{U11 ≤ p1 } Y (1 − D)(1 − Z)1{U00 ≥ 1 − p0 } − − FS θ + − P (Z = 1) P (Z = 0) P (D = 1, Z = 1, U11 ≤ p1 ) P (D = 0, Z = 0, U00 ≥ 1 − p0 ) Y DZ1{U11 ≥ 1 − p1 } Y (1 − D)(1 − Z)1{U00 ≤ p0 } YZ Y (1 − Z) ≤ E FS θ + − − − . P (D = 1, Z = 1, U11 ≥ 1 − p1 ) P (D = 0, Z = 0, U00 ≤ p0 ) P (Z = 1) P (Z = 0) ≤E This denes a moment inequality model with preliminary estimated parameters. satises all the technical conditions required in Andrews & Soares (2010) if 1Y − should be understood as −∞ if Y = inf{S(Y |D = d, Z = z)}. 1 This model 2+δ E(|Y | ) < +∞ δ , and if P (Z = 0), P (Z = 1), P (D = 1, Z = 1, U11 ≤ p1 ), P (D = 1, Z = 1, U11 ≥ 1 − p1 ), P (D = 0, Z = 0, U00 ≤ p0 ), and P (D = 0, Z = 0, U00 ≥ 1 − p0 ) are all bounded by below by some strictly positive constant . Under these two mild restrictions, their Theorem 1 applies, and one can use it to construct a uniformly valid condence interval for θ . (29) is rejected when 0 does not belong to this condence interval. for some strictly positive As pointed out in Balke & Pearl (1997) and Heckman & Vytlacil (2005), random instrument, exclusion restriction, and ND also have testable implications. Huber & Mellace (2011), Kitagawa (2013), Machado et al. (2013), and Mourie & Wan (2014) have developed statistical tests of these implications. I now discuss how these tests relate to the CD condition. The test suggested in Huber & Mellace (2011) is not a test of CD. Huber & Mellace (2011) use the fact that E(Y0 |N T ) and E(Y1 |AT ) are both point and partially identied under random in- E(Y |D = 0, Z = 1) is equal to E(Y0 |N T ) (1 − p0 )E(Y0 |N T ) + p0 E(Y0 |C). This implies strument, exclusion restriction, and ND. For instance, and E(Y |D = 0, Z = 0) is equal to E(Y |D = 0, Z = 0, U00 ≤ 1 − p0 ) ≤ E(Y |D = 0, Z = 1) ≤ E(Y |D = 0, Z = 0, U00 ≥ p0 ). But under CD, E(Y |D = 0, Z = 1) is equal to P (N T ) E(Y0 |N T ) P (N T )+P (F ) + P (F ) E(Y0 |F ), so P (N T )+P (F ) the previous implication need not be true anymore. The tests suggested in Kitagawa (2013), Machado et al. (2013), and Mourie & Wan (2014) are not tests of CD, but they are tests of the CDM condition introduced hereafter. (Compliers-deers for marginals: CDM) There is a subpopulation of C denoted CF which satises: Assumption S1 P (CF ) = P (F ) Y0 |CF ∼ Y0 |F Y1 |CF ∼ Y1 |F. The CDM condition requires that a subgroup of compliers have the same size and the same marginal distributions of Y0 and Y1 as deers. CDM is stronger than CD. On the other hand, it is invariant to the scaling of the outcome, while CD is not. With non-binary outcomes, working under assumptions invariant to scaling might be desirable so one might then prefer to invoke CDM than CD, despite the fact it is a stronger assumption. Following the logic of Theorem 2.2 in de Chaisemartin (2016), a sucient condition for CDM to hold is that there be more compliers than deers in each subgroup with the same value of P (F |Y0 , Y1 ) ≤ P (C|Y0 , Y1 ). 2 (Y0 , Y1 ): One can show that under CDM, the marginal distributions of Y0 and Y1 are identied for com- vivors. The estimands are the same as in Imbens & Rubin (1997): P (D = 0|Z = 0)fY |D=0,Z=0 (y0 ) − P (D P (D = 0|Z = 0) − P (D P (D = 1|Z = 1)fY |D=1,Z=1 (y1 ) − P (D fY1 |CV (y1 ) = P (D = 1|Z = 1) − P (D fY0 |CV (y0 ) = = 0|Z = 0|Z = 1|Z = 1|Z = 1)fY |D=0,Z=1 (y0 ) = 1) = 0)fY |D=1,Z=0 (y1 ) . = 0) The testing procedures in Kitagawa (2013) and Mourie & Wan (2014) test whether the rhs of the two previous equations are positive everywhere. Under CDM, these quantities are densities so they must be positive. These procedures are therefore tests of CDM. Machado et al. (2013) focus on binary outcomes. Their procedure tests inequalities equivalent to those considered in Kitagawa (2013). Therefore, their procedure is also a test of the CDM condition. 2 Multivariate treatment and instrument Results presented in the main paper extend to applications where treatment is multivariate. z ∈ {0; 1}, Dz ∈ {0, 1, 2, ..., J} for some integer J . Let (Yj )j∈{0,1,2,...,J} j denote the corresponding potential outcomes. Let C = {D1 ≥ j > D0 } denote the subpopulation of compliers induced to go from a treatment below to above j by the instrument. Let F j = {D0 ≥ j > D1 } denote the subpopulation of deers induced to go from above to below j by the instrument. As in Angrist & Imbens (1995), I assume that the cdf of D|Z = 0 stochastically dominates that of D|Z = 1. This implies that P (Cj ) ≥ P (Fj ). Consider the following CD Assume that for every condition: (Compliers-deers for multivariate treatment: CDMU) For every j ∈ {1, 2, ..., J}, there is a subpopulation of C j denoted CFj which satises: Assumption S2 P (CFj ) = P (F j ) E(Yj − Yj−1 |CFj ) = E(Yj − Yj−1 |F j ). If CDMU is satised, one can show that J X W is equal to the following average causal response: wj E(Yj − Yj−1 |CVj ), j=1 where CVj is a subset of Cj of size P (Cj ) − P (Fj ), and wj are positive real numbers. This generalizes Theorem 1 in Angrist & Imbens (1995). Their Theorem 2 considers the case where the instrument is multivariate: Let β denote the coecient of D in a 2SLS regression of 3 Y on D Z ∈ {0, 1, 2, ..., K}. using the set of dummies (1{Z = k})1≤k≤K as instruments. Angrist & Imbens (1995) show that if Dz0 ≥ Dz for every z 0 ≥ z , β is equal to a weighted average of average causal responses. Let C jz = {Dz ≥ j > Dz−1 } jz (resp. F = {Dz ≥ j > Dz−1 }) denote the subpopulation of compliers induced to go from a treatment below to above j (resp. above to below j ) when the instrument moves from z − 1 to z . Consider the following condition: (Compliers-deers for multivariate treatment and instrument: CDMU2) For every j ∈ {1, 2, ..., J} and z ∈ {0, 1, 2, ..., K}, there is a subpopulation of C jz denoted CFjz which satises: Assumption S3 P (CFjz ) = P (F jz ) E(Yj − Yj−1 |CFjz ) = E(Yj − Yj−1 |F jz ). Under CDMU2, one can show that β is equal to a weighted average of average causal responses for comvivors. 3 Fuzzy regression discontinuity designs and survey non response My results also extend to many treatment eect models relying on ND type of conditions. An important example are fuzzy regression discontinuity (RD) designs studied in Hahn et al. (2001). Their Theorem 3 still holds if their Assumption A.3, a ND at the threshold assumption, S denote the forcing variable, and let s F = {F, S = s} respectively denote compliers is replaced by a CD at the threshold assumption. Let denote the threshold. Let s C = {C, S = s} and s and deers at the threshold. (Compliers-deers at the threshold: CDT) There is a subpopulation of C s denoted CFs which satises: Assumption S4 P (CFs |S = s) = P (F s |S = s) E(Y1 − Y0 |CFs , S = s) = E(Y1 − Y0 |F s , S = s). Finally, my results extend to the sample selection and survey non-response literatures. ND type of conditions have also been used in the sample selection and survey non-response literatures, R0 and R1 denote a subject's potential responses to a survey without and with the treatment. Let AR, RC , and RF respectively denote always respondents (subjects who satisfy R0 = R1 = 1), reponse-compliers (those who satisfy R0 = 0 and R1 = 1), and response-deers (those who satisfy R0 = 1 and R1 = 0). Under the assumption that there are no response-deers (R1 ≥ R0 ), Lee (2009) derives bounds for E(Y1 − Y0 |AR). Under the weaker assumption that a subpopulation RCF of RC has the same size and the same average treatment eect as RF , the bounds derived in Lee (2009) are valid bounds for E(Y1 − Y0 |AR ∪ RCF ). for instance in Lee (2009) or Behaghel et al. (2012). Let 4 References Andrews, D. W. K. & Soares, G. (2010), `Inference for parameters dened by moment inequalities using generalized moment selection', Econometrica 78(1), 119157. Angrist, J. D. & Imbens, G. W. (1995), `Two-stage least squares estimation of average causal eects in models with variable treatment intensity', Journal of the American Statistical Asso- ciation 90(430), pp. 431442. Balke, A. & Pearl, J. (1997), `Bounds on treatment eects from studies with imperfect compliance', Journal of the American Statistical Association 92(439), 11711176. Behaghel, L., Crépon, B., Gurgand, M. & Le Barbanchon, T. (2012), Please call again: Correcting non-response bias in treatment eect models, IZA Discussion Papers 6751, Institute for the Study of Labor (IZA). de Chaisemartin, C. (2016), `Tolerating deance? identication of treatment eects without monotonicity.'. Hahn, J., Todd, P. & Van der Klaauw, W. (2001), `Identication and estimation of treatment eects with a regression-discontinuity design', Econometrica 69(1), 201209. Heckman, J. J. & Vytlacil, E. (2005), `Structural equations, treatment eects, and econometric policy evaluation', Econometrica 73(3), 669738. Horowitz, J. L. & Manski, C. F. (1995), `Identication and robustness with contaminated and corrupted data', Econometrica 63(2), 281302. Huber, M. & Mellace, G. (2011), Testing instrument validity for late identication based on inequality moment constraints, Technical report, University of St. Gallen, School of Economics and Political Science. Huber, M. & Mellace, G. (2012), Relaxing monotonicity in the identication of local average treatment eects. Working Paper. Imbens, G. W. & Rubin, D. B. (1997), `Estimating outcome distributions for compliers in instrumental variables models', Review of Economic Studies 64(4), 555574. Kitagawa, T. (2013), `A bootstrap test for instrument validity in the heterogeneous treatment eect model', University College London-unpublished . Lee, D. S. (2009), `Training, wages, and sample selection: Estimating sharp bounds on treatment eects', Review of Economic Studies 76(3), 10711102. 5 Machado, C., Shaikh, A. & Vytlacil, E. (2013), Instrumental variables, and the sign of the average treatment eect, Technical report, Citeseer. Mourie, I. & Wan, Y. (2014), Testing late assumptions, Technical report. 6 A Proofs Proof of Theorem S1 Proving that L and L are valid bounds for E(Y1 − Y0 |CV ) relies on a similar argument as the proof of Proposition 1.a in Lee (2009). Due to a concern for brevity, I refer the reader to this paper for this part of the proof. The testable implication directly follows. QED. 7