PS5 for Econometrics 101, Warwick Econ Ph.D

Exercise 1: Using number of calls before response to recover the average effect of the treatment within a subgroup (Behaghel et al., 2012)

This question is a continuation of exercise 1 in the previous problem set and of the question on non-response in last year's exam. In this exercise, we assume that households were not sent a letter, but were instead called over the phone to ask whether someone in the household runs a business. Some households did not answer the first phone call, so they had to be called again. Households were called a maximum of 25 times. If they could not be reached within 25 calls, they were regarded as non-respondents. In this context, $R_i = 0$ if household $i$ did not take any of the 25 calls, and $R_i = 1$ otherwise.

The graph below shows the response rate in the treatment and control groups as a function of the number of calls. For instance, you can see that slightly less than 50% of treated households answered the questionnaire after 5 calls, while in the control group this rate was reached only after 25 calls.

[Figure 3: Response rates by assignment according to the number of phone calls. Horizontal axis: number of calls (0 to 25); vertical axis: response rate (0 to 0.6); series: Control and Treatment.]

a) What is the value of $\hat{P}(R_i = 1 \mid D_i = 1)$? Of $\hat{P}(R_i = 0 \mid D_i = 0)$? Will Lee bounds (those studied in last year's end-of-term exam) be wide or narrow in this application?

b) Explain intuitively how you could use the number of calls to recover the effect of the treatment in a subpopulation. Hint: the red lines on the graph might help. On which assumption does this method rely?

c) How could you test whether that assumption is likely to hold?

Exercise 2: Group-level OLS regressions with time and group fixed effects estimate a weighted sum of Wald-DIDs (de Chaisemartin and D'Haultfoeuille, 2015)
Assume you observe repeated cross-sections of the same population at various dates, and that units can belong to various groups (e.g. counties or states). The date at which a unit is observed is represented by a random variable $T \in \{0, 1, ..., \bar{t}\}$, and the group a unit belongs to is represented by a random variable $G \in \{0, 1, ..., \bar{g}\}$. For instance, if there are 5 periods (0, 1, 2, 3, 4) in the data and groups are American states, $\bar{t} = 4$ while $\bar{g} = 49$. Let $Y$ be an outcome variable. Let $D$ be a treatment variable. Let $\beta$ denote the coefficient of $D$ in a 2SLS regression of $Y$ on a constant, group dummies $(1\{G = g\})_{1 \leq g \leq \bar{g}}$, time dummies $(1\{T = t\})_{1 \leq t \leq \bar{t}}$, and $D$, with a first stage fully saturated in $(T, G)$. The goal of this exercise is to show that if $T \perp\!\!\!\perp G$, then $\beta$ is equal to a weighted sum of Wald-DIDs.

To lighten the notation, for every random variable $X$ and for any $(g, t) \in \{0, 1, ..., \bar{g}\} \times \{0, 1, ..., \bar{t}\}$, let $X_{gt}$ denote a random variable with the same probability distribution as $X \mid G = g, T = t$. For instance, $D_{00}$ denotes a random variable with the same probability distribution as $D \mid G = 0, T = 0$. Because these two random variables have the same probability distribution, $E(X_{gt}) = E(X \mid G = g, T = t)$.

For every $(g, g', t) \in \{0, ..., \bar{g}\}^2 \times \{1, ..., \bar{t}\}$, let

$$\mathrm{DID}_D(g, g', t) = E(D_{gt}) - E(D_{gt-1}) - \big(E(D_{g't}) - E(D_{g't-1})\big).$$

$\mathrm{DID}_D(g, g', t)$ is the diff-in-diff comparing the evolution of the mean of the treatment variable between groups $g$ and $g'$ and between periods $t - 1$ and $t$. Assume that for every $1 \leq t \leq \bar{t}$, the mean of treatment does not follow a parallel evolution in any pair of groups between $t - 1$ and $t$ [1]. This is equivalent to assuming that $\mathrm{DID}_D(g, g', t) \neq 0$ for every $(g, g', t)$ with $g \neq g'$. If this is satisfied, then we can define

$$W_{\mathrm{DID}}(g, g', t) = \frac{E(Y_{gt}) - E(Y_{gt-1}) - \big(E(Y_{g't}) - E(Y_{g't-1})\big)}{E(D_{gt}) - E(D_{gt-1}) - \big(E(D_{g't}) - E(D_{g't-1})\big)}.$$

$W_{\mathrm{DID}}(g, g', t)$ is the diff-in-diff comparing the evolution of the mean of the outcome variable between groups $g$ and $g'$ and between periods $t - 1$ and $t$, divided by the same diff-in-diff but for the treatment variable.

[1] If for some $t$, there are groups which experience a parallel evolution of their mean treatment between $t - 1$ and $t$, the formula that you will obtain in question f.8) remains valid after grouping together these groups.

Finally, for $(g, t) \in \{1, ..., \bar{g}\} \times \{1, ..., \bar{t}\}$, let

$$w^a_{gt} = \frac{\mathrm{DID}_D(g, g-1, t)\, P(G \geq g)\, P(T \geq t)\, \big(E(D \mid G \geq g, T \geq t) - E(D \mid G \geq g) - E(D \mid T \geq t) + E(D)\big)}{\sum_{g'=1}^{\bar{g}} \sum_{t'=1}^{\bar{t}} \mathrm{DID}_D(g', g'-1, t')\, P(G \geq g')\, P(T \geq t')\, \big(E(D \mid G \geq g', T \geq t') - E(D \mid G \geq g') - E(D \mid T \geq t') + E(D)\big)}.$$

a) First, let's try to interpret the assumption that $T \perp\!\!\!\perp G$ by considering an example. If the data is a survey of a representative sample of 100 000 individuals in the American population, conducted in three different years (100 000 individuals surveyed in 1991, 100 000 different individuals surveyed in 1992, 100 000 different individuals surveyed in 1993), and $G$ represents individuals' state of residence, what does the assumption that $T \perp\!\!\!\perp G$ require? Is it likely to be satisfied in this example?

b) What is the predicted value for $D$ from the first stage regression? (Remember that the first stage is fully saturated in $(T, G)$, which means that it has all the time dummies, all the group dummies, and all their interactions.)

c) Conclude from question b) that

$$\beta = \frac{\mathrm{cov}(Y, \tilde{Z})}{V(\tilde{Z})},$$

where $\tilde{Z}$ is the residual from an OLS regression of $E(D \mid G, T)$ on a constant, group dummies $(1\{G = g\})_{1 \leq g \leq \bar{g}}$, and time dummies $(1\{T = t\})_{1 \leq t \leq \bar{t}}$.

d) Show that $V(\tilde{Z}) = \mathrm{cov}(E(D \mid T, G), \tilde{Z}) = \mathrm{cov}(D, \tilde{Z})$. Conclude that

$$\beta = \frac{\mathrm{cov}(Y, \tilde{Z})}{\mathrm{cov}(D, \tilde{Z})}.$$

e) Let $\alpha$, $\alpha_g$, and $\alpha_t$ respectively denote the coefficients of the constant and of the group and time dummies in the regression of $E(D \mid G, T)$ on a constant, group dummies $(1\{G = g\})_{1 \leq g \leq \bar{g}}$, and time dummies $(1\{T = t\})_{1 \leq t \leq \bar{t}}$.
e.1) As $T \perp\!\!\!\perp G$, one can show that $\alpha_t$ is equal to the coefficient of the dummy $1\{T = t\}$ in a regression of $E(D \mid G, T)$ on a constant and time dummies $(1\{T = t\})_{1 \leq t \leq \bar{t}}$ (without the group dummies). Use this to give a formula for $\alpha_t$.

e.2) As $T \perp\!\!\!\perp G$, one can show that $\alpha_g$ is equal to the coefficient of the dummy $1\{G = g\}$ in a regression of $E(D \mid G, T)$ on a constant and group dummies $(1\{G = g\})_{1 \leq g \leq \bar{g}}$ (without the time dummies). Use this to give a formula for $\alpha_g$.

e.3) Conclude from the two preceding questions that

$$\tilde{Z} = E(D \mid G, T) - \alpha - \big(E(D \mid G) - E(D \mid G = 0)\big) - \big(E(D \mid T) - E(D \mid T = 0)\big).$$

f) Now that we have an explicit expression for $\tilde{Z}$, we can try to rewrite $\beta$. Let's first consider its numerator.

f.1) Show that

$$\mathrm{cov}(Y, \tilde{Z}) = E\Big(\big(E(D \mid G, T) - E(D \mid G) - E(D \mid T) + E(D)\big)\, E(Y \mid G, T)\Big).$$

f.2) Deduce from this that, as $T \perp\!\!\!\perp G$,

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=0}^{\bar{t}} \sum_{g=0}^{\bar{g}} P(G = g) P(T = t) \big(E(D_{gt}) - E(D \mid G = g) - E(D \mid T = t) + E(D)\big)\, E(Y_{gt}).$$

f.3) Deduce from this that

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=0}^{\bar{t}} \sum_{g=0}^{\bar{g}} P(G = g) P(T = t) \big(E(D_{gt}) - E(D \mid G = g) - E(D \mid T = t) + E(D)\big) \big(E(Y_{gt}) - E(Y_{g0}) - (E(Y_{0t}) - E(Y_{00}))\big).$$

You need to check that the three terms added relative to question f.2) sum to 0.

f.4) Deduce from this that

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} P(G = g) P(T = t) \big(E(D_{gt}) - E(D \mid G = g) - E(D \mid T = t) + E(D)\big) \big(E(Y_{gt}) - E(Y_{g0}) - (E(Y_{0t}) - E(Y_{00}))\big).$$

f.5) Deduce from this that

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} P(G = g) P(T = t) \big(E(D_{gt}) - E(D \mid G = g) - E(D \mid T = t) + E(D)\big) \sum_{t'=1}^{t} \sum_{g'=1}^{g} W_{\mathrm{DID}}(g', g'-1, t')\, \mathrm{DID}_D(g', g'-1, t').$$

You need to show that

$$E(Y_{gt}) - E(Y_{g0}) - (E(Y_{0t}) - E(Y_{00})) = \sum_{t'=1}^{t} \sum_{g'=1}^{g} W_{\mathrm{DID}}(g', g'-1, t')\, \mathrm{DID}_D(g', g'-1, t').$$

It is easier to do if you start from the right-hand side.

f.6) Deduce from this that

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} W_{\mathrm{DID}}(g, g-1, t)\, \mathrm{DID}_D(g, g-1, t) \sum_{t'=t}^{\bar{t}} \sum_{g'=g}^{\bar{g}} P(G = g') P(T = t') \big(E(D_{g't'}) - E(D \mid G = g') - E(D \mid T = t') + E(D)\big).$$

f.7) Deduce from this that

$$\mathrm{cov}(Y, \tilde{Z}) = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} W_{\mathrm{DID}}(g, g-1, t)\, \mathrm{DID}_D(g, g-1, t)\, P(G \geq g) P(T \geq t) \big(E(D \mid G \geq g, T \geq t) - E(D \mid G \geq g) - E(D \mid T \geq t) + E(D)\big).$$

f.8) Similarly, one can show that

$$\mathrm{cov}(D, \tilde{Z}) = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} \mathrm{DID}_D(g, g-1, t)\, P(G \geq g) P(T \geq t) \big(E(D \mid G \geq g, T \geq t) - E(D \mid G \geq g) - E(D \mid T \geq t) + E(D)\big).$$

Conclude from this and question f.7) that

$$\beta = \sum_{t=1}^{\bar{t}} \sum_{g=1}^{\bar{g}} W_{\mathrm{DID}}(g, g-1, t)\, w^a_{gt}.$$

As you saw in question b), this 2SLS regression is equivalent to an OLS regression of $Y$ on time dummies, group dummies, and the mean of the treatment variable in each group × period cell. An interesting special case of this type of OLS regression is when the unit of observation is a group × period cell (say county × year), and the dependent variable is the average value of $Y$ in each group × period cell. This type of group-level regression is extremely frequent in economics. In such instances, groups are stable over time (unless counties or states appear or disappear over the period under consideration, which is not going to be the case with 20th century data), thus implying that the assumption that $T \perp\!\!\!\perp G$ will be satisfied. For instance, Gentzkow et al. (2011) estimate a regression of political participation in county $c$ and year $t$ on county and year dummies, and on the number of newspapers available in county $c$ and year $t$. The result you just derived shows that the coefficient of the number of newspapers in their regression is equal to a weighted sum of Wald-DIDs, which are comparisons of the evolution of political participation across pairs of counties and consecutive elections, scaled by the same comparison but for the evolution of the number of newspapers available.
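The final identity can be checked numerically before attempting the proof. The sketch below is only an illustration under stated assumptions: the sizes `n_groups` and `n_periods`, the randomly drawn cell means `d` and `y`, and every variable name are hypothetical and not part of the problem set. It works at the population level with independent $G$ and $T$, so the regression is a cell-level least-squares fit weighted by $P(G = g)P(T = t)$, and it compares the coefficient on the cell mean of $D$ with the weighted sum of Wald-DIDs.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_periods = 4, 3  # hypothetical sizes: g = 0..3, t = 0..2

# Marginals of G and T; independence means P(G=g, T=t) = p_g[g] * p_t[t].
p_g = rng.dirichlet(np.ones(n_groups))
p_t = rng.dirichlet(np.ones(n_periods))
cell_prob = np.outer(p_g, p_t)

# Arbitrary cell means E(D_gt) and E(Y_gt); no model is imposed on Y.
d = rng.normal(size=(n_groups, n_periods))
y = rng.normal(size=(n_groups, n_periods))

# (1) Population regression of Y on a constant, group dummies, time dummies
#     and E(D|G,T), computed as weighted least squares on the cells.
g_idx, t_idx = np.meshgrid(np.arange(n_groups), np.arange(n_periods), indexing="ij")
g_idx, t_idx = g_idx.ravel(), t_idx.ravel()
X = np.column_stack(
    [np.ones(g_idx.size)]
    + [(g_idx == g).astype(float) for g in range(1, n_groups)]
    + [(t_idx == t).astype(float) for t in range(1, n_periods)]
    + [d.ravel()]
)
sqrt_w = np.sqrt(cell_prob.ravel())
coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y.ravel() * sqrt_w, rcond=None)
beta_ols = coef[-1]  # coefficient on the cell mean of D

# (2) Weighted sum of Wald-DIDs between adjacent groups and adjacent periods.
e_d = (cell_prob * d).sum()  # E(D)
wdid = np.zeros((n_groups, n_periods))
unnorm = np.zeros((n_groups, n_periods))
for g in range(1, n_groups):
    for t in range(1, n_periods):
        did_d = d[g, t] - d[g, t - 1] - (d[g - 1, t] - d[g - 1, t - 1])
        did_y = y[g, t] - y[g, t - 1] - (y[g - 1, t] - y[g - 1, t - 1])
        wdid[g, t] = did_y / did_d
        pr_g, pr_t = p_g[g:].sum(), p_t[t:].sum()  # P(G>=g), P(T>=t)
        e_d_gt = (cell_prob[g:, t:] * d[g:, t:]).sum() / (pr_g * pr_t)  # E(D|G>=g,T>=t)
        e_d_g = (cell_prob[g:, :] * d[g:, :]).sum() / pr_g              # E(D|G>=g)
        e_d_t = (cell_prob[:, t:] * d[:, t:]).sum() / pr_t              # E(D|T>=t)
        unnorm[g, t] = did_d * pr_g * pr_t * (e_d_gt - e_d_g - e_d_t + e_d)

weights = unnorm / unnorm.sum()  # the weights w^a_gt, which sum to one
beta_formula = (wdid * weights).sum()

print(beta_ols, beta_formula)
```

Because the cell means are drawn at random, the check does not rely on any model for $Y$; changing the seed or the sizes should leave the two printed numbers equal up to floating-point error. The weights sum to one by construction, but nothing in the formula forces each of them to be positive.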
