Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries http://www.archive.org/details/instrumentalvariOOcher 9 ^libWt" HB31 .M415 1^ Massachusetts Institute of Technology Department of Economics Working Paper Series INSTRUMENTAL VARIABLE QUANTILE REGRESSION Victor Chemozhukov Christian Hansen Working Paper 06-1 October 15, 2004 Room E52-251 50 Memorial Drive Cambridge, MA 021 42 This paper can be downloaded without charge from the Social Science Research Network Paper Collection http://ssrn.com/abstract=909666 at Instrumental Variable Quantile Regression* Victor Chernozhukov and Christian Hansen May First Version: Revised: December 2004 2001 Abstract Quantile regression a response Y an increasingly important tool that estimates the conditional quantiles of is given a vector of regressors D. and can be used to measure the also in the D'ylU) is upper and lower U and [/ is median regression generalizes Laplace's not only in the center of a distribution, but V = For the linear quantile model defined by tails. strictly increaising in It usefully effect of covariates regression allows estimation of quantile specific covariate effects j(t) for r £ (0, we propose an instrumental D''f{U) where a standard uniform variable independent of D, quantile 1). In this paper, variable quantile regression estimator that appropriately modifies the conventional quantile regression and recovers quantile-specific covariate effects in an instrumental variables model defined by variable that V= may depend on D'a(U) where D'a{U) D but is strictly increasing in U and [/ is a uniform independent of a set of instrumental variables Z. The proposed is estimator and inferential procedures are computationally convenient in typical applications and can be carried out using software available for conventional quantile regression. In addition, the estimation procedure gives rise to a convenient inferential procedure that identification. The use of the proposed estimator and testing procedure is is proposed naturally robust to illustrated weak through two empirical examples. Keywords: Quantile Regression, Instrumental Variables, 1. Weak Instruments Introduction Quantile regression used in is an important tool for estimating conditional quantile many empirical studies and has been studied models that has been extensively in theoretical statistics; see Koenker and Bassett (1978), Koenker and Portnoy (1987), Portnoy (1991), Gutenbrunner and Jureckova (1992), Chaudhuri, Doksum, and Samarov (1997), Portnoy and Koenker (1997), Knight (1998), Koenker and Machado (1999), Portnoy (2001), and He and Zhu (2003). One of quantile regression's 'The Matlab software for this paper is available upon request via e-mail chansenl@gsb.uchicago.edu. Further updates of this paper can be downloaded at www.mit.edu/'vchern. Address correspondence to C. Hansen, Asst. Prof, of Econometrics and Statistics, The University of Chicago, Graduate School of Business, 5807 South Woodlawn Avenue, Chicago, IL 60637, USA, chansenl@gsb.uchicago.edu. ^Portions of this paper were previously included in "An IV Model of Quantile Treatment Effects", 2001. MIT Department of Economics Working Paper 02-07, most appealing features is its ability to estimate quantile-specific effects that describe the covariates not only on the center but also on the tails of the mean such as the effects, summary statistics of the effect impact of a covariate, they unless the variable affects both the central and the While the central regression, provide interesting fail to describe the full distributional impact tail quantiles in the same way. In addition, For example, in a study of the effectiveness of a job training program, the cases. training on the low tail of the earnings distribution will likely be of than the distribution. mean on the impact of covariates on points other than the center of the distribution interest focuses many outcome obtained through conditional impact of effect of training For an outcome quantile model Y and may be interest for public policy set of factors D outcome, the conventional linear conditional affecting the defined as Y= t/*|£)~Uniform(0, D'-yiU'), strictly increasing is effect of on the mean of the distribution. (1.1) where r —> D'f{T) more in and continuous By turbance U' as individual ability or proneness. 1), Doksum in r. (1974) interprets the dis- construction, D'^{t) is the r-quantile of Y conditional on D. This model generalizes the usual linear regression model y= by allowing D'7o quantile-specific effects of covariates + 7i(t/*) D. For a given quantile indexed by t € (0,1), the quantile specific effects 7(r) can be estimated using standard quantile regression methods (e.g. Koenker and Bassett (1978)). we develop a new estimation method In this paper, above model. The developed approach is determined non-experimentally, making it the outcomes. Specifically, Y= i—> D'a{T) is difficult to infer D'a{U), strictly increasing in r, instrumental variables that are independent of U , D the sampled makes 7(t) / will oi.{t), depend on such as ability, D are the true structural/causal effects of D on [/, C/|Z ~ Uniform(0, 1), D is U but statistically related to D. Since statistically dependent on U , and Z D \s a, set of depends on leading to biased sampling or endogeneity. This endogeneity rendering conventional quantile regression inconsistent for estimating (1.2). For example, suppose that level of training. an endogenous generalization of the we consider the model (1.2) where r for designed for settings where the observed variables Y is the hourly wage of a worker and that The unobserved disturbance U would reflect which influence the individual's wage via the equation individuals choose high levels of training, then the level of training causes dependence between U and D is an individual's imobserved personal characteristics, is Y= D'a(U). If high-ability correlated with ability, which D and implies that conventional quantile regression will overstate > the true effect of training on earnings, 7(t) a{T). Instrumental variables Z, such as random assignment to training programs in the training context, allow us to overcome this problem by D providing a source of variation in D examples where that many independent of U. There are is sampled depending on U, is i.e. The endogenously. other interesting empirical section presents a supply-demand example and a training example. Model Y— D'aQ outcome generaUzes the conventional instrumental variables model with additive disturbances (1.2) + ai{U) where U\Z ~ U{0, 1) to cases where the impact of D varies across quantiles of the A number of appealing approaches are readily available to estimate Qq in the distribution. conventional instrumental variables model with additive disturbances, including the conventional two-stage least squares (2SLS) estimator and its robust analogs by Amemiya (1982) and Chen and Portnoy (1996). The purpose of this paper to provide practical estimation is The estimator we propose (1.2). is and inference methods that can be constructed from a series of conventional quantile regressions. approach is computationally convenient and simple to implement in has already been used in empirical applications, (2004), procedure that D is will statistically and allow model many Thus, the estimation typical applications. It Hausman and Sidak will (2004), Januszewski be further illustrated with two empirical In addition, the estimation procedure leads naturally to an inference be valid even when one of the key conditions for identification of the model, that dependent on Z, The remainder detail e.g. and Chernozhukov and Hansen (2004a), and applications in this paper. for an appealing modification of the standard quantile regression of this paper for fails. is organized as follows. In Section other controls in the equations. procedures based on a set of moment 2, we define the model in more Section 3 discusses estimation and testing equations introduced in Section 2. Section 4 illustrates the use of the derived estimator and testing procedure through brief empirical examples, and Section 5 concludes. 2. The Instrumental Quantile Regression Method 2.1. The Model In this section, we more relationship defined formally define the model we will estimate. Suppose we have a structural by (2.1) Y= D'a{U) (2.2) D= S{X, Z, V), where (2.3) T D'aij) i-> + X'0{U), -1- X'0{t) [/|X, V is Z~ Uniform(0, statistically 1), dependent on U, strictly increasing in r. In these equations, • y • [/ is the scalar outcome variable of interest, is a scalar structural random variable that aggregates outcome equation, where • Z3 is a vector of endogenous variables determined by • V is a vector of unobserved disturbances determining • 2 is a vector of instrumental variables (control variables excluded from pendent of the disturbance • X is The observed We U but impact variable D (2.2), D and correlated with U, via (2.2)), (2.1) that are inde- and a vector of included control variables. variables consist of [Y, D, sampled depending on also of the unobserved factors affecting the all U X, Z), and due to the dependence between V and U, D is . shall refer to the function Sy (r |d, x) = (2.4) as the Structural Quantile Function (SQF) d'a{T) in + x'P{t) order to emphasize that is it in general a different Qy (r|d, x). The structural quantile function 5y(T|d, x) describes the quantile function of the latent outcome variable Yd = d'a{U) + X'0{U) obtained by fixing D = d and sampling the disturbance U ~ t/(0, 1) (all conditional on X). This notion of object than the conditional quantile function sampling corresponds to independent sampling of D experimental settings. Instead the sampled variable possible to estimate or still make the use of instrumental variables and U, which D is is generally not feasible outside determined via (2.2). Nevertheless, it is inference on the structural quantile function 5y(r|d, i) through Z which induce variation in D but are themselves independent of U. The 2.2. Principle Under the conditions of (2.1) and (2.3), the problem of dependence between through the presence of instrumental variables, Z, that affect the U and determination of D D is overcome but are in- dependent of U. In program evaluation studies with imperfect compliance, a simple example of an instrument tial is random assignment values of U. The presence to the treatment group, which can be used to estimate the parameters of is equivalent to the event (2.5) is done independently of the poten- of the instrumental variable leads to a set of {U < (2.1). From r}. It then follows (2.1) from and (2.3), (2.1) that P[Y<Sy{t\D,X)\Z,X] = t. moment the event equations that {Y < Sy{t\D, X)} Equation (2.5) provides parameters a and /3. a useful statistical restriction that can be used to estimate the structural It is important to notice that the equation P[Y < 5y (r|Z), X)\Z, X] = t differs from the conventional estimating equation P[Y <Qy(t\D,X)\D,X]=t (2.6) used to estimate the conditional quantile function of Y given D X and Recall from Koenker and Bassett (1978) that the ordinary quantile regression 8is finding the best predictor of Pt-(u) '= {t given — l(w < given 0))u. In other words, W under the asymmetric assuming integrability, the r-th conditional quantile of Q^(r|TV) J^ is the class of measurable functions of = 1/2 so that Pt(w) = ||m|. The = {D,X) and can be estimated using the W (restricted in applications to be a set of flexible function W finite The moment equation random variable Y— given in (2.5) Sy{t\D,X) Q (2.8) Thus, we may pose = is Qy{t\D,X) is solves the above equivalent to the statement that Qy-Sy{t\d,x){t\Z,X) a.s. for is problem with is the r-th quantile each r. the problem of finding Q(r) and /?(r) solving equation (2.5) as the instrumental K — Sy{t\D, X) on = argmin Epr\{Y - Sy[t\D^X) - where J^ a solution of this problem conditional on (Z,X): a solution to the quantile regression of (2.9) is sample analog of the above equation. variable or inverse quantile regression (IVQR). This problem term F = argmm E[pAY-f{W))] with T that formulated least absolute deviation loss parametric functions). Laplace's median regression function Qy{-5\W) of is W solves the problem: (2.7) where Y (QR) is to find an 5y(r|Z),X) such {Z, X): f{Z,X))] the class of measurable functions of {X, Z) (which will be restricted in applications). 'inverse' emphasizes an evident inverse relation of regression, (2.7). this The problem to the conventional quantile The Instrumental Quantile Regression Estimator and Derived Dual 3. In- ference Basic Description and Properties 3.1. Next we consider a finite-sample analog of the above procedure. Define the (weighted) conventional quantile regression objective function as " 1 Q„(r,Q,/3,7) := where Z) Yi := n = /(Xj, Z,) V{Xi, Zt) > - Z'a)Vu X is a dim(/3)- vector of exogenous explanatory a dim(7)- vector of instrumental variables such that dim(7) is is X[l3 t=i a dim(Q)-vector of endogenous variables, is variables, Zj and - D[a -Tpr{Y, "— a scalar weight. In practice, a simple procedure is to set > K = 1 dim(Q), and let Zi either be Z, or the predicted value from a least squares projection of Di on Zi and X,. The instrumental variable or inverse quantile regression estimator (IVQR) follows. For a given value of the structural parameter, say q, let us find an estimate for a{T), S(t) (3.2) where ^(q) = A(a) + = we will look for a value a that makes the arg inf [Wn(a)] Op(l) and A{o.) , Wn{a) := n[7(Q, W„{a) is the itself. Wald positive definite, uniformly in is statistic for testing The parameter The estimator (3.3) population problem 0, a fact that is It is convenient to -^71(7(0:, t) — we below will use 7(0, r)) in which for inference . a finite-sample instrumental variable quantile regression. Analogous to the (2.8), it finds parameter values Z close to zero as possible. This estimator and a ^ A. estimates are then given by that the value of coefficient 7(0, r) on regularity 7(a,T) = 9{t) := (5(t), J3(T)) := (q(t), ^(S(t), r)) (3.3) on the instrumental T)']I(Q)[7(Q,r)], A{a) equal to the inverse of the asymptotic covariance matrix of about a{T) coefficient as possible. Formally, let variable j{a, t) as close to case defined as 0n ' set is QR to obtain (^(Q,r),7(Q,r)) :=argmin Qn{T,a,P,'y). (31) ' ^ To run the ordinary in the is for a and through the inverse step ordinary quantile regression step (3.1) consistent is (3.2) such driven as and asymptotically normal under appropriate identification conditions: (3.4) for Q,g specified below. ^.(O{T)-0{T))^dNiO,ne), This asymptotic distribution can be used the parameter of interest using standard Wald procedures. for conducting direct inference on In addition, we can base inference on the "dual" Wald coefficients on the instruments are zero statistic whether 7(a,r) (i.e. = Wn{a) for testing When a = 0). whether the Q(r), Wn{a) is asymptotically chi-squared with dim(7) degrees of freedom: Wr,ia{T))^d X'(dim(7)) (3.5) Thus, a valid confidence region for a{T) can also be based on the inversion of this dual Wald CRp[a{T)] := {q (3.6) where Cp is : Wn{a) < Cp} contains a{T) with probability approaching p, the p-percentile of a ,\^(dim(7)) distribution. This dual approach assumptions than the direct approach; in particular, of Fi — D[aj on Xi and 2. = 1, ..., J}, 3. is detail. in practice as follows: Zi to obtain coefficients 0{aj,T) and 7(Qj,r). Save the inverse of the variance-covariance matrix of 7(q_;,t), which or F-statistic for testing 7(Qj,r) Choose 3(r) vaUd under weaker and run the ordinary r-quantile regression any common implementation of the ordinary QR, to use as A(aj) a Wald more may be computed For a given probability index r of interest, the estimator a suitable set of values {aj,j is robust to weak or partial identification of it is a{T). Section 3.5 discusses the properties of the dual procedure in 1. Define statistic: as a value = in is readily available in Wn{aj). Then Wnicij) becomes depending on the naming convention. 0, among {aj,j = 1, ..., J} that minimizes Wr,{a). The estimate of /?(t) then given by /?(S(t),t). 4. Direct inference on Q(r) may be conducted using the variance formula Dual confidence regions its for q(t), upper and lower bounds CRp, may be computed may be as CRp[a{T)] Computational Complexity and Implementation One of the most appealing features of the may IVQR and Portnoy and Koenker (1997) show that ordinary Op (dim(/3, 7)^ X : Qg provided below. Wn{aj) < Cp}, QR using QR can be computed in for e polynomial stochastic time using interior point algorithms with preprocessing,^ so the above and some 6 > some small 5' > 0, 0. Since we need e ex l/n", a > 1/2, and it suffices to the proposed algorithm has computational complexity Op (^n(i/2+5')dim(a) ^ dim(/3,7)' x n'+') 'in contrast, simplex procedures will have running time of Op f n''""'"'''' j. is any modern software. IVQR procedure has computational complexity of Op((l/e)'^™'°' x dim(/?, 7)^ x n^+*) for a desired of accuracy and associated dual inference confidence region be computed using the output from the conventional 71^+"*) for {aj used as end-points of a confidence interval for a(T). 3.2. that both = have a = 1/2 level + 5' that is polynomial dimension of a. in the sample size n and Thus, the procedure will number of exogenous variables, dim(/5), dim(Q), is fications, small. This situation where typically dim(Q) is is in the possibly large, but the certainly the = 1 or 2 dimension of {P,j), but is not polynomial in the be computationally attractive and work well when the number of endogenous variables, most common case prevalent and dim(/3) varies from in econometric speci- to 50 or more. In fact, due to 1 practical properties, the estimator has already been applied in empirical analysis by its and Sidak (2004), Januszewski (2004), and Chernozhukov and Hansen (2004a), and this Hausman paper also presents two additional empirical applications. There are other approaches that one could adopt For example, an immediate approach 2.1. the < D[a + X[P) - to minimize \\^T,'^^i(liYi the estimator of Sakata (2001) which the absolute deviation.'' is is method estimation of the model defined in Section of moments approach (MM) T){X[,Z';)'Vi\\ over an elegant In contrast to the for IVQR maximum a and (5. that attempts Another example likelihood type estimator based is on approach, these alternative approaches involve highly non-convex, multi-modal, and non-smooth objective functions over many parameters, which poses a serious computational challenge. Implementation of extremum estimators with non-smooth and, more importantly, non-convex objective functions generally requires non-convex searches over parameter sets of dimension high-dimension of the case in become many (3. K = dim(/3)-l-dim(a), which will be quite large in many cases due to the applications. difficult to IVQR approach will have an advantage when dim(Q) is small, as is When dim(Q:) is high, both the IVQR approach and the MM approach Thus, the implement. In such settings one could use the quasi- Bayesian methods developed in Chernozhukov and Hong (2003). intervals using a quasi-posterior defined as the MM function specified above. Assumptions 4.1. To exponent of the Asymptotic Distribution Theory 4. state the assumptions, define the population objective function as Q{T,a,p,i):=E[pr{Y,-D'^a~X[p-Z[-t)Vi\, (4.1) and let (4.2) (/3(a,r),7(Q,T)) := argmin Q{T,a,P,'y). In our notation, the estimator solves the following program; max mm Y, a,0 7,(5 \y' - D'ia for MM This approach computes estimates and confidence - X[0 - Z'n - X[&\/ f2 \y^ - ^-a - XlP\ Q = Ax B Define the parameter space value P{a, t) for each not arise. We q e ^ as a, compact convex in its interior, so that the set such that B contains the population parameter-on-the-boundary problem does assume the data were generated by the model defined in Section 2 and impose the following additional assumptions: Rl {Yi, Di, Xi, Zi) are iid defined on the probability space (H, F, P) and have compact support. R2 For the given t, (Q(r),/3(T)) is in the interior of the specified set 0. R3 Density Jy{Y\X, D, Z) is bounded by a constant / a.s. R4 dE[l{Y < D'a + X'0 + Z''i)<i!]/d[P',i) has full rank at each a in A, for * = V,{Z[,X'^' The compactness rank condition in normal and are Rl and R2 conditions in in Rl R4 and iid compactness condition The bounded simplify the analysis. are sufficient for the Jacobian matrix in sampling sufficient for R4 suffice for the estimates (/?(Q;,r),7(a, r)) to implementing the dual inference. do not impose any conditions on the relation between procedure, the dual procedure will be valid when D It is is weak R3 and The full be asymptotically important to note that R1-R4 and Z\ that identification density in to be well-defined. is, or unlike the direct inference fails partially or completely. Stronger additional conditions are imposed for implementing the direct inference. R5 dE[l(Y < D'a + X'/3)*)']/5(a',/3') has full rank at R6 The function {a, (5) ^-^ E[{t - 1(7 < D'a + X'/3)}*] The imposition of R1-R6 is sufficient for identification (a(r)',/3(r)')'. one-to-one over 0. is and asymptotic normality of the IVQR estimator, both of which are necessary for the validity of the direct inference approach. These as- sumptions considerably strengthen the conditions R1-R4 by imposing restrictions on the relationship D between and Z The . the dual approach To is dual approach does not require these assumptions for robust to the violation of either comment on further R5 its validity. Hence, or R6. Z the nature of correlation between and D required by R5, note that by Rl and R3 we have that dE[l{Y < D'a + X'l5m']ld{a' ,0') = E[fMX,D,Z)V,{Z[,Xi)'{D',,X'^)]. Hence, if we set Vj = 1 for simplicity, that global identifiability that the moment we see that the Jacobian in D and Z, and R5 requires that must hold; hence, the impact of covariation matrix between R5 this Z takes a form of density- weighted matrix has full rank. R6 imposes should be rich enough to guarantee equations are solved uniquely. These assumptions are required to carry out direct inference but are not required in the dual approach. Thus, discrepancies between the dual approach and the direct approach should be indicative of situations where R5 and R6 do not hold. 10 Before proceeding to the asymptotic results, we provide some sufficient and more primitive conditions for the global identifiability condition R6. and R6 is A set of conditions that suffices for R5' dE[l{Y < D'a + X'p)md{a\(5') has R6' The image of G under (q,^) everywhere. The rank at each {q,P) full ^ E\{t-\{Y < D'a + X'(3))^ R5' ensures that the mapping (a, E[{t - 1{Y < D'q i—> /?) is simply-connected. + X'/3)}^] somewhat the simple-connectivity condition R6' curbs G. in locally one-to-one result, cf. mapping Ambrosetti (1995). This fact and equations (2.4) and (2.5) imply that the solution of the equation E \{t is unique and of Theorem Other is non-linearity of the and implies a global one-to-one relationship by a Plastock-Hadamard type and Prodi both R5 as follows: given by (q(t), is sufficient l(r < D'a + = X'/J)}*] /3(r)). and more primitive conditions 2 in Mas-Colell (1979). Let G' for R5 and R6 also result through an application be a convex compact set that contains 9 and that has a smooth boundary dQ' Then the following conditions imply R5' and R6' and hence R5 and R6. . R5* dE[\{Y < D'a + X'P)<i']/dia',0') has a positive determinant at each (q,/3) in 9'. R6* dE [l(y < D'a + X'/3)*] /d{a', 0') is positive quasi-definite along the boundary dQ' in the sense defined by Mas-Colell (1979). Note that exogenous model, we can = set Zi Di, and these conditions will be trivially satisfied. Asymptotic Properties of the Dual Inference 4.2. We in the first state the formal results for dual inference, because they are the simplest to explain. the conditions R1-R4, as n — oo, > uniformly in a € Under ^ n M^{a,r) - ^a,T)) = -J^'(a) (4.3) n-^'^ -^3,(0) + Op(l), 1=1 (4.4) s,(q) = [r (4.5) Eiia) = Y- Aq - ^.'/?(q, t) - Zh(a, t), (4.6) Ma) = E[f,^^^iO\Z,X)<i>9'/V]. (4.3)-(4.6) follow here (4.7) (4.8) is that - 1 (£,(q) < 0)1 *„ by adopting standard arguments we have a process in a, *i = V,{X\, Z'^}' for the quantile regression process. whereas we usually have a process over V^p(a,r)-^(Q,r))-.d Q4a] = J-'[a]S[a]J-'[a], Af(0,ntf[a]), 5[a] = £[s,(q)s,(q)'1. r. Hence The for difference each a 11 The = statistic for testing 'y(a,r) where = Q,^){a] n^[Q] +Op(l) QR. the ordinary given by the Wald statistic Wn{a) = n[7(a,T)']r2tf [Q][7(a,T)], any standard consistent estimate of the asymptotic variance is (4.8) of when a = q(t) Therefore, FKn[Q(T)]-d X'(dim(7)), (4.9) and is for the confidence region P{a(T) e CRp[a{T)]} (4.10) Proposition Comment The CRp[a{T)] := {a € Under conditions R1-R4, 1. 1. Unlilce direct inference, = A : Wn{a) < Cp}, where P{x^(dim(7)) < Cp} < Cp} - p. P{Wr,[a{T)] = p, the results (4-3)-(4-10) are true. the dual inference requires only assumptions R1-R4 to hold. results for dual inference are also straightforward to extend in various direction. In particular, one can note that the preliminary estimation of weights or even (4.7) as long as q and instruments Zi Vi will the estimates of weights and instruments that must be imposed can be found in Andrews 4.3. Asymptotic Properties of the Direct Inference The following proposition presents the asymptotic properties of the direct approach. Proposition 2. not affect (4.9) a root-n neighborhood of a{T). Additional regularity conditions on in is In the specified model under conditions R1-R4 and conditions R5-R6, (1994). sufficient conditions for which are either R5'-R6' or R5*-R6*, v^p(T)-e(T))^d7V(0,ne), (4.11) where, for H = <b = V-[X',Z']' and J;yl[Q(T)]J.,, partition of L = J^^ := [E e JfiM, ne = {K',L'YSiK',L'), = Y -D'a{T)-X' 0{t), S = t{1-t)E [^^'] K = {J'^HJa)-^J'^H, M = h+r - JaK, J^ = E[f,{Q\X,Z,D)^D'], and [J'^J'..,]' is a , [fc{0\X, Z)'i/'i!'/V])~^ such that Jg is a dim(/3) x dim(/?,7) matrix and J~, is a dim (7) x dim(/3, 7) matrix. Corollary 1. When dim(7) = dim(a), the joint asymptotic variance of S(r) the choice of A{t) does not affect asymptotic variance, and P{t) ne for S defined above and Jg Corollary function 2. Wn(a) = = will generally have the simple form Jg'S{J'e)-' E[f,(0\X, Z, D)'i[D', X']]. When dim(7) > dim{a), generally matters. A the choice of the weighting matrix natural choice for A{a) is given by A{a) corresponds to the inverse of the covariance matrix of ^/n{jy{a,T) equal to {Jn,SJ!y)~^ at q(t), and it A(a) in the = objective ([n^[a]22)~^ which - 7(0, r)). Noting that A{a) — Q(r)) equals follows that the asymptotic variance of y/n{a(T) is 12 Corollary The 3. y-[X',Z*']', Z' -E[D-v'\Z,X]/V',v* andV = V, Comment Corollary 2. sense defined by 1 0)]^', where f,{0\Z,X). Thus, if IVQR step, again. It bound (1977) as well as the semi-parametric efficiency used to obtain residuals is . in the sense of methods and are used are estimated using nonparametric or parametric hsis no effect and instruments on the Andrews hold. Estimating Variance and Jacobian Matrices The components of the variance matrices that inference and J;j[q] the of first set and S[a] n where ?, /i J^ ^ and nh^ Koenker (1994). — » n oo, u\oc\ := Y^ = -y^[K{ulh)lh\^,%/Vu ^— Jo n 1=1 = = t=i [r - 1 (e, and K{-) is a kernel function. Specific choices of h are discussed in s; = -J2 S-.MJ.H'- - D[a - for direct as follows: < 0)] *,, = - -Asia] 1=1 where and S dual inference. Following Koenker's (1994) analysis for ordinary Similarly, the second set of estimates S\a] to estimate include J^, Ja, = -TlK{er/h)/h]<i,D[, '— 1=1 - D[a(T) - X'^{t), := Y, such that for we need components can be estimated 5=-V?,S:, '— is Z' In the second step, the required weights ?i. can be shown that estimation of weights and instruments (1994), on the estimates of weights QR, Z= can be implemented in two steps. Efficient estimation limit distribution of the estimators, provided additional regularity conditions, found e.g. in 4.4. = 9' becomes simple once especially convenient since the variance formula and Wellner (1993). V' and instruments Z' IVQR is Amemiya Bickel, Klaassen. Ritov. in = < l(e Z is collapsed to the same dimension as D. Corollary 3 shows how to construct Z and weight V such that the IVQR estimator achieves the efficiency bound in the the instrument first - f,{0\D,Z,X), andV the asymptotic variance of{a{r),P{T)) attains the efficiency 6ounrfr(l— t)£['I''^*']~' the instrument In the — efficient score for (Q:(T),/3(r)) is given by ^,j'_^J t X,'^(t, a) a bandwidth chosen such that h —* is *, V,[Z[, X,']', /i is a bandwidth chosen given by E [K{e^[a]/h)/h] <i^,<i!yVi t=i - Z'^{a, t) and and nh'^ — > s,[a\ cxd, = [r and K{-) is 1 {e,\a] < 0)] *.,*, = V,[Z,', X^]', h a kernel function. The consistency properties of these estimators are standard and will not be discussed here. 5. Empirical Examples In this section, 3. The first we present two applications of the estimation and inference results derived in Section application reports the results of a simple analysis of the demand for fish. This application makes use of a small sample and illustrates the potential differences between the direct 13 and dual inference procedures. program on earnings. In we In the second example, this case, the identification consider the effects of a job training quite strong, and is we see small differences between the direct and dual inference procedures. The results here also demonstrate the bias in the conventional quantile regression under endogeneity. In particular, the conventional quantile regression estimates indicate that the effect of training positive outcome distribution, while the the lower tail Denaand 5.1. Fulton market The data contain in New York price price significant across the entire is close to zero in of demand in elasticities may which potentially vary with the observations on price and quantity of fresh whiting sold in the over the five These data were used previously The and for Fish demand. fish is estimates indicate that the training impact distribution. we present estimates In this section, level of outcome of the IVQR Graddy month period from December 2, 1991 to May 1992. 8, (1995) to test for imperfect competition in the market. and quantity data are aggregated by day, with the price measured as the average daily and the quantity as the observations for the days in total amount of fish sold that day. The total sample consists of 111 which the market was open over the sample period. For the purposes of this illustration, we focus on a simple Cobb-Douglas random demand model with non-additive disturbance: where Qp the quantity that would be is the level of demand normahzed the level of demand supply if is A U. = Qo(t/) + demanded if ln(Qp) to follow (7(0, 1), supply function S^ assumed The observed quantity Y = to be independent of sold in the In y= market Qo (t/) -H the price were p, and a\{U) the price were p, subject to other factors affecting supply are ai(;7)ln(p), is ai /(p, Z is the U is an unobservable affecting random demand elasticity when Z,U) describes how much producers would and unobserved disturbance U. The factors demand disturbance Z U given in logs by ([/) In P, where (5.1) U where Oio{U) P + is is independent of Z, the price picked by the market to equate supply and demand. ai([/)ln(P) disturbance U, i.e. = that \n f{P, P= is a dummy is, P satisfies Z,U), which implies the observed price depends on the demand 6(Z, U,U) for As instruments Z, we consider two Stormy That some function different variables capturing variable which indicates greater than 18 knots, and Mixed is a 6. weather conditions at wave height greater than dummy variable indicating 4.5 feet sea: and wind speed wave height greater than 3.8 14 feet and wind speed greater than 13 knots. These variables are plausible instruments since weather conditions at sea should influence the demand Simple for the product.^ amount OLS offish that reaches the market but should not influence regressions of the log of price on these instruments suggest they are correlated to price, yielding R^ and F-statistics of 0.227 and 15.83 when both Stormy and Mixed are used small sample, and 0.160 and 20.69 when as instruments we may just Stormy used. However, given the is expect identification to be weak, and weak identification still suggested by is the results reported below. Quantile regression (QR) and inverse quantile regression (IVQR) results for the .75, and Stormy is Column columns .85 quantiles are reported in QR results, while columns (2) and (3) report used as an instrument in Column (3). (l)-(3) of IVQR (2), asymptotic approximation, and 95% confidence The computation for the IVQR of the The IVQR estimates than the "price effects" QR or of the demand that the interpretation of demand model, while The IVQR elasticities are also 1. The left in column estimates in the QR IVQR are steeper than is very different. (2) magnitude market. These results, differ- and the right figure, those estimated by It is also IVQR QR more curvature important to note estimates a structural the conditional quantiles of the equilibrium quantity variable as It is no surprise that these estimates are and large differences between the confidence different inference procedures for is using 5] both Stormy and Mixed are used as instruments. In the IVQR and QR Interestingly, there are clear dual procedure which [—5, anticipate given endogenous panel reports the plotted in the original price-quantity space. a function of the equilibrium price. two A= uniformly greater QR, which we might functions estimated by QR estimates on the inference procedure outlined in Section 3.1. plotted in log-price-log-quantity space, and that this translates directly into demand curve when The estimate of a. estimates, the row labeled "Dual Interval" contains estimated by the ordinary IVQR results when clearly see that the when IVQR exhibit considerable heterogeneity, ranging from -1.5 to -0.7 in (3). the that only 0.1. ences are illustrated graphically in Figure we (1) reports (3) differ in interval for S(t) constructed based sampling resulting from the joint determination of price and quantity panel reports the Column and estimator was conducted over the parameter space column -1.5 to -0.9 in (2) .50, while Stormy and Mixed are used as instruments in bound constructed using the dual Qj equally spaced with a step size of and from IVQR below. 1 Columns For the t"" quantile, the row labeled 3(r) gives the row labeled "Wald Interval" contains the 95% confidence the Table results. .15, .25, IVQR. different. intervals given by the In particular, the confidence intervals based on the robust to weak and partial identification are uniformly much wider than the intervals based on the direct inference procedure. For instance, the dual confidence region for the .85 quantile case contains the upper endpoint of the parameter space A. In addition, ^More detailed arguments may be found in Graddy (1995). it is worth 15 Table Example (3) (4) (5) -1.5 1188 -200 (-1.24,0.16) (-3.69,-0.69) (-2.51,-0.49) (553,1822) (-1435,1035) [-5.0,0.5) (-3.2,0.1) -0.40 -1.0 -1.4 2510 500 (-2.51,0.51) (-2.52,-0.28) (1742,3278) (-887,1887) (-4.4,0.0) (-3.1,0.1) -0.41 -0.7 -0.9 4420 300 (-0.81,-0.01) (-1.67,0.27) (-1.82,0.02) (3220,5621) (-1589,2189) (-3.0,0.6) (-3.0,0.6) Dual CI (-1400,2700) -0.70 -1.2 -1.3 4678 2700 (-2.02,-0.38) (-2.07,-0.53) (2901,6455) (-260,5660) (-2,0,-0.1) (-2.1,0.1) Dual CI (-400,5600) -0.81 -1.3 -1.1 4806 3200 (-1.24,-0.38) (-2.10,-0.50) (-1.82,-0.38) (2751,6861) (32, 6368) (-2.0,5.0] (-2.6,5.0] S(.85) Dual CI Columns (-1000,2000) (-1.18,-0.22) 3(.75) Notes: (-1300,1500) (-0.87,0.07) S(.50) Wald CI (l)-(3) report results (500,5800) from estimation of the demand report conventional quantile regression results, and columns regression results. In column (2), one instrument. Stormy, Stormy and Mixed are used. Rows labeled a(T) in for t G is for fish, JTPA report results from estimation of the returns to training from the the numbers IVQR (2) Dual CI Wald CI Returns to Training -1.5 S(.25) Wald CI 2: QR IVQR (1) Dual CI Wald CI Example for Fish IVQR -0.53 S(.15) Wald CI Demand 1: ; QR Results from Empirical Examples 1. (2), (3), used, and and and columns (5) report in column {.15, .25, .50, .75, .85} and (4) experiment. Columns (1) (5) and (4) instrumental quantile (3), two instruments. report point estimates, and parentheses are confidence intervals. noting that the confidence intervals obtained using two instrumental variables are generally shorter than the confidence intervals obtained using just one instrumental variable, suggesting an efficiency gain to using more instruments. The construction and nature and 3, two cases, line in of the dual confidence which respectively plot the a is IVQR bounds are further objective function Wn(a) illustrated in Figures 2 over the parameter space in the plotted on the horizontal axis, and the vertical axis shows ^^(q). each graph is the 95% critical value for the dual testing procedure, so all The horizontal points lying below the horizontal line belong to the confidence region for a(T). These graphs display a number of interesting features. It while having numerous local minima, has a distinct is apparent that the objective function, minimum over A in all cases. The objective 16 and hence dual confidence regions, are generally well-behaved functions, bution and become more erratic as one moves toward the that the dual confidence regions are not connected in tails of many in the middle of the distri- the distribution. It is also clear cases. This simple example clearly illustrates the potential differences between the direct and dual inference procedures. to demand an example of an apphcation of the methods of also provides It analysis where the elasticity of illustrates the use of the estimator in much potentially heterogeneous. results. The analysis and the importance of accounting demonstrate the interesting insights that may be gained through quantile results also The Returns for endogeneity in such studies. to Training of job training programs on the earnings of trainees, especially those with low income, of great interest to both policy makers training programs on earnings is and academic economists, but evaluating the causal difficult due to the self-selection of (JTPA) provides a mechanism assigned the offer of JTPA for randomly training services, but because people were able to refuse to participate, the actual treatment receipt was self-selected. in the training. effect of Job Training Partnership Act In the experiment, people were addressing this issue. is treatment status. However, data available from a randomized training experiment conducted under the training. The next example stronger, showing that in this setting the two inference procedures produce similar is The impact is paper a setting with a considerably larger sample and where identi- fication 5.2. demand this Of those offered treatment, only 60 percent participated There was also a small number of individuals from the control group who received The random assignment of the training offer provides a plausible instrument for a person's Abadie, Angrist, and Imbens (2002), actual training status. who previously used this data to examine the impact of job training on earnings,^ provide more detailed information regarding data collection procedures, sample selection criteria, additional facts and discussion about the JTPA and institutional details of the training experiment. JTPA In this example, along with we limit the analysis to the sample of adult males. To capture the effects of training on earnings, we estimate a structural quantile model of the form Y = Da[U) + X'0{U), U ~ where D outcomes indicates training status Y They use a different is instrumented X is a vector of covariates, and U is are earnings, to the treatment group, and (7(0, 1), for Z given Z and X, by assignment to the treatment group, the is a dummy variable indicating assignment an unobservable affecting earnings. modeling framework and estimator that estimates the treatment sub-population of "compliers". effect for the 17 The data and assignment consist of 5,102 observations with data on earnings, training and other individual status, Earnings are measured as total earnings over the 30 month characteristics. period following the assignment into the treatment or control group, and average earnings in the The sample are $19,147, vector of controls, dummy indicating high-school graduates dummy, a dummy for the and , dummies includes GED dummy black and Hispanic persons, a for holders, five age-group indicating whether the applicant worked 12 or to the assignment, a dummies X dummies, a marital status more weeks in the 12 months prior signifying that earnings data are from a second follow-up survey, recommended we only report service strategy/ For brevity, and results for estimates of the key parameter, q(t), which represents the impact of the training program on earnings. Since assignment to the treatment or control group was random, Z. The instrument The state D. useful for identification since is it provides a natural instrument highly correlated to the actual training is it R^ of a regression of training status on assignment partial controlling for the other covariates, correlation suggests that weak is .609, and the to the treatment group, first-stage F-statistic problem identification should not be a is 2,673, This strong in this case, so the direct and dual inference procedures should yield similar results. As in the previous example, estimation results for the reported in columns (4)-(5) of Table the IVQR The row results. labeled 1. Column 95% "Wald confidence The computation the and .85 quantiles are QR results, while column (5) reports QR or IVQR estimate of a. For the t"* quantile, the row labeled S(r) gives the Interval" contains the the asymptotic approximation, and for the the (4) reports ,15, .25, .50, .75, 95% confidence interval for S(t) constructed based on IVQR estimates, bound constructed using the dual of IVQR was the row labeled "Dual Interval" contains inference procedure outlined in Section 3.1, conducted over the parameter space A = [—2500,7500] using aj equally spaced with a step size of 100. In addition, the results are presented graphically in Figure The QR conventional and are uniformly positive had a relatively large results, which fail 4. to account for the selection into the treatment state, from significantly different 0. They indicate that the training impact on the earnings of participants, and that this impact is program increasing in the quantile index. However, given that people were able to decide whether or not to participate in training following the initial upward biased random assignment, for the actual effect of training it seems likely that these estimates on earnings. This suspicion is would be confirmed by the IVQR estimates, which account for the endogeneity of training status and are uniformly smaller than the corresponding QR estimates. This difference is most apparent quantiles. In the low quantiles, QR suggests a moderate earning quantiles; however, the IVQR The recommended service strategy training and/or job search assistance, positive and in the low and middle earning significant effect of training on estimates are quite low and, while imprecise, not significantly was broken into three categories; and other forms of training. classroom training, on-the-job 18 from different The 0. percentage impact of the training program, which 4.* Here, the starting at becomes more apparent when we consider the difference in the estimates is presented in the right hand column of Figure QR estimates imply a large percentage increase in earnings in the low earning quantiles, 139% for r = .15, which declines as one moves to the upper quantiles of the conditional earnings distribution, though the impact remains large even in the center of the distribution at T = .50, where the implied effect on the other hand are quite T = .25 are all is a 35% increase in earnings due to training. between stable, varying -13% and The IVQR estimates 14%, and with the exception of below 10%. Unlike the case considered above, IVQR inference procedures for we do not find large differences The in this case. between the direct and dual between the two approaches similarity not is unexpected due to the strong correlation between the instrument and endogenous regressor. The close agreement here further suggests that not much cases where identification is strong. It differences detected in the previous section are the dual procedure to the presence of procedure this inference The dual graphs due to weak in functions, identification. its confidence bounds are further illustrated in Figure all line in is 5, Given the robustness of many it seems that cases. IVQR which plots the objective plotted on the horizontal axis, and the vertical each graph is the 95% critical value for the dual testing points lying below the horizontal line belong to the confidence region for q(t). Figure 3 differ markedly from those in Figures and hence confidence regions, in argument that the simple computation, be preferable to the standard procedure in shows W„(q). The horizontal procedure, so by considering the dual procedure lost weak instruments and W„(q) over the parameter space A. a function axis will is also provides further support for the 1 and 2. In particular, all The of the objective Figure 3 look remarkably well-behaved. The objective in functions appear to be reasonably smooth, and the confidence intervals are all connected and clearly bounded within the parameter space considered. 6. Conclusion In this paper, we propose an estimation approach, the inverse quantile regression (IVQR), that appropriately modifies the conventional quantile regression (QR) and recovers quantile-specific covariate effects in an instrumental variables of a set of instrumental variables Z. since it model defined by The IVQR estimator is y= D'a(U) where U is appealing for estimation independent in this can be computed through a series of conventional quantile regression steps and so computationally convenient in erties of the estimator many cases encountered in practice. We model will be derive the asymptotic prop- under suitable conditions. In addition, we demonstrate that the estimation ^The percentage impact state. All other covariates is for changing from D= to D= 1, i.e. were evaluated at their sample means. from the non-training to the training 19 procedure leads to a testing procedure which and that this inference implement We procedure results naturally from the IVQR algorithm and so first weak instruments. demand appears to be In this case, upward biased we find that the conventional QR estimate of the elasticity as would be expected due to the and quantity by supply and demand. instruments. In the second example, this case, In addition, we joint determination of find that there are large differences we instrument we for training status is is is essentially no difference robust to weak instruments. strong evidence of endogeneity of training status resulting in substantial bias to QR estimator. This bias is especially pronounced in the lower tail of the earnings where QR suggests a significant and positive effect of training on earnings, while the the conventional distribution robust to weak with random assignment to the training program which very highly correlated to actual receipt of training. In this case, there In addition, there is look at the impact of a job training program on earnings. In between the direct inference procedure and the dual procedure which IVQR simple to example, we examine a simple demand model in a small sample with between the direct inference procedure and the dual inference procedure which is is then illustrate the use of the proposed estimator and testing procedure through two brief relatively price be robust to the presence of weak instruments in practice. empirical examples. In the of will estimates are insignificant and small in magnitude. 20 Appendix 7. 7.1. Proof of Proposition The result (4.3) follows 1 by adopting standard arguments stance those of Gutenbrunner and Jureckova (1992). follow by the Slutsky rest of the stated conclusions (4.4)-(4.10) Lemma. Proof of Proposition 2 7.2. We The for quantile regression processes, for in- use standard definitions and notation from empirical process theory, as /^E„[/(iy)]:=if^/(W^,), E where we use f^Gnlf{W)]:=j=f2 (fm)~E[f{W.)]), to denote the usual expectation estimated function van der Vaart and W := {Y,D,X,Z), define the maps Wellner (1996) and van der Vaart (1998). For (7.1) e.g. and E to denote expectation evaluated at an := {E[f{Wi)])j^jr. /: E[/(W,:)] For convenience we collect important definitions below. Let § := (/3,7) and i9(t) := (/3(t),0). Define f{W,a,^) := {t-1{Y < D'q + X'/? (7.2) where * := V • {X', Z')'. Define for pr{u) = - l(u < (r + Z'7)}*, 0))u g{W,a,^) := pr{Y - D'a - X'f5 - Z'i)V. (7.3) Define Qn{aJ):=E,,[g{W,a,d)], (7.4) Q(a,i)) := E [giW,a,^)] and (7.5) d{a,T) := (^(q, (7.6) i?(q, r) 3(r) (7.8) 1 (Identification) First, equation and arginf^g^dimc^.^) Qn(a,'i5), (/?(q, t), 7(a, t)) := arg inf^jgRdi^c^.^) 0':=p{a',T), We show that 7?(t) by R6, the mapping (a,P) (2.5), it is we have that i?(t) = = ^ E[{t AxB. (a, i?), r), a' := arginf^g^ W[q], 7(r) := 7(S(t),t), 7':=7(a*,r). (Q(r)',/?(r)') uniquely solves the limit problem. 1(Y < D'a + X'/?)}*] (a(T)', /3(t)')' solves the equation thus the only solution over Q W[q] := 7(Q,T)'^(a)7(Q, :=arginfag^ VKnM, ^(T):=^(3(r),T), (7.9) 0, 7(q,t)) := Wn[a]:=^{a,TYA{a)j(a,T), (7.7) Step := r), is one-to-one over £ |{r - 1{Y < D'a AxB. By + X'0)}'i!] = 21 Second, we need to show that R5'-R6' and R5*-R6* follows by a variant of Hadamard-Cacciopoli theorem R6. suffice for Sufficiency of R5'-R6' for general metric spaces, Ambrosetti and Prodi (1995). Sufficiency of R5*-R6* follows by Theorem 2 = Third, we have that the true parameters {a,/3) E [{t - (7.10) A over y< B. 1{Y < D'a By R4 and by convexity in in cf. Theorem 1.8 in Mas-Colell (1979). {a(T),P{T)) uniquely solve the equation + X'P + Z'0)}<1/] = of the limit optimization problem for each q, i9(a,T) i? uniquely solves the equation: EI{t- 1{Y < D'a + X'P{a,T) + (7.11) By ^ construction of A € find a' 7(Q*,r) = 2. B we know by equation = Thus a* (2.5). ||t?(Q,r) Q(r) (7.12) that for any Hence we Q^^p — iy(Q)||| follows by the standard consistency 0. = B 0. for is We each a £ A. = minimal, a' need to Q(r) makes a solution; by the preceding argument is sup -i9(Q,r)||-^p Oi.e. which implies supQg_4 |iyn(Q) - 7(t) in the interior of = it is unique 0{t). sup = is (Consistency) One consequence of Proposition (7.12) by that /3(q, r) such that this equation holds and the norm of 7(0*, t) and /?(Q*(T),r) Step x Z''y{a,T)})<b] > 0, argument 1, - ||7(q, r) where W{a) for namely of equation is 7(a,r)||-»p continuous in = P{a(T),T) that 0, a over A. extremum estimators that 3(t)—»p a(r), P{an,T) -^p is (4.3), It therefore Q(r), and then and 7(an,T)-*p 7(a(T),T) /?(r) = also have that i?(Q„,r)->p d{a{T),T) for any »„-»? a(r). (7.13) Note that above we have used that 'd{a, r) is continuous in q, which is verified by the implicit function theorem applied to equation (7.11). Step 3. (Asymptotics) Let a^ be a small ball centered at a{T). in properties of the quantile regression estimator 'd{an,T) established in By Theorem the computational 3.3 in Koenker and Bassett (1978), Oil/V^) = V^En[fiW,an,d{an,T))]. (7.14) The A functional class {f{W,a,'d), {a,i}) £ because this class is a product of a VC following expansion of the rhs of (7.14) 0{1/M = G„ = Gn [/ x B x Q} subgraph is valid for is class Donsker for any compact sets A, B, and and a bounded random vector. Hence the any On—^p a(r) (W, an, d{a{T), r))] + v^E [/(M/,Q(T),i?(a(r),T))] Q, [/ + V^E (w, q„, ,?(q„ r))] [f (iy,Q,,,?(Q„, r))] + Op(l). 22 Expanding the very element further, by R4 last Oil/y/H) = G„ [/(H/, + + Op(l))v^(j?(Q„,T) - a{T),^T))\ + Op(l) (7.16) (J^ d{T)) + Op(l))v^(a„ - a(T)), + (J„ where by Rl and R3 M(.y./J) = (0,/3{r)) 0(P',7') (7.17) oa' = In other words, for any la=Q(T) £[A(0|X,Z,D)*D']. Q^—*p cx{t) Md{an,T) - = - J-'Gn d{T)) [/( VK, o(r), 79(t))] (7.18) - Ja ^ Ja[l + Op(l)Jv^(a„ - Q(r)) + Op(l), so v/^(^(Q„,r) - /3(t)) = - J^Gn [f{W,a{T),d{T))] (7.19) - + Op(l)]v^(Q„ - J^Ja[l a{T)) + Op(l), and v^(7(Qn, r) - =- 0) J.,G„ [/(ly, a(r), 7?(t))] (7.20) where [J'^,]'^]' is » 1. Jq[1 + Op(l)]\/n(an - a(T)) + Op(l), the conformable partition of J^^ Center a shrinking closed ball wp — J-, Then wp — > Bn by consistency obtained at 0, so that in Step 2, an — air) G B^ 1 S(r) (7 211 = arginf iy„(Qn). a„-a(T)6B„ ^ Note that G„ [f(W, Q(r), i9(r))] ^d A^(0, 5) by the Central Limit Theorem. Hence G„ [f[W, a(r), -dir))] Op(l), and Wn{an) = [Op(l) - x[yl(Q0 + (7.22) X [Op(l) It J-,Ja[l - +Op(l)]V^(a„ -Q(r))]' Op(l)] J-,J„[1 + then follows from (7.21) and (7.22) that y/n{a.[T) rank and A{a) has rank from v/^(d(r) (7.23) where Q„(z) := full ( R4 and - Q(r)) R5. Thus, = arg inf Op(l)] — X v^(a„ - Q(r))] Q:(r)) we have [Qn(2) = . Op(l) since J^Ja has full column that + Op(l)] - J^Gn!{W,a{T),^[T)) - J^J^z)' A{a){- J^Gnf{W,a{T),d{T)) - J-,J^z). 23 LEMMA 1 (n, in \ 0, {Approximate Argmins, Knight (1999)). Define Zn such that Qn[Zn) < inf^gRd Qn{i) + and defined Z* as arginf^gK.^ a,rgmm^^jg^dQx{z) sets K Apply (7.24) that , where Q^c Lemma 1 to Suppose that Zn (5„(?). is uniquely defined in R'^ is continuous. Then a.s., = Op(l), and Qn(-) => Qoo(') Zn = Z^ + Op(l) in Z^ = Op(l), Z^o '= i°°{K) over any compact and Zn—>d Zoo- Qn{z) defined above and conclude that ^^(3(r) - Q(r)) = arg inf [Qn{z)] + o^{l), is 25) "^^"^^^ " "^^^^^ " " (^^•^7^('^w) A^) "' (i;j;^(a(T)) a) xG„[/(W^,a(r),,9(r),r)]+0p(l), Hence ^{kHr),T) ~d[T)) = -J-'[l - J„(j'J'^A{a{T))J,J^Y' J'J'^A{a{T))J, (7 26) xG,[/(H/,Q(r),,?(r),T)]+Op(l) The conclusion of Proposition 2 follows from G„ (/(VK,q(t),'!9(t))] -^d N{0, S). 24 References Abadie, a., AngriST, J. and G. Imbens (2002): "Instrumental variables estimates of the effect of subsi- dized training on the quantiles of trainee earnings," Econometrica, 70(1), 91-117. a., and G. Prodi (1995): A primer of nonlinear analysis, vol. 34 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge. Amemiya, T. (1977): "The maximum likelihood and the nonlinear three-stage least squares estimator in Ambrosetti, the general nonlinear simultaneous equation model," Econometrica, 45(4), 955-968. (1982): "Two Stage Least Absolute Deviations Estimators," Econometrica, 50, 689-711. Andrews, D. (1994): "Empirical Process Methods in Econometrics," in Handbook of Econometrics, Vol. 4, by R. Engle, and D. McFadden. North Holland. BiCKEL, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993): Efficient and adaptive estimation for semiparametric models. Johns Hopkins University Press, Baltimore, MD, Johns Hopkins Series in the Mathematical Sciences. ed. Chaudhuri, and K. DokSUM, P., Samarov a. (1997): "On average derivative quantile regression," Ann. Statist, 25(2), 715-744. Chen, L., and Portnoy S. "Two-stage regression quantiles and two-stage trimmed least squares Comm. Statist. Theory Methods, 25(5), 1005-1032. (1996): estimators for structural equation models," Chernozhukov, An tribution: and v., C. Hansen "The (2004a): Effects of 401(k) Participation on the Wealth Dis- Instrumental Quantile Regression Analysis," Review of Economics and Statistics, 86(3), 735-751. "An IV Model (2004b): Chernozhukov, and V., H. of Quantile Hong Treatment "An (2003): forthcoming Econometrica. Effects," MCMC Approach to Classical Estimation," Journal of Econometrics, 115, 293-346. Dorsum, K. (1974): "Empirical probability plots and statistical inference sample case," Ann. Statist., 2, 267-277. Graddy, K. for nonlinear models Rand Journal "Testing for Imperfect Competition at the Fulton Fish Market," (1995): the two- in of Economics, 26(1), 75-92. Gutenbrunner, C, and J. JURECKOVA (1992): "Regression rank scores and regression quantiles," Ann. Statist, 20(1), 305-330. Hausman, AND A., J. J. G. SiDAK (2004): 3. "Why Do the Poor and the Less-Educated Pay Higher Contributions to Economic Analysis Prices for Long-Distance Calls?," & Policy, Vol. 3, No. 1, Article http://www.bepress.com/bejeap/contributions/vol3/issl/art3. He, X., AND L. Zhu (2003): "A lack-of-fit test for quantile regression," Journal of the American Statistical Association, to appear. JanuSZEWSKI, S. I. (2004): "The Effect of Air Traffic Delays on Airline Prices," UCSD Working Paper. http://weber.ucsd.edu/ sjanusze/www/airtrafficdelays.pdf. Knight, K. (1998): "Limiting distributions for Li regression estimators under general conditions," Ann. Statist, 26(2), 755-770. (1999): "Epi-convergence KOENKER, R. (1994): and Stocheistic Equisemicontinuity," Preprint. "Confidence intervals for regression quantiles," in Asymptotic statistics (Prague, 1993), pp. 349-359. Physica, Heidelberg. Koenker, KOENKER, R., R., regression," KOENKER, J. R., and AND J. A. F. AND S. PORTNOY G. S. Bassett (1978): Machado "Regression Quantiles," Econometrica, 46, 33-50. "Goodness Amer. Statist Assoc, 94(448), 1296-1310. (1987): (1999): of fit and related inference processes "L-estimation for linear models," J. for quantile Amer. Statist Assoc, 82(399), 851-857. (1979): "Homeomorphisms of compact, convex sets and the Jacobian matrix," SIAM Math. Anal., 10(6), 1105-1109. PoRTNOY, S. (1991): "Asymptotic behavior of regression quantiles in nonstationary, dependent cases," Multivariate Anal, 38(1), 100-113. PoRTNOY, S. (2001): "Censored Regression Quantiles," prepirint, www.stat.uiuc.edu. Mas-Colell, A. J. J. 25 PORTNOY, S., AND R. KOENKER (1997): "The Gaussian Hare and the Laplacian Tortoise," Statistical Science, 12, 279-300. Sakata, "Instrumental Variable Estimation Based on the Least Absolute Deviation Estimator," Department of Economics, University of Michigan. VAN DER Vaart, A. W. (1998): Asymptotic statistics. Cambridge University Press, Cambridge. VAN DER Vaart, A. W., and J. A. Wellner (1996): Weak convergence and empirical processes. SpringerS. (2001): Preprint, Verlag, New York. 26 Figure 1. QR Estimates of Effect of Price on Quantity by Demand Conditional Quantile Functions -05 0.5 iog(Price) Demand Conditional Quantile Functions The estimated as a function of price for r IVQR Functions -05 0.5 iog(Price) Note: Left Column: and = Functions conditional quantile curves of the quantity of fish sold .15, .25, .50, .75, and .85. The top display is in log-price log-quantity space with log-price on the horizontal axis and log-quantity on the vertical axis. The bottom display is in price-quantity quantity on the vertical axis. Right Column: r = .15, .25, .50, .75, log-price is .85. The top display is in log-price log-quantity for space with on the horizontal axis and log-quantity on the vertical axis. The bottom display space with price on the horizontal axis and quantity on the vertical in price-quantity axis. and space with price on the horizontal axis and The demand curves estimated by IVQR 27 Figure 2. Statistic Wn{a) in Demand Example Using Stormy as an Instrument A \ / \^ / /'- / ', 1^ - y^J I ' K y . =50 Note: Objective functions and dual confidence regions for demand for fish example. All models are as specified in the main text. The estimates make use of one instrument, Stormy, a is on the horizontal axis and Wn(,a) is on the vertical axis. The horizontal line is the 95% critical value from a Xi- The dual confidence region is all values of a such that the Wn{oi) lies below the horizontal line. 28 Figure 3. Statistic W„{a) in Demand Example Using Stormy and Mixed 1=.15 as Instruments t = .25 15 10 XV / / /"--- 5 \ \ / J /-' ^\- a t=.50 I 40 / r^ 30 / 20 10 =.75 / ^ '' \ /^-~X A , A. : / / \ / ^_y n i = .65 Note: Objective functions and dual confidence regions for demand for fish example. All models are as specified in the main text. The estimates make use of two instruments, Stormy and Mixed, a is on the horizontal axis and W„{a) is on the vertical axis. The horizontal line is the 95% critical value from a xi- The dual confidence region is all values of a such that W„(a) lies below the horizontal line. 29 Figure 4. Estimates of the Training Impact by QR: QR and by QR: Percentage Impacl Training Effect IVQR of Training 6000 4000 5 2000 -2000 0.3 0,2 0,4 5 6 0,7 0,2 IVQR: Training Effect 02 0,3 Note: Left Column: earnings for t = 0,4 0,5 0,6 07 08 IVQR; Percentage Impact of Training 0.4 0,5 0,6 QR and IVQR .15, .25, .50, .75, 0,3 0,7 estimates of the impact of a job training program on and .85. The top panel reports the QR estimate of the and the bottom panel reports the IVQR results. In each figure, the solid line represents the point estimates, and the dashed (- -) line represents the 95% confidence interval formed using the direct inference approach. For the IVQR results, the dash-dot (-.) line represents the 95% confidence bound constructed using the dual inference procedure described in the text. In both figures, the horizontal axis measures the quantile index r, and the vertical axis is the impact of training on earning quantiles measured in dollars. Models include covariates as specified in the text, and the sample size is 5,102. Right Column: QR and IVQR estimates of the percentage impact of training for r = .15, .25, .50, .75, and .85. The top panel reports the QR estimate of the training impact, and the bottom panel reports the IVQR results. Percentage impacts are for moving from non-training to training and all other covariates are evaluated at their sample mean. In both figures, the horizontal axis measures the quantile index r, and the training impact, vertical axis is the percentage impact of training. 30 Figure 5. t Statistic Wn{cr) in the Training = .15 Example. T=.25 200 / 150 100 / / 50 :::i=^_ __^--^^ -2000 2000 4000 2000 6000 4000 6000 \ A 2000 4000 6000 2000 4000 6000 / y / \ -2000 \ r \ ^•^ 2000 ,-r^ 4000 6000 Note: Objective functions and dual confidence regions for returns to training example. All models are as specified in the main The estimates use random assignment to a is on the horizontal axis and Wn(c() is on the 95% critical value from a X\ The dual confidence text. the training program as the instrument, vertical axis. region is all The horizontal line values of a such is the that the function value lies below the horizontal line. 52 1 SS Date Due