Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries http://www.archive.org/details/finitesampleinfeOOcher HB31 DEWEY .M415 6S -^ Massachusetts Institute of Technology Department of Economics Working Paper Series SAMPLE INFERENCE FOR QUANTILE REGRESSION MODELS FINITE Victor Chernozhukov Christian Hansen Michael Jansson Working Paper 06-03 January 31, 2006 Room E52-251 50 Memorial Drive Cambridge, MA 021 42 This paper can be downloaded without charge from the Social Science Research Network Paper Collection at http://ssrn.com/abstract=880467 \SSACHUSETTS -NSTITUTE OF TE CHNOLOGV I FEB 2 ? 2005 LIBRARIES FINITE SAMPLE INFERENCE FOR QUANTILE REGRESSION MODELS VICTOR CHERNOZHUKOV CHRISTIAN HANSEN MICHAEL JANSSON Abstract. Under minimal assumptions sion models can be constructed. finite sample confidence bands for quantile regres- These confidence bands are based on the "conditional pivotal property" of estimating equations that quantile regression methods aim to solve and will pro- vide valid finite sample inference for both linear and nonlinear quantile models regardless of whether the covariates are endogenous or exogenous. MCMC, puted using and confidence bounds for single The confidence regions can be com- parameters of interest can be computed through a simple combination of optimization and search algorithms. We illustrate the finite sample procedure through a brief simulation study and two empirical examples: estimating a heterogeneous demand all cases, we find elasticity and estimating heterogeneous returns to schooling. asymptotics and confidence regions formed using the finite the usual asymptotics are suspect, such as inference about identification may usefully assumptions is partial or weak. complement fail In pronounced differences between confidence regions formed using the usual The evidence existing inference sample procedure tail in cases quantiles or inference where when strongly suggests that the finite sample methods methods for quantile regression when the standard or are suspect. Key Words: Quantile Regression, Extremal Quantile Regression, Instrumental Quantile Regression 1. Quantile regression (QR) it a useful tool for examining the effects of covariates on an outcome is Koenker variable of interest; see e.g. Introduction (2005). Perhaps the most appealing feature of allows estimation of the effect of covariates on Date: version The of finite the sample results "An paper IV of this Model paper of many were Quantile points of the included http://gsbwww.uchicago.edu/fac/christian.hansen/research/IQR-short.pdf. current paper was prepared Roger Koenker for As the Winter Meetings of the Econometric Society for constructive discussion of the in April Effects" a that outcome distribution the in Treatment QR is separate 2003 17, available at project, the San-Diego, 2004. We thank paper at the Meetings that led to the development of the optimality results for the inferential statistics used in the paper. Revised: December, 2005. We also thank An- drew Chesher, Lars Hansen, Jim Heckman, Marcello Moreira, Rosa Matzkin, Jim Powell, Whitney Newey, and seminar participants at Northwestern University of the paper can be for helpful comments and discussion. The most recent versions downloaded at http://gsbwww.uchicago.edu/fac/christian.hansen/research/FSQR.pdf. 1 well as the center of the distribution. including tlie tails as summary statistics of the impact of a covariate, they do not capture the affects all quantiles of the impact of a variable unless the variable same way. Due QR properties, to ability to capture its While the central has been used in many theoretical econometrics; see especially effects are useful full distributional outcome distribution in the heterogeneous effects and its interesting theoretical empirical studies and has been studied extensively in Koenker and Bassett (1978), Portnoy (1991), Buchinsky (1994), and Chamberlain (1994), among others. In this paper, we contribute to the existing literature We models. for quantile regression show that vahd by considering finite finite sample inference sample confidence regions can be constructed for parameters of a model defined by quantile restrictions under minimal assumptions. These assumptions do not require the imposition of distributional assumptions and be valid both linear and nonlinear conditional quantile models and for for will models which include endogenous as well as exogenous variables. The basic idea of the approach is make use to of the fact that the estimating equations that correspond to conditional quantile restrictions are conditionally pivotal; that is, conditional on the exogenous regressors and instruments, the es- timating equations are pivotal in finite samples. Thus, valid finite sample tests and confidence regions can be constructed based on these estimating equations. The approach we pursue (unconditional) quantiles. related to early is The work on finite sample inference existence of finite sample pivots quantiles as illustrated, for example, in is immediate for the sample for unconditional Walsh (1960) and MacKinnon (1964). However, the existence of such pivots in the regression case is less obvious. We extend the results from the unconditional case to the estimation of conditional quantiles by noting that conditional on the exogenous variables and instruments the estimating equations solved by used to obtain valid spirit to the finite sample inference statements. The resulting approach rank-score methods, proach finite will methods are is similar in Gutenbrunner, Jureckova, Koenker, and Portnoy (1993), e.g. but does not require asymptotics or homoscedasticity The QR This property suggests that tests based on these quantities can be pivotal in finite samples. for its vahdity. sample approach that we develop has a number of appealing features. The ap- provide valid inference statements under minimal assumptions, essentially requiring some weak independence assumptions on sampling mechanisms and continuity of conditional quantile functions in the probability index. In endogenous settings, the finite sample approach will remain valid (2003)). in cases of weak identification or partial identification (e.g. as in Tamer In this sense, the finite sample approach usefully complements asymptotic approxi- mations and can be used these approximations is in situations questionable. where the validity of the assumptions necessary to justify The chief difficulty with the finite menting the approach be quite difficult if sample approach will require inversion of number the of parameters is computational. an objective function-like quantity which is large. To help The which offers potential use of MCMC allows us to draw an adaptive may computational alleviate this (MCMC) methods problem, we explore the use of Markov Chain Monte Carlo joint confidence regions. In general, imple- for constructing set of grid points We computational gains relative to more naive grid baised methods. also consider a simple combination of search and optimization routines for constructing marginal confidence bounds. When interest focuses putationally convenient and aimed may be more on a single parameter, this robust in nonregular situations than an approach at constructing the joint confidence region. Another potential disadvantage of the proposed finite sample approach expect that minimal assumptions would lead to wide confidence intervals. concern is unwarranted for joint inference: The finite asymptotic power properties. However, conservativity sample may tests is that one might We show have correct size that this and good be induced by going from joint to marginal inference by projection methods. In this case, the may may be com- approach finite sample confidence bounds not be sharp. To explore these issues, we examine the properties of the finite and empirical examples. In the simulations, we find that joint procedure have correct sample approach tests based on the in simulation finite sample size while conventional asymptotic tests tend to be size distorted and are severely size distorted in some cases. We also find that finite sample tests about individual regression parameters have size less than the nominal value, though they have reasonable power in many situations. than the nominal We On the other hand, the asymptotic tests tend to have size that is greater level. also consider the use of finite we consider estimation of a sample inference demand curve in a small in two empirical examples. In the first, sample; and in the second, we estimate demand example, we find modest differences between the finite sample and asymptotic intervals when we estimate conditional quantiles not instrumenting for price and large differences when we instrument for price. In the schooling example, the finite sample and asymptotic intervals are almost identical in models in which we treat schooling eis exogenous, and there are large differences when we instrument for schooling. the returns to schooHng in a large sample. In the These results suggest that the identification of the structural variables models in both cases The remainder of this paper is is parameters in the instrumental weak. organized as follows. In the next section, duce the modelling framework we are considering and the basic finite we formally sample inference intro- results. Section 3 presents results from the simulation and empirical examples, and Section 4 concludes. Asymptotic properties of the finite sample procedure that include asymptotic optimahty results are contained in an appendix. 2. The Model. 2.1. In this paper, Finite Sample Inference we consider finite sample inference in the quantile regression model characterized below. Assumption Let there be a probability space 1. defined on this space, with Y gR, D e R^™(^), (fi, JF, Ze P) and a random vector K'^™^^), and Al Y — q{D, U) for a function q{d,u) that is measurable. A2 q{d, u) is strictly increasing in u for each d in the support A3 U ~ Umform(0, 1) and is independent from Z. A4 D is statistically dependent on Z. When D= and q[d,T) case, Al, A3, Y and A4 is cf. Koenker (2005) where the r-quantile of are not restrictive , P-a.s, such that D. Y Koenker and Bassett Y is conditional on the dependent variable, D = D is the d for any t € (0,1). In this and provide a representation of to have a continuous distribution function. (1974) and of (0, 1) D' Z', U) Z, the model in Al-4 corresponds to the conventional quantile regression model with exogenous covariates, regressor, e J7 {Y, Y , while A2 The exogenous model was introduced (1978). It usefully generalizes the classical linear restricts Doksum in model Y= C''7o+7i(f/) by allowing for quantile specific effects of covariates D. Estimation and asymptotic Y = inference for the linear version of this model, D'6{U), was developed Bassett (1978), and estimation and inference results have been extended in a directions Koenker and by subsequent. Matzkin (2003) provides many economic examples that framework and considers general nonparametric methods When D ^ Z f7, in number but Z is a set of instruments that for of useful in this fall asymptotic inference. are independent of the structural disturbance the model Al-4 provides a generalization of the conventional quantile model that allows for endogeneity. model identification. this See Chernozhukov and Hansen (2001, 2005a, 2005b) for discussion of the as well as for semi-parametric estimation See Chernozhukov, Imbens, and model and Chesher (2003) for a related and inference theory under strong and weak Newey (2006) for a nonparametric analysis of nonseparable model. thought of as a general nonseparable structural model that allows well as a treatment effects may be model with heterogeneous treatment The model Al-4 can be for endogenous variables as effects. In this case, D jointly determined rendering the conventional quantile regression invahd for and U making inference on the structural quantile function q{d,T). This model generahzes the conventional instrumental variables model with additive disturbances, Y = L''(0, 1), to cases where the impact of D D'ao + a\{U) where U\Z ~ varies across quantiles of the outcome distribution. Note that A4 in this case, below results presented Under Assumption 1, necessary for identification. However, the finite sample inference is remain vahd even when A4 will we (Main 1 U then follow from Equation follow. 1. P\Y <q{D,T)\Z]=T, 2. [Y < q{D,T)} hold, then (2.1) Bernouni(r) conditional on IS equivalent to is A 1-3 Suppose Statistical Implication). {Y < q{D,T)} Proof: not satisfied. state the following result which will provide the basis for the finite sample inference results that Proposition is {U < r} which is Z (2.2) . independent of Z The . results D ^U{0,1). (2.1) provides a set of the quantile function q{d,T). moment When D= conditions that can be used to identify and estimate Z, these are the standard moment conditions used in quantile regression which have been analyzed extensively starting with Koenker and Bassett (1978) and in D^ when is considered Chernozhukov and Hansen (2005b). key result fi-om which we obtain the (2.2) is the states that the event random is Z, the identification and estimation of q{d,T) from (2.1) {Y < q{D,T)} variable regardless of the sample size. This known and so is pivotal in finite samples. sample confidence regions and 2.2. conditional on These tests conditional Model and Sampling Assumptions. heterogeneous effect finite Z is sample inference random a Bernoulli(r) we Assumption {Yi,Di, Zi,i = 2. results allow the construction of exact finite In the preceding section, We also Z is distributed exactly as variable in finite samples. In order to operationalize the finite sample Let r € 1, ...,n) denote the quantile of (0, 1) on probability space i = l,...,n, (f2,.F, 0, t) is interest. Suppose that there is a sample P) (possibly dependent on the sample size), and the following additional conditions hold: = (Finite-Parameter Model): q{D,T) the function q{D, A6 we outlined a general relates to quantile regression. impose the following conditions. such that A1-A4 holds for each A5 result on the observed data, Z. model and discussed how the model random also The variable depends only on r which showed that the model implies that [Y < q{D,r)} conditional on inference, results. distributed exactly as a Bernoulli(T) known, but 6q is q{D,9o,T), for some (Conditionally Independent Sampling): {Ui,...,Un) are ditional on (Zi,...,Z„). 5 Bq £ Qn C M'^", where not. i.i.d. Uniform(0, 1), con- We will use the letter V P to denote all probability laws on the measure space (fi, J^) that satisfy conditions Al-6. Conditions A5-6 restrict the model Al-4 sufficiently to allow requires that the r-quantile function q{d, r) may (where 9q vary with that such a condition n size with n in the sense of sense, in the we can is needed. Pitman sample we Since r). Huber A5 does {Yi,Xi,Zi,i parameter ^o sample inference, allow for the (1973) and Portnoy (1985) where model Kn — > A5 sample inference. oo as obvious it is depend on the to and allows the dimension of the model, iC„, to increase ?i — oo. In this > allow flexible (approximating) functional forms for q{D,9o,T) such as linear combinations of B-splines, trigonometric, power, and spHne = finite to a finite-dimensional are interested in finite However, sense, known up is l,...,n) are series. Condition A6 satisfied is but more generally allows rather rich dynamics, i.i.d., e.g. if of the kinds considered in Koenker and Xiao (2004a) and Koenker and Xiao (2004b). 2.3. The Finite Sample Inference Procedure. Using we vious sections, We are able to provide the key results on finite sample inference. by noting that equation moments (GMM) the conditions discussed in the pre- (2.1) in Proposition function for estimating 1 justifies the following generalized start method-of- ^o^ / n 1 )_M0) Ln{0) J^m^ie) Wr. i=l where mi{9) functions of which is Z = [r — l{Yi < q{Di,9,T))]g{Zi). In this expression, g{Zi) that satisfies dim{g{Z)) fixed conditional on Zj, ..., > dim(0o)i aiid A Z„. Wn is a known vector of is a positive definite weight matrix, convenient and natural choice of n Wn (2.3) i=\ Wn is given by -1 1 -J29{Z^)g{Z,y ^ J T(l ^^ i=l which equals the inverse of the variance of n~^/^ Y^^=i this conditional variance does not depend on do, the '^i(^o) conditional GMM also corresponds to the continuous-updating estimator of We The focus on the GMM GMM function Ln{9) for defining W„ ..., Hansen, Heaton, and Yaron (1996). the key results for finite sample inference. its close and asymptotic inference procedures. In addition, we show the appendix that testing based on L„(0) state the key finite Z„. Since defined above function provides an intuitive statistic for performing inference given relation to standard estimation We now on Zi, function with sample may have results. in useful asymptotic optimality properties. Proposition Under A1-A6, 2. on (Zi, tional ..., = ^- statistic Ln{9o) li^ D^ - ^^) and {Bi, ..., Proof: Implication 2 of Proposition 1 U= J2{T - B,) 9iZ^)] Wr, are iid Bernoulli rv's with Bn) conditionally pivotal: Ln(^o) is EBi = and A6 g{zA = sample inference on Given the imply the is By Let finite critical values D GMM sample distribution of Ln{9o), a if 1 — may be unknown obtained allowing a- level test of the null hypothesis that Ln{9) > Cn{a) where Cn{oi) 3. is I uniformly in ^0 — Fix an a G (0,1). a-level test of 9 P gV, is the Q-quantile CR{a) — 9^: is {9 : Ln{9) < is : 3. > a. CR{a) Prp(^o ^ CR{a)) P{Cn <l]>a] Proposition 3 demonstrates how one may is also a valid critical region for <\ — a. Moreover, these results hold < Cnia)} and Prp{Ln((?o) < Cn{a)} 1 > — a. a, by the and Ln{9o) =d Cnobtain valid finite sample confidence regions and parameter vector 9 characterizing the quantile function q{D,9o,T). result generalizes the It a valid a-level confidence region for inference is equivalent to {L„(6'o) definition of Cn{a) := inf{/ It is also Cn{a)}. a valid a-level confidence region mfp^-pPxp{9o G CR{a)) > a and suppg-pPrp(0o ^ CR{a)) < £ CR{a) tests for the CR{a) stated formally in Proposition about 9o in finite samples: Prp(0o € CR{a)) obtaining a CR(a) = the c„(Q:)-level set of the function Ln{0): This result Proposition case. function Ln{0) ^o- follows immediately from the previous results that Proof: Z„). inverting this test-statistic, one obtains confidence regions for ^o- CR{a) be for 9o. ..., result. from the distribution given by the rule that rejects the null 6 =^ 9o of Cn- , Conditional on {Zi,...,Zn), the distribution does not depend on any Oq. parameters, and appropriate finite ^n, condi- which are independent 0/ (Zi, r, Proposition 2 formally states the finite sample distribution of the &t 6 = Zn), where Thus, this approach of Walsh (1960) from the sample quantiles to the regression apparent that the pivotal nature of the finite the asymptotically pivotal nature of the rank-score method, sample approach cf. is similar to Gutenbrunner, Jureckova, Koenker, and Portnoy (1993) and Koenker (1997), and the bootstrap method of Parzen, Wei, and Ying (1994).^ In sharp contrast to these methods, the finite sample approach does not The finite sample method should not be confused with the Gibbs bootstrap proposed in He and Hu (2002) who propose a computationally attractive variation on Parzen, Wei, and Ying (1994). The method is also very different from specifying the finite sample density of quantile regression as 7 in Koenker and Bassett (1978). The rely on asymptotics and a homoscedasticity assumption, while the The statement method valid in finite samples. Moreover, the rank-score is of Proposition 3 is finite sample approach does for joint inference relies on not. about the entire parameter vector. One can define a confidence region for a real-valued functional ip{9o,T) as CR{a,-4)) infpg-pPrp(V'(^Oi''") £ CR{a,il>)) taking all ing to ^[1]. > a by about a single component of structed as the set {ip{0,T) : e G CR{a)] imphes the event Since the event {^o in inference = {^[i] 6 £ : Confidence bounds for ^|i] say is, ^[ij, 3. CR{a,4')], £ {V'(6'o,t) Proposition 9, CR{a)}. That CR{a) and then vectors of 9 in e CR{a)]. For example, a confidence region the confidence region for follows that it one if interested is may be for 0|i[ obtained by ^jj] is confirst extracting the element from each vector correspond- may be supremum obtained by taking the infimum and over this set of values for 9^iy 2.4. Primary Properties of the Finite Sample Inference. The confidence regions obtained in the preceding section have a features. finite number Perhaps the most important feature of the proposed approach sample inference under weak conditions. restrictions makes many that is tests and and appealing allows for it Working with a model defined by quantile possible to construct exact joint inference in a general non-linear, non- it separable model with heterogeneous effects that allows for endogeneity. with sample finite of interesting This in contrast is other inference approaches for instrumental variables models that are developed for additive models only. The approach is valid without imposing distributional assumptions and allows forms of heteroskedasticity and rich forms of dynamics. The result on Eisymptotic arguments and essentially requires only that Y is for general obtained without relying has a continuous conditional distribution function given Z. In contrast with conventional asymptotic approaches to inference in quantile models, the validity of the well-behaved density for Y: It finite sample approach does not depend upon having a does not rely on the density of Y given D= d and Z= z being continuous or differentiable in y or having connected support around q{d,T), as required in e.g. Chernozhukov and Hansen (2001). In addition to these features, the finite sample inference procedure will remain valid in situations where the parameters of the model are only partially identified. The confidence regions obtained from the finite sample procedure will provide vahd inference about q{D, r) q{D,9o,T) even when 9^ on the point made in finite sample density of Hu QR is is not uniquely identified by P\Y < q{D,9o,T)\Z] (2002). In addition, since the inference is vahd for = r. any = This builds n, it follows not pivotal and can not be used for finite sample inference unless the nuisance parameters (the conditional density of the residuals given the regressors) are specified. remains valid under the asymptotic formalization of "weak instruments", as trivially that it defined e.g. As noted Stock and Wright (2000). in previously, inference statements obtained from the finite sample procedure will also remain valid in models where the dimension of the parameter space K-^ is with future increases of n since the statements are valid n. for of the previous section remain valid in the asymptotics of where Kn/n -^ 0,Kn —* oo,n —^ oo. any given allowed to increase Thus, the results Huber (1973) and Portnoy (1985) These rate conditions are considerably weaker than those required for conventional inference using Wald Newey (1997) which require K'^/n —^ 0,Kn — > statistics as described in — oo,n > Portnoy (1985) and oo. Inference statements obtained from the finite sample procedure will be valid for infer- may perform ence about extremal quantiles where the usual asymptotic approximation poorly.^ One quantiles is tiles, to alternative to using the conventional asymptotic approximation for extremal pursue an approach exphcity aimed at performing inference e.g as in Chernozhukov The extreme (2000). for extremal quan- value approach improves upon the usual asymptotic approximation but requires a regular variation assumption on the ditional distribution linearity quite ofY\D, that the and exogeneity. None tail tails of the con- index does not vary with D, and also rehes heavily on of these assumptions are required in the finite sample approach, so the inference statements apply more generally than those obtained from the extreme value approach. It is also will worth noting that while the approach presented above is exphcitly finite sample, remain valid asymptotically. Under conventional assumptions and asymptotics, and Pollard (1989) and Abadie (1995), the inference approaches conventional e.g. it Pakes GMM based joint inference. Finally, it is important to note that inference that for joint inference the approach may be made by We is is simultaneous on all components of 9 and not conservative. Inference about subcomponents of 6 projections, as illustrated in the previous section, and may be conservative. explore the degree of conservativity induced by marginahzation in a simulation example in Section 2.5. is 3. Computation. The main computing the confidence difficulty regions. values can be easily constructed The with the approach introduced in the previous sections distribution of Ln(9o) by simulation. The more function Ln{9) to find joint confidence regions Whether a given quantile is may is serious not standard, but problem is its critical that inverting the pose a significant computational challenge. extremal depends on the sample size and underlying data generating process. However, Chernozhukov (2000) finds that the usual asymptotic approximation behaves quite poorly examples for < r < .2 and 1 > t > .8. 9 in some One possible approach this approach becomes intractable. MCMC but as the dimension of to simply use a naive grid-search, MCMC methods. of grid points is To help alleviate this problem, seems attractive in this setting because We than performing a conventional grid search. for a single parameter which 2.5.1. in we explore the use of generates an adaptive set and so should explore the relevant region of the parameter space more quickly some may be approach that also consider a marginalization combines a one-dimensional grid search with optimization MCMC it increases, for estimating a confidence computationally convenient and may be more bound robust than irregular cases. The computation Computation, of the Critical Value. may of the critical value c„(a) proceed in a straightforward fashion by simulating the distribution £„• We briefly outline a simulation routine below. Algorithm 1 (Computation oi Cn{a).)- Given {Ui,j,i < n) as iid Uniform, and let {Bij = 2.5.2. {Cnj,j = 1, ••, 9{Z^)) ' number J), for a large l,...,n), for j < l{Uij hi^Ztii^ -B^J)9{Z^)YWn {^E7=i(^ ' B,,j) of the sam.ple = {Zi,i T),i 3. . < n). = 2. 1,...,J; 1. Draw Compute Cnj = Obtain Cn {a) as the a-quantile J. Computation of Confidence Regions. Finding the confidence region requires computing the c„(Q)-level set of the function Ln{0) which involves inverting a non-smooth, non-convex function. For even due to the To help Hastings moderate sized problems, the use of a conventional grid search is impractical computational curse of dimensionality. resolve this problem, MCMC algorithm.'^ we consider the use The idea is that the of a generic MCMC random walk Metropolis- algorithm will generate a set of adaptive grid-points that are placed in relevant regions of the parameter space only. focusing more on relevant regions of the parameter space, the use of MCMC may By alleviate the computational problems associated with a conventional grid search. To implement the density and feed it MCMC into a algorithm, random walk we treat f{9) oc MCMC algorithm. Chernozhukov and Hong (2003), except that we use function rather than pseudo-posterior is implemented as 6^'-\ 1. 2. Generate min{l,/(<>„p)//(0('))} and Repeat Steps 1-3 Other MCMC it (The idea as a quasi-posterior is similar to that in here to get level sets of the objective quantiles.) The basic random walk MCMC follows: Algorithm 2 (Algorithm and given means and exp(— L„(0)) J times Random Walk MCMC). For a symmetric proposal density /i(-) 2. Take ^(*+^) = 9%p with probability O^^iop ~ h{9 - 6l(')). B^') otherwise. replacing 6''*) with 3. ^(*+i) Store {e^'\ e'f^lp, L„(0(*)), as starting point for each repetition. algorithms or stochastic search methods could also be employed. 10 L„(4tlp)). 4- At each MCMC step, the algorithm considers two potential values for 9 and obtains the corresponding values of the objective function. Step 3 above walk differs from a conventional random MCMC in that we are interested in every value considered not just those accepted by the procedure. The implementation for the chain MCMC and a transition density practical implications, fine of the algorithm requires the user to specify a starting value The g{-). and implementation choice of both quantities can have important any given example in As MCMC illustrated above, the issues. algorithm generates a set of grid points {9^^\..., set of evaluations of the objective function, set of draws for 6 we can where the value 6'''^'} Using as a by-product, a set of values for the objective function {L„(6''^'), ...,Ln(^('^))}. by taking the some Robert and Casella (1998) provide tuning in both the choice of g(-) and the starting value. an excellent overview of these and related will typically involve and, this construct an estimate of the critical region < of L^iO) CR{a) = c„(a): [d'^'^ : Ln{9^'^) < c{a)}. Figure Both illustrates the construction of these confidence regions. 1 = confidence regions for r .5 in a simple demand example regions illustrated here are for a model in which price is that we figures illustrate discuss in Section 3. 95% The treated as exogenous. Values of the intercept are on the x-axis and values of the coefficient on price are on the y-axis. Figure 1 Panel MCMC an draw A illustrates a set of MCMC draws in this example. Each symbol of the parameter vector that satisfies Ln{9) < c„(.95), + represents and each symbol • represents a draw that does not satisfy this condition. Thus, (a numerical approximation to) the confidence region the MCMC is given by the area covered with symbol + algorithm appears to be doing what we would want. come from within the confidence in the figure. In this case, The majority of the draws do a good job of region, but the algorithm does appear to exploring areas outside of the confidence region as well. Panel B of Figure 1 presents a comparison of the confidence region obtained through and the confidence region obtained through a region is represented by the black area in the figure. Here, we some points that are not 2.5.3. line, and the see that the in the other, grid search. MCMC The boundary region two regions are almost but the agreement is is again given by the light gray identical. Both regions include quite impressive. Computation of Confidence Bounds for Individual Regression Parameters. The approach outlined above In our applications, may be MCMC of the grid search MCMC used to estimate joint confidence regions which can be used we use estimates of 9 and the corresponding asymptotic distribution obtained from the quantile regression of Koenker and Bassett (1978) in exogenous cases and from the inverse quantile regression of Chernozhukov and Hansen (2001) in endogenous cases as starting values and transition 11 densities. about the entire parameter vector or for joint inference parameters. there one If may be for inference about subsets of regression interested solely in inference about an individual regression parameter, is a computationally more convenient approach. In particular, for constructing a confidence bound for a single parameter, knowledge of the entire joint confidence region may unnecessary which suggests that we is collapse the rf-dimensional search to a one-dimensional search. For concreteness, suppose we are interested in constructing a confidence bound for a particular element of 9, denoted vector. We = exists a value of 6 with 9^^ is required to place 0j*j, and 6'[i], note that a value of 0[i], 9T^, let 0[_i] On the other hand, can include 6*^, if satisfies Ln{9*) < bound as long as there < Cn{a). Since only one such value of 9 restrict consideration to 9", the point ^j*j, Ln{9) Cn{a), in the confidence inside the confidence bound, we may = < that minimizes Ln{9) conditional on ^mi be no other point that lie that satisfies Ln{9) in the confidence will denote the remaining elements of the parameter say ^m,, will If . L„(0*) > we may conclude that there Cn{ct), Cn{a) and exclude 9T^,, we have found a point that from the confidence bound. satisfies Ln{9) < Cn{a) and bound. This suggests that a confidence bound for ^ni can be constructed using the following simple algorithm that combines a one-dimensional grid search with optimization. Algorithm 3 (Marginal Approach.). I,..., J}. 2. region for For 9[i] j as = 1,...,J, find ^/ {6'j'jj : L„((0j'j], j, 6'j'_jj)' 1. = < Define a suitable set of values for ^mi, {^m, j arg inf L„{{9l,, 9', .^A'). 3. = Calculate the confidence c„(q)}}. In addition to being computationally convenient for finding confidence bounds for individual parameters in high-dimensional settings, we also anticipate that this approach in some sets because it to perform well Since the marginal approach focuses on only one parameter, irregular cases. typically be easy to generate a tractable some robustness will and reasonable search region. The approach it will will have multimodal objective functions and potentially disconnected confidence considers all values in the grid search region and will not be susceptible to getting stuck at a local mode. 3. In the preceding section, Simulation and Empirical Examples we presented an inference procedure for quantile regression that provides exact finite sample inference for joint hypotheses and discussed for subsets of quantile regression parameters may be obtained. how confidence bounds In the following, we further explore the properties of the proposed finite sample approach through brief simulation examples 12 and through two simple case make we studies.^ In the simulations, parameter vector based on the finite use of the asymptotic approximation we consider marginal inference, may be about the entire find that tests sample method have the correct size while tests which When we substantially size distorted. find that using the asymptotic approximation leads to tests that reject too often while, as would be expected, the finite sample method yields conservative tests. We the also consider the use of the finite we first, consider estimation of a we consider estimation cases, we of the find that the finite sample inference procedure demand model in two case studies. In smaU sample; and in a impact of schooling on wages in the second, a rather large sample. In both in sample and asymptotic intervals are similar when the variables of interest, price and years of schoohng, are treated as exogenous. However, when we use instruments, the finite sample and asymptotic intervals differ significantly. In each of these examples, we also consider specifications that include only a constant and the covariate of interest. In these two dimensional situations, computation relatively simple, so is estimating the finite sample intervals using a simple grid search, inference approach suggested in the previous section. We find that confidence bounds for the parameter of interest in the MCMC, we consider and the marginal ah methods result in similar demand example, but some there are discrepancies in the schoohng example. Simulation Examples. To 3.1. illustrate the use of the asymptotic and finite sample ap- proaches to inference, we conducted a series of simulation studies, the results of which are summarized in Table In each panel of Table 1. marginal hypothesis that ^(t)[i] and the second row corresponds = {.5, .75, .9}. totic The first of Table simulation model The sample size In all examples, is 1 for the last three sample approach. All results are A is row corresponds to testing the the first element of vector 9{t), median, 75'^ percentile, and = 9q{t). 90**^ For each percentile, for columns correspond 5% to results obtained via the finite level tests. corresponds to a hnear location-scale model with no endogeneity. given hy The Y = D+ {\ + D)e where D ~ BETA(1,1) and t ~ The iV(0, 1). conditional quantiles in this model are given by q{D,6o,T) is 100. we set use the identity function for g(-). In the exogenous model, i.e. three columns correspond to results obtained using the usual asymp- approximation^ and the Panel first to testing the joint hypothesis that 6{t) model, we report inference results r G ^o(''')[i] the ^(t)[j] 1, where we base the asymptotic approximation on the conventional quantile regression of Koenker and Bassett (1978), using the Hall-Sheather bandwidth choice suggested by Koenker use the inverse quantile regression of Chernozhukov and bandwidth choice suggested by Koenker (2005). 13 Hansen (2001) in = (200.5), and we the endogenous settings, using the + 9o{t)[2] So{t)ii]D where of the normal Looking tests CDF = 6'o(t)[2] evaluated at ^~^{t) and 0o(t)[i] == 1 +$ first at results for joint we inference, see that the finite ^(t) the inverse sample procedure produces On with the correct size at each of the three quantiles considered. asymptotic approximation results which overreject in tests the size distortion increasing as one moves toward the behavior $ ^{t) for t. tail the other hand, the in each of the three cases, unsurprising given that the usual asymptotically normal inference is about the for inference quantiles of the distribution (see e.g. tail with quantiles of the distribution. This is inappropriate Chernozhukov (2000)) while the finite sample approach remains valid. When we asymptotic distribution are the size distorted see that tests based Here, the finite sample for joint inference. inference continues to provide valid inference in that the size of the test nominal level. However, the far less ft-equently To finite 5% than the based on the level — on the horizontal = that 6(t)[i] and the axis, where Oo{t)ui^ is test using the we plot in power curves Figure the hypothesized value for 2. In this measures rejection frequencies of the hypothesis The solid line in the figure gives the power curve sample approach, and the dashed hne gives the power curve asymptotic approximation. Prom this figure, we can for the see that while conservative, sample procedure do have some power against alternatives. tests based on the finite sample power curve always finite are 0{t)ii-i given where the horizontal axis equals zero, and remaining points give power against various alternatives. for the test using the finite is 9a{''')\i] vertical axis 6^(r)h]. Thus, size smaller than the would suggest. sample and usual asymptotic approximation finite figure, values for Oj\{t)ii-i is sample approach appears to be quite conservative, rejecting further explore the conservativity of the finite sample approach, for tests on the with the distortion increasing as one moves toward though the distortions are smaller than tail, we again look at marginal inference about ^(t)m], lies The below the corresponding power curve generated from the asymptotic approximation, though this must be interpreted with caution due to the distortion in the asymptotic In Panels B-D tests. of Table 1, we consider the performance of the asymptotic approximation and the finite sample inference approach in a model with endogeneity. are generated from a location variables. Zj is ~ .8. In particular, /V(0,1) for j As above, ^o(''")[2] normal we have 1,2,3, e ~ Y = —1 + A^(0,1), D+ V~ e and A^(0, 1), D = where evaluated at 0o(t)[2] t. for this simulation and + HZs + V where the correlation between e and v + YiZi IIZ2 this leads to a finear structural quantile function of the +^o(''")[i)D CDF = The data model with one endogenous regressor and three instrumental = -1 +<I>~^(t) and As above, the sample 14 6'o(t)|i] = size is 100. 1 for form q{D,6f^,T) = $~^(t) the inverse of the We explore the behavior of the inference procedures for differing degrees of correlation be- tween the instruments and endogenous variable by changing B, we = set IT 0.05 which produces a first tionship between the instruments and endogenous variable the asymptotic approximation to perform poorly. F-statistic = 25.353. In Panel D, 11 is 1 11 across Panels B-D. In Panel stage F-statistic of 0.205. and the very weak and we would expect is In Panel C, = 11 stage F-statistic first In this case, the rela- and the 0.5 Both 96.735. is stage first of these between the endogenous variable and specifications correspond to fairly strong relationships the instruments, and we would expect the asymptotic approximation to perform reeisonably The well in both cases. inference in all As expected, finite sample procedure, on the other hand, should provide accurate three cases. the tests based on the asymptotic approximation perform quite poorly in the weakly identified case presented range from a minimum in Panel B. Rejection frequencies of .235 to a maximum 5% of .474 for a for the level test. asymptotic tests In terms of size, the finite sample procedure performs quite well. As indicated by the theory, the finite sample approach has approximately the correct vector. When with rejection rates of .024 for r The similar size for performing tests about the entire parameter considering the slope coefficient only, the finite sample procedure results for the = .0212 for t .5, = models where identification .75, .02 for r = .9 tests conservative quantile. C and D stronger given in Panels is though not nearly so dramatic. The hypothesis and is are based on the asymptotic procedure overreject in almost every case, with the lone exception being testing the joint hypothesis at the median when IT = and they decrease when The 1. 11 size distortions increase as increases fi'om .5 to 1. The one moves from t distortions at the 90'*^ The finite more modest with rejection rates ranging = .5 between .068 and sample results are much more stable than the asymptotic Marginal inference remains quite conservative with sizes to t results. = and r The is .9, .75, joint tests, size distorted. ranging from .0198 to .0300. sample inference procedure = .086. with rejection frequencies ranging between 4.6% and 5.5%, do not appear to be results clearly suggest that the finite .5 percentile remain quite large with rejection frequencies ranging between .1168 and .1540. For r the distortions are = The preferable for testing joint hypotheses, and given the size distortions found in the asymptotic approach the results also seem to favor the finite sample procedure for marginal inference. As above, we in Figures 3-5. also plot power curves for the asymptotic and Figure 3 contains power curves for 11 = .05. finite sample testing procedures In this case, the finite sample The procedure appears to have essentially no power against any alternative. is unsurprising given that identification in this case is where the correlation between the instruments and endogenous regressors is sample procedure seems to have quite reasonable power. The power curves 15 lack of power In Figures 4 and extremely weak. 5, stronger, the finite for the finite sample procedure are similar to the power curves of the asymptotic tests across a large portion of the parameter space. The finite sample procedure does have lower power against some alternatives must that are near to the true parameter value than the asymptotic tests, though again this be interpreted with some caution due to the distortions asymptotic in the tests. The Overall, the simulation results are quite favorable for the finite sample procedure. results for joint inference confirm the theoretical properties of the procedure numeric approximation error size. is not a large problem as the tests all and suggest that have approximately correct For tests of joint hypotheses, the finite sample procedure clearly dominates the asymptotic procedure which somewhat may be less clear For marginal inference, the results are substantially size distorted. cut though may the finite sample procedure favorable for the finite sample procedure. still result in tests that are quite conservative, On do appear to have nontrivial power against many hypotheses. on the asymptotic approximation have size greater In this case, though the tests the other hand, tests based than the nominal simulation level in the models considered. 3.2. Case Studies. elasticities which Demand 1. may month period Graddy fi-om December 2, amount the total in 8, 1992. measured of fish sold that day. The £is total in New York The price in the average daily price and the quantity as sample consists of 111 observations for the which the market was open over the sample period. model with non-additive disturbance: \n(Qp) quantity that would be demanded demand normalized to when the level of demand of enter the we over the five and quantity data For the purposes of this illustration, we focus on a simple Cobb-Douglas first, demand These data were used previously (1995) to test for imperfect competition in the market. are aggregated by day, with the price days market in the Fulton fish May 1991 to of demand. The data contain observations potentially vary with the level of on price and quantity of fresh whiting sold we present estimates In this section, for Fish. set /3{U) describes 0, = ao{U) the price were p, + ai{U) \n{p) + X'P{U), where Qp is the U is an unobservable affecting the level follow a C/(0, 1) distribution, is U, and model with random = if and in X is ai{U) is the random demand elasticity a vector of indicator variables for day of the week that coefficient the second, P{U). We consider two different specifications. In the we estimate P{U). how much producers would supply unobserved disturbance U. The factors random demand Z if A supply function Sp = f{p, the price were p, subject to other factors affecting supply are assumed to Z,U) Z and be independent of demand disturbance U. As instruments, we consider two Stormy is a dummy variable which indicates greater than 18 knots, feet different variables capturing and Mixed is a weather conditions at wave height greater than dummy variable indicating 4.5 feet sea: and wind speed wave height greater than 3.8 and wind speed greater than 13 knots. These variables are plausible instruments since 16 weather conditions at sea should influence the amount of should not influence demand for the product/ Simple fish OLS that reaches the market but regressions of the log of price on B? and these instruments suggest they are correlated to price, yielding and 15.83 when F-statistics of 0.227 both Stormy and Mixed are used as instruments. Asymptotic intervals are based on the inverse quantile regression estimator of Chernozhukov and Hansen (2001) when we i.e. For models in which we set treat price as endogenous. D= Z, which we treat the covariates as exogenous, we base the asymptotic intervals on the in conventional quantile regression estimator of Koenker and Bassett (1978).^ A of Table 2 gives estimation results treat- Estimation results are presented in ing price as exogenous, and Panel B when we instrument both of the weather condition instruments described above. C and D Panels for price using Table Panel 2. contains confidence intervals for the random elasticities dummy variables for day of the week as additional covariates Panels A and B respectively. In every case, we provide estimates include a set of and are otherwise identical to of the 95%-level confidence interval obtained from the usual asymptotic approximation and the finite sample procedure. For the finite sample procedure, MCMC, we report a grid search,^ and the marginal procedure^" in Panels A intervals obtained via C and B. In Panels and D, we report only intervals constructed using the asymptotic approximation and the marginal procedure. For each model, we report estimates for t Looking we first at modest see when no interval Panels = .25, t = A and C which report results for models that treat differences between the asymptotic and finite sample = and r .50, .75. price as exogenous, intervals. At the median 95% level (-1.040,0.040). The covariates (other than price and intercept) are included, the asymptotic is differences (-0.785,-0.037), and the widest become more pronounced of the sample intervals finite at the 25''^ is and 75"^ percentiles where we would expect When the asymptotic approximation to be less accurate than at the center of the distribution. day of the week effects are included, the asymptotic intervals tend to become narrower while the finite sample intervals widen slightly leading to larger differences in this case. However, the More detailed arguments may be found We use the Hall-Sheather in Graddy (199.5). bandwidth choice suggested by Koenker (200.5) to implement the asymptotic standard errors. When price is treated as exogenous, equally spaced grid over we use [-4,2] we use an equally spaced with spacing .015 for ai different grid search regions for each quantile. with spacing .025 for .025 for ai. For r spaced grid over ^'^For = For t ao and an equally spaced grid over an equally spaced grid over .75, [-10,30] [6,12] with spacing .0125 we used an with spacing grid over [5,10] with spacing .02 for qq for all quantiles. for = .25, [-40,40] When price is we used an equally spaced with spacing .25 and an treated as endogenous, for ai. For r qo and an equally spaced grid over grid over [0,10] = [-5,5] .50, we used with spacing equally spaced grid over [0,30] with spacing .05 for qo and an equally .05 for qi. the marginal procedure, we considered an equally spaced grid over models. 17 [-5,1] at .01 unit intervals for all basic results remain unchanged. Also, for obtaining the finite it is worth noting that methods three computational all sample confidence bounds give similar answers in the model with only an intercept and price with the marginal approach performing slightly better than the other MCMC and the marginal approach as a grid search which may not be feasible in high dimensional estimation of the demand model two procedures. This finding provides some evidence that may do weU computationally as problems. Turning now to results B and in Panels D, we for and pronounced at the 25'''' Even at the median wide. between the asymptotic intervals and the see quite large differences intervals constructed using the finite 75*'' sample approach. As above the differences are particularly percentiles in the where the bounds finite intervals definitely call into question the validity of the asymptotic which include the sample and asymptotic approximation not surprising given the relatively small sample size and the fact that is When intervals. for all three quantiles between the large differences sample intercept, the finite wide as the corresponding cisymptotic additional controls are included, the finite sample The sample intervals are extremely finite model with only price and an intervals are approximately twice as entire grid search region. using instrumental variables we in this case, are estimating a nonlinear instrumental variables model. Finally, it worth noting again the three approaches to constructing the is The interval in general give similar results in this case. differences between the grid search and marginal approaches could easily be resolved by increasing the search region approach which was restricted to values we the grid search and MCMC intervals at the for the marginal were a priori plausible. The difference between 25''' percentile is more troubhng, though could it be resolved through additional simulations or starting points. likely As a final illustration, plots of and an intercept are provided in 95% confidence regions in the model that includes only price Figures 6 and where price regions are all is instrumented more for using or less eUiptical irregular and in 7, all model of the not surprising it is sample intervals produce similar results. The on the other hand, are not nearly so well-behaved. In general, they are many cases appear to be disconnected, .25 quantile in the results in jump in the In the exogenous case, and seem to be well-behaved. In this case, Table 2 region appears to be disconnected. fails to Figure 6 contains confidence regions for the weather conditions. of the procedures for generating finite regions in Figure 7. and Figure 7 contains confidence regions coefficients treating price as exogenous, that felt sample finite is The apparent failure of MCMC at the almost certainly due to the fact that the confidence The MCMC algorithm explores one of the regions but to the other region. In cases like this, it is unlikely that a simple random walk Metropolis-Hastings algorithm will be sufficient to explore the space. While more complicated 18 MCMC procedure a convenient method to pursue is Returns 2. As our to Schooling. schoohng model that allows and the basic if one is it seems that the marginal interested solely in marginal inference. example, we consider estimation of a simple returns to final schoohng on wages. for heterogeneity in the effect of employed identification strategy The data (1991). schemes could be explored, or alternative stochastic search We use data schoohng study of Angrist and Krueger in the drawn from the 1980 U.S. Census and include observations on men born are between 1930 and 1939. The data contain information on wages, years of completed schooling, and year of birth, and quarter state As ao{U) + ai{U)S is + We total sample consists of 329,509 observations. Y X'P[U) where U is might think of U and S the log of the weekly wage, is dummies a vector 51 state of birth and 9 year of birth coefficients (i{U), (0,1). The we focus on a simple hnear quantile model in the previous section, schoohng, AT of birth. is of the that enter with unobserved as indexing which case ai(r) may be thought ability, in years of schooling may be jointly determined with unobserved an instrument schoohng, following Angrist and Krueger (1991). for As in the previous regression estimator random an unobservable normahzed to follow a uniform distribution over of as the return to schoohng for an individual with unobserved abihty r. Since specifications. In the first, Y = form years of completed we set /3(I7) = 0, and in the second, we believe that we use quarter ability, We of birth as consider two different we estimate P{U). example, we construct asymptotic intervals using the inverse quantile when we schooling as exogenous, treat schooling as endogenous. we construct the asymptotic For models in which we treat intervals using the conventional quantile regression estimator. We present estimation results in Table schoohng as exogenous, and Panel we instrument of birth B and for dummy In every case, A of Table 3 birth. Panels we provide estimates sample procedure, we report intervals obtained via price is in Panels is treated as endogenous, spaced grid over [0,0.2.5] for the exogenous case endogenous include a set of 51 state 95% A and confidence interval obtained sample procedure. MCMC and a modified For the MCMC finite procedure we use unequally spaced MCMC procedure we employ grids over [4.4,5.6] for ao for a\ grid over [3,6] with spacing .01 for is a and unequally Qi where the spacing depends on the quantile under consideration. with spacing .001 for all and B. The modified we use an equally spaced ^^For the marginal procedure, in A treated as exogenous, spaced grids over [0.062,0.077] price of the finite when that better accounts for the specifics of the problem, a grid search, ^^ and the marginal procedure^^ When C and D variables but are otherwise identical to Panels from the usual asymptotic approximation and the (MCMC-2) gives estimation results treating contains confidence intervals for the schoohng effect schoohng using quarter of 9 year of birth respectively. B Panel 3. When qo and an equally for all quantiles. we considered an equally spaced grid over [0.055,0.077] at .0004 unit intervals models, and we used an equally spaced grid over case. 19 [-1,1] at .01 unit intervals in the simple stochastic search algorithm that simultaneously runs mode at a local MCMC The of the objective function. chains each started idea behind the procedure we may the chains get stuck near a local mode. if MCMC is that the simple By tends to get "stuck" because of the sharpness of the contours in this problem. multiple chains started at different values, even five If potentially explore more MCMC C and procedure. In Panels of the function the starting points sufficiently cover the function, the approach should accurately recover the confidence region the unadjusted more quickly than D, we report only intervals constructed using the asymptotic approximation and the marginal procedure. For each model, estimates for r Looking in Panels = first at A = r .25, and r .50, = using we report .75. estimates of the conditional quantiles of log wages given schooling presented and C, we see that there asymptotic inference In Panel results. is A very little difference between the finite sample and where the model includes only a constant and the schooling variable, the finite sample and asymptotic intervals are almost identical. There are between the larger differences 51 state of birth effects even and large all sample and asymptotic intervals in Panel C which includes 9 year of birth effects in addition to the schooling variable; The in this case the differences are quite small. in not surprising since in the is finite though correspondence between the results close exogenous case the parameters are well-identified and the sample enough that one would expect the asymptotic approximation to perform quite well for but the most extreme quantiles. While there model which is close agreement between the treats schooling as exogenous, there are asymptotic and finite quarter of birth. The sample results finite sample in sample and asymptotic results finite substantial differences between the still we instrument the case where in the though When we in all cases they exclude zero. model that includes the for schoohng using with the exception of the interval at the median, intervals, are substantially wider than the asymptotic intervals in the intercept, in the state of birth model with only schoohng and an consider the finite sample intervals and year of birth covariates, the differences are huge. For aU three quantiles, the finite sample interval includes at least one endpoint of the search region, and in no case are the bounds informative. may be Also, While the finite sample bounds quite conservative in models with covariates, the differences in this case are extreme. we have evidence from the model which identified setting the inflation of the that identification in this model While the finite is treats education as exogenous that in a well- bounds need not be large. Taken together, this suggests quite weak. sample intervals constructed through the different methods are similar at the median in the instrumented model, there are large differences between the finite sample intervals for the .25 and .75 quantiles with the simple marginal approach performing the best. The MCMC performing the worst and the difficulty in this case 20 is that the objective function has extremely sharp contours. This sharpness of contours the 95% level illustrated in Figure 8 is MCMC-2 confidence region obtained from the procedure for the which plots .75 quantile in the schooling example without covariates. The shape the basic of the confidence region poses difficulties for both the traditional grid search and MCMC procedure. The problem with that even with a very fine grid one the grid is the grid search unlikely to find many points along the one may miss the confidence grid, of the confidence set also causes problems with is that the interval more than a few points chosen carefully to include and with a course region, is MCMC so narrow in the region unless "fine" describing the confidence region entirely. The narrowness by making transitions quite random walk Metropohs-Hastings procedure one must Essentially with a default is difficult. specify either a very small variance for the transition density or must specify the correlation exactly so that the proposals lie along the "line" describing the contours. Designing a transition density with the appropriate covariance that lie off of the line making MCMC this suggests that complicated as even slight perturbations is is transitions unhkely unless the variance likely to travel in poor convergence properties The MCMC-2 and may is Taken together very slowly through the parameter space resulting sample confidence regions. difficulty in generating the finite procedure alleviates the problems with the random walk by running multiple chains with result in proposals small. different starting values. MCMC somewhat Using multiple chains provides local exploration of the objective function around the starting values. In cases where the objective function is largely concentrated around a few local modes, erating the finite sample confidence regions. This approach at the .25 quantile where we see that the MCMC-2 improvement this provides is still interval is insufficient in this still in gen- example significantly shorter the interval generated through the marginal approach suggesting that the MCMC-2 than procedure did not travel sufficiently through the parameter space. In this example, the marginal approach seems to clearly dominate the other approaches to computing the finite sample confidence regions that we have considered. points that he within the confidence approaches. It is also bound for the It finds more parameter of interest than any of the other simple to implement, and the search region can be chosen to produce a desired level of accuracy. 4. In this paper, restrictions that Conclusion we have presented an approach is valid to inference in under minimal eissumptions. models defined by quantile The approach does not rely on any asymptotic arguments, does not require the imposition of distributional assumptions, and be vahd for both linear and nonlinear conditional quantile models and endogenous as well as exogenous variables. The approach 21 relies for will models which include on the fact that objective functions that quantile regression aims to solve are conditionally pivotal in finite samples. This conditional pivotal property allows the construction of exact regions and on finite sample confidence The it chief may be We bounds finite sample joint confidence for quantile regression coefficients. drawbacks of the approach are that it may be computationally difficult and that quite conservative for performing inference about subsets of regression parameters. suggest that MCMC may be used or other stochastic search algorithms to construct joint we suggest a simple algorithm that combines optimization In addition, confidence regions. with a one-dimensional search that can be used to construct confidence bounds regression parameters. we In simulations, for individual sample inference procedure find that the finite is not conservative for testing hypotheses about the entire vector of regression parameters but that is it sample conservative for tests about individual regression parameters. tests do have moderate power approximation tend to overreject. many in and situations, tests However, the finite based on the asymptotic Overall, the findings of the simulation study are quite favorable to the finite sample approach. We also consider the use of the finite estimation of a demand curve large sample. In the sample inference in two simple empirical examples: a small sample and estimation of the returns to schooling in a in demand example, we find modest differences between the finite asymptotic intervals when we estimate conditional quantiles not instrumenting large differences when we instrument and again there are large differences These results suggest that in both instrumental variables models Appendix In the preceding sections, this section, (1) finite sample and which we treat schooling as exogenous, approaches when we instrument for schooling. parameters in the Optimality Arguments for L^. A. Appendix: a finite sample inference procedure for quantile regression procedure provides valid inference statements this also has desirable large most powerful (UMP) invariant in finite samples. In sample properties: form test. (2.3) contains a (locally) asymp- Inversion of this test therefore gives uniformly most accurate invariant regions. (The definitions of power and invariance follow those in Choi, Hall, (2) in the in identification, the class of statistics of the totically uniformly (locally) models and weak. is we show that the approach Under strong in cases, the identification of the structural we introduced models and demonstrated that example, the for price. In the schooling asymptotic intervals are almost identical sample and for price Under weak and Schick (1996)). identification, the class of statistics of the form (2.3) maximizes an average power function within a broad class of normal weight functions. Here, we suppose (Yi, Dj, Zj,i assume that the dimension K = of do l,...,n) is fixed. is an i.i.d. Although 22 sample from the model defined by Al-6 and this assumption can be relaxed, the primary purpose of this section is to motivate the statistics used for finite-sample inference from an optimality point of view. Recall that under Al-6, PlY - Ho:eo = where 0, Letei M^ £ some is = l[Yi< Suppose testing is g 6I,,t)] to be based As . on the null, any statistic based on , definite. As mentioned e, t. Consider the problem of testing Ha:0o¥=O., vs. (ej, ~ defined, e|Z (e;, = Zj,i = Zj,i Let G be the class of functions g is = 0|Z] constant. (A, G = U^j Gj where Gj < q [D, 9q,t) 1, 1, . Bernoulli[r (Z,6lo)] n) , . . . ,n) E which for . . . is Because ei|Zi, exists ] the class of K-'-valued functions g for which . and E [g a "natural" class of test statistics in the text, , . Z„ (Z, ~ = P [F < 6^0) i.i.d. q {D,eo,T)\Z] Bernoulli(T) under conditionally pivotal under Hq. (Z) g (Z) [g where t , . positive definite; that is {Z) g (Z)'] exists and is is given by {L„ (6,,g) is, let positive g £ G} , where Ln[e*,g) = ^g{Zi){ei-T] Being based on (Ci, Zi,i = 1, ... T{l-T)J2g{Z,)g(Z,)' ,n) , any such L„ [9,,g) is Y,9{Zi){ei-T) (A.l) conditionally pivotal under the null. In addition, under the null, L„{9,,g) any g € for G- ^d g £ G} enjoys desirable large sample power properties Moreover, the class {L„ (Ot,g) under the following strong identification assumption Assumption 3. (a) The Z distribution of 2>^dim(s) in which 0» denotes some open neighborhood of does not depend on 9o- (b) For every 6 G G, (and for almost every Z), fiZ,9) and exists If 9o,n is continuous (in 5). (c) f, Assumption 3 holds and g £ = 9, + (Z) G, = = -^^r{Z,9) f (Z, 0,) 6 5.(d) (A.2) £ supggQ_ ||f (Z, < 6*)!] co. then under contiguous alternatives induced by the sequence b/y/n, 1 Ln {6*,g) ^d 1 2 -j^X~d\m[g) t(1-t 5s{h,g) (A.3) where 5s By {b, g) = [f, a standard argument, 63 {b,g) local asymptotic asymptotic result b'E is (Z) g (Z)'] < E [g (Z) g (Z)'] "' £ [5 (Z) f, (Z)'] 63 {b,ft) for any 5 e 5. As a consequence, power within the class {L„ {9t,g) the following. 23 : g € G} An b. L^ (9,,t,) maximizes even stronger optimality . , Proposition Ln (0,,f,) is Among 4. tests based a locally asymptotically ,n) the test which rejects for on (ej, Zi,i = 1, UMP (rotation) invariant test of Ho- Therefore, {Ln . . . is an (asymptotically) essentially complete class of tests of Proof: The conditional (on Z_ = (Zi, . Ho under Assumption Z„)) log likelihood function , . . large values of , {0,,g) ' g £ Q) 3. is given by - e.)} n = J2 in {e\z) ! {log [^ (2-^)1 LAN Assumption 3 implies that the following £„ where £n ( e. e. + log [1 - ^ iz,,e)] (1 =1 + — - ) £„ expansion (00 = b'S: - is valid under the null: For any G R'^ b -bXb + Opil), the (unconditional) log likelihood fmiction, is \/n '-^ T — (1 T and n Theorem ically ^-^ T T[l — T) — T) (l-r) (1 3 of Choi, Hall, and Schick (1996) UMP now shows that L„ (&.,f,) ' ' = ^S^'I^~^S^ is the asymptot- invariant test of Ho- In view of Proposition 4, a key role is played by f, This gradient will typically be . unknown but will be estimable under various assumptions ranging from parametric assumptions to nonparametric ones. As an illustration, consider the hnear quantile model ' Y= where P [e < = 0\Z] t. If D'Oo + (A.4) e, the conditional distribution of e given (A', Z) admits a density (with respect to Lebesgue measure) fe\x,z ('l^j ^) £ind certain additional mild conditions hold, then satisfied If, with moreover, [Z) f, = —E [Df^^x.z (OIA", Z) ]Z] {e,v') \Z ~ A/'(0,S) for parametric estimation of subclass of Q will ft used, nor will some n'Z + (A.5) v, positive definite matrix E, then f, (Z) becomes feasible. is proportional to it will not affect the asymptotic validity of the test even affect the validity of finite-sample inference if is available. and provided sample splitting full is sample used full (i.e., sample). identification. Proposition 4 will not hold as stated, but a closely related optimality result The key diff'erence between the strongly and weakly property of a weakly identified model estimable. Z known.) f» is the estimation of f, and finite-sample inference are performed using different subsamples of the Under weak II' (Assuming that the gradient belongs to a particular not affect the optimality result, as Proposition 4 (tacitly) assume that Estimation of the gradient is is assumed that it is D= where , Assumption 3 an object which can be estimated nonparametrically. is As such, asymptotic optimality identified cases that the counterpart of the gradient t» results are too optimistic. Nevertheless, 24 is is that the defining not consistently it is still possible to show the used in the main text has an attractive optimahty property under the following weak statistic identification assumption Assumption Rn{Z, 0,,C)\ and exists If is for 4. (a) some in which t(Z,9o) The distribution of C G Kdim(Z)xK Ln Z modeled as a "locally linear" sequence of parameters.^'' does not depend on E =0 \Rn [Z, 6,C) -1/2 T(Z,9t) Oq, (b) = some function Rn, where A^ a,nd positive definite, (d) limn^oo Assumption 4 holds and g € is - 6'o ( [Z'CAe + = E{ZZ') {c)T.zz every 9 and every C. for Q, then (S*,g) — -Sw('^e,C,g) XXdim(g) >rf (A.6) Til where Sw As (A,, C, g) = A'gE [C'Zg (Z)'] E in the strongly identified case, the limiting distribution of in the weakly identified Within the case. the asymptotically most powerful test "' (Z) g (Z)'] [g Ln E [g (Z) Z'C] Ae- the one based on L„ {9,,gc) is \ times a noncentral member based on a class of tests is {9*,g) where gc (Z) , furthermore enjoys an optimality property analogous to the one established of the result for L„ (9f,gc) is identical to that of Proposition 4, with 5, in = X%^i„\ g € G} of {Ln (9,,g) , C'Z. This test Proposition 4. The proof 5*, and T* of the latter proof replaced by Ag, Sn [C) = C T^ Jn IZ - ~T\ T il ^ t) ^' ("^^ " ^^ and Xn [C) respectively. C = C However, the consistent estimation of is C cannot be treated "as if" it implementable without knowledge of knowledge. To that end, It let = E.„ui L^ is C is if C is known, then the statistic 5^ (C) is infeasible in the present (weakly identified) case, hrdeed, it seems more reasonable to search for a test and enjoys an optimality property that does not t(1 ^ ^E^'Zi member of {L„ (9t,g) Muirhead (1982, Exercise 3.15 a c, rely which on this z, [a - t) (A.7) 1 L* be the particular follows from matrix -z,z: T) 4.) was known, -r) 1 is, - t(1 let l: that ^ (In particular, the proof utilizes the fact that asymptotically sufficient under Assumption because n strictly increasing : g € Q} for which g (d)) that for any k> is the identity mapping. and any dim (D) x dim(D) transformation of K. exp 1 Assumption 4 C is + K Ln(9,,gc)]dJ{C;Y:,y), (A.8) motivated by the Gaussian model (A.4)-(A.5). In that model, parts (b) and (d) hold (with proportional to \/nTl) if part (c) does and n varies with matrix (as in Staiger and Stock (1997)). 25 n in such a way that y^Yl is a constant dim (Z) x K where J () is mean the cdf of the normal distribution with J {) In (A. 8), the functional form of and variance "natural" insofar as is is it E^,,; ® (n~^ Y^=\ ^i^'i) corresponds to the weak instru- ments prior employed by Chamberlain and Imbens (2004). Moreover, following Andrews and Ploberger (1994), the integrand in (A. 8) obtained by averaging the is LAN approximation to the likelihood ra- tio with respect to the weight/prior measure Kcido) associated with the distributional assumption Ag ~ A/" 0, reXn In view of the foregoing, (C)~ erage power optimality properties of the follows that the statistic L* enjoys weighted av- it Andrews and Ploberger (1994) This statement variety.^'* is formalized in the following result. Proposition is Among 5. on tests based (cj, asym,ptotically equivalent to the test that = Zi,i 1, . . . ,n), under Assumption 4 the test based on L* maximizes the asymptotic average power: / [ Pr (reject e*\eo,C)dKc {eD)d J (C). limsup References Abadie, a. Angrist, "Changes (1995): Approach," J. in Spanish Labor Income Structure During the 1980s: CEMFI Working Paper No. 9521. D., AND A. B. Krueger (1991): A "Does compulsory school attendance Quantile Regression affect schoohng and earnings," Quarterly Journal of Economics, 106, 979-1014. BUCHINSKY, M. "Changes (1994); in U.S. wage structure 1963-87: An application of quantile regression," Econometrica, 62, 405-458. Chamberlain, G. (1994): "Quantile regression censoring and the structure of wages," metrics, ed. by C. Sims. Elsevier: Chernozhukov, V. MIT, Department download at New in Advances in Econo- York. "Extremal Quantile Regression (Linear Models and an Asymptotic Theory)," (2000): of Economics, Technical Report, pubhshed in Ann. Statist. 33, no. 2 (2005), pp. 806-839, www.mit .edu/"vchern. Chernozhukov, and V., C. Hansen (2001): "Instrumental Quantile Regression Inference for Structural and Treatment Effect Models," Journal oj Econometrics (forthcoming). — (2005a): "Instrumental Variable Quantile Regression," mimeo. (2005b): "An IV Model Chernozhukov, V., and H. of Quantile Treatment Effects," Econometrica. Hong (2003): "An MCMC Approach to Classical Estimation," Journal of Econometrics, 115, 293-346. Chernozhukov, V., G. Imbens, and W. Newey (2006): "Instrumental Variable Estimation of Nonseparable Models," forthcoming Journal of Econometrics. Chesher, a. (2003): "Identification in Nonseparable Models," forthcoming in Econometrica. DoKSUM, K. (1974): sample case," Ann. Graddy, K. "Empirical probability plots and statistical inference for nonlinear models Statist., 2, (1995): in the two- 267-277. "Testing for Imperfect Competition at the Fulton Fish Market," Rand Journal of Eco- nomics, 26(1), 75-92. Gutenbrunner, C, J. JURECKOVA, R. KoENKER, AND based on regression rank scores," J. Nonparametr. As discussed by Andrews and Ploberger totically equivalent to) S. (1994, p. 1384), a Bayesian posterior odds PORTNOY Statist., 2(4), ratio. 26 (1993): "Tests of linear hypotheses 307-331. L^ can therefore be interpreted as (being asymp- Hansen, P., J. L. Heaton, and A. Yaron (1996): Sample Properties "Finite of Some Alternative GMM Estimators," Journal of Business and Economic Statistics, 14(3), 262-280. He, X., Hu, AND HUBER, p. Hu F. J. Amer. Statist. Assoc, 97(459), 783-795. "Robust regression: asymptotics, conjectures and Monte Carlo," Ann. J. (1973): Koenker, R. "Markov chain marginal bootstrap," (2002): "Estimation of a Censored Dynamic Panel Data Model," Econometnca, 70(6), 2499-2517. L. (2002): "Rank (1997): tests for linear models," in Robust inference, vol. 15 of Statist., 1, Handbook of 799-821. Statist., pp. 175-199. North-Holland, Amsterdam. Quantile Regression. Cambridge University Press. (2005): Koenker, R., Koenker, R., and and (2004b): MacKinnon, W. for sample G. S. Bassett (1978): "Regression Quantiles," Econometrica, 46, 33-50. Z. Xiao J. "Quantile autoregression," www.econ.uiuc.edu. (1964): "Table for both the sign test sizes to 1,000," Matzkin, R. (2004a): "Unit root quantile autoregression inference," L. (2003): J. Amer. Statist. Assoc, "Nonparametric estimation J. Amer. Statist. Assoc, 99(467), 775-787. and distribution-free confidence intevals of the median 59, 935-956. of nonadditive random functions," Econometrica, 71(5), 1339-1375. Newey, W. K. (1997): "Convergence rates and asymptotic normality for series estimators," J. Econometrics, 79(1), 147-168. and Pares, A., D. Pollard (1989): "Simulation and Asymptotics of Optimization Estimators," Economet- 1027-1057. rica, 57, Parzen, M. L. J. I., Wei, and Z. Ying (1994): "A resampling method based on pivotal estimating functions," Biometrika, 81(2), 341-350. Portnoy, S. (1985): "Asymptotic behavior of Normal approximation," Ann. M estimators of p regression parameters when p^ /n is large. H. 1403-1417. "Asymptotic behavior of regression quantiles (1991): Anal, Statist., 13(4), in nonstationary, dependent cases," J. Multivariate 38(1), 100-113. Robert, C. and G. Casella (1998): Monte Carlo Statistical Methods. Springer- Verlag. and J. H. Wright (2000): "GMM with weak identification," Econometrica, 68(5), P., Stock, J. Tamer, E. (2003): "Incomplete simultaneous discrete response model with multiple equilibria," Rev. H., 1055-1096. Econom. Stud., 70(1), 147-165. Walsh, Math., J. 11, E. (I960): "Nonparametric tests for median by interpolation from sign 183-188. 27 tests," Ann. Inst. Statist. ' MCMC A. for Draws and Critical Demand Example (t Region B. lor and Grid Search Demand Example 0.2 0.2 - * ' ., y-' ^ -0.2 > 1 Critical Region = 0.50) (i . 1 \-' :/."'. •^' w'.i' •> • / -ii!' -0.2 ,>' •« i." t MCMC = 0.50) / / / \ r \ -0.4 -0.4 \ / * t •^ -0.6 / V J-r^- <:" - -0.6 / •••.',1^ .'•' ( •>' - -0.8 ] 1 #.*/• < *, -0.8 i J n *>\ ;•' A-l' -." f ( • • \J /' '."f-'v. ** *.••• *; -1 , .••>' -1.2 i.2 8.3 8.4 8.5 8.6 8.7 !.2 8.3 e(.50)|2j (Inlercepl) 8.4 e( 8.5 50)|2| 8.6 8.7 (Intercept) MCMC and Grid Search Confidence Regions. This figure illustrates the construction of a 95% level confidence regions by MCMC and a grid search in the demand example from Section 3. Panel A shows the MCMC draws. The gray -h's represent draws that fell within the confidence region, and Figure the black the 1. •'s MCMC represent draws outside of the confidence region. draws within the confidence region (gray confidence region (black line). 28 4-'s) Panel and the B plots grid search Table Monte Carlo Results 1. Asymptotic Inference Null Hypothesis r = = r 0.50 0.75 t = t 0.90 = Finite Sample Inference 0.50 r = 0.75 = t 0.90 A. Exogenous Model ^(r)|i| e(r) = = eo(T)|i] ^o(t) 0.0716 0.0676 0.0920 0.0744 0.0820 0.1392 Endogenous Model. B. 0iTh] e{T) = = ^o(r)|i] ^o(r) 0.2380 0.2392 0.2720 0.2352 0.3604 0.4740 C. ^(t)(ij e{r) = = ^o(r)[i] 0^{t) Endogenous Model. 0.0744 0.0808 0.1240 0.0732 0.0860 0.1504 D. Endogenous Model, e(r)[ll 6{t) = = eo(r)[i] eo(T) 11 = 11 n 0.0080 0.0064 0.0044 0.0516 0.0448 0.0448 0.0240 0.0212 0.0200 0.0488 0.0460 0.0484 0.0300 0.0300 0.0204 0.0552 0.0560 0.0464 .05 = : == .5 1 0.0632 0.0784 0.1168 0.0232 0.0216 0.0192 0.0508 0.0772 0.1440 0.0524 0.0476 0.0508 Note: Simulation results for asymptotic and finite sample inference results for a different simulation model. q{D,8a,T) = 6o(t)|2) + The &o(t)|i|D. tests of the hypothesis that 6(t)hj tests of the joint hypothesis 6 = = 8a. for quantile regression. Each panel reports Each simulation model has quantiles first of the form row within each panel reports rejection frequencies for 5% level and the second row reports rejection frequencies for 5% level 5o(t)|i], The number of simulations 29 is 2500. T = 0.50 T = 0.75 \ 0.8 0.8 \ 1 \ \ 1 \ 1 1 \ 1 \ \ 1 ^ 0.6 0.6 \ 1 \ 1 I S I o 1 \ / ^0.4 0.4 1 \ \ 1 0.2 0.2 A -2.5 -2.5 e(.50)|,j-e^(.50) I plots 2. is - 8(.75),„-9„(.75) e^ji.so) Power Curves power curves solid line 1 2.5 = 0.90 e(.90)j,| Figure ' -'7 V^~-^_^ for Exogenous Simulation Model. This for the simulations contained in Panel A of Table 1. figure The the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the Eisymptotic approximation. 30 T = 0.50 T = 0.75 0.8 0.7 0.6 \ / 0.5 1 \ 0.4 1 \ 0.3 \ 0.2 \ ' . 0.1 _^/ •(.50)(„-e„(.50)„j •u 1 (.75)|„-e„(.75)j,j = 0.90 (.90)(„-e„(.90)j„ Figure 3. Power Curves for This figure plots power curves 1 Endogenous Simulation Model, for the simulations which correspond to a nearly unidentified curve for a test based on the line is the finite power curve from a test contained in Panel case. The = 11 B 0.05. of Table solid line is the power sample inference procedure, and the dashed based on the asymptotic approximation. 31 = 0.50 T \ T = 0,75 ~S^ \ 0.8 \ 0.6 • 0.8 • 0.6 / 1 1 \ 1 1 0.4 0.4 1 1 0.2 1 0.2 '/ v/ n ; -5 -5 e(.50)|„-9„(,50)|„ T = e(.75) -6„(.75) 0,90 e(,90)„|-e„(.90),„ Figure 4. Power Curves for This figure plots power curves 1. The solid line is for Endogenous Simulation Model, the simulations contained in Panel C 11 = .5. of Table the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the asymptotic approximation. 32 1 T = \. \ X = 0.75 0,50 ' '/ A^ \ / 0.8 0.8 1 0.6 0.6 \ V111 1 1 ' ' li 0.4 1 0.4 U \\ 1 1 \\ 1 0.2 0.2 Y -2 (.75),„-e„(.75),„ (.50),„-9„(.50),„ 1 \ = 0.90 7 ' 0.8 0.6 \ 1 ' 0.4 0.2 6(.90),„-6„(.90)p| Figure 5. Power Curves for This figure plots power curves Table 1. The solid line inference procedure, is Endogenous Simulation Model, for the simulations IT contained in Panel = D 1. of the power curve for a test based on the finite sample and the dashed line on the asymptotic approximation. 33 is the power curve from a test based Table 2. Estimation Method Demand r = for Fish r 0.25 = = t 0.50 0.75 A. Quantile Regression (No Instruments) Quantile Regression (Asymptotic) (MCMC) (-0.874,0.073) (-0.785,-0.037) (-1.174,-0.242) (-1.348,0.338) (-1.025,0.017) (-1.198,0.085) Finite Sample Finite Sample (Grid) (-1.375,0.320) (-1.015,0.020) (-1.195,0.065) Finite Sample (Marginal) (-1.390,0.350) (-1.040,0.040) (-1.210,0.090) B. IV Quantile Regression (Stormy, Inverse Quantile Regression (Asymptotic) (MCMC) Finite Sample Finite Sample (Grid) Finite Sample (Marginal) . C. Quantile Regression Quantile Regression (Asymptotic) Finite Sample (Marginal) D. IV Quantile Regression - Day Finite Note; (-1.802,0.030) (-2.035,-0.502) (-4.403,1.337) (-3.-566,0. 166) (-5.198,25.173) (-4.250,40] (-3.600,0.200) (-5.150,24.850) (-4.430,1] (-3.610,0.220) [-5,1] 95% level confidence interval estimates for treats price as exogenous, (-0.718,-0.058) (-1.265,-0.329) (-1.610,0.580) (-1.360,0.320) (-1.350,0.400) set of dummy variables for Mixed as Instruments) (-2.403,-0.324) (-1.457,0.267) (-1.895,-0.463) [-5,1] [-5,1] [-5,1] Demand for and Panel B reports endogenous and uses weather conditions as instruments a (No Instruments) (-0.695,-0.016) Sample (Marginal) model which EfFects EfFects (Stormj', Inverse Quantile Regression (Asymptotic) as Instruments) (-2.486,-0.250) Day - Mixed day of the week. The first Fish example. Panel results for price. row in A reports results from from model which treats price as Panels C and D are as A and B but include each panel reports the interval estimated using the asymptotic approximation, and the remaining rows report estimates of the finite sample interval constructed through various methods. 34 0.35 2| . . . 1 1 ; 2 3 0.5 2| ' 1 1 3 Figure 6. Finite Sample Confidence Regions ing Price as Exogenous. This figure plots firom fish for Fish finite Example Treat- sample confidence regions example without covariates treating price as exogenous. Values intercept, 0{t)i2] are on the horizontal axis, 6(r)[i] are on the vertical axis. 35 and values for the slope for the parameter -20 -10 Intercept 10 20 10 Intercept Figure 7. fish Sample Confidence Regions for Fish Example TreatEndogenous. This figure plots finite sample confidence regions Finite ing Price as from 20 Intercept example without covariates treating the intercept, 0{t)^2] eter ^(t)|i] are ^^I'e on the horizontal on the vertical axis. 36 axis, price as endogenous. Values for and values for the slope param- Table Returns to Schooling 3. Estimation Method r = t 0.25 = 0.50 r = 0.75 A. Quantile Regression (No Instruments) Quantile Regression (Asymptotic) (MCMC) (0.0715,0.0731) (0.0642,0.0652) (0.0637,0.0650) (0.0710,0.0740) (0.0640,0.0660) (0.0637,0.0656) Finite Sample Finite Sample (Grid) (0.0710,0.0740) (0.0641,0.0659) (0.0638,0.0655) Finite Sample (Marginal) (0.0706,0.0742) (0.0638,0.0662) (0.0634,0.0658) B. IV Quantile Regression (Quarter of Birth Instruments) Inverse Quantile Regression (Asymptotic) (MCMC) Finite Sample Finite Sample (MCMC-2) Finite Sample (Grid) Finite Sample (Marginal) C. Quantile Regression - (0.0784,0.2064) (0.0563,0.1708) (0.0410,0.1093) (0.1151,0.1491) (0.0378,0.1203) (0.0595,0.0703) (0.0580,0.2864) (0.0378,0.1203) (0.0012,0.0751) (0.059,0.197) (0.041,0.119) (0.021,0.073) (0.05,0.39) (0.03,0.13) (0.00,0.08) State and Year of Birth Effects (No Instruments) Quantile Regression (Asymptotic) Finite Sample (Marginal) D. IV Quantile Regression (0.0666,0.0680) (0.0615,0.0628) (0.0614,0.0627) (0.0638,0.0710) (0.0594,0.0650) (0.0590,0.0654) - State and Year of Birth Effects (Quarter of Birth Instruments) Inverse Quantile Regression (Asymptotic) Finite Note: Sample (Marginal) 95% level confidence interval estimates for model which treats schooling endogenous and uses quarter a.s in (0.0661,0.1459) (0.0625,0.1368) (-0.24,1] [-1,1] [-1,0.35] Returns to Schooling example. Panel exogenous, and Panel of birth but include a set of 51 state of birth row (0.0890,0.2057) dummies dummy B A reports results from reports results from model which treats schooling as C and D are as A and B dummy variables. The first as instruments for schooling. Panels variables and a set of 9 year of birth each panel reports the interval estimated using the asymptotic approximation, and the remaining rows report estimates of the finite sample interval constructed through various methods. 37 95% Confidence Region for Scfiooling 0.02 0.01 ;.75)|,j Figure ple. 8. Confidence Region This figure plots the example in the finite Q Q 0.06 model without covariates treating schooling 6'(.75)hi are 0.07 0.08 Schooling Exam- (Slope) for 75''' Percentile in on the on the horizontal 38 b Percentile sample confidence regions from the schooling ues for the intercept, 6'(.75)[oj are parameter 75 0.05 0.04 0.03 for Example vertical axis, axis. as endogenous. Val- and values for the slope Date Due Lib-26-67 MIT LIBRARIES 3 9080 02617 9736