Massachusetts Institute of Technology
Department of Economics
Working Paper Series

FINITE SAMPLE INFERENCE FOR QUANTILE REGRESSION MODELS

Victor Chernozhukov
Christian Hansen
Michael Jansson

Working Paper 06-03
January 31, 2006

Room E52-251
50 Memorial Drive
Cambridge, MA 02142

This paper can be downloaded without charge from the
Social Science Research Network Paper Collection at
http://ssrn.com/abstract=880467
FINITE SAMPLE INFERENCE FOR QUANTILE REGRESSION MODELS

VICTOR CHERNOZHUKOV, CHRISTIAN HANSEN, AND MICHAEL JANSSON
Abstract. Under minimal assumptions finite sample confidence bands for quantile regression models can be constructed. These confidence bands are based on the "conditional pivotal property" of estimating equations that quantile regression methods aim to solve and will provide valid finite sample inference for both linear and nonlinear quantile models regardless of whether the covariates are endogenous or exogenous. The confidence regions can be computed using MCMC, and confidence bounds for single parameters of interest can be computed through a simple combination of optimization and search algorithms. We illustrate the finite sample procedure through a brief simulation study and two empirical examples: estimating a heterogeneous demand elasticity and estimating heterogeneous returns to schooling. In all cases, we find pronounced differences between confidence regions formed using the usual asymptotics and confidence regions formed using the finite sample procedure in cases where the usual asymptotics are suspect, such as inference about tail quantiles or inference when identification is partial or weak. The evidence strongly suggests that the finite sample methods may usefully complement existing inference methods for quantile regression when the standard assumptions fail or are suspect.
Key Words: Quantile Regression, Extremal Quantile Regression, Instrumental Quantile Regression
Date: The finite sample results of this paper were included in the April 17, 2003 version of the paper "An IV Model of Quantile Treatment Effects," available at http://gsbwww.uchicago.edu/fac/christian.hansen/research/IQR-short.pdf. As a separate project, the current paper was prepared for the Winter Meetings of the Econometric Society in San-Diego, 2004. We thank Roger Koenker for constructive discussion of the paper at the Meetings that led to the development of the optimality results for the inferential statistics used in the paper. Revised: December, 2005. We also thank Andrew Chesher, Lars Hansen, Jim Heckman, Marcello Moreira, Rosa Matzkin, Jim Powell, Whitney Newey, and seminar participants at Northwestern University for helpful comments and discussion. The most recent versions of the paper can be downloaded at http://gsbwww.uchicago.edu/fac/christian.hansen/research/FSQR.pdf.

1. Introduction

Quantile regression (QR) is a useful tool for examining the effects of covariates on an outcome variable of interest; see e.g. Koenker (2005). Perhaps the most appealing feature of QR is that it allows estimation of the effect of covariates on many points of the outcome distribution,
including the tails as well as the center of the distribution. While the central effects are useful summary statistics of the impact of a covariate, they do not capture the full impact of a variable unless the variable affects all quantiles of the outcome distribution in the same way. Due to its ability to capture heterogeneous effects and its interesting theoretical properties, QR has been used in many empirical studies and has been studied extensively in theoretical econometrics; see especially Koenker and Bassett (1978), Portnoy (1991), Buchinsky (1994), and Chamberlain (1994), among others.
In this paper, we contribute to the existing literature by considering finite sample inference for quantile regression models. We show that valid finite sample confidence regions can be constructed for parameters of a model defined by quantile restrictions under minimal assumptions. These assumptions do not require the imposition of distributional assumptions and will be valid for both linear and nonlinear conditional quantile models and for models which include endogenous as well as exogenous variables. The basic idea of the approach is to make use of the fact that the estimating equations that correspond to conditional quantile restrictions are conditionally pivotal; that is, conditional on the exogenous regressors and instruments, the estimating equations are pivotal in finite samples. Thus, valid finite sample tests and confidence regions can be constructed based on these estimating equations.
The approach we pursue is related to early work on finite sample inference for (unconditional) quantiles. The existence of finite sample pivots is immediate for the sample quantiles as illustrated, for example, in Walsh (1960) and MacKinnon (1964). However, the existence of such pivots in the regression case is less obvious. We extend the results from the unconditional case to the estimation of conditional quantiles by noting that, conditional on the exogenous variables and instruments, the estimating equations solved by QR methods are pivotal in finite samples. This property suggests that tests based on these quantities can be used to obtain valid finite sample inference statements. The resulting approach is similar in spirit to the rank-score methods, e.g. Gutenbrunner, Jureckova, Koenker, and Portnoy (1993), but does not require asymptotics or homoscedasticity for its validity.
The finite sample approach that we develop has a number of appealing features. The approach will provide valid inference statements under minimal assumptions, essentially requiring some weak independence assumptions on sampling mechanisms and continuity of conditional quantile functions in the probability index. In endogenous settings, the finite sample approach will remain valid in cases of weak identification or partial identification (e.g. as in Tamer (2003)). In this sense, the finite sample approach usefully complements asymptotic approximations and can be used in situations where the validity of the assumptions necessary to justify these approximations is questionable.
The chief difficulty with the finite sample approach is computational. In general, implementing the approach will require inversion of an objective function-like quantity, which may be quite difficult if the number of parameters is large. To help alleviate this computational problem, we explore the use of Markov Chain Monte Carlo (MCMC) methods for constructing joint confidence regions. The use of MCMC allows us to draw an adaptive set of grid points, which offers potential computational gains relative to more naive grid-based methods. We also consider a simple combination of search and optimization routines for constructing marginal confidence bounds. When interest focuses on a single parameter, this approach may be computationally convenient and may be more robust in nonregular situations than an approach aimed at constructing the joint confidence region.
Another potential disadvantage of the proposed finite sample approach is that one might expect that minimal assumptions would lead to wide confidence intervals. We show that this concern is unwarranted for joint inference: The finite sample tests have correct size and good asymptotic power properties. However, conservativity may be induced by going from joint to marginal inference by projection methods. In this case, the finite sample confidence bounds may not be sharp.
To explore these issues, we examine the properties of the finite sample approach in simulation and empirical examples. In the simulations, we find that joint tests based on the finite sample procedure have correct size while conventional asymptotic tests tend to be size distorted and are severely size distorted in some cases. We also find that finite sample tests about individual regression parameters have size less than the nominal value, though they have reasonable power in many situations. On the other hand, the asymptotic tests tend to have size that is greater than the nominal level.
We also consider the use of finite sample inference in two empirical examples. In the first, we consider estimation of a demand curve in a small sample; and in the second, we estimate the returns to schooling in a large sample. In the demand example, we find modest differences between the finite sample and asymptotic intervals when we estimate conditional quantiles not instrumenting for price and large differences when we instrument for price. In the schooling example, the finite sample and asymptotic intervals are almost identical in models in which we treat schooling as exogenous, and there are large differences when we instrument for schooling. These results suggest that the identification of the structural parameters in the instrumental variables models in both cases is weak.
The remainder of this paper is organized as follows. In the next section, we formally introduce the modelling framework we are considering and the basic finite sample inference results. Section 3 presents results from the simulation and empirical examples, and Section 4 concludes. Asymptotic properties of the finite sample procedure that include asymptotic optimality results are contained in an appendix.
2. Finite Sample Inference

2.1. The Model. In this paper, we consider finite sample inference in the quantile regression model characterized below.

Assumption 1. Let there be a probability space $(\Omega, \mathcal{F}, P)$ and a random vector $(Y, D', Z', U)$ defined on this space, with $Y \in \mathbb{R}$, $D \in \mathbb{R}^{\dim(D)}$, $Z \in \mathbb{R}^{\dim(Z)}$, and $U \in (0,1)$, $P$-a.s., such that

A1 $Y = q(D, U)$ for a function $q(d,u)$ that is measurable.
A2 $q(d,u)$ is strictly increasing in $u$ for each $d$ in the support of $D$.
A3 $U \sim$ Uniform(0,1) and is independent from $Z$.
A4 $D$ is statistically dependent on $Z$.

When $D = Z$, the model in A1-A4 corresponds to the conventional quantile regression model with exogenous covariates, cf. Koenker (2005), where $Y$ is the dependent variable, $D$ is the regressor, and $q(d,\tau)$ is the $\tau$-quantile of $Y$ conditional on $D = d$ for any $\tau \in (0,1)$. In this case, A1, A3, and A4 are not restrictive and provide a representation of $Y$, while A2 restricts $Y$ to have a continuous distribution function. The exogenous model was introduced in Doksum (1974) and Koenker and Bassett (1978). It usefully generalizes the classical linear model $Y = D'\gamma_0 + \gamma_1(U)$ by allowing for quantile specific effects of covariates $D$. Estimation and asymptotic inference for the linear version of this model, $Y = D'\theta(U)$, was developed in Koenker and Bassett (1978), and estimation and inference results have been extended in a number of useful directions by subsequent work. Matzkin (2003) provides many economic examples that fall in this framework and considers general nonparametric methods for asymptotic inference.
When $D \neq Z$, but $Z$ is a set of instruments that are independent of the structural disturbance $U$, the model A1-A4 provides a generalization of the conventional quantile model that allows for endogeneity. See Chernozhukov and Hansen (2001, 2005a, 2005b) for discussion of the identification of this model as well as for semi-parametric estimation and inference theory under strong and weak identification. See Chernozhukov, Imbens, and Newey (2006) for a nonparametric analysis of this model and Chesher (2003) for a related nonseparable model.
The model A1-A4 may be thought of as a general nonseparable structural model that allows for endogenous variables as well as a treatment effects model with heterogeneous treatment effects. In this case, $D$ and $U$ may be jointly determined, rendering the conventional quantile regression invalid for making inference on the structural quantile function $q(d,\tau)$. This model generalizes the conventional instrumental variables model with additive disturbances, $Y = D'\alpha_0 + \alpha_1(U)$ where $U|Z \sim U(0,1)$, to cases where the impact of $D$ varies across quantiles of the outcome distribution. Note that A4 is necessary for identification in this case. However, the finite sample inference results presented below will remain valid even when A4 is not satisfied.

Under Assumption 1, we state the following result which will provide the basis for the finite sample inference results that follow.

Proposition 1 (Main Statistical Implication). Suppose A1-A3 hold. Then

$$P[Y \leq q(D,\tau) \mid Z] = \tau, \qquad (2.1)$$

$$\{Y \leq q(D,\tau)\} \text{ is Bernoulli}(\tau) \text{ conditional on } Z. \qquad (2.2)$$

Proof: The event $\{Y \leq q(D,\tau)\}$ is equivalent to $\{U \leq \tau\}$, which is independent of $Z$, and $U \sim U(0,1)$. $\square$
The result (2.1) provides a set of moment conditions that can be used to identify and estimate the quantile function $q(d,\tau)$. When $D = Z$, these are the standard moment conditions used in quantile regression which have been analyzed extensively starting with Koenker and Bassett (1978); and when $D \neq Z$, the identification and estimation of $q(d,\tau)$ from (2.1) is considered in Chernozhukov and Hansen (2005b). (2.2) is the key result from which we obtain the finite sample inference results. This result states that the event $\{Y \leq q(D,\tau)\}$ conditional on $Z$ is a Bernoulli($\tau$) random variable regardless of the sample size. This random variable depends only on $\tau$, which is known, and so is pivotal in finite samples. These results allow the construction of exact finite sample confidence regions and tests conditional on the observed data, $Z$.

2.2. Model and Sampling Assumptions. In the preceding section, we outlined a general model and discussed how the model relates to quantile regression. We also showed that the model implies that $\{Y \leq q(D,\tau)\}$ conditional on $Z$ is distributed exactly as a Bernoulli($\tau$) random variable in finite samples. In order to operationalize the finite sample inference, we impose the following conditions.

Assumption 2. Let $\tau \in (0,1)$ denote the quantile of interest. Suppose that there is a sample $\{Y_i, D_i, Z_i, i = 1,\ldots,n\}$ on probability space $(\Omega, \mathcal{F}, P)$ (possibly dependent on the sample size), such that A1-A4 holds for each $i = 1,\ldots,n$, and the following additional conditions hold:

A5 (Finite-Parameter Model): $q(D,\tau) = q(D,\theta_0,\tau)$ for some $\theta_0 \in \Theta_n \subset \mathbb{R}^{K_n}$, where the function $q(D,\theta,\tau)$ is known, but $\theta_0$ is not.

A6 (Conditionally Independent Sampling): $(U_1,\ldots,U_n)$ are i.i.d. Uniform(0,1), conditional on $(Z_1,\ldots,Z_n)$.
We will use the letter $\mathcal{P}$ to denote all probability laws $P$ on the measure space $(\Omega, \mathcal{F})$ that satisfy conditions A1-A6.

Conditions A5-A6 restrict the model A1-A4 sufficiently to allow finite sample inference. A5 requires that the $\tau$-quantile function $q(d,\tau)$ is known up to a finite-dimensional parameter $\theta_0$. Since we are interested in finite sample inference, it is obvious that such a condition is needed. However, A5 does allow for the parameter $\theta_0$ to depend on the sample size $n$ (where $\theta_0$ may vary with $n$ in the sense of Pitman) and allows the dimension of the model, $K_n$, to increase with $n$ in the sense of Huber (1973) and Portnoy (1985) where $K_n \to \infty$ as $n \to \infty$. In this sense, we can allow flexible (approximating) functional forms for $q(D,\theta_0,\tau)$ such as linear combinations of B-splines, trigonometric, power, and spline series. Condition A6 is satisfied if $\{Y_i, X_i, Z_i, i = 1,\ldots,n\}$ are i.i.d., but more generally allows rather rich dynamics, e.g. of the kinds considered in Koenker and Xiao (2004a) and Koenker and Xiao (2004b).
2.3. The Finite Sample Inference Procedure. Using the conditions discussed in the previous sections, we are able to provide the key results on finite sample inference. We start by noting that equation (2.1) in Proposition 1 justifies the following generalized method-of-moments (GMM) function for estimating $\theta_0$:

$$L_n(\theta) = \frac{1}{n}\Big(\sum_{i=1}^n m_i(\theta)\Big)' W_n \Big(\sum_{i=1}^n m_i(\theta)\Big), \qquad (2.3)$$

where $m_i(\theta) = [\tau - 1(Y_i \leq q(D_i,\theta,\tau))]\, g(Z_i)$. In this expression, $g(Z_i)$ is a known vector of functions of $Z$ that satisfies $\dim(g(Z)) \geq \dim(\theta_0)$ and is fixed conditional on $Z_1,\ldots,Z_n$, and $W_n$ is a positive definite weight matrix, also fixed conditional on $Z_1,\ldots,Z_n$. A convenient and natural choice of $W_n$ is given by

$$W_n = \Big[\tau(1-\tau)\,\frac{1}{n}\sum_{i=1}^n g(Z_i)g(Z_i)'\Big]^{-1},$$

which equals the inverse of the variance of $n^{-1/2}\sum_{i=1}^n m_i(\theta_0)$ conditional on $Z_1,\ldots,Z_n$. Since this conditional variance does not depend on $\theta_0$, the GMM function with $W_n$ defined above also corresponds to the continuous-updating estimator of Hansen, Heaton, and Yaron (1996).
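As a concrete illustration of equation (2.3), the following minimal sketch computes $L_n(\theta)$ with the natural weight matrix, assuming (as in the simulations of Section 3) that $g(\cdot)$ is the identity and that the user supplies the structural quantile function $q(D,\theta,\tau)$; the function name and interface are our own and are not part of the formal development.

```python
import numpy as np

def L_n(theta, Y, D, Z, tau, q):
    """Minimal sketch of the GMM criterion in equation (2.3).

    Assumes g(Z_i) = Z_i (identity instrument transform) and a user-supplied
    structural quantile function q(D, theta, tau) returning an n-vector.
    """
    n = len(Y)
    g = Z                                                    # g(Z_i), shape (n, k)
    m = (tau - (Y <= q(D, theta, tau)).astype(float))[:, None] * g   # m_i(theta)
    W = np.linalg.inv(tau * (1.0 - tau) * (g.T @ g) / n)     # natural weight matrix W_n
    s = m.sum(axis=0)
    return s @ W @ s / n                                     # L_n(theta)
```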
We focus on the GMM function $L_n(\theta)$ for defining the key results for finite sample inference. The GMM function provides an intuitive statistic for performing inference given its close relation to standard estimation and asymptotic inference procedures. In addition, we show in the appendix that testing based on $L_n(\theta)$ may have useful asymptotic optimality properties. We now state the key finite sample results.
Proposition 2. Under A1-A6, the statistic $L_n(\theta_0)$ is conditionally pivotal: conditional on $(Z_1,\ldots,Z_n)$,

$$L_n(\theta_0) =_d \mathcal{L}_n = \Big[\frac{1}{\sqrt{n}}\sum_{i=1}^n (\tau - B_i)\, g(Z_i)\Big]' W_n \Big[\frac{1}{\sqrt{n}}\sum_{i=1}^n (\tau - B_i)\, g(Z_i)\Big],$$

where $(B_1,\ldots,B_n)$ are iid Bernoulli random variables with $E B_i = \tau$, which are independent of $(Z_1,\ldots,Z_n)$.

Proof: Implication 2 of Proposition 1 and A6 imply the result. $\square$

Proposition 2 formally states the finite sample distribution of the GMM function $L_n(\theta)$ at $\theta = \theta_0$. Conditional on $(Z_1,\ldots,Z_n)$, the distribution does not depend on any unknown parameters, and appropriate finite sample critical values may be obtained from the distribution of $\mathcal{L}_n$. Given the finite sample distribution of $L_n(\theta_0)$, a $1-\alpha$ level test of the null hypothesis that $\theta = \theta_0$ is given by the rule that rejects the null if $L_n(\theta) > c_n(\alpha)$, where $c_n(\alpha)$ is the $\alpha$-quantile of $\mathcal{L}_n$ conditional on $(Z_1,\ldots,Z_n)$. By inverting this test-statistic, one obtains confidence regions for $\theta_0$. Let $CR(\alpha)$ be the $c_n(\alpha)$-level set of the function $L_n(\theta)$:

$$CR(\alpha) = \{\theta : L_n(\theta) \leq c_n(\alpha)\}.$$

$CR(\alpha)$ is also a valid $\alpha$-level confidence region for $\theta_0$. This result is stated formally in Proposition 3.

Proposition 3. Fix an $\alpha \in (0,1)$. $CR(\alpha)$ is a valid $\alpha$-level confidence region for inference about $\theta_0$ in finite samples: $\Pr_P(\theta_0 \in CR(\alpha)) \geq \alpha$ uniformly in $P \in \mathcal{P}$; that is, $\inf_{P \in \mathcal{P}} \Pr_P(\theta_0 \in CR(\alpha)) \geq \alpha$ and $\sup_{P \in \mathcal{P}} \Pr_P(\theta_0 \notin CR(\alpha)) \leq 1 - \alpha$.

Proof: It follows immediately from the previous results that $\{\theta_0 \in CR(\alpha)\}$ is equivalent to $\{L_n(\theta_0) \leq c_n(\alpha)\}$, and $\Pr_P\{L_n(\theta_0) \leq c_n(\alpha)\} \geq \alpha$ by the definition of $c_n(\alpha) := \inf\{l : P(\mathcal{L}_n \leq l) \geq \alpha\}$ and $L_n(\theta_0) =_d \mathcal{L}_n$. Moreover, these results hold uniformly in $P \in \mathcal{P}$. $\square$

Proposition 3 demonstrates how one may obtain valid finite sample confidence regions and tests for the parameter vector $\theta$ characterizing the quantile function $q(D,\theta_0,\tau)$. Thus, this result generalizes the approach of Walsh (1960) from the sample quantiles to the regression case. It is also apparent that the pivotal nature of the finite sample approach is similar to the asymptotically pivotal nature of the rank-score method, cf. Gutenbrunner, Jureckova, Koenker, and Portnoy (1993) and Koenker (1997), and the bootstrap method of Parzen, Wei, and Ying (1994).[1] In sharp contrast to these methods, the finite sample approach does not rely on asymptotics and is valid in finite samples. Moreover, the rank-score method relies on a homoscedasticity assumption, while the finite sample approach does not.

[1] The finite sample method should not be confused with the Gibbs bootstrap proposed in He and Hu (2002), who propose a computationally attractive variation on Parzen, Wei, and Ying (1994). The method is also very different from specifying the finite sample density of quantile regression as in Koenker and Bassett (1978). The finite sample density of QR is not pivotal and can not be used for finite sample inference unless the nuisance parameters (the conditional density of the residuals given the regressors) are specified.

The statement of Proposition 3 is for joint inference about the entire parameter vector. One can define a confidence region for a real-valued functional $\psi(\theta_0,\tau)$ as $CR(\alpha,\psi) = \{\psi(\theta,\tau) : \theta \in CR(\alpha)\}$. Since the event $\{\theta_0 \in CR(\alpha)\}$ implies the event $\{\psi(\theta_0,\tau) \in CR(\alpha,\psi)\}$, it follows that $\inf_{P \in \mathcal{P}} \Pr_P(\psi(\theta_0,\tau) \in CR(\alpha,\psi)) \geq \alpha$ by Proposition 3. For example, if one is interested in inference about a single component of $\theta$, say $\theta_{[1]}$, a confidence region for $\theta_{[1]}$ may be constructed as the set $\{\theta_{[1]} : \theta \in CR(\alpha)\}$; that is, the confidence region for $\theta_{[1]}$ is obtained by first constructing $CR(\alpha)$ and then extracting the element corresponding to $\theta_{[1]}$ from each vector of $\theta$ in $CR(\alpha)$. Confidence bounds for $\theta_{[1]}$ may be obtained by taking the infimum and supremum over this set of values for $\theta_{[1]}$.
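As a minimal sketch of this projection step, assuming one already has a finite set of parameter vectors found to lie in $CR(\alpha)$ (for example grid points or the draws produced by the methods of Section 2.5), the marginal bounds for a single component can be read off as follows; the helper name is our own.

```python
import numpy as np

def projection_bounds(cr_points, component=0):
    """Project a numerically approximated joint region CR(alpha) onto one
    coordinate: the marginal confidence bounds are the infimum and supremum
    of that coordinate over the points retained in CR(alpha)."""
    vals = np.asarray(cr_points)[:, component]
    return float(vals.min()), float(vals.max())
```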
2.4. Primary Properties of the Finite Sample Inference. The finite sample tests and confidence regions obtained in the preceding section have a number of interesting and appealing features. Perhaps the most important feature of the proposed approach is that it allows for finite sample inference under weak conditions. Working with a model defined by quantile restrictions makes it possible to construct exact joint inference in a general non-linear, non-separable model with heterogeneous effects that allows for endogeneity. This is in contrast with many other inference approaches for instrumental variables models that are developed for additive models only.

The approach is valid without imposing distributional assumptions and allows for general forms of heteroskedasticity and rich forms of dynamics. The result is obtained without relying on asymptotic arguments and essentially requires only that $Y$ has a continuous conditional distribution function given $Z$. In contrast with conventional asymptotic approaches to inference in quantile models, the validity of the finite sample approach does not depend upon having a well-behaved density for $Y$: It does not rely on the density of $Y$ given $D = d$ and $Z = z$ being continuous or differentiable in $y$ or having connected support around $q(d,\tau)$, as required in e.g. Chernozhukov and Hansen (2001).

In addition to these features, the finite sample inference procedure will remain valid in situations where the parameters of the model are only partially identified. The confidence regions obtained from the finite sample procedure will provide valid inference about $q(D,\tau) = q(D,\theta_0,\tau)$ even when $\theta_0$ is not uniquely identified by $P[Y \leq q(D,\theta_0,\tau) \mid Z] = \tau$. This builds on the point made in Hu (2002). In addition, since the inference is valid for any $n$, it follows trivially that it remains valid under the asymptotic formalization of "weak instruments," as defined e.g. in Stock and Wright (2000).

As noted previously, inference statements obtained from the finite sample procedure will also remain valid in models where the dimension of the parameter space $K_n$ is allowed to increase with future increases of $n$ since the statements are valid for any given $n$. Thus, the results of the previous section remain valid in the asymptotics of Huber (1973) and Portnoy (1985) where $K_n/n \to 0$, $K_n \to \infty$, $n \to \infty$. These rate conditions are considerably weaker than those required for conventional inference using Wald statistics as described in Portnoy (1985) and Newey (1997), which require $K_n^3/n \to 0$, $K_n \to \infty$, $n \to \infty$.

Inference statements obtained from the finite sample procedure will be valid for inference about extremal quantiles where the usual asymptotic approximation may perform poorly.[2] One alternative to using the conventional asymptotic approximation for extremal quantiles is to pursue an approach explicitly aimed at performing inference for extremal quantiles, e.g. as in Chernozhukov (2000). The extreme value approach improves upon the usual asymptotic approximation but requires a regular variation assumption on the tails of the conditional distribution of $Y|D$, that the tail index does not vary with $D$, and also relies heavily on linearity and exogeneity. None of these assumptions are required in the finite sample approach, so the inference statements apply more generally than those obtained from the extreme value approach.

[2] Whether a given quantile is extremal depends on the sample size and underlying data generating process. However, Chernozhukov (2000) finds that the usual asymptotic approximation behaves quite poorly in some examples for $\tau < .2$ and $\tau > .8$.

It is also worth noting that while the approach presented above is explicitly finite sample, it will remain valid asymptotically. Under conventional assumptions and asymptotics, e.g. Pakes and Pollard (1989) and Abadie (1995), the inference approaches conventional GMM based joint inference.

Finally, it is important to note that inference is simultaneous on all components of $\theta$ and that for joint inference the approach is not conservative. Inference about subcomponents of $\theta$ may be made by projections, as illustrated in the previous section, and may be conservative. We explore the degree of conservativity induced by marginalization in a simulation example in Section 3.

2.5. Computation. The main difficulty with the approach introduced in the previous sections is computing the confidence regions. The distribution of $L_n(\theta_0)$ is not standard, but its critical values can be easily constructed by simulation. The more serious problem is that inverting the function $L_n(\theta)$ to find joint confidence regions may pose a significant computational challenge.

One possible approach is to simply use a naive grid-search, but as the dimension of the parameter vector increases, this approach becomes intractable. To help alleviate this problem, we explore the use of MCMC methods. MCMC seems attractive in this setting because it generates an adaptive set of grid points and so should explore the relevant region of the parameter space more quickly than performing a conventional grid search. We also consider a marginalization approach that combines a one-dimensional grid search with optimization for estimating a confidence bound for a single parameter, which may be computationally convenient and may be more robust than MCMC in some irregular cases.
2.5.1. Computation of the Critical Value. The computation of the critical value $c_n(\alpha)$ may proceed in a straightforward fashion by simulating the distribution $\mathcal{L}_n$. We briefly outline a simulation routine below.

Algorithm 1 (Computation of $c_n(\alpha)$). Given a realization of the sample $(Z_i, i \leq n)$, for $j = 1,\ldots,J$, for a large number $J$:

1. Draw $(U_{i,j}, i \leq n)$ as iid Uniform, and let $(B_{i,j} = 1\{U_{i,j} \leq \tau\}, i \leq n)$.
2. Compute $\mathcal{L}_{n,j} = \Big[\frac{1}{\sqrt{n}}\sum_{i=1}^n (\tau - B_{i,j})\, g(Z_i)\Big]' W_n \Big[\frac{1}{\sqrt{n}}\sum_{i=1}^n (\tau - B_{i,j})\, g(Z_i)\Big]$.
3. Obtain $c_n(\alpha)$ as the $\alpha$-quantile of the sample $\{\mathcal{L}_{n,j}, j = 1,\ldots,J\}$.
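The following sketch implements Algorithm 1 under the same assumptions as before ($g(Z_i) = Z_i$ and the natural weight matrix $W_n$); it is intended only to make the simulation step concrete.

```python
import numpy as np

def critical_value(Z, tau, alpha, J=10_000, seed=None):
    """Algorithm 1: simulate the conditional distribution of L_n(theta_0)
    and return c_n(alpha), its alpha-quantile.  Assumes g(Z_i) = Z_i."""
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    g = Z
    W = np.linalg.inv(tau * (1.0 - tau) * (g.T @ g) / n)   # natural weight matrix W_n
    sims = np.empty(J)
    for j in range(J):
        B = (rng.uniform(size=n) <= tau).astype(float)     # B_{i,j} = 1{U_{i,j} <= tau}
        s = ((tau - B)[:, None] * g).sum(axis=0) / np.sqrt(n)
        sims[j] = s @ W @ s                                # simulated draw of L_n
    return float(np.quantile(sims, alpha))
```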
2.5.2. Computation of Confidence Regions. Finding the confidence region requires computing the $c_n(\alpha)$-level set of the function $L_n(\theta)$, which involves inverting a non-smooth, non-convex function. For even moderate sized problems, the use of a conventional grid search is impractical due to the computational curse of dimensionality. To help resolve this problem, we consider the use of a generic random walk Metropolis-Hastings MCMC algorithm.[3] The idea is that the MCMC algorithm will generate a set of adaptive grid-points that are placed in relevant regions of the parameter space only. By focusing more on relevant regions of the parameter space, the use of MCMC may alleviate the computational problems associated with a conventional grid search.

[3] Other MCMC algorithms or stochastic search methods could also be employed.

To implement the MCMC algorithm, we treat $f(\theta) \propto \exp(-L_n(\theta))$ as a quasi-posterior density and feed it into a random walk MCMC algorithm. (The idea is similar to that in Chernozhukov and Hong (2003), except that we use it here to get level sets of the objective function rather than pseudo-posterior means and quantiles.) The basic random walk MCMC is implemented as follows:

Algorithm 2 (Random Walk MCMC). For a symmetric proposal density $h(\cdot)$ and given $\theta^{(t)}$:

1. Generate $\theta^{(t)}_{prop} \sim h(\theta - \theta^{(t)})$.
2. Take $\theta^{(t+1)} = \theta^{(t)}_{prop}$ with probability $\min\{1, f(\theta^{(t)}_{prop})/f(\theta^{(t)})\}$ and $\theta^{(t)}$ otherwise.
3. Store $(\theta^{(t)}, \theta^{(t)}_{prop}, L_n(\theta^{(t)}), L_n(\theta^{(t)}_{prop}))$.
4. Repeat Steps 1-3 $J$ times, replacing $\theta^{(t)}$ with $\theta^{(t+1)}$ as starting point for each repetition.

At each step, the MCMC algorithm considers two potential values for $\theta$ and obtains the corresponding values of the objective function. Step 3 above differs from a conventional random walk MCMC in that we are interested in every value considered, not just those accepted by the procedure.
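A compact sketch of Algorithm 2 follows. The Gaussian random walk is one concrete choice of symmetric proposal density $h(\cdot)$; the step covariance, number of draws, and function names are our own assumptions rather than part of the algorithm as stated.

```python
import numpy as np

def rw_mcmc_region(L_n, theta_start, step_cov, c_alpha, J=50_000, seed=None):
    """Algorithm 2: random walk Metropolis-Hastings on f(theta) ~ exp(-L_n(theta)).
    All visited points (current and proposed) are stored; CR(alpha) is
    approximated by the stored points with L_n(theta) <= c_n(alpha)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta_start, dtype=float)
    L_cur = L_n(theta)
    points, values = [], []
    for _ in range(J):
        prop = rng.multivariate_normal(theta, step_cov)        # symmetric proposal h(.)
        L_prop = L_n(prop)
        points += [theta.copy(), prop]                         # Step 3: store both points
        values += [L_cur, L_prop]
        if rng.uniform() < min(1.0, np.exp(L_cur - L_prop)):   # accept w.p. min{1, f(prop)/f(cur)}
            theta, L_cur = prop, L_prop
    points, values = np.array(points), np.array(values)
    return points[values <= c_alpha]                           # numerical approximation to CR(alpha)
```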
The implementation of the MCMC algorithm requires the user to specify a starting value for the chain and a transition density $h(\cdot)$.[4] The choice of both quantities can have important practical implications, and implementation in any given example will typically involve some fine tuning in both the choice of $h(\cdot)$ and the starting value. Robert and Casella (1998) provide an excellent overview of these and related issues. As illustrated above, the MCMC algorithm generates a set of grid points $\{\theta^{(1)},\ldots,\theta^{(J)}\}$ and, as a by-product, a set of values for the objective function $\{L_n(\theta^{(1)}),\ldots,L_n(\theta^{(J)})\}$. Using this set of draws for $\theta$ and the set of evaluations of the objective function, we can construct an estimate of the critical region by taking the draws of $\theta$ where the value of $L_n(\theta)$ is less than or equal to $c_n(\alpha)$: $CR(\alpha) = \{\theta^{(j)} : L_n(\theta^{(j)}) \leq c_n(\alpha)\}$.

[4] In our applications, we use estimates of $\theta$ and the corresponding asymptotic distribution obtained from the quantile regression of Koenker and Bassett (1978) in exogenous cases and from the inverse quantile regression of Chernozhukov and Hansen (2001) in endogenous cases as starting values and transition densities.
Figure 1 illustrates the construction of these confidence regions. Both figures illustrate 95% confidence regions for $\tau = .5$ in a simple demand example that we discuss in Section 3. The regions illustrated here are for a model in which price is treated as exogenous. Values of the intercept are on the x-axis and values of the coefficient on price are on the y-axis.

Panel A of Figure 1 illustrates a set of MCMC draws in this example. Each symbol + represents a draw of the parameter vector that satisfies $L_n(\theta) \leq c_n(.95)$, and each symbol • represents a draw that does not satisfy this condition. Thus, (a numerical approximation to) the confidence region is given by the area covered with symbol + in the figure. In this case, the MCMC algorithm appears to be doing what we would want. The majority of the draws come from within the confidence region, but the algorithm does appear to do a good job of exploring areas outside of the confidence region as well.

Panel B of Figure 1 presents a comparison of the confidence region obtained through MCMC and the confidence region obtained through a grid search. The boundary of the MCMC region is again given by the light gray line, and the grid search region is represented by the black area in the figure. Here, we see that the two regions are almost identical. Both regions include some points that are not in the other, but the agreement is quite impressive.
2.5.3. Computation of Confidence Bounds for Individual Regression Parameters. The MCMC approach outlined above may be used to estimate joint confidence regions which can be used for joint inference about the entire parameter vector or for inference about subsets of regression parameters. If one is interested solely in inference about an individual regression parameter, there may be a computationally more convenient approach. In particular, for constructing a confidence bound for a single parameter, knowledge of the entire joint confidence region is unnecessary, which suggests that we may collapse the $d$-dimensional search to a one-dimensional search.

For concreteness, suppose we are interested in constructing a confidence bound for a particular element of $\theta$, denoted $\theta_{[1]}$, and let $\theta_{[-1]}$ denote the remaining elements of the parameter vector. We note that a value of $\theta_{[1]}$, say $\theta^*_{[1]}$, will lie in the confidence bound as long as there exists a value of $\theta$ with $\theta_{[1]} = \theta^*_{[1]}$, say $\theta^*$, that satisfies $L_n(\theta^*) \leq c_n(\alpha)$. Since only one such value of $\theta$ is required to place $\theta^*_{[1]}$ in the confidence bound, we may restrict consideration to $\theta^*$, the point that minimizes $L_n(\theta)$ conditional on $\theta_{[1]} = \theta^*_{[1]}$. If $\theta^*$ satisfies $L_n(\theta^*) \leq c_n(\alpha)$, we have found a point that satisfies $L_n(\theta) \leq c_n(\alpha)$ and can include $\theta^*_{[1]}$ in the confidence bound. On the other hand, if $L_n(\theta^*) > c_n(\alpha)$, we may conclude that there will be no other point with $\theta_{[1]} = \theta^*_{[1]}$ that satisfies $L_n(\theta) \leq c_n(\alpha)$ and exclude $\theta^*_{[1]}$ from the confidence bound.

This suggests that a confidence bound for $\theta_{[1]}$ can be constructed using the following simple algorithm that combines a one-dimensional grid search with optimization.

Algorithm 3 (Marginal Approach).

1. Define a suitable set of values for $\theta_{[1]}$, $\{\theta^j_{[1]}, j = 1,\ldots,J\}$.
2. For $j = 1,\ldots,J$, find $\theta^j_{[-1]} = \arg\inf_{\theta_{[-1]}} L_n((\theta^j_{[1]}, \theta_{[-1]}')')$.
3. Calculate the confidence region for $\theta_{[1]}$ as $\{\theta^j_{[1]} : L_n((\theta^j_{[1]}, \theta^{j\,\prime}_{[-1]})') \leq c_n(\alpha)\}$.
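A minimal sketch of Algorithm 3 is given below. The text does not prescribe an optimizer for Step 2; the derivative-free Nelder-Mead routine used here is our own choice, motivated by the non-smoothness of $L_n(\theta)$, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def marginal_bound(grid, L_n, c_alpha, rest_start):
    """Algorithm 3 (marginal approach) for the first component theta_[1].

    grid       : one-dimensional set of candidate values for theta_[1]
    L_n        : callable mapping a full parameter vector to L_n(theta)
    c_alpha    : critical value c_n(alpha) from Algorithm 1
    rest_start : starting value (array) for the remaining components theta_[-1]
    """
    kept = []
    for t1 in grid:
        # Step 2: profile out theta_[-1] with theta_[1] fixed at t1.
        res = minimize(lambda rest: L_n(np.concatenate(([t1], rest))),
                       rest_start, method="Nelder-Mead")
        # Step 3: retain t1 if the profiled criterion is below c_n(alpha).
        if res.fun <= c_alpha:
            kept.append(t1)
    return (min(kept), max(kept)) if kept else None   # confidence bounds for theta_[1]
```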
In addition to being computationally convenient for finding confidence bounds for individual parameters in high-dimensional settings, we also anticipate that this approach will have some robustness in irregular cases. Since the marginal approach focuses on only one parameter, it will typically be easy to generate a tractable and reasonable search region. The approach will perform well in some cases with multimodal objective functions and potentially disconnected confidence sets because it considers all values in the grid search region and will not be susceptible to getting stuck at a local mode.
3. Simulation and Empirical Examples

In the preceding section, we presented an inference procedure for quantile regression that provides exact finite sample inference for joint hypotheses and discussed how confidence bounds for subsets of quantile regression parameters may be obtained. In the following, we further explore the properties of the proposed finite sample approach through brief simulation examples and through two simple case studies. In the simulations, we find that tests about the entire parameter vector based on the finite sample method have the correct size while tests which make use of the asymptotic approximation are substantially size distorted. When we consider marginal inference, we find that using the asymptotic approximation leads to tests that reject too often while, as would be expected, the finite sample method yields conservative tests.

We also consider the use of the finite sample inference procedure in two case studies. In the first, we consider estimation of a demand model in a small sample; and in the second, we consider estimation of the impact of schooling on wages in a rather large sample. In both cases, we find that the finite sample and asymptotic intervals are similar when the variables of interest, price and years of schooling, are treated as exogenous. However, when we use instruments, the finite sample and asymptotic intervals differ significantly. In each of these examples, we also consider specifications that include only a constant and the covariate of interest. In these two dimensional situations, computation is relatively simple, so we consider estimating the finite sample intervals using a simple grid search, MCMC, and the marginal inference approach suggested in the previous section. We find that all methods result in similar confidence bounds for the parameter of interest in the demand example, but there are some discrepancies in the schooling example.
3.1. Simulation Examples. To illustrate the use of the asymptotic and finite sample approaches to inference, we conducted a series of simulation studies, the results of which are summarized in Table 1. In each panel of Table 1, the first row corresponds to testing the joint hypothesis that $\theta(\tau) = \theta_0(\tau)$, and the second row corresponds to testing the marginal hypothesis that $\theta(\tau)_{[1]} = \theta_0(\tau)_{[1]}$, where $\theta(\tau)_{[1]}$ is the first element of the vector $\theta(\tau)$. For each simulation model, we report inference results for the median, 75th percentile, and 90th percentile, i.e. $\tau \in \{.5, .75, .9\}$. The first three columns correspond to results obtained using the usual asymptotic approximation,[5] and the last three columns correspond to results obtained via the finite sample approach. All results are for 5% level tests.

[5] In all examples, we base the asymptotic approximation on the conventional quantile regression of Koenker and Bassett (1978) in the exogenous model, using the Hall-Sheather bandwidth choice suggested by Koenker (2005), and we use the inverse quantile regression of Chernozhukov and Hansen (2001) in the endogenous settings, using the bandwidth choice suggested by Koenker (2005).

Panel A of Table 1 corresponds to a linear location-scale model with no endogeneity. The simulation model is given by $Y = D + (1 + D)\epsilon$ where $D \sim$ BETA(1,1) and $\epsilon \sim N(0,1)$. The conditional quantiles in this model are given by $q(D,\theta_0,\tau) = \theta_0(\tau)_{[2]} + \theta_0(\tau)_{[1]} D$ where $\theta_0(\tau)_{[2]} = \Phi^{-1}(\tau)$ and $\theta_0(\tau)_{[1]} = 1 + \Phi^{-1}(\tau)$ for $\Phi^{-1}(\tau)$ the inverse of the normal CDF evaluated at $\tau$. The sample size is 100. In the exogenous model, we use the identity function for $g(\cdot)$, i.e. we set $g(Z) = Z$.
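For concreteness, a sketch of this data generating process is given below; the helper name simulate_panel_a and the return format are our own, and the true coefficients follow from the location-scale form just described.

```python
import numpy as np
from scipy.stats import norm

def simulate_panel_a(n=100, tau=0.5, seed=None):
    """Location-scale design of Panel A: Y = D + (1 + D) * eps,
    D ~ Beta(1,1), eps ~ N(0,1).  Returns the sample and the true
    quantile coefficients (intercept, slope) = (Phi^{-1}(tau), 1 + Phi^{-1}(tau))."""
    rng = np.random.default_rng(seed)
    D = rng.beta(1.0, 1.0, size=n)
    eps = rng.standard_normal(n)
    Y = D + (1.0 + D) * eps
    theta0 = (norm.ppf(tau), 1.0 + norm.ppf(tau))   # (theta_0[2], theta_0[1])
    return Y, D, theta0
```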
Looking first at results for joint inference, we see that the finite sample procedure produces tests with the correct size at each of the three quantiles considered. On the other hand, the asymptotic approximation results in tests which overreject in each of the three cases, with the size distortion increasing as one moves toward the tail quantiles of the distribution. This behavior is unsurprising given that the usual asymptotically normal inference is inappropriate for inference about the tail quantiles of the distribution (see e.g. Chernozhukov (2000)) while the finite sample approach remains valid.
When we look at marginal inference about $\theta(\tau)_{[1]}$, we see that tests based on the asymptotic distribution are size distorted, with the distortion increasing as one moves toward the tail, though the distortions are smaller than for joint inference. Here, the finite sample inference continues to provide valid inference in that the size of the test is smaller than the 5% nominal level. However, the finite sample approach appears to be quite conservative, rejecting far less frequently than the nominal level would suggest.

To further explore the conservativity of the finite sample approach, we plot power curves in Figure 2 for tests that $\theta(\tau)_{[1]} = \theta_A(\tau)_{[1]}$ where $\theta_A(\tau)_{[1]}$ is the hypothesized value for $\theta(\tau)_{[1]}$. In this figure, values for $\theta_A(\tau)_{[1]} - \theta_0(\tau)_{[1]}$ are given on the horizontal axis, and the vertical axis measures rejection frequencies of the hypothesis. Thus, size is given where the horizontal axis equals zero, and the remaining points give power against various alternatives. The solid line in the figure gives the power curve for the test using the finite sample approach, and the dashed line gives the power curve for the test using the usual asymptotic approximation. From this figure, we can see that while conservative, tests based on the finite sample procedure do have some power against alternatives. The finite sample power curve always lies below the corresponding power curve generated from the asymptotic approximation, though this must be interpreted with caution due to the distortion in the asymptotic tests.
In Panels B-D of Table 1, we consider the performance of the asymptotic approximation and the finite sample inference approach in a model with endogeneity. The data for this simulation are generated from a location model with one endogenous regressor and three instrumental variables. In particular, we have $Y = -1 + D + \epsilon$ and $D = \Pi Z_1 + \Pi Z_2 + \Pi Z_3 + v$ where $Z_j \sim N(0,1)$ for $j = 1,2,3$, $\epsilon \sim N(0,1)$, $v \sim N(0,1)$, and the correlation between $\epsilon$ and $v$ is .8. As above, this leads to a linear structural quantile function of the form $q(D,\theta_0,\tau) = \theta_0(\tau)_{[2]} + \theta_0(\tau)_{[1]} D$ where $\theta_0(\tau)_{[2]} = -1 + \Phi^{-1}(\tau)$ and $\theta_0(\tau)_{[1]} = 1$ for $\Phi^{-1}(\tau)$ the inverse of the normal CDF evaluated at $\tau$. As above, the sample size is 100.
We explore the behavior of the inference procedures for differing degrees of correlation between the instruments and endogenous variable by changing $\Pi$ across Panels B-D. In Panel B, we set $\Pi = 0.05$, which produces a first stage F-statistic of 0.205. In this case, the relationship between the instruments and endogenous variable is very weak and we would expect the asymptotic approximation to perform poorly. In Panel C, $\Pi = 0.5$ and the first stage F-statistic is 25.353. In Panel D, $\Pi = 1$ and the first stage F-statistic is 96.735. Both of these specifications correspond to fairly strong relationships between the endogenous variable and the instruments, and we would expect the asymptotic approximation to perform reasonably well in both cases. The finite sample procedure, on the other hand, should provide accurate inference in all three cases.
As expected, the tests based on the asymptotic approximation perform quite poorly in the weakly identified case presented in Panel B. Rejection frequencies range from a minimum of .235 to a maximum of .474 for a 5% level test. In terms of size, the finite sample procedure performs quite well. As indicated by the theory, the finite sample approach has approximately the correct size for performing tests about the entire parameter vector. When considering the slope coefficient only, the finite sample procedure yields conservative tests with rejection rates of .024 for $\tau = .5$, .0212 for $\tau = .75$, and .02 for the .9 quantile.

The results for the models where identification is stronger, given in Panels C and D, are similar though not nearly so dramatic. The hypothesis tests based on the asymptotic procedure overreject in almost every case, with the lone exception being testing the joint hypothesis at the median when $\Pi = 1$. The size distortions increase as one moves from $\tau = .5$ to $\tau = .75$ and $\tau = .9$, and they decrease when $\Pi$ increases from .5 to 1. The distortions at the 90th percentile remain quite large with rejection frequencies ranging between .1168 and .1540. For $\tau = .5$ the distortions are more modest with rejection rates ranging between .068 and .086. The finite sample results are much more stable than the asymptotic results. The joint tests, with rejection frequencies ranging between 4.6% and 5.5%, do not appear to be size distorted. Marginal inference remains quite conservative with sizes ranging from .0198 to .0300. The results clearly suggest that the finite sample inference procedure is preferable for testing joint hypotheses, and given the size distortions found in the asymptotic approach the results also seem to favor the finite sample procedure for marginal inference.
As above, we also plot power curves for the asymptotic and finite sample testing procedures in Figures 3-5. Figure 3 contains power curves for $\Pi = .05$. In this case, the finite sample procedure appears to have essentially no power against any alternative. The lack of power is unsurprising given that identification in this case is extremely weak. In Figures 4 and 5, where the correlation between the instruments and endogenous regressors is stronger, the finite sample procedure seems to have quite reasonable power. The power curves for the finite sample procedure are similar to the power curves of the asymptotic tests across a large portion of the parameter space. The finite sample procedure does have lower power against some alternatives that are near to the true parameter value than the asymptotic tests, though again this must be interpreted with some caution due to the distortions in the asymptotic tests.
Overall, the simulation results are quite favorable for the finite sample procedure. The results for joint inference confirm the theoretical properties of the procedure and suggest that numeric approximation error is not a large problem as the tests all have approximately correct size. For tests of joint hypotheses, the finite sample procedure clearly dominates the asymptotic procedure, which may be substantially size distorted. For marginal inference, the results are somewhat less clear cut though still favorable for the finite sample procedure. In this case, the finite sample procedure may result in tests that are quite conservative, though the tests do appear to have nontrivial power against many hypotheses. On the other hand, tests based on the asymptotic approximation have size greater than the nominal level in the simulation models considered.
3.2. Case Studies.

1. Demand for Fish. In this section, we present estimates of demand elasticities which may potentially vary with the level of demand. The data contain observations on price and quantity of fresh whiting sold in the Fulton fish market in New York over the five month period from December 2, 1991 to May 8, 1992. These data were used previously in Graddy (1995) to test for imperfect competition in the market. The price and quantity data are aggregated by day, with the price measured as the average daily price and the quantity as the total amount of fish sold that day. The sample consists of 111 observations for the days in which the market was open over the sample period.

For the purposes of this illustration, we focus on a simple Cobb-Douglas random demand model with non-additive disturbance: $\ln(Q_p) = \alpha_0(U) + \alpha_1(U)\ln(p) + X'\beta(U)$, where $Q_p$ is the quantity that would be demanded if the price were $p$, $U$ is an unobservable affecting the level of demand normalized to follow a $U(0,1)$ distribution, and $X$ is a vector of indicator variables for day of the week that enter the model with random coefficients $\beta(U)$. $\alpha_1(U)$ is the random demand elasticity when the level of demand is $U$. We consider two different specifications. In the first, we set $\beta(U) = 0$, and in the second, we estimate $\beta(U)$.
A supply function $S_p = f(p, Z, U)$ describes how much producers would supply if the price were $p$, subject to other factors $Z$. The factors $Z$ affecting supply are assumed to be independent of the random demand disturbance $U$. As instruments, we consider two different variables capturing weather conditions at sea: Stormy is a dummy variable which indicates wave height greater than 4.5 feet and wind speed greater than 18 knots, and Mixed is a dummy variable indicating wave height greater than 3.8 feet and wind speed greater than 13 knots. These variables are plausible instruments since weather conditions at sea should influence the amount of fish that reaches the market but should not influence demand for the product.[6] Simple OLS regressions of the log of price on these instruments suggest they are correlated to price, yielding an $R^2$ and F-statistic of 0.227 and 15.83 when both Stormy and Mixed are used as instruments.

[6] More detailed arguments may be found in Graddy (1995).
Asymptotic intervals are based on the inverse quantile regression estimator of Chernozhukov and Hansen (2001) when we treat price as endogenous. For models in which we set $D = Z$, i.e. in which we treat the covariates as exogenous, we base the asymptotic intervals on the conventional quantile regression estimator of Koenker and Bassett (1978).[7]

Estimation results are presented in Table 2. Panel A of Table 2 gives estimation results treating price as exogenous, and Panel B contains confidence intervals for the random elasticities when we instrument for price using both of the weather condition instruments described above. Panels C and D include a set of dummy variables for day of the week as additional covariates and are otherwise identical to Panels A and B respectively. In every case, we provide estimates of the 95%-level confidence interval obtained from the usual asymptotic approximation and the finite sample procedure. For the finite sample procedure, we report intervals obtained via MCMC, a grid search,[8] and the marginal procedure[9] in Panels A and B. In Panels C and D, we report only intervals constructed using the asymptotic approximation and the marginal procedure. For each model, we report estimates for $\tau = .25$, $\tau = .50$, and $\tau = .75$.
Looking first at Panels A and C, which report results for models that treat price as exogenous, we see modest differences between the asymptotic and finite sample intervals. At the median when no covariates (other than price and intercept) are included, the asymptotic interval is (-0.785, -0.037), and the widest of the finite sample intervals is (-1.040, 0.040). The differences become more pronounced at the 25th and 75th percentiles where we would expect the asymptotic approximation to be less accurate than at the center of the distribution. When day of the week effects are included, the asymptotic intervals tend to become narrower while the finite sample intervals widen slightly, leading to larger differences in this case. However, the basic results remain unchanged. Also, it is worth noting that all three computational methods for obtaining the finite sample confidence bounds give similar answers in the model with only an intercept and price, with the marginal approach performing slightly better than the other two procedures. This finding provides some evidence that MCMC and the marginal approach may do well computationally in estimation of the demand model, as a grid search may not be feasible in high dimensional problems.

[7] We use the Hall-Sheather bandwidth choice suggested by Koenker (2005) to implement the asymptotic standard errors.

[8] When price is treated as exogenous, we use different grid search regions for each quantile. For $\tau = .25$, we use an equally spaced grid over [5,10] with spacing .02 for $\alpha_0$ and an equally spaced grid over [-4,2] with spacing .015 for $\alpha_1$. For $\tau = .50$, we used an equally spaced grid over [0,10] with spacing .025 for $\alpha_0$ and an equally spaced grid with spacing .025 for $\alpha_1$. For $\tau = .75$, we used an equally spaced grid over [6,12] with spacing .0125 for $\alpha_0$. When price is treated as endogenous, for $\tau = .25$ we used an equally spaced grid over [-40,40] with spacing .25 for $\alpha_0$ and an equally spaced grid over [-10,30] for $\alpha_1$; for the other quantiles, we used an equally spaced grid over [0,30] with spacing .05 for $\alpha_0$ and an equally spaced grid over [-5,5] with spacing .05 for $\alpha_1$.

[9] For the marginal procedure, we considered an equally spaced grid over [-5,1] at .01 unit intervals for all models.
Turning now to results in Panels B and D, we see quite large differences between the asymptotic intervals and the intervals constructed using the finite sample approach. As above, the differences are particularly pronounced at the 25th and 75th percentiles where the finite sample intervals are extremely wide. Even at the median in the model with only price and an intercept, the finite sample intervals are approximately twice as wide as the corresponding asymptotic intervals. When additional controls are included, the finite sample bounds include the entire grid search region. The large differences between the finite sample and asymptotic intervals for all three quantiles when using instrumental variables definitely call into question the validity of the asymptotic approximation, which is not surprising given the relatively small sample size and the fact that in this case we are estimating a nonlinear instrumental variables model.

Finally, it is worth noting again that the three approaches to constructing the finite sample interval in general give similar results in this case. The differences between the grid search and MCMC approaches and the marginal approach could easily be resolved by increasing the search region for the marginal approach, which was restricted to values we felt were a priori plausible. The difference between the grid search and MCMC intervals at the 25th percentile is more troubling, though it could likely be resolved through additional simulations or starting points.
As a final illustration, plots of 95% confidence regions in the model that includes only price and an intercept are provided in Figures 6 and 7. Figure 6 contains confidence regions for the coefficients treating price as exogenous, and Figure 7 contains confidence regions where price is instrumented for using weather conditions. In the exogenous case, the regions are all more or less elliptical and seem to be well-behaved. In this case, it is not surprising that all of the procedures for generating finite sample intervals produce similar results. The regions in Figure 7, on the other hand, are not nearly so well-behaved. In general, they are more irregular and in many cases appear to be disconnected. The apparent failure of MCMC at the .25 quantile in the results in Table 2 is almost certainly due to the fact that the confidence region appears to be disconnected. The MCMC algorithm explores one of the regions but fails to jump to the other region. In cases like this, it is unlikely that a simple random walk Metropolis-Hastings algorithm will be sufficient to explore the space. While more complicated MCMC schemes or alternative stochastic search methods could be explored, it seems that the marginal procedure is a convenient method to pursue if one is interested solely in marginal inference.
2. Returns to Schooling. As our final example, we consider estimation of a simple returns to schooling model that allows for heterogeneity in the effect of schooling on wages. We use data and the basic identification strategy employed in the schooling study of Angrist and Krueger (1991). The data are drawn from the 1980 U.S. Census and include observations on men born between 1930 and 1939. The data contain information on wages, years of completed schooling, state and year of birth, and quarter of birth. The total sample consists of 329,509 observations.

As in the previous section, we focus on a simple linear quantile model of the form $Y = \alpha_0(U) + \alpha_1(U)S + X'\beta(U)$, where $Y$ is the log of the weekly wage, $S$ is years of completed schooling, $X$ is a vector of 51 state of birth and 9 year of birth dummies that enter with random coefficients $\beta(U)$, and $U$ is an unobservable normalized to follow a uniform distribution over (0,1). One might think of $U$ as indexing unobserved ability, in which case $\alpha_1(\tau)$ may be thought of as the return to schooling for an individual with unobserved ability $\tau$. Since we believe that years of schooling may be jointly determined with unobserved ability, we use quarter of birth as an instrument for schooling, following Angrist and Krueger (1991). We consider two different specifications. In the first, we set $\beta(U) = 0$, and in the second, we estimate $\beta(U)$.
As in the previous example, we construct asymptotic intervals using the inverse quantile regression estimator when we treat schooling as endogenous. For models in which we treat schooling as exogenous, we construct the asymptotic intervals using the conventional quantile regression estimator.

We present estimation results in Table 3. Panel A of Table 3 gives estimation results treating schooling as exogenous, and Panel B contains confidence intervals for the schooling effect when we instrument for schooling using quarter of birth dummy variables. Panels C and D include a set of 51 state of birth and 9 year of birth dummy variables but are otherwise identical to Panels A and B respectively. In every case, we provide estimates of the 95% confidence interval obtained from the usual asymptotic approximation and the finite sample procedure. For the finite sample procedure, we report intervals obtained via MCMC, a modified MCMC procedure (MCMC-2) that better accounts for the specifics of the problem, a grid search,[10] and the marginal procedure[11] in Panels A and B.

[10] When schooling is treated as exogenous, we use unequally spaced grids over [4.4,5.6] for $\alpha_0$ and unequally spaced grids over [0.062,0.077] for $\alpha_1$, where the spacing depends on the quantile under consideration. When schooling is treated as endogenous, we use an equally spaced grid over [3,6] with spacing .01 for $\alpha_0$ and an equally spaced grid over [0,0.25] with spacing .001 for $\alpha_1$ for all quantiles.

[11] For the marginal procedure, we considered an equally spaced grid over [0.055,0.077] at .0004 unit intervals for all models in the exogenous case, and we used an equally spaced grid over [-1,1] at .01 unit intervals in the endogenous case.
simple stochastic search algorithm that simultaneously runs
mode
at a local
MCMC
The
of the objective function.
chains each started
idea behind the procedure
we may
the chains get stuck near a local mode.
if
MCMC
is
that the simple
By
tends to get "stuck" because of the sharpness of the contours in this problem.
multiple chains started at different values,
even
five
If
potentially explore
more
MCMC
C and
procedure. In Panels
of the function
the starting points sufficiently cover the
function, the approach should accurately recover the confidence region
the unadjusted
more quickly than
D, we report only intervals constructed
using the asymptotic approximation and the marginal procedure. For each model,
estimates for r
Looking
in Panels
=
first at
A
=
r
.25,
and r
.50,
=
using
we report
.75.
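As a concrete illustration of the grid search construction of a joint confidence region, the sketch below evaluates a GMM-type statistic built from the indicators 1(Y ≤ θ₀ + θ₁D) (the form written out in equation (A.1) of the appendix) over a two-dimensional parameter grid and retains the points that are not rejected. The simulated data, instrument choice, grid ranges, and number of critical-value simulations are all assumptions, so this is only a schematic version of the procedures used for the tables.

import numpy as np

# Sketch of a grid-search construction of a joint 95% finite-sample confidence
# region for (theta0, theta1) in a linear quantile model Y = theta0 + theta1*D,
# using instruments g(Z) = (1, Z)'.  Data and grid ranges are illustrative.
rng = np.random.default_rng(1)
n, tau = 200, 0.5
Z = rng.standard_normal(n)
D = 0.5 * Z + rng.standard_normal(n)            # endogenous regressor (illustrative)
Y = 1.0 - 0.5 * D + rng.standard_normal(n)
G = np.column_stack([np.ones(n), Z])            # g(Z_i)

def L_stat(eps):
    """GMM-type statistic based on indicators eps_i and instruments G."""
    m = G.T @ (eps - tau)
    W = tau * (1 - tau) * (G.T @ G)
    return m @ np.linalg.solve(W, m)

# Exact conditional critical value: under the null the indicators are i.i.d.
# Bernoulli(tau) given Z, so the null distribution can be simulated directly.
sims = np.array([L_stat(rng.binomial(1, tau, size=n)) for _ in range(5000)])
crit = np.quantile(sims, 0.95)

# Sweep a two-dimensional grid and keep parameter values that are not rejected.
t0_grid = np.linspace(-1.0, 3.0, 81)
t1_grid = np.linspace(-2.0, 1.0, 61)
region = [(t0, t1) for t0 in t0_grid for t1 in t1_grid
          if L_stat((Y <= t0 + t1 * D).astype(float)) <= crit]
print(len(region), "grid points in the 95% joint confidence region")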
Looking first at the estimates of the conditional quantiles of log wages given schooling presented in Panels A and C, we see that there is very little difference between the finite sample and asymptotic inference results. In Panel A, where the model includes only a constant and the schooling variable, the finite sample and asymptotic intervals are almost identical. There are larger differences between the finite sample and asymptotic intervals in Panel C, which includes all 51 state of birth effects and 9 year of birth effects in addition to the schooling variable, though even in this case the differences are quite small. The close correspondence between the results is not surprising since in the exogenous case the parameters are well-identified and the sample is large enough that one would expect the asymptotic approximation to perform quite well for all but the most extreme quantiles.
While there is close agreement between the asymptotic and finite sample results in the model which treats schooling as exogenous, there are still substantial differences between the finite sample and asymptotic results in the case where we instrument for schooling using quarter of birth. The finite sample intervals are substantially wider than the asymptotic intervals in the model with only schooling and an intercept, though with the exception of the interval at the median, in all cases they exclude zero. When we consider the finite sample intervals in the model that includes the state of birth and year of birth covariates, the differences are huge. For all three quantiles, the finite sample interval includes at least one endpoint of the search region, and in no case are the bounds informative.
While the finite sample bounds may be quite conservative in models with covariates, the differences in this case are extreme. Also, we have evidence from the model which treats education as exogenous that in a well-identified setting the inflation of the finite sample bounds need not be large. Taken together, this suggests that identification in this model is quite weak.
While the finite sample intervals constructed through the different methods are similar at the median in the instrumented model, there are large differences between the finite sample intervals for the .25 and .75 quantiles, with the simple MCMC performing the worst and the marginal approach performing the best. The difficulty in this case is that the objective function has extremely sharp contours. This sharpness of contours is illustrated in Figure 8, which plots the 95% level confidence region obtained from the MCMC-2 procedure for the .75 quantile in the schooling example without covariates.
The shape of the confidence region poses difficulties for both the traditional grid search and the basic MCMC procedure. The problem with the grid search is that even with a very fine grid one is unlikely to find more than a few points in the region unless the grid is chosen carefully to include many points along the "line" describing the confidence region, and with a coarse grid one may miss the confidence region entirely. The narrowness of the confidence set also causes problems with MCMC by making transitions quite difficult. Essentially, with a default random walk Metropolis-Hastings procedure one must specify either a very small variance for the transition density or must specify the correlation exactly so that the proposals lie along the "line" describing the contours. Designing a transition density with the appropriate covariance is complicated, as even slight perturbations may result in proposals that lie off of the line, making MCMC transitions unlikely unless the variance is small. Taken together, this suggests that MCMC is likely to travel very slowly through the parameter space, resulting in poor convergence properties and difficulty in generating the finite sample confidence regions. The MCMC-2 procedure alleviates the problems with the random walk by running multiple chains with different starting values.
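A minimal sketch of the multiple-chain random-walk idea is given below. The stand-in objective, proposal covariance, starting values, and chain lengths are assumptions, and the quasi-posterior target exp(-L/2) is just one convenient way to drive the exploration, not necessarily the exact implementation used in the paper.

import numpy as np

# Multiple-chain random-walk Metropolis exploration of a sharply ridged objective.
# Draws with L(theta) <= crit are classified as inside the confidence region.
rng = np.random.default_rng(2)

def L(theta):
    # Stand-in objective with a narrow "ridge": theta[0] + theta[1] must be near 1.
    return 50.0 * (theta[0] + theta[1] - 1.0) ** 2 + 0.5 * theta[0] ** 2

def run_chain(start, cov, n_draws=2000):
    draws, cur, cur_L = [], np.asarray(start, float), L(start)
    chol = np.linalg.cholesky(cov)
    for _ in range(n_draws):
        prop = cur + chol @ rng.standard_normal(2)          # correlated proposal
        prop_L = L(prop)
        if np.log(rng.uniform()) < 0.5 * (cur_L - prop_L):  # accept/reject for exp(-L/2)
            cur, cur_L = prop, prop_L
        draws.append(cur.copy())
    return np.array(draws)

# Proposals oriented along the ridge (strong negative correlation, small scale).
cov = 0.05 * np.array([[1.0, -0.95], [-0.95, 1.0]])
starts = [(-1.0, 2.0), (0.0, 1.0), (1.0, 0.0), (2.0, -1.0), (0.5, 0.5)]
draws = np.vstack([run_chain(s, cov) for s in starts])      # five chains, pooled

crit = 5.99                                                 # e.g. a chi-squared(2) 95% cutoff
region = draws[np.array([L(t) for t in draws]) <= crit]
print(region.shape[0], "pooled draws fall inside the confidence region")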
Using multiple chains provides somewhat local exploration of the objective function around the starting values. In cases where the objective function is largely concentrated around a few local modes, this provides improvement in generating the finite sample confidence regions. This approach is still insufficient in this example at the .25 quantile, where we see that the MCMC-2 interval is still significantly shorter than the interval generated through the marginal approach, suggesting that the MCMC-2 procedure did not travel sufficiently through the parameter space.
In this example, the marginal approach seems to clearly dominate the other approaches to computing the finite sample confidence regions that we have considered. It finds more points that lie within the confidence bound for the parameter of interest than any of the other approaches. It is also simple to implement, and the search region can be chosen to produce a desired level of accuracy.
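One plausible way to implement such an optimization-plus-one-dimensional-search construction is sketched below: for each candidate value of the parameter of interest, the statistic is minimized over the remaining parameter and the candidate is kept whenever the minimized value falls below the critical value. The data, grids, critical value, and the use of an inner grid for the nuisance parameter are assumptions; the paper's marginal algorithm may differ in its details.

import numpy as np

# Profiling sketch for a confidence bound on the slope theta1 in Y = theta0 + theta1*D.
rng = np.random.default_rng(3)
n, tau = 200, 0.5
Z = rng.standard_normal(n)
D = 0.5 * Z + rng.standard_normal(n)
Y = 1.0 - 0.5 * D + rng.standard_normal(n)
G = np.column_stack([np.ones(n), Z])

def L_stat(theta0, theta1):
    eps = (Y <= theta0 + theta1 * D).astype(float)
    m = G.T @ (eps - tau)
    W = tau * (1 - tau) * (G.T @ G)
    return m @ np.linalg.solve(W, m)

t0_grid = np.linspace(-3.0, 5.0, 161)                  # inner grid for the intercept
def profiled(theta1):
    # Any global optimizer over the nuisance parameter could be used here.
    return min(L_stat(t0, theta1) for t0 in t0_grid)

crit = 5.99                                            # illustrative critical value
t1_grid = np.linspace(-2.0, 1.0, 301)                  # search region for theta1
accepted = [t1 for t1 in t1_grid if profiled(t1) <= crit]
if accepted:
    print("bound for theta1: [%.3f, %.3f]" % (min(accepted), max(accepted)))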
4. Conclusion

In this paper, we have presented an approach to inference in models defined by quantile restrictions that is valid under minimal assumptions. The approach does not rely on any asymptotic arguments, does not require the imposition of distributional assumptions, and will be valid for both linear and nonlinear conditional quantile models and for models which include endogenous as well as exogenous variables. The approach relies on the fact that the objective functions that quantile regression aims to solve are conditionally pivotal in finite samples. This conditional pivotal property allows the construction of exact finite sample joint confidence regions and of finite sample confidence bounds for quantile regression coefficients. The chief drawbacks of the approach are that it may be computationally difficult and that it may be quite conservative for performing inference about subsets of regression parameters. We suggest that MCMC or other stochastic search algorithms may be used to construct joint confidence regions. In addition, we suggest a simple algorithm that combines optimization with a one-dimensional search that can be used to construct confidence bounds for individual regression parameters.

In simulations, we find that the finite sample inference procedure is not conservative for testing hypotheses about the entire vector of regression parameters but that it is conservative for tests about individual regression parameters. However, the finite sample tests do have moderate power in many situations, and tests based on the asymptotic approximation tend to overreject. Overall, the findings of the simulation study are quite favorable to the finite sample approach.
We also consider the use of the finite sample inference in two simple empirical examples: estimation of a demand curve in a small sample and estimation of the returns to schooling in a large sample. In the demand example, we find modest differences between the finite sample and asymptotic intervals when we estimate conditional quantiles not instrumenting for price and large differences when we instrument for price. In the schooling example, the finite sample and asymptotic intervals are almost identical in models in which we treat schooling as exogenous, and again there are large differences between the approaches when we instrument for schooling. These results suggest that in both cases the identification of the structural parameters in the instrumental variables models is weak.

Appendix A. Optimality Arguments for L_n*

In the preceding sections, we introduced a finite sample inference procedure for quantile regression models and demonstrated that this procedure provides valid inference statements in finite samples. In this section, we show that the approach also has desirable large sample properties: (1) Under strong identification, the class of statistics of the form (2.3) contains a (locally) asymptotically uniformly most powerful (UMP) invariant test. Inversion of this test therefore gives (locally) asymptotically uniformly most accurate invariant regions. (The definitions of power and invariance follow those in Choi, Hall, and Schick (1996).) (2) Under weak identification, the class of statistics of the form (2.3) maximizes an average power function within a broad class of normal weight functions.
Here, we suppose that (Y_i, D_i, Z_i, i = 1, ..., n) is an i.i.d. sample from the model defined by A1-6 and assume that the dimension K of θ₀ is fixed. Although this assumption can be relaxed, the primary purpose of this section is to motivate the statistics used for finite-sample inference from an optimality point of view.
Recall that under A1-6, P[Y ≤ q(D, θ₀, τ)|Z] = τ. Consider the problem of testing

H₀ : θ₀ = θ*   vs.   Hₐ : θ₀ ≠ θ*,

where θ* ∈ R^K is some hypothesized value. Let ε_i = 1[Y_i ≤ q(D_i, θ*, τ)], i = 1, ..., n, and suppose testing is to be based on (ε_i, Z_i, i = 1, ..., n). As defined, ε|Z is i.i.d. Bernoulli(τ(Z, θ₀)), where τ(Z, θ₀) = P[Y ≤ q(D, θ₀, τ)|Z]. Because ε_i|Z_i is i.i.d. Bernoulli(τ) under the null, any statistic based on (ε_i, Z_i, i = 1, ..., n) is conditionally pivotal under H₀.

Let G = ∪_J G_J, where G_J is the class of R^J-valued functions g for which E[g(Z)g(Z)'] exists and is positive definite. As mentioned in the text, a "natural" class of test statistics is given by {L_n(θ*, g) : g ∈ G}, where

L_n(θ*, g) = [Σ_{i=1}^n g(Z_i)(ε_i - τ)]' [τ(1-τ) Σ_{i=1}^n g(Z_i)g(Z_i)']^{-1} [Σ_{i=1}^n g(Z_i)(ε_i - τ)].   (A.1)

Being based on (ε_i, Z_i, i = 1, ..., n), any such L_n(θ*, g) is conditionally pivotal under the null. In addition, under the null, L_n(θ*, g) →_d χ²_{dim(g)} for any g ∈ G.
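The conditional pivotal property can be illustrated numerically: conditional on Z, the indicators are i.i.d. Bernoulli(τ) under the null, so the exact finite-sample null distribution of L_n(θ*, g) can be simulated directly. In the sketch below, the sample size, τ, and the choice g(Z) = (1, Z)' are illustrative assumptions.

import numpy as np

# Simulate the exact finite-sample null distribution of the statistic in (A.1):
# under the null, eps_i | Z_i ~ i.i.d. Bernoulli(tau), so nothing else about the
# data-generating process needs to be known.
rng = np.random.default_rng(4)
n, tau = 100, 0.75
Z = rng.standard_normal(n)
G = np.column_stack([np.ones(n), Z])
W = tau * (1 - tau) * (G.T @ G)

def L_n(eps):
    m = G.T @ (eps - tau)
    return m @ np.linalg.solve(W, m)

null_draws = np.array([L_n(rng.binomial(1, tau, size=n)) for _ in range(20000)])
print("exact 95% critical value:", round(float(np.quantile(null_draws, 0.95)), 3))
print("chi-squared(2) 95% critical value is approximately 5.99 for comparison")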
Moreover, the class {L_n(θ*, g) : g ∈ G} enjoys desirable large sample power properties under the following strong identification assumption.
Assumption 3. (a) The distribution of Z ∈ R^{dim(Z)} does not depend on θ₀. (b) For every θ ∈ Θ* (and for almost every Z), τ(Z, θ) exists and is continuous in θ, where Θ* denotes some open neighborhood of θ₀. (c) The gradient

ḟ(Z, θ) = (∂/∂θ) τ(Z, θ)   (A.2)

exists, and ḟ*(Z) := ḟ(Z, θ*) ∈ G. (d) E[ sup_{θ∈Θ*} ||ḟ(Z, θ)|| ] < ∞.

If Assumption 3 holds and g ∈ G, then under contiguous alternatives induced by the sequence θ_{0,n} = θ* + b/√n,

L_n(θ*, g) →_d χ²_{dim(g)}( δ_S(b, g)/(τ(1-τ)) ),   (A.3)

where δ_S(b, g) = b'E[ḟ*(Z)g(Z)'] E[g(Z)g(Z)']^{-1} E[g(Z)ḟ*(Z)'] b. By a standard argument, δ_S(b, g) ≤ δ_S(b, ḟ*) for any g ∈ G. As a consequence, L_n(θ*, ḟ*) maximizes local asymptotic power within the class {L_n(θ*, g) : g ∈ G} for any b. An even stronger optimality result is the following.
Proposition 4. Among tests based on (ε_i, Z_i, i = 1, ..., n), the test which rejects for large values of L_n(θ*, ḟ*) is a locally asymptotically UMP (rotation) invariant test of H₀. Therefore, {L_n(θ*, g) : g ∈ G} is an (asymptotically) essentially complete class of tests of H₀ under Assumption 3.
Proof: The conditional (on Z_n = (Z_1, ..., Z_n)) log likelihood function is given by

ℓ_n(θ|Z) = Σ_{i=1}^n { ε_i log[τ(Z_i, θ)] + (1 - ε_i) log[1 - τ(Z_i, θ)] }.

Assumption 3 implies that the following LAN expansion is valid under the null: for any b ∈ R^K,

ℓ_n(θ* + b/√n) - ℓ_n(θ*) = b'S_n - (1/2) b'I b + o_p(1),

where ℓ_n(·) is the (unconditional) log likelihood function,

S_n = (1/√n) Σ_{i=1}^n ḟ*(Z_i)(ε_i - τ)/(τ(1-τ)),   and   I = E[ḟ*(Z)ḟ*(Z)']/(τ(1-τ)).

Theorem 3 of Choi, Hall, and Schick (1996) now shows that L_n(θ*, ḟ*) = Ŝ_n'Î_n^{-1}Ŝ_n is the asymptotically UMP invariant test of H₀, where Ŝ_n and Î_n denote the sample analogues of S_n and I.
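For concreteness, the algebra below (a reconstruction under the definitions above, not a quotation of the original argument) verifies that the quadratic form in the score coincides with the statistic (A.1) evaluated at g = ḟ*:

\[
\hat S_n' \hat I_n^{-1} \hat S_n
= \Big[\sum_{i=1}^n \dot f_*(Z_i)(\varepsilon_i-\tau)\Big]'
  \Big[\tau(1-\tau)\sum_{i=1}^n \dot f_*(Z_i)\dot f_*(Z_i)'\Big]^{-1}
  \Big[\sum_{i=1}^n \dot f_*(Z_i)(\varepsilon_i-\tau)\Big]
= L_n(\theta_*, \dot f_*),
\]

since the normalizing factors n^{-1/2}/(τ(1-τ)) in Ŝ_n and n^{-1}/(τ(1-τ)) in Î_n cancel.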
In view of Proposition 4, a key role is played by ḟ*. This gradient will typically be unknown but will be estimable under various assumptions ranging from parametric assumptions to nonparametric ones. As an illustration, consider the linear quantile model

Y = D'θ₀ + ε,   (A.4)

where P[ε ≤ 0|Z] = τ. If the conditional distribution of ε given (X, Z) admits a density (with respect to Lebesgue measure) f_{ε|X,Z}(·|x, z) and certain additional mild conditions hold, then Assumption 3 is satisfied with

ḟ*(Z) = -E[D f_{ε|X,Z}(0|X, Z) | Z],

an object which can be estimated nonparametrically. If, moreover,

D = Π'Z + v,   (A.5)

with (ε, v')' | Z ~ N(0, Σ) for some positive definite matrix Σ, then ḟ*(Z) is proportional to Π'Z and parametric estimation of ḟ* becomes feasible. (Assuming that the gradient belongs to a particular subclass of G will not affect the optimality result, as Proposition 4 (tacitly) assumes that it is known.) Estimation of the gradient will not affect the asymptotic validity of the test even if the full sample is used, nor will it affect the validity of finite-sample inference provided sample splitting is used (i.e., estimation of ḟ* and finite-sample inference are performed using different subsamples of the full sample).
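As a concrete illustration of the parametric route, the sketch below estimates the first-stage coefficients by least squares and uses g(Z) = (1, Π̂'Z)' in a statistic of the form (A.1). The simulated Gaussian design and the use of the full sample (rather than sample splitting) are assumptions.

import numpy as np

# Sketch: in the design (A.4)-(A.5) the gradient direction is proportional to
# Pi'Z, so a feasible instrument function is g(Z) = (1, Pi_hat'Z)' with Pi_hat
# from a first-stage regression of D on Z.  The design below is illustrative.
rng = np.random.default_rng(5)
n, tau = 400, 0.5
Z = rng.standard_normal((n, 3))
Pi = np.array([0.6, 0.3, 0.0])
D = Z @ Pi + rng.standard_normal(n)
Y = 1.0 + 0.5 * D + rng.standard_normal(n)              # true theta = (1.0, 0.5)

Pi_hat = np.linalg.lstsq(Z, D, rcond=None)[0]           # first-stage estimate
g = np.column_stack([np.ones(n), Z @ Pi_hat])           # g(Z) = (1, Pi_hat'Z)'

theta_star = np.array([1.0, 0.5])                       # hypothesized value under H0
eps = (Y <= theta_star[0] + theta_star[1] * D).astype(float)
m = g.T @ (eps - tau)
W = tau * (1 - tau) * (g.T @ g)
L = m @ np.linalg.solve(W, m)
print("L_n(theta*, g_hat) =", round(float(L), 3))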
Under weak identification, Proposition 4 will not hold as stated, but a closely related optimality result is available. The key difference between the strongly and weakly identified cases is that the defining property of a weakly identified model is that the counterpart of the gradient ḟ* is not consistently estimable. As such, asymptotic optimality results are too optimistic. Nevertheless, it is still possible to show that the statistic L_n* used in the main text has an attractive optimality property under the following weak identification assumption, in which τ(Z, θ₀) is modeled as a "locally linear" sequence of parameters.*

*Assumption 4 is motivated by the Gaussian model (A.4)-(A.5). In that model, parts (b) and (d) hold (with C proportional to √n Π) if part (c) does and Π varies with n in such a way that √n Π is a constant dim(Z) × K matrix (as in Staiger and Stock (1997)).
Assumption 4. (a) The distribution of Z does not depend on θ₀. (b) τ(Z, θ) - τ = n^{-1/2}[Z'C Λ_θ + R_n(Z, θ, C)] for some C ∈ R^{dim(Z)×K}, some function R_n, and a K-vector Λ_θ depending on θ - θ₀. (c) Σ_ZZ = E(ZZ') exists and is positive definite. (d) lim_{n→∞} E|R_n(Z, θ, C)| = 0 for every θ and every C.

If Assumption 4 holds and g ∈ G, then

L_n(θ*, g) →_d χ²_{dim(g)}( δ_W(Λ_θ, C, g)/(τ(1-τ)) ),   (A.6)
where δ_W(Λ_θ, C, g) = Λ_θ' E[C'Z g(Z)'] E[g(Z)g(Z)']^{-1} E[g(Z)Z'C] Λ_θ. As in the strongly identified case, the limiting distribution of L_n(θ*, g) in the weakly identified case is a noncentral χ² distribution. Within the class of tests based on a member of {L_n(θ*, g) : g ∈ G}, the asymptotically most powerful test is the one based on L_n(θ*, g_C), where g_C(Z) = C'Z. This test furthermore enjoys an optimality property analogous to the one established in Proposition 4. The proof of the result for L_n(θ*, g_C) is identical to that of Proposition 4, with b, S_n, and I of the latter proof replaced by Λ_θ,

S_n(C) = (1/(√n τ(1-τ))) Σ_{i=1}^n C'Z_i(ε_i - τ),   and   I_n(C) = (1/(τ(1-τ))) C'( (1/n) Σ_{i=1}^n Z_i Z_i' )C,   (A.7)

respectively. (In particular, the proof utilizes the fact that S_n(C) is asymptotically sufficient under Assumption 4.)

If C was known, the test based on L_n(θ*, g_C) could be implemented directly. However, L_n(θ*, g_C) is not implementable without knowledge of C, and because C cannot be consistently estimated in the present (weakly identified) case, its estimation cannot be treated "as if" C was known. It therefore seems more reasonable to search for a member of {L_n(θ*, g) : g ∈ G} which enjoys an optimality property that does not rely on this knowledge. To that end, let L_n* be the particular member of {L_n(θ*, g) : g ∈ G} for which g is the identity mapping. It follows from Muirhead (1982, Exercise 3.15) that, for any κ > 0 and any dim(D) × dim(D) matrix Σ̄, L_n* is a strictly increasing transformation of

∫ exp[ L_n(θ*, g_C)/(2(1 + κ)) ] dJ(C; Σ̄, κ),   (A.8)
where J(·; Σ̄, κ) is the cdf of the normal distribution with mean zero and variance given by a Kronecker product involving κ, Σ̄, and n^{-1} Σ_{i=1}^n Z_i Z_i'. In (A.8), the functional form of J(·) is "natural" insofar as it corresponds to the weak instruments prior employed by Chamberlain and Imbens (2004). Moreover, following Andrews and Ploberger (1994), the integrand in (A.8) is obtained by averaging the LAN approximation to the likelihood ratio with respect to the weight/prior measure K_C(dθ₀) associated with the distributional assumption Λ_θ ~ N(0, τ(1-τ) I_n(C)^{-1}). In view of the foregoing, it follows that the statistic L_n* enjoys weighted average power optimality properties of the Andrews and Ploberger (1994) variety.* This statement is formalized in the following result.

*As discussed by Andrews and Ploberger (1994, p. 1384), L_n* can therefore be interpreted as (being asymptotically equivalent to) a Bayesian posterior odds ratio.

Proposition 5. Among tests based on (ε_i, Z_i, i = 1, ..., n), under Assumption 4 the test based on L_n* is asymptotically equivalent to the test that maximizes the asymptotic average power

limsup_{n→∞} ∫ ∫ Pr( reject θ* | θ₀, C ) dK_C(θ₀) dJ(C).
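A rough sketch of the averaging idea behind a statistic of the general form (A.8) is given below: L_n(θ*, g_C) is computed for draws of C from a normal weight and the exponentiated values are averaged. The weight distribution, its scale, and the simulated design are assumptions, and the sketch is not the exact definition of L_n*; it only illustrates how such a weighted-average statistic can be approximated by Monte Carlo.

import numpy as np

# Monte Carlo approximation of a weighted-average statistic: average exp(L_n/2)
# over draws of the first-stage direction C from an assumed normal weight.
rng = np.random.default_rng(6)
n, tau = 300, 0.5
Z = rng.standard_normal((n, 2))
D = Z @ np.array([0.1, 0.05]) + rng.standard_normal(n)   # weakly identified design
Y = 1.0 + 0.5 * D + rng.standard_normal(n)
theta_star = (1.0, 0.5)
eps = (Y <= theta_star[0] + theta_star[1] * D).astype(float)

def L_n(gmat):
    m = gmat.T @ (eps - tau)
    W = tau * (1 - tau) * (gmat.T @ gmat)
    return m @ np.linalg.solve(W, m)

vals = []
for _ in range(2000):
    C = rng.standard_normal(2)                           # C drawn from the assumed weight
    gC = (Z @ C).reshape(-1, 1)                          # g_C(Z_i) = C'Z_i
    vals.append(np.exp(0.5 * L_n(gC)))
L_star_approx = np.log(np.mean(vals))                    # a monotone transform of the average
print("approximate weighted-average statistic:", round(float(L_star_approx), 3))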
References

Abadie, A. (1995): "Changes in Spanish Labor Income Structure During the 1980s: A Quantile Regression Approach," CEMFI Working Paper No. 9521.
Angrist, J. D., and A. B. Krueger (1991): "Does compulsory school attendance affect schooling and earnings," Quarterly Journal of Economics, 106, 979-1014.
Buchinsky, M. (1994): "Changes in U.S. wage structure 1963-87: An application of quantile regression," Econometrica, 62, 405-458.
Chamberlain, G. (1994): "Quantile regression, censoring and the structure of wages," in Advances in Econometrics, ed. by C. Sims. Elsevier: New York.
Chernozhukov, V. (2000): "Extremal Quantile Regression (Linear Models and an Asymptotic Theory)," MIT, Department of Economics, Technical Report, published in Ann. Statist. 33, no. 2 (2005), pp. 806-839, download at www.mit.edu/~vchern.
Chernozhukov, V., and C. Hansen (2001): "Instrumental Quantile Regression Inference for Structural and Treatment Effect Models," Journal of Econometrics (forthcoming).
Chernozhukov, V., and C. Hansen (2005a): "Instrumental Variable Quantile Regression," mimeo.
Chernozhukov, V., and C. Hansen (2005b): "An IV Model of Quantile Treatment Effects," Econometrica.
Chernozhukov, V., and H. Hong (2003): "An MCMC Approach to Classical Estimation," Journal of Econometrics, 115, 293-346.
Chernozhukov, V., G. Imbens, and W. Newey (2006): "Instrumental Variable Estimation of Nonseparable Models," forthcoming Journal of Econometrics.
Chesher, A. (2003): "Identification in Nonseparable Models," forthcoming in Econometrica.
Doksum, K. (1974): "Empirical probability plots and statistical inference for nonlinear models in the two-sample case," Ann. Statist., 2, 267-277.
Graddy, K. (1995): "Testing for Imperfect Competition at the Fulton Fish Market," Rand Journal of Economics, 26(1), 75-92.
Gutenbrunner, C., J. Jureckova, R. Koenker, and S. Portnoy (1993): "Tests of linear hypotheses based on regression rank scores," J. Nonparametr. Statist., 2(4), 307-331.
Hansen, L. P., J. Heaton, and A. Yaron (1996): "Finite Sample Properties of Some Alternative GMM Estimators," Journal of Business and Economic Statistics, 14(3), 262-280.
He, X., and F. Hu (2002): "Markov chain marginal bootstrap," J. Amer. Statist. Assoc., 97(459), 783-795.
Hu, L. (2002): "Estimation of a Censored Dynamic Panel Data Model," Econometrica, 70(6), 2499-2517.
Huber, P. J. (1973): "Robust regression: asymptotics, conjectures and Monte Carlo," Ann. Statist., 1, 799-821.
Koenker, R. (1997): "Rank tests for linear models," in Robust Inference, vol. 15 of Handbook of Statist., pp. 175-199. North-Holland, Amsterdam.
Koenker, R. (2005): Quantile Regression. Cambridge University Press.
Koenker, R., and G. S. Bassett (1978): "Regression Quantiles," Econometrica, 46, 33-50.
Koenker, R., and Z. Xiao (2004a): "Unit root quantile autoregression inference," J. Amer. Statist. Assoc., 99(467), 775-787.
Koenker, R., and Z. Xiao (2004b): "Quantile autoregression," www.econ.uiuc.edu.
MacKinnon, W. J. (1964): "Table for both the sign test and distribution-free confidence intervals of the median for sample sizes to 1,000," J. Amer. Statist. Assoc., 59, 935-956.
Matzkin, R. L. (2003): "Nonparametric estimation of nonadditive random functions," Econometrica, 71(5), 1339-1375.
Newey, W. K. (1997): "Convergence rates and asymptotic normality for series estimators," J. Econometrics, 79(1), 147-168.
Pakes, A., and D. Pollard (1989): "Simulation and Asymptotics of Optimization Estimators," Econometrica, 57, 1027-1057.
Parzen, M. I., L. J. Wei, and Z. Ying (1994): "A resampling method based on pivotal estimating functions," Biometrika, 81(2), 341-350.
Portnoy, S. (1985): "Asymptotic behavior of M estimators of p regression parameters when p²/n is large. II. Normal approximation," Ann. Statist., 13(4), 1403-1417.
Portnoy, S. (1991): "Asymptotic behavior of regression quantiles in nonstationary, dependent cases," J. Multivariate Anal., 38(1), 100-113.
Robert, C. P., and G. Casella (1998): Monte Carlo Statistical Methods. Springer-Verlag.
Stock, J. H., and J. H. Wright (2000): "GMM with weak identification," Econometrica, 68(5), 1055-1096.
Tamer, E. (2003): "Incomplete simultaneous discrete response model with multiple equilibria," Rev. Econom. Stud., 70(1), 147-165.
Walsh, J. E. (1960): "Nonparametric tests for median by interpolation from sign tests," Ann. Inst. Statist. Math., 11, 183-188.
[Figure 1 appears here: Panel A, "MCMC Draws and Critical Region for Demand Example (τ = 0.50)"; Panel B, "MCMC and Grid Search Critical Region for Demand Example (τ = 0.50)"; horizontal axis θ(.50)[2] (Intercept).]

Figure 1. MCMC and Grid Search Confidence Regions. This figure illustrates the construction of a 95% level confidence region by MCMC and a grid search in the demand example from Section 3. Panel A shows the MCMC draws. The gray +'s represent draws that fell within the confidence region, and the black dots represent draws outside of the confidence region. Panel B plots the MCMC draws within the confidence region (gray +'s) and the grid search confidence region (black line).
Table 1. Monte Carlo Results

                                  Asymptotic Inference              Finite Sample Inference
Null Hypothesis               τ = 0.50  τ = 0.75  τ = 0.90      τ = 0.50  τ = 0.75  τ = 0.90

A. Exogenous Model
θ(τ)[1] = θ₀(τ)[1]             0.0716    0.0676    0.0920        0.0080    0.0064    0.0044
θ(τ) = θ₀(τ)                   0.0744    0.0820    0.1392        0.0516    0.0448    0.0448

B. Endogenous Model, Π = .05
θ(τ)[1] = θ₀(τ)[1]             0.2380    0.2392    0.2720        0.0240    0.0212    0.0200
θ(τ) = θ₀(τ)                   0.2352    0.3604    0.4740        0.0488    0.0460    0.0484

C. Endogenous Model, Π = .5
θ(τ)[1] = θ₀(τ)[1]             0.0744    0.0808    0.1240        0.0300    0.0300    0.0204
θ(τ) = θ₀(τ)                   0.0732    0.0860    0.1504        0.0552    0.0560    0.0464

D. Endogenous Model, Π = 1
θ(τ)[1] = θ₀(τ)[1]             0.0632    0.0784    0.1168        0.0232    0.0216    0.0192
θ(τ) = θ₀(τ)                   0.0508    0.0772    0.1440        0.0524    0.0476    0.0508

Note: Simulation results for asymptotic and finite sample inference for quantile regression. Each panel reports results for a different simulation model. Each simulation model has quantiles of the form q(D, θ₀, τ) = θ₀(τ)[2] + θ₀(τ)[1]D. The first row within each panel reports rejection frequencies for 5% level tests of the hypothesis that θ(τ)[1] = θ₀(τ)[1], and the second row reports rejection frequencies for 5% level tests of the joint hypothesis θ = θ₀. The number of simulations is 2500.
[Figure 2 appears here: three panels for τ = 0.50, 0.75, 0.90; horizontal axes θ(τ)[1] - θ₀(τ)[1].]

Figure 2. Power Curves for Exogenous Simulation Model. This figure plots power curves for the simulations contained in Panel A of Table 1. The solid line is the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the asymptotic approximation.
[Figure 3 appears here: three panels for τ = 0.50, 0.75, 0.90; horizontal axes θ(τ)[1] - θ₀(τ)[1].]

Figure 3. Power Curves for Endogenous Simulation Model, Π = 0.05. This figure plots power curves for the simulations contained in Panel B of Table 1, which correspond to a nearly unidentified case. The solid line is the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the asymptotic approximation.
[Figure 4 appears here: three panels for τ = 0.50, 0.75, 0.90; horizontal axes θ(τ)[1] - θ₀(τ)[1].]

Figure 4. Power Curves for Endogenous Simulation Model, Π = .5. This figure plots power curves for the simulations contained in Panel C of Table 1. The solid line is the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the asymptotic approximation.
[Figure 5 appears here: three panels for τ = 0.50, 0.75, 0.90; horizontal axes θ(τ)[1] - θ₀(τ)[1].]

Figure 5. Power Curves for Endogenous Simulation Model, Π = 1. This figure plots power curves for the simulations contained in Panel D of Table 1. The solid line is the power curve for a test based on the finite sample inference procedure, and the dashed line is the power curve from a test based on the asymptotic approximation.
Table 2. Demand for Fish

Estimation Method                             τ = 0.25            τ = 0.50            τ = 0.75

A. Quantile Regression (No Instruments)
Quantile Regression (Asymptotic)          (-0.874, 0.073)     (-0.785, -0.037)    (-1.174, -0.242)
Finite Sample (MCMC)                      (-1.348, 0.338)     (-1.025, 0.017)     (-1.198, 0.085)
Finite Sample (Grid)                      (-1.375, 0.320)     (-1.015, 0.020)     (-1.195, 0.065)
Finite Sample (Marginal)                  (-1.390, 0.350)     (-1.040, 0.040)     (-1.210, 0.090)

B. IV Quantile Regression (Stormy, Mixed as Instruments)
Inverse Quantile Regression (Asymptotic)  (-1.802, 0.030)     (-2.035, -0.502)    (-2.486, -0.250)
Finite Sample (MCMC)                      (-4.403, 1.337)     (-3.566, 0.166)     (-5.198, 25.173)
Finite Sample (Grid)                      (-4.250, 40]        (-3.600, 0.200)     (-5.150, 24.850)
Finite Sample (Marginal)                  (-4.430, 1]         (-3.610, 0.220)     [-5, 1]

C. Quantile Regression - Day Effects (No Instruments)
Quantile Regression (Asymptotic)          (-0.695, -0.016)    (-0.718, -0.058)    (-1.265, -0.329)
Finite Sample (Marginal)                  (-1.610, 0.580)     (-1.360, 0.320)     (-1.350, 0.400)

D. IV Quantile Regression - Day Effects (Stormy, Mixed as Instruments)
Inverse Quantile Regression (Asymptotic)  (-2.403, -0.324)    (-1.457, 0.267)     (-1.895, -0.463)
Finite Sample (Marginal)                  [-5, 1]             [-5, 1]             [-5, 1]

Note: 95% level confidence interval estimates for the Demand for Fish example. Panel A reports results from a model which treats price as exogenous, and Panel B reports results from a model which treats price as endogenous and uses weather conditions as instruments for price. Panels C and D are as A and B but include a set of dummy variables for day of the week. The first row in each panel reports the interval estimated using the asymptotic approximation, and the remaining rows report estimates of the finite sample interval constructed through various methods.
[Figure 6 appears here: panels of joint finite sample confidence regions.]

Figure 6. Finite Sample Confidence Regions for Fish Example Treating Price as Exogenous. This figure plots finite sample confidence regions from the fish example without covariates treating price as exogenous. Values for the intercept, θ(τ)[2], are on the horizontal axis, and values for the slope parameter θ(τ)[1] are on the vertical axis.
[Figure 7 appears here: panels of joint finite sample confidence regions; horizontal axes labeled Intercept.]

Figure 7. Finite Sample Confidence Regions for Fish Example Treating Price as Endogenous. This figure plots finite sample confidence regions from the fish example without covariates treating price as endogenous. Values for the intercept, θ(τ)[2], are on the horizontal axis, and values for the slope parameter θ(τ)[1] are on the vertical axis.
Table 3. Returns to Schooling

Estimation Method                             τ = 0.25            τ = 0.50            τ = 0.75

A. Quantile Regression (No Instruments)
Quantile Regression (Asymptotic)          (0.0715, 0.0731)    (0.0642, 0.0652)    (0.0637, 0.0650)
Finite Sample (MCMC)                      (0.0710, 0.0740)    (0.0640, 0.0660)    (0.0637, 0.0656)
Finite Sample (Grid)                      (0.0710, 0.0740)    (0.0641, 0.0659)    (0.0638, 0.0655)
Finite Sample (Marginal)                  (0.0706, 0.0742)    (0.0638, 0.0662)    (0.0634, 0.0658)

B. IV Quantile Regression (Quarter of Birth Instruments)
Inverse Quantile Regression (Asymptotic)  (0.0784, 0.2064)    (0.0563, 0.1708)    (0.0410, 0.1093)
Finite Sample (MCMC)                      (0.1151, 0.1491)    (0.0378, 0.1203)    (0.0595, 0.0703)
Finite Sample (MCMC-2)                    (0.0580, 0.2864)    (0.0378, 0.1203)    (0.0012, 0.0751)
Finite Sample (Grid)                      (0.059, 0.197)      (0.041, 0.119)      (0.021, 0.073)
Finite Sample (Marginal)                  (0.05, 0.39)        (0.03, 0.13)        (0.00, 0.08)

C. Quantile Regression - State and Year of Birth Effects (No Instruments)
Quantile Regression (Asymptotic)          (0.0666, 0.0680)    (0.0615, 0.0628)    (0.0614, 0.0627)
Finite Sample (Marginal)                  (0.0638, 0.0710)    (0.0594, 0.0650)    (0.0590, 0.0654)

D. IV Quantile Regression - State and Year of Birth Effects (Quarter of Birth Instruments)
Inverse Quantile Regression (Asymptotic)  (0.0661, 0.1459)    (0.0625, 0.1368)    (0.0890, 0.2057)
Finite Sample (Marginal)                  (-0.24, 1]          [-1, 1]             [-1, 0.35]

Note: 95% level confidence interval estimates for the Returns to Schooling example. Panel A reports results from a model which treats schooling as exogenous, and Panel B reports results from a model which treats schooling as endogenous and uses quarter of birth dummies as instruments for schooling. Panels C and D are as A and B but include a set of 51 state of birth dummy variables and a set of 9 year of birth dummy variables. The first row in each panel reports the interval estimated using the asymptotic approximation, and the remaining rows report estimates of the finite sample interval constructed through various methods.
[Figure 8 appears here: 95% Confidence Region for Schooling, 75th Percentile; horizontal axis θ(.75)[1] (Slope).]

Figure 8. Confidence Region for 75th Percentile in Schooling Example. This figure plots the 95% finite sample confidence region from the schooling example in the model without covariates treating schooling as endogenous. Values for the intercept, θ(.75)[0], are on the vertical axis, and values for the slope parameter θ(.75)[1] are on the horizontal axis.