Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/instrumentalvariOOcher
Massachusetts Institute of Technology
Department of Economics
Working Paper Series
INSTRUMENTAL VARIABLE QUANTILE REGRESSION
Victor Chernozhukov
Christian Hansen
Working Paper 06-1
October 15, 2004
Room E52-251
50 Memorial Drive
Cambridge, MA 02142
This paper can be downloaded without charge from the Social Science Research Network Paper Collection at
http://ssrn.com/abstract=909666
Instrumental Variable Quantile Regression*
Victor Chernozhukov and Christian Hansen
First Version: May 2001
Revised: December 2004
Abstract
Quantile regression is an increasingly important tool that estimates the conditional quantiles of a response $Y$ given a vector of regressors $D$. It usefully generalizes Laplace's median regression and can be used to measure the effect of covariates not only in the center of a distribution, but also in the upper and lower tails. For the linear quantile model defined by $Y = D'\gamma(U)$ where $D'\gamma(U)$ is strictly increasing in $U$ and $U$ is a standard uniform variable independent of $D$, quantile regression allows estimation of quantile specific covariate effects $\gamma(\tau)$ for $\tau \in (0, 1)$. In this paper, we propose an instrumental variable quantile regression estimator that appropriately modifies the conventional quantile regression and recovers quantile-specific covariate effects in an instrumental variables model defined by $Y = D'\alpha(U)$ where $D'\alpha(U)$ is strictly increasing in $U$ and $U$ is a uniform variable that may depend on $D$ but is independent of a set of instrumental variables $Z$. The proposed estimator and inferential procedures are computationally convenient in typical applications and can be carried out using software available for conventional quantile regression. In addition, the estimation procedure gives rise to a convenient inferential procedure that is naturally robust to weak identification. The use of the proposed estimator and testing procedure is illustrated through two empirical examples.
Keywords: Quantile Regression, Instrumental Variables, Weak Instruments
1. Introduction
Quantile regression is an important tool for estimating conditional quantile models that has been used in many empirical studies and has been studied extensively in theoretical statistics; see Koenker and Bassett (1978), Koenker and Portnoy (1987), Portnoy (1991), Gutenbrunner and Jureckova (1992), Chaudhuri, Doksum, and Samarov (1997), Portnoy and Koenker (1997), Knight (1998), Koenker and Machado (1999), Portnoy (2001), and He and Zhu (2003). One of quantile regression's most appealing features is its ability to estimate quantile-specific effects that describe the impact of covariates not only on the center but also on the tails of the outcome distribution. While the central effects, such as the mean effect obtained through conditional mean regression, provide interesting summary statistics of the impact of a covariate, they fail to describe the full distributional impact unless the variable affects both the central and the tail quantiles in the same way. In addition, in many cases interest focuses on the impact of covariates on points other than the center of the distribution. For example, in a study of the effectiveness of a job training program, the effect of training on the low tail of the earnings distribution will likely be of more interest for public policy than the effect of training on the mean of the distribution.

* The Matlab software for this paper is available upon request via e-mail chansen1@gsb.uchicago.edu. Further updates of this paper can be downloaded at www.mit.edu/~vchern. Address correspondence to C. Hansen, Asst. Prof. of Econometrics and Statistics, The University of Chicago, Graduate School of Business, 5807 South Woodlawn Avenue, Chicago, IL 60637, USA, chansen1@gsb.uchicago.edu.
Portions of this paper were previously included in "An IV Model of Quantile Treatment Effects", MIT Department of Economics Working Paper 02-07, 2001.

For an outcome $Y$ and set of factors $D$ affecting the outcome, the conventional linear conditional quantile model may be defined as
$$Y = D'\gamma(U^*), \quad U^* \mid D \sim \text{Uniform}(0, 1), \tag{1.1}$$
where $\tau \mapsto D'\gamma(\tau)$ is strictly increasing and continuous in $\tau$. Doksum (1974) interprets the disturbance $U^*$ as individual ability or proneness. By construction, $D'\gamma(\tau)$ is the $\tau$-quantile of $Y$ conditional on $D$. This model generalizes the usual linear regression model $Y = D'\gamma_0 + \gamma_1(U^*)$ by allowing quantile-specific effects of covariates $D$. For a given quantile indexed by $\tau \in (0,1)$, the quantile specific effects $\gamma(\tau)$ can be estimated using standard quantile regression methods (e.g. Koenker and Bassett (1978)).
In this paper, we develop a new estimation method for an endogenous generalization of the above model. The developed approach is designed for settings where the observed variables $D$ are determined non-experimentally, making it difficult to infer the true structural/causal effects of $D$ on the outcomes. Specifically, we consider the model
$$Y = D'\alpha(U), \quad U \mid Z \sim \text{Uniform}(0, 1), \tag{1.2}$$
where $\tau \mapsto D'\alpha(\tau)$ is strictly increasing in $\tau$, $D$ is statistically dependent on $U$, and $Z$ is a set of instrumental variables that are independent of $U$ but statistically related to $D$. Since $D$ depends on $U$, the sampled $D$ will depend on $U$, leading to biased sampling or endogeneity. This endogeneity makes $\gamma(\tau) \neq \alpha(\tau)$, rendering conventional quantile regression inconsistent for estimating (1.2).

For example, suppose that $Y$ is the hourly wage of a worker and that $D$ is an individual's level of training. The unobserved disturbance $U$ would reflect unobserved personal characteristics, such as ability, which influence the individual's wage via the equation $Y = D'\alpha(U)$. If high-ability individuals choose high levels of training, then the level of training is correlated with ability, which causes dependence between $U$ and $D$ and implies that conventional quantile regression will overstate the true effect of training on earnings, $\gamma(\tau) > \alpha(\tau)$. Instrumental variables $Z$, such as random assignment to training programs in the training context, allow us to overcome this problem by providing a source of variation in $D$ that is independent of $U$. There are many other interesting examples where $D$ is sampled depending on $U$, i.e. endogenously. The empirical section presents a supply-demand example and a training example.
Model (1.2) generalizes the conventional instrumental variables model with additive disturbances $Y = D'\alpha_0 + \alpha_1(U)$ where $U \mid Z \sim U(0, 1)$ to cases where the impact of $D$ varies across quantiles of the outcome distribution. A number of appealing approaches are readily available to estimate $\alpha_0$ in the conventional instrumental variables model with additive disturbances, including the conventional two-stage least squares (2SLS) estimator and its robust analogs by Amemiya (1982) and Chen and Portnoy (1996).
The purpose of this paper is to provide practical estimation and inference methods for model (1.2). The estimator we propose is an appealing modification of the standard quantile regression that can be constructed from a series of conventional quantile regressions. Thus, the estimation approach is computationally convenient and simple to implement in many typical applications. It has already been used in empirical applications, e.g. Hausman and Sidak (2004), Januszewski (2004), and Chernozhukov and Hansen (2004a), and will be further illustrated with two empirical applications in this paper. In addition, the estimation procedure leads naturally to an inference procedure that will be valid even when one of the key conditions for identification of the model, that $D$ is statistically dependent on $Z$, fails.

The remainder of this paper is organized as follows. In Section 2, we define the model in more detail and allow for other controls in the equations. Section 3 discusses estimation and testing procedures based on a set of moment equations introduced in Section 2. Section 4 illustrates the use of the derived estimator and testing procedure through brief empirical examples, and Section 5 concludes.
2. The Instrumental Quantile Regression Method
2.1. The Model
In this section, we more formally define the model we will estimate. Suppose we have a structural relationship defined by
$$Y = D'\alpha(U) + X'\beta(U), \quad U \mid X, Z \sim \text{Uniform}(0, 1), \tag{2.1}$$
$$D = \delta(X, Z, V), \text{ where } V \text{ is statistically dependent on } U, \tag{2.2}$$
$$\tau \mapsto D'\alpha(\tau) + X'\beta(\tau) \text{ strictly increasing in } \tau. \tag{2.3}$$
In these equations,
• $Y$ is the scalar outcome variable of interest,
• $U$ is a scalar random variable that aggregates all of the unobserved factors affecting the structural outcome equation,
• $D$ is a vector of endogenous variables determined by (2.2), where
• $V$ is a vector of unobserved disturbances determining $D$ and correlated with $U$,
• $Z$ is a vector of instrumental variables (control variables excluded from (2.1) that are independent of the disturbance $U$ but impact variable $D$ via (2.2)), and
• $X$ is a vector of included control variables.
The observed variables consist of $(Y, D, X, Z)$, and due to the dependence between $V$ and $U$, $D$ is also sampled depending on $U$.

We shall refer to the function
$$S_Y(\tau \mid d, x) = d'\alpha(\tau) + x'\beta(\tau) \tag{2.4}$$
as the Structural Quantile Function (SQF) in order to emphasize that it is in general a different object than the conditional quantile function $Q_Y(\tau \mid d, x)$. The structural quantile function $S_Y(\tau \mid d, x)$ describes the quantile function of the latent outcome variable $Y_d = d'\alpha(U) + X'\beta(U)$ obtained by fixing $D = d$ and sampling the disturbance $U \sim U(0, 1)$ (all conditional on $X$). This notion of sampling corresponds to independent sampling of $D$ and $U$, which is generally not feasible outside experimental settings. Instead the sampled variable $D$ is determined via (2.2). Nevertheless, it is still possible to estimate or make inference on the structural quantile function $S_Y(\tau \mid d, x)$ through the use of instrumental variables $Z$ which induce variation in $D$ but are themselves independent of $U$.
2.2. The Principle
Under the conditions of (2.1) and (2.3), the problem of dependence between $U$ and $D$ is overcome through the presence of instrumental variables, $Z$, that affect the determination of $D$ but are independent of $U$. In program evaluation studies with imperfect compliance, a simple example of an instrument is random assignment to the treatment group, which is done independently of the potential values of $U$. The presence of the instrumental variable leads to a set of moment equations that can be used to estimate the parameters of (2.1). From (2.1) and (2.3), the event $\{Y \le S_Y(\tau \mid D, X)\}$ is equivalent to the event $\{U \le \tau\}$. It then follows from (2.1) that
$$P[Y \le S_Y(\tau \mid D, X) \mid Z, X] = \tau. \tag{2.5}$$
Equation (2.5) provides a useful statistical restriction that can be used to estimate the structural parameters $\alpha$ and $\beta$. It is important to notice that the equation $P[Y \le S_Y(\tau \mid D, X) \mid Z, X] = \tau$ differs from the conventional estimating equation
$$P[Y \le Q_Y(\tau \mid D, X) \mid D, X] = \tau \tag{2.6}$$
used to estimate the conditional quantile function of $Y$ given $D$ and $X$.
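The difference between the restrictions (2.5) and (2.6) can be made concrete with a small exact enumeration. The sketch below is illustrative only: the functions `alpha`, `beta`, and `treatment`, the grid size `N`, and the thresholds are hypothetical choices, not taken from the paper. It builds a model satisfying (2.1)-(2.3) with a binary $D$, no $X$, and an instrument $Z$ independent of $U$, then computes the share of outcomes below the structural quantile function conditional on $Z$ and conditional on $D$.

```python
TAU = 0.5
N = 1000  # grid size for U; a hypothetical choice for this illustration


def alpha(u):
    """Hypothetical quantile-specific effect of D."""
    return 1.0 + u


def beta(u):
    """Hypothetical intercept function."""
    return u


def sqf(tau, d):
    """Structural quantile function S_Y(tau | d) = d * alpha(tau) + beta(tau)."""
    return d * alpha(tau) + beta(tau)


def treatment(u, z):
    """Endogenous selection: D depends on the disturbance U and on Z."""
    threshold = 0.5 if z == 0 else 0.2
    return 1 if u > threshold else 0


# Enumerate U on a uniform grid, independently of Z in {0, 1}.
rows = []
for z in (0, 1):
    for k in range(N):
        u = (k + 0.5) / N
        d = treatment(u, z)
        y = d * alpha(u) + beta(u)
        rows.append((y, d, z))


def share_below_sqf(rows, keep):
    """Fraction of the kept rows with Y <= S_Y(TAU | D)."""
    kept = [(y, d) for (y, d, z) in rows if keep(d, z)]
    return sum(1 for (y, d) in kept if y <= sqf(TAU, d)) / len(kept)


p_given_z0 = share_below_sqf(rows, lambda d, z: z == 0)
p_given_z1 = share_below_sqf(rows, lambda d, z: z == 1)
p_given_d1 = share_below_sqf(rows, lambda d, z: d == 1)
print(p_given_z0, p_given_z1, p_given_d1)
```

Conditional on either value of $Z$ the share equals $\tau = 0.5$, as (2.5) requires, while conditional on $D = 1$ it is about 0.23. In this toy design the structural quantile function therefore does not satisfy the conventional restriction (2.6).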
Recall from Koenker and Bassett (1978) that the ordinary quantile regression (QR) is formulated as finding the best predictor of $Y$ given $W$ under the asymmetric least absolute deviation loss $\rho_\tau(u) := (\tau - 1(u < 0))u$. In other words, assuming integrability, the $\tau$-th conditional quantile of $Y$ given $W$ solves the problem:
$$Q_Y(\tau \mid W) = \arg\min_{f \in \mathcal{F}} E[\rho_\tau(Y - f(W))], \tag{2.7}$$
where $\mathcal{F}$ is the class of measurable functions of $W$ (restricted in applications to be a set of flexible finite parametric functions). Laplace's median regression function $Q_Y(.5 \mid W)$ solves the above problem with $\tau = 1/2$ so that $\rho_\tau(u) = \frac{1}{2}|u|$. The function $Q_Y(\tau \mid D, X)$ is a solution of this problem with $W = (D, X)$ and can be estimated using the finite sample analog of the above equation.

The moment equation given in (2.5) is equivalent to the statement that 0 is the $\tau$-th quantile of random variable $Y - S_Y(\tau \mid D, X)$ conditional on $(Z, X)$:
$$0 = Q_{Y - S_Y(\tau \mid D, X)}(\tau \mid Z, X) \text{ a.s. for each } \tau. \tag{2.8}$$
Thus, we may pose the problem of finding $\alpha(\tau)$ and $\beta(\tau)$ solving equation (2.5) as the instrumental variable or inverse quantile regression (IVQR). This problem is to find an $S_Y(\tau \mid D, X)$ such that 0 is a solution to the quantile regression of $Y - S_Y(\tau \mid D, X)$ on $(Z, X)$:
$$0 = \arg\min_{f \in \mathcal{F}} E\,\rho_\tau[Y - S_Y(\tau \mid D, X) - f(Z, X)], \tag{2.9}$$
where $\mathcal{F}$ is the class of measurable functions of $(X, Z)$ (which will be restricted in applications). The term 'inverse' emphasizes an evident inverse relation of this problem to the conventional quantile regression, (2.7).
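As a small illustration of the loss $\rho_\tau$ in (2.7), the following sketch minimizes the sample average of the check loss over constants, i.e. the case where $W$ contains only an intercept. The data and the helper name `sample_quantile` are hypothetical. It relies on the standard fact that a minimizer can be found among the observed values.

```python
def check_loss(u, tau):
    """rho_tau(u) = (tau - 1(u < 0)) * u, the asymmetric absolute loss."""
    return (tau - (1.0 if u < 0 else 0.0)) * u


def sample_quantile(y, tau):
    """Minimize the summed check loss over constants.

    A minimizer can always be found among the observed values, so a
    search over the data points is enough for this illustration.
    """
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))


y = [1, 2, 3, 4, 100]  # hypothetical outcomes with one outlier
median = sample_quantile(y, 0.5)
upper = sample_quantile(y, 0.9)
print(median, upper)  # prints: 3 100
```

With $\tau = 1/2$ the loss reduces to $\frac{1}{2}|u|$ and the minimizer is the sample median, which is unaffected by the outlier. With $\tau = 0.9$ the minimizer moves to the upper tail.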
3. The Instrumental Quantile Regression Estimator and Derived Dual Inference
3.1. Basic Description and Properties
Next we consider a finite-sample analog of the above procedure. Define the (weighted) conventional quantile regression objective function as
$$Q_n(\tau, \alpha, \beta, \gamma) := \frac{1}{n} \sum_{i=1}^{n} \rho_\tau(Y_i - D_i'\alpha - X_i'\beta - \hat Z_i'\gamma)\hat V_i,$$
where $D$ is a $\dim(\alpha)$-vector of endogenous variables, $X$ is a $\dim(\beta)$-vector of exogenous explanatory variables, $\hat Z_i := f(X_i, Z_i)$ is a $\dim(\gamma)$-vector of instrumental variables such that $\dim(\gamma) \ge \dim(\alpha)$, and $\hat V_i := V(X_i, Z_i) > 0$ is a scalar weight. In practice, a simple procedure is to set $\hat V_i = 1$ and let $\hat Z_i$ either be $Z_i$ or the predicted value from a least squares projection of $D_i$ on $Z_i$ and $X_i$.

The instrumental variable or inverse quantile regression estimator (IVQR) is defined as follows. For a given value of the structural parameter, say $\alpha$, let us run the ordinary QR to obtain
$$(\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau)) := \arg\min_{\beta, \gamma} Q_n(\tau, \alpha, \beta, \gamma). \tag{3.1}$$
To find an estimate for $\alpha(\tau)$, we will look for a value $\alpha$ that makes the coefficient on the instrumental variable $\hat\gamma(\alpha, \tau)$ as close to 0 as possible. Formally, let
$$\hat\alpha(\tau) = \arg\inf_{\alpha \in \mathcal{A}} [W_n(\alpha)], \quad W_n(\alpha) := n[\hat\gamma(\alpha, \tau)']\hat A(\alpha)[\hat\gamma(\alpha, \tau)], \tag{3.2}$$
where $\hat A(\alpha) = A(\alpha) + o_p(1)$ and $A(\alpha)$ is positive definite, uniformly in $\alpha \in \mathcal{A}$. It is convenient to set $A(\alpha)$ equal to the inverse of the asymptotic covariance matrix of $\sqrt{n}(\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau))$, in which case $W_n(\alpha)$ is the Wald statistic for testing $\gamma(\alpha, \tau) = 0$, a fact that we will use below for inference about $\alpha(\tau)$ itself. The parameter estimates are then given by
$$\hat\theta(\tau) := (\hat\alpha(\tau), \hat\beta(\tau)) := (\hat\alpha(\tau), \hat\beta(\hat\alpha(\tau), \tau)). \tag{3.3}$$
The estimator (3.3) is a finite-sample instrumental variable quantile regression. Analogous to the population problem (2.8), it finds parameter values for $\alpha$ and $\beta$ through the inverse step (3.2) such that the value of coefficient $\hat\gamma(\alpha, \tau)$ on $Z$ in the ordinary quantile regression step (3.1) is driven as close to zero as possible. This estimator is consistent and asymptotically normal under appropriate regularity and identification conditions:
$$\sqrt{n}(\hat\theta(\tau) - \theta(\tau)) \to_d N(0, \Omega_\theta), \tag{3.4}$$
for $\Omega_\theta$ specified below. This asymptotic distribution can be used for conducting direct inference on the parameter of interest using standard Wald procedures.

In addition, we can base inference on the "dual" Wald statistic $W_n(\alpha)$ for testing whether the coefficients on the instruments are zero (i.e. whether $\gamma(\alpha, \tau) = 0$). When $\alpha = \alpha(\tau)$, $W_n(\alpha)$ is asymptotically chi-squared with $\dim(\gamma)$ degrees of freedom:
$$W_n(\alpha(\tau)) \to_d \chi^2(\dim(\gamma)). \tag{3.5}$$
Thus, a valid confidence region for $\alpha(\tau)$ can also be based on the inversion of this dual Wald statistic:
$$CR_p[\alpha(\tau)] := \{\alpha : W_n(\alpha) \le c_p\} \text{ contains } \alpha(\tau) \text{ with probability approaching } p, \tag{3.6}$$
where $c_p$ is the $p$-percentile of a $\chi^2(\dim(\gamma))$ distribution. This dual approach is valid under weaker assumptions than the direct approach; in particular, it is robust to weak or partial identification of $\alpha(\tau)$. Section 4.2 discusses the properties of the dual procedure in more detail.

For a given probability index $\tau$ of interest, the estimator may be computed in practice as follows:
1. Define a suitable set of values $\{\alpha_j, j = 1, ..., J\}$, and run the ordinary $\tau$-quantile regression of $Y_i - D_i'\alpha_j$ on $X_i$ and $\hat Z_i$ to obtain coefficients $\hat\beta(\alpha_j, \tau)$ and $\hat\gamma(\alpha_j, \tau)$.
2. Save the inverse of the variance-covariance matrix of $\hat\gamma(\alpha_j, \tau)$, which is readily available in any common implementation of the ordinary QR, to use as $\hat A(\alpha_j)$ in $W_n(\alpha_j)$. Then $W_n(\alpha_j)$ becomes a Wald or F-statistic for testing $\gamma(\alpha_j, \tau) = 0$, depending on the naming convention.
3. Choose $\hat\alpha(\tau)$ as a value among $\{\alpha_j, j = 1, ..., J\}$ that minimizes $W_n(\alpha)$. The estimate of $\beta(\tau)$ is then given by $\hat\beta(\hat\alpha(\tau), \tau)$.
4. Direct inference on $\alpha(\tau)$ may be conducted using the variance formula $\Omega_\theta$ provided below. Dual confidence regions for $\alpha(\tau)$, $CR_p$, may be computed as $CR_p[\alpha(\tau)] = \{\alpha_j : W_n(\alpha_j) \le c_p\}$, and its upper and lower bounds may be used as end-points of a confidence interval for $\alpha(\tau)$.

3.2. Computational Complexity and Implementation
One of the most appealing features of the IVQR and associated dual inference confidence region is that both may be computed using the output from the conventional QR using any modern software. Portnoy and Koenker (1997) show that ordinary QR can be computed in polynomial stochastic time $O_p(\dim(\beta, \gamma)^{[\text{illegible}]} \times n^{1+\delta})$ for some $\delta > 0$ using interior point algorithms with preprocessing,† so the above IVQR procedure has computational complexity of $O_p((1/\epsilon)^{\dim(\alpha)} \times \dim(\beta, \gamma)^{[\text{illegible}]} \times n^{1+\delta})$ for a desired level of accuracy $\epsilon$. Since we need $\epsilon \propto 1/n^{a}$, $a > 1/2$, and it suffices to have $a = 1/2 + \delta'$ for some small $\delta' > 0$, the proposed algorithm has computational complexity
$$O_p\left(n^{(1/2+\delta')\dim(\alpha)} \times \dim(\beta, \gamma)^{[\text{illegible}]} \times n^{1+\delta}\right)$$
that is polynomial in the sample size $n$ and in the dimension of $(\beta, \gamma)$, but is not polynomial in the dimension of $\alpha$. Thus, the procedure will be computationally attractive and work well when the number of exogenous variables, $\dim(\beta)$, is possibly large, but the number of endogenous variables, $\dim(\alpha)$, is small. This situation is certainly the most common case prevalent in econometric specifications, where typically $\dim(\alpha) = 1$ or 2 and $\dim(\beta)$ varies from 1 to 50 or more. In fact, due to its practical properties, the estimator has already been applied in empirical analysis by Hausman and Sidak (2004), Januszewski (2004), and Chernozhukov and Hansen (2004a), and this paper also presents two additional empirical applications.

† In contrast, simplex procedures will have running time of $O_p(n^{[\text{illegible}]})$.
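The grid procedure of Section 3.1 can be sketched in a few lines for the simplest case of one endogenous variable, one instrument, and no other controls. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the brute-force `qr_line` routine stands in for any standard QR solver, the weight is set to $\hat A(\alpha) = 1$ (permitted by (3.2), though not the Wald choice), and the data are a hypothetical toy sample built so that the structural coefficient is 2 and the disturbance has the same distribution in both instrument groups.

```python
from itertools import combinations


def check_loss(u, tau):
    """rho_tau(u) = (tau - 1(u < 0)) * u."""
    return (tau - (1.0 if u < 0 else 0.0)) * u


def qr_line(y, z, tau):
    """Exact tau-quantile regression of y on (1, z) for a scalar z.

    Some minimizer interpolates two observations, so it suffices to
    search over lines through pairs of points. Brute force, O(n^3):
    suitable for small illustrative samples only.
    """
    best = None
    for i, j in combinations(range(len(y)), 2):
        if z[i] == z[j]:
            continue  # no unique line through these two points
        slope = (y[j] - y[i]) / (z[j] - z[i])
        intercept = y[i] - slope * z[i]
        loss = sum(check_loss(yk - intercept - slope * zk, tau)
                   for yk, zk in zip(y, z))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def ivqr_grid(y, d, z, tau, alpha_grid):
    """Inverse step: pick the alpha whose QR coefficient on z is closest to 0."""
    n = len(y)
    path = []
    for a in alpha_grid:
        resid = [yi - a * di for yi, di in zip(y, d)]
        _, gamma = qr_line(resid, z, tau)         # ordinary QR step, as in (3.1)
        path.append((a, gamma, n * gamma ** 2))   # W_n(alpha) with A(alpha) = 1
    alpha_hat = min(path, key=lambda row: row[2])[0]
    return alpha_hat, path


# Hypothetical toy data: Y = 2 * D + e, D = 3 * Z + e (endogenous), e independent of Z.
e = [-2, -1, 0, 1, 2] * 2
z = [0] * 5 + [1] * 5
d = [3 * zi + ei for zi, ei in zip(z, e)]
y = [2 * di + ei for di, ei in zip(d, e)]
grid = [0.5 * k for k in range(9)]  # candidate alpha values 0.0, 0.5, ..., 4.0

alpha_hat, path = ivqr_grid(y, d, z, 0.5, grid)
print(alpha_hat)  # prints: 2.0
```

In applications one would replace `qr_line` with a library QR routine and use the inverse of the estimated covariance matrix of $\hat\gamma(\alpha_j, \tau)$ as the weight, so that the third entry of each `path` row becomes the Wald statistic used for dual inference.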
There are other approaches that one could adopt for estimation of the model defined in Section 2.1. For example, an immediate approach is the method of moments approach (MM) that attempts to minimize $\left\| \frac{1}{n} \sum_{i=1}^{n} (1(Y_i < D_i'\alpha + X_i'\beta) - \tau)(X_i', Z_i')' V_i \right\|$ over $\alpha$ and $\beta$. Another example is the estimator of Sakata (2001) which is an elegant maximum likelihood type estimator based on the absolute deviation.‡ In contrast to the IVQR approach, these alternative approaches involve highly non-convex, multi-modal, and non-smooth objective functions over many parameters, which poses a serious computational challenge. Implementation of extremum estimators with non-smooth and, more importantly, non-convex objective functions generally requires non-convex searches over parameter sets of dimension $K = \dim(\beta) + \dim(\alpha)$, which will be quite large in many cases due to the high-dimension of $\beta$. Thus, the IVQR approach will have an advantage when $\dim(\alpha)$ is small, as is the case in many applications. When $\dim(\alpha)$ is high, both the IVQR approach and the MM approach become difficult to implement. In such settings one could use the quasi-Bayesian methods developed in Chernozhukov and Hong (2003). This approach computes estimates and confidence intervals using a quasi-posterior defined as the exponent of the MM function specified above.

‡ In our notation, the estimator solves the following program:
$$\max_{\alpha, \beta} \min_{\gamma, \delta} \frac{\sum_i |Y_i - D_i'\alpha - X_i'\beta - Z_i'\gamma - X_i'\delta|}{\sum_i |Y_i - D_i'\alpha - X_i'\beta|}.$$

4. Asymptotic Distribution Theory
4.1. Assumptions
To state the assumptions, define the population objective function as
$$Q(\tau, \alpha, \beta, \gamma) := E[\rho_\tau(Y_i - D_i'\alpha - X_i'\beta - Z_i'\gamma)V_i], \tag{4.1}$$
and let
$$(\beta(\alpha, \tau), \gamma(\alpha, \tau)) := \arg\min_{\beta, \gamma} Q(\tau, \alpha, \beta, \gamma). \tag{4.2}$$
Define the parameter space $\Theta = \mathcal{A} \times \mathcal{B}$ as a compact convex set such that $\mathcal{B}$ contains the population value $\beta(\alpha, \tau)$ for each $\alpha \in \mathcal{A}$ in its interior, so that the parameter-on-the-boundary problem does not arise. We assume the data were generated by the model defined in Section 2 and impose the following additional assumptions:
R1 $(Y_i, D_i, X_i, Z_i)$ are iid defined on the probability space $(\Omega, F, P)$ and have compact support.
R2 For the given $\tau$, $(\alpha(\tau), \beta(\tau))$ is in the interior of the specified set $\Theta$.
R3 Density $f_Y(Y \mid X, D, Z)$ is bounded by a constant $\bar f$ a.s.
R4 $\partial E[1(Y < D'\alpha + X'\beta + Z'\gamma)\Psi]/\partial(\beta', \gamma')$ has full rank at each $\alpha$ in $\mathcal{A}$, for $\Psi = V \cdot (Z', X')'$.
The compactness conditions in R1 and R2 simplify the analysis. The bounded density in R3 and compactness condition in R1 are sufficient for the Jacobian matrix in R4 to be well-defined. The full rank condition in R4 and iid sampling suffice for the estimates $(\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau))$ to be asymptotically normal and are sufficient for implementing the dual inference. It is important to note that R1-R4 do not impose any conditions on the relation between $D$ and $Z$; that is, unlike the direct inference procedure, the dual procedure will be valid when identification is weak or fails partially or completely.

Stronger additional conditions are imposed for implementing the direct inference.
R5 $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has full rank at $(\alpha(\tau)', \beta(\tau)')'$.
R6 The function $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is one-to-one over $\Theta$.

The imposition of R1-R6 is sufficient for identification and asymptotic normality of the IVQR estimator, both of which are necessary for the validity of the direct inference approach. These assumptions considerably strengthen the conditions R1-R4 by imposing restrictions on the relationship between $D$ and $Z$. The dual approach does not require these assumptions for its validity. Hence, the dual approach is robust to the violation of either R5 or R6.

To further comment on the nature of correlation between $Z$ and $D$ required by R5, note that by R1 and R3 we have that
$$\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta') = E[f_\epsilon(0 \mid X, D, Z)V_i(Z_i', X_i')'(D_i', X_i')].$$
Hence, if we set $V_i = 1$ for simplicity, we see that the Jacobian in R5 takes a form of density-weighted covariation matrix between $D$ and $Z$, and R5 requires that this matrix has full rank. R6 imposes that global identifiability must hold; hence, the impact of $Z$ should be rich enough to guarantee that the moment equations are solved uniquely. These assumptions are required to carry out direct inference but are not required in the dual approach. Thus, discrepancies between the dual approach and the direct approach should be indicative of situations where R5 and R6 do not hold.
Before proceeding to the asymptotic results, we provide some sufficient and more primitive conditions for the global identifiability condition R6. A set of conditions that suffices for both R5 and R6 is as follows:
R5' $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has full rank at each $(\alpha, \beta)$ in $\Theta$.
R6' The image of $\Theta$ under the mapping $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is simply-connected.

R5' ensures that the mapping $(\alpha, \beta) \mapsto E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi]$ is locally one-to-one everywhere. The simple-connectivity condition R6' curbs somewhat the non-linearity of the mapping and implies a global one-to-one relationship by a Plastock-Hadamard type result, cf. Ambrosetti and Prodi (1995). This fact and equations (2.4) and (2.5) imply that the solution of the equation $E[\{\tau - 1(Y < D'\alpha + X'\beta)\}\Psi] = 0$ is unique and is given by $(\alpha(\tau), \beta(\tau))$.

Other sufficient and more primitive conditions for R5 and R6 also result through an application of Theorem 2 in Mas-Colell (1979). Let $\Theta'$ be a convex compact set that contains $\Theta$ and that has a smooth boundary $\partial\Theta'$. Then the following conditions imply R5' and R6' and hence R5 and R6.
R5* $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ has a positive determinant at each $(\alpha, \beta)$ in $\Theta'$.
R6* $\partial E[1(Y < D'\alpha + X'\beta)\Psi]/\partial(\alpha', \beta')$ is positive quasi-definite along the boundary $\partial\Theta'$ in the sense defined by Mas-Colell (1979).

Note that in the exogenous model, we can set $Z_i = D_i$, and these conditions will be trivially satisfied.
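4.2. Asymptotic Properties of the Dual Inference
We first state the formal results for dual inference, because they are the simplest to explain. Write $\vartheta(\alpha, \tau) := (\beta(\alpha, \tau)', \gamma(\alpha, \tau)')'$ for the coefficients defined in (4.2) and $\hat\vartheta(\alpha, \tau)$ for their estimates from (3.1). Under the conditions R1-R4, as $n \to \infty$, uniformly in $\alpha \in \mathcal{A}$,
$$\sqrt{n}(\hat\vartheta(\alpha, \tau) - \vartheta(\alpha, \tau)) = -J_\vartheta^{-1}(\alpha)\, n^{-1/2} \sum_{i=1}^{n} s_i(\alpha) + o_p(1), \tag{4.3}$$
$$s_i(\alpha) = [\tau - 1(\epsilon_i(\alpha) < 0)]\Psi_i, \quad \Psi_i = V_i(X_i', Z_i')', \tag{4.4}$$
$$\epsilon_i(\alpha) = Y_i - D_i'\alpha - X_i'\beta(\alpha, \tau) - Z_i'\gamma(\alpha, \tau), \tag{4.5}$$
$$J_\vartheta(\alpha) = E[f_{\epsilon(\alpha)}(0 \mid Z, X)\Psi\Psi'/V]. \tag{4.6}$$
(4.3)-(4.6) follow by adopting standard arguments for the quantile regression process. The difference here is that we have a process in $\alpha$, whereas we usually have a process over $\tau$. Hence for each $\alpha$
$$\sqrt{n}(\hat\vartheta(\alpha, \tau) - \vartheta(\alpha, \tau)) \to_d N(0, \Omega_\vartheta[\alpha]), \tag{4.7}$$
$$\Omega_\vartheta[\alpha] = J_\vartheta^{-1}[\alpha]\, S[\alpha]\, J_\vartheta^{-1}[\alpha], \quad S[\alpha] = E[s_i(\alpha)s_i(\alpha)']. \tag{4.8}$$
The Wald statistic for testing $\gamma(\alpha, \tau) = 0$ is given by the statistic
$$W_n(\alpha) = n[\hat\gamma(\alpha, \tau)']\left(\hat\Omega_\vartheta[\alpha]_{22}\right)^{-1}[\hat\gamma(\alpha, \tau)],$$
where $\hat\Omega_\vartheta[\alpha] = \Omega_\vartheta[\alpha] + o_p(1)$ is any standard consistent estimate of the asymptotic variance (4.8) of the ordinary QR and the subscript 22 denotes the block corresponding to $\gamma$. Therefore, when $\alpha = \alpha(\tau)$
$$W_n[\alpha(\tau)] \to_d \chi^2(\dim(\gamma)), \tag{4.9}$$
and for the confidence region $CR_p[\alpha(\tau)] := \{\alpha \in \mathcal{A} : W_n(\alpha) \le c_p\}$, where $P\{\chi^2(\dim(\gamma)) < c_p\} = p$,
$$P\{\alpha(\tau) \in CR_p[\alpha(\tau)]\} = P\{W_n[\alpha(\tau)] \le c_p\} \to p. \tag{4.10}$$

Proposition 1. Under conditions R1-R4, the results (4.3)-(4.10) are true.

Comment 1. Unlike direct inference, the dual inference requires only assumptions R1-R4 to hold. The results for dual inference are also straightforward to extend in various directions. In particular, one can note that the preliminary estimation of weights $V_i$ and instruments $Z_i$ will not affect (4.9) or even (4.7) as long as $\alpha$ is in a root-$n$ neighborhood of $\alpha(\tau)$. Additional regularity conditions on the estimates of weights and instruments that must be imposed can be found in Andrews (1994).

4.3. Asymptotic Properties of the Direct Inference
The following proposition presents the asymptotic properties of the direct approach.

Proposition 2. In the specified model under conditions R1-R4 and conditions R5-R6, sufficient conditions for which are either R5'-R6' or R5*-R6*,
$$\sqrt{n}(\hat\theta(\tau) - \theta(\tau)) \to_d N(0, \Omega_\theta), \tag{4.11}$$
where, for $\Psi = V \cdot [X', Z']'$ and $\epsilon = Y - D'\alpha(\tau) - X'\beta(\tau)$, $\Omega_\theta = (K', L')'S(K', L')$, $S = \tau(1 - \tau)E[\Psi\Psi']$, $K = (J_\alpha' H J_\alpha)^{-1} J_\alpha' H$, $H = \bar J_\gamma' A[\alpha(\tau)] \bar J_\gamma$, $L = \bar J_\beta M$, $M = I_{k+r} - J_\alpha K$, $J_\alpha = E[f_\epsilon(0 \mid X, Z, D)\Psi D']$, and $[\bar J_\beta', \bar J_\gamma']'$ is a partition of $J_\vartheta^{-1} := (E[f_\epsilon(0 \mid X, Z)\Psi\Psi'/V])^{-1}$ such that $\bar J_\beta$ is a $\dim(\beta) \times \dim(\beta, \gamma)$ matrix and $\bar J_\gamma$ is a $\dim(\gamma) \times \dim(\beta, \gamma)$ matrix.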
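The dual confidence region in (4.10) is obtained by comparing the Wald statistic on a grid of candidate values with a chi-squared critical value. The sketch below is illustrative: the function names and the grid of Wald values are hypothetical, and the critical value is computed in closed form only for one or two instruments, using $\chi^2(1) = N(0,1)^2$ and the fact that $\chi^2(2)$ is exponential with mean 2.

```python
import math
from statistics import NormalDist


def chi2_quantile(p, dof):
    """p-quantile of a chi-squared distribution for dof in {1, 2}."""
    if dof == 1:
        return NormalDist().inv_cdf((1.0 + p) / 2.0) ** 2
    if dof == 2:
        return -2.0 * math.log(1.0 - p)
    raise NotImplementedError("closed form used here only for dof 1 or 2")


def dual_confidence_region(alpha_grid, wald_values, p, dof):
    """Grid points alpha_j with W_n(alpha_j) <= c_p, as in (4.10)."""
    c_p = chi2_quantile(p, dof)
    return [a for a, w in zip(alpha_grid, wald_values) if w <= c_p]


# Hypothetical Wald statistics evaluated on a coarse grid of alpha values.
alpha_grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
wald_values = [9.0, 3.0, 0.2, 4.1, 12.0]

region = dual_confidence_region(alpha_grid, wald_values, p=0.95, dof=1)
print(region)  # prints: [-0.5, 0.0]
```

The smallest and largest retained grid points can then serve as end-points of a confidence interval for $\alpha(\tau)$. With more than two instruments a statistical library would be needed for the critical value.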
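Corollary 1. When $\dim(\gamma) = \dim(\alpha)$, the choice of $A(\alpha)$ does not affect asymptotic variance, and the joint asymptotic variance of $\hat\alpha(\tau)$ and $\hat\beta(\tau)$ will generally have the simple form
$$\Omega_\theta = J_\theta^{-1} S (J_\theta')^{-1}$$
for $S$ defined above and $J_\theta = E[f_\epsilon(0 \mid X, Z, D)\Psi[D', X']]$.

Corollary 2. When $\dim(\gamma) > \dim(\alpha)$, the choice of the weighting matrix $A(\alpha)$ in the objective function $W_n(\alpha)$ generally matters. A natural choice for $A(\alpha)$ is given by $A(\alpha) = ([\Omega_\vartheta[\alpha]]_{22})^{-1}$, which corresponds to the inverse of the covariance matrix of $\sqrt{n}(\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau))$. Noting that $A(\alpha)$ is equal to $(\bar J_\gamma S \bar J_\gamma')^{-1}$ at $\alpha(\tau)$, it follows that the asymptotic variance of $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau))$ equals
$$\left(J_\alpha' \bar J_\gamma' (\bar J_\gamma S \bar J_\gamma')^{-1} \bar J_\gamma J_\alpha\right)^{-1}.$$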
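The asymptotic variance of $\hat\alpha(\tau)$ in Corollary 2 can be verified from the definitions in Proposition 2. The following short derivation is a verification sketch: it uses only $K = (J_\alpha' H J_\alpha)^{-1} J_\alpha' H$, $H = \bar J_\gamma' A \bar J_\gamma$, the symmetry of $A$ and $H$, and the fact that the upper-left block of $\Omega_\theta = (K', L')'S(K', L')$ is $K S K'$.

```latex
\begin{align*}
A &= (\bar J_\gamma S \bar J_\gamma')^{-1}, \qquad H = \bar J_\gamma' A \bar J_\gamma,\\
H S H &= \bar J_\gamma' A \,(\bar J_\gamma S \bar J_\gamma')\, A \bar J_\gamma
       = \bar J_\gamma' A \bar J_\gamma = H,\\
K S K' &= (J_\alpha' H J_\alpha)^{-1} J_\alpha' \,(H S H)\, J_\alpha (J_\alpha' H J_\alpha)^{-1}
        = (J_\alpha' H J_\alpha)^{-1}\\
       &= \left(J_\alpha' \bar J_\gamma' (\bar J_\gamma S \bar J_\gamma')^{-1} \bar J_\gamma J_\alpha\right)^{-1}.
\end{align*}
```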
Corollary 3. The efficient score for $(\alpha(\tau), \beta(\tau))$ is given by $\frac{1}{\tau(1-\tau)}[\tau - 1(\epsilon < 0)]\Psi^*$, where $\Psi^* = V^* \cdot [X', Z^{*\prime}]'$, $Z^* = E[D \cdot v^* \mid Z, X]/V^*$, $v^* = f_\epsilon(0 \mid D, Z, X)$, and $V^* = f_\epsilon(0 \mid Z, X)$. Thus, if $\Psi = \Psi^*$ and $V = V^*$, the asymptotic variance of $(\hat\alpha(\tau), \hat\beta(\tau))$ attains the efficiency bound $\tau(1 - \tau)E[\Psi^*\Psi^{*\prime}]^{-1}$.

Comment 2. Corollary 1 is especially convenient since the variance formula becomes simple once the instrument $Z$ is collapsed to the same dimension as $D$. Corollary 3 shows how to construct the instrument $Z$ and weight $V$ such that the IVQR estimator achieves the efficiency bound in the sense of Amemiya (1977) as well as the semi-parametric efficiency bound in the sense defined by Bickel, Klaassen, Ritov, and Wellner (1993). Efficient estimation can be implemented in two steps. In the first step, IVQR is used to obtain residuals $\hat\epsilon_i$. In the second step, the required weights $V^*$ and instruments $Z^*$ are estimated using nonparametric or parametric methods and are used in the IVQR again. It can be shown that estimation of weights and instruments has no effect on the limit distribution of the estimators, provided additional regularity conditions, found e.g. in Andrews (1994), on the estimates of weights and instruments hold.

4.4. Estimating Variance and Jacobian Matrices
The components of the variance matrices that we need to estimate include $J_\vartheta$, $J_\alpha$, and $S$ for direct inference and $J_\vartheta[\alpha]$ and $S[\alpha]$ for dual inference. Following Koenker's (1994) analysis for ordinary QR, the first set of components can be estimated as follows:
$$\hat S = \frac{1}{n}\sum_{i=1}^{n} \hat s_i \hat s_i', \quad \hat J_\alpha = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i/h)/h]\hat\Psi_i D_i', \quad \hat J_\vartheta = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i/h)/h]\hat\Psi_i\hat\Psi_i'/V_i,$$
where $\hat\epsilon_i := Y_i - D_i'\hat\alpha(\tau) - X_i'\hat\beta(\tau)$, $\hat s_i = [\tau - 1(\hat\epsilon_i < 0)]\hat\Psi_i$, $\hat\Psi_i = V_i[Z_i', X_i']'$, $h$ is a bandwidth chosen such that $h \to 0$ and $nh^2 \to \infty$, and $K(\cdot)$ is a kernel function. Specific choices of $h$ are discussed in Koenker (1994). Similarly, the second set of estimates is given by
$$\hat S[\alpha] = \frac{1}{n}\sum_{i=1}^{n} \hat s_i[\alpha]\hat s_i[\alpha]', \quad \hat J_\vartheta[\alpha] = \frac{1}{n}\sum_{i=1}^{n} [K(\hat\epsilon_i[\alpha]/h)/h]\hat\Psi_i\hat\Psi_i'/V_i,$$
where $\hat\epsilon_i[\alpha] := Y_i - D_i'\alpha - X_i'\hat\beta(\tau, \alpha) - Z_i'\hat\gamma(\alpha, \tau)$ and $\hat s_i[\alpha] = [\tau - 1(\hat\epsilon_i[\alpha] < 0)]\hat\Psi_i$, $\hat\Psi_i = V_i[Z_i', X_i']'$, $h$ is a bandwidth chosen such that $h \to 0$ and $nh^2 \to \infty$, and $K(\cdot)$ is a kernel function. The consistency properties of these estimators are standard and will not be discussed here.
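To make the kernel-based estimators concrete, the following sketch evaluates them in the simplest scalar case $\Psi_i = V_i = 1$ (an intercept-only quantile regression), where $\hat J_\vartheta$ reduces to a kernel estimate of the residual density at zero and the sandwich formula (4.8) reduces to $\hat S/\hat J_\vartheta^2$. The uniform kernel, the bandwidth, and the residuals are hypothetical choices made for illustration only.

```python
def uniform_kernel(u):
    """Uniform kernel on [-1, 1]; one admissible choice of K."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def sandwich_scalar(residuals, tau, h):
    """Kernel sandwich estimate for the scalar case Psi_i = V_i = 1.

    Returns (S_hat, J_hat, Omega_hat) with
      S_hat     = mean of (tau - 1(e_i < 0))^2,
      J_hat     = mean of K(e_i / h) / h,
      Omega_hat = S_hat / J_hat^2.
    """
    n = len(residuals)
    s_hat = sum((tau - (1.0 if e < 0 else 0.0)) ** 2 for e in residuals) / n
    j_hat = sum(uniform_kernel(e / h) / h for e in residuals) / n
    return s_hat, j_hat, s_hat / (j_hat * j_hat)


resid = [-1.2, -0.3, -0.1, 0.05, 0.2, 0.9]  # hypothetical QR residuals
s_hat, j_hat, omega_hat = sandwich_scalar(resid, tau=0.5, h=0.25)
print(s_hat, j_hat, omega_hat)  # prints: 0.25 1.0 0.25
```

In the vector case the same pattern applies with $\hat\Psi_i\hat\Psi_i'$ replacing the scalar products and matrix inversion replacing the division.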
5.
Empirical Examples
In this section,
3.
The
first
we
present two applications of the estimation and inference results derived in Section
application reports the results of a simple analysis of the
demand
for fish.
This
application makes use of a small sample and illustrates the potential differences between the direct
13
and dual inference procedures.
program on earnings. In
we
In the second example,
this case, the identification
consider the effects of a job training
quite strong, and
is
we
see small differences
between the direct and dual inference procedures. The results here also demonstrate the bias
in
the conventional quantile regression under endogeneity.
In particular, the conventional quantile
regression estimates indicate that the effect of training
positive
outcome
distribution, while the
the lower
tail
Denaand
5.1.
Fulton
market
The data contain
in
New York
price
price
significant across the entire
is
close to zero in
of
demand
in
elasticities
may
which
potentially vary with the
observations on price and quantity of fresh whiting sold in the
over the five
These data were used previously
The
and
for Fish
demand.
fish
is
estimates indicate that the training impact
distribution.
we present estimates
In this section,
level of
outcome
of the
IVQR
Graddy
month period from December
2,
1991 to
May
1992.
8,
(1995) to test for imperfect competition in the market.
and quantity data are aggregated by day, with the price measured as the average daily
and the quantity as the
observations for the days
in
total
amount
of fish sold that day.
The
total
sample consists of 111
which the market was open over the sample period.
For the purposes of this illustration, we focus on a simple Cobb-Douglas random demand model with non-additive disturbance:

$$\ln(Q_p) = \alpha_0(U) + \alpha_1(U)\ln(p),$$

where $Q_p$ is the quantity that would be demanded if the price were $p$, $U$ is an unobservable affecting the level of demand normalized to follow $U(0,1)$, and $\alpha_1(U)$ is the random demand elasticity when the level of demand is $U$. A supply function $S_p = f(p, Z, \mathcal{U})$ describes how much producers would supply if the price were $p$, subject to other factors $Z$ and unobserved disturbance $\mathcal{U}$. The factors $Z$ affecting supply are assumed to be independent of the demand disturbance $U$. The observed quantity $Y = \ln Q_P$ sold in the market is given in logs by

$$Y = \alpha_0(U) + \alpha_1(U)\ln P, \quad \text{where } U \text{ is independent of } Z, \tag{5.1}$$

where $P$ is the price picked by the market to equate supply and demand. That is, $P$ satisfies $\alpha_0(U) + \alpha_1(U)\ln(P) = \ln f(P, Z, \mathcal{U})$, which implies the observed price depends on the demand disturbance $U$, i.e. $P = \delta(Z, U, \mathcal{U})$ for some function $\delta$.

As instruments $Z$, we consider two different variables capturing weather conditions at sea: Stormy is a dummy variable which indicates wave height greater than 4.5 feet and wind speed greater than 18 knots, and Mixed is a dummy variable indicating wave height greater than 3.8 feet and wind speed greater than 13 knots. These variables are plausible instruments since weather conditions at sea should influence the amount of fish that reaches the market but should not influence demand for the product.^ Simple OLS regressions of the log of price on these instruments suggest they are correlated to price, yielding $R^2$ and F-statistics of 0.227 and 15.83 when both Stormy and Mixed are used as instruments and 0.160 and 20.69 when just Stormy is used. However, given the small sample, we may still expect identification to be weak, and weak identification is suggested by the results reported below.
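To make the endogeneity of price and the first-stage diagnostics concrete, the following is a minimal simulation sketch. It is not the authors' data or code: the log-linear supply function, every coefficient, the function names, and the sample size are hypothetical choices made only for illustration. It draws a demand level $U$, an independent weather dummy $Z$, and a supply shock, solves for the market-clearing log price, and then reports the correlation of log price with $U$ together with the $R^2$ and F-statistic from regressing log price on $Z$.

```python
import math
import random


def simulate_market(n, seed=0):
    """Draw n market days from a toy supply-and-demand system (all numbers hypothetical)."""
    rng = random.Random(seed)
    ln_p, y, z, u = [], [], [], []
    for _ in range(n):
        ui = rng.random()                        # demand level U ~ U(0, 1)
        zi = 1.0 if rng.random() < 0.3 else 0.0  # weather dummy, drawn independently of U
        vi = rng.gauss(0.0, 0.3)                 # unobserved supply disturbance
        a0 = 8.0 + ui                            # alpha_0(U)
        a1 = -1.5 + 0.8 * ui                     # alpha_1(U): random demand elasticity
        # Assumed log supply: ln S = 7 + ln p - 0.5 * Z + V. Equating it with
        # log demand a0 + a1 * ln p gives the market-clearing log price.
        lp = (7.0 - 0.5 * zi + vi - a0) / (a1 - 1.0)
        ln_p.append(lp)
        y.append(a0 + a1 * lp)                   # observed log quantity, as in (5.1)
        z.append(zi)
        u.append(ui)
    return ln_p, y, z, u


def correlation(x, y):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_stage(ln_p, z):
    """R^2 and F-statistic from OLS of ln P on a constant and a single instrument."""
    r2 = correlation(ln_p, z) ** 2
    f_stat = r2 / (1.0 - r2) * (len(z) - 2)
    return r2, f_stat


if __name__ == "__main__":
    ln_p, y, z, u = simulate_market(2000, seed=1)
    print("corr(ln P, U):", round(correlation(ln_p, u), 3))
    print("corr(Z, U):   ", round(correlation(z, u), 3))
    print("first stage (R^2, F):", first_stage(ln_p, z))
```

In this toy system the log price is correlated with $U$ by construction, which is the reason a conventional quantile regression of quantity on price does not recover $\alpha_1(\cdot)$, while $Z$ moves the price without being related to $U$.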
Quantile regression (QR) and inverse quantile regression (IVQR) results for the .15, .25, .50, .75, and .85 quantiles are reported in columns (1)-(3) of Table 1. Column (1) reports QR results, while columns (2) and (3) report IVQR results. Columns (2) and (3) differ in that only Stormy is used as an instrument in Column (2), while Stormy and Mixed are used as instruments in (3). For the $\tau$-th quantile, the row labeled $\hat\alpha(\tau)$ gives the QR or IVQR estimate of $\alpha$. The row labeled "Wald Interval" contains the 95% confidence interval for $\alpha(\tau)$ constructed based on the asymptotic approximation, and for the IVQR estimates, the row labeled "Dual Interval" contains the 95% confidence bound constructed using the dual inference procedure outlined in Section 3.1. The computation of the IVQR estimator was conducted over the parameter space $\mathcal{A} = [-5, 5]$ using $\alpha_j$ equally spaced with a step size of 0.1.

The IVQR estimates exhibit considerable heterogeneity, ranging from -1.5 to -0.7 in column (2) and from -1.5 to -0.9 in column (3). The IVQR elasticities are also uniformly greater in magnitude than the "price effects" estimated by the ordinary QR, which we might anticipate given endogenous sampling resulting from the joint determination of price and quantity in the market. These differences are illustrated graphically in Figure 1. The left panel reports the QR results, and the right panel reports the IVQR results when both Stormy and Mixed are used as instruments. In the figure, we clearly see that the demand functions estimated by IVQR are steeper than those estimated by QR when plotted in log-price-log-quantity space, and that this translates directly into more curvature in the demand curve when plotted in the original price-quantity space. It is also important to note that the interpretation of IVQR and QR estimates is very different. The IVQR estimates a structural demand model, while QR estimates the conditional quantiles of the equilibrium quantity variable as a function of the equilibrium price. It is no surprise that these estimates are different.

Interestingly, there are clear and large differences between the confidence intervals given by the two different inference procedures for IVQR. In particular, the confidence intervals based on the dual procedure which is robust to weak and partial identification are uniformly much wider than the intervals based on the direct inference procedure. For instance, the dual confidence region for the .85 quantile case contains the upper endpoint of the parameter space $\mathcal{A}$. In addition, it is worth noting that the confidence intervals obtained using two instrumental variables are generally shorter than the confidence intervals obtained using just one instrumental variable, suggesting an efficiency gain to using more instruments.

^More detailed arguments may be found in Graddy (1995).

Table 1. Results from Empirical Examples

                     Example 1: Demand for Fish                      Example 2: Returns to Training
                     QR (1)          IVQR (2)        IVQR (3)        QR (4)         IVQR (5)
$\hat\alpha(.15)$    -0.53           -1.5            -1.5            1188           -200
  Wald CI            (-1.24,0.16)    (-3.69,-0.69)   (-2.51,-0.49)   (553,1822)     (-1435,1035)
  Dual CI                            [-5.0,0.5)      (-3.2,0.1)                     (-1300,1500)
$\hat\alpha(.25)$    -0.40           -1.0            -1.4            2510           500
  Wald CI            (-0.87,0.07)    (-2.51,0.51)    (-2.52,-0.28)   (1742,3278)    (-887,1887)
  Dual CI                            (-4.4,0.0)      (-3.1,0.1)                     (-1000,2000)
$\hat\alpha(.50)$    -0.41           -0.7            -0.9            4420           300
  Wald CI            (-0.81,-0.01)   (-1.67,0.27)    (-1.82,0.02)    (3220,5621)    (-1589,2189)
  Dual CI                            (-3.0,0.6)      (-3.0,0.6)                     (-1400,2700)
$\hat\alpha(.75)$    -0.70           -1.2            -1.3            4678           2700
  Wald CI            (-1.18,-0.22)   (-2.02,-0.38)   (-2.07,-0.53)   (2901,6455)    (-260,5660)
  Dual CI                            (-2.0,-0.1)     (-2.1,0.1)                     (-400,5600)
$\hat\alpha(.85)$    -0.81           -1.3            -1.1            4806           3200
  Wald CI            (-1.24,-0.38)   (-2.10,-0.50)   (-1.82,-0.38)   (2751,6861)    (32,6368)
  Dual CI                            (-2.0,5.0]      (-2.6,5.0]                     (500,5800)

Notes: Columns (1)-(3) report results from estimation of the demand for fish, and columns (4) and (5) report results from estimation of the returns to training from the JTPA experiment. Columns (1) and (4) report conventional quantile regression results, and columns (2), (3), and (5) report instrumental quantile regression results. In column (2), one instrument, Stormy, is used, and in column (3), two instruments, Stormy and Mixed, are used. Rows labeled $\hat\alpha(\tau)$ for $\tau \in \{.15, .25, .50, .75, .85\}$ report point estimates, and the numbers in parentheses are confidence intervals.
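As a small arithmetic check on the comparison of interval lengths, the sketch below takes the column (3) entries of Table 1 as read from the text (estimate, Wald interval, dual interval), backs out the standard error implied by each symmetric 95% Wald interval, and compares the lengths of the two kinds of interval. The helper names are hypothetical, the bracket types of the dual intervals are ignored, and the normal quantile 1.96 is the usual large-sample choice rather than something stated in the text.

```python
# (estimate, Wald interval, dual interval) for column (3) of Table 1,
# keyed by quantile index tau; values as read from the table.
COLUMN_3 = {
    0.15: (-1.5, (-2.51, -0.49), (-3.2, 0.1)),
    0.25: (-1.4, (-2.52, -0.28), (-3.1, 0.1)),
    0.50: (-0.9, (-1.82, 0.02), (-3.0, 0.6)),
    0.75: (-1.3, (-2.07, -0.53), (-2.1, 0.1)),
    0.85: (-1.1, (-1.82, -0.38), (-2.6, 5.0)),
}

Z_975 = 1.959964  # 97.5% quantile of the standard normal distribution


def width(interval):
    """Length of an interval given as (lower, upper)."""
    lower, upper = interval
    return upper - lower


def summarize(table):
    """For each tau: Wald midpoint, implied standard error, and dual/Wald length ratio."""
    rows = []
    for tau in sorted(table):
        estimate, wald, dual = table[tau]
        rows.append({
            "tau": tau,
            "estimate": estimate,
            "wald_midpoint": (wald[0] + wald[1]) / 2.0,
            "implied_se": width(wald) / (2.0 * Z_975),
            "dual_to_wald_length": width(dual) / width(wald),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(COLUMN_3):
        print(row)
```

Each Wald interval is centered on its point estimate, as a symmetric asymptotic interval should be, and every dual interval in this column is longer than the corresponding Wald interval.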
The construction and nature of the dual confidence bounds are further illustrated in Figures 2 and 3, which respectively plot the IVQR objective function $W_n(\alpha)$ over the parameter space in the two cases. $\alpha$ is plotted on the horizontal axis, and the vertical axis shows $W_n(\alpha)$. The horizontal line in each graph is the 95% critical value for the dual testing procedure, so all points lying below the horizontal line belong to the confidence region for $\alpha(\tau)$. These graphs display a number of interesting features. It is apparent that the objective function, while having numerous local minima, has a distinct minimum over $\mathcal{A}$ in all cases. The objective functions, and hence dual confidence regions, are generally well-behaved in the middle of the distribution and become more erratic as one moves toward the tails of the distribution. It is also clear that the dual confidence regions are not connected in many cases.
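The way a dual confidence region is read off such a graph can be sketched in a few lines. The code below is illustrative only: the statistic `toy_statistic` is a made-up function with two dips, not an objective function from the paper, and the critical value 3.841 is the 95% quantile of a chi-squared distribution with one degree of freedom, which is an assumption appropriate to a single instrument. The grid mimics the one used for the fish example, $[-5, 5]$ with step 0.1, and the region is every grid point where the statistic lies below the critical value, grouped into runs of adjacent grid points.

```python
def make_grid(lower, upper, step):
    """Equally spaced grid from lower to upper (inclusive), rounded to avoid float drift."""
    count = int(round((upper - lower) / step))
    return [round(lower + i * step, 10) for i in range(count + 1)]


def dual_confidence_region(alphas, w_values, critical_value):
    """Return (first, last) grid points of each run where the statistic is below the critical value."""
    segments, current = [], []
    for alpha, w in zip(alphas, w_values):
        if w < critical_value:
            current.append(alpha)
        elif current:
            segments.append((current[0], current[-1]))
            current = []
    if current:
        segments.append((current[0], current[-1]))
    return segments


def toy_statistic(alpha):
    """Hypothetical stand-in for W_n(alpha): a global dip at -1 and a shallow local dip at 3."""
    return min(4.0 * (alpha + 1.0) ** 2, 3.0 + 2.0 * (alpha - 3.0) ** 2)


if __name__ == "__main__":
    grid = make_grid(-5.0, 5.0, 0.1)
    values = [toy_statistic(a) for a in grid]
    print(dual_confidence_region(grid, values, critical_value=3.841))
```

Returning a list of segments rather than a single pair of endpoints keeps a disconnected region visible, which matters here because a shallow local minimum can dip below the critical value away from the global minimum.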
This simple example clearly illustrates the potential differences between the direct and dual inference procedures. It also provides an example of an application of the methods of this paper to demand analysis where the elasticity of demand is potentially heterogeneous. The results also demonstrate the interesting insights that may be gained through quantile analysis and the importance of accounting for endogeneity in such studies. The next example illustrates the use of the estimator in a setting with a considerably larger sample and where identification is much stronger, showing that in this setting the two inference procedures produce similar results.

5.2. The Returns to Training

The impact of job training programs on the earnings of trainees, especially those with low income, is of great interest to both policy makers and academic economists, but evaluating the causal effect of training programs on earnings is difficult due to the self-selection of treatment status. However, data available from a randomized training experiment conducted under the Job Training Partnership Act (JTPA) provides a mechanism for addressing this issue. In the experiment, people were randomly assigned the offer of JTPA training services, but because people were able to refuse to participate, the actual treatment receipt was self-selected. Of those offered treatment, only 60 percent participated in the training. There was also a small number of individuals from the control group who received JTPA training. The random assignment of the training offer provides a plausible instrument for a person's actual training status.

Abadie, Angrist, and Imbens (2002), who previously used this data to examine the impact of job training on earnings,^ provide more detailed information regarding data collection procedures, sample selection criteria, and institutional details of the JTPA along with additional facts and discussion about the JTPA training experiment. In this example, we limit the analysis to the sample of adult males.
To capture the effects of training on earnings, we estimate a structural quantile model of the form

$$Y = D\alpha(U) + X'\beta(U), \quad U \sim U(0,1) \text{ given } Z \text{ and } X,$$

where $D$ indicates training status and is instrumented by assignment to the treatment group, the outcomes $Y$ are earnings, $X$ is a vector of covariates, $Z$ is a dummy variable indicating assignment to the treatment group, and $U$ is an unobservable affecting earnings.

^They use a different modeling framework and estimator that estimates the treatment effect for the sub-population of "compliers".
The data consist of 5,102 observations with data on earnings, training and assignment status, and other individual characteristics. Earnings are measured as total earnings over the 30 month period following the assignment into the treatment or control group, and average earnings in the sample are $19,147. The vector of controls, $X$, includes dummies for black and Hispanic persons, a dummy indicating high-school graduates and GED holders, five age-group dummies, a marital status dummy, a dummy indicating whether the applicant worked 12 or more weeks in the 12 months prior to the assignment, a dummy signifying that earnings data are from a second follow-up survey, and dummies for the recommended service strategy.^ For brevity, we only report results for estimates of the key parameter, $\alpha(\tau)$, which represents the impact of the training program on earnings.
Since assignment to the treatment or control group was random, it provides a natural instrument $Z$. The instrument is useful for identification since it is highly correlated to the actual training state $D$. The partial $R^2$ of a regression of training status on assignment to the treatment group, controlling for the other covariates, is .609, and the first-stage F-statistic is 2,673. This strong correlation suggests that weak identification should not be a problem in this case, so the direct and dual inference procedures should yield similar results.
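To illustrate how a randomized binary offer can serve as an instrument in a quantile setting, here is a minimal sketch of the grid-search idea behind IVQR in the simplest possible case: one binary treatment, one binary instrument, and no covariates. It is a simplification and not the authors' implementation. With a binary $Z$, the quantile regression of $Y - D\alpha$ on a constant and $Z$ reduces to comparing the $\tau$-quantiles of $Y - D\alpha$ in the two instrument groups, and the sketch picks the grid value of $\alpha$ that makes this difference closest to zero (an unweighted stand-in for minimizing $W_n(\alpha)$). The data-generating process, all coefficients, and the function names are hypothetical.

```python
import math
import random


def sample_quantile(values, tau):
    """tau-th sample quantile (an order statistic that minimizes the check loss)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, int(math.ceil(tau * len(ordered))) - 1))
    return ordered[index]


def simulate_training(n, seed=0):
    """Toy data: Z is a random offer, take-up D is more likely for high-U individuals."""
    rng = random.Random(seed)
    y, d, z = [], [], []
    for _ in range(n):
        u = rng.random()                          # rank variable U ~ U(0, 1)
        zi = 1 if rng.random() < 0.5 else 0       # randomized offer, independent of U
        di = 1 if (zi == 1 and u + rng.gauss(0.0, 0.3) > 0.3) else 0
        y.append(10.0 + 5.0 * u + di * (2.0 + 2.0 * u))   # so alpha(tau) = 2 + 2 * tau
        d.append(di)
        z.append(zi)
    return y, d, z


def ivqr_binary_instrument(y, d, z, tau, alphas):
    """Grid search: choose alpha so the tau-quantile of Y - D*alpha does not depend on Z."""
    group0 = [(yi, di) for yi, di, zi in zip(y, d, z) if zi == 0]
    group1 = [(yi, di) for yi, di, zi in zip(y, d, z) if zi == 1]
    best_alpha, best_gap = None, float("inf")
    for alpha in alphas:
        q0 = sample_quantile([yi - alpha * di for yi, di in group0], tau)
        q1 = sample_quantile([yi - alpha * di for yi, di in group1], tau)
        gap = abs(q1 - q0)                        # |coefficient on Z| in the QR of Y - D*alpha
        if gap < best_gap:
            best_alpha, best_gap = alpha, gap
    return best_alpha, best_gap


def naive_qr_effect(y, d, tau):
    """Conventional QR of Y on a constant and binary D: difference of group quantiles."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    untreated = [yi for yi, di in zip(y, d) if di == 0]
    return sample_quantile(treated, tau) - sample_quantile(untreated, tau)


if __name__ == "__main__":
    y, d, z = simulate_training(10000, seed=2)
    grid = [round(0.1 * k, 10) for k in range(61)]        # alpha_j in [0, 6], step 0.1
    print("IVQR grid estimate at tau = .5:", ivqr_binary_instrument(y, d, z, 0.5, grid))
    print("naive QR estimate at tau = .5: ", naive_qr_effect(y, d, 0.5))
```

In this toy design the true effect at the median is 3, and because take-up is more likely for individuals with high $U$, the naive comparison of treated and untreated quantiles overstates it, which mirrors the direction of bias discussed for the conventional QR estimates.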
As in the previous example, estimation results for the .15, .25, .50, .75, and .85 quantiles are reported in columns (4)-(5) of Table 1. Column (4) reports QR results, while column (5) reports the IVQR results. For the $\tau$-th quantile, the row labeled $\hat\alpha(\tau)$ gives the QR or IVQR estimate of $\alpha$. The row labeled "Wald Interval" contains the 95% confidence interval for $\alpha(\tau)$ constructed based on the asymptotic approximation, and for the IVQR estimates, the row labeled "Dual Interval" contains the 95% confidence bound constructed using the dual inference procedure outlined in Section 3.1. The computation of IVQR was conducted over the parameter space $\mathcal{A} = [-2500, 7500]$ using $\alpha_j$ equally spaced with a step size of 100. In addition, the results are presented graphically in Figure 4.

The conventional QR results, which fail to account for the selection into the treatment state, are uniformly positive and significantly different from 0. They indicate that the training program had a relatively large impact on the earnings of participants, and that this impact is increasing in the quantile index. However, given that people were able to decide whether or not to participate in training following the initial random assignment, it seems likely that these estimates would be upward biased for the actual effect of training on earnings. This suspicion is confirmed by the IVQR estimates, which account for the endogeneity of training status and are uniformly smaller than the corresponding QR estimates. This difference is most apparent in the low and middle earning quantiles. In the low quantiles, QR suggests a moderate positive and significant effect of training on earning quantiles; however, the IVQR estimates are quite low and, while imprecise, not significantly different from 0.

The difference in the estimates becomes more apparent when we consider the percentage impact of the training program, which is presented in the right hand column of Figure 4.^ Here, the QR estimates imply a large percentage increase in earnings in the low earning quantiles, starting at 139% for $\tau = .15$, which declines as one moves to the upper quantiles of the conditional earnings distribution, though the impact remains large even in the center of the distribution at $\tau = .50$, where the implied effect is a 35% increase in earnings due to training. The IVQR estimates on the other hand are quite stable, varying between -13% and 14%, and with the exception of $\tau = .25$ are all below 10%.

^The recommended service strategy was broken into three categories: classroom training, on-the-job training and/or job search assistance, and other forms of training.
Unlike the case considered above, we do not find large differences between the direct and dual inference procedures for IVQR in this case. The similarity between the two approaches is not unexpected due to the strong correlation between the instrument and endogenous regressor. The close agreement here further suggests that not much is lost by considering the dual procedure in cases where identification is strong. It also provides further support for the argument that the differences detected in the previous section are due to weak identification. Given the robustness of the dual procedure to the presence of weak instruments and its simple computation, it seems that this inference procedure will be preferable to the standard procedure in many cases.

The dual confidence bounds are further illustrated in Figure 5, which plots the IVQR objective function $W_n(\alpha)$ over the parameter space $\mathcal{A}$. $\alpha$ is plotted on the horizontal axis, and the vertical axis shows $W_n(\alpha)$. The horizontal line in each graph is the 95% critical value for the dual testing procedure, so all points lying below the horizontal line belong to the confidence region for $\alpha(\tau)$. The graphs in Figure 5 differ markedly from those in Figures 2 and 3. In particular, all of the objective functions, and hence confidence regions, in Figure 5 look remarkably well-behaved. The objective functions appear to be reasonably smooth, and the confidence intervals are all connected and clearly bounded within the parameter space considered.
^The percentage impact is for changing from $D = 0$ to $D = 1$, i.e. from the non-training to the training state. All other covariates were evaluated at their sample means.

6. Conclusion

In this paper, we propose an estimation approach, the inverse quantile regression (IVQR), that appropriately modifies the conventional quantile regression (QR) and recovers quantile-specific covariate effects in an instrumental variables model defined by $Y = D'\alpha(U)$ where $U$ is independent of a set of instrumental variables $Z$. The IVQR estimator is appealing for estimation in this model since it can be computed through a series of conventional quantile regression steps and so will be computationally convenient in many cases encountered in practice. We derive the asymptotic properties of the estimator under suitable conditions. In addition, we demonstrate that the estimation procedure leads to a testing procedure which will be robust to the presence of weak instruments and that this inference procedure results naturally from the IVQR algorithm and so is relatively simple to implement in practice.

We then illustrate the use of the proposed estimator and testing procedure through two brief empirical examples. In the first example, we examine a simple demand model in a small sample with weak instruments. In this case, we find that the conventional QR estimate of the elasticity of demand appears to be upward biased as would be expected due to the joint determination of price and quantity by supply and demand. In addition, we find that there are large differences between the direct inference procedure and the dual inference procedure which is robust to weak instruments. In the second example, we look at the impact of a job training program on earnings. In this case, we instrument for training status with random assignment to the training program which is very highly correlated to actual receipt of training. In this case, there is essentially no difference between the direct inference procedure and the dual procedure which is robust to weak instruments. In addition, there is strong evidence of endogeneity of training status resulting in substantial bias to the conventional QR estimator. This bias is especially pronounced in the lower tail of the earnings distribution where QR suggests a significant and positive effect of training on earnings, while the IVQR estimates are insignificant and small in magnitude.
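7. Appendix

7.1. Proof of Proposition 1

The result (4.3) follows by adopting standard arguments for quantile regression processes, for instance those of Gutenbrunner and Jureckova (1992). The rest of the stated conclusions (4.4)-(4.10) follow by the Slutsky Lemma.

7.2. Proof of Proposition 2

We use standard definitions and notation from empirical process theory, as e.g. in van der Vaart and Wellner (1996) and van der Vaart (1998). For $W := (Y, D, X, Z)$, define the maps

$$f \mapsto \mathbb{E}_n[f(W)] := \frac{1}{n}\sum_{i=1}^{n} f(W_i), \qquad f \mapsto \mathbb{G}_n[f(W)] := \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big(f(W_i) - \mathrm{E}[f(W_i)]\big), \tag{7.1}$$

where we use $\mathrm{E}$ to denote the usual expectation and $E$ to denote expectation evaluated at an estimated function $\hat f$: $E[\hat f(W_i)] := (\mathrm{E}[f(W_i)])_{f=\hat f}$.

For convenience we collect important definitions below. Let $\vartheta := (\beta, \gamma)$ and $\vartheta(\tau) := (\beta(\tau), 0)$. Define

$$f(W, \alpha, \vartheta) := \big(\tau - 1(Y < D'\alpha + X'\beta + Z'\gamma)\big)\Psi, \tag{7.2}$$

where $\Psi := V \cdot (X', Z')'$. Define for $\rho_\tau(u) = (\tau - 1(u < 0))u$

$$g(W, \alpha, \vartheta) := \rho_\tau(Y - D'\alpha - X'\beta - Z'\gamma)V. \tag{7.3}$$

Define

$$Q_n(\alpha, \vartheta) := \mathbb{E}_n[g(W, \alpha, \vartheta)], \qquad Q(\alpha, \vartheta) := \mathrm{E}[g(W, \alpha, \vartheta)], \tag{7.4}$$

and

$$\hat\vartheta(\alpha, \tau) := (\hat\beta(\alpha, \tau), \hat\gamma(\alpha, \tau)) := \arg\inf_{\vartheta} Q_n(\alpha, \vartheta), \tag{7.5}$$

$$\vartheta(\alpha, \tau) := (\beta(\alpha, \tau), \gamma(\alpha, \tau)) := \arg\inf_{\vartheta} Q(\alpha, \vartheta), \tag{7.6}$$

$$W_n[\alpha] := \hat\gamma(\alpha, \tau)'\hat A(\alpha)\hat\gamma(\alpha, \tau), \qquad W[\alpha] := \gamma(\alpha, \tau)'A(\alpha)\gamma(\alpha, \tau), \tag{7.7}$$

$$\hat\alpha(\tau) := \arg\inf_{\alpha \in \mathcal{A}} W_n[\alpha], \qquad \alpha^* := \arg\inf_{\alpha \in \mathcal{A}} W[\alpha], \tag{7.8}$$

$$\hat\vartheta(\tau) := \hat\vartheta(\hat\alpha(\tau), \tau), \quad \hat\gamma(\tau) := \hat\gamma(\hat\alpha(\tau), \tau), \quad \beta^* := \beta(\alpha^*, \tau), \quad \gamma^* := \gamma(\alpha^*, \tau). \tag{7.9}$$

Step 1 (Identification). We show that $(\alpha(\tau)', \beta(\tau)')'$ uniquely solves the limit problem.

First, by R6, the mapping $(\alpha, \beta) \mapsto \mathrm{E}[(\tau - 1(Y < D'\alpha + X'\beta))\Psi]$ is one-to-one over $\mathcal{A} \times \mathcal{B}$. By equation (2.5), we have that $(\alpha(\tau)', \beta(\tau)')'$ solves the equation $\mathrm{E}[(\tau - 1(Y < D'\alpha + X'\beta))\Psi] = 0$; it is thus the only solution over $\mathcal{A} \times \mathcal{B}$.

Second, we need to show that R5'-R6' and R5*-R6* suffice for R6. Sufficiency of R5'-R6' follows by a variant of Hadamard-Cacciopoli theorem for general metric spaces, cf. Theorem 1.8 in Ambrosetti and Prodi (1995). Sufficiency of R5*-R6* follows by Theorem 2 in Mas-Colell (1979).

Third, we have that the true parameters $(\alpha, \beta) = (\alpha(\tau), \beta(\tau))$ uniquely solve the equation

$$\mathrm{E}\big[(\tau - 1(Y < D'\alpha + X'\beta + Z'0))\Psi\big] = 0 \tag{7.10}$$

over $\mathcal{A} \times \mathcal{B}$. By R4 and by convexity in $\vartheta$ of the limit optimization problem for each $\alpha$, $\vartheta(\alpha, \tau)$ uniquely solves the equation:

$$\mathrm{E}\big[(\tau - 1(Y < D'\alpha + X'\beta(\alpha, \tau) + Z'\gamma(\alpha, \tau)))\Psi\big] = 0. \tag{7.11}$$

By construction of $\mathcal{A} \times \mathcal{B}$ we know that $\beta(\alpha, \tau)$ is in the interior of $\mathcal{B}$ for each $\alpha \in \mathcal{A}$. We need to find $\alpha^*$ such that this equation holds and the norm of $\gamma(\alpha^*, \tau)$ is minimal. $\alpha^* = \alpha(\tau)$ makes $\gamma(\alpha^*, \tau) = 0$ and $\beta(\alpha^*, \tau) = \beta(\tau)$ by equation (2.5). Thus $\alpha^* = \alpha(\tau)$ is a solution; by the preceding argument it is unique.

Step 2 (Consistency). One consequence of Proposition 1, namely of equation (4.3), is that

$$\sup_{\alpha \in \mathcal{A}} \|\hat\vartheta(\alpha, \tau) - \vartheta(\alpha, \tau)\| \to_p 0, \ \text{i.e.} \ \sup_{\alpha \in \mathcal{A}} \|\hat\beta(\alpha, \tau) - \beta(\alpha, \tau)\| \to_p 0, \quad \sup_{\alpha \in \mathcal{A}} \|\hat\gamma(\alpha, \tau) - \gamma(\alpha, \tau)\| \to_p 0, \tag{7.12}$$

which implies $\sup_{\alpha \in \mathcal{A}} |W_n(\alpha) - W(\alpha)| \to_p 0$, where $W(\alpha)$ is continuous in $\alpha$ over $\mathcal{A}$. It therefore follows by the standard consistency argument for extremum estimators that $\hat\alpha(\tau) \to_p \alpha(\tau)$, and then by (7.12) that for any $\alpha_n \to_p \alpha(\tau)$, $\hat\beta(\alpha_n, \tau) \to_p \beta(\alpha(\tau), \tau) = \beta(\tau)$ and $\hat\gamma(\alpha_n, \tau) \to_p \gamma(\alpha(\tau), \tau) = 0$. Hence we also have that

$$\hat\vartheta(\alpha_n, \tau) \to_p \vartheta(\alpha(\tau), \tau) \quad \text{for any } \alpha_n \to_p \alpha(\tau). \tag{7.13}$$

Note that above we have used that $\vartheta(\alpha, \tau)$ is continuous in $\alpha$, which is verified by the implicit function theorem applied to equation (7.11).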
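Step 3 (Asymptotics). Let $\alpha_n$ be in a small ball centered at $\alpha(\tau)$. By the computational properties of the quantile regression estimator $\hat\vartheta(\alpha_n, \tau)$ established in Theorem 3.3 in Koenker and Bassett (1978),

$$O(1/\sqrt{n}) = \sqrt{n}\,\mathbb{E}_n\big[f(W, \alpha_n, \hat\vartheta(\alpha_n, \tau))\big]. \tag{7.14}$$

The functional class $\{f(W, \alpha, \vartheta), (\alpha, \vartheta) \in \mathcal{A} \times \mathcal{B} \times \mathcal{G}\}$ is Donsker for any compact sets $\mathcal{A}$, $\mathcal{B}$, and $\mathcal{G}$, because this class is a product of a VC subgraph class and a bounded random vector. Hence the following expansion of the rhs of (7.14) is valid for any $\alpha_n \to_p \alpha(\tau)$:

$$O(1/\sqrt{n}) = \mathbb{G}_n\big[f(W, \alpha_n, \hat\vartheta(\alpha_n, \tau))\big] + \sqrt{n}\,E\big[f(W, \alpha_n, \hat\vartheta(\alpha_n, \tau))\big]$$
$$= \mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\alpha(\tau), \tau))\big] + o_p(1) + \sqrt{n}\,E\big[f(W, \alpha_n, \hat\vartheta(\alpha_n, \tau))\big]. \tag{7.15}$$

Expanding the very last element further, by R4

$$O(1/\sqrt{n}) = \mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] + o_p(1) + (J_\vartheta + o_p(1))\sqrt{n}(\hat\vartheta(\alpha_n, \tau) - \vartheta(\tau)) + (J_\alpha + o_p(1))\sqrt{n}(\alpha_n - \alpha(\tau)), \tag{7.16}$$

where by R1 and R3

$$J_\vartheta := \frac{\partial}{\partial(\beta', \gamma')}\mathrm{E}\big[f(W, \alpha(\tau), \vartheta)\big]\Big|_{\vartheta = \vartheta(\tau)}, \qquad J_\alpha := \frac{\partial}{\partial\alpha'}\mathrm{E}\big[f(W, \alpha, \vartheta(\tau))\big]\Big|_{\alpha = \alpha(\tau)} = \mathrm{E}\big[f_\epsilon(0|X, Z, D)\Psi D'\big]. \tag{7.17}$$

In other words, for any $\alpha_n \to_p \alpha(\tau)$

$$\sqrt{n}(\hat\vartheta(\alpha_n, \tau) - \vartheta(\tau)) = -J_\vartheta^{-1}\mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] - J_\vartheta^{-1}J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1), \tag{7.18}$$

so

$$\sqrt{n}(\hat\beta(\alpha_n, \tau) - \beta(\tau)) = -\bar J_\beta\mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] - \bar J_\beta J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1), \tag{7.19}$$

and

$$\sqrt{n}(\hat\gamma(\alpha_n, \tau) - 0) = -\bar J_\gamma\mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] - \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau)) + o_p(1), \tag{7.20}$$

where $[\bar J_\beta', \bar J_\gamma']'$ is the conformable partition of $J_\vartheta^{-1}$.

Center a shrinking closed ball $B_n$ at 0, so that wp $\to 1$, $\hat\alpha(\tau) - \alpha(\tau) \in B_n$ by consistency obtained in Step 2. Then wp $\to 1$

$$\hat\alpha(\tau) = \arg\inf_{\alpha_n:\ \alpha_n - \alpha(\tau) \in B_n} W_n(\alpha_n). \tag{7.21}$$

Note that $\mathbb{G}_n[f(W, \alpha(\tau), \vartheta(\tau))] \to_d N(0, S)$ by the Central Limit Theorem. Hence $\mathbb{G}_n[f(W, \alpha(\tau), \vartheta(\tau))] = O_p(1)$, and

$$W_n(\alpha_n) = \big[O_p(1) - \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau))\big]' \times \big[A(\alpha) + o_p(1)\big] \times \big[O_p(1) - \bar J_\gamma J_\alpha[1 + o_p(1)]\sqrt{n}(\alpha_n - \alpha(\tau))\big]. \tag{7.22}$$

It then follows from (7.21) and (7.22) that $\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = O_p(1)$ since $\bar J_\gamma J_\alpha$ has full column rank and $A(\alpha)$ has full rank from R4 and R5. Thus, we have that

$$\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = \arg\inf_{z}\,[Q_n(z) + o_p(1)], \tag{7.23}$$

where

$$Q_n(z) := \big(-\bar J_\gamma\mathbb{G}_n f(W, \alpha(\tau), \vartheta(\tau)) - \bar J_\gamma J_\alpha z\big)' A(\alpha) \big(-\bar J_\gamma\mathbb{G}_n f(W, \alpha(\tau), \vartheta(\tau)) - \bar J_\gamma J_\alpha z\big).$$

LEMMA 1 (Approximate Argmins, Knight (1999)). Define $Z_n$ such that $Q_n(Z_n) \le \inf_{z \in \mathbb{R}^d} Q_n(z) + \epsilon_n$, $\epsilon_n \searrow 0$, and define $Z_n^*$ as $\arg\inf_{z \in \mathbb{R}^d} Q_n(z)$. Suppose that $Z_n = O_p(1)$, $Z_n^* = O_p(1)$, $Z_\infty := \arg\min_{z \in \mathbb{R}^d} Q_\infty(z)$ is uniquely defined in $\mathbb{R}^d$ a.s., and $Q_n(\cdot) \Rightarrow Q_\infty(\cdot)$ in $\ell^\infty(K)$ over any compact sets $K$, where $Q_\infty$ is continuous. Then $Z_n = Z_n^* + o_p(1)$ and $Z_n \to_d Z_\infty$.

Apply Lemma 1 to $Q_n(z)$ defined above and conclude that

$$\sqrt{n}(\hat\alpha(\tau) - \alpha(\tau)) = \arg\inf_{z}\,[Q_n(z)] + o_p(1) \tag{7.24}$$

$$= -\big(J_\alpha'\bar J_\gamma' A(\alpha(\tau))\bar J_\gamma J_\alpha\big)^{-1}\big(J_\alpha'\bar J_\gamma' A(\alpha(\tau))\bar J_\gamma\big) \times \mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] + o_p(1). \tag{7.25}$$

Hence

$$\sqrt{n}(\hat\vartheta(\hat\alpha(\tau), \tau) - \vartheta(\tau)) = -J_\vartheta^{-1}\Big[I - J_\alpha\big(J_\alpha'\bar J_\gamma' A(\alpha(\tau))\bar J_\gamma J_\alpha\big)^{-1}J_\alpha'\bar J_\gamma' A(\alpha(\tau))\bar J_\gamma\Big] \times \mathbb{G}_n\big[f(W, \alpha(\tau), \vartheta(\tau))\big] + o_p(1). \tag{7.26}$$

The conclusion of Proposition 2 follows from $\mathbb{G}_n[f(W, \alpha(\tau), \vartheta(\tau))] \to_d N(0, S)$.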
References

Abadie, A., J. Angrist, and G. Imbens (2002): "Instrumental variables estimates of the effect of subsidized training on the quantiles of trainee earnings," Econometrica, 70(1), 91-117.
Ambrosetti, A., and G. Prodi (1995): A primer of nonlinear analysis, vol. 34 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.
Amemiya, T. (1977): "The maximum likelihood and the nonlinear three-stage least squares estimator in the general nonlinear simultaneous equation model," Econometrica, 45(4), 955-968.
Amemiya, T. (1982): "Two Stage Least Absolute Deviations Estimators," Econometrica, 50, 689-711.
Andrews, D. (1994): "Empirical Process Methods in Econometrics," in Handbook of Econometrics, Vol. 4, ed. by R. Engle, and D. McFadden. North Holland.
Bickel, P. J., C. A. J. Klaassen, Y. Ritov, and J. A. Wellner (1993): Efficient and adaptive estimation for semiparametric models. Johns Hopkins University Press, Baltimore, MD, Johns Hopkins Series in the Mathematical Sciences.
Chaudhuri, P., K. Doksum, and A. Samarov (1997): "On average derivative quantile regression," Ann. Statist., 25(2), 715-744.
Chen, L., and S. Portnoy (1996): "Two-stage regression quantiles and two-stage trimmed least squares estimators for structural equation models," Comm. Statist. Theory Methods, 25(5), 1005-1032.
Chernozhukov, V., and C. Hansen (2004a): "The Effects of 401(k) Participation on the Wealth Distribution: An Instrumental Quantile Regression Analysis," Review of Economics and Statistics, 86(3), 735-751.
Chernozhukov, V., and C. Hansen (2004b): "An IV Model of Quantile Treatment Effects," forthcoming Econometrica.
Chernozhukov, V., and H. Hong (2003): "An MCMC Approach to Classical Estimation," Journal of Econometrics, 115, 293-346.
Doksum, K. (1974): "Empirical probability plots and statistical inference for nonlinear models in the two-sample case," Ann. Statist., 2, 267-277.
Graddy, K. (1995): "Testing for Imperfect Competition at the Fulton Fish Market," Rand Journal of Economics, 26(1), 75-92.
Gutenbrunner, C., and J. Jureckova (1992): "Regression rank scores and regression quantiles," Ann. Statist., 20(1), 305-330.
Hausman, J. A., and J. G. Sidak (2004): "Why Do the Poor and the Less-Educated Pay Higher Prices for Long-Distance Calls?," Contributions to Economic Analysis & Policy, Vol. 3, No. 1, Article 3. http://www.bepress.com/bejeap/contributions/vol3/issl/art3.
He, X., and L. Zhu (2003): "A lack-of-fit test for quantile regression," Journal of the American Statistical Association, to appear.
Januszewski, S. I. (2004): "The Effect of Air Traffic Delays on Airline Prices," UCSD Working Paper. http://weber.ucsd.edu/ sjanusze/www/airtrafficdelays.pdf.
Knight, K. (1998): "Limiting distributions for L1 regression estimators under general conditions," Ann. Statist., 26(2), 755-770.
Knight, K. (1999): "Epi-convergence and Stochastic Equisemicontinuity," Preprint.
Koenker, R. (1994): "Confidence intervals for regression quantiles," in Asymptotic statistics (Prague, 1993), pp. 349-359. Physica, Heidelberg.
Koenker, R., and G. S. Bassett (1978): "Regression Quantiles," Econometrica, 46, 33-50.
Koenker, R., and J. A. F. Machado (1999): "Goodness of fit and related inference processes for quantile regression," J. Amer. Statist. Assoc., 94(448), 1296-1310.
Koenker, R., and S. Portnoy (1987): "L-estimation for linear models," J. Amer. Statist. Assoc., 82(399), 851-857.
Mas-Colell, A. (1979): "Homeomorphisms of compact, convex sets and the Jacobian matrix," SIAM J. Math. Anal., 10(6), 1105-1109.
Portnoy, S. (1991): "Asymptotic behavior of regression quantiles in nonstationary, dependent cases," J. Multivariate Anal., 38(1), 100-113.
Portnoy, S. (2001): "Censored Regression Quantiles," preprint, www.stat.uiuc.edu.
Portnoy, S., and R. Koenker (1997): "The Gaussian Hare and the Laplacian Tortoise," Statistical Science, 12, 279-300.
Sakata, S. (2001): "Instrumental Variable Estimation Based on the Least Absolute Deviation Estimator," Preprint, Department of Economics, University of Michigan.
van der Vaart, A. W. (1998): Asymptotic statistics. Cambridge University Press, Cambridge.
van der Vaart, A. W., and J. A. Wellner (1996): Weak convergence and empirical processes. Springer-Verlag, New York.
Figure 1. Estimates of Effect of Price on Quantity by QR and IVQR

[Plots not reproduced. Left column: QR conditional quantile functions; right column: IVQR demand functions. Horizontal axis of the top displays: log(Price).]

Note: Left Column: The estimated QR conditional quantile curves of the quantity of fish sold as a function of price for $\tau$ = .15, .25, .50, .75, and .85. The top display is in log-price log-quantity space with log-price on the horizontal axis and log-quantity on the vertical axis. The bottom display is in price-quantity space with price on the horizontal axis and quantity on the vertical axis. Right Column: The demand curves estimated by IVQR for $\tau$ = .15, .25, .50, .75, and .85. The top display is in log-price log-quantity space with log-price on the horizontal axis and log-quantity on the vertical axis. The bottom display is in price-quantity space with price on the horizontal axis and quantity on the vertical axis.
Figure 2. Statistic $W_n(\alpha)$ in Demand Example Using Stormy as an Instrument

[Plots of $W_n(\alpha)$ against $\alpha$, one panel per quantile; images not reproduced.]

Note: Objective functions and dual confidence regions for demand for fish example. All models are as specified in the main text. The estimates make use of one instrument, Stormy. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the vertical axis. The horizontal line is the 95% critical value from a $\chi^2_1$. The dual confidence region is all values of $\alpha$ such that $W_n(\alpha)$ lies below the horizontal line.
Figure 3. Statistic $W_n(\alpha)$ in Demand Example Using Stormy and Mixed as Instruments

[Plots of $W_n(\alpha)$ against $\alpha$, one panel per quantile ($\tau$ = .15, .25, .50, .75, .85); images not reproduced.]

Note: Objective functions and dual confidence regions for demand for fish example. All models are as specified in the main text. The estimates make use of two instruments, Stormy and Mixed. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the vertical axis. The horizontal line is the 95% critical value from a $\chi^2$ distribution. The dual confidence region is all values of $\alpha$ such that $W_n(\alpha)$ lies below the horizontal line.
Figure 4. Estimates of the Training Impact by QR and by IVQR

[Plots not reproduced. Panels: "QR: Training Effect", "QR: Percentage Impact of Training", "IVQR: Training Effect", "IVQR: Percentage Impact of Training".]

Note: Left Column: QR and IVQR estimates of the impact of a job training program on earnings for $\tau$ = .15, .25, .50, .75, and .85. The top panel reports the QR estimate of the training impact, and the bottom panel reports the IVQR results. In each figure, the solid line represents the point estimates, and the dashed (- -) line represents the 95% confidence interval formed using the direct inference approach. For the IVQR results, the dash-dot (-.) line represents the 95% confidence bound constructed using the dual inference procedure described in the text. In both figures, the horizontal axis measures the quantile index $\tau$, and the vertical axis is the impact of training on earning quantiles measured in dollars. Models include covariates as specified in the text, and the sample size is 5,102. Right Column: QR and IVQR estimates of the percentage impact of training for $\tau$ = .15, .25, .50, .75, and .85. The top panel reports the QR estimate of the training impact, and the bottom panel reports the IVQR results. Percentage impacts are for moving from non-training to training and all other covariates are evaluated at their sample mean. In both figures, the horizontal axis measures the quantile index $\tau$, and the vertical axis is the percentage impact of training.
Figure 5. Statistic $W_n(\alpha)$ in the Training Example.

[Plots of $W_n(\alpha)$ against $\alpha$, one panel per quantile; images not reproduced.]

Note: Objective functions and dual confidence regions for returns to training example. All models are as specified in the main text. The estimates use random assignment to the training program as the instrument. $\alpha$ is on the horizontal axis and $W_n(\alpha)$ is on the vertical axis. The horizontal line is the 95% critical value from a $\chi^2_1$. The dual confidence region is all values of $\alpha$ such that the function value lies below the horizontal line.