Massachusetts Institute of Technology
Department of Economics
Working Paper Series
INFERENCE
ON COUNTERFACTUAL
DISTRIBUTIONS
Victor Chernozhukov
Ivan Fernandez-Val
Blaise Melly
Working Paper 08-1
August 8, 2008
Revised: April 4, 2009
Room E52-251
50 Memorial Drive
Cambridge, MA 02142
This paper can be downloaded without charge from the Social Science Research Network Paper Collection at
http://ssrn.com/abstract=1235529
Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/inferenceoncount00cher2
INFERENCE ON COUNTERFACTUAL DISTRIBUTIONS

VICTOR CHERNOZHUKOV†, IVAN FERNANDEZ-VAL§, BLAISE MELLY*

Abstract. In this paper we develop procedures for performing inference in regression models about how potential policy interventions affect the entire marginal distribution of an outcome of interest. These policy interventions consist of either changes in the distribution of covariates related to the outcome holding the conditional distribution of the outcome given covariates fixed, or changes in the conditional distribution of the outcome given covariates holding the marginal distribution of the covariates fixed. Under either of these assumptions, we obtain uniformly consistent estimates and functional central limit theorems for the counterfactual and status quo marginal distributions of the outcome as well as other function-valued effects of the policy, including, for example, the effects of the policy on the marginal distribution function, quantile function, and other related functionals. We construct simultaneous confidence sets for these functions; these sets take into account the sampling variation in the estimation of the relationship between the outcome and covariates. Our procedures rely on, and our theory covers, all main regression approaches for modeling and estimating conditional distributions, focusing especially on classical, quantile, duration, and distribution regressions. Our procedures are general and accommodate both simple unitary changes in the values of a given covariate as well as changes in the distribution of the covariates or the conditional distribution of the outcome given covariates of general form. We apply the procedures to examine the effects of labor market institutions on the U.S. wage distribution.
Key Words:
Policy effects, counterfactual distribution, quantile regression, duration
regression, distribution regression
JEL Classification: C14, C21, C41, J31, J71

Date: April 4, 2009. This paper replaces the earlier independent projects started in 2005 "Inference on Counterfactual Distributions Using Conditional Quantile Models," by Chernozhukov and Fernandez-Val, and "Estimation of Counterfactual Distributions Using Quantile Regression," by Melly. We would like to thank Alberto Abadie, Josh Angrist, Manuel Arellano, David Autor, Arun Chandrasekhar, Flavio Cunha, Brigham Frandsen, Jerry Hausman, Michael Jansson, Joonhwan Lee, Pierre-Andre Maugis, and seminar participants at Banff International Research Station Conference on Semiparametric and Nonparametric Methods in Econometrics, Berkeley, Boston University, CEMFI, Columbia, Harvard/MIT, MIT, Ohio State, St. Gallen, and 2008 Winter Econometric Society Meetings for very useful comments that helped improve the paper. Companion software developed by the authors (counterfactual package for Stata) is available from Blaise Melly.
1. Introduction

A basic objective in empirical economics is to predict the effect of a potential policy intervention or a counterfactual change in economic conditions on some outcome variable of interest. For example, we might be interested in what the wage distribution would be in 2000 if workers have the same characteristics as in 1990, what the distribution of infant birth weights would be for black mothers if they receive the same amount of prenatal care as white mothers, what the distribution of consumer expenditure would be if we change the income tax, or what the distribution of housing prices would be if we clean up a local hazardous waste site. In other examples, we might be interested in what the distribution of wages for female workers would be in the absence of gender discrimination in the labor market (e.g., if female workers are paid as male workers with the same characteristics), or what the distribution of wages for black workers would be in the absence of racial discrimination in the labor market (e.g., if black workers are paid as white workers with the same characteristics). More generally, we can think of a policy intervention either as a change in the distribution of a set of explanatory variables $X$ that determine the outcome variable of interest $Y$, or as a change in the conditional distribution of $Y$ given $X$. Policy analysis consists of estimating the effect on the distribution of $Y$ of a given change in the distribution of $X$ or in the conditional distribution of $Y$ given $X$.

In this paper we develop procedures to perform inference in regression models about how these counterfactual policy interventions affect the entire marginal distribution of $Y$. The main assumption is that either the policy does not alter the conditional distribution of $Y$ given $X$ and only alters the marginal distribution of $X$, or that the policy does not alter the marginal distribution of $X$ and only alters the conditional distribution of $Y$ given $X$.
Starting from estimates of the conditional distribution or quantile functions of the outcome given covariates, we obtain uniformly consistent estimates for functionals of the marginal distribution function of the outcome before and after the intervention. Examples of these functionals include distribution functions, quantile functions, quantile policy effects, distribution policy effects, means, variances, and Lorenz curves. We then construct confidence sets around these estimates that take into account the sampling variation coming from the estimation of the conditional model. These confidence sets are uniform in the sense that they cover the entire functional of interest with pre-specified probability.

Our analysis specifically targets and covers the principal approaches to estimating conditional distribution models most often used in empirical work, including classical, quantile, duration, and distribution regressions. Moreover, our approach can be used to analyze the effect of both simple interventions consisting of unitary changes in the values of a covariate as well as more elaborate policies consisting of general changes in the covariate distribution or in the conditional distribution of the outcome given covariates. Moreover, the counterfactual distribution of $X$ and conditional distribution of $Y$ given $X$ can correspond to known transformations of these distributions or to the distributions in a different subpopulation or group. This array of alternatives allows us to answer a wide variety of policy questions such as the ones mentioned in the first paragraph.

To develop the inference results, we establish the functional (Hadamard) differentiability of the marginal distribution functions before and after the policy with respect to the limit of the functional estimators of the conditional model of the outcome given the covariates. This result allows us to derive the asymptotic distribution for the functionals of interest taking into account the sampling variation coming from the first stage estimation of the relationship between the outcome and covariates by means of the functional delta method. Moreover, this general approach based on functional differentiability allows us to establish the validity of convenient resampling methods, such as bootstrap and other simulation methods, to make uniform inference on the functionals of interest. Because our analysis relies only on the conditional quantile estimators or conditional distribution estimators satisfying a functional central limit theorem, it applies quite broadly and we show it covers the major regression methods listed above. As a consequence, we cover a wide array of techniques, though in the discussion we devote attention primarily to the most practical and commonly used methods of estimating conditional distribution and quantile functions.
This paper contributes to the previous literature on estimating policy effects using regression methods. In particular, important developments include the work of Stock (1989), which introduced regression-based estimators to evaluate the mean effect of policy interventions, and of Gosling, Machin, and Meghir (2000) and Machado and Mata (2005), which proposed quantile regression-based policy estimators to evaluate distributional effects of policy interventions, but did not provide distribution or inference theory for these estimators. Our paper contributes to this literature by providing regression-based policy estimators to evaluate quantile, distributional, and other effects (e.g., Lorenz and Gini effects) of a general policy intervention and by deriving functional limit theory as well as practical inferential tools for these policy estimators. Our policy estimators are based on a rich variety of regression models for the conditional distribution, including classical, quantile, duration, and distribution regressions.^1 In particular, our theory covers the previous estimators of Gosling, Machin, and Meghir (2000) and Machado and Mata (2005) as important special cases. In fact, our limit theory is generic and applies to any estimator of the conditional distribution that satisfies a functional central limit theorem. Accordingly, we cover not only a wide array of the most practical current approaches for estimating conditional distributions, but also many other existing and future approaches, including, for example, approaches that accommodate endogeneity (Abadie, Angrist, and Imbens, 2002, Chesher, 2003, Chernozhukov and Hansen, 2005, and Imbens and Newey, 2009).^2

Our paper is also related to the literature that evaluates policy effects and treatment effects using propensity score methods. The influential article of DiNardo, Fortin, and Lemieux (1996) developed estimators for counterfactual densities using propensity score reweighting in the spirit of Horvitz and Thompson (1952). Important related work by Hirano, Imbens, and Ridder (2003) and Firpo (2007) used a similar reweighting approach in exogenous treatment effects models to construct efficient estimators of average and quantile treatment effects, respectively. As we comment later in the paper, it is possible to adapt the reweighting methods of these articles to develop policy estimators and limit theory for such estimators. Here, however, we focus on developing inferential theory for policy estimators based on regression methods, thus supporting empirical research using regression techniques as its primary method (Buchinsky, 1994, Chamberlain, 1994, Han and Hausman, 1990, Machado and Mata, 2005). The recent book of Angrist and Pischke (2008, Chap. 3) provides a nice comparative discussion of regression and propensity score methods. Finally, a related work by Firpo, Fortin, and Lemieux (2007) studied the effects of special policy interventions consisting of marginal changes in the values of the covariates. As we comment later in the paper, their approach, based on a linearization of the functionals of interest, is quite different from ours. In particular, our approach focuses on more general non-marginal changes in both the marginal distribution of covariates and conditional distribution of the outcome given covariates.
^1 We focus on semi-parametric estimators due to their dominant role in empirical work (Angrist and Pischke, 2008). In contrast, fully nonparametric estimators are practical only in situations with a small number of regressors. In future work, however, we hope to extend the analysis to nonparametric estimators.

^2 In this case, the literature provides estimators for $F_{Y_d}$, the distribution of potential outcome $Y$ under treatment $d$, and $F_{D,Z}$, the joint distributions of (endogenously determined) treatment status $D$ and exogenous regressors $Z$ before and after policy. As long as the estimator of $F_{Y_d}$ satisfies the functional central limit theorem specified in the main text and the estimator of $F_{D,Z}$ satisfies the functional central limit theorem specified in Appendix D, our inferential theory applies to the resulting policy estimators.
We illustrate our estimation and inference procedures with an analysis of the evolution of the U.S. wage distribution. Our analysis is motivated by the influential article by DiNardo, Fortin, and Lemieux (1996), which studied the institutional and labor market determinants of the changes in the wage distribution between 1979 and 1988 using data from the CPS. We complement and complete their analysis by using a wider range of techniques, including quantile regression and distribution regression, providing standard errors for the estimates of the main effects, and extending the analysis to the entire distribution using simultaneous confidence bands. Our results reinforce the importance of the decline in the real minimum wage in explaining the increase in wage inequality. They also indicate the importance of changes in both the composition of the workforce and the returns to worker characteristics in explaining the evolution of the entire wage distribution. Our results show that, after controlling for other composition effects, the process of de-unionization during the 80s played a minor role in explaining the evolution of the wage distribution.
We organize the rest of the paper as follows. In Section 2 we describe methods for performing counterfactual analysis, setting up the modeling assumptions for the counterfactual outcomes, and introduce the policy estimators. In Section 3 we derive distributional results and inferential procedures for the policy estimators. In Section 4 we present the empirical application, and in Section 5 we give a summary of the main results. In the Appendix, we include proofs and additional theoretical results.

2. Methods for Counterfactual Analysis

2.1. Observed and counterfactual outcomes. In our analysis it is important to distinguish between observed and counterfactual outcomes. Observed outcomes come from the population before the policy intervention, whereas (unobserved) counterfactual outcomes come from the population after the potential policy intervention. We use the observed outcomes and covariates to establish the relationship between outcome and covariates and the distribution of the covariates, which, together with either a postulated distribution of the covariates under the policy or a postulated conditional distribution of outcomes given covariates under the policy, determine the distribution of the outcome after the policy intervention, under conditions precisely stated below.
We divide our population in two groups or subpopulations indexed by $j \in \{0, 1\}$. Index 0 corresponds to the status quo or reference group, whereas index 1 corresponds to the group from which we obtain the marginal distribution of $X$ or the conditional distribution of $Y$ given $X$ to generate the counterfactual outcome distribution.^3 In order to discuss various regression models of outcomes given covariates, it is convenient to consider the following representation. Let $Q_{Y_j}(u|x)$ be the conditional $u$-quantile of $Y$ given $X$ in group $j$, and let $F_{X_k}$ be the marginal distribution of the $p$-vector of covariates $X$ in group $k$, for $j,k \in \{0, 1\}$. We can describe the observed outcome $Y_j^j$ in group $j$ as a function of covariates and a non-additive disturbance $U_j^j$ via the Skorohod representation:

$$Y_j^j = Q_{Y_j}(U_j^j \mid X_j), \quad \text{where } U_j^j \sim U(0,1) \text{ independently of } X_j \sim F_{X_j}, \text{ for } j \in \{0,1\}.$$

Here the conditional quantile function plays the role of a link function. More generally we can think of $Q_{Y_j}(u|x)$ as a structural or causal function mapping the covariates and the disturbance to the outcome, where the covariate vector can include control variables to account for endogeneity. In the classical regression model, the disturbance is separable from the covariates, as in the location shift model described below, but generally it need not be. Our analysis will cover either case.

^3 Our results also cover the policy intervention of changing both the marginal distribution of $X$ and the conditional distribution of $Y$ given $X$. In this case the counterfactual outcome corresponds to the observed outcome in group 1.

We consider two different counterfactual experiments. The first experiment consists of drawing the vector of covariates from the distribution of covariates in group 1, i.e., $X_1 \sim F_{X_1}$, while keeping the conditional quantile function as in group 0, $Q_{Y_0}(u|x)$. The counterfactual outcome $Y_0^1$ is therefore generated by

$$Y_0^1 := Q_{Y_0}(U_0^1 \mid X_1), \quad \text{where } U_0^1 \sim U(0,1) \text{ independently of } X_1 \sim F_{X_1}. \tag{2.1}$$

This construction assumes that we can evaluate the quantile function $Q_{Y_0}(u|x)$ at each point $x$ in the support of $X_1$. This requires that either the support of $X_1$ is a subset of the support of $X_0$ or we can extrapolate the quantile function outside the support of $X_0$.

For purposes of analysis, it is useful to distinguish two different ways of constructing the alternative distributions of the covariates. (1) The covariates before and after the policy arise from two different populations or subpopulations. These populations might correspond to different demographic groups, time periods, or geographic locations. Specific examples include the distributions of worker characteristics in different years and distributions of socioeconomic characteristics for black versus white mothers. (2) The covariates under the policy intervention arise as some known transformation of the covariates in group 0; that is $X_1 = g(X_0)$, where $g(\cdot)$ is a known function. This case covers, for example, unitary changes in the location of one of the covariates, $X_1 = X_0 + e_j$, where $e_j$ is a unitary $p$-vector with a one in the position $j$; or mean preserving redistributions of the covariates implemented as $X_1 = (1 - \alpha)E[X_0] + \alpha X_0$. These types of policies are useful for estimating the effect of smoking on the marginal distribution of infant birth weights, the effect of a change in taxation on the marginal distribution of food expenditure, or the effect of cleaning up a local hazardous waste site on the marginal distribution of housing prices (Stock, 1991). Even though these two cases correspond to conceptually different thought experiments, our econometric analysis will cover either situation within a unified framework.
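As a minimal sketch of the first experiment, the following Python fragment simulates the status quo outcome and the counterfactual outcome in (2.1) under a mean preserving redistribution of a scalar covariate. The group-0 quantile function `q_y0` (a location model with standard normal disturbance), the uniform covariate distribution, and the value $\alpha = 0.5$ are hypothetical choices made only for illustration; they are not taken from the paper.

```python
import random
from statistics import NormalDist, mean, pvariance

def q_y0(u, x):
    """Hypothetical group-0 conditional quantile function Q_{Y_0}(u|x)."""
    return 1.0 + 2.0 * x + NormalDist().inv_cdf(u)

def draw_outcomes(x_sample, rng):
    """Draw Q_{Y_0}(U|X) with U ~ U(0,1) drawn independently of X."""
    out = []
    for x in x_sample:
        u = rng.random()
        while u == 0.0:  # inv_cdf is undefined at 0, so redraw in that rare case
            u = rng.random()
        out.append(q_y0(u, x))
    return out

rng = random.Random(0)
alpha = 0.5                                              # assumed redistribution parameter
x0 = [rng.uniform(0.0, 2.0) for _ in range(20000)]       # status quo covariates
x0_bar = mean(x0)
x1 = [(1.0 - alpha) * x0_bar + alpha * x for x in x0]    # X_1 = (1 - a) E[X_0] + a X_0

y00 = draw_outcomes(x0, rng)   # status quo outcome Y_0^0
y01 = draw_outcomes(x1, rng)   # counterfactual outcome Y_0^1 as in (2.1)

print("mean effect:", mean(y01) - mean(y00))
print("variance before and after:", pvariance(y00), pvariance(y01))
```

In this design the support of $X_1$ lies inside the support of $X_0$, so no extrapolation of the quantile function is needed; the policy leaves the mean of the outcome roughly unchanged while compressing its distribution.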
The second experiment consists of generating the outcome from the conditional quantile function in group 1, $Q_{Y_1}(u|x)$, while keeping the marginal distributions of the covariates as in group 0, that is, $X_0 \sim F_{X_0}$. The counterfactual outcome $Y_1^0$ is therefore generated by

$$Y_1^0 := Q_{Y_1}(U_1^0 \mid X_0), \quad \text{where } U_1^0 \sim U(0,1) \text{ independently of } X_0 \sim F_{X_0}. \tag{2.2}$$

This construction assumes that we can evaluate the quantile function $Q_{Y_1}(u|x)$ at each point $x$ in the support of $X_0$. This requires that either the support of $X_0$ is a subset of the support of $X_1$ or we can extrapolate the quantile function outside the support of $X_1$.

In this second experiment, the conditional quantile functions before and after the policy intervention may arise from two different populations or subpopulations. These populations might correspond to different demographic groups, time periods, or geographic locations. This type of policy is useful for conceptualizing, for example, what the distribution of wages for female workers would be if they were paid as male workers with the same characteristics, or similarly for blacks or other minority groups.

We formally state the assumptions mentioned above as follows:

Condition M. Counterfactual outcome variables of interest are generated by either (2.1) or (2.2). The conditional distributions of the outcome given the covariates in both groups, namely the conditional quantile functions $Q_{Y_j}(\cdot|\cdot)$ or the conditional distribution functions $F_{Y_j}(\cdot|\cdot)$ for $j \in \{0, 1\}$, apply or can be extrapolated to all $x \in \mathcal{X}$, where $\mathcal{X}$ is a compact subset of $\mathbb{R}^p$ that contains the supports of $X_0$ and $X_1$.
2.2. Parameters of interest. The primary (function-valued) parameters of interest are the distribution and quantile functions of the outcome before and after the policy as well as functionals derived from them.

In order to define these parameters, we first recall that the conditional distribution associated with the quantile function $Q_{Y_j}(u|x)$ is:

$$F_{Y_j}(y|x) = \int_0^1 1\{Q_{Y_j}(u|x) \le y\}\, du, \quad j \in \{0,1\}. \tag{2.3}$$

Given our definitions (2.1) or (2.2) of the counterfactual outcome, the marginal distributions of interest are:

$$F_{Y_j}^k(y) := \Pr\{Y_j^k \le y\} = \int_{\mathcal{X}} F_{Y_j}(y|x)\, dF_{X_k}(x), \quad j,k \in \{0,1\}. \tag{2.4}$$

The corresponding marginal quantile functions are:

$$Q_{Y_j}^k(u) = \inf\{y : F_{Y_j}^k(y) \ge u\}, \quad j,k \in \{0,1\}.$$

The $u$-quantile policy effect and the $y$-distribution policy effect are:

$$QE_{Y_j}^k(u) = Q_{Y_j}^k(u) - Q_{Y_0}^0(u) \quad \text{and} \quad DE_{Y_j}^k(y) = F_{Y_j}^k(y) - F_{Y_0}^0(y), \quad j,k \in \{0,1\}.$$

It is useful to mention a couple of examples to understand the notation. For instance, $Q_{Y_0}^1(u) - Q_{Y_0}^0(u)$ is the quantile effect under a policy that changes the marginal distribution of covariates from $F_{X_0}$ to $F_{X_1}$, fixing the conditional distribution of outcome to $F_{Y_0}(y|x)$. On the other hand, $Q_{Y_1}^0(u) - Q_{Y_0}^0(u)$ is the quantile effect under a policy that changes the conditional distribution of the outcome from $F_{Y_0}(y|x)$ to $F_{Y_1}(y|x)$, fixing the marginal distribution of covariates to $F_{X_0}$.
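The chain from conditional quantile functions to marginal quantile effects in (2.3) and (2.4) can be sketched numerically as follows. The conditional quantile functions (a slope of 1.0 in group 0 and 1.5 in group 1 with a standard normal disturbance), the three-point covariate distributions, and the grid sizes are all hypothetical; the integral over $u$ is approximated by a midpoint Riemann sum, so the results are approximate.

```python
from statistics import NormalDist

# Phi^{-1}(u) on a midpoint grid of u in (0, 1); used to approximate the du-integral in (2.3).
Z_GRID = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
SLOPE = {0: 1.0, 1: 1.5}  # hypothetical group-specific slopes

def q_cond(z, x, j):
    """Q_{Y_j}(u|x) = slope_j * x + Phi^{-1}(u), evaluated at z = Phi^{-1}(u)."""
    return SLOPE[j] * x + z

def f_marg(y, j, x_k):
    """(2.3)-(2.4): share of (x, u) pairs with Q_{Y_j}(u|x) <= y, x equally weighted in group k."""
    hits = sum(q_cond(z, x, j) <= y for x in x_k for z in Z_GRID)
    return hits / (len(x_k) * len(Z_GRID))

def q_marg(u, j, x_k, y_grid):
    """Left-inverse on a grid: smallest grid point y with F_{Y_j}^k(y) >= u."""
    for y in y_grid:
        if f_marg(y, j, x_k) >= u:
            return y
    return y_grid[-1]

x_by_group = {0: [0.0, 1.0, 2.0], 1: [1.0, 2.0, 3.0]}  # toy covariate distributions
y_grid = [i / 10 for i in range(-60, 121)]

q00 = q_marg(0.5, 0, x_by_group[0], y_grid)  # status quo median
q01 = q_marg(0.5, 0, x_by_group[1], y_grid)  # covariates of group 1, conditional model of group 0
q10 = q_marg(0.5, 1, x_by_group[0], y_grid)  # conditional model of group 1, covariates of group 0

print("median effect of changing covariates:", q01 - q00)
print("median effect of changing the conditional model:", q10 - q00)
```

The two printed numbers correspond to the two examples in the text: the first holds $F_{Y_0}(y|x)$ fixed and moves the covariates, the second holds $F_{X_0}$ fixed and moves the conditional distribution.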
Other parameters of interest include, for example, Lorenz curves of the observed and counterfactual outcomes. Lorenz curves, commonly used to measure inequality, are ratios of partial means to overall means

$$L(y, F_{Y_j}^k) = \int_0^y t\, dF_{Y_j}^k(t) \Big/ \int_0^\infty t\, dF_{Y_j}^k(t),$$

defined for non-negative outcomes only. More generally, we might be interested in arbitrary functionals of the marginal distributions of the outcome before and after the interventions

$$H_Y(y) := \phi(y, F_{Y_0}^0, F_{Y_1}^1, F_{Y_0}^1, F_{Y_1}^0). \tag{2.5}$$

These functionals include the previous examples as special cases as well as other examples such as means, with $H_Y(y) = \int t\, dF_{Y_j}^k(t) =: \mu_{Y_j}^k$; mean policy effects, with $H_Y(y) = \mu_{Y_j}^k - \mu_{Y_0}^0$; variances, with $H_Y(y) = \int t^2\, dF_{Y_j}^k(t) - (\mu_{Y_j}^k)^2 =: (\sigma_{Y_j}^k)^2$; variance policy effects, with $H_Y(y) = (\sigma_{Y_j}^k)^2 - (\sigma_{Y_0}^0)^2$; Lorenz policy effects, with $H_Y(y) = L(y, F_{Y_j}^k) - L(y, F_{Y_0}^0) =: LE_{Y_j}^k(y)$; Gini coefficients, with $H_Y(y) = 1 - 2\int L(y, F_{Y_j}^k)\, dy =: G_{Y_j}^k$; and Gini policy effects, with $H_Y(y) = G_{Y_j}^k - G_{Y_0}^0 =: GE_{Y_j}^k$.
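To make the Lorenz and Gini functionals concrete, the sketch below evaluates them for two small discrete distributions that stand in for the status quo and counterfactual outcome distributions. The wage values are hypothetical. The Gini coefficient is computed by integrating the Lorenz curve over population shares, which is the usual convention and is an assumption here about how the integral in the text is parameterized.

```python
def lorenz(y, sample):
    """L(y, F): share of the total outcome held by units with outcome <= y."""
    total = sum(sample)
    return sum(t for t in sample if t <= y) / total

def gini(sample):
    """1 - 2 * (area under the Lorenz curve in population-share coordinates)."""
    s = sorted(sample)
    n, total = len(s), sum(s)
    area, cum_prev = 0.0, 0.0
    for t in s:
        cum = cum_prev + t / total
        area += (cum_prev + cum) / (2 * n)  # trapezoid of width 1/n
        cum_prev = cum
    return 1.0 - 2.0 * area

wages_status_quo = [1.0, 2.0, 3.0, 4.0]        # equally weighted support points (hypothetical)
wages_counterfactual = [2.0, 2.0, 3.0, 3.0]    # a more compressed distribution (hypothetical)

lorenz_effect = lorenz(2.0, wages_counterfactual) - lorenz(2.0, wages_status_quo)
gini_effect = gini(wages_counterfactual) - gini(wages_status_quo)
print("Lorenz policy effect at y = 2:", lorenz_effect)
print("Gini policy effect:", gini_effect)
```

Both functionals depend on the data only through the marginal distribution, which is why they fit the general form (2.5).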
In the case where the policy consists of either a known transformation of the covariates, $X_1 = g(X_0)$, or a change in the conditional distribution of $Y$ given $X$, we can also identify the distribution and quantile functions for the effect of the policy, $\Delta_j^k = Y_j^k - Y_0^0$, by:

$$F_{\Delta_j}^k(\delta) = \int_{\mathcal{X}} \int_0^1 1\{Q_{\Delta_j}(u|x) \le \delta\}\, du\, dF_{X_0}(x), \quad j,k \in \{0,1\}, \tag{2.6}$$

where $Q_{\Delta_0}(u|x) = Q_{Y_0}(u|g(x)) - Q_{Y_0}(u|x)$ and $Q_{\Delta_1}(u|x) = Q_{Y_1}(u|x) - Q_{Y_0}(u|x)$; and

$$Q_{\Delta_j}^k(u) = \inf\{\delta : F_{\Delta_j}^k(\delta) \ge u\}, \quad j,k \in \{0,1\}, \tag{2.7}$$

under the additional assumption (Heckman, Smith, and Clements, 1997):

Condition RP. Conditional rank preservation: $U_0^1 = U_0^0 \mid X_0$ and $U_1^0 = U_0^0 \mid X_0$.

2.3. Conditional models. The preceding analysis shows that the marginal distribution and quantile functions of interest depend on either the underlying conditional quantile function or conditional distribution function. Thus, we can proceed by modeling and estimating either of these conditional functions. We can rely on several principal approaches to carrying out these tasks. In this section we drop the dependence on the group index to simplify the notation.
Example 1. Classical regression and generalizations. Classical regression is one of the principal approaches to modeling and estimating conditional quantile functions. The classical location-shift model takes the form

$$Y = m(X) + V, \quad V = Q_V(U), \tag{2.8}$$

where $U \sim U(0,1)$ is independent of $X$, and $m(\cdot)$ is a location function such as the conditional mean. The disturbance $V$ has the quantile function $Q_V(u)$, and $Y$ therefore has conditional quantile function $Q_Y(u|x) = m(x) + Q_V(u)$. This model is parsimonious in that covariates impact the outcome only through the location. Even though this is a location model, it is clear that a general change in the distribution of covariates or the conditional quantile function can have heterogeneous effects on the entire marginal distribution of $Y$, affecting its various quantiles in a differential manner. The most common model for the regression function $m(x)$ is linear in parameters, $m(x) = x'\beta$, and we can estimate it using least squares or instrumental variable methods. We can leave the quantile function $Q_V(u)$ unrestricted and estimate it using the empirical quantile function of the residuals. Our results cover such common estimation schemes as special cases, since we only require the estimates to satisfy a functional central limit theorem.

^4 In the rest of the discussion we keep the distribution, quantile, quantile policy effects, and distribution policy effects functions as separate cases to emphasize the importance of these functionals in practice. Lorenz curves are special cases of the general functional with $H_Y(y) = \int_0^y t\, dF_{Y_j}^k(t) / \int_0^\infty t\, dF_{Y_j}^k(t)$, and will not be considered separately.

The location model has played a classical role in regression analysis. Many endogenous and exogenous treatment effects models, for example, can be analyzed and estimated using variations of this model (Cameron and Trivedi, 2005 Chap. 25, and Imbens and Wooldridge, 2008). A variety of standard survival and duration models also imply (2.8) after a transformation such as the Cox model with Weibull hazard or accelerated failure time model, cf. Doksum and Gasko (1990).

The location-scale shift model is a generalization that enables the covariates to impact the conditional distribution through the scale function as well:

$$Y = m(X) + \sigma(X) \cdot V, \quad V = Q_V(U),$$

where $U \sim U(0,1)$ independently of $X$, and $\sigma(\cdot)$ is a positive scale function. In this model the conditional quantile function takes the form $Q_Y(u|x) = m(x) + \sigma(x) Q_V(u)$. It is clear that changes in the distribution of $X$ or in $Q_Y(u|x)$ can have a nontrivial effect on the entire marginal distribution of $Y$, affecting its various quantiles in a differential manner. This model can be estimated through a variety of means (see, e.g., Rutemiller and Bowers, 1968, and Koenker and Xiao, 2002).
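The estimation scheme described for the location-shift model (2.8) can be sketched in a few lines: fit $m(x) = x'\beta$ by least squares, then take the empirical quantile function of the residuals as the estimate of $Q_V(u)$. The simulated design below (one covariate, intercept 1, slope 2, standard normal disturbance, 5000 observations) is hypothetical and serves only to show the mechanics.

```python
import math
import random

def ols(x, y):
    """Least squares fit of y on (1, x); returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def empirical_quantile(sample, u):
    """Left-inverse of the empirical cdf: smallest v with F_n(v) >= u."""
    s = sorted(sample)
    k = max(0, math.ceil(u * len(s)) - 1)
    return s[k]

rng = random.Random(1)
x = [rng.uniform(0.0, 2.0) for _ in range(5000)]
y = [1.0 + 2.0 * a + rng.gauss(0.0, 1.0) for a in x]   # simulated location-shift data

intercept, slope = ols(x, y)
residuals = [b - intercept - slope * a for a, b in zip(x, y)]

def q_hat(u, x_value):
    """Estimated conditional quantile: fitted location plus residual quantile."""
    return intercept + slope * x_value + empirical_quantile(residuals, u)

print("estimated conditional median at x = 1:", q_hat(0.5, 1.0))
print("estimated 10-90 spread at x = 1:", q_hat(0.9, 1.0) - q_hat(0.1, 1.0))
```

Because covariates enter only through the location, the estimated spread between any two conditional quantiles is the same at every value of $x$; the location-scale model relaxes exactly this restriction.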
Example 2. Quantile regression. We can also rely on quantile regression as a principal approach to modeling and estimating conditional quantile functions. In this approach, we have the general non-separable representation

$$Y = Q_Y(U|X).$$

The model permits covariates to impact the outcome by changing not only the location and scale of the distribution but also its entire shape. An early convincing example of such effects goes back to Doksum (1974), who showed that real data can be sharply inconsistent with the location-scale shift paradigm. Quantile regression precisely addresses this issue. The leading approach to quantile regression entails approximating the conditional quantile function by a linear form $Q_Y(u|x) = x'\beta(u)$.^5 Koenker (2005) provides an excellent review of this method.

Quantile regression allows researchers to fit parsimonious models to the entire conditional distribution. It has become an increasingly important empirical tool in applied economics. In labor economics, for example, quantile regression has been widely used to model changes in the wage distribution (Buchinsky, 1994, Chamberlain, 1994, Abadie, 1997, Gosling, Machin, and Meghir, 2000, Machado and Mata, 2005, Angrist, Chernozhukov, and Fernandez-Val, 2006, and Autor, Katz, and Kearney, 2006b). Variations of quantile regression can be used to obtain quantile and distribution treatment effects in endogenous and exogenous treatment effects models (Abadie, Angrist, and Imbens, 2002, Chernozhukov and Hansen, 2005, and Firpo, 2007).
Example 3. Duration regression. A common way to model conditional distribution functions in duration and survival analysis is through the transformation model:

$$F_Y(y|x) = 1 - \exp(-\exp(m(x) + t(y))), \tag{2.9}$$

where $t(\cdot)$ is a monotonic transformation. This model is rather rich, yet the role of covariates is limited in an important way. In particular, the model leads to the following location-shift representation:

$$t(Y) = m(X) + V,$$

where $V$ has an extreme value distribution and is independent of $X$. Therefore, covariates impact a monotone transformation of the outcome only through the location function. The estimation of this model is the subject of a large and important literature (e.g., Lancaster, 1990, Donald, Green, and Paarsch, 2000, and Dabrowska, 2005).

Example 4. Distribution regression. Instead of restricting attention to transformation models for the conditional distribution, we can consider directly modeling $F_Y(y|x)$ separately for each threshold $y$. An example is the model

$$F_Y(y|x) = \Lambda(m(y, x)),$$

where $\Lambda$ is a known link function and $m(y, x)$ is unrestricted in $y$. This specification includes the previous example as a special case (put $\Lambda(v) = 1 - \exp(-\exp(v))$ and $m(y, x) = m(x) + t(y)$) and allows for more flexible effect of the covariates. The leading example of this specification would be a probit or logit link function $\Lambda$ and $m(y, x) = x'\beta(y)$, where $\beta(y)$ is an unknown function in $y$ (Han and Hausman, 1990, and Foresi and Peracchi, 1995). This approach is similar in spirit to quantile regression. In particular, as quantile regression, this approach leads to the specification $Y = Q_Y(U|X) = m^{-1}(\Lambda^{-1}(U), X)$ where $U \sim U(0,1)$ independently of $X$.

^5 Throughout, by "linear" we mean specifications that are linear in the parameters but could be highly non-linear in the original covariates; that is, if the original covariate is $X$, then the conditional quantile function takes the form $z'\beta(u)$ where $z = f(x)$.

2.4. Policy estimators and inference questions. All of the preceding approaches generate estimates $\hat F_{Y_j}(y|x)$, $j \in \{0, 1\}$, of the conditional distribution functions either directly or indirectly using the relation (2.3):

$$\hat F_{Y_j}(y|x) = \int_0^1 1\{\hat Q_{Y_j}(u|x) \le y\}\, du, \quad j \in \{0,1\},$$

where $\hat Q_{Y_j}(u|x)$ is a given estimate of the conditional quantile function.

We then estimate the marginal distribution functions and quantile functions for the outcome by

$$\hat F_{Y_j}^k(y) = \int_{\mathcal{X}} \hat F_{Y_j}(y|x)\, d\hat F_{X_k}(x) \quad \text{and} \quad \hat Q_{Y_j}^k(u) = \inf\{y : \hat F_{Y_j}^k(y) \ge u\}, \tag{2.10}$$

respectively, for $j,k \in \{0, 1\}$. We estimate the quantile and distribution policy effects by

$$\widehat{QE}_{Y_j}^k(u) = \hat Q_{Y_j}^k(u) - \hat Q_{Y_0}^0(u) \quad \text{and} \quad \widehat{DE}_{Y_j}^k(y) = \hat F_{Y_j}^k(y) - \hat F_{Y_0}^0(y).$$

We estimate the general functional introduced in (2.5) similarly, using the plug-in rule:

$$\hat H_Y(y) = \phi(y, \hat F_{Y_0}^0, \hat F_{Y_1}^1, \hat F_{Y_0}^1, \hat F_{Y_1}^0). \tag{2.11}$$

For example, in this way we can construct estimates of the distribution and quantiles of the effects defined in (2.6) and (2.7).
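The plug-in estimators can be illustrated with the simplest possible conditional model. The sketch below assumes a single binary covariate and estimates $F_{Y_0}(y|x)$ by the empirical distribution function within each covariate cell, which is a saturated special case chosen for brevity and not one of the regression models emphasized above. The tiny data set and the function names are hypothetical.

```python
def ecdf(sample, y):
    """Empirical distribution function of a sample at y."""
    return sum(v <= y for v in sample) / len(sample)

def cond_cdf(data_j, y, x):
    """Estimate of F_{Y_j}(y|x): empirical cdf of outcomes in covariate cell x."""
    cell = [yy for xx, yy in data_j if xx == x]
    return ecdf(cell, y)  # fails if the cell is empty, i.e. if x lies outside the support

def marginal_cdf(data_j, x_k, y):
    """Plug-in marginal cdf: average the conditional cdf over the covariate sample of group k."""
    return sum(cond_cdf(data_j, y, x) for x in x_k) / len(x_k)

def marginal_quantile(data_j, x_k, u):
    """Smallest observed outcome y with estimated marginal cdf >= u."""
    for y in sorted(yy for _, yy in data_j):
        if marginal_cdf(data_j, x_k, y) >= u:
            return y
    raise ValueError("u must not exceed 1")

data0 = [(0, 1.0), (0, 2.0), (1, 3.0), (1, 4.0)]  # (x, y) pairs observed in group 0
x0 = [x for x, _ in data0]                        # covariate sample of group 0
x1 = [1, 1, 1, 0]                                 # covariate sample of group 1

qe_median = marginal_quantile(data0, x1, 0.5) - marginal_quantile(data0, x0, 0.5)
de_at_3 = marginal_cdf(data0, x1, 3.0) - marginal_cdf(data0, x0, 3.0)
print("quantile policy effect at u = 0.5:", qe_median)
print("distribution policy effect at y = 3:", de_at_3)
```

Replacing `cond_cdf` with a fitted quantile, duration, or distribution regression leaves the remaining steps unchanged, which is the sense in which the plug-in rule is generic.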
Common inference questions that arise in policy analysis involve features of the distribution of the outcome before and after the intervention. For example, we might be interested in the average effect of the policy, or in quantile policy effects at several quantiles to measure the impact of the policy on different parts of the outcome distribution. More generally, in this analysis many questions of interest involve the entire distribution or quantile functions of the outcome. Examples include the hypotheses that the policy has no effect, that the effect is constant, or that it is positive for the entire distribution (McFadden, 1989, Barrett and Donald, 2003, Koenker and Xiao, 2002, Linton, Maasoumi, and Whang, 2005). The statistical problem is to account for the sampling variability in the estimation of the conditional model to make inference on the functionals of interest. Section 3 provides limit distribution theory for the policy estimators. This theory applies to the entire marginal distribution and quantile functions of the outcome before and after the policy, and therefore is valid for performing either uniform inference about the entire distribution function, quantile function, or other functionals of interest, or pointwise inference about values of these functions at a specific point.
2.5. Alternative approaches. An alternative way to proceed with policy analysis is to use reweighting methods (DiNardo, 2002). Indeed, under Condition M, we can express the marginal distribution of the counterfactual outcome in (2.4) as
$$F^k_{Y_j}(y) = \int_{\mathcal{Y}} \int_{\mathcal{X}} 1\{\tilde y \le y\}\, w^k_j(x)\, dF_{Y_j}(\tilde y|x)\, dF_{X_j}(x), \quad j,k \in \{0,1\}, \tag{2.12}$$
where, for $k \ne j$,
$$w^k_j(x) = \frac{f_{X_k}(x)}{f_{X_j}(x)} = \frac{p_j\,(1 - p_j(x))}{(1 - p_j)\, p_j(x)},$$
$p_j(x) := \Pr\{J = j \mid X = x\}$ is the propensity score, $p_j = \Pr\{J = j\}$, $J$ is an indicator for group $j$, $f_{X_j}$ is the density of the covariate given $J = j$, and $\mathcal{Y}$ is the support of $Y$. The second form of the weighting function $w_j$ follows from Bayes' rule, since $f_{X_j}(x) = p_j(x) f_X(x)/p_j$ and $f_{X_k}(x) = (1 - p_j(x)) f_X(x)/(1 - p_j)$.

We can use the expression (2.12) along with either density or propensity score weighting to construct policy estimators. Firpo (2007) used a similar propensity score reweighting approach to derive efficient estimators of quantile effects in treatment effect models.^ With some work, one can adapt the nice results of Firpo (2007) to obtain the results needed to perform pointwise inference, namely, inference on quantile policy effects at a specific point. However, we need to do more work to develop the results needed to perform uniform inference on the entire quantile or distribution function. We are carrying out such work in a companion paper.
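To make the reweighting idea concrete, the following minimal sketch computes a reweighted empirical distribution function of the group-$j$ outcomes. It is an illustration under stated assumptions rather than an estimator proposed in the text: the propensity score values are taken as given (in practice they would be estimated), the weights are normalized to sum to one (a design choice made here for finite-sample stability), and the function names are hypothetical.

```python
import numpy as np

def reweighting_weights(p_x, p):
    """Weights f_{X_k}(x) / f_{X_j}(x) for k != j, obtained from Bayes' rule.

    p_x : array of propensity scores Pr{J = j | X = x_i} for group-j units
    p   : scalar Pr{J = j}
    """
    return p * (1.0 - p_x) / ((1.0 - p) * p_x)

def reweighted_cdf(y_obs, w, y_grid):
    """Weighted empirical CDF of group-j outcomes evaluated on y_grid.
    The weights are normalized to sum to one (an assumption of this sketch)."""
    w = w / w.sum()
    return (w[:, None] * (y_obs[:, None] <= y_grid[None, :])).sum(axis=0)
```

With a binary covariate that equals one with probability 0.2 in group $j$ and 0.8 in group $k$, and equally sized groups, the weights are 4 for $x = 1$ and 0.25 for $x = 0$, which are exactly the ratios of the two covariate probabilities.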
In a recent important development, Firpo, Fortin, and Lemieux (2007) propose an alternative useful procedure to estimate policy effects of changes in the distribution of $X$. Given a functional of interest $\phi$, they use a first order approximation of the policy effect:
$$\phi(F^1_{Y_0}) - \phi(F^0_{Y_0}) = \phi'(F^1_{Y_0} - F^0_{Y_0}) + R(F^1_{Y_0}, F^0_{Y_0}),$$
where $\phi'(F^1_{Y_0} - F^0_{Y_0}) = \int a(y, F^0_{Y_0})\, d(F^1_{Y_0}(y) - F^0_{Y_0}(y))$ is the first order linear approximation term, where function $a$ is the influence or the score function, and $R(F^1_{Y_0}, F^0_{Y_0})$ is the remaining approximation error. In the context of our problem, this approximation error is generally not equal to zero and does not vanish with the sample size. Firpo, Fortin, and Lemieux (2007) propose a practical mean regression method to estimate the first order term $\phi'(F^1_{Y_0} - F^0_{Y_0})$; this method cleverly exploits the law of iterated expectations and the linearity of this term in the distributions. In contrast to our approach, the estimand of this method is an approximation to the policy effect with a non-vanishing approximation error, whereas we directly estimate the exact effect $\phi(F^1_{Y_0}) - \phi(F^0_{Y_0})$ without approximation error.

^ See Angrist and Pischke (2008) for a detailed review of propensity score methods and a comparison to regression methods in the context of treatment effect models. The pros and cons of these two methods are also likely to apply to policy analysis. In this paper we focus on the regression method.
3. Limit Distribution and Inference Theory for Policy Estimators

In this section we provide a set of simple, general sufficient conditions that facilitate inference in large samples. We design the conditions to cover the principal practical approaches and to help us think about what is needed for various approaches to work. Even though the conditions are reasonably general, they do not exhaust all scenarios under which the main inferential methods will be valid.

3.1. Conditions on estimators of the conditional distribution and quantile functions. We provide general assumptions about the estimators of the conditional quantile or distribution function, which allow us to derive the limit distribution for the policy estimators constructed from them. These assumptions hold for commonly used parametric and semiparametric estimators of conditional distribution and quantile functions, such as classical, quantile, duration, and distribution regressions.

We begin the analysis by stating regularity conditions for estimators of conditional quantile functions, such as classical or quantile regression. In the sequel, let $\ell^\infty((0,1) \times \mathcal{X})$ denote the space of bounded functions mapping from $(0,1) \times \mathcal{X}$ to $\mathbb{R}$, equipped with the uniform metric. We assume we have a sample $\{(X_i, Y_i), i = 1, \ldots, n\}$ of size $n$ for the outcome and covariates before the policy intervention. In this sample $n_0 = n/\lambda_0$ observations come from group 0 and $n_1 = n/\lambda_1$ observations come from group 1. In what follows we use $\Rightarrow$ to denote weak convergence.

Condition C. The conditional density $f_{Y_j}(y|x)$ of the outcome given covariates exists, and is continuous and bounded above and away from zero, uniformly on $y \in \mathcal{Y}$ and $x \in \mathcal{X}$, where $\mathcal{Y}$ is a compact subset of $\mathbb{R}$, for $j \in \{0,1\}$.

Condition Q. The estimators $(u,x) \mapsto \hat Q_{Y_j}(u|x)$ of the conditional quantile functions $(u,x) \mapsto Q_{Y_j}(u|x)$ of outcome given covariates jointly converge in law to continuous Gaussian processes:
$$\sqrt{n}\left(\hat Q_{Y_j}(u|x) - Q_{Y_j}(u|x)\right) \Rightarrow \sqrt{\lambda_j}\, V_j(u,x), \quad j \in \{0,1\}, \tag{3.1}$$
in $\ell^\infty((0,1) \times \mathcal{X})$, where $(u,x) \mapsto V_j(u,x)$, $j \in \{0,1\}$, have zero mean and covariance function $\Sigma_{V_{jr}}(u,x,\tilde u,\tilde x) := E[V_j(u,x) V_r(\tilde u,\tilde x)]$, for $j,r \in \{0,1\}$.

These conditions appear reasonable in practice when the outcome is continuous. If the outcome is discrete, the conditions C and Q do not hold. However, in this case we can use the distribution approach discussed below. Conditions C and Q focus on the case where the outcome has a compact support with a density bounded away from zero, which is a reasonable first case to analyze in detail. Condition Q applies to the most common estimators of conditional quantile functions under suitable regularity conditions (Doss and Gill, 1992, Gutenbrunner and Jureckova, 1992, Angrist, Chernozhukov, and Fernandez-Val, 2006, and Appendix F). Conditions C and Q could be extended to include other cases, without affecting subsequent results. For instance, given the set $\mathcal{Y}$ in Condition C over which we want to estimate the counterfactual distribution, Condition Q needs only to hold over a smaller region $\mathcal{UX} = \{(u,x) \in (0,1) \times \mathcal{X} : Q_Y(u|x) \in \mathcal{Y}\} \subset (0,1) \times \mathcal{X}$, which leads to a less restrictive convergence requirement, without affecting any subsequent results. The joint convergence holds trivially if the samples for each group are mutually independent.

We next state regularity conditions for estimators of conditional distribution functions, such as duration or distribution regressions. Let $\ell^\infty(\mathcal{Y} \times \mathcal{X})$ denote the space of bounded functions mapping from $\mathcal{Y} \times \mathcal{X}$ to $\mathbb{R}$, equipped with the uniform metric, where $\mathcal{Y}$ is a compact subset of $\mathbb{R}$.

Condition D. The estimators $(y,x) \mapsto \hat F_{Y_j}(y|x)$ of the conditional distribution functions $(y,x) \mapsto F_{Y_j}(y|x)$ of the outcome given covariates converge jointly in law to continuous Gaussian processes:
$$\sqrt{n}\left(\hat F_{Y_j}(y|x) - F_{Y_j}(y|x)\right) \Rightarrow \sqrt{\lambda_j}\, Z_j(y,x), \quad j \in \{0,1\}, \tag{3.2}$$
in $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where $(y,x) \mapsto Z_j(y,x)$, $j \in \{0,1\}$, have zero mean and covariance function $\Sigma_{Z_{jr}}(y,x,\tilde y,\tilde x) := E[Z_j(y,x) Z_r(\tilde y,\tilde x)]$, for $j,r \in \{0,1\}$.

This condition holds for common estimators of conditional distribution functions (Beran, 1977, Burr and Doss, 1993, and Appendix F). These estimators, however, might produce estimates that are not monotonic in the level of the outcome $y$ (Foresi and Peracchi, 1995, and Hall, Wolff, and Yao, 1999). A way to avoid this problem and to improve the finite sample properties of the conditional distribution estimators is by rearranging the estimates (Chernozhukov, Fernandez-Val, and Galichon, 2006). The joint convergence holds trivially if the samples for each group are mutually independent.
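As a brief illustration of the rearrangement idea mentioned above, the sketch below sorts estimated conditional distribution values along a grid of outcome levels, which is the discrete analogue of monotone rearrangement on that grid. It is a minimal example under the assumption that the estimates are available on a common, increasing grid of $y$ values; the function name is hypothetical.

```python
import numpy as np

def rearrange_cdf(cdf_vals):
    """Monotone rearrangement on a grid.

    cdf_vals : array whose last axis indexes an increasing grid of y values.
    Sorting along that axis returns the same set of estimated values in
    nondecreasing order, so the result is monotone in y.
    """
    return np.sort(cdf_vals, axis=-1)
```

For instance, the non-monotone sequence 0.1, 0.3, 0.25, 0.6, 0.55, 0.9 is mapped to 0.1, 0.25, 0.3, 0.55, 0.6, 0.9.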
If we start from a conditional quantile estimator $\hat Q_{Y_j}(u|x)$, we can define the conditional distribution function estimator $\hat F_{Y_j}(y|x)$ using the relation (2.10). It turns out that if the original quantile estimator satisfies conditions C and Q, then the resulting conditional distribution estimator satisfies condition D. This result allows us to give a unified treatment of the policy estimators based on either quantile or distribution estimators.

Lemma 1. Under conditions C and Q, the estimators of the conditional distribution function defined by (2.10) satisfy the condition D with
$$Z_j(y,x) = -f_{Y_j}(y|x)\, V_j(F_{Y_j}(y|x), x), \quad j \in \{0,1\}.$$
3.2. Examples of Conditional Estimators. Here we verify that the principal estimators of conditional distribution and quantile functions satisfy the functional central limit theorem, which we required to hold in our main Conditions D and Q. In this section we drop the dependence on the group index to simplify the notation.
Example 1 continued. Classical regression. Consider the classical linear regression model $Y = X'\beta_0 + V$, where the disturbance $V$ is independent of $X$ and has mean zero, finite variance and quantile function $\alpha_0(u)$. In this case, we can estimate $\beta_0$ by mean regression and quantiles of $V$ by the empirical quantile function of the residuals. We show in Appendix F that the resulting estimator $\hat\theta(u) = (\hat\alpha(u), \hat\beta')'$ of $\theta_0(u) = (\alpha_0(u), \beta_0')'$ obeys a functional central limit theorem $\sqrt{n}(\hat\theta(u) - \theta_0(u)) \Rightarrow G_0(u)^{-1} Z(u)$, where $Z$ is a zero mean Gaussian process with covariance function $\Omega(u,\tilde u)$ specified in (F.6) and matrix $G_0(u) := G(\alpha_0(u), \beta_0, u)$ specified in (F.5). The resulting estimator, $\hat Q_Y(u|x) = \hat\alpha(u) + x'\hat\beta$, of the conditional quantile function $Q_Y(u|x)$ obeys a functional central limit theorem,
$$\sqrt{n}\left(\hat Q_Y(u|x) - Q_Y(u|x)\right) \Rightarrow (1, x')\, G_0(u)^{-1} Z(u) =: V(u,x),$$
in $\ell^\infty((0,1) \times \mathcal{X})$, where $V(u,x)$ is a zero mean Gaussian process with covariance function,
$$\Sigma_V(u,x,\tilde u,\tilde x) = (1, x')\, G_0(u)^{-1}\, \Omega(u,\tilde u)\, [G_0(\tilde u)^{-1}]'\, (1, \tilde x')'.$$

Example 2 continued. Quantile regression. Consider a linear quantile regression model where $Q_Y(u|x) = x'\beta_0(u)$. In Appendix F we show the canonical quantile regression estimator satisfies a functional central limit theorem,
$$\sqrt{n}(\hat\beta(u) - \beta_0(u)) \Rightarrow G_0(u)^{-1} Z(u),$$
where $Z(u)$ is a zero mean Gaussian process with covariance function $\Omega(u,\tilde u) = \{\min(u,\tilde u) - u\tilde u\}\, E[XX']$ and $G_0(u) := G(\beta_0(u), u) = -E[f_Y(X'\beta_0(u)|X)\, XX']$. The estimator of the conditional quantile function also obeys a functional central limit theorem,
$$\sqrt{n}\left(\hat Q_Y(u|x) - Q_Y(u|x)\right) = \sqrt{n}\left(x'\hat\beta(u) - x'\beta_0(u)\right) \Rightarrow x' G_0(u)^{-1} Z(u) =: V(u,x),$$
in $\ell^\infty((0,1) \times \mathcal{X})$, where $V(u,x)$ is a zero mean Gaussian process with covariance function given by:
$$\Sigma_V(u,x,\tilde u,\tilde x) = x' G_0(u)^{-1}\, \Omega(u,\tilde u)\, G_0(\tilde u)^{-1}\, \tilde x.$$

Example 3 continued. Duration regression. Consider the transformation model for the conditional distribution function stated in equation (2.9). A common duration model that gives rise to this specification is the proportional hazard model of Cox (1972), where the conditional hazard rate of an individual with covariate vector $x$ is $\lambda_Y(y|x) = \lambda_0(y) \exp(x'\beta_0)$, $\beta_0$ is a $p$-vector of regression coefficients, $\lambda_0$ is the nonnegative baseline hazard rate function, and $y \in \mathcal{Y} = [0, \bar y]$ for some maximum duration $\bar y$. Let $\Lambda_0(y) = \int_0^y \lambda_0(\tilde y)\, d\tilde y$ denote the integrated baseline hazard function. Then $F_Y(y|x) = 1 - \exp\{-\exp(x'\beta_0 + \ln \Lambda_0(y))\}$, delivering the transformation model (2.9) with $t(y) = \ln \Lambda_0(y)$ and $m(x) = x'\beta_0$.

In order to discuss estimation, let us assume i.i.d. sampling of $(Y_i, X_i)$ without censoring. Then Cox's (1972) partial maximum likelihood estimator of $\beta_0$ takes the form
$$\hat\beta = \arg\max_{\beta} \int_0^{\bar y} \sum_{i=1}^n \log\left\{ \frac{J_i(y) \exp(x_i'\beta)}{\sum_{j=1}^n J_j(y) \exp(x_j'\beta)} \right\} dN_i(y),$$
and the Breslow-Nelson-Aalen estimator of $\Lambda_0$ takes the form
$$\hat\Lambda(y) = \int_0^y \left\{ \sum_{j=1}^n J_j(\tilde y) \exp(x_j'\hat\beta) \right\}^{-1} d\left\{ \sum_{i=1}^n N_i(\tilde y) \right\},$$
where $N_i(y) := 1\{Y_i \le y\}$ and $J_i(y) := 1\{Y_i \ge y\}$, $y \in \mathcal{Y}$; see Breslow (1972, 1974).

Let $W$ denote a standard Brownian motion on $\mathcal{Y}$ and let $Z$ denote an independent $p$-dimensional standard normal vector. Andersen and Gill (1982) show that
$$\sqrt{n}\left(\hat\beta - \beta_0,\ \hat\Lambda(y) - \Lambda_0(y)\right) \Rightarrow \left(\Sigma^{-1/2} Z,\ W(a(y)) - b(y)'\Sigma^{-1/2} Z\right)$$
in $\mathbb{R}^p \times \ell^\infty(\mathcal{Y})$, with the terms $a(y)$, $b(y)$, and $\Sigma$, and regularity conditions defined in Andersen and Gill (1982) and Burr and Doss (1993). Let $\hat F_Y(y|x) = 1 - \exp\{-\exp(x'\hat\beta + \log \hat\Lambda(y))\}$ be the estimator of $F_Y(y|x)$. Since $F_Y(y|x)$ is Hadamard-differentiable in $(\beta, \Lambda)$, by the functional delta method we have the functional central limit theorem
$$\sqrt{n}\left(\hat F_Y(y|x) - F_Y(y|x)\right) \Rightarrow \{1 - F_Y(y|x)\}\left\{\exp(x'\beta_0)\, W(a(y)) + b(y,x)'\Sigma^{-1/2} Z\right\} =: Z(y,x),$$
in $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where $b(y,x) = \Lambda_Y(y|x)\, x - \exp(x'\beta_0)\, b(y)$, and $Z(y,x)$ is a zero mean Gaussian process with covariance function, for $y \le \tilde y$,
$$\Sigma_Z(y,x,\tilde y,\tilde x) = \{1 - F_Y(y|x)\}\{1 - F_Y(\tilde y|\tilde x)\}\left\{\exp(x'\beta_0)\exp(\tilde x'\beta_0)\, a(y) + b(y,x)'\Sigma^{-1} b(\tilde y,\tilde x)\right\}.$$
In Appendix F we also discuss another estimator of this model.
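The Breslow-Nelson-Aalen estimator and the implied conditional distribution estimator can be computed in a few lines once a coefficient estimate is available. The sketch below is a minimal illustration for uncensored data with no ties; it takes the coefficient vector as an input (computing the partial likelihood maximizer is outside its scope), and the function names are hypothetical.

```python
import numpy as np

def breslow_cumulative_hazard(y, X, beta, y_grid):
    """Breslow-Nelson-Aalen estimate of the integrated baseline hazard.

    y      : (n,) observed durations (no censoring assumed)
    X      : (n, p) covariates
    beta   : (p,) coefficient vector, assumed to be estimated elsewhere
    y_grid : points at which to evaluate the cumulative hazard
    """
    risk = np.exp(X @ beta)
    # For each event time Y_i, sum exp(x_j'beta) over units still at risk (Y_j >= Y_i).
    at_risk_sum = np.array([risk[y >= yi].sum() for yi in y])
    jumps = 1.0 / at_risk_sum
    # The estimator accumulates the jumps of all events occurring up to t.
    return np.array([jumps[y <= t].sum() for t in y_grid])

def cox_conditional_cdf(cum_hazard, x, beta):
    """F(y|x) = 1 - exp{-exp(x'beta) * Lambda(y)} evaluated on the grid."""
    return 1.0 - np.exp(-np.exp(x @ beta) * cum_hazard)
```

With a zero coefficient vector the estimator reduces to the Nelson-Aalen estimator: for durations 1, 2, 3, 4 the cumulative hazard at $y = 2$ is $1/4 + 1/3$.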
Example 4 continued. Distribution regression. Consider the model $F_Y(y|x) = \Lambda(x'\beta_0(y))$ for the conditional distribution function, where $\Lambda$ is a known link function, such as the logistic or normal distribution. We can estimate the function $\beta_0(y)$ by applying maximum likelihood to the indicator variables $1\{Y \le y\}$ for each value of $y \in \mathcal{Y}$ separately. In Appendix F, we prove that the resulting estimator $\hat\beta(y)$ of $\beta_0(y)$ obeys a functional central limit theorem
$$\sqrt{n}(\hat\beta(y) - \beta_0(y)) \Rightarrow -G_0(y)^{-1} Z(y),$$
where $G_0(y) := G(\beta_0(y), y) = E\left[\lambda[X'\beta_0(y)]^2 XX' / \{\Lambda[X'\beta_0(y)](1 - \Lambda[X'\beta_0(y)])\}\right]$, $\lambda$ is the derivative of $\Lambda$, and $Z(y)$ is a zero mean Gaussian process with covariance function, for $\tilde y \ge y$,
$$\Omega(y,\tilde y) = E\left[XX'\, \lambda[X'\beta_0(y)]\, \lambda[X'\beta_0(\tilde y)] / \{\Lambda[X'\beta_0(\tilde y)](1 - \Lambda[X'\beta_0(y)])\}\right].$$
Hence the resulting estimator $\hat F_Y(y|x) := \Lambda(x'\hat\beta(y))$ of the conditional distribution function also obeys the functional central limit theorem,
$$\sqrt{n}\left(\hat F_Y(y|x) - F_Y(y|x)\right) \Rightarrow -\lambda[x'\beta_0(y)]\, x' G_0(y)^{-1} Z(y) =: Z(y,x),$$
in $\ell^\infty(\mathcal{Y} \times \mathcal{X})$, where $Z(y,x)$ is a zero mean Gaussian process with covariance function:
$$\Sigma_Z(y,x,\tilde y,\tilde x) = \lambda[x'\beta_0(y)]\, \lambda[\tilde x'\beta_0(\tilde y)]\, x' G_0(y)^{-1}\, \Omega(y,\tilde y)\, G_0(\tilde y)^{-1}\, \tilde x.$$

3.3. Basic principles underlying the limit theory. The derivation of the limit theory for policy estimators relies on several basic principles that allow us to link the properties of the estimators of conditional (quantile and distribution) functions with the properties of estimators of marginal functions. First, although there does not exist a direct connection between conditional and marginal quantiles, we can always switch from conditional quantiles to conditional distributions using Lemma 1, then use the law of iterated expectations to go from conditional distribution to marginal distribution, and finally get to marginal quantiles by inverting. Second, as the functionals of interest depend on the entire conditional function, we must rely on the functional delta method to obtain the limit theory for these functionals as well as to obtain intermediate limit results such as Lemma 1. Since the estimated conditional distributions and quantile functions are usually non-monotone and discontinuous in finite samples, we must use refined forms of the functional delta method.

Accordingly, the key ingredient in the derivation and one of the main theoretical contributions of the paper is the demonstration of the Hadamard differentiability of the functionals of interest with respect to the limit of the conditional processes, tangentially to the subspace of continuous functions. Indeed, we need this refined form of differentiability to deal with our conditional processes, which typically are discontinuous random functions in finite samples yet converge to continuous random functions in large samples. These refined differentiability results in turn enable us to use the functional delta method to derive all of the following limit distribution and inference theory.
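The chain described above (conditional model, then conditional distribution, then marginal distribution, then inversion) can be sketched with distribution regression as the conditional model. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a logistic link, fits each threshold by a plain Newton-Raphson routine written for this example, represents the covariate distribution by a sample, and sorts the averaged estimates so that they are monotone. All function names are hypothetical.

```python
import numpy as np

def logistic(v):
    # Clip the index to avoid overflow in exp for extreme values.
    return 1.0 / (1.0 + np.exp(-np.clip(v, -30.0, 30.0)))

def fit_logit(X, d, n_iter=50, tol=1e-10):
    """Maximum likelihood for Pr(d = 1 | X) = logistic(X'b) by Newton-Raphson."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = logistic(X @ b)
        grad = X.T @ (d - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    return b

def distribution_regression(X, y, y_grid):
    """One logit per threshold: row t holds coefficients for 1{Y <= y_grid[t]}."""
    return np.array([fit_logit(X, (y <= t).astype(float)) for t in y_grid])

def marginal_cdf(betas, X_k):
    """Average the fitted conditional CDFs over the covariate sample X_k
    (law of iterated expectations), then sort along y for monotonicity."""
    cond = logistic(X_k @ betas.T)  # shape (n_k, n_y)
    return np.sort(cond.mean(axis=0))

def left_inverse(cdf_vals, y_grid, u):
    """Marginal quantile: smallest grid point y with F(y) >= u."""
    idx = np.searchsorted(cdf_vals, u, side="left")
    return y_grid[np.minimum(idx, len(y_grid) - 1)]
```

Thresholds should be chosen in the interior of the outcome distribution; at extreme thresholds the indicator is almost constant and the simple Newton routine used here may fail to converge.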
3.4. Limit theory for counterfactual distribution and quantile functions. Our first main result shows that the estimators of the marginal distribution and quantile functions before and after the policy intervention satisfy a functional central limit theorem.

Theorem 1 (Limit distribution for marginal distribution functions). Under Conditions M and D, the estimators $\hat F^k_{Y_j}(y)$ of the marginal distribution functions $F^k_{Y_j}(y)$ jointly converge in law to the following Gaussian processes:
$$\sqrt{n}\left(\hat F^k_{Y_j}(y) - F^k_{Y_j}(y)\right) \Rightarrow \sqrt{\lambda_j} \int_{\mathcal{X}} Z_j(y,x)\, dF_{X_k}(x) =: \sqrt{\lambda_j}\, Z^k_j(y), \quad j,k \in \{0,1\}, \tag{3.3}$$
in $\ell^\infty(\mathcal{Y})$, where $y \mapsto Z^k_j(y)$, $j,k \in \{0,1\}$, have zero mean and covariance function, for $j,k,r,s \in \{0,1\}$,
$$\Sigma^{ks}_{Z_{jr}}(y,\tilde y) := E[Z^k_j(y) Z^s_r(\tilde y)] = \int_{\mathcal{X}} \int_{\mathcal{X}} \Sigma_{Z_{jr}}(y,x,\tilde y,\tilde x)\, dF_{X_k}(x)\, dF_{X_s}(\tilde x). \tag{3.4}$$

Theorem 2 (Limit distribution for marginal quantile functions). Under Conditions M, C, and D, the estimators $\hat Q^k_{Y_j}(u)$ of the marginal quantile functions $Q^k_{Y_j}(u)$ jointly converge in law to the following Gaussian processes:
$$\sqrt{n}\left(\hat Q^k_{Y_j}(u) - Q^k_{Y_j}(u)\right) \Rightarrow -\sqrt{\lambda_j}\, Z^k_j(Q^k_{Y_j}(u)) / f^k_{Y_j}(Q^k_{Y_j}(u)) =: \sqrt{\lambda_j}\, V^k_j(u), \quad j,k \in \{0,1\}, \tag{3.5}$$
in $\ell^\infty((0,1))$, where $f^k_{Y_j}(y) = \int_{\mathcal{X}} f_{Y_j}(y|x)\, dF_{X_k}(x)$, and $u \mapsto V^k_j(u)$, $j,k \in \{0,1\}$, have zero mean and covariance function, for $j,k,r,s \in \{0,1\}$,
$$\Sigma^{ks}_{V_{jr}}(u,\tilde u) := E[V^k_j(u) V^s_r(\tilde u)] = \Sigma^{ks}_{Z_{jr}}(Q^k_{Y_j}(u), Q^s_{Y_r}(\tilde u)) / [f^k_{Y_j}(Q^k_{Y_j}(u))\, f^s_{Y_r}(Q^s_{Y_r}(\tilde u))].$$

Our second main result shows that the estimators of the marginal quantile and distribution policy effects also satisfy a functional central limit theorem.
Corollary 1 (Limit distribution for quantile policy effects). Under Conditions M, C, and D, the estimators of the quantile policy effects converge in law to the following Gaussian processes:
$$\sqrt{n}\left(\widehat{QE}^k_{Y_j}(u) - QE^k_{Y_j}(u)\right) \Rightarrow \sqrt{\lambda_j}\, V^k_j(u) - \sqrt{\lambda_0}\, V^0_0(u) =: W^k_j(u), \quad j,k \in \{0,1\}, \tag{3.6}$$
in the space $\ell^\infty((0,1))$, where the processes $u \mapsto W^k_j(u)$, $j,k \in \{0,1\}$, have zero mean and covariance function $\Sigma^{ks}_{W_{jr}}(u,\tilde u) := E[W^k_j(u) W^s_r(\tilde u)]$, for $j,k,r,s \in \{0,1\}$.

Corollary 2 (Limit distribution for distribution policy effects). Under Conditions M and D the estimators of the distribution policy effects converge in law to the following Gaussian processes:
$$\sqrt{n}\left(\widehat{DE}^k_{Y_j}(y) - DE^k_{Y_j}(y)\right) \Rightarrow \sqrt{\lambda_j}\, Z^k_j(y) - \sqrt{\lambda_0}\, Z^0_0(y) =: S^k_j(y), \quad j,k \in \{0,1\}, \tag{3.7}$$
in the space $\ell^\infty(\mathcal{Y})$, where the processes $y \mapsto S^k_j(y)$, $j,k \in \{0,1\}$, have zero mean and covariance function $\Sigma^{ks}_{S_{jr}}(y,\tilde y) := E[S^k_j(y) S^s_r(\tilde y)]$, for $j,k,r,s \in \{0,1\}$.
Our third main result shows that various functionals of the status quo and counterfactual marginal distribution and quantile functions satisfy a functional central limit theorem.

Corollary 3 (Limit distribution for differentiable functionals). Let $H_Y(y) = \phi(y, F^0_{Y_0}, F^1_{Y_1}, F^1_{Y_0}, F^0_{Y_1})$, a functional taking values in $\ell^\infty(\mathcal{Y})$, be Hadamard differentiable in $(F^0_{Y_0}, F^1_{Y_1}, F^1_{Y_0}, F^0_{Y_1})$ tangentially to the subspace of continuous functions with derivative $(\phi'_{00}, \phi'_{11}, \phi'_{01}, \phi'_{10})$. Then under Conditions M and D the plug-in estimator $\hat H_Y(y)$ defined in (2.11) converges in law to the following Gaussian process:
$$\sqrt{n}\left(\hat H_Y(y) - H_Y(y)\right) \Rightarrow \sum_{j,k \in \{0,1\}} \sqrt{\lambda_j}\, \phi'_{jk}(y, F^0_{Y_0}, F^1_{Y_1}, F^1_{Y_0}, F^0_{Y_1})\, Z^k_j(y) =: T_H(y), \tag{3.8}$$
in $\ell^\infty(\mathcal{Y})$, where $y \mapsto T_H(y)$ has zero mean and covariance function $\Sigma_{T_H}(y,\tilde y) := E[T_H(y) T_H(\tilde y)]$.

Examples of functionals covered by Corollary 3 include function-valued parameters, such as Lorenz curves and Lorenz policy effects, as well as scalar-valued parameters, such as Gini coefficients and Gini policy effects (Barrett and Donald, 2009). These examples also include quantile and distribution functions of the effect of the policy defined under Condition RP; in Appendix C we state the results for these effects separately in order to give them some emphasis.
3.5. Uniform inference and resampling methods. We can readily apply the preceding limit distribution results to perform inference on the distributions and quantiles of the outcome before and after the policy at a specific point. For example, Corollary 1 implies that the quantile policy effect estimator for a given quantile index $u$ is asymptotically normal with mean $QE^k_{Y_j}(u)$ and variance $\Sigma^{kk}_{W_{jj}}(u,u)/n$. We can therefore perform inference on $QE^k_{Y_j}(u)$ for a particular quantile index $u$ using this normal distribution and replacing $\Sigma^{kk}_{W_{jj}}(u,u)$ by a consistent estimate.
However, pointwise inference permits looking at the effect of the policy at a specific point only. This approach might be restrictive for policy analysis where the quantities and hypotheses of interest usually involve many points or a continuum of points. That is, the entire distribution or quantile function of the observed and counterfactual outcomes is often of interest. For example, in order to test hypotheses of the policy having no effect on the distribution, having a constant effect throughout the distribution, or having a first order dominance effect, we must use the entire outcome distribution, and not only a single specific point. Moreover, simultaneous inference corrections to pointwise procedures based on the normal distribution, such as Bonferroni-type corrections, can be very conservative for simultaneous testing of highly dependent hypotheses, and become completely inadequate for testing a continuum of hypotheses.

A convenient and computationally attractive approach for performing inference on function-valued parameters is to use Kolmogorov-Smirnov type procedures. Some complications arise in our case because the limit processes are non-pivotal, as their covariance functions depend on unknown, though estimable, nuisance parameters.^ A practical and valid way to deal with non-pivotality is to use resampling and related simulation methods. An attractive feature of our theoretical analysis is that validity of resampling and simulation methods follows from the Hadamard differentiability of the policy functionals with respect to the underlying conditional functions. Indeed, given that bootstrap and other methods can consistently estimate the limit laws of the estimators of the conditional distribution and quantile functions, they also consistently estimate the limit laws of our policy estimators. This convenient result follows from preservation of validity of bootstrap and other resampling methods for estimating laws of Hadamard differentiable functionals; see more on this in Lemma 6 in Appendix A.

Theorem 3 (Validity of bootstrap and other simulation methods for estimating the laws of policy estimators of function-valued parameters). If the bootstrap or any other simulation method consistently estimates the laws of the limit stochastic processes (3.1) and (3.2) for the estimators of the conditional quantile or distribution function, then this method also consistently estimates the laws of the limit stochastic processes (3.3), (3.5), (3.6), (3.7), and (3.8) for policy estimators of marginal distribution and quantile functions and other functionals.

^ Similar non-pivotality issues arise in a variety of goodness-of-fit problems studied by Durbin and others, and are referred to as the Durbin problem by Koenker and Xiao (2002).
Theorem 3 shows that the bootstrap is valid for estimating the limit laws of various inferential processes. This is true provided that the bootstrap is valid for estimating the limit laws of the (function-valued) estimators of the conditional distribution and quantile functions. This is a reasonable condition, but, to the best of our knowledge, there are no results in the literature that verify this condition for our principal estimators. Indeed, the previous results on the bootstrap established its validity only for estimating the pointwise laws of our principal estimators, which is not sufficient for our purposes.^ To overcome this difficulty, in Appendix F we prove validity of the empirical bootstrap and other related methods, such as Bayesian bootstrap, wild bootstrap, k out of n bootstrap, and subsampling bootstrap, for estimating the laws of function-valued estimators, such as quantile regression and distribution regression processes. These results may be of substantial independent interest.

We can then use Theorem 3 to construct the usual uniform bands and perform inference on the marginal distribution and quantile functions, and various functionals, as described in detail in Chernozhukov and Fernandez-Val (2005) and Angrist, Chernozhukov, and Fernandez-Val (2006). Moreover, if the sample size is large, we can reduce the computational complexity of the inference procedure by resampling the first order approximation to the estimators of the conditional distribution and quantile functions (Chernozhukov and Hansen, 2006); by using subsampling bootstrap (Chernozhukov and Fernandez-Val, 2005); or by simulating the limit processes $Z_j$ or $V_j$, $j \in \{0,1\}$, appearing in expressions (3.1) and (3.2), using multiplier methods (Barrett and Donald, 2003).
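A uniform band of the kind discussed above can be computed from bootstrap draws of a function-valued estimator evaluated on a grid. The sketch below shows one common construction, a studentized Kolmogorov-Smirnov type (sup-t) band; the exact construction in the cited references may differ. It assumes the bootstrap draws have already been produced by a valid resampling scheme and that the bootstrap standard error is strictly positive at every grid point. The function name is hypothetical.

```python
import numpy as np

def uniform_band(estimate, boot_draws, level=0.95):
    """Sup-t uniform confidence band from bootstrap draws.

    estimate   : (n_grid,) point estimate of the function on a grid
    boot_draws : (n_boot, n_grid) bootstrap replications of the estimator
    level      : nominal simultaneous coverage
    Returns (lower, upper, critical_value).
    """
    # Pointwise bootstrap standard errors (assumed strictly positive).
    se = boot_draws.std(axis=0, ddof=1)
    # Maximal studentized deviation of each bootstrap path from the estimate.
    t_max = np.max(np.abs(boot_draws - estimate) / se, axis=1)
    # The critical value is a quantile of the bootstrap sup-statistic.
    c = np.quantile(t_max, level)
    return estimate - c * se, estimate + c * se, c
```

Because the critical value is taken from the distribution of the maximum over the grid, it exceeds the pointwise normal critical value whenever the grid contains more than one non-degenerate point, which is what makes the band simultaneous rather than pointwise.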
3.6. Incorporating uncertainty about the distribution of the covariates. In the preceding analysis we assumed that we know the distributions of the covariates before and after the policy intervention for the target population. In practice, however, we usually observe such distributions only for individuals in the sample. If the individuals in the sample are the target population, then the previous limit theory is valid for performing inference without any adjustments. If a more general population group is the target population, then the distributions of the covariates need to be estimated, and the previous limit theory needs to be adjusted to take this into account. Here we highlight the main ideas, while in Appendix D we present formal distribution and inference theory.

^ Exceptions include Chernozhukov and Hansen (2006) and Chernozhukov and Fernandez-Val (2005), but they looked at forms of subsampling only.

We begin by assuming that the estimators $x \mapsto \hat F_{X_k}(x)$, $k \in \{0,1\}$, of the covariate distribution functions are well behaved, specifically that they converge jointly in law to Gaussian processes $B_{X_k}$, $k \in \{0,1\}$:
$$\sqrt{n}\left(\hat F_{X_k}(x) - F_{X_k}(x)\right) \Rightarrow \sqrt{\lambda_k}\, B_{X_k}(x), \quad k \in \{0,1\},$$
as rigorously defined in Appendix D.1. This assumption is quite general and holds for conventional estimators such as the empirical distribution under i.i.d. sampling as well as various modifications of conventional estimators, as discussed further in Appendix D. The joint convergence holds trivially in the leading cases where the distribution in group 1 is a known transformation of the distribution in group 0, or when the two distributions are estimated from independent samples.

The estimation of the covariate distributions affects limit distributions of functionals of interest. Let us consider, for example, the marginal distribution functions. When the covariate distributions are unknown, the plug-in estimators for these functions take the form $\hat F^k_{Y_j}(y) = \int_{\mathcal{X}} \hat F_{Y_j}(y|x)\, d\hat F_{X_k}(x)$, $j,k \in \{0,1\}$. The limit processes for these estimators become
$$\sqrt{n}\left(\hat F^k_{Y_j}(y) - F^k_{Y_j}(y)\right) \Rightarrow \sqrt{\lambda_j}\, Z^k_j(y) + \sqrt{\lambda_k} \int_{\mathcal{X}} F_{Y_j}(y|x)\, dB_{X_k}(x), \quad j,k \in \{0,1\},$$
where the familiar first component arises from the estimation of the conditional distribution and the second comes from the estimation of the distributions of the covariates. In Appendix D we discuss further details.

4. Labor Market Institutions and the Distribution of Wages

The empirical application in this section draws its motivation from the influential article by DiNardo, Fortin, and Lemieux (1996, DFL hereafter), which studied the effects of institutional and labor market factors on the evolution of the U.S. wage distribution between 1979 and 1988. The goal of our empirical application is to complete and complement DFL's analysis by using a wider range of techniques, including quantile regression and distribution regression, and to provide confidence intervals for scalar-valued effects as well as function-valued effects of the institutional and labor market factors, such as quantile, distribution, and Lorenz policy effects.

We use the same dataset as in DFL, extracted from the outgoing rotation groups of the Current Population Surveys (CPS) in 1979 and 1988. The outcome variable of interest is the hourly log-wage in 1979 dollars. The regressors include a union status dummy, nine education dummies interacted with experience, a quartic term in experience, two occupation dummies, twenty industry dummies, and dummies for race, SMSA, marital status, and part-time status. Following DFL we weigh the observations by the product of the CPS sampling weights and the hours worked. We analyze the data for men and women separately.

The major factors suspected to have an important role in the evolution of the wage distribution between 1979 and 1988 are the minimum wage, whose real value declined by 27 percent, the level of unionization, whose level also declined from 30 percent to 21 percent in our sample, and the composition of the labor force, whose education levels and other characteristics have also changed substantially during this period. Thus, following DFL, we decompose the total change in the US wage distribution into the sum of four effects: (1) the effect of a change in minimum wage, (2) the effect of de-unionization, (3) the effect of changes in the composition of the labor force, and (4) the price effect. The effect (1) measures changes in the marginal distribution of wages that occur due to a change in the minimum wage; the effects (2) and (3) measure changes in the marginal distribution of wages that occur due to a change in the distribution of a particular factor, having fixed the distribution of other factors at some constant level; the effect (4) measures changes in the marginal distribution of wages that occur due to a change in the wage structure, or conditional distribution of wages given worker characteristics.
Next we formally define these four effects as differences between appropriately chosen counterfactual distribution functions. Let $F^{U_r,Z_v}_{Y_t,m_s}$ denote the counterfactual marginal distribution function of log-wages $Y$ when the wage structure is as in year $t$, the minimum wage, $m$, is as the level observed for year $s$, the distribution of union status, $U$, is as the distribution observed in year $r$, and the distribution of other worker characteristics, $Z$, is as the distribution observed in year $v$. We identify and estimate such counterfactual distributions using the procedures described below. Given these counterfactual distributions, we can decompose the observed total change in the distribution of wages between 1979 and 1988 into the sum of four effects:
$$F^{U_{88},Z_{88}}_{Y_{88},m_{88}} - F^{U_{79},Z_{79}}_{Y_{79},m_{79}} = \underbrace{\left[F^{U_{88},Z_{88}}_{Y_{88},m_{88}} - F^{U_{88},Z_{88}}_{Y_{88},m_{79}}\right]}_{(1)} + \underbrace{\left[F^{U_{88},Z_{88}}_{Y_{88},m_{79}} - F^{U_{79},Z_{88}}_{Y_{88},m_{79}}\right]}_{(2)} + \underbrace{\left[F^{U_{79},Z_{88}}_{Y_{88},m_{79}} - F^{U_{79},Z_{79}}_{Y_{88},m_{79}}\right]}_{(3)} + \underbrace{\left[F^{U_{79},Z_{79}}_{Y_{88},m_{79}} - F^{U_{79},Z_{79}}_{Y_{79},m_{79}}\right]}_{(4)}. \tag{4.1}$$
The first component is the effect of the change in the minimum wage, the second is the effect of de-unionization, the third is the effect of changes in worker characteristics, and the fourth is the price effect. As stated above, we see that the effects (2) and (3) measure changes in the marginal distribution of wages that occur due to a change in the distribution of a particular factor, having fixed the distribution of other factors at some constant level. The effect (4) captures changes in the wage structure or conditional distribution of wages given observed characteristics; in particular, it captures the effect of changes in the market returns to workers' characteristics, including education and experience. Finally, we discuss the interpretation of the minimum wage effect (1) in detail below.

The decomposition (4.1) is the distribution version of the Oaxaca-Blinder decomposition for the mean. We obtain similar decompositions for other functionals $\phi(F^{U_r,Z_v}_{Y_t,m_s})$ of interest, such as marginal quantiles and Lorenz curves, by making an appropriate substitution in equation (4.1):
$$\phi(F^{U_{88},Z_{88}}_{Y_{88},m_{88}}) - \phi(F^{U_{79},Z_{79}}_{Y_{79},m_{79}}) = \underbrace{\left[\phi(F^{U_{88},Z_{88}}_{Y_{88},m_{88}}) - \phi(F^{U_{88},Z_{88}}_{Y_{88},m_{79}})\right]}_{(1)} + \underbrace{\left[\phi(F^{U_{88},Z_{88}}_{Y_{88},m_{79}}) - \phi(F^{U_{79},Z_{88}}_{Y_{88},m_{79}})\right]}_{(2)} + \underbrace{\left[\phi(F^{U_{79},Z_{88}}_{Y_{88},m_{79}}) - \phi(F^{U_{79},Z_{79}}_{Y_{88},m_{79}})\right]}_{(3)} + \underbrace{\left[\phi(F^{U_{79},Z_{79}}_{Y_{88},m_{79}}) - \phi(F^{U_{79},Z_{79}}_{Y_{79},m_{79}})\right]}_{(4)}. \tag{4.2}$$
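The sequential decomposition is a telescoping sum, which the following minimal sketch makes explicit. It assumes that the two observed and three counterfactual distribution functions have already been estimated and evaluated on a common grid; the function and argument names are hypothetical.

```python
import numpy as np

def decompose(F_obs_88, F_mw_79, F_union_79, F_all_x_79, F_obs_79):
    """Four-way sequential decomposition in the order of (4.1).

    F_obs_88   : observed 1988 distribution (1988 wage structure, minimum
                 wage, union status, and characteristics)
    F_mw_79    : same, but with the 1979 minimum wage
    F_union_79 : additionally with the 1979 union status distribution
    F_all_x_79 : additionally with the 1979 distribution of characteristics
    F_obs_79   : observed 1979 distribution
    """
    effects = {
        "minimum_wage": F_obs_88 - F_mw_79,
        "de_unionization": F_mw_79 - F_union_79,
        "composition": F_union_79 - F_all_x_79,
        "price": F_all_x_79 - F_obs_79,
    }
    total = F_obs_88 - F_obs_79
    return effects, total
```

The same function applies to quantiles or other functionals when the inputs are the corresponding functionals of each distribution. The split between the four components depends on the order in which the factors are changed, while their sum does not.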
In constructing the decompositions (4.1) and (4.2), we follow the same sequential order as in DFL.^ Also, like DFL, we follow a partial equilibrium approach, but, unlike DFL, we do not incorporate supply and demand factors in our analysis because they do not fit well in our framework.

^ The choice of sequential order matters and can affect the relative importance of the four effects. We report some results for the reverse sequential order in the Appendix.

We next describe how to identify and estimate the various counterfactual distributions appearing in (4.1). The first counterfactual distribution we need is $F^{U_{88},Z_{88}}_{Y_{88},m_{79}}$, the distribution of wages that we would observe in 1988 if the real minimum wage were as high as in 1979. We cannot identify this quantity from random variation in minimum wage, since the federal minimum wage does not vary across individuals and varies little across states in the years considered. Identifying this quantity requires additional assumptions.^ Following DFL, the first strategy we employ is to assume the conditional wage density at or below the minimum wage depends only on the value of the minimum wage, and the minimum wage has no employment effects and no spillover effects on wages above its level. Under the first strategy, DFL show that
$$F_{Y_{88},m_{79}}(y|u,z) = \begin{cases} F_{Y_{79},m_{79}}(y|u,z)\, \dfrac{F_{Y_{88},m_{88}}(m_{79}|u,z)}{F_{Y_{79},m_{79}}(m_{79}|u,z)}, & \text{if } y < m_{79}; \\ F_{Y_{88},m_{88}}(y|u,z), & \text{if } y \ge m_{79}; \end{cases} \tag{4.3}$$
where $F_{Y_s,m_s}(y|u,z)$ denotes the conditional distribution of wages at year $s$ given worker characteristics when the level of the minimum wage is as in year $s$. The second strategy we employ completely avoids modeling the conditional wage distribution below the minimal wage by simply censoring the observed wages below the minimum wage to the value of the minimum wage. Under the second strategy, we have that
$$F_{Y_{88},m_{79}}(y|u,z) = \begin{cases} 0, & \text{if } y < m_{79}; \\ F_{Y_{88},m_{88}}(y|u,z), & \text{if } y \ge m_{79}. \end{cases} \tag{4.4}$$
Given either (4.3) or (4.4) we identify the counterfactual distribution of wages using the representation:
$$F^{U_{88},Z_{88}}_{Y_{88},m_{79}}(y) = \int F_{Y_{88},m_{79}}(y|u,z)\, dF_{UZ_{88}}(u,z), \tag{4.5}$$
where $F_{UZ_t}$ is the joint distribution of worker characteristics and union status in year $t$. We can then estimate this distribution using the plug-in principle. In particular, we estimate the conditional distribution in expressions (4.3) and (4.4) using one of the regression methods described below, and the distribution function $F_{UZ_{88}}$ using its empirical analog.

The other counterfactual marginal distributions we need are
$$F^{U_{79},Z_{88}}_{Y_{88},m_{79}}(y) = \int \int F_{Y_{88},m_{79}}(y|u,z)\, dF_{U_{79}}(u|z)\, dF_{Z_{88}}(z) \tag{4.6}$$
and
$$F^{U_{79},Z_{79}}_{Y_{88},m_{79}}(y) = \int F_{Y_{88},m_{79}}(y|u,z)\, dF_{UZ_{79}}(u,z). \tag{4.7}$$
Given either of our assumptions on the minimum wage all the components of these distributions are identified and we can estimate them using the plug-in principle. In particular, we estimate the conditional distribution $F_{Y_{88},m_{79}}(y|u,z)$ using one of the regression methods described below, the conditional distribution $F_{U_{79}}(u|z)$, $u \in \{0,1\}$, using logistic regression, and $F_{Z_{88}}(z)$ and $F_{UZ_{79}}$ using the empirical distributions.
Formulas (4.5)-(4.7) giving the expressions for the counterfactual distributions reflect the assumptions that give the counterfactual distributions a formal causal interpretation. Indeed, we assume in (4.6) and (4.7) that we can fix the relevant conditional distributions and change only the marginal distributions of the relevant covariates. In (4.5), we also specify how the conditional distribution of wages changes with the level of the minimum wage. Note that we directly observe the marginal distributions appearing on the left side of the decomposition (4.1) and estimate them using the plug-in principle.

To estimate the conditional distributions of wages we consider three different regression methods: classical regression, linear quantile regression, and distribution regression with a logit link. The classical regression, despite its wide use in the literature, is not appropriate in this application due to substantial conditional heteroscedasticity in log wages (Lemieux, 2006, and Angrist, Chernozhukov, and Fernandez-Val, 2006). The linear quantile regression is more flexible, but it also has shortcomings in this application. First, there is a considerable amount of rounding, especially at the level of the minimum wage, which makes the wage variable highly discrete. Second, a linear model for the conditional quantile function may not provide a good approximation to the conditional quantiles near the minimum wage, where the conditional quantile function may be highly nonlinear. The distribution regression approach does not suffer from these problems, and we therefore employ it to generate the main empirical results.

In order to check the robustness of our empirical results, we also employ the censoring approach described above. We set the wages below the minimum wage to the value of the minimum wage and then apply censored quantile and distribution regressions to the resulting data. In what follows, we first present the empirical results obtained using distribution regression, and then briefly compare them with the results obtained using censored quantile regression and censored distribution regression.
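The distribution regression step with a logit link, followed by the plug-in integration over a covariate distribution, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the simulated data, the threshold grid, and the helper names `fit_logit`, `distribution_regression`, and `marginal_cdf` are all hypothetical, and the covariate shift used for the counterfactual is arbitrary.

```python
import numpy as np

def logistic(v):
    """Logistic link Lambda(v) = 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + np.exp(-v))

def fit_logit(X, d, max_iter=50, tol=1e-10):
    """Maximum-likelihood logit of binary d on X via Newton-Raphson.
    X is assumed to contain a constant column."""
    b = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = logistic(X @ b)
        score = X.T @ (d - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hessian, score)
        b += step
        if np.max(np.abs(step)) < tol:
            break
    return b

def distribution_regression(X, y, thresholds):
    """One logit per threshold t: F(t | x) is modeled as Lambda(x' b(t))."""
    return np.array([fit_logit(X, (y <= t).astype(float)) for t in thresholds])

def marginal_cdf(coefs, X_target):
    """Plug-in marginal CDF: average the fitted conditional CDF over the
    rows of X_target (the empirical covariate distribution)."""
    return logistic(X_target @ coefs.T).mean(axis=0)

# Hypothetical data: logistic errors make the logit link correctly specified.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.logistic(size=n)
X = np.column_stack([np.ones(n), x])

grid = np.quantile(y, np.linspace(0.1, 0.9, 9))   # interior thresholds only
coefs = distribution_regression(X, y, grid)

F_status_quo = marginal_cdf(coefs, X)              # integrates over observed X
X_cf = np.column_stack([np.ones(n), x + 1.0])      # arbitrary shifted covariates
F_counterfactual = marginal_cdf(coefs, X_cf)       # same conditional law, new X
```

Keeping the fitted coefficients fixed while swapping the covariate sample mirrors the logic of (4.6) and (4.7): the conditional distribution is held constant and only the covariate distribution changes.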
We present our empirical results in Tables 1-3 and Figures 1-9. In Figure 1, we compare the empirical distributions of wages in 1979 and 1988. In Table 1, we report the estimation and inference results for the decomposition (4.2) of the changes in various measures of wage dispersion between 1979 and 1988 estimated using distribution regressions. Figures 2-7 refine these results by presenting estimates and 95% simultaneous confidence intervals for several major functionals of interest, including the effects on entire quantile functions, distribution functions, and Lorenz curves. We construct the simultaneous confidence bands using 100 bootstrap replications and a grid of quantile indices {0.02, 0.021, ..., 0.98}. We plot all of these function-valued effects against the quantile indices of wages. In Tables 2-3 and Figures 8-9, we present the estimates of the same effects as in Table 1 and Figures 2-3 estimated using various alternative methods, such as censored quantile regression and censored distribution regression. Overall, we find that our estimates, confidence intervals, and robustness checks all reinforce the findings of DFL, giving them a rigorous econometric foundation. Indeed, we provide standard errors and confidence intervals, without which we would not be able to assess the statistical significance of the results. Moreover, we validate the results with a wide array of estimation methods. In what follows below, we discuss each of our results in more detail.

Footnote: The estimation results parallel the results presented in DFL. Table A1 in the Appendix gives the results for the decomposition in reverse order.

In Figure 1, we present estimates and uniform confidence intervals for the marginal distributions of wages in 1979 and 1988. We see that the low end of the distribution is significantly lower in 1988 while the upper end is significantly higher in 1988. This pattern reflects the well-known increase in wage inequality during this period.
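The simultaneous confidence bands built from bootstrap replications over a grid of quantile indices can be illustrated with a short sketch. The construction below is a common sup-t band (standardize each bootstrap deviation by the pointwise bootstrap standard error, then take a quantile of the maximal statistic); it is an assumption for illustration and not necessarily the exact algorithm used in the paper. The function name `simultaneous_band`, the lognormal sample, and the coarse grid are hypothetical.

```python
import numpy as np

def simultaneous_band(estimate, boot_draws, level=0.95):
    """Sup-t band: estimate +/- crit * se, where crit is the `level` quantile
    of max_u |draw(u) - estimate(u)| / se(u) across bootstrap draws."""
    estimate = np.asarray(estimate, dtype=float)
    boot_draws = np.asarray(boot_draws, dtype=float)
    se = boot_draws.std(axis=0, ddof=1)            # pointwise bootstrap s.e.
    dev = np.abs(boot_draws - estimate)
    # Guard against grid points with zero bootstrap variation.
    t_stat = np.divide(dev, se, out=np.zeros_like(dev), where=se > 0)
    crit = np.quantile(t_stat.max(axis=1), level)
    return estimate - crit * se, estimate + crit * se, crit

# Hypothetical example: band for the quantile function of a simulated sample.
rng = np.random.default_rng(1)
sample = rng.lognormal(size=500)
taus = np.linspace(0.02, 0.98, 49)                 # coarser than the paper's grid
q_hat = np.quantile(sample, taus)
draws = np.array([
    np.quantile(rng.choice(sample, size=sample.size), taus)  # resample with replacement
    for _ in range(100)                             # 100 replications, as in the text
])
lower, upper, crit = simultaneous_band(q_hat, draws)
```

Because the critical value is taken from the maximum over the whole grid, it is larger than the pointwise normal critical value, which is what makes the band valid for the entire function rather than for one quantile index at a time.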
Next we turn to the decomposition of the total change into the sum of the four effects. For this decomposition we focus mostly on quantile functions for comparability with recent studies and to facilitate the interpretation. In Figures 2-3, we present estimates and uniform confidence intervals for the total change in the marginal quantile function of wages and the four effects that form a decomposition of this total change. We report the marginal quantile functions in 1979 and 1988 in the top left panels of Figures 2 and 3. In Figures 4-7, we plot analogous results for the decomposition of the total change in marginal distribution functions and Lorenz curves.

From Figures 2 and 3, we see that the contribution of union status to the total change is quantitatively small and has a U-shaped effect across the quantile function for men. The magnitude and shape of this effect on the marginal quantiles between the first and last decile sharply contrast with the quantitatively large and monotonically decreasing shape of the effect of the union status on the conditional quantile function for this range of indexes (Chamberlain, 1994), and illustrate the difference between conditional and unconditional effects. In general, interpreting the unconditional effect of changes in the distribution of a covariate requires some care, because the covariate may change only over certain parts of its support. For example, de-unionization cannot affect those who were not unionized at the beginning of the period, which is 70 percent of the workers; and in our data, the unionization declines from 30 to 21 percent, thus affecting only 9 percent of the workers. Thus, even though the conditional impact of switching from union to non-union status can be quantitatively large, it has a quantitatively small effect on the marginal distribution since only 9 percent of the workers are affected.
From Figures 2 and 3, we also see that the change in the distribution of worker characteristics (other than union status) is responsible for a large part of the increase in inequality in the upper tail of the distribution. The importance of these composition effects has been recently stressed by Lemieux (2006) and Autor, Katz and Kearney (2008). The composition effect is realized through at least two channels. The first channel operates through between-group inequality. In our case, higher educated and more experienced workers earn higher wages. By increasing their proportion, we induce a larger gap between the lower and upper tails of the marginal wage distribution. The second channel is that within-group inequality varies by group, so increasing the proportion of high variance groups increases the dispersion in the marginal distribution of wages. In our case, higher educated and more experienced workers exhibit higher within-group inequality. By increasing their proportion, we induce a higher inequality within the upper tail of the distribution.

Footnote: Discreteness of wage data implies that the quantile functions have jumps. To avoid this erratic behavior in the graphical representations of the results, we display smoothed quantile functions. The non-smoothed results are available from the authors. The quantile functions were smoothed using a bandwidth of 0.015 and a Gaussian kernel. The results in Tables 1-3 and A1 have not been smoothed.

Footnote: We find similar estimates to Chamberlain (1994) for the effect of union on the conditional quantile function in our CPS data.

To understand the effect of these channels in wage dispersion it is useful to consider a linear quantile model $Y = X'\beta(U)$, where $X$ is independent of $U$. By the law of total variance, we can decompose the variance of $Y$ into:

$$Var[Y] = E[\beta(U)]'\,Var[X]\,E[\beta(U)] + trace\{E[XX']\,Var[\beta(U)]\}. \tag{4.8}$$

The first channel corresponds to changes in the first term of (4.8) where $Var[X]$ represents the heterogeneity of the labor force (between group inequality); whereas the second channel corresponds to changes in the second term of (4.8) operating through the interaction of between group inequality $E[XX']$ and within group inequality $Var[\beta(U)]$.
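The two terms of the variance decomposition (4.8) can be checked numerically on a small discrete design in which all moments are computed exactly. The sketch below is only an illustration: the covariate support, the probabilities, and the coefficient function `beta` are hypothetical values chosen for the example, and `variance_decomposition` is an illustrative helper name.

```python
import numpy as np

def variance_decomposition(x_vals, p_x, b_vals, p_u):
    """Exact moments for Y = X' beta(U) with X independent of U.
    x_vals: (K, d) support of X with probabilities p_x.
    b_vals: (M, d) values of beta(u) on the support of U with probabilities p_u.
    Returns (Var[Y], between term, within term) as in (4.8)."""
    Y = x_vals @ b_vals.T                      # Y on the product support, (K, M)
    w = np.outer(p_x, p_u)                     # joint probabilities under independence
    mean_y = np.sum(w * Y)
    total = np.sum(w * (Y - mean_y) ** 2)

    Eb = p_u @ b_vals                          # E[beta(U)]
    Vb = (b_vals - Eb).T @ ((b_vals - Eb) * p_u[:, None])   # Var[beta(U)]
    Ex = p_x @ x_vals
    Vx = (x_vals - Ex).T @ ((x_vals - Ex) * p_x[:, None])   # Var[X]
    Exx = x_vals.T @ (x_vals * p_x[:, None])                # E[XX']

    between = Eb @ Vx @ Eb                     # first term of (4.8)
    within = np.trace(Exx @ Vb)                # second term of (4.8)
    return total, between, within

# Hypothetical design: X = (1, years of schooling), U on a five-point grid.
x_vals = np.array([[1.0, 8.0], [1.0, 12.0], [1.0, 16.0]])
p_x = np.array([0.3, 0.5, 0.2])
u_vals = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
p_u = np.full(5, 0.2)
b_vals = np.array([[0.5 + 0.4 * u, 0.05 + 0.06 * u] for u in u_vals])

total, between, within = variance_decomposition(x_vals, p_x, b_vals, p_u)
```

In this design `total` equals `between + within` up to floating-point error, so changing `p_x` alone shows how a composition change moves both the between-group and the within-group term.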
In Figures 2 and 3, we also include estimates of the price effect. This effect captures changes in the conditional wage structure. It represents the difference we would observe if the distribution of worker characteristics and union status, and the minimum wage remained unchanged during this period. This effect has a U-shaped pattern, which is similar to the pattern Autor, Katz and Kearney (2006a) find for the period between 1990 and 2000. They relate this pattern to a bi-polarization of employment into low and high skill jobs. However, they do not find a U-shaped pattern for the period between 1980 and 1990. A possible explanation for the apparent absence of this pattern in their analysis might be that the declining minimum wage masks this phenomenon. In our analysis, once we control for this temporary factor, we do uncover the U-shaped pattern for the price component in the 80s.
In Tables 2-3 and Figures 8-9, we present several interesting robustness checks. As we mentioned above, the assumptions about the minimum wage are particularly delicate, since the mechanism that generates wages strictly below this level is not clear; it could be measurement error, non-coverage, or non-compliance with the law. To check the robustness of the results to the DFL assumptions about the minimum wage and to our semi-parametric model of the conditional distribution, we re-estimate the decomposition using censored linear quantile regression and censored distribution regression with a logit link, using the wage data censored below the minimum wage. For censored quantile regression, we use Powell's (1986) censored quantile regression estimated using Chernozhukov and Hong's (2002) algorithm. For censored distribution regression, we simply censor to zero the distribution regression estimates of the conditional distributions below the minimum wage and recompute the functionals of interest. Overall, we find the results are very similar for the quantile and distribution regressions, and they are not very sensitive to the censoring.
5. Conclusion

This paper develops methods for performing inference about the effect on an outcome of interest of a change in either the distribution of policy-related variables or the relationship of the outcome with these variables. The validity of the proposed inference procedures in large samples relies only on the applicability of a functional central limit theorem for the estimator of the conditional distribution or conditional quantile function. This condition holds for most important semiparametric estimators of conditional distribution and quantile functions, such as classical, quantile, duration, and distribution regressions.
† Massachusetts Institute of Technology, Department of Economics & Operations Research Center; and University College London, CEMMAP. E-mail: vchern@mit.edu. Research support from the Castle Krob Chair, National Science Foundation, the Sloan Foundation, and CEMMAP is gratefully acknowledged.
§ Boston University, Department of Economics. E-mail: ivanf@bu.edu. Research support from the National Science Foundation is gratefully acknowledged.
‡ Brown University, Department of Economics. E-mail: Blaise_Melly@brown.edu.
Appendix

This Appendix contains proofs and additional results. Section A collects preliminary lemmas on the functional delta method and derives the functional delta method for any simulation method, extending its applicability beyond the bootstrap. Section B collects the proofs for the results in the main text of the paper. Section C gives limit distribution theory for policy effects estimators. Section D presents additional results for the case where the covariate distributions are estimated. These results complement the results in the main text. Section E derives limit theory, including Hadamard differentiability, for Z-processes and Section F applies this theory to the principal estimators of conditional distribution and quantile functions. These results establish the validity of bootstrap and other resampling schemes for the entire quantile regression process, the entire distribution regression process, and related processes arising in the estimation of various conditional quantile and distribution functions. These results may be of a substantial independent interest.

Footnote: We have additional results on quantile, distribution and Lorenz effects for the censored estimates; these are available on request from the authors. We do not report them here to save space.
Appendix A. Functional Delta Method, Bootstrap, and Other Methods

This section collects preliminary lemmas on the functional delta method and derives the functional delta method for any simulation method, extending its applicability beyond the bootstrap.

A.1. Some definitions and auxiliary results. We begin by quickly recalling from van der Vaart and Wellner (1996) the details of the functional delta method.
Definition 1 (Hadamard-differentiability). Let $D_0$, $D$, and $E$ be normed spaces, with $D_0 \subset D$. A map $\phi: D_\phi \subset D \mapsto E$ is called Hadamard-differentiable at $\theta \in D_\phi$ tangentially to $D_0$ if there is a continuous linear map $\phi'_\theta: D_0 \mapsto E$ such that

$$\frac{\phi(\theta + t_n h_n) - \phi(\theta)}{t_n} \to \phi'_\theta(h), \quad n \to \infty,$$

for all sequences $t_n \to 0$ and $h_n \to h \in D_0$ such that $\theta + t_n h_n \in D_\phi$ for every $n$.

This notion works well together with the continuous mapping theorem.
Lemma 2 (Extended continuous mapping theorem). Let $D_n \subset D$ be arbitrary subsets and $g_n: D_n \mapsto E$ be arbitrary maps ($n \geq 0$), such that for every sequence $x_n \in D_n$: if $x_{n'} \to x \in D_0$ along a subsequence, then $g_{n'}(x_{n'}) \to g_0(x)$. Then, for arbitrary maps $X_n: \Omega_n \mapsto D_n$ and every random element $X$ with values in $D_0$ such that $g_0(X)$ is a random element in $E$:
(i) If $X_n \Rightarrow X$, then $g_n(X_n) \Rightarrow g_0(X)$;
(ii) If $X_n \to_p X$, then $g_n(X_n) \to_p g_0(X)$.

The combination of the previous definition and lemma is known as the functional delta method.
Lemma 3 (Functional delta-method). Let $D_0$, $D$, and $E$ be normed spaces. Let $\phi: D_\phi \subset D \mapsto E$ be Hadamard-differentiable at $\theta$ tangentially to $D_0$. Let $X_n: \Omega_n \mapsto D_\phi$ be maps with $r_n(X_n - \theta) \Rightarrow X$ in $D$, where $X$ is separable and takes its values in $D_0$, for some sequence of constants $r_n \to \infty$. Then $r_n(\phi(X_n) - \phi(\theta)) \Rightarrow \phi'_\theta(X)$. If $\phi'_\theta$ is defined and continuous on the whole of $D$, then the sequence $r_n(\phi(X_n) - \phi(\theta)) - \phi'_\theta(r_n(X_n - \theta))$ converges to zero in outer probability.
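A standard textbook instance of Lemma 3 is the quantile map, whose Hadamard derivative at a distribution $F$ with density $f$ sends a direction $h$ to $-h(Q(u))/f(Q(u))$. The Monte Carlo sketch below, written under assumptions chosen only for illustration (unit exponential data, the median, the helper name `delta_method_check`), compares the scaled quantile error with its delta-method linearization.

```python
import numpy as np

def delta_method_check(n=2000, reps=2000, u=0.5, seed=0):
    """Compare sqrt(n)(Q_hat(u) - Q(u)) with its linearization
    -sqrt(n)(F_hat(Q(u)) - u) / f(Q(u)) for Exp(1) data."""
    rng = np.random.default_rng(seed)
    q = -np.log(1.0 - u)                 # true u-quantile of Exp(1)
    f = np.exp(-q)                       # Exp(1) density at q
    y = rng.exponential(size=(reps, n))  # reps independent samples of size n
    q_hat = np.quantile(y, u, axis=1)
    lhs = np.sqrt(n) * (q_hat - q)
    F_hat_at_q = (y <= q).mean(axis=1)   # empirical CDF evaluated at q
    rhs = -np.sqrt(n) * (F_hat_at_q - u) / f
    asymptotic_var = u * (1.0 - u) / f ** 2
    return lhs, rhs, asymptotic_var

lhs, rhs, asymptotic_var = delta_method_check()
```

Across replications `lhs` and `rhs` are close to each other, and both have variance near `asymptotic_var`, which is the variance of the limit $\phi'_\theta(X)$ in this example.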
The applicability of the method is greatly enhanced by the fact that Hadamard differentiation obeys the chain rule.
Lemma 4 (Chain rule). If $\phi: D_\phi \subset D \mapsto E_\psi$ is Hadamard-differentiable at $\theta \in D_\phi$ tangentially to $D_0$ and $\psi: E_\psi \mapsto F$ is Hadamard-differentiable at $\phi(\theta)$ tangentially to $\phi'_\theta(D_0)$, then $\psi \circ \phi: D_\phi \mapsto F$ is Hadamard-differentiable at $\theta$ tangentially to $D_0$ with derivative $\psi'_{\phi(\theta)} \circ \phi'_\theta$.

Another technical result to be used in the sequel concerns the equivalence of continuous and uniform convergence.
Lemma 5 (Uniform convergence via continuous convergence). Let $D$ and $E$ be complete separable metric spaces, with $D$ compact. Suppose $f: D \mapsto E$ is continuous. Then a sequence of functions $f_n: D \mapsto E$ converges to $f$ uniformly on $D$ if and only if for any convergent sequence $x_n \to x$ in $D$ we have that $f_n(x_n) \to f(x)$.

Proof of Lemmas 2-4: See van der Vaart and Wellner (1996) Chap. 1.11 and 3.9. □

Proof of Lemma 5: See, for example, Resnick (1987), page 2. □

A.2. Functional delta-method for bootstrap and other simulation methods. Let $\mathcal{F}_n = (W_1, ..., W_n)$ denote the data. Consider sequences of random elements $V_n = V_n(\mathcal{F}_n)$, the original empirical process. In a normed space $D$, the sequence $\sqrt{n}(V_n - V)$ converges unconditionally to the process $G$. Define the sequence of random elements

$$\tilde V_n = V_n + \tilde G_n/\sqrt{m}, \tag{A.1}$$

where $m = m(n)$ is a possibly random sequence such that $m/m_0 \to_p 1$ for some sequence of constants $m_0 \to \infty$ such that $m_0/n \to c \geq 0$, and the "draw" $\tilde G_n$ is produced by bootstrap, simulation, or any other consistent method that guarantees that the sequence $\tilde G_n$ converges conditionally given $\mathcal{F}_n$ in distribution to a tight random element $G$,

$$\sup_{h \in BL_1(D)} |E_{|\mathcal{F}_n} h(\tilde G_n)^* - E h(G)| \to 0, \tag{A.2}$$

in outer probability, where $BL_1(D)$ denotes the space of functions with Lipschitz norm at most 1 and $E_{|\mathcal{F}_n}$ denotes the conditional expectation given the data. In the definition, we can take $G$ to be independent of $\mathcal{F}_n$. The random scaling is needed to cover wild bootstrap, for example.
Given a map $\phi: D_\phi \subset D \mapsto E$, we wish to show that

$$\sup_{h \in BL_1(E)} |E_{|\mathcal{F}_n} h(\sqrt{m}(\phi(\tilde V_n) - \phi(V_n)))^* - E h(\phi'_V(G))| \to 0, \tag{A.3}$$

in outer probability.

Lemma 6 (Delta-method for bootstrap and other simulation methods). Let $D_0$, $D$, and $E$ be normed spaces, with $D_0 \subset D$. Let $\phi: D_\phi \subset D \mapsto E$ be Hadamard-differentiable at $V$ tangentially to $D_0$. Let $V_n$ and $\tilde V_n$ be maps as indicated previously with values in $D_\phi$ such that $\sqrt{n}(V_n - V) \Rightarrow G$ and (A.2) holds in outer probability, where $G$ is separable and takes its values in $D_0$. Then (A.3) holds in outer probability.

Proof of Lemma 6: The proof generalizes the functional delta-method for empirical bootstrap in Theorem 3.9.11 of van der Vaart and Wellner (1996) to exchangeable bootstrap. This expands the applicability of delta-method to a wide variety of resampling and simulation schemes that are special cases of exchangeable bootstrap, including empirical bootstrap, Bayesian bootstrap, wild bootstrap, k out of n bootstrap, and subsampling bootstrap (see next section for details).

Without loss of generality, assume that the derivative $\phi'_V: D \mapsto E$ is defined and continuous on the whole space. Otherwise, replace $E$ by its second dual $E^{**}$ and the derivative by an extension $\phi'_V: D \mapsto E^{**}$. For every $h \in BL_1(E)$, the function $h \circ \phi'_V$ is contained in $BL_{\|\phi'_V\|}(D)$. Thus (A.2) implies $\sup_{h \in BL_1(E)} |E_{|\mathcal{F}_n} h(\phi'_V(\tilde G_n))^* - E h(\phi'_V(G))| \to 0$, in outer probability. Next

$$\sup_{h \in BL_1(E)} |E_{|\mathcal{F}_n} h(\sqrt{m}(\phi(\tilde V_n) - \phi(V_n)))^* - E_{|\mathcal{F}_n} h(\phi'_V(\tilde G_n))^*| \leq \epsilon + 2 P_{|\mathcal{F}_n}\left(\|\sqrt{m}(\phi(\tilde V_n) - \phi(V_n)) - \phi'_V(\sqrt{m}(\tilde V_n - V_n))\|^* > \epsilon\right). \tag{A.4}$$

The theorem is proved once it has been shown that the conditional probability on the right converges to zero in outer probability.

Both sequences $\sqrt{m}(V_n - V)$ and $\sqrt{m}(\tilde V_n - V)$ converge (unconditionally) in distribution to separable random elements that concentrate on the space $D_0$. The first sequence converges by assumption and Slutsky's theorem when $m/m_0 \to_p 1$ and $m_0/n \to c > 0$, and converges to zero when $m_0/n \to 0$ by assumption and Slutsky's theorem. The second sequence converges, by noting that $\sqrt{m}(\tilde V_n - V) = \sqrt{m}(\tilde V_n - V_n) + \sqrt{m}(V_n - V)$ and that

$$E|E_{|\mathcal{F}_n} h(\sqrt{m}(\tilde V_n - V_n) + t_n)^* - E_{|\mathcal{F}_n} h(G + t_n)| \leq \sup_{h \in BL_1(D)} E|E_{|\mathcal{F}_n} h(\sqrt{m}(\tilde V_n - V_n))^* - E_{|\mathcal{F}_n} h(G)| = \sup_{h \in BL_1(D)} E|E_{|\mathcal{F}_n} h(\tilde G_n)^* - E_{|\mathcal{F}_n} h(G)|,$$

which converges to zero by (A.2), and by $E_{|\mathcal{F}_n} h(G) = E h(G)$ due to independence of $G$ from $\mathcal{F}_n$.

By Lemma 3,

$$\sqrt{m}(\phi(\tilde V_n) - \phi(V)) = \phi'_V(\sqrt{m}(\tilde V_n - V)) + o^*_p(1),$$
$$\sqrt{m}(\phi(V_n) - \phi(V)) = \phi'_V(\sqrt{m}(V_n - V)) + o^*_p(1).$$

Subtract these equations to conclude that the sequence $\sqrt{m}(\phi(\tilde V_n) - \phi(V_n)) - \phi'_V(\sqrt{m}(\tilde V_n - V_n))$ converges unconditionally to zero in outer probability. Thus, the conditional probability on the right in (A.4) converges to zero in outer mean. □

A.3. Exchangeable Bootstrap.
Let $(W_1, ..., W_n)$ denote the i.i.d. data. Next we define the collection of exchangeable bootstrap methods that we can employ for inference. For each $n$, let $(e_{n1}, ..., e_{nn})$ be an exchangeable, nonnegative random vector. Exchangeable bootstrap uses the components of this vector as random sampling weights in place of constant weights $(1, ..., 1)$. A simple way to think of exchangeable bootstrap is as sampling each variable $W_i$ the number of times equal to $e_{ni}$, albeit without requiring $e_{ni}$ to be integer-valued. Given an empirical process $V_n(f) = n^{-1}\sum_{i=1}^n f(W_i)$, we define an exchangeable bootstrap draw of this process as

$$\tilde V_n(f) = n^{-1}\sum_{i=1}^n e_{ni} f(W_i)/\bar e_n, \tag{A.5}$$

where $\bar e_n = \sum_{i=1}^n e_{ni}/n$. This insures that each draw of $\tilde V_n$ assigns nonnegative weights to each observation, which is important in applications of bootstrap to extremum estimators to preserve convexity of criterion functions. We assume that, for some $\epsilon > 0$,

$$\sup_n E[e_{n1}^{2+\epsilon}] < \infty, \quad n^{-1}\sum_{i=1}^n (e_{ni} - \bar e_n)^2 \to_p 1, \quad \bar e_n^2 \to_p c \geq 0, \tag{A.6}$$

where the first two conditions are standard, see Van der Vaart and Wellner (1996), and the last one is needed to apply the previous lemma. Let us consider the following special cases:
(1) The standard empirical bootstrap corresponds to the case where $(e_{n1}, ..., e_{nn})$ is a multinomial vector with parameters $n$ and probabilities $(1/n, ..., 1/n)$, so that $\bar e_n = 1$ and $m = n$.

(2) The Bayesian bootstrap corresponds to the case where $e_{ni} = U_i/\bar U_n$, where $U_1, ..., U_n$ are i.i.d. nonnegative random variables, e.g. unit exponential, with $E[U_1^{2+\epsilon}] < \infty$ for some $\epsilon > 0$, so that $\bar e_n = 1$ and $m = n$.

(3) The wild bootstrap corresponds to the case where $e_{n1}, ..., e_{nn}$ are i.i.d. vectors with $E[e_{n1}^{2+\epsilon}] < \infty$ for some $\epsilon > 0$, and $Var[e_{n1}] = 1$, so that $m/n = \bar e_n^2 \to_p (Ee_{n1})^2$ and $m_0 = n(Ee_{n1})^2 \to \infty$.

(4) The k out of n bootstrap resamples $k < n$ observations from $W_1, ..., W_n$ with replacement. This corresponds to letting $(e_{n1}, ..., e_{nn})$ be equal to $\sqrt{n/k}$ times multinomial vectors with parameters $k$ and probabilities $(1/n, ..., 1/n)$, so that $\bar e_n^2 = k/n$ and $m = k$. The condition (A.6) on the weights holds if $k \to \infty$.

(5) The subsampling bootstrap corresponds to resampling $k < n$ observations from $W_1, ..., W_n$ without replacement. This corresponds to letting $(e_{n1}, ..., e_{nn})$ be a row of $k$ times the number $n(n-k)^{-1/2}k^{-1/2}$ and $n - k$ times 0, ordered at random, independent of the $W_i$'s. The condition (A.6) on the weights holds if both $k \to \infty$ and $n - k \to \infty$. In this case $\bar e_n^2 = k/(n-k) \to c \geq 0$ and $m = nk/(n-k) \to \infty$.

As a consequence of Lemma 6, we obtain the following result, which might be of independent interest.

Lemma 7 (Functional delta method for exchangeable bootstrap). The exchangeable bootstrap method described above satisfies condition (A.2), and therefore the conclusions of Lemma 6 about validity of the functional delta method apply to this method.

Proof of Lemma 7: By Lemma 6, we only need to verify condition (A.2), which follows by Theorem 3.6.13 of Van der Vaart and Wellner (1996). □

Appendix B. Inference Theory for Counterfactual Estimators (Proofs)

This section collects the proofs for the results in the main text of the paper.

B.1. Notation. Define $Y_x := Q_Y(U|x)$, where $U \sim Uniform(\mathcal{U})$ with $\mathcal{U} = (0, 1)$. Denote by $\mathcal{Y}_x$ the support of $Y_x$, $\mathcal{YX} := \{(y, x): y \in \mathcal{Y}_x, x \in \mathcal{X}\}$, and $\mathcal{UX} := \mathcal{U} \times \mathcal{X}$. We assume throughout that $\mathcal{Y}_x \subset \mathcal{Y}$, which is a compact subset of $\mathbb{R}$, and that $x \in \mathcal{X}$, a compact subset of $\mathbb{R}^p$. In what follows, $\ell^\infty(\mathcal{UX})$ denotes the set of bounded and measurable functions $h: \mathcal{UX} \mapsto \mathbb{R}$, and $C(\mathcal{UX})$ denotes the set of continuous functions mapping $h: \mathcal{UX} \mapsto \mathbb{R}$.

B.2. Uniform Hadamard differentiability of conditional distribution functions with respect to the conditional quantile functions. The following lemma establishes Hadamard differentiability of the conditional distribution function with respect to the conditional quantile function. We use this result to prove Lemma 1 in the main text and to derive the limit distribution for the policy estimators based on conditional quantile models. We drop the dependence on the group index to simplify the notation.
Lemma 8 (Hadamard derivative of $F_Y(y|x)$ with respect to $Q_Y(u|x)$). Define $F_Y(y|x, h_t) := \int_0^1 1\{Q_Y(u|x) + t h_t(u|x) \leq y\}du$. Under condition C, as $t \searrow 0$,

$$D_{h_t}(y|x, t) = \frac{F_Y(y|x, h_t) - F_Y(y|x)}{t} \to D_h(y|x) := -f_Y(y|x)\,h(F_Y(y|x)|x).$$

The convergence holds uniformly in any compact subset of $\mathcal{YX} := \{(y, x): y \in \mathcal{Y}_x, x \in \mathcal{X}\}$, for every $\|h_t - h\|_\infty \to 0$, where $h_t \in \ell^\infty(\mathcal{UX})$, and $h \in C(\mathcal{UX})$.

Proof of Lemma 8: We have that for any $\delta > 0$, there exists $\epsilon > 0$ such that for $u \in B_\epsilon(F_Y(y|x))$ and for small enough $t \geq 0$

$$1\{Q_Y(u|x) + t h_t(u|x) \leq y\} \leq 1\{Q_Y(u|x) + t(h(F_Y(y|x)|x) - \delta) \leq y\};$$

whereas for all $u \notin B_\epsilon(F_Y(y|x))$, as $t \searrow 0$,

$$1\{Q_Y(u|x) + t h_t(u|x) \leq y\} = 1\{Q_Y(u|x) \leq y\}.$$

Therefore, for small enough $t \geq 0$,

$$\frac{\int_0^1 1\{Q_Y(u|x) + t h_t(u|x) \leq y\}du - \int_0^1 1\{Q_Y(u|x) \leq y\}du}{t} \leq \int_{B_\epsilon(F_Y(y|x))} \frac{1\{Q_Y(u|x) + t(h(F_Y(y|x)|x) - \delta) \leq y\} - 1\{Q_Y(u|x) \leq y\}}{t}\,du, \tag{B.1}$$

which by the change of variable $\tilde y = Q_Y(u|x)$ is equal to

$$\frac{1}{t}\int_{J \cap [y,\, y - t(h(F_Y(y|x)|x) - \delta)]} f_Y(\tilde y|x)\,d\tilde y,$$

where $J$ is the image of $B_\epsilon(F_Y(y|x))$ under $u \mapsto Q_Y(\cdot|x)$. The change of variable is possible because $Q_Y(\cdot|x)$ is one-to-one between $B_\epsilon(F_Y(y|x))$ and $J$.

Fixing $\epsilon > 0$, for $t \searrow 0$, we have that $J \cap [y, y - t(h(F_Y(y|x)|x) - \delta)] = [y, y - t(h(F_Y(y|x)|x) - \delta)]$, and $f_Y(\tilde y|x) \to f_Y(y|x)$ as $F_Y(\tilde y|x) \to F_Y(y|x)$. Therefore, the right hand term in (B.1) is no greater than

$$-f_Y(y|x)(h(F_Y(y|x)|x) - \delta) + o(1).$$

Similarly $-f_Y(y|x)(h(F_Y(y|x)|x) + \delta) + o(1)$ bounds (B.1) from below. Since $\delta > 0$ can be made arbitrarily small, the result follows.

To show that the result holds uniformly in $(y, x) \in K$, a compact subset of $\mathcal{YX}$, we use Lemma 5. Take a sequence of $(y_t, x_t)$ in $K$ that converges to $(y, x) \in K$, then the preceding argument applies to this sequence, since the function $(y, x) \mapsto -f_Y(y|x)h(F_Y(y|x)|x)$ is uniformly continuous on $K$. This result follows by the assumed continuity of $h(u|x)$, $F_Y(y|x)$ and $f_Y(y|x)$ in both arguments, and the compactness of $K$. □
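The derivative formula in Lemma 8 can be checked numerically in a simple unconditional case. The sketch below assumes, purely for illustration, a unit exponential outcome with $Q_Y(u) = -\log(1-u)$ and $f_Y(y) = e^{-y}$, a fixed continuous direction $h(u) = 1 + u$ used as $h_t$ for every $t$, and a midpoint grid to approximate the integral over $u$; the helper name `difference_quotient` is hypothetical.

```python
import numpy as np

N = 1_000_000
u = (np.arange(N) + 0.5) / N          # midpoint grid on (0, 1)
Q = -np.log(1.0 - u)                  # quantile function of Exp(1)
h = 1.0 + u                           # hypothetical continuous direction h(u)

def cdf_from_quantile(q_vals, y):
    """F(y) = int_0^1 1{Q(u) <= y} du, approximated by a grid average."""
    return np.mean(q_vals <= y)

def difference_quotient(y, t):
    """[F_Y(y | h_t) - F_Y(y)] / t with the perturbed quantile Q + t h."""
    return (cdf_from_quantile(Q + t * h, y) - cdf_from_quantile(Q, y)) / t

y0 = 1.0
F_y0 = 1.0 - np.exp(-y0)              # F_Y(y0) for Exp(1)
f_y0 = np.exp(-y0)                    # f_Y(y0) for Exp(1)
hadamard_derivative = -f_y0 * (1.0 + F_y0)   # -f_Y(y) h(F_Y(y))

errors = {t: abs(difference_quotient(y0, t) - hadamard_derivative)
          for t in (0.1, 0.01, 0.001)}
```

The entries of `errors` shrink as $t$ decreases, which is the convergence $D_{h_t}(y|x,t) \to D_h(y|x)$ at a single point; the lemma additionally asserts uniformity over compact sets, which this sketch does not examine.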
B.3. Proof of Lemma 1. This result follows by the Hadamard differentiability of the conditional distribution function with respect to the conditional quantile function in Lemma 8, Condition Q, and the functional delta method in Lemma 3. □

B.4. Proof of Theorem 1. The joint uniform convergence result follows from Condition D by the extended continuous mapping theorem in Lemma 2, since the integral is a continuous operator. Gaussianity of the limit process follows from linearity of the integral. □

B.5. Proof of Theorem 2. The joint uniform convergence result and Gaussianity of the limit process follow from Theorem 1 by the functional delta method in Lemma 3 since the quantile operator is Hadamard differentiable (see, e.g., Doss and Gill, 1992). □

B.6. Proof of Corollary 1. This result follows from Theorem 1 by the extended continuous mapping theorem in Lemma 2. □

B.7. Proof of Corollary 2. This result follows from Theorem 2 by the extended continuous mapping theorem in Lemma 2. □

B.8. Proof of Corollary 3. This result follows from Theorem 1 by the functional delta method in Lemma 3 and the chain rule for Hadamard differentiable functionals in Lemma 4. □

B.9. Proof of Theorem 3. This result follows from the functional delta method for the bootstrap and other simulation methods in Lemma 6. □

Appendix C. Limit distribution for the estimators of the effects
Limit distribution for the estimators of the effects
For policy interventions that can be implemented either as a known transformation
of the covariate,
X, we can
Yj'
—
Xj
=
also identify
g{Xo), or as a change in the conditional distribution of
and estimate the distribution
Yq, j,k G {0,1}, under Condition
RP
results provide estimators for the distribution
limit distribution theory for them. Let
Lemma
Y
of the effect of the policy,
stated in the
main
text.
and quantile functions
The
given
Aj
following
of the effects
V={6eR: S = y — y,yEy,yE
=
and
y}.
9 (Limit distribution for estimators of conditional distribution and quantile func-
tions). Let
Qao{u\x)
=
QYo{u\g{x))
-
Qv-o(u|.t)
and Qai{u\x)
=
Qyi{u\x)
-
QYg{u\x) be
estimators of the conditional quantile function of the effect Q/\^{u\x),j £ {0,
and RP, we have:
the conditions C, Q,
Vn
in i°°{{0,l) X X),
\/X^Vi{u, x)
zero
—
where
.
- Qa^{u\x)j
(Qa,{u\x)
:=
•3^)
\4.o('"'>
«^
'
,,,";
,
,:
Under
1}.''^
=^ \\{u,x), j G {0,1},
\/^[K){",
-
5'(3:))
^/X^Vo{u,x). The Gaussian processes (u,x) i—
mean and covariance function Q\r^{u,x,u,x) :=
''
and Vaj^{u,x)
yo{u,x)]
>
'
:=
{u,x), j G {0, 1}, have
Va
E[V/^^{u, x)Vi\r{u,x)], for j,r
G
{0,1}.
Let F^j{S\x)
=
the effects F^,^{5\x), for j
G
Jlz
x) :=
^((5. .T, 5,
effect,
and
—
9.
Under
^
The uniform convergence
G
bution processes \/n(-^Aj('5|x)
{0, 1}, follows
in
—
Lemma
{5\x)
as in the proof of
Theorem
functions).
/v
with respect to
Lemma
-^Aj((^|x)cfF\-^, (x)
the
where 5
h->
the
2.
result for the conditional quantile processes
from Conditions
Q
and
RP
by the extended
Uniform convergence of the conditional
distri-
Qa
method
in
Lemma
3.
The Hadamard
{u\x) can be established using the
differentia-
same argument
D
.
for estimators of the
conditions
M,
C,
Q,
and
marginal distribution and quantile
RP,
the
estimators
F^
F^
(6) jointly
(S)
=
converge
Gaussian processes:
VTi{Fi^[8) - Fi^{6))
in i°°{T>),
The conditional density of
of the marginal distributions of the effects
in law to the following
'
have zero mean and covariance function
8.
4 (Limit distribution
Under
G {0,1}.
Faj{5\x)),j G {0, 1}, follows from the covergence of the
quantile process by the functional delta
Fa
{0, 1},
j
hounded above and away from zero}'
to be
(3aj("|x)), j
and RP. we have:
-fA,{6\x)VA^{FA,{S\x)^x) =: Za,{S,x),
Z^ji^.x),] G
continuous mapping theorem
bility of
the conditions C, Q,
x)ZAr(i5, x)], for j,r G {0,1}.
assumed
Proof of Lemma
\/^(OAj(fi|x)
t-^
£'[Z/\,^((5,
(5|x), is
/a
{S,x)
an estimMor of the conditional distribution of
S}d,u be
{0, 1}.
%/^(FA,(5ix) -Fa/(5|x))
in i°^{T> X A!),
<
Jq l{QAji'>i'\-T)
Z^
(5),
^
^^ZA,(<5,,T;)r/Fv,(x) ^: Zi^[8), j,k G {0,1},
j,k G
{0,1}, have zero
n'/J6,6) := E[Zi^{S)Z^^/s)l forj,k.,r^s G
mean and
covariance function
{0, 1}.
Footnote: In the distribution approach, $Q_{Y_j}(u|x)$ can be obtained by inversion of the estimator of the conditional distribution.

Footnote: This assumption rules out degenerate distributions for the distribution of effects, such as constant policy effects. These "distributions" can be estimated using standard regression methods.
Under the conditions M, C, Q, and RP, the estimators $\hat Q^k_{\Delta_j}(u) = \inf\{\delta: \hat F^k_{\Delta_j}(\delta) \geq u\}$ of the marginal quantile functions of the effects jointly converge in law to the following Gaussian processes:

$$\sqrt{n}(\hat Q^k_{\Delta_j}(u) - Q^k_{\Delta_j}(u)) \Rightarrow -Z^k_{\Delta_j}(Q^k_{\Delta_j}(u))/f^k_{\Delta_j}(Q^k_{\Delta_j}(u)) =: V^k_{\Delta_j}(u), \quad j,k \in \{0,1\},$$

in $\ell^\infty((0,1))$, where $f^k_{\Delta_j}(\delta) = \int_{\mathcal{X}} f_{\Delta_j}(\delta|x)\,dF_{X_k}(x)$ and $u \mapsto V^k_{\Delta_j}(u)$, $j,k \in \{0,1\}$, have zero mean and covariance function $\Omega^{jk,rs}_{V}(u, \tilde u) := E[V^k_{\Delta_j}(u)V^s_{\Delta_r}(\tilde u)]$, for $j,k,r,s \in \{0,1\}$.

Proof of Theorem 4. The uniform convergence result for the marginal distribution functions follows from the convergence of the conditional processes in Lemma 9 by the extended continuous mapping theorem in Lemma 2, since the integral is a continuous operator. Gaussianity of the limit process follows from linearity of the integral. The uniform convergence result for the quantile function follows from the convergence of the distribution function by the functional delta method in Lemma 3, since the quantile operator is Hadamard differentiable (see, e.g., Doss and Gill, 1992). □

Appendix D. Inference Theory for Counterfactuals Estimators: The Case with Estimated Covariate Distributions

This section presents additional results for the case where the covariate distributions are estimated. These results complement the analysis in the main text.

D.1. Limit theory, bootstrap, and other simulation methods. We start by restating Condition D to incorporate the assumptions about the estimators of the covariate distributions.
Condition DC. (a) Let $\hat Z_j(y,x) := \sqrt{n}\,(\hat F_{Y_j}(y|x) - F_{Y_j}(y|x))$ and $\hat G_{X_k}(f) := \sqrt{n}\int f\,d(\hat F_{X_k}(x) - F_{X_k}(x))$, where $\hat F_{X_k}$ are estimated probability measures, for $j,k \in \{0,1\}$. These measures must support the P-Donsker property, namely

$$(\hat Z_0, \hat Z_1, \hat G_{X_0}, \hat G_{X_1}) \Rightarrow (\sqrt{\lambda_0}Z_0, \sqrt{\lambda_1}Z_1, \sqrt{\lambda_0}G_{X_0}, \sqrt{\lambda_1}G_{X_1})$$

in the space $\ell^\infty(\mathcal Y\times\mathcal X)\times\ell^\infty(\mathcal Y\times\mathcal X)\times\ell^\infty(\mathcal F)\times\ell^\infty(\mathcal F)$, for each $F_{X_k}$-Donsker class $\mathcal F$, where the right hand side is a zero mean Gaussian process and $\lambda_j$ is the limit of the ratio of the sample size in group $j$ to the total sample size $n$, for $j \in \{0,1\}$.
(b) The function class $\{F_{Y_j}(y|\cdot), y \in \mathcal Y\}$ is $F_{X_k}$-Donsker, for $j,k \in \{0,1\}$.

The condition on the estimated measure is weak and is satisfied when $\hat F_{X_k}$ is an empirical measure based on a random sample. Moreover, the condition holds for various smooth empirical measures; in fact, in this case the class of functions $\mathcal F$ for which DC(a) holds can be much larger than Glivenko-Cantelli or Donsker (see Radulovic and Wegkamp, 2003, and Gine and Nickl, 2008). Condition DC(b) is also a weak condition that holds for rich classes of functions, see, e.g., van der Vaart (1998).
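Theorem 5 (Limit distribution and inference theory for counterfactual marginal distributions). (1) Under conditions M and DC the estimators $\hat F^k_{Y_j}(y) = \int_{\mathcal X}\hat F_{Y_j}(y|x)\,d\hat F_{X_k}(x)$ of the marginal distribution functions $F^k_{Y_j}(y)$ jointly converge in law to the following Gaussian processes:

$$\sqrt{n}\,\big(\hat F^k_{Y_j}(y) - F^k_{Y_j}(y)\big) \Rightarrow \sqrt{\lambda_j}\,Z^k_j(y) + \sqrt{\lambda_k}\,G_{X_k}(F_{Y_j}(y|\cdot)) =: \bar Z^k_j(y), \quad j,k \in \{0,1\}, \qquad \text{(D.1)}$$

in $\ell^\infty(\mathcal Y)$, where $y \mapsto \bar Z^k_j(y)$, $j,k \in \{0,1\}$, have zero mean and covariance function, for $j,k,r,s \in \{0,1\}$,

$$\bar\Sigma^{ks}_{jr}(y,\tilde y) := \sqrt{\lambda_j\lambda_r}\,\Sigma^{ks}_{jr}(y,\tilde y) + \sqrt{\lambda_k\lambda_s}\,E\big[G_{X_k}(F_{Y_j}(y|\cdot))\,G_{X_s}(F_{Y_r}(\tilde y|\cdot))\big], \qquad \text{(D.2)}$$

where $\Sigma^{ks}_{jr}$ is defined as in (3.4).
(2) Any bootstrap or other simulation method that consistently estimates the law of the empirical process $(\hat Z_0, \hat Z_1, \hat G_{X_0}, \hat G_{X_1})$ in the space $\ell^\infty(\mathcal Y\times\mathcal X)\times\ell^\infty(\mathcal Y\times\mathcal X)\times\ell^\infty(\mathcal F)\times\ell^\infty(\mathcal F)$ also consistently estimates the law of the empirical process $(\hat Z^0_0, \hat Z^1_0, \hat Z^0_1, \hat Z^1_1)$, with components $\hat Z^k_j := \sqrt{n}\,(\hat F^k_{Y_j} - F^k_{Y_j})$, in the space $\ell^\infty(\mathcal Y)\times\ell^\infty(\mathcal Y)\times\ell^\infty(\mathcal Y)\times\ell^\infty(\mathcal Y)$.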
Proof of Theorem 5: The first part of the theorem follows by the functional delta method in Lemma 3 and the Hadamard differentiability of the marginal functions demonstrated in Lemma 10 below with $t = 1/\sqrt{n}$. The second part of the theorem follows by the functional delta method for the bootstrap and other simulation methods in Lemma 6. □

The expressions for the covariance functions can be further characterized in some leading cases:

(1) The distributions of the covariates in groups 0 and 1 correspond to different populations and are estimated by the empirical distributions using mutually independent random samples. In this case $G_{X_0}$ and $G_{X_1}$ are independent integrals over Brownian bridges, and the second component of the covariance function in (D.2) is $\int_{\mathcal X}[F_{Y_j}(y|x) - F^k_{Y_j}(y)][F_{Y_r}(\tilde y|x) - F^k_{Y_r}(\tilde y)]\,dF_{X_k}(x)$ for $k = s$ and zero for $k \ne s$.

(2) The covariates in group 1 are known transformations of the covariates in group 0, $X_1 = g(X_0)$, and the covariate distribution in group 0 is estimated by the empirical distribution from a random sample. In this case $G_{X_0}$ and $G_{X_1}$ are highly dependent processes. The second component of the covariance function in (D.2) is $\int_{\mathcal X}[F_{Y_j}(y|x) - F^0_{Y_j}(y)][F_{Y_r}(\tilde y|x) - F^0_{Y_r}(\tilde y)]\,dF_{X_0}(x)$ for $k = s = 0$, $\int_{\mathcal X}[F_{Y_j}(y|g(x)) - F^1_{Y_j}(y)][F_{Y_r}(\tilde y|g(x)) - F^1_{Y_r}(\tilde y)]\,dF_{X_0}(x)$ for $k = s = 1$, and $\int_{\mathcal X}[F_{Y_j}(y|x) - F^0_{Y_j}(y)][F_{Y_r}(\tilde y|g(x)) - F^1_{Y_r}(\tilde y)]\,dF_{X_0}(x)$ for $k \ne s$.
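The plug-in estimator of Theorem 5, which integrates an estimated conditional distribution of group j against the estimated covariate distribution of group k, can be sketched as follows. This is only an illustration under strong simplifying assumptions that are not in the paper: a single discrete covariate, a cell-by-cell empirical conditional distribution, and hypothetical toy data; the function names are invented for the example.

```python
from collections import defaultdict

def conditional_cdf(ys, xs):
    """Cell-by-cell empirical conditional CDF F(y|x) for a discrete covariate."""
    cells = defaultdict(list)
    for y, x in zip(ys, xs):
        cells[x].append(y)

    def F(y, x):
        cell = cells[x]  # fails with ZeroDivisionError if x was never observed
        return sum(v <= y for v in cell) / len(cell)

    return F

def counterfactual_cdf(y, F_cond, xs_k):
    """Average F_cond(y|x) over the empirical covariate distribution xs_k."""
    return sum(F_cond(y, x) for x in xs_k) / len(xs_k)

# Hypothetical toy data: outcomes and covariates in group 0,
# and covariates only in group 1.
y0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x0 = [0, 0, 0, 1, 1, 1]
x1 = [1, 1, 1, 1, 0, 0]

F0 = conditional_cdf(y0, x0)
status_quo = counterfactual_cdf(3.0, F0, x0)      # group 0 outcomes, group 0 covariates
counterfactual = counterfactual_cdf(3.0, F0, x1)  # group 0 outcomes, group 1 covariates
```

With own-group covariates the plug-in reproduces the unconditional empirical distribution, which is a convenient sanity check; the sampling variation of both ingredients is what the two terms in (D.1) account for.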
Corollary 4. Limit distribution theory and validity of bootstrap and other simulation methods for the estimators of the marginal quantile function, quantile policy effects, distribution policy effects, and differentiable functionals can be obtained using similar arguments to Theorems 2 and 3, and Corollaries 1-3 with obvious changes of notation.

D.2. Hadamard derivatives of marginal functionals. In order to state the next result, we define the pseudometric $\rho_{L^2(P)}$ on $\mathcal Y\times\mathcal X$ and on $\mathcal F$ by
1/2
Pi2fp^((y,a;),(y,x))
'LHP)
= ElZjiy.x) -
Zj{y,:
1/2
-I
=
p'iHP){fJ)
follows from
It
under
Lemma
the completion of
x
3^
compact. Likewise, JF
Lemma
-^^
is
T
^
on
totally
yx
;
,.,
-.^^
and
mapping
,
,
/c
is,
we
x A!
p'^iip)
totally
is
each
for
D^ C
p\iip\ for each k.
D=
j.
bounded
Moreover,
,
E=
j,ke{(),l].
j Fy^{-\x)dFx,{x),
is
'
^°^'{yX) x i"°{T) h^
bounded maps f
dPi := d{F\,^
t-H>
t^iy),
:
j fdFx^, where
the sequence
.
•
:.
;
ini^[yX).
Pi^PkeC{:F,pl^p^)
me,^{r),
G
T
and j,k G
{0, 1}.
{0, 1}.
Then, as
t
Finally,
we assume
\0
0(F^,,Fj,J-0(Fy^,F;,,;
identify Fx^^ with the
map /
i—>
J fdFxi,
{Fy
- Fx,)/{ty%), and
<P\Py,
That
y
with respect to either of the pseudometrics
,Fx,
t
i&
k 6 {0,1}.
a]-^a,eC{yX,p{,^^p;)
for the Fx^'Donsker class
j,
:
the space of
.
.
Fx^-Donsker, for
X
X, forj.k G {0,1}.^^ Consider
,,,,_.
is
x
bounded under
- Fy^)l{t^),
that for a] := (F^^
for
the product of the space of the conditional distribution functions
is
tribution function on
t\Q
y
denoted
A',
4>{Fy^,Fx,):^
where the domain D^
Fy{-\-)
,
18.15 in van der Vaart (1998) that
10. Consider the
..;-
E{GxM)-GxM)y
and Zj has continuous paths with respect to
P^i2ip)
forj G {0,1},
,
in i^{J^).
(aj,A),
•:
Fx,.
F*^
,
)
pi{f) :=
^
.,:
,:.
a dis-
is
G D^ such
jfddl
.;
.
that {F)-^{y\x),y
as
^
=
G y}
where
and
the derivative m.ap {a,p)
Proof of
%
Lemma
The
continuous.
~'^'FY,FxMj^0k) as
Fy^ {dpi
- dA) + \/AA
/ (^Ml^i
+ \/V^
/ ("^
- "jO^ci/J^
|
Since a^
is
third term vanishes by the
term vanishes, since
J{a'j
—
<
aj)tdP\.\
=
Let 7rm(y,:r)
—
||aj
argument provided below.
Qj||y;f
/
\td3i\
continuous on the compact semi-metric space
is
measurable partition U,"ii3^^,:m of
within
is
'^(n,.F"x,)-<l'(Fyyi'x,)
term of (D.3)
by assumption.
finite
E,
to
bounded by |JqJ— Q'j|J3;;t' / dFxf. -^ 0. The second term vanishes,
any Fx^-Donsker set T, J /dpi —> J fdpk in i°°{T), and {Fvv(y|x),y ^ y} C f
first
since for
mapping D^
(a,/5),
p
10. Write
- a',>/Fv, +
^/^tJ
I (QJ
The
h-> (p'p
yX^m
each
?';
a^tdPl
also let limilj^^)
<
2\\Qj
<
2e
-
=
^{{iJ,^')
+
aj O TTr^Wy;^
+ J]
2||Qf^
(J^.-f
,
—
Q;j||y;t'
p-^s/p)),
varies less than
G y^im, where [yim^^im)
(yi„i,x,>n) if (y,x)
for
yX such that a^
<
e
The fourth
—
>
0.
there exists a
on each subset.
an arbitrarily chosen point
is
£ 3^'^im}- Then
^
|a-j(y,m,
\a,{y,rn,X,r,MPk{hra
+
X,m)\tPUltr
o(l))
i=l
<
2e
+
+o(l)
|Qj||y;tmaX/5fc(l.m)
^771
7<77l
<
since {Itm,?-
< w}
is
2e
+ O(0,
a FA-^-Donsker
The constant
class.
e
is
arbitrary, so the left
hand
side of the preceding display converges to zero.
Finally, the
map
is
trivially
respect to
map
is
norm on
•
||
D is
given by
•
||
||:va'
continuous with respect to
\\yx
by the
continuous.
first
term
V
•
||
•
||
||.f
.
||jr.
The second component
of the derivative
The
continuous with
in (D.3) vanishing, as
first
component
is
shown above. Hence the derivative
D
Appendix E. Functional Delta Method and Bootstrap and Other Simulation Methods for Z-processes

This section derives a preliminary result that is key to deriving the limit distribution and inference theory for various estimators of the conditional distribution and quantile functions. This result shows that suitably defined Z-estimators satisfy a functional central limit theorem and that we can estimate their laws using bootstrap and related methods. The result follows from a lemma that establishes Hadamard differentiability of Z-functionals in spaces that are particularly well-suited for our applications.

E.1. Limit distribution and inference theory for approximate Z-processes. Let us consider an index set $\mathcal T$ and a set $\Theta \subset \mathbb R^p$. We consider Z-estimation processes $\{\hat\theta(u), u \in \mathcal T\}$, where for each $u \in \mathcal T$, $\hat\theta(u)$ satisfies $\|\hat\Psi(\hat\theta(u),u)\| \le \inf_{\theta\in\Theta}\|\hat\Psi(\theta,u)\| + \epsilon_n$, with $\epsilon_n \searrow 0$ at some rate. That is, $\hat\theta(u)$ is an approximate solution to the problem of minimizing $\|\hat\Psi(\theta,u)\|$ over $\theta \in \Theta$. The random function $(\theta,u) \mapsto \hat\Psi(\theta,u)$ is an estimator of some fixed population function $(\theta,u) \mapsto \Psi(\theta,u)$, and satisfies a functional central limit theorem. The following lemma specifies conditions under which the Z-processes satisfy a functional central limit theorem, and under which bootstrap and other simulation methods consistently estimate the law of this process.
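To make the notion of an approximate Z-estimator concrete, the sketch below uses the simplest case, a sample quantile, where the empirical moment function is a step function and typically has no exact zero. The data and function names are hypothetical, and the search over sample points is a convenience of this illustration rather than a method from the paper.

```python
def psi_hat(theta, u, sample):
    """Empirical moment E_n[u - 1{Y <= theta}] for the u-th quantile."""
    return u - sum(y <= theta for y in sample) / len(sample)

def approx_z_estimator(u, sample):
    """Minimise |psi_hat| over theta.

    psi_hat is piecewise constant in theta, so every value it takes is
    attained at a sample point or at any point below the sample minimum.
    """
    candidates = [min(sample) - 1.0] + sorted(sample)
    return min(candidates, key=lambda th: abs(psi_hat(th, u, sample)))

sample = [2.3, 0.7, 1.9, 3.1, 0.2, 2.8, 1.1]  # hypothetical data, n = 7
u = 0.5
theta_hat = approx_z_estimator(u, sample)
residual = abs(psi_hat(theta_hat, u, sample))
```

With n = 7 and u = 0.5 the moment function never equals zero, so no exact Z-estimator exists, yet the minimiser leaves a residual of at most 1/n, which is the kind of slack that the tolerance $\epsilon_n$ absorbs.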
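Lemma 11 (Limit distribution and inference theory for approximate Z-processes). Let $\mathcal T$ be a relatively compact set of some metric space, and $\Theta$ be a compact subset of $\mathbb R^p$. Assume that

(i) for each $u \in \mathcal T$, $\Psi(\cdot,u) : \Theta \mapsto \mathbb R^p$ possesses a unique zero at $\theta_0(u) \in$ interior $\Theta$, and has inverse $\Psi^{-1}(\cdot,u)$ that is continuous at zero uniformly in $u \in \mathcal T$,
(ii) $\Psi(\cdot,u)$ is continuously differentiable at $\theta_0(u)$ uniformly in $u \in \mathcal T$, with derivative $\dot\Psi_{\theta_0(u),u}$ that is uniformly non-singular, namely $\inf_{u\in\mathcal T}\inf_{\|h\|=1}\|\dot\Psi_{\theta_0(u),u}h\| > 0$,
(iii) $\sqrt{n}(\hat\Psi - \Psi) \Rightarrow Z$ in $\ell^\infty(\Theta\times\mathcal T)$, where $Z$ is a.s. continuous on $\Theta\times\mathcal T$ with respect to the Euclidean metric,
(iv) Bootstrap or some other method consistently estimates the law of $\sqrt{n}(\hat\Psi - \Psi)$.

For each $u \in \mathcal T$, let $\hat\theta(u)$ be such that $\|\hat\Psi(\hat\theta(u),u)\| \le \inf_{\theta\in\Theta}\|\hat\Psi(\theta,u)\| + \epsilon_n$, with $\epsilon_n = o(n^{-1/2})$. Then, under conditions (i)-(iii)

$$\sqrt{n}\,(\hat\theta(\cdot) - \theta_0(\cdot)) \Rightarrow -\dot\Psi^{-1}_{\theta_0(\cdot),\cdot}\,[Z(\theta_0(\cdot),\cdot)] \quad \text{in } \ell^\infty(\mathcal T).$$

Moreover, any bootstrap or other method that satisfies condition (iv) consistently estimates the law of the empirical process $\sqrt{n}(\hat\theta - \theta_0)$ in $\ell^\infty(\mathcal T)$.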
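As a rough numerical illustration of the last claim of Lemma 11, the sketch below bootstraps the sample median, a Z-estimator with moment function u - 1{Y <= theta}, and compares the bootstrap spread of the centred and scaled estimator with the spread implied by the limit in the lemma. The setup (uniform data, the ordinary nonparametric bootstrap, the sample size, and all names) is an assumption of this example, not taken from the paper.

```python
import math
import random

def quantile_z(u, sample):
    """u-th sample quantile, an approximate zero of E_n[u - 1{Y <= theta}]."""
    ordered = sorted(sample)
    return ordered[max(math.ceil(u * len(ordered)), 1) - 1]

random.seed(1)
n, B, u = 400, 300, 0.5
data = [random.random() for _ in range(n)]  # Y ~ U(0,1), density 1 at the median
theta_hat = quantile_z(u, data)

# Nonparametric bootstrap draws of sqrt(n) * (theta* - theta_hat).
draws = []
for _ in range(B):
    resample = random.choices(data, k=n)
    draws.append(math.sqrt(n) * (quantile_z(u, resample) - theta_hat))

mean = sum(draws) / B
boot_sd = math.sqrt(sum((d - mean) ** 2 for d in draws) / (B - 1))

# Limit from the lemma: derivative of u - F(theta) is -f(theta_0) = -1 here,
# and Z(theta_0, u) has variance u(1 - u), so the limit s.d. is sqrt(u(1-u)).
asym_sd = math.sqrt(u * (1.0 - u))
```

The two numbers should be of the same order; the bootstrap figure is noisy for quantiles at this sample size, so only rough agreement should be expected.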
Lemma
Proof of
The
11.
results follow
and by the functional delta method
Hadamard
is
some metric
map
an r-approximate zero of the
Let
4>{-,r)
;
i°°{Q)
6
i—>
be a
6
i—^
\\zie,u)\\
<
map
Lemma
be a compact subset of
and
space,
Lemma
Lemma 6, and
with = l/>/n-
12
z{9, u)
some
for
if
inf \\z(e',u)\\
that assigns one of
+
T
the
W. An
be a relatively
Q
element 6 E
>
r
r.
r-approximate zeroes
its
3
(
on the following lemma. Let
of the preceding result relies
set of
in
bootstrap and other methods in
for
differentiability of Z-functionals established in
The proof
compact
by the functional delta method
(p{z{-,
u), r)
"
G
to each element z{-,v.)
Lemma
lemma
z
:
Q
Assume
12.
T
that conditions (i)
Take any
hold.
X
(.'^{Q).
W,
t—f
and suppose
we have
Here
that,
it is
uniformly
m
Hadamard
tzt{-,
as
t
\
0,
/
Zt
bounded functions on T, which
T
as
u) denoted as Ot{u)
—
0(^(-, w)
where n
the sample
is
meet
in our context
over the parameter space
is
T=
See
Van
Lemma
(!°^{J^
Moreover, our
lemma
to cover quantile regression processes,
on
X
T
for a
12.
map
z
:
normed
— ^)
x T), which appears to be
to be totally bounded, which
Proof of Lemnia
+
Then, for the
/::((•,
u), tqt{u))
spaces.
Lemma
The
con-
the collection of
all
an extremely large parameter space. In particular, to
der Vaart and Wellner (1996) p.
3.9.34.
0.
size.
/?°°(T),
is
396
indexed by J^
weak convergence
hard to attain when
for
a
=
£°°{T)
difficult to attain in appli-
cations such as quantile regression processes. Indeed, note that
J-
t
map
because they include the uniform
that the empirical processes ^/n{'^
converge weakly in the space
space requires
\
uniformly on
as l/y/n,
are difficult to
lemma we need
for a continuous
an alternative to van der Vaart and Wellner's (1996)
convergence of the functions
apply their
+
\
T
x
diiferentiability of Z-functionals in general
lemma
ditions of their
is
u)
qt
Q
stated in the preceding
*P
u E T,
useful to think of
Remark. Our lemma
3.9.34 on
<!'(•,
that
on the function
(li)
—^ z uniformly on
Zi
fqt{n)- approximate zero of
and
comment on
J-' is
in this
too rich a space.
the limitation of their
allows for approximate Z-estimators. This allows us
where exact Z-estimators do not
We have that ^(6'o(n), u) =
Q x T i-^ £°^{Q x T) that is
for all
u G T. Let
exist.
Zf
—> z uniformly
continuous at each point, and
qt
\
uniformly
in
u E
T
t\0. By
as
Tlie the rest of
a rate of convergence for
concerning the Unear representation
Step
we
3,
Step
=
In Step
=
in
ri(^(0,(u), u)
-
and that uniformly
in
2,
<
We
u G T.
it
In Step
we
—
assuming that
0{-)),
we establish
1,
main claim
verify the
u G T,
li
G
+
\\t-^^{9o{u).,u)
=
-zt{9t{u), u)
-
\\-^{9t{u),u)
follows that uniformly in
-
\\9i{u)
+
of the
=
A;(-)
lemma
o(l).
In
=
is
+
0{t).
=
G T, as
v.
u)
-
\\z{9o{u),u)
+
c"^||^(6',,(ii),
Zt{9o{u),u)\\
0(At(u)
'^{9o{a),u)\\
<
9o{u)\\
conclude that uniformly in
^{9o{u), u))
u G T,
in
has a unique zero at 9o{u) and has an inverse that
hence
tzt{-,u),tqt{u)) satisfies
o(l).
0{t). Note that A,(u)
0(1) uniformly
+
iias tliree steps.
for t~^{6t{-)
Here we show that uniformly
1.
'I'(^o("),^^)||
o(l)||
=
verify that At(-)
0('I'(-,u)
\\^{B,u)+tZi{e,u)\\+tqt{u) =: t\,,{u)+tq,{u),
proof
tlie
to 6{-).
9t{-)
M
<
\\^{et{u),u)-<l'{Bo{u),u)+tZt{e;{u),u)\\
uniformly in u G T.
=
definition 9t{u)
=
qt{u))
\
/
0(1)
By assumption
'!'(•,(/,)
continuous at zero uniformly in u G T;
T",
\\9t{u)-9a{u)\\<dH{^-\^{Bt{u),u).u),<^-\{),u))-^Q,
where dn
the Hausdorff distance.
is
-
formly in ^ G T, ||^(^,(u), u)
so that uniformly in u
eT
By continuous
differentiability
-
-
^[9^{u),u)
^,o(„).„[^t(^^)
assumed
d^{u)]\\
=
to hold uni-
'-
9^{il)\\)
'
'.
'
!
'
.
_
,
-
o[\\9t{u)
'
'
\\<l'i9t{u),u)
~
mu)
-
''
'
^.^.^^
t\o
<i'{9oiu),u)\\
J^g,^^)J9,{u)-9o{u)]\\
>^.^j^^f^'^'^
-
||^,(u)-^o(^)||
9o{u)\\
>inf||,,|l=i||^(,o(u),u(/iOII
where h ranges over
W,
and
c
>
by assumption. Thus, uniformly
9o{u)\\<c-'\\<I'{Bt{u)^u)-<iJ{9o{u),u)\\
Step
=
0{t).
uniformly in u again, conclude \\'i{9t{u),u)
T
will
show that
by assumption.
=
^ieo{u), u))
....
,
Xf{'ii)
=
o(l)
—
—
'i{9o{u),u)
u G T,
\\9t{u)
—
.,.,..,.
and we
also have
*i'eo(u),u[^((u)
qt{'ii)
Thus, we can conclude that uniformly
-zt{9tiu), u)
t-%{u) .
in
Here we verify the main claim of the lemma. Using continuous differentiability
2.
Below we
= c>0,
9o{u)]
'
.
+
o(l)
=
-zi9oiu), u)
+
o(l)
in
=
-
6'o(u)]||
o(l) uniformly in
o{t).
(/.
u E T, t~^{'i/{9t{u),u)
and
=
%^l^^,^[r\<^{9tiu).u)-<i/{9oiu),u))
=
-^,4),j2(^oW,^^)] +
o(l).
=
+
o{l)]
,
G
—
Step
we show that
In this step
3.
-
=
Xt{u)
=
o(l) uniformly in u
G T. Note that
for
+ 0{i), we have that Ot G 9, for small
—
enough t, uniformly in u G T; moreover, Xt{u) < ||i~^^(^t(w), u) + Zt{9t{u),u)\\ =
D
^eo{u)AK\u)M^o{u),u)]} + ::{eoiu),u) + o{l)\\ = o{l),ast\0.
Oti'u)
:= Oo{u)
i^~^(^)^^ [2:(6'o(a),(i)]
Oo{u)
||
Appendix F. Z-Estimators of Conditional Quantile and Distribution Functions

This section derives limit theory for the principal estimators of conditional distribution and quantile functions. These results establish the validity of bootstrap and other resampling plans for the entire quantile regression process, the entire distribution regression process, and related processes arising in estimation of various conditional quantile and distribution functions. These results may be of a substantial independent interest.

In order to prove the results, we use Lemmas 11 and 12. We also specify some primitive conditions that cover all of our leading examples. In all these examples, we have functional parameter values $u \mapsto \theta(u)$ where $u \in \mathcal T \subset \mathbb R^d$ and $\theta(u) \in \Theta \subset \mathbb R^p$, where for each $u \in \mathcal T$, $\theta_0(u)$ solves the equation

$$\Psi(\theta,u) := E[g(W,\theta,u)] = 0,$$

where $g : \mathcal W\times\Theta\times\mathcal T \mapsto \mathbb R^p$, and $W := (X,Y)$ is a random vector with support $\mathcal W$. For estimation purposes we have an empirical analog of the above moment functions

$$\hat\Psi(\theta,u) = E_n[g(W_i,\theta,u)],$$

where $E_n$ is the empirical expectation and $(W_1,\ldots,W_n)$ is a random sample from $W$. For each $u \in \mathcal T$, the estimator $\hat\theta(u)$ satisfies $\|\hat\Psi(\hat\theta(u),u)\| \le \inf_{\theta\in\Theta}\|\hat\Psi(\theta,u)\| + \epsilon_n$, with $\epsilon_n = o(n^{-1/2})$.

Condition Z.1. The set $\Theta$ is a compact subset of $\mathbb R^p$ and $\mathcal T$ is either a finite subset or a bounded open subset of $\mathbb R^d$.
(i) For each $u \in \mathcal T$, $\Psi(\theta,u) := Eg(W,\theta,u) = 0$ has a unique zero at $\theta_0(u) \in$ interior $\Theta$.
(ii) The map $(\theta,u) \mapsto \Psi(\theta,u)$ is continuously differentiable at $(\theta_0(u),u)$ with a uniformly bounded derivative on $\mathcal T$, where differentiability in $u$ needs to hold for the case of $\mathcal T$ being a bounded open subset of $\mathbb R^d$; $G(\theta,u) = \frac{\partial}{\partial\theta'}Eg(W,\theta,u)$ is uniformly nonsingular at $\theta_0(u)$, namely $\inf_{u\in\mathcal T}\inf_{\|h\|=1}\|\dot\Psi_{\theta_0(u),u}h\| > 0$.
(iii) The function set $\mathcal G = \{g(W,\theta,u), (\theta,u) \in \Theta\times\mathcal T\}$ is P-Donsker with a square integrable envelope $G$. The map $(\theta,u) \mapsto g(W,\theta,u)$ is continuous at each $(\theta,u) \in \Theta\times\mathcal T$ with probability one.

Condition Z.2. Either of the following holds: (a) the conditional distribution has the form $F_Y(u|x) = \Lambda(x,\theta_0(u))$; or (b) the quantile functions have the form $Q_Y(u|x) = Q(x,\theta_0(u))$, where the functions $\theta \mapsto \Lambda(x,\theta)$ and $\theta \mapsto Q(x,\theta)$ are continuously differentiable in $\theta$ with derivatives that are uniformly bounded over the set $\mathcal X$.

Lemma 13. Condition Z.1 implies conditions (i)-(iv) of Lemma 11. In particular, condition (iii) holds with $\sqrt{n}(\hat\Psi - \Psi) \Rightarrow Z$, in $\ell^\infty(\Theta\times\mathcal T)$, where $Z$ is a zero mean Gaussian process with continuous paths in $u \in \mathcal T$ and covariance function

$$\Omega(u,\tilde u) = E[g(W,\theta_0(u),u)\,g(W,\theta_0(\tilde u),\tilde u)'].$$

Condition (iv) holds with the set of consistent methods for estimating the law of $\sqrt{n}(\hat\Psi - \Psi)$ consisting of bootstrap and exchangeable bootstraps, more generally. Consequently, the conclusions of Lemma 11 hold, namely $\sqrt{n}(\hat\theta(\cdot) - \theta_0(\cdot)) \Rightarrow -G(\theta_0(\cdot),\cdot)^{-1}[Z(\theta_0(\cdot),\cdot)]$ in $\ell^\infty(\mathcal T)$. Moreover, bootstrap and exchangeable bootstraps consistently estimate the law of the empirical process $\sqrt{n}(\hat\theta - \theta_0)$.

This lemma presents a useful result in its own right. From the point of view of this paper, the following result, a corollary of the lemma, is of immediate interest to us since it verifies Condition D and Condition Q for a wide class of estimators of conditional distribution and quantile functions.
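Theorem 6 (Limit distribution and inference theory for Z-estimators of conditional distribution and quantile functions). 1. Under conditions Z.1-Z.2(a), the estimator $(u,x) \mapsto \hat F_Y(u|x)$ of the conditional distribution function $(u,x) \mapsto F_Y(u|x)$ converges in law to a continuous Gaussian process:

$$\sqrt{n}\,(\hat F_Y(u|x) - F_Y(u|x)) \Rightarrow Z(u,x) := -\frac{\partial\Lambda(x,\theta_0(u))}{\partial\theta'}\,G(\theta_0(u),u)^{-1}Z(\theta_0(u),u) \qquad \text{(F.1)}$$

in $\ell^\infty(\mathcal Y\times\mathcal X)$, where $(u,x) \mapsto Z(u,x)$ has zero mean and covariance function $\Sigma_Z(u,x,\tilde u,\tilde x) := E[Z(u,x)Z(\tilde u,\tilde x)]$. Moreover, bootstrap and exchangeable bootstraps consistently estimate the law of $Z$.

2. Under conditions Z.1-Z.2(b), the estimator $(u,x) \mapsto \hat Q_Y(u|x)$ of the conditional quantile function $(u,x) \mapsto Q_Y(u|x)$ converges in law to a continuous Gaussian process:

$$\sqrt{n}\,(\hat Q_Y(u|x) - Q_Y(u|x)) \Rightarrow V(u,x) := -\frac{\partial Q(x,\theta_0(u))}{\partial\theta'}\,G(\theta_0(u),u)^{-1}Z(\theta_0(u),u) \qquad \text{(F.2)}$$

in $\ell^\infty((0,1)\times\mathcal X)$, where the process $(u,x) \mapsto V(u,x)$ has zero mean and covariance function $\Sigma_V(u,x,\tilde u,\tilde x) := E[V(u,x)V(\tilde u,\tilde x)]$. Moreover, bootstrap and exchangeable bootstraps consistently estimate the law of $V$.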
We
Lemma
We
13.
finite
T
simpler,
is
V
at each pair
(/;.
=
sequence of points
||^~^(/it, Ut)
T
bounded open subset
a
is
and follows
T
of each pair
(/j(,
then note that,
Atiuuhf)
for t*
covariance func-
=
it
(ii)
G
0,u),
of R.
6*0,
and
it
the inverse
=
T
to
The proof
(i),
map
for the case
we note that by
^"'(/u,u) exists on a
continuously differentiable in
is
with u £ T, where
0(||/if||)
11.
To show condition
similarly.
T
is
by taking
=
we have that
the closure of T,
"i'^lO^u)
is
we take any sequence
uniformly continuous
{ut,ht) —> {u,h) with
u G T,
-.
^{Ooiu),u)h = Gieo{u),a)h,
^ht,u,)
-
^i0o{ut),ut)}
-
-g^i&oiut)
+
t*h,,Ut)ht
(iii),
note that by the Donsker central limit theorem for ^(^, u)
=
—
sup^cT^n^jij^i \At{u. h)
— ^)
cess with covariance function Q.{u,ii)
=
=> Z, where
Z
is
(6',
u) with probability one.
from the assumptions stated
is
that
Z
G{9o{u), u)h\
a zero
as
/
mean Gaussian
pro-
E[g{W,9o{u),u)g{\V,9o{u),u)'] that has contin-
uous paths with respect to the L2{P) semi-metric on Q.
continuous at each
i—> Oq{u).
0.
Enlg{Wi,6,u)] we have that \/n{'i
is
^
v-
\
we conclude that
To show condition
G R^ and
/i
[0, t]
rn^(^oK) +
5,
map
limits.
=
Lemma
u)
This implies that for any
using the continuity hypotheses on the derivative d'^ /dO and the continuity of
Hence by
(/u,
o(l), verifying the continuity of the inverse
can also conclude that 9o{u)
and we can extend
To show condition
=
(/j
Ut) -^ (0, u)
We
u.
Lemma
with a uniformly bounded derivative.
0, n)
'^~^iO,Ut)\\
uniformly in
at
on
—
(F.2)
u),
^
the imphcit function theorem and uniqueness of
open neighborhood
mean and
x) i—^ V{u, x) has zero
{v.,
shall verify conditions (i)-(iv) of
consider the case where
with a
-^^i|^l^G(^o('"), u)-'Z{e,{u).
E[V{u,x)V{u,x)]. Moreover, bootstrap and exchangeable bootstraps
consistently estiinate the law of
Proof of
:=
\/(a, x)
The only
The map
{0,u)
result that
also has continuous paths
on
is
x
i-^
g{\V,9,u)
not immediate
T
with respect
By assumption Z has continuous paths with respect to
PLHP){{9,u)Ae,u)) = {E[g{W,9,u)~g{W,e,u)fy/~, As |i(^,n) - {e,u)\\ ^ 0, we have
to the Euclidean metric
that g{W,9,u)
•
||
— g{W,9,u)
|{.
-^
almost surely.
It
follows by the
dominated convergence
theorem, with dominating function equal to $(2G)^2$, where $G$ is the square integrable envelope for the function class $\mathcal G$, that $\{E[g(W,\theta,u) - g(W,\tilde\theta,\tilde u)]^2\}^{1/2} \to 0$. This verifies the continuity condition. The square integrable envelope $G$ exists by assumption.

To show (iv), we simply invoke Theorem 3.6.13 in Van der Vaart and Wellner (1996) which implies that the bootstrap and exchangeable bootstraps, more generally, consistently estimate the limit law of $\sqrt{n}(\hat\Psi - \Psi)$, say $\mathbb G$, in the sense of equation (A.2). □

Proof of Theorem 6. This result follows directly from Lemma 12, the functional delta method in Lemma 3, the chain rule for Hadamard differentiable functionals in Lemma 4, and the preservation of validity of bootstrap and other methods for Hadamard differentiable functionals in Lemma 6. □

F.1. Examples of conditional quantile estimation methods. We consider the location and quantile regression models described in the text.
Example 2. Quantile regression. The conditional quantile function of the outcome variable $Y$ given the covariate vector $X$ is given by $X'\beta_0(\cdot)$. Here we can take the moment functions corresponding to the canonical quantile regression approach:

$$g(W,\beta,u) = (u - 1\{Y \le X'\beta\})X. \qquad \text{(F.3)}$$

We assume that the conditional density $f_Y(\cdot|X)$ is uniformly bounded and continuous at $X'\beta_0(u)$ uniformly in $u \in \mathcal T$, almost surely; moreover, $\inf_{u\in\mathcal T} f_Y(X'\beta_0(u)|X) > c > 0$ almost surely; and $E[XX']$ is finite and of full rank. The true parameter $\beta_0(u)$ solves $Eg(W,\beta,u) = 0$ and we assume that the parameter space $\Theta$ is such that $\beta_0(u) \in$ interior $\Theta$ for each $u \in \mathcal T = (0,1)$.

Lemma 14. Conditions Z.1-Z.2(b) hold for this example with moment function given by (F.3), $\mathcal T = (0,1)$, $Q_Y(u|x) = x'\beta_0(u)$, $G(\beta_0(u),u) = -E[f_Y(X'\beta_0(u)|X)XX']$, and $\Omega(u,\tilde u) = \{\min(u,\tilde u) - u\tilde u\}E[XX']$.

Proof of Lemma 14. To show Z.1, we need to verify conditions on the derivatives of the map $(\beta,u) \mapsto Eg(W,\beta,u)$. It is straightforward to show that we have that at $(\beta,u) = (\beta_0(u),u)$,

$$\frac{\partial}{\partial(\beta',u)}Eg(W,\beta,u) = [G(\beta,u), EX] = [-E[f_Y(X'\beta|X)XX'], EX],$$

and the right hand side is continuous at $(\beta_0(u),u)$. This follows using the dominated convergence theorem, the a.s. continuity and boundedness of the mapping $y \mapsto f_Y(y|X)$ at $X'\beta_0(u)$, as well as finiteness of $E\|X\|^2$. Finally, note that $\beta_0(u)$ is the unique solution to $Eg(W,\beta,u) = 0$ for each $u$ because it is a root of a gradient of a convex function. Moreover, uniformly in $u \in (0,1)$, $-G(\beta_0(u),u) \ge \underline{f}\,EXX' > 0$, where $\underline{f}$ is the uniform lower bound on $f_Y(X'\beta_0(u)|X)$.

To show Z.1(iii) we verify that the function class $\mathcal G$ is P-Donsker with a square integrable envelope and the continuity hypothesis. The function classes $\mathcal F_1 = \mathcal T$, $\mathcal F_2 = \{1\{Y \le X'\beta\}, \beta \in \Theta\}$, and $\{X_j\}$, $j = 1,\ldots,p$, are VC classes. Therefore the function classes $\mathcal F_{kj} = \mathcal F_k X_j$ are also VC classes because they are formed as products of a VC class with a fixed function (Lemma 2.6.18 in van der Vaart and Wellner, 1996). The difference $\mathcal F_{1j} - \mathcal F_{2j}$ is a Lipschitz transform of VC classes, so it is P-Donsker by Example 19.9 in van der Vaart, 1998. The collection $\mathcal G = \{\mathcal F_{1j} - \mathcal F_{2j}, j = 1,\ldots,p\}$ is thus also Donsker. The envelope is given by $2\max_j|X_j|$, which is square-integrable. Finally, the map $(\beta,u) \mapsto (u - 1\{Y \le X'\beta\})X$ is continuous at each $(\beta,u) \in \Theta\times\mathcal T$ with probability one by the absolute continuity of the conditional distribution of $Y$.

To show Z.2(b), we note that the map $(x,\theta) \mapsto x'\theta$ trivially verifies the hypotheses of Z.2(b) provided the set $\mathcal X$ is compact. □
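The covariance kernel stated in Lemma 14 for the quantile regression moment function can be checked by simulation. The sketch below does this in a deliberately simple design that is an assumption of the example, not of the paper: a scalar positive covariate, no intercept, and Y = X·V with V uniform on (0,1), so that the conditional u-quantile is X·u and the true coefficient is β0(u) = u.

```python
import random

def g(y, x, beta, u):
    """Quantile regression moment function (u - 1{Y <= x*beta}) * x, scalar x."""
    return (u - (1.0 if y <= x * beta else 0.0)) * x

def kernel(u, v, ex2):
    """Kernel (min(u, v) - u*v) * E[X^2], the scalar version of Lemma 14."""
    return (min(u, v) - u * v) * ex2

random.seed(0)
n = 50_000
xs = [random.uniform(1.0, 2.0) for _ in range(n)]
ys = [x * random.random() for x in xs]  # Y = X*V, V ~ U(0,1), so beta_0(u) = u

u, v = 0.3, 0.7
# Empirical covariance of the moment functions at the true parameters
# (their means are zero at beta_0, so the raw cross moment suffices).
emp = sum(g(y, x, u, u) * g(y, x, v, v) for y, x in zip(ys, xs)) / n
ex2 = sum(x * x for x in xs) / n
theory = kernel(u, v, ex2)
```

In this design the two quantities should agree up to simulation noise, which illustrates why the kernel factorises into a Brownian-bridge term in (u, v) and the second moment of the covariates.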
Example 1. Classical regression. This is the location model $Y = X'\beta_0 + V$, where $X$ is independent of $V$, so the conditional quantile function of outcome variable $Y$ given the conditioning variable $X$ is given by $X'\beta_0 + \alpha_0(\cdot)$, where $E[Y|X] = X'\beta_0$ and $\alpha_0(\cdot) = Q_V(\cdot)$. Here we can take the moment functions corresponding to using least squares to estimate $\beta_0$ and sample quantiles of residuals to estimate $\alpha_0$:

$$g(W,\alpha,\beta,u) = [(u - 1\{Y - X'\beta \le \alpha\}), (Y - X'\beta)X']'.$$
We
assume that the density
V = Y—
of
X'Po,
/v'(')
uniformly bounded and
is
tinuous at Qo(^') uniformly in u G T, almost surely; moreover, inf^gT- /(q:o(u))
almost surely;
EXX'
(ao(u),/3o)' solves
that (oo('"),/3o)' e
Lemma
(F.4),
is
finite,
and
Eg{W,Q,P,xi)
interior
for
=
full
rank, and EY'^
<
oo.
(0, 1),
Qy{u\x)
=
x'po
G{ao{u),po,u)
+
>
con-
is
c
>
true parameter value
is
such
(0, 1).
15. Conditions Z.l-Z.3(h) hold for this exam.ple with
T=
The
and we assume that the parameter space
each u G
(F.4)
moment
junction given by
ao{u),
fviMu))
fv{ao{u))E{X]'
Opxi
EXX'
(F.5)
and
Q{u, u)
mm(ii, u)
=
-E[V 1{V < ao{u)}]E[X]'
uti
(F.6)
-E[V 1{V <
Lemma
Proof of
—
The proof
15.
E[V^]EXX'
ao{u)}]E[X]
follows analogously to the proof of
Lemma
14.
Unique-
ness of roots can also be argued similarly, with do uniquely solving the least squares
normal
D
equation, and Qq uniquely solving the quantile equation.
F.2. Examples of conditional distribution function estimation methods. We consider the distribution regression model described in the text and an alternative estimator for the duration model based on distribution regression.

Example 4. Distribution regression. The conditional distribution function of the outcome variable $Y$ given the covariate vector $X$ is given by $\Lambda(X'\beta_0(\cdot))$, where $\Lambda$ is either the probit or the logit link function. Here we can take the moment functions corresponding to the pointwise maximum likelihood estimation:

$$g(W,\beta,y) = \frac{\Lambda(X'\beta) - 1\{Y \le y\}}{\Lambda(X'\beta)(1 - \Lambda(X'\beta))}\,\lambda(X'\beta)X, \qquad \text{(F.7)}$$

where $\lambda$ is the derivative of $\Lambda$. Let $\mathcal Y$ be either a finite set or a bounded open subset of $\mathbb R^d$. For the latter case we assume that the conditional distribution function $y \mapsto F_Y(y|X)$ admits a density $y \mapsto f_Y(y|x)$, which is continuous at each $y \in \mathcal Y$, a.s. Moreover, $EXX'$ is finite and full rank; the true parameter value $\beta_0(y)$ belongs to the interior of the parameter space $\Theta$ for each $y \in \mathcal Y$; and $\Lambda(X'\beta)(1 - \Lambda(X'\beta)) > c > 0$ uniformly on $\beta \in \Theta$, a.s.

Lemma 16. Conditions Z.1-Z.2(a) hold for this example with moment function given by (F.7), $\mathcal T = \mathcal Y$, $u = y$, $F_Y(y|x) = \Lambda(x'\beta_0(y))$,

$$G(\beta_0(y),y) := E\left[\frac{\lambda(X'\beta_0(y))^2}{\Lambda(X'\beta_0(y))[1 - \Lambda(X'\beta_0(y))]}XX'\right],$$

and, for $y \ge \tilde y$,

$$\Omega(y,\tilde y) = E\left[\frac{\lambda(X'\beta_0(y))\,\lambda(X'\beta_0(\tilde y))}{\Lambda(X'\beta_0(y))[1 - \Lambda(X'\beta_0(\tilde y))]}XX'\right].$$
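For the logit link the weight λ/(Λ(1-Λ)) in (F.7) equals one, so the moment function reduces to (Λ(X'β) - 1{Y ≤ y})X. The sketch below solves the corresponding sample moment equation at a few thresholds y by Newton's method. The data generating process (Y = X + logistic noise, for which β0(y) = (y, -1)'), the sample size, and all function names are assumptions made for this illustration only.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(beta, data, y):
    """Sample moment E_n[(Lambda(b0 + b1*x) - 1{Y <= y}) * (1, x)']."""
    s0 = s1 = 0.0
    for yi, xi in data:
        r = logistic(beta[0] + beta[1] * xi) - (1.0 if yi <= y else 0.0)
        s0 += r
        s1 += r * xi
    return s0 / len(data), s1 / len(data)

def fit(data, y, steps=50):
    """Newton iterations on the logit moment equation for a fixed threshold y."""
    b0 = b1 = 0.0
    n = len(data)
    for _ in range(steps):
        s0, s1 = score((b0, b1), data, y)
        h00 = h01 = h11 = 0.0  # Jacobian E_n[p(1-p) (1,x)(1,x)']
        for _, xi in data:
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p) / n
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 -= (h11 * s0 - h01 * s1) / det
        b1 -= (h00 * s1 - h01 * s0) / det
    return b0, b1

random.seed(2)
data = []
for _ in range(5000):
    x = random.random()
    v = min(max(random.random(), 1e-12), 1.0 - 1e-12)
    data.append((x + math.log(v / (1.0 - v)), x))  # Y = X + logistic noise

fits = {y: fit(data, y) for y in (0.0, 0.5, 1.0)}
```

Each threshold is fitted separately, which is what makes the estimator a process indexed by y; the fitted intercepts should increase with y while the slope stays near -1 in this design.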
Proof of
The
case where 3^
To show
By
Lemma
Z.l,
is
We
16.
consider the case where
a finite set
we need
d{P',y)
is
a bounded open subset of
Eg{W,p,y)
R*^.
simpler and follows similarly.
to verify conditions on the derivatives of the
a straightforward calculation
d
is
y
we have that
at (/3,y)
=
map p
i-+
Eg{W,
(/3o(y),?y),
= E[—g{W,P,y)],[g^Eg{W,P,y)
=
[G(/3,y),/?(/i,y)],
P, u).
.
=
where, for H{z)
-
X{z)/{A{z){l
A{z)]} and h{z)
=
dH{z)/dz,
'
,
,
GW,y)
-
E[{h{X'P)[A{X'/3)-l{Y<y}] + H{X'P)X{X'P)}XX'],
i?(/3,y)
=
E[H{X'P)fy{y\X)X}].
Both terms
are continuous in
,
E y. This follows from using by
y) at {Po{y), y) for each y
{f3,
the dominated convergence theorem and the following ingredients: (1)
map
{p,y)
^
^g{W,Po{y),y),
function constllXJI, (3)
and
(4) A(A''/5)(1
-
(2)
continuity of the
a.s.
domination of \\-^g{W,P,y)\\ by a square-integrable
continuity of the conditional density function y h^ /y(y|X),
a.s.
>
A(A"/?))
c
>
uniformly on
Eg{W,p,y) =
the solution Po{y) to
'
unique
is
for
£ 9,
/?
Finally, also note that
a.s.
each y E
y
because
it is
a root of a
gradient of a convex function.
To show
=
we
verify that the function class
Function classes
envelope.
j
Z.l(iii),
1, ...,p
are
VC
J^i
= {X'P,p
G 6},
The
classes of functions.
Q
is
J^2
{1{^'
^ y}:y E
X{J='i)Xj,
,.
j
=
l,...,p
a Lipschitz transformation of VC classes with Lipschitz coefRcient
and the envelope function c'max^
positive constants.
the
Hence G
\Xj\,
which are
bounded by craaxj
sciuare- integrable; here
Donsker by Example 19.9
is
and {Xj},
>'},
final class
\A(^i)(1-A(J-i))
is
=
A(^i)-.F,2
f
is
P-Donsker with a square integrable
in
1
and
c'
are
\Xj\
some
van der Vaart (1998). Finally, the map $(\beta,y) \mapsto g(W,\beta,y)$ is continuous at each $(\beta,y) \in \Theta\times\mathcal Y$ with probability one by the absolute continuity of the conditional distribution of $Y$ and by the assumption that $\Lambda(X'\beta)(1 - \Lambda(X'\beta)) > c > 0$ uniformly on $\beta \in \Theta$, a.s.

To show Z.2(a), we note that the map $(x,\theta) \mapsto \Lambda(x'\theta)$ trivially verifies the hypotheses of Z.2(a) provided the set $\mathcal X$ is compact. □

Example 3b. Duration regression. An alternative to the proportional hazard model in duration and survival analysis is to specify the conditional distribution function of the duration $Y$ given the covariate vector $X$ as $\Lambda(\alpha_0(\cdot) + X'\beta_0)$, where $\Lambda$ is either the probit or the logit link function. We normalize $\alpha_0(y_0) = 0$ at some $y_0 \in \mathcal Y$. Here we can take the following moment functions:
+ X'l3)-\{Y <y)
\{a +
A{a + X'p){l-A{a + X'0))
h{a
5(l4/,a,/?,y)
=
X'p)
MX'p) - i{Y < y.\,^^,^^^
h{X'(3){l-K{X'(3)\
where A
and the second
Let
y
The
the derivative of A.
is
for
be either a
first set
estimation of
/^o-
finite set or a
bounded open subset
that the conditional distribution function y
which
continuous at each y G
is
true parameter value (Q'o(y),
y ey-, and
Lemma
(F.7),
A(a
+
of equations
X'P){1
/Jq)'
3^,
i-^
used
=
> c>
X'i3))
A(ao(y)
Giaoiy),Po,y)-=
,
we assume
EXX'
and
finite
is
full
rank; the
belongs to the interior of the parameter space
- A(a +
Fy{y\x)
y,
estimation of cvo(y)
of R''. For the latter case
uniformly on
+
x'Po),
'
e ©,
(a,/?')'
moment
17. Conditions Z.l-Z.2(a) hold for this example with
T = y,u =
for
Fy(i/|A') admits a density y h^ Iy{y\x).,
Moreover,
a.s.
is
—
E———g{W,ao{y),0o),
;••
for
each
a.s.
function given by
.'
.
^
/;'_;,:.
o[a, b')
''
an(iQ(y,y)
Proof of
=
"
E[5(M/ao(y),/3o)5(W^,"o(y),/9o)'].
Lemma
17.
The proof
'
[1]
in
Lemma
16.
D
A
Quantile
References
'
"Changes
Abadie, A. (1997):
follows analogously to the proof of
Spanish Labor Income Structure during the 1980's:
Regression Approach," Investigaciones Economicas XXI, pp. 253-272.
[2]
Abadie, A., Angrist,
J.,
and G. Imbens (2002):
"Instrumental variables estimates of the effect of
subsidized training on the quantiles of trainee earnings," Econometrica 70, pp. 91-117.
[3]
Andersen,
and R. D.
P. K.,
Gill (1982):
Sample Study," The Annals of
[4]
[5]
Angrist,
J.,
"Cox's Regression Model
for
Counting Processes:
Chernozhukov, V., and
I.
Fernandez- Val (2006): "Quantile Regression under Misspecifi-
an Application to the U.S. Wage Structure," Econometrica
Angrist,
and
J.-S.
Pischke (2008): Mostly Harmless Econometrics:
74, pp.
An
Autor, D., Katz,
L.,
Economic Review
[7]
Autor, D., Katz,
and Prices,"
[8]
sionists,"
and M. Kearney (2006a): "The Polarization
Empiricist's Companion,
of the U.S.
Labor Market," American
96, pp. 189-194.
L.,
and M. Kearney (2006b): "Rising Wage Inequality: The Role of Composition
NBER Working
Autor, D., Katz,
539-563.
'
Princeton Univesity Press, Princeton.
[6]
Large
Statistics 10, pp. 1100-1120.
cation, with
J.,
A
L.,
Paper wll986.
and M. Kearney (2008): "Trends
Review of Economics and
in U.S.
Statistics 90, pp. 300-323.
Wage
Inequality: Revising the Revi-
54
[9] Barrett, G., and S. Donald (2003): "Consistent Tests for Stochastic Dominance," Econometrica 71, pp. 71-104.
[10] Barrett, G., and S. Donald (2009): "Statistical Inference with Generalized Gini Indexes of Inequality, Poverty, and Welfare," Journal of Business and Economic Statistics 27, pp. 1-17.
[11] Beran, R.J. (1977): "Estimating a distribution function," Annals of Statistics 5, pp. 400-404.
[12] Breslow, N. E. (1972): Contribution to the Discussion of "Regression Models and Life Tables," by D. R. Cox, Journal of the Royal Statistical Society, Ser. B, 34, pp. 216-217.
[13] Breslow, N. E. (1974): "Covariance Analysis of Censored Survival Data," Biometrics 30, pp. 89-100.
[14] Buchinsky, M. (1994): "Changes in the US Wage Structure 1963-1987: Application of Quantile Regression," Econometrica 62, pp. 405-458.
[15] Burr, B., and H. Doss (1993): "Confidence Bands for the Median Survival Time as a Function of the Covariates in the Cox Model," Journal of the American Statistical Association 88, pp. 1330-1340.
[16] Cameron, A.C., and P.K. Trivedi (2005): Microeconometrics: Methods and Applications, Cambridge University Press, New York.
[17] Chamberlain, G. (1994): "Quantile Regression, Censoring, and the Structure of Wages," in C. A. Sims (ed.), Advances in Econometrics, Sixth World Congress. Volume 1, Cambridge University Press, Cambridge.
[18] Chernozhukov, V., and I. Fernandez-Val (2005): "Subsampling Inference on Quantile Regression Processes," Sankhyā 67, pp. 253-276.
[19] Chernozhukov, V., I. Fernandez-Val and A. Galichon (2006): "Quantile and Probability Curves without Crossing," mimeo, MIT.
[20] Chernozhukov, V., and C. Hansen (2005): "An IV Model of Quantile Treatment Effects," Econometrica 73, pp. 245-261.
[21] Chernozhukov, V., and C. Hansen (2006): "Instrumental quantile regression inference for structural and treatment effect models," Journal of Econometrics 132, pp. 491-525.
[22] Chernozhukov, V., and H. Hong (2002): "Three-step censored quantile regression and extramarital affairs," Journal of the American Statistical Association 97, pp. 872-882.
[23] Chesher, A. (2003): "Identification in Nonseparable Models," Econometrica 71, pp. 1405-1441.
[24] Cox, D. R. (1972): "Regression Models and Life Tables," (with discussion), Journal of the Royal Statistical Society, Ser. B, 34, pp. 187-220.
[25] Dabrowska, D.M. (2005): "Quantile Regression in Transformation Models," Sankhyā 67, pp. 153-186.
[26] DiNardo, J. (2002): "Propensity Score Reweighting and Changes in Wage Distributions," unpublished manuscript, University of Michigan.
[27] DiNardo, J., Fortin, N., and T. Lemieux (1996): "Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semiparametric Approach," Econometrica 64, pp. 1001-1044.
[28] Doksum, K. (1974): "Empirical Probability Plots and Statistical Inference for Nonlinear Models in the Two-Sample Case," Annals of Statistics 2, pp. 267-277.
[29] Doksum, K.A., and M. Gasko (1990): "On a correspondence between models in binary regression analysis and survival analysis," International Statistical Review 58, pp. 243-252.
[30] Donald, S. G., Green, A. A., and H. J. Paarsch (2000): "Differences in Wage Distributions Between Canada and the United States: An Application of a Flexible Estimator of Distribution Functions in the Presence of Covariates," Review of Economic Studies 67, pp. 609-633.
[31] Doss, H., and R. D. Gill (1992): "An Elementary Approach to Weak Convergence for Quantile Processes, With Applications to Censored Survival Data," Journal of the American Statistical Association 87, pp. 869-877.
[32] Firpo, S. (2007): "Efficient semiparametric estimation of quantile treatment effects," Econometrica 75, pp. 259-276.
[33] Firpo, S., N. Fortin, and T. Lemieux (2007): "Unconditional Quantile Regressions," Econometrica, forthcoming.
[34] Foresi, S., and F. Peracchi (1995): "The Conditional Distribution of Excess Returns: an Empirical Analysis," Journal of the American Statistical Association 90, pp. 451-466.
[35] Gine, E. and R. Nickl (2008): "Uniform central limit theorems for kernel density estimators," Probability Theory and Related Fields 141, pp. 333-387.
[36] Gosling, A., S. Machin, and C. Meghir (2000): "The Changing Distribution of Male Wages in the U.K.," Review of Economic Studies 67, pp. 635-666.
[37] Gutenbrunner, C., and J. Jureckova (1992): "Regression Quantile and Regression Rank Score Process in the Linear Model and Derived Statistics," Annals of Statistics 20, pp. 305-330.
[38] Hall, P., Wolff, R., and Yao, Q. (1999): "Methods for estimating a conditional distribution function," Journal of the American Statistical Association 94, pp. 154-163.
[39] Han, A., and J. A. Hausman (1990): "Flexible Parametric Estimation of Duration and Competing Risk Models," Journal of Applied Econometrics 5, pp. 1-28.
[40] Heckman, J. J., Smith, J., and N. Clements (1997): "Making the Most Out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts," The Review of Economic Studies, Vol. 64, pp. 487-535.
[41] Hirano, K., Imbens, G. W., and G. Ridder (2003): "Efficient estimation of average treatment effects using the estimated propensity score," Econometrica, Vol. 71, pp. 1161-1189.
[42] Horvitz, D., and D. Thompson (1952): "A Generalization of Sampling Without Replacement from a Finite Universe," Journal of the American Statistical Association, Vol. 47, pp. 663-685.
[43] Imbens, G.W., and W. K. Newey (2009): "Identification and Estimation of Nonseparable Triangular Simultaneous Equations Models Without Additivity," Econometrica, forthcoming.
[44] Imbens, G.W., and J. Wooldridge (2008): "Recent Developments in the Econometrics of Program Evaluation," NBER Working Paper No. 14251.
[45] Koenker, R. (2005): Quantile Regression. Econometric Society Monograph Series 38, Cambridge University Press.
[46] Koenker, R., and G. Bassett (1978): "Regression Quantiles," Econometrica 46, pp. 33-50.
[47] Koenker, R., and Z. Xiao (2002): "Inference on the Quantile Regression Process," Econometrica 70, no. 4, pp. 1583-1612.
[48] Lancaster, T. (1990): The Econometric Analysis of Transition Data, An Econometric Society Monograph, Cambridge University Press.
[49] Lemieux, T. (2006): "Increasing Residual Wage Inequality: Composition Effects, Noisy Data, or Rising Demand for Skill?," American Economic Review 96, pp. 451-498.
[50] Linton, O., Maasoumi, E., and Y. Whang (2005): "Testing for Stochastic Dominance under general conditions: A subsampling approach," Review of Economic Studies 72, pp. 735-765.
[51] Machado, J., and J. Mata (2005): "Counterfactual Decomposition of Changes in Wage Distributions Using Quantile Regression," Journal of Applied Econometrics 20, pp. 445-465.
[52] McFadden, D. (1989): "Testing for Stochastic Dominance," in Part II of T. Fomby and T. K. Seo (eds) Studies in the Economics of Uncertainty (in honor of J. Hadar), Springer-Verlag.
[53] Powell, J. L. (1986): "Censored Regression Quantiles," Journal of Econometrics 32, pp. 143-155.
[54] Radulovic, D. and M. Wegkamp (2003): "Necessary and sufficient conditions for weak convergence of smoothed empirical processes," Statistics & Probability Letters 61, pp. 321-336.
[55] Resnick, S. I. (1987): Extreme values, regular variation, and point processes, Applied Probability. A Series of the Applied Probability Trust, 4. Springer-Verlag, New York.
[56] Rutemiller, H.C., and D.A. Bowers (1968): "Estimation in a Heteroscedastic Regression Model," Journal of the American Statistical Association 63, pp. 552-557.
[57] Stock, J.H. (1989): "Nonparametric Policy Analysis," Journal of the American Statistical Association 84, pp. 567-575.
[58] Stock, J.H. (1991): "Nonparametric Policy Analysis: An Application to Estimating Hazardous Waste Cleanup Benefits," in Nonparametric and Semiparametric Methods in Econometrics and Statistics, eds. W. Barnett, J. Powell, and G. Tauchen, Cambridge, U.K.: Cambridge University Press.
[59] van der Vaart, A. (1998): Asymptotic statistics, Cambridge Series in Statistical and Probabilistic Mathematics, 3.
[60] van der Vaart, A., and J. Wellner (1996): Weak convergence and empirical processes: with applications to statistics, New York: Springer.
Table 1: Decomposing Changes in Measures of Wage Dispersion: 1979-1988, DR

                              Effect of:
Statistic   Total change   Minimum wage   Unions   Individual attributes   Coefficients
Men:
Standard
8.0(0.3)
Deviation
90-10
21.5 (1.0)
50-10
2.8 (0.1)
0.7
0.0)
1,8 (0.2)
2.7 (0.3)
35.4 (1.4)
8.5
0.6)
22.9 (1.9)
33.1 (2.4)
11.2 (0.1)
0.0
0.0)
9.2 (0.8)
I.l (1.3)
52.1 (2.4)
0.0
0.1)
42.6 (4.4)
5.3 (5.9)
-2.0
1.0)
5.1 (0.4)
7.9
(
1.2)
45.5 (8.3)
2.0
1.0)
4.0 (0.8)
4.2 (1.1)
19.7 '8.4)
39.3 (8.8)
41.0 (9.8)
11.2 (0.1)
11.3 (1.4)
99.6
90-50
10.2(1.2)
75-25
15.4(1.1)
(
14.1)
0.0 (0.0)
0.0 (0.0)
95-5
33.0(2.1)
Gini
4.1 (O.I)
coefficient
-3.1 (1.1)
27.1
(
14.0)
0.0 (0.0)
4.1
q.O)
0.3 (1.3)
11.1 (1.2)
0.0 (0.0)
26.5
6.2)
1.7 (8.6)
71.8 (8.7)
23.0 (0.7)
0.0 '0.6)
8.5 (1.1)
1.4 (1.5)
69.9 (4.1)
0.0
1.7)
25.8 (2.6)
4.3 (4.4)
1.3 (0.0)
0.5
0.0)
0.3 (0.1)
2.0 (0.1)
32.1 (1.2)
11.7
0.6)
6.8 (1.8)
49.4 (1.8)
3.8 (0.1)
0.3
0.0)
4.7 (0.2)
2.1 (0,3)
34.9 (1.5)
3.2 '0.4)
42.8 (1.8)
19.1 (2.5)
23.0 (0.2)
0.9 '0.5)
14.5 (0.7)
1.3 (1.1)
57.9 (1.9)
2.3
1.2)
36.4 (1.7)
3.4 (2.6)
23.0 (0.2)
0.0
0.1)
11.3 (0.4)
-1.4 (0.7)
69.9 (1.6)
0.0
0.4)
34.4 (1.3)
-4.3 (2.4)
0.0 (0.0)
0.9
0.5)
0.0 (0.0)
13.6
7.2)
0.0 (0.0)
0.0
0.5)
8.3 (0.2)
4.5 (0.8)
0.0 (0.0)
0.0
3.9)
65.1 (5.0)
35.0 (4.5)
16.8 (0.5)
0.7
0.7)
16.4 (2.0)
5.0 (2.1)
43.2 (2.2)
1.9
1.9)
42.1 (5.0)
12.8 (5.1)
2.0 (0.1)
O.I
0.0)
I.O (0.1)
0.9 (0.1)
49.0 (1.8)
3.5
0.4)
24.5 (1.4)
23,0 (2.2)
Women:
Standard
10.9(0.4)
Deviation
90-10
39.8(1.4)
50-10
33.0(0.7)
90-50
6.8(1.4)
75-25
12.8(0.9)
95-5
38.8(1.9)
Gini
4.0(0.1)
coefficient
(
40.3 (9.9)
11.3)
2.8 (1.4)
3.1 (0.8)
46.0
Notes: All numbers are in %. Bootstrapped standard errors are given in parentheses. The second line in each cell indicates the percentage of total variation. The distribution regression model has been applied.
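The relation between the two lines of each cell can be illustrated with a minimal Python sketch. The numbers below are hypothetical and are not taken from the tables; the function name `decomposition_shares` is likewise an assumption made for illustration. The sketch only shows the arithmetic: the first line of a cell is the effect in levels, and the second line is that effect as a percentage of the total change.

```python
def decomposition_shares(effects):
    """Return the total change and each effect as a percentage of that total.

    `effects` maps a component name to its contribution in levels.
    """
    total = sum(effects.values())
    if total == 0:
        raise ValueError("total change is zero; percentage shares are undefined")
    shares = {name: 100.0 * value / total for name, value in effects.items()}
    return total, shares


# Hypothetical contributions (not the paper's estimates).
effects = {
    "minimum wage": 3.0,
    "unions": 1.0,
    "individual attributes": 2.0,
    "coefficients": 4.0,
}
total, shares = decomposition_shares(effects)
for name, value in effects.items():
    print(f"{name}: {value:.1f} ({shares[name]:.1f}% of total {total:.1f})")
```

Because the shares are ratios of estimates, their standard errors cannot be read off the level standard errors; the tables report bootstrapped standard errors for both lines.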
Table 2: Decomposing Changes in Measures of Wage Dispersion: 1979-1988, CDR

                              Effect of:
Statistic   Total change   Minimum wage   Unions   Individual attributes   Coefficients
Men:
Standard
.2(0.3)
Deviation
90-10
50-10
21 ,5(1.0)
11 .3(1.4)
3.3
0.0)
0.6
0.0)
1.9
0.2)
2.4 (0,2)
40.7
1.4)
7.9
0,5)
22.5
1.8)
28.9 (2,4)
9.2
0.8)
1.1 (1.3)
42,6 4.4)
5.3 (5.9)
11.2
0.1)
0.0
0.0)
52.1
2.4)
0,0
0,1)
11.2
0.1)
-2.0
9.6(1 4.1)
90-50
10 .2(1.2)
75-25
15 .4(1.1)
95-5
36 .4(2.1)
Gini
.2(0.1)
coefficient
-17.9
1.0)
5,1
0,4)
(
1.2)
45.5
8.3)
4,0 0,8)
-3,1 (1.1)
27,2
(
14.0)
4,2 (1.1)
0.0
0,0)
2.0
1.0)
0.0
0.0)
19.7
8.4)
39.3
8,8)
0.0
0.0)
4.1
1.0)
0,3
1,3)
11,1 (1.2)
0.0
0.0)
26.5
6.2)
1.7
8,6)
71,8 (8.7)
26.4 0.7)
0.0
0.6)
8.5
1,1)
1.4 (1.5)
72.7
3.8)
0.0
1.5)
23,4 2.7)
3.9 (4.0)
1.6
0.0)
0.4
0.0)
0,3
0.1)
1,8 (0.1)
37.9
1,1)
10,7
0.5)
7,1
1.6)
44,2 (1.6)
5.6
0,1)
0.3
0.0)
5,1
0,2)
1,7 (0.3)
44.1
1.5)
2.2
0.4)
39,9
1.8)
13,8 (2.5)
13,0 (1.1)
41,0 (9.8)
Women:
Standard
12 .7(0.4)
Deviation
90-10
50-10
43 .2(1.4)
36 ,4 (0.7)
90-50
.8(1.4)
26.4 0.2)
0.9 (0.5)
14,5
0.7)
61.2
2.2 (1.2)
33,5
1.7)
3,1 (2.6)
1.9)
26.4 0.2)
0.0
0.1)
11,3
0.4)
-1.4 (0.7)
72.7
1.6)
0.0
0.4)
31,2
1.3)
-3,9 (2.4)
0.0
0.0)
0.9
0.5)
3.1
0.8)
2,8 (1.4)
1.3)
40,3 (9,9)
0.0 0.0)
75-25
95-5
12 .8 (0.9)
52 .7(1.9)
Gini
.9(0.1)
(
0.0
0.0)
0.0
0,5)
83
0.2)
4,5 (0,8)
0.0)
0.0 '3.9)
65,1
5.0)
35,0 (4,5)
4.7 (2,1)
30.6 0.5)
0.7
0.7)
16.7
2.0)
58.1
2.2)
1.4
1.9)
31,6
5.0)
8,8 (5,1)
2.9 (0.1)
0.1
0,0)
1.3
0.1)
0,6 (0,1)
1.9
0.4)
26.1
1,4)
12.8 (2.2)
1.8)
46.0
0.0
59.2
coefficient
13.6 (7.2)
Notes: All numbers are in %. Bootstrapped standard errors are given in parentheses. The second line in each cell indicates the percentage of total variation. The censored distribution regression model has been applied.
Table 3: Decomposing Changes in Measures of Wage Dispersion: 1979-1988, CQR

                              Effect of:
Statistic   Total change   Minimum wage   Unions   Individual attributes   Coefficients
Men:
Standard
,0(0.3)
Deviation
90-10
22
3 (I.I)
50-10
,5(0.9)
90-50
12 7(0.7)
4.1
0.0)
0.3
0.0)
1.8
O.I)
2.8 (0.2)
45.3
1.5)
3.2
0.5)
20.0
1.6)
31.5 (2.2)
14.2
0.4)
-0.5
O.I)
7.2
0.4)
1.4 (1.1)
63.6 '3.4)
-2.2
0.6)
32.3
2.8)
6.4 (5.1)
14.2 '0.4)
-1.8
0.1)
4.6
0.4)
-7.4 (0.9)
47.9 9.0)
78.0(21.6)
148.7
75-25
12 ,7(0.6)
95-5
39 ,2(0.8)
Gini
,5(0.1)
coefficient
(
6.7)
18.7
3.0)
0.0
0.0)
1.3
0.1)
0.0
0.0)
0.0
0.0)
1.6
0.1)
2.0
0.4)
9.1 (0.5)
0.0 '0.0)
12.9
1.2)
15.5
3.0)
71.5 (3.1)
10.
1.0)
2.6
0.3)
8.8 (0.5)
20.6 2,4)
69.4 (2.5)
0.0)
-0.5
O.I)
7.4
0.5)
1.6 (0.8)
78.1 ,1.8)
-1.2
0.3)
18.9
1.2)
4.2 (2.1)
1.9 ,0.0)
0.3
0.0)
0.3
O.I)
2.1 (0.1)
6.1
1.4)
45.9 (1.4)
30.6
42.2
I.I)
5.9
0.4)
6.2
Women:
Standard
13
I
(0.4)
Deviation
90-10
48 ,8(1.2)
50-10
37. 2(0.7)
90-50
11 5 (0.9)
75-25
:
95-5
Gini
15 3 (0.9)
50. 8(1.4)
5.
(O.I)
0.3
O.I)
4.5
0.3)
2.0 (0.3)
2.6
0.4)
34.8
1.5)
15.0 (1.9)
30.6
0.0)
0.8
0.2)
14.7
0.8)
2.7 (0,8)
62.8
1.5)
1.6
0.3)
30.1
1.3)
5.5 (1,6)
30.6
0.0)
-0.3
0.1)
10.9
0.8)
-4.1 (0,5)
82,3
1.6)
-0.7
0.3)
29.4
1.7)
-10.9 (1,4)
0.0
0.0)
I.l
0.1)
3.7
0.5)
6.7 (0,8)
0.0
0.0)
9.1
1.1)
32,3
3.3)
58.5 (3,5)
0.0
0.0)
0.8
0.1)
11.8
0.8)
2.6 (0.9)
0.0
0.0)
5.6
0.7)
77.6
5.2)
16.9 (5,1)
30.6
0.0)
l.I
0.2)
15.1
0.8)
4,0 (1,0)
60.3 ,1.6)
2.1
0.4)
29.7
1.2)
7.9 (1.8)
3.2 '0.0)
0.1
0.0)
I.I
O.I)
0.7 (0,1)
2.1
0.3)
21.8
1.2)
13.6 (1.5)
62.5
coefficient
1.5)
1
0.0)
47.6 (1.6)
Notes: All numbers are in %. Bootstrapped standard errors are given in parentheses. The second line in each cell indicates the percentage of total variation. The censored quantile regression model has been applied.
[Figure 1 panels: Men (upper), Women (bottom); legend: Distribution function in 79, Uniform CI in 79, Distribution function in 88, Uniform CI in 88]

Figure 1. Empirical CDFs and 95% simultaneous confidence intervals for observed wages in 1979 and 1988. Distributions for men are plotted in the upper panel and distributions for women are plotted in the bottom panel. Confidence intervals were obtained by bootstrap with 100 repetitions. Vertical lines are the levels of the minimum wage.
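A simultaneous band of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used in the paper: it resamples observations with replacement, records the largest absolute deviation of each bootstrap empirical CDF from the original one over a grid, and uses the 95% quantile of those deviations as a constant half-width. The data are simulated, and the function names `ecdf_on_grid` and `uniform_band` are assumptions introduced here.

```python
import bisect
import math
import random


def ecdf_on_grid(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    ordered = sorted(sample)
    n = len(ordered)
    return [bisect.bisect_right(ordered, y) / n for y in grid]


def uniform_band(sample, grid, reps=100, level=0.95, seed=0):
    """Constant-width bootstrap band that covers the CDF over the whole grid."""
    rng = random.Random(seed)
    f_hat = ecdf_on_grid(sample, grid)
    sup_devs = []
    for _ in range(reps):
        # Resample with replacement and record the sup-norm deviation.
        resample = rng.choices(sample, k=len(sample))
        f_star = ecdf_on_grid(resample, grid)
        sup_devs.append(max(abs(a - b) for a, b in zip(f_star, f_hat)))
    sup_devs.sort()
    # Smallest order statistic covering at least `level` of the draws.
    k = min(reps - 1, math.ceil(level * reps) - 1)
    c = sup_devs[k]
    lower = [max(0.0, f - c) for f in f_hat]
    upper = [min(1.0, f + c) for f in f_hat]
    return f_hat, lower, upper


# Simulated log wages (not the paper's data).
data_rng = random.Random(1)
log_wages = [data_rng.gauss(2.0, 0.5) for _ in range(500)]
grid = [0.5 + 0.05 * i for i in range(61)]
f_hat, lower, upper = uniform_band(log_wages, grid)
```

Because the critical value comes from the supremum over the grid rather than from each point separately, the band is wider than pointwise intervals, which is what allows statements about the whole function at once. With only 100 repetitions the critical value is itself noisy, so more repetitions would be a natural choice when computing time allows.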
[Figure 2 panels: Observed quantile functions, Observed differences, Minimum wage, De-unionization, Individual characteristics, Residual; horizontal axis: Quantile; legend: QE, Uniform confidence bands]

Figure 2. 95% simultaneous confidence intervals for observed quantile functions, observed quantile policy effects and decomposition of the quantile policy effects for men. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 3 panels: Observed quantile functions, Observed differences, Minimum wage, De-unionization, Individual characteristics, Residual; horizontal axis: Quantile; legend: QE, Uniform confidence bands]

Figure 3. 95% simultaneous confidence intervals for observed quantile functions, observed quantile policy effects and decomposition of the quantile policy effects for women. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 4 panels: Observed distribution functions, Observed differences, Minimum wage, De-unionization, Individual characteristics, Residual; legend: DE, Uniform confidence bands]

Figure 4. 95% simultaneous confidence intervals for observed distribution functions, observed distribution policy effects and decomposition of the distribution policy effects for men. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 5 panels recoverable from the scan: Observed distribution functions, Observed differences, Minimum wage, De-unionization; legend: DE, Uniform confidence bands]

Figure 5. 95% simultaneous confidence intervals for observed distribution functions, observed distribution policy effects and decomposition of the distribution policy effects for women. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 6 panels: Lorenz curves, Observed differences, Minimum wage, De-unionization, Individual characteristics, Residual; horizontal axis: Quantile; legend: LE, Uniform confidence bands]

Figure 6. 95% simultaneous confidence intervals for observed Lorenz, observed Lorenz policy effects and decomposition of the Lorenz policy effects for men. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 7 panels recoverable from the scan: Lorenz curves, Observed differences, Minimum wage, De-unionization, Individual characteristics; horizontal axis: Quantile; legend: LE, Uniform confidence bands]

Figure 7. 95% simultaneous confidence intervals for observed Lorenz, observed Lorenz policy effects and decomposition of the Lorenz policy effects for women. Confidence intervals were obtained by bootstrap with 100 repetitions.
[Figure 8 panels: Minimum wage, De-unionization, Individual characteristics, Residual; horizontal axis: Quantile; legend: Distribution regression, Censored distribution regression, Censored quantile regression]

Figure 8. Comparison of distribution regression, censored distribution regression and censored quantile regression estimates of the decomposition of quantile policy effects for men.
[Figure 9 panels: Minimum wage, De-unionization, Individual characteristics, Residual; horizontal axis: Quantile; legend: Distribution regression, Censored distribution regression, Censored quantile regression]

Figure 9. Comparison of distribution regression, censored distribution regression and censored quantile regression estimates of the decomposition of quantile policy effects for women.
Table A1: Reversing the order of the decomposition: 1979-1988, DR
Effect of:
Minimum
Individual
Total change
Statistic
attributes
wage
Unions
Coefficients
Men:
Standard
8.0 (0.3)
Deviation
90-10
21.5(1.0)
50-10
11.3 (1.4)
0.9(0.2)
1.5 (0.1)
2.9 (0.2)
2,7 .0.3)
11.4(2.8)
19.2 (1.0)
36.3 (2.8)
33,1 ;2.4)
0.4(1.3)
8.8 (1.2)
11.2 (0.9)
1,1
'1.3)
1.8(5.8)
40.7 (5.6)
52.1 (4.8)
5,3
5.9)
2.5(1.5)
0.7 (1.2)
11.2 (0.5)
-3.1
l.I)
21.8(12.7
90-50
10.2(1.2)
-2.1(1.2)
-20,1 (12.9)
75-25
15.4(1.1)
95-5
33.0(2.1)
Gini
4.1 (0.1)
coefficient
5.8
(
11.3)
8.1 (0.8)
79,1
99.6
(
15.5)
27.1
0.0 (0.9)
(
4.0)
4.2
1.1)
41.0 9.8)
0.7)
0.0 (9.5)
6.4(1.2)
-2.1 (1.3)
0.0 (1,0)
11.1
1.2)
41.6(6.8)
-13.4 (9.2)
0.0 (7,4)
71.8
8.7)
2.6(1.7)
5.9 (I.l)
23.0 (1,0)
1.4
1.5)
7.9(4.8)
17.9 (3.4)
69.9 (5.5)
4.3
4.4)
(
-0.3 (0.1)
1.0 (0.0)
1.4 (0.1)
2.0
0.1)
-6.8 (2.8)
24.1 (1.1)
33.3 (2.5)
49.4
1.8)
Women:
Standard
10.9(0.4)
Deviation
90-10
39.8(1.4)
50-10
33.0(0.7)
90-50
6.8 (1.4)
75-25
12.8 (0.9)
95-5
Gini
38.8(1.9)
4.0(0.1)
.
coefficient
Notes: All numbers are in %.
4.5 (0.2)
0.0 (0,0)
4.4 (0.3)
2.1
0.3)
41.1 (2.4)
-0.2 (0,2)
40.0 (2.4)
19.1
2.5)
11.2 (0.7)
0.0 (0,4)
27.2 (0.2)
1.3
1.1)
28.2(1.6)
0.0 (1,0)
68.4 (2.3)
3.4 2.6)
7.9(0.8)
-0.8 (0,5)
22,7 (0,5)
-1.4
0.7)
24.1 (2.6)
-2.4 (1,7)
82,6 (2,2)
-4.3
2.4)
3.3 (0,8)
0.8 (0,5)
0,0 (0,5)
2.8
1.4)
47.9(13.4)
11.7 (7,2)
0,0 (6,5)
40.3
9.9)
4.5
0.8)
2.8(0.7)
0.0 (0,5)
5,5 (0,3)
22.0(5.6)
0.0 (3,9)
43,0 (3,5)
17.4(0.9)
0.0 (0,3)
16,5 (2,0)
5.0
44.8(3.0)
0.0 (0,9)
42,4 (5,1)
12.8
5.1)
0.6(0.1)
0.0 (0.0)
2.5 (0,1)
0.9
0.1)
14.7(2.4)
1.2 (0,3)
61.1 (2,7)
35.0 (4.5)
2.1)
23.0 (2.2)
Bootstrapped standard errors are given in parentheses. The second line in each cell indicates the percentage of total variation. The distribution regression model has been applied.