Digitized by the Internet Archive in 2011 with funding from Boston Library Consortium Member Libraries
http://www.archive.org/details/estimationconfidOOcher
Massachusetts Institute of Technology
Department of Economics
Working Paper Series
ESTIMATION AND CONFIDENCE REGIONS FOR PARAMETER SETS IN ECONOMETRIC MODELS
(Previously called "Inference on Parameter Sets in Econometric Models," id# 909625)
Victor Chernozhukov
Han Hong
Elie Tamer
Working Paper 06-18B
Earlier Version: June 6, 2004
Revision: June 22, 2006
Room E52-251
50 Memorial Drive
Cambridge, MA 02142
This paper can be downloaded without charge from the
Social Science Research Network Paper Collection at
http://ssrn.com/abstract=963451
ESTIMATION AND CONFIDENCE REGIONS FOR PARAMETER SETS IN ECONOMETRIC MODELS*
VICTOR CHERNOZHUKOV† HAN HONG§ ELIE TAMER‡
Abstract. The paper develops estimation and inference methods for econometric models with partial identification, focusing on models defined by moment inequalities and equalities. Main applications of this framework include analysis of game-theoretic models, revealed preference, regression with missing and mismeasured data, auction models, bounds in structural quantile models, bounds in asset pricing, among many others.
Specifically, this paper provides estimators and confidence regions for minima of an econometric criterion function $Q(\theta)$. In applications, $Q(\theta)$ embodies testable restrictions on economic models. A parameter $\theta$ that describes an economic model passes these restrictions if $Q(\theta)$ attains the minimum value normalized to be zero. The interest therefore focuses on the set of parameters $\Theta_I$ that minimizes $Q(\theta)$, called the identified set. This paper uses the inversion of the sample analog $Q_n(\theta)$ of the population criterion $Q(\theta)$ to construct the estimators and confidence regions for $\Theta_I$. We develop consistency, rates of convergence, and inference results for these estimators and regions. The results are shown to hold under general yet simple conditions, and practical procedures are provided to implement the approach. In order to derive these results, the paper also develops methods for analyzing the asymptotics of sample criterion functions under set identification.
Key Words: Set estimator, contour sets, moment inequalities, moment equalities
* Date: Original Version: May 2002. This version: June 22, 2006. This is a revision of a previous paper "Parameter Set Inference in Econometric Models", 2002, which was circulated as an MIT Working Paper and is available at www.ssrn.com. We thank Alexandre Belloni, Oleg Rytchkov, Marc Henry, Denis Nekipelov, Igor Makarov, and Whitney Newey for thorough readings of the paper and their valuable comments. We also thank two anonymous referees and Oliver Linton for many valuable comments.
† Department of Economics, MIT, vchern@mit.edu. Research support from the Castle Krob Chair, National Science Foundation, and the Sloan Foundation is gratefully acknowledged.
§ Department of Economics, Duke University, han.hong@duke.edu. Research support from the National Science Foundation is gratefully acknowledged.
‡ Department of Economics, Northwestern University, tamer@northwestern.edu. Research support from the National Science Foundation and the Sloan Foundation is gratefully acknowledged.
1. Introduction
Parameters of interest in econometric models can be defined as values that minimize a population criterion function. If this criterion function is minimized uniquely at a particular parameter vector, then one can obtain confidence regions for this parameter using a sample analog of this function. This paper extends this criterion-based estimation and inference to econometric models where the objective function is minimized on a set of parameters, the identified set. (The terminology follows Manski (2003).) Our goal is to estimate and make inferences directly on the identified set. The development focuses on moment condition models defined by either moment inequalities or moment equalities.
This paper uses the inversion of the sample criterion functions as the building principle for estimators and confidence regions. The resulting estimators and confidence regions are appropriate contour sets of the sample criterion functions. The paper develops consistency, rates of convergence, and inference results for these sets. Specifically, this paper shows that an appropriate lower contour set of the sample criterion function converges in Hausdorff metric to the identified set at (an exact or an arbitrarily close to) $1/\sqrt{n}$ rate in moment condition problems, and at polynomial rates more generally. The paper develops a method for determining the appropriate level of the contour set so that it covers the identified set with a prespecified probability. For this purpose, the paper derives the asymptotics of several inferential statistics whose quantiles determine the appropriate level of the contour set. The lack of equi-continuous behavior of the sample criterion functions in moment inequality problems poses challenges to this analysis.
The primary applications of the estimation and inference methods developed in this paper are in such areas as (1) empirical game-theoretic models, (2) empirical revealed preference analysis, (3) econometric analysis with missing and mis-measured data, (4) bounds analysis in auction models, (5) structural quantile models and other simultaneous equation models without additivity, (6) bounds analysis in asset pricing models, and (7) the inference on dominance regions in stochastic dominance analysis, among others. In most of these problems, the economic models of interest satisfy a collection of moment inequalities, and the resulting criterion functions are typically minimized on a set. Our paper develops estimators and confidence regions for these sets.
In the context of estimation of games and revealed preference analysis, our methods have already been employed by several substantive empirical studies. Bajari, Benkard, and Levin (2006) estimated a dynamic Markov game where the observed action of each player satisfies discrete optimality conditions for equilibrium, which result in moment inequality conditions. Beresteanu and Ellickson (2006) presented a further application to a study of dynamic oligopolistic competition. Ciliberto and Tamer (2003) analyze empirical entry models with multiple equilibria. They do not make equilibrium selection assumptions, which leads to moment inequality conditions and set identification. Cohen and Manuszak (2006) estimate a game in which firms may enter as different types and which has multiple equilibria in types. They also do not impose equilibrium selection assumptions. Borzekowski and Cohen (2004) estimate a model of strategic complementarity between credit unions in their choice of adopting a technology or outsourcing it. In the context of simultaneous equation models with non-additive disturbances, Chernozhukov and Hansen (2004) estimate a demand model where partial identification occurs. In fact, they use the pointwise versions of the inference methods developed here. Econometric analysis with missing and mismeasured data is another area of applications of our methods: Molinari (2004) applied our methods to construct a confidence region for the identified set in a causal model with missing treatments. In auction analysis, the very nature of auction mechanisms also often leads to the missing data framework, see Haile and Tamer (2003).
In addition to these existing applications, there appear to be many more potential applications to the revealed preference analysis, e.g. see Varian (1982, 1984), McFadden (2005), Blundell, Browning, and Crawford (2005), and Bajari, Fox, and Ryan (2006). A potentially important application is the inference on the set of asset pricing models that satisfy mean and volatility bounds developed in Hansen, Heaton, and Luttmer (1995). Yet another area of potential applications is in the analysis of stochastic dominance relations, e.g. see Linton, Post, and Wang (2005).
The relationship of this paper to the econometric literature on inference under partial identification is as follows.¹ The concepts of set identification go back to Frisch (1934) and Marschak and Andrews (1944). Marschak and Andrews (1944) constructed the identified set as a collection of parameters representing different production functions that can not be rejected by the data and that are consistent with functional restrictions the authors consider. Frisch (1934) constructs consistent interval bounds on parameters of structural regression equations that are subject to measurement error. Klepper and Leamer (1984) generalized the Frisch bounds to multivariate regression models with measurement errors and constructed consistent estimates. Gilstein and Leamer (1983) provided set consistent estimation in a class of nonlinear regression models where the identified set is an interval of parameters that are robust to misspecification of the distribution of the error term.
¹ See Manski (2003) for a detailed introduction to partial identifiability.
In a different development, Phillips (1989) suggested that multi-collinearity may be a cause for partial identification in a number of econometric models and provided some asymptotic results for Wald statistics under such conditions. Hansen, Heaton, and Luttmer (1995) proposed an estimator for the region of feasible means and variances of pricing kernels in asset pricing models and proved its consistency. Manski and Tamer (2002) developed a number of models with interval-censored data, as well as derived several consistency results. In a previous version of this paper, Chernozhukov, Hong, and Tamer (2002) developed consistency and inference results for linear moment inequality models, using inversion of the econometric criterion functions, and developed an empirical application. Imbens and Manski (2004) investigated a problem of Wald inference in the case of a scalar mean parameter bounded above and below by other scalar means. Recently, Andrews, Berry, and Jia (2004) and Pakes, Porter, Ishii, and Ho (2006) investigated the inference problem using projection methods, which proceed by constructing a region for a point-identified high-dimensional nuisance parameter and then further projecting it with the purpose of obtaining a confidence region for the partially identified functionals of this parameter, such as the identified set. This projection method tends to be conservative relative to the inference methods developed in this paper.² Beresteanu and Molinari (2006) develop Wald methods for the linear regression model with interval-censored outcomes, using the statistic that measures the Hausdorff distance between the identified set and a set-valued estimator. These methods differ from the criterion-based inference studied in this paper. Our paper is also related to the literature on the weak identification problem, see notably Dufour (1997) and Staiger and Stock (1997). However, the problem studied here considerably differs from the latter, as the nature of failure of point identification in our main applications typically can not be approximated by the weak identification framework.
² Conservativity of projection methods in the point-identified setting is discussed in Romano and Wolf (2000).
The relationship of this paper to the statistical literature is as follows. Hannan (1982) has pointed out the multi-collinearity problems (i.e. set-identification) in several time series models. Redner (1981) and Hannan and Deistler (1988) showed that a maximum likelihood estimator eventually converges to a point in the identified set $\Theta_I$, though obviously it is not consistent for estimation of $\Theta_I$. Veres (1987), Dacunha-Castelle and Gassiat (1999), and Liu and Shao (2003) investigated the behavior of the likelihood ratio test under loss of identifiability in correctly specified likelihood models, with a special focus on the mixture and ARMA models. These results do not apply or extend in any obvious way to moment condition models analyzed in this paper. Fukumizu (2003) pointed out that the likelihood ratio has an unusually large stochastic order in multi-layer neural networks, which does not apply to the moment condition problems analyzed in this paper. Also, the literature on image processing considers the problem of support estimation of a density, e.g. Korostelev, Simar, and Tsybakov (1995) and Cuevas and Fraiman (1997), though the structure of such problems is different from that arising in the moment condition models analyzed here.
The rest of the paper is organized as follows. Section 2 presents the moment condition models and several examples that will serve to illustrate the analysis. Section 2 also outlines informally the main results of the paper. Section 3 develops consistency, rates of convergence, and inference results that apply generically. The results require that simple high-level conditions on the econometric criterion functions are met. Section 4 analyzes the moment inequality and moment equality models in detail and verifies the conditions of Section 3. Appendix collects proofs and a definition of notations used in the paper, and also discusses pointwise confidence regions (confidence regions for particular parameter values in the identified set).
2. Problem Definition and Informal Discussion of the Main Results
Consider a nonnegative population criterion function $Q(\theta)$ which attains its minimal value on a set $\Theta_I$, that is $\Theta_I = \{\theta \in \Theta : Q(\theta) = 0\}$. The set $\Theta_I$ generally consists of many parameter values, and thus is not a singleton. Suppose there is also a sample analog $Q_n(\theta)$ of this function. The parameter $\theta$ belongs to the parameter space $\Theta$, which is a compact subset of the Euclidean space $\mathbb{R}^d$. Every $\theta$ in $\Theta_I$ indexes an economic model that passes testable empirical restrictions that the criterion $Q(\theta)$ typically embodies in economic applications. This paper investigates the estimators and confidence regions for $\Theta_I$ constructed using the contour sets of the sample criterion function $Q_n$. (Appendix also discusses a related problem of constructing confidence regions for a particular point $\theta^*$ in $\Theta_I$.)
This section begins with a review of the main econometric models and economic examples
that motivate the framework described above. Then, the section gives an informal review of
the methods and the results obtained in this paper.
2.1. Moment Condition Models. This paper is primarily concerned with applications to two main types of econometric structural models: moment inequalities and moment equalities. In empirical analysis, the moment inequalities, much like moment equalities, represent testable restrictions on economic models. Economic models are described by the finite-dimensional parameters $\theta \in \Theta \subset \mathbb{R}^d$, where $\Theta$ is the parameter space. We are interested in the set of parameters $\Theta_I \subset \Theta$ that satisfy the testable restrictions.
The moment restrictions are computed with respect to the population probability law $P$ of the data and take the form
$$E_P[m_i(\theta)] \le 0, \tag{2.1}$$
where $m_i(\theta) = m(\theta, w_i)$ is a vector of moment functions parameterized by $\theta$ and determined by a vector of real random variables $w_i$. Therefore the set of parameters $\theta$ that pass restrictions (2.1) is given by $\Theta_I = \{\theta \in \Theta : E_P[m_i(\theta)] \le 0\}$.
It is interesting to comment on the structure of the set $\Theta_I$ in this model. When the moment functions are linear in parameters, the set $\Theta_I$ is given by an intersection of linear half-spaces, and could be a triangle, trapezoid, or a polyhedron, as in Examples 1 and 2 introduced below. When moment functions are non-linear, the set $\Theta_I$ is given by an intersection of nonlinear half-spaces whose boundaries are defined by nonlinear manifolds.
The set $\Theta_I$ can be characterized as the set of minimizers of the criterion function³
$$Q(\theta) := \| E_P[m_i(\theta)]' W^{1/2}(\theta) \|_+^2, \tag{2.2}$$
where $W(\theta)$ is a continuous and diagonal matrix with strictly positive diagonal elements for each $\theta \in \Theta$. Therefore, the inference on $\Theta_I$ may be based on the empirical analog of $Q$:
$$Q_n(\theta) := \| E_n[m_i(\theta)]' W_n^{1/2}(\theta) \|_+^2, \qquad E_n[m_i(\theta)] := \frac{1}{n}\sum_{i=1}^{n} m_i(\theta), \tag{2.3}$$
where $W_n(\theta)$ is a uniformly consistent estimate of $W(\theta)$. In applications $W_n(\theta)$ can be taken to be an identity matrix or chosen to weigh the individual empirical moments by estimates of inverses of their individual variances.
³ Let $\|x\|_+ = \|(x)_+\|$ and $\|x\|_- = \|(x)_-\|$, where $(x)_+ := \max(x, 0)$ and $(x)_- := \max(-x, 0)$.
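The sample criterion (2.3) is straightforward to evaluate once the moment functions have been computed at a given parameter value. The following is a minimal sketch in Python with NumPy, not code from the paper; the function names and the `(n, J)` array layout are assumptions made for illustration, and the weighting is restricted to a diagonal matrix as in the text.

```python
import numpy as np

def moment_inequality_criterion(moments, weights=None):
    """Sample criterion Q_n(theta) = || E_n[m_i(theta)]' W_n^{1/2}(theta) ||_+^2.

    moments: (n, J) array whose i-th row is m_i(theta) at a fixed theta
             (hypothetical layout chosen for this sketch).
    weights: length-J array holding the positive diagonal of W_n(theta);
             defaults to the identity weighting.
    """
    moments = np.asarray(moments, dtype=float)
    mean_moments = moments.mean(axis=0)                # E_n[m_i(theta)]
    if weights is None:
        weights = np.ones_like(mean_moments)
    weights = np.asarray(weights, dtype=float)
    # Only violated inequalities (positive weighted means) contribute: (x)_+.
    violations = np.maximum(mean_moments * np.sqrt(weights), 0.0)
    return float(np.sum(violations ** 2))

def inverse_variance_weights(moments):
    """Diagonal weights equal to inverses of the sample variances of each moment."""
    return 1.0 / np.asarray(moments, dtype=float).var(axis=0, ddof=1)
```

A parameter value at which all sample moment inequalities hold gives a criterion of exactly zero, so the zero-level set of this function is the sample counterpart of $\Theta_I$.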
Moment equalities are more traditional in empirical analysis. The economic models, indexed by $\theta$, are assumed to satisfy the set of testable restrictions given by moment equalities:
$$E_P[m_i(\theta)] = 0, \tag{2.4}$$
that is $\Theta_I = \{\theta \in \Theta : E_P[m_i(\theta)] = 0\}$. When the moment functions are linear in parameters, the set $\Theta_I$ is either a point or a hyperplane intersected with the parameter space $\Theta$. When moment functions are non-linear, the set $\Theta_I$ is typically a manifold, which also includes the case of isolated points (a zero-dimensional manifold).
The set $\Theta_I$ can be characterized as the set of minimizers of the generalized method of moments function
$$Q(\theta) := \| E_P[m_i(\theta)]' W^{1/2}(\theta) \|^2, \tag{2.5}$$
where $W(\theta)$ is a continuous and positive-definite matrix for each $\theta \in \Theta$. The inference on $\Theta_I$ is based on the conventional generalized method-of-moments function
$$Q_n(\theta) := \| E_n[m_i(\theta)]' W_n^{1/2}(\theta) \|^2, \tag{2.6}$$
where $W_n(\theta)$ is a uniformly consistent estimate of $W(\theta)$. In applications $W_n(\theta)$ can be an identity matrix or an estimate of the inverse of the asymptotic covariance matrix of empirical moment functions.
In many situations, we can also use the modified objective function for inference: $Q_n(\theta) - \inf_{\theta' \in \Theta} Q_n(\theta')$. This modification is useful in cases where $Q_n$ does not attain value 0 in finite samples.⁴
⁴ In such cases, using the modified objective function typically leads to power improvements, as is well-known in point-identified cases.
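As an illustration of (2.6) and of the modified objective function, the sketch below evaluates the quadratic form and recenters criterion values by their minimum. It is a hedged example rather than the paper's procedure: the function names are hypothetical, and the infimum over $\Theta$ is approximated by the minimum over a finite grid of candidate values, which is an assumption of this sketch.

```python
import numpy as np

def gmm_criterion(moments, weight_matrix=None):
    """Q_n(theta) = E_n[m_i]' W_n E_n[m_i], which equals || E_n[m_i]' W_n^{1/2} ||^2.

    moments: (n, J) array of m_i(theta) at a fixed theta (hypothetical layout).
    weight_matrix: (J, J) symmetric positive-definite W_n(theta); identity by default.
    """
    mean_moments = np.asarray(moments, dtype=float).mean(axis=0)
    if weight_matrix is None:
        weight_matrix = np.eye(mean_moments.size)
    return float(mean_moments @ weight_matrix @ mean_moments)

def recentered_criterion(criterion_values):
    """Modified objective Q_n(theta) - min over the grid of Q_n(theta').

    criterion_values: Q_n evaluated on a finite grid that stands in for Theta.
    """
    values = np.asarray(criterion_values, dtype=float)
    return values - values.min()
```

After recentering, the smallest value on the grid is exactly zero, so a zero-level contour set is never empty in finite samples.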
2.2. Motivating Examples. There are several interesting examples for the moment condition models described above, where the identified set $\Theta_I$ is naturally a collection of points, rather than a single point.
Example 1 (Interval Data). The first example is motivated by missing data problems, where $Y$ is an unobserved real random variable bracketed below by $Y_1$ and above by $Y_2$, both of which are observed real random variables. The parameter of interest $\theta = E_P[Y]$ is known to satisfy the restriction
$$E_P[Y_1] \le \theta \le E_P[Y_2].$$
Hence the identified set is an interval, $\Theta_I = \{\theta : E_P[Y_1] \le \theta \le E_P[Y_2]\}$. This example falls in the moment-inequality framework with moment function
$$m_i(\theta) = (Y_{1i} - \theta,\; \theta - Y_{2i})'.$$
Therefore, $\Theta_I$ can be characterized as the set of minimizers of $Q(\theta) = \|E_P[m_i(\theta)]\|_+^2 = (E_P[Y_{1i}] - \theta)_+^2 + (E_P[Y_{2i}] - \theta)_-^2$, with the sample analog $Q_n(\theta) = (E_n[Y_{1i}] - \theta)_+^2 + (E_n[Y_{2i}] - \theta)_-^2$.
Example 2 (Interval Outcomes in Regression Models). A regression generalization of the previous example is immediate. Suppose a regressor vector $X_i$ is available, and the conditional mean of unobserved $Y_i$ is modeled using linear function $X_i'\theta$. The parameters $\theta$ of this function can be bounded using inequality $E_P[Y_{1i} \mid X_i] \le X_i'\theta \le E_P[Y_{2i} \mid X_i]$. These conditional restrictions imply the following inequalities are valid:
$$E_P[Y_{1i} Z_i] \le \theta' E_P[X_i Z_i] \le E_P[Y_{2i} Z_i],$$
where $Z_i$ is a vector of positive transformations of $X_i$, for instance, $Z_i = \{1(X_i \le x_j),\ j = 1, \ldots, J\}$, for a suitable collection of values $x_j$. These inequalities define the identified set $\Theta_I$, which is therefore given by an intersection of linear half-spaces in $\mathbb{R}^d$. This example also falls in the moment inequality framework, with the moment function given by
$$m_i(\theta) = \big((Y_{1i} - \theta' X_i) Z_i',\; -(Y_{2i} - \theta' X_i) Z_i'\big)'.$$
In auction analysis, the bracketing of the latent response $Y$ (the bidder's valuation) by functions of observed bids, $Y_1$ and $Y_2$, is very natural and occurs in a variety of settings, see Haile and Tamer (2003). Analogous situations occur in income surveys, where only income brackets are available instead of true income, see Manski and Tamer (2002). Chernozhukov, Hong, and Tamer (2002) analyze this linear moment inequality set up in detail.
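Returning to the interval-data setting of Example 1, the sample criterion can be written in a few lines, which makes it easy to see that it vanishes exactly on the interval between the two sample means and grows quadratically outside it. This is a minimal sketch; the function name and the numbers are hypothetical and chosen only so the arithmetic can be checked by hand.

```python
import numpy as np

def interval_data_criterion(theta, y_lower, y_upper):
    """Q_n(theta) = (E_n[Y_1i] - theta)_+^2 + (E_n[Y_2i] - theta)_-^2."""
    lower_gap = np.mean(y_lower) - theta   # positive when theta is below the lower mean
    upper_gap = np.mean(y_upper) - theta   # negative when theta is above the upper mean
    return max(lower_gap, 0.0) ** 2 + max(-upper_gap, 0.0) ** 2

# Hypothetical bracketed observations (illustrative numbers only).
y_lower = np.array([1.0, 2.0, 3.0])   # Y_1i, sample mean 2.0
y_upper = np.array([3.0, 4.0, 5.0])   # Y_2i, sample mean 4.0

inside = interval_data_criterion(3.0, y_lower, y_upper)   # theta within [2, 4]
below = interval_data_criterion(1.5, y_lower, y_upper)    # 0.5 below the lower mean
above = interval_data_criterion(5.0, y_lower, y_upper)    # 1.0 above the upper mean
```

Here `inside` is zero, while `below` and `above` equal the squared distances 0.25 and 1.0 to the nearest endpoint of the sample interval.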
Example 3 (Optimal Choice of Economic Agents and Game Interactions). Analysis of the optimal choice behavior of firms and economic agents is another area of applications of (2.1). Suppose that a firm can make two choices $D_i = 0$ or $D_i = 1$. Suppose that the profit of the firm from making the choice $D_i$ is given by $\pi(W_i, D_i, \theta) + U_i$, where $U_i$ is a disturbance such that $E_P[U_i \mid X_i] = 0$, for $X_i$ representing information available to make the decision, and $W_i$ are various determinants of the firm's profit, some of which may be included in $X_i$. For example, $W_i$ may include actions of other firms that affect the firm's profit. From a revealed preference principle, the fact that the firm chooses $D_i$ necessarily implies that
$$E_P[\pi(W_i, D_i, \theta) \mid X_i] \ge E_P[\pi(W_i, 1 - D_i, \theta) \mid X_i]. \tag{2.7}$$
Therefore, we can take the moment condition in (2.1) to be
$$m_i(\theta) = \big(\pi(W_i, 1 - D_i, \theta) - \pi(W_i, D_i, \theta)\big) Z_i, \tag{2.8}$$
where $Z_i$ is the set of positive instrumental variables defined as positive transformations of $X_i$, as in the previous example.
This simple example highlights the structure of empirically testable restrictions arising from the optimizing behavior of firms and economic agents. These testable restrictions are given in the form of moment inequality conditions. It could be noted that this simple example also allows for game-theoretic interactions among economic agents. The moment inequality conditions of the above kind are ubiquitous, and are known to arise in (more realistic) dynamic settings, see Bajari, Benkard, and Levin (2006), Ciliberto and Tamer (2003), and Ryan (2005). Similar principles are used in Blundell, Browning, and Crawford (2005) to analyze bounds on demand functions. Related ideas also appear in the area of stochastic revealed preference analysis, e.g. see Varian (1984) and McFadden (2005).
Example 4 (Structural Equations). Consider the structural instrumental variable estimation of returns to schooling. Suppose that we are interested in the following example where potential income $Y$ is related to education $E$ through a flexible quadratic functional form,
$$Y = \theta_0 + \theta_1 E + \theta_2 E^2 + \epsilon = X'\theta + \epsilon, \quad \text{for } \theta = (\theta_0, \theta_1, \theta_2)' \text{ and } X = (1, E, E^2)'.$$
Although parsimonious, this simple model is not point-identified in the presence of the standard quarter-of-birth instrument suggested in Angrist and Krueger (1992).⁵ In the absence of point identification, all parameter values $\theta$ consistent with the instrumental orthogonality restriction $E_P[(Y - \theta' X) Z] = 0$ are of interest for purposes of economic analysis. Phillips (1989) develops a number of related examples. Similar partial identification problems arise in nonlinear moment and instrumental variables problems, see e.g. Demidenko (2000) and Chernozhukov and Hansen (2005). In Chernozhukov and Hansen (2005), the parameters $\theta$ of the structural quantile functions for returns to schooling satisfy the restrictions:
$$E_P[(\tau - 1(Y \le X'\theta)) Z] = 0,$$
where $\tau \in (0, 1)$ is the quantile of interest. This is an example of a nonlinear instrumental variable model, where the identification region, in the absence of point identification, is generally given by a nonlinear manifold. Chernozhukov and Hansen (2004) and Chernozhukov, Hansen, and Jansson (2005) analyze an empirical returns-to-schooling example and a structural demand example where such situations arise.
⁵ The instrument is the indicator of the first quarter of birth. Sometimes the indicators of other quarters of birth are used as instruments. However, these instruments are not correlated with education (correlation is extremely small) and thus bring no additional identification information.
2.3. Informal Discussion of Results. The objective of this paper is to construct sets $C_n$ for $\Theta_I$ that are consistent estimates of $\Theta_I$, converge to $\Theta_I$ at the fastest rates, and have the confidence interval property $\liminf_{n \to \infty} P(\Theta_I \subseteq C_n) = \alpha$, for a prespecified confidence level $\alpha \in (0, 1)$.⁶ The sets $C_n$ we construct take the form of a contour set $C_n(c)$ of level $c$ of the sample criterion function $Q_n$:
$$C_n(c) := \{\theta \in \Theta : a_n Q_n(\theta) \le c\},$$
for some appropriate normalization $a_n$, where $a_n = n$ in Examples 1-4. In order to simplify the discussion, assume $a_n = n$ in this section only.
⁶ Robustness to perturbing $P$ is also discussed in the Addendum to this paper, which obtains the conditions under which coverage holds under contiguous perturbations of $P$. In addition, Andrews and Guggenberger (2006) establish global robustness of the subsampling confidence regions proposed in this paper in a class of moment inequality problems. Sheikh (2006) establishes global robustness of our subsampling regions in Example 1.
In order to estimate $\Theta_I$ consistently, the level $c = \hat{c}$ (which can be data-dependent) needs to be diverging to infinity slowly; for concreteness, we can set $\hat{c} = \ln n$. However, in a class of problems, $\hat{c}$ does not need to be diverging, so we can set $\hat{c} = O_p(1)$ and even $\hat{c} = 0$. E.g., in Example 1, $\hat{c} = 0$ gives us $C_n(0) = [E_n[Y_1], E_n[Y_2]]$, which clearly is consistent for the region $[E_P[Y_1], E_P[Y_2]]$. Generally, whether $\hat{c}$ can be non-diverging in order to maintain consistency depends on whether the sample criterion function $a_n Q_n(\theta)$ has degenerate behavior, i.e. vanishes, over contractions of $\Theta_I$ in large samples, as formally stated in Section 3. In particular, the latter property does not hold in Example 4, but typically holds in Examples 1-3, under conditions formally stated in Section 4.
The analysis of the rates of convergence and consistency makes use of the Hausdorff distance between sets, which is defined as
$$d_H(A, B) := \max\Big[\sup_{a \in A} d(a, B),\; \sup_{b \in B} d(b, A)\Big], \quad \text{where } d(b, A) := \inf_{a \in A} \|b - a\|,$$
and $d_H(A, B) := \infty$ if either $A$ or $B$ is empty. The motivation for the use of this metric comes from it being a natural generalization of the Euclidean distance and its previous uses by other authors in the consistency analysis in the context of set estimation, see Hansen, Heaton, and Luttmer (1995).
The general consistency result
$$d_H(C_n(\hat{c}), \Theta_I) \to_p 0,$$
obtained in this paper follows from the uniform convergence of the sample function $Q_n$ to the limit continuous function $Q$ over the compact parameter space $\Theta$, where the rate of convergence over set $\Theta_I$ is $1/a_n$. Such a uniform convergence condition is conventional in the econometric literature, and is thus easily verifiable.
The rates of convergence follow from the existence of polynomial minorants on $Q_n(\theta)$ over suitable neighborhoods of $\Theta_I$, as defined formally in Section 3. Existence of quadratic minorants on $Q_n$ occurring in Examples 1-4, as verified in Section 4, implies that
$$d_H(C_n(\hat{c}), \Theta_I) = O_p\big(\sqrt{\max(\hat{c}, 1)/n}\big),$$
which is very close to the $1/\sqrt{n}$ rate of convergence, and is exactly $1/\sqrt{n}$ in many moment inequality problems (such as Examples 1 and 2, where $a_n Q_n$ has degenerate asymptotics over contractions of $\Theta_I$).
In order for $C_n(\hat{c})$ to have the confidence region property for $\Theta_I$, we need $\hat{c} = \hat{c}(\alpha)$, such that $\hat{c}(\alpha)$ is a consistent estimate of the $\alpha$-quantile of the statistic:
$$\mathcal{C}_n = \sup_{\theta \in \Theta_I} a_n Q_n(\theta), \tag{2.9}$$
which is a quasi-likelihood-ratio type quantity. The estimates $\hat{c}(\alpha)$ can be based on the limit distributions of (2.9) or a generic subsampling method, which are developed, respectively, in Section 4 and Section 3.5. For instance, in Example 1, suppose $\big(\sqrt{n}(E_n[Y_1] - E_P[Y_1]),\ \sqrt{n}(E_n[Y_2] - E_P[Y_2])\big) \to_d (W_1, W_2) = N(0, \Omega)$, then Section 3 shows that $\mathcal{C}_n \to_d \mathcal{C} = \max[(W_1)_+^2, (W_2)_-^2]$, where the distribution of $\mathcal{C}$ can be easily obtained by simulation methods (see Section 4). For the cases when the limit distribution of (2.9) is not easily tractable, the paper constructs a generic subsampling estimate $\hat{c}(\alpha)$, which is based on subsampling an approximation of statistic (2.9), where one uses the consistent estimate $C_n(\hat{c})$ in place of the unknown set $\Theta_I$ in (2.9) (see Section 3.5 for a detailed description of the algorithm).
The paper characterizes the asymptotic behavior and derives the limit distribution of the statistic $\mathcal{C}_n$. The paper also characterizes the limit distribution of related statistics used to determine the probability of false coverage (probability of covering larger sets than $\Theta_I$). (This in turn characterizes the power properties of the testing procedure implicitly defined by the confidence region.) The non-equicontinuous behavior of the empirical process $\theta \mapsto a_n Q_n(\theta)$ in e.g. Examples 1-3 poses a challenge to this analysis, which is addressed through a generalization of epi-convergence and stochastic equi-semi-continuity (Knight 1999) to the set-identified case. The "parameter-on-the-boundary" problem is another challenge in this analysis; e.g. it arises in Example 4, where the identified set $\Theta_I$, defined as an intersection of a hyperplane with $\Theta$, generally has common points with the boundary of $\Theta$, defined relative to $\mathbb{R}^d$. This challenge is addressed through an appropriate generalization of the "Chernoff regularity", the condition that in the point-identified case requires convergence of the local parameter space to a cone (Chernoff 1954, Andrews 2001). The generalization requires a convergence of a graph of the local parameter space to an appropriate limit graph.
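For the interval-data case of Example 1, the simulation-based choice of the level and the resulting contour set can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the function names are hypothetical, $\Omega$ is replaced by the sample covariance of $(Y_1, Y_2)$ as a plug-in, the closed form for the contour set assumes the lower sample mean does not exceed the upper one, and the data are simulated purely for demonstration.

```python
import numpy as np

def simulated_critical_value(omega, alpha, draws=100_000, seed=0):
    """alpha-quantile of max[(W1)_+^2, (W2)_-^2] with (W1, W2) ~ N(0, omega)."""
    rng = np.random.default_rng(seed)
    w = rng.multivariate_normal(np.zeros(2), omega, size=draws)
    stat = np.maximum(np.maximum(w[:, 0], 0.0) ** 2,
                      np.maximum(-w[:, 1], 0.0) ** 2)
    return float(np.quantile(stat, alpha))

def contour_set_interval(y_lower, y_upper, c):
    """C_n(c) = {theta : n * Q_n(theta) <= c} for Example 1, in closed form.

    Valid when mean(y_lower) <= mean(y_upper): the set is the sample interval
    widened by sqrt(c / n) on each side.
    """
    n = len(y_lower)
    half_width = np.sqrt(c / n)
    return float(np.mean(y_lower) - half_width), float(np.mean(y_upper) + half_width)

def hausdorff_distance_intervals(a, b):
    """Hausdorff distance between non-empty closed intervals a=(a0, a1), b=(b0, b1)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

# Hypothetical data: a latent outcome bracketed by noisy lower and upper bounds.
rng = np.random.default_rng(1)
latent = rng.normal(size=500)
y_lower = latent - rng.uniform(0.0, 1.0, size=500)
y_upper = latent + rng.uniform(0.0, 1.0, size=500)

omega_hat = np.cov(np.vstack([y_lower, y_upper]))       # plug-in for Omega (assumption)
c_hat = simulated_critical_value(omega_hat, alpha=0.95)
region = contour_set_interval(y_lower, y_upper, c_hat)  # confidence region C_n(c_hat)
estimate = contour_set_interval(y_lower, y_upper, 0.0)  # C_n(0) = [E_n Y_1, E_n Y_2]
```

In this sketch the confidence region contains the estimate `C_n(0)`, and the Hausdorff distance between the two intervals equals `sqrt(c_hat / n)`, which shrinks at the $1/\sqrt{n}$ rate for a fixed level.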
3. General Estimation and Inference in Large Samples
This section defines the estimators and confidence regions formally and develops the basic results on consistency, rates of convergence, and coverage properties of these regions. The section develops general conditions that parallel those used in extremum estimation in point-identified cases (Amemiya 1985, Newey and McFadden 1994, van der Vaart 1998). Section 4 illustrates and verifies these conditions for the moment condition models.
3.1. Basic Setup. The parameter space $\Theta$ is a non-empty compact subset of $\mathbb{R}^d$ equipped with a subspace topology relative to $\mathbb{R}^d$. Data $w_1, \ldots, w_n$ are a random vector defined on a complete probability space $(\Omega, \mathcal{F}, P)$. Suppose that the sample criterion function $Q_n(\theta) = Q_n(\theta, w_1, \ldots, w_n)$ is available, and that $Q_n$ converges uniformly to a continuous criterion function $Q \ge 0$ that attains minimal value 0 on $\Theta_I$. The contour sets of $Q_n$ will be used for estimation and inference on $\Theta_I$. This approach therefore employs the classical duality principle of inverting a likelihood type test statistic to obtain confidence regions.
Regarding notations used in the paper, the $\epsilon$-expansion of $\Theta_I$ is defined as $\Theta_I^{\epsilon} := \{\theta \in \Theta : d(\theta, \Theta_I) \le \epsilon\}$. Unless an ambiguity arises, $\sup_A f$ is used to denote $\sup_{a \in A} f(a)$. The notions of stochastic convergence, e.g. convergence in (outer) probability, denoted as $\to_p$, and stochastic order symbols $O_p$ and $o_p$ are defined with respect to the outer probability $P^*$, as in van der Vaart and Wellner (1996); wp $\to 1$ stands for "with (inner) probability approaching 1." For any two numbers $a$ and $b$, $a \wedge b$ denotes $\min(a, b)$, and $a \vee b$ denotes $\max(a, b)$. For convenience, Appendix A collects other definitions and notations.
3.2. Consistency and Rates of Convergence in The General Cases. Let the following assumption hold.
Condition C.1 (Uniform Convergence and Continuity). (a) $\Theta$ is a non-empty compact subset of $\mathbb{R}^d$, (b) $Q : \Theta \mapsto \mathbb{R}_+$ is continuous and $\min_{\Theta} Q = 0$; let $\Theta_I := \arg\min_{\Theta} Q$, (c) $Q_n(\theta, w_1, \ldots, w_n)$ takes values in $\mathbb{R}_+$ and is jointly measurable in the parameter $\theta$ and the data $w_1, \ldots, w_n$ defined on a complete probability space $(\Omega, \mathcal{F}, P)$, (d) $\sup_{\Theta} |Q_n - Q| = O_p(1/b_n)$ for a sequence of constants $b_n \to \infty$, and (e) $\sup_{\Theta_I} Q_n = O_p(1/a_n)$ for a sequence of constants $a_n \to \infty$.
Condition C.l assumes uniform convergence
It
also identifies
Q >
and Qn
0/
>
^Given a function (5„
for the criterion function
as the minimizer of the limit criterion function Q.
are not restrictive.^
:
© —
>
Qn{0) - infs'ge Qn{9') and Q{9)
K
and
=
Q{6)
its
-
In C.l(c), measurability
continuous uniform limit
Q
:
—
is
>
12
to the limit Q.
The assumptions
assumed
IR,
infg'ge Q(^') to reach this assumption.
Qn
we can
to hold with
define Qn{6)
respect to the product of the sigma-field
measurabihty of supe^
and related
(5„
T and the Borel sigma-field
of 0. C.l(c) guarantees
van der Vaart and Wellner (1996),
statistics; see e.g.
p.47.«
Condition C.1 also defines the principal quantity $\sup_{\Theta_I} Q_n$, which plays the crucial role in the analysis of inference. Its rate of convergence to zero, $a_n$, plays the key role in the analysis of consistency and rates of convergence. Section 4 shows that in the models of Section 2 $a_n = n$.

The contour sets of $a_n Q_n$ form the class of estimates we consider. The contour set of $a_n Q_n$ of level $c$ is defined as

$C_n(c) := \{\theta \in \Theta : a_n Q_n(\theta) \le c\}$, (3.1)

where $c \ge 0$. Next let $\hat c$ be a sequence of non-negative random real variables such that $\hat c \to_p \infty$ slowly so that $\hat c / a_n \to_p 0$. For instance, when $a_n = n$, we can set $\hat c = \ln n$. We will discuss the choice of $\hat c$ further when we consider inference. The estimates and confidence regions will generally take the form $\hat\Theta_I := C_n(\hat c)$.

A condition that determines the rate of convergence of $C_n(\hat c)$ to $\Theta_I$ is the following.

Condition C.2 (Existence of a Polynomial Minorant). There exist positive constants $(\delta, \kappa, \gamma)$ such that for any $\epsilon \in (0, 1)$ there are $(\kappa_\epsilon, n_\epsilon)$ such that for all $n \ge n_\epsilon$,

$Q_n(\theta) \ge \kappa \cdot [d(\theta, \Theta_I) \wedge \delta]^{\gamma}$

uniformly on $\{\theta \in \Theta : d(\theta, \Theta_I) \ge (\kappa_\epsilon / a_n)^{1/\gamma}\}$,^9 with probability at least $1 - \epsilon$.

Condition C.2 states that $Q_n$ can be stochastically bounded below by a polynomial over a neighborhood of $\Theta_I$. C.2 parallels the conditions used to derive the rate of convergence of estimators in the point-identified cases.
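The following short derivation is a heuristic sketch, not taken from the paper's proofs, of how the minorant in C.2 converts a bound on the criterion into a bound on the distance to $\Theta_I$. It uses only the definition (3.1) of the contour set and the symbols of C.2; the reading of "eventually" below (large $n$, so that the bound falls under $\delta$) is an assumption made for illustration.

```latex
% Take any \theta \in C_n(\hat c) with d(\theta,\Theta_I) \ge (\kappa_\epsilon/a_n)^{1/\gamma}.
% By (3.1) and C.2, with probability at least 1-\epsilon and n \ge n_\epsilon:
\kappa\,[d(\theta,\Theta_I)\wedge\delta]^{\gamma}
  \;\le\; Q_n(\theta) \;\le\; \hat c / a_n
\quad\Longrightarrow\quad
d(\theta,\Theta_I)\wedge\delta \;\le\; \Bigl(\frac{\hat c}{\kappa\,a_n}\Bigr)^{1/\gamma}.
% Since \hat c/a_n \to_p 0, the right-hand side is eventually below \delta, hence
\sup_{\theta\in C_n(\hat c)} d(\theta,\Theta_I)
  \;\le\; \max\Bigl\{\Bigl(\frac{\kappa_\epsilon}{a_n}\Bigr)^{1/\gamma},
                     \Bigl(\frac{\hat c}{\kappa\,a_n}\Bigr)^{1/\gamma}\Bigr\}.
```

Under these assumptions, points of the contour set cannot lie farther from $\Theta_I$ than order $(\hat c / a_n)^{1/\gamma}$, which is the one-sided half of a Hausdorff rate statement; the other half requires that the contour set cover $\Theta_I$.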
Theorem 3.1 (Coverage, Consistency, and Rates of Convergence of $C_n(\hat c)$). Let $\hat\Theta_I := C_n(\hat c)$, where $\hat c \to_p \infty$ such that $\hat c / a_n \to_p 0$. Suppose that $\Theta_I \ne \Theta$, then, (1) C.1 implies that $\Theta_I \subseteq \hat\Theta_I$ wp $\to 1$ and $d_H(\hat\Theta_I, \Theta_I) = o_p(1)$, and (2) C.1 and C.2 imply that $d_H(\hat\Theta_I, \Theta_I) = O_p((\hat c / a_n)^{1/\gamma})$. Suppose that $\Theta_I = \Theta$, then (3) C.1 implies that $d_H(\hat\Theta_I, \Theta_I) = 0$ wp $\to 1$.

^8 The condition of joint measurability is only needed to simplify exposition, following a suggestion of a referee. Otherwise, we can easily drop this condition, since we allow for stochastic convergence in the sense of Hoffmann-Jorgensen. In this case, under the other assumptions stated, the primary statistics are asymptotically measurable.

^9 When $\Theta_I = \Theta$, this set is empty, in which case C.2 does not apply.

Parts (1) and (2) address the case of the partial identification, when $\Theta_I \ne \Theta$. In some sense, this is the typical case for applications, and thus our interest lies primarily in this case. The consistency and rate results (1) and (2), stated in terms of the Hausdorff metric, generalize those obtained for point-identified cases, see e.g. van der Vaart (1998). Both the consistency and rate results are new for the problem studied in this paper.^10 Section 4 shows that in the moment-condition models of Section 2, $Q_n$ is locally quadratic, i.e. $\gamma = 2$, and $a_n = n$. It follows by Theorem 3.1 that the convergence rate can be made arbitrarily close to $1/\sqrt{n}$.^11 Part (3) addresses the case of the complete non-identification, when $\Theta_I = \Theta$, in which case the estimator converges to $\Theta_I$ in the Hausdorff metric faster than any rate. This case is not of prime interest, and is stated for completeness.

Example 1 (contd.) In Example 1, recall that $Q_n(\theta) = (E_n[Y_1] - \theta)_+^2 + (E_n[Y_2] - \theta)_-^2$ and $Q(\theta) = (E_P[Y_1] - \theta)_+^2 + (E_P[Y_2] - \theta)_-^2$. Suppose $(\sqrt{n}(E_n[Y_1] - E_P[Y_1]), \sqrt{n}(E_n[Y_2] - E_P[Y_2]))' \to_d (W_1, W_2)' = N(0, \Omega)$. Then $\sup_\Theta |Q_n - Q| = O_p(1/\sqrt{n})$ while $\sup_{\Theta_I} |Q_n - Q| = O_p(1/n)$, so that $b_n = \sqrt{n}$ and $a_n = n$. By Theorem 3.1 $C_n(\ln n)$ consistently estimates $\Theta_I = [E_P[Y_1], E_P[Y_2]]$. Further, it is simple to verify that C.2 holds with $\gamma = 2$. Hence by Theorem 3.1 the set $C_n(\ln n)$ is consistent at the rate $\sqrt{\ln n / n}$. Note, however, that the set $C_n(0) = [E_n[Y_1], E_n[Y_2]]$ consistently estimates $[E_P[Y_1], E_P[Y_2]]$ at the $1/\sqrt{n}$ rate. Hence, in this example and many others, but not all, it is possible to achieve the rate $1/a_n^{1/\gamma}$ exactly. Section 3.3 below develops this point further.
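The interval-data example can be illustrated numerically. The sketch below is not from the paper: it assumes a toy design with $E_P[Y_1] = 0$ and $E_P[Y_2] = 1$, uses the closed form of the contour set that holds when $E_n[Y_1] \le E_n[Y_2]$, and the function names (`criterion`, `contour_set`, `hausdorff`, `simulate`) are hypothetical.

```python
import math
import random


def criterion(theta, m1, m2):
    """Q_n(theta) = (m1 - theta)_+^2 + (m2 - theta)_-^2, where (x)_- = max(-x, 0)."""
    return max(m1 - theta, 0.0) ** 2 + max(theta - m2, 0.0) ** 2


def contour_set(m1, m2, n, c):
    """C_n(c) = {theta : n * Q_n(theta) <= c} as an interval; assumes m1 <= m2."""
    radius = math.sqrt(c / n)
    return (m1 - radius, m2 + radius)


def hausdorff(interval_a, interval_b):
    """Hausdorff distance between two non-empty closed intervals."""
    return max(abs(interval_a[0] - interval_b[0]),
               abs(interval_a[1] - interval_b[1]))


def simulate(n, seed=0):
    """Sample means from an assumed toy design with E[Y1] = 0 and E[Y2] = 1."""
    rng = random.Random(seed)
    y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y2 = [1.0 + rng.gauss(0.0, 1.0) for _ in range(n)]
    return sum(y1) / n, sum(y2) / n


if __name__ == "__main__":
    identified_set = (0.0, 1.0)
    for n in (100, 10_000, 100_000):
        m1, m2 = simulate(n)
        slow = contour_set(m1, m2, n, math.log(n))  # C_n(ln n)
        sharp = contour_set(m1, m2, n, 0.0)         # C_n(0) = [E_n[Y1], E_n[Y2]]
        print(n, hausdorff(slow, identified_set), hausdorff(sharp, identified_set))
```

In this design the distance for $C_n(\ln n)$ includes the deterministic radius $\sqrt{\ln n / n}$, while the distance for $C_n(0)$ is driven only by the sampling error of the two means, which is the contrast the example draws.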
Remark 3.1. (A counter-example) The following example shows why it is not possible to achieve the sharp rate of convergence $1/a_n^{1/\gamma}$ by setting $\hat c = O_p(1)$ in general cases. Setting $\hat c = O_p(1)$ may in general lead to inconsistency. Consider the following trivial example that illustrates the source of the inconsistency. Let $\Theta = [0, 3]$, $Q(\theta) = 0$ for each $\theta \in [0, 2]$ and $Q(\theta) = 1$ for each $\theta \in (2, 3]$, so that $\Theta_I = [0, 2]$; let $a_n = n$, and let $Q_n(\theta) = 0$ for $\theta \in [0, 1]$, $Q_n(\theta) = \chi^2 / n$ for $\theta \in (1, 2]$, and $Q_n(\theta) = 1$ for $\theta \in (2, 3]$, where $\chi^2$ is a chi-square variable. Then Theorem 3.1(1) applies with $\hat c = \ln n$ to claim that $C_n(\ln n)$ is consistent. However, $C_n(c)$ for a fixed $c > 0$ is not consistent, since $d_H(C_n(c), \Theta_I) = d_H([0, 1], \Theta_I) = 1$ with the asymptotic probability $\Pr(\chi^2 > c) > 0$, while $d_H(C_n(c), \Theta_I) = d_H([0, 2], \Theta_I) = 0$ with the asymptotic probability $\Pr(\chi^2 \le c) < 1$. Therefore, with a positive probability the set $C_n(c)$ does not cover substantial portions of the set $\Theta_I$, and the Hausdorff distance between $C_n(c)$ and $\Theta_I$ does not converge to 0. The inconsistency of this kind extends to more general cases such as Example 4. Thus, $\hat c \to_p \infty$ is needed to achieve consistency generally.

^10 The consistency result differs from an earlier result by Manski and Tamer (2002) that derives consistency of the set $\{\theta \in \Theta : Q_n(\theta) \le \hat c / b_n\}$, where $b_n = \sqrt{n}$ in regular cases. We in fact show consistency of smaller sets, replacing $b_n$ with $a_n \gg b_n$, where $a_n = n$ in regular cases. More generally, $b_n$ and $a_n$ are defined by C.1(d,e).

^11 In other examples like the ones considered in Kim and Pollard (1990), $a_n = n^{2/3}$ and $\gamma = 2$, giving the rate of convergence $n^{1/3}$.
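A small Monte Carlo makes the counter-example concrete. This is an illustrative sketch only: the remark leaves the degrees of freedom of the chi-square variable unspecified, so one degree of freedom is assumed here, and the helper names are hypothetical.

```python
import math
import random


def contour_upper(chi2, n, c):
    """Right endpoint u of C_n(c) = [0, u] in the counter-example (requires c < n)."""
    if c >= n:
        raise ValueError("the sketch assumes c < n so that (2, 3] is excluded")
    # On (1, 2] we have n * Q_n = chi2, so this piece is included iff chi2 <= c.
    return 2.0 if chi2 <= c else 1.0


def distance_to_identified_set(upper):
    """d_H([0, upper], [0, 2]) for the identified set [0, 2]."""
    return abs(upper - 2.0)


def miss_frequency(n, c, reps, rng):
    """Share of replications in which C_n(c) is at Hausdorff distance 1 from [0, 2]."""
    misses = 0
    for _ in range(reps):
        chi2 = rng.gauss(0.0, 1.0) ** 2  # assumed: chi-square with 1 degree of freedom
        if distance_to_identified_set(contour_upper(chi2, n, c)) > 0.0:
            misses += 1
    return misses / reps


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (100, 10_000, 1_000_000):
        fixed = miss_frequency(n, 1.0, 20_000, rng)            # fixed cutoff c = 1
        growing = miss_frequency(n, math.log(n), 20_000, rng)  # cutoff c = ln n
        print(n, fixed, growing)
```

With a fixed cutoff the miss frequency does not depend on $n$ at all in this design, whereas with the cutoff $\ln n$ it shrinks as $n$ grows, matching the statement that $\hat c \to_p \infty$ is what restores consistency.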
3.3. Consistency and Rates of Convergence with "Degenerate Interior". In many moment inequality problems, the exact rate of convergence $1/a_n^{1/\gamma}$ can be attained by setting $\hat c = O_p(1)$ or even $\hat c = 0$. The discussion of Example 1 above provides the simplest instance where this is possible. The reason is that in moment-inequality problems, the criterion function $Q_n$ can have degenerate asymptotics, i.e. vanish, over subsets of the identified set $\Theta_I$ that can approximate $\Theta_I$. Consistency and rate results then follow, because $C_n(\hat c)$ includes these subsets even when $\hat c = 0$.
In order to discuss this property formally, consider the following condition:
Condition C.3 (Degeneracy). There
e
{07',
e
there
is
that for
[0,ri]}
ofQj such
n^ such that for all
any
e
>
that (a)
n >
n^,
exists a constant
<
dH{Qj',Qi)
for
e
P{suPq-£ anQn
=
77
>
and a
G
all e
0}
=
there are constants {k^^u^) such that for all
1,
[0,77],
collection of subsets
(h) for
any
(c) there exists
n > n^ F{sup
_^
7
e
>
-i/-,
G
[0,7?],
such
=
anQn
0}>l-£.
In the remainder of the paper, we take $\Theta_I^{-\epsilon}$ to be an $\epsilon$-contraction of the set $\Theta_I$, that is

$\Theta_I^{-\epsilon} := \{\theta \in \Theta_I : d(\theta, \Theta \setminus \Theta_I) \ge \epsilon\}$, (3.2)

where $\epsilon \ge 0$,^12 although, in principle, Condition C.3 does not require the sets $\Theta_I^{-\epsilon}$ to be $\epsilon$-contractions of $\Theta_I$. C.3(a-b) typically arises in the moment inequality models due to all finite-sample moment inequalities being satisfied on $\epsilon$-contractions $\Theta_I^{-\epsilon}$ with probability converging to 1, which makes the criterion function vanish on $\Theta_I^{-\epsilon}$. Condition C.3(c) further puts a rate assumption on exactly how this happens. Section 4 verifies this condition in our main applications.
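For an interval-valued identified set, the $\epsilon$-contraction in (3.2) can be written out directly. The sketch below is an illustration under the assumption that $\Theta = [lo, hi]$ and $\Theta_I = [a, b] \subseteq \Theta$ are intervals; the function names are hypothetical.

```python
def contraction(a, b, eps, lo, hi):
    """epsilon-contraction of Theta_I = [a, b] inside Theta = [lo, hi], as in (3.2).

    An endpoint of Theta_I moves inward by eps only if Theta continues beyond it;
    returns None when the contraction is empty.
    """
    left = a + eps if a > lo else a
    right = b - eps if b < hi else b
    return (left, right) if left <= right else None


def hausdorff(interval_a, interval_b):
    """Hausdorff distance between two non-empty closed intervals."""
    return max(abs(interval_a[0] - interval_b[0]),
               abs(interval_a[1] - interval_b[1]))


if __name__ == "__main__":
    inner = contraction(0.0, 1.0, 0.1, -1.0, 2.0)
    print(inner, hausdorff(inner, (0.0, 1.0)))
```

When the contraction is non-empty its Hausdorff distance to $\Theta_I$ is at most $\epsilon$ in this interval case, which is the approximation property that the contractions are meant to deliver.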
Theorem 3.2 (Consistency and Rates of Convergence of $C_n(\hat c)$ with Degenerate Interiors). Let $\hat\Theta_I$ denote $C_n(\hat c)$ where $\hat c \ge 0$ with probability 1 and $\hat c \to_p c \ge 0$. Then, (1) C.2 and C.3(a,b) imply that $d_H(\hat\Theta_I, \Theta_I) = o_p(1)$, and (2) C.2 and C.3 imply that $d_H(\hat\Theta_I, \Theta_I) = O_p(1/a_n^{1/\gamma})$. Moreover, if $\Theta_I = \Theta$ and $\sup_{\Theta_I} a_n Q_n = 0$ wp $\to 1$, then (3) $d_H(\hat\Theta_I, \Theta_I) = 0$ wp $\to 1$.

^12 Note that this set is always well-defined, although it may be an empty set.

Parts (1) and (2) contain the results of primary interest, which state that if C.3 holds, the set $C_n(\hat c)$ is consistent and the rate $1/a_n^{1/\gamma}$ is achieved exactly. In particular, the smallest contour set, $C_n(0)$, converges to $\Theta_I$ at the rate $1/a_n^{1/\gamma}$. Section 4 shows that in many moment inequality examples, C.3(b) and C.3(c) hold with $\gamma = 2$ and $a_n = n$, yielding the rate of convergence $1/\sqrt{n}$. Part (3) addresses the less typical case of the complete non-identification, $\Theta_I = \Theta$, and degenerate behavior of $a_n Q_n$ on $\Theta_I$, in which case the estimator converges to $\Theta_I$ in the Hausdorff metric faster than any rate. This case is not of prime interest; we state it for completeness.

Example 1 (contd.) To clarify the role of C.3, recall Example 1. Clearly, for a sufficiently small $\epsilon > 0$, $\Theta_I^{-\epsilon} = [E_P[Y_1] + \epsilon, E_P[Y_2] - \epsilon]$ can approximate $\Theta_I = [E_P[Y_1], E_P[Y_2]]$ in the Hausdorff metric, provided $\Theta_I$ is not a singleton. Since $Q_n(\theta) = (E_n[Y_1] - \theta)_+^2 + (E_n[Y_2] - \theta)_-^2$, $Q_n = 0$ on $\Theta_I^{-\epsilon}$ with probability converging to 1 in large samples. Further, for any $\epsilon > 0$, a constant $\kappa_\epsilon$ can be found such that $Q_n = 0$ on $\Theta_I^{-\epsilon_n} = [E_P[Y_1] + \kappa_\epsilon/\sqrt{n}, E_P[Y_2] - \kappa_\epsilon/\sqrt{n}]$ with probability at least $1 - \epsilon$ in large samples. Thus, $C_n(\hat c)$ is consistent at rate $1/\sqrt{n}$ in this example.

3.4. Confidence Regions. The question that arises next is how to choose $\hat c$ to guarantee that $C_n(\hat c)$ has a confidence region property. Indeed, the inferential properties of sets $C_n(c)$ are determined by the statistic

$\mathcal{C}_n := \sup_{\theta \in \Theta_I} a_n Q_n(\theta)$. (3.3)

Lemma 3.1 below shows that event $\{\mathcal{C}_n \le c\}$ is equivalent to event $\{\Theta_I \subseteq C_n(c)\}$. If quantiles of $\mathcal{C}_n$ or good upper bounds on them are known, finite-sample inference can be conducted.^13 This paper provides asymptotic estimates of quantiles of $\mathcal{C}_n$, using either a generic subsampling method, developed in Section 3.5, or the asymptotic limits for $\mathcal{C}_n$ in the moment condition problems, developed in Section 4. The following basic condition is required to hold.

Condition C.4 (Convergence of $\mathcal{C}_n$). Suppose that $P\{\mathcal{C}_n \le c\} \to P\{\mathcal{C} \le c\}$ for each $c \in [0, \infty)$, where the distribution function of $\mathcal{C}$ is non-degenerate and continuous on $[0, \infty)$.

^13 For instance, the upper bounds on quantiles can be obtained using the maximal inequalities for empirical processes.

Section 4 verifies C.4 for moment condition models.

Lemma 3.1 (Basic Large Sample Confidence Regions). (1) Under C.1 event $\{\mathcal{C}_n \le c\}$ is equivalent to event $\{C_n(c)$ covers $\Theta_I\}$. (2) Suppose that C.4 holds. Then for any $\hat c \to_p c(\alpha) := \inf\{c \ge 0 : P\{\mathcal{C} \le c\} \ge \alpha\}$ for $\alpha \in (0, 1)$, such that $\hat c \ge 0$ with probability 1, we have that

$\lim_n P\{\Theta_I \subseteq C_n(\hat c)\} = \lim_n P\{\mathcal{C}_n \le \hat c\} = P\{\mathcal{C} \le c(\alpha)\} = \alpha$ if $c(\alpha) > 0$,

$\liminf_n P\{\Theta_I \subseteq C_n(\hat c)\} = \liminf_n P\{\mathcal{C}_n \le \hat c\} \ge P\{\mathcal{C} = 0\} \ge \alpha$ if $c(\alpha) = 0$.

3.5. Generic Estimation of the Critical Value based on Subsampling. This section develops a generic subsampling method for consistent estimation of the critical value. The method estimates the quantiles of $\mathcal{C}_n$ using many data subsamples of size $b$. The following condition facilitates the construction.

Condition C.5 (Approximability of $\mathcal{C}_n$). For $\mathcal{C}_n(\delta_n) := \sup_{\Theta_I^{\delta_n / a_n^{1/\gamma}}} a_n Q_n$, we have that $P[\mathcal{C}_n(\delta_n) \le c] = P[\mathcal{C} \le c] + o(1)$ for any $c \ge 0$ and any $\delta_n \downarrow 0$. If C.3 holds, in addition require that this condition holds for any $\delta_n \uparrow 0$.

C.5 implies that it suffices to apply subsampling to a feasible statistic $\sup_{\theta \in C_n(\hat c)} a_b Q_b$ in place of the infeasible statistic $\sup_{\Theta_I} a_b Q_b$. Section 4 verifies Condition C.5 for models of Section 2.

Generic Subsampling Algorithm. At a preliminary stage, for cases when data $\{W_t\}$ is an i.i.d. sequence, consider all subsets of size $b < n$.^14 For cases when $\{W_t\}$ is a stationary strongly mixing time series, construct $B_n = n - b + 1$ subsets of size $b$ of the form $\{W_j, ..., W_{j+b-1}\}$. Denote the number of subsets by $B_n$. The algorithm has four steps:
(1) Initialize $\hat c_0$ as some starting value, which can be data-dependent, such that $\hat c_0 \to_p c_0 \ge 0$. Set $\kappa_n = \ln n$. If C.3 is known to hold, we can also set $\hat c_0 = 0$ and $\kappa_n = 0$.
(2) Compute $\hat c_1$ as the $\alpha$-quantile of the sample $\{\mathcal{C}_{j,b,n} := \sup_{\theta \in C_n(\hat c)} a_b Q_{j,b,n}(\theta), j = 1, ..., B_n\}$, using $\hat c = \hat c_0 + \kappa_n$, where $Q_{j,b,n}$ denotes the criterion function evaluated using the $j$-th subsample.
(3) (Optional/Asymptotically Equivalent Iterations) Repeat Step 2 for $\ell = 2, ..., L$ by computing $\hat c_\ell$ from Step 2 using $\hat c = \hat c_{\ell-1} + \kappa_n$.
(4) Report $C_n(\hat c_L + \kappa_n)$ as a consistent estimator and a confidence region. Report $C_n(\hat c_L)$ as a confidence region. (The latter region may be inconsistent as an estimator, if C.3 does not hold).

^14 In applications, since the number of such subsets is large, it suffices to consider a smaller number, $B_n$, of randomly chosen subsets of size $b$ such that $B_n \to \infty$ as $n \to \infty$.
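The four steps of the algorithm can be sketched for the interval-data criterion of Example 1. This is a minimal illustration, not the authors' implementation: it assumes i.i.d. data, $a_n = n$, randomly chosen subsamples, the starting value $\hat c_0 = 0$ with $\kappa_n = \ln n$, and that the first sample mean does not exceed the second; all function names are hypothetical.

```python
import math
import random


def means(pairs):
    """Sample means (E_n[Y1], E_n[Y2]) of a list of (y1, y2) pairs."""
    n = len(pairs)
    return sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n


def q_stat(theta, m1, m2):
    """Q(theta) = (m1 - theta)_+^2 + (m2 - theta)_-^2 for given means m1, m2."""
    return max(m1 - theta, 0.0) ** 2 + max(theta - m2, 0.0) ** 2


def contour(m1, m2, n, c):
    """C_n(c) = {theta : n * Q_n(theta) <= c} as an interval; assumes m1 <= m2."""
    radius = math.sqrt(c / n)
    return (m1 - radius, m2 + radius)


def sup_over_interval(m1, m2, interval):
    """Sup of the convex function q_stat over an interval is attained at an endpoint."""
    return max(q_stat(interval[0], m1, m2), q_stat(interval[1], m1, m2))


def subsample_regions(pairs, b, alpha, num_subsamples, iterations, rng):
    """Return (c_hat_L, C_n(c_hat_L + kappa_n), C_n(c_hat_L))."""
    n = len(pairs)
    m1, m2 = means(pairs)
    kappa = math.log(n)                      # Step 1: kappa_n = ln n
    c_hat = 0.0                              # Step 1: assumed starting value c_0 = 0
    sub_means = [means(rng.sample(pairs, b)) for _ in range(num_subsamples)]
    for _ in range(iterations):              # Steps 2 and 3
        region = contour(m1, m2, n, c_hat + kappa)
        stats = sorted(b * sup_over_interval(s1, s2, region) for s1, s2 in sub_means)
        c_hat = stats[math.ceil(alpha * num_subsamples) - 1]  # empirical alpha-quantile
    # Step 4: expanded region (estimator and confidence region) and plain region.
    return c_hat, contour(m1, m2, n, c_hat + kappa), contour(m1, m2, n, c_hat)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(rng.gauss(0.0, 1.0), 1.0 + rng.gauss(0.0, 1.0)) for _ in range(1000)]
    print(subsample_regions(data, b=100, alpha=0.95, num_subsamples=300,
                            iterations=2, rng=rng))
```

The supremum over the current contour set is taken at its endpoints because, in this one-dimensional example, the subsample criterion is convex in $\theta$; in higher-dimensional models that step would require a numerical optimizer.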
Remark 3.2. Chernozhukov, Hong, and Tamer (2002) discuss implementation and computation in further detail. Using Example 2 as the basis for simulations, they find that a small number of iterations, $L = 1$ or 2, setting $\hat c_0$ to the $\alpha$-quantile of a $\chi^2$ variable with degrees of freedom equal to the number of moment equations, and using $b = 300$ and $b = 400$, led to good coverage and estimation results for samples of size $n = 1000$ and 2000. Politis, Romano, and Wolf (1999) describe calibration methods for choosing $b$ in practice.

Theorem 3.3 (General Validity of Subsampling). Suppose (a) $\{W_1, ..., W_n\}$ is either i.i.d. or a stationary and strongly mixing series, (b) $b \to \infty$, $b/n \to 0$, at polynomial rates as $n \to \infty$, (c) $B_n \to \infty$ at least at a polynomial rate in $n$. Suppose C.1, C.2, C.4, and C.5 hold. Let $\alpha \in (0, 1)$ denote the desired coverage level. Then, for any finite iteration $L$ of the algorithm described above, the following is true: (1) $\hat c_L \to_p c(\alpha) := \inf\{c \ge 0 : P\{\mathcal{C} \le c\} \ge \alpha\}$, (2) $\lim_n P\{\Theta_I \subseteq C_n(\hat c_L)\} = \alpha$ if $c(\alpha) > 0$ and $\liminf_n P\{\Theta_I \subseteq C_n(\hat c_L)\} \ge \alpha$ if $c(\alpha) = 0$.
Therefore, any finite iteration of the algorithm produces consistent estimates of $c(\alpha)$. The iterations are asymptotically equivalent and thus, for the purposes of asymptotics, form a single step procedure. The resulting regions $C_n(\hat c_\ell)$ cover $\Theta_I$ with P-probability $\alpha$ in large samples. Further, in order to get confidence regions that also consistently estimate $\Theta_I$, we should expand them, namely take $C_n(\hat c_L + \kappa_n)$ for $\kappa_n$ defined above. (When C.3 is known to hold, we do not need to expand them, so we can set $\kappa_n = 0$.)

Remark 3.3. It follows from the proof of Theorem 3.3 that Conditions C.1 and C.4 alone suffice for $C_n(\hat c_L)$ to cover $\Theta_I$ with P-probability at least $\alpha$.

Remark 3.4. If the researcher does not know whether C.3 holds, he can still use the algorithm with the expansion constant $\kappa_n = \ln n$.

Remark 3.5 (Variations). Recently Sheikh (2006) proposed a step-down variant of our algorithm, which is numerically equivalent to our algorithm, except that it employs the choice of constants $\hat c_0 \propto a_n$ and $\kappa_n = 0$, where the very conservative choice $\hat c_0 \propto a_n$ is used to avoid estimation of the set $\Theta_I$. The finitely-iterated step-down algorithm is typically more conservative (hence less powerful) than the original procedure. The infinitely-iterated step-down algorithm has the same asymptotic properties as our algorithm, but it is more computationally expensive.^15

3.6. Asymptotics of $\mathcal{C}_n$ and Related Inferential Statistics. This section develops methods for obtaining the limits of $\mathcal{C}_n$ and related inferential statistics that determine the probabilities of false coverage. Such a task faces two major difficulties: one is the failure of the usual stochastic equicontinuity conditions of the underlying empirical process and another is the parameter on the boundary problem, as defined below. This section outlines a framework for obtaining these limits by relying on concepts of stochastic equi-semi-continuity and generalizations of Chernoff (1954) type conditions on the parameter space $\Theta$.

Consider the statistic $\mathcal{C}_n(\delta) := \sup_{\Theta_I^{\delta / a_n^{1/\gamma}}} a_n Q_n$, where $\Theta_I^{\delta / a_n^{1/\gamma}}$ is the $\delta / a_n^{1/\gamma}$-expansion of $\Theta_I$. Since $\mathcal{C}_n = \mathcal{C}_n(0)$, $\mathcal{C}_n$ is a special case of this statistic. Suppose that for each $\delta \ge 0$

$\mathcal{C}_n(\delta) \to_d \mathcal{C}(\delta)$ in $\mathbb{R}$. (3.4)

Relation (3.4) implies that the probability that the confidence region for $\Theta_I$ covers the false local region $\Theta_I^{\delta / a_n^{1/\gamma}}$ satisfies

$P\{\Theta_I^{\delta / a_n^{1/\gamma}} \subseteq C_n(c(\alpha))\} = P\{\sup_{\Theta_I^{\delta / a_n^{1/\gamma}}} a_n Q_n \le c(\alpha)\} \to P\{\mathcal{C}(\delta) \le c(\alpha)\}$, (3.5)

as long as $c(\alpha)$ is the continuity point of the distribution of $\mathcal{C}(\delta)$. Then the asymptotic probability of false coverage satisfies $P\{\mathcal{C}(\delta) \le c(\alpha)\} \le P\{\mathcal{C} \le c(\alpha)\}$, with strict inequality holding if the distribution function of $\mathcal{C}(\delta)$ differs from the distribution function of $\mathcal{C}$ at $c(\alpha)$. From a testing perspective, we can view $\Theta_I^{\delta / a_n^{1/\gamma}}$ as a local alternative to $\Theta_I$, so that statements about false coverage translate in an obvious way to statements about local power.

^15 The starting value $\hat c_0 \propto a_n$ in the step-down algorithm is very conservative. Iteration of the algorithm reduces this critical value; in the limit of iteration, the critical value is essentially the same as the critical value $c(\alpha) + o_p(1)$ produced by our algorithm. Clearly, when the number of iterations is insufficient, the finitely-iterated step-down algorithm provides confidence regions that are more conservative than our confidence regions. Thus, in practice our confidence regions are often smaller than the finitely-iterated step-down regions. More formally, when C.3 holds, starting with $\hat c_0 = 0$ and $\kappa_n = 0$, our algorithm produces the asymptotically valid critical value $\hat c_1 = c(\alpha) + o_p(1)$ in merely a single iteration, and $\hat c_1$ is less than the critical value produced by the step-down variant in any finite number of iterations. When C.3 does not hold, the infinitely-iterated step-down procedure which aggressively sets $\kappa_n = 0$ asymptotically agrees with our critical value $c(\alpha) + o_p(1)$ with probability at least $\alpha$. However, conditional on the event that the step-down region does not cover $\Theta_I$, which occurs with probability at most $1 - \alpha$, the step-down critical values may be smaller than $c(\alpha) + o_p(1)$. Thus, since the discrepancy between step-down and our regions occurs only when Type I error does, the discrepancy is irrelevant.
The analysis focuses on the asymptotic behavior of the empirical process:

$\ell_n(\theta, \lambda) := a_n Q_n(\theta + \lambda / a_n^{1/\gamma}), \quad (\theta, \lambda) \in V_n^{\delta}$, (3.6)

where

$V_n^{\delta} := \{(\theta, \lambda) : \theta \in \Theta_I, \lambda \in V_n^{\delta}(\theta)\}, \quad V_n^{\delta}(\theta) := a_n^{1/\gamma}(\Theta - \theta) \cap B_\delta$, (3.7)

where $B_\delta$ denotes the closed ball in $\mathbb{R}^d$ of radius $\delta$ centered at the origin and $a_n^{1/\gamma}(\Theta - \theta)$ denotes the parameter space translated by $\theta$ and multiplied by the scaling rate $a_n^{1/\gamma}$.^16 The parameter $\theta$ ranges over the identified set $\Theta_I$. The local parameter $\lambda$ represents the local deviation from $\theta$ and ranges over the local parameter space $V_n^{\delta}(\theta)$. The inferential statistics in (3.4) are suprema of the empirical process (3.6) over the set (3.7):

$\mathcal{C}_n(\delta) = \sup_{(\theta, \lambda) \in V_n^{\delta}} \ell_n(\theta, \lambda)$.

The limit properties of $\mathcal{C}_n(\delta)$ will therefore depend on the limit properties of $V_n^{\delta}$.

Observe that $V_n^{\delta}$ is the graph of the correspondence $\theta \rightrightarrows V_n^{\delta}(\theta)$, defined over domain $\Theta_I$, and let $V_\infty^{\delta}$ denote the graph of some other correspondence $\theta \rightrightarrows V_\infty^{\delta}(\theta)$, also defined over domain $\Theta_I$. The condition below requires that $V_n^{\delta}$ "converges" to $V_\infty^{\delta}$, where the notion of convergence is motivated statistically.

Condition S.1 (Generalized Chernoff Regularity). (A) $Q_n$ is defined on a neighborhood $\Theta'$ of $\Theta$ in $\mathbb{R}^d$, and is jointly measurable in $\theta \in \Theta'$ and data $W_1, ..., W_n$ defined on a complete probability space $(\Omega, \mathcal{F}, P)$. (B) For any $\epsilon > 0$ and $\delta \ge 0$, there exists $n_\epsilon$ such that for all $n \ge n_\epsilon$, $P\{|\sup_{V_n^{\delta}} \ell_n - \sup_{V_\infty^{\delta}} \ell_n| > \epsilon\} < \epsilon$.

In S.1, (A) is needed to make sure that $\ell_n$ is well defined over $V_\infty^{\delta}$. S.1(B) is obviously satisfied when $\delta = 0$, a case which is relevant for asymptotics of $\mathcal{C}_n = \mathcal{C}_n(0)$, since in this case $V_n^{0} = V_\infty^{0} = \Theta_I \times \{0\}$. Next, suppose there exists $\delta > 0$ such that

$B_\delta(\theta) \subseteq \Theta$ for each $\theta \in \Theta_I$, where $B_\delta(\theta)$ is a closed ball in $\mathbb{R}^d$ of radius $\delta$ centered at $\theta$. (3.8)

Then, for all sufficiently large $n$, $V_n^{\delta} = V_\infty^{\delta} = \Theta_I \times B_\delta$, and hence S.1(B) also holds trivially. This case will be called the parameter in the interior case. It appears to be reasonable in many applications, e.g. when $\Theta$ is a rectangle or a convex body in $\mathbb{R}^d$ and $\Theta_I$ is in the interior of $\Theta$. The case where the parameter in the interior condition fails to hold will be called the parameter on the boundary case. This definition extends the definition of Chernoff (1954) and Andrews (1999) to the present context. In this case the limit graph $V_\infty^{\delta}$ will have a form that depends on the structure of $\Theta$.

The parameter on the boundary case arises in many problems. One example is the linear instrumental variable model (Example 4) with $\Theta$ given by a rectangular region. There, the identified set $\Theta_I$ and the boundary of $\Theta$ necessarily have common points, so that (3.8) does not hold. Another example is the case where $\Theta$ is itself defined by a manifold, so that (3.8) does not hold. Lemmas 4.1 and 4.2 in Section 4 derive $V_\infty^{\delta}$ for the moment condition models of Section 2, covering the cases in which the parameter on the boundary problem does occur.

Condition S.2 (Weak Sup-Convergence). For any finite set $\Delta \subset [0, \infty)$, $(\sup_{V_\infty^{\delta}} \ell_n, \delta \in \Delta) \to_d (\sup_{V_\infty^{\delta}} \ell_\infty, \delta \in \Delta)$ in $\mathbb{R}^{|\Delta|}$,^17 where $(\theta, \lambda) \mapsto \ell_\infty(\theta, \lambda)$ is a non-negative stochastic process.^18

The process $\ell_\infty$ will be referred to as the sup limit of $\ell_n$. The sup convergence is more general than uniform convergence, namely the convergence in $L^\infty(\Theta_I \times B_\delta)$, which is implied by finite-dimensional convergence and stochastic equicontinuity of $\ell_n$. In particular, the uniform convergence fails in the moment inequality model, while the sup convergence does not; see, for instance, discussion of Example 1 below. The following condition is helpful in verifying sup convergence.

Condition S.3 (Weak Finite-Dimensional Convergence and Approximability).
A. (Fidi Convergence) For any $\delta \ge 0$ and any finite subset $M$ of $V_\infty^{\delta}$, $(\ell_n(\theta, \lambda), (\theta, \lambda) \in M) \to_d (\ell_\infty(\theta, \lambda), (\theta, \lambda) \in M)$ in $\mathbb{R}^{|M|}$, where $(\theta, \lambda) \mapsto \ell_\infty(\theta, \lambda)$ is a non-negative stochastic process.
B. (Fidi Approximability) For any $\epsilon > 0$ and $\delta \ge 0$ there is a finite subset $M(\epsilon)$ of $V_\infty^{\delta}$ such that for all $n \in [n_\epsilon, \infty]$, $P\{|\sup_{V_\infty^{\delta}} \ell_n - \sup_{M(\epsilon)} \ell_n| > \epsilon\} < \epsilon$.

These conditions imply that the finite-dimensional limit and the sup-limit coincide. Otherwise, the two limits may disagree in general. The finite-dimensional approximability is motivated by and extends Knight's (1999) notion of stochastic equi-semi-continuity to set-identified models.

^16 That is, $\Theta - \theta$ is the Minkowski difference of set $\Theta$ and set $\{\theta\}$.

^17 Here $|\Delta|$ denotes cardinality of the finite set $\Delta$.

^18 This notion of sup convergence could be modified to yield what is known as weak hypo-convergence, which may then be used to study the convergence of hypo-graphs of $\ell_n$ to those of $\ell_\infty$ as random closed sets, extending the approach in Molchanov (2005) for the present problem. However, the weak convergence of hypo-graphs is not of interest per se in this paper.
Lemma 3.2. (1) Condition S.1 and S.2 imply (3.4) with the limit variables given by $\mathcal{C}(\delta) = \sup_{V_\infty^{\delta}} \ell_\infty$, in particular $\mathcal{C} = \mathcal{C}(0) = \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0)$. (2) Condition S.3 implies condition S.2.

Example 1 (contd.) In Example 1, $Q_n(\theta) = (E_n[Y_1] - \theta)_+^2 + (E_n[Y_2] - \theta)_-^2$. Suppose that $\Theta_I = [E_P[Y_1], E_P[Y_2]]$ is in the interior of $\Theta$. Then S.1 holds, since $V_n^{\delta} = V_\infty^{\delta} = \Theta_I \times B_\delta$ for all sufficiently large $n$. We have that

$\ell_n(\theta, \lambda) = n(E_n[Y_1] - \theta - \lambda/\sqrt{n})_+^2 + n(E_n[Y_2] - \theta - \lambda/\sqrt{n})_-^2$.

Suppose that $(\sqrt{n}(E_n[Y_1] - E_P[Y_1]), \sqrt{n}(E_n[Y_2] - E_P[Y_2]))' \to_d (W_1, W_2)' = N(0, \Omega)$. Then the finite-dimensional limit of $\ell_n(\theta, \lambda)$ is given by

$\ell_\infty(\theta, \lambda) = (W_1 - \lambda)_+^2 1(\theta = E_P[Y_1]) + (W_2 - \lambda)_-^2 1(\theta = E_P[Y_2])$.

The limit is not continuous in $\theta$ at $\theta = E_P[Y_1]$ and at $\theta = E_P[Y_2]$, hence $\ell_n(\theta, \lambda)$ can not be stochastically equicontinuous and uniform convergence fails. Also, finite-dimensional approximability S.3(B) can be easily verified. Therefore S.2 holds, so that the finite-dimensional limit $\ell_\infty(\theta, \lambda)$ is also the sup-limit of $\ell_n(\theta, \lambda)$. By Lemma 3.2 $\mathcal{C}_n(\delta) \to_d \mathcal{C}(\delta) = \sup_{(\theta, \lambda) \in \Theta_I \times B_\delta} \ell_\infty(\theta, \lambda)$, in particular $\mathcal{C}_n \to_d \mathcal{C} = \mathcal{C}(0) = \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0) = \max((W_1)_+^2, (W_2)_-^2)$.

4. Analysis of Moment Condition Models

4.1. Moment Equalities. We begin the discussion with moment equalities. Recall the moment-equality set-up in Section 2, where the identification region takes the form $\Theta_I = \{\theta \in \Theta : E_P[m_i(\theta)] = 0\}$. Suppose there exist positive constants $C$ and $\delta$ such that for all $\theta \in \Theta$

$\|E_P[m_i(\theta)]\| \ge C \cdot (d(\theta, \Theta_I) \wedge \delta)$. (4.1)

This is a partial identification condition, which states that once $\theta$ is bounded away from $\Theta_I$, the moment equations are bounded away from zero.
In the point-identified case, the full rank and continuity of the Jacobian $\nabla_\theta E_P[m_i(\theta)]$ near $\Theta_I$ ordinarily imply (4.1). In the set-identified case, the Jacobian may be degenerate, which requires a more involved condition (4.1). For example, in the linear IV model of Example 4 we have that $E_P[m_i(\theta)] = E_P[ZX'](\theta - \theta^*)$, where $\theta^*$ is the closest point to $\theta$ in $\Theta_I$. Provided that rank $E_P[ZX']$ is non-zero, if $\|\theta^* - \theta\| > 0$, vector $(\theta - \theta^*)$ is orthogonal to the hyperplane $\{v : E_P[ZX']v = 0\}$. Hence we have $\|E_P[ZX'](\theta - \theta^*)\| \ge C \cdot \|\theta - \theta^*\|$, where $C$ is the square root of the minimal positive eigenvalue of $E_P[XZ']E_P[ZX']$.

The main stochastic assumption is that $\{m_i(\theta), \theta \in \Theta'\}$ is a P-Donsker class of functions, where $\Theta'$ is a neighborhood of $\Theta$ in $\mathbb{R}^d$. By this we mean the following: (1) In the metric space $L^\infty(\Theta')$

$G_n[m_i(\theta)] := \sqrt{n}(E_n[m_i(\theta)] - E_P[m_i(\theta)]) \Rightarrow \Delta(\theta)$, (4.2)

where $\Delta(\theta)$ is a mean zero Gaussian process with a.s. continuous paths and $Var_P[\Delta(\theta)] > 0$ for each $\theta \in \Theta'$. (2) The probability space $(\Omega, \mathcal{F}, P)$ is rich enough (or has been suitably augmented)^19 so that there exists a map $\Delta_n : \Omega \to L^\infty(\Theta')$ such that $\Delta_n(\theta) =_d G_n[m_i(\theta)]$ and $\Delta_n(\theta) = \Delta(\theta) + o_p(1)$ in $L^\infty(\Theta')$.^20 The second condition does not involve loss of generality due to the Skorohod-Dudley-Wichura Construction, see Theorem 1.10.4 in van der Vaart and Wellner (1996) or Dudley (1985). Other conditions are given in the following assumption.

Condition M.1. Suppose the following conditions hold for the moment equality model of Section 2: (a) $\Theta$ is a non-empty compact subset of $\mathbb{R}^d$, the real-valued criterion function $Q_n(\theta)$ is defined on a neighborhood $\Theta'$ of $\Theta$ in $\mathbb{R}^d$, and is jointly measurable in $\theta \in \Theta'$ and data $W_1, ..., W_n$ defined on a complete probability space $(\Omega, \mathcal{F}, P)$, (b) $\Theta$ is such that the graph of the local parameter space $V_n^{\delta}$ converges to some set $V_\infty^{\delta}$ in Hausdorff metric, where $V_\infty^{\delta}$ is non-decreasing in $\delta \ge 0$, (c) $\{m_i(\theta), \theta \in \Theta'\}$ satisfies the P-Donsker condition stated above, (d) $E_P[m_i(\theta)]$ satisfies partial identification condition (4.1) and has a continuous Jacobian $G(\theta) = \nabla_\theta E_P[m_i(\theta)]$ for each $\theta \in \Theta'$, and (e) $W_n(\theta) = W(\theta) + o_p(1)$ uniformly in $\theta \in \Theta'$, where $W(\theta)$ is positive definite and continuous for all $\theta \in \Theta'$.

Most of these assumptions are conventional. We needed them to verify C.1, C.2, C.4, C.5, and other main conditions. Condition M.1(b) is a generalization of the Chernoff (1954) condition, which is needed for the analysis of false coverage, as discussed in Section 3.6, and in the second part of Theorem 4.1 below. M.1(b) also holds trivially in the parameter in the interior case, as defined in Section 3.6, in which case $V_n^{\delta} = \Theta_I \times B_\delta$. M.1(b) can be replaced by the classical assumption that $\Theta$ is convex, in which case $V_\infty^{\delta}$ has a very simple form stated in Lemma 4.1. The convergence imposed in M.1(b), known in variational analysis as a graphical convergence of correspondences, is a fairly weak notion of convergence for correspondences, for instance it is considerably weaker than the uniform convergence $\sup_{\theta \in \Theta_I} d_H(V_n^{\delta}(\theta), V_\infty^{\delta}(\theta)) = o(1)$ (Rockafellar and Wets 1998). Lemma 4.1 stated below provides further discussion of M.1(b).

^19 We shall use $(\Omega, \mathcal{F}, P)$ to denote the augmented probability space.

^20 Notation $=_d$ means equality in law: given two maps $X$ and $Y$ that map $\Omega$ to a metric space $D$, $X =_d Y$ if $E_{P^*}[f(X)] = E_{P^*}[f(Y)]$ for every bounded $f : D \to \mathbb{R}$, where $E_{P^*}$ denotes outer expectation with respect to $P$, see van der Vaart and Wellner (1996), p. 60.
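Theorem 4.1 (Moment Equations). (1) Conditions M.1(a,c,d,e) imply C.1, C.2, C.4, C.5 with $\gamma = 2$, $a_n = n$, and $b_n = \sqrt{n}$. If condition M.1(b) also holds, then S.1-S.3 hold, and the sup-limit of $\ell_n(\theta, \lambda) := nQ_n(\theta + \lambda/\sqrt{n})$ is given by: $\ell_\infty(\theta, \lambda) = \|(\Delta(\theta) + G(\theta)\lambda)' W^{1/2}(\theta)\|^2$. In particular,

$\mathcal{C} := \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0) = \sup_{\theta \in \Theta_I} \|\Delta(\theta)' W^{1/2}(\theta)\|^2$, (4.3)

where $\Delta(\theta)$ is a zero-mean Gaussian process defined in (4.2).
(2) When $Q_n(\theta) - \inf_{\theta' \in \Theta} Q_n(\theta')$ is used for inference, condition M.1 implies C.1, C.2, C.4, and C.5, with $\gamma = 2$, $a_n = n$, $b_n = \sqrt{n}$, and the sup-limit of $\ell_n(\theta, \lambda) := nQ_n(\theta + \lambda/\sqrt{n}) - n \inf_{\theta' \in \Theta} Q_n(\theta')$ is given by: $\ell_\infty(\theta, \lambda) = \ell_\infty^{o}(\theta, \lambda) - \inf_{(\theta', \lambda') \in V_\infty^{\infty}} \ell_\infty^{o}(\theta', \lambda')$, where $\ell_\infty^{o}(\theta, \lambda) := \|(\Delta(\theta) + G(\theta)\lambda)' W^{1/2}(\theta)\|^2$ and $V_\infty^{\infty} := \lim_{\delta \uparrow \infty} V_\infty^{\delta}$. In particular,

$\mathcal{C} := \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0) = \sup_{\theta \in \Theta_I} \|\Delta(\theta)' W^{1/2}(\theta)\|^2 - \inf_{(\theta, \lambda) \in V_\infty^{\infty}} \|(\Delta(\theta) + G(\theta)\lambda)' W^{1/2}(\theta)\|^2$. (4.4)

The most basic implications are that, for $\hat c \to_p c(\alpha)$, the confidence region $C_n(\hat c)$ has asymptotic coverage $\alpha$ (it need not be consistent). The estimator $C_n(\hat c + \ln n)$ is consistent at the $\sqrt{\ln n / n}$ rate with respect to the Hausdorff distance, and has asymptotic coverage of 1. The sup-limit $\ell_\infty$ of the empirical process obtained by the theorem describes the limit behavior of related inferential statistics, following Section 3.6. Lastly, note that compactness of $\Theta_I$ insures that the limit variable $\mathcal{C}$ in (4.3) is finite.

The quantiles of $\mathcal{C}$ can be estimated by the generic subsampling method of Section 3.5 or by simulating the limit distribution. The latter method is generally more accurate than subsampling.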
RerriEirk 4.1. (Quantiles of (4.3) by Simulation) For instance,
if
the data are
estimate the distribution of C by making the simulation draws of
C: := sup C:{9),
C:{9) :=
\\Al{9)'WS~m\
i.i.d.,
we can
= n
where A*(6')
Note that A*(^)
ables.
is
<
{zi,i
n)
of
and the law
and supg^Q
Op{l),
of C, in the
argument applies
if
A''(0, 1)
vari-
—
||Ty„(6')
weak convergence
the distribution of
A
Op(l) uniformly in {9,6')
A* (6') and the law
gence metric, converges in probability to zero. Since A*
=
i.i.d.
a zero-mean Gaussian process in L°°{Q) with covariance function
Thus the distance between the law
dniQi, Qi)
a n-vector of
is
Then En[mi{0)mi{0'y] = Ep[m{e)m,{e'y] +
En[mi{e)mi{e'y].
9x0.
and
^^'^^'^^i['miiO)zi],
VK(0)||
=
weak conver-
stochastically equicontinuous,
is
(6')
of A(^), in the
Op(l), the distance
between the law of C*
The same
metric, converges in probability to zero.
(6') is
G
estimated by the nonparametric bootstrap with
recentering.
Remark
4.2 (Quantiles of (4.4) by Simulation).
We
can estimate the quantiles of C in
(4.4)
by simulating the distribution of the variable
cm := \\A:{9yW^/\9)r -
C: := sup Qie),
where G{9)
is
The form
of
plays an important role as
C„.((5)
it
determines the limit form of local parameter
which behavior determines the probability of
4.1 (Chernoff Regularity for
Moment
(1) Suppose there exists 5
either one of the following:
gQi. Then, V^
IR'^
:
gr{9)
=
^V^^OjxBs
0}, where
Qg
is
a
for
compact and convex
above convergence holds with
l,V,5,(0)A
Lemma
=
O,r
=
that has
V^{9)
Q'g,
=
C
such that Bs{9)
(2) Suppose
Qt
\
Q'g
:
X £
include
for each
= 0g nf^i
[9 e
-^ IR'^ has a continuous
a neighborhood of
{X E Bs
0,
Qg
\/rii{Qg
Then
in WC^.
—
0) for
some
the
n'
>
l,...,i?}.
4.1 provides the sufficient condition for S.l to hold.
the interior case that
interior of
V^
>
set,
>
non- decreasing in S
is
all sufficiently large n.
Jacobian Vgr{9) with a constant row rank over
false coverage.
Equations). Sufficient conditions for V^ to
converge in the Hausdorff metric to some set V^, which
9
+ G{9)XyW^/\9)f,
a uniformly consistent estimate of \/eEp[mi{9)].
spaces and statistics
'Lemma
||(A*(^)
inf
may
arise e.g.
when 0/
is
Case
(1) is
the parameter in
a collection of isolated points that
(defined relative to W^), in which case
V^
has a
trivial form.
Case
(2)
lie in
the
addresses
the parameter on the boundary case, and covers the convex parameter space Qg as well as the
parameter space generated by an intersection of Qg with several manifolds representing various
restrictions
imposed on the parameter space;
in this case, the limit local
parameter space V^{9)
is
9
given (up to a closure) by a tangent cone of
at ^ intersected with the ball Bs- This extends
the results obtained for the point-identified cases (Chernoff 1954, Geyer 1994,
Andrews 1997).

4.2. Moment Inequalities. Recall the setup of the moment-inequality model in Section 2. We have $\Theta_I = \{\theta \in \Theta : \|E_P[m_i(\theta)]\|_+ = 0\}$. Assume that there are positive constants $(C, \delta, \eta)$ such that for all $\theta \in \Theta$

$\|E_P[m_i(\theta)]\|_+ \ge C \cdot (d(\theta, \Theta_I) \wedge \delta)$, (4.5)

and, recalling that $m_i(\theta)$ is a $J$-vector with components $(m_{ij}(\theta), j = 1, ..., J)$,

$\max_{j \le J} E_P[m_{ij}(\theta)] \le -C \cdot (d(\theta, \Theta \setminus \Theta_I) \wedge \delta)$ for all $\theta \in \Theta_I$, and $d_H(\Theta_I^{-\epsilon}, \Theta_I) \le \epsilon$ for all $\epsilon \in [0, \eta]$. (4.6)

Equation (4.5) is the partial identification condition. Equation (4.6) states that moment equations are strictly negative for all $\theta$ in the contractions of $\Theta_I$ and that these contractions $\Theta_I^{-\epsilon}$ can approximate $\Theta_I$. Equation (4.6) need not hold generally, but it is satisfied in many empirical examples listed in Section 2.^21

In order to state the regularity conditions define

$\Theta_{\mathcal{J}} := \{\theta \in \Theta_I : E_P[m_{ij}(\theta)] = 0 \ \forall j \in \mathcal{J}, \ E_P[m_{ij}(\theta)] < 0 \ \forall j \in \mathcal{J}^c\}$,

where $\mathcal{J}$ is any (non-empty) subset of $\{1, ..., J\}$ and $\mathcal{J}^c$ is the complement of $\mathcal{J}$ relative to $\{1, ..., J\}$.
Condition M.2. Suppose the following conditions hold for the moment inequality model of Section 2: (a) $\Theta$ is a non-empty compact subset of $\mathbb{R}^d$, the criterion function $Q_n(\theta)$ is defined on a neighborhood $\Theta'$ of $\Theta$ in $\mathbb{R}^d$, and is jointly measurable in $\theta \in \Theta'$ and data $W_1, \ldots, W_n$ defined on a complete probability space $(\Omega, \mathcal F, P)$, (b) $\Theta$ is such that the graph of the local parameter space^22 $V_n^\delta \mid \theta \in \Theta_{\mathcal J}$ converges to some set $V_\infty^\delta \mid \theta \in \Theta_{\mathcal J}$ in the Hausdorff metric, where $V_\infty^\delta \mid \theta \in \Theta_{\mathcal J}$ is non-decreasing in $\delta > 0$, for each $\mathcal J$, (c) $\{m_i(\theta), \theta \in \Theta'\}$ satisfies the P-Donsker condition stated in Section 4.1, (d) $E_P[m_i(\theta)]$ satisfies partial identification condition (4.5) and has continuous Jacobian $G(\theta) = \nabla_\theta E_P[m_i(\theta)]$ for each $\theta \in \Theta'$, (e) $W_n(\theta) = W(\theta) + o_p(1)$ uniformly in $\theta \in \Theta'$, where $W(\theta)$ is a diagonal matrix with positive diagonal elements and is continuous for all $\theta \in \Theta'$, and (f) condition (4.6) holds.

^21 A detailed illustration and verification of this condition for the linear moment inequality framework has been provided in the previous version of this paper (Chernozhukov, Hong, and Tamer 2002).
^22 The set $V_n^\delta \mid \theta \in \Theta_{\mathcal J}$ is defined as $\{(\theta, \lambda) \in V_n^\delta : \theta \in \Theta_{\mathcal J}\}$.
Most of these assumptions are conventional. We need them to verify C.1, C.2, C.4, C.5 and other main conditions. Condition M.2(b) is an assumption of Chernoff type, which is needed for the analysis of false coverage, as discussed in Section 3.6, for the second part of Theorem 4.2 below. Lemma 4.2 stated below provides further discussion of M.2(b). Condition M.2(b) can be replaced by the classical assumption that $\Theta$ is convex, in which case $V_\infty^\delta$ has a very simple form stated in Lemma 4.1. Condition M.2(f) is needed to verify the "degenerate interior" condition C.3.
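Theorem 4.2 (Moment Inequalities). (1) Conditions M.2(a,c,d,e) imply C.1, C.2, C.4, C.5 with $\gamma = 2$, $a_n = n$, and $b_n = \sqrt{n}$. If further condition M.2(f) holds, then C.3 holds. If further condition M.2(b) holds, then S.1-S.3 hold, and the sup-limit of $\ell_n(\theta, \lambda) := n Q_n(\theta + \lambda/\sqrt{n})$ is given by:

$$\ell_\infty(\theta, \lambda) = \|(\Delta(\theta) + G(\theta)\lambda + \xi(\theta))' W^{1/2}(\theta)\|_+^2.$$

In particular,

$$\mathcal{C} = \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0) = \sup_{\theta \in \Theta_I} \|(\Delta(\theta) + \xi(\theta))' W^{1/2}(\theta)\|_+^2, \tag{4.7}$$

where $\Delta(\theta)$ is a zero-mean Gaussian process defined in (4.2) and $\xi(\theta) := (\xi_j(\theta), j \le J)$ with $\xi_j(\theta) = -\infty$ if $E_P[m_{ij}(\theta)] < 0$ and $\xi_j(\theta) = 0$ if $E_P[m_{ij}(\theta)] = 0$.

(2) When $Q_n(\theta) - \inf_{\theta' \in \Theta} Q_n(\theta')$ is used for inference, conditions M.2(a,b,c,d,e) imply C.1, C.2, C.4, C.5, and S.1-S.3. In particular, $\gamma = 2$, $a_n = n$, $b_n = \sqrt{n}$, and the sup-limit of $\ell_n(\theta, \lambda) := n Q_n(\theta + \lambda/\sqrt{n}) - n \inf_{\theta' \in \Theta} Q_n(\theta')$ is given by:

$$\ell_\infty(\theta, \lambda) - \inf_{(\theta', \lambda') \in V_\infty^\infty} \ell_\infty(\theta', \lambda'),$$

where $V_\infty^\infty := \lim_{\delta \to \infty} V_\infty^\delta$. In particular $\mathcal{C} = \sup_{\theta \in \Theta_I} \ell_\infty(\theta, 0) - \inf_{(\theta', \lambda') \in V_\infty^\infty} \ell_\infty(\theta', \lambda')$, i.e.

$$\mathcal{C} = \sup_{\theta \in \Theta_I} \|(\Delta(\theta) + \xi(\theta))' W^{1/2}(\theta)\|_+^2 - \inf_{(\theta', \lambda') \in V_\infty^\infty} \|(\Delta(\theta') + G(\theta')\lambda' + \xi(\theta'))' W^{1/2}(\theta')\|_+^2, \tag{4.8}$$

where the second term equals zero if M.2(f) holds.

Therefore, for $\hat c \to_p c(\alpha)$, the region $C_n(\hat c)$ is consistent at $1/\sqrt{n}$ rate with respect to the Hausdorff distance as an estimator, and has asymptotic coverage $\alpha$ as a confidence region.

The theorem also obtains the sup-limit $\ell_\infty$ of the empirical process $\ell_n$, which describes the limit behavior of the related inferential statistics. Following Section 3.6, the latter results are needed to describe the probability of false coverage.

The quantiles of $\mathcal{C}$ in (4.7) can be estimated by either the generic subsampling method of Section 3.5 or simulating the limit distribution. The latter method is generally more accurate than subsampling.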
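The subsampling route mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Section 3.5 verbatim: it uses a hypothetical interval-data model with moments $(Y_{L,i} - \theta, \ \theta - Y_{U,i})$, identity weights, the rate $a_b = b$, a user-supplied grid `theta_set` standing in for the estimated set, and subsamples drawn without replacement. All function and argument names are illustrative.

```python
import numpy as np

def sup_statistic(y_lo, y_hi, theta_set):
    """sup over theta_set of a_b * Q_b(theta), with a_b = b and identity weights."""
    b = len(y_lo)
    lo_bar, hi_bar = y_lo.mean(), y_hi.mean()
    # Q_b(theta) = ||E_b[m_i(theta)]||_+^2 for the two interval moments
    q = (np.maximum(lo_bar - theta_set, 0.0) ** 2
         + np.maximum(theta_set - hi_bar, 0.0) ** 2)
    return b * float(q.max())

def subsample_quantile(y_lo, y_hi, theta_set, b, alpha, n_sub=200, seed=0):
    """alpha-quantile of the sup statistic recomputed on n_sub subsamples of size b."""
    rng = np.random.default_rng(seed)
    n = len(y_lo)
    stats = np.empty(n_sub)
    for j in range(n_sub):
        idx = rng.choice(n, size=b, replace=False)  # subsample without replacement
        stats[j] = sup_statistic(y_lo[idx], y_hi[idx], theta_set)
    return float(np.quantile(stats, alpha))
```

The returned value plays the role of the critical value $\hat c$; the subsample size `b` and the number of subsamples `n_sub` are tuning choices left to the user in this sketch.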
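Remark 4.3 (Quantiles of (4.7) by Simulation). If the data are i.i.d., we can simulate the limit distribution of $\mathcal{C}$ by making the simulation draws of

$$\mathcal{C}_n^* := \sup_{\theta \in \hat\Theta_I} \mathcal{C}_n^*(\theta), \qquad \mathcal{C}_n^*(\theta) := \|(\Delta_n^*(\theta) + \hat\xi(\theta))' W_n^{1/2}(\theta)\|_+^2,$$

where $\Delta_n^*(\theta) := n^{-1/2} \sum_{i=1}^n [m_i(\theta) z_i]$, $(z_i, i \le n)$ is an $n$-vector of i.i.d. $N(0, 1)$ variables, and $\hat\xi(\theta) := (\hat\xi_j(\theta), j = 1, \ldots, J)'$ with $\hat\xi_j(\theta) := -\infty$ if $E_n[m_{ij}(\theta)] \le -c_j \log n/\sqrt{n}$, and $\hat\xi_j(\theta) := 0$ if $E_n[m_{ij}(\theta)] > -c_j \log n/\sqrt{n}$, for some positive constants $c_j > 0$. Note that $\Delta_n^*(\theta)$ is a zero-mean Gaussian process in $L^\infty(\Theta)$ with covariance function $E_n[m_i(\theta) m_i(\theta')']$, as discussed in Remark 4.1.

Remark 4.4 (Quantiles of (4.8) by Simulation). If the data are i.i.d., we can simulate the limit distribution of $\mathcal{C}$ by making the simulation draws of

$$\mathcal{C}_n^* := \sup_{\theta \in \hat\Theta_I} \|(\Delta_n^*(\theta) + \hat\xi(\theta))' W_n^{1/2}(\theta)\|_+^2 - \inf_{(\theta, \lambda)} \|(\Delta_n^*(\theta) + \hat G(\theta)\lambda + \hat\xi(\theta))' W_n^{1/2}(\theta)\|_+^2,$$

where $\hat G(\theta)$ is a uniformly consistent estimate of $\nabla_\theta E_P[m_i(\theta)]$.

The form of the parameter space $\Theta$ plays an important role in determining the limit form of local parameter spaces and of the statistic $\mathcal{C}_n(\delta)$, whose behavior determines the probability of false coverage.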
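The simulation of the quantiles of (4.7) described in Remark 4.3 can be sketched in Python as below. This is a hedged illustration rather than the authors' implementation: it assumes identity weights $W_n(\theta) = I$, a single constant `c` in place of the constants $c_j$, a finite grid `theta_grid` standing in for the estimated set, and a user-supplied function `m_fn(theta)` (a hypothetical name) returning the $n \times J$ array of moment functions.

```python
import numpy as np

def simulate_quantile(m_fn, theta_grid, alpha, n_draws=500, c=1.0, seed=0):
    """alpha-quantile of sup_theta ||Delta*_n(theta) + xi_hat(theta)||_+^2 (W_n = I)."""
    rng = np.random.default_rng(seed)
    M = np.stack([m_fn(t) for t in theta_grid])      # shape (T, n, J)
    T, n, J = M.shape
    # xi_hat_j(theta) = -inf when the j-th sample moment is clearly slack, else 0
    slack = M.mean(axis=1) <= -c * np.log(n) / np.sqrt(n)
    draws = np.empty(n_draws)
    for r in range(n_draws):
        z = rng.standard_normal(n)                   # i.i.d. N(0, 1) multipliers
        delta = np.einsum("tnj,n->tj", M, z) / np.sqrt(n)
        delta = np.where(slack, -np.inf, delta)      # add xi_hat(theta)
        stat = np.sum(np.maximum(delta, 0.0) ** 2, axis=1)
        draws[r] = stat.max()                        # sup over the grid
    return float(np.quantile(draws, alpha))
```

Moments flagged as slack drop out of the positive-part norm, which is how the $-\infty$ entries of $\hat\xi(\theta)$ act in the formula; the grid resolution and the number of draws control the numerical accuracy of the sketch.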
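Lemma 4.2 (Chernoff Regularity for Moment Inequalities). Sufficient conditions for the graph of the local parameter space $V_n^\delta \mid \theta \in \Theta_{\mathcal J}$ to converge in the Hausdorff metric to some set $V_\infty^\delta \mid \theta \in \Theta_{\mathcal J}$ that is non-decreasing in $\delta > 0$ include either one of the following:

(1) Suppose there exists $\delta > 0$ such that $B_\delta(\theta) \subset \Theta$ for each $\theta \in \Theta_{\mathcal J}$. Then, $V_n^\delta \mid \theta \in \Theta_{\mathcal J} = V_\infty^\delta \mid \theta \in \Theta_{\mathcal J} = \Theta_{\mathcal J} \times B_\delta$ for all sufficiently large $n$.

(2) Suppose that $\Theta = \Theta_g \cap \bigcap_{r=1}^R \{\theta \in \mathbb{R}^d : g_r(\theta) = 0\}$, where $\Theta_g$ is a compact and convex set, and $g_r : \Theta' \to \mathbb{R}$ has continuous Jacobian $\nabla g_r(\theta)$ with a constant row rank over $\Theta'$, a neighborhood of $\Theta_g$ in $\mathbb{R}^d$. Then the above convergence holds with

$$V_\infty^\delta(\theta) = \{\lambda \in B_\delta : \lambda \in \sqrt{n'}(\Theta_g - \theta) \text{ for some } n' \ge 1, \ \nabla_\theta g_r(\theta)\lambda = 0, \ r = 1, \ldots, R\}.$$

Lemma 4.2 is similar to Lemma 4.1 and the comments that are similar to those stated after Lemma 4.1 apply here.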
5. Appendix A: Notation

The following standard notation for empirical processes will be used: $E_n[f] := n^{-1} \sum_{i=1}^n f(W_i)$ and $G_n[f] := n^{-1/2} \sum_{i=1}^n (f(W_i) - E_P[f(W_i)])$. The notions of convergence and outer and inner probabilities, $P^*$ and $P_*$, are defined as in van der Vaart and Wellner (1996). For instance, $\to_p$ denotes convergence in outer probability, wp $\to$ 1 means "with the inner probability approaching 1"; the stochastic order notations $o_p(1)$ and $O_p(1)$ are with respect to $P^*$ unless otherwise stated. Notation $=_d$ means equality in law: given two elements $X$, $Y$ that map $\Omega$ to a metric space $\mathbb{D}$, $X =_d Y$ if $E_{P^*}[f(X)] = E_{P^*}[f(Y)]$ for every bounded $f : \mathbb{D} \to \mathbb{R}$, where $E_{P^*}$ denotes outer expectation with respect to $P$. Let $\|x\|_+ = \|\max(x, 0)\|$ and $\|x\|_- = \|\max(-x, 0)\|$, where in the case of vectors the max operations are elementwise. $B_\delta$ denotes a closed ball of diameter $\delta$ centered at the origin. In many instances, we use abbreviated notation $\sup_A f$ to mean $\sup_{a \in A} f(a)$, unless an ambiguity arises, in which case the latter notation is used. The Hausdorff distance between sets is defined as

$$d_H(A, B) := \max\Big[\sup_{a \in A} d(a, B), \ \sup_{b \in B} d(b, A)\Big],$$

and $d_H(A, B) := \infty$ if either $A$ or $B$ is empty.
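For readers who want to experiment with these definitions numerically, here is a small Python sketch of the norms $\|x\|_+$, $\|x\|_-$ and of the Hausdorff distance, restricted to finite point sets. It assumes the Euclidean point-to-set distance $d(b, A) = \min_{a \in A} \|b - a\|$ and represents each set as an array with one point per row; the function names are illustrative and not part of the paper.

```python
import numpy as np

def pos_norm(x):
    """||x||_+ : Euclidean norm of the elementwise positive part."""
    return float(np.linalg.norm(np.maximum(np.asarray(x, dtype=float), 0.0)))

def neg_norm(x):
    """||x||_- : Euclidean norm of the elementwise negative part."""
    return float(np.linalg.norm(np.maximum(-np.asarray(x, dtype=float), 0.0)))

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (one point per row)."""
    if len(A) == 0 or len(B) == 0:
        return float("inf")  # convention: infinite if either set is empty
    A = np.asarray(A, dtype=float).reshape(len(A), -1)
    B = np.asarray(B, dtype=float).reshape(len(B), -1)
    # pairwise distances ||a - b|| for every a in A and b in B
    dists = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    sup_a = dists.min(axis=1).max()  # sup over a of d(a, B)
    sup_b = dists.min(axis=0).max()  # sup over b of d(b, A)
    return float(max(sup_a, sup_b))
```

For set estimates computed on a grid, `hausdorff` applied to the two sets of retained grid points gives a discretized version of $d_H$, accurate up to the grid spacing.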
Proof of Theorem
supe^
Qn =
Step
Op(l/a„)
For any
(b).
<
e
0, infQ\^0j
Q„
B:
(1).
C
(i)
9/
as
assumed
=(j) infe\ee
(i)
holds by construction of
wp —
Q=
supg
in C.l(b). Similarly,
9/ and
(ii)
Combining Steps
1.
and
(a)
(b),
Step
Q+
ci/f(9/,9)
supg Qn
-> 1
9
by C.l(e) and by c-^p
1,
>
e
Op(l) >(jj)
and
(5(e)
+
d(6,
from
(n)
9/)
Op(l) for
wp —
>
1.
C
which implies 9/
Since
e
>
is
—
oo,
some
>
6{e)
=(ij) Op(l),
6{e)
=
infe\e=
9j. Given Step
larger
than
1
—
For any £
(2).
we have c/an <
^
and c/an >
arbitrary, the result
is
(i)
follows
supa anQn <
>
by c/un
—>p
and
c -^p
oo
anQn[0) >u) n- an- [\c/{ann)f''^ A 5]
2".
by C.2 and
Hence 9/ C
(ii)
follows
0^^'''""^'"
we have that dH(9/,9/) <
[c/(an«:)]i/T.
Proof of Part
When 9 =
dHi@i,Q) =
there exist positive constants {n^,K,^,K) such that for
i^e/cin,
;
all
so that, with probabihty
e.
inf
where
Q
(a),
D
Proof of Part
Tie
0,
on
where
proven.
n >
:
0.
Q being minimized
+ Op(l) <(j) c/an + Op(l)
^p 0. Hence supg Q <
wp —
—
<
Wp
(a).
in C.l(d)
holds by c/ttn
Hence 9/ n (9 \ 9f)
where 6{e) > 0,
this implies supgQ(i(0, 9/) < e.
*
||6
Proofs
9/, which implies supg^g^
from uniform convergence as assumed
where
follows
as
c/an, which implies 9/
>
inf
used.
is
:
PROOF OF Part
3.1:
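Here $d(b, A) := \inf_{a \in A} \|b - a\|$. The $\epsilon$-expansion of $\Theta_I$ is defined as $\Theta_I^{\epsilon} := \{\theta \in \Theta : d(\theta, \Theta_I) \le \epsilon\}$, and the $\epsilon$-contraction of $\Theta_I$ as $\Theta_I^{-\epsilon} := \{\theta \in \Theta_I : d(\theta, \Theta \setminus \Theta_I) \ge \epsilon\}$, where $\epsilon > 0$.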
Appendix
6.
6.1.
which case the
empty.
is
of
we use abbreviated notation
instances,
(3).
.
by c/o„ -^p
0.
By
c,
construction of 9/,
Hence, combining with Step
Therefore diy(9/,9/)
^ui)
=
(a) of the
we have that
Proof of Part
(1),
D
Op([c/a„]i/T).
9/, by Step (a) of Proof of Part (1), 9/
= 9 wp
^
1,
so
D
Owp^l.
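6.2. Proof of Theorem 3.2.
PROOF OF Part (1). Fix any $\epsilon \in (0, \eta]$. It follows that wp $\to$ 1, $\Theta_I^{-\epsilon} \subseteq_{(i)} C_n(c') \subseteq_{(ii)} C_n(\hat c) \subseteq_{(iii)} \Theta_I^{\epsilon}$, where $\hat c$ is from Theorem 3.1. Inclusion (i) follows since $\sup_{\Theta_I^{-\epsilon}} a_n Q_n = 0 < c'$ wp $\to$ 1, by C.3(b), (ii) follows from $\hat c > c'$ wp $\to$ 1, and (iii) follows from Part 1 of Theorem 3.1. Since $d_H(\Theta_I^{-\epsilon}, \Theta_I) \le \epsilon$ by Condition C.3(a) and $d_H(\Theta_I^{\epsilon}, \Theta_I) \le \epsilon$, it follows that $d_H(\Theta_I, C_n(c')) \le \epsilon$. Part (1) follows.
Proof of Part (2). Let $\epsilon_n$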
=
inf{e
=
(2) of
Theorem
C Qfl<'^^"\ Conclude
have that Cn[c')
Proof of Part
Hence dn {&!,&) wp
Proof of
6.3.
9/
C.
Cn{c)
=
3.1 that for
>
>
e
=
Then we have
Op{an^^'').
It
=
that dH{Cn{c), 9/)
it is
that
can be shown, similarly to the
n >
all
ng,
immediate that 9/
= 0/ = 9 wp
-^
:
<
sup^g©^ a„Q„(0)
3.1.
—
C„
c implies
supg^Q^ a,nQn{9)
<
c
The
(2).
Proof of Theorem
1 is special
Step
and
e„ :— (Inn/ttn)^/'''
Cb„
is
3.1 or
:=
7?„
:= supafcQj^b^n
— e„,
<
samples
is
if
Cj^h,n
By
standard
is
->
1,
C.3 holds, and
??„
:=
=
3.2
Bn- Define G(,^„(x) :=
:=
^
B-'
sup abQj,b,n
<
for
^p
Step 2 below Gi,„(x)
G{x)
is
The
allowed to be data-dependent.
subsampling.
0, if
Cj,b,„
C.3
not hold. Hence
dofes
:= sup abQj^b,n,
for all j
wp —
>
1,
< B„,
Q]^
was computed using j-th subsample;
B~^ Si=i
^{^j,b,n
<x}<
l{C,,b,n
therefore omitted.
is
we have that 9^" C Cn{c) C 9^", where
Cn{c)
j denotes that the statistic
G^Jx)
proof
(1). It suffices to prove the result for ci only.
wp
Theorem
97"
where index
its
identical to this proof, since c\
to our problem, while Step 2
By Theorem
1.
elementary and
is
PROOF OF Part
3.3.
proof for any subsequent step
Step
result
by
D
D
compactness of 9/.
6.4.
1.
D
PROOF OF Part (1). Clearly, C„ =
€ Q anQn{9) < c}. Conversely, 9/ C Cn{c) implies
Proof of Part
we
D
Op{an^''').
1.
Lemma
{9
definition of
C.3(c) e„ exists
there exist {5i:,n^) such that for
0,
Under the stated condition,
(3).
—
any
By Condition
0}.
and e„ = Op{an^^'^). Hence by C.3(a) and C.3(c), d//(0"'",e/)
©7^" C CnC^) Q Cn{c') wp —+ 1, where c' > c and ? = c + Op(l).
Proof of Part
by
e
D
supg-e a„Q„
:
<
= P{C <
<
Gb,n{x)
2;}.
wp —
Hence
< GbA^)
>
:-
total
number
of sub-
1
B''
Y, H^^An <
x] and Gf,,„(x) ~>p G{x)
= P{C <
= P{C <
0.
^}-
x), for each
a;
>
0.
This proves that
Gb,n{x) -^p G{x)
x}
for
each x
>
(6.1)
Convergence of the distribution function at continuity points implies convergence of the quantile
function at continuity points.
implies that c :=
Step
G^^{a)
Define
2.
—
Op(l)
= P{Cb <
x}
C_^
+
Varp(S~'^ J2j=i l{^j,M
—>p
By
C.4 c{a) := G"~^(a)
G^^{a)
for
each
a S
:= supQ^n abQb and Cb
(21
Op(l)
<
x})
= P{C
— o(l).
< x}
For
-I-
i.i.d.
is
continuous in
a £
(0,1).
Hence,
(6.1)
(0, 1).
—
supgcn a^Qb-
Op(l) at each x
Write Gt_„(x)
>
0.
data, this follows from
=
Conclusion
Bn
—* oc
£'p[G{,„(x)]
(1)
follows
-|-
by
and the Hoeffding
inequality for
bounded
in the proof of
by C.5 and C.4 and by
(1999). Conclusion (2) follows
to restrictions on the subsample size b
—*p
Likewise, conclude Gb^ni'x)
Proof of Part
Lemma
Proof of
6.5.
The
(2).
3.2.
^(2;) ==
—
stationary a-mixing series, this follows from i?„
l/'-statistics; for
an upper bound on covariance given
Theorem
e„
=
o{l/a^
and the rate a„ stated
P{C <
and
=
77„
oo and
o(l/aj
)
arising
due
theorem.
in conditions (b,c) of this
D
D
x}.
Lemma
from
result follows
)
>
Romano, and Wolf
3.2.1 in Politis,
PROOF OF Part
3.1
Conditions S.l and S.2 immediately imply
(1).
D
(3.4).
Proof of Part
lim sup
n-*oo
P{sup £„ <
v^
where inequality
(i)
—
(5
00.
>
Since e
and e >
is
limsupP{max^„ <
from sup^d in
follows
arbitrary,
ys
(i)
6
>0
and e >
^} <(m) P{sup^oo
<
r
+ e} + e,
ys^
follows
< P{supyi
r}
^00
<
r}.
for
Further, for any
M{£) C V^ such that
r} >(i\ liminf
'
P{max£„ <
n-*oo
"•
<
in
r
—
e}
—
e
M{e)
P{max^oo
<r-6} -e
M(e)
y/m) Pjsup^oo
<r-e}
-e,
yS
from the finite-dimensional approximability condition S.3(B),
finite-dimensional convergence condition S.3(A), and
arbitrary,
any
from the finite-dimensional convergence
sup;j,|/£\ in, (h)
hmsup„^QoP{supv<5
>/u)
where inequality
for
from the finite-dimensional approximability condition S.3(B) applied
(iii)
Pjsup £„ <
n-»oo
^
<
r} <(jj) P{max£(x>
M(e)
M(£)
n^oo
there exists a finite set
lim inf
Note that
S.2.
M(e) C V^ such that
r} <(j\
condition S.3(A), and
n
This part shows that S.3 imphes
(2).
there exists a finite set
(iii)
from sup^s i^ >
su^p{^^^^^irx:,
(ii)
from
Since e
hminf„^oo P{supy« in < 1^} > P{sup^,^5 £00 < ^}- Conclude by the Portmanteau lemma
G A) for finite set A in S.2 follows
i^o- The joint convergence of (supys £„,
that supys in —^d supys
(5
D
similarly.
Proof of Theorem
6.6.
Step
1 verifies
4.1.
PROOF OF PART
1.
(C.l
and Step
—
n and
6„
=
is
organized in the following steps.
In particular, uniform convergence
=
||(Gn[m,(0)]
>
C
>
C- |v^||^p[mi(0)]||
wp
—^
1,
uniformly in 9 G
+ V^Ep[mi{9)]f
-
Step
by
and the
imme-
rates of convergence
inf
Q
by definition
mineig Wni9)
>
C
>
0,
+ y\\ >
>C-\C-V^{d{9,ei)A5)-Op{l)f, hysnp\\Gn[mi{9)]\\ = 0p{l)
\\Gn[mi{e)]\\\^
is
Q} being P-Donsker and having Ep[mi(9)] —
+ y^Ep[m,{9)]yW^/^i9)f
\\Gn[mi{9)]
2,
5 verifies S.1-S.3.
y/n in C.l follow from {mi{9), 9 G
on ©;. To verify C.2 observe that
nQn{9)
The proof
and C.2: Uniform Convergence and Quadratic Minorants) Condition C.l
diate from condition M.l(a,c,d,e).
an
(1).
C.l and C.2. Step 2 gives an auxiliary basic approximation for £„. Using Step
3 verifies C.4, Step 4 verifies C.5,
Step
is
by inequality
\\x
wp
\\\y\\
-
^
1,
by M.l(e)
||x|||
and M. 1(d),
eee
(6.2)
where sup^gg
~
||G„[mj(^)]||
Op{l) follows from P-Donskerness.
choose (Kejne) large enough so that for
>l-C-C^-n-
nQniO)
This
[d{9,
all
Therefore, for any £
n > n^ with probabUity
at least
9
0;) A 6f uniformly on {0 G
1
-
>
d{e, 9/)
:
>
we can
e
Ke/n^^~}.
verifies C.2.
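Step 2 (An Auxiliary Expansion). Write

$$\ell_n(\theta, \lambda) = \|\sqrt{n} E_n[m_i(\theta + \lambda/\sqrt{n})]' W_n^{1/2}(\theta + \lambda/\sqrt{n})\|^2 = \|(G_n[m_i(\theta + \lambda/\sqrt{n})] + \sqrt{n} E_P[m_i(\theta + \lambda/\sqrt{n})])' W_n^{1/2}(\theta + \lambda/\sqrt{n})\|^2.$$

For any non-empty compact subset $K$ of $\mathbb{R}^d$, we have uniformly in $(\theta, \lambda) \in \Theta_I \times K$: (1) $G_n[m_i(\theta + \lambda/\sqrt{n})] = G_n[m_i(\theta)] + o_p(1)$, by the stochastic equicontinuity arising due to P-Donskerness, (2) $W_n(\theta + \lambda/\sqrt{n}) = W(\theta) + o_p(1)$, by M.1(e), (3) $G_n[m_i(\theta)] =_d \Delta(\theta) + o_p(1)$ in $L^\infty(\Theta)$, by P-Donskerness, where $\Delta(\theta)$ is the Gaussian process defined in the statement of the theorem, and (4) $\sqrt{n} E_P[m_i(\theta + \lambda/\sqrt{n})] = G(\theta)\lambda + o(1)$, by M.1(d) and by $E_P[m_i(\theta)] = 0$ for all $\theta \in \Theta_I$. These results imply that

$$\ell_n(\theta, \lambda) =_d \|(\Delta(\theta) + G(\theta)\lambda)' W^{1/2}(\theta)\|^2 + o_p(1) = \ell_\infty(\theta, \lambda) + o_p(1) \quad \text{in } L^\infty(\Theta_I \times K).$$

Note that $\ell_\infty(\theta, \lambda)$ is stochastically equicontinuous in $L^\infty(\Theta_I \times K)$, because $(\theta, \lambda) \mapsto (\Delta(\theta), G(\theta)\lambda, W(\theta))$ is stochastically equicontinuous in $L^\infty(\Theta_I \times K)$.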
Step
3 (C.4: Convergence of C„).
>
where C
and Smorodina
Step 4
Cn{5n)
By
Step
C„ =d sup^gQ^ ||A(^)'Tyi/2(0)||2+Op(l) = C-f Op(l),
11.1 of Davydov, Lifshits,
(1998). This verifies C.4.
(C.5: Approximability of C„).
=
2,
and has a continuous distribution function by Theorem
a.s.
sup
sup
nQn{9) =d
By
expansions in Step 2
\\A{9)'W^'\9)\\
+
Op{l)
=d sup
\\A{9)'W'I\9)\\ +Op(l),
C
where the
Step
last
equahty follows by stochastic equicontinuity of
A{9yW^/'^{9). This
5 (S.1-S.3: Limits of Related Statistics) This step shows that
to M.l(a,c,d,e), then S.l and S.3 hold. S.3 implies S.2
o(l).
i->
Then,
for
some
e„ j 0, Isup^-^^^
suP||(6),A)-(e',A')||<en Koo(^, A)
-
^00(6*', A')
by
- sup^^^^j <
+ Op(l)| +
Op(l)
Lemma
if
M.l(b) states dH{V^,V^)
3.2.
supi\i^gx)-{e',\')\\<en
=
Op(l)
verifies C.5.
M.l(b) holds in addition
I4(^,A) -4(6i',A')|
=
=
by Step 2 and stochastic equiconti-
nuity of ^oo(6',A). This verifies S.l(B). Condition M.l(a) implies S.l(A).
By
This
Step
2,
the finite-dimensional limit of 4(6', A) equals
by stochastic equicontinuity
imability condition S.3(B)
Proof of Part
particular,
the
first
that
=
A)
||(A(6I)
+ G{9)XyW^/^{9)f.
verifies S.3(A).
Finally, note that
well
£00(6*,
(2).
is
inf(5i ;^)gv'4
finite-dimensional approx-
and
it is
therefore omitted. In
=
nQn{9) — ninfg'ge Qn{^'), where asymptotic approximations for
to the proof of Part (1). The second term inf^/ggnQn (&') can be arbitrarily
nQn{9 + X/^/n) —d
=
2,
D
similar to the proof of Part (1),
approximated by mi,Q^^s^y6 nQn{9+X/^/n)
infg/ge nQn{9')
and Step
trivially satisfied.
The proof is
we have that nQn{9)
term are identical
of £oo(^, A)
for
a sufficiently large
mf(^gx)QV^ ^oo{9, X)
inif^gx-j^yx, ioo{9, A)
-I-
Op(l).
The
+ Op(l).
5.
Then
as in Part (1)
follows
Setting 5 arbitrarily large gives that
limit \rd(^gx)&v^ ^oo(^, A) exists
'
it
°°
and
is
tight
due
monotone convergence:
to
as 5 t
V^
cxd,
V^, and
f
>
mf(^g)^)^Y^ioo{9,X) [ mi^g\)^v^^oo{d,X)
n
a.s.
6.7.
Proof of
Lemma
Proof of Part
VM
^
4.1.
(2).
PROOF OF PART
This part holds
(1).
G = 9^ is convex and compact. Define
V^ = {(6',A) 6 eQi,Xe v^(0g - 6)}.
Consider the simplest case where
V^iQg -
{X e Bs X €
9) for some n'}. Define
Note that V^ C V^ by convexity of Qg and V^ t V^ monotonically
:
V^ and V^
implies convergence in the Hausdorff distance because
Further,
6
V^^ denote V^ from the convex
let
+ X/^) = 0}, Vi
We have dniVlVl) <
e Qi,X ^ B6,gr{e
Bs,Ve9riO)X = 0}.
the first term is o(l) by the argument
6.8.
are subsets of a
compact
Mi^, Mij.
Ml,, Ml,
+ Er^idniMi^Ml,) =
:= {(0,A)
set.
{{0,X)
is
by an argument similar to that
in
bounded by
Lemma
2 in
D
4.2.
Proof of Part
The proof
(1).
organized as follows:
is
Conditions C.l, C.2, and C.3. Step 2 gives an auxiliary basic approximation for
another approximation. Using Step 2 and
6.1 gives
:
e Qi,X e
o(l),23 where
9
:
convex case and the second term
o(l)
=
This
(1997).
Proof of Theorem
verifies
dH{V^',V^')
is
:
in the set-theoretic sense.
Define V^ := V^^ nf^j
case.
:= V^' n^^,
for the
X^r^i supggQ;^ <^//(-^nr(^)'-^oor(^))i which
Andrews
D
trivially.
Lemma
Step 3
6.1,
verifies
in.
Step
1
Lemnla
Condition C.4, Step
4 verifies Condition C.5, and Step 6 verifies Conditions S.1-S.3.
Step
1
and
(Verification of C.l, C.2,
C.l
C.3).
uniform convergence and the rates of convergence a„
0} being P-Donsker and Ep[mi{6)] <
immediate from M.2(a,c,d,e). In particular,
is
= n and
6„
=
^/n in C.l follow from {mi{9),6 E
on-B/. To verify C.2 observe that
wp
-^
1,
uniformly in
9eQ
nQn{9)
=
\\{Gn[m,{9)]
+
>C-\\Gn[mi{e)]
by
By
C
inf
•
>
This follows by
>
(6.3),
by
Op(l), where the latter
>C
n-
C
>
wp
-^
(||G„K:(0)]
{d{9,
0/) A
{Ki;,n^) so that for all
>\-C,-C
by
definition
y^Ep[m,{e)]\\l
\\V^Ep[m,{9)]\\l
we can choose
nQn{9)
+
mineig Wni9)
M.2(d), \\VnEp[mi{9)]\\l
any e
V^Ep{mi{9)]yW^^\e)\\l
-n- {d{9, 0;) A
(5)^
+
5)^
1,
^^'^'
by M.2(e)
V^Ep[mi{9)]\\l/\\V^Ep[m,i9)]\\l).
on
for
some
C>
n > n^ with probability
uniformly in
{6i
G
:
and 5 >
at least 1
d{9,
0/)
0.
—
Therefore, for
e
> ^Jn^/^}.
+ /||x||+ —» 1 as ||x||+ -^ oo for any y £ IR'^, and by supg^Q
holds by the P-Donsker property. This verifies condition C.2.
||y-l-a;||
^^This follows by the elementary inequality
||Gn[mj(0)]||
dniAn B,C Ci D) < dniAO B,C n B) + dniC n D,C f] B) <
dH{A,C) + dH{B,D).
To
wp —
verify C.3 observe that
nQn{e)
=
\\{Gn[mi{e)]
<
C'
<
C'
+
1,
uniformly in 9 G Qj
^Ep[mi{e)])'Wl''^[e)\\l by definition
+
||(G„K(0)]
by supmaxeig Wn{d) <
y/^Ep{mi{e)]\\l
2_^ |'G„[mij(0)]
•
>
+
where subscript
•s/nEp[mij{6)\\'^_^_,
<
C'
oo
wp
j denotes j-th
^
1,
by M.2(e)
element of vector m,:(0)
i<J
<
C'
2Z
•
l'^p(^)
-
\/n
•
C
•
{d[e,
e \ Gy) A <5)|;
for
C>
some
and
>
5
by M.2(f).
o<J
Therefore, for any e
least 1
—
verifies
Step
II
G
implied by
A{6)
-
Q'''^'^
uniformly on
k^ large enough so that for
n > n^ with probability
all
at
^^g^Q^,
9\
^{0,
9/)
>
K,/n^/^}.
Condition C.3.
(A Basic Approximation). Write
2.
i<Gn[mi{e
in (^, A)
we can choose
£
Qn[e)
This
>
4(61, A)
=
\\y/nEn[mi{e+X/y/n)]'Wn^'{e+X/^)\\l
=
+ ^Ep[m,{9 + A/Vn)])' Wn^^{e+X/^/fi)\\l. We have for any 5 > 0, uniformly
(9 X Bs): (1) G„[mj(0 + X/^/n)] — Gn['mi{6)] + Op(l), by the stochastic equicontinuity
the P-Donsker property, (2) W„(i9 + X/y/n) = W{9) + Op(l), by M.2(e), (3) Gn[mi{e)] =d
+
A/Vn)]
+ Op(l)
in
L°°(9), by P-Donskerness property, where A{6)
^Ep[mi{0 + X/^/ii.)]
the statement of the theorem, (4)
the Gaussian process defined in
is
- ^Ep[mi{e)] ^ G{9)X + o{l), by
A4.2(d).
Therefore
en{0, A) -rf \\iA{9)
Steps
Lemma
3,4,
6.1.
and
.5
also
+
Gi9)X
make
=d sup
'=^
||(A(0)
sup ||(A(0)
^max
Qj
:— {9 G
Qj
:
L°°(9/ x Bs).
in
Op(l)||^,
use of the following result.
The following approximation
sup£„(0. A)
where
+ V^Ep[m,{0)]yw'/'{9) +
Ep[mij{9)]
true:
+ G{9)X + V^Ep[m,{0)]yw'/\9) +
+
sup
~
is
+ mywy'.{9) +
G{9)X
J]|(A,(0)
+
Op(l)||t
+
G,(^)'A)l^_,f(0)
0\/j G J,Ep[mij{9)]
Op{l)\\l
<0\/j G
J'^},
(6 4)
Op(l)|-;,
J
denotes any non-empty
subset of {1, ...,J}, arid
ij{9) :-
The proof
Step
of this
3.
(C.4:
lemma
if
is
Ep[m.ij{9)]
=
and
^^{9) :=
^
if
Ep[m^J{0)]
<
0.
(6.5)
given below, immediately after the proof of this theorem.
Convergence of C„) Application of
C„
-co
sup
6.1 for
V^
= V^ = Q
+ m)'W^'-{0) +
Op(l)f+
X; \A,{eywjp{9) +
Op(l)|^.
sup nQn{9) =d sup i|(A(0)
= max
Lemma
x {0} yields
<
Hence P[C„
By Theorem
—
c]
<
P[C
*
11.1 of
function of A(^) imphes that
mass
at c
=
C has continuous
=
1/2
ixiax.j
laaxj^j supg^Qj.[Aj{6)Wjj
Ppn <
0]
=
Step
4.
P[Y <
<
P[C
This
0].
=
Ep[mi{9')])\\
nQn{e)=d
Then
o(l)
it
oo) with a possible point
Y =
Then by
(6.6)
0],
with some e„
e„]
By
(£„ T 0),
j
IR.
and P[Y <
e„]
-^
sup
Step 2
+ ^Ep[mi{9)]yw'/^{9) +
\\{A{e)
and by
of ^ h^ {A{9),W^^'^{9))
follows that for any
(5„
or
j
follows as in Step 3 that P[Cn{Sn)
<
c]
(5„
=
\\iAi9)+V^Ep[m,{9)]yw'/\9)+Oj,{l)\\l
it
[0,
note that non-degeneracy imphes that
Condition C.4.
verifies
and by stochastic equicontinuity
statement of the theorem.
has a continuous distribution function on
(6)]
(C.5: Approximability of C„)
sup
sup
=
bounded above (below) by P[Y <
is
0]
in the
(1998), non-degeneracy of the covariance
distribution function on
-^ P[C
0]
C defined
for
0,
and Smorodina
Lifshits,
To show P[Cn
0.
>
for each c
c]
Davydov,
—
>
Opil)\\l
supug, _0»^j^,/:j^\\^/n{Ep[mi{9)]
—
T
sup \\{A{9)+V^Ep[mi{e)]yw'/^i9)+Op{l)\\l.
P[C <
c]
for
each c
>
This
0.
verifies condition
C.5.
Step
5.
(Verification of S.1-S.3) S.l(A) follows
from M.2(a). S.l(B) and S.2 follow from
Lemma
Further, in equation (6.4), for each J, supys^^g^Q^ J^jej \i^A^) + Gj{eyX)WJj\e) + Op{l)\l
admits finite-dimensional approximation by stochastic equicontinuity of (9, A) i-+ {A{9),G{9)X, W{9))
6.1.
L°°(0 X Bs), which implies S.3(B). By Step 2, the finite-dimensional
\\{A{9) + G{9)X + ^{9)yW^/^{9)\\l, which verifies S.3(A).
ioc{0,\)
in
limit of £„(0,A) equals
-
Proof of Part
6.9.
Proof of Lemma
4.2.
Equality
Step
Here
1.
(i)
\\{A{e)
Wp
The proof is
(2).
6.1.
-^
-^
2.
immediate by Step
is
2 of the proof of
+ V^Ep[mii9)]yw'/'i9) +
gn{9,X,x):^\\{A{9)
+
G{9)X
+ myw'/\9) +
1,
for
some
e„
J.
by y/nEp[mi{9)]
0,
gn{9,X,-en)
>
^{9) for
for
1,
entries,
some
\\{A{9)
and
(ii)
Furthermore,
.
for
<(i) /n(6', A,
x\\l,
-e„)
<(ii)
£n{9,X)
each 9 G Qi and by monotonicity:
(iii)
Theorem
x\\l.
+ G{9)X + X2yW^/\9)\\l,
and
therefore omitted.
proven as follows. Define
G{9)X
vecallmg
tha.t
<{zzi)
xi
>
fn{9,X,en).
X2 imphes
W{9) is dia,gonal
Theorem 4.2.
follow from Step 2 of the proof of
e„ j
supp„(6i,A,-e„)
Step
it is
+
+ G{0)X+.xiyW^/^{9)\\l >
wp
is
and
fn{9,X,x):=\\{A{9)
with positive diagonal
Therefore,
equality in (6.4)
the main claim of the lemma,
(*) in (6.4),
follows
The first
similar to the proof of Part (1),
D
D
any e„
f
sup5„(6',A,£„)
v^
or e„
=
<
sup4(6i,A)
<
sup/„(6', A,e„).
J,
sup5„(6',A,Op(l))
=
sup5„(6l,A,Op(l)),
VT
VI
,„^.
^^^)
where
V^
is
To show
the closure of V^.
sup5„(0,A,£„)
Then by M.2(b)
for
equicontinuity of
(0, A)
= max
V
sup
{A{e),G{e)X,W{e))
sup
Y.
+
I(^(^)
|(Aj(0)
+ G,(e)'A)VK/f (0) +
=
ej,V^\9 G Qj)
every J, dniV^lO e
^
write
this,
o(l),
e„|2..
which impHes by stochastic
L°°(0 x Bs) that
in
G,(e)'A)T4^f (0)
+ e„|2.
=
sup
J]|(A,(e)
=
sup
y:i(A,(g)+G,(gyA)I^f(g) +
+ G,(0)'A)l^f(0) + Op(l)|i
Op(l)P^,
so relation (6.8) follows.
Step
3.
This step shows that for some e^
i 0,
<
sup/„(6>,A,e„)
ys
Observe that
for
<
(jiO)
if
n
1
Q,n,£
—e
and e
—
{lo
for all
>
^rt
:
n> Ue-
=
O(^)
supgg©^^;^^^^ l|A(0)||
<
(6.9)
(w„(fc),
^iTasup^.Ep[mij{en)]
K^]. For any £
V^ by dniV^, V^)
^
and by
(6.9)
>
V^CQj
(6.11) gives that lim;
"
SUp5„(fc)
^ ^* ^^'^
x Bg. As
>
(6*,
j.
This gives a contradiction to
Combining Steps 1,2, and
6.10.
Proof of
Lemma
^n{k{i))
in
^jiO)
=
K^ such
-oo. (gjo)
>
that P(n„_£)
e
>
~*
'^*!
0.
(6.11)
where
A*,e/2)
in (6.7), this inequality
>
wp
-
(^*, A*) is in
-^
1
g„(fc(,))
the closure
,
(6**,
A*, e/2)](tj„(fc(,)))
can occur only
>
if
Ci(^*)
(6.10). Therefore, the claim of
V^j
>
Step 2 conclude that
g„((9*,
3 implies the result of the
4.2. Define
A, e)](w„(fc))
[/„(;,(,)) (6i„(fc(i)), A„(fc(;)),0)
Given the definition of /„ and Qn stated
some
there exists
0,
limsupVnM))£^pKj(^^(/c(/)))]
for
H
^j{0)
does not hold, then there must exist constants
sup5„(6l,A,e) =sup5„(6',A,e)
0.
=
On(k),K{k)) with u^^k) £ ^n{k),e^ {Sn{k),K{k)) e V^^^^y such that
Select a further subsequence such that ^n(fc(0)
which together with
1.
n
Suppose that relation
and a subsequence
0.
lim[/n(fc)(^n(fc))'*^n(fc).en(fc))
of
-^
ys
any 0„ G 0/ converging to 6 ^ Qj,
limsup ^/nEp[m,J{0n)]
Let
wp
supg„(^,A,e^)
Step 3
is
correct.
D
lemma.
:= V^\9 G Oj. Apply the proof of
Lemma
4.1 for
V^
to
°
Kj36
>
7. Extension: Pointwise Approach

Suppose that one is interested in a particular parameter $\theta^*$ inside $\Theta_I$. The inference about $\theta^*$ in $\Theta_I$ is well motivated, when there is a sense in which $\theta^*$ is the true parameter. The latter is typically the case when it is maintained that the economic models are correct representations of data-generating processes of real data for some parameter value $\theta^*$.^24 In this scenario, the set $\Theta_I$ is not of interest per se, but rather $\theta^*$ is.

In order to facilitate inference about $\theta^*$ we make the following assumption.

Condition C.6. Suppose there exists $a_n \to \infty$ such that, for $\mathcal{C}_n(\theta) := a_n Q_n(\theta)$, $P(\mathcal{C}_n(\theta) \le c) \to P(\mathcal{C}(\theta) \le c)$ for each $c \ge 0$ and each $\theta \in \Theta_I$, where $\mathcal{C}(\theta)$ is a random real variable that has a continuous distribution function on $[0, \infty)$ and $\alpha$-quantile denoted as $c(\alpha, \theta)$. Moreover, for at least one $\theta \in \Theta_I$, $\mathcal{C}(\theta) > 0$ with positive probability.

Using the fact that $Q(\theta) = 0$ at $\theta = \theta^*$, we construct a confidence region for $\theta^*$ as follows. We test whether $Q(\theta) = 0$ for each $\theta \in \Theta$. Then we collect all $\theta \in \Theta$ that pass the test to form a confidence region for $\theta^*$. More precisely, we collect all $\theta \in \Theta$ such that $a_n Q_n(\theta) \le c(\alpha, \theta)$.

Towards the construction of confidence regions, suppose the estimate $\hat c(\theta)$ is available such that $\hat c(\theta) \to_p c(\alpha, \theta)$ for each $\theta \in \Theta_I$. Consistent estimates $\hat c(\theta)$ can be obtained by subsampling or, for moment condition models, through the use of the limit distributions obtained in Theorem 7.3. Consistency of the subsampling estimate $\hat c(\theta)$ follows by the standard argument, e.g. the one given in Step 2 of the Proof of Theorem 3.3. It should be noted that subsampling is generally less accurate than the use of the limit distributions.

Let $\hat\Theta_I$ be an estimator of $\Theta_I$ so that $\Theta_I \subseteq \hat\Theta_I$ wp $\to$ 1. We also want $\hat\Theta_I$ to be a sharp estimate; for instance, we can set $\hat\Theta_I = C_n(\log n)$, which under C.1 and C.2 is consistent and converges at rate $(\log n/n)^{1/\gamma}$. Let also $\hat c$ be any consistent estimate of the $\alpha$-quantile of $\mathcal{C}$ defined in C.4. Recall that we used $\hat c$ for the construction of the region-wise critical value.

The following two regions will be considered. The first region is a simple region defined by a single critical value:

$$C_n(\hat c^* \wedge \hat c) := \{\theta \in \Theta : a_n Q_n(\theta) \le \hat c^* \wedge \hat c\}, \quad \text{where } \hat c^* := \sup_{\theta \in \hat\Theta_I} \hat c(\theta). \tag{7.1}$$

The second region is a region that employs critical values that depend on $\theta$:

$$C_n(\hat c(\cdot) \wedge \hat c) := \{\theta \in \Theta : a_n Q_n(\theta) \le \hat c(\theta) \wedge \hat c\}. \tag{7.2}$$

Remark 7.1. The two constructions are equivalent in many cases, since the objective functions can be transformed to have equal quantiles.^25 The first construction is more parsimonious, and hence is easier to compute and report. Clearly, either region is a subset of, and hence is no larger than, the confidence region $C_n(\hat c)$ for $\Theta_I$.

^24 There is $\theta^* \in \Theta$ such that the model law $P_{\theta^*}$ agrees with the actual law of data $P$.
^25 This can be seen by defining the new criterion function $\tilde Q_n(\theta) := Q_n(\theta)/\max[c(\alpha, \theta), \epsilon]$ for all $\theta \in \Theta_I$. In many examples this is unnecessary, as criterion functions have the equi-quantile property by using optimal weights.
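The two pointwise regions (7.1) and (7.2) amount to inverting a test over candidate parameter values, which can be sketched on a finite grid as follows. This Python fragment is an illustration under assumptions: the scaled statistic $a_n Q_n(\theta)$, the pointwise critical values $\hat c(\theta)$, the set-wise critical value $\hat c$, and the indicator of membership in the set estimate are all taken as precomputed inputs, and every name is hypothetical.

```python
import numpy as np

def region_pointwise(theta_grid, stat, c_theta, c_set):
    """Region (7.2): theta with a_n Q_n(theta) <= min(c_hat(theta), c_hat)."""
    return theta_grid[stat <= np.minimum(c_theta, c_set)]

def region_single(theta_grid, stat, c_theta, c_set, in_estimate):
    """Region (7.1): one cutoff min(c_star, c_hat), where c_star is the
    largest pointwise critical value over the set estimate."""
    c_star = c_theta[in_estimate].max()
    return theta_grid[stat <= min(c_star, c_set)]
```

Because both cutoffs are truncated at `c_set`, each returned region is contained in the set-wise region `theta_grid[stat <= c_set]`, in line with Remark 7.1.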
Remark 7.2. If $\hat c(\theta)$ is obtained by subsampling, the truncation of critical values by $\hat c$ improves Bahadur efficiency: Indeed, if $\theta \notin \Theta_I$, we have that $\hat c(\theta) \to_p +\infty$, typically at the rate $a_b$, but $\hat c(\theta) \wedge \hat c \to_p c(\alpha) < \infty$. Therefore, subsampling implementations that, in contrast to our construction, do not truncate $\hat c(\theta)$ by $\hat c$ suffer from the loss of power in finite samples. Chernozhukov and Fernandez-Val (2005) show, in a different situation, that the Bahadur inefficiency of canonical (untruncated) subsampling leads to a substantial loss of power in finite samples.

Remark 7.3. The construction of either region employs the pointwise inversions of tests of point hypotheses $Q(\theta) = 0$. This follows the Anderson and Rubin (1949) construction of confidence regions for the case of simultaneous equations. In the case of weakly identified and unidentified linear instrumental variable models, the construction was used by Dufour (1997) and Staiger and Stock (1997), among others. In a partially identified dynamic censored regression model, Hu (2002) also employed region (7.2) for inference. In a partially identified instrumental variable quantile regression model, Chernozhukov and Hansen (2004) also use the region (7.2). A previous version of the paper, Chernozhukov, Hong, and Tamer (2002), Appendix G, also gave pointwise constructions. Imbens and Manski (2004) investigate the Wald type inference about $\theta^*$ for the special case where $\theta^*$ is a real parameter known to belong to an interval whose endpoints can be consistently estimated. The analysis here applies to a considerably more general setting.

Remark 7.4. The more recent developments in the literature include Andrews and Guggenberger (2006) and Sheikh (2006) who show that the confidence regions of the type proposed here, with critical values obtained by subsampling, have important robustness (uniform coverage) properties. Note, however, that in moment condition models, we can construct the critical values using limit distributions, e.g. see Remark 7.6, which should be preferable to subsampling due to higher accuracy.

Remark 7.5. Due to reasons given in Remark 7.2, our regions (7.2) constructed using subsampling will be less conservative than the regions studied by Andrews and Guggenberger (2006) and Sheikh (2006). The latter are constructed using the canonical (untruncated) subsampling critical value $\hat c(\theta)$. In contrast, regions (7.2) use the truncated critical value $\hat c(\theta) \wedge \hat c$.

Theorem 7.1. Suppose that (a) Conditions C.4 and C.6 hold, and (b) for each $\theta \in \Theta_I$ we have $\hat c(\theta) \to_p c(\alpha, \theta)$ and $\hat c \to_p c(\alpha) \ge \sup_{\theta \in \Theta_I} c(\alpha, \theta)$, where $\hat c(\theta) \ge 0$ and $\hat c \ge 0$ with probability 1. Then, (1) for any $\theta^* \in \Theta_I$, $\liminf_{n \to \infty} P\{\theta^* \in C_n(\hat c^* \wedge \hat c)\} \ge \alpha$, and (2) $\liminf_{n \to \infty} P\{\theta^* \in C_n(\hat c(\cdot) \wedge \hat c)\} \ge \alpha$.

Proof of Theorem 7.1: Part (1):

$$\liminf_{n \to \infty} P\{\theta^* \in C_n(\hat c^* \wedge \hat c)\} \ge_{(i)} \liminf_{n \to \infty} P\{a_n Q_n(\theta^*) \le \hat c(\theta^*) \wedge \hat c\} \ge_{(ii)} \liminf_{n \to \infty} P\{a_n Q_n(\theta^*) \le (c(\alpha, \theta^*) + o_p(1)) \vee 0\} =_{(iii)} P\{\mathcal{C}(\theta^*) \le c(\alpha, \theta^*)\} =_{(iv)} \alpha,$$

where (i) follows by construction, (ii) follows by the assumptions on $\hat c$ and $\hat c(\theta)$, (iii) follows by Condition C.6, and (iv) follows by Condition C.6. Part (2): This part trivially follows from inequality (i) in the proof of Part (1).

The
following theorem provides consistency and rates of convergence of the sets constructed
above.
Theorem
7.2 (Consistency and rates of convergence)
7.1 hold.
Consider estimators 0/ := {5 G
9
:
.
Suppose C.l, C.2, and conditions of Theorem
anQn{9) <
c{6) f\c
+
Kn] and 9/ := {^ G
9
:
o-nQniS) < c* Ac+ Kn], where Kn :—
dniOhQi) = Op([l/a„]V7) ^ 0^(1)
The same
Op(l), otherwise.
Proof of Theorem
follows
Theorem
n-V2
known
is
c.3 is^known
and k„ := logn otherwise. Then,
to hold,
to hold,
and dH{Q],Qi)
= Op
(
[log
n/a„]V7)
Since Cni^n)
C 6/ C 6/ C Cn{c +
«;„),
C„(k„) and C„(c
for
the rate and consistency results
+
k„) obtained in
Theorem
3.1
D
3.2.
(Limits of Cn{0) in
7.3.
moment
=
results apply to Qj.
from the rates and consistency results
and Theorem
for the
7.2:
when C.3
if
Moment
Condition Models) (1) Suppose Condition M.l holds
P-Donsker condition on moment functions implies
equality model. In particular, the
{Yl'i=i{mi{0)
-
=
Ep[mi{e)])) -^^ A(0)
c{e)
.= \\i\{eyw^i\e)f,
c{e)
:^ \\^{e)'w'!\e)f
N{Q,Ep[^{e)/\{e)']). Then, Condition C.6 holds with
(7.3)
-
m[e') +
inf
G{e')x)'w'/\d')\\\
(7.4)
.
when Qn{S) and Qn{d) = Qn{^) — inf^'ge Qn{9') are used for inference, respectively.
Suppose
Condition M.2 holds for the moment inequality model. Then, Condition C.6 holds with
(2)
for the case
c{0)
.=
ae)
.= \\{A{e)
for the case
m{9)+myw'/'{e)\\i,
when Qn{^)
Proof of Theorem
the proof of
Remark
and
(7.6),
4.3,
and
Theorem
(7.5)
+ myw'/H9)\\lQni9)
o-nd
7.3.
=
ini
\\{Aie')
+
G{9')x
+ mrw'/'{e')\\l,
Qn{9) — mfgiQQQn{6') are used for inference,
Part (1) follows from the proof of Theorem
4.1.
(7.6)
respectively.
Part (2) follows from
D
4.2.
7.6. (Quantiles oiC{9)
by Simulation) The quantiles of C(6'),
can be obtained by simulating variable
specified in (7.3), (7.4), (7.5),
C!^{9) specified respectively in
Remarks
4.1, 4.2,
4.4.
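To make the simulation idea in Remark 7.6 concrete, here is a minimal sketch for a limit variable of the form (7.5) at a fixed $\theta$. It rests on several simplifying assumptions that are not in the paper: the weighting matrix is taken to be diagonal, the Gaussian vector $\Delta(\theta)$ is generated from a user-supplied lower-triangular Cholesky factor of its covariance, and $\xi_j(\theta)$ is set to 0 for binding moments and to minus infinity for slack moments. The function name `simulate_quantile` is hypothetical.

```python
import math
import random


def simulate_quantile(chol, w_diag, xi, alpha, draws=20000, seed=0):
    """Simulated alpha-quantile of sum_j w_j * max(Delta_j + xi_j, 0)^2.

    chol   : lower-triangular Cholesky factor (list of rows) of Var(Delta)
    w_diag : diagonal entries of the weighting matrix W (assumed diagonal)
    xi     : drift terms; 0.0 for binding moments, float('-inf') for slack ones
    alpha  : quantile level in (0, 1)
    """
    rng = random.Random(seed)
    k = len(w_diag)
    values = []
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(k)]
        # Delta = chol @ z, using only the lower triangle
        delta = [sum(chol[j][l] * z[l] for l in range(j + 1)) for j in range(k)]
        stat = sum(w_diag[j] * max(delta[j] + xi[j], 0.0) ** 2 for j in range(k))
        values.append(stat)
    values.sort()
    idx = min(draws - 1, max(0, math.ceil(alpha * draws) - 1))
    return values[idx]


# One binding moment: the statistic is max(Z, 0)^2 with Z standard normal.
q_binding = simulate_quantile([[1.0]], [1.0], [0.0], 0.95)
# One slack moment: the statistic is identically zero.
q_slack = simulate_quantile([[1.0]], [1.0], [float("-inf")], 0.95)
```

With a single binding moment the 0.95 quantile should be close to the square of the 0.95 standard normal quantile, while a slack moment contributes nothing to the limit.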
References
Amemiya, T. (1985): Advanced Econometrics. Harvard University Press.
Anderson, T. W., and H. Rubin (1949): "Estimation of the parameters of a single equation in a complete system of stochastic equations," Ann. Math. Statistics, 20, 46-63.
Andrews, D. W. K. (1997): "Estimation when a parameter is on a boundary: Theory and Applications," Cowles Foundation Working Paper, Yale University.
Andrews, D. W. K. (1999): "Estimation when a parameter is on a boundary," Econometrica, 67(6), 1341-1383.
Andrews, D. W. K. (2001): "Testing when a parameter is on the boundary of the maintained hypothesis," Econometrica, 69(3), 683-734.
Andrews, D. W. K., S. Berry, and P. Jia (2004): "Placing Bounds on Parameters of Entry Games in the Presence of Multiple Equilibria," Working Paper, Yale University.
Andrews, D. W. K., and P. Guggenberger (2006): "The Limit of Finite Sample Size and a Problem with Subsampling," Working Paper, Yale University.
Angrist, J., and A. Krueger (1992): "Age at School Entry," Journal of the American Statistical Association, 57, 11-25.
Bajari, P., L. Benkard, and J. Levin (2006): "Estimating Dynamic Models of Imperfect Competition," Working Paper, University of Minnesota, forthcoming in Econometrica.
Bajari, P., J. Fox, and S. Ryan (2006): "Evaluating Wireless Carrier Consolidation Using Semiparametric Demand Estimation," Working Paper, MIT.
Beresteanu, A., and B. Ellickson (2006): "Dynamics of the Retail Oligopoly," Working Paper, Duke University.
Beresteanu, A., and F. Molinari (2006): "Asymptotic Properties for a Class of Partially Identified Models," Working Paper, Cornell University.
Blundell, R., M. Browning, and I. Crawford (2005): "Best Nonparametric Bounds on Demand Responses," Working Paper, University College London.
Borzekowski, R., and A. Cohen (2004): "Estimating Strategic Complementarities in Credit Unions' Outsourcing Decisions," Working Paper, New York Federal Reserve Bank.
Chernoff, H. (1954): "On the distribution of the likelihood ratio," Ann. Math. Statistics, 25, 573-578.
Chernozhukov, V., and I. Fernandez-Val (2005): "Subsampling Inference on Quantile Regression Processes," Sankhya, 67, 253-276.
Chernozhukov, V., and C. Hansen (2004): "Instrumental Quantile Regression," Working Paper, MIT.
Chernozhukov, V., and C. Hansen (2005): "An IV Model of Quantile Treatment Effects," Econometrica, 81, 806-839.
Chernozhukov, V., C. Hansen, and M. Jansson (2005): "Finite-Sample Inference for Quantile Regression Models," Working Paper, MIT.
Chernozhukov, V., H. Hong, and E. Tamer (2002): "Parameter Set Inference in a Class of Econometric Models," Working Paper, MIT.
Ciliberto, F., and E. Tamer (2003): "Market Structure and Multiple Equilibria in Airline Markets," Working Paper, Northwestern University.
Cohen, A., and M. Manuszak (2006): "Endogenous Market Structure with Discrete Product Differentiation and Multiple Equilibria: An Empirical Analysis of Competition Between Banks and Thrifts," Working Paper, New York Federal Reserve Bank.
Cuevas, A., and R. Fraiman (1997): "A plug-in approach to support estimation," Ann. Statist., 25(6), 2300-2312.
Dacunha-Castelle, D., and E. Gassiat (1999): "Testing the order of a model using locally conic parametrization: population mixtures and stationary ARMA processes," Ann. Statist., 27(4), 1178-1209.
Davydov, Y. A., M. A. Lifshits, and N. V. Smorodina (1998): Local properties of distributions of stochastic functionals. American Mathematical Society, Providence, RI.
Demidenko, E. (2000): "Is this the least squares estimate?," Biometrika, 87(2), 437-452.
Dudley, R. M. (1985): "An extended Wichura theorem, definitions of Donsker class, and weighted empirical distributions," in Probability in Banach spaces, V (Medford, Mass., 1984), vol. 1153 of Lecture Notes in Math., pp. 141-178. Springer, Berlin.
Dufour, J.-M. (1997): "Some Impossibility Theorems in Econometrics with Applications to Structural and Dynamic Models," Econometrica, 65, 1365-1387.
Frisch, R. (1934): Statistical Confluence Analysis by Means of Complete Regression Systems. University Institute of Economics, Oslo, Norway.
Fukumizu, K. (2003): "Likelihood ratio of unidentifiable models and multilayer neural networks," Ann. Statist., 31(3), 833-851.
Geyer, C. J. (1994): "On the asymptotics of constrained M-estimation," Ann. Statist., 22(4), 1993-2010.
Gilstein, C. Z., and E. Leamer (1983): "Robust Sets of Regression Estimates," Econometrica, 51, 555-582.
Haile, P., and E. Tamer (2003): "Inference with an Incomplete Model of English Auctions," Journal of Political Economy, 111, 1-51.
Hannan, E. J. (1982): "Testing for autocorrelation and Akaike's criterion," J. Appl. Probab., (Special Vol. 19A), 403-412, Essays in statistical science.
Hannan, E. J., and M. Deistler (1988): The statistical theory of linear systems, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York.
Hansen, L., J. Heaton, and E. G. Luttmer (1995): "Econometric Evaluation of Asset Pricing Models," The Review of Financial Studies, 8, 237-274.
Hu, L. (2002): "Estimation of a censored dynamic panel data model," Econometrica, 70(6), 2499-2517.
Imbens, G., and C. Manski (2004): "Confidence Intervals for Partially Identified Parameters," Econometrica, 72, 1845-1857.
Kim, J., and D. Pollard (1990): "Cube root asymptotics," Ann. Statist., 18(1), 191-219.
Klepper, S., and E. Leamer (1984): "Consistent Sets of Estimates for Regressions with Errors in All Variables," Econometrica, 52(1), 163-184.
Knight, K. (1999): "Epi-convergence and Stochastic Equisemicontinuity," Technical Report, University of Toronto, Department of Statistics (http://www.utstat.toronto.edu/keith/papers/).
Korostelev, A. P., L. Simar, and A. B. Tsybakov (1995): "Efficient estimation of monotone boundaries," Ann. Statist., 23(2), 476-489.
Linton, O., T. Post, and Y. Wang (2005): "Testing for Stochastic Dominance Efficiency," Working Paper, London School of Economics.
Liu, X., and Y. Shao (2003): "Asymptotics for likelihood ratio tests under loss of identifiability," Ann. Statist., 31(3), 807-832.
Manski, C. F. (2003): Partial identification of probability distributions. Springer Series in Statistics. Springer-Verlag, New York.
Manski, C. F., and E. Tamer (2002): "Inference on regressions with interval data on a regressor or outcome," Econometrica, 70(2), 519-546.
Marschak, J., and W. H. Andrews (1944): "Random Simultaneous Equations and the Theory of Production," Econometrica, 12, 143-203.
McFadden, D. L. (2005): "Revealed stochastic preference: a synthesis," Economic Theory, 26, 245-264.
Molchanov, I. S. (2005): Theory of Random Sets. Springer, London.
Molinari, F. (2004): "Missing Treatments," Working Paper, Cornell University.
Newey, W. K., and D. McFadden (1994): "Large sample estimation and hypothesis testing," in Handbook of econometrics, Vol. IV, vol. 2 of Handbooks in Econom., pp. 2111-2245. North-Holland, Amsterdam.
Pakes, A., J. Porter, J. Ishii, and K. Ho (2006): "Moment Inequalities and Their Applications," Working Paper, Harvard University.
Phillips, P. C. B. (1989): "Partially identified econometric models," Econometric Theory, 5(2), 181-240.
Politis, D. N., J. P. Romano, and M. Wolf (1999): Subsampling. Springer-Verlag, New York.
Potscher, B. M. (1991): "Effects of model selection on inference," Econometric Theory, 7, 163-185.
Redner, R. (1981): "Note on the consistency of the maximum likelihood estimate for nonidentifiable distributions," Ann. Statist., 9(1), 225-228.
Rockafellar, R. T., and R. J.-B. Wets (1998): Variational analysis. Springer-Verlag, Berlin.
Romano, J. P., and M. Wolf (2000): "Finite sample nonparametric inference and large sample efficiency," Ann. Statist., 28(3), 756-778.
Ryan, S. P. (2005): "The Costs of Environmental Regulation in a Concentrated Industry," Working Paper, MIT.
Sheikh, A. (2006): "Inference for Partially Identified Econometric Models," Working Paper, Stanford University.
Staiger, D., and J. H. Stock (1997): "Instrumental variables regression with weak instruments," Econometrica, 65(3), 557-586.
van der Vaart, A. W. (1998): Asymptotic statistics. Cambridge University Press, Cambridge.
van der Vaart, A. W., and J. A. Wellner (1996): Weak convergence and empirical processes. Springer-Verlag, New York.
Varian, H. R. (1982): "The nonparametric approach to demand analysis," Econometrica, 50(4), 945-973.
Varian, H. R. (1984): "The nonparametric approach to production analysis," Econometrica, 52(3), 579-597.
Veres, S. (1987): "Asymptotic distributions of likelihood ratios for overparametrized ARMA processes," J. Time Ser. Anal., 8(3), 345-357.
ADDENDUM
for "Estimation and Confidence Regions for Parameter Sets in Econometric Models"

8. Robustness to Contiguous Perturbations of P

In this paper $P$, the true probability measure, is the nuisance parameter. The goal is to examine which contiguous perturbations of the original fixed $P$ preserve or do not preserve the estimation and coverage properties of the confidence regions. The idea of focusing on the local perturbations follows its uses in the confidence interval literature, see notably Dufour (1997), Potscher (1991), and Andrews and Guggenberger (2006). Intuitively, contiguous perturbations of $P$ can not be statistically detected with certainty, and we therefore want to make sure that contiguous changes in $P$ do not affect the coverage properties of confidence regions. An alternative motivation is that, in the asymptotic context, the relevant parameter space for nuisance parameters consists of contiguous parameter values, which is a standard approach in asymptotic efficiency analysis, see van der Vaart (1998), Chapter 8.7. In fact, finding minimal coverage under contiguous sequences is equivalent to establishing local uniform coverage, when the local nuisance parameters are allowed to vary over a compact set.[1]

We focus on examining the robustness of the main estimation and inferential results, the ones stated in Theorem 3.1 and Theorem 3.3.

8.1. Regular Cases. Consider a triangular sequence of probability measures $\{P_{n,\gamma}, n = 1,2,\ldots\}$, where $\gamma$ is an index of a sequence and $\{P_{n,\gamma}, \gamma \in \Gamma, n = 1,\ldots\} \subset \mathcal{P}$. Let $P^n_\gamma$ denote the law of data $w_1,\ldots,w_n$ under $P_{n,\gamma}$. Each $\gamma \in \Gamma$ is such that $P^n_\gamma$ is contiguous to $P^n$, the law of data $w_1,\ldots,w_n$ under $P$, namely $P^n(A_n) = o(1)$ implies $P^n_\gamma(A_n) = o(1)$ for any sequence of measurable events $A_n$.[2] In what follows, notation $\Theta_I(P)$ is used to reflect that identification region $\Theta_I$ depends on the law of the data $P$. Similarly, notation $c(\alpha, P)$ is used to denote that the $\alpha$-quantile of $C$ depends on $P$.
Lemma 8.1. [Conditions for Maintaining Consistency, Rates of Convergence, and Coverage] (1) Assume that Conditions C.1 and C.2 hold with $\{P_{n,\gamma}\}$ replacing $\{P\}$, for each $\gamma \in \Gamma$. Then so do conclusions of Theorem 3.1. (2) Assume that Conditions C.1, C.2, and C.4 hold under $\{P_{n,\gamma}\}$ in place of $\{P\}$, for any $\gamma \in \Gamma$, as well as hold under $\{P\}$, with the common limit real random variable $C$, distribution of which does not depend on $\gamma$. Take any estimate $\hat c \to_p c(\alpha, P)$ under $\{P\}$, for instance, that provided in Sections 3 or 4. Then for each $\gamma \in \Gamma$
$$\liminf_{n\to\infty} P_{n,\gamma}\{\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)\} \ge \alpha \quad \text{and} \quad = \alpha \ \text{ if } \ P\{C > 0\} > \alpha.$$

The first result states that consistency and rates of convergence will be preserved under sequences $\{P_{n,\gamma}\}$ as long as C.1 and C.2 hold under sequences (replacing $P$ with $P_{n,\gamma}$ and $\Theta_I$ with $\Theta_I(P_{n,\gamma})$ should cause no ambiguity in the re-statement of C.1 and C.2). The second result of the lemma addresses coverage properties in the regular case - when the limit of $C_n$ does not depend on the local sequence.[3] Note that the coverage result is independent of the way the critical value is estimated. Note that if $C_n$ is non-regular - that is, its limit distribution under $\{P_{n,\gamma}\}$ depends on $\gamma$ - the coverage under sequence depends on whether the distribution of $C_n$ under $P_{n,\gamma}$ is stochastically dominated in large samples by the distribution under fixed sequence $\{P\}$, as stated in Lemma 8.3 below.

[1] The weak IV example presented below clarifies this statement; see, specifically, equations (8.8)-(8.9).
[2] Throughout this section, measurable events $A_n$ are events that are measurable with respect to $(\Omega, \mathcal{F})$ completed with respect to both $P^n$ and $P^n_\gamma$.
Conditions of Lemma 8.1 are verified in our principal applications as follows:

Condition M.3. (Moment Equalities) Suppose that M.1 holds for each $P \in \mathcal{P}$, and that (a) the partial identification condition (4.1) holds uniformly in $P \in \mathcal{P}$, (b) $G(\theta) = \lim_n \nabla_\theta E_{P_{n,\gamma}}[m_i(\theta)]$ exists and is continuous over a neighborhood of $\Theta_I$, for each $\gamma \in \Gamma$, (c) the Donsker condition (4.2) holds under $\{P_{n,\gamma}\}$ in place of $\{P\}$ for each $\gamma \in \Gamma$, with the common limit Gaussian process $\Delta(\theta)$, (d) $E_{P_{n,\gamma}}[m_i(\theta)] = E_P[m_i(\theta)] + o(1)$ for each $\gamma \in \Gamma$, (e) $d_H(\Theta_I(P_{n,\gamma}), \Theta_I(P)) = o(1)$ for each $\gamma \in \Gamma$.

Condition M.4. (Moment Inequalities) Suppose that M.2 holds for each $P \in \mathcal{P}$, and that (a) the partial identification condition (4.5) holds uniformly in $P \in \mathcal{P}$, (b) $G(\theta) = \lim_n \nabla_\theta E_{P_{n,\gamma}}[m_i(\theta)]$ exists and is continuous over a neighborhood of $\Theta_I$, for each $\gamma \in \Gamma$, (c) the Donsker condition (4.2) holds under $\{P_{n,\gamma}\}$ in place of $\{P\}$ for each $\gamma \in \Gamma$, with the common limit Gaussian process $\Delta(\theta)$, (d) $E_{P_{n,\gamma}}[m_i(\theta)] = E_P[m_i(\theta)] + o(1)$ for each $\gamma \in \Gamma$, (e) $d_H(\Theta_I(P_{n,\gamma}), \Theta_I(P)) = o(1)$ for each $\gamma \in \Gamma$.

Condition (a) is a locally uniform partial identification condition. Sufficient conditions for condition (c) are well known and are given in van der Vaart and Wellner (1996), p. 173, including a quadratic-mean-differentiability condition, p. 406. The principal condition is Condition (e), which requires that the perturbations of $P$ affect the identification region smoothly.
Lemma 8.2. (Coverage, Consistency, Rates under Regular Sequences in Moment Condition Models) (1) Condition M.3 implies conditions of Lemma 8.1. (2) Condition M.4 implies conditions of Lemma 8.1.

Example 1 (contd.) It is helpful to illustrate conditions M.4(a)-(e) via a simple example. Recall the example of interval censored $Y$ without covariates, in which case $\Theta_I(P) = [E_P[Y_1], E_P[Y_2]]$, and suppose $Y_1 \le Y_2$ $P$-a.s. for all $P \in \mathcal{P}$ and that $(Y_1, Y_2)$ are uniformly Donsker in $P \in \mathcal{P}$.[4] Then conditions M.4(a)-(d) easily follow. To verify M.4(e) note that by contiguity and uniform integrability implied by the uniform in $\mathcal{P}$ Donskerness,
$$(E_{P_{n,\gamma}}[Y_1], E_{P_{n,\gamma}}[Y_2]) \to (E_P[Y_1], E_P[Y_2]),$$
including the case of $[E_P[Y_1], E_P[Y_2]]$ being a singleton. The last point is noteworthy, since Imbens and Manski (2004) used precisely the case of identification region shrinking to a singleton at a $1/\sqrt{n}$ rate as a counterexample to the coverage of certain types of confidence regions.

[3] The definition of regularity follows that given by van der Vaart and Wellner (1996), p. 413.
[4] Conditions for the Donskerness uniformly in $\mathcal{P}$ are well known, see van der Vaart and Wellner (1996), p. 168-170.
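The interval-censoring example can be made concrete with a small numerical sketch. The code below computes the sample analog of the interval $[E[Y_1], E[Y_2]]$ and the Hausdorff distance between two closed intervals, which for intervals reduces to the larger of the two endpoint gaps. The function names and the toy data are illustrative assumptions and do not come from the paper.

```python
def identified_interval(y1, y2):
    """Sample analog of [E[Y1], E[Y2]] for interval-censored outcomes."""
    lower = sum(y1) / len(y1)
    upper = sum(y2) / len(y2)
    return (lower, upper)


def hausdorff_interval(a, b):
    """Hausdorff distance between closed intervals a = (a1, a2) and b = (b1, b2)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


# Toy data: lower bounds Y1 and upper bounds Y2 with Y1 <= Y2 observation by observation.
interval_hat = identified_interval([0.0, 1.0], [2.0, 3.0])
distance = hausdorff_interval(interval_hat, (0.5, 2.0))
```

Condition M.4(e) asks that this kind of distance between the perturbed and the unperturbed identification regions vanish, which includes the case in which the upper endpoint collapses onto the lower one.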
Conditions M.3 and M.4 are reasonable in many examples we have considered, provided the boundary of $\Theta_I$ (in $\mathbb{R}^d$) is strongly-identified. Conditions M.3 and M.4 are not expected to hold otherwise. Therefore, the models with weak identification, cf. Dufour (1997), that are local to non-identification, are not covered by the framework of regular sequences. Weak identification is not our focus in any case. However, Section 8.2 provides a general condition under which the proposed inference methods will work.

8.2. Non-regular cases. The following lemma addresses non-regular cases mentioned earlier, and shows that coverage results will be preserved in much greater generality. The condition is illustrated with a weak IV framework of Dufour (1997) and Staiger and Stock (1997).

Lemma 8.3 (Maintaining Partial Consistency and Minimal Coverage under Non-Regular Sequences). (1) Suppose that $\sup_{\theta \in \Theta_I(P_{n,\gamma})} Q_n = O_{P_{n,\gamma}}(1/a_n)$ under $\{P_{n,\gamma}\}$. Then $\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)$ wp $\to 1$, provided $\hat c \to_p \infty$, under $\{P_{n,\gamma}\}$. (2) Let there be any estimate $\hat c \to_p c(\alpha, P)$ under $\{P\}$. Suppose that Condition C.4 holds under fixed $P$ with the limit real variable $C$ that has $\alpha$-quantile $c(\alpha, P)$. Suppose that for each $\gamma \in \Gamma$ and any sequence $\epsilon_n \downarrow 0$, we have
$$\liminf_{n\to\infty} P_{n,\gamma}[C_n \le (c(\alpha, P) - \epsilon_n) \vee 0] \ge \alpha. \qquad (8.1)$$
Consider any estimate $\hat c \to_p c(\alpha, P)$ under $\{P\}$, for instance, that provided in Section 3 or 4. Then for each $\gamma \in \Gamma$
$$\liminf_{n\to\infty} P_{n,\gamma}\{\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)\} \ge \alpha. \qquad (8.2)$$
Note again that the result is independent of the way the critical value is estimated.

Example (Weak IV). The point of this lemma can be illustrated using a very simple IV example with one regressor:
$$Y = \theta_0 X + \epsilon, \quad \theta_0 \in \Theta \ (\text{compact}) \subset \mathbb{R}, \quad X = 0 \cdot Z + v, \quad \text{and} \quad (\epsilon, v)|Z \sim N(0, \Omega), \quad Z \sim N(\mu, \sigma_Z^2). \qquad (8.3)$$
The identification region is $\Theta_I(P) = \Theta$, that is, we have complete non-identification. Assume i.i.d. sampling and other conditions as in Section 4 hold under $P$. Now consider a sequence of models where
$$Y = \theta_0 X + \epsilon, \quad X = (\rho/\sqrt{n}) Z + v, \quad \text{and} \quad (\epsilon, v)|Z \sim N(0, \Omega), \quad Z \sim N(0, \sigma_Z^2). \qquad (8.4)$$
Let $\gamma$ index the parameter sequence $\{\rho\}$. Let $P^n_\gamma$ denote the law of vector $\{Y_i, X_i, Z_i, i \le n\}$ in (8.4); it is contiguous to law $P^n$. Let $P_{n,\gamma}$ denote the law of the infinite sequence $\{Y_i, X_i, Z_i, i < \infty\}$ generated according to (8.4). Note
$$\Theta_I(P_{n,\gamma}) = \Theta_I(P) = \Theta \ \text{ if } \rho = 0 \quad \text{and} \quad \Theta_I(P_{n,\gamma}) = \theta_0 \ \text{ if } \rho \ne 0. \qquad (8.5)$$
This implies that the weak limit of $C_n$ under $P_{n,\gamma}$ with $\rho \ne 0$ is stochastically smaller than the weak limit of $C_n$ under $P_{n,\gamma}$ with $\rho = 0$, since
$$\sup_{\theta \in \Theta_I(P_{n,\gamma})} \|\Delta(\theta)' W^{1/2}(\theta)\|^2 \le \sup_{\theta \in \Theta_I(P)} \|\Delta(\theta)' W^{1/2}(\theta)\|^2. \qquad (8.6)$$
Therefore the $\alpha$-quantile of the right side is bigger than the $\alpha$-quantile of the left side, and we have that (8.1) is satisfied. Therefore, for each $\rho \in \mathbb{R}$ and $\gamma = \{\rho\}$,
$$\liminf_{n\to\infty} P_{n,\gamma}\{\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)\} \ge \alpha. \qquad (8.7)$$
Next we consider more general local parameter sequences $\gamma = \{\rho_n\}$ with $\rho_n \in K$, where $K$ is a compact subset of $\mathbb{R}$; let $\Gamma$ denote the set of all these sequences. Note, compactness of $K$ is important in insuring that each sequence in $\Gamma$ has a convergent subsequence. The limit of $C_n$ under each such subsequence is either the left or right side of (8.6). Hence, for each sequence $\{\gamma_n\}$ in $\Gamma$ and each sequence $\epsilon_n \downarrow 0$,
$$\liminf_{n\to\infty} P_{n,\gamma_n}[C_n \le (c(\alpha, P) - \epsilon_n) \vee 0] \ge \liminf_{n\to\infty} P\Big[\sup_{\theta \in \Theta_I(P)} \|\Delta(\theta)' W^{1/2}(\theta)\|^2 \le (c(\alpha, P) - \epsilon_n) \vee 0\Big] \ge \alpha.$$
This implies by Lemma 8.3 that
$$\liminf_{n\to\infty} P_{n,\gamma_n}\{\Theta_I(P_{n,\gamma_n}) \subseteq C_n(\hat c)\} \ge \alpha. \qquad (8.8)$$
Equivalently, for $K$ denoting any non-empty compact subset of $\mathbb{R}$,
$$\inf_K \liminf_{n\to\infty} \inf_{\rho \in K} P^n_\rho\{\Theta_I(P_{n,\rho}) \subseteq C_n(\hat c)\} \ge \alpha, \qquad (8.9)$$
where $P^n_\rho$ denotes the law of vector $\{Y_i, X_i, Z_i, i \le n\}$ in (8.4), and $P_{n,\rho}$ denotes the law of infinite sequence $\{Y_i, X_i, Z_i, i < \infty\}$ generated according to (8.4). This coverage property is in the spirit of local asymptotic minimax analysis of estimation, see van der Vaart (1998), Chapter 8.7.

Proof of Lemma 8.1. Proof of Part (1). The proof is straightforward by substituting $\{P_{n,\gamma}\}$ in place of the fixed sequence $\{P\}$ in the proof of Theorem 3.1. □

Proof of Part (2). We have that $\hat c \to_p c(\alpha, P)$ under $\{P\}$. By contiguity, $\hat c \to_p c(\alpha, P)$ under $\{P_{n,\gamma}\}$. Therefore
$$P_{n,\gamma}\{\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)\} = P_{n,\gamma}[C_n \le \hat c] = P_{n,\gamma}[C_n \le c(\alpha, P) + o_{P_{n,\gamma}}(1)] = P[C \le c(\alpha, P)] + o(1),$$
by assumption that $P_{n,\gamma}[C_n \le c] \to P[C \le c]$ for all $c > 0$, by $\hat c \ge 0$, and by continuity of the distribution function $c \mapsto P[C \le c]$ on $[0, \infty)$. □

Proof of Lemma 8.2. Proof of Part (1). The proof is straightforward by repeating Steps 1-4 in the proof of Theorem 4.1, having replaced $P$ with $P_{n,\gamma}$, $\Theta_I$ with $\Theta_I(P_{n,\gamma})$, $o_p(1)$ with $o_{P_{n,\gamma}}(1)$, etc., and then noting that
$$\sup_{\theta \in \Theta_I(P_{n,\gamma})} \|\Delta(\theta)' W^{1/2}(\theta)\|^2 = \sup_{\theta \in \Theta_I(P)} \|\Delta(\theta)' W^{1/2}(\theta)\|^2 + o_{P_{n,\gamma}}(1),$$
which follows by equicontinuity of $\theta \mapsto \Delta(\theta)' W^{1/2}(\theta)$ and by $d_H(\Theta_I(P_{n,\gamma}), \Theta_I(P)) = o(1)$ imposed in M.3(e). By M.3(b) $\Delta(\theta)$ does not depend on $\gamma$, and by contiguity $W(\theta)$ does not depend on $\gamma$ either. Hence the limit variable $C := \sup_{\theta \in \Theta_I(P)} \|\Delta(\theta)' W^{1/2}(\theta)\|^2$ does not depend on $\gamma$. □
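Proof of Part (2). The proof is straightforward by repeating Steps 1-4 in the Proof of Theorem 4.2, having replaced $P$ with $P_{n,\gamma}$, $\Theta_I$ with $\Theta_I(P_{n,\gamma})$, and $o_p(1)$ with $o_{P_{n,\gamma}}(1)$. The exception is that in Step 2, we need to define $\xi(\theta) := \lim_n \sqrt{n} E_P[m_i(\theta)]$ under fixed sequence $\{P\}$. Note that the key inequality (6.10) in Lemma 6.2 on which the proof is based will be preserved under sequences $\{P_{n,\gamma}\}$. In the proof of Lemma 6.2, the convergent subsequence $\{\theta_n\}$ in $\Theta_I(P)$ is replaced by the convergent subsequence $\{\theta_n\}$ in $\Theta_I(P_{n,\gamma})$, where convergent means $\theta_n \to \theta \in \Theta_I(P)$. Since we care only about $C_n$ in this Lemma, in repeating the proof of Lemma 6.2, we have a drastic simplification from setting $\lambda = 0$. [That is, we only need to consider the set $\Theta_I(P_{n,\gamma}) \times \{0\}$.] In addition, we note that $d_H(\Theta_I(P_{n,\gamma}), \Theta_I(P)) = o(1)$ by M.4(e), so that
$$\max_J \sup_{\theta \in \Theta_I(P_{n,\gamma})} \sum_{j \in J} |(\Delta_j(\theta) + G_j(\theta)'\lambda) W_{jj}^{1/2}(\theta) + o_{P_{n,\gamma}}(1)|_+^2 = \max_J \sup_{\theta \in \Theta_I(P)} \sum_{j \in J} |(\Delta_j(\theta) + G_j(\theta)'\lambda) W_{jj}^{1/2}(\theta) + o_{P_{n,\gamma}}(1)|_+^2.$$
The last observation utilized equicontinuity of $\theta \mapsto \Delta(\theta)' W^{1/2}(\theta)$ and the fact that by M.4(b) $\Delta(\theta)$ does not depend on $\gamma$, and by contiguity $W(\theta)$ does not depend on $\gamma$ either. The result of the modified Step 2 can be stated then as
$$C_n =_d \max_J \sup_{\theta \in \Theta_I(P)} \sum_{j \in J} |\Delta_j(\theta) W_{jj}^{1/2}(\theta) + o_{P_{n,\gamma}}(1)|_+^2.$$
Hence $C = \max_J \sup_{\theta \in \Theta_I(P)} \sum_{j \in J} |\Delta_j(\theta) W_{jj}^{1/2}(\theta)|_+^2$, which does not depend on $\gamma$. □

Proof of Lemma 8.3. Part (1). Under $\{P_{n,\gamma}\}$, wp $\to 1$, by construction of $\hat c$, $\sup_{\theta \in \Theta_I(P_{n,\gamma})} Q_n = O_{P_{n,\gamma}}(1/a_n) \le \hat c/a_n$, which implies $\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)$.

Part (2). We have that $\hat c \to_p c(\alpha, P)$ under $\{P\}$. By contiguity, $\hat c \to_p c(\alpha, P)$ under $\{P_{n,\gamma}\}$. Hence
$$P_{n,\gamma}\{\Theta_I(P_{n,\gamma}) \subseteq C_n(\hat c)\} \ge P_{n,\gamma}\{C_n \le \hat c\} \ge P_{n,\gamma}\{C_n \le (c(\alpha, P) - \epsilon_n) \vee 0\},$$
for some $\epsilon_n \downarrow 0$. The conclusion follows from the assumption that $\liminf_{n\to\infty} P_{n,\gamma}\{C_n \le (c(\alpha, P) - \epsilon_n) \vee 0\} \ge \alpha$. □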
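The local-to-zero first stage used in the weak IV example, design (8.4), can be explored numerically. The sketch below simulates only the first-stage equation $X = (\rho/\sqrt{n})Z + v$ and reports the Monte Carlo average of $\sqrt{n}$ times the no-intercept OLS slope of $X$ on $Z$. The function names, the unit variances, and the simulation sizes are assumptions chosen for illustration; the structural error $\epsilon$ is omitted because it plays no role in the first stage.

```python
import math
import random


def first_stage_slope(n, rho, rng, sigma_z=1.0):
    """No-intercept OLS slope of X on Z for one sample with X = (rho/sqrt(n)) Z + v."""
    szx = 0.0
    szz = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, sigma_z)
        v = rng.gauss(0.0, 1.0)  # first-stage error with unit variance (assumed)
        x = (rho / math.sqrt(n)) * z + v
        szx += z * x
        szz += z * z
    return szx / szz


def mean_scaled_slope(n, rho, reps=200, seed=0):
    """Monte Carlo average of sqrt(n) * slope over independent samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        total += math.sqrt(n) * first_stage_slope(n, rho, rng)
    return total / reps


scaled_weak = mean_scaled_slope(400, 2.0)
scaled_none = mean_scaled_slope(400, 0.0)
```

In any single sample the scaled slope equals $\rho$ plus noise of order one, so the cases $\rho = 0$ and $\rho \ne 0$ cannot be told apart with certainty as $n$ grows, which is the sense in which the perturbation in (8.4) is contiguous to complete non-identification.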