Document 11158522

advertisement
Digitized by the Internet Archive
in
2011 with funding from
Boston Library Consortium
Member
Libraries
http://www.archive.org/details/inferenceonparamOOcher
OEWEY
HB31
.M415
4.8
Massachusetts Institute of Technology
Department of Economics
Working Paper Series
INFERENCE
ON PARAMETER SETS
IN
ECONOMETRIC MODELS*
Victor Chernozhukov
Han Hong
Elie
Tamer
Working Paper 06-1
June 16,2006
Room
E52-251
50 Memorial Drive
Cambridge,
MA 02142
This paper can be downloaded without charge from the
Social Science Research Network Paper Collection at
http://ssrn.com/abstract=909625
^^Sffl
M)G
9 2006
LIBRARIES
INFERENCE ON PARAMETER SETS IN ECONOMETRIC MODELS*
VICTOR CHERNOZHUKOVt HAN HONG
s
ELIE TAMER*
Abstract. This paper provides confidence regions for minima of an econometric criterion function
Q(9). The minima form a set of parameters, 0/, called the identified set. In economic applications,
0/ represents a class of economic models that are consistent with the data. Our inference procedures are criterion function based and so our confidence regions, which cover O; with a prespecified
probability, are appropriate level sets of Q n {8), the sample analog of Q(9). When 0/ is a singleton,
our confidence sets reduce to the conventional confidence regions based on inverting the likelihood or
other criterion functions. We show that our procedure is valid under general yet simple conditions,
and we provide feasible resampling procedure for implementing the approach in practice. We then
show that these general conditions hold in a wide class of parametric econometric models. In order
to verify the conditions, we develop methods of analyzing the asymptotic behavior of econometric
criterion functions under set identification and also characterize the rates of convergence of the confidence regions to the identified set. We apply our methods to regressions with interval data and
set identified method of moments problems. We illustrate our methods in an empirical Monte Carlo
study based on Current Population Survey data.
Key WORDS:
Set estimator, level sets, interval regression, subsampling bootsrap
Date: Original Version:
*
Note: This paper
and
May
2002. This version: June, 2004
currently undergoing
is
MAJOR
.
revision following referee
comments.
New
results
on estimation
rates of convergence are added. Convexity assumptions on the parameter space are being replaced by
more general
conditions.
The most
*We thank
Alberto Abadie, Gary Chamberlain, Jinyong Hahn, Bo Honore, Sergei Izmalkov, Guido Imbens, Christian
recent, yet preliminary
and incomplete, revision can be requested via e-mail. September, 2005.
Hansen, Yuichi Kitamura, Chuck Manski, Whitney Newey, and seminar participants at Berkeley, Brown, Chicago,
Harvard-MIT, Montreal, Northwestern, Penn, Princeton, UCSD, Wisconsin, and participants
of
'
Econometric Society
Department
0241810)
§
is
San-Diego
of Economics,
for
at the
Winter Meetings
comments.
MIT, vchern9mit.edu. Research support from the National Science Foundation (SES-
gratefully acknowledged.
Department of Economics, Duke
dation (SES-0335113)
J
in
Department
of
is
University,
han.hongaduke.edu. Research support from the National Science Foun-
gratefully acknowledged.
Economics, Princeton University, tamerSprinceton. edu. Research support from the National Science
Foundation (SES-01 12311)
is
gratefully acknowledged.
1
Introduction and Motivation
1.
Parameters of interest in econometric models can be defined as those parameter vectors that
minimize a population objective or criterion function.
at a particular
If this criterion
parameter vector, then one can obtain valid confidence regions
parameter using a sample analog of
this function. Likelihood
two commonly used and well studied methods
in this setting.
inference to econometric models that are set identified,
minimized on a
set of
and
identified set
function
parameters, the identified
to provide a
method
set.
i.e.,
Our
and method
of
is
minimized uniquely
(or intervals) for this
moments procedures
are
This paper extends this criterion-based
models where the objective function
goal
is
to
make
on the
inferences directly
of obtaining confidence regions with
is
good properties (such as
consistency and equivariance) that cover the identified set with a prespecified probability.
Our point
analog
paper
is
in
is
a nonnegative population criterion function Q(8) and
6 C R
where 8 G
Q n {9)
where every 9
this
of departure
d
The
.
0/ indexes an economic model that
Cn
to construct confidence sets
lim
(1.1)
for
a prespecified confidence level
Qn
is
a £
for
A
is
Cn =
)
level set
{8 £
consistent with the data.
6/ from the
P(Gj C
(0, 1).
can be defined as 6/
Q n {9)
level sets of
its finite
Q
The
sample
=
Q{9)
:
0}
objective of
such that
a,
Cn (c)
of the finite
sample criterion function
defined as
Cn (c)
for
identified set
=
=
\
8
Qn(8)
-
some appropriate normalization a n
Cn = Cn(c)
with the cut-off
level
c =
<
qn
\,
where qn :=
inf
Q n (9)
or g n := 0,
In order to obtain the correct coverage (1.1),
.
ca
c/a n
,
we choose
where c a equals the asymptotic a-quantile of the coverage
statistic:
Cn := sup a n (Q n {9) - qn ),
eeQi
which
is
a quasi-likelihood-ratio type quantity.
properties, such as consistency
situations,
where the
and equivariance
identified set
is
The constructed confidence
to reparameterization.
defined as the
minimand
of
sets possess
Our approach
important
covers general
an objective function. In addition,
our confidence regions are sets that are robust to the failure of point-identifying assumptions in that
they cover the
unknown
These confidence
where the
We
identified set
- whether
it is
a set or a point - with a prespecified probability.
sets collapse to the usual confidence intervals
identified set
0/
is
show that the above
based on the likelihood ratio in cases
a singleton.
level set
Cn {c a
)
has correct asymptotic coverage, where c a
is
the ap-
propriate quantile of a well defined coverage statistic C, the nondegenerate large sample limit of
Cn
.
Then, we provide general resampling methods to consistently estimate the percentile c Q
2
.
The
coverage and resampling results hold under general (high
level) conditions.
econometric models, we then show that these conditions are
satisfied. In the process,
Focusing on a class of
we provide
— qn ),
acterizations of the stochastic properties of the criterion functions process, a n (Q n (-)
the fact that the population criterion
Q
minimized on a
is
set rather
of convergence of
Cn {c)
The paper then
to 0/.
exploiting
than a point. Furthermore, we
obtain the asymptotics of the coverage statistic C n and of the level sets
and speed
char-
Cn (c),
focusing on coverage
considers two important applications:
regression with interval censored outcomes and set-identified generalized method-of-moments.
finally illustrate
We
our methods in a Monte Carlo study based on data from the Current Population
Survey.
body
In the last decade, a growing
partially identified models,
(2003).
While most of the work,
paper
is
is
e.g.
Manski
set identified, cf.
Horowitz and Manski (1998) and Imbens and Manski (2004),
on the case where there
exclusively focuses
this
models where parameters of interest are
i.e.,
problem of inference in
of literature has considered the
is
a scalar parameter of interest that
concerned with inference on vector parameters
in
lies in
an
interval,
problems where the identified set
defined via a general optimization problem, as in the economic problems described below. This
multivariate set
is
usually not an interval (or cube) and can be a set of isolated points or manifolds.
Moreover, the methods used in the interval case are based on estimating the two end-points of
the interval.
approach
Hence they are not applicable
for
posing the identified set as the object of inference
In Hansen, Heaton, and Luttmer (1995), the identified set
models that obey the pricing-error and
models with multiple
equilibria, the identified set
played, since particular equilibria
Another example
is
E
through a
of-birth instrument
birth).
Z
flexible,
is
a subset of asset-pricing
market returns.
In
There
is
no single "true" equilibrium that
is
Tamer
Y
example where potential income
Y =
Qo
is
related to
e.
Although
2
+ ct\E + a^E +
not point-identified in the presence of the standard quarter-
suggested in Angrist and Krueger (1992) (indicator of the
In the absence of point identification,
some
many
the set of parameters that describe different
quadratic functional form,
the instrumental orthogonality restriction
In
is
motivated by
vary across observational units; see Ciliberto and
in the following
parsimonious, this simple model
1
different
the structural instrumental variable estimation of returns to schooling.
Suppose that we are interested
education
may
is
is
volatility constraints implicit in asset
equilibria supported across markets or industries.
(2003).
which a
needed.
is
The main motivation
examples.
in the general set-identified case, for
all
first
quarter of
of the parameter values (ao, Qi, 02) consistent with
E(Y — qq +
ot\E
+
ct2E 2 )(l, Z)
1
=
are of interest for
cases the indicators of other quarters of birth are used as instruments. However, these instruments are not
correlated with education (correlation
is
extremely small) and thus bring no additional identification information.
3
purposes of economic analysis. Similar partial identification problems arise in nonlinear
instrumental variables problems, see
e.g.
moment and
Demidenko (2000) and Chernozhukov and Hansen
(2001).
Literature. To our knowledge, the earliest work in econometrics on parametric set identified
models can be found
in the
paper of Marschak and Andrews (1944).
There, the identified set
is
the collection of parameters representing different production functions that are consistent with
the data and functional restrictions the authors consider.
2
and Learner (1983) provide
Gisltein
consistent estimation in a class of likelihood models where
0/
as the set of
set
parameters that are
robust to misspecification. Klepper and Learner (1984) generalize the Frish bounds to multivariate
measurement
regression models with
errors.
On the other hand,
Hansen, Heaton, and Luttmer (1995)
provide consistent set estimates of means and standard deviations in a class of asset pricing methods.
Manski and Tamer (2002) provided conditions under which an appropriately defined
estimates the identified
about the identified
set.
However, these consistency results do not contain a method
set.
To the best
of the paper
for inference
of our knowledge, there are no results in the literature that deal
with the general problem of obtaining confidence regions for parameter
The remainder
set consistently
organized as follows.
is
on general
inference on the identified sets based
sets.
3
Section 2 provides a general theory of
on a
criterion functions. Section 3 focuses
class of
regular parametric models, provides verification of the regularity conditions posed in Section
provides additional results that pertain to regular cases. Section 4 provides a
of the
General Set Inference
Generic Inference on Identified Sets.
forms the basis for the rest of the analysis.
We
in
Q n (9, W\,..., Wn ),
where data {W\,
criterion function
0/, the identified
Large Samples
In this section
first
present our
main
result
which
0/ and formalize some
Q n {6)
...,
Wn
}
0/
are defined on
is
based on a criterion function
Q n {6) =
some common probability space (Cl,F,P).
converges to a continuous criterion function Q(8), that
is
minimized at
set.
Assumption A.l
(Basic Setup). Criterions
Qn Rd — M +
>
:
and
For a good description of Marschak and Andrews (1944), see Chapter
The only other paper known
to us
is
that can be estimated.)
The problem
is
interval-identified
of inference about 6'
Imbens and Manski (2004) and
III
Q Rd — R +
:
>
and
in
Appendix G.
4
is
(i.e.
satisfy
of Nerlove (1965).
Imbens and Manski (2004). Their paper considers a
inference about the a real parameter 9" that
in
we
define the identified set
For given data, the inference about the parameter set
shown
Monte Carlo evaluation
be used throughout.
definitions that will
The
and
methods, and Section 5 concludes.
2.
2.1.
2,
different
problem of
contained between some upper and lower bounds
fundamentally different from inference about 9;, as
is
ii.
Q{9)
iii.
iv.
v.
and convex subset ofR d
a compact
i.
is
continuous and
= argmmg e Q[Q(6)} is a finite union
Q{Oi) = for each 8; S 0/,
Qn(0) - Q(9) = o p (l) for each 9eQ.
0/
region
is
when 0/
is
This covers both the case when 0/
a
finite
is
a finite union of isolated points and the
union of compact sets with boundaries defined by manifolds (nonlinear
The assumption
hyperplanes).
limit criterion function Q.
a finite union of compact connected sets, an assumption that serves to organize
the presentation.
case
and convexity assumption, which are important
0/ as the minimizer of the
defines
It also
BE RELAXED],
of connected compact subsets ofQ,
states a standard compactness
to the subsequent analysis.
to
lower-semicontinuous,
is
0/
Assumption A.l
The
Q n (8)
[CONVEXITY
that Q(9j)
convergence condition serves to relate
Q
=
The pointwise
serves as a convenient normalization.
as the limit of the finite
sample objective function. The
pointwise convergence will be strengthened later on.
Lemma 2.1
shows how to construct a
level set of the
sample objective function that
The
provide the proper inferential statement (1.1) about 0/ in a generic setting.
objective function
Qn
Cn (c)
where a n
later.
is
:=
I
9
:
a n (Q n {9)
=
Note that choosing qn
Example
1
in
c-level set of
given by
-
qn )
< c\
defined in Assumption A. 2 below.
non-empty, though
e.g.
is
will eventually
,
where qn :=
inf
or qn := 0,
Q n {9)
Typically a n equals n in the regular cases studied
infgge Qn{9) guarantees that the confidence region
some cases one has inf^ge
Q n {9)
=
Cn {c)
is
always
with probability converging to one, see
in Section 3.
Consider the following coverage index
p(c)
(o I)
v
;
= c-
sup a n {Qn{9) - qn )
eee,
'
The
sign of the index p(c) indicates whether
such that 9
index
is
Cn (c),
also linear in
property
to
$.
is
we have a n (Q n {9) —
c,
which
Qn )
will allow us to
0/ C
>
c
Cn (c)
or not.
For example,
which implies that p(c) <
have data-dependent cut-off
0,
if
there
and vice
levels c.
is
versa.
level set in the equivariant
{t(9)
following
:
an
lemma summarizes
For definition of manifold, see
e.g.
way
(Q^r- ^))) 1
the discussion.
Milnor (1964)
qn )
<
c}
=
r (Cn (c))
.
The
Cn (c)
invariance of the index p to parameter transformations which implies equi variance of
changes the
0/
Another desirable
parameter transformations. For instance, a one-to-one transformation of parameters 9
The
9 S
—
>
t{9)
Lemma
2.1.
<
p(c)
•£=>
Let Assumptions A.l-(ii) and A.l-(iii) hold.
Cn {c)
0/ ^
and p
>
(c)
cut-off level; HI. the coverage index
«=>
6/ C Cn (c);
Then,
the coverage property holds:
I.
the coverage index
II.
is
linear in the
invariant to re-parameterization; and IV. the level sets are
is
equivariant to bijective parameter transformations.
Next we
main condition that enables
state the
Assumption A. 2
(Coverage
Statistic).
large sample inference
Suppose that there
on 0/.
exist a sequence of constants
—
an
>
oo
such that
Cn
where C
is
a nondegenerate
we
In the sequel
where a n
interest,
=
explain
n.
We
random
how
=
sup a n (Q n (0i)
this
main assumption
methods
also provide
new asymptotic methods
Assumption A. 2 leads us
qn ) -^ d
C,
variable.
is
attained in sufficiently regular models of
assumption and
for verification of this
the limit variable C. Verification of Assumption A. 2
set of
-
is
for finding
a difficult matter and requires developing a
of dealing with criterion functions under set identification.
main theorem, which provides a generic
to the following
on
result
inference in set-identified models.
Theorem
2.1 (Generic Inference on Sets). Suppose Assumptions A.l and A. 2 hold and that c a
continuity point of distribution function of
The
o p (l)
result follows
—>d
-C
p {Cn (c a )) -> d c a
I.
C such
and
Lemma
immediately from
that
II.
P{C <
lim
n—>oo
2.1
and A. 2.
ca }
=
Then for any
a.
P\ 0/ C Cn (c a
)
=
is
ca
a
,
a.
)
I.
First,
\
ca
—p
p{Cn (c))
=
C„
—
ca
= Cn -
ca
+
C — c a Second,
.
P{e r C Cn (c a )] = p{p(Cn (c a )) >
o}
= P{c a - Cn >
o}
= P{Cn <C a +
O p (l)}
= P{C <
CQ
}
+ o(l),
provided c a a continuity point of distribution function of C.
It is
clear
from Theorem
2.1 that our
classical principle of inverting
some
method
criterion function. Indeed, in point identified cases
to a singleton {6j} so that the coverage statistic
Cn
=
a,
n
(Q n (8l) —
Qn)
,
of constructing valid confidence sets builds
becomes a standard
supremum
over the elements of 0/: C n
=
0/ reduces
likelihood ratio type quantity
which follows well known limit laws. In the general
involved quantity, being the
on the
case, the statistic
su P6>,ee/ a n {Qn{@l)
is
a more
"
Qn)
In practice, our procedure for constructing the confidence regions critically depends on being
able to consistently estimate c a the a-quantile of the limit variable C. In
,
6
all
examples that we study,
C
non-standard and
is
distribution depends on 0/.
its
Despite this problem,
we
will
show how
to
obtain a consistent estimate of c a by the following method.
Cn
We
Feasible Inference.
2.2.
=
sup0 6
a n (Q n (0)
;
—
where
A;
is
we
= Cn (k),
where k S
[ci,C2J
will replace
wp —
Inn
•
starting cut-off k = 0.)
important, and we discuss
it
in
initial
estimate'
first class
of our examples,
proven below suggests that the asymptotic validity
result
depend on the starting
of the procedure will not
may be
The
by an
it
1,
»
a possibly data-dependent starting value of the cut-off. (In the
we can use th
value
need to obtain an approximation to the sampling distribution of
Since we do not observe 0/,
qn )
0/
(2.2)
first
value. In finite samples, the choice of the starting
Section
4.
Consider the following subsampling algorithm:
1.
For cases
when data {Wi}
when {Wt} form
cases
of the
form {Wi,
...,
are
iid,
construct
For each j
=
1, ...,
Wi + b-\}-
Bn
,
(B n ) subsets of
a stationary time series, construct
(In practice,
chosen subsets under the condition that
2.
all
size b -C
=n—
Bn
b
+
n
of the data. For
1
subsets of size b
one can use a smaller number
Bn —
oo as
»
n
—
>
Bn
of
randomly
oo.)
compute
Cj,b,n
=
sup
a b {Qj,b{Q)
~
Qj,b)
eeCn(c)
where
'=
qjfi
if
and qj^ :=
q n :=
inf^ e e
Qj
t
b{8) otherwise;
Qj
t
b(8)
denotes the criterion
function defined using the j-th subset of the data only.
3. Let c a
4.
be the a-th quantile of the sample {Cj^^ij
(Optional)
As commented
times using c
of c Q .)
We
=
c a lnn.
require that as
n
—
b/n
(Any
finite
•,
-B n }.
one could repeat steps
number
2
and 3
finite
number
of
of repetitions produces the consistent estimate
oo,
—
>
0,
Bn —
>
oo,
b
—
>
oo
at
polynomial
rates.
choice of b and other practical aspects of the procedure are discussed in Section
To guarantee asymptotic
The adjustment
as
in Section 4.3,
1,
7
(2.3)
The
6
=
factor
validity of the
above procedure, the following assumption
Inn can be replaced by
In
Inn or any other m(n) such that m(n)
n — oo for all a > 0.
The adjustment by In n is not needed in the first class of our examples.
In fact, we found in Monte-Carlo experiments described in Section 4 that
—
is
4.
needed.
1
>
oo and m(n)/n' —>
>
just
two repetitions worked the best
terms of coverage and computational expense. This was also confirmed by Bajari, Benkard, Levin (2003).
7
in
Assumption A. 3 (Sandwich
n —
oo.
>
for
fc
G
[co, ci]
-Inn,
—>
Property). Suppose that b/n
tup
—
Qi C C„(&) C Gj\
>
:
~ {^ +
*
sucft i/iaf
— e n^j 6 0/} and
11*11
Assumption A. 3 guarantees
ab(
sup Qi(^)
en
—
is
>
Theorem
The
2.2.
know 0/, hence we
it
is
by an estimate
impact on the distribution of the coverage
-
it
Cn {k)
Wald
(Theorem
hence requiring that
Qj,
statistic in
3.2).
<3b(#))
=
op (l),
to the identified set 0/, as
Cn {c). The
We
subsamples.
show how
Politis,
Romano, and Wolf
know
(1999), p 44, in the
the true 9j and replaces
effect
to verify A. 3 for parametric
The next theorem summarizes our
we do not
similar in nature to the
is
replacement has negligible
this
to
replacement should have only a negligible
subsamples. A. 3
statistic in
and
shown below
In the subsampling bootstrap,
inference in point identified cases. There, one does not
with an estimate
of the
3
Wald
sup
see,
as follows.
polynomial rate of convergence assumption used by
case of
oo at polynomial rates as
»
inferential validity but also allows us to establish consistency
intuition behind A. 3
replace
—
a sequence of positive constants.
characterize the rate of convergence of the level sets
in
b
1
6>ee< n
©/"
and
on the distribution
models
in Section
results for general inference in set identified
models.
Theorem
are iid or
I.
2.2 (Consistency and Validity of Inference). Suppose that the estimation data {Wi,i
form
and
a stationary strongly-mixing sequence
Then for
that A. 2
the subsampling algorithm defined above, provided
and A. 3
P{C <
< n)
hold.
c] is continuous at c
=
ca ,
P{Qi&Cn {c a ))-^a.
II.
The
set
Cn (k)
for k
=
ca
\nn
is
consistent under the Hausdorff metric:
d H (Cn {k),Q I )<e n
Recall that the Hausdorff metric between two sets
d H (A,B):=max.[h(A,B) h(B,A)],
>
Cn (k)
rate
en
that converge to the identified set at the rate
will
be shown to be essentially l/\/n
Under the given
The conventional
level of generality,
for
defined
is
is
en
—
>
1
^O.
where
Thus, under general conditions our inference method
wp
as:
h(A, B) := sup inf
a€A>>eB
\\a
-
&||.
asymptotically valid and delivers level sets
with respect to the Hausdorff distance (the
parametric models).
subsampling appears to be the only valid resampling method.
(n-out-of-n) bootstrap will not be generically consistent in the present settings.
One counterexample
is
as follows: In Section
3,
one of the leading examples, the partially identified
linear
IV
regression, necessarily involves a parameter
bootstrap
fails in
parameter-on-the boundary problems,
ing the approach,
we
establish the
methods
namely that existence
We
the sandwich condition (A. 3).
and
illustrate the the
cf.
Andrews
It is
known that
the
(2000).
Regular Parametric Case and Applications
3.
In this section
on the boundary problem.
for verifying the
main conditions required
for
implement-
of the limit distribution for the coverage statistic (A. 2)
and
develop these methods to cover a variety of parametric examples,
approach with applications to regression models with missing outcome data
and generalized method
of
moments under
we
partial identification. For parametric models,
establish
further properties of the confidence regions, such as the speed of convergence of the level sets to the
identified set,
and the stochastic properties of objective functions
Examples of Parametric Problems with Set
3.1.
in partially identified cases.
Example
Identification.
I.
Regression
with Interval-Censored Outcomes.
Consider the linear conditional expectation models
Ep
The models
i?,.
are not
[Y\X)
=
X'O, where 8 £
assumed to be correctly
Q,X
e
Rd
.
specified, in the sense that there
[Y|X] agrees with i?/,[y |X] under the actual law
P
8
such that
of the data.
Observed data consists of i.i.d. observations (Yu, Yu, Xt) where Yu and
observation on
may be no
Y<n represents the interval
Y{\
Yi
(3.1)
€[YiuYx
]
given
X
{
,
a.s.
In the absence of further information, the set
is
{9eR d
Qi =
(3.2)
the object of interest, as
consistent with data.
We
it
:
E[Yi\X]
< X'9 < E\Y2 \X)
a.s.
}
represents the set of linear conditional expectation models that are
assume that 9/ C 0, where
is
Rd
a compact subset of
.
Observe that 0/
minimizes the objective function
(3.3)
where (u)\
=
(u)
2
Q(8)
=
>
0]
x l[u
Using a sample analog of
(3.4)
J
{(E^x] -
and
2
(u) _
2
x'8)
= {uf
+
+ {E[Y2 \x} -
x l[u
<
0].
2
x'8)
_] dP(x),
Notice also that 0/
=
(3.3),
Qn(6)
=
\
£
i<n
(EiYulXi]
- Xief + (E[Y2i \Xi] -
X'^
,
{8 €
:
Q{8)
=
0}
.
Manski and Tamer (2002) characterize consistent estimates
of 0/.
8
However, no method for
We
ence about 0/ in the sense of providing the inferential statements was given.
infer-
provide below a
confidence approach to inference about 0/ in this and similar examples.
Example
Moment
Structural
II.
terested in deducing the set of
data.
(CI,
The data
<
(Xi,i
9j of interest are
assumed to
E[mi{6i)]
=
m(9, X{)
d
are in-
that satisfy the
respect to the probability sampling distribution of the economic
(3.5)
rrii(9)
R
£
9
we
settings,
n) are stationary and strongly mixing, defined on the probability space
T, P). The economic models
where
moments
of
economic models indexed by a parameter
all
moment equation computed with
Equations. In method
is
=
satisfy
0,
a lower-semi-continuous function in 9
The
a.s.
entire set of
models 0/
C
that solve (3.5) also minimize the criterion function
where Q(9)
is
to nonlinear equations (3.5)
Demidenko
conditions
fail
l
(9)}'W(9)E[m i (9)},
continuous for each 9 6 0, a compact subset of
positive definite matrix for each 9
e.g.
= E[m
Q(9)
(3.6)
(2000).
is
more of a
general be minimized on a
mentioned
set.
The
Q n (0) =
,
and W(9)
is
a continuous and
In nonlinear models, the existence of multiple solutions
rule rather
than an exception, except
may be
In linear models there
to hold, as
(3.7)
S 0.
Rd
multiple solutions as well
in the introduction.
inference on
Thus, the
GMM
g n {9)
=
-V
n —
*
if
the usual rank
function (3.6) will in
0/ may be based on the usual
n[g n (9)]'Wn (9)[g n (e)},
in special cases, see
GMM
function
771,(0),
i=i
where
Wn
(9) is a lower-semi-continuous,
uniformly positive definite matrix. 9
To the best
of our
knowledge, no methods for inference about 0/ in the sense of providing Neyman-Pearson confidence
statements has been yet given in such settings.
3.2.
General Asymptotics of Criterion Functions in Parametric Models. The prime
of this section
is
to provide primitive conditions
a wide variety of parametric models.
and
The
tools will
{6 £
©
tools that help verify conditions A. 2
goal
and A. 3
in
be illustrated using the examples stated in the
next section.
Specifically,
if £ n
>
they show that the set
0„ =
:
Qn{0) < mine
Q n {8) +
£„} converges almost surely to the set 0;
converges to zero at a rate strictly slower than the rate at which Qn(-) converges uniformly to Q().
They
use
a Hausdorff metric as a distance function between sets.
For example,
it is
convenient to use the continuous updating type weight matrix
consistent estimate of asymptotic variance of
n
-1 ' 2
5Z"=i
m i(&)10
W
n (0),
i.e.
to let
W
n (6)
equal a
Since
borhood
approaches Q{0), we should be able to
Q n (9)
of 6/.
The
size of this
0/
that 9
tell
if
8
is
outside some neigh-
neighborhood depends on the rate at which the boundary of 0/ can
be learned. In the remainder of the paper, the rate we consider
the parametric rate and hence the
is
points of uncertainty are those 9i that are within a 1/'-y/n neighborhood of the true set 0/.
Such
points are of the form
9j
+ X/y/n,
In order to keep track of such points
central interest
6
for 0/
0/,Ag R d
will suffice to record all pairs of
it
The
over a suitable domain.
dQj Given
.
the form (9j,X). Hence of
the local empirical process
is
(0j,A)~/ B (0/,A):=n(g„(0/
set
.
9j
pertinent
domain
:= {A €
Rd
Vn (9])
In addition, the following limit version of
V
00
:= {A G
(9 I )
M.
d
+
6j
:
+ A/Vn7
9i
:
A
for
is
G 0,
will play
X/y/n G
1
be shown to be the boundary of identified
for 8j will
G dQj, the pertinent domain
Vn (9i)
+ AA/n)-Q(ff/))
for all ri
G [n,oo)}.
an important
role:
for all sufficiently large n},
(3.8)
where when
Thus,
V
oa
parameter space,
parameter space,
space
Voa (9j)
i.e.
9j
V^di)
i.e.
G dO, the
9i
= K.
G int(0), V„(6j)
Andrews
the partially identified models
=R
local deviations A should
of the specified form. This situation
case, as characterized in
in
G int(0),
the role of the limit local parameter space relative to
{9j) plays
interior of the
9j
is
d
.
When
9i
is
9j.
more
9j
is
in the
on the boundary of the
be constrained to the local parameter
similar to the one arising in the point-identified
(1999). Unlike in the point-identified case, the
is
When
of a rule rather than an exception.
boundary problem
It arises
even
in the
simplest leading cases, as will be seen in Section 3.3.
Another important
set
is
the subset of the local parameter space
A are constrained to be towards the interior of the identified
A„(0/) := {0}
U {A G R d
:
In addition, the following limit version of
A oo (0 /
)
:= {0}
Note that since int(0/)
is
C
U
{A G
0,
Rd
:
6j
A. n (9j) is
9]
+ X/Vn*
A n (9j)
+ X/^n
also a subset of V^{8i).
n
Vn (9i),
for all ri
an important
G int(0/)
where the
local deviations
set:
G int(0/)
will play
a subset of
Vn (9j)
G
[n,
oo)}.
role:
for all sufficiently large
which
is
n}.
a subset of K»(0r), and
A oo (0/)
Given above
definitions,
C„ := sup
we
n(Q n {8i) -
shall obtain the following representations of the coverage statistic
qn )
su P0 7 ee 7 4(0/, 0)
_\
\ sup 9/e e ; in (6i,0)
=
- inf^+A/^ee
su Pe /6 se ; ,A6An(e;) 4(0/, A)
f
{ su Pe 7 eaej,AeA„(0j)
In this representation, the
first
+
4(0/, A)
if
qn := 0,
if
g„ := inf eeQ Q„(0),
o p (l)
+ op (l)
4(0/, A) -inf e76QjiA6Ki ( e7) 4(0/, A)
.
ifng n
^p
0,
nqn
Ap
o.
if
equality follows by definition of the local empirical process 4(0, A),
To guarantee non-
while the second equality emerges from the analysis given in the appendix.
degenerate asymptotics for this
statistic,
we
4(0/, A) converges to a sufficiently
will require that
well-behaved random element £oo(9j, A), in the finite-dimensional sense coupled with some uniformity,
which we
will call the
quasi-uniform convergence.
This form of convergence will be sufficient to
establish that
c _^
c
.
=
\ sup flieSe/iA6Aoo(9/) ^oo(0/,A)-inf fl/e e / ,A 6 voo (e/
The
7^
—
e.
:
e
>
there exists large
infg/ e gQ \\9x
;
Since £ n (9j,X)
>
—
9j\\
0,
0.
Condition C.l
is
>
enough 5 >
and
int(0/)
=
that for In (S) :=
with probability no less than
=
4(0/, A)
inf
=
wp
—
>
1.
6 I ee,,\evn (e I )
motivated by the fact that in interval regression examples
£ n (9j,\)
not empty, so that
enough n such
large
5/y/n}, su.pg l€ jn ^\ |4(0/>O)|
is
implies that
0, this
nqn
We
)
When
(Degenerate Asymptotics on the Interior).
dQj, for any
{8j G int(0;)
I
^(0/^)
following conditions ensure that this convergence takes place.
Assumption C.l
Ql
ifngn -> p
ifn 9n7A p
f su Pe 7 6se n A-6Aoc(e/) 4o(0/,A),
=
wp
->
1
for
any fixed 9j G int(9/) and A G
M.
for large
n
d
.
provide examples and further discussion of this interesting phenomenon in Section
3.3.
Condition C.2 puts further convergence conditions by appropriately matching the large sample
behavior of function £ n (9,X) with that of some function £oo(9,
A),
which we
limit of £ n (9i, A). In order to state the condition, define the following
u n (X) :=
sup
£ n (8j,\)
and l n {X) :=
inf
4(0/, A)
for
Assumption C.2 (Quasi-Uniform Convergence Near Boundary).
K
is
any compact subset ofR
d
.
Then
12
call
the quasi-uniform
key functionals
n < oo and n
Let 9j G
=
oo.
dQj and A G
K
,
where
i-
4(^/iA) >
which
ii.
(a)
is
is
ifOi
converges weakly in finite- dimensional sense to
continuous in A for each
^
to Uoo(A) in finite- dimensional sense,
= 90/,
ifQj
0,
and
(b)
u n (X)
then (a) (u n (\),l n (\)) jointly converge weakly
s
and
in finite-dimensional sense,
to (u 00 (X),l ao (X) )
>
£ !X {9],X)
9].
d&i, u n (X) converges weakly
stochastically e qui- continuous;
some function
(b) (u n (X)
and l n (\) are
stochastically
equi- continuous.
iii-
<oo
sup e/€ae/iAeAoo( 0,)4o(0/,A)
a.5.
Condition C.2 extends the standard conditions used to derive the asymptotics of criterion func-
where 0/
tions in the point-identified case,
(1994).
Although asymptotics
near the boundary
is
in the interior
assumed
a single point
is
is
degenerate, as condition C.l states, the asymptotics
be non-degenerate.
to
£ n (9,X) converges in finite-dimensional sense to
and
requires that u n (X) and
l
inf
in C.2-(ii),
n {X) are stochastically equicontinuous.
it is
any compact subsets
I
edQ I ,XeA xl (e ;)
of IR
.
interval-censored outcome), while
it
is
least
C\ and
needed
^> e
holds in others (the generalized
as large as
(0i,X)
uniformly in (9j,X)such lhat u{6i,~X)
positive constants
w^
It also
one °f ^he components of
C.
to
S
90/ x K,
C.2-(ii).
hold in some cases of interest (the regression model with
(Local Quadratic Bound). For any
en
Condition C.3
^°o (^/> ^)
This stronger and simpler condition certainly implies
fails to
some
in finite-dimensional
C.2-(iii) insures tightness of the
stochastically equicontinuous over
is
However, this stronger condition
such that with probability at
sup and
possible to require that
(#/,A) I—* £n(9i,X)
Assumption C.3
C.2-(ii) requires that
limit £00(9, A).
transformations of ^oo(#r>A), denoted as Uoo(A) and ^oo(A).
limit coverage statistic C, since sup e
is
for verifying A. 3. C.2-(i) requires that
l
sense to respective sup
K
and
oi£n (9i,X) over 99/, denoted u n (X) and n (X), converge
inf transformations
where
some
imposes the conditions required
C.2-(i.-ii.)
for obtaining the limit distribution of coverage statistics
Note that
Newey and McFadden
see e.g.
9;,
1
—
e
e
>
0,
there
method
is
of
moments).
a sufficiently large positive
K
for large enough n
> Ci •n-minH<9 / ,A),J] 2
> K/^/n, where
that do not depend on
v(6r,X)
=
inf^'ge/
\\9j
X/y/n
—
9'j\\,
for set estimates,
and along with C.l and
C.2 allows us to verify the main previous high-level conditions A. 2 and A. 3.
C.3 extends, to the
needed to obtain the rate of convergence of extremum
timators in point-identified case; see conditions in Theorem 5.52 in Van der Vaart (1998) and
13
for
e.
obtain rate of convergence
set-identified case, the standard conditions
+
es-
Newey
and McFadden (1994),
Theorem
2185.
p.
c ._
c _>
I
su Pe /6 9e;,AeA<»(9/)
—p
Theorem
may be
is
necessarily occurs if qn :=
easily checked in
—
dQj and
the local parameter spaces
int(Oj)
is
A 00 (6j) and
we would
like to
/)
4o(0/,A),
non-empty
ifnq n7^ p
in 0.
states that under a set of conditions,
statistic attains a limit distribution,
For any k
—
C
and
00 such that k
p
67 C
=
Cn (k) C
which
which
of the quasi-uniform limit £oo{0j,X)
Ko(0r).
know
certain properties of the level sets.
3.2 (Rate of Convergence and Sandwich Property). Suppose A.l and C.l
Then, for some positive constants
I.
if
i
and sup transformations
inf
In addition to this basic result,
Theorem
or
inf 0/ee/iAeV oo(6
examples of interest, the coverage
determined by the appropriate
over
ifnq n -> p
-
Assumption A. 2. The theorem
3.1 verifies
holds, in particular,
^00(6"/, A),
\ su P9,ede,,\eAco(8,) 4o(0/, A)
where nq n
Then A. 2
3.1 (Limits of Cn ). Suppose A.l and C.1-C.3 hold.
-
C.3 hold.
C
o p (n),
0}" wp -»
1,
where
=C
en
\jk/n,
so that
d H {Cn {k),e I )<C-yfk/n~wp -+1;
II.
For any k €
Inn, the sandwich condition A. 3 takes place,
[co,Ci]
©/ C
and, provided
b
—
>
Cn (k) C
00 and b/n
b{
—
sup
eee'r n
Theorem
3.2 verifies
wp
"
7
at
>
Q
b
Assumption A. 3
vergence of the level sets to the identified
metric,
is
(8)
->
1,
where
polynomial
en
—
C'
lnn/y/n,
rate,
- sup Q b (6)) =
op(l).
see.
for
parametric models and characterizes the rate of con-
set.
The
essentially 1/\/n.
14
rate of convergence, as measured by Hausdorff
3.3.
Analysis of Regression with Interval-Censored Outcomes.
=
Xi
1
and suppose E[Y: < E[Y2 ). Then 0/ = {9
}
:
<9<
E[Y{\
First consider the case
E[Y2 )},
that
is
9/
=
when
[E{YX }, E\Y2 \]
Then
\2
Q n (0) = (Yi-ey+ +
„\2
/,->
{Y2 -ey_,
,
and
=
e n (9j, A)
n
-Bi- X/^)\ +
((?!
-9;-
(Y2
2
\/Jn) _)
Suppose that
MYi ~ EYU Y
2
- EY2
)'
(WU W2 ~ Af(0,Sl).
)'
d
Then
ln(0iA)
=0 wp
-» 1
9t s
if
(£yi,^y2 ),
and the finite-dimensional limit of
(e n
(EYu \),
en
(EY2 ,X))'
is
Therefore the finite-dimensional limit of £ n (9,\)
=
*oo(«>/, A)
Theorem
?n(Bi,X)
{Wi -
3.3 verifies C.1-C.3 for this
.
This implies by Theorem
Cn
^ d C=
2
X) + l(9j
is
2
((^1 -A)^,(^2 -A) _)'.
given by
= EYi) + (W2 -
example so that £oo(Bl,X)
3.1
2
\) _l(9j
is
= EY2
).
also the quasi-uniform limit of
that
= max[(W )l,(W2 )l].
sup
l Oo(0i,\)
e,sSe ,A6A 00 (9/)
1
;
One can
use the above distribution to obtain the critical values and corresponding level set with the
required coverage property.
Before proceeding to a more general case, notice that the inferential strategy developed in this
simple example can be used in other problems where 0/
F\
<
9
< F2 where
Fi and
F2
is
models
is
to derive the joint
interest. All
above
£
such that
is
a set of
asymptotic distribution of the sample analogs of the endpoints of
for the case
when F\
= EY\
and
F2 = EY2
Getting back to more general settings, suppose that
(3.9)
9
one needs to do in this class
the interval of interest and obtain via simulation the critical values for the
variables, as outlined
all
are functionals of the distribution of the observed data. This
models that provide interval bounds on the parameters of
of
defined as the set of
X
£ {xi,...,xj}
15
P-a.s.,
.
maximum
of two
random
where the
0/
is
first
component of
is
Condition
1.
(3.9)
assumes that
X
is
discrete.
The
identified set
determined by the set of inequalities:
€R d
0/ := {6
(3.10)
It is
X
C
assumed that 6/
>n
x'fi
:
0, where
X
x'jO
< t2
<
:= (xj,j
J) has rank
=
a©/
and int(0/)
is
empty
=
Define t\{x)
{9 1 £ 0/
Rd
in
,
E{Y\\x) and t$(x)
~
\
=
E
E
or x'flj
ri (xj)
= dQr
0/
so that
3n(#)
=
x'jdi
:
=
and only
if
d,
r2 (xj)
if
J}.
and that the d x J matrix
,
which rules out redundant parameterization. Then the boundary of the
(3.12)
<
(xj) for all j
Rd
a compact subset of
is
(3.11)
and
(xj)
some
for
,
t\(xj)
=
identified set <90r
j
<
T2(xj) for
is
J},
some
j.
.E^Y^x) and consider the objective function
X^l + X
-
fa(*i)
(
'i°
- ?*(*))-
.
i=1
(3.13)
Tk(xj) :=
Yi
fc
'
=
1 .2,
j
=
l,.„, J.
i:Xi=Xj
In this case for
n3 := ^
^2?=i Ip^i
-
=
£ n (9;, A)
W\ 2
£.?']>
n£ —" ((Tlfo) - x^
3
Define
=
7
- &$A/Vn)£ +
-
(f2 ( Xj )
x'^9,
-
2
x'jX/y/n) .)
.
=1
:= \/n(T\(x j)
—
W
T\{x j)) and
2j
:= v/ n(f2 (xj ) —T2(xj)) for j
=
1,...,
J.
Assume
also that
a central limit theorem and a law of large numbers apply so that
(Wi.^i),..,^,^))'^
—p
rij/ri
Theorem
pj
for
eachj
=
1,
...,
((Wn,W2i),...,(W1 j,W2J ))'
~ M(0,£l),
and
J.
3.3 (Interval Regression). Let the basic conditions specified in Section 3.2 hold and also
(3.9)- (3.14) hold.
Then C.1-C.3 and A. 1- A. 3
are satisfied. Moreover,
J
too{0i, A)
c=
\
= J2Pi [( WU " x'jXftUxfa =
nixj))
+ (W2j -
2
x;.A) _l(x^ 7
su Pe 7 e3e,,AgA 00 (e )^oo(^/,A)
/
if
\ su Pe eae,,A6A 00 (e ; )^oo(^/,A) -inf e;6e/iA6Koo(6)/) 4o(0/,A)
;
where nqn
=
This theorem
orems
—
wp
>
1
occurs
verifies
2.1, 2.2, 3.1,
and
when int(0/)
is
non-empty or
if q n
:=
=
r2 (x,))]
nqn -> p
ifnqn7^ p
0.
Assumptions C.1-C.3 and A.1-A.3, which implies that the
results of
The-
3.2 apply to the interval regression case. Therefore the inference procedure
16
proposed
in Section 2 is valid in this case.
Theorems
properties given in
The
and
3.1
Further, the confidence regions have the stochastic
3.2.
distribution of the limit variable
C
is
not pivotal, and has no
known
closed analytical form,
but this does not create a problem for the inferential method proposed in Section
simplified
much
further, unlike in the trivial case with
X{
=
1.
One can
C can not
2.
simplify the leading term as
J
sup
=
4o(0/, A)
V w Uwisf+Uxfa = n
sup
(ay))
+ (W2j
2
)
_l(x'j ei
=
t2 {xj))}
,
»/t
but other terms do not appear to simplify
3.4.
Analysis of the Structural
ple,
and then generalize
it.
above expressions.
in the
Moment
Equations Model.
We
first
work out a simple exam-
Consider the usual two-stage-least-squares model
Yi^Xfa + eiiej),
Eei{ej)Zi
= 0,
which leads to the following objective function
1
Qn{e)
Assume
linear
<
that
r
subspace of
=
Md
=
{Y -xle)z') (^E^)"
(±Y
2=1
i=l
J
EXZ' <
rank
Note that 0/ has empty
when
there
is
weak
=
{6
=
O
+
0/
Q n {0)
defined as follows: for any 6q such that Ee(Oo)Z
is
$ £
interior relative to
identification,
6
such that
Rd
pivotal
is
Suppose
is
5'
EXZ' =
and can be inverted
in the case
for confidence intervals.
making inference on 0/
is
(which
also true
is
when
n
n
/
i
= (A n (9i) + \'-J2 X Z?)
*
i
(~E
i=\
q n :=
mig^QQ n (9)
but dim(.Z')
n
-l
Z^0~
i
(A„(e/)
+
A'-X>^),
i=l
n
j
=l
n
Under standard sampling conditions
n
n
n f—
i=i
ZiZ'i ^ p EZZ',
n
In
the empirical
dim(X)), so that
- ]T
0,
no longer pivotal.
for simplicity that qn :=
l n (9!,\)
=
0}.
Under the usual circumstances, even
.
the partially identified case, the statistic most relevant to
process {Q n {9i),9j 6 ©/) which
i=\
dim(0), so that the identified set 0/ consists of an r-dimensional
intersected with 9.
6/
(if^iYi-XftZi).
i
J2 z x
i
i
-+p
EZX', and
.
i=i
17
{e(6i)Z, 9; e
©/}
is
Donsker.
<
Hence the finite-dimensional and quasi-uniform
ioo(Bi,X)= (a(6i)
where (A(8j),8i 6 0/)
under
is
the
weak
mean Gaussian
n {8hX)
+ X'EXz}' (eZZ'Y
is
given by
+ X'EXZ'
(a(0j)
(A n (8i),8j 6 0/). For
limit of the empirical process
instance,
EA(8
process with covariance kernel given by
'/)
A{8'1 )
=
1
Ee(8j)e(8'I )ZZ
/
.
therefore conclude that C.1-C.3 are satisfied and thus
Cn
-1
w/1
= A n (8 I )'(-\]Z
l
Notice that the limit
Z'i
A n (8,)^ d C=
)
we
Next,
method
generalize our
This
\\Emi{8)\\
is
l
2
10
method
C
There are positive constants
> C-min[
inf
\\8
Also, compactness of
0/
is
the
finite.
to the general nonlinear
partial identification condition hold:
\
3 15
sup A{8 ,)' [E Z Z]~ A{8 j)
not pivotal and depends on knowing 0/.
is
necessary condition for the limit variable C to be
(
(.
sampling, provided that {e(8j)Z,8j S 0;} has a square-integrable envelope, the limit A(-)
iid
a zero
We
is
limit of
2
-
8j\\,5\
of
and
uniformly
,
moments. Let the following
5 such that
6 0.
for 8
both a partial identification condition and a smoothness assumption. This condition implies
that Errii(8)
=
if
an only
if
6 0/, and
8
smoothness on the behavior of Em,i(d)
also imposes
for
points 8 near 0/.
In the point-identified case, global identification and the
V$Em.i(8) near 8j ordinarily imply
set identified case, the
Jacobian
(3.15); see e.g.
may be
Theorem
to the hyperplane {v
:
positive eigenvalue of
Theorem
hold and
EZ'Xv =
In
Hence
if
rank
||0J
EZ'X
—
0||
Pakes and Pollard (1998). In the
>
0,
the vector (8
= EZ'X(8 — 8J),
— 8j)
is
orthogonal
non-zero, for Co denoting the minimal
is
(EX'Z)(EZ'X), we have \\EZ'X(8 -
2
8*j)\\
> C
\\8
-
2
8*j\\
.
3.4 (Generalized Method-of Moments). Let the basic conditions specified in Section 3.1
let
i.
Qj
=
dOj,
continuous Jacobian G(8)
(A(0 7 )'
0}.
3.3 in
IV model we have that Em,i(8)
the closest point to 8 in 0/. Provided that
is
rank and continuity of the Jacobian
degenerate, which necessitates a statement of a more careful
condition (3.15). For example, in the previous linear
where 8}
full
the
case
ii,
=
1
be a
VoErni(d),
Wn
>
dim(2)
+ X'EXZ') [EZZ'}-
{mi(8),8 G 0}
v.
dim(i),
Donsker
{8)
and
class, Hi.
= W{8) +
op (l) uniformly in
=
qn
Em.i(8) satisfy (3.15) and have
inf« 6 e Qn(9),
(A(9i) + X'EXZ') and
,
Cn
^ C=
d
sup ^oo(0j,O)«,ee,
inf
»/€6, »ev„(«,)
1
18
4o(0/,A).
8,
then
where W(8)
^oo(#/,A)
is
positive definite
and continuous for
icoiBj, A)
c
_
j
=
(A(0/)
sup e/6 e /
+
Then, C.1-C.3 and A.1-A.3 hold, and
all 9.
+
W(6j) {A(9j)
A'G(0/))'
\'G{9 I ))
i/ng n -> p O
^ oo (6'/,0)
\ sup0 ;€e;iAeAoo(e;) 4o(#/,A) -inf e/€e;|AeKxj(fl/)
where 9j
>—>
A(9j)
the
is
weak
=
—p
necessarily occurs if qn '=
The theorem
orems
2.1, 2.2, 3.1,
Section 2
in
verifies
is
and
3.1
and
3.2.
does not create a problem
4.
We
illustrate our
m i{8l) Ya=i rn i{^i)']-
E{n~ l Y27=l
<
:= inf^ee Qnifi) but dim(mi(#))
i/qr n
GMM
3.2 apply to the
Above,
dim(f?).
case. Therefore the inference
procedure proposed
in
Further, the confidence regions have the stochastic properties given
C depends on 0/. Hence the
Similarly to the interval regression case,
C
distribution of the limit variable
We
a zero-mean Gaussian process
the conditions C.1-C.3 and A.1-A.3, which implies that the results of The-
valid in this case.
Theorems
or
lim n _oo
ifnq n f+ p O
^oo((9/,A)
A n (9;),
limit of the process 9j >— >
with the covariance function E[A(9[)A(9{)'}
nqn
,
not pivotal, and has no
is
for the inferential
known
method proposed
closed analytical form, but this
in Section 2.
Computation and Empirical Monte Carlo
methods above with an empirical Monte Carlo that uses data from the CPS.
also provide details
on the computational method that can be used to construct the confidence
sets.
Computation. An
4.1.
subsampling method
issue in the
being able to use an
is
algorithm to construct level sets of an objective function. Ideally, one would
of grid points
and evaluate the function on those
efficient
like to
use a rich set
However, as the dimension of 9 increases,
points.
constructing a simple uniform grid becomes computationally infeasible.
The Metropolis-Hastings
algorithm provides a computationally attractive method for generating adaptive grid
details of the numerical
approach we use
=
1.
Generate a grid of points B
2.
Given a starting critical value
objective function as:
3.
C n (k)
=
j
:
c
k=
and
n(Q n ((9 g )
,
5
Then compute C n (6(a))
This algorithm
is
{6>
g
6 §
:
<
k}.
compute Qb(#g) for
j = l,...,B n
= b(max eg g k Qt,(0g) - min 8 sc (k)
{Cj n j = 1, .--iBn}
,
=
sc „(k) Qn(#g))
Cii (
Compute the Q-quantile 5(a) of
ib ,
)
B
n
all 9 &
£C n (k),
and the
Qt>(# g ))
.
,
n(q n (0 g )
- min 9g6 c„( k
)
Qb(<? g ))
<
c(a)}.
a valuable method of generating grid points adaptively, so that they are placed in relevant
regions only. See Robert and Casella (1998) for a detailed description; and Chernozhukov and
examples
11
lnn, we can compute the level set of the
c
- min e
Cj |b n
4.
.
in the following algorithm:
where
subsample coverage statistic, e.g.
The
sets.
using the Metropolis-Hastings algorithm.
(9%, ...,9 k }
{9 g €
At each subsampling stage
summarized
is
numerical
in non-likelihood settings.
19
Hong
(2003) for related
An
important practical issue
consistency of estimate c a
the choice of the initial point
is
not affected by the choice of cq as long as cq
is
Hence a reasonable procedure
(2.2).
For example, in
for selecting cq
how
is
2.2 suggests the
bounded
in the sense of
can be based on pointwise testing procedures.
GMM a pointwise critical value for testing Q{9)
of a chi-squared variable with degrees of freedom equal to the
section for
Theorem
cq.
to choose cq in the interval regression case.
of the asymptotic distribution of the coverage statistic
More
=
at 6
number
is
given by the a-quantile
of equations.
See the next
generally, one can use the a-quantile
computed under the assumption that
8j
is
a
singleton.
Empirical
4.2.
Monte
Carlo: Returns to Schooling in the
sample performance of the inferential procedures proposed
Population Survey. Our "population"
is
a sample of white
wave of the CPS. The wages and salaries
CPS. We examine the
in this
men
series are not top
actual finite-
paper using data from Current
ages 20 to 50 from the
March 2000
coded or censored and so we are able
to use this population to construct
1.
confidence regions for the returns to schooling parameters in the point identified case, and
2.
confidence regions covering the identified set
Our Monte Carlo
summary
statistics for
1.
Variable
Wages and
Salaries
Education
Population
9/
set.
Table
We
start first
Summary
1
below provides
Obs
Mean
Min
Max
66667.6
51968.41
1
513472
13290
11.77
1.89
1
16
Std
Dev
least squares regression of log of
by describing the
is
.0533
Statistics
13290
with a constant term of 3.91 obtained from a
.
bracket the data.
our data (population). The "true" returns to schooling coefficient
TABLE
4.3.
artificially
based on random sampling from the original data
is
in the "population"
when we
details of the
wages on education
computational procedure.
Starting values and Implementation Details: The algorithm used to obtain estimates of
the same as the one described on page 20 above.
is
1.
We
draw a sample
of size
The
steps of the procedure are as follows:
n from the above population
(this
population will be
artificially
bracketed below).
2.
We
build an initial estimate C„(co) at the starting cutoff level cq (choosing cq
is
described
next).
3.
In the subsampling step,
we obtain
the consistent estimate, c a of the cutoff level by subsam-
pling the coverage statistic.
20
At
this point
One may
In
The
iterate several times, but
no or at most two
after
4.
one can go back to step 2 above and set
all
of the
we
=
en
is
Inn and then repeat step
find that the cutoff levels that
we
used to construct the
For example, in the method of moments case, the
the following way. Let
is
function
Q n {6)
initial level set of
critical value
some value
get are very close
the objective
<
6q,
and
specific.
from a \ with an appropriate degrees
Yii
=
we choose an
initial
Yf, then we use the objective
This generates an auxiliary model, which
n).
which we can compute the
for
problem
is
2
Y? = \{Yu + Yu) and Yu =
applied to the data {Yu,Y2i,Xi,i
point-identified at
12
as described in Section 4.1.
of freedom can be used as the starting value. In the interval regression example,
value co
3.
iterations.
above steps we use the numerical procedures
starting value en that
ca
is
limit distribution of
a
a
n(Qn(O O )-Qn(0 O )),
and take
which
is
its
what
quantiles as the starting value cq. In
we used subsampling
follows,
a consistent method of estimating cq by Theorem 2.6.1 in
Politis,
to estimate Co,
Romano, and Wolf
(1999).
Properties of the Set Inference Procedure in the Point-identified model. Here, the
4.4.
data on wages
is
not interval measured (the model
is
point identified), but
we nonetheless apply
our procedure to obtain the confidence region. The objective function that we use
is
the
minimum
distance objective function
J=16
Q n (9) = J2 ^(flfo) -
(4.1)
Since wages are not interval-measured,
x'jO?-
we have
we provide graphs
1,
subsampling procedure
In Table
sample
2,
we examine
sizes:
n
=
and
b
=
4.5.
for the case
—
ellipse
x' e)l.
3
?2(xj),j
=
I,...,
J.
The
results
obtained using the usual Wald inference
n = 400 and n
=
2000 observations and see that the
We provide the coverage for two
simulations. We report the coverage for a
the coverage of our inferential procedure
sizes.
where n
-
close to the ellipse obtained using the usual chi-squared approximation.
is
1000 and n
sequence of subsample
for
(h(xj)
that f\{xj)
obtained are compared to the the joint confidence
approach. In Figure
+
=
=
2000, using
As we can
1000,
it
see,
R =
600
.
coverage seems monotonic initially in the subsample size
peaks at
b
=
200 while
for
the case where
n
=
2000,
it
peaks at
500 and comes close to 95%.
Properties of the Set Inference Procedure in the Set-Identified Model. To examine
the identified set of parameters in the case of censoring of the dependent variable, we bracket the
income data
into 15 different categories.
In the interval regression
These brackets are
example and similar
situations,
21
(in
thousands)
Inn may be dropped.
Figure
Point Identified Case: Subsampling vs x 2 Ellipses
1.
Subsampling
3
Ellipse
0.06
Chi Squared Ellipse
UJ
Constant
Table
2.
Finite-Sample Coverage Property in the Point Identified Model
Subsample Size
n=1000
b=50
b=80
b=120
b=200
b=300
85.1
88
87.2
93.1
91.2
b=200
b=300
b=400
b=500
b=600
86.1
88.2
90.5
95.01
93.3
Coverage f95%,)
n=2000
Coverage (Qh%)
[0,5]
,
[5,7.5]
[30,35]
,
,
[7.5,10]
[35,40]
,
,
[40,50]
[10,12,5]
[50,60]
,
Notice here that the topcoding
obtain 718 observations in the
,
,
[12.5,15]
[60,75]
is artificially
first
,
at
artificially
,
[20,25]
,
[25,30]
[100,150]
,
[150,100000]
,
$100 million. To give a flavor of the data, we
bracket, 1211 belong to [50,60], 1598 belong to [100,150] while
Cn {c$$)
random without replacement from
Using the
[15,20]
[75,100]
set at
734 are making above 150. The objective function
through our 95% confidence region
,
for
is
the same as the one used earlier. In Figure 2
sample
sizes
n = 600,2000,4000, and 10000 drawn
the original "population"
bracketed population,
we can obtain
the identified set by collecting the
parameters that satisfy the following set of J inequalities corresponding to the set of J values that
the level of education takes:
E[y\\xj]
<
b
+
bixj
< E[y 2 \xj],
22
j
=
1,
...,
J.
We
Cn (c.9$).
then graph the identified set 0/ along with the 95% confidence region
as the sample size increases, the set estimates shrink towards the identified set.
n
=
600, our set estimate of the intercept
Figure
0.16
9/
2.
Cn (c 95):
is
[3.2,4.71] while the true range
Figure
vs
Cn (c 95).
n= =600, b=60.
0.16
3.
§
0.1
9/
vs
n=2000, b=200.
-
-
0.14
Em,
0.12
[3.6,4.4].
0.16
0.14
c
For example, at
0.18
1
'
is
Notice that
•
»t-
c
- Sut) sampling Confidence Sol
Subsamptinq Confidence Set
0.12
O
|o,
"a
UJ
Horn
So.08
^C —
b
|
Identified
w
E
Sel
§0.08
0.06
a.
0.04
1
A
v
0.02
—
^Wmfflk-
-
Identified
Sel
a.
\
0.04
-
0.02
-
^b
-
2.5
3
4
3.5
4,5
5
5.5
6
Constant
Figure
Figure
4.
9/
vs
5.
C„(c. 95 ). n=4000, b=400.
9/
vs
n=10000,
Cn(c.g 5 ).
b=2000.
0.14
c
0.12
g
1
0,
•o
HI
2 0.06
<fl
E
I 0.06
a.
Confidence Set Estimate
3
3.5
4
4.5
5
3
3.5
4
4.5
5
Constant
To
calculate coverage probabilities,
we draw
R=
600 random samples from the population and
record whether our set estimates using these samples cover the identified
set.
For every sample, we
construct the set estimate, and calculate coverage in the following manner. Numerically,
23
we
store the
5.5
6
Table
3.
Finite-Sample Coverage Property in the Set Identified Model
Subsample Size
N=1000
Coverage f95%,)
N=2000
Coverage ^95%J
identified set
Cn (c a
).
6/ and
As Table
set estimates
3
Cn (c a
)
as arrays,
and Figures
b=100
b=150
b=200
b=250
b=300
84.1
90.2
91.3
88.5
82.8
b=200
b=300
b=400
b=500
b=600
86.34
90.1
92.2
94.3
91.2
and then check whether
2-5 show, the coverage
is
all
the points in 0/ are contained in
similar to the point identified case
and the
converge to the identified set as the samples size increases. The numerical performance
of our inferential
methods supports the
large
5.
sample theory developed
in
Theorems
3.1-3.3.
Conclusion
This paper provided confidence regions for identified sets in models with partial identification.
The proposed
inference procedures are criterion function based, and our confidence regions are certain
level sets of the criterion function in finite samples. In the case
when
the model
is
point-identified,
our confidence sets reduce to the conventional confidence regions based on inverting the likelihood
or other criterion functions.
The proposed procedure was shown
to
be valid under general yet
simple conditions. Along with inferential procedures, we have developed methods of analyzing the
asymptotic behavior of econometric criterion functions under
set identification
the rates of convergence of the confidence regions to the identified
regressions with interval data
and
set identified
method
of
set.
We
and
also characterized
applied our methods to
moments problems.
We
also assessed the
performance of the methods in Monte Carlo experiments based on the Current Population Survey
data.
We
found that the methods perform well and in accordance with the asymptotic theory
developed.
24
Appendix A. Appendix
We
-L
-J2 /(Wi), G n f(W) =
E n f(W) =
£
W=
use the following notation for empirical processes in the sequel: for
E
denotes expectation and
(Y,
E (/W) " E /(^))
denotes expectation evaluated at estimated functions
E/(Wi)
=
X)
(£/(Wi)) /=/
/:
.
Outer and inner probabilities, P' and P» and corresponding notions of weak converegence are defined as
—
van der Vaart and Wellner (1996).
in distribution
under P*
wp —*
,
denotes convergence in outer probability, and
p
means "with the inner probability approaching
1
"with the inner probability no smaller than
—
1
e for
sufficiently large
n"
.
A
1,"
—*d
in
means convergence
and "wp >
table of notation
is
1
— e" means
given at the
end of the Appendix.
Appendix
Lemma
B.
Proofs of Lemma
B.l.
Proof of
B.2.
Proof of Theorem
2.1. Proof
B.3.
Proof of Theorem
2.2. Step
2.1. Proof
2.1,
and Theorem
2.2.
D
given in the main text.
is
is
I.
(B.l)
Theorem
2.1,
given in the main text.
We
have that
a b (Q jtb (0)-qjtb ),
sup
Cj,b,n=
eec„(c)
where
q h b :=
if
and q^ b
qn :=
:
=
Q h b{9)
infgge Qj,b{9) otherwise;
denotes the criterion function defined
using the j-th subset of the data only. Define
s„
G
(B.2)
b ,„(
X ) :=
B~ J2
b„
l
l{Cj.fr.n
<
X}
=
-S"
1
3=1
<X- {Chb
J2 HCjAn
,n
- CjA n)},
3=1
where
Cjtb ,n
(B.3)
=
sup a b {Qj,b{9)
-
q} ,b)
eee,
In
what
follows, the
main step
is
to
show that
can be replaced by Cj
Cj,b,n
:
b,
n
This
-
will
be possible due to
the sandwich assumption, and despite that the constant k used in construction of the preliminary set estimate
Cn (k)
data-dependent.
is
By
A. 3
wp —
>
1, for
some deterministic
(B.5)
0^" we have
e,cc„(i)ce;",
(B.4)
where
set
£
6 7"
:= {9,
+
t
:
||i||
<
e„,0/ £ 0/}. Hence
sup a b (Q],b{9) - q 3 ,b) <
see,
sup
wp
->
1
for all
a b (Qj,b{9)
-
<
i
qj,b)
J,6,rl
< sup
eee'"
«ec„(c)
'
Bn
-
>
C,
b
'
n
25
a b (Qj,b{9)
—
C
-
qj,b).
'
b
n
wp —
Hence
>
1
G btn (x) = B~ Y,
l
(B.6)
1{^,6,»
<
< &,»(*) < G b
*}
,
n (x)
= B'
1
3=1
By
£
3
l{Ci|6 n < x}.
,
=1
A.2 C 6 -> d C and by A.
Cb
(B.7)
= Cb +
-
op {l)
C,
d
so that by Step II
G btn (x)-* p G{x)~P{C<x}, G b n {x)-*P
(B.8)
if a; is
G(x)
,
= P{C<x},
a continuity point of G(x), which proves that:
G
(B.9)
at the continuity points
b
,
x of G(x). Furthermore,
(B.10)
n (x)^ p
G
if
= G-\{a)
ca
is
-*
G(x)
continuous at c a
p
=
ca
= G~
1
{a),
G-'(q),
since convergence of distribution function at continuity points implies the convergence of quantile functions
at continuity points. Finally the claim
I,
that
p(e,ec„(y)^
(B.ll)
by Theorem
follows
Step
II.
a
,
2.1.
This part shows (B.8). Write
Bn
Gitn (x) = B^ '£l{C jAn <x}
1
(B.i2)
r
(0)
= P{C b <
(
=>
at the continuity points x of
G(x)
= P{C <
(B.13)
n
where
(a) follows
P{C <x} +
o p (l)
x}, as long as
6
0,
—
>
co,
n
—
*
Varl^fSfo,^*})
by the variance bound
in Politis,
and the
o p (l)
oo;
from
(B.14)
(B.15)
+
x}
for
bounded
Romano, and Wolf
17-statistics for i.i.d. series
(1999);
and
(b) follows
C b -> d
definition of convergence in distribution
and a-mixing
by
C, as
on K.
26
6->
=o(l)
oo,
series given
on pages 45 and 72
Likewise, conclude
G
b
,
= B- Y^HCJ ,b,n<x}
1
n (x)
J=1
(B.16)
= P{Cb <
x of G(x)
at the continuity points
Step
Wp
III.
->
= P{C <
=>
('k)
=>
C„(i)ce'"
—
o p {l)
O.
/i(C n (/i),G /
)</i(0 -,0 / )<£
E
71
.
1
>
dH(6/,C*n (fc))<e n
(B.19)
Thus Claim
=
/i(0/,Cn (fc))
1
>
(B.18)
Hence wp
= P{C <x} +
z}.
e,cCn
wp —
o p (l)
1,
(B.17)
Then,
+
x}
.
proven.
II is
Appendix
Proof of Theorem
C.
3.1
First recall that
C n := sup
n{Q n {6i) -
qn )
e,se,
(C,1)
=
We
/
C2
f
su Pe/£e , 4(0/, 0)
if
g„:=0,
\
sup e;e6; 4(0/,O)-inf S;+Vv^ ee 4(0/,A)
if
gn := inf fl6e <2n(0)-
seek to establish that C n
n
c
.
=
f
—>j
C,
where
suPe/ese/.AeA^ts,) ^oo(^/,A),
- inf^ge^sv^ta,)
1 suPe/ese^AeA^o,). 4c (0/, A)
A 00 (6> /
(C.3)
)
{0}U {A
14,(0/) := {A
(C.4)
Step
:=
I:
Case when nq n —* p
0.
eM d
G
Kd
Since nq n
61/
:
:
—p
Cn
(C.5)
+
0/
+ A/Vn
G
0,
we have
that
=
in
what
follows
Then we would
nq n -^ p
if
ng n /* p
for all sufficiently large
0,
n},
for all sufficiently large n}.
sup 4(0/,O)
a;
Hence
A/ v/nG int(0 7 )
4o(0/,A)
if
+
o p (l).
se,
we ignore the o p (l) term.
like to
show the following simple, but important representation:
Cn
(C.6)
=
sup 4(0/,
0)=
4(0/, A)
sup
e,ee,
e,ede,,\eA„(8 r )
where
A n (0/)
(C.7)
To prove
0}
+
(C.6),
X*/^/n
=
we need
0/.
If 0/
to
:= {0}
U {A G R d
show that
for
:
0/
+ A/v^7
G 0/, there
each
G 90/, then simply take
G int(©/)
6]
=
6j
27
and A*
for all ri
exists 6]
=
0.
G
[n,
oo)}.
G 90/ and A* G
If 0/
G int(0/) take a
A n (0J)
such that
sufficiently small
>
ball centered at #/ of radius 6
radius of the ball
<5
6 90/. Such radius
9*j
6j all
denoted Es(8j), such that Bs(di)
,
exists
In
what
90/ and
follows,
we
A n (9}). Hence
A* e
1.
0/ has an empty
Case
2.
0; has a nonempty
1.
By
Cn
is
=
0)=
sup 4(6>/,
2.
—
9})y/n.
By
/
90/.
e;
sup 4=(0J,O).
eae.
empty,
{0},
also rewrite (C.8) using general notation
=
=
sup 4(07,0)
For any 5
C=
4(07,A)-> d
sup
e,sse,,AeA n (fl;)
«iee,
Case
(9i
90/,
0/
C=
4(0/,O)-> d
sup
)
Cn
=
0/
e,^ae,
A n (e / = A oo (5 / ) =
(CIO)
=
true.
interior relative to 0, so that
(C.9)
we can
segment between 9} and
\*/y/n/, n' g [n,oo), where A*
interior relative to 0, so that
e,ee,
so that
expanding the
C.2-(i)
(C.8)
Since int(0/)
is
line
start
ball includes a point
distinguish two cases:
Case
Case
(C.6)
+
9}
Then
int(©/).
boundary of the
by compactness of 6/. Then the points on the
belong to int(0/) and can be parameterized as
construction 9} €
C
so that to find the minimal value of S such that the
>
sup
4o(0/,A).
0jeee,,AeA,„(e,)
decompose
Cn
(C.ll)
=
max[C;(<5),C„(5)],
where
f
v
C
4 (#7,0)
sup
C;(<5):=
12);
and In (5)
=
{#/ e int(0/)
By Assumption
:
infygae,
C.l for any
e
>
||#/
sup
-
0/||
>
5/y/n}, as defined in Assumption C.l.
liminfP»(c;(<5)
also that
C n (5) >
4(07,0),
e,ee,\i n (6)
there exists S sufficiently large such that
(C.13)
Observe
and Cn (5) :=
e,£i„(6)
'
by construction. Hence
liminfP„(<4
(C.14)
n—*oo
for
=
o)
any
e
=Cn (5)\
>
1
>
>
-
e.
there exists 5 sufficiently large such that
1
- e.
J
I
Observe that
(C15)
C n {5)=
By Assumption
C. 2
(i.-ii.)
any compact set
4 (9j, A).
K
SU P *n(0i,-)=> sup 4o(6>/,-)mi°°(^),
e,eae,
e,eae,
(C16)
v
'
where A
for
sup
»/68e ; ,A6A„(8,)n{||A||<i}
>—>
sup S;gae; 4o($7, A) has uniformly continuous paths.
The next claim
Continuous Mapping Theorem that
Cn{5)^d
(C
17)'
*
C{5) :=
The proof
(C.18)
sup
4o(#7,A).
e;6Se,,AeAoo(fl,)n{||A||<«}
'
of this claim
is
given below. Hence for any closed set
F
limsupP*{C n (<5) e F} < P{C(6) e F}.
n—*oo
28
is
that (C.16) implies by
Hence by (C.13) and (C.14)
for
any
e
>
there exists
limsupP*{Cn
(C.19)
£F}<
enough such that
large
<5 £
+
P{C{5 € ) € F}
e.
n—*oo
Therefore, taking
e
—
>
and
<5 t
—
oo accordingly,
»
it
follows that
< P{C
limsu P P*{C n € F}
n—»oo
(C.20)
6 F},
where
/n
C := lim C
o~\)
The
limit
C.3(iii)
=
(5)
s ^°°
;
^
C
exists in
C < oo
a.s.
sup
^<x(di,X)
a.s.
e,eae,, A€A„(9,)
R by the Monotone Convergence Theorem. By construction C >
a.s.,
and by Assumption
Conclude that by the Portmanteau lemma (van der Vaart and Wellner (1996),
(C.20) that
Cn
(C.22)
Proof of C n (5) -> d C(5). Observe that
(C.23)
and A n (9[)
/A
oo
(0/), in
the sense that
U~ =1 D~
(C24)
since by definitions given earlier for
U~
(C25)
no
no
all
C.
'
C A n (6i) C
An„(0/)
^d
A„(fi 7 ), for
A n (0/)
is
all
A n (07 ) =
,
all
a nested sequence and
A n (9j) = n»0=1 U~
n >
n > n
„„
A n (0,) =
n >
1
A n (0j) — A oa (8 I ),
»
i.e.
A_(0,),
1
n»
A.(0,) and
no
A„(fl,)
= A„ o (0,).
Therefore,
4(fl 7l A)
sup
C;,(5):=
fl/6se ; ,AsA„ (e,)n{||A||<5}
^ C"( 5 =
(C 26)
v
'
^ (^' A
SU P
)
)
e/ese,,AeA„(e,)n{||A||<cS}
'
<C n (S):=
sup
<„(»/, A).
e,sse,,AeA«,(e,)n{||A||<5}
and
£(<*):
=
sup
S/68e,,AeA„
*»
(*/.*)
(e,)n{||A||<i}
(C.27)
<C(<5):=
sup
4c(0/,A).
e,eae,,\eA a,(6,)n{\\x\\<s)
By
(C.16) and the Continuous
(C.28)
and
for
(C.29)
Mapping Theorem
r {6)^
d
C{5)
-
d
C(6)
n
any
fixed
n
Cn(S)
Let the underlying probability space be denoted as
(C.30)
(fl,J-, P).
sup
Observe that
4o(0/,A)
9;eae,,A6A«,(e,)n{||A||<
29
5}
1
for
every
u>
£
fi,
p. 20)
and
is
some
necessarily attained at
compact and A„(8}(w))
[~l
5}
=
C(5)
—
»
{||A||
<
<$},
because the set 90;
is
is
t
(6*(u),\ {w))a..s.
i 0O
oo,
A no (0/) n
(C.32)
That
also compact.
is
(C.31)
Moreover as no
A 00 (8*i (lo)) n
€ 90/ and A*(o;) €
9*,(u>)
<
{||A||
/ A„{6,) n
<5)
{||A||
<
{||A||
6} for each 9,
€ 90,
hence
An„
(C.33)
so that by continuity of A
i—
i
WH) n
x {6i, A),
we have
/ A»(»;(w)) n
5}
<
{||A||
5}
a.s.
that
lim
/ no—
C(5)
(C.34)
<
{||A||
C(5)
=
C{5)
a.s.
»oo
Note that to obtain (C.34) we use that C(5)
exists a.s.
is
monotonicity increasing
in
no due to (C.32), so that lim no _ 00 C(5)
and by (C.27)
<
C(S)
lim
(C.35)
C{5)
a.s.
TLq—»CX3
Moreover, for 9](u) defined above there exists a sequence A* o
A*(oj) a.s.
Hence by continuity of A
lim
no—*oo
(C.36)
Now we
C(5)>
^(^z,
i—»
A) at each 9j
(u;)
A no (0J(u;))ri{||A|| <
in
5} such that A^ o
(u;)
—
€ 90/,
,
t
lim e ao {6*I {uj),\ no (u}))=e oo (9 I {u),X {u))=C{6)a..s.
no—*oo
F
are ready to close the argument. Let
P{C(6)
< F} =
be any
real
number such that P{C(S)
= F} =
0,
then
P*{C n (5) < F}
lim
n—*oo
(2)
<
liminfP,{C„((J)
<F}
n—>oo
(3)
< limsupP*{Cn (5) < F}
n "°°
(C.37)
(4)
<
lim sup
n—*oo
P*{C n {5) < F)
(5)
< P{C{5) < F)
^
no—*oo
where
(1) is
(C.26), (5)
= F} =
Thus, by the Portmanteau
1
P{C{5)<F},
(6) is
< F} =
lim sup
Case when ©/
1.
P* {C n (S) < F}
=
P{C(5) < F}
/
90/,
lemma
Cn (5)
= 90/
and nq n
0/ has a nonempty
-f* p 0:
p. 20),
by (C.34) and the Portmanteau lemma. Thus
n— oo
(C.39)
Case
>
0,
lim inf P. {C n {5)
n ^°°
(C.38)
II.
any n
by (C.28) and the Portmanteau lemma (van der Vaart and Wellner (1996),
by the Portmanteau lemma, and
such that P{C(5)
Step
for
-
d
C{5).
Consider two cases:
interior relative to 0, so that
30
0/
(2)-(4)
for
any
by
F
Case
Case
1.
2.
6/ has an empty
nq n ->„
Case
2.
I
has taken care
We
0,
of.
have by definition of £ n (8i, A) and C n
C„=
fC4H
We
90/.
In this case by Assumption C.l
(C.40)
which Step
=
0/
interior relative to 0, so that
sup
,
MM-
4(0/,
inf
A).
need to show that
(C.42)
(sup4(0/,O),
in{0,,X)
inf
->d
J
8,+A/vATse
\8,ee,
sup
(
/
4o(0/,A),
inf
S/eae/.Aev,,.^,)
\0,eae,AeA«,(e,)
4o
(0/, A)
Define
(C.43)
M
n :=
M :=
and
4(0/, A)
inf
«;+A/v/See
Below we show only the marginal convergence
the arguments of the proofs of Step
I
M
and Step
n
inf
S/gae/.AeVooffl,)
—»d M. The joint
4>(0/,A).
convergence (C.42) follows by combining
and applying the Cramer- Wold Device.
II
Write
M
(C.44)
n
= mm[M n
(K)
,
Mn
{K)],
where
M n (K)~ e,+x/^ee,v(B,,x)>K/y/K 4
M (K):=
4(0/,
v{e,,\)<K/^
inf
(e It A),
inf
n
A),
8,+A/v/Sge,
where
(C.46)
By Assumption
1/(0/,
C.3, for any
some constants C\ >
0,
\imini
(C.47)
for
>
e
and
5
>
A) :=
there
t
is
inf
K
{M n (K)
||0j
9'j\\.
enough so that
large
>
+ A/Vn -
C mm\K 2 ,n6 2 }} > 1 x
that do not depend on
e,
e.
First observe that
M
n
(K)=
=
(C.48);
'
v(8 ,
=
A)
,\)<K / sfH
nf
e,+A/v^ee,
v
4(0/,
inf
9;+A/v^ee,
4(0/,
A)
||a||<k
inf
0;eSe,,A£Vn (9,)n{||A||</<'}
4(0/, A),
where
(C.49)
Equality (1) in (C.48)
can be represented as 6]
Vn {0i)
:=
is trivial.
+
trivially satisfies y/nu(9i, A)
{XeR d
First,
+ A/Vn7 €
0,
for all n'
by compactness of 0/ every
A* /\/n where A*
<
:9i
=
\/ni/(0/,A)
K
31
< K.
6i
e [n,oo)}
+ X/^/n
such that u(6j, A)
Second, every 9j
+
X/^/n with
< K/^/n
< K
||A||
Equality (2) in (C.48)
more
is
(2) note that {8, +
B
+
(9i
K /^{Q)) D G is convex
(by Assumption A.l 6 is convex)
To show
subtle.
{(#/
+ BK/^(0)) nO,9j
it is
the intersection of two convex sets
£ ©/}. Note that
X/y/n,
|A|
< K,9j e G;}
and necessarily contains
that contain
9/.
Hence
nS =
8i, since
for
every
= 9 1 + X/^/n 6 (9 1 + B K /^(Q)) n G, points on the linear segment between 9' and 6j are also contained
{9, + B K/ ^(Q)) n G. Therefore A = ^/n{9' - 9 r e Vn {6,) n {||A|| < K}, and representation (2) follows.
6'
)
By Assumption C.2
(i.-ii.)
inf
(C.50)
i n (e Ir )=>
t^Oj, )
inf
in
L°°(K).
Equipped with (C.48) and (C.50), we can show through the use of the Continuous Mapping Theorem that
M
(C.51)
The proof
of this claim
(K)^ d M(K):=
n
inf
4,(0;, A).
stated below.
is
Hence by (C.47)-(C51),
any
for
e
>
there
lim inf P.
(C.52)
{
a sufficiently large K(e) such that
is
n
M
=
n
(K
>
(e)) }
-
1
e.
Note that
M=
(C 53)
M
exists a.s. in
<
by C.2,
By
R
M<
inf
x
l
(0,, A)
=
lim
M (K)
by the Monotone Convergence Theorem. Since 4o
oo
a.s., i.e
M
(C.51)-(C,52), for any
e
is
>
lim sup
"^°°
(C.54)
{9j, A)
and each closed
P {M n
f
e
—
>
and K(e)
—
>
6 F}
oo accordingly,
<
{M n (K (e))
lim sup P*
n ^°°
it
e
F} +
e
e.
follows that
limsup P*{A4 n
(C.55)
and
set F,
<P{M{K{e))€F} +
Letting
>
tight.
GF}
<P{M
eF}.
n—KX>
By
the Portmanteau
Lemma
conclude that
M ^
(C56)
Proof
n
otM n (K)-* d M{K).
M.
Observe that
Vno (9i) C Vn {9,) C
(C57)
d
^,(0;), for
all
n > n
so that
A4„(iO:=
<M
(C5&)
inf
n
{K)=
<TOff):=
By
(C.50)
(C59)
inf
^ M(K)
d
4(19;, A)
inf
9/e3e,,Asv„
and the Continuous Mapping Theorem
jA~
n {K)
t n {fii,X)
for
(s f )n{||>,||<if}
4 (5;, A).
any fixed n
:=
inf
fl;eae,,AevnD (e,)n{||A||<K}
32
<«, (*/,*)",
finite at least for
A
=
in
and
M
(C.60)
Moreover, as no
—
>
n
(K)^ d
inf
{||A||
/ 1^(0,) n
< K)
so that by continuity of A i—> £oo(# 7 A) at each 0/ G
,
M
(0 7l A).
< K)
{||A||
for each
7
G 50,,
30/
\ A4
M(K)
(C.62)
The
*
«;eae/,A6K»(e,)n{||A||<K-}
oo
Vno (0/) n
(C.61)
M {K):=
a.s.
details of proving (C.62) are nearly identical to those given in (C.31)-(C36), so are not repeated.
Let
F
be any real number such that
P{M{K) = F} =
P{M(K) < F} =
0,
then
P.{M n (K) < F}
lim
n—*oo
(2)
>
\im sup
P'{M n {K)
< F}
n—KX)
(3)
<F}
> liminfP,{X n (^)
n -°°
(C.63)
(4)
>
\iminf P.{M n (K)
< F}
n—*oo
(5)
> P{M(K) < F}
^
n D —*oo
where
(1) is
by the Portmanteau lemma and (C.60),
=
limsup.P*{.M„(iO < F}
(C.64)
(2)-(4)
for
by (C.58),
any
liminf P„{M n [K)
n— oo
any n
>
1
P{M[K)<F),
by (C.62) and the Portmanteau lemma. Therefore,
(6) is
for
n-»oo
real
F
(5)
by the Portmanteau lemma, and
such that
P{M{K) = F} =
< F} = P{M{K) <
F}.
Hence by the Portmanteau lemma
M
(C.65)
n
(K) -> d
M{K).
U
Appendix
Steps
Step
I
I:
and
II
prove Claim
Case when nq n
—p
(D.l)
I
and Step
0.
—p
0,
we have
III
Recall that
l n (Bi, A)
Since nqn
D.
:=
Proof of Theorem
proves Claim
Q(0 7 )
=
II.
and
n{Q n {6, + X/y/n) -
Cn
=
sup
M0/,O) +
o p (l).
e;ee,
From
(D.3)
the Proof of
Q(0,)).
that
(D.2)
Theorem
3.1
we have
that
sup *„(0/,O)
33
3.2
= O p .(l).
Hence since k
—p
oo
C n < k wp ->
(D.4)
Condition C.3 implies that
wp —
=>
1
for
K—
any
—
Cn ('k) wp
6/ C
>
oo with
=>
1
—
i<"/fc
/i(0/,Cn (fc))
=0 wp
-+
1.
there are positive constants C\ and 5 such that
>
1
>
n
Ci
(D.5)
uniformly in
By
(0/, A)
definition of
> K/y/n, where
such that v(9[, A)
Cn
nQ n (9) +
sup
2
2
<5
,
< 4(#/, A)
]
=
v(6i, A)
inf S e e,
'
+ X/y/n —
||#/
8'j\\.
=
and since Q(6i)
(fc)
(D.6)
min[i/(0/, A)
•
=
o p {l)
sup
4(0/
,
+
A)
o p (l)
<
k.
e,+\/y/nec n (k)
eec„(k)
Hence
(D.7)
9,
Then the claim
is
that
wp —
+
Cn (k)
X/y/n £
implies £ n {6 It X)
cn (k) c e f /c
+1
:
\\t\\
<
(D.9)
)i/2/
'
=
inf
\0i
+
-
X/yfr
where
4/c
wp —
for this pair (5/, A),
by (D.9) and by
d n-min[^(e
—
oo and k/n
wp —
>
(b)
1,
is
by (D.5),
fc
(c) is
Combining (D.4) and (D.S)
d H (Qi,
(D.ll)
Step
it
—
>
/
,A)
-/-*
p 0.
2(fc/Ci)
#/
1/2
2
2
,<5
< 4(0/, A) <
]
fc
so that Cj n.-min[v(^/, A)
+ X/y/n
follows that
2
2
,
<5
]
> Ci n'min[4(A;/C
wp —
»
and the centered version be denoted
/i(0
2( " /Cl)
'
/2/
^, 9/) <
2('k/C 1 )
as (in this proof only):
Cn {c):= \8:n(Q n (9)-q n )<c\,
where
q n := inf
Observe that the following key relationship between the two sets
Cn {k) = Cn {k + nq n
(D.14)
(D.15)
>
oo but k
=
o p (n).
From
the proof of
M„
)
for all k
Theorem
= nqn = O p (l)
2.1
since
>
0.
we know that
Mn —
></
Hence
(D.16)
)/n, J
fc:=(fc
2
]
>
true.
1
Cn {c):=[8:n{Q n {9))<c],
—
is
1
1
'2
/^.
Observe the relationship between the "centered" and "un-centered"
(D.12)
Let k
Cn (k),
+ op (l),
dence regions. Let the un-centered region be denoted as before:
(D.13)
6
/^.
by (D.7). This yields a contradiction. Thus, (D.8)
Cn (k)) < h(Cn ('k), 8/) <
Case when nq n
II:
>
some
1
<
4fc
(a) is
>
>
8'A
0/66/
(D.10)
^,
c,9j € ©/}. Suppose otherwise, then for
v{0i,X)
Then
o p (l).
1
>
(D.s)
where 0; := {9j
<k +
+ ng„) =
34
fc-(l
+ op (l)).
M.
Q n{9).
confi-
By
(D.14)
Cn {k) = Cn (k),
(D.17)
Hence by (D.ll) and (D.16)
it
wp —
follows that
d H {Cn {k),Q,) < d H (Cn (k),
(D.18)
and the claim now
Step
III
>
t
1/2
/V^ <
1 )
(2(fc/C 1 )
6/ C
G
that for k
II,
Cn {'k) C
[c
where
",
7
wp —
Inn,
,Ci]
b{
sup
Q
>
1
h
{9)
-
= {0/ + t
< e n 6; 6
We have that k € [cq, c\]
:
\\t\\
,
0/}, and
Inn
0/ C
(D.20)
for
some
C>
<
b(
sup
eee) n
Q
b
{9)
—
tn
=
Hence, by Claim
1.
>
Q)
n
Q b {0)) =
sup
wp
- sup Q
-»
1,
where e n
o p (l),
=C
In
n/^/n
(0/))
fc
&(Q 6 (0/+A/V6-Q(0/))- sup b(Q b (8,)) - Q{e,))
e,ee,
4 (0j, A)-
sup
sup 4(0/,
= max[
sup
4(6/,
sup 4(6/,
A),
=
max[
sup
4(0/,.O)
e,eae,
=
max[
sup
4(0/,
sup 4(0/,O)
+ op (l),
sup 4(0/,
- sup
0)]
4(0/, 0)
8,ee t
0)]
+
-
o p (l)
sup 4(0/,
0)
s,€9,
s,ee,
+ op (l)-
0)
e,ee,
sup 4(0/,
sup 4(0/,
0),
B,ee,
=
-
0,ee,
s,eae,
=
0)]
0,ee,
/
e,ese,,|A|<v &£„
(D.21)
0)
0,ee,
e,ee,,|A|<\/6£„
in
° p (l)),
e,ee,
<
assumed
+
0,
sup
(a) follows
(1
I
Bi+x/Vtee]"
where
/
/v ^)
a sequence of positive constants.
is
>
wp —
Cn (k) C
Then, using Q(0/)
0.
/2
see,
see',"
n
1
follows.
proves Claim
(D.19)
0j
).
1
< 2(k/C
; )
= dH {Cn (k) Q I
d H {Cn {k),Q I )
so
sup 4(0/,
0)
B,ee,
°p(1),
from the uniform stochastic equicontinuity of the process A
>—>
sup e/£SQ/ 4(0/, A) over
K
C.2 and from
Vbe n ->
(D.22)
0,
which follows from the assumption that
(D.23)
(b) follows
(D.24)
b/n
—
at
polynomial
rate, so that
vbe n
oc
\/b/n\nn
by the Continuous Mapping Theorem and proof of Theorem
(
V
sup 4(0/,
9|69,
0),
sup
B,€d&,
4(0j,O))-> d
'
(
sup
V
fl;£3e,,AeA„(9,)
3.1
—
>
0;
which implies that as
4o(0/,A),
sup
8/696,
b
—
>
oo
4o(0/,O)Y
'
D
35
Appendix
Verification of A.l
Proof of Theorem
E.
3.3
immediate from the stated assumptions. Therefore we
is
on verifying
will focus
CI
to
C.3.
Step
x'jO
d
=
>
and x'0 < r2
T\ (xj)
Tk
dim(0). Denote
(90/
T=
is
{t
RJ
G
<
:= (Tk {xj),j
<
T\
:
<
t
0;
identified set
<
(xj) for all j
J}.
J) for k
Rd
=
X
{0 £
X'6
=
7^}
is
compact and
X
<
:= (xj,j
J)
is
Rd
has
t:Ti<t< T2 }.
some
for
t
0/
rank,
full
compact.
is
It is
assumed that 0/ C 0.
determined as
90, =
(E.2)
=
0/
{0 e
:
x'jO
=n
(x^
=
or x'jd
r2
(a;,-)
inf e e a ej
||0,
—
from any rk [xj) by the distance proportional to
\\xj
\\5/^/n. Indeed,
That
G I n {8)
0,
G int(0;)
{Oi
:
'
>
8'j\\
I
Suppose otherwise that k
=
This implies
0.
£ 90/ Hence k >
poses a contradiction to 0/
l|T*(sj)-*jftll
\\
Recall that
||xj||
>
1
>
K||e
,
x j\\
=
component
IM^O-x^jH^*
some
j}.
Tk {xj)
>
observe that there exists k
:x' e'I
i
for
_ ^u W/ ^ a6/) v ^ e qq,
for all j since the first
(E.5)
x'.B'j
for
=Tk (xi
),
such that
v(j,fc).
some j and k and some
Q\
G 90/. This
Hence
0.
.
(E.4)
=
x'-0,
,
S/y/n} implies that x^0, must be bounded away
Pi^rfrii^ K>0 Wi4de I ,w ede I
(e.3)
inf
of Xj
H0',-
0/||
is 1.
Hz,
:
=
Tk ( Xj ),
V(j,
fc).
Hence
V0,
||
x%
£90,,
V(j,fc).
Thus,
Qieln(S)
(E.6)
n (x.,) - x^-0, <
=>
--^=||x 7
orT2(xj)-40, >
||
Vj.
-^=11^11,
Recall
(E.7)
4 (0/, A) =
£ ^ [^ (n
j=l
2
(x,)
-
x;-0,)
-
+
""
„ [v^_
E= ^
„., r
+
xjA]
j
-]2
.
(
f2 (*i)
-
*}*/)
-
*J
A
l
Hence
2
(E.8)
* n (0,, 0)
= J2
J [W„ - v^ (^0/ j=i
r, (x 3 ))
]
+
+
E ^ [^W - ^ K^ "
T2 (*i))
]
'
j=i
Hence
sup
(E.9)
\e n
(0/, 0)
|
<
Choose any
e
>
0.
Since
||xj||
E -n f^w -
2
^||x,
L
e,er„(6)
(E.10)
:
of rank
Thus
1,2.
=
:
determined by a set of inequalities: 0, := {0 £
is
assumed that the d x J matrix
It is
6/
(E.l)
Since
The
Verification of C.l:
I.
JTi
>
1,
Wij
limsupmaxP (Wij >
= Op
k;5||x.,||)
and
(1)
<
e
W
2j
and
36
||]
J+
= Op
+
E -n [^' +
2
«*ll^
111
Jit
(1) for
each j
<
J, for 5 sufficiently large,
maxlimsupP {W2 j < — k<5||xj||) <
e.
C.l
is
now
verified, since
sup
(E.ll)
Step
II.
any
follows that for
it
Verification of part
#>
.
of C.2:
i
=
(9,,a)
there exists & sufficiently large such that
0,
= 0wp > 1-e.
|4(<?/, 0)|
Write
^
>
e
=
4 (0/, A)
l)
(9/, A)
£n
+
2)
4
(9,, A)
where
,
2
[w„ - x ;a]
=
foe!
i
+
3=1
Ti
fo))
(E.12)
3
=1
/l
J
L
J
E J [™V - 4
+
3
Note
first
2
4 >(0
J
4o
(E.14)
(9j, A)
relative to
R
x'jXJ
C 0/
;
4 (9j, A)
=n
(x' 9 l
]
+
(E.15)
x'ji eI
I
=
1, ...,d)
has rank d and
I,
4 (#/, A)
J
1
(a:,))
given by
is
2
+ £p, [w2j - !bJa]_1
-
(xjfl/
=
r2 {x 2 ))
.
3=1
n < oo amounts
for
Step
in
finite-dimensional limit of
that are defined as a collection of solutions to
such that (xj n
^ fa))
<
1 {x'jOl
]
= Owp -1.
0/
^
<90/, that
examine the finite-dimensional
C.2-ii(a) requires us to
.
problem of finding sup e;g ge
Vj
,A)
Verification of part ii(a) of C.2: First, suppose that
d
T* ( x 3))
2
= Y,Vj [Wij 3=1
III.
7
Mapping Theorem the
Therefore, by the Continuous
-
by an argument similar to that
fixed A,
(E.13)
Step
2
>/" ( x i°'
-
A
=1
56/ and
that for fixed 8j G
-f-
=T
kl
(fc/,j?i)
the interior of
limit theory of
to choosing 9j
among
0/
sup fl;gSe;
is
nonempty
4 (9j,
•).
The
the finite subset of points
systems of rf-equations of the form:
all
(x jl ),
is
l
=
l,...,d,
G {1,2} x {1,2,
...,
J}
for
each
I.
There
is
only a finite set of
such systems of equations. Then,
sup
(E.16)
4(6> 7 A)
,
= max
4(0,, A),
for
n <
Hence by the Continuous Mapping Theorem the finite-dimensional weak
by maxe, e y, 4o
(9j, ),
and therefore the finite-dimensional weak
oo.
ma-xg^v,
limit of
limit of
sup e/eSQf
4 ($/.
4 (0/, )
is
•)
is
given
given by
su Pe,eae,^oo(5/,-)-
Second, suppose that 0/
=
90/, that
4 {0
inf
A)
It
is
= £l(7i
(
Xj
)
=
r2 (*,)) S.
R d Then wp —
>
.
1
(wl3 - x 'xY
j~J
(E.17)
inf
'
Therefore, by Continuous
infa ;e e ; too (9j, ).
the interior of 0/ empty relative to
The
*«, (0 7 A)
,
'
=
£
1
(n fo)
=
r2 (x,-))ft (^13
- ^A)
2
.
J<J
Mapping Theorem
the finite-dimensional limit of infg, e e,
joint finite-dimensional convergence of (inf^ge,
37
4 (0/,
•)
,
4 (9j, )
4 (9i,
sup 9;£Q
is
,
given by
•))
follows
by the Continuous Mapping Theorem.
Step
Verification of part ii(b) of C.2: Let
III.
max 8
tinuity for
fixed 9 1 £
(n ($/i
')
90/,
,
K be any compact subset of R d
A £ K, where Vj
e v, £ n (9j, A) over
proven in Step
I.
it
(9j,
suffices to
) to a convex continuous function
show that
for
any
infs, e e,
{Si,
t-n
•)
^
(9j,
over
•)
Kd
,
K also follows from convexity
£
Likewise, stochastic equicontinuity of infg ;£ e ; in (#/, A) in A
and finite-dimensional convergence of
verify the stochastic equicon-
equicontinuous in A £ K. This immediately follows from convexity of
( n (#/. A) is stochastically
A an d finite-dimensional convergence of l n
in
To
.
a finite subset of 99/,
is
to a convex continuous function infe, e e ;
i^
(9i,
•),
cf.
(E.17).
Step IV. Verification of part
whenever
iii
= i^/, and
Tj (xj)
Any A £ A^
of C.2:
x^-A
<
whenever r2
£ 90/
(#/) for 9[
(x^)
=
the property that
satisfies
a:'
A
>
Hence
x'ftj.
J
sup
l^
e,ede,,xeA„(e 1
(E.18)
(<?/,
YV- ([W^l + \W2j )l)
<
A)
Step V. Verification of C.3: For each 9 n
<
oo
a.s.
j=1
)
such that v{9j,X)
(#/, A)
=
inf e g e,
<
\\6i
+
X/y/n
—
> K/y/n,
9'j\\
decompose
(E.19)
Any
9,
+
=
\/y/n
solution 9}
is
9}
+
X'/y/n, where 9j
subject to
1
<p<
=
arg
inf
8£dQ
*},*/=**,(**).
where A" := (x J(
depends on
J =
(#/, A))
is
l,...,p)
has rank p and
Since the eigenvalues of (X*
=
1
(rk
,
(
Xjl ),
X*)
are
I
=
,
Xj,A*<0
(E.23)
(E.22) ai least for one {j',k') £
Ci||A*||
<
\x'
j
.X*\
and
9\\
||A*||
=
y/nv
(6» 7
,
A)
> K.
*
aZi {ji,k[)
if
fc,
=
i,...,p,
2}
x {!>••>
-f} for
each
I.
Matrix X*' X* (which
sub-matrices of X*' X* that have
full
rank
zero. Let also
and
JK' =
((ji,h),l
=
1, ...,p).
zero,
< IK'A.l..
Cl ||A*||
Moreover, given any index set JK.* for
i1
bounded away from
(E.22)
(E.24)
1, ...,'p)
=
j
many p x p
and whose eigenvalues are bounded above away from
T*
£
(fc(,j()
necessarily one of the finitely
(E.21)
X/y/n -
+
dim(9) binding constraints of the form:
(E.20)
By
\\9,
i
£ J7X":
or
1
x' X*
jt
>
if
/c,
=
2.
J<C:
so that x'
}
.
A*
<
-Ci A*
||
||
38
if
fc*
=
1
or
x'
}
.
A*
>
Ci||A*||
if
fc*
=
2.
p,
Decompose
i n (0/, A)
^
=
A)
(<?/,
+ i^
(Qi, A)
,
where
(E.25)
L
//,
J -^
/£
J
L.
-
7=1
>— [Wii--Vn(^-.A*)]
l(fc*
= l)+[Vn(^.A*)-W2j-]
2
> Pj-(l +
P
(1))[WV +
Cl ||A*||]
1
(**
=
+
1)
Cl ||A*||
-
1 (fe*
= 2)
2
W
(*•
1
2j-]
=
2)
[
2
>pJ .(l +
0p (l))[min[vV u .,-Vy2j .]+c 1
||A*||]
>minpj(l+Op(l))[min[Wi J-,-W'2 i ,.j<
i<j
+
J]
Cl ||A*||l
J+
I
,
and
£(*,, A) .=
[Wu -
J2
v^* - nfe)) - ^A*)]
2
i(x'tf
> n( Xj ))
3=1
V
J
J
Hence, recalling that
>
constants c
\
W
= Op (l),
\
wp >
1
—
for
and
c'
any
e
=l
=
||A*||
>
>
2
y/nu(9i, X), one has that
where W_ := min UVij,— W2j,j
0,
we can
0,
select
K
large
(#/,A)
> c(]y +
< /
Let y/nv(9j,X)
enough so that
•
c'||A*||)
+
=
for
,
some
||A*||
W > —c'y/rlv(9j, A)/2 wp >
>
1
positive
A".
—
e.
Since
Then
e,
1
e^(e I ,x)>
(e.27)
Since A>(0/, A)
>
#
]
(0/, A),
C.3
is
I.
Verification of A.l
is
-cc'v(e 1 ,xf.
verified.
Appendix
Step
tn
F.
Proof of Theorem
3.4
immediate from the stated assumptions and from
{m
t
(9),9
e 0} being
Donsker (hence Glivenko-Cantelli).
Step
II.
Step
III.
Verification of C.l
(F.l)
condition
ii.
A(6>/)
condition
(F.3)
= ^mifff, +
=
(Gnrmtfi
x
W
n {8,
that {rrii(9),9 €
G n mi(0/ +
(F.2)
where
not needed because 0/
= 90/
in 0.
Verification of C.2 Write
e n (6r,X)
By
is
is
+
X/y/n)'Wn {0,
X/y/n)\/nE n
+ A/Vn) + y/^Emi(9, +
X/y/n) (G n m,{9,
0} forms a Donsker
X/y/n)
+
+
X/y/n)
class for
= G n mj(0/) + op (l)
=>
m
i
{e I
+
X/y/n)
X/y/n))'
+ y/n'Em^e, +
X/y/n))
any compact set K,
A(0 7 ),
in
L~(9/ x K)
the Gaussian process with the covariance functions given in the statement of
iv.,
Wn
(6,
+ X/y/n) =
W{8,)
+
39
o p (l), in
^(0;
x K).
Theorem
2.4.
By
By
condition
iii.,
y/nEm
(F.4)
l
(& 1
+
=
X/y/n)
+
G(0j)'A
o(l), in
L°°(0/ x K).
Hence
(F.5)
Hence
C.2-i.
C.2-ii.(a)
is
is
is
4-
convergence
is
and u n (A)
weak convergence
in
,
L°°(0 7 x
AT).
implied by weak convergence in L°°(Q I
sup e(66j £ n (9i, A) are continuous transformations of £ n {9,
implied by
+ G(0 7 )'A)
x W(0j) x (A(0j)
G(0j),'A)'
by the Continuous Mapping Theorem since the functionals
stochastic equicontinuity of n (A)
which
(A(0 7 )
verified, since finite-dimensional
verified
=
and u n (A)
=
=* *oo(0/, A)
t n {6i,X)
A).
l
n (\)
=
mfg /6 e
;
K)
£ n (8j, A)
C.2-ii.(b) follows as well, since
implied by stochastic equicontinuity of £ n (9j, A) over
is
of £ n (9i, A) in L°° (0/ x
x K).
0/ x K,
to the quadratic form of a Gaussian process,
4o(0/,A).
By
Step IV. Verification of C.3.
condition
Gnm
(F.6)
where A(#)
l
forms a Donsker
ii.
m,i(Q)
{9)
=> A(<?) in L°°(0),
Hence
class.
the Gaussian process with the covariance functions given in the statement of
is
Theorem
2.4.
Hence
sup
see
(F.7)
By
condition
—p
iv. £n
f
uniformly
iv.,
>
in 9
Hence wp
0.
—
„,..,2
Anf ... _ \n(v(9
lt
v(b,,\)>k/s/E
(
>
Wn
€
=
||G n mi(0)||
= W(9)+op (l).
(9)
Define £ n
^ A5^)J
A) 2
:
^
2
..
.
^»
inf ...
>
_
||V^"H(»/
inf
y/n(i;(0/,A)
II
r/2
in f
„(9„A)>K:/ys
from the preceding one, as
\\
y/nEmj(9,
II
some constant
|l
2
A<5
Op(1)
,
2
2
2
v n(w(6' / ,A) A5 )ll
/
)
O p (l)
v^£^,(g/ + A/v^)
y/n^/.A)
II
it
uses the sup
+
A/y/w)
C>
0.
For a given
e
>
^71(^(61/, A)
0,
K
2
A
5
2
norm
2
A(5
2
)
y/n(v{6,, A) 2
)
II
IVn(i;(e / ,A) 2
wp >
1
—
A5 2 )lloc
large, so that
<c/2
e,
(F.ll)
Hence wp
^
can be made arbitrarily
inf
!"
{e
X
m )>C' = Z/2(C/2)
2
!; 2 l 2
(
v(9,,x)>K/ %/n\n(v(9
Ad ) J
I ,X)
>
1
—
(F.12)
Hence C.3
e,
e n {9i,X)
is
>C n-{v(6
{
,X)
2
AS 2
verified.
40
)
for all
II-
A
5 2 ) H<
instead of the Euclidean norm.
Hoc
~pv
P (1)/
(F.10)
Hence
+ Vv'n)
f/2
the partial identification condition (3.15)
(F9)
for
(Wn (9)). By condition
n{v{9i,X) 2 A<5 2 )
v(e,,\)>K/yfc
By
mineig
infg s e
vtf,,\)>K/-fn
>
line is different
=
1
(F.8)
Note the
0,(1).
v{9,,X)>
K/^.
wp >
1
—
e
Appendix
Suppose that one
when
well motivated,
when
analysis or
some parameter
interested in a special parameter 9" inside 6/.
is
there
is
it
Relationship to Pointwise Inference
G.
is
a sense in which 9*
the "truth"
is
The
inference about
This case typically arises
.
some
in
0/
is
believed that the models are correct representations of data-generating processes for
value,
i.e.
there
0/
of data P. In this scenario,
is
is
9*
Rd
G
such that the model law Pg agrees with the actual stochastic law
not of interest per
se,
but rather 9*
In
is.
method
moments
of
IV quantile estimation by Chernozhukov and Hansen (2003), and
GMM weak or complete unidentification
Kleibergen (2002). Imbens and Manski (2004) investigate the Wald type inference about 9"
case where a real parameter of interest
is
known
to
lie
in
settings,
a
dynamic model with censoring by Hu (2002),
similar problem has been already investigated in the context of
for
by
the special
an interval with endpoints that can be consistently
Here we provide further insights concerning the pointwise inference
estimated.
9* in
non-structural
in its relation to regionwise
inference.
Assumption A. 4. Suppose
~
a n (Qn(9i)
where C(9i)
is
a
random
—
there exists a n
>
co such that
Qn)
-d
C(9 r )
variable. Moreover, for at least
has continuous distribution function on
(0,
for
G.l. Suppose that Assumptions A.l and A. 4
liminf n _ 00
p{r eCn {c*a )} >a.
0,
one 9j £ ©/, C(9j)
co); otherwise, C(9j)
Theorem
9, e
all
=
Let c*
hold.
>
with positive probability and
with probability one.
=
sup 9; c a (9[). Then for any 9* in Qj
Proof:
Cn (c*a )} =
liminf P{9' 6
n —>oo
liminf
>
n—*oo
liminf
n —>oo
P{a n {Q n {9') -
qn )
< sup c a {9)}
a
P{a n (Q n (9 t ) -
qn )
<
c a (9')}
>
(1 or
a)
>
a,
D
The theorem above
is
constructed using the Anderson-Rubin pointwise testing principle. Notice that the
confidence set constructed in the theorem above
Theorem
2.1.
Notice also that as a byproduct of our derivation, inequality
Cn (ca (-)) =
(G.2)
also has the
necessarily smaller than the regionwise confidence set in
is
{9* e
6/
:
an
generally not equal
is
generally a special case
Qn{9) :=
A more
moment
14
let c a (9)
of our construction.
is
for all 9
6
Cn (c*)
of the function a n
Qn
13
(9).
The
IV
quantiles.
set
Cn (c a (-))
However, this set
This can be seen by defining the new objective function
14
7
.
be the subsampling estimate of c a (9)
for
each 9 £ 0;
recent paper than ours, by Ho, Pakes and Porter (2004), has also proposed such sets in the context of
inequalities.
This
also use this technique for
this set for the case of interval-identified parameter.
smaller) than the level set
Q n (9)/max[c a {9),t}
Further
13
(is
set
set (G.2) in the context of a partially identified
dynamic censored regression model. Chernozhukov and Hansen (2003)
is
shows that the
(Q n (9*) - qn ) < c a (5*)}
a pointwise coverage. Hu (2001) proposed the
Manski and Imbens (2004) construct
(1) in (G.3)
The
difference
is
that they do not directly work with objective functions.
an equi-quantile transformation of the original objective function. In many examples
as objective functions
have the equi-quantile property by using optimal weights.
41
this
is
unnecessary,
Theorem
G.2. Suppose that Assumptions A.l and A. 4 hold. Let
liminf n ^oo
P<
Proof:
0*
Cn (c^) >
G
a. (Likewise, a corollary is that C"n (c*
>
—*
6/ C 6/ wp
Since
c*
1, it
liminf P{6* E
n—*oo
=
sup S/ c a (6j). Then for any 6* in Q;
(•))
also covers 6* with probability a.)
follows that
Cn {c'a )} =
>
(G.3)
liminf
n—*co
P{a n (Q n (0*) -
< supc Q (0)}
qn )
q
liminf
P{a n (Q n (e , )-q n )<c a {6')}
liminf
P{a n (Q n (8*) -
n ^°°
(2)
>
>
Equality
(1 or
from the standard argument
(2) follows
proof of Theorem
n—>oo
2.1.
Inequality (1) shows that
(
a)
for
>
qn )
<
+
c a (e')
o p (l)}
q,
subsampling,
Cn (c^(-))
e.g.
as the
one presented
in
also covers 9* with probability a.
Step 2 of the
).
[CONSISTENCY AND RATES HERE TOO?]
Notation and Terms
—p
—>d
convergence in (outer) probability P*
convergence in distribution under P*
wp —+ 1
wp > 1 — e
with inner probability P, converging to one
with inner probability P, larger than
closed ball centered at x of radius 5
Bs(x)
Donsker
—
e for
sufficiently large n,
identity matrix
/
normal random vector with mean
Af(0, a)
T
1
>
and variance matrix a
means that empirical process /
here this
class
t—> -4=
YM=i(fiWi) ~ Ef(Wi))
is
asymptotically Gaussian in L°°(T), see Vaart (1999)
L 00 (Jr )
metric space of bounded over
minimum
mineig(A)
T functions, see Vaart
eigenvalue of matrix
(1999)
A
References
Andrews, D. W. K.
(1999):
"Estimation when a parameter
"Inconsistency of the bootstrap
(2000):
when
is
on a boundary," Econometrica,
a parameter
is
67(6), 1341-1383.
on the boundary of the parameter space,"
Econometrica, 68(2), 399-405.
ANGRIST,
J.,
AND
CHERNOZHUKOV,
Krueger (1992): "Age at School Entry," JASA, 57, 11-25.
C. HANSEN (2001): "An IV Model of Quantile Treatment
A.
V.,
AND
V.,
AND
Effects,"
forthcoming
in
Econo-
metrica.
CHERNOZHUKOV,
H.
HONG
(2003):
"An
MCMC Approach to Classical Estimation,"
Journal of Econometrics,
115, 293-346.
ClLIBERTO,
AND
F.,
Tamer
E.
(2003):
"Market Structure and Multiple Equilibria
in Airline
Markets," Working
Paper: Princeton University.
Demidenko,
E. (2000):
Gisltein, C.
Z.,
Hansen,
L., J.
AND
"Is this
E.
HEATON, AND
of Finacial Studies,
8,
the least squares estimate?," Biometrika, 87(2), 437-452.
Leamer
(1983): "Robust Sets of Regression Estimates," Econometrica, 51.
E. G.
Luttmer
(1995): "Econometric Evaluation of Asset Pricing Models,"
237-274.
42
The Review
HOROWITZ,
AND
J.,
C.
MANSKI
(1995):
"Identification
and Robustness with Contaminated and Corrupted Data,"
Econometrica, 63, 281-302.
"Censoring of Outcomes and Regressors Due to Survey Nonresponse: Identification and Estimation
(1998):
Using Weights and Imputations," Journal of Econometrics, 84, 37-58.
Hu,
"Estimation of a Censored Dynamic Panel Data Model," Econometrica, 70(6).
L. (2002):
AND
Imbens, G.,
MANSKI
C.
Department
"Confidence Intervals for Partially Identified Parameters,"
(2004):
of
Economics, Northwestern University.
KLEIBERGEN, F.
(2002):
"Pivotal statistics for testing structural parameters in instrumental variables regression,"
Econometrica, 70(5), 1781-1803.
KLEPPER,
AND
S.,
E.
Leamer
(1984):
"Consistent Sets of Estimates for Regressions with Errors in All Variables,"
Econometrica, 52(1), 163-184.
Manski, C.
(2003): Partial Identification of Probability Distributions, Springer Series in Statistics. Springer,
AND
Manski, C,
Tamer
E.
"Inference on Regressions with Interval
(2002):
Data on a Regressor
or
New
York.
Outcome,"
Econometrica, 70, 519-547.
Marschak,
AND W.
J.,
H.
ANDREWS
(1944):
"Random Simultaneous Equations and
the
Theory
of Production,"
Econometrica, 812, 143-203.
MlLNOR,
J.
(1964):
"Differential topology," in Lectures
on Modern Mathematics,
Vol.
II,
pp. 165-183. Wiley,
New
York.
NERLOVE, M.
Estimation and Identification of Cobb-Douglas Production Functions. Rand McNally
(1965):
ic
and
Company, Chicago.
Newey, W.
AND
K.,
D.
McFadden
(1994):
"Large sample estimation and hypothesis testing," in Handbook of
econometrics, Vol. IV, vol. 2 of Handbooks in Econom., pp. 2111-2245. North-Holland, Amsterdam.
ROBERT, C.
P.,
AND
G. Casella (1998): Monte Carlo Statistical Methods. Springer- Verlag.
VAN DER Vaart, A. W.,
New
AND
J.
A.
Wellner
(1996):
Weak convergence and empirical
processes. Springer- Verlag,
York.
Politis, D.,
J.
ROMANO, AND M. Wolf
Vaart, A. V. D.
(1999): Subsampling, Springer Series in Statistics. Springer:New York.
(1999): Asymptotics Statistics.
Cambridge University
43
352
7 n
Press.
Date Due
Download