WORKING PAPER
DEPARTMENT OF ECONOMICS

TWO-STEP ESTIMATION, OPTIMAL MOMENT CONDITIONS, AND SAMPLE SELECTION MODELS

Whitney K. Newey
James L. Powell

No. 99-06
February, 1999

Massachusetts Institute of Technology
50 Memorial Drive
Cambridge, Mass. 02139
TWO-STEP ESTIMATION, OPTIMAL MOMENT CONDITIONS, AND SAMPLE SELECTION MODELS

by

Whitney K. Newey
Department of Economics, MIT

and

James L. Powell
Department of Economics, UC Berkeley

January, 1992
Revised, December 1998
Abstract: Two step estimators with a nonparametric first step are important, particularly for sample selection models where the first step is estimation of the propensity score. In this paper we consider the efficiency of such estimators. We characterize the efficient moment condition for a given first step nonparametric estimator. We also show how it is possible to approximately attain efficiency by combining many moment conditions. In addition we find that the efficient moment condition often leads to an estimator that attains the semiparametric efficiency bound. As illustrations we consider models with expectations and semiparametric minimum distance estimation.
JEL Classification: C10, C21
Keywords: Efficiency, Two-Step Estimation, Sample Selection Models, Semiparametric Estimation
Department of Economics, MIT, Cambridge, MA 02139; (617) 253-6420 (work); (617) 253-1330 (fax); wnewey@mit.edu (email). The NSF provided financial support for the research for this project.
1. Introduction
Two step estimators are useful for a variety of models, including sample selection models and models that depend on expectations of economic agents. Estimators where the first step is nonparametric are particularly important, having many applications in econometrics and providing a natural approach to estimation of parameters of interest. The purpose of this paper is to derive the form of an asymptotically efficient two step estimator, given a first step estimator.

We show that the efficient estimator for a given first step nonparametric estimator will often be fully efficient, attaining the semiparametric efficiency bound for the model, as occurs for some sample selection models considered by Newey and Powell (1993). Full efficiency occurs because the first step is just identified, analogous to the efficiency of a limited information estimator of a simultaneous equation when all the other equations are just identified. An analogous result for two-step parametric estimators is given in Crepon, Kramarz, and Trognon (1997), where optimal estimation in the second step leads to full efficiency if the first step is exactly identified.

We will first give some general results that characterize second step estimators that are efficient in a certain class, and consider construction of estimators that are approximately efficient. We then derive the form of efficient estimators in several specific models, including conditional moment restrictions that depend on functions of conditional expectations, and sample selection models where the propensity score (i.e. the selection probability) is nonparametric. We also describe how an approximately efficient estimator could be constructed by optimally combining many second step estimating equations.

Throughout the paper we rely on the results of Newey (1994) to derive the form of asymptotic variances and make efficiency comparisons. Those results allow us to sidestep regularity conditions for asymptotic normality and focus on the issue at hand, which is the form of an efficient estimator. In this approach we follow long standing econometric practice where efficiency comparisons are made without necessarily specifying a full set of regularity conditions. Of course, we could give regularity conditions for specific estimators (e.g. as in Newey (1994) for series estimators or Newey and McFadden (1994) for kernel estimators), but this would detract from our main purpose.

As an initial example, consider the following simple model in which the conditional mean of a dependent variable y given some conditioning variable x is proportional to its conditional standard deviation:
y = β₀σ₀(x) + u,  E[u|x] = 0,  σ₀(x)² = Var(y|x).  (1)

Given a sample {(yᵢ, xᵢ')', i = 1, ..., n} of observations on y and x, one type of estimator for β₀ would be an instrumental variables (IV) estimator, with a nonparametric estimator σ̂(x) replacing σ₀(x) and an instrument a(x) used to solve the equation

Σᵢ₌₁ⁿ a(xᵢ)[yᵢ − βσ̂(xᵢ)] = 0

for β. For example, the least squares estimator would have a(x) = σ̂(x), so that

β̂ = [Σᵢ₌₁ⁿ a(xᵢ)σ̂(xᵢ)]⁻¹ Σᵢ₌₁ⁿ a(xᵢ)yᵢ.
If the data generating process and the nonparametric estimator σ̂(x) are sufficiently regular, so that σ̂(x) is root-n consistent and asymptotically normal, then the formulae given in Newey (1994) can be used to derive the following form of the asymptotic distribution for β̂:

√n(β̂ − β₀) →d N(0, (E[a(x)σ₀(x)])⁻² E[ζ²a(x)²]),  (2)

ζ = u − [2σ₀(x)]⁻¹β₀(u² − σ₀(x)²).  (3)

The asymptotic variance of this estimator is that of an IV estimator with instrument a(x) and residual ζ. The efficient choice of instrument, as in Chamberlain (1987), is σ₀(x)/ω(x) for ω(x) = E[ζ²|x]. The novel feature of this optimal instrument is that it depends on the inverse of the conditional variance of the adjusted residual ζ, rather than the original residual u.
If v = u/σ₀(x) is independent of x, then ζ = σ₀(x)[v − β₀(v² − 1)/2], so that E[ζ²|x] is proportional to σ₀(x)² and the best instrument is 1/σ₀(x), the same as if the first stage estimation were not accounted for. In general, though, it is necessary to account for the first-stage estimator in forming the optimal instrument for the second stage.

The best IV estimator is weighted least squares with weight ω(x)⁻¹, so that for an estimator ω̂(x) that is suitably well behaved,

β̂ = [Σᵢ₌₁ⁿ ω̂(xᵢ)⁻¹σ̂(xᵢ)²]⁻¹ Σᵢ₌₁ⁿ ω̂(xᵢ)⁻¹σ̂(xᵢ)yᵢ  (4)

should be efficient. As in Newey (1993), estimation of the optimal instrument should not affect the asymptotic variance of the weighted least squares estimator. Alternatively, as we discuss below, an approximately efficient estimator could be constructed by GMM estimation with moment conditions A(x)[y − βσ̂(x)], where A(x) is some vector of approximating functions.
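As a purely illustrative check of the two-step recipe in equations (1)–(4), the sketch below simulates the model, estimates σ₀(x) by Nadaraya–Watson regression in the first step, and applies the least squares choice a(x) = σ̂(x) in the second step. The data generating process, kernel, and bandwidth are our assumptions, not the paper's.

```python
import numpy as np

def nw(x_eval, x, y, h):
    # Nadaraya-Watson kernel regression estimate of E[y|x] (Gaussian kernel).
    k = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

rng = np.random.default_rng(0)
n = 2000
beta0 = 2.0
x = rng.uniform(1.0, 2.0, n)
sigma0 = 0.5 + 0.5 * x                     # true conditional standard deviation
u = sigma0 * rng.standard_normal(n)        # E[u|x] = 0, Var(y|x) = sigma0(x)^2
y = beta0 * sigma0 + u                     # model (1)

# First step: sigma_hat(x)^2 = E_hat[y^2|x] - E_hat[y|x]^2.
h = 0.1
var_hat = np.maximum(nw(x, x, y**2, h) - nw(x, x, y, h)**2, 1e-8)
sigma_hat = np.sqrt(var_hat)

# Second step: IV estimating equation with the least squares instrument
# a(x) = sigma_hat(x), as described after equation (1).
beta_hat = (sigma_hat @ y) / (sigma_hat @ sigma_hat)
```

With an estimate ω̂(x) of ω(x) = E[ζ²|x] in hand, the same last line yields the weighted estimator (4) after replacing the instrument σ̂ by σ̂/ω̂.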
2. General Methods
To describe the general class of estimators we consider, let m(z,θ,α) denote a vector of functions, where z denotes a single data observation, θ a parameter vector, and α an unknown function. Suppose that the moment restrictions

E[m(z,θ₀,α₀)] = 0  (5)

are satisfied, where the "0" subscript denotes true values. This moment restriction and an estimator α̂ of α₀ can be used to construct an estimator θ̂ of θ₀ by solving the equation

Σᵢ₌₁ⁿ m(zᵢ,θ,α̂)/n = 0.  (6)

The class of estimators we consider are of this form, with m(z,θ,α) restricted to be in some set of feasible moment functions. In the example of Section 1, z = (y, x')', θ = β, α = σ, and m(z,β,α) = a(x)[y − βα(x)].

To derive the asymptotic distribution of θ̂, accounting for the presence of α̂, we assume that for each m(z,θ,α) there is an associated function u_m(z) such that

Σᵢ₌₁ⁿ m(zᵢ,θ₀,α̂)/√n = Σᵢ₌₁ⁿ u_m(zᵢ)/√n + o_p(1),  (7)

i.e. u_m(z) is the influence function of Σᵢ₌₁ⁿ m(zᵢ,θ₀,α̂)/n, with the term u_m(z) − m(z,θ₀,α₀) accounting for the presence of α̂. When equation (7) holds along with other regularity conditions, then for H_m = ∂E[m(z,θ,α₀)]/∂θ|_{θ=θ₀},

√n(θ̂ − θ₀) →d N(0, H_m⁻¹E[u_m(z)u_m(z)']H_m⁻¹').  (8)

When α̂ is nonparametric, the results of Newey (1994) can be used to derive the form of u_m(z).
The efficient two-step estimator we will consider is the one that minimizes the asymptotic variance of θ̂ over all m in the feasible class. In general, the asymptotic variance will depend on the form of m(z,θ,α). To characterize the optimal moment function, i.e. the m̄ with the smallest asymptotic variance, a sufficient condition is that for all m in the class,

H_m = E[u_m(z)u_m̄(z)'].  (9)

When this equation holds then, by arguments in Newey and McFadden (1994), (E[u_m̄(z)u_m̄(z)'])⁻¹ is a lower bound on the asymptotic variance, and the bound will be attained when m = m̄. Equation (9) is analogous to the generalized information matrix equality in parametric models, and similar equations have been used by Hansen (1985a), Hansen, Heaton, and Ogaki (1988), and Bates and White (1993) to find efficient estimators. Here we use this equation to derive the optimal choice of a second step estimator.
This characterization of an efficient two-step estimator can be used to derive the optimal estimator in the initial example. In that example, u_m(z) = a(x)ζ and H_m = −E[a(x)σ₀(x)], so equation (9) reduces to a choice of instrument ā(x) satisfying −E[a(x)σ₀(x)] = E[a(x)ζ²ā(x)] = E[a(x)ω(x)ā(x)] for all feasible a(x). A solution, and hence an optimal instrument, is ā(x) = −ω(x)⁻¹σ₀(x); since rescaling the instrument by a nonzero constant does not change the estimator, this is the weighted least squares choice given earlier.

Construction of an efficient estimator can often be based on the solution to equation (9). Although m̄ may depend on unknown functions other than α, they can often be replaced by parametric or nonparametric estimators without affecting the efficiency of θ̂, e.g. as in Newey (1993). Estimators which are efficient for some restricted class of distributions, referred to as locally efficient here, can be constructed by using finite dimensional parameterizations of unknown components of the optimal moment function. Estimators which are efficient for all distributions can be constructed by using nonparametric methods to estimate unknown components. In the examples to follow we will discuss various estimators of the optimal moment functions which will result in efficiency.
A general approach to efficient estimation, which is useful when m̄ is complicated and it is hard to form an explicit estimate, is to use the efficient generalized method of moments (GMM) estimator based on "many" moment functions. This approach has been considered by Beran (1976), Hayashi and Sims (1983), Chamberlain (1987), and Newey (1993). Under a "spanning" condition, this approach will result in an estimator that is approximately efficient, under appropriate regularity conditions, in the sense that as the number of moments grows, the asymptotic variance of the estimator approaches that of the optimal estimator.

To be precise, consider a J × 1 vector of moment functions m^J(z,θ,α) (which may depend on J). Suppose that for some J × 1 vector u^J(z), equation (7) is satisfied with m^J and u^J(z) replacing m and u_m(z) respectively, and let H_J = ∂E[m^J(z,θ,α₀)]/∂θ|_{θ=θ₀} and V_J = E[u^J(z)u^J(z)']. Let V̂ denote an estimator of V_J (e.g. V̂ = Σᵢ₌₁ⁿ û^J(zᵢ)û^J(zᵢ)'/n for an estimator û^J(z) of the influence function), and consider the optimal GMM estimator

θ̂_J = argmin_θ [Σᵢ₌₁ⁿ m^J(zᵢ,θ,α̂)]'V̂⁻¹[Σᵢ₌₁ⁿ m^J(zᵢ,θ,α̂)].  (10)

An alternative, one-step version is

θ̃_J = θ̄ − (Ĥ_J'V̂⁻¹Ĥ_J)⁻¹Ĥ_J'V̂⁻¹Σᵢ₌₁ⁿ m^J(zᵢ,θ̄,α̂)/n,  Ĥ_J = Σᵢ₌₁ⁿ ∂m^J(zᵢ,θ̄,α̂)/∂θ/n,  (11)

where θ̄ is an initial estimator. As usual, the one-step estimator is asymptotically equivalent to its full optimization counterpart. Both estimators will have asymptotic variance (H_J'V_J⁻¹H_J)⁻¹. As J gets larger the asymptotic variance of this estimator will approach the lower bound, if linear combinations of u^J(z) can approximate the optimal influence function u_m̄(z) in mean square, as shown by the following result.
Theorem 2.1: Suppose that E[u_m̄(z)u_m̄(z)'] is nonsingular, H_J'V_J⁻¹H_J is nonsingular for all J, and there are conformable constant matrices C_J such that E[‖u_m̄(z) − C_Ju^J(z)‖²] → 0 as J → ∞. Then

(H_J'V_J⁻¹H_J)⁻¹ → (E[u_m̄(z)u_m̄(z)'])⁻¹ as J → ∞.
The mean-square approximation hypothesis of this result is the spanning condition referred to above. This result falls short of an efficient estimation result, because it does not specify a way, independent of the true data generating process, for J = J(n) to grow with the sample size so that θ̂_{J(n)} has asymptotic variance (E[u_m̄(z)u_m̄(z)'])⁻¹. It is possible to give such rates in particular problems, as in Newey (1993), but to avoid technical detail rates are not derived here. Instead, we focus on how efficient estimation might approximately be achieved, by choosing moment functions with corresponding u^J(z) that approximate u_m̄(z) in mean-square.
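The one-step update (11) is mechanical once the moment matrix, Jacobian, and variance estimate are in hand. Below is a schematic numpy version; the linear instrumental variables check is a toy of ours (for moments linear in θ the one-step update solves the GMM first order conditions exactly, so even a crude θ̄ works).

```python
import numpy as np

def one_step_gmm(theta_bar, moments, H_hat, V_hat):
    # One-step update (11): theta_bar - (H'V^-1 H)^-1 H'V^-1 * mean moment.
    # moments: n x J matrix of m^J(z_i, theta_bar, alpha_hat)
    # H_hat:   J x p Jacobian estimate;  V_hat: J x J variance estimate.
    g = moments.mean(axis=0)
    A = H_hat.T @ np.linalg.inv(V_hat)
    return theta_bar - np.linalg.solve(A @ H_hat, A @ g)

# Toy check: y = theta0*x + e, with J = 2 instruments (x, x^2 - 1).
rng = np.random.default_rng(1)
n = 5000
theta0 = 1.5
x = rng.standard_normal(n)
y = theta0 * x + rng.standard_normal(n)
w = np.column_stack([x, x**2 - 1.0])

theta_bar = np.zeros(1)
moments = w * (y - x * theta_bar[0])[:, None]          # m^J at theta_bar
H_hat = -(w * x[:, None]).mean(axis=0)[:, None]        # d m^J / d theta
V_hat = moments.T @ moments / n
theta_hat = one_step_gmm(theta_bar, moments, H_hat, V_hat)
```

In a two-step application the only change is that each row of `moments` would be evaluated at the nonparametric first step α̂, and `V_hat` built from influence function estimates û^J(zᵢ) rather than raw moments.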
3. Conditional Moment Restrictions and Nonparametric Generated Regressors

The first specific model we consider is a semiparametric instrumental variables estimator that depends on a nonparametric regression. Let p(z,θ,α) denote a residual that depends on a function α, where

E[p(z,θ₀,α₀)|x] = 0,  α₀(w) = E[d|w].  (12)

The introductory example is a special case of this model. Also, this case includes estimators that have been considered by Ahn and Manski (1993) and Rilstone (1989). The moment vectors will consist of a vector of instrumental variables a(x) multiplying the residual,

m(z,θ,α) = a(x)p(z,θ,α).

The optimality problem here is finding the set of instrumental variables that minimizes the asymptotic variance. To derive the optimal instruments we need to account for the nonparametric estimation, which can be done by imposing the following condition.
Let α(w,γ) denote a parametric specification for the conditional expectation, satisfying regularity conditions along with the conditional moment vector, so that the derivatives in the following condition exist.

Assumption 3.1: There is δ(w) such that for all α(w,γ) with α(w,γ₀) = α₀(w) for some γ₀,

∂E[a(x)p(z,θ₀,α(·,γ))]/∂γ|_{γ=γ₀} = E[a(x)δ(w)∂α(w,γ₀)/∂γ].
This condition leads to correction terms for estimation of α₀ of the simple form E[a(x)|w]δ(w)[d − α₀(w)], as derived in Newey (1994). The value of this approach is that it is based on a simple derivative calculation that is easy to apply. For instance, in the initial example, p(z,θ₀,α) = y − β₀{α₁(x) − [α₂(x)]²}^{1/2} for α₀(x) = (E[y²|x], E[y|x])', and Assumption 3.1 is satisfied with δ(w) = −β₀[2σ₀(x)]⁻¹(1, −2E[y|x]), leading to the correction term given earlier.

The first optimal instrument question we address is for the case where the instruments are a subset of the first stage regressors, x ⊆ w, and for now we assume that p is a scalar. Let D(x) = E[∂p(z,θ₀,α₀)/∂θ|x].

Theorem 3.1: If x ⊆ w and Assumption 3.1 is satisfied, then

u_m(z) = a(x)ζ,  ζ = p(z,θ₀,α₀) + δ(w)[d − α₀(w)],

and the choice of instruments that minimizes the asymptotic variance is a(x) = D(x)'(E[ζ²|x])⁻¹.
An interesting interpretation of these optimal instruments follows upon noting the form of the influence function, which is the instrumental variables times an "adjusted residual" ζ that accounts for the presence of α̂. Consequently, the optimal instruments are the same as for an IV estimator without the first stage, except that the conditional variance E[ζ²|x] has replaced E[p²|x]. The reason that the optimal instruments have this simple form is that the adjusted residual fully accounts for the generated regressors.

The initial example provides one illustration of this case. Another example is the estimator of Ahn and Manski (1993). Suppose there is a binary dependent variable y ∈ {0, 1} with

Prob(y=1|x) = Φ(θ₁₀[α₀(x,0) − α₀(x,1)] + x₁'θ₂₀),  w = (x,y),  (13)

where Φ is the CDF for a standard normal.
Their estimator is probit, with a nonparametric estimator α̂(w) replacing α₀(w) = E[d|w]. As usual for probit, this estimator is asymptotically equivalent to an instrumental variables estimator with

p(z,θ,α) = y − Φ(θ₁[α(x,0) − α(x,1)] + x₁'θ₂),  a(x) = D(x)Ω(x)⁻¹,  (14)
v = θ₁₀[α₀(x,0) − α₀(x,1)] + x₁'θ₂₀,  Ω(x) = Φ(v)[1 − Φ(v)],  D(x) = −([α₀(x,0) − α₀(x,1)], x₁')'φ(v),

where φ(v) is the standard normal p.d.f. These instruments are not optimal.
To derive the optimal instruments, note that x ⊆ w, so that Theorem 3.1 can be applied. To do so, we note that for any functions b(x) and d(w),

E[b(x){d(x,0) − d(x,1)}] = E[b(x)d(w){(1 − y)/(1 − Φ(v)) − y/Φ(v)}].

Then, as in the last example, Assumption 3.1 is satisfied with

δ(w) = −θ₁₀φ(v){[1 − Φ(v)]⁻¹(1 − y) − Φ(v)⁻¹y},

so that the adjusted residual is ζ = y − Φ(v) + δ(w)[d − α₀(w)]. Then, by Theorem 3.1, the optimal instruments are

a(x) = (E[ζ²|x])⁻¹D(x),  E[ζ²|x] = Φ(v)[1 − Φ(v)] + θ₁₀²φ(v)²υ(x),  (15)

where υ(x) = E[{d − E[d|w]}²{[1 − Φ(v)]⁻¹(1 − y) − Φ(v)⁻¹y}²|x]. As in the initial example, the optimal instruments are those for weighted nonlinear least squares, where the weight is the inverse of the conditional variance of an "adjusted residual," rather than the original residual.
An optimal estimator can be constructed by using estimates of the optimal instruments; as usual, estimating the instruments will not affect the asymptotic variance. The optimal instruments may be estimated by substituting θ̂, α̂(x,·), and v̂ for θ₀, α₀(x,·), and v, respectively, in the formula for the optimal instruments, and also substituting an estimator for υ(x). A locally efficient estimator can be constructed by estimating υ(x) as the predicted value from a regression of nonparametric squared residuals, formed using d − α̂(w), on a few functions of x. A fully efficient estimator would require nonparametric estimation of υ(x). Alternatively, an approximately efficient estimator could be constructed from GMM estimation using many functions of x, as considered below.
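The locally efficient weighting step just described — predicting a conditional second moment from a regression of squared residuals on a few functions of x — is a single least squares fit. A minimal synthetic sketch (the residuals and regressors here are stand-ins of ours, not the model above):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3000
x = rng.uniform(-1.0, 1.0, n)
v0 = 0.5 + x**2                                   # true conditional variance
resid = np.sqrt(v0) * rng.standard_normal(n)      # stand-in adjusted residuals

# Predicted value from regressing squared residuals on a few functions of x.
B = np.column_stack([np.ones(n), x, x**2])
c, *_ = np.linalg.lstsq(B, resid**2, rcond=None)
v_hat = np.maximum(B @ c, 1e-3)                   # trim so the weights stay positive
```

If the true conditional variance lies in the span of the chosen functions the resulting estimator is efficient for those distributions (locally efficient); otherwise the fit is only an approximation and the estimator remains consistent but not efficient.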
Another example of Theorem 3.1 is the semiparametric panel probit estimator of Newey (1994), where the y_t (t = 1, 2) are binary variables and there is an unknown function h(x) such that

E[y_t|x] = Φ([x_t'β₀ + h(x)]/σ_t),  (t = 1, 2),  σ₁ = 1.

Inverting the normal CDF and differencing eliminates the unknown function h(x), giving the condition p(x,θ₀,α₀) = 0, where θ = (β',σ)' with σ = σ₂, α₀(x) = (E[y₁|x], E[y₂|x])', and

p(x,θ,α) = Φ⁻¹(α₁(x)) − σΦ⁻¹(α₂(x)) + (x₂ − x₁)'β.

The least squares estimator in Newey (1994) is obtained by substituting a nonparametric regression estimator α̂(x) for α₀(x) and minimizing Σᵢ₌₁ⁿ p(xᵢ,θ,α̂)². As usual for least squares, this is asymptotically equivalent to an IV estimator with instruments

a(x) = D(x)',  D(x) = ∂p(x,θ₀,α₀)/∂θ = ((x₂ − x₁)', −Φ⁻¹(α₂₀(x)))'.

The optimal instruments can be derived by applying Theorem 3.1. In this example, for ψ(a) = 1/φ(Φ⁻¹(a)) we have δ(x) = (ψ(α₁₀(x)), −σ₀ψ(α₂₀(x)))' and ζ = δ(x)'{y − E[y|x]} for y = (y₁, y₂)'. Let

ω(x) = δ(x)'Var(y|x)δ(x) = Var(ψ(α₁₀(x))y₁ − σ₀ψ(α₂₀(x))y₂ | x).

Then, from Theorem 3.1, the optimal instruments are

a(x) = D(x)'ω(x)⁻¹ = ((x₂ − x₁)', −Φ⁻¹(α₂₀(x)))'/Var(ψ(α₁₀(x))y₁ − σ₀ψ(α₂₀(x))y₂ | x).

These instruments correspond to the first order condition for the weighted least squares estimator

θ̂ = argmin_θ Σᵢ₌₁ⁿ ω̂(xᵢ)⁻¹p(xᵢ,θ,α̂)²/n.
This estimator can also be thought of as a semiparametric minimum distance estimator, where θ̂ is chosen to minimize a function that should converge to zero at the truth. The characterization of an optimal IV estimator applies to any semiparametric minimum distance problem where p(x,θ₀,α₀) = 0 for α₀(x) = E[y|x] and a vector y. The optimal instruments will be D(x)'ω(x)⁻¹ for D(x) = ∂p(x,θ₀,α₀)/∂θ and ω(x) = δ(x)'Var(y|x)δ(x). Furthermore, the weighted least squares estimator will be optimal in this class.
Construction of an efficient estimator is straightforward in the case where x ⊆ w. The optimal instruments in Theorem 3.1 can be estimated nonparametrically, proceeding analogously to Newey (1993). Alternatively, an approximately efficient estimator can be constructed by GMM estimation using a vector of approximating functions A(x) as instruments, where the moment functions would be A(x)p(z,θ,α). In this case, the spanning condition of Theorem 2.1 just requires that a linear combination of A(x) can approximate the optimal instruments. For brevity we omit a formal result.
The next case we consider is that where w ⊆ x. This case is more complicated in that the correction for the first stage estimation does not lead to the adjusted residual form of the influence function. Let

ρ = p(z,θ₀,α₀),  V = δ(w)[d − α₀(w)],  Ω(x) = E[ρρ'|x],  Σ(w) = E[VV'|w],  K(x) = E[ρV'|x].

Theorem 3.2: If w ⊆ x and Assumption 3.1 is satisfied, then u_m(z) = a(x)ρ + E[a(x)|w]V. Also, if the linear equations

a(x) = [D(x)' + P(w)' + R(w)'K(x)']Ω(x)⁻¹,  P(w) = −Σ(w)E[a(x)'|w] − E[K(x)'a(x)'|w],  R(w) = −E[a(x)'|w],

have a solution for a(x), P(w), and R(w), then a(x) is the optimal vector of instruments. Furthermore, if K(x) = 0 with probability one, then the optimal instruments are

a(x) = [D(x)' + P(w)']Ω(x)⁻¹,  P(w) = −{Σ(w)⁻¹ + E[Ω(x)⁻¹|w]}⁻¹E[Ω(x)⁻¹D(x)|w].

In general the form of the optimal instruments is quite complicated, although it simplifies in the zero conditional covariance case, K(x) = 0.
An example where Theorem 3.2 applies is a nonparametric generated regressor model where

p(z,θ,α) = y − z'β − γα(w) − δ[d − α(w)],  α₀(w) = E[d|w],  (d,w) ⊆ x.

This residual is that for a linear model where a conditional expectation and the residual from the same conditional expectation are included as regressors. The model is a semiparametric version of a familiar model with many economic applications, and has been considered in Rilstone (1989). Note here that K(x) = 0, and that Assumption 3.1 is satisfied for δ(w) = δ₀ − γ₀ and D(x) = −(E[z'|x], α₀(w), d − α₀(w))'.
Therefore, the optimal instruments are

a(x) = D(x)'Ω(x)⁻¹ − E[D(x)'Ω(x)⁻¹|w]{Σ(w)⁻¹ + E[Ω(x)⁻¹|w]}⁻¹Ω(x)⁻¹.

In the case where Σ(w) and Ω(x) are constant, this formula is proportional to

a(x) = (E[z|x]' − λE[z|w]', (1 − λ)α₀(w), d − α₀(w))',  λ = Σ₀/(Σ₀ + Ω₀),

and it can be shown that the resulting estimator attains Rilstone's (1989) semiparametric bound for normal disturbances.
It is interesting to note that these instruments are equal to those that would be best if α₀(w) were used in place of α̂, i.e. the instruments that are best without the generated regressor problem, plus a term involving the additional variable E[z|w]. If E[z|w] = E[z|x], as would occur e.g. if z ⊆ w, then the best instruments would be a linear combination of the instruments that are best without the generated regressor problem, and it follows that least squares is best.

In the case where z ⊆ w and Σ(w) and Ω(x) are constant, an estimator of the optimal instruments is readily available by replacing E[z|x], E[z|w], α₀(w), Σ, and Ω with corresponding estimates. Indeed, since the optimal instruments are a linear combination of A(x) = (E[z'|x], E[z'|w], α₀(w), d − α₀(w))', an optimal estimator could be obtained from GMM with the corresponding moment conditions and an optimal weighting matrix that accounts for the generated regressors.
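To make the generated regressor model concrete, here is a simulation sketch of its simplest two-step version: a Nadaraya–Watson first step for α₀(w) = E[d|w], then second-step least squares on (z, α̂(w), d − α̂(w)). The design is an illustrative assumption of ours, and the weighting needed for full efficiency is omitted.

```python
import numpy as np

def nw(x_eval, x, y, h):
    # Nadaraya-Watson kernel regression (Gaussian kernel).
    k = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (k @ y) / k.sum(axis=1)

rng = np.random.default_rng(2)
n = 2500
w = rng.uniform(0.0, 1.0, n)
alpha0 = 0.2 + 0.6 * w                        # alpha_0(w) = E[d|w]
d = (rng.uniform(size=n) < alpha0).astype(float)
z = w + 0.5 * rng.standard_normal(n)
beta0, gamma0, delta0 = 1.0, 2.0, 0.5
y = beta0 * z + gamma0 * alpha0 + delta0 * (d - alpha0) + 0.5 * rng.standard_normal(n)

alpha_hat = nw(w, w, d, 0.1)                  # nonparametric first step
X = np.column_stack([z, alpha_hat, d - alpha_hat])
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # (beta, gamma, delta)
```

An (approximately) efficient version would instead use GMM with the instrument vector A(x) above and a weighting matrix that accounts for the first-step estimation of α₀.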
In the general model of Theorem 3.2 it should be possible to construct an efficient estimator by using nonparametric estimates of the optimal instruments. Alternatively, an approximately efficient estimator can be constructed using many moment conditions, which corresponds to GMM estimation with many functions of x as instruments. It is straightforward to give conditions for approximate efficiency of these estimators. Let A^J(x) = (A₁(x),...,A_J(x))' be a vector of functions of x, and consider a GMM estimator as in equation (10) or (11) with

m^J(z,θ,α) = p(z,θ,α) ⊗ A^J(x).

Here r denotes the dimension of p.
Theorem 3.3: If Ω(x) and Σ(w) are bounded, E[u_m̄(z)u_m̄(z)'] is nonsingular, and for any a(x) with E[‖a(x)‖²] < ∞ there exist C_J such that E[‖a(x) − C_J{I_r ⊗ A^J(x)}‖²] → 0 as J → ∞, then

(H_J'V_J⁻¹H_J)⁻¹ → (E[u_m̄(z)u_m̄(z)'])⁻¹ as J → ∞.

The point of this result is that all that is needed to guarantee approximation of the complicated optimal influence function in Theorem 3.2, leading to approximate efficiency, is that the instruments can approximate functions of x. There are many types of instruments that would meet this qualification, including power series and regression splines, making it relatively easy to construct an approximately efficient estimator.
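The spanning requirement is easy to meet in practice. As a small self-contained illustration (our own, with a hypothetical target instrument), the mean-square error of projecting a smooth function a(x) on a power series A^J(x) falls rapidly with J:

```python
import numpy as np

def power_instruments(x, J):
    # A^J(x) = (1, t, t^2, ..., t^{J-1})' with x rescaled to [-1, 1] for stability.
    t = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    return np.column_stack([t**j for j in range(J)])

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, 4000)
a = np.exp(x) / (1.0 + x**2)          # hypothetical optimal instrument a(x)

def proj_mse(J):
    # Mean-square error of the least squares projection of a(x) on A^J(x).
    A = power_instruments(x, J)
    coef, *_ = np.linalg.lstsq(A, a, rcond=None)
    return float(np.mean((a - A @ coef) ** 2))

errors = [proj_mse(J) for J in (1, 2, 4, 8)]
```

Regression splines would behave similarly; the choice between bases matters mainly for numerical conditioning, not for the spanning condition itself.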
Here we have shown the form of an efficient two-step GMM estimator when the second step instruments are a subset of the first step regressors or vice versa. It is also possible to obtain some results for the case where w and x are not a subset of each other. For brevity these results are omitted.
4. Sample Selection Models and Nonparametric Propensity Score Estimation

Interesting and important two-step estimators arise in the context of sample selection models. A general form of such a model is

y* = x'β₀ + ε,  y = y* observed only if d = 1,  d ∈ {0, 1},  Prob(d = 1|w) = π₀(w) = P,  x ⊆ w.  (16)

The selection probability P is referred to as the propensity score. The parameters of interest β₀ are identified under various restrictions on the joint distribution of the selection indicator d and the disturbance ε. In this section we consider the form of optimal two-step estimators in two cases: where ε and w are mean independent conditional on d = 1 and P, and where they are statistically independent. The estimators are two-step estimators where the first step is nonparametric estimation of the propensity score. Some such estimators have been previously considered by Ahn and Powell (1993) and Choi (1990).
The first case we consider is

E[ε|w,d=1] = E[ε|P,d=1],  (17)

where the conditional mean of the disturbance, given selection and w, depends on w only through the propensity score. This can be motivated by a latent variable model where d = 1(τ(w) + η ≥ 0) for some unknown function τ(w). Suppose that w and η are independent, that η has a density that is positive everywhere (so that P is a one-to-one function of τ(w)), and that E[ε|η,w] = E[ε|η,τ(w)]. Then

E[ε|w,d=1] = E[ε|w, η ≥ −τ(w)] = E[E[ε|η,τ(w)] | w, η ≥ −τ(w)],

which depends on w only through τ(w), and hence through P, so that equation (17) holds.
A two step estimator of β₀ can be constructed using a nonparametric propensity score estimator P̂ = π̂(w), along with a residual d{y − E[y|P,d=1] − (x − E[x|P,d=1])'β} in which the conditional expectations are estimated by nonparametric regression on P̂. A vector of instrumental variables a(w) can then be used to form a moment vector

m(z,θ,α) = a(w)d{y − α_y(α_π(w)) − [x − α_x(α_π(w))]'β},  (18)

where α = (α_y, α_x, α_π) and α_y, α_x, and α_π have true values E[y|P], E[x|P], and π₀(w), respectively. The estimator we have described is a two-step instrumental variables version of Robinson's (1988) estimator, and a density weighted version has been developed by Ahn and Powell (1993).

It is straightforward to derive the influence function from the results of Newey (1994) and to obtain the optimal instruments.
Let λ(P) = E[ε|P,d=1] and λ_P(P) = ∂λ(P)/∂P.

Theorem 4.1: For the model of equation (17), the influence function is

u_m(z) = {a(w) − E[a(w)|P]}ζ,  ζ = d[ε − λ(P)] + Pλ_P(P)(d − P).

For η(w)² = E[ζ²|w] = E[d{ε − λ(P)}²|w] + λ_P(P)²P³(1 − P) bounded away from zero, the optimal instruments are

a(w) = Pη(w)⁻²{x − E[η(w)⁻²x|P]/E[η(w)⁻²|P]}.

Furthermore, Var(u_m̄(z))⁻¹ is the semiparametric variance bound for estimation of β₀ in the model of equation (17).

The best instruments here are like those that appear in an efficient estimator for a heteroskedastic partially linear model, as discussed in Chamberlain (1992). They are obtained from partialling out an unknown function of P in a weighted least squares criterion, where the weight is the inverse of the conditional variance of the adjusted residual ζ. The main difference with Chamberlain (1992) is that the presence of the first stage nonparametric estimates leads to the weight being the inverse of the conditional variance of ζ, rather than the original residual, so that the optimal instruments account for the first stage.

Because the optimal instruments attain the semiparametric efficiency bound, we do not need to search beyond instrumental variables estimation for an efficient estimator.
Construction of an optimal estimator, or an approximately optimal estimator, could be carried out in a way similar to that discussed in Section 2. A nonparametric estimator of

η(w)² = E[d{y − x'β₀ − λ(P)}²|w] + λ_P(P)²P³(1 − P)

could be constructed using β̂ and λ̂ from an unweighted least squares estimator of the partially linear model y = x'β + λ(P̂) + error in the selected data, and then a nonparametric estimator of the optimal instruments formed as

â(w) = P̂η̂(w)⁻²{x − Ê[η̂(w)⁻²x|P̂]/Ê[η̂(w)⁻²|P̂]}.

This is a complicated estimator for which regularity conditions have not yet been formulated in the literature, but it should lead to efficiency.
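The two-step structure behind Theorem 4.1 — a nonparametric propensity score first step, then partialling out a function of P̂ on the selected sample — can be sketched in a few lines. This simulation is our own illustrative design: the kernel regressions and bandwidths are placeholders, and the efficient weighting by η̂(w)⁻² is omitted for brevity.

```python
import numpy as np

def nw(xe, x, y, h):
    # Nadaraya-Watson regression; x may have one or two columns.
    xe, x = np.atleast_2d(xe.T).T, np.atleast_2d(x.T).T
    logk = np.zeros((xe.shape[0], x.shape[0]))
    for j in range(x.shape[1]):
        logk -= 0.5 * ((xe[:, j][:, None] - x[:, j][None, :]) / h) ** 2
    k = np.exp(logk)
    return (k @ y) / k.sum(axis=1)

rng = np.random.default_rng(4)
n = 2000
x = rng.standard_normal(n)
v = rng.standard_normal(n)
eta = rng.standard_normal(n)
eps = 0.8 * eta + 0.6 * rng.standard_normal(n)   # corr(eps, eta) != 0: selection bias
d = (0.5 * x + v + eta >= 0).astype(float)       # d = 1(tau(w) + eta >= 0), w = (x, v)
beta0 = 1.0
y = beta0 * x + eps                              # y* = x*beta0 + eps, seen when d = 1

W = np.column_stack([x, v])
P_hat = nw(W, W, d, 0.3)                         # first step: propensity score

s = d == 1.0                                     # selected sample
my = nw(P_hat[s], P_hat[s], y[s], 0.05)          # E_hat[y|P, d=1]
mx = nw(P_hat[s], P_hat[s], x[s], 0.05)          # E_hat[x|P, d=1]
beta_hat = ((x[s] - mx) @ (y[s] - my)) / ((x[s] - mx) @ (x[s] - mx))
```

Ordinary least squares of y on x in the selected sample would be inconsistent here; partialling out the estimated propensity score removes the selection term λ(P).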
An approximately efficient estimator is straightforward to construct here, by using approximating functions as instruments and then doing optimal GMM that accounts for the presence of the nonparametric first stage. Let A^J(w) be a vector of approximating functions and consider a GMM estimator as in equation (10) or (11) with

m^J(z,θ,α) = A^J(w)d{y − E[y|P,d=1] − (x − E[x|P,d=1])'β}.

Then there is a relatively simple spanning condition for approximate efficiency of the GMM estimator:

Theorem 4.2: If E[u_m̄(z)u_m̄(z)'] is nonsingular, η(w) is bounded, and for any a(w) with finite mean-square there exist C_J such that E[‖a(w) − C_JA^J(w)‖²] → 0 as J → ∞, then

(H_J'V_J⁻¹H_J)⁻¹ → (E[u_m̄(z)u_m̄(z)'])⁻¹ as J → ∞.

Here we see that it suffices for approximate efficiency that the approximating vector A^J(w) spans the set of functions of w with finite mean-square as J → ∞. There are many such functions that could be used, including splines and power series.

The second sample selection model we consider satisfies equation (16) with

ε and w independent, conditional on P and d = 1.  (19)

Here it is assumed that conditioning on P removes any dependence between ε and w. This can be motivated by a latent variable model like that above, where d = 1(τ(w) + η ≥ 0) and the joint distribution of (ε,η) given w depends only on τ(w); then this equation will be satisfied. A basic implication of the conditional independence in equation (19) is that any function of ε and P will be uncorrelated with any function of w, conditional on P and d = 1.
This allows us to form estimators analogous to those above, where y − x'β is replaced by any function of y − x'β and P. To be precise, for a vector of instrumental variables a(w), a function q(ε,P), and the corresponding residual d{q(y − x'β,P) − E[q(y − x'β,P)|P,d=1]}, we can use the moment function

m(z,θ,α) = a(w)d[q(y − x'β, α_π(w)) − α_q(y − x'β, α_π(w))],  (20)

where α = (α_q, α_π), and α_q and α_π have true values E[q(y − x'β₀,P)|P,d=1] and π₀(w), respectively. The estimator we have described is a nonlinear version of the estimator described above for the conditional mean case.
The next result gives the form of the influence function and the optimal moment functions for this estimator. Here let f(ε|P) be the density of ε given P and d = 1, let s_ε(ε,P) = ∂ln f(ε|P)/∂ε be the score with respect to a location parameter, and let s_P(ε,P) = ∂ln f(ε|P)/∂P. Also, set λ(P) = E[q(ε,P)|P,d=1] and λ_P(P) = ∂λ(P)/∂P.
the influence function is
u
m (z)
= {a(w)-E[a(w)\P,d=l]}C,,
C,
= d[q(c,P)-X(P)] +
and the optimal choice of moment function has
q(c,P) = s (c,P)
c
Furthermore,
Var(u—(z))
-
= x
where the best
and
l
PsJc,P)E[s s \P,d=U/(P~ (l-P)~
P
c p
is the
Using many moment functions
this case,
a(w) = x
P{E[q (c,P)\P,d=l]-X (P)}(d-P),
p
p
q(c,P)
has a known functional form.
1
+
PE[s
2
\P,d=l]}.
p
semiparametric variance bound for estimation of
in
is
an approximately efficient estimator
is
useful in
quite complicated, but the optimal instrument
In this
q(e,P),
rather than the
potentially high dimensional approximation of the best instruments
19
a(w)
case approximate efficiency can be achieved
with only a two-dimensional approximation of the best
a(w)
Q
from Theorem
.
4.1.
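The two-dimensional approximation can be sketched in code. The following is an illustrative sketch (ours, not from the paper; function names and basis choices are assumptions): it builds a small tensor-product basis q^J(ε,P) and forms the stacked residuals d{q^J(y-x'θ,P) - Ê[q^J|P,d=1]} ⊗ x, with the conditional expectation estimated by a polynomial regression on P in the selected sample.

```python
import numpy as np

def q_basis(eps, P, K=3):
    # Low-order tensor-product basis in (eps, P): powers of eps interacted
    # with powers of P.  Because the optimal q(eps, P) depends on only two
    # variables, a small basis like this can span it well.
    cols = [eps**j * P**l for j in range(1, K + 1) for l in range(2)]
    return np.column_stack(cols)

def stacked_moments(y, x, P, d, theta, K=3, degree=3):
    """Stacked moments d{q^J(y - x'theta, P) - Ehat[q^J | P, d=1]} (x) x,
    with Ehat a polynomial regression on P in the selected sample."""
    eps = y - x @ theta
    Q = q_basis(eps, P, K)                 # n x J matrix of q^J values
    B = np.vander(P, degree + 1)
    sel = d == 1
    G, *_ = np.linalg.lstsq(B[sel], Q[sel], rcond=None)
    R = d[:, None] * (Q - B @ G)           # n x J selected-sample residuals
    # Kronecker with instruments x: one moment per (basis, regressor) pair.
    return np.einsum('nj,nk->njk', R, x).reshape(len(y), -1)
```

Feeding these stacked moments into a standard two-step GMM routine gives the many-moment estimator whose approximate efficiency is the subject of the next result.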
The low dimensional nature of the approximation means that it should be possible to attain high efficiency using a relatively small set of moment conditions based on a few simple approximating functions. Let q^J(ε,P) denote a vector of approximating functions, and consider a GMM estimator as in equation (10) or (11) with moment functions

    m_J(z,θ,α) = d{q^J(y - x'θ, P) - E[q^J(y - x'θ, P)|P, d=1]} ⊗ x.

Theorem 4.4: If E[ū(z)ū(z)'] is nonsingular, E[(s_P)²|P,d=1] and E[||x - E[x|P]||²|P] are bounded, and for any q(ε,P) with E[dq(ε,P)²] finite there is C_J such that E[d{q(ε,P) - C_J'q^J(ε,P)}²] → 0 as J → ∞, then (H_J'V_J^{-1}H_J)^{-1} → (E[ū(z)ū(z)'])^{-1} as J → ∞.

In practice it may make sense to begin the approximation with functions of ε that correspond to particular distributions. For example, one could derive the form of q(ε,P) when ε and η have a normal distribution. Alternatively, one could use a power series in some function of ε. Because selection is likely to induce some skewness and because normality is not expected in many econometric applications, using such an estimator for the conditional independence case of equation (19) can result in substantial efficiency gains over the linear estimators based on equation (18), as shown in Newey (1991) for a semiparametric selection model with a nonparametrically estimated selection probability.
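For the normal case just mentioned, the scores take a simple form. The following derivation is a sketch we supply for illustration (the paper does not display these formulas): if the distribution of ε given P and d = 1 is N(μ(P), σ²(P)), then

```latex
% Sketch: scores when eps | P, d=1 ~ N(mu(P), sigma^2(P))
\ln f(\varepsilon \mid P) = -\tfrac{1}{2}\ln(2\pi) - \ln\sigma(P)
   - \frac{(\varepsilon - \mu(P))^{2}}{2\sigma^{2}(P)}, \qquad
s_{\varepsilon}(\varepsilon,P) = -\,\frac{\varepsilon - \mu(P)}{\sigma^{2}(P)},
\qquad
s_{P}(\varepsilon,P) = \frac{\mu'(P)\,(\varepsilon - \mu(P))}{\sigma^{2}(P)}
   + \frac{\sigma'(P)}{\sigma(P)}
     \left[ \frac{(\varepsilon - \mu(P))^{2}}{\sigma^{2}(P)} - 1 \right].
```

Both scores, and hence the optimal q(ε,P) in this case, are spanned by {1, ε, ε²} with P-dependent coefficients, consistent with beginning a power series in ε at low order.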
APPENDIX: Proofs of Theorems

Proof of Theorem 2.1: Let C denote a constant that is different in different uses, and for a matrix B let ||B|| = [tr(B'B)]^{1/2}. By equation (9), the estimator has asymptotic variance (H_J'V_J^{-1}H_J)^{-1} = (E[ũ_J ũ_J'])^{-1} for

    ũ_J = E[ū u_J'](E[u_J u_J'])^{-1} u_J,

where we suppress the z argument for notational convenience. For the C_J in the statement of the result, let u_C = C_J'u_J. Since ũ_J is the multivariate least squares projection of ū on u_J, it follows by the spanning condition in the statement that

    E[||ū - ũ_J||²] ≤ E[||ū - u_C||²] → 0.    (21)

Also,

    ||E[ūū'] - E[ũ_J ũ_J']|| ≤ E[||ūū' - ũ_J ũ_J'||] ≤ E[||ū - ũ_J||²] + 2(E[||ū - ũ_J||²])^{1/2}(E[||ũ_J||²])^{1/2},

so that by equation (21), E[ũ_J ũ_J'] → E[ūū'] as J → ∞. The conclusion then follows by nonsingularity of E[ūū'] and continuity of the inverse matrix at any nonsingular matrix. QED.
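The key step in this proof is that ũ_J, the least squares projection of ū on u_J, has mean squared distance to ū no larger than that of any other linear combination C'u_J. A small simulation (ours, purely illustrative; the coefficient values are arbitrary) checks this inequality numerically:

```python
import numpy as np

# ubar, the least squares projection of ua on uJ, minimizes the mean
# squared distance to ua over all linear combinations C'uJ.
rng = np.random.default_rng(1)
n = 100_000
uJ = rng.normal(size=(n, 3))                      # stand-in draws of u_J(z)
ua = uJ @ np.array([0.5, -1.0, 0.2]) + 0.3 * rng.normal(size=n)
coef, *_ = np.linalg.lstsq(uJ, ua, rcond=None)
ubar = uJ @ coef                                  # the projection
proj_mse = np.mean((ua - ubar) ** 2)
# Any other linear combination C'uJ does no better.
for C in (np.array([1.0, 0.0, 0.0]), np.array([0.4, -0.9, 0.1])):
    assert proj_mse <= np.mean((ua - uJ @ C) ** 2) + 1e-12
```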
Proof of Theorem 3.1: The form of the influence function follows by Newey (1994). The optimal instruments follow as in Chamberlain (1987). QED.

Proof of Theorem 3.2: The form of the influence function follows by Newey (1994). For notational convenience, suppress the x argument of a(x), Ω(x), D(x), and K(x). Then by iterated expectations, equation (9) for the optimal influence function requires that, for the optimal ā and all a,

    H_m = E[aD] = E[a{Ωā' + Σ(w)E[ā'|w] + KE[ā'|w] + E[K'ā'|w]}].

Since a = a(x) can be any bounded function of x, satisfaction of this equation requires that

    D = Ωā' + Σ(w)E[ā'|w] + KE[ā'|w] + E[K'ā'|w].

Applying the optimal instrument formula then gives the second result. For the third result, take conditional expectations given w of Ω^{-1} times this equation, solve the resulting equation for E[ā'|w], and plug the solution back into the equation for ā. QED.

Proof of Theorem 3.3: Let a_J(x) denote the vector of approximating functions, with moment functions formed by taking the Kronecker product with a_J(x) as in Theorem 3.1 or 3.2. Note that for any conformable constant matrix C_J, with u_C(z) = (C_J ⊗ I)u_J(z),

    E[||u_C(z) - ū(z)||²] ≤ 2E[||a(x) - C_J a_J(x)||² ||Ω(x)||] + 2E[||E[{a(x) - C_J a_J(x)}|w]||² ||Σ(w)||]
        ≤ CE[||a(x) - C_J a_J(x)||²] + CE[E[||a(x) - C_J a_J(x)||² | w]] ≤ CE[||a(x) - C_J a_J(x)||²] → 0.

The conclusion then follows by Theorem 2.1. QED.

For the proofs of Theorems 4.1 - 4.4, for any function b(w) of the data let E_w b = E[b|w] and E_P b = E[b|P].

Proof of Theorem 4.1: As shown in Newey (1994), the correction terms for nonparametric estimation can be derived separately for α_y and α_π. From Newey (1994) it follows that the correction term for the conditional expectations correction in α_y is -E[a(w)|P,d=1]ρ, where ρ = d{y - x'θ - λ(P)}. Also, let π(w,γ) = E_γ[d|w] denote the propensity score for a distribution parameterized by γ that passes through the truth, with score S_γ(z) for z. Then by Newey (1994) the correction term for nonparametric estimation of π(w) can be computed from the derivative of E[da(w)E[ε|π(w,γ),d=1]] with respect to γ at the truth. Using iterated expectations, the chain rule, and the fact that ∂π(w,γ)/∂γ = ∂E_γ[d|w]/∂γ = E[{d-P}S_γ(z)|w],

    ∂E[da(w)E[ε|π(w,γ),d=1]]/∂γ = ∂E[da(w)E[λ(P)|π(w,γ),d=1]]/∂γ
        = ∂E[da(w)E[λ(P - π(w,γ) + P)|P,d=1]]/∂γ + ∂E[da(w)λ(π(w,γ))]/∂γ
        = -E[Pλ_P(P){a(w) - E[a(w)|P]}∂π(w,γ)/∂γ] = -E[Pλ_P(P){a(w) - E[a(w)|P]}{d-P}S_γ(z)],

so that the correction term is -Pλ_P(P){a(w) - E[a(w)|P]}(d-P). Noting that E[a(w)|P,d=1] = E[da(w)|P]/E[d|P] = E[a(w)|P], we obtain the first conclusion.

Also, with the z argument suppressed for convenience,

    H_m = E[∂m(z,θ,α)/∂θ] = -E[Pa(w){x - E[x|P]}'] = -E[(a - E_P a)Px'],

while equation (9) for the optimal ā requires -E[(a - E_P a)Px'] = -E[(a - E_P a)η²(ā - E_P ā)']. Subtracting gives

    -E[(a - E_P a){Px - η²(ā - E_P ā)}'] = 0.

Since a(w) can be any function of w, this equation implies that Px - η²(ā - E_P ā) = h(P) for some function h(P) of P. Dividing through by η² and taking conditional expectations given P gives P E_P(η^{-2}x) = h(P)E_P(η^{-2}), so that h(P) = P E_P(η^{-2}x)/E_P(η^{-2}). Solving again,

    ā - E_P ā = Pη^{-2}(x - E[η^{-2}x|P]/E[η^{-2}|P]),

which is the form given in Newey and Powell (1993). QED.

Proof of Theorem 4.2: Choose C_J such that, for a_J(w) = C_J'A^J(w), E[||a - a_J||²] → 0. Then since E_P ζ = 0,

    E[||ū(z) - C_J'u_J(z)||²] = E[||{(a - a_J) - E_P(a - a_J)}ζ||²] ≤ CE[||a - a_J||²] + CE[E_P(||a - a_J||²)] ≤ CE[||a - a_J||²] → 0.

The conclusion then follows by Theorem 2.1. QED.

Proof of Theorem 4.3: It follows as for the conditional mean case that the correction term for estimation of α_q from equation (20) is -E[a(w)|P,d=1]d{q(ε,P) - λ(P)}, where λ(P) = E[dq(ε,P)|P,d=1]. Similarly, the correction term for estimation of π is {a(w) - E[a(w)|P,d=1]}P{E[q_P(ε,P)|P] - λ_P(P)}(d-P), giving the first conclusion.

For notational simplicity, subscripts will denote corresponding partial derivatives; let f(ε|P) denote the density of ε given w and d = 1, q_ε = q_ε(ε,P), s_ε = s_ε(ε,P), s_P = s_P(ε,P), and ρ = d[q(ε,P) - λ(P)]. To solve equation (9) for the optimal a(w) and q(ε,P), note that

    E[dq_ε|w] = PE[dq_ε|w,d=1] + (1-P)E[dq_ε|w,d=0] = PE_P q_ε.

Therefore, differentiating the moment function,

    H_m = E[∂m(z,θ,α)/∂θ] = -E[a(w){dq_ε(ε,P)x - E[q_ε(ε,P)x|P,d=1]}'] = -E[(a - E_P a)P(E_P q_ε)(x - E_P x)'].

Integration by parts of E_P q_ε, differentiating λ(P) = ∫q(ε,P)f(ε|P)dε with respect to P, and using E_P s_ε = 0 = E_P s_P, give

    ∫ q_ε(ε,P)f(ε|P)dε = -E_P(q s_ε) = -E_P(ρ s_ε),    E_P q_P - λ_P(P) = -E_P(ρ s_P).

It follows that ζ = ρ - E_P(ρ s_P)P(d-P), and hence H_m = E[(a - E_P a)P E_P(ρ s_ε)(x - E_P x)'].

Let ā and q̄ denote the optimal functions, ρ̄ = d{q̄(ε,P) - E_P q̄(ε,P)}, and ζ̄ = ρ̄ - E_P(ρ̄ s_P)P(d-P). Note that by conditional independence, E[ρρ̄|w] = PE_P(ρρ̄), so that

    E[u_m ū'] = E[(a - E_P a)E[ζζ̄|w](ā - E_P ā)'] = E[P(a - E_P a){E_P(ρρ̄) + P²(1-P)E_P(ρ s_P)E_P(ρ̄ s_P)}(ā - E_P ā)'].

As in the proof of Theorem 4.1, for this to equal H_m for any a(w) and q(ε,P) it is sufficient that ā = x and, for any q,

    E[dq(ε,P){ρ̄ - s_ε + P²(1-P)s_P E_P(ρ̄ s_P)}|P,d=1] = 0.

For this equality to hold for any function q(ε,P) requires that s_ε - ρ̄ - P²(1-P)s_P E_P(ρ̄ s_P) = h(P) for some function h(P); taking conditional expectations given P and d = 1, and using E_P s_ε = E_P s_P = E_P ρ̄ = 0, gives h(P) = 0. Setting g(P) = P²(1-P)E_P(ρ̄ s_P), an optimal q(ε,P) is then given by q̄ = s_ε - g(P)s_P, so that E_P q̄ = 0 and ρ̄ = dq̄. Multiplying through by s_P, taking conditional expectations, and noting that E_P(ρ̄ s_P) = g(P)P^{-2}(1-P)^{-1}, we find that

    g(P)P^{-2}(1-P)^{-1} = E_P(s_ε s_P) - g(P)E_P(s_P²).

Solving, g(P) = P E_P(s_ε s_P)/[P^{-1}(1-P)^{-1} + P E_P(s_P²)], so that q̄ agrees with the optimal q(ε,P) of the statement. Also,

    ζ̄ = dq̄ - E_P(ρ̄ s_P)P(d-P) = dq̄ - g(P)P^{-1}(1-P)^{-1}(d-P).

Therefore, ū(z) = (x - E_P x)ζ̄ matches the efficient score given in equation (4.15) of Newey and Powell (1993), giving the last conclusion. QED.

Proof of Theorem 4.4: By arguments like those above, ū(z) = (x - E[x|P]){ρ̄ - PE[ρ̄ s_P|P](d - P)}, where ρ̄ = d(q̄ - E[q̄|P,d=1]). Suppose that there is C_J such that E[d{q̄(ε,P) - C_J'q^J(ε,P)}²] → 0 as J → ∞, and let C̄_J = C_J ⊗ I, where I is the identity matrix with dimension equal to the dimension of x. Let τ_J = q̄(ε,P) - C_J'q^J(ε,P) and w(x) = ||x - E[x|P]||². Then

    E[||ū(z) - C̄_J'u_J(z)||²] ≤ 2E[dw(x){τ_J - E[τ_J|P,d=1]}²] + 2E[P²(1-P)w(x){E[τ_J s_P|P,d=1]}²].

By the Cauchy-Schwarz inequality, {E[τ_J s_P|P,d=1]}² ≤ E[τ_J²|P,d=1]E[(s_P)²|P,d=1], so by boundedness of E[(s_P)²|P,d=1] and E[w(x)|P], both terms are bounded by CE[dτ_J²] = o(1). The conclusion then follows from Theorem 2.1. QED.
References

Ahn, H. and C.F. Manski, 1993, Distribution theory for the analysis of binary choice under uncertainty with nonparametric estimation of expectations, Journal of Econometrics 58, 291-321.

Ahn, H. and J.L. Powell, 1993, Semiparametric estimation of censored selection models with a nonparametric selection mechanism, Journal of Econometrics 58, 3-29.

Bates, C.E. and H. White, 1993, Determination of estimators with minimum asymptotic covariance matrices, Econometric Theory 9, 633-648.

Beran, R., 1976, Adaptive estimates for autoregressive processes, Annals of the Institute of Statistical Mathematics 26, 77-89.

Chamberlain, G., 1987, Asymptotic efficiency in estimation with conditional moment restrictions, Journal of Econometrics 34, 305-334.

Chamberlain, G., 1992, Efficiency bounds for semiparametric regression, Econometrica 60, 567-596.

Choi, K., 1990, The semiparametric estimation of the sample selection model using series expansion and the propensity score, manuscript, Department of Economics, University of Chicago.

Crepon, B., F. Kramarz, and A. Trognon, 1997, Parameters of interest, nuisance parameters, and orthogonality conditions: An application to autoregressive error component models, Journal of Econometrics 82, 135-156.

Hansen, L.P., 1982, Large sample properties of generalized method of moments estimators, Econometrica 50, 1029-1054.

Hansen, L.P., 1985a, A method for calculating bounds on the asymptotic covariance matrices of generalized method of moments estimators, Journal of Econometrics 30, 203-238.

Hansen, L.P., 1985b, Two-step generalized method of moments estimators, discussion, North American Winter Meeting of the Econometric Society, New York.

Hansen, L.P., J.C. Heaton, and M. Ogaki, 1988, Efficiency bounds implied by multiperiod conditional moment restrictions, Journal of the American Statistical Association 83, 863-871.

Hayashi, F. and C. Sims, 1983, Nearly efficient estimation of time series models with predetermined, but not exogenous, instruments, Econometrica 51, 783-798.

Newey, W.K., 1990, Semiparametric efficiency bounds, Journal of Applied Econometrics 5, 99-135.

Newey, W.K., 1991, Two-step series estimation of sample selection models, working paper, MIT Department of Economics.

Newey, W.K., 1993, Efficient estimation of models with conditional moment restrictions, in G.S. Maddala, C.R. Rao, and H.D. Vinod, eds., Handbook of Statistics, Volume 11: Econometrics. Amsterdam: North-Holland.

Newey, W.K., 1994, The asymptotic variance of semiparametric estimators, Econometrica 62, 1349-1382.

Newey, W.K. and D. McFadden, 1994, Large sample estimation and hypothesis testing, in R. Engle and D. McFadden, eds., Handbook of Econometrics, Vol. 4. Amsterdam: North-Holland, 2113-2245.

Newey, W.K. and J.L. Powell, 1993, Efficiency bounds for semiparametric selection models, Journal of Econometrics 58, 169-184.

Rilstone, P., 1989, Computing the (local) efficiency bound for a semiparametric generated regressors model, manuscript, Department of Economics, University of Western Ontario.

Robinson, P.M., 1988, Root-n consistent semiparametric regression, Econometrica 56, 931-954.