Massachusetts Institute of Technology
Department of Economics
Working Paper Series
REARRANGING EDGEWORTH-CORNISH-FISHER EXPANSIONS
Victor Chernozhukov
Ivan Fernandez-Val
Alfred Galichon
Working Paper 07-20
August 13, 2007
Room E52-251
50 Memorial Drive
Cambridge, MA 02142
This paper can be downloaded without charge from the
Social Science Research Network Paper Collection at
http://ssrn.com/abstract=1007282
REARRANGING EDGEWORTH-CORNISH-FISHER EXPANSIONS

VICTOR CHERNOZHUKOV†, IVAN FERNANDEZ-VAL§, ALFRED GALICHON‡

Abstract. This paper applies a regularization procedure called increasing rearrangement to monotonize Edgeworth and Cornish-Fisher expansions and any other related approximations of distribution and quantile functions of sample statistics. Besides satisfying the logical monotonicity, required of distribution and quantile functions, the procedure often delivers strikingly better approximations to the distribution and quantile functions of the sample mean than the original Edgeworth-Cornish-Fisher expansions.

Key Words: Edgeworth expansion, Cornish-Fisher expansion, rearrangement

Date: The results of this paper were first presented at the Statistics Seminar in Cornell University, October 2006. This version is of August 2007. We would like to thank Andrew Chesher, James Durbin, Ivar Ekeland, Xuming He, Joel Horowitz, Roger Koenker, Enno Mammen, Charles Manski, Ilya Molchanov, Francesca Molinari, and Peter Phillips for the helpful discussions and comments. We would also like to thank seminar participants at Cornell, University of Chicago, University of Illinois at Urbana-Champaign, Cowles Foundation 75th Anniversary Conference, Transportation Conference at the University of Columbia, and the CEMMAP "Measurement Matters" Conference for helpful comments.

† Massachusetts Institute of Technology, Department of Economics & Operations Research Center, and the University College of London, CEMMAP. E-mail: vchern@mit.edu. Research support from the Castle Krob Chair, National Science Foundation, the Sloan Foundation, and CEMMAP is gratefully acknowledged.

§ Boston University, Department of Economics. E-mail: ivanf@bu.edu.

‡ Harvard University, Department of Economics. E-mail: alfred.galichon@post.harvard.edu. Research support from the Conseil General des Mines and the National Science Foundation is gratefully acknowledged.
improving edgeworth approximations
1. Introduction
Higher order approximations to the distribution of sample statistics beyond the order $n^{-1/2}$ provided by the central limit theorem are of central interest in the theory of asymptotic statistics, see e.g. (Blinnikov and Moessner 1998, Cramer 1999, Bhattacharya and Ranga Rao 1976, Hall 1992, Rothenberg 1984, van der Vaart 1998). An important tool for performing these refinements is provided by the Edgeworth expansion (Edgeworth 1905, Edgeworth 1907), which approximates the distribution of the statistics of interest around the limit distribution (often the normal distribution) by a combination of Hermite polynomials, with coefficients defined in terms of moments. Inverting the expansion yields a related higher order approximation, the Cornish-Fisher expansion (Cornish and Fisher 1938, Fisher and Cornish 1960), to the quantiles of the statistic around the quantiles of the limiting distribution.

One of the important shortcomings of either Edgeworth or Cornish-Fisher expansions is that the resulting approximations to the distribution and quantile functions are not monotonic, which violates an obvious monotonicity requirement. This comes from the fact that the polynomials involved in the expansion are not monotone. Here we propose to use a procedure, called the rearrangement, to restore the monotonicity of the approximations and, perhaps more importantly, to improve the estimation properties of these approximations. The resulting improvement is due to the fact that the rearrangement necessarily brings the non-monotone approximations closer to the true monotone target function.
The main findings of the paper can be illustrated through a single picture given as Figure 1. In that picture, we plot the true distribution function of a sample mean $X$ based on a small sample, a third order Edgeworth approximation to that distribution, and a rearrangement of this third order approximation. We see that the Edgeworth approximation is sharply non-monotone and provides a rather poor approximation to the distribution function. The rearrangement merely sorts the values of the approximate distribution function in an increasing order. One can see that the rearranged approximation, in addition to being monotonic, is a much better approximation to the true function than the original approximation.

[Figure 1. Distribution function for the standardized sample mean of log-normal random variables (sample size 5), the third order Edgeworth approximation, and the rearranged third order Edgeworth approximation. Panel title: "Edgeworth Expansion: lognormal (n = 5)". Curves: Edgeworth, Rearranged, True (Monte Carlo).]
We organize the rest of the paper as follows. In Section 2, we describe the rearrangement and qualify the approximation property it provides for monotonic functions. In Section 3, we introduce the rearranged Edgeworth-Cornish-Fisher expansions and explain how these produce better estimates of distributions and quantiles of statistics. In Section 4, we illustrate the procedure with several additional examples.
2. Improving Approximations of Monotone Functions by Rearrangement
In what follows, let $\mathcal{X}$ be a compact interval. We first consider an interval of the form $\mathcal{X} = [0,1]$. Let $f(x)$ be a measurable function mapping $\mathcal{X}$ to $K$, a bounded subset of $\mathbb{R}$. Let $F_f(y) = \int_{\mathcal{X}} 1\{f(u) \le y\}\,du$ denote the distribution function of $f(X)$ when $X$ follows the uniform distribution on $[0,1]$. Let

$$ f^*(x) := Q_f(x) := \inf\{ y \in \mathbb{R} : F_f(y) \ge x \} $$

be the quantile function of $F_f(y)$. Thus,

$$ f^*(x) := \inf\left\{ y \in \mathbb{R} : \left[ \int_{\mathcal{X}} 1\{f(u) \le y\}\,du \right] \ge x \right\}. $$

This function $f^*$ is called the increasing rearrangement of the function $f$. The rearrangement is a tool that is extensively used in functional analysis and optimal transportation (see e.g. Hardy, Littlewood, and Polya (1952) and Villani (2003)). It originates in the work of Chebyshev, who used it to prove a set of inequalities (Bronshtein, Semendyayev, Musiol, Muehlig, and Mühlig 2004). Here we use this tool to improve approximations of monotone functions, such as the Edgeworth-Cornish-Fisher approximations to the distribution and quantile functions of the sample mean.

The rearrangement operator simply transforms a function $f$ to its quantile function $f^*$. That is, $x \mapsto f^*(x)$ is the quantile function of the random variable $f(X)$ when $X \sim U(0,1)$. Another convenient way to think of the rearrangement is as a sorting operation: given values of the function $f(x)$ evaluated at $x$ in a fine enough mesh of equidistant points, we simply sort the values in an increasing order. The function created in this way is the rearrangement of $f$.

Finally, if $\mathcal{X}$ is of the form $[a,b]$, let $\bar{x}(x) = (x-a)/(b-a) \in [0,1]$, $x(\bar{x}) = a + (b-a)\bar{x} \in [a,b]$, and $\bar{f}^*$ be the rearrangement of the function $\bar{f}(\bar{x}) = f(x(\bar{x}))$ defined on $\bar{\mathcal{X}} = [0,1]$. Then, the rearrangement of $f$ is defined as

$$ f^*(x) := \bar{f}^*(\bar{x}(x)). $$
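The sorting description of the rearrangement can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function `f`, the mesh size, and the helper names `mesh` and `rearrange` are all assumptions introduced here.

```python
import math


def mesh(a, b, m):
    """Midpoints of m equal-width cells covering [a, b]."""
    h = (b - a) / m
    return [a + (i + 0.5) * h for i in range(m)]


def rearrange(values):
    """Increasing rearrangement on an equidistant mesh: sort the sampled values."""
    return sorted(values)


def f(x):
    # Hypothetical non-monotone function, chosen only for illustration.
    return x + 0.3 * math.sin(4 * math.pi * x)


xs = mesh(0.0, 1.0, 1000)
f_vals = [f(x) for x in xs]      # f evaluated on the mesh
f_star = rearrange(f_vals)       # rearranged function on the same mesh

was_monotone = all(u <= v for u, v in zip(f_vals, f_vals[1:]))
is_monotone = all(u <= v for u, v in zip(f_star, f_star[1:]))
```

Because the mesh is equidistant, the same code handles a general interval $[a,b]$: the affine change of variable only relabels the mesh points, and the sorted values are assigned to them in order.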
The following result is detailed in Chernozhukov, Fernandez-Val, and Galichon (2006).

Proposition 1 (Improving Approximation of Monotone Functions). Let $f_0 : \mathcal{X} = [a,b] \to K$ be a weakly increasing measurable function in $x$, where $K$ is a bounded subset of $\mathbb{R}$. This is the target function that we want to approximate. Let $\hat{f} : \mathcal{X} \to K$ be another measurable function, an initial approximation or an estimate of the target function $f_0$.

1. For any $p \in [1,\infty]$, the rearrangement of $\hat{f}$, denoted $\hat{f}^*$, weakly reduces the estimation error:

$$ \left[ \int_{\mathcal{X}} \left| \hat{f}^*(x) - f_0(x) \right|^p dx \right]^{1/p} \le \left[ \int_{\mathcal{X}} \left| \hat{f}(x) - f_0(x) \right|^p dx \right]^{1/p}. \tag{2.1} $$

2. Suppose that there exist regions $\mathcal{X}_0$ and $\mathcal{X}_0'$, each of measure greater than $\delta > 0$, such that for all $x \in \mathcal{X}_0$ and $x' \in \mathcal{X}_0'$ we have that (i) $x' > x$, (ii) $\hat{f}(x) > \hat{f}(x') + \epsilon$, and (iii) $f_0(x') > f_0(x) + \epsilon$, for some $\epsilon > 0$. Then the gain in the quality of approximation is strict for $p \in (1,\infty)$. Namely, for any $p \in [1,\infty]$,

$$ \left[ \int_{\mathcal{X}} \left| \hat{f}^*(x) - f_0(x) \right|^p dx \right]^{1/p} \le \left[ \int_{\mathcal{X}} \left| \hat{f}(x) - f_0(x) \right|^p dx - \delta_{\mathcal{X}} \eta_p \right]^{1/p}, \tag{2.2} $$

where $\eta_p = \inf\{ |v - t'|^p + |v' - t|^p - |v - t|^p - |v' - t'|^p \}$ and $\eta_p > 0$ for $p \in (1,\infty)$, with the infimum taken over all $v, v', t, t'$ in the set $K$ such that $v' \ge v + \epsilon$ and $t' \ge t + \epsilon$; and $\delta_{\mathcal{X}} = \delta/(b-a)$.

Corollary 1 (Strict Improvement). If the target function $f_0$ is increasing over $\mathcal{X}$ and $\hat{f}$ is decreasing over a subset of $\mathcal{X}$ that has positive measure, then the improvement in $L_p$ norm, for $p \in (1,\infty)$, is necessarily strict.

The first part of the proposition states the weak inequality (2.1), and the second part states the strict inequality (2.2). As an implication, the Corollary states that the inequality is strict for $p \in (1,\infty)$ if the original estimate $\hat{f}(x)$ is decreasing on a subset of $\mathcal{X}$ having positive measure, while the target function $f_0(x)$ is increasing on $\mathcal{X}$ (by increasing, we mean strictly increasing throughout). Of course, if $f_0(x)$ is constant, then the inequality (2.1) becomes an equality, as the distribution of the rearranged function $\hat{f}^*$ is the same as the distribution of the original function $\hat{f}$, that is $F_{\hat{f}^*} = F_{\hat{f}}$.
This proposition establishes that the rearranged estimate $\hat{f}^*$ has a smaller estimation error in the $L_p$ norm than the original estimate whenever the latter is not monotone. This is a very useful and generally applicable property that is independent of the sample size and of the way the original approximation $\hat{f}$ to $f_0$ is obtained.

Remark 1. An indirect proof of the weak inequality (2.1) is a simple but important consequence of the following classical inequality due to Lorentz (1953): Let $q$ and $g$ be two functions mapping $\mathcal{X}$ to $K$, a bounded subset of $\mathbb{R}$. Let $q^*$ and $g^*$ denote their corresponding increasing rearrangements. Then,

$$ \int_{\mathcal{X}} L(q^*(x), g^*(x), x)\,dx \le \int_{\mathcal{X}} L(q(x), g(x), x)\,dx, $$

for any submodular discrepancy function $L : \mathbb{R}^3 \mapsto \mathbb{R}$. Set $q(x) = \hat{f}(x)$, $q^*(x) = \hat{f}^*(x)$, $g(x) = f_0(x)$, and $g^*(x) = f_0^*(x)$. Now, note that in our case $f_0^*(x) = f_0(x)$ almost everywhere, that is, the target function is its own rearrangement. Moreover, $L(v,w,x) = |w - v|^p$ is submodular for $p \in [1,\infty)$. This proves the first part of the proposition above. For $p = \infty$, the first part follows by taking the limit as $p \to \infty$. In the Appendix, for completeness, we restate the proof of Chernozhukov, Fernandez-Val, and Galichon (2006) of the strong inequality (2.2) as well as the direct proof of the weak inequality (2.1).
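The weak inequality (2.1) can be checked numerically on a mesh. The sketch below is an illustration under stated assumptions: the target `f0`, the non-monotone approximation `f_hat`, the mesh size, and the helper `lp_error` are hypothetical choices made here, not objects from the paper.

```python
import math


def lp_error(approx, target, p, width):
    """Discrete L_p distance on an equidistant mesh with cell width `width`."""
    if math.isinf(p):
        return max(abs(a - t) for a, t in zip(approx, target))
    total = sum(abs(a - t) ** p for a, t in zip(approx, target))
    return (total * width) ** (1.0 / p)


m = 2000
width = 1.0 / m
xs = [(i + 0.5) * width for i in range(m)]

# Hypothetical increasing target and a non-monotone approximation to it.
f0 = [x ** 2 for x in xs]
f_hat = [x ** 2 + 0.2 * math.sin(6 * math.pi * x) for x in xs]
f_star = sorted(f_hat)           # rearranged approximation

# For each p: (error of the original approximation, error after rearrangement).
errors = {
    p: (lp_error(f_hat, f0, p, width), lp_error(f_star, f0, p, width))
    for p in (1, 2, 4, math.inf)
}
```

In this discretized setting the second entry of each pair should never exceed the first, and for $p = 2$ it should be strictly smaller because the approximation is decreasing on parts of the interval where the target increases.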
Remark 2. The following immediate implication of the above finite-sample result is also worth emphasizing: The rearranged estimate $\hat{f}^*$ inherits the $L_p$ rates of convergence from the original estimates $\hat{f}$. For $p \in [1,\infty]$, if $\lambda_n = \left[ \int_{\mathcal{X}} |f_0(x) - \hat{f}(x)|^p dx \right]^{1/p} = O_P(a_n)$ for some sequence of constants $a_n$, then $\left[ \int_{\mathcal{X}} |f_0(x) - \hat{f}^*(x)|^p dx \right]^{1/p} \le \lambda_n = O_P(a_n)$. However, while the rate is the same, the error itself is smaller.

Remark 3. Finally, one can also consider weighted rearrangements that can accentuate the quality of approximation in various areas. Indeed, consider an absolutely continuous distribution function $\Lambda$ on $\mathcal{X} = [a,b]$; then we have that, for $(\hat{f} \circ \Lambda^{-1})^*(u)$ denoting the rearrangement of $u \mapsto \hat{f}(\Lambda^{-1}(u))$,

$$ \left[ \int_{[0,1]} \left| (\hat{f} \circ \Lambda^{-1})^*(u) - f_0(\Lambda^{-1}(u)) \right|^p du \right]^{1/p} \le \left[ \int_{[0,1]} \left| \hat{f}(\Lambda^{-1}(u)) - f_0(\Lambda^{-1}(u)) \right|^p du \right]^{1/p}, \tag{2.3} $$

or, equivalently, by a change of variable, setting

$$ \hat{f}^*_{\Lambda}(x) = (\hat{f} \circ \Lambda^{-1})^*(\Lambda(x)), $$

we have that

$$ \left[ \int_{\mathcal{X}} \left| \hat{f}^*_{\Lambda}(x) - f_0(x) \right|^p d\Lambda(x) \right]^{1/p} \le \left[ \int_{\mathcal{X}} \left| \hat{f}(x) - f_0(x) \right|^p d\Lambda(x) \right]^{1/p}. \tag{2.4} $$

Thus, the function $x \mapsto \hat{f}^*_{\Lambda}(x)$ is the weighted rearrangement that provides improvements in the quality of approximation in the norm that is weighted according to the distribution function $\Lambda$.

In the next section, we apply rearrangements to improve the Edgeworth-Cornish-Fisher and related approximations to distribution and quantile functions.
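A weighted rearrangement can be sketched by sorting on a mesh that is equidistant in $u = \Lambda(x)$ rather than in $x$. The weight `lam`, its inverse `lam_inv`, the target `f0`, and the approximation `f_hat` below are hypothetical choices for illustration; the paper does not prescribe them.

```python
import math


def lam(x):
    # Hypothetical weight: distribution function x**2 on [0, 1] (density 2x).
    return x ** 2


def lam_inv(u):
    return math.sqrt(u)


def f0(x):
    return x                                        # hypothetical increasing target


def f_hat(x):
    return x + 0.25 * math.sin(5 * math.pi * x)     # hypothetical non-monotone approximation


m = 2000
us = [(i + 0.5) / m for i in range(m)]   # equidistant mesh in u = Lambda(x)
xs = [lam_inv(u) for u in us]            # matching points x = Lambda^{-1}(u)

g = [f_hat(x) for x in xs]               # u -> f_hat(Lambda^{-1}(u)) on the mesh
g_star = sorted(g)                       # its rearrangement, i.e. the weighted rearrangement at xs
target = [f0(x) for x in xs]


def weighted_lp(values, p):
    # An integral against dLambda(x) becomes a plain average over the u-mesh.
    return (sum(abs(v - t) ** p for v, t in zip(values, target)) / m) ** (1.0 / p)


err_before = weighted_lp(g, 2)
err_after = weighted_lp(g_star, 2)
```

The design choice is that the weighting enters only through the mesh: points are denser where $\Lambda$ puts more mass, so sorting on that mesh reduces the $\Lambda$-weighted error in the sense of (2.4).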
3. Improving Edgeworth-Cornish-Fisher and Related Expansions

3.1. Improving Quantile Approximations by Rearrangement. We first consider the quantile case. Let $Q_n$ be the quantile function of a statistic $X_n$, i.e.

$$ Q_n(u) = \inf\{ x \in \mathbb{R} : \Pr[X_n \le x] \ge u \}, $$

which we assume to be strictly increasing. Let $\hat{Q}_n$ be an approximation to the quantile function $Q_n$ satisfying the following relation on an interval $U_n = [\varepsilon_n, 1 - \varepsilon_n] \subseteq [0,1]$:

$$ Q_n(u) = \hat{Q}_n(u) + \epsilon_n(u), \quad |\epsilon_n(u)| \le a_n, \quad \text{for all } u \in U_n. \tag{3.1} $$

The leading example of such an approximation is the inverse Edgeworth, or Cornish-Fisher, approximation to the quantile function of a sample mean. If $X_n$ is a standardized sample mean, $X_n = \sum_{i=1}^n (Y_i - E[Y_i]) / \sqrt{\mathrm{Var}(Y_i)}$ based on an i.i.d. sample $(Y_1, \ldots, Y_n)$, then we have the following $J$-th order approximation

$$ Q_n(u) = \hat{Q}_n(u) + \epsilon_n(u), $$
$$ \hat{Q}_n(u) = R_1(\Phi^{-1}(u)) + R_2(\Phi^{-1}(u))/n^{1/2} + \ldots + R_J(\Phi^{-1}(u))/n^{(J-1)/2}, \tag{3.2} $$
$$ |\epsilon_n(u)| \le C n^{-J/2}, \quad \text{for all } u \in U_n = [\varepsilon_n, 1 - \varepsilon_n], \text{ for some } \varepsilon_n \searrow 0 \text{ and } C > 0, $$

provided that a set of regularity conditions, specified in, e.g., Zolotarev (1991), hold. Here $\Phi$ and $\Phi^{-1}$ denote the distribution function and quantile function of a standard normal random variable. The first three terms of the approximation are given by the polynomials,

$$ R_1(z) = z, $$
$$ R_2(z) = \lambda (z^2 - 1)/6, \tag{3.3} $$
$$ R_3(z) = \left( 3\kappa (z^3 - 3z) - 2\lambda^2 (2z^3 - 5z) \right)/72, $$

where $\lambda$ is the skewness and $\kappa$ is the kurtosis of the random variable $Y$. The Cornish-Fisher approximation is one of the central approximations of the asymptotic statistics. Unfortunately, inspection of the expression for polynomials (3.3) reveals that this approximation does not generally deliver a monotone estimate of the quantile function. This shortcoming has been pointed out and discussed in detail by Hall (1992). The nature of the polynomials constructed is such that there always exists a large enough range $U_n$ over which the Cornish-Fisher approximation is not monotone, cf. Hall (1992). As an example, in the case of the second order approximation ($J = 2$) we have that for $\lambda < 0$

$$ \hat{Q}_n(u) \searrow -\infty, \quad \text{as } u \nearrow 1, \tag{3.4} $$

that is, the Cornish-Fisher "quantile" function $\hat{Q}_n$ is decreasing far enough in the tails. This example merely suggests a potential problem that may apply to practically relevant ranges of probability indices $u$. Indeed, specific numerical examples given below show that in small samples the non-monotonicity can occur in practically relevant ranges. Of course, in sufficiently large samples, the regions of non-monotonicity are squeezed quite far into the tails.
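The Cornish-Fisher polynomials in (3.2)-(3.3) are easy to evaluate, which makes the non-monotonicity visible on a mesh. The sketch below is illustrative only. It assumes skewness 8 and kurtosis 96, which are the values a gamma variable with shape 1/16 would have if "kurtosis" means excess kurtosis; the sample size, the interval $[0.01, 0.99]$, the mesh, and the function name `cornish_fisher` are likewise choices made here.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cornish_fisher(u, n, skew, kurt, order=3):
    """Cornish-Fisher approximation (3.2)-(3.3) to the u-quantile, for order J <= 3."""
    z = STD_NORMAL.inv_cdf(u)
    r = [
        z,
        skew * (z ** 2 - 1) / 6,
        (3 * kurt * (z ** 3 - 3 * z) - 2 * skew ** 2 * (2 * z ** 3 - 5 * z)) / 72,
    ]
    # The j-th polynomial (0-based) is scaled by n**(-j/2).
    return sum(r[j] / n ** (j / 2) for j in range(order))


# Assumed inputs for illustration: n = 4, skewness 8, excess kurtosis 96.
eps, m = 0.01, 981
us = [eps + i * (1 - 2 * eps) / (m - 1) for i in range(m)]
q_hat = [cornish_fisher(u, n=4, skew=8.0, kurt=96.0) for u in us]
q_star = sorted(q_hat)           # rearranged Cornish-Fisher approximation

# Number of mesh steps on which the raw approximation decreases.
decreasing_steps = sum(1 for a, b in zip(q_hat, q_hat[1:]) if b < a)
```

With these inputs the third order approximation reduces to a cubic in $z = \Phi^{-1}(u)$ whose derivative is negative for $z$ roughly between $-3.8$ and $-0.2$, so the decreasing region covers a large part of the lower half of the probability range rather than only the extreme tails.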
Let $\hat{Q}_n^*$ be the rearrangement of $\hat{Q}_n$. Then we have that for any $p \in [1,\infty]$, the rearranged quantile function reduces the approximation error of the original approximation:

$$ \left[ \int_{U_n} \left| \hat{Q}_n^*(u) - Q_n(u) \right|^p du \right]^{1/p} \le \left[ \int_{U_n} \left| \hat{Q}_n(u) - Q_n(u) \right|^p du \right]^{1/p} \le (1 - 2\varepsilon_n) a_n, \tag{3.5} $$

with the first inequality holding strictly for $p \in (1,\infty)$ whenever $\hat{Q}_n$ is decreasing on a region of $U_n$ of positive measure.

We can give the following probabilistic interpretation to this result. Under condition (3.1), there exists a variable $U = F_n(X_n)$ such that both the stochastic expansion

$$ X_n = \hat{Q}_n(U) + O_p(a_n), \tag{3.6} $$

and the expansion

$$ X_n = \hat{Q}_n^*(U) + O_p(a_n), \tag{3.7} $$

hold,¹ but the variable $\hat{Q}_n^*(U)$ in (3.7) is a better coupling to the statistic $X_n$ than $\hat{Q}_n(U)$ in (3.6), in the following sense: For each $p \in [1,\infty]$,

$$ \left[ E\, 1_n \left| X_n - \hat{Q}_n^*(U) \right|^p \right]^{1/p} \le \left[ E\, 1_n \left| X_n - \hat{Q}_n(U) \right|^p \right]^{1/p}, \tag{3.8} $$

where $1_n = 1\{U \in U_n\}$. Indeed, property (3.8) immediately follows from (3.5).

¹ $\hat{Q}_n(U)$ is defined only on $U_n$, so we can set $\hat{Q}_n(U) = Q_n(U)$ outside $U_n$, if needed. Of course, $U \notin U_n$ with probability going to zero.

The above improvements apply in the context of the sample mean $X_n$. In this case, the probabilistic interpretation above is directly connected to the higher order central limit theorem of Zolotarev (1991), which states that under (3.2), we have the following higher-order probabilistic central limit theorem,

$$ X_n = \hat{Q}_n(U) + O_p(n^{-J/2}). \tag{3.9} $$

The term $\hat{Q}_n(U)$ is Zolotarev's high-order refinement over the first order normal terms $\Phi^{-1}(U)$. Sun, Loader, and McCormick (2000) employ an analogous higher-order probabilistic central limit theorem to improve the construction of confidence intervals.

The application of the rearrangement to Zolotarev's term in fact delivers a clear improvement in the sense that it also leads to a probabilistic higher order central limit theorem

$$ X_n = \hat{Q}_n^*(U) + O_p(n^{-J/2}), \tag{3.10} $$

but the leading term $\hat{Q}_n^*(U)$ is closer to $X_n$ than Zolotarev's term $\hat{Q}_n(U)$, in the sense of (3.8).

We summarize the above discussion into a formal proposition.

Proposition 2. If expansion (3.1) holds, then the improvement (3.5) necessarily holds. The improvement is necessarily strict if $\hat{Q}_n$ is decreasing over a region of $U_n$ that has a positive measure. In particular, this improvement property applies to the inverse Edgeworth approximation defined in (3.2) to the quantile function of the sample mean.

3.2. Improving Distributional Approximations by Rearrangement. We next consider distribution functions. Let $F_n(x)$ be the distribution function of a statistic $X_n$, and $\hat{F}_n(x)$ be the approximation to this distribution such that the following relation holds:

$$ F_n(x) = \hat{F}_n(x) + \epsilon_n(x), \quad |\epsilon_n(x)| \le a_n, \quad \text{for all } x \in \mathcal{X}_n, \tag{3.11} $$

where $\mathcal{X}_n = [-b_n, b_n]$ is an interval in $\mathbb{R}$ for some sequence of positive numbers $b_n$ possibly growing to infinity.

The leading example of such an approximation is the Edgeworth expansion of the distribution function of a sample mean. If $X_n$ is a standardized sample mean, $X_n = \sum_{i=1}^n (Y_i - E[Y_i]) / \sqrt{\mathrm{Var}(Y_i)}$ based on an i.i.d. sample $(Y_1, \ldots, Y_n)$, then we have the following $J$-th order approximation

$$ F_n(x) = \hat{F}_n(x) + \epsilon_n(x), $$
$$ \hat{F}_n(x) = P_1(x) + P_2(x)/n^{1/2} + \ldots + P_J(x)/n^{(J-1)/2}, \tag{3.12} $$
$$ |\epsilon_n(x)| \le C n^{-J/2}, \quad \text{for all } x \in \mathcal{X}_n, $$

for some $C > 0$, provided that a set of regularity conditions, specified in e.g. Hall (1992), hold. The first three terms of the approximation are given by

$$ P_1(x) = \Phi(x), $$
$$ P_2(x) = -\lambda (x^2 - 1)\phi(x)/6, \tag{3.13} $$
$$ P_3(x) = -\left( 3\kappa (x^3 - 3x) + \lambda^2 (x^5 - 10x^3 + 15x) \right)\phi(x)/72, $$

where $\Phi$ and $\phi$ denote the distribution function and density function of a standard normal random variable, and $\lambda$ and $\kappa$ are the skewness and kurtosis of the random variable $Y$. The Edgeworth approximation is one of the central approximations of the asymptotic statistics. Unfortunately, like the Cornish-Fisher expansion it generally does not provide a monotone estimate of the distribution function. This shortcoming has been pointed out and discussed in detail by (Barton and Dennis 1952, Draper and Tierney 1972, Sargan 1976, Balitskaya and Zolotuhina 1988), among others.

Let $\hat{F}_n^*$ be the rearrangement of $\hat{F}_n$. Then, we have that for any $p \in [1,\infty]$, the rearranged Edgeworth approximation reduces the approximation error of the original Edgeworth approximation:

$$ \left[ \int_{\mathcal{X}_n} \left| \hat{F}_n^*(x) - F_n(x) \right|^p dx \right]^{1/p} \le \left[ \int_{\mathcal{X}_n} \left| \hat{F}_n(x) - F_n(x) \right|^p dx \right]^{1/p} \le 2 b_n a_n, \tag{3.14} $$

with the first inequality holding strictly for $p \in (1,\infty)$ whenever $\hat{F}_n$ is decreasing on a region of $\mathcal{X}_n$ of positive measure.

Proposition 3. If expansion (3.11) holds, then the improvement (3.14) necessarily holds. The improvement is necessarily strict if $\hat{F}_n$ is decreasing over a region of $\mathcal{X}_n$ that has a positive measure. In particular, this improvement property applies to the Edgeworth approximation defined in (3.12) to the distribution function of the sample mean.
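The Edgeworth terms in (3.12)-(3.13) can be coded in the same way as the Cornish-Fisher terms. The sketch below is an illustration, not the authors' implementation: the skewness 8 and kurtosis 96 (excess kurtosis of a gamma variable with shape 1/16), the sample size, the interval $[-3, 3]$, and the name `edgeworth_cdf` are assumptions made here.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def edgeworth_cdf(x, n, skew, kurt, order=3):
    """Edgeworth approximation (3.12)-(3.13) to the distribution function, order J <= 3."""
    phi = STD_NORMAL.pdf(x)
    p = [
        STD_NORMAL.cdf(x),
        -skew * (x ** 2 - 1) * phi / 6,
        -(3 * kurt * (x ** 3 - 3 * x)
          + skew ** 2 * (x ** 5 - 10 * x ** 3 + 15 * x)) * phi / 72,
    ]
    # The j-th term (0-based) is scaled by n**(-j/2).
    return sum(p[j] / n ** (j / 2) for j in range(order))


# Assumed inputs for illustration: n = 4, skewness 8, excess kurtosis 96.
b, m = 3.0, 601
xs = [-b + i * 2 * b / (m - 1) for i in range(m)]
f_hat = [edgeworth_cdf(x, n=4, skew=8.0, kurt=96.0) for x in xs]
f_star = sorted(f_hat)           # rearranged Edgeworth approximation

# Number of mesh steps on which the raw approximation decreases.
decreasing_steps = sum(1 for a, b_val in zip(f_hat, f_hat[1:]) if b_val < a)
```

Sorting restores monotonicity but leaves the set of values unchanged, so a raw approximation that strays outside $[0, 1]$, as this one does near $x = 1$, still does so after rearrangement.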
3.3. Weighted Rearrangement of Cornish-Fisher and Edgeworth Expansions. In some cases, it could be worthwhile to weigh different areas of support differently than the Lebesgue (flat) weighting prescribes. For example, it might be desirable to rearrange $\hat{F}_n$ using $F_n$ as a weight measure. Indeed, using $F_n$ as a weight, we obtain a better coupling to the P-value: $P = F_n(X_n)$ (in this quantity, $X_n$ is drawn according to the true $F_n$). Using such a weight will provide a probabilistic interpretation for the rearranged Edgeworth expansion, in analogy to the probabilistic interpretation for the rearranged Cornish-Fisher expansion. Since the weight is not available we may use the standard normal measure $\Phi$ as the weight measure instead. We may also construct an initial rearrangement with the Lebesgue weight, and use it as weight itself in a further weighted rearrangement (and even iterate in this fashion). Using non-Lebesgue weights may also be desirable when we want the improved approximations to weight the tails more heavily. Whatever the reason might be for further non-Lebesgue weighting, we have the following properties, which follow immediately in view of Remark 3.

Let $\Lambda$ be a distribution function that admits a positive density with respect to the Lebesgue measure on the region $U_n = [\varepsilon_n, 1 - \varepsilon_n]$ for the quantile case and region $\mathcal{X}_n = [-b_n, b_n]$ for the distribution case. Then if (3.1) holds, the $\Lambda$-weighted rearrangement $\hat{Q}^*_{n,\Lambda}$ of the function $\hat{Q}_n$ satisfies

$$ \left[ \int_{U_n} \left| \hat{Q}^*_{n,\Lambda}(u) - Q_n(u) \right|^p d\Lambda(u) \right]^{1/p} \le \left[ \int_{U_n} \left| \hat{Q}_n(u) - Q_n(u) \right|^p d\Lambda(u) \right]^{1/p} \tag{3.15} $$
$$ \le (\Lambda[1 - \varepsilon_n] - \Lambda[\varepsilon_n])\, a_n, \tag{3.16} $$

where the first inequality holds strictly when $\hat{Q}_n$ is decreasing on a subset of positive $\Lambda$-measure. Furthermore, if (3.11) holds, then the $\Lambda$-weighted rearrangement $\hat{F}^*_{n,\Lambda}$ of the function $\hat{F}_n$ satisfies

$$ \left[ \int_{\mathcal{X}_n} \left| \hat{F}^*_{n,\Lambda}(x) - F_n(x) \right|^p d\Lambda(x) \right]^{1/p} \le \left[ \int_{\mathcal{X}_n} \left| \hat{F}_n(x) - F_n(x) \right|^p d\Lambda(x) \right]^{1/p} \tag{3.17} $$
$$ \le (\Lambda[b_n] - \Lambda[-b_n])\, a_n. \tag{3.18} $$

4. Numerical Examples
In addition to the lognormal example given in the introduction, we use the gamma distribution to illustrate the improvements that the rearrangement provides. Let $(Y_1, \ldots, Y_n)$ be an i.i.d. sequence of Gamma(1/16, 16) random variables. The statistic of interest is the standardized sample mean $X_n = \sum_{i=1}^n (Y_i - E[Y_i]) / \sqrt{\mathrm{Var}(Y_i)}$. We consider samples of sizes $n = 4, 8, 16,$ and $32$. In this example, the distribution function $F_n$ and quantile function $Q_n$ of the sample mean $X_n$ are available in a closed form, making it easy to compare them to the Edgeworth approximation $\hat{F}_n$ and the Cornish-Fisher approximation $\hat{Q}_n$, as well as to the rearranged Edgeworth approximation $\hat{F}_n^*$ and the rearranged Cornish-Fisher approximation $\hat{Q}_n^*$. For the Edgeworth and Cornish-Fisher approximations, as defined in the previous section, we consider the third order expansions, that is we set $J = 3$.

Figure 2 compares the true distribution function $F_n$, the Edgeworth approximation $\hat{F}_n$, and the rearranged Edgeworth approximation $\hat{F}_n^*$. We see that the rearranged Edgeworth approximation not only fixes the monotonicity problem, but also consistently does a better job at approximating the true distribution than the Edgeworth approximation. Table 1 further supports this point by presenting the numerical results for the $L_p$ approximation errors, calculated according to the formulas given in the previous section. We see that the rearrangement reduces the approximation error quite substantially in most cases.

Figure 3 compares the true quantile function $Q_n$, the Cornish-Fisher approximation $\hat{Q}_n$, and the rearranged Cornish-Fisher approximation $\hat{Q}_n^*$. Here too we see that the rearrangement not only fixes the non-monotonicity problem, but also brings the approximation closer to the truth. Table 2 further supports this point numerically, showing that the rearrangement reduces the $L_p$ approximation error quite substantially in most cases.
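A comparison of this kind can be sketched with the standard library alone. The code below is a hedged illustration, not a reproduction of Tables 1 and 2. It assumes that the first Gamma parameter is the shape, that the standardized mean is scaled to unit variance, and that "kurtosis" means excess kurtosis; the interval $[-3, 3]$, the mesh, and the helper names are choices made here, so the resulting numbers need not match the tables.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
SHAPE = 1.0 / 16.0               # assumed shape parameter of each Y_i


def reg_lower_gamma(a, x, terms=200):
    """Regularized lower incomplete gamma P(a, x) from its power series."""
    if x <= 0.0:
        return 0.0
    log_x = math.log(x)
    total = sum(math.exp(-x + (a + k) * log_x - math.lgamma(a + k + 1.0))
                for k in range(terms))
    return min(total, 1.0)


def true_cdf(x, n):
    # A sum of n i.i.d. gamma variables has shape n*SHAPE, and the scale cancels,
    # so F_n(x) = P(n*SHAPE, n*SHAPE + x*sqrt(n*SHAPE)) under the assumed scaling.
    a = n * SHAPE
    return reg_lower_gamma(a, a + x * math.sqrt(a))


def edgeworth_cdf(x, n, skew, kurt):
    """Third order Edgeworth approximation, following (3.12)-(3.13)."""
    phi = STD_NORMAL.pdf(x)
    p2 = -skew * (x ** 2 - 1) * phi / 6
    p3 = -(3 * kurt * (x ** 3 - 3 * x)
           + skew ** 2 * (x ** 5 - 10 * x ** 3 + 15 * x)) * phi / 72
    return STD_NORMAL.cdf(x) + p2 / math.sqrt(n) + p3 / n


def lp_error(approx, target, p, width):
    if math.isinf(p):
        return max(abs(a - t) for a, t in zip(approx, target))
    return (sum(abs(a - t) ** p for a, t in zip(approx, target)) * width) ** (1.0 / p)


n, b, m = 4, 3.0, 1200
width = 2 * b / m
xs = [-b + (i + 0.5) * width for i in range(m)]
skew, kurt = 2.0 / math.sqrt(SHAPE), 6.0 / SHAPE    # gamma skewness and excess kurtosis

truth = [true_cdf(x, n) for x in xs]
f_hat = [edgeworth_cdf(x, n, skew, kurt) for x in xs]
f_star = sorted(f_hat)

# For each p: (Edgeworth error, rearranged Edgeworth error) on the mesh.
errors = {p: (lp_error(f_hat, truth, p, width), lp_error(f_star, truth, p, width))
          for p in (1, 2, math.inf)}
```

On the mesh the rearranged error should not exceed the original error for any of the three norms, in line with (3.14). The size of the gain depends on the interval and mesh chosen.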
5. Conclusion

In this paper, we have applied the rearrangement procedure to monotonize Edgeworth and Cornish-Fisher expansions and other related expansions of distribution and quantile functions. The benefits of doing so are twofold. First, we have obtained estimates of the distribution and quantile curves of the statistics of interest which satisfy the logical monotonicity restriction, unlike those directly given by the truncation of the series expansions. Second, we have shown that doing so resulted in better approximation properties.
Appendix A. Proof of Proposition 1

We consider the case where $\mathcal{X} = [0,1]$ only, as the more general intervals can be dealt with similarly. The first part establishes the weak inequality, following in part the strategy in Lorentz's (1953) proof. The proof focuses directly on obtaining the result stated in the proposition. The second part establishes the strong inequality.

Proof of Part 1. We assume at first that the functions $\hat{f}(\cdot)$ and $f_0(\cdot)$ are simple functions, constant on intervals $((s-1)/r, s/r]$, $s = 1, \ldots, r$. For any simple $f(\cdot)$ with $r$ steps, let $f$ denote the $r$-vector with the $s$-th element, denoted $f_s$, equal to the value of $f(\cdot)$ on the $s$-th interval. Let us define the sorting operator $S(f)$ as follows: Let $\ell$ be an integer in $1, \ldots, r$ such that $f_\ell > f_m$ for some $m > \ell$. If $\ell$ does not exist, set $S(f) = f$. If $\ell$ exists, set $S(f)$ to be an $r$-vector with the $\ell$-th element equal to $f_m$, the $m$-th element equal to $f_\ell$, and all other elements equal to the corresponding elements of $f$. For any submodular function $L : \mathbb{R}^2 \to \mathbb{R}_+$, by $f_\ell > f_m$, $f_{0m} \ge f_{0\ell}$ and the definition of the submodularity,

$$ L(f_m, f_{0\ell}) + L(f_\ell, f_{0m}) \le L(f_\ell, f_{0\ell}) + L(f_m, f_{0m}). $$

Therefore, we conclude that

$$ \int_{\mathcal{X}} L(S(\hat{f})(x), f_0(x))\,dx \le \int_{\mathcal{X}} L(\hat{f}(x), f_0(x))\,dx, \tag{A.1} $$

using that we integrate simple functions.

Applying the sorting operation a sufficient finite number of times to $\hat{f}$, we obtain a completely sorted, that is, rearranged, vector $\hat{f}^*$. Thus, we can express $\hat{f}^*$ as a finite composition $\hat{f}^* = S \circ \ldots \circ S(\hat{f})$. By repeating the argument above, each composition weakly reduces the approximation error. Therefore,

$$ \int_{\mathcal{X}} L(\hat{f}^*(x), f_0(x))\,dx \le \int_{\mathcal{X}} L(S(\hat{f})(x), f_0(x))\,dx \le \int_{\mathcal{X}} L(\hat{f}(x), f_0(x))\,dx. \tag{A.2} $$

Furthermore, this inequality is extended to general measurable functions $\hat{f}(\cdot)$ and $f_0(\cdot)$ mapping $\mathcal{X}$ to $K$ by taking a sequence of bounded simple functions $\hat{f}^{(r)}(\cdot)$ and $f_0^{(r)}(\cdot)$ converging to $\hat{f}(\cdot)$ and $f_0(\cdot)$ almost everywhere as $r \to \infty$. The almost everywhere convergence of $\hat{f}^{(r)}(\cdot)$ to $\hat{f}(\cdot)$ implies the almost everywhere convergence of its quantile function $\hat{f}^{*(r)}(\cdot)$ to the quantile function of the limit, $\hat{f}^*(\cdot)$. Since inequality (A.2) holds along the sequence, the dominated convergence theorem implies that (A.2) also holds for the general case. □

Proof of Part 2. Let us first consider the case of simple functions, as defined in Part 1. We take the functions to satisfy the following hypotheses: there exist regions $\mathcal{X}_0$ and $\mathcal{X}_0'$, each of measure greater than $\delta > 0$, such that for all $x \in \mathcal{X}_0$ and $x' \in \mathcal{X}_0'$, we have that (i) $x' > x$, (ii) $\hat{f}(x) > \hat{f}(x') + \epsilon$, and (iii) $f_0(x') > f_0(x) + \epsilon$, for $\epsilon > 0$ specified in the proposition. For any strictly submodular function $L : \mathbb{R}^2 \to \mathbb{R}_+$ we have that

$$ \eta = \inf\{ L(v', t) + L(v, t') - L(v, t) - L(v', t') \} > 0, $$

where the infimum is taken over all $v, v', t, t'$ in the set $K$ such that $v' \ge v + \epsilon$ and $t' \ge t + \epsilon$.

We can begin sorting by exchanging an element $\hat{f}(x)$, $x \in \mathcal{X}_0$, of the $r$-vector $\hat{f}$ with an element $\hat{f}(x')$, $x' \in \mathcal{X}_0'$, of the $r$-vector $\hat{f}$. This induces a sorting gain of at least $\eta$ times $1/r$. The total mass of points that can be sorted in this way is at least $\delta$. We then proceed to sort all of these points in this way, and then continue with the sorting of other points. After the sorting is completed, the total gain from sorting is at least $\delta\eta$. That is,

$$ \int_{\mathcal{X}} L(\hat{f}^*(x), f_0(x))\,dx \le \int_{\mathcal{X}} L(\hat{f}(x), f_0(x))\,dx - \delta\eta. $$

We then extend this inequality to the general measurable functions exactly as in the proof of part one. □
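The sorting operator used in the proof of Part 1 can be traced step by step on a small vector. The sketch below is an illustration under assumptions: the vectors `f` and `f0`, the rule of swapping the first inverted pair found, and the loss $\sum_s |f_s - f_{0s}|^p$ with $p = 2$ are hypothetical choices. Integer entries keep the arithmetic exact.

```python
def sorting_step(f):
    """One application of a sorting operator S: swap the first inverted pair found."""
    r = len(f)
    for i in range(r):
        for j in range(i + 1, r):
            if f[i] > f[j]:
                g = list(f)
                g[i], g[j] = f[j], f[i]
                return g
    return list(f)               # no inverted pair left: S(f) = f


def loss(f, f0, p=2):
    """Discrete analogue of the integrated discrepancy with L(v, t) = |v - t|**p."""
    return sum(abs(a - t) ** p for a, t in zip(f, f0))


f0 = [1, 2, 3, 4, 5, 6, 7, 8]    # increasing target (hypothetical)
f = [3, 1, 4, 1, 5, 9, 2, 6]     # non-monotone approximation (hypothetical)

# Apply S repeatedly until the vector stops changing, recording each stage.
path = [f]
while sorting_step(path[-1]) != path[-1]:
    path.append(sorting_step(path[-1]))

losses = [loss(v, f0) for v in path]
```

Each swap removes at least one inversion, so the loop terminates with the fully sorted vector, and the recorded losses form a non-increasing sequence, which mirrors the chain of inequalities in the proof.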
References

Balitskaya, E. O., and L. A. Zolotuhina (1988): "On the representation of a density by an Edgeworth series," Biometrika, 75(1), 185-187.

Barton, D. E., and K. E. Dennis (1952): "The conditions under which Gram-Charlier and Edgeworth curves are positive definite and unimodal," Biometrika, 39, 425-427.

Bhattacharya, R. N., and R. Ranga Rao (1976): Normal approximation and asymptotic expansions. John Wiley & Sons, New York-London-Sydney, Wiley Series in Probability and Mathematical Statistics.

Blinnikov, S., and R. Moessner (1998): "Expansions for nearly Gaussian distributions," Astron. Astrophys. Suppl. Ser., 130, 193-205.

Bronshtein, I. N., K. A. Semendyayev, G. Musiol, H. Muehlig, and H. Mühlig (2004): Handbook of mathematics. Springer-Verlag, Berlin, english edn.

Chernozhukov, V., I. Fernandez-Val, and A. Galichon (2006): "Improving Approximations of Monotone Functions by Rearrangement," MIT and UCL CEMMAP Working Paper, available at www.arxiv.org, cemmap.ifs.org.uk, www.ssrn.com.

Cornish, E. A., and S. R. A. Fisher (1938): "Moments and Cumulants in the Specification of Distributions," Review of the International Statistical Institute, 5, 307-320.

Cramer, H. (1999): Mathematical methods of statistics, Princeton Landmarks in Mathematics. Princeton University Press, Princeton, NJ, Reprint of the 1946 original.

Draper, N. R., and D. E. Tierney (1972): "Regions of positive and unimodal series expansion of the Edgeworth and Gram-Charlier approximations," Biometrika, 59, 463-465.

Edgeworth, F. Y. (1905): "The law of error," Trans. Camb. Phil. Society, 20, 36-65, 113-41.

Edgeworth, F. Y. (1907): "On the representation of a statistical frequency by a series," J. R. Stat. Soc., 70, 102-6.

Fisher, S. R. A., and E. A. Cornish (1960): "The percentile points of distributions having known cumulants," Technometrics, 2, 209-225.

Hall, P. (1992): The bootstrap and Edgeworth expansion, Springer Series in Statistics. Springer-Verlag, New York.

Hardy, G. H., J. E. Littlewood, and G. Polya (1952): Inequalities. Cambridge University Press, 2d ed.

Lorentz, G. G. (1953): "An inequality for rearrangements," Amer. Math. Monthly, 60, 176-179.

Rothenberg, T. J. (1984): "Approximating the Distributions of Econometric Estimators and Test Statistics," in Handbook of econometrics, Vol. II, vol. 2. North-Holland, Amsterdam, edited by Zvi Griliches and Michael D. Intriligator.

Sargan, J. D. (1976): "Econometric estimators and the Edgeworth approximation," Econometrica, 44(3), 421-448.

Sun, J., C. Loader, and W. P. McCormick (2000): "Confidence bands in generalized linear models," Ann. Statist., 28(2), 429-460.

van der Vaart, A. W. (1998): Asymptotic statistics. Cambridge University Press, Cambridge.

Villani, C. (2003): Topics in optimal transportation, vol. 58 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI.

Zolotarev, V. M. (1991): "Asymptotic expansions with probability 1 of sums of independent random variables," Teor. Veroyatnost. i Primenen., 36(4), 770-772.
Table 1. Estimation errors for approximations to the distribution function of the standardized sample mean from a Gamma(1/16, 16) population.

              First Order   Third Order (TO)   TO-Rearranged (RTO)   Ratio (RTO/TO)

A. n = 4
  L_1            0.07           0.05                0.02                0.38
  L_2            0.10           0.06                0.03                0.45
  L_3            0.13           0.07                0.04                0.62
  L_4            0.15           0.08                0.07                0.81
  L_∞            0.30           0.48                0.48                1.00

B. n = 8
  L_1            0.06           0.03                0.01                0.45
  L_2            0.08           0.04                0.02                0.63
  L_3            0.10           0.05                0.04                0.85
  L_4            0.11           0.06                0.06                0.96
  L_∞            0.23           0.28                0.28                1.00

C. n = 16
  L_1            0.05           0.01                0.01                0.97
  L_2            0.06           0.02                0.02                0.99
  L_3            0.08           0.03                0.03                1.00
  L_4            0.08           0.04                0.04                1.00
  L_∞            0.15           0.11                0.11                1.00

D. n = 32
  L_1            0.04           0.01                0.01                1.00
  L_2            0.05           0.01                0.01                1.00
  L_3            0.06           0.01                0.01                1.00
  L_4            0.06           0.01                0.01                1.00
  L_∞            0.09           0.03                0.03                1.00
Table 2. Estimation errors for approximations to the quantile function of the standardized sample mean from a Gamma(1/16, 16) population.

              First Order   Third Order (TO)   TO-Rearranged (RTO)   Ratio (RTO/TO)

A. n = 4
  L_1            0.50           0.24                0.09                0.39
  L_2            0.59           0.32                0.11                0.35
  L_3            0.69           0.42                0.13                0.31
  L_4            0.78           0.52                0.15                0.29
  L_∞            2.04           1.53                0.49                0.32

B. n = 8
  L_1            0.39           0.08                0.03                0.37
  L_2            0.47           0.11                0.04                0.35
  L_3            0.56           0.16                0.05                0.32
  L_4            0.65           0.21                0.06                0.30
  L_∞            1.66           0.67                0.22                0.33

C. n = 16
  L_1            0.28           0.02                0.02                0.97
  L_2            0.35           0.04                0.03                0.84
  L_3            0.43           0.05                0.04                0.71
  L_4            0.51           0.07                0.04                0.63
  L_∞            1.34           0.24                0.10                0.44

D. n = 32
  L_1            0.20           0.01                0.01                1.00
  L_2            0.26           0.01                0.01                1.00
  L_3            0.31           0.02                0.02                1.00
  L_4            0.37           0.02                0.02                1.00
  L_∞            1.02           0.07                0.07                1.00
[Figure 2. Distribution Functions, First Order Approximations, Third Order Approximations, and Rearrangements for the standardized sample mean from a Gamma(1/16, 16) population. Four panels: n = 4, n = 8, n = 16, n = 32. Horizontal axis: std. mean. Curves: True, FO Approximation, TO Approximation, TO - Rearranged.]
[Figure 3. Quantile Functions, First Order Approximations, Third Order Approximations, and Rearrangements for the standardized sample mean from a Gamma(1/16, 16) population. Four panels: n = 4, n = 8, n = 16, n = 32. Horizontal axis: prob. Curves: True, FO Approximation, TO Approximation, TO - Rearranged.]