Massachusetts Institute of Technology Department of Economics Working Paper Series REARRANGING EDGEWORTH-CORNISH-FISHER EXPANSIONS Victor Chemozhukov Ivan Fernandez- Val Alfred Galichon Working Paper 07-20 August 13,2007 Room E52-251 50 Memorial Drive Cambridge, MA 021 42 This paper can be downloaded without charge from the Social Science Research Network Paper Collection at http://ssrn.com/abstract=1007282 REARRANGING EDGEWORTH-CORNISH-FISHER EXPANSIONS VICTOR CHERNOZHUKOV+ Abstract. This paper ment to IVAN FERNANDEZ- VAL§ ALFRED GALICHON^ applies a regularization procedure called increasing rearrange- monotonize Edgeworth and Cornish-Fisher expansions and any other related approximations of distribution and quantile functions of sample isfying the logical monotonicity, required of distribution statistics. Besides sat- and quantile functions, the pro- cedure often delivers strikingly better approximations to the distribution and quantile functions of the sample mean than Key Words: Edgeworth Date: The results of this October 2006. Durbin, Ivar Ekeland, Ilya is first presented at the Statistics Seminar in Cornell University, of August 2007. Xuming He, We would like like to Joel Horowitz, Roger Koenker, Molchanov, Francesca Molinari, and Peter Phillips would also at expansion, Cornish-Fisher expansion, rearrangement paper were This version the original Edgeworth-Cornish-Fisher expansions. thank Andrew Chesher, James Enno Mammen, Charles Manski, for the helpful discussions and comments. We to thank seminar participants at Cornell, University of Chicago, University of Illinois Urbana-Champaign, Cowless Foundation 75th Anniversary Conference, Transportation Conference at the University of Columbia, and the CEMMAP "Measurement Matters" Conference for helpful comments. t Massachusetts Institute of Technology, Department of Economics and the University College Castle Krob of London, CEMMAP. & Operations Research Center, E-mail: vchern@mit.edu. Research support from the Chair, National Science Foundation, the Sloan Foundation, and CEMMAP is gratefully acknowledged. § Boston University, Department of Economics. &mail: ivanf@bu.edu. X Harvard University, Department of Economics. E-mail: alfred.galichon@post.harvard.edu. search support from the Conseil General des Mines and the National Science Foundation acknowledged. 1 is Re- gratefully improving edgeworth approximations 2 1. Introduction Higher order approximations to the distribution of sample beyond the statistics or- der n"^/^ provided by the central limit theorem are of central interest in the theory of asymptotic statistics, see e.g. (Blinnikov and Moessner 1998, Cramer 1999, Bhat- tacharya and Ranga Rao 1976, Hall 1992, Rothenberg 1984, van der Vaart 1998). important tool for performing these refinements sion (Edgeworth 1905, statistics of interest is An provided by the Edgeworth expan- Edgeworth 1907), which approximates the distribution of the around the limit distribution (often the normal distribution) by a combination of Hermite polynomials, with coefficients defined in terms of moments. Inverting the expansion yields a related higher order approximation, the Cornish-Fisher expansion (Cornish and Fisher 1938, Fisher and Cornish 1960), to the quantiles of the statistic One is around the quantiles of the limiting distribution. of the important shortcomings of either Edgeworth or Cornish-Fisher expansions that the resulting approximations to the distribution and quantile functions are not monotonia, which violates an obvious monotonicity requirement. This comes from the fact that the polynomials involved in the expansion are not monotone. Here we propose to use a procedure, called the rearrangement, to restore the monotonicity of the approx- imations and, perhaps more importantly, to improve the estimation properties of these approximations. The resulting improvement necessarily brings the is due to the non-monotone approximations fact that the rearrangement closer to to the true monotone target function. The main Figure 1. findings of the paper can be illustrated through a single picture given as In that picture, we plot the true distribution function of a sample mean X based on a small sample, a third order Edgeworth approximation to that distribution, and a rearrangement of approximation is this third order approximation. We see that the Edgeworth sharply non-monotone and provides a rather poor approximation to the distribution function. The rearrangement merely sorts the value of the approximate IMPROVING EDGEWORTH APPROXIMATIONS Edgeworth Expansion: lognormal (n = 5) P(y) (Edgeworth) F(y) (Rearranged) F*(y)(True-MC) —r-5 Figure 1. log- normal Distribution function for the standarized sample random variables (sample size 5), the third order mean of Edgeworth approximation, and the rearranged third order Edgeworth approximation. distribution function in an increasing order. imation, in addition to being monotonic, is One can a much see that the rearranged approx- better approximation to the true function than the original approximation. We organize the rest of the paper as follows. In Section 2, we describe the rearrange- ment and qualify the approximation property provides for monotonic functions. In section we introduce the rearranged Edgeworth-Cornish-Fisher expansions and ex- plain 3, how Section 4, it these produce better estimates of distributions and quantiles of statistics. In we illustrate the procedure with several additional examples. improving edgeworth approximations 4 Improving Approximations of Monotone Functions by Rearrangement 2. In what X= [0, 1]. follows, let A' interval. We first Let f{x) be a measurable function mapping = Let Ff{y) be a compact consider an interval of the form X to K, a bounded subset f^ ^{f{u) < y}du denote the distribution function of f{X) when the uniform distribution on [0, 1]. of M. X follows Let r{x) := Qf{x) := inf {y € R : Ff{y) > x} be the quantile function of Ff{y). Thus, f*{x) := inf <^ y e / l{f{u) < y}du > X Jx This function /* ment (see e.g. is Hardy, Littlewood, and Polya (1952) and Villani (2003).) of Chebyshev, monotone distribution who used it That we use this tool to X ~ Semendyayev, improve approximations functions, such as the Edgeworth-Cornish-Fisher approximations to the and quantile functions is, originates in the It to prove a set of inequalities (Bronshtein, of the sample mean. The rearrangement operator simply transforms a /*. rearrange- extensively used in functional analysis and optimal transportation Musiol, Muehlig, and Miihlig 2004). Here of The called the increasing rearrangement of the function /. a tool that is work is x >-^ f*{x) is function / to the quantile function of the random its quantile function variable Another convenient way to think of the rearrangement f/(0, 1). is f{X) when as a sorting operation: Given values of the function f{x) evaluated at x in a fine enough equidistant points, in this way Finally, [a, 6], if is we simply sort the values in is of the form [a,b], let x(a;) > = {x-a)/{b-a) G and /* be the rearrangement of the function f{x) Then, the rearrangement of / is of an increasing order. The function created the rearrangement of /. -Y mesh defined as r{x):=r{x{x)). = [0, 1], x{x) = a+{b — a)x f{x{x)) defined on A^ = G [0, 1]. Digitized by the Internet Archive in 2011 with funding from IVIIT Libraries http://www.archive.org/details/rearrangingedgew0720cher IMPROVING EDGEWORTH APPROXIMATIONS The following result Proposition -^ [a, b] K detailed in Chernozhukov, Fernandez-Val, is This 1. measurable function in x, where the target function that is an be another measurable function, function and Galichon (Improving Approximation of Monotone Functions). Let fo 1 be a weakly increasing subset of R. 5 we want to K (2006). X = : a bounded is approximate. Let f : X -^ K approximation or an estimate of the target initial /q. For any p G [1, oo], the rearrangement of f, denoted f*, weakly reduces the estimation error: rix) ll/p i/p p -- < dx fo{x) 2. Suppose that there that for all > fo{x') strict for fo{x) + x' G Xq we have for some e, e > dx foix] (2.1) Aq and Xq, each of measure greater than exist regions x E Xq and fix) [L X - that Then 0. p € (l,oo). Namely, for any p G > (i) x' > x, (ii) f{x) f{x') 1 where the 5;, rjp = mi{\v t'\^ infimum taken over = is + 1^' — t\^ all v, v', t, 1 (Strict Improvement). decreasing over a subset of Lp norm, forp G (l,oo), The first inequality X \l Jx - fix) foix) \v' — in the set K such that (Hi) t|P t'\^} and v' dx - > rjp Sxrjp (2.2) forp G >v + e and t' (l,C)o), with >t + e; and X and is X // the target function /o is increasing over that has positive measure, then the part of the proposition states the is strict for p G (1, oo) if As an weak inequality (2.1), we mean and the second implication, the Corollary states that the the original estimate fix) strictly increasing throughout). Of course, decreasing on a subset is having positive measure, while the target function foix) increasing, improvement in necessarily strict. part states the strict inequality (2.2). of and [l,oo], - -\v — t' such S/ib-a). Corollary f — e, Q, i/p < dx + > the gain in the quality of approximation is 1/) nx) - foix) 5 is if increasing on foix) is X (by constant, then IMPROVING EDGEWORTH APPROXIMATIONS 6 the inequality (2.1) becomes an equality, as the distribution of the rearranged function /* is the same as the distribution of the original function /, that is Ft, = Ft. This proposition establishes that the rearranged estimate /* has a smaller estimation norm than error in the Lp This size is the original estimate whenever the latter a very useful and generally applicable property that and of the way the Remark 1. An original approximation indirect proof of the / to /o weak inequality consequence of the following classical inequality due two functions mapping X to K to is not monotone. independent of the sample obtained. (2.1) is a simple hut important Lorentz (1953): Let q and g he a hounded suhset of R. , is is Let q* and g* denote their corresponding increasing rearrangements. Then, < L L{q*{x),g*{x),x)dx for any submodular discrepancy function g{x) = and g*{x) fo{x), everywhere, that jiw — f 1^ For p is = is, = /o(a:). suhmodular for p G [1, oo) L : R^ Now, note the target function . L{q{x),g{x),x)dx, / Jx IX is its >-^ M.. f{x), q*{x) that in our case /o(x) = = f*{x), fo{x) almost own rearrangement. Moreover, L{v,w,x) = This proves the first part of the proposition ahove. the first part follows hy taking the limit as oo, = Set q{x) p —> og. In the Appendix, for completeness, we restate the proof of Chernozhukov, Fernandez- Val, and Galichon (2006) of the strong inequality (2.2) as well as the direct proof of the weak inequality (2.1). Remark 2. The following immediate implication of the ahove finite-sample result is also worth emphasizing: The rearranged estimate f* inherits the Lp rates of convergence from the original estimates f. some sequence while the rate Remark For p G [l,oo], ^/A„ of constants On, then [J^ \fo{x) is the same, the error itself 3. Finally, is = [/^ \fo{x) — f{x)\PduY^^ - f*{x)\PduY^P < = Op{an) for K = Op{an). However, smaller. one can also consider weighted rearrangements that can accentuate the quality of approximation in various areas. Indeed, consider an ahsolutely continuous IMPROVING EDGEWORTH APPROXIMATIONS distribution function rearrangement of u A h->- on X= then we have that for (/ o [a, b], A ^)*{u) denoting the f[A~^{u)) i/p ll/p (7oA-i)*(n)-/o(A-^ u)) [ < du f{A-\u)) - f,{A-\u)) f du U[o,i] J[o,i] (2.3) or, equivalently, by a change of variable, setting mx) = we have {foA-'r{A{x)) that m i/p - x) < dA{x) fo{x) Ux Jx Thus, the function x <—> /X(^) ^^ ii/p - dAix) foix) (2.4) improvements ^^^ weighted rearrangement that provides in the quality of approximation in the norm is weighted according to the distribution we apply rearrangements to improve the Edgeworth-Cornish- that function A. In the next section, Fisher and related approximations to distribution and quantile functions. 3. 3.1. Improving Edgeworth-Cornish-Fisher and Related expansions Improving Quantile Approximations by Rearrangement. We the quantile case. Let Qn be the quantile function of a statistic X„, Qn{u) which we assume to be function Qn inf{x e K : Pr[Xn < strictly increasing. Let satisfying the following relation Qn{u) The = = Qn{u) + en{u), x] > u}, on an interval Un < leading example of such an approximation is a„, for = [e„, 1 X„ = Yl^^ii^i ~ — to the quantile £n] Q ah u G Un- [0, 1]: (3.1) the inverse Edgeworth, or Cornish- Fisher, approximation to the quantile function of a sample mean. dardized sample mean, consider i.e. Qn be an approximation |e„(it)| first If X„ E[Yi\)/ ^/Var{Yi) based on an is is i.i.d. a stan- sample IMPROVING EDGEWORTH APPROXIMATIONS 8 then we have the following J-th order approximation (Vi, ...,Yn), Qn{u) = Qn{u) Qn{u) = R,{^-\u)) + eniu) + R2{^~\u))ln"^ + + Rj{^-\u))/n^'-'y\ ... (3.2) \^n{u)\ for < Cn some e„ "'/^ \ ueUn = for all C> and [e„, 1 - e„], 0, provided that a set of regularity conditions, specified Here $ and $~^ normal random Zolotarev (1991), hold. in, e.g., denote the distribution function and quantile function of a standard variable. The first three terms of the approximation are given by the polynomials, where A is Ri{z) = z, R2{z) = X{z'- Rsiz) = {3k{z^ the skewness and k Fisher approximation is is - l)/6, - 3z) (3.3) - 2X^{2z^ the kurtosis of the - 5z))/72, random variable Y. The Cornish- one of the central approximations of the asymptotic statistics. Unfortunately, inspection of the expression for polynomials (3.3) reveals that this ap- proximation does not generally deliver a monotone estimate of the quantile function. This shortcoming has been pointed and discussed in detail nature of the polynomials constructed is A in the case of the by Hall (1992). The such that there always exists a large enough range Un over which the Cornish-Fisher approximation As an example, e.g. is not monotone, second order approximation ( J = cf. Hall (1992). we have 2) that for < Qn{u) that is, \ -oo, the Cornish-Fisher "quantile" function as u Qn y is u. (3.4) decreasing far enough in the This example merely suggests a potential problem that ranges of probability indices ' I, may tails. apply to practically relevant Indeed, specific numerical examples given below show that in small samples the non-monotonicity can occur in practically relevant ranges. Of IMPROVING EDGEWORTH APPROXIMATIONS 9 course, in sufficiently large samples, the regions of non-monotonicity are squeezed quite far into the tails. Q^ be Let Then we have the rearrangement of Q„. that for any p E [1, the rear- oo], ranged quantile function reduces the approximation error of the original approximation: i/p Q:{u) J - Qn{u) i/p < du - Qn{u) Qn{u) < du - (1 2e„)a„, (3.5) Un with the first region of Un We of positive measure. to this result. p G inequality holding strictly for Under condition (3.1), (1,cxd) whenever Qn decreasing on a is can give the following probabilistic interpretation there exists a variable U= F„(X„) such that both the stochastic expansion Xn = Qn{U) + Op{an), (3.6) Xn^Q:{U) + Op{an), (3.7) and the expansion but the variable Qn{U) in hold,-^ (3.7) is a better coupling to the statistic in (3.6), in the following sense: For each [E1„[X„ where 1„ = 1{U [1, oo], - Qn{U)YY'^ < [EUXn - G Un}- Indeed, property The above improvements apply (3.8) is Qn(f/)]1'/^ (3.8) immediately follows from in the context of the the probabilistic interpretation above limit p G X„ than Qn{U) (3.5). sample mean X„. In this case, directly connected to the higher order central theorem of Zolotarev (1991), which states that under (3.2), we have the following higher-order probabilistic central limit theorem, Xn = Qn{U) + Op{n ^QniU) U ^Un is defined only on Un, so we can with probability going to zero. set Qn{U) -J/2) = Qn{U) (3.9) outside Un, if needed. Of course, IMPROVING EDGEWORTH APPROXIMATIONS 10 The term Qn{U) Zolotarev's high-order refinement over the is $~^(f/). Sun, Loader, and order normal terms (2000) employ an analogous higher-order proba- theorem to improve the construction of confidence bilistic central limit The McCormick first intervals. application of the rearrangement to the Zolotarev's term in fact delivers a clear improvement in the sense that it also leads to a probabilistic higher order central limit theorem = Xr. but the leading term Q^iU) is + Q*^{U) closer to X„ 0,{n-"^), (3.10) than the Zolotarev's term Qn{U), in the sense of (3.8). We summarize the above discussion Proposition 2. If The improvement expansion (3.1) holds, then the improvement (3.5) necessarily holds. is Qn necessarily strict if positive measure. In particular, this worth approximation defined in (3.2) 3.2. into a formal proposition. is decreasing over a region improvement property applies to the quantile ofUn Edge- to the inverse function of the sample mean. Improving Distributional Approximations by Rearrangement. consider distribution functions. that has a We next Let Fn{x) be the distribution function of a statistic Xn, and Fn{x) be the approximation to this distribution such that the following relation holds: Fn{x) where A'„ = growing to The [— 6„, 6„] is =^ Fn{x) + an interval en{x) in , |e„(x) | < -^n, (3.11) infinity. distribution function of a sample mean. ~ aU x G M for some sequence of positive numbers 6„ possibly leading example of such an approximation IZr=i(^« a„, for E\yi]) / y/V ar{Yi) based on an If Xn i.i.d. is is the Edgeworth expansion of the a standardized sample mean, sample (Yl, ...,y„), Xn = then we have the IMPROVING EDGEWORTH APPROXIMATIONS 11 following J-th order approximation Fn{z) = Fniz) + eniz) Fn{x) = P,{x) + P2{x)/n"^ < \^n{x)\ for some hold. where C> The <l> P,ix) = ^x), P2{x) = -X{x^ P^(x) = -{3k{x^ (3.12) x E Xn, e.g. Hall (1992), - l)(/.(x)/6, - 3x) + (3.13) \^{x^ - lOx^ + 15x))(t){x)/72, denote the distribution function and density function of a standard normal random asymptotic Pj{x)/n('-^^'\ three terms of the approximation are given by and variable, and A and k are the skewness and kurtosis of the random The Edgeworth approximation variable Y. + ... provided that a set of regularity conditions, specified in 0, first for all Cn~'^^'^, + statistics. is one of the central approximations of the Unfortunately, like the Cornish-Fisher expansion not provide a monotone estimate of the distribution function. been pointed and discussed in detail it generally does This shortcoming has by (Barton and Dennis 1952, Draper and Tierney 1972, Sargan 1976, Balitskaya and Zolotuhina 1988), Let F* be the rearrangement of F„. among others. Then, we have that any p G for [l,oo], the rearranged Edgeworth approximation reduces the approximation error of the original Edgeworth approximation: i/v \Ip dx F:ix)-Fnix) Jx„ Fn{x) - Fn{x) dx < 26„a„, (3.14) ^n with the first region of Xn inequality holding strictly for p £ (1, oo) whenever F„ is decreasing on a of positive measure. Proposition holds. < 3. // expansion (3.11) holds, The improvement is then the improvement (3.14) necessarily necessarily strict if Fn is decreasing over a region of Xn IMPROVING EDGEWORTH APPROXIMATIONS 12 that has a positive measure. In particular, this improvement property applies to Edge- worth approximation defined in (3.12) 3.3. In to the distribution function of the sample mean. Weighted Recirrangement of Cornish-Fisher and Edgeworth Expansions. some cases, it could be worthwhile to weigh different areas of support differently than the Lebesgue (fiat) weighting prescribes. For example, it might be desirable to rearrange F„ using F„ as a weight measure. Indeed, using F„ as a weight, better coupling to the P-value: to the true F„). P= Using such weight Fn{Xn) will (in this quantity, X„ is we obtain a drawn according provide a probabilistic interpretation for the rearranged Edgeworth expansion, in analogy to the probabilistic interpretation for the rearranged Cornish-Fisher expansion. Since the weight standard normal measure initial $ as the weight not available is measure instead. rearrangement with the Lebesgue weight, and use We may we may use the also construct an as weight itself in a further it weighted rearrangement (and even iterate in this fashion). Using non-Lebesgue weights may also be desirable more when we want the improved approximations Whatever the reason might be heavily. Let A in view of Remark 3. be a distribution function that admits a positive density with respect to the Lebesgue measure on the region Un = [— 6„,6„] for the distribution case. Then Q* ^ non-Lebesgue weighting, we for further have the following properties, which follow immediately to weight the tails of the function Q — [e„, 1 e„] for the quantile case and region (3.1) holds, the if A- weighted rearrangement satisfies 1/p \/p [/ Q:,^{u)- Qniu) < dK{u) < where the first equality holds strictly measure. Furthermore, Xn = if when (3.11) holds, Q J (A[l is Qn{u) - Sn] - - Qn{u) dA{u) (3.15) A[£n])an, (3.16) decreasing on a subset of positive A- then the A- weighted rearrangement F*^ of the IMPROVING EDGEWORTH APPROXIMATIONS function F 13 satisfies 1/p F:Jx)-Fr,{x) 1/p < dA{x) / F„(x)-F„(x) dA{x) < 4. {A[bn] - A[-bn])an (3.18) Numerical Examples In addition to the lognormal example given in the introduction, we use the gamma tribution to illustrate the improvements that the rearrangement provides. Let (Yi, be an i.i.d. n function = Qn ..., dis- F„) sequence of Gamma(l/16,16) random variables. The statistic of interest the standardized sample of sizes (3.17) -JXn Jxr, 4, 8, 16, and mean X„ = 32. In this of the sample mean X„ l^"=i(^i ~ E[Yi\)/y/Var{Yi). We is consider samples example, the distribution function F„ and quantile are available in a closed form, making it easy to com- pare them to the Edgeworth approximation F„ and the Cornish-Fisher approximation Qn, as well as to the rearranged Edgeworth approximation F* and the the rearranged Cornish-Fisher approximation Q^. For the Edgeworth and Cornish-Fisher approximations, as defined in the previous section, we set J= we consider the third order expansions, that is 3. Figure 2 compares the true distribution function F„, the Edgeworth approximation F„, and the rearranged Edgeworth approximation F*. worth approximation not only fixes the We see that the rearranged Edge- monotonicity problem, but also consistently does a better job at approximating the true distribution than the Edgeworth approximation. Table 1 further supports this point by presenting the numerical results for the Lp ap- proximation errors, calculated according to the formulas given We in the previous section. see that the rearrangement reduces the approximation error quite substantially in most cases. Figure 3 compares the true quantile function Qn, the Cornish-Fisher approximation Qn, and the rearranged Cornish-Fisher approximation Q*. Here too we see that the IMPROVING EDGEWORTH APPROXIMATIONS 14 rearrangement not only fixes the non-monotonicity problem, but also brings the approx- imation closer to the truth. Table 2 further supports this point numerically, showing that the rearrangement reduces the Lp approximation error quite substantially in most cases. Conclusion 5. we have applied the rearrangement procedure In this paper, and Cornish-Fisher expansions and other The functions. logical related expansions of distribution benefits of doing so are twofold. of the distribution to monotonize and quantile curves of the Edgeworth and quantile we have obtained estimates First, statistics of interest which satisfy the monotonicity restriction, unlike those directly given by the truncation of the we have shown series expansions. Second, that doing so resulted in better approximation properties. Appendix We consider the case where X = The similarly. first Proof of Part The proof The second We 1. / denote the /(•) on the s-th integer in 1, ..., fi, assume all intervals can be dealt inequality, following in part the strategy focuses directly on obtaining the result stated in f(, and at first that the functions /(•) — l)/r, s/r], s = 1, ...,r. > fm for some be a r-vector with the m > i-th. I. If f /o(-) are simple For any simple /(•) with r with the s-th element, denoted r- vector r such that and weak more general interval. Let us define the sorting operator i exists, set S{f) to equal to only, as the 1 part establishes the strong inequality. functions, constant on intervals ((s steps, let [0, 1] part establishes the in Lorentz's (1953) proof. the proposition. Proof of Proposition A. /s, equal to the value of S{f) as follows: Let i be an does not exist, set S{f) = f. If element equal to fm, the m-th element other elements equal to the corresponding elements of /. For any submodular function L : E^ — ^ E+, by /^ > /m, fom > foe and the definition of the IMPROVING EDGEWORTH APPROXIMATIONS 15 submodularity, Lifrm Therefore, foe) + L{fe, /om) < + L{fm, fom)- we conclude that / L{S{f){x),fo{x))dx Jx using that L{fe, foe) we < [ L{fix)Jo{x))dx, integrate simple functions. Applying the sorting operation a completely sorted, that = 5 composition /* o ... o S{f) By . number sufficient finite Jx Furthermore, this inequality error. Therefore, X to K function ., (A.2) Lifix), fo{x))dx. Jx extended to general measurable functions /o(-) almost everywhere as r -^ oo. /^"^H") to /(") implies /**'"^(-) ^ /(•) and by taking a sequence of bounded simple functions p^\-) and converging to /(•) and convergence of £ is finite repeating the argument above, each composition / L(f (x), fo{x))dx < I L(^.^(/), Ux))dx < f Jx we obtain a of times to /, Thus, we can express /* as a rearranged, vector /*. is, weakly reduces the approximation mapping (A.l) Jx /o '(•) The almost everywhere the almost everywhere convergence of to the quantile function of the limit, /o(-) /*() Since its quantile inequality (A.2) holds along the sequence, the dominated convergence theorem imphes that (A.2) also holds D for the general case. Proof of Part 1. We A'q, that Let us 2. first consider the case of simple functions, as defined in Part take the functions to satisfy the following hypotheses: there exist regions Aq and each of measure greater than 5 (i) x' > x, (u) f{x) > f[x') + > e, 0, such that for and (in) fo{x') > the proposition. For any strictly submodular function T] where the infimum t' >t + e. = inf{L(i/, is taken over t) + L{v, t') - L{v, all v,v',t,t' in t) all - x E Xq and fo{x) L R^ : + K for e ^ M+ L{v', t')} the set e, > x' £ A'q, > we have specified in we have that 0, such that v' > v + e and IMPROVING EDGEWORTH APPROXIMATIONS 16 We can begin sorting by exchanging an element f{x), x G ^o, of element f{x'), The total x' € Xq, of r- vector /. This induces a sorting gain of at least mass of points that can be sorted sort all of these points in this way, After the sorting is in this is at least 5. r] We then / with an times 1/r. proceed to and then continue with the sorting of other / Lirix), fo{x))dx < f We way completed, the total gain from sorting Jx r- vector is at least L{f{x), h{x))dx - 5r]. That is, 57). Jx then extend this inequality to the general measurable functions exactly as proof of part one. points. in the D improving edgeworth approximations 17 References and Balitskaya, E. O., Edgeworth and E., Bhattacharya, R. John Wiley (1988): "On the representation of a density by an K. E. Dennis (1952): "The conditions under which Gram-Charher and Edge- worth curves are positive sions. Zolotuhina Biometrika, 75(1), 185-187. series," Barton, D. L. a. and N., & definite Sons, R. New and unimodal," Biometrika, Ranga Rao (1976): 39, 425-427. Normal approximation and asymptotic expan- York-London-Sydney, Wiley Series in Probability and Mathematical Statistics. Blinnikov, S., and Moessner R. (1998): "Expansions for nearly Gaussian distributions," Astron. Astrophys. Suppl. Ser., 130, 193-205. Bronshtein, K. A. Semendyayev, G. Musiol, H. Muehlig, N., I. Handbook of mathematics. Springer- Verlag, Chernozhukov, of v., H. Muhlig (2004): Berlin, english edn. Fernandez- Val, and A. Galichon I. and Monotone Functions by Rearrangement," MIT and (2006): UCL CEMMAP "Improving Approximations Working Paper, available at www.arxiv.org, cemmap.ifs.org.uk, www.ssrn.com. and Cornish, E. A., S. R. A. Fisher (1938): "Moments and Cumulants Distributions," Review of the International Statistical Institute, Cramer, H. 5, (1999): Mathematical methods of statistics, Princeton in the Specification of 307-320. Landmarks in Mathematics. Prince- ton University Press, Princeton, NJ, Reprint of the 1946 original. Draper, N. and R., D. E. Tierney (1972): "Regions of positive and unimodal series expansion of the Edgeworth and Gram-Charlier approximations," Biometrika, 59, 463-465. Edgeworth, F. Y. (1905): "The law of error," Trans. Camb. Phil. Society, 20, 36-65, 113-41. (1907): Fisher, S. "On R. A., the representatin of a statistical frequency by a series," and cumulants," Technometrics, Hall, P. New (1992): Cornish E. A. 2, (I960): J. R. Stat. Soc, 70, 102-6. "The percentile points of distributions having known 209-225. The bootstrap and Edgeworth expansion, Springer Series in Statistics. Springer- Verlag, York. Hardy, G. H., J. E. Littlewood, and G. Polya (1952): Inequalities. Cambridge University Press, 2ded. Lorentz, G. G. (1953): "An inequality for rearrangements," Amer. Math. Monthly, 60, 176-179. IMPROVING EDGEWORTH APPROXIMATIONS 18 ROTHENBERG, T. Statistics," in Griliches Sargan, J. (1984): Handbook of econometrics, and Michael D. J. "'Approximating the Distributions of Econometric Estimators and Test Vol. II, vol. 2. North-Holland, Amsterdam, edited by Zvi Intriligator. D. (1976): "Econometric estimators and the Edgeworth approximation," Econometrica, 44(3), 421-448. Sun, J., C. Loader, and W. P. McCormick (2000): "Confidence bands in generalized hnear models," Ann. Statist, 28(2), 429-460. VAN DER Vaart, A. W. ViLLANi, C. (2003): Cambridge University Press, Cambridge. Topics in optimal transportation, vol. 58 of Graduate Studies in Mathematics. American Mathematical ZOLOTAREV, V. M. (1998): Asymptotic statistics. Society, Providence, RI. (1991): "Asymptotic expansions with probability variables," Tear. Veroyatnost. i Primenen., 36(4), 770-772. 1 of sums of independent random IMPROVING EDGEWORTH APPROXIMATIONS Table 1. Estimation errors for tion of the standarized sample First Order approximations to the distribution func- mean from a Gamma(l/16, Third Order 19 TO- Rearranged A. n = 16) population. Ratio (TO/RTO) 4 Li 0.07 0.05 0.02 0.38 L2 0.10 0.06 0.03 0.45 Ls 0.13 0.07 0.04 0.62 U 0.15 0.08 0.07 0.81 Loo 0.30 0.48 0.48 1.00 B. n= 8 Li 0.06 0.03 0.01 0.45 L2 0.08 0.04 0.02 0.63 L, 0.10 0.05 0.04 0.85 U 0.11 0.06 0.06 0.96 Loo 0.23 0.28 0.28 1.00 C. n-- = 16 Li 0.05 0.01 0.01 0.97 L2 0.06 0.02 0.02 0.99 Ls 0.08 0.03 0.03 1.00 L, 0.08 0.04 0.04 1.00 Loo 0.15 0.11 0.11 1.00 D. n = 32 Li 0.04 0.01 0.01 1.00 L2 0.05 0.01 0.01 1.00 Ls 0.06 0.01 0.01 1.00 L, 0.06 0.01 0.01 1.00 Loo 0.09 0.03 0.03 1.00 IMPROVING EDGEWORTH APPROXIMATIONS 20 Table 2. Estimation errors for approximations to the distribution func- tion of the standarized sample First Order mean from a Gamma(l/16, Third Order TO- Rearranged A. n = 16) population. Ratio (TO/RTO) 4 Li 0.50 0.24 0.09 0.39 L2 0.59 0.32 0.11 0.35 Ls 0.69 0.42 0.13 0.31 U 0.78 0.52 0.15 0.29 L^ 2.04 1.53 0.49 0.32 B. n = 8 L, 0.39 0.08 0.03 0.37 L2 0.47 0.11 0.04 0.35 Ls 0.56 0.16 0.05 0.32 U 0.65 0.21 0.06 0.30 Loo 1.66 0.67 0.22 0.33 C. n-= 16 L, 0.28 0.02 0.02 0.97 L2 0.35 0.04 0.03 0.84 L, 0.43 0.05 0.04 0.71 L, 0.51 0.07 0.04 0.63 L^ 1.34 0.24 0.10 0.44 D. n = 32 L, 0.20 0.01 0.01 1.00 L2 0.26 0.01 0.01 1.00 Ls 0.31 0.02 0.02 1.00 L, 0.37 0.02 0.02 1.00 Loo 1.02 0.07 0.07 1.00 IMPROVING EDGEWORTH APPROXIMATIONS n =4 21 n = 8 X .^^^ CO d CM True True FO Aproximation TO Approximation TO - Rearranged d FO TO TO Aproximation Approximation - Rearrang T" 1 1 1 1 1 1 2 1 std. mean std. mean n = 32 n = 16 CO d CD CD d d CM True FO Aproximation TO Approximation TO - RearrangecJ d Std. Figure 2. d — • • • True FO Aproximation «»» TO Approximation TO - Rearranged mean std. mean Distribution Functions, First Order Approximations, Third Order Approximations, and Rearrangements mean from a Gamma(l/16, 16) population. for the standarized sample IMPROVING EDGEWORTH APPROXIMATIONS 22 n = 8 n = 4 -* True FO Aproximation TO Approximation TO - Rearranged - CO - (M - — True FO Aproximation TO Approximation TO - Rearranged i / J O 1 CM - _.^^i--^ ~ _ 1 T— 0.0 1 1 r 0.2 0.4 0.6 r O.i 1.0 1 1 1 1 1 1 0.0 0.2 0.4 0.6 0.8 1.0 prob prob n = 16 n = 32 True FO Aproximation TO Approximation TO - Rearranged True FO Aproximation TO Approximation TO - Rearranged w 1 1 \ 0.0 0.2 0.4 0.6 r 0.8 3. — 1 1 1 1 r \ 1.0 prob Figure o 0.0 0.2 0.6 0.4 0.8 1.0 prob Quantile Functions, First Order Approximations, Third Order Approximations, and Rearrangements from a Gamma(l/16, 16) population. for the standarized sample mean