Quicksort algorithm again revisited Charles Knessl and Wojciech Szpankowski †

advertisement
Discrete Mathematics and Theoretical Computer Science 3, 1999, 43–64
Quicksort algorithm again revisited
Charles Knessl† and Wojciech Szpankowski‡
Dept. Mathematics, Statistics and Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607-7045,
U.S.A, knessl@uic.edu
Department of Computer Science, Purdue University, W. Lafayette, IN 47907, U.S.A., spa@cs.purdue.edu
received 11th January 1999, revised 12th March 1999, accepted 10th February 1999.
We consider the standard Quicksort algorithm that sorts n distinct keys with all possible n! orderings of keys being equally
likely. Equivalently, we analyze the total path length distribution of n
n
in a randomly built binary search tree. Obtaining the limiting
is still an outstanding open problem. In this paper, we establish an integral equation for the probability
density of the number of comparisons n.
Then, we investigate the large deviations of n.
We shall show that the left
tail of the limiting distribution is much “thinner” (i.e., double exponential) than the right tail (which is only exponential).
Our results contain some constants that must be determined numerically. We use formal asymptotic methods of applied
mathematics such as the WKB method and matched asymptotics.
Keywords: Algorithms, Analysis of algorithms, Asymptotic analysis, Binary search tree, Quicksort, Sorting.
1 Introduction
Hoare’s Quicksort algorithm [11] is the most popular sorting algorithm due to its good performance in practise. The basic algorithm can be briefly described as follows [11, 14, 16]:
A partitioning key is selected at random from the unsorted list of keys, and used to partition the
keys into two sublists to which the same algorithm is called recursively until the sublists have
size one or zero.
To justify the algorithm’s good performance in practise, a body of theory was built. First of all, every
undergraduate learns in a data structures course that the algorithm sorts “on average” n keys in Θ n logn steps. To be more precise, one assumes that all n! possible orderings of keys are equally likely. It is, however,
also known that in the worst case the algorithm needs O n2 steps (e.g., think of an input that is given in
a decreasing order when the output is printed in an increasing order). Thus, one needs a more detailed
† The
‡ The
work of this author was supported by NSF Grant DMS-93-00136 and DOE Grant DE-FG02-93ER25168.
work of this author was supported by NSF Grants NCR-9415491 and CCR-9804760, and NATO Collaborative Grant
CRG.950060.
1365–8050 c 1999 Maison de l’Informatique et des Mathématiques Discrètes (MIMD), Paris, France
44
Charles Knessl and Wojciech Szpankowski
probabilistic analysis to understand better the Quicksort behavior. In particular, one would like to know how
likely (or rather unlikely) it is for such pathological behavior to occur.
A large body of literature is devoted to analyzing the Quicksort algorithm [4, 5, 6, 8, 14, 15, 16, 17, 18,
19, 21]. However, many aspects of this problem are still largely unsolved. To review what is known and
what is still unsolved, we introduce some notation. Let n
denote the number of comparisons needed to sort
a random list of length n. It is known that after selecting randomly a key, the two sublists are still “random”
(cf. [14]). Clearly, the sorting time depends only on the keys’ ranking, so we assume that the input consists
of the first n integers 1 2 n , and key k is chosen with probability 1 n. Then, the following recurrence
holds
n
Now, let Ln u Eu
n
∑k 0 n n 1 k n 1 k (1)
k uk be the probability generating function of
n.
The above recur-
rence implies that
Ln u with L0 u un 1 n 1
Li u Ln
n i∑
0
1 i u
(2)
1. Observe that the same recurrences are obtained when analyzing the total path length n
of a
binary search tree built over a random set of n keys (cf. [14, 16]). Finally, let us define a bivariate generating
function L z u ∑n 0 Ln u zn . Then ( 2) leads to the following partial-differential functional equation
∂L z u ∂z
Observe also that L z 1 The moments of
n
1 z
L2 zu u ∂L 0 u ∂z
1
(3)
1.
are relatively easy to compute since they are related to derivatives of Ln u at u
1. Hennequin [8] analyzed these carefully and computed the first five cumulants. He also conjectured an
∞ which he later proved in [9].
asymptotic formula for the cumulants as n
The main open problem is to find the limiting distribution of
distribution of
n E n !"
n.
Régnier [18] proved that the limiting
n exists, while Rösler [19, 20] characterized this limiting distribution as a fixed
point of a contraction satisfying a recurrence equation. A partial-differential functional equation seemingly
similar to ( 3) was studied recently by Jacquet and Szpankowski [12]. They analyzed a digital search tree for
which the bivariate generating function L z u (in the so–called symmetric case) satisfies
∂L z u ∂z
with L z 0 L2 1
zu u 2
(4)
1. The above equation was solved asymptotically in [12], and this led to a limiting normal
distribution of the path length n
in digital search trees. While the above equation and ( 3) look similar, there
are crucial differences. Among them, the most important is the contracting factor
1
2
in the right-hand side of
the above. Needless to say, we know that ( 3) does not lead to a normal distribution since the third central
moment is not asymptotically equal to zero (cf. [16]). More precisely, the third (and all higher odd) moments
of
n E n#
n does not tend to zero as n
∞.
In view of the above discussion, a less ambitious goal was set, namely that of computing the large deviations of n,
i.e.,
$"% n E n %'&
εE n (
for ε
)
0. Hennequin [8] used Chebyshev’s inequality to show
Quicksort algorithm again revisited
45
that the above probability is O 1 * ε log2 n " . Recently, Rösler [19] showed that this probability is in fact
O n
k
for any fixed k, and soon after McDiarmid and Hayward [15] used the powerful method of bounded
differences to obtain an even better estimate, namely that the tail is approximately equal to n 2εlog log n
(see
the comment after Theorem 1 of Section 2).
In this paper, we obtain some new results for the tail probabilities. First of all, we establish an integral
n,
equation for the probability density of
deviations of
n.
and using this we derive a left tail and a right tail of the large
We demonstrate that the left tail is much thinner (i.e., double exponential) than the right
tail, which is roughly exponential.
We establish these results using formal asymptotic methods of applied mathematics such as the WKB
method and matched asymptotics. By “formal” we mean that we do not rigorously establish error bounds
on the various asymptotic expansions. The main merit of these methods is that they can be used to obtain
asymptotic information directly from the underlying equations. Similar asymptotic approaches were used for
enumeration problems by Knessl and Keller [13] and Canfield [3].
The paper is organized as follows. In the next section we describe our main findings and compare them
with other known results. In Section 3 we derive the integral equation for the asymptotic probability density
of n.
In Section 4 we obtain our large deviations results.
2 Formulation and Summary of Results
As before, we let n
be the number of key comparisons made when Quicksort sorts n keys. The probability
generating function of n
becomes
∞
Ln u ∑ k uk E u
n
k 0
The upper limit in this sum may be truncated at k ,+
n
2-
n
"
(5)
, since this is clearly an upper bound on the number
of comparisons needed to sort n keys.
The generating function in ( 5) satisfies ( 2 on the page before) which we repeat below (cf. also [6, 16, 18,
19])
Ln .
Note that Ln 1 1 for all n
&
1
u
un
n
n
Li u Ln
1∑
i 0
0, and that the probability
integral
$0 n k
1
2πi
1
u
C
i
u / L0 u n k 1
1
(6)
k may be recovered from the Cauchy
Ln u du (7)
Here C is any closed loop about the origin.
In Section 3, we analyze ( 6) asymptotically for n
∞ and for various ranges of u. We use asymptotic
methods of applied mathematics, such as the WKB method and matched asymptotics [2, 7]. The most
important scale is where n
∞ with u 1
O n
1
, which corresponds to k E n2
O n
2n logn O n . Most of the probability mass is concentrated in this range of k. As mentioned before, the existence
of a limiting distribution of
n E n !"
n as n
∞ was established in [18, 19], though there seems to be
little known about this distribution (cf. [4, 6, 8, 15, 21]). Numerical and simulation results in [4, 6] show that
46
Charles Knessl and Wojciech Szpankowski
the distribution is highly asymmetric and that the right tail seems much thicker than the left tail. It is also of
interest to estimate these tails (cf. [15, 19]), as they give the probabilities that the number of key comparisons
will deviate significantly from E n
, which is well known to be asymptotically equal to 2n logn as n
∞
(cf. [8, 16]).
For u 1 w n O n 1
Ln u where An
E n
∞, we derive in Section 3 the asymptotic expansion
and n
log n
1
G1 w 4
G2 w 5 o n n
n
exp An w n 3 G0 w 4
1
76
(8)
. The leading term G0 w satisfies a non-linear integral equation. Indeed, in Section 3 we
find that (cf. ( 49 on page 52))
e
w
1
G0 w 1
0
G0 0 e2φ 8 x 9 w G0 wx G0 w wx dx
G0: 0 1;
(9)
0
(10)
where
φ x
x log x ; 1 x log 1 x (11)
is the entropy of the Bernoulli(x) distribution. Furthermore, the correction terms G1 "<= and G2 < satisfy linear
integral equations (cf. ( 50 on page 52)–( 51 on page 52)).
By using ( 8) in ( 7 on the page before) and asymptotically approximating the Cauchy integral we obtain
$" n E n 1
P y
n
ny?>
(12)
where
P y
1
or equivalently G0 w 1
2πi
∞
∞
1
c . i∞
c i∞
e
yw
G0 w dw
(13)
eyw P y dy
(14)
and c is a constant. Hence, G0 w is the moment generating function of the density P y .
Now, we can summarize our main findings. We establish the results below under the assumptions that P y
has certain forms as y A@ ∞ (cf. Section 4).
Theorem 1 Consider the Quicksort algorithm that sorts n randomly selected keys.
(i) The limiting density P y satisfies
1
P y 1
11
0
∞
∞
and
1
P 3 xt ∞
∞
y 2φ x 2 1 x
1
6
P y dy 1 P
∞
∞
3 B
y 2φ x
2x
1 x t 6
dtdx
yP y dy 0 (15)
(16)
(ii) The left tail of the distribution satisfies
n E C
n D
nz*>
2
1
E
exp
π 2 log2 1
3 α exp
3
β z
6F6
2 log 1 2
(17)
Quicksort algorithm again revisited
for n
47
∞, and z z n GH ∞ sufficiently slowly, where α 2log 2 1
e log 2 0 205021 and β is a constant.
(iii) The right tail becomes
n for n
E C
n &
CI
E
2 π wI
ny?>
1
wI
E
e
1
wJ"8 wJ" 12 . 2γ . log2 9
1
exp
3 wJ
yw IK
1
2eu
du6
u
∞, and y y n LM ∞ sufficiently slowly. Here CI is a constant, γ is Euler’s constant and w I
is the solution to
2
exp w I
wI
y
(18)
w I y
(19)
that asymptotically becomes
wI
for y
y
2O
N
log
loglog
y
2O
N
log log y 2 1 o 1 log y 2 (20)
∞ (cf. ( 89 on page 60) in Section 4).
Finally, we relate our results for the tails to those of McDiarmid and Hayward [15]. These authors showed
that
% n E εE C
n % )
exp 2ε logn log log n log 1 ε4 O log loglog n P"
n (21)
∞ and ε in the range
which holds for n
1
log n
ε
Q
D
1
(22)
As pointed out in [15], this estimate is not very precise if, say, ε O log log n logn .
¿From Theorem 1 we conclude that (since the right tail is much thicker than the left tail)
"% n E C
ny*> Ck y eφ 8 y 9
n %'&
y
∞
(23)
(24)
where C is a constant and
1
φ y
k y
yw I*
e
wJ
1
2eu
du u
w2J ewJ 12 2γ log2 wI
E
w IK 1
y
2ewJ
wI
We have not been able to determine the upper limit on y for the validity of ( 23). However, it is easy to see
that ( 23) reduces to ( 21) if we set y εE 1
yw I*
n 2ε logn R ε 2γ 4 4 O ε log n " n and use ( 20) to
∞. This yields
approximate w I as y
n "
wJ
1
2eu
du
u
y
y
y
log log N
1 o 1 0T
2O
2O
2ε logn log log n log 1 ε4 log log ε logn * 1U o logn SP
log N
(25)
which agrees precisely with the estimate ( 21), and also explicitly identifies the O log loglog n error term.
This suggests that ( 23) applies for y as large as 2 logn, though it cannot hold for y as large as n 2 in view
of the fact that
$0 n k
0 for k
)
n
+ 2-
. An important open problem is obtaining an accurate numerical
approximation to the constant C. This would likely involve the numerical solution of the integral equation for
G0 w .
48
Charles Knessl and Wojciech Szpankowski
∞
3 Analysis of the Generating Function for n V
We study ( 6 on page 45) asymptotically, for various ranges of n and u, namely: (i) u
∞ and u
w W n u 1 fixed when n
1; (iii) w H@ ∞; and (iv) u
below.
A. C ASE n FIXED
AND
u
)
1 or u
Q
1 with n fixed; (ii)
1. We study these cases
1
1 with n fixed. Then using the Taylor expansion
First we consider the limit u
Ln u 1 An u 1 4 Bn u 1 2 O " u 1 3 1 2
A / u 1 2 O u 1 eAn 8 u 1 9 1 ; Bn 2 n
we find from ( 6 on page 45) that An
An .
Bn .
1
n
1 N
1 n
2O
Ln: 1 and Bn
n
i 0
n
2n
An
∑ Ai
n 1 i
0
3
P
Ln: : 1 2 satisfy the linear recurrence equations
n
Ai 1 ∑
(26)
i 1
n 1
n
2
n
n
Ai ;
1∑
A0
i 0
0
(27)
n
∑ 2Bi Ai An i 0
i
; B0
0
(28)
These are easily solved using either generating functions, or by multiplying ( 27) and ( 28) by n 1 and then
differencing with respect to n. The final result is (cf. [14, 16])
An
Bn
Here Hn
1
1
2 2 n 1 Hn 4n
n
2 8 29
23n 17 * 2 n 1 Hn
2
2 n 1 2Hn2 X 8n 2 Y n 1 Hn 1
3 Z<<
<0
1
n
8 29
is the harmonic number, and Hn
second order. In terms of An and Bn , the mean and variance of E Var
Asymptotically, for n
n
An
n
n
∑nk 1 k 2 is the harmonic number of
are given by
2 n 1 Hn 4n
Ln: 1 P
(31)
2
A2n
Ln: : 1 4 Ln: 1 [
7n2 2 n 1 Hn 13n 4 n 1 2Hn
2Bn An 8 29
(32)
2n logn ; 2γ 4 n 2 logn 2γ 1 1 2
A
2 n
W
Cn 3
7
2
π2
21
2
6 n 2n logn n 3
3
2
5 n
6
1
2γ O n
2
2 2
π 6
3
(33)
o n /
and n
n
E n
is of order
∞, all terms in the series ( 26) will be of comparable magnitude when (roughly) u 1 O n ∞. If we view ( 26) as an asymptotic series for n fixed and u
n O u (34)
1 and n fixed, to those
that will apply for other ranges of n and u. Since it is well known that the lth moment of O n as n
(30)
These expressions will be used in order to asymptotically match the expansion for u
l
∞, we obtain
An
Bn (29)
1 1 , which motivates the analysis that follows.
1
1, then it will cease to be valid when
Quicksort algorithm again revisited
B. C ASE w W n u 1 49
Next we consider the limit u
∞ AND u
n
FIXED WHEN
1
∞ with w W n u 1 held fixed. This scaling is necessary to obtain
1, n
a non–trivial limiting problem. We define G "<= by
Ln u exp An w n G w; n eAn 8 u 19
G n u 1 ; n /
(35)
With this change of variables, we rewrite ( 6 on page 45) as
exp An . 1 w n ]\
w
; n 1
n
G w n m
\
w
∑ exp N Ai n ^
i m
m 1
1 2
2 Ai .
where Ci Bi w
∑ exp N Ai n 2
i 0
1 w n
n 1
n
(36)
An i
w
i
i
G 3 w ; i6 G 3 w 3 1 6 ; n i6
nO
n
n
An i
w
nO
3
1 Ci
w2
n2
;<<< 6
i
6 ; n i 6`_
n
G 3 w 3 1
Here we have broken up the sum in ( 6 on page 45) into the three ranges 0 D i D m 1,
m D i D n m, and n m 1
D
D
i
n, and used the symmetry i
n i of the summand. We expect ( 26
on the preceding page) to also be valid for large values of n, as long as n u 1 $
0 as n
∞. Thus, for
0 D i D m 1 we replaced Li u in the sum by the approximation ( 26 on the page before). The integer m may
be chosen arbitrarily, since the right side of ( 36) must ultimately be independent of m. For now we assume
that m
∞ but m n
∞. For n large we have (cf. ( 33 on the page before))
0 as n
Ai
n
An
n
i
An .
n
1
i
i
log 3 6Ra3 1 n
n
2
i
i
log 3
3 1
6F6R
n
n
n
2
3
i
i
6 log 3 1 6b6
n
n
3
1
o n n
(37)
with which we rewrite ( 36) as
n 1 e
w
\
w2
2n
1
3
O n
2
n m
∑ e2φ8 i c n9 w 3
i m
1 w
w
;n 1
O
n
G N w
6
3
(38)
2
i
3
ψ 3 6
6 O n
n
n
n
2
n 1m 1
76
i
i
G 3 w ; i6 G 3 w 3 1 6 ; n i6
n
n
\
m 1
2
∑
\
i 0
N
1 Ai
G 3 w 3 1
w
n
O + Bi n 2
- O
exp
Nd
An i
An .
1
w
nO
i
6 ; n i6
n
where
φ x
x log x ; 1 x log 1 x
ψ x
log x 1 xP"
(39)
We now evaluate the two sums in ( 38) asymptotically and show that when the two results are added, the
dependence on m disappears.
50
Charles Knessl and Wojciech Szpankowski
From ( 35 on the preceding page) and the identity Ln: 1 G 0; n An
E 1 and G: 0; n n
we find that for all n
0
(40)
∞, G w; n has an asymptotic expansion of the form
We assume that for n
G w; n G0 w 4 a1 n G1 w 5 a2 n G2 w 4;<<<
where a j n is an asymptotic sequence as n
∞, i.e., a j .
1
n " a j n [
0 as n
(41)
∞. The appropriate sequence
is determined by balancing terms in ( 38 on the page before). This will eventually yield
a1 n log n
;
n
a2 n 1
n
(42)
so we use this form from the beginning. Note that G0 w is the moment generating function of the limiting
density of
E C n n !"
n, which is discussed in [19]. The conditions ( 40) imply that
G0 0 1; G1 0 G0: 0 G1: 0 G2 0 G2: 0 <<
< <
<< 0
0
(43)
(44)
We consider the first sum in ( 38 on the page before), which we denote by S1 S1 n; m . Using ( 41), ( 42)
and the Euler-MacLaurin formula we obtain
n m
∑ e2φ8 i c n9 w e 1 S1
i m
\
2
i
ψ3 6
n
n
3
6
n
o n
1
Pf
(45)
G0 3 w
w
1
1 m c n
mc n
1 m c n
1
mc n
1 m c n
mc n
1
1 m c n
1
3
i
i
i log n i i
G1 3 w 3 1 6 G0 3 w 3 1 6b6 G0 3 w 6
6b6
n
n
n
n i
n
i
log i
1
i
i
i
G1 3 w 6 G0 3 w 6
G2 3 w 3 1 G0 3 w 3 1 6F6
6F6
n
i
n
n n i
n
i
i
1
G2 3 w 6 ;<<
<ih
G0 3 w 3 1 6F6
n
i
n
1 1 mc n
m
m
n
G0 N w N 1 e2φ 8 x 9 w G0 wx G0 w wx dx e2φ 8 m c n 9 w G0 N w
nO
n OGO
mc n
g
1
w
mc n
1 m c n
mc n
2ψ x 4
3 e2φ 8 x 9 w G0 wx G0 w wx dx
log n log 1 x 2φ 8 x 9 w
e
G0 wx G1 w wx dx
1 x
log n logx 2φ 8 x 9 w
G0 w wx G1 wx dx
e
x
1
1 x
e2φ 8 x 9 w G0 wx G2 w wx dx
1 2φ 8 x 9 w
e
G2 wx G0 w wx dx o 1 Y
x
We note that all the integrals remain finite as m n
0, in view of ( 43) and ( 44). However, if we were to
consider higher order terms in the expansion ( 41), which would involve terms of order n 2
and n 3 , then
Quicksort algorithm again revisited
51
the corresponding higher order terms in ( 45 on the page before) would involve integrands not integrable
over [0,1]. It then becomes essential that we integrate only over the range m n 1 m n and consider the
contribution from the second sum S2 S2 n; m in ( 38 on page 49). We can further simplify S1 by evaluating
each term in the limit m n
1
1 m c n
W
T11
mc n
1
1
0
1
1
0
\
1
1
0
e2φ 8 x 9 w G0 wx G0 w wx dx 2
2
>
2
m
m2
m
2w 3
log N
n
2n2
nO
e
1 m c n
1
0
mc n
0
e2φ 8 x 9 w G0 wx G0 w wx dx
1
2w x logx x4 O x2 P
3 m2
6
4 n2
f m2
wG0: w ;
n2
logn log x 2φ 8 x 9 w
e
G0 w wx G1 wx dx
x
m
logx 2φ 8 x 9 w
G0 w wx G1 wx dx 2 log n G1 w ;
e
x
n
mc n
1 1
log n 0
and
1
W
2
>
2
T13
1 mc n
1
1
n
w
1
0
1
e2φ 8 x 9 w G2 wx G0 w wx dx
x
m
1
e2φ 8 x 9 w G2 wx G0 w wx dx 2 G2 w Y
x
n
mc n
1 1
0
Thus S1 simplifies to
S1
mc n
e2φ 8 x 9 w G0 wx G0 w wx dx
1
W
1
O x2 P G0 w [ wxG0: w 4 O x2 P dx
2G0 w T12
e2φ 8 x 9 w G0 wx G0 w wx dx
e2φ 8 x 9 w G0 wx G0 w wx dx 2
1
>
0, which we assumed to be true. We have
e2φ 8 x 9 w G0 wx G0 w wx dx G0 w 1
0
1 1
2ψ x 5
(46)
3 e2φ 8 x 9 w G0 wx G0 w wx dx
log n logx 2φ 8 x 9 w
e
G0 w wx G1 wx dx
x
0
1 1
1 2φ 8 x 9 w
2
e
G0 w wx G2 wx dx
0 x
m2
2m2
m
3
G0: w [
wG0 w e log N
f
g 2mG0 w 4 w
n
n
nO
2
m
m
2 log n G1 w * 2 G2 w /jk;<<<
n
n
where we have grouped the terms involving m inside the “ l ”. The error term in ( 46) approaches zero as
2
∞.
n
Now we consider the second sum S2 in ( 38 on page 49). Using A n i K A n 1 >mB i 1 A: n >
B i 1 Y 2 logn 2γ 2 , we obtain
m 1
S2
>
2
∑
i 0
S
1 Ai
w
w
i 1 / 2 logn 2γ 2 TBS 1 n
n
T
(47)
52
Charles Knessl and Wojciech Szpankowski
e
\
>
i
logn
1
wG0: w 4
G1 w 4
G2 w n
n
n
f
w m 1
Ai 2 log n / i 1 [X 2γ 2 Y i 1 po
n i∑
0
2G0 w Fn m m2
m
wG0: w log n G1 w 4 G2 w Pd
n
n
m2
m
3
2G0 w m 2 wG0 w e log N
f
n
nO
2
m
2 logn G1 w 5 G2 w "
n
2
>
Here we have used 2 ∑ni G0 w *
1
0 Ai m2
wG0: w n
n An R n 1 . Upon adding ( 46 on the page before) to ( 47 on the preceding
page), we see that all the terms involving m cancel, and that the leading three terms in the expansion of S1 S2
(i.e., the right side of ( 38 on page 49)) are of order O n , O log n and O 1 , respectively. Using ( 41 on
page 50) and ( 42 on page 50), the left side of ( 38 on page 49) becomes
n 1 e
w
3
1
w2
log n
1
6q3 G0 w 4
wG0: w 4 G2 w 5 o n
G1 w 4
2n
n
n
1
76R
(48)
Thus, comparing the above to S1 S2 , we find that
e
e
e
w
G0 w 4
1
w
G0 w G1 w 1
0
e2φ 8 x 9 w G0 wx G0 w wx dx 1
w
1
1 2φ 8 x 9 w
e
G0 w wx G1 wx dx x
2
0
1 2
w G0 w r
2
wG0: w 4 G2 w P
1
1
w
0
1 1
(49)
(50)
G0 w (51)
3 e2φ 8 x 9 w G0 wx G0 w wx dx
2ψ x 4
log x 2φ 8 x 9 w
G0 w wx G1 wx dx
e
x
1 1
1 2φ 8 x 9 w
e
2
G0 w wx G2 wx dx 0 x
2
0
Equations ( 49)-( 51), along with ( 39 on page 49), ( 43 on page 50) and ( 44 on page 50) are integral equations
for the first three terms in the series ( 41 on page 50). Below we discuss some aspects of the solutions to
these problems. The leading order equation ( 49) was previously obtained in [6], using more probabilistic
arguments.
We observe that the solution to ( 49) is not unique: G0 w then so is
ecw G0 0 is one solution and if G0 w is any solution,
w for any constant c. We can construct the solution as a Taylor series:
∞
G0 w This eliminates the trivial solution G0 w 1
1
and noting that
for j
&
0
2φ x *
1
∑ g jw j (52)
j 1
0 and satisfies the normalization G0 0 1. Using ( 52) in ( 49)
1 dx 0, we see that g1 remains arbitrary, and then we can easily calculate g j
2 in terms of g1 . But, ( 44 on page 50) forces g1
0 and then all the Taylor coefficients in ( 52) are
Quicksort algorithm again revisited
53
uniquely determined. They may be evaluated from the recursion
3
n 1
2
1
n 1
6
∑ B i n gn
i 1
s
n 1
for n & 2 where
1
B i j k
7
2 In particular, g2 1
i 0 gign
∑ ∑ B i #t$
s
1 n ut/ gigs
0 i 0
xi 1 x
j
0
(53)
i
i
1
2φ x 4 1 k dx k! (54)
π2
3 .
Next we consider the equations ( 50 on the page before) and ( 51 on the preceding page) for the correction
terms G1 and G2 . These are linear, Fredholm integral equations of the second kind. Their solutions may also
be constructed as Taylor series in w. In view of ( 43 on page 50) and ( 44 on page 50) we must have
as w
G0 w v>
1 α0 w2 ;
G1 w v>
β0 w 2 ;
G2 w v>
γ0 w2 0, where we have already computed α0
g2 . Given β0 , we can easily compute the higher order
Taylor coefficients of G1 w from ( 50 on the page before), in terms of the (now uniquely determined) Taylor
coefficients of G0 w . However, the constant β0 cannot be determined solely from ( 50 on the preceding
page), ( 43 on page 50) and ( 44 on page 50). To fix β0 we use the principle of asymptotic matching.
We require that expansions ( 26 on page 48) and ( 35 on page 49) (with ( 41 on page 50)) agree in some
intermediate limit, where u
1, n
∞ and n u 1 $
must agree with the behavior of ( 35 on page 49) (with ( 41 on page 50)) as w
as Ln u exp An u 1 1 Cn u 1 Cn u 1 Setting u 1
2
1 2 O " u ;<<<w n x ∞ >
w
w
w
1 3 G0 w 4
∞
0. Then the behavior of ( 26 on page 48) as n
0. Writing ( 26 on page 48)
the matching condition becomes
log n
1
G1 w 5
G2 w 4;<<<pw
w wx
n
n
w
w
(55)
0
w n, using ( 34 on page 48), and noting that the right side of ( 55) is 1 n 2
2
α0 n β0 log n n γ0n w2 , we obtain
β0 2; γ0
21
2
2γ 2 2
π 3
Now consider equation ( 51 on the preceding page) in the limit w
abstractly as T G1
0. We write ( 50 on the page before)
0 where T is the linear integral operator in ( 50 on the preceding page). Then ( 51 on
the page before) may be written as T G2
Since T G1
(56)
f G0 G1 where f is a known function of the first two terms.
0 has a non-zero solution (made unique by the condition β0
2 , we expect that T G2
f
will have a solution only if a solvability condition is met. To obtain this solvability condition we expand
( 51 on the preceding page) as w
1
1
0
2ψ x *
3 dx 0. Obviously ( 51 on the page before) is satisfied as w
0, and, since
1, ( 51 on the preceding page) also holds to order O w . Comparing O w2 terms in
54
Charles Knessl and Wojciech Szpankowski
( 51 on page 52) using ( 55 on the preceding page) gives
1
1 2α0
and thus β0
1
1
φ x ψ x dx
β0 3 4
2
0
1
5 π2
β0 3 4 3 6
2
2
6
(57)
2, which regains the result we obtained by using asymptotic matching. Note, however, that
this argument required that we derive the equation satisfied by G2 w in order to uniquely specify the previous
term G1 w . In contrast, asymptotic matching made no use of ( 51 on page 52). Presumably, by deriving the
equation for the next term G3 w in ( 41 on page 50) and examining its solvability condition, we would have
an alternate way of computing γ0 . Given β0 and γ0 , we can easily obtain the Taylor expansions of G1 and G2
from ( 50 on page 52) and ( 51 on page 52).
To summarize, we have obtained the expansion ( 35 on page 49), with ( 41 on page 50)-( 44 on page 50),
( 49 on page 52)-( 51 on page 52) and ( 56 on the preceding page), for the scaling n
n u 1
∞, u
1 with
w fixed. We have not been able to explicitly solve these integral equations. However, we can
derive some approximate formulas in the limits w A@ ∞, and these may be used to obtain approximations to
the tail probabilities of the Quicksort distribution.
C. C ASE w W n u 1 KA@ ∞
y
We shall only examine the leading term G0 , and we first consider the limit w
∞. As w
y
∞, the
“kernel” exp 2wφ x P in ( 49 on page 52) is sharply concentrated near x 1 2, and behaves as a multiple of
% w%
1 c 2δ x 1 2 . Thus we treat ( 49 on page 52) as a Laplace type integral (cf. [10]).
Assuming that G0 w has a weak (e.g., algebraic) dependence on w, we approximate the right side of ( 49
on page 52) by Laplace’s method, which (to leading order) yields the functional equation
e
w
G0 w Gz
S
G0
w
2O
N
T
π e
4w
2{
2wlog 2
w | ∞ (58)
But, if G0 varies weakly with w, then the exponential orders of magnitude of the right and left sides of ( 58)
do not agree. In order to get agreement, we would need G0 w to vary much more rapidly as w | ∞, of the
order exp O w log 7 w " . But this then contradicts the assumption used to obtain ( 58). Therefore, we return
to ( 49 on page 52) and allow more rapid variation of G0 . Specifically, we assume that as w H ∞
G0 w G> eα1 wlog 8}
w 9}. βw
0
w
γ1
δ1 (59)
Using ( 59) in ( 49 on page 52) yields
e
If 2 α1
)
w
7
w
γ1
δ1 > δ21 0 w 1
1
2γ1
0
xγ1 1 x
γ1 φ 8 x 9}8 2 . α1 9 w
e
dx (60)
0 we again use Laplace’s method to approximate the integral, thus obtaining
e
w
>
δ1
3
1
6
4
γ1
7
w
π
γ1 12 {
2 2 α1 e L8 α1 .
Hence, we must have
α1
2
1
; γ1
log 2
1
; δ1 2
2 9}8 log 2 9 w
(61)
E
E
2 2
π log2
(62)
Quicksort algorithm again revisited
55
and β remains arbitrary. Since the solution of ( 49 on page 52) is not unique, we can never determine β using
only this integral equation. We thus have
E
2 2 E
π log 2
E
G0 w G>
1
log 2
w exp e βw a3
w H ∞ f 26 w log 7 w (63)
By computing higher order terms in the expansion of the integral in ( 49 on page 52), we can extend ( 59 on
the page before) and ( 63) to an asymptotic series in powers of w 1 . To fix β, we must use the condition in
( 43 on page 50) and ( 44 on page 50), and this probably requires a numerical solution of ( 49 on page 52).
By comparing this numerical solution to ( 49 on page 52) for large, negative w to ( 63), we can obtain an
approximation to β.
Now consider ( 49 on page 52) in the limit w | ∞. Then exp 2wφ x is concentrated near the endpoints
x 0 and x 1. We assume that G0 w has an asymptotic expansion in the WKB form [7]:
G0 w G> K w exp Ψ w 7
(64)
The major contribution to the integral will come from where x 1 x o w G0 wx
O wx 2 1
K w eΨ8 w9}
1
. Thus we use the Taylor series
, and ( 64) to approximate G0 w wx . This yields
1
w
>
1c 2
2
0
2wφ x 4 O w2φ2 x" 1 O wx 1
2K w eΨ8 w9
>
1
1c 2
e
wΨ~i8 w 9 x
2
P
K w wx eΨ8 w wx 9
dx
dx (65)
0
If furthermore Ψ <= is such that wΨ: w G
∞ as w
∞, then ( 65) yields (after a slight rearrangement)
wΨ: w G> 2ew
and hence we define
Ψ w
Note that ( 66) is consistent with wΨ: w [
1
w eu
2
u
du (66)
∞ and shows that G0 grows very rapidly (as a double exponential)
as w A ∞. We note that the asymptotic expansion of the integrand in ( 65) is only valid for x o w the major contribution to the integral turns out to be from x O e w , which is certainly
o w 1
1
, but
. To obtain
the next term K w , we refine the expansion ( 65) to
K w eΨ8 w 9(
1
w
0
w
O w2 e 2w 2wφ x4 O w2φ2 xP 1 O " wx 2
(67)
K
\
For x small we have 2wφ x 1
w [ wxK : w 4 O w2 x2 K : : w "
1 2
exp 3 Ψ w [ wΨ: w x w Ψ: : w x2 O w3 Ψ: : : w x3 2
\
w3 x3 Ψ: : : 1c 2
2
2w x logx x O x2 and if x
. Setting x exp
e
ηe w 1 2 2
x w Ψ: : w 2
O e
w
then w2 x2
2 we have
f 1
1 2
η w 1 e
4
w
1
o 1 P"
6
dx O w2 e 2w and
56
Charles Knessl and Wojciech Szpankowski
Thus ( 67 on the preceding page) becomes, after cancelling the common factor exp Ψ w ,
e
w
1
e
K w
K w
w
g
1
∞
K : w we
e
η
dη
(68)
0
1 w 2
η
e η w 1 4 we w η N2 w log N
4
2O
∞1
ηe η dη O e 2w K w wα Uh
0 2
η e
e
0
∞
K w
1
w
1
O
f
dη
for some constant α. To leading order ( 68) is obviously satisfied and then collecting terms that are of order
O e
w
∞ (modulo some algebraic factors) we obtain the following differential equation
as w
w K : w
2 K w
and thus
1
∞
e
0
1 2 η e
4
K : w
K w
η
w 1 4 wηe η
N2
η
2O
w log N
1
f
O
dη
(69)
1
w
2w 1 2γ 2 log2 (70)
Solving ( 70) and using the result along with ( 66 on the page before), we have
1
G0 w G> CI exp
3
2eu
due u
w
1
w2 .K8 1 2γ 2log 2 9 w
6
1
w
(71)
Here CI is an undetermined constant and we have chosen the lower limit on the integral in ( 66 on the
preceding page) as one. An alternate choice would only change CI , which we cannot determine using only
the integral equation ( 49 on page 52).
Our analysis shows that as w  ∞, the nonlinear equation ( 49 on page 52) may be approximated by a
linear one. To fix CI it would seem that we will again need an accurate numerical solution to ( 49 on page 52).
We have thus obtained formal asymptotic results as w H@ ∞ for the solution to ( 49 on page 52). Using
our procedure we can derive full asymptotic series in these limits, but the constants β and CI will remain
undetermined.
In Section 4, we will use our results for G0 w to obtain asymptotic expansions for the limiting density
P y as y €@ ∞.
D. C ASE u
Q
1 OR u ) 1
We next study ( 6 on page 45) for n
∞ but for fixed u ) 1 or u
1. First we assume that 0
Q
that u appears in ( 6 on page 45) only as a parameter. We assume an expansion of the form
Ln u > eA 8 u 9 nlog n.
for n
∞ and u
Q
B 8 u9 n C8 u9
n
Q
u
Q
1. Note
D u
(72)
1. The major contribution to the sum in ( 6 on page 45) will come from the midpoint
i z n 2. Using ( 72) in ( 6 on page 45) and noting that n 1 log n 1 = n logn log n 1 O n i logi ; n i log n i = n logn n log2 eAnlog n eA log n eA .
2n 1 i B Bn C
e n D
n 2
D2
>
2
N
+ O n 2 i n
2O
2C
eBn
un
n
n 2
3
eAnlog n e
, we obtain
Anlog 2
n
\
∑ exp + B 2 i 0
n % A % i n 2
2
-
1
and
Quicksort algorithm again revisited
for A
Q
57
0. The sum is asymptotically equal to

nπ 2 % A % , and thus
A
log u
log 2
D
2eB u2 ‚
C
1 log u
2 log 2
(73)
2 logu
log u
exp 3
6
π log2
log 2
and B B u remains arbitrary.
Some further information may be obtained by asymptotically matching ( 72 on the preceding page) to the
∞, u 1 O n expansion valid for n
1
. In the intermediate limit where u
∞ and n u 1 4ƒ ∞,
1, n
we use ( 63 on page 55) with w n u 1 to get
E
An 8 u 1 9
e
\ 
e28 u 1 9 nlog n e8 2γ 49 n8 u 1 9 E
G0 w G>
n 1 u exp g
3
1
log 2
2 2 βn 8 u
e
π log2
19
26 n u 1 logn log 1 u hl
As u „ 1, ( 72 on the preceding page) and ( 73) yields
e
exp g
u 1
O " u 1 log 2
2
n logn Bn h n1c 2 2eB ‚
f
2 1 u
π log2
and these two expressions agree provided that as u „ 1
B u >
3
1
log 2
26
u 1 log 1 u 5… 2γ 4 β Y u 1 /
1
u
(74)
This relates the behavior of B u as u „ 1 to the constant β in ( 63 on page 55).
Now consider n
∞ with u
)
1. The dominant contribution in the sum ( 6 on page 45) now comes from
the terms with i 0 and i n. Thus
Ln.
1
u
un
n 1
2L0 u Ln u 4 2L1 u Ln and
Ln u G> u 2n
k1 u ;
n!
n
2
where k1 u is an undetermined function. Since for u
k1 u $
1 4 as u
n
1
u 4;<<< ?>
2un
Ln u n 1
∞ u ) 1
∞ and all n
(76)
2, Ln u >
&
u
to
n
2
2n 4n! , we have
∞.
We examine asymptotic matching between the expansions for u ) 1 and u 1 O n eAn 8 u 1 9
(75)
1
. If ( 76) matches
G0 w then ( 71 on the page before) would agree with the expansion of ( 76) as u † 1. However
this cannot be true as the dominant exponential term in ( 71 on the page before) is
O e exp
while the dominant term in ( 76) as u
3
2ew
6
w
f
O n exp
‡
2en8 u 19
n u 1 'ˆ
o
(77)
1 is
O e exp
3
n2
u 1
2
6 f (78)
58
Charles Knessl and Wojciech Szpankowski
This suggests that yet another expansion is needed on some scale where n
∞ and u † 1 with n u 1 KM ∞.
By comparing ( 77 on the page before) to ( 78 on the preceding page), this new scale is likely to be
w n u 1
log n 2 loglog n O 1 Y
(79)
With this scaling both ( 77 on the page before) and ( 78 on the preceding page) are exp O n logn . We
have not examined this intermediate scale in any detail. However, we would guess that with ( 79), the
approximation of ( 6 on page 45) will involve retaining an infinite number of terms in this sum (rather than
just the 2 terms in ( 75 on the page before)), but not approximating the sum by an integral, as was possible
when u 1 O n 1
.
To summarize this section, we have analyzed ( 6 on page 45) in various asymptotic limits. These include
1, n fixed; (ii) u
(i) u
n
∞, n u 1 1, n
w fixed; (iii) w A@ ∞; (iv) 0
Q
u
Q
1, n
∞; and (v) u ) 1,
∞. In the next section, we use these results to obtain information about the distribution 0 n k .
4 Tails of the Limiting Distribution
Using the approximation ( 35 on page 49) for u 1 O n n E n ny
1
1 1
n 2πi
1
1 1
n 2πi
>
>
where Br
N
w
nO
1
C
e
yw
Br
1
2πi
ny 1
e
1
1
, we obtain
z K‰ ny .
An . 1Š
C
Ln z dz An log 8 1 . w c n 9}. An w c n
(80)
G0 w dw
1
P y
n
G0 w dw c i∞ c i∞ for some constant c, is any vertical contour in the w-plane. Here we have set
z 1 w n in the integral. It follows that
1
G0 w ∞
∞
ewy P y dy
(81)
so that G0 w is the moment generating function of the density P y . In view of ( 43 on page 50) and ( 44 on
page 50) we have ‹
∞
∞ P y
dy 1 and ‹
∞
∞ yP y dy 0.
Observe that, using ( 35 on page 49), ( 41 on page 50), and ( 42 on page 50), we can refine the approximation ( 80) to
" n E C
n ny
1
log n
3 P y 4
P1 y4 P: :! yP
n
n 1
1
yP: : y 4; γ 2 P: : y PŒ o n P2 y 4 P: y 5
n
2
where
Pk y
for k 1 2.
1
2πi
1
e
Br
yw
Gk w dw
1
6
(82)
Quicksort algorithm again revisited
59
An integral equation for P y can easily be derived from ( 49 on page 52). We multiply ( 49 on page 52)
by e
wy *
2πi and integrate over a contour Br in the w-plane:
1
P y 1
1
2πi
1
1
2πi
1
1
0
1
11
Br
e
w8 y . 19
1
1
Br 0
1
2πi
1
G0 w dw
e2wφ 8 x 9 G0 wx G0 w wx e
e
1
∞
wy
P ξ ewxξ dξ
∞
Br
∞ 1 ∞
P ξ P η ∞ ∞
1 1 1 ∞
1
y
P η P 3 x
x
∞
0
1 11 ∞
y 2 P 3 xt 1
0 ∞
1
∞
wy
dxdw
P η e8 w
∞
wx 9 η
dη e2wφ 8 x 9 dwdx
δ y xξ X 1 x η 2φ x dηdξdx
0
(83)
1 x
2φ x
η
6 dηdx
x
x
φ x
y 2 φ x
6 P 3 B 1 x t x
x
6
dtdx where we used the well known identity (cf. [1])
1
2πi
1
e
yw
Br
dw δ y
(84)
where δ y is Dirac’s delta function. The last expression is precisely ( 15 on page 46). The solution to this
integral equation is not unique: if P y is a solution, so is P y c for any c.
We study P y 2πi 1‹
yw G 0
Br e
w dw as y
integral will be determined by a saddle point w
∞. We argue that the asymptotic expansion of the
@
s y , which satisfies s yŽ@ ∞ as y
y@
∞. Thus for
y H ∞, we can use the approximation ( 63 on page 55) for G0 w , which yields
P y$>
1
2πi
E
1
2 2 E
π log2
E
Br
e
w exp
β y w a3
1
log 2
26 w log 0 w f dw
(85)
This integrand has a saddle point where
d
dw
e
y β w B
so that
w
1
exp
e
1
log 2
3
y β
f
2 1 log2
e f 26 w log 0 w W
0
w̃ y which satisfies w̃ y$| ∞ as y  ∞. Then the standard saddle point approximation to ( 85) yields
E
P y>
E
2 2 E
π log2
\
1
2πi
w̃ exp
1
exp
Br
e
e
1
2w̃
β y w̃ 3
1
log 2
2
1
β y
E
exp e
πe 2 log2 1
2 1 log2
3
1
log 2
26R w w̃
26 w̃ log 0 w̃ f
2
f
dw
2 1 log2
β y
exp 3
6
e
2 1 log2
(86)
f
for y  ∞. Thus, the left tail is very small and the behavior of P y as y  ∞ is similar to that of an
extreme value distribution (i.e., double-exponential distribution).
60
Charles Knessl and Wojciech Szpankowski
Now take y A ∞ and use ( 71 on page 56) to get
1
2πi
P y $>
1
1
Br
CI exp
2eu
1 du6
e
u
w
w
3
1
The saddle point now satisfies
d
dw
or y 2ew w. Let w I
e
1
yw yw w2 .K8 1 2γ 2log 2 9 w
2eu
duf
u
w
1
dw
(87)
0
∞ as y
w I‘ y be the solution to ( 19 on page 47) that satisfies w IŽ
∞. Then
expanding the integrand in ( 87) about w w I2 y and using the standard saddle point approximation yields
E
P y>
as y
E
CI
2 2π
y
1 1 wI

1
e exp
yw I
wJ
1
2eu
du w2I X 2γ 2 log2 w I7f
u
∞, from which ( 18 on page 47) easily follows. Thus for y
∞ we have P y
(88)
exp O 0 y logy and
hence the right tail is thinner than the right tail of an extreme value distribution. ¿From ( 19 on page 47) it is
easy to show that
wI
N
log
y
2O
loglog
y
2O
N
log log y 2 1 o 1 "4
log y 2 ∞
(89)
∞,
For fixed z and y we have, as n
1
0 n E C
n D
z
nz’>
" n E C
n &
P y dy
(90)
P u du (91)
∞
1
If z
y
∞
ny“>
y
∞ or y  ∞, then these integrals may be evaluated asymptotically using ( 86 on the preceding
”
page) and ( 88), and we obtain the results ( 17 on page 46) and ( 18 on page 47), respectively.
This derivation of the expansions of P y as y €@ ∞ has the disadvantage in that it assumes the existence
of certain saddle points. However, we can obtain the identical results simply by using the integral equation
for P y , which we now show.
Let us write ( 83 on the preceding page) in the form
1
P y 1
1
1
1
x 1 x
0
∞
∞
P
3
ξ y 2 φ x
1 x
6
P
3
ξ y 2 φ x
x
6
dξdx
(92)
and assume that for y  ∞, P y has the form
P y$> decy exp ae by
(93)
for some constants a b c d. Using ( 93) in ( 92) we find that the major contribution to the double integral
will come from x ξ 1
+ 2
1
2
1 eby c 2 u and ξ eby c 2 η we obtain
dec ecy exp ae b e by
e2cy e4c log 2
exp • ν u y 2 log2 5 2η 2 b2 2νb u2 y 2 log 2 1 4 2ηu 0– dudη
1
\
0- . After scaling x ∞ 1
∞
∞ ∞
5>
2eby d 2 exp 2ae by 2blog 2
e
Quicksort algorithm again revisited
where ν ae 2blog 2
61
1 b 3c 2 . The double integral is equal to π 2ν 2e
e
2d 2
2blog 2
dec
b
ecy
e8 b.
π 4c log 2 e
b
2a
2 1 b
1
1 log2
2
c
and thus the above yields
2c 9 y
(94)
3 c 2 2blog 2
e
b
d
2a 3 c
b
π
1c 2
1
6
b
2
3
We thus have
b
1c 2
2
{
1
b
2
(95)
and a remains arbitrary. But then ( 93 on the page before) is equivalent to ( 86 on page 59) if we identify a as
2 1 log2
β
exp e
f
e
2 1 log2
a
(96)
Next we consider y | ∞ and write the equation for P y in the form
1
P y 1
1c 2
2
1
1
1 x
0
∞
∞
x y η * 2φ x
1 x
P η P 3 y 6
dηdx (97)
We seek solutions in the WKB form
P y> G y eF 8 y 9
y € ∞
(98)
where log G varies less rapidly than F. Then ( 97) may be approximated by
eF 8 y 9}.
1
1c 2
2
e
After scaling x
ω
∞
1
F : : y4;<<< f
2
G: y 5…<
<< e 1 G y5
e
P η eF 8 y 9 exp g
x y η
1 x
x y η * 2φ x
G: y4;<<< f
1 x
G y4
∞
1 x
0
\
1
1
F ~ 8 y9
n
2φ x f F : y
1 x
1
2
1
3
(99)
h
x y η * 2φ x
1 x
2
6
F : : y 4;<<
< o dηdx yF : yP , the leading term in the right side of ( 99) is eF 8 y 9 G y Y0 2 "* yF : y " and
hence
eF ~=8 y 9
2
yF : y (100)
After some calculation, we obtain the following linear differential equation for G:
G:
1
F: : G 2
G
yF :
2G
G:
γ log 7 yF : Pd
y F:
F: :
G
F: 2
(101)
Equation ( 100) is a first order non-linear ordinary differential equation for F, which is really just a transcendental equation for F : . Setting F : y $W— w I' y we have
1
F y
1
y
w I' t dt
wJ"8 y 9
2ζ
‡
1
wJ
yw I* 2
eζ
ζ
eζ
dζ
ζ2 ˆ
eζ
dζ
ζ
(102)
62
Charles Knessl and Wojciech Szpankowski
where we have used ( 100 on the preceding page) to change the variable of integration. But then the dominant
exponential terms in ( 88 on page 60) and ( 98 on the preceding page) agree precisely. To solve ( 101 on the
˜
page before) we change variables y
G: y G y
F : : y dG
G dw I
where we have used log 0 yF : w I and write ( 101 on the preceding page) as
1
F:
1
3
log 2e
ewJ
1
6
F~ 3
1
wI
e
F: :
F: 2
1
F: :
2
1
yF :
2F :
y
2 γ log2 y
f
(103)
log 2 F : . From ( 100 on the page before) we get
1
6
w2I
7
F : : y "
1
2
(104)
Using ( 104) in ( 103) we obtain, after some calculation,
1 dG
G dw I
1
2
1 1
2 wI 1
2w IK 2 γ log2 Y
Solving ( 105) we see that ( 98 on the page before) is equivalent to ( 88 on page 60), noting that

E
(105)
y
2 w I exp w I 2 . Of course, G is determined only up to a multiplicative constant, which corresponds to
E
CI* 2 π in ( 88 on page 60).
References
[1] Abramowitz, M. and Stegun, I., eds. Handbook of Mathematical Functions. Wiley, New York, 1962.
[2] C. Bender and S. Orszag, Advanced Mathematical Methods for Scientists and Engineers, McGrew-Hill,
New York 1978.
[3] E. R. Canfield, “From Recursions to Asymptotics: On Szekeres” Formula for the Number of Partitions,
Electronic J. of Combinatorics, 4, RR6, 1997.
[4] M. Cramer, A Note Concerning the Limit Distribution of the Quicksort Algorithm, Theoretical Informatics and Applications, 30, 195–207, 1996.
[5] L. Devroye, A Note on the Height of binary Search Trees, J. ACM, 489–498, 1986.
[6] W. Eddy and M. Schervish, How Many Comparisons Does Quicksort Use, J. Algorithms, 19, 402–431,
1995.
[7] N. Froman and P. Froman, JWKB Approximation, North-Holland, Amsterdam 1965.
[8] P. Hennequin, Combinatorial Analysis of Quicksort Algorithm, Theoretical Informatics and Applications, 23, 317–333, 1989.
[9] P. Hennequin, Analyse en Moyenne d’Algorithmes, Tri Rapide at Arbres de Recherche, Ph.D. Thesis,
Ecole Politechnique, Palaiseau 1991.
[10] P. Henrici, Applied and Computational Complex Analysis, John Wiley&Sons, New York 1977.
[11] C.A.R. Hoare, Quicksort, Comput. J., 5, 10–15, 1962.
[12] P. Jacquet and W. Szpankowski, Asymptotic Behavior of the Lempel-Ziv Parsing Scheme and Digital
Search Trees, Theoretical Computer Science, 144, 161-197, 1995.
[13] C. Knessl and J.B. Keller, Partition Asymptotics from Recursions, SIAM J. Appl. Math., 50, 323–338,
1990.
Quicksort algorithm again revisited
63
[14] D. Knuth, The Art of Computer Programming. Sorting and Searching, Addison-Wesley (1973).
[15] C.J. McDiarmid and R. Hayward, Large Deviations for Quicksort, J. Algorithms, 21, 476–507, 1996.
[16] H. Mahmoud, Evolution of Random Search Trees, John Wiley & Sons, New York 1992.
[17] S. Rachev and L. Rüschendorf, Probability Metrics and Recursive Algorithms, Adv. Appl. Prob., 27,
770–799, 1995.
[18] M. Régnier, A Limiting Distribution for Quicksort, Theoretical Informatics and Applications, 23, 335–
343, 1989.
[19] U. Rösler, A Limit Theorem for Quicksort, Theoretical Informatics and Applications, 25, 85–100, 1991.
[20] U. Rösler, A Fixed Point Theorem for Distributions, Stochastic Processes and Their Applications, 42,
195–214, 1992.
[21] K.H. Tan and P. Hadjicostas, Some Properties of a Limiting Distribution in Quicksort, Statistics &
Probability Letters, 25, 87–94, 1995.
64
Charles Knessl and Wojciech Szpankowski
Download