THE LEVEL OF NODES IN HEAP ORDERED TREES
HELMUT PRODINGER
Abstract. A heap ordered tree with $n$ nodes ("size $n$") is a planted plane tree together
with a bijection from the nodes to the set $\{1, \dots, n\}$ which is monotonically increasing when
going from the root to the leaves. We consider the level of the node $j$ in a (random) heap
ordered tree of size $n \ge j$. This distribution does not depend on $n$. Precise expressions are
derived for the expectation, the variance, and the probability distribution.
1. Heap ordered trees
A heap ordered tree with $n$ nodes ("size $n$") might be described as a planted plane tree together with a bijection from the nodes to the set $\{1, \dots, n\}$ which is monotonically increasing
when going from the root to the leaves.
Figure 1. All 15 heap ordered trees with 4 nodes.
In this note, we want to concentrate on the level of the node $j$ in a (random) heap ordered
tree of size $n \ge j$. This extends some of the results in [9] and [2] where the average over
$j$ (the depth of a random node) was considered. It will turn out, surprisingly, that this
distribution does not depend on $n$. Precise expressions are derived for the expectation, the
variance, and the probability distribution. A general reference for heap ordered trees and,
more generally, increasing trees is [1].
Date: May 7, 1996.

The enumeration of the numbers $a_n$ of heap ordered trees of size $n$ is easy and appears
already in [10]. The recursion is
\[
a_{n+1} = \sum_{m \ge 1} \; \sum_{h_1 + \dots + h_m = n} \binom{n}{h_1, \dots, h_m} a_{h_1} \cdots a_{h_m} \quad \text{for } n \ge 1, \qquad a_1 = 1, \tag{1}
\]
hence the exponential generating function $A(z) := \sum_{n \ge 0} a_n \frac{z^n}{n!}$ fulfills the differential equation
\[
A'(z) = \frac{1}{1 - A(z)} \quad \text{with } A(0) = 0,
\]
with the solution
\[
A(z) = 1 - \sqrt{1 - 2z},
\]
so that
\[
a_n = n! \, 2^{1-n} C_n = 1 \cdot 3 \cdot 5 \cdots (2n - 3),
\]
with a (shifted) Catalan number $C_n = \frac{1}{n} \binom{2n-2}{n-1}$.
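Recursion (1) and the closed form for $a_n$ can be compared numerically. The following is a minimal Python sketch, not part of the original derivation; the helper names `compositions`, `multinomial`, `a` and `odd_product` are illustrative choices.

```python
from functools import lru_cache
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers summing to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial(n, parts):
    """Multinomial coefficient n! / (h_1! ... h_m!), assuming sum(parts) == n."""
    result = factorial(n)
    for h in parts:
        result //= factorial(h)
    return result


@lru_cache(maxsize=None)
def a(n):
    """Number of heap ordered trees of size n, computed from recursion (1)."""
    if n == 1:
        return 1
    total = 0
    # The root has m >= 1 subtrees whose sizes h_1, ..., h_m sum to n - 1.
    for m in range(1, n):
        for hs in compositions(n - 1, m):
            term = multinomial(n - 1, hs)
            for h in hs:
                term *= a(h)
            total += term
    return total


def odd_product(n):
    """Closed form 1 * 3 * 5 * ... * (2n - 3)."""
    result = 1
    for odd in range(1, 2 * n - 2, 2):
        result *= odd
    return result


print([a(n) for n in range(1, 6)])  # [1, 1, 3, 15, 105]
```

The value `a(4) == 15` agrees with the 15 trees of Figure 1.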
Now let us consider the average level of node $j$ in a heap ordered tree of size $n$. For that
purpose we need a different recursion than (1). We want to fix the subtree which contains
$j$ and say that it is the first subtree. However, $j$ can be in any of the $m$ subtrees. So we
introduce a factor $m$. (Of course, we assume $j \ge 2$.) But then, we cannot choose $h_1$ numbers
out of $\{2, \dots, n+1\}$, since we know already that $j$ is there!
Therefore our alternative recursion is
\[
a_{n+1} = \sum_{m \ge 1} m \sum_{h_1 + \dots + h_m = n} \binom{n-1}{h_1 - 1, h_2, \dots, h_m} a_{h_1} \cdots a_{h_m} \quad \text{for } n \ge 1, \qquad a_1 = 1. \tag{2}
\]
Let us test via generating functions that this is indeed correct. We multiply (2) by $\frac{z^{n-1}}{(n-1)!}$
and sum up. The right hand side becomes
\[
\frac{d}{du} \, \frac{u}{1 - u A(z)} \bigg|_{u=1} A'(z) = \frac{1}{(1 - 2z)^{3/2}}.
\]
The left hand side is $A''(z)$, which equals the same quantity.
Now we want the probability that $j$ lies in a subtree of size $k$. This is
\[
\binom{n-1}{k-1} \frac{a_k}{a_{n+1}} \sum_{m \ge 1} m \sum_{h_2 + \dots + h_m = n-k} \binom{n-k}{h_2, \dots, h_m} a_{h_2} \cdots a_{h_m}.
\]
We compute the factor of $\binom{n-1}{k-1} \frac{a_k}{a_{n+1}}$ in this expression by multiplying this factor by $\frac{z^{n-k}}{(n-k)!}$
and summing up. The result is amazingly simple:
\[
\frac{d}{du} \, \frac{u}{1 - u A(z)} \bigg|_{u=1} = \frac{1}{1 - 2z},
\]
whence our sought probability that $j$ lies in a subtree of size $k$ turns out to be
\[
\binom{n-1}{k-1} \frac{a_k}{a_{n+1}} \, 2^{n-k} (n-k)!.
\]
Again, it is an easy check that the sum on k of this quantity is 1.
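The check that these probabilities sum to 1 can also be done with exact rational arithmetic. The sketch below is illustrative only; it assumes the closed form $a_n = 1 \cdot 3 \cdots (2n-3)$ from above, and the function names are not from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def a(n):
    """a_n = 1 * 3 * 5 * ... * (2n - 3), the number of heap ordered trees of size n."""
    result = 1
    for odd in range(1, 2 * n - 2, 2):
        result *= odd
    return result


def subtree_size_probability(n, k):
    """Probability that node j >= 2 lies in a root subtree of size k (tree size n + 1)."""
    numerator = comb(n - 1, k - 1) * a(k) * 2 ** (n - k) * factorial(n - k)
    return Fraction(numerator, a(n + 1))


# For a tree of size 4 (n = 3) the three probabilities are 8/15, 4/15 and 3/15.
print([subtree_size_probability(3, k) for k in (1, 2, 3)])
```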
Now, let $D_{n,j}$ be the average level of node $j$. To avoid misinterpretations, we say that
the root (= the node `1') is on level 0. Using the last probability, we can set up a recursion
for $D_{n,j}$. However, we must take care of the fact that in its subtree, `$j$' no longer means `$j$'.
The numbers in the subtree are to be replaced by $1, 2, \dots$, according to their
relative order. Let us compute the probability that $j$ will be $i$ after this procedure, or, what
is the same, that $j$ is the $i$th smallest number in its subtree. It is
\[
\frac{\binom{j-2}{i-1} \binom{n+1-j}{k-i}}{\binom{n-1}{k-1}},
\]
since $i - 1$ numbers have to be chosen from $\{2, 3, \dots, j-1\}$ and $k - i$ numbers from
$\{j+1, \dots, n+1\}$.
Therefore we finally found our recursion:
\[
D_{n+1,j} = 1 + \sum_{k=1}^{n} \binom{n-1}{k-1} \frac{a_k}{a_{n+1}} \, 2^{n-k} (n-k)! \sum_{i=1}^{k} \frac{\binom{j-2}{i-1} \binom{n+1-j}{k-i}}{\binom{n-1}{k-1}} \, D_{k,i}. \tag{3}
\]
This recursion is valid for $n + 1 \ge j \ge 2$. Additionally, we have, since `1' is always on level
0, that $D_{n,1} = 0$ for all $n \ge 1$.
The solution of this recursion is found by inspection¹ (!), and it transpires that
\[
D_{n,j} = H_{2j-2} - \tfrac{1}{2} H_{j-1}, \quad \text{independently of } n \ (n \ge j).
\]
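Recursion (3) is easy to iterate with exact fractions, which reproduces the kind of experiment that suggests the closed form. This is a sketch under the assumption $a_n = 1 \cdot 3 \cdots (2n-3)$; the names `average_level` and `closed_form` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


def a(n):
    """a_n = 1 * 3 * 5 * ... * (2n - 3), the number of heap ordered trees of size n."""
    result = 1
    for odd in range(1, 2 * n - 2, 2):
        result *= odd
    return result


def harmonic(n):
    """Harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


@lru_cache(maxsize=None)
def average_level(size, j):
    """D_{size, j} from recursion (3); requires size >= j >= 1."""
    if j == 1:
        return Fraction(0)  # the root is always on level 0
    n = size - 1  # recursion (3) is stated for trees of size n + 1
    total = Fraction(0)
    for k in range(1, n + 1):
        # Probability that node j lies in a root subtree of size k.
        size_prob = Fraction(
            comb(n - 1, k - 1) * a(k) * 2 ** (n - k) * factorial(n - k), a(n + 1)
        )
        for i in range(1, k + 1):
            # Probability that j receives the label i inside that subtree.
            rank_prob = Fraction(
                comb(j - 2, i - 1) * comb(n + 1 - j, k - i), comb(n - 1, k - 1)
            )
            if rank_prob:
                total += size_prob * rank_prob * average_level(k, i)
    return 1 + total


def closed_form(j):
    """H_{2j-2} - H_{j-1} / 2."""
    return harmonic(2 * j - 2) - harmonic(j - 1) / 2


# The value for j = 3 is 4/3 for every tree size from 3 upwards.
print([average_level(size, 3) for size in range(3, 8)])
```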
Since we have such a simple result, we should explain it and prove it. The independence
of $n$ means that the nodes larger than $j$ don't change anything. We might say that a random
tree of $n$ nodes (each heap ordered tree is equally likely) remains `random' if we cut off the
nodes larger than $j$. This is in fact not too hard to see. First, it is enough to show that
randomness is preserved if we erase node $n$. Or, since
\[
\frac{a_{n+1}}{a_n} = 2n - 1,
\]
we might alternatively show that from each tree with $n$ nodes, we get $2n - 1$ new trees by
inserting the new node $n + 1$. We can attach this new node to every node, but the relative
order in the plane is important, so if a certain node has $i$ outgoing branches, it gives us $i + 1$
possibilities, namely to the left of all, between the first and second edge, etc. Denoting by $d(k)$
the number of outgoing branches of node $k$, we have altogether
\[
\sum_{k=1}^{n} \bigl( 1 + d(k) \bigr) = n + \text{number of edges} = 2n - 1,
\]
as desired.

Figure 2. A heap ordered tree with 3 nodes and the 5 trees obtained by inserting a fourth node.
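The insertion argument also gives a simple way to sample a uniformly random heap ordered tree: the new node picks one of the $2n - 1$ gaps uniformly, so node $k$ becomes its parent with probability $(1 + d(k))/(2n - 1)$. The following Python sketch illustrates this; it tracks only parent links (enough for levels), and the function names and the seed are arbitrary choices, not part of the paper.

```python
import random


def random_heap_ordered_tree(n, rng):
    """Return a dict parent[label] for a random heap ordered tree of size n.

    The left-to-right position among siblings is not recorded; only the
    parent of each node is, which suffices to compute levels.
    """
    parent = {1: None}
    out_degree = {1: 0}
    for new in range(2, n + 1):
        labels = list(out_degree)
        # Node k offers 1 + d(k) gaps; the weights sum to 2 * (new - 1) - 1.
        weights = [1 + out_degree[k] for k in labels]
        chosen = rng.choices(labels, weights=weights)[0]
        parent[new] = chosen
        out_degree[chosen] += 1
        out_degree[new] = 0
    return parent


def level(parent, node):
    """Number of edges between `node` and the root (the root is on level 0)."""
    depth = 0
    while parent[node] is not None:
        node = parent[node]
        depth += 1
    return depth


rng = random.Random(2024)
samples = 20000
estimate = sum(level(random_heap_ordered_tree(5, rng), 5) for _ in range(samples)) / samples
print(estimate, 1 + 1 / 3 + 1 / 5 + 1 / 7)  # empirical versus exact average level of node 5
```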
¹ We used Maple to `see' it.
To solve the recursion, we set $n = j - 1$ and write $D_j = D_{j,j}$ for simplicity. Then (3) turns
into
\[
D_j = 1 + \sum_{k=1}^{j-1} \frac{a_k}{a_j} \, 2^{j-1-k} (j-1-k)! \binom{j-2}{k-1} D_k. \tag{4}
\]
With
\[
E_j := \frac{a_j D_j}{2^j (j-1)!}
\]
we get
\[
2(j-1) E_j = \frac{2^{1-j} a_j}{(j-2)!} + \sum_{k=1}^{j-1} E_k,
\]
or, by taking differences of two consecutive equations,
\[
2j E_{j+1} = (2j-1) E_j + \frac{a_j}{2^j (j-1)!},
\]
which means that
\[
D_{j+1} = D_j + \frac{1}{2j-1}.
\]
Unrolling this recursion, we get
\[
D_j = 1 + \frac{1}{3} + \dots + \frac{1}{2j-3} = H_{2j-2} - \tfrac{1}{2} H_{j-1},
\]
and that was to be demonstrated.
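We can get an easy corollary: Since
\[
\sum_{j=1}^{n} \sum_{i=1}^{j-1} \frac{1}{2i-1} = \sum_{i=1}^{n} \frac{n-i}{2i-1} = \sum_{i=1}^{n} \frac{(n - \frac{1}{2}) - (i - \frac{1}{2})}{2i-1} = \Bigl( n - \frac{1}{2} \Bigr) \Bigl( H_{2n} - \frac{1}{2} H_n \Bigr) - \frac{n}{2},
\]
we get, by dividing this by $n$, the average depth $d_n$ of a random node
\[
d_n = \Bigl( 1 - \frac{1}{2n} \Bigr) \Bigl( H_{2n} - \frac{1}{2} H_n \Bigr) - \frac{1}{2},
\]
a result from [9] and [2].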
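The corollary can be confirmed by averaging $D_j$ over $j$ with exact fractions. This is a small illustrative sketch; the function names are not from the paper.

```python
from fractions import Fraction


def harmonic(n):
    """Harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


def mean_level(j):
    """D_j = H_{2j-2} - H_{j-1} / 2, the average level of node j."""
    return harmonic(2 * j - 2) - harmonic(j - 1) / 2


def average_depth(n):
    """d_n = (1 - 1/(2n)) (H_{2n} - H_n / 2) - 1/2."""
    return (1 - Fraction(1, 2 * n)) * (harmonic(2 * n) - harmonic(n) / 2) - Fraction(1, 2)


n = 4
print(sum(mean_level(j) for j in range(1, n + 1)) / n, average_depth(n))
```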
Now we want to compute the variance. Very much in the style of (3) we can set up a
recursion for the probability generating functions $F_{n,j}(x)$:
\[
F_{n+1,j}(x) = x \sum_{k=1}^{n} \binom{n-1}{k-1} \frac{a_k}{a_{n+1}} \, 2^{n-k} (n-k)! \sum_{i=1}^{k} \frac{\binom{j-2}{i-1} \binom{n+1-j}{k-i}}{\binom{n-1}{k-1}} \, F_{k,i}(x).
\]
As before, we freely drop the index $n$, by replacing $n$ by $j - 1$:
\[
F_j(x) = x \sum_{k=1}^{j-1} \frac{a_k}{a_j} \, 2^{j-1-k} (j-1-k)! \binom{j-2}{k-1} F_k(x). \tag{5}
\]
With the shorthand notation
\[
E_j = \frac{a_j F_j(x)}{2^j (j-1)!}
\]
we find
\[
2(j-1) E_j = x \sum_{k=1}^{j-1} E_k,
\]
whence
\[
2j E_{j+1} - 2(j-1) E_j = x E_j,
\]
or
\[
E_j = \frac{2j-4+x}{2j-2} \, E_{j-1} = \frac{1}{2^{j} (j-1)!} \prod_{i=1}^{j-1} (2i - 2 + x).
\]
Therefore
\[
F_j(x) = \prod_{i=1}^{j-1} \frac{2i-2+x}{2i-1}. \tag{6}
\]
When we have the probability generating function in a factored form where each factor
constitutes a probability generating function itself, it is particularly easy to get the variance,
see [5].
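\[
V_j := \operatorname{Var}\{F_j(x)\} = \sum_{i=1}^{j-1} \operatorname{Var}\Bigl\{ \frac{2i-2+x}{2i-1} \Bigr\} = \sum_{i=1}^{j-1} \Bigl( \frac{1}{2i-1} - \frac{1}{(2i-1)^2} \Bigr).
\]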
The reader might have noticed that in this paper a modified notion of harmonic numbers
(of first and second order) would be more appropriate. If we define
\[
\widehat{H}_n = \sum_{k=1}^{n} \frac{1}{2k-1} \quad \text{and} \quad \widehat{H}_n^{(2)} = \sum_{k=1}^{n} \frac{1}{(2k-1)^2},
\]
then we have
\[
V_j = \widehat{H}_{j-1} - \widehat{H}_{j-1}^{(2)};
\]
in terms of harmonic numbers this equals
\[
V_j = H_{2j-2} - \tfrac{1}{2} H_{j-1} - H_{2j-2}^{(2)} + \tfrac{1}{4} H_{j-1}^{(2)}.
\]
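As a sanity check, the variance can be computed directly from the coefficients of the product (6) and compared with the harmonic-number expression. The sketch below expands the product with exact fractions; the function names are illustrative assumptions.

```python
from fractions import Fraction


def harmonic(n, order=1):
    """Harmonic number of the given order as an exact fraction (value 0 for n = 0)."""
    return sum((Fraction(1, i ** order) for i in range(1, n + 1)), Fraction(0))


def level_pgf(j):
    """Coefficients [p_0, p_1, ...] of F_j(x) = prod_{i=1}^{j-1} (2i - 2 + x) / (2i - 1)."""
    coeffs = [Fraction(1)]
    for i in range(1, j):
        constant = Fraction(2 * i - 2, 2 * i - 1)
        linear = Fraction(1, 2 * i - 1)
        product = [Fraction(0)] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            product[k] += constant * c
            product[k + 1] += linear * c
        coeffs = product
    return coeffs


def variance_from_pgf(coeffs):
    """Variance of the distribution whose probabilities are `coeffs`."""
    mean = sum(k * p for k, p in enumerate(coeffs))
    second_moment = sum(k * k * p for k, p in enumerate(coeffs))
    return second_moment - mean * mean


def variance_closed_form(j):
    """H_{2j-2} - H_{j-1}/2 - H^{(2)}_{2j-2} + H^{(2)}_{j-1}/4."""
    return (
        harmonic(2 * j - 2)
        - harmonic(j - 1) / 2
        - harmonic(2 * j - 2, 2)
        + harmonic(j - 1, 2) / 4
    )


print(variance_from_pgf(level_pgf(6)), variance_closed_form(6))
```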
From (6) it is even possible to get an explicit expression for the coefficient of $x^k$, which is
the probability that node $j$ is at level $k$. We write
\[
F_j(x) = \frac{1}{a_j} \, x (x+2) \cdots \bigl( x + 2(j-2) \bigr) = \frac{(-2)^{j-1}}{a_j} \Bigl( \frac{x}{-2} \Bigr)^{\underline{j-1}} = \frac{2^{j-1}}{a_j} \sum_{k=0}^{j-1} {j-1 \brack k} \frac{x^k}{2^k},
\]
and therefore
\[
[x^k] F_j(x) = \frac{2^{j-1}}{a_j} {j-1 \brack k} \frac{1}{2^k}.
\]
Here, we used the notion of Stirling's cycle numbers ${n \brack k}$, compare [3].
For the reader's convenience we collect our main findings as follows.
Theorem 1. The average and the variance of the level (depth) of the node with label $j$ in
a heap ordered tree of at least $j$ nodes are given by
\[
D_j = \widehat{H}_{j-1}, \qquad V_j = \widehat{H}_{j-1} - \widehat{H}_{j-1}^{(2)}.
\]
The probability that node $j$ has level $k$ is given by
\[
\frac{2^{j-1-k}}{a_j} {j-1 \brack k}.
\]
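For small sizes the probability formula of Theorem 1 can be checked by brute force. The sketch below generates all heap ordered trees by the insertion procedure described earlier and tabulates the level of every node; the tree representation (nested `(label, children)` tuples) and all function names are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache


def insertions(tree, new_label):
    """Yield every tree obtained by attaching `new_label` as a new leaf of `tree`."""
    label, children = tree
    # Attach below this node: one choice per gap between its children.
    for pos in range(len(children) + 1):
        yield (label, children[:pos] + ((new_label, ()),) + children[pos:])
    # Or attach somewhere inside one of the subtrees.
    for idx, child in enumerate(children):
        for grown in insertions(child, new_label):
            yield (label, children[:idx] + (grown,) + children[idx + 1:])


def heap_ordered_trees(n):
    """All heap ordered trees of size n."""
    trees = [(1, ())]
    for new_label in range(2, n + 1):
        trees = [grown for tree in trees for grown in insertions(tree, new_label)]
    return trees


def levels(tree, depth=0):
    """Yield (label, level) pairs; the root is on level 0."""
    label, children = tree
    yield label, depth
    for child in children:
        yield from levels(child, depth + 1)


@lru_cache(maxsize=None)
def stirling_cycle(n, k):
    """Stirling cycle number via [n, k] = (n - 1) [n - 1, k] + [n - 1, k - 1]."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return (n - 1) * stirling_cycle(n - 1, k) + stirling_cycle(n - 1, k - 1)


def a(n):
    """a_n = 1 * 3 * 5 * ... * (2n - 3)."""
    result = 1
    for odd in range(1, 2 * n - 2, 2):
        result *= odd
    return result


def level_probability(j, k):
    """Theorem 1: probability that node j is on level k, for 0 <= k <= j - 1."""
    return Fraction(2 ** (j - 1 - k) * stirling_cycle(j - 1, k), a(j))


def empirical_distribution(n):
    """Map (j, k) to the exact fraction of size-n trees with node j on level k."""
    trees = heap_ordered_trees(n)
    counts = Counter(pair for tree in trees for pair in levels(tree))
    return {pair: Fraction(c, len(trees)) for pair, c in counts.items()}


dist = empirical_distribution(4)
print(dist[(3, 1)], level_probability(3, 1))  # both are 2/3
```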
2. Binary trees
For the sake of completeness and comparison we briefly sketch the corresponding considerations for the instance of binary trees. However, mutatis mutandis, these are well known
and old results, going back to Lynch, Hibbard, Knuth and others, see [4, 7, 6, 8].
The recursion for the increasing binary trees is
\[
a_{n+1} = \sum_{k=0}^{n} \binom{n}{k} a_k a_{n-k}
\]
with the obvious solution $a_n = n!$. The probability that $j$ lies in a subtree of size $k$ is
\[
2 \binom{n-1}{k-1} \frac{k! \, (n-k)!}{(n+1)!} = \frac{2k}{n(n+1)}.
\]
The recursion for the probability generating functions is
\[
F_{n+1,j}(x) = x \sum_{k=1}^{n} \frac{2k}{n(n+1)} \sum_{i=1}^{k} \frac{\binom{j-2}{i-1} \binom{n+1-j}{k-i}}{\binom{n-1}{k-1}} \, F_{k,i}(x),
\]
and the specialization to $n = j - 1$ is
\[
F_j(x) = \frac{2x}{j(j-1)} \sum_{k=1}^{j-1} k F_k(x),
\]
from which we derive
\[
(j+1) F_{j+1}(x) = (j - 1 + 2x) F_j(x)
\]
and thus
\[
F_j(x) = \prod_{i=2}^{j} \frac{i + 2(x-1)}{i}.
\]
Therefore the expectation is
\[
D_j = \sum_{i=2}^{j} \frac{2}{i} = 2(H_j - 1),
\]
and the variance is
\[
V_j = \sum_{i=2}^{j} \Bigl( \frac{2}{i} - \frac{4}{i^2} \Bigr) = 2 H_j - 4 H_j^{(2)} + 2.
\]
Furthermore, the probability generating function $F_j(x)$ can be written as
\[
F_j(x) = \frac{1}{j!} (-1)^{j-1} (-2x)^{\underline{j-1}} = \frac{1}{j!} \sum_{k=0}^{j-1} {j-1 \brack k} (2x)^k,
\]
whence we find for the probability that the node $j$ lies on level $k$ the quantity
\[
\frac{2^k}{j!} {j-1 \brack k}.
\]
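The binary case can be checked by brute force in the same spirit. The sketch below assumes that an increasing binary tree of size $n + 1$ arises from one of size $n$ by placing the new largest label into one of the $n + 1$ free child positions, which is consistent with $a_n = n!$; the representation `(label, left, right)` and the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache
from math import factorial


def insertions(tree, new_label):
    """Yield every tree obtained by placing `new_label` into a free child position."""
    if tree is None:
        yield (new_label, None, None)
        return
    label, left, right = tree
    for grown in insertions(left, new_label):
        yield (label, grown, right)
    for grown in insertions(right, new_label):
        yield (label, left, grown)


def increasing_binary_trees(n):
    """All increasing binary trees of size n."""
    trees = [(1, None, None)]
    for new_label in range(2, n + 1):
        trees = [grown for tree in trees for grown in insertions(tree, new_label)]
    return trees


def levels(tree, depth=0):
    """Yield (label, level) pairs; the root is on level 0."""
    if tree is None:
        return
    label, left, right = tree
    yield label, depth
    yield from levels(left, depth + 1)
    yield from levels(right, depth + 1)


@lru_cache(maxsize=None)
def stirling_cycle(n, k):
    """Stirling cycle number via [n, k] = (n - 1) [n - 1, k] + [n - 1, k - 1]."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    return (n - 1) * stirling_cycle(n - 1, k) + stirling_cycle(n - 1, k - 1)


def level_probability(j, k):
    """Probability 2^k [j-1, k] / j! that node j is on level k."""
    return Fraction(2 ** k * stirling_cycle(j - 1, k), factorial(j))


def empirical_distribution(n):
    """Map (j, k) to the exact fraction of size-n trees with node j on level k."""
    trees = increasing_binary_trees(n)
    counts = Counter(pair for tree in trees for pair in levels(tree))
    return {pair: Fraction(c, len(trees)) for pair, c in counts.items()}


dist = empirical_distribution(5)
print(dist[(3, 2)], level_probability(3, 2))  # both are 2/3
```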
References
[1] F. Bergeron, P. Flajolet, and B. Salvy. Varieties of increasing trees. Lecture Notes in Computer Science,
581:24–48, 1992.
[2] W.-C. Chen and W.-C. Ni. On the average altitude of heap-ordered trees. International Journal of
Foundations of Computer Science, 15:99–109, 1994.
[3] R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete Mathematics. Addison-Wesley, 1989.
[4] T. N. Hibbard. Some combinatorial properties of certain trees with applications to searching and sorting.
Journal of the ACM, 9(1):13–28, January 1962.
[5] D. E. Knuth. The Art of Computer Programming, volume 1: Fundamental Algorithms. Addison-Wesley,
1968. Second edition, 1973.
[6] D. E. Knuth. The Art of Computer Programming, volume 3: Sorting and Searching. Addison-Wesley,
1973.
[7] W. C. Lynch. More combinatorial problems on certain trees. Computer Journal, 7:299–302, 1965.
[8] H. M. Mahmoud. Evolution of Random Search Trees. John Wiley, New York, 1992.
[9] H. Prodinger. Depth and path length of heap ordered trees. International Journal of Foundations of
Computer Science, 1996 (to appear).
[10] H. Prodinger and F. J. Urbanek. On monotone functions of tree structures. Discrete Applied Mathematics,
5:223–239, 1983.
Institut für Algebra und Diskrete Mathematik, Technical University of Vienna, Wiedner
Hauptstrasse 8–10, A-1040 Vienna, Austria.
E-mail address: Helmut.Prodinger@tuwien.ac.at
WWW-address: http://info.tuwien.ac.at/theoinf/proding.htm