THE TAKiCS THE DISTKmUTION NUMBER

advertisement
Journal of Applied Mathematics and Stochastic Analysis 4, Number 3, Fall 1991, 175-186
ON THE DISTKmUTION OF THE NUMBER OF
VERTICES IN LAYERS OF PNDOM TIESa
LAJOS TAKiCS
Case Western Reserve University
Cleveland, Ohio2
ABSTR&CT
Denote by S n the set of Ml distinct rooted trees with n lbeled
vertices. A tree is chosen t random in the set Sn, assuming that all the
possible n n- 1 choices are equally probable. Define 7"n(m as the number
of vertices in layer m, that is, the number of vertices at a distance m
from the root of the tree. The distance of a vertex from the root is the
number of edges in the path from the vertex to the root. This paper is
concerned with the distribution and the moments of rn(m ) and their
asymptotic behavior in the case where m = [2c], 0 < c < oo and n-c.
In addition, more random trees, branching processes, the Bernoulli
excursion and the Brownian excursion are also considered.
Key words: Random trees, Branching processes, Bernoulli
excursion, Brownian excursion, Local times, Limit theorems.
AMS (MOS)subject classifications: 60F05, 05C05, 60J55,
602165, 60J80.
I. INTRODUUTION
In 1889, A. Cayley [3] observed that the number of distinct trees with n labeled
vertices is n n- 2. Since then various proofs have been found for Cayley’s formula. For a
simple proof see L. Takhcs [23]. The number of distinct rooted trees with n labeled vertices is
Rn=n n-1
for n=l, 9,
Since among the n vertices we can choose a root in n ways,
(1)
(1) immediately
follows from Cayley’s formula.
The number of vertices in layer m in a rooted tree is the number of vertices at a
distance m from the root. The distance of a vertex from the root is the number of edges in the
path from the vertex to the root.
1Received:
2postal Address:
May, 1991. Revised: June, 1991.
2410 Newbury Drive, Cleveland Heights. OH 44118
Printed in the U.S.A. (C) 1991 The Society of Applied Mathematics, Modeling and Simulation
175
LAJOS TAKCS
176
Let S n be the set of all distinct rooted trees with n labeled vertices and denote by
tn(j,m),j = O, 1,...,n-m, the number of trees in S n having j vertices at a distance m from
the root. Let us choose a tree at random in the set Sn, assuming that all the possible n n- 1
choices are equally probable. Define
rn(m)
as the number of vertices in layer m, that is, the
number of vertices at a distance m from the root of the tree chosen at random. If all the
possible trees in Sn are equally probable, then
P{rn(m) = j} = t,(j,m)/nn-1
for j
= 0, 1,..., n
(2)
m.
In this paper we are concerned with the distribution and the moments of rn(m ) and
their asymptotic behavior in the case where m
derived for
7"n(m
= [2aff],0 < a < oo
are extended to other random
and n-oo. The results
trees, branching processes, the Bernoulli
excursion and the Brownian excursion.
2. AUXILIARY THEOREMS
Let us define the generating functions
E tn(J’m)zJ
(3)
E gn(z, m)wn/n!
(4)
gn (z’m) =
and
Gm(z w) =
forn>_landm>0. If
zl
<land
wl <l/e, then (4) is convergent-
Lemma 1: If w] <_ l/e, then the equation
y,- u = w
has exactly one root in the unit disk
Yl < 1
(5)
and
rXrn(n_ .r r)!
oo
yr
for
w
<lIe
and r
=[Y( W )] r =
rt
n
wn
(6)
= l, 9
By Rouch6’s theorem it follows that (5) has exactly one root in the unit
Proof:
disk Yl < 1 and we obtain (6) by Lagrange’s expansion. For r = 1 the expansion (6) was
already known to L. Euler [7].
On the Distribution of the Number of Vertices
177
Gin(z, w) = we Ginwhere
Go(z, w) = zy(w),
(6)
with r
= 1.
If we take into consideration that the degree of the root of a tree may be
Proof:
k
and y = y(w) is given by
= 0,1,2,..., then we obtain hat
Gm(z, w = W’4" W
E [Gm_ l(Z,w)]k/k! =
We
Gm- l(z’w)
(8)
k---1
for m
= 1,2,... and obviously
for
w
_.< 1/e.
appears also in A. Meir and J.W. Moon
(7)
Equation
[191 and
in A.M.
Odlyzko and H.S. Will [20].
3. TIIE MOMENTS OF
ra(m)
The following theorem has been found by V.E. Stepanov
[21].
In what follows
we
shall give a simple proof for it.
Theorem I:
If 0 < c < c,
2rn
lim
exists for r
= O, 1,2,
We
then
= pr(a)
ave po(a) = 1,pl( = 4ae- 22,
r-1
pr(a) = 2 + :r!a r
r
/ (1 + z)e
(10)
and
2a2(1 + X)2g
r_
1
(11)
(z)dz
o
for r >_ 2,
where
[z]
gr-1(x)
E (- llJ (r j 1)(z-j)r-2
(r- 2)!
(12)
3 =0
fort>2
and z
Proof:
>_ O.
Let us define
l (OrGm(z,w)
Sr(’m) = ".k
OZ r
oo
rn )}
)z----l: E E{(m)
n!
LAJOS TAK/CS
178
for r_>0,m>_0, and
Iwl _<l/e.
By forming the derivative of (7) with respect to z we obtain
=
Oz
for rn
for rn
for rn
>
>
1. Hence
Bl(w,m = Bo(w,m)Bl(W,m- 1)
(15)
B0(,-,) = Y()
(16)
1. Since
>_ 0,
by
(15) we obtain that
[y(w)] m + 1
B 1 (w, m)
for m
(17)
>_ 0, and thus by (6)
m+l
If r
(14)
Oz
_> 2, and m >_ 1, then the (r- 1)st derivative of (14) with respect to z at
r-1
r[Br(w,m
y(w)Br(w,m- 1)1 =
whence for the determination of
(18)
nm/
z
-
1 yields
Z (r- j)Bj(w,m)Br_j(w,m- 1),
Br(w,m),(r= 2,3,...),
(19)
we get the following recurrence
formula:
r-1
rBr(w’m)If r
2 in
(r-j) E
E
=
(20)
(20), then by (17)
:(,,)and thus by
[Y(w)]m-i-lBj (w’i+ 1)Br-j(w’i)"
1/2
[u()] ++
(:)
O<i<m
(6)
O<i<m
2n,-
If r- 3 in (20), then by (17)and (21)
(,m 2.2)!- (n- 2m L 2)*.
(22)
On the Distribution of the Ntonber of Vertices
179
B3(w,m) = 1
o<i<j<,
0_<i=j<m
and hence
E
o<_i<./<.,
By continuing this procedure
(r 1)!
Br(w m) = 2 r- 1
7n(m+i+j+2)+ 0<_i=j<m
E 7n(m+i+j+2).
we obtain that for r
o_<i 1 <i
E
2<...<i
(24)
>_ 2,
[Y(w)]m + il +’"’!"
r_l
r_
1
+r +...
(25)
<rn
where the neglected terms are constant multiples of sums similar to the one displayed, except
that in these sums il, i2,...,i r_l are not distinct; for at least one v = 2,...,r-1 we have
v 1 = it,. Formula (25) can be proved by mathematical induction. If we suppose that (25) is
B2(w,m),...,Br_(w,m where r =3,4,..., then
for Br(w, m) too. Accordingly, (25)is true for every r > 2.
true for
by (20)it follows that (25)is true
It is easy to prove that
17.(m)- me- m2/(2n) < 4/3
for 0
as
< m < n. If r = 1
---c
and m
(26)
= [2aft’if], then by (18) we obtain that
E{vn(m)} = 7n(m),-. 2aff’e- 22
(27)
lLnoo2E{rn(m)}/’i’a = 4ere- 2a2.
(28)
or
This proves
(10) for r = 1. If r > 2,m = [2aft’a], 0 < a < c and
(,., 1)!
0
<_
I
<
E
2
<... <
r
1
<m
n---cx),
then by (25)
7n (m + il +"" + ir- 1 + r- 1) +...
(29)
where the neglected terms are of smaller order than the displayed one. If r
0
< cr < c and n---,c, then
( (rnm) ) ),
(30)
lnimoo2rE{[rn(m)]r}/nr/2 = btr(cr)
(31)
E{[rn(m)]r) r!E
and by (26) and (29) we obtain that
LAJOS TAICS
180
exists and
(,) = (r-
+ z 1)e- 22( + xl +"" + =r- 1)2dZl...dzr_ 1 (32)
(1 + z t +
0<x 1
1
=
1
(1 + .r. t +
cr
o
for r
r_
<...<Xr_ 1 <1
+ r t) e 2a’2(:t
:t
dzt...dzr_ +.
o
2 r + lrtotr. We can write also that
>_ 2, where c r
r-1
/Zr( ) = 2 r + lr!cr
] (1 + z)e- 2a2(1 + X)2g
r
(33)
l(X)dar
o
>_ 2
where gr- 1(z) is the density function of
1 + 2 +-’. + r- 1
where
l,2,-’-,r- 1
are independent random variables each having a uniform distribution over the interval (0,1).
For the density function gr_l(a:), formula (12) has been found by P.S. Laplace [14], pp. 256257. For a simple proof of (12) see L. Wakcs [22].
for r
We note that
/:z(a) 4(e 2c’2
e
8c’2),
(34)
and
(35)
where
-u2/2du
(36)
is the normal distribution function.
4. THE ASYMPTOTIC DISTRIBUTION OF
The asymptotic distribution of
different form.
rn(m
rn(m )
has been found by V.E. Stepanov
[21]
in a
On the Distribution of the Number of Vertices
Theorem I:
If 0 < o < cx,
181
then
ff=p(’-([z’]) _< ) = G,()
.for x > 0 where G(x)
is the distribution
function of
(37)
a nonnegative random variable and is
given by
1
k3 1)
2j__IE k0(
G() = 1
for >_ 0
Ho(x),Hl(X),...
where
j
e -(x +
2cj)2/2( z)kH k + 2( -]- 2j)//I.
are the Hermite polynomials
Hn(z) = n!
defined by
Eo (-23J!(1)ix" 2j)!
(39)
n
j
(38)
We have
E (4c2J2
3-’1
Ga(0 = 1
2
1)e 2a;J2
(40)
and
d
Proof:
I + 2(
=2
j=l k=l
>_ 0,
it follows from
u2
<_ (2e)
1/2
(42)
< 1 [2
(11) that
(43)
,,(,)/,’! < (2,)"/,
_> 2. Accordingly,
Ga(x 0 for x < 0 and
for r
(41)
Since
ue
if u
+ 2aj)/(k 1)!
there exists one and only one distribution function
/ xrdGa(x)- i.tr(O)
Ga(x)
such that
(44)
-0
>_ 0. By the moment convergence theorem of M. Frchet and J. Shohat [8] it follows from
(I0) that (37) holds in every continuity point of Ga(x ). If Is < 1/(2a), then the Laplace-
for r
Stieltjes transform
a(s) =
f e-SXdGa(x)
-0
(45)
LAffOS
182
TAKCS
can be expressed as
a(s) =
(- 1)r/r(a)sr/r!.
(46)
r=0
By (11) we obtain that
(47)
for
Is < 1/(2ix).
Hence (38) and (41) follow by inversion.
5. VARIOUS EXTENSIONS
By using the same method which we used in proving Theorems 1 and 2 we can
demonstrate that the distribution function
Ga(m
appears also in the solutions of various other
Apparently, the interesting interrelation among these
problems in probability theory.
problems has not been noticed before, and
Ga(z
has appeared in various disguises. Here are
some examples.
(i)
Random trees. Denote by T n+l the set of distinct rooted ordered trees with
n + 1 unlabeled vertices. There are
1
Cn : (2nn) n+1
(48)
,
distinct trees in T n + 1" This follows from the obvious recurrence formula
Cn = i=1 Ci- Cn
for n
= 1,2,... where CO = 1. In (48) C n
(49)
is the nth Catalan number. Let us choose a tree at
random, assuming that all the possible C n trees are equally probable. Denote by r n + l(m) the
number of vertices at a distance m from the root of a tree chosen at random. If 0 <
then we have
{
+
}= Go( )
for x > 0.
Denote by Tn +2 the set of distinct planted trivalent trees with 2n +2 unlabeled
vertices. A planted tree is rooted at an end vertex. In a trivalent tree every vertex has degree
3 except the end vertices which have degree 1. In 1859, A. Cayley [2] demonstrated that there
On the Distbution
of the Number of Vertices
183
T, + 2 where C a is given by (48). Let us choose a tree at random in
Tn + 2 assuming that all the possible C, choices are equally probable. Denote by V2n + 2(m)
are C n distinct trees in
the number of vertices at a distance m from the root of a tree chosen at random.
0
< c < c,
then we have
+
lira P
for z
If
G()
(51)
> 0.
(ii) Branching processes. Let
us suppose that in a population initially we have a
progenitor and in each generation each individual reproduces, independently of the others, and
has probability p j, (j = O, 1,...), of giving rise to j descendants in the following generation.
Denote by (m), (m = 0,1,...), the number of individuals in the mth generation; (0)= 1.
Define
P=
E (m),
(52)
that is, p is the total number of individuals (total progeny) in the process (possibly p = c).
Let
and
(54)
gcd{j: pj > 0} = d.
If 1(1)= 1, f’(1)=l,
0 < a < c, then
tim
P
,’,-..-,oo
for z
> 0 where Ga(z)
If pj
f"(1)=2
{ o’
is defined by
= e-X/j!
where 0<a<cx,
f(r)(1)<cx
<z’IP = nd+l } Ga(z
r2= 1
and d = 1 and (55) reduces to (37). If
and d=l and (55)reduces to (50). If
1/2 j+l for j=0,1,2,..., then a2=2
2
P0 = P2 = 1/2 and pj = 0 otherwise, then r = 1 and d = 2 and (55) reduces
The limit distribution
(55)
(38).
for j = 0,1,2,..., then
pj=
for r>_2, and
to
(51).
(55) has already been determined by D.P. Kennedy [12] in
different form. By his results we can conclude that
a
LAJOS TAKCS
184
-
o < ,, < /(,)
o < < I(4. )
for a:
> 0 and
-,u-
for
wv
f(u, v)dudv = fsinh(.
Re(s) > 0 and Re(w) > 0.
(iii) Bernoulli
4a2v))(1 -4a2v)-3/2uf(u,v)dudv
a2u2/(2(
=
2q.! +
2
(56)
(57)
"
excursion. Let us arrange n white balls and n black balls in a row in
such a way that for every
= 1,2,...,2n among the
balls there are at least as many
first
white balls as black. The total number of such arrangements is given by the nth Catalan
number
Cn,
defined by
(48). Let
us suppose that all the possible
C n sequences
are equally
probable and choose a sequence at random. We associate a random walk with the random
sequence chosen by assuming that a particle starts at time t
in the time interval
=0
at the origin of the z-axis and
(i- 1,i], i = 1,2,..., 2n, it moves with a unit velocity to the right or to the
left according to whether the ith ball in the row is white or black respectively. Denote by
z=r/n+(t) the position of the particle at time 2nt where 0_<t< 1. The process
{r/+n(t),0 <t_< 1} is called a Bernoulli excursion. Denote by 2r+n(m)(m= 1,2,...,n) the
number of crossings of the sample function of the process {r/+n (t),0 < t < 1} through the line
+n
= m-1/2. In other words, r (m)/n is the total
the process {r/+n (t), 0 < < 1}. If 0 < a <
then
z
lira P
for z > O. Since r
,
time spent in the interval
{ 2r +n ([c2]) < }
z
G(z)
+n (m) has exactly the same distribution as
r n + :(m) in
(m-1,m) by
(58)
(50), the two results,
(50) and (58), imply each other.
(iv) Brownian excursion. The process {r/n+ (t)/q-, 0 <_ <_ 1}, where r/+n (t) is denned
under (iii), converges weakly to the Brownian excursion {r/+ (t),0 < t _< 1}. For the definition
and properties of the Brownian excursion we refer to P. Lvy [15], [16], K. It5 and H.P.
McKean, Jr. [11] and K.L. Chung [4]. For the process {r/+(t),0 _< t _< 1} define r+() as the
local time at the level a for a >_ 0. From (58) we can conclude that
+
for z
> 0,
and also
<_
=
(59)
On the Distribution
of the Number of Vertices
185
(60)
for r = O, 1,2,... where pr(a) is defined by
The distribution function
(10).
(59) has attracted considerable
interest. In the articles by
R.K. Getoor and M.J. Sharpe [9], J.W. Cohen and G. Hooghiemstra [5], G. Louchard [17], [18],
E. Cshki and S.G. Mohanty [6], and Ph. Blanc and M. Yor [1], P{r + (a) _< x} is expressed in
the form of a complex integral.
P(v+(a) _<x}
F.B. Knight [13] and G. Hooghiemstra [10] expressed
in explicit forms, but their formulas are hardly suitable for numerical
calculations. We can easily produce tables and graphs for
(38) and
Ga(z) and G’a(z) by using formulas
(41) and the remarkable program MATHEMATICA by S. Wolfram [24].
REFERENCES
[1]
Ph. Biane and M. Yor, "Valeurs principales associes aux temps locaux
browniens’, Bulletin des Sciences Mathmatiques 111, (1987), pp. 23-101.
[e]
A. Cayley, "On the analytic forms called trees, Second part", Philosophical Magazine
18, (1859), pp. 374-378. [The Collecled Mathematical Papers of Arthur Cayley, Vol. IV,
Cambridge University Press, (1891), pp. 112-115.]
[3]
A. Cayley, "A theorem on trees", Quarterly Journal of Pure and Applied Mathematics
2, (1889), pp. 376-378. [The Collected Mathematical Papers of Arthur Cayley, Vol.
XIII, Cambridge University Press, (1897), pp. 26-28.]
[4]
K.L. Chung, "Excursions in Brownian motion", Arkiv fldr Matematik 14, (1976), pp.
157-179.
J.W. Cohen and G. Hooghiemstra, "Brownian excursion, the M/M/1 queue and their
occupation times", Mathematics
of Operations Research 6, (1981), pp. 608-629.
E. Cski and S.G. Mohanty, "Some joint distributions for conditional random walks",
The Canadian Journal
of Statistics 14, (1986),
pp. 19-28.
[7]
L. Euler, "De serie Lambertina plurimisque eius insignibus proprietatibus", A cta
Academiae Scientiarum Petropolitanae (1779), 2, (1783), pp. 29-51. [Leonhardi Euleri
Opera Omnia. Set. I. Vol. 6, B.G. Teubner, Leipzig, (1921), pp. 350-369].
[8]
M. Frchet and J. Shohat, "A proof of the generalized second-limit theorem in the
theory of probability", Transactions
pp. 533-543.
of the
American Mathematical Society 33,
(1931),
[9]
R.K. Getoor and M.J. Sharpe, "Excursions of Brownian motion and Bessel processes",
Zeitschrifl flit Wahrscheinlichkeitslheorie und verwandte Gebiete 47, (1979), pp. 83-106.
[0]
G. Hooghiemstra, "On the explicit form of the density of Brownian excursion local
time", Proceedings
of the
American Mathematical Society 84,
(1982), pp. 127-130.
LAJOS TAKACS
186
[11]
K. It6 and H.P. McKean, Jr., "Diffusion Processes and their Sample Paths", SpringerVerlag, Berlin, (1965).
[12]
D.P. Kennedy, "The Galton-Watson process conditioned
on total
progeny", Journal of
Applied Probability 12, (1975), pp. 800-806.
[13]
F.B. Knight, "On the excursion process of Brownian motion", Transactions of the
American Mathematical Society 258, (1980), pp. 77-86. [Correction: Zentralblatt fiir
Mathematik 426, (1980) #60073.]
[14]
P.S. Laplace, "Thorie analytique des probabilits’, Paris, (1812). [Oeuvres VIi,
Gauthier-Villars, Paris, (1886), Culture et Civilisation, Bruxelles, (1967).]
[15]
P. Lvy, "Sur certains processus stochastiques homognes’, Compositio Mathematica 7,
(1940), pp. 283-339.
[16]
P. Lvy, "Processus Stochastiques et Mouvement Brownien’, Second edition, GauthierVillars, Paris, (1965).
[17]
G. Louchard, "Kac’s formula, Lvy’s local time and Brownian excursion", Journal of
Applied Probability 21, (1984), pp. 479-499.
[18]
G. Louchard, "Brownian motion and algorithm complexity", BIT 26, (1986), pp. 17-34.
[19]
A. Meir and J.W. Moon, "On the altitude of nodes in random trees", Canadian
of Mathematics 30,
journal
(1978), pp. 997-1015.
[20]
A.M. Odlyzko and H.S. Wilf, "Bandwidths and profiles of trees", Journal of
Combinatorial Theory Ser. B 42, (1987), pp. 348-370.
[21]
V.E. Stepanov, "On the distribution of the number of vertices in strata of
tree", Theory of Probability and
its Applications 14,
(1969),
a random
pp. 65-78.
[22]
L. Takcs, "On the method of inclusion and exclusion", Journal of the American
Statistical Association 62, (1967), pp. 102-113.
[23]
L. Takhcs, "On Cayley’s formula for counting forests", Journal of Combinatorial Theory
Ser. A 53, (1990), pp. 321-323.
[24]
S. Wolfram, "Mathematica. A System for Doing Mathematics by Computer", AddisonWesley, Redwood City, California, (1988).
Download