Moments of level numbers of leaves in binary trees


Journal of Statistical Planning and Inference 101 (2002) 267–279

www.elsevier.com/locate/jspi

Moments of level numbers of leaves in binary trees⋆

Alois Panholzer^{a,∗}, Helmut Prodinger^{b,1}

^a Institut für Algebra und Computermathematik, Technical University of Vienna, Wiedner Hauptstrasse 8–10, A-1040 Vienna, Austria

^b School of Mathematics, The John Knopfmacher Centre for Applicable Analysis and Number Theory, University of the Witwatersrand, P.O. Wits, 2050 Johannesburg, South Africa

Received 20 November 1998; received in revised form 1 April 1999

Abstract

For the random variable “height of leaf j in a binary tree of size n” we derive closed formulæ for all moments. The more general case of t-ary trees is also considered. © 2002 Elsevier Science B.V. All rights reserved.

MSC: 05C05; 05A15; 60C05

Keywords: Binary tree; Height; Asymptotic distribution; Moments

1. Introduction

We reexamine the heights of leaves (enumerated from left to right) in binary trees and, more generally, t-ary trees.

The expectations go back to Kirschenhofer (1983a), and the variance to Gutjahr (1992). For higher moments, Gutjahr had to resort to analytic techniques. However, we are able to produce an explicit formula for the sth moment, for general s. The result is “closed form”, as it comprises only a number of summands that depends on s, not on n or j, where n denotes the size of the tree and j the number of the leaf under consideration.

From this explicit result, asymptotic formulæ, as given by Gutjahr, are of course corollaries for which we do not need more than Stirling’s approximation for the factorials.

In a follow-up paper to Kirschenhofer (1983a), Kirschenhofer (1985) also considered t-ary trees (binary trees are the instance t = 2). He treated the expectation with a complicated analytic technique (“diagonalization method”) because he believed that explicit enumeration is not feasible. However, we disprove this claim by giving an exact formula for the sth (factorial) moment. This time, it is not “closed form”, but we can get asymptotics for the expectations by rather elementary techniques (Stirling’s formula and approximation of a sum by an integral).

⋆ This research was supported by the Austrian Research Society (FWF) under the project number P12599-MAT.

∗ Corresponding author. E-mail addresses: [email protected] (A. Panholzer), [email protected] (H. Prodinger).

1 http://www.wits.ac.za/helmut/index.htm

0378-3758/02/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0378-3758(01)00187-2

Although it was possible in these cases to obtain exact formulæ, it is clear that this approach relies heavily on the “nice structure” of the trees. So this approach will not be applicable to more general families of trees. At this point it must be noted that with powerful analytical methods Drmota (1994) was able to extend the results for the limiting distribution to the large family of simply generated trees. Gittenberger (1995), in his PhD thesis, finally showed that the so-called “contour process” in simply generated families of trees converges to the Brownian excursion.

2. The height of leaves in binary trees

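We denote by $\mathcal{B}$ the family of binary trees, sometimes called extended binary trees, which is defined by the formal identity

$$\mathcal{B} = \square + \bigcirc \times \mathcal{B} \times \mathcal{B},$$

where $\bigcirc$ is the symbol for an internal node and $\square$ is the symbol for a leaf or external node.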

The family of objects from $\mathcal{B}$ with exactly $n$ internal nodes is denoted by $\mathcal{B}_n$. Further we write $B_n$ for the number of elements of $\mathcal{B}_n$. Finally we denote by $B(z) = \sum_{n \ge 0} B_n z^n$ the ordinary generating function of the $B_n$'s.

Let us state the following classic results that we will need in the sequel:

$$B(z) = 1 + zB^2(z) = \frac{1 - \sqrt{1 - 4z}}{2z}, \qquad B_n = \frac{1}{n+1}\binom{2n}{n}.$$

Assume that the leaves are labelled with the numbers $0, 1, \ldots, n$ from left to right and that the distance (depth, height) of a leaf is given by the number of internal nodes on the path from the root to the leaf. This defines random variables $X_{n,j}$, assuming that all binary trees of size $n$ are equally likely.

Kirschenhofer (1983a,b) has obtained the following explicit formula for the expectations (see Panholzer and Prodinger, 1997 for more results):

$$E(X_{n,j}) = \frac{2(2j+1)(2n-2j+1)}{n+2}\,\frac{\binom{2j}{j}\binom{2n-2j}{n-j}}{\binom{2n}{n}} - 1 \qquad \text{for } 0 \le j \le n.$$
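The expectation formula can be checked against a brute-force enumeration for small $n$. The following Python sketch is only an illustration and is not part of the original derivation; the function names are hypothetical. It represents each binary tree by the tuple of its leaf depths (read from left to right) and compares the exact average depth of leaf $j$ with the closed form.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def leaf_depth_profiles(n):
    """All binary trees with n internal nodes, each given as the tuple of
    its n + 1 leaf depths read from left to right."""
    if n == 0:
        return ((0,),)  # a single leaf at depth 0
    profiles = []
    for k in range(n):  # k internal nodes go into the left subtree
        for left in leaf_depth_profiles(k):
            for right in leaf_depth_profiles(n - 1 - k):
                # the new root pushes every leaf one level down
                profiles.append(tuple(d + 1 for d in left + right))
    return tuple(profiles)


def mean_depth_bruteforce(n, j):
    """Exact average depth of leaf j over all trees of size n."""
    trees = leaf_depth_profiles(n)
    return Fraction(sum(t[j] for t in trees), len(trees))


def mean_depth_formula(n, j):
    """Closed form for E(X_{n,j}) as stated in the text."""
    num = (2 * (2 * j + 1) * (2 * n - 2 * j + 1)
           * comb(2 * j, j) * comb(2 * n - 2 * j, n - j))
    return Fraction(num, (n + 2) * comb(2 * n, n)) - 1


if __name__ == "__main__":
    for j in range(4):
        print(j, mean_depth_bruteforce(3, j), mean_depth_formula(3, j))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison does not depend on floating-point rounding.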

Gutjahr obtained the variance $V(X_{n,j})$ in Gutjahr (1992); unfortunately, this paper was overlooked during the preparation of our paper (Panholzer and Prodinger, 1997).

Studying the higher moments, Gutjahr writes:

“There is little hope to find a closed expression for the third moment”.

This statement is too pessimistic, since we will find explicit formulæ for all (factorial) moments in the sequel. Since the usual moments can be obtained from them by linear combinations with Stirling numbers as coefficients, we also have explicit formulæ for all usual moments.

Let $f_{n,j,l}$ denote the number of binary trees of size $n$ in which the leaf labelled $j$ has height $l$. Then we get for the trivariate generating function $F(z,u,v)$ of the $f_{n,j,l}$, defined by $\sum_{n,j,l} f_{n,j,l}\, z^n u^j v^l$, the functional equation

$$F(z,u,v) = 1 + zvF(z,u,v)B(z) + zuvF(z,u,v)B(zu). \tag{2}$$

This leads immediately to the solution

$$F(z,u,v) = \frac{1}{1 - zvB(z) - zuvB(zu)}. \tag{3}$$

This appears in Panholzer and Prodinger (1997) and Gutjahr (1992). The sth factorial moment $E_s(X_{n,j})$ is given by

$$E_s(X_{n,j}) = \frac{1}{B_n}\,[z^n u^j]\left.\frac{\partial^s}{\partial v^s}F(z,u,v)\right|_{v=1}.$$

Now, a routine calculation (that is easy to verify by induction) shows that

$$\left.\frac{\partial^s}{\partial v^s}F(z,u,v)\right|_{v=1} = 2\,s!\,\frac{(2-(X+Y))^s}{(X+Y)^{s+1}} = 2\,s!\sum_{l=0}^{s}\binom{s}{l}(-1)^{s-l}\,2^l\,\frac{1}{(X+Y)^{l+1}},$$

with

$$X = \sqrt{1-4z} \quad\text{and}\quad Y = \sqrt{1-4zu}.$$

So our task is reduced to finding explicit formulæ for

$$[z^n u^j]\frac{1}{(X+Y)^q}.$$

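First, we compute the coefficients of $1/(1+\sqrt{1-4x})^q$; for that, we use the substitution $x = t/(1+t)^2$ and the Lagrange inversion formula:

$$\begin{aligned}
[x^k]\frac{1}{(1+\sqrt{1-4x})^q} &= \frac{1}{k}\,[t^{k-1}]\,(1+t)^{2k}\,\frac{d}{dt}\frac{1}{\bigl(1+\sqrt{1-4t/(1+t)^2}\bigr)^q}\\
&= \frac{1}{k}\,[t^{k-1}]\,(1+t)^{2k}\,\frac{d}{dt}\left(\frac{1+t}{2}\right)^q\\
&= \frac{q}{2^q}\,\frac{1}{k+q}\binom{2k+q-1}{k}.
\end{aligned}$$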
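As a sanity check of this coefficient formula, one can expand the series directly. The sketch below is illustrative only (function names are hypothetical); it relies on the elementary identity $1/(1+\sqrt{1-4x}) = (1-\sqrt{1-4x})/(4x) = C(x)/2$, where $C(x)$ is the Catalan generating function, raises the truncated series to the $q$th power and compares the result with the closed form.

```python
from fractions import Fraction
from math import comb


def catalan_series(K):
    """Coefficients C_0..C_K of the Catalan generating function C(x)."""
    return [Fraction(comb(2 * k, k), k + 1) for k in range(K + 1)]


def series_mul(a, b):
    """Product of two power series truncated at the same order."""
    K = len(a) - 1
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(K + 1)]


def coeffs_by_expansion(q, K):
    """First K + 1 coefficients of 1/(1 + sqrt(1 - 4x))^q = (C(x)/2)^q."""
    base = [c / 2 for c in catalan_series(K)]
    out = [Fraction(1)] + [Fraction(0)] * K
    for _ in range(q):
        out = series_mul(out, base)
    return out


def coeff_closed_form(q, k):
    """q / 2^q * 1/(k + q) * binom(2k + q - 1, k), as derived in the text."""
    return Fraction(q * comb(2 * k + q - 1, k), 2 ** q * (k + q))


if __name__ == "__main__":
    print(coeffs_by_expansion(3, 5))
    print([coeff_closed_form(3, k) for k in range(6)])
```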

Hence

$$\begin{aligned}
\frac{1}{(X+Y)^q} &= \frac{1}{(1-4z)^{q/2}\left(1+\sqrt{1-\dfrac{4z(u-1)}{1-4z}}\right)^{q}}\\
&= \sum_{k\ge 0}\frac{q}{2^q}\,\frac{1}{k+q}\binom{2k+q-1}{k}\,\frac{z^k(u-1)^k}{(1-4z)^{k+q/2}}.
\end{aligned}$$

Thus

$$[z^n u^j]\frac{1}{(X+Y)^q} = \sum_{j\le k\le n}(-1)^{k-j}\binom{k}{j}\,\frac{q}{2^q}\,\frac{1}{k+q}\binom{2k+q-1}{k}\,4^{n-k}\binom{n-1+q/2}{n-k}.$$

To evaluate the remaining sum we use Zeilberger’s algorithm (Graham et al., 1994; Petkovšek et al., 1996). (Alternative proofs, e.g. by the use of hypergeometric series, would also be possible.) Let

$$F(n,k) = (-1)^{k-j}\binom{k}{j}\,\frac{q}{2^q}\,\frac{1}{k+q}\binom{2k+q-1}{k}\,4^{n-k}\binom{n-1+q/2}{n-k};$$

then we find

$$(n-j+1)(n+q+1)\,F(n+1,k) - (2n+q)(2n+q-2j+1)\,F(n,k) = G(n,k+1) - G(n,k)$$

with

$$R(n,k) = -\frac{2(2n+q)(k+q)(k-j)}{n+1-k} \quad\text{and}\quad G(n,k) = F(n,k)\,R(n,k).$$

Now set

$$f(n) = \sum_{k\in\mathbb{Z}}F(n,k);$$

then we obtain by summing over $k$ (since the right side telescopes) and replacing $n$ by $n-1$

$$(n-j)(n+q)\,f(n) = (2n+q-2)(2n+q-2j-1)\,f(n-1),$$

with the initial value

$$f(j) = \frac{q}{2^q}\,\frac{1}{j+q}\binom{2j+q-1}{j},$$

which gives by iteration

$$\begin{aligned}
f(n) &= \frac{q}{2^q}\,\frac{1}{j+q}\binom{2j+q-1}{j}\prod_{k=j+1}^{n}\frac{(2k+q-2)(2k+q-2j-1)}{(k-j)(k+q)}\\
&= \frac{q}{2^q}\,\frac{1}{j+q}\binom{2j+q-1}{j}\,\frac{(j+q)!}{(n-j)!\,(n+q)!}\prod_{k=j+1}^{n}(2k+q-2)(2k+q-2j-1)\\
&= q\,2^{2n-2j-q}\,\frac{(2j+q-1)!}{j!\,(n-j)!\,(n+q)!}\,\frac{\Gamma(n+q/2)\,\Gamma(n-j+(q+1)/2)}{\Gamma(j+q/2)\,\Gamma((q+1)/2)}.
\end{aligned}$$

Of course, these $\Gamma$-functions can always be written in terms of factorials (or binomial coefficients), by distinguishing the cases $q$ even and odd. Summarizing,

$$[z^n u^j]\frac{1}{(X+Y)^q} = q\,2^{2n-2j-q}\,\frac{(2j+q-1)!}{j!\,(n-j)!\,(n+q)!}\,\frac{\Gamma(n+q/2)\,\Gamma(n-j+(q+1)/2)}{\Gamma(j+q/2)\,\Gamma((q+1)/2)}$$

or, citing the cases $q$ even and odd explicitly,

$$[z^n u^j]\frac{1}{(X+Y)^{2p}} = \frac{2p\,(p-1)!\,(j+2p)!\,(2n-2j+2p-1)!\,(2j+2p-1)!\,(n+p-1)!}{2^{2p}\,(2p-1)!\,(j+p-1)!\,(n-j)!\,(n-j+p-1)!\,j!\,(j+2p)!\,(n+2p)!},$$

$$[z^n u^j]\frac{1}{(X+Y)^{2p+1}} = \frac{(2p+1)\,(j+p)!\,(2n+2p-1)!\,(n-j+p)!}{2^{2p}\,p!\,j!\,(n+p-1)!\,(n-j)!\,(n+2p+1)!}.$$
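The evaluation of the alternating sum can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the authors' computation: it evaluates the sum over $j \le k \le n$ exactly with rational arithmetic (the helper `gen_binom`, a hypothetical name, implements the generalized binomial coefficient with a half-integer upper argument), evaluates the $\Gamma$-function closed form in floating point with `math.gamma`, and also lets one test the first-order recurrence for $f(n)$.

```python
from fractions import Fraction
from math import comb, factorial, gamma, isclose


def gen_binom(a, m):
    """Generalized binomial coefficient binom(a, m), rational a, integer m >= 0."""
    out = Fraction(1)
    for i in range(m):
        out = out * (a - i) / (i + 1)
    return out


def coeff_by_sum(n, j, q):
    """[z^n u^j] 1/(X+Y)^q via the alternating sum over j <= k <= n (exact)."""
    a = Fraction(2 * n - 2 + q, 2)  # upper argument n - 1 + q/2
    total = Fraction(0)
    for k in range(j, n + 1):
        lagrange = Fraction(q * comb(2 * k + q - 1, k), 2 ** q * (k + q))
        total += ((-1) ** (k - j) * comb(k, j) * lagrange
                  * 4 ** (n - k) * gen_binom(a, n - k))
    return total


def coeff_closed_form(n, j, q):
    """The same coefficient via the Gamma-function closed form (floating point)."""
    pref = q * 2.0 ** (2 * n - 2 * j - q) * factorial(2 * j + q - 1)
    pref /= factorial(j) * factorial(n - j) * factorial(n + q)
    return pref * gamma(n + q / 2) * gamma(n - j + (q + 1) / 2) / (
        gamma(j + q / 2) * gamma((q + 1) / 2))


if __name__ == "__main__":
    n, j, q = 5, 2, 3
    print(float(coeff_by_sum(n, j, q)), coeff_closed_form(n, j, q))
```

The sum is kept exact so that the recurrence $(n-j)(n+q)f(n) = (2n+q-2)(2n+q-2j-1)f(n-1)$ can be tested with equality rather than a tolerance.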

Hence, we have the following theorem.

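Theorem 1. The sth factorial moment $E(X_{n,j}^{\underline{s}})$, where $X^{\underline{s}} = X(X-1)\cdots(X-s+1)$, is given by the following explicit formula:

$$E(X_{n,j}^{\underline{s}}) = \frac{s!}{B_n}\sum_{l=0}^{s}\binom{s}{l}(-1)^{s-l}\,4^{n-j}\,\frac{(l+1)\,(2j+l)!}{j!\,(n-j)!\,(n+l+1)!}\,\frac{\Gamma(n+(l+1)/2)\,\Gamma(n+1-j+l/2)}{\Gamma(j+(l+1)/2)\,\Gamma(1+l/2)}. \tag{4}$$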
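Formula (4) is easy to evaluate by machine. The sketch below is illustrative only, with hypothetical function names; it evaluates the right-hand side of (4) in floating point via `math.gamma` and compares it with factorial moments obtained by enumerating all binary trees of a small size as tuples of leaf depths.

```python
from functools import lru_cache
from math import comb, factorial, gamma


def factorial_moment_formula(n, j, s):
    """Right-hand side of (4), evaluated in floating point."""
    catalan = comb(2 * n, n) // (n + 1)  # B_n
    total = 0.0
    for l in range(s + 1):
        term = (comb(s, l) * (-1) ** (s - l) * 4 ** (n - j) * (l + 1)
                * factorial(2 * j + l)
                / (factorial(j) * factorial(n - j) * factorial(n + l + 1)))
        term *= gamma(n + (l + 1) / 2) * gamma(n + 1 - j + l / 2)
        term /= gamma(j + (l + 1) / 2) * gamma(1 + l / 2)
        total += term
    return factorial(s) * total / catalan


@lru_cache(maxsize=None)
def leaf_depth_profiles(n):
    """All binary trees with n internal nodes as tuples of leaf depths."""
    if n == 0:
        return ((0,),)
    return tuple(tuple(d + 1 for d in left + right)
                 for k in range(n)
                 for left in leaf_depth_profiles(k)
                 for right in leaf_depth_profiles(n - 1 - k))


def factorial_moment_bruteforce(n, j, s):
    """Average of X(X-1)...(X-s+1) for the depth X of leaf j."""
    trees = leaf_depth_profiles(n)
    total = 0
    for t in trees:
        falling = 1
        for i in range(s):
            falling *= t[j] - i
        total += falling
    return total / len(trees)


if __name__ == "__main__":
    print(factorial_moment_formula(5, 2, 2), factorial_moment_bruteforce(5, 2, 2))
```

Because (4) is an alternating sum, a floating-point evaluation loses accuracy for large $n$ and $s$; for such parameters an exact evaluation through the even/odd factorial forms would be the safer choice.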

With Stirling’s asymptotic formula for the factorials, we get for a fixed ratio $\rho = j/n$

$$\frac{(l+1)\,(2j+l)!}{j!\,(n-j)!\,(n+l+1)!}\,\frac{\Gamma(n+(l+1)/2)\,\Gamma(n+1-j+l/2)}{\Gamma(j+(l+1)/2)\,\Gamma(1+l/2)} = 4^{\rho n}\,n^{(l-3)/2}\,\frac{(1-\rho)^{l/2}\rho^{l/2}\,2^l\,(l+1)}{\sqrt{\pi}\,\Gamma(1+l/2)}\left(1+O\!\left(\frac{1}{n}\right)\right).$$

From this we see that the summand with $l = s$ gives the main term in the asymptotic expansion of the factorial moments. Using the duplication formula of the Gamma function we get

$$\begin{aligned}
E(X_{n,j}^{\underline{s}}) &= \frac{(s+1)!\,(1-\rho)^{s/2}\rho^{s/2}\,2^s}{\Gamma(1+s/2)}\,n^{s/2} + O(n^{(s-1)/2})\\
&= \frac{2}{\sqrt{\pi}}\,4^s\rho^{s/2}(1-\rho)^{s/2}\,\Gamma\!\left(\frac{s+3}{2}\right)n^{s/2} + O(n^{(s-1)/2}).
\end{aligned}$$

Thus, we have the following corollary.

Corollary 2. The sth factorial moment $E(X_{n,j}^{\underline{s}})$ (and thus also the ordinary sth moment $E(X_{n,j}^{s})$) is asymptotically for $n \to \infty$ given by

$$E(X_{n,j}^{\underline{s}}) = \frac{2}{\sqrt{\pi}}\,4^s\rho^{s/2}(1-\rho)^{s/2}\,\Gamma\!\left(\frac{s+3}{2}\right)n^{s/2} + O(n^{(s-1)/2}).$$

Together with the theorem of Fréchet and Shohat (second Central Limit Theorem) (Fisz, 1963) and the fact that the Maxwell distribution is uniquely determined by its moments, we get the following result, which was first proved in Gutjahr and Pflug (1992).

Corollary 3. The limiting distribution of the normalized height $X_{n,j}/\sqrt{n}$ is, for a fixed ratio $\rho = j/n$ with $0 < \rho < 1$, a Maxwell distribution with parameter $\sigma = \sqrt{8\rho(1-\rho)}$. That means it has for $x \ge 0$ the density function

$$f_\rho(x) = \frac{x^2}{16\sqrt{\pi}\,(\rho(1-\rho))^{3/2}}\,e^{-x^2/(16\rho(1-\rho))}. \tag{5}$$

For the reader’s convenience, we mention that the Maxwell distribution with parameter $\sigma$ is defined as the distribution of $Y = \sqrt{X_1^2 + X_2^2 + X_3^2}$, where the $X_i$ are independent normally distributed random variables $N(0,\sigma^2)$ with mean 0 and variance $\sigma^2$. It has the following density function $f(x)$ and moments $M_s = E(Y^s)$:

$$f(x) = \frac{\sqrt{2}\,x^2}{\sqrt{\pi}\,\sigma^3}\,e^{-x^2/(2\sigma^2)} \quad\text{for } x \ge 0, \qquad M_s = \frac{2}{\sqrt{\pi}}\,2^{s/2}\sigma^s\,\Gamma\!\left(\frac{s+3}{2}\right).$$
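The moment formula for the Maxwell distribution, and its agreement with the constants of Corollary 2 when $\sigma = \sqrt{8\rho(1-\rho)}$, can be checked numerically. The following sketch is illustrative only; the function names, the midpoint rule and the truncation of the integral at $12\sigma$ are choices made here, not taken from the paper.

```python
from math import exp, gamma, pi, sqrt


def maxwell_density(x, sigma):
    """Density of the Maxwell distribution with parameter sigma, for x >= 0."""
    return sqrt(2 / pi) * x ** 2 / sigma ** 3 * exp(-x ** 2 / (2 * sigma ** 2))


def maxwell_moment_closed(s, sigma):
    """M_s = 2/sqrt(pi) * 2^(s/2) * sigma^s * Gamma((s + 3)/2)."""
    return 2 / sqrt(pi) * 2 ** (s / 2) * sigma ** s * gamma((s + 3) / 2)


def maxwell_moment_numeric(s, sigma, steps=20000, upper=12.0):
    """Midpoint-rule approximation of the s-th moment on [0, upper * sigma]."""
    h = upper * sigma / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** s * maxwell_density(x, sigma)
    return total * h


def corollary2_constant(s, rho):
    """Coefficient of n^(s/2) in Corollary 2."""
    return 2 / sqrt(pi) * 4 ** s * (rho * (1 - rho)) ** (s / 2) * gamma((s + 3) / 2)


if __name__ == "__main__":
    rho = 0.3
    sigma = sqrt(8 * rho * (1 - rho))
    for s in range(1, 4):
        print(s, maxwell_moment_closed(s, sigma), corollary2_constant(s, rho))
```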

3. t-ary trees

Let us consider t-ary trees for general $t \ge 2$. In this instance, we can also report some progress, although the results are not so nice. The random variable $X_{n,j}$ is again defined as the height of the leaf $j$ in a t-ary tree of size $n$, where the leaves are labelled with the numbers $0, 1, \ldots, (t-1)n$ from left to right and all trees are equally likely.

Recall that the generating function of t-ary trees satisfies $T(z) = 1 + zT^t(z)$, and that

$$T_n := [z^n]T(z) = \frac{1}{(t-1)n+1}\binom{tn}{n} \quad\text{and}\quad T_n^{(p)} := [z^n]T^p(z) = \frac{p}{(t-1)n+p}\binom{tn+p-1}{n},$$

which is classical and easy to see by means of Lagrange’s inversion formula: with $T(z) = 1 + U(z)$ we get

$$[z^n]T^p(z) = [z^n](1+U(z))^p = \frac{p}{n}\,[U^{n-1}]\,(1+U)^{tn+p-1} = \frac{p}{(t-1)n+p}\binom{tn+p-1}{n}.$$
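The closed forms for $T_n$ and $T_n^{(p)}$ can be confirmed by expanding $T(z)$ directly from its functional equation. The sketch below is only an illustration with hypothetical function names; it solves $T = 1 + zT^t$ by fixed-point iteration on truncated integer power series (each pass fixes one more coefficient) and compares the coefficients of $T^p$ with the formula.

```python
from math import comb


def series_mul(a, b, N):
    """Product of two power series (lists of length N + 1), truncated at z^N."""
    out = [0] * (N + 1)
    for i in range(N + 1):
        for k in range(N + 1 - i):
            out[i + k] += a[i] * b[k]
    return out


def series_pow(a, p, N):
    """p-th power of a power series, truncated at z^N."""
    out = [1] + [0] * N
    for _ in range(p):
        out = series_mul(out, a, N)
    return out


def tary_series(t, N):
    """Coefficients T_0..T_N of T(z) = 1 + z T(z)^t by fixed-point iteration."""
    T = [1] + [0] * N
    for _ in range(N):
        T = [1] + series_pow(T, t, N)[:N]  # multiply T^t by z, then add 1
    return T


def power_coeff_closed_form(t, p, n):
    """T_n^{(p)} = p / ((t-1)n + p) * binom(tn + p - 1, n); always an integer."""
    return p * comb(t * n + p - 1, n) // ((t - 1) * n + p)


if __name__ == "__main__":
    print(tary_series(3, 6))
    print([power_coeff_closed_form(3, 1, n) for n in range(7)])
```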

The trivariate generating function

$$F(z,u,v) = \sum_{n,j,l}\,[\text{Number of trees of size } n \text{ with leaf } j \text{ on level } l]\;z^n u^j v^l$$

is now given by

$$F(z,u,v) = \frac{1}{1 - zv\sum_{k=0}^{t-1}u^k\,T^{t-1-k}(z)\,T^{k}(zu^{t-1})}.$$

Let us define the following abbreviations: $X = T(z)$ and $Y = T(zu^{t-1})$ (in the previous section $X$ and $Y$ had a slightly different meaning). Now note that

$$z\sum_{k=0}^{t-1}u^k X^{t-1-k}Y^k = z\,\frac{X^t - u^tY^t}{X - uY} = \frac{X - uY - (1-u)}{X - uY} = 1 - \frac{1-u}{X-uY},$$

so that we have

$$F(z,u,v) = \frac{1}{1 - v\bigl(1 - (1-u)/(X-uY)\bigr)}. \tag{6}$$

3.1. The expectation

At first we want to find an explicit expression for the expectations $E(X_{n,j})$. Here Kirschenhofer (1985) writes:

“For general $t \ge 2$, the method, that led to an explicit expression for $t = 2$, is not feasible”.

Again, this statement is too pessimistic; in the following we will find an explicit formula for the expectations, from which it is also possible to get an asymptotic result without the laborious “diagonalization method”, by only using very elementary techniques.

Introducing

$$G_1(z,u) = \left.\frac{\partial}{\partial v}F(z,u,v)\right|_{v=1}$$

we get by differentiating (6) w.r.t. $v$ and evaluating at $v = 1$ the following result, which appeared in Kirschenhofer (1983a):

$$G_1(z,u) = \left(\frac{X-uY}{1-u}\right)^2 - \frac{X-uY}{1-u}.$$

Extracting the coefficients for $0 \le j \le (t-1)n$ leads immediately to

$$[z^nu^j]\frac{X-uY}{1-u} = T_n \quad\text{and}\quad [z^nu^j]\left(\frac{X-uY}{1-u}\right)^2 = [z^nu^j]\frac{X^2}{(1-u)^2} - 2\,[z^nu^j]\frac{XuY}{(1-u)^2}.$$

For $0 \le j \le (t-1)n$ we get then

$$[z^nu^j]\frac{X^2}{(1-u)^2} = \frac{2(j+1)}{(t-1)n+2}\binom{tn+1}{n}$$

and

$$[z^nu^j]\frac{XuY}{(1-u)^2} = \sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)k+1}\binom{tk}{k}\,\frac{1}{(t-1)(n-k)+1}\binom{t(n-k)}{n-k}\,\bigl(j-(t-1)k\bigr).$$

Combining these results leads to

$$[z^nu^j]\left(\frac{X-uY}{1-u}\right)^2 = \frac{2(j+1)}{(t-1)n+2}\binom{tn+1}{n} - 2\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{j-(t-1)k}{((t-1)k+1)((t-1)(n-k)+1)}\binom{tk}{k}\binom{t(n-k)}{n-k}. \tag{7}$$

To get asymptotic equivalents it is helpful to manipulate this expression in the following way. Partial fraction expansion leads to

$$\begin{aligned}
[z^nu^j]\left(\frac{X-uY}{1-u}\right)^2 &= \frac{2(j+1)}{(t-1)n+2}\binom{tn+1}{n}\\
&\quad - \frac{2(j+1)}{(t-1)n+2}\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k}\\
&\quad + \frac{2((t-1)n-j+1)}{(t-1)n+2}\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)(n-k)+1}\binom{tk}{k}\binom{t(n-k)}{n-k}.
\end{aligned}$$

In the following, we use the identity

$$\sum_{k=0}^{n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k} = \binom{tn+1}{n}.$$

This is easy to prove when considering the equation

$$(t-1)\,zT'(z) + T(z) = \sum_{n\ge 0}\binom{tn}{n}z^n.$$

The sum can then be simplified by extracting the coefficients of a convolution of two functions:

$$\begin{aligned}
\sum_{k=0}^{n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k} &= [z^n]\,T(z)\bigl((t-1)\,zT'(z) + T(z)\bigr)\\
&= [z^n]\,T^2(z) + \frac{t-1}{2}\,[z^{n-1}]\,(T^2(z))'\\
&= \left(1 + \frac{t-1}{2}\,n\right)[z^n]\,T^2(z) = \binom{tn+1}{n}.
\end{aligned}$$

When the first sum is running over the whole range from 0 to $n$ it cancels the extra summand, which is asymptotically of higher order. Adding the remaining summands leads to the following expression:

$$\begin{aligned}
[z^nu^j]\left(\frac{X-uY}{1-u}\right)^2 &= \frac{2(j+1)}{(t-1)n+2}\sum_{\lfloor (j-1)/(t-1)\rfloor+1\le k\le n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k}\\
&\quad + \frac{2((t-1)n-j+1)}{(t-1)n+2}\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)(n-k)+1}\binom{tk}{k}\binom{t(n-k)}{n-k}.
\end{aligned}$$

This leads to the following exact formula for the expectation $E(X_{n,j}) = \frac{1}{T_n}[z^nu^j]\,G_1(z,u)$.

Theorem 4. The expectation $E(X_{n,j})$ of the height of leaf $j$ in a t-ary tree with $n$ internal nodes is given by

$$\begin{aligned}
E(X_{n,j}) &= \frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{\lfloor (j-1)/(t-1)\rfloor+1\le k\le n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k}\\
&\quad + \frac{2((t-1)n-j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)(n-k)+1}\binom{tk}{k}\binom{t(n-k)}{n-k} - 1.
\end{aligned} \tag{8}$$
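Formula (8) can be compared with a direct enumeration of small t-ary trees. The following sketch is illustrative only and uses hypothetical function names; it evaluates (8) exactly with rational arithmetic and obtains the brute-force value by generating every t-ary tree with $n$ internal nodes as the tuple of its leaf depths. Python's floor division `//` is used for $\lfloor (j-1)/(t-1)\rfloor$, which correctly gives $-1$ for $j = 0$.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


def expectation_formula(t, n, j):
    """Right-hand side of (8), evaluated exactly."""
    m = (t - 1) * n
    cut = (j - 1) // (t - 1)  # floor((j-1)/(t-1)); equals -1 when j = 0

    def w(k):
        return comb(t * k, k) * comb(t * (n - k), n - k)

    s1 = sum(Fraction(w(k), (t - 1) * k + 1) for k in range(cut + 1, n + 1))
    s2 = sum(Fraction(w(k), (t - 1) * (n - k) + 1) for k in range(0, cut + 1))
    pref = Fraction(2 * (m + 1), (m + 2) * comb(t * n, n))
    return pref * ((j + 1) * s1 + (m - j + 1) * s2) - 1


@lru_cache(maxsize=None)
def trees(t, n):
    """Leaf-depth tuples (left to right) of all t-ary trees with n internal nodes."""
    if n == 0:
        return ((0,),)
    return tuple(tuple(d + 1 for d in f) for f in forests(t, t, n - 1))


@lru_cache(maxsize=None)
def forests(t, r, n):
    """Concatenated leaf depths of all ordered forests of r t-ary trees
    with n internal nodes in total."""
    if r == 0:
        return ((),) if n == 0 else ()
    return tuple(first + rest
                 for k in range(n + 1)
                 for first in trees(t, k)
                 for rest in forests(t, r - 1, n - k))


def expectation_bruteforce(t, n, j):
    """Exact average depth of leaf j over all t-ary trees of size n."""
    profiles = trees(t, n)
    return Fraction(sum(p[j] for p in profiles), len(profiles))


if __name__ == "__main__":
    for j in range(5):
        print(j, expectation_formula(3, 2, j), expectation_bruteforce(3, 2, j))
```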

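In the following, we will obtain an asymptotic equivalent of this expression. First, we want to evaluate the sum

$$\frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{\lfloor (j-1)/(t-1)\rfloor+1\le k\le n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k}$$

asymptotically for a fixed ratio $\rho = j/((t-1)n)$. With Stirling’s asymptotic formula for the binomial coefficients,

$$\binom{tn}{n} = \frac{\sqrt{t}\,\bigl(t^t/(t-1)^{t-1}\bigr)^n}{\sqrt{t-1}\,\sqrt{2}\,\sqrt{\pi}}\,n^{-1/2}\left(1+O\!\left(\frac{1}{n}\right)\right),$$

we obtain the following expansion for the summands from $k = \rho n$ to $n-1$:

$$\begin{aligned}
\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k} &= \frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{1}{\sqrt{k}\sqrt{n-k}}\left(1+O\!\left(\frac{1}{k}\right)\right)\left(1+O\!\left(\frac{1}{n-k}\right)\right)\\
&= \frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{1}{\sqrt{k}\sqrt{n-k}}\\
&\quad + \frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{c_t}{\sqrt{k}\,(n-k)^{3/2}}.
\end{aligned}$$

In the last expression, the constant $c_t$ is independent of $k$ and $n$. Therefore, we get

$$\begin{aligned}
&\frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{\lfloor (j-1)/(t-1)\rfloor+1\le k\le n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k}\\
&\quad = \frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{k=\rho n}^{n-1}\frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{1}{\sqrt{k}\sqrt{n-k}}\\
&\qquad + \frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{k=\rho n}^{n-1}\frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{c_t}{\sqrt{k}\,(n-k)^{3/2}} + O(1).
\end{aligned}$$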

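Interpreting the first sum as a Riemann sum, we obtain the following expansion:

$$\begin{aligned}
&\frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{k=\rho n}^{n-1}\frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{1}{\sqrt{k}\sqrt{n-k}}\\
&\quad = \frac{\sqrt{2}\sqrt{t}\,\rho\,\sqrt{n}}{\sqrt{\pi}\sqrt{t-1}}\sum_{k/n=\rho}^{1-1/n}\frac{1}{n\,(k/n)^{3/2}\sqrt{1-k/n}}\left(1+O\!\left(\frac{1}{n}\right)\right)\\
&\quad = \frac{\sqrt{2}\sqrt{t}\,\rho\,\sqrt{n}}{\sqrt{\pi}\sqrt{t-1}}\int_{x=\rho}^{1}\frac{dx}{x^{3/2}\sqrt{1-x}} + O(1)\\
&\quad = \frac{\sqrt{2}\sqrt{t}\,\rho\,\sqrt{n}}{\sqrt{\pi}\sqrt{t-1}}\left.\left(-\frac{2\sqrt{1-x}}{\sqrt{x}}\right)\right|_{x=\rho}^{1} + O(1)\\
&\quad = \frac{2\sqrt{2}\sqrt{t}\sqrt{\rho}\sqrt{1-\rho}}{\sqrt{\pi}\sqrt{t-1}}\,\sqrt{n} + O(1).
\end{aligned}$$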

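The remainder term leads to the following order estimation:

$$\begin{aligned}
&\frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{k=\rho n}^{n-1}\frac{t}{2\pi(t-1)}\left(\frac{t^t}{(t-1)^{t-1}}\right)^{n}\frac{1}{(t-1)k+1}\,\frac{c_t}{\sqrt{k}\,(n-k)^{3/2}}\\
&\quad = c\,n^{3/2}\sum_{k=\rho n}^{n-1}\frac{1}{k^{3/2}(n-k)^{3/2}} = c\sum_{k=\rho n}^{n-1}\left(\frac{1}{k}+\frac{1}{n-k}\right)^{3/2}\\
&\quad \le 2c\sum_{k=1}^{n/2}\left(\frac{1}{k}+\frac{1}{n-k}\right)^{3/2} \le 2c\sum_{k=1}^{n/2}\left(\frac{1}{k}+\frac{2}{n}\right)^{3/2} \le 2c\sum_{k=1}^{n/2}\left(\frac{2}{k}\right)^{3/2} = O(1).
\end{aligned}$$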

So we have the asymptotic expansion

$$\frac{2(j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{\lfloor (j-1)/(t-1)\rfloor+1\le k\le n}\frac{1}{(t-1)k+1}\binom{tk}{k}\binom{t(n-k)}{n-k} = \frac{2\sqrt{2}\sqrt{t}\sqrt{\rho}\sqrt{1-\rho}}{\sqrt{\pi}\sqrt{t-1}}\,\sqrt{n} + O(1).$$

An analogous procedure leads to

$$\frac{2((t-1)n-j+1)((t-1)n+1)}{((t-1)n+2)\binom{tn}{n}}\sum_{0\le k\le\lfloor (j-1)/(t-1)\rfloor}\frac{1}{(t-1)(n-k)+1}\binom{tk}{k}\binom{t(n-k)}{n-k} = \frac{2\sqrt{2}\sqrt{t}\sqrt{\rho}\sqrt{1-\rho}}{\sqrt{\pi}\sqrt{t-1}}\,\sqrt{n} + O(1).$$

So we finally get the asymptotic result for the expectations of the leaf heights in t-ary trees, as it was stated in Kirschenhofer (1985).

Theorem 5. The expectation $E(X_{n,j})$ of the height of the leaf $j$ in a t-ary tree with $n$ internal nodes is, for a fixed ratio $\rho = j/((t-1)n)$ with $0 < \rho < 1$ and $n \to \infty$, asymptotically given by

$$E(X_{n,j}) = \frac{4\sqrt{2}\sqrt{t}\sqrt{\rho}\sqrt{1-\rho}}{\sqrt{\pi}\sqrt{t-1}}\,\sqrt{n} + O(1). \tag{9}$$
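To get a feeling for the quality of the approximation (9), one can compare it with the exact value from (8) for moderately large $n$. The sketch below is an illustration only (hypothetical function names); it evaluates (8) with exact rational arithmetic, converts the result to a float and prints it next to the main term of (9).

```python
from fractions import Fraction
from math import comb, pi, sqrt


def expectation_exact(t, n, j):
    """Right-hand side of (8), evaluated exactly as a Fraction."""
    m = (t - 1) * n
    cut = (j - 1) // (t - 1)  # floor((j-1)/(t-1))

    def w(k):
        return comb(t * k, k) * comb(t * (n - k), n - k)

    s1 = sum(Fraction(w(k), (t - 1) * k + 1) for k in range(cut + 1, n + 1))
    s2 = sum(Fraction(w(k), (t - 1) * (n - k) + 1) for k in range(0, cut + 1))
    pref = Fraction(2 * (m + 1), (m + 2) * comb(t * n, n))
    return pref * ((j + 1) * s1 + (m - j + 1) * s2) - 1


def expectation_asymptotic(t, n, j):
    """Main term of (9) with rho = j / ((t - 1) n)."""
    rho = j / ((t - 1) * n)
    return 4 * sqrt(2 * t * rho * (1 - rho) * n) / sqrt(pi * (t - 1))


if __name__ == "__main__":
    for t in (2, 3):
        for n in (50, 100, 200, 400):
            j = (t - 1) * n // 2  # leaf in the middle, rho = 1/2
            exact = float(expectation_exact(t, n, j))
            print(t, n, exact, expectation_asymptotic(t, n, j))
```

Since the error term in (9) is $O(1)$, the absolute difference is expected to stay bounded while the relative difference shrinks like $n^{-1/2}$.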

3.2. Higher moments

We are also able to find explicit expressions for the higher factorial moments, although they are not so nice.

Differentiating $F(z,u,v)$ $s$ times w.r.t. $v$ and evaluating at $v = 1$ leads to

$$\left.\frac{\partial^s}{\partial v^s}F(z,u,v)\right|_{v=1} = \frac{s!\,\bigl(1-(1-u)/(X-uY)\bigr)^s}{\bigl((1-u)/(X-uY)\bigr)^{s+1}} = \sum_{p=1}^{s+1}(-1)^{s+1-p}\,s!\binom{s}{p-1}\left(\frac{X-uY}{1-u}\right)^{p}.$$

The coefficients of $\left(\frac{X-uY}{1-u}\right)^p$ are obtained as follows:

$$\begin{aligned}
[z^nu^j]\left(\frac{X-uY}{1-u}\right)^{p} &= \sum_{l=0}^{p}\,[z^nu^j]\,(-1)^l\binom{p}{l}\frac{(uY)^l X^{p-l}}{(1-u)^p}\\
&= T_n^{(p)}\binom{j+p-1}{p-1} + \sum_{l=1}^{p-1}(-1)^l\binom{p}{l}\sum_{0\le k\le\lfloor (j-l)/(t-1)\rfloor}T_k^{(l)}\,T_{n-k}^{(p-l)}\binom{j-l-(t-1)k+p-1}{p-1}.
\end{aligned}$$

Therefore, we get

$$\begin{aligned}
[z^nu^j]\left.\frac{\partial^s}{\partial v^s}F(z,u,v)\right|_{v=1} &= \sum_{p=1}^{s+1}(-1)^{s+1-p}\,s!\binom{s}{p-1}\Biggl[T_n^{(p)}\binom{j+p-1}{p-1}\\
&\quad + \sum_{l=1}^{p-1}(-1)^l\binom{p}{l}\sum_{0\le k\le\lfloor (j-l)/(t-1)\rfloor}T_k^{(l)}\,T_{n-k}^{(p-l)}\binom{j-l-(t-1)k+p-1}{p-1}\Biggr].
\end{aligned}$$

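Hence the higher moments are given by the following theorem.

Theorem 6. The sth factorial moment $E(X_{n,j}^{\underline{s}})$ of the height of the leaf $j$ in a t-ary tree with $n$ internal nodes is given by

$$\begin{aligned}
E(X_{n,j}^{\underline{s}}) &= \frac{1}{T_n}\sum_{p=1}^{s+1}(-1)^{s+1-p}\,s!\binom{s}{p-1}\Biggl[T_n^{(p)}\binom{j+p-1}{p-1}\\
&\quad + \sum_{l=1}^{p-1}(-1)^l\binom{p}{l}\sum_{0\le k\le\lfloor (j-l)/(t-1)\rfloor}T_k^{(l)}\,T_{n-k}^{(p-l)}\binom{j-l-(t-1)k+p-1}{p-1}\Biggr].
\end{aligned} \tag{10}$$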
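Although (10) is not in closed form, it is straightforward to evaluate exactly. The following sketch is illustrative only and uses hypothetical function names; it computes the right-hand side of (10) with rational arithmetic and compares it with factorial moments obtained by enumerating all t-ary trees of a small size as tuples of leaf depths.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


def T(t, p, n):
    """T_n^{(p)} = [z^n] T(z)^p for t-ary trees."""
    return Fraction(p * comb(t * n + p - 1, n), (t - 1) * n + p)


def factorial_moment_formula(t, n, j, s):
    """Right-hand side of (10), evaluated exactly."""
    total = Fraction(0)
    for p in range(1, s + 2):
        inner = T(t, p, n) * comb(j + p - 1, p - 1)
        for l in range(1, p):
            kmax = min((j - l) // (t - 1), n)  # floor; negative gives an empty range
            for k in range(0, kmax + 1):
                inner += ((-1) ** l * comb(p, l) * T(t, l, k) * T(t, p - l, n - k)
                          * comb(j - l - (t - 1) * k + p - 1, p - 1))
        total += (-1) ** (s + 1 - p) * factorial(s) * comb(s, p - 1) * inner
    return total / T(t, 1, n)


@lru_cache(maxsize=None)
def trees(t, n):
    """Leaf-depth tuples (left to right) of all t-ary trees with n internal nodes."""
    if n == 0:
        return ((0,),)
    return tuple(tuple(d + 1 for d in f) for f in forests(t, t, n - 1))


@lru_cache(maxsize=None)
def forests(t, r, n):
    """Concatenated leaf depths of ordered forests of r trees, n internal nodes in total."""
    if r == 0:
        return ((),) if n == 0 else ()
    return tuple(first + rest
                 for k in range(n + 1)
                 for first in trees(t, k)
                 for rest in forests(t, r - 1, n - k))


def factorial_moment_bruteforce(t, n, j, s):
    """Exact average of X(X-1)...(X-s+1) for the depth X of leaf j."""
    profiles = trees(t, n)
    total = 0
    for prof in profiles:
        falling = 1
        for i in range(s):
            falling *= prof[j] - i
        total += falling
    return Fraction(total, len(profiles))


if __name__ == "__main__":
    print(factorial_moment_formula(3, 3, 2, 2), factorial_moment_bruteforce(3, 3, 2, 2))
```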

One could work out the asymptotics from this formula, at least for the variance, but the computational effort would be considerable, because of the alternating sign in the inner sum. Since corresponding formulæ are already stated in Gutjahr (1992), we refrain from doing that.

References

Drmota, M., 1994. The height distribution of leaves in rooted trees. Discrete Mathematics and Applications 4, 45–58.

Fisz, M., 1963. Probability Theory and Mathematical Statistics. Wiley, New York.

Gittenberger, B., 1995. Die Konvergenz spezieller stochastischer Prozesse gegen die Brownsche Exkursion und deren lokale Zeit. PhD thesis, Technische Universität Wien.

Graham, R.L., Knuth, D.E., Patashnik, O., 1994. Concrete Mathematics, 2nd Edition. Addison-Wesley, Reading, MA.

Gutjahr, W., 1992. The variance of level numbers in certain families of trees. Random Struct. Algorithms 3, 361–374.

Gutjahr, W., Pflug, G., 1992. The asymptotic distribution of leaf heights in binary trees. Graphs Combin. 8, 243–251.

Kirschenhofer, P., 1983a. On the height of leaves in binary trees. J. Combin. Inform. System Sci. 8, 44–60.

Kirschenhofer, P., 1983b. Some new results on the average height of binary trees. Ars Combin. 16A, 255–260.

Kirschenhofer, P., 1985. Asymptotische Untersuchungen zur durchschnittlichen Gestalt gewisser Graphenklassen. In: Hlawka, E. (Ed.), Zahlentheoretische Analysis, Vol. 1114, Lecture Notes in Mathematics. Springer, Berlin, pp. 40–54.

Panholzer, A., Prodinger, H., 1997. Descendants and ascendants in binary trees. Discrete Math. Theoret. Comput. Sci. 1, 247–266.

Petkovšek, M., Wilf, H., Zeilberger, D., 1996. A=B. A.K. Peters, Wellesley, MA.