>>: Okay. So our next speaker is Dr. Rizzolo. He just moved from San
Francisco to Seattle. He got his Ph.D. from U.C. Berkeley under Jim Pitman.
He's now an NSF post-doc and research associate at UW. And he's going to talk
about Schröder's problems and random trees.
>> Douglas Rizzolo: Thanks. I'd like to thank the organizers for inviting me, and
the NSF for funding this work on any number of grants. And I'd like to thank Izof
[phonetic] for giving such a nice first talk. I would have appreciated it if the bar
had been set lower [laughter].
So as the title suggests, I'm going to talk about Schröder's problems and random
trees. So Schröder's problems are a class of four problems that were introduced
a while ago by Schröder in the 1870s in a paper that everybody cites but I've
never seen physical evidence of it existing or even really of the journal it's been
published in. But everybody propagates the story that it was published in 1870
so I'll stick with that.
So it's four problems about bracketings of words and sets. So the first is a very
common problem that comes up in a lot of discrete math courses: how many
binary bracketings are there of a word of length N?
So these are bracketings of length four. They're pretty easy to get ahold of.
These are one of the classical examples of things that are counted by the
Catalan numbers. So the second problem is: what if you remove the condition
that the bracketings be binary, what if you just had arbitrary bracketings of words
of length N? You require the bracketings to be nonredundant and nonempty so
that there are finitely many of them.
For these you can get a general solution. These have been sort of known for a
while. This is just taken off the Online Encyclopedia of Integer Sequences.
They're all there, you can search them by name if you want to learn more about
them.
And so these first two problems are problems of bracketing words. So the
elements are ordered. But you can ask the same problem for bracketings of
sets. So how many bracketings of a set of size N are there.
Binary ones, you can again get a nice formula. And the last problem is what if
you remove the binary condition for bracketings of sets? You can do this. Again,
you can get a much less nice formula. Looking at these formulas, it's pretty
apparent that doing combinatorics for binary things is much easier than the
general case. But you can always get formulas, whether or not they're terribly
helpful.
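For reference, all four counts are easy to compute. Here's a sketch in Python; the recurrences used (the Catalan numbers, the three-term little Schröder recurrence, the double factorial (2N-3)!!, and an exponential-formula recurrence for total partitions) are standard facts I'm bringing in, not formulas taken from the talk:

```python
from math import comb

def catalan_bracketings(n):
    """Problem 1: binary bracketings of a word of length n = Catalan(n-1)."""
    return comb(2 * (n - 1), n - 1) // n

def schroder_bracketings(n):
    """Problem 2: nonredundant, nonempty bracketings of a word of length n,
    the little Schroeder numbers, via their standard three-term recurrence."""
    s = [0, 1, 1]  # s[1] = s[2] = 1
    for m in range(2, n):
        s.append(((6 * m - 3) * s[m] - (m - 2) * s[m - 1]) // (m + 1))
    return s[n]

def binary_set_bracketings(n):
    """Problem 3: binary bracketings of an n-set = (2n-3)!!."""
    result = 1
    for k in range(3, 2 * n - 2, 2):
        result *= k
    return result

def set_bracketings(n):
    """Problem 4: arbitrary bracketings of an n-set (total partitions), via
    a(n) = sum_k C(n-1,k-1) a(k) B(n-k) with B(m) = 2 a(m) for m >= 2."""
    a, B = [0, 1], [1, 1]
    for m in range(2, n + 1):
        am = sum(comb(m - 1, k - 1) * a[k] * B[m - k] for k in range(1, m))
        a.append(am)
        B.append(2 * am)
    return a[n]

for n in range(1, 6):
    print(n, catalan_bracketings(n), schroder_bracketings(n),
          binary_set_bracketings(n), set_bracketings(n))
# the n = 4 row reads: 4 5 11 15 26
```

The n = 4 row matches the talk: 5 binary word bracketings, 15 binary set bracketings (of which only five were shown on the slide).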
>>: I'm sorry, I missed the difference between problem one and problem three,
could you just --
>> Douglas Rizzolo: Yes.
>>: Go back and --
>> Douglas Rizzolo: Sorry. Stop me if I go too fast at any point.
So in these, you're bracketing sets. So the elements aren't ordered. Right? So
you can sort of see a difference. For example, for these --
>>: These, they're the same number. I get it.
>> Douglas Rizzolo: No, here I've only shown five of the 15. There were too
many of them to list all of them.
>>: I see.
>> Douglas Rizzolo: I sort of indicated that, underhandedly, in the language.
These are some of the bracketings of size four.
>>: You could have one and three in a set and two and four in a set.
>> Douglas Rizzolo: Exactly. So it's the basic change between problems one
and two and problems three and four is you're trading in order structure for a
labeling structure. And part of what Jim Pitman, who is my co-author in a lot of
this work has done is figure out what's the relation between labeled structures
and ordered structures of these types.
Okay. So what we're interested in is what do these look like if you pick one
uniformly at random for some large N? And the way we're going to look at this is
we're going to cast it in the setting of random trees, which is a fairly well studied
field.
So the bijection we're going to use with trees is in some sense the obvious one,
although there's lots of bijections between these problems and various models of
trees.
So, for example, the first problem is going to correspond to rooted ordered binary
trees, sort of in the following way: there's a somewhat obvious nesting
structure that these bracketings have, and those nestings become the internal
vertices, with the letters corresponding to leaves. This is what our trees look like for binary word
bracketings.
And we can do binary set bracketings and here it's sort of clear that now we have
labels and though you may not be able to tell from the picture these trees aren't
ordered. They're rooted. And similarly we can do this for the fourth problem and
you can see the only difference is that we're allowing sort of arbitrary degrees
instead of just binary degrees.
Okay. So to look at these, what we're going to do is to frame these as problems
of looking at conditioned Galton-Watson trees. And Galton-Watson trees have
been fairly well studied. They've come up a bit, although in the context of
conditioning on the number of vertices, never in the context of conditioning on
the number of leaves, which is what we have in this particular case.
So let's recall the definition of a Galton-Watson tree: you just have some
offspring distribution and you take the distribution on trees that makes the
outdegrees of the vertices as independent as possible.
So that's -- you have this product form over the degrees in the tree. And just sort
of been known for a while that a lot of combinatorial models fit very nicely into the
Galton-Watson tree framework. For example, if the offspring distribution is
geometric and you take a Galton-Watson tree with that offspring distribution
conditioned to have N vertices, what you get is a uniform random tree on N vertices.
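A minimal sketch of that geometric example, using naive rejection sampling to realize the conditioning (the rejection approach and the outdegree-sequence encoding are my own illustrative choices, not a method from the talk):

```python
import random

def sample_gw_tree(max_vertices, rng):
    """Sample a Galton-Watson tree with Geometric(1/2) offspring (mean 1),
    returned as its list of outdegrees in breadth-first order.
    Returns None if the tree exceeds max_vertices (used for rejection)."""
    outdegrees = []
    queue = 1  # vertices generated but not yet given children
    while queue > 0:
        # Geometric(1/2) on {0, 1, 2, ...}: P(k) = 2^-(k+1), mean 1
        k = 0
        while rng.random() < 0.5:
            k += 1
        outdegrees.append(k)
        queue += k - 1
        if len(outdegrees) + queue > max_vertices:
            return None
    return outdegrees

def uniform_ordered_tree(n, rng):
    """Uniform rooted ordered tree on n vertices, by conditioning the
    critical geometric Galton-Watson tree on having exactly n vertices."""
    while True:
        t = sample_gw_tree(n, rng)
        if t is not None and len(t) == n:
            return t

rng = random.Random(0)
tree = uniform_ordered_tree(6, rng)
print(tree, sum(tree))  # the outdegrees of any tree sum to n - 1
```

Rejection is wasteful but makes the conditioning explicit: every ordered tree with n vertices has the same probability under the critical geometric law, so the accepted tree is uniform.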
So our hope was to be able to do something like this with the trees appearing in
Schröder's problems. And, in fact, you can. And Jim Pitman and I did this along
with Curien-Kortchemski at around the same time. It was -- for the first two
problems it's fairly straightforward, because the trees are -- the trees in
Schröder's problems are ordered and Galton-Watson trees are rooted ordered
trees. They're the same type. And you can just go through it.
And what you get are these two offspring distributions. So in the binary case you
get the only thing you can really have, which is uniform distribution on vertices of
outdegree 0 and outdegree 2. And in the general case you get this somewhat
stranger offspring distribution where it's not necessarily clear where it's coming
from, but the key point is that for I greater than 2 you have some number raised
to the power I minus 1.
And the reason that's going to be nice is just because if you sum, over the internal
vertices of a tree, their outdegrees minus 1, what you get is a formula in terms of
the number of leaves (in fact, the number of leaves minus one).
So let me give a quick proof of this to show how these things go. If you look at the trees
appearing in Schröder's second problem, if you take a Galton-Watson tree with
that offspring distribution, you just get this product form, and you can compute
the sum in the exponent of the second one and you get a formula in terms of the
number of leaves. What this says is that if you condition on the number of leaves,
everything has the same probability, so it's uniform. And that's basically how that
goes for the first two problems.
Now, for the second two problems, it's a little more difficult, because you need a
way to get from rooted ordered trees, which is what all Galton-Watson trees are,
to rooted labeled trees.
And this is sort of a classical problem. It's been done in the case where you're
labeling vertices and in that case it turns out to be a bit easier, but you can also
do it when you're labeling leaves.
And the basic idea is the following sort of transformation: you take some
rooted ordered tree and an ordering of the numbers 1 through the number of
leaves in the tree, and you just label the leaves from left to right by it.
It's one of the first things you might try. It wasn't one of the first things we tried
for some reason. I can't tell you why.
And the idea here is if you now take --
>>: Can you go back?
>> Douglas Rizzolo: Uh-huh. So --
>>: Okay.
>> Douglas Rizzolo: It might not be clear from the picture. But the tree on the
right-hand side is unordered but labeled. All right. So in this case, for binary
trees, you have the one and only critical Galton-Watson offspring distribution on
binary trees. And in the case of the fourth problem, you get this even stranger
distribution. But it turns out if you take a Galton-Watson tree with one of these
distributions and independent uniform ordering of the numbers 1 through N, and
go through this label the leaves from left to right by this ordering and then forget
the order, you get the trees appearing in the second, I mean in the third and
fourth problems respectively.
So what this does is basically cast looking at Schröder's problems as looking at
random Galton-Watson trees conditioned on their number of leaves.
So looking at Galton-Watson trees conditioned on the number of leaves is sort of
a recently developed thing. There are at this point four or five distinct approaches to
dealing with them when they're large. I'm going to tell you about what I think is
mostly the right approach or the easiest and most intuitive approach to dealing
with conditioning Galton-Watson trees on their number of leaves.
Okay. So I should mention perhaps where these have come up before. The
trees appearing in the second problem. These are rooted ordered trees with no
vertices of outdegree one. These recently came up in studying non-crossing
arrangements in the plane and looking at Brownian triangulations and things like
that in work of Curien-Kortchemski; that's sort of where they came at these from.
Okay. So how do we deal with these? What is there to say about
Galton-Watson trees, conditioned on their number of leaves?
So the way I'm going to approach the problem, or at least the way we're going
to approach it today, is by relating trees with N leaves to trees with N vertices in
a somewhat nonobvious way.
So the way we're going to do it is we're going to start with a rooted ordered tree
with N leaves and we're going to transform it into a tree with N vertices by the
following procedure.
So we're just going to label the leaves from left to right, just increasing order.
Once we've done this, we're going to label all of the edges in the tree. And the
labeling we're going to do is we're going to label the edge by the index of the
smallest leaf in the tree above the edge.
So if you were to remove the edge and look at the tree above it, the tree further
away from the root, you'll get the smallest leaf and label the edge by that.
And then the transformation is to collapse along these spines to get a tree. So
what's happening here is if you look at the vertex labeled one, that's coming from
the spine from the leaf labeled one to the root, and the vertices attached to it are
all of the spines that touch the spine labeled one. So we have the spines labeled
two, five and six.
And you sort of repeat this procedure and it collapses the tree into a tree with N
leaves, I mean into a tree with N vertices.
So it's worth noting, this is something of a classical transformation. This is one of
the many known bijections between binary trees with N leaves and trees with N
vertices.
So this is something of a classical transformation at least in the case of binary
trees. And we're going to do it just sort of in general.
And what's nice about this is this transformation actually preserves the law of
Galton-Watson trees. If you do this to a Galton-Watson tree, what you get out is
again a Galton-Watson tree.
So here's the theorem. Basically it says that if you start with a Galton-Watson
tree and do this transformation, the result is again a Galton-Watson tree but with a
different offspring distribution. You can write down explicitly what that offspring
distribution is.
And other nice properties, if the tree you started with is critical, that is the
offspring distribution has mean one then your new tree is also critical, offspring
distribution has mean one and, again, if you start with something that's finite
variance you end up with something that's finite variance.
And that can -- that last statement can be extended, that if you start with
something in the domain of attraction of a stable law, you get something in the
domain of attraction of the same stable law.
Okay. So just to repeat, sort of to say in a different way that this is a bijection
from binary trees to general trees, that if you take the trees appearing in
Schröder's first problem and do this, what you get is a uniform tree on N vertices.
And I'll mention that again, since something interesting comes out of this
study because of it.
>>: Can you explain bijection again?
>> Douglas Rizzolo: The bijection? Okay. Yeah. So this transformation, you
start with -- right, you just label the leaves and you label the edges. And if you
look at the spine now labeled one, you attach to it as vertices all of the spines
that touch it. So two touches it, five touches it, six touches it and those become
the children of one. So you get two, five and six.
And similarly, if you look at the spine two, that's touched by the spine's three and
four. So you get as a child of two, three and four, in the order that they're
appearing as they touch it.
And you go through with five and six, they're leaves. And that's the
transformation in general.
>>: One with longer by one edge, does that make a difference?
>>: No, it's not a bijection on general trees. If you restrict to binary trees, you get
a bijection from binary trees with N leaves to trees with N vertices.
So this in general is definitely not a bijection. But if you look at -- if you do this
only to binary trees, then it is a bijection.
And that's because you can basically just reverse the process. If you require
everything to be binary, you just sort of expand each vertex: you take its degree,
turn it into a path of that length, and attach things as required. And there's only
one way to make it binary.
>>: It matches Galton-Watson trees to Galton-Watson trees, but for all Galton-Watson
trees on the right, is there a Galton-Watson tree on the left?
Do you get all such Galton-Watson trees out of this?
>> Douglas Rizzolo: I'm fairly sure the answer is no. You can sort of -- when you
write down what the distribution is of the tree on the left, what it is is it's -- it's
what I would call a compound geometric distribution. Right? You're summing
up -- you're looking at a sequence of independent things until you see the first 0.
And then you're summing those up.
And I quite frankly would be surprised if you got everything from a procedure like
that, but I don't actually have a proof that you don't.
Okay. So are there any other questions about this transformation? Okay.
So one thing this does, by this remark, is give you a connection between binary
trees with N leaves and just regular trees with N vertices. And as a consequence
of what I'm going to say, it turns out that a uniform tree with N vertices is almost
a binary tree with N leaves. And there's a very explicit coupling of the two such
that the Gromov-Hausdorff distance between them is small.
Okay. And the way we're going to get to something like that is through the
depth-first processes of these trees.
One of the nice things about ordered trees is that you can make use of the order
structure to get bijections with conditioned random walks, something you can't do
as nicely with labeled trees, which is sort of why moving to the Galton-Watson
framework is useful even for dealing with the third and fourth problems, which are
not inherently ordered.
Okay. So one of the most basic depth first processes is the depth first walk.
Here's a formal definition, but the basic idea is you start at the root and then you
go to its left-most child; from there you go to that one's left-most child, and so on,
unless a vertex doesn't have any children that haven't been visited, at which
point you go back to the parent and proceed around the tree in this manner.
And this gives a nice way to order the vertices of the tree, sort of the depth first
order. And so this is the order we're always going to be using. If I have a rooted
ordered tree and I list the vertices V1 through VN, these are going to be listed in
order of appearance on this depth first walk.
So once we have that, the easiest depth first process to deal with is the depth
first queue. So you take a tree. The depth first queue at step N is just sort of
the random walk that sums up the degrees of the vertices minus one, as you are
sort of going around.
I should mention that degrees are always out degree here. So it's the number
of -- number of vertices that are adjacent to it and further from the root.
Okay. So you get this as the depth first queue. And the reason this is easy to
deal with for Galton-Watson trees is that it's a first passage bridge from 0 to
minus 1.
So if you have a Galton-Watson tree with offspring distribution C, then SN, as I said,
is just a first passage bridge from 0 to minus 1 of a random walk with the shifted
step distribution.
Okay. So this is an example of the depth first queue of a random binary tree with
11 leaves. Okay. So the reason for introducing these depth first queues or the
first step is that the transformation I was discussing before has a fairly nice
representation or has a fairly nice action on the depth first queues.
So if you look at it, what happens is the following: You take the depth first queue
and suppose you only observe it when it steps down. You only observe it after
negative steps. And you just take that walk. That's the walk that you have in red
here. And then just sort of compress that, right? You only just -- this is just a
time shift of the red walk on the previous slide. This again is a first passage
bridge. And it is in fact the depth first queue of the transformed tree. Okay. So
that's a fairly straightforward action on the depth first queues. And it also shows
you that the two trees really aren't that far apart in terms of the distance between
their depth first queue, because you're basically just waiting until you have steps
down and really how much could go wrong in between the steps down of one of
these walks.
The answer is a lot. But it turns out that it doesn't in this case, because things
are pleasant. Okay. So what we can get out of this is sort of this theorem. And I
apologize for having so much text on a single slide, but there was no escaping it.
So what this theorem says is that if you take a Galton-Watson tree with finite
variance, mean 1 offspring distribution and you look at the transformed tree and
you look at their depth first queues, this gives you a coupling between the depth
first queues of those trees, and if you scale appropriately, the joint distribution of
the two processes converge to the same Brownian excursion.
Okay. So what this is saying is that the distance between -- and so the thing to
note on this is that this is the same Brownian excursion here and here. They're
converging to the same thing. That's the joint distribution of the two.
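As a concrete sanity check on the down-step action described above, here's a sketch in Python; the encoding of a tree as its outdegree sequence in depth-first order, and the spelling of the inverse map, are my own conventions for illustration:

```python
def depth_first_queue(outdegrees):
    """Depth first queue of a rooted ordered tree given by its outdegree
    sequence in depth-first order: partial sums of (outdegree - 1)."""
    walk, s = [], 0
    for d in outdegrees:
        s += d - 1
        walk.append(s)
    return walk

def collapse(outdegrees):
    """Observe the depth first queue only at its down-steps (the leaves);
    the increments of that time-changed walk, plus one, are the outdegrees
    of the transformed tree, one vertex per leaf of the original."""
    prev, new = 0, []
    for d, value in zip(outdegrees, depth_first_queue(outdegrees)):
        if d == 0:  # leaves are exactly the down-steps
            new.append(value - prev + 1)
            prev = value
    return new

def expand_binary(outdegrees):
    """Inverse of collapse, restricted to the binary case: each vertex of
    outdegree d becomes a spine of d binary vertices followed by a leaf."""
    out = []
    for d in outdegrees:
        out.extend([2] * d + [0])
    return out

# Binary tree with 3 leaves: root -> (leaf, internal -> (leaf, leaf))
binary = [2, 0, 2, 0, 0]
collapsed = collapse(binary)               # a path on 3 vertices: [1, 1, 0]
assert expand_binary(collapsed) == binary  # round trip, as for a bijection
print(collapsed, depth_first_queue(collapsed))
```

Note that the collapsed tree's depth first queue, `[0, 0, -1]`, is again a first passage bridge from 0 to minus 1, now with one step per leaf of the original tree.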
And the way to prove something like this -- let me remark, before I remark on the
proof, that it holds under weaker conditions: you can prove sort of the analogous
theorems for offspring distributions that are in the domain of attraction of a stable
law. You can do this coupling and make it work, but it's considerably more
difficult to do so.
So the way we do it is with this sort of proposition, showing that the uniform
distance between the two depth first queues is small. And that basically
comes just by looking at the picture. The whole thing sort of comes down to: you
look at the time change of how far apart the steps down are. Not that far apart.
You can basically scale those and have them go to the identity.
And then if you look at the distances between the two walks, well, I mean, the
walks just aren't going to get far apart. They don't have enough time to get far
apart when you're doing this. And this sort of works out because in the limit
we're going to get standard Brownian excursion, which is continuous.
And basically the modulus of continuity, the distance between these two walks is
controlled by the modulus of continuity. And that's going to 0. So we can get this
type of proposition connecting them.
>>: How far do they really get apart? This loop in, loop back out?
>> Douglas Rizzolo: Oh, yeah, this is not nearly the strongest statement you can
have. In general, for finite variance, it might be. In our cases, we have some
exponential moments on the offspring distribution. So you can do a moderate
deviations type thing.
You can get, I think -- I think you can put an N to the one-fourth plus epsilon and
get decay of order E to the minus alpha N to the epsilon.
>>: [inaudible].
>> Douglas Rizzolo: Should be longer. Yes. I suspect it could be. I'm going to
say I haven't actually -- I haven't sort of looked into how tight you can get these
tails assuming exponential moments. But sort of there is obvious theory there
that you could use to try to work that out.
So existing deviation theorems could give you rates for something like this.
Okay. So once we have that -- so that's sort of the depth first queue. What its
convergence tells you about the resulting tree isn't that obvious. It tells you some
things, but it doesn't tell you that the whole tree converges, say, in a
Gromov-Hausdorff sense. It just sort of gets you close to a result like that.
To get the convergence of the tree in a Gromov-Hausdorff sense or as a whole,
you need to take -- you need to look closer at the contour process. So the
contour process is just the process that goes along recording the distance from
each vertex to the root as you go along the depth first walk.
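A minimal sketch of that definition, with trees encoded as nested lists (an encoding I'm assuming for illustration, not one from the talk):

```python
def contour_process(tree, depth=0):
    """Contour process of a rooted ordered tree, encoded as nested lists
    (a vertex is the list of its subtrees): record the distance to the root
    at each step of the depth-first walk, which traverses each edge twice."""
    walk = [depth]
    for subtree in tree:
        walk.extend(contour_process(subtree, depth + 1))
        walk.append(depth)  # step back down to the current vertex
    return walk

# root with a leaf child and an internal child having two leaf children
print(contour_process([[], [[], []]]))  # [0, 1, 0, 1, 2, 1, 2, 1, 0]
```

For a tree with E edges the contour process has 2E + 1 values, one per step of the walk around the tree.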
So that's what this definition tells you. And the nice thing about this is if you
assume exponential -- that you have some exponential moments of your
offspring distribution, you get these sorts of moderate deviations for the distance
between the depth first queue and the contour process of a conditioned tree.
So this theorem, when TN is a Galton-Watson tree conditioned to have N
vertices, is due to Markert and Mokkadem in around 2003.
And more recently it was done in the case of N leaves, where you can basically
just reduce it to their proof. So getting it for trees conditioned on N leaves isn't
so bad once you have it for trees with N vertices.
One thing I should mention, though, since we have this for both: what this tells
you is that you can couple the contour process of a uniform binary tree with N
leaves and the contour process of a uniform tree with N vertices so that they're
very close together.
You have these sorts of moderate deviations for the distance between them; at
least, as Yuval was mentioning, you can probably do better than this.
But still this tells you that a uniform binary tree with N leaves is, in a very strong
sense, almost a uniform tree with N vertices, which I think is something of an
interesting coupling. So once we have convergence of the contour process, we
still want to say something about the tree itself, especially for the third and fourth
problems, which aren't naturally ordered. So convergence of the contour
process is nice, but it's not really what you're interested in in the first place.
So the way to make rigorous something about convergence of the whole tree that
we're going to do is Gromov-Hausdorff-Prokhorov convergence. So we're going
to let MW be the set of equivalence classes of compact metric measure spaces
that are pointed, and we're going to define the Gromov-Hausdorff-Prokhorov
distance between them as usual. Well, this is not the usual definition, but it's the
one I'm using: you minimize the distance between the roots, the Hausdorff
distance between the sets and the Prokhorov distance between the measures,
over all metric spaces and isometric embeddings of the two into a common
metric space.
I should mention that, theoretically speaking, this slide is complete nonsense.
None of that means anything in ZF, with or without choice; quantifying over all
metric spaces, several times over, you just can't do that.
But there is a way to formalize this in terms of classical Zermelo-Fraenkel set
theory. So I mention this because usually it doesn't matter that you can't
formalize this. But every now and again you'll find people getting in trouble
because of that. So sort of mention that you can in fact formalize this.
Okay. And the nice thing is that this metric, this Gromov-Hausdorff-Prokhorov
metric, plays nicely with encoding trees by things like their contour functions.
And the interaction is sort of this: you can take the contour function to define a
pseudometric on the unit interval, and then when you quotient out to make it a
metric you get a tree.
That's sort of what this is doing: you're defining the pseudodistance between two
points, given a function, as the sum of the values of the function minus twice the
minimum of the function between them.
And there's nothing inherently natural about this definition the first time you see it.
But if you sort of write down what this means for a finite tree, you'll see that this
gives you the graph metric on the tree, where you've just sort of filled in the
edges by unit intervals.
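A sketch of that pseudometric for a contour function sampled at integer times (the discretization is my simplification for illustration; the talk works with functions on an interval):

```python
def tree_distance(f, s, t):
    """Pseudodistance induced by an excursion f (here a list of heights):
    d_f(s, t) = f(s) + f(t) - 2 * min of f on [s, t].  Points at distance
    zero get identified when you quotient out to build the metric tree."""
    lo, hi = min(s, t), max(s, t)
    return f[s] + f[t] - 2 * min(f[lo:hi + 1])

# contour function of a path with three vertices (root - a - b)
f = [0, 1, 2, 1, 0]
assert tree_distance(f, 1, 3) == 0  # times 1 and 3 both visit vertex a
assert tree_distance(f, 0, 2) == 2  # root to b: graph distance 2
```

The two assertions show both halves of the construction: distinct times visiting the same vertex are identified, and the surviving distances are exactly the graph distances on the tree.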
Okay. So if we take this pseudo metric, we quotient out by it. We get a compact
metric tree. And we can push forward any measure we want on to that. And
the nice thing is that this map, which takes non-negative continuous functions
that start and end at 0, together with measures, to trees, is actually continuous.
This tells you that sort of everything you want to do in terms of composing them
works nicely.
Now, I should mention that while this lemma has certainly been in the folklore
since at least the early nineties, as far as I can tell it's only been written down in
the past several months.
So this is sort of the lemma that lets us actually do things with this procedure of
constructing metric trees.
And the sort of universal object for this setting is the Brownian continuum tree
which is what happens if you apply our construction to standard Brownian
excursion and Lebesgue measure. This gets you the Brownian CRT, which is
slightly different from the one that Aldous originally defined, where he had twice
Brownian excursion. But this has become the more standard normalization.
And sort of coupling this with the theorem about convergence of contour
processes, you get sort of the following theorems for convergence in distribution
of the trees appearing in Schröder's problems with respect to this
Gromov-Hausdorff-Prokhorov topology where the measure we're putting on them
is the uniform probability distribution on the leaves and you get this explicit
formula available.
Okay. And I think that's it. Thanks.
[applause]