>> David Wilson: Welcome everyone. I'm delighted to see so many people attending the
Northwest Probability Seminar. It's our annual northwest event and I'm happy to see it is going
so well. I have the pleasure of introducing the first speaker. Actually, the first speaker did a
great job of introducing himself. Alexander Holroyd, who is representing Microsoft at this mini-conference, will talk about finitely dependent colorings; sorry, just one, coloring.
>> Alexander Holroyd: Okay, thank you. It's wonderful to be here and to see so many people. What I'm going to talk about is a new and rather strange and mysterious object that we found, a stochastic process. I think it's fair to say that we don't fully understand how it works. We can prove a lot of things about it, but we don't entirely know what's going on, so we would certainly be pleased to have any help that anyone can offer. It's joint work with Tom Liggett, and it's dedicated to Oded Schramm; I'll explain why. The basic motivating question is, I guess: do local constraints demand global organization? There are lots of questions in real
life that fit into that category. Do you need national banks to organize financial transactions?
Do you need national borders? Do you need central planning of road networks or could you
just let roads evolve and organize themselves like they did for many centuries? Of course, I
don't have any answers to those questions, but I want to talk about a very, very simple clean
mathematical question which is a version of that general question. Before I state it formally,
I'm going to just tell you a little story. I hesitate to say application, because it's not really that
realistic. It's just a story. Imagine you have a network of machines, computers and the simplest
type of network of all is just a line, an integer line. And suppose you want to assign some
schedule for them, say schedules for some updates for when the machines update their
software or something like that. For simplicity, let's say the schedule is simply a day of the
week so you want to assign a day of the week to the computer for when it's going to do
something. Maybe we just update them all on Mondays, but maybe that's a bad idea. Maybe
we have some local constraints, and so perhaps we have the local constraint that adjacent machines shouldn't be updated at the same time. Maybe that creates a conflict, so
maybe you want to update adjacent machines on different days, so you can just alternate
Monday, Tuesday, Monday, Tuesday. Suppose I also have a security constraint and my security
constraint is that I would prefer adversaries not to learn too much about my schedules and so
the best way to make something secret is to choose it at random and so I want to choose the
schedules at random in some way and I want it to be the case that if somebody finds out some
of them then they don't find out too much about the others. For instance, maybe they find out
the schedules of these two here, and I would like that not to give them too much information about the others. It's bound to give some information, because if you know that this one is Monday, then because of the local constraint, you know that this one cannot be Monday. Excellent. But maybe you can hope that it doesn't give you too much information
about other machines further away, some larger distance away. That's the story, so now let me
turn it into a mathematical question. Here's the question, which comes from discussions between myself, Benjamini and Benji Weiss in 2008: fix integers k and q; does there exist a stationary k-dependent q-coloring of the integers? Of course I will tell you what these words mean. By a q-coloring I mean a random sequence indexed by the integers, assigning a color, which is just an integer from 1 to q, to each integer; the whole sequence is random, and I want it to be a proper coloring, which means that adjacent integers must have different colors. That's the local constraint. And stationary, of course, we all know what that means: if you shift the sequence by 1 to the right, you get a sequence with the same distribution. The most interesting condition is k-dependence, and it says: if you take two sets of integers which are more than distance k apart, then the random vector that you get by restricting to one set is independent of the random vector that you get by restricting to the other set. So you ignore k guys in between, you look at some set of integers here and some set of integers here, and then the colors I see over here should be independent of the colors I see over there. Is the definition clear?
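To pin the definition down, here it is in symbols (my notation, not the speaker's slide):

```latex
% A stationary k-dependent q-coloring of Z: a random sequence (X_i), i in Z,
% with values in {1,...,q}, such that
\[
X_i \neq X_{i+1} \ \text{for all } i \qquad \text{(proper coloring)},
\]
\[
(X_{i+1})_{i \in \mathbb{Z}} \overset{d}{=} (X_i)_{i \in \mathbb{Z}} \qquad \text{(stationarity)},
\]
\[
A, B \subseteq \mathbb{Z},\ \operatorname{dist}(A, B) > k \ \Longrightarrow\
(X_i)_{i \in A} \ \text{independent of}\ (X_i)_{i \in B} \qquad \text{(k-dependence)}.
\]
```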
Of course, the integer line Z is only one graph among many others, and I'll have something to say about other graphs later; and proper coloring is just one local constraint, and I'll have something to say about other local constraints later as well. But it turns out that this simplest case is also the key case in a certain sense. Does there exist such a thing, a coloring that is k-dependent and stationary? Let's just try some simple, and in some cases foolish, attempts. These are some
things that won't work. Suppose you just try to have identically distributed colors. For each
integer you have some distribution over the colors and you choose them iid, so it's stationary
and it's zero dependent because anything is independent of anything else. But of course, it's
not a coloring, because there must be some color, say color 1, that has positive probability p, and that means you have probability p squared of seeing two 1s next to each other, so that obviously
doesn't work. Let's try something that obviously is a coloring. If you have two colors you can
certainly just have them alternating, one, two, one, two and that's a deterministic sequence;
that's not stationary. But if I want to make it stationary: there are exactly two deterministic two-colorings, where I alternate red, blue or I alternate blue, red, and if I choose one whole sequence with probability half and the other whole sequence with probability half then, of course, I have a stationary process, but it's not k-dependent. It's not anywhere close, because if I know x0, then I know whether the sequence is red-blue or blue-red, so I know xn for any n; there is no way x0 and xn are independent, since one determines the other. So, of course, that doesn't work. Another thing that you might try, which also doesn't work, is just to make the coloring as random as possible, so essentially choose uniformly: choose x0 to be a uniformly random color, and then choose x1 to be uniformly random from the colors that are not the same as x0, and so on in sequence. And then do the
same for the other ones as well and so, of course, this is just a stationary Markov chain with a
transition matrix. And so it's stationary and it's a proper coloring, but of course, it's not k
dependent for any k, because you can just compute, given that you see a 1 at 0, the probability that you see a 1 at n, and it's decaying towards 1 over q exponentially fast, but it never reaches 1 over q, so these two events are never independent. So that's not a k-dependent answer; that won't do.
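As a quick numerical illustration of that computation (a sketch of my own, not code from the talk), one can watch the n-step transition probabilities of this chain approach 1/q without ever reaching it:

```python
# The uniform proper-coloring Markov chain on q colors: from color a, move to
# a uniformly random color b != a.  Its n-step probabilities tend to 1/q
# geometrically but never equal it, so the chain is not k-dependent for any k.
import numpy as np

q = 4
P = (np.ones((q, q)) - np.eye(q)) / (q - 1)  # transition matrix

Pn = np.eye(q)
for n in range(1, 11):
    Pn = Pn @ P
    # deviation of P^n(1,1) from the stationary probability 1/q
    print(n, Pn[0, 0] - 1 / q)  # equals (-1/(q-1))**n * (1 - 1/q), never 0
```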
Since we're still thinking about Markov chains: there are k-dependent Markov chains, so here's a 1-dependent Markov chain, just to understand the definition a little bit better. The Markov chain has four states, and a state corresponds to having two coins in your hands. The step of the Markov chain is: you exchange the two coins and you re-toss one of them, say the coin that winds up in your left hand. Maybe you start with head, head; then in the first step you move this coin down there and re-toss that one, and maybe it becomes tails, and then you have tail, head. That's the step of the Markov chain, and you keep going. So it has some transition matrix. This is a 1-dependent process, because as soon as you have taken two steps you have tossed both of the coins; you have completely forgotten where you came from. So x0 is independent of x2, and in fact all of the past up to x0 is independent of all the future from x2 onward, so it's a 1-dependent process. Of course, it is not a coloring, because you can easily make the transition from head, head to head, head. Of course, that doesn't work.
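Here is a tiny simulation of the two-coin chain (again my own sketch), checking empirically that x0 and x2 decouple:

```python
# State = (left coin, right coin); one step = swap the coins, then re-toss
# the coin that lands in your left hand.  After two steps both coins have
# been re-tossed, so X_0 and X_2 should be independent (1-dependence).
import random

def step(state):
    left, right = state
    return (random.choice("HT"), left)  # swapped, then new left coin tossed

random.seed(0)
n_trials = 200_000
both = x0_hh = x2_hh = 0
for _ in range(n_trials):
    x0 = (random.choice("HT"), random.choice("HT"))
    x2 = step(step(x0))
    x0_hh += x0 == ("H", "H")
    x2_hh += x2 == ("H", "H")
    both += (x0 == ("H", "H")) and (x2 == ("H", "H"))

# Under independence, P(both HH) should match the product, about 1/16.
print(both / n_trials, (x0_hh / n_trials) * (x2_hh / n_trials))
```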
It turns out that no Markov chain will work; no stationary Markov chain can be a k-dependent q-coloring. Why is that? If it's k-dependent, then what's the probability that you see a particular color, say color b, at time k+1, given that you saw color a at time 0? Since k+1 is bigger than k, these two events should be independent, so that should just be the same as the stationary probability of seeing color b; in particular, it's the same as the conditional probability of seeing color b after k+2 steps as well. That means the transition matrix satisfies P to the k+1 equals P to the k+2, and a matrix satisfying this equation has every eigenvalue equal to 0 or 1, perhaps with multiplicities. On the other hand, if it's a q-coloring, then the transition from color a to color a happens with probability zero, so the matrix has zeros along the diagonal, and the trace is zero. But the trace is the sum of the eigenvalues, the eigenvalues are all zero and one, and 1 is always an eigenvalue of a stochastic matrix, so the trace would have to be at least 1. That's impossible. We can't do anything with Markov chains.
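Written out in symbols, the argument just given looks like this (my transcription of it):

```latex
\[
P^{\,k+1} = P^{\,k+2}
\ \Longrightarrow\ \lambda^{k+1} = \lambda^{k+2} \ \text{for every eigenvalue } \lambda \text{ of } P
\ \Longrightarrow\ \lambda \in \{0, 1\}.
\]
\[
\text{Proper coloring:}\quad P_{aa} = 0 \ \text{for all } a
\ \Longrightarrow\ \operatorname{tr} P = 0 .
\]
\[
\text{But } P \text{ is stochastic, so } 1 \text{ is an eigenvalue, giving }
\operatorname{tr} P = \textstyle\sum_i \lambda_i \geq 1 > 0 ,\ \text{a contradiction.}
\]
```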
Furthermore, you can say more. Oded Schramm proved that there's no hidden-Markov k-dependent q-coloring. What does that mean? It means you have a Markov chain on a finite state space, and the finite state space is important; then X is just some deterministic function of the Markov chain. You collapse some sets of states into one, if you want, and that gives you the process. Even that won't work; there's no such thing. This covers many things that you might think of, like k-step Markov chains, where where you go next depends on the last k steps, and so on, and Gibbs measures with local interaction; those are just Markov chains on the integer line. This is proved by ideas similar to the very simple proof I just showed you, but in a more complicated setting.
Oded Schramm, well known to almost everyone here and a central figure of his generation in probability theory, passed away in a tragic hiking accident in 2008. He became very, very interested in the problem I'm talking about, and it was one of his favorite problems in his last few months with us. So this is one of the reasons that I've been so keen to try to make some progress on it, to come up with something that Oded would have been happy to hear, and I think we've succeeded, as I will try to show you.
There's another potential source of examples, and maybe this would be one of the first things you would think of when you think of k-dependent processes: r-block factors. Suppose (Ui) is an iid sequence of random variables on any space, and f is any function which takes r arguments, and we just let Xi be f applied to a block of r consecutive U's, like this. Obviously, you have these independent U's; you take some function of these three to get one X, the same function of those three, et cetera. The process X is what is called an r-block factor of an iid process, and it's automatically going to be k-dependent with k equal to r minus 1, because if the blocks don't overlap each other, then certainly the X's over here are independent of the X's over there. For instance, the coin example I showed you is actually a 2-block factor of a sequence of coin tosses, because you can think of it another way: just think of a sequence of fair coin tosses and look at the pairs. This is head, head;
this is head, tail; and it turns out that's the same process. There are other examples which are not Markov chains, which are really new types. A simple example: just take iid random variables, uniform on the interval from 0 to 1, and see which way the inequalities go between them; write down a zero if an inequality goes one way and a one if it goes the other way. That's kind of an interesting stationary process, and it's really a different type: it's not a Markov chain or a hidden Markov chain. But of course, it's not a coloring, because you can perfectly easily have two ones next to each other.
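Here is that inequality-direction process as a few lines of code (my sketch; the talk only describes it verbally):

```python
# X_i records which way the inequality between U_i and U_{i+1} goes.
# It is a 2-block factor of an iid sequence, hence 1-dependent, but it
# is visibly not a proper coloring: adjacent values can coincide.
import random

random.seed(1)
n = 30
u = [random.random() for _ in range(n + 1)]   # iid uniforms on [0, 1]
x = [int(u[i] < u[i + 1]) for i in range(n)]  # X_i = f(U_i, U_{i+1})
print(x)  # runs of equal symbols occur
```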
An r-block factor is an (r-1)-dependent process because, as I said, if the blocks are far enough apart then the random variables are independent. You might wonder whether these are the only k-dependent processes, whether there is an implication in the other direction. This has nothing to do with coloring right now; it's just about the difference between block factors and k-dependent processes, and this question has kind of an interesting history. The answer is no.
There are counterexamples. This was stated without proof in a Russian book by Ibragimov and Linnik in 1965, where they proved that for Gaussian processes the two notions are equivalent [indiscernible], and the question received quite a lot of attention. It was explicitly stated as an open question by a number of people, and finally the first published solution, at least, was by Aaronson, Gilat, Keane and de Valk in 1989. They came up with a counterexample, but it was a little bit of a strange, indirect proof, and they asked for a natural example. And then there's been a whole sequence of papers of people trying to come up with better, more natural counterexamples, such as an explicit Markov chain that people came up with; one breakthrough was when Burton, Goulet and Meester found a 1-dependent process that is not an r-block factor for any r and that is a hidden Markov chain. And there have been another 20 papers of people trying to really understand what the difference is here and come up with more natural examples. But the general belief, I think, was summed up by Borodin, Diaconis and Fulman, particularly for 1-dependent processes: they said it appears the most natural 1-dependent processes are 2-block factors. So all these counterexamples that people came up with were kind of contrived, and it seemed like the only examples anyone could come up with were ones specifically constructed for the purpose of being counterexamples. I think
that's a fair summary. This is interesting because it turns out block factors will not help with
our question. No block factor coloring exists for any r and any q and this is one of those results
where it's a little bit hard to know who to attribute it to. There's essentially an equivalent result
proved by Moni Naor in 1991, although stated in a very different language than our own. You
can argue it's really a variation of Ramsey's theorem, but anyway, it's a fact. It's a very clean statement and an interesting, perhaps little-known statement. If you have any iid random variables, and you have a function of r arguments which takes finitely many discrete values 1 up to q, and you look at the event that f of U1 up to Ur is equal to f of U2 up to Ur+1, then the probability of that event is strictly positive for any function. Moreover, there's an explicit lower bound that you can write there, a very, very small tower-function lower bound, but it is strictly positive, and this tower-function form is essentially tight; there are examples that are more or less like that. Let me show you why this is true. I won't prove the tower function; I won't even show you the full proof, but I'll just tell you the idea of it from this slide, and if you went away and spent 5 minutes you could reconstruct the proof. I'll just prove that the probability is positive.
You have a function and you have iid random variables, and the claim is that this probability is strictly positive; the proof is by induction on r, the length of the block. The base case r equals 1 is easy, because the blocks just have length 1, so f of U1 and f of U2 are independent discrete random variables, and that's the argument I showed right at the beginning. The inductive step is the following; at least, here is the key step, and once you've done this it's not hard to finish the proof. I'll define a set-valued function of r minus 1 arguments, and what it is is: you look at f, you fix the first r minus 1 arguments, you allow the last one to be random, and you look at which values, which colors a, this can take with positive probability. You look at the set of all those colors when Ur is random, and that is a set-valued function of r minus 1 variables; in particular, it takes at most 2 to the q values, because its value is a subset of the numbers 1 up to q. You think of this as a function of a block of r minus 1 variables, apply the inductive hypothesis to that, and that essentially does it. And you can see why you would get this tower-function bound, because in the inductive step you've gone from q colors to 2 to the q, and you have to do this once for each step in the induction.
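In symbols, the key inductive step might be recorded as follows (my paraphrase of the sketch, with g my name for the set-valued function):

```latex
% f takes r arguments and q values; define a set-valued function of r-1 arguments:
\[
g(u_1,\dots,u_{r-1}) \;=\; \bigl\{\, a \in \{1,\dots,q\} :
  \Pr\bigl[f(u_1,\dots,u_{r-1},U_r) = a\bigr] > 0 \,\bigr\}.
\]
% g takes at most 2^q values, so the inductive hypothesis applies to it:
\[
\Pr\bigl[\,g(U_1,\dots,U_{r-1}) = g(U_2,\dots,U_r)\,\bigr] \;>\; 0 ,
\]
% and on that event the two length-r windows share a common achievable color,
% from which the case r follows.  Iterating q -> 2^q once per step of the
% induction is what produces the tower-function bound.
```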
Back to the question: is it possible that there could be a stationary k-dependent q-coloring? It's really looking like the answer ought to be no at this point, because we've tried Markov chains and hidden Markov chains, and they cannot work, and we've tried block factors, and they cannot work, so that only leaves these unnatural counterexamples, the k-dependent processes that aren't block factors, which people really struggle to come up with. It seems very unlikely. Oded actually proved that it's not possible in the first nontrivial case of k and q, which is 1-dependent 3-colorings; we saw right at the beginning that two colors aren't enough, but three colors and 1-dependent also doesn't work.
Oded conjectured, and I think all of us who were thinking about the problem at the time agreed with this conjecture, that it's not possible for any k and q. Oded even reduced it to a problem of functional analysis, saying that if this weird thing exists, then there's a Hilbert space with some strange properties; and, indeed, there's no finite-dimensional such space. Nazarov and Denisov came up with a complex Hilbert space example, but it has to be a real Hilbert space, so that doesn't help. This was where the problem rested for quite some years, until I started talking to my coauthor Tom Liggett at UCLA, and I hope he doesn't mind me sharing this story. I told him about
the problem and he was kind of interested. After a week or so he said I think I can prove
Oded’s theorem that there is no 1 dependent three coloring. He proved it himself, so I said
great. That sounds good. And he said so now I'm going to look at 4 coloring and a week or two
later he said I think I can also prove that there's no 1 dependent 4 coloring and now I'm going to
move on to 5 colors and I said that sounded good. Then a few weeks later he said, you know, I
think there was a mistake in that last proof and now I think there is a 1 dependent 4 coloring
and I was very skeptical about this. We managed to come up with an incredibly complicated formula, which I won't go into, but if you're interested in Dyck words and linear extensions of posets then I can tell you about it after this; it's kind of interesting. What this purports to be is the cylinder probability for the 1-dependent 4-coloring, the finite-dimensional distributions. Some weird formula, and we were able to prove that it satisfies what it needs to satisfy for 1-dependence and stationarity and being a 4-coloring, but what we couldn't prove was that it was non-negative. It has to be non-negative if it's a probability, and of course that's always the hardest thing, particularly because it's an inequality, and inequalities are much harder than equalities. And you see it's an alternating sum, so it's not at all obvious. And we struggled
for three years over this. We had all sorts of special cases that we could prove. We checked
billions of cases by computer and so on but we really couldn't do it until we found another way
which I'm going to tell you about. There is a 1-dependent 4-coloring. This has a number of consequences. In particular, for this old question about the difference between block factors and k-dependent processes, there's now a beautifully clean answer: coloring can be done by 1-dependent processes, but not by any block factor, so that's very interesting. It also answers the question about hidden Markov processes, and this weird Hilbert space exists; I don't know
whether anyone cares about that now. This previous complicated formula I showed you is also
correct and that leads to a number of combinatorial identities which I don't know how to prove
any other way, interesting combinatorial identities, so that's the coloring. Here are some weird
properties that it has. This is a very, very strange process. We don't understand what's going
on fully. It's symmetric under permutations of the colors; maybe that's not such a big deal. If you exchange red and green, or permute the colors any way you want, you get a process with the same law. Here's something interesting. If you just look at the 1s, so if you just look at the reds, say, as a process, the indicator of where you have a red dot, then that's the following process: you take fair coin tosses and you look at the places where you see head, tail in sequence. A perfectly nice process, obviously 1-dependent, because it's a 2-block factor. That's what the reds are. If you look at the 1s and 2s, that is, the places where you see red or purple, the indicator of that event, that gives you a process of zeros and ones, and that's the process I told you about earlier, where you take iid uniforms and you look at which way the inequalities go. Very strange.
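For concreteness, here is that head-tail indicator process (my sketch of the claimed law of a single color class, not the speaker's code):

```python
# The positions of one color in the 4-coloring are claimed to have the law of
# the positions i where fair coin tosses show (C_i, C_{i+1}) = (H, T).
# Note that two HT-patterns can never be adjacent, so no two 1s are adjacent.
import random

random.seed(4)
coins = [random.choice("HT") for _ in range(20)]
reds = [int(coins[i] == "H" and coins[i + 1] == "T") for i in range(19)]
print("".join(coins))
print(reds)  # a 1-dependent indicator process (it's a 2-block factor)
```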
Because of the symmetry, this holds for any pair, so in particular, try to think of any process on
4 colors so that if you look at any pair of them, ones and twos or ones and threes et cetera, you
get this process. I know of no other example, and possibly this characterizes the process; we don't have a simple proof of this, really. We conjecture that the 1-dependent 4-coloring is unique, actually: that any stationary 1-dependent 4-coloring is equal to this first one. Nevertheless, despite all these strange mysteries, I'll tell you how to construct the
process. It's very simple. I'll tell you two ways to do it. First one, start with just iid uniform
colors, uniform on 1 to 4. We know this is a silly thing to do, of course, but now I'm going to
condition on something and I'm not just going to condition on the event that this is a proper
coloring. What I'm going to do here is I'm just going to tell you the finite dimensional
distribution. I'll tell you the joint distribution of x1 up to xn and it will turn out that this is a
consistent family of distributions. First of all, use iid uniform colors; now I'm going to condition on a certain event, and the event is not just the event that I have a proper coloring. That would be too easy; that would give me the Markov chain that I started with in the beginning. Instead, I also choose an independent uniformly random permutation of 1 up to n, and I think of these as arrival times. I'm going to condition on the event that at each time, the subsequence that has arrived so far is a proper coloring, and I'll show you what I mean with an example. Here are my
colors and then they arrive at these random times, so at time 1 all I see is blue. At time 2 I see
blue red because the red arrived. At time 3 I see red, blue red, which is good. At time 4 I see
red, blue, red, red, and that's not good, because that's not a proper coloring: I have two reds next to each other. Even though there were some other things that were going to arrive in between, it doesn't matter. So I condition on the event that at every time, the subsequence that I see so far is a proper coloring. In this case, because I have two reds next to each other at the end, no matter what permutation I chose it wouldn't work; but if the whole sequence were a proper coloring, it might or might not work, depending on the permutation. That's it. I condition on that event, and that gives the correct law of x1 up to xn. It's obviously a coloring, but is it obvious to you that this is a 1-dependent process? It certainly isn't to me.
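Here is a direct rejection-sampling sketch of this first construction (my own code, not the authors'; function names are mine):

```python
# Sample iid uniform colors and an independent uniform arrival order, then
# condition on every prefix of arrivals forming a proper coloring.
import random

def arrivals_ok(colors, order):
    """Each prefix of arrivals, read left to right, must be a proper coloring."""
    arrived = [False] * len(colors)
    for pos in order:                       # positions in order of arrival
        arrived[pos] = True
        sub = [c for c, a in zip(colors, arrived) if a]
        if any(x == y for x, y in zip(sub, sub[1:])):
            return False
    return True

def sample(n, q=4):
    """Rejection sampling; fine for small n, hopeless for large n."""
    while True:
        colors = [random.randrange(1, q + 1) for _ in range(n)]
        order = random.sample(range(n), n)  # a uniform arrival order
        if arrivals_ok(colors, order):
            return colors

random.seed(2)
print(sample(8))  # one draw of (x_1, ..., x_8); always a proper 4-coloring
```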
Here is a second construction, an equivalent way to do it; it's not too hard to show that it's equivalent to what I just showed you. Start with a single uniformly random color. Now I'm going to insert extra colors: maybe I insert a red to the left, and maybe I insert a blue in between in the middle, and maybe I insert a red there. What is the rule? The rule is: at any time you have these possible slots where you can insert, the endpoints or the internal slots, and how many possible colors are there that you can insert? At an end there are always three possibilities, if you want to maintain a proper coloring, and in the middle there are always two, because the two colors on either side are different, so there are two things that can go there. You choose uniformly from these possibilities, so you choose the position with probability proportional to the numbers 3, 2, 2, 2, 3, and then once you have chosen the position you choose a uniformly random allowed color. Again, that's it. You do that for n steps, you get a random vector of length n, and it turns out these are consistent distributions, and they are the ones I want.
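And here is the insertion construction as code (again my own sketch; with q = 4 this should give the 1-dependent 4-coloring, and with q = 3 the 2-dependent 3-coloring mentioned below):

```python
# Grow the word one symbol at a time: pick an insertion slot with probability
# proportional to the number of colors allowed there, then a uniform allowed color.
import random

def grow(n, q=4):
    word = [random.randrange(1, q + 1)]       # start with one uniform color
    while len(word) < n:
        # slot j = insert between word[j-1] and word[j]; j = 0 and j = len(word)
        # are the two ends.
        slots, weights = [], []
        for j in range(len(word) + 1):
            forbidden = set()
            if j > 0:
                forbidden.add(word[j - 1])    # left neighbor
            if j < len(word):
                forbidden.add(word[j])        # right neighbor
            allowed = [c for c in range(1, q + 1) if c not in forbidden]
            slots.append((j, allowed))
            weights.append(len(allowed))      # 3 at the ends, 2 inside, for q = 4
        j, allowed = random.choices(slots, weights=weights)[0]
        word.insert(j, random.choice(allowed))
    return word

random.seed(3)
print(grow(12))        # a proper 4-coloring of length 12
print(grow(12, q=3))   # the 3-color version
```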
I don't have time to talk about the proof, but the proof that this works is quite simple; it's only a page or two. You can express this probability as the number of what we call buildings, which are the permutations that work for that particular coloring, divided by the sum of all of them, and the denominator is something innocuous. And there is this lemma: the number of buildings of x is equal to the sum of the numbers of buildings that you get by deleting the elements one at a time. That makes sense if you think about it, because you just look at what the last element added was. And then you just have to prove this: if you sum over this color in the middle, with a word here and a word here which are supposed to be independent, basically you are trying to show that this is equal to that product, and you just expand using the lemma, split into some cases, use induction, and it just works. So what about other k and q? Here is a very easy way to get a 3-dependent 3-coloring.
Start from the 1-dependent 4-coloring; eliminate color 4 wherever I see it and replace it with something else, namely the smallest color that is not represented among the two neighbors. This is a block factor of the original coloring, a 3-block factor, because what I put here only depends on myself and the two neighbors. If you think about it, that means it's 3-dependent, and it's a 3-coloring.
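As code, the reduction looks like this (my sketch, on a finite window rather than all of Z):

```python
# Replace every 4 by the smallest color not used by its two neighbors.
# Since 4s are never adjacent in a proper coloring, the result is a proper
# 3-coloring, and it is a 3-block factor of the input.
def drop_fours(x):
    out = []
    for i, c in enumerate(x):
        if c == 4:
            neighbors = {x[i - 1] if i > 0 else None,
                         x[i + 1] if i + 1 < len(x) else None}
            c = min(a for a in (1, 2, 3) if a not in neighbors)
        out.append(c)
    return out

print(drop_fours([2, 4, 1, 3, 4, 2]))  # -> [2, 3, 1, 3, 1, 2]
```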
Now, we know that a 1-dependent 3-coloring can't be done, so that only leaves the 2-dependent 3-coloring, and we were getting all ready to say that this is an interesting problem when we realized that it can be done as well. The astonishing thing, and this just blew me away, is that the same process works. These two constructions that I gave you, which are equivalent, can be done with any number of colors; you always get a consistent family of distributions, it turns out, so you get a coloring on the integer line. With four colors it's 1-dependent. With three colors it's 2-dependent. With any other number of colors it is not k-dependent for any k. Very strange; there is something absolutely magical about three and four which we still don't fully understand. There are very few other k-dependent colorings known. You can do
embellishments of these like the thing I just showed you with the three colors and there's one
more family of examples which I won't have time to talk about. An interesting property of the 3-coloring as well is that this 3-coloring I just told you about is just the 4-coloring conditioned to have no 4s, at least on a finite interval. Why should that work? I have no idea, but it does. Why should it be that if you take a 1-dependent 4-coloring and you condition it to have no 4s, you get a 2-dependent process? No idea, except that we have the proof. If you look at the locations of a single color, say 1, in the 3-coloring, those have a beautiful form as well: you take these iid U's and you look at the peaks, the places where the U is a local maximum. It turns out that's what the process is. So remember this strange thing about the 4-coloring, that 1s and 2s have a nice interpretation in terms of the U's; you would really think there was some connection between these three facts, that the 3-coloring is what you get by conditioning the 4-coloring, this nice property with the U's, and this other nice property with the U's. I have proofs of each of the three things, but they are completely different, and we don't understand what the connection is. I'll skip that. I will skip this. I'll just say a word.
One can define more general local systems of constraints. Coloring is just one constraint; you can say, on windows of length 3, I'm only allowed to see a given finite list of words. That gives you what's called a shift of finite type. You can extend some of these things to shifts of finite type: if you have any system of constraints like that, throwing aside some degenerate cases that don't matter, then you basically have the same thing. You can satisfy it with a k-dependent process, and you cannot satisfy it with a block factor. So there's an even more remarkable answer to the old question: for any shift of finite type, if you want your process to do anything at all nontrivial, then you can do it with a k-dependent process and you cannot do it with a block factor.
We can say some things about Z^d as well. I have lots of open questions. Uniqueness. How many colors do you need on Z^2 for a 1-dependent coloring? We know it's between nine and 16.
Can you express one of these colorings as a function of a countable-state Markov chain? We know you can't do it as a function of a finite-state Markov chain; on the other hand, any process can be expressed as a function of an uncountable-state Markov chain. Shifts of finite type in d dimensions are a big, complicated area; we didn't have much to say about that. I guess the bigger question is just: what's going on here? Is there another way of thinking about these processes that makes the properties more obvious? I'll stop there. [applause]
>> David Wilson: With that, are there any questions?
>>: Do I understand correctly that your construction doesn't work for 5 colors because it's
not k dependent?
>> Alexander Holroyd: Right. The construction works, so it gives you a stationary 5-coloring, but it
is not k dependent for any k.
>>: Do you have a construction then for larger numbers of colors?
>> Alexander Holroyd: Great question. Yes. First of all, there are some trivialities, so a 4
coloring is a 5 coloring, so done. If you want positive probability of each color then you can just
split a color. Whenever you see color 4 you toss a coin and make it color 5 some of the time.
But you might ask: is there a symmetric 5-coloring, for example, symmetric under permutations of the colors? The answer to that is yes. This is one of the things that I did not have time to say, but we managed to come up with a family of symmetric 1-dependent q-colorings for all q greater than [indiscernible], and this we understand even less than the other thing, if that's possible, but there is a proof. There's a much more complicated recursive formula for the probability, in terms of the things that you get by deleting one symbol at a time, involving Chebyshev polynomials, which can also be expressed in terms of hyperbolic functions with this weird parameter. Yes, but we don't really understand why it works.
>>: [indiscernible]
>> Alexander Holroyd: Yeah, symmetric under all permutations.
>> David Wilson: Any other questions or comments? So this formula doesn't make sense for smaller q, is that the thing?
>> Alexander Holroyd: Right. [indiscernible] or something like that.
>>: It seems that T is not real.
>> Alexander Holroyd: The T is real. A good question, but [indiscernible] is exactly where it
breaks down from here.
>> David Wilson: If there are no more questions or comments, let's thank Alex again.
[applause]