>> Yuval Peres: Hi everyone. It's a pleasure to welcome Fabio Martinelli to lecture on this topic of
Glauber dynamics for the Ising model, which is now one of my favorites, but I first learned about it
from Fabio's lectures in Saint-Flour in 1997. I'm very delighted that he's going to tell us some of the
latest developments today. Thank you.
>> Fabio Martinelli: Thank you, Yuval, for this invitation and for a super nice day at Microsoft
Research.
So I'm going to tell you about some exciting new progress on this topic. The general
overview: we will talk about the Gibbs sampler, or Glauber dynamics, for spin models.
I'm sure many of you have heard about this topic here. There is a general
dichotomy: if you consider the mixing time, then optimal mixing, like n log n, or at least poly(n)
bounds on the mixing time, are possible if the system has some kind of so-called spatial mixing
property, which roughly speaking means that far-away regions behave almost
independently.
The interesting question, which has been open for many years with very slow progress, is
what happens, and which tools are available to analyze the mixing time, when spatial mixing
is not present.
That is the typical situation when the system has multiple phases, so there
are bottlenecks and so on, and to understand the evolution of the
Glauber dynamics deeply really requires a detailed analysis that in many situations is not
possible today.
Okay. But I want to talk about a case study. The case study is the Ising model on a finite
grid, L by L. The L-by-L box is the gray one.
Those are the inner vertices. Outside there are boundary vertices, and to each vertex is
attached a variable that takes values plus or minus 1. This variable is the spin
from physics, so sigma_x is the spin attached to an inner vertex x.
On the boundary vertices one can fix some spin configuration, usually called tau
to distinguish it from the inner spins, and these are imagined to be frozen once
and for all.
They are assigned certain values, plus or minus 1, according to what one really wants to study. And,
okay, then given an assignment sigma of the internal spins, and given the boundary spins
tau, fixed once and for all, the energy of this assignment is given by minus the sum of the
products of spins sigma_x sigma_y, where x, y are the endpoints of a bond
inside the gray box.
And then there is a second interaction, an energy term that connects the inner spins to the
boundary vertices.
Of course, this sum runs only over bonds that connect the inner box to the boundary. Okay.
Then the probability distribution over spin configurations is called the Gibbs measure.
But sometimes this is a scary word.
Simply, the probability of a spin assignment sigma -- I put the tau here just to remember that there
is this boundary condition on the outside -- is proportional to the exponential of minus beta, some
positive parameter that you choose, times the energy of the assignment.
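To make the definitions concrete, here is a minimal sketch (my own illustrative code, not from the talk's slides; the names `energy` and `gibbs_weight` are mine) of the Hamiltonian with frozen boundary spins tau and the corresponding unnormalized Gibbs weight:

```python
# Illustrative sketch: Ising energy on an L x L inner box with frozen
# boundary spins tau, and the (unnormalized) Gibbs/Boltzmann weight.
import math

def energy(sigma, tau, L):
    """H(sigma) = - sum over inner bonds of sigma_x * sigma_y
                  - sum over inner-boundary bonds of sigma_x * tau_y.
    sigma: {(i, j): +-1} on inner vertices 0 <= i, j < L.
    tau:   {(i, j): +-1} on boundary vertices just outside the box."""
    E = 0.0
    for (i, j), s in sigma.items():
        for di, dj in ((1, 0), (0, 1)):        # each inner bond counted once
            if (i + di, j + dj) in sigma:
                E -= s * sigma[(i + di, j + dj)]
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (i + di, j + dj) in tau:        # bond to a frozen boundary spin
                E -= s * tau[(i + di, j + dj)]
    return E

def gibbs_weight(sigma, tau, L, beta):
    """Unnormalized Gibbs weight exp(-beta * H(sigma))."""
    return math.exp(-beta * energy(sigma, tau, L))
```

For example, in a 1-by-1 box with all-plus boundary, the single plus spin has energy minus 4, so its Gibbs weight is exp(4 beta).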
That is what is usually called the Boltzmann weight. So, a quick review of what is known for
this model: there is a critical value.
It is one of those exactly solvable models on the grid, so beta critical takes an explicit value,
and if beta is smaller than this critical value, then essentially the boundary spins are irrelevant:
there is a strong decay of correlations among the spins. Irrelevant means the following.
If you look, for example, at the marginal distribution of the spin at the center of the box, and
you send L, the side of the box, to infinity, then whatever you put around the box does not produce
any bias.
Okay. Instead, if beta is larger than beta critical, then the boundary spins
have an effect even when L goes to infinity. For example, if they are all plus, they put a bias
towards plus 1, and if they are all minus 1, by symmetry, towards minus 1.
So it is clear that correlations are not really decaying; there is the appearance of long-range
order. Okay. If beta is exactly at beta critical, then there is a big story with SLE, and this is probably
one of the best places in the world to hear it.
Okay. So what we want to look at is the Gibbs sampler, or Glauber dynamics, which is just the standard one.
In discrete time, you pick uniformly at random one of the inner vertices and you resample
its spin: you put a new value, plus or minus 1, drawn from the conditional distribution at x given the
current configuration around it.
This conditional probability is actually very simple to compute explicitly. Continuous time is
exactly the same, except that you do the resampling at the vertex x according to a Poisson clock
of rate 1.
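A heat-bath update of this kind can be sketched as follows (illustrative code, names mine; for the Ising model, the conditional probability of a plus spin given a neighbor field h is 1 / (1 + exp(-2 beta h))):

```python
# Illustrative sketch of one Glauber (heat-bath) update at a vertex x.
import math
import random

def p_plus(h, beta):
    """Conditional probability that the resampled spin is +1, given the
    sum h of the neighboring spins (frozen boundary spins included)."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * h))

def heat_bath_update(spins, x, beta, neighbor_values):
    """Resample the spin at x from its conditional Gibbs distribution,
    given the current values of its neighbors."""
    h = sum(neighbor_values)                  # local field at x
    spins[x] = 1 if random.random() < p_plus(h, beta) else -1
    return spins
```

In the continuous-time version, each vertex simply performs this update at the ring times of its own rate-1 Poisson clock.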
In this talk I will choose the continuous-time version; so, roughly speaking, each spin gets
updated once per second. The mixing time for this Gibbs sampler: you start from some configuration, you
compute the distribution that you get at time t, you compare it in total variation to the
equilibrium distribution, the Gibbs measure, you take the worst case over the initial
condition, and you ask how long you have to wait so that this distance is, say, less than
one-fourth.
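The definition just given can be written down directly for any small finite-state chain (an illustrative discrete-time sketch, not tied to the Ising model; the names are mine):

```python
# Illustrative sketch of the mixing-time definition: the smallest t such
# that the worst-case total variation distance to stationarity is <= eps.

def tv_distance(p, q):
    """Total variation distance between two distributions given as lists."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(dist, P):
    """One step of the chain: push the distribution through matrix P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def mixing_time(P, pi, eps=0.25, t_max=10_000):
    n = len(P)
    # one distribution per deterministic starting state (worst case over starts)
    dists = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    for t in range(1, t_max + 1):
        dists = [step(d, P) for d in dists]
        if max(tv_distance(d, pi) for d in dists) <= eps:
            return t
    return None
```

For instance, for the two-state chain that flips with probability 0.1, the distance from the worst start after t steps is 0.5 * 0.8^t, so the quarter-mixing time is 4 steps.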
Okay. So, quickly, the results. If beta is less than the critical value --
>>: [inaudible] [laughter].
>> Fabio Martinelli: What?
>>: Never mind.
>>: You have the I --
>> Fabio Martinelli: Oh, this. [laughter].
>>: Two different things.
>> Fabio Martinelli: Ah, okay. Sorry, I always mix those up. Anyway, if beta is
smaller than the critical value, then the mixing time is of order log L, which in discrete time means
n log n, with n the number of spins, so optimal.
And actually, recently there has been a very beautiful and rather surprising result: it is
possible to prove the cutoff phenomenon, namely to really pin down the mixing time as a constant times log L, with the
sharp constant, plus a window of order log log L. And that was beautifully proved by Eyal and
Allan here.
Again by the same group of people: if beta is critical, then the mixing time, as expected by physicists
from numerical simulations, is polynomial. It is very interesting that the missing ingredient needed to really
attack this problem comes from SLE.
They get, of course, a polynomial with a certain power that is not really the power
predicted by physicists, but still, this had been an open problem for many, many years, so it is quite
nice.
Instead, if beta is larger than critical and there are no boundary conditions -- namely, the tau
spins are absent -- then the mixing time is exponential in L, with a sharp constant that depends on beta.
So you see that this critical value is a signature: we go from log L to
exponential in L, passing through polynomial. That was the picture so far.
Okay. So what is the role of the boundary condition when beta is bigger than beta critical?
Without boundary conditions the mixing time is exponential in L, as we just saw.
And instead, if the boundary conditions select a phase -- so, if you look at the
energy of a spin assignment, then without boundary conditions the spins want to be equal:
to minimize the energy, they are either all plus or all minus.
So if the boundary condition tells the system it should prefer plus, for example --
>>: Go back one slide. I feel a bit uncomfortable with the first bullet. We just saw this kind
of phenomenon, whatever the spatial mixing, and Fabio --
>> Fabio Martinelli: Come on.
>>: -- and a group of people showed that this holds.
>> Fabio Martinelli: His story. That's his story. [laughter].
>> Fabio Martinelli: Okay.
>>: [inaudible].
>> Fabio Martinelli: When I was young. [laughter].
Okay. So the idea is that if the boundary conditions select a phase, they break this
multi-modal -- actually bimodal -- structure of the Gibbs measure, and therefore, since you
almost start with unimodality, you expect the Gibbs sampler to mix much faster.
A few years ago there were rigorous results on this problem by Alistair Sinclair,
Dror Weitz and myself, on trees and on related graphs, and the result is that the mixing time
indeed goes back to optimal.
Sorry, this is a typo on the slide; this should be L. But anyway. So what happens in two dimensions
when beta is bigger than beta critical and the boundary spins are all pluses? This was already
illustrated by a simulation of the analogous problem in three dimensions at zero temperature.
It is something like this: if you start from all minuses -- the wrong situation -- then you
should go to mostly pluses, the correct equilibrium.
So there should be something like a bubble of minuses, whose exact form is unknown, that shrinks under
the pressure of the surrounding pluses.
And one expects polynomial mixing time. Okay. That is the macroscopic picture. If you really
look at the boundary of this bubble of mostly minuses, it is some line,
some contour, that separates pluses from minuses. There are overhangs and so on and so forth.
So the key problem is really to understand the evolution of these interfaces under the Glauber
dynamics, and until recently the best bound, which I proved many years ago, was that the mixing time is at most
exponential of a little bit more than root L. Okay. And actually that was proven for beta much larger than
the critical value; there was some other work, with less sharp results, for beta just bigger than
the critical value.
Okay. So, still very far from polynomial -- better than exponential in L, but still very far from
polynomial. Now, the key tool -- because I want to get as quickly as possible to the newest results --
is block dynamics, or, as I would actually use today, censoring: the Peres-Winkler censoring inequality.
Censoring means the following, in two words. If you start from the maximal configuration,
namely all pluses, and you want to go down to equilibrium, and you decide to reject certain updates at
certain times in certain regions -- which you have to pre-assign, without knowing the actual
evolution -- then you will actually be farther from equilibrium than if you had accepted
those moves.
So this censoring tool opens the way to saying: first I run the dynamics here, then there,
then in another place, and so on. So you can try to guide the Glauber dynamics towards
equilibrium according to some schedule that you prepare in advance.
So the schedule here would be the following. Imagine that you split your box into long
rectangles whose width is a little bit more than root L.
Okay? And you start from all minuses. Suppose now that you run the dynamics for a very
long time only in this initial rectangle, in such a way that after this time you are at equilibrium in
this rectangle. Its boundary conditions are plus, plus, plus on the outer sides and minus on the
inner side, because there you started from minuses.
Well, then the minuses must be separated from the pluses by a line -- very badly drawn here, but
this red line really separates the pluses from the minuses.
The key point is that this line behaves like a random walk conditioned to come back to 0. If you
believe that -- and it is really a big part of the equilibrium probability estimates to prove that it really
behaves like a random walk -- then clearly the fluctuations above the horizontal line are about root
L, and therefore the red line must stay very well below this mid-line, because I added
this epsilon to the width.
So after the mixing time of this rectangle, above the red line you are essentially at equilibrium, in the
correct, almost-all-plus phase. And if you start now again from the second rectangle -- this
violet rectangle here -- on top of it you will see basically pluses, which is exactly the same situation as
the first one. So you can start to repeat.
If you repeat, you bring this red line down to this level, then to another level, and so on down.
Okay? And then you have reached equilibrium. Of course, this is super sketchy, but exactly this
is the line of the proof.
So how long does it take? Well, essentially you have to do this many steps -- the number of
rectangles here is a little bit less than root L -- and each one takes you a time; okay,
now you have to make some mixing time estimate for the Gibbs sampler in such a rectangle. For
example, you could use canonical paths, the canonical path technique of Jerrum and Sinclair, and this tells
you that the worst case is exponential in the shortest side, and that explains this result.
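As a back-of-the-envelope check (illustrative only; the constants `c` and `eps` are placeholders I chose, not from the talk), the schedule uses about L^(1/2 - eps) strips, each equilibrating in time exp(c * L^(1/2 + eps)) by canonical paths, so the logarithm of the total time is dominated by the per-strip term:

```python
# Illustrative arithmetic for the censoring schedule's time bound:
# (number of strips) times the canonical-paths bound per strip.
import math

def log_sweep_bound(L, eps=0.1, c=1.0):
    """Logarithm of (number of strips) * exp(c * strip width)."""
    width = L ** (0.5 + eps)          # strip width, a bit more than sqrt(L)
    n_strips = L / width              # about L^(1/2 - eps) strips
    return math.log(n_strips) + c * width
```

For large L the width term dominates the log of the number of strips, recovering the exp(L^(1/2 + eps)) bound mentioned above.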
Okay. So what we really need to improve, in order to go beyond this exponential of a little more than
root L, is the mixing time of a rectangle with plus, plus, plus and minus boundary conditions. In the SOS
approximation -- the solid-on-solid approximation is where you replace the contour separating the pluses from the
minuses, which has overhangs and all sorts of complications, by the
graph of a discrete function. Like that.
So with Alistair Sinclair we analyzed this problem, and you can apply the
coupling techniques introduced by David Wilson to study lozenge tilings, plus other work, and you get a
polynomial bound with almost the optimal power: 2.5 instead of 2.
So really the problem is that the true contour does not look like that. It is
more complicated. So the problem was still open. A year ago, with Fabio Toninelli, we made the first step
forward towards a deeper understanding.
And, okay, so the first idea is that if you want to improve the bound, you have to accept the fact
that you are not able to work with fixed plus or minus boundary conditions; you have to accept
working with random boundary conditions.
So we need to consider random boundary conditions tau with some law, and there are two cases. In
case A, I order the boundary sides as north, east, south, west, and I say that the law P is
like plus, plus, plus if its marginal on, say, the northern side (and similarly on the eastern side,
and so on) is as positive as the Ising plus phase. The reference distribution, which
is obtained by taking a box with plus boundary conditions and sending the box to infinity, is a
certain limiting distribution; this is what is called the plus phase. The requirement is that the marginal
is as positive as the one obtained from this plus phase.
>>: [inaudible] or just more?
>>: But if it was really in the plus phase, then the north and east sides would be
dependent?
>> Fabio Martinelli: Yeah, I only ask about the marginals. I just ask that the marginal here is
above the plus phase; and "minus" means that you are more negative than the minus phase, which is just the
opposite. Okay. So the theorem that we prove is the following. You fix a little error
epsilon, and you stay very well away from the critical point: beta much larger than beta critical.
Then, in both cases A and B -- P here is the probability over the random boundary
condition, and P is either more plus than the plus phase everywhere, or more plus on three sides and
more minus on the southern side -- the probability that the mixing time, which of course
depends on the random boundary condition and so is a random variable,
is bigger than exponential of L to this little power epsilon is very small. It is very
surprising that this result is actually almost optimal; it is not exactly optimal because of these two.
And the reason is that some years ago Alexander and Yoshida
proved that if you have exactly plus boundary conditions, but at a corner you put a little bit of minuses -- this many
minuses around the corner -- then the mixing time becomes exponential in that length.
And the probability of such a boundary condition is only exponentially small in that same length, so we are
pretty close. And then we also proved that if instead you start from all pluses, then you mix very
fast.
So I would like to give a sketch of the proof, which of course is a sketch with many sketches inside.
Okay. So the main idea is to bound the mixing time by a recursion, essentially doubling the scale.
Whenever I write plus, plus, plus and minus, I really mean that the boundary spins are random and
their distribution satisfies the monotonicity that I mentioned.
In the extreme cases they are exactly all pluses or all minuses, but those are only the
extreme cases. And the main problem is that at each step of the recursion I have to reproduce, on the
smaller regions, boundary conditions that satisfy the assumption of being more positive than the plus
phase or more negative than the minus phase.
Censoring is absolutely essential. So, as I will show you, first we tell one part of the space to do
something, then a second part to do something else, and so on and so forth. Now, in the
recursion we need a time scale that we want to prove is not more than exponential of L
to the epsilon. For this time scale we choose, for example, the following definition: we take
the average, over starting from all pluses and from all minuses, of the total variation distance of the Gibbs
sampler from the equilibrium distribution.
And we wait for the time T such that this average is less than a small constant
delta. In the recursion, both delta and T will depend on the scale and will change going
from L to 2L to 4L and so on and so forth. And of course we have to keep track of everything in such a way
that the final result is the correct one.
Okay. So in the approach there is an easy bit and a hard bit. Let me describe the easy
bit. The recursion goes from L to 2L + 1. Okay. So
here is the precise scheme. We have a rectangle of base 2L + 1 whose vertical side is (2L + 1) to the power
one-half plus epsilon.
Okay. So the first thing is the following: we want to estimate the mixing time of that rectangle in terms of the
mixing time of rectangles whose base is 2L + 1 but whose vertical side is only L to the power
one-half plus epsilon.
These are the two rectangles: A is this guy and B is this other guy. Each is 2L + 1 along the base and L to the
one-half plus epsilon on the vertical side.
Okay. So let's say that we start from all minuses, and we have mostly plus here and mostly minus
there. So we start from all minuses.
And first we run A -- using censoring, we can run only A. Since we start from all minuses, the
spins below A are frozen equal to minus, which is like having minus boundary conditions along this line.
We run A for exactly its mixing time, so that afterwards A is, on average, at
distance delta from its equilibrium.
Now there is a line separating the minuses below from the pluses above, and this line
is this red guy. And this red guy is not able to reach the dashed line, simply because the red line
behaves like a random walk, with fluctuations of order root L, while the dashed line is a little bit more than root
L away.
Okay. So essentially, at the end of this time I have some random boundary condition along the dashed
line. But the distribution of this random boundary condition is more positive
than the plus phase, because it has plus here, plus here, plus here, and on the
outside of the red line there are pluses.
Okay. So it is more positive; it is a good distribution for us. Then we can run the
rectangle B for exactly its own equilibration time at this scale, and then we have
reached equilibrium.
>>: Something looks very fishy here, because -- [laughter] -- I can't put my finger on it. Below
the red line you have minuses, and [inaudible] is now more positive than the plus phase?
>> Fabio Martinelli: No, no, no.
>>: Above the red line.
>>: Above the line you have pluses.
>> Fabio Martinelli: Above the red line you have pluses, yeah. All I need is that the spins along this
dashed line are random, but their distribution is more positive than the
plus phase, which is correct, because I have plus, plus, plus, and this line lies outside the red
line separating the pluses from the minuses.
>>: Okay. It's because the dashed line is higher up than the red one.
>> Fabio Martinelli: Yeah, yeah. That is crucial.
>>: For instance, on a line below the red one, they would be more minus.
>> Fabio Martinelli: Of course.
>>: It sounded at first as if something --
>> Fabio Martinelli: No.
>>: -- was more negative.
>> Fabio Martinelli: Okay. Then we have to study the mixing time, with random boundary
conditions of the good sort, of a rectangle which is 2L + 1 on the long side but only L to the one-half
plus epsilon on the vertical side, and we have to estimate this mixing time in terms of the mixing
time of a rectangle whose long side is only L. Okay. So if we are able to do that, then we are
able to establish a recursion for the mixing time going from L to 2L + 1, and then we can solve the
recursion, okay?
So here comes the -- oops -- the second idea. Okay. And this is the hard bit. Let me try to
explain how it goes. One is tempted to run the center rectangle first. We start from all minuses, so
there are minuses here and minuses there and minuses around, and pluses on top. So
you are tempted to run the center rectangle, but then, although the center rectangle will equilibrate, the
separation line from pluses to minuses -- this red line, being a random walk -- will stay very close to
the top and not to the bottom.
And then the distribution of the spins on the vertical sides will not be more plus, as required to then do the left
and the right rectangles. So we are in trouble: we are not reproducing the correct boundary conditions.
Okay. Well, the main idea is to exploit an anomalous fluctuation of the interface. It is true that most of
the time the interface will be very close to the top, but there is a little bit of a chance that
sometimes the interface will actually go down and then up again.
If that occurs, then you can imagine that up here it is all pluses, and the resulting distribution on the sides
will be very much plus. Okay? The probability of going down, for something like a random walk, is
just exponential of minus L to the 2 epsilon, because it is a random walk on time L asked
to reach level L to the one-half plus epsilon. So that is the probability.
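The heuristic behind this exponent can be written out explicitly (an illustrative Gaussian estimate for a simple random walk, not the actual interface estimate from the paper; names are mine):

```python
# Illustrative random-walk heuristic: a walk of n steps reaches height h
# with probability roughly exp(-h^2 / (2 n)); taking h = n^(1/2 + eps)
# gives exp(-n^(2 eps) / 2), the cost of the anomalous fluctuation.

def log_reach_prob(n, h):
    """Gaussian estimate of log P(the walk reaches height h in n steps)."""
    return -h * h / (2.0 * n)

def log_fluctuation_cost(n, eps):
    """Log-probability of reaching height n^(1/2 + eps): equals -n^(2*eps)/2."""
    return log_reach_prob(n, n ** (0.5 + eps))
```

Larger epsilon means a higher target level and a much smaller probability, exactly the exp(-L^(2 epsilon)) cost quoted above.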
So, very intuitively, one would like to run the center rectangle many times, each for its mixing time.
How many? Well, this number of times -- exponential of L to the 2 epsilon -- in such a way that in one run
you will see the large fluctuation.
Then, as soon as you see the fluctuation, you run the left and the right until they equilibrate.
But that is certainly forbidden, because censoring cannot be applied depending on your
current configuration. Very attractive, but forbidden. Definitely forbidden.
Still, one keeps coming back to it, and it is actually doable: in the first
version of the paper there was a super messy chapter with this idea.
So the way out is less natural but more elegant, and it is the following. Okay. It is true
that here on the bottom we have minuses. By brute force I change many minuses here, I
replace them by pluses, and I start to attract the random walk: please come down here. Okay.
And the question is how many. Okay? Well, here is a beautiful simulation by Eyal and
Allan. Suppose that I have minuses here, the blue dots, and I put a little bit of
pluses down here and pluses on top.
So there are two possibilities. Either the pluses stay close to the top, with a little bubble close to the
bottom, or the pluses go down: they reach their friends in the bottom segment, and
they form two contours, one to the left and one to the right.
There are two possibilities, and it would be very nice if the second possibility were the likely one,
because then, on the vertical lines that form the sides of the two vertical rectangles, I now see
pluses, exactly as I want.
Okay. So the cost for a random walk to go down by L to the one-half plus epsilon is, as we saw,
exponential of minus L to the 2 epsilon. However, the reward for killing this extra contour is
proportional to exponential of the length of the planted region.
So if I take this length much bigger than L to the 2 epsilon, the reward is more than the cost, and
then this situation will be the more likely one.
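In exponent form, the comparison is just this inequality (illustrative only; the constants `c_cost` and `c_reward` are placeholders I introduced, not from the talk):

```python
# Illustrative cost/reward comparison: pushing the interface down costs
# an exponent of order L^(2 eps); erasing the extra contour around a
# planted plus segment of length seg_len gains an exponent of order
# seg_len. The downward move wins once seg_len >> L^(2 eps).

def downward_move_favoured(L, eps, seg_len, c_cost=1.0, c_reward=1.0):
    """True when the contour-erasure reward exponent beats the
    random-walk fluctuation cost exponent."""
    return c_reward * seg_len > c_cost * L ** (2 * eps)
```

For example, with L = 10^4 and eps = 0.1 the fluctuation exponent is only about 6.3, so a planted segment of a couple hundred pluses already tips the balance.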
Okay? So the probability that the random walk goes down is like that -- the cost. However, the
reward of eliminating this extra little contour, with respect to the other situation, is
that.
So if this planted region is longer than L to the 2 epsilon, it is better to come down: the total of cost
and reward favors it.
And, of course, there is a huge part of the work that really tries to prove this, because the interface is not
really a random walk.
So the final result. We say that the statement holds at scale L if at time T you
are, on average, at distance less than delta from equilibrium, for all distributions of the random
boundary conditions that are more positive than the plus phase or more negative than the minus phase.
And the easy bit and the hard bit prove that this assumption at one scale implies it at the next.
Basically -- I am not writing the exact recursion -- the error that you pay in
variation distance is a constant times the previous one, okay, and the time that you need at the
next scale is the old one times this little exponential factor. Okay. Then, of course, you
need an input for the recursion.
And the input is: you go down to a scale which is L to the epsilon, and there you apply just canonical
paths and make some crude estimate. Okay. So that is the recursion. We
were already very happy, but --
>>: What about the planting of the pluses there?
>>: Sorry?
>>: What lets you plant this inter --
>> Fabio Martinelli: Sorry. Okay. Why? Maybe this is important. This is sort of an abstract -- a
simple calculation. You ask yourself: what is the mixing time with, say, all minus
boundary conditions here, versus the mixing time if, instead of putting all minuses here, I put this
segment Delta of pluses by brute force?
Then it is easy to see that the ratio between the two mixing times is at most exponential in the
size of Delta. So you can decide to work with Delta, but you pay a price, of course, which is
exponential in its length. Yeah. Okay. Then, when Allan and Eyal visited
Rome, we started discussing, and it came out that it is possible to improve -- to improve to
the point that the mixing time is quasi-polynomial. Originally we wrote exponential of log
squared, but then we learned that it sells better as quasi-polynomial time, so we had to
adjust.
And then, quite surprisingly, we can prove this result not just for very large beta but for any beta bigger
than the critical one. So I guess this is the last slide, or almost. The key
new ideas -- there is really much, much to say; maybe I can provide the details later.
First of all, there is a new recursion scheme, still essentially doubling the scale, but organized
in a different way. Then there is a new analysis of the contours, of this interface separating pluses
from minuses, with really very precise, random-walk-like estimates, proving that it
behaves, for everything we need, essentially like a random walk.
For example, we are able to prove the following -- in this situation, I think this is
remarkable. But --
>>: Can't see --
>> Fabio Martinelli: Oh.
>>: Higher up.
>>: All the way up, it says N to N.
>> Fabio Martinelli: Okay. So if you put, say, plus, plus and then minus, minus, minus, there is some
line separating the pluses from the minuses. For example, one estimate that we need is to
prove that the probability that this line stays entirely above the horizontal line at level 0 is at
least an inverse polynomial of the scale. And that was absolutely
essential.
And we can prove it -- we can really prove that it behaves exactly as for a random walk
bridge, which is apparently really nontrivial.
And what is surprising is that, using the random-line representation, we can get this bound all the way up
to, but excluding, the critical point.
And so the bottom line is that in the new recursion the mixing time gets worse only by a polynomial-in-L
factor at each step, and since there are only log L steps, you get this quasi-polynomial bound. So that is the last
slide.
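The bookkeeping behind "polynomial loss per doubling, log L doublings, hence quasi-polynomial" can be sketched as follows (illustrative placeholders for the constants; `a` and `base_log_time` are mine, not from the talk):

```python
# Illustrative recursion bookkeeping: if each scale doubling multiplies
# the mixing-time bound by a polynomial factor scale^a, then reaching
# side L takes about log2(L) doublings, and the log of the bound is a
# sum of a * log(scale) terms, i.e. O(log^2 L): quasi-polynomial.
import math

def log_quasi_poly_bound(L, a=1.0, base_log_time=1.0):
    log_t = base_log_time
    scale = 1
    while scale < L:
        scale *= 2
        log_t += a * math.log(scale)   # polynomial loss at this doubling
    return log_t
```

The result grows like (a/2) * log2(L) * log(L), far below any exponential or stretched-exponential in L.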
So originally we had rectangles with a base of that length and a vertical side of that length to the
power one-half plus epsilon. That is the current length in the recursion. And instead,
here we make them thinner, okay: a large constant times the square root of the base, which is the scale
of the typical fluctuation of this red line. And then, in order to keep this side away from the red line,
we add a factor of the square root of the log of the final scale where we want to get. Then, of course, we need super
precise estimates, really exactly as for a random walk, to carry out the easy and hard bits as in the
previous work. And then we get this quasi-polynomial. As for a forecast for polynomial: according to
Eyal, ten years; I don't know. So thank you.
>>: You dropped everything else through experience?
>> Fabio Martinelli: Yes. Okay. Thank you. [applause].
>> Yuval Peres: Questions or comments? Any easy way to get it down to polynomial? No? Thank
you.