>> Yuval Peres: Good afternoon. So at Russell's request or suggestion I
researched various proofs of the Ergodic Theorem and the Subadditive
Ergodic Theorem. So these are classical theorems going back almost a
hundred years in the case of the Ergodic Theorem and the 70's for the
Subadditive Ergodic Theorem. The original proofs, you know, were quite
hard. Since then many shorter and simpler proofs were found. Some of
the shortest proofs using the Maximal Ergodic Theorem are not so
intuitive, so I want to present proofs that are perhaps more intuitive
following the works of Kamae, Benji Weiss, Katznelson, and Mike Keane
and Mike Steele. So it'll be a product of these.
So the Ergodic Theorem of Birkhoff can be stated in measure theoretic
form or a probabilistic form. In the probabilistic form we have a
stationary ergodic process, so Xj. So -- all right, so this process is,
say, stationary, meaning that the law of any sequence, say, X1, X2, up
to Xn is the same as the law of a shift, X2, X3, up to Xn plus 1. And
this is true for all n, and it follows from this shift invariance that
you can shift by more than one.
In other words, if we look at the transformation, T, that sends the
sequence Xj to the sequence Xj plus 1, this transformation is
measure-preserving. And so our measure is a probability measure. And
then we'll also assume -- this assumption can be removed, but in the
applications often the process is ergodic -- so we assume that the
process is ergodic, which concerns invariant events or invariant
functions. So if F is any function of the process which is measurable
and satisfies -- okay, I'll write measurable -- this shift invariance,
so F is pointwise equal, not just in distribution but pointwise equal,
to its composition with T, then F is constant almost surely.
And this assumption is satisfied in many cases. In particular this
holds in the independent case but in many more cases. So if the Xj are
independent then this invariance certainly implies that it's constant.
Okay, so in that setting the Ergodic Theorem would just say that if we
look at the partial sum -- So let me write more generally. So the
partial sum of size L starting at K will be the sum of Xj plus K when J
ranges from zero to L minus 1. Okay, so we'll be interested in the
partial sums from various places and then we have the normalized
partial sum, or the average: AL of K is 1 over L times SL of K.
And then the theorem states that An of K converges for any K, but it's
enough to talk about An of 1. This converges to the mean, [inaudible]
of X1 -- of course by stationarity they all have the same mean -- and
this convergence is almost sure.
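To see the statement in its simplest instance: for an i.i.d. process the theorem reduces to the strong law of large numbers, which a quick simulation illustrates (the Bernoulli process, the seed, and the sample size below are illustrative choices, not from the talk):

```python
import random

def ergodic_average(xs, n):
    # A_n(1) = (1/n) * S_n(1), the normalized partial sum from position 1.
    return sum(xs[:n]) / n

random.seed(0)
# A stationary ergodic (here i.i.d. Bernoulli) sample path with mean E[X1] = 0.5.
xs = [random.randint(0, 1) for _ in range(100_000)]

a_n = ergodic_average(xs, len(xs))
# Birkhoff: A_n(1) -> E[X1] almost surely as n -> infinity.
```

A genuinely dependent stationary ergodic example, such as an irrational rotation of the circle, would work the same way; the i.i.d. case is just the easiest to sample.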
So that's the Birkhoff Ergodic Theorem. Later we'll discuss the
subadditive version. Okay. So, all right. So let's -- Any questions on
this statement. Okay, so there are two -- there are really two parts to
the statement. One is that this actually converges which is maybe the
[inaudible] the harder part and then identifying that the limit is, as
you expect, the mean of X1. Any questions or something unclear?
>> : So we've got to get the property of the distribution of the Xj?
>> Yuval Peres: It's a property of the distribution of the Xj and of
the transformation T. Okay, so -- what this means, you can -- instead
of invariant functions you can think just of invariant events: an event
B is invariant if T inverse of B equals B. And it's equivalent to just
assume that all invariant events have probability zero or one. So
that's another form of ergodicity.
And yet another form is that, you know, for any two sets of positive
measure if you apply T enough times to 1 it will intersect the other at
positive measure. So you cannot have kind of two sets of positive
measure that don't see each other when you apply T.
Okay, so that's about the statement. So I said there're several short
proofs; I want to give the one that is most intuitive and also
generalizes to the subadditive case. So as I said, this proof starts
with [inaudible] of Teturo Kamae, who wrote the proof in nonstandard
analysis. Then this was simplified by Katznelson and Weiss, and then by
Keane, and it's basically his version that I'll show today. Okay, so
first I want to -- ah, so first I want to say that it's enough to
consider non-negative variables. So we may assume all the Xj are
non-negative, because in general you just separate into the positive
and the negative parts and prove it separately.
All right? So you just write X as a difference of a positive and a
negative part and work with each one separately and get the result. So
once we prove everything here it's kind of linear. So once we prove it
for the positive part, we prove it for the negative part, we can just
subtract and get it. So it's enough to [inaudible] with this case.
>> : Yuval?
>> Yuval Peres: Yes.
>> : [Inaudible] An I is the same as An [inaudible]?
>> Yuval Peres: No. The limit is the same. But, again, An1 is the limit
-- An, right? We're summing -- These are partial sums from different
locations, right? So...
>> : But there's a shift invariance so doesn't...
>> Yuval Peres: So the shift invariance is of the distributions.
Right? So X1 is not equal to X2; just its distribution is the same. So
the distribution -- But, so A1 is definitely not A2, but it has the
same distribution. Right. Okay? So, yeah so of course -- And also it's
easy to see that they must have the same limit. So An -- Right, so I
wrote here An1 but it applies to AnK for any K. So this is a limit,
right, as N [inaudible] to infinity. Okay.
All right. But, okay, so look at the lim sup of the An's and -- okay,
as I said I'm going to [inaudible] proof as intuitive as possible,
which is not exactly the same as as short as possible. So I want to first
consider an easy case which will then generalize. So, okay, one more
definition before that. So we have the lim sup. We fix an alpha which
is less than the lim sup. By the way this lim sup, we a priori don't
know that it's finite at all even though --. So maybe I should have
added the assumption. So we have stationary ergodic, and I want to
assume that it's integrable. So assume that the expectation of the
variables is finite. So let's add that.
Okay. To make this statement meaningful. Okay, so we look at this lim
sup, which a priori we don't know is finite. So, you know, it might at
this stage be infinite, but we'll prove it's finite. If we were talking
about the lim inf, it would be immediate that it's finite [inaudible].
But we're talking about the lim sup, so there's something still to
prove. Okay.
So fix alpha which is less than the lim sup, and then let's define L of
K -- these will be random variables -- as the first time that the
averages exceed alpha: so the first L such that
AL of K exceeds alpha.
Okay, alpha is certainly a finite number. Okay, so by the definition of
lim sup this number is certainly finite but it might be very large. We
might have to wait for long. So case one is when these numbers are
uniformly bounded. So this is very special. You know, it happens for
some ergodic sequences like periodic sequences, but it will be a good
warm up for the general case. So suppose that these L of K's are
uniformly bounded by some L. This is just a special atypical case, and
we'll see what to do in this case and then generalize from that.
So in this case we just take the partial sum and write it. And so
basically the idea is to take the interval from 1 to N and cover it by
intervals where the averages exceed alpha. So we know there is some
interval here where the average exceeds alpha and the length of
interval -- Right, so this is L of 1. The length of this interval is at
most L. And then we find another interval where the partial sum exceeds
alpha and so on. And we know that all these intervals are no longer
than L.
So we stop the first time we cross N minus L, and let's write the
decomposition of this sum. Let's write it now more formally. So I'm
going to say that the sum of Xj, J from 1 to N, is going to be bigger
than the sum, I from 1 to some M, of partial sums. So these will be
partial sums of length L of Ki, starting
at Ki.
Okay, so -- And I'll write some more and then explain this is going to
be bigger than the sum of L of Ki times alpha which will be bigger than
N minus L times alpha. And here -- Where each time we choose Ki -- So
we start with K1 is 1 and Ki plus 1 will just be Ki plus L of Ki.
Right, so each time we start at one we find an interval where the
partial sums are large then we just go to the next. Right? So we find
this interval then we got to the next point, find another good interval
and so on. So all these partial sums are larger than their length times
the average, right, because this is greater than alpha. So we get this
inequality and then -- and the length of all these intervals will
exceed N minus L because we just keep going as long as we can. And we
just have to stop once we exceed N minus L; it's possible that the next
interval will overshoot N so we stop at that point. We don't take the
next interval. Okay?
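The covering step can be played out concretely in this bounded Case 1. A minimal sketch, assuming a periodic 0-1 sequence (so the waiting times L(K) are uniformly bounded by 2) and alpha = 0.4 below the lim sup 1/2; the names and the data are illustrative only:

```python
def first_crossing(xs, k, alpha):
    """L(k): the first L with A_L(k) = average of xs[k:k+L] exceeding alpha
    (0-indexed k)."""
    total = 0.0
    for L in range(1, len(xs) - k + 1):
        total += xs[k + L - 1]
        if total / L > alpha:
            return L
    raise ValueError("no crossing within the sample")

def greedy_cover(xs, alpha, cap):
    """Greedily tile [0, n - cap] by intervals [k, k + L(k)) whose average
    exceeds alpha, starting each new interval where the last one ended."""
    n, k, intervals = len(xs), 0, []
    while k <= n - cap:
        L = first_crossing(xs, k, alpha)
        intervals.append((k, L))
        k += L
    return intervals

# Periodic (hence a Case 1) sequence with mean 1/2, so L(k) <= 2 everywhere.
xs = [0, 1] * 500
alpha, cap = 0.4, 2              # alpha < lim sup A_n = 1/2; cap bounds L(k)
intervals = greedy_cover(xs, alpha, cap)
covered = sum(L for _, L in intervals)

# Key inequality: sum of xs >= (sum of the L(k_i)) * alpha >= (n - L) * alpha.
assert sum(xs) >= covered * alpha >= (len(xs) - cap) * alpha
```

Because every interval has length at most `cap`, the greedy tiling never overshoots N, which is exactly why the uniform bound makes Case 1 easy.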
Now this inequality is, you know, completely obvious after you've seen
it, and maybe confusing the first time you see it, so please stop me
because this is -- So apologies to those for whom it's trivial and
apologies to those for whom it's confusing. But if anybody from the second group who
wants to ask something -- Because this is really the crux of the whole
matter. So --.
>> : You probably should've mentioned that A-bar -- that it doesn't
depend on L.
>> Yuval Peres: Yes.
>> : That [inaudible].
>> Yuval Peres: That's important. Thank you. So I should've mentioned
that A-bar is a constant. So thank you. That's right. So A-bar, this
lim sup -- thanks. So maybe at this point I should mention -- so
observe, note that A-bar equals A-bar composed with T. In other words,
if we -- this is related to [Inaudible]'s comment -- so if we start the
partial sums from 1 or from 2, it not only has the same distribution,
but pointwise the partial sums only differ by the first variable. So
when we divide by N and take a limit or a lim sup, it doesn't matter.
So pointwise, the lim sup of the sequence starting from the first or
the lim sup starting from the second, when we divide by N, clearly,
easy to see, has the same lim sup.
So we have this invariance, which means, because we're working in the
ergodic case, that A-bar is a constant. Almost surely. Okay, so thanks,
[inaudible]. I should've commented on that.
>> : [Inaudible].
>> Yuval Peres: Yes?
>> : It's not just the first term [inaudible] the last term
[inaudible].
>> Yuval Peres: Right. Yeah, so it's -- So we want it only to differ in
the first term, so maybe I'll say this. So An of 1, you know, equals An
minus 1 of 2... [Inaudible]. Okay, Sn of 1 equals Sn minus 1 of 2 plus
X1. Okay? So that's better because, you know, the last term we don't
control well. All right, so now divide -- all right, so An of 1 equals,
you know, N minus 1 over N times An minus 1 of 2, plus X1 over N. And
now it's safe to take lim sups. Okay, so...
>> : [Inaudible]?
>> Yuval Peres: This is to formally justify this. Okay? So what
[inaudible] pointed out is that if we just work with averages of length
N then partial sum of length N from 1 and from 2 they differ by a last
element which kind of varies in time. So we don't control it well. We
could, but it's easier just to use this identity which compares partial
sum of N terms to partial sum of N minus 1 terms. And now once we have
it in this form, we really can take the lim sup. And see this goes to
zero, this factor doesn't matter, and so we indeed derive this
identity.
Okay. So in this case we are basically done at least with the existence
of the limit because -- And we can easily finish the rest because now
if you take these two sides, right, you divide by N and take the lim
inf, you get that -- so the lim inf of -- I'll just write it out -- 1
over N sum of Xj, J equals 1 to N, is going to be -- well, we take
this, divide by N and take a lim inf or limit [inaudible]. So this is
greater than alpha. Okay, and of course if we could do this for any
alpha less than A-bar then we would get that the lim inf equals the lim
sup, and so the limit exists.
But, of course, this assumption that the L of K are uniformly bounded
is very restrictive. So I don't want to spell out the details in this
case. This case was more just to see the key argument in the
[inaudible] setting. Now we're going to do the same thing in the
general case. So in the general case we can't assume that these L of K
are bounded, but what we can do is, you know, given epsilon small, we
can pick a large number L so that the probability that L of K -- okay,
this probability doesn't depend on K -- the probability that this is
bigger than L will be less than epsilon.
Okay. By stationarity this probability is the same for any K, so I
could have put here just L of 1. Okay? So the point is this L of K is a
finite random variable, so I can pick L large so that the probability
that this variable is bigger than L is less than epsilon. And now we
want to modify the process: so Xk star will be Xk in the case when L of
K is less than L. These are the well-behaved cases. And in the bad
cases -- so I'm going to look ahead, and if the situation is bad, so we
need to wait too long, then we're just going to modify the process and
put in alpha here.
Okay? This is the modified process Xk star, and I want to write it as
Xk plus Zk. All right, so Zk is usually zero; just when we are in this
bad case, Zk will be, you know, alpha minus Xk. So we define Xk star
this way. And now define, you know -- all right, so AL star of K are
the averages of the Xk star: the sum of X star of J plus K, J from zero
to L minus 1, over L. And then we have L star of K, the first L so that
these averages AL star of K exceed alpha. And now we're in a good
position. These are always at most L.
>> : Did you mean X star [inaudible]?
>> Yuval Peres: X star. Thank you. Yes. That's the whole point. These
are averages of the X star. And now because L star of K, in fact it's L
of K if L of K is less than L, and it's 1. All right, so L star of K,
what is it? It's L of K if L of K is less than L and it's 1, otherwise.
Okay, so it's certainly always at most L. So now the previous argument
applies, just partitioning the partial sum -- so now we're going to
take N which is much, much larger than L, intuitively. And we're going
to take the sum. So the sum, J from 1 to N, of Xj star is going to be
bigger, by the same argument from before, than N minus L times alpha.
Okay, so it's convenient that all the variables here are non-negative.
So the terms that we throw away here at the end, we know they are
non-negative. Okay, so exactly the same argument from before now proves
this inequality, which is really, you know, the key inequality
[inaudible]. Right. Any questions? So it's exactly the same argument
because we are in the situation where we only have to sum at most
capital L terms, you know, to get the average that we want. All right?
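The modification X to X star can also be carried out concretely; a minimal sketch on an illustrative 0-1 sequence (the particular sequence, the cutoff `cap`, and the at-least-alpha crossing convention are assumptions of the sketch, not part of the proof):

```python
def first_crossing(xs, k, alpha):
    # L(k): the first L with average of xs[k:k+L] at least alpha.
    # (>= rather than > so a single entry equal to alpha crosses at once;
    # if the sample ends before crossing, return a sentinel past the end.)
    total = 0.0
    for L in range(1, len(xs) - k + 1):
        total += xs[k + L - 1]
        if total / L >= alpha:
            return L
    return len(xs) - k + 1

# A 0/1 process with long quiet stretches: mean 0.1, waiting times L(k) up to 10.
xs = ([1] + [0] * 9) * 100
alpha, cap = 0.05, 5      # alpha below the lim sup 0.1; cap plays the role of L

Lk = [first_crossing(xs, k, alpha) for k in range(len(xs))]
# Modified process: X*_k = X_k on good sites (L(k) <= cap), alpha on bad sites,
# and Z_k = X*_k - X_k, so Z vanishes except on the bad sites.
xs_star = [x if L <= cap else alpha for x, L in zip(xs, Lk)]
zs = [a - b for a, b in zip(xs_star, xs)]

# The repaired waiting times are uniformly bounded by cap: a bad site now
# crosses in one step, and X* >= X pointwise, so good sites cross no later.
Lk_star = [first_crossing(xs_star, k, alpha) for k in range(len(xs_star))]
```

After the surgery the Case 1 covering argument applies verbatim to X star, which is exactly how the lecture uses it.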
So now what can we do with this? Several things. So first let's take
expectations on both sides. So we get that N times the expectation of
X1 star is greater than N minus L times alpha. And what can we say
about this? Well, X1 star only differs from X1 in the bad case. So this
is at most N times (the expectation of X1 plus alpha epsilon).
Right, because the difference is just the expectation of this Z1 and
that difference is at most alpha and with probability at most epsilon.
Okay, so now we're in a good situation. We can divide by N and take a
limit as N tends to infinity and we'll get that expectation of X1 is
greater than 1 minus epsilon times alpha. Okay, moving the epsilon to
the other side. And this we could do -- Note that here it's -- now it's
X1; it's not X1 star anymore. So this is true for any epsilon. So yet
expectation of X1 is greater at equal alpha. But this was true for any
alpha less than A var. So we can conclude that expectation of X1 is in
fact greater than A var which is a powerful inequality here because A
var was the lim sup. So we conclude that this lim sup is in fact finite
and bounded above by the expectation.
Okay. So now we're almost done. We just still have to argue that the
lim inf of the averages is also A-bar. And for this we just go back to -- Yes?
>> : [Inaudible] epsilon small? Like epsilon could be half [inaudible]?
>> Yuval Peres: Yeah, but the point that we use that is -- I mean, at
the end we use the fact that this is true for any epsilon. Right, once
we had this -- right, we got this inequality and now we say this is
true for any positive epsilon so E X1 is in fact greater than alpha.
>> : Oh, okay.
>> Yuval Peres: Okay? And then we said this is true for any alpha less
than A-bar, so in fact the expectation of X1 is at least A-bar. So now in order to go
back notice that the X star has two pieces, has this X and this Z. So
let's make sure we can control the Z's well. And for this we're going
to use what we've already proved for the process X, we're going to use
it for the process Z. So this fact that we've already proved implies,
when applied to the process Z -- which is after all just another
stationary ergodic process -- that the expectation of Z1 is going to be
bigger than the lim sup of 1 over N sum, J equals 1 to N, of the Zj,
almost surely.
And...
>> : [Inaudible].
>> Yuval Peres: Right. So the Z's are a function of the X process, and
so they are also ergodic. So any --. So if you have any ergodic process
and you apply a function of that then you also get an ergodic process.
And you just check that from the definition. Thanks, [inaudible].
So --. All right, so this is something that should be added. Since the
Z's are a function, an invariant function of the original process,
they're also an ergodic process. So we get this inequality. Now we want
to apply that going back to our key inequality upstairs here. So that
inequality tells us that -- maybe I'll also write that the expectation
of Z1, we know, is at most alpha epsilon. Okay, now let's write this
inequality in the form: 1 over N times (the sum, J from 1 to N, of Xj
plus the sum, J from 1 to N, of Zj) is at least (N minus [inaudible])
over N, times alpha.
[ Silence ]
Okay. Now we want to take lim inf of both sides. Well here it just
converges to alpha. What can we say about the lim inf of this side?
Well, we take the lim inf. On the one hand it's bigger than alpha, from
that inequality. On the other hand -- well, what can we bound it above
by? It's not true that the lim inf of a sum is bounded above by the sum
of the lim infs, but it's certainly true that it's bounded above by the
lim inf of the X's plus the lim sup of the Z's. Right? This is just
true for any two sequences: when we add them, we can bound the lim inf
above by the lim inf of one plus the lim sup of the other. Okay? So
this is the lim inf of An of 1, plus -- and this, we know, can be
bounded above by alpha epsilon. And this is at least alpha. So we're in
the same situation as before: we move alpha epsilon to the other side,
and since this holds for every epsilon, this lim inf is in fact at
least alpha. So it follows that this lim inf is at least A-bar, so it's
equal to A-bar. And we have the convergence.
Okay. And we've already verified that --.
[ Silence ]
All right. So we have convergence to --.
[ Silence ]
So A-bar --. Okay, so here I guess I've explained why A-bar is at most
the expectation. I didn't say why A-bar is at least the expectation. So
this is a final note. The fact that it is at most the expectation was
proved. Now in the bounded case, if the X's were bounded, we certainly
know that A-bar would equal the expectation of X1 just from the
Lebesgue Bounded Convergence Theorem, because all the averages will be
bounded by the same bound. And so in general A-bar is bigger than the
limit of 1 over N sum, J from 1 to N, of the minimum of Xj with M,
which is -- right? So this is now a bounded process. So this will give
the expectation of the minimum of X1 with M. So A-bar is at least that.
>> : [Inaudible]?
>> Yuval Peres: Xj minimum with a large number, M. This [inaudible] is
the minimum.
[ Silence ]
All right. So again, in the bounded case, just the Lebesgue bounded
convergence theorem gives you that the expectation of the limit is the
limit of the expectations. In the unbounded case you just truncate. You
get this inequality. And now once we have this inequality, you can let
M tend to infinity and you get that A-bar would also be greater than or
equal to [inaudible], just by taking M to infinity.
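In symbols, the truncation step just described reads (writing $\bar a$ for the lim sup and $M$ for the truncation level):

```latex
\bar a \;=\; \lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} X_j
\;\ge\; \lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} \min(X_j, M)
\;=\; \mathbb{E}\!\left[\min(X_1, M)\right],
```

where the last equality is bounded convergence (the truncated process is still stationary ergodic, and its averages converge by what was already proved); letting $M \to \infty$, monotone convergence gives $\bar a \ge \mathbb{E}[X_1]$.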
Okay? So any questions about this? All right. So as I said, there are
other proofs. This one is particularly well adapted to generalization to the
subadditive case. Historically when Kingman first proved the
Subadditive Ergodic Theorem in the seventies the proof was much harder.
So I'll go onto that if there are no questions on this case.
[ Silence ]
And despite the name Subadditive Ergodic Theorem, I'm going to prove
the superadditive version. But it's up to [inaudible] minuses. So we're
going to -- So, what is a superadditive process? So maybe first just
one word on the kind of places where Subadditive Ergodic Theorems get
applied. So one place is first-passage percolation: you have, say, a
lattice, and on the edges you have random variables which indicate
passage times. And you want to find P of zero N, the time to go from
zero to N; in general P of M N might be the time to go from M to N on
the X axis. So we have two points, you know, M and N, on the X axis.
We look at all possible paths that go from one to the
the X axis. We look at all possible paths that go from one to the
other. For each path we look at the total passage time of the path, and
we minimize over all these paths. So on the edges are endowed with
independent random variables which are, you know, passage time of these
edges. So for these random variables, it's easy to see that they're
subadditive. So the time to go from zero to N is certainly bounded by
the time to go from zero to M plus the time to go from M to N, because
here we're considering a larger ensemble of paths that go from zero to
N. On the right-hand side we're considering paths that go from zero to
N but have to go via the intermediate point M. So on the left we're
minimizing over a larger ensemble
of paths. So this is the kind of [inaudible]. And then you want to show
that when you take the time to go from zero to N, you divide by N, this
actually has a limit, and the limit is nonrandom -- so it's almost
surely constant -- and it's equal to the limit of the expectations.
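The subadditivity of first-passage times can be checked directly on a small example; a sketch assuming i.i.d. Uniform(0, 1) edge weights on a finite grid (the grid size, the seed, and the Dijkstra implementation are illustrative, not part of the lecture):

```python
import heapq
import random

def passage_time(weights, src, dst):
    """First-passage time = weight of the cheapest path from src to dst (Dijkstra)."""
    dist, heap = {src: 0.0}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in weights[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def grid(nx, ny, rng):
    """An nx-by-ny grid with i.i.d. Uniform(0, 1) passage times on the edges."""
    adj = {(x, y): [] for x in range(nx) for y in range(ny)}
    for (x, y) in list(adj):
        for dx, dy in ((1, 0), (0, 1)):
            u, v = (x, y), (x + dx, y + dy)
            if v in adj:
                w = rng.random()
                adj[u].append((v, w))
                adj[v].append((u, w))
    return adj

rng = random.Random(1)
adj = grid(21, 5, rng)
T = lambda m, n: passage_time(adj, (m, 0), (n, 0))

m, n = 10, 20
# Subadditivity: any path 0 -> m followed by a path m -> n is one candidate
# path 0 -> n, and T(0, n) minimizes over a larger ensemble of paths.
assert T(0, n) <= T(0, m) + T(m, n)
```

The inequality is deterministic, not probabilistic: it holds for every realization of the edge weights, which is exactly why the subadditive hypothesis is easy to verify in this model.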
This is one kind of application. Another application is for random
walks on groups, and you want to show that the random walk has speed.
So you have some Cayley graph; you're doing a random walk on the
Cayley graph, and you look at the distance from your starting point to
the Nth point in your walk. And, again, you can check that that
satisfies such an inequality, and so you get existence of a limit.
So I won't talk now about more applications but rather go to the formal
statements since I want to finish in time.
So the Subadditive Ergodic Theorem of Kingman. And I'm going to state
it in the superadditive version. So again we can think of some
underlying probability space. So in this case the probability space is
just all the edges -- all the random variables that indicate the
passage times of the edges. And you can think of a transformation T
from omega to itself which is measure-preserving. And then the important
things are random variables Ymn. So you can think of these as, say, the
negative of these passage times. Okay, and the Ymn satisfy the
subadditive inequality -- I'm sorry, the superadditive one. So I'll
just write it for Y zero N: this is bigger than Y zero M plus Y M N.
So that's one assumption. And also the shift invariance: Ymn composed
with T is Y M plus 1, N plus 1. So you see, in this situation the
transformation is just shifting the random variables, and you see that
the time to go from M plus 1 to N plus 1 just has the same law as the
time to go from M to N. And this corresponds to the shifted passage
times.
Okay so these are the assumptions. And then the conclusion is that if
you take --. What? So, no, this is not equality in distribution. This
is the actual random variables. So --.
>> : T is measure preserving.
>> Yuval Peres: T is measure preserving. That's right. Okay, so you
should -- You really think there's really just one sequence, just like
in the ergodic theorem. Yeah, so here you should -- Okay, so just think
of your basic variables as these Ymn -- or Y zero N. And then from Y
zero N you can apply the transformation and you have all the variables.
Okay, but the transformation just shifts the underlying space. Okay.
So then --. Right, then the conclusion is that the limit of Y zero N
over N exists. This limit exists almost surely. Now beta is some
number; in this case it's not minus infinity, but it could well be
infinity. Okay -- the Y's are finite numbers, but I didn't assume
integrability here, so certainly the averages could go to infinity.
Okay. And also, beta is the limit of the expectations.
>> : [Inaudible].
>> Yuval Peres: Thank you. So it's ergodic. Okay, so --.
[ Silence ]
The proof, as you'll see, follows the same lines as before. So first,
if you look at the variables which are Ymn minus the sum of Y K minus
1, K, for K from M plus 1 to N, then you can just see that the
superadditivity assumption implies that this is non-negative. Okay, so
I'll leave that as a little verification. You just recursively apply this
assumption again and again. And remember that this assumption together
with the one on the right implies that we have the superadditivity
along any interval. If you take the interval and break it in two
pieces, Y of the big interval is bigger than the sum of Y's of the
pieces. Right? And you just keep breaking it up until you get to this Y
on the intervals of length 1. And so this is non-negative. And, okay,
so --. And this partial sum can be treated with the Ergodic Theorem.
So, okay, if these variables have infinite expectation then you easily
conclude that the limit is infinite. If they have a finite expectation
then you can just use the Ergodic Theorem, which tells you that
averages of these will go to their mean, and you reduce the case of Ymn
to the case of Y [inaudible]. So because of this definition, this
allows us to assume that the original Ymn are non-negative.
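Written out, the reduction is (with $\tilde Y$ as an ad hoc name for the corrected variables):

```latex
\tilde Y_{m,n} \;:=\; Y_{m,n} \;-\; \sum_{k=m+1}^{n} Y_{k-1,k} \;\ge\; 0,
```

by applying the superadditivity $Y_{m,n} \ge Y_{m,k} + Y_{k,n}$ repeatedly; and since $\frac{1}{n}\sum_{k=1}^{n} Y_{k-1,k} \to \mathbb{E}[Y_{0,1}]$ almost surely by the Birkhoff theorem (when this expectation is finite), proving the limit for the non-negative process $\tilde Y$ gives it for $Y$.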
>> : So you are assuming that Y has an integral including plus or minus
infinity?
[ Silence ]
>> Yuval Peres: Yes. So --. All right. So let's --. Yes, that's right.
Okay, so...
>> : So Y is an integral of a low non-negative?
>> Yuval Peres: Yes. Okay, so let's first just assume that these Y's
are finite. Okay. So, thanks. So then -- all right. So this allows us
to assume that the Ymn are non-negative.
So now we just continue with non-negative variables and define as
before. So A -- so I guess now I'll call it beta: the lim sup of 1 over
N times Y zero N.
[ Silence ]
Okay. We fix alpha less than beta and define, like before, L of K as
the first L so that when you take Y from K to K plus L, this is bigger
than L times alpha. And L star of K: so now L star of K will be L of K
if L of K is less than L, and 1 if L of K is bigger than L. Okay, now the same
logic that we've used already twice before will allow us to bound from
below Y zero N. Okay, by something N minus -- essentially N minus L
times alpha. I'll write it and then explain. Minus the sum over all --.
[ Silence ]
So I didn't tell you how we choose L but you can already guess.
[Inaudible]. So the sum over all K so that L of K is --.
[ Silence ]
Okay. And here, as before, we choose L so that the probability of L of
K, to be bigger than L, is less than epsilon. And then we -- So these L
star K we don't really use them. They just kind of remind us of the
argument to get this inequality. But to get this inequality, as before,
we take the interval zero N and we look from zero, we look, "Do we have
an interval here where L of K is less than L?" If so, we're happy and
then we continue. But maybe here, when we look here, the L of K is
bigger than L. Then we just take a singleton and we go to the next
point. Okay. But this singleton was a special point where L of K was
bigger than L. Then we go to the next point -- And so overall we cover
the whole interval from zero to N minus L by good intervals where L of
K is less than L and bad singletons. So when L of K is bigger than L we
just take that singleton and jump to the next.
So overall, from the good intervals we'll get alpha times their total
length, but we're going to lose -- and here, yeah, so I have -- we're
going to lose from the bad singletons. So alpha multiplies this as
well, so maybe I'll write it this way.
So N minus L minus the sum -- all of this multiplies alpha. Right, so
the total length of the good intervals is at least N minus L minus the
number of the bad locations. Okay, and at the bad locations we just
have to go to the next point. Okay. This is sort of the key to this
proof, but it's similar to the keys of the previous proofs; that's why
we went through those.
>> : [Inaudible] then the next one is not going to be bad as well?
>> Yuval Peres: Maybe. But, you see, it's not -- I'm not -- it might --
it kind of does give some negative information. But we are not
computing probabilities here. This is just a combinatorial pointwise
inequality, right, that says we gain alpha times the length of the good
intervals and we lose -- I mean, but what
is the total length of the good intervals? It's at least N minus L
minus the number of the bad singletons. Every time we see a bad
singleton we go to the next. Maybe that's another bad singleton. But
then it will just enter into this sum. So the total length of the good
intervals is at least what's within the parenthesis here. And then from
them we get this times alpha. Okay?
>> : Isn't it the same as the previous argument, except you did both
cases together? Was there something different?
>> Yuval Peres: Yeah, it's essentially the same. What is different here
is that we didn't do actually a modification of the process for this.
We just kind of paid the price here. But we're in a better position
than before because we already have the Ergodic Theorem, the Birkhoff
Ergodic Theorem, and we're about to use it. Okay, so that's why we
didn't have to go through the same thing, because now when we divide by
N and we want to take a lim inf, you see these indicators are just a
sequence -- because everything is a function of a stationary process.
These indicators, one can check, are also stationary. So if we divide
by N and take a limit, we know they converge to the expectation of this
indicator. And that expectation is just the probability of this event,
which is small; it's less than epsilon. So
that's -- If we divide by N and take a lim inf, what do we get? Well,
here L is constant. So when we divide by N we're going to get at least
alpha minus -- well, alpha times (one minus the probability that L of 1
is bigger than L). So this is bigger than alpha times (1 minus
epsilon).
Okay, and at this point we use the Birkhoff Ergodic Theorem for these
random variables. Again, because we're in a stationary situation these
themselves are from a stationary sequence so we can apply the Birkhoff
Ergodic Theorem here. And now we're done. The lim inf here is greater
than alpha, and this was true for any alpha less than beta, so the lim
inf equals the lim sup, beta. So the last comment is: why is the limit
the same as the limit of the expectations?
So you always have -- Right, so we already proved that the limit of Y zero N
over N exists and equals beta. So then from [inaudible] if we take
expectations, we get that the expectation of the limit, which is beta, is at
most the lim inf of the expectations. But this is a superadditive numerical
sequence, so the limit exists. So the limit exists, and beta is at most this
limit. In the other direction, just observe that if you take Y zero, K times
N, you can break this up into N intervals of length K. Right? So if you take
this and divide by N and take a limit, this will be bigger than the
expectation of Y zero K.
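The numerical fact invoked here -- for a superadditive sequence, a(m+n) >= a(m) + a(n), the limit of a(n)/n exists and equals the supremum of a(n)/n (Fekete's lemma) -- is easy to sanity-check. A minimal sketch with a hypothetical sequence a(n) = n - 1, not taken from the lecture:

```python
# Fekete's lemma, superadditive form: if a(m+n) >= a(m) + a(n) for all m, n >= 1,
# then a(n)/n converges, and the limit equals sup_n a(n)/n.
# Illustrative (hypothetical) sequence: a(n) = n - 1, with sup a(n)/n = 1.

def a(n):
    return n - 1

# Check superadditivity on a range of index pairs.
for m in range(1, 50):
    for n in range(1, 50):
        assert a(m + n) >= a(m) + a(n)

ratios = [a(n) / n for n in range(1, 10001)]
print(ratios[-1])   # 0.9999 -- for this sequence the ratios increase toward 1
```

The same lemma, applied to the numerical sequence of expectations, is what guarantees that the limit of the expectations exists in the argument above.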
Okay. Because just from the superadditive inequality you can break this up
into a sum of N sums, each on an interval of length K. And then for these
sums you apply the ordinary Ergodic Theorem to get that when you divide --
this is the sum of N summands -- when you divide by N, you'll get this
limit. This is true for every K. So now let's take this -- so here I thought
of N as tending to infinity and K as constant. All right. So I have this,
and this is true. This is exactly our limit beta, and it's bigger than this
for every K. So taking the limit gives us the remaining inequality we need.
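In the notation used throughout, the direction just described can be sketched as:

```latex
Y_{0,NK} \;\ge\; \sum_{i=0}^{N-1} Y_{iK,\,(i+1)K}
\quad\Longrightarrow\quad
K\beta \;=\; \lim_{N\to\infty}\frac{Y_{0,NK}}{N}
\;\ge\; \mathbb{E}\bigl[Y_{0,K}\bigr],
```

where the limit of the averaged summands is the Birkhoff Ergodic Theorem applied to the stationary sequence of blocks. So beta is at least E[Y_{0,K}]/K for every K, and since the sequence of expectations is superadditive, Fekete's lemma identifies the supremum over K with the limit of E[Y_{0,n}]/n, giving the remaining inequality.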
Okay, since I promised to finish at five I won't really discuss more
applications now. But any -- Let me stop here and wait for any
questions. Yes?
>> : What was the benefit of taking the Y [inaudible]?
>> Yuval Peres: So we had -- In this inequality here, right, what did we do?
We took the interval zero N and broke it into good intervals and bad
singletons. Right. Now in the bad singletons all I said is, "Well, you know,
we don't get the good contribution, but we know we get something at least
zero." So that's why I could write this inequality. So this gave us -- What
is in the parentheses is just the total length of the good intervals, and
those all give us alpha times their length. The other terms I don't know,
but they're nonnegative.
>> : Oh, so the last L? Right?
>> Yuval Peres: Right. Also the last L, that's right. Yes, because we
stop before the end. Okay, so that's where that gets used. Okay, there
are no more...
>> : So you mentioned that the first proof used nonstandard analysis?
>> Yuval Peres: No, not the first proof. The first proof along this
argument. So the Kingman proof was...
>> : [Inaudible]...
>> Yuval Peres: So this line of proof started with a nonstandard analysis
proof by Teturo Kamae in 1982. And then Yitzhak Katznelson and Benji Weiss
read that proof, understood it, and understood how to remove the nonstandard
analysis. And then this was further simplified a bit by Mike Keane and Mike
Steele, who gave essentially these arguments for the Birkhoff case and the
subadditive ergodic case. If you want to compare to other proofs, you can
look -- so, say, Durrett's probability book has a proof of the Subadditive
Ergodic Theorem, a slightly more general version, but this one applies to
most applications. What I proved to you is basically the original Kingman
version. But the proof that [inaudible] gives, which follows [inaudible], is
really much harder to follow and to remember. So here I think at least the
idea is pretty easy to remember.
>> : So say if you kind of use your nonstandard analysis, would the
proof have been simpler or no?
>> Yuval Peres: Eh...
>> : Can you...
>> Yuval Peres: The proof -- If you assume -- I mean, but you prove a
completely different statement. So if you want to verify that that
statement is actually equivalent, it's much longer. Yes [inaudible].
>> : So what's the relation of this to maximal theorems?
>> Yuval Peres: This is a way to avoid maximal theorems. So there's a very
short proof of the Birkhoff Ergodic Theorem that comes from the Maximal
Ergodic Theorem, and this was one of the routes of the original proofs. And
initially it was thought, "Oh, this reduction is so easy, so the Maximal
Ergodic Theorem must be hard." But then Garsia came up with a very short
proof of the Maximal Ergodic Theorem. So if you combine those, there is a
proof even shorter than the one I presented, going via the Maximal Ergodic
Theorem, but that one is more mysterious, even to the experts.
>> : And that's not of the subadditive?
>> Yuval Peres: Right. Right. That one doesn't translate directly to
the subadditive. Okay. Thanks.
[ Audience clapping ]