————————————————— Lecture 14 (Oct. 7) Ergodic Theory:

Defn: A measure-preserving transformation (MPT) on a probability space
$(M, \mathcal{A}, \mu)$ is a map $T : M \to M$ which is measurable and measure-preserving, i.e.,
for $A \in \mathcal{A}$, $T^{-1}(A) \in \mathcal{A}$ and $\mu(T^{-1}(A)) = \mu(A)$.
Defn: Invertible MPT (IMPT): $T$ is an MPT + bijection a.e. + $T^{-1}$ is an MPT.
Note: for an IMPT, need only assume that $T$ is an MPT, bijective, and
$T^{-1}$ is measurable:
$\mu(T(A)) = \mu(T^{-1} \circ T(A)) = \mu(A)$
Theorem: $T$ is an MPT iff $\int f \circ T \, d\mu = \int f \, d\mu$ for all $f \in L^1$.
Proof:
If: Apply to $f = \chi_A$, $A \in \mathcal{A}$, and note that $\chi_A \circ T = \chi_{T^{-1}(A)}$.
Only if: split $f$ into positive and negative parts;
approximate $f^+$ by a monotone sequence of simple functions $\varphi_n$.
Then $f^+ \circ T$ is approximated by $\varphi_n \circ T$,
each $\int \varphi_n \, d\mu = \int \varphi_n \circ T \, d\mu$;
use the monotone convergence theorem.
Every MPT is recurrent:
Poincaré Recurrence Theorem (1890)
Let $T$ be an MPT and $\mu(A) > 0$. Then a.e. $x \in A$ visits $A$
infinitely often, i.e., for a.e. $x \in A$, there are infinitely many $n > 0$
s.t. $T^n(x) \in A$.
Proof:
Lemma: There exists $n > 0$ such that $\mu(T^{-n}(A) \cap A) > 0$.
Proof: If not, we claim that $\{T^{-n}(A)\}_{n=0}^{\infty}$ are pairwise measure
disjoint (i.e., for $n \neq m$, $\mu(T^{-n}(A) \cap T^{-m}(A)) = 0$); this follows
because (assuming $n > m$):
$T^{-n}(A) \cap T^{-m}(A) = T^{-m}(T^{-(n-m)}(A) \cap A)$
Thus, $\mu(\bigcup_{n=0}^{\infty} T^{-n}(A)) = \sum_{n=0}^{\infty} \mu(A) = \infty$, contradicting $\mu(M) = 1$.
End of proof of lemma.
Let $G$ be the set of points in $A$ which do not visit $A$ infinitely
often. We want to show that $\mu(G) = 0$.
Let $G_n$ be the set of points in $A$ that visit $A$ exactly $n$ times.
Then $G$ is the disjoint union of the $G_n$ and so it suffices to show
that $\mu(G_n) = 0$.
If $\mu(G_n) > 0$, then by the Lemma, there exists $n_0 > 0$ s.t.
$\mu(T^{-n_0}(G_n) \cap G_n) > 0$.
If $x$ is in this intersection, then $x \in G_n$ and $T^{n_0}(x) \in G_n$ and so
$x$ visits $A$ at least $n + 1$ times, a contradiction to the defn. of $G_n$.
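Numerical illustration of recurrence (a sketch, not part of the proof). It assumes a specific MPT, the circle rotation $x \mapsto x + \alpha \mod 1$ on $[0, 1)$ with Lebesgue measure, and hypothetical choices $\alpha = \sqrt{2} - 1$, $A = [0, 0.1)$; the function name `return_counts` is made up for this example. It counts how often sample points of $A$ come back to $A$.

```python
import math

def return_counts(alpha, a, b, starts, steps):
    """For each start x0, count the n in 1..steps with T^n(x0) in [a, b),
    where T(x) = x + alpha mod 1 is a circle rotation on [0, 1)."""
    counts = []
    for x0 in starts:
        hits = 0
        for n in range(1, steps + 1):
            x = (x0 + n * alpha) % 1.0
            if a <= x < b:
                hits += 1
        counts.append(hits)
    return counts

alpha = math.sqrt(2) - 1                            # assumed irrational rotation number
starts = [0.001 + 0.0098 * j for j in range(10)]    # sample points inside A = [0, 0.1)
counts = return_counts(alpha, 0.0, 0.1, starts, 1000)
print(min(counts), max(counts))
```

Every sample point returns many times within 1000 steps, as the theorem predicts for a.e. point of $A$. A finite run can only suggest "infinitely often"; it does not prove it.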
Lecture 15 (Oct. 12):
Comments from last time:
1. Recall the characterization of MPT in terms of functions:
Theorem: $T$ is an MPT iff $\int f \circ T \, d\mu = \int f \, d\mu$ for all $f \in L^1$.
– This indicates why, in the defn., we require the measure of $T^{-1}(A)$
to be preserved, instead of $T(A)$; other reasons:
   – parallel with the defn. of measurability
   – allows certain non-invertible transformations to be MPT
– In fact, this result holds for $L^p$, any $1 \le p \le \infty$
(on a finite measure space all such $L^p$ are contained in $L^1$:
$\int |f| \le \int_{|f| \le 1} |f| + \int_{|f| > 1} |f| \le 1 + \int_{|f| > 1} |f|^p$).
Induced operators on $L^p$: $U_T f = f \circ T$.
– Note: $f \circ T \in L^p$ since $|f|^p \in L^1$ and on $L^1$, $U_T$ preserves the
integral.
– most important for $p = 2$: Hilbert space.
– von Neumann, Koopman, Kakutani
2. Recall the Lemma in the Poincaré Recurrence Theorem (1890):
For an MPT $T$ and $\mu(A) > 0$:
Lemma: There exists $n > 0$ such that $\mu(T^{-n}(A) \cap A) > 0$.
Note: an inductive application of the lemma shows that for any
$k > 0$, there exist $n_k > n_{k-1} > \cdots > n_2 > n_1 > 0$ s.t.
$\mu(A \cap T^{-n_1}(A) \cap T^{-n_2}(A) \cap \cdots \cap T^{-n_k}(A)) > 0$.
Multiple recurrence theorem (Furstenberg, 1977):
Let $T$ be an MPT and $k > 0$. Let $\mu(A) > 0$. Then there exists
$n > 0$ s.t.
$\mu(A \cap T^{-n}(A) \cap T^{-2n}(A) \cap \cdots \cap T^{-kn}(A)) > 0$.
Proof: Deep.
This theorem turns out to be equivalent to
Szemerédi's Theorem (1975):
Let $\Lambda \subset \mathbb{N}$ have positive upper density, i.e.,
$\limsup_{n \to \infty} \frac{|\Lambda \cap [1, n]|}{n} > 0$.
Then $\Lambda$ contains arithmetic progressions of arbitrary length.
Proof: delicate combinatorics.
Even non-trivial for $k = 2$.
Will explain the link later (or good topic for a student talk).
For now, corresponding to $\Lambda$, you construct an MPT $T$ on $M$ and
a subset $A \subset M$ s.t. $\Lambda$ has an arithmetic progression of length $k + 1$
iff, for some $n > 0$, $\mu(A \cap T^{-n}(A) \cap T^{-2n}(A) \cap \cdots \cap T^{-kn}(A)) > 0$.
Ergodic theory provided the first proofs for similar results on patterns
that must occur, say, in subsets of $\mathbb{Z}^2$ with "upper positive density."
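To make "contains an arithmetic progression of length $k$" concrete, here is a brute-force search on a finite set. It is only an illustration of the statement, not of either proof; the function name `find_ap` and the test sets are made up for this sketch.

```python
def find_ap(numbers, length):
    """Return the first arithmetic progression a, a+d, ..., a+(length-1)d (d > 0)
    contained in `numbers`, or None if there is none."""
    members = set(numbers)
    ordered = sorted(members)
    for a in ordered:
        for b in ordered:
            d = b - a
            if d <= 0:
                continue
            if all(a + j * d in members for j in range(length)):
                return [a + j * d for j in range(length)]
    return None

multiples_of_3 = range(0, 30, 3)       # density 1/3 in [0, 30)
powers_of_2 = [1, 2, 4, 8, 16]         # a very sparse set
print(find_ap(multiples_of_3, 4), find_ap(powers_of_2, 3))
```

The dense set contains progressions, while the powers of 2 contain no 3-term progression. This matches the role of the density hypothesis but of course says nothing about infinite sets.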
Next, how do you verify measure-preserving?
Check on a small collection of sets (alluded to this last time):
Defn: A semi-algebra $\mathcal{B}$ is a collection of sets s.t.
1. $\mathcal{B}$ is closed under finite intersections
2. for $B \in \mathcal{B}$, $B^c$ is a finite disjoint union of elements of $\mathcal{B}$.
Note: weaker concept than algebra
Examples:
• Intervals in $\mathbb{R}$,
• (literal) rectangles in $\mathbb{R}^2$
• cylinder sets in $F^{\mathbb{Z}}$ or $F^{\mathbb{Z}^+}$:
$A = \{x : x_{i_1} = a_1, \ldots, x_{i_k} = a_k\}$
A semi-algebra $\mathcal{B}$ generates a σ-algebra $\mathcal{A}$ if $\mathcal{A}$ is the smallest
σ-algebra containing $\mathcal{B}$.
Examples: Borel sets in $\mathbb{R}$, $\mathbb{R}^2$ and Borel sets in the product σ-algebras
of $F^{\mathbb{Z}}$, $F^{\mathbb{Z}^+}$.
Theorem:
$T$ is an MPT iff for a generating semi-algebra $\mathcal{B}$ of $\mathcal{A}$, for all
$B \in \mathcal{B}$, $T^{-1}(B) \in \mathcal{A}$ and $\mu(T^{-1}(B)) = \mu(B)$.
Proof:
Only if: obvious
If: argue using the monotone class lemma:
Let
$\mathcal{C} = \{A \in \mathcal{A} : T^{-1}(A) \in \mathcal{A} \text{ and } \mu(T^{-1}(A)) = \mu(A)\}$
Then $\mathcal{C}$ contains $\mathcal{B}$ and hence the algebra generated by $\mathcal{B}$ (the
algebra is the set of all finite disjoint unions of elts. of $\mathcal{B}$).
And $\mathcal{C}$ is a monotone class (i.e., closed under countable increasing
sequences of sets and countable decreasing sequences of sets).
Then $\mathcal{C} = \mathcal{A}$, by the monotone class lemma.
Examples: Check MPT on semi-algebra:
1. Recall: Circle rotation (w.r.t. normalized Lebesgue measure on
the circle)
$M$: circle identified with $[0, 2\pi]$, with normalized Lebesgue measure
$T_\alpha(\theta) = \theta + \alpha \mod 2\pi$
MPT because Lebesgue measure is translation invariant.
Can also be viewed as a map from $M = [0, 2\pi]$ to itself.
Graph has slope 1.
Inverse image of an interval $I$ is one or two intervals whose total
length is $\ell(I)$.
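A small exact check of the "one or two intervals" claim. As an assumption for convenience, the circle is rescaled to $[0, 1)$, so the rotation is $x \mapsto x + \alpha \mod 1$; the helper names `rotation_preimage` and `total_length` are hypothetical.

```python
from fractions import Fraction

def rotation_preimage(a, b, alpha):
    """Preimage of the interval [a, b) in [0, 1) under T(x) = x + alpha mod 1,
    returned as a list of one or two intervals (lo, hi)."""
    length = b - a
    lo = (a - alpha) % 1          # left endpoint of the preimage, reduced mod 1
    hi = lo + length
    if hi <= 1:
        return [(lo, hi)]
    # The preimage wraps around the point 1 ~ 0 and splits into two pieces.
    return [(lo, Fraction(1)), (Fraction(0), hi - 1)]

def total_length(intervals):
    return sum(hi - lo for lo, hi in intervals)

a, b, alpha = Fraction(1, 10), Fraction(3, 10), Fraction(1, 5)
pieces = rotation_preimage(a, b, alpha)
print(pieces, total_length(pieces) == b - a)
```

Here the preimage wraps around and splits into two pieces whose lengths add up to $\ell(I)$.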
2. Doubling map (w.r.t. normalized Lebesgue measure on the
circle)
$M = [0, 1]$, $\mu$ = Lebesgue measure
$T(x) = 2x \mod 1$
Draw graph, which has two pieces of slope 2:
Inverse image of an interval $I$ is the union of two intervals each
with length $(1/2)\ell(I)$.
Note: would not be an MPT if forward images were required to preserve
measure.
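The same kind of exact check for the doubling map, including the failure for forward images. The helper names and the particular intervals are chosen only for illustration.

```python
from fractions import Fraction

def doubling(x):
    """T(x) = 2x mod 1 on [0, 1)."""
    return (2 * x) % 1

def doubling_preimage(a, b):
    """Preimage of [a, b) under T: one piece in [0, 1/2), one in [1/2, 1)."""
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]

a, b = Fraction(1, 4), Fraction(3, 4)
pieces = doubling_preimage(a, b)
preimage_length = sum(hi - lo for lo, hi in pieces)

# Forward image of J = [0, 1/4) is [0, 1/2): twice as long as J.
j_lo, j_hi = Fraction(0), Fraction(1, 4)
image_length = doubling(j_hi) - doubling(j_lo)

print(preimage_length == b - a, image_length == 2 * (j_hi - j_lo))
```

The preimage has the same total length as $I$, while the forward image of a short interval has twice its length.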
3. Recall: Baker’s transformation (w.r.t. Lebesgue measure on
the square)
$M$: unit square with Lebesgue measure
$T(x, y) = (2x \mod 1, (1/2)y)$ if $0 \le x < 1/2$
$T(x, y) = (2x \mod 1, (1/2)y + 1/2)$ if $1/2 \le x < 1$
Draw the inverse image of a rectangle contained in the bottom (blue) or
top (red) half. The inverse image has half the width and twice the height.
If a rectangle intersects both top and bottom, then split it into a top part
and a bottom part.
Since $T$ is an IMPT, can alternatively check $\mu(T(A)) = \mu(A)$.
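A sketch of the rectangle computation in exact arithmetic. Rectangles are encoded as tuples `(a, b, c, d)` meaning $[a, b) \times [c, d)$; that encoding and the helper names are assumptions of this example.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def baker(x, y):
    """Baker's transformation on the unit square, as defined above."""
    if x < HALF:
        return (2 * x) % 1, y / 2
    return (2 * x) % 1, y / 2 + HALF

def baker_preimage(rect):
    """Preimage of a rectangle [a, b) x [c, d) lying in the bottom or top half."""
    a, b, c, d = rect
    if d <= HALF:                       # bottom half: points came from x < 1/2
        return (a / 2, b / 2, 2 * c, 2 * d)
    if c >= HALF:                       # top half: points came from x >= 1/2
        return ((a + 1) / 2, (b + 1) / 2, 2 * c - 1, 2 * d - 1)
    raise ValueError("split the rectangle into top and bottom parts first")

def area(rect):
    a, b, c, d = rect
    return (b - a) * (d - c)

rect = (Fraction(1, 4), Fraction(3, 4), Fraction(1, 2), Fraction(3, 4))
pre = baker_preimage(rect)
print(pre, area(pre) == area(rect))
```

The preimage has half the width and twice the height, so the area is unchanged; mapping the center of `pre` forward with `baker` lands back inside `rect`.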
4. Recall: stationary stochastic process, one-sided or two-sided,
e.g., iid or stationary Markov
say one-sided:
$M = F^{\mathbb{Z}^+} = \{x_0 x_1 x_2 \cdots : x_i \in F\}$
($F$ is a finite alphabet)
– $T = \sigma$, the left shift map:
for a cylinder set
$A = \{x \in M : x_{i_1} = a_1, \ldots, x_{i_k} = a_k\}$
define
$\mu(A) = p(X_{i_1} = a_1, \ldots, X_{i_k} = a_k)$
where $p$ is the law of the process. Extend to the product σ-algebra to define
$\mu$.
$\sigma^{-1}(A) = \{x \in M : x_{i_1 + 1} = a_1, \ldots, x_{i_k + 1} = a_k\}$
$\mu(\sigma^{-1}(A)) = \mu(A)$ (by stationarity)
Note that sometimes $\mu(\sigma(A)) \neq \mu(A)$, e.g.,
$A = \{x : x_0 = 1\}$, $\sigma(A) = M$, the entire space
(the coordinate $x_0$ falls off the cliff under the left shift).
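A sketch for the iid case only. Cylinders are encoded as dictionaries `{coordinate: symbol}`, and the marginal distribution `p` is a hypothetical choice; all names are made up for this example.

```python
from fractions import Fraction

def iid_measure(cylinder, p):
    """Measure of the cylinder {x : x_i = a for (i, a) in cylinder} under the
    iid (product) measure with marginal probabilities p[a]."""
    result = Fraction(1)
    for symbol in cylinder.values():
        result *= p[symbol]
    return result

def shift_preimage(cylinder):
    """sigma^{-1} of a cylinder: the constraint on x_i moves to x_{i+1}."""
    return {i + 1: a for i, a in cylinder.items()}

def shift_image(cylinder):
    """sigma of a cylinder in the one-sided full shift: the constraint on x_0
    is forgotten and every other constraint moves down by one."""
    return {i - 1: a for i, a in cylinder.items() if i >= 1}

p = {0: Fraction(1, 3), 1: Fraction(2, 3)}   # hypothetical marginal distribution
A = {0: 1, 2: 0}                             # the cylinder {x_0 = 1, x_2 = 0}
print(iid_measure(A, p), iid_measure(shift_preimage(A), p))

first_symbol_is_1 = {0: 1}
print(iid_measure(first_symbol_is_1, p), iid_measure(shift_image(first_symbol_is_1), p))
```

The preimage has the same measure as $A$, while the forward image of $\{x_0 = 1\}$ is the empty constraint, i.e., the whole space, of measure 1.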
—————————————
The link from Furstenberg to Szemerédi: Given $\Lambda$, let $\chi_\Lambda$ be its
characteristic function on $\mathbb{N}$. Let $M$ be the subset of all $x \in \{0, 1\}^{\mathbb{N}}$
such that every finite word in $x$ appears in $\chi_\Lambda$ infinitely often. Let
$\mu$ be the measure on $M$ which counts frequency of 1's. Then apply
MRT to $\sigma$ on $(M, \mathcal{A}, \mu)$ (where $\sigma$ is the left shift).
Lecture 16 (Oct. 14):
Defn: An MPT $T$ is ergodic if whenever $\mu(A) > 0$, then a.e.
$x \in M$ visits $A$, i.e., $\mu(\bigcup_{n=1}^{\infty} T^{-n}(A)) = 1$.
Note: in fact, a.e. $x \in M$ visits $A$ infinitely often:
Why?
Let $B = \bigcup_{n=1}^{\infty} T^{-n}(A)$. Then $\mu(\bigcap_{k=0}^{\infty} T^{-k}(B)) = 1$.
TFAE:
1 $T$ is ergodic
2 If $\mu(A), \mu(B) > 0$, then for some $n > 0$, $\mu(T^{-n}(A) \cap B) > 0$.
3 If $T^{-1}(A) \subseteq A$, then $\mu(A) = 0$ or 1.
3' If $\mu(T^{-1}(A) \setminus A) = 0$, then $\mu(A) = 0$ or 1.
4 (the usual definition) If $T^{-1}(A) = A$, then $\mu(A) = 0$ or 1.
4' If $\mu(T^{-1}(A) \Delta A) = 0$, then $\mu(A) = 0$ or 1.
5p If $f \in L^p$ and $f \circ T = f$, then $f$ is constant a.e. (here, $0 \le p \le \infty$)
5p' If $f \in L^p$ and $f \circ T = f$ a.e., then $f$ is constant a.e. (here, $0 \le p \le \infty$)
Note:
– 1 and 2 express universal explorer
– 3 and 4 express irreducibility; you can’t split the space into two
non-trivial invariant pieces
– 5 expresses a functional analysis version: the number 1 is a
simple eigenvalue of the induced operator on $L^p$, $1 \le p < \infty$
Note: 3 is equivalent to: if $T^{-1}(A) \supseteq A$, then $\mu(A) = 0$ or 1.
Proof (exercise): Assume 3.
If $T^{-1}(A) \supseteq A$, then $T^{-1}(A^c) = (T^{-1}(A))^c \subseteq A^c$.
Thus, $\mu(A^c) = 0$ or 1. Thus $\mu(A) = 0$ or 1.
Note: Most important equivalent defns. left out; need ergodic
theorem.
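Proof:
1 implies 2: the measure of the intersection of a set of measure 1
and a set of measure $b$ is $b$. Thus,
$0 < \mu(B) = \mu(B \cap \bigcup_{n=1}^{\infty} T^{-n}(A)) \le \sum_{n=1}^{\infty} \mu(B \cap T^{-n}(A))$
and so some term of the sum is positive.
2 implies 3: If $T^{-1}(A) \subseteq A$, then $\bigcup_{n=1}^{\infty} T^{-n}(A) \subseteq A$.
Let $B = A^c$. If $0 < \mu(A) < 1$, then $\mu(A), \mu(B) > 0$.
Then, by 2, for some $n > 0$, $\mu(B \cap T^{-n}(A)) > 0$. But $\bigcup_{n=1}^{\infty} T^{-n}(A)$
and $B$ are disjoint, a contradiction.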
3 implies 3': Let $B = T^{-1}(A) \setminus A$. Then $\mu(B) = 0$.
Let $C = \bigcup_{n=0}^{\infty} T^{-n}(B)$. Then $\mu(C) = 0$.
Let $D = A \cup C$.
Then $T^{-1}(D) \subseteq D$:
$T^{-1}(D) = T^{-1}(A) \cup T^{-1}(C) \subseteq (A \cup B) \cup C = A \cup C = D$
By 3, $\mu(D) = 0$ or 1.
But $\mu(D \Delta A) \le \mu(C) = 0$.
Thus, $\mu(A) = 0$ or 1.
3' implies 3: obvious
4' implies 4: obvious
3 implies 4: obvious
3' implies 4': obvious
4 implies 5p: Let $A_r = f^{-1}((-\infty, r])$.
Then $T^{-1}(A_r) = A_r$ and so, by 4, $\mu(A_r) = 0$ or 1.
But the $A_r$ are increasing and
$1 = \mu(\bigcup_r A_r) = \lim_{r \to \infty} \mu(A_r)$.
Let $r_0 = \inf\{r : \mu(A_r) = 1\}$. Then $f = r_0$ a.e.
Note: use the fact that distribution functions are right continuous.
4' implies 5p': similar
5p' implies 5p: obvious
5p implies 4: let $f = \chi_A$.
4 implies 1:
Proof: Let $\mu(A) > 0$ and $B = \bigcup_{n=1}^{\infty} T^{-n}(A)$.
Let $C = \bigcap_{k=0}^{\infty} T^{-k}(B)$.
Enough to show that $\mu(C) = 1$ because $T^{-k}(B)$ is a decreasing
sequence of sets all with the same measure.
Observe that $T^{-1}(C) = C$ because $T^{-k}(B)$ is a decreasing sequence.
Thus, $\mu(C) = 0$ or 1. If $\mu(C) = 0$, then $\mu(B) = 0$ and so
$\mu(A) = 0$, a contradiction.
Note: $C$ is the lim sup of the sets $T^{-n}(A)$.
Lecture 17 (Oct. 17):
Midterm review: Friday, Oct. 21, 4:15 PM, Math 126
Lebesgue ∼ 1902. So, how could Poincare (1890) prove something
about measure theory before measure theory was really developed?
Recall: equivalent conditions for ergodicity.
Note: If $T$ is an IMPT, can replace $T^{-1}$ with $T$.
Note: If $T$ is not ergodic, it can be decomposed into ergodic pieces
(ergodic decomposition: Keller, sec. 2.3).
Check examples:
Example 1: Rotation of Circle
View as $T = T_\alpha(z) = az$ on $M = \{|z| = 1\}$,
where $a = e^{i\alpha}$,
with measure $\mu(A) = (1/(2\pi)) \, \mathrm{Lebesgue}((\log A)/i)$.
Note this is an IMPT.
Case 1: $\alpha/(2\pi) = p/q \in \mathbb{Q}$,
$\gcd(p, q) = 1$:
$\{1, a, a^2, \ldots, a^{q-1}\}$ is invariant under $T_\alpha$.
Using rigidity of $T$, a thickening of this set is invariant under $T_\alpha$
and has measure in $(0, 1)$.
Thus, $T$ is not ergodic.
Alternatively, the function $f(z) = z^q$ satisfies
$f \circ T(z) = (az)^q = z^q = f(z)$, but is not constant a.e.
Case 2: $\alpha/(2\pi) \notin \mathbb{Q}$:
Then for all $n \in \mathbb{Z} \setminus \{0\}$, $a^n \neq 1$.
Apply condition 5p' with $p = 2$:
An orthonormal basis of functions for $L^2(M)$ is $\{z^n : n \in \mathbb{Z}\}$ and
each $f \in L^2(M)$ is represented by its Fourier series:
$f = \sum_{n=-\infty}^{\infty} b_n z^n$ in $L^2$
Then, since $T$ is an MPT,
$f \circ T = \sum_{n=-\infty}^{\infty} b_n (z^n \circ T) = \sum_{n=-\infty}^{\infty} a^n b_n z^n$ in $L^2$
So, if $f \circ T = f$ a.e., then $f \circ T = f$ in $L^2$, and so each $(a^n - 1) b_n = 0$.
Since for $n \neq 0$, $a^n - 1 \neq 0$, we have for each $n \neq 0$, $b_n = 0$,
and so $f$ is constant a.e.
So, $T$ is ergodic.
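A floating-point look at the dichotomy between the two cases. For $|z| = 1$ we have $|(az)^n - z^n| = |a^n - 1|$, so the monomial $z^n$ is invariant exactly when $a^n = 1$. The values $\alpha = 2\pi/3$ and $\alpha = 1$ and the name `invariance_defect` are hypothetical choices for this sketch.

```python
import cmath
import math

def invariance_defect(alpha, n):
    """|(az)^n - z^n| for |z| = 1 and a = e^{i alpha}; this equals |a^n - 1|."""
    a = cmath.exp(1j * alpha)
    return abs(a ** n - 1)

# Rational case alpha/(2 pi) = 1/3: z^3 is invariant (defect is zero up to rounding).
rational_defect = invariance_defect(2 * math.pi / 3, 3)

# Irrational case alpha = 1 (so alpha/(2 pi) is irrational): check n = 1..200.
irrational_defects = [invariance_defect(1.0, n) for n in range(1, 201)]

print(rational_defect, min(irrational_defects))
```

In the irrational case the defect gets small for some $n$ but never vanishes, which is what forces $b_n = 0$ for $n \neq 0$. A finite range of $n$ is only suggestive.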
Example 4 (special case): iid (2-sided or 1-sided)
Proof of ergodicity: (alternative proof, apply $L^2$ strong law. ???)
Will need the fact that $|\mu(A) - \mu(B)| \le \mu(A \Delta B)$.
Recall $\mathcal{A}$ is the product σ-algebra on $F^{\mathbb{Z}}$ or $F^{\mathbb{Z}^+}$.
Apply condition 4.
Suppose $A = \sigma^{-1}(A)$ for some $A \in \mathcal{A}$.
Suffices to show: $\mu(A) = \mu(A)^2$.
Let $\mathcal{B}$ be the algebra of finite disjoint unions of cylinder sets (not
the semi-algebra of cylinder sets).
Given $\epsilon > 0$, there exists $B \in \mathcal{B}$ such that
$\mu(A \Delta B) < \epsilon$.
Choose $n$ s.t. the cylinder coordinates of $B$ are disjoint from those
of $\sigma^{-n}(B)$.
Let
$C = \sigma^{-n}(B)$
Then, by independence,
$\mu(B \cap C) = \mu(B)\mu(C) = \mu(B)^2$.
Now,
$\mu(A \Delta C) = \mu(\sigma^{-n}(A) \Delta \sigma^{-n}(B)) = \mu(A \Delta B) < \epsilon$.
Since symmetric difference is a metric (on measurable sets, mod
measure zero), it follows that
$\mu(B \Delta C) \le \mu(B \Delta A) + \mu(A \Delta C) < 2\epsilon$.
Thus,
$|\mu(A) - \mu(A)^2| \le |\mu(A) - \mu(B)| + |\mu(B) - \mu(B \cap C)| + |\mu(B \cap C) - \mu(A)^2|$
$\le \mu(A \Delta B) + \mu(B \Delta C) + |\mu(B)^2 - \mu(A)^2|$
$\le \epsilon + 2\epsilon + 2\epsilon$.
Since $\epsilon$ is arbitrary, $\mu(A) = \mu(A)^2$, i.e., $\mu(A) = 0$ or 1.
In short: $\mu(B)^2 = \mu(B \cap C) \approx \mu(B)$ and $\mu(A) \approx \mu(B)$, so $\mu(A)^2 \approx \mu(A)$.
Note!: all we really needed is asymptotic independence for cylinder
sets. Will pick up on this thread later.
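The key step $\mu(B \cap \sigma^{-n}(B)) = \mu(B)^2$ can be checked exactly for a single cylinder. Cylinders are encoded as dictionaries `{coordinate: symbol}`; that encoding, the helper names, and the marginal distribution `p` are assumptions of this sketch.

```python
from fractions import Fraction

def iid_measure(cylinder, p):
    """iid (product) measure of the cylinder {x : x_i = a for (i, a) in cylinder}."""
    result = Fraction(1)
    for symbol in cylinder.values():
        result *= p[symbol]
    return result

def shift_preimage(cylinder, n):
    """sigma^{-n} of a cylinder: each constrained coordinate moves up by n."""
    return {i + n: a for i, a in cylinder.items()}

def intersect(c1, c2):
    """Intersection of two cylinders, or None if their constraints conflict."""
    merged = dict(c1)
    for i, a in c2.items():
        if merged.get(i, a) != a:
            return None
        merged[i] = a
    return merged

p = {0: Fraction(1, 3), 1: Fraction(2, 3)}   # hypothetical marginal distribution
B = {0: 1, 1: 0, 3: 1}                       # cylinder on coordinates 0, 1, 3
C = shift_preimage(B, 4)                     # coordinates 4, 5, 7: disjoint from B's
print(iid_measure(intersect(B, C), p) == iid_measure(B, p) ** 2)
```

With a shift of only 1 the coordinates overlap and the constraints here conflict, so `intersect(B, shift_preimage(B, 1))` is `None`; independence only kicks in once the coordinates are disjoint.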
Non-example:
$\mu = (1/2)\,\mathrm{iid}(p) + (1/2)\,\mathrm{iid}(q)$ where $p$ and $q$ are distinct prob.
vectors of length 2.
Say one-sided shift.
We claim that w.r.t. $\mu$, $\sigma$ satisfies all equiv. defns. of ergodicity
for all cylinder sets or even the algebra of finite unions of cylinder sets
(e.g., version 1).
However, it is not ergodic.
Proof: Check condition 1:
Let $A$ be a cylinder set. Since $\mu$ dominates iid(p), iid(q), $\mu(A) > 0$.
Since iid(p), iid(q) are ergodic, for $\nu = \mathrm{iid}(p), \mathrm{iid}(q)$,
$\nu(\bigcup_{n=1}^{\infty} T^{-n}(A)) = 1$ (here $T = \sigma$)
and so
$\mu(\bigcup_{n=1}^{\infty} T^{-n}(A)) = 1$.
However, let
$A = \{x \in F^{\mathbb{Z}^+} : \lim_{n \to \infty} \frac{x_0 + \cdots + x_{n-1}}{n} = q_1\}$
Then by the strong law of large numbers $\mu(A) \ge 1/2$. But $A^c$ contains
$B = \{x \in F^{\mathbb{Z}^+} : \lim_{n \to \infty} \frac{x_0 + \cdots + x_{n-1}}{n} = p_1\}$
and $\mu(B) \ge 1/2$ for the same reason.
Thus, $\mu(A) = 1/2$. But clearly $\sigma^{-1}(A) = A$. So, $\sigma$ is not ergodic
w.r.t. $\mu$.
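A simulation sketch of the invariant set $A$: along a single sample path of the mixture, the frequency of 1's settles near $p_1$ or near $q_1$, never near the average of the two. The values `p1 = 0.2`, `q1 = 0.8`, the sample sizes, and the function name are hypothetical choices.

```python
import random

def sample_time_average(p1, q1, n, rng):
    """Draw one sequence from (1/2) iid(p) + (1/2) iid(q) on {0, 1} and return
    the time average (x_0 + ... + x_{n-1}) / n. Here p1, q1 are P(x_i = 1)."""
    prob_one = p1 if rng.random() < 0.5 else q1   # pick a component once, then keep it
    ones = sum(1 for _ in range(n) if rng.random() < prob_one)
    return ones / n

rng = random.Random(0)                 # fixed seed so the run is repeatable
p1, q1 = 0.2, 0.8                      # hypothetical values with p1 != q1
averages = [sample_time_average(p1, q1, 20000, rng) for _ in range(8)]
print([round(avg, 3) for avg in averages])
```

Each printed value is close to 0.2 or to 0.8, so the limit of the time average is a non-constant invariant function, consistent with $\sigma$ not being ergodic w.r.t. $\mu$.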
Lecture 18 (Oct. 19)
Can show directly that the doubling map and Baker's transformation are ergodic. But
here is an indirect way.
We can consider measure-preserving maps and invertible measure-preserving maps from one probability space to another.
Defn: If $T$ is an MPT on $(M, \mathcal{A}, \mu)$ and $S$ is an MPT on $(N, \mathcal{B}, \nu)$,
we say $T$ is isomorphic to $S$ if there exists an invertible measure-preserving map $\varphi$ from $M$ to $N$ such that
$\varphi \circ T = S \circ \varphi$
Major Problem: classify MPT's up to isomorphism.
Prop: Ergodicity is an isomorphism invariant.
Proof: Suppose $T$ is ergodic.
Let $B \in \mathcal{B}$ s.t. $S^{-1}(B) = B$. Let $A = \varphi^{-1}(B)$.
Then
$T^{-1}(A) = T^{-1} \circ \varphi^{-1}(B) = (\varphi \circ T)^{-1}(B)$
$= (S \circ \varphi)^{-1}(B) = \varphi^{-1} \circ S^{-1}(B) = \varphi^{-1}(B) = A$.
Thus, $\mu(A) = 0$ or 1. Thus, $\nu(B) = 0$ or 1.
Example 3: Recall: Isomorphism of Baker and uniform i.i.d. binary two-sided process
$\varphi(\cdots x_{-1} \hat{x}_0 x_1 x_2 \cdots) = (.x_0 x_1 x_2 \cdots, \; .x_{-1} x_{-2} \cdots)$ (in binary)
Example 2: Isomorphism of doubling map and uniform i.i.d. binary one-sided process
$\varphi(x_0 x_1 x_2 \cdots) = .x_0 x_1 x_2 \cdots$ (in binary)
So, both are ergodic.
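The intertwining relation $\varphi \circ \sigma = T \circ \varphi$ for Example 2 can be checked exactly on finite binary words (treated as sequences padded with zeros). This is a sanity check of the formula only; it does not address invertibility a.e. or measure preservation. The helper names are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

def phi(bits):
    """Binary expansion map: x_0 x_1 x_2 ... -> 0.x_0 x_1 x_2 ... (finite word,
    implicitly padded with zeros)."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

def shift(bits):
    """Left shift sigma: drop x_0."""
    return bits[1:]

def doubling(x):
    """T(x) = 2x mod 1."""
    return (2 * x) % 1

# Check phi(sigma(x)) == T(phi(x)) on every binary word of length 6.
words = list(product([0, 1], repeat=6))
conjugacy_holds = all(phi(shift(w)) == doubling(phi(w)) for w in words)
print(conjugacy_holds)
```

Doubling a binary expansion moves the binary point one place to the right, and reducing mod 1 discards the digit $x_0$, which is exactly the left shift.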
Ergodic Theorem:
Let $T$ be an MPT. Let $f \in L^1$.
1. The limit $\lim_{n \to \infty} (1/n) \sum_{i=0}^{n-1} f \circ T^i(x)$ exists for a.e. $x$. Call the
limit function $f^*(x)$.
2. $f^* \in L^1$
3. $f^* \circ T = f^*$ a.e.
4. Let $\mathcal{I}$ be the σ-algebra of invariant sets mod 0. If $A \in \mathcal{I}$, then
$\int_A f^* \, d\mu = \int_A f \, d\mu$
5. $f^* = E(f \mid \mathcal{I})$
6. $\lim_{n \to \infty} (1/n) \sum_{i=0}^{n-1} f \circ T^i(x) = f^*(x)$ in $L^1$.
7. If $f \in L^p$, then $\lim_{n \to \infty} (1/n) \sum_{i=0}^{n-1} f \circ T^i(x) = f^*(x)$ in $L^p$.
8. If $T$ is ergodic, then
$f^* = \int_M f \, d\mu$ a.e. $x$
(time average = space average)
Note: item 3 is clear from the defn.
Note: item 5 follows from items 3 and 4, because 3 implies that
each set $(f^*)^{-1}((-\infty, r]) \in \mathcal{I}$ and so $f^*$ is $\mathcal{I}$-measurable.
Note: item 8 follows since if $T$ is ergodic, then $\mathcal{I}$ is trivial and so
$f^* = E(f \mid \mathcal{I})$ is constant a.e.
But also $\int_M f^* = \int_M f$ and so $f^* = \int_M f$ a.e.
Note that this recovers the strong law of large numbers for iid
processes.
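A numerical sketch of "time average = space average" and of its failure without ergodicity. It assumes the circle rotation $x \mapsto x + \alpha \mod 1$ on $[0, 1)$, with $f$ the characteristic function of $[0, 0.1)$ (space average 0.1). The rotation numbers, starting point, and function names are hypothetical choices, and a finite average only approximates the limit.

```python
import math

def birkhoff_average(f, alpha, x0, n):
    """Time average (1/n) * sum_{i<n} f(T^i(x0)) for the rotation T(x) = x + alpha mod 1."""
    return sum(f((x0 + i * alpha) % 1.0) for i in range(n)) / n

def indicator(x):
    """f = characteristic function of [0, 0.1); its space average is 0.1."""
    return 1.0 if 0.0 <= x < 0.1 else 0.0

n = 10000
irrational = birkhoff_average(indicator, math.sqrt(2) - 1, 0.05, n)  # ergodic case
rational = birkhoff_average(indicator, 0.25, 0.05, n)                # non-ergodic case
print(irrational, rational)
```

For the irrational rotation the time average is close to the space average 0.1. For $\alpha = 1/4$ the orbit of 0.05 has period 4 and spends a quarter of its time in $[0, 0.1)$, so the time average is 0.25: the limit $f^*$ exists but is not the constant $\int f \, d\mu$.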
Lecture 19 (Oct. 21)
Midterm Review: Today at 4:15 in Math 126.
Proof of Ergodic Theorem: Part 1 is the guts.
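By splitting into positive and negative parts, WMA $f \ge 0$.
Let
$S_n f(x) = \sum_{i=0}^{n-1} f \circ T^i(x)$, $A_n f(x) = (1/n) S_n f(x)$,
$A^+ f(x) = \limsup_{n \to \infty} A_n f(x)$, $A^- f(x) = \liminf_{n \to \infty} A_n f(x)$
Show: $A^+ f = A^- f$ a.e.
Will show: (*) $\int f \, d\mu \ge \int A^+ f \, d\mu$.
It will follow that $\int f \, d\mu \le \int A^- f \, d\mu$.
Thus, $\int A^+ f \, d\mu \le \int A^- f \, d\mu$; since $A^- f \le A^+ f$, we get $A^+ f = A^- f$ a.e.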
Proof of (*):
Let $r > 0$ be large and $0 < \epsilon < 1$ be small.
Let $H = H_{r,\epsilon} = (1 - \epsilon) \min(A^+ f, r)$. (a lower approx. to $A^+ f$)
Picture:
Clearly, $A^+ f \circ T = A^+ f$.
Thus, $H \circ T = H$.
Let $\tau(x) = \min\{n \ge 1 : A_n f(x) \ge H(x)\}$.
Note $\tau < \infty$ a.e.
Picture:
Let $M > 0$ be large.
Define inductively: $\tau_k = \tau_k(x)$ and $t_k = t_k(x)$.
$\tau_0 = 0$, $t_0 = 0$.
$\tau_k = \tau(T^{t_{k-1}}(x))$ if $\tau(T^{t_{k-1}}(x)) \le M$, and $\tau_k = 1$ otherwise.
$t_k = t_{k-1} + \tau_k$.
Picture:
$S_n f(x) \ge \sum_{k : t_k \le n} S_{\tau_k} f(T^{t_{k-1}}(x))$
$\ge H(x) \sum_{k : t_k \le n} \tau_k - H(x) \sum_{k : t_k \le n, \; \tau(T^{t_{k-1}}(x)) > M} 1$
$\ge H(x)(n - M) - H(x) S_n \chi_{\tau > M}(x) \ge H(x)(n - M) - r S_n \chi_{\tau > M}(x)$
So,
$S_n f(x) \ge H(x)(n - M) - r S_n \chi_{\tau > M}(x)$
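Integrate and divide by $n$:
$\int f \, d\mu \ge \frac{n - M}{n} \int (1 - \epsilon) \min(A^+ f, r) \, d\mu - r \mu(\{\tau > M\})$
Let $n \to \infty$:
$\int f \, d\mu \ge \int (1 - \epsilon) \min(A^+ f, r) \, d\mu - r \mu(\{\tau > M\})$
Let $M \to \infty$:
$\int f \, d\mu \ge (1 - \epsilon) \int \min(A^+ f, r) \, d\mu$
Let $\epsilon \to 0$:
$\int f \, d\mu \ge \int \min(A^+ f, r) \, d\mu$
Let $r \to \infty$ and apply MCT:
$\int f \, d\mu \ge \int A^+ f \, d\mu$
as desired.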
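Now, we show: (**) $\int f \, d\mu \le \int A^- f \, d\mu$.
Proof: First, assume $f$ is bounded: $f \le N$.
By (*), applied to $N - f \ge 0$,
$\int (N - f) \, d\mu \ge \int A^+(N - f) \, d\mu = \int (N + A^+(-f)) \, d\mu = \int (N - A^- f) \, d\mu$,
which is equivalent to:
$N - \int f \, d\mu \ge N - \int A^- f \, d\mu$
Thus, $\int f \, d\mu \le \int A^- f \, d\mu$.
In the general case, let $f_N = \min(f, N)$. Then
$\int A^- f \, d\mu \ge \int A^- f_N \, d\mu \ge \int f_N \, d\mu$
Let $N \to \infty$ and apply MCT. This gives (**).
Then, as outlined above, we get $A^+ f = A^- f$ a.e., and this proves
1.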
In the course of proving this, we have shown $\int f^* \, d\mu = \int f \, d\mu$ and
thus $f^* \in L^1$, giving 2.
As mentioned above, 3 follows from 1.
4:
If $A \in \mathcal{I}$ and $\mu(A) = 0$, this clearly holds.
If $\mu(A) > 0$, apply the result $\int f^* \, d\mu = \int f \, d\mu$ to the MPT $T|_A$.
As mentioned above, 5 follows from 3 and 4.
6: uses a generalized version of DCT.
7: von Neumann's mean ergodic theorem generalized from Hilbert
space to Banach space.
8: as mentioned above, easily follows from 1 and 5.