CHAPTER 2

Structure theory for p.m.p. actions

As shown in Theorem 1.23, every unitary representation $\pi : G \to B(H)$ of a group on a Hilbert space decomposes uniquely into a sum of a weakly mixing representation and a compact representation. The weakly mixing and compact vectors each form a closed $G$-invariant subspace of $H$, and these subspaces are orthogonal and have direct sum equal to $H$. For a p.m.p. action $G \curvearrowright (X,\mu)$ one can apply this decomposition to the associated Koopman representation of $G$ on $L^2(X)$, but this does not neatly translate back into a description involving the space $X$. The problem is that $L^2(X)$ lacks the algebraic structure of $L^\infty(X)$, which needs to be taken into account if we wish to faithfully encode the dynamics at the functional-analytic level. In particular, working in $L^\infty(X)$ permits us to see factors and extensions, which we expect to play a role in the structure theory of p.m.p. actions analogous to that of closed invariant subspaces and superspaces in the theory of unitary representations.

It turns out that this structure theory hinges on neither $L^2$ nor $L^\infty$ phenomena alone, but rather on the interplay between these. This interplay is a hallmark of the theory of von Neumann algebras, not least in the deformation-rigidity theory of Popa which appears in Chapter 5 in the dynamical form of cocycle superrigidity. In fact the results we present in Sections 2.1 and 2.2 admit versions for the general von Neumann algebra context, with p.m.p. actions replaced by actions preserving a faithful normal tracial state, but one needs to modify the treatment of conditional compactness (see Remark 2.16).
The approach taken here, which is special to the commutative case, yields the stronger conclusions which are necessary to prove the multiple recurrence theorem in Section 2.3. We do not, however, carry out the most refined possible analysis of compact extensions, which relies on measure disintegration (which we have avoided), and we work with a definition of compact extension that is formally weaker than (but logically equivalent to) the ones commonly encountered in the literature.

The dynamical notion of compactness provides the most immediate point of contact between $L^2(X)$ and $L^\infty(X)$ from the perspective of factors and extensions. If we view $L^\infty(X)$ as a subspace of $L^2(X)$, then the compact vectors (Definition 1.21) which live in $L^\infty(X)$ form not merely a $G$-invariant linear subspace but also a conjugation-invariant subalgebra and hence naturally describe a factor $X \to Y$. This is not the case for the weakly mixing vectors in $L^\infty(X)$, which reflects a fundamental asymmetry in the dichotomy between weak mixing and compactness for actions that does not appear in the unitary representation framework. We can, however, relativize the notion of weak mixing to $G$-extensions using Hilbert modules and then ask whether every action is a weakly mixing extension of a compact action. The answer to this is no in general, but it will be true that every $G$-extension $X \to Y$ which fails to be weakly mixing admits an intermediate factor $X \to Y' \to Y$ such that the extension $Y' \to Y$ is nontrivial and compact in a suitable conditional sense. Once we know this, a simple maximality argument then yields the Furstenberg-Zimmer structure theorem, which expresses every p.m.p. action as a weakly mixing extension of an action that decomposes into a tower of compact extensions indexed by a countable ordinal.
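The shape of the tower just described can be sketched schematically as follows. This is only an illustration in the notation of Theorem 2.15 below; the labels on the arrows are informal annotations, and `\xrightarrow` assumes the `amsmath` package.

```latex
% Schematic of the Furstenberg-Zimmer tower (notation as in Theorem 2.15).
% Y_0 is the base of the tower; for a single action one would take Y_0
% to be a one-point system (an assumption made here for illustration).
\[
X \xrightarrow{\ \text{weakly mixing}\ } Y_\lambda \to \cdots \to
Y_{\theta+1} \xrightarrow{\ \text{compact}\ } Y_\theta \to \cdots \to
Y_1 \xrightarrow{\ \text{compact}\ } Y_0 .
\]
% At a limit ordinal \theta, the stage Y_\theta is the one generated
% by all of the earlier stages Y_{\theta'} with \theta' < \theta.
```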
After setting up the preliminaries in Section 2.1, we prove the conditional version of the dichotomy between weak mixing and compactness in Lemma 2.11 and then deduce the Furstenberg-Zimmer structure theorem in Theorem 2.15. In Section 2.3 we specialize to integer actions and use the Furstenberg-Zimmer structure theorem to prove Furstenberg's multiple recurrence theorem (Theorem 2.25), which is a deep strengthening of Poincaré recurrence. As a straightforward consequence we deduce Szemerédi's theorem (Theorem 2.26), which states that every subset $A$ of $\mathbb{Z}$ satisfying the positive upper density condition
\[
\limsup_{n\to\infty} \frac{1}{2n+1}\, |A \cap \{-n, -n+1, \dots, n\}| > 0
\]
contains arbitrarily long arithmetic progressions.

Szemerédi's theorem was conjectured by Erdős and Turán as a strengthening of the van der Waerden theorem, which says that if the integers are partitioned into finitely many pieces then at least one of the pieces contains arbitrarily long arithmetic progressions [136]. Van der Waerden's theorem can be established using methods of topological dynamics by observing it to be equivalent to a multiple recurrence property for minimal $\mathbb{Z}$-actions. Szemerédi established Erdős and Turán's conjecture by a difficult combinatorial argument [129], and afterward Furstenberg developed the ergodic-theoretic approach that we present here. Furstenberg's treatment has been enormously influential and inspired many generalizations and related results. One highlight within this trajectory of ideas is Green and Tao's theorem on the existence of arbitrarily long arithmetic progressions in the primes [63].

2.1. Hilbert modules from factors of probability spaces

In order to formulate and prove the conditional version of the dichotomy between weak mixing and compactness in Lemma 2.11, we need to describe the Hilbert module
$L^2(X|Y)$ associated to a measure-preserving factor map $X \to Y$ between probability spaces and then collect some basic facts. The first of these facts asserts that conditional precompactness in the normed $L^\infty(Y)$-module $L^2(X|Y)$ implies an approximation in terms of finite orthonormal sets in $L^2(X|Y)$ (Proposition 2.3), the second is a description of $L^2(X|Y)$ using tensor products (Propositions 2.4 and 2.5), and the third concerns rank-one operators $L^2(X|Y) \to L^2(X)$ (Proposition 2.6). Group actions will not enter the picture until Section 2.2.

Let $\varphi : (X,\mu) \to (Y,\nu)$ be a measure-preserving factor map between probability spaces. We regard $L^\infty(Y)$ as a von Neumann subalgebra of $L^\infty(X)$ through the composition map $f \mapsto f \circ \varphi$. We write $1_A$ for the indicator function of a measurable subset $A$ of $X$ or $Y$. Given that we are viewing $L^\infty(Y)$ as sitting in $L^\infty(X)$, for a measurable set $A \subseteq Y$ we could also write $1_{\varphi^{-1}(A)}$, but we will stick with the simpler $1_A$, especially since the map $\varphi$ will typically not be named.

Write $E_Y$ for the conditional expectation $L^2(X) \to L^2(Y)$, which is the orthogonal projection. Note that, as orthogonal projections are self-adjoint, for all $f \in L^\infty(X)$ we have
\[
\int_X E_Y(f)\, d\mu = \langle E_Y(f), 1\rangle = \langle f, E_Y(1)\rangle = \langle f, 1\rangle = \int_X f\, d\mu. \tag{10}
\]
Moreover,
(i) $E_Y$ maps $L^\infty(X)$ contractively onto $L^\infty(Y)$, and
(ii) $E_Y$ is completely positive, meaning that for every $n \in \mathbb{N}$ the map $(f_{ij})_{ij} \mapsto (E_Y(f_{ij}))_{ij}$ from the matrix algebra $M_n(L^\infty(X))$ to $M_n(L^\infty(Y))$ preserves positivity, where positivity for elements $f \in M_n(L^\infty(X))$ means that $\langle f\xi, \xi\rangle \ge 0$ for all $\xi \in L^2(X)^{\oplus n}$, and similarly for elements of $M_n(L^\infty(Y))$.

To see (i), let $f \in L^\infty(X)$ and observe that for all $g, h \in L^\infty(Y)$ we have
\[
|\langle E_Y(f)g, h\rangle| = |\langle E_Y(f), h\bar g\rangle| = |\langle f, E_Y(h\bar g)\rangle| = |\langle f, h\bar g\rangle| = |\langle fg, h\rangle| \le \|f\|\,\|g\|_2\,\|h\|_2,
\]
so that $\|E_Y(f)(\nu(D)^{-1/2} 1_D)\|_2 \le \|f\|$ for every set $D \subseteq Y$ with $\nu(D) > 0$, which shows that $E_Y(f)$ is an element of $L^\infty(Y)$ of norm at most $\|f\|$. To see (ii), if $(a_{ij})_{ij}$ is a positive element in $M_n(L^\infty(X))$ then for all $g_1, \dots, g_n \in L^\infty(Y)$ we have
\[
\sum_{i,j} \langle E_Y(a_{ij}) g_j, g_i\rangle = \sum_{i,j} \langle E_Y(a_{ij}), g_i \bar g_j\rangle = \sum_{i,j} \langle a_{ij}, g_i \bar g_j\rangle = \sum_{i,j} \langle a_{ij} g_j, g_i\rangle \ge 0,
\]
and by approximation we see that this positivity also holds when the $g_i$ lie more generally in $L^2(Y)$.

We view $L^\infty(X)$ as an $L^\infty(Y)$-module via the multiplication $(f,g) \mapsto fg$, and we define on it the $L^\infty(Y)$-valued inner product
\[
\langle f, g\rangle_Y := E_Y(f\bar g).
\]
Write $L^2(X|Y)$ for the Hilbert $L^\infty(Y)$-module obtained by completing $L^\infty(X)$ according to Proposition B.5 with respect to the norm $\|f\| := \|\langle f, f\rangle_Y\|^{1/2}$. Although we are using the notation $\|\cdot\|$ for both the Hilbert module norm on $L^2(X|Y)$ and the $L^\infty$ norm on $L^\infty(X)$ and $L^\infty(Y)$, the context should make it clear which one we mean. The norm on $L^2(X)$ on the other hand will be denoted by $\|\cdot\|_2$.

We have the natural inclusions
\[
L^\infty(X) \subseteq L^2(X|Y) \subseteq L^2(X).
\]
To see the second inclusion, first note that for all $f \in L^\infty(X)$ we have, using (10),
\[
\|f\|_2^2 = \langle f, f\rangle = \int_X f\bar f\, d\mu = \int_X E_Y(f\bar f)\, d\mu \le \|E_Y(f\bar f)\| = \|\langle f, f\rangle_Y\|.
\]
This shows that the formal identity map from $L^\infty(X)$ with the Hilbert module norm into $L^2(X)$ is contractive and hence extends to $L^2(X|Y)$. To verify that this extension is injective, let $f$ denote an element of $L^2(X|Y)$ as well as its image in $L^2(X)$. Take a net $\{f_i\}$ in $L^\infty(X)$ which converges to $f$ in the Hilbert module norm.
Then, using the continuity of the $L^\infty(Y)$-valued inner product (Proposition B.2) and the self-adjointness of $E_Y$ as an orthogonal projection, we have, for all $g, h \in L^\infty(Y)$,
\begin{align*}
\langle \langle f, f\rangle_Y\, g, h\rangle
&= \lim_i \langle \langle f_i, f_i\rangle_Y\, g, h\rangle = \lim_i \langle E_Y(f_i \bar f_i), h\bar g\rangle \\
&= \lim_i \langle f_i \bar f_i, E_Y(h\bar g)\rangle = \lim_i \langle f_i \bar f_i, h\bar g\rangle = \lim_i \langle \bar h f_i, \bar g f_i\rangle = \langle \bar h f, \bar g f\rangle,
\end{align*}
which shows that $\|f\|_2 = 0$ implies $\langle f, f\rangle_Y = 0$, yielding the desired injectivity.

The representation of $L^\infty(Y)$ on $L^2(X)$ by multiplication operators turns $L^2(X)$ into a normed $L^\infty(Y)$-module (Definition B.12), and this $L^\infty(Y)$-module structure on $L^2(X)$ restricts to the one on $L^2(X|Y)$ under the natural inclusion. We will make use of both of these $L^\infty(Y)$-modules in this chapter. In particular, conditional Hilbert-Schmidt operators from $L^2(X|Y)$ to $L^2(X)$ will play a role in the proof of Lemma 2.11.

The following property of conditional precompactness for subsets of $L^2(X|Y)$ will be a crucial ingredient in the proofs of both the Furstenberg-Zimmer structure theorem and Furstenberg's multiple recurrence theorem. We use the $\varepsilon$-containment notation $A \subseteq_\varepsilon B$ to mean that every element of the set $A$ lies within $\varepsilon$ of some element of $B$.

DEFINITION 2.1. A subset of $L^2(X|Y)$ is called a finitely generated module zonotope if it is of the form $\sum_{h\in\Omega} B_{L^\infty(Y)} h$ where $\Omega$ is a finite subset of $L^2(X|Y)$ and $B_{L^\infty(Y)}$ denotes the unit ball of $L^\infty(Y)$. We say that a set $K \subseteq L^2(X|Y)$ is conditionally precompact if for every $\varepsilon > 0$ there are a set $D \subseteq Y$ with $\nu(D) > 1 - \varepsilon$ and a finitely generated module zonotope $Z$ in $L^2(X|Y)$ such that $1_D K \subseteq_\varepsilon Z$.

REMARK 2.2. The above definition of conditional precompactness has been formulated so that the Furstenberg-Zimmer structure theorem provides a strong enough conclusion, with regard to what it means to be a compact extension (Definition 2.8), for the purpose of establishing multiple recurrence in Section 2.3.
Another possibility is to omit the cutting down by an indicator function $1_D$ and require that the $\varepsilon$-containment be in the $L^2$-norm. Then the definitions and arguments leading to the structure theorem will still work mutatis mutandis (and this is the approach one needs to take in the noncommutative case, as explained in Remark 2.16), but the definition of compactness for extensions in this case, while being logically the same (compare Section 6.3 of [49]), would not by itself provide enough leverage to deduce multiple recurrence.

Approximate containment in a finitely generated module zonotope implies $L^2(X)$-norm approximation by the $L^\infty(Y)$-span of a finite orthonormal set. In the following proposition we express this fact in the context of conditional precompactness, as this is the version we will need for the proof of Lemma 2.11. For background on orthonormality in $L^\infty(Y)$-modules, see Appendix B.

PROPOSITION 2.3. Let $K$ be a conditionally precompact subset of $L^2(X|Y)$. Then for every $\varepsilon > 0$ there exist a $D \subseteq Y$ with $\nu(D) \ge 1 - \varepsilon$ and a finite orthonormal set $\Omega \subseteq L^2(X|Y)$ such that $\|1_D f - p_\Omega(1_D f)\|_2 \le \varepsilon$ for all $f \in K$, where $p_\Omega$ denotes the orthogonal projection of $L^2(X)$ onto the closure of $\sum_{h\in\Omega} L^\infty(Y) h$ in $L^2(X)$.

PROOF. Let $\varepsilon > 0$. Denote by $B_{L^\infty(Y)}$ the unit ball of $L^\infty(Y)$. By assumption we can find a $D \subseteq Y$ with $\nu(D) > 1 - \varepsilon$ and a finite set $\Omega \subseteq L^2(X|Y)$ such that $1_D K \subseteq_{\varepsilon/2, \|\cdot\|_2} \sum_{h\in\Omega} B_{L^\infty(Y)} h$. Set $r = \max_{h\in\Omega} \|h\|$. Let $\delta > 0$. By Proposition B.11 we can find a finite orthonormal set $\Omega' \subseteq L^2(X|Y)$ such that $\Omega \subseteq_{\delta, \|\cdot\|} \sum_{h\in\Omega'} r B_{L^\infty(Y)} h$. Then $\Omega \subseteq_{\delta, \|\cdot\|_2} \sum_{h\in\Omega'} r B_{L^\infty(Y)} h$, and hence $1_D K \subseteq_{\delta|\Omega| + \varepsilon/2, \|\cdot\|_2} \sum_{h\in\Omega'} |\Omega| r B_{L^\infty(Y)} h$. Taking $\delta$ small enough, we get $1_D K \subseteq_{\varepsilon, \|\cdot\|_2} \sum_{h\in\Omega'} |\Omega| r B_{L^\infty(Y)} h$. Since this zonotope is contained in the range of $p_{\Omega'}$, the orthonormal set $\Omega'$ has the required property.

We next use tensor products to give an alternative description of $L^2(X|Y)$ which will be helpful in proving the implication (i)⇒(ii) in Lemma 2.11.

PROPOSITION 2.4.
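There is a pre-inner product $\langle\cdot,\cdot\rangle$ on the algebraic tensor product $L^\infty(X) \otimes L^\infty(X)$ satisfying
\[
\langle f \otimes g, h \otimes k\rangle = \int_X E_Y(\bar h f)\, g \bar k\, d\mu
\]
for all $f, g, h, k \in L^\infty(X)$.

PROOF. Define $\langle\cdot,\cdot\rangle : (L^\infty(X) \otimes L^\infty(X)) \times (L^\infty(X) \otimes L^\infty(X)) \to \mathbb{C}$ by
\[
\Big\langle \sum_i f_i \otimes g_i, \sum_j h_j \otimes k_j \Big\rangle = \sum_{i,j} \int_X E_Y(\bar h_j f_i)\, g_i \bar k_j\, d\mu.
\]
Note that
\[
\sum_{i,j} \int_X E_Y(\bar h_j f_i)\, g_i \bar k_j\, d\mu = \overline{\sum_{i,j} \int_X E_Y(\bar f_i h_j)\, k_j \bar g_i\, d\mu}
\]
and so $\langle f, g\rangle = \overline{\langle g, f\rangle}$ for all $f, g \in L^\infty(X) \otimes L^\infty(X)$. Since $E_Y$ is completely positive and $(\bar f_i f_j)_{i,j}$ is a positive element of the matrix algebra $M_n(L^\infty(X))$ over $L^\infty(X)$, the element $(E_Y(\bar f_i f_j))_{i,j}$ is positive in $M_n(L^\infty(Y))$. Therefore
\[
\Big\langle \sum_i f_i \otimes g_i, \sum_j f_j \otimes g_j \Big\rangle = \sum_{i,j} \int_X E_Y(\bar f_j f_i)\, g_i \bar g_j\, d\mu = \sum_{i,j} \langle g_i, E_Y(\bar f_i f_j) g_j\rangle \ge 0,
\]
yielding the proposition.

The pre-inner product in Proposition 2.4 descends to an inner product on the quotient of $L^\infty(X) \otimes L^\infty(X)$ by the subspace of all $f$ for which $\langle f, f\rangle = 0$. Denote the completion of this quotient by $H_X$.

PROPOSITION 2.5. The map $f \mapsto \bar f \otimes f$ from $L^\infty(X)$ to $H_X$ extends uniquely to a continuous map $\Phi : L^2(X|Y) \to H_X$. Furthermore, for all $f, g \in L^2(X|Y)$ we have
\[
\|\langle f, g\rangle_Y\|_2^2 = \langle \Phi(f), \Phi(g)\rangle \tag{11}
\]
and
\[
\|\langle f, g\rangle_Y\|_2^2 \le \|g\|\, \|g\|_2\, \|\langle f, f\rangle_Y\|_2. \tag{12}
\]

PROOF. First observe that for all $f, g \in L^\infty(X)$ we have
\[
\|\langle f, g\rangle_Y\|_2^2 = \int_X E_Y(g\bar f)\, E_Y(f\bar g)\, d\mu = \int_X E_Y(g\bar f)\, f\bar g\, d\mu = \langle \bar f \otimes f, \bar g \otimes g\rangle. \tag{13}
\]
Since $L^\infty(X)$ is dense in $L^2(X|Y)$, the uniqueness of $\Phi$ is trivial. To prove the existence of $\Phi$, it suffices to show that for any sequence $\{f_n\}$ in $L^\infty(X)$ converging to some $f \in L^2(X|Y)$, the sequence $\{\bar f_n \otimes f_n\}$ in $H_X$ is Cauchy. When $n, m \to \infty$, the elements $\langle f_n, f_m\rangle_Y$ converge to $\langle f, f\rangle_Y$ in $L^\infty(Y)$ and hence also in $L^2(X)$. Therefore as $n, m \to \infty$ we have, using (13),
\begin{align*}
\|\bar f_n \otimes f_n - \bar f_m \otimes f_m\|^2
&= \langle \bar f_n \otimes f_n - \bar f_m \otimes f_m,\, \bar f_n \otimes f_n - \bar f_m \otimes f_m\rangle \\
&= \|\langle f_n, f_n\rangle_Y\|_2^2 - \|\langle f_m, f_n\rangle_Y\|_2^2 - \|\langle f_n, f_m\rangle_Y\|_2^2 + \|\langle f_m, f_m\rangle_Y\|_2^2 \to 0.
\end{align*}
This establishes the existence of $\Phi$.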
Assertion (11) now follows from (13) and the continuity of $\Phi$. Next note that for all $f, g \in L^\infty(X)$ we have
\[
\|\bar f \otimes g\|^2 = \int_X E_Y(f\bar f)\, g\bar g\, d\mu \le \|f\|_{L^2(X|Y)}^2 \int_X g\bar g\, d\mu = \|f\|_{L^2(X|Y)}^2\, \|g\|_2^2, \tag{14}
\]
and using this inequality in conjunction with two applications of (11) we obtain
\begin{align*}
\|\langle f, g\rangle_Y\|_2^2 = \langle \bar f \otimes f, \bar g \otimes g\rangle
&\le \|\bar f \otimes f\|\, \|\bar g \otimes g\| \\
&= \|\langle f, f\rangle_Y\|_2\, \|\bar g \otimes g\| \\
&\le \|g\|_{L^2(X|Y)}\, \|g\|_2\, \|\langle f, f\rangle_Y\|_2.
\end{align*}
From this we can get (12) by approximating elements in $L^2(X|Y)$ by elements in $L^\infty(X)$.

The final proposition of the section will be used in the proof of (iv)⇒(i) in Lemma 2.11. It says in particular that, like rank-one operators on a Hilbert space, operators $L^2(X|Y) \to L^2(X)$ of the form $f \mapsto \langle f, g\rangle_Y\, g$ for some $g \in L^2(X|Y)$ are conditionally Hilbert-Schmidt (Definition B.13).

PROPOSITION 2.6. Let $g \in L^2(X|Y)$ and define $T : L^2(X|Y) \to L^2(X)$ by $Tf = \langle f, g\rangle_Y\, g$ for all $f \in L^2(X|Y)$. Then $\sum_{f\in\Omega} \|Tf\|_2^2 \le \|g\|^2 \|g\|_2^2$ for every orthonormal set $\Omega \subseteq L^2(X|Y)$. Moreover, $T$ extends to a bounded $L^\infty(Y)$-linear operator on $L^2(X)$ with norm at most $\|g\|^2$.

PROOF. For every $f \in L^2(X|Y)$ we have
\[
\langle Tf, Tf\rangle_Y = \langle f, g\rangle_Y \langle g, g\rangle_Y \langle g, f\rangle_Y \le |\langle g, f\rangle_Y|^2\, \|g\|^2 \tag{15}
\]
and hence, using Proposition B.2,
\begin{align*}
\|Tf\|_2^2 = \int_X \langle Tf, Tf\rangle_Y\, d\mu
&\le \|g\|^2 \int_X |\langle g, f\rangle_Y|^2\, d\mu \\
&\le \|g\|^2 \int_X \|g\|^2 \langle f, f\rangle_Y\, d\mu = \|g\|^4 \|f\|_2^2.
\end{align*}
Therefore $T$ is bounded for the norm $\|\cdot\|_2$ and extends to a bounded linear map $\tilde T$ on $L^2(X)$ with operator norm at most $\|g\|^2$. Clearly $T$ is $L^\infty(Y)$-linear, and thus so is $\tilde T$.

Now let $\Omega$ be an orthonormal subset of $L^2(X|Y)$. Then for every finite set $\Omega' \subseteq \Omega$ we have, making use of (15) and Lemma B.9,
\begin{align*}
\sum_{f\in\Omega'} \|Tf\|_2^2
&= \sum_{f\in\Omega'} \int_X \langle Tf, Tf\rangle_Y\, d\mu \\
&\le \|g\|^2 \sum_{f\in\Omega'} \int_X \langle f, g\rangle_Y \langle g, f\rangle_Y\, d\mu \\
&= \|g\|^2 \sum_{f\in\Omega'} \int_X \langle \langle g, f\rangle_Y f, \langle g, f\rangle_Y f\rangle_Y\, d\mu \\
&\le \|g\|^2 \int_X \langle g, g\rangle_Y\, d\mu = \|g\|^2 \|g\|_2^2.
\end{align*}
Hence $\sum_{f\in\Omega} \|Tf\|_2^2 \le \|g\|^2 \|g\|_2^2$.

2.2.
The Furstenberg-Zimmer structure theorem

The heart of the Furstenberg-Zimmer structure theorem (Theorem 2.15) is the analogue for extensions of the part of Theorem 1.22 which relates weak mixing and compactness. This is the content of Lemmas 2.11 and 2.14. Thereafter we will only need to apply a simple maximality argument to obtain Theorem 2.15.

Let $X \to Y$ be a $G$-extension of p.m.p. actions $G \curvearrowright (X,\mu)$ and $G \curvearrowright (Y,\nu)$. As described in the previous section, we consider the Hilbert $L^\infty(Y)$-module $L^2(X|Y)$, identify $L^\infty(Y)$ as a von Neumann subalgebra of $L^\infty(X)$, view $L^2(X)$ as an $L^\infty(Y)$-module where appropriate, and have the natural inclusions $L^\infty(X) \subseteq L^2(X|Y) \subseteq L^2(X)$. As we now have a group acting, we need to introduce some notation for the induced action on the above function spaces and make some preliminary observations concerning the interaction of the dynamics with the module structures.

For $s \in G$ we define the unitary isomorphism $\alpha_s$ of $L^2(X)$ by $\alpha_s(f)(x) = f(s^{-1}x)$ for $f \in L^2(X)$ and $x \in X$. This restricts to a $G$-action by automorphisms on $L^\infty(X)$, and also to a $G$-action on $L^2(X|Y)$ satisfying
\[
\langle \alpha_s(f), \alpha_s(g)\rangle_Y = \alpha_s(\langle f, g\rangle_Y) \quad\text{and}\quad \alpha_s(af) = \alpha_s(a)\alpha_s(f)
\]
for all $s \in G$, $f, g \in L^2(X|Y)$, and $a \in L^\infty(Y)$.

Recall the Hilbert space $H_X$ with inner product $\langle\cdot,\cdot\rangle$ constructed after Proposition 2.4 as the completion of a quotient of $L^\infty(X) \otimes L^\infty(X)$, along with the map $\Phi : L^2(X|Y) \to H_X$ from Proposition 2.5. For each $s \in G$ we denote by $\hat\alpha_s$ the unitary automorphism of $H_X$ determined by $\hat\alpha_s(f \otimes g) = \alpha_s(f) \otimes \alpha_s(g)$ for $f, g \in L^\infty(X)$. By the uniqueness in Proposition 2.5, we have $\Phi \circ \alpha_s = \hat\alpha_s \circ \Phi$ for all $s \in G$. Note furthermore by (11) of Proposition 2.5 that, for all $f, g \in L^2(X|Y)$ and $s \in G$,
\[
\|\langle \alpha_s(f), g\rangle_Y\|_2^2 = \langle \Phi(\alpha_s(f)), \Phi(g)\rangle = \langle \hat\alpha_s(\Phi(f)), \Phi(g)\rangle. \tag{16}
\]
It follows that the function $s \mapsto \|\langle \alpha_s(f), g\rangle_Y\|_2$ on $G$ is weakly almost periodic for all $f, g \in L^2(X|Y)$.
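As a sanity check on (16), it may help to specialize to the case where $Y$ is a single point. This is an assumption made purely for illustration and is not a case singled out in the text. Then $L^\infty(Y) = \mathbb{C}$, the module $L^2(X|Y)$ is just $L^2(X)$ with its usual norm, and the identities below follow directly from the definitions of $E_Y$, of the pre-inner product in Proposition 2.4, and of $\Phi$ (computed here for $f, g \in L^\infty(X)$ and extended by continuity).

```latex
% Assumption for illustration: Y is a one-point space, so L^\infty(Y) = \mathbb{C}.
\begin{align*}
E_Y(f) &= \int_X f\,d\mu, \qquad
\langle f, g\rangle_Y = \int_X f\bar g\,d\mu = \langle f, g\rangle, \\
\langle f\otimes g,\, h\otimes k\rangle
  &= \int_X \bar h f\,d\mu \int_X g\bar k\,d\mu
   = \langle f, h\rangle\,\langle g, k\rangle, \\
\langle \hat\alpha_s(\Phi(f)), \Phi(g)\rangle
  &= \langle \overline{\alpha_s(f)}, \bar g\rangle\,\langle \alpha_s(f), g\rangle
   = \overline{\langle \alpha_s(f), g\rangle}\,\langle \alpha_s(f), g\rangle
   = |\langle \alpha_s(f), g\rangle|^2 .
\end{align*}
```

In this special case the function in (16) is the squared modulus of a matrix coefficient of the Koopman representation, so the conditional notions of this section should reduce to their counterparts for unitary representations.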
Recall that the weakly almost periodic functions form a unital C*-subalgebra $\mathrm{WAP}(G)$ of $\ell^\infty(G)$ with unique $G$-invariant mean $m$.

DEFINITION 2.7. An element $f \in L^2(X|Y)$ is said to be
(i) conditionally weakly mixing if the weakly almost periodic function $s \mapsto \|\langle \alpha_s(f), f\rangle_Y\|_2$ on $G$ has mean zero,
(ii) conditionally compact if its orbit $\{\alpha_s(f) : s \in G\}$ is conditionally precompact in $L^2(X|Y)$, and
(iii) conditionally compact in measure if for every $\varepsilon > 0$ there is a set $D \subseteq Y$ with $\nu(D) > 1 - \varepsilon$ such that $1_D f$ is conditionally compact.

DEFINITION 2.8. The extension $X \to Y$ is said to be weakly mixing if every element in $L^2(X|Y)$ orthogonal to $L^\infty(Y)$ is conditionally weakly mixing, and compact if every element of $L^\infty(X)$ is conditionally compact in measure.

PROPOSITION 2.9. Let $X \to Y$ be an extension of p.m.p. $G$-actions. Then the collection $N_0$ of all elements in $L^\infty(X)$ which are conditionally compact is a $G$-invariant conjugation-invariant subalgebra of $L^\infty(X)$ containing $L^\infty(Y)$, while the collection $N$ of all elements in $L^\infty(X)$ which are conditionally compact in measure is a $G$-invariant von Neumann subalgebra of $L^\infty(X)$ which is equal to the strong operator closure of $N_0$.

PROOF. It is clear that both $N_0$ and $N$ are $G$-invariant linear subspaces of $L^\infty(X)$ which are invariant under conjugation, and that $N_0$ contains $L^\infty(Y)$. To check that $N_0$ is closed under multiplication, let $f, g \in N_0$ and let $\varepsilon > 0$. First find $h_1, \dots, h_n \in L^2(X|Y)$, $a_{s,1}, \dots, a_{s,n} \in B_{L^\infty(Y)}$, and a set $C \subseteq Y$ with $\nu(C) > 1 - \varepsilon/2$ such that $\|1_C \alpha_s(f) - \sum_{i=1}^n a_{s,i} h_i\| < \varepsilon/(2\|g\|_\infty + 1)$ for all $s \in G$. We may perturb the functions $h_i$ so that they all lie in $L^\infty(X)$. Set $M = \sum_{i=1}^n \|h_i\|_\infty$. Now find $k_1, \dots, k_m \in L^2(X|Y)$, $b_{s,1}, \dots, b_{s,m} \in B_{L^\infty(Y)}$, and a set $D \subseteq Y$ with $\nu(D) > 1 - \varepsilon/2$ such that $\|1_D \alpha_s(g) - \sum_{j=1}^m b_{s,j} k_j\| < \varepsilon/(2M + 1)$ for all $s \in G$.
Then $\nu(C \cap D) > 1 - \varepsilon$, and for every $s \in G$ we have
\begin{align*}
\Big\| 1_{C\cap D}\, \alpha_s(fg) - \sum_{i=1}^n \sum_{j=1}^m a_{s,i} b_{s,j} h_i k_j \Big\|
&\le \Big\| 1_C \alpha_s(f) - \sum_{i=1}^n a_{s,i} h_i \Big\|\, \|1_D \alpha_s(g)\|_\infty \\
&\qquad + \Big\| \sum_{i=1}^n a_{s,i} h_i \Big\|_\infty \Big\| 1_D \alpha_s(g) - \sum_{j=1}^m b_{s,j} k_j \Big\| \\
&< \frac{\varepsilon}{2\|g\|_\infty + 1} \cdot \|g\|_\infty + M \cdot \frac{\varepsilon}{2M + 1} < \varepsilon,
\end{align*}
so that $1_{C\cap D}\, \alpha_s(fg)$ is $\varepsilon$-contained in the finitely generated module zonotope $\sum_{i=1}^n \sum_{j=1}^m B_{L^\infty(Y)} h_i k_j$. Therefore $fg \in N_0$.

Now let $f$ be an element in the strong operator closure $M$ of $N_0$, which is a von Neumann algebra. Since strong operator convergence implies convergence in the $L^2(X)$-norm, given an $\varepsilon > 0$ we can find, for every $n \in \mathbb{N}$, an $f_n \in N_0$ such that $\|f - f_n\|_2 < \varepsilon/2^n$. Now since $\int_Y \langle f - f_n, f - f_n\rangle_Y\, d\nu = \|f - f_n\|_2^2 < (\varepsilon/2^n)^2$ we can find a set $D_n \subseteq Y$ with $\nu(D_n) > 1 - \varepsilon/2^n$ such that
\[
\|1_{D_n} f - 1_{D_n} f_n\|^2 = \|\langle f - f_n, f - f_n\rangle_Y 1_{D_n}\|_\infty \le \varepsilon/2^n.
\]
Set $D = \bigcap_{n=1}^\infty D_n$, which has $\nu$-measure greater than $1 - \varepsilon$. Then
\[
\|1_D f - 1_D f_n\| = \|1_D (1_{D_n} f - 1_{D_n} f_n)\| \le \|1_{D_n} f - 1_{D_n} f_n\| \to 0
\]
as $n \to \infty$. Since the algebra $N_0$ is obviously closed in the $L^2(X|Y)$-norm and the elements $1_D f_n$ lie in $N_0$ by the first paragraph, we deduce that $1_D f$ is conditionally compact and hence that $f \in N$. Therefore $M \subseteq N$.

Finally, note that the definition of conditional compactness in measure implies that the unit ball of $N_0$ is dense in the unit ball of $N$ with respect to the $L^2(X)$-norm, and since the strong operator topology and the $L^2(X)$-norm topology agree on the unit ball of $L^\infty(X)$ we conclude that $M$ contains $N$ and hence is equal to $N$.

EXAMPLE 2.10. Consider for a fixed irrational $\theta \in [0,1)$ the skew transformation $T$ of $\mathbb{T}^2 \cong \mathbb{R}^2/\mathbb{Z}^2$ defined by $T(x,y) = (x + \theta, x + y)$ modulo $\mathbb{Z}^2$, as discussed in Section 1.3. The map $\mathbb{T}^2 \to \mathbb{T}$ onto the first coordinate factors $T$ onto rotation by $\theta$, and we will verify that this extension is compact.
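An explicit formula for the iterates of the skew transformation $T$ makes this verification concrete. The computation below is a routine induction on $n$ that is not spelled out in the text; the name $c_n$ for the triangular-number coefficient is introduced here only for illustration, and all coordinates are taken modulo $\mathbb{Z}^2$.

```latex
% Iterates of T(x,y) = (x + \theta, x + y); c_n is an illustrative name.
\begin{align*}
T^n(x,y) &= (x + n\theta,\; y + nx + c_n\theta), \qquad c_n = \tfrac{n(n-1)}{2}, \quad n \in \mathbb{Z}, \\
\intertext{since applying $T$ once more gives}
T\bigl(x + n\theta,\; y + nx + c_n\theta\bigr)
  &= \bigl(x + (n+1)\theta,\; y + (n+1)x + (c_n + n)\theta\bigr),
  \qquad c_n + n = \tfrac{(n+1)n}{2} = c_{n+1}.
\end{align*}
% For n = -1 the formula gives (x - \theta, y - x + \theta), which is indeed T^{-1}(x,y).
```

Since $c_n$ is an integer for every $n \in \mathbb{Z}$, composing the character $(x,y) \mapsto e^{2\pi i m y}$ with $T^n$ multiplies it by the unimodular function $e^{2\pi i m c_n \theta} e^{2\pi i m n x}$, which depends only on the first coordinate $x$.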
For all $m \in \mathbb{Z}$ the function $f_m(x,y) = e^{2\pi i m y}$ is conditionally compact, since for all $n \in \mathbb{Z}$ we have
\[
(f_m \circ T^n)(x,y) = e^{2\pi i b_{m,n}\theta}\, e^{2\pi i m n x} f_m(x,y)
\]
for some $b_{m,n} \in \mathbb{Z}$, showing that the orbit of $f_m$ is contained in $B_{L^\infty(\mathbb{T})} f_m$. The functions $f_m$ together with the functions $(x,y) \mapsto e^{2\pi i n x}$ for $n \in \mathbb{Z}$ generate $L^\infty(\mathbb{T}^2)$ as a von Neumann algebra, and so we conclude that the extension is compact.

LEMMA 2.11. Let $f \in L^2(X|Y)$. Then the following are equivalent:
(i) $f$ is conditionally weakly mixing,
(ii) for every $g \in L^2(X|Y)$ the weakly almost periodic function $s \mapsto \|\langle \alpha_s(f), g\rangle_Y\|_2$ on $G$ has mean zero,
(iii) $\langle f, g\rangle_Y = 0$ for every conditionally compact $g \in L^2(X|Y)$,
(iv) $\langle f, g\rangle_Y = 0$ for every conditionally compact $g \in L^\infty(X)$.

PROOF. (i)⇒(ii). Let $\Phi : L^2(X|Y) \to H_X$ be the map in Proposition 2.5, where $H_X$ is the space described before the proposition statement. Note that the function $s \mapsto \|\langle \alpha_s(f), g\rangle_Y\|_2$ has mean zero if and only if the function $s \mapsto \|\langle \alpha_s(f), g\rangle_Y\|_2^2$ has mean zero. Also note that $\|\langle \alpha_s(f), g\rangle_Y\|_2^2 = \langle \hat\alpha_s(\Phi(f)), \Phi(g)\rangle$.

Write $p$ for the orthogonal projection of $H_X$ onto the closure of the linear span of $\{\hat\alpha_s(\Phi(f)) : s \in G\}$. Since $\langle \hat\alpha_s(\Phi(f)), \xi\rangle = 0$ for all $\xi \in (1-p)H_X$, it suffices to show that, given a $\xi \in pH_X$, the function $s \mapsto \langle \hat\alpha_s(\Phi(f)), \xi\rangle$ has mean zero. We may furthermore assume by an approximation argument that $\xi$ is of the form $\hat\alpha_t(\Phi(f))$ for some $t \in G$. But then, using the $G$-invariance of $m$,
\begin{align*}
m(s \mapsto \langle \hat\alpha_s(\Phi(f)), \hat\alpha_t(\Phi(f))\rangle)
&= m(s \mapsto \langle \hat\alpha_{t^{-1}s}(\Phi(f)), \Phi(f)\rangle) \\
&= m(s \mapsto \langle \hat\alpha_s(\Phi(f)), \Phi(f)\rangle) = 0.
\end{align*}

(ii)⇒(iii). Let $g$ be a conditionally compact element of $L^2(X|Y)$ and let us show that $\langle f, g\rangle_Y = 0$. Since $\langle f, g\rangle_Y = \alpha_{s^{-1}}(\langle \alpha_s(f), \alpha_s(g)\rangle_Y)$ for all $s \in G$, it is enough to show that $m(s \mapsto \|\langle \alpha_s(f), \alpha_s(g)\rangle_Y\|_2) = 0$. Let $\varepsilon > 0$.
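Since $g$ is conditionally compact, by Proposition 2.3 there exist a finite orthonormal set $\Omega \subseteq L^2(X|Y)$ and a set $D \subseteq Y$ with $\nu(D) \ge 1 - \varepsilon$ such that $\|1_D \alpha_s(g) - p(1_D \alpha_s(g))\|_2 < \varepsilon$ for all $s \in G$, where $p$ denotes the orthogonal projection of $L^2(X)$ onto the closure of $\sum_{h\in\Omega} L^\infty(Y) h$ in $L^2(X)$. By Lemma B.9, for every $s \in G$ we have $p(1_D \alpha_s(g)) = \sum_{h\in\Omega} \langle 1_D \alpha_s(g), h\rangle_Y\, h \in L^2(X|Y)$. Therefore, using Proposition B.3 for the second inequality,
\begin{align*}
\|\langle \alpha_s(f), p(1_D \alpha_s(g))\rangle_Y\|_2
&= \Big\| \Big\langle \alpha_s(f), \sum_{h\in\Omega} \langle 1_D \alpha_s(g), h\rangle_Y\, h \Big\rangle_Y \Big\|_2 \\
&= \Big\| \sum_{h\in\Omega} \langle \alpha_s(f), h\rangle_Y\, \overline{\langle 1_D \alpha_s(g), h\rangle_Y} \Big\|_2 \\
&\le \sum_{h\in\Omega} \|\langle 1_D \alpha_s(g), h\rangle_Y\|\, \|\langle \alpha_s(f), h\rangle_Y\|_2 \\
&\le \sum_{h\in\Omega} \|1_D \alpha_s(g)\|\, \|h\|\, \|\langle \alpha_s(f), h\rangle_Y\|_2 \\
&\le \|g\| \sum_{h\in\Omega} \|\langle \alpha_s(f), h\rangle_Y\|_2
\end{align*}
and, using (12) in Proposition 2.5,
\begin{align*}
\|\langle \alpha_s(f), (1-p)(1_D \alpha_s(g))\rangle_Y\|_2^2
&\le \|\langle f, f\rangle_Y\|_2\, \|(1-p)(1_D \alpha_s(g))\|_2\, \|(1-p)(1_D \alpha_s(g))\| \\
&\le \varepsilon\, \|\langle f, f\rangle_Y\|_2\, \|g\|.
\end{align*}
Therefore
\begin{align*}
\|\langle \alpha_s(f), 1_D \alpha_s(g)\rangle_Y\|_2
&\le \|\langle \alpha_s(f), (1-p) 1_D \alpha_s(g)\rangle_Y\|_2 + \|\langle \alpha_s(f), p(1_D \alpha_s(g))\rangle_Y\|_2 \\
&\le \sqrt{\varepsilon}\, \|\langle f, f\rangle_Y\|_2^{1/2} \|g\|^{1/2} + \|g\| \sum_{h\in\Omega} \|\langle \alpha_s(f), h\rangle_Y\|_2.
\end{align*}
Using Proposition B.3 we also have
\begin{align*}
\|\langle \alpha_s(f), 1_{D^c} \alpha_s(g)\rangle_Y\|_2
&= \|\langle \alpha_s(f), \alpha_s(g)\rangle_Y 1_{D^c}\|_2 \\
&\le \|\langle \alpha_s(f), \alpha_s(g)\rangle_Y\|\, \|1_{D^c}\|_2 \le \|f\|\, \|g\| \sqrt{\varepsilon},
\end{align*}
and so, since each of the functions $s \mapsto \|\langle \alpha_s(f), h\rangle_Y\|_2$ for $h \in \Omega$ has mean zero by (ii),
\[
m(s \mapsto \|\langle \alpha_s(f), \alpha_s(g)\rangle_Y\|_2) \le \sqrt{\varepsilon}\, \|g\|^{1/2} \|\langle f, f\rangle_Y\|_2^{1/2} + \|f\|\, \|g\| \sqrt{\varepsilon}.
\]
Since $\varepsilon$ can be taken arbitrarily small, we conclude that $m(s \mapsto \|\langle \alpha_s(f), \alpha_s(g)\rangle_Y\|_2) = 0$.

(iii)⇒(iv). Trivial.

(iv)⇒(i). Suppose that $f$ is not conditionally weakly mixing. By the density of $L^\infty(X)$ in $L^2(X|Y)$, we can find an $f' \in L^\infty(X)$ which satisfies $\|f'\| \le \|f\|$ and is close enough to $f$ in $L^2(X|Y)$-norm for a purpose to be described below. By Proposition 2.6, for every $s \in G$ the map $h \mapsto \langle h, \alpha_s(f')\rangle_Y\, \alpha_s(f')$ from $L^2(X|Y)$ to $L^2(X)$ extends to a bounded $L^\infty(Y)$-linear operator $T_s : L^2(X) \to L^2(X)$ satisfying
\[
\|T_s\| \le \|\alpha_s(f')\|^2 = \|f'\|^2 \le \|f\|^2
\]
whose restriction $L^2(X|Y) \to L^2(X)$ is a conditionally Hilbert-Schmidt operator with
\[
\sum_{h\in\Omega} \|T_s h\|_2^2 \le \|\alpha_s(f')\|^2 \|\alpha_s(f')\|_2^2 = \|f'\|^2 \|f'\|_2^2
\]
for every orthonormal set $\Omega \subseteq L^2(X|Y)$.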
Write $P_0(G)$ for the convex set of finitely supported probability measures on $G$. For every $\lambda \in P_0(G)$, define $T_\lambda$ to be the convex combination $\sum_{s\in G} \lambda(s) T_s$. Since the function $x \mapsto x^2$ on $\mathbb{R}$ is convex, using the above bounds it is easy to see that $\|T_\lambda\| \le \|f\|^2$ and
\[
\sum_{h\in\Omega} \|T_\lambda h\|_2^2 \le \|f'\|^2 \|f'\|_2^2
\]
for every orthonormal set $\Omega \subseteq L^2(X|Y)$.

Note that the mean $m$ on $\mathrm{WAP}(G)$ lies in the weak* closure of $P_0(G)$. Indeed if this is not the case then by the Hahn-Banach theorem there are an $f \in \mathrm{WAP}(G)$ and an $\alpha > 0$ such that $\operatorname{re} \sum_{s\in G} \lambda(s) f(s) + \alpha \le \operatorname{re} m(f)$ for all $\lambda \in P_0(G)$, and taking the real part of $f$ and adding a constant function if necessary we may assume that $f$ is real-valued and $f \ge 0$. Then taking a nonempty finite set $F \subseteq G$ such that $f(s) \ge \|f\| - \alpha/2$ for all $s \in F$ we obtain
\[
\frac{1}{|F|} \sum_{s\in F} f(s) + \alpha \le m(f) \le \|f\| \le \frac{1}{|F|} \sum_{s\in F} f(s) + \frac{\alpha}{2},
\]
a contradiction. We can thus find a net $\{\lambda_\eta\}_\eta$ in $P_0(G)$ which converges in the weak* topology to the mean $m$ on $\mathrm{WAP}(G)$. As the operators $T_{\lambda_\eta}$ are all bounded in norm by $\|f\|^2$ and norm-closed balls in $B(L^2(X))$ are compact in the weak operator topology, we may assume by passing to a subnet that $\{T_{\lambda_\eta}\}_\eta$ converges in the weak operator topology on $B(L^2(X))$ to some operator $T$, which is $L^\infty(Y)$-linear by virtue of the fact that $\langle T_{\lambda_\eta} f\xi, \zeta\rangle = \langle f T_{\lambda_\eta} \xi, \zeta\rangle = \langle T_{\lambda_\eta} \xi, \bar f \zeta\rangle$ for all $f \in L^\infty(Y)$ and $\xi, \zeta \in L^2(X)$.

Next we argue that $T$ sends $L^\infty(X)$ into itself. Let $g \in L^\infty(X)$. Then, for each $s \in G$,
\begin{align*}
\|T_s g\|_\infty = \|\langle g, \alpha_s(f')\rangle_Y\, \alpha_s(f')\|_\infty
&= \|E_Y(g\, \overline{\alpha_s(f')})\, \alpha_s(f')\|_\infty \\
&\le \|\alpha_s(f')\|_\infty\, \|E_Y(g\, \overline{\alpha_s(f')})\|_\infty \\
&\le \|f'\|_\infty\, \|g\, \overline{\alpha_s(f')}\|_\infty \le \|f'\|_\infty^2\, \|g\|_\infty
\end{align*}
and hence $\|T_\lambda g\|_\infty \le \|f'\|_\infty^2 \|g\|_\infty$ for every $\lambda \in P_0(G)$. Since the ball of radius $\|f'\|_\infty^2 \|g\|_\infty$ in $L^\infty(X)$ is compact in the weak operator topology, by passing to a subnet we may assume that $\{T_{\lambda_\eta} g\}_\eta$ converges to some $h \in L^\infty(X)$ in the weak operator topology. Then $\{T_{\lambda_\eta} g\}_\eta$ converges to $h$ in the weak topology of $L^2(X)$. Therefore $Tg = h \in L^\infty(X)$.
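The bounds on $T_\lambda$ that were described above as easy to see from the convexity of $x \mapsto x^2$ can be spelled out as follows. This is a routine sketch rather than part of the original argument: $\lambda$ is any element of $P_0(G)$, $\Omega$ is any orthonormal subset of $L^2(X|Y)$, and sums over $\Omega$ are understood as suprema of sums over its finite subsets.

```latex
% Sketch: convex combinations inherit the operator-norm and
% conditional Hilbert-Schmidt bounds satisfied by each T_s.
\begin{align*}
\|T_\lambda\|
  &\le \sum_{s\in G} \lambda(s)\,\|T_s\| \le \|f\|^2, \\
\|T_\lambda h\|_2^2
  &\le \Bigl(\sum_{s\in G} \lambda(s)\,\|T_s h\|_2\Bigr)^{2}
   \le \sum_{s\in G} \lambda(s)\,\|T_s h\|_2^2
   && \text{(convexity of } x \mapsto x^2\text{)}, \\
\sum_{h\in\Omega} \|T_\lambda h\|_2^2
  &\le \sum_{s\in G} \lambda(s) \sum_{h\in\Omega} \|T_s h\|_2^2
   \le \|f'\|^2\,\|f'\|_2^2 .
\end{align*}
```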
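For each $f \in L^2(X)$, since $T_{\lambda_\eta} f \to Tf$ weakly, we have $\|Tf\|_2 \le \liminf_\eta \|T_{\lambda_\eta} f\|_2$. It follows that $\sum_{h\in\Omega} \|Th\|_2^2 \le \|f'\|^2 \|f'\|_2^2$ for every orthonormal set $\Omega \subseteq L^2(X|Y)$. Thus the restriction $L^2(X|Y) \to L^2(X)$ of $T$ is conditionally Hilbert-Schmidt.

For all $g, h \in L^\infty(X)$, we have
\begin{align*}
\langle \langle g, \alpha_s(f')\rangle_Y\, \alpha_s(f'), h\rangle
&= \int_X E_Y(g\, \overline{\alpha_s(f')})\, \alpha_s(f')\, \bar h\, d\mu \\
&= \langle \overline{\alpha_s(f')} \otimes \alpha_s(f'), \bar g \otimes h\rangle \\
&= \langle \hat\alpha_s(\bar f' \otimes f'), \bar g \otimes h\rangle \\
&= \langle \hat\alpha_s(\Phi(f')), \bar g \otimes h\rangle
\end{align*}
so that
\begin{align*}
\langle T_{\lambda_\eta} g, h\rangle = \sum_{s\in G} \lambda_\eta(s) \langle T_s g, h\rangle
&= \sum_{s\in G} \lambda_\eta(s) \langle \langle g, \alpha_s(f')\rangle_Y\, \alpha_s(f'), h\rangle \\
&= \sum_{s\in G} \lambda_\eta(s) \langle \hat\alpha_s(\Phi(f')), \bar g \otimes h\rangle \\
&\to m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \bar g \otimes h\rangle)
\end{align*}
and hence $\langle Tg, h\rangle = m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \bar g \otimes h\rangle)$, which shows that for $t \in G$, using the invariance of $m$,
\begin{align*}
\langle T\alpha_t(g), h\rangle
&= m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \overline{\alpha_t(g)} \otimes h\rangle) \\
&= m(s \mapsto \langle \hat\alpha_{ts}(\Phi(f')), \overline{\alpha_t(g)} \otimes h\rangle) \\
&= m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \hat\alpha_{t^{-1}}(\overline{\alpha_t(g)} \otimes h)\rangle) \\
&= m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \bar g \otimes \alpha_{t^{-1}}(h)\rangle) \\
&= \langle Tg, \alpha_{t^{-1}}(h)\rangle = \langle \alpha_t(Tg), h\rangle.
\end{align*}
Therefore $T$ commutes with the action of $G$. Thus for every $g \in L^\infty(X)$ the orbit $\{\alpha_s(Tg) : s \in G\}$ is equal to $\{T\alpha_s(g) : s \in G\}$. Viewing $L^\infty(X)$ as an $L^\infty(Y)$-module with the $L^\infty(Y)$-valued inner product $\langle\cdot,\cdot\rangle_Y$, and using the fact that $T(L^\infty(X)) \subseteq L^\infty(X)$ by the previous paragraph, we deduce that this orbit is conditionally precompact in $L^\infty(X)$ by Proposition B.18.

Since $f$ is not conditionally weakly mixing, by (11) we have
\[
m(s \mapsto \langle \hat\alpha_s(\Phi(f)), \Phi(f)\rangle) = m(s \mapsto \|\langle \alpha_s(f), f\rangle_Y\|_2^2) > 0.
\]
Denote this number by $c$. When $f'$ is close enough to $f$ in $L^2(X|Y)$-norm, by Proposition 2.5 the element $\Phi(f')$ is close to $\Phi(f)$, and hence the function $s \mapsto \langle \hat\alpha_s(\Phi(f')), \Phi(f')\rangle$ on $G$ is uniformly close to the function $s \mapsto \langle \hat\alpha_s(\Phi(f)), \Phi(f)\rangle$, so that the quantity
\[
\langle Tf', f'\rangle = m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \bar f' \otimes f'\rangle) = m(s \mapsto \langle \hat\alpha_s(\Phi(f')), \Phi(f')\rangle)
\]
is close to $c$.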
Since
\[
|\langle Tf', f'\rangle - \langle Tf', f\rangle| \le \|Tf'\|_2\, \|f' - f\|_2 \le \|T\|\, \|f'\|_2\, \|f' - f\|_2 \le \|f\|^3\, \|f' - f\|_2,
\]
we can thus take $f'$ to be close enough to $f$ so that $\langle Tf', f\rangle$ is nonzero. As we showed above, $T$ maps $L^\infty(X)$ into itself, and so $Tf' \in L^\infty(X)$. Since $Tf'$ is conditionally compact by the preceding paragraph and $\langle Tf', f\rangle = \int_X \langle Tf', f\rangle_Y\, d\mu \ne 0$ forces $\langle f, Tf'\rangle_Y \ne 0$, condition (iv) fails, yielding the implication.

LEMMA 2.12. Let $X \to Y' \to Y$ be extensions of p.m.p. $G$-actions. Then
\[
\|E_{Y'}(f)\|_{L^2(Y'|Y)} \le \|f\|_{L^2(X|Y)}
\]
for all $f \in L^\infty(X)$. Thus $E_{Y'} : L^\infty(X) \to L^\infty(Y')$ extends to a contractive $L^\infty(Y)$-linear map $L^2(X|Y) \to L^2(Y'|Y)$, which we again denote by $E_{Y'}$.

PROOF. For all $f \in L^\infty(X)$ we have, with the first supremum taken over $g \in L^\infty(X)$ with $\|g\|_{L^2(X|Y)} \le 1$ and the second over $h_1, h_2 \in L^\infty(Y)$ with $\|h_1\|_2, \|h_2\|_2 \le 1$,
\begin{align*}
\|f\|_{L^2(X|Y)} = \sup_g \|\langle f, g\rangle_Y\|
&= \sup_g \sup_{h_1,h_2} |\langle \langle f, g\rangle_Y h_1, h_2\rangle| \\
&= \sup_g \sup_{h_1,h_2} |\langle f\bar g, h_2 \bar h_1\rangle| \\
&= \sup_g \sup_{h_1,h_2} |\langle f, h_2 \bar h_1 g\rangle|.
\end{align*}
Using the analogous expression for $\|E_{Y'}(f)\|_{L^2(Y'|Y)}$, with the first supremum taken over $g \in L^\infty(Y')$ with $\|g\|_{L^2(Y'|Y)} \le 1$ and the second over $h_1, h_2 \in L^\infty(Y)$ with $\|h_1\|_2, \|h_2\|_2 \le 1$, we then obtain
\begin{align*}
\|E_{Y'}(f)\|_{L^2(Y'|Y)}
&= \sup_g \sup_{h_1,h_2} |\langle E_{Y'}(f), h_2 \bar h_1 g\rangle| \\
&= \sup_g \sup_{h_1,h_2} |\langle f, h_2 \bar h_1 g\rangle| \le \|f\|_{L^2(X|Y)}.
\end{align*}

LEMMA 2.13. Let $X \to Y' \to Y$ be extensions of p.m.p. $G$-actions and let $K \subseteq L^2(Y'|Y)$. Then $K$ is conditionally precompact in $L^2(X|Y)$ if and only if it is conditionally precompact in $L^2(Y'|Y)$.

PROOF. For the nontrivial direction, if $\sum_{h\in\Omega} B_{L^\infty(Y)} h$ is a finitely generated module zonotope in $L^2(X|Y)$ which $\varepsilon$-contains $1_D K$ with respect to $\|\cdot\|_{L^2(X|Y)}$ for some measurable $D \subseteq Y$ and $\varepsilon > 0$, then it follows by Lemma 2.12 that $\sum_{h\in\Omega} B_{L^\infty(Y)} E_{Y'}(h)$ is a finitely generated module zonotope in $L^2(Y'|Y)$ which $\varepsilon$-contains $1_D K$ with respect to $\|\cdot\|_{L^2(Y'|Y)}$.

LEMMA 2.14. Let $X \to Y$ be a $G$-extension which is not weakly mixing. Then there is a factorization $X \to Y' \to Y$ of $G$-extensions such that the second one is nontrivial and compact.

PROOF.
Denote by $L^\infty(Y)^\perp$ the set of all elements in $L^2(X|Y)$ which are orthogonal to $L^\infty(Y)$. Note that $L^2(X|Y) = L^\infty(Y)^\perp \oplus L^\infty(Y)$ and that the projection $p$ of $L^2(X|Y)$ onto $L^\infty(Y)^\perp$ is given by $g \mapsto g - \langle g, 1\rangle_Y$. Write $L^2(X|Y)_{\mathrm{wm}}$ for the set of conditionally weakly mixing elements in $L^2(X|Y)$. Since the extension $X \to Y$ is not weakly mixing, there exists an $f \in L^\infty(Y)^\perp \setminus L^2(X|Y)_{\mathrm{wm}}$. As $L^2(X|Y)_{\mathrm{wm}}$ is closed in $L^2(X|Y)$, there is a constant $c > 0$ such that $\|f - g\| \ge c$ for all $g \in L^2(X|Y)_{\mathrm{wm}}$. Take an $f' \in L^\infty(X)$ such that $\|f - f'\| < c/2$. Then $\|f - pf'\| = \|pf - pf'\| \le \|f - f'\| < c/2$, and hence $pf' \notin L^2(X|Y)_{\mathrm{wm}}$. Note that $pf' = f' - \langle f', 1\rangle_Y = f' - E_Y(f')$ is in $L^\infty(X)$. Apply Lemma 2.11 to $pf'$ to get a conditionally compact $h \in L^\infty(X)$ such that $pf'$ is not orthogonal to $h$. Since $pf'$ is in $L^\infty(Y)^\perp$, this means that $h$ is not in $L^\infty(Y)$. Then the $G$-invariant von Neumann algebra of elements in $L^\infty(X)$ which are conditionally compact in measure, as given by Proposition 2.9, yields a p.m.p. action $G \curvearrowright (Y', \nu')$ for which we have factor maps $X \to Y' \to Y$ such that the second one is nontrivial and, by Lemma 2.13, compact.

THEOREM 2.15. There is a countable ordinal $\lambda$ and a tower of $G$-extensions
\[
X \to Y_\lambda \to \cdots \to Y_2 \to Y_1 \to Y_0 = Y
\]
consisting of p.m.p. actions $G \curvearrowright (Y_\theta, \nu_\theta)$ for $0 \le \theta \le \lambda$ such that
(i) $X \to Y_\lambda$ is weakly mixing,
(ii) $Y_{\theta+1} \to Y_\theta$ is nontrivial and compact for every $0 \le \theta < \lambda$, and
(iii) $L^\infty(Y_\theta)$ is the von Neumann subalgebra of $L^\infty(X)$ generated by $\bigcup_{\theta' < \theta} L^\infty(Y_{\theta'})$ for every limit ordinal $0 < \theta \le \lambda$.

PROOF. By Zorn's lemma there is a maximal tower of $G$-extensions
\[
X \to Y_\lambda \to \cdots \to Y_2 \to Y_1 \to Y_0 = Y
\]
indexed by the ordinals from $0$ to some ordinal $\lambda$ such that conditions (ii) and (iii) in the theorem statement are satisfied. Since $L^2(X)$ is separable, $\lambda$ must be a countable ordinal. By maximality we cannot factor the extension $X \to Y_\lambda$ as $X \to Y' \to Y_\lambda$ where $Y' \to Y_\lambda$ is nontrivial and compact.
It follows by Lemma 2.14 that $X \to Y_\lambda$ is weakly mixing, as desired.

REMARK 2.16. A version of Theorem 2.15 is valid for $G$-actions on a von Neumann algebra preserving a faithful normal tracial state, with the tower of $G$-extensions becoming a tower of $G$-inclusions in the opposite direction. One must however make some modifications involving the notion of conditional compactness. The definition of conditional precompactness in Definition 2.1 should not require cutting down by an indicator function, and the approximate containment should be in the $L^2$-norm (this definition makes sense in any Hilbert module). The definition of conditional compactness in Definition 2.7 is interpreted accordingly, and compactness for a $G$-inclusion $N \hookrightarrow M$ is defined to mean that $M$ is generated as a von Neumann algebra by its conditionally compact elements.

REMARK 2.17. The p.m.p. actions for which the tower in Theorem 2.15 consists only of compact extensions are said to be distal, as suggested by Furstenberg in [48] in analogy with topological dynamics (see Section 6.3).

2.3. Multiple recurrence and Szemerédi's theorem

The goal of this section is to establish Furstenberg's multiple recurrence theorem (Theorem 2.25) and derive from it Szemerédi's theorem (Theorem 2.26). There is also a multidimensional version of multiple recurrence [50], but we will prove the theorem in its original form for p.m.p. $\mathbb{Z}$-actions. As we will think of a $\mathbb{Z}$-action in terms of its generating transformation, it will be convenient here to abandon our conventional notation $\mathbb{Z} \curvearrowright (X, \mu)$ in favour of $(X, \mu, T)$ where $T$ is a measure-preserving transformation of the probability space $(X, \mu)$. We will refer to $(X, \mu, T)$ as a (p.m.p.) system. For brevity we will often write $\mu(f)$ to mean $\int_X f \, d\mu$ for integrable functions $f$ on a probability space $(X, \mu)$.

Recall (Definition C.15) that a subset $J$ of $\mathbb{Z}$ is syndetic if finitely many of its translates cover $\mathbb{Z}$, i.e., there exists a finite set $F \subseteq \mathbb{Z}$ such that $\bigcup_{n \in F} (n + J) = \mathbb{Z}$.
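The covering condition in the definition of syndeticity can be explored numerically on a finite window. The following Python sketch is only an illustration under stated assumptions: the helper names `translates_cover`, `is_multiple_of_3`, and `is_square` are hypothetical and not part of the text, and a check on a finite window can refute a proposed covering set $F$ or suggest that it works, but it cannot prove that a set is syndetic.

```python
from math import isqrt


def translates_cover(in_J, F, lo, hi):
    """Return True if every integer x in [lo, hi] lies in n + J for some n in F.

    in_J is a membership predicate for J (hypothetical helper for illustration);
    x belongs to n + J exactly when x - n belongs to J.
    """
    return all(any(in_J(x - n) for n in F) for x in range(lo, hi + 1))


def is_multiple_of_3(x):
    # J = 3Z, which is syndetic: the translates by 0, 1, 2 cover Z.
    return x % 3 == 0


def is_square(x):
    # J = {0, 1, 4, 9, ...}, whose gaps grow without bound, so J is not syndetic.
    return x >= 0 and isqrt(x) ** 2 == x


# The three translates of 3Z cover the window, while two of them do not.
print(translates_cover(is_multiple_of_3, [0, 1, 2], -50, 50))
print(translates_cover(is_multiple_of_3, [0, 1], -50, 50))
# Ten translates of the squares already fail to cover the window [0, 200].
print(translates_cover(is_square, range(10), 0, 200))
```

For the squares, the failure occurs because consecutive squares such as 169 and 196 are more than ten apart, so no fixed finite $F$ can work once the window is large enough.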
DEFINITION 2.18. We say that the system $(X, \mu, T)$ is syndetically multiply recurrent, or SMR for short, if for every nonnegative function $f \in L^\infty(X)$ with $\mu(f) > 0$ and every $k \in \mathbb{N}$ there are a syndetic set $J \subseteq \mathbb{Z}$ and a $\delta > 0$ such that
\[
\int_X \prod_{i=0}^{k-1} T^{in} f \, d\mu \ge \delta
\]
for all $n \in J$. Equivalently,
\[
\inf_{M \in \mathbb{Z}} \frac{1}{N} \sum_{n=M}^{M+N-1} \int_X \prod_{i=0}^{k-1} T^{in} f \, d\mu > 0
\]
for some $N \in \mathbb{N}$.

We prove Furstenberg's multiple recurrence theorem in the form that states that every p.m.p. system is SMR. Although the essential scheme of the argument is the same as Furstenberg's, we avoid measure disintegration in favour of an operator algebra approach along the lines of [131]. We first show that SMR is preserved under weakly mixing extensions (Lemma 2.22), and then under compact extensions (Lemma 2.23), and then finally under inverse limits (Lemma 2.24). With these ingredients at hand, the Furstenberg-Zimmer structure theorem immediately implies the conclusion (Theorem 2.25).

2.3.1. SMR is preserved under weakly mixing extensions.

LEMMA 2.19 (van der Corput-type lemma). Let $\{\xi_n\}_{n=1}^\infty$ be a bounded sequence in a Hilbert space such that
\[
\lim_{H\to\infty} \frac{1}{H} \sum_{h=0}^{H-1} \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg| \frac{1}{N} \sum_{n=M}^{M+N-1} \langle \xi_n, \xi_{n+h}\rangle \bigg| = 0 .
\]
Then $\lim_{N\to\infty} \sup_{M\in\mathbb{Z}} \big\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \big\| = 0$.

PROOF. We may assume by scaling that $\|\xi_n\| \le 1$ for all $n \in \mathbb{N}$. Given $N, H \in \mathbb{N}$ and $M \in \mathbb{Z}$ we have
\[
\bigg\| \sum_{n=M}^{M+N-1} \xi_n - \frac{1}{H} \sum_{n=M}^{M+N-1} \sum_{h=0}^{H-1} \xi_{n+h} \bigg\|
\le \frac{1}{H} \sum_{h=0}^{H-1} \bigg\| \sum_{n=M}^{M+N-1} \xi_{n+h} - \sum_{n=M}^{M+N-1} \xi_n \bigg\|
\le \frac{1}{H} \sum_{h=0}^{H-1} 2h \le 2H
\]
where the second last inequality follows by telescoping. Averaging over $N$ then yields
\[
\bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \bigg\| \le \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \frac{1}{H} \sum_{h=0}^{H-1} \xi_{n+h} \bigg\| + \frac{2H}{N} .
\]
Now we square both sides and use the inequalities $(a+b)^2 \le 2a^2 + 2b^2$ and $\big| \frac{1}{k} \sum_{i=1}^k a_i \big|^2 \le \frac{1}{k} \sum_{i=1}^k |a_i|^2$ (the latter obtained by applying Cauchy-Schwarz to the vectors $(a_1, \dots, a_k)$ and $(1/k, \dots, 1/k)$ in $\mathbb{C}^k$) to get
\[
\bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \bigg\|^2 \le \frac{2}{N} \sum_{n=M}^{M+N-1} \bigg\| \frac{1}{H} \sum_{h=0}^{H-1} \xi_{n+h} \bigg\|^2 + \frac{4H^2}{N^2}
\]
\[
\le \frac{2}{H^2} \sum_{h,h'=0}^{H-1} \bigg| \frac{1}{N} \sum_{n=M}^{M+N-1} \langle \xi_{n+h}, \xi_{n+h'}\rangle \bigg| + \frac{4H^2}{N^2} .
\]
Since by telescoping we have, for all $h, h' = 0, \dots, H-1$,
\[
\bigg| \sum_{n=M}^{M+N-1} \langle \xi_{n+h}, \xi_{n+h'}\rangle - \sum_{n=M}^{M+N-1} \langle \xi_{n+h-h'}, \xi_n\rangle \bigg| \le 2H ,
\]
it follows that
\[
\bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \bigg\|^2 \le \frac{4}{H} \sum_{h=0}^{H-1} \bigg| \frac{1}{N} \sum_{n=M}^{M+N-1} \langle \xi_{n+h}, \xi_n\rangle \bigg| + \frac{4H}{N} + \frac{4H^2}{N^2} .
\]
Taking the supremum over all $M \in \mathbb{Z}$ and then the limit supremum as $N \to \infty$ yields
\[
\limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \bigg\|^2 \le \frac{4}{H} \sum_{h=0}^{H-1} \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg| \frac{1}{N} \sum_{n=M}^{M+N-1} \langle \xi_{n+h}, \xi_n\rangle \bigg| .
\]
Now take the limit as $H \to \infty$ to obtain $\limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \big\| \frac{1}{N} \sum_{n=M}^{M+N-1} \xi_n \big\|^2 = 0$, from which the result follows.

LEMMA 2.20. Let $(X, \mu, T)$ be a p.m.p. system. Then the orthogonal projection $P$ from $L^2(X)$ onto the closed subspace of $T$-invariant vectors sends $L^\infty(X)$ into itself.

PROOF. Let $g \in L^\infty(X)$. By the mean ergodic theorem (Theorem 3.22) we have $n^{-1} \sum_{j=0}^{n-1} T^j g \to Pg$ in $L^2(X)$ as $n \to \infty$. Thus, for all $h_1, h_2 \in L^\infty(X)$ with $\|h_1\|_2, \|h_2\|_2 \le 1$,
\[
\|g\|_\infty \ge \bigg| \bigg\langle \bigg( \frac{1}{n} \sum_{j=0}^{n-1} T^j g \bigg) h_1, h_2 \bigg\rangle \bigg| \to |\langle (Pg) h_1, h_2\rangle| ,
\]
showing that $Pg \in L^\infty(X)$ and $\|Pg\|_\infty \le \|g\|_\infty$.

The following is a mean ergodic theorem for weakly mixing extensions (note that the expectation only appears in the second product).

LEMMA 2.21. Let $(X, \mu, T)$ be a weakly mixing extension of a system $(Y, \nu, S)$. Let $k \in \mathbb{N}$ and $f_1, \dots, f_k \in L^\infty(X)$ and let $c_1, \dots, c_k$ be distinct nonzero integers. Then
\[
\lim_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \bigg( \prod_{i=1}^k T^{c_i n} f_i - \prod_{i=1}^k T^{c_i n} E_Y(f_i) \bigg) \bigg\|_2 = 0 .
\]

PROOF. We will proceed by induction on $k$. First we argue the case $k = 1$. By the mean ergodic theorem (Theorem 3.22) it suffices to show that the projection $P$ of $L^2(X)$ onto the subspace of $T^{c_1}$-invariant vectors has image in $L^2(Y)$, for then $P(f_1 - E_Y(f_1)) = 0$. By Proposition C.17 the invariant mean on $\mathrm{WAP}(\mathbb{Z})$ can be expressed as the weak$^*$ limit of the averages $g \mapsto n^{-1} \sum_{j=0}^{n-1} g(j)$ as $n \to \infty$.
From this we see that the weakly almost periodic function in Definition 2.7 has mean zero over $c_1\mathbb{Z}$ whenever it has mean zero over $\mathbb{Z}$, which shows that $(X, \mu, T^{c_1})$ is a weakly mixing extension of $(Y, \nu, S^{c_1})$. The definition of conditional weak mixing then shows that the subspace of $T^{c_1}$-invariant vectors in $L^2(X|Y)$ is contained in $L^\infty(Y)$. Now let $f$ be an element of $L^2(X)$ such that $T^{c_1} f = f$. Then given an $\varepsilon > 0$ we can find a $g \in L^\infty(X)$ such that $\|f - g\|_2 < \varepsilon$. Then $Pg$ lies in $L^\infty(X)$ by Lemma 2.20, and hence in $L^\infty(Y)$. Since $\|f - Pg\|_2 = \|P(f - g)\|_2 < \varepsilon$ we see that $f \in L^2(Y)$, whence $P(L^2(X)) \subseteq L^2(Y)$, as desired.

Assuming now the validity of the case $k - 1$ for some $k > 1$, let us establish it for $k$. For any $n \in \mathbb{Z}$ we have, expressing a difference of products by untelescoping in the usual way,
\[
\prod_{i=1}^k T^{c_i n} f_i - \prod_{i=1}^k T^{c_i n} E_Y(f_i)
= \sum_{j=1}^k \bigg( \prod_{i=1}^{j-1} T^{c_i n} f_i \bigg) \big( T^{c_j n} (f_j - E_Y(f_j)) \big) \bigg( \prod_{i=j+1}^k T^{c_i n} E_Y(f_i) \bigg) .
\]
This allows us to reduce to the case that one of the functions $f_i$ satisfies $E_Y(f_i) = 0$, since this will imply that the supremum over $M \in \mathbb{Z}$ of the averages over $n = M, \dots, M+N-1$ of each of the above $k$ summands tends to zero in $L^2$-norm as $N \to \infty$, given that $E_Y(f_i - E_Y(f_i)) = 0$ for every $i$. We may therefore assume, relabeling if necessary, that $E_Y(f_1) = 0$. By Lemma 2.19 we need only show that
\[
\lim_{H\to\infty} \frac{1}{H} \sum_{h=0}^{H-1} \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg| \frac{1}{N} \sum_{n=M}^{M+N-1} \bigg\langle \prod_{i=1}^k T^{c_i(n+h)} f_i , \prod_{i=1}^k T^{c_i n} f_i \bigg\rangle \bigg| = 0 .
\]
By the $T$-invariance of $\mu$, the inner products above can be written as
\[
\int_X \bar{f}_k T^{c_k h} f_k \prod_{i=1}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \, d\mu
\]
and the absolute value of the average of these from $n = M$ to $M+N-1$ is bounded by
\[
\| \bar{f}_k T^{c_k h} f_k \|_2 \, \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \prod_{i=1}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\|_2
\]
which in turn is at most
\[
\|f_k\|^2 \, \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \prod_{i=1}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\|_2 .
\]
We will thus be done upon showing that
\[
\lim_{H\to\infty} \frac{1}{H} \sum_{h=0}^{H-1} \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \prod_{i=1}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\|_2 = 0 . \tag{17}
\]
Writing $\bar{f}_1 T^{c_1 h} f_1$ as the sum of $E_Y(\bar{f}_1 T^{c_1 h} f_1)$ and $\bar{f}_1 T^{c_1 h} f_1 - E_Y(\bar{f}_1 T^{c_1 h} f_1)$, the latter of which lies in the kernel of $E_Y$, we derive for each $h$ the estimate
\[
\begin{aligned}
\limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} & \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} \prod_{i=1}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\|_2 \\
&\le \| E_Y(\bar{f}_1 T^{c_1 h} f_1) \|_2 \bigg[ \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \frac{1}{N} \sum_{n=M}^{M+N-1} \bigg\| \prod_{i=2}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\| \bigg] \\
&\quad + \limsup_{N\to\infty} \sup_{M\in\mathbb{Z}} \bigg\| \frac{1}{N} \sum_{n=M}^{M+N-1} T^{(c_1 - c_k)n} \big( \bar{f}_1 T^{c_1 h} f_1 - E_Y(\bar{f}_1 T^{c_1 h} f_1) \big) \\
&\qquad\qquad \times \prod_{i=2}^{k-1} T^{(c_i - c_k)n} (\bar{f}_i T^{c_i h} f_i) \bigg\|_2 .
\end{aligned}
\]
The second expression on the right side of this inequality is zero by the inductive hypothesis, and the average of the first expression from $h = 0$ to $H - 1$ tends to zero as $H \to \infty$ because (i) $\lim_{H\to\infty} \frac{1}{H} \sum_{h=0}^{H-1} \| E_Y(\bar{f}_1 T^{c_1 h} f_1) \|_2 = 0$ by the definition of conditional weak mixing and Proposition C.17 and (ii) the limit supremum inside the brackets is bounded above by $\prod_{i=2}^{k-1} \|f_i\|^2$, which is independent of $h$. We thus obtain (17), completing the proof.

LEMMA 2.22. Let $(X, \mu, T)$ be a weakly mixing extension of an SMR system $(Y, \nu, S)$. Then the system $(X, \mu, T)$ is SMR.

PROOF. Let $f$ be a nonnegative function in $L^\infty(X)$ with $\mu(f) > 0$. Let $k \in \mathbb{N}$. It follows from Lemma 2.21 that
\[
\limsup_{N\to\infty} \inf_{M\in\mathbb{Z}} \frac{1}{N} \sum_{n=M}^{M+N-1} \int_X \prod_{i=0}^{k-1} T^{in} f \, d\mu
= \limsup_{N\to\infty} \inf_{M\in\mathbb{Z}} \frac{1}{N} \sum_{n=M}^{M+N-1} \int_Y \prod_{i=0}^{k-1} S^{in} E_Y(f) \, d\nu ,
\]
and the latter limit supremum is greater than zero by our SMR hypothesis on $S$ since $\nu(E_Y(f)) = \mu(f) > 0$. Hence $(X, \mu, T)$ is SMR.

2.3.2. SMR is preserved under compact extensions.

LEMMA 2.23. Let $(X, \mu, T)$ be a compact extension of an SMR system $(Y, \nu, S)$. Then the system $(X, \mu, T)$ is SMR.

PROOF. Let $k \in \mathbb{N}$. Let $f$ be a nonnegative function in $L^\infty(X)$ with $\mu(f) > 0$.
For the purposes of establishing SMR, we may assume that $f$ is conditionally compact by replacing it with $1_D f$ for a suitable measurable set $D \subseteq Y$ for which $\mu(1_D f) > 0$. We may also assume that $\|f\| \le 1$. Since $\nu(E_Y(f)) = \mu(f)$ and $\|E_Y(f)\| \le 1$ we can find a $B \subseteq Y$ such that $\nu(B) \ge \mu(f)/2$ and $E_Y(f)(y) \ge \mu(f)/2$ for all $y \in B$. Fix an $\varepsilon > 0$ such that $\varepsilon < \mu(f)^{2^k} / 2^{2^k}$.

For a.e. $y \in Y$ we can equip the linear span of $\{T^n f : n \in \mathbb{Z}\}$ with the pre-inner product $\langle g, h\rangle = E_Y(g\bar{h})(y)$ and use the associated seminorm to define on the $k$-fold direct sum $(\mathrm{span}\{T^n f : n \in \mathbb{Z}\})^{\oplus k}$ the seminorm
\[
\|(f_0, \dots, f_{k-1})\|_y = \max_{l=0,\dots,k-1} E_Y(|f_l|^2)^{1/2}(y) .
\]
Write $\Omega_y$ for the subset of this seminormed space consisting of all tuples of the form $(T^{ln} f)_{l=0,\dots,k-1}$ for $n \in \mathbb{Z}$.

Since $f$ is conditionally compact we can find a finite subset $\{e_1, \dots, e_r\}$ of $L^2(X|Y)$, elements $a_{n,i}$ in the unit ball of $L^\infty(Y)$, and a set $D \subseteq Y$ with $\nu(B \cap D) > 0$ such that
\[
\bigg\| 1_D \bigg( T^n f - \sum_{i=1}^r a_{n,i} e_i \bigg) \bigg\| < \frac{\varepsilon}{3k} \tag{18}
\]
for all $n \in \mathbb{Z}$. For a.e. $y \in D$ the tuples $(a_{n,i}(y))_{i=1,\dots,r}$ for $n \in \mathbb{Z}$ are contained in the unit ball of $\ell^\infty(\{1, \dots, r\})$, which is compact, and so using (18) and the triangle inequality we deduce that for a.e. $y \in D$ the maximum cardinality of an $(\varepsilon/k)$-separated subset of $\Omega_y$ is finite (by $\varepsilon$-separated we mean that any two distinct points in the set are at distance greater than $\varepsilon$ from each other). We can therefore construct a measurable map $y \mapsto F_y$ from $D$ into the finite subsets of $\mathbb{Z}$ such that for a.e. $y \in D$ we have
\[
\min_{m \ne m' \in F_y} \; \max_{l=0,\dots,k-1} E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(y) > \frac{\varepsilon}{k}
\]
and this minimum is at most $\varepsilon/k$ when $F_y$ is replaced with any finite subset of $\mathbb{Z}$ with larger cardinality than $F_y$. We can then find a finite set $F \subseteq \mathbb{Z}$ such that $F = F_y$ for all $y$ in a set $A \subseteq B \cap D$ with $\nu(A) > 0$, and by replacing $A$ with a smaller set we may assume that there is an $\eta > 0$ such that, for all $y \in A$,
\[
\min_{m \ne m' \in F} \; \max_{l=0,\dots,k-1} E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(y) > \frac{\varepsilon}{k} + \eta .
\]
By replacing $A$ by an even smaller set we may furthermore assume that for all $l = 0, \dots, k-1$ and distinct $m, m' \in F$ the function $y \mapsto E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(y)$ varies by at most $\eta$ on $A$.

By our SMR hypothesis there is a $\delta > 0$ such that the set $J$ of all $n \in \mathbb{Z}$ for which $\nu\big( \bigcap_{l=0}^{k-1} T^{ln} A \big) \ge \delta$ is syndetic. Fix an $n \in J$. Set $A' = \bigcap_{l=0}^{k-1} T^{ln} A$. Let $y \in A'$. Then for all distinct $m, m' \in F$ we can find an $l \in \{0, \dots, k-1\}$ such that $E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(y) > \varepsilon/k + \eta$, in which case
\[
\begin{aligned}
E_Y(|T^{l(m+n)} f - T^{l(m'+n)} f|^2)^{1/2}(y) &= E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(T^{-ln} y) \\
&\ge E_Y(|T^{lm} f - T^{lm'} f|^2)^{1/2}(y) - \eta > \frac{\varepsilon}{k} .
\end{aligned}
\]
This shows that the set of all tuples $(T^{l(m+n)} f)_{l=0,\dots,k-1}$ for $m \in F$ is $(\varepsilon/k)$-separated in $\Omega_y$. It follows by our choice of $F$ that for every $y \in A'$ we can find an $m \in F$ such that the $k$-tuple $(f, \dots, f)$ lies within distance $\varepsilon/k$ of $(T^{l(m+n)} f)_{l=0,\dots,k-1}$, i.e., $E_Y(|T^{l(m+n)} f - f|^2)^{1/2}(y) < \varepsilon/k$ for all $l = 0, \dots, k-1$. Untelescoping a difference of products in the usual way, applying the Cauchy-Schwarz inequality, and using the fact that $\|f\|_\infty \le 1$, we then have
\[
\begin{aligned}
\bigg| E_Y \bigg( \prod_{l=0}^{k-1} T^{l(m+n)} f - f^k \bigg)(y) \bigg|
&\le \sum_{l=0}^{k-1} \bigg| E_Y \bigg( \bigg( \prod_{j=0}^{l-1} T^{j(m+n)} f \bigg) \big( T^{l(m+n)} f - f \big) \bigg( \prod_{j=l+1}^{k-1} f \bigg) \bigg)(y) \bigg| \\
&\le \sum_{l=0}^{k-1} E_Y(|T^{l(m+n)} f - f|^2)^{1/2}(y) \le k \cdot \frac{\varepsilon}{k} = \varepsilon
\end{aligned}
\]
and thus, since $E_Y(f) \le E_Y(f^{2^k})^{1/2^k} \le E_Y(f^k)^{1/2^k}$ by $k$ applications of the Cauchy-Schwarz inequality (Proposition B.2) and the fact that $0 \le f \le 1$,
\[
E_Y \bigg( \prod_{l=0}^{k-1} T^{l(m+n)} f \bigg)(y) \ge E_Y(f^k)(y) - \varepsilon \ge \frac{\mu(f)^{2^k}}{2^{2^k}} - \varepsilon .
\]
Now $m$ depends on $y$ but ranges in the set $F$, and so there must exist a single $m \in F$ such that
\[
\int_{A'} E_Y \bigg( \prod_{l=0}^{k-1} T^{l(m+n)} f \bigg) d\nu \ge \frac{\delta}{|F|} \bigg( \frac{\mu(f)^{2^k}}{2^{2^k}} - \varepsilon \bigg) ,
\]
in which case
\[
\int_X \prod_{l=0}^{k-1} T^{l(m+n)} f \, d\mu = \int_Y E_Y \bigg( \prod_{l=0}^{k-1} T^{l(m+n)} f \bigg) d\nu \ge \frac{\delta}{|F|} \bigg( \frac{\mu(f)^{2^k}}{2^{2^k}} - \varepsilon \bigg) .
\]
This last expression is greater than zero by our choice of $\varepsilon$ and does not depend on $n$.
Observe finally that, writing $m_n$ for the number $m$ as a function of $n$, the set $\{m_n + n : n \in J\}$ is syndetic. We conclude that the system $(X, \mu, T)$ is SMR.

2.3.3. SMR and Szemerédi's theorem.

LEMMA 2.24. Let $(X, \mu, T)$ be the inverse limit of a net of SMR systems $(Y_\gamma, \nu_\gamma, S_\gamma)$. Then $X$ is SMR.

PROOF. Let $k \in \mathbb{N}$. Let $f$ be a nonnegative function in $L^\infty(X)$ with $\mu(f) > 0$. Then there is an $\varepsilon > 0$ such that $f(x) \ge \varepsilon$ for all $x$ in a set $A \subseteq X$ of measure at least $\varepsilon$, and so for the purpose of showing syndetic multiple recurrence we may assume that $f = 1_A$. Since $\lim_\gamma \|E_{Y_\gamma}(1_A) - 1_A\|_2 = 0$, one of the systems $(Y, \nu, S)$ in the net will have the property that the set $B = \{y \in Y : E_Y(1_A)(y) \ge 1 - 1/(2k)\}$ has nonzero measure. For a given $n \in \mathbb{Z}$ we have $\prod_{i=0}^{k-1} T^{in} 1_A \ge 1 - \sum_{i=0}^{k-1} T^{in} 1_{X \setminus A}$ and hence
\[
E_Y \bigg( \prod_{i=0}^{k-1} T^{in} 1_A \bigg) \ge 1 - \sum_{i=0}^{k-1} E_Y(T^{in} 1_{X \setminus A}) = 1 - \sum_{i=0}^{k-1} E_Y(1 - T^{in} 1_A)
\]
and the value of this last function at any $y \in \bigcap_{i=0}^{k-1} T^{in} B$ is at least $1/2$. It follows that
\[
\int_X \prod_{i=0}^{k-1} T^{in} 1_A \, d\mu = \int_Y E_Y \bigg( \prod_{i=0}^{k-1} T^{in} 1_A \bigg) d\nu \ge \frac{1}{2} \, \nu \bigg( \bigcap_{i=0}^{k-1} T^{in} B \bigg) ,
\]
and this last quantity is greater than some $\delta > 0$ for all $n$ in a syndetic subset of $\mathbb{Z}$ by the syndetic multiple recurrence of $(Y, \nu, S)$, establishing the lemma.

THEOREM 2.25 (Furstenberg's multiple recurrence theorem). Every measure-preserving $\mathbb{Z}$-system is SMR.

PROOF. Combine Theorem 2.15 with Lemmas 2.22, 2.23, and 2.24.

THEOREM 2.26 (Szemerédi's theorem). Every subset of $\mathbb{Z}$ with positive upper density contains arbitrarily long arithmetic progressions.

PROOF. Let $A$ be a subset of $\mathbb{Z}$ with positive upper density. Then there exists a sequence of integers $0 < n_1 < n_2 < \dots$ such that the limit infimum of $\frac{1}{2n_k+1} |A \cap \{-n_k, \dots, n_k\}|$ as $k \to \infty$ is a nonzero number, say $\delta$. Consider the shift $T$ on $\{0,1\}^{\mathbb{Z}}$ given by $Tx(j) = x(j-1)$. We encode $A$ as an element $a$ of $\{0,1\}^{\mathbb{Z}}$ by declaring $a(j)$ to be $1$ if $j \in A$ and $0$ otherwise. Write $\delta_a$ for the point mass at $a$, i.e., $\delta_a(f) = f(a)$ for $f \in$
$C(\{0,1\}^{\mathbb{Z}})$. Take a weak$^*$ limit point $\mu$ of the sequence $\big\{ \frac{1}{2n_k+1} \sum_{j=-n_k}^{n_k} \delta_a \circ T^j \big\}_{k=1}^\infty$. Then $\mu$ is $T$-invariant, and the cylinder set $B$ consisting of all $x \in \{0,1\}^{\mathbb{Z}}$ such that $x(0) = 1$ satisfies $\mu(B) \ge \delta > 0$. For every $k \in \mathbb{N}$, Theorem 2.25 yields $\mu\big( \bigcap_{i=0}^{k-1} T^{in} B \big) > 0$ for some nonzero $n \in \mathbb{Z}$. Writing $X = \overline{\{T^j a : j \in \mathbb{Z}\}}$, we note that $\mu$ is supported on $X$ and that $X \cap \bigcap_{i=0}^{k-1} T^{in} B$ is open in $X$, and so we conclude that $A$ contains an arithmetic progression of the form $m, m+n, m+2n, \dots, m+(k-1)n$.

2.4. Notes and references

The structure theory of Section 2.2 was worked out by Zimmer in [148, 149]. These ideas were also developed by Furstenberg in the setting of $\mathbb{Z}$-actions for his proof of Szemerédi's theorem, which appeared in [48]. Furstenberg's proof is also presented (in the multidimensional form of [50]) in his book on dynamics and combinatorial number theory [49], as well as in the article [51] by Furstenberg, Katznelson, and Weiss. For other expositions see [100, 10, 131, 36]. Here we have adopted the Hilbert module approach of Tao [131], although our use of Hilbert-Schmidt operators is somewhat different. Bergelson showed in [10] that the fact that multiple recurrence is preserved under compact extensions can be established using van der Waerden's theorem, and this approach also appears in [131, 36]. For this step we have followed the argument in [51], which enables us to show that the syndeticity of the multiple recurrence is preserved, and have also benefited from the presentation in [36]. Our definition of compact extension is different from the customary ones but is equivalent to these, as one can observe by comparing it to the properties C3 and C4 in Section 6.3 of Furstenberg's book [49], the first of which it easily implies and the second of which is easily implied by it. A structural description of ergodic compact extensions is given in [148].
The dichotomy between weak mixing and compactness for trace-preserving actions on possibly noncommutative von Neumann algebras was investigated in [119, 6] using an approach centred around the basic construction for inclusions of finite von Neumann algebras.