Averaging and Rates of Averaging
for Uniform Families
of Deterministic Fast-Slow Skew Product Systems
A. Korepanov
Z. Kosloff
I. Melbourne
Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
21 September 2015
Abstract
We consider families of fast-slow skew product maps of the form

  x_{n+1} = x_n + ε a(x_n, y_n, ε),
  y_{n+1} = T_ε y_n,

where T_ε is a family of nonuniformly expanding maps, and prove averaging and rates of averaging for the slow variables x as ε → 0. Similar results are obtained also for continuous time systems

  ẋ = ε a(x, y, ε),
  ẏ = g_ε(y).

Our results include cases where the family of fast dynamical systems consists of intermittent maps, unimodal maps (along the Collet-Eckmann parameters) and Viana maps.
1 Introduction
The classical Krylov-Bogolyubov averaging method [32] deals with skew product flows of the form

  ẋ = ε a(x, y, ε),
  ẏ = g(y).

Let ν be an ergodic invariant probability measure for the fast flow generated by g. Under a uniform Lipschitz condition on a, it can be shown that solutions to the slow x dynamics, suitably rescaled, converge almost surely to solutions of an averaged ODE Ẋ = ā(X) where ā(x) = ∫ a(x, y, 0) dν(y).
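This classical statement can be sanity-checked numerically. The sketch below is an illustration added here, not taken from the paper; it uses the hypothetical choices of g generating an irrational rotation on the circle (so ν is Haar measure) and a(x, y, ε) = −x + sin(2πy), giving ā(x) = −x and X(t) = x_0 e^{−t}:

```python
import math

def averaging_error(eps, x0=1.0, y0=0.3, omega=math.sqrt(2.0), dt=1e-3):
    """Integrate xdot = eps*a(x, y), ydot = omega (fast irrational rotation,
    nu = Haar measure on the circle) up to time 1/eps by the Euler method,
    and return sup_t |x(t) - X(eps*t)| where Xdot = abar(X) = -X."""
    x, y = x0, y0
    err = 0.0
    n = int(1.0 / (eps * dt))
    for k in range(1, n + 1):
        x += dt * eps * (-x + math.sin(2.0 * math.pi * y))
        y = (y + dt * omega) % 1.0
        err = max(err, abs(x - x0 * math.exp(-eps * k * dt)))
    return err
```

Since ∫ sin(2πy) dν(y) = 0, the averaged ODE is Ẋ = −X, and in this uncoupled, highly regular setting the observed error shrinks as ε → 0.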
A considerably harder problem is to handle the fully-coupled situation

  ẋ = ε a(x, y, ε),
  ẏ = g(x, y, ε).
Here it is supposed that there is a distinguished family of ergodic invariant probability measures ν_{x,ε} for the fast vector fields g(x, ·, ε), and the averaged vector field is given by ā(x) = ∫ a(x, y, 0) dν_{x,0}(y). The first results on averaging for fully-coupled systems were due to Anosov [6] who considered the case where the fast vector fields are Anosov with ν_{x,ε} absolutely continuous. Convergence here is in the sense of convergence in probability with respect to Lebesgue measure.

Kifer [24, 25] extended the results of [6] to the case where the fast vector fields are Axiom A (uniformly hyperbolic) with SRB measures ν_{x,ε}. More generally, Kifer considers the case where x ↦ ν_{x,0} is sufficiently regular so that ā is Lipschitz, and gives necessary and sufficient conditions for averaging to hold. However, the only situations where the conditions in [24, 25] are verified are in the Axiom A case, even though it is hoped [25] that the conditions are verifiable for nonuniformly hyperbolic examples. Analogous results for the discrete time case are obtained in [23]. See also [15, Theorem 5] for certain partially hyperbolic fast vector fields.
Here, we consider an intermediate class of examples that lies between the classical uncoupled situation and the fully coupled systems of [6, 24], namely families of skew products of the form

  ẋ = ε a(x, y, ε),
  ẏ = g(y, ε),   (1.1)

with a distinguished family of ergodic invariant measures ν_ε and averaged vector field ā(x) = ∫ a(x, y, 0) dν_0(y). Notice that in this way we avoid issues concerned with the regularity of the averaged vector field ā, but we still have to deal with the ε-dependence of the measures ν_ε as well as the fast vector fields. In other words, linear response (differentiability) of the invariant measures is replaced by statistical stability (weak convergence), which is more tractable. Indeed one aspect of the general framework in this paper is that our averaging theorems hold in a similar generality to the methods of Alves & Viana [2, 5] for proving statistical stability.
Hence, we obtain results on averaging and rates of averaging for a large class of
families of skew products (1.1), going far beyond the uniformly hyperbolic setting,
both in discrete and continuous time. Our examples include situations where the
fast dynamics is given by intermittent maps with arbitrarily poor mixing properties,
unimodal maps where linear response fails, and flows built as suspensions over such
maps.
We obtain results also on rates of averaging. In the very simple situation ẋ = ε a(y), ẏ = g(y), where g is a uniformly expanding semiflow or uniformly hyperbolic flow, it is easily seen that the optimal rate of averaging in L^1 is O(ε^{1/2}). For systems of the form (1.1), we often obtain the essentially optimal rate O(ε^{1/2−}).¹

¹ ε^{q−} denotes ε^{q−a} for all a > 0.

We have chosen to focus in this paper on the case of noninvertible dynamical systems. In this situation, the measures of interest are absolutely continuous and we are able to present the main ideas without going into the technical issues presented by dealing with nonabsolutely continuous measures as required in the invertible setting. The invertible case will be covered in a separate paper.
Even in the noninvertible setting, our results depend strongly on extensions and
clarifications of the classical first order and second order averaging theorems. These
prerequisites are presented in Appendices A and B and may be of independent interest.
The remainder of the paper is organised as follows. In Section 2, we set up the averaging problem for families of fast-slow skew product systems in the discrete time case, leading to a general result, Theorem 2.2, for such systems. In Section 3, we show
that Theorem 2.2 leads easily to averaging when the fast dynamics is a family of
uniformly expanding maps. Section 4 is the heart of the paper and deals with the
case when the fast dynamics is a family of nonuniformly expanding maps. Our main
examples are presented in Section 5. In Section 6, we show how the continuous time
case reduces to the discrete time case. In Section 7, we present a simple example to
show that almost sure convergence fails for families of skew products.
2 General averaging theorem for families of skew products
Let T_ε : M → M, 0 ≤ ε < ε_0, be a family of transformations defined on a measurable space M. For each ε ∈ [0, ε_0), let ν_ε denote a T_ε-invariant ergodic probability measure on M.

We consider the family of fast-slow systems

  x^{(ε)}_{n+1} = x^{(ε)}_n + ε a(x^{(ε)}_n, y^{(ε)}_n, ε),   x^{(ε)}_0 = x_0,
  y^{(ε)}_{n+1} = T_ε y^{(ε)}_n,   y^{(ε)}_0 = y_0,   (2.1)

where the initial condition x^{(ε)}_0 = x_0 is fixed throughout. The initial condition y_0 ∈ M is chosen randomly with respect to various measures that are specified in the statements of the results. Here a : ℝ^d × M × [0, ε_0) → ℝ^d is a family of functions satisfying certain regularity hypotheses.

Define ā(x) = ∫_M a(x, y, 0) dν_0(y) and consider the ODE

  Ẋ = ā(X),   X(0) = x_0.   (2.2)
We are interested in the convergence, and rate of convergence, of the slow variables x^{(ε)}_n, suitably rescaled, to solutions X(t) of this ODE. More precisely, define x̂^{(ε)} : [0, 1] → ℝ^d by setting x̂^{(ε)}(t) = x^{(ε)}_{[t/ε]}. We study convergence of the difference

  z_ε = sup_{t∈[0,1]} |x̂^{(ε)}(t) − X(t)|.
Remark 2.1 The restriction to the time interval [0, 1] entails no loss of generality: the results apply to arbitrary bounded intervals by rescaling ε.
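To make the objects x̂^{(ε)}, X and z_ε concrete, here is a toy simulation. It is an illustration added here, under assumptions not made in the paper: d = 1, T_ε independent of ε and equal to the quadratic map T y = 4y(1 − y) with its arcsine invariant measure ν (for which ∫ y dν = 1/2), and a(x, y, ε) = −x + 2(y − 1/2), so that ā(x) = −x and X(t) = x_0 e^{−t}:

```python
import math, random

def z_eps(eps, x0=1.0, seed=12345):
    """Simulate the fast-slow system (2.1) with T y = 4y(1-y) (quadratic map,
    nu = arcsine measure) and a(x, y, eps) = -x + 2*(y - 1/2), so that
    abar(x) = -x, and return z_eps = max_n |x_n - X(n*eps)| for n <= 1/eps."""
    random.seed(seed)
    y = random.random()   # random initial condition y_0
    x = x0
    z = 0.0
    for n in range(1, int(1.0 / eps) + 1):
        x = x + eps * (-x + 2.0 * (y - 0.5))   # slow update uses current y_n
        y = 4.0 * y * (1.0 - y)                # fast update y_{n+1} = T y_n
        z = max(z, abs(x - x0 * math.exp(-eps * n)))
    return z
```

In this toy example z_ε is observed to be of order ε^{1/2}, consistent with the rates discussed in the introduction.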
Regularity assumptions Given a function g : ℝ^d → ℝ^n, we define ‖g‖_Lip = max{|g|_∞, Lip g} where Lip g = sup_{x≠x′} |g(x) − g(x′)|/|x − x′| and |x − x′| = max_i |x_i − x′_i|.

In this section, and also in Appendices A and B, we consider functions g : ℝ^d × M × [0, ε_0) → ℝ^n where there is no metric structure assumed on M. In that case, ‖g‖_Lip = sup_{y∈M} sup_{ε∈[0,ε_0)} ‖g(·, y, ε)‖_Lip. If E ⊂ ℝ^d, then ‖g|_E‖_Lip is computed by restricting to x, x′ ∈ E (and y ∈ M, ε ∈ [0, ε_0)).

Throughout, we write D = ∂/∂x. If g : ℝ^d × M × [0, ε_0) → ℝ^n, then Dg : ℝ^d × M × [0, ε_0) → ℝ^{n×d} and ‖Dg‖_Lip is defined accordingly. Similarly for ‖Dg|_E‖_Lip when E ⊂ ℝ^d.
Below, L_1, L_2, L_3 ≥ 1 are constants. For first order averaging we require that a is globally Lipschitz in x:

  ‖a‖_Lip ≤ L_1.   (2.3)

Set E = {x ∈ ℝ^d : |x − x_0| ≤ L_1}. For second order averaging we require in addition that Da|_E is Lipschitz in x:

  ‖Da|_E‖_Lip ≤ L_2.   (2.4)

In both cases, we suppose that

  sup_{x∈E} sup_{y∈M} |a(x, y, ε) − a(x, y, 0)| ≤ L_3 ε.   (2.5)

In the sequel we let L = max{L_1, L_3} when performing first order averaging, and L = max{L_1, L_2, L_3} when performing second order averaging.
2.1 Order functions and a general averaging theorem
Define ā(x, ε) = ∫_M a(x, y, ε) dν_ε(y) and let v_{ε,x}(y) = a(x, y, ε) − ā(x, ε). We define the first order function δ_{1,ε} : M → ℝ,

  δ_{1,ε} = ε sup_{x∈E} sup_{1≤n≤1/ε} |v_{ε,x,n}|   where   v_{ε,x,n} = Σ_{j=0}^{n−1} v_{ε,x} ∘ T_ε^j.

Next, we define the second order function δ_ε : M → ℝ,

  δ_ε = δ_{1,ε} + δ_{2,ε},   δ_{2,ε} = ε sup_{x∈E} sup_{1≤n≤1/ε} |V_{ε,x,n}|,   where   V_{ε,x,n} = Σ_{j=0}^{n−1} (Dv_{ε,x}) ∘ T_ε^j.

Theorem 2.2 Let S_ε = sup_{x∈E} |∫_M a(x, y, 0) (dν_ε − dν_0)(y)| + ε.

(a) Assume conditions (2.3) and (2.5). Then z_ε ≤ 6e^{2L}(√δ_{1,ε} + S_ε).

(b) Assume conditions (2.3)–(2.5). If δ_ε ≤ 1/2, then z_ε ≤ 6e^{2L}(δ_ε + S_ε).
The proof of Theorem 2.2 is postponed to Appendices A and B.
Remark 2.3 For averaging without rates, it suffices instead of condition (2.5) that lim_{ε→0} |a(x, y, ε) − a(x, y, 0)| = 0 for all x ∈ E, y ∈ M.
Remark 2.4 As shown in Section 7, almost sure convergence in the averaging theorem is not likely to hold for fast-slow systems of type (2.1). Hence we consider convergence in L^q with respect to certain absolutely continuous probability measures on M. Since z_ε ≤ 2L and δ_ε ≤ 4L, convergence in L^p is equivalent to convergence in L^q for all p, q ∈ (0, ∞). For brevity, we restrict statements to convergence in L^1 except when speaking of rates.
Throughout the remainder of this paper, it is assumed that conditions (2.3)–(2.5) are satisfied, and we apply Theorem 2.2(b). By Theorem 2.2(a), the results without rates go through unchanged when condition (2.4) fails, and results with rates usually hold with weaker rates of convergence, but for brevity these rates are not stated explicitly.

According to Theorem 2.2(b), results on averaging reduce to estimating the scalar quantity S_ε and the random variable δ_ε = δ_ε(y_0). These quantities are discussed below in Subsections 2.2 and 2.3 respectively.
2.2 Statistical stability
In this subsection, we suppose that M is a topological space and that the σ-algebra of measurable sets is the σ-algebra of Borel sets. Recall that the family of measures ν_ε is statistically stable at ε = 0 if ν_0 is the weak limit of ν_ε as ε → 0 (ν_ε →_w ν_0). This means that ∫_M φ dν_ε → ∫_M φ dν_0 for all continuous bounded functions φ : M → ℝ.

In the noninvertible setting, often a stronger property known as strong statistical stability holds. Let m be a reference measure on M and suppose that ν_ε is absolutely continuous with respect to m for all ε ≥ 0. Then ν_0 is strongly statistically stable if the densities ρ_ε = dν_ε/dm satisfy lim_{ε→0} R_ε = 0 where R_ε = ∫_M |ρ_ε − ρ_0| dm. We note that S_ε ≤ L R_ε + ε.
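The last inequality can be checked directly: writing dν_ε − dν_0 = (ρ_ε − ρ_0) dm and using |a|_∞ ≤ ‖a‖_Lip ≤ L,

```latex
\Bigl| \int_M a(x,y,0)\,(d\nu_\varepsilon - d\nu_0)(y) \Bigr|
\;\le\; |a|_\infty \int_M |\rho_\varepsilon - \rho_0|\, dm
\;\le\; L\,R_\varepsilon
\qquad\text{for every } x \in E,
```

so taking the supremum over x ∈ E and adding ε gives S_ε ≤ L R_ε + ε.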
Proposition 2.5 If ν_ε →_w ν_0, then lim_{ε→0} S_ε = 0.

Proof Let A_ε(x) = ∫_M a(x, y, 0) dν_ε(y) − ∫_M a(x, y, 0) dν_0(y). Let δ > 0. Since ν_ε →_w ν_0, we have that A_ε(x) → 0 for each x, so there exists ε_x > 0 such that |A_ε(x)| < δ for all ε ∈ (0, ε_x). Moreover, |A_ε(z)| < 2δ for all ε ∈ (0, ε_x) and z ∈ B_{δ/(2L)}(x). Since E is covered by finitely many such balls B_{δ/(2L)}(x), there exists ε̄ > 0 such that sup_{x∈E} |A_ε(x)| < 2δ for all ε ∈ (0, ε̄). Hence ∫_M a(x, y, 0) (dν_ε − dν_0)(y) converges to zero uniformly in x.
Hence, for proving averaging theorems, statistical stability takes care of the term S_ε in Theorem 2.2. In specific examples, we are able to appeal to results on statistical stability with rates, yielding effective estimates.

Proposition 2.6 Let q ≥ 1. There is a constant C > 0 such that

  |z_ε|_{L^q(ν_ε)} ≤ C(|δ_ε|_{L^q(ν_ε)} + S_ε)   for all ε ∈ [0, ε_0).

Assuming strong statistical stability, there is a constant C > 0 such that

  |z_ε|_{L^q(ν_0)} ≤ C(|δ_ε|_{L^q(ν_ε)} + R_ε^{1/q} + ε)   for all ε ∈ [0, ε_0).
Proof Let A = {y ∈ M : δ_ε(y) ≤ 1/2}. Then Theorem 2.2(b) applies on A and

  ∫_M z_ε^q dν_ε = ∫_{M\A} z_ε^q dν_ε + ∫_A z_ε^q dν_ε ≤ (2L)^q ν_ε(δ_ε > 1/2) + (6e^{2L})^q ∫_M (δ_ε + S_ε)^q dν_ε
    ≤ (4L)^q ∫_M δ_ε^q dν_ε + (6e^{2L})^q ∫_M (δ_ε + S_ε)^q dν_ε ≤ (12e^{2L})^q ∫_M (δ_ε + S_ε)^q dν_ε.

Hence

  |z_ε|_{L^q(ν_ε)} ≤ 12e^{2L} |δ_ε + S_ε|_{L^q(ν_ε)} ≤ 12e^{2L} |δ_ε|_{L^q(ν_ε)} + 12e^{2L} S_ε,

yielding the first estimate.

Next,

  ∫_M z_ε^q dν_0 = ∫_M z_ε^q dν_ε + ∫_M z_ε^q (dν_0 − dν_ε) ≤ (12e^{2L})^q ∫_M (δ_ε + S_ε)^q dν_ε + (2L)^q R_ε
    ≤ (12e^{2L})^q ∫_M (δ_ε + L R_ε + ε)^q dν_ε + (2L)^q R_ε
    ≤ (12Le^{2L})^q ∫_M (δ_ε + R_ε^{1/q} + ε)^q dν_ε + (2L)^q R_ε
    ≤ (24Le^{2L})^q ∫_M (δ_ε + R_ε^{1/q} + ε)^q dν_ε.

Hence

  |z_ε|_{L^q(ν_0)} ≤ 24Le^{2L} |δ_ε + R_ε^{1/q} + ε|_{L^q(ν_ε)} ≤ 24Le^{2L} (|δ_ε|_{L^q(ν_ε)} + R_ε^{1/q} + ε),

yielding the second estimate.
Corollary 2.7 (a) Assume statistical stability and that lim_{ε→0} ∫_M δ_ε dν_ε = 0. Then lim_{ε→0} ∫_M z_ε dν_ε = 0.

(b) Assume in addition strong statistical stability and that μ is a probability measure on M with μ ≪ ν_0. Then lim_{ε→0} ∫_M z_ε dμ = 0.
Proof Part (a), and part (b) in the special case μ = ν_0, are immediate from Proposition 2.6. To prove the general case of part (b), suppose for contradiction that ∫_M z_{ε_k} dμ → b > 0 along some subsequence ε_k → 0. Since ∫_M z_{ε_k} dν_0 → 0, by passing without loss to a further subsequence, we can suppose also that z_{ε_k} → 0 on a set of full measure with respect to ν_0 and hence with respect to μ. By the bounded convergence theorem, ∫_M z_{ε_k} dμ → 0, which is the desired contradiction.
The next result is useful in situations where ν_0 is absolutely continuous but its support is not the whole of M.

Corollary 2.8 Assume strong statistical stability and that lim_{ε→0} ∫_M δ_ε dν_0 = 0. Suppose further that each T_ε is nonsingular with respect to m and that for almost every y ∈ M, there exists N ≥ 1 such that T_ε^N y ∈ supp ν_0 for all ε ∈ [0, ε_0). Then lim_{ε→0} ∫_M z_ε dμ = 0 for every probability measure μ on M with μ ≪ m.
Proof First, we note that for all N ≥ 1, ε ≥ 0,

  |δ_ε ∘ T_ε^N − δ_ε|_∞ ≤ 8LNε.   (2.6)
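One way to verify (2.6), under the definitions above (the second order function is handled identically with Dv_{ε,x} in place of v_{ε,x}): for y ∈ M, x ∈ E and 1 ≤ n ≤ 1/ε, the ergodic sums along the orbits of y and of T_ε^N y differ only by boundary terms,

```latex
\Bigl| \sum_{j=0}^{n-1} v_{\varepsilon,x}(T_\varepsilon^{j+N} y) - \sum_{j=0}^{n-1} v_{\varepsilon,x}(T_\varepsilon^{j} y) \Bigr|
\;\le\; 2N\,|v_{\varepsilon,x}|_\infty \;\le\; 4LN,
```

so multiplying by ε and taking suprema over x and n gives |δ_{1,ε} ∘ T_ε^N − δ_{1,ε}|_∞ ≤ 4LNε; together with the analogous bound for δ_{2,ε} this yields (2.6).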
By the arguments in the proof of Corollary 2.7, it suffices to prove that ∫_M δ_ε dm → 0. Suppose this is not the case. By Corollary 2.7(b), ∫_{supp ν_0} δ_ε dm → 0. Hence there exist a subsequence ε_k → 0 and a subset A ⊂ supp ν_0 with m(supp ν_0 \ A) = 0 such that (i) δ_{ε_k} → 0 pointwise on A and (ii) ∫_M δ_{ε_k} dm → b > 0.

Since each T_ε is nonsingular, there exists M′ ⊂ M with m(M′) = 1 such that M′ ∩ T_{ε_k}^{−n}(supp ν_0 \ A) = ∅ for all k, n ≥ 1. By the hypothesis of the result, there is a subset M″ ⊂ M′ with m(M″) = 1 such that for any y ∈ M″ there exists N ≥ 1 such that T_{ε_k}^N y ∈ A for all k ≥ 1. Hence it follows from (i) and (2.6) that δ_{ε_k} → 0 pointwise on M″. By the bounded convergence theorem, ∫_M δ_{ε_k} dm → 0. Together with (ii), this yields the desired contradiction.
Remark 2.9 The hypotheses of Corollary 2.8 are particularly straightforward to apply when supp ν_0 has nonempty interior. This property is automatic for large classes of nonuniformly expanding maps; see [3, Lemma 5.6] (taking G = supp ν_0 ∩ H(σ), it follows that Ḡ contains a disk).
2.3 Estimating the order function

By Proposition 2.6 and Corollary 2.7, it remains to deal with the order function δ_ε. This is a random variable depending on the initial condition y_0. Here we give a useful estimate.

Since δ_ε = δ_{1,ε} + δ_{2,ε} and the definition of δ_{2,ε} is identical to that of δ_{1,ε} with a replaced by Da, it suffices to consider δ_{1,ε}. From now on, Γ denotes a constant that depends only on d, p, L and whose value may change from line to line.
Lemma 2.10 Let μ be a probability measure on M. Then for all p ≥ 0, ε ∈ [0, ε_0),

  ∫_M δ_{1,ε}^{p+d+1} dμ ≤ Γ ε^p sup_{x∈E} sup_{1≤n≤1/ε} ∫_M |v_{ε,x,n}|^p dμ.
Proof For most of the proof, we work pointwise on M, suppressing the initial condition y_0 ∈ M. There exist x̃ ∈ E, ñ ∈ [0, 1/ε] such that δ_{1,ε} = ε|v_{ε,x̃,ñ}|. Observe that

  |v_{ε,x,n} − v_{ε,x̃,ñ}| ≤ |v_{ε,x,n} − v_{ε,x̃,n}| + |v_{ε,x̃,n} − v_{ε,x̃,ñ}| ≤ 2Lε^{−1}|x − x̃| + 2L|n − ñ|

for every x ∈ E and n ≤ 1/ε. Define

  A = {x ∈ E : |x − x̃| ≤ δ_{1,ε}/(8L)},   B = {n ∈ [0, ε^{−1}] : |n − ñ| ≤ δ_{1,ε}/(8Lε)}.

Then for every x ∈ A and n ∈ B we have ε|v_{ε,x,n}| ≥ δ_{1,ε}/2. Moreover, since δ_{1,ε} ≤ 2L, we have Leb(A) ≥ Γ^{−1} δ_{1,ε}^d and #B ≥ ε^{−1} δ_{1,ε}/(8L). Hence

  Σ_{n=0}^{[1/ε]−1} ∫_E |v_{ε,x,n}|^p dx ≥ (#B) Leb(A) (δ_{1,ε}/(2ε))^p ≥ Γ^{−1} ε^{−p−1} δ_{1,ε}^{p+d+1}.

Finally,

  ∫_M δ_{1,ε}^{p+d+1} dμ ≤ Γ ε^{p+1} Σ_{n=0}^{[1/ε]−1} ∫_E ∫_M |v_{ε,x,n}|^p dμ dx ≤ Γ ε^p sup_{x∈E} sup_{1≤n≤1/ε} ∫_M |v_{ε,x,n}|^p dμ,

as required.
Remark 2.11 Often, estimating ∫_M |v_{ε,x,n}| dμ leads to an essentially identical estimate for ∫_M sup_{1≤n≤1/ε} |v_{ε,x,n}| dμ. In this case, slightly better convergence rates for δ_{1,ε} can be obtained using the estimate

  ∫_M δ_{1,ε}^{p+d} dμ ≤ Γ ε^p sup_{x∈E} ∫_M sup_{1≤n≤1/ε} |v_{ε,x,n}|^p dμ   (2.7)

for all p ≥ 0, ε ∈ [0, ε_0).
3 Examples: Uniformly expanding maps
Let T_ε : M → M be a family of maps defined on a metric space (M, d_M) with ergodic invariant Borel probability measures ν_ε. Let P_ε denote the corresponding transfer operators, so ∫_M P_ε v w dν_ε = ∫_M v w ∘ T_ε dν_ε for all v ∈ L^1(ν_ε), w ∈ L^∞(ν_ε).

From now on, we require Lipschitz regularity in the M variables in addition to the ℝ^d variables as was required in assumptions (2.3) and (2.4). So for g : ℝ^d × M × [0, ε_0) → ℝ^n we define ‖g‖_Lip = |g|_∞ + sup_{ε∈[0,ε_0)} Lip g(·, ·, ε) where Lip g(·, ·, ε) = sup_{x≠x′} sup_{y≠y′} |g(x, y, ε) − g(x′, y′, ε)|/(|x − x′| + d_M(y, y′)). We continue to assume conditions (2.3)–(2.5) with this modified definition of ‖ ‖_Lip.
Proposition 3.1 Suppose that there is a sequence of constants a_n → 0 such that |P_ε^n v − ∫_M v dν_ε|_∞ ≤ a_n ‖v‖_Lip for all Lipschitz v : M → ℝ and all n ≥ 1, ε ≥ 0. Then lim_{ε→0} ∫_M δ_ε dν_ε = 0.
Proof We prove the result for δ_{1,ε} and δ_{2,ε} separately. By Remark 2.4, we can work in L^q for any choice of q, and we take q = d + 3.

For every Lipschitz v, we have

  ∫_M (Σ_{j=0}^{n−1} v ∘ T_ε^j)^2 dν_ε = Σ_{j=0}^{n−1} ∫_M (v ∘ T_ε^j)^2 dν_ε + 2 Σ_{0≤i<j≤n−1} ∫_M v ∘ T_ε^i · v ∘ T_ε^j dν_ε
    = n ∫_M v^2 dν_ε + 2 Σ_{1≤k≤n−1} (n − k) ∫_M v · v ∘ T_ε^k dν_ε
    = n ∫_M v^2 dν_ε + 2 Σ_{1≤k≤n−1} (n − k) ∫_M P_ε^k v · v dν_ε.

Hence for v Lipschitz and mean zero,

  ∫_M (Σ_{j=0}^{n−1} v ∘ T_ε^j)^2 dν_ε ≤ b_n ‖v‖_Lip^2,

where b_n = n + 2n Σ_{1≤k≤n−1} a_k = o(n^2) by the assumption on a_n.

By condition (2.3), ‖v_{ε,x}‖_Lip ≤ 2L for all ε, x, so sup_{x∈E} sup_{1≤n≤1/ε} ∫_M |v_{ε,x,n}|^2 dν_ε ≤ 4L^2 b_{[1/ε]}. By Lemma 2.10, it follows that lim_{ε→0} ∫_M δ_{1,ε}^{d+3} dν_ε = 0. Similarly, by condition (2.4), ‖V_{ε,x}‖_Lip ≤ 2L, so sup_{x∈E} sup_{1≤n≤1/ε} ∫_M |V_{ε,x,n}|^2 dν_ε ≤ 4L^2 b_{[1/ε]} and hence lim_{ε→0} ∫_M δ_{2,ε}^{d+3} dν_ε = 0.
Remark 3.2 The proof uses only that n^{−1} Σ_{k=1}^{n} a_k → 0.
Proposition 3.1 is useful in situations where T_ε is a family of (piecewise) uniformly expanding maps. A general result of Keller & Liverani [22] guarantees uniform spectral properties of the transfer operators P_ε under mild conditions, and consequently lim_{ε→0} ∫_M δ_ε^q dν_ε = 0 for all q. We mention two situations where this idea can be applied. Again, for brevity we work in L^1 except when discussing convergence rates (see Remark 2.4).
Example 3.3 (Uniformly expanding maps) Suppose that M = 𝕋^k ≅ ℝ^k/ℤ^k is a torus with Haar measure m and distance d_M inherited from Euclidean distance on ℝ^k and normalised so that diam M = 1. We say that a C^2 map T : M → M is uniformly expanding if there exists λ > 1 such that |(DT)_y v| ≥ λ|v| for all y ∈ M, v ∈ ℝ^k. There is a unique absolutely continuous invariant probability measure, and the density is C^1 and nonvanishing.

If T_ε : M → M, ε ∈ [0, 1], is a continuous family of C^2 maps, each of which is uniformly expanding, with corresponding probability measures ν_ε, then it follows from [22] that we are in the situation of Proposition 3.1, and so lim_{ε→0} ∫_M δ_ε dν_ε = 0. Moreover, it is well known that ν_0 is uniformly equivalent to m and is strongly statistically stable. Hence by Corollary 2.7 we obtain the averaging result lim_{ε→0} ∫_M z_ε dν_ε = 0 and lim_{ε→0} ∫_M z_ε dμ = 0 for every absolutely continuous probability measure μ.

Suppose further that T_ε : M → M, ε ∈ [0, 1], is a C^k family of C^2 maps, for some k ∈ (0, 1]. By standard results (for instance [22] with Banach spaces C^0 and C^1), R_ε = ∫_M |ρ_ε − ρ_0| dm = O(ε^k). If k ∈ (0, 1/2), then by Remark 4.4 below we obtain the convergence rate O(ε^k) for z_ε in L^q(ν_ε) and L^q(m) for all q > 0. If k ≥ 1/2, then the convergence rate for z_ε is O(ε^{1/2−}).
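Strong statistical stability can also be observed numerically. The sketch below (an illustration added here, not part of the paper) estimates the densities ρ_ε by Ulam's method for the hypothetical family T_ε(y) = 2y + ε sin(2πy) mod 1 on the circle, which is uniformly expanding for ε < 1/(2π), and evaluates a grid approximation of R_ε = ∫ |ρ_ε − ρ_0| dm:

```python
import math

def ulam_density(eps, N=128, samples=40):
    """Ulam approximation of the invariant density of the (hypothetical)
    uniformly expanding family T_eps(y) = 2y + eps*sin(2*pi*y) mod 1,
    on a grid of N equal cells, via left power iteration of the Ulam matrix."""
    P = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for s in range(samples):
            y = (i + (s + 0.5) / samples) / N
            Ty = (2.0 * y + eps * math.sin(2.0 * math.pi * y)) % 1.0
            P[i][int(Ty * N) % N] += 1.0 / samples
    rho = [1.0] * N                      # start from the Lebesgue density
    for _ in range(50):                  # rho <- rho P (mass is preserved)
        new = [0.0] * N
        for i in range(N):
            ri, row = rho[i], P[i]
            for j in range(N):
                new[j] += ri * row[j]
        rho = new
    return rho

rho0 = ulam_density(0.0)
rho1 = ulam_density(0.05)
# Grid approximation of R_eps = int |rho_eps - rho_0| dm
R = sum(abs(a - b) for a, b in zip(rho1, rho0)) / len(rho0)
```

For ε = 0 the map is y ↦ 2y mod 1 with ρ_0 ≡ 1, and the computed R is small for small ε, consistent with R_ε → 0.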
Example 3.4 (Piecewise uniformly expanding maps) Let M = [−1, 1] with Lebesgue measure m. We consider continuous maps T : M → M with T(−1) = T(1) = −1 such that T is C^2 on [−1, 0] and [0, 1]. We require that there exists λ > 1 such that T′ ≥ λ on [−1, 0) and T′ ≤ −λ on (0, 1]. There exists a unique absolutely continuous invariant probability measure with density of bounded variation.

The results are analogous to those in Example 3.3. Let ε ↦ T_ε be a continuous family of such maps on [−1, 1] with associated measures ν_ε. We assume that T_0 is topologically mixing on the interval [T_0^2(0), T_0(0)] and that 0 is not periodic (which guarantees that T_ε is mixing for ε small).

Then ν_0 is strongly statistically stable, so by Corollaries 2.7 and 2.8 we obtain averaging in L^1(ν_ε) and also in L^1(μ) for every absolutely continuous probability measure μ.

Now suppose that ε ↦ T_ε is a C^1 family of such maps on [−1, 1] with densities ρ_ε = dν_ε/dm. Keller [21] showed that ε ↦ ρ_ε is C^{1−} as a map into L^1 densities. Hence we obtain the convergence rate O(ε^{1/2−}) for z_ε in L^q(ν_ε), for all q > 0, and in L^1(ν_0). More precisely, [21] shows that ∫_M |ρ_ε − ρ_0| dm = O(ε log ε^{−1}). By [8], this estimate is optimal, so this is a situation where linear response fails, in contrast to Example 3.3.
4 Families of nonuniformly expanding maps
In this section, we consider the situation where the fast dynamics is generated by nonuniformly expanding maps T_ε, such that the nonuniform expansion is uniform in the parameter ε.

In Subsection 4.1, we recall the notion of a nonuniformly expanding map. In Subsection 4.2, we describe the uniformity criteria on the family T_ε and state our main result on averaging, Theorem 4.3, for such families. In Subsection 4.3 we establish some basic estimates for nonuniformly expanding maps. In Subsection 4.4 we prove Theorem 4.3.
4.1 Nonuniformly expanding maps
Let (M, d_M) be a locally compact separable bounded metric space with finite Borel measure m and let T : M → M be a nonsingular transformation for which m is ergodic. Let Y ⊂ M be a subset of positive measure, and normalise m so that m(Y) = 1. Let α be an at most countable measurable partition of Y with m(a) > 0 for all a ∈ α. We suppose that there is an L^1 return time function τ : Y → ℤ^+, constant on each a with value τ(a) ≥ 1, and constants λ > 1, η ∈ (0, 1], C_0, C_1 ≥ 1 such that for each a ∈ α:

(1) F = T^τ restricts to a (measure-theoretic) bijection from a onto Y.

(2) d_M(Fx, Fy) ≥ λ d_M(x, y) for all x, y ∈ a.

(3) d_M(T^ℓ x, T^ℓ y) ≤ C_0 d_M(Fx, Fy) for all x, y ∈ a, 0 ≤ ℓ < τ(a).

(4) ζ = d(m|_Y)/d(m|_Y ∘ F) satisfies |log ζ(x) − log ζ(y)| ≤ C_1 d_M(Fx, Fy)^η for all x, y ∈ a.

Such a dynamical system T : M → M is called nonuniformly expanding. We refer to F = T^τ : Y → Y as the induced map. (It is not required that τ is the first return time to Y.) It follows from standard results (recalled later) that there is a unique absolutely continuous ergodic T-invariant probability measure ν on M.
Remark 4.1 The uniformly expanding maps in Example 3.3 are clearly nonuniformly expanding: take Y = M, η = 1, τ ≡ 1. Then conditions (1) and (2) are immediate, condition (3) is vacuously satisfied, and condition (4) holds with C_1 = sup_{x,y∈M, x≠y} |(DT)_x − (DT)_y|/d_M(x, y).
4.2 Uniformity assumptions
Now suppose that T_ε : M → M, ε ∈ [0, ε_0), is a family of nonuniformly expanding maps as defined in Subsection 4.1, with corresponding absolutely continuous ergodic invariant probability measures ν_ε.
Definition 4.2 Let p > 1. We say that T_ε : M → M is a uniform family of nonuniformly expanding maps (of order p) if:

(i) The constants C_0, C_1 ≥ 1, λ > 1, η ∈ (0, 1] can be chosen independent of ε ∈ [0, ε_0).

(ii) The return time functions τ_ε : Y → ℤ^+ lie in L^p for all ε ∈ [0, ε_0), and moreover sup_{ε∈[0,ε_0)} ∫_Y |τ_ε|^p dm < ∞.
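For intuition about condition (ii), consider the intermittent maps T(y) = y(1 + 2^γ y^γ) for y ∈ [0, 1/2) and T(y) = 2y − 1 for y ∈ [1/2, 1], with Y = [1/2, 1] and τ the first return time to Y; the tails satisfy m(τ > n) = O(n^{−1/γ}), so τ ∈ L^p for p < 1/γ. The sketch below is an illustration added here (with the hypothetical choice γ = 1/2) and simply computes return times on a grid of initial conditions, showing the characteristic heavy tail caused by the neutral fixed point at 0:

```python
def lsv(y, gamma):
    """One step of the intermittent map (neutral fixed point at 0)."""
    return y * (1.0 + (2.0 * y) ** gamma) if y < 0.5 else 2.0 * y - 1.0

def return_time(y, gamma, cap=10**6):
    """First return time to Y = [1/2, 1]: least n >= 1 with T^n y in Y."""
    n, y = 1, lsv(y, gamma)
    while y < 0.5 and n < cap:
        y = lsv(y, gamma)
        n += 1
    return n

gamma = 0.5   # tails m(tau > n) = O(n^(-1/gamma)) = O(n^-2): tau in L^p for p < 2
taus = [return_time(0.5 + 0.5 * (k + 0.5) / 2000, gamma) for k in range(2000)]
mean_tau = sum(taus) / len(taus)
```

Points y ≥ 3/4 return immediately (τ = 1), while points just above 1/2 fall close to the neutral fixed point and take many iterates to escape.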
We can now state our main result for this section. Recall the set-up in Section 2.

Theorem 4.3 If T_ε : M → M is a uniform family of nonuniformly expanding maps of order p, then there is a constant C > 0 such that for all ε ∈ [0, ε_0),

  ∫_M δ_ε^{p+d−1} dν_ε ≤  C ε^{(p−1)/2}  for p > 3,   C |log ε|^{p−1} ε^{p−2}  for p ∈ (2, 3],

and

  ∫_M δ_ε^{p+d} dν_ε ≤  C ε^{(p−1)/2}  for p ∈ (2, 3],   C |log ε|^{p−1} ε^{(p−1)²/p}  for p ∈ (1, 2].
Remark 4.4 In the case p > 3, it follows from Theorem 4.3 that

  |δ_ε|_{L^q(ν_ε)} = O(ε^{(p−1)/(2(p+d−1))})

for all q ≤ p + d − 1. Since δ_ε is uniformly bounded, |δ_ε|_{L^q(ν_ε)} = O(ε^{(p−1)/(2q)}) for all q > p + d − 1. Similar comments apply for p ∈ (1, 3].

In particular, if p can be taken arbitrarily large in Definition 4.2, then we obtain that |δ_ε|_{L^q(ν_ε)} = O(ε^{1/2−}) for all q > 0.

If in addition ν_0 is strongly statistically stable with R_ε = ∫_M |ρ_ε − ρ_0| dm = O(ε^{1/2−}), then by Proposition 2.6 we obtain |z_ε|_{L^q(ν_ε)} = O(ε^{1/2−}) for all q > 0 and |z_ε|_{L^1(ν_0)} = O(ε^{1/2−}).
Remark 4.5 Alves & Viana [5] prove strong statistical stability for a large class of noninvertible dynamical systems. These maps form uniform families of nonuniformly expanding maps in a sense that is very similar to our definition. In fact, it is almost the case that their definition includes our definition, so it is almost the case that verifying the assumptions of [5] is sufficient to obtain averaging and rates of averaging via Theorem 4.3.

To be more precise, let us momentarily ignore assumption (3) in Subsection 4.1. Then Definition 4.2(i) with η = 1 is immediate from [5, (U1)], and Definition 4.2(ii) is immediate from [5, (U2′)], which follows from their conditions (U1) and (U2).

Hence it remains to discuss assumption (3). This assumption is not explicitly mentioned in [5] since it is not required for the statement of their main results. However, in specific applications, the hypotheses in [5] are often verified via the method of hyperbolic times [1]. When the return time function τ is a hyperbolic time, then it is automatic that C_0 = 1 (see for example [2, Proposition 3.3(3)]).

Alves et al. [4] introduced a general method for constructing inducing schemes, where τ is not necessarily a hyperbolic time but is close enough that C_0 can still be chosen uniformly. Alves [2] combined the methods of [4] and [5] to prove statistical stability for large classes of examples. We show now that in the situation discussed by [2], assumption (3) holds with uniform C_0 and hence our main results hold. Certain quantities δ_1 > 0 and N_0 ≥ 1 are introduced in [2, Lemma 3.2] and [2, eq. (16)] respectively, and are explicitly uniform in ε. Moreover, τ = n + m where n is a hyperbolic time and m ≤ N_0 (see [2, Section 4.3]), so C_0 depends only on at most N_0 iterates of T_ε. The construction in [2] (see in particular the proof of [2, Lemma 4.2]) ensures that the derivative of T_ε is bounded along these iterates, so assumption (3) holds and C_0 is uniform in ε.

We mention also the extension of [4] due to Gouëzel [19] where C_0 = 1 (see [19, Theorem 3.1(4)]).

Finally, we note that when [4] is used to obtain polynomial decay of correlations with rate O(1/n^β), β > 0, the resulting uniform family is of order p = (β + 1)−. (Uniformity in ε in Definition 4.2(ii) follows from [2, Lemma 5.1].)
4.3 Explicit estimates for nonuniformly expanding maps
Throughout this subsection, we work with a fixed nonuniformly expanding map T : M → M. Some standard constructions and estimates are described. The main novelty is that we stress the dependence of various constants on the underlying constants C_0, C_1, λ and η. For convenience, we normalise the metric d_M so that diam M = 1.
Symbolic metric

Fix θ ∈ [λ^{−η}, 1) and define the symbolic metric d_θ(x, y) = θ^{s(x,y)}, where the separation time s(x, y) is the least integer n ≥ 0 such that F^n x and F^n y lie in distinct partition elements. It is assumed that the partition α separates orbits of F, so s(x, y) is finite for all x ≠ y, guaranteeing that d_θ is a metric.

Proposition 4.6 d_M(x, y)^η ≤ d_θ(x, y) for all x, y ∈ Y.

Proof Let n = s(x, y). By condition (2),

  1 ≥ diam Y ≥ d_M(F^n x, F^n y) ≥ λ^n d_M(x, y) ≥ (θ^{1/η})^{−n} d_M(x, y).

Hence d_M(x, y)^η ≤ θ^n = d_θ(x, y).
Invariant measures and transfer operators

For a positive observable ψ : Y → (0, ∞), define the log-Lipschitz seminorm

  |ψ|_{θℓ} = sup_{x,y∈Y, x≠y} |log ψ(x) − log ψ(y)|/d_θ(x, y).

We note that

  e^{−|ψ|_{θℓ}} ∫_Y ψ dm ≤ ψ ≤ e^{|ψ|_{θℓ}} ∫_Y ψ dm.   (4.1)
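To verify (4.1), note that d_θ ≤ 1 and m(Y) = 1, so for all x, y ∈ Y,

```latex
|\log\psi(x) - \log\psi(y)| \;\le\; |\psi|_{\theta\ell}\, d_\theta(x,y) \;\le\; |\psi|_{\theta\ell},
\qquad\text{so}\qquad
\psi(x) \le e^{|\psi|_{\theta\ell}}\, \psi(y).
```

Integrating the right-hand inequality over y gives the upper bound in (4.1); the lower bound is symmetric.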
Also, for ψ_k : Y → (0, ∞), k ≥ 1,

  |Σ_{k=1}^∞ ψ_k|_{θℓ} ≤ sup_{k≥1} |ψ_k|_{θℓ}.   (4.2)

Let P̃ : L^1(Y) → L^1(Y) denote the transfer operator corresponding to F and m, so ∫_Y φ ∘ F ψ dm = ∫_Y φ P̃ψ dm for all φ ∈ L^∞ and ψ ∈ L^1.
Proposition 4.7 Let ψ : Y → (0, ∞). Then

(a) |P̃ψ|_{θℓ} ≤ C_1 + θ|ψ|_{θℓ}, and

(b) e^{−(C_1+θ|ψ|_{θℓ})} ∫_Y ψ dm ≤ P̃ψ ≤ e^{C_1+θ|ψ|_{θℓ}} ∫_Y ψ dm.

Proof Let a ∈ α and write ψ_a = 1_a ψ. For y ∈ Y, we have (P̃ψ_a)(y) = ζ(y_a)ψ(y_a) where y_a is the unique preimage of y under F lying in a.

Let x, y ∈ Y with preimages x_a, y_a ∈ a. By condition (4) and Proposition 4.6, |log ζ(x_a) − log ζ(y_a)| ≤ C_1 d_θ(x, y). Hence

  |log(P̃ψ_a)(x) − log(P̃ψ_a)(y)| ≤ |log ζ(x_a) − log ζ(y_a)| + |log ψ(x_a) − log ψ(y_a)|
    ≤ C_1 d_θ(x, y) + |ψ|_{θℓ} d_θ(x_a, y_a) ≤ (C_1 + θ|ψ|_{θℓ}) d_θ(x, y),

and so |P̃ψ_a|_{θℓ} ≤ C_1 + θ|ψ|_{θℓ}. Part (a) follows from (4.2). Since ∫_Y P̃ψ dm = ∫_Y ψ dm, part (b) follows from (4.1) and part (a).
Proposition 4.8 There is a unique ergodic F-invariant probability measure μ on Y equivalent to m|_Y. Moreover, the density h = dμ/dm|_Y satisfies

  e^{−C_1(1−θ)^{−1}} ≤ h ≤ e^{C_1(1−θ)^{−1}}   and   |h|_{θℓ} ≤ C_1(1 − θ)^{−1}.

Proof There is at most one ergodic invariant probability measure equivalent to m|_Y. To prove existence of such a measure with the desired properties, we define the sequence of positive functions ĥ_0 = 1, ĥ_{n+1} = P̃ĥ_n. Then ∫_Y ĥ_n dm = 1 for all n ≥ 0. Inductively, it follows from Proposition 4.7(a) that |ĥ_n|_{θℓ} ≤ C_1(1 + θ + ⋯ + θ^{n−1}) ≤ C_1(1 − θ)^{−1} for all n ≥ 1. Also, by Proposition 4.7(b),

  ĥ_n = P̃ĥ_{n−1} ≤ e^{C_1+θ|ĥ_{n−1}|_{θℓ}} ≤ e^{C_1+C_1θ(1−θ)^{−1}} = e^{C_1(1−θ)^{−1}},

and similarly ĥ_n ≥ e^{−C_1(1−θ)^{−1}} for all n ≥ 1.

Define h_n = n^{−1} Σ_{j=0}^{n−1} ĥ_j. It is immediate that h_n = n^{−1} Σ_{j=0}^{n−1} P̃^j 1 and ∫_Y h_n dm = 1. By (4.2), |h_n|_{θℓ} = |Σ_{j=0}^{n−1} ĥ_j|_{θℓ} ≤ C_1(1 − θ)^{−1}, and e^{−C_1(1−θ)^{−1}} ≤ h_n ≤ e^{C_1(1−θ)^{−1}}. In particular, the sequence h_n is bounded and equicontinuous. By Arzelà-Ascoli, there exists a subsequential limit h. The limit inherits the properties

  ∫_Y h dm = 1,   e^{−C_1(1−θ)^{−1}} ≤ h ≤ e^{C_1(1−θ)^{−1}},   |h|_{θℓ} ≤ C_1(1 − θ)^{−1}.

Moreover, P̃h_n = h_n + n^{−1}(P̃^n 1 − 1) and so P̃h = h. Hence dμ = h dm defines an F-invariant probability measure.
Define g : Y → ℝ by setting g|_a = dμ|_a/d(μ ∘ F|_a) for a ∈ α. Let α_n = ⋁_{j=0}^{n−1} F^{−j}α denote the set of n-cylinders in Y, and write g_n = (g ∘ F^{n−1}) ⋯ (g ∘ F) · g. Thus for a ∈ α_n, g_n|_a = dμ|_a/d(μ ∘ F^n|_a) is the inverse of the Jacobian of F^n|_a.

Let P : L^1(Y) → L^1(Y) denote the (normalised) transfer operator corresponding to F and μ, so ∫_Y φ ∘ F ψ dμ = ∫_Y φ Pψ dμ for all φ ∈ L^∞ and ψ ∈ L^1. Then (P^n v)(y) = Σ_{a∈α_n} g_n(y_a) v(y_a) where y_a is the unique preimage of y under F^n lying in a.
Proposition 4.9 For all x, y ∈ a, a ∈ α_n, n ≥ 1,

  g_n(y) ≤ C_2 μ(a)   and   |g_n(x) − g_n(y)| ≤ C_2 μ(a) d_θ(F^n x, F^n y),   (4.3)

where C_2 = 2C_1(1 − θ)^{−2} e^{4C_1(1−θ)^{−2}}.
Proof First, suppose that n = 1 and let x, y ∈ a, a ∈ α. We have g = ζ h/(h ∘ F), so using Proposition 4.8,

  |log g(x) − log g(y)| ≤ |log ζ(x) − log ζ(y)| + |log h(x) − log h(y)| + |log h(Fx) − log h(Fy)|
    ≤ C_1 d_θ(Fx, Fy) + |h|_{θℓ} d_θ(x, y) + |h|_{θℓ} d_θ(Fx, Fy)
    ≤ C_1(1 + θ(1 − θ)^{−1} + (1 − θ)^{−1}) d_θ(Fx, Fy) = 2C_1(1 − θ)^{−1} d_θ(Fx, Fy).

Next, let x, y ∈ a, a ∈ α_n for general n ≥ 1. Then

  |log g_n(x) − log g_n(y)| ≤ Σ_{j=0}^{n−1} |log g(F^j x) − log g(F^j y)|
    ≤ 2C_1(1 − θ)^{−1} Σ_{j=1}^{n} d_θ(F^j x, F^j y) = 2C_1(1 − θ)^{−1} Σ_{j=1}^{n} θ^{n−j} d_θ(F^n x, F^n y)
    ≤ 2C_1(1 − θ)^{−2} d_θ(F^n x, F^n y).   (4.4)

In particular g_n(x)/g_n(y) ≤ e^{2C_1(1−θ)^{−2}}. Hence

  μ(a) = ∫_Y 1_a dμ = ∫_Y P^n 1_a dμ ≥ inf_{y∈a} g_n(y) ≥ e^{−2C_1(1−θ)^{−2}} sup_{y∈a} g_n(y),

and so

  g_n(y) ≤ e^{2C_1(1−θ)^{−2}} μ(a).   (4.5)

Next, we note the inequality t − 1 ≤ t log t, which is valid for all t ≥ 1. Suppose that g_n(y) ≤ g_n(x). Setting t = g_n(x)/g_n(y) ≥ 1 and using (4.4),

  g_n(x)/g_n(y) − 1 ≤ (g_n(x)/g_n(y)) log(g_n(x)/g_n(y)) ≤ e^{2C_1(1−θ)^{−2}} 2C_1(1 − θ)^{−2} d_θ(F^n x, F^n y).

Hence

  g_n(x) − g_n(y) ≤ g_n(y) e^{2C_1(1−θ)^{−2}} 2C_1(1 − θ)^{−2} d_θ(F^n x, F^n y),

and the result follows from (4.5).
There is a standard procedure to pass from μ to a T-invariant ergodic absolutely continuous probability measure ν on M, which we now briefly recall, since the construction is required in the proof of Lemma 4.17. Define the Young tower [36]

  Δ = {(y, ℓ) ∈ Y × ℤ : 0 ≤ ℓ < τ(y)},   (4.6)

with probability measure μ_Δ = (μ × counting)/∫_Y τ dμ. Define π_Δ : Δ → M, π_Δ(y, ℓ) = T^ℓ y. Then ν = (π_Δ)_* μ_Δ is the desired probability measure on M.
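The T-invariance of ν can be checked via the tower map, standard in this construction and spelled out here for convenience: define f_Δ : Δ → Δ by

```latex
f_\Delta(y,\ell) =
\begin{cases}
(y,\ell+1), & \ell < \tau(y)-1,\\
(Fy,\,0), & \ell = \tau(y)-1.
\end{cases}
```

Since μ is F-invariant, μ_Δ is f_Δ-invariant, and π_Δ ∘ f_Δ = T ∘ π_Δ because T(T^ℓ y) = T^{ℓ+1} y and T^{τ(y)} y = Fy. Hence ν = (π_Δ)_* μ_Δ is T-invariant.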
Induced observables
Given φ : Y → R^d, we define ‖φ‖θ = |φ|_∞ + |φ|_θ where |φ|_θ = sup_{x≠y} |φ(x) − φ(y)|/dθ(x, y). Then φ is dθ-Lipschitz if ‖φ‖θ < ∞. We say that φ : Y → R^d is locally dθ-Lipschitz if sup_{y∈a} |φ(y)| < ∞ and sup_{x,y∈a, x≠y} |φ(x) − φ(y)|/dθ(x, y) < ∞ for each a ∈ α.
Given an observable v : M → R^d, we define the induced observable V : Y → R^d,

  V(y) = Σ_{ℓ=0}^{τ(y)−1} v(T^ℓ y).

If v : M → R^d satisfies ∫_M v dν = 0, then ∫_Y V dµ = 0.
Proposition 4.10 If v : M → R^d is dM-Lipschitz, then V : Y → R^d is locally dθ-Lipschitz and PV : Y → R^d is dθ-Lipschitz. Moreover,

  |V(y)| ≤ τ(a)|v|_∞,   |V(x) − V(y)| ≤ C0 θ^{−1} τ(a) Lip v dθ(x, y),   for all x, y ∈ a, a ∈ α,

and

  |PV|_∞ ≤ C2 |τ|_{L¹(µ)} |v|_∞,   |PV|_θ ≤ C0C2θ^{−1} |τ|_{L¹(µ)} ‖v‖Lip.
Proof The estimate for V(y) is immediate. By condition (3) and Proposition 4.6,

  |V(x) − V(y)| ≤ Lip v Σ_{ℓ=0}^{τ(a)−1} dM(T^ℓ x, T^ℓ y) ≤ C0 Lip v Σ_{ℓ=0}^{τ(a)−1} dM(Fx, Fy)
      = C0 τ(a) Lip v dM(Fx, Fy) ≤ C0 τ(a) Lip v dθ(Fx, Fy)^{1/η}
      ≤ C0 τ(a) Lip v dθ(Fx, Fy) = C0 θ^{−1} τ(a) Lip v dθ(x, y),

completing the estimates for V.

Next, (PV)(y) = Σ_{a∈α} g(y_a)V(y_a), so by (4.3),

  |PV|_∞ ≤ C2 Σ_{a∈α} µ(a)τ(a)|v|_∞ = C2 |τ|_{L¹(µ)} |v|_∞.
Also,

  |(PV)(x) − (PV)(y)| ≤ Σ_{a∈α} |g(x_a) − g(y_a)||V(x_a)| + Σ_{a∈α} g(y_a)|V(x_a) − V(y_a)|
      ≤ C2 Σ_{a∈α} µ(a)dθ(x, y)τ(a)|v|_∞ + C2 Σ_{a∈α} µ(a)C0θ^{−1}τ(a) Lip v dθ(x, y),

yielding the estimate for |PV|_θ.
Explicit coupling argument
For the remainder of this subsection, Lp -norms are computed using the invariant
probability measure µ.
Note that

  e^{−|ψ|_{θℓ}} ∫_Y ψ dµ ≤ ψ ≤ e^{|ψ|_{θℓ}} ∫_Y ψ dµ.   (4.7)
Proposition 4.11 Let ψ : Y → (0, ∞). Then

(a) |Pψ|_{θℓ} ≤ C2 + θ|ψ|_{θℓ}, and

(b) e^{−(C2+θ|ψ|_{θℓ})} ∫_Y ψ dµ ≤ Pψ ≤ e^{C2+θ|ψ|_{θℓ}} ∫_Y ψ dµ.
Proof By the beginning of the proof of Proposition 4.9,

  | log g(x) − log g(y)| ≤ 2C1(1−θ)^{−1} dθ(Fx, Fy) ≤ C2 dθ(Fx, Fy)

for all x, y lying in a common partition element. The proof now proceeds exactly as in Proposition 4.7 with P̃, m, ζ and C1 replaced by P, µ, g and C2.
By (4.7), ψ − ξ ∫_Y ψ dµ is positive for all ξ ∈ [0, e^{−|ψ|_{θℓ}}).
Proposition 4.12 Let ψ : Y → (0, ∞). For each ξ ∈ [0, e^{−|ψ|_{θℓ}}),

  | ψ − ξ ∫_Y ψ dµ |_{θℓ} ≤ |ψ|_{θℓ} / (1 − ξe^{|ψ|_{θℓ}}).
Proof Let κ(y) = log ψ(y). Note that

  (d/dκ) log( e^κ − ξ ∫_Y ψ dµ ) = e^κ / ( e^κ − ξ ∫_Y ψ dµ ) = 1 / ( 1 − ξe^{−κ} ∫_Y ψ dµ ).
By (4.7),

  1 / ( 1 − ξe^{−κ(y)} ∫_Y ψ dµ ) = 1 / ( 1 − ξψ(y)^{−1} ∫_Y ψ dµ ) ≤ 1 / ( 1 − ξe^{|ψ|_{θℓ}} ),

for all y ∈ Y. Hence, by the mean value theorem, for x, y ∈ Y,
  | log( e^{κ(x)} − ξ ∫_Y ψ dµ ) − log( e^{κ(y)} − ξ ∫_Y ψ dµ ) | ≤ |κ(x) − κ(y)| / (1 − ξe^{|ψ|_{θℓ}}) ≤ |ψ|_{θℓ} dθ(x, y) / (1 − ξe^{|ψ|_{θℓ}}).
This completes the proof.
Choose R large enough so that C2 + θR < R. Define ξ = e^{−R}(1 − (C2 + θR)/R).

Lemma 4.13 Let φ : Y → R be an observable with ∫_Y φ dµ = 0 and |φ|_θ ≤ R. Then

  |P^n φ|_1 ≤ 2(1 + |φ|_1)(1 − ξ)^n.
Proof Write φ = ψ⁺_0 − ψ⁻_0, where ψ⁻_0 = 1 − min{0, φ} and ψ⁺_0 = 1 + max{0, φ}. Then ψ±_0 : Y → [1, ∞) and ∫_Y ψ⁺_0 dµ = ∫_Y ψ⁻_0 dµ ≤ 1 + |φ|_1.
Define

  ψ±_{n+1} = Pψ±_n − ξ ∫_Y ψ±_n dµ,   n ≥ 0.
Then ∫_Y ψ±_{n+1} dµ = (1 − ξ) ∫_Y ψ±_n dµ, so inductively we have that ∫_Y ψ±_n dµ = (1 − ξ)^n ∫_Y ψ±_0 dµ. Moreover, P^n φ = P^n ψ⁺_0 − P^n ψ⁻_0 = ψ⁺_n − ψ⁻_n.
We claim that the functions ψ±_n are nonnegative for all n. Then

  |P^n φ|_1 ≤ |ψ⁺_n|_1 + |ψ⁻_n|_1 = 2 ∫_Y ψ⁺_n dµ = 2(1 − ξ)^n ∫_Y ψ⁺_0 dµ ≤ 2(1 − ξ)^n (1 + |φ|_1),

as required.
It remains to verify the claim. Since ξ < e^{−R}, by Proposition 4.11(b) and the choice of R it suffices to show that |ψ±_n|_{θℓ} ≤ R for all n.
For n = 0, we suppose first that φ(x) ≥ φ(y) ≥ 0. Then

  log ψ⁺_0(x) − log ψ⁺_0(y) = log(1 + φ(x)) − log(1 + φ(y)) = log( 1 + (φ(x) − φ(y))/(1 + φ(y)) )
      ≤ (φ(x) − φ(y))/(1 + φ(y)) ≤ φ(x) − φ(y).

Also, if φ(x) ≥ 0 ≥ φ(y),

  log ψ⁺_0(x) − log ψ⁺_0(y) = log(1 + φ(x)) ≤ φ(x) ≤ φ(x) − φ(y).

It follows that |ψ⁺_0|_{θℓ} ≤ |φ|_θ ≤ R. Noting that ψ⁻_0 = 1 + max{−φ, 0}, we obtain that |ψ±_0|_{θℓ} ≤ R.
Suppose inductively that |ψ±_n|_{θℓ} ≤ R for some n. By Proposition 4.11(a), |Pψ±_n|_{θℓ} ≤ C2 + θR < R, and so ξ < e^{−R} < e^{−|Pψ±_n|_{θℓ}}. By Proposition 4.12 and the definition of ξ,

  |ψ±_{n+1}|_{θℓ} = | Pψ±_n − ξ ∫_Y Pψ±_n dµ |_{θℓ} ≤ (C2 + θR)/(1 − ξe^R) = R,

completing the induction argument.
Corollary 4.14 Let p > 0, R > 0. There exists a constant γ ∈ (0, 1) depending only on p, θ, C2 and R such that

  |P^n φ|_p ≤ 2dγ^n (1 + |φ|_1)^{1/p} |φ|_∞^{1−1/p}

for all mean zero dθ-Lipschitz φ : Y → R^d with |φ|_θ ≤ R and all n ≥ 1.
Proof The result holds for p ≤ 1 and d = 1 by Lemma 4.13. Working componentwise, the extension to general d is immediate.
Since µ is F-invariant, it follows from the duality definition of P that |Pφ|_∞ ≤ |φ|_∞ for all φ ∈ L^∞. Let γ̃ = 1 − ξ be the constant for p = 1 and suppose that p > 1. Then

  ∫_Y |P^n φ|^p dµ ≤ |P^n φ|_∞^{p−1} ∫_Y |P^n φ| dµ ≤ |φ|_∞^{p−1} ∫_Y |P^n φ| dµ ≤ 2dγ̃^n (1 + |φ|_1)|φ|_∞^{p−1}.

The result follows with γ = γ̃^{1/p}.
Explicit moment estimate
Proposition 4.15 Let p ≥ 1, R ≥ 1. Suppose that |τ|_p ‖v‖Lip ≤ R. Then there exist m, χ ∈ L^p(Y, R^d) such that V = m + χ ◦ F − χ, and m ∈ ker P. Moreover,

  |m|_p ≤ 3C3   and   |χ|_p ≤ C3,

where C3 ≥ R depends only on d, θ, C0, C2 and R.
Proof Let R̂ = C0C2θ^{−1}R, and choose γ̂ corresponding to R̂ as in Corollary 4.14. By Proposition 4.10, |PV|_θ ≤ R̂. Hence by Proposition 4.10 and Corollary 4.14 with φ = PV, for n ≥ 1,

  |P^n V|_p ≤ 2dγ̂^{n−1} (1 + |PV|_1)^{1/p} |PV|_∞^{1−1/p} ≤ 2dγ̂^{n−1} (1 + C2R)^{1/p} (C2R)^{1−1/p} ≤ 4dC2Rγ̂^{n−1}.

It follows that χ = Σ_{k=1}^{∞} P^k V lies in L^p and |χ|_p ≤ C3 where C3 = 4dC2R(1 − γ̂)^{−1}. Write V = m + χ ◦ F − χ; then m ∈ L^p and Pm = 0. Finally, |m|_p ≤ |V|_p + 2|χ|_p ≤ R + 2|χ|_p ≤ 3C3.
Corollary 4.16 Define V_n = Σ_{j=0}^{n−1} V ◦ F^j. Let p > 1, R ≥ 1. Suppose that |τ|_p ‖v‖Lip ≤ R. Then there exists a constant C4 ≥ R depending only on p and C3 such that

  | max_{1≤j≤n} |V_j| |_p ≤ C4 n^{1/2} for p > 2,   and   | max_{1≤j≤n} |V_j| |_p ≤ C4 (log₂ n) n^{1/p} for p ∈ (1, 2].
Proof First note that V_n = m_n + χ ◦ F^n − χ where m_n = Σ_{j=0}^{n−1} m ◦ F^j. Hence |V_n|_p ≤ |m_n|_p + 2|χ|_p. Since m ∈ ker P, an application of Burkholder's inequality [13] shows that

  |m_n|_p ≤ C(p)|m|_p n^{max{1/p,1/2}}

(see for example the proof of [29, Proposition 4.3]). Hence

  |V_n|_p ≤ C(p)(|m|_p + 2|χ|_p) n^{max{1/p,1/2}}.

By Billingsley [12, Problem 10.3, p. 112] (see also [33, Equations (2.2) and (2.3)] with u_i = 1, ν = p, α = max{1, p/2}), and using stationarity,

  | max_{1≤j≤n} |V_j| |_p ≤ C(p)(|m|_p + 2|χ|_p)(log₂ n) n^{max{1/p,1/2}}.

For p ∈ (1, 2], the result follows from Proposition 4.15 with C4 = 15C3C(p). By [30, Theorem 1], the logarithmic factor can be removed for p > 2 provided C4 is increased by a factor depending only on p.
Lemma 4.17 Let p > 1, R ≥ 1. Suppose that |τ|_p ‖v‖Lip ≤ R. Let v_n = Σ_{j=0}^{n−1} v ◦ T^j. Then

  |v_n|_{p−1} ≤ 3C4 |τ|_p^{1/(p−1)} n^{1/2} for p > 2,   and   |v_n|_{p−1} ≤ 3C4 |τ|_p^{1/(p−1)} (log₂ n) n^{1/p} for p ∈ (1, 2].

Moreover, there exists a constant C5 ≥ R depending only on p and C4 such that

  | max_{j≤n} |v_j| |_{p−1} ≤ 3C5 |τ|_p^{1/(p−1)} n^{1/2} for p > 3,   and   | max_{j≤n} |v_j| |_{p−1} ≤ 3C5 |τ|_p^{1/(p−1)} (log₂ n) n^{1/(p−1)} for p ∈ (2, 3].
Proof Let q = p − 1. Define the tower ∆ as in (4.6) with tower map f : ∆ → ∆ where

  f(y, ℓ) = (y, ℓ + 1) for ℓ ≤ τ(y) − 2,   f(y, ℓ) = (Fy, 0) for ℓ = τ(y) − 1.
Recall that µ∆ = (µ × counting)/τ̄ on ∆ where τ̄ = ∫_Y τ dµ. Also, ν = (π∆)∗ µ∆ where π∆ : ∆ → M is the projection π∆(y, ℓ) = T^ℓ y.

Let v̂ = v ◦ π∆ and define v̂_n = Σ_{j=0}^{n−1} v̂ ◦ f^j. Then ∫_M |v_n|^q dν = ∫_∆ |v̂_n|^q dµ∆. Next, let N_n : ∆ → {0, 1, . . . , n} be the number of laps by time n,

  N_n(y, ℓ) = #{j ∈ {0, . . . , n} : f^j(y, ℓ) ∈ Y}.
Then

  v̂_n(y, ℓ) = V_{N_n(y,ℓ)}(y) + H ◦ f^n(y, ℓ) − v̂_ℓ(y, 0),

where H(y, ℓ) = Σ_{ℓ′=1}^{ℓ} v̂(y, ℓ′) = v̂_{ℓ+1}(y, 0) − v̂(y, 0). Note that |v̂_ℓ(y, 0)| ≤ |v|_∞ τ(y) and |H(y, ℓ)| ≤ |v|_∞ τ(y) for all (y, ℓ) ∈ ∆.
Now

  ∫_∆ |H ◦ f^n|^q dµ∆ = ∫_∆ |H|^q dµ∆ = (1/τ̄) ∫_Y Σ_{ℓ=0}^{τ(y)−1} |H(y, ℓ)|^q dµ
      ≤ |v|_∞^q (1/τ̄) ∫_Y Σ_{ℓ=0}^{τ(y)−1} τ^q dµ ≤ |v|_∞^q ∫_Y τ^p dµ = |τ|_p^p |v|_∞^q = (|τ|_p |v|_∞)^q |τ|_p ≤ R^q |τ|_p,

where we have used the fact that τ̄ ≥ 1. Similarly, ∫_∆ |v̂_ℓ(y, 0)|^q dµ∆ ≤ R^q |τ|_p.
Next, using Hölder's inequality,

  ∫_∆ |V_{N_n(y,ℓ)}(y)|^q dµ∆ ≤ ∫_∆ max_{j≤n} |V_j(y)|^q dµ∆ = (1/τ̄) ∫_Y τ max_{j≤n} |V_j|^q dµ
      ≤ |τ|_p | max_{j≤n} |V_j|^q |_{p/q} = |τ|_p | max_{j≤n} |V_j| |_p^q.
By the triangle inequality and Corollary 4.16,

  |v_n|_{p−1} = |v̂_n|_{L^q(∆)} ≤ |τ|_p^{1/q} ( 2R + | max_{j≤n} |V_j| |_p )
      ≤ 3C4 |τ|_p^{1/q} n^{1/2} for p > 2,   and   ≤ 3C4 |τ|_p^{1/q} (log₂ n) n^{1/p} for p ∈ (1, 2].
Suppose that p > 3. By [30, Theorem 1], after increasing C4 by a factor that depends on p, we have that | max_{j≤n} |v_j| |_{p−1} ≤ 3C4 |τ|_p^{1/q} n^{1/2}.

If p ∈ (1, 3], then we have |v_n|_{p−1} ≤ 3C4 |τ|_p^{1/q} n^{1/(p−1)}. For p ∈ (2, 3], it follows from [30, Theorem 3] that | max_{j≤n} |v_j| |_{p−1} ≤ 3C4 |τ|_p^{1/q} (log₂ n) n^{1/(p−1)}.
4.4 Proof of Theorem 4.3

Define v_{ε,x} and v_{ε,x,n} as in Section 2. Note that Lip v_{ε,x} ≤ 2L for all ε, x and ∫_M v_{ε,x} dν_ε = 0. Take R = 2L sup_{ε≥0} |τ_ε|_p in Lemma 4.17.
First suppose that p > 3. It follows from Lemma 4.17 that

  | max_{j≤n} |v_{ε,x,j}| |_{p−1} ≤ 6C5 |τ_ε|_p^{1/(p−1)} n^{1/2},

for all ε ≥ 0, x ∈ R^d, n ≥ 1. By (2.7),

  ∫_M δ_{1,ε}^{p−1} dν_ε ≤ Γ^{p+d−1} ε^{p−1} sup_{x∈E} ∫_M max_{n≤1/ε} |v_{ε,x,n}|^{p−1} dν_ε
      ≤ Γ^{p+d−1} C5^{p−1} |τ_ε|_p ε^{p−1} ε^{(1−p)/2} = Γ^{p+d−1} C5^{p−1} |τ_ε|_p ε^{(p−1)/2}.

Similarly, replacing v_{ε,x}, v_{ε,x,n} and δ_{1,ε} by V_{ε,x}, V_{ε,x,n} and δ_{2,ε}, we obtain that ∫_M δ_{2,ε}^{p−1} dν_ε ≤ Γ^{p+d−1} C5^{p−1} |τ_ε|_p ε^{(p−1)/2}. Since sup_{ε∈[0,ε₀)} |τ_ε|_p < ∞, this completes the proof.

From now on we restrict attention entirely to δ_{1,ε}, since the same estimate for δ_{2,ε} is then immediate as above.
Suppose that p ∈ (2, 3]. We can proceed as for p > 3, using the estimate for max_{j≤n} |v_j| given by Lemma 4.17 combined with (2.7); this gives ∫_M δ_{1,ε}^{p−1} dν_ε ≤ Γ^{p+d−1} C5^{p−1} |τ_ε|_p |log ε|^{p−1} ε^{p−2}. Alternatively, we can use the estimate for v_n in Lemma 4.17 combined with Lemma 2.10 to obtain ∫_M δ_{1,ε}^{p−1} dν_ε ≤ Γ^{p+d} C4^{p−1} |τ_ε|_p ε^{(p−1)/2}.

Finally, if p ∈ (1, 2], we use the estimate for v_n in Lemma 4.17 combined with Lemma 2.10 to obtain ∫_M δ_{1,ε}^{p−1} dν_ε ≤ Γ^{p+d} C4^{p−1} |τ_ε|_p |log ε|^{p−1} ε^{(p−1)²/p}.
5 Examples: Nonuniformly expanding maps
Example 5.1 (Intermittent maps) Let M = [0, 1] with Lebesgue measure m and consider the intermittent maps T : M → M given by

  T x = x(1 + 2^a x^a) for x ∈ [0, 1/2],   T x = 2x − 1 for x ∈ (1/2, 1].   (5.1)

These were studied in [27]. Here a > 0 is a parameter. For a ∈ (0, 1) there is a unique absolutely continuous invariant probability measure with C^∞ nonvanishing density.
We consider a family T_ε : M → M, ε ∈ [0, ε₀), of such intermittent maps with parameter a_ε ∈ (0, 1) depending continuously on ε. Let ν_ε denote the corresponding family of absolutely continuous invariant probability measures.

For each ε, we take Y = [1/2, 1] and let τ_ε : Y → Z⁺ be the first return time τ_ε(y) = inf{n ≥ 1 : T_ε^n y ∈ Y}. Define the first return map F_ε = T_ε^{τ_ε} : Y → Y. Let α_ε = {Y_ε(n), n ≥ 1} where Y_ε(n) = {y ∈ Y : τ_ε(y) = n}.
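As an illustration of this inducing scheme (not needed for the proofs; the parameter value a = 0.5 and the sample points below are arbitrary choices), one can iterate the map (5.1) and compute first return times to Y = [1/2, 1] numerically:

```python
# Intermittent map (5.1) with parameter a, and the first return time tau
# to Y = [1/2, 1].  Note that 2^a x^a = (2x)^a.

def T(x, a=0.5):
    return x * (1 + (2 * x) ** a) if x <= 0.5 else 2 * x - 1

def tau(y, a=0.5):
    """First return time of y in Y to Y under iteration of T."""
    n, x = 1, T(y, a)
    while x < 0.5:
        n, x = n + 1, T(x, a)
    return n

print(tau(0.8), tau(0.6))  # 1 3: T(0.8) = 0.6 is already back in Y, while the
                           # orbit of 0.6 makes an excursion near the neutral fixed point 0
```

Points mapped close to the neutral fixed point at 0 take many iterates to escape, which is the source of the heavy tail m(τ > n) ~ n^{−1/a} discussed below.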
It is standard that each T_ε is a nonuniformly expanding map in the sense of Section 4.1 with τ_ε ∈ L^p for all p < 1/a_ε. Fix p ∈ (1, 1/a₀) and choose 0 < a₋ < a₀ < a₊ < 1 such that p < 1/a₊. Without loss we can shrink ε₀ so that a_ε ∈ [a₋, a₊] for all ε ∈ [0, ε₀). We show that T_ε, ε ∈ [0, ε₀), satisfies Definition 4.2 for this choice of p. Since T_ε′ ≥ 1 on M and T_ε′ = 2 on Y, it is immediate that conditions (1)–(3) in Section 4.1 are satisfied with λ = 2 and C0 = 1.
Recall that ζ_ε = d(m|_Y)/d(m|_Y ◦ F_ε). Note that ζ_ε(y) = 1/F_ε′(y). For each Y_ε(n) ∈ α_ε, define the bijection F_{ε,n} = F_ε|_{Y_ε(n)} : Y_ε(n) → Y. Let G_{ε,n} = (F_{ε,n}^{−1})′ = ζ_ε ◦ F_{ε,n}^{−1}. By [26, Assumption A2 and Theorem 3.1], there is a constant K, depending only on a₋ and a₊, such that |(log G_{ε,n})′| ≤ K for all ε ∈ [0, ε₀). Hence |(log ζ_ε ◦ F_{ε,n}^{−1})′| = |(log G_{ε,n})′| ≤ K. By the mean value theorem, for x, y ∈ Y_ε(n),

  | log ζ_ε(x) − log ζ_ε(y)| = |(log ζ_ε ◦ F_{ε,n}^{−1})(F_ε x) − (log ζ_ε ◦ F_{ε,n}^{−1})(F_ε y)| ≤ K|F_ε x − F_ε y|.

This proves condition (4) in Section 4.1, and so condition (i) in Definition 4.2 is satisfied.
Define x₁ = 1/2 and inductively x_{n+1} < x_n (depending on ε) with T_ε x_{n+1} = x_n. Then T_ε(Y_ε(n)) = [x_n, x_{n−1}] for n ≥ 2 and it is standard that x_n = O(n^{−1/a_ε}) as a function of n. By [26, Lemma 5.2], there is a constant K, depending only on a₋ and a₊, such that x_n ≤ Kn^{−1/a₊} for all n ≥ 1, ε ∈ [0, ε₀). Hence

  m(τ_ε > n) = m([1/2, (x_n + 1)/2]) = x_n/2 ≤ Kn^{−1/a₊}.

Since p < 1/a₊ it follows that sup_{ε∈[0,ε₀)} ∫_Y τ_ε^p dm < ∞, so condition (ii) in Definition 4.2 is satisfied.
By [11, 26], ν_ε is strongly statistically stable and the densities ρ_ε satisfy R_ε = ∫|ρ_ε − ρ₀| dm = O(|a_ε − a₀|). Hence, by Corollary 2.7, we obtain averaging in L¹ with respect to ν_ε and also with respect to any absolutely continuous probability measure. Finally, if ε ↦ a_ε is Lipschitz say, so that R_ε = O(ε), then we obtain the rates described in Remark 4.4 with p = (1/a₀)−.
Example 5.2 (Logistic family) We consider the family of quadratic maps T :
[−1, 1] → [−1, 1] given by T (x) = 1 − ax2 , a ∈ [−2, 2], with m taken to be Lebesgue
measure.
Let b, c > 0. The map T satisfies the Collet-Eckmann condition [14] with constants b, c if

  |(T^n)′(1)| ≥ ce^{bn}   for all n ≥ 0.   (5.2)

In this case, we write a ∈ Q_{b,c}. The set of Collet-Eckmann parameters is given by P₁ = ∪_{b,c>0} Q_{b,c} and is a Cantor set of positive Lebesgue measure [20, 10]. When
a ∈ P1 , the map T has an invariant set Λ consisting of a finite union of intervals with
an ergodic absolutely continuous invariant probability measure νa . The density for νa
is bounded below on Λ and lies in L2− . The invariant set attracts Lebesgue almost
every trajectory in [−1, 1].
There is also an open dense set of parameters P0 ⊂ [−2, 2] for which T possesses a
periodic sink attracting Lebesgue almost every trajectory in [−1, 1]. By Lyubich [28],
P0 ∪ P1 has full measure. For a ∈ P0 , we let νa denote the invariant probability
measure supported on the periodic attractor, so we have a map a 7→ νa defined on
P0 ∪ P1 .
It is clear that statistical stability holds on P0 , and that strong statistical stability
fails everywhere in P0 ∪ P1. Moreover, Thunberg [34, Corollary 1] shows that for any full measure subset E ⊂ [−2, 2] the map a ↦ νa is not statistically stable at any point of P1 ∩ E. On the other hand, Freitas & Todd [18] proved that strong statistical
stability holds on Qb,c for all constants b, c > 0. That is, the map a → ρa = dνa /dm
from Qb,c → L1 is continuous. (See also [16, 17] for the same result restricted to the
Benedicks-Carleson parameters [10].)
We consider families ε ↦ T_ε where each T_ε is a quadratic map with parameter a = a_ε depending continuously on ε. Fix b, c > 0 such that a₀ ∈ Q_{b,c}. We claim that

  lim_{ε→0, a_ε∈Q_{b,c}} ∫ z_ε dν_{a_ε} = 0.

Moreover, using Corollary 2.8 we obtain convergence in L¹(ν) for every absolutely continuous probability measure ν. Given the above results on strong statistical stability, it suffices to verify that T_ε is a uniform family of nonuniformly expanding maps.
For the Benedicks-Carleson parameters, the method in [16, 17] is the approach of [2] and we can apply Remark 4.5. In the general case, a different method exploiting negative Schwarzian derivative and Koebe spaces [18, Proof of Theorem B in Section 6] shows that the conditions in [5] are satisfied. By Remark 4.5, this completes the proof of averaging with the possible exception of condition (3). However, a standard consequence of negative Schwarzian derivative and the Koebe distortion property (as discussed in [18, Lemma 4.1] and used in [18, Remark 3.2]) is that bounded distortion holds at intermediate steps and not just at the inducing time as in condition (4). Hence there is a uniform constant C̃₁ such that

  |T_ε^j x − T_ε^j y| / diam T_ε^j a ≤ C̃₁ |F_ε x − F_ε y| / diam Y

for all partition elements a, all x, y ∈ a and all j ≤ τ_ε(a). In particular, |T_ε^j x − T_ε^j y| ≤ (2C̃₁/diam Y)|F_ε x − F_ε y|, yielding condition (3) uniformly in ε.
Next, we discuss rates of convergence. By [18, Lemma 4.1], condition (ii) in Definition 4.2 is satisfied for any p > 1. Hence |δ_ε|_{L^q(ν_ε)} = O(ε^{1/2−}). If ε ↦ a_ε is C¹, then it follows from Baladi et al. [9] that R_ε = ∫|ρ_ε − ρ₀| dm = O(ε^{1/2−}). By Remark 4.4, we obtain averaging with rate O(ε^{1/2−}) in L^q(ν_ε) for all q > 0 and in L¹(ν₀).
Example 5.3 (Multimodal maps) Freitas & Todd [18] also consider families of multimodal maps where each critical point c satisfies a Collet-Eckmann condition along the orbit of T_ε c with constants uniform in ε. Hence the averaging result for the quadratic family in Example 5.2 extends immediately to multimodal maps.
Example 5.4 (Viana maps) Viana [35] introduced a C³ open class of multidimensional nonuniformly expanding maps T : M → M. For definiteness, we restrict attention to the case M = S¹ × R. Let S : M → M be the map S(θ, y) = (16θ mod 1, a₀ + a sin 2πθ − y²). Here a₀ is chosen so that 0 is a preperiodic point for the quadratic map y ↦ a₀ − y² and a is fixed sufficiently small. Let T_ε, 0 ≤ ε < ε₀, be a continuous family of C³ maps sufficiently close to S. It follows from [1, 5] that there is an interval I ⊂ (−2, 2) such that, for each ε ∈ [0, ε₀), there is a unique absolutely continuous T_ε-invariant ergodic probability measure ν_ε supported in the interior of S¹ × I. Moreover the invariant set Λ_ε = supp ν_ε attracts almost every initial condition in S¹ × I.

By Alves & Viana [5], ν₀ is strongly statistically stable. Moreover, the inducing method of [4] and the arguments in [2] apply to this example, so T_ε is a uniform family of nonuniformly expanding maps by Remark 4.5. Also, Corollary 2.8 is applicable by Remark 2.9. Hence we obtain averaging in L¹(ν_ε) and in L¹(µ) for all absolutely continuous µ.
6 Averaging for continuous time fast-slow systems
Let φ_{ε,t} : M → M, 0 ≤ ε < ε₀, be a family of semiflows defined on the metric space (M, dM). For each ε ≥ 0, let ν_ε denote a φ_{ε,t}-invariant ergodic Borel probability measure. Let a : R^d × M × [0, ε₀) → R^d be a family of vector fields on R^d satisfying conditions (2.3)–(2.5).
We consider the family of fast-slow systems

  ẋ^{(ε)} = εa(x^{(ε)}, y^{(ε)}, ε),   x^{(ε)}(0) = x₀,   y^{(ε)}(t) = φ_{ε,t} y₀,

where the initial condition x^{(ε)}(0) = x₀ is fixed throughout. The initial condition y₀ ∈ M is again chosen randomly with respect to various measures that are specified in the statements of the results.

Define x̂^{(ε)} : [0, 1] → R^d by setting x̂^{(ε)}(t) = x^{(ε)}(t/ε). Let X : [0, 1] → R^d be the solution to the ODE (2.2) and define

  z_ε = sup_{t∈[0,1]} |x̂^{(ε)}(t) − X(t)|.
Recall that E = {x ∈ R^d : |x − x₀| ≤ L1}. As in Section 2.1, define ā(x, ε) = ∫_M a(x, y, ε) dν_ε(y) and let v_{ε,x}(y) = a(x, y, ε) − ā(x, ε). We define the order function δ_ε = δ_{1,ε} + δ_{2,ε} : M → R where

  δ_{1,ε} = sup_{x∈E} sup_{0≤t≤1/ε} ε|v_{ε,x,t}|   where   v_{ε,x,t} = ∫₀ᵗ v_{ε,x} ◦ φ_{ε,s} ds,

  δ_{2,ε} = sup_{x∈E} sup_{0≤t≤1/ε} ε|V_{ε,x,t}|   where   V_{ε,x,t} = ∫₀ᵗ (Dv_{ε,x}) ◦ φ_{ε,s} ds.
The next result is the continuous time analogue of Theorem 2.2. The proof is entirely analogous, and hence is omitted.

Theorem 6.1 Let S_ε = sup_{x∈E} | ∫_M a(x, y, 0) (dν_ε − dν₀)(y) | + ε. Assume conditions (2.3)–(2.5). If δ_ε ≤ 1/2, then z_ε ≤ 6e^{2L}(δ_ε + S_ε).
As in the discrete time setting, statistical stability together with L¹ convergence of δ_ε implies averaging, and similarly for rates of averaging. In the remainder of this section, we consider statistical stability and convergence of δ_ε. Under mild assumptions, we show that results for continuous time follow directly from those for discrete time with identical convergence rates.
Fix a Borel subset M′ ⊂ M and a finite Borel measure m′ supported on M′. Let h_ε : M′ → R⁺ be a family of Lipschitz functions such that φ_{ε,h_ε(y)}(y) ∈ M′ for almost all y ∈ M′. The map T_ε : M′ → M′, T_ε(y) = φ_{ε,h_ε(y)}(y), is then defined almost everywhere.

We suppose throughout that there are constants K₂ ≥ K₁ ≥ 1 such that for all x, y ∈ M′, ε ∈ [0, ε₀),

• K₁⁻¹ ≤ h_ε ≤ K₁, Lip h_ε ≤ K₁, |h_ε − h₀|_∞ ≤ K₁ε.

• dM(φ_{ε,t} x, φ_{ε,t} y) ≤ K₂ dM(x, y) and dM(φ_{ε,t} y, φ_{0,t} y) ≤ K₂ε for all t ≤ K₁.
(These assumptions are easily weakened; in particular, changing the ε estimates to ε^{1/2} will not affect anything.)
As usual, we suppose that there is a family ν′_ε of ergodic T_ε-invariant probability measures on M′. Define the suspension M^{h_ε} = {(y, u) ∈ M′ × R : 0 ≤ u ≤ h_ε(y)}/∼ where (y, h_ε(y)) ∼ (T_ε y, 0). The suspension semiflow f_{ε,t} : M^{h_ε} → M^{h_ε} is given by f_{ε,t}(y, u) = (y, u + t) computed modulo identifications. Let h̄_ε = ∫_{M′} h_ε dν′_ε. Then ν″_ε = (ν′_ε × Lebesgue)/h̄_ε is an ergodic absolutely continuous f_{ε,t}-invariant probability measure on M^{h_ε}. The projection π_ε : M^{h_ε} → M given by π_ε(y, u) = φ_{ε,u} y is a semiconjugacy between f_{ε,t} and φ_{ε,t}. Hence ν_ε = π_{ε∗} ν″_ε is an ergodic φ_{ε,t}-invariant probability measure on M.

It is easy to see that if ν′_ε is absolutely continuous on M′, then ν_ε is absolutely continuous on M. Moreover, (strong) statistical stability of ν′_ε implies (strong) statistical stability of ν_ε. Rates of strong statistical stability are also preserved. Let ρ_ε = dν_ε/dm and ρ′_ε = dν′_ε/dm′ denote the relevant densities.
Proposition 6.2 S_ε ≤ 4K₂⁴L ( ε + ∫_{M′} |ρ′_ε − ρ′₀| dm′ ).
Proof Fix x and let v(y) = a(x, y, 0). Then

  ∫_M v (dν_ε − dν₀) = ∫_{M^{h_ε}} v ◦ π_ε dν″_ε − ∫_{M^{h₀}} v ◦ π₀ dν″₀
      = (1/h̄_ε) ∫_{M′} ∫₀^{h_ε} v ◦ π_ε du dν′_ε − (1/h̄₀) ∫_{M′} ∫₀^{h₀} v ◦ π₀ du dν′₀
      = I₁ + I₂ + I₃ + I₄,

where

  I₁ = (1/h̄_ε − 1/h̄₀) ∫_{M′} ∫₀^{h_ε} v ◦ π_ε du dν′_ε,
  I₂ = (1/h̄₀) ∫_{M′} ∫₀^{h_ε} (v ◦ π_ε − v ◦ π₀) du dν′_ε,
  I₃ = (1/h̄₀) ( ∫_{M′} ∫₀^{h_ε} v ◦ π₀ du dν′_ε − ∫_{M′} ∫₀^{h₀} v ◦ π₀ du dν′_ε ),
  I₄ = (1/h̄₀) ( ∫_{M′} ∫₀^{h₀} v ◦ π₀ du dν′_ε − ∫_{M′} ∫₀^{h₀} v ◦ π₀ du dν′₀ ).
Now

  |I₁| ≤ K₁²|h̄_ε − h̄₀| K₁|v|_∞ ≤ K₁⁴Lε,
  |I₂| ≤ K₁² sup_{y∈M′} sup_{0≤u≤K₁} Lip v dM(φ_{ε,u} y, φ_{0,u} y) ≤ K₁²K₂Lε,
  |I₃| ≤ K₁|v|_∞|h_ε − h₀|_∞ ≤ K₁²Lε,
  |I₄| ≤ K₁²L ∫_{M′} |ρ′_ε − ρ′₀| dm′.

Altogether, |∫_M v (dν_ε − dν₀)| ≤ 3K₂⁴L(ε + ∫_{M′} |ρ′_ε − ρ′₀| dm′) and the result follows.
Next we show how the order function for the flows φ_{ε,t} : M → M is related to the order function for the maps T_ε : M′ → M′. We restrict attention to δ_{1,ε} since the corresponding statement for δ_{2,ε} is identical.

Define the family of induced observables w_{ε,x} : M′ → R^d,

  w_{ε,x}(y) = ∫₀^{h_ε(y)} v_{ε,x}(φ_{ε,u} y) du.

Note that ∫_{M′} w_{ε,x} dν′_ε = 0 and w_{ε,x} is dM-Lipschitz with ‖w_{ε,x}‖Lip ≤ 2K₁K₂L. Let

  ∆_{1,ε} = sup_{x∈E} sup_{1≤n≤K₁/ε} ε|w_{ε,x,n}|   where   w_{ε,x,n} = Σ_{j=0}^{n−1} w_{ε,x} ◦ T_ε^j.

We can now state our main result for this section.
Lemma 6.3 Let q ≥ 1. There is a constant C ≥ 1 such that

  ∫_M δ_{1,ε}^q dν_ε ≤ C ( ∫_{M′} ∆_{1,ε}^q dν′_ε + ε^q ).

Proof Let v̂_{ε,x} = v_{ε,x} ◦ π_ε and define v̂_{ε,x,t} = ∫₀ᵗ v̂_{ε,x} ◦ f_{ε,u} du. Let N_{ε,t} : M^{h_ε} → {0, 1, . . . , K₁t} be the number of laps by time t,

  N_{ε,t}(y, u) = #{s ∈ (0, t] : f_{ε,s}(y, u) ∈ M′}.

Then

  v̂_{ε,x,t}(y, u) = w_{ε,x,N_{ε,t}(y,u)}(y) + H_{ε,x} ◦ f_{ε,t}(y, u) − H_{ε,x}(y, u),

where H_{ε,x}(y, u) = ∫₀ᵘ v̂_{ε,x}(y, u′) du′ = v̂_{ε,x,u}(y, 0). Note that |H_{ε,x}|_∞ ≤ 2K₁L. Hence

  sup_{s≤t} |v_{ε,x,s}| ◦ π_ε(y, u) = sup_{s≤t} |v̂_{ε,x,s}(y, u)| ≤ sup_{s≤t} |w_{ε,x,N_{ε,s}(y,u)}(y)| + 4K₁L
      ≤ max_{j≤K₁t} |w_{ε,x,j}(y)| + 4K₁L,

and so δ_{1,ε} ≤ ∆_{1,ε} + 4K₁Lε. The result follows.
As a consequence of Proposition 6.2 and Lemma 6.3, many of our results for maps go through immediately for semiflows. For example, suppose that the maps T_ε(y) = φ_{ε,h_ε(y)}(y) are a family of quadratic maps as in Example 5.2. Then we obtain averaging in L¹(ν_ε) and in L¹(µ) for any absolutely continuous probability measure µ on M. Moreover, we obtain the rate O(ε^{1/2−}) for convergence in L^q(ν_ε) for any q > 0.
7 Counterexample for almost sure convergence
It is known [7] that almost sure convergence fails for fully-coupled fast-slow systems.
Here we give an example to show that almost sure convergence fails also in the simpler
context of families of skew products as considered in this paper.
Fix β > 0. We consider the family of maps T_ε : [0, 1] → [0, 1] given by T_ε y = 2y + ε^β mod 1 with invariant measure ν_ε taken to be Lebesgue for all ε ≥ 0. Let a(x, y, ε) = cos 2πy. Since a has mean zero, the averaged ODE is given by Ẋ = 0. We take x₀ = 0 so that X(t) ≡ 0. Nevertheless, we prove:
Proposition 7.1 For every y₀ ∈ [0, 1], lim sup_{ε→0} x̂^{(ε)}(1) = 1.
Proof Let y₀ ∈ [0, 1] and δ > 0. Let N = [δ^{−1/2}] and choose an integer 1 ≤ k ≤ 2^N such that

  y₀ ∈ [−δ^β + (k − 1)2^{−N}, −δ^β + k2^{−N}].

Choose ε such that

  y₀ = −ε^β + (k − 1)2^{−N}.

Then

  δ^β − 2^{−N} ≤ ε^β ≤ δ^β.

If δ is small enough, then δ^β − 2^{−N} = δ^β − 2^{−[δ^{−1/2}]} > 0, and so 0 < ε ≤ δ.

Now

  y_n^{(ε)} = 2^n y₀ + (2^n − 1)ε^β mod 1 = −ε^β + (k − 1)2^{n−N} mod 1,

for all n ≥ 0. In particular, for n ≥ N we have y_n^{(ε)} = −ε^β mod 1, and cos 2πy_n^{(ε)} ≥ 1 − πε^β. Note that N ≤ ε^{−1/2}. Hence x̂^{(ε)}(1) = ε Σ_{n=0}^{[1/ε]−1} cos(2πy_n^{(ε)}) = 1 + O(ε^{1/2}) + O(ε^β). Since ε ∈ (0, δ] is arbitrarily small, the result follows.
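The choice of ε in the proof can be carried out numerically. The sketch below is illustrative: it takes β = 1 and the arbitrary values y₀ = 1/3, δ = 1/100, uses exact rational arithmetic for the fast variable (floating point loses all precision under repeated doubling), and checks that x̂^{(ε)}(1) is close to 1 even though the averaged equation gives X(1) = 0.

```python
from fractions import Fraction
from math import cos, pi, floor

# Proposition 7.1 with beta = 1: given y0 and delta, set N = [delta^{-1/2}],
# pick k with y0 in [-delta + (k-1) 2^{-N}, -delta + k 2^{-N}], and choose
# eps = (k-1) 2^{-N} - y0, so that y_n = -eps mod 1 for all n >= N.
y0, delta = Fraction(1, 3), Fraction(1, 100)
N = 10                                    # N = [delta^{-1/2}] for delta = 1/100
k1 = floor((y0 + delta) * 2**N)           # k - 1
eps = Fraction(k1, 2**N) - y0             # here eps = 29/3072
assert 0 < eps <= delta

y, x = y0, 0.0
for n in range(floor(1 / eps)):           # x_{n+1} = x_n + eps * a(x_n, y_n)
    x += float(eps) * cos(2 * pi * float(y))
    y = (2 * y + eps) % 1                 # exact arithmetic for the fast map
print(x)  # close to 1, although the averaged ODE gives X(1) = 0
```

Repeating with smaller δ (hence smaller ε) pushes the result closer to 1, in line with the proposition.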
A Proof of first order averaging
In this appendix, we prove Theorem 2.2(a). Our proof is analogous to that of [32,
Theorem 4.3.6], but our setup is different, and we work with discrete time rather than
continuous time.
First, we consider the case where T is independent of ε. Let T : M → M be a transformation, and a : R^d × M → R^d and ā : R^d → R^d be functions satisfying ‖a‖Lip = sup_{y∈M} ‖a(·, y)‖Lip ≤ L and ‖ā‖Lip ≤ L, where L ≥ 1.

For ε > 0, consider the discrete fast-slow system

  x_{n+1} = x_n + εa(x_n, y_n),   y_{n+1} = T y_n,

with x₀ ∈ R^d and y₀ ∈ M given.
Define x̂_ε : [0, 1] → R^d, x̂_ε(t) = x_{[t/ε]}, and let X : [0, 1] → R^d be the solution of the ODE Ẋ = ā(X) with initial condition X(0) = x₀. As in Section 2, we define

  δ_{1,ε} = ε sup_x sup_{1≤n≤1/ε} | Σ_{j=0}^{n−1} (a(x, y_j) − ā(x)) |.
Theorem A.1 For all ε > 0, x₀ ∈ R^d, y₀ ∈ M and t ∈ [0, 1],

  |x̂_ε(t) − X(t)| ≤ 5e^{2L} ( δ_{1,ε}(y₀)^{1/2} + ε ).
First we recall a discrete version of Gronwall’s lemma.
Proposition A.2 Suppose that b_n ≥ 0 and that there exist constants C, D ≥ 0 such that b_n ≤ C + D Σ_{m=0}^{n−1} b_m for all n ≥ 0. Then b_n ≤ C(D + 1)^n.

Proof This follows by induction.
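The bound in Proposition A.2 is attained when the hypothesis holds with equality at every step, since b_n = C + D Σ_{m&lt;n} b_m forces b_{n+1} = (1 + D)b_n. A quick sanity check in exact arithmetic (the values of C and D are arbitrary):

```python
from fractions import Fraction

# Extremal case of the discrete Gronwall inequality (Proposition A.2):
# define b_n by equality, b_n = C + D * (b_0 + ... + b_{n-1});
# then b_n = C * (1 + D)^n exactly.
C, D = Fraction(2), Fraction(1, 2)
b = []
for n in range(10):
    b.append(C + D * sum(b))
assert all(b[n] == C * (1 + D) ** n for n in range(10))
print(b[:3])  # [Fraction(2, 1), Fraction(3, 1), Fraction(9, 2)]
```

This is the same mechanism used repeatedly below with D = εL and n ≤ 1/ε, which is why factors (1 + εL)^{1/ε} ≤ e^L appear in the estimates.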
We are interested in the sequences x_n ∈ R^d, y_n ∈ M for n ≤ ε^{−1}, and for convenience and without loss of generality we assume that a(x, y_n) = ā(x) for all n > ε^{−1} and all x ∈ R^d. This gives us

  | Σ_{m=0}^{n−1} (a(x, y_m) − ā(x)) | ≤ δ_{1,ε}(y₀)/ε   (A.1)

for every n, from the definition of δ_{1,ε}.
Define the local averages of x and a,

  x_{N,n} = (1/N) Σ_{m=0}^{N−1} x_{n+m},   a_{N,n}(x) = (1/N) Σ_{m=0}^{N−1} a(x, y_{n+m}).
The proof consists mainly of five short lemmas, all of them uniform in N, n ∈ Z, x₀ ∈ R^d, y₀ ∈ M, ε > 0, with 0 ≤ n ≤ ε^{−1}.
Lemma A.3 |x_n − x_{N,n}| ≤ LNε.

Proof We have

  x_{N,n} = (1/N) Σ_{m=0}^{N−1} [ x_n + ε Σ_{k=n}^{n+m−1} a(x_k, y_k) ] = x_n + (ε/N) Σ_{m=0}^{N−1} Σ_{k=n}^{n+m−1} a(x_k, y_k),

and so |x_n − x_{N,n}| ≤ εN|a|_∞ ≤ LNε.
Lemma A.4 | x_{N,n} − x₀ − ε Σ_{m=0}^{n−1} a_{N,m}(x_m) | ≤ 2L²Nε.
Proof First compute that

  x_{N,n} − x₀ = x_n − x₀ + (ε/N) Σ_{m=0}^{N−1} Σ_{k=n}^{n+m−1} a(x_k, y_k)
      = ε Σ_{k=0}^{n−1} a(x_k, y_k) + (ε/N) Σ_{m=0}^{N−1} Σ_{k=n}^{n+m−1} a(x_k, y_k) = (ε/N) Σ_{m=0}^{N−1} Σ_{k=0}^{n+m−1} a(x_k, y_k).
n−1
X
Next,
n−1 N −1
1 XX
aN,m (xm ) =
a(xm , ym+k ) = A(n, N ) + B(n, N ),
N m=0 k=0
m=0
n−1
X
30
where
n−1 N −1
1 XX
A(n, N ) =
[a(xm , ym+k ) − a(xm+k , ym+k )],
N m=0 k=0
n−1 N −1
1 XX
a(xm+k , ym+k ).
N m=0 k=0
B(n, N ) =
Note that |A(n, N)| ≤ (L/N) Σ_{m=0}^{n−1} Σ_{k=0}^{N−1} |x_m − x_{m+k}| and |x_m − x_{m+k}| ≤ εLk ≤ εLN, so that |A(n, N)| ≤ εL²nN ≤ L²N. Also

  B(n, N) = (1/N) Σ_{m=0}^{N−1} Σ_{k=0}^{n−1} a(x_{m+k}, y_{m+k}) = (1/N) Σ_{m=0}^{N−1} Σ_{k=m}^{n+m−1} a(x_k, y_k).
Altogether,

  x_{N,n} − x₀ − ε Σ_{m=0}^{n−1} a_{N,m}(x_m)
      = (ε/N) Σ_{m=0}^{N−1} Σ_{k=0}^{n+m−1} a(x_k, y_k) − εA(n, N) − (ε/N) Σ_{m=0}^{N−1} Σ_{k=m}^{n+m−1} a(x_k, y_k)
      = (ε/N) Σ_{m=0}^{N−1} Σ_{k=0}^{m−1} a(x_k, y_k) − εA(n, N).

Hence

  | x_{N,n} − x₀ − ε Σ_{m=0}^{n−1} a_{N,m}(x_m) | ≤ εLN + ε|A(n, N)| ≤ 2L²Nε,

as required.
Define a sequence u_n ∈ R^d by setting u₀ = x₀ and

  u_{n+1} = u_n + εa_{N,n}(u_n).
Lemma A.5 |x_n − u_n| ≤ 3L²e^L Nε.

Proof Note that Lip(a_{N,n}) ≤ L for all N, n. Hence it follows from Lemma A.4 that

  |x_{N,n} − u_n| ≤ ε Σ_{m=0}^{n−1} |a_{N,m}(x_m) − a_{N,m}(u_m)| + 2L²Nε ≤ εL Σ_{m=0}^{n−1} |x_m − u_m| + 2L²Nε.

By Lemma A.3,

  |x_n − u_n| ≤ εL Σ_{m=0}^{n−1} |x_m − u_m| + 3L²Nε.

By Proposition A.2, |x_n − u_n| ≤ 3L²Nε(1 + εL)^n ≤ 3L²Nε(1 + εL)^{1/ε} ≤ 3L²e^L Nε.
Define a sequence z_n ∈ R^d with

  z_{n+1} = z_n + εā(z_n),   z₀ = x₀.

Lemma A.6 |u_n − z_n| ≤ 2e^L δ_{1,ε}/(εN).
Proof By equation (A.1),

  |a_{N,n}(x) − ā(x)| = (1/N) | Σ_{m=0}^{n+N−1} [a(x, y_m) − ā(x)] − Σ_{m=0}^{n−1} [a(x, y_m) − ā(x)] | ≤ 2δ_{1,ε}/(εN),

for all x. Hence,

  |u_n − z_n| ≤ ε Σ_{m=0}^{n−1} |a_{N,m}(u_m) − ā(z_m)|
      ≤ ε Σ_{m=0}^{n−1} |a_{N,m}(u_m) − ā(u_m)| + ε Σ_{m=0}^{n−1} |ā(u_m) − ā(z_m)| ≤ 2δ_{1,ε}/(εN) + εL Σ_{m=0}^{n−1} |u_m − z_m|.

By Proposition A.2, |u_n − z_n| ≤ (2δ_{1,ε}/(εN))(1 + εL)^n ≤ 2e^L δ_{1,ε}/(εN).
Lemma A.7 |X(nε) − z_n| ≤ L²e^L ε.
Proof Write

  X(nε) = x₀ + Σ_{m=0}^{n−1} ∫_{mε}^{(m+1)ε} [ā(X(s)) − ā(X(mε))] ds + ε Σ_{m=0}^{n−1} ā(X(mε)).

Since |ā(X(t₁)) − ā(X(t₂))| ≤ L|X(t₁) − X(t₂)| ≤ L²|t₁ − t₂| for all t₁, t₂, we obtain that

  | X(nε) − x₀ − ε Σ_{m=0}^{n−1} ā(X(mε)) | ≤ L²nε² ≤ L²ε.

Hence

  |X(nε) − z_n| ≤ ε Σ_{m=0}^{n−1} |ā(X(mε)) − ā(z_m)| + L²ε ≤ εL Σ_{m=0}^{n−1} |X(mε) − z_m| + L²ε.

The result follows from Proposition A.2.
Proof of Theorem A.1 For convenience, we write L²e^L ≤ e^{2L}. Also, 2e^L ≤ e^{2L}. By Lemmas A.5 and A.6,

  |x_n − z_n| ≤ e^{2L} δ_{1,ε}/(εN) + 3e^{2L} Nε.   (A.2)

Choosing N = [δ_{1,ε}^{1/2}/ε] + 1, we obtain |x_n − z_n| ≤ 4e^{2L} δ_{1,ε}^{1/2} + 3e^{2L} ε. Overall, using Lemma A.7,

  |x_{[t/ε]} − X(t)| ≤ |X(t) − X(ε[t/ε])| + |X(ε[t/ε]) − z_{[t/ε]}| + |z_{[t/ε]} − x_{[t/ε]}|
      ≤ Lε + L²e^L ε + 4e^{2L} δ_{1,ε}^{1/2} + 3e^{2L} ε ≤ 5e^{2L} ( δ_{1,ε}^{1/2} + ε ).

This completes the proof.
Proof of Theorem 2.2(a) Replacing T, a(x, y) and ā(x) in Theorem A.1 by T_ε, a(x, y, ε) and ā(x, ε), we obtain that

  |x̂_ε(t) − X_ε(t)| ≤ 5e^{2L} ( δ_{1,ε}^{1/2} + ε )   where   Ẋ_ε = ā(X_ε, ε),   X_ε(0) = x₀.

Let A_ε = sup_{x∈E} |ā(x, ε) − ā(x, 0)|. Then

  |X_ε(t) − X(t)| ≤ ∫₀ᵗ |ā(X_ε(s), ε) − ā(X(s), 0)| ds
      ≤ tA_ε + ∫₀ᵗ |ā(X_ε(s), 0) − ā(X(s), 0)| ds ≤ A_ε + L ∫₀ᵗ |X_ε(s) − X(s)| ds.

By Gronwall's lemma, |X_ε(t) − X(t)| ≤ e^L A_ε for all t ≤ 1.

Next, A_ε ≤ Lε + sup_{x∈E} | ∫_M a(x, y, 0) (dν_ε − dν₀)(y) |. Combining these estimates we obtain that

  |x̂_ε(t) − X(t)| ≤ 5e^{2L} δ_{1,ε}^{1/2} + 6e^{2L} ε + e^L sup_{x∈E} | ∫_M a(x, y, 0) (dν_ε − dν₀)(y) |,

yielding the result.
B Proof of second order averaging
In this appendix, we prove Theorem 2.2(b). This is a quantitative version of a result
due to [31] with a somewhat simplified proof. Again we work with discrete time rather
than continuous time.
As in Appendix A, we consider first the case where T : M → M is independent of ε. Suppose that a : R^d × M → R^d and ā : R^d → R^d are functions. Assume that ‖a‖Lip ≤ L and ‖Da‖Lip ≤ L, where D = d/dx and L ≥ 1.
Define δ_ε = δ_{1,ε} + δ_{2,ε} where

  δ_{1,ε}(y₀) = ε sup_x sup_{1≤n≤1/ε} | Σ_{j=0}^{n−1} (a(x, y_j) − ā(x)) |,

  δ_{2,ε}(y₀) = ε sup_x sup_{1≤n≤1/ε} | Σ_{j=0}^{n−1} (Da(x, y_j) − Dā(x)) |.
Theorem B.1 Let ε > 0, x₀ ∈ R^d. For all y₀ ∈ M with δ_ε(y₀) ≤ 1/2 and t ∈ [0, 1],

  |x̂_ε(t) − X(t)| ≤ 5e^{2L} (δ_ε(y₀) + ε).
Define a function u : R^d × {0, 1, 2, . . . , 1/ε} → R^d by setting u(x, 0) ≡ 0 and

  u(x, n) = (ε/δ_ε) Σ_{j=0}^{n−1} (a(x, y_j) − ā(x)),   n ≥ 1.
Proposition B.2 For any n ≤ 1/ε, we have |u(·, n)|_∞ ≤ 1 and Lip u(·, n) ≤ 1.

Proof For all x,

  |Du(x, n)| = (ε/δ_ε) | Σ_{j=0}^{n−1} (Da(x, y_j) − Dā(x)) | ≤ δ_{2,ε}/δ_ε ≤ 1.

Hence the second estimate follows from the mean value theorem, and the first estimate is easier.
Define a new sequence w_n by setting w₀ = x₀ and

  w_n = x_n − δ_ε u(w_{n−1}, n),   n ≥ 1.
Lemma B.3 For all 0 ≤ n ≤ 1/ε and for all y₀ ∈ M with δ_ε(y₀) ≤ 1/2,

  | w_n − w₀ − ε Σ_{k=0}^{n−1} ā(w_k) | ≤ 4Lδ_ε.
Proof By definition, for all 0 ≤ k ≤ 1/ε,

  δ_ε [u(w_k, k + 1) − u(w_k, k)] = ε[a(w_k, y_k) − ā(w_k)].
Thus

  δ_ε u(w_{n−1}, n) = δ_ε Σ_{k=0}^{n−1} {u(w_k, k + 1) − u(w_{k−1}, k)}
      = δ_ε Σ_{k=0}^{n−1} {u(w_k, k + 1) − u(w_k, k)} + δ_ε Σ_{k=0}^{n−1} {u(w_k, k) − u(w_{k−1}, k)}
      = ε Σ_{k=0}^{n−1} {a(w_k, y_k) − ā(w_k)} + I_n,

where

  I_n = δ_ε Σ_{k=1}^{n−1} {u(w_k, k) − u(w_{k−1}, k)}.
This together with the definition of x_n yields

  w_n = x₀ + ε Σ_{k=0}^{n−1} a(x_k, y_k) − δ_ε u(w_{n−1}, n)
      = w₀ + ε Σ_{k=0}^{n−1} a(x_k, y_k) − ε Σ_{k=0}^{n−1} {a(w_k, y_k) − ā(w_k)} − I_n
      = w₀ + ε Σ_{k=0}^{n−1} ā(w_k) − I_n + II_n,

where

  II_n = ε Σ_{k=0}^{n−1} {a(x_k, y_k) − a(w_k, y_k)}.
We claim that for all 0 ≤ n ≤ 1/ε,

  |w_{n+1} − w_n − εā(w_n)| ≤ 4Lδ_ε ε.

The result follows by summing over n.

It remains to prove the claim. It is easy to check that w₁ − w₀ − εā(w₀) = 0. Inductively, suppose that |w_n − w_{n−1} − εā(w_{n−1})| ≤ 4Lδ_ε ε. Notice that

  |II_{n+1} − II_n| ≤ ε Lip a |x_n − w_n| ≤ ε Lip a δ_ε |u(w_{n−1}, n)| ≤ Lδ_ε ε.

Also,

  |I_{n+1} − I_n| ≤ δ_ε Lip u |w_n − w_{n−1}| ≤ δ_ε (ε|ā(w_{n−1})| + 4Lδ_ε ε) ≤ 3Lδ_ε ε,

where the second inequality follows by the induction hypothesis and the third inequality uses δ_ε ≤ 1/2. Therefore

  |w_{n+1} − w_n − εā(w_n)| ≤ |I_{n+1} − I_n| + |II_{n+1} − II_n| ≤ 4Lδ_ε ε,

proving the claim.
Define the sequence

  z_{n+1} = z_n + εā(z_n),   z₀ = x₀.

Corollary B.4 |x_n − z_n| ≤ 5δ_ε e^{2L} for all 1 ≤ n ≤ 1/ε and all y₀ ∈ M with δ_ε(y₀) ≤ 1/2.
Proof Write

  w_n − z_n = w_n − x₀ − ε Σ_{k=0}^{n−1} ā(z_k) = w_n − w₀ − ε Σ_{k=0}^{n−1} ā(w_k) + ε Σ_{k=0}^{n−1} {ā(w_k) − ā(z_k)}.

By Lemma B.3,

  |w_n − z_n| ≤ 4Lδ_ε + ε Σ_{k=0}^{n−1} |ā(w_k) − ā(z_k)| ≤ 4Lδ_ε + εL Σ_{k=0}^{n−1} |w_k − z_k|.

By Proposition A.2, |w_n − z_n| ≤ 4Lδ_ε(1 + εL)^n ≤ 4Lδ_ε e^L ≤ 4δ_ε e^{2L}. Moreover, |x_n − w_n| ≤ δ_ε |u|_∞ ≤ δ_ε and the result follows.
Proof of Theorem B.1 This is identical to the proof of Theorem A.1, using Corollary B.4 in place of (A.2) together with Lemma A.7.
Proof of Theorem 2.2(b) This follows from Theorem B.1 in exactly the same way
that Theorem 2.2(a) followed from Theorem A.1.
Acknowledgements This research was supported in part by a European Advanced
Grant StochExtHomog (ERC AdG 320977). We are grateful to Vitor Araújo, Jorge
Freitas, Vilton Pinheiro, Mike Todd and Paulo Varandas for helpful discussions.
References
[1] J. F. Alves. SRB measures for non-hyperbolic systems with multidimensional expansion. Ann. Sci. École Norm. Sup. (4) 33 (2000) 1–32.
[2] J. F. Alves. Strong statistical stability of non-uniformly expanding maps. Nonlinearity 17 (2004) 1193–1215.
[3] J. F. Alves, C. Bonatti and M. Viana. SRB measures for partially hyperbolic
systems whose central direction is mostly expanding. Invent. Math. 140 (2000)
351–398.
[4] J. F. Alves, S. Luzzatto and V. Pinheiro. Markov structures and decay of correlations for non-uniformly expanding dynamical systems. Ann. Inst. H. Poincaré
Anal. Non Linéaire 22 (2005) 817–839.
[5] J. F. Alves and M. Viana. Statistical stability for robust classes of maps with
non-uniform expansion. Ergodic Theory Dynam. Systems 22 (2002) 1–32.
[6] D. V. Anosov. Averaging in systems of ordinary differential equations with rapidly
oscillating solutions. Izv. Akad. Nauk SSSR Ser. Mat. 24 (1960) 721–742.
[7] V. Bakhtin and Y. Kifer. Nonconvergence examples in averaging. Geometric and
probabilistic structures in dynamics. Contemp. Math. 469, Amer. Math. Soc.,
Providence, RI, 2008, pp. 1–17.
[8] V. Baladi. On the susceptibility function of piecewise expanding interval maps.
Comm. Math. Phys. 275 (2007) 839–859.
[9] V. Baladi, M. Benedicks and D. Schnellmann. Whitney-Hölder continuity of the
SRB measure for transversal families of smooth unimodal maps. Invent. Math. To
appear.
[10] M. Benedicks and L. Carleson. On iterations of 1 − ax² on (−1, 1). Ann. of Math.
122 (1985) 1–25.
[11] V. Baladi and M. Todd. Linear response for intermittent maps. Preprint, 2015.
[12] P. Billingsley. Convergence of probability measures, second ed., Wiley Series in
Probability and Statistics: Probability and Statistics, John Wiley & Sons Inc.,
New York, 1999, A Wiley-Interscience Publication.
[13] D. L. Burkholder. Distribution function inequalities for martingales. Ann. Probability 1 (1973) 19–42.
[14] P. Collet and J.-P. Eckmann. Positive Liapunov exponents and absolute continuity for maps of the interval. Ergodic Theory Dynam. Systems 3 (1983) 13–46.
[15] D. Dolgopyat. Averaging and invariant measures. Mosc. Math. J. 5 (2005) 537–
576, 742.
[16] J. M. Freitas. Continuity of SRB measure and entropy for Benedicks-Carleson
quadratic maps. Nonlinearity 18 (2005) 831–854.
[17] J. M. Freitas. Exponential decay of hyperbolic times for Benedicks-Carleson
quadratic maps. Port. Math. 67 (2010) 525–540.
[18] J. M. Freitas and M. Todd. The statistical stability of equilibrium states for
interval maps. Nonlinearity 22 (2009) 259–281.
[19] S. Gouëzel. Decay of correlations for nonuniformly expanding systems. Bull. Soc.
Math. France 134 (2006) 1–31.
[20] M. V. Jakobson. Absolutely continuous invariant measures for one-parameter
families of one-dimensional maps. Comm. Math. Phys. 81 (1981) 39–88.
[21] G. Keller. Stochastic stability in some chaotic dynamical systems. Monatsh.
Math. 94 (1982) 313–333.
[22] G. Keller and C. Liverani. Stability of the spectrum for transfer operators. Annali
della Scuola Normale Superiore di Pisa, Classe di Scienze XXVIII (1999) 141–
152.
[23] Y. Kifer. Averaging in difference equations driven by dynamical systems.
Astérisque (2003), no. 287, xviii, 103–123, Geometric methods in dynamics. II.
[24] Y. Kifer. Averaging principle for fully coupled dynamical systems and large deviations. Ergodic Theory Dynam. Systems 24 (2004) 847–871.
[25] Y. Kifer. Another proof of the averaging principle for fully coupled dynamical
systems with hyperbolic fast motions. Discrete Contin. Dyn. Syst. 13 (2005) 1187–
1201.
[26] A. Korepanov. Linear response for intermittent maps with summable and nonsummable decay of correlations. Preprint, 2015.
[27] C. Liverani, B. Saussol and S. Vaienti. A probabilistic approach to intermittency.
Ergodic Theory Dynam. Systems 19 (1999) 671–685.
[28] M. Lyubich. Almost every real quadratic map is either regular or stochastic.
Ann. of Math. 156 (2002) 1–78.
[29] I. Melbourne and R. Zweimüller. Weak convergence to stable Lévy processes for
nonuniformly hyperbolic dynamical systems. Ann. Inst. H. Poincaré (B) Probab.
Statist. 51 (2015) 545–556.
[30] F. Móricz. Moment inequalities and the strong laws of large numbers. Z.
Wahrscheinlichkeitstheorie und Verw. Gebiete 35 (1976) 299–314.
[31] J. A. Sanders. On the fundamental theorem of averaging. SIAM J. Math. Anal.
14 (1983) 1–10.
[32] J. A. Sanders, F. Verhulst and J. Murdock. Averaging methods in nonlinear
dynamical systems, second ed., Applied Mathematical Sciences 59, Springer, New
York, 2007.
[33] R. J. Serfling. Moment inequalities for the maximum cumulative sum. Ann. Math.
Statist. 41 (1970) 1227–1234.
[34] H. Thunberg. Unfolding of chaotic unimodal maps and the parameter dependence
of natural measures. Nonlinearity 14 (2001) 323–337.
[35] M. Viana. Multidimensional nonhyperbolic attractors. Inst. Hautes Études Sci.
Publ. Math. (1997) 63–96.
[36] L.-S. Young. Recurrence times and rates of mixing. Israel J. Math. 110 (1999)
153–188.