The Averaging Principle (argued via a Method of Multiple Time Scales)

Unperturbed system (not necessarily Hamiltonian):
$$\frac{dI}{dt} = 0 \tag{1}$$
$$\frac{d\phi}{dt} = \omega(I, \phi) \tag{2}$$
$$I \in \mathbb{R}^m, \qquad \phi \in T^n \text{ or any compact smooth manifold} \tag{3}$$
Perturbed system:
$$\frac{dI}{dt} = \epsilon f(I, \phi, \epsilon) \tag{4}$$
$$\frac{d\phi}{dt} = \omega(I, \phi) + \epsilon g(I, \phi, \epsilon) \tag{5}$$
Note that:
• $f$ and $g$ generally have power series expansions in $\epsilon$
• $I$ changes slowly since $\epsilon$ is small
• $\phi$ changes rapidly compared to $I$ because (we assume) $\omega \neq 0$
Define a parameter $\lambda$ that changes slowly with time; more specifically, $d\lambda/dt = O(\epsilon)$. We will use this parameter as the argument of functions that change slowly with time. Because the functional dependence can absorb the details of how things change slowly, we can simply define $\lambda = \epsilon t$. Furthermore, we restrict our method to $t \in [0, 1/\epsilon]$, so that $\lambda \in [0, 1]$. Now $dI/dt = O(\epsilon)$. One could imagine writing an expansion for $I$ as $I = I^*(\lambda) + \epsilon I^{**}(\lambda) + \ldots$, but this is not sufficiently general. No high-frequency variation is allowed in $I^*$, yet the time derivative of the $I^*$ term of the expansion corresponds to $\epsilon f(I, \phi, 0)$, and $\phi$ is a “fast” variable: $f(I, \phi, 0)$ should be allowed to have high frequencies. We must have faith that the solution is to write the expansion as
$$I = I_0(\lambda) + \epsilon I_1(\lambda, t) + \epsilon^2 I_2(\lambda, t) + \ldots \tag{6}$$
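By comparing orders:
$$\epsilon^0: \quad \left.\frac{dI}{dt}\right|_0 = \frac{\partial I_0}{\partial t} = 0 \tag{7}$$
$$\epsilon^1: \quad \left.\frac{dI}{dt}\right|_1 = \epsilon \frac{dI_0}{d\lambda} + \epsilon \frac{\partial I_1}{\partial t} = \epsilon f(I, \phi, 0) \tag{8}$$
$$\epsilon^2: \quad \left.\frac{dI}{dt}\right|_2 = \epsilon^2 \frac{\partial I_1}{\partial \lambda} + \epsilon^2 \frac{\partial I_2}{\partial t} = \epsilon^2 \left.\frac{\partial f}{\partial \epsilon}\right|_{\epsilon=0} \tag{9}$$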
But how unique is the expansion (6)? So far it is meaningless as an expansion in $\epsilon$. Consider the case where the term $I_1(\lambda, t)$ has the form $h_1(\lambda)(a + bt)$. Then since $t = \lambda/\epsilon$ we see that $\epsilon I_1$ contains the $O(1)$ piece $b \lambda h_1(\lambda)$, which should be included in $I_0(\lambda)$. And a $ct^2$ term is even more disastrous than a $bt$ term. This might lead one to conclude that $t$ should not be present in $I_1$, but it is the fact that a polynomial in $t$ is secular in $t$ that causes the problem. (A function $h$ is secular in $t$ if $\lim_{t\to\infty} h(t)/t$ is not 0; see footnote 1.) If $I_1$ were of the form $h_1(\lambda) \sin(t)$ we would not have this problem; this is an example of a non-secular $I_1$. We demand that in the expansion (6) all of the $I_i$ be non-secular. (The fact that $\lambda$ cannot become arbitrarily large causes the phrases “non-secular in time” and “non-secular in the variable $t$” to have the same meaning.)
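The distinction between secular and non-secular terms can be checked numerically. The following is a minimal sketch, assuming the finite-window criterion $(h(t+\tau) - h(t))/\tau \ll 1$ from footnote 1; the values of `eps`, `tau`, `a`, and `b` and the helper name `secular_slope` are illustrative assumptions, not taken from the text.

```python
import math

def secular_slope(h, t, tau):
    """Finite-window growth rate (h(t + tau) - h(t)) / tau."""
    return (h(t + tau) - h(t)) / tau

eps = 1e-3        # assumed small parameter
tau = 30.0        # assumed intermediate scale: 1/omega = 1 << tau << 1/eps = 1000
a, b = 2.0, 0.5   # hypothetical coefficients of the candidate a + b*t

def oscillating(t):
    # Non-secular candidate for I_1 (with h_1 = 1): sin(t)
    return math.sin(t)

def linear(t):
    # Secular candidate for I_1 (with h_1 = 1): a + b*t
    return a + b * t

# Sample t over [0, 1/eps] and record the worst-case growth rate.
times = [float(k) for k in range(int(1 / eps) + 1)]
worst_osc = max(abs(secular_slope(oscillating, t, tau)) for t in times)
worst_lin = max(abs(secular_slope(linear, t, tau)) for t in times)

# Size of eps * I_1 at the end of the interval, t = 1/eps.
t_end = 1 / eps
size_osc = abs(eps * oscillating(t_end))   # stays O(eps)
size_lin = abs(eps * linear(t_end))        # eps*a + b, an O(1) quantity

print(worst_osc, worst_lin, size_osc, size_lin)
```

Under these assumptions the growth rate of the sine term is bounded by $2/\tau$, while that of the linear term equals $b$ no matter how $\tau$ is chosen, and $\epsilon(a + bt)$ reaches size $\epsilon a + b$ at $t = 1/\epsilon$, which is why such a term cannot be treated as a first-order correction.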
If we want we can also expand φ as
$$\phi = \phi_0(\lambda, t) + \epsilon \phi_1(\lambda, t) + \ldots \tag{10}$$
where the φi are non-secular. The use of expansions such as (6) and (10) is sometimes called
a (or the) method of multiple time scales.
Now we average (4) over the intermediate time scale τ (see previous footnote). This
yields
$$\frac{d\langle I \rangle_\tau}{dt} \approx \epsilon \Big\langle f(I_0(\lambda), \phi_0(\lambda, t), 0) \Big\rangle_\tau \tag{11}$$
Corrections are $O(\epsilon^2)$.
Footnote 1: In the development of this method we will be forced to use a more nebulous definition of secularity that has the same intuitive meaning. We will have to assume that we can pick an intermediate time scale $\tau$ so that in the region of phase space we consider $1/\omega_i \ll \tau \ll 1/\epsilon$. Then we will consider a function $h$ to be non-secular if $(h(t+\tau) - h(t))/\tau \ll 1$ for $t \in [0, 1/\epsilon]$ (here $\tau$ takes the place of the infinite time in the stricter definition). When $h$ is a function of $\lambda$ as well, it should not make a difference whether we use $\lambda = \epsilon t$ in the evaluation of $h$ or just treat $\lambda$ as independent.

These equations are not simply equations for the $I$ on $\mathbb{R}^m$; the solutions also depend on the initial value of the phases $\phi$ (in addition to the initial $I(0)$). However, if $\tau$ is sufficiently long and the unperturbed motion covers the whole torus, spending on average equal times in regions of equal torus area, then the time average can be replaced by an average over the $\phi$ torus. One then usually replaces the quantity $\langle I \rangle_\tau$ with a new variable called $J$. (Strictly speaking, we are not setting $J = \langle I \rangle_\tau$; we are just setting up the equations below. It appears that $J = \langle I \rangle_\tau + O(\epsilon)$.) The equations for this averaged system are
$$\frac{dJ}{dt} = \epsilon \bar{f}(J) \tag{12}$$
$$\bar{f}(J) = (2\pi)^{-n} \int f(J, \phi, 0)\, d\phi_1 \ldots d\phi_n \tag{13}$$
$$J(0) = I(0) \tag{14}$$
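Replacing the time average by the torus average in (13) relies on the unperturbed motion covering the torus uniformly. The sketch below illustrates this for $n = 2$ with constant frequencies. The integrand `f`, the frequency vectors, the averaging time `T`, and the helper names are all hypothetical choices made for illustration, and the dependence on $J$ is suppressed.

```python
import math

def f(phi1, phi2):
    # Hypothetical integrand on the 2-torus; its phase average is exactly 1/2.
    return math.cos(phi1) ** 2 + math.cos(phi1 - phi2)

def phase_average(func, m=64):
    """(2*pi)^(-2) times the integral of func over the torus, on an m x m uniform grid."""
    total = 0.0
    for i in range(m):
        for j in range(m):
            total += func(2 * math.pi * i / m, 2 * math.pi * j / m)
    return total / (m * m)

def time_average(func, omega, T, dt=0.01):
    """Midpoint-rule time average of func along phi(t) = omega * t on [0, T]."""
    steps = int(round(T / dt))
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += func(omega[0] * t, omega[1] * t)
    return total / steps

omega_ergodic = (1.0, math.sqrt(2.0))   # incommensurate: the orbit fills the torus
omega_resonant = (1.0, 1.0)             # resonant: the orbit stays on a closed curve

f_bar = phase_average(f)
avg_ergodic = time_average(f, omega_ergodic, T=2000.0)
avg_resonant = time_average(f, omega_resonant, T=2000.0)

print(f_bar, avg_ergodic, avg_resonant)
```

For the incommensurate frequencies the time average approaches the phase average $1/2$, while on the resonant orbit $\cos(\phi_1 - \phi_2)$ is identically 1 and the time average tends to $3/2$ instead. In that case the torus average is not a valid substitute for the time average.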
The averaged system is simpler than the original system: it lives on $\mathbb{R}^m$. The averaging principle is the replacement of (4) and (5) with (12)–(14), i.e., the assertion that $J \approx I$. In the words of Arnold [2]:
We note that this principle is neither a theorem, an axiom, nor a definition, but
rather a physical proposition, i.e., a vaguely formulated and, strictly speaking,
untrue assertion.
Here is the argument for the averaging principle based on the method of multiple time scales we have just used. Using the non-secularity of the $I_i$, one can see from (6) that $\langle I \rangle_\tau = I_0 + O(\epsilon)$. Since $J = \langle I \rangle_\tau + O(\epsilon)$, we have $J = I_0 + O(\epsilon)$. And (6) says directly that $I = I_0 + O(\epsilon)$. Thus it seems that $I = J + O(\epsilon)$. This isn't generally true, although it is for systems where $n = 1$. Generally the error is larger, for example $O(\sqrt{\epsilon})$. I believe the increased error has to do with either correlations or near-resonances in the $\phi$ motions.
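The $O(\epsilon)$ estimate for $n = 1$ can be illustrated on a toy problem. The following is a minimal sketch, assuming the hypothetical choices $f(I, \phi) = -I(1 + \cos\phi)$, $\omega = 1$, and $g = 0$, none of which come from the text. For this $f$ the torus average is $\bar{f}(J) = -J$, so the averaged solution is $J(t) = I(0) e^{-\epsilon t}$. The hand-written Runge-Kutta integrator and the function names are also illustrative.

```python
import math

def rk4_step(rhs, state, t, dt):
    """One classical fourth-order Runge-Kutta step for a list-valued state."""
    k1 = rhs(t, state)
    k2 = rhs(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k1)])
    k3 = rhs(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k2)])
    k4 = rhs(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6 * (p + 2 * q + 2 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def max_averaging_error(eps, I_init=1.0, phi_init=0.3, dt=0.01):
    """Largest |I(t) - J(t)| on [0, 1/eps] for the toy system described below."""
    # Toy perturbed system (an assumption, not from the text):
    #   dI/dt = eps * f(I, phi) with f = -I * (1 + cos(phi)),  dphi/dt = 1
    def rhs(t, state):
        I, phi = state
        return [-eps * I * (1.0 + math.cos(phi)), 1.0]

    state, t, worst = [I_init, phi_init], 0.0, 0.0
    for _ in range(int(round(1.0 / eps / dt))):
        state = rk4_step(rhs, state, t, dt)
        t += dt
        # Averaged system: f_bar(J) = -J, hence J(t) = I(0) * exp(-eps * t)
        J = I_init * math.exp(-eps * t)
        worst = max(worst, abs(state[0] - J))
    return worst

err_coarse = max_averaging_error(eps=0.02)
err_fine = max_averaging_error(eps=0.01)
print(err_coarse, err_fine, err_coarse / err_fine)
```

Halving $\epsilon$ should roughly halve the maximum deviation over $[0, 1/\epsilon]$, consistent with $I = J + O(\epsilon)$ in the single-phase case. This checks one example only and says nothing about $n > 1$, where the text notes the error can be larger.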
The typical averaging-type theorem states that, in some restricted region of parameters,
J (t) remains close (in a defined way) to the exact solution I(t) (with any initial phase) on
a certain time interval (such as $[0, 1/\epsilon]$). Lochak and Meunier [1] discuss many averaging
theorems, for the special cases of n = 1 and n = 2 as well as for any n. (The theorems for
n = 1, 2 are stronger than those for general n.) Often we cannot use the strict theorems
(not given here) because the system does not meet the conditions for the theorems, or we
cannot tell whether it does or not.
I. ASIDE
Here’s a more explicit way to show that $J \approx I$. Consider the time average of (8). We will want to drop any terms that are higher order in $\epsilon$. For the term $dI_0/d\lambda$:
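$$\left\langle \frac{dI_0}{d\lambda} \right\rangle_\tau = \frac{1}{\tau} \int_t^{t+\tau} \frac{dI_0}{d\lambda}\, dt' \tag{15}$$
$$= \frac{1}{\tau} \int_t^{t+\tau} dt' \left[ \left.\frac{dI_0}{d\lambda}\right|_{\lambda=\epsilon t} + \frac{d}{dt}\left.\frac{dI_0}{d\lambda}\right|_{\lambda=\epsilon t} (t' - t) + \ldots \right] \tag{16}$$
$$= \frac{dI_0}{d\lambda} + \frac{\epsilon}{\tau} \int_t^{t+\tau} dt' \left.\frac{d^2 I_0}{d\lambda^2}\right|_{\lambda=\epsilon t} (t' - t) + O(\epsilon^2) \tag{17}$$
$$= \frac{dI_0}{d\lambda} + O(\epsilon) \tag{18}$$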
For the term ∂I1 /∂t:
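$$\left\langle \frac{\partial I_1}{\partial t} \right\rangle_\tau = \frac{1}{\tau} \int_t^{t+\tau} dt' \left[ \frac{dI_1}{dt'} - \epsilon \frac{\partial I_1}{\partial \lambda} \right] \tag{19}$$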
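The second term on the RHS can be dropped because of the factor of $\epsilon$. The first term integrates to the form $(h(t+\tau) - h(t))/\tau$, which is much less than 1 because $I_1$ is non-secular. Thus $\langle \partial I_1/\partial t \rangle_\tau \ll \langle dI_0/d\lambda \rangle_\tau$. Thus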
$$\frac{d\langle I \rangle_\tau}{dt} \approx \epsilon \frac{dI_0}{d\lambda} \tag{20}$$
The corrections to this are either of order $\epsilon^2$ or (loosely speaking) $\epsilon/\tau$. Using (11) and ergodicity, the property (which we are assuming holds for our system) of $\phi$ averages being equal to long time averages, we get
$$\frac{dI_0}{d\lambda} \approx \Big\langle f(I_0(\lambda), \phi_0(\lambda, t), 0) \Big\rangle_\tau \approx (2\pi)^{-n} \int f(I_0, \phi, 0)\, d\phi_1 \ldots d\phi_n = \bar{f}(I_0) \tag{21}$$
Using $\lambda = \epsilon t$ gives
$$\frac{dI_0}{dt} \approx \epsilon \bar{f}(I_0) \tag{22}$$
From (6) we have
$$I(0) = I_0(0) + O(\epsilon) \tag{23}$$
(where the argument is time). Comparing the above two equations with (12) and (14) we see $J \approx I_0$. Again, (6) says that $I = I_0 + O(\epsilon)$, so we have $J \approx I$.
[1] P. Lochak and C. Meunier, Multiphase Averaging for Classical Systems: with Applications to
Adiabatic Theorems, Springer-Verlag, New York, 1988.
[2] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd Ed., Springer-Verlag, New
York, 1989.