II. Extensions
© 2015, Philip D Loewen
A. Vector-Valued Arcs
Everything works just like it does for scalar-valued arcs. With careful management of notation, the main results even have the same appearance. So let us consider $\mathbb{R}^n$ as the vector space of $n \times 1$ matrices (column vectors, like $[\,\vdots\,]$), and write $(\mathbb{R}^n)^*$ for the vector space of $1 \times n$ matrices (row vectors, like $[\cdots]$). We will use square brackets when being completely precise, and allow round ones to simplify the expression of points in $\mathbb{R}^n$:
\[
(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n
\quad\text{encodes}\quad
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n .
\]
We will use no decorations on elements $x \in \mathbb{R}^n$ (avoiding $\mathbf{x}$, $\vec{x}$, etc.), and keep using the hat accent just for identification, not as a symbol for a unit vector.
Use $C^1([a,b];\mathbb{R}^n)$ as our notation for the basic arc space when $n \ge 2$, reverting to $C^1[a,b]$ when $n = 1$. Define $V_{MN}$, DLM, etc., just as in the case $n = 1$.
Gradients versus Derivatives. For any scalar-valued function Φ: X → R and point
of interest x ∈ X, the symbol DΦ[x] denotes an operator intended to approximate
the action of Φ near x. That is, DΦ[x] is itself a function that takes inputs from the
space X and gives out real numbers. In abstract notation, evaluating DΦ[x] at some
point h ∈ X looks (and is defined) like this:
\[
D\Phi[x](h) = \Phi'[x;h] \overset{\text{def}}{=} \lim_{\lambda\to 0} \frac{\Phi[x+\lambda h] - \Phi[x]}{\lambda} .
\]
In calculus, X = Rn and the name f is more common than Φ. Linearization for f
near x looks like this:
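\[
\begin{aligned}
f(x+h) &\approx f(x) + \frac{\partial f}{\partial x_1} h_1 + \cdots + \frac{\partial f}{\partial x_n} h_n \\
&\approx f(x) +
\begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \\ \vdots \\ h_n \end{bmatrix},
\qquad \text{for } h \approx 0 \text{ in } \mathbb{R}^n .
\end{aligned}
\]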
To write f (x + h) ≈ f (x) + Df (x)h requires interpreting Df (x) as the row vector
shown above. Our subscript notation fx (x) = Df (x) will match this.
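The row-vector interpretation of Df(x) can be checked numerically. The following is a minimal sketch, assuming a test function f on R² chosen purely for illustration (it is not from the notes): it compares the difference quotient defining the directional derivative with the product of the row vector Df(x) and the column vector h.

```python
import math

def f(x):
    # hypothetical test function f: R^2 -> R, chosen only for illustration
    return x[0] ** 2 * x[1] + math.sin(x[1])

def Df(x):
    # row vector of partial derivatives [df/dx1, df/dx2], computed by hand
    return [2.0 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]

def directional_derivative(func, x, h, lam=1e-6):
    # symmetric difference quotient approximating the limit as lam -> 0
    xp = [xi + lam * hi for xi, hi in zip(x, h)]
    xm = [xi - lam * hi for xi, hi in zip(x, h)]
    return (func(xp) - func(xm)) / (2.0 * lam)

x = [1.0, 2.0]
h = [0.3, -0.5]
row_times_column = sum(d * hi for d, hi in zip(Df(x), h))  # Df(x) h
quotient = directional_derivative(f, x, h)
print(row_times_column, quotient)
```

The two printed numbers should agree to many digits, reflecting DΦ[x](h) = Df(x)h.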
The Gradient. To say, as in calculus, that the vector v = ∇f (x) tells the direction
to move from x in order to increase f -values most rapidly, the expression x + λv
must make sense for real scalars λ. So v must have the same shape as x—that is,
File “extns”, version of 23 January 2015, page 1.
Typeset at 12:29 January 23, 2015.
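$v = \nabla f(x)$ must be a column vector. So our $f_x(x) = Df[x]$ is not quite identical to $\nabla f(x)$: to be completely precise,
\[
Df[x] = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix} = (\nabla f(x))^T .
\]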
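Basic Problem. A real interval $[a,b]$ is given, along with a $C^1$ function $L\colon [a,b] \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ and points $A, B \in \mathbb{R}^n$.
\[
\min\left\{ \Lambda[x] = \int_a^b L(t, x(t), \dot x(t))\,dt \;:\; x(a) = A,\ x(b) = B \right\}.
\]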
Theorem. Suppose the arc $\hat x \in C^1([a,b];\mathbb{R}^n)$ gives a DLM (rel. $V_{II}$) above. Then
(a) There is a constant $c \in (\mathbb{R}^n)^*$ such that
\[
\hat L_v(t) = c + \int_a^t \hat L_x(r)\,dr \qquad \forall t \in [a,b]. \tag{IEL}
\]
(b) So $\hat L_v$ is a function in $C^1([a,b];(\mathbb{R}^n)^*)$ satisfying
\[
\frac{d}{dt}\hat L_v(t) = \hat L_x(t) \qquad \forall t \in [a,b]. \tag{DEL}
\]
(c) If $L \in C^2$ on some open set containing $(t_0, \hat x(t_0), \dot{\hat x}(t_0))$ and the $n \times n$ matrix $\hat L_{vv}(t_0)$ is invertible, then $\hat x$ is $C^2$ on some open interval containing $t_0$.
(d) If $\hat x \in C^2([a,b];\mathbb{R}^n)$ and $L \in C^2$, then
\[
\frac{d}{dt}\left[ \hat L(t) - \hat L_v(t)\,\dot{\hat x}(t) \right] = \hat L_t(t) \qquad \forall t \in [a,b]. \tag{WE2}
\]
Here (DEL) is an equation between row vectors of length n. Writing out the
component equations gives a system of n ODE’s in n unknown functions. However,
(WE2) is just a single scalar differential equation . . . certain to be inadequate to
provide a unique solution when n ≥ 2.
Kepler’s Problem. For central-force motion in polar coordinates, a particle of mass m, generalized position (r, θ), and generalized velocity (ṙ, θ̇), has kinetic and potential energies given by
\[
\text{KE:}\quad T = \tfrac12 m v^2 = m\left( \tfrac12 \dot r^2 + \tfrac12 r^2 \dot\theta^2 \right),
\qquad
\text{PE:}\quad V = -\frac{Km}{r} .
\]
The Principle of Least Action says that real objects move along paths in space that minimize (at least over small time intervals) the “Action Integral” below, in which the input arcs have the $\mathbb{R}^2$-valued form $x(t) = \begin{bmatrix} r(t) \\ \theta(t) \end{bmatrix}$:
\[
A[x] = \int_a^b (T - V)\,dt = \int_a^b m\left( \tfrac12 \dot r(t)^2 + \tfrac12 r(t)^2 \dot\theta(t)^2 + \frac{K}{r(t)} \right) dt .
\]
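Ignoring the constant $m > 0$, write $x = (r, \theta) \in \mathbb{R}^2$ and $v = (u, \omega) \in \mathbb{R}^2$. Then the integrand above is built by evaluating this function along arcs in $\mathbb{R}^2$:
\[
L(t, x, v) = \tfrac12 u^2 + \tfrac12 r^2 \omega^2 + \frac{K}{r} .
\]
Note that
\[
\begin{aligned}
L_t(t,x,v) &= 0, \\
L_x(t,x,v) &= \begin{bmatrix} \dfrac{\partial L}{\partial r} & \dfrac{\partial L}{\partial \theta} \end{bmatrix} = \begin{bmatrix} r\omega^2 - \dfrac{K}{r^2} & 0 \end{bmatrix}, \\
L_v(t,x,v) &= \begin{bmatrix} \dfrac{\partial L}{\partial u} & \dfrac{\partial L}{\partial \omega} \end{bmatrix} = \begin{bmatrix} u & r^2\omega \end{bmatrix}.
\end{aligned}
\]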
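Any action-minimizing trajectory $x(t) = (r(t), \theta(t))$ must satisfy (DEL), namely,
\[
\frac{d}{dt}\begin{bmatrix} \dot r & r^2\dot\theta \end{bmatrix} = \begin{bmatrix} r\dot\theta^2 - \dfrac{K}{r^2} & 0 \end{bmatrix}.
\]
This leads to the system of 2 second-order equations in 2 variables,
\[
\ddot r(t) = r\dot\theta^2 - \frac{K}{r^2}, \qquad r^2\dot\theta = \text{const}.
\]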
The second equation shows conservation of angular momentum (and leads to Kepler’s Second Law). Combining it with the first leads to Kepler’s other two laws of planetary motion . . . Physics courses show how. In this problem we have $L_t = 0$, so (WE2) implies that the following quantity is constant:
\[
\hat L(t) - \hat L_v(t)\,\dot{\hat x}(t)
= \tfrac12 \dot r^2 + \tfrac12 r^2\dot\theta^2 + \frac{K}{r}
- \begin{bmatrix} \dot r & r^2\dot\theta \end{bmatrix}\begin{bmatrix} \dot r \\ \dot\theta \end{bmatrix}
= -\tfrac12 \dot r^2 - \tfrac12 r^2\dot\theta^2 + \frac{K}{r}
= -(T + V).
\]
Again the conservation of total energy arises from a variational principle.
////
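The two conserved quantities in Kepler's problem can be observed numerically. The following is a rough sketch, assuming K = 1, unit mass, an illustrative initial condition, and a classical Runge-Kutta step (the integrator and all numerical values are choices made here, not part of the notes). It integrates r̈ = rθ̇² − K/r² together with d/dt(r²θ̇) = 0, written as θ̈ = −2ṙθ̇/r, and monitors T + V and r²θ̇.

```python
K = 1.0  # force constant (illustrative value)

def rhs(state):
    # state = (r, rdot, theta, thetadot)
    r, rdot, theta, thetadot = state
    return (rdot,
            r * thetadot ** 2 - K / r ** 2,   # radial equation from (DEL)
            thetadot,
            -2.0 * rdot * thetadot / r)       # from d/dt (r^2 thetadot) = 0

def rk4_step(state, dt):
    # one classical fourth-order Runge-Kutta step
    def shifted(s, k, c):
        return tuple(si + c * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state):
    # T + V per unit mass
    r, rdot, _, thetadot = state
    return 0.5 * rdot ** 2 + 0.5 * r ** 2 * thetadot ** 2 - K / r

def angular_momentum(state):
    r, _, _, thetadot = state
    return r ** 2 * thetadot

state = (1.0, 0.0, 0.0, 1.2)  # mildly eccentric bound orbit (assumed data)
E0, h0 = energy(state), angular_momentum(state)
for _ in range(5000):
    state = rk4_step(state, 1e-3)
energy_drift = abs(energy(state) - E0)
momentum_drift = abs(angular_momentum(state) - h0)
print(energy_drift, momentum_drift)
```

Both drifts should be tiny compared with the quantities themselves, which is what (WE2) and the second Euler equation predict for the exact solution.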
B. Parametric Curves
Sometimes the independent variable t is a physically meaningful parameter (as
in Kepler’s problem above). However, the same letter is often used as a generic
parameter name in geometric problems where only the shape of the curve in Rn is
of interest, and the details of its parametrization make no difference. In general, we want to deal with parametric curves $x\colon [t_0, t_1] \to \mathbb{R}^n$ in $C^1([t_0,t_1];\mathbb{R}^n)$; in what follows we enforce smoothness by requiring
\[
0 \ne |\dot x(t)|^2 = \left(\frac{dx_1}{dt}\right)^2 + \left(\frac{dx_2}{dt}\right)^2 + \cdots + \left(\frac{dx_n}{dt}\right)^2, \qquad \forall t \in [t_0, t_1].
\]
To illustrate, suppose $n = 2$. At any parameter value $\tau \in [t_0, t_1]$ the vector $\dot x(\tau)$ (if nonzero) is tangent to the curve at the point $x(\tau)$; the slope of the curve at that point equals the slope of $\dot x(\tau) = (\dot x_1(\tau), \dot x_2(\tau))$, namely,
\[
\left.\frac{dx_2}{dx_1}\right|_{t=\tau} = \frac{\dot x_2(\tau)}{\dot x_1(\tau)} .
\]
(Picture.) In a parametric description where $x(t_0) = a$ and $x(t_1) = b$, the ordinary variational integral $\int_a^b L_0(x, y(x), y'(x))\,dx$ would correspond to
\[
I \overset{\text{def}}{=} \int_{t=t_0}^{t_1} L_0\!\left( x(t), y(t), \frac{\dot y(t)}{\dot x(t)} \right) \dot x(t)\,dt .
\]
(For calculation purposes, this looks just like integration by substitution, where we “substitute x = x(t)”.) To prove that this expression is unaffected by changes in parametrization of the curve, suppose $\phi\colon [r_0, r_1] \to \mathbb{R}$ is differentiable with $\phi'(r) > 0$ always and $\phi(r_0) = t_0$, $\phi(r_1) = t_1$. Then the mapping
\[
r \mapsto (\tilde x(r), \tilde y(r)) \overset{\text{def}}{=} (x(\phi(r)), y(\phi(r))), \qquad r \in [r_0, r_1],
\]
traces the same points of $\mathbb{R}^2$ in the same order as the original curve, and its integral value is
\[
\tilde I \overset{\text{def}}{=} \int_{r=r_0}^{r_1} L_0\!\left( \tilde x(r), \tilde y(r), \frac{\dot{\tilde y}(r)}{\dot{\tilde x}(r)} \right) \dot{\tilde x}(r)\,dr .
\]
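Let $t = \phi(r)$ in $I$: since
\[
\dot{\tilde x}(r) = \dot x(\phi(r))\,\dot\phi(r), \qquad \dot{\tilde y}(r) = \dot y(\phi(r))\,\dot\phi(r), \qquad r \in [r_0, r_1],
\]
we get
\[
\begin{aligned}
I &= \int_{r=r_0}^{r_1} L_0\!\left( x(\phi(r)), y(\phi(r)), \frac{\dot y(\phi(r))}{\dot x(\phi(r))} \right) \dot x(\phi(r))\,\dot\phi(r)\,dr \\
&= \int_{r=r_0}^{r_1} L_0\!\left( \tilde x(r), \tilde y(r), \frac{\dot{\tilde y}(r)}{\dot{\tilde x}(r)} \right) \dot{\tilde x}(r)\,dr = \tilde I .
\end{aligned}
\]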
This calculation shows that the integrand
\[
L\!\left( \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} v \\ w \end{bmatrix} \right) = L_0(x, y, w/v)\,v \tag{0}
\]
leads to a functional
\[
\Lambda\begin{bmatrix} x \\ y \end{bmatrix} = \int_{t_0}^{t_1} L\!\left( \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}, \begin{bmatrix} \dot x(t) \\ \dot y(t) \end{bmatrix} \right) dt
\]
that assigns the same numerical value to a given geometric curve regardless of its parametric description.
In more general situations, where x denotes a parametric arc in $\mathbb{R}^n$, any integrand $L\colon \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ with the property
\[
L(x, rv) = r\,L(x, v) \qquad \forall r > 0,\ \forall x, v \in \mathbb{R}^n \tag{1}
\]
will give a functional $\Lambda[x] = \int_{t_0}^{t_1} L(x(t), \dot x(t))\,dt$ whose value is the same for every parametrization of the input curve. (Notice that property (1) holds for any integrand defined using (0).) The converse also holds, giving . . .
Theorem. Let $L = L(t, x, v)$ be of class $C^1$ on $\mathbb{R} \times \mathbb{R}^n \times \mathbb{R}^n$. The corresponding functional $\Lambda[x] = \int_a^b L(t, x(t), \dot x(t))\,dt$ gives the same value to every parametric
description of the same arc if and only if both
(i) L has no direct t-dependence, i.e., Lt (t, x, v) = 0 everywhere, and
(ii) L satisfies the homogeneity condition (1).
Proof. Easy, but details are given by Bliss, Lectures on the COV.
////
Euler’s Theorem on Homogeneous Functions. Taking ∂/∂r in (1) gives
Lv (x, rv)v = L(x, v).
Substituting r = 1 yields a useful identity, which can be differentiated further: for
all x, v,
\[
L_v(x, v)\,v = L(x, v), \qquad v^T L_{vx}(x, v) = L_x(x, v), \qquad v^T L_{vv}(x, v) = 0. \tag{2}
\]
Hence condition (WE2) is completely redundant in parametric problems (don’t waste your time), and the matrix $L_{vv}$ is guaranteed to be singular everywhere (so the Weierstrass/Hilbert Theorem can’t be applied directly as written). Furthermore, the equations making up (DEL) are linearly dependent: in vector notation, an arbitrary parametric curve x will obey
\[
\left[ \frac{d}{dt} L_v(x(t), \dot x(t)) - L_x(x(t), \dot x(t)) \right] \dot x(t) = 0 \qquad \text{for all } t.
\]
(Practice: Derive this from (2).) This is to be expected: since the integral shows no
preference for one parametrization over another, the system of Euler equations for the
solution arc should leave us one degree of freedom to choose whichever parametrization is convenient.
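The homogeneity condition (1) and the first identity in (2) are easy to test numerically. The following is a minimal sketch, assuming a weighted-arclength integrand L(x, v) = (1 + x₁²)|v| on R² picked only for illustration. It checks L(x, rv) = rL(x, v), Euler's identity L_v(x, v)v = L(x, v), and the fact that L_v is homogeneous of degree 0 in v (the integrated form of L_vv v = 0), using central differences for L_v.

```python
import math

def L(x, v):
    # hypothetical parametric integrand: weighted arclength, degree-1 homogeneous in v
    return (1.0 + x[0] ** 2) * math.hypot(v[0], v[1])

def L_v(x, v, eps=1e-6):
    # row vector [dL/dv1, dL/dv2] approximated by central differences
    grad = []
    for i in range(2):
        vp = list(v)
        vm = list(v)
        vp[i] += eps
        vm[i] -= eps
        grad.append((L(x, vp) - L(x, vm)) / (2.0 * eps))
    return grad

x = [0.5, -1.0]
v = [3.0, 4.0]
r = 2.5
rv = [r * vi for vi in v]

homogeneity_gap = abs(L(x, rv) - r * L(x, v))                          # property (1)
euler_gap = abs(sum(g * vi for g, vi in zip(L_v(x, v), v)) - L(x, v))   # L_v v = L
degree_zero_gap = max(abs(a - b) for a, b in zip(L_v(x, rv), L_v(x, v)))
print(homogeneity_gap, euler_gap, degree_zero_gap)
```

All three gaps should be at the level of rounding and finite-difference error.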
C. Piecewise Smooth Arcs
Overview. Allowing arcs with corners to compete for the minimum in a COV problem increases the chances that a minimizer exists.
Infimum versus Minimum. Let S ⊆ R, S ≠ ∅. The notation m = min S = min(S)
is reserved for the unique real number m with these two properties:
(i) m ≤ s,
∀s ∈ S;
(ii) m ∈ S.
Strictly speaking, then, the following symbol is undefined:
min {t ∈ R : t > 0} .
(The set S = (0, +∞) contains no m with properties (i)–(ii).)
The notation µ = inf S = inf(S) is reserved for the unique element of R ∪ {−∞}
with these two properties:
(i) µ ≤ s, ∀s ∈ S, and
(ii) S contains a sequence (sn ) such that sn → µ.
E.g., inf {t ∈ R : t > 0} = 0.
When $S = \{\Phi(x) : x \in X\}$ for some function $\Phi\colon X \to \mathbb{R}$, we write
\[
\inf(S) = \inf_{x \in X} \Phi(x).
\]
Any sequence $(s_n)$ as in (ii) must have the form $s_n = \Phi(x_n)$ for some $x_n \in X$; we then call $(x_n)$ a minimizing sequence for Φ. If
\[
m = \min(S) = \min_{x \in X} \Phi(x)
\]
happens to exist, then the condition $m \in S$ forces $m = \Phi(\hat x)$ for some $\hat x \in X$: then $\hat x$ is a minimizing point for Φ.
Example. Let $S = \{ x \in C^1[-1,1] : x(-1) = 0,\ x(1) = 1 \}$, and solve
\[
\min_{x \in S} \Lambda[x] := \int_{-1}^{1} x(t)^2 (\dot x(t) - 1)^2\,dt .
\]
Clearly $\Lambda[x] \ge 0$ for all arcs x. Indeed, to get $\Lambda[x] = 0$ would require either $x(t) = 0$ or $\dot x(t) = 1$ at each t, and there is no $x \in S$ with this property, so $\Lambda[x] > 0$ for all $x \in S$. Of course, $\Lambda[\hat x] = 0$ for
\[
\hat x(t) = \begin{cases} 0, & \text{if } -1 \le t < 0, \\ t, & \text{if } 0 \le t \le 1, \end{cases}
\]
but $\hat x \notin S$ because $\dot{\hat x}(0)$ does not exist. A sequence of smooth arcs approximating $\hat x$ is obtained by defining
\[
v_k(t) = \begin{cases} 0, & -1 \le t < -\tfrac1k, \\ \tfrac{k}{2}\left(t + \tfrac1k\right), & -\tfrac1k \le t < \tfrac1k, \\ 1, & \tfrac1k \le t \le 1, \end{cases}
\qquad
x_k(t) = \int_{-1}^{t} v_k(r)\,dr = \begin{cases} 0, & -1 \le t < -\tfrac1k, \\ \tfrac{k}{4}\left(t + \tfrac1k\right)^2, & -\tfrac1k \le t < \tfrac1k, \\ t, & \tfrac1k \le t \le 1. \end{cases}
\]
(Pictures.) It’s easy to check that each $x_k \in C^1[-1,1]$, and that
\[
\Lambda[x_k] = \int_{-1/k}^{1/k} x_k(t)^2 \left[ v_k(t) - 1 \right]^2 dt \to 0 \qquad \text{as } k \to \infty.
\]
It follows that $\inf_S \Lambda = 0$, while $\min_S \Lambda$ does not exist. If we could allow inputs like $\hat x$, the minimum would exist and equal 0.
////
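The decay of the costs along the smooth arcs x_k can be seen numerically. The following is a small sketch using a midpoint rule (the quadrature and the grid size are choices made here). Substituting u = k(t + 1/k)/2 in the integral over [−1/k, 1/k] turns it into (2/k³)∫₀¹ u⁴(1 − u)² du = 2/(105k³), so the code compares against that closed form.

```python
def v_k(t, k):
    # derivative of the smoothed arc
    if t < -1.0 / k:
        return 0.0
    if t < 1.0 / k:
        return 0.5 * k * (t + 1.0 / k)
    return 1.0

def x_k(t, k):
    # smoothed arc: zero, then a parabolic blend, then the line x = t
    if t < -1.0 / k:
        return 0.0
    if t < 1.0 / k:
        return 0.25 * k * (t + 1.0 / k) ** 2
    return t

def cost(k, n=200000):
    # midpoint rule for the integral of x^2 (xdot - 1)^2 over [-1, 1]
    dt = 2.0 / n
    total = 0.0
    for i in range(n):
        t = -1.0 + (i + 0.5) * dt
        total += x_k(t, k) ** 2 * (v_k(t, k) - 1.0) ** 2
    return total * dt

values = {k: cost(k) for k in (1, 2, 4, 8)}
exact = {k: 2.0 / (105.0 * k ** 3) for k in values}
for k in values:
    print(k, values[k], exact[k])
```

Each cost is strictly positive, yet the values shrink like 1/k³, consistent with an infimum of 0 that no arc in S attains.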
Use “smooth” as a synonym for “continuously differentiable (once).” Then call a function $x\colon [a,b] \to \mathbb{R}$ piecewise smooth if it is a finite end-to-end concatenation of smooth arcs. Formally, say $x \in PWS[a,b]$ if and only if x is continuous on $[a,b]$ and there exist $N \in \mathbb{N}$ and numbers $a = t_0 < t_1 < t_2 < \cdots < t_N = b$ such that $x \in C^1[t_{i-1}, t_i]$ for each $i = 1, 2, \ldots, N$. As before, $x \in C^1[t_{i-1}, t_i]$ requires these two one-sided limits to exist finitely for each $i = 1, \ldots, N$:
\[
\dot x(t_{i-1}^+) = \lim_{t \to t_{i-1}^+} \dot x(t), \qquad \dot x(t_i^-) = \lim_{t \to t_i^-} \dot x(t).
\]
Thus the job of the partition points t0 , . . . , tN is to cover endpoints and corner
points in the graph of x.
Write $PWS([a,b];\mathbb{R}^n)$ for the set of arcs for which each component is piecewise smooth. For any such arc there will be a finite subset of $[a,b]$ covering all points where any component has a jump in its derivative.
Terminology Upgrade. From now on, “arc” means “function in PWS”.
For any continuous Lagrangian $L = L(t, x, v)$ and arc x on $[a,b]$, the function $t \mapsto L(t, x(t), \dot x(t))$ is piecewise continuous. Such a function is certainly (Riemann) integrable. Indeed, if the partition $t_0 < \cdots < t_N$ covers the corners of x, then
\[
\Lambda[x] = \int_a^b L(t, x(t), \dot x(t))\,dt = \sum_{i=1}^{N} \int_{t_{i-1}}^{t_i} L(t, x(t), \dot x(t))\,dt
\]
shows how to split the evaluation of Λ into a sum of integrals involving continuous functions.
Note: If $x \in PWS[a,b]$ obeys $x(a) = A$ and $x(b) = B$, then we still get
\[
\int_a^b \dot x(t)\,dt = x(b) - x(a) = B - A.
\]
Indeed, choose any partition $t_0 < \cdots < t_N$ that covers the corners of x, and telescope the sum:
\[
\int_a^b \dot x(t)\,dt = \sum_{i=1}^{N} \int_{t_{i-1}}^{t_i} \dot x(t)\,dt = \sum_{i=1}^{N} \left[ x(t_i) - x(t_{i-1}) \right] = x(t_N) - x(t_0) = x(b) - x(a).
\]
Theorem (The Fairing Theorem, Troutman Prop 7.6). Under the conditions stated above,
\[
\inf_{x \in C^1} \{ \Lambda[x] : x(a) = A,\ x(b) = B \} = \inf_{x \in PWS} \{ \Lambda[x] : x(a) = A,\ x(b) = B \}. \tag{$*$}
\]
Proof. (Sketch.) Inequality ≥ in (∗) is obvious, since the arcs competing for minimality on the right include all the arcs allowed on the left and more besides. To
prove the reverse inequality, choose an arbitrary ε > 0. For any $z \in PWS$ satisfying the endpoint conditions, surround each corner point of z in $(a,b)$ with a small open interval; write Ω for the union of these intervals. Carefully modify z inside Ω to smooth out its corners, thereby producing an arc $y \in C^1$ such that
\[
\dot y(t) = \dot z(t) \quad \forall t \in [a,b] \setminus \Omega, \qquad \sup_{\Omega \setminus \{\text{corners}\}} |\dot y - \dot z| < +\infty.
\]
By controlling both the length of Ω and the worst-case discrepancy $\sup_{[a,b]} |y - z|$, we can arrange that
\[
y(a) = z(a), \quad y(b) = z(b), \quad \text{and} \quad \Lambda[z] \ge \Lambda[y] - \varepsilon \ge \left( \inf_{y \in C^1} \Lambda[y] \right) - \varepsilon.
\]
Here the rightmost expression is independent of z, and the same argument works for any z, so
\[
\inf_{z \in PWS} \Lambda[z] \ge \inf_{y \in C^1} \Lambda[y] - \varepsilon.
\]
Since this last inequality holds for arbitrary ε > 0, we have
\[
\inf_{z \in PWS} \Lambda[z] \ge \inf_{y \in C^1} \Lambda[y],
\]
as required.
////
Costs and Benefits of Using PWS.
1. Allowing corners makes existence of minimizers more likely. (Benefit.)
2. Allowing corners doesn’t affect the infimum value. (Benefit.)
3. Cornered variations are technically convenient. (Benefit.)
4. Familiar necessary conditions can be extended to allow corners, but this takes
work and careful interpretation. (Cost.)
D. Piecewise Smooth Directional Local Minimizers
Cornered Variations. Upgrade variation spaces with notation like
\[
V_{II} = \{ h \in PWS[a,b] : h(a) = 0 = h(b) \}.
\]
There are lots of piecewise linear arcs in this space that are easier to calculate with than their smooth approximants.
Principle of Optimality. Suppose $\hat x$ gives a DLM relative to $V_{II}$ in (P):
\[
\min\left\{ \int_a^b L(t, x(t), \dot x(t))\,dt \;:\; x \in PWS,\ x(a) = A,\ x(b) = B \right\}.
\]
Pick any subinterval $[\alpha, \beta] \subseteq [a,b]$ and use the corresponding points on $\operatorname{gph}(\hat x)$ as endpoint targets in a new problem:
\[
\min\left\{ \int_\alpha^\beta L(t, x(t), \dot x(t))\,dt \;:\; x \in C^1,\ x(\alpha) = \hat x(\alpha),\ x(\beta) = \hat x(\beta) \right\}.
\]
Clearly, the restriction of $\hat x$ to $[\alpha, \beta]$ gives a DLM in this new problem. (Suppose not: if some $\hat y$ provides a lower cost over $[\alpha, \beta]$, then splicing $\hat x$ to $\hat y$ to $\hat x$ would produce a lower cost over the full interval $[a,b]$. End-to-end splicing preserves the defining property of PWS.)
Necessary Conditions. Let $\hat x, h \in PWS[a,b]$. By splitting the whole calculation into subintervals where both $\hat x$ and h are $C^1$, the same calculation done before will prove
\[
\Lambda'[\hat x; h] = \left( \int_a^b \hat L_x(r)\,dr \right) h(b) + \int_a^b \left( \hat L_v(t) - \int_a^t \hat L_x(r)\,dr \right) \dot h(t)\,dt .
\]
Now suppose $\hat x$ gives a DLM in problem (P) and has exactly one corner point, say at $\theta \in (a,b)$. Apply the Principle of Optimality on $[a, \theta]$ to see that $\hat x$ solves a version of (P) there, so IEL holds:
\[
\exists c_1 : \quad \hat L_v(t) = c_1 + \int_a^t \hat L_x(r)\,dr \qquad \forall t \in [a, \theta). \tag{$\dagger$}
\]
Likewise for $[\theta, b]$:
\[
\exists d_1 : \quad \hat L_v(t) = d_1 + \int_\theta^t \hat L_x(r)\,dr \qquad \forall t \in (\theta, b].
\]
Define $c_2 = d_1 - \int_a^\theta \hat L_x(r)\,dr$ to get
\[
\hat L_v(t) = c_2 + \int_a^t \hat L_x(r)\,dr \qquad \forall t \in (\theta, b]. \tag{$\ddagger$}
\]
Combine (†)–(‡) to get
\[
\hat L_v(t) - \int_a^t \hat L_x(r)\,dr = \begin{cases} c_1, & \text{if } t \in [a, \theta), \\ c_2, & \text{if } t \in (\theta, b]. \end{cases} \tag{$**$}
\]
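Now $\hat x$ solves (P), so our general theory says that for every $h \in V_{II}$,
\[
\begin{aligned}
0 = \Lambda'(\hat x; h) &= \int_a^b \left( \hat L_v(t) - \int_a^t \hat L_x(r)\,dr \right) \dot h(t)\,dt \\
&= \int_a^\theta [c_1]\,\dot h(t)\,dt + \int_\theta^b [c_2]\,\dot h(t)\,dt = h(\theta)[c_1 - c_2].
\end{aligned}
\]
Since $h \in V_{II}$ is arbitrary, this forces $c_1 = c_2$. Hence, by (∗∗), the constant $c = c_1 = c_2$ obeys
\[
\hat L_v(t) = c + \int_a^t \hat L_x(r)\,dr \qquad \forall t \in [a, \theta) \cup (\theta, b].
\]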
Similar arguments can be made for any finite number of corner points, with the
following result.
Theorem. If $L \in C^1$ and $\hat x \in PWS[a,b]$ gives a DLM in problem (P), then there is a constant c such that for all $t \in [a,b]$ where $\dot{\hat x}$ is continuous,
\[
\hat L_v(t) = c + \int_a^t \hat L_x(r)\,dr. \tag{IEL}
\]
Regularity Bonus. The function of t on the RHS of (IEL) is continuous on $[a,b]$, even if $\hat x$ has corners. Consequently all discontinuities of the function $\hat L_v(t)$ are removable: the one-sided limits obey
\[
\hat L_v(t^-) = \hat L_v(t^+) \qquad \forall t \in (a,b). \tag{WE1}
\]
Example. In the motivating example of Section C above, $\hat x(t) = \max\{0, t\}$ and $L(t, x, v) = x^2 (v - 1)^2$. Calculation gives $L_v = 2x^2(v - 1)$ and
\[
\dot{\hat x}(t) = \begin{cases} 0, & \text{if } -1 \le t < 0, \\ \text{undefined}, & \text{if } t = 0, \\ 1, & \text{if } 0 < t \le 1. \end{cases}
\]
Hence
\[
\hat L_v(t) = 2\hat x(t)^2 (\dot{\hat x}(t) - 1) = \begin{cases} 0, & \text{if } -1 \le t < 0, \\ \text{undefined}, & \text{if } t = 0, \\ 0, & \text{if } 0 < t \le 1. \end{cases}
\]
The graph of $t \mapsto \hat L_v(t)$ looks like a continuous function punctuated by at most finitely many holes where the function is undefined.
////
Extremality. Any $\hat x \in PWS$ obeying (IEL) on an open interval, with finitely many exceptions, is called an extremal for L. At the non-exceptional points, differentiation gives
\[
\frac{d}{dt} \hat L_v(t) = \hat L_x(t). \tag{DEL}
\]
But splicing solutions of (DEL) across points where $\hat x$ has corners must be done with care: for arcs in PWS, we have
\[
\text{(IEL)} \iff \text{(DEL) and (WE1)}.
\]
Geometrical Interpretation. Suppose $L \in C^1$ and some extremal $\hat x$ has a corner point at $(t_0, x_0)$. Let $u = \dot{\hat x}(t_0^-)$ and $w = \dot{\hat x}(t_0^+)$. Then $u \ne w$, but by (WE1),
\[
L_v(t_0, x_0, u) = \lim_{t \to t_0^-} L_v(t, \hat x(t), \dot{\hat x}(t)) = \lim_{t \to t_0^+} L_v(t, \hat x(t), \dot{\hat x}(t)) = L_v(t_0, x_0, w). \tag{$*$}
\]
This means that the function $v \mapsto L_v(t_0, x_0, v)$ is not one-to-one, and indeed that $v = u$ and $v = w$ are different inputs that give the same output. Illustrate with $L(t, x, v) = (v^2 - 1)^2$, for which
\[
L = v^4 - 2v^2 + 1 \implies L_v = 4v^3 - 4v = 4v(v^2 - 1).
\]
A sketch shows that choices $u = 1$, $w = -1$ are compatible, since both give $L_v = 0$: hence all zig-zag arcs of slope ±1 are extremals for L.
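The corner condition for L = (v² − 1)² can be confirmed with a few lines of arithmetic. This is a minimal sketch, assuming the interval [0, 1] with endpoint values x(0) = 0 = x(1) (an illustrative choice, not from the notes): it evaluates L_v at the two corner slopes and compares the cost of zig-zag arcs with the cost of the straight arc x ≡ 0.

```python
def L(v):
    return (v * v - 1.0) ** 2

def L_v(v):
    return 4.0 * v * (v * v - 1.0)

def zigzag_cost(teeth):
    # arc on [0, 1] with x(0) = 0 = x(1): slope +1 then -1, repeated `teeth` times
    piece = 1.0 / (2 * teeth)        # length of each linear piece
    slopes = [1.0, -1.0] * teeth
    return sum(L(s) * piece for s in slopes)

corner_values = (L_v(1.0), L_v(-1.0))   # both sides of (WE1)
straight_cost = L(0.0) * 1.0            # the arc x = 0 has slope 0 on all of [0, 1]
zigzag_costs = [zigzag_cost(n) for n in (1, 5, 50)]
print(corner_values, straight_cost, zigzag_costs)
```

Every zig-zag has cost 0 while the smooth arc x ≡ 0 has cost 1, so here the corners are not merely allowed by (WE1), they are advantageous.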
Theorem (Regularity). Fix $(t_0, x_0) \in \mathbb{R} \times \mathbb{R}^n$, and suppose $L = L(t, x, v)$ is $C^1$ on some open set containing $\{(t_0, x_0)\} \times \mathbb{R}^n$. Suppose $\hat x$ is an extremal for L obeying $\hat x(t_0) = x_0$.
(a) If $v \mapsto L_v(t_0, x_0, v)$ is one-to-one on $\mathbb{R}^n$, then $\hat x$ must be $C^1$ on some open interval containing $t_0$.
(b) If L is $C^2$ on the open set above and $L_{vv}(t_0, x_0, v) > 0$ for all $v \in \mathbb{R}^n$, then $\hat x$ must be $C^2$ on some open interval containing $t_0$.
[Likewise if $L_{vv}(t_0, x_0, v) < 0$ for all $v \in \mathbb{R}^n$.]
Proof. (a) Define $u = \dot{\hat x}(t_0^-)$ and $w = \dot{\hat x}(t_0^+)$. As in (∗) above, (WE1) gives
\[
L_v(t_0, x_0, u) = \lim_{t \to t_0^-} L_v(t, \hat x(t), \dot{\hat x}(t)) = \lim_{t \to t_0^+} L_v(t, \hat x(t), \dot{\hat x}(t)) = L_v(t_0, x_0, w).
\]
Since $v \mapsto L_v(t_0, x_0, v)$ is one-to-one, this forces $u = w$. By L’Hospital, $\dot{\hat x}(t_0)$ exists and equals $u = w$. [Give details here.] In particular, $t_0$ is not a corner point for $\hat x$. Since there are only finitely many corner points, the nearest one is some positive distance (say r) from $t_0$: then $\hat x$ is $C^1$ on $(t_0 - r, t_0 + r)$.
(b) The hypothesis implies that $v \mapsto L_v(t_0, x_0, v)$ is strictly monotonic, hence one-to-one. So $\hat x$ is $C^1$ near $t_0$ by (a). Hence $\hat x$ is $C^2$ near $t_0$ by the Weierstrass/Hilbert Theorem.
////
Natural Boundary Conditions. These work just the same as they did for smooth
extremals. Students derived a rather general formulation on HW02, and solutions
have been distributed.
E. Multiple Integrals
Suppose a bounded open set Ω in $\mathbb{R}^3$ is given, together with a scalar-valued function g defined and continuous on some open set that contains the boundary surface of Ω. Use the symbol ∂Ω for this boundary surface, and write $\overline{\Omega}$ for the set Ω ∪ ∂Ω. (The surface ∂Ω is “closed” in two senses: (1) it completely encloses a finite three-dimensional volume, (2) its complement in $\mathbb{R}^3$ is an open set. Topologically, $\overline{\Omega}$ is the “closure of Ω”, a compact set in $\mathbb{R}^3$.) We are interested in scalar-valued functions $u\colon \overline{\Omega} \to \mathbb{R}$ that agree with g on the boundary, i.e.,
\[
u(x) = g(x) \qquad \forall x \in \partial\Omega. \tag{DBC}
\]
[DBC stands for Dirichlet Boundary Condition.] Every such function u is assigned a number by the functional
\[
\Lambda[u] \overset{\text{def}}{=} \iiint_\Omega L(x, u(x), \nabla u(x))\,dV(x),
\]
where $L\colon \overline{\Omega} \times \mathbb{R} \times \mathbb{R}^3 \to \mathbb{R}$ is a given function. The basic multidimensional problem in the COV is to minimize Λ[u] subject to (DBC). In full detail,
\[
\begin{aligned}
&\text{minimize} && \Lambda[u] \overset{\text{def}}{=} \iiint_\Omega L(x, u(x), \nabla u(x))\,dV(x) \\
&\text{among all} && u\colon \overline{\Omega} \to \mathbb{R} \\
&\text{subject to} && u(x) = g(x) \quad \forall x \in \partial\Omega.
\end{aligned}
\]
Background. Under suitable smoothness hypotheses on the set Ω and a vector field $\mathbf{F}\colon \overline{\Omega} \to \mathbb{R}^3$, Gauss’s Divergence Theorem states
\[
\iiint_\Omega (\nabla \bullet \mathbf{F})\,dV = \iint_{\partial\Omega} \mathbf{F} \bullet \hat{\mathbf{N}}\,dS.
\]
This is a form of the Fundamental Theorem of Calculus: it shows how a suitable combination of first derivatives of $\mathbf{F}$ will “cancel” one integral in a triple-integral setup, leaving only a double-integral over the boundary of the original integration domain. The boundary integral on the right is a “flux integral”: $\hat{\mathbf{N}}$ represents the outward unit normal to the solid Ω, which typically varies from point to point.
A typical $\mathbf{F}\colon \overline{\Omega} \to \mathbb{R}^3$ will have input $x = (x_1, x_2, x_3)$ and values as shown here:
\[
\mathbf{F}(x_1, x_2, x_3) = \begin{bmatrix} P(x_1, x_2, x_3) \\ Q(x_1, x_2, x_3) \\ R(x_1, x_2, x_3) \end{bmatrix}.
\]
The divergence of $\mathbf{F}$ is this scalar-valued function of position:
\[
\operatorname{div}(\mathbf{F}) = \nabla \bullet \mathbf{F} = \frac{\partial P}{\partial x_1} + \frac{\partial Q}{\partial x_2} + \frac{\partial R}{\partial x_3}.
\]
For any smooth scalar-valued function y: Ω → R, one form of the product rule says
∇ • (yF) = (∇y) • F + y(∇ • F).
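Necessary conditions for optimality in problem (P) can be derived following the same abstract path we took in earlier studies. First, pick an arbitrary function u obeying (DBC) and an arbitrary $y\colon \overline{\Omega} \to \mathbb{R}$ satisfying $y(x) = 0$ for each $x \in \partial\Omega$. Consider
\[
\begin{aligned}
\Lambda'[u; y] &= \lim_{\lambda \to 0^+} \frac{\Lambda[u + \lambda y] - \Lambda[u]}{\lambda} = \left. \frac{d}{d\lambda} \Lambda[u + \lambda y] \right|_{\lambda = 0} \\
&= \left. \frac{d}{d\lambda} \iiint_\Omega L(x, u(x) + \lambda y(x), \nabla u(x) + \lambda \nabla y(x))\,dV(x) \right|_{\lambda = 0} \\
&= \iiint_\Omega \left. \frac{d}{d\lambda} L(x, u(x) + \lambda y(x), \nabla u(x) + \lambda \nabla y(x)) \right|_{\lambda = 0} dV(x) \\
&= \iiint_\Omega \left[ L_u(x, u(x), \nabla u(x))\,y(x) + L_w(x, u(x), \nabla u(x))\,\nabla y(x) \right] dV(x).
\end{aligned}
\]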
Now we apply the product rule above with $\mathbf{F}(x) = L_w(x, u(x), \nabla u(x))$: it gives
\[
L_w(x, u(x), \nabla u(x))\,\nabla y(x) = \nabla \bullet \big( y(x)\,L_w(x, u(x), \nabla u(x)) \big) - y\,\big( \nabla \bullet L_w(x, u(x), \nabla u(x)) \big).
\]
Integrating this and applying the Divergence Theorem gives
\[
\begin{aligned}
\iiint_\Omega L_w(x, u(x), \nabla u(x))\,\nabla y(x)\,dV
&= \iiint_\Omega \nabla \bullet \big( y(x)\,L_w(x, u(x), \nabla u(x)) \big)\,dV - \iiint_\Omega y\,\big( \nabla \bullet L_w(x, u(x), \nabla u(x)) \big)\,dV \\
&= \iint_{\partial\Omega} y(x)\,L_w(x, u(x), \nabla u(x)) \bullet \hat{\mathbf{N}}\,dS - \iiint_\Omega y\,\big( \nabla \bullet L_w(x, u(x), \nabla u(x)) \big)\,dV.
\end{aligned}
\]
In summary, we have
\[
\Lambda'[u; y] = \iiint_\Omega \left[ L_u(x, u(x), \nabla u(x)) - \nabla \bullet L_w(x, u(x), \nabla u(x)) \right] y(x)\,dV(x) + \iint_{\partial\Omega} y(x)\,L_w(x, u(x), \nabla u(x)) \bullet \hat{\mathbf{N}}\,dS.
\]
This formula is valid for any smooth y. If, in addition, $y(x) = 0$ for each x on ∂Ω, then the second integral equals 0. A simple argument involving perturbations with bump-shaped graphs leads to the following necessary condition for u to solve problem (P):
\[
\nabla \bullet L_w(x, u(x), \nabla u(x)) = L_u(x, u(x), \nabla u(x)), \qquad x \in \Omega. \tag{ELPDE}
\]
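Example. Suppose
\[
\Lambda[u] = \iiint_\Omega \tfrac12 |\nabla u(x)|^2\,dV = \tfrac12 \iiint_\Omega \left( \left( \frac{\partial u}{\partial x_1} \right)^2 + \left( \frac{\partial u}{\partial x_2} \right)^2 + \left( \frac{\partial u}{\partial x_3} \right)^2 \right) dV.
\]
Here we have
\[
L\!\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, u, \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} \right) = \tfrac12 w_1^2 + \tfrac12 w_2^2 + \tfrac12 w_3^2,
\]
so $L_u \equiv 0$ and
\[
L_w(x, u, w) = D_w L(x, u, w) = \begin{bmatrix} w_1 & w_2 & w_3 \end{bmatrix}.
\]
So a function u is an extremal [i.e., a solution of (ELPDE)] exactly when
\[
0 = \nabla \bullet L_w(x, u(x), \nabla u(x)) = \nabla \bullet \begin{bmatrix} u_{,1}(x) & u_{,2}(x) & u_{,3}(x) \end{bmatrix} = u_{,11} + u_{,22} + u_{,33}.
\]
This is Laplace’s Equation for the function u.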
////
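Laplace's equation is easy to test on explicit functions. The sketch below assumes two polynomials chosen for illustration and approximates u,₁₁ + u,₂₂ + u,₃₃ by second-order central differences (the step size is also a choice made here): one polynomial is harmonic and so is an extremal of the Dirichlet integral, while the other is not.

```python
def u(x1, x2, x3):
    # harmonic polynomial (illustrative): 2 + 2 - 4 = 0
    return x1 ** 2 + x2 ** 2 - 2.0 * x3 ** 2

def w(x1, x2, x3):
    # not harmonic: the Laplacian equals 6 everywhere
    return x1 ** 2 + x2 ** 2 + x3 ** 2

def laplacian(func, p, h=1e-3):
    # sum of second-order central differences along each coordinate axis
    total = 0.0
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        total += (func(*plus) - 2.0 * func(*p) + func(*minus)) / h ** 2
    return total

point = (0.3, -0.7, 1.1)
lap_u = laplacian(u, point)
lap_w = laplacian(w, point)
print(lap_u, lap_w)
```

Only the first function satisfies (ELPDE) for this Lagrangian; the second violates it at every point.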
There are analogous results in all dimensions. For the case of two independent variables, where $\Omega \subseteq \mathbb{R}^2$, $u = u(x, y)$ and $L = L((x, y), u, (v, w))$, (ELPDE) says
\[
\frac{\partial}{\partial x} L_v\big( (x,y), u(x,y), (u_x(x,y), u_y(x,y)) \big) + \frac{\partial}{\partial y} L_w\big( (x,y), u(x,y), (u_x(x,y), u_y(x,y)) \big) = L_u\big( (x,y), u(x,y), (u_x(x,y), u_y(x,y)) \big), \qquad (x, y) \in \Omega.
\]
F. Local Minima (Basic Problem)
\[
\min \{ \Lambda[x] : x \in PWS[a,b],\ x(a) = A,\ x(b) = B \}. \tag{P}
\]
Recall the space of variations
\[
V_{II} = \{ h \in PWS[a,b] : h(a) = 0 = h(b) \}.
\]
We’re looking for arcs $\hat x$ that obey
\[
\Lambda[\hat x] \le \Lambda[\hat x + \lambda h] \tag{$*$}
\]
for a good selection of variations $h \in V_{II}$ and $\lambda > 0$.
$\hat x$ gives a (Global) Minimum if (∗) holds for all $\lambda > 0$ and all $h \in V_{II}$.
$\hat x$ gives a Strong Local Minimum if there exists $\rho > 0$ such that (∗) holds for any combination of $\lambda > 0$ and $h \in V_{II}$ such that
\[
\max_{t \in [a,b]} |\lambda h(t)| < \rho.
\]
$\hat x$ gives a Weak Local Minimum if there exists $\rho > 0$ such that (∗) holds for any combination of $\lambda > 0$ and $h \in V_{II}$ such that both
\[
\max_{t \in [a,b]} |\lambda h(t)| < \rho \qquad \text{and} \qquad \sup_{t \in [a,b]} |\lambda \dot h(t)| < \rho.
\]
$\hat x$ gives a Directional Local Minimum if for each fixed $h \in V_{II}$, there exists $\rho > 0$ such that (∗) holds for all $\lambda \in (0, \rho)$. I.e., for each h, the function $\lambda \mapsto \Lambda[\hat x + \lambda h]$ has a local minimum over $[0, +\infty)$ at $\lambda = 0$.
Each definition is different, and the sets of arcs in the various categories may differ. Writing Σ(P) for the set of global minimizers, and adding descriptive superscripts as suggested above, we have
\[
\Sigma(P) \subseteq \Sigma^{SLM}(P) \subseteq \Sigma^{WLM}(P) \subseteq \Sigma^{DLM}(P).
\]
Short Segments. Combining each definition above with the Principle of Optimality leads to a “short segment” version of the criterion. Instead of the full vector space $V_{II}$, we use the subset of variations h for which all nonzero values occur for t in some short open interval. For example, an arc $\hat x$ gives a Directional Local Minimum on Short Segments if there exists $\rho > 0$ such that for each fixed $h \in V_{II}$ whose nonzero values can all be covered by some open interval of length ρ, the function $\lambda \mapsto \Lambda[\hat x + \lambda h]$ has a local minimum over $[0, +\infty)$ at $\lambda = 0$. Clearly, $\Sigma^{DLMSS}$ contains $\Sigma^{DLM}$.
There must be more. Our proofs of IEL, NBC, WE1, WE2 (when $\hat x \in C^2$), and Hilbert’s theorem apply to every arc in the biggest class, $\Sigma^{DLMSS}$. (DLM class is obvious, add SS by using Principle of Optimality.) Assuming we have a more robust type of local minimum is a stronger hypothesis, and should give stronger conclusions.
Intuition-Builder. Consider $f\colon \mathbb{R}^2 \to \mathbb{R}$ defined by
\[
f(x) = \begin{cases} -1, & \text{if } x_2 = x_1^2,\ x_1 > 0, \\ 0, & \text{otherwise}, \end{cases}
\]
near the point $\hat x = 0$. Pick any nonzero $h \in \mathbb{R}^2$ and observe that there exists $\rho > 0$ such that
\[
f(\lambda h) = 0 = f(0, 0) \qquad \forall \lambda \in (0, \rho).
\]
Hence the point $\hat x = 0$ gives a directional local min for f. Detail: Fix $h = (h_1, h_2) \in \mathbb{R}^2$. If either $h_1 \le 0$ or $h_2 \le 0$, then $f(\lambda h) = 0$ for all $\lambda \ge 0$. If both $h_1 > 0$ and $h_2 > 0$, then
\[
f(\lambda h) \ne 0 \iff f(\lambda h) = -1 \iff \lambda h_2 = (\lambda h_1)^2 \iff \lambda = \frac{h_2}{h_1^2}.
\]
Thus $f(\lambda h) = 0$ for all $\lambda \in [0, \rho)$, where $\rho = \rho(h) \overset{\text{def}}{=} h_2 / h_1^2$. Note, however, that there are points arbitrarily near $\hat x$ with smaller values. (E.g., $x^{(n)} = (1/n, 1/n^2)$ for large n.)
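The gap between directional and genuine local minimality in this example can be checked exactly. The sketch below uses Python's `fractions.Fraction` so that the test x₂ = x₁² is not spoiled by rounding; the particular direction h = (3, 2) is an arbitrary illustrative choice.

```python
from fractions import Fraction

def f(x1, x2):
    # -1 on the open half-parabola x2 = x1^2, x1 > 0; 0 elsewhere
    return -1 if (x1 > 0 and x2 == x1 ** 2) else 0

def rho(h1, h2):
    # for h1, h2 > 0 the ray {lam * h} meets the parabola only at lam = h2 / h1^2
    return Fraction(h2) / Fraction(h1) ** 2

h1, h2 = Fraction(3), Fraction(2)
r = rho(h1, h2)

# along the ray, strictly before lam = rho, f stays at 0 ...
along_ray = [f(lam * h1, lam * h2) for lam in (r / 4, r / 2, 3 * r / 4)]
# ... and f drops to -1 exactly at lam = rho
at_rho = f(r * h1, r * h2)
# yet points on the parabola approach the origin with value -1
on_parabola = [f(Fraction(1, n), Fraction(1, n ** 2)) for n in (10, 100, 1000)]
print(r, along_ray, at_rho, on_parabola)
```

The radius ρ(h) shrinks as the direction h tilts toward the positive x₁-axis, so no single ρ works for all directions at once. That is exactly why a directional local minimum need not be a local minimum.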
G. Strong Local Minimizers; the Weierstrass Condition
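Example. Consider the extremal $\hat x(t) = t$ for the problem
\[
\min\left\{ \int_0^1 \dot x(t)^3\,dt \;:\; x(0) = 0,\ x(1) = 1 \right\}.
\]
Given any ε > 0, consider the variation $h_\lambda$ defined as follows:
\[
h_\lambda(t) = \begin{cases} -\dfrac{\varepsilon}{\lambda}\,t, & \text{if } 0 \le t \le \lambda, \\[2mm] -\dfrac{\varepsilon}{\lambda - 1}\,(t - 1), & \text{if } \lambda < t \le 1. \end{cases}
\]
Observe that $\|h_\lambda\|_\infty = \varepsilon$. Calculation reveals
\[
\begin{aligned}
\Lambda[\hat x + h_\lambda] &= \int_0^\lambda \left( 1 - \frac{\varepsilon}{\lambda} \right)^3 dt + \int_\lambda^1 \left( 1 - \frac{\varepsilon}{\lambda - 1} \right)^3 dt \\
&= \lambda \left( 1 - \frac{\varepsilon}{\lambda} \right)^3 + (1 - \lambda) \left( 1 - \frac{\varepsilon}{\lambda - 1} \right)^3 \\
&= -\frac{(\varepsilon - \lambda)^3}{\lambda^2} + \frac{(1 - \lambda + \varepsilon)^3}{(1 - \lambda)^2}.
\end{aligned}
\]
Notice that as $\lambda \to 0^+$, the second term here converges to the finite quantity $(1 + \varepsilon)^3$, while the first term diverges to −∞. Thus we have $\Lambda[\hat x + h_\lambda] < \Lambda[\hat x]$ for all $\lambda > 0$ sufficiently small, and since this is true for every ε > 0, the arc $\hat x$ fails to provide a strong local minimum here. In fact, our analysis shows that for every ε > 0,
\[
\inf \{ \Lambda[x] : x(0) = 0,\ x(1) = 1,\ \|x - \hat x\|_\infty \le \varepsilon \}
\le \inf_{\lambda \in (0,1)} \left[ -\frac{(\varepsilon - \lambda)^3}{\lambda^2} + \frac{(1 - \lambda + \varepsilon)^3}{(1 - \lambda)^2} \right] = -\infty.
\]
Thus the problem above has no minimum at all.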
////
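The divergence can be watched numerically. This is a minimal sketch, with ε = 0.1 as an arbitrary illustrative value: since x̂ + h_λ is piecewise linear, its cost is just (length of piece) × (slope)³ summed over the two pieces, and this is compared with the closed form (λ − ε)³/λ² + (1 − λ + ε)³/(1 − λ)².

```python
def cost_direct(eps, lam):
    # x_hat + h_lam has constant slope on [0, lam] and on [lam, 1]
    slope1 = 1.0 - eps / lam
    slope2 = 1.0 - eps / (lam - 1.0)
    return lam * slope1 ** 3 + (1.0 - lam) * slope2 ** 3

def cost_formula(eps, lam):
    # the same quantity after simplifying each term
    return (lam - eps) ** 3 / lam ** 2 + (1.0 - lam + eps) ** 3 / (1.0 - lam) ** 2

eps = 0.1
lams = (1e-1, 1e-2, 1e-3, 1e-4)
costs = [cost_direct(eps, lam) for lam in lams]
formulas = [cost_formula(eps, lam) for lam in lams]
for lam, c in zip(lams, costs):
    print(lam, c)
```

The reference value is Λ[x̂] = 1. For λ close to ε the perturbed cost can exceed 1, but as λ shrinks the steep first piece dominates and the cost falls without bound, even though the variation never moves the arc by more than ε.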
The method of this example works well in other contexts, too. The key observation is that large negative derivatives will make a large negative contribution to
the objective integral. The triangular variations constructed here allow these derivatives to make a large negative contribution in the first very small interval, leading to
the divergent first term in the sum above; in contrast the variations are very nearly
zero for the remainder of the interval, so the integral there comes very close to the original objective value $\Lambda[\hat x]$ for small λ. It is helpful to imagine changing ε to −ε in the definition of $h_\lambda$, thus producing a triangular variation with positive values. In this case the first term of the objective sum above is positive, but the second term diverges to −∞ as one takes the limit $\lambda \to 1^-$.
Theorem (Weierstrass, 1879). If $L \in C^1$ and $\hat x$ gives a weak local minimum in the basic problem, then there exists ε > 0 such that for all $t \in (a,b)$,
\[
L(t, \hat x(t), \dot{\hat x}(t)) + L_v(t, \hat x(t), \dot{\hat x}(t)) \left[ w - \dot{\hat x}(t) \right] \le L(t, \hat x(t), w), \tag{$*$}
\]
whenever $|w - \dot{\hat x}(t)| < \varepsilon$.
[Interpretation: If $t \in (a,b)$ is a corner point of $\hat x$, (∗) holds if we write $\dot{\hat x}(t^-)$ or $\dot{\hat x}(t^+)$ instead of $\dot{\hat x}(t)$ throughout.]
Moreover, if $\hat x$ gives a strong local minimum, then
(i) (∗) holds for all w without restriction.
(ii) the function $t \mapsto \hat L(t) - \hat L_v(t)\,\dot{\hat x}(t)$ has only removable discontinuities in $[a,b]$.
Notation. Define the Weierstrass Excess Function
\[
E(t, x, v, w) = L(t, x, w) - L(t, x, v) - L_v(t, x, v)[w - v].
\]
Then the Weierstrass Necessary Condition (for strong local minimizers) says
\[
E(t, \hat x(t), \dot{\hat x}(t), w) \ge 0, \qquad \forall w. \tag{W}
\]
Geometry. Condition (W) asserts the subgradient inequality for the function $v \mapsto L(t, \hat x(t), v)$ for the specific tangent erected using $v = \dot{\hat x}(t)$. (Draw a picture.) If this function is convex, the subgradient inequality holds for all tangents at all base points; so a simple sufficient condition for (W) [and indeed for (W⁺) below] is the requirement $L_{vv}(t, \hat x(t), v) \ge 0$ for all v.
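For the integrand L = v³ of the example at the start of this section, the excess function along x̂(t) = t can be evaluated exactly with integers. The sketch below (the sample values of w are arbitrary) computes E(v, w) = L(w) − L(v) − L_v(v)(w − v) at v = 1, where it factors as (w − 1)²(w + 2).

```python
def L(v):
    return v ** 3

def L_v(v):
    return 3 * v ** 2

def excess(v, w):
    # Weierstrass excess function for an integrand depending on v only
    return L(w) - L(v) - L_v(v) * (w - v)

v_hat = 1  # slope of the extremal x_hat(t) = t
samples = {w: excess(v_hat, w) for w in (-4, -3, -2, 0, 1, 2)}
factored = {w: (w - 1) ** 2 * (w + 2) for w in samples}
print(samples)
```

The excess is nonnegative for w ≥ −2 but negative for w < −2, so (W) holds for slopes near 1 and fails for steep negative slopes, in line with the failure of x̂ to be a strong local minimizer.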
Proof. (Graves.) Choose ε > 0 as in the definition of WLM, and let $t \in (a,b)$ be a non-corner point of $\hat x$. Given any w where $|w - \dot{\hat x}(t)| < \varepsilon$, let $v = w - \dot{\hat x}(t)$. Then $|v| < \varepsilon$ and $w = \dot{\hat x}(t) + v$. For $h > 0$ and $\alpha > 0$ small enough that $t - \alpha > 0$ and $t + \alpha/h < 1$, define a variation y by taking $y(0) = 0$ and
\[
\dot y(r) = \begin{cases} 0, & \text{if } 0 < r < t - \alpha, \\ v, & \text{if } t - \alpha < r < t, \\ -hv, & \text{if } t < r < t + \alpha/h, \\ 0, & \text{if } t + \alpha/h < r < 1. \end{cases}
\]
Note that y satisfies (∗), so $\Lambda[\hat x + y] \ge \Lambda[\hat x]$, and consequently
\[
\begin{aligned}
0 &\le \lim_{\alpha \to 0^+} \alpha^{-1} \left[ \Lambda[\hat x + y] - \Lambda[\hat x] \right] \\
&= \lim_{\alpha \to 0^+} \alpha^{-1} \int_{t - \alpha}^{t} \left[ L(\hat x(r) + y(r), \dot{\hat x}(r) + v) - L(\hat x(r), \dot{\hat x}(r)) \right] dr \\
&\quad + \lim_{\alpha \to 0^+} \alpha^{-1} \int_{t}^{t + \alpha/h} \left[ L(\hat x(r) + y(r), \dot{\hat x}(r) - hv) - L(\hat x(r), \dot{\hat x}(r)) \right] dr \\
&= L(\hat x(t), \dot{\hat x}(t) + v) - L(\hat x(t), \dot{\hat x}(t)) + \frac{1}{h} \left[ L(\hat x(t), \dot{\hat x}(t) - hv) - L(\hat x(t), \dot{\hat x}(t)) \right].
\end{aligned}
\]
Rearranging this inequality gives, for all $h > 0$ sufficiently small,
\[
L(\hat x(t), \dot{\hat x}(t)) + \frac{1}{(-h)} \left[ L(\hat x(t), \dot{\hat x}(t) - hv) - L(\hat x(t), \dot{\hat x}(t)) \right] \le L(\hat x(t), \dot{\hat x}(t) + v).
\]
Now (∗) follows by taking the limit as $h \to 0^+$.
Both sides in (∗) depend continuously on t in any open interval where $\hat x$ has no corners. Hence we can take one-sided limits as t approaches a corner point and retain the validity of the inequality.
(i) If $\hat x$ gives a strong local minimum, then the key inequality above remains valid for arbitrary v, so the results above hold for arbitrary w.
(ii) Define $p(t) := L_v(t, \hat x(t), \dot{\hat x}(t))$. If $\hat x$ is a SLM, it is certainly a DLM, so it must obey WE1. This means that $p(t^-) = p(t^+)$ holds at each $t \in (a,b)$. Rearranging (∗) gives
\[
L(t, \hat x(t), \dot{\hat x}(t)) - p(t)\,\dot{\hat x}(t) \le L(t, \hat x(t), w) - p(t)\,w \qquad \forall w.
\]
Now fix t. The function of w on the right side is unchanged if we swap t for either $t^-$ or $t^+$, so its minimum value over w is unambiguous. Choosing $t^-$ on the left side gives one expression for this minimum (attained when $w = \dot{\hat x}(t^-)$), while choosing $t^+$ gives another (attained when $w = \dot{\hat x}(t^+)$). Matching these expressions gives the stated result:
\[
H(t^-) = H(t^+), \qquad \text{where} \quad H(t) \overset{\text{def}}{=} \hat L(t) - \hat L_v(t)\,\dot{\hat x}(t). \tag{WE2}
\]
////