IV. The Second Variation
UBC M402 lecture notes 2015
by Philip D. Loewen
A. Second-Order Necessary Conditions
Consider the basic problem
$$\min\left\{ \Lambda[x] := \int_a^b L\bigl(t, x(t), \dot x(t)\bigr)\,dt \;:\; x(a) = A,\ x(b) = B \right\}. \tag{P}$$
If $\hat x$ gives a directional local minimum, and $y$ is any arc in
$$V_{II} = \{y \in PWS[a, b] : y(a) = 0 = y(b)\},$$
then the function $g\colon \mathbb{R} \to \mathbb{R}$ defined by
$$g(\lambda) := \Lambda[\hat x + \lambda y]$$
must have a local minimum at the point $\lambda = 0$. Thus $g'(0) = 0$, which gives (IEL) and all the theory developed so far. But also $g''(0) \ge 0$, which we now investigate.
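Suppose $L$, $L_x$, and $L_v$ are all $C^1$. Then
$$\begin{aligned}
g(\lambda) &= \int_a^b L\bigl(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\bigr)\,dt \\
\Rightarrow\quad g'(\lambda) &= \int_a^b \Bigl[ L_x\bigl(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\bigr)\, y(t) \\
&\qquad\quad + L_v\bigl(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\bigr)\, \dot y(t) \Bigr]\,dt \\
\Rightarrow\quad g''(0) &= \int_a^b \frac{\partial}{\partial\lambda}\bigl[\,\cdots\bigr]_{\lambda=0}\,dt \\
&= \int_a^b \Bigl[ \bigl(\hat L_{xx}(t) y(t) + \hat L_{xv}(t)\dot y(t)\bigr) y(t) + \bigl(\hat L_{vx}(t) y(t) + \hat L_{vv}(t)\dot y(t)\bigr) \dot y(t) \Bigr]\,dt \\
&= \int_a^b \Bigl[ \hat L_{vv}(t)\dot y(t)^2 + 2\hat L_{xv}(t) y(t)\dot y(t) + \hat L_{xx}(t) y(t)^2 \Bigr]\,dt.
\end{aligned}$$
Since $g''(0) \ge 0$ must hold for any fixed $y$, we have the following result.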
A.1. Theorem. Assume that $L$, $L_x$, and $L_v$ are of class $C^1$. Let $\hat x$ provide a directional local minimum in problem (P). Then for any arc $y$ in $V_{II}$, one has
$$0 \le J[y] := \int_a^b \Bigl[ \hat L_{vv}(t)\dot y(t)^2 + 2\hat L_{xv}(t) y(t)\dot y(t) + \hat L_{xx}(t) y(t)^2 \Bigr]\,dt.$$
In particular, the arc $\hat y(t) = 0$ provides a global minimum in the so-called accessory problem
$$\min_{y \in PWS[a,b]} \bigl\{ J[y] : y(a) = 0 = y(b) \bigr\}. \tag{Q}$$
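The identity $g''(0) = J[y]$ behind Theorem A.1 can be checked numerically. The sketch below is only an illustration under assumed data: the integrand $L = v^2 - x^2 + x^4$, the reference arc $\hat x(t) = t$ on $[0, 1]$, and the variation $y(t) = \sin(\pi t)$ are hypothetical choices, not taken from these notes. The identity itself holds for any arc $\hat x$, whether or not it is a minimizer.

```python
import math

def integrate(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Hypothetical test data (assumptions for illustration only).
a, b = 0.0, 1.0
L = lambda t, x, v: v**2 - x**2 + x**4
L_vv = lambda t, x, v: 2.0
L_xv = lambda t, x, v: 0.0
L_xx = lambda t, x, v: -2.0 + 12.0 * x**2
xhat, xhat_dot = (lambda t: t), (lambda t: 1.0)
y = lambda t: math.sin(math.pi * t)           # y(a) = 0 = y(b)
y_dot = lambda t: math.pi * math.cos(math.pi * t)

def g(lam):
    """g(lambda) = Lambda[xhat + lambda * y]."""
    return integrate(
        lambda t: L(t, xhat(t) + lam * y(t), xhat_dot(t) + lam * y_dot(t)), a, b)

def J():
    """Second variation J[y] with coefficients evaluated along xhat."""
    def integrand(t):
        x, v = xhat(t), xhat_dot(t)
        return (L_vv(t, x, v) * y_dot(t)**2
                + 2.0 * L_xv(t, x, v) * y(t) * y_dot(t)
                + L_xx(t, x, v) * y(t)**2)
    return integrate(integrand, a, b)

h = 1e-3
second_difference = (g(h) - 2.0 * g(0.0) + g(-h)) / h**2   # approximates g''(0)
J_value = J()
print(second_difference, J_value)
```

The two printed numbers should agree up to the $O(h^2)$ error of the central second difference.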
The accessory problem introduced in the statement of Theorem A.1 is purely quadratic, with an integrand
$$\mathcal{L}(t, y, w) = \alpha(t) w^2 + 2\beta(t) y w + \gamma(t) y^2$$
File “var2”, version of 27 Feb 2015, page 1.
Typeset at 14:19 February 27, 2015.
whose coefficients are determined by second derivatives of the original integrand $L$ along the given arc $\hat x$:
$$\alpha(t) = \hat L_{vv}(t), \qquad \beta(t) = \hat L_{vx}(t), \qquad \gamma(t) = \hat L_{xx}(t).$$
Assuming that $L, L_x, L_v \in C^1$ makes these coefficients piecewise continuous functions of $t$ in $[a, b]$.
Observations.
1. For problems where the basic integrand $L$ has the general form
$$L(t, x, v) = c_1 v^2 + 2c_2 xv + c_3 x^2 + c_4 v + c_5 x + c_6,$$
the partial derivatives needed to set up the accessory problem are all independent of whatever reference arc $\hat x$ is given. It's easy to calculate them, and arrive at
$$\mathcal{L}(t, y, w) = 2\left(c_1 w^2 + 2c_2 yw + c_3 y^2\right).$$
In particular, if $L$ is a purely quadratic function of $(x, v)$, i.e., $c_4 = 0$, $c_5 = 0$, and $c_6 = 0$, then $\mathcal{L}(t, y, w) = 2L(t, y, w)$, so $J[y] = 2\Lambda[y]$. Thus the only consequential difference between (P) and (Q) for purely quadratic problems is that (Q) imposes zero endpoint values. Note that this observation remains valid even when the coefficients $c_k = c_k(t)$ are smooth functions depending only on $t$.
2. The constant arc y(t) = 0 is admissible in (Q), so the inequality inf(Q) ≤ J[0] =
0 is obvious. Theorem A.1 is useful because it provides the opposite inequality,
namely inf(Q) ≥ 0, so that inf(Q) = 0.
3. The only alternative to $\inf(Q) = 0$ is $\inf(Q) = -\infty$. Indeed, if $\inf(Q) \ne 0$ then (since we know $\inf(Q) \le 0$) there must be some admissible arc $y_1$ such that $J[y_1] < 0$. But then we can define a sequence of arcs $y_n(t) = n y_1(t)$, $n \in \mathbb{N}$: each $y_n$ satisfies the boundary conditions, and since $J$ is purely quadratic,
$$J[y_n] = J[n y_1] = n^2 J[y_1] \to -\infty \quad\text{as } n \to \infty.$$
Thus $\inf(Q) = -\infty$, as claimed.
B. Legendre’s Condition
Inspired by the findings above, we consider a variational problem with purely quadratic integrand:
$$\min_{y \in PWS}\left\{ J[y] := \int_a^b \Bigl[ \alpha(t)\dot y(t)^2 + 2\beta(t)\dot y(t) y(t) + \gamma(t) y(t)^2 \Bigr]\,dt \;:\; y(a) = 0 = y(b) \right\}. \tag{Q}$$
We assume the coefficient functions α, β, γ: [a, b] → R are piecewise continuous. The
notation is deliberately aligned with that in the accessory problem above, but here
we treat (Q) strictly on its own merits.
B.1. Lemma. If $\alpha(\tau) < 0$ for some $\tau \in [a, b]$ where $\alpha$ is continuous, then $\inf(Q) = -\infty$. In particular, the arc $\hat y \equiv 0$ is not a DLM in problem (Q).
Proof. We may assume that the point $\tau$ mentioned in the statement is not an endpoint of $[a, b]$. (Indeed, knowing only that $\alpha$ is continuous at point $b$ with $\alpha(b) < 0$ is enough to deduce that there is some nearby point $\tau < b$ where the hypotheses hold. The situation is similar at $a$.)
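Define $c = -\tfrac12 \alpha(\tau)$: then $c > 0$ and $\alpha(\tau) = -2c$, so, by continuity of $\alpha$ at $\tau$, there exists some constant $\varepsilon > 0$ so small that $(\tau - \varepsilon, \tau + \varepsilon)$ is a subinterval of $(a, b)$, and each $t$ in this subinterval gives $\alpha(t) \le -c$. Then consider, for integers $n > 1/\varepsilon$, the sequence of arcs $y_n$ defined by
$$y_n(t) = \begin{cases}
0, & \text{if } a \le t < \tau - 1/n, \\
n\,\bigl(t - (\tau - 1/n)\bigr), & \text{if } \tau - 1/n \le t < \tau, \\
n\,\bigl((\tau + 1/n) - t\bigr), & \text{if } \tau \le t < \tau + 1/n, \\
0, & \text{if } \tau + 1/n \le t \le b.
\end{cases}$$
(Sketch!) Whenever $n > 1/\varepsilon$, each $y_n$ is admissible in (Q), and the sketch shows that
$$|\dot y_n(t)| = \begin{cases} n, & \text{if } 0 < |t - \tau| < 1/n, \\ 0, & \text{otherwise,} \end{cases}
\qquad
\int_a^b |y_n(t)|\,dt = \int_{\tau - 1/n}^{\tau + 1/n} y_n(t)\,dt = \frac{1}{n}.$$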
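Furthermore, piecewise continuity makes both constants $\bar\beta = \max_{a \le t \le b} |\beta(t)|$ and $\bar\gamma = \max_{a \le t \le b} |\gamma(t)|$ finite. Using these, we estimate the three terms in the cost functional of (Q). First,
$$J_1[y_n] := \int_a^b \alpha(t)\dot y_n(t)^2\,dt = \int_{\tau - 1/n}^{\tau + 1/n} n^2 \alpha(t)\,dt \le n^2(-c)(2/n) = -2cn.$$
Second, thanks to the observations above,
$$J_2[y_n] := \int_a^b 2\beta(t)\dot y_n(t) y_n(t)\,dt \le \left| \int_a^b 2\beta(t)\dot y_n(t) y_n(t)\,dt \right| \le \int_{\tau - 1/n}^{\tau + 1/n} \bigl|2\beta(t)\dot y_n(t) y_n(t)\bigr|\,dt \le 2n\bar\beta \int_{\tau - 1/n}^{\tau + 1/n} y_n(t)\,dt = 2\bar\beta.$$
Third,
$$J_3[y_n] := \int_a^b \gamma(t) y_n(t)^2\,dt \le \left| \int_a^b \gamma(t) y_n(t)^2\,dt \right| \le \bar\gamma \int_{\tau - 1/n}^{\tau + 1/n} |y_n(t)|^2\,dt \le \bar\gamma\,(2/n).$$
Combining these three estimates, we find that
$$J[y_n] = J_1[y_n] + J_2[y_n] + J_3[y_n] \le -2cn + 2\bar\beta + 2\bar\gamma/n \qquad \forall n > 1/\varepsilon.$$
Clearly, as $n \to \infty$ we have $J[y_n] \to -\infty$, and this gives $\inf(Q) = -\infty$.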
To get the DLM conclusion, focus on any single $N \in \mathbb{N}$ so large that $J[y_N] < 0$. Then for any scalar $\lambda \ne 0$, no matter how small,
$$J[\hat y + \lambda y_N] = J[\lambda y_N] = \lambda^2 J[y_N] < 0 = J[\hat y].$$
Thus $y_N$ provides a direction in which the arc $\hat y = 0$ does not provide a one-dimensional local minimum.
////
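The tent functions in the proof of Lemma B.1 are easy to experiment with. The following sketch assumes constant coefficients $\alpha = -1$, $\beta = 0.5$, $\gamma = 1$ on $[a, b] = [0, 1]$ with $\tau = 0.5$; these values are hypothetical, chosen only so that the integrand is a quadratic polynomial on each linear piece of $y_n$ and Simpson's rule is exact there. It compares $J[y_n]$ with the bound $-2cn + 2\bar\beta + 2\bar\gamma/n$ from the proof.

```python
def simpson(f, lo, hi):
    """Simpson's rule on one interval; exact for polynomials of degree <= 3."""
    mid = 0.5 * (lo + hi)
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(mid) + f(hi))

# Hypothetical constant coefficients on [0, 1] (assumptions for illustration).
ALPHA, BETA, GAMMA = -1.0, 0.5, 1.0
TAU = 0.5

def J_tent(n):
    """J[y_n] for the tent of height 1 supported on [TAU - 1/n, TAU + 1/n]; needs n > 2."""
    left, right = TAU - 1.0 / n, TAU + 1.0 / n
    pieces = [
        (lambda t: n * (t - left), float(n), left, TAU),      # rising side, slope +n
        (lambda t: n * (right - t), -float(n), TAU, right),   # falling side, slope -n
    ]
    total = 0.0
    for y, slope, lo, hi in pieces:
        integrand = lambda t, y=y, slope=slope: (
            ALPHA * slope**2 + 2.0 * BETA * slope * y(t) + GAMMA * y(t)**2)
        total += simpson(integrand, lo, hi)
    return total

c = -0.5 * ALPHA   # the constant c of the proof; here alpha(t) <= -c everywhere
for n in (3, 10, 100):
    bound = -2.0 * c * n + 2.0 * abs(BETA) + 2.0 * abs(GAMMA) / n
    print(n, J_tent(n), bound)
```

For these constant coefficients a hand computation gives $J[y_n] = -2n + 2/(3n)$, which the loop can be compared against.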
The general fact about quadratic variational problems just established has useful
consequences for the study of the accessory problem introduced in Section A.
B.2. Theorem (Legendre). Under the hypotheses of Theorem A.1, suppose $\hat x \in PWS[a, b]$ gives a directional local minimum for problem (P). Then
$$\hat L_{vv}(t) \ge 0 \qquad \forall t \in [a, b]. \tag{L}$$
Proof. If, on the contrary, there were some $\tau$ in $[a, b]$ for which $\alpha(\tau) = \hat L_{vv}(\tau) < 0$, then Lemma B.1 would show that the infimum in the accessory problem is $-\infty$. This would contradict Theorem A.1, so it cannot happen.
////
B.3. Example.
$$\min\left\{ \int_0^{\pi/4} \bigl[ x(t)^2 - \dot x(t)^2 \bigr]\,dt \;:\; x(0) = 0,\ x(\pi/4) \text{ free} \right\}.$$
Solution. Here $L(t, x, v) = x^2 - v^2$ satisfies $L_{vv}(t, x, v) = -2 < 0$ for all points $(t, x, v)$, so for any arc $\hat x$ one will have $\hat L_{vv}(t) < 0$ for all $t$. Thus Legendre's condition is never satisfied, and no extremal gives even a directional local minimum!
////
C. Jacobi’s Equation; Conjugate Points
Recall. If $\hat x$ gives a directional local minimum for the basic problem (P), then the arc $\hat y = 0$ gives an absolute minimum in the following problem:
$$\text{minimize } J[y] := \int_a^b \Bigl[ \hat L_{vv}(t)\dot y(t)^2 + 2\hat L_{xv}(t) y(t)\dot y(t) + \hat L_{xx}(t) y(t)^2 \Bigr]\,dt \quad \text{subject to } y(a) = 0,\ y(b) = 0. \tag{Q}$$
Problem (Q) is called the accessory problem corresponding to the arc $\hat x$. If both $L$ and $\hat x$ are sufficiently smooth, (Q) is a variational problem of the sort we know how to handle. Let us therefore assume throughout what follows that the minimizer $\hat x$ in (P) is of class $C^2$, and that $L_{xx}$, $L_{xv}$, and $L_{vv}$ are continuously differentiable in $(t, x, v)$-space. [It is possible to get similar results with weaker hypotheses.] Then any minimizing arc in (Q) must satisfy both (DEL) and (WE1). In general notation, the integrand in (Q) has the form
$$\mathcal{L}(t, y, w) = \hat L_{vv}(t) w^2 + 2\hat L_{vx}(t) yw + \hat L_{xx}(t) y^2,$$
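and consequently the appropriate form of (DEL) is
$$\frac{d}{dt}\mathcal{L}_w(t, y(t), \dot y(t)) = \mathcal{L}_y(t, y(t), \dot y(t)),$$
that is,
$$\frac{d}{dt}\Bigl[ 2\hat L_{vv}(t)\dot y + 2\hat L_{xv}(t) y \Bigr] = 2\hat L_{xv}(t)\dot y + 2\hat L_{xx}(t) y,$$
or equivalently
$$\frac{d}{dt}\Bigl[ \hat L_{vv}(t)\dot y \Bigr] - \Bigl[ \hat L_{xx}(t) - \frac{d}{dt}\hat L_{xv}(t) \Bigr] y = 0. \tag{JDE}$$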
Equation (JDE) is the Jacobi equation: note that it is linear, homogeneous, and of
second order.
Observations. It's interesting to compare the Jacobi equation with the Euler-Lagrange equation for problems where the basic integrand $L$ has the general form
$$L(t, x, v) = c_1 v^2 + 2c_2 xv + c_3 x^2 + c_4 v + c_5 x + c_6.$$
(Here we allow $c_k = c_k(t)$.) The Euler-Lagrange equation says, after a little rearrangement,
$$\frac{d}{dt}\bigl[ 2c_1(t)\dot x \bigr] - (2c_3 - 2\dot c_2)\, x = c_5(t) - \dot c_4(t).$$
It's a second-order linear ODE, in which the coefficient function $c_6$ is completely irrelevant and the coefficients $c_4$ and $c_5$ attached to the linear terms in $L$ account for any and all inhomogeneity. For any reference arc $\hat x$, the Jacobi equation for $L$ is
$$\frac{d}{dt}\bigl[ 2c_1(t)\dot y \bigr] - (2c_3 - 2\dot c_2)\, y = 0.$$
This is just the homogeneous counterpart of the Euler-Lagrange equation! In particular, whenever $L$ is purely quadratic, the Jacobi equation and the Euler-Lagrange equation are identical. Of course the situation can be more delicate in general, but there are enough interesting quadratic Lagrangians in the world to make this a fact worth noticing.
Direct Rejection of Non-Minimizers. Any directional local minimizer $y$ in the accessory problem (Q) must satisfy both (JDE) and the corner condition
$$\mathcal{L}_w(t, y(t), \dot y(t^-)) = \mathcal{L}_w(t, y(t), \dot y(t^+)), \quad\text{i.e.,}\quad \hat L_{vv}(t)\dot y(t^-) = \hat L_{vv}(t)\dot y(t^+), \qquad \forall t \in (a, b). \tag{‡}$$
Let's rewrite this fact in contrapositive form: if some arc $y$ with $y(a) = 0 = y(b)$ violates either (JDE) or (‡), then $y$ cannot give a directional local minimum for problem (Q). This implies $\inf(Q) < J[y]$. Combine this finding with the contrapositive form of Theorem A.1, namely, if $\inf(Q) < 0$ then $\hat x$ cannot provide a DLM for (P). The following short statement results.
If some $y \in V_{II}$ obeys $J[y] \le 0$ but fails in either (JDE) or (‡), then the original arc $\hat x$ cannot be a DLM in (P).
The theoretical developments below can be interpreted as a systematic search for
arcs y of the sort described here. But there are cases where an elementary approach
works, too. Here are two examples.
Example 1. Consider $L(t, x, v) = v^2 - x^2$, and an interval $[0, b]$ where $b > \pi$. For any extremal arc $\hat x$ associated with $L$ on $[0, b]$, regardless of its endpoints, we have
$$\hat L_{vv}(t) = 2, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2.$$
Direct calculation will confirm that the arc
$$z(t) = \begin{cases} \sin(t), & 0 \le t < \pi, \\ 0, & \pi \le t \le b, \end{cases}$$
is admissible in (Q), with $J[z] = 0$. However, this $z$ fails to satisfy (‡) at $t = \pi \in (a, b)$. Hence $\inf(Q) < 0$ and the extremal $\hat x$ fails to provide a DLM in problem (P). Since this is true for all extremals, no extremal can provide a DLM. It follows that when $b > \pi$, no instance of (P) has a solution.
////
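The two facts used in Example 1, namely $J[z] = 0$ and the failure of (‡) at $t = \pi$, can be confirmed with a few lines of arithmetic. This is a minimal sketch: the helper `midpoint` is a hypothetical quadrature routine, and only the interval $[0, \pi]$ is integrated because $z$ vanishes identically afterwards.

```python
import math

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# On [0, pi], z = sin t, so the integrand of J is 2*cos(t)^2 - 2*sin(t)^2.
J_z = midpoint(lambda t: 2.0 * math.cos(t)**2 - 2.0 * math.sin(t)**2, 0.0, math.pi)

# Corner condition at t = pi: compare L_vv * z'(pi-) with L_vv * z'(pi+).
L_vv = 2.0
left_limit = L_vv * math.cos(math.pi)   # z'(pi-) = cos(pi)
right_limit = L_vv * 0.0                # z'(pi+) = 0
print(J_z, left_limit, right_limit)
```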
Example 2. Consider $L(t, x, v) = t(v^2 - x^2)$, and an interval $[0, b]$ where $b \ge \pi$. For any extremal arc $\hat x$ associated with $L$ on $[0, b]$, regardless of its endpoints, we have
$$\hat L_{vv}(t) = 2t, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2t.$$
Direct calculation confirms that the admissible arc
$$y(t) = \begin{cases} \sin(t), & 0 \le t < \pi, \\ 0, & \pi \le t \le b, \end{cases}$$
gives $J[y] = 0$, but fails to satisfy (JDE) at most points of $(0, \pi)$. So $y$ cannot be a minimizer in (Q). This implies that $\inf(Q) < J[y] = 0$; by Theorem A.1, the extremal $\hat x$ must fail to provide a DLM in problem (P). Since this is true for all extremals, it follows that when $b \ge \pi$, no instance of (P) has any (directional local) minimizers at all.
////
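For Example 2, both claims about $y(t) = \sin t$ on $[0, \pi]$ can be checked numerically. In the sketch below, `midpoint` and `jacobi_residual` are hypothetical helper names; the residual is the left side of (JDE) with $\hat L_{vv} = 2t$, $\hat L_{xv} = 0$, $\hat L_{xx} = -2t$, and a hand computation suggests it equals $2\cos t$, which vanishes only at $t = \pi/2$.

```python
import math

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# J[y] on [0, pi] with y = sin t: integrand 2t*cos(t)^2 - 2t*sin(t)^2.
J_y = midpoint(lambda t: 2.0 * t * (math.cos(t)**2 - math.sin(t)**2), 0.0, math.pi)

def jacobi_residual(t, h=1e-5):
    """d/dt[2t * y'(t)] + 2t * y(t) for y = sin t; outer derivative by central difference."""
    p = lambda s: 2.0 * s * math.cos(s)      # L_vv(s) * y'(s)
    return (p(t + h) - p(t - h)) / (2.0 * h) + 2.0 * t * math.sin(t)

print(J_y, jacobi_residual(1.0), 2.0 * math.cos(1.0))
```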
We now introduce additional hypotheses that are satisfied in Example 1 above, but not in Example 2. These will eventually support the assertion that any admissible extremal $\hat x$ for the problem in Example 1 actually provides a DLM whenever $b \le \pi$. In contrast, the corresponding statement for Example 2 is false. In Example 2, we certainly need $b < \sqrt{28/3}$ for a DLM to exist in (P) (note $\sqrt{28/3} < \pi$), and there is no reason to believe that this estimate is sharp.
Sidebar. Harry Zheng’s UBC thesis suggests how to generate “by hand” arcs y with
J[y] < 0. His method is to start with an arc that starts from 0 and returns to 0 at
some time c in [a, b), and is stationary for the integral in (Q) restricted to [a, c], and
then to introduce a suitable perturbation. Rosenblueth has followed up on this.
The Strengthened Legendre Condition. The theory of differential equations applies most cleanly to (JDE) in the nonsingular case, when the coefficient $\hat L_{vv}(t)$ of $\dot y(t)$ is nonvanishing throughout $[a, b]$. Since Legendre's condition already limits attention to arcs $\hat x$ satisfying $\hat L_{vv}(t) \ge 0$ throughout $[a, b]$, it seems like a small step to now assume the strengthened Legendre condition,
$$\hat L_{vv}(t) > 0 \qquad \forall t \in [a, b]. \tag{L$^+$}$$
This condition has two useful consequences:
(i) It implies that any extremal arc for the accessory problem (Q) must be $C^1$. (This follows immediately from (‡) above.)
(ii) It ensures uniqueness of solutions for initial-value problems involving (JDE). In particular, for any point $c$ in $[a, b]$, supplementing the differential equation (JDE) with the initial conditions $y(c) = 0$ and $\dot y(c) = 0$ produces a problem whose only solution is the function $y(t) = 0$.
C.1. Proposition. Under $(L^+)$, all nontrivial solutions $y$ of (JDE) with $y(a) = 0$ have the same zeros.
Proof. Pick any two nonzero solutions of (JDE) that vanish at $a$, say $y_1$ and $y_2$. By uniqueness (item (ii) above), nontriviality forces $\dot y_1(a) \ne 0$ and $\dot y_2(a) \ne 0$, so we may define $k = \dot y_1(a)/\dot y_2(a) \ne 0$ and
$$z(t) := y_1(t) - k y_2(t), \qquad t \in [a, b].$$
Now $z$ solves (JDE) with $z(a) = 0$ and $\dot z(a) = 0$, so uniqueness gives $z \equiv 0$, i.e.,
$$y_1(t) = k y_2(t), \qquad \forall t \in [a, b].$$
Since $k \ne 0$, the functions $y_1$ and $y_2$ must have the same zeros.
////
C.2. Definition. Let $\hat x$ be an extremal of class $C^1$ for problem (P). Assume $(L^+)$. A point $c \in (a, b]$ is called conjugate to $a$ [relative to $\hat x$] if Jacobi's differential equation (JDE) has a nontrivial solution $y$ on $[a, c]$ satisfying $y(a) = 0 = y(c)$. In other words, $c > a$ is conjugate to $a$ exactly when the following two-point boundary value problem admits a nontrivial solution $y$:
(ODE) $\dfrac{d}{dt}\Bigl[ \hat L_{vv}(t)\dot y \Bigr] - \Bigl[ \hat L_{xx}(t) - \dfrac{d}{dt}\hat L_{xv}(t) \Bigr] y = 0, \quad a < t < c,$
(BC) $y(a) = 0, \quad y(c) = 0.$
C.3. Theorem (Jacobi). Assume $L \in C^3$. If $\hat x \in C^2$ provides a directional local minimum in the basic problem (P), and $(L^+)$ holds for $\hat x$, then the open interval $(a, b)$ contains no points conjugate to $t = a$ relative to $\hat x$.
Proof. Suppose, on the contrary, that some point $c$ in $(a, b)$ is conjugate to $a$ relative to $\hat x$. Let $y$ satisfy (JDE) nontrivially, with $y(a) = 0 = y(c)$, and define
$$z(t) = \begin{cases} y(t), & \text{if } a \le t \le c, \\ 0, & \text{if } c < t \le b. \end{cases}$$
Notice that since $y(c) = 0$ and $y$ is nontrivial, we have $\dot y(c) \ne 0$. In particular,
$$\dot z(c^-) = \dot y(c) \ne 0 = \dot z(c^+),$$
so $z$ has a corner point at $c$. But $J[z] = 0$. Indeed,
$$J[z] = \int_a^b \mathcal{L}(t, z(t), \dot z(t))\,dt = \int_a^c \Bigl[ \hat L_{vv}(t)\dot y(t)^2 + 2\hat L_{xv}(t) y(t)\dot y(t) + \hat L_{xx}(t) y(t)^2 \Bigr]\,dt. \tag{∗}$$
But $y$ satisfies (JDE) on $[a, c]$, which we multiply by $y$ and integrate:
$$\int_a^c y(t)\frac{d}{dt}\Bigl[ \hat L_{vv}(t)\dot y \Bigr]\,dt - \int_a^c \hat L_{xx}(t) y(t)^2\,dt = -\int_a^c y(t)^2 \frac{d}{dt}\hat L_{xv}(t)\,dt. \tag{∗∗}$$
Integration by parts gives
$$\int_a^c y(t)\frac{d}{dt}\Bigl[ \hat L_{vv}(t)\dot y \Bigr]\,dt = \Bigl[ y(t)\hat L_{vv}(t)\dot y(t) \Bigr]_{t=a}^{c} - \int_a^c \hat L_{vv}(t)\dot y(t)^2\,dt,$$
$$\int_a^c y(t)^2 \frac{d}{dt}\hat L_{xv}(t)\,dt = \Bigl[ y(t)^2 \hat L_{xv}(t) \Bigr]_{t=a}^{c} - \int_a^c \hat L_{xv}(t)\,[2y(t)\dot y(t)]\,dt.$$
The boundary terms are all zero, so using these expressions in (∗∗) gives
$$-\int_a^c \hat L_{vv}(t)\dot y(t)^2\,dt - \int_a^c \hat L_{xx}(t) y(t)^2\,dt = \int_a^c 2\hat L_{xv}(t) y(t)\dot y(t)\,dt,$$
a simple rearrangement of $J[z] = 0$ in (∗).
Now $(L^+)$ implies that extremals in problem (Q) cannot have corners. Since $z$ has a corner, it cannot be extremal. Hence it certainly cannot solve (Q): consequently $\inf(Q) < J[z] = 0$. This contradicts the minimality of $\hat x$, by Theorem A.1 above.
////
In applications, one turns the statement of this theorem inside out. Much as in its proof, one first identifies a smooth extremal arc $\hat x$ for the basic problem and checks condition $(L^+)$. If this is satisfied, one solves (JDE) with $y(a) = 0$, $\dot y(a) \ne 0$ and looks for the first time $c > a$ when $y(c) = 0$. This is the first conjugate point to $t = a$ relative to $\hat x$: if it lies inside the open interval $(a, b)$, Theorem C.3 guarantees that the extremal $\hat x$ is not a directional local minimizer in the basic problem. In order for $\hat x$ to pass this test and remain in the competition for a DLM, it must satisfy Jacobi's Necessary Condition:
There are no points in $(a, b)$ conjugate to $a$ relative to $\hat x$. (J)
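When (JDE) has no closed-form solution, the recipe just described can be carried out numerically. The sketch below is one possible approach, not a method from these notes: it writes (JDE) as $\ddot y = (R y - \dot P \dot y)/P$ with $P = \hat L_{vv}$ and $R = \hat L_{xx} - \frac{d}{dt}\hat L_{xv}$, integrates from $y(a) = 0$, $\dot y(a) = 1$ with the classical fourth-order Runge-Kutta scheme, and reports the first sign change of $y$. The function name `first_conjugate_point` and its parameters are hypothetical; the sample coefficients $P = 2$, $R = -2$ correspond to $L = v^2 - x^2$.

```python
import math

def first_conjugate_point(P, P_dot, R, a, b, steps=20000):
    """First zero in (a, b] of the solution of d/dt[P y'] - R y = 0 with
    y(a) = 0, y'(a) = 1, or None if y has no zero there. Requires P > 0."""
    def rhs(t, y, w):
        # First-order system: y' = w, w' = (R y - P' w) / P.
        return w, (R(t) * y - P_dot(t) * w) / P(t)

    h = (b - a) / steps
    t, y, w = a, 0.0, 1.0
    for _ in range(steps):
        k1y, k1w = rhs(t, y, w)
        k2y, k2w = rhs(t + h / 2, y + h / 2 * k1y, w + h / 2 * k1w)
        k3y, k3w = rhs(t + h / 2, y + h / 2 * k2y, w + h / 2 * k2w)
        k4y, k4w = rhs(t + h, y + h * k3y, w + h * k3w)
        y_new = y + h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        w_new = w + h / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
        if y != 0.0 and y * y_new <= 0.0:
            # Sign change inside this step: locate the zero by linear interpolation.
            return t + h * y / (y - y_new)
        t, y, w = t + h, y_new, w_new
    return None

# Sample data for L = v^2 - x^2 (Jacobi equation y'' + y = 0).
P = lambda t: 2.0
P_dot = lambda t: 0.0
R = lambda t: -2.0
c_long = first_conjugate_point(P, P_dot, R, 0.0, 4.0)    # interval longer than pi
c_short = first_conjugate_point(P, P_dot, R, 0.0, 3.0)   # interval shorter than pi
print(c_long, c_short)
```

Note that the routine also reports a zero located exactly at $t = b$, whereas condition (J) only forbids conjugate points in the open interval $(a, b)$, so a returned value should be compared with $b$ before rejecting an extremal.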
Example. For any $b > \pi$, $A$, and $B$, this problem has no solution:
$$\min\left\{ \int_0^b \bigl[ \dot x^2 - x^2 + 2f(t)x \bigr]\,dt \;:\; x(0) = A,\ x(b) = B \right\}.$$
(The given function $f$ is smooth.)
Proof. Here $a = 0$ and $L(t, x, v) = v^2 - x^2 + 2f(t)x$. The differentiated Euler-Lagrange equation associated with $L$ is
$$\ddot x + x = f(t),$$
with general solution
$$x(t) = c\cos t + d\sin t + x_p(t), \qquad c, d \in \mathbb{R},$$
where $x_p$ is a particular solution of the inhomogeneous equation above. Any particular parameter choices $c = \hat c$, $d = \hat d$, will single out a definite extremal $\hat x$, for which we can calculate
$$\hat L_{vv}(t) = 2, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2.$$
In fact, these functions are independent of which extremal we consider; for any of the possible extremals, the associated Jacobi equation is
$$\ddot y + y = 0.$$
Imposing the boundary condition $y(0) = 0$ produces the family of solutions $y(t) = \alpha\sin t$, $\alpha \in \mathbb{R}$. Thus the point $c = \pi$ is conjugate to 0 relative to any one of the extremals $\hat x$ for the problem. According to Jacobi's theorem, if $c = \pi$ is a point in the open interval $(0, b)$, then every one of those extremals fails to be a true (directional local) minimizer. Since the true minimizers are all to be found among the extremals, there can be no minimum at all.
Note that these considerations do not apply when $0 < b \le \pi$, so the conjugate point is outside the basic open interval. In these cases, Jacobi's theorem makes no promises about the optimality of an admissible extremal $\hat x$: it merely fails to eliminate $\hat x$ from the set of possible minimizers.
////
D. Conjugate Points—Geometry
Linearization. Let $I$ be an open interval containing a point $\hat\alpha$. Suppose $F = \{x(t; \alpha) : \alpha \in I\}$ is a family of extremals for a given Lagrangian $L$, indexed by elements $\alpha$ of the interval $I$. Write $\hat x(t) = x(t; \hat\alpha)$.
For example, when $L(t, x, v) = v^2 - x^2 + 2f(t)x$ as above, possible families of extremals with $I = \mathbb{R}$ include
$$x_1(t; \alpha) = \alpha\sin t + x_p(t), \qquad x_2(t; \alpha) = \alpha\sin t + (1 - \alpha^2)\cos t + x_p(t), \qquad \text{etc.}$$
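D.1. Proposition. In the construction above, suppose $x$ is $C^2$ as a function of the vector variable $(t; \alpha)$. Then the function below satisfies Jacobi's differential equation relative to $\hat x(t) = x(t; \hat\alpha)$:
$$y(t) = \frac{\partial x}{\partial\alpha}(t; \hat\alpha).$$
Proof. For each $\alpha \in I$, the function $t \mapsto x(t; \alpha)$ is extremal, so (DEL) holds:
$$\frac{\partial}{\partial t} L_v\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr) = L_x\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr), \qquad \forall \alpha \in I.$$
Compute $\partial/\partial\alpha$ of both sides, using smoothness to interchange the mixed partial derivatives:
$$\frac{\partial}{\partial t}\Bigl[ L_{vx}\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr)\, x_\alpha(t; \alpha) + L_{vv}\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr)\, \frac{\partial}{\partial t} x_\alpha(t; \alpha) \Bigr]
= L_{xx}\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr)\, x_\alpha(t; \alpha) + L_{xv}\bigl(t, x(t; \alpha), x_t(t; \alpha)\bigr)\, \frac{\partial}{\partial t} x_\alpha(t; \alpha).$$
Now set $\alpha = \hat\alpha$, remembering $\hat x(t) = x(t; \hat\alpha)$ and $y(t) = x_\alpha(t; \hat\alpha)$:
$$\frac{d}{dt}\Bigl[ \hat L_{xv}(t) y(t) + \hat L_{vv}(t)\dot y(t) \Bigr] = \hat L_{xx}(t) y(t) + \hat L_{xv}(t)\dot y(t)
\iff
\frac{d}{dt}\Bigl[ \hat L_{vv}(t)\dot y \Bigr] - \Bigl[ \hat L_{xx}(t) - \frac{d}{dt}\hat L_{xv}(t) \Bigr] y = 0.$$
Thus $y$ satisfies Jacobi's equation relative to $\hat x$, as claimed.
////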
(i) Points where extremal families bunch together. Suppose now that every extremal in an indexed family as above passes through the same initial point $(a, A)$, i.e.,
$$x(a; \alpha) = A \qquad \forall \alpha \in I. \tag{1}$$
Assume also that the family of extremals has nontrivial $\alpha$-dependence near $t = a$, in other words, that
$$\frac{\partial x}{\partial\alpha}(t; \hat\alpha) \not\equiv 0 \quad \text{near } t = a. \tag{2}$$
Then (1) implies that the arc $y(t) = \dfrac{\partial x}{\partial\alpha}(t; \hat\alpha)$ obeys
$$y(a) = \frac{\partial x}{\partial\alpha}(a; \hat\alpha) = \frac{\partial}{\partial\alpha} A = 0,$$
while (2) implies that $y$ is not the zero function. Thus $y$ is a nontrivial solution of Jacobi's equation relative to $\hat x$, and (assuming the strengthened Legendre condition holds along $\hat x$) the point $c$ will be conjugate to $a$ iff
$$y(c) = 0 \iff \frac{\partial x}{\partial\alpha}(c; \hat\alpha) = 0 \iff x(c, \alpha) = \hat x(c) + o(\alpha - \hat\alpha), \quad \alpha \to \hat\alpha.$$
The last equivalence uses the first-order expansion $x(c, \alpha) = x(c, \hat\alpha) + x_\alpha(c, \hat\alpha)(\alpha - \hat\alpha) + o(\alpha - \hat\alpha)$ as $\alpha \to \hat\alpha$. The last condition says that the family of extremals "bunches up" to crowd through a small neighbourhood of the point $(c, \hat x(c))$. If $\hat x$ is a weak local minimizer on $[a, b]$, this cannot happen inside the interval $(a, b)$.
Pictorial Example. Think about paths of minimum arc length on the surface of
a sphere. These can be described by a variational problem in spherical polar coordinates, and extremals turn out to be “great circles”—that is, circles formed by
the intersection of the spherical shell with a plane through the origin. Using the
planet Earth as a model for the sphere, and the north pole as the starting point, the
shortest path to any specified destination follows a single line of longitude on the
globe. These lines emerge from the north pole in every possible direction, but they
all cluster together again at the south pole. Thus the south pole is a conjugate point
to the north pole. The shortest arc from the north pole to UBC follows the longitude
123°15′ W: it passes over the Arctic Ocean, the Northwest Territories, and northern
BC. The opposite longitude, about 56°45′ E, connects the North pole to UBC via
Russia, the Ural Mountains, Iran, Oman, the Seychelles, Mauritius, the South Pole,
a very long stretch of water, California, Oregon, and Washington State. Clearly the
latter path is NOT the shortest one—even in a weak local sense. The mathematical
clue to this intuitively obvious fact is the conjugate point at the south pole.
(ii) Contact points with an envelope of extremals. Let $F = \{x(t; \alpha) : \alpha \in I\}$ be any family of functions indexed by some open real interval $I$. A smooth curve $x = e(t)$ is an envelope of the family $F$ if every curve $x$ in $F$ provides a unique solution for $t$ in the system of equations
$$x(t) = e(t), \qquad \dot x(t) = \dot e(t).$$
(These equations characterize the instant when the graphs of $x$ and $e$ meet tangentially.) In other words, we are asking that every $\alpha \in I$ come with a unique contact time $t = t(\alpha)$ such that
(1) $x(t(\alpha); \alpha) = e(t(\alpha))$,
(2) $x_t(t(\alpha); \alpha) = \dot e(t(\alpha))$.
If the system (1)–(2) defines $t$ as a differentiable function of $\alpha$, and if $x$ is a smooth function of the vector variable $(t, \alpha)$, then we can find the envelope curve by taking $d/d\alpha$ in line (1) and using (2) to simplify:
$$\begin{aligned}
0 &= \frac{d}{d\alpha}\bigl[ x(t(\alpha); \alpha) - e(t(\alpha)) \bigr] \\
&= x_t(t(\alpha); \alpha)\frac{dt}{d\alpha} + x_\alpha(t(\alpha); \alpha) - \dot e(t(\alpha))\frac{dt}{d\alpha} \\
&= x_\alpha(t(\alpha); \alpha).
\end{aligned} \tag{3}$$
Equations (3) and (1) are valid for all $\alpha$ in $I$: written together, in the form "$0 = x_\alpha(t; \alpha)$, $e = x(t; \alpha)$", they provide a parametric description of the envelope curve in the $(t, e)$-plane. In some special cases one can solve for $\alpha = \alpha(t)$ in (3), and then determine a closed form expression for the envelope curve using (1):
$$e(t) = x(t; \alpha(t)).$$
Example. The envelope of the family of straight lines $x(t; \alpha) = \csc\alpha - t\cot\alpha$, $\alpha \in (0, \pi)$, is the top half of the unit circle.
Proof. Observe that
$$x(t; \alpha) = \frac{1 - t\cos\alpha}{\sin\alpha}, \qquad x_\alpha(t; \alpha) = \frac{(t\sin\alpha)\sin\alpha - (1 - t\cos\alpha)\cos\alpha}{\sin^2\alpha} = \frac{t - \cos\alpha}{\sin^2\alpha}.$$
Thus the pair of equations (1),(3) says
$$e = \frac{1 - t\cos\alpha}{\sin\alpha}, \qquad 0 = t - \cos\alpha.$$
A parametric description of the envelope curve, with parameter $\alpha$, is therefore
$$t = \cos\alpha, \qquad e = \frac{1 - \cos^2\alpha}{\sin\alpha} = \sin\alpha, \qquad 0 < \alpha < \pi.$$
This represents the top half of the unit circle in the $(t, e)$-plane.
////
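The tangency conditions (1) and (2) for this example can be spot-checked numerically. The sketch below uses hypothetical helper names `line` and `contact_gap`; for a few sample values of $\alpha$ it evaluates, at the contact time $t = \cos\alpha$, the gap between the line and the upper unit semicircle $e(t) = \sqrt{1 - t^2}$, both in value and in slope.

```python
import math

def line(t, alpha):
    """Member of the family x(t; alpha) = csc(alpha) - t*cot(alpha)."""
    return 1.0 / math.sin(alpha) - t / math.tan(alpha)

def contact_gap(alpha):
    """Return (value gap, slope gap) between the line and the upper unit
    semicircle at the contact time t = cos(alpha)."""
    t = math.cos(alpha)
    circle = math.sqrt(1.0 - t * t)        # e(t)
    circle_slope = -t / circle             # e'(t)
    line_slope = -1.0 / math.tan(alpha)    # x_t(t; alpha), independent of t
    return line(t, alpha) - circle, line_slope - circle_slope

gaps = [contact_gap(alpha) for alpha in (0.5, 1.0, 2.0, 2.5)]
print(gaps)
```

Both gaps should vanish up to rounding, which is exactly the statement that each line touches the semicircle tangentially.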
Now suppose that each curve $x(t; \alpha)$ in the family $F$ above is an extremal for some given Lagrangian $L$, and that each one satisfies the initial condition
$$x(a; \alpha) = A, \qquad \forall \alpha \in I. \tag{4}$$
If the family has an envelope curve $E$, then we have the following result.
D.2. Proposition. Fix $\hat\alpha \in I$, and denote $\hat x(t) = x(t; \hat\alpha)$. Suppose that $x(t; \alpha)$ is $C^3$ as a function of the vector variable $(t, \alpha)$, and that the strengthened Legendre condition holds along $\hat x$. Then the time $\hat t = t(\hat\alpha)$ at which the extremal arc $\hat x$ meets the envelope curve $E$ is conjugate to $a$ relative to $\hat x$.
Proof. We already know that the function $y(t) = x_\alpha(t; \hat\alpha)$ is a solution of Jacobi's differential equation satisfying $y(a) = 0$. (See the arguments above.) Equation (3) above, with $\alpha = \hat\alpha$, gives $y(\hat t) = 0$.
////
E. An Example
This whole section is devoted to the following example and its near relatives:
$$\min\left\{ \int_0^b \sqrt{x(t)\,[1 + \dot x(t)^2]}\,dt \;:\; x(0) = 1,\ x(b) = B,\ x(t) > 0 \right\}.$$
Here $L(t, x, v) = \sqrt{x}\sqrt{1 + v^2}$ has
$$L_v(t, x, v) = \sqrt{x}\,\frac{v}{\sqrt{1 + v^2}}, \qquad L_{vv}(t, x, v) = \sqrt{x}\,\frac{1}{(1 + v^2)^{3/2}}, \qquad L - L_v \cdot v = \sqrt{\frac{x}{1 + v^2}}.$$
Since $L_{vv}(t, x, v) > 0$ for all $v$ whenever $x > 0$, it follows that any positive-valued extremal arc $x$ must be of class $C^2$.
Since $L_t \equiv 0$, condition (WE2) implies that the function $L - L_v \cdot v$ must be constant along extremals. This constant must be positive; writing it as $1/\sqrt{m}$ for convenience leads to a first-order equation that any solution must satisfy:
$$\sqrt{\frac{x(t)}{1 + \dot x(t)^2}} = \sqrt{\frac{1}{m}}, \quad\text{i.e.,}\quad \dot x(t)^2 = m x(t) - 1. \tag{∗}$$
On any open interval where $\dot x(t) > 0$, (∗) implies
$$\dot x(t) = \sqrt{m x(t) - 1} \implies \frac{dx}{\sqrt{mx - 1}} = dt \implies \frac{2}{m}\sqrt{mx - 1} = t + C_+,$$
for some constant $C_+$, so
$$x(t) = \frac{1}{m} + \frac{m}{4}(t + C_+)^2.$$
(Note that $\dot x(t) = m(t + C_+)/2$ on such an interval.)
On any open interval where $\dot x(t) < 0$, (∗) implies
$$\dot x(t) = -\sqrt{m x(t) - 1} \implies \frac{dx}{\sqrt{mx - 1}} = -dt \implies \frac{2}{m}\sqrt{mx - 1} = -t - C_-,$$
for some constant $C_-$, so
$$x(t) = \frac{1}{m} + \frac{m}{4}(t + C_-)^2.$$
(Note that $\dot x(t) = m(t + C_-)/2$ on such an interval.)
If the sign of $\dot x$ changes at time $t^*$, the continuity of $\dot x$ requires that
$$\frac{m}{2}(t^* + C_-) = \frac{m}{2}(t^* + C_+), \quad\text{hence}\quad C_- = C_+.$$
Thus the single expression
$$x(t) = \frac{1}{m} + \frac{m}{4}(t + C)^2 = \frac{m}{4}t^2 + \frac{mC}{2}t + \frac{1}{m} + \frac{mC^2}{4}$$
covers both possible cases, and provides a complete description of all possible positive-valued extremals for this integrand. Let $\alpha = mC$: then a convenient two-parameter family of extremals for the problem is
$$x(t; \alpha, m) = \frac{m}{4}t^2 + \frac{\alpha}{2}t + \frac{4 + \alpha^2}{4m}, \qquad \alpha, m \in \mathbb{R}. \tag{‡}$$
Imposing the initial condition $1 = x(0)$ leads to $m = (4 + \alpha^2)/4$ and produces the one-parameter family
$$x(t; \alpha) = \frac{4 + \alpha^2}{16}t^2 + \frac{\alpha}{2}t + 1, \qquad \alpha \in \mathbb{R}. \tag{†}$$
Notice that $x_t(0; \alpha) = \alpha/2$, so extremals with different parameter values have different initial slopes, and are therefore distinct. (Observe also that
$$x(t; \alpha) = \frac{f(\alpha + t f(\alpha))}{f(\alpha)}$$
holds for $f(z) = 1 + z^2/4$: a similar expression has already been observed in the minimal surface of revolution problem, where the choice $f(z) = \cosh z$ generates all extremals through the same initial point.)
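To identify the extremals, if any, passing through the terminal point $(b, B)$, we must choose the parameter $\alpha$ for which
$$B = \frac{4 + \alpha^2}{16}b^2 + \frac{\alpha}{2}b + 1,$$
i.e.,
$$0 = \frac{b^2}{16}\alpha^2 + \frac{b}{2}\alpha + 1 - B + \frac{b^2}{4} = \frac{b^2}{16}\left[ \alpha^2 + \frac{8}{b}\alpha + \frac{16}{b^2}\left( 1 - B + \frac{b^2}{4} \right) \right].$$
The quadratic formula gives
$$\alpha = \frac{-4}{b} \pm \sqrt{\left( \frac{-4}{b} \right)^2 - \frac{16}{b^2}\left( 1 - B + \frac{b^2}{4} \right)} = \frac{-4}{b}\left[ 1 \mp \sqrt{B - \frac{b^2}{4}} \right].$$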
Thus there are three cases to consider:
(1) If $B < b^2/4$, then there is no choice of $\alpha$ for which the extremal $x(t; \alpha)$ passes through the target point. The problem has no admissible extremals, and hence there is no minimum.
(2) If $B = b^2/4$, then there is exactly one admissible extremal, determined by taking $\alpha = -4/b$.
(3) If $B > b^2/4$, then there are two admissible extremals, with parameter values
$$\alpha_+ = \frac{-4}{b}\left[ 1 + \sqrt{B - b^2/4} \right], \qquad \alpha_- = \frac{-4}{b}\left[ 1 - \sqrt{B - b^2/4} \right].$$
(Note that $\alpha_+ < 0$ always, whereas $\alpha_-$ can have either sign.)
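The three cases translate directly into a small routine. The sketch below uses hypothetical function names `extremal` and `admissible_alphas`; it returns the parameter values in the order $[\alpha_+, \alpha_-]$ and is tried on the target point $(b, B) = (1.6, 1)$, which also appears in the discussion of Figure E.1.

```python
import math

def extremal(t, alpha):
    """One-parameter family x(t; alpha) of extremals through (0, 1)."""
    return (4.0 + alpha**2) / 16.0 * t**2 + alpha / 2.0 * t + 1.0

def admissible_alphas(b, B):
    """Parameter values whose extremal passes through (b, B), for b > 0."""
    disc = B - b**2 / 4.0
    if disc < 0.0:
        return []                       # case (1): target below the parabola B = b^2/4
    if disc == 0.0:
        return [-4.0 / b]               # case (2): exactly one extremal
    r = math.sqrt(disc)
    return [-4.0 / b * (1.0 + r), -4.0 / b * (1.0 - r)]   # case (3): [alpha_plus, alpha_minus]

alphas = admissible_alphas(1.6, 1.0)
print(alphas, [extremal(1.6, al) for al in alphas])
```

In floating point the equality test for case (2) is fragile, so a tolerance may be preferable in practice.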
Second Order Analysis.
(i) The Legendre Condition. Since $L_{vv}(t, x, v) > 0$ for all $x > 0$, $t$, and $v$, the strengthened Legendre condition $(L^+)$ is sure to hold along any positive-valued extremal arc.
(ii) The Jacobi Condition. Given a target point $(\hat b, \hat B)$, fix a parameter $\hat\alpha$ for which the arc $\hat x(t) = x(t; \hat\alpha)$ passes through it.
Method 1. Define $q(t) = 16\hat x(t) = 16 + 8\hat\alpha t + (4 + \hat\alpha^2)t^2$. Onerous calculations reveal that
$$\hat L_{xx}(t) = \frac{-2\sqrt{4 + \hat\alpha^2}}{q(t)}, \qquad \hat L_{xv}(t) = \frac{2\bigl[ 4\hat\alpha + (4 + \hat\alpha^2)t \bigr]}{q(t)\sqrt{4 + \hat\alpha^2}}, \qquad \hat L_{vv}(t) = \frac{128}{(4 + \hat\alpha^2)^{3/2}\, q(t)}.$$
Further calculations unleash Jacobi's differential equation relative to $\hat x$:
$$\begin{aligned}
0 &= q(t)\ddot y - \dot q(t)\dot y + 2(4 + \hat\alpha^2)\, y \\
&= \bigl[ 16 + 8\hat\alpha t + (4 + \hat\alpha^2)t^2 \bigr]\ddot y - \bigl[ 8\hat\alpha + 2(4 + \hat\alpha^2)t \bigr]\dot y + 2(4 + \hat\alpha^2)\, y.
\end{aligned}$$
Nontrivial solutions to this equation are not immediately apparent.
Method 2. In the two-parameter family (‡), every curve is an extremal. The extremal $\hat x$ identified above is associated with the values $\alpha = \hat\alpha$ and $m = \hat m := (4 + \hat\alpha^2)/4$. Fixing either one of these values leaves a one-parameter family of extremals indexed by the other, to which the theory developed above applies. So two solutions of Jacobi's equation relative to $\hat x$ are
$$y_1(t) = \frac{\partial}{\partial\alpha} x(t; \alpha, \hat m)\Big|_{\alpha = \hat\alpha} = \left[ \frac{t}{2} + \frac{\alpha}{2\hat m} \right]_{\alpha = \hat\alpha} = \frac{t}{2} + \frac{2\hat\alpha}{4 + \hat\alpha^2},$$
$$y_2(t) = \frac{\partial}{\partial m} x(t; \hat\alpha, m)\Big|_{m = \hat m} = \left[ \frac{t^2}{4} - \frac{4 + \hat\alpha^2}{4m^2} \right]_{m = \hat m} = \frac{t^2}{4} - \frac{4}{4 + \hat\alpha^2}.$$
These solutions can be confirmed by direct substitution in Jacobi's equation above, but we must stress that they were found directly from (‡): the explicit form of Jacobi's equation is not required to construct them! Clearly $y_1$ and $y_2$ are linearly independent, so their linear combinations give rise to the general solution of Jacobi's equation above. In order to find conjugate points, however, we need a solution with initial value $y(0) = 0$. To construct such a solution, one may either consider a suitable linear combination of $y_1$ and $y_2$, or else use the same method just described on the one-parameter family of extremals (†), for which every curve obeys the initial condition. The latter approach yields the solution
$$y(t) = \frac{\partial}{\partial\alpha}\left[ 1 + \frac{\alpha}{2}t + \frac{4 + \alpha^2}{16}t^2 \right]_{\alpha = \hat\alpha} = \frac{t}{2} + \frac{\hat\alpha}{8}t^2 = \frac{\hat\alpha t}{8}\left[ \frac{4}{\hat\alpha} + t \right].$$
(Note that $y = y_1 + (\hat\alpha/2)y_2$, as the former approach could have produced directly.) It follows that there is a point conjugate to $t = 0$ relative to the arc $\hat x$ only if $\hat\alpha < 0$, and in this case the unique conjugate point is $\hat t = -4/\hat\alpha$.
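The claim that $y_1$, $y_2$, and $y = y_1 + (\hat\alpha/2) y_2$ solve Jacobi's equation $q\ddot y - \dot q\dot y + 2(4 + \hat\alpha^2)y = 0$ can be confirmed in exact rational arithmetic. This is a minimal sketch with hypothetical helper names; each candidate is supplied together with its first and second derivatives, computed by hand, and the residual is evaluated at a few rational sample points.

```python
from fractions import Fraction

def jacobi_residual(y, y_dot, y_ddot, t, alpha):
    """Left side of q*y'' - q'*y' + 2*(4 + alpha^2)*y evaluated at time t."""
    q = 16 + 8 * alpha * t + (4 + alpha**2) * t**2
    q_dot = 8 * alpha + 2 * (4 + alpha**2) * t
    return q * y_ddot(t) - q_dot * y_dot(t) + 2 * (4 + alpha**2) * y(t)

def candidate_solutions(alpha):
    """(y, y', y'') triples for y1, y2 and the combination y = y1 + (alpha/2)*y2."""
    k = 4 + alpha**2
    return {
        "y1": (lambda t: t / 2 + 2 * alpha / k,
               lambda t: Fraction(1, 2),
               lambda t: Fraction(0)),
        "y2": (lambda t: t**2 / 4 - Fraction(4) / k,
               lambda t: t / 2,
               lambda t: Fraction(1, 2)),
        "y": (lambda t: t / 2 + alpha * t**2 / 8,
              lambda t: Fraction(1, 2) + alpha * t / 4,
              lambda t: alpha / 4),
    }

alpha_hat = Fraction(-4)          # sample parameter value (an assumption)
samples = [Fraction(0), Fraction(1, 3), Fraction(5, 2)]
residuals = {
    name: [jacobi_residual(y, yd, ydd, t, alpha_hat) for t in samples]
    for name, (y, yd, ydd) in candidate_solutions(alpha_hat).items()
}
print(residuals)
```

Because the arithmetic is exact, every residual should be identically zero rather than merely small.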
Now in case (2) above, we had $\hat\alpha = -4/b$, so $b = -4/\hat\alpha = \hat t$. The conjugate point falls exactly at the right endpoint of the interval $[0, b]$. This situation is permitted by Jacobi's necessary condition, which only precludes conjugate points in the open interval $(0, b)$. So the unique extremal in case (2) satisfies our second-order necessary conditions.
In case (3), however, our formula for $\hat\alpha$ can be rearranged to give
$$b = \frac{-4}{\alpha_\pm}\left[ 1 \pm \sqrt{B - b^2/4} \right] = t_\pm\left[ 1 \pm \sqrt{B - b^2/4} \right].$$
Choosing the positive sign here reveals $b > t_+ > 0$, so the conjugate point falls into the open interval $(0, b)$ and the corresponding extremal violates Jacobi's condition. It cannot be a minimizer. Choosing the negative sign instead reveals $b < t_-$, so the conjugate point lies well outside the closed interval $[0, b]$ and Jacobi's condition is satisfied. This extremal is not discardable: advanced techniques show that it is indeed a true minimizer.
Envelopes. Consider the one-parameter family of extremals (†). This family forms an envelope, as we can show by considering the system
(1) $e = x = 1 + \dfrac{\alpha}{2}t + \dfrac{4 + \alpha^2}{16}t^2$,
(2) $0 = \dfrac{\partial x}{\partial\alpha} = \dfrac{1}{2}t + \dfrac{\alpha}{8}t^2$.
In this easy case, (2) gives $\alpha = -4/t$ and substitution in (1) then gives
$$e = 1 + (-2) + \frac{1}{16}\left( 4 + \frac{16}{t^2} \right)t^2 = \frac{1}{4}t^2.$$
Thus the curve $x = e(t) := \frac{1}{4}t^2$ provides an envelope for the family of extremals above. This envelope curve coincides with the boundary of our case analyses above: it is precisely the boundary line dividing the target points hit by two extremals from those hit by no extremals at all. For an endpoint $(b, B)$ above the envelope curve, the extremal $x(t; \alpha_+)$ has conjugate point $t_+ = -4/\alpha_+$ in $(0, b)$, so it is not optimal. Only the extremal $x(t; \alpha_-)$ can possibly give the minimum.
The situation is accurately depicted in Figure E.1, where the envelope curve $x = t^2/4$ is shown together with several members of the extremal family (†). For the particular endpoint $(b, B) = (1.6, 1)$, the appropriate parameter values turn out to be $\alpha_+ = -4$ and $\alpha_- = -1$. The corresponding times when these curves touch the envelope are $t_+ = 1$ and $t_- = 4$. Thus $\alpha_+$ generates the lower curve, which is not optimal, while $\alpha_-$ generates the upper curve, which can be shown to be optimal using techniques to be described later.
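The tangential contact between each extremal in (†) and the envelope $e(t) = t^2/4$ at time $-4/\alpha$ can be verified exactly. The sketch below uses hypothetical function names and rational arithmetic; it checks both the values and the slopes at the contact time for a few negative parameter values, including the two from the Figure E.1 discussion.

```python
from fractions import Fraction

def x(t, alpha):
    """Extremal family (dagger): x(t; alpha) = (4 + alpha^2) t^2 / 16 + alpha t / 2 + 1."""
    return (4 + alpha**2) * t**2 / 16 + alpha * t / 2 + 1

def x_t(t, alpha):
    """Time derivative of x(t; alpha)."""
    return (4 + alpha**2) * t / 8 + alpha / 2

def envelope(t):
    return t**2 / 4

def envelope_slope(t):
    return t / 2

def contact_time(alpha):
    """Time at which x(.; alpha) touches the envelope; positive only for alpha < 0."""
    return -4 / alpha

checks = []
for alpha in (Fraction(-4), Fraction(-1), Fraction(-7, 3)):
    t_hat = contact_time(alpha)
    checks.append((alpha, t_hat,
                   x(t_hat, alpha) == envelope(t_hat),
                   x_t(t_hat, alpha) == envelope_slope(t_hat)))
print(checks)
```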
Remarks. Using the function $f(z) = 1 + z^2/4$, the family of extremals in (†) can be expressed succinctly as
$$x(t; \alpha) = \frac{f(\alpha + t f(\alpha))}{f(\alpha)}.$$
For the minimum surface of revolution problem, where the integrand $L = x\sqrt{1 + v^2}$ is very similar to the one just treated, it turns out that the family of extremals obeying $x(0) = 1$ can be expressed similarly, just by replacing $f$ with $\cosh$ in the formula above. Since both $f$ and $\cosh$ are even convex functions with comparatively rapid
growth and global minimum values of 1 at the origin, it seems reasonable to expect a
certain similarity in the qualitative features of the results for these two problems. This
turns out to be the case: extremals in the minimum surface of revolution problem do
exist for some target points and not for others, and above a certain envelope curve
[not available in closed form] each point is hit by two extremals emanating from
(0, 1). The lower one of these extremals touches the envelope before reaching the
target point, and therefore fails to be even a local minimizer.
[Figure E.1: Quadratic Envelope Curve and Extremals. Horizontal axis $t$ (with $t_+$ marked), vertical axis $x$.]