IV. The Second Variation
© UBC M402 lecture notes 2015 by Philip D. Loewen

File “var2”, version of 27 Feb 2015. Typeset at 14:19 February 27, 2015.

A. Second-Order Necessary Conditions

Consider the basic problem
\[ \min\left\{ \Lambda[x] := \int_a^b L(t, x(t), \dot x(t))\,dt \;:\; x(a) = A,\ x(b) = B \right\}. \tag{P} \]
If $\hat x$ gives a directional local minimum, and $y$ is any arc in
\[ V_{II} = \{ y \in PWS[a,b] : y(a) = 0 = y(b) \}, \]
then the function $g\colon \mathbb{R} \to \mathbb{R}$ defined by $g(\lambda) := \Lambda[\hat x + \lambda y]$ must have a local minimum at the point $\lambda = 0$. Thus $g'(0) = 0$, which gives (IEL) and all the theory developed so far. But also $g''(0) \ge 0$, which we now investigate.

Suppose $L$, $L_x$, and $L_v$ are all $C^1$. Then
\[ g(\lambda) = \int_a^b L\big(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\big)\,dt \]
\[ \Rightarrow\quad g'(\lambda) = \int_a^b \Big[ L_x\big(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\big)\, y(t) + L_v\big(t, \hat x(t) + \lambda y(t), \dot{\hat x}(t) + \lambda \dot y(t)\big)\, \dot y(t) \Big]\,dt \]
\[ \Rightarrow\quad g''(0) = \int_a^b \frac{\partial}{\partial\lambda}\big[\cdots\big]_{\lambda=0}\,dt = \int_a^b \Big[ \big( \hat L_{xx}(t) y(t) + \hat L_{xv}(t) \dot y(t) \big) y(t) + \big( \hat L_{vx}(t) y(t) + \hat L_{vv}(t) \dot y(t) \big) \dot y(t) \Big]\,dt \]
\[ = \int_a^b \Big[ \hat L_{vv}(t) \dot y(t)^2 + 2 \hat L_{xv}(t) y(t) \dot y(t) + \hat L_{xx}(t) y(t)^2 \Big]\,dt. \]
Since $g''(0) \ge 0$ must hold for any fixed $y$, we have the following result.

A.1. Theorem. Assume that $L$, $L_x$, and $L_v$ are of class $C^1$. Let $\hat x$ provide a directional local minimum in problem (P). Then for any arc $y$ in $V_{II}$, one has
\[ 0 \le J[y] := \int_a^b \Big[ \hat L_{vv}(t) \dot y(t)^2 + 2 \hat L_{xv}(t) y(t) \dot y(t) + \hat L_{xx}(t) y(t)^2 \Big]\,dt. \]
In particular, the arc $\hat y(t) = 0$ provides a global minimum in the so-called accessory problem
\[ \min_{y \in PWS[a,b]} \big\{ J[y] : y(a) = 0 = y(b) \big\}. \tag{Q} \]

The accessory problem introduced in the statement of Theorem A.1 is purely quadratic, with an integrand
\[ \mathcal{L}(t, y, w) = \alpha(t) w^2 + 2\beta(t) y w + \gamma(t) y^2 \]
whose coefficients are determined by second derivatives of the original integrand $L$ along the given arc $\hat x$:
\[ \alpha(t) = \hat L_{vv}(t), \qquad \beta(t) = \hat L_{vx}(t), \qquad \gamma(t) = \hat L_{xx}(t). \]
Assuming that $L, L_x, L_v \in C^1$ makes these coefficients piecewise continuous functions of $t$ in $[a, b]$.

Observations.

1.
For problems where the basic integrand $L$ has the general form
\[ L(t, x, v) = c_1 v^2 + 2 c_2 x v + c_3 x^2 + c_4 v + c_5 x + c_6, \]
the partial derivatives needed to set up the accessory problem are all independent of whatever reference arc $\hat x$ is given. It’s easy to calculate them, and arrive at the accessory integrand
\[ \mathcal{L}(t, y, w) = 2\left( c_1 w^2 + 2 c_2 y w + c_3 y^2 \right). \]
In particular, if $L$ is a purely quadratic function of $(x, v)$, i.e., $c_4 = 0$, $c_5 = 0$, and $c_6 = 0$, then $\mathcal{L}(t, y, w) = 2L(t, y, w)$, so $J[y] = 2\Lambda[y]$. Thus the only consequential difference between (P) and (Q) for purely quadratic problems is that (Q) imposes zero endpoint values. Note that this observation remains valid even when the coefficients $c_k = c_k(t)$ are smooth functions depending only on $t$.

2. The constant arc $y(t) = 0$ is admissible in (Q), so the inequality $\inf(Q) \le J[0] = 0$ is obvious. Theorem A.1 is useful because it provides the opposite inequality, namely $\inf(Q) \ge 0$, so that $\inf(Q) = 0$.

3. The only alternative to $\inf(Q) = 0$ is $\inf(Q) = -\infty$. Indeed, if $\inf(Q) \ne 0$ then (since we know $\inf(Q) \le 0$) there must be some admissible arc $y_1$ such that $J[y_1] < 0$. But then we can define a sequence of arcs $y_n(t) = n y_1(t)$, $n \in \mathbb{N}$: each $y_n$ satisfies the boundary conditions, and since $J$ is purely quadratic,
\[ J[y_n] = J[n y_1] = n^2 J[y_1] \to -\infty \quad \text{as } n \to \infty. \]
Thus $\inf(Q) = -\infty$, as claimed.

B. Legendre’s Condition

Inspired by the findings above, we consider a variational problem with purely quadratic integrand:
\[ \min_{y \in PWS} \left\{ J[y] := \int_a^b \Big( \alpha(t) \dot y(t)^2 + 2\beta(t) \dot y(t) y(t) + \gamma(t) y(t)^2 \Big)\,dt \;:\; y(a) = 0 = y(b) \right\}. \tag{Q} \]
We assume the coefficient functions $\alpha, \beta, \gamma\colon [a, b] \to \mathbb{R}$ are piecewise continuous. The notation is deliberately aligned with that in the accessory problem above, but here we treat (Q) strictly on its own merits.

B.1. Lemma. If $\alpha(\tau) < 0$ for some $\tau \in [a, b]$ where $\alpha$ is continuous, then $\inf(Q) = -\infty$. In particular, the arc $\hat y \equiv 0$ is not a DLM in problem (Q).

Proof. We may assume that the point $\tau$ mentioned in the statement is not an endpoint of $[a, b]$.
(Indeed, knowing only that $\alpha$ is continuous at point $b$ with $\alpha(b) < 0$ is enough to deduce that there is some nearby point $\tau < b$ where the hypotheses hold. The situation is similar at $a$.) Define $c = -\tfrac12 \alpha(\tau)$: then $c > 0$ and $\alpha(\tau) = -2c$, so, by continuity of $\alpha$ at $\tau$, there exists some constant $\varepsilon > 0$ so small that $(\tau - \varepsilon, \tau + \varepsilon)$ is a subinterval of $(a, b)$, and each $t$ in this subinterval gives $\alpha(t) \le -c$. Then consider, for integers $n > 1/\varepsilon$, the sequence of arcs $y_n$ defined by
\[ y_n(t) = \begin{cases} 0, & \text{if } a \le t < \tau - 1/n, \\ n\,(t - (\tau - 1/n)), & \text{if } \tau - 1/n \le t < \tau, \\ n\,((\tau + 1/n) - t), & \text{if } \tau \le t < \tau + 1/n, \\ 0, & \text{if } \tau + 1/n \le t \le b. \end{cases} \]
(Sketch!) Whenever $n > 1/\varepsilon$, each $y_n$ is admissible in (Q), and the sketch shows that
\[ |\dot y_n(t)| = \begin{cases} n, & \text{if } 0 < |t - \tau| < 1/n, \\ 0, & \text{otherwise,} \end{cases} \qquad \int_a^b |y_n(t)|\,dt = \int_{\tau - 1/n}^{\tau + 1/n} y_n(t)\,dt = \frac{1}{n}. \]
Furthermore, piecewise continuity makes both constants $\bar\beta = \max_{a \le t \le b} |\beta(t)|$ and $\bar\gamma = \max_{a \le t \le b} |\gamma(t)|$ finite. Using these, we estimate the three terms in the cost functional of (Q). First,
\[ J_1[y_n] := \int_a^b \alpha(t) \dot y_n(t)^2\,dt = \int_{\tau - 1/n}^{\tau + 1/n} n^2 \alpha(t)\,dt \le n^2 (-c)(2/n) = -2cn. \]
Second, thanks to the observations above,
\[ J_2[y_n] := \int_a^b 2\beta(t) \dot y_n(t) y_n(t)\,dt \le \left| \int_a^b 2\beta(t) \dot y_n(t) y_n(t)\,dt \right| \le \int_{\tau - 1/n}^{\tau + 1/n} |2\beta(t) \dot y_n(t) y_n(t)|\,dt \le 2n\bar\beta \int_{\tau - 1/n}^{\tau + 1/n} y_n(t)\,dt = 2\bar\beta. \]
Third,
\[ J_3[y_n] := \int_a^b \gamma(t) y_n(t)^2\,dt \le \left| \int_a^b \gamma(t) y_n(t)^2\,dt \right| \le \bar\gamma \int_{\tau - 1/n}^{\tau + 1/n} |y_n(t)|^2\,dt \le \bar\gamma\,(2/n). \]
Combining these three estimates, we find that
\[ J[y_n] = J_1[y_n] + J_2[y_n] + J_3[y_n] \le -2cn + 2\bar\beta + 2\bar\gamma/n \qquad \forall n > 1/\varepsilon. \]
Clearly, as $n \to \infty$ we have $J[y_n] \to -\infty$, and this gives $\inf(Q) = -\infty$.

To get the DLM conclusion, focus on any single $N \in \mathbb{N}$ so large that $J[y_N] < 0$. Then for any scalar $\lambda \ne 0$, no matter how small,
\[ J[\hat y + \lambda y_N] = J[\lambda y_N] = \lambda^2 J[y_N] < 0 = J[\hat y]. \]
Thus $y_N$ provides a direction in which the arc $\hat y = 0$ does not provide a one-dimensional local minimum. ////
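The estimate in the proof of Lemma B.1 can be watched numerically. What follows is a minimal sketch, not part of the original notes: the coefficients $\alpha \equiv -1$, $\beta(t) = t$, $\gamma \equiv 1$ on $[0, 1]$ with $\tau = 1/2$ are illustrative assumptions, the function names `tent`, `J`, and `proof_bound` are hypothetical, and the integral is approximated by a plain midpoint rule. For these particular coefficients a short hand calculation gives $J[y_n] = -2n$, which sits below the bound $-2cn + 2\bar\beta + 2\bar\gamma/n$ with $c = 1/2$.

```python
def tent(n, tau):
    """Return the tent arc y_n from the proof and its slope (defined away from the kinks)."""
    def y(t):
        return max(0.0, 1.0 - n * abs(t - tau))

    def ydot(t):
        if abs(t - tau) >= 1.0 / n:
            return 0.0
        return float(n) if t < tau else float(-n)

    return y, ydot


def J(n, alpha, beta, gamma, tau, steps=20000):
    """Midpoint-rule approximation of J[y_n].

    The integrand vanishes outside [tau - 1/n, tau + 1/n], so only that
    interval is sampled.
    """
    y, ydot = tent(n, tau)
    lo, hi = tau - 1.0 / n, tau + 1.0 / n
    h = (hi - lo) / steps  # steps is even, so the kink at tau is a cell boundary
    total = 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * h
        total += (alpha(t) * ydot(t) ** 2
                  + 2.0 * beta(t) * ydot(t) * y(t)
                  + gamma(t) * y(t) ** 2) * h
    return total


# Illustrative coefficients on [a, b] = [0, 1] with tau = 1/2 (assumptions, not from the notes).
alpha = lambda t: -1.0   # alpha(tau) = -1, so c = 1/2 and alpha <= -c everywhere
beta = lambda t: t       # max |beta| = 1 on [0, 1]
gamma = lambda t: 1.0    # max |gamma| = 1 on [0, 1]


def proof_bound(n, c=0.5, beta_bar=1.0, gamma_bar=1.0):
    """Upper bound -2cn + 2*beta_bar + 2*gamma_bar/n from the proof."""
    return -2.0 * c * n + 2.0 * beta_bar + 2.0 * gamma_bar / n


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, J(n, alpha, beta, gamma, 0.5), proof_bound(n))
```

Here $\varepsilon = 1/2$, so only $n > 2$ gives an admissible tent on $[0, 1]$; the printed values of $J[y_n]$ decrease without bound as $n$ grows.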
The general fact about quadratic variational problems just established has useful consequences for the study of the accessory problem introduced in Section A.

B.2. Theorem (Legendre). Under the hypotheses of Theorem A.1, suppose $\hat x \in PWS[a, b]$ gives a directional local minimum for problem (P). Then
\[ \hat L_{vv}(t) \ge 0 \qquad \forall t \in [a, b]. \tag{L} \]

Proof. If, on the contrary, there were some $\tau$ in $[a, b]$ for which $\alpha(\tau) = \hat L_{vv}(\tau) < 0$, then Lemma B.1 would show that the infimum in the accessory problem is $-\infty$. This would contradict Theorem A.1, so it cannot happen. ////

B.3. Example.
\[ \min\left\{ \int_0^{\pi/4} \big( x(t)^2 - \dot x(t)^2 \big)\,dt \;:\; x(0) = 0,\ x(\pi/4) \text{ free} \right\}. \]

Solution. Here $L(t, x, v) = x^2 - v^2$ satisfies $L_{vv}(t, x, v) = -2 < 0$ for all points $(t, x, v)$, so for any arc $\hat x$ one will have $\hat L_{vv}(t) < 0$ for all $t$. Thus Legendre’s condition is never satisfied, and no extremal gives even a directional local minimum! ////

C. Jacobi’s Equation; Conjugate Points

Recall. If $\hat x$ gives a directional local minimum for the basic problem (P), then the arc $\hat y = 0$ gives an absolute minimum in the following problem:
\[ \text{minimize } J[y] := \int_a^b \Big( \hat L_{vv}(t) \dot y(t)^2 + 2 \hat L_{xv}(t) y(t) \dot y(t) + \hat L_{xx}(t) y(t)^2 \Big)\,dt \quad \text{subject to } y(a) = 0,\ y(b) = 0. \tag{Q} \]
Problem (Q) is called the accessory problem corresponding to the arc $\hat x$. If both $L$ and $\hat x$ are sufficiently smooth, (Q) is a variational problem of the sort we know how to handle. Let us therefore assume throughout what follows that the minimizer $\hat x$ in (P) is of class $C^2$, and that $L_{xx}$, $L_{xv}$, and $L_{vv}$ are continuously differentiable in $(t, x, v)$-space. [It is possible to get similar results with weaker hypotheses.] Then any minimizing arc in (Q) must satisfy both (DEL) and (WE1).
In general notation, the integrand in (Q) has the form
\[ \mathcal{L}(t, y, w) = \hat L_{vv}(t) w^2 + 2 \hat L_{vx}(t) y w + \hat L_{xx}(t) y^2, \]
and consequently the appropriate form of (DEL) is
\[ \frac{d}{dt} \mathcal{L}_w(t, y(t), \dot y(t)) = \mathcal{L}_y(t, y(t), \dot y(t)), \]
\[ \frac{d}{dt}\Big[ 2 \hat L_{vv}(t) \dot y + 2 \hat L_{xv}(t) y \Big] = 2 \hat L_{xv}(t) \dot y + 2 \hat L_{xx}(t) y, \]
\[ \frac{d}{dt}\Big[ \hat L_{vv}(t) \dot y \Big] - \left( \hat L_{xx}(t) - \frac{d}{dt} \hat L_{xv}(t) \right) y = 0. \tag{JDE} \]
Equation (JDE) is the Jacobi equation: note that it is linear, homogeneous, and of second order.

Observations. It’s interesting to compare the Jacobi equation with the Euler-Lagrange equation for problems where the basic integrand $L$ has the general form
\[ L(t, x, v) = c_1 v^2 + 2 c_2 x v + c_3 x^2 + c_4 v + c_5 x + c_6. \]
(Here we allow $c_k = c_k(t)$.) The Euler-Lagrange equation says, after a little rearrangement,
\[ \frac{d}{dt}\Big[ 2 c_1(t) \dot x \Big] - (2 c_3 - 2 \dot c_2)\, x = c_5(t) - \dot c_4(t). \]
It’s a second-order linear ODE, in which the coefficient function $c_6$ is completely irrelevant and the coefficients $c_4$ and $c_5$ attached to the linear terms in $L$ account for any and all inhomogeneity. For any reference arc $\hat x$, the Jacobi equation for $L$ is
\[ \frac{d}{dt}\Big[ 2 c_1(t) \dot y \Big] - (2 c_3 - 2 \dot c_2)\, y = 0. \]
This is just the homogeneous counterpart of the Euler-Lagrange equation! In particular, whenever $L$ is purely quadratic, the Jacobi equation and the Euler-Lagrange equation are identical. Of course the situation can be more delicate in general, but there are enough interesting quadratic Lagrangians in the world to make this a fact worth noticing.

Direct Rejection of Non-Minimizers. Any directional local minimizer $y$ in the accessory problem (Q) must satisfy both (JDE) and the corner condition
\[ \mathcal{L}_w(t, y(t), \dot y(t^-)) = \mathcal{L}_w(t, y(t), \dot y(t^+)), \quad \text{i.e.,} \quad \hat L_{vv}(t) \dot y(t^-) = \hat L_{vv}(t) \dot y(t^+), \qquad \forall t \in (a, b). \tag{‡} \]
Let’s rewrite this fact in contrapositive form: if some arc $y$ with $y(a) = 0 = y(b)$ violates either (JDE) or (‡), then $y$ cannot give a directional local minimum for problem (Q). This implies $\inf(Q) < J[y]$.
Combine this finding with the contrapositive form of Theorem A.1, namely, if $\inf(Q) < 0$ then $\hat x$ cannot provide a DLM for (P). The following short statement results.

If some $y \in V_{II}$ obeys $J[y] \le 0$ but fails in either (JDE) or (‡), then the original arc $\hat x$ cannot be a DLM in (P).

The theoretical developments below can be interpreted as a systematic search for arcs $y$ of the sort described here. But there are cases where an elementary approach works, too. Here are two examples.

Example 1. Consider $L(t, x, v) = v^2 - x^2$, and an interval $[0, b]$ where $b > \pi$. For any extremal arc $\hat x$ associated with $L$ on $[0, b]$, regardless of its endpoints, we have
\[ \hat L_{vv}(t) = 2, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2. \]
Direct calculation will confirm that the arc
\[ z(t) = \begin{cases} \sin(t), & 0 \le t < \pi, \\ 0, & \pi \le t \le b, \end{cases} \]
is admissible in (Q), with $J[z] = 0$. However, this $z$ fails to satisfy (‡) at $t = \pi \in (a, b)$. Hence $\inf(Q) < 0$ and the extremal $\hat x$ fails to provide a DLM in problem (P). Since this is true for all extremals, no extremal can provide a DLM. It follows that when $b > \pi$, no instance of (P) has a solution. ////

Example 2. Consider $L(t, x, v) = t(v^2 - x^2)$, and an interval $[0, b]$ where $b \ge \pi$. For any extremal arc $\hat x$ associated with $L$ on $[0, b]$, regardless of its endpoints, we have
\[ \hat L_{vv}(t) = 2t, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2t. \]
Direct calculation confirms that the admissible arc
\[ y(t) = \begin{cases} \sin(t), & 0 \le t < \pi, \\ 0, & \pi \le t \le b, \end{cases} \]
gives $J[y] = 0$, but fails to satisfy (JDE) at most points of $(0, \pi)$. So $y$ cannot be a minimizer in (Q). This implies that $\inf(Q) < J[y] = 0$; by Theorem A.1, the extremal $\hat x$ must fail to provide a DLM in problem (P). Since this is true for all extremals, it follows that when $b \ge \pi$, no instance of (P) has any (directional local) minimizers at all. ////

We now introduce additional hypotheses that are satisfied in Example 1 above, but not in Example 2.
These will eventually support the assertion that any admissible extremal $\hat x$ for the problem in Example 1 actually provides a DLM whenever $b \le \pi$. In contrast, the corresponding statement for Example 2 is false. In Example 2, we certainly need $b < \sqrt{28/3}$ for a DLM to exist in (P) (note $\sqrt{28/3} < \pi$), and there is no reason to believe that this estimate is sharp.

Sidebar. Harry Zheng’s UBC thesis suggests how to generate “by hand” arcs $y$ with $J[y] < 0$. His method is to start with an arc that starts from 0 and returns to 0 at some time $c$ in $[a, b)$, and is stationary for the integral in (Q) restricted to $[a, c]$, and then to introduce a suitable perturbation. Rosenblueth has followed up on this.

The Strengthened Legendre Condition. The theory of differential equations applies most cleanly to (JDE) in the nonsingular case, when the coefficient $\hat L_{vv}(t)$ of $\dot y(t)$ is nonvanishing throughout $[a, b]$. Since Legendre’s condition already limits attention to arcs $\hat x$ satisfying $\hat L_{vv}(t) \ge 0$ throughout $[a, b]$, it seems like a small step to now assume the strengthened Legendre condition,
\[ \hat L_{vv}(t) > 0 \qquad \forall t \in [a, b]. \tag{L+} \]
This condition has two useful consequences:

(i) It implies that any extremal arc for the accessory problem (Q) must be $C^1$. (This follows immediately from (‡) above.)

(ii) It ensures uniqueness of solutions for initial-value problems involving (JDE). In particular, for any point $c$ in $[a, b]$, supplementing the differential equation (JDE) with the initial conditions $y(c) = 0$ and $\dot y(c) = 0$ produces a problem whose only solution is the function $y(t) = 0$.

C.1. Proposition. Under (L+), all nontrivial solutions $y$ of (JDE) with $y(a) = 0$ have the same zeros.

Proof. Pick any two nonzero solutions of (JDE) that vanish at $a$, say $y_1$ and $y_2$. By uniqueness (item (ii) above), nontriviality forces $\dot y_1(a) \ne 0$ and $\dot y_2(a) \ne 0$, so we may define $k = \dot y_1(a)/\dot y_2(a) \ne 0$ and
\[ z(t) := y_1(t) - k y_2(t), \qquad t \in [a, b]. \]
Now $z$ solves (JDE) with $z(a) = 0$ and $\dot z(a) = 0$, so uniqueness gives $z \equiv 0$, i.e.,
\[ y_1(t) = k y_2(t), \qquad \forall t \in [a, b]. \]
Since $k \ne 0$, the functions $y_1$ and $y_2$ must have the same zeros. ////

C.2. Definition. Let $\hat x$ be an extremal of class $C^1$ for problem (P). Assume (L+). A point $c \in (a, b]$ is called conjugate to $a$ [relative to $\hat x$] if Jacobi’s differential equation (JDE) has a nontrivial solution $y$ on $[a, c]$ satisfying $y(a) = 0 = y(c)$. In other words, $c > a$ is conjugate to $a$ exactly when the following two-point boundary value problem admits a nontrivial solution $y$:
\[ \frac{d}{dt}\Big[ \hat L_{vv}(t) \dot y \Big] - \left( \hat L_{xx}(t) - \frac{d}{dt} \hat L_{xv}(t) \right) y = 0, \qquad a < t < c, \tag{ODE} \]
\[ y(a) = 0, \qquad y(c) = 0. \tag{BC} \]

C.3. Theorem (Jacobi). Assume $L \in C^3$. If $\hat x \in C^2$ provides a directional local minimum in the basic problem (P), and (L+) holds for $\hat x$, then the open interval $(a, b)$ contains no points conjugate to $t = a$ relative to $\hat x$.

Proof. Suppose, on the contrary, that some point $c$ in $(a, b)$ is conjugate to $a$ relative to $\hat x$. Let $y$ satisfy (JDE) nontrivially, with $y(a) = 0 = y(c)$, and define
\[ z(t) = \begin{cases} y(t), & \text{if } a \le t \le c, \\ 0, & \text{if } c < t \le b. \end{cases} \]
Notice that since $y(c) = 0$ and $y$ is nontrivial, we have $\dot y(c) \ne 0$. In particular, $\dot z(c^-) = \dot y(c) \ne 0 = \dot z(c^+)$, so $z$ has a corner point at $c$. But $J[z] = 0$. Indeed, writing $\mathcal{L}$ for the integrand of (Q),
\[ J[z] = \int_a^b \mathcal{L}(t, z(t), \dot z(t))\,dt = \int_a^c \Big( \hat L_{vv}(t) \dot y(t)^2 + 2 \hat L_{xv}(t) y(t) \dot y(t) + \hat L_{xx}(t) y(t)^2 \Big)\,dt. \tag{∗} \]
But $y$ satisfies (JDE) on $[a, c]$, which we multiply by $y$ and integrate:
\[ \int_a^c y(t) \frac{d}{dt}\Big[ \hat L_{vv}(t) \dot y \Big]\,dt + \int_a^c y(t)^2 \frac{d}{dt} \hat L_{xv}(t)\,dt = \int_a^c \hat L_{xx}(t) y(t)^2\,dt. \tag{∗∗} \]
Integration by parts gives
\[ \int_a^c y(t) \frac{d}{dt}\Big[ \hat L_{vv}(t) \dot y \Big]\,dt = \Big[ y(t) \hat L_{vv}(t) \dot y(t) \Big]_{t=a}^{c} - \int_a^c \hat L_{vv}(t) \dot y(t)^2\,dt, \]
\[ \int_a^c y(t)^2 \frac{d}{dt} \hat L_{xv}(t)\,dt = \Big[ y(t)^2 \hat L_{xv}(t) \Big]_{t=a}^{c} - \int_a^c \hat L_{xv}(t) \big[ 2 y(t) \dot y(t) \big]\,dt. \]
The boundary terms are all zero, so using these expressions in (∗∗) gives
\[ \int_a^c \hat L_{vv}(t) \dot y(t)^2\,dt + \int_a^c \hat L_{xx}(t) y(t)^2\,dt = -\int_a^c 2 \hat L_{xv}(t) y(t) \dot y(t)\,dt, \]
a simple rearrangement of $J[z] = 0$ in (∗).
Now (L+) implies that extremals in problem (Q) cannot have corners. Since $z$ has a corner, it cannot be extremal. Hence it certainly cannot solve (Q): consequently $\inf(Q) < J[z] = 0$. This contradicts the minimality of $\hat x$, by Theorem A.1 above. ////

In applications, one turns the statement of this theorem inside out. Much as in its proof, one first identifies a smooth extremal arc $\hat x$ for the basic problem and checks condition (L+). If this is satisfied, one solves (JDE) with $y(a) = 0$, $\dot y(a) \ne 0$ and looks for the first time $c > a$ when $y(c) = 0$. This is the first conjugate point to $t = a$ relative to $\hat x$: if it lies inside the open interval $(a, b)$, Theorem C.3 guarantees that the extremal $\hat x$ is not a directional local minimizer in the basic problem. In order for $\hat x$ to pass this test and remain in the competition for a DLM, it must satisfy

Jacobi’s Necessary Condition: There are no points in $(a, b)$ conjugate to $a$ relative to $\hat x$. (J)

Example. For any $b > \pi$, $A$, and $B$, this problem has no solution:
\[ \min\left\{ \int_0^b \big( \dot x^2 - x^2 + 2 f(t) x \big)\,dt \;:\; x(0) = A,\ x(b) = B \right\}. \]
(The given function $f$ is smooth.)

Proof. Here $a = 0$ and $L(t, x, v) = v^2 - x^2 + 2 f(t) x$. The differentiated Euler-Lagrange equation associated with $L$ is
\[ \ddot x + x = f(t), \]
with general solution
\[ x(t) = c \cos t + d \sin t + x_p(t), \qquad c, d \in \mathbb{R}, \]
where $x_p$ is a particular solution of the inhomogeneous equation above. Any particular parameter choices $c = \hat c$, $d = \hat d$ will single out a definite extremal $\hat x$, for which we can calculate
\[ \hat L_{vv}(t) = 2, \qquad \hat L_{xv}(t) = 0, \qquad \hat L_{xx}(t) = -2. \]
In fact, these functions are independent of which extremal we consider; for any of the possible extremals, the associated Jacobi equation is
\[ \ddot y + y = 0. \]
Imposing the boundary condition $y(0) = 0$ produces the family of solutions
\[ y(t) = \alpha \sin t, \qquad \alpha \in \mathbb{R}. \]
Thus the point $c = \pi$ is conjugate to 0 relative to any one of the extremals $\hat x$ for the problem.
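The conjugate point in this example can also be located numerically, which is the practical route when (JDE) has no closed-form solution. The following is a minimal sketch, not part of the original notes. It assumes (L+) holds, writes (JDE) as the first-order system $\dot y = p/\hat L_{vv}$, $\dot p = q\,y$ with $p = \hat L_{vv}\dot y$ and $q = \hat L_{xx} - \frac{d}{dt}\hat L_{xv}$, and integrates with the classical fourth-order Runge-Kutta method (a swapped-in numerical technique). The function name `first_conjugate_point` and its parameters are hypothetical.

```python
import math


def first_conjugate_point(Lvv, q, a, t_max, h=1e-3):
    """Search (a, t_max] for the first zero of the Jacobi solution with y(a)=0, ydot(a)=1.

    Lvv(t) is the coefficient L_vv along the extremal (assumed positive),
    q(t) stands for L_xx - d/dt L_xv along the extremal.
    Returns an interpolated zero, or None if no sign change is detected.
    """
    def rhs(t, y, p):
        return p / Lvv(t), q(t) * y

    t, y, p = a, 0.0, Lvv(a)  # p = Lvv * ydot, with ydot(a) = 1
    while t < t_max:
        k1 = rhs(t, y, p)
        k2 = rhs(t + h / 2, y + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = rhs(t + h / 2, y + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = rhs(t + h, y + h * k3[0], p + h * k3[1])
        y_new = y + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p_new = p + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if y > 0.0 and y_new <= 0.0:
            # Linear interpolation between the last positive and first nonpositive value.
            return t + h * y / (y - y_new)
        t, y, p = t + h, y_new, p_new
    return None


if __name__ == "__main__":
    # Coefficients from the example: L_vv = 2, L_xv = 0, L_xx = -2, so q = -2.
    print(first_conjugate_point(lambda t: 2.0, lambda t: -2.0, 0.0, 4.0), math.pi)
```

Searching only up to `t_max = 3.0` returns `None` for these coefficients, consistent with the conjugate point $\pi$ lying beyond that horizon.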
According to Jacobi’s theorem, if $c = \pi$ is a point in the open interval $(0, b)$, then every one of those extremals fails to be a true (directional local) minimizer. Since the true minimizers are all to be found among the extremals, there can be no minimum at all.

Note that these considerations do not apply when $0 < b \le \pi$, so the conjugate point is outside the basic open interval. In these cases, Jacobi’s theorem makes no promises about the optimality of an admissible extremal $\hat x$: it merely fails to eliminate $\hat x$ from the set of possible minimizers. ////

D. Conjugate Points: Geometry

Linearization. Let $I$ be an open interval containing a point $\hat\alpha$. Suppose
\[ \mathcal{F} = \{ x(t; \alpha) : \alpha \in I \} \]
is a family of extremals for a given Lagrangian $L$, indexed by elements $\alpha$ of the interval $I$. Write $\hat x(t) = x(t; \hat\alpha)$. For example, when $L(t, x, v) = v^2 - x^2 + 2 f(t) x$ as above, possible families of extremals with $I = \mathbb{R}$ include
\[ x_1(t; \alpha) = \alpha \sin t + x_p(t), \qquad x_2(t; \alpha) = \alpha \sin t + (1 - \alpha^2) \cos t + x_p(t), \qquad \text{etc.} \]

D.1. Proposition. In the construction above, suppose $x$ is $C^2$ as a function of the vector variable $(t; \alpha)$. Then the function below satisfies Jacobi’s differential equation relative to $\hat x(t) = x(t; \hat\alpha)$:
\[ y(t) = \frac{\partial x}{\partial \alpha}(t; \hat\alpha). \]

Proof. For each $\alpha \in I$, the function $t \mapsto x(t; \alpha)$ is extremal, so (DEL) holds:
\[ \frac{\partial}{\partial t} L_v\big(t, x(t; \alpha), x_t(t; \alpha)\big) = L_x\big(t, x(t; \alpha), x_t(t; \alpha)\big), \qquad \forall \alpha \in I. \]
Compute $\partial/\partial\alpha$ of both sides, using smoothness to interchange the mixed partial derivatives:
\[ \frac{\partial}{\partial t}\left[ L_{vx}\big(t, x(t; \alpha), x_t(t; \alpha)\big)\, x_\alpha(t; \alpha) + L_{vv}\big(t, x(t; \alpha), x_t(t; \alpha)\big)\, \frac{\partial}{\partial t} x_\alpha(t; \alpha) \right] = L_{xx}\big(t, x(t; \alpha), x_t(t; \alpha)\big)\, x_\alpha(t; \alpha) + L_{xv}\big(t, x(t; \alpha), x_t(t; \alpha)\big)\, \frac{\partial}{\partial t} x_\alpha(t; \alpha). \]
Now set $\alpha = \hat\alpha$, remembering $\hat x(t) = x(t; \hat\alpha)$ and $y(t) = x_\alpha(t; \hat\alpha)$:
\[ \frac{d}{dt}\Big[ \hat L_{xv}(t) y(t) + \hat L_{vv}(t) \dot y(t) \Big] = \hat L_{xx}(t) y(t) + \hat L_{xv}(t) \dot y(t) \]
\[ \iff \frac{d}{dt}\Big[ \hat L_{vv}(t) \dot y \Big] - \left( \hat L_{xx}(t) - \frac{d}{dt} \hat L_{xv}(t) \right) y = 0. \]
Thus $y$ satisfies Jacobi’s equation relative to $\hat x$, as claimed.
////

(i) Points where extremal families bunch together. Suppose now that every extremal in an indexed family as above passes through the same initial point $(a, A)$, i.e.,
\[ x(a; \alpha) = A \qquad \forall \alpha \in I. \tag{1} \]
Assume also that the family of extremals has nontrivial $\alpha$-dependence near $t = a$, in other words, that
\[ \frac{\partial x}{\partial \alpha}(t; \hat\alpha) \not\equiv 0 \quad \text{near } t = a. \tag{2} \]
Then (1) implies that the arc $y(t) = \frac{\partial x}{\partial \alpha}(t; \hat\alpha)$ obeys
\[ y(a) = \frac{\partial x}{\partial \alpha}(a; \hat\alpha) = \frac{\partial}{\partial \alpha} A = 0, \]
while (2) implies that $y$ is not the zero function. Thus $y$ is a nontrivial solution of Jacobi’s equation relative to $\hat x$, and (assuming the strengthened Legendre condition holds along $\hat x$) the point $c$ will be conjugate to $a$ iff
\[ y(c) = 0 \iff \frac{\partial x}{\partial \alpha}(c; \hat\alpha) = 0 \iff x(c; \alpha) = \hat x(c) + o(\alpha - \hat\alpha), \quad \alpha \to \hat\alpha, \]
where the last equivalence comes from the first-order expansion
\[ x(c; \alpha) = x(c; \hat\alpha) + x_\alpha(c; \hat\alpha)(\alpha - \hat\alpha) + o(\alpha - \hat\alpha), \quad \alpha \to \hat\alpha. \]
The last condition says that the family of extremals “bunches up” to crowd through a small neighbourhood of the point $(c, \hat x(c))$. If $\hat x$ is a weak local minimizer on $[a, b]$, this cannot happen inside the interval $(a, b)$.

Pictorial Example. Think about paths of minimum arc length on the surface of a sphere. These can be described by a variational problem in spherical polar coordinates, and extremals turn out to be “great circles”, that is, circles formed by the intersection of the spherical shell with a plane through the origin. Using the planet Earth as a model for the sphere, and the north pole as the starting point, the shortest path to any specified destination follows a single line of longitude on the globe. These lines emerge from the north pole in every possible direction, but they all cluster together again at the south pole. Thus the south pole is a conjugate point to the north pole. The shortest arc from the north pole to UBC follows the longitude 123°15′ W: it passes over the Arctic Ocean, the Northwest Territories, and northern BC.
The opposite longitude, about 56°45′ E, connects the North pole to UBC via Russia, the Ural Mountains, Iran, Oman, the Seychelles, Mauritius, the South Pole, a very long stretch of water, California, Oregon, and Washington State. Clearly the latter path is NOT the shortest one, even in a weak local sense. The mathematical clue to this intuitively obvious fact is the conjugate point at the south pole.

(ii) Contact points with an envelope of extremals. Let $\mathcal{F} = \{ x(t; \alpha) : \alpha \in I \}$ be any family of functions indexed by some open real interval $I$. A smooth curve $x = e(t)$ is an envelope of the family $\mathcal{F}$ if every curve $x$ in $\mathcal{F}$ provides a unique solution for $t$ in the system of equations
\[ x(t) = e(t), \qquad \dot x(t) = \dot e(t). \]
(These equations characterize the instant when the graphs of $x$ and $e$ meet tangentially.) In other words, we are asking that every $\alpha \in I$ come with a unique contact time $t = t(\alpha)$ such that
\[ x(t(\alpha); \alpha) = e(t(\alpha)), \tag{1} \]
\[ x_t(t(\alpha); \alpha) = \dot e(t(\alpha)). \tag{2} \]
If the system (1)–(2) defines $t$ as a differentiable function of $\alpha$, and if $x$ is a smooth function of the vector variable $(t, \alpha)$, then we can find the envelope curve by taking $d/d\alpha$ in line (1) and using (2) to simplify:
\[ 0 = \frac{d}{d\alpha}\Big[ x(t(\alpha); \alpha) - e(t(\alpha)) \Big] = x_t(t(\alpha); \alpha) \frac{dt}{d\alpha} + x_\alpha(t(\alpha); \alpha) - \dot e(t(\alpha)) \frac{dt}{d\alpha} = x_\alpha(t(\alpha); \alpha). \tag{3} \]
Equations (3) and (1) are valid for all $\alpha$ in $I$: written together, in the form “$0 = x_\alpha(t; \alpha)$, $e = x(t; \alpha)$”, they provide a parametric description of the envelope curve in the $(t, e)$-plane. In some special cases one can solve for $\alpha = \alpha(t)$ in (3), and then determine a closed form expression for the envelope curve using (1): $e(t) = x(t; \alpha(t))$.

Example. The envelope of the family of straight lines
\[ x(t; \alpha) = \csc\alpha - t \cot\alpha, \qquad \alpha \in (0, \pi), \]
is the top half of the unit circle.

Proof. Observe that
\[ x(t; \alpha) = \frac{1 - t\cos\alpha}{\sin\alpha}, \qquad x_\alpha(t; \alpha) = \frac{(t \sin\alpha)\sin\alpha - (1 - t\cos\alpha)\cos\alpha}{\sin^2\alpha} = \frac{t - \cos\alpha}{\sin^2\alpha}. \]
Thus the pair of equations (1), (3) says
\[ e = \frac{1 - t\cos\alpha}{\sin\alpha}, \qquad 0 = t - \cos\alpha. \]
A parametric description of the envelope curve, with parameter $\alpha$, is therefore
\[ t = \cos\alpha, \qquad e = \frac{1 - \cos^2\alpha}{\sin\alpha} = \sin\alpha, \qquad 0 < \alpha < \pi. \]
This represents the top half of the unit circle in the $(t, e)$-plane. ////

Now suppose that each curve $x(t; \alpha)$ in the family $\mathcal{F}$ above is an extremal for some given Lagrangian $L$, and that each one satisfies the initial condition
\[ x(a; \alpha) = A, \qquad \forall \alpha \in I. \tag{4} \]
If the family has an envelope curve $E$, then we have the following result.

D.2. Proposition. Fix $\hat\alpha \in I$, and denote $\hat x(t) = x(t; \hat\alpha)$. Suppose that $x(t; \alpha)$ is $C^3$ as a function of the vector variable $(t, \alpha)$, and that the strengthened Legendre condition holds along $\hat x$. Then the time $\hat t = t(\hat\alpha)$ at which the extremal arc $\hat x$ meets the envelope curve $E$ is conjugate to $a$ relative to $\hat x$.

Proof. We already know that the function $y(t) = x_\alpha(t; \hat\alpha)$ is a solution of Jacobi’s differential equation satisfying $y(a) = 0$. (See the arguments above.) Equation (3) above, with $\alpha = \hat\alpha$, gives $y(\hat t) = 0$. ////

E. An Example

File “var2”, version of 28 Oct 05.

This whole section is devoted to the following example and its near relatives:
\[ \min\left\{ \int_0^b \sqrt{x(t)\,[1 + \dot x(t)^2]}\,dt \;:\; x(0) = 1,\ x(b) = B,\ x(t) > 0 \right\}. \]
Here $L(t, x, v) = \sqrt{x}\sqrt{1 + v^2}$ has
\[ L_v(t, x, v) = \frac{\sqrt{x}\, v}{\sqrt{1 + v^2}}, \qquad L_{vv}(t, x, v) = \frac{\sqrt{x}}{(1 + v^2)^{3/2}}, \qquad L - L_v \cdot v = \sqrt{\frac{x}{1 + v^2}}. \]
Since $L_{vv}(t, x, v) > 0$ for all $v$ whenever $x > 0$, it follows that any positive-valued extremal arc $x$ must be of class $C^2$. Since $L_t \equiv 0$, condition (WE2) implies that the function $L - L_v \cdot v$ must be constant along extremals. This constant must be positive; writing it as $1/\sqrt{m}$ for convenience leads to a first-order equation that any solution must satisfy:
\[ \sqrt{\frac{x(t)}{1 + \dot x(t)^2}} = \sqrt{\frac{1}{m}}, \quad \text{i.e.,} \quad \dot x(t)^2 = m x(t) - 1. \tag{∗} \]
On any open interval where $\dot x(t) > 0$, (∗) implies
\[ \dot x(t) = \sqrt{m x(t) - 1} \implies \frac{dx}{\sqrt{mx - 1}} = dt \implies \frac{2}{m}\sqrt{mx - 1} = t + C_+, \]
for some constant $C_+$, so
\[ x(t) = \frac{1}{m} + \frac{m}{4}(t + C_+)^2. \]
(Note that $\dot x(t) = m(t + C_+)/2$ on such an interval.) On any open interval where $\dot x(t) < 0$, (∗) implies
\[ \dot x(t) = -\sqrt{m x(t) - 1} \implies \frac{dx}{\sqrt{mx - 1}} = -dt \implies \frac{2}{m}\sqrt{mx - 1} = -t - C_-, \]
for some constant $C_-$, so
\[ x(t) = \frac{1}{m} + \frac{m}{4}(t + C_-)^2. \]
(Note that $\dot x(t) = m(t + C_-)/2$ on such an interval.) If the sign of $\dot x$ changes at time $t^*$, the continuity of $\dot x$ requires that
\[ \frac{m}{2}(t^* + C_-) = \frac{m}{2}(t^* + C_+), \quad \text{hence } C_- = C_+. \]
Thus the single expression
\[ x(t) = \frac{1}{m} + \frac{m}{4}(t + C)^2 = \frac{m}{4} t^2 + \frac{mC}{2} t + \frac{1}{m} + \frac{m C^2}{4} \]
covers both possible cases, and provides a complete description of all possible positive-valued extremals for this integrand. Let $\alpha = mC$: then a convenient two-parameter family of extremals for the problem is
\[ x(t; \alpha, m) = \frac{m}{4} t^2 + \frac{\alpha}{2} t + \frac{4 + \alpha^2}{4m}, \qquad \alpha \in \mathbb{R},\ m > 0. \tag{‡} \]
Imposing the initial condition $1 = x(0)$ leads to $m = (4 + \alpha^2)/4$ and produces the one-parameter family
\[ x(t; \alpha) = \frac{4 + \alpha^2}{16} t^2 + \frac{\alpha}{2} t + 1, \qquad \alpha \in \mathbb{R}. \tag{†} \]
Notice that $x_t(0; \alpha) = \alpha/2$, so extremals with different parameter values have different initial slopes, and are therefore distinct. (Observe also that
\[ x(t; \alpha) = \frac{f(\alpha + t f(\alpha))}{f(\alpha)} \]
holds for $f(z) = 1 + z^2/4$: a similar expression has already been observed in the minimal surface of revolution problem, where the choice $f(z) = \cosh z$ generates all extremals through the same initial point.)

To identify the extremals, if any, passing through the terminal point $(b, B)$, we must choose the parameter $\alpha$ for which
\[ B = \frac{4 + \alpha^2}{16} b^2 + \frac{\alpha}{2} b + 1, \quad \text{i.e.,} \quad 0 = \frac{b^2}{16}\alpha^2 + \frac{b}{2}\alpha + 1 - B + \frac{b^2}{4} = \frac{b^2}{16}\left[ \alpha^2 + \frac{8}{b}\alpha + \frac{16}{b^2}\left( 1 - B + \frac{b^2}{4} \right) \right]. \]
The quadratic formula gives
\[ \alpha = \frac{-4}{b} \pm \sqrt{ \left( \frac{-4}{b} \right)^2 - \frac{16}{b^2}\left( 1 - B + \frac{b^2}{4} \right) } = \frac{-4}{b}\left[ 1 \mp \sqrt{B - \frac{b^2}{4}} \right]. \]
Thus there are three cases to consider:

(1) If $B < b^2/4$, then there is no choice of $\alpha$ for which the extremal $x(t; \alpha)$ passes through the target point. The problem has no admissible extremals, and hence there is no minimum.

(2) If $B = b^2/4$, then there is exactly one admissible extremal, determined by taking $\alpha = -4/b$.

(3) If $B > b^2/4$, then there are two admissible extremals, with parameter values
\[ \alpha_+ = \frac{-4}{b}\left[ 1 + \sqrt{B - b^2/4} \right], \qquad \alpha_- = \frac{-4}{b}\left[ 1 - \sqrt{B - b^2/4} \right]. \]
(Note that $\alpha_+ < 0$ always, whereas $\alpha_-$ can have either sign.)

Second Order Analysis.

(i) The Legendre Condition. Since $L_{vv}(t, x, v) > 0$ for all $x > 0$, $t$, and $v$, the strengthened Legendre condition (L+) is sure to hold along any positive-valued extremal arc.

(ii) The Jacobi Condition. Given a target point $(\hat b, \hat B)$, fix a parameter $\hat\alpha$ for which the arc $\hat x(t) = x(t; \hat\alpha)$ passes through it.

Method 1. Define $q(t) = 16\hat x(t) = 16 + 8\hat\alpha t + (4 + \hat\alpha^2) t^2$. Onerous calculations reveal that
\[ \hat L_{xx}(t) = \frac{-2\sqrt{4 + \hat\alpha^2}}{q(t)}, \qquad \hat L_{xv}(t) = \frac{2\left[ 4\hat\alpha + (4 + \hat\alpha^2) t \right]}{q(t)\sqrt{4 + \hat\alpha^2}}, \qquad \hat L_{vv}(t) = \frac{128}{(4 + \hat\alpha^2)^{3/2}\, q(t)}. \]
Further calculations unleash Jacobi’s differential equation relative to $\hat x$:
\[ 0 = q(t)\ddot y - \dot q(t)\dot y + 2(4 + \hat\alpha^2)\, y = \left[ 16 + 8\hat\alpha t + (4 + \hat\alpha^2) t^2 \right]\ddot y - \left[ 8\hat\alpha + 2(4 + \hat\alpha^2) t \right]\dot y + 2(4 + \hat\alpha^2)\, y. \]
Nontrivial solutions to this equation are not immediately apparent.

Method 2. In the two-parameter family (‡), every curve is an extremal. The extremal $\hat x$ identified above is associated with the values $\alpha = \hat\alpha$ and $m = \hat m := (4 + \hat\alpha^2)/4$. Fixing either one of these values leaves a one-parameter family of extremals indexed by the other, to which the theory developed above applies. So two solutions of Jacobi’s equation relative to $\hat x$ are
\[ y_1(t) = \frac{\partial}{\partial\alpha} x(t; \alpha, \hat m)\Big|_{\alpha = \hat\alpha} = \left[ \frac{t}{2} + \frac{\alpha}{2\hat m} \right]_{\alpha = \hat\alpha} = \frac{t}{2} + \frac{2\hat\alpha}{4 + \hat\alpha^2}, \]
\[ y_2(t) = \frac{\partial}{\partial m} x(t; \hat\alpha, m)\Big|_{m = \hat m} = \left[ \frac{t^2}{4} - \frac{4 + \hat\alpha^2}{4 m^2} \right]_{m = \hat m} = \frac{t^2}{4} - \frac{4}{4 + \hat\alpha^2}. \]
These solutions can be confirmed by direct substitution in Jacobi’s equation above, but we must stress that they were found directly from (‡): the explicit form of Jacobi’s equation is not required to construct them! Clearly $y_1$ and $y_2$ are linearly independent, so their linear combinations give rise to the general solution of Jacobi’s equation above. In order to find conjugate points, however, we need a solution with initial value $y(0) = 0$. To construct such a solution, one may either consider a suitable linear combination of $y_1$ and $y_2$, or else use the same method just described on the one-parameter family of extremals (†), for which every curve obeys the initial condition. The latter approach yields the solution
\[ y(t) = \frac{\partial}{\partial\alpha}\left[ 1 + \frac{\alpha}{2} t + \frac{4 + \alpha^2}{16} t^2 \right]_{\alpha = \hat\alpha} = \frac{t}{2} + \frac{\hat\alpha}{8} t^2 = \frac{\hat\alpha t}{8}\left( t + \frac{4}{\hat\alpha} \right), \]
where the factored form assumes $\hat\alpha \ne 0$. (Note that $y = y_1 + (\hat\alpha/2)\, y_2$, as the former approach could have produced directly.) It follows that there is a point conjugate to $t = 0$ relative to the arc $\hat x$ only if $\hat\alpha < 0$, and in this case the unique conjugate point is $\hat t = -4/\hat\alpha$.

Now in case (2) above, we had $\hat\alpha = -4/b$, so $b = -4/\hat\alpha = \hat t$. The conjugate point falls exactly at the right endpoint of the interval $[0, b]$. This situation is permitted by Jacobi’s necessary condition, which only precludes conjugate points in the open interval $(0, b)$. So the unique extremal in case (2) satisfies our second-order necessary conditions.

In case (3), however, our formula for $\hat\alpha$ can be rearranged to give
\[ b = \frac{-4}{\alpha_\pm}\left[ 1 \pm \sqrt{B - b^2/4} \right] = t_\pm\left[ 1 \pm \sqrt{B - b^2/4} \right]. \]
Choosing the positive sign here reveals $b > t_+ > 0$, so the conjugate point falls into the open interval $(0, b)$ and the corresponding extremal violates Jacobi’s condition. It cannot be a minimizer. Choosing the negative sign instead (with $\alpha_- < 0$; when $\alpha_- \ge 0$ there is no conjugate point at all) reveals $b < t_-$, so the conjugate point lies well outside the closed interval $[0, b]$ and Jacobi’s condition is satisfied.
This extremal is not discardable: advanced techniques show that it is indeed a true minimizer.

Envelopes. Consider the one-parameter family of extremals (†). This family forms an envelope, as we can show by considering the system
\[ e = x = 1 + \frac{\alpha}{2} t + \frac{4 + \alpha^2}{16} t^2, \tag{1} \]
\[ 0 = \frac{\partial x}{\partial\alpha} = \frac{1}{2} t + \frac{\alpha}{8} t^2. \tag{2} \]
In this easy case, (2) gives $\alpha = -4/t$ and substitution in (1) then gives
\[ e = 1 + (-2) + \left( \frac{4}{16} + \frac{1}{t^2} \right) t^2 = \frac{1}{4} t^2. \]
Thus the curve $x = e(t) := \frac14 t^2$ provides an envelope for the family of extremals above. This envelope curve coincides with the boundary of our case analyses above: it is precisely the boundary line dividing the target points hit by two extremals from those hit by no extremals at all. For an endpoint $(b, B)$ above the envelope curve, the extremal $x(t; \alpha_+)$ has conjugate point $t_+ = -4/\alpha_+$ in $(0, b)$, so it is not optimal. Only the extremal $x(t; \alpha_-)$ can possibly give the minimum.

The situation is accurately depicted in Figure E.1, where the envelope curve $x = t^2/4$ is shown together with several members of the extremal family (†). For the particular endpoint $(b, B) = (1.6, 1)$, the appropriate parameter values turn out to be $\alpha_+ = -4$ and $\alpha_- = -1$. The corresponding times when these curves touch the envelope are $t_+ = 1$ and $t_- = 4$. Thus $\alpha_+$ generates the lower curve, which is not optimal, while $\alpha_-$ generates the upper curve, which can be shown to be optimal using techniques to be described later.

Remarks. Using the function $f(z) = 1 + z^2/4$, the family of extremals in (†) can be expressed succinctly as
\[ x(t; \alpha) = \frac{f(\alpha + t f(\alpha))}{f(\alpha)}. \]
For the minimum surface of revolution problem, where the integrand $L = x\sqrt{1 + v^2}$ is very similar to the one just treated, it turns out that the family of extremals obeying $x(0) = 1$ can be expressed similarly, just by replacing $f$ with $\cosh$ in the formula above.
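The succinct representation in these Remarks is easy to spot-check numerically. The snippet below is a small sketch, not part of the original notes; the function names `extremal_dagger`, `extremal_via`, and `f_quadratic` are hypothetical. It compares the quotient $f(\alpha + t f(\alpha))/f(\alpha)$ with the polynomial form (†) at a few sample points, and confirms that the $\cosh$ version of the same quotient also starts at height 1 when $t = 0$.

```python
import math


def extremal_dagger(t, alpha):
    """One-parameter family (dagger): extremals through (0, 1)."""
    return (4.0 + alpha ** 2) / 16.0 * t ** 2 + alpha / 2.0 * t + 1.0


def extremal_via(f, t, alpha):
    """The quotient f(alpha + t*f(alpha)) / f(alpha) for a generating function f."""
    return f(alpha + t * f(alpha)) / f(alpha)


def f_quadratic(z):
    """Generating function f(z) = 1 + z^2/4 from the Remarks."""
    return 1.0 + z ** 2 / 4.0


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.6, 3.0):
        for alpha in (-4.0, -1.0, 0.0, 2.5):
            gap = extremal_dagger(t, alpha) - extremal_via(f_quadratic, t, alpha)
            print(t, alpha, gap)
    # The cosh analogue also satisfies x(0; alpha) = 1.
    print(extremal_via(math.cosh, 0.0, 0.7))
```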
Since both $f$ and $\cosh$ are even convex functions with comparatively rapid growth and global minimum values of 1 at the origin, it seems reasonable to expect a certain similarity in the qualitative features of the results for these two problems. This turns out to be the case: extremals in the minimum surface of revolution problem do exist for some target points and not for others, and above a certain envelope curve [not available in closed form] each point is hit by two extremals emanating from $(0, 1)$. The lower one of these extremals touches the envelope before reaching the target point, and therefore fails to be even a local minimizer.

Figure E.1: Quadratic Envelope Curve and Extremals
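The numbers quoted for the endpoint $(b, B) = (1.6, 1)$ in Section E can be reproduced in a few lines. This is a minimal sketch, not part of the original notes, built directly on the closed-form expressions for $\alpha_\pm$, the conjugate time $-4/\alpha$, and the envelope $e(t) = t^2/4$; the function names `extremal`, `envelope`, `admissible_alphas`, and `conjugate_time` are hypothetical.

```python
import math


def extremal(t, alpha):
    """Extremal through (0, 1) with initial slope alpha/2 (the one-parameter family)."""
    return (4.0 + alpha ** 2) / 16.0 * t ** 2 + alpha / 2.0 * t + 1.0


def envelope(t):
    """Envelope curve e(t) = t^2 / 4 of the family."""
    return t ** 2 / 4.0


def admissible_alphas(b, B):
    """Parameters of the extremals through (b, B): [], [alpha], or [alpha_plus, alpha_minus]."""
    disc = B - b * b / 4.0
    if disc < 0.0:
        return []              # case (1): target below the envelope
    if disc == 0.0:
        return [-4.0 / b]      # case (2): target on the envelope
    r = math.sqrt(disc)
    return [(-4.0 / b) * (1.0 + r), (-4.0 / b) * (1.0 - r)]  # case (3)


def conjugate_time(alpha):
    """Time conjugate to t = 0 along the extremal with parameter alpha, or None if alpha >= 0."""
    return -4.0 / alpha if alpha < 0.0 else None


if __name__ == "__main__":
    b, B = 1.6, 1.0
    for alpha in admissible_alphas(b, B):
        t_c = conjugate_time(alpha)
        violates_jacobi = t_c is not None and 0.0 < t_c < b
        print(alpha, t_c, extremal(b, alpha), violates_jacobi)
```

Only the extremal whose conjugate time falls inside $(0, b)$ is rejected by Jacobi’s condition; the other survives the test, in line with the discussion of Figure E.1.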