Chapter I. First Variations
© 2015, Philip D. Loewen
In abstract terms, the Calculus of Variations is a subject concerned with max/min
problems for a real-valued function of several variables. Given a vector space V and
a function Φ: V → R, we explore the theory and practice of minimizing Φ[x] over
x ∈ V. Additional interest and power come from allowing
• dim(V ) = +∞,
• constrained minimization, where the choice variable x must lie in some preassigned subset S of V .
We’ll investigate and generalize familiar facts and confront new issues, including . . .
- necessary conditions: if x minimizes Φ over V , then Φ′ [x] = 0 and Φ′′ [x] ≥ 0;
- existence/regularity: what spaces V are appropriate?
- sufficient conditions: if x obeys Φ′ [x] = 0 and Φ′′ [x] > 0 then x gives a local
minimum.
- applications, calculations, etc.
A. The Basic Problem
Choice Variables. A closed real interval [a, b] of finite length is given. The set
X = C 1 [a, b] denotes the collection of all functions x: [a, b] → R whose derivative ẋ is
continuous on [a, b]. These will be the “choice variables” in a minimization problem:
we call them “arcs” for now. Typically x = x(t).
Detail: Continuity of ẋ on [a, b] requires ẋ(a) and ẋ(b) to be defined. We use one-sided
definitions at a and b to arrange this. For example,
    ẋ(a) = lim_{r→0⁺} ( x(a + r) − x(a) ) / r.
Examples: (i) The function x(t) = t|t| lies in C^1[−1, 1], so it’s an arc (check this).
However, x ∉ C^2[−1, 1]: there’s trouble at 0.
(ii) The function x(t) = √t does not lie in C^1[0, 1]. (It’s too steep at left end.)
Objective Functional. A function L ∈ C 1 ([a, b] × R × R) (“the Lagrangian”) is
given. Typically L = L(t, x, v): t for “time”, x for “position”, v for “velocity”. The
Lagrangian helps assign a number to every arc x in X, namely
    Λ[x] := ∫_a^b L(t, x(t), ẋ(t)) dt.
A Minimization Problem. Given real numbers A, B, the basic problem in the
Calculus of Variations is this:
    minimize   Λ[x] = ∫_a^b L(t, x(t), ẋ(t)) dt
    over       x ∈ X
    subject to x(a) = A, x(b) = B.
Shorthand:
    min_{x ∈ C^1[a,b]} { ∫_a^b L(t, x(t), ẋ(t)) dt : x(a) = A, x(b) = B }.    (P)
Example: Geodesics in the Plane. An arc defined on [a, b] has total length
    s = ∫ ds = ∫_a^b √(1 + ẋ(t)²) dt.

So finding the shortest arc joining (a, A) to (b, B) is an instance of the basic problem
in which L(t, x, v) = √(1 + v²).
Example: Soap Film in Zero-Gravity. Wire rings of radii A > 0 and B > 0 are
perpendicular to an axis through both their centres; the centres are 1 unit apart. A
soap film stretches between them, forming a surface of revolution relative to the axes
shown below. Surface tension acts to minimize the area of that surface, which we
can calculate:
an infinitesimal ring at position t with horizontal slice dt has slant length
ds = √(dt² + dx²) = √(1 + ẋ(t)²) dt, perimeter 2πx(t), hence area

    dS = 2πx(t) √(1 + ẋ(t)²) dt.

Total area is the “sum” of these contributions, i.e.,

    S = ∫ dS = ∫_0^1 2πx(t) √(1 + ẋ(t)²) dt.
The problem of minimizing S subject to x(0) = A, x(1) = B fits the pattern above,
with [a, b] = [0, 1] and
    L(t, x, v) = 2πx √(1 + v²).
Example: Brachistochrone. The birth announcement of our subject came just
over 310 years ago:
“I, Johann Bernoulli, greet the most clever mathematicians in the world.
Nothing is more attractive to intelligent people than an honest, challenging
problem whose possible solution will bestow fame and remain as a lasting
monument. Following the example set by Pascal, Fermat, etc., I hope to
earn the gratitude of the entire scientific community by placing before the
finest mathematicians of our time a problem which will test their methods
and the strength of their intellect. If someone communicates to me the
solution of the proposed problem, I shall then publicly declare him worthy
of praise.”
(Groningen, 1 January 1697)
Here is a statement of Bernoulli’s problem in modern terms: Given two points α
and β in a vertical plane, find the curve joining α to β down which a bead—sliding
from rest without friction—will fall in least time. The Greek words for “least” and
“time” give the unknown curve its impressive title: the brachistochrone. To set
up, install a Cartesian coordinate system with its origin at point α and the y-axis
pointing downward. Then B ≥ 0, for the bead to “fall”.
Now speed is the rate of change of distance relative to time: v = ds/dt. Along a
curve in the (x, y)-plane, ds = √((dx)² + (dy)²) = √(1 + (y′(x))²) dx, so the infinitesimal
time taken to travel along the segment of curve corresponding to a horizontal
distance dx is

    dt = ds / v = √(1 + (y′(x))²) dx / v.
Here v is the speed of the bead, given from conservation of energy as
    PE + KE = const.,   i.e.,   −mgy + (1/2)mv² = (1/2)mv0².

(Here v0 is the bead’s initial velocity: v0 ≥ 0 seems reasonable.) This gives
v = √(v0² + 2gy), leading to

    dt = √(1 + (y′(x))²) / √(v0² + 2gy(x)) dx.
The total travel time is then
    T = ∫ dt = ∫_{x=0}^{b} √( (1 + (y′(x))²) / (v0² + 2gy(x)) ) dx.
The problem of minimum fall time fits our pattern, with integration interval [0, b],
endpoint values A = 0 and B as specified above, and integrand
    L(x, y, w) = √( (1 + w²) / (v0² + 2gy) ).
Or, after changing to standard letter-names (spoiling the meaning completely!),
    L(t, x, v) = √( (1 + v²) / (v0² + 2gx) ).
Comparing this integrand to the one for minimum-distance makes a straight line
seem unlikely to provide the minimum. More details later.
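Since the travel-time functional is now explicit, a quick numerical experiment can make this closing remark concrete. The sketch below (not part of the notes) evaluates T[y] by quadrature for two admissible curves; the endpoint (b, B) = (1, 1), the launch speed v0 = 1, the value g = 9.8, and the particular candidate curves are all assumptions made only for illustration.

```python
# A minimal numerical sketch (not from the notes): evaluate the travel-time
# functional T[y] = ∫_0^b sqrt((1 + y'(x)^2)/(v0^2 + 2 g y(x))) dx for two
# candidate curves.  The endpoint (b, B) = (1, 1), the value v0 = 1 (taken
# nonzero so the integrand stays bounded), and the candidate curves themselves
# are illustrative assumptions.
import numpy as np
from scipy.integrate import quad

g, v0 = 9.8, 1.0          # gravitational acceleration and assumed launch speed
b, B = 1.0, 1.0           # assumed right endpoint (b, B)

def travel_time(y, yp):
    """Quadrature approximation of T[y] for a curve y(x) with derivative yp(x)."""
    integrand = lambda x: np.sqrt((1.0 + yp(x) ** 2) / (v0 ** 2 + 2.0 * g * y(x)))
    T, _ = quad(integrand, 0.0, b)
    return T

# Straight line from (0,0) to (b,B), and a parabola that drops faster at first.
T_line = travel_time(lambda x: B * x / b,
                     lambda x: B / b)
T_para = travel_time(lambda x: B * (2 * x / b - (x / b) ** 2),
                     lambda x: B * (2 / b - 2 * x / b ** 2))
print(f"straight line: T = {T_line:.4f}")
print(f"parabola:      T = {T_para:.4f}")
```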
Analogy/Crystal Ball. Calculus deals with minimization at every level. For unconstrained local minima, the only possible minimizers are critical points:
• When solving min_{x∈R} f(x), solve f′(x) = 0 . . . an algebraic equation for the
unknown number x.
• When solving min_{x∈Rn} F(x), solve ∇F(x) = 0 . . . a system of n algebraic equations for the unknown vector x.
• When solving min_{x∈C^1[a,b]} Λ[x], solve DΛ[x] = 0 . . . a differential equation for
the unknown arc x.
Before deriving that differential equation, it will be instructive to try some ad-hoc
methods directly on the given problem.
A′. Ad-hoc Methods
Let a pair of endpoints (a, A), (b, B) with a < b and a smooth function L = L(t, x, v)
be given. We call any piecewise-smooth function defined on [a, b] an “arc”, and
distinguish two special families of arcs:
• an arc x: [a, b] → R is “admissible” if x(a) = A and x(b) = B;
• an arc h: [a, b] → R is “a variation” if h(a) = 0 and h(b) = 0.
Our ultimate goal is to minimize
    Λ[x] := ∫_a^b L(t, x(t), ẋ(t)) dt
among all admissible arcs x. In this “basic problem”, a simple admissible arc is
always available—the straight line between the given endpoints:
    x0(t) = A + [(t − a)/(b − a)] [B − A],    a ≤ t ≤ b.
The number Λ[x0 ] provides a reference value for the cost we hope to minimize: clearly
the minimum value must be less than or equal to Λ[x0 ]. The arc x0 also provides a
reference input for our problem, and we can search for preferable inputs by making
wise adjustments to x0 . For any specific variation h, consider the one-parameter
family of arcs
xλ (t) = x0 (t) + λh(t),
a ≤ t ≤ b; λ ∈ R.
Each of these arcs is admissible; the sign and magnitude of the parameter λ determine
the strength by which a perturbation of “shape” h is added to the reference shape
x0 . To assess how much improvement from the reference value we can achieve, we
define the function
    φ(λ) := Λ[xλ] = ∫_a^b L(t, xλ(t), ẋλ(t)) dt.
Here φ(0) = Λ[x0 ] is our reference value, and if φ′ (0) < 0 we know that φ(λ) < φ(0)
for small λ > 0: that is, a small perturbation of shape h will improve our reference arc. (If φ′ (0) > 0, then φ(λ) < φ(0) for small λ < 0, so an improvement is
still possible—it is obtained by adding a small positive multiple of −h to x0 .) To
get the most possible improvement out of this idea, we could somehow solve the
single-variable problem of choosing the scalar λ∗ that minimizes the function φ. The
resulting perturbed arc xλ∗ is certain to be preferable to x0 whenever φ′(0) ≠ 0. In
some practical problems, good choices of the reference arc x0 and the variation h
might make xλ∗ a usable improvement. Alternatively, one could build an iterative-improvement scheme by declaring xλ∗ as the new reference arc (change its name to
x0), choosing a new variation, and repeating the process above to generate further
improvements. (Note: This approach is both conceptually attractive and technically
feasible, but it is neither efficient nor effective. Training computers to find approximate solutions for the basic problem is an ongoing area of research, and the best
known methods are quite different from the one outlined above.)
Example (Soap Film). When L(t, x, v) = x√(1 + v²), (a, A) = (0, 1), and (b, B) =
(1, 2), the unique admissible linear arc is

    x0(t) = 1 + t,    0 ≤ t ≤ 1.

Calculation gives

    Λ[x0] = ∫_0^1 x0(t) √(1 + ẋ0(t)²) dt = ∫_0^1 (1 + t) √2 dt = 3√2/2 ≈ 2.1213.
Selecting the variation h(t) = sin(πt) leads to
    xλ(t) = 1 + t + λ sin(πt),    ẋλ(t) = 1 + πλ cos(πt),

    φ(λ) = ∫_0^1 (1 + t + λ sin(πt)) √(1 + (1 + πλ cos(πt))²) dt.
The computer-algebra system “Maple” suggests that the choice λ ≈ −0.1886 minimizes φ(λ), with
φ(−0.1886 . . .) ≈ 2.0796.
The arc x_{−0.1886} has a Λ-value that is 1.96% lower than Λ[x0].
Selecting the variation h(t) = t(1 − t) instead leads to
    xλ(t) = 1 + t + λt(1 − t),    ẋλ(t) = 1 + λ(1 − 2t),

    φ(λ) = ∫_0^1 (1 + t + λt(1 − t)) √(1 + (1 + λ(1 − 2t))²) dt.
Numerical minimization using “Maple” suggests that λ ≈ −0.7252 minimizes φ(λ), with

    φ(−0.725 . . .) ≈ 2.0792.

The arc x_{−0.725} produces a Λ-value that is 1.99% lower than Λ[x0].
Notice that the definitions of φ, xλ , and the minimizing λ-value all depend
implicitly on the chosen variation h. Some variations produce better results than
others.
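The Maple computations quoted above are easy to reproduce. The sketch below uses Python with scipy as a stand-in for Maple (an assumption of this example, not the notes’ tool) to evaluate φ(λ) by quadrature and minimize it over λ for both variations; the notes report minima near 2.0796 and 2.0792.

```python
# A sketch reproducing the computation attributed to Maple above, using scipy
# (an assumption of this example; any quadrature/minimization tool would do).
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def phi(lam, h, hdot):
    """phi(lam) = ∫_0^1 (x0 + lam*h) sqrt(1 + (x0' + lam*h')^2) dt, x0(t) = 1 + t."""
    integrand = lambda t: (1 + t + lam * h(t)) * np.sqrt(1 + (1 + lam * hdot(t)) ** 2)
    value, _ = quad(integrand, 0.0, 1.0)
    return value

variations = {
    "h(t) = sin(pi t)": (lambda t: np.sin(np.pi * t), lambda t: np.pi * np.cos(np.pi * t)),
    "h(t) = t(1 - t)":  (lambda t: t * (1 - t),       lambda t: 1 - 2 * t),
}

print("reference value phi(0):", phi(0.0, *variations["h(t) = sin(pi t)"]))  # ≈ 2.1213
for name, (h, hdot) in variations.items():
    res = minimize_scalar(lambda lam: phi(lam, h, hdot), bounds=(-2.0, 2.0), method="bounded")
    print(f"{name}:  lambda* ≈ {res.x:.4f},  phi(lambda*) ≈ {res.fun:.4f}")
```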
Theoretical methods we will develop below reveal that the minimizing arc in this
problem is
    x̂(t) = cosh(α + t cosh α) / cosh α,    where α ≈ 0.323074.

The minimum value is Λ[x̂] ≈ 2.0788, only 2.00% lower than Λ[x0].
Here are two plots to illustrate these results. The first shows the four admissible
arcs generated above: the straight line in blue, the trigonometric perturbation in
red, the quadratic perturbation in green, and the true minimizer in black. The
nonlinear arcs are so close together that they look identical when printed. To display
their differences, the second plot shows the differences x − xopt for the trigonometric
perturbation x in red, the quadratic perturbation x in green, and the true minimizer
x = xopt in black. (So the black line coincides with the t-axis.)
[Plot 1: the four admissible arcs on 0 ≤ t ≤ 1 (vertical range roughly 1 to 2.5).
Plot 2: the differences x − xopt on 0 ≤ t ≤ 1 (vertical scale 10⁻³).]
////
Descent Directions. Every time we choose a reference arc x0 and a variation h, we
can define the single-variable function
    φ(λ) := Λ[x0 + λh].
For small λ, the linear approximation

    φ(λ) = φ(0) + φ′(0)λ + o(λ)
reveals the scalar φ′ (0) as the rate of change of the objective value Λ with respect
to a variation in direction h, locally near the reference arc x0 . It seems natural to
say that h provides a “descent direction for Λ at x0 ” when φ′ (0) < 0. Let’s calculate
φ′ (0), holding tight to a single fixed variation h throughout.
    φ′(0) = lim_{λ→0} ( φ(λ) − φ(0) ) / λ
          = lim_{λ→0} ∫_a^b [ L(t, x0(t) + λh(t), ẋ0(t) + λḣ(t)) − L(t, x0(t), ẋ0(t)) ] / λ dt
          = ∫_a^b lim_{λ→0} [ L(t, x0(t) + λh(t), ẋ0(t) + λḣ(t)) − L(t, x0(t), ẋ0(t)) ] / λ dt
          = ∫_a^b (d/dλ) L(t, x0(t) + λh(t), ẋ0(t) + λḣ(t)) |_{λ=0} dt
          = ∫_a^b [ Lx(t, x0(t), ẋ0(t)) h(t) + Lv(t, x0(t), ẋ0(t)) ḣ(t) ] dt
          = ∫_a^b [ Lx(t, x0(t), ẋ0(t)) − (d/dt) Lv(t, x0(t), ẋ0(t)) ] h(t) dt    (int by parts).
In summary, we have
    φ′(0) = ∫_a^b R(t)h(t) dt,    where    R(t) = Lx(t, x0(t), ẋ0(t)) − (d/dt) Lv(t, x0(t), ẋ0(t)).    (∗∗)
A key observation: φ and φ′ (0) depend on our choice of h, but R does not.
Formula (∗∗) is valid for any smooth admissible x0 and variation h, and is full
of useful information.
Viewpoint 1. For an admissible arc x0 , we can use (∗∗) to guide a search for descent
directions. It suffices to choose a variation h whose pointwise product with R is large
and negative. The choice h = −R is particularly tempting, but the requirement that
h(a) = 0 = h(b) sometimes requires a modification of this selection.
Example. Suppose (a, A) = (0, 0), (b, B) = (π/2, π/2), and L(t, x, v) = v² − x². Consider
the linear reference arc x0(t) = t. Its objective value is

    Λ[x0] = ∫_0^{π/2} [ ẋ0(t)² − x0(t)² ] dt = ∫_0^{π/2} [ 1 − t² ] dt = π/2 − (1/3)(π/2)³ ≈ 0.2789.

To improve on this, calculate

    Lx(t, x, v) = −2x,    Lv(t, x, v) = 2v,    so Lx(t, x0(t), ẋ0(t)) = −2t,    Lv(t, x0(t), ẋ0(t)) = 2.

We get R(t) = [−2t] − (d/dt)[2] = −2t. Here −R(t) = 2t does not vanish at both endpoints, but it is positive everywhere, so we try a variation that is positive everywhere: h(t) = sin(2t).
Calculation gives

    φ(λ) = ∫_0^{π/2} { [1 + 2λ cos(2t)]² − [t + λ sin(2t)]² } dt
         = ∫_0^{π/2} { [1 + 4λ cos(2t) + 4λ² cos²(2t)] − [t² + 2λt sin(2t) + λ² sin²(2t)] } dt
         = (3π/4)λ² − (π/2)λ + π/2 − π³/24.

This is minimized when λ = 1/3, and the minimum value provides a 94% discount
from the reference value Λ[x0]:

    Λ[x_{1/3}] = φ(1/3) = 5π/12 − π³/24 ≈ 0.01707.
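A short numerical check of this example (a sketch, not part of the notes): compare the quadrature value of φ(λ) with the closed form above, and confirm the roughly 94% improvement at λ = 1/3.

```python
# Quadrature check of phi(lambda) against the closed form
# (3π/4)λ² − (π/2)λ + (π/2 − π³/24), and the improvement at λ = 1/3.
import numpy as np
from scipy.integrate import quad

def phi(lam):
    """phi(lam) = ∫_0^{π/2} [ (1 + 2λcos2t)² − (t + λ sin2t)² ] dt."""
    f = lambda t: (1 + 2 * lam * np.cos(2 * t)) ** 2 - (t + lam * np.sin(2 * t)) ** 2
    value, _ = quad(f, 0.0, np.pi / 2)
    return value

closed = lambda lam: 0.75 * np.pi * lam ** 2 - 0.5 * np.pi * lam + np.pi / 2 - np.pi ** 3 / 24

for lam in (0.0, 1.0 / 3.0):
    print(f"lambda = {lam:.4f}: quadrature {phi(lam):.5f}, closed form {closed(lam):.5f}")

print("discount at lambda = 1/3:", 1 - phi(1.0 / 3.0) / phi(0.0))   # ≈ 0.94
```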
Viewpoint 2. If our minimization problem has a solution, we don’t need to know it
in detail to assign it the name “x0 ”. If the solution happens to be smooth, then the
derivation above applies and conclusion (∗∗) is available. But now the minimality
property of x0 makes it impossible to improve upon: we must have φ′ (0) = 0 for
every possible variation h, i.e.,
    0 = ∫_a^b R(t)h(t) dt    for every h: [a, b] → R obeying h(a) = 0 = h(b).    (†)
This forces R(t) = 0 for all t. To see why any other outcome is impossible, imagine
that some θ ∈ (a, b) makes R(θ) < 0. Our smoothness hypotheses guarantee that R
is continuous on [a, b]. Recall

    R(t) = Lx(t, x0(t), ẋ0(t)) − (d/dt)[ Lv(t, x0(t), ẋ0(t)) ],    t ∈ [a, b].

So if R(θ) < 0, there must be some open interval (α, β) containing θ such that
[α, β] ⊆ [a, b] and R(t) < 0 for all t in (α, β). The variation

    h(t) = 1 − cos( 2π(t − α)/(β − α) )  for α ≤ t ≤ β,    h(t) = 0 otherwise,

is continuously differentiable, everywhere nonnegative, and has h(t) > 0 if and only
if t ∈ (α, β). Using this variation in (∗∗) would give

    φ′(0) = ∫_α^β R(t)h(t) dt < 0.
This contradicts (†). We have shown that if x0 is a smooth minimizer, then R cannot
take on any negative values. Positive R-values are impossible for similar reasons.
Our conclusion, R(t) = 0 for all t ∈ [a, b], is usually written as
    Lx(t, x0(t), ẋ0(t)) = (d/dt)[ Lv(t, x0(t), ẋ0(t)) ],    t ∈ [a, b].    (DEL)

This is the renowned Euler-Lagrange Equation (“EL”) in differentiated form (“D”,
hence “DEL”). If a smooth arc x0 gives the minimum in the basic problem, it must
obey (DEL). Solutions of (DEL) are called extremal arcs.
Example. When L(t, x, v) = v 2 − x2 , we have Lx (t, x, v) = −2x and Lv (t, x, v) = 2v,
so equation (DEL) for an unknown arc x(·) says
    −2x(t) = (d/dt)[ 2ẋ(t) ],    i.e.,    ẍ(t) + x(t) = 0.

A complete list of smooth solutions for this equation is

    x(t) = c1 cos(t) + c2 sin(t),    c1, c2 ∈ R.
In a previous example, the prescribed endpoints were (a, A) = (0, 0) and (b, B) =
(π/2, π/2). Only one solution joins these points: substitution gives

    0 = x(0) = c1,    π/2 = x(π/2) = c2 sin(π/2) = c2,

so x(t) = (π/2) sin(t) is the only smooth contender for optimality in the corresponding
problem. Its objective value is

    Λ[(π/2) sin] = ∫_0^{π/2} (π/2)² [ cos²t − sin²t ] dt = (π²/4) ∫_0^{π/2} cos(2t) dt = (π²/8) sin(2t) |_{t=0}^{π/2} = 0.
////
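The computations in this example are easy to verify symbolically. The sketch below (an illustration using sympy, not part of the notes) checks that x(t) = (π/2) sin t solves ẍ + x = 0, meets the endpoint data, and yields Λ = 0.

```python
# Symbolic check of the example: x(t) = (π/2) sin t satisfies (DEL) for
# L = v² − x², hits the prescribed endpoints, and gives Λ[x] = 0.
import sympy as sp

t = sp.symbols('t')
x = sp.pi / 2 * sp.sin(t)

print(sp.simplify(sp.diff(x, t, 2) + x))            # 0, so ẍ + x = 0 holds
print(x.subs(t, 0), x.subs(t, sp.pi / 2))           # endpoint values 0 and π/2
Lam = sp.integrate(sp.diff(x, t) ** 2 - x ** 2, (t, 0, sp.pi / 2))
print(sp.simplify(Lam))                             # 0
```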
Discussion. It seems natural to call R the residual in equation (DEL), and to
record the following interpretation. A given arc x0 is extremal if and only if it makes
R = 0. If x0 makes R(θ) < 0 at some instant θ, then perturbing x0 with a smooth
upward bump centred at θ will give a preferable arc (i.e., an arc with lower Λ-value).
A smooth downward bump is advantageous near any point where R(θ) > 0.
B. Abstract Minimization in Vector Spaces
Throughout this section, V is a real vector space, and Φ: V → R is given.
Derivatives. Suppose Φ: V → R is given. The directional derivative of Φ at base
point x̂ in direction h is this number (or “undefined”):

    Φ′[x̂; h] := lim_{λ→0⁺} ( Φ[x̂ + λh] − Φ[x̂] ) / λ.

Note: Φ′[x̂; 0] = 0, and for all r > 0,

    Φ′[x̂; rh] = lim_{λ→0⁺} ( Φ[x̂ + λrh] − Φ[x̂] ) / λ × (r/r) = r lim_{λ→0⁺} ( Φ[x̂ + (λr)h] − Φ[x̂] ) / (λr) = r Φ′[x̂; h].
When x̂ and Φ are such that Φ′[x̂; h] is defined for every h ∈ V, we say Φ is directionally differentiable at x̂, and define the derivative of Φ at x̂ as the operator
DΦ[x̂]: V → R for which

    DΦ[x̂](h) = Φ′[x̂; h]    ∀h ∈ V.
Descent. If Φ′[x̂; h] < 0, then h ≠ 0, and h provides a first-order descent direction
for Φ at x̂. That is, for 0 < λ ≪ 1,

    Φ′[x̂; h] ≈ ( Φ[x̂ + λh] − Φ[x̂] ) / λ    =⇒    Φ[x̂ + λh] ≈ Φ[x̂] + λΦ′[x̂; h] < Φ[x̂].    (∗)
Directional Local Minima. A point x̂ in V provides a Directional Local Minimum (DLM) for Φ over V exactly when, for every h ∈ V, there exists ε = ε(h) > 0
so small that

    ∀λ ∈ (0, ε),    Φ[x̂] ≤ Φ[x̂ + λh].

Intuitively, x̂ is a DLM for Φ if it provides an ordinary local minimum in the
one-variable sense along every line through x̂ in the space V.
Proposition. If x̂ gives a DLM for Φ over V, then

    ∀h ∈ V,    Φ′[x̂; h] ≥ 0    (or Φ′[x̂; h] is undefined).    (∗∗)

In particular, if Φ is directionally differentiable at x̂ and DΦ[x̂] is linear, then DΦ[x̂] = 0 (“the zero operator”).
Proof. If (∗∗) is false, then Φ′[x̂; h] < 0 for some h ∈ V, and the DLM definition is
contradicted by (∗). So (∗∗) must hold. Now if DΦ[x̂] is linear, then for arbitrary
h ∈ V two applications of (∗∗) give

    DΦ[x̂](h) = Φ′[x̂; h] ≥ 0,
    −DΦ[x̂](h) = DΦ[x̂](−h) = Φ′[x̂; −h] ≥ 0.

Thus 0 ≤ DΦ[x̂](h) ≤ 0, giving DΦ[x̂](h) = 0. Since this holds for arbitrary h ∈ V,
DΦ[x̂] must be the zero operator.
////
Application. When V = Rn and Φ: Rn → R has a gradient at x̂ ∈ Rn, we calculate

    Φ′[x̂; h] = lim_{λ→0⁺} ( Φ[x̂ + λh] − Φ[x̂] ) / λ
             = lim_{λ→0⁺} ( φ(λ) − φ(0) ) / λ    for φ(λ) := Φ[x̂ + λh]
             = φ′(0) = ∇Φ(x̂)h.

Here DΦ[x̂] is the operator “left matrix multiplication by the 1 × n matrix ∇Φ(x̂)”.
This is linear, so the proposition above gives a familiar result: if x̂ minimizes Φ over
Rn, then ∇Φ(x̂) must be the zero vector.
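A tiny numerical illustration of the identity Φ′[x̂; h] = ∇Φ(x̂)h (a sketch, not part of the notes; the test function Φ is an arbitrary assumption):

```python
# Compare a one-sided difference quotient for Φ'[x̂; h] with ∇Φ(x̂)·h in R^n.
import numpy as np

def Phi(x):                       # an arbitrary smooth test function (assumption)
    return np.sum(x ** 4) + np.dot(x, x)

def grad_Phi(x):
    return 4 * x ** 3 + 2 * x

rng = np.random.default_rng(0)
x_hat, h = rng.standard_normal(5), rng.standard_normal(5)

lam = 1e-6
one_sided = (Phi(x_hat + lam * h) - Phi(x_hat)) / lam
print("difference quotient:", one_sided)
print("gradient formula:   ", grad_Phi(x_hat) @ h)
```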
Affine Constraints. In many practical minimization problems, the domain of the
objective function is confined to a proper subset of a real vector space X. The simplest
such case arises when the subset S is affine, i.e., when
S = x0 + V = {x0 + h : h ∈ V }
for some element x0 ∈ X and subspace V of X.
Lemma. For S as above, one has S = z + V for every z ∈ S.
Proof. Since z ∈ S, there exists h0 ∈ V such that z = x0 + h0 . Then
y ∈ S ⇔ y = x0 + h for some h ∈ V
⇔ y = (z − h0 ) + h for some h ∈ V
⇔ y = z + (h − h0 ) for some h ∈ V
⇔ y = z + H for some H ∈ V (H = h − h0 ).
////
Definition. Given a vector space X containing an element x0 and a subspace V,
let S = x0 + V and suppose Φ: S → R. A point x̂ ∈ S gives a directional local
minimum (DLM) for Φ relative to V if and only if

    ∀h ∈ V,    ∃ε = ε(h) > 0 :    ∀λ ∈ [0, ε),    Φ[x̂] ≤ Φ[x̂ + λh].

Geometrically, x̂ provides a DLM relative to V if x̂ gives a local minimum in
every one-dimensional problem obtained by restricting Φ to a line passing through x̂
and parallel to V.
To harness our previous results, we translate (shift) our reference point from the
origin of X to the point x̂, defining Ψ: V → R via

    Ψ[h] := Φ[x̂ + h],    h ∈ V.

Note that

    Ψ′[0; h] = lim_{λ→0⁺} ( Ψ[λh] − Ψ[0] ) / λ = lim_{λ→0⁺} ( Φ[x̂ + λh] − Φ[x̂] ) / λ = Φ′[x̂; h].

If x̂ ∈ S gives a DLM for Φ relative to V, then 0 gives a DLM for Ψ over V, and our
previous result applies to Ψ, giving

    ∀h ∈ V,    Ψ′[0; h] ≥ 0    (or Ψ′[0; h] is undefined).
Summary. If x̂ ∈ S = x0 + V solves the constrained problem

    min_{x∈X} { Φ[x] : x ∈ x0 + V },

then

    ∀h ∈ V,    Φ′[x̂; h] ≥ 0    (or Φ′[x̂; h] is undefined).

If, in addition, Φ is differentiable on X and DΦ[x̂]: X → R is linear, then
DΦ[x̂](h) = 0 for all h ∈ V.
Application. Consider X = Rn . Given a nonzero normal vector N ∈ Rn and point
x0 ∈ Rn , consider the hyperplane
S = {x ∈ Rn : N • (x − x0 ) = 0} .
Observe that x ∈ S iff x − x0 ∈ V , where
V = {h ∈ Rn : N • h = 0} .
This V is a subspace of Rn: it's the set of all vectors perpendicular to N. Now
suppose some x̂ in Rn minimizes Φ over S. We have DΦ[x̂](h) = 0 for all h ∈ V, i.e.,
∇Φ(x̂)h = 0 for all h ∈ V, i.e., ∇Φ(x̂) ⊥ V, i.e., ∇Φ(x̂) = −λN for some λ ∈ R. In
summary, if x̂ minimizes Φ: Rn → R subject to the constraint N • (x − x0) = 0, then

    0 = ∇[ Φ(x) + λ N • (x − x0) ] |_{x=x̂}.

That’s the Lagrange Multiplier Rule for the case of a single (linear) constraint!
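A concrete instance of this rule is easy to check numerically. The sketch below (not part of the notes) minimizes the hypothetical objective Φ(x) = |x − p|² over a hyperplane by orthogonal projection and confirms that ∇Φ at the minimizer is a multiple of the normal N; all data values are assumptions chosen for illustration.

```python
# Single linear constraint: minimize Φ(x) = |x − p|² over {x : N·(x − x0) = 0}.
# The minimizer is the orthogonal projection of p onto the hyperplane, and
# ∇Φ(x̂) + λN = 0 for a suitable multiplier λ.
import numpy as np

N  = np.array([1.0, 2.0, -1.0])      # assumed normal vector
x0 = np.array([0.0, 1.0,  3.0])      # assumed point on the hyperplane
p  = np.array([2.0, 0.0,  1.0])      # unconstrained minimizer of Φ

x_hat = p - (N @ (p - x0)) / (N @ N) * N     # projection of p onto the hyperplane
grad  = 2 * (x_hat - p)                      # ∇Φ(x̂)

lam = -(grad @ N) / (N @ N)                  # multiplier making ∇Φ(x̂) + λN = 0
print("constraint value N·(x̂ − x0):", N @ (x_hat - x0))      # ≈ 0
print("‖∇Φ(x̂) + λN‖:", np.linalg.norm(grad + lam * N))       # ≈ 0
```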
C. Derivatives of Integral Functionals
Our “basic problem” involves minimization of Λ: C^1[a, b] → R defined by
Λ[x] = ∫_a^b L(t, x(t), ẋ(t)) dt for some given Lagrangian L ∈ C^1([a, b] × R × R). So we pick
arbitrary x̂, h ∈ C^1[a, b] and calculate

    Λ′[x̂; h] = lim_{λ→0⁺} (1/λ) ( Λ[x̂ + λh] − Λ[x̂] )                                              (1)
             = lim_{λ→0⁺} (1/λ) ∫_a^b [ L(t, x̂(t) + λh(t), ẋ̂(t) + λḣ(t)) − L(t, x̂(t), ẋ̂(t)) ] dt   (2)
             = ∫_a^b lim_{λ→0⁺} (1/λ) [ L(t, x̂(t) + λh(t), ẋ̂(t) + λḣ(t)) − L(t, x̂(t), ẋ̂(t)) ] dt   (3)
             = ∫_a^b (∂/∂λ) L(t, x̂(t) + λh(t), ẋ̂(t) + λḣ(t)) |_{λ=0} dt                            (4)
             = ∫_a^b [ Lx(t, x̂(t), ẋ̂(t)) h(t) + Lv(t, x̂(t), ẋ̂(t)) ḣ(t) ] dt.                       (5)
Here line (1) is the definition of the directional derivative, and line (2) comes from
the definition of Λ. Passing from (2) to (3) requires that we interchange the limit and
the integral. In our case this is justified because the limit is approached uniformly
in t, a consequence of L ∈ C 1 . See, e.g., Walter Rudin, Real and Complex Analysis,
page 223. Existence of the derivative in (4) and its evaluation in (5) also follow from
our assumption that L ∈ C 1 ; the definition of this derivative allows it to be evaluated
as shown inside the integral in (3).
Using the notation L̂(t) = L(t, x̂(t), ẋ̂(t)), and likewise defining L̂x(t) and L̂v(t),
we summarize: if L ∈ C^1 and x̂ ∈ C^1[a, b], then

    Λ′[x̂; h] = ∫_a^b [ L̂x(t)h(t) + L̂v(t)ḣ(t) ] dt    ∀h ∈ C^1[a, b].    (∗)

Integration by parts gives an equivalent expression. Let q(t) = ∫_a^t L̂x(r) dr. Then
q(a) = 0 and

    dq(t) = q̇(t) dt = L̂x(t) dt    ∀t ∈ [a, b].
(The latter follows from the Fundamental Theorem of Calculus, since L̂x is continuous
on [a, b].) Hence the first term on the right in (∗) is

    ∫_a^b L̂x(t)h(t) dt = ∫_a^b h(t) dq = q(t)h(t) |_{t=a}^{b} − ∫_a^b q(t) dh(t) = q(b)h(b) − ∫_a^b q(t)ḣ(t) dt.

We conclude that for all h ∈ C^1[a, b],

    Λ′[x̂; h] = [ ∫_a^b L̂x(t) dt ] h(b) + ∫_a^b [ L̂v(t) − ∫_a^t L̂x(r) dr ] ḣ(t) dt.    (∗∗)
Note that this expression is well-defined for all h ∈ C^1[a, b], and linear in h.
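Formula (∗) lends itself to a direct numerical test. The sketch below (not part of the notes) compares a symmetric difference quotient of λ ↦ Λ[x̂ + λh] with the integral in (∗), for the assumed data L(t, x, v) = v² − x², x̂(t) = t, h(t) = t² on [0, π/2].

```python
# Numerical sanity check of (∗): Λ'[x̂; h] = ∫_a^b [ L̂x h + L̂v ḣ ] dt.
import numpy as np
from scipy.integrate import quad

a, b = 0.0, np.pi / 2
L      = lambda t, x, v: v ** 2 - x ** 2
Lx, Lv = (lambda t, x, v: -2 * x), (lambda t, x, v: 2 * v)
x_hat, x_hat_dot = (lambda t: t), (lambda t: 1.0)
h, h_dot         = (lambda t: t ** 2), (lambda t: 2 * t)

def Lam(eps):
    """Λ[x̂ + eps*h] by quadrature."""
    f = lambda t: L(t, x_hat(t) + eps * h(t), x_hat_dot(t) + eps * h_dot(t))
    return quad(f, a, b)[0]

formula = quad(lambda t: Lx(t, x_hat(t), x_hat_dot(t)) * h(t)
                        + Lv(t, x_hat(t), x_hat_dot(t)) * h_dot(t), a, b)[0]
eps = 1e-6
print("difference quotient:", (Lam(eps) - Lam(-eps)) / (2 * eps))
print("formula (∗):        ", formula)
```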
D. Fundamental Lemma; Euler-Lagrange Equation
Given real numbers a < b, consider these subspaces of C 1 [a, b]:
    VII = { h ∈ C^1[a, b] : h(a) = 0, h(b) = 0 },
    VI0 = { h ∈ C^1[a, b] : h(a) = 0 },
    V0I = { h ∈ C^1[a, b] : h(b) = 0 },
    VMN = { h ∈ C^1[a, b] : M h(a) = 0, N h(b) = 0 },
    V00 = C^1[a, b].
Recall the basic problem of the COV:
    min_{x∈C^1[a,b]} { Λ[x] := ∫_a^b L(t, x(t), ẋ(t)) dt : x(a) = A, x(b) = B }.
With X = C 1 [a, b], the arcs competing for minimality in (P ) are those in the set
S = {x ∈ X : x(a) = A, x(b) = B} .
An arc is admissible for (P ) if it lies in S. This set is affine: pick any x0 in S (like
the straight-line curve x0 (t) = A + (t − a)(B − A)/(b − a)) to see
S = x0 + VII .
[Proof: One has x ∈ S if and only if x − x0 ∈ VII. This is easy.] Our general
discussion above shows that if some arc x̂ in S gives a directional local minimum for
Λ relative to VII, then every h ∈ VII must satisfy

    0 = Λ′[x̂; h] = ∫_a^b N̂(t)ḣ(t) dt,    where    N̂(t) = L̂v(t) − ∫_a^t L̂x(r) dr.
This situation explains our interest in the Fundamental Lemma below.
Lemma (du Bois-Reymond). If N: [a, b] → R is integrable, TFAE:

(a) ∫_a^b N(t)ḣ(t) dt = 0 for all h ∈ VII.

(b) The function N is essentially constant.

Proof. (b⇒a): If N(t) = c for all t in [a, b] (allowing finitely many exceptions), then
each h ∈ VII obeys

    ∫_a^b c ḣ(t) dt = c h(t) |_{t=a}^{b} = 0.    (∗)
(a⇒b): Consider any piecewise continuous function N. Use the average value of N,
namely the constant

    N̄ = (1/(b − a)) ∫_a^b N(r) dr,

to define

    h(t) = ∫_a^t [ N̄ − N(r) ] dr.

Obviously h ∈ C^1[a, b] with h(a) = 0, but also (by definition of N̄)

    h(b) = ∫_a^b [ N̄ − N(r) ] dr = (b − a)N̄ − ∫_a^b N(r) dr = 0.

Therefore h ∈ VII; in particular, using c = N̄ in line (∗) gives

    ∫_a^b N̄ ḣ(t) dt = 0.

It follows that

    ∫_a^b N(t)ḣ(t) dt = ∫_a^b [ N(t) − N̄ ] ḣ(t) dt = ∫_a^b [ N(t) − N̄ ][ N̄ − N(t) ] dt = − ∫_a^b [ N(t) − N̄ ]² dt.

If (a) holds, this integral equals 0, and therefore N(t) = N̄ for all t in [a, b] (with at
most finitely many exceptions). This proves (b).
////
Theorem (Euler-Lagrange Equation—Integral Form). If x̂ is a directional
local minimizer in the basic problem (P), then there is a constant c such that

    L̂v(t) = c + ∫_a^t L̂x(r) dr    ∀t ∈ [a, b].    (IEL)
Proof. As discussed above, directional local minimality implies that every h in
VII obeys

    0 = Λ′[x̂; h] = ∫_a^b N̂(t)ḣ(t) dt,    where    N̂(t) = L̂v(t) − ∫_a^t L̂x(r) dr.

Thanks to the Fundamental Lemma, it follows that N̂ is essentially constant. But
since both L and x̂ are C^1, the function N̂ is continuous on [a, b], so (IEL) holds at
all points without exception.
////
Definition. An arc that satisfies (IEL) on some interval [a, b] is called an extremal
for L on that interval.
Remark. Note that (IEL) is the same for the function −L as it is for L, so it must
hold also for a directional local maximizer in the Basic Problem. An extremal arc in
the COV is analogous to a “critical point” in ordinary calculus: the set of extremal
arcs includes every arc that provides a (directional) local minimum or maximum, and
possibly some arcs that provide neither.
Remark. For any admissible arc x̂ that fails to satisfy (IEL), the continuous function
N̂ defined above will be nonconstant, and the proof of the Fundamental Lemma shows
that

    h(t) = ∫_a^t [ N̄ − N̂(r) ] dr,    where    N̄ = (1/(b − a)) ∫_a^b N̂(r) dr,

makes Λ′[x̂; h] < 0. That is, this h is a descent direction for Λ[·] relative to the
nominal arc x̂.
////
Regularity Bonus. The integrand L(t, x, v) = (1/2)v², for which Lv(t, x, v) = v, gives
Lv(t, x(t), ẋ(t)) = ẋ(t). So for typical arcs x ∈ C^1[a, b], we can be sure that the
mapping t ↦ Lv(t, x(t), ẋ(t)) is continuous, but perhaps no better. However, minimality exerts a little bias in favour of smoothness: in (IEL), the integrand on the RHS
is continuous, so the RHS function of t is differentiable. This means that t ↦ L̂v(t) is
guaranteed to be differentiable, with

    (d/dt) L̂v(t) = L̂x(t)    ∀t ∈ [a, b].    (DEL)
In fact, since the right side here is continuous, we have the following.
Corollary. If x̂ gives a directional local minimum in problem (P), then L̂v is continuously differentiable, and (DEL) holds.
We discuss consequences of this in the next section.
Example. Find candidates for minimality in (P ) with L(t, x, v) = v 2 − x2 and
(a, A) = (0, 0), in cases
(i) (b1 , B1 ) = (π/2, 1),
(ii) (b2 , B2 ) = (3π/2, 1).
Note Lv (t, x, v) = 2v and Lx (t, x, v) = −2x.
If x̂ minimizes Λ among all C^1 curves from (0, 0) to (b1, B1), then (DEL) says

    (d/dt)[ 2ẋ̂(t) ] = −2x̂(t)    ∀t.

That is, ẍ̂(t) = −x̂(t) for all t. This shows x̂ ∈ C^2, and gives the general solution

    x̂(t) = c1 cos(t) + c2 sin(t),    c1, c2 ∈ R.
(i) Here the boundary conditions give c1 = 0, c2 = 1. Unique candidate: x̂1(t) = sin(t).
Later we’ll show that x̂1 gives a true [global] minimum:

    Λ1[x̂1] = min { Λ1[x] = ∫_0^{π/2} [ ẋ(t)² − x(t)² ] dt : x(0) = 0, x(π/2) = 1 }.
(ii) Here the BC’s identify the unique candidate x̂2(t) = − sin(t). Later we’ll show
that this does not give even a directional local minimum; moreover,

    inf { Λ2[x] = ∫_0^{3π/2} [ ẋ(t)² − x(t)² ] dt : x(0) = 0, x(3π/2) = 1 } = −∞.
////
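The claim that the infimum in case (ii) equals −∞ can be illustrated numerically. The family below is a hypothetical choice made for this sketch (it is not the argument the notes promise later): x_c(t) = −sin t + c sin(2t/3) is admissible for every c, and Λ2[x_c] is a quadratic in c with negative leading coefficient, so it decreases without bound.

```python
# Evaluate Λ2 along the admissible family x_c(t) = −sin t + c·sin(2t/3),
# which satisfies x_c(0) = 0 and x_c(3π/2) = 1 for every c.
import numpy as np
from scipy.integrate import quad

b2 = 3 * np.pi / 2

def Lam2(c):
    x    = lambda t: -np.sin(t) + c * np.sin(2 * t / 3)
    xdot = lambda t: -np.cos(t) + c * (2 / 3) * np.cos(2 * t / 3)
    return quad(lambda t: xdot(t) ** 2 - x(t) ** 2, 0.0, b2)[0]

for c in (0.0, 2.0, 5.0, 10.0, 20.0):
    x_end = -np.sin(b2) + c * np.sin(2 * b2 / 3)     # always 1 (up to rounding)
    print(f"c = {c:5.1f}:  x(3π/2) = {x_end:.3f},  Λ2 = {Lam2(c):10.2f}")
```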
Special Case 1: L = L(t, v) is independent of x.
Here (IEL) reduces to a first-order ODE for x̂, involving an unknown constant:

    L̂v(t) = const.

Consider these subcases, where L = L(v) is also independent of t:

    L = √(1 + v²),    L = v²,    L = [ (v² − 1)⁺ ]².
In these three cases, every admissible extremal x̂ is a global minimizer. To see this,
let c = Lv(ẋ̂) and define

    f(v) = L(v) − cv.

Then f′(v) = Lv(v) − c is nondecreasing, with f′(ẋ̂(t)) = 0, so that point gives a
global minimum. In other words,

    f(v) ≥ f(ẋ̂(t))    ∀v ∈ R, ∀t ∈ [a, b].    (∗)

Now every arc x obeying the BC’s has ∫_a^b cẋ(t) dt = c [x(b) − x(a)] = c [B − A], so

    ∫_a^b f(ẋ(t)) dt ≥ ∫_a^b f(ẋ̂(t)) dt,
    ∫_a^b L(ẋ(t)) dt − c [B − A] ≥ ∫_a^b L(ẋ̂(t)) dt − c [B − A],
    Λ[x] ≥ Λ[x̂].

(For L = √(1 + v²), this proves that the arc of shortest length from (a, A) to (b, B)
is the straight line. The technical definition of the term “arc” here leaves room for
some improvement in this well-known conclusion.)
An Optimistic Calculation: Suppose x̂ solves (IEL) and is in fact C^2. (See
“Regularity Bonus” above, and Section D below.) Then the Chain Rule and (DEL)
together give

    (d/dt) L(t, x̂(t), ẋ̂(t)) = L̂t(t) + L̂x(t)ẋ̂(t) + L̂v(t)ẍ̂(t)
                            = L̂t(t) + (d/dt)[ L̂v(t)ẋ̂(t) ]    by (DEL).

We rearrange this to get

    (d/dt)[ L̂(t) − L̂v(t)ẋ̂(t) ] = L̂t(t)    ∀t ∈ [a, b].    (WE2)
Special Case 2: L = L(x, v) is independent of t (“autonomous”). For every extremal x̂ of class C^2, (WE2) implies that

    L̂(t) − L̂v(t)ẋ̂(t) = C    ∀t ∈ [a, b],

for some constant C. In other words, the following function of 2 variables is constant
along every C^2 arc solving (IEL):

    L(x, v) − Lv(x, v) · v.
In Physics, a famous Lagrangian is L(x, v) = (1/2)mv² − (1/2)kx² (that’s KE minus
PE for a simple mass-spring system). Here Lv = mv, and the function above works
out to

    [ (1/2)mv² − (1/2)kx² ] − (mv)v = −[ (1/2)mv² + (1/2)kx² ],

the negative of the total energy. For other Lagrangians, condition (WE2) expresses conservation of
energy along real motions.
////
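A symbolic check of this conservation statement (a sketch using sympy, not part of the notes): along any solution of m ẍ = −k x, the quantity L − Lv·ẋ is constant in time.

```python
# Verify that L − Lv·ẋ = −((1/2)m ẋ² + (1/2)k x²) is constant along a
# mass-spring motion x(t) = A cos(ω t), ω = sqrt(k/m).
import sympy as sp

t, m, k, A = sp.symbols('t m k A', positive=True)
omega = sp.sqrt(k / m)
x = A * sp.cos(omega * t)               # a solution of m x'' + k x = 0
v = sp.diff(x, t)

L = sp.Rational(1, 2) * m * v ** 2 - sp.Rational(1, 2) * k * x ** 2
quantity = L - (m * v) * v              # L − Lv·v, with Lv = m v

print(sp.simplify(sp.diff(quantity, t)))    # 0: the (WE2) quantity is conserved
print(sp.simplify(quantity))                # −A²k/2, minus the total energy
```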
Caution. (WE2) and (IEL) are not quite equivalent, even for very smooth arcs. The
Lagrangian L(x, v) = (1/2)mv² − (1/2)kx² illustrates this. As shown above, (WE2) holds
along an arc x̂ if and only if

    (1/2)m ẋ̂(t)² + (1/2)k x̂(t)² = const,

and this is true for any constant function x̂. However, (IEL) holds along x̂ iff

    m ẍ̂(t) + k x̂(t) = 0,

and the only constant solution of this equation is x̂(t) = 0. The upshot: Every
smooth solution of (IEL) must obey (WE2), but (WE2) may have some spurious
solutions as well. Home practice: Show that if x ∈ C^2 obeys (WE2) and satisfies
ẋ(t) ≠ 0 for all t, then x obeys (IEL) also. (Thus, many solutions of WE2 also obey
IEL, and we can predict which they are.)
////
D. Smoothness of Extremals
Here we explore the “regularity bonus” mentioned above in detail. The situation
is especially good when L is quadratic in v.
Proposition. Suppose

    L(t, x, v) = (1/2)A(t, x)v² + B(t, x)v + C(t, x)

for C^1 functions A, B, C. Then for any x̂ obeying (IEL), with A(t, x̂(t)) ≠ 0 for all
t ∈ [a, b], we have x̂ ∈ C^2[a, b].

Proof. Here Lv(t, x, v) = A(t, x)v + B(t, x). Along the arc x̂, this identity gives

    ẋ̂(t) = [ L̂v(t) − B(t, x̂(t)) ] / A(t, x̂(t)),    t ∈ [a, b].

The function of t on the RHS is C^1, so ẋ̂ ∈ C^1[a, b], giving x̂ ∈ C^2[a, b].
////
The previous result covers a surprising wealth of examples, but can still be
generalized. Recall the quadratic approximation:
    f(v) ≈ f(v̂) + ∇f(v̂)(v − v̂) + (1/2)(v − v̂)ᵀ D²f(v̂)(v − v̂),    v ≈ v̂.

Use this on the function f(v) = L(t, x, v) near v̂ = ẋ̂(t):

    L(t, x, v) ≈ L(t, x, v̂) + Lv(t, x, v̂)(v − v̂) + (1/2)Lvv(t, x, v̂)(v − v̂)².

Here the coefficient of (1/2)v² is A(t, x) = Lvv(t, x, v̂). So we might expect the proposition above to hold for general L, under the assumption that L̂vv(t) ≠ 0 for all
t ∈ [a, b]. This turns out to be correct, but the reasons are not simple.
Classic M100 problem: Assuming the relation

    v³ − v − t = 0

defines v as a function of t near the point (t, v) = (0, 0), find dv/dt there.

Solution: Differentiation gives

    3v² (dv/dt) − (dv/dt) − 1 = 0    =⇒    dv/dt = 1 / (3v² − 1).

At the point (t, v) = (0, 0), substitution gives

    dv/dt |_{(t,v)=(0,0)} = −1.
The curve t = v 3 − v is easy to draw in the (t, v)-plane. As the following sketch
shows, there are three points on the curve satisfying t = 0, and the calculation above
finds the slope at just one of them:
Figure 1: The curve v³ − v − t = 0 in (t, v)-space.
(The shaded rectangle will be described later.)
Classic M200 reformulation: Assuming the relation F (t, v) = 0 defines v as a function
of t near the point (t0 , v0 ), find dv/dt at this point.
Solution:

    0 = (d/dt) F(t, v(t)) = Ft(t, v(t)) + Fv(t, v(t)) (dv/dt)    =⇒    dv/dt = − Ft(t, v(t)) / Fv(t, v(t)).
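The same formula can be checked symbolically on the M100 relation. The sketch below (using sympy; an illustration, not part of the notes) computes −Ft/Fv for F(t, v) = v³ − v − t and recovers the slope −1 at (0, 0).

```python
# Apply the M200 formula dv/dt = −Ft/Fv to F(t, v) = v³ − v − t.
import sympy as sp

t, v = sp.symbols('t v')
F = v ** 3 - v - t

dvdt = -sp.diff(F, t) / sp.diff(F, v)
print(sp.simplify(dvdt))               # 1/(3*v**2 - 1)
print(dvdt.subs({t: 0, v: 0}))         # -1, the slope found in the M100 problem
```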
Classic M321 theorem (Rudin, Principles, Thm. 9.28):
Implicit Function Theorem. An open set U ⊆ R2 is given, along with
(t0 , v0 ) ∈ U and a function F : U → R (F = F (t, v)) such that both Ft and
Fv exist and are continuous at each point of U . Suppose F (t0 , v0 ) = 0. If
Fv(t0, v0) ≠ 0, then there are open intervals (α, β) containing t0 and (p, q)
containing v0 such that
(i) For each t ∈ (α, β), the equation F (t, v) = 0 holds for a unique point
v ∈ (p, q).
(ii) If we write ψ(t) for the unique v in (i), so that F (t, ψ(t)) = 0 for all
t ∈ (α, β), then ψ ∈ C^1(α, β), with

    ψ̇(t) = − Ft(t, ψ(t)) / Fv(t, ψ(t))    ∀t ∈ (α, β).
Illustrations. The equation z = F (t, v) defines a surface (the graph of F ) in R3
lying above the open set U . The (t, v)-plane in R3 is defined by the equation z = 0.
This plane slices the graph of F in the same curve we see in the M100 example above.
Figure 2: The plane z = 0 slicing the surface z = F (t, v).
The Theorem presents conditions under which some open rectangle (α, β) × (p, q)
centred at (t0 , v0 ) contains a piece of the curve that coincides with the graph of a C 1
function on (α, β). A rectangle consistent with this conclusion is shown in Figure 1.
We now prove the famous regularity theorem of Weierstrass/Hilbert.
Theorem (Weierstrass/Hilbert). Suppose L = L(t, x, v) is C 2 on some open set
containing the point (t0 , x0 , v0 ) ∈ [a, b] × R × R. Assume
    Lvv(t0, x0, v0) ≠ 0.

Then any arc x̂ obeying (IEL) on some interval I containing t0, and satisfying
(t0, x̂(t0), ẋ̂(t0)) = (t0, x0, v0), must be C^2 on some relatively open subinterval of
I containing t0. In particular, if L ∈ C^2 everywhere and x̂ ∈ C^1[a, b] is extremal for
L, then x̂ ∈ C^2[a, b].
Proof. Since x̂ is an extremal, there is a constant c so that the function

    F(t, v) := Lv(t, x̂(t), v) − ∫_a^t L̂x(r) dr − c

obeys F(t, ẋ̂(t)) = 0 for all t ∈ [a, b]. In particular, F(t0, v0) = 0. Now the function F is
jointly C^1 near (t0, v0), and

    Fv(t0, v0) = Lvv(t0, x0, v0) ≠ 0

by hypothesis. Hence the implicit function theorem cited above gives an open interval (α, β)
around t0 and another open set U around v0 such that the conditions

    F(t, ψ(t)) = 0,    ψ(t) ∈ U

implicitly define a unique ψ ∈ C^1(α, β). Since ẋ̂ is continuous at t0, with ẋ̂(t0) = v0 ∈
U, we may shrink (α, β) if necessary to guarantee that ẋ̂(t) ∈ U for all t ∈ I ∩ (α, β).
We already know F(t, ẋ̂(t)) = 0 in I ∩ (α, β), so uniqueness gives ψ(t) = ẋ̂(t) for all t
in this interval. But since ψ ∈ C^1, this gives ẋ̂ ∈ C^1, i.e., x̂ ∈ C^2(I ∩ (α, β)). ////
E. Natural Boundary Conditions
Our abstract theory is easily extended to problems where one or both of the endpoint
values are unconstrained. A missing boundary condition in the problem formulation
allows a larger class of admissible variations, and this gives an extra conclusion in the first-order analysis. The extra conclusion is called a “Natural Boundary
Condition” because it arises naturally through minimization, rather than artificially
in the formulation of the problem.
Consider, for example, this problem where the right endpoint is unconstrained:
    min { Λ[x] := ∫_a^b L(t, x(t), ẋ(t)) dt : x(a) = A, x(b) ∈ R }.
For any arc x̂ with x̂(a) = A, the arc x̂ + h is admissible for every variation h ∈ VI0 =
{ h ∈ C^1[a, b] : h(a) = 0 }. If x̂ happens to give a DLM relative to VI0, then every
h ∈ VI0 obeys

    0 = DΛ[x̂](h) = [ ∫_a^b L̂x(t) dt ] h(b) + ∫_a^b [ L̂v(t) − ∫_a^t L̂x(r) dr ] ḣ(t) dt.    (∗)
Now VII ⊆ VI0 , and knowing (∗) for h ∈ VII is enough to establish (IEL), as shown
above: thus there exists some constant c such that
    L̂v(t) = c + ∫_a^t L̂x(r) dr    ∀t ∈ [a, b].

This equation is independent of h: it describes a property of x̂ that can be used
anywhere, including in (∗) above. Hence, for all h ∈ VI0,

    0 = DΛ[x̂](h) = [ L̂v(b) − c ] h(b) + ∫_a^b c ḣ(t) dt
                  = [ L̂v(b) − c ] h(b) + c [ h(b) − h(a) ]
                  = L̂v(b) h(b).

Since h(b) is arbitrary, we get the natural boundary condition

    L̂v(b) = 0.
A similar argument, with a similar outcome, applies to problems when x(a) is free
to vary in R, but x(b) = B is required. When both endpoints are free, both natural
boundary conditions are in force. The table below summarizes these results:
    BC's in Prob Stmt        Admissible Variations    Natural BC's
    x(a) = A, x(b) = B       VII                      (none)
    x(a) = A                 VI0                      L̂v(b) = 0
    x(b) = B                 V0I                      L̂v(a) = 0
    (none)                   V00 = C^1[a, b]          L̂v(a) = 0, L̂v(b) = 0
Example. Consider the Brachistochrone Problem with a free right endpoint:
    min { Λ[x] = ∫_0^b √( (1 + ẋ(t)²) / (v0² + 2gx(t)) ) dt : x(0) = 0 }.

Here Lv = v [ (1 + v²)(v0² + 2gx) ]^{−1/2}, and the natural boundary condition at t = b
says a minimizing curve must obey

    0 = L̂v(b) = v [ (1 + v²)(v0² + 2gx) ]^{−1/2} |_{(x,v)=(x̂(b), ẋ̂(b))}.

It follows that ẋ̂(b) = 0: the minimizing curve must be horizontal at its right end.
[Troutman, page 156: “In 1696, Jakob Bernoulli publicly challenged his younger
brother Johann to find the solutions to several problems in optimization including
[this one] (thereby initiating a long, bitter, and pointless rivalry between two representatives of the best minds of their era).”]
////
HW02. Modify the derivation above to treat free-endpoint problems where the cost
to be minimized includes endpoint terms, so it looks like this:
    k(x(a)) + ℓ(x(b)) + ∫_a^b L(t, x(t), ẋ(t)) dt.
Future Considerations. Later, we’ll study problems in which one or both endpoints
are allowed to vary along given curves in the (t, x)-plane. The analysis above handles
only rather special curves . . . namely, the vertical lines t = a and t = b.