MA209 Variational Principles
June 3, 2013
The course covers the basics of the calculus of variations, and derives the Euler-Lagrange equations for minimising functionals of the type I(y) = ∫ f(x, y, y') dx. It then gives examples of this in physics, namely optics and mechanics. It furthermore considers constrained motion and the method of Lagrange multipliers. Required is a basic understanding of differentiation in many dimensions, together with a knowledge of how to solve ODEs.
Contents

1 Review of Calculus
  1.1 Functions of One Variable
  1.2 Functions of Several Variables

2 Variational Problems

3 Derivation of the Euler-Lagrange Equations
  3.1 The one variable - one derivative case
  3.2 Solutions of some examples
  3.3 Extension of the Theory
    3.3.1 More Derivatives
    3.3.2 Several dependent functions

4 Relationship with Optics and Fermat's Principle
  4.1 Fermat's Principle
  4.2 Optical Analogy

5 Hamilton's Principle

6 Constraints and Lagrange Multipliers
  6.1 Finite Dimensions
    6.1.1 Two dimensions
    6.1.2 n dimensions
    6.1.3 Examples
    6.1.4 A functional constrained by a functional
    6.1.5 One functional constrained by a function

7 Constrained Motion
These notes are based on the 2011 MA209 Variational Principles course, taught by J. H. Rawnsley, typeset by Matthew Egginton. No guarantee is given that they are accurate or applicable, but hopefully they will assist your study. Please report any errors, factual or typographical, to m.egginton@warwick.ac.uk
MA209 Variational Principles Lecture Notes 2011
1 Review of Calculus

1.1 Functions of One Variable

Figure 1: Graph showing a maximum at x = a
Suppose that x = a is a maximum of f. Then the graph of f bears some resemblance to that in figure 1. Suppose that f is differentiable and that f'(a) ≠ 0. Then we either have f'(a) > 0 or f'(a) < 0. Consider the former. Since f'(a) = lim_{h→0} (f(a+h) − f(a))/h, for h > 0 small we have f(a+h) − f(a) > 0, and so f(a+h) > f(a); but as f(a) is a maximum, f'(a) > 0 must be impossible. A similar argument shows that f'(a) < 0 is impossible. Hence our original assumption is false, and so f'(a) = 0.
However, there are functions f with f'(a) = 0 at values of a which aren't extrema, for example f(x) = x³ at a = 0. We call a point a where f'(a) = 0 a critical point of the function f, and we have shown that the set of extrema is a subset of the set of critical points. This is also true for the set of local extrema.
Example 1.1 Let f(x) = ax² + bx + c with a ≠ 0. Then f'(x) = 2ax + b, with x = −b/2a the only critical point. Now

    f(x) − f(−b/2a) = ax² + bx + c − (b²/4a − b²/2a + c) = ax² + bx + b²/4a = a(x + b/2a)²

and so −b/2a is a minimum for a > 0 and a maximum for a < 0.
In general, this won't be so pretty, but for "nice" functions with Taylor series we have f(a + h) − f(a) = h f'(a) + (h²/2!) f''(a) + ..., and so if f''(a) ≠ 0 we can decide: if f''(a) > 0 we have a local minimum, and if f''(a) < 0 we have a local maximum.
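The second-derivative test is easy to check numerically. The sketch below (standard library only; the sample quadratic and the step size are illustrative choices, not from the notes) estimates f'' at the critical point with a central finite difference:

```python
def second_derivative(f, a, h=1e-5):
    # central finite-difference estimate of f''(a)
    return (f(a + h) - 2 * f(a) + f(a - h)) / h**2

# sample quadratic f(x) = 3x^2 - 4x + 1: critical point at x = -b/2a = 2/3
f = lambda x: 3 * x**2 - 4 * x + 1
x0 = 2 / 3
curvature = second_derivative(f, x0)
assert abs(curvature - 6.0) < 1e-3   # f''(x0) = 2a = 6 > 0: local minimum
```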
1.2 Functions of Several Variables

We will look at the two variable case. Consider a differentiable function f(x1, x2) with an extremum at (a1, a2). Pick functions x1(t) and x2(t) such that x1(0) = a1 and x2(0) = a2, and set g(t) = f(x1(t), x2(t)). Then g takes some of the values of f, and t = 0 is an extremum of g because (a1, a2) is an extremum of f. Thence g'(0) = 0, and so if (a1, a2) is an extremum then we have that

    d/dt f(x1(t), x2(t)) |_{t=0} = 0    (1)

for any pair of functions (x1(t), x2(t)) passing through (a1, a2). Thus by the chain rule we have that

    ∂f/∂x1 (a1, a2) dx1/dt (0) + ∂f/∂x2 (a1, a2) dx2/dt (0) = 0

As this is true for arbitrary functions, we must have that ∂f/∂x1 (a1, a2) = 0 = ∂f/∂x2 (a1, a2). Note that we could have picked functions with independent derivatives at t = 0 specifically.
For n variables, f(x) real valued and with an extremum at x = a, we pick a function gv(t) = f(a + tv), where v is an arbitrary vector. Then this will have an extremum at t = 0, so gv'(0) = 0 for all v, and so ∇f(a) · v = 0 for all v, giving ∇f(a) = 0. If t = 0 is a local maximum of gv for every v then a is a local maximum of f. Now

    gv''(0) = Σ_{ij} vi vj ∂²f/∂xi∂xj (a) = v · Hess f(a) v

and if Hess f(a) < 0 (negative definite) then all eigenvalues of Hess f(a) must be negative and a is a local maximum. If the eigenvalues have mixed signs or some are zero then we can deduce nothing.
Example 1.2 Suppose that f(x, y) = ax² + bxy + cy². Then ∇f = (2ax + by, bx + 2cy) = (0, 0) at an extremum. Thus if 4ac − b² ≠ 0 then x = 0 = y is the only critical point. Here

    Σ_{ij} vi vj ∂²f/∂xi∂xj (a) = 2a v1² + 2b v1 v2 + 2c v2²

and so if a ≠ 0 this equals 2a (v1 + (b/2a) v2)² + 2(c − b²/4a) v2², and so we have a maximum or minimum when a (c − b²/4a) > 0, i.e. when 4ac > b².
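For the quadratic above the Hessian is constant, so the sign test reduces to its eigenvalues. A minimal numerical sketch (standard library only; the coefficients a, b, c are illustrative values):

```python
import math

a, b, c = 2.0, 1.0, 3.0             # 4ac - b^2 = 23 > 0 and a > 0: minimum
H = [[2 * a, b], [b, 2 * c]]         # Hessian of f(x, y) = ax^2 + bxy + cy^2
tr = H[0][0] + H[1][1]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
disc = math.sqrt(tr * tr - 4 * det)  # symmetric 2x2 matrix: real eigenvalues
eigs = ((tr - disc) / 2, (tr + disc) / 2)
assert all(e > 0 for e in eigs)      # both positive: (0, 0) is a local minimum
```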
2 Variational Problems
In order to motivate the study of Variational Principles we give some examples of famous problems in the subject.

1. Shortest path. Suppose that y is a function such that y(x1) = y1 and y(x2) = y2. We want to find the y with the shortest length. The length L(y) is given by

    L(y) = ∫_{x1}^{x2} √(1 + (dy/dx)²) dx

We say that L is a "functional" of the function y.
2. Brachistochrone. Suppose that we have a bead of mass m sliding down a frictionless wire under gravity along a curve from (x1, y1) to (x2, y2). Let T(y) be the time taken to go from (x1, y1) to (x2, y2) along the curve y. We want to find a minimum of this. If the time is t1 at (x1, y1) and t2 at (x2, y2), and we denote by s the arclength parametrisation, then

    T(y) = t2 − t1 = ∫_{t1}^{t2} dt = ∫_{s1}^{s2} ds/(ds/dt) = ∫_{s1}^{s2} ds/v = ∫_{x1}^{x2} √(1 + (dy/dx)²)/v dx

We can find the velocity v from conservation of energy. We know that E = ½mv² + mg(y(x) − y1) = ½mv1² + 0 if the initial speed is v1. If we set v1 = 0 then v = √(2g(y1 − y(x))) and so

    T(y) = ∫_{x1}^{x2} √(1 + (dy/dx)²) / √(2g(y1 − y(x))) dx
3. Least area of revolution. Take a curve y with y(x1) = y1 and y(x2) = y2 and rotate it about the x-axis. One then gets a surface of revolution around the x-axis. We want to find the curve for which the surface area is as small as possible. The surface area is equal to

    A(y) = 2π ∫_{x1}^{x2} y √(1 + (y')²) dx
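These functionals are easy to evaluate numerically. As a sanity check on problem 1, the sketch below (standard library only; the comparison curve y = x² is an arbitrary choice) approximates L(y) by the midpoint rule for the straight line and for a parabola through the same endpoints (0,0) and (1,1):

```python
import math

def length(yprime, n=10_000):
    # midpoint rule for L(y) = ∫_0^1 sqrt(1 + y'(x)^2) dx
    h = 1.0 / n
    return sum(math.sqrt(1.0 + yprime((i + 0.5) * h) ** 2) * h for i in range(n))

line = length(lambda x: 1.0)        # y = x,   length sqrt(2)
parabola = length(lambda x: 2 * x)  # y = x^2, same endpoints
assert abs(line - math.sqrt(2)) < 1e-6
assert line < parabola              # the straight line is shorter
```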
3 Derivation of the Euler-Lagrange Equations

3.1 The one variable - one derivative case
The problems in section 2 involve minimising functionals built from a function of one variable by integrating the function and its derivatives, with values of the function specified at the ends of the range of integration. These are typically called fixed endpoint problems. In general, the class of problems of this kind has a functional of the form

    I(y) = ∫_{x1}^{x2} f(x, y(x), y'(x)) dx    (2)

for y(x) with y(x1) = y1 and y(x2) = y2. In future I will write y for y(x) and y' for y'(x) to simplify the notation.

How do we find extrema of I(y)? We proceed in a similar manner to finding conditions for functions at extrema. We consider a one parameter family of functions yt with y0 the extremising function. Clearly they all have the same fixed endpoints. Then if g(t) = I(yt) we have g'(0) = 0, i.e. d/dt I(yt) |_{t=0} = 0. If yt = y0 + tv then v(x1) = 0 = v(x2). Hence

    d/dt I(y0 + tv) |_{t=0} = 0    (3)

for v as defined above. The solutions to this equation are called critical points of I(y).
Example 3.1 Consider

    I(y) = ∫_0^1 [x y² + (y')²] dx

We then have

    I(y0 + tv) = ∫_0^1 [x (y0 + tv)² + (y0' + tv')²] dx

and so

    d/dt I(y0 + tv) |_{t=0} = d/dt ∫_0^1 [x (y0 + tv)² + (y0' + tv')²] dx |_{t=0} = ∫_0^1 (2x y0 v + 2 y0' v') dx

y0 is a critical point if the integral is 0 for all v with the conditions as above.
In the general case I(y0 + tv) = ∫_{x1}^{x2} f(x, y0 + tv, y0' + tv') dx, and so if we proceed formally we get

    d/dt I(y0 + tv) |_{t=0} = d/dt ∫_{x1}^{x2} f(x, y0 + tv, y0' + tv') dx |_{t=0}
                            = ∫_{x1}^{x2} [∂f/∂y (x, y0, y0') v + ∂f/∂y' (x, y0, y0') v'] dx

If y0 is a critical point and v(x) is any suitable function with v(x1) = 0 = v(x2) then we have from equation (3)

    0 = ∫_{x1}^{x2} [∂f/∂y (x, y0, y0') v + ∂f/∂y' (x, y0, y0') v'] dx
      = ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y'] v dx + [∂f/∂y' v]_{x1}^{x2}
      = ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y'] v dx

integrating by parts and using v(x1) = 0 = v(x2),
and hence we want to solve

    ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y'] v dx = 0

for suitable v.
We now make rigorous sense of this, and so we need f and its partial derivatives up to order two and y0'' to be continuous. Then ∂f/∂y − d/dx ∂f/∂y' is continuous. We also need y0 + tv to be a family of functions in a suitable space, and so v must have two continuous derivatives.
Theorem 3.1 (The Fundamental Theorem of the Calculus of Variations) If u(x) is continuous on [x1, x2] and

    ∫_{x1}^{x2} u(x) v(x) dx = 0

for all v(x) with two continuous derivatives and v(x1) = 0 = v(x2), then u(x) = 0 for all x ∈ [x1, x2].
Proof We use a contradiction argument. Suppose there is some point x0 ∈ (x1, x2) with u(x0) ≠ 0. Without loss of generality we can assume that u(x0) > 0; if not, consider the function −u. Then u(x) is non-zero (indeed positive) on some interval around x0, as u is continuous. Call this interval (x1', x2'). Suppose we have v(x) with two continuous derivatives and v(x) = 0 for x ∉ [x1', x2']. Then

    0 = ∫_{x1}^{x2} u(x) v(x) dx = ∫_{x1'}^{x2'} u(x) v(x) dx

If furthermore v(x) > 0 for all x ∈ (x1', x2') then we have that

    ∫_{x1'}^{x2'} u(x) v(x) dx > 0

This is a contradiction; hence there is no point x0 where u(x0) ≠ 0, and so u(x) = 0 for all x ∈ [x1, x2]. Thus the proof is reduced to a construction of such a function v(x). A suitable function is

    v(x) = 0 for x ∉ (x1', x2'),    v(x) = (x − x1')³ (x2' − x)³ for x ∈ (x1', x2')

Q.E.D.
Remark If functionals have more derivatives then this argument can be modified for those: we simply take one power higher than the number of derivatives.

Aside If we need infinitely many derivatives, we can use e^{−1/x²}, as it has infinitely many derivatives at x = 0 and they are all equal to zero.
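The bump function used in the proof can be checked numerically. The sketch below (standard library only; the interval endpoints are illustrative stand-ins for x1' and x2') verifies that v is positive inside the interval and that v, v' and v'' all vanish at the two endpoints, so v extends to a C² function that is zero outside:

```python
a, b = 0.2, 0.8                     # stand-ins for x1' and x2'

def v(x):
    # the bump from the proof: positive on (a, b), zero outside
    return (x - a) ** 3 * (b - x) ** 3 if a < x < b else 0.0

h = 1e-4
for end in (a, b):
    assert v(end) == 0.0
    # first and second difference quotients vanish at both endpoints
    assert abs((v(end + h) - v(end - h)) / (2 * h)) < 1e-6
    assert abs((v(end + h) - 2 * v(end) + v(end - h)) / h ** 2) < 1e-3
assert v((a + b) / 2) > 0           # positive in the interior
```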
Theorem 3.2 If f is a function of three variables with all partial derivatives up to order two continuous, then any critical point y of I(y) = ∫_{x1}^{x2} f(x, y(x), y'(x)) dx on the set of functions with two continuous derivatives and satisfying endpoint conditions y(x1) = y1 and y(x2) = y2 has

    ∂f/∂y − d/dx ∂f/∂y' = 0  ∀x ∈ [x1, x2]    (4)
Proof We showed above that

    ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y'] v dx = 0

for all v with two continuous derivatives. The expression in the square brackets is continuous, and so by the fundamental theorem (theorem 3.1) must be zero ∀x ∈ [x1, x2]. Q.E.D.
Definition 3.1 If a functional I(y) = ∫_{x1}^{x2} f(x, y(x), y'(x)) dx then f is called the Lagrangian of I, and ∂f/∂y − d/dx ∂f/∂y' = 0 is called the Euler-Lagrange equation of I.

Remark The E-L equation is a second order ODE for y(x) with endpoint conditions.
3.2 Solutions of some examples

Example 3.2 Find the E-L equation for I(y) = ½ ∫_0^π (y² − (y')²) dx. We have that ∂f/∂y = y and ∂f/∂y' = −y', and so the E-L equation is y − d/dx (−y') = 0, giving y'' + y = 0.
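One can check Example 3.2 numerically: y0(x) = sin x satisfies y'' + y = 0 and vanishes at 0 and π, so it should be a critical point of I among curves with those endpoint values. The sketch below (standard library only; the variation v = sin 2x, the grid size and the step t are arbitrary choices) approximates d/dt I(y0 + tv) at t = 0 by a central difference and checks that it vanishes:

```python
import math

def I(c, n=2000):
    # midpoint rule for (1/2) ∫_0^π (y^2 - y'^2) dx with y = sin x + c sin 2x
    h = math.pi / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        y = math.sin(x) + c * math.sin(2 * x)
        yp = math.cos(x) + 2 * c * math.cos(2 * x)
        total += 0.5 * (y * y - yp * yp) * h
    return total

t = 1e-4
deriv = (I(t) - I(-t)) / (2 * t)    # d/dt I(y0 + t v) at t = 0
assert abs(deriv) < 1e-6            # sin x is a critical point of I
```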
We now solve the examples in section 2.

1. We have from before that

    L(y) = ∫_{x1}^{x2} √(1 + (dy/dx)²) dx

and so ∂f/∂y = 0 and ∂f/∂y' = y'/√(1 + (y')²). The E-L equation then gives

    −d/dx [ y'/√(1 + (y')²) ] = 0

and so y'/√(1 + (y')²) is constant, hence y' = m, giving the line y = mx + a.

Remark Any case where ∂f/∂y = 0 will have an immediate integral of the E-L equation −d/dx ∂f/∂y' = 0, as then ∂f/∂y' = constant. We call this a first integral of the E-L equation.
Before looking at the other two examples, we note that x does not appear explicitly in them, so we ask if there is a first integral here too. Observe that

    d/dx [y' ∂f/∂y' − f] = y'' ∂f/∂y' + y' d/dx ∂f/∂y' − [∂f/∂x + y' ∂f/∂y + y'' ∂f/∂y']
                         = y' [d/dx ∂f/∂y' − ∂f/∂y] − ∂f/∂x

and if y is a solution of the E-L equations we have that

    d/dx [y' ∂f/∂y' − f] = −∂f/∂x

and so if f is independent of x then y' ∂f/∂y' − f is a constant. This is called the first integral for the case of a Lagrangian independent of x.
2. Brachistochrone We have

    f(x, y, y') = √(1 + (y')²) / √(y1 − y)

if we ignore the constants. There is no x dependence here, and so y' ∂f/∂y' − f is a constant:

    y' ∂f/∂y' − f = (y')² / [√(1 + (y')²) √(y1 − y)] − √(1 + (y')²) / √(y1 − y) = A

giving −1/[√(1 + (y')²) √(y1 − y)] = A, and hence (1 + (y')²)(y1 − y) = 1/A², and we thus get

    y' = ± √( 1/[A²(y1 − y)] − 1 )

If we now make the substitution A²(y1 − y) = sin²(θ/2) then we get that −A² y' = sin(θ/2) cos(θ/2) θ', and so

    −(1/A²) sin(θ/2) cos(θ/2) θ' = ± √[ (1 − sin²(θ/2)) / sin²(θ/2) ] = ± cos(θ/2)/sin(θ/2)

and so −(1/A²) sin²(θ/2) θ' = ±1, giving −(1/2A²)(1 − cos θ) θ' = ±1, and then integrating gives

    −(1/2A²)(θ − sin θ) = B ± x

which implicitly determines θ(x) and so y(x). This curve is called a cycloid. Figure 2 shows such a curve.

Figure 2: A cycloid
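As a check, the standard parametrisation of a cycloid dropping from height y1, x = r(θ − sin θ), y = y1 − r(1 − cos θ), satisfies the first integral (1 + (y')²)(y1 − y) = 1/A² with 1/A² = 2r. A minimal numerical sketch (standard library only; r and y1 are illustrative values):

```python
import math

r, y1 = 0.7, 2.0
for theta in (0.5, 1.0, 2.0, 3.0):
    # dy/dx = (dy/dθ)/(dx/dθ) along x = r(θ - sin θ), y = y1 - r(1 - cos θ)
    yp = -math.sin(theta) / (1.0 - math.cos(theta))
    # the conserved quantity (1 + y'^2)(y1 - y) should equal 2r everywhere
    value = (1.0 + yp * yp) * (r * (1.0 - math.cos(theta)))
    assert abs(value - 2 * r) < 1e-12
```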
3. Here f(x, y, y') = 2πy √(1 + (y')²), and observe that we have no x dependence again. Thus we look at the first integral (dropping the factor 2π):

    y' ∂f/∂y' − f = (y')² y / √(1 + (y')²) − y √(1 + (y')²) = A

and so we get that

    −y / √(1 + (y')²) = A

and so y' = ± √(y²/A² − 1). If we then make the substitution y/A = cosh z we get that y' = A sinh z z', and hence the equation to solve becomes

    A sinh z z' = ± √(cosh² z − 1) = ± sinh z

and thus we get that z' = ±1/A and so z = B ± x/A, and so

    y = A cosh(B ± x/A)

and so it looks like figure 3.

Figure 3: The shape of surface which minimises the surface of revolution
We now try to fit this shape of solution to the endpoint conditions. Without loss of generality we will assume that y = A cosh(B' + x/A), and we want a solution with y(x1) = y1 and y(x2) = y2. Using the first of these we get that B' = cosh⁻¹(y1/A) − x1/A, and then

    y = y1 cosh((x − x1)/A) + √(y1² − A²) sinh((x − x1)/A)

and using the second condition gives a pretty nasty equation (I leave it to the reader to work out). To see if solutions exist we plot the graph of y(x2) for various values of A. From this graph one can see that if (x2, y2) is to the right of the dotted line then there is no solution, and that if (x2, y2) is above the dotted line then there are two solutions. Also these solutions may not be extrema, as a broken line may well minimise the problem.
Remark y0 + tv is called a variation of y0; hence the name Calculus of Variations.
3.3 Extension of the Theory

3.3.1 More Derivatives
Suppose that

    I(y) = ∫_{x1}^{x2} f(x, y, y', ..., y^(n)) dx

We try the same method as before, considering I(y + tv) for y an extremum. Set g(t) = I(y + tv); then this has an extremum at t = 0, so g'(0) = 0, and thus d/dt I(y + tv) |_{t=0} = 0, and so

    d/dt I(y + tv) |_{t=0} = ∫_{x1}^{x2} [∂f/∂y v + ∂f/∂y' v' + ... + ∂f/∂y^(n) v^(n)] dx = 0

If we assume that v(x1) = 0 = v(x2) and all derivatives of v up to v^(n−1) are zero at x1 and x2, then integrating by parts repeatedly we get that

    d/dt I(y + tv) |_{t=0} = ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y' + ... + (−1)^n d^n/dx^n ∂f/∂y^(n)] v dx = 0

For the argument to be complete we need f to have (n + 1) continuous derivatives and y to have 2n continuous derivatives. Then the term in square brackets is continuous, and we need the version of the fundamental theorem for v with 2n continuous derivatives. Then for y an extremum it satisfies

    ∂f/∂y − d/dx ∂f/∂y' + ... + (−1)^n d^n/dx^n ∂f/∂y^(n) = 0    (5)

This is again called the Euler-Lagrange equation for the functional. Again there is no existence or uniqueness theorem in this case.
Example 3.3 Suppose I(y) = ∫_0^{π/2} ((y'')² − y²) dx with y(0) = 0 = y'(0) and y(π/2) = 1 and y'(π/2) = 0. The E-L equation gives −2y + d²/dx² (2y'') = 0, and so y^(4) − y = 0, and this has a general solution of y = A cos x + B sin x + C e^x + D e^{−x}. Solving for the endpoint conditions gives the four equations 0 = A + C + D, 0 = B + C − D, 1 = B + C e^{π/2} + D e^{−π/2} and 0 = −A + C e^{π/2} − D e^{−π/2}, and these can be solved.
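The 4×4 linear system in Example 3.3 is straightforward to solve numerically. A sketch (standard library only) using plain Gaussian elimination, checking the resulting solution against all four endpoint conditions:

```python
import math

e = math.exp(math.pi / 2)
# augmented matrix of the endpoint system for (A, B, C, D)
M = [[1.0, 0.0, 1.0, 1.0, 0.0],      # y(0)    = A + C + D                 = 0
     [0.0, 1.0, 1.0, -1.0, 0.0],     # y'(0)   = B + C - D                 = 0
     [0.0, 1.0, e, 1.0 / e, 1.0],    # y(π/2)  = B + C e^{π/2} + D e^{-π/2} = 1
     [-1.0, 0.0, e, -1.0 / e, 0.0]]  # y'(π/2) = -A + C e^{π/2} - D e^{-π/2} = 0
for i in range(4):                   # Gaussian elimination, partial pivoting
    p = max(range(i, 4), key=lambda r: abs(M[r][i]))
    M[i], M[p] = M[p], M[i]
    for r in range(i + 1, 4):
        factor = M[r][i] / M[i][i]
        M[r] = [a - factor * b for a, b in zip(M[r], M[i])]
coef = [0.0] * 4
for i in range(3, -1, -1):           # back substitution
    coef[i] = (M[i][4] - sum(M[i][j] * coef[j] for j in range(i + 1, 4))) / M[i][i]
A, B, C, D = coef
y = lambda t: A * math.cos(t) + B * math.sin(t) + C * math.exp(t) + D * math.exp(-t)
yp = lambda t: -A * math.sin(t) + B * math.cos(t) + C * math.exp(t) - D * math.exp(-t)
assert abs(y(0.0)) < 1e-12 and abs(yp(0.0)) < 1e-12
assert abs(y(math.pi / 2) - 1.0) < 1e-12 and abs(yp(math.pi / 2)) < 1e-12
```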
3.3.2 Several dependent functions

Problems involving curves may not be expressible as y = y(x), and so instead we could write the curve in parametric form; for the length problem we could write L(x, y) = ∫_{t1}^{t2} √((x')² + (y')²) dt. In general these have the form

    I(x, y) = ∫_{t1}^{t2} f(t, x(t), y(t), x'(t), y'(t)) dt

and we use a one parameter variation (x + hu, y + hv). Then (x, y) being an extremum of I means that

    d/dh I(x + hu, y + hv) |_{h=0} = 0

Note that u and v must vanish at the endpoints to preserve the endpoint conditions.

If we first take v(t) = 0 ∀t ∈ [t1, t2] then d/dh I(x + hu, y) |_{h=0} = 0, and so

    ∂f/∂x − d/dt ∂f/∂x' = 0

Similarly if u(t) = 0 ∀t ∈ [t1, t2] then d/dh I(x, y + hv) |_{h=0} = 0, and so

    ∂f/∂y − d/dt ∂f/∂y' = 0

In other words both x and y satisfy the Euler-Lagrange equation for one variable.

One can also derive these two equations as we did before: using the chain rule on the necessary condition, then integrating by parts. Then taking v = 0 and then u = 0 we can apply the fundamental theorem in both cases, giving the result above.

It should be clear that this works for any number of dependent functions, so long as they can be varied independently. If

    I(x1, ..., xn) = ∫_{t1}^{t2} f(t, x1(t), ..., xn(t), x1'(t), ..., xn'(t)) dt

then I has n simultaneous E-L equations

    ∂f/∂xi − d/dt ∂f/∂xi' = 0  ∀i = 1, ..., n    (6)
Example 3.4 Suppose that L(x, y) = ∫_0^1 √((x')² + (y')²) dt with x(0) = x1 and x(1) = x2 as well as y(0) = y1 and y(1) = y2. This has two E-L equations:

    −d/dt [ x'/√((x')² + (y')²) ] = 0
    −d/dt [ y'/√((x')² + (y')²) ] = 0

and so both x'/√((x')² + (y')²) and y'/√((x')² + (y')²) are constants. Thus (1/√((x')² + (y')²)) (x', y') = (A, B) is a constant unit vector. Hence (x(t), y(t)) is a curve with a constant direction: if c(t) = √((x')² + (y')²) then (x, y) = d(t)(A, B) + (C, D) where d' = c.

Remark Observe that although it is written in terms of two variables, the problem is degenerate. It has infinitely many solutions given by different possible functions d(t).
If there is no explicit t dependence, i.e. ∂f/∂t = 0, then consider

    F(t) = x1' ∂f/∂x1' + ... + xn' ∂f/∂xn' − f

Then

    dF/dt = x1'' ∂f/∂x1' + x1' d/dt ∂f/∂x1' + ... + xn'' ∂f/∂xn' + xn' d/dt ∂f/∂xn'
            − ∂f/∂t − x1' ∂f/∂x1 − ... − xn' ∂f/∂xn − x1'' ∂f/∂x1' − ... − xn'' ∂f/∂xn'
          = x1' [d/dt ∂f/∂x1' − ∂f/∂x1] + ... + xn' [d/dt ∂f/∂xn' − ∂f/∂xn] − ∂f/∂t
          = 0

if there is no explicit time dependence and x1, ..., xn satisfy the E-L equations. Hence F is constant, and this is another first integral.
4 Relationship with Optics and Fermat's Principle

We look here at rays of light in the plane moving with speed c(x, y).

4.1 Fermat's Principle

Theorem 4.1 (Fermat's Principle) Light travels along a path between two points (x1, y1) and (x2, y2) so as to take the least time to get from (x1, y1) to (x2, y2).
Here c(x, y) is the speed at (x, y), and if we travel along a path the speed will be the rate of change of arclength along the path. Thus if we measure arclength s from an initial position, then ds/dt = c(x, y). If the path is the graph of a function y(x) from (x1, y1) to (x2, y2), where we are at (x1, y1) at time t1 and arclength s1, and at (x2, y2) at time t2 and arclength s2, we get that

    T(y) = t2 − t1 = ∫_{t1}^{t2} dt = ∫_{s1}^{s2} ds/(ds/dt) = ∫_{s1}^{s2} ds/c = ∫_{x1}^{x2} √(1 + (y')²)/c(x, y) dx

The actual path followed by a light ray will be a minimum of T(y).
Example 4.1 (Light in a homogeneous medium) Here we assume that c is a constant. We have that

    T(y) = (1/c) ∫_{x1}^{x2} √(1 + (y')²) dx = (1/c) L(y)

and hence in a homogeneous medium light travels in straight lines, since these are the critical points of the length functional.
Example 4.2 (The Law of Refraction) Suppose that we have two homogeneous media with speeds c1 and c2, with a straight line interface, and a ray of light passing from the first medium to the second. We know that we will have a broken line, but what is the change in direction at the interface? We look at broken straight line paths passing through the point (x0, 0) on the x-axis. The time taken is τ(x0), and this is equal to

    τ(x0) = √((x0 − x1)² + y1²)/c1 + √((x2 − x0)² + y2²)/c2

The actual path will be a minimum with respect to x0, and so at the point where the path crosses the x-axis we have dτ/dx0 = 0. Now

    dτ/dx0 = (x0 − x1)/[c1 √((x0 − x1)² + y1²)] − (x2 − x0)/[c2 √((x2 − x0)² + y2²)] = 0

whence sin θ1/c1 − sin θ2/c2 = 0, or

    sin θ1/c1 = sin θ2/c2    (7)

This is known as Snell's Law.
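A direct numerical check of the refraction example (standard library only; the speeds and the endpoint coordinates are illustrative): minimise τ(x0) by a golden-section search and verify that Snell's law holds at the minimiser.

```python
import math

c1, c2 = 1.0, 0.6                   # speeds in the two media
x1, h1 = 0.0, 1.0                   # start point (x1, h1) above the interface
x2, h2 = 2.0, 1.5                   # end point (x2, -h2) below it

def tau(x0):
    # travel time along the broken line through (x0, 0) on the interface
    return math.hypot(x0 - x1, h1) / c1 + math.hypot(x2 - x0, h2) / c2

lo, hi = x1, x2                     # golden-section search for the minimum
g = (math.sqrt(5) - 1) / 2
for _ in range(100):
    m1, m2 = hi - g * (hi - lo), lo + g * (hi - lo)
    if tau(m1) < tau(m2):
        hi = m2
    else:
        lo = m1
x0 = (lo + hi) / 2
sin1 = (x0 - x1) / math.hypot(x0 - x1, h1)
sin2 = (x2 - x0) / math.hypot(x2 - x0, h2)
assert abs(sin1 / c1 - sin2 / c2) < 1e-6   # Snell's law at the minimiser
```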
Suppose that c is only a function of y, i.e. that c(x, y) = c(y). We divide the medium into strips parallel to the x-axis; in each strip, the path is approximated by a straight line segment, so the slope in the strip will be approximately dy/dx. We then have that cot θ = dy/dx = y', and then sin θ = 1/√(1 + (y')²). It is cot θ here because θ is the angle the ray makes with the y direction. According to Snell's Law sin θ/c is a constant, and so (1/c(y)) · 1/√(1 + (y')²) is a constant K. This equation gives √(1 + (y')²) = 1/(K c(y)), and so y' = ± √(1/(K²c²(y)) − 1). Then dividing by the square root term and integrating with respect to x gives

    ∫ dy / √(1/(K²c²(y)) − 1) = A ± x

This gives an equation for x as a function of y, and by solving, or using a substitution, we get an explicit solution.
We now rework the above using the Calculus of Variations. In this case we have a functional independent of x, as

    T(y) = ∫_{x1}^{x2} √(1 + (y')²)/c(y) dx

and then this has a first integral of

    y' ∂f/∂y' − f = (y')²/[c(y) √(1 + (y')²)] − √(1 + (y')²)/c(y) = −K

and this gives 1/[c(y) √(1 + (y')²)] = K, which we deduced from Snell's Law before. Hence the first integral of Fermat's Principle is Snell's Law.
4.2 Optical Analogy

If a problem in the Calculus of Variations leads to a functional of the same form as one coming from Fermat's Principle, and the optical problem is already solved, then the same solution applies to the variational problem. It then has the solution, when the functional is independent of x, given by

    A ± x = ∫ dy / √(1/(K²c²(y)) − 1)

This was how Bernoulli first solved the Brachistochrone problem, where we have

    ∫_{x1}^{x2} √(1 + (y')²) / √(2g(y1 − y)) dx

as our functional. If we take c(y) = √(2g(y1 − y)) then we can write down the integral formula for the solution. When x appears explicitly we have to go to the full E-L equations.
5 Hamilton's Principle

Suppose that x(t) = (x(t), y(t), z(t)) describes the motion of a point particle in three dimensions, where t is the time variable. We define ẋ := dx/dt and call it the velocity v. Furthermore we define ẍ := d²x/dt² and call it the acceleration. v := |v| := √(ẋ² + ẏ² + ż²) = √(v · v) is called the speed. The motion is governed by the mass m > 0. The kinetic energy is ½mv² = ½m(ẋ² + ẏ² + ż²). If we have many particles then the total kinetic energy is T = Σi ½ mi vi². If q1, ..., qn is a different set of coordinates, of which x1, y1, z1, x2, y2, z2, ... are functions, then we get T as a function of q1, ..., qn, q̇1, ..., q̇n by substitution.

Definition 5.1 A conservative system is one where the forces F acting can be given in terms of a function V such that F = −∇V. V is called the potential energy and is a function of the independent coordinates q1, ..., qn.

Definition 5.2 The Lagrangian of the system is

    L(q1, ..., qn, q̇1, ..., q̇n) := T − V

Example 5.1 Suppose that a particle of mass m is moving on a circle of radius R in the x-y plane with gravity acting in the negative y direction. Then the potential is given by V := mgy = mgR sin θ and the kinetic energy is T = ½mR²θ̇², and so the Lagrangian is L(θ, θ̇) = ½mR²θ̇² − mgR sin θ.
Theorem 5.1 (Hamilton's Principle) The path followed by a system described by a Lagrangian L = T − V in getting from an initial position P1 at time t1 to a final position P2 at time t2 is a critical point of the functional

    I = ∫_{t1}^{t2} L dt

amongst all possible paths from P1 to P2 at the relevant times.

Hence the actual path satisfies the E-L equations for L, namely

    ∂L/∂qi − d/dt ∂L/∂q̇i = 0 for i = 1, ..., n    (8)
Example 5.2 Suppose we have a particle on a circle of radius R acted upon by gravity (see example 5.1). Then we have L(θ, θ̇) = ½mR²θ̇² − mgR sin θ, and so the E-L equation for this gives

    −mgR cos θ − d/dt (mR²θ̇) = 0  ⟹  θ̈ + (g/R) cos θ = 0

and this is called the pendulum equation.
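A quick numerical check of the pendulum equation (standard library only; the mass, radius, time step and initial data are illustrative): integrating θ'' = −(g/R) cos θ with a Runge-Kutta step should conserve the energy E = ½mR²θ̇² + mgR sin θ, the first integral discussed below.

```python
import math

m, R, g = 1.0, 0.5, 9.81            # illustrative mass, radius, gravity

def accel(th):
    # pendulum equation: θ'' = -(g/R) cos θ
    return -(g / R) * math.cos(th)

def energy(th, w):
    # E = (1/2) m R^2 θ'^2 + m g R sin θ  (kinetic + potential)
    return 0.5 * m * R ** 2 * w ** 2 + m * g * R * math.sin(th)

th, w, dt = 0.3, 0.0, 1e-3          # initial angle, angular velocity, step
E0 = energy(th, w)
for _ in range(5000):               # one RK4 step per iteration (5 seconds)
    k1 = (w, accel(th))
    k2 = (w + dt / 2 * k1[1], accel(th + dt / 2 * k1[0]))
    k3 = (w + dt / 2 * k2[1], accel(th + dt / 2 * k2[0]))
    k4 = (w + dt * k3[1], accel(th + dt * k3[0]))
    th += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    w += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
assert abs(energy(th, w) - E0) < 1e-6   # energy T + V is conserved
```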
Example 5.3 Suppose that we have a particle of mass m moving in R³ with a force F = −∇V. Then L = ½m(ẋ² + ẏ² + ż²) − V(x, y, z) and the E-L equations give

    ∂L/∂x − d/dt ∂L/∂ẋ = 0  ⟹  −∂V/∂x − mẍ = 0
    ∂L/∂y − d/dt ∂L/∂ẏ = 0  ⟹  −∂V/∂y − mÿ = 0
    ∂L/∂z − d/dt ∂L/∂ż = 0  ⟹  −∂V/∂z − mz̈ = 0

i.e. F − mẍ = 0, which is Newton's Second Law. Thus Hamilton's Principle is in accord with Newton's Second Law.
Observe that L is independent of the time variable, and so we always have a first integral of the form

    q̇1 ∂L/∂q̇1 + ... + q̇n ∂L/∂q̇n − L = constant

Observe that the kinetic energy is quadratic in the derivatives, and will be so for any such system. Thus

    T(q1, ..., qn, q̇1, ..., q̇n) = Σ_{i=1}^n Σ_{j=1}^n q̇i q̇j Tij(q1, ..., qn)

and hence we get the identity

    T(q1, ..., qn, a q̇1, ..., a q̇n) = a² T(q1, ..., qn, q̇1, ..., q̇n)    (9)

which is called Euler's Formula. It should be clear that ∂L/∂q̇i = ∂T/∂q̇i, as V is independent of the q̇1, ..., q̇n, so the first integral becomes

    q̇1 ∂T/∂q̇1 + ... + q̇n ∂T/∂q̇n − L = constant

Differentiating (9) with respect to a and setting a = 1 gives q̇1 ∂T/∂q̇1 + ... + q̇n ∂T/∂q̇n = 2T, so the first integral is 2T − (T − V) = T + V = constant, and this is called conservation of energy.
6 Constraints and Lagrange Multipliers

6.1 Finite Dimensions

6.1.1 Two dimensions

A typical example is to find extrema of f(x, y) on the set {(x, y) ∈ R² | g(x, y) = 0}. The implicit function theorem tells us which variable in an equation can be solved for in terms of the others.
If ∂g/∂y (x0, y0) ≠ 0 at some point (x0, y0) with g(x0, y0) = 0, then there is a differentiable function η(x), defined for x near x0 with η(x0) = y0, such that all solutions (x, y) of g(x, y) = 0 near (x0, y0) have the form (x, η(x)).

We say the constraint is regular if at every solution at least one of the partial derivatives is non-zero.
If (x0, y0) is an extremum of f on {(x, y) | g(x, y) = 0} then let y = η(x) be a solution near (x0, y0) of the constraint, and then substitute this in f to give f(x, η(x)), which has x0 as an extremum. Therefore

    d/dx f(x, η(x)) |_{x=x0} = 0    (10)

We also have the fact that g(x, η(x)) = 0 for all x for which η is defined. Equation (10), by the chain rule, yields

    ∂f/∂x (x0, y0) + ∂f/∂y (x0, y0) dη/dx (x0) = 0

and we also have that

    d/dx g(x, η(x)) = 0  ⟹  ∂g/∂x (x, y) + ∂g/∂y (x, y) dη/dx (x) = 0

for all x near x0, and if one evaluates this at x = x0 we get that

    dη/dx (x0) = − [∂g/∂x (x0, y0)] / [∂g/∂y (x0, y0)]

as the denominator is non-zero by assumption. From these we get that

    ∂f/∂x (x0, y0) − ∂f/∂y (x0, y0) [∂g/∂x (x0, y0)] / [∂g/∂y (x0, y0)] = 0

and if we define λ = [∂f/∂y (x0, y0)] / [∂g/∂y (x0, y0)] then this becomes

    ∂f/∂x (x0, y0) − λ ∂g/∂x (x0, y0) = 0  ⟹  ∂/∂x (f − λg) |_{(x0,y0)} = 0

Also, by the definition of λ we get that ∂/∂y (f − λg) |_{(x0,y0)} = 0, and therefore f − λg has a critical point at (x0, y0).

Similarly if ∂g/∂x (x0, y0) ≠ 0 and (x0, y0) is an extremum of f on {(x, y) | g(x, y) = 0}, then there is a constant λ' = [∂f/∂x (x0, y0)] / [∂g/∂x (x0, y0)] such that f − λ'g has a critical point at (x0, y0).

We have thus proved:
Theorem 6.1 (Lagrange Multiplier) If g is a regular constraint, with ∇g ≠ 0 for all (x, y) with g(x, y) = 0, then any extremum (x0, y0) of f(x, y) on the set {(x, y) | g(x, y) = 0} has an associated real number λ such that f − λg has a critical point at (x0, y0).

We call λ the Lagrange multiplier for (x0, y0). The unknowns are now (x0, y0) and λ. The condition that f − λg has a critical point at (x0, y0) is ∇(f − λg)(x0, y0) = 0, and we also have the condition g(x0, y0) = 0.
Example 6.1 Find the extrema of f(x, y) = ax + by on x² + y² = 1. An extremum is a critical point of f − λg = ax + by − λ(x² + y² − 1), and so 0 = a − 2λx and 0 = b − 2λy. Substituting x = a/2λ and y = b/2λ into the constraint gives a² + b² = 4λ², and so λ = ±√(a² + b²)/2 and (x, y) = ±(a/√(a² + b²), b/√(a² + b²)). Then f = ±(a² + b²)/√(a² + b²) = ±√(a² + b²), and hence there is a maximum at +(a, b)/√(a² + b²) and a minimum at −(a, b)/√(a² + b²).
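A brute-force check of Example 6.1 (standard library only; a and b are sample values): maximising ax + by over a fine grid of points on the unit circle should reproduce the value √(a² + b²) found with the multiplier.

```python
import math

a, b = 3.0, 4.0
# parametrise the constraint circle as (cos t, sin t) and scan the objective
best = max(a * math.cos(t) + b * math.sin(t)
           for t in (2 * math.pi * k / 100_000 for k in range(100_000)))
assert abs(best - math.sqrt(a * a + b * b)) < 1e-6   # maximum value is 5
```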
In general there may be more solutions to ∇(f − λg)(x0, y0) = 0 and g(x0, y0) = 0 than there are extrema (x0, y0) of f on {(x, y) | g(x, y) = 0}.

Definition 6.1 We call the solutions to the above equations constrained critical points of f.

The constrained critical points of f on g(x, y) = 0 are thus unconstrained critical points of f − λg for some λ.
6.1.2 n dimensions

Let f be a function of n variables and look for extrema of f(x1, ..., xn) on the set of points where a function g(x1, ..., xn) = 0. Suppose that x = (x1, ..., xn) is an extreme point, pick two vectors u and v, and consider the function of two variables F_{u,v}(h, k) := f(x + hu + kv) subject to the constraint G_{u,v}(h, k) := g(x + hu + kv) = 0. Then (h, k) = (0, 0) is an extremum of F_{u,v} subject to G_{u,v}(h, k) = 0. Hence there is a Lagrange multiplier λ_{u,v} such that F_{u,v} − λ_{u,v} G_{u,v} has a critical point at (h, k) = (0, 0). Therefore we have that

    ∂/∂h (F_{u,v} − λ_{u,v} G_{u,v}) = 0  and  ∂/∂k (F_{u,v} − λ_{u,v} G_{u,v}) = 0

This then gives us that

    u · ∇(f − λ_{u,v} g)(x) = 0  and  v · ∇(f − λ_{u,v} g)(x) = 0

from the fact that ∂F_{u,v}/∂h = ∂/∂h f(x + hu + kv) = u · ∇f(x + hu + kv), and similarly for the other partial derivatives.

Then for every pair of vectors u and v we have that ∇(f − λ_{u,v} g)(x) is perpendicular to both u and v. If e1, ..., en is the standard basis and λij = λ_{ei,ej}, then ∇f(x) − λij ∇g(x) is perpendicular to both ei and ej for each i and j. In terms of the partial derivatives this becomes

    ∂f/∂xi (x) − λij ∂g/∂xi (x) = 0 = ∂f/∂xj (x) − λij ∂g/∂xj (x)  ∀i, j

We aim to find a multiplier that is independent of i and j, so as to have a single Lagrange multiplier for all of the equations. For a regular constraint we need ∇g ≠ 0 everywhere on g(x) = 0. Thus at least one partial derivative of g is non-zero, say the i0-th. Then we can write

    λ_{i0 j} = [∂f/∂xi0 (x)] / [∂g/∂xi0 (x)]

and this is independent of j. Then if we put λ := λ_{i0 j} for any j and input this into the second equation, we get that

    ∂f/∂xj (x) − λ ∂g/∂xj (x) = 0  ∀j

Hence we have a λ such that ∇(f − λg)(x) = 0.
Example 6.2 Find the point on the plane x · n = p closest to a given point a not on the
plane.
We aim to minimise the distance from a point x to the point a such that x · n = p The
Euclidean distance is given by d(x, a) = |x − a| but we will take the square of this to simplify
working out. It should be clear that if the square of the distance has a minimum,
P then so must2
the distance itself. Thus we have that we want to minimise f (x) = |x−a|2 = i=m
i=1 (xi − ai )
subject to g(x) = x · n − p. Now ∇f = 2(x − a) and ∇g = n. At the critical point there is a
number λ such that ∇f − λg = 0 and in this case this is 2(x − a) − λn = 0 and so we get
that x = a + λ2 n and (a + λ2 n) · n = p and so λ2 = p − a · n thus x = a + (p − a · n)n.
This is a minimum: any other point of the plane lies at the end of a hypotenuse of a right-angled triangle whose perpendicular side joins a to the critical point, so its distance from a is greater.
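A quick numerical check of the formula x = a + (p − a · n)n (a sketch assuming numpy and a unit normal n; the particular vectors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.array([1.0, 2.0, 2.0]) / 3.0    # unit normal (|n| = 1 is assumed)
p = 2.0
a = np.array([4.0, -1.0, 3.0])         # the given point off the plane

# Closest point from the multiplier condition: x = a + (p - a.n) n
x = a + (p - a @ n) * n
assert abs(x @ n - p) < 1e-12          # x does lie on the plane

# Compare against many random points of the plane: none is closer to a
for _ in range(1000):
    v = rng.normal(size=3)
    y = v - (v @ n - p) * n            # project a random point onto the plane
    assert np.linalg.norm(x - a) <= np.linalg.norm(y - a) + 1e-12
```

The minimal distance itself is |p − a · n|, the perpendicular side of the right-angled triangle described above.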
MA209 Variational Principles Lecture Notes 2011
6.1.3 Examples
The following examples are ones that we aim to solve, and will develop techniques to do so
in the next section.
1. Hanging rope or chain Suppose we have a rope hanging in equilibrium between two points (x1, y1) and (x2, y2). What is the shape of the rope? This is called a catenary. Suppose the shape is the graph of a function y = y(x). In equilibrium its potential energy will be minimised. Let ρ be the density per unit length of the rope and assume that it is constant. Then the total mass is
J(y) := ρ ∫_{x1}^{x2} √(1 + (y′)²) dx =: M.
The potential energy is then given by
I(y) := ρg ∫_{x1}^{x2} y √(1 + (y′)²) dx.
Hence we want to minimise I(y) subject to J(y) being the constant value M.
2. Isoperimetric problem Consider a closed curve in the plane. For a given length, we
want to find the curve which encloses the greatest area.
Let the curve C be given by (x(t), y(t)) with x(t1) = x0 = x(t2) and y(t1) = y0 = y(t2). Then the length of C is given by
L(C) = ∫_{t1}^{t2} √(ẋ² + ẏ²) dt
and the area is given by
A(C) = ½ ∫_{t1}^{t2} (xẏ − yẋ) dt.
We want to maximise A(C) for fixed L(C).
3. Geodesics on Surfaces Curves which minimise the distance in a surface are called
geodesics. Here we minimise a length functional L(x) for curves x(t) which satisfy
g(x(t)) = 0 for all t.
6.1.4 A functional constrained by a functional
We first look at problems with two functionals (like examples one and two above), using two-parameter variations, and then look at problems of the third type.
If I(y) is extremised among functions y with J(y) equal to a fixed constant J0, we look at two-parameter variations y + hu + kv where u and v are chosen such that u(x1) = u(x2) = v(x1) = v(x2) = 0, and then (h, k) = (0, 0) is an extremum of I(y + hu + kv) subject to J(y + hu + kv) = J0.
Define Fuv(h, k) = I(y + hu + kv) and Guv(h, k) = J(y + hu + kv) − J0. Then Fuv has an extremum at (h, k) = (0, 0) among (h, k) such that Guv(h, k) = 0. Hence we have a Lagrange Multiplier λuv such that Fuv − λuv Guv has a critical point at (0, 0). Thus
∂/∂h (Fuv(h, k) − λuv Guv(h, k)) |_{(h,k)=(0,0)} = 0
and also
∂/∂k (Fuv(h, k) − λuv Guv(h, k)) |_{(h,k)=(0,0)} = 0.
If I(y) = ∫_{x1}^{x2} f (x, y, y′) dx and J(y) = ∫_{x1}^{x2} g(x, y, y′) dx then the h partial equation gives
0 = ∫_{x1}^{x2} [∂/∂y (f − λuv g) − d/dx ∂/∂y′ (f − λuv g)] u dx
and the k partial equation gives
0 = ∫_{x1}^{x2} [∂/∂y (f − λuv g) − d/dx ∂/∂y′ (f − λuv g)] v dx.
Then we have that
0 = ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y′] u dx − λuv ∫_{x1}^{x2} [∂g/∂y − d/dx ∂g/∂y′] u dx.
The regularity condition for this problem is that y is not itself a critical point of J, i.e. there is some u0, vanishing at x1 and x2, for which the second of the two integrals above is non-zero. Choosing u = u0 we can then solve for the multiplier:
λu0v = ( ∫_{x1}^{x2} [∂f/∂y − d/dx ∂f/∂y′] u0 dx ) / ( ∫_{x1}^{x2} [∂g/∂y − d/dx ∂g/∂y′] u0 dx )
and note that the right hand side here is independent of v, and so we can write λu0v =: λ.
Then for any v vanishing at x1 and x2, and for λ defined before, we get that
∫_{x1}^{x2} [∂/∂y (f − λg) − d/dx ∂/∂y′ (f − λg)] v dx = 0
and by the fundamental lemma we get that
∂/∂y (f − λg) − d/dx ∂/∂y′ (f − λg) = 0.
This is called the Euler-Lagrange equation for this case. We have thus proved:
Theorem 6.2 An extremum of I(y) subject to J(y) = J0 satisfies the Euler-Lagrange equation
∂/∂y (f − λg) − d/dx ∂/∂y′ (f − λg) = 0   (11)
for I − λJ, for some λ called the Lagrange Multiplier.
Remark This proof can be adapted to more derivatives or more independent variables.
We now solve the examples given at the start of this subsection.
1. Catenary We have I(y) := ρg ∫_{x1}^{x2} y √(1 + (y′)²) dx and J(y) := ρ ∫_{x1}^{x2} √(1 + (y′)²) dx =: M. Then y satisfies the E-L equation for I − λJ for some λ. This functional is
ρ ∫_{x1}^{x2} (gy − λ) √(1 + (y′)²) dx
and we use the optical analogy to solve it. This corresponds to light moving with speed c = 1/(gy − λ), and has solution
y = λ/g + (c1/(ρg)) cosh(ρgx/c1 + c2)
and we have three conditions and three unknowns, so we can solve to find c1, c2, λ.
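To make the "three conditions, three unknowns" step concrete, here is a sketch (standard library only; the endpoints (±1, 0) and rope length 3 are illustrative, with ρg absorbed into the constants) that writes the symmetric catenary as y = a cosh(x/a) + b and finds a from the length condition 2a sinh(1/a) = L by bisection:

```python
import math

# endpoints (+-1, 0) at equal height, rope length L = 3 (illustrative values)
L = 3.0

def length(a):
    # arclength of y = a*cosh(x/a) + b over [-1, 1]; b does not affect it
    return 2.0 * a * math.sinh(1.0 / a)

# length(a) is decreasing in a (-> 2 as a -> inf, -> inf as a -> 0+),
# so bisect for the root of length(a) = L
lo, hi = 0.1, 10.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if length(mid) > L:
        lo = mid
    else:
        hi = mid
a = 0.5 * (lo + hi)
b = -a * math.cosh(1.0 / a)     # forces y(+-1) = 0

# check: recompute the arclength of y = a*cosh(x/a) + b by quadrature
N = 100_000
s = sum(
    math.sqrt(1.0 + math.sinh((-1.0 + (i + 0.5) * 2.0 / N) / a) ** 2) * (2.0 / N)
    for i in range(N)
)
assert abs(s - L) < 1e-3        # the curve really has the prescribed length
```

In the asymmetric case the two endpoint conditions and the length condition give three simultaneous equations for c1, c2, λ, which would be solved the same way numerically.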
2. Isoperimetric Problem We want to maximise A(x, y) while keeping L(x, y) = l fixed, where (x(t), y(t)) is a parameterisation of a closed curve. The extremising curve will satisfy the E-L equations for A − λL, where
(A − λL)(x, y) = ∫_{t1}^{t2} [½(xẏ − yẋ) − λ√(ẋ² + ẏ²)] dt,
and we have two E-L equations, one for x and one for y:
½ẏ − d/dt (−½y − λ ẋ/√(ẋ² + ẏ²)) = 0
−½ẋ − d/dt (½x − λ ẏ/√(ẋ² + ẏ²)) = 0
Note that both equations are total time derivatives, and so we get
d/dt (y + λ ẋ/√(ẋ² + ẏ²)) = 0
d/dt (x − λ ẏ/√(ẋ² + ẏ²)) = 0
and integrating once gives
y + λ ẋ/√(ẋ² + ẏ²) = B
x − λ ẏ/√(ẋ² + ẏ²) = C
and hence we get that (x − C)² + (y − B)² = λ², and this is a circle with centre (C, B) and radius λ. Therefore 2πλ = l and so λ = l/(2π).
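A numerical sanity check of this conclusion (a sketch, not part of the notes, restricted to curves given in polar form r = r(θ)): the isoperimetric ratio 4πA/L² equals 1 for the circle and is strictly smaller for any other closed curve.

```python
import math

def curve_props(r_func, N=20000):
    """Length and enclosed area of the closed curve r = r_func(theta)."""
    L = A = 0.0
    dt = 2 * math.pi / N
    for i in range(N):
        t = 2 * math.pi * i / N
        r = r_func(t)
        # dr/dtheta by central difference
        dr = (r_func(t + 1e-6) - r_func(t - 1e-6)) / 2e-6
        L += math.sqrt(dr * dr + r * r) * dt    # ds = sqrt(r'^2 + r^2) dtheta
        A += 0.5 * r * r * dt                   # A = (1/2) int r^2 dtheta
    return L, A

Lc, Ac = curve_props(lambda t: 1.0)                        # unit circle
Lw, Aw = curve_props(lambda t: 1.0 + 0.2 * math.cos(3*t))  # wobbly curve

# 4*pi*A / L^2 is 1 for the circle and strictly less otherwise
assert abs(4 * math.pi * Ac / Lc**2 - 1.0) < 1e-6
assert 4 * math.pi * Aw / Lw**2 < 1.0
```

The area formula here is the polar form of A = ½∫(xẏ − yẋ) dt used above.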
To handle example three we need a new method.
6.1.5 One functional constrained by a function
As far as I can gather, we cannot in general constrain a functional with respect to a given function, but we can do so if the functional is a function of curves.
Suppose we have curves x(t) = (x1(t), ..., xn(t)) joining two points x(1) and x(2) at times t1 and t2, and the curves satisfy g(t, x(t), ẋ(t)) = 0. We have some functional
I(x) = ∫_{t1}^{t2} f (t, x(t), ẋ(t)) dt
and we aim to extremise it amongst these curves. An example of this is finding geodesics in a surface.
Let xh be a variation of an extremum x, say xh = x + hu + O(h²), which satisfies the constraint for all h. Then h = 0 is a critical point of I(xh) as a function of h, so
d/dh I(xh) |_{h=0} = 0
and thus we get that
∫_{t1}^{t2} Σ_{i=1}^{n} [∂f/∂xi ui + ∂f/∂ẋi u̇i] dt = 0.   (12)
Differentiating the constraint g(t, x(t), ẋ(t)) = 0 gives
Σ_{i=1}^{n} [∂g/∂xi ui + ∂g/∂ẋi u̇i] = 0 for all t.   (13)
Pick a function λ(t), multiply (13) by λ and subtract from the integrand of (12). Therefore we get
0 = ∫_{t1}^{t2} Σ_{i=1}^{n} [∂f/∂xi ui + ∂f/∂ẋi u̇i − λ(t) (∂g/∂xi ui + ∂g/∂ẋi u̇i)] dt
= ∫_{t1}^{t2} Σ_{i=1}^{n} [∂/∂xi (f − λg) ui + ∂/∂ẋi (f − λg) u̇i] dt.
Observe that u(t1) = 0 = u(t2), and then integrating the u̇i terms by parts gives
0 = ∫_{t1}^{t2} Σ_{i=1}^{n} [∂/∂xi (f − λg) − d/dt ∂/∂ẋi (f − λg)] ui dt.
We can pick λ(t) so that one of the coefficients, say that of u1, is zero. We can do this since setting ∂/∂x1 (f − λg) − d/dt ∂/∂ẋ1 (f − λg) = 0 gives a first order linear inhomogeneous ODE for λ which can be solved by the integrating factor method. The constraint (13) amongst the ui then determines u1 in terms of u2, ..., un, and then the latter can be varied freely. Hence the condition for an extremum becomes
0 = Σ_{i=2}^{n} ∫_{t1}^{t2} [∂/∂xi (f − λg) − d/dt ∂/∂ẋi (f − λg)] ui dt
i=2
and since now u2, ..., un are arbitrary, vanishing at t1 and t2, we can apply the fundamental lemma to each ui in turn, taking the rest of u2, ..., un to be zero, giving
∂/∂xi (f − λg) − d/dt ∂/∂ẋi (f − λg) = 0 for i = 2, ..., n.
For i = 1 this equation holds by the way we chose λ(t), and hence the equations become
∂/∂xi (f − λg) − d/dt ∂/∂ẋi (f − λg) = 0 for i = 1, 2, ..., n.   (14)
We have thus proved
Theorem 6.3 To extremise a functional I given by a Lagrangian f amongst curves x(t)
with fixed endpoints subject to a constraint g(t, x(t), ẋ(t)) = 0 there is a function λ(t) such
that the Euler-Lagrange equations for the Lagrangian f − λg (namely (14)) are satisfied.
Example 6.3 Geodesics on a surface in R³ given by an equation g(x) = 0. Geodesics are paths of shortest length and so minimise ∫_{t1}^{t2} √(ẋ² + ẏ² + ż²) dt. Then there is a function λ(t) such that a geodesic x(t) satisfies the E-L equations of the Lagrangian √(ẋ² + ẏ² + ż²) − λ(t)g(x). We thus have, for the x equation,
−λ(t) ∂g/∂x − d/dt (ẋ/√(ẋ² + ẏ² + ż²)) = 0.
We aim to simplify this equation, and so we introduce the arclength parameter s, given by ds/dt = √(ẋ² + ẏ² + ż²), and change independent variables from t to s. Then
dx/ds = ẋ/(ds/dt) = ẋ/√(ẋ² + ẏ² + ż²).
Dividing the equation by ds/dt and putting µ = λ/(ds/dt), the x equation becomes
−µ ∂g/∂x − d²x/ds² = 0,
the y equation becomes
−µ ∂g/∂y − d²y/ds² = 0,
and similarly the z equation becomes
−µ ∂g/∂z − d²z/ds² = 0,
and so
−µ = (d²x/ds²)/(∂g/∂x) = (d²y/ds²)/(∂g/∂y) = (d²z/ds²)/(∂g/∂z).
There is no general method to solve these equations. We now consider the special case of a sphere in R³, and so we have that g(x, y, z) = x² + y² + z² − R². This then gives us that
(1/2x) d²x/ds² = (1/2y) d²y/ds² = (1/2z) d²z/ds²
and notice that
d/ds (z dy/ds − y dz/ds) = z d²y/ds² − y d²z/ds² = yz ((1/y) d²y/ds² − (1/z) d²z/ds²) = 0
and so z dy/ds − y dz/ds = A is constant. Similarly x dz/ds − z dx/ds = B is constant and y dx/ds − x dy/ds = C is constant. Therefore multiplying these equations by x, y and z respectively and adding gives 0 = Ax + By + Cz. This is a plane through the origin perpendicular to (A, B, C). Hence the path must lie in the intersection of the sphere and a plane through the origin. These are called Great Circles. We have two solutions to the E-L equations satisfying the endpoint conditions so long as the endpoints are not antipodal. If the two endpoints are antipodal then we get a continuum of great circles, all of which are solutions to the problem.
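The conserved quantities can be checked directly (a sketch assuming numpy, not part of the notes): along a unit-speed great circle x(s) = p cos s + q sin s, with p, q an orthonormal pair, the quantities A, B, C are constant and Ax + By + Cz = 0 along the whole path.

```python
import numpy as np

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, np.cos(0.3), np.sin(0.3)])    # unit vector orthogonal to p

def state(s):
    x = p * np.cos(s) + q * np.sin(s)            # position on the unit sphere
    dx = -p * np.sin(s) + q * np.cos(s)          # dx/ds (unit speed)
    return x, dx

def consts(s):
    (X, Y, Z), (dX, dY, dZ) = state(s)
    # the three conserved quantities A, B, C from the notes
    return np.array([Z*dY - Y*dZ, X*dZ - Z*dX, Y*dX - X*dY])

ABC = consts(0.0)
for s in np.linspace(0.0, 2 * np.pi, 100):
    x, dx = state(s)
    assert abs(x @ x - 1.0) < 1e-12                  # stays on the sphere
    assert np.allclose(consts(s), ABC, atol=1e-12)   # A, B, C constant
    assert abs(ABC @ x) < 1e-12                      # lies in Ax + By + Cz = 0
```

Here (A, B, C) is, up to sign, the cross product of position and velocity, which is why it is perpendicular to the plane of the great circle.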
7 Constrained Motion
For particles moving with coordinates related by a constraint, say g = 0, Hamilton's principle extremises ∫_{t1}^{t2} L dt where L = T − V, now subject to the constraint. We use the Lagrange Multiplier method, and so we have a function λ(t) such that the motion satisfies the E-L equations for T − V − λg.
Example 7.1 Consider free motion on a surface in R³. We then have by definition V = 0, and also T = ½m(ẋ² + ẏ² + ż²) and g(x, y, z) = 0. Thus we want the E-L equations for ½m(ẋ² + ẏ² + ż²) − λ(t)g(x, y, z), and so we have that
−λ ∂g/∂x − d/dt (mẋ) = 0
−λ ∂g/∂y − d/dt (mẏ) = 0
−λ ∂g/∂z − d/dt (mż) = 0
which together read mẍ = −λ∇g.
∇g is a vector perpendicular to the surface at each point. If we eliminate the common factor −λ/m then we get that
ẍ/(∂g/∂x) = ÿ/(∂g/∂y) = z̈/(∂g/∂z).
Observe that
d/dt (ẋ² + ẏ² + ż²) = 2ẋẍ + 2ẏÿ + 2żz̈ = −(2λ/m) (ẋ ∂g/∂x + ẏ ∂g/∂y + ż ∂g/∂z) = 0
and so ẋ² + ẏ² + ż² is constant, hence ds/dt is constant, and changing from t to s gives the geodesic equation. Hence a free particle moves along a geodesic at constant speed.
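This can be verified numerically (a sketch assuming numpy; the radius, mass and initial data are illustrative). For the sphere g = x² + y² + z² − R², differentiating the constraint twice gives x · ẍ = −|ẋ|², and with mẍ = −2λx this fixes λ = m|ẋ|²/(2R²), so the motion obeys ẍ = −(|ẋ|²/R²)x. Integrating with RK4 shows the particle stays on the sphere at constant speed:

```python
import numpy as np

m, R = 1.0, 2.0                        # mass cancels in the acceleration
x = np.array([R, 0.0, 0.0])
v = np.array([0.0, 0.7, 0.4])          # tangent to the sphere: x.v = 0

def acc(x, v):
    # m x'' = -lam * grad g with grad g = 2x and lam = m|v|^2/(2R^2)
    return -(v @ v) / R**2 * x

h, speed0 = 1e-3, np.linalg.norm(v)
for _ in range(5000):                  # RK4 for the system (x, v)' = (v, acc)
    k1x, k1v = v, acc(x, v)
    k2x, k2v = v + h/2*k1v, acc(x + h/2*k1x, v + h/2*k1v)
    k3x, k3v = v + h/2*k2v, acc(x + h/2*k2x, v + h/2*k2v)
    k4x, k4v = v + h*k3v, acc(x + h*k3x, v + h*k3v)
    x = x + h/6*(k1x + 2*k2x + 2*k3x + k4x)
    v = v + h/6*(k1v + 2*k2v + 2*k3v + k4v)

assert abs(np.linalg.norm(x) - R) < 1e-7       # still on the sphere
assert abs(np.linalg.norm(v) - speed0) < 1e-9  # speed is constant
```

The trajectory is a great circle of the sphere traversed at the initial speed, matching the conclusion above.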