
CHAPTER 3
THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS
– p. 1/73
THE MAXIMUM PRINCIPLE: MIXED INEQUALITY CONSTRAINTS
•
Mixed Inequality Constraints: Inequality constraints
involving control and possibly state variables.
Examples:
g(u, t) ≥ 0;
g(x, u, t) ≥ 0.
•
Pure Inequality Constraints of the type
h(x, t) ≥ 0,
i.e., involving only state variables, will be treated in
Chapter 4.
– p. 2/73
PROBLEMS WITH MIXED INEQUALITY CONSTRAINTS
3.1 A Maximum Principle for Problems with Mixed Inequality
Constraints
•
State equation:
ẋ = f (x, u, t), x(0) = x0
(1)
where x(t) ∈ E^n and u(t) ∈ E^m, and
f : E^n × E^m × E^1 → E^n is assumed to be continuously
differentiable.
– p. 3/73
PROBLEMS WITH MIXED INEQUALITY CONSTRAINTS CONT.
•
Objective function:
max { J = ∫_0^T F(x, u, t) dt + S[x(T), T] },
(2)
where F : E^n × E^m × E^1 → E^1 and S : E^n × E^1 → E^1 are
continuously differentiable functions and T denotes the
terminal time.
•
For each t ∈ [0, T ], u(t) is admissible if it is piecewise
continuous and satisfies the mixed constraints
g(x, u, t) ≥ 0, t ∈ [0, T ],
(3)
where g : E^n × E^m × E^1 → E^q is continuously
differentiable.
– p. 4/73
TERMINAL STATE
•
The terminal state is constrained by the following
inequality and equality constraints:
a(x(T ), T ) ≥ 0,
(4)
b(x(T ), T ) = 0,
(5)
where a : E^n × E^1 → E^{l_a} and b : E^n × E^1 → E^{l_b} are
continuously differentiable.
– p. 5/73
SPECIAL CASES
An interesting case of the terminal inequality constraints is
x(T ) ∈ Y ⊂ X,
(6)
where Y is a convex set and X is the reachable set from the
initial state x0 .
X = {x(T) | x(T) is obtained by an admissible control u and (1)}.
Remarks:
1. The constraint (6) does not depend explicitly on T .
2. The feasible set defined by (4) and (5) need not be
convex.
3. (6) may not be expressible by a simple set of
inequalities.
– p. 6/73
FULL RANK CONDITIONS OR CONSTRAINT QUALIFICATIONS
rank [∂g/∂u, diag(g)] = q
holds for all arguments x(t), u(t), t, and
rank [ ∂a/∂x   diag(a)
       ∂b/∂x   0       ] = l_a + l_b
holds for all possible values of x(T) and T.
The first means that the gradients of active constraints in (3)
with respect to u are linearly independent. The second
means that the gradients of the equality constraints (5) and
of the active inequality constraints in (4) are linearly
independent.
– p. 7/73
HAMILTONIAN FUNCTION
•
The Hamiltonian function H : E^n × E^m × E^n × E^1 → E^1 is:
H[x, u, λ, t] := F (x, u, t) + λf (x, u, t),
(7)
where λ ∈ E^n (a row vector) is called the adjoint vector
or the costate vector. Recall that λ provides the
marginal valuation of increases in x.
– p. 8/73
LAGRANGIAN FUNCTION
•
The Lagrangian function L : E^n × E^m × E^n × E^q × E^1 → E^1
is defined as
L[x, u, λ, µ, t] := H(x, u, λ, t) + µg(x, u, t),
(8)
where µ ∈ E^q is a row vector whose components are
called Lagrange multipliers.
•
The Lagrange multipliers satisfy the complementary
slackness conditions
µ ≥ 0, µg(x, u, t) = 0.
(9)
– p. 9/73
ADJOINT VECTOR
The adjoint vector satisfies the differential equation
λ̇ = −Lx [x, u, λ, µ, t]
(10)
with the boundary conditions
λ(T ) = Sx (x(T ), T ) + αax (x(T ), T ) + βbx (x(T ), T ),
α ≥ 0,
αa(x(T ), T ) = 0,
where α ∈ E^{l_a} and β ∈ E^{l_b} are constant vectors.
– p. 10/73
NECESSARY CONDITIONS
The necessary conditions for u∗ (and the corresponding state x∗) to be an
optimal solution are that there exist λ, µ, α, and β which satisfy
the following:
ẋ∗ = f (x∗ , u∗ , t), x∗ (0) = x0 ,
satisfying the terminal constraints
a(x∗ (T ), T ) ≥ 0 and b(x∗ (T ), T ) = 0,
λ̇ = −Lx [x∗ , u∗ , λ, µ, t]
with the transversality conditions
λ(T ) = Sx (x∗ (T ), T ) + αax (x∗ (T ), T ) + βbx (x∗ (T ), T ),
α ≥ 0, αa(x∗ (T ), T ) = 0,
the Hamiltonian maximizing condition
H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]
at each t ∈ [0, T ] for all u satisfying
g[x∗ (t), u, t] ≥ 0,
and the Lagrange multipliers µ(t) are such that
∂L/∂u |_{u=u∗(t)} := [∂H/∂u + µ ∂g/∂u] |_{u=u∗(t)} = 0
and the complementary slackness conditions
µ(t) ≥ 0, µ(t)g(x∗ , u∗ , t) = 0 hold.
(11)
– p. 11/73
SPECIAL CASE
In the case of the terminal constraint (6), the terminal
conditions on the state and the adjoint variables in (11) will
be replaced, respectively, by
x∗ (T ) ∈ Y ⊂ X
(12)
and
[λ(T ) − Sx (x∗ (T ), T )][y − x∗ (T )] ≥ 0,
∀y ∈ Y.
(13)
– p. 12/73
SPECIAL CASE
Furthermore, if the terminal time T in the problem (1)-(5) is
unspecified, there is an additional necessary transversality
condition for T ∗ to be optimal (see Exercise 3.5), namely,
H[x∗ (T ∗ ), u∗ (T ∗ ), λ(T ∗ ), T ∗ ] + ST [x∗ (T ∗ ), T ∗ ] = 0,
(14)
if T ∗ ∈ (0, ∞).
– p. 13/73
REMARKS 3.1 AND 3.2
Remark 3.1: We should have H = λ0 F + λf in (7) with
λ0 ≥ 0. However, we can set λ0 = 1 in most applications.
Remark 3.2: If the set Y in (6) consists of a single point
Y = {k}, making the problem a fixed-end-point problem,
then the transversality condition reduces to simply λ(T )
equals a constant β to be determined, since x∗ (T ) = k . In
this case the salvage function S becomes a constant, and can
therefore be disregarded.
– p. 14/73
EXAMPLE 3.1
Consider the problem:
max J = ∫_0^1 u dt
subject to
ẋ = u, x(0) = 1,
(15)
u ≥ 0, x − u ≥ 0.
(16)
Note that constraints (16) are of the mixed type (3). They
can also be rewritten as 0 ≤ u ≤ x.
– p. 15/73
SOLUTION OF EXAMPLE 3.1
The Hamiltonian is
H = u + λu = (1 + λ)u,
so that the optimal control has the form
u∗ = bang[0, x; 1 + λ].
(17)
To get the adjoint equation and the multipliers associated
with constraints (16), we form the Lagrangian:
L = H + µ1 u + µ2 (x − u) = µ2 x + (1 + λ + µ1 − µ2 )u.
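The bang function used in (17) follows the convention of Chapter 2: the control takes the lower bound when the switching expression is negative and the upper bound when it is positive; it is not determined by this condition alone when the expression vanishes. A minimal Python sketch of that rule (the handling of the singular case W = 0 is our own placeholder, not from the text):

def bang(b1, b2, W):
    # bang[b1, b2; W] as used in (17): b1 if W < 0, b2 if W > 0.
    if W > 0:
        return b2
    if W < 0:
        return b1
    return None  # W = 0: singular case, not determined by (17) alone

For instance, bang(0, x, 1 + lam) returns x whenever 1 + λ > 0, which is the situation throughout Example 3.1.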
– p. 16/73
SOLUTION OF EXAMPLE 3.1 CONT.
From this we get the adjoint equation
λ̇ = −∂L/∂x = −µ2, λ(1) = 0.
(18)
Also note that the optimal control must satisfy
∂L/∂u = 1 + λ + µ1 − µ2 = 0,
(19)
and µ1 and µ2 must satisfy the complementary slackness
conditions
µ1 ≥ 0, µ1 u = 0,
(20)
µ2 ≥ 0, µ2 (x − u) = 0.
(21)
– p. 17/73
SOLUTION OF EXAMPLE 3.1 CONT.
It is obvious for this simple problem that u∗ (t) = x(t) should
be the optimal control for all t ∈ [0, 1]. We now show that
this control satisfies all the conditions of the Lagrangian
form of the maximum principle.
Since x(0) = 1, the control u∗ = x gives x = et as the
solution of (15). Because x = et > 0, it follows that
u∗ = x > 0; thus µ1 = 0 from (20).
– p. 18/73
SOLUTION OF EXAMPLE 3.1 CONT.
From (19) we then have
µ2 = 1 + λ.
Substituting this into (18) and solving gives
1 + λ(t) = e1−t .
(22)
Since the right-hand side of (22) is always positive, u∗ = x
satisfies (17). Notice that µ2 = e1−t ≥ 0 and x − u∗ = 0, so
(21) holds.
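As a numerical check (a sketch using only the expressions derived above), one can verify that λ(t) = e^{1−t} − 1, µ1 = 0, µ2 = e^{1−t}, and u∗ = x = e^t satisfy (18)-(21):

import numpy as np

t = np.linspace(0.0, 1.0, 201)
x = np.exp(t)                  # state under u* = x, from (15) with x(0) = 1
u = x                          # candidate optimal control
lam = np.exp(1.0 - t) - 1.0    # adjoint, from (22)
mu1 = np.zeros_like(t)         # (20): u* > 0 forces mu1 = 0
mu2 = 1.0 + lam                # from (19)

lam_dot = np.gradient(lam, t)  # numerical derivative of lambda
assert np.allclose(lam_dot[1:-1], -mu2[1:-1], atol=1e-3)     # adjoint equation (18)
assert abs(lam[-1]) < 1e-12                                  # lambda(1) = 0
assert np.allclose(1.0 + lam + mu1 - mu2, 0.0)               # stationarity (19)
assert np.all(mu2 >= 0) and np.allclose(mu2 * (x - u), 0.0)  # slackness (20)-(21)
print("u* = x satisfies all conditions of the maximum principle.")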
– p. 19/73
SUFFICIENCY CONDITIONS: CONCAVE AND QUASICONCAVE FUNCTIONS
•
Let D ⊂ E^n be a convex set. A function ψ : D → E^1 is
concave, if for all y, z ∈ D and for all p ∈ [0, 1],
ψ(py + (1 − p)z) ≥ pψ(y) + (1 − p)ψ(z).
(23)
The function ψ is quasiconcave if (23) is relaxed to
ψ(py + (1 − p)z) ≥ min{ψ(y), ψ(z)}.
(24)
•
ψ is strictly concave if y ≠ z and p ∈ (0, 1) and (23)
holds with a strict inequality.
ψ is convex, quasiconvex, or strictly convex if −ψ is
concave, quasiconcave, or strictly concave, respectively.
– p. 20/73
SUFFICIENCY CONDITIONS: THEOREM 3.1
Let (x∗ , u∗ , λ, µ, α, β) satisfy the necessary conditions in
(11).
If H(x, u, λ(t), t) is concave in (x, u) at each t ∈ [0, T ],
S in (2) is concave in x, g in (3) is quasiconcave in (x, u), a
in (4) is quasiconcave in x, and b in (5) is linear in x, then
(x∗ , u∗ ) is optimal.
– p. 21/73
REMARK ON THE CONCAVITY CONDITION IN THEOREM 3.1
The concavity of the Hamiltonian with respect to (x, u) is a
crucial condition in Theorem 3.1. Unfortunately, a number
of management science and economics models lead to
problems that do not satisfy this concavity condition. We
replace the concavity requirement on the Hamiltonian in
Theorem 3.1 by a concavity requirement on H^0, where
H^0(x, λ, t) = max_{u : g(x,u,t) ≥ 0} H(x, u, λ, t).
(25)
– p. 22/73
THEOREM 3.2
Theorem 3.1 remains valid if
H^0(x∗(t), λ(t), t) = H(x∗(t), u∗(t), λ(t), t), t ∈ [0, T ],
and if, in addition, we drop the quasiconcavity requirement
on g and replace the concavity requirement on H in
Theorem 3.1 by the following assumption: For each
t ∈ [0, T ], if we define A1(t) = {x | g(x, u, t) ≥ 0 for some u},
then H^0(x, λ(t), t) is concave on A1(t), if A1(t) is convex. If
A1(t) is not convex, we assume that H^0 has a concave
extension to co(A1(t)), the convex hull of A1(t).
– p. 23/73
3.3 CURRENT-VALUE FORMULATION
Assume a constant continuous discount rate ρ ≥ 0. The time
dependence of the relevant functions comes only through
the discount factor. Thus,
F(x, u, t) = φ(x, u)e^{−ρt} and S(x, T) = σ(x)e^{−ρT}.
Now, the objective is to
maximize { J = ∫_0^T φ(x, u)e^{−ρt} dt + σ[x(T)]e^{−ρT} }
(26)
subject to (1) and (3)-(5).
– p. 24/73
3.3 CURRENT-VALUE FORMULATION CONT.
The standard Hamiltonian is
H^s := e^{−ρt} φ(x, u) + λ^s f(x, u, t)
(27)
and the standard Lagrangian is
L^s := H^s + µ^s g(x, u, t).
(28)
– p. 25/73
3.3 CURRENT-VALUE FORMULATION CONT.
The standard adjoint variables λ^s and standard multipliers
µ^s, α^s, and β^s satisfy
λ̇^s = −L^s_x,
(29)
λ^s(T) = Sx[x(T), T] + α^s ax(x(T), T) + β^s bx(x(T), T)
       = e^{−ρT} σx[x(T)] + α^s ax(x(T), T) + β^s bx(x(T), T),
(30)
α^s ≥ 0, α^s a(x(T), T) = 0,
(31)
µ^s ≥ 0, µ^s g = 0.
(32)
– p. 26/73
3.3 CURRENT-VALUE FORMULATION CONT.
The current-value Hamiltonian is
H[x, u, λ, t] := φ(x, u) + λf(x, u, t)
(33)
and the current-value Lagrangian is
L[x, u, λ, µ, t] := H + µg(x, u, t).
(34)
We define
λ := e^{ρt} λ^s and µ := e^{ρt} µ^s,
(35)
so that we can rewrite (27) and (28) as
H = e^{ρt} H^s and L = e^{ρt} L^s.
(36)
– p. 27/73
3.3 CURRENT-VALUE FORMULATION CONT.
From (35), we have
λ̇ = ρe^{ρt} λ^s + e^{ρt} λ̇^s.
(37)
Then from (29),
λ̇ = ρλ − Lx,
λ(T) = σx[x(T)] + αax(x(T), T) + βbx(x(T), T),
(38)
where (38) follows from the terminal condition for λ^s(T) in
(30), the definition (36), and the definitions
α = e^{ρT} α^s and β = e^{ρT} β^s.
(39)
– p. 28/73
3.3 CURRENT-VALUE FORMULATION CONT.
The complementary slackness conditions satisfied by the
current-value Lagrange multipliers µ and α are
µ ≥ 0, µg = 0, α ≥ 0, and αa = 0
on account of (31), (32), (35), and (39).
From (14), the necessary transversality condition for T ∗ to
be optimal is
H[x∗ (T ∗ ), u∗ (T ∗ ), λ(T ∗ ), T ∗ ] − ρσ[x∗ (T ∗ )] = 0.
(40)
– p. 29/73
THE CURRENT-VALUE MAXIMUM PRINCIPLE
ẋ∗ = f (x∗ , u∗ , t),
a(x∗ (T ), T ) ≥ 0, b(x∗ (T ), T ) = 0,
λ̇ = ρλ − Lx [x∗ , u∗ , λ, µ, t],
with the terminal conditions
λ(T ) = σx (x∗ (T )) + αax (x∗ (T ), T ) + βbx (x∗ (T ), T ),
α ≥ 0, αa(x∗ (T ), T ) = 0,
and the Hamiltonian maximizing condition
H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]
at each t ∈ [0, T ] for all u satisfying
g[x∗ (t), u, t] ≥ 0,
and the Lagrange multipliers µ(t) are such that
∂L/∂u |_{u=u∗(t)} = 0, and the complementary slackness
conditions µ(t) ≥ 0 and µ(t)g(x∗ , u∗ , t) = 0 hold.
(41)
– p. 30/73
SPECIAL CASE
When the terminal constraint is given by (6) instead of (4)
and (5), we need to replace the terminal condition on the
state and the adjoint variables, respectively, by (12) and
[λ(T ) − σx (x∗ (T ))][y − x∗ (T )] ≥ 0, ∀y ∈ Y.
(42)
– p. 31/73
EXAMPLE 3.2
Use the current-value maximum principle to solve the
following consumption problem for ρ = r:
max J = ∫_0^T e^{−ρt} ln C(t) dt
subject to the wealth dynamics
Ẇ = rW − C, W (0) = W0 , W (T ) = 0,
where W0 > 0. Note that the condition W (T ) = 0 is
sufficient to make W (t) ≥ 0 for all t. We can interpret
ln C(t) as the utility of consuming at the rate C(t) per unit
time at time t.
– p. 32/73
SOLUTION OF EXAMPLE 3.2
The current-value Hamiltonian is
H = ln C + λ(rW − C),
(43)
where the adjoint equation, under the assumption ρ = r, is
λ̇ = ρλ − ∂H/∂W = ρλ − rλ = 0, λ(T) = β,
(44)
where β is some constant to be determined.
The solution of (44) is simply λ(t) = β for 0 ≤ t ≤ T .
– p. 33/73
SOLUTION OF EXAMPLE 3.2 CONT.
To find the optimal control, we maximize H by
differentiating (43) with respect to C and setting the result
to zero:
∂H/∂C = 1/C − λ = 0,
which implies C = 1/λ = 1/β. Using this consumption
level in the wealth dynamics gives
Ẇ = rW − 1/β, W(0) = W0, W(T) = 0,
which can be solved as
W(t) = W0 e^{rt} − (1/(βr))(e^{rt} − 1).
– p. 34/73
SOLUTION OF EXAMPLE 3.2 CONT.
Setting W(T) = 0 gives
β = (1 − e^{−rT})/(rW0).
Therefore, the optimal consumption is
C∗(t) = 1/β = rW0/(1 − e^{−rT}) = ρW0/(1 − e^{−ρT}),
since ρ = r.
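A quick numerical sketch (with illustrative values r = ρ = 0.05, W0 = 100, and T = 10, which are not from the text) confirms that consuming at the constant rate C∗ = rW0/(1 − e^{−rT}) exhausts wealth exactly at T:

import numpy as np

r = 0.05                      # illustrative interest rate, with rho = r
W0, T = 100.0, 10.0           # illustrative initial wealth and horizon
beta = (1.0 - np.exp(-r * T)) / (r * W0)
C_star = 1.0 / beta           # optimal consumption rate from the formula above

t = np.linspace(0.0, T, 1001)
W = W0 * np.exp(r * t) - (np.exp(r * t) - 1.0) / (beta * r)   # wealth path

print(f"C* = {C_star:.4f}, W(T) = {W[-1]:.2e}")
assert abs(W[-1]) < 1e-8      # terminal condition W(T) = 0 holds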
– p. 35/73
3.4 TERMINAL CONDITIONS/TRANSVERSALITY CONDITIONS
Case 1: Free-end point. From the terminal conditions in
(11), it is obvious that for the free-end-point problem, i.e.,
when Y = X,
λ(T) = σx[x∗(T)].
(45)
If σ(x) ≡ 0, then λ(T) = 0.
Case 2: Fixed-end point. The terminal condition is
b(x(T), T) = x(T) − k = 0,
and the transversality condition in (11) does not provide any
information for λ(T). λ(T) will be some constant β.
– p. 36/73
3.4 TERMINAL CONDITIONS/TRANSVERSALITY CONDITIONS CONT.
Case 3: One-sided constraints. The ending value of the state
variable is in a one-sided interval, namely,
a(x(T), T) = x(T) − k ≥ 0,
where k ∈ X. In this case it is possible to show that
λ(T) ≥ σx[x∗(T)]
(46)
and
{λ(T) − σx[x∗(T)]}{x∗(T) − k} = 0.
(47)
For σ(x) ≡ 0, these terminal conditions can be written as
λ(T) ≥ 0 and λ(T)[x∗(T) − k] = 0.
(48)
– p. 37/73
3.4 TERMINAL CONDITIONS/TRANSVERSALITY CONDITIONS CONT.
Case 4: A general case. A general ending condition is
x(T ) ∈ Y ⊂ X.
– p. 38/73
TABLE 3.1 SUMMARY OF THE TRANSVERSALITY CONDITIONS
Row 1. Constraint on x(T): x(T) ∈ Y = X (free-end point).
       λ(T) = σx[x∗(T)];  when σ ≡ 0: λ(T) = 0.
Row 2. Constraint on x(T): x(T) = k ∈ X, i.e., Y = {k} (fixed-end point).
       λ(T) = a constant to be determined;  when σ ≡ 0: λ(T) = a constant to be determined.
Row 3. Constraint on x(T): x(T) ∈ X ∩ [k, ∞), i.e., Y = {x | x ≥ k}, so x(T) ≥ k (one-sided constraint).
       λ(T) ≥ σx[x∗(T)] and {λ(T) − σx[x∗(T)]}{x∗(T) − k} = 0;
       when σ ≡ 0: λ(T) ≥ 0 and λ(T)[x∗(T) − k] = 0.
Row 4. Constraint on x(T): x(T) ∈ X ∩ (−∞, k], i.e., Y = {x | x ≤ k}, so x(T) ≤ k (one-sided constraint).
       λ(T) ≤ σx[x∗(T)] and {λ(T) − σx[x∗(T)]}{k − x∗(T)} = 0;
       when σ ≡ 0: λ(T) ≤ 0 and λ(T)[k − x∗(T)] = 0.
Row 5. Constraint on x(T): x(T) ∈ Y ⊂ X (general constraints).
       {λ(T) − σx[x∗(T)]}{y − x∗(T)} ≥ 0 ∀ y ∈ Y;
       when σ ≡ 0: λ(T)[y − x∗(T)] ≥ 0 ∀ y ∈ Y.
– p. 39/73
EXAMPLE 3.3
Consider the problem:
max J = ∫_0^2 −x dt
subject to
ẋ = u, x(0) = 1, x(2) ≥ 0,
(49)
−1 ≤ u ≤ 1.
(50)
– p. 40/73
SOLUTION OF EXAMPLE 3.3
The Hamiltonian is
H = −x + λu.
Clearly the optimal control has the form
u∗ = bang[−1, 1; λ].
(51)
The adjoint equation is
λ̇ = 1
(52)
with the transversality conditions
λ(2) ≥ 0 and λ(2)x(2) = 0.
(53)
– p. 41/73
SOLUTION OF EXAMPLE 3.3 CONT.
Since λ(t) is monotonically increasing, the control (51) can
switch at most once, and it can only switch from u∗ = −1 to
u∗ = 1. Let the switching time be t∗ ≤ 2. Then the optimal
control is
u∗(t) = { −1  for 0 ≤ t ≤ t∗,
          +1  for t∗ < t ≤ 2.
(54)
Since the control switches at t∗ , λ(t∗ ) must be 0. Solving
(52) we get
λ(t) = t − t∗ .
– p. 42/73
SOLUTION OF EXAMPLE 3.3 CONT.
There are two cases t∗ < 2 and t∗ = 2. We analyze the first
case first. Here λ(2) = 2 − t∗ > 0; therefore from (53),
x(2) = 0. Solving for x with u∗ given in (54), we obtain
x(t) = { 1 − t                           for 0 ≤ t ≤ t∗,
         (t − t∗) + x(t∗) = t + 1 − 2t∗  for t∗ < t ≤ 2.
Therefore, setting x(2) = 0 gives
x(2) = 3 − 2t∗ = 0,
which makes t∗ = 3/2. Since this satisfies t∗ < 2, we do not
have to deal with the case t∗ = 2.
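A short sketch (using the closed-form expressions above) confirms that t∗ = 3/2 is consistent with the conditions of the maximum principle for this example:

import numpy as np

t_star = 1.5
t = np.linspace(0.0, 2.0, 2001)
u = np.where(t <= t_star, -1.0, 1.0)                         # control (54)
x = np.where(t <= t_star, 1.0 - t, t + 1.0 - 2.0 * t_star)   # state path above
lam = t - t_star                                             # solves (52) with lambda(t*) = 0

assert np.all((u >= -1.0) & (u <= 1.0))                      # constraint (50)
assert abs(x[-1]) < 1e-12                                    # x(2) = 0
assert lam[-1] > 0 and abs(lam[-1] * x[-1]) < 1e-12          # transversality (53)
print("t* = 3/2: x(2) = 0 and lambda(2) =", lam[-1])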
– p. 43/73
FIGURE 3.1 STATE AND ADJOINT TRAJECTORIES IN EXAMPLE 3.3
– p. 44/73
ISOPERIMETRIC OR BUDGET CONSTRAINT
It is of the form:
∫_0^T l(x, u, t) dt ≤ K,
(55)
where l : E^n × E^m × E^1 → E^1 is assumed nonnegative,
bounded, and continuously differentiable, and K is a
positive constant representing the amount of the budget. To
see how this constraint can be converted into a one-sided
constraint, we define an additional state variable x_{n+1} by the
state equation
ẋ_{n+1} = −l(x, u, t), x_{n+1}(0) = K, x_{n+1}(T) ≥ 0.
(56)
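As a sketch of this conversion (the dynamics f and the spending rate l below are illustrative placeholders, not from the text), the augmented system simply appends ẋ_{n+1} = −l(x, u, t) to (1), and the budget constraint (55) becomes the one-sided terminal condition x_{n+1}(T) ≥ 0 handled by Row 3 of Table 3.1:

import numpy as np

def f(x, u, t):
    # placeholder original dynamics (1) with n = 1, m = 1
    return np.array([u])

def l(x, u, t):
    # placeholder spending rate appearing in (55)
    return u**2

def f_aug(z, u, t):
    # augmented dynamics (56): z = (x, x_{n+1}), with x_{n+1}(0) = K
    x = z[:-1]
    return np.concatenate([f(x, u, t), [-l(x, u, t)]])

x0, K = np.array([1.0]), 10.0
z0 = np.concatenate([x0, [K]])
print(f_aug(z0, 0.5, 0.0))    # [0.5, -0.25]: the state grows while the budget is drawn down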
– p. 45/73
EXAMPLES ILLUSTRATING TERMINAL CONDITIONS
Example 3.4 The problem is:
max { J = ∫_0^T e^{−ρt} ln C(t) dt + e^{−ρT} BW(T) }
(57)
subject to the wealth equation
Ẇ = rW − C, W (0) = W0 , W (T ) ≥ 0.
(58)
Assume B to be a given positive constant.
– p. 46/73
SOLUTION OF EXAMPLE 3.4
The Hamiltonian for the problem is given in (43), and the
adjoint equation is given in (44) except that the
transversality conditions are from Row 3 of Table 3.1:
λ(T ) ≥ B, [λ(T ) − B]W (T ) = 0.
(59)
In Example 3.2 the value of β , which was the terminal value
of the adjoint variable, was
β = (1 − e^{−rT})/(rW0).
We now have two cases: (i) β ≥ B and (ii) β < B .
– p. 47/73
SOLUTION OF EXAMPLE 3.4 CONT.
•
In case (i), the solution of the problem is the same as
that of Example 3.2, because by setting λ(T ) = β and
recalling that W (T ) = 0 in that example, it follows that
(59) holds.
•
In case (ii), we set λ(T ) = B and use (44) which is
λ̇ = 0. Hence, λ(t) = B for all t. The Hamiltonian
maximizing condition remains unchanged. Therefore,
the optimal consumption is:
C = 1/λ = 1/B.
– p. 48/73
SOLUTION OF EXAMPLE 3.4 CONT.
Solving (58) with this C gives
W(t) = W0 e^{rt} − (1/(Br))(e^{rt} − 1).
It is easy to show that
W(T) = W0 e^{rT} − (1/(Br))(e^{rT} − 1)
is nonnegative since β < B . Note that (59) holds for case
(ii).
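A brief sketch (with illustrative values r = 0.05, W0 = 100, T = 10, again not from the text) distinguishes the two cases and checks the transversality conditions (59):

import numpy as np

r, W0, T = 0.05, 100.0, 10.0
beta = (1.0 - np.exp(-r * T)) / (r * W0)     # terminal adjoint value from Example 3.2

for B in (0.5 * beta, 2.0 * beta):           # case (i): beta >= B; case (ii): beta < B
    lam_T = beta if beta >= B else B         # lambda(T) in each case
    C = 1.0 / lam_T                          # constant consumption rate
    W_T = W0 * np.exp(r * T) - (np.exp(r * T) - 1.0) / (lam_T * r)
    assert W_T >= -1e-9                      # W(T) >= 0
    assert lam_T >= B and abs((lam_T - B) * W_T) < 1e-6   # conditions (59)
    print(f"B = {B:.5f}: C = {C:.3f}, W(T) = {W_T:.3f}")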
– p. 49/73
EXAMPLE 3.5: A TIME-OPTIMAL CONTROL PROBLEM
Consider a subway train of mass m (assume m = 1), which
moves along a smooth horizontal track with negligible
friction. The position x of the train along the track at time t
is determined by Newton’s Second Law of Motion, i.e.,
mẍ = u, i.e., ẍ = u (since m = 1).
(60)
Note: (60) is a second-order differential equation.
– p. 50/73
EXAMPLE 3.5: A TIME-OPTIMAL CONTROL PROBLEM
Let the initial conditions on x(0) and ẋ(0) be
x(0) = x0 and ẋ(0) = y0.
Transform (60) into two first-order differential equations:
ẋ = y, x(0) = x0,
ẏ = u, y(0) = y0.
(61)
Let the control constraint be
u ∈ Ω = [−1, 1].
(62)
– p. 51/73
EXAMPLE 3.5: A TIME-OPTIMAL CONTROL PROBLEM CONT.
The problem is:
max { J = ∫_0^T −1 dt }
subject to
ẋ = y, x(0) = x0, x(T) = 0,
ẏ = u, y(0) = y0, y(T) = 0,
and the control constraint
u ∈ Ω = [−1, 1].
(63)
– p. 52/73
SOLUTION OF EXAMPLE 3.5
The standard Hamiltonian function in this case is
H = −1 + λ1 y + λ2 u,
where the adjoint variables λ1 and λ2 satisfy
λ̇1 = 0, λ1(T) = β1 and λ̇2 = −λ1, λ2(T) = β2,
and
λ1 = β1 and λ2 = β2 + β1 (T − t).
The Hamiltonian maximizing condition yields the form of
the optimal control to be
u∗ (t) = bang{−1, 1; β2 + β1 (T − t)}.
(64)
– p. 53/73
SOLUTION OF EXAMPLE 3.5 CONT.
The transversality condition (14) with y(T ) = 0 and S ≡ 0
yields
H + ST = λ2 (T )u∗ (T ) − 1 = β2 u∗ (T ) − 1 = 0,
which together with the bang-bang control policy (64)
implies either
λ2 (T ) = β2 = −1 and u∗ (T ) = −1,
or
λ2 (T ) = β2 = +1 and u∗ (T ) = +1.
– p. 54/73
TABLE 3.2 STATE TRAJECTORIES AND SWITCHING CURVE
(a) u∗(τ) = −1 for t ≤ τ ≤ T:
    y = T − t,
    x = −(T − t)^2/2,
    Γ− : x = −y^2/2 for y ≥ 0.
(b) u∗(τ) = +1 for t ≤ τ ≤ T:
    y = t − T,
    x = (t − T)^2/2,
    Γ+ : x = y^2/2 for y ≤ 0.
– p. 55/73
SOLUTION OF EXAMPLE 3.5 CONT.
We can put Γ− and Γ+ into a single switching curve Γ as
y = Γ(x) = { −√(2x)   for x ≥ 0,
             +√(−2x)  for x < 0.
(65)
If the initial state (x0 , y0 ) lies on the switching curve, then
we use u∗ = +1 (resp., u∗ = −1) if (x0 , y0 ) lies on Γ+ (resp.,
Γ− ). In common parlance, we apply the brakes. If the initial
state (x0 , y0 ) is not on the switching curve, then we choose,
between u∗ = 1 and u∗ = −1, that which moves the system
toward the switching curve. By inspection, it is obvious that
above the switching curve we must choose u∗ = −1 and
below we must choose u∗ = +1.
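The synthesis just described can be written as a state-feedback rule. A minimal Python sketch (the tolerance for deciding that a point lies on the curve is a numerical convenience, not part of the text):

import numpy as np

def u_feedback(x, y, tol=1e-9):
    # time-optimal feedback control built from the switching curve (65)
    gamma = -np.sign(x) * np.sqrt(2.0 * abs(x))   # y = Gamma(x) from (65)
    if abs(y - gamma) <= tol:                     # on the switching curve: apply the brakes
        return 1.0 if y <= 0 else -1.0            # Gamma+ uses u = +1, Gamma- uses u = -1
    return -1.0 if y > gamma else 1.0             # above Gamma: u = -1; below: u = +1

print(u_feedback(1.0, 1.0))    # (x0, y0) = (1, 1) lies above Gamma, so u* = -1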
– p. 56/73
FIGURE 3.2 MINIMUM TIME OPTIMAL RESPONSE FOR PROBLEM (63)
– p. 57/73
SOLUTION OF EXAMPLE 3.5 CONT.
The other curves in Figure 3.2 are solutions of the
differential equations starting from initial points (x0 , y0 ). If
(x0 , y0 ) lies above the switching curve Γ as shown in Figure
3.2, we use u∗ = −1 to compute the curve as follows:
ẋ = y, x(0) = x0 ,
ẏ = −1, y(0) = y0 .
Integrating these equations gives
y = −t + y0,
x = −t^2/2 + y0 t + x0.
Elimination of t between these two gives
x = (y0^2 − y^2)/2 + x0.
(66)
– p. 58/73
SOLUTION OF EXAMPLE 3.5 CONT.
(66) is the equation of the parabola in Figure 3.2 through
(x0 , y0 ). The point of intersection of the curve (66) with the
switching curve Γ+ is obtained by solving (66) and the
equation for Γ+, namely 2x = y^2, simultaneously. This gives
x∗ = (y0^2 + 2x0)/4,  y∗ = −√((y0^2 + 2x0)/2),
(67)
where the minus sign in the expression for y∗ in (67) was
chosen since the intersection occurs when y∗ is negative.
The time t∗ to reach the switching curve, called the
switching time, given that we start above it, is
t∗ = y0 − y∗ = y0 + √((y0^2 + 2x0)/2).
(68)
– p. 59/73
SOLUTION OF EXAMPLE 3.5 CONT.
To find the minimum total time to go from the starting point
(x0 , y0 ) to the origin (0,0), we substitute t∗ into the equation
for Γ+ in Column (b) of Table 3.2. This gives
T = t∗ − y∗ = y0 + √(2(y0^2 + 2x0)).
(69)
As a numerical example, start at the point (x0, y0) = (1, 1).
Then, the equation of the parabola (66) is
2x = 3 − y^2.
The switching point (67) is (3/4, −√(3/2)). Finally, the
switching time is t∗ = 1 + √(3/2) from (68). Substituting into
(69), we find the minimum time to stop is T = 1 + √6.
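These numbers are easy to reproduce with a few lines of Python using (67)-(69) directly:

import numpy as np

x0, y0 = 1.0, 1.0                             # starting point above the switching curve
x_s = (y0**2 + 2.0 * x0) / 4.0                # switching point, from (67)
y_s = -np.sqrt((y0**2 + 2.0 * x0) / 2.0)
t_s = y0 - y_s                                # switching time, from (68)
T = y0 + np.sqrt(2.0 * (y0**2 + 2.0 * x0))    # minimum time, from (69)

print(f"switching point = ({x_s}, {y_s:.4f}), t* = {t_s:.4f}, T = {T:.4f}")
assert np.isclose(T, 1.0 + np.sqrt(6.0))      # T = 1 + sqrt(6), as computed above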
– p. 60/73
SOLUTION OF EXAMPLE 3.5 CONT.
To complete the solution of this numerical example let us
evaluate β1 and β2 , which are needed to obtain λ1 and λ2 .
Since (1,1) is above the switching curve, u∗ (T ) = 1, and
therefore β2 = 1. To compute β1 , we observe that
λ2 (t∗ ) = β2 + β1 (T − t∗ ) = 0 so that
β1 = −β2/(T − t∗) = −1/√(3/2) = −√(2/3).
In Exercises 3.14 - 3.17, you are asked to work other
examples with different starting points above, below, and on
the switching curve. Note that t∗ = 0 by definition, if the
starting point is on the switching curve.
– p. 61/73
3.5 INFINITE HORIZON AND STATIONARITY
Transversality conditions:
•
Free-end-point:
lim_{T→∞} λ^s(T) = 0 ⇒ lim_{T→∞} e^{−ρT} λ(T) = 0.
(70)
•
One-sided constraints:
lim_{T→∞} x(T) ≥ 0.
Then, the transversality conditions are
lim_{T→∞} e^{−ρT} λ(T) ≥ 0 and lim_{T→∞} e^{−ρT} λ(T)x∗(T) = 0. (71)
– p. 62/73
3.5 INFINITE HORIZON AND STATIONARITY CONT.
•
Stationarity Assumption:
f(x, u, t) = f(x, u), g(x, u, t) = g(x, u).
(72)
•
Long-run stationary equilibrium is defined by the
quadruple {x̄, ū, λ̄, µ̄} satisfying
f (x̄, ū) = 0,
ρλ̄ = Lx [x̄, ū, λ̄, µ̄],
µ̄ ≥ 0, µ̄g(x̄, ū) = 0,
and
H(x̄, ū, λ̄) ≥ H(x̄, u, λ̄)
for all u satisfying
g(x̄, u) ≥ 0.
(73)
– p. 63/73
3.5 INFINITE HORIZON AND STATIONARITY CONT.
Clearly, if the initial condition x0 = x̄, the optimal control is
u∗ (t) = ū for all t. If x0 6= x̄, the optimal solution will have a
transient phase.
If the constraint involving g is not imposed, µ̄ may be
dropped from the quadruple. In this case, the equilibrium is
defined by the triple {x̄, ū, λ̄} satisfying
f (x̄, ū) = 0, ρλ̄ = Hx (x̄, ū, λ̄), and Hu (x̄, ū, λ̄) = 0. (74)
– p. 64/73
EXAMPLE 3.6
Consider the problem:
max J = ∫_0^∞ e^{−ρt} ln C(t) dt
subject to
lim_{T→∞} W(T) ≥ 0,
(75)
Ẇ = rW − C, W(0) = W0 > 0.
(76)
– p. 65/73
SOLUTION OF EXAMPLE 3.6
By (73) we set
rW̄ − C̄ = 0, λ̄ = β,
where β is a constant to be determined. This gives the
optimal control C̄ = rW̄, and by setting λ̄ = 1/C̄ = 1/(rW̄),
we see that all the conditions of (73), including the Hamiltonian
maximizing condition, hold.
– p. 66/73
SOLUTION OF EXAMPLE 3.6 CONT.
Furthermore, λ̄ and W̄ = W0 satisfy the transversality
conditions (71). Therefore, by the sufficiency theorem, the
control obtained is optimal. Note that the interpretation of
the solution is that the trust spends only the interest from its
endowment W0. Note further that the triple
(W̄, C̄, λ̄) = (W0, rW0, 1/(rW0)) is an optimal long-run
stationary equilibrium for the problem.
– p. 67/73
TABLE 3.3: OBJECTIVE, STATE, AND ADJOINT EQUATIONS FOR VARIOUS MODEL TYPES
Columns: Objective Function Integrand (φ =) | State Equation (ẋ = f =) |
Current-Value Adjoint Equation (λ̇ =) | Form of Optimal Control Policy
(a) φ = Cx + Du;          f = Ax + Bu + d;          λ̇ = λ(ρ − A) − C;                Bang-Bang
(b) φ = C(x) + Du;        f = Ax + Bu + d;          λ̇ = λ(ρ − A) − C_x;              Bang-Bang + Singular
(c) φ = x^T Cx + u^T Du;  f = Ax + Bu + d;          λ̇ = λ(ρ − A) − 2x^T C;           Linear Decision Rule
(d) φ = C(x) + Du;        f = A(x) + Bu + d;        λ̇ = λ(ρ − A_x) − C_x;            Bang-Bang + Singular
(e) φ = c(x) + q(u);      f = (ax + d)b(u) + e(x);  λ̇ = λ(ρ − ab(u) − e_x) − c_x;    Interior or Boundary
(f) φ = c(x)q(u);         f = (ax + d)b(u) + e(x);  λ̇ = λ(ρ − ab(u) − e_x) − c_x q(u); Interior or Boundary
– p. 68/73
3.6 MODEL TYPES
•
In Model Type (a) of Table 3.3 we see that both φ and f
are linear functions of their arguments. Hence it is
called the linear-linear case. The Hamiltonian is
H = Cx + Du + λ(Ax + Bu + d)
  = Cx + λAx + λd + (D + λB)u.
(77)
•
Model Type (b) of Table 3.3 is the same as Model Type
(a) except that the function C(x) is nonlinear.
– p. 69/73
3.6 MODEL TYPES CONT.
•
Model Type (c) has linear functions in the state equation
and quadratic functions in the objective function.
•
Model Type (d) is a more general version of Model
Type (b) in which the state equation is nonlinear in x.
•
In Model Types (e) and (f), the functions are scalar
functions, and there is only one state equation so that λ
is also a scalar function.
– p. 70/73
REMARKS 3.3 AND 3.4
Remark 3.3: In order to use the absolute value function |u|
of a control variable u in forming the functions φ or f, we
define u^+ and u^− satisfying the following relations:
u := u^+ − u^−, u^+ ≥ 0, u^− ≥ 0,
(78)
u^+ u^− = 0.
(79)
We write
|u| = u^+ + u^−.
(80)
We need not impose (79) explicitly.
Remark 3.4: Tables 3.1 and 3.3 are constructed for
continuous-time models.
– p. 71/73
REMARK 3.5
Remark 3.5: Consider Model Types (a) and (b) when the
control variable constraints are defined by linear
inequalities of the form
g(u, t) = g(t)u ≥ 0.
(81)
Then, the problem of maximizing the Hamiltonian function
becomes:
max (D + λB)u
subject to
g(t)u ≥ 0.
(82)
– p. 72/73
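At each t, the Hamiltonian maximization in (82) above is therefore a linear program in u. A minimal SciPy sketch (all numerical data below, including the box bounds on u, are illustrative placeholders, not from the text):

import numpy as np
from scipy.optimize import linprog

D = np.array([2.0])                    # placeholder objective coefficients
B = np.array([[1.0]])                  # placeholder state-equation matrix
lam = np.array([0.5])                  # current adjoint value lambda(t)
g_t = np.array([[1.0]])                # placeholder constraint matrix g(t)

c = -(D + lam @ B)                     # maximize (D + lambda B)u  <=>  minimize -(D + lambda B)u
res = linprog(c, A_ub=-g_t, b_ub=np.zeros(1),   # g(t)u >= 0 rewritten as -g(t)u <= 0
              bounds=[(-1.0, 1.0)], method="highs")
print(res.x)                           # here the maximizer is the upper bound u = 1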
REMARKS 3.6 AND 3.7
Remark 3.6: The salvage value part of the objective
function, S[x(T ), T ], makes sense in two cases:
(a) When T is free, and part of the problem is to determine
the optimal terminal time.
(b) When T is fixed and we want to maximize the salvage
value of the ending state x(T ), which in this case can be
written simply as S[x(T )].
Remark 3.7: One important model type that we did not
include in Table 3.3 is the impulse control model of
Bensoussan and Lions. In this model, an infinite control is
instantaneously exerted on a state variable in order to cause
a finite jump in its value.
– p. 73/73