A "HUM" Conjugate Gradient Algorithm ... Nonlinear Optimal Control: Terminal and ...

advertisement
A "HUM" Conjugate Gradient Algorithm for Constrained
Nonlinear Optimal Control: Terminal and Regulator Problems
by
Ivan B. Oliveira
Submitted to the Department of Mechanical Engineering
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
February 2002
© Ivan B. Oliveira, MMII. All rights reserved.
The author hereby grants to MIT permission to reproduce and distribute publicly paper
and electronic copies of this thesis document in whole or in part.
Author: Department of Mechanical Engineering, January 22, 2002

Certified by: Anthony T. Patera, Professor of Mechanical Engineering, Thesis Supervisor

Accepted by: Ain A. Sonin, Chairman, Department Committee on Graduate Students
A "HUM" Conjugate Gradient Algorithm for Constrained Nonlinear Optimal
Control: Terminal and Regulator Problems
by
Ivan B. Oliveira
Submitted to the Department of Mechanical Engineering
on January 22, 2002, in partial fulfillment of the
requirements for the degree of
Doctor of Philosophy
Abstract
Optimal control problems often arise in engineering applications when a known desired behavior is
to be imposed on a dynamical system. Typically, there is a trade-off between performance and controller use that can be quantified as a total cost functional of the state and control histories. Problems stated in such a manner are not required to follow an exact desired behavior, alleviating potential controllability issues.
We present a method for solving large deterministic optimal control problems defined by
quadratic cost functionals, nonlinear state equations, and box-type constraints on the control variables. The algorithm has been developed so that systems governed by general parabolic partial
differential equations can be solved. The problems addressed are of the regulator-terminal type,
in which deviations from specified state variable behavior are minimized over the entire trajectory as well as at the final time. The core of the algorithm consists of an extension of the Hilbert
Uniqueness Method which, we show, can be considered a statement of the dual. With the definition
of a problem-specific inner-product space, a formulation is constructed around a well-conditioned,
stable, SPD operator, thus leading to fast rates of convergence when solved by, for instance, a
conjugate gradient procedure (denoted here TRCG). Total computational time scales roughly as
twice the order of magnitude of the computational cost of a single initial-value problem.
Standard logarithmic barrier functions and Newton methods are employed to address the hard constraints on control variables of the type $u_{\min} \le u \le u_{\max}$.
We have shown that the TRCG
algorithm allows for the incorporation of these techniques, and that convergence results maintain
advantageous properties found in the standard (linear programming) literature.
The TRCG operator is shown to maintain its symmetric positive-definiteness for temporal
discretizations, a property that is crucial to the practical implementation of the proposed algorithm.
Sample calculations are presented which illustrate the performance of the method when applied to
a nonlinear heat transfer problem governed by partial differential equations.
Thesis Supervisor: Anthony T. Patera
Title: Professor of Mechanical Engineering
Acknowledgments
I would like to acknowledge the help and guidance provided by my Ph.D. advisor, Professor Tony
Patera. His insights and flexibility have allowed for my pursuit of the present topic, while his humor
has made the experience enjoyable. I am also very thankful to my committee members Professors
Robert Freund and Jean Jacques Slotine for their useful comments, interesting suggestions, and
constant encouragements.
Throughout my student career I have been fortunate to work under the guidance of other faculty
to whom I am indebted for their encouragement and support.
Among these, I am particularly
thankful to Professors Simone Hochgreb and Harsha Chelliah.
My labmates Dimitrios, Thomas, Christophe, Karen, Yuri, and Sid have made the research
process fun, and the exposure to their research topics has greatly enhanced my appreciation of
other areas of computational science. We have all been rather fortunate to have the help of Mrs. Debra Blanchard, who somehow keeps the group running smoothly and efficiently.
I must also acknowledge my best friends, whose humor and support have been priceless and
essential. Colleen, Doug, Matt, Tom W., Laura, Lauren, Kenich, Deanna, Fer, Tom C., Marcelo,
Fabio - thanks. But I'd like to give special thanks to my two closest friends: my sisters Lara and
Iara.
Above all I must acknowledge my parents. It has always been clear to me that without their
love and the many sacrifices they have made, I would not have been able to pursue my dreams. I
dedicate this thesis to my Mom and Dad as a small token of my appreciation.
Contents

1 Introduction
  1.1 Problem Statement
    1.1.1 General Optimal Control Problem Statement
    1.1.2 A Nonlinear Example
    1.1.3 Partial Differential Equations
  1.2 Optimality Conditions
    1.2.1 General Case
    1.2.2 Nonlinear Example
    1.2.3 An LQP Example
  1.3 Dual Problem Statement
    1.3.1 Fenchel Duality in Optimal Control
    1.3.2 Reformulation of the Problem

2 Existing Numerical Methods and Literature Review
  2.1 Introduction
  2.2 Parametric Optimization
  2.3 Riccati Equations
  2.4 Dynamic Programming
  2.5 Shooting Methods
  2.6 Newton-Raphson
  2.7 Sequential Quadratic Programming (SQP)
    2.7.1 SQP in Nonlinear Optimization
    2.7.2 SQP in Optimal Control
  2.8 Gradient Methods
    2.8.1 General Gradient Methods
    2.8.2 The Hilbert Uniqueness Method (HUM)
  2.9 The Proposed Method

3 TR Conjugate Gradient Algorithm - Linear-Quadratic, Unconstrained Problems
  3.1 Motivation
  3.2 Problem Statement
  3.3 Optimality Conditions for the LQP Problem
  3.4 The Hilbert Uniqueness Method for the Terminal-Regulator LQP Problem
    3.4.1 Separation of Inhomogeneous and Homogeneous Parts
    3.4.2 The R and G Operators
    3.4.3 The Terminal-Regulator ((·,·)) Inner Product
    3.4.4 Proof of SPD Property
    3.4.5 HUM from the Dual Problem
    3.4.6 Differences from Previous HUM
  3.5 General Conjugate Gradient (CG) Algorithms
    3.5.1 General Conjugate Direction Methods
    3.5.2 The General Conjugate Gradient Method
  3.6 Terminal-Regulator Conjugate Gradient Algorithm (TRCG)
    3.6.1 The Skeletal TRCG Algorithm
    3.6.2 Convergence Results for TRCG
  3.7 Stopping Criterion
  3.8 Time Discretization - Implicit Euler
    3.8.1 Discretization of Problem Statement
    3.8.2 Discretization of Solution Procedure
  3.9 Detailed TRCG Algorithm
  3.10 Numerical Properties of Method
    3.10.1 Storage Requirements
    3.10.2 Conditioning of the G Operator
  3.11 Formulation for Partial Differential Equations
    3.11.1 Problem Statement
    3.11.2 Weak Formulation and Galerkin Approximation
    3.11.3 Finite Element Approximation
    3.11.4 The Governing Equations
    3.11.5 The Cost Functional
    3.11.6 Optimality Conditions
    3.11.7 Effect on the Conditioning of G - A One-Dimensional Example
  3.12 Example Problem: Linear, Two-Dimensional Heat Transfer
    3.12.1 Problem Data
    3.12.2 General Results
    3.12.3 Computational Performance

4 Interior Point Methods - Linear, Constrained Problems
  4.1 Motivation
  4.2 Problem Statement
  4.3 Optimality Conditions for the Constrained LQP Problem
  4.4 Interior Point Methods (IPM) for Optimal Control
    4.4.1 Logarithmic Barrier Functions
    4.4.2 Proofs of Convergence
    4.4.3 State and Adjoint Equations
    4.4.4 Primal IPM
    4.4.5 Correcting Values that Lie Outside Box Constraints
    4.4.6 Initializing the Algorithm
    4.4.7 Barrier Parameter Rate of Reduction
    4.4.8 Primal-Dual IPM
  4.5 IPM-TRCG Algorithm
  4.6 Example Problem: Linear, Constrained 2D Heat Transfer
    4.6.1 General Results
    4.6.2 Numerical Performance

5 Lagging Procedure - Nonlinear, Constrained Problems
  5.1 Motivation
  5.2 Problem Statement
  5.3 Optimality Conditions for the Constrained NLQP Problem
  5.4 Linearization of Optimality Conditions
    5.4.1 Naive Implementation
    5.4.2 Proposed Algorithm - Separation of Parts
    5.4.3 Initializing the Algorithm
  5.5 Sufficient Convergence for Lagging Procedure
  5.6 Determining the Newton Step
  5.7 NL-IPM-TRCG Algorithm
  5.8 Example Problem: Nonlinear, Constrained 2D Heat Transfer
    5.8.1 Problem Statement
    5.8.2 FEM Formulation
    5.8.3 General Results
    5.8.4 Numerical Performance

6 Concluding Remarks
  6.1 Summary of Contributions
  6.2 Possible Pitfalls
  6.3 Conclusions

A Additional Time-Discretization Schemes for TRCG
  A.1 Crank-Nicolson
    A.1.1 Cost Functional Definition
    A.1.2 Optimality Conditions
    A.1.3 TRCG Components
    A.1.4 TRCG Proofs
  A.2 Second-Order Backward Difference
    A.2.1 Cost Functional Definition
    A.2.2 Optimality Conditions
    A.2.3 TRCG Components
    A.2.4 TRCG Proofs

Bibliography
List of Figures

3-1 Motivation - the goal is to control the temperature in the shaded region.
3-2 Condition number of $\mathcal{R}$ and $\mathcal{G}$ for sample one-dimensional problem.
3-3 Diagram of sample problem domain (7 cm × 3 cm).
3-4 Time history of desired regulator behavior $y_{R,\Gamma_{RS}}$ (desired temperature at reaction surface).
3-5 Mesh used for problem discretization, N = 3960 (7 cm × 3 cm).
3-6 Optimal control and $\Gamma_{RS}$ temperature histories, $J = 2.37 \times 10^7$.
3-7 Residual value of error $\|u_k - u^*\|$ for TRCG iterations.
3-8 Structure of the stiffness matrix A.
4-1 Optimal control and $\Gamma_{RS}$ temperature histories for constrained problem, $J = 8.19 \times 10^7$.
4-2 Newton and last conjugate gradient convergence of IPM-TRCG algorithm.
5-1 Diagram of sample nonlinear heat transfer problem domain (7 cm × 3 cm).
5-2 Optimal control and $\Gamma_{RS}$ temperature histories for nonlinear constrained problem, $J = 1.42 \times 10^8$.
5-3 Newton and last conjugate gradient convergence of NL-IPM-TRCG algorithm.
Chapter 1
Introduction
1.1 Problem Statement
"Optimal control" problems encompass a wide range of scientific applications in a number of
fields. Here, we are concerned with developing optimal strategies for dynamic systems governed by
parabolic differential equations (ordinary and partial). Areas that find applications for such systems include engineering sciences (mechanical, electrical, chemical, industrial), finance, economics,
and others. In fact, these systems are present in most areas of modern technology. For generality,
we consider linear and nonlinear systems.
Modern control theory has come to rely on optimal control due to limitations of the classical approach of pole placement. For example, given an Nth-order system subject to M control variables, only N poles are available for a controllable system. Non-dynamic controllers require NM parameters to be specified for feedback control, allowing for an infinite number of parameter combinations to be selected without strong theoretical guidance.
Another problem of classical methods is that there is no clear guidance in pole placement
when designing controllers. Often, an intuitive feel is required of the engineer so that desirable
speed of response is achieved. Since MIMO systems present a somewhat unpredictable coupling
not present in SISO systems for which classical methods were developed, the intuitive approach
becomes undesirable and less than robust.
Finally, problems of controllability can arise if we require the system to behave in a predefined manner. If the system is uncontrollable, there exist subspaces of the state space that cannot be affected by the control variables. This is especially true for systems governed by partial differential equations, since the desired behavior may be too arbitrary and thus impossible to attain. The optimal control statement of the problem avoids this complication by allowing slack in performance at a known cost. Stabilizable systems can thus be effectively controlled.
1.1.1 General Optimal Control Problem Statement
We are interested in systems that are large in the sense that many variables must be stored and solved for during a computed simulation. Here, it is always assumed that it is possible to either directly or indirectly control the dynamic system through a variable called the control variable, denoted $u(t) \in \mathbb{R}^M$. The state of the system is represented by the state variable $y(t) \in \mathbb{R}^N$. We are concerned with developing algorithms for deterministic control problems, and so it is assumed that a known relation exists between the state and control variables of the form

$$\dot{y} = f(y, u), \qquad \forall t \in [t_0, t_f], \tag{1.1}$$
$$y(t_0) = y_0, \tag{1.2}$$

where $\dot{y}$ represents the partial derivative of y with respect to time t.
Typically, it is possible to define a function that naturally reflects a cost of the system's performance. Seeking to minimize this "cost functional" while obeying the system's governing equations (1.1)-(1.2) is the goal of optimal control algorithms. In the problems of interest here, the cost functional J can usually be expressed as

$$J = \phi[y(t_f), t_f] + \int_{t_0}^{t_f} L[y(t), u(t), t]\, dt. \tag{1.3}$$

The symbol $\phi[y(t_f), t_f]$ represents a final-time penalty, and the integrand $L[y(t), u(t), t]$ dictates the nature of the optimizing solution (and is called the Lagrangian¹). Equation (1.3) represents the Bolza form of the problem since it contains both the integral and the final term. Equivalent forms with only the first or second term can be derived and are called Mayer type and Lagrange type, respectively.

In addition to the governing equations (1.1)-(1.2), it is generally possible that there exist inequality constraints on the state or control variables. For example, if the control variable cannot exceed a maximum safety value, such a constraint would take the form $u(t) \le u_{\max}$. A general expression can be given as

$$c_j[u(t), y(t)] \ge 0, \qquad \text{for } j = 1, \ldots, J, \tag{1.4}$$

where $c_j \in \mathbb{R}^R$. In the example above, $c(\cdot) = u_{\max} - u(t)$, $R = M$, and $J = 1$.

¹ Although, in a more general setting, the term Lagrangian usually represents a cost functional adjoined with the constraints, the terminology here is more consistent with the traditional use in optimal control literature.
Having provided the necessary definitions, we can state the problem in a concise manner as a mathematical programming problem of the form

$$\text{find } u^* = \arg\min_{u \in U} J[u] \tag{1.5}$$
$$\text{subject to} \quad \dot{y} = f(y, u), \quad y(t_0) = y_0, \quad \forall t \in [t_0, t_f], \qquad c_j(u, y) \ge 0, \quad j = 1, \ldots, J, \tag{1.6}$$

where $U = C^0(t_0, t_f; \mathbb{R}^M)$.
1.1.2 A Nonlinear Example
We can illustrate the above definitions with a more concrete example of a nonlinear system. Suppose the cost functional reflects the deviation of the final state from a desired state $y_T$. This can be expressed as a quadratic penalty $\phi[y(t_f), t_f] = \frac{1}{2}(y_f - y_T)^T W_T (y_f - y_T)$ (to simplify notation, $y_f \equiv y(t_f)$). During the time interval $[t_0, t_f]$ there may also exist a desired state history $y_R$. In addition, the cost of the control can also be expressed quadratically, so $L[y(t), u(t), t] = \frac{1}{2}[(y - y_R)^T W_R (y - y_R) + u^T W_U u]$, where it is assumed that $W_U$ is symmetric positive-definite (SPD). The total quadratic cost is then

$$J = \frac{1}{2}(y_f - y_T)^T W_T (y_f - y_T) + \frac{1}{2}\int_{t_0}^{t_f} \left[(y - y_R)^T W_R (y - y_R) + u^T W_U u\right] dt. \tag{1.7}$$

An example of a system governed by nonlinear equations is

$$\dot{y} = Ay + Bu + K(y - y^4). \tag{1.8}$$

For simplicity, we assume for the moment that K is a matrix and $y^4$ represents the vector y with each term taken to the fourth power. A more rigorous and realistic presentation of this example
will be given in section 5.8.
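As a concrete illustration, the following minimal Python sketch integrates the nonlinear system (1.8) and evaluates the quadratic cost (1.7) for a fixed control history. All data here (the matrices A, B, K, the weights, and the placeholder control) are hypothetical assumptions chosen for illustration, not values from this work:

    # Minimal sketch (hypothetical data): integrate y' = A y + B u + K (y - y^4)
    # and evaluate the quadratic cost (1.7) for a given control history u(t).
    import numpy as np
    from scipy.integrate import solve_ivp

    N, M = 4, 2
    rng = np.random.default_rng(0)
    A = -np.eye(N) + 0.1 * rng.standard_normal((N, N))   # assumed dynamics
    B = rng.standard_normal((N, M))
    K = 0.05 * np.eye(N)
    W_T, W_R, W_U = np.eye(N), 0.1 * np.eye(N), 0.01 * np.eye(M)
    y0, y_T, y_R = np.ones(N), np.zeros(N), np.zeros(N)
    t0, tf = 0.0, 1.0

    u = lambda t: np.zeros(M)                            # placeholder control history

    def f(t, y):
        return A @ y + B @ u(t) + K @ (y - y**4)

    sol = solve_ivp(f, (t0, tf), y0, dense_output=True, rtol=1e-8)

    ts = np.linspace(t0, tf, 200)
    ys = sol.sol(ts).T
    integrand = [(y - y_R) @ W_R @ (y - y_R) + u(t) @ W_U @ u(t)
                 for t, y in zip(ts, ys)]
    yf = sol.sol(tf)
    J = 0.5 * (yf - y_T) @ W_T @ (yf - y_T) + 0.5 * np.trapz(integrand, ts)  # eq. (1.7)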
The inequality constraints that are most common in control problems will be applied. Most controllers have lower and upper limits between which they may operate. For example, $u_{\min}$ may represent a positivity constraint and $u_{\max}$ a saturation level. Therefore, we get $J = 2$ ($j_1 := \min$; $j_2 := \max$), and

$$c_{\min}[u(t)] = u(t) - u_{\min} \ge 0, \tag{1.9}$$
$$c_{\max}[u(t)] = u_{\max} - u(t) \ge 0. \tag{1.10}$$

Now we simply restate the problem with (1.5)-(1.6). This type of problem is a nonlinear quadratic program. If K = 0 we get an important subcategory called linear-quadratic programs (LQP), which present several nice features that facilitate solution algorithms.
1.1.3 Partial Differential Equations
More specifically, the problems of interest here are derived from Partial Differential Equations (PDEs) of evolution. These can be generally expressed as

$$\dot{y} + \mathcal{A}(y) = \mathcal{B}u. \tag{1.11}$$

The operator $\mathcal{A}$ may be linear or nonlinear (as in the example above), and it is implied in (1.11) that appropriate boundary conditions are imposed. The operator $\mathcal{B}$ maps the "space of controls" into the state space [13]. A typical example would be $\mathcal{B} = B \in \mathbb{R}^{N \times M}$. Furthermore, through this operator, the control u can be either applied throughout the state space domain $\Omega \subset \mathbb{R}^d$ (distributed control) or on the boundary $\Gamma \subset \partial\Omega$ (boundary control). Again, initial conditions must also be specified:

$$y(t = 0) = y_0. \tag{1.12}$$
Two assumptions are made at this point. The first is that the system (1.11)-(1.12) with a control history u(t) uniquely defines a solution. The second assumption is that the system is approximately controllable. If a system is exactly controllable, there exists a u(t) which drives the system state y(T) to any given member $y_T$ of the state space at time T > 0 from any given initial condition $y_0$. Relaxing this definition, we may assume the system to be approximately controllable if we only require y(T) to belong to a 'small' neighborhood of $y_T$ [13]. Clearly, exactly controllable systems are a strict (and, in practice, relatively small) subset of approximately controllable systems. Optimal control allows us to control systems that are not exactly controllable.
1.2 Optimality Conditions

1.2.1 General Case
The modern statement of stationary conditions for problem (1.5)-(1.6) is known in the optimal control literature as the "Pontryagin minimum principle" (also known as the "Pontryagin maximum principle" due to an alternative definition of the Hamiltonian, see below). Using notation from section 1.1.1, we define the "Hamiltonian" as

$$\mathcal{H}(y, u, \lambda, \nu, t) = L(y, u, t) + \langle \lambda(t), f(y, u, t) \rangle + \sum_{j=1}^{J} \langle \nu_j(t), c_j(y, u, t) \rangle, \tag{1.13}$$

where $\lambda(t) \in \mathbb{R}^N$ and $\nu_j(t) \in \mathbb{R}^R$ are termed "adjoint" variables. Since the $\nu_j$ are adjoined to inequality constraints, we have, for $j = 1, \ldots, J$,

$$\nu_j(t) \begin{cases} = 0 & \text{if } c_j(y, u, t) > 0, \\ \le 0 & \text{if } c_j(y, u, t) = 0. \end{cases} \tag{1.14}$$

The minimum principle can now be stated as

$$\dot{y}^* = \mathcal{H}_\lambda(y^*, u^*, \lambda^*, \nu^*, t), \qquad y^*(0) = y_0, \tag{1.15}$$
$$-\dot{\lambda}^* = \mathcal{H}_y(y^*, u^*, \lambda^*, \nu^*, t), \qquad \lambda^*(t_f) = \phi_y[y^*(t_f)], \tag{1.16}$$
$$\mathcal{H}(y^*, u^*, \lambda^*, \nu^*, t) = \min_{u \in U} \mathcal{H}(y^*, u, \lambda^*, \nu^*, t). \tag{1.17}$$

The notation * indicates optimum values, and subscripts represent partial derivatives with respect to the indicated variable. These conditions are obtained by calculating the stationary conditions of the nonlinear program and manipulating the augmented cost functional

$$\bar{J} = \phi[y(t_f), t_f] + \int_{t_0}^{t_f} \mathcal{H}[y(t), u(t), t] - \langle \lambda(t), \dot{y}(t) \rangle \, dt.$$

Conditions (1.15)-(1.17) are necessary conditions for optimality of the problem. However, sufficient conditions must also hold. Here, we assume² normality (controllable problem with stable neighboring paths) and the Jacobi condition (empty set of conjugate points). Given these conditions, sufficiency is achieved if the Legendre-Clebsch condition holds:

$$\mathcal{H}_{uu}(y^*, u^*, \lambda^*, \nu^*, t) > 0. \tag{1.18}$$

² See [8] for mathematical expressions of italicized concepts.

1.2.2 Nonlinear Example
Returning to our nonlinear example, we can express the necessary conditions for the example given
in section 1.1.2 as
$$\dot{y}^* = Ay^* + Bu^* + K(y^* - y^{*4}), \qquad y^*(0) = y_0, \tag{1.19}$$
$$-\dot{\lambda}^* = A^T\lambda^* - 4K\tilde{Y}^*\lambda^* + W_R(y^* - y_R) + C_y^T\nu^*, \qquad \lambda^*(t_f) = W_T(y^* - y_T), \tag{1.20}$$
$$0 = W_U u^* + B^T\lambda^* + C_u^T\nu^*, \tag{1.21}$$

where $C_y$ and $C_u$ represent matrices with elements $(\partial c_j/\partial y)$ and $(\partial c_j/\partial u)$, respectively, and $\tilde{Y}^* \equiv \mathrm{diag}(y^{*3})$. For SPD $W_U$, the Legendre-Clebsch condition,

$$\mathcal{H}_{uu}(y^*, u^*, \lambda^*, \nu^*, t) = W_U > 0,$$

shows that any stationary point must necessarily be a local minimizer for this problem. In principle, we should be able to solve for the optimal solution $(y^*, u^*, \lambda^*, \nu^*)$ satisfying (1.19)-(1.21) without concern for any other sufficiency conditions.
In practice, however, iterative numerical methods must evaluate search directions from non-stationary points. In this case, the simple Legendre-Clebsch condition does not apply. The Hessian of the augmented cost functional is important in determining the convexity of the minimizing problem, and can be expressed as

$$\delta^2\bar{J} = \delta y_f^T\, \phi_{yy}\, \delta y_f + \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} \mathcal{H}_{yy} & \mathcal{H}_{yu} \\ \mathcal{H}_{uy} & \mathcal{H}_{uu} \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt. \tag{1.22}$$
To simplify, consider the nonlinear example of section 1.1.2, in which $W_T = (\partial^2 c/\partial y^2) = (\partial^2 c/\partial y\partial u) = (\partial^2 c/\partial u\partial y) = (\partial^2 c/\partial u^2) = 0$. Then we have

$$\delta^2\bar{J}_{NL} = \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} W_R - 12K\hat{Y}_i^2\Lambda_i & 0 \\ 0 & W_U \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt \tag{1.23}$$

for all points $(u_i, y_i, \lambda_i) \in U \times \mathbb{R}^N \times \mathbb{R}^N$ (with $\hat{Y}_i \equiv \mathrm{diag}(y_i)$ and $\Lambda_i \equiv \mathrm{diag}(\lambda_i)$). For the convergence and efficiency of numerical methods, it is very desirable for this operator to be SPD. We see that for this to be the case, we would require $W_R > 12K\hat{Y}_i^2\Lambda_i$, which is not necessarily true. This must be considered when choosing iteration points $(u_i, y_i, \lambda_i)$.
1.2.3 An LQP Example

Consider the Linear Quadratic Program (LQP) with no inequality constraints. This problem produces the following necessary conditions:

$$\dot{y}^* = Ay^* + Bu^*, \qquad y^*(0) = y_0, \tag{1.24}$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R), \qquad \lambda^*(t_f) = W_T(y^* - y_T), \tag{1.25}$$
$$0 = W_U u^* + B^T\lambda^*. \tag{1.26}$$

From the above, it is evident that this problem offers several advantages in developing a numerical solution algorithm. As a result, we use the LQP as a starting point for the development of the method proposed in this work. For now, we note that the Hessian for the regulator version of this problem,

$$\delta^2\bar{J} = \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} W_R & 0 \\ 0 & W_U \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt, \tag{1.27}$$

is positive-definite regardless of the value of the iterate $(u_i, y_i, \lambda_i)$.
1.3 Dual Problem Statement

The following is a presentation of the essence of the method that will be developed in detail (and for more general problems) in subsequent chapters.
1.3.1 Fenchel Duality in Optimal Control

Primal and Dual Forms
Fenchel duality offers a simple and elegant way of attaining the dual of (1.5)-(1.6). Henceforth we assume that the cost J and the function f(y, u) are separable. That is,

$$f(y, u) = f_1(y) + f_2(u), \qquad J(y, u) = J_1(y) - J_2(u), \tag{1.28}$$

where, for the Bolza problem,

$$J_1(y) = \phi(y_f) + \int_{t_0}^{t_f} L[y(t)]\, dt, \tag{1.29}$$
$$J_2(u) = -\int_{t_0}^{t_f} L[u(t)]\, dt. \tag{1.30}$$

Here, we make the assumption that $J_1$ is convex, $J_2$ is concave, and $f_1(y)$ and $f_2(u)$ are convex (as in the nonlinear example of section 1.1.2). For the moment, we assume that $c_j = 0$ for all j and rewrite the original statement of the problem in the exactly equivalent form

$$\begin{aligned} \text{minimize} \quad & J_1(y) - J_2(u) \\ \text{subject to} \quad & (y, u) \in Y \times U, \end{aligned} \tag{1.31}$$

where $Y = \{y \in [t_0, t_f] \times \mathbb{R}^N \mid \dot{y} = f(y, u)\ \forall t \in [t_0, t_f],\ y(0) = y_0\}$. Assuming this problem has a feasible solution, we can easily convert the above form to

$$\begin{aligned} \text{minimize} \quad & J_1(y) - J_2(u) \\ \text{subject to} \quad & \dot{y} = f(y, u), \quad y(t_0) = y_0, \quad \forall t \in [t_0, t_f], \\ & y \in [t_0, t_f] \times \mathbb{R}^N, \quad u \in U. \end{aligned} \tag{1.32}$$
Dualizing the first constraint of (1.32), we can write the dual function as

$$\begin{aligned} I(\lambda) &= \inf_{y \in [t_0,t_f]\times\mathbb{R}^N,\ u \in U} \left\{ J_1(y) - J_2(u) + \int_{t_0}^{t_f} \langle \lambda, f(y, u) - \dot{y} \rangle\, dt \right\} \\ &= \inf_{u \in U} \left\{ \int_{t_0}^{t_f} \langle \lambda, f_2(u) \rangle\, dt - J_2(u) \right\} + \inf_{y \in [t_0,t_f]\times\mathbb{R}^N} \left\{ J_1(y) - \int_{t_0}^{t_f} \langle \lambda, \dot{y} - f_1(y) \rangle\, dt \right\} \\ &= I_2(\lambda) - I_1(\lambda), \end{aligned} \tag{1.33}$$

where, applying Green's formula to $I_1$,

$$I_1(\lambda) = \sup_{y \in [t_0,t_f]\times\mathbb{R}^N} \left\{ \int_{t_0}^{t_f} \langle \lambda, \dot{y} - f_1(y) \rangle\, dt - J_1(y) \right\} = \sup_{y \in [t_0,t_f]\times\mathbb{R}^N} \left\{ -\int_{t_0}^{t_f} \left[ \langle \dot{\lambda}, y \rangle + \langle \lambda, f_1(y) \rangle \right] dt - J_1(y) + \langle \lambda_f, y_f \rangle - \langle \lambda_0, y_0 \rangle \right\} \tag{1.34}$$

and

$$I_2(\lambda) = \inf_{u \in U} \left\{ \int_{t_0}^{t_f} \langle \lambda, f_2(u) \rangle\, dt - J_2(u) \right\} \tag{1.35}$$

are the so-called conjugate convex and conjugate concave functionals, respectively [3]. As a result of the above operations, we can state the dual problem in a very simple, compact form which mirrors (1.31):

$$\begin{aligned} \text{maximize} \quad & I_2(\lambda) - I_1(\lambda) \\ \text{subject to} \quad & \lambda \in \Lambda_1 \cap \Lambda_2, \end{aligned} \tag{1.36}$$

where $\Lambda_1 = \{\lambda \in [t_0,t_f]\times\mathbb{R}^N \mid I_1(\lambda) < \infty\}$ and $\Lambda_2 = \{\lambda \in [t_0,t_f]\times\mathbb{R}^N \mid I_2(\lambda) > -\infty\}$. It should be noted that $I_1(\lambda)$ is a convex function over $\Lambda_1$ and $I_2(\lambda)$ is a concave function over $\Lambda_2$. Furthermore, the bracketed term of $I_1(\lambda)$ is concave in $\{[t_0,t_f]\times\mathbb{R}^N\}$, which leads to a unique supremum. Similarly, the bracketed term in $I_2(\lambda)$ is convex in U, which leads to a unique infimum.
Terminal LQP Example - Conjugate Functions

We can now further develop the conjugate functionals for the terminal ($W_R = 0$) LQP Bolza problem. We obtain an expression for $I_1(\lambda)$ from (1.34):

$$I_1(\lambda) = \sup_{y \in [t_0,t_f]\times\mathbb{R}^N} \left\{ -\int_{t_0}^{t_f} (\dot{\lambda} + A^T\lambda)^T y\, dt - \frac{1}{2}(y_f - y_T)^T W_T (y_f - y_T) + \lambda_f^T y_f - \lambda_0^T y_0 \right\}. \tag{1.37}$$

For $I_1(\lambda) < \infty$, the above equation requires that $\lambda \in \Lambda_1 = \{\lambda \in [t_0,t_f]\times\mathbb{R}^N \mid \dot{\lambda} + A^T\lambda = 0\}$. Then, from stationarity in y, we get the supremum

$$I_1(\lambda) = \frac{1}{2}\lambda_f^T W_T^{-1} \lambda_f + \lambda_f^T y_T - \lambda_0^T y_0, \qquad \lambda \in \Lambda_1. \tag{1.38}$$

Similarly, for $I_2(\lambda)$,

$$I_2(\lambda) = \inf_{u \in U} \left\{ \int_{t_0}^{t_f} \lambda^T B u + \frac{1}{2} u^T W_U u\, dt \right\} = -\frac{1}{2}\int_{t_0}^{t_f} \lambda^T Q \lambda\, dt, \qquad \lambda \in \Lambda_2 = [t_0,t_f]\times\mathbb{R}^N, \tag{1.39}$$

where $Q = B W_U^{-1} B^T$.
Dual Problem - λ Formulation

Having determined $I_1(\lambda)$ and $I_2(\lambda)$, the dual problem (1.36) can now be stated as

$$\begin{aligned} \text{maximize} \quad & I(\lambda) = -\frac{1}{2}\int_{t_0}^{t_f} \lambda^T Q \lambda\, dt - \frac{1}{2}\lambda_f^T W_T^{-1} \lambda_f - \lambda_f^T y_T + \lambda_0^T y_0 \\ \text{subject to} \quad & \lambda \in \Lambda = \Lambda_1 \cap \Lambda_2 = \{\lambda \in [t_0,t_f]\times\mathbb{R}^N \mid \dot{\lambda} + A^T\lambda = 0\}. \end{aligned} \tag{1.40}$$

We note here that the sets $[t_0,t_f]\times\mathbb{R}^N$ and U are convex, that $J_1(y)$ is convex over $[t_0,t_f]\times\mathbb{R}^N$, and that $J_2(u)$ is concave over U. In addition, we can state that the functionals $J_1(y,u)$ and $J_2(y,u)$ are convex and concave over all $[t_0,t_f]\times\mathbb{R}^{N+M}$, respectively. Thus, we can state [3] that there is no duality gap, and we have

$$\inf_{(y,u) \in \{[t_0,t_f]\times\mathbb{R}^N\}\times U} \{J_1(y) - J_2(u)\} = \max_{\lambda \in \Lambda} \{I_2(\lambda) - I_1(\lambda)\}, \tag{1.41}$$

or simply: $I^* = J^*$.
1.3.2 Reformulation of the Problem

The R and G Operators
We note above that the space of allowable dual variables Λ is determined once a value $\lambda_f$ is given. There is an obvious relationship between this space and the minimum principle (1.16). Therefore, in order to have a similar form to this condition, we introduce a variable q such that, given q,

$$\lambda(t_f) = W_T(q - y_T), \qquad -\dot{\lambda} = A^T\lambda, \quad \forall t \in (t_0, t_f). \tag{1.42}$$

Given such λ, we can perform the following operations:

$$\dot{y} = Ay - Q\lambda, \qquad y(0) = y_0, \quad \forall t \in (t_0, t_f). \tag{1.43}$$

Note that y is primal feasible, but since it is not true in general that $\lambda(t_f) = W_T(y(t_f) - y_T)$, it is not necessarily optimal. Separating the above operations into respective inhomogeneous and homogeneous parts,

$$-\dot{\lambda}_I = A^T\lambda_I, \qquad \lambda_I(t_f) = -W_T y_T, \tag{1.44}$$
$$\dot{y}_I = Ay_I - Q\lambda_I, \qquad y_I(t_0) = y_0, \tag{1.45}$$
$$-\dot{\lambda}_H = A^T\lambda_H, \qquad \lambda_H(t_f) = W_T q, \tag{1.46}$$
$$\dot{y}_H = Ay_H - Q\lambda_H, \qquad y_H(t_0) = 0. \tag{1.47}$$

We define the operator $\mathcal{R}$ as the following: given $q \in \mathbb{R}^N$, $\mathcal{R}q = y_H(t_f)$. For later ease of notation, we also define $\mathcal{G}q = q - \mathcal{R}q$.
It will be useful to perform the following operations on equations (1.44)-(1.47). From (1.45) and (1.46),

$$\frac{d}{dt}\left(\lambda_H^T y_I\right) = \dot{\lambda}_H^T y_I + \lambda_H^T \dot{y}_I = -\lambda_H^T Q \lambda_I. \tag{1.48}$$

Integrating over time and applying Green's formula and the final/initial conditions, we get

$$-\int_{t_0}^{t_f} \lambda_H^T Q \lambda_I\, dt = y_I(t_f)^T W_T q - \lambda_H(t_0)^T y_0. \tag{1.49}$$

Similarly, using the remaining equations,

$$-\int_{t_0}^{t_f} \lambda_I^T Q \lambda_I\, dt = -y_T^T W_T y_I(t_f) - \lambda_I(t_0)^T y_0, \tag{1.50}$$
$$-\int_{t_0}^{t_f} \lambda_H^T Q \lambda_H\, dt = q^T W_T \mathcal{R} q. \tag{1.51}$$
The ((·,·)) Inner Product

Given $v \in \mathbb{R}^N$ and $w \in \mathbb{R}^N$, we define the following inner product:

$$((v, w)) = v^T W_T w. \tag{1.52}$$

From (1.51), we can conclude that the operator $\mathcal{R}$ is symmetric, negative semi-definite in the above inner-product space. As a result, $\mathcal{G}$ is SPD in this space.
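The action of $\mathcal{G}$ is computable entirely through initial-value problems, which is the practical heart of the method. The following minimal Python sketch (with hypothetical A, Q, and $W_T$, not data from this thesis) evaluates $\mathcal{G}q = q - \mathcal{R}q$ via the homogeneous pair (1.46)-(1.47): a backward adjoint sweep followed by a forward state sweep:

    # Minimal sketch (hypothetical matrices): evaluate Gq = q - Rq via (1.46)-(1.47).
    import numpy as np
    from scipy.integrate import solve_ivp

    N = 3
    A = -np.eye(N)                       # assumed dynamics
    W_T = np.eye(N)
    Q = 0.5 * np.eye(N)                  # stands in for Q = B W_U^{-1} B^T
    t0, tf = 0.0, 1.0

    def apply_G(q):
        # backward sweep: -lam' = A^T lam, lam(tf) = W_T q, in reversed time s = tf - t
        lam_sol = solve_ivp(lambda s, lam: A.T @ lam, (0.0, tf - t0),
                            W_T @ q, dense_output=True, rtol=1e-9)
        lam = lambda t: lam_sol.sol(tf - t)
        # forward sweep: y_H' = A y_H - Q lam, y_H(t0) = 0
        yH_sol = solve_ivp(lambda t, y: A @ y - Q @ lam(t), (t0, tf),
                           np.zeros(N), rtol=1e-9)
        Rq = yH_sol.y[:, -1]             # Rq = y_H(tf)
        return q - Rq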
Dual Problem - q Formulation

Here we separate the dual function into homogeneous and inhomogeneous parts and apply the final and initial conditions. Writing $\lambda = \lambda_I + \lambda_H$ in (1.40), substituting $\lambda_{Hf} = W_T q$ and $\lambda_{If} = -W_T y_T$, and eliminating the integrals with (1.49)-(1.51), we obtain

$$I(\lambda) = -\frac{1}{2}\int_{t_0}^{t_f} \lambda^T Q \lambda\, dt - \frac{1}{2}\lambda_f^T W_T^{-1}\lambda_f - \lambda_f^T y_T + \lambda_0^T y_0 = -\frac{1}{2} q^T W_T (q - \mathcal{R}q) + q^T W_T y_I(t_f) + \frac{1}{2} y_T^T W_T \big(y_T - y_I(t_f)\big) + \frac{1}{2}\lambda_I(t_0)^T y_0. \tag{1.53}$$

Note that $y_I(t_f)$ and $\lambda_I(t_0)$ are independent of q. Thus, the last two terms can be calculated from (1.44) and (1.45) independently of the optimization problem. Defining

$$C_I(y_0, y_T) = \frac{1}{2} y_T^T W_T \big(y_T - y_I(t_f)\big) + \frac{1}{2}\lambda_I(t_0)^T y_0, \tag{1.54}$$

and using the inner-product notation, we can state the dual problem simply as

$$I(\lambda^*) = I(q^*) = \max_{q \in \mathbb{R}^N} \left\{ -\frac{1}{2}((q, \mathcal{G}q)) + ((q, y_{If})) \right\} + C_I(y_0, y_T). \tag{1.55}$$
The strength of the above formulation is the simplicity of the dual functional. There are several numerical advantages that result from this formulation:

(i) Since q belongs to all of $\mathbb{R}^N$, the problem is an unconstrained maximization;

(ii) Since $\mathcal{G}$ is SPD, the functional in brackets is concave, thus immediately proving uniqueness;

(iii) Since $\mathcal{G}$ is SPD, efficient numerical methods can be used to solve the problem, such as the conjugate gradient method (a sketch is given below);

(iv) Values of q which are not optimal result in dual feasible solutions which can be used to obtain lower bounds for $I^*$;

(v) When solving for $\mathcal{G}q$, the problem offers $y_H(q, t)$ as a by-product of the computation. This can be used in conjunction with the inhomogeneous part (solved for only once) to obtain a primal feasible solution $y(t) = y_I(t) + y_H(t)$. This value can then be used to calculate an upper bound on the cost functional $J^*$ (note that $u(t) = -W_U^{-1}B^T\lambda(t)$);

(vi) Since there exists no duality gap in problem (1.41), the primal and dual variables can be used to calculate bounds on the true cost:

$$I(\lambda(q)) \le I^* = J^* \le J(y(q)). \tag{1.56}$$

We define the "bound gap" as $\Delta C = J(y(q)) - I(\lambda(q))$.
Remark 1 There is a simpler way of formulating the problem of finding $q^*$. We note that equations (1.42)-(1.43) satisfy the minimum principle (1.24)-(1.26) if

$$q = q^* = y^*(t_f).$$

Since $\mathcal{R}q = y_H(t_f)$, we have

$$q^* = y^*(t_f) = y_I(t_f) + y_H(t_f) = y_I(t_f) + \mathcal{R}q^*,$$

or simply

$$\mathcal{G}q^* = y_I(t_f). \tag{1.57}$$

The solution $q^*$ of equation (1.57) solves problem (1.55) without requiring the steps of finding the dual as a function of q. However, this simplification of the formulation has several theoretical and practical drawbacks:

(i) Without the dual formulation, the origin of this equation is unclear;

(ii) There is no indication of the origin of the inner product ((·,·)), which is essential to show the SPD property of the operator $\mathcal{G}$;

(iii) Without knowledge of ((·,·)), trying to solve (1.57) by minimizing a quadratic function in Euclidean space will not guarantee SPD $\mathcal{G}$, thus hindering the numerical methods used;

(iv) Equation (1.57) gives no indication of $C_I(y_0, y_T)$, so that even if it is acknowledged that it is a version of the dual problem, it cannot be used to determine a lower bound for the cost $I^*$. □
Chapter 2
Existing Numerical Methods and Literature Review
2.1 Introduction
Optimal control problems are typically large and difficult to solve due to the complex relationships in
the stationary conditions. Numerical methods must deal with the fact that the number of operations
required to solve a problem increases much faster than the problem's dimension. Efficient methods
are those which exploit certain aspects of a problem to reduce the amount of computational time
and storage.
A variety of methods exist, since each has been developed to take advantage of aspects of
particular problems. However, successful methods often allow for modifications that expand their
applicability. Here we present several of the most popular methods for solving optimal control
problems. This survey will be useful, in later sections, for comparisons between our approach and
currently used methods.
All of the methods below are discussed in relation to the Bolza problem of section 1.1.1. Thus,
it can be said that the purpose of each method is to solve for a control value $u^*$ (and consequently $\lambda^*$ and $y^*$) which satisfies the minimum principle (1.15)-(1.17).
With smoothness assumptions on $\mathcal{H}$, we restate these here as follows:
$$\dot{y}^* = f(y^*, u^*), \qquad y^*(t_0) = y_0 \text{ given}; \tag{2.1}$$
$$-\dot{\lambda}^* = \mathcal{H}_y(y^*, u^*, \lambda^*), \qquad \lambda^*(t_f) = \frac{\partial\phi[y^*(t_f)]}{\partial y}; \tag{2.2}$$
$$0 = \frac{\partial\mathcal{H}(y^*, u^*, \lambda^*)}{\partial u}. \tag{2.3}$$
Problems of this type are usually referred to as two-point boundary value problems [27]. The
main difficulty that arises in their solution is that differential equations must be solved in such a way
that both initial and final conditions are satisfied. This means that a simple forward or backward
time integration of equations (2.1) or (2.2) cannot be done before the appropriate relation between
them ($u^*$) is known.
The methods presented here are iterative in the sense that particular variables are updated as the resulting cost $J[(y, u)(\lambda)]$ approaches the minimum $J^*$. Historically, there have been three approaches to solving the above stationary conditions [24]:
(i) solution of the boundary value problem presented by equations (2.1) and (2.2) with a local
Hamiltonian optimization, equation (2.3), at each time step;
(ii) solution of a completely discretized problem, so that it resembles a finite-dimensional nonlinear program;
(iii) solution of a finite parameterization of the control history, where the state and adjoint variables are evaluated by integration of (2.1) and (2.2), and the control variable is adjusted from
sensitivity equations.
Shooting methods and Sequential Quadratic Programming were used to solve the problem through approach (i). These methods are relatively straightforward in the implementation stage, but present some practical problems. Riccati equations and Newton-Raphson methods were employed for approach (ii). Though technically very accurate and fast, storage can easily become an issue with these methods. Approach (iii) was employed by the use of parametric optimization, dynamic programming, and gradient methods. Some of these methods have met with considerable success due to their flexibility, robustness, and ease of implementation.
2.2 Parametric Optimization

Control Parameterization
Among the first optimal control solution methods developed, parametric optimization [27] has fallen out of favor for large, complex problems. It is included here for historical completeness. These methods are approximate since they will not necessarily converge to exact optimal values regardless
of the number of iterations. In essence, the method restricts the allowable control histories to a
small subspace of U, thus considerably simplifying the problem. In particular, we make u a simple
function of time.
By restricting u to be dependent on a few parameters, the dimension of the problem can decrease
significantly. Here we take the simplest possible example:
$$u(k, t) = k. \tag{2.4}$$

This control history is constant in time, and so we have a single vector $k \in \mathbb{R}^M$ as a "parameter" to be used in the optimization of $J[u(k)]$. Considering a terminal LQP problem from section 1.1.2, we have for the state equations

$$\dot{y}(t) = Ay(t) + Bk, \qquad y(t_0) = y_0.$$
Such a simple system can be solved analytically when A is square and nonsingular:

$$y(t_f) = e^{At_f}\left(y_0 + A^{-1}Bk\right) - A^{-1}Bk,$$

where $e^{At_f} = \sum_{k=0}^{\infty} (At_f)^k / k!$ is the matrix exponential of $At_f$. The cost can then readily be expressed as

$$J(k) = \frac{1}{2}\left(e^{At_f}(y_0 + A^{-1}Bk) - A^{-1}Bk - y_T\right)^T W_T \left(e^{At_f}(y_0 + A^{-1}Bk) - A^{-1}Bk - y_T\right) + \frac{1}{2}\int_{t_0}^{t_f} k^T W_U k\, dt,$$

which is only a function of k. Setting $\partial J/\partial k = 0$ and writing $M \equiv (e^{At_f} - I)A^{-1}B$, we get

$$k^* = -\left[M^T W_T M + (t_f - t_0) W_U\right]^{-1} M^T W_T \left(e^{At_f} y_0 - y_T\right), \tag{2.5}$$

which is the unique solution, as can easily be confirmed by noting that $\partial^2 J/\partial k^2 > 0$.
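A minimal numerical sketch of this constant-control parameterization follows (hypothetical data; the closed form implements the reconstruction of (2.5) above, with $M \equiv (e^{At_f} - I)A^{-1}B$):

    # Minimal sketch (hypothetical data): constant-in-time control u(k,t) = k.
    import numpy as np
    from scipy.linalg import expm, solve

    N, m = 3, 2
    A = -np.eye(N)                        # assumed stable, nonsingular dynamics
    B = np.ones((N, m))
    W_T, W_U = np.eye(N), 0.1 * np.eye(m)
    y0, y_T = np.ones(N), np.zeros(N)
    t0, tf = 0.0, 1.0

    E = expm(A * tf)                      # matrix exponential e^{A tf}
    Mmat = (E - np.eye(N)) @ solve(A, B)  # M = (e^{A tf} - I) A^{-1} B
    lhs = Mmat.T @ W_T @ Mmat + (tf - t0) * W_U
    rhs = -Mmat.T @ W_T @ (E @ y0 - y_T)
    k_star = solve(lhs, rhs)              # eq. (2.5)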
This results in $u(t) = k^*$, which is only a constant-in-time approximation of the actual control
history u*(t). It happens to be the best possible approximation for constant functions, but it is
nonetheless unacceptable for practical applications. Rather than using the simple form of (2.4),
it may therefore seem more appropriate to use higher-order terms in the approximation of the control history. Alternative candidate functions may include ramp functions,

$$u(k, t) = k_1 + k_2 t,$$

truncated power series,

$$u(k, t) = k_1 + k_2 t + k_3 t^2 + \cdots + k_{n+1} t^n,$$

truncated Fourier series,

$$u(k, t) = \sum_{i=1}^{n} \left[ k_{1i} \sin\left(\frac{i\pi t}{t_f - t_0}\right) + k_{2i} \cos\left(\frac{i\pi t}{t_f - t_0}\right) \right],$$
or other orthogonal functions. One possibility is to increase n for the truncated functions until a
satisfactory cost is obtained. However, it must be stressed that the resulting cost J(k*) need not
be arbitrarily close to $J^*$, even for large n. Furthermore, we do not know a priori which functions give the best results for a given problem. This is the first critical problem of the method. With the
exception of a handful of problems, the method is not exact and requires too much intuition to be
considered a robust numerical method.
Even for the unrealistically simple example above, expression (2.5) hints at another problem
with this method. Although the idea is very simple and elegant before implementation, the practical
solution becomes very complicated as the problem grows in complexity or as better approximations
are used. For cases which have no analytical solutions we must resort to numerical calculations of J
and the gradient &J/&k, which requires integration of (2.1). As n increases, the overall calculations
can be very expensive. For good approximations of u, this calculation approaches the complexity
of more accurate (exact) methods, with the severe disadvantage of being non-exact. Therefore, this
method, although interesting for simple problems, is not used today for large, complex problems.
Penalty Methods
It may seem inappropriate that the above method requires the integration of (2.1) to be done exactly when u is only a (probably) crude approximation of $u^*$. If we also approximate y, we need not perform such an operation. Consider the augmented cost

$$J_\epsilon(y, u) = J(y, u) + \frac{1}{\epsilon}\int_{t_0}^{t_f} \left\|\dot{y} - f(y, u)\right\|^2 dt,$$

with $\epsilon > 0$, and suppose we define

$$y(t) \approx y(k_y, t) \tag{2.6}$$

and

$$u(t) \approx u(k_u, t). \tag{2.7}$$
Now both the state and control variables are functions of time and the parameters $k_y$ and $k_u$. Since y is defined as a known function of time, $\dot{y}$ can be calculated (possibly analytically) without requiring the expensive integration of (2.1). If J is SPD, then so is $J_\epsilon$, and we are faced with the unconstrained minimization of $J_\epsilon(k_y, k_u)$. This problem may be solved by, for example, Newton-Raphson iterations on $k_y$ and $k_u$ with incremental decrease in ε.

Though this method does not require the solution of ODEs, it is still plagued by potentially difficult evaluations of complicated functions. It is also non-exact, and $J_\epsilon^*$ need not approach $J^*$ even for small ε if (2.1) and U are not sufficiently represented by (2.6) and (2.7). Again, there is no a priori guarantee that we may find the necessary functions.
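A minimal sketch of the penalty idea follows, under an assumed constant-plus-ramp parameterization of y and a constant u (all data hypothetical); the ODE and initial-condition residuals are weighted by $1/\epsilon$ and the total is handed to a general-purpose minimizer:

    # Minimal sketch (hypothetical parameterization): penalty formulation J_eps.
    import numpy as np
    from scipy.optimize import minimize

    N, m = 2, 1
    A, B = -np.eye(N), np.ones((N, m))
    W_T = np.eye(N)
    y0, y_T = np.ones(N), np.zeros(N)
    t0, tf, eps = 0.0, 1.0, 1e-2
    ts = np.linspace(t0, tf, 50)

    def J_eps(k):
        ky0, ky1, ku = k[:N], k[N:2*N], k[2*N:]       # y ~ ky0 + ky1 t, u ~ ku
        y = ky0[None, :] + ts[:, None] * ky1[None, :]
        ydot = np.tile(ky1, (len(ts), 1))
        resid = ydot - (y @ A.T + (B @ ku)[None, :])  # y' - f(y, u)
        penalty = (np.trapz(np.sum(resid**2, axis=1), ts)
                   + np.sum((y[0] - y0)**2)) / eps    # ODE + initial-condition terms
        yf = y[-1]
        return 0.5 * (yf - y_T) @ W_T @ (yf - y_T) + penalty

    k0 = np.concatenate([y0, np.zeros(N), np.zeros(m)])
    res = minimize(J_eps, k0, method='BFGS')          # decrease eps and repeat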
2.3 Riccati Equations
The remaining methods presented in this section are exact. These have gained favor in the modern
solution of optimal control problems due to the fact that, regardless of other difficulties, they
have the potential of producing exact answers to the posed problems (or at least arbitrarily close
approximations).
Riccati equations were originally developed in order to theoretically deal with calculus of variations problems [27].
They are easily derived from LQP problems of section 1.2.3. Consider the
example whose necessary conditions are expressed as
$$\dot{y}^* = Ay^* - Q\lambda^*, \qquad y^*(0) = y_0, \tag{2.8}$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R y^*, \qquad \lambda^*(t_f) = W_T y^*(t_f). \tag{2.9}$$
Here, for simplicity, we have $y_T = y_R(t) = 0$. The fact that we are dealing with an LQP means that these are also sufficient conditions for optimality. Since y and λ are adjoint, we can state the following linear relationship:

$$\lambda^*(t) = S(t)\, y^*(t),$$

where S(t) is a time-dependent Riccati matrix. Naturally, we have

$$\dot{\lambda}^*(t) = \dot{S}(t)\, y^*(t) + S(t)\, \dot{y}^*(t),$$
and, incorporating the above into (2.8)-(2.9),

$$-\dot{S}(t)y^*(t) - S(t)Ay^*(t) + S(t)QS(t)y^*(t) = A^T S(t)y^*(t) + W_R y^*(t), \qquad S(t_f)y(t_f) = W_T y(t_f).$$

Canceling $y^*(t)$, we obtain

$$\dot{S}(t) = -S(t)A - A^T S(t) + S(t)QS(t) - W_R, \qquad S(t_f) = W_T. \tag{2.10}$$
Once S(t) has been determined from (2.10), the state and control variables can be obtained:

$$\dot{y}^*(t) = \big(A - QS(t)\big)y^*(t), \qquad y^*(0) = y_0, \tag{2.11}$$
$$u^*(t) = -W_U^{-1} B^T S(t)\, y^*(t) = -C(t)\, y^*(t). \tag{2.12}$$
Equation (2.10) is referred to as the Riccati matrix equation for this LQP. It is a nonlinear matrix ODE which can be solved by backward integration.
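For illustration, a minimal Python sketch of this backward integration follows (hypothetical data), reversing time so that a standard forward ODE solver can be used, followed by the feedback gain of (2.12):

    # Minimal sketch (hypothetical data): backward integration of the Riccati
    # equation (2.10), then the closed-loop gain C(t) of (2.12).
    import numpy as np
    from scipy.integrate import solve_ivp

    N, m = 2, 1
    A = np.array([[0.0, 1.0], [-1.0, -0.5]])
    B = np.array([[0.0], [1.0]])
    W_R, W_U, W_T = np.eye(N), 0.1 * np.eye(m), np.eye(N)
    Q = B @ np.linalg.solve(W_U, B.T)                # Q = B W_U^{-1} B^T
    t0, tf = 0.0, 2.0

    def rhs(s, Svec):
        # reversed time s = tf - t, so dS/ds = -dS/dt
        S = Svec.reshape(N, N)
        dSdt = -S @ A - A.T @ S + S @ Q @ S - W_R    # eq. (2.10)
        return (-dSdt).ravel()

    sol = solve_ivp(rhs, (0.0, tf - t0), W_T.ravel(), dense_output=True, rtol=1e-9)
    S = lambda t: sol.sol(tf - t).reshape(N, N)
    C = lambda t: np.linalg.solve(W_U, B.T @ S(t))   # u*(t) = -C(t) y*(t)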
It is possible to implement the method above relying solely on linear algebra operations. Though
this might seem advantageous at first, an inspection of the numerical complexity of equation (2.10)
reveals its weakness. As stated previously, optimal control problems tend to be large. If y(t) has
N elements (for each time step), then S(t) is an N × N dense matrix. For certain problems to
which positivity assumptions apply, the dimension of the solution procedure can be reduced, thus
allowing for more efficient methods [26].
It is also a fact that good numerical time discretizations rely on a large number L of time
steps. Since storage for all of S(t) is $O(N^2L)$, this solution method quickly becomes prohibitive as the size of the problem increases. Compared to modern gradient methods (see below), which require $O(NL)$, Riccati equations suffer from a significant disadvantage. A moderate problem, for example, may require $N = O(10^3)$ and $L = O(10^2)$, demanding 1000 times more storage for Riccati equations than for gradient methods, or a total of $O(10^8)$ versus $O(10^5)$, respectively.
Other than for some relatively small problems [17], Riccati equations today are mainly used as a theoretical tool. In particular, since equation (2.12) provides an explicit expression of a closed-loop control law, Riccati equations are typically the point of departure for the study of stability properties of feedback optimal regulator systems [25, 26, 1].
2.4 Dynamic Programming
It is possible to express the minimum principle in an alternative form. Suppose that rather than considering the cost function (1.3), we consider the value function

$$V(t_i) = \phi[y^*(t_f)] + \int_{t_i}^{t_f} L[y^*(t), u^*(t)]\, dt = \min_{u} \left\{ \phi[y(t_f)] + \int_{t_i}^{t_f} L[y(t), u(t)]\, dt \right\}. \tag{2.13}$$
Clearly, minimizing $V(t_0)$ subject to (1.2) is equivalent to the original problem. After some manipulation, the optimality condition takes the form

$$-\frac{\partial V}{\partial t}[y^*(t)] = \min_{u} \mathcal{H}\big(y^*(t), u(t)\big), \tag{2.14}$$
$$V[y^*(t_f)] = \phi[y^*(t_f)], \tag{2.15}$$

where

$$\mathcal{H} = L[y^*(t), u(t)] + \left\langle \frac{\partial V}{\partial y}[y^*(t)],\ f[y^*(t), u(t)] \right\rangle.$$
The partial differential equation (2.14) is known as the Hamilton-Jacobi-Bellman (HJB) equation [27]. Dynamic programming techniques in optimal control use this equation rather than (2.1)-(2.3).
Backward integration of equation (2.14) may be carried out with $\dot{y}^* = f(y^*, u^*)$ to yield $V[y^*(t)]$ for arbitrary time, $t \in [t_0, t_f]$. Further integration to $t' \in [t_0, t]$ has no effect on the values of $y^*(t)$ to $y^*(t_f)$; a fact known as Bellman's Principle of Optimality.

Any point on the terminal hypersurface defined by (2.15) results in a unique initial state-time solution $(y(t_0), t_0)$. If a portion of this hypersurface is defined near the expected value $y^*(t_f)$, values of $(y(t_0), t_0)$ can be iteratively calculated until one obtains $(y^*(t_0), t_0)$. Though it may be realistic to map the hyperspace created by HJB equations for problems with two or three variables, storage becomes prohibitive for large problems [8]. Thus an alternative is to choose a few starting points near the expected final solution and interpolate values of $y(t_f)$ so that $y(t_0) = y_0$ is achieved.
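For a scalar toy problem, a naive explicit backward sweep of (2.14)-(2.15) on state and control grids can be sketched as follows (all data hypothetical; such unstabilized sweeps may need upwinding and fine grids in practice, and the storage grows with the state dimension exactly as discussed here):

    # Minimal sketch (scalar toy problem): backward-in-time HJB sweep.
    import numpy as np

    ys = np.linspace(-2, 2, 201)            # state grid
    us = np.linspace(-1, 1, 41)             # control grid
    dt, tf = 0.01, 1.0
    dy = ys[1] - ys[0]

    f = lambda y, u: -y + u                 # dynamics y' = f(y, u)
    L = lambda y, u: y**2 + 0.1 * u**2      # running cost
    phi = lambda y: y**2                    # terminal penalty

    V = phi(ys)                             # V at t = tf, eq. (2.15)
    for _ in range(int(tf / dt)):
        dVdy = np.gradient(V, dy)
        # Hamiltonian on the (y, u) grid; minimize over u at each y
        H = L(ys[:, None], us[None, :]) + dVdy[:, None] * f(ys[:, None], us[None, :])
        V = V + dt * H.min(axis=1)          # -dV/dt = min_u H, stepped backward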
In any case, dynamic programming algorithms typically involve some type of mapping of the related hyperspace, which is always storage-intensive. However, the HJB equations serve an important modern theoretical purpose. If one defines

$$\lambda^T(t) = \frac{\partial V}{\partial y}(t),$$

one immediately sees the relationship between the HJB equation and the Hamiltonian of (1.13). Alternatively, if we retain the notation and consider the problem posed by (2.8)-(2.9), we may rewrite HJB as
$$-\frac{\partial V^*}{\partial t} = \left(\frac{\partial V^*}{\partial y}\right)^T A y + \frac{1}{2} y^T W_R y - \frac{1}{2}\left(\frac{\partial V^*}{\partial y}\right)^T Q \left(\frac{\partial V^*}{\partial y}\right), \tag{2.16}$$
$$V[y^*(t_f)] = \frac{1}{2}\, y^T(t_f)\, W_T\, y(t_f). \tag{2.17}$$
This first-order, nonlinear partial differential equation has a product solution of the form

$$V^* = \frac{1}{2}\, y^T S(t)\, y,$$

which, upon substitution into (2.16)-(2.17), gives

$$0 = \frac{1}{2}\, y^T \left( \dot{S} + SA + A^T S - SQS + W_R \right) y.$$

Since the above must hold for all y, it implies the Riccati equation of section 2.3.
Though the solution of (2.14)-(2.15) is typically more difficult than that of the original stationary conditions, exceptions exist for the solution of LQP controllers. In addition, dynamic programming
techniques can be effectively used in formulating stochastic optimal control of nonlinear systems
[27], and are preferred for the synthesis of feedback problems [16].
2.5 Shooting Methods
Among the first used in optimal control [7, 8], shooting methods aim at target values from initial
guesses. For example, one may guess an initial value of the adjoint variable $\lambda(t_0)$ and determine $u(t_0)$ from equation (2.3). In fact, it is possible to integrate the entire system (2.1)-(2.3) in this fashion as long as all the integration is done forward in time (including equation (2.2)). Of course, it is not necessarily true that the final condition $\lambda(t_f) = \partial\phi[y(t_f)]/\partial y$ will be satisfied, since $\lambda(t_0)$ may not be optimal. By changing the initial guess for the adjoint variable, iterations can be carried out until the final condition is satisfied.
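A minimal sketch of simple shooting for a small LQP follows (hypothetical data; the control has been eliminated via $u = -W_U^{-1}B^T\lambda$, so only $Q = BW_U^{-1}B^T$ appears). A root finder adjusts the guess $\lambda(t_0)$ until the terminal condition holds; as discussed next, the forward integration of the adjoint is precisely where instability can arise:

    # Minimal sketch (hypothetical LQP): simple shooting on lam(t0).
    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import fsolve

    N = 2
    A = np.array([[0.0, 1.0], [-1.0, -0.5]])
    Q = 0.5 * np.eye(N)                       # Q = B W_U^{-1} B^T
    W_R, W_T = np.eye(N), np.eye(N)
    y0 = np.ones(N)
    t0, tf = 0.0, 1.0

    def forward(lam0):
        def rhs(t, z):
            y, lam = z[:N], z[N:]
            return np.concatenate([A @ y - Q @ lam,           # state equation
                                   -(A.T @ lam + W_R @ y)])   # adjoint, run forward
        return solve_ivp(rhs, (t0, tf), np.concatenate([y0, lam0]), rtol=1e-9).y[:, -1]

    def terminal_mismatch(lam0):
        z = forward(lam0)
        return z[N:] - W_T @ z[:N]            # lam(tf) - W_T y(tf)

    lam0_star = fsolve(terminal_mismatch, np.zeros(N))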
There are immediate problems with this technique, however. We have implicitly assumed that equation (2.1) is stable in forward integration. Such an assumption usually means that equation (2.2) is not. Therefore, forward integration of (2.2) will probably result in unworkable values of the adjoint variable history for a large portion of possible $\lambda(t_0)$.
The same approach can be used by guessing y and A at the final time and integrating backwards
in time.
This alternative would still suffer from the same problem as above, since stable state
equations tend to be unstable when integrated backward in time.
In essence, the difficulty in these methods involves getting started [8]: finding an admissible value $\lambda(t_0)$ that will produce well-conditioned transition matrices.
Ill-conditioned transition matrices
cannot be inverted in the crucial step of improving the guess, greatly compromising the accuracy of
numerical methods used. Therefore, these methods are best used for finding neighboring solutions
[8, 27] to an optimal solution that is usually calculated via more stable methods.
Though some alternative formulations have been proposed (such as quasilinearization [20]),
multiple shooting techniques have been the most successful variants of this numerical technique
[24]. These are described below in section 2.7.2.
2.6 Newton-Raphson
Widely used in many types of problems, Newton-Raphson methods have also been successfully
used to solve optimal control problems. The quadratic convergence property of this method makes
it very attractive for fast-converging solution algorithms; thus they are often used today in the
optimal control context.
These methods are based on a type of linearization of the state-adjoint equations. As we have
observed before, by way of equation (2.3), it is possible to write u(t) as a function of $\lambda(t)$. Thus, in general form, the stationary conditions can be written as

$$\dot{y}^* = f(y^*, \lambda^*), \qquad y^*(t_0) = y_0, \tag{2.18}$$
$$-\dot{\lambda}^* = \mathcal{H}_y(y^*, \lambda^*), \qquad \lambda^*(t_f) = \phi_y[y^*(t_f)]. \tag{2.19}$$
Guessing non-optimal values of $y_i(t)$ and $\lambda_i(t)$ will not, in general, satisfy the above equations. However, if these values are close enough to $y^*(t)$ and $\lambda^*(t)$, the following approximation can be made:

$$\Delta\dot{y} \approx f_y(y_i, \lambda_i)\,\Delta y + f_\lambda(y_i, \lambda_i)\,\Delta\lambda + \bar{y}(t), \qquad \Delta y(t_0) = y_0 - y_i(t_0), \tag{2.20}$$
$$-\Delta\dot{\lambda} \approx \mathcal{H}_{yy}(y_i, \lambda_i)\,\Delta y + \mathcal{H}_{y\lambda}(y_i, \lambda_i)\,\Delta\lambda + \bar{\lambda}(t), \qquad \Delta\lambda(t_f) = \phi_y[y_i(t_f)] - \lambda_i(t_f), \tag{2.21}$$

where $\bar{y}(t) = f(y_i, \lambda_i) - \dot{y}_i$ and $\bar{\lambda}(t) = \mathcal{H}_y(y_i, \lambda_i) + \dot{\lambda}_i$ are the residuals of (2.18)-(2.19) at the current iterate. The method can best be described for a given time discretization. Suppose we use an implicit Euler scheme, with time-step index ℓ, time step Δt, and total number of time steps $L = t_f/\Delta t$. Then we can write the above equations as

$$\frac{\Delta y^\ell - \Delta y^{\ell-1}}{\Delta t} = f_y\,\Delta y^\ell + f_\lambda\,\Delta\lambda^\ell + \bar{y}^\ell, \qquad -\frac{\Delta\lambda^{\ell+1} - \Delta\lambda^\ell}{\Delta t} = \mathcal{H}_{yy}\,\Delta y^\ell + \mathcal{H}_{y\lambda}\,\Delta\lambda^\ell + \bar{\lambda}^\ell, \tag{2.22}$$

with the boundary conditions

$$\Delta y^0 = y_0 - y_i(t_0), \qquad \Delta\lambda^L = \phi_y(y_i^L) - \lambda_i^L \qquad (\ell = 0, L). \tag{2.23}$$

If we create a vector $\Delta x = (\Delta y^1, \Delta y^2, \ldots, \Delta y^L, \Delta\lambda^0, \Delta\lambda^1, \ldots, \Delta\lambda^{L-1})$ of size 2NL, and define the appropriate right-hand side $\bar{b}$, equation (2.22) can be expressed as

$$J_c\,\Delta x = \bar{b}, \tag{2.24}$$

where $J_c$ is the Jacobian block matrix

$$J_c = \begin{bmatrix} E & K \\ N & P \end{bmatrix}.$$

Matrices E, K, N, and P are (for implicit Euler) block diagonal, with blocks built from the derivatives of (2.20)-(2.21) evaluated at $(y_i^\ell, \lambda_i^\ell)$:

$$E_\ell = \frac{I}{\Delta t} - f_y, \qquad K_\ell = -f_\lambda, \qquad N_\ell = \mathcal{H}_{yy}, \qquad P_\ell = \mathcal{H}_{y\lambda} - \frac{I}{\Delta t}. \tag{2.25}$$

Once the step direction Δx has been solved for via equations (2.23)-(2.24), we update the current solution

$$x_{i+1}^\ell = x_i^\ell + \alpha_i\,\Delta x^\ell, \qquad \forall \ell = 0, \ldots, L, \tag{2.26}$$

where $\alpha_i \ge 0$ is the step size.
Although the quadratic convergence of the method is attractive, there are severe limitations
when solving large problems. Equation (2.24) involves the inversion of the Jacobian matrix, which is
of size $O(N^2L^2)$. Even for the most simplified problems, this number is prohibitive. Thanks to the
diagonal nature of the Jacobian that results in most time discretizations, storage can be achieved
with O(NL), but the inversion process will nevertheless be computationally intensive. Principally,
however, the method is only guaranteed to achieve quadratic convergence near the optimal solution.
Starting from far away guesses will not guarantee convergence, and even if convergence does occur
the method does not fundamentally distinguish between maxima and minima.
There are cases in which the size of the problem has been reduced, through proper orthogonal
decomposition [23] for example, in which these methods have been successfully employed. They
also tend to be used for ill-conditioned smaller problems, for which storage may not be a problem
but fast convergence is essential.
The method is, nonetheless, powerful in the sense that it presents a linear approximation to the
stationary conditions (this is in fact a quadratic approximation to the cost functional). Therefore,
a wide class of non-linear optimal control problems can be approached with this method very
efficiently. In fact, one can apply gradient methods in conjunction with the linearization of the
problem to calculate step directions near an optimal, non-linear solution (see section 4.4.4).
Remark 2 If we consider problems in which $\mathcal{H}_{y\lambda} = f_y^T$, such as the LQP example, we have $P = -E^T$. In fact, for the LQP problem, we have

$$E_\ell = \frac{I}{\Delta t} - A, \qquad K_\ell = Q, \qquad N_\ell = W_R, \qquad P_\ell = A^T - \frac{I}{\Delta t}. \tag{2.27}$$ □
Remark 3 Finding $\alpha_i$ for equation (2.26) can be done in several well-established ways (a sketch of rule (iv) is given below). They include:

(i) constant step size: $\alpha_i$ = constant;

(ii) minimization rule: $\alpha_i = \arg\min_{\alpha \ge 0} J(x_i + \alpha\,\Delta x_i)$, by line search techniques;

(iii) diminishing step size: $\alpha_i \to 0$ as $i \to \infty$, but $\sum_i \alpha_i = \infty$;

(iv) Armijo rule: $\alpha_i = \beta^{m_i} s$, $\beta \in (0, 1)$, with $m_i$ the first nonnegative integer m such that

$$J(x_i) - J(x_i + \beta^m s\,\Delta x_i) \ge -\sigma \beta^m s\,\nabla J(x_i)^T \Delta x_i,$$

where $\sigma \in (0, 1)$. Typically, good values of β and σ are 1/2 and $[10^{-5}, 10^{-1}]$, respectively.

Of the methods above, the most efficient tends to be the Armijo rule, since it guarantees sufficient reduction of J when near the solution. □
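A minimal sketch of rule (iv), for generic J and gradJ callables (the default constants are assumptions in the spirit of the values quoted above):

    # Minimal sketch: Armijo step-size rule of Remark 3(iv).
    import numpy as np

    def armijo_step(J, gradJ, x, dx, s=1.0, beta=0.5, sigma=1e-4, max_m=50):
        slope = gradJ(x) @ dx                 # directional derivative (< 0 for descent)
        Jx = J(x)
        for m in range(max_m):
            a = (beta ** m) * s
            if Jx - J(x + a * dx) >= -sigma * a * slope:
                return a
        return (beta ** max_m) * s            # fallback: very small step

    # usage: alpha = armijo_step(J, gradJ, x_i, dx_i); x_next = x_i + alpha * dx_i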
2.7 Sequential Quadratic Programming (SQP)
Sequential Quadratic Programming (SQP) methods are very widely used in modern optimal control,
serving as the main competitors of gradient methods. Here we first describe the origin of the method
and then the application to optimal control. Though typically the method of choice for small- and
medium-sized problems, difficulties arise when larger problems are considered [21].
2.7.1 SQP in Nonlinear Optimization
Sequential Quadratic Programming is a class of so-called Lagrange Multiplier Methods, in which
Lagrange multiplier estimates are made and used in the solution procedure. In this section, we
introduce SQP for general, nonlinear programming (NLP) problems.
SQP is based on the idea that one wishes to solve the following problem,

$$\min_{x \in \mathbb{R}^N} f(x) \quad \text{s.t.} \quad g_j(x) \le 0, \quad j = 1, \ldots, r, \tag{2.28}$$

by addressing a "penalty" version of the form $f(x) + cP(x)$, where $P(x) = \max_j\{g_j(x), 0\}$ and c is a "sufficiently large" number. Extensions to equality-constrained problems are easily made and are given below. It can be shown [3] that if $c > \sum_j \mu_j^*$, where the $\mu_j^*$ are the Lagrange multipliers corresponding to the constraints, then the strict unconstrained local minimum of $f + cP$ is also the solution of (2.28).

Suppose we define $\mathcal{J}(x) = \{j \mid g_j(x) = P(x),\ j = 1, \ldots, r\}$ and

$$\theta_c(x, d) = \max\left\{\nabla f(x)^T d + c\,\nabla g_j(x)^T d \mid j \in \mathcal{J}(x)\right\}.$$

Then, for small $\|d\|$, a quadratic approximation of $f + cP$ around x can be written as

$$f(x) + cP(x) + \theta_c(x, d) + \frac{1}{2}\,d^T H d, \tag{2.29}$$

where H is a positive-definite symmetric matrix (to be addressed later). The descent direction d can be found by solving the problem

$$\begin{aligned} \min_{d,\,\xi} \quad & \nabla f(x)^T d + \frac{1}{2}\,d^T H d + c\xi \\ \text{s.t.} \quad & g_j(x) + \nabla g_j(x)^T d \le \xi, \quad \forall j. \end{aligned} \tag{2.30}$$

(See [3] for a geometrical illustration of (2.30).) Since it can be shown that x is a stationary point of $f + cP$ if and only if (2.30) has $d = 0$, $\xi = P(x)$ as its optimal solution, an appropriate numerical procedure would be to sequentially approach a stationary point $x^*$ for which this is true. SQP is an iterative descent algorithm that operates on this idea:

$$x_{k+1} = x_k + \alpha_k d_k,$$

where

$$\begin{aligned} d_k = \arg\min_{d,\,\xi} \quad & \nabla f(x_k)^T d + \frac{1}{2}\,d^T H_k d + c\xi \\ \text{s.t.} \quad & g_j(x_k) + \nabla g_j(x_k)^T d \le \xi, \quad \forall j, \end{aligned} \tag{2.31}$$

in which we may, for example, use the Armijo rule to determine $\alpha_k$.

Two issues remain to be addressed before implementing the above idea: the selections of c and $H_k$. We know that $c > \sum_j \mu_j^*$ is required to guarantee convergence, but we do not know $\mu^*$ a priori. Well-developed methods exist for addressing this problem, such as solving similar problems for $d_k$ with ξ set to zero and using the resulting Lagrange multipliers to estimate c [3]. This approach works well, but some care must be taken since c tends to become too large. This results in sharp corners of the penalized cost $f + cP$, which adversely affects the stepsize procedure, and consequently the entire algorithm.

As for the matrix $H_k$, a natural choice is the Hessian of the Lagrangian, $\nabla^2_{xx}\mathcal{L}$, where $\mathcal{L}(x, \mu) = f(x) + \mu^T g(x)$. Though difficulties can still occur here (for example, the $\mu_j$ are not generally known, or $\nabla^2_{xx}\mathcal{L}$ may not be SPD), there are methods of determining appropriate SPD $H_k$.

It is simple to incorporate equality constraints in the method. To solve the problem

$$\min_{x \in \mathbb{R}^N} f(x) \quad \text{s.t.} \quad h_i(x) = 0, \quad i = 1, \ldots, n, \qquad g_j(x) \le 0, \quad j = 1, \ldots, r, \tag{2.32}$$

we may simply replace $h_i(x) = 0$ by the two inequalities $h_i(x) \le 0$ and $-h_i(x) \le 0$. Then, the direction-finding step becomes

$$\begin{aligned} d_k = \arg\min_{d,\,\xi} \quad & \nabla f(x_k)^T d + \frac{1}{2}\,d^T H_k d + c\xi \\ \text{s.t.} \quad & \left|h_i(x_k) + \nabla h_i(x_k)^T d\right| \le \xi, \quad \forall i, \\ & g_j(x_k) + \nabla g_j(x_k)^T d \le \xi, \quad \forall j. \end{aligned} \tag{2.33}$$
The crucial point here is that the above system is a quadratic program (QP) with linear constraints. Therefore, a unique solution can be found for each k iteration of the procedure. Given
appropriate stepsizes, dk approaches zero,
2.7.2
k approaches P(x*), and xk approaches x*.
SQP in Optimal Control
The presentation of the method above has been carried out in the NLP context.
This section
applies the method to optimal control problems by converting such problems to NLPs.
Multiple Shooting Methods
SQP is typically performed in the context of multiple shooting methods, which are derived from
the simple shooting methods of section 2.5. Consider the LQP problem of section 1.2.3 with the
governing equation
=
Ay + Bu,
y(to) = yo.
(2.34)
In multiple shooting methods, one divides the time domain into Nt subintervals [ta, t,+1] (for
discretized problems, these intervals are larger than the time discretization intervals). For each tn,
a guess yn is given as a initial conditions for that time interval. By integrating (2.34) from t,
with
y, and a given u(t), to t,+,, one gets y(t,± 1 ). In general, one finds that y(tn+1 ) # yn+1, and the
aim becomes to determine y, and u(t) that ensure y(t,+i) = yn+1. Since these time intervals can
be made, in principle, arbitrarily small, stability issues may be alleviated. This procedure produces
continuity requirements that can be seen as problem constraints:
cn(x) = y(tn+1 ) - Yn+1 = 0,
where x
=
[yo,...,yN",u(t)T,
for
n = 1,... ,Nt,
and y(tn+ 1 ) is arrived at by integration of (2.34).
(2.35)
Now we may
restate the problem as
min J(x)
x
s.t. c(x) = 0,
where c(x) = [C1 (4)C 2 (X), -- ,CNt (X)I T .
41
(2.36)
Optimal Control Problem
Having redefined the optimal control problem as (2.36), we are ready to apply the techniques of
section 2.7.1. The QP subproblem of interest is written
dk
=
arg minJ(Xk) + VJ(Xk)Td+ 1 dTHkd
d
S. t.
2
(2.37)
c (Xk) + VCX *)d = 0,
where Hk is a SPD approximation (a quasi-Newton scheme may be used, for example) of V2 Lk(Xk, Ak),
and
Lk(x, Ak)
J(x) - AkT[c(x)
-
c(xk)
-
Vc(xk)(X - xk)]
(2.38)
is a typically used modified Lagrangian [12]. Defining the terms (c(xk)-Vc(Xk) (X-Xk)) of the above
equation as the constraint linearizationcL(x,X k), we may interpret dL(X, Xk)
=
c(X)
-
cL(X, Xk) as
a departure from linearity [2]. The QP has a unique solution (dk, dk) which can be found by solving
the stationary conditions
Hkdk
-
Vc(xk)
Vc(xk )dk
=
T
dk
-V J(xk),
A
(2.39)
*b
-
Once the QP subproblem is solved, the variables can be updated:
Sk-1
Ak+1
xk +±akdk,
(2.40)
_k + akdk,
where ak is found by appropriate line search techniques based on a variety of merit functions. SQP
and its variants have been successfully applied to a variety of problems [2, 11, 12, 14, 24, 6, 5].
2.8
2.8.1
Gradient Methods
General Gradient Methods
Along with SQP, gradient methods are among the most often used algorithms for solving modern
optimal control problems. Pioneered by H. J. Kelly [24], these methods have been traditionally
performed in the control space; that is, the control variable is improved at each iteration until the
42
stationary conditions are satisfied. An obvious advantage of this approach is that, if the system is
stable, a guess of the control history u(t) will produce manageable values of y(t) during integration
regardless of initial conditions (unlike shooting methods).
Take, for example, the general system with stationary conditions
=
f (y*, U*),
D7
w (y* u* A*)
-
I
y*(to) = yo given;
(2.41)
A*(t#) =
(2.42)
Dy
0
-
/[Y*(tf)1
Dy
(2.43)
.
au
A guess uk(t) for the control variable can be used in place of u*(t) to integration equation (2.41)
forward in time. The resulting value of y(t) can be used to integrate equation (2.42) backward
in time, providing an approximation of the state-adjoint history (y, A). Since in general
uk
# U*,
equation (2.43) will not be satisfied for this set of results. The goal of gradient methods, then, is
to use gradient information to obtain uk+1 such that uk --+ U* as k
-+
00.
In the above procedure, stability properties of the equations are maintained, exact state histories
are produced at each step, and estimates can easily be made of cost errors from duality. For these
reasons, these methods have been chosen as the basis for the algorithm developed in this work.
Gradient methods that operate on the control variable typically assume the form:
Uk+1
= Uk -
akDkVuN(uk),
(2.44)
where ak is a stepsize chosen by an appropriate line search, Dk is a SPD matrix, and Vu7i(uk)
is the gradient of the Hamiltonian at uk. This formulation derives directly from applying NLP
gradient methods to the optimal control context. It can be shown [3] that if:
(i) the step direction dk = -DkVj(uk)
is gradient related 1 for every k,
(ii) and the stepsize ak is chosen by some appropriate line search technique such as the minimization rule, the limited minimization rule, or the Armijo rule,
'Given a subsequence {uk} that converges to a non-stationary point, the corresponding subsequence {dk} is
bounded and satisfies the condition
lim sup VuR(uk)T dk < 0.
k-
I
oo
43
then every limit point of the sequence {uk} generated by (2.44) is a stationary point.
Steepest Descent (u)
Steepest descent schemes are among the simplest for this type of approach.
implemented by assigning Dk
=
They are simply
I, where I is the identity operator of appropriate size. For the
LQP problem, for instance, this would result in the iteration
uk+1 _ uk
Clearly, the step direction dk
stationary points (Vu7j(uk)
-VuH(uk)
-
ak (Wuuk + BTAk).
always satisfies the gradient related condition for non-
# 0). Though this guarantees eventual convergence in most cases, a
critical problem of this approach is that it is only linearly convergent. This can seriously hamper
the convergence rate if the problem is not well conditioned.
Newton-Raphson (u)
Newton-Raphson algorithms are based on a quadratic perturbation of the cost functional with
respect to u. In the context of gradient methods, this corresponds to Dk
--
(V2
)
1
. For the
LQP example, we have
Uk+1 _ uk - akW-il (Wuuk + BTAk).
The main advantage of this approach is that quadratic convergence is attained near the solution.
However, several drawbacks often make the pure Newton-Raphson method difficult to implement.
The first of these drawbacks is that the method requires (V2UW(uk)) to be symmetric positivedefinite for possible values of uk in order to guarantee that dk be gradient related. Variants of the
method are more appropriate to deal with such situations. Quasi-Newton methods, such as the
BFGS variant [15], have been successfully used to solve nonlinear (non-SPD VUUW) problems. These
methods use a symmetric positive-definite approximation Hk of (V 27
W(uk))-
which improves as
uk approaches u*. A serious disadvantage of the quasi-Newton methods, however, is that Hk must
be stored throughout the iterations [22].
Since there is no intrinsic sparsity in H , this usually
requires O(M 2 L) storage, which can be prohibitive for large problems2 .
2
Pure Newton-Raphson methods are able, in principle, to calculate (VN72
storage. In practice, however, this is a far greater computational hindrance.
44
(uk)>1 on the fly, avoiding issues with
Another problem of the method is that it requires an expensive inversion of (V2 ,?1(uk)).
Quasi-
Newton methods may be cheaper since Hk is an approximation of the inverse of (V 27W(uk)), but
the calculations are still expensive compared to simpler forms of Dk.
Finally, the pure form of the method may not converge for guesses that are far from the solution (particularly for nonlinear, non-SPD problems). Even if convergence does occur, the method
does not fundamentally distinguish between maxima and minima. This problem may be avoided
by careful use of appropriate line search routines coupled with homotopy loops which approach
the problem from a known solution. This, however, imposes another level of complexity to the
calculation.
Conjugate Gradient (u)
Often regarded as among the most powerful techniques for solving many types of linear algebra
problems, conjugate gradient techniques strike a balance between the simplicity of the steepest
descent and the speed of the Newton-Raphson. In these methods, the descent direction is redefined
as
dk _
_V
w(uk) + ok dk-1
where
k
V7kT(v
k
-
V
k-1)
V-Rk-1TV-k-1
and the steps uk+1 _
Uk
+ akdk are taken such that
?W(uk + ak dk)
=
min 71L(Uk + a~dk).
a
These methods are typically simple to implement (especially for the quadratic case). In addition,
storage requirements are O(M); far less than quasi-Newton methods. In practice, these methods
exhibit very fast convergence due to the orthogonalization of the descent direction. Possibilities for
preconditioning exist, further improving convergence.
Because of its speed and simplicity, the conjugate gradient approach has been chosen as the
basis for the method presented in this work.
There are, however, modifications made to take
advantage of theoretical flexibility of the Hilbert Uniqueness Method (HUM) developed by J.L.
Lions [18, 19, 13].
45
2.8.2
The Hilbert Uniqueness Method (HUM)
Originally developed in the context of the wave equation [19], the Hilbert Uniqueness Method has
been used to study controllability and stabilization properties of distributed systems governed by
parabolic differential equations for the terminal problem [13, 9].
Since parabolic systems are of
interest here, the method will be presented in this context. Here we briefly summarize the results
of [9], and in section 3.4 we develop the method to solve general optimal control problems, including
the regulator problem.
Consider the terminal LQP problem of section 1.2.3. Then, the optimality conditions can be
expressed as
y - Ay* = -QA*,
-*
-
T A* =
y*(0) = yo,
0,
A*(tf) = WT(y*(tf) - YT).
We can represent a general terminal state y(tf) as an operator R if, given q c RN,
y-
y(0) = 0,
Ay = -QA,
A(tf) = q,
-i-ATA = 0,
and
(2.45)
Rq = y(tf).
Also, let E(t) denote the solution operator at time t for
tb - Aw = 0,
w(0) = z,
that is, w(t) = E(t)z. Comparing the above relations, we arrive at
E(tf)yo + W§-lR(y*(tf) - YT)
=
y*(tf),
or
(WT + R)(y*(tf)
-
YT) = WT(YT - E(tf)yo).
(2.46)
The operator R is similar to A used by Lions [19] in HUM. Though [9] considers only the terminal
46
problem, several important properties of R are proved:
(i) R is a compact operator in ffN;
(ii) R is symmetric and semi-definite in RN.
(iii) KerR
=
0.
As a result, the article is able to provide constructive proofs for approximate and exact controllability from the HUM point of view. Section 3.4 extends the theory to regulator problems
and presents a solution method derived from HUM by introducing a more restricted space and
connecting the above with the concepts of sections 1.3.2 and 2.8.1.
2.9
The Proposed Method
The method developed here is based on gradient methods.
However, rather than approaching
the primal problem from control variable perspective, the problem is approached from the dual
statement with a transformed form of the adjoint variable.
This approach is inspired by the HUM, which, as shown above, is a theoretical result used for
proofs of the uniqueness of certain optimal control problems (hyperbolic problems and the parabolic
terminal LQP, for example). Due to its constructive nature, the method can be implemented for
the actual calculation of optimal solutions.
In its current form, however, the method has been developed in Euclidean space, which restricts
it to terminal optimal control problems. When the more general regulator problem is attempted
in Euclidean space, the SPD property of the operator no longer holds, and neither uniqueness nor
the implementation are justifiable.
We have found that if the method is constrained to a new problem-specific space, then the SPD
property of the terminal-regulatorproblem can be stated, and the implementation of the algorithm
follows.
All that is required, then, is to use a method that maps out this new space.
General
conjugate direction algorithms can be used for this purpose, and in particular we use Conjugate
Gradients. The difference with the traditional gradient algorithms is that the inner-products are
no longer Euclidean, but rather of the type "((., .))" which will be defined in chapter 3.
The method is also flexible: a decomposition of the variables into inhomogeneous and homogeneous parts along with a careful treatment of the used operator allow for extensions to be made, so
47
that more general problems can be addressed. In our case, we are interested in addressing problems
which are more realistic in the engineering context: constrained control, nonlinear problems. By
treating the constrained control with Interior Point Methods (IPMs) and the nonlinearity with a
lagging algorithm, we are able to preserve key aspects of the proposed algorithm. This means that
it can be used successfully by more general methods to tackle difficult problems.
48
Chapter 3
TR Conjugate Gradient Algorithm
-
Linear-Quadratic, Unconstrained
Problems
3.1
Motivation
Linear-Quadratic Problems are those characterized by linear governing equations and quadratic
cost functionals.
By restricting the definition to linear state equations, many realistic problems
are excluded from the model.
However, many other problems can be at least very accurately
approximated by such equations. The main example of this chapter is a solid heat transfer problem,
governed by a linear, first-order partial differential equation.
Here we focus on finite time, terminal-regulator problems. In such problems, we aim to control
the state of the system throughout a given time interval I
=
[to, tf ].
In our example we would
like to control the temperature at certain points of a domain given a set of available heaters. The
behavior of the system should be forced to come "close" to a desired trajectory throughout I, with
no constraints imposed on the values of controllers (heaters).
Suppose we are presented with the object in Figure 3-1, and would like to use its top surface
to control a certain (hypothetical and idealized) chemical reaction. For this purpose, it is assumed
that the exact temperature history required is known (an example will be given), and that the
reaction itself does not affect the temperature of the surface. The object extends infinitely in the
z-direction, and so this can be considered a two-dimensional problem. The side boundaries are
49
reaction surface
constant temperature
constant temperature
object extends infinitely in z-direction
x
top and bottom surfaces well-insulated
(zero heat transfer)
Figure 3-1: Motivation - the goal is to control the temperature in the shaded region.
kept at a constant temperature, while the top and bottom boundaries are well insulated and thus
allow for zero heat flux.
As a result of the simplifications above, this problem is governed by linear partial differential
equations.
It is assumed here that an appropriate spatial discretization has been performed so
that the problem statement is presented as a set of governing ordinary differential equations. The
details of this procedure are presented in section 3.11.
Constrained by the linear governing equations, the problem will be stated as an optimization
of a quadratic cost. Regarded as a practical compromise of true design objectives to yield solvable
problem statements [10], quadratic costs have been the subject of much optimal control work
[6, 8, 26, 27, 1, 9, 13, 24, 23]. This is because the quadratic cost functional allows for well-posed
problem statements and relatively easy numerical solution algorithms. Advantages and limitations
of the form given here will be mentioned in the next section.
3.2
Problem Statement
Given a state vector y E y = CO{(to, tf); IRN} and a control vector u E U =
(to, tf) x
RM}, we
begin the problem statement by presenting the quadratic cost functional to be minimized:
J[y(u)] = 2(y(tf) - YT)TWT(y(tf)
where YT E RN, YR E {(totf) X
-
YT) + 2
JJN},
jt
[(y - YR)TWR(y - YR)
WT E RNxN, WR
E
]RNxN,
WUTWu] dt,
(3.1)
and Wu E IMxM. The
variables YT and YR represent the desired terminal and regulator behavior of the system, and the
differences in the first and second terms represent deviations from desired behavior. The matrices
50
WT, WR, and Wu, represent the terminal, regulator, and control cost weights, respectively. Here
we make the following assumptions about the matrices: WU is symmetric positive definite, WT and
WR are both symmetric positive semi-definite, at least one of which is strictly positive.
Remark 4 It is assumed here that all of the above information is known, though it should be
noted that it is not a trivial matter to determine these parameters. The variables YT and YR are
usually derived from the (assumed known) desired behavior. For example, we assume that in our
example the required temperatures for the desired rate of reaction are precisely known for the
desired outcome.
The matrices that determine the weights of different terms of the cost functional may be derived
from a few different approaches. In the simplest sense, it is typical for values to be empirically
chosen until an acceptable controlled behavior is observed [10].
Alternatively, there may be an
economic cost associated with the quality of state behavior (WT and WR) and with the cost of
control (Wu) that automatically provide the required values.
Another way of determining the weights is to choose, for example, diagonal matrices
1
WTii
for i = 1, ...
,
N and j
1
- yT)"
=
1,
...
1
(k yT
- y
r
'
W)U 3
(Umax) 2
, M, where yf, Q, and Umax determine the "maximum" values to
be tolerated for the respective deviations. Values below this maximum contribute terms that are
less than 1 to J, while values above contribute terms that are grater than 1.
Though this does
not guarantee that any of the maximum deviations are strictly satisfied, it provides a guideline for
a balanced distribution of the cost weights. Chapter 4 addresses a case in which strict limits are
imposed on the control variable (a methodology that can easily be extended to the state variables).
Finally, one may wish to approach the problem from the approximate controllability point of
view. In [13] and [9] it was shown for the terminal LQP problem (WR = 0), that
lim
1fy(tf) - YTII
=
0.
WT
That is, if we take WU
>
WT we can approach the desired final state arbitrarily close. 1
Having determined the cost functional above, we can now state the form of the constraints.
These are the initial conditions for the state variables and the linear ordinary differential equations
51
that govern the system:
where yo E
RN, A
E
JNxN,
B E
y(to) = yo,
(3.2)
y = Ay + Bu + F,
(3.3)
and F E 1EN. The matrix A represents, for example,
INxM,
the stiffness matrix that controls the dynamic response of the system, while B represents the
distribution of control variables throughout the domain. The vector F is included for generality.
For the presentation here we assume that all the system parameters WR, WU, A, B, F are constant
in time. This need not be the case, however, and the algorithm proposed in section 3.6 remains
unchanged if these parameters are allowed to vary in time.
Given J[y(u)] of equation (3.1), we define the problem to be solved in this section as the
Linear-Quadratic Program (LQP):
min
uEU
J[y(u)]
{
s.t.
(LQP)
= Ay + Bu+ F,
y(to) = yo.
3.3
Optimality Conditions for the LQP Problem
Having defined the problem to be solved, we must determine the necessary and sufficient conditions
for stationarity. From section 1.2.3, we know that stationary conditions are necessary and sufficient
for LQP problems. These can easily be derived from equations (1.15)-(1.17):
* =Ay*+ Bu* + F,
S=T A* + WR(y*
0
-
y*(0) = yo,
A* (tf)
YR),
=
WT(y*(tf) - YT),
Wuu* + BT A*,
where A E A = CO{(to, tf); JRN.
Defining
Q=
BWJlBT, we can reduce the above to
= Ay* - QA* + F,
y*(0) = yo,
(3.4)
= ATA* + WR(y* - YR),
A*(tf) = WT(y*(tf) - YT),
(3.5)
52
and
u* = -W-lBT A*.
(3.6)
Any algorithm that is developed to solve problem (LQP) must therefore solve for the system
of equations (3.4,3.5). The solution for the optimal control history u* can then be obtained from
equation (3.6). The difficulties associated with solving this type of system were presented in section
2.1, and are obvious here for this simpler system.
3.4
The Hilbert Uniqueness Method for the Terminal-Regulator
LQP Problem
3.4.1
Separation of Inhomogeneous and Homogeneous Parts
Before introducing HUM for the regulator problem, we begin with a separation of state and adjoint
variables into inhomogeneous and homogeneous parts: y = yI +yH and A = AI + AH. Then problem
(3.4,3.5) can be separated into:
y* = Ay* - QA-*
+ F,
-*
= ATAr - WRYR,
Y (0) = yo,
(3.7)
A*(tf)
(3.8)
=
-WTYT,
and
Q=Ay*
- QA*,
0,
(3-9)
A*(tf) = WTy*(tf),
(3.10)
y;I(0)
= A TA* + WRy*,
=
The inhomogeneous parts yi and Al can immediately be calculated from above since equation
(3.8) is uncoupled from equation (3.7): all that is required is a backward integration of equation
(3.8) followed by a forward integration of equation (3.7) with the resulting A,. This cannot be said
for yH and AH since equation (3.10) depends on values of y = Y + YH, which are not known.
3.4.2
R and g Operators
Here we begin to build on HUM by defining an operator R for the regulator problem.
q E Y(= A), Rq is defined by the following operation motivated by equations (3.9,3.10):
53
Given
(lq)(t) = A(Rq)(t) - QAH(t)
(3.11)
IRq(to)=o
-H(t)
-
= ATAH(t) + WRq(t)
AH (tf) = WTq(tf)
Note this is different from the HUM operator of section 2.8.2 and [9, 13] since it includes a
regulator term WRq(t).
Once a value of q has been given, a backward integration and a forward
integration in the order indicated gives the operation Rq. The operator 9 is imply defined as
gq
3.4.3
=
(3.12)
q - Rq.
The Terminal-Regulator ((-, .)) Inner Product
The terminal-regulator ((-, .)) inner-product is similar to the one defined in section 1.3.2; however,
important modifications are to be noted. Given v E Y and w E Y,
((v, w)) = v(tf)TWTw(tf) +
j
v(t)TWRw(t)
(3.13)
dt.
Jto
Now the inner product operates on variables that have a time dependence.
It also has an
additional term that is integrated over the control time period.
3.4.4
Proof of SPD Property
It will be useful to show at this point that the g operator is symmetric positive definite in the space
defined by the above inner-product. We have from equations (3.10,3.11)
AT (Zq)
T(Rq)
d(AT(Rq))dt
=
AT A(Rq) - ATQAH
-
-ATA(q)
=
-AQAH
-
qTWR(Rq)
- qTWR(Rq),
which, by Green's formula, yields
AT QAH dt = q(tf )TWT (Rq)(tf) +
-
to
qTW
to
54
(Rq) dt
((q, Rq)).
(3.14)
Since WU is assumed symmetric positive definite, we have that
Q
= BWilBT is also symmet-
ric positive definite. This fact together with equation (3.14) guarantees that the operator R is
symmetric negative semi-definite in the inner-product space defined above. Since gq = q - Rq,
it must also be true that the operator g is symmetric positive definite in that space: ((p, gp)) =
((p, p)) - ((p, Rp)) > 0, Vp C Y, p = 0.
3.4.5
HUM from the Dual Problem
Preliminaries
For later use, we note that also from equations (3.7,3.8,3.11):
4Hyi
T91
=
d(A T yi)/dt
=
-A7HAyi - S~y
-X IflA
-
qT~ay1 + ATF,
and
=
4fy1
=
d(ATyi)/dt
AT Ayi - AT QA1 + AT F
-AT Ay,+ yWRyI
-ATQAI
+ yTWRyI
+ ATF,
which, again by Green's formula, yield
- jf
AT QAI dt = y 1 (tf)TWT q(tf) - AH (to)Tyo +
tf
to0
to
T Wy
1
d -
to
QA
y IWRyI dt -
o -]
Y(t
= -yTWT YI(tf) -
to
tNATF dt.
(3.15)
ATF dt.
(3.16)
ft0
to
The Terminal-Regulator Dual Problem
We can use Fenchel duality to easily obtain the dual problem. Since equations (1.34) and (1.35) are
stated in general form, the conjugate convex and concave functionals are easily derived. Defining
55
fi(y) = Ay, and f 2 (u) = Bu + F, they are, respectively,
I1(A)
=
-(A + ATA)Ty
t2f
sup
YY
-
to
(Yf - yT) TW(y
-
1
(Y - YR)TWR(y
2
-
YT) + A0yf
YR) ct
-
(3.17)
-
and
12(A)
= inf
UEU
to
AT (Bu + F) +
1iUT W
'Id
J
2
Here, we note that A1 = A 2 = A for I(A) < 0o and 12 (A) > -o0.
(3.18)
Optimizing the functionals
above, we have
Ii(A) =
j
( + AT A)TWjl(A +
to
AT A)
-
( + AT A)TyR dt
2
(3.19)
+ 1 AfWj Af + A7YT
-
A yo,
A E A,
and
f
12 (A) = -1
AT QAdt +
SATF dt,
(3.20)
A E A.
The dual problem can now be stated:
maximize
I(A)
- (A + ATA)TWnl (A + ATA) + (A + AT A)TyR dt
=
1t
subject to
t
2
Af W-Af
T
A TQAdt - ATYT + A oY +
2-
ffT
ATF dtf,
A E A.
(LQPdual)
56
Dual Problem - q Formulation
Before changing from the A to the q formulation, we split the adjoint variable into inhomogeneous
and homogeneous parts:
I(A)
-
i
(1
:
-
+ ATA 1 )T W
1(AH
(A1 + AT
I
2f
2
f
T
A,
QAI dt
1
-
-
AH )T YR
T
(AH +
dt
+ AT Al)
'AIf
2IWj
-AHf -
to
-(AH
) - (A1 + AT AI)T yR dt
+ AT AH )TW-l(AH + AT AH)
+ f(AH + AT AH)TW1'(A
to
-2ATW
1
-
AH fW
tf
tf
1
A4QAHdt2 t(O
f + AIf )TYT + (AHO + AIo )Ty0 +
-AIf
QAHdt
to
tf (AI +
ft0
AH )T F
dt.
Now, we can use equations (3.8)-(3.16) to substitute term by term above.
I(q (A)) = +
1
tf
T ~
tf T
qTWaq dt +
YRWRYR dt - 2Ito
1 to
q(tf )TWrq(tf)
-
-
2
1
-
+
TWWTy
t
t- Y Wfyt dt -
2 It0
2Y
)TW(Iq)(tf)+
f
qWRyR dt -
qT Wyft dt
to
)TYW
2YTWTYT + q 1f
-
I to
ftf ATFdt
2 It0
t qTWR(Rq) dt
+ yI(tf)T Wrq(tf) - AH (to)Tyo +
jto
T Wy
1
dt -
toH
AT F
- q(tf )T WTyT + yT T WTYT + AH(t0)Ty0 + AI (t0)Ty0 + jf
(AI+ AH)T F dt.
Finally, canceling like terms and rearranging we arrive at a function that is independent of AH:
57
{q(tf )TWTq(tf) - q(tf )TWT( Rq)(tf
I(q) =-
-+ q(tf )WTy
1
(tf)
+ Iy WT (yT - yI(tj)) + !AI(to)Tyo
±2 T2
{f
f
-
qTWRqdt
-
i:t
1ft
+ -YRWR(YR - yI) dt + -1 'f
2 to
2 '0
+
1 -{q(tf)TWT(9q)(tf)
+
+
2
yT
rWT(YT - yI(tf))
+
qTWR(lq)dt
Jj
tf TWRyjdt
fTFdt
F
qWR(gq) dt} -
q(tf )TW yI(tf)
+
qTW
yI dt
+ 2_AI(to)TYo
1 f
1 1 tftfTW1f
tf \T Fdt..
yRWR(yR - yI) dt +
2 to
2 '
Defining
C
1
yTWR(yR - yI) dt + 1
YTW(yT - yI(tf)) + 2A 1 (to)Tyo
f F dt,
(3.21)
we may write the dual problem as
I(q*) = max
gEY
-((q,
2
9q)) + ((q, yi))
+ C1.
(LQPq)
Therefore, we have restated the original primal problem (LQP) as an unconstrained maximization
problem in terms of q(t). Both (LQPdual) and (LQPq) are the dual statement of (LQP). The latter,
however, has the distinct advantage that it is stated in terms of the operator 9 which is symmetric
positive definite in the space defined by ((-, -)). Additionally, the statement of the problem is much
cleaner since no time derivatives are present in I(q), as opposed to I(A).
We note here that the the set A is convex and polyhedral, and that 11 (A) is convex and I2 (A)
is concave over A. Thus, we can state [3] that there is no duality gap and we have
{Ji(y) - J 2 (u)}
inf
(y,u)E{yxu}
or simply: J(u*)
=
=
max {I2(A) - I1 (A)},
AEA
(3.22)
I(A*) = I(q*). This result will be referred to as strong duality. From Fenchel
duality theory, weak duality also holds: I(q) < J(u) V {q C Y, u E U1}.
58
Form (LQPq) is in fact a rather useful statement of the problem from the numerical point of
view:
(i) since the operator G is symmetric positive definite in the given inner product, efficient numerical techniques that depend on this property can be used;
(ii) the q statement of the dual problem is an unconstrained maximization problem, which can
be solved for a unique solution by well-established methods;
(iii) only the action of operator g on q is required, thus allowing for potential storage savings.
This action is uncoupled in time, and is performed via operations (3.11,3.12);
(iv) the inhomogeneous part C, can be solved for by equations (3.7,3.8) which are uncoupled in
time. Since this term is independent of q, it needs to be solved only once in an iterative
procedure;
(v) in iteratively solving for q*, any value q E Y will produce I(q), which serves as a lower bound
to the optimal cost I(q*)
=
J(u*);
(vi) in iteratively solving for q*, the operation gq produces a non-optimal value of A(q) which can
be used to determine u(q)
=
-Wj1BTA(q).
This can be used to determine a upper bound
J(u(q)), and thus a bound gap to the solution Ac(q) = J(u(q)) - I(q) > 0. Because of strong
duality, effective numerical algorithms should allow Ac(q) -+ 0 as q -+ q*.
3.4.6
Differences from Previous HUM
The Hilbert Uniqueness Method has been developed as a constructive technique to study exact
controllability of distributed systems. Originally developed in the context of the wave equation
[19], the method has been extended to study parabolic equations [9] of the type considered here.
The approach, however, has been to state the problem as a solution to the system
Pq = yj,
where P is some operator similar to G presented here. The operator is then shown to be symmetric
positive definite allowing for the development of algorithms for solving the above system for the
terminal problem. This, however, overlooks the origin of the problem above, which is the dual
59
statement of the original problem as in (LQPq). By using Fenchel duality, and stating the problem
as such, we have available Ac(q): a measure of quality of the current iterative solution.
Though Fenchel duality has been used to express the dual of the optimal control problem [13],
this has been done for the terminal problem only. In such a situation, the end result is similar
to problem (1.40), and the symmetric positive definite property of the terminal g operator can be
proven in Euclidean space, thus not requiring a special inner-product such as ((., -)). The advantage
of statement (LQPq), however, is that regulatorproblems can be just as easily formulated, provided
the inner-product is appropriately defined. By combining the duality result with HUM as above,
one is able to automatically derive the appropriate form of the inner-product to be used to solve
the problem.
3.5
3.5.1
General Conjugate Gradient (CG) Algorithms
General Conjugate Direction Methods
The method of conjugate gradients is presented here for the solution of general optimization problems in ]R9.
In section 3.6 the method will be applied to a subspace of
Y
for solving the optimal
control problem (LQPq). Since the problem of interest is quadratic, the method will be presented
in this context.
Conjugate gradient methods are a subset of the more general conjugate direction methods [3].
They were originally developed for solving quadratic problems of the form
x* = arg min
where
Q E 1R?
f (x)
XTQ
is a positive definite matrix and b E JRf
-X
Tb
(3.23)
(this problem can also be expressed as
Qx = b). Conjugate direction methods rely on Q-conjugatedirections dl,... , dk; that is, diT Qdj
=
0
for all i and j such that i = j. These directions are linearly independent by construction since Q is
positive definite. Minimization of f is then iteratively performed by xk+1 - xk + akdk, where ak
is chosen in such a way that
f (Xk + ak dk)
=
minf (xk + adk).
a
60
Since
Q
is symmetric positive definite, this value of a can easily be found from the form of
dkT(b
ak
-
f
QXk)
to be
(3.24)
dkTQdk
for any given direction dk. The strength of these methods is that the iterates progressively minimize
f
over an expanding linear manifold that eventually includes all of R'.
Using the Gram-Schmidt procedure, it is possible to obtain a set of mutually Q-conjugate direc-
tions do,...
, dk
from a set of linearly independent vectors r 0 ,...
,r
, so that the subspace spanned
by both sets of vectors is the same. This can be done iteratively in the following manner. Choosing
do-
r0 and
di+ - ri+l +
cmldm,
m=O
determine coefficients cmi such that di+1 is Q-conjugate to do,..., d'.
From the Q-conjugacy
definition, we get
c Z+-
3.5.2
ri+TQdJ
iT Qdi
j= 1,...,Ii.
The General Conjugate Gradient Method
In the conjugate gradient method, the set of Q-orthogonal directions is obtained by applying the
Gram-Schmidt procedure to the gradient vectors of problem (3.23):
rk=
Vf (xk)
=
b
-
Qxk,7
(the negative of the gradient is used for notational convenience, this value is also often referred to
as the residual). In this case, the direction calculation for the k-th iteration adopts the form
d rk
k-1 kTQdJ c
= rk -E
r IQTdJ dd.
But since the gradient rk is, by construction, Q-orthogonal to the subspace spanned by ro, ... , rk-1
the above can be greatly simplified to
dk -rk
-
rj rr )
(rk-ITyk-1
61
dk.
(3.25)
1r(rj+
ykQdT
k-i
j=0
T
j=0 dT(ri+1 - rj)/c4
d
rkT k
j
_
kT k
rrk-1)
k-1T(rk
rk-lTrk-1J
The simplicity of the above forms indicate one of the advantages of the method: for implementation, it is only necessary to follow the algorithm:
Algorithm CG
Set k = 0;
Set x 0 =O
, r0 = b, and d0 = r0 ;
while not (rkTrk) > (tolerance) do
k = k + 1;
ak
(rk-lT rk-1)(dk-1TQdk-1);
akk-1.;
xk _
k-1
rk
k-1 _ akQdk-1;
k = (rkT rk)
dk
-
(rk-T rk-1);
rk + !kk-1;
end do.
Another significant advantage of the method is the fast rate of convergence observed. An upper
bound for the convergence rate is
where |XI|Q = (xTQx)
||z * -
<k|
2 Vr,(3.26)
IIX* - X0|IQ
-
\+/1Th+
'
denotes the Q-norm, and r is the condition number of Q (see [28] for
proof and other results). The conjugate gradient method is particularly powerful in practice for
matrices
Q
whose spectra are well-behaved (clustered eigenvalues). Though in theory the method
will achieve the exact solution when k = n, such problems allow for very satisfactory accuracy for
k
< n. This is a result of the fact that the method chooses search directions dk which allow for the
minimization of the error IIx* -xkIIQ over the entire Krylov space ICk spanned by {b, Qb, ...
62
, Qk-lb}.
3.6
Terminal-Regulator Conjugate Gradient Algorithm (TRCG)
3.6.1
The Skeletal TRCG Algorithm
We can adopt the above conjugate gradient methodology to solve problem (LQPq). Here, we
express the problem in the equivalent minimization form:
q* = arg min i(q) = -((q, gq)) - ((q, yi))},
gEY
2
(LQPq*)
where we are reminded that I(q) = {-1/2((q, gq)) + ((q, yi)) + 0r} < I(q*) for all q E Y. The
similarities between problems (3.23) and (LQPq*) are immediately apparent: (i) both f(x) and
i(q) are quadratic functionals; (ii) both problems are unconstrained minimizations; (iii) both
problems are defined by an symmetric positive definite operation in an appropriate inner-product.
The differences of the problems have to do with the operators involved and the inner-products: (i)
though Q is an n x n matrix, g is an integration operation, which takes in argument q and returns
a time history gq; (ii) the inner-product for problem (3.23) is the Euclidean inner-product, while
that for (LQPq*) is defined as in (3.13). This last point is crucial to the method since operator g
can be shown to be positive definite only in the space defined by this inner-product.
The result of these observation at the algorithmic level is as follows. All of the conjugate
gradient ideas presented in section 3.5 can be restated in the context of ((., -)) in place of (-)T().
Now, rather than minimizing the Q-norm of the error Ix k - X*I|Q, the method will minimize the
"g-norm" jqk
-
q* Ig, defined by
1vflg
=
((V, g)),
over the Krylov-type space spanned by {y', gy1 , g(gy),... , Qk-1y}. The algorithm now takes the
form
Algorithm TRCG (skeletal)
Calculate yJ from equations (3.7,3.8);
Set k = 0;
Set qo= 0, ro=y1 , and d0 = r0 ;
while not (stopping criterion) do
k = k + 1;
qk
qk-i + akdk-I 63
rk
=rk--1
_
k !dk-1;
Ok= ((rkrk))((rk-1 ,rk-1));
dk
k +
.
k k-1.
end do.
3.6.2
Convergence Results for TRCG
Before we explore the convergence characteristics of the method presented above, we present the
following definition.
Definition 1 The set of scalars
Xi
E R and function Ei E
((EiIv)) = ((Aisi, v))
Y that solve the system
Vv
E
(3.27)
Y,
are defined as eigenvalues and eigenfunctions of 9, respectively, where i = 1,... , N. Furthermore,
we define eigenfunctions as those ei which satisfy ((ei, Ei)) = 1. El
Lemma 1 (a) The eigenfunctions of 9 from Definition 1 are mutually orthonormal: ((Ei, Ey)) = 0,
for all i : j. (b) In addition, the complete set
of scalars
{i} such that q(t) =
E
{e} forms a basis for Y: Vq E Y, there exists a set
I lis,(t).
Proof. (a) First we observe from equation (3.66) that 9 is symmetric in the
((-,-)) inner-product.
For i : j we have ei : e. and Ai 0 ANj. From symmetry and equation (3.27), we can write
((Ei, ej)) = ((i e, I le)) = -i ((ei, ej))
Since
and
((e, I i)) = ((Aj ej , ei)) = Aj ((e, I i)).
A AJ,
- we must have ((,e, ej)) = ((aj, Ei)) = 0. (b) The complete set of orthonormal
{se}
spans the space Y, and so serves as a basis. 0
Though the typical convergence result presented for the general conjugate gradient method is
shown in equation (3.26), a more useful result from a cost minimization point of view is presented
here for the TRCG algorithm:
Proposition 1 Assume that 9 has (N - k) eigenvalues in the interval [a, b], where the remaining
64
k eigenvalues are greater than b. Then every qk+1 generated by the TRCG algorithm satisfies
(qk+1) _I*)
b(I - a) 2
-I(qI) _ I(q*)
bb+a
(3.28)
Proof. To simplify the proof we introduce the change of variables p = q - q* and the function
4p(p) =
1(q) - f(q*). Then
I
p (p)
=
1p
((q, 9q)) - ((q, yj))
1
2 (*
2
1
= ((q, 9q))
2
=
1
=
1+
-((q*,
-
q)
9q*)) + ((q*, yi))
-(q-qy)
((p + q*, g[p + q*]))
((q* 9q*))
((p, gp)) - ((p, yi - 9q*))
=
-
1
!((p, gp))
and we need to show that
b - a )2
jP(pk+l)
Ip(po)
-
+a
for iterations in pk.
The TRCG algorithm builds iterates
k
Pk+1
(3.29)
y*kigpk
PO +
i=0
by selecting scalars 7*ki such that
4,(pk+1)
is minimized over all sets of possible coefficients -yki for
every k. Defining the polynomial
k
pk(g) =
E ckigk+l
i=1
with appropriate cki, we can restate the iterate (3.29) as
(3.30)
Pk+l - [ + gpk ()]O
where I is the identity operator: Ip = p. Choosing -*ki which minimize
65
I,(pk+1)
can be expressed
as
Ip(pk+l)
-
= miln
+P(([I
.
pk2
(3.31)
From Lemma 1, any function po E Y can be written as
N
p0 _
for a set of scalars
j. Also, from
N
N
gpo _
zig,i jy'i i,
{i} and ((2j I ))
=
090
together with the orthogonality of
~
Ip(PO
1
10
2 ((p
pO))
1, we can write
N
(NN
=z1
2
Applying the same idea to equation (3.31), we have for any polynomial pk
IP(Pk+)
(1 +
<
p
2
) 2 -(
)
and, finally,
< max(1 + Aipk
ZP(pk+)
(3.32)
Vpk, k.
Now we denote A1,... , Ak as the eigenvalues which are larger than b and choose the polynomial
Pk defined by
((a + b) -
-
Since (1 +
for
X C [a, b].
3 Pjk(A))
= 0, for
j
=
I(a + b)
1,...
, k,
)
(
A
A
.
Ak -
-
A
-
the maximization in equation (3.32) need only be done
Also since A3 > A > 0, for all j, we have (Aj - A)/Aj < 1. So we can rewrite equation
(3.32) as
IP(pk+1)
max
a<\<b
{
('(a + b)) 2 2
(1 (a + b))2
66
1PV
b- a 2
b+a)1()
Proposition 1 is significant because it shows that the method tends to converge very quickly to a
minimum value if the range of eigenvalues of the remainder of the space is narrow. Since operators
can be preconditioned, drastic improvements can be made regarding convergence of the method.
Stopping Criterion
3.7
Though the residual rk can be used as a measure of proximity to the optimal solution (since rk -+ 0
as k
-+
oc), a more physically meaningful criterion can be obtained from estimates on the cost of
at the kth iteration.
In carrying out algorithm TRCG, it is assumed that the value qk (t) is available for all t E [to, tf]
by means of some type of storage (see section 3.10.1).
This value of qk can be used to create an
estimate yk of the optimal state variable (note that in general yk : y*) by the operation
yk =
- y1,
and an estimate uk of the optimal control variable by
uk = -w
BT(
+
4k).
This is possible since y[ and AI are available by means of equations (3.7,3.8), and \k is obtained
as a by-product of Rqk as in operation (3.11).
Now that the values of the above estimates are
available, a cost bound gap Ak can easily be obtained by
Ac(qk) - j[uk (qk),yk(q k)
-
(3.33)
(qk).
Weak and strong duality assure that Ac(qk) > 0 for all k and Ac(qk) -+ 0 as k
-+
oc, respectively.
Therefore, a natural stopping criterion for algorithm TRCG is
while not (Ac(qk)
Ec) do ...
,
(3.34)
where cc is a tolerance parameter which represents the allowable error from the true optimal cost.
67
3.8
3.8.1
Time Discretization - Implicit-Euler
Discretization of Problem Statement
Thus far the above formulation and results have been carried out in the continuous time domain.
Before final implementation of the algorithm, a time discretization must be performed on the
problem. Several well-known schemes are available to discretized ordinary differential equations.
The ones explored here are implicit-Euler, Crank-Nicholson, and second order Backward Difference
Formulas, but the method also extends to higher order schemes. In this section we present only
the Euler formulation; Crank-Nicholson and BF schemes are presented (in a more general context)
in Appendix A.
It is important to note that care must be taken in applying these schemes to the two-point
boundary problem which make up the stationary conditions (3.4,3.5). If such care is taken, the
symmetric positive definite property of the discretized equivalent of 9 is preserved, and all the
results follow. Here we present the case for implicit-Euler, and the remaining schemes are addressed
in appendices.
Suppose we divide the time domain [to, tf ] into L equal intervals of size At = (tf - to)/L. In
this way, we substitute the original time function in C 0 {[to, tf ]; ]JN} by y
IRNx(L+1),
where ye E
JftN
f y
11
,
YL} E
for f = 0, ... , L. We note here that the time superscript f should not
be confused with the conjugate gradient iterate k. Henceforth, when both indecies are of interest,
we will denote the variable as yke; otherwise, one of the indecies will be implied.
Having defined the state variables as above, we can similarly define the control and adjoint
variables as u E
f = 1, ...
, L:
b(E
=
JMxL
and A E
respectively. These are indexed by uf and At for
JNxL,
note that f = 0 does not apply for these variables, since they have no impact on the
initial state of the system. We also define YR E
]NxL
and YT E RN in a similar fashion. Finally,
the matrices WT, WR, and WU are defined as before with the same assumptions: WU symmetric
positive definite and WT and WR symmetric positive semi-definite, at least one of which strictly
definite. Having defined the variables as such, we may restate the problem in the discretized form:
min
JE [y(u)
UEUE
YV
y 0 = yo,l
Ay + B u +
0),
68
P 1
f= 1, ... , L ;
(L QP E )
where the discretized cost JE[y(u)] is
L
JE[y(u)]
1 -
T(yL -
2
YT) + 2T)TIR (
t=1
-
yI)TWR(y
-
yI) + u
dul At.
(3.35)
Remark 5 It should be noted that the cost functional need not take the exact form as above. For
example, the sum term can be replaced by
y[ - y)TWR(yt
-
ye) + UjWUU
At
+ ±
[(yL _ Y)TWR(yL
--
) + U LTWuuLl
+
This cost has a slightly different interpretation: the cost penalizes the fth value of the variables in
the time interval [(1 - 1/2)eAt, (1 + 1/2)eAt], as opposed to [(f - 1)At, LAt] as in (3.35). Though
this is negligible in terms of accuracy (implicit Euler being O(At) accurate), it is important in the
stability of the method, as will become apparent below. The point to be made is that different
discretization choices will impact the appropriate form of the the operators used to solve the problem
in the TRCG algorithm. F-1
From problem statement (LQPE), the ordinary differential equations representing stationary
conditions for optimality can be expressed as (optimality is implied (k
*) for equations (3.36-
3.40):
S=
Ay' + But + Fe,
(I - AtA)T AL
At
(3.36)
Y0 Yo;
)
=1, ... , L;
WT(YL - YT) + WR(YL - yR)At;
= ATA + WR(yW - Y),
Ut=
-Wf BTAt,
(3.37)
(3.38)
f= L - 1,..., 1;
(3.39)
f = 1,..., L - 1.
(3.40)
Although most of the equations above could easily be deduced from standard application of the
Euler scheme, the form of equation (3.38) may seem unexpected. The final conditions on A is where
care must be taken so that the appropriate symmetric positive definite quality of the discrete analog
of g is preserved. The correct form of these conditions can be arrived at by direct manipulation of
the discrete augmented cost functional.
69
3.8.2
Discretization of Solution Procedure
The following definitions will be useful in developing the time-discrete form of the TRCG algorithm:
Definition 2 Let Ai E
RNxL
and y' E
]RNx(L+1)
(I - AtA)T A= WTYT
(I - AtA)T A'
=
0
Y,
-
be defined by:
AtWRYL;
=
A'+' - WRyAL
- -
-- ) 1;
= Yo
(I - AtA)y'= y'-1 - (QA
+F ))At,
f= 1,..., L.
The the time coupling of the above is such that the operations can be immediately and uniquely
carried out for any set of inputs {yo, YR, YT}. E
Definition 3 Given q C
I[?NxL
]RNxL,
let the operators
'RE : IRNxL
_, JNx(L+1) and
gE
: RNxL
_
be defined by:
(I-
(3.41)
AtA)T4\L = (WT + AtWR)qL
(I - AtA)T A'
++
(3.42)
WRq At,
(3.43)
0
(REq)o
(I - AtA) (REq)q'e
(7ZEq/'
1
-
(3.44)
QA'IAt,
(3.45)
(gEq)'=q - (REq)',
Again, the time coupling is such that the above operations can be immediately and uniquely
carried out for any given q. 0
Definition 4 Given v E
lNx(L+1)
and w E
jINx(L+1),
let the inner-product ((-, -))E be defined
as:
((vw))E = vLTWrwL +
TWRW
At-
f=1
Defining the norm IIVHIE =
((v, v))2,
and given the assumptions of WT and WR, it is a simple
matter to show that for all such (v,w): ((V,w))E = ((w,v))E;
|avHl = IaIJ|vIHE, V a C R. D
70
IVHE > 0; IIIE = 0 if V
0;
The above definitions can be seen as discrete analogies to the components of the TRCG algorithm: the inhomogeneous parts, the operators, and the inner-product space. The first important
result is in regard to definitions 3 and 4:
Proposition 2 Given the definitions above, the operator GE is Symmetric positive definite in the
space defined by ((-, -))E-
Proof. From Definition 3 we can multiply expressions (3.42) and (3.44) by -(REq)f
respectively, to obtain for £
(I
-A'T
and for
and A4,
1,... , (L - 1)
=
AtA) (REq)'
-
=E
a(RE
-(RgT
(3.46)
= 1,... ,L
Aj T (I
whose sum results, for any
RE
-
e
AtA) (REq)'
1,...
-
AHT
H TtT
-
H
(3.47)
- 1), in
, (L
+TW(Eq
HT
T
HQA
At
(3.48)
Performing a sum over all f = 1,. . ., (L - 1) of equation (3.48), we have
L-1
AL
7
'0
_ 4j(R(q)O
T (N~
T (RQL-1 ZEq)
St=WREQ
+
T
L-1
A
-
f=1
TQAf
f=1
We now use the facts that (REq)0 = 0 and, from equation (3.47) at L,
AHT(REq)L-1
_L
T
(I - AtA) (q)L
LT
LAt
to rewrite the above equation as
L-1
AT
(I
-
AtA) (REq)L
L
q
WR(7Eq)At
H
e=1
f=1
71
H'
At.
Finally, since ALH T (I - AtA) = qLT(WT + AtWR) from final condition (3.41) and the definition of
(('))E, we have
L
((q, RE q))E
As a result Vp E
L
TWR(REqf AtAfT
q=
LTWT(REq) L +
I? NxL,
(3.49)
AAt<0
P 0 0, ((p gEP))E = ((pP))E - ((q, REq))E > 0; that is 9E is symmetric
positive definite in ((., -)). 0
The second important result is in regard to the dual of the problem:
Proposition 3 The dual of problem (LQPE) can be stated as
IE
(
=
q*) max
-
((q, 9Eq))E
-
((q,
+ CE
yI))E
(LQPqE)
where
L
CE = -yT
2
(YT
IL
Y
Y, TWR(y,
1
2 Ay
YJ)At +
1: AT F'At.
(3.50)
f=1
Proof. Similarly to before, we can combine expressions in Definitions 2 and 3 to obtain
S
H=
QA'At =
I
-((q,
yI))E + A Ty 0 +
5 A'TF'At,
(3.51)
=1
and
L
A, QA'At
From
YLTWTYL +
L
A j QVAAt
y TWy At + ATy
L
+
±
AITFeZAt.
~TQA' At
5T QA
A At + S: A47QA'FIAt+2A
7
72
(3.52)
we can substitute into the EL' AeQAAt term of the discrete form of the dual as a function of A
{1L-1
maximize
ff+
IE(A) =
_Ae
+1_g
+ AT_A
_
T W
A_
_
+ ATf A
A
f=1
i
1
L-1
At
A
T
TL
~ t
1 L A T~~
=-
2LTW-AL _ ALTy + ATy +
A FJATJ,
f=1
subject to
A E JNxL
(LQPdualE)
to arrive at expression (LQPqE). E
3.9
Detailed TRCG Algorithm
The algorithm below is implemented in the Implicit-Euler scheme.
Extensions to other time-
discretizations are trivial.
Algorithm TRCG
Calculate (y', A-)
Set k
Set q0 = 0, ro = yi, and do = ro;
0;
=
by Definition 2;
Set AE(q 0 ) > cost tolerance;
while (AE(qk) > cost tolerance) do
k = k + 1;
a=
(rk,
rk-))E/((dk-
;
k
qk _ k-1 ±kdk-1;
k-1 _ ak Edk-1.
rk ._
k
dk
((rk, rk) )E
_ rk
(rk-1 rk-1))E;
+ fkk-1;
Calculate (u(qk), y(qk)) = (-W-lBT(AH + AI), 7ZEqk + yJ) by Definition 3;
Calculate AE(qk)
JE[u(qk), y(qk)]
-
IE[qk] by equations (3.35) and (LQPqE);
end do.
73
3.10
Numerical Properties of Method
3.10.1
Storage Requirements
The storage required by the TRCG algorithm proposed above is O(NL) for the full terminalregulator problem. This number arises from the fact that state variable iterates
qk,
rk, and dk must
be stored for all time steps f = 1,... , L (note that the terminal-only problem requires only the
terminal condition, resulting in O(N) storage).
This requirement can be alleviated at a computational cost. Rather than storing state variable
data for all time steps, one might prefer to store the control variables, which are typically of much
lower dimension, and only initial and final conditions for the state and adjoint. If this is done,
0(mL + N) storage would be required, which can be significantly less than 0(NL) for m < N and
L large. The disadvantage of this approach is that every time a state or adjoint variable is required
(such as the conjugate gradient iterates), a full integration from initial (state) or final (adjoint)
conditions must be performed. This can be very computationally expensive, and so we look for a
compromise between these extremes.
Partitioning the time domain into NL segments and storing the state variables only at time
steps
= 1,..., NL results in O(mL + NNL + NL/NL) required storage. The mL cost comes
from storage of control variables u for all time, NNL is required for state variables at each
, and
NL/NL is required for the storage of data within the current working partition. Thus, partition
act as initial or final conditions, from which, along with u which
state or adjoint data at every
has been stored for all time, state or adjoint variables can be calculated at any f. In the context of
the conjugate gradient iterations, the variables are calculated as follows:
Inhomogeneous parts:
Calculate A) by:
(I - AtA)T AL
= WTYT -
AtWRyL;
(I - AtA)T A' = A'-- - WRy'At,
= L - 1,...,1;
storing only uf= -W-BTA.
Define uro
:= U1, UdO
:= uJ, ro := yo, d:
yo.
Calculate:
do = yo;
(I - AtA) d' = d'-l + (Bu'o + F')At,
f = 1,... ,L;
74
(I - AtA)T
d = (WT +
(I- AtA)T At
AtWR)dL;
f = L-
A'6++WRd'At,
1,...,11;
Rdo = 0;
(I-AtA)Rd' =7Rd'- + (Bu'dO +F)At,
storing only uy7Zd
0
f= 1,... IL;
W-lBTAe
j-1 1'I1 {r}NL
U
do' {do}NL
=
{do}N,
and {RdO}NL
Homogeneous parts: (for k = 1, 2, 3,...)
Calculate:
ka=_.
kk -1)E
k -1,
((rkl
rkl))E/((dkl
E k-1))
gEdk
N),
fk = ((rk, rk))E ((rk-l rk-1)
by:
(rk-1)o
(rk1)0;
(I - AtA) (rk)= (rk) l +- (Bus + F )At,
f
1,... L;
(dk-1)6o = (dk-1)60;
(I - AtA) (dk)
(Rdk-l) O
-
r = r6~
{Rdk},N 1
ef k
Ur
q-
--
__
±
£tk1-C
rk-
(-dk)-l + (Bu dk)At,
f - 1,...,
for
=1,
...
, L;
Rdk}NL and:
Rd6~) o
ak (d6_
d6 = r7+#3kd_,
qk
+ (Butk + F )At,
= (Rdk-1)60;
(I - AtA) (Rdk)f
storing only
(dk)t-1
=1..N
=1,...,NL
dk,
Uk1
k-dk-
U
R
UFk = Ufk1 +pakF,
Ferk
PF rk- 1 +±akF£
k1
Ft
=F'
d rk + /kFt_.
dk
Performing the operations above in the TRCG algorithm allows for the storage proposed above.
We still need to determine, however, the number of time partitions NL to be used. We chose here
simply to optimize the storage O(mL + NNL + NL/NL) without regard to computational cost. In
that case, NL = V/
and total storage becomes O(mL + Nv L/).
75
3.10.2
Conditioning of g Operator
The conditioning of the ! operator is crucial to the efficiency of the TRCG algorithm, and so it is
of interest to see how it depends on the problem data. For simplicity, we consider here the timeinvariant terminal problems, with obvious extensions to time-varying regulator problems. Since the
operator is defined as
9q = q - R,
we first consider the action of R. As stated previously, Rq is defined as: given any q E JRN,
AT AH,
-AH
AH(tf)= WTq,
(7Zq) = A(Rq) - QAH,
(7Zq) (to) = 0.
One may then define a transition matrix function 4(t, to) by
<1>(t, to) =
A4(t, to)
)(to, to) = IN,
resulting in 1(t, to) = eA(-to) for the time-invariant case. Then
(Rq)(tf)
eA(tf -T)QAH (T) dT.
-
Wto
We also note that, for symmetric A,
AH(T)
-
eA(tf-) WTq,
and, as a result,
(_q)
(tf
_
A(t--)
A(t-T W
d,
to
and, finally, we find that we can express 9 in matrix form:
S= IN +
eA(tf-r) QeA(tf--)WT dT.
(3.53)
We therefore see that the conditioning of the operator 9 is directly tied to that of matrices A and
Q.
In particular, we make a few observations which will be useful in future sections. As the smallest
76
eigenvalue X 7 ,o of R approaches zero, Ag,o will approach unity. Thus, if the largest eigenvalue
AR,N
is bounded from above, then so is the condition number K(9) =
AR,N
AR,N/AR,o.
Alternatively, if
is unbounded, then so if n(g).
In the sections that follow we present the Finite Element formulation of the problem. It is a well
known result that as the discretization diameter h approaches zero, the stiffness matrix A becomes
ill-conditioned, since K(A) = O(h- 2 ) for linear elements. The question then arises as to how this
will affect the conditioning of g. We mention here that K(G) -
C(WU, WT, WR) as h
-*
0, where C
is a constant that depends on (Wu, WT, WR) but not on h. The reason for this is that as h
-+
0, the
basis of the FEM formulation cannot remain linearly independent, thus the smallest eigenvalues of
A approach zero. Since the largest eigenvalues are still bounded, then the relations and arguments
above indicate that the conditioning of 9 will remain bounded. This is an important property of
the method, because it shows that it is suitable for use in FEM formulations of problems governed
by partial differential equations. Section 3.11.7 below presents a simple example that exhibits this
property: as long as the FEM systems can be solved for each time step, the TRCG algorithm does
not deteriorate as h
3.11
--
0.
Formulation for Partial Differential Equations
So far we have formulated the TRCG algorithm for optimal control problems of systems governed
by ordinary differential equations. Such systems are often encountered in control problems in which
lumped parameter models are used. Mechanical, fluidic, thermal, and electrical problems (among
others) may readily and efficiently be approximated by such ODE's.
However, this work is concerned with the application of optimal control to systems that must
be more more carefully described by the governing partial differential equations. Heat and fluid
dynamical systems are often too complex to be modeled by lumped parameters. As a result, an
appropriate statement of the problem must be made in the context of the fundamental governing
partial differential equations.
Our approach is to use the Finite Element Method (FEM) for discretizing the spatial variable
that represents the state of the system. We show that the impact of the new statement of the problem is minimal in the TRCG algorithm, allowing us to use it virtually unchanged for the solution
of (appropriately stated) optimal control problems governed by partial differential equations.
3.11.1
Problem Statement
We are now interested in addressing problems of the form: given the spatial variable $x \in \Omega$, solve for the temperature $\hat{y} \in Y$ such that

$$\hat{y}(t_0) = \hat{y}_0 \quad \text{in } \Omega, \qquad (3.54)$$
$$\frac{\partial \hat{y}}{\partial t} = \nabla \cdot \left( \alpha(x)\nabla \hat{y} \right) + \sum_{m=1}^{M} u_m(x) \quad \text{in } \Omega \times (t_0,t_f), \qquad (3.55)$$
$$\nabla \hat{y} \cdot \hat{n} = 0 \quad \text{on } \Gamma_N \times (t_0,t_f), \qquad (3.56)$$
$$\hat{y} = 0 \quad \text{on } \Gamma_D \times (t_0,t_f), \qquad (3.57)$$

where we note that we have arbitrarily imposed a zero temperature at the Dirichlet boundaries. We use this case in the presentation of the following section for convenience. However, we note that the problem to be addressed will have inhomogeneous boundary temperatures $\hat{y}_D$, which are handled in a standard way for Finite Element Methods. Each element $u_m$ of the control vector $u \in U$ is applied to a sub-domain $\Omega_m$ of $\Omega$.

3.11.2
Weak Formulation and Galerkin Approximation
We treat the above set of equations in the standard Finite Element context. As such, we begin with a variational statement of the problem. First, recall the definition of the spaces

$$L^p(\Omega) = \left\{ v : \Omega \to \mathbb{R} \;\Big|\; \int_\Omega |v|^p\, d\Omega < +\infty \right\}$$

with associated norm

$$\|v\|_{L^p(\Omega)} = \left( \int_\Omega |v|^p\, d\Omega \right)^{1/p}.$$

Also define the Sobolev space $H^k$ ($k \geq 0$) and $H_0^k$ as

$$H^k(\Omega) = \left\{ v \in L^2(\Omega) \;\big|\; D^\alpha v \in L^2(\Omega),\ \forall\, |\alpha| \leq k \right\},$$
$$H_0^k = \left\{ v \in H^k(\Omega) \;\big|\; v|_{\Gamma_D} = 0 \right\},$$

and the function space

$$X = \left\{ v \in H^1(\Omega) \;\big|\; v|_{\Gamma_D} = \hat{y}_D \right\}.$$
Now we may state the weak formulation of the problem: find $\hat{y} : \Omega \times (t_0,t_f) \to \mathbb{R}$, $\hat{y}(t,\cdot) \in X$, such that

$$\hat{y}(t_0) = \hat{y}_0, \qquad (3.58)$$
$$\left( \frac{\partial \hat{y}}{\partial t}(t), v \right) = a(\hat{y}(t), v) + \sum_{m=1}^{M} b(u_m(t), v) \qquad \forall\, v \in H_0^1(\Omega), \qquad (3.59)$$

where

$$(w(t), v) = \int_\Omega w(t,x)\, v(x)\, d\Omega,$$
$$a(w(t), v) = -\int_\Omega \alpha(x)\, \nabla w(t,x) \cdot \nabla v(x)\, d\Omega,$$
$$b(u_m(t), v) = \int_{\Omega_m} u_m(t,x)\, v(x)\, d\Omega.$$

The statement of the problem in weak form (3.58,3.59) is still continuous in space and time. To begin the discretization of the problem, we use the Galerkin approximation: find $y_h : \Omega \times (t_0,t_f) \to \mathbb{R}$, $y_h(t,\cdot) \in X_h \subset X$, such that

$$y_h(t_0) = y_{0,h}, \qquad (3.60)$$
$$\left( \frac{\partial y_h}{\partial t}(t), v_h \right) = a(y_h(t), v_h) + \sum_{m=1}^{M} b(u_m(t), v_h) \qquad \forall\, v_h \in H_0^1(\Omega), \qquad (3.61)$$

where $y_{0,h} \in X_h$ is chosen appropriately to approximate the initial conditions (for example, it may be a solution of a boundary-value problem with appropriate initial boundary conditions, or a projection of arbitrary initial conditions $\hat{y}_0$).
3.11.3
Finite Element Approximation
Now we may introduce the triangulation $\mathcal{T}_h$ of $\Omega$, which represents a set of triangles $T_h$ such that

$$\bar{\Omega} = \bigcup_{T_h \in \mathcal{T}_h} \bar{T}_h \qquad \text{and} \qquad T_h \cap T_h' = \emptyset \quad \text{if } T_h \neq T_h'$$

($\bar{\Omega}$ and $\bar{T}_h$ represent the closure of $\Omega$ and $T_h$, respectively). Now we use the following space of degree-$p$ elements in the approximation of the problem:

$$X_h = \left\{ v \in X \;\big|\; v|_{T_h} \in \mathbb{P}_p(T_h),\ \forall\, T_h \in \mathcal{T}_h \right\}.$$

Here we make use of $p = 1$ (linear) elements, though the extension to higher orders is trivial. Now we are ready to discretize the problem by setting

$$y_h(t,x) = \sum_{j=1}^{N} y_j(t)\, \phi_j(x),$$

where $\{\phi_j \mid j = 1, \ldots, N\}$ denotes the basis of $X_h$. To simplify notation, we redefine $y$ as the vector in $\mathbb{R}^N$ whose elements are the coefficients $y_j$.
3.11.4
The Governing Equations
Having thus defined the finite element spaces, we can carry out the operations of the weak formulation to obtain

$$y(t_0) = y_0, \qquad (3.62)$$
$$M \frac{dy}{dt} = A y + B u + F, \qquad (3.63)$$

where, for every $t \in [t_0, t_f]$,

$$M_{ij} := (\phi_j, \phi_i), \qquad A_{ij} := a(\phi_j, \phi_i), \qquad B_{im} := b(u_m, \phi_i), \qquad i,j = 1, \ldots, N, \quad m = 1, \ldots, M,$$

are definitions of the mass, stiffness, and control matrices, respectively, and $F(t)$ incorporates the inhomogeneous boundary conditions.

We note that equations (3.62,3.63) are of a form similar to (3.2,3.3), for which the TRCG method was presented above. The presence of the mass matrix $M$ in the above equations will be shown not to affect the algorithm in a fundamental way, due to the SPD nature of this matrix.
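To make the point concrete, the following sketch (ours, with assumed names; not thesis code) advances the semi-discrete system (3.62,3.63) by backward Euler. Each step requires only one sparse solve with the constant matrix $M - \Delta t\, A$:

```python
# Sketch: backward-Euler integration of M dy/dt = A y + B u + F.
# M, A are assumed to be scipy.sparse matrices; u is a callable t -> R^M.
import numpy as np
from scipy.sparse.linalg import splu

def integrate(M, A, B, F, u, y0, t0, tf, L):
    dt = (tf - t0) / L
    lu = splu((M - dt * A).tocsc())       # factor once; reused at every step
    Y = [y0]
    for l in range(L):
        t = t0 + (l + 1) * dt
        Y.append(lu.solve(M @ Y[-1] + dt * (B @ u(t) + F)))
    return np.array(Y)                    # state history, shape (L+1, N)
```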
3.11.5
The Cost Functional
An appropriate cost functional must be defined for partial differential equations. From our treatment above, it is apparent that a quadratic form may be advantageous:

$$J(\hat{y}, \hat{y}_T, \hat{y}_R, u) = \frac{1}{2}\int_\Omega w_T(x)\big( \hat{y}(t_f) - \hat{y}_T \big)^2\, d\Omega + \frac{1}{2}\int_{t_0}^{t_f}\left[ \int_\Omega w_R(x)\big( \hat{y} - \hat{y}_R \big)^2\, d\Omega + u^T W_U u \right] dt, \qquad (3.64)$$

where $w_T(x)$ and $w_R(x)$ are weight functions for the terminal and regulator deviations over the domain $\Omega$. For example, if $w_T(x_1) > w_T(x_2)$, then deviations from desired terminal behavior in a sub-region $\Omega_1 \ni x_1$ will be more heavily penalized than deviations from desired terminal behavior in $\Omega_2 \ni x_2$.

Now, using an interpolant $\mathcal{I}$, we define the following approximations to the desired terminal and regulator behavior:

$$\hat{y}_{T,h}(x) = \mathcal{I}\hat{y}_T(x) = \sum_{j=1}^{N} y_{T,j}\, \phi_j(x),$$
$$\hat{y}_{R,h}(t,x) = \mathcal{I}\hat{y}_R(t,x) = \sum_{j=1}^{N} y_{R,j}(t)\, \phi_j(x),$$

where we also define the vectors $y_T$ and $y_R(t)$ as containing the elements $y_{T,j}$ and $y_{R,j}$, respectively. We may apply the Galerkin approximation to the cost functional:

$$J(y_h, \hat{y}_{T,h}, \hat{y}_{R,h}, u) = J(y, y_T, y_R, u) = \frac{1}{2}\big( y(t_f) - y_T \big)^T W_T \big( y(t_f) - y_T \big) + \frac{1}{2}\int_{t_0}^{t_f}\left[ (y - y_R)^T W_R (y - y_R) + u^T W_U u \right] dt, \qquad (3.65)$$

where

$$W_{T,ij} = (\phi_i, w_T\, \phi_j), \qquad W_{R,ij} = (\phi_i, w_R\, \phi_j), \qquad i,j = 1, \ldots, N.$$
The problem can finally be stated in the familiar form for ordinary differential equations:

$$\min_{u \in U}\; J[y(u)] \qquad \text{s.t.} \qquad M\dot{y} = Ay + Bu + F, \qquad y(t_0) = y_0. \qquad (LQP_{fem})$$
3.11.6
Optimality Conditions
By introducing Lagrange multipliers $\lambda \in \{(t_0,t_f) \times \mathbb{R}^N\}$, we can derive the optimality conditions for the FEM formulation of the problem:

$$M\dot{y}^* = Ay^* + Bu^* + F, \qquad y^*(t_0) = y_0,$$
$$-M\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R), \qquad \lambda^*(t_f) = (M^{-1}W_T)\big( y^*(t_f) - y_T \big),$$
$$0 = W_U u^* + B^T\lambda^*,$$

which, except for the appearance of the mass matrix $M$, is very similar to what has been presented before. Therefore, the question is how this matrix may affect the proposed TRCG algorithm.

Fortunately, the algorithm is unaffected by this matrix. Since $M$ is symmetric and nonsingular (it is SPD), we quickly obtain the appropriate $\mathcal{R}$ operator (with homogeneous and inhomogeneous parts defined as before):

$$\lambda_H^T M (\mathcal{R}q)^{\boldsymbol{\cdot}} = \lambda_H^T A (\mathcal{R}q) - \lambda_H^T Q \lambda_H,$$
$$\dot{\lambda}_H^T M (\mathcal{R}q) = -\lambda_H^T A (\mathcal{R}q) - q^T W_R (\mathcal{R}q),$$
$$\frac{d}{dt}\left( \lambda_H^T M (\mathcal{R}q) \right) = -\lambda_H^T Q \lambda_H - q^T W_R (\mathcal{R}q),$$

which, by Green's formula, yields

$$((q, \mathcal{R}q)) := -\int_{t_0}^{t_f} \lambda_H^T Q \lambda_H\, dt = q(t_f)^T W_T (\mathcal{R}q)(t_f) + \int_{t_0}^{t_f} q^T W_R (\mathcal{R}q)\, dt. \qquad (3.66)$$

Now the definition of $\mathcal{R}$ is the backward and forward integrations with $M$ included in the above equations. In this case, the operation above uses the same inner product as had been defined before. As such, we see that the algorithm has not changed fundamentally, as long as the spatial discretization of the state and adjoint variables is consistent with the new FEM formulation and $\mathcal{R}$ and $y_I$ are calculated accordingly.

Defining $\mathcal{R}$ as above, we still have $\mathcal{G}p = p - \mathcal{R}p$ and the same inner product as equation (3.13). Consequently, all the proofs hold, and the time-discrete case is a straightforward extension of the method presented in section 3.8. With the fully discretized statement, the TRCG algorithm can be applied to problems governed by partial differential equations.
3.11.7
Effect on the Conditioning of $\mathcal{G}$ - A One-Dimensional Example
Since we are taking the FEM approach to problems governed by partial differential equations, it is important to understand how this formulation affects the TRCG algorithm. In particular, as pointed out in section 3.10.2, $\mathcal{G}$ should remain well-conditioned for discretized problems as $h \to 0$. In this section we test the heuristic arguments of section 3.10.2.

Here we show an example in which this property is observed to hold. Take the terminal problem governed by the partial differential equation

$$\frac{\partial \hat{y}}{\partial t} = a \frac{\partial^2 \hat{y}}{\partial x^2} + b(u,x),$$

where $\hat{y}(x,t)$ is defined on a one-dimensional domain of length 1 for all time $(t_0,t_f)$, $x \in \Omega \subset \mathbb{R}$. Apply Dirichlet boundary conditions

$$\hat{y}(x=0) = 0, \qquad \hat{y}(x=1) = 0,$$

arbitrary initial conditions

$$\hat{y}(x, t_0) = \hat{y}_0,$$

and define $b(u,x)$ as a mapping of a single controller $u$ onto the entire domain $\Omega$. This is not a realistic example, since control problems rarely allow for point control at every $x$ of the domain, but by exciting every mode we obtain a simple form of the problem for analysis of $\mathcal{G}$ which bounds the more realistic case.

Having defined the problem as such, we choose a simple triangulation: divide the spatial domain into $N$ equal segments of length $h$. Assume we would like to drive the system to a zero final state ($y_T = 0$) and that $W_U = 1$, so that $Q = BB^T$. Then the spatially discrete variable is defined as $y \in \mathbb{R}^N \times (t_0,t_f)$ and the governing equations become

$$M\dot{y} = Ay - Q\lambda, \qquad y(t_0) = y_0; \qquad (3.67)$$
$$M\dot{\lambda} = -A^T\lambda, \qquad \lambda(t_f) = W_T\, y(t_f), \qquad (3.68)$$

where
$$M = \frac{h}{6}\begin{bmatrix} 4 & 1 & & 0 \\ 1 & 4 & \ddots & \\ & \ddots & \ddots & 1 \\ 0 & & 1 & 4 \end{bmatrix}, \qquad (3.69)$$

$$A = -\frac{a}{h}\begin{bmatrix} 2 & -1 & & 0 \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & 2 \end{bmatrix}, \qquad (3.70)$$

$$B = h\begin{bmatrix} 1/2 \\ 1 \\ \vdots \\ 1 \\ 1/2 \end{bmatrix}, \qquad (3.71)$$

are matrices of size $N \times N$, $N \times N$, and $N \times 1$, respectively.
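For concreteness, a short assembly sketch (our code; the function name and the unit-interval normalization are our own choices) producing the three matrices above with SciPy:

```python
# Sketch: assemble the 1-D linear-element matrices (3.69)-(3.71).
import numpy as np
from scipy.sparse import diags

def fem_1d(N, a=1.0):
    h = 1.0 / N
    M = (h / 6.0) * diags([1.0, 4.0, 1.0], offsets=[-1, 0, 1], shape=(N, N))
    A = -(a / h) * diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(N, N))
    B = h * np.ones((N, 1))
    B[0, 0] = B[-1, 0] = h / 2.0      # half-weights at the end nodes, per (3.71)
    return M.tocsc(), A.tocsc(), B
```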
Now define the following: $z_H = M y_H$ and $\zeta_H = M\lambda_H$, from which we obtain, from the homogeneous equations,

$$\dot{z}_H = AM^{-1} z_H - QM^{-1}\zeta_H, \qquad z_H(t_0) = 0; \qquad (3.72)$$
$$\dot{\zeta}_H = -A^T M^{-1}\zeta_H, \qquad \zeta_H(t_f) = M W_T\, q. \qquad (3.73)$$

Following the steps of section 3.10.2 (and discretizing in time), we obtain
$$\mathcal{G}q = \left[\, I_N + M^{-1}\int_{t_0}^{t_f} e^{AM^{-1}(t_f-\tau)}\, Q\, M^{-1}\, e^{A^T M^{-1}(t_f-\tau)}\, M W_T\, d\tau \right] q. \qquad (3.74)$$
Therefore, $\mathcal{G}$ can be calculated explicitly as a full matrix for this simple problem. Assuming the problem size is not too large for the desirable range of $h$, one can calculate the condition number by any available method. Here, we have used MATLAB for this simple calculation and have plotted the condition numbers of $\mathcal{R}$ and $\mathcal{G}$ in Figure 3-2 for different values of $W_T$.

[Figure 3-2: Condition number of $\mathcal{R}$ (top) and $\mathcal{G}$ (bottom) versus $N$ for the sample one-dimensional problem, for $W_T = 10^2, 10^3, 10^4$.]

Our goal is to observe the behavior of the condition number of $\mathcal{G}$ as $h \to 0$, that is, as $N \to \infty$. As expected, we see that as $N \to \infty$, $\mathcal{R}$ quickly becomes ill-conditioned (exceeding machine precision for $N > 10$) due to the inevitable ill-conditioning of $A$ caused by eigenvalues which approach zero. However, the condition number of $\mathcal{G}$ approaches a constant that depends only on $W_T$. As a result, conjugate gradient iterations that operate on $\mathcal{G}$ will not be degraded as $h \to 0$, making the TRCG algorithm suitable for PDE problems in the context of the FEM formulation.
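The MATLAB computation referred to above is easy to reproduce. The sketch below (ours; it reuses the hypothetical fem_1d from the earlier sketch, and the quadrature choices are our own) evaluates (3.74) directly and prints $\kappa(\mathcal{R})$ and $\kappa(\mathcal{G})$ as $N$ grows:

```python
# Sketch: condition numbers of R and G = I - R from the matrix form (3.74),
# terminal problem with Q = B B^T and W_T = w_T * I. Symmetric A assumed.
import numpy as np
from scipy.linalg import expm

def cond_R_G(N, w_T, t0=0.0, tf=1.0, nq=200):
    M, A, B = fem_1d(N)                   # from the assembly sketch above
    M, A = M.toarray(), A.toarray()
    Q, Minv = B @ B.T, np.linalg.inv(M)
    taus = np.linspace(t0, tf, nq + 1)
    dt = taus[1] - taus[0]
    R = np.zeros((N, N))
    for i, t in enumerate(taus):          # trapezoidal rule in tau
        w = dt * (0.5 if i in (0, nq) else 1.0)
        E = expm(A @ Minv * (tf - t))
        R -= w * (Minv @ E @ Q @ Minv @ E @ M) * w_T
    return np.linalg.cond(R), np.linalg.cond(np.eye(N) - R)

for N in (5, 10, 20, 40):
    kR, kG = cond_R_G(N, w_T=1e2)
    print(f"N={N:3d}  cond(R)={kR:.3e}  cond(G)={kG:.3e}")
```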
The case that we have considered above assumes that all points of the state domain are directly controlled by $u$. This is obviously not the case in general. For the realistic cases in which only a subset of $\Omega$ is under the influence of a control variable, the eigenvalues of $\mathcal{G}$ are bounded from below and above by the eigenvalues of the operator above. Thus such a realistic problem can only be better conditioned than the problem presented above. As a result, these problems are also well-conditioned as $h \to 0$.
[Figure 3-3: Diagram of sample problem domain (7 cm x 3 cm), showing sub-domains $\Omega_1$ through $\Omega_5$, the reaction surface $\Gamma_{RS}$, Neumann boundaries $\Gamma_N$, Dirichlet boundaries $\Gamma_D$, and the heater sub-domains for controls $u_1$ and $u_2$.]
3.12
Example Problem: Linear, Two Dimensional Heat Transfer
We now return to the original problem posed at the beginning of this chapter. We are interested in controlling the temperature on the "reaction surface" of Figure 3-1. Suppose further that the control mechanism is such that we are allowed to input heat through the material on certain parts of the domain. In particular, we propose to use three heaters on the bottom part of the domain.

Figure 3-3 is an illustration of the spatial domain $\Omega \subset \mathbb{R}^2$ of the problem to be addressed. Five sub-domains $\Omega_1$ through $\Omega_5$ (all $\subset \Omega$) have been used, with diffusivity values $a_1$ through $a_5$. Controller $u_1$ represents volumetric heat input into $\Omega_2$ and $\Omega_4$, while controller $u_2$ is the volumetric heat input into $\Omega_3$. Together, they form the vector $u \in U$, where $M = 2$. From here, heat propagates to the rest of the domain.

Surfaces denoted by $\Gamma_N$ (Neumann boundaries) are thermally insulated, while those denoted by $\Gamma_D$ (Dirichlet boundaries) are held at a fixed temperature of 300 K throughout the process. The surface denoted by $\Gamma_{RS}$ is the reaction surface, whose temperature we wish to control. We note here that this choice is arbitrary: any part of the state domain may be chosen for desired performance.
3.12.1
Problem Data
The problem to be solved is composed of the following data ($[a] = \mathrm{m^2/s}$):

$$a_1 = 10^{-4}, \qquad a_2 = a_3 = a_4 = 10^{-3}, \qquad a_5 = 10^{-2},$$

and

$$y_0 = 300\,\mathrm{K}, \quad t_0 = 0, \quad t_f = 5\,\mathrm{s}, \quad \hat{y}_{T,RS} = 500\,\mathrm{K}, \quad w_{T,RS} = 5\times 10^{6}\,/\mathrm{K}, \quad w_{R,RS} = 1\times 10^{5}\,/\mathrm{K}, \quad W_U = 1\,\mathrm{s/K},$$

where we note that $W_U = W_U I_M$, with $I_M$ the identity matrix of size $M$, and

$$\hat{y}_T(x) = \begin{cases} \hat{y}_{T,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise}, \end{cases} \qquad w_T(x) = \begin{cases} w_{T,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise}, \end{cases} \qquad w_R(x) = \begin{cases} w_{R,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise}. \end{cases}$$

The above definitions guarantee that we penalize only the deviations from the desired temperature at the reaction surface $\Gamma_{RS}$, with no penalty on any other part of the domain. This type of penalization is completely arbitrary and demonstrates the power of optimal control methods: we are able to arbitrarily specify the desired behavior of the system without regard to controllability issues.
Though we have specified $\hat{y}_T$ above, we have not yet specified $\hat{y}_R(t)$. It is taken here as a function of time that can be represented in the following manner:

$$\hat{y}_R(t,x) = \begin{cases} y_{R,RS}(t) & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise}, \end{cases}$$

where $y_{R,RS}(t)$ is a scalar function of time of the form

$$y_{R,RS}(t) = \begin{cases} (300 + 400\, t/t_f)\ \mathrm{K} & \text{if } t \leq t_f/2, \\ 500\ \mathrm{K} & \text{if } t > t_f/2, \end{cases}$$

as shown in Figure 3-4. In words, we would like the temperature at the reaction surface $\Gamma_{RS}$ to rise in an approximately linear fashion from 300 K at $t = 0$ to 500 K at $t = t_f/2$, then hold a steady value of 500 K until the end of the process at $t = t_f$.
[Figure 3-4: Time history of the desired regulator behavior $y_{R,RS}$ (desired temperature at the reaction surface).]

The problem discretization mesh that has been used is shown in Figure 3-5, with 200 equal-length time steps $\Delta t$. The resulting size of the problem is $N = 3960$, $L = 200$.

3.12.2
General Results
Figure 3-6 shows the solution to the problem posed above by way of the TRCG algorithm. The
time histories show that, for the parameters chosen for this problem, we are able to obtain good
agreement between our desired and resulting state histories.
An interesting qualitative observation that can be made of Figure 3-6 is that the controllers gradually shut off at the end of the process. This feature is consistent with physical intuition, since the system's thermal inertia maintains the desired temperature at the reaction surface over a period related to a system time constant.

However, we also note that to overcome the system's initial inertia, very large values of the control variables must be applied. This is typically seen in optimal control solutions, since a quadratic penalty cannot impose a hard limit on the value of the controllers. Suppose our controllers can only operate below values of 1500 K/s (shown in the figure by a horizontal line). Then the controllers would easily saturate at the beginning of the process. The resulting behavior would, of course, be different from that predicted, and certainly not optimal.
[Figure 3-5: Mesh used for problem discretization, N = 3960 (7 cm x 3 cm).]
[Figure 3-6: Optimal control and $\Gamma_{RS}$ temperature histories, $J^* = 2.37 \times 10^7$.]
[Figure 3-7: Residual value of the error $\|u_k - u^*\|$ for TRCG iterations.]
In addition, we might consider this problem with simple Joule heaters, which provide heat as a function of $i^2 R$ for some current $i$ and material resistance $R$. In this case, it would be impossible to extract heat from the system by use of the controllers $u$. But a positivity constraint cannot be enforced by the quadratic cost functional alone.

Both of the above issues point to the fact that realistic engineering problems require additional constraints to be imposed on the control (and possibly state) variables. In particular, "hard constraints" of the form

$$u_{min} \leq u \leq u_{max}$$

often need to be added to the statement of the problem since, as shown by the example here, they can easily be violated by optimization of the quadratic functional alone. The next chapter addresses the issue of how to incorporate the TRCG algorithm into a method for solving these more general problems.
3.12.3
Computational Performance
Figure 3-7 demonstrates the expected fast convergence of the TRCG algorithm for this particular problem. The total computational cost required is modest: roughly $2 \times 30$ times the cost of a single initial-value problem solution for this example.
[Figure 3-8: Structure of the stiffness matrix A (nz = 27112).]
Therefore, problems that are well-conditioned and that can be computationally solved as an initial-value problem (these are the problems that are of interest in the context of simulation) can be very appropriately addressed by the TRCG algorithm proposed in this chapter.

We note that the strength of the method lies in two properties: (a) its use of time integration of the system through the action of $\mathcal{G}$, without resorting to numerically inverting the problem, and (b) its ability to preserve the stability of the system, thus leading to a well-conditioned $\mathcal{G}$.

(a) Calculation of the $\mathcal{G}$ Operator

Figure 3-8 shows the structure of the stiffness matrix involved in solving the problem posed in this section. By requiring only a stable time integration of the system, the action of $\mathcal{G}$ can be seen as solving systems characterized by such matrices (actually $M - \Delta t A$, which has the same structure). This is evident from the time-discrete statement of the problem of section 3.8.

Assuming the system is well-conditioned, this process should be stable and relatively fast. It depends mainly on multiplication operations involving the nonzero elements of the above matrix. For systems governed by PDEs, the sparsity of such matrices makes these multiplications much faster than inversion: on the order of $N$ operations. We can therefore conclude that the TRCG algorithm is especially well-suited to optimal control problems of systems governed by partial differential equations, requiring for its action $O(NL)$ operations.
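A sketch of one action of $\mathcal{G}$ in this time-discrete setting (our code, assuming backward Euler and symmetric $A$; the names are ours) makes the $O(NL)$ claim explicit: one factorization of $M - \Delta t\, A$ followed by $2L$ sparse solves:

```python
# Sketch: G q = q - (R q)(tf) via one backward (adjoint) and one forward
# (state) sweep; every step is a solve with the factored matrix M - dt*A.
import numpy as np
from scipy.sparse.linalg import splu

def apply_G(q, M, A, Q, W_T, dt, L):
    lu = splu((M - dt * A).tocsc())       # factor once, reuse 2L times
    lam = W_T @ q                         # adjoint terminal condition
    lams = [lam]
    for _ in range(L):                    # backward: (M - dt A) lam_l = M lam_{l+1}
        lam = lu.solve(M @ lam)           # valid as written for symmetric A
        lams.append(lam)
    lams.reverse()
    y = np.zeros_like(q)                  # forward sweep from y(t0) = 0
    for l in range(L):                    # (M - dt A) y_{l+1} = M y_l - dt Q lam_{l+1}
        y = lu.solve(M @ y - dt * (Q @ lams[l + 1]))
    return q - y
```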
(b) Conditioning of the $\mathcal{G}$ Operator

For well-conditioned problems, we have stated above that the action of $\mathcal{G}$ should require a very manageable computational cost. However, these are per-iteration costs. We therefore mention here the second advantage of the method: the operator is stable and well-conditioned, so that the number of iterations required for convergence is also small.

We have observed that the conditioning of the $\mathcal{G}$ operator depends directly on the conditioning of $A$. For FEM discretizations of the Laplace operator, it is known that for a triangulation $\mathcal{T}_h$, $\kappa_A = O(h^{-2})$, where $\kappa_A$ is the condition number of the stiffness matrix $A$. In section 3.10.2, we observed that the eigenvalues of $\mathcal{R}$ are bounded by the eigenvalues of $A$. This means that $\mathcal{R}$ tends to become ill-conditioned quickly as $h \to 0$. However, we also observed the presence of the identity in the expression for $\mathcal{G}$. This guarantees that the smallest eigenvalues are bounded away from zero, and so the condition number of $\mathcal{G}$ is bounded and approaches a constant as $h \to 0$.
Chapter 4

Interior Point Methods - Linear, Constrained Problems
4.1
Motivation
Realistic engineering control problems rarely allow controllers to take arbitrary values without constraints. Take, for example, the case where electrical resistance heaters are used as controllers. Then, for current $i$ and resistance $R$, the heater input $u = i^2 R$ does not allow for negative values of $u$ into the system. In the context of the problem presented in chapter 3, the cheapest, simplest controllers are of this type.

Therefore it is of interest to impose certain types of constraints on the problem, of the form $u_j(t) \in c_j[u_j(t), t]$. Though $c$ can take different forms, typical engineering applications are characterized by two limits: a lowest allowable value (such as the positivity requirement mentioned above) and a highest allowable value (imposed by power saturation, safety issues, etc.). In short, we are interested in cases where $c = [u_{min}, u_{max}]$; that is, for $u(t) \in \mathbb{R}^M$,

$$u_{min} \leq u(t) \leq u_{max}. \qquad (4.1)$$

Though all of the results in this section immediately apply to cases in which $u_{min}$ and $u_{max}$ are functions of time, we have assumed, for simplicity of presentation, that these values are time-invariant. Furthermore, the results extend to more complicated constraint conditions (for example, more general polyhedra) with minimal modifications. Though more involved in form, these are rarer in engineering applications, so the presentation has been carried out with the simple interval case above for the sake of clarity. Finally, it must be observed that though the treatment here is concerned only with control constraints, the method can readily be extended to problems where constraints are imposed on state variables.
4.2
Problem Statement
The mathematical statement of the problem is similar to before, with the addition that the control variable observes constraint (4.1):

$$\min_{u \in U}\; J[y(u)] \qquad \text{s.t.} \qquad \dot{y} = Ay + Bu + F, \qquad y(t_0) = y_0, \qquad (LQP_u)$$
$$c_1 = u(t) - u_{min} \geq 0,$$
$$c_2 = u_{max} - u(t) \geq 0,$$

where $J[y(u)]$ takes the same form as (3.1). We define the variables as before: $y \in Y = C^0\{(t_0,t_f); \mathbb{R}^N\}$, $u \in U = \{(t_0,t_f) \times \mathbb{R}^M\}$. The feasible region $\mathcal{F} = \{(y,u) \mid \dot{y} = Ay + Bu + F,\ u_{min} \leq u(t) \leq u_{max}\}$ is assumed to be non-empty, which is typically the case when $u_{min} < u_{max}$.

Before proceeding, it will be convenient to define the following notation: given a vector $c \in \mathbb{R}^m$,

$$\mathrm{diag}(c) = \begin{bmatrix} c_{(1)} & 0 & \cdots & 0 \\ 0 & c_{(2)} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_{(m)} \end{bmatrix}. \qquad (4.2)$$

We also define $C_1(u(t)) = \mathrm{diag}(u(t) - u_{min})$ and $C_2(u(t)) = \mathrm{diag}(u_{max} - u(t))$, both functions of time.
4.3
Optimality Conditions for the Constrained LQP Problem
Following the standard treatment [3, 8, 27] of the problem, we introduce Lagrange multipliers $\lambda \in C^0\{(t_0,t_f); \mathbb{R}^N\}$ and $\nu_1, \nu_2 \in \{[t_0,t_f] \times \mathbb{R}^M\}$, so that an augmented cost functional may be defined as

$$J_a(u, y, \lambda, \nu_1, \nu_2) = J[u] + \int_{t_0}^{t_f}\left[ \lambda^T(\dot{y} - Ay - Bu - F) + \nu_1^T c_1 + \nu_2^T c_2 \right] dt. \qquad (4.3)$$

The problem then becomes finding stationarity conditions for (4.3). The derivation of these conditions can be found in standard texts [8] (optimal variables are denoted $u^*$, $y^*$, $\lambda^*$, $\nu_1^*$, and $\nu_2^*$):

$$\dot{y}^* = Ay^* + Bu^* + F, \qquad y^*(t_0) = y_0, \qquad (4.4)$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R), \qquad \lambda^*(t_f) = W_T\big( y^*(t_f) - y_T \big), \qquad (4.5)$$
$$0 = W_U u^* + B^T\lambda^* + \nabla_u c_1^*\,\nu_1^* + \nabla_u c_2^*\,\nu_2^*, \qquad (4.6)$$

where, for $j = 1, \ldots, M$,

$$\nu^*_{1(j)} \leq 0 \ \text{if } c^*_{1(j)} = 0; \quad \nu^*_{1(j)} = 0 \ \text{if } c^*_{1(j)} > 0; \quad \nu^*_{2(j)} \leq 0 \ \text{if } c^*_{2(j)} = 0; \quad \nu^*_{2(j)} = 0 \ \text{if } c^*_{2(j)} > 0. \qquad (4.7)$$

Here we observe that, in addition to the difficulties previously encountered with time coupling, we are now required to solve for $\nu_1^*$ and $\nu_2^*$. In addition, complementary slackness ($\nu^*_{q(j)} \leq 0$ for $c^*_{q(j)} = 0$) is information which is not known a priori. These problems can be overcome by the application of barrier function methods. These are based on imposing additional penalties on the cost functional such that the inequality constraints are transferred to the minimization statement of the problem. A Newton method can then be applied to linearize the resulting stationarity conditions. Nonlinear programming methods which approach the solution in such a manner from the interior of the feasible region $\mathcal{F}$ are known as Interior Point Methods (IPM). The following sections describe how we extend these methods to the optimal control problem.
4.4
Interior Point Methods (IPM) for Optimal Control
4.4.1
Logarithmic Barrier Functions
Barrier function methods have enjoyed significant success in the past decades in the interior-point solution of linear and quadratic programming problems. Those based on logarithmic functions have been especially successful and are particularly interesting for optimal control problems.

We can state the original problem $(LQP_u)$ in the context of barrier function methods in the form

$$\text{find} \quad u^*_{\mu^k} = \arg\min_u\, J^{\mu^k}[u] \qquad (4.8)$$
$$\text{subject to} \quad \dot{y} = f(y,u), \qquad y(t_0) = y_0, \qquad (4.9)$$

where the cost functional $J^{\mu^k}[u]$ is the original cost $J[u]$ augmented by a penalty (barrier function) on the deviation from the inequality constraints. This modification permits the removal of the explicit statement of the inequality constraints in (4.9). We must, of course, ensure that $u^*_{\mu^k} \to u^*$ given appropriate choices of $J^{\mu^k}$ and a sequence $\{\mu^k\}$.

Although modifications and improvements exist, the form of the barrier function used here is

$$J^\mu[u] = J[u] - \mu\sum_{q=1}^{2}\int_{t_0}^{t_f}\sum_{j=1}^{M}\ln\big( c_{q(j)} \big)\, dt, \qquad (4.10)$$

where $c_{q(j)}$ is the $j$th component of the vector $c_q$ for $q = 1, 2$ (note that $c_{q(j)} > 0$ is the imposed constraint for all $(q,j)$). Since $J^\mu$ is seen to be strictly convex, and since the feasible region $\mathcal{F}$ is compact, for any $\mu > 0$ there exists a unique minimizer $u^*(\mu)$ of $J^\mu[u]$ such that $\dot{y}^*(\mu) = Ay^*(\mu) + Bu^*(\mu) + F$ and $c^*(\mu) > 0$.
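As a small illustration, the barrier-augmented cost (4.10) for a time-discretized control history can be evaluated as in the sketch below (ours; `J_quadratic` is an assumed callable standing in for the original cost $J[u]$, and the trapezoidal time rule is our own choice):

```python
# Sketch: evaluate J^mu[u] = J[u] - mu * sum_q int ln(c_q) dt on a time grid.
import numpy as np

def barrier_cost(J_quadratic, U, u_min, u_max, mu, dt):
    c1, c2 = U - u_min, u_max - U          # constraint slacks, must be > 0
    if (c1 <= 0).any() or (c2 <= 0).any():
        return np.inf                      # outside the strict interior
    logs = np.log(c1).sum(axis=1) + np.log(c2).sum(axis=1)
    return J_quadratic(U) - mu * np.trapz(logs, dx=dt)
```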
The logarithmic barrier function method is based on the fact that, given a positive decreasing sequence $\{\mu^k\}$, $u^*_{\mu^k} \to u^*$ as $k \to \infty$. Defining $\mathcal{U}^* = \{w \in U \mid w = u^*\}$ as the set of minimizers of problem $(LQP_u)$, it is possible to show that $\big( \lim_{k\to\infty} u^*_{\mu^k} \big) \in \mathcal{U}^*$ for $0 < \mu^{k+1} < \mu^k$. Then

$$\lim_{k\to\infty} u^*_{\mu^k} = u^*$$

follows immediately from this result, since the convexity and continuity of $J$ and the compactness of $\mathcal{F}$ require that the set $\mathcal{U}^*$ be composed of a unique $u^*$. The proofs of these statements are given in the following section.
4.4.2
Proofs of Convergence
Proposition 4. Given a compact "level set" $S = \{(z,w) \in \mathcal{F} \mid J[w] \leq J[w_0]\}$, a positive decreasing sequence $\{\mu^k\}$, corresponding minimizing solutions $w^*_k = w^*(\mu^k)$ of problem (4.8,4.9), and a starting point $w_0$ (not necessarily a minimizer) corresponding to $k = 0$:

1. $J[w^*_{k+1}] \leq J[w^*_k]$ for $k = 0, 1, 2, \ldots$;
2. $w^*_k \in S$ for $k = 1, 2, 3, \ldots$;
3. there exists a subsequence $\{u^*_k\}$ of $\{w^*_k\}$ such that $\bar{u} = \lim_{k\to\infty} u^*_k \in S$;
4. $\bar{u}$ is feasible.

Proof. The following proofs are immediate extensions of those found in [29].

1. Since $w^*_k$ and $w^*_{k+1}$ are minimizers corresponding to $\mu^k$ and $\mu^{k+1}$, respectively,

$$J[w^*_k] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(w^*_k) \big)\, dt \leq J[w^*_{k+1}] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(w^*_{k+1}) \big)\, dt,$$

and

$$J[w^*_{k+1}] - \mu^{k+1}\sum_q\int_{t_0}^{t_f}\ln\big( c(w^*_{k+1}) \big)\, dt \leq J[w^*_k] - \mu^{k+1}\sum_q\int_{t_0}^{t_f}\ln\big( c(w^*_k) \big)\, dt.$$

Combining the above equations,

$$\left( \frac{1}{\mu^{k+1}} - \frac{1}{\mu^k} \right) J[w^*_{k+1}] \leq \left( \frac{1}{\mu^{k+1}} - \frac{1}{\mu^k} \right) J[w^*_k],$$

but since we have required that $\{\mu^k\}$ be positive and decreasing ($0 < \mu^{k+1} < \mu^k$), the above inequality proves that $J[w^*_{k+1}] \leq J[w^*_k]$.

2. From the above, we see by induction that

$$J[w^*_{k+1}] \leq J[w^*_k] \leq \cdots \leq J[w_0].$$

It is therefore evident that $w^*_k \in S$ for $k = 1, 2, 3, \ldots$.

3. Since every sequence in a compact metric space has a subsequence that converges to a point of that space, and since $S$ is compact, $\bar{u} \in S$ for some subsequence $\{u^*_k\}$ of $\{w^*_k\}$.

4. $\bar{u}$ is feasible since, from (3), $\bar{u} \in S \subset \mathcal{F}$. $\square$
Proposition 5. Let $\mathcal{U}^* = \{w \in U \mid w = u^*\}$ be the set of minimizers of problem $(LQP_u)$. Then

$$\lim_{k\to\infty} u^*_k \in \mathcal{U}^*.$$

Proof. Again, we base the following on [29]. Given $\bar{u} = \lim_{k\to\infty} u^*_k$, we prove that $\bar{u} \in \mathcal{U}^*$ by contradiction: suppose $\bar{u} \notin \mathcal{U}^*$; then it is required that

$$J[\bar{u}] > J[u^*]. \qquad (4.11)$$

From $J$ continuous and $J[u^*_k] \geq J[u^*_{k+1}]$, we have $J[u^*_k] \geq J[\bar{u}]$ for $k = 0, 1, 2, \ldots$. We claim that if (4.11) holds, there exists a $u_{int} \in \mathrm{strict}(\mathcal{F}) = \{w \in \mathcal{F} \mid c_{(m,q)} > 0,\ \forall (m,q)\}$ such that

$$J[\bar{u}] > J[u_{int}]. \qquad (4.12)$$

To prove this claim, we consider the two possibilities: (i) if $u^* \in \mathrm{strict}(\mathcal{F})$, then pick $u_{int} = u^*$; (ii) if $u^* \notin \mathrm{strict}(\mathcal{F})$, then choose a point $z \in \mathrm{strict}(\mathcal{F})$: if $J[z] < J[\bar{u}]$, pick $u_{int} = z$; otherwise define $\tilde{u} = (1-\Lambda)u^* + \Lambda z$, $\Lambda \in (0,1)$. Since $\Lambda > 0$, $\tilde{u} \in \mathrm{strict}(\mathcal{F})$, and since $J$ is convex ($J[\tilde{u}] \leq (1-\Lambda)J[u^*] + \Lambda J[z]$) and $J[z] \geq J[\bar{u}] > J[u^*]$, there exists a $\Lambda$ such that $J[\tilde{u}] < J[\bar{u}]$, from the continuity of $J$. So in this last case, pick $u_{int} = \tilde{u}$. Either way, the claim is proved if $\bar{u} \notin \mathcal{U}^*$.

To find a contradiction to (4.12), we consider two possibilities for $\bar{u}$: (a) $\bar{u} \in \mathrm{strict}(\mathcal{F})$ or (b) $\bar{u} \notin \mathrm{strict}(\mathcal{F})$. If (a), then as $k \to \infty$, the logarithm terms of

$$J[u^*_k] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(u^*_k) \big)\, dt \leq J[u_{int}] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(u_{int}) \big)\, dt$$

are bounded and therefore vanish, since we have required that $\lim_{k\to\infty}\mu^k = 0$. Since $\lim u^*_k = \bar{u}$, we arrive at a contradiction to (4.12):

$$J[\bar{u}] \leq J[u_{int}], \qquad \forall\, u_{int} \in \mathrm{strict}(\mathcal{F}).$$

If (b), then combining the above inequality with equation (4.12) in the following form,

$$J[u_{int}] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(u^*_k) \big)\, dt < J[u^*_k] - \mu^k\sum_q\int_{t_0}^{t_f}\ln\big( c(u_{int}) \big)\, dt,$$

results in

$$-\sum_q\int_{t_0}^{t_f}\ln\big( c(u^*_k) \big)\, dt < -\sum_q\int_{t_0}^{t_f}\ln\big( c(u_{int}) \big)\, dt.$$

This is again a contradiction, since $\bar{u} \notin \mathrm{strict}(\mathcal{F})$ implies that, as $k \to \infty$, the first term above is unbounded, whereas the second is bounded, and the inequality cannot hold. $\square$
Proposition 6. Maintaining the definitions above,

$$\lim_{k\to\infty} u^*_k = u^*,$$

where $u^*$ is the desired solution to the quadratic program of problem $(LQP_u)$.

Proof. This follows immediately from the last proposition, since the strict convexity and continuity of $J$ and the compactness of $\mathcal{F}$ require that the set $\mathcal{U}^*$ be composed of a unique $u^*$. $\square$
4.4.3
State and Adjoint Equations
The state and adjoint equations for a given barrier parameter $\mu^k$ for the above problem are

$$\dot{y}^* = Ay^* + Bu^* + F, \qquad y^*(t_0) = y_0, \qquad (4.13)$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R), \qquad \lambda^*(t_f) = W_T\big( y^*(t_f) - y_T \big), \qquad (4.14)$$
$$0 = W_U u^* + B^T\lambda^* - \mu^k\left( C_1^{*-1} - C_2^{*-1} \right) e, \qquad (4.15)$$

where $e \in \mathbb{R}^M = (1, 1, \ldots, 1)^T$. The advantage that equations (4.7) are no longer required (as long as the solution is approached from the interior of $\mathcal{F}$) is immediately evident, but the nonlinearity in equation (4.15) presents a problem that will be addressed in section 4.4.4.
The equations above are the starting point for so-called Primal Interior Point Methods. It is possible to add an extra degree of freedom to the problem by introducing slack (dual) variables $z_1 \in \{[t_0,t_f]; \mathbb{R}^M\}$ and $z_2 \in \{[t_0,t_f]; \mathbb{R}^M\}$. Stating the problem in this way leads to the Primal-Dual (PD) implementation:

$$\dot{y}^* = Ay^* + Bu^* + F, \qquad y^*(t_0) = y_0, \qquad (4.16)$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R), \qquad \lambda^*(t_f) = W_T\big( y^*(t_f) - y_T \big), \qquad (4.17)$$
$$0 = W_U u^* + B^T\lambda^* - z_1^* + z_2^*, \qquad (4.18)$$
$$C_1^*\, z_1^* = \mu^k e, \qquad (4.19)$$
$$C_2^*\, z_2^* = \mu^k e. \qquad (4.20)$$

Now we observe that the nonlinearity has been transferred to equations (4.19,4.20). The algorithmic impact of this alternative statement is that, as the solution converges to $(y^*, u^*, \lambda^*, z_1^*, z_2^*)$, the last two equations in the set need not be satisfied exactly for $(y^k, u^k, \lambda^k, z_1^k, z_2^k)$. This results in a more favorable situation, as is discussed in more detail in section 4.4.8.

Since PD methods are algorithmically more favorable, they have been chosen here for implementation. However, the simpler mathematical statement associated with Primal methods lends itself far better to theoretical treatment. Therefore, the next few sections describe the Primal methods for optimal control in detail, including some theoretical convergence results. The PD methods for optimal control are then presented in section 4.4.8 and incorporated in the algorithm in section 4.5.
4.4.4
Primal IPM
Quadratic Approximation (Newton Projection Method)
Given a barrier parameter $\mu^k$, a difficulty in solving problem (4.8,4.9) is that the stationarity condition (4.15) is not linear. Given arbitrary, non-optimal "guess" values of the control variable $u_i$, and the resulting values of $\lambda_i$ (the subscript $i$ will later be used as a Newton iteration index), the equation

$$h(u_i, \lambda_i, t) = W_U u_i + B^T\lambda_i - \mu^k\left( C_{1,i}^{-1} - C_{2,i}^{-1} \right) e \qquad (4.21)$$

will not, in general, equal zero for all time. In order to resolve this problem, a Newton iterative method can be used to find the root of (4.21). This procedure is commonly referred to as the Newton Projection Method in the linear programming context.
Linearizing Stationarity Conditions
Here we show the standard Newton iteration procedure used to solve the problem. Corrections for values that lie outside the feasible region, and the initialization of the algorithm, will be addressed in the following two sections. We first note that equation (4.21) can be written as a vector (uncoupled) equation, since $W_U$ is usually (or can at least be made) diagonal. Each of the $M$ components can then be linearized:

$$h(u_i, \lambda_i, t) + \nabla_u h(u_i, \lambda_i)\,\Delta u_i + \nabla_\lambda h(u_i, \lambda_i)\,\Delta\lambda_i = 0, \qquad (4.22)$$

where

$$u_{i+1} = u_i + \Delta u_i \qquad \text{and} \qquad \lambda_{i+1} = \lambda_i + \Delta\lambda_i. \qquad (4.23)$$

Equation (4.22) can be rewritten as

$$H_i^k\,\Delta u_i + B^T\Delta\lambda_i = -g_i - B^T\lambda_i,$$

where

$$H_i^k = W_U + \mu^k\left( C_{1,i}^{-2} + C_{2,i}^{-2} \right) \qquad (4.24)$$

is the Hessian of $J^\mu$ with respect to $u_i$, and

$$g_i = W_U u_i - \mu^k\left( C_{1,i}^{-1} - C_{2,i}^{-1} \right) e \qquad (4.25)$$

is the gradient. Finally, using the relation in (4.23), we can rewrite the linearized equation for $u_{i+1}$:

$$u_{i+1} = u_i - (H_i^k)^{-1}\left( B^T\lambda_{i+1} + g_i \right). \qquad (4.26)$$

Equation (4.26) is clearly a linear relation between $u_{i+1}$ and $\lambda_{i+1}$, but in order to find an appropriate value of $\lambda_{i+1}$, the stationarity conditions of equations (4.13) and (4.14) should still be satisfied. Therefore, given $u_i$, we solve the linear system:

$$\dot{y}_{i+1} = Ay_{i+1} + Bu_{i+1} + F, \qquad y_{i+1}(t_0) = y_0, \qquad (4.27)$$
$$-\dot{\lambda}_{i+1} = A^T\lambda_{i+1} + W_R(y_{i+1} - y_R), \qquad \lambda_{i+1}(t_f) = W_T\big( y_{i+1}(t_f) - y_T \big), \qquad (4.28)$$
$$u_{i+1} = u_i - (H_i^k)^{-1}\left( B^T\lambda_{i+1} + g_i \right). \qquad (4.29)$$
In terms of Newton step directions, the above can be expressed as

$$\Delta\dot{y}_i = A\,\Delta y_i - Q_i^k\,\Delta\lambda_i + F_i, \qquad \Delta y_i(t_0) = y_0 - y_i(t_0), \qquad (4.30)$$
$$-\Delta\dot{\lambda}_i = A^T\Delta\lambda_i + W_R\,\Delta y_i + f_i, \qquad \Delta\lambda_i(t_f) = W_T\big( \Delta y_i(t_f) - \hat{y}_{T,i} \big), \qquad (4.31)$$
$$\Delta u_i = -(H_i^k)^{-1}\left( B^T\lambda_i + B^T\Delta\lambda_i + g_i \right), \qquad (4.32)$$

where

$$Q_i^k = B\,(H_i^k)^{-1} B^T,$$
$$F_i = -B\,(H_i^k)^{-1}\left( B^T\lambda_i + g_i \right) + \left( -\dot{y}_i + Ay_i + Bu_i + F \right),$$
$$f_i = -\dot{\lambda}_i - A^T\lambda_i - W_R(y_i - y_R),$$
$$\hat{y}_{T,i} = W_T^{-1}\lambda_i(t_f) + \big( y_T - y_i(t_f) \big),$$

vary with $i$.

The significance of equations (4.30,4.31) is that they take exactly the same form as (3.4,3.5). The key requirement for application of the TRCG algorithm to this system is that $Q_i^k$ be SPD. Since $H_i^k$ from equation (4.24) is certainly SPD ($\mu^k > 0$, and $C_{1,i}$, $C_{2,i}$ are SPD for feasible $u_i$), so is $Q_i^k$. In other words, we can safely apply the TRCG algorithm to the system above for each Newton iteration.

Denoting $p_i^T = [\,u_i^T \;\; \lambda_i^T \;\; y_i^T\,]$, for each Newton step $\Delta p_i^T = [\,\Delta u_i^T \;\; \Delta\lambda_i^T \;\; \Delta y_i^T\,]$ the appropriate step size $\alpha_i \in (0,1]$ needs to be determined for the calculation of the next iterate:

$$p_{i+1} = p_i + \alpha_i\,\Delta p_i. \qquad (4.33)$$

This can be accomplished by well-established procedures such as line minimization and the Armijo rule.
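At a single time level, the Newton quantities (4.24)-(4.26) reduce to componentwise arithmetic when $W_U$ is diagonal, as noted above. A minimal sketch (ours; $W_U$ is passed as a vector of diagonal entries, and the names are illustrative):

```python
# Sketch: Hessian (4.24), gradient (4.25), and control update (4.26)
# at one time level, for diagonal W_U.
import numpy as np

def newton_control_update(u, lam_next, Wu_diag, B, u_min, u_max, mu):
    c1, c2 = u - u_min, u_max - u              # strictly positive for feasible u
    H = Wu_diag + mu * (c1**-2 + c2**-2)       # diagonal of H_i^k, (4.24)
    g = Wu_diag * u - mu * (c1**-1 - c2**-1)   # gradient g_i, (4.25)
    return u - (B.T @ lam_next + g) / H        # u_{i+1}, per (4.26)
```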
4.4.5
Correcting Values that Lie Outside Box Constraints
Given any $u_i$, the calculation $u_{i+1} = u_i + \alpha_i\Delta u_i$ may result in values that lie outside the hard constraints (4.1). If these values are not corrected, the Newton iterations may simply approach the unconstrained solution as $\mu \to 0$ (although undefined in the logarithmic statement of the problem, this infeasibility is not automatically detected in the Newton approximation). In order to avoid such a problem, we determine the distance to the boundary in the cases where a full step would result in infeasible controller values:

$$\beta_i = \min_{j=1,\ldots,M}\left[\, 1,\ \min_{(u_{i,j} + \alpha_i\Delta u_{i,j} < u_{min})}\left( \frac{u_{min} - u_{i,j}}{\alpha_i\,\Delta u_{i,j}} \right),\ \min_{(u_{i,j} + \alpha_i\Delta u_{i,j} > u_{max})}\left( \frac{u_{max} - u_{i,j}}{\alpha_i\,\Delta u_{i,j}} \right) \right]. \qquad (4.34)$$

Having determined $\beta_i$, we can take the actual Newton step:

$$p_{i+1} = p_i + (0.995)\,\beta_i\,\alpha_i\,\Delta p_i.$$

The constant 0.995 is used to ensure that the next Newton iteration will be performed in the strict interior of the feasible region. Using unity may result in $p_{i+1}$ on the boundary of the feasible region, which is undefined in the sense of logarithmic barrier functions.
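A direct transcription of the damped step (our sketch; scalar bounds for brevity, names illustrative):

```python
# Sketch: fraction-to-boundary rule (4.34) plus the 0.995 safety factor.
import numpy as np

def damped_step(u, du, u_min, u_max, alpha):
    step = alpha * du
    beta = 1.0
    lo, hi = step < 0, step > 0              # components moving toward a bound
    if lo.any():
        beta = min(beta, np.min((u_min - u[lo]) / step[lo]))
    if hi.any():
        beta = min(beta, np.min((u_max - u[hi]) / step[hi]))
    return u + 0.995 * beta * step
```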
4.4.6
Initializing the Algorithm
The Analytical Center
The previous section describes a method for determining step directions for the solution of the quadratic approximation of the logarithmic barrier version of the constrained problem. No mention has been made of the appropriate starting point $p_0$ from which to take the first step.

We emphasize that the IPM algorithm is based on two approximations: (i) a logarithmic barrier function that accounts for the effect of the boundary, and (ii) a Newton approximation for the solution of the nonlinear equations that make up the stationarity conditions for each $\mu^k$. In its currently proposed form, the algorithm takes iterates $k$ and $i$ for the two approximations, respectively.

Assuming we have a predetermined positive, decreasing set $\{\mu^k\}$, it is necessary to determine the initial guess $p_0(\mu^k)$ for each $k$. Given any $k$, it is natural to initialize the Newton procedure with

$$p_0(\mu^{k+1}) = p_{last}(\mu^k), \qquad (4.35)$$

where $p_{last}(\mu^k)$ is the last (possibly converged) iterate of the Newton procedure for barrier parameter $\mu^k$. More will be said about this choice in the following sections.

But the question of how to assign $p_0(\mu^0)$ still remains. Here we make the first statement concerning the set $\{\mu^k\}$: take $\mu^0 = \mathcal{M}$, where $\mathcal{M}$ is "large" in the sense that it can be taken to approach the case in which $\mu^0 \to \infty$. For a closed feasible set $\mathcal{F}$ ($u_{min} < u_{max}$, both bounded), the logarithmic terms dominate $J^{\mu^0}$, and the solution of problem (4.8,4.9) can be immediately recognized as the analytical center of the feasible set. In this case $p_0(\mu^0) = p_{last}(\mu^0) = [\,(u^*_{k=0})^T \;\; (\lambda^*_{k=0})^T \;\; (y^*_{k=0})^T\,]^T$ is uniquely determined from:

$$u^*_{k=0} = \frac{u_{min} + u_{max}}{2}, \qquad (4.36)$$
$$\dot{y}^*_{k=0} = Ay^*_{k=0} + Bu^*_{k=0} + F, \qquad y^*_{k=0}(t_0) = y_0, \qquad (4.37)$$
$$-\dot{\lambda}^*_{k=0} = A^T\lambda^*_{k=0} + W_R(y^*_{k=0} - y_R), \qquad \lambda^*_{k=0}(t_f) = W_T\big( y^*_{k=0}(t_f) - y_T \big), \qquad (4.38)$$

requiring no Newton iterations to be performed. Once this initial solution has been established, one can proceed to Newton iterations for $k = 1$, from which $p_{last}(\mu^1)$ will be available for the initialization (4.35). The question of how to select $\mu^{k+1}$ will be addressed in section 4.4.7.

For cases in which the feasible region is unbounded ($u_{min} < u < \infty$, for example), we can simply impose on the problem a $u_{max}$ that is much larger than expected. Such situations are rare in practical engineering problems, since most controllers have physical limits. Therefore, we observe that assuming $\mathcal{F}$ closed is reasonable for most engineering applications.
The Central Path
Above we have addressed the question of how to choose $p_0(\mu^k)$ for each $k > 0$. If we allow the Newton iterations that follow from such initial guesses to converge "exactly" (that is, $i = last$ when $\|h(p_i)\| < \epsilon$, where $\epsilon$ is a negligibly small tolerance value), we will solve problem (4.8,4.9) exactly for each $k$. The path $\{p^k_{last}\}$ taken by these solutions for $\{\mu^k\}$ as $\mu \to 0$ is called the central path.

Proposition 6 states that the central path will converge for any decreasing positive choice of $\{\mu^k\}$. Therefore, if it can be assumed that each Newton procedure will satisfactorily converge for the resulting choices of $\{\mu^k\}$, we can readily implement the algorithm.

However, it is more interesting to approach the question from another perspective. Rather than choosing an arbitrary sequence $\{\mu^k\}$ and then determining the iterates that will allow for convergence of the Newton iterations, it is possible to determine an appropriate $\{\mu^k\}$ such that a single Newton iteration will suffice for convergence of the entire algorithm. That is,

$$p^{k+1} = p_{i+1}, \qquad (4.39)$$

allowing us to relinquish the index $i$. This means that even if we do not exactly follow the central path, the algorithm will converge to the solution $u^*$ given an appropriate choice of $\{\mu^k\}$ and a single Newton approximation step per barrier parameter. The choice of the sequence and the necessary proof of the above statement are given in the next section.
4.4.7
Barrier Parameter Rate of Reduction
Here we establish the rate of reduction of the parameter $\mu^k$ which guarantees convergence of the algorithm with a single Newton step per barrier parameter. This theoretical result is important in the sense that it guides the convergence expectations based on the size of the problem. In practice, however, the method is often observed to perform better than this result suggests, and $\mu^k$ can be reduced more aggressively.

The results of this section are shown in the context of the time-discretized problem, since they are important in the implementation of the method. In fact, the discretized size of the problem drives the convergence rate of the algorithm. Here we use the inner product $(x,y) = x^T y$ with associated norm $\|\cdot\| = \sqrt{(x,x)}$ and deal mostly with vectors $x \in \mathbb{R}^M$, where, as before, $M$ is the number of control variables. In addition, $L$ is the number of time discretizations assigned to the problem.

These results are based on the practices followed by IPM formulations used for the solution of general linear programming problems [4]. Though similar in their conclusions, the formulations differ in the fact that optimal control problems involve a time domain that is not present in the traditional treatment of these methods. As the results show, the size of the control problem $\sqrt{ML}$ dominates the result. This is analogous to the linear programming context.

Before stating and proving the main result, let us state the following definition:

Definition 5 (Time norm). Given a set of vectors $\{x^\ell\}$, where $x^\ell \in \mathbb{R}^M$, we define as the time norm $\|x\|_L$ the following operation:

$$\|x\|_L = \left( \sum_{\ell=0}^{L} \|x^\ell\|^2 \right)^{1/2}. \qquad (4.40)$$

The required properties of the norm can easily be shown to hold: $\|x\|_L \geq 0$; $\|x\|_L = 0$ iff $\{x^\ell\} = \{0\}$; $\|\alpha x\|_L = |\alpha|\,\|x\|_L$, $\forall\,\alpha \in \mathbb{R}$. $\square$

Lemma 2. It will be useful to explicitly state the following inequality:

$$\|x + y\|_L \leq \|x\|_L + \|y\|_L. \qquad (4.41)$$

This is simply a statement of the triangle inequality and can easily be checked. In long-hand notation it takes the form

$$\left( \sum_{\ell=0}^{L} \|x^\ell + y^\ell\|^2 \right)^{1/2} \leq \left( \sum_{\ell=0}^{L} \|x^\ell\|^2 \right)^{1/2} + \left( \sum_{\ell=0}^{L} \|y^\ell\|^2 \right)^{1/2}.$$

Now we state the main result of this section:

Proposition 7. The proposed single-step Primal IPM is guaranteed to converge if the barrier parameter decreases in the following manner:

$$\mu^{k+1} = \alpha^k\,\mu^k, \qquad \alpha^k = \frac{\beta + \sqrt{ML}}{\sqrt{\beta} + \sqrt{ML}}, \qquad (4.42)$$

and the path followed by the algorithm will not deviate from the central path by more than $\beta$:

$$\left\| \frac{1}{\mu^k}\, C^k S^k e - e \right\|_L \leq \beta \qquad (4.43)$$

for all $k$.
Proof. The proof is carried out here for the terminal problem only, in order to reduce the number of terms involved. The result holds for the terminal-regulator problem, and the proof is easily extended. To prove this statement we recall the discretized form of the problem:

$$\Delta y^{k,0} = 0; \qquad (4.44)$$
$$\frac{\Delta y^{k,\ell} - \Delta y^{k,\ell-1}}{\Delta t} = A\,\Delta y^{k,\ell} + B\,\Delta u^{k,\ell}, \qquad \ell = 1,\ldots,L; \qquad (4.45)$$
$$(I - \Delta t A)^T\,\Delta\lambda^{k,L} = W_T\,\Delta y^{k,L}\,\Delta t; \qquad (4.46)$$
$$\frac{\Delta\lambda^{k,\ell} - \Delta\lambda^{k,\ell+1}}{\Delta t} = A^T\,\Delta\lambda^{k,\ell}, \qquad \ell = L-1,\ldots,1; \qquad (4.47)$$

along with the condition

$$W_U u^{k,\ell} + B^T\lambda^{k,\ell} + W_U\,\Delta u^{k,\ell} + B^T\Delta\lambda^{k,\ell} - \mu^{k+1}(C^\ell)^{-1}e + \mu^{k+1}(C^\ell)^{-2}\Delta u^{k,\ell} = 0$$

for $\ell = 1,\ldots,L$. This last condition can be rewritten as

$$s^{k+1,\ell} := W_U u^{k+1,\ell} + B^T\lambda^{k+1,\ell} = \mu^{k+1}(C^\ell)^{-1}\left( e - (C^\ell)^{-1}\Delta u^{k,\ell} \right). \qquad (4.48)$$

Now we observe that since

$$u_j^{k+1,\ell} - u_{min} = \left( u_j^{k,\ell} - u_{min} \right)\left( 1 + \frac{\Delta u_j^{k,\ell}}{u_j^{k,\ell} - u_{min}} \right)$$

for $j = 1,\ldots,M$, we have

$$\frac{1}{u_j^{k+1,\ell} - u_{min}} = \frac{1}{u_j^{k,\ell} - u_{min}}\left( 1 + \frac{\Delta u_j^{k,\ell}}{u_j^{k,\ell} - u_{min}} \right)^{-1}, \qquad (4.49)$$

from which, together with (4.48) and observing the diagonal nature of the matrices involved, we get

$$\frac{1}{\mu^{k+1}}\, C^{k+1,\ell}\, s^{k+1,\ell} - e = -(C^\ell)^{-2}\left( \Delta U^{k,\ell} \right)^2 e,$$

where $\Delta U^{k,\ell} = \mathrm{diag}(\Delta u^{k,\ell})$. For positive vectors $v$, the inequality $\sum_j v_j^2 \leq \left( \sum_j v_j \right)^2$, $\sum_j v_j = e^T v$, allows us to write

$$\left\| (C^\ell)^{-2}\left( \Delta U^{k,\ell} \right)^2 e \right\| \leq e^T (C^\ell)^{-2}\left( \Delta U^{k,\ell} \right)^2 e = \left\| (C^\ell)^{-1}\Delta u^{k,\ell} \right\|^2.$$

Squaring and summing over all $\ell$,

$$\left\| \frac{1}{\mu^{k+1}}\, C^{k+1} S^{k+1} e - e \right\|_L \leq \left\| (C^k)^{-1}\Delta u^k \right\|_L^2. \qquad (4.50)$$

Now, from (4.48) we readily get, for the right-most term above,

$$(C^\ell)^{-1}\Delta u^{k,\ell} = e - \frac{1}{\mu^{k+1}}\, C^\ell s^{k,\ell} - \frac{1}{\mu^{k+1}}\, C^\ell\left( W_U\,\Delta u^{k,\ell} + B^T\Delta\lambda^{k,\ell} \right),$$

and taking the inner product with $(C^\ell)^{-1}\Delta u^{k,\ell}$, summing over all $\ell$, and applying the Cauchy-Schwarz inequality,

$$\left\| (C^k)^{-1}\Delta u^k \right\|_L^2 + \frac{1}{\mu^{k+1}}\sum_{\ell}\left( \Delta u^{k,\ell\,T} W_U\,\Delta u^{k,\ell} + \Delta u^{k,\ell\,T} B^T\Delta\lambda^{k,\ell} \right) \leq \left\| (C^k)^{-1}\Delta u^k \right\|_L \left\| \frac{1}{\mu^{k+1}}\, C^k S^k e - e \right\|_L. \qquad (4.51)$$

From relations (4.45)-(4.47), we have

$$\sum_{\ell}\Delta u^{k,\ell\,T} B^T\Delta\lambda^{k,\ell} = \Delta y^{k,L\,T}\, W_T\,\Delta y^{k,L}\,\Delta t \geq 0,$$

and since $\mu^{k+1} > 0$ and $W_U$ is SPD, (4.51) further simplifies to

$$\left\| (C^k)^{-1}\Delta u^k \right\|_L \leq \left\| \frac{1}{\mu^{k+1}}\, C^k S^k e - e \right\|_L.$$

Together with (4.50) we have the result

$$\left\| \frac{1}{\mu^{k+1}}\, C^{k+1} S^{k+1} e - e \right\|_L \leq \left\| \frac{1}{\mu^{k+1}}\, C^k S^k e - e \right\|_L^2. \qquad (4.52)$$

Since $\mu^{k+1} = \alpha^k\mu^k$, we have for the right-hand side, using (4.41), the induction hypothesis (4.43), and $\|e\|_L = \sqrt{ML}$,

$$\left\| \frac{1}{\mu^{k+1}}\, C^k S^k e - e \right\|_L = \left\| \frac{1}{\alpha^k}\left( \frac{1}{\mu^k}\, C^k S^k e - e \right) + \frac{1-\alpha^k}{\alpha^k}\, e \right\|_L \leq \frac{\beta}{\alpha^k} + \frac{1-\alpha^k}{\alpha^k}\sqrt{ML}.$$

Now we can determine the values of $\alpha^k$ which ensure that at iteration $(k+1)$ the deviation from the central path does not exceed $\beta$. Combining (4.52) with the above, it suffices that

$$\left( \frac{\beta}{\alpha^k} + \frac{1-\alpha^k}{\alpha^k}\sqrt{ML} \right)^2 - \beta = 0. \qquad (4.53)$$

Solving for $\alpha^k$:

$$\alpha^k = \frac{\beta + \sqrt{ML}}{\sqrt{\beta} + \sqrt{ML}}. \qquad (4.54)$$

$\square$

4.4.8
Primal-Dual IPM
Difficulties associated with Primal Methods
The Newton step ($\Delta p_i$) calculation is by far the most costly part of the proposed algorithm. The problem of finding $\Delta p_i$ can be represented in the following form:

$$K_P\,\Delta p_i = f_P, \qquad (4.55)$$

where $K_P$ is the symmetric block matrix

$$K_P = \begin{bmatrix} H_i^k & B^T & 0 \\ B & 0 & \left( A - \frac{d}{dt} \right) \\ 0 & \left( A^T + \frac{d}{dt} \right) & W_R \end{bmatrix},$$

with appropriate initial and final conditions implied. We recall here that the (1,1) element of the above block matrix is the Hessian

$$H_i^k = W_U + \mu^k\left( C_{1,i}^{-2} + C_{2,i}^{-2} \right).$$

The main problem with Primal IPMs is that as they approach the solution $u^*$, the matrix $K_P$ becomes ill-conditioned, leading to problems in inverting the system (4.55). It is a simple matter to show this. First, we note that we have shown in Proposition 6 that as $k \to \infty$, $u^*_{\mu^k} \to u^*$ and thus, for active constraints, $c^*_{\mu^k(j)} \to 0$. We then note that for $u^*_{\mu^k}$, stationarity conditions must hold for the point to be optimal. Comparing stationarity for the original problem and for the logarithmic barrier approximation,

$$0 = W_U u^* + B^T\lambda^* + \nu_1^* - \nu_2^*, \qquad (4.56)$$
$$0 = h^* = W_U u^*_\mu + B^T\lambda^*_\mu - \mu^k(C_1^*)^{-1}e + \mu^k(C_2^*)^{-1}e \qquad (4.57)$$

(equation (4.56) is a result of $\nabla_u c_1 = I_M$ and $\nabla_u c_2 = -I_M$), we note that for $(u^*_\mu, \lambda^*_\mu) \to (u^*, \lambda^*)$ we must have

$$\mu^k(C_1^*)^{-1}e \to -\nu_1^*, \qquad (4.58)$$
$$\mu^k(C_2^*)^{-1}e \to \nu_2^*. \qquad (4.59)$$

Since the above values are non-zero constants and $c^*_{q(j)} \to 0$ for active $j$, $\mu^k\big( C^*_{q,(j,j)} \big)^{-2} \to \infty$ as $k \to \infty$. As a result, entries of $H_i^k$ grow without bound for active constraints, making $K_P$ ill-conditioned.
Primal-Dual Methods Formulation
It is possible to avoid the ill-conditioning that arises in Primal methods by sacrificing the symmetry of $K_P$. If Newton iterations are applied to the system (4.16)-(4.20), and we define the Primal-Dual variables $p_i^T = [\,u_i^T \;\; \lambda_i^T \;\; y_i^T \;\; z_{1,i}^T \;\; z_{2,i}^T\,]$ and the corresponding Newton step $\Delta p_i^T = [\,\Delta u_i^T \;\; \Delta\lambda_i^T \;\; \Delta y_i^T \;\; \Delta z_{1,i}^T \;\; \Delta z_{2,i}^T\,]$, we have

$$K_{PD}\,\Delta p_i = f_{PD}, \qquad (4.60)$$

where

$$K_{PD} = \begin{bmatrix} W_U & B^T & 0 & -I & I \\ B & 0 & \left( A - \frac{d}{dt} \right) & 0 & 0 \\ 0 & \left( A^T + \frac{d}{dt} \right) & W_R & 0 & 0 \\ Z_{1,i} & 0 & 0 & C_{1,i} & 0 \\ -Z_{2,i} & 0 & 0 & 0 & C_{2,i} \end{bmatrix}$$

and $Z_{q,i} = \mathrm{diag}(z_{q,i})$. Now $H_i^k$ does not appear in $K_{PD}$, and the problem encountered in Primal methods is eliminated. Since $z_{q(j)} \to -\nu_{q(j)}$ and $c_{q(j)} \to 0$ for active $j$ as $k \to \infty$, the block matrix above does not become ill-conditioned as we approach the solution, resulting in a better-behaved algorithm.

The fact that we have sacrificed symmetry does not have an adverse impact on the algorithm, since the form (4.60) can still be reduced to

$$\Delta\dot{y}_i = A\,\Delta y_i - \tilde{Q}_i\,\Delta\lambda_i + \tilde{F}_i, \qquad \Delta y_i(t_0) = y_0 - y_i(t_0), \qquad (4.61)$$
$$-\Delta\dot{\lambda}_i = A^T\Delta\lambda_i + W_R\,\Delta y_i + \tilde{f}_i, \qquad \Delta\lambda_i(t_f) = W_T\big( \Delta y_i(t_f) - \tilde{y}_{T,i} \big), \qquad (4.62)$$
$$\Delta u_i = -(\tilde{H}_i)^{-1}\left( B^T\lambda_i + B^T\Delta\lambda_i + \tilde{g}_i \right), \qquad (4.63)$$

where now

$$\tilde{H}_i = W_U + C_{1,i}^{-1} Z_{1,i} + C_{2,i}^{-1} Z_{2,i},$$
$$\tilde{g}_i = W_U u_i - \mu^k\left( C_{1,i}^{-1} - C_{2,i}^{-1} \right) e,$$
$$\tilde{Q}_i = B\,(\tilde{H}_i)^{-1} B^T,$$
$$\tilde{F}_i = -B\,(\tilde{H}_i)^{-1}\left( B^T\lambda_i + \tilde{g}_i \right) + \left( -\dot{y}_i + Ay_i + Bu_i + F \right),$$
$$\tilde{f}_i = -\dot{\lambda}_i - A^T\lambda_i - W_R(y_i - y_R),$$
$$\tilde{y}_{T,i} = W_T^{-1}\lambda_i(t_f) + \big( y_T - y_i(t_f) \big).$$

The TRCG algorithm can therefore still be used to solve the above system. With Primal-Dual methods the $\tilde{Q}_i$ matrix is seen above to be better conditioned than its Primal counterpart of section 4.4.4, which is advantageous to the TRCG algorithm, since the conditioning of $\mathcal{G}$ depends on this matrix.
Since Primal-Dual algorithms improve the method without adverse effects, we have chosen to use the above statement of the problem in the solution method. The result of equation (4.42) will no longer hold for the statement above. It can, however, still be used to guide the convergence of the algorithm in the sense that the problem size is reflected by the term $\sqrt{ML}$. We postulate here, without proof, that an often-used result [4] for traditional Primal-Dual IPMs can be extended to the context of optimal control to determine the selection of barrier parameters:

$$\mu^k = \left( \frac{\sqrt{ML}}{\beta + \sqrt{ML}} \right) \frac{\left( c^{k-1} \right)^T z^{k-1}}{ML}, \qquad (4.64)$$

where $\left( c^{k-1} \right)^T z^{k-1}$ denotes the complementarity product accumulated over both constraints and all time levels.

Finally, we must address the issue of step length, so that the new iterate is guaranteed to lie inside the feasible region. Similarly to the Primal case, assuming that we start from a feasible point $p_i$, we take

$$p_{i+1} = p_i + (0.995)\,\beta_i\,\alpha_i\,\Delta p_i,$$

where

$$\beta_i = \min_{j}\left[\, 1,\ \min_{(z_{1,i,j} + \alpha_i\Delta z_{1,i,j} < 0)}\left( \frac{-z_{1,i,j}}{\alpha_i\,\Delta z_{1,i,j}} \right),\ \min_{(z_{2,i,j} + \alpha_i\Delta z_{2,i,j} < 0)}\left( \frac{-z_{2,i,j}}{\alpha_i\,\Delta z_{2,i,j}} \right),\right.$$
$$\left.\ \min_{(u_{i,j} + \alpha_i\Delta u_{i,j} < u_{min})}\left( \frac{u_{min} - u_{i,j}}{\alpha_i\,\Delta u_{i,j}} \right),\ \min_{(u_{i,j} + \alpha_i\Delta u_{i,j} > u_{max})}\left( \frac{u_{max} - u_{i,j}}{\alpha_i\,\Delta u_{i,j}} \right) \right]. \qquad (4.65)$$

4.5
IPM-TRCG Algorithm
The following is a detailed listing of the Primal-Dual Interior Point Methods approach to solving
optimal control problems via the TRCG algorithm.
Algorithm IPM-TRCG

    Set k = 0;
    Set mu^{k=0} = M (M large);
    Set u^{k=0} at the analytical center of U;
    Calculate p^{k=0}(mu^0);
    while (mu^k > mu_tol) do
        while (||F_i|| > F_tol) do
            Solve (4.60) for Delta p_i by TRCG;
            Solve for alpha_i and beta_i by the Armijo rule and (4.65), respectively;
            p_{i+1} = p_i + (0.995) beta_i alpha_i Delta p_i;
        end do;
        if (mu^k > mu_tol) then
            k = k + 1;
            Solve for mu^k by (4.64);
        end if;
    end do
We note that in the above we have not used the theoretical results for single Newton steps. Though helpful as a guide in estimating the rate of reduction of $\mu^k$, that approach is rarely pursued in practice. We have chosen an often-used approach for the implementation of IPMs, allowing for a small number of Newton steps and calculating $\mu^k$ by (4.64). But since IPMs can be applied much in their original context to optimal control via TRCG, other typical implementations can also be used if preferred.
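In outline form, the loop above can be transcribed as follows (our sketch; every callable is a placeholder for the corresponding routine in the text, not an existing library function):

```python
# Sketch of the IPM-TRCG driver. residual ~ ||F_i||; solve_step solves (4.60)
# by TRCG; step_lengths returns (alpha_i, beta_i); update_mu implements (4.64)
# or the simple mu/100 reduction used in section 4.6.2.
def ipm_trcg(p0, mu0, residual, solve_step, step_lengths, update_mu,
             mu_tol=1e-9, F_tol=1e-4):
    p, mu = p0, mu0
    while mu > mu_tol:
        while residual(p, mu) > F_tol:        # inner Newton iterations
            dp = solve_step(p, mu)
            alpha, beta = step_lengths(p, dp) # Armijo rule and (4.65)
            p = p + 0.995 * beta * alpha * dp
        mu = update_mu(p, mu)
    return p
```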
4.6
Example Problem: Linear, Constrained 2D Heat Transfer
4.6.1
General Results
Here we consider the same example problem as in section 3.12, with the additional constraint

$$0\ \mathrm{K/s} \leq u^* \leq 1500\ \mathrm{K/s}, \qquad \forall\, t \in [t_0, t_f]. \qquad (4.66)$$

Note that, although the above is imposed for all time, these constraints can be time-varying, which would require no modifications to the proposed algorithm.

We shall refer to the problem above as the "constrained" problem and to that posed in section 3.12 as the "unconstrained" problem. In reality, both can be viewed as constrained minimizations, since the governing equations must be satisfied; however, this section deals with hard, inequality constraints on the variables themselves.

Figure 4-1 shows the solution to the constrained problem. It is obvious from the plot that the hard constraints are indeed observed for the control variables, as required. The performance of the system is expectedly degraded, as can be seen from the higher value of $J^*$ in comparison to the unconstrained problem. This is expected, since the set $U$ of admissible control functions $u$ has been reduced to include only those functions that lie in the allowed range over time.

[Figure 4-1: Optimal control and $\Gamma_{RS}$ temperature histories for the constrained problem, $J^* = 8.19 \times 10^7$.]
4.6.2
Numerical Performance
Rather than using equation (4.64) to calculate $\mu^k$, we simply took

$$\mu^k = 10^{-2}\,\mu^{k-1},$$

which is known to work well in practice. Equation (4.64) was only used as an estimate for the stopping criterion. IPMs are often observed to work better in practice than predicted by theory, and their application to optimal control problems is no exception. Therefore, a simple, aggressive reduction of $\mu^k$ can be expected to allow the method to converge very quickly. The results below show that this is the case for our current example.

Figure 4-2 shows the Newton convergence of the residual $\|F_i\|$ of the nonlinear state equations and the conjugate gradient convergence for the last Newton iteration in the procedure. Referring to the left graph, we note that the tolerance value chosen was $F_{tol} = 10^{-4}$, and that new $\mu^k$ were calculated at iterations $\{1, 6, 8, 13, 16, 17, 18, 19, 20, 21\}$.

[Figure 4-2: Newton (left) and last conjugate gradient (right) convergence of the IPM-TRCG algorithm.]
We see that the Newton iterations present the expected excellent convergence properties, and that towards the end of the iterations only a single Newton step is necessary for staying very close to the central path (as is suggested by $\|F_i\| < F_{tol}$ for these iterations).

However, we may be concerned with the conditioning of the Hessian (which impacts the conditioning of the operator $\mathcal{G}$, as was observed in section 4.4.8). Since we used the Primal-Dual formulation of the method, we argued that we should not have a badly conditioned Hessian as $\mu^k \to 0$. To test this in the current problem, we present on the right-hand side of Figure 4-2 the convergence of the algorithm for the last barrier parameter iteration, that is, the smallest value of $\mu^k$ (in this case $\mu^k = 10^{-9}$).

Compared to the unconstrained case (Figure 3-7), we see that the algorithm was effective even as $\mu^k \to 0$, converging in fewer than 30 iterations. Since the conditioning of $\mathcal{G}$ is essential for the effectiveness of the TRCG algorithm, we emphasize the importance of using Primal-Dual methods in conjunction with TRCG to solve such optimal control problems.

The linear, constrained problems addressed here are far more common in engineering applications than those of chapter 3. However, the generality of the TRCG algorithm lends it to an even broader class of problems if certain adaptations are made. The following chapter addresses the application of the method to nonlinear, constrained optimal control problems.
Chapter 5

Lagging Procedure - Nonlinear, Constrained Problems
5.1
Motivation
We have presented in the previous chapters a method for solving linear, constrained optimal control problems. In this chapter, we extend the method to a broader class of engineering problems. In particular, we present an extension to a specific class of problems: those with nonlinearities in the state variables.

Considering the examples that have been presented in chapters 3 and 4, it may be of interest to include problems that allow for radiative heat transfer. For example, the "reaction surface" of Figure 3-1 may be exposed to an environment with which heat radiation is exchanged. This addresses a more realistic set of problems, since it is unlikely that either Neumann or Dirichlet boundary conditions could be imposed on this surface.

Radiative heat transfer on a surface is given by the Stefan-Boltzmann law

$$q_{rad} = \epsilon\sigma\left( y^4 - y_s^4 \right), \qquad (5.1)$$

where $q_{rad}\ [\mathrm{W/m^2}]$ is the resulting heat flux, $\epsilon \in [0,1]$ is the surface emissivity, $\sigma = 5.67 \times 10^{-8}\ \mathrm{W/m^2\,K^4}$ is the Stefan-Boltzmann constant, $y\ [\mathrm{K}]$ is the surface temperature, and $y_s\ [\mathrm{K}]$ is the surrounding (ambient) temperature. The nonlinearity is apparent from the quartic dependence of the heat flux on the surface temperature. This radiative law will appear in the example presented in the form of a boundary condition.
Detailed treatment of the incorporation of this condition into the model and the FEM formulation will be addressed in section 5.8. For now, we state that such a nonlinearity will appear in the spatially-discretized ODEs as

$$\dot{y} = Ay - Kf(y) + Bu + F, \qquad (5.2)$$
$$y(t_0) = y_0, \qquad (5.3)$$

where $y$, $A$, $B$, $F$, $y_0$ are defined as before, $K : Y \to Y$ is a linear mapping (that may, for example, incorporate the boundary terms (5.1)), and we make use of the notation $f(y) = [\,f(y_1) \;\; f(y_2) \;\cdots\; f(y_N)\,]^T$. The function $f(y)$ is taken to be nonlinear and is separated from the linear part $Ay$ for clarity in the algorithm. In the example of radiative heat transfer, we would have $f(y) = y^4$, a fourth-order polynomial nonlinearity.
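For the radiative example, $f$ and its (diagonal) Jacobian take a particularly simple form; a sketch (ours, with assumed dense matrices) of the right-hand side of (5.2):

```python
# Sketch: quartic radiation nonlinearity and the state-equation right-hand
# side of (5.2).
import numpy as np

def f(y):
    return y**4                   # componentwise, f(y) = [f(y_1) ... f(y_N)]

def grad_f(y):
    return np.diag(4.0 * y**3)    # diagonal Jacobian of f

def state_rhs(y, u, A, K, B, F):
    return A @ y - K @ f(y) + B @ u + F
```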
5.2
Problem Statement
In the interest of addressing general problems, we present the mathematical statement here with constrained controls:

$$\min_{u \in U}\; J[y(u)] \qquad \text{s.t.} \qquad \dot{y} = Ay - Kf(y) + Bu + F, \qquad y(t_0) = y_0, \qquad (NLQP_u)$$
$$c_1 = u(t) - u_{min} \geq 0,$$
$$c_2 = u_{max} - u(t) \geq 0.$$

5.3
Optimality Conditions for the Constrained NLQP Problem
Similarly to the treatment of section 4.3, we can state the optimality conditions of the problem:

$$\dot{y}^* = Ay^* - Kf(y^*) + Bu^* + F, \qquad y^*(t_0) = y_0, \qquad (5.4)$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R) - K\nabla f(y^*)\,\lambda^*, \qquad \lambda^*(t_f) = W_T\big( y^*(t_f) - y_T \big), \qquad (5.5)$$
$$0 = W_U u^* + B^T\lambda^* + \nabla_u c_1^*\,\nu_1^* + \nabla_u c_2^*\,\nu_2^*, \qquad (5.6)$$

where, for $j = 1, \ldots, M$,

$$\nu^*_{1(j)} \leq 0 \ \text{if } c^*_{1(j)} = 0; \quad \nu^*_{1(j)} = 0 \ \text{if } c^*_{1(j)} > 0; \quad \nu^*_{2(j)} \leq 0 \ \text{if } c^*_{2(j)} = 0; \quad \nu^*_{2(j)} = 0 \ \text{if } c^*_{2(j)} > 0. \qquad (5.7)$$

We note that a new term is present in equation (5.5) which is nonlinear in $y$. In the spirit of IPM, we can also state the optimality conditions for the Primal-Dual logarithmic barrier problem:

$$\dot{y}^* = Ay^* - Kf(y^*) + Bu^* + F, \qquad y^*(t_0) = y_0, \qquad (5.8)$$
$$-\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R) - K\nabla f(y^*)\,\lambda^*, \qquad \lambda^*(t_f) = W_T\big( y^*(t_f) - y_T \big), \qquad (5.9)$$
$$0 = W_U u^* + B^T\lambda^* - z_1^* + z_2^*, \qquad (5.10)$$
$$C_1^*\, z_1^* = \mu^k e, \qquad (5.11)$$
$$C_2^*\, z_2^* = \mu^k e. \qquad (5.12)$$

Since this is the useful form from the algorithmic point of view, we deal with it directly here.
5.4
Linearization of Optimality Conditions
5.4.1
Naive Implementation
It is natural to consider Newton iterations to address the nonlinear term $K\nabla f(y)\lambda$ in the same fashion that was used for the nonlinear barrier terms $\mu^k(C_q^*)^{-1}e$ in the implementation of the IPMs. Each Newton step $\Delta p_i^T = [\,\Delta u_i^T \;\; \Delta\lambda_i^T \;\; \Delta y_i^T \;\; \Delta z_{1,i}^T \;\; \Delta z_{2,i}^T\,]$ would then be calculated by

$$\tilde{K}_{PD}\,\Delta p_i = f_{PD}, \qquad (5.13)$$

with

$$\tilde{K}_{PD} = \begin{bmatrix} W_U & B^T & 0 & -I & I \\ B & 0 & \left( \tilde{A} - \frac{d}{dt} \right) & 0 & 0 \\ 0 & \left( \tilde{A}^T + \frac{d}{dt} \right) & \tilde{W}_R & 0 & 0 \\ Z_{1,i} & 0 & 0 & C_{1,i} & 0 \\ -Z_{2,i} & 0 & 0 & 0 & C_{2,i} \end{bmatrix},$$

where

$$\tilde{A} = A - K\nabla f(y_i), \qquad \tilde{W}_R = W_R - K\nabla^2 f(y_i)\,\lambda_i.$$
At this point, a generic algorithm that might try to invert $\tilde{K}_{PD}$ would run into a substantial difficulty. The term $\tilde{W}_R$ is not necessarily SPD for arbitrary combinations of $(W_R, K, \nabla^2 f(y_i), \lambda_i)$. In our example, in which $f(y) = y^4$,

$$\tilde{W}_R = W_R - 12\,K\,[\mathrm{diag}(y_i)]^2\,\mathrm{diag}(\lambda_i)$$

can easily become non-SPD, depending on the iterates $y_i$ and $\lambda_i$. If the solution procedure ventures into regions where this is the case, then the SPD property of $\tilde{K}_{PD}$ is compromised, and efficient numerical algorithms such as conjugate gradients cannot be effectively used.

If we try to implement the above idea in the TRCG algorithm, we would have to solve a system of the form

$$\Delta\dot{y}_i = \tilde{A}\,\Delta y_i - \tilde{Q}_i\,\Delta\lambda_i + \tilde{F}_i, \qquad \Delta y_i(t_0) = y_0 - y_i(t_0), \qquad (5.14)$$
$$-\Delta\dot{\lambda}_i = \tilde{A}^T\Delta\lambda_i + \tilde{W}_R\,\Delta y_i + \tilde{f}_i, \qquad \Delta\lambda_i(t_f) = W_T\big( \Delta y_i(t_f) - \tilde{y}_{T,i} \big), \qquad (5.15)$$

where $\tilde{A}$ and $\tilde{W}_R$ are defined as above, and all other variables are defined as in equations (4.61)-(4.63). Following the procedure of chapter 3, we would not be able to implement the TRCG algorithm, because the operation

$$((v,w)) := v(t_f)^T W_T\, w(t_f) + \int_{t_0}^{t_f} v(t)^T \tilde{W}_R\, w(t)\, dt$$

cannot be defined as an inner product for non-SPD $\tilde{W}_R$.

It is possible, however, to sacrifice the quadratic convergence of the Newton algorithm for a valid inner-product form. First, we make a distinction between iterations used for the logarithmic barrier system and those used for the nonlinearities presented in this chapter.

The problem posed by the Newton projection method with nonlinearities is to solve the system

$$\Delta\dot{y}_i = \tilde{A}\,\Delta y_i - \tilde{Q}_i\,\Delta\lambda_i + \tilde{F}_i, \qquad \Delta y_i(t_0) = y_0 - y_i(t_0), \qquad (5.16)$$
$$-\Delta\dot{\lambda}_i = \tilde{A}^T\Delta\lambda_i + \left( W_R - K\nabla^2 f(y_i)\lambda_i \right)\Delta y_i + \tilde{f}_i, \qquad \Delta\lambda_i(t_f) = W_T\big( \Delta y_i(t_f) - \tilde{y}_{T,i} \big), \qquad (5.17)$$
$$\Delta u_i = -(\tilde{H}_i)^{-1}\left( B^T\lambda_i + B^T\Delta\lambda_i + \tilde{g}_i \right) \qquad (5.18)$$

for $\Delta p_i$. This is equivalent to solving problem (5.13) with a possibly non-SPD $\tilde{K}_{PD}$. If we have an algorithm that can robustly solve this system, the IPM iterations can proceed. In what follows, we present a robust algorithm for solving such a system.
5.4.2 Proposed Algorithm - Separation of Parts

We begin by separating parts in a procedure analogous to that presented in chapter 3. Define (\Delta y_I, \Delta \lambda_I) and (\Delta y_H, \Delta \lambda_H) such that

    \Delta y_i = \Delta y_I + \Delta y_H,        \Delta \lambda_i = \Delta \lambda_I + \Delta \lambda_H.

A crucial difference here, however, is that all nonlinear terms in the stationary conditions are placed in the inhomogeneous equations:

    \Delta \dot{y}_I = \tilde{A}_i \Delta y_I - Q_i \Delta \lambda_I + F_i,
                                 \Delta y_I(0) = y_0 - y_i(t_0),                        (5.19)
    -\Delta \dot{\lambda}_I = \tilde{A}_i^T \Delta \lambda_I - K \nabla^2 f(y_i) \lambda_i \Delta y_i + f_i,
                                 \Delta \lambda_I(t_f) = -W_T \tilde{y}_{T,i},          (5.20)
    \Delta \dot{y}_H = \tilde{A}_i \Delta y_H - Q_i \Delta \lambda_H,
                                 \Delta y_H(0) = 0,                                     (5.21)
    -\Delta \dot{\lambda}_H = \tilde{A}_i^T \Delta \lambda_H + W_R \Delta y_H,
                                 \Delta \lambda_H(t_f) = W_T \Delta y_H(t_f).           (5.22)
We first note the immediate advantage of the above: the homogeneous system (5.21,5.22) now takes exactly the form of chapter 3, and so the operator G can now be safely applied in the space defined by the inner-product

    ((v,w)) := v(t_f)^T W_T w(t_f) + \int_{t_0}^{t_f} v(t)^T W_R w(t) dt.

Unfortunately, we are not yet ready to apply the TRCG algorithm due to a difficulty that has been introduced: system (5.19,5.20) now depends on \Delta y_i, and is, as a result, no longer decoupled in time. We propose to use a known approximation for the nonlinear term K \nabla^2 f(y_i) \lambda_i \Delta y_i in equation (5.20), thus decoupling the system in time and allowing for the use of the TRCG algorithm.
In solving for the inhomogeneous terms, we solve the uncoupled system

    \Delta \dot{y}_I = \tilde{A}_i \Delta y_I - Q_i \Delta \lambda_I + F_i,
                                 \Delta y_I(0) = y_0 - y_i(t_0),                        (5.23)
    -\Delta \dot{\lambda}_I = \tilde{A}_i^T \Delta \lambda_I - K \nabla^2 f(y_i) \lambda_i \Delta \tilde{y}_i + f_i,
                                 \Delta \lambda_I(t_f) = -W_T \tilde{y}_{T,i},          (5.24)

where \Delta \tilde{y}_i is an approximation of \Delta y_i. Given a method for determining \Delta \tilde{y}_i, system (5.23,5.24) becomes uncoupled in time, allowing for the calculation of \Delta y_I and the subsequent use of the TRCG algorithm.
We must therefore finally propose a method for determining \Delta \tilde{y}_i. We do so iteratively: given an index r and an iterate \Delta y_{i,r}, solve equations (5.23,5.24) with

    \Delta \tilde{y}_i = \Delta y_{i,r}.                             (5.25)

We denote the result of this operation \Delta y_{I,r}. With this value, we can use the TRCG algorithm to solve

    G \Delta y_{i,r+1} = \Delta y_{I,r}                              (5.26)

for the next iterate \Delta y_{i,r+1}. These iterations can then be repeated until some stopping criterion, ||\Delta y_{i,r+1} - \Delta y_{i,r}|| < \epsilon for example, is observed. The crucial point is that since G is SPD in the inner-product space defined above, the TRCG algorithm can be used without modifications to effectively solve system (5.13) for \Delta y_i.
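The lagging procedure (5.25)-(5.26) can be summarized by the following sketch, in which solve_inhomogeneous and trcg_solve are hypothetical callables standing in for the solution of (5.23,5.24) and for the TRCG solve applied to G, respectively; neither is the implementation of this work.

    import numpy as np

    # Sketch of the lagging iteration (5.25)-(5.26).
    def lagging_iterations(solve_inhomogeneous, trcg_solve, dy0, R=3, eps=1e-8):
        dy = dy0                                    # Delta y_{i,0}, typically zero
        for r in range(R):
            dy_I = solve_inhomogeneous(dy)          # (5.23,5.24) with lagged term
            dy_next = trcg_solve(dy_I)              # (5.26): G dy_{r+1} = dy_{I,r}
            if np.linalg.norm(dy_next - dy) < eps:  # stopping criterion
                return dy_next
            dy = dy_next
        return dy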
5.4.3 Initializing the Algorithm

In order to execute the algorithm proposed above, an initial guess must be specified. Here we address the final issue of choosing a suitable \Delta y_{i,0} (r = 0). Suppose we choose \Delta y_{i,0} = 0 to start the algorithm. Then, solving the system

    G \Delta y_{i,1} = \Delta y_{I,0}

for \Delta y_{i,1} provides us with an initial, first-order approximation of \Delta y_i. As we have shown in previous chapters, a solution for the above equation exists, is unique, and can be found by the TRCG algorithm. This is also true of each subsequent iteration (r = 1, 2, 3, ...) regarding the solution of system (5.26).

Thus every Newton step \Delta p_i can be calculated assuming the above procedure converges: \Delta y_{i,r} \to \Delta y_i as r \to \infty. Here we make this assumption and show that it holds for our heat radiation example.
In addition, we note that the algorithm can only be considered efficient if the iteration converges
quickly; that is, r must not be too large before we attain sufficient convergence.
5.5 Sufficient Convergence for Lagging Procedure
It was noted in section 4.4.7 that exact convergence of the Newton iterations for IPMs is not
necessary for convergence of the full algorithm. The interpretation in that section was that the
solution need not move exactly along the central path for convergence to the true solution u*.
Here we are faced with a similar situation. Define the central path of the lagging procedure as the exact solution of the stationary conditions (5.8)-(5.12) for all positive \mu^k \in \mathbb{R}. Again, we are not interested in the exact value of \Delta p_i for any given Newton iteration i, but only in obtaining good enough estimates of these values along the central path as we approach the true solution of the problem. Therefore, it may be postulated that, similarly to IPMs, our lagging procedure need not produce estimates \tilde{\Delta p}_i that have converged fully to the \Delta p_i of system (5.13).

We further postulate that, much like in the case of IPMs, our estimates can be rather far from the central path and still allow for convergence. Here we have applied a very simple rule to take advantage of this feature: take r_last = R, where R is a small integer, typically less than 5. In section 5.8 we
present an example of the fully implemented algorithm where this simple rule has been successfully
used.
5.6 Determining the Newton Step

Having determined an appropriately "close" approximation \Delta \tilde{y}_i of \Delta y_i, we may proceed to find an approximation of the Newton step \Delta p_i, which we denote \tilde{\Delta p}_i:

    \hat{K}_{PD} \tilde{\Delta p}_i = \tilde{f}_{PD},                (5.27)

where now

    \hat{K}_{PD} =
      [  W_u        0                       B^T                     -I        I
         0          \tilde{W}_{R,i}         (\tilde{A}_i^T + d/dt)   0        0
         B          (\tilde{A}_i - d/dt)    0                        0        0
         Z_{1,i}    0                       0                        C_{1,i}  0
        -Z_{2,i}    0                       0                        0        C_{2,i} ],

as in section 4.4.8. The above fully determines \tilde{\Delta p}_i since the right-hand side \tilde{f}_{PD} now includes the approximation term \Delta \tilde{y}_i.
5.7 NL-IPM-TRCG Algorithm

Algorithm NL-IPM-TRCG
    Set k = 0;
    Set \mu^{k=0} = M  (M large);
    Set u^{k=0} at the analytical center of U;
    Calculate p^{k=0}(u_0);
    while (\mu^k > \mu_tol) do
        while (||F_i|| > F_tol) do
            \Delta y_{i,r=0} = 0;
            for r = 0, ..., R-1 do
                \Delta \tilde{y}_i = \Delta y_{i,r};
                Solve (5.23,5.24) for \Delta y_{I,r};
                Solve (5.26) by TRCG for \Delta y_{i,r+1};
            end do;
            Solve (5.27) for \tilde{\Delta p}_i;
            Solve for \alpha_i and \beta_i by the Armijo rule and (4.65), respectively;
            p_{i+1} = p_i + (0.995) \beta_i \alpha_i \tilde{\Delta p}_i;
        end do;
        if (\mu^k > \mu_tol) then
            k = k + 1;
            Solve for \mu^k by (4.64);
        end if;
    end do

[Figure 5-1 about here. The diagram labels the surface temperature y_S, the boundary segments \Gamma_N, \Gamma_D, and \Gamma_RS, and the control fluxes q_m = u_m.]
Figure 5-1: Diagram of sample nonlinear heat transfer problem domain (7 cm x 3 cm).
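The step update p_{i+1} = p_i + (0.995) \beta_i \alpha_i \tilde{\Delta p}_i in the listing is a damped IPM update; the sketch below shows one plausible realization, in which alpha comes from the Armijo search and beta is assumed here (in the usual fraction-to-boundary manner) to be the largest fraction of the step that keeps the constraint slacks positive. Equation (4.65) itself is not reproduced, and all names are illustrative.

    import numpy as np

    # Sketch of the damped step update used in the listing above.
    def damped_update(p, dp, slacks, dslacks, alpha):
        neg = dslacks < 0
        ratios = -slacks[neg] / dslacks[neg]
        beta = min(1.0, ratios.min()) if ratios.size else 1.0
        return p + 0.995 * beta * alpha * dp     # 0.995 keeps strict interiority

    p = np.array([1.0, 2.0]); dp = np.array([0.5, -0.5])
    slacks = np.array([0.3, 0.4]); dslacks = np.array([-0.2, 0.1])
    print(damped_update(p, dp, slacks, dslacks, alpha=1.0))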
5.8 Example Problem: Nonlinear, Constrained 2D Heat Transfer

5.8.1 Problem Statement

We now address a specific nonlinear problem governed by partial differential equations. In particular, we consider radiative heat transfer, where the geometry is similar to the one considered in the problems of previous chapters.
Take the domain shown in Figure 5-1, where the reaction surface \Gamma_RS is now exposed to an environment at temperature y_S. Rather than imposing a Neumann condition on this boundary, we allow heat exchange through radiation to occur. This process is nonlinear and governed by the Stefan-Boltzmann law (5.1). The governing equations for this process can thus be expressed as:
    \hat{y}(t_0) = \hat{y}_0                                        in \Omega,                      (5.28)
    \partial \hat{y}/\partial t = \nabla \cdot (\alpha(x) \nabla \hat{y}) + \sum_{m=1}^{M} \hat{u}_m(t) e_m(x)
                                                                    in \Omega \times (t_0, t_f),    (5.29)
    \nabla \hat{y} \cdot \hat{n} = 0                                on \Gamma_N \times (t_0, t_f),  (5.30)
    -\nabla \hat{y} \cdot \hat{n} = \hat{\sigma} (\hat{y}^4 - y_S^4)  on \Gamma_RS \times (t_0, t_f), (5.31)
    \hat{y} = 300 K                                                 on \Gamma_D \times (t_0, t_f),  (5.32)

where all definitions and properties are as in previous chapters. We add here that we take \hat{\sigma} = 5 x 10^{-7} /(m^2 K^3) for the following examples. This value is unrealistically high for typical engineering materials, making the effect of the nonlinear term more pronounced than would otherwise be expected, so as to test the proposed algorithm.
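For reference, the boundary nonlinearity and its first two derivatives, which enter grad f and the Hessian term used earlier in this chapter, are sketched below; the environment temperature value is assumed here for illustration only.

    import numpy as np

    # Sketch of the radiative boundary nonlinearity (5.31) and derivatives.
    sigma_hat = 5e-7               # 1/(m^2 K^3), as given above
    y_S = 300.0                    # environment temperature (assumed, in K)

    def radiative_flux(y):
        return sigma_hat * (y**4 - y_S**4)

    def d_flux(y):
        return 4.0 * sigma_hat * y**3      # enters grad f(y)

    def d2_flux(y):
        return 12.0 * sigma_hat * y**2     # enters the nonlinear Hessian term

    y = np.array([350.0, 420.0])
    print(radiative_flux(y), d_flux(y), d2_flux(y))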
5.8.2 FEM Formulation
In stating the above problem in the FEM context, we recall the spaces defined in section 3.11.
To preserve desirable properties of the stiffness matrix, we deal directly with the time-discretized
form of the problem and treat the nonlinearity explicitly. By doing so we may state the problem
governing equations as
    M y^0 = y_0;                                                     (5.33)
    M (y^\ell - y^{\ell-1})/\Delta t = A y^\ell - \bar{f}(y^{\ell-1}) + B u^\ell + F^\ell,   (5.34)

where

    \bar{f}(y^{\ell-1})_j = (\hat{\sigma}((\hat{y}^{\ell-1})^4 - y_S^4), \phi_j).            (5.35)
Having defined the problem as such, and following the procedures of section 3.11, we may easily
derive the stationarity conditions:
    M y^0 = y_0;                                                     (5.36)
    M (y^\ell - y^{\ell-1})/\Delta t = A y^\ell - \bar{f}(y^{\ell-1}) + B u^\ell + F^\ell;   (5.37)
    (M - \Delta t A)^T \lambda^L = W_T (y^L - y_T) + W_R (y^L - y_R^L) \Delta t;             (5.38)
    M (\lambda^\ell - \lambda^{\ell+1})/\Delta t = A^T \lambda^\ell + W_R (y^\ell - y_R^\ell) - \nabla \bar{f}(y^\ell) \lambda^{\ell+1};   (5.39)
    u^\ell = -W_u^{-1} B^T \lambda^\ell,                             (5.40)

where we define the matrix

    \nabla \bar{f}(y^\ell) = 4 diag[(\hat{\sigma} (\hat{y}^\ell)^3, \phi_j)],                (5.41)

with (\cdot, \phi_j) denoting the inner product against the FEM basis functions \phi_j.
Finally, in order to pose the Newton projection problem, we state the variations in the stationarity conditions as

    \Delta y^0 = y_0 - y_i^0;                                        (5.42)
    M (\Delta y^\ell - \Delta y^{\ell-1})/\Delta t = \tilde{A}^\ell \Delta y^\ell - Q^\ell \Delta \lambda^\ell + \tilde{F}^\ell;   (5.43)
    (M - \Delta t \tilde{A}^L)^T \Delta \lambda^L = W_T (\Delta y^L - \tilde{y}_{T,i}) + W_R (\Delta y^L - \tilde{y}_{R,i}^L) \Delta t;   (5.44)
    M (\Delta \lambda^\ell - \Delta \lambda^{\ell+1})/\Delta t = (\tilde{A}^\ell)^T \Delta \lambda^\ell + (W_R - \nabla^2 \bar{f}(y_i^\ell) \lambda_i^\ell) \Delta y^\ell + \tilde{f}^\ell;   (5.45)
    \Delta u^\ell = -(\tilde{W}_u)^{-1} (B^T \lambda_i^\ell + B^T \Delta \lambda^\ell + \tilde{g}^\ell),   (5.46)

where we note that \tilde{F}^\ell and \tilde{g}^\ell absorb the explicit nonlinearity terms, and

    \tilde{A}^\ell = A + \nabla \bar{f}(y_i^\ell),                   (5.47)
    \nabla^2 \bar{f}(y^\ell) = 12 diag[(\hat{\sigma} (\hat{y}^\ell)^2, \phi_j)].   (5.48)
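A single time step of the explicitly-treated state equation (5.37) then amounts to one linear solve, as the following sketch illustrates; the dense arrays and all problem data are illustrative placeholders, not the discretization of this chapter.

    import numpy as np

    # Sketch of one time step of (5.37): implicit in the linear part,
    # explicit in the nonlinearity.
    def step(M, A, B, fbar, y_prev, u, F, dt):
        # Solve (M - dt*A) y = M y_prev + dt*(B u + F - fbar(y_prev)).
        lhs = M - dt * A
        rhs = M @ y_prev + dt * (B @ u + F - fbar(y_prev))
        return np.linalg.solve(lhs, rhs)

    N, m = 4, 2
    M = np.eye(N); A = -np.eye(N); B = np.ones((N, m)); F = np.zeros(N)
    fbar = lambda y: 1e-9 * (y**4 - 300.0**4)   # lumped stand-in for (5.35)
    y = np.full(N, 300.0); u = np.array([10.0, 5.0])
    print(step(M, A, B, fbar, y, u, F, dt=0.01))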
[Figure 5-2 about here. Top panel: "Optimal control history u(t)" (controls u_1, u_2 vs. t over 0-5 s); bottom panel: "Average temperature at reaction surface \Gamma_RS" (with reference y_{R,RS}).]
Figure 5-2: Optimal control and \Gamma_RS temperature histories for nonlinear constrained problem, J = 1.42 x 10^8.
5.8.3 General Results

Having thus defined the problem, we may safely apply the NL-IPM-TRCG algorithm to the heat transfer process of Figure 5-1. We preserve all the problem data given in sections 3.12.1 and 4.6.1. Figure 5-2 presents the optimal control and state histories. We note that the nonlinearity drastically changes the nature of the solution in comparison to that of section 4.6.1. The heat lost through the radiative surface causes the temperatures at \Gamma_RS to be generally lower throughout the process than was observed with Neumann boundary conditions. In fact, increasing the cost for these deviations will not impact this portion of the solution, since the controllers saturate in the early part of the process and cannot drive the system to the desired temperature as fast as in the former situation.
We note that if hard bounds were not imposed on the problem, early control values would
increase far beyond practical limitations when driven by such strong nonlinearities. This example
demonstrates the strength of the TRCG algorithm in that it allows for effective incorporation of
both nonlinearities and hard bounds, making it very practical for engineering problems.
[Figure 5-3 about here. Left panel: "Newton residual convergence"; right panel: "Last TRCG residual convergence"; residuals vs. iteration on logarithmic axes.]
Figure 5-3: Newton and last conjugate gradient convergence of NL-IPM-TRCG algorithm.
5.8.4 Numerical Performance

We chose R = 3 for the NL-IPM-TRCG algorithm, so that each circle shown on the left plot of Figure 5-3 corresponds to 3 iterations to find \tilde{\Delta p}_i. Comparing Figures 4-2 and 5-3, we conclude that the deviation from the central path introduced by the nonlinearity has a minimal effect on the convergence of the Newton iterations, even when this small number of lagging iterations is used. New values of the barrier parameter \mu^k were calculated at iterations {1, 6, 11, 15, 17, 20, 21, 22, 23, 24}.
From the right plot of the figure, we note that the last TRCG calculation is still well conditioned,
as expected for Primal-Dual methods, converging in only 15 iterations.
The power of the method thus lies in the fact that IPMs tend to be forgiving of deviations from the central path. A small number of lagging iterations (3 or 4) is therefore all that is required to achieve sufficient convergence. The conditioning of the problem will not degrade as u^k \to u* if Primal-Dual IPMs are used, guaranteeing that the TRCG calculations will converge quickly throughout the process. These features, in conjunction with the stability of the central SPD operator G, lead to a very efficient overall algorithm.
Chapter 6

Concluding Remarks

6.1 Summary of Contributions

We presented in this work a method for solving optimal control problems with quadratic costs, nonlinear state equations, and constrained controls. The core of the algorithm was developed for the unconstrained LQP, but its flexibility allowed for extensions to more practical problems by application of IPMs and a lagging technique.
Though the method was developed primarily in the context of ODEs, the goal of this work was to solve problems governed by first-order parabolic partial differential equations. We showed that the method could be very effectively extended to address these problems since the central requirement that G be SPD, stable, and well-conditioned was not compromised by a FEM discretization, as would have been the case, for example, if shooting techniques were employed.
We presented an engineering (heat transfer) problem as an example of the efficiency of the
method, and showed that its numerical performance was very favorable at all levels: conjugate
gradient, Newton projection, and lagging iterations.
The TRCG algorithm is derived from the idea of a state variable operator in the spirit of HUM. Our formulation differs from HUM in that the redefined problem (which can be viewed as a statement of the dual) is minimized over a space defined by a problem-specific inner product. In this space, we showed that the G operator is symmetric positive-definite, allowing for the solution of terminal and regulator problems by a conjugate gradient-based method. In addition, G is shown to be well-conditioned, thus allowing the method to converge quickly and efficiently.
The most costly part of the algorithm is the action of G. Therefore, problems that are characterized by dynamical equations with sparse matrices can take advantage of this sparsity. For FEM discretizations of parabolic partial differential equations, an initial value problem can be solved with O(LN) operations, where N and L are the number of spatial and temporal nodes, respectively. The action of G is twice this cost, and, since 20-30 iterations of the conjugate gradient algorithm are required, the entire problem is solved at roughly twice the order of magnitude of the cost of a single initial-value problem.
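A back-of-envelope version of this cost estimate, with assumed illustrative node counts, is:

    # Back-of-envelope cost model for the above; all counts are illustrative.
    L, N = 1000, 5000             # temporal and spatial nodes (assumed sizes)
    ivp_cost = L * N              # O(LN) operations per initial-value problem
    g_action = 2 * ivp_cost       # action of G = one forward + one adjoint solve
    cg_iters = 25                 # typical conjugate gradient iteration count
    total = cg_iters * g_action
    print(total / ivp_cost)       # ~50 IVP-equivalents for these counts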
6.2 Possible Pitfalls

A certain amount of care must be taken in implementing the TRCG algorithm. Pitfalls, which often arise in the discrete statement of the problem, can be avoided if the following precautions are observed.
There is no way of predicting the correct form of the terminal conditions for \lambda if the time-discretized form of the stationarity conditions is not derived directly from the discretized cost. Though an incorrect terminal condition introduces only O(\Delta t) error in the solution, it will likely destroy the SPD property of G, possibly compromising conjugate gradient iterations. Therefore, it is recommended that the time-discrete stationarity conditions be derived from the discretized cost functional, and that the SPD property of G be verified by a proof similar to the one found in section 3.8 with an appropriate, discrete inner-product.
The spatial discretization of the problem in the FEM context introduces the mass matrix on the left-hand side of the stationarity conditions. We observed that as long as we include this matrix in the definition of the operator R, no modifications need to be made to the algorithm. In fact, any invertible symmetric matrix can be multiplied into the left-hand side of these equations if we define R accordingly.
Finally, we note that it is best to use Primal-Dual variants of IPMs. This guarantees that G is well-conditioned throughout the solution process. Though Primal methods may be easier to implement and may work for some problems, they cannot guarantee this important property of G in general.
6.3 Conclusions

The examples presented in this work demonstrated the predicted effectiveness of the method. The operator G was shown to be central to the efficiency of the algorithm, due to the fact that it is stable, well-conditioned, and SPD in an appropriate inner-product space.
In addressing more general
problems, we noted that IPMs provided a way of guiding the solution to u* without compromising
this operator. More general nonlinear problems can very efficiently be addressed in this context,
since IPMs allow for deviations from the central path introduced by nonlinearities. In fact, we propose that for unconstrained nonlinear problems, it would be advantageous to apply fictitious limits on u beyond expected values and employ the NL-IPM-TRCG algorithm to exploit this
guiding behavior of the method.
Appendix A

Additional Time-Discretization Schemes for TRCG

Here we present two additional time-discretization schemes that may be used with TRCG: Crank-Nicolson and second-order backward difference formulas. The method is shown here to be applicable for these schemes since the main results of chapter 3 hold given appropriate definitions of the R operator and the ((., .)) inner-product.

As suggested in that chapter, we approach the derivation from the cost functionals. The results in this appendix show that such an approach leads to the appropriate form of the operators for general time-discretizations. As a result, TRCG can accommodate any time-discretization scheme provided care is taken in developing R and ((., .)).

For completeness, we present the material below in the context of the logarithmic barrier functions of chapter 4. This is done to explicitly show the form of the augmented cost functionals as new schemes are introduced. Extensions to the lagging procedure for nonlinear problems are trivial.
A.1 Crank-Nicolson

A.1.1 Cost Functional Definition

    \mathcal{P}[\cdot] = (1/2)(\hat{y}^L - \hat{y}_T)^T W_T (\hat{y}^L - \hat{y}_T)
                 + (1/2) \sum_{\ell=1}^{L} (\hat{u}^{\ell-1/2})^T W_u (\hat{u}^{\ell-1/2}) \Delta t
                 + (1/2)(\hat{y}^0 - \hat{y}_R^0)^T W_R (\hat{y}^0 - \hat{y}_R^0) \Delta t/2
                 + (1/2) \sum_{\ell=1}^{L-1} (\hat{y}^\ell - \hat{y}_R^\ell)^T W_R (\hat{y}^\ell - \hat{y}_R^\ell) \Delta t
                 + (1/2)(\hat{y}^L - \hat{y}_R^L)^T W_R (\hat{y}^L - \hat{y}_R^L) \Delta t/2
                 - \mu \sum_{q=1}^{2} \sum_{\ell=1}^{L} \sum_{m=1}^{M} ln(\hat{c}_{m,q}^\ell) \Delta t,     (A.1)

where \hat{c}_{m,q}^\ell denotes the m-th component of c_q(\hat{u}^{\ell-1/2}), and x^{\ell-1/2} is defined as the value of x at t = (\ell - 1/2)\Delta t.
A.1.2 Optimality Conditions

Before applying a particular scheme, C_q must be defined. For Crank-Nicolson, C_1^\ell = diag(\hat{u}^{\ell-1/2} - u_min) and C_2^\ell = diag(u_max - \hat{u}^{\ell-1/2}). We then reintroduce the state and adjoint equations:

    y^0 = y_0,                                                                              (A.2)
    (y^\ell - y^{\ell-1})/\Delta t = (1/2) A (y^\ell + y^{\ell-1}) + B \hat{u}^{\ell-1/2} + F,   (A.3)
    \lambda^L = W_T (y^L - y_T),                                                            (A.4)
    -(\lambda^\ell - \lambda^{\ell-1})/\Delta t = (1/2) A^T (\lambda^\ell + \lambda^{\ell-1})
                 + (1/2) W_R ((y^\ell - y_R^\ell) + (y^{\ell-1} - y_R^{\ell-1})),            (A.5)
    0 = W_u \hat{u}^{\ell-1/2} + B^T \lambda^\ell - \mu ((C_1^\ell)^{-1} - (C_2^\ell)^{-1}) e,   (A.6)

where equations (A.3)-(A.6) are used for \ell = 1, ..., L.
A.1.3 TRCG Components

First, the discrete state-adjoint equations are put into linearized form and separated into the inhomogeneous and homogeneous parts:

    y_I^0 = y_0,                                                                            (A.7)
    (y_I^{\ell+1} - y_I^\ell)/\Delta t = (1/2) A (y_I^{\ell+1} + y_I^\ell)
                 - (1/2) Q^{\ell+1/2} (\lambda_I^{\ell+1} + \lambda_I^\ell) + F^{\ell+1/2},  (A.8)
    \lambda_I^L = -W_T \tilde{y}_T,                                                         (A.9)
    -(\lambda_I^{\ell+1} - \lambda_I^\ell)/\Delta t = (1/2) A^T (\lambda_I^{\ell+1} + \lambda_I^\ell)
                 - (1/2) W_R (\tilde{y}_R^{\ell+1} + \tilde{y}_R^\ell),                      (A.10)

and

    y_H^0 = 0,                                                                              (A.11)
    (y_H^{\ell+1} - y_H^\ell)/\Delta t = (1/2) A (y_H^{\ell+1} + y_H^\ell)
                 - (1/2) Q^{\ell+1/2} (\lambda_H^{\ell+1} + \lambda_H^\ell),                 (A.12)
    \lambda_H^L = W_T y_H^L,                                                                (A.13)
    -(\lambda_H^{\ell+1} - \lambda_H^\ell)/\Delta t = (1/2) A^T (\lambda_H^{\ell+1} + \lambda_H^\ell)
                 + (1/2) W_R (y_H^{\ell+1} + y_H^\ell).                                     (A.14)

Similarly to the time-continuous case, equations (A.7)-(A.10) are uncoupled, and can be solved for y_I^\ell, for \ell = 1, ..., L. The second set, however, is coupled. Again, we define an R-T operator R_CN that solves (A.11) and (A.12) with

    \lambda_H^L = W_T q^L,                                                                  (A.15)
    -(\lambda_H^{\ell+1} - \lambda_H^\ell)/\Delta t = (1/2) A^T (\lambda_H^{\ell+1} + \lambda_H^\ell)
                 + (1/2) W_R (q^{\ell+1} + q^\ell),                                         (A.16)

such that y_H = R_CN q, for all {q^\ell} \in R^{N x (L+1)}. The problem can then be weakly stated as

    ((p, G_CN q))_CN = ((p, y_I))_CN,                                                       (A.17)

where G_CN q = q - R_CN q, and

    ((v,w))_CN = (v^L)^T W_T (w^L)
                 + (1/4) \sum_{\ell=0}^{L-1} (v^{\ell+1} + v^\ell)^T W_R (w^{\ell+1} + w^\ell) \Delta t,   (A.18)

for all {v}, {w} \in R^{N x (L+1)}.
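As an illustration, the discrete inner product (A.18) can be evaluated directly; the sketch below assumes the trapezoidal weights reconstructed above and uses illustrative dense arrays whose rows store v^0, ..., v^L.

    import numpy as np

    # Sketch of the discrete Crank-Nicolson inner product (A.18).
    def ip_cn(v, w, W_T, W_R, dt):
        terminal = v[-1] @ W_T @ w[-1]
        sv = v[1:] + v[:-1]                # v^{l+1} + v^l on each interval
        sw = w[1:] + w[:-1]
        running = 0.25 * dt * np.einsum('li,ij,lj->', sv, W_R, sw)
        return terminal + running

    L, N = 10, 3
    rng = np.random.default_rng(0)
    v = rng.standard_normal((L + 1, N)); w = rng.standard_normal((L + 1, N))
    print(ip_cn(v, w, np.eye(N), 2.0 * np.eye(N), dt=0.1))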
A.1.4 TRCG Proofs

Proposition 8 The operator ((v,w))_CN defines an inner-product space.

Proof. Defining the norm ||v||_CN = ((v,v))_CN^{1/2}, it is simple to show that for any v, w \in R^{N x (L+1)}:

    1. ((v,w))_CN = ((w,v))_CN;
    2. ||v||_CN >= 0;
    3. ||v||_CN = 0 <=> v = 0;
    4. ||\alpha v||_CN = |\alpha| ||v||_CN, for all \alpha \in R.

Proposition 9 The R-T operator R_CN is symmetric negative semi-definite in the space defined by the above inner product.

Proof. This proof closely follows the discussion for the time-continuous case. We rewrite equations (A.12) and (A.16) in generic variables, {z_1, \gamma_1, \gamma_2, p_1, p_2} \in R^{N x (L+1)}, with corresponding initial and final conditions:

    (z_1^{\ell+1} - z_1^\ell)/\Delta t = (1/2) A (z_1^{\ell+1} + z_1^\ell)
                 - (1/2) Q^{\ell+1/2} (\gamma_1^{\ell+1} + \gamma_1^\ell),        z_1^0 = 0,
    -(\gamma_2^{\ell+1} - \gamma_2^\ell)/\Delta t = (1/2) A^T (\gamma_2^{\ell+1} + \gamma_2^\ell)
                 + (1/2) W_R (p_2^{\ell+1} + p_2^\ell),                           \gamma_2^L = W_T p_2^L.

Reducing the above set,

    (\gamma_2^{\ell+1})^T z_1^{\ell+1} - (\gamma_2^\ell)^T z_1^\ell
        = -(1/4)(\gamma_2^{\ell+1} + \gamma_2^\ell)^T Q^{\ell+1/2} (\gamma_1^{\ell+1} + \gamma_1^\ell) \Delta t
          - (1/4)(p_2^{\ell+1} + p_2^\ell)^T W_R (z_1^{\ell+1} + z_1^\ell) \Delta t,

and adding from \ell = 0 to (L - 1),

    (p_2^L)^T W_T z_1^L = -(1/4) \sum_{\ell=0}^{L-1} (\gamma_2^{\ell+1} + \gamma_2^\ell)^T Q^{\ell+1/2} (\gamma_1^{\ell+1} + \gamma_1^\ell) \Delta t
                          - (1/4) \sum_{\ell=0}^{L-1} (p_2^{\ell+1} + p_2^\ell)^T W_R (z_1^{\ell+1} + z_1^\ell) \Delta t,

where we have applied the initial (z_1^0 = 0) and final (\gamma_2^L = W_T p_2^L) conditions of the problem. Rearranging and applying the operator R_CN p_1^\ell = z_1^\ell, we obtain

    (p_2^L)^T W_T (R_CN p_1)^L
        + (1/4) \sum_{\ell=0}^{L-1} (p_2^{\ell+1} + p_2^\ell)^T W_R ((R_CN p_1)^{\ell+1} + (R_CN p_1)^\ell) \Delta t
        = -(1/4) \sum_{\ell=0}^{L-1} (\gamma_2^{\ell+1} + \gamma_2^\ell)^T Q^{\ell+1/2} (\gamma_1^{\ell+1} + \gamma_1^\ell) \Delta t.

By identifying the left-hand side of the above equation as ((p_2, R_CN p_1))_CN we see that

    ((p_2, R_CN p_1))_CN = ((R_CN p_1, p_2))_CN,

and that

    ((p, R_CN p))_CN = -(1/4) \sum_{\ell=0}^{L-1} (\gamma^{\ell+1} + \gamma^\ell)^T Q^{\ell+1/2} (\gamma^{\ell+1} + \gamma^\ell) \Delta t <= 0,
        for all {p} \in R^{N x (L+1)}.

The operator R_CN is therefore SNSD in the ((.,.))_CN space.

From the above, we see that G_CN (G_CN p = p - R_CN p) is SPD in ((.,.))_CN, and therefore a unique solution q \in R^{N x (L+1)} of equation (A.17) can be found by conjugate gradient methods in the space defined by this inner product.
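The practical content of these propositions is that the conjugate gradient recursion may be run with every Euclidean dot product replaced by the problem-specific inner product. A generic sketch follows; G_apply and ip are hypothetical callables, and the recursion is valid only when the operator is SPD with respect to ip, as shown above.

    import numpy as np

    # Generic conjugate gradient run entirely in a user-supplied inner
    # product ip(.,.), sketching how TRCG uses ((.,.))_CN.
    def cg_in_ip(G_apply, ip, b, x0, tol=1e-10, maxit=50):
        x = x0.copy()
        r = b - G_apply(x)                 # residual of G x = b
        p = r.copy()
        rr = ip(r, r)
        for _ in range(maxit):
            if np.sqrt(rr) < tol:
                break
            Gp = G_apply(p)
            alpha = rr / ip(p, Gp)         # exact line search in the ip-norm
            x = x + alpha * p
            r = r - alpha * Gp
            rr_new = ip(r, r)
            p = r + (rr_new / rr) * p      # ip-conjugate search direction
            rr = rr_new
        return x

    # Example use with the Euclidean inner product and a small SPD matrix:
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    x = cg_in_ip(lambda v: A @ v, lambda a, b: a @ b,
                 b=np.array([1.0, 2.0]), x0=np.zeros(2))
    print(x)                               # approx [0.0909, 0.6364]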
A.2 Second-Order Backward Difference

A.2.1 Cost Functional Definition

    \mathcal{P}[\cdot] = (1/2)(\hat{y}^L - \hat{y}_T)^T W_T (\hat{y}^L - \hat{y}_T)
                 + (1/2) \sum_{\ell=1}^{L} (\tilde{u}^\ell)^T W_u (\tilde{u}^\ell) \Delta t
                 + (1/2)(\hat{y}^1 - \hat{y}_R^1)^T W_R (\hat{y}^1 - \hat{y}_R^1) \Delta t/2
                 + (1/2) \sum_{\ell=2}^{L-1} (\hat{y}^\ell - \hat{y}_R^\ell)^T W_R (\hat{y}^\ell - \hat{y}_R^\ell) \Delta t
                 + (1/2)(\hat{y}^L - \hat{y}_R^L)^T W_R (\hat{y}^L - \hat{y}_R^L) \Delta t/2
                 - \mu \sum_{q=1}^{2} \sum_{\ell=1}^{L} \sum_{m=1}^{M} ln(\hat{c}_{m,q}^\ell) \Delta t,     (A.19)

where \hat{c}_{m,q}^\ell denotes the m-th component of c_q(\tilde{u}^\ell), with \tilde{u}^\ell = \hat{u}^{1/2} for \ell = 1 and \tilde{u}^\ell = (3/2)\hat{u}^{\ell-1/2} - (1/2)\hat{u}^{\ell-3/2} for \ell = 2, ..., L.
A.2.2 Optimality Conditions

For the second-order BDF, C_q^\ell = diag(c_q(\hat{u}^{1/2})) for \ell = 1, and C_q^\ell = diag(c_q(\tilde{u}^\ell)) for \ell = 2, ..., L, and

    y^0 = y_0;                                                                              (A.20)
    (y^1 - y^0)/\Delta t = A y^1 + B \hat{u}^{1/2} + F;                                     (A.21)
    ((3/2) y^\ell - 2 y^{\ell-1} + (1/2) y^{\ell-2})/\Delta t = A y^\ell + B \tilde{u}^\ell + F,
                 for \ell = 2, ..., L;                                                      (A.22)
    (3/2) \lambda^L = A^T \lambda^L \Delta t + W_R (y^L - y_R^L) \Delta t/2 + W_T (y^L - y_T);   (A.23)
    (3/2) \lambda^{L-1} - 2 \lambda^L = A^T \lambda^{L-1} \Delta t + W_R (y^{L-1} - y_R^{L-1}) \Delta t;   (A.24)
    (3/2) \lambda^\ell - 2 \lambda^{\ell+1} + (1/2) \lambda^{\ell+2} = A^T \lambda^\ell \Delta t + W_R (y^\ell - y_R^\ell) \Delta t,
                 for \ell = L-2, ..., 2;                                                    (A.25)
    \lambda^1 - 2 \lambda^2 + (1/2) \lambda^3 = A^T \lambda^1 \Delta t + W_R (y^1 - y_R^1) \Delta t/2;   (A.26)
    0 = W_u \tilde{u}^\ell + B^T \lambda^\ell - \mu ((C_1^\ell)^{-1} - (C_2^\ell)^{-1}) e,
                 for \ell = 1, ..., L,                                                      (A.27)

where \tilde{u}^\ell is as defined in (A.19).
A.2.3 TRCG Components

We begin by rewriting the state-adjoint equations:

    y_I^0 = y_0,        (y_I^1 - y_I^0)/\Delta t = A y_I^1 - Q^1 \lambda_I^1 + D^1,          (A.28)
    ((3/2) y_I^\ell - 2 y_I^{\ell-1} + (1/2) y_I^{\ell-2})/\Delta t = A y_I^\ell - Q^\ell \lambda_I^\ell + D^\ell
                 (for \ell = 2, ..., L-1),                                                  (A.29)
    \lambda_I^L = -W_T \tilde{y}_T,        \lambda_I^{L+1} = 2 \lambda_I^L,                  (A.30)
    (3/2) \lambda_I^\ell - 2 \lambda_I^{\ell+1} + (1/2) \lambda_I^{\ell+2} = A^T \lambda_I^\ell \Delta t - W_R \tilde{y}_R^\ell \Delta t
                 (for \ell = L-1, ..., 1),                                                  (A.31)

and

    y_H^0 = 0,                                                                              (A.32)
    ((3/2) y_H^\ell - 2 y_H^{\ell-1} + (1/2) y_H^{\ell-2})/\Delta t = A y_H^\ell - Q^\ell \lambda_H^\ell
                 (for \ell = 2, ..., L-1),                                                  (A.33)
    \lambda_H^L = W_T (2 y_H^{L-1} - y_H^{L-2}),        \lambda_H^{L+1} = 2 \lambda_H^L,     (A.34)
    (3/2) \lambda_H^\ell - 2 \lambda_H^{\ell+1} + (1/2) \lambda_H^{\ell+2} = A^T \lambda_H^\ell \Delta t + W_R y_H^\ell \Delta t
                 (for \ell = L-1, ..., 1),                                                  (A.35)

where the first set of equations is uncoupled (and can be solved for y_I^\ell, for \ell = 1, ..., L-1, with y_I^L = 2 y_I^{L-1} - y_I^{L-2}) while the second set is coupled. We define R_BD as solving (A.32) and (A.33) with

    \lambda_H^L = W_T q^L,        \lambda_H^{L+1} = 2 \lambda_H^L        (q^L = 2 q^{L-1} - q^{L-2}),   (A.36)
    (3/2) \lambda_H^\ell - 2 \lambda_H^{\ell+1} + (1/2) \lambda_H^{\ell+2} = A^T \lambda_H^\ell \Delta t + W_R q^\ell \Delta t
                 (for \ell = L-1, ..., 1),                                                  (A.37)

such that (R_BD q)^\ell = y_H^\ell for \ell = 1, ..., L-1, and (R_BD q)^L = 2 y_H^{L-1} - y_H^{L-2}, given {q^\ell} \in R^{N x L}. The problem can then be weakly stated as

    ((p, G_BD q))_BD = ((p, y_I))_BD,                                                       (A.38)

where G_BD q = q - R_BD q, and

    ((v,w))_BD = (2 v^{L-1} - v^{L-2})^T W_T (2 w^{L-1} - w^{L-2})
                 + \sum_{\ell=1}^{L-1} (v^\ell)^T W_R (w^\ell) \Delta t,                     (A.39)

for all {v}, {w} \in R^{N x L}.
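Analogously to the Crank-Nicolson case, the discrete inner product (A.39) can be evaluated directly; the sketch below follows the weights reconstructed above, with illustrative arrays whose rows store v^1, ..., v^L.

    import numpy as np

    # Sketch of the BDF2 inner product (A.39); rows of v, w store v^1 ... v^L.
    def ip_bd(v, w, W_T, W_R, dt):
        v_end = 2.0 * v[-2] - v[-3]      # extrapolated value 2 v^{L-1} - v^{L-2}
        w_end = 2.0 * w[-2] - w[-3]
        terminal = v_end @ W_T @ w_end
        running = dt * np.einsum('li,ij,lj->', v[:-1], W_R, w[:-1])
        return terminal + running

    L, N = 10, 3
    rng = np.random.default_rng(1)
    v = rng.standard_normal((L, N)); w = rng.standard_normal((L, N))
    print(ip_bd(v, w, np.eye(N), np.eye(N), dt=0.1))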
A.2.4 TRCG Proofs

Proposition 10 The operator ((v,w))_BD defines an inner-product space.

Proof. Defining the norm ||v||_BD = ((v,v))_BD^{1/2}, it is simple to show that for any v, w \in R^{N x L}:

    1. ((v,w))_BD = ((w,v))_BD;
    2. ||v||_BD >= 0;
    3. ||v||_BD = 0 <=> v = 0;
    4. ||\alpha v||_BD = |\alpha| ||v||_BD, for all \alpha \in R.

Proposition 11 The R-T operator R_BD is symmetric negative semi-definite relative to the above inner product.

Proof. Rewriting equations (A.33) and (A.37) in generic variables, {z_1, \gamma_1, \gamma_2, p_1, p_2} \in R^{N x L},

    (3/2) z_1^\ell - 2 z_1^{\ell-1} + (1/2) z_1^{\ell-2} = A z_1^\ell \Delta t - Q^\ell \gamma_1^\ell \Delta t,
    (3/2) \gamma_2^\ell - 2 \gamma_2^{\ell+1} + (1/2) \gamma_2^{\ell+2} = A^T \gamma_2^\ell \Delta t + W_R p_2^\ell \Delta t.

Multiplying the first equation by (\gamma_2^\ell)^T and the second by (z_1^\ell)^T, subtracting, and summing the resulting telescoping identity from \ell = 2 to (L - 1), we are left with boundary terms at \ell = 1 and \ell = L on the left-hand side and

    - \sum_{\ell=2}^{L-1} (\gamma_2^\ell)^T Q^\ell \gamma_1^\ell \Delta t
    - \sum_{\ell=2}^{L-1} (p_2^\ell)^T W_R z_1^\ell \Delta t

on the right-hand side, where the initial condition z_1^0 = 0 has been used. Substituting the closure relations from (A.28)-(A.37), namely \gamma_2^{L+1} = 2 \gamma_2^L, z_1^L = 2 z_1^{L-1} - z_1^{L-2}, \gamma_2^L = W_T (2 p_2^{L-1} - p_2^{L-2}), and (R_BD p_1)^L = 2 z_1^{L-1} - z_1^{L-2}, together with the first-step relation that converts the \ell = 1 boundary term into (\gamma_2^1)^T Q^1 \gamma_1^1 \Delta t, and rearranging, we obtain

    (2 p_2^{L-1} - p_2^{L-2})^T W_T (R_BD p_1)^L + \sum_{\ell=1}^{L-1} (p_2^\ell)^T W_R (R_BD p_1)^\ell \Delta t
        = - \sum_{\ell=1}^{L-1} (\gamma_2^\ell)^T Q^\ell \gamma_1^\ell \Delta t.

By identifying the left-hand side of the above equation as ((p_2, R_BD p_1))_BD we see that

    ((p_2, R_BD p_1))_BD = ((R_BD p_1, p_2))_BD,

and that

    ((p, R_BD p))_BD = - \sum_{\ell=1}^{L-1} (\gamma^\ell)^T Q^\ell \gamma^\ell \Delta t <= 0,
        for all {p} \in R^{N x L}.

The operator R_BD is therefore SNSD in the ((.,.))_BD space.

Similarly to before, the operator G_BD is SPD in ((.,.))_BD, and a unique solution q \in R^{N x L} of equation (A.38) can be found by conjugate gradient methods in the space defined by this inner product.