A "HUM" Conjugate Gradient Algorithm for Constrained Nonlinear Optimal Control: Terminal and Regulator Problems

by Ivan B. Oliveira

Submitted to the Department of Mechanical Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, February 2002

© Ivan B. Oliveira, MMII. All rights reserved.

The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis document in whole or in part.

Author: Department of Mechanical Engineering, January 22, 2002
Certified by: Anthony T. Patera, Professor of Mechanical Engineering, Thesis Supervisor
Accepted by: Ain A. Sonin, Chairman, Department Committee on Graduate Students

Abstract

Optimal control problems often arise in engineering applications when a known desired behavior is to be imposed on a dynamical system. Typically, there is a tradeoff between performance and controller use that can be quantified as a total cost functional of the state and control histories. Problems stated in such a manner are not required to follow an exact desired behavior, alleviating potential controllability issues. We present a method for solving large deterministic optimal control problems defined by quadratic cost functionals, nonlinear state equations, and box-type constraints on the control variables.
The algorithm has been developed so that systems governed by general parabolic partial differential equations can be solved. The problems addressed are of the regulator-terminal type, in which deviations from specified state variable behavior are minimized over the entire trajectory as well as at the final time. The core of the algorithm consists of an extension of the Hilbert Uniqueness Method which, we show, can be considered a statement of the dual. With the definition of a problem-specific inner-product space, a formulation is constructed around a well-conditioned, stable, SPD operator, thus leading to fast rates of convergence when solved by, for instance, a conjugate gradient procedure (denoted here TRCG). Total computational time scales roughly as twice the order of magnitude of the computational cost of a single initial-value problem.

Standard logarithmic barrier functions and Newton methods are employed to address the hard constraints on control variables of the type $u_{\min} \le u \le u_{\max}$. We have shown that the TRCG algorithm allows for the incorporation of these techniques, and that convergence results maintain advantageous properties found in the standard (linear programming) literature. The TRCG operator is shown to maintain its symmetric positive-definiteness for temporal discretizations, a property that is crucial to the practical implementation of the proposed algorithm. Sample calculations are presented which illustrate the performance of the method when applied to a nonlinear heat transfer problem governed by partial differential equations.

Thesis Supervisor: Anthony T. Patera
Title: Professor of Mechanical Engineering

Acknowledgments

I would like to acknowledge the help and guidance provided by my Ph.D. advisor Professor Tony Patera. His insights and flexibility have allowed for my pursuit of the present topic, while his humor has made the experience enjoyable.
I am also very thankful to my committee members Professors Robert Freund and Jean Jacques Slotine for their useful comments, interesting suggestions, and constant encouragement. Throughout my student career I have been fortunate to work under the guidance of other faculty to whom I am indebted for their encouragement and support. Among these, I am particularly thankful to Professors Simone Hochgreb and Harsha Chelliah.

My labmates Dimitrios, Thomas, Christophe, Karen, Yuri, and Sid have made the research process fun, and the exposure to their research topics has greatly enhanced my appreciation of other areas of computational science. We have all been rather fortunate to have the help of Mrs. Debra Blanchard, who is somehow able to keep the group running smoothly and efficiently.

I must also acknowledge my best friends, whose humor and support have been priceless and essential. Colleen, Doug, Matt, Tom W., Laura, Lauren, Kenich, Deanna, Fer, Tom C., Marcelo, Fabio - thanks. But I'd like to give special thanks to my two closest friends: my sisters Lara and Iara.

Above all I must acknowledge my parents. It has always been clear to me that without their love and the many sacrifices they have made, I would not have been able to pursue my dreams. I dedicate this thesis to my Mom and Dad as a small token of my appreciation.

Contents

1 Introduction
  1.1 Problem Statement
    1.1.1 General Optimal Control Problem Statement
    1.1.2 A Nonlinear Example
    1.1.3 Partial Differential Equations
  1.2 Optimality Conditions
    1.2.1 General Case
    1.2.2 Nonlinear Example
    1.2.3 An LQP Example
  1.3 Dual Problem Statement
    1.3.1 Fenchel Duality in Optimal Control
    1.3.2 Reformulation of the Problem
2 Existing Numerical Methods and Literature Review
  2.1 Introduction
  2.2 Parametric Optimization
  2.3 Riccati Equations
  2.4 Dynamic Programming
  2.5 Shooting Methods
  2.6 Newton-Raphson
  2.7 Sequential Quadratic Programming (SQP)
    2.7.1 SQP in Nonlinear Optimization
    2.7.2 SQP in Optimal Control
  2.8 Gradient Methods
    2.8.1 General Gradient Methods
    2.8.2 The Hilbert Uniqueness Method (HUM)
  2.9 The Proposed Method
3 TR Conjugate Gradient Algorithm - Linear-Quadratic, Unconstrained Problems
  3.1 Motivation
  3.2 Problem Statement
  3.3 Optimality Conditions for the LQP Problem
  3.4 The Hilbert Uniqueness Method for the Terminal-Regulator LQP Problem
    3.4.1 Separation of Inhomogeneous and Homogeneous Parts
    3.4.2 $\mathcal{R}$ and $\mathcal{G}$ Operators
    3.4.3 The Terminal-Regulator $((\cdot,\cdot))$ Inner Product
    3.4.4 Proof of SPD Property
    3.4.5 HUM from the Dual Problem
    3.4.6 Differences from Previous HUM
  3.5 General Conjugate Gradient (CG) Algorithms
    3.5.1 General Conjugate Direction Methods
    3.5.2 The General Conjugate Gradient Method
  3.6 Terminal-Regulator Conjugate Gradient Algorithm (TRCG)
    3.6.1 The Skeletal TRCG Algorithm
    3.6.2 Convergence Results for TRCG
  3.7 Stopping Criterion
  3.8 Time Discretization - Implicit-Euler
    3.8.1 Discretization of Problem Statement
    3.8.2 Discretization of Solution Procedure
  3.9 Detailed TRCG Algorithm
  3.10 Numerical Properties of Method
    3.10.1 Storage Requirements
    3.10.2 Conditioning of $\mathcal{G}$ Operator
  3.11 Formulation for Partial Differential Equations
    3.11.1 Problem Statement
    3.11.2 Weak Formulation and Galerkin Approximation
    3.11.3 Finite Element Approximation
    3.11.4 The Governing Equations
    3.11.5 The Cost Functional
    3.11.6 Optimality Conditions
    3.11.7 Effect on the Conditioning of $\mathcal{G}$ - A One-Dimensional Example
  3.12 Example Problem: Linear, Two Dimensional Heat Transfer
    3.12.1 Problem Data
    3.12.2 General Results
    3.12.3 Computational Performance
4 Interior Point Methods - Linear, Constrained Problems
  4.1 Motivation
  4.2 Problem Statement
  4.3 Optimality Conditions for the Constrained LQP Problem
  4.4 Interior Point Methods (IPM) for Optimal Control
    4.4.1 Logarithmic Barrier Functions
    4.4.2 Proofs of convergence
    4.4.3 State and Adjoint Equations
    4.4.4 Primal IPM
    4.4.5 Correcting Values that Lie Outside Box Constraints
    4.4.6 Initializing the Algorithm
    4.4.7 Barrier Parameter Rate of Reduction
    4.4.8 Primal-Dual IPM
  4.5 IPM-TRCG Algorithm
  4.6 Example Problem: Linear, Constrained 2D Heat Transfer
    4.6.1 General Results
    4.6.2 Numerical Performance
5 Lagging Procedure - Nonlinear, Constrained Problems
  5.1 Motivation
  5.2 Problem Statement
  5.3 Optimality Conditions for the Constrained NLQP Problem
  5.4 Linearization of Optimality Conditions
    5.4.1 Naive Implementation
    5.4.2 Proposed Algorithm - Separation of Parts
    5.4.3 Initializing the Algorithm
  5.5 Sufficient Convergence for Lagging Procedure
  5.6 Determining the Newton Step
  5.7 NL-IPM-TRCG Algorithm
  5.8 Example Problem: Nonlinear, Constrained 2D Heat Transfer
    5.8.1 Problem Statement
    5.8.2 FEM Formulation
    5.8.3 General Results
    5.8.4 Numerical Performance
6 Concluding Remarks
  6.1 Summary of Contributions
  6.2 Possible Pitfalls
  6.3 Conclusions
A Additional Time-Discretization Schemes for TRCG
  A.1 Crank-Nicholson
    A.1.1 Cost Functional Definition
    A.1.2 Optimality Conditions
    A.1.3 TRCG Components
    A.1.4 TRCG Proofs
  A.2 Second-Order Backward Difference
    A.2.1 Cost Functional Definition
    A.2.2 Optimality Conditions
    A.2.3 TRCG Components
    A.2.4 TRCG Proofs
Bibliography

List of Figures

3-1 Motivation - the goal is to control the temperature in the shaded region.
3-2 Condition number of $\mathcal{H}$ and $\mathcal{G}$ for sample one-dimensional problem.
3-3 Diagram of sample problem domain (7 cm x 3 cm).
3-4 Time history of desired regulator behavior $y_{R,RS}$ (desired temperature at reaction surface).
3-5 Mesh used for problem discretization, N = 3960 (7 cm x 3 cm).
3-6 Optimal control and $\Gamma_{RS}$ temperature histories, $J^* = 2.37 \times 10^7$.
3-7 Residual value of error $\|u^k - u^*\|$ for TRCG iterations.
3-8 Structure of the stiffness matrix A.
4-1 Optimal control and $\Gamma_{RS}$ temperature histories for constrained problem, $J^* = 8.19 \times 10^7$.
4-2 Newton and last conjugate gradient convergence of IPM-TRCG algorithm.
5-1 Diagram of sample nonlinear heat transfer problem domain (7 cm x 3 cm).
5-2 Optimal control and $\Gamma_{RS}$ temperature histories for nonlinear constrained problem, $J = 1.42 \times 10^8$.
5-3 Newton and last conjugate gradient convergence of NL-IPM-TRCG algorithm.

Chapter 1

Introduction

1.1 Problem Statement

"Optimal control" problems encompass a wide range of scientific applications in a number of fields. Here, we are concerned with developing optimal strategies for dynamic systems governed by parabolic differential equations (ordinary and partial). Areas that find applications for such systems include engineering sciences (mechanical, electrical, chemical, industrial), finance, economics, and others. In fact, these systems are present in most areas of modern technology. For generality, we consider linear and nonlinear systems.

Modern control theory has come to rely on optimal control due to limitations of the classical approach of pole placement. For example, given an Nth-order system subject to M control variables, only N poles are available for a controllable system. Non-dynamic controllers require NM parameters to be specified for feedback control, allowing for an infinite combination of parameters to be selected without strong theoretical guidance.

Another problem of classical methods is that there is no clear guidance in pole placement when designing controllers. Often, an intuitive feel is required of the engineer so that a desirable speed of response is achieved. Since MIMO systems present a somewhat unpredictable coupling not present in the SISO systems for which classical methods were developed, the intuitive approach becomes undesirable and less than robust.

Finally, problems of controllability can arise if we require the system to behave in a predefined manner. If the system is uncontrollable there exist subspaces of the state space that cannot be affected by the control variables. This is especially true for systems governed by partial differential equations, since the desired behavior may be too arbitrary and thus impossible.
The optimal control statement of the problem avoids this complication by allowing slack in performance at a known cost. Stabilizable systems can thus be effectively controlled.

1.1.1 General Optimal Control Problem Statement

We are interested in systems that are large in the sense that many variables must be stored and solved for during a computed simulation. Here, it is always assumed that it is possible to either directly or indirectly control the dynamic systems through a variable called the control variable and denoted as $u(t) \in \mathbb{R}^M$. The state of the system is represented by the state variable $y(t) \in \mathbb{R}^N$. We are concerned with developing algorithms for deterministic control problems, and so it is assumed that a known relation exists between the state and control variables of the form

\[ y(t_0) = y_0, \tag{1.1} \]
\[ \dot{y} = f(y, u), \quad \forall t \in [t_0, t_f], \tag{1.2} \]

where $\dot{y}$ represents the partial derivative of y with respect to time t.

Typically, it is possible to define a function that naturally reflects a cost of the system's performance. Seeking to minimize this "cost functional" while obeying the system's governing equations (1.1)-(1.2) is the goal of optimal control algorithms. In the problems of interest here, the cost functional J can usually be expressed as

\[ J = \phi[y(t_f), t_f] + \int_{t_0}^{t_f} L[y(t), u(t), t] \, dt. \tag{1.3} \]

The symbol $\phi[y(t_f), t_f]$ represents a final time penalty and the integrand $L[y(t), u(t), t]$ dictates the nature of the optimizing solution (and is called the Lagrangian; see footnote 1). Equation (1.3) represents the Bolza type form of the problem since it contains the integral and the final term. Equivalent forms with only the first or second term can be derived and are called Mayer type and Lagrange type, respectively.

In addition to the governing equations (1.1)-(1.2), it is generally possible that there exist inequality constraints on the state or control variables.
For example, if the control variable cannot exceed a maximum safety value, such a constraint would take the form $u(t) \le u_{\max}$. A general expression can be given as

\[ c_j[u(t), y(t)] \ge 0, \quad \text{for } j = 1, \ldots, J, \tag{1.4} \]

where $c_j \in \mathbb{R}^R$. In the example above, $c(\cdot) = u_{\max} - u(t)$, R = M, and J = 1.

Footnote 1: Although, in a more general setting, the term Lagrangian usually represents a cost functional adjoined with the constraints, the terminology here is more consistent with the traditional use in optimal control literature.

Having provided the necessary definitions, we can state the problem in a concise manner as a mathematical programming problem of the form

\[ \text{find } u^* = \arg\min_{u \in U} J[u] \tag{1.5} \]
\[ \text{subject to } y(t_0) = y_0, \quad \dot{y} = f(y, u) \ \ \forall t \in [t_0, t_f], \quad c_j(u, y) \ge 0, \ \ j = 1, \ldots, J, \tag{1.6} \]

where $U = C^0(t_0, t_f; \mathbb{R}^M)$.

1.1.2 A Nonlinear Example

We can illustrate the above definitions with a more concrete example of a nonlinear system. Suppose the cost functional reflects the deviation of the final state from a desired state $y_T$. This can be expressed as a quadratic penalty $\phi[y(t_f), t_f] = \tfrac{1}{2}(y_f - y_T)^T W_T (y_f - y_T)$ (to simplify notation, $y_f = y(t_f)$). During the time interval $[t_0, t_f]$ there may also exist a desired state history $y_R$. In addition, the cost of the control can also be expressed quadratically, so $L[y(t), u(t), t] = \tfrac{1}{2}\left[(y - y_R)^T W_R (y - y_R) + u^T W_U u\right]$, where it is assumed that $W_U$ is symmetric positive-definite (SPD). The total quadratic cost is then

\[ J = \tfrac{1}{2}(y_f - y_T)^T W_T (y_f - y_T) + \tfrac{1}{2}\int_{t_0}^{t_f} \left[ (y - y_R)^T W_R (y - y_R) + u^T W_U u \right] dt. \tag{1.7} \]

An example of a system governed by nonlinear equations is

\[ \dot{y} = Ay + Bu + K(y - y^4). \tag{1.8} \]

For simplicity, we assume for the moment that K is a matrix and $y^4$ represents a vector y with each term taken to the fourth power. A more rigorous and realistic presentation of this example will be given in section 5.8.

The inequality constraints that are most common in control problems will be applied. Most controllers have lower and upper limits between which they may operate.
For example, $u_{\min}$ may represent a positivity constraint and $u_{\max}$ a saturation level. Therefore, we get J = 2 ($j_1$ := min; $j_2$ := max), and

\[ c_{\min}[u(t)] = u(t) - u_{\min} \ge 0, \tag{1.9} \]
\[ c_{\max}[u(t)] = u_{\max} - u(t) \ge 0. \tag{1.10} \]

Now we simply restate the problem with (1.5)-(1.6). This type of problem is a nonlinear quadratic program. If K = 0 we get an important subcategory called linear-quadratic programs (LQP), which present several nice features that facilitate solution algorithms.

1.1.3 Partial Differential Equations

More specifically, the problems of interest here are derived from Partial Differential Equations (PDEs) of evolution. These can be generally expressed as

\[ \dot{y} + \mathcal{A}(y) = \mathcal{B}u. \tag{1.11} \]

The operator $\mathcal{A}$ may be linear or nonlinear (as in the example above) and it is implied in (1.11) that appropriate boundary conditions are imposed. The operator $\mathcal{B}$ maps the "space of controls" into the state space [13]. A typical example would be $\mathcal{B} = B \in \mathbb{R}^{N \times M}$. Furthermore, through this operator, the control u can be either applied throughout the state space domain $\Omega \subset \mathbb{R}^d$ (distributed control) or on the boundary $\Gamma \subset \Omega$ (boundary control). Again, initial conditions must also be specified:

\[ y(t = 0) = y_0. \tag{1.12} \]

Two assumptions are made at this point. The first is that the system (1.11)-(1.12) with a control history u(t) uniquely defines a solution. The second assumption is that the system is approximately controllable. If a system is exactly controllable, there exists a u(t) which drives the system y(T) to any given member $y_T$ of the state space at time T > 0 from any given initial condition $y_0$. Relaxing this definition, we may assume the system to be approximately controllable if we only require y(T) to belong to a 'small' neighborhood of $y_T$ [13]. Clearly, exactly controllable systems are a strict (and, in practice, relatively small) subset of approximately controllable systems. Optimal control allows us to control systems that are not exactly controllable.
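As a rough illustration of the definitions in section 1.1.2, the following Python sketch evaluates a scalar analogue (N = M = 1) of the dynamics (1.8), the quadratic cost (1.7), and the box constraints (1.9)-(1.10). It is a minimal sketch under stated assumptions: the explicit-Euler march and the rectangle-rule quadrature are chosen here only for brevity and are not the discretization developed in this work, and the function names (`simulate`, `quadratic_cost`, `satisfies_box`) are hypothetical.

```python
def simulate(a, b, k, y0, u, dt):
    """Explicit-Euler march (an assumption) of the scalar analogue of (1.8):
    dy/dt = a*y + b*u + k*(y - y**4), y(t0) = y0."""
    y = [y0]
    for u_n in u:
        y_n = y[-1]
        y.append(y_n + dt * (a * y_n + b * u_n + k * (y_n - y_n**4)))
    return y


def quadratic_cost(y, u, dt, y_T, y_R, w_T, w_R, w_U):
    """Rectangle-rule approximation (an assumption) of the Bolza cost (1.7)."""
    terminal = 0.5 * w_T * (y[-1] - y_T) ** 2
    running = sum(
        0.5 * (w_R * (y_n - y_R) ** 2 + w_U * u_n**2) * dt
        for y_n, u_n in zip(y[:-1], u)
    )
    return terminal + running


def satisfies_box(u, u_min, u_max):
    """True when c_min = u - u_min >= 0 and c_max = u_max - u >= 0
    hold at every time step, as in (1.9)-(1.10)."""
    return all(u_min <= u_n <= u_max for u_n in u)


# Illustrative data only: a constant control applied over 100 steps.
dt, steps = 0.01, 100
u = [0.5] * steps
y = simulate(a=-1.0, b=1.0, k=0.1, y0=0.0, u=u, dt=dt)
J = quadratic_cost(y, u, dt, y_T=0.3, y_R=0.3, w_T=1.0, w_R=1.0, w_U=0.1)
feasible = satisfies_box(u, u_min=0.0, u_max=1.0)
```

Setting `k=0.0` recovers the scalar linear-quadratic case, which is a convenient sanity check because the state can then be integrated by hand.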
1.2 Optimality Conditions

1.2.1 General Case

The modern statement of stationary conditions for problem (1.5,1.6) is known in the optimal control literature as the "Pontryagin minimum principle." (Also known as the "Pontryagin maximum principle" due to an alternative definition of the Hamiltonian, see below.) Using notation from section 1.1.1, we define the "Hamiltonian" as

\[ \mathcal{H}(y, u, \lambda, \nu, t) = L(y, u, t) + \langle \lambda(t), f(y, u, t) \rangle + \sum_{j=1}^{J} \langle \nu_j(t), c_j(y, u, t) \rangle, \tag{1.13} \]

where $\lambda(t) \in \mathbb{R}^N$ and $\nu_j(t) \in \mathbb{R}^M$ are termed "adjoint" variables. Since $\nu_j$ are adjoined to inequality constraints, we have, for j = 1, ..., J,

\[ \nu_j(t) \begin{cases} = 0 & \text{if } c_j(y, u, t) > 0, \\ \le 0 & \text{if } c_j(y, u, t) = 0. \end{cases} \tag{1.14} \]

The minimum principle can now be stated as

\[ \dot{y}^* = \mathcal{H}_\lambda(y^*, u^*, \lambda^*, \nu^*, t), \quad y^*(0) = y_0, \tag{1.15} \]
\[ -\dot{\lambda}^* = \mathcal{H}_y(y^*, u^*, \lambda^*, \nu^*, t), \quad \lambda^*(t_f) = \phi_y[y^*(t_f)], \tag{1.16} \]
\[ \mathcal{H}(y^*, u^*, \lambda^*, \nu^*, t) = \min_{u \in U} \mathcal{H}(y^*, u, \lambda^*, \nu^*, t). \tag{1.17} \]

The notation * indicates optimum values and subscripts represent partial derivatives with respect to the indicated variable. These conditions are achieved by calculating the stationary conditions of the nonlinear program and manipulating the augmented cost functional

\[ \bar{J} = \phi[y(t_f), t_f] + \int_{t_0}^{t_f} \left\{ \mathcal{H}[y(t), u(t), t] - \langle \lambda(t), \dot{y}(t) \rangle \right\} dt. \]

Conditions (1.15)-(1.17) are necessary conditions for optimality of the problem. However, sufficient conditions must also hold. Here, we assume normality (controllable problem with stable neighboring paths) and the Jacobi condition (empty set of conjugate points); see footnote 2. Given these conditions, sufficiency is achieved if the Legendre-Clebsch condition holds:

\[ \mathcal{H}_{uu}(y^*, u^*, \lambda^*, \nu^*, t) > 0. \tag{1.18} \]

1.2.2 Nonlinear Example

Returning to our nonlinear example, we can express the necessary conditions for the example given in section 1.1.2 as

\[ \dot{y}^* = Ay^* + Bu^* + K(y^* - y^{*4}), \quad y^*(0) = y_0, \tag{1.19} \]
\[ -\dot{\lambda}^* = A^T \lambda^* - 4KY^*\lambda^* + W_R(y^* - y_R) + C_y^T \nu, \quad \lambda^*(t_f) = W_T(y^*(t_f) - y_T), \tag{1.20} \]
\[ 0 = W_U u^* + B^T \lambda^* + C_u^T \nu, \tag{1.21} \]

where $C_y$ and $C_u$ represent matrices with elements $(\partial c_j / \partial y)$ and $(\partial c_j / \partial u)$, respectively, and $Y^* = \mathrm{diag}(y^{*3})$.
For SPD $W_U$, the Legendre-Clebsch condition, $\mathcal{H}_{uu}(y^*, u^*, \lambda^*, \nu^*, t) = W_U > 0$, shows that any stationary point must necessarily be a local minimizer for this problem. In principle, we should be able to solve for the optimal solution $(y^*, u^*, \lambda^*, \nu^*)$ that satisfies (1.19)-(1.21) without concern for any other sufficiency conditions.

In practice, however, iterative numerical methods must evaluate search directions from nonstationary points. In this case, the simple Legendre-Clebsch condition does not apply. The Hessian of the augmented cost functional is important in determining the convexity of the minimizing problem, and can be expressed as

\[ \delta^2 \bar{J} = \delta y_f^T \, \phi_{yy} \, \delta y_f + \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} \mathcal{H}_{yy} & \mathcal{H}_{yu} \\ \mathcal{H}_{uy} & \mathcal{H}_{uu} \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt. \tag{1.22} \]

Footnote 2: See [8] for mathematical expressions of the italicized concepts.

To simplify, consider the nonlinear example of section 1.1.2 in which $W_T = (\partial^2 c / \partial y^2) = (\partial^2 c / \partial y \partial u) = (\partial^2 c / \partial u \partial y) = (\partial^2 c / \partial u^2) = 0$. Then we have

\[ \delta^2 \bar{J}_{NL} = \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} W_R - 12KY^2\lambda_i & 0 \\ 0 & W_U \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt \tag{1.23} \]

for all points $(u_i, y_i, \lambda_i) \in U \times \mathbb{R}^N \times \mathbb{R}^N$. For the convergence and efficiency of numerical methods, it is very desirable for this operator to be SPD. We see that for this to be the case, we would require $W_R > 12KY^2\lambda_i$, which is not necessarily true. This must be considered when choosing iteration points $(u_i, y_i, \lambda_i)$.

1.2.3 An LQP Example

Consider the Linear Quadratic Program (LQP) with no inequality constraints. This problem produces the following necessary conditions:

\[ \dot{y}^* = Ay^* + Bu^*, \quad y^*(0) = y_0, \tag{1.24} \]
\[ -\dot{\lambda}^* = A^T \lambda^* + W_R(y^* - y_R), \quad \lambda^*(t_f) = W_T(y^*(t_f) - y_T), \tag{1.25} \]
\[ 0 = W_U u^* + B^T \lambda^*. \tag{1.26} \]

From the above, it is evident that this problem offers several advantages in developing a numerical solution algorithm. As a result, we use the LQP as a starting point for the development of the method proposed in this work. For now, we note that the Hessian for the regulator version of this problem,

\[ \delta^2 \bar{J} = \int_{t_0}^{t_f} \begin{bmatrix} \delta y^T & \delta u^T \end{bmatrix} \begin{bmatrix} W_R & 0 \\ 0 & W_U \end{bmatrix} \begin{bmatrix} \delta y \\ \delta u \end{bmatrix} dt, \tag{1.27} \]

is positive regardless of the value of the iterate $(u_i, y_i, \lambda_i)$.
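The contrast between the Hessians (1.23) and (1.27) can be made concrete with a small scalar sketch (N = M = 1). The Python below is only an illustration under that scalar assumption: it evaluates the two diagonal blocks of (1.23) at a given iterate, and the control implied by the stationarity condition (1.26). The function names are hypothetical and do not correspond to any routine in this work.

```python
def control_from_adjoint(b, w_U, lam):
    """Scalar form of the stationarity condition (1.26):
    0 = w_U*u + b*lam, hence u = -b*lam / w_U."""
    return -b * lam / w_U


def hessian_blocks(w_R, w_U, k, y, lam):
    """Diagonal blocks of the scalar analogue of (1.23) at an iterate (y, lam).
    With k = 0 this reduces to the LQP Hessian blocks of (1.27)."""
    return (w_R - 12.0 * k * y**2 * lam, w_U)


def is_positive_definite(blocks):
    """A diagonal operator is positive definite when every block is positive."""
    return all(block > 0.0 for block in blocks)


# LQP case (k = 0): positive for any iterate, as noted for (1.27).
lqp_ok = is_positive_definite(hessian_blocks(1.0, 0.1, 0.0, y=5.0, lam=-3.0))

# Nonlinear case: the state block is w_R - 12*k*y^2*lam = 1 - 12 = -11,
# so positivity is lost at this particular iterate.
nl_ok = is_positive_definite(hessian_blocks(1.0, 0.1, 1.0, y=1.0, lam=1.0))
```

In this sketch `lqp_ok` is true and `nl_ok` is false, which mirrors the remark that $W_R > 12KY^2\lambda_i$ must be checked at the iteration points of the nonlinear problem.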
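1.3 Dual Problem Statement

The following is a presentation of the essence of the method that will be developed in detail (and for more general problems) in subsequent chapters.

1.3.1 Fenchel Duality in Optimal Control

Primal and Dual Forms

Fenchel duality offers a simple and elegant way of attaining the dual of (1.5,1.6). Henceforth we assume that the cost J and the function f(y, u) are separable. That is,

\[ f(y, u) = f_1(y) + f_2(u), \quad J(y, u) = J_1(y) - J_2(u), \tag{1.28} \]

where, for the Bolza problem

\[ J_1(y) = \phi(y_f) + \int_{t_0}^{t_f} L[y(t)] \, dt, \tag{1.29} \]
\[ J_2(u) = -\int_{t_0}^{t_f} L[u(t)] \, dt. \tag{1.30} \]

Here, we make the assumption that $J_1$ is convex, $J_2$ is concave, and $f_1(y)$ and $f_2(u)$ are convex (as in the nonlinear example of section 1.1.2). For the moment, we assume that $c_j = 0$ for all j and rewrite the original statement of the problem in the exactly equivalent form

\[ \text{minimize } J_1(y) - J_2(u) \quad \text{subject to } (y, u) \in Y \cap U, \tag{1.31} \]

where $Y = \{ y \in [t_0, t_f] \times \mathbb{R}^N \mid \dot{y} = f(y, u) \ \forall t \in [t_0, t_f], \ y(0) = y_0 \}$. Assuming this problem has a feasible solution, we can easily convert the above form to

\[ \begin{aligned} \text{minimize } \ & J_1(y) - J_2(u) \\ \text{subject to } \ & \dot{y} = f(y, u), \quad y(t_0) = y_0, \quad \forall t \in [t_0, t_f], \\ & y \in [t_0, t_f] \times \mathbb{R}^N, \quad u \in U. \end{aligned} \tag{1.32} \]

Dualizing the first constraint of (1.32), we can write the dual function as

\[ \begin{aligned} I(\lambda) &= \inf_{y \in [t_0, t_f] \times \mathbb{R}^N, \ u \in U} \left\{ J_1(y) - J_2(u) + \int_{t_0}^{t_f} \langle \lambda, f(y, u) - \dot{y} \rangle \, dt \right\} \\ &= \inf_{u \in U} \left\{ \int_{t_0}^{t_f} \langle \lambda, f_2(u) \rangle \, dt - J_2(u) \right\} + \inf_{y \in [t_0, t_f] \times \mathbb{R}^N} \left\{ J_1(y) - \int_{t_0}^{t_f} \langle \lambda, -f_1(y) + \dot{y} \rangle \, dt \right\} \\ &= I_2(\lambda) - I_1(\lambda), \end{aligned} \tag{1.33} \]

where, applying Green's formula to $I_1$,

\[ \begin{aligned} I_1(\lambda) &= \sup_{y \in [t_0, t_f] \times \mathbb{R}^N} \left\{ \int_{t_0}^{t_f} \langle \lambda, \dot{y} - f_1(y) \rangle \, dt - J_1(y) \right\} \\ &= \sup_{y \in [t_0, t_f] \times \mathbb{R}^N} \left\{ \int_{t_0}^{t_f} \left[ -\langle \dot{\lambda}, y \rangle - \langle \lambda, f_1(y) \rangle \right] dt - J_1(y) + \langle \lambda_f, y_f \rangle - \langle \lambda_0, y_0 \rangle \right\} \end{aligned} \tag{1.34} \]

and

\[ I_2(\lambda) = \inf_{u \in U} \left\{ \int_{t_0}^{t_f} \langle \lambda, f_2(u) \rangle \, dt - J_2(u) \right\} \tag{1.35} \]

are the so-called conjugate convex and conjugate concave functionals, respectively [3]. As a result of the above operations, we can state the dual problem in a very simple, compact form which mirrors (1.31)

\[ \text{maximize } I_2(\lambda) - I_1(\lambda) \quad \text{subject to } \lambda \in \Lambda_1 \cap \Lambda_2, \tag{1.36} \]

where $\Lambda_1 = \{ \lambda \in [t_0, t_f] \times \mathbb{R}^N \mid I_1(\lambda) < \infty \}$ and $\Lambda_2 = \{ \lambda \in [t_0, t_f] \times \mathbb{R}^N \mid I_2(\lambda) > -\infty \}$.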
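It should be noted that now $I_1(\lambda)$ is a convex function over $\Lambda_1$ and $I_2(\lambda)$ is a concave function over $\Lambda_2$. Furthermore, the bracketed term of $I_1(\lambda)$ is concave in $\{[t_0, t_f] \times \mathbb{R}^N\}$, which leads to a unique supremum. Similarly, the bracketed term in $I_2(\lambda)$ is convex in U, which leads to a unique infimum.

Terminal LQP Example - Conjugate Functions

We can now further develop the conjugate functionals for the terminal ($W_R = 0$) LQP Bolza problem. We obtain an expression for $I_1(\lambda)$ from (1.34)

\[ I_1(\lambda) = \sup_{y \in [t_0, t_f] \times \mathbb{R}^N} \left\{ \int_{t_0}^{t_f} -(\dot{\lambda} + A^T \lambda)^T y \, dt - \tfrac{1}{2}(y_f - y_T)^T W_T (y_f - y_T) + \lambda_f^T y_f - \lambda_0^T y_0 \right\}. \tag{1.37} \]

For $I_1(\lambda) < \infty$, the above equation requires that $\lambda \in \Lambda_1 = \{ \lambda \in [t_0, t_f] \times \mathbb{R}^N \mid \dot{\lambda} + A^T \lambda = 0 \}$. Then, from stationarity in y, we get the supremum

\[ I_1(\lambda) = \tfrac{1}{2} \lambda_f^T W_T^{-1} \lambda_f + \lambda_f^T y_T - \lambda_0^T y_0, \quad \lambda \in \Lambda_1. \tag{1.38} \]

Similarly, for $I_2(\lambda)$,

\[ I_2(\lambda) = \inf_{u \in U} \int_{t_0}^{t_f} \left\{ \lambda^T B u + \tfrac{1}{2} u^T W_U u \right\} dt = -\tfrac{1}{2} \int_{t_0}^{t_f} \lambda^T Q \lambda \, dt, \quad \lambda \in \Lambda_2 = [t_0, t_f] \times \mathbb{R}^N, \tag{1.39} \]

where $Q = B W_U^{-1} B^T$.

Dual Problem - $\lambda$ Formulation

Having determined $I_1(\lambda)$ and $I_2(\lambda)$, the dual problem (1.36) can now be stated as

\[ \begin{aligned} \text{maximize } \ & I(\lambda) = -\tfrac{1}{2} \int_{t_0}^{t_f} \lambda^T Q \lambda \, dt - \tfrac{1}{2} \lambda_f^T W_T^{-1} \lambda_f - \lambda_f^T y_T + \lambda_0^T y_0 \\ \text{subject to } \ & \lambda \in \Lambda = \Lambda_1 \cap \Lambda_2 = \{ \lambda \in [t_0, t_f] \times \mathbb{R}^N \mid \dot{\lambda} + A^T \lambda = 0 \}. \end{aligned} \tag{1.40} \]

We note here that the sets $[t_0, t_f] \times \mathbb{R}^N$ and U are convex, and that $J_1(y)$ is convex over $[t_0, t_f] \times \mathbb{R}^N$ and $J_2(u)$ is concave over U. In addition, we can state that the functionals $J_1(y, u)$ and $J_2(y, u)$ are convex and concave over all $[t_0, t_f] \times \mathbb{R}^{N+M}$, respectively. Thus, we can state [3] that there is no duality gap and we have

\[ \inf_{(y, u) \in \{[t_0, t_f] \times \mathbb{R}^N\} \times U} \{ J_1(y) - J_2(u) \} = \max_{\lambda \in \Lambda} \{ I_2(\lambda) - I_1(\lambda) \}, \tag{1.41} \]

or simply: $I^* = J^*$.

1.3.2 Reformulation of the Problem

The $\mathcal{R}$ and $\mathcal{G}$ operators

We note above that the space of allowable dual variables $\Lambda$ is determined once a value $\lambda_f$ is given. There is an obvious relationship between this space and the Maximum principle (1.16). Therefore, in order to have a similar form to this condition, we introduce a variable q such that given q,

\[ -\dot{\lambda} = A^T \lambda, \quad \forall t \in (t_0, t_f), \qquad \lambda(t_f) = W_T(q - y_T). \tag{1.42} \]

Given such $\lambda$, we can perform the following operations:

\[ \dot{y} = Ay - Q\lambda, \quad \forall t \in (t_0, t_f), \qquad y(0) = y_0. \tag{1.43} \]

Note that y is primal feasible, but since it is not true in general that $\lambda(t_f) = W_T(y(t_f) - y_T)$, it is not necessarily optimal. Separating the above operations into respective inhomogeneous and homogeneous parts,

\[ -\dot{\lambda}_I = A^T \lambda_I, \quad \lambda_I(t_f) = -W_T y_T, \tag{1.44} \]
\[ \dot{y}_I = A y_I - Q \lambda_I, \quad y_I(t_0) = y_0, \tag{1.45} \]
\[ -\dot{\lambda}_H = A^T \lambda_H, \quad \lambda_H(t_f) = W_T q, \tag{1.46} \]
\[ \dot{y}_H = A y_H - Q \lambda_H, \quad y_H(t_0) = 0. \tag{1.47} \]

We define the operator $\mathcal{R}$ as the following: given $q \in \mathbb{R}^N$, $\mathcal{R}q = y_H(t_f)$. For later ease of notation, we also define $\mathcal{G}q = q - \mathcal{R}q$.

It will be useful to perform the following operations to equations (1.45) and (1.46):

\[ \begin{aligned} \lambda_H^T \dot{y}_I &= \lambda_H^T A y_I - \lambda_H^T Q \lambda_I, \\ -\dot{\lambda}_H^T y_I &= \lambda_H^T A y_I, \\ d(\lambda_H^T y_I)/dt &= -\lambda_H^T Q \lambda_I. \end{aligned} \tag{1.48} \]

Integrating over time and applying Green's formula and final/initial conditions, we get

\[ -\int_{t_0}^{t_f} \lambda_H^T Q \lambda_I \, dt = y_I(t_f)^T W_T q - \lambda_H(t_0)^T y_0. \tag{1.49} \]

Similarly, using the remaining equations,

\[ -\int_{t_0}^{t_f} \lambda_I^T Q \lambda_I \, dt = -y_T^T W_T y_I(t_f) - \lambda_I(t_0)^T y_0, \tag{1.50} \]
\[ -\int_{t_0}^{t_f} \lambda_H^T Q \lambda_H \, dt = q^T W_T \mathcal{R} q. \tag{1.51} \]

The $((\cdot,\cdot))$ inner-product

Given $v \in \mathbb{R}^N$ and $w \in \mathbb{R}^N$, we define the following inner product:

\[ ((v, w)) = v^T W_T w. \tag{1.52} \]

From (1.51), we can conclude that the operator $\mathcal{R}$ is symmetric, negative semi-definite in the above inner-product space. As a result, $\mathcal{G}$ is SPD in this space.

Dual Problem - q Formulation

Here we separate the dual function into homogeneous and inhomogeneous parts and apply final and initial conditions:

\[ \begin{aligned} I(\lambda) &= -\tfrac{1}{2} \int_{t_0}^{t_f} \lambda^T Q \lambda \, dt - \tfrac{1}{2} \lambda_{Hf}^T W_T^{-1} \lambda_{Hf} - \lambda_{Hf}^T W_T^{-1} \lambda_{If} - \tfrac{1}{2} \lambda_{If}^T W_T^{-1} \lambda_{If} - (\lambda_{Hf} + \lambda_{If})^T y_T + (\lambda_{H0} + \lambda_{I0})^T y_0 \\ &= -\tfrac{1}{2} q^T W_T q + q^T W_T y_T - \tfrac{1}{2} y_T^T W_T y_T - q^T W_T y_T + y_T^T W_T y_T + \lambda_{H0}^T y_0 + \lambda_{I0}^T y_0 \\ &\quad + \tfrac{1}{2} q^T W_T \mathcal{R} q + y_{If}^T W_T q - \lambda_{H0}^T y_0 - \tfrac{1}{2} y_T^T W_T y_{If} - \tfrac{1}{2} \lambda_{I0}^T y_0 \\ &= -\tfrac{1}{2} q^T W_T \mathcal{G} q + y_{If}^T W_T q + \tfrac{1}{2} y_T^T W_T (y_T - y_I(t_f)) + \tfrac{1}{2} \lambda_I(t_0)^T y_0. \end{aligned} \tag{1.53} \]

Note that $y_I(t_f)$ and $\lambda_I(t_0)$ are independent of q. Thus, the last two terms can be calculated from (1.44) and (1.45) independently of the optimization problem.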
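The action of the operators $\mathcal{R}$ and $\mathcal{G}$ defined above can be sketched for a scalar terminal LQP (N = M = 1, $t_0 = 0$). The Python below is a minimal sketch under those assumptions: it uses the closed-form solution of the homogeneous adjoint equation (1.46) and a midpoint (second-order Runge-Kutta) march of the homogeneous state equation (1.47), which is a stand-in integrator chosen here for brevity and not the time discretization used in this work. The names `apply_R` and `apply_G` are hypothetical.

```python
import math


def apply_R(q, a, b, w_T, w_U, t_f, steps=2000):
    """Approximate R q = y_H(t_f) for the scalar terminal LQP with t_0 = 0."""
    dt = t_f / steps
    Q = b * b / w_U  # scalar analogue of Q = B W_U^{-1} B^T

    def lam_H(t):
        # Homogeneous adjoint (1.46): -d(lam)/dt = a*lam, lam(t_f) = w_T*q.
        return w_T * q * math.exp(a * (t_f - t))

    # Homogeneous state (1.47): dy/dt = a*y - Q*lam_H, y(0) = 0.
    y = 0.0
    for n in range(steps):
        t = n * dt
        k1 = a * y - Q * lam_H(t)
        y_mid = y + 0.5 * dt * k1
        k2 = a * y_mid - Q * lam_H(t + 0.5 * dt)
        y += dt * k2
    return y


def apply_G(q, **params):
    """G q = q - R q."""
    return q - apply_R(q, **params)


# Illustrative data only.
params = dict(a=-1.0, b=1.0, w_T=2.0, w_U=0.5, t_f=1.0)
q = 1.0
Rq = apply_R(q, **params)
Gq = apply_G(q, **params)
inner_q_Rq = q * params["w_T"] * Rq  # ((q, R q)) from (1.52); not positive
inner_q_Gq = q * params["w_T"] * Gq  # ((q, G q)) from (1.52); positive
```

For this scalar case the exact value is $\mathcal{R}q = -Q w_T q \,(e^{2 a t_f} - 1)/(2a)$, which is negative for $q > 0$ and consistent with the sign in (1.51); the numerical value returned by `apply_R` should agree with it up to the integration error.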
Defining

\[ C_I(y_0,y_T) = \tfrac{1}{2} y_T^T W_T (y_T - y_I(t_f)) + \tfrac{1}{2}\lambda_I(t_0)^T y_0, \tag{1.54} \]

and using the inner-product notation, with $y_{If} = y_I(t_f)$, we can state the dual problem simply as

\[ I(\lambda^*) = I(q^*) = \max_{q\in\mathbb{R}^N} \big\{ -\tfrac{1}{2}((q,\mathcal{G}q)) + ((q,y_{If})) \big\} + C_I(y_0,y_T). \tag{1.55} \]

The strength of the above formulation is the simplicity of the dual functional. There are several numerical advantages that result from this formulation:

(i) Since $q$ belongs to all of $\mathbb{R}^N$, the problem is an unconstrained maximization;
(ii) Since $\mathcal{G}$ is SPD, the functional in brackets is concave, thus immediately proving uniqueness;
(iii) Since $\mathcal{G}$ is SPD, efficient numerical methods can be used to solve the problem, such as the conjugate gradient method;
(iv) Values of $q$ which are not optimal result in dual feasible solutions which can be used to obtain lower bounds for $I^*$;
(v) When solving for $\mathcal{G}q$, the problem offers $y_H(q,t)$ as a by-product of the computation. This can be used in conjunction with the inhomogeneous part (solved for only once) to obtain a primal feasible solution $y(t) = y_I(t) + y_H(t)$. This value can then be used to calculate an upper bound on the cost functional $J^*$ (note that $u(t) = -W_u^{-1}B^T\lambda(t)$);
(vi) Since there exists no duality gap in problem (1.41), the primal and dual variables can be used to calculate bounds on the true cost:

\[ I(\lambda(q)) \le I^* = J^* \le J(y(q)). \tag{1.56} \]

We define the "bound gap" as $\Delta C = J(y(q)) - I(\lambda(q))$.

Remark 1 There is a simpler way of formulating the problem of finding $q^*$. We note that equations (1.42,1.43) satisfy the minimum principle (1.24)-(1.26) if $q = q^* = y^*(t_f)$. Since $\mathcal{R}q = y_H(t_f)$, we have

\[ q^* = y^*(t_f) = y_I(t_f) + y_H(t_f) = y_I(t_f) + \mathcal{R}q^*, \]

or simply

\[ \mathcal{G}q^* = y_I(t_f). \tag{1.57} \]

The solution $q^*$ of equation (1.57) solves problem (1.55) without requiring the steps of finding the dual as a function of $q$.
However, this simplification of the formulation has several theoretical and practical drawbacks:

(i) Without the dual formulation, the origin of this equation is unclear;
(ii) There is no indication of the origin of the inner-product $((\cdot,\cdot))$, which is essential to show the SPD property of operator $\mathcal{G}$;
(iii) Without knowledge of $((\cdot,\cdot))$, trying to solve (1.57) by minimizing a quadratic function in Euclidean space will not guarantee SPD $\mathcal{G}$, thus hindering the numerical methods used;
(iv) Equation (1.57) gives no indication of $C_I(y_0,y_T)$, so that even if it is acknowledged that it is a version of the dual problem, it cannot be used to determine a lower bound for the cost $J^*$. $\square$

Chapter 2

Existing Numerical Methods and Literature Review

2.1 Introduction

Optimal control problems are typically large and difficult to solve due to the complex relationships in the stationary conditions. Numerical methods must deal with the fact that the number of operations required to solve a problem increases much faster than the problem's dimension. Efficient methods are those which exploit certain aspects of a problem to reduce the amount of computational time and storage. A variety of methods exist, since each has been developed to take advantage of aspects of particular problems. However, successful methods often allow for modifications that expand their applicability.

Here we present several of the most popular methods for solving optimal control problems. This survey will be useful, in later sections, for comparisons between our approach and currently used methods. All of the methods below are discussed in relation to the Bolza problem of section 1.1.1. Thus, it can be said that the purpose of each method is to solve for a control value $u^*$ (and consequently $\lambda^*$ and $y^*$) which satisfies the minimum principle (1.15)-(1.17).
With smoothness assumptions on $\mathcal{H}$, we restate these here as follows:

\[ \dot y^* = f(y^*,u^*), \qquad y^*(t_0) = y_0 \text{ given}; \tag{2.1} \]
\[ -\dot\lambda^* = \frac{\partial\mathcal{H}}{\partial y}(y^*,u^*,\lambda^*), \qquad \lambda^*(t_f) = \frac{\partial\phi[y^*(t_f)]}{\partial y}; \tag{2.2} \]
\[ 0 = \frac{\partial\mathcal{H}}{\partial u}(y^*,u^*,\lambda^*). \tag{2.3} \]

Problems of this type are usually referred to as two-point boundary value problems [27]. The main difficulty that arises in their solution is that differential equations must be solved in such a way that both initial and final conditions are satisfied. This means that a simple forward or backward time integration of equations (2.1) or (2.2) cannot be done before the appropriate relation between them ($u^*$) is known. The methods presented here are iterative in the sense that particular variables are updated as the resulting cost $J[(y,u)(\lambda)]$ approaches the minimum $J^*$.

Historically, there have been three approaches to solving for the above stationary conditions [24]:

(i) solution of the boundary value problem presented by equations (2.1) and (2.2) with a local Hamiltonian optimization, equation (2.3), at each time step;
(ii) solution of a completely discretized problem, so that it resembles a finite-dimensional nonlinear program;
(iii) solution of a finite parameterization of the control history, where the state and adjoint variables are evaluated by integration of (2.1) and (2.2), and the control variable is adjusted from sensitivity equations.

Shooting methods and Sequential Quadratic Programming were used to solve the problem through approach (i). These methods are relatively straightforward in the implementation stage, but present some practical problems. Riccati equations and Newton-Raphson methods were employed to use approach (ii). Though technically very accurate and fast, storage can easily become an issue with these methods. Approach (iii) was employed by the use of parametric optimization, dynamic programming, and gradient methods. Some of these methods have met with considerable success due to their flexibility, robustness, and ease of implementation.
2.2 Parametric Optimization

Control Parameterization

Among the first optimal control solution methods developed, parametric optimization [27] has fallen out of favor for large, complex problems. It is included here for historical completeness. These methods are approximate since they will not necessarily converge to exact optimal values regardless of the number of iterations.

In essence, the method restricts the allowable control histories to a small subspace of $U$, thus considerably simplifying the problem. In particular, we make $u$ a simple function of time. By restricting $u$ to be dependent on a few parameters, the dimension of the problem can decrease significantly. Here we take the simplest possible example:

\[ u(k,t) = k. \tag{2.4} \]

This control history is constant in time, and so we have a single vector $k\in\mathbb{R}^M$ as a "parameter" to be used in the optimization of $J[u(k)]$. Considering a terminal LQP problem from section 1.1.2, we have for the state equations

\[ \dot y(t) = Ay(t) + Bk, \qquad y(0) = y_0. \]

Such a simple system can be solved analytically when $A$ is square and nonsingular:

\[ y(t_f) = e^{At_f}(y_0 + A^{-1}Bk) - A^{-1}Bk, \]

where

\[ e^{At_f} = \sum_{j=0}^{\infty} \frac{(At_f)^j}{j!} \]

is the matrix exponential of $At_f$. The cost then can readily be expressed as

\[ J(k) = \tfrac{1}{2}\big(e^{At_f}(y_0 + A^{-1}Bk) - A^{-1}Bk - y_T\big)^T W_T \big(e^{At_f}(y_0 + A^{-1}Bk) - A^{-1}Bk - y_T\big) + \tfrac{1}{2}\int_{t_0}^{t_f} k^T W_u k\,dt, \]

which is only a function of $k$. Setting $\partial J/\partial k = 0$ and noting that $y(t_f) = e^{At_f}y_0 + (e^{At_f} - I)A^{-1}Bk$, we get

\[ k^* = -\Big[\big((e^{At_f} - I)A^{-1}B\big)^T W_T (e^{At_f} - I)A^{-1}B + (t_f - t_0)W_u\Big]^{-1} \big((e^{At_f} - I)A^{-1}B\big)^T W_T \big(e^{At_f}y_0 - y_T\big), \tag{2.5} \]

which is the unique solution, as can easily be confirmed by noting that $\partial^2 J/\partial k^2 > 0$. This results in $u(t) = k^*$, which is only a constant-in-time approximation of the actual control history $u^*(t)$. It happens to be the best possible approximation for constant functions, but it is nonetheless unacceptable for practical applications. Rather than using the simple form of (2.4), it may therefore seem more appropriate to use higher order terms in the approximation of the control history.
Alternative candidate functions may include ramp functions, u(k, t) = k1 + k 2 t, truncated power series, u(k, t) = k1 + k 2 t + k3 t 3 + -+ k t, truncated Fourier series, in u(k, t) [k1i sin tf - to + k 2i cos ist' tf - to , or other orthogonal functions. One possibility is to increase n for the truncated functions until a satisfactory cost is obtained. However, it must be stressed that the resulting cost J(k*) need not be arbitrarily close to J*, even for large n. Furthermore, we do not know a priori which functions give best results for a given problem. This is the first critical problem of the method. With the exception of a handful of problems, the method is not exact and requires too much intuition to be considered a robust numerical method. Even for the unrealistically simple example above, expression (2.5) hints at another problem with this method. Although the idea is very simple and elegant before implementation, the practical solution becomes very complicated as the problem grows in complexity or as better approximations are used. For cases which have no analytical solutions we must resort to numerical calculations of J and the gradient &J/&k, which requires integration of (2.1). As n increases, the overall calculations can be very expensive. For good approximations of u, this calculation approaches the complexity of more accurate (exact) methods, with the severe disadvantage of being non-exact. Therefore, this method, although interesting for simple problems, is not used today for large, complex problems. 30 Penalty Methods It may seem inappropriate that the above method requires integration of (2.1) to be done done exactly when u is only a (probably) crude approximation of u*. If we also approximate y we need not perform such an operation. Consider the augmented cost 1 JE(y, u) = tf - E Jt0 fl y - f (y, u)JI dt, with c > 0, and suppose we define y(t) y(kyt) (2.6) u(t) u(k, t). 
(2.7) and Now both the state and control variables are functions of time and parameters ky and ku. Since y is defined as a known function of time, the expensive integration of (2.1). y can be calculated (possibly analytically) without requiring If J is SPD, then so is J, and we are faced with the uncon- strained minimizations of JE(ky, ku). This problem may be solved by, for example, Newton-Raphson iterations on ky and ku with incremental increase in c. Though this method does not require solution of ODEs, it is still plagued by potentially difficult evaluations of complicated functions. It is also non-exact, and J* need not approach J* even for large c if (2.1) and U are not sufficiently represented by (2.6) and (2.7). Again, there is no a priori guarantee that we may find the necessary functions. 2.3 Riccati Equations The remaining methods presented in this section are exact. These have gained favor in the modern solution of optimal control problems due to the fact that, regardless of other difficulties, they have the potential of producing exact answers to the posed problems (or at least arbitrarily close approximations). Riccati equations were originally developed in order to theoretically deal with calculus of variations problems [27]. They are easily derived from LQP problems of section 1.2.3. Consider the 31 example whose necessary conditions are expressed as = Ay* - QA* = ATA* + WRy*, y* (0) = yo, (2.8) A*(tf) = WTY*. (2.9) Here, for simplicity, we have YT = yR(t) = 0. The fact that we are dealing with and LQP means that these are also sufficient conditions for optimality. Since y and A are adjoint we can state the following linear relationship: A*(t) = S(t)y*(t), where S(t) is a time-dependent Riccati matrix. Naturally, we have i*(t) = S(t)y*(t) + S(t)p*(t), and, incorporating the above into (2.8,2.9), -S(t)y*(t) - S(t)Ay*(t) + S(t)QS(t)y*(t) = ATS(t)y*(t) + WRY*(t), S(tf)y(tf) = WTy(tf). 
Canceling $y^*(t)$, we obtain

\[ \dot S(t) = -S(t)A - A^T S(t) + S(t)QS(t) - W_R, \qquad S(t_f) = W_T. \tag{2.10} \]

Once $S(t)$ has been determined from (2.10), the state and control variables can be obtained:

\[ \dot y^*(t) = (A - QS(t))\,y^*(t), \qquad y^*(0) = y_0, \tag{2.11} \]
\[ u^*(t) = -W_u^{-1}B^T S(t)\,y^*(t) = -C(t)\,y^*(t). \tag{2.12} \]

Equation (2.10) is referred to as the Riccati matrix equation for this LQP. It is a non-linear matrix ODE which could be solved by backward integration.

It is possible to implement the method above relying solely on linear algebra operations. Though this might seem advantageous at first, an inspection of the numerical complexity of equation (2.10) reveals its weakness. As stated previously, optimal control problems tend to be large. If $y(t)$ has $N$ elements (for each time step), then $S(t)$ is an $N\times N$ dense matrix. For certain problems to which positivity assumptions apply, the dimension of the solution procedure can be reduced, thus allowing for more efficient methods [26]. It is also a fact that good numerical time discretizations rely on a large number $L$ of time steps. Since storage for all of $S(t)$ is $O(N^2L)$, this solution method quickly becomes prohibitive as the size of the problem increases. Compared to modern gradient methods (see below) which require $O(NL)$, Riccati equations suffer from a significant disadvantage. A moderate problem, for example, may require $N = O(10^3)$ and $L = O(10^2)$, demanding 1000 times more storage for Riccati equations than for gradient methods, or a total of $O(10^8)$ versus $O(10^5)$, respectively.

Other than for some relatively small problems [17], Riccati equations today are mainly used as a theoretical tool. In particular, since equation (2.12) provides an explicit expression of a closed-loop control law, Riccati equations are typically the point of departure for the study of stability properties of feedback optimal regulator systems [25, 26, 1].

2.4 Dynamic Programming

It is possible to express the minimum principle in an alternative form.
Suppose that rather than considering a cost function (1.3), we consider the value function f V(ti) = 0[y*(tf)] + L[y*(t), u*(t)] dt iti = #[y*(tf)] - =min Itf #[y*(tf)] L[y*(t), u*(t)] dt - (2.13) 'L[y*(t),u(t)]dt} Clearly, minimizing V(to) subject to (1.2) is equivalent to the original problem. After some manipulation, the optimality condition takes the form at [y* (0)] = - min V[y*(tf)] y* (t), UM), U = y [y* (0)]1 (2.15) #[y*(t)], where W = L[y*(t),u(t)] + (2.14) K a' [y*(t)], f [Y*(t),u(t)]. 33 The partial differential equation (2.14) is known as the Hamiltonian-Jacobi-Bellman(HJB) equation [27]. Dynamic programming techniques in optimal control use this equation rather than (2.1)-(2.3). Backward integration of equation (2.14) may be carried out with V[y*(t)] for arbitrary time, t E [to, tf]. Further integration to t' y* = f (y*, u*) to yield C [to, t], has no effect on values of y*(t) to y*(tf); a fact known as Bellman's Principle of Optimality. Any point on the terminal hypersurface defined by (2.15) results in a unique initial state-time solution (y (to), to). If a portion of this hypersurface is defined near the expected value y* (tf), values of (y(to), to) can be iteratively calculated until one obtains (y*(to), to). Though it may be realistic to map the hyperspace created by HJB equations for problems with two or three variables, storage becomes prohibitive for large problems [8]. Thus an alternative is to choose a few starting points near the expected final solution and interpolate values of y(tf) so that y(to) = yo is achieved. In any case, dynamic programming algorithms typically involve some type of mapping the related hyperspace, which is always storage-intensive. However, the HJB equations serve an important modern theoretical purpose. If one defines Dv ay AT (t) = a (t), one immediately sees the relationship between N and 7. 
Alternatively, if we retain the notation and consider the problem posed by (2.8,2.9), we may rewrite HJB as V* 09t = V[y*(tf)] 1 (V* 1TT Ay + -y Way -- oBy 2 Oy 2 0V* Q(aV* 09y 7 (2.16) (2.17) yT(tf)WTy(tf). This first order, nonlinear partial differential equation has a product solution of the form V* = -YTS(t)y, 2 which, upon substituting into (2.16,2.17) gives 0 = yT ($ + SA + ATS - SQS +Wy. Since the above must hold for all y, it implies the Riccati equation of section 2.3. Though solution of (2.14,2.15) is typically more difficult than that of the original stationary 34 conditions, exceptions exist for the solution of LQP controllers. In addition, dynamic programming techniques can be effectively used in formulating stochastic optimal control of nonlinear systems [27], and are preferred for the synthesis of feedback problems [16]. 2.5 Shooting Methods Among the first used in optimal control [7, 8], shooting methods aim at target values from initial guesses. For example, one may guess an initial value of the adjoint variable A(to), and determine u(to) from equation (2.3). In fact, it is possible to integrate the entire system (2.1)-(2.3) in this fashion as long as all the integration is done forward in time (including equation (2.2)). Of course, it is not necessarily true that the final condition A(tf) = aq(tf)/Oy will be satisfied, since A(to) may not be optimal. By changing the initial guess for the adjoint variable, iterations can be carried out until the final conditions is satisfied. There are immediate problems with this technique, however. We have implicitly assumed that equation (2.1) is stable in forward integration. Such an assumption usually means that equation (2.2) is not. Therefore, forward integration of (2.2) will probably result in unworkable values of the adjoint variable history for a large portion of possible A(to). The same approach can be used by guessing y and A at the final time and integrating backwards in time. 
This alternative would still suffer from the same problem as above, since stable state equations tend to be unstable when integrated backward in time. In essence, the difficulty in these methods involves getting started [8]: finding an admissible value A(to) that will produce well-conditioned transition matrices. Ill-conditioned transition matrices cannot be inverted in the crucial step of improving the guess, greatly compromising the accuracy of numerical methods used. Therefore, these methods are best used for finding neighboring solutions [8, 27] to an optimal solution that is usually calculated via more stable methods. Though some alternative formulations have been proposed (such as quasilinearization [20]), multiple shooting techniques have been the most successful variants of this numerical technique [24]. These are described below in section 2.7.2. 35 2.6 Newton-Raphson Widely used in many types of problems, Newton-Raphson methods have also been successfully used to solve optimal control problems. The quadratic convergence property of this method makes it very attractive for fast-converging solution algorithms; thus they are often used today in the optimal control context. These methods are based on a type of linearization of the state-adjoint equations. As we have observed before, by way of equation (2.3), it is possible to write u(t) as a function A(t). So that, in general form, the stationary conditions can be written as = f(y*, A*) y*(to) yo, (2.18) = Hy (y*, A*), A*(tf) =Y[y*(tf)]. (2.19) = Guessing non-optimal values of yi(t) and Ai(t) will not, in general, satisfy the above equations. However, if these values are close enough to y*(t) and A*(t), the following approximation can be made: Ay~ fy (yi, Aj)Ay + fA(yi, Aj)AA + (t), -AA ~ yy (yi, Ai)Ay + WyA (yi, Aj)AA + I(t), where 5(t) = (f(yi, Aj) - yi) Aj) - and 6(t) = (-Ry(yi, Ay(to) = Yo - y2(to), (2.20) AA(tf) = #y[y(tf)] - Ai(tf), (2.21) i 2). 
The method can best be described for a given time discretization. Suppose we use an Implicit Euler scheme, with index f, time steps At, and total number of time steps L = tf/At. Then we can write the above equations as RAt Jy} [AAtl Ay -A At+ (2.22) L_A -+ bI bJ -h 0 10 AyO j AA If we create a vector Ax = (Ayl, Ay 2 ,... Yo - yi(to) #,y(yJ§) I L-1 - (f = 0, L). (2.23) A21 AA 2 ... AAL- 1 ) of size 2(NL - 1), and define the appropriate right-hand side i, equation (2.22) can be expressed as JcAX 36 = (2.24) where J, is the Jacobian block matrix E K Je =. N P Matrices E, K, N, and P, are (for Implicit Euler) block diagonal with matrix elements K -f (2.25) R Once the step direction Ax has been solved for via equations (2.23,2.24), we update the current solution = I+ a, Vf = 0,...,L, (2.26) where ai ;> 0 is the step-size. Although the quadratic convergence of the method is attractive, there are severe limitations when solving large problems. Equation (2.24) involves the inversion of the Jacobian matrix, which is of size O(N 2L 2 ). Even for the most simplified problems, this number is prohibitive. Thanks to the diagonal nature of the Jacobian that results in most time discretizations, storage can be achieved with O(NL), but the inversion process will nevertheless be computationally intensive. Principally, however, the method is only guaranteed to achieve quadratic convergence near the optimal solution. Starting from far away guesses will not guarantee convergence, and even if convergence does occur the method does not fundamentally distinguish between maxima and minima. There are cases in which the size of the problem has been reduced, through proper orthogonal decomposition [23] for example, in which these methods have been successfully employed. They also tend to be used for ill-conditioned smaller problems, for which storage may not be a problem but fast convergence is essential. 
The method is, nonetheless, powerful in the sense that it presents a linear approximation to the stationary conditions (this is in fact a quadratic approximation to the cost functional). Therefore, a wide class of non-linear optimal control problems can be approached with this method very efficiently. In fact, one can apply gradient methods in conjunction with the linearization of the problem to calculate step directions near an optimal, non-linear solution (see section 4.4.4). 37 Remark 2 If we consider the problems in which Ly = 0, such as the LQP example, we have hy = ,fy that is, P = -ET. In fact, for the LQP problem, we have = =Q, Al, A, At Nt, =WR, Pe~ -+ AT At (2.27) D Remark 3 Finding ai for equation (2.26) can be done in several, well-established ways. They include: (i) constant step size: ai = constant; (ii) minimization rule: ai = arg mina;>o J(xi + aiAx), by line search techniques; (iii) diminishing step size: ai - (iv) Armijo rule: ai = # f'ks, J(X,) 0 as i -+ oc, but E' ai = oc; E (0, 1), Mk first nonnegative integer m such that - J(X, + /" m sAX) ;> -o-3"SVJ(X,)TAXi, where o- E (0, 1). Typically, good values of 0 and o- are 1/2 and [10-5, 10--1], respectively. Of the methods above, the most efficient tends to be the Armijo rule, since it guarantees sufficient reduction of J when near the solution. L 2.7 Sequential Quadratic Programming (SQP) Sequential Quadratic Programming (SQP) methods are very widely used in modern optimal control, serving as the main competitors of gradient methods. Here we first describe the origin of the method and then the application to optimal control. Though typically the method of choice for small- and medium-sized problems, difficulties arise when larger problems are considered [21]. 38 2.7.1 SQP in Nonlinear Optimization Sequential Quadratic Programming is a class of so-called Lagrange Multiplier Methods, in which Lagrange multiplier estimates are made and used in the solution procedure. 
In this section, we introduce SQP for general, nonlinear programming (NLP) problems.

SQP is based on the idea that one wishes to solve the following problem

\[ \min_{x\in\mathbb{R}^N} f(x) \quad \text{s.t.} \quad g_j(x) \le 0, \quad j = 1,\dots,r, \tag{2.28} \]

by addressing a "penalty" version of the form $f(x) + cP(x)$, where $P(x) = \max_j\{g_j^+(x)\}$, with $g_j^+(x) = \max\{0, g_j(x)\}$, and $c$ is a "sufficiently large" number. Extensions to equality-constrained problems are easily made and given below. It can be shown [3] that if $c > \sum_j \mu_j^*$, where $\mu_j^*$ are Lagrange multipliers corresponding to the constraints, then the strict unconstrained local minimum of $f + cP$ is also the solution of (2.28).

Suppose we define

\[ J(x) = \{j \mid g_j(x) = P(x),\ j = 1,\dots,r\} \]

and

\[ \theta_c(x,d) = \max\{\nabla f(x)^T d + c\,\nabla g_j(x)^T d \mid j\in J(x)\}. \]

Then, for small $\|d\|$, a quadratic approximation of $f + cP$ around $x$ can be written as

\[ f(x) + cP(x) + \theta_c(x,d) + \tfrac{1}{2} d^T H d, \tag{2.29} \]

where $H$ is a positive-definite symmetric matrix (to be addressed later). The descent direction $d$ can be found by solving the problem

\[ \min_{d,\xi}\ \nabla f(x)^T d + \tfrac{1}{2} d^T H d + c\,\xi \quad \text{s.t.} \quad g_j(x) + \nabla g_j(x)^T d \le \xi, \quad \forall j. \tag{2.30} \]

(See [3] for a geometrical illustration of (2.30).) Since it can be shown that $x$ is a stationary point of $f + cP$ if and only if (2.30) has $d = 0$, $\xi = P(x)$ as its optimal solution, an appropriate numerical procedure would be to sequentially approach a stationary point $x^*$ for which this is true. SQP is an iterative descent algorithm that operates on this idea:

\[ x^{k+1} = x^k + \alpha^k d^k, \]

where

\[ d^k = \arg\min_{d,\xi}\ \nabla f(x^k)^T d + \tfrac{1}{2} d^T H^k d + c\,\xi \quad \text{s.t.} \quad g_j(x^k) + \nabla g_j(x^k)^T d \le \xi, \quad \forall j, \tag{2.31} \]

in which we may, for example, use the Armijo rule to determine $\alpha^k$.

Two issues remain to be addressed before implementing the above idea: the selections of $c$ and $H^k$. We know that $c > \sum_j \mu_j^*$ is required to guarantee convergence, but do not know $\mu^*$ a priori. Well-developed methods exist for addressing this problem, such as solving similar problems for $d^k$ with $\xi$ set to zero and using the resulting Lagrange multipliers to estimate $c$ [3].
This approach works well but some care must be taken since $c$ tends to become too large. This results in sharp corners of the penalized cost $f + cP$, which adversely affects the stepsize procedure, and consequently the entire algorithm. As for the matrix $H^k$, a natural choice is the Hessian of the Lagrangian, $\nabla^2_{xx}L$, where $L(x,\mu) = f(x) + \mu^T g(x)$. Though difficulties can still occur here (for example, $\mu_j$ are not generally known, or $\nabla^2_{xx}L$ may not be SPD), there are methods of determining appropriate, SPD $H^k$.

It is simple to incorporate equality constraints in the method. To solve the problem

\[ \min_{x\in\mathbb{R}^N} f(x) \quad \text{s.t.} \quad h_i(x) = 0,\ i = 1,\dots,n, \qquad g_j(x) \le 0,\ j = 1,\dots,r, \tag{2.32} \]

we may simply replace $h_i(x) = 0$ by the two inequalities $h_i(x) \le 0$ and $-h_i(x) \le 0$. Then, the direction finding step becomes

\[ d^k = \arg\min_{d,\xi}\ \nabla f(x^k)^T d + \tfrac{1}{2} d^T H^k d + c\,\xi \quad \text{s.t.} \quad |h_i(x^k) + \nabla h_i(x^k)^T d| \le \xi,\ \forall i, \qquad g_j(x^k) + \nabla g_j(x^k)^T d \le \xi,\ \forall j. \tag{2.33} \]

The crucial point here is that the above system is a quadratic program (QP) with linear constraints. Therefore, a unique solution can be found for each iteration $k$ of the procedure. Given appropriate stepsizes, $d^k$ approaches zero, $\xi^k$ approaches $P(x^*)$, and $x^k$ approaches $x^*$.

2.7.2 SQP in Optimal Control

The presentation of the method above has been carried out in the NLP context. This section applies the method to optimal control problems by converting such problems to NLPs.

Multiple Shooting Methods

SQP is typically performed in the context of multiple shooting methods, which are derived from the simple shooting methods of section 2.5. Consider the LQP problem of section 1.2.3 with the governing equation

\[ \dot y = Ay + Bu, \qquad y(t_0) = y_0. \tag{2.34} \]

In multiple shooting methods, one divides the time domain into $N_t$ subintervals $[t_n, t_{n+1}]$ (for discretized problems, these intervals are larger than the time discretization intervals). For each $t_n$, a guess $y_n$ is given as an initial condition for that time interval. By integrating (2.34) from $t_n$, with $y_n$ and a given $u(t)$, to $t_{n+1}$, one gets $y(t_{n+1})$.
In general, one finds that y(tn+1 ) # yn+1, and the aim becomes to determine y, and u(t) that ensure y(t,+i) = yn+1. Since these time intervals can be made, in principle, arbitrarily small, stability issues may be alleviated. This procedure produces continuity requirements that can be seen as problem constraints: cn(x) = y(tn+1 ) - Yn+1 = 0, where x = [yo,...,yN",u(t)T, for n = 1,... ,Nt, and y(tn+ 1 ) is arrived at by integration of (2.34). (2.35) Now we may restate the problem as min J(x) x s.t. c(x) = 0, where c(x) = [C1 (4)C 2 (X), -- ,CNt (X)I T . 41 (2.36) Optimal Control Problem Having redefined the optimal control problem as (2.36), we are ready to apply the techniques of section 2.7.1. The QP subproblem of interest is written dk = arg minJ(Xk) + VJ(Xk)Td+ 1 dTHkd d S. t. 2 (2.37) c (Xk) + VCX *)d = 0, where Hk is a SPD approximation (a quasi-Newton scheme may be used, for example) of V2 Lk(Xk, Ak), and Lk(x, Ak) J(x) - AkT[c(x) - c(xk) - Vc(xk)(X - xk)] (2.38) is a typically used modified Lagrangian [12]. Defining the terms (c(xk)-Vc(Xk) (X-Xk)) of the above equation as the constraint linearizationcL(x,X k), we may interpret dL(X, Xk) = c(X) - cL(X, Xk) as a departure from linearity [2]. The QP has a unique solution (dk, dk) which can be found by solving the stationary conditions Hkdk - Vc(xk) Vc(xk )dk = T dk -V J(xk), A (2.39) *b - Once the QP subproblem is solved, the variables can be updated: Sk-1 Ak+1 xk +±akdk, (2.40) _k + akdk, where ak is found by appropriate line search techniques based on a variety of merit functions. SQP and its variants have been successfully applied to a variety of problems [2, 11, 12, 14, 24, 6, 5]. 2.8 2.8.1 Gradient Methods General Gradient Methods Along with SQP, gradient methods are among the most often used algorithms for solving modern optimal control problems. Pioneered by H. J. 
Kelly [24], these methods have been traditionally performed in the control space; that is, the control variable is improved at each iteration until the stationary conditions are satisfied. An obvious advantage of this approach is that, if the system is stable, a guess of the control history $u(t)$ will produce manageable values of $y(t)$ during integration regardless of initial conditions (unlike shooting methods). Take, for example, the general system with stationary conditions

\[ \dot y^* = f(y^*,u^*), \qquad y^*(t_0) = y_0 \text{ given}; \tag{2.41} \]
\[ -\dot\lambda^* = \frac{\partial\mathcal{H}}{\partial y}(y^*,u^*,\lambda^*), \qquad \lambda^*(t_f) = \frac{\partial\phi[y^*(t_f)]}{\partial y}; \tag{2.42} \]
\[ 0 = \frac{\partial\mathcal{H}}{\partial u}(y^*,u^*,\lambda^*). \tag{2.43} \]

A guess $u^k(t)$ for the control variable can be used in place of $u^*(t)$ to integrate equation (2.41) forward in time. The resulting value of $y(t)$ can be used to integrate equation (2.42) backward in time, providing an approximation of the state-adjoint history $(y,\lambda)$. Since in general $u^k \ne u^*$, equation (2.43) will not be satisfied for this set of results. The goal of gradient methods, then, is to use gradient information to obtain $u^{k+1}$ such that $u^k \to u^*$ as $k\to\infty$.

In the above procedure, stability properties of the equations are maintained, exact state histories are produced at each step, and estimates can easily be made of cost errors from duality. For these reasons, these methods have been chosen as the basis for the algorithm developed in this work.

Gradient methods that operate on the control variable typically assume the form:

\[ u^{k+1} = u^k - \alpha^k D^k \nabla_u\mathcal{H}(u^k), \tag{2.44} \]

where $\alpha^k$ is a stepsize chosen by an appropriate line search, $D^k$ is a SPD matrix, and $\nabla_u\mathcal{H}(u^k)$ is the gradient of the Hamiltonian at $u^k$. This formulation derives directly from applying NLP gradient methods to the optimal control context.
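As a minimal sketch of iteration (2.44) with $D^k = I$ and a fixed stepsize, consider a small terminal LQP. The explicit Euler discretization, the matrices, the stepsize, and all function names below are illustrative assumptions and are not taken from this work. The adjoint sweep is the exact discrete adjoint of the chosen state sweep, so that $W_u u_\ell + B^T\lambda_{\ell+1}$ is the gradient of the discrete cost in the $\Delta t$-weighted inner product.

```python
import numpy as np

def forward_state(A, B, u, y0, dt):
    """Explicit-Euler sweep of y' = A y + B u; returns y[0..L]."""
    y = np.zeros((u.shape[0] + 1, y0.size))
    y[0] = y0
    for l in range(u.shape[0]):
        y[l + 1] = y[l] + dt * (A @ y[l] + B @ u[l])
    return y

def backward_adjoint(A, W_T, y_end, y_T, L, dt):
    """Discrete adjoint of the sweep above, started from W_T (y(t_f) - y_T)."""
    lam = np.zeros((L + 1, y_end.size))
    lam[L] = W_T @ (y_end - y_T)
    for l in range(L - 1, -1, -1):
        lam[l] = lam[l + 1] + dt * (A.T @ lam[l + 1])
    return lam

def cost(u, y_end, y_T, W_T, W_u, dt):
    """Terminal quadratic cost plus control effort."""
    e = y_end - y_T
    return 0.5 * e @ W_T @ e + 0.5 * dt * np.sum((u @ W_u) * u)

def control_gradient_descent(A, B, W_T, W_u, y0, y_T, t_f, L, alpha, iters):
    """Fixed-step iteration u <- u - alpha * (W_u u + B^T lam)."""
    dt = t_f / L
    u = np.zeros((L, B.shape[1]))
    costs, grad_norms = [], []
    for _ in range(iters):
        y = forward_state(A, B, u, y0, dt)                  # state, forward in time
        lam = backward_adjoint(A, W_T, y[-1], y_T, L, dt)   # adjoint, backward in time
        grad = u @ W_u.T + lam[1:] @ B                      # Hamiltonian gradient per step
        costs.append(cost(u, y[-1], y_T, W_T, W_u, dt))
        grad_norms.append(np.sqrt(dt * np.sum(grad ** 2)))
        u = u - alpha * grad
    return u, costs, grad_norms

# Hypothetical two-state, one-control example.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[0.0], [1.0]])
u, costs, grad_norms = control_gradient_descent(
    A, B, W_T=np.eye(2), W_u=np.eye(1), y0=np.array([1.0, 0.0]),
    y_T=np.array([0.0, 0.5]), t_f=1.0, L=100, alpha=0.5, iters=40)
print(costs[0], costs[-1], grad_norms[-1])
```

Only the control history and one state and one adjoint trajectory are held in memory at any time, which reflects the $O(NL)$ storage attributed to gradient methods in section 2.3.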
It can be shown [3] that if:

(i) the step direction $d^k = -D^k\nabla_u\mathcal{H}(u^k)$ is gradient related¹ for every $k$,
(ii) and the stepsize $\alpha^k$ is chosen by some appropriate line search technique such as the minimization rule, the limited minimization rule, or the Armijo rule,

then every limit point of the sequence $\{u^k\}$ generated by (2.44) is a stationary point.

¹ Given a subsequence $\{u^k\}$ that converges to a non-stationary point, the corresponding subsequence $\{d^k\}$ is bounded and satisfies the condition $\limsup_{k\to\infty} \nabla_u\mathcal{H}(u^k)^T d^k < 0$.

Steepest Descent (u)

Steepest descent schemes are among the simplest for this type of approach. They are simply implemented by assigning $D^k = I$, where $I$ is the identity operator of appropriate size. For the LQP problem, for instance, this would result in the iteration

\[ u^{k+1} = u^k - \alpha^k (W_u u^k + B^T\lambda^k). \]

Clearly, the step direction $d^k = -\nabla_u\mathcal{H}(u^k)$ always satisfies the gradient related condition for non-stationary points ($\nabla_u\mathcal{H}(u^k) \ne 0$). Though this guarantees eventual convergence in most cases, a critical problem of this approach is that it is only linearly convergent. This can seriously hamper the convergence rate if the problem is not well conditioned.

Newton-Raphson (u)

Newton-Raphson algorithms are based on a quadratic perturbation of the cost functional with respect to $u$. In the context of gradient methods, this corresponds to $D^k = (\nabla^2_{uu}\mathcal{H})^{-1}$. For the LQP example, we have

\[ u^{k+1} = u^k - \alpha^k W_u^{-1}(W_u u^k + B^T\lambda^k). \]

The main advantage of this approach is that quadratic convergence is attained near the solution. However, several drawbacks often make the pure Newton-Raphson method difficult to implement.

The first of these drawbacks is that the method requires $\nabla^2_{uu}\mathcal{H}(u^k)$ to be symmetric positive-definite for possible values of $u^k$ in order to guarantee that $d^k$ be gradient related. Variants of the method are more appropriate to deal with such situations.
Quasi-Newton methods, such as the BFGS variant [15], have been successfully used to solve nonlinear (non-SPD $\nabla^2_{uu} \mathcal{H}$) problems. These methods use a symmetric positive-definite approximation $H^k$ of $(\nabla^2_{uu} \mathcal{H}(u^k))^{-1}$ which improves as $u^k$ approaches $u^*$. A serious disadvantage of the quasi-Newton methods, however, is that $H^k$ must be stored throughout the iterations [22]. Since there is no intrinsic sparsity in $H^k$, this usually requires $O(M^2 L)$ storage, which can be prohibitive for large problems².

² Pure Newton-Raphson methods are able, in principle, to calculate $(\nabla^2_{uu} \mathcal{H}(u^k))^{-1}$ on the fly, avoiding issues with storage. In practice, however, this is a far greater computational hindrance.

Another problem of the method is that it requires an expensive inversion of $\nabla^2_{uu} \mathcal{H}(u^k)$. Quasi-Newton methods may be cheaper since $H^k$ is an approximation of the inverse of $\nabla^2_{uu} \mathcal{H}(u^k)$, but the calculations are still expensive compared to simpler forms of $D^k$. Finally, the pure form of the method may not converge for guesses that are far from the solution (particularly for nonlinear, non-SPD problems). Even if convergence does occur, the method does not fundamentally distinguish between maxima and minima. This problem may be avoided by careful use of appropriate line search routines coupled with homotopy loops which approach the problem from a known solution. This, however, imposes another level of complexity on the calculation.

Conjugate Gradient (u)

Often regarded as among the most powerful techniques for solving many types of linear algebra problems, conjugate gradient techniques strike a balance between the simplicity of steepest descent and the speed of Newton-Raphson. In these methods, the descent direction is redefined as

$$d^k = -\nabla_u \mathcal{H}(u^k) + \beta^k d^{k-1},$$

where

$$\beta^k = \frac{\nabla_u \mathcal{H}(u^k)^T \left( \nabla_u \mathcal{H}(u^k) - \nabla_u \mathcal{H}(u^{k-1}) \right)}{\nabla_u \mathcal{H}(u^{k-1})^T \nabla_u \mathcal{H}(u^{k-1})},$$

and the steps $u^{k+1} = u^k + \alpha^k d^k$ are taken such that

$$\mathcal{H}(u^k + \alpha^k d^k) = \min_{\alpha} \mathcal{H}(u^k + \alpha d^k).$$

These methods are typically simple to implement (especially for the quadratic case).
In addition, storage requirements are $O(M)$, far less than for quasi-Newton methods. In practice, these methods exhibit very fast convergence due to the orthogonalization of the descent direction. Possibilities for preconditioning exist, further improving convergence. Because of its speed and simplicity, the conjugate gradient approach has been chosen as the basis for the method presented in this work. There are, however, modifications made to take advantage of the theoretical flexibility of the Hilbert Uniqueness Method (HUM) developed by J.L. Lions [18, 19, 13].

2.8.2 The Hilbert Uniqueness Method (HUM)

Originally developed in the context of the wave equation [19], the Hilbert Uniqueness Method has been used to study controllability and stabilization properties of distributed systems governed by parabolic differential equations for the terminal problem [13, 9]. Since parabolic systems are of interest here, the method will be presented in this context. Here we briefly summarize the results of [9], and in section 3.4 we develop the method to solve general optimal control problems, including the regulator problem.

Consider the terminal LQP problem of section 1.2.3. Then, the optimality conditions can be expressed as

$$\dot{y}^* - A y^* = -Q \lambda^*, \qquad y^*(0) = y_0,$$
$$-\dot{\lambda}^* - A^T \lambda^* = 0, \qquad \lambda^*(t_f) = W_T (y^*(t_f) - y_T).$$

We can represent a general terminal state $y(t_f)$ through an operator $\mathcal{R}$: given $q \in \mathbb{R}^N$, solve

$$\dot{y} - A y = -Q \lambda, \qquad y(0) = 0,$$
$$-\dot{\lambda} - A^T \lambda = 0, \qquad \lambda(t_f) = q,$$

and set

$$\mathcal{R} q = y(t_f). \tag{2.45}$$

Also, let $E(t)$ denote the solution operator at time $t$ for $\dot{w} - A w = 0$, $w(0) = z$; that is, $w(t) = E(t) z$. Comparing the above relations, we arrive at

$$E(t_f) y_0 + W_T^{-1} \mathcal{R} (y^*(t_f) - y_T) = y^*(t_f),$$

or

$$(W_T + \mathcal{R})(y^*(t_f) - y_T) = W_T (y_T - E(t_f) y_0). \tag{2.46}$$

The operator $\mathcal{R}$ is similar to $\Lambda$ used by Lions [19] in HUM. Though [9] considers only the terminal problem, several important properties of $\mathcal{R}$ are proved:
(i) $\mathcal{R}$ is a compact operator in $\mathbb{R}^N$;
(ii) $\mathcal{R}$ is symmetric and semi-definite in $\mathbb{R}^N$;
(iii) $\mathrm{Ker}\, \mathcal{R} = \{0\}$.
As a result, the article is able to provide constructive proofs for approximate and exact controllability from the HUM point of view. Section 3.4 extends the theory to regulator problems and presents a solution method derived from HUM by introducing a more restricted space and connecting the above with the concepts of sections 1.3.2 and 2.8.1.

2.9 The Proposed Method

The method developed here is based on gradient methods. However, rather than approaching the primal problem from the control-variable perspective, the problem is approached from the dual statement with a transformed form of the adjoint variable. This approach is inspired by the HUM, which, as shown above, is a theoretical result used for proofs of the uniqueness of certain optimal control problems (hyperbolic problems and the parabolic terminal LQP, for example). Due to its constructive nature, the method can be implemented for the actual calculation of optimal solutions. In its current form, however, the method has been developed in Euclidean space, which restricts it to terminal optimal control problems. When the more general regulator problem is attempted in Euclidean space, the SPD property of the operator no longer holds, and neither uniqueness nor the implementation is justifiable.

We have found that if the method is constrained to a new problem-specific space, then the SPD property of the terminal-regulator problem can be stated, and the implementation of the algorithm follows. All that is required, then, is to use a method that maps out this new space. General conjugate direction algorithms can be used for this purpose, and in particular we use Conjugate Gradients. The difference from the traditional gradient algorithms is that the inner-products are no longer Euclidean, but rather of the type "$((\cdot, \cdot))$" which will be defined in chapter 3.
The method is also flexible: a decomposition of the variables into inhomogeneous and homogeneous parts along with a careful treatment of the operator used allow for extensions to be made, so that more general problems can be addressed. In our case, we are interested in addressing problems which are more realistic in the engineering context: problems with constrained controls and nonlinear problems. By treating the constrained control with Interior Point Methods (IPMs) and the nonlinearity with a lagging algorithm, we are able to preserve key aspects of the proposed algorithm. This means that it can be used successfully by more general methods to tackle difficult problems.

Chapter 3

TR Conjugate Gradient Algorithm - Linear-Quadratic, Unconstrained Problems

3.1 Motivation

Linear-Quadratic Problems are those characterized by linear governing equations and quadratic cost functionals. By restricting the definition to linear state equations, many realistic problems are excluded from the model. However, many other problems can be approximated very accurately by such equations. The main example of this chapter is a solid heat transfer problem, governed by a linear, first-order partial differential equation.

Here we focus on finite time, terminal-regulator problems. In such problems, we aim to control the state of the system throughout a given time interval $I = [t_0, t_f]$. In our example we would like to control the temperature at certain points of a domain given a set of available heaters. The behavior of the system should be forced to come "close" to a desired trajectory throughout $I$, with no constraints imposed on the values of controllers (heaters). Suppose we are presented with the object in Figure 3-1, and would like to use its top surface to control a certain (hypothetical and idealized) chemical reaction.
For this purpose, it is assumed that the exact temperature history required is known (an example will be given), and that the reaction itself does not affect the temperature of the surface. The object extends infinitely in the z-direction, and so this can be considered a two-dimensional problem. The side boundaries are kept at a constant temperature, while the top and bottom boundaries are well insulated and thus allow for zero heat flux.

[Figure 3-1: Motivation - the goal is to control the temperature in the shaded region. Figure labels: reaction surface; constant temperature (side boundaries); top and bottom surfaces well-insulated (zero heat transfer); object extends infinitely in z-direction.]

As a result of the simplifications above, this problem is governed by linear partial differential equations. It is assumed here that an appropriate spatial discretization has been performed so that the problem statement is presented as a set of governing ordinary differential equations. The details of this procedure are presented in section 3.11.

Constrained by the linear governing equations, the problem will be stated as an optimization of a quadratic cost. Regarded as a practical compromise of true design objectives to yield solvable problem statements [10], quadratic costs have been the subject of much optimal control work [6, 8, 26, 27, 1, 9, 13, 24, 23]. This is because the quadratic cost functional allows for well-posed problem statements and relatively easy numerical solution algorithms. Advantages and limitations of the form given here will be mentioned in the next section.

3.2 Problem Statement

Given a state vector $y \in \mathcal{Y} = C^0\{(t_0, t_f); \mathbb{R}^N\}$ and a control vector $u \in \mathcal{U} = \{(t_0, t_f) \times \mathbb{R}^M\}$, we begin the problem statement by presenting the quadratic cost functional to be minimized:

$$J[y(u)] = \frac{1}{2} (y(t_f) - y_T)^T W_T (y(t_f) - y_T) + \frac{1}{2} \int_{t_0}^{t_f} \left[ (y - y_R)^T W_R (y - y_R) + u^T W_U u \right] dt, \tag{3.1}$$

where $y_T \in \mathbb{R}^N$, $y_R \in \{(t_0, t_f) \times \mathbb{R}^N\}$, $W_T \in \mathbb{R}^{N \times N}$, $W_R \in \mathbb{R}^{N \times N}$, and $W_U \in \mathbb{R}^{M \times M}$.
The variables YT and YR represent the desired terminal and regulator behavior of the system, and the differences in the first and second terms represent deviations from desired behavior. The matrices 50 WT, WR, and Wu, represent the terminal, regulator, and control cost weights, respectively. Here we make the following assumptions about the matrices: WU is symmetric positive definite, WT and WR are both symmetric positive semi-definite, at least one of which is strictly positive. Remark 4 It is assumed here that all of the above information is known, though it should be noted that it is not a trivial matter to determine these parameters. The variables YT and YR are usually derived from the (assumed known) desired behavior. For example, we assume that in our example the required temperatures for the desired rate of reaction are precisely known for the desired outcome. The matrices that determine the weights of different terms of the cost functional may be derived from a few different approaches. In the simplest sense, it is typical for values to be empirically chosen until an acceptable controlled behavior is observed [10]. Alternatively, there may be an economic cost associated with the quality of state behavior (WT and WR) and with the cost of control (Wu) that automatically provide the required values. Another way of determining the weights is to choose, for example, diagonal matrices 1 WTii for i = 1, ... , N and j 1 - yT)" = 1, ... 1 (k yT - y r ' W)U 3 (Umax) 2 , M, where yf, Q, and Umax determine the "maximum" values to be tolerated for the respective deviations. Values below this maximum contribute terms that are less than 1 to J, while values above contribute terms that are grater than 1. Though this does not guarantee that any of the maximum deviations are strictly satisfied, it provides a guideline for a balanced distribution of the cost weights. 
Chapter 4 addresses a case in which strict limits are imposed on the control variable (a methodology that can easily be extended to the state variables). Finally, one may wish to approach the problem from the approximate controllability point of view. In [13] and [9] it was shown for the terminal LQP problem ($W_R = 0$) that

$$\lim_{W_T \to \infty} \| y(t_f) - y_T \| = 0.$$

That is, if we take $W_T \gg W_U$ we can approach the desired final state arbitrarily closely.

Having determined the cost functional above, we can now state the form of the constraints. These are the initial conditions for the state variables and the linear ordinary differential equations that govern the system:

$$y(t_0) = y_0, \tag{3.2}$$
$$\dot{y} = A y + B u + F, \tag{3.3}$$

where $y_0 \in \mathbb{R}^N$, $A \in \mathbb{R}^{N \times N}$, $B \in \mathbb{R}^{N \times M}$, and $F \in \mathbb{R}^N$. The matrix $A$ represents, for example, the stiffness matrix that controls the dynamic response of the system, while $B$ represents the distribution of control variables throughout the domain. The vector $F$ is included for generality. For the presentation here we assume that all the system parameters $W_R$, $W_U$, $A$, $B$, $F$ are constant in time. This need not be the case, however, and the algorithm proposed in section 3.6 remains unchanged if these parameters are allowed to vary in time.

Given $J[y(u)]$ of equation (3.1), we define the problem to be solved in this section as the Linear-Quadratic Program (LQP):

$$\min_{u \in \mathcal{U}} J[y(u)] \quad \text{s.t.} \quad \dot{y} = A y + B u + F, \quad y(t_0) = y_0. \tag{LQP}$$

3.3 Optimality Conditions for the LQP Problem

Having defined the problem to be solved, we must determine the necessary and sufficient conditions for stationarity. From section 1.2.3, we know that stationary conditions are necessary and sufficient for LQP problems. These can easily be derived from equations (1.15)-(1.17):

$$\dot{y}^* = A y^* + B u^* + F, \qquad y^*(0) = y_0,$$
$$-\dot{\lambda}^* = A^T \lambda^* + W_R (y^* - y_R), \qquad \lambda^*(t_f) = W_T (y^*(t_f) - y_T),$$
$$0 = W_U u^* + B^T \lambda^*,$$

where $\lambda \in \Lambda = C^0\{(t_0, t_f); \mathbb{R}^N\}$.
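As a brief check of these conditions, the following worked step assumes the standard Hamiltonian construction for problem (LQP). The explicit forms of $\mathcal{H}$ and of the terminal cost (written here as $\phi$) are assumptions made for illustration, since equations (1.15)-(1.17) are not reproduced in this section.

```latex
% Assumed Hamiltonian and terminal cost for (LQP):
\mathcal{H}(y,u,\lambda) = \tfrac{1}{2}(y-y_R)^T W_R (y-y_R)
                         + \tfrac{1}{2} u^T W_U u
                         + \lambda^T (Ay + Bu + F),
\qquad
\phi[y(t_f)] = \tfrac{1}{2}(y(t_f)-y_T)^T W_T (y(t_f)-y_T).

% Differentiating (W_R, W_T, W_U symmetric):
\frac{\partial \mathcal{H}}{\partial y} = A^T\lambda + W_R (y - y_R),
\qquad
\frac{\partial \mathcal{H}}{\partial u} = W_U u + B^T \lambda,
\qquad
\frac{\partial \phi}{\partial y}\Big|_{t_f} = W_T (y(t_f) - y_T).

% Hence the adjoint equation, its terminal condition, and the control condition:
-\dot{\lambda}^* = A^T\lambda^* + W_R (y^* - y_R), \quad
\lambda^*(t_f) = W_T (y^*(t_f) - y_T), \quad
0 = W_U u^* + B^T \lambda^*.
```

Since $W_U$ is symmetric positive definite, the last condition can be solved explicitly for the control, $u^* = -W_U^{-1} B^T \lambda^*$, which is what allows the control to be eliminated from the state equation.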
Defining $Q = B W_U^{-1} B^T$, we can reduce the optimality conditions to

$$\dot{y}^* = A y^* - Q \lambda^* + F, \qquad y^*(0) = y_0, \tag{3.4}$$
$$-\dot{\lambda}^* = A^T \lambda^* + W_R (y^* - y_R), \qquad \lambda^*(t_f) = W_T (y^*(t_f) - y_T), \tag{3.5}$$

and

$$u^* = -W_U^{-1} B^T \lambda^*. \tag{3.6}$$

Any algorithm that is developed to solve problem (LQP) must therefore solve the system of equations (3.4,3.5). The solution for the optimal control history $u^*$ can then be obtained from equation (3.6). The difficulties associated with solving this type of system were presented in section 2.1, and are apparent even for this simpler system.

3.4 The Hilbert Uniqueness Method for the Terminal-Regulator LQP Problem

3.4.1 Separation of Inhomogeneous and Homogeneous Parts

Before introducing HUM for the regulator problem, we begin with a separation of state and adjoint variables into inhomogeneous and homogeneous parts: $y = y_I + y_H$ and $\lambda = \lambda_I + \lambda_H$. Then problem (3.4,3.5) can be separated into:

$$\dot{y}_I^* = A y_I^* - Q \lambda_I^* + F, \qquad y_I^*(0) = y_0, \tag{3.7}$$
$$-\dot{\lambda}_I^* = A^T \lambda_I^* - W_R y_R, \qquad \lambda_I^*(t_f) = -W_T y_T, \tag{3.8}$$

and

$$\dot{y}_H^* = A y_H^* - Q \lambda_H^*, \qquad y_H^*(0) = 0, \tag{3.9}$$
$$-\dot{\lambda}_H^* = A^T \lambda_H^* + W_R y^*, \qquad \lambda_H^*(t_f) = W_T y^*(t_f). \tag{3.10}$$

The inhomogeneous parts $y_I$ and $\lambda_I$ can immediately be calculated from above since equation (3.8) is uncoupled from equation (3.7): all that is required is a backward integration of equation (3.8) followed by a forward integration of equation (3.7) with the resulting $\lambda_I$. This cannot be said for $y_H$ and $\lambda_H$ since equation (3.10) depends on values of $y = y_I + y_H$, which are not known.

3.4.2 $\mathcal{R}$ and $\mathcal{G}$ Operators

Here we begin to build on HUM by defining an operator $\mathcal{R}$ for the regulator problem. Given $q \in \mathcal{Y} (= \Lambda)$, $\mathcal{R} q$ is defined by the following operation motivated by equations (3.9,3.10):

$$\frac{d}{dt}(\mathcal{R} q)(t) = A (\mathcal{R} q)(t) - Q \lambda_H(t), \qquad (\mathcal{R} q)(t_0) = 0,$$
$$-\dot{\lambda}_H(t) = A^T \lambda_H(t) + W_R q(t), \qquad \lambda_H(t_f) = W_T q(t_f). \tag{3.11}$$

Note this is different from the HUM operator of section 2.8.2 and [9, 13] since it includes a regulator term $W_R q(t)$. Once a value of $q$ has been given, a backward integration (for $\lambda_H$) followed by a forward integration (for $\mathcal{R} q$) gives the operation $\mathcal{R} q$. The operator $\mathcal{G}$ is simply defined as

$$\mathcal{G} q = q - \mathcal{R} q. \tag{3.12}$$

3.4.3 The Terminal-Regulator $((\cdot, \cdot))$ Inner Product

The terminal-regulator $((\cdot, \cdot))$ inner-product is similar to the one defined in section 1.3.2; however, important modifications are to be noted. Given $v \in \mathcal{Y}$ and $w \in \mathcal{Y}$,

$$((v, w)) = v(t_f)^T W_T w(t_f) + \int_{t_0}^{t_f} v(t)^T W_R w(t)\, dt. \tag{3.13}$$

Now the inner product operates on variables that have a time dependence. It also has an additional term that is integrated over the control time period.

3.4.4 Proof of SPD Property

It will be useful to show at this point that the $\mathcal{G}$ operator is symmetric positive definite in the space defined by the above inner-product. We have from equations (3.10,3.11)

$$\frac{d}{dt}\left( \lambda_H^T (\mathcal{R} q) \right) = \lambda_H^T \frac{d}{dt}(\mathcal{R} q) + \dot{\lambda}_H^T (\mathcal{R} q) = \lambda_H^T A (\mathcal{R} q) - \lambda_H^T Q \lambda_H - \lambda_H^T A (\mathcal{R} q) - q^T W_R (\mathcal{R} q) = -\lambda_H^T Q \lambda_H - q^T W_R (\mathcal{R} q),$$

which, by Green's formula, yields

$$-\int_{t_0}^{t_f} \lambda_H^T Q \lambda_H\, dt = q(t_f)^T W_T (\mathcal{R} q)(t_f) + \int_{t_0}^{t_f} q^T W_R (\mathcal{R} q)\, dt = ((q, \mathcal{R} q)). \tag{3.14}$$

Since $W_U$ is assumed symmetric positive definite, $Q = B W_U^{-1} B^T$ is symmetric positive semi-definite (and positive definite when $B$ has full row rank). This fact together with equation (3.14) guarantees that the operator $\mathcal{R}$ is symmetric negative semi-definite in the inner-product space defined above. Since $\mathcal{G} q = q - \mathcal{R} q$, it must also be true that the operator $\mathcal{G}$ is symmetric positive definite in that space:

$$((p, \mathcal{G} p)) = ((p, p)) - ((p, \mathcal{R} p)) > 0, \qquad \forall p \in \mathcal{Y},\ p \neq 0.$$

3.4.5 HUM from the Dual Problem

Preliminaries

For later use, we note that also from equations (3.7,3.8,3.11):

$$\frac{d}{dt}(\lambda_H^T y_I) = \dot{\lambda}_H^T y_I + \lambda_H^T \dot{y}_I = -\lambda_H^T A y_I - q^T W_R y_I + \lambda_H^T A y_I - \lambda_H^T Q \lambda_I + \lambda_H^T F = -q^T W_R y_I - \lambda_H^T Q \lambda_I + \lambda_H^T F,$$

and

$$\frac{d}{dt}(\lambda_I^T y_I) = \dot{\lambda}_I^T y_I + \lambda_I^T \dot{y}_I = -\lambda_I^T A y_I + y_R^T W_R y_I + \lambda_I^T A y_I - \lambda_I^T Q \lambda_I + \lambda_I^T F = y_R^T W_R y_I - \lambda_I^T Q \lambda_I + \lambda_I^T F,$$

which, again by Green's formula, yield

$$-\int_{t_0}^{t_f} \lambda_H^T Q \lambda_I\, dt = y_I(t_f)^T W_T q(t_f) - \lambda_H(t_0)^T y_0 + \int_{t_0}^{t_f} q^T W_R y_I\, dt - \int_{t_0}^{t_f} \lambda_H^T F\, dt, \tag{3.15}$$

$$-\int_{t_0}^{t_f} \lambda_I^T Q \lambda_I\, dt = -y_T^T W_T y_I(t_f) - \lambda_I(t_0)^T y_0 - \int_{t_0}^{t_f} y_R^T W_R y_I\, dt - \int_{t_0}^{t_f} \lambda_I^T F\, dt. \tag{3.16}$$

The Terminal-Regulator Dual Problem

We can use Fenchel duality to easily obtain the dual problem. Since equations (1.34) and (1.35) are stated in general form, the conjugate convex and concave functionals are easily derived.
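Defining $f_1(y) = A y$ and $f_2(u) = B u + F$, they are, respectively,

$$I_1(\lambda) = \sup_{y \in \mathcal{Y}} \left\{ -\int_{t_0}^{t_f} (\dot{\lambda} + A^T \lambda)^T y\, dt - \frac{1}{2}(y_f - y_T)^T W_T (y_f - y_T) - \frac{1}{2} \int_{t_0}^{t_f} (y - y_R)^T W_R (y - y_R)\, dt + \lambda_f^T y_f - \lambda_0^T y_0 \right\} \tag{3.17}$$

and

$$I_2(\lambda) = \inf_{u \in \mathcal{U}} \int_{t_0}^{t_f} \left[ \lambda^T (B u + F) + \frac{1}{2} u^T W_U u \right] dt, \tag{3.18}$$

where the subscripts $f$ and $0$ denote values at $t_f$ and $t_0$. Here, we note that $\Lambda_1 = \Lambda_2 = \Lambda$ for $I_1(\lambda) < \infty$ and $I_2(\lambda) > -\infty$. Optimizing the functionals above, we have

$$I_1(\lambda) = \int_{t_0}^{t_f} \left[ \frac{1}{2} (\dot{\lambda} + A^T \lambda)^T W_R^{-1} (\dot{\lambda} + A^T \lambda) - (\dot{\lambda} + A^T \lambda)^T y_R \right] dt + \frac{1}{2} \lambda_f^T W_T^{-1} \lambda_f + \lambda_f^T y_T - \lambda_0^T y_0, \qquad \lambda \in \Lambda, \tag{3.19}$$

and

$$I_2(\lambda) = -\frac{1}{2} \int_{t_0}^{t_f} \lambda^T Q \lambda\, dt + \int_{t_0}^{t_f} \lambda^T F\, dt, \qquad \lambda \in \Lambda. \tag{3.20}$$

The dual problem can now be stated:

$$\text{maximize} \quad I(\lambda) = \int_{t_0}^{t_f} \left[ -\frac{1}{2} (\dot{\lambda} + A^T \lambda)^T W_R^{-1} (\dot{\lambda} + A^T \lambda) + (\dot{\lambda} + A^T \lambda)^T y_R \right] dt - \frac{1}{2} \lambda_f^T W_T^{-1} \lambda_f - \lambda_f^T y_T + \lambda_0^T y_0 - \frac{1}{2} \int_{t_0}^{t_f} \lambda^T Q \lambda\, dt + \int_{t_0}^{t_f} \lambda^T F\, dt$$
$$\text{subject to} \quad \lambda \in \Lambda. \tag{LQPdual}$$

Dual Problem - q Formulation

Before changing from the $\lambda$ to the $q$ formulation, we split the adjoint variable into inhomogeneous and homogeneous parts:

$$\begin{aligned}
I(\lambda) = {} & -\frac{1}{2} \int_{t_0}^{t_f} (\dot{\lambda}_H + A^T \lambda_H)^T W_R^{-1} (\dot{\lambda}_H + A^T \lambda_H)\, dt - \int_{t_0}^{t_f} (\dot{\lambda}_H + A^T \lambda_H)^T W_R^{-1} (\dot{\lambda}_I + A^T \lambda_I)\, dt \\
& - \frac{1}{2} \int_{t_0}^{t_f} (\dot{\lambda}_I + A^T \lambda_I)^T W_R^{-1} (\dot{\lambda}_I + A^T \lambda_I)\, dt + \int_{t_0}^{t_f} (\dot{\lambda}_H + A^T \lambda_H)^T y_R\, dt + \int_{t_0}^{t_f} (\dot{\lambda}_I + A^T \lambda_I)^T y_R\, dt \\
& - \frac{1}{2} \lambda_{Hf}^T W_T^{-1} \lambda_{Hf} - \lambda_{Hf}^T W_T^{-1} \lambda_{If} - \frac{1}{2} \lambda_{If}^T W_T^{-1} \lambda_{If} \\
& - \frac{1}{2} \int_{t_0}^{t_f} \lambda_H^T Q \lambda_H\, dt - \int_{t_0}^{t_f} \lambda_H^T Q \lambda_I\, dt - \frac{1}{2} \int_{t_0}^{t_f} \lambda_I^T Q \lambda_I\, dt \\
& - (\lambda_{Hf} + \lambda_{If})^T y_T + (\lambda_{H0} + \lambda_{I0})^T y_0 + \int_{t_0}^{t_f} (\lambda_I + \lambda_H)^T F\, dt.
\end{aligned}$$

Now, we can use equations (3.8)-(3.16) to substitute term by term above. In particular, $\dot{\lambda}_H + A^T \lambda_H = -W_R q$, $\dot{\lambda}_I + A^T \lambda_I = W_R y_R$, $\lambda_{Hf} = W_T q(t_f)$, and $\lambda_{If} = -W_T y_T$, so that

$$\begin{aligned}
I(q(\lambda)) = {} & -\frac{1}{2} \int_{t_0}^{t_f} q^T W_R q\, dt + \int_{t_0}^{t_f} q^T W_R y_R\, dt - \frac{1}{2} \int_{t_0}^{t_f} y_R^T W_R y_R\, dt - \int_{t_0}^{t_f} q^T W_R y_R\, dt + \int_{t_0}^{t_f} y_R^T W_R y_R\, dt \\
& - \frac{1}{2} q(t_f)^T W_T q(t_f) + q(t_f)^T W_T y_T - \frac{1}{2} y_T^T W_T y_T \\
& + \frac{1}{2} \left[ q(t_f)^T W_T (\mathcal{R} q)(t_f) + \int_{t_0}^{t_f} q^T W_R (\mathcal{R} q)\, dt \right] \\
& + y_I(t_f)^T W_T q(t_f) - \lambda_H(t_0)^T y_0 + \int_{t_0}^{t_f} q^T W_R y_I\, dt - \int_{t_0}^{t_f} \lambda_H^T F\, dt \\
& + \frac{1}{2} \left[ -y_T^T W_T y_I(t_f) - \lambda_I(t_0)^T y_0 - \int_{t_0}^{t_f} y_R^T W_R y_I\, dt - \int_{t_0}^{t_f} \lambda_I^T F\, dt \right] \\
& - q(t_f)^T W_T y_T + y_T^T W_T y_T + \lambda_H(t_0)^T y_0 + \lambda_I(t_0)^T y_0 + \int_{t_0}^{t_f} (\lambda_I + \lambda_H)^T F\, dt.
\end{aligned}$$

Finally, canceling like terms and rearranging we arrive at a function that is independent of $\lambda_H$:

$$\begin{aligned}
I(q) = {} & -\frac{1}{2} \left\{ q(t_f)^T W_T q(t_f) - q(t_f)^T W_T (\mathcal{R} q)(t_f) + \int_{t_0}^{t_f} q^T W_R q\, dt - \int_{t_0}^{t_f} q^T W_R (\mathcal{R} q)\, dt \right\} \\
& + q(t_f)^T W_T y_I(t_f) + \int_{t_0}^{t_f} q^T W_R y_I\, dt + \frac{1}{2} y_T^T W_T (y_T - y_I(t_f)) + \frac{1}{2} \lambda_I(t_0)^T y_0 \\
& + \frac{1}{2} \int_{t_0}^{t_f} y_R^T W_R (y_R - y_I)\, dt + \frac{1}{2} \int_{t_0}^{t_f} \lambda_I^T F\, dt \\
= {} & -\frac{1}{2} \left\{ q(t_f)^T W_T (\mathcal{G} q)(t_f) + \int_{t_0}^{t_f} q^T W_R (\mathcal{G} q)\, dt \right\} + q(t_f)^T W_T y_I(t_f) + \int_{t_0}^{t_f} q^T W_R y_I\, dt \\
& + \frac{1}{2} y_T^T W_T (y_T - y_I(t_f)) + \frac{1}{2} \lambda_I(t_0)^T y_0 + \frac{1}{2} \int_{t_0}^{t_f} y_R^T W_R (y_R - y_I)\, dt + \frac{1}{2} \int_{t_0}^{t_f} \lambda_I^T F\, dt.
\end{aligned}$$

Defining

$$C_1 = \frac{1}{2} \int_{t_0}^{t_f} y_R^T W_R (y_R - y_I)\, dt + \frac{1}{2} y_T^T W_T (y_T - y_I(t_f)) + \frac{1}{2} \lambda_I(t_0)^T y_0 + \frac{1}{2} \int_{t_0}^{t_f} \lambda_I^T F\, dt, \tag{3.21}$$

we may write the dual problem as

$$I(q^*) = \max_{q \in \mathcal{Y}} \left\{ -\frac{1}{2} ((q, \mathcal{G} q)) + ((q, y_I)) + C_1 \right\}. \tag{LQPq}$$

Therefore, we have restated the original primal problem (LQP) as an unconstrained maximization problem in terms of $q(t)$. Both (LQPdual) and (LQPq) are the dual statement of (LQP). The latter, however, has the distinct advantage that it is stated in terms of the operator $\mathcal{G}$, which is symmetric positive definite in the space defined by $((\cdot, \cdot))$. Additionally, the statement of the problem is much cleaner since no time derivatives are present in $I(q)$, as opposed to $I(\lambda)$.

We note here that the set $\Lambda$ is convex and polyhedral, and that $I_1(\lambda)$ is convex and $I_2(\lambda)$ is concave over $\Lambda$. Thus, we can state [3] that there is no duality gap and we have

$$\inf_{(y, u) \in \{\mathcal{Y} \times \mathcal{U}\}} \{ J_1(y) - J_2(u) \} = \max_{\lambda \in \Lambda} \{ I_2(\lambda) - I_1(\lambda) \}, \tag{3.22}$$

or simply: $J(u^*) = I(\lambda^*) = I(q^*)$. This result will be referred to as strong duality. From Fenchel duality theory, weak duality also holds:

$$I(q) \leq J(u) \qquad \forall \{ q \in \mathcal{Y},\ u \in \mathcal{U} \}.$$

Form (LQPq) is in fact a rather useful statement of the problem from the numerical point of view:
(i) since the operator $\mathcal{G}$ is symmetric positive definite in the given inner product, efficient numerical techniques that depend on this property can be used;
(ii) the $q$ statement of the dual problem is an unconstrained maximization problem, which can be solved for a unique solution by well-established methods;
(iii) only the action of operator $\mathcal{G}$ on $q$ is required, thus allowing for potential storage savings. This action is uncoupled in time, and is performed via operations (3.11,3.12);
(iv) the inhomogeneous part $C_1$ can be solved for by equations (3.7,3.8), which are uncoupled in time.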
Since this term is independent of $q$, it needs to be solved only once in an iterative procedure;
(v) in iteratively solving for $q^*$, any value $q \in \mathcal{Y}$ will produce $I(q)$, which serves as a lower bound to the optimal cost $I(q^*) = J(u^*)$;
(vi) in iteratively solving for $q^*$, the operation $\mathcal{G} q$ produces a non-optimal value of $\lambda(q)$ which can be used to determine $u(q) = -W_U^{-1} B^T \lambda(q)$. This can be used to determine an upper bound $J(u(q))$, and thus a bound gap to the solution $\Delta_c(q) = J(u(q)) - I(q) \geq 0$. Because of strong duality, effective numerical algorithms should allow $\Delta_c(q) \to 0$ as $q \to q^*$.

3.4.6 Differences from Previous HUM

The Hilbert Uniqueness Method has been developed as a constructive technique to study exact controllability of distributed systems. Originally developed in the context of the wave equation [19], the method has been extended to study parabolic equations [9] of the type considered here. The approach, however, has been to state the problem as a solution to the system

$$\mathcal{P} q = y_I,$$

where $\mathcal{P}$ is some operator similar to the $\mathcal{G}$ presented here. The operator is then shown to be symmetric positive definite, allowing for the development of algorithms for solving the above system for the terminal problem. This, however, overlooks the origin of the problem above, which is the dual statement of the original problem as in (LQPq). By using Fenchel duality, and stating the problem as such, we have available $\Delta_c(q)$: a measure of quality of the current iterative solution.

Though Fenchel duality has been used to express the dual of the optimal control problem [13], this has been done for the terminal problem only. In such a situation, the end result is similar to problem (1.40), and the symmetric positive definite property of the terminal $\mathcal{G}$ operator can be proven in Euclidean space, thus not requiring a special inner-product such as $((\cdot, \cdot))$.
The advantage of statement (LQPq), however, is that regulator problems can be just as easily formulated, provided the inner-product is appropriately defined. By combining the duality result with HUM as above, one is able to automatically derive the appropriate form of the inner-product to be used to solve the problem.

3.5 General Conjugate Gradient (CG) Algorithms

3.5.1 General Conjugate Direction Methods

The method of conjugate gradients is presented here for the solution of general optimization problems in $\mathbb{R}^n$. In section 3.6 the method will be applied to a subspace of $\mathcal{Y}$ for solving the optimal control problem (LQPq). Since the problem of interest is quadratic, the method will be presented in this context.

Conjugate gradient methods are a subset of the more general conjugate direction methods [3]. They were originally developed for solving quadratic problems of the form

$$x^* = \arg\min_{x \in \mathbb{R}^n} f(x) = \frac{1}{2} x^T Q x - x^T b, \tag{3.23}$$

where $Q \in \mathbb{R}^{n \times n}$ is a positive definite matrix and $b \in \mathbb{R}^n$ (this problem can also be expressed as $Q x = b$). Conjugate direction methods rely on $Q$-conjugate directions $d^1, \ldots, d^k$; that is, $d^{iT} Q d^j = 0$ for all $i$ and $j$ such that $i \neq j$. These directions are linearly independent by construction since $Q$ is positive definite. Minimization of $f$ is then iteratively performed by

$$x^{k+1} = x^k + \alpha^k d^k,$$

where $\alpha^k$ is chosen in such a way that

$$f(x^k + \alpha^k d^k) = \min_{\alpha} f(x^k + \alpha d^k).$$

Since $Q$ is symmetric positive definite, this value of $\alpha$ can easily be found from the form of $f$ to be

$$\alpha^k = \frac{d^{kT} (b - Q x^k)}{d^{kT} Q d^k} \tag{3.24}$$

for any given direction $d^k$. The strength of these methods is that the iterates progressively minimize $f$ over an expanding linear manifold that eventually includes all of $\mathbb{R}^n$.

Using the Gram-Schmidt procedure, it is possible to obtain a set of mutually $Q$-conjugate directions $d^0, \ldots, d^k$ from a set of linearly independent vectors $r^0, \ldots, r^k$, so that the subspace spanned by both sets of vectors is the same. This can be done iteratively in the following manner.
Choosing $d^0 = r^0$ and

$$d^{i+1} = r^{i+1} + \sum_{m=0}^{i} c^{(i+1)m} d^m,$$

determine coefficients $c^{(i+1)m}$ such that $d^{i+1}$ is $Q$-conjugate to $d^0, \ldots, d^i$. From the $Q$-conjugacy definition, we get

$$c^{(i+1)j} = -\frac{r^{(i+1)T} Q d^j}{d^{jT} Q d^j}, \qquad j = 0, \ldots, i.$$

3.5.2 The General Conjugate Gradient Method

In the conjugate gradient method, the set of $Q$-orthogonal directions is obtained by applying the Gram-Schmidt procedure to the gradient vectors of problem (3.23):

$$r^k = -\nabla f(x^k) = b - Q x^k$$

(the negative of the gradient is used for notational convenience; this value is also often referred to as the residual). In this case, the direction calculation for the $k$-th iteration adopts the form

$$d^k = r^k - \sum_{j=0}^{k-1} \frac{r^{kT} Q d^j}{d^{jT} Q d^j} d^j.$$

But since the gradient $r^k$ is, by construction, orthogonal to the subspace spanned by $r^0, \ldots, r^{k-1}$, and since $Q d^j = (r^j - r^{j+1}) / \alpha^j$, the above can be greatly simplified to

$$d^k = r^k - \sum_{j=0}^{k-1} \frac{r^{kT} (r^{j+1} - r^j)}{d^{jT} (r^{j+1} - r^j)} d^j = r^k + \frac{r^{kT} r^k}{r^{(k-1)T} r^{k-1}} d^{k-1}. \tag{3.25}$$

The simplicity of the above forms indicates one of the advantages of the method: for implementation, it is only necessary to follow the algorithm:

Algorithm CG
  Set $k = 0$;
  Set $x^0 = 0$, $r^0 = b$, and $d^0 = r^0$;
  while not ($r^{kT} r^k \leq$ tolerance) do
    $k = k + 1$;
    $\alpha^k = (r^{(k-1)T} r^{k-1}) / (d^{(k-1)T} Q d^{k-1})$;
    $x^k = x^{k-1} + \alpha^k d^{k-1}$;
    $r^k = r^{k-1} - \alpha^k Q d^{k-1}$;
    $\beta^k = (r^{kT} r^k) / (r^{(k-1)T} r^{k-1})$;
    $d^k = r^k + \beta^k d^{k-1}$;
  end do.

Another significant advantage of the method is the fast rate of convergence observed. An upper bound for the convergence rate is

$$\frac{\| x^* - x^k \|_Q}{\| x^* - x^0 \|_Q} \leq 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^k, \tag{3.26}$$

where $\| x \|_Q = (x^T Q x)^{1/2}$ denotes the $Q$-norm, and $\kappa$ is the condition number of $Q$ (see [28] for proof and other results). The conjugate gradient method is particularly powerful in practice for matrices $Q$ whose spectra are well-behaved (clustered eigenvalues). Though in theory the method will achieve the exact solution when $k = n$, such problems allow for very satisfactory accuracy for $k \ll n$.
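Algorithm CG can be sketched directly in Python with NumPy. This is a minimal illustration under the assumption that $Q$ is a dense symmetric positive definite array; the function name `conjugate_gradient`, the default tolerance, and the iteration cap are hypothetical choices rather than part of the algorithm as stated.

```python
import numpy as np

def conjugate_gradient(Q, b, tol=1e-24, max_iter=None):
    """Minimize 0.5*x^T Q x - x^T b (equivalently solve Q x = b) for SPD Q."""
    n = b.shape[0]
    if max_iter is None:
        max_iter = 10 * n           # safety cap; exact arithmetic needs at most n steps
    x = np.zeros(n)
    r = b.astype(float)             # residual b - Q x for the starting point x = 0
    d = r.copy()
    rr = r @ r
    k = 0
    while rr > tol and k < max_iter:
        k += 1
        Qd = Q @ d                  # the only place the operator is applied
        alpha = rr / (d @ Qd)       # exact line search, equation (3.24)
        x = x + alpha * d
        r = r - alpha * Qd
        rr_new = r @ r
        beta = rr_new / rr          # coefficient from equation (3.25)
        d = r + beta * d
        rr = rr_new
    return x, k

Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x, iterations = conjugate_gradient(Q, b)
```

Only the product of $Q$ with a direction is needed in each pass, which is the feature that later allows the matrix to be replaced by an operator defined through integrations.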
This rapid convergence is a result of the fact that the method chooses search directions $d^k$ which allow for the minimization of the error $\| x^* - x^k \|_Q$ over the entire Krylov space $\mathcal{K}^k$ spanned by $\{ b, Q b, \ldots, Q^{k-1} b \}$.

3.6 Terminal-Regulator Conjugate Gradient Algorithm (TRCG)

3.6.1 The Skeletal TRCG Algorithm

We can adopt the above conjugate gradient methodology to solve problem (LQPq). Here, we express the problem in the equivalent minimization form:

$$q^* = \arg\min_{q \in \mathcal{Y}} \left\{ \tilde{I}(q) = \frac{1}{2} ((q, \mathcal{G} q)) - ((q, y_I)) \right\}, \tag{LQPq*}$$

where we are reminded that $I(q) = \{ -\frac{1}{2} ((q, \mathcal{G} q)) + ((q, y_I)) + C_1 \} \leq I(q^*)$ for all $q \in \mathcal{Y}$. The similarities between problems (3.23) and (LQPq*) are immediately apparent:
(i) both $f(x)$ and $\tilde{I}(q)$ are quadratic functionals;
(ii) both problems are unconstrained minimizations;
(iii) both problems are defined by a symmetric positive definite operator in an appropriate inner-product.

The differences between the problems have to do with the operators involved and the inner-products:
(i) though $Q$ is an $n \times n$ matrix, $\mathcal{G}$ is an integration operation, which takes an argument $q$ and returns a time history $\mathcal{G} q$;
(ii) the inner-product for problem (3.23) is the Euclidean inner-product, while that for (LQPq*) is defined as in (3.13). This last point is crucial to the method since operator $\mathcal{G}$ can be shown to be positive definite only in the space defined by this inner-product.

The result of these observations at the algorithmic level is as follows. All of the conjugate gradient ideas presented in section 3.5 can be restated in the context of $((\cdot, \cdot))$ in place of $(\cdot)^T(\cdot)$. Now, rather than minimizing the $Q$-norm of the error $\| x^k - x^* \|_Q$, the method will minimize the "$\mathcal{G}$-norm" $\| q^k - q^* \|_{\mathcal{G}}$, defined by $\| v \|_{\mathcal{G}} = ((v, \mathcal{G} v))^{1/2}$, over the Krylov-type space spanned by $\{ y_I, \mathcal{G} y_I, \mathcal{G}(\mathcal{G} y_I), \ldots, \mathcal{G}^{k-1} y_I \}$.
The algorithm now takes the form

Algorithm TRCG (skeletal)
  Calculate $y_I$ from equations (3.7,3.8);
  Set $k = 0$;
  Set $q^0 = 0$, $r^0 = y_I$, and $d^0 = r^0$;
  while not (stopping criterion) do
    $k = k + 1$;
    $\alpha^k = ((r^{k-1}, r^{k-1})) / ((d^{k-1}, \mathcal{G} d^{k-1}))$;
    $q^k = q^{k-1} + \alpha^k d^{k-1}$;
    $r^k = r^{k-1} - \alpha^k \mathcal{G} d^{k-1}$;
    $\beta^k = ((r^k, r^k)) / ((r^{k-1}, r^{k-1}))$;
    $d^k = r^k + \beta^k d^{k-1}$;
  end do.

3.6.2 Convergence Results for TRCG

Before we explore the convergence characteristics of the method presented above, we present the following definition.

Definition 1 The set of scalars $\lambda_i \in \mathbb{R}$ and functions $\epsilon_i \in \mathcal{Y}$ that solve the system

$$((\mathcal{G} \epsilon_i, v)) = ((\lambda_i \epsilon_i, v)) \qquad \forall v \in \mathcal{Y}, \tag{3.27}$$

are defined as eigenvalues and eigenfunctions of $\mathcal{G}$, respectively, where $i = 1, \ldots, N$. Furthermore, we define eigenfunctions as those $\epsilon_i$ which satisfy $((\epsilon_i, \epsilon_i)) = 1$.

Lemma 1 (a) The eigenfunctions of $\mathcal{G}$ from Definition 1 are mutually orthonormal: $((\epsilon_i, \epsilon_j)) = 0$ for all $i \neq j$. (b) In addition, the complete set $\{\epsilon_i\}$ forms a basis for $\mathcal{Y}$: $\forall q \in \mathcal{Y}$, there exists a set of scalars $\{\xi_i\}$ such that $q(t) = \sum_i \xi_i \epsilon_i(t)$.

Proof. (a) First we observe from equation (3.66) that $\mathcal{G}$ is symmetric in the $((\cdot, \cdot))$ inner-product. For $i \neq j$ we have $\epsilon_i \neq \epsilon_j$ and $\lambda_i \neq \lambda_j$. From symmetry and equation (3.27), we can write

$$((\mathcal{G} \epsilon_i, \epsilon_j)) = ((\lambda_i \epsilon_i, \epsilon_j)) = \lambda_i ((\epsilon_i, \epsilon_j)) \quad \text{and} \quad ((\epsilon_i, \mathcal{G} \epsilon_j)) = ((\epsilon_i, \lambda_j \epsilon_j)) = \lambda_j ((\epsilon_i, \epsilon_j)).$$

Since $\lambda_i \neq \lambda_j$, we must have $((\epsilon_i, \epsilon_j)) = ((\epsilon_j, \epsilon_i)) = 0$. (b) The complete set of orthonormal $\{\epsilon_i\}$ spans the space $\mathcal{Y}$, and so serves as a basis.

Though the typical convergence result presented for the general conjugate gradient method is shown in equation (3.26), a more useful result from a cost minimization point of view is presented here for the TRCG algorithm:

Proposition 1 Assume that $\mathcal{G}$ has $(N - k)$ eigenvalues in the interval $[a, b]$, where the remaining $k$ eigenvalues are greater than $b$. Then every $q^{k+1}$ generated by the TRCG algorithm satisfies

$$\frac{\tilde{I}(q^{k+1}) - \tilde{I}(q^*)}{\tilde{I}(q^0) - \tilde{I}(q^*)} \leq \left( \frac{b - a}{b + a} \right)^2. \tag{3.28}$$

Proof. To simplify the proof we introduce the change of variables $p = q - q^*$ and the function $\tilde{I}_p(p) = \tilde{I}(q) - \tilde{I}(q^*)$.
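Then, using $\mathcal{G} q^* = y_I$,

$$\begin{aligned}
\tilde{I}_p(p) &= \frac{1}{2} ((q, \mathcal{G} q)) - ((q, y_I)) - \frac{1}{2} ((q^*, \mathcal{G} q^*)) + ((q^*, y_I)) \\
&= \frac{1}{2} ((p + q^*, \mathcal{G}[p + q^*])) - ((p + q^*, y_I)) + \frac{1}{2} ((q^*, \mathcal{G} q^*)) \\
&= \frac{1}{2} ((p, \mathcal{G} p)) - ((p, y_I - \mathcal{G} q^*)) \\
&= \frac{1}{2} ((p, \mathcal{G} p)),
\end{aligned}$$

and we need to show that

$$\tilde{I}_p(p^{k+1}) \leq \left( \frac{b - a}{b + a} \right)^2 \tilde{I}_p(p^0)$$

for iterations in $p^k$. The TRCG algorithm builds iterates

$$p^{k+1} = p^0 + \sum_{i=0}^{k} \gamma^{*ki} \mathcal{G}^{i+1} p^0 \tag{3.29}$$

by selecting scalars $\gamma^{*ki}$ such that $\tilde{I}_p(p^{k+1})$ is minimized over all sets of possible coefficients $\gamma^{ki}$ for every $k$. Defining the polynomial

$$P^k(\mathcal{G}) = \sum_{i=0}^{k} \gamma^{ki} \mathcal{G}^{i}$$

with appropriate $\gamma^{ki}$, we can restate the iterate (3.29) as

$$p^{k+1} = [I + \mathcal{G} P^k(\mathcal{G})] p^0, \tag{3.30}$$

where $I$ is the identity operator: $I p = p$. Choosing the $\gamma^{*ki}$ which minimize $\tilde{I}_p(p^{k+1})$ can be expressed as

$$\tilde{I}_p(p^{k+1}) = \min_{P^k} \frac{1}{2} \left(\left( [I + \mathcal{G} P^k(\mathcal{G})] p^0,\ \mathcal{G} [I + \mathcal{G} P^k(\mathcal{G})] p^0 \right)\right). \tag{3.31}$$

From Lemma 1, any function $p^0 \in \mathcal{Y}$ can be written as

$$p^0 = \sum_{i=1}^{N} \xi_i \epsilon_i$$

for a set of scalars $\xi_i$. Also, from

$$\mathcal{G} p^0 = \sum_{i=1}^{N} \xi_i \mathcal{G} \epsilon_i = \sum_{i=1}^{N} \xi_i \lambda_i \epsilon_i,$$

together with the orthogonality of $\{\epsilon_i\}$ and $((\epsilon_i, \epsilon_i)) = 1$, we can write

$$\tilde{I}_p(p^0) = \frac{1}{2} ((p^0, \mathcal{G} p^0)) = \frac{1}{2} \sum_{i=1}^{N} \lambda_i \xi_i^2.$$

Applying the same idea to equation (3.31), we have for any polynomial $P^k$

$$\tilde{I}_p(p^{k+1}) \leq \frac{1}{2} \sum_{i=1}^{N} \left( 1 + \lambda_i P^k(\lambda_i) \right)^2 \lambda_i \xi_i^2,$$

and, finally,

$$\tilde{I}_p(p^{k+1}) \leq \max_i \left( 1 + \lambda_i P^k(\lambda_i) \right)^2 \tilde{I}_p(p^0) \qquad \forall P^k, k. \tag{3.32}$$

Now we denote $\lambda_1, \ldots, \lambda_k$ as the eigenvalues which are larger than $b$ and choose the polynomial $P^k$ defined by

$$1 + \lambda P^k(\lambda) = \frac{2}{(a + b) \lambda_1 \cdots \lambda_k} \left( \frac{a + b}{2} - \lambda \right) (\lambda_1 - \lambda) \cdots (\lambda_k - \lambda).$$

Since $(1 + \lambda_j P^k(\lambda_j)) = 0$ for $j = 1, \ldots, k$, the maximization in equation (3.32) need only be done for $\lambda \in [a, b]$. Also, since $\lambda_j > \lambda > 0$ for all $j$, we have $(\lambda_j - \lambda) / \lambda_j < 1$. So we can rewrite equation (3.32) as

$$\tilde{I}_p(p^{k+1}) \leq \max_{a \leq \lambda \leq b} \frac{\left( \lambda - \frac{1}{2}(a + b) \right)^2}{\left( \frac{1}{2}(a + b) \right)^2} \tilde{I}_p(p^0) = \left( \frac{b - a}{b + a} \right)^2 \tilde{I}_p(p^0).$$

Proposition 1 is significant because it shows that the method tends to converge very quickly to a minimum value if the range of eigenvalues of the remainder of the space is narrow. Since operators can be preconditioned, drastic improvements can be made regarding convergence of the method.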
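The skeletal TRCG algorithm can be sketched in Python once the operators are discretized. The sketch below makes several assumptions that are not part of the continuous statement above: an implicit-Euler time discretization with an adjoint-consistent backward sweep, values stored at $t_1, \ldots, t_L$ in arrays of shape $(L, N)$, a positive definite $W_R$ so that the discrete $((\cdot, \cdot))$ is a true inner product, and a residual-based stopping test. The names `make_operators` and `trcg` are hypothetical.

```python
import numpy as np

def make_operators(A, B, WU, WT, WR, dt, L):
    """Build a discrete ((.,.)) inner product and the R and G operators (assumed scheme)."""
    N = A.shape[0]
    Q = B @ np.linalg.solve(WU, B.T)
    M_fwd = np.eye(N) - dt * A        # implicit-Euler matrix for the state
    M_bwd = np.eye(N) - dt * A.T      # implicit-Euler matrix for the adjoint

    def inner(v, w):
        # terminal term plus time-integrated regulator term, cf. equation (3.13)
        return v[-1] @ WT @ w[-1] + dt * np.einsum('li,ij,lj->', v, WR, w)

    def apply_R(q):
        lam = np.zeros((L, N))
        nxt = WT @ q[-1]                              # lambda_H(t_f) = W_T q(t_f)
        for l in range(L - 1, -1, -1):                # backward sweep for lambda_H
            lam[l] = np.linalg.solve(M_bwd, nxt + dt * (WR @ q[l]))
            nxt = lam[l]
        y = np.zeros((L, N))
        prev = np.zeros(N)                            # (Rq)(t_0) = 0
        for l in range(L):                            # forward sweep for Rq
            y[l] = np.linalg.solve(M_fwd, prev - dt * (Q @ lam[l]))
            prev = y[l]
        return y

    def apply_G(q):
        return q - apply_R(q)                         # equation (3.12)

    return inner, apply_R, apply_G

def trcg(apply_G, inner, y_I, tol=1e-16, max_iter=200):
    """Conjugate gradients for G q = y_I in the ((.,.)) inner product."""
    q = np.zeros_like(y_I)
    r = y_I.copy()
    d = r.copy()
    rr = inner(r, r)
    iters = 0
    while rr > tol and iters < max_iter:
        iters += 1
        Gd = apply_G(d)
        alpha = rr / inner(d, Gd)
        q = q + alpha * d
        r = r - alpha * Gd
        rr_new = inner(r, r)
        d = r + (rr_new / rr) * d
        rr = rr_new
    return q, iters
```

In this sketch the right-hand side `y_I` is passed in directly; in practice it would come from a backward and a forward sweep of the inhomogeneous equations. Each iteration costs one application of `apply_G`, that is, one backward and one forward integration.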
3.7 Stopping Criterion

Though the residual $r_k$ can be used as a measure of proximity to the optimal solution (since $r_k \to 0$ as $k \to \infty$), a more physically meaningful criterion can be obtained from estimates of the cost at the $k$th iteration. In carrying out algorithm TRCG, it is assumed that the value $q_k(t)$ is available for all $t \in [t_0, t_f]$ by means of some type of storage (see section 3.10.1). This value of $q_k$ can be used to create an estimate $y_k$ of the optimal state variable (note that in general $y_k \neq y^*$) by the operation $y_k = \mathcal{R}q_k + y_I$, and an estimate $u_k$ of the optimal control variable by $u_k = -W_U^{-1}B^T(\lambda_I + \lambda_{H,k})$. This is possible since $y_I$ and $\lambda_I$ are available by means of equations (3.7, 3.8), and $\lambda_{H,k}$ is obtained as a by-product of $\mathcal{R}q_k$ as in operation (3.11). Now that the values of the above estimates are available, a cost bound gap $\Delta_c$ can easily be obtained by
\[ \Delta_c(q_k) = J[u_k(q_k), y_k(q_k)] - I(q_k). \tag{3.33} \]
Weak and strong duality assure that $\Delta_c(q_k) \ge 0$ for all $k$ and $\Delta_c(q_k) \to 0$ as $k \to \infty$, respectively. Therefore, a natural stopping criterion for algorithm TRCG is
\[ \text{while not } (\Delta_c(q_k) \le \epsilon_c) \text{ do } \ldots, \tag{3.34} \]
where $\epsilon_c$ is a tolerance parameter which represents the allowable error from the true optimal cost.

3.8 Time Discretization - Implicit-Euler

3.8.1 Discretization of Problem Statement

Thus far the above formulation and results have been carried out in the continuous time domain. Before final implementation of the algorithm, a time discretization must be performed on the problem. Several well-known schemes are available to discretize ordinary differential equations. The ones explored here are implicit-Euler, Crank-Nicolson, and second-order Backward Difference Formulas (BDF), but the method also extends to higher-order schemes. In this section we present only the Euler formulation; Crank-Nicolson and BDF schemes are presented (in a more general context) in Appendix A.
It is important to note that care must be taken in applying these schemes to the two-point boundary value problem which makes up the stationary conditions (3.4, 3.5). If such care is taken, the symmetric positive definite property of the discretized equivalent of $\mathcal{G}$ is preserved, and all the results follow. Here we present the case for implicit-Euler, and the remaining schemes are addressed in appendices.

Suppose we divide the time domain $[t_0, t_f]$ into $L$ equal intervals of size $\Delta t = (t_f - t_0)/L$. In this way, we substitute the original time function $y \in C^0\{[t_0, t_f]; \mathbb{R}^N\}$ by $y = \{y^0, y^1, \ldots, y^L\} \in \mathbb{R}^{N \times (L+1)}$, where $y^\ell \in \mathbb{R}^N$ for $\ell = 0, \ldots, L$. We note here that the time superscript $\ell$ should not be confused with the conjugate gradient iterate $k$. Henceforth, when both indices are of interest, we will denote the variable as $y_k^\ell$; otherwise, one of the indices will be implied.

Having defined the state variables as above, we can similarly define the control and adjoint variables as $u \in \mathcal{U}_E = \mathbb{R}^{M \times L}$ and $\lambda \in \mathbb{R}^{N \times L}$, respectively. These are indexed by $u^\ell$ and $\lambda^\ell$ for $\ell = 1, \ldots, L$: note that $\ell = 0$ does not apply for these variables, since they have no impact on the initial state of the system. We also define $y_R \in \mathbb{R}^{N \times L}$ and $y_T \in \mathbb{R}^N$ in a similar fashion. Finally, the matrices $W_T$, $W_R$, and $W_U$ are defined as before with the same assumptions: $W_U$ symmetric positive definite and $W_T$ and $W_R$ symmetric positive semi-definite, at least one of which is strictly definite.

Having defined the variables as such, we may restate the problem in the discretized form:
\[ \min_{u \in \mathcal{U}_E} J_E[y(u)] \quad \text{s.t.} \quad \frac{y^\ell - y^{\ell-1}}{\Delta t} = Ay^\ell + Bu^\ell + F^\ell, \;\; \ell = 1, \ldots, L; \qquad y^0 = y_0, \tag{LQP$_E$} \]
where the discretized cost $J_E[y(u)]$ is
\[ J_E[y(u)] = \tfrac{1}{2}(y^L - y_T)^T W_T (y^L - y_T) + \tfrac{1}{2}\sum_{\ell=1}^{L}\left[(y^\ell - y_R^\ell)^T W_R (y^\ell - y_R^\ell) + u^{\ell T} W_U u^\ell\right]\Delta t. \tag{3.35} \]

Remark 5 It should be noted that the cost functional need not take the exact form as above.
For example, the sum term can be replaced by
\[ \sum_{\ell=1}^{L-1}\left[(y^\ell - y_R^\ell)^T W_R (y^\ell - y_R^\ell) + u^{\ell T} W_U u^\ell\right]\Delta t + \tfrac{1}{2}\left[(y^L - y_R^L)^T W_R (y^L - y_R^L) + u^{L T} W_U u^L\right]\Delta t. \]
This cost has a slightly different interpretation: the cost penalizes the $\ell$th value of the variables in the time interval $[(\ell - 1/2)\Delta t, (\ell + 1/2)\Delta t]$, as opposed to $[(\ell - 1)\Delta t, \ell\Delta t]$ as in (3.35). Though this is negligible in terms of accuracy (implicit Euler being $O(\Delta t)$ accurate), it is important in the stability of the method, as will become apparent below. The point to be made is that different discretization choices will impact the appropriate form of the operators used to solve the problem in the TRCG algorithm. $\square$

From problem statement (LQP$_E$), the discrete equations representing stationary conditions for optimality can be expressed as follows (optimality, $(\cdot)^*$, is implied for equations (3.36)-(3.40)):
\[ \frac{y^\ell - y^{\ell-1}}{\Delta t} = Ay^\ell + Bu^\ell + F^\ell, \quad \ell = 1, \ldots, L; \tag{3.36} \]
\[ y^0 = y_0; \tag{3.37} \]
\[ (I - \Delta t A)^T \lambda^L = W_T(y^L - y_T) + W_R(y^L - y_R^L)\Delta t; \tag{3.38} \]
\[ \frac{\lambda^\ell - \lambda^{\ell+1}}{\Delta t} = A^T\lambda^\ell + W_R(y^\ell - y_R^\ell), \quad \ell = L - 1, \ldots, 1; \tag{3.39} \]
\[ u^\ell = -W_U^{-1}B^T\lambda^\ell, \quad \ell = 1, \ldots, L. \tag{3.40} \]
Although most of the equations above could easily be deduced from standard application of the Euler scheme, the form of equation (3.38) may seem unexpected. The final condition on $\lambda$ is where care must be taken so that the symmetric positive definite quality of the discrete analog of $\mathcal{G}$ is preserved. The correct form of this condition can be arrived at by direct manipulation of the discrete augmented cost functional.

3.8.2 Discretization of Solution Procedure

The following definitions will be useful in developing the time-discrete form of the TRCG algorithm:

Definition 2 Let $\lambda_I \in \mathbb{R}^{N \times L}$ and $y_I \in \mathbb{R}^{N \times (L+1)}$ be defined by:
\[ (I - \Delta t A)^T \lambda_I^L = -W_T y_T - \Delta t\, W_R y_R^L; \]
\[ (I - \Delta t A)^T \lambda_I^\ell = \lambda_I^{\ell+1} - W_R y_R^\ell \Delta t, \quad \ell = L - 1, \ldots, 1; \]
\[ y_I^0 = y_0; \]
\[ (I - \Delta t A)\, y_I^\ell = y_I^{\ell-1} - (Q\lambda_I^\ell - F^\ell)\Delta t, \quad \ell = 1, \ldots, L. \]
The time coupling of the above is such that the operations can be immediately and uniquely carried out for any set of inputs $\{y_0, y_R, y_T\}$.
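The two sweeps of Definition 2 (a backward implicit-Euler pass for the adjoint part, then a forward pass for the state part) can be sketched for a scalar system, $N = 1$. This is a minimal illustration only: it assumes the definition reads $(1 - \Delta t\,a)\lambda_I^L = -w_T y_T - \Delta t\, w_R y_R^L$, $(1 - \Delta t\,a)\lambda_I^\ell = \lambda_I^{\ell+1} - w_R y_R^\ell \Delta t$, $y_I^0 = y_0$, and $(1 - \Delta t\,a) y_I^\ell = y_I^{\ell-1} + (-Q\lambda_I^\ell + F^\ell)\Delta t$ with $Q = B W_U^{-1} B^T$; the function and argument names are hypothetical.

```python
def inhomogeneous_parts(a, q_ctrl, w_t, w_r, y0, y_t, y_r, forcing, dt):
    """Scalar (N = 1) sketch of the backward/forward sweeps of Definition 2.

    y_r and forcing are lists of length L holding values at steps 1..L.
    q_ctrl stands for Q = B W_U^{-1} B^T (an assumption of this sketch).
    Returns (lam, y) with lam[1..L] (index 0 unused) and y[0..L].
    """
    L = len(y_r)
    m = 1.0 - dt * a                      # scalar version of (I - dt*A)

    # Backward sweep: final condition at l = L, then l = L-1, ..., 1.
    lam = [0.0] * (L + 1)
    lam[L] = (-w_t * y_t - dt * w_r * y_r[L - 1]) / m
    for l in range(L - 1, 0, -1):
        lam[l] = (lam[l + 1] - w_r * y_r[l - 1] * dt) / m

    # Forward sweep: initial condition at l = 0, then l = 1, ..., L.
    y = [y0] + [0.0] * L
    for l in range(1, L + 1):
        y[l] = (y[l - 1] + (-q_ctrl * lam[l] + forcing[l - 1]) * dt) / m
    return lam, y


# Usage: stable scalar system y' = -y, ten steps of size 0.1.
L, dt = 10, 0.1
zeros = [0.0] * L
_, y_free = inhomogeneous_parts(-1.0, 1.0, 0.0, 0.0, 1.0, 0.0, zeros, zeros, dt)
_, y_pull = inhomogeneous_parts(-1.0, 1.0, 1.0, 0.0, 1.0, 1.0, zeros, zeros, dt)
print(y_free[L], y_pull[L])
```

With zero weights the adjoint part vanishes and the state part is the plain implicit-Euler solution; a positive terminal weight and target make the adjoint negative, which pushes the state part upward. Each step needs only one solve with $(I - \Delta t A)$ or its transpose, which is why the sweeps can be carried out immediately and uniquely.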
E Definition 3 Given q C I[?NxL ]RNxL, let the operators 'RE : IRNxL _, JNx(L+1) and gE : RNxL _ be defined by: (I- (3.41) AtA)T4\L = (WT + AtWR)qL (I - AtA)T A' ++ (3.42) WRq At, (3.43) 0 (REq)o (I - AtA) (REq)q'e (7ZEq/' 1 - (3.44) QA'IAt, (3.45) (gEq)'=q - (REq)', Again, the time coupling is such that the above operations can be immediately and uniquely carried out for any given q. 0 Definition 4 Given v E lNx(L+1) and w E jINx(L+1), let the inner-product ((-, -))E be defined as: ((vw))E = vLTWrwL + TWRW At- f=1 Defining the norm IIVHIE = ((v, v))2, and given the assumptions of WT and WR, it is a simple matter to show that for all such (v,w): ((V,w))E = ((w,v))E; |avHl = IaIJ|vIHE, V a C R. D 70 IVHE > 0; IIIE = 0 if V 0; The above definitions can be seen as discrete analogies to the components of the TRCG algorithm: the inhomogeneous parts, the operators, and the inner-product space. The first important result is in regard to definitions 3 and 4: Proposition 2 Given the definitions above, the operator GE is Symmetric positive definite in the space defined by ((-, -))E- Proof. From Definition 3 we can multiply expressions (3.42) and (3.44) by -(REq)f respectively, to obtain for £ (I -A'T and for and A4, 1,... , (L - 1) = AtA) (REq)' - =E a(RE -(RgT (3.46) = 1,... ,L Aj T (I whose sum results, for any RE - e AtA) (REq)' 1,... - AHT H TtT - H (3.47) - 1), in , (L +TW(Eq HT T HQA At (3.48) Performing a sum over all f = 1,. . ., (L - 1) of equation (3.48), we have L-1 AL 7 '0 _ 4j(R(q)O T (N~ T (RQL-1 ZEq) St=WREQ + T L-1 A - f=1 TQAf f=1 We now use the facts that (REq)0 = 0 and, from equation (3.47) at L, AHT(REq)L-1 _L T (I - AtA) (q)L LT LAt to rewrite the above equation as L-1 AT (I - AtA) (REq)L L q WR(7Eq)At H e=1 f=1 71 H' At. Finally, since ALH T (I - AtA) = qLT(WT + AtWR) from final condition (3.41) and the definition of (('))E, we have L ((q, RE q))E As a result Vp E L TWR(REqf AtAfT q= LTWT(REq) L + I? 
NxL, (3.49) AAt<0 P 0 0, ((p gEP))E = ((pP))E - ((q, REq))E > 0; that is 9E is symmetric positive definite in ((., -)). 0 The second important result is in regard to the dual of the problem: Proposition 3 The dual of problem (LQPE) can be stated as IE ( = q*) max - ((q, 9Eq))E - ((q, + CE yI))E (LQPqE) where L CE = -yT 2 (YT IL Y Y, TWR(y, 1 2 Ay YJ)At + 1: AT F'At. (3.50) f=1 Proof. Similarly to before, we can combine expressions in Definitions 2 and 3 to obtain S H= QA'At = I -((q, yI))E + A Ty 0 + 5 A'TF'At, (3.51) =1 and L A, QA'At From YLTWTYL + L A j QVAAt y TWy At + ATy L + ± AITFeZAt. ~TQA' At 5T QA A At + S: A47QA'FIAt+2A 7 72 (3.52) we can substitute into the EL' AeQAAt term of the discrete form of the dual as a function of A {1L-1 maximize ff+ IE(A) = _Ae +1_g + AT_A _ T W A_ _ + ATf A A f=1 i 1 L-1 At A T TL ~ t 1 L A T~~ =- 2LTW-AL _ ALTy + ATy + A FJATJ, f=1 subject to A E JNxL (LQPdualE) to arrive at expression (LQPqE). E 3.9 Detailed TRCG Algorithm The algorithm below is implemented in the Implicit-Euler scheme. Extensions to other time- discretizations are trivial. Algorithm TRCG Calculate (y', A-) Set k Set q0 = 0, ro = yi, and do = ro; 0; = by Definition 2; Set AE(q 0 ) > cost tolerance; while (AE(qk) > cost tolerance) do k = k + 1; a= (rk, rk-))E/((dk- ; k qk _ k-1 ±kdk-1; k-1 _ ak Edk-1. rk ._ k dk ((rk, rk) )E _ rk (rk-1 rk-1))E; + fkk-1; Calculate (u(qk), y(qk)) = (-W-lBT(AH + AI), 7ZEqk + yJ) by Definition 3; Calculate AE(qk) JE[u(qk), y(qk)] - IE[qk] by equations (3.35) and (LQPqE); end do. 73 3.10 Numerical Properties of Method 3.10.1 Storage Requirements The storage required by the TRCG algorithm proposed above is O(NL) for the full terminalregulator problem. This number arises from the fact that state variable iterates qk, rk, and dk must be stored for all time steps f = 1,... , L (note that the terminal-only problem requires only the terminal condition, resulting in O(N) storage). 
This requirement can be alleviated at a computational cost. Rather than storing state variable data for all time steps, one might prefer to store the control variables, which are typically of much lower dimension, and only initial and final conditions for the state and adjoint. If this is done, 0(mL + N) storage would be required, which can be significantly less than 0(NL) for m < N and L large. The disadvantage of this approach is that every time a state or adjoint variable is required (such as the conjugate gradient iterates), a full integration from initial (state) or final (adjoint) conditions must be performed. This can be very computationally expensive, and so we look for a compromise between these extremes. Partitioning the time domain into NL segments and storing the state variables only at time steps = 1,..., NL results in O(mL + NNL + NL/NL) required storage. The mL cost comes from storage of control variables u for all time, NNL is required for state variables at each , and NL/NL is required for the storage of data within the current working partition. Thus, partition act as initial or final conditions, from which, along with u which state or adjoint data at every has been stored for all time, state or adjoint variables can be calculated at any f. In the context of the conjugate gradient iterations, the variables are calculated as follows: Inhomogeneous parts: Calculate A) by: (I - AtA)T AL = WTYT - AtWRyL; (I - AtA)T A' = A'-- - WRy'At, = L - 1,...,1; storing only uf= -W-BTA. Define uro := U1, UdO := uJ, ro := yo, d: yo. Calculate: do = yo; (I - AtA) d' = d'-l + (Bu'o + F')At, f = 1,... ,L; 74 (I - AtA)T d = (WT + (I- AtA)T At AtWR)dL; f = L- A'6++WRd'At, 1,...,11; Rdo = 0; (I-AtA)Rd' =7Rd'- + (Bu'dO +F)At, storing only uy7Zd 0 f= 1,... IL; W-lBTAe j-1 1'I1 {r}NL U do' {do}NL = {do}N, and {RdO}NL Homogeneous parts: (for k = 1, 2, 3,...) Calculate: ka=_. 
kk -1)E k -1, ((rkl rkl))E/((dkl E k-1)) gEdk N), fk = ((rk, rk))E ((rk-l rk-1) by: (rk-1)o (rk1)0; (I - AtA) (rk)= (rk) l +- (Bus + F )At, f 1,... L; (dk-1)6o = (dk-1)60; (I - AtA) (dk) (Rdk-l) O - r = r6~ {Rdk},N 1 ef k Ur q- -- __ ± £tk1-C rk- (-dk)-l + (Bu dk)At, f - 1,..., for =1, ... , L; Rdk}NL and: Rd6~) o ak (d6_ d6 = r7+#3kd_, qk + (Butk + F )At, = (Rdk-1)60; (I - AtA) (Rdk)f storing only (dk)t-1 =1..N =1,...,NL dk, Uk1 k-dk- U R UFk = Ufk1 +pakF, Ferk PF rk- 1 +±akF£ k1 Ft =F' d rk + /kFt_. dk Performing the operations above in the TRCG algorithm allows for the storage proposed above. We still need to determine, however, the number of time partitions NL to be used. We chose here simply to optimize the storage O(mL + NNL + NL/NL) without regard to computational cost. In that case, NL = V/ and total storage becomes O(mL + Nv L/). 75 3.10.2 Conditioning of g Operator The conditioning of the ! operator is crucial to the efficiency of the TRCG algorithm, and so it is of interest to see how it depends on the problem data. For simplicity, we consider here the timeinvariant terminal problems, with obvious extensions to time-varying regulator problems. Since the operator is defined as 9q = q - R, we first consider the action of R. As stated previously, Rq is defined as: given any q E JRN, AT AH, -AH AH(tf)= WTq, (7Zq) = A(Rq) - QAH, (7Zq) (to) = 0. One may then define a transition matrix function 4(t, to) by <1>(t, to) = A4(t, to) )(to, to) = IN, resulting in 1(t, to) = eA(-to) for the time-invariant case. Then (Rq)(tf) eA(tf -T)QAH (T) dT. - Wto We also note that, for symmetric A, AH(T) - eA(tf-) WTq, and, as a result, (_q) (tf _ A(t--) A(t-T W d, to and, finally, we find that we can express 9 in matrix form: S= IN + eA(tf-r) QeA(tf--)WT dT. (3.53) We therefore see that the conditioning of the operator 9 is directly tied to that of matrices A and Q. In particular, we make a few observations which will be useful in future sections. 
As the smallest eigenvalue $\lambda_{\mathcal{R},0}$ of $\mathcal{R}$ approaches zero, $\lambda_{\mathcal{G},0}$ will approach unity. Thus, if the largest eigenvalue $\lambda_{\mathcal{R},N}$ is bounded from above, then so is the condition number $\kappa(\mathcal{G}) = \lambda_{\mathcal{G},N}/\lambda_{\mathcal{G},0}$. Alternatively, if $\lambda_{\mathcal{R},N}$ is unbounded, then so is $\kappa(\mathcal{G})$.

In the sections that follow we present the Finite Element formulation of the problem. It is a well-known result that as the discretization diameter $h$ approaches zero, the stiffness matrix $A$ becomes ill-conditioned, since $\kappa(A) = O(h^{-2})$ for linear elements. The question then arises as to how this will affect the conditioning of $\mathcal{G}$. We mention here that $\kappa(\mathcal{G}) \to C(W_U, W_T, W_R)$ as $h \to 0$, where $C$ is a constant that depends on $(W_U, W_T, W_R)$ but not on $h$. The reason for this is that as $h \to 0$, the basis of the FEM formulation cannot remain linearly independent, thus the smallest eigenvalues of $A$ approach zero. Since the largest eigenvalues are still bounded, the relations and arguments above indicate that the conditioning of $\mathcal{G}$ will remain bounded. This is an important property of the method, because it shows that it is suitable for use in FEM formulations of problems governed by partial differential equations. Section 3.11.7 below presents a simple example that exhibits this property: as long as the FEM systems can be solved for each time step, the TRCG algorithm does not deteriorate as $h \to 0$.

3.11 Formulation for Partial Differential Equations

So far we have formulated the TRCG algorithm for optimal control problems of systems governed by ordinary differential equations. Such systems are often encountered in control problems in which lumped parameter models are used. Mechanical, fluidic, thermal, and electrical problems (among others) may readily and efficiently be approximated by such ODEs. However, this work is concerned with the application of optimal control to systems that must be more carefully described by the governing partial differential equations.
Heat and fluid dynamical systems are often too complex to be modeled by lumped parameters. As a result, an appropriate statement of the problem must be made in the context of the fundamental governing partial differential equations. Our approach is to use the Finite Element Method (FEM) for discretizing the spatial variable that represents the state of the system. We show that the impact of the new statement of the problem is minimal in the TRCG algorithm, allowing us to use it virtually unchanged for the solution of (appropriately stated) optimal control problems governed by partial differential equations. 77 3.11.1 Problem Statement We are now interested in addressing problems of the form: given spatial variable x E Q solve for temperature E Y such that Q(to) = o in Q, (3.54) in Q X (to, tf), (3.55) on FN X (to, tf), (3.56) on IFD X (to, tf), (3.57) M = V - (a()VQ) + um(X) M=1 V - ii= 0 S=0 where we note that we have arbitrarily imposed a zero temperature at the Dirichlet boundaries. We use this case in the presentation of the following section for convenience. However, we note that the problem to be address will have inhomogenous boundary temperatures YD, which are addressed in a standard way for Finite Element Methods. Each element un of the control vector u C U is applied to a sub-domain Q, 3.11.2 of Q. Weak Formulation and Galerkin Approximation We treat the above set of equations in the standard Finite Element context. As such, we begin with a variational statement of the problem. 
First, recall the definition of the following spaces spaces LP(Q) = v : Q -+ R | j IvIP dQ < +oo with associated norm i/p dQ) Also define the Sobolev space Hk (k > 0) and Hok as Hk(Q) - v E L 2 (Q) IDv C L 2 (Q), VlI < k}, Hok ={v c Hk(Q)IVD =)011 and the function space X = {v E H 1 (Q) IVFD = YD}78 Now we may state the weak formulation of the problem: find : Q x (to, tf) -+ IR, Q(t, ) E X such that QVto) = fo, aM (3.58) t((t), v)= a(jj(t), v) + 1 b(um(t), v) Vv C HOJ(Q), (3.59) M=1 where a(W(t), v)= a(w(t), v) = f - j b(um(t), v) x)v(x) dQ, (t, Vw(t,x) - Vv(x) dQ, Um(t, x)v(x) dQ. The statement of problem in weak form (3.58,3.59) is still continuous in space and time. To begin the discretization of the problem, we use the Galerkin approximation: find yA : Q x (to, tf) y(t, ) E Xh C -- 1, X such that Yh(tO) = Yo,h, (3.60) a t(Yh SM b (um,(t), M), Vh) = a(yh t), vh) + Vh) Vv E HO(Q), (3.61) M=1 where Yo,h E Hh(Q) is chosen appropriately to approximate the initial conditions (for example, it may be a solution of a boundary-value problem with appropriate initial boundary conditions, or a projection of arbitrary initial conditions yo). 3.11.3 Finite Element Approximation Now we may introduce the triangulation Th of Q, which represents a set of triangles such that Q = UTTjhTh TnTh = 0 and if Th 79 0 Th (Q and 'h represent the closure of Q and Th, respectively). Now we use the following space of p elements in the approximation of the problem Xh {v E X IVITh E Pp(Th), V Th E }. Here we make use of p = 1 (linear) elements, though the extension to higher orders is trivial. Now we are ready to discretize the problem by setting N yj(t) yh(t, X) = j(), j=1 where {yj Ij = 1, . . . , N} denotes the basis of Xh. To simplify notation, we redefine y as a vector in jN whose elements are the basis of Xh. 
3.11.4 The Governing Equations Having thus defined the finite element spaces, we can carry out the operators of the weak formulation to obtain y(to) = (3.62) yo, M'dy =Ay + Bu + F, dt (3.63) where, for every t E [to, Itf, Mij :=(#i,I0), Aij := a(#j, #), Bim := b(um, 0i), i, j =1, ... ,N, 7- = , ... , M, are definitions of the mass, stiffness, and control matrices, respectively, and F(t) incorporates the inhomogeneous boundary conditions. We note that equations (3.62,3.63) are of similar form to the (3.2,3.3) for which the TRCG method was presented above. The difference that the mass matrix M is present in the above equations will be shown not to affect the algorithm in a fundamental way, due to the SPD nature 80 of this matrix. 3.11.5 The Cost Functional An appropriate cost functional must be defined for partial differential equations. From our treatment above, it is apparent that a quadratic form may be advantageous: J(9, 9, 4RU) f + ~ ) nWR) fr W()( (tf) - 9T)2 dQ =1 (3.64) uTWuudt 1 f )2 -- where WT(x) and XR(x) are weight functions for the terminal and regulator deviations over the domain Q. For example, if WT(XI) < WT(x2), then deviations from desired terminal behavior in the sub-region Q1 D x1 will be more heavily penalize than deviations from desired terminal behavior in Q2 3 X2. Now using an interpolant I we define the following approximations to the desired terminal and regulator behavior: N 1 yTh() T() -- yT j j(), j=1 N yRh (t, X) = R(t, X) yRj t)J(), j=1 where we also define the vectors YT and yR(t) as containing the elements YTj and YTj, respectively. We may apply the Galerkin approximation to the cost functional: J(yh,yThiyRhu) = J(Y, YTYTU) + 1 -(y(tf) - yT 2t [(Yy - YR) WR(y 1 T(y(tf) - (3.65) yR) + uTWUU] dt, - where WTij = (#i,wTq j), Wai = 81 ((i,wRqj), YT) ij = 1,..., N. The problem can finally be stated in the familiar form for ordinary differential equations: min J[y(u)] uEU s.t. (LQPfem) MQ = Ay + Bu + F, y (to) = yo. 
3.11.6 Optimality Conditions By introducing Lagrange multipliers A E {(to, tf) x RN}, we can derive the optimality conditions for the FEM formulation of the problem: My* = Ay* + Bu* + F, y*(0)= yo, -Mi* = AT A* + WR(y* - YR), A* (tf) = (MlWT)(y*(tf) - yT), 0 = Wuu* + BT A*, which, except for the appearance of the mass matrix M in the above equations, is very similar to what has been presented before. Therefore, the question is how this matrix may affect the proposed TRCG algorithm. Fortunately, the algorithm is unaffected by this matrix. Since M is symmetric and nonsingular (it is SPD), we quickly obtain the appropriate R operator (with homogeneous and inhomogeneous parts defined as before): A TM(Rq) = ATA(Rq) - ATQAH T M(Rq) = -A( d(AT M(-Rq))/dt = -ATQ AH q) - qTW((Rq) - qTWR (Rq), which, by Green's formula, yields ((q, Rq)) := - 7 AH4QAH dt Sto bto = q(tf)TWr(Rq)(tf) + qT wR(Rq) dt. (3.66) Now the definition of R is the backward and forward integrations with M included in the above 82 equations. If this is the case, then the operation above uses the same inner-product as had been defined before. As such, we see that the algorithm has not changed fundamentally, as long as the spatial discretization of the state and adjoint variables is consistent with the new FEM formulation and that R and y' are calculated accordingly. Defining R as above, we still have gp = p - Rp and the same inner-product as equation (3.13). Consequently, all the proofs hold and the time-discrete case is a trivial extension of the method presented in section 3.8. With the fully discretized statement, the TRCG algorithm can be applied to problems governed by partial differential equations. 3.11.7 Effect on the Conditioning of g - A One-Dimensional Example Since we are taking the FEM approach to problems governed by partial differential equations, it is important to understand how this formulation will affect the TRCG algorithm. 
In particular, as pointed out in section 3.10.2, g should remain well-conditioned for discretized problems as h -+ 0. In this section we test the heuristic arguments of section 3.10.2. Here we show an example in which this property is observed to hold. Take the terminal problem governed by the partial differential equation: at =a 0X2 +b(u, x), where (x, t) is defined on a one-dimensional domain of length 1 for all time (to, tf x E Q =R. Apply Dirichlet boundary conditions (X = 0) M 0, = 1) = 0, arbitrary initial conditions Q(X, to) = yo, and define b(u, x) as a mapping of a single controller u onto the entire domain Q. This is not a realistic example since control problems rarely allow for point control at every x of the domain, but by exciting every mode we have a simple form of the problem for analysis of g which bounds the more realistic case. Having defined the problem as such we choose a simple triangulation: divide the spatial domain 83 into N equal segments of length h. Assume we would like to drive the system to zero final state (YT = 0) and that WU = 1 so that y E ]ffN Q = BBT. Then the spatially discrete variable is defined as X (to, tf) and the governing equations become: My = Ay - QA, y(to) = yo; (3.67) Mi = -ATA, A(tf) = WTy(tf), (3.68) where 4 1 0 0 1 h 0 6 0 A = h 4 1 . 0 1 . 1 0 - 1 4 1 1 4 - 0 0 2 -1 0 0 -1 2 -1 - 0 -1 0 0 (3.69) 0 -1 0 -1 2 -1 0 -1 2 (3.70) and 1/2 1 (3.71) B = h 1 .1/2_ are matrices of size N x N, N x N, and N x 1, respectively. Now define the following: zH = MyH and (H = MAH, from which we get ZH = AM 1 'zH - QM 1(H , ZH (to) = 0; (H(tf) = MWTq, (3.72) (3.73) from the homogeneous equations. 
Following the steps of section 3.10.2 (and discretizing in time) 84 Conditioning of R operator for W =102103104 10 10 W 10 , 10 10 10 10 0 20 10 30 40 50 N 60 Conditioning of G operator for WT 70 2 10 80 3 10 90 100 4 10 10 W11' o3 1 102 W =10~ (9102 =102 101WT 10 0 10 20 30 40 50 N 60 70 80 90 100 Figure 3-2: Condition number of R and 9 for sample one-dimensional problem. L 9q = I - (- MeAM' I 1(tf T)QeAM'1(tf-T)MWTq.(74 T-e--MW q. (3.74) Therefore, 9 can be calculated explicitly as a full matrix for this simple problem. Assuming the problem size is not too large for the desirable range of h, one can calculate the condition number by any available method. Here, we have used MATLAB for this simple calculation and have explicitly plotted the condition number of R and g in Figure 3-2 for different values of WT. Our goal is to observe the behavior of the condition number of g as h -+ 0; that is, as N -+ oc. As expected, we see that as N -+ oc, R quickly becomes ill-conditioned (exceeding machine precision for N > 10) due to the inevitable ill-conditioning of A caused by eigenvalues which approach zero. However, the condition number of 9 approaches constants that are only dependent on WT. As a result, conjugate gradient iterations that operate on 9 will not be degraded as h -+ 0, making the TRCG algorithm suitable for PDE problems in the context of the FEM formulation. The case that we have considered above assumes that all points on the state domain were directly controlled by u. This is obviously not the case in general. For the realistic cases in which only a subset of Q is under the influence of a control variable, the eigenvalues of g would be bounded by the eigenvalues of the above matrix from below and above. Thus such a realistic problem can only be better conditioned than the problem presented above. As a result, these problems are also well-conditioned as h -+ 0. 
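The qualitative behavior reported for the one-dimensional example (the $\mathcal{R}$ operator becoming ill-conditioned while $\kappa(\mathcal{G})$ levels off) can be reproduced with a simplified modal calculation. The sketch below is not the thesis computation: it assumes an identity mass matrix (finite-difference rather than consistent FEM), a symmetric $A = -K$ with $K$ the standard second-difference matrix, $Q = I$ (every mode directly controlled), and $W_T = w_T I$. Under those assumptions the matrix form (3.53) diagonalizes, and each mode with decay rate $\mu_j$ contributes an eigenvalue $1 + w_T (1 - e^{-2\mu_j T})/(2\mu_j)$ to $\mathcal{G}$, where $T = t_f - t_0$. All names and parameter values are hypothetical.

```python
import math


def modal_condition_numbers(n, w_t=100.0, alpha=1.0, horizon=1.0):
    """Condition numbers of the R-part and of G = I + (R-part), mode by mode.

    Assumptions of this sketch: unit-length domain with n interior nodes,
    h = 1/(n+1), decay rates mu_j = (4*alpha/h^2) * sin^2(j*pi*h/2),
    identity mass matrix, Q = I, W_T = w_t * I.
    """
    h = 1.0 / (n + 1)
    r_vals = []
    for j in range(1, n + 1):
        mu = 4.0 * alpha / h ** 2 * math.sin(j * math.pi * h / 2.0) ** 2
        # w_t * integral_0^T exp(-2*mu*s) ds
        r_vals.append(w_t * (1.0 - math.exp(-2.0 * mu * horizon)) / (2.0 * mu))
    g_vals = [1.0 + r for r in r_vals]
    kappa_r = max(r_vals) / min(r_vals)
    kappa_g = max(g_vals) / min(g_vals)
    return kappa_r, kappa_g


for n in (10, 100, 1000):
    kappa_r, kappa_g = modal_condition_numbers(n)
    print(n, kappa_r, kappa_g)
```

In this simplified setting the condition number of the $\mathcal{R}$-part grows roughly like $h^{-2}$, while that of $\mathcal{G}$ stays below $1 + w_T T$ because the identity keeps the smallest eigenvalue at or above one; the limiting value depends on $w_T$ but not on $h$.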
Figure 3-3: Diagram of sample problem domain (7 cm x 3 cm).

3.12 Example Problem: Linear, Two-Dimensional Heat Transfer

We now return to the original problem posed in the beginning of this chapter. We are interested in controlling the temperature on the "reaction surface" of Figure 3-1. Suppose further that the control mechanism is such that we are allowed to input heat through the material on certain parts of the domain. In particular, we propose to use three heaters on the bottom part of the domain.

Figure 3-3 is an illustration of the spatial domain $\Omega \subset \mathbb{R}^2$ of the problem to be addressed. Five sub-domains $\Omega_1$ through $\Omega_5$ (all $\subset \Omega$) have been used with diffusivity values $\alpha_1$ through $\alpha_5$. Controller $u_1$ represents volumetric heat input into $\Omega_2$ and $\Omega_4$, while controller $u_2$ is the volumetric heat input into $\Omega_3$. Together, they form the vector $u \in \mathcal{U}$, where $M = 2$. From here, heat propagates to the rest of the domain. Surfaces denoted by $\Gamma_N$ (Neumann boundaries) are thermally insulated, while those denoted by $\Gamma_D$ (Dirichlet boundaries) are held at a fixed temperature of 300 K throughout the process. The surface denoted by $\Gamma_{RS}$ is the reaction surface, whose temperature we wish to control. We note here that this choice is arbitrary: any part of the state-domain may be chosen for desired performance.

3.12.1 Problem Data

The problem to be solved is composed of the following data ($[\alpha] = \mathrm{m^2/s}$):

  $\alpha_1$     $\alpha_2$     $\alpha_3$     $\alpha_4$     $\alpha_5$
  $10^{-4}$      $10^{-3}$      $10^{-3}$      $10^{-3}$      $10^{-2}$

and

  $y_0$     $t_0$   $t_f$   $y_{T,RS}$   $w_{T,RS}$            $w_{R,RS}$            $w_U$
  300 K     0       5 s     500 K        $5 \times 10^6$ /K    $1 \times 10^5$ /K    1 s/K

where we note that $W_U = w_U I_M$, with $I_M$ the identity matrix of size $M$, and
\[ \hat{y}_T(x) = \begin{cases} y_{T,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise,} \end{cases} \qquad w_T(x) = \begin{cases} w_{T,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise,} \end{cases} \qquad w_R(x) = \begin{cases} w_{R,RS} & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise.} \end{cases} \]
The above definitions guarantee that we penalize only the deviations from desired temperature at the reaction surface $\Gamma_{RS}$, with no penalty on any other part of the domain.
This type of penalization is completely arbitrary and demonstrates the power of optimal control methods: we are able to arbitrarily specify the desired behavior of the system without regard to controllability issues.

Though we have specified $\hat{y}_T$ above, we have not yet specified $\hat{y}_R(t)$. It is taken here as a function of time that can be represented in the following manner:
\[ \hat{y}_R(t, x) = \begin{cases} y_{R,RS}(t) & \text{if } x \in \Gamma_{RS}, \\ 0 & \text{otherwise,} \end{cases} \]
where $y_{R,RS}(t)$ is a scalar function of time of the form
\[ y_{R,RS}(t) = \begin{cases} (300 + 400\, t/t_f)\ \mathrm{K} & \text{if } t < t_f/2, \\ 500\ \mathrm{K} & \text{if } t \ge t_f/2, \end{cases} \]
as shown in Figure 3-4. In words, we would like the temperature at the reaction surface $\Gamma_{RS}$ to rise in an approximately linear fashion from 300 K at $t = 0$ to 500 K at $t = t_f/2$, then hold a steady value of 500 K until the end of the process at $t = t_f$.

Figure 3-4: Time history of desired regulator behavior $y_{R,RS}$ (desired temperature at reaction surface).

The problem discretization mesh that has been used is shown in Figure 3-5 with 200 equal-length time-steps $\Delta t$. The resulting size of the problem is: $N = 3960$, $L = 200$.

3.12.2 General Results

Figure 3-6 shows the solution to the problem posed above by way of the TRCG algorithm. The time histories show that, for the parameters chosen for this problem, we are able to obtain good agreement between our desired and resulting state histories. An interesting qualitative observation that can be made of Figure 3-6 is that the controllers gradually shut off at the end of the process. This feature is consistent with physical intuition since the system inertia maintains a desired temperature at the reaction surface related to a system time constant. However, we also note that to overcome the system's initial inertia, very large values of control variables must be applied. This is typically seen in optimal control solutions since a quadratic penalty cannot impose a hard limit on the value of controllers.
Suppose our controllers can only operate below values of 1500 K/s (shown in the figure by a horizontal line). Then the controllers would easily saturate at the beginning of the process. The resulting behavior would, of course, be different than predicted and certainly not optimal.

Figure 3-5: Mesh used for problem discretization, $N = 3960$ (7 cm x 3 cm).

Figure 3-6: Optimal control and $\Gamma_{RS}$ temperature histories, $J^* = 2.37 \times 10^7$. Panels: optimal control history $u(t)$ (legend: $u_1$ (3,5), $u_2$ (4)); average temperature at reaction surface $\Gamma_{RS}$ compared with $y_R$; both plotted against $t$ (s).

Figure 3-7: Residual value of error $\|u_k - u^*\|$ for TRCG iterations (CG residual convergence against iteration number).

In addition, we might consider this problem with simple Joule heaters, which provide heat as a function of $i^2 R$ for some current $i$ and material resistance $R$. In this case, it would be impossible to extract heat from the system by use of the controllers $u$. But a positivity constraint cannot be enforced by the quadratic cost functional alone.

Both of the above issues point to the fact that realistic engineering problems will require additional constraints to be imposed on control (and possibly state) variables. In particular, "hard constraints" of the form $u_{\min} \le u(t) \le u_{\max}$ often need to be added to the statement of the problem since, as shown by the example here, they can easily be violated by optimization of the quadratic functional alone. The next chapter addresses the issue of how to incorporate the TRCG algorithm into a method for solving these more general problems.

3.12.3 Computational Performance

Figure 3-7 demonstrates the expected fast convergence of the TRCG algorithm for this particular problem.
The total computational cost required is reasonable: roughly 2 x 30 times the cost of a single initial-value problem solution for this example.

Figure 3-8: Structure of the stiffness matrix $A$ (sparsity pattern, nz = 27112).

Therefore, problems that are well-conditioned and that can be computationally solved as an initial-value problem (these are the problems that are of interest in the context of simulation) can be very appropriately addressed by the TRCG algorithm proposed in this chapter. We note that the strength of the method lies in two properties: (a) its use of integration of the system by the action of $\mathcal{G}$ without resorting to numerically inverting the problem, and (b) its ability to preserve stability in the system, thus leading to a well-conditioned $\mathcal{G}$.

(a) Calculation of $\mathcal{G}$ Operator

Figure 3-8 shows the structure of the stiffness matrix involved in solving the problem posed in this section. By requiring only a stable time integration of the system, the action of $\mathcal{G}$ can be seen as solving systems characterized by such matrices (actually $M - \Delta t A$, which has the same structure). This is evident from the time-discrete statement of the problem of section 3.8. Assuming the system is well-conditioned, this process should be stable and relatively fast. It depends mainly on multiplication operations on the nonzero elements of the above matrix. For systems governed by PDEs, the sparsity of such matrices makes such multiplications much faster than inversion: on the order of $N$. Therefore we can conclude that the TRCG algorithm is especially well suited to optimal control problems of systems governed by partial differential equations, requiring for its action $O(NL)$ operations.

(b) Conditioning of $\mathcal{G}$ Operator

For well-conditioned problems we have stated above that the action of $\mathcal{G}$ should require a very manageable computational cost. However, these are calculations per iteration.
We therefore mention here the second advantage of the method: it is stable and well-conditioned, so that the number of iterations required for convergence is also small.

We have observed that the conditioning of the $\mathcal{G}$ operator depends directly on the conditioning of $A$. For FEM discretizations of the Laplace operator, it is known that for a triangulation $T_h$, $\chi_A \sim O(h^{-2})$, where $\chi_A$ is the condition number of the stiffness matrix $A$. In section 3.10.2, we observed that the eigenvalues of $R$ are bounded by the eigenvalues of $A$. This means that $R$ tends to quickly become ill-conditioned as $h \to 0$. However, we also observed the presence of the identity in calculating $\mathcal{G}$. This guarantees that the smallest eigenvalues will be bounded away from zero, and so the condition number of $\mathcal{G}$ is bounded and approaches a constant as $h \to 0$.

Chapter 4

Interior Point Methods - Linear, Constrained Problems

4.1 Motivation

Realistic engineering control problems rarely allow controllers to take any arbitrary value without constraints. Take, for example, the case where electrical resistance heaters are used as controllers. Then, for current $i$ and resistance $R$, heater input $u = i^2 R$ will not allow for negative values of $u$ into the system. In the context of the problem presented in chapter 3, the cheapest, easiest controllers are of this type. Therefore it is of interest to impose certain types of constraints on the problem of the form $u_j(t) \in c_j[u_j(t), t]$.

Though $c$ can take different forms, typical engineering applications will be characterized by two limits: a lowest allowable value (such as the positivity requirement mentioned above) and a highest allowable value (imposed by power saturation, safety issues, etc). In short, we are interested in cases where $c = [u_{\min}, u_{\max}]$; that is, for $u(t) \in \mathbb{R}^M$,

$$u_{\min} \le u(t) \le u_{\max}. \tag{4.1}$$

Though all of the results in this section immediately apply to cases in which $u_{\min}$ and $u_{\max}$ are functions of time, we have assumed for simplicity of presentation that these values are time-invariant. Furthermore, the results extend to more complicated constraint conditions (for example, more general polyhedra) with minimal modifications. Though more involved in form, these are rarer in engineering applications, so the presentation has been done with the simple interval case above for the sake of clarity. Finally, it must be observed that though the treatment here is concerned only with control constraints, the method can readily be extended to problems where constraints are imposed on state variables.

4.2 Problem Statement

The mathematical statement of the problem is similar to before, with the addition that the control variable observes constraint (4.1):

$$\begin{aligned}
\min_{u \in U}\ & J[y(u)] \\
\text{s.t.}\ & \dot{y} = Ay + Bu + F, \qquad y(t_0) = y_0, \\
& c_1 = u(t) - u_{\min} \ge 0, \\
& c_2 = u_{\max} - u(t) \ge 0,
\end{aligned} \tag{LQPu}$$

where $J[y(u)]$ takes the same form as (3.1). We define the variables as before: $y \in Y = C^0\{(t_0, t_f); \mathbb{R}^N\}$, $u \in U = \{(t_0, t_f) \times \mathbb{R}^M\}$. The feasible region $\mathcal{F} = \{(y, u) \mid \dot{y} = Ay + Bu + F,\ u_{\min} \le u(t),\ u(t) \le u_{\max}\}$ is assumed to be non-empty, which is typically the case when $u_{\min} < u_{\max}$.

Before proceeding, it will be convenient to define the following notation: given a vector $c \in \mathbb{R}^m$,

$$\mathrm{diag}(c) = \begin{bmatrix} c_{(1)} & 0 & \cdots & 0 \\ 0 & c_{(2)} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_{(m)} \end{bmatrix}. \tag{4.2}$$

We also define $C_1(u(t)) = \mathrm{diag}(u(t) - u_{\min})$ and $C_2(u(t)) = \mathrm{diag}(u_{\max} - u(t))$, both functions of time.

4.3 Optimality Conditions for the Constrained LQP Problem

Following standard treatment [3, 8, 27] of the problem we introduce Lagrange multipliers $\lambda \in C^0(0, t_f; \mathbb{R}^N)$ and $\nu_1, \nu_2 \in \{[0, t_f] \times \mathbb{R}^M\}$, so that an augmented cost functional may be defined as

$$J_a(u, y, \lambda, \nu_1, \nu_2) = J[u] + \int_0^{t_f} \left[ \lambda^T (\dot{y} - Ay - Bu - F) + (\nu_1^T c_1 + \nu_2^T c_2) \right] dt. \tag{4.3}$$

The problem then becomes finding stationarity conditions for (4.3). The derivation of these conditions can be found in standard texts [8] (optimal variables are denoted $u^*$, $y^*$, $\lambda^*$, $\nu_1^*$, and $\nu_2^*$):

$$\dot{y}^* = Ay^* + Bu^* + F, \qquad y^*(0) = y_0, \tag{4.4}$$

$$-\dot{\lambda}^* = A^T \lambda^* + W_R (y^* - y_R), \qquad \lambda^*(t_f) = W_T (y^*(t_f) - y_T), \tag{4.5}$$

$$0 = W_u u^* + B^T \lambda^* + \nabla_u c_1^* \nu_1^* + \nabla_u c_2^* \nu_2^*, \tag{4.6}$$

where

$$\nu_{q(j)}^* \le 0 \ \text{ if } c_{q(j)}^* = 0; \qquad \nu_{q(j)}^* = 0 \ \text{ if } c_{q(j)}^* > 0, \qquad q = 1, 2, \tag{4.7}$$

for $j = 1, \ldots, M$.

Here we observe that in addition to the difficulties previously encountered with time coupling, we are now required to solve for $\nu_1^*$ and $\nu_2^*$. In addition, complementary slackness ($\nu_{q(j)}^* \le 0$ for $c_{q(j)}^* = 0$) is information which is not known a priori.

These problems can be overcome by the application of barrier function methods. These are based on imposing additional penalties on the cost functional such that inequality constraints are transferred to the minimization statement of the problem. A Newton method can then be applied to linearize the resulting stationarity conditions. Nonlinear programming methods which approach the solution in such a manner from the interior of the feasible region $\mathcal{F}$ are known as Interior Point Methods (IPM). The following sections describe how we extend these methods to the optimal control problem.

4.4 Interior Point Methods (IPM) for Optimal Control

4.4.1 Logarithmic Barrier Functions

Barrier function methods have enjoyed significant success in the past decades in the interior-point solution of linear and quadratic programming problems. Those based on logarithmic functions have been especially successful and are particularly interesting for optimal control problems. We can state the original problem (LQPu) in the context of barrier function methods in the form

$$\text{find} \quad u_k^* = \arg\min J^{\mu_k}[u] \tag{4.8}$$

$$\text{subject to} \quad \dot{y} = f(y, u), \qquad y(0) = y_0, \tag{4.9}$$

where the cost functional $J^{\mu_k}[u]$ is the original cost $J[u]$ augmented by a penalty (barrier function) on the deviation from inequality constraints. This modification permits the removal of the explicit statement of the inequality constraints in (4.9).
We must, of course, ensure that $u_k^* \to u^*$ given appropriate choices of $J^{\mu_k}$ and a sequence $\{\mu_k\}$.

Although modifications and improvements exist, the form of the barrier function used in this paper is

$$J^{\mu}[u] = J[u] - \mu \sum_{q=1}^{2} \sum_{j=1}^{M} \int_0^{t_f} \ln(c_{q(j)})\, dt, \tag{4.10}$$

where $c_{q(j)}$ is the $j$th component of vector $c_q$ for $q = 1, 2$ (note that $c_{q(j)} \ge 0$ is the imposed constraint for all $(q, j)$). Since $J^{\mu}$ is seen to be strictly convex, and since the feasible region $\mathcal{F}$ is compact, for any $\mu > 0$ there exists a unique minimizer $u^*(\mu)$ of $J^{\mu}[u]$ such that $\dot{y}^*(\mu) = Ay^*(\mu) + Bu^*(\mu) + F$, and $c^*(\mu) > 0$.

The logarithmic barrier function method is based on the fact that given a positive decreasing sequence $\{\mu_k\}$, $u_k^* \to u^*$ as $k \to \infty$. Defining $U^* = \{w \in U \mid w = u^*\}$ as the set of minimizers of problem (LQPu), it is possible to show that $(\lim_{k \to \infty} u_k^*) \in U^*$ for $0 < \mu_{k+1} < \mu_k$. Then

$$\lim_{k \to \infty} u_k^* = u^*$$

follows immediately from this result since the convexity and continuity of $J$ and the compactness of $\mathcal{F}$ require that the set $U^*$ be composed of a unique $u^*$. The proofs of these statements are given in the following section.

4.4.2 Proofs of convergence

Proposition 4 Given a compact "level set" $S = \{(z, w) \in \mathcal{F} \mid J[w] \le J[w_0]\}$, a positive decreasing sequence $\{\mu_k\}$, corresponding minimizing solutions $w_k^* = w^*(\mu_k)$ of problem (4.8,4.9), and a starting point $w_0$ (not necessarily a minimizer) corresponding to $k = 0$:

1. $J[w_{k+1}^*] \le J[w_k^*]$ for $k = 0, 1, 2, \ldots$;
2. $w_k^* \in S$ for $k = 1, 2, 3, \ldots$;
3. there exists a subsequence $\{u_k^*\}$ of $\{w_k^*\}$ such that $\hat{u} = \lim_{k \to \infty} u_k^* \in S$;
4. $\hat{u}$ is feasible.

Proof. The following proofs are immediate extensions of those found in [29].

1. Since $w_k^*$ and $w_{k+1}^*$ are minimizers corresponding to $\mu_k$ and $\mu_{k+1}$, respectively,

$$J[w_k^*] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(w_k^*))\, dt \le J[w_{k+1}^*] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(w_{k+1}^*))\, dt,$$

and

$$J[w_{k+1}^*] - \mu_{k+1} \sum_q \sum_j \int_0^{t_f} \ln(c(w_{k+1}^*))\, dt \le J[w_k^*] - \mu_{k+1} \sum_q \sum_j \int_0^{t_f} \ln(c(w_k^*))\, dt.$$

Multiplying the first inequality by $\mu_{k+1}/\mu_k$ and adding it to the second, the logarithmic terms cancel and we obtain

$$\left(1 - \frac{\mu_{k+1}}{\mu_k}\right) J[w_{k+1}^*] \le \left(1 - \frac{\mu_{k+1}}{\mu_k}\right) J[w_k^*],$$

but since we have required that $\{\mu_k\}$ be positive and decreasing ($0 < \mu_{k+1} < \mu_k$) the above inequality proves that $J[w_{k+1}^*] \le J[w_k^*]$.

2. From the above, we see by induction that

$$J[w_{k+1}^*] \le J[w_k^*] \le \cdots \le J[w_0].$$

It is therefore evident that $w_k^* \in S$ for $k = 1, 2, 3, \ldots$.

3. Since every sequence in a compact metric space has a subsequence that converges to a point of that space, and since $S$ is compact, $\hat{u} \in S$ for some subsequence $\{u_k^*\}$ of $\{w_k^*\}$.

4. $\hat{u}$ is feasible since, from item 3, $\hat{u} \in S \subset \mathcal{F}$.

Proposition 5 Let $U^* = \{w \in U \mid w = u^*\}$ be the set of minimizers of problem (LQPu). Then

$$\lim_{k \to \infty} u_k^* \in U^*.$$

Proof. Again, we base the following on [29]. Given $\hat{u} = \lim_{k \to \infty} u_k^*$, we prove that $\hat{u} \in U^*$ by contradiction: suppose $\hat{u} \notin U^*$; then it is required that

$$J[\hat{u}] > J[u^*]. \tag{4.11}$$

From $J$ continuous and $J[u_k^*] \ge J[u_{k+1}^*]$, we have $J[u_k^*] \ge J[\hat{u}]$ for $k = 0, 1, 2, \ldots$. We claim that if (4.11) holds, there exists a $u_{int} \in \mathrm{strict}(\mathcal{F}) = \{w \in \mathcal{F} \mid c_{q(m)}(w) > 0,\ \forall (m, q)\}$ such that

$$J[\hat{u}] > J[u_{int}]. \tag{4.12}$$

To prove this claim, we consider the two possibilities: (i) if $u^* \in \mathrm{strict}(\mathcal{F})$, then pick $u_{int} = u^*$; (ii) if $u^* \notin \mathrm{strict}(\mathcal{F})$, then choose a point $z \in \mathrm{strict}(\mathcal{F})$: if $J[z] < J[\hat{u}]$ pick $u_{int} = z$, otherwise define

$$\bar{u} = (1 - \lambda) u^* + \lambda z, \qquad \lambda \in (0, 1).$$

Since $\lambda > 0$, $\bar{u} \in \mathrm{strict}(\mathcal{F})$, and since $J$ is convex ($J[\bar{u}] \le (1 - \lambda) J[u^*] + \lambda J[z]$), and $J[z] \ge J[\hat{u}] > J[u^*]$, there exists a $\lambda$ such that $J[\bar{u}] < J[\hat{u}]$ from the continuity of $J$. So in this last case, pick $u_{int} = \bar{u}$. Either way the claim is proved if $\hat{u} \notin U^*$.

To find a contradiction to (4.12), we consider two possibilities for $\hat{u}$: (a) $\hat{u} \in \mathrm{strict}(\mathcal{F})$ or (b) $\hat{u} \notin \mathrm{strict}(\mathcal{F})$. If (a), then as $k \to \infty$, the logarithm terms of

$$J[u_k^*] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_k^*))\, dt \le J[u_{int}] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_{int}))\, dt$$

are bounded and therefore vanish since we have required that $\lim_{k \to \infty} \mu_k = 0$. Since $\lim_{k \to \infty} u_k^* = \hat{u}$, we arrive at a contradiction to (4.12):

$$J[\hat{u}] \le J[u_{int}], \qquad \forall\, u_{int} \in \mathrm{strict}(\mathcal{F}).$$
If (b), then combining the above inequality with (4.12) in the following form:

$$J[u_{int}] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_{int}))\, dt < J[u_k^*] - \mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_{int}))\, dt,$$

results in

$$-\mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_k^*))\, dt < -\mu_k \sum_q \sum_j \int_0^{t_f} \ln(c(u_{int}))\, dt.$$

This is again a contradiction since $\hat{u} \notin \mathrm{strict}(\mathcal{F})$ implies that, as $k \to \infty$, the first term above is unbounded, whereas the second is bounded, and the inequality cannot hold. $\square$

Proposition 6 Maintaining the definitions above,

$$\lim_{k \to \infty} u_k^* = u^*,$$

where $u^*$ is the desired solution to the quadratic program of problem (LQPu).

Proof. This follows immediately from the last proposition since strict convexity and continuity of $J$ and the compactness of $\mathcal{F}$ require that the set $U^*$ be composed of a unique $u^*$. $\square$

4.4.3 State and Adjoint Equations

The state and adjoint equations for a given barrier parameter $\mu_k$ for the above problem are

$$\dot{y}_k^* = Ay_k^* + Bu_k^* + F, \qquad y_k^*(0) = y_0, \tag{4.13}$$

$$-\dot{\lambda}_k^* = A^T \lambda_k^* + W_R (y_k^* - y_R), \qquad \lambda_k^*(t_f) = W_T (y_k^*(t_f) - y_T), \tag{4.14}$$

$$0 = W_u u_k^* + B^T \lambda_k^* - \mu_k (C_1^{-1} - C_2^{-1})\, e, \tag{4.15}$$

where $e \in \mathbb{R}^M = (1, 1, \ldots, 1)^T$. The advantage that equations (4.7) are no longer required (as long as the solution is approached from the interior of $\mathcal{F}$) is immediately evident, but the nonlinearity in equation (4.15) presents a problem that will be addressed in section 4.4.4.

The equations above are the starting point for so-called Primal Interior Point Methods. It is possible to add an extra degree of freedom to the problem by introducing slack (dual) variables $z_1 \in \{[t_0, t_f]; \mathbb{R}^M\}$ and $z_2 \in \{[t_0, t_f]; \mathbb{R}^M\}$. Stating the problem in this way leads to the Primal-Dual (PD) implementation:

$$\dot{y}_k^* = Ay_k^* + Bu_k^* + F, \qquad y_k^*(0) = y_0, \tag{4.16}$$

$$-\dot{\lambda}_k^* = A^T \lambda_k^* + W_R (y_k^* - y_R), \qquad \lambda_k^*(t_f) = W_T (y_k^*(t_f) - y_T), \tag{4.17}$$

$$0 = W_u u_k^* + B^T \lambda_k^* - z_{1,k}^* + z_{2,k}^*, \tag{4.18}$$

$$C_1 z_{1,k}^* = \mu_k e, \tag{4.19}$$

$$C_2 z_{2,k}^* = \mu_k e. \tag{4.20}$$

Now we observe that the nonlinearity has been transferred to equations (4.19,4.20).
The algorithmic impact of this alternative statement is that as the solution converges to $(y^*, u^*, \lambda^*, z_1^*, z_2^*)$, the last two equations in the set need not be satisfied exactly for $(y_k, u_k, \lambda_k, z_{1,k}, z_{2,k})$. This results in a more favorable situation, as is discussed in more detail in section 4.4.8.

Since PD methods are algorithmically more favorable, they have been chosen here for implementation. However, the simpler mathematical statement associated with Primal methods lends itself far better to theoretical treatment. Therefore, the next few sections describe the Primal methods for optimal control in detail, including some theoretical convergence results. Then, the PD methods for optimal control are presented in section 4.4.8 and incorporated in the algorithm in section 4.5.

4.4.4 Primal IPM Quadratic Approximation (Newton Projection Method)

Given a barrier parameter $\mu_k$, a difficulty in solving problem (4.8,4.9) is that the stationarity condition (4.15) is not linear. Given arbitrary, non-optimal "guess" values of the control variable $u_i$, and the resulting values of $\lambda_i$ (the subscript $i$ will later be used as a Newton iteration index), the equation

$$h(u_i, \lambda_i, t) = W_u u_i + B^T \lambda_i - \mu_k (C_{1,i}^{-1} - C_{2,i}^{-1})\, e \tag{4.21}$$

will not, in general, equal zero for all time. In order to resolve this problem, a Newton iterative method can be used to find the root of (4.21). This procedure is commonly referred to as the Newton Projection Method in the linear programming context.

Linearizing Stationarity Conditions

Here we show the standard Newton iteration procedure used to solve the problem. Corrections for values that lie outside the feasible region and the initialization of the algorithm will be addressed in the following two sections.

We first note that equation (4.21) can be written as a vector (uncoupled) equation since $W_u$ is usually (or can at least be made) diagonal.
Each of the $M$ components can then be linearized:

$$h(u_i, \lambda_i, t) + \nabla_u h(u_i, \lambda_i)\, \Delta u_i + \nabla_\lambda h(u_i, \lambda_i)\, \Delta \lambda_i = 0, \tag{4.22}$$

where

$$u_{i+1} = u_i + \Delta u_i \quad \text{and} \quad \lambda_{i+1} = \lambda_i + \Delta \lambda_i. \tag{4.23}$$

Equation (4.22) can be rewritten as

$$H_i^k \Delta u_i + B^T \Delta \lambda_i = -g_i - B^T \lambda_i, \tag{4.24}$$

where $H_i^k = W_u + \mu_k (C_{1,i}^{-2} + C_{2,i}^{-2})$ is the Hessian of $J^{\mu}$ with respect to $u_i$, and

$$g_i = W_u u_i - \mu_k (C_{1,i}^{-1} - C_{2,i}^{-1})\, e \tag{4.25}$$

is the gradient. Finally, using the relation in (4.23), we can rewrite the linearized equation for $u_{i+1}$:

$$u_{i+1} = u_i - (H_i^k)^{-1} (B^T \lambda_{i+1} + g_i). \tag{4.26}$$

Equation (4.26) is clearly a linear relation between $u_{i+1}$ and $\lambda_{i+1}$, but in order to find an appropriate value of $\lambda_{i+1}$, the stationarity conditions of equations (4.13) and (4.14) should still be satisfied. Therefore, given $u_i$, we solve the linear system:

$$\dot{y}_{i+1} = Ay_{i+1} + Bu_{i+1} + F, \qquad y_{i+1}(0) = y_0, \tag{4.27}$$

$$-\dot{\lambda}_{i+1} = A^T \lambda_{i+1} + W_R (y_{i+1} - y_R), \qquad \lambda_{i+1}(t_f) = W_T (y_{i+1}(t_f) - y_T), \tag{4.28}$$

$$u_{i+1} = u_i - (H_i^k)^{-1} (B^T \lambda_{i+1} + g_i). \tag{4.29}$$

In terms of Newton step directions, the above can be expressed as

$$\Delta \dot{y}_i = A \Delta y_i - Q_i^k \Delta \lambda_i + F_i, \qquad \Delta y_i(0) = y_0 - y_i(t_0), \tag{4.30}$$

$$-\Delta \dot{\lambda}_i = A^T \Delta \lambda_i + W_R \Delta y_i + f_i, \qquad \Delta \lambda_i(t_f) = W_T (\Delta y_i(t_f) - \bar{y}_{T,i}), \tag{4.31}$$

$$\Delta u_i = -(H_i^k)^{-1} (B^T \lambda_i + B^T \Delta \lambda_i + g_i), \tag{4.32}$$

where

$$\begin{aligned}
Q_i^k &= B (H_i^k)^{-1} B^T, \\
F_i &= -B (H_i^k)^{-1} (B^T \lambda_i + g_i) + (-\dot{y}_i + Ay_i + Bu_i + F), \\
f_i &= -\dot{\lambda}_i - A^T \lambda_i - W_R (y_i - y_R), \\
\bar{y}_{T,i} &= W_T^{-1} \lambda_i(t_f) + (y_T - y_i(t_f)),
\end{aligned}$$

vary with $i$.

The significance of equations (4.30,4.31) is that they take exactly the same form as (3.4,3.5). The key requirement for application of the TRCG algorithm on this system is that $Q_i^k$ be SPD. Since $H_i^k$ from equation (4.24) is certainly SPD ($\mu_k > 0$ and $C_{1,i}$, $C_{2,i}$ are SPD for feasible $u_i$), then so is $Q_i^k$. In other words, we can safely apply the TRCG algorithm to the system above for each Newton iteration.

Denoting $p_i^T = [u_i^T \ \lambda_i^T \ y_i^T]$, for each Newton step $\Delta p_i^T = [\Delta u_i^T \ \Delta \lambda_i^T \ \Delta y_i^T]$ the appropriate step size $\alpha_i \in (0, 1]$ needs to be determined for calculation of the next iterate:

$$p_{i+1} = p_i + \alpha_i \Delta p_i. \tag{4.33}$$

This can be accomplished by well-established procedures such as line minimization and the Armijo rule.
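As a rough illustration of the Newton linearization (4.21)-(4.26), the following sketch applies it to a single scalar control with the adjoint contribution $B^T \lambda$ frozen at a constant `b`. Freezing this term decouples the control update from the state and adjoint equations, which the full method does not do (there, each Newton step requires a TRCG solve). The function names, the step-halving safeguard, and the numerical values are assumptions made for illustration only.

```python
def barrier_gradient(u, w, b, mu, u_min, u_max):
    """Scalar analogue of (4.21): h = w*u + b - mu*(1/c1 - 1/c2)."""
    c1 = u - u_min
    c2 = u_max - u
    return w * u + b - mu * (1.0 / c1 - 1.0 / c2)


def barrier_hessian(u, w, mu, u_min, u_max):
    """Scalar analogue of the Hessian in (4.24): H = w + mu*(1/c1^2 + 1/c2^2)."""
    c1 = u - u_min
    c2 = u_max - u
    return w + mu * (1.0 / c1 ** 2 + 1.0 / c2 ** 2)


def newton_barrier(u0, w, b, mu, u_min, u_max, tol=1e-10, max_iter=100):
    """Newton iteration on h(u) = 0, keeping u strictly inside the box."""
    u = u0
    for _ in range(max_iter):
        h = barrier_gradient(u, w, b, mu, u_min, u_max)
        if abs(h) < tol:
            break
        du = -h / barrier_hessian(u, w, mu, u_min, u_max)
        step = 1.0
        # Assumed safeguard: halve the step until the iterate is strictly feasible.
        while not (u_min < u + step * du < u_max):
            step *= 0.5
        u += step * du
    return u


# Hypothetical data: the unconstrained minimizer of 0.5*w*u**2 + b*u is u = 3,
# which lies above u_max = 1, so the upper bound is active at the solution.
w, b = 1.0, -3.0
u_min, u_max = 0.0, 1.0
mu = 1e-3
u_mu = newton_barrier(0.5 * (u_min + u_max), w, b, mu, u_min, u_max)
print(u_mu, barrier_gradient(u_mu, w, b, mu, u_min, u_max))
```

The iterate settles just inside the upper bound, at a distance of order $\mu$ from it, which is the behavior the barrier term is meant to produce.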
4.4.5 Correcting Values that Lie Outside Box Constraints

Given any $u_i$, the calculation $u_{i+1} = u_i + \alpha_i \Delta u_i$ may result in values that lie outside the hard constraints (4.1). If these values are not corrected, Newton iterations may simply approach the unconstrained solution as $\mu \to 0$ (although undefined in the logarithmic statement of the problem, this infeasibility is not automatically detected in the Newton approximation).

In order to avoid such a problem, we determine the distance to the boundary in the cases where a full step would result in infeasible controller values:

$$\beta_i = \min\left\{ 1,\ \min_{j:\ u_{i,j} + \alpha_i \Delta u_{i,j} < u_{\min}} \frac{u_{\min} - u_{i,j}}{\alpha_i \Delta u_{i,j}},\ \min_{j:\ u_{i,j} + \alpha_i \Delta u_{i,j} > u_{\max}} \frac{u_{\max} - u_{i,j}}{\alpha_i \Delta u_{i,j}} \right\}, \qquad j = 1, \ldots, M. \tag{4.34}$$

Having determined $\beta_i$, we can take the actual Newton step:

$$p_{i+1} = p_i + (0.995)\, \beta_i \alpha_i \Delta p_i.$$

The constant 0.995 is used to ensure that the next Newton iteration will be performed in the strict interior of the feasible region. Using unity may result in $p_{i+1}$ on the boundary of the feasible region, which is undefined in the sense of logarithmic barrier functions.

4.4.6 Initializing the Algorithm

The Analytical Center

The previous section describes a method for determining step directions for the solution of the quadratic approximation of the logarithmic barrier version of the constrained problem. No mention has been made of the appropriate starting point $p_0$ from which to take the first step.

We emphasize that the IPM algorithm is based on two approximations: (i) a logarithmic barrier function that accounts for the effect of the boundary, and (ii) a Newton approximation for the solution of the nonlinear equations that make up the stationarity conditions for each $\mu_k$. In its currently proposed form, the algorithm takes iterates $k$ and $i$ for the two approximations respectively. Assuming we have a predetermined positive, decreasing set $\{\mu_k\}$, it is necessary to determine the initial guess $p_0(\mu_k)$ for each $k$.
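The step-length safeguard of section 4.4.5 can be sketched as follows for a control vector sampled at one time instant. This is a minimal illustration under the reading of (4.34) in which the ratio is taken against the scaled step $\alpha_i \Delta u_{i,j}$; the function names and the sample numbers (in K/s, echoing the 0 to 1500 K/s range of the example problem) are hypothetical.

```python
def step_to_boundary(u, du, alpha, u_min, u_max):
    """Largest beta in (0, 1] such that u + beta*alpha*du stays inside the box."""
    beta = 1.0
    for u_j, du_j in zip(u, du):
        trial = u_j + alpha * du_j
        if trial < u_min:
            beta = min(beta, (u_min - u_j) / (alpha * du_j))
        elif trial > u_max:
            beta = min(beta, (u_max - u_j) / (alpha * du_j))
    return beta


def safeguarded_update(u, du, alpha, u_min, u_max, shrink=0.995):
    """Take the step shrink*beta*alpha*du so the new point is strictly interior."""
    beta = step_to_boundary(u, du, alpha, u_min, u_max)
    u_next = [u_j + shrink * beta * alpha * du_j for u_j, du_j in zip(u, du)]
    return u_next, beta


# Hypothetical iterate and Newton direction for three controllers.
u = [100.0, 1400.0, 750.0]
du = [-400.0, 300.0, 50.0]
u_next, beta = safeguarded_update(u, du, alpha=1.0, u_min=0.0, u_max=1500.0)
print(beta, u_next)
```

In this example the first controller limits the step: a full step would drive it below zero, so the whole direction is shortened and every component ends up strictly between the bounds.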
Given any $k$, it is natural to initialize the Newton procedure with

$$p_0(\mu_{k+1}) = p_{last}(\mu_k), \tag{4.35}$$

where $p_{last}(\mu_k)$ is the last (possibly converged) iterate of the Newton procedure for barrier parameter $\mu_k$. More will be said about this choice in the following sections.

But the question of how to assign $p_0(\mu_0)$ still remains. Here we make the first statement concerning the set $\{\mu_k\}$: take $\mu_0 = M$, where $M$ is "large" in the sense that it can be taken to approach the case in which $\mu_0 \to \infty$. For a closed feasible set $\mathcal{F}$ ($u_{\min} < u_{\max}$ both bounded), the logarithmic terms dominate $J^{\mu_0}$, and the solution of problem (4.8,4.9) can be immediately recognized as the analytical center of the feasible set. In this case $p_0(\mu_0) = p_{last}(\mu_0) = [\,(u_{k=0}^*)^T \ (\lambda_{k=0}^*)^T \ (y_{k=0}^*)^T\,]^T$ is uniquely determined from:

$$u_{k=0}^* = \frac{u_{\min} + u_{\max}}{2}, \tag{4.36}$$

$$\dot{y}_{k=0}^* = Ay_{k=0}^* + Bu_{k=0}^* + F, \qquad y_{k=0}^*(t_0) = y_0, \tag{4.37}$$

$$-\dot{\lambda}_{k=0}^* = A^T \lambda_{k=0}^* + W_R (y_{k=0}^* - y_R), \qquad \lambda_{k=0}^*(t_f) = W_T (y_{k=0}^*(t_f) - y_T), \tag{4.38}$$

requiring no Newton iterations to be performed. Once this initial solution has been established, one can proceed to Newton iterations for $k = 1$, from which $p_{last}(\mu_k)$ will be available for the initialization (4.35). The question of how to select $\mu_{k+1}$ will be addressed in section 4.4.7.

For cases in which the feasible region is unbounded ($u_{\min} \le u < \infty$, for example) we can simply impose $u_{\max}$ on the problem to be a value that is much larger than expected. Such situations are rare in practical engineering problems since most controllers will have physical limits. Therefore, we observe here that assuming that $\mathcal{F}$ is closed is reasonable for most engineering applications.

The Central Path

Above we have addressed the question of how to choose $p_0(\mu_k)$ for each $k > 0$. If we allow the Newton iterations that follow from such initial guesses to converge "exactly" (that is, $i = last$ when $\|h(p_i)\| < \epsilon$, where $\epsilon$ is a negligibly small tolerance value) we will solve problem (4.8,4.9) exactly for each $k$.
The path $\{p_{last}^k\}$ taken by these solutions for $\{\mu_k\}$ as $\mu_k \to 0$ is called the central path. Proposition 6 states that the central path will converge for any decreasing positive choice of $\{\mu_k\}$. Therefore, if it can be assumed that each Newton iteration will satisfactorily converge from the resulting choices of $\{p_0^k\} = \{p_{last}^{k-1}\}$, we can readily implement the algorithm.

However, it is more interesting to approach the question from another perspective. Rather than choosing an arbitrary sequence $\{\mu_k\}$ and determining $\{p_0^k\}$ that will allow for convergence of Newton iterations, it is possible to determine an appropriate $\{\mu_k\}$ such that a single Newton iteration will suffice for convergence of the entire algorithm. That is:

$$p_{k+1} = p_{i+1}, \tag{4.39}$$

allowing us to relinquish the index $i$. This means that even if we do not exactly follow the central path, the algorithm will converge to the solution $u^*$ given an appropriate choice of $\{\mu_k\}$ and a single Newton approximation step. The choice of the sequence and the necessary proof of the above statement are given in the next section.

4.4.7 Barrier Parameter Rate of Reduction

Here we establish the rate of reduction of the variable $\mu_k$ which will guarantee convergence of the algorithm for a single Newton step. This theoretical result is important in the sense that it guides the convergence expectations based on the size of the problem. In practice, however, the method is often observed to perform better than this result suggests, and $\mu_k$ can be reduced more aggressively.

The results of this section are shown in the context of the time-discretized problem since they are important in the implementation of the method. In fact, the discretized size of the problem will drive the convergence rate of the algorithm. Here we use the inner product $(x, y) = x^T y$ with associated norm $\|\cdot\| = \sqrt{(x, x)}$ and deal mostly with vectors $x \in \mathbb{R}^M$, where, as before, $M$ is the number of control variables.
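In addition, $L$ is the number of time discretizations assigned to the problem.

These results are based on the practices followed by IPM formulations used for solution of general linear programming problems [4]. Though similar in their conclusions, the formulations differ in the fact that optimal control problems involve a time domain that is not present in the traditional treatment of these methods. As the results show, the size of the control problem $\sqrt{ML}$ dominates the result. This is analogous to the linear programming context.

Before stating and proving the main result let us state the following definition:

Definition 5 (Time norm) Given a set of vectors $\{x^\ell\}$ where $x^\ell \in \mathbb{R}^M$, we define as the time norm $\|X\|_L$ the following operation:

$$\|X\|_L = \left( \sum_{\ell=0}^{L} \|x^\ell\|^2 \right)^{1/2}. \tag{4.40}$$

The required properties of the norm can be easily shown to hold: $\|X\|_L \ge 0$; $\|X\|_L = 0$ iff $\{x^\ell\} = \{0\}$; $\|aX\|_L = |a|\, \|X\|_L$, $\forall\, a \in \mathbb{R}$.

Lemma 2 It will be useful to explicitly state the following inequality:

$$\|X + Y\|_L \le \|X\|_L + \|Y\|_L. \tag{4.41}$$

This is simply a statement of the triangle inequality and can easily be checked. The long-hand notation takes the form:

$$\sqrt{\sum_{\ell} \|x^\ell + y^\ell\|^2} \le \sqrt{\sum_{\ell} \|x^\ell\|^2} + \sqrt{\sum_{\ell} \|y^\ell\|^2}.$$

Now we state the main result of this section:

Proposition 7 The proposed single-step Primal IPM is guaranteed to converge if the barrier parameter decreases in the following manner:

$$\mu_{k+1} = \alpha_k \mu_k, \qquad \text{where} \quad \alpha_k = \frac{\beta + \sqrt{ML}}{\sqrt{\beta} + \sqrt{ML}}, \tag{4.42}$$

and the path followed by the algorithm will not deviate from the central path by more than

$$\left\| \frac{1}{\mu_k} C^k S^k e - e \right\|_L \le \beta \tag{4.43}$$

for all $k$.

Proof. The proof is carried out here for the terminal problem only in order to reduce the number of terms involved. The result holds for the terminal-regulator problem and the proof is easily extended. To prove this statement we recall the discretized form of the problem:

$$\Delta y^{k,0} = 0; \tag{4.44}$$

$$\frac{\Delta y^{k,\ell} - \Delta y^{k,\ell-1}}{\Delta t} = A \Delta y^{k,\ell} + B \Delta u^{k,\ell}, \qquad \ell = 1, \ldots, L; \tag{4.45}$$

$$(I - \Delta t A)^T \Delta \lambda^{k,L} = W_T \Delta y^{k,L}; \tag{4.46}$$

$$\frac{\Delta \lambda^{k,\ell} - \Delta \lambda^{k,\ell+1}}{\Delta t} = A^T \Delta \lambda^{k,\ell}, \qquad \ell = L - 1, \ldots, 1; \tag{4.47}$$

along with the condition

$$W_u u^{k,\ell} + B^T \lambda^{k,\ell} + W_u \Delta u^{k,\ell} + B^T \Delta \lambda^{k,\ell} - \mu_{k+1} (C^{k,\ell})^{-1} e + \mu_{k+1} (C^{k,\ell})^{-2} \Delta u^{k,\ell} = 0,$$

for $\ell = 1, \ldots, L$. This last condition can be rewritten as

$$s^{k+1,\ell} = W_u u^{k+1,\ell} + B^T \lambda^{k+1,\ell} = \mu_{k+1} (C^{k,\ell})^{-1} e - \mu_{k+1} (C^{k,\ell})^{-2} \Delta u^{k,\ell}. \tag{4.48}$$

Now we observe that since $u_j^{k+1,\ell} = u_j^{k,\ell} + \Delta u_j^{k,\ell}$ for $j = 1, \ldots, M$, we have

$$u_j^{k+1,\ell} - u_{\min} = (u_j^{k,\ell} - u_{\min}) \left( 1 + \frac{\Delta u_j^{k,\ell}}{u_j^{k,\ell} - u_{\min}} \right), \tag{4.49}$$

from which, together with (4.48) and observing the diagonal nature of the matrices involved, we get

$$\frac{1}{\mu_{k+1}} (u_j^{k+1,\ell} - u_{\min})\, s_j^{k+1,\ell} - 1 = -\left( \frac{\Delta u_j^{k,\ell}}{u_j^{k,\ell} - u_{\min}} \right)^2,$$

which can be written as

$$\left\| \frac{1}{\mu_{k+1}} C^{k+1,\ell} S^{k+1,\ell} e - e \right\| = \left\| (C^{k,\ell})^{-2} (\Delta U^{k,\ell})^2 e \right\|.$$

For positive vectors $v$, the inequality $\|v\|^2 = \sum_j v_j^2 \le \left( \sum_j v_j \right)^2 = (e^T v)^2$ allows us to write

$$\left\| (C^{k,\ell})^{-2} (\Delta U^{k,\ell})^2 e \right\| \le e^T (C^{k,\ell})^{-2} (\Delta U^{k,\ell})^2 e = \left\| (C^{k,\ell})^{-1} \Delta u^{k,\ell} \right\|^2.$$

Squaring, and summing over all $\ell$,

$$\left\| \frac{1}{\mu_{k+1}} C^{k+1} S^{k+1} e - e \right\|_L^2 \le \sum_{\ell=0}^{L} \left\| (C^{k,\ell})^{-1} \Delta u^{k,\ell} \right\|^4 \le \left\| (C^k)^{-1} \Delta u^k \right\|_L^4. \tag{4.50}$$

Now, from (4.48) we readily get for the right-most term above:

$$\Delta u^{k,\ell\,T} (C^{k,\ell})^{-2} \Delta u^{k,\ell} = e^T (C^{k,\ell})^{-1} \Delta u^{k,\ell} - \frac{1}{\mu_{k+1}} (s^{k,\ell})^T \Delta u^{k,\ell} - \frac{1}{\mu_{k+1}} \left( \Delta u^{k,\ell\,T} W_u \Delta u^{k,\ell} + \Delta u^{k,\ell\,T} B^T \Delta \lambda^{k,\ell} \right),$$

or

$$\left\| (C^{k,\ell})^{-1} \Delta u^{k,\ell} \right\|^2 + \frac{1}{\mu_{k+1}} \left( \Delta u^{k,\ell\,T} W_u \Delta u^{k,\ell} + \Delta u^{k,\ell\,T} B^T \Delta \lambda^{k,\ell} \right) = -\left( \frac{1}{\mu_{k+1}} C^{k,\ell} S^{k,\ell} e - e \right)^T (C^{k,\ell})^{-1} \Delta u^{k,\ell}.$$

Since $\mu_k > 0$ for all $k$, and $W_u$ is SPD, the term in $W_u$ can be dropped and the right hand side bounded by the product of the norms of its two factors. Summing over all $\ell$ then gives

$$\left\| (C^k)^{-1} \Delta u^k \right\|_L^2 + \frac{1}{\mu_{k+1}} \sum_{\ell} \Delta u^{k,\ell\,T} B^T \Delta \lambda^{k,\ell} \le \left\| \frac{1}{\mu_{k+1}} C^k S^k e - e \right\|_L \left\| (C^k)^{-1} \Delta u^k \right\|_L. \tag{4.51}$$

From relations (4.45)-(4.47), we have

$$\sum_{\ell} \Delta u^{k,\ell\,T} B^T \Delta \lambda^{k,\ell} = \frac{1}{\Delta t}\, \Delta y^{k,L\,T} W_T \Delta y^{k,L} \ge 0,$$

so that (4.51) further simplifies to

$$\left\| (C^k)^{-1} \Delta u^k \right\|_L \le \left\| \frac{1}{\mu_{k+1}} C^k S^k e - e \right\|_L.$$

Together with (4.50) we have the result

$$\left\| \frac{1}{\mu_{k+1}} C^{k+1} S^{k+1} e - e \right\|_L \le \left\| \frac{1}{\mu_{k+1}} C^k S^k e - e \right\|_L^2. \tag{4.52}$$

Since $\mu_{k+1} = \alpha_k \mu_k$ we have for the right hand side:

$$\begin{aligned}
\left\| \frac{1}{\mu_{k+1}} C^k S^k e - e \right\|_L
&= \left\| \frac{1}{\alpha_k} \left( \frac{1}{\mu_k} C^k S^k e - e \right) + \frac{1 - \alpha_k}{\alpha_k}\, e \right\|_L \\
&\le \frac{1}{\alpha_k} \left\| \frac{1}{\mu_k} C^k S^k e - e \right\|_L + \frac{1 - \alpha_k}{\alpha_k} \|e\|_L \\
&\le \frac{1}{\alpha_k}\, \beta + \frac{1 - \alpha_k}{\alpha_k} \sqrt{ML}.
\end{aligned}$$

Now we can determine the possible values of $\alpha_k$ which will ensure that for iteration $(k + 1)$ the deviation from the central path will not exceed $\beta$. Combining (4.52) with the above we have

$$\left\| \frac{1}{\mu_{k+1}} C^{k+1} S^{k+1} e - e \right\|_L \le \left( \frac{1}{\alpha_k}\, \beta + \frac{1 - \alpha_k}{\alpha_k} \sqrt{ML} \right)^2 = \beta. \tag{4.53}$$

Solving for $\alpha_k$:

$$\alpha_k = \frac{\beta + \sqrt{ML}}{\sqrt{\beta} + \sqrt{ML}}. \tag{4.54}$$

4.4.8 Primal-Dual IPM

Difficulties associated with Primal Methods

The Newton step ($\Delta p_i$) calculation is by far the most costly part of the proposed algorithm. The problem of finding $\Delta p_i$ can be represented in the following form:

$$K_P \Delta p_i = f_P, \tag{4.55}$$

where $K_P$ is the symmetric block matrix

$$K_P = \begin{bmatrix} H_i^k & B^T & 0 \\ B & 0 & \left( A - \frac{d}{dt} \right) \\ 0 & \left( A^T + \frac{d}{dt} \right) & W_R \end{bmatrix}$$

with appropriate initial and final conditions implied. We recall here that the (1,1) element of the above block matrix is the Hessian $H_i^k = W_u + \mu_k (C_{1,i}^{-2} + C_{2,i}^{-2})$.

The main problem with Primal IPMs is that as they approach the solution $u^*$, the matrix $K_P$ becomes ill-conditioned, leading to problems in inverting the system (4.55). It is a simple matter to show this. First, we note that we have shown in Proposition 6 that as $k \to \infty$, $u_k^* \to u^*$ and thus, for active constraints, $C_{q,k(j,j)}^* \to 0$. We then note that for $u_k^*$, stationarity conditions must hold for the point to be optimal. Comparing stationarity for the original problem and for the logarithmic barrier approximation

$$0 = W_u u^* + B^T \lambda^* + \nu_1^* - \nu_2^*, \tag{4.56}$$

$$0 = h_k^* = W_u u_k^* + B^T \lambda_k^* - \mu_k (C_{1,k}^*)^{-1} e + \mu_k (C_{2,k}^*)^{-1} e \tag{4.57}$$

(equation (4.56) is a result of $\nabla_u c_1 = I_M$ and $\nabla_u c_2 = -I_M$), we note that for $(u_k^*, \lambda_k^*) \to (u^*, \lambda^*)$, we must have

$$-\mu_k (C_{1,k}^*)^{-1} e \to \nu_1^*, \tag{4.58}$$

$$-\mu_k (C_{2,k}^*)^{-1} e \to \nu_2^*. \tag{4.59}$$

Since the above values are non-zero constants and $C_{q,k(j,j)}^* \to 0$ for active $j$, $\mu_k (C_{q,k(j,j)})^{-2} \to \infty$ as $k \to \infty$. As a result $H \to \infty$, making $K_P$ ill-conditioned.

Primal-Dual Methods Formulation

It is possible to avoid the ill-conditioning that arises from Primal methods by sacrificing the symmetry of $K_P$. If Newton iterations are applied to the system (4.16)-(4.20), and we define the Primal-Dual variables $p_i^T = [u_i^T \ \lambda_i^T \ y_i^T \ z_{1,i}^T \ z_{2,i}^T]$, and corresponding Newton step $\Delta p_i^T = [\Delta u_i^T \ \Delta \lambda_i^T \ \Delta y_i^T \ \Delta z_{1,i}^T \ \Delta z_{2,i}^T]$, we have

$$K_{PD} \Delta p_i = f_{PD}, \tag{4.60}$$

where

$$K_{PD} = \begin{bmatrix} W_u & B^T & 0 & -I & I \\ B & 0 & \left( A - \frac{d}{dt} \right) & 0 & 0 \\ 0 & \left( A^T + \frac{d}{dt} \right) & W_R & 0 & 0 \\ Z_{1,i} & 0 & 0 & C_{1,i} & 0 \\ -Z_{2,i} & 0 & 0 & 0 & C_{2,i} \end{bmatrix}$$

and $Z_{q,i} = \mathrm{diag}(z_{q,i})$. Now $H_i^k$ does not appear in $K_{PD}$ and the problem encountered in Primal methods is eliminated.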
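The growth of the primal Hessian described by (4.58)-(4.59) can be checked numerically on a scalar problem that admits a closed-form barrier minimizer. The sketch below is an illustration under simplifying assumptions: a single control, only the lower bound $u \ge 0$, and a hypothetical cost $\frac{1}{2} w (u - u_d)^2$ with $u_d < 0$ so that the bound is active. None of these values come from the example problem of this chapter.

```python
import math


def barrier_minimizer(mu, w, u_d):
    """Minimizer over u > 0 of 0.5*w*(u - u_d)**2 - mu*ln(u).

    It is the positive root of w*u**2 - w*u_d*u - mu = 0, written in a form
    that avoids cancellation when u_d < 0.
    """
    disc = math.sqrt((w * u_d) ** 2 + 4.0 * w * mu)
    return 2.0 * mu / (disc - w * u_d)


w, u_d = 1.0, -2.0   # hypothetical: the unconstrained minimizer violates u >= 0
rows = []
for mu in (1e-2, 1e-4, 1e-6, 1e-8):
    u = barrier_minimizer(mu, w, u_d)
    multiplier = mu / u            # tends to -nu_1 = w*|u_d|, cf. (4.58)
    hessian = w + mu / u ** 2      # scalar primal Hessian, grows like 1/mu
    rows.append((mu, u, multiplier, hessian))
    print(f"mu={mu:.0e}  u={u:.3e}  mu/u={multiplier:.6f}  H={hessian:.3e}")
```

As $\mu$ decreases, $\mu / u$ settles at a finite nonzero value while $\mu / u^2$ grows without bound, which is the mechanism that degrades the conditioning of $K_P$.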
Since $z_q \to -\nu_q$ and $c_{q(j)} \to 0$ for active $j$ as $k \to \infty$, the block matrix above does not become ill-conditioned as we approach the solution, resulting in a better behaved algorithm.

The fact that we have sacrificed symmetry does not have an adverse impact on the algorithm, since the form (4.60) can still be reduced to

$$\Delta \dot{y}_i = A \Delta y_i - \tilde{Q}_i \Delta \lambda_i + \tilde{F}_i, \qquad \Delta y_i(0) = y_0 - y_i(t_0), \tag{4.61}$$

$$-\Delta \dot{\lambda}_i = A^T \Delta \lambda_i + W_R \Delta y_i + f_i, \qquad \Delta \lambda_i(t_f) = W_T (\Delta y_i(t_f) - \bar{y}_{T,i}), \tag{4.62}$$

$$\Delta u_i = -(\tilde{H}_i^k)^{-1} (B^T \lambda_i + B^T \Delta \lambda_i + \tilde{g}_i), \tag{4.63}$$

where now

$$\begin{aligned}
\tilde{H}_i^k &= W_u + (C_{1,i}^{-1} Z_{1,i} + C_{2,i}^{-1} Z_{2,i}), \\
\tilde{g}_i &= W_u u_i - \mu_k (C_{1,i}^{-1} - C_{2,i}^{-1})\, e, \\
\tilde{Q}_i &= B (\tilde{H}_i^k)^{-1} B^T, \\
\tilde{F}_i &= -B (\tilde{H}_i^k)^{-1} (B^T \lambda_i + \tilde{g}_i) + (-\dot{y}_i + Ay_i + Bu_i + F), \\
f_i &= -\dot{\lambda}_i - A^T \lambda_i - W_R (y_i - y_R), \\
\bar{y}_{T,i} &= W_T^{-1} \lambda_i(t_f) + (y_T - y_i(t_f)).
\end{aligned}$$

The TRCG algorithm can therefore still be used to solve the above system. With Primal-Dual methods the $\tilde{Q}_i$ matrix is seen above to be better conditioned than its Primal counterpart of section 4.4.4, which is advantageous to the TRCG algorithm since the conditioning of $\mathcal{G}$ depends on this matrix.

Since Primal-Dual algorithms improve the method without an adverse effect, we have chosen to use the above statement of the problem in the solution method. The result of equation (4.42) will no longer hold for the statement above. It can, however, still be used to guide the convergence of the algorithm in the sense that the problem size is reflected by the term $\sqrt{ML}$. We postulate here without proof that an often used result [4] for traditional Primal-Dual IPMs can be extended to the context of optimal control to determine the selection of barrier parameters:

+ (c M ML T Zk-1 k-1 T k-1 k (4.64)

Finally, we must address the issue of step length so that the new iterate is guaranteed to lie inside the feasible region. Similarly to the primal case, assuming that we start from a feasible point $p_i$, we take

$$p_{i+1} = p_i + (0.995)\, \beta_i \alpha_i \Delta p_i,$$

where

$$\beta_i = \min_{j = 1, \ldots, M} \left\{ 1,\ \min_{z_{1,i,j} + \alpha_i \Delta z_{1,i,j} < 0} \frac{-z_{1,i,j}}{\alpha_i \Delta z_{1,i,j}},\ \min_{z_{2,i,j} + \alpha_i \Delta z_{2,i,j} < 0} \frac{-z_{2,i,j}}{\alpha_i \Delta z_{2,i,j}},\ \min_{u_{i,j} + \alpha_i \Delta u_{i,j} < u_{\min}} \frac{u_{\min} - u_{i,j}}{\alpha_i \Delta u_{i,j}},\ \min_{u_{i,j} + \alpha_i \Delta u_{i,j} > u_{\max}} \frac{u_{\max} - u_{i,j}}{\alpha_i \Delta u_{i,j}} \right\}. \tag{4.65}$$

4.5 IPM-TRCG Algorithm

The following is a detailed listing of the Primal-Dual Interior Point Methods approach to solving optimal control problems via the TRCG algorithm.

Algorithm IPM-TRCG

    Set k = 0;
    Set $\mu_{k=0} = M$ (M large);
    Set $u_{k=0}$ at the analytical center of U;
    Calculate $p_{k=0}(\mu_0)$;
    while ($\mu_k > \mu_{tol}$) do
        while ($\|F_i\| > F_{tol}$) do
            Solve (4.60) for $\Delta p_i$ by TRCG;
            Solve for $\alpha_i$ and $\beta_i$ by Armijo rule and (4.65), respectively;
            $p_{i+1} = p_i + (0.995)\, \beta_i \alpha_i \Delta p_i$;
        end do;
        if ($\mu_k > \mu_{tol}$) then
            k = k + 1;
            Solve for $\mu_k$ by (4.64);
        end if;
    end do

We note that in the above we have not used the theoretical results for single Newton steps. Though helpful as a guide in estimating the rate of reduction of $\mu_k$, this approach is rarely pursued in practice. We have chosen an often used approach for implementation of IPMs by allowing for a small number of Newton steps and calculation of $\mu_k$ by (4.64). But since IPMs can be applied much in their original context to optimal control via TRCG, other typical implementations can also be used if preferred.

4.6 Example Problem: Linear, Constrained 2D Heat Transfer

4.6.1 General Results

Here we consider the same example problem as in section 3.12 with the additional constraint:

$$0 \text{ K/s} \le u^*(t) \le 1500 \text{ K/s}, \qquad \forall\, t \in [t_0, t_f]. \tag{4.66}$$

Note that, although the above is imposed for all time, these constraints can be time-varying, which would require no modifications to the algorithm proposed.

We shall refer to the problem above as the "constrained" problem and to that posed in section 3.12 as the "unconstrained" problem. In reality, both can be viewed as constrained minimizations since governing equations must be satisfied; however, this section deals with hard, inequality constraints on the variables themselves.

Figure 4-1 shows the solution to the constrained problem. It is obvious from the plot that the hard constraints are indeed observed for the control variables as required.
The performance of the system is, as expected, degraded, as can be seen from the higher value of $J^*$ in comparison to the unconstrained problem. This is expected since the set $U$ of admissible control functions $u$ has been reduced to include only those functions that lie in the allowed range over time.

Figure 4-1: Optimal control and $\Gamma_{RS}$ temperature histories for constrained problem, $J^* = 8.19 \times 10^7$. Top panel: optimal control history $u(t)$; bottom panel: average temperature at reaction surface $\Gamma_{RS}$; horizontal axis: $t$ (s).

4.6.2 Numerical Performance

Rather than using equation (4.64) to calculate $\mu_k$, we simply took

$$\mu_k = 10^{-2} \mu_{k-1},$$

which is known to work well in practice. Equation (4.64) was only used as an estimate for the stopping criterion. IPMs are often observed to work better in practice than predicted by theory, and their application to optimal control problems is no exception. Therefore, a simple, aggressive reduction of $\mu_k$ can be expected to allow the method to converge very quickly. The results below show that this is the case for our current example.

Figure 4-2 shows the Newton convergence of the residual $\|F_i\|$ of the nonlinear state equations and the conjugate gradient convergence for the last Newton iteration in the procedure.

Figure 4-2: Newton and last conjugate gradient convergence of IPM-TRCG algorithm. Left panel: Newton residual convergence; right panel: last TRCG residual convergence; horizontal axes: iteration.

Referring to the left graph, we note that the tolerance value chosen was $F_{tol} = 10^{-4}$, and that new $\mu_k$ were calculated at iterations {1, 6, 8, 13, 16, 17, 18, 19, 20, 21}.
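The combination of a Primal-Dual Newton step, the step-length rule (4.65), and the reduction $\mu_k = 10^{-2} \mu_{k-1}$ can be tried on a scalar toy problem. The sketch below is not the IPM-TRCG implementation: it has no state or adjoint equations (the term $W_u u + B^T \lambda$ is replaced by the hypothetical gradient $w (u - u_d)$), it always takes $\alpha_i = 1$ instead of applying an Armijo rule, and all names, tolerances, and data are assumptions made for illustration.

```python
def residual_norm(u, z1, z2, mu, w, u_d, u_min, u_max):
    """Largest residual of the scalar analogues of (4.18)-(4.20)."""
    c1, c2 = u - u_min, u_max - u
    return max(abs(w * (u - u_d) - z1 + z2),
               abs(c1 * z1 - mu),
               abs(c2 * z2 - mu))


def newton_direction(u, z1, z2, mu, w, u_d, u_min, u_max):
    """Reduced Primal-Dual Newton step, scalar analogue of (4.63)."""
    c1, c2 = u - u_min, u_max - u
    h_tilde = w + z1 / c1 + z2 / c2
    g_tilde = w * (u - u_d) - mu * (1.0 / c1 - 1.0 / c2)
    du = -g_tilde / h_tilde
    dz1 = mu / c1 - z1 - (z1 / c1) * du
    dz2 = mu / c2 - z2 + (z2 / c2) * du
    return du, dz1, dz2


def fraction_to_boundary(pairs):
    """Largest beta in (0, 1] keeping value + beta*step >= 0 for every pair."""
    beta = 1.0
    for value, step in pairs:
        if value + step < 0.0:
            beta = min(beta, -value / step)
    return beta


def ipm_scalar(w, u_d, u_min, u_max, mu0=100.0, mu_tol=1e-9,
               res_tol=1e-10, reduction=1e-2, max_newton=50):
    u = 0.5 * (u_min + u_max)      # analytical center, as in (4.36)
    mu = mu0
    z1 = mu / (u - u_min)
    z2 = mu / (u_max - u)
    history = []                   # (mu, number of Newton steps) per barrier value
    while mu > mu_tol:
        count = 0
        while (residual_norm(u, z1, z2, mu, w, u_d, u_min, u_max) > res_tol
               and count < max_newton):
            du, dz1, dz2 = newton_direction(u, z1, z2, mu, w, u_d, u_min, u_max)
            beta = fraction_to_boundary([(u - u_min, du), (u_max - u, -du),
                                         (z1, dz1), (z2, dz2)])
            step = 0.995 * beta
            u, z1, z2 = u + step * du, z1 + step * dz1, z2 + step * dz2
            count += 1
        history.append((mu, count))
        mu *= reduction            # aggressive reduction used in this section
    return u, z1, z2, history


# Hypothetical data: the unconstrained minimizer u_d = 3 lies above u_max = 1.
u_opt, z1_opt, z2_opt, history = ipm_scalar(w=1.0, u_d=3.0, u_min=0.0, u_max=1.0)
print(u_opt, z1_opt, z2_opt)
for mu_value, steps in history:
    print(mu_value, steps)
```

On this toy problem the control converges to the upper bound, the dual variable of the active constraint converges to a finite positive value, and the dual variable of the inactive constraint goes to zero.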
We see that the Newton iterations present the expected excellent convergence properties and that towards the end of the iterations only a single Newton step is necessary for staying very close to the central path (as is suggested by $\|F_i\| < F_{tol}$ for these iterations). However, we may be concerned with the conditioning of the Hessian (which impacts the conditioning of the operator $\mathcal{G}$, as was observed in section 4.4.8). Since we used the Primal-Dual formulation of the method, we argued that we should not have a badly conditioned Hessian as $\mu_k \to 0$. To test this in the current problem, we present on the right-hand side of Figure 4-2 the convergence of the algorithm for the last barrier parameter iteration, that is, for the smallest value of $\mu_k$ (in this case $\mu_k = 10^{-9}$). Compared to the unconstrained case (Figure 3-7), we see that the algorithm was effective even as $\mu_k \to 0$, converging in fewer than 30 iterations. Since the conditioning of $\mathcal{G}$ is essential for the effectiveness of the TRCG algorithm, we emphasize the importance of using Primal-Dual methods in conjunction with TRCG to solve such optimal control problems.

Linear, constrained problems addressed here are far more common in engineering applications than those of chapter 3. However, the generality of the TRCG algorithm lends it to an even broader class of problems if certain adaptations are made. The following chapter addresses application of the method to nonlinear, constrained optimal control problems.

Chapter 5

Lagging Procedure - Nonlinear, Constrained Problems

5.1 Motivation

We have presented in the previous chapters a method for solving linear, constrained optimal control problems. In this chapter, we would like to extend the method to a broader class of engineering problems. In particular, we present an extension to a specific class of problems: those with nonlinearities in the state variables.
Considering the examples that have been presented in chapters 3 and 4, it may be of interest to include problems that allow for radiative heat transfer. For example, the "reaction surface" of Figure 3-1 may be exposed to an environment with which heat radiation is exchanged. This addresses a more realistic set of problems since it is unlikely that either Neumann or Dirichlet boundary conditions could be imposed on this surface. Radiative heat transfer on a surface is given by the Stefan-Boltzmann law

\[ \dot{q}_{rad} = \epsilon \sigma \left( y^4 - y_s^4 \right), \tag{5.1} \]

where $\dot{q}_{rad}$ [W/m$^2$] is the resulting heat flux, $\epsilon \in [0, 1]$ is the surface emissivity, $\sigma = 5.67 \times 10^{-8}$ W/m$^2\cdot$K$^4$ is the Stefan-Boltzmann constant, $y$ [K] is the surface temperature, and $y_s$ [K] is the surrounding (ambient) temperature. The nonlinearity is apparent from the quartic dependence of the heat flux on the surface temperature.

This radiative law will appear in the example presented in the form of a boundary condition. Detailed treatment of the incorporation of this condition in the model and the FEM formulation will be addressed in section 5.8. For now, we state that such a nonlinearity will appear in the spatially-discretized ODEs as

\[ \dot{y} = Ay - Kf(y) + Bu + F, \tag{5.2} \]
\[ y(t_0) = y_0, \tag{5.3} \]

where $y$, $A$, $B$, $F$, $y_0$ are defined as before, $K : Y \to Y$ is a linear mapping (that may, for example, incorporate boundary terms (5.1)), and we make use of the notation $f(y) = [f(y_1) \ f(y_2) \ \cdots \ f(y_N)]$. The function $f(y)$ is taken to be nonlinear and is separated from the linear part $Ay$ for clarity in the algorithm. In the example of radiative heat transfer, we would have $f(y) = y^4$, a fourth-order polynomial nonlinearity.

5.2 Problem Statement

In the interest of addressing general problems, we present the mathematical statement here with constrained controls:

\[
\begin{aligned}
\min_{u \in U} \ & J[y(u)] \\
\text{s.t.} \ & \dot{y} = Ay - Kf(y) + Bu + F, \quad y(t_0) = y_0, \\
& c_1 = u(t) - u_{min} \ge 0, \\
& c_2 = u_{max} - u(t) \ge 0.
\end{aligned}
\tag{NLQP$_U$}
\]

5.3 Optimality Conditions for the Constrained NLQP Problem

Similarly to the treatment of section 4.3, we can state the optimality conditions of the problem:

\[ \dot{y}^* = Ay^* - Kf(y^*) + Bu^* + F, \qquad y^*(0) = y_0, \tag{5.4} \]
\[ -\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R) - K\nabla f(y^*)\lambda^*, \qquad \lambda^*(t_f) = W_T(y^*(t_f) - y_T), \tag{5.5} \]
\[ 0 = W_U u^* + B^T\lambda^* + \nabla_u c_1^* \nu_1^* + \nabla_u c_2^* \nu_2^*, \tag{5.6} \]

where

\[
\nu^*_{1(j)} \begin{cases} \le 0, & \text{if } c^*_{1(j)} = 0; \\ = 0, & \text{if } c^*_{1(j)} > 0, \end{cases}
\qquad
\nu^*_{2(j)} \begin{cases} \le 0, & \text{if } c^*_{2(j)} = 0; \\ = 0, & \text{if } c^*_{2(j)} > 0. \end{cases}
\tag{5.7}
\]

We note that a new term, nonlinear in $y$, is present in equation (5.5). In the spirit of IPM, we can also state the optimality conditions for the Primal-Dual logarithmic barrier problem:

\[ \dot{y}^* = Ay^* - Kf(y^*) + Bu^* + F, \qquad y^*(0) = y_0, \tag{5.8} \]
\[ -\dot{\lambda}^* = A^T\lambda^* + W_R(y^* - y_R) - K\nabla f(y^*)\lambda^*, \qquad \lambda^*(t_f) = W_T(y^*(t_f) - y_T), \tag{5.9} \]
\[ 0 = W_U u^* + B^T\lambda^* - z_1^* + z_2^*, \tag{5.10} \]
\[ C_1^* z_1^* = \mu_k e, \tag{5.11} \]
\[ C_2^* z_2^* = \mu_k e. \tag{5.12} \]

Since this is the useful form from the algorithmic point of view, we deal with it directly here.

5.4 Linearization of Optimality Conditions

5.4.1 Naive Implementation

It is natural to consider Newton iterations to address the nonlinear term $K\nabla f(y)\lambda$ in the same fashion that was used for the nonlinear barrier terms $\mu_k (C^*)^{-1} e$ in the implementation of IPMs. Each Newton step $\Delta p_i = [\Delta u_i^T \ \Delta\lambda_i^T \ \Delta y_i^T \ \Delta z_i^T]^T$ (with $\Delta z_i$ collecting $\Delta z_{1,i}$ and $\Delta z_{2,i}$) would then be calculated by

\[ \tilde{K}_{PD}\, \Delta p_i = \tilde{f}_{PD}, \tag{5.13} \]

with

\[
\tilde{K}_{PD} =
\begin{bmatrix}
W_U & B^T & 0 & -I & I \\
B & 0 & (\tilde{A} - \partial_t) & 0 & 0 \\
0 & (\tilde{A}^T + \partial_t) & \tilde{W}_R & 0 & 0 \\
Z_{1,i} & 0 & 0 & C_{1,i} & 0 \\
Z_{2,i} & 0 & 0 & 0 & C_{2,i}
\end{bmatrix},
\]

where $\partial_t$ denotes differentiation in time, $\tilde{A} = A - K\nabla f(y_i)$, and $\tilde{W}_R = W_R - K\nabla^2 f(y_i)\lambda_i$.

At this point, a generic algorithm that might try to invert $\tilde{K}_{PD}$ would run into a substantial difficulty. The term $\tilde{W}_R$ is not necessarily SPD for arbitrary combinations of $(W_R, K, \nabla^2 f(y_i), \lambda_i)$. In our example, in which $f(y) = y^4$, the matrix $\tilde{W}_R = W_R - 12K[\mathrm{diag}(y_i)]^2\, \mathrm{diag}(\lambda_i)$ can easily become non-SPD depending on the iterates $y_i$ and $\lambda_i$. If the solution procedure ventures into regions where this is the case, then the SPD property of $\tilde{K}_{PD}$ is compromised and efficient numerical algorithms such as conjugate gradients cannot be effectively used.
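The loss of positive-definiteness described above can be seen on a very small case. The sketch below is illustrative only: the two-node sizes, the diagonal choice of $K$, and the values of $y_i$ and $\lambda_i$ are assumptions made for the demonstration and are not data from the heat transfer example. It forms $W_R - 12K[\mathrm{diag}(y_i)]^2\,\mathrm{diag}(\lambda_i)$ for $f(y) = y^4$ and tests whether the result is SPD.

```python
import numpy as np

def lagged_weight(W_R, K, y, lam):
    """Return W_R - 12 K diag(y)^2 diag(lam), i.e. the Newton-linearized
    regulator weight for the quartic nonlinearity f(y) = y^4."""
    return W_R - 12.0 * K @ np.diag(y**2) @ np.diag(lam)

def is_spd(A, tol=1e-12):
    """True if A is symmetric with strictly positive eigenvalues."""
    if not np.allclose(A, A.T):
        return False
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

# Hypothetical two-node data (assumed for illustration only).
W_R = np.eye(2)
K = 1e-7 * np.eye(2)
y = np.array([500.0, 400.0])        # temperatures in K
lam_small = np.array([0.1, 0.1])    # small adjoint iterate
lam_large = np.array([10.0, 10.0])  # larger adjoint iterate

print(is_spd(lagged_weight(W_R, K, y, lam_small)))  # weight stays SPD
print(is_spd(lagged_weight(W_R, K, y, lam_large)))  # weight is no longer SPD
```

With these assumed numbers the diagonal entries are $1 - 0.03$ and $1 - 0.0192$ for the small adjoint, but $1 - 3$ and $1 - 1.92$ for the large one, so the same weighting matrix is SPD at one iterate and indefinite at another.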
If we try to implement the above idea in the TRCG algorithm, we would have to solve a system of the form

\[ \Delta\dot{y}_i = \tilde{A}\Delta y_i - Q_i \Delta\lambda_i + F_i, \qquad \Delta y_i(0) = y_0 - y_i(t_0), \tag{5.14} \]
\[ -\Delta\dot{\lambda}_i = \tilde{A}^T\Delta\lambda_i + \tilde{W}_R \Delta y_i + f_i, \qquad \Delta\lambda_i(t_f) = W_T(\Delta y_i(t_f) - \tilde{y}_{T,i}), \tag{5.15} \]

where $\tilde{A}$ and $\tilde{W}_R$ are defined as above, and all other variables are defined as in equations (4.61)-(4.63). Following the procedure of chapter 3, we would not be able to implement the TRCG algorithm because the operation

\[ ((v, w)) := v(t_f)^T W_T\, w(t_f) + \int_{t_0}^{t_f} v(t)^T \tilde{W}_R\, w(t)\, dt \]

cannot be defined as an inner product for non-SPD $\tilde{W}_R$. It is possible, however, to sacrifice the quadratic convergence of the Newton algorithm for a valid inner-product form.

First, we make a distinction between iterations used for the logarithmic barrier system and those used for the nonlinearities presented in this chapter. The problem posed by the Newton projection method with nonlinearities is to solve the system

\[ \Delta\dot{y}_i = \tilde{A}\Delta y_i - Q_i \Delta\lambda_i + F_i, \qquad \Delta y_i(0) = y_0 - y_i(t_0), \tag{5.16} \]
\[ -\Delta\dot{\lambda}_i = \tilde{A}^T\Delta\lambda_i + (W_R - K\nabla^2 f(y_i)\lambda_i)\Delta y_i + f_i, \qquad \Delta\lambda_i(t_f) = W_T(\Delta y_i(t_f) - \tilde{y}_{T,i}), \tag{5.17} \]
\[ \Delta u_i = -(N_i)^{-1}\left(B^T\lambda_i + B^T\Delta\lambda_i + g_i\right), \tag{5.18} \]

for $\Delta p_i$. This is equivalent to solving problem (5.13) with possibly non-SPD $\tilde{K}_{PD}$. If we have an algorithm that can robustly solve this system, the IPM iterations can proceed. In what follows, we present a robust algorithm for solving such a system.

5.4.2 Proposed Algorithm - Separation of Parts

We begin by separating parts in a procedure analogous to that presented in chapter 3. Define $(\Delta y_I, \Delta\lambda_I)$ and $(\Delta y_H, \Delta\lambda_H)$ such that $\Delta y_i = \Delta y_I + \Delta y_H$ and $\Delta\lambda_i = \Delta\lambda_I + \Delta\lambda_H$. A crucial difference here, however, is that all nonlinear terms in the stationary conditions are placed in the inhomogeneous equations:

\[ \Delta\dot{y}_I = \tilde{A}\Delta y_I - Q_i \Delta\lambda_I + F_i, \qquad \Delta y_I(0) = y_0 - y_i(t_0), \tag{5.19} \]
\[ -\Delta\dot{\lambda}_I = \tilde{A}^T\Delta\lambda_I - K\nabla^2 f(y_i)\lambda_i \Delta y_i + f_i, \qquad \Delta\lambda_I(t_f) = -W_T\, \tilde{y}_{T,i}, \tag{5.20} \]
\[ \Delta\dot{y}_H = \tilde{A}\Delta y_H - Q_i \Delta\lambda_H, \qquad \Delta y_H(0) = 0, \tag{5.21} \]
\[ -\Delta\dot{\lambda}_H = \tilde{A}^T\Delta\lambda_H + W_R \Delta y_i, \qquad \Delta\lambda_H(t_f) = W_T \Delta y_i(t_f). \tag{5.22} \]

We first note the immediate advantage of the above: the homogeneous system (5.21, 5.22) now takes exactly the form of chapter 3, and so the operator $\mathcal{G}$ can now be safely applied in the space defined by the inner product

\[ ((v, w)) := v(t_f)^T W_T\, w(t_f) + \int_{t_0}^{t_f} v(t)^T W_R\, w(t)\, dt. \]

Unfortunately, we are not yet ready to apply the TRCG algorithm due to a difficulty that has been introduced: system (5.19, 5.20) now depends on $\Delta y_i$, and is, as a result, no longer decoupled in time. We propose to use a known approximation for the nonlinear term $K\nabla^2 f(y_i)\lambda_i \Delta y_i$ in equation (5.20), thus decoupling the system in time and allowing for the use of the TRCG algorithm. In solving for the inhomogeneous terms, we solve the uncoupled system

\[ \Delta\dot{y}_I = \tilde{A}\Delta y_I - Q_i \Delta\lambda_I + F_i, \qquad \Delta y_I(0) = y_0 - y_i(t_0), \tag{5.23} \]
\[ -\Delta\dot{\lambda}_I = \tilde{A}^T\Delta\lambda_I - K\nabla^2 f(y_i)\lambda_i\, \widehat{\Delta y}_i + f_i, \qquad \Delta\lambda_I(t_f) = -W_T\, \tilde{y}_{T,i}, \tag{5.24} \]

where $\widehat{\Delta y}_i$ is an approximation of $\Delta y_i$. Given a method for determining $\widehat{\Delta y}_i$, system (5.23, 5.24) becomes uncoupled in time, allowing for the calculation of $\Delta y_I$ and the subsequent use of the TRCG algorithm.

We must therefore finally propose a method for determining $\widehat{\Delta y}_i$. We do so iteratively: given an index $r$ and an iterate $\Delta y_{i,r}$, solve equations (5.23, 5.24) with

\[ \widehat{\Delta y}_i = \Delta y_{i,r}. \tag{5.25} \]

We denote the result of this operation $\Delta y_{I,r}$. With this value, we can use the TRCG algorithm to solve

\[ \mathcal{G}\, \Delta y_{i,r+1} = \Delta y_{I,r} \tag{5.26} \]

for the next iterate $\Delta y_{i,r+1}$. These iterations can then be repeated until some stopping criterion, $\|\Delta y_{i,r+1} - \Delta y_{i,r}\| < \epsilon$ for example, is observed. The crucial point is that since $\mathcal{G}$ is SPD in the inner-product space defined above, the TRCG algorithm can be used without modifications to effectively solve system (5.13) for $\Delta y_i$.

5.4.3 Initializing the Algorithm

In order to execute the algorithm proposed above, an initial guess must be specified. Here we address the final issue of choosing a suitable $\Delta y_{i,0}$ ($r = 0$). Suppose we choose $\Delta y_{i,0} = 0$ to start the algorithm.
Then, solving the system

\[ \mathcal{G}\, \Delta y_{i,1} = \Delta y_{I,0} \]

for $\Delta y_{i,1}$ provides us with an initial, first-order approximation of $\Delta y_i$. As we have shown in previous chapters, a solution for the above equation exists, is unique, and can be found by the TRCG algorithm. This is also true of each subsequent iteration ($r = 1, 2, 3, \ldots$) regarding the solution of the system (5.26). Thus every Newton step $\Delta p_i$ can be calculated assuming the above procedure converges: $\Delta y_{i,r} \to \Delta y_i$ as $r \to \infty$. Here we make this assumption and show that it holds for our heat radiation example. In addition, we note that the algorithm can only be considered efficient if the iteration converges quickly; that is, $r$ must not be too large before we attain sufficient convergence.

5.5 Sufficient Convergence for Lagging Procedure

It was noted in section 4.4.7 that exact convergence of the Newton iterations for IPMs is not necessary for convergence of the full algorithm. The interpretation in that section was that the solution need not move exactly along the central path for convergence to the true solution $u^*$. Here we are faced with a similar situation. Define the central path of the lagging procedure as the exact solution of the stationary conditions (5.8)-(5.12) for all positive $\mu_k \in \mathbb{R}$. Again, we are not interested in the exact value of $\Delta p_i$ for any given Newton iteration $i$, but only in obtaining good enough estimates of these values along the central path as we approach the true solution of the problem. Therefore, it may be postulated that, similarly to IPMs, our lagging procedure need not produce estimates $\widehat{\Delta p}_i$ that have converged fully to $\Delta p_i$ of system (5.13). We further postulate that, much like in the case of IPMs, our estimates can be rather far from the central path and still allow for convergence. Here we have applied a very simple rule to take advantage of this feature: take $r_{last} = R$, where $R$ is a small integer, typically less than 5.
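The structure of the lagging iteration with a fixed count $R$ can be sketched on a finite-dimensional analogue. This is not the TRCG operator itself: the matrices `G`, `E` and the vector `rhs` below are hypothetical stand-ins, with `G` playing the role of the SPD operator and `E` the role of the possibly indefinite term that is lagged onto the right-hand side. Each pass solves an SPD system by conjugate gradients using the previous iterate, starting from zero.

```python
import numpy as np

def conjugate_gradient(G, b, tol=1e-12, max_iter=200):
    """Plain conjugate gradient for an SPD matrix G, starting from zero."""
    x = np.zeros_like(b)
    r = b - G @ x
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:
            break
        Gp = G @ p
        alpha = rs / (p @ Gp)
        x = x + alpha * p
        r = r - alpha * Gp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def lagged_solve(G, E, rhs, R):
    """Approximate the solution of (G + E) dy = rhs by R lagging passes:
    G dy_{r+1} = rhs - E dy_r, with dy_0 = 0."""
    dy = np.zeros_like(rhs)
    for _ in range(R):
        dy = conjugate_gradient(G, rhs - E @ dy)
    return dy

# Hypothetical data: G is SPD, E is a symmetric but non-SPD perturbation.
G = np.array([[4.0, 1.0], [1.0, 3.0]])
E = np.array([[-0.4, 0.1], [0.1, -0.3]])
rhs = np.array([1.0, 2.0])

exact = np.linalg.solve(G + E, rhs)
errors = [np.linalg.norm(lagged_solve(G, E, rhs, R) - exact) for R in (1, 3, 12)]
print(errors)  # error shrinks as the number of lagging passes grows
```

In this analogue the iteration is a stationary splitting method, so it converges only when the lagged term is small relative to `G` (spectral radius of $G^{-1}E$ below one); the assumed data satisfy this, which is the finite-dimensional counterpart of the convergence assumption made in section 5.4.3.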
In section 5.8 we present an example of the fully implemented algorithm where this simple rule has been successfully used.

5.6 Determining the Newton Step

Having determined an appropriately "close" approximation $\widehat{\Delta y}_i$ of $\Delta y_i$, we may proceed to find an approximation of the Newton step $\Delta p_i$, which we denote $\widehat{\Delta p}_i$:

\[ K_{PD}\, \widehat{\Delta p}_i = \tilde{f}_{PD}, \tag{5.27} \]

where now

\[
K_{PD} =
\begin{bmatrix}
W_U & B^T & 0 & -I & I \\
B & 0 & (\tilde{A} - \partial_t) & 0 & 0 \\
0 & (\tilde{A}^T + \partial_t) & W_R & 0 & 0 \\
Z_{1,i} & 0 & 0 & C_{1,i} & 0 \\
Z_{2,i} & 0 & 0 & 0 & C_{2,i}
\end{bmatrix},
\]

as in section 4.4.8, with $\partial_t$ denoting differentiation in time. The above fully determines $\widehat{\Delta p}_i$ since the right-hand side $\tilde{f}_{PD}$ now includes the approximation term $\widehat{\Delta y}_i$.

5.7 NL-IPM-TRCG Algorithm

Algorithm NL-IPM-TRCG
  Set $k = 0$;
  Set $\mu_{k=0} = M$ ($M$ large);
  Set $u_{k=0}$ at the analytical center of $U$;
  Calculate $p_{k=0}(u_0)$;
  while ($\mu_k > \mu_{tol}$) do
    while ($\|F_i\| > F_{tol}$) do
      $\Delta y_{i,r=0} = 0$;
      for $r = 0, \ldots, R-1$ do
        $\widehat{\Delta y}_i = \Delta y_{i,r}$;
        Solve (5.23, 5.24) for $\Delta y_{I,r}$;
        Solve (5.26) by TRCG for $\Delta y_{i,r+1}$;
      end do;
      Solve (5.27) for $\widehat{\Delta p}_i$;
      Solve for $\alpha_i$ and $\beta_i$ by Armijo rule and (4.65), respectively;
      $p_{i+1} = p_i + (0.995)\,\beta_i \alpha_i \widehat{\Delta p}_i$;
    end do;
    if ($\mu_k > \mu_{tol}$) then
      $k = k + 1$;
      Solve for $\mu_k$ by (4.64);
    end if;
  end do

Figure 5-1: Diagram of sample nonlinear heat transfer problem domain (7 cm x 3 cm).

5.8 Example Problem: Nonlinear, Constrained 2D Heat Transfer

5.8.1 Problem Statement

We now address a specific nonlinear problem governed by partial differential equations. In particular, we consider radiative heat transfer, where the geometry is similar to the one considered in the problems of previous chapters. Take the domain shown in Figure 5-1, where the reaction surface $\Gamma_{RS}$ is now exposed to an environment at temperature $y_s$. Rather than imposing a Neumann condition on this boundary, we allow heat exchange through radiation to occur. This process is nonlinear and governed by the Stefan-Boltzmann law (5.1).
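For a sense of scale, the Stefan-Boltzmann law (5.1) and its temperature derivative can be evaluated directly. The snippet below is a small sketch: the function names, the unit emissivity, and the sample temperatures of 500 K and 300 K are assumptions chosen for illustration, not values taken from the example problem.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiative_flux(y, y_s, emissivity=1.0):
    """Net radiative heat flux [W/m^2] from a surface at temperature y [K]
    to surroundings at temperature y_s [K], following law (5.1)."""
    return emissivity * SIGMA * (y**4 - y_s**4)

def radiative_flux_derivative(y, emissivity=1.0):
    """Derivative of the flux with respect to surface temperature,
    4 * emissivity * sigma * y^3 [W/(m^2 K)]."""
    return 4.0 * emissivity * SIGMA * y**3

# Assumed sample temperatures: surface at 500 K, surroundings at 300 K.
print(radiative_flux(500.0, 300.0))        # about 3084.5 W/m^2
print(radiative_flux_derivative(500.0))    # about 28.35 W/(m^2 K)
```

The cubic growth of the derivative is the reason the linearized terms built from $\nabla f$ and $\nabla^2 f$ vary strongly with the current temperature iterate.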
The governing equations for this process can thus be expressed as: 125 M Em(X) + = V -(a(x)VQ) m=1 in Q, (5.28) in Q X (to, tf ), (5.29) FN X (tof), (5.30) V - f = 0 on Q-n=&(Q4 - y4) on EPRS X (tO, tf), (5.31) 9=300 K on F D X (to, tf), (5.32) where all definitions and properties are as in previous chapters. We add here that we take & = 5 x 10-7 /m 2 K3 for the following examples. This value is unrealistically high for typical engineering materials, making the effect of the nonlinear term more pronounced than may be expected to test the proposed algorithm. 5.8.2 FEM Formulation In stating the above problem in the FEM context, we recall the spaces defined in section 3.11. To preserve desirable properties of the stiffness matrix, we deal directly with the time-discretized form of the problem and treat the nonlinearity explicitly. By doing so we may state the problem governing equations as (5.33) y= o; M Y) At Ay' - f (y'-1) + Buf + F, (5.34) where f(y-I. - (&(( e- 1 4 ) - y4), 0j). (5.35) Having defined the problem as such, and following the procedures of section 3.11, we may easily derive the stationarity conditions: 126 (5.36) MY0 = YO; M =YAyf - f (y'-1) + But + Fe, (M - AtA)T AL = WT(Y M ( At ) (5.38) - YT) + WR(YL - yR)At; ATAe + WR(y' nt= -W (5.37) - (5.39) y') - Vf (y')Ae+, (5.40) BTAe, where we define the matrix Vf (ye) (5.41) 4 diag[(&( t) 3 ,$)], with (V, A) = (V, 0j).- Finally, in order to pose the Newton projection problem, we state the variations in stationarity conditions as AY9 = Yo M (AY (M - AtA M (AAe WT(AYL AAL (5.42) Y, (5.43) A A' + Ay- T,i) + WR(AYL - i)At; (5.44) ) = (A')TAA + (W-R V 2 f (yI)AI)Ay + (5.45) (%) - 1(BT A + BT AA' + g), (5.46) Au where we note that Ef and - = absorb the explicit nonlinearity terms, and (5.47) A' = A + Vf (yf)l, V2f (y ) = 12 diag[(&()2 127 ). (5.48) Optimal control history u (t) 1500 ... ..... 
Figure 5-2: Optimal control and $\Gamma_{RS}$ temperature histories for nonlinear constrained problem, $J^* = 1.42 \times 10^8$. (Top panel: optimal control history $u(t)$; bottom panel: average temperature at reaction surface $\Gamma_{RS}$; horizontal axis: $t$ (s).)

5.8.3 General Results

Having thus defined the problem, we may safely apply the NL-IPM-TRCG algorithm to the heat transfer process of Figure 5-1. We preserve all the problem data given in sections 3.12.1 and 4.6.1. Figure 5-2 presents the solution of optimal control and state histories.

We note that the nonlinearity drastically changes the nature of the solution in comparison to that of section 4.6.1. The heat lost through the radiative surface causes the temperatures at $\Gamma_{RS}$ to be generally lower throughout the process than was observed with Neumann boundary conditions. In fact, increasing the cost for these deviations will not impact this portion of the solution, since the controllers saturate in the early part of the process and cannot drive the system to the desired temperature as fast as in the former situation. We note that if hard bounds were not imposed on the problem, early control values would increase far beyond practical limitations when driven by such strong nonlinearities. This example demonstrates the strength of the TRCG algorithm in that it allows for effective incorporation of both nonlinearities and hard bounds, making it very practical for engineering problems.

Figure 5-3: Newton and last conjugate gradient convergence of NL-IPM-TRCG algorithm. (Left panel: Newton residual convergence; right panel: last TRCG residual convergence; horizontal axes: iteration.)

5.8.4 Numerical Performance

We chose $R = 3$ for the NL-IPM-TRCG algorithm, so that each circle shown on the left plot of Figure 5-3 corresponds to 3 iterations to find $\widehat{\Delta p}_i$.
Comparing Figures 4-2 and 5-3, we conclude that the deviation from the central path introduced by the nonlinearity has a minimal effect on the convergence of Newton iterations, even when this small number of lagging iterations is used. New values of the barrier parameter $\mu_k$ were calculated at iterations {1, 6, 11, 15, 17, 20, 21, 22, 23, 24}. From the right plot of the figure, we note that the last TRCG calculation is still well conditioned, as expected for Primal-Dual methods, converging in only 15 iterations.

The power of the method thus lies in the fact that IPMs tend to be forgiving of deviations from the central path. A small number of lagging iterations (3 or 4) is therefore all that is required to achieve sufficient convergence. The conditioning of the problem will not degrade as $u_k \to u^*$ if Primal-Dual IPMs are used, guaranteeing that the TRCG calculations will converge quickly throughout the process. These features, in conjunction with the stability of the central SPD operator $\mathcal{G}$, lead to a very efficient overall algorithm.

Chapter 6

Concluding Remarks

6.1 Summary of Contributions

We presented in this work a method for solving optimal control problems with quadratic cost, nonlinear state equations, and constrained controls. The core of the algorithm was developed for unconstrained LQP, but its flexibility allowed for extensions to more practical problems by application of IPMs and a lagging technique. Though developed primarily in the context of ODEs, the goal of this work was to solve problems governed by first-order parabolic partial differential equations. We showed that the method could be very effectively extended to address these problems since the central requirement that $\mathcal{G}$ be SPD, stable, and well-conditioned was not compromised by a FEM discretization, as would have been the case, for example, if shooting techniques were employed.
We presented an engineering (heat transfer) problem as an example of the efficiency of the method, and showed that its numerical performance was very favorable at all levels: conjugate gradient, Newton projection, and lagging iterations.

The TRCG algorithm is derived from the idea of a state variable operator in the spirit of HUM. Our formulation differs from HUM in that the redefined problem (which can be viewed as a statement of the dual) is minimized over a space defined by a problem-specific inner product. In this space, we showed that the $\mathcal{G}$ operator is symmetric positive-definite, allowing for the solution of terminal and regulator problems by a conjugate gradient-based method. In addition, $\mathcal{G}$ is shown to be well-conditioned, thus allowing the method to converge quickly and efficiently.

The most costly part of the algorithm is the action of $\mathcal{G}$. Therefore, problems that are characterized by dynamical equations with sparse matrices can take advantage of this sparsity. For FEM discretizations of parabolic partial differential equations, an initial-value problem can be solved with $O(LN)$ operations, where $N$ and $L$ are the number of spatial and temporal nodes, respectively. The action of $\mathcal{G}$ is twice this cost, and, since 20-30 iterations of the conjugate gradient algorithm are required, the entire problem is solved at the cost of roughly 40-60 initial-value problems, that is, roughly twice the order of magnitude of a single initial-value problem.

6.2 Possible Pitfalls

A certain amount of care must be taken in implementing the TRCG algorithm. Pitfalls, which often arise in the discrete statement of the problem, can be avoided if the following precautions are taken.

There is no way of predicting the correct form of the terminal conditions for $\lambda$ if the time-discretized form of the stationarity conditions is not derived directly from the discretized cost.
Though an incorrect terminal condition introduces only $O(\Delta t)$ error in the solution, it will likely destroy the SPD property of $\mathcal{G}$, possibly compromising conjugate gradient iterations. Therefore, it is recommended that time-discrete stationary conditions be derived from the discretized cost functional, and that the SPD property of $\mathcal{G}$ be verified by a proof similar to the one found in section 3.8 with an appropriate, discrete inner product.

The spatial discretization of the problem in the FEM context introduces the mass matrix on the left-hand side of the stationary conditions. We observed that as long as we include this matrix in the definition of the operator $\mathcal{R}$, no modifications need to be made to the algorithm. In fact, the left-hand side of these equations can be multiplied by any invertible symmetric matrix if we define $\mathcal{R}$ accordingly.

Finally, we note that it is best to use Primal-Dual variants of IPMs. This guarantees that $\mathcal{G}$ is well-conditioned throughout the solution process. Though Primal methods may be easier to implement and may work for some problems, they cannot guarantee this important property of $\mathcal{G}$ in general.

6.3 Conclusions

The examples presented in this work demonstrated the predicted effectiveness of the method. The operator $\mathcal{G}$ was shown to be central to the efficiency of the algorithm, due to the fact that it is stable, well-conditioned, and SPD in an appropriate inner-product space. In addressing more general problems, we noted that IPMs provided a way of guiding the solution to $u^*$ without compromising this operator. More general nonlinear problems can very efficiently be addressed in this context, since IPMs allow for deviations from the central path introduced by nonlinearities. In fact, we propose that for unconstrained nonlinear problems, it would be advantageous to apply fictional limits on $u$ beyond expected values and employ the NL-IPM-TRCG algorithm to exploit this guiding behavior of the method.
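The precaution of section 6.2, that the SPD property of the discrete operator be verified in the appropriate inner product, can be complemented by a quick numerical spot check on a small discretization. The sketch below is an assumption-laden illustration rather than part of the method: it supposes the discrete operator and the inner-product weight are available as dense matrices `G` and `W` (hypothetical names), with $((v, w)) = v^T W w$. Under that assumption, $\mathcal{G}$ is symmetric in the weighted inner product exactly when $WG$ is a symmetric matrix, and positive-definite when the eigenvalues of $WG$ are positive.

```python
import numpy as np

def is_spd_in_inner_product(G, W, tol=1e-10):
    """Check that G is symmetric positive-definite with respect to the
    inner product ((v, w)) = v^T W w, i.e. that W @ G is an SPD matrix."""
    WG = W @ G
    if not np.allclose(WG, WG.T, atol=tol):
        return False  # ((v, G w)) != ((G v, w)) for some v, w
    return bool(np.all(np.linalg.eigvalsh(WG) > tol))

# Hypothetical small example: G = I - R, where W @ R = -P is symmetric
# negative definite, so G is SPD in the W inner product even though G
# itself is not a symmetric matrix.
W = np.diag([2.0, 1.0])
P = np.array([[1.0, 0.5], [0.5, 1.0]])
R = -np.linalg.inv(W) @ P
G_good = np.eye(2) - R

# A perturbed operator, mimicking the effect of an inconsistent
# terminal condition, loses symmetry in the weighted inner product.
G_bad = G_good + np.array([[0.0, 1.0], [0.0, 0.0]])

print(is_spd_in_inner_product(G_good, W))  # symmetric and positive
print(is_spd_in_inner_product(G_bad, W))   # symmetry is lost
```

Such a check does not replace the proofs referred to above, but a failure on a coarse grid is an inexpensive early warning that the discrete terminal condition or inner product has been defined inconsistently.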
133 134 Appendix A Additional Time-Discretization Schemes for TRCG Here we present two additional time-discretization schemes that may be used with TRCG: CrankNicholson and Second Order Backward Difference Formulas. The method is shown here to be applicable for these schemes since the main results of chapter 3 hold given appropriate definitions of the R operator and ((., .)) inner-product. As suggested in that chapter, we approach the derivation from the cost functionals. The results in this appendix show that such an approach leads to the appropriate form of operators for general time-discretizations. As a result, TRCG can accommodate any time-discretization scheme provided care is taken in developing R and ((., -)). For completeness, we present the below in the context of logarithmic barrier functions of chapter 4. This is done to explicitly show the form of augmented cost functionals as new schemes are introduced. Extensions to the lagging procedure for nonlinear problems are trivial. 135 A.1 Crank-Nicholson A.1.1 Cost Functional Definition P[ ] = (' T-) T(Y - T) + 2 ( Wu-1/2)WU(ui/2)At f=1 + (2w- )At/2 + 1 R L-1 T WR( _ 2 4) 2 where m,q = (M1/2 A.1.2 - T WR(L -- )At/2 -PE - )At (A.1) M L E ln( 'Mq)At q=1 f=1 m=1 ft ), and xe1/2 is defined as the value of x at t = (f - 1/2)At. Optimality Conditions must be defined. For Crank-Nicholson, Cq = diag(ut1/2_ Before applying a particular scheme, C I q ut). We then reintroduce the state and adjoint equations: Y0 YY At AL = WT(YL A (A.2) YO, A+iAT IA(y' + yt~1) + Bu f-1/2 + F, 2 - (A.3) (A.4) YT), (A' + A'+) + WR (y' - y') 0 = W±U-1/2 + BT A' - - (Cin where equations (A.3-A.6) are used for f = 1,... , L. 136 + Ciax1) e, (A.5) (A.6) A.1.3 TRCG Components First, the discrete state-adjoint equations are put into linearized form and separated into the inhomogeneous and homogeneous parts: (A.7) Yo = YO, f+1 - + +1 + + yf)+ - I IQt+1/2 A++A'+ 2 A(y At - t 1/2 . 
(A.9) = -WTYT, A+1 (A.8) t - At I 2 + A'I -WR(Yi Ii- 2 iAJ (A.10) +Yi and y (A.11) 0 i+1 YH YH At f+1 _ f t+1/2(A+1 (A. 12) YH y (A.13) _W =WYL, H _H T(AH+1+ A) At + IWR(y+ ±). (A.14) Similarly to the time-continuous case, equations (A.7) - (A.10) are uncoupled, and can be solved for yt, for e = 1,... , L. The second set, however, is coupled. Again, we define a RT-operator RCN that solves (A.11) and (A.12) with ALg = WTqL, - Ht+ H H At such that yH = RCNqi V {q'L 0 (A.15) + = 1A T (Ae+I± H 2 At) + 1WR(q+1 + q') (A. 16) E RNx(L+1). The problem can then be weakly stated as ((p,gCNq))CN = ((pyI))CN, 137 (A.17) where 9CNq = q - RCNq, and ((v,w))CN = (VL)TWT(W') + 2 E +1 + v )TW(w +l + wf) At, (A.18) f-1 V{v},{w} C IRNx(L+1) A.1.4 TRCG Proofs Proposition 8 The operator ((v, w))CN defines an inner-product space. Proof. Defining the norm IIvIICN = 1. ((v, W))CN 2. |IvH|CN 3. IIVIICN 4. > ((v, v N21, it is simple to show that for any v, w E JNx (L+1): ((wv))CN; 0; 0 - v = 0; I|avH| = Ia|IHV||CN, V a E JR. Proposition 9 The R-T operator 7 ZCN is symmetric negative semi-definite in the space defined by the above inner product. Proof. This proof closely follows the discussion for the time-continuous case. equations (A.12) and (A.16) in generic variables, {z1,Y1,72, pI, p2} E JRNx(L+1) with corresponding initial and final conditions: f+1 - z £ 1 z1 1z+, At _12 2 ~~1- = At + I) I Qf+1/2 (+1 1 21 + - WR(h + _Y ATy±+YD 22 + 7f) 1 +p) z = 0, WTPL Reducing the above set, (2+1)T =Y - 2+1 _ _ (1 T 2T 1/ +1 + _p 138 +11 Iy)A -W We rewrite _p )TWR(zf+l + zI)At, and adding from f = I to (L - 1), (PL)T WTZL L-1 2 _ 2 L-1 (i+ 1 + (+1 j)TQf+1/2 + 71) At - 1 ('i+ =1 2 =1 where we have applied the initial (zo = 0) and final (yL 1 +P )TWR(z+1 + z)At, = WTPL) conditions of the problem. 
Rearranging and applying the operator RCNPf = zf we obtain Z+ (p)TwRCNP L-1 _L _L /==1 L 1 + TQ+1/ 1 ±pf)TWR(7ZCNpi+l + 2 7CNP1 )At +1I)At By identifying the left-hand side of the above equation as ((P2, RCNP1))CN we see that ((P2, 7 ZCNP1))CN = (( 7 ZCNP1,P2))CN and that ((P, ZCNP))CN = LZ(2+1 - QT+1/2(7+1 + /t)At < 0, e V p} CjNx(L+1) The operator RCN is therefore SNSD in the ((-, -))CN space. E From the above, we see that 9CN (gCNP = P - RICNP) is SPD in ((-, -))CN, and therefore a unique solution q E RNx(L+1) of equation (A.17) can be found by conjugate gradient methods in the space defined by this inner product. 139 A.2 Second-Order Backward Difference A.2.1 Cost Functional Definition [ 2 i, IL 3 f-/ +2 + where = iq A.2.2 qi) - _ pf) Q0 )At/2 + -( _ T -(L WR(pL L-1L )At/2 - 1 + if12W f1/2At 2 WU3 f-1/2 - R T R5 +g(9~~. p) T) T -T' -3/2 2 -O_ 1 - T- -1/2 f=2 +I )TWT( L for f = 1 and 2fi-3/2 At (A.19) l -S)At _aTRTWR( - 2 L q=1 f=1 p E 1: / M 1n(R)L m=1 ) for - £ = 2, ... , L. Optimality Conditions For Second-Order BDF, Cq = diag(u/ - u ) for £ = diag(mum 1 /2 1, and C= £-3/2 f Um - U') q) for =2, ... ,L, and (A.20) =Y0 yo; = Ayi + Bu 1 /2 + F; y - At f-2 -2y-+ 3p 1 - (A.21) + At = Ayf + B' for + F, f = 2, ... , L; 3 A2 + WR(yt T AL At - - yn)At/2 + WT(yE 23AL-1 - 2AL = A T A L-lAt + WR(yL-1 _ 3A' - A' - 2A 2 + 1 A3 2 ATA +WR(Yf 2 = 0=- Wui'+ B TA' where fi = 1 2 / (A.23) yT); (A.24) -1); 2A'++A f+2 At - - Y), for f= L-2,...,2; P (cmfin- for f = 1 and i' = + Cia7') e, 3U-1/2 for f = 1, ... ,L, + IUI-3/2 for 2 fo£2..L 2, ... , L. 140 (A.25) (A.26) AT AlAt + WR(y' - y')At; = - (A.22) (A.27) A.2.3 TRCG Components We begin by rewriting the state-adjoint equations: 1 0 YI -YI 0 Yi= YO, 3 Py 1 f- 2yi 1 + -Q y + =- A (3y At 2 A) + 2D 3 (A.28) + 3DO, t-2 y At = Ay - QfA1 + Df (A.29) = 2, ... IL - 1), Wy, -WTYT, ALI 2AL, II AL+1 = 2I (for 3 2 AtI - 2A'+l I + 2'At+ I (A.30) 2 =A TA AtI - WRyfR (for f = L - 1, ... 
(A.31) 1), and 0 yyUH = 3 yH - 22yt- 1 + -2 H 2YH AL+l H -= 2AL 2H' 3 2 A' H At ) 3YH + 3H HA- H 2A - f (A.32) - A = WryL , 2 A'++A H 1 2 1 AtH H f+2 H (YL 2 yL-1 + WRY ATA'H (for f= (A.34) _ YL-2) (for (A.35) f= L -1,...,1), where the first set of equations is uncoupled (and can be solved for yi, for f = 1,. =2yL-1 - (A.33) 2,..., L - 1), ., L - 1, and L-2) while the second set is coupled. We define RBD as solving (A.32) and (A.33) with - 2AL AL+l H -H) 3 A' H -2Af+ H Hl = WrqL +}A+ 2H At such Th that given _qj- be. weak1 IRNxL, (qL = 2qL-1 _ qL- ) (A.36) f=L-1,...,1), (A.37) 2 2 = T AH+ s(RBDq a-2 Waq H for f (for 1, ... L-1, and (BDq)L 2 H _ Y The problem can then be weakly stated as ((pQBDq))BD = 141 ((pyI))BD, (A.38) where 9BDq = q - R)BDq, and L-1 I ((v,w))BD ( 2 vL-1 - 2 L- 2 2 (V)TWR(We) wL- ) )TWT(2L -1 V{v, w}f_- A.2.4 At, (A.39) f_2 . E NxL TRCG Proofs Proposition 10 The operator ((v, w))BD defines an inner-product space. 1/2 Proof. Defining the norm H|VHlBD = ((l)BD' tis ti simple to show that for any v, w E JNxL. 1. ((V,W))BD = ((w,v))BD; 2. |IVIIBD > 0; 3. ||Vj|BD = 0 #-4 v = 0; 4. ||avII = 1a0HV1BD, V a E 11?. Proposition 11 The R-T operator 7 ZBD is symmetric negative semi-definite relative to the above inner product. Proof. Rewriting equations (A.33) and (A.37) in generic variables, {zi,71,72,P1,P2} c JJNxL 3- 2 '- 2 2z + 2z + 1 -2 = AzIAt - Q yfAt 2 reducing, fT t 1 tIT -2Qy -y 1 £ 1 jT t-2 z 1 ) + -Y 2 Z1 - -Y t±2 T f) 142 jt 1 2W A and summing from f = 2 to (L - 1), we obtain (-2_y2 + 1/2y )T Z + LT( 2 ZL-1 - L-1 =f = Q=2At 0 has been used. WTpJ; (RBDP)L (-2y2 + 1/ 27 = 7 2L+1TZL-1 1 L-1 = where z 1/2ZL- 1 - )TZ = 121 Now we note that according to (A.28) - (A.37): L+1 ,L 2 zL- ; and - T (AtA - 3/21)(AtA - 3/2I)- Q 17 At = iTQ fAt. 
Substituting these terms where appropriate, and rearranging, we obtain: L-1 P2 L-1 z W(BDP1L 2TW(R 72 T- )BDP)At = t=2 1' f=1 By identifying the left-hand side of the above equation as ((p2, RBDP1))BD we see that ((p2,RBDP1))BD and that L-1 ((P, The operator ((RBDP1,P2))BD = RBD ZBDP))BD = - 7 Q 7 is therefore SNSD in the ((-, -))BD space. Similarly to before, the operator 9 BD Vp} c iNxL At < 0, l is SPD in ((-, -))BD, and a unique solution q E IRNxL of equation (A.38) can be found by conjugate gradient methods in the space defined by this inner product. 143 144 Bibliography [1] K. G. Arvanitis, G. Kalogeropoulos, and S. Giotopoulos. Guaranteed stability margins and singular value properties of the discrete-time linear quadratic optimal regulator. IMA Journal of Mathematical Control and Information, 18:299-324, 2001. [2] Alex Barclay, Philip E. Gill, and J. Ben Rosen. SQP methods and their application to numerical optimal control. Technical Report NA 97-3, Dept of Mathematics, University of California, San Diego, 1997. [3] D. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, 2nd edition, 1999. [4] D. Bertsimas and J. N. Tsitsiklis. Introduction to Linear Optimization. Athena Scientific, Belmont, MA, 1997. [5] J. T. Betts. Survey of numerical methods for tajectory optimization. AIAA Journal of Guidance, Control, and Dynamics, 21:193-207, 1998. [6] J. T. Betts. PracticalMethods for Optimal Control Using Nonlinear Programming. Advances in Design and Control. SIAM, Philadelphia, 2001. [7] J. V. Breakwell. The optimization of trajectories. SIAM, 7:215-247, 1959. [8] A. E. Bryson and Y.-C. Ho. Applied Optimal Control. Hemisphere Publishing Corporation, revised edition, 1975. [9] Y. Cao, M. Gunzburger, and J. Turner. On exact controllability and convergence of optimal controls to exact controls of parabolic equations. Optimal Control: Theory, Algorithms, and Applications, pages 67-83, 1998. [10] B. Friedland. 
Control System Design - An Introduction to State-Space Methods. McGraw-Hill, 1986.
[11] O. Ghattas and J.-H. Bark. Optimal control of two- and three-dimensional incompressible Navier-Stokes flows. Journal of Computational Physics, 136:231-244, 1997.
[12] Philip E. Gill, Laurent O. Jay, Michael W. Leonard, Linda R. Petzold, and Vivek Sharma. An SQP method for the optimal control of large-scale dynamical systems. Journal of Computational and Applied Mathematics, pages 197-213, 2000.
[13] R. Glowinski and J. L. Lions. Exact and approximate controllability for distributed systems. Acta Numerica, pages 159-333, 1995.
[14] H. Goldberg and F. Tröltzsch. On a SQP-multigrid technique for nonlinear parabolic boundary control problems. Optimal Control: Theory, Algorithms, and Applications, pages 154-177, 1998.
[15] J.-W. He, R. Glowinski, R. Metcalfe, A. Nordlander, and J. Periaux. Active control of drag optimization for flow past a circular cylinder. Journal of Computational Physics, 163:83-117, 2000.
[16] V. F. Krotov. Global Methods in Optimal Control. Monographs and Textbooks in Pure and Applied Mathematics (195). Marcel Dekker, New York, 1996.
[17] M. K. Kwak and L. Meirovitch. An algorithm for the computation of optimal control gains for second order matrix equations. Journal of Sound and Vibration, 161:45-54, 1993.
[18] J. L. Lions. Optimal Control of Systems Governed by Partial Differential Equations. Springer-Verlag, Berlin, 1971.
[19] J. L. Lions. Exact controllability, stabilization and perturbations for distributed systems. SIAM Review, 30(1):1-68, 1988.
[20] A. Miele. Method of particular solutions for linear two-point boundary-value problems. Journal of Optimization Theory and Applications, 2(4), 1968.
[21] W. Murray. Some aspects of sequential quadratic programming methods. Large Scale Optimization with Applications, Part II: Optimal Design and Control, 93:21-35, 1997.
[22] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery.
Numerical Recipes in Fortran 77, volume 1. Cambridge University Press, second edition, 1996.
[23] S. S. Ravindran. Proper orthogonal decomposition in optimal control. Technical report, Flow Modeling and Control Branch, Fluid Mechanics and Acoustics Division, NASA Langley Research Center, internet, http://fmcb.larc.nasa.gov/-ravi.
[24] R. W. H. Sargent. Optimal control. Journal of Computational and Applied Mathematics, 124:361-371, 2000.
[25] U. Shaked. Guaranteed stability margins for the discrete-time linear quadratic optimal regulator. IEEE Transactions on Automatic Control, AC-31(2), 1986.
[26] V. Sima. Algorithms for Linear-Quadratic Optimization. Monographs and Textbooks in Pure and Applied Mathematics (200). Marcel Dekker, New York, 1996.
[27] Robert F. Stengel. Optimal Control and Estimation. Dover Publications, New York, 1993.
[28] L. N. Trefethen and D. Bau. Numerical Linear Algebra. SIAM, Philadelphia, 1997.
[29] M. H. Wright. Interior methods for constrained optimization. Acta Numerica, pages 341-407, 1992.