Rensselaer Polytechnic Institute
ECSE-4760 Real-Time Applications in Control & Communications

EXPERIMENTS IN OPTIMAL CONTROL

Number of Sessions – 4

INTRODUCTION

In modern control systems design it is sometimes necessary to design controllers that not only effectively control the behavior of a system, but also minimize or maximize some user defined criterion such as energy or time conservation, or obey other physical or time constraints imposed by the environment. Optimal Control Theory provides the mathematical tools for solving problems like these, either analytically or through computer iterative methods, by formulating the user criteria into a cost function and using the state equation representation for the system dynamics. The optimal control experiments in this lab involve the use of a PC to illustrate the efficacy of microprocessors as system controllers. Two types of control will be considered:

1) Continuous (analog)
2) Discrete (digital)

The general formulation of the optimal control problem is as follows: Let the system states (i.e. the state variables) be represented by an n-dimensional vector x, where n is the order of the system. Let the control variables (input) be represented by an m-dimensional vector u, and the system output by an r-dimensional vector y. The system can, in general, be modeled by a set of differential equations of the form:

    ẋ = f(x, u) ,   y = g(x, u) ,   x(t0) = c        (1)

where f and g are general (possibly nonlinear and time varying) expressions of x and u, t0 is the initial time and c is an n-dimensional vector of initial conditions.
The objective is to find a control input u(t) that will drive the system from the initial point c ( = x(t0) ) in the state space to the final point x(tf), and at the same time minimize a cost functional J (formally called the performance index/criterion) given by equation (2):

    J = h(x(tf), tf) + ∫[t0,tf] g(x(t), u(t), t) dt        (2)

where tf represents the end of the control interval, and h and g are user defined penalty expressions.

PART I – CONTINUOUS CONTROL

MATHEMATICAL FORMULATION – RESULTS

Even though the general problem is very hard to solve, there are very useful simplifications that lead to closed form (analytic) solutions. The most common simplification (and the one used for this experiment) is the LQR problem, where the system is Linear and the controller (Regulator) must satisfy a Quadratic cost functional. Assuming that the system is time invariant as well, the state and output equations (1) become:

    ẋ(t) = Ax(t) + Bu(t) ,   y(t) = Cx(t) + Du(t) ,   x(0) = x0        (3)

where:
    x(t) is the (n x 1) state vector,
    x0 is the initial state vector,
    u(t) is the (m x 1) control input vector,
    y(t) is the (r x 1) output vector,
    A is the (n x n) state dynamics matrix,
    B is the (n x m) control dynamics matrix,
    C is the (r x n) state-output matrix,
    D is the (r x m) input-output matrix (for all practical purposes assumed 0 hereafter).

Corresponding to the system is a performance index represented in quadratic form as:

    J(u(t), x(t), t) = (1/2)x'(tf)Hx(tf) + (1/2)∫[0,tf] { x'(t)Qx(t) + u'(t)Ru(t) } dt        (4)

where:
    H is the (n x n) terminal state penalty matrix,
    Q is the (n x n) state penalty matrix,
    R is the (m x m) control penalty matrix.

It is important to emphasize that the H, Q, and R matrices are user selectable, and it is through the proper selection of these that the various environment constraints are to be satisfied.
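To build intuition for how Q and R in the performance index (4) trade off state excursions against control effort, the integral can be approximated numerically for any simulated trajectory. The sketch below is written in Python/NumPy (purely for illustration; the lab itself uses MATLAB and LabVIEW) and uses arbitrary example matrices and an arbitrary stabilizing gain, none of which come from this experiment:

```python
import numpy as np

# Hypothetical example matrices -- NOT the lab plant; for illustration only.
A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                  # state penalty
R = np.array([[1.0]])          # control penalty
H = np.zeros((2, 2))           # terminal state penalty
G = np.array([[-1.0, -1.0]])   # an arbitrary stabilizing gain, u = G x

dt, tf = 1e-3, 20.0
x = np.array([[1.0], [0.0]])   # initial state
J = 0.0
for _ in range(int(tf / dt)):
    u = G @ x
    J += 0.5 * (x.T @ Q @ x + u.T @ R @ u)[0, 0] * dt  # running cost of (4)
    x = x + (A @ x + B @ u) * dt                       # forward Euler step
J += 0.5 * (x.T @ H @ x)[0, 0]                         # terminal cost of (4)
print(f"approximate cost J = {J:.3f}")
```

Raising entries of Q penalizes state excursions more heavily, while raising R penalizes control effort; recomputing J after such changes shows the trade-off directly.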
If H, Q, and R are positive definite and symmetric, then it can be shown that a closed loop optimal control function u(t) exists (called u*(t)), and is uniquely given by:

    u*(t) = -R⁻¹B'P(t)x(t) = G(t)x(t) ,   G(t) = -R⁻¹B'P(t)        (5)

where G(t) is the (m x n) optimal feedback gain matrix, and P(t) is an (n x n) symmetric and positive definite matrix that satisfies the continuous matrix differential Riccati equation given by:

    Ṗ(t) = -P(t)A - A'P(t) + P(t)BR⁻¹B'P(t) - Q ,   P(tf) = H        (6)

From the above it is obvious that, even for the LQR problem, the set of coupled differential equations (6) must be solved before the controller gains can be found. It should be noted that these gains are functions of time. The significance of the result of equation (5) is that the gains can be computed ahead of time, and the optimal controller is easily implemented on any digital computer. In all but the simplest real life applications, computer iterative methods are utilized to pre-calculate and store the P matrix and the gains, resulting in very efficient and robust open and closed loop controllers.

A further simplification to the above solution takes place when tf approaches ∞, or in more realistic terms when the control is applied for a very long period of time. In this case it is shown that if (1) the system is controllable, (2) H is zero, and (3) A, B, Q, and R are all constant matrices, then the P(t) matrix converges to a constant symmetric, real, positive definite matrix Pss. Obviously then Ṗ(t) is zero, and the matrix differential Riccati equation (6) is transformed into a matrix algebraic equation (called the Steady State Riccati Equation) given by:

    0 = -PA - A'P + PBR⁻¹B'P - Q        (7)

The steady state solution Pss need not be unique. However, only one real positive definite solution can exist. Since P is of dimensions (n x n), we obtain n² equations to solve for the n² components of P. Yet because P is symmetric, the number of computations is greatly reduced.
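A scalar example makes the convergence of P(t) to Pss concrete: integrating the Riccati equation (6) backward in time from P(tf) = H = 0 drives P toward the constant solution of (7). The Python sketch below uses illustrative values a = b = q = r = 1 that are not part of this experiment:

```python
import math

# Scalar illustration of equations (6) and (7): for dx/dt = a*x + b*u with
# cost integrand q*x^2 + r*u^2, the Riccati ODE is
#   dP/dt = -2*a*P + (b**2/r)*P**2 - q,   P(tf) = H = 0.
# Integrating backward from tf (tau = tf - t), P grows toward the constant
# steady-state root of 0 = -2*a*P + (b**2/r)*P**2 - q.
a, b, q, r = 1.0, 1.0, 1.0, 1.0

P = 0.0                  # terminal condition P(tf) = H = 0
dtau = 1e-4
for _ in range(200_000):                   # integrate 20 time units backward
    dP_dtau = 2*a*P - (b**2/r)*P**2 + q    # sign flip: d/dtau = -d/dt
    P += dP_dtau * dtau

# Positive root of the steady-state (algebraic) equation:
P_ss = a*r/b**2 + math.sqrt((a*r/b**2)**2 + q*r/b**2)
print(round(P, 4), round(P_ss, 4))   # P has converged to P_ss
g = -b*P/r                           # steady-state feedback gain, as in (5)
```

The same backward integration in matrix form is what numerical Riccati solvers effectively perform for higher-order systems.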
Note that the steady state Riccati equation for the continuous case is easier to solve without the aid of a computer than the discrete equivalent to be computed later, because no inverse of the unknown P appears in the equation. Once the P matrix is determined, the optimal control gain matrix G can be found from equation (5). Of course the gains are now constants, as expected. Solving the general (non steady state) LQR problem requires substantial storage for the gains, which depends on tf and the sampling interval, and exceeds the scope of this experiment, which will hereafter focus on the steady state case.

Another obvious but nevertheless important result of the steady state case is that the system states must converge to zero regardless of their initial condition vector x0. An intuitive proof by contradiction of the above result can be derived using the following notion: the performance index is always positive, as the sum of positive factors, since all matrices are positive definite. If it is to be minimized over an infinitely large time, then this minimum had better be constant. Yet if the states are not approaching zero, then from equation (5) the control isn't either, and we end up summing two positive quantities, inside the integral, for an infinite time. This of course can never give a constant (minimum) value; therefore the states must go to zero. Applications exploiting this result are controllers designed to keep the states always near zero, counteracting random abrupt changes in the state values (or any other causes that can be modeled as impulses).

The following example sets up a typical infinite time LQR problem and analytically solves the steady state Riccati equation, giving the student sufficient experience to deal with the actual experiment.

EXAMPLE

Consider the following second order system:

    ẋ(t) = [ 0  1 ; -1  0 ] x(t) + [ 0 ; 1 ] u(t)
    y(t) = [ 1  0 ] x(t)

with the corresponding performance index given by:

    J = ∫[0,∞] { x'(t) [ 1  0 ; 0  1 ] x(t) + u²(t) } dt

The continuous steady state Riccati equation (7) can be written as:

    PBR⁻¹B'P = PA + A'P + Q        (8)

and using this example's data it becomes:

    [ P11 P12 ; P21 P22 ][ 0 ; 1 ][1][ 0  1 ][ P11 P12 ; P21 P22 ]
        = [ P11 P12 ; P21 P22 ][ 0  1 ; -1  0 ] + [ 0  -1 ; 1  0 ][ P11 P12 ; P21 P22 ] + [ 1  0 ; 0  1 ]

The matrix equation can be rewritten as three simultaneous algebraic equations (remember P12 = P21):

    P12² = 1 - 2P12
    P12P22 = P11 - P22
    P22² = 1 + 2P12

Solving the three equations, two real solutions for P are obtained:

    [ 1.912  0.4142 ; 0.4142  1.352 ]   and   [ -1.912  0.4142 ; 0.4142  -1.352 ]

By checking the principal minors of each of the two matrices, it is found that only the first of these is positive definite*, therefore it is the accepted solution. The feedback gain matrix is then determined from equation (5) to be:

    G = [ -0.4142  -1.352 ]

* A matrix is positive definite if all principal minors are positive. The principal minors are the determinants of the submatrices along the principal diagonal of the matrix which include the topmost left element of the diagonal. Thus the first principal minor is the element (1,1), the second is the determinant of the 2x2 submatrix in the upper left corner, and so on.

Thus, the optimal control input u(t) is given by:

    u(t) = -0.4142x1(t) - 1.352x2(t)        (9)

The student is expected to work through this example prior to attempting the problem for the experiment, to ensure a thorough understanding of the procedure.

FIGURE 1. Block diagram of the open loop system.

PROBLEM FORMULATION

This experiment is based on the Steady State case of the LQR problem. Here the plant to be controlled is a 2nd order linear and time invariant system implemented on the Comdyna analog computer, the penalty matrices Q and R are constant, and H is zero.
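As a numerical check of the worked EXAMPLE above (A = [0 1; -1 0], B = [0; 1], Q = I, R = 1), a CARE solver reproduces the positive definite P and the gain matrix. This sketch uses Python with SciPy's solve_continuous_are in place of MATLAB:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Matrices from the worked example above.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# solve_continuous_are returns the unique stabilizing (here: positive
# definite) solution of A'P + PA - P B R^-1 B' P + Q = 0.
P = solve_continuous_are(A, B, Q, R)
G = -np.linalg.inv(R) @ B.T @ P      # equation (5)
print(np.round(P, 4))   # ~[[1.9123 0.4142], [0.4142 1.3522]]
print(np.round(G, 4))   # ~[[-0.4142 -1.3522]]
```

This is the same kind of verification MATLAB can provide; the analytical solution by hand is still required in the write-up.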
The optimal control input u(t) will drive the system states x1 and x2 from the initial values of -3 volts and 0 volts respectively to zero, and it will be generated either using the Comdyna analog computer, or a PC running the appropriate program. The second order system to be controlled is modeled by the transfer function:

    H(s) = Output/Input = Y(s)/U(s) = 0.5 / (s(s + 0.5))        (10)

which is to be implemented on the Comdyna analog computer following a suitably designed simulation diagram. The state variable representation of the system transfer function (10) is of the same general form as equations (3) with D equal to zero, and is given by:

    ẋ(t) = [ 0  1 ; 0  -0.5 ] x(t) + [ 0 ; 0.5 ] u(t)
    y(t) = [ 1  0 ] x(t) ,   x(0) = [ -3 ; 0 ]        (11)

Given the state variable equations, the open loop system block diagram can be obtained as in FIG. 1 and the actual analog computer simulation as in FIG. 2. The student is urged to verify both results, paying particular attention to the initial conditions in FIG. 2. Note that the integrators invert the voltage on the IC input, so +3 volts will give the desired initial output voltage of -3 volts.

FIGURE 2. Analog computer simulation of the open loop system.

The performance index for this system is given in accordance with equation (4) by:

    J = ∫[0,∞] { x'(t) [ 2  0 ; 0  1 ] x(t) + 0.4u²(t) } dt        (12)

EXPERIMENTAL PROCEDURE

The aim of the experiment is to design a closed loop optimal controller that will drive the states to zero using the feedback gains obtained via equations (5). For this purpose the following steps must be implemented sequentially:

1) The open loop system is to be built on the Comdyna analog computer in accordance with the simulation diagram of FIG. 2. Special care must be taken when implementing the initial conditions and the gains because of the sign inversions at the outputs of the amplifiers.
The DSO (oscilloscope) must be thoroughly calibrated and its various scales mastered before any useful work can be done. Remember, the Comdyna dial must be on the Pot Set position and the pushbutton labeled IC pressed in during setup; during operation the dial must always be on the Oper position, and the pushbutton labeled OP must be pressed in just before each run starts.

2) The impulse response of the system is to be simulated using the Comdyna and the PC, and the natural (uncontrolled) modes of the states x1, x2 are to be saved from the DSO screen. The concept behind this is that when an Asymptotically Stable system is excited with an impulse function δ(t), the states converge to zero regardless of their initial values, producing the same final result as the optimal controller does. Hence the plots of the state trajectories obtained in this part are to be compared with the ones produced by applying the closed loop optimal control. Before running this part it is necessary to verify that both states are indeed stable by solving the system differential equations. However, it should be noted that this system is not strictly Asymptotically Stable due to the pole at the origin. To implement the impulse function, connect the AO 0 port of the National Instruments BNC-2110 board to the input of the system (details on running the PC program are provided later). The states x1(t) and x2(t) as well as the control signal u(t) should be observed on the scope. Follow the steps to load the LabVIEW program in 5) below. Select the Impulse Response tab at the top of the VI front panel. After resetting the analog computer system's I.C.s (IC pushbutton), begin the scope sweep, go to operate mode (OP pushbutton) on the Comdyna, and click the Run arrow icon in the VI menu bar. Be sure to use Run, not Run Continuously. The weighted impulse (created by a 10 volt square pulse held for 0.3 s) will drive x1(t) and x2(t) to zero.
The controller stops automatically at the end of the cycle. It is important to understand that since the δ(t) function is a theoretical pulse of infinite amplitude and infinitesimal duration, no humanly realizable input can reproduce it. Hence a +10 volt step function (the largest voltage the analog card is capable of) is applied for a period of 0.3 s to produce the same net result. This duration is valid only for the -3 volt initial condition of state x1 (10 x 0.3 = 3, the negative of the I.C.). Also, you may observe that state x1 reaches the desired zero as theoretically expected, yet it then very slowly continues to change linearly with time. This should be attributed to leaks and offsets within the Comdyna integrators rather than to a theoretical inconsistency.

FIGURE 3. Block diagram of the closed loop system.

3) The algebraic Riccati equation for the above system is to be solved, and the P matrix elements and the optimal feedback gains g1, g2 are to be computed using equation (5) and the example. Even though MATLAB or any other equivalent math package can be used for verification purposes, a detailed analytical solution for the P matrix elements and the gains is mandatory. Having found the gains, the optimal controller is given by equation (13) as:

    u*(t) = g1x1(t) + g2x2(t)        (13)

The block diagram of the complete closed loop system is given in FIG. 3, and its simulation diagram in FIG. 4. The control calculation (equation (13)) of FIG. 4 is to be implemented on the Comdyna analog computer and the PC through the Optimal.vi LabVIEW program.

NOTE: The gains calculated from the Riccati solution will be negative numbers. These values will be entered as positive attenuations on the analog computer pots, with the summing amp providing the sign inversion. Diagrams showing the negative gain of the summing amp will also show negated feedback gains (which end up as positive numbers) for consistency.
This convention also applies to the LabVIEW implementation, where the gains will still be entered as positive values.

4) The complete closed loop system, both the plant and the feedback controller, is to be built on the Comdyna analog computer using FIG. 4, and a DSO screen recording of the state trajectories obtained for comparison with the results from the other runs. This controller should produce the best results for a given set of dynamics and penalties. (Why?) The implementation on the Comdyna can be done either using the multipliers, or by passing the states individually through the .1 inputs of adders to effectively increase them by a factor of 10, and then through potentiometers set to .1 times the gains.

FIGURE 4. Analog computer simulation of the closed loop system.

5) The experiment is also to be run using a LabVIEW program to implement the closed loop controller. For this part, the system built on the Comdyna is to be connected to the PC as in FIG. 4. State x1 must be fed to the BNC-2110 input port AI 0, state x2 to input port AI 1, and the system input u to output port AO 0. All other ports are to be left free. The control algorithm is to be executed using different sampling periods (at least ten runs ranging from 50 to 1000 ms), and the plots of the state trajectories compared with the previous results. To access the LabVIEW program do the following:

• Turn the PC on (if off) and go to the OPTIMAL subdirectory (My Computer\Local Disk): C:\CStudio\RTA_lab\Optimal.
• Double click Optimal Controller DAQmx+.vi to load the program.
• A LabVIEW program will display with two different tabs at the top of the screen.
• When moving through the parameter fields, the tab button will not work. You must use the mouse to manually select them.
• Be sure to push the operate button (OP) on the analog computer before starting the VI program, otherwise the controller will not behave properly.
• Push IC when done to reset the plant for the next run.
• Press the right arrow button at the top left-hand corner of the screen to start execution. To pause, press the STOP button on the screen, not the stop sign at the top left of the page (this will help to calculate actual sampling times, and reset the output to 0 volts). It may take a few seconds for the program to respond.

Although some values can be changed during execution by user input, to ensure correct measurement it is important to stop the controller (using the STOP button) before altering parameter values as necessary. If you notice the controller isn't working properly:

  o Press the stop button, then run the LabVIEW program again. This should reset the program and enable you to start from scratch.
  o Wiggle the T-connectors and make sure the connection is good. If you notice a difference in the response when wiggling, change connectors. Occasionally the PC may need to be restarted.

In almost all cases data entry consists of the sampling period T in ms. For the Continuous Case the gains pre-calculated in step 3 must be entered into the relevant fields once, since they are constant, and only the sampling period is to be changed before each new run. Since the computer program inverts the controller sign at the output, the negated calculated gains must be entered in the relevant fields (as with the analog implementation of step 3). Record your observations and make suitable DSO screen recordings of the states.

WRITE-UP AND ANALYSIS

Your report should contain:

1) Solution of the continuous steady state Riccati equation and calculation of the feedback gains. (Show that the solution to the steady state Riccati equation is positive definite.)
2) Comments on the asymptotic stability of the system.
3) Various DSO screen recordings for procedure parts 1), 3), and 4). These should be appropriately labeled.
4) Discussion of results, including a comparison of the recordings and the significance of the sampling times (especially as they increase). When discussing the sampling times, reference must be made to the DSO screen recordings.
5) Conclusion.

PART II – DISCRETE CONTROL

MATHEMATICAL FORMULATION – RESULTS

In the case of digital control, the optimization must be treated as a discrete time problem. Thus, the sampled system is modeled by the difference equations:

    x(k + 1) = Ax(k) + Bu(k) ,   y(k) = Cx(k) + Du(k) ,   x(0) = c        (14)

and the discrete performance index in quadratic form is given by:

    J = (1/2)x'(N)Hx(N) + (1/2)Σ[k=0,N] { x'(k)Qx(k) + u'(k)Ru(k) }        (15)

where A, B, C, D, H, Q and R are similar to those in the continuous case and N is a fixed number of time intervals. By applying the conditions of optimality to the problem, the optimal control is found to be:

    u*(k) = G(k)x(k)        (16)

where G is again the (m x n) feedback gain matrix given by:

    G(k) = -R⁻¹B'(A')⁻¹[P(k) - Q]        (17)

and P is the (n x n) real positive definite solution of the discrete matrix difference Riccati equation given by:

    P(k) = Q + A'[P⁻¹(k + 1) + BR⁻¹B']⁻¹A        (18)

When N approaches ∞ and the same conditions mentioned in the continuous case apply, the P(k) matrix converges to a constant symmetric, real, positive definite matrix P. Hence:

    P(k) = P(k + 1) = P        (19)

and the matrix difference Riccati equation becomes a matrix algebraic equation as before. Yet even in the steady state case equation (18) is very difficult to solve analytically, hence iterative methods implemented on computers handle this task. Once the P matrix is found, the gains and the controller are easily computed using equations (17) and (16). The following example should provide some reference as to the procedure to follow to obtain the feedback gain matrix for a simple system.
EXAMPLE

Given a discrete system described by the state and output equations:

    x(k + 1) = [ cosT  sinT ; -sinT  cosT ] x(k) + [ 1 - cosT ; sinT ] u(k)
    y(k) = [ 1  0 ] x(k)

where T is the sampling interval in seconds. The discrete performance index is given by:

    J = Σ[k=0,∞] { x'(k) [ 1  0 ; 0  1 ] x(k) + u²(k) }

If we let the sampling interval T be equal to π/40 seconds, the discrete equations become:

    x(k + 1) = [ 0.997  0.0785 ; -0.0785  0.997 ] x(k) + [ 0.00308 ; 0.0785 ] u(k)

The discrete Riccati equation is not easily solved by hand because a complicated cubic equation has to be solved. Using the MATLAB m-file program from the class web page, the solution is found:

    P = [ 24.6  5.15 ; 5.15  17.5 ]

Thus, from equation (17) the feedback gain matrix is:

    G(k) = [ -0.331  -1.28 ]

and from (16) the discrete optimal control is:

    u(k) = -0.331x1(k) - 1.28x2(k)

PROBLEM FORMULATION

This part of the experiment implements the steady state case of the Discrete Linear Quadratic Regulator problem. The same linear and time invariant system as before is used, but now for all calculations it is treated as a sampled data system modeled by the following state and output equations:

    x(k + 1) = [ 1  2 - 2e^(-0.5T) ; 0  e^(-0.5T) ] x(k) + [ T - 2 + 2e^(-0.5T) ; 1 - e^(-0.5T) ] u(k)
    y(k) = [ 1  0 ] x(k) ,   x(0) = [ -3 ; 0 ]        (20)

where T is the sampling interval. The discrete performance index for this system is given by:

    J = Σ[k=0,∞] { x'(k) [ 2  0 ; 0  1 ] x(k) + 0.4u²(k) }        (21)

It should be noted here that the discrete A state dynamics and B control dynamics matrices have been chosen (out of the infinitely many possibilities representing the given system) so that the discrete states x(k) are the continuous states x(t) at the sampling instants.
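The steady state solution of the discrete Riccati equation can be obtained by iterating equation (18) until P(k) stops changing, which is essentially what an iterative computer method does. Below is a Python/NumPy sketch of this iteration for the T = π/40 example above (an illustrative alternative to the class MATLAB m-file); the resulting P and G can be compared against the values quoted in the example:

```python
import numpy as np

# Discrete example above, with sampling interval T = pi/40 s.
T = np.pi / 40
A = np.array([[np.cos(T), np.sin(T)], [-np.sin(T), np.cos(T)]])
B = np.array([[1 - np.cos(T)], [np.sin(T)]])
Q = np.eye(2)
R = np.array([[1.0]])

# Iterate P(k) = Q + A'[P^-1(k+1) + B R^-1 B']^-1 A  (equation (18))
# until it converges to the steady-state matrix P.
P = Q.copy()
for _ in range(20000):
    P_next = Q + A.T @ np.linalg.inv(np.linalg.inv(P) + B @ np.linalg.inv(R) @ B.T) @ A
    if np.max(np.abs(P_next - P)) < 1e-12:
        P = P_next
        break
    P = P_next

# Steady-state feedback gain from equation (17).
G = -np.linalg.inv(R) @ B.T @ np.linalg.inv(A.T) @ (P - Q)
print(np.round(P, 2))
print(np.round(G, 3))
```

The final assertion worth making of any such solution is that it satisfies (18) exactly and that the closed loop A + BG has all eigenvalues inside the unit circle.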
Given a continuous system to be modeled by a discrete system with zero order holds on the inputs, the discrete matrices Ad and Bd may be obtained from the continuous matrices Ac and Bc by using the MATLAB function c2d, or by:

    Ad = e^(Ac·T)
    Bd = [Ad - I]Ac⁻¹Bc

where T is the sample period. (The Bd formula requires Ac to be nonsingular; the plant here has a pole at the origin, so c2d or the defining integral Bd = ∫[0,T] e^(Ac·τ) dτ · Bc must be used instead.) Note that Cc = Cd and Dc = Dd. (Why?) The section "Discrete State Equations" in the chapter "Open-Loop Discrete-Time Systems" of Digital Control System Analysis and Design by C. L. Phillips and H. T. Nagle provides more details on this continuous to discrete transformation.

EXPERIMENTAL PROCEDURE

The procedure for this section is similar to that of the PC based Continuous Control section, yet now the system and input dynamics (A, B) vary with the sampling period, hence they must be calculated in advance. Use any tool with which you feel comfortable (e.g. Excel or MATLAB) to compute them. A MATLAB m-file is available for download from the course web page that will solve the discrete Riccati equation for the P matrix and the feedback gain matrix. The system is implemented on the Comdyna analog computer and connected to the BNC-2110 according to FIG. 4 as before. Load the LabVIEW program, enter/edit the data in the various fields, and select the arrow at the top left corner of the window to begin execution. Care must be taken to avoid typos when editing the m-file values, especially R, because divisions by zero will crash the program. It is recommended that all control values be saved and tabulated before starting the actual run. Record your observations, and make appropriate DSO screen recordings. Note that the LabVIEW program assumes negative feedback and negates the output, which is the sum of the products of the system states and the feedback gains. This means that positive values of the gains g1 and g2 entered into the program will produce the desired negative feedback control signal.
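The continuous to discrete conversion above can be checked numerically. Because the Ac of the lab plant is singular (pole at the origin), the sketch below (Python with SciPy, standing in for MATLAB's c2d) computes Bd through the augmented matrix exponential and compares Ad and Bd against the closed forms of equation (20):

```python
import numpy as np
from scipy.linalg import expm

# Continuous-to-discrete conversion for the lab plant (10)/(11).
# Ac is singular here, so Bd is computed with the augmented-matrix
# exponential rather than the [Ad - I] Ac^-1 Bc formula:
#   expm([[Ac, Bc], [0, 0]] * T) = [[Ad, Bd], [0, I]]
Ac = np.array([[0.0, 1.0], [0.0, -0.5]])
Bc = np.array([[0.0], [0.5]])
T = 0.1   # example sampling period in seconds; any T works

M = np.zeros((3, 3))
M[:2, :2] = Ac
M[:2, 2:] = Bc
E = expm(M * T)
Ad, Bd = E[:2, :2], E[:2, 2:]

# Closed forms from equation (20) for comparison:
Ad_ref = np.array([[1.0, 2 - 2*np.exp(-0.5*T)], [0.0, np.exp(-0.5*T)]])
Bd_ref = np.array([[T - 2 + 2*np.exp(-0.5*T)], [1 - np.exp(-0.5*T)]])
print(np.allclose(Ad, Ad_ref), np.allclose(Bd, Bd_ref))   # True True
```

Repeating this for each sampling period of the runs tabulates the (A, B) pairs needed before entering the gains into the LabVIEW program.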
WRITE-UP AND ANALYSIS

Your report should contain:

1) Parameters that are input for each of ten sampling times, and the P matrix for each of the ten sampling times (from 100 ms to 1000 ms in steps of 100 ms).
2) Hardcopy recordings of x1 and x2 for each of the ten sampling times.
3) Discussion of results, which should include a comparison of the discrete control to the continuous control problem, with regard to sampling times, etc. References must be made to the recordings.
4) Conclusions.

PART III – DISCRETE CONTROL SUPPLEMENT

This is a variation of PART II, where the Q matrix and the value of R in the discrete performance index are changed. As mentioned at the beginning, these matrices are "freely" selected by the designer. Thus it becomes of great importance to observe and analyze how their values affect the states and the control input. Given are four sets of Q and R, selected in such a way that each time a different state or control will be heavily penalized. The procedure for PART II, with the exception of the sampling times (only 100 ms is used for this part), must be followed for each of the sets below. The write-up format should also be adhered to.

    Q = [ 2  0 ; 0  1 ] ,  R = 0.1
    Q = [ 2  0 ; 0  1 ] ,  R = 1.0
    Q = [ 9  0 ; 0  1 ] ,  R = 0.4
    Q = [ 2  0 ; 0  9 ] ,  R = 0.4

The effects of changing Q and R are to be noted in the write-up. The behavior of the responses for the different cases must be explained (for example, the difference in response times).

REFERENCES

For the state variables and feedback control only introductory material is necessary. Suggested reading:

Frederick, D. K. and Carlson, A. B., Linear Systems in Communication and Control, Section 3.5, pp. 79-88.
DeRusso, P. M., Roy, R. J., and Close, C. M., State Variables for Engineers, Sections 8.1 - 8.5.

For the continuous and discrete optimal control two independent approaches exist: the Linear Programming methodology and the Calculus of Variations approach.
Kirk covers both approaches from a tutorial point of view, yet the material is still extremely difficult to digest. The prospective reader should concentrate on the notions and the examples rather than trying to verify the mathematical derivations. Sage is more application-oriented and extends his results to the Estimation Problem. Linear Algebra and Advanced Calculus are prerequisites for anything other than browsing. Suggested reading:

Kirk, D., Optimal Control Theory, Prentice-Hall, NJ, 1970.
    Chapter 1, pp. 3-21: general introduction to systems and terminology.
    Chapter 2, pp. 34-42: typical 2nd order LQR example, very useful to read.
    Chapter 3, pp. 84-86: example of the Discrete LQR detailed in section 3.10.
    Chapter 4: complete introduction to Calculus of Variations; contains proofs of all subsequent results.
    Chapter 5, pp. 209-218: the LQR problem with examples, state transition matrix and Kalman solutions.

Sage, A. P. and White, C. C., Optimum Systems Control, Prentice-Hall, NJ, 1977.
    Chapter 5, pp. 88-100: the LQR problem with examples, only the Kalman solution.
    Chapter 6, pp. 128-132 & pp. 135-136: the Discrete LQR derivation and examples.

APPENDIX – ANALOG COMPUTER WIRING DIAGRAMS

[Two Comdyna patch-panel wiring diagrams appear here: the open loop system of FIG. 2 (POT 1-3, AMP 1, AMP 2, AMP 7, INTEGRATOR 1, INTEGRATOR 2, SUMMER 3, and BNC-2110 ports AO 0, AI 0, AI 1) and the closed loop system of FIG. 4 (adding POT 4, POT 5, and AMP 3 for the feedback gains).]