Optimal Control Lab

Rensselaer Polytechnic Institute
35476 Computer Applications Laboratory
EXPERIMENTS IN OPTIMAL CONTROL
Number of Sessions – 4
INTRODUCTION
In modern control system design it is sometimes necessary to design controllers that not only
effectively control the behavior of a system, but also minimize or maximize some user-defined
criterion, such as energy or time consumption, or obey other physical or time constraints imposed by
the environment. Optimal Control Theory provides the mathematical tools for solving such problems,
either analytically or through iterative computer methods, by formulating the user criteria into a
cost function and using the state equation representation of the system dynamics.
The optimal control experiments in this lab involve the use of a PC-AT to illustrate the efficacy of
microprocessors as system controllers. Two types of control will be considered:
1) Continuous (analog)
2) Discrete (digital)
The general formulation of the optimal control problem is as follows:
Let the system states (i.e., the state variables) be represented by an n-dimensional vector x, where n is
the order of the system. Let the control variables (input) be represented by an m-dimensional vector
u, and the system output by an r-dimensional vector y. The system can, in general, be modelled by a
set of differential equations of the form:
\dot{x} = f(x, u) ,   y = g(x, u) ,   x(t_0) = c        (1)

where f and g are general (possibly nonlinear and time-varying) functions of x and u, t0 is the
initial time and c is an n-dimensional set of initial conditions. The objective is to find a control input
u(t) that will drive the system from the initial point c ( =x(t0) ) in the state space to the final point
x(tf), and at the same time minimize a cost functional J (formally called performance index/criterion)
given by equation (2):
J = h(x(t_f), t_f) + \int_{t_0}^{t_f} g(x(t), u(t), t) \, dt        (2)
where:
tf represents the end of the control interval,
h and g are user-defined penalty functions.
PART I – CONTINUOUS CONTROL
MATHEMATICAL FORMULATION – RESULTS
Even though the general problem is very hard to solve, there are very useful simplifications that
lead to closed form (analytic) solutions. The most common simplification (and the one used for this
experiment) is the LQR problem, where the system is Linear and the controller (Regulator) must
satisfy a Quadratic cost functional. Assuming that the system is time invariant as well, the state
and output equations (1) become:
\dot{x}(t) = A x(t) + B u(t) ,   y(t) = C x(t) + D u(t) ,   x(0) = x_0        (3)
where:
x(t) is the (n x 1) state vector,
x0 is the initial state vector,
u(t) is the (m x 1) control input vector,
y(t) is the (r x 1) output vector,
A is the (n x n) state dynamics matrix,
B is the (n x m) control dynamics matrix,
C is the (r x n) state-output matrix,
D is the (r x m) input-output matrix (hereafter assumed to be zero for all practical purposes).
Corresponding to the system is a performance index represented in a quadratic form as:
J(u(t), x(t), t) = \frac{1}{2} x'(t_f) H x(t_f) + \frac{1}{2} \int_0^{t_f} \{ x'(t) Q x(t) + u'(t) R u(t) \} \, dt        (4)
where:
H is the (n x n) terminal state penalty matrix,
Q is the (n x n) state penalty matrix,
R is the (m x m) control penalty matrix.
It is important to emphasize that the H, Q, and R matrices are user selectable, and it is through
the proper selection of these that the various environment constraints are to be satisfied. If H, Q,
and R are positive definite and symmetric, then it can be shown that a closed loop optimal control
function u(t) exists (called u*(t)), and is uniquely given by:
u^*(t) = -R^{-1} B' P(t) x(t) = G(t) x(t) ,   G(t) = -R^{-1} B' P(t)        (5)
where:
G(t) is the (m x n) optimal feedback gain matrix,
P(t) is an (n x n) symmetric and positive definite matrix that satisfies the continuous matrix
differential Riccati equation given by:
\dot{P}(t) = -P(t) A - A' P(t) + P(t) B R^{-1} B' P(t) - Q ,   P(t_f) = H        (6)
From the above it is obvious that, even for the LQR problem, the set of coupled differential
equations (6) must be solved before the controller gains can be found. It should be noted that these
gains are functions of time. The significance of the result of equation (5) is that the gains can be
computed ahead of time, and the optimal controller is easily implemented on any digital computer.
In all but the simplest real-life applications, iterative computer methods are used to precalculate and
store the P matrix and the gains, resulting in very efficient and robust open and closed loop
controllers.
A further simplification to the above solution takes place when tf approaches ∞, or in more
realistic terms when the control is applied for a very long period of time. In this case it can be shown
that if (1) the system is controllable, (2) H is zero, and (3) A, B, Q, and R are all constant matrices,
then the P(t) matrix converges to a constant symmetric, real, positive definite matrix Pss. Obviously
\dot{P}(t) is then zero, and the matrix differential Riccati equation (6) is transformed into a matrix
algebraic equation (called the Steady State Riccati Equation) given by:

0 = -P A - A' P + P B R^{-1} B' P - Q        (7)
The steady state solution Pss need not be unique; however, only one real positive definite
solution can exist. Since P is of dimensions (n x n), we obtain n^2 equations to solve for the n^2
components of P, yet because P is symmetric the number of computations is greatly reduced. Note
that the steady state Riccati equation for the continuous case is easier to solve without the aid of a
computer than is the discrete equivalent to be considered later, because no inverse of the unknown P
appears in the equation. Once the P matrix is determined, the optimal control gain matrix G can be
found from equation (5); of course now the gains are constants, as expected. Solving the general (non
steady state) LQR problem requires substantial storage for the gains, which depends on tf and the
sampling interval, and exceeds the scope of this experiment, which will hereafter focus on the steady
state case.
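For reference, the steady state gain computation lends itself to a few lines of code. The following is a
minimal Python sketch, not part of the lab software: it assumes NumPy and SciPy are available, and
the helper name lqr_gains is hypothetical.

```python
# Minimal sketch: steady state LQR gains for constant A, B, Q, R.
# Assumes SciPy is available; lqr_gains is a hypothetical helper, not lab software.
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gains(A, B, Q, R):
    """Solve the steady state Riccati equation (7) and return Pss and G from (5)."""
    P = solve_continuous_are(A, B, Q, R)  # constant, symmetric, positive definite Pss
    G = -np.linalg.solve(R, B.T @ P)      # G = -R^{-1} B' Pss, the constant gain matrix
    return P, G
```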
Another obvious but nevertheless important result of the steady state case is that the system
states must converge to zero regardless of their initial condition vector x0. An intuitive proof by
contradiction can be derived using the following notion: the performance index is always positive,
being a sum of positive terms, since all matrices are positive definite. If it is to be minimized over an
infinitely long time, then this minimum must be finite. Yet if the states do not approach zero, then
from equation (5) neither does the control, and we end up integrating two positive quantities for an
infinite time. This can never give a finite (minimum) value, therefore the states must go to zero.
Applications exploiting this result are controllers designed to keep the states always near zero,
counteracting random abrupt changes in the state values (or any other causes that can be modelled
as impulses). The following example sets up a typical infinite time LQR problem and analytically
solves the steady state Riccati equation, giving the student sufficient experience to deal with the
actual experiment.
EXAMPLE
Consider the following second order system:
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} =
\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} +
\begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t) ,
\quad
y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
with the corresponding performance index given by
J = \int_0^{\infty} \left\{ \begin{bmatrix} x_1(t) & x_2(t) \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + u^2(t) \right\} dt
The continuous steady state Riccati equation (7) can be written as:
P B R^{-1} B' P = P A + A' P + Q        (8)
and using this example's data it becomes:
 P11 P12 0
 P P 1 [1][
 21 22 
P
 11 P12  0 1 
 P P  -1 0 
 21 22 
0
1
]
 P11 P12 
P P  =
 21 22 
1 0 +
0 1
 0 -1  P11 P12  +
 1 0  P21 P22 
The matrix equation can be rewritten as three simultaneous algebraic equations (remember
P12 = P21):
P_{12}^2 = 1 - 2P_{12}
P_{12}P_{22} = P_{11} - P_{22}
P_{22}^2 = 1 + 2P_{12}
Solving the three equations, two real solutions for P are obtained:
 1.912
 0.4142
0.4142
1.352
 -1.912
 0.4142
 ,

0.4142
-1.352


By checking the principal minors of each of the two matrices, it is found that only the first of these is
positive definite*, therefore it's the accepted solution.
*
A matrix is positive definite if all its leading principal minors are positive. The leading principal minors
are the determinants of the submatrices along the principal diagonal of the matrix which include the topmost
left element of the diagonal. Thus the first principal minor is the element (1,1), the second is the determinant
of the 2x2 submatrix in the upper left corner, and so on.
The feedback gain matrix is then determined from equation (5) to be:
G = \begin{bmatrix} -0.4142 & -1.352 \end{bmatrix}

Thus, the optimal control input is given by:

u^*(t) = -0.4142 x_1(t) - 1.352 x_2(t)        (9)
The student is expected to work through this example prior to attempting the problem for the
experiment, to ensure a thorough understanding of the procedure.
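The hand solution above can also be checked numerically before the lab session. The fragment below
is a Python sketch of such a check (NumPy assumed available; all numerical values are taken from
the example).

```python
# Sketch: numerical check of the example's Riccati solution and gains.
import numpy as np

A = np.array([[0., 1.], [-1., 0.]])
B = np.array([[0.], [1.]])
Q = np.eye(2)
R = np.array([[1.]])
P = np.array([[1.912, 0.4142], [0.4142, 1.352]])  # accepted solution from above

# Residual of the steady state Riccati equation (7); should be nearly zero.
residual = -P @ A - A.T @ P + P @ B @ np.linalg.inv(R) @ B.T @ P - Q
print(np.round(residual, 3))

# Leading principal minors (see the footnote); both must be positive.
print(P[0, 0], np.linalg.det(P))

# Feedback gains from equation (5); expect approximately [-0.4142, -1.352].
print(-np.linalg.inv(R) @ B.T @ P)
```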
FIGURE 1. Block diagram of the open loop system.
PROBLEM FORMULATION
This experiment is based on the Steady State case of the LQR problem. Here the plant to be
controlled is a 2nd order linear and time invariant system implemented on the Comdyna analog
computer, the penalty matrices Q, R are constants and H is zero. The optimal control input u(t) will
drive the system states x1 and x2 from the initial values of -3 volts and 0 volts respectively to zero,
and it will be generated either using the Comdyna analog computer, or a PC-AT running the
appropriate program. The second order system to be controlled is modeled by the transfer function:
H(s) = \frac{\text{Output}}{\text{Input}} = \frac{Y(s)}{U(s)} = \frac{0.5}{s(s + 0.5)}        (10)
which is to be implemented on the Comdyna analog computer following a suitably designed
simulation diagram.
The state variable representation of the system transfer function (10) is of the same general form
as equations (3) with D equal to zero, and is given by:
\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} =
\begin{bmatrix} 0 & 1 \\ 0 & -0.5 \end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} +
\begin{bmatrix} 0 \\ 0.5 \end{bmatrix} u(t) ,
\quad
y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} ,
\quad
\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} -3 \\ 0 \end{bmatrix}        (11)
Given the state variable equations, the open loop system block diagram can be obtained as in
FIG. 1 and the actual analog computer simulation as in FIG 2. The student is urged to verify both
results, paying particular attention to the initial conditions in FIG. 2. Note that the integrators
invert the voltage on the IC input, so +3 volts will give the desired initial output voltage of -3 volts.
FIGURE 2. Analog computer simulation of the open loop system.
The performance index for this system is given in accordance with equation (4) by:
J = \int_0^{\infty} \left\{ \begin{bmatrix} x_1(t) & x_2(t) \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + 0.4 u^2(t) \right\} dt        (12)
EXPERIMENTAL PROCEDURE
The aim of the experiment is to design a closed loop optimal controller that will drive the states
to zero using the feedback gains obtained via the equations (5). For this purpose the following steps
must be sequentially implemented:
1) The open loop system is to be built on the Comdyna analog computer in accordance with the
simulation diagram of FIG. 2. Special care must be taken when implementing the initial conditions
and the gains because of the sign inversions at the output of the amplifiers. The strip chart recorder
must be thoroughly calibrated and its various scales mastered, before any useful work can be done.
Remember, the Comdyna dial must be on the Pot Set position and the push button labeled IC be
pressed in during setup; during operation the dial must always be on the Oper position, and the
push button labeled OP must be pressed in just before each run starts.
2) The impulse response of the system is to be simulated using the Comdyna and the PC-AT, and
the natural (uncontrolled) modes of the states x1, x2 are to be plotted on the strip chart. The concept
behind this is that when an Asymptotically Stable system is excited with an impulse function δ(t),
then the states converge to zero regardless of their initial values, producing the same final result as
the optimal controller does. Hence the plots of the state trajectories obtained in this part, are to be
compared with the ones produced by applying the closed loop optimal control. Before running this
part it is necessary to verify that both states are indeed stable by solving the system differential
equations. However, it should be noted that this system is not strictly Asymptotically Stable due to
the pole at the origin.
To implement the impulse function, connect the D/A0 port of the PC-AT to the input of the
system, and select the Uncontrolled State Response option from the program menu (details on
running the PC-AT program are provided later). It is important to understand that since the δ(t)
function is a theoretical pulse of infinite amplitude and infinitesimal duration, no physically
realizable input can reproduce it. Hence a +10 volt step function is applied for a period of 296 msec
to produce the same net result. The duration was found experimentally and is valid only for the
-3 volt initial condition of state x1. You may also observe that state x1 reaches the desired zero as
theoretically expected, yet it then continues to increase linearly as time passes. This should be
attributed to leaks within the Comdyna integrators (from the first to the second integrator) rather
than to a theory inconsistency.
FIGURE 3. Block diagram of the closed loop system.
3) The algebraic Riccati equation for the above system is to be solved, and the P matrix elements
and the optimal feedback gains g1, g2 are to be computed using equations (5) and the example.
Even though MATLAB or any other equivalent math package can be used for verification purposes, a
detailed analytical solution of the P matrix elements and the gains is mandatory. Having found the
gains, the optimal controller is given by equation (13) as:
u^*(t) = g_1 x_1(t) + g_2 x_2(t)        (13)
The block diagram of the complete closed loop system is given on FIG. 3, and its simulation
diagram is given on FIG. 4. The control calculation (equation (13)) of FIG. 4 is to be implemented
either on the Comdyna by the student, or through the PC-AT program upon selecting the relevant
menu option.
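As one possible form of the verification mentioned above, a short Python sketch (SciPy assumed
available) can cross-check the hand-derived gains; it does not replace the required analytical
solution.

```python
# Sketch: cross-check of the step 3 gains (verification only; not a substitute
# for the required analytical solution).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0., 1.], [0., -0.5]])  # from equation (11)
B = np.array([[0.], [0.5]])
Q = np.array([[2., 0.], [0., 1.]])    # from equation (12)
R = np.array([[0.4]])

P = solve_continuous_are(A, B, Q, R)
G = -np.linalg.solve(R, B.T @ P)      # [g1, g2] entering equation (13)
print(P, G)
```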
Figure 4. Analog computer simulation of the closed loop system.
4) The complete closed loop system, both the plant and the feedback controller, is to be built on
the Comdyna analog computer using FIG. 4, and a strip chart recording of the state trajectories is to
be obtained for comparison with the results from the other runs. This controller should produce the
best results for a given set of dynamics and penalties. (Why?) The implementation on the Comdyna
can be done either using the multipliers, or by passing the states individually through the .1 inputs
of adders to effectively increase them by a factor of 10, and then through potentiometers with values
.1 times the gains. Selecting the latter method requires the use of two analog computers.
5) The experiment is to be run using the PC-AT program to implement the closed loop controller.
For this part the system built on the Comdyna is to be connected to the PC-AT as in FIG. 4. State x1
must be fed to the PC-AT input port A/D0, state x2 to the PC-AT input port A/D1, and the system
input u to the PC-AT output port D/A0. All other ports are to be left free. The control algorithm is to
be executed using different sampling periods (at least ten runs ranging from 50 to 1000 msec), and
the plots of the state trajectories be compared with the previous results. To access the PC-AT
program do the following:
• Turn the PC-AT on (if off) and go to the OPTIMAL subdirectory by typing:
CD \CAL_LAB\OPTIMAL.
• Type OPTIMAL to run the program. After the introductory screen a menu will appear.
Select your option using the arrow keys and ENTER, or pressing the highlighted character.
• A Data Input Menu will appear with the variable names on the left and editable fields on the
right. If the cursor does not appear press the Insert key. To move among the fields and edit
the data within the fields use the cursor arrows and the regular editing keys (ENTER,
Backspace, Delete).
• Unacceptable keystrokes are met with error messages, and a basic help screen is available by
pressing F1. Be warned that whenever ranges appear next to variable names, they act in an
advisory capacity, and it is at the user's discretion to enter meaningful data in the fields.
• When done editing, press the END key to start execution, or ESC to exit the Input menu.
During control execution no other key is to be pressed, except CTRL-BREAK which is used to
abort the current run.
• Aborting control execution brings you back to the Input Menu for more data entry.
For the Continuous Case the gains precalculated in step 3 must be entered in the relevant fields
only once, since they are constant, and only the sampling period is to be changed before each new
run. Since the computer program does not invert the controller sign in the output, the actual
calculated gains must be entered in the relevant fields (compare with the analog implementation of
step 4). Record your observations and make suitable strip chart recordings of the states.
WRITE-UP AND ANALYSIS
Your report should contain:
1) Solution of the continuous steady state Riccati equation and calculation of the feedback gain.
(Show that the solution to the steady state Riccati equation is positive definite).
2) Comments on the asymptotic stability of the system.
3) Various strip chart recordings for procedure parts 2), 4), and 5). These should be
appropriately labeled.
4) Discussion of results, including a comparison of the strip chart recordings and the
significance of the sampling times (especially when it increases). When discussing the
sampling times, reference must be made to the strip chart recordings.
5) Conclusion.
PART II – DISCRETE CONTROL
MATHEMATICAL FORMULATION – RESULTS
In the case of digital control, the optimization must be treated as a discrete time problem. Thus,
the sampled system is modelled by the difference equations:
x(k + 1) = A x(k) + B u(k) ,   y(k) = C x(k) + D u(k) ,   x(0) = c        (14)
and the discrete performance index in a quadratic form is given by
J = \frac{1}{2} x'(N) H x(N) + \frac{1}{2} \sum_{k=0}^{N} \{ x'(k) Q x(k) + u'(k) R u(k) \}        (15)
where A, B, C, D, H, Q and R are similar to those in the continuous case and N is a fixed number of
time intervals. By applying the conditions of optimality to the problem, the optimal control is found
to be:
u^*(k) = G(k) x(k)        (16)

where G is again the (m x n) feedback gain matrix given by:

G(k) = -R^{-1} B' (A')^{-1} [P(k) - Q]        (17)

and P is the (n x n) real positive definite solution of the discrete matrix difference Riccati equation
given by:

P(k) = Q + A' [P^{-1}(k + 1) + B R^{-1} B']^{-1} A        (18)
(18)
When N approaches ∞ and the same conditions mentioned in the continuous case apply, then the
P(k) matrix converges to a constant symmetric, real, positive definite matrix P. Hence:
P(k) = P(k + 1) = P        (19)
and the matrix difference Riccati becomes a matrix algebraic equation as before. Yet even in the
steady state case equation (18) is very difficult to solve analytically, hence iterative methods
implemented on computers handle this task. Once the P matrix is found, the gains and the
controller are easily computed using equations (17) and (16).
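The iteration itself is simple enough to sketch. The following minimal Python fragment illustrates
iterating equation (18) to convergence and then forming the gains of equation (17); NumPy is
assumed available, A is assumed invertible, and the helper names are hypothetical (the lab's PC
program remains the authoritative implementation).

```python
# Minimal sketch: iterate the discrete Riccati equation (18) to steady state,
# then form the gain matrix of equation (17). Helper names are hypothetical.
import numpy as np

def discrete_riccati_ss(A, B, Q, R, tol=1e-9, max_iter=10000):
    P = Q.copy()                                   # any positive definite start
    for _ in range(max_iter):
        # Equation (18): P(k) = Q + A' [P^{-1}(k+1) + B R^{-1} B']^{-1} A
        M = np.linalg.inv(np.linalg.inv(P) + B @ np.linalg.inv(R) @ B.T)
        P_next = Q + A.T @ M @ A
        if np.max(np.abs(P_next - P)) < tol:       # converged to steady state
            return P_next
        P = P_next
    return P

def discrete_gains(A, B, Q, R):
    P = discrete_riccati_ss(A, B, Q, R)
    # Equation (17): G = -R^{-1} B' (A')^{-1} [P - Q]
    return -np.linalg.inv(R) @ B.T @ np.linalg.solve(A.T, P - Q)
```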
The following example should provide some reference as to the procedure to follow to obtain the
feedback gain matrix for a simple system.
EXAMPLE
Given a discrete system described by the state and output equations:
\begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} =
\begin{bmatrix} \cos T & \sin T \\ -\sin T & \cos T \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} +
\begin{bmatrix} 1 - \cos T \\ \sin T \end{bmatrix} u(k) ,
\quad
y(k) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}
where T is the sampling interval in seconds. The discrete performance index is given by:
J = \sum_{k=0}^{\infty} \left\{ x'(k) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} x(k) + u^2(k) \right\}
If we let the sampling interval T be equal to π/40 seconds, the discrete equations become:
\begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} =
\begin{bmatrix} 0.997 & 0.0785 \\ -0.0785 & 0.997 \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} +
\begin{bmatrix} 0.00308 \\ 0.0785 \end{bmatrix} u(k)
The discrete Riccati equation is not easily solved by hand, because a complicated cubic equation
must be solved. Using the PC program the solution is found to be:
P = \begin{bmatrix} 24.6 & 5.15 \\ 5.15 & 17.5 \end{bmatrix}

Thus, from equation (17) the feedback gain matrix is:

G(k) = \begin{bmatrix} -0.331 & -1.28 \end{bmatrix}

and from equation (16) the discrete optimal control becomes:

u(k) = -0.331 x_1(k) - 1.28 x_2(k)
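As a sketch of how this example could be reproduced without the lab's PC program, SciPy's discrete
Riccati solver, which solves a form equivalent to equation (18), should return approximately the
same P and gains. This is an assumed alternative tool, not the lab software.

```python
# Sketch: reproducing the discrete example with SciPy's DARE solver.
import numpy as np
from scipy.linalg import solve_discrete_are

T = np.pi / 40
A = np.array([[np.cos(T), np.sin(T)], [-np.sin(T), np.cos(T)]])
B = np.array([[1 - np.cos(T)], [np.sin(T)]])
Q = np.eye(2)
R = np.array([[1.]])

P = solve_discrete_are(A, B, Q, R)
# Standard DARE gain; algebraically equivalent to equation (17).
G = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
print(np.round(P, 2))   # expect approximately [[24.6, 5.15], [5.15, 17.5]]
print(np.round(G, 3))   # expect approximately [-0.331, -1.28]
```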
PROBLEM FORMULATION
This part of the experiment implements the steady state case of the Discrete Linear Quadratic
Regulator problem. The same linear and time invariant system as before is used, but now for all
calculations it is treated as a sampled data system modelled by the following state and output
equations:
\begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} =
\begin{bmatrix} 1 & 2 - 2e^{-0.5T} \\ 0 & e^{-0.5T} \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} +
\begin{bmatrix} T - 2 + 2e^{-0.5T} \\ 1 - e^{-0.5T} \end{bmatrix} u(k) ,

y(k) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} ,
\quad
\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} -3 \\ 0 \end{bmatrix}        (20)
where T is the sampling interval. The discrete performance index for this system is given by:
J = \sum_{k=0}^{\infty} \left\{ \begin{bmatrix} x_1(k) & x_2(k) \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} + 0.4 u^2(k) \right\}        (21)
It should be noted here that the discrete A state dynamics and B control dynamics matrices have
been chosen (out of the infinitely many possibilities representing the given system) so that the
discrete states x(k) are samples of the continuous states x(t) at the sampling instants. Given a
continuous system to be modeled by a discrete system with a zero order hold on the control input,
the discrete matrices Ad and Bd may be obtained from the continuous matrices Ac and Bc by:

A_d = e^{A_c T} ,   B_d = [A_d - I] A_c^{-1} B_c

where T is the sample period. (The formula for Bd assumes Ac is invertible; when it is not, as for the
plant of equation (20), the equivalent integral form B_d = \int_0^T e^{A_c s} ds \, B_c applies.)
Note that Cc = Cd and Dc = Dd. (Why?) The section "Discrete State
Equations" in the chapter "Open-Loop Discrete-Time Systems" of Digital Control System Analysis
and Design by C. L. Phillips and H. T. Nagle provides more details on this continuous to discrete
transformation.
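Because the Ac of this plant is singular, a numerical conversion is the safer route when tabulating
Ad and Bd for the procedure below. The following Python sketch (SciPy assumed available; T = 0.1 s
is an arbitrary choice) performs the zero order hold conversion and compares it with the closed
forms of equation (20).

```python
# Sketch: zero order hold discretization of the plant, checked against the
# closed forms of equation (20). Assumes SciPy; T = 0.1 s is an arbitrary choice.
import numpy as np
from scipy.signal import cont2discrete

Ac = np.array([[0., 1.], [0., -0.5]])
Bc = np.array([[0.], [0.5]])
Cc = np.array([[1., 0.]])
Dc = np.array([[0.]])
T = 0.1

Ad, Bd, Cd, Dd, _ = cont2discrete((Ac, Bc, Cc, Dc), T, method='zoh')

Ad_ref = np.array([[1., 2 - 2*np.exp(-.5*T)], [0., np.exp(-.5*T)]])
Bd_ref = np.array([[T - 2 + 2*np.exp(-.5*T)], [1 - np.exp(-.5*T)]])
print(np.allclose(Ad, Ad_ref), np.allclose(Bd, Bd_ref))  # both should be True
```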
EXPERIMENTAL PROCEDURE
The procedure for this section is similar to that of the PC-AT based Continuous Control section,
yet now the system and input dynamics (A, B) vary with sampling, hence they must be calculated in
advance. Use any tool you feel comfortable with (e.g. Lotus 1-2-3) to compute them. The system is to
be implemented on the Comdyna analog computer and connected to the PC-AT according to FIG. 4 as
before. Run the program and select the Discrete Optimal Control option from the menu.
Enter/edit the data in the various fields and press END when done. At this point the program solves
the steady state discrete Riccati equation iteratively, and returns the P matrix elements as well as
the gains G. Care must be taken to avoid typos when entering the data, especially R, because
divisions by zero will crash the program. It is recommended that a hard copy (screen dump) be taken
before starting the actual run. To do this press the Print Screen button after the Riccati solution is
on the screen.
Pressing END will again start the control execution until CTRL-BREAK stops it. Record your
observations, and make appropriate strip chart recordings.
WRITE-UP AND ANALYSIS
Your report should contain:
1) Parameters that are input for each of ten sampling times, and the P matrix for each of the
ten sampling times (from 100 msec. to 1000 msec. in steps of 100 msec.).
2) Strip chart recordings of x1 and x2 for each of the ten sampling times.
3) Discussion of results, which should include a comparison of the discrete control to the
continuous control problem, with regard to sampling times, etc. References must be made to
the strip chart recordings.
4) Conclusions.
PART III – DISCRETE CONTROL SUPPLEMENT
This is a variation of PART II, where the Q matrix and the value of R in the discrete
performance index are changed. As mentioned at the beginning, these matrices are "freely" selected
by the designer. Thus it becomes of great importance to observe and analyze how their values affect
the states and the control input. Given below are four sets of Q and R, selected in such a way that
each time a different state or control variable will be heavily penalized. The procedure for PART II,
with the exception of the sampling times (only 100 msec is used for this part), must be followed when
using each of the sets below. The write-up format should also be adhered to.
Set 1:  Q = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} ,   R = 0.1

Set 2:  Q = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} ,   R = 1.0

Set 3:  Q = \begin{bmatrix} 9 & 0 \\ 0 & 1 \end{bmatrix} ,   R = 0.4

Set 4:  Q = \begin{bmatrix} 2 & 0 \\ 0 & 9 \end{bmatrix} ,   R = 0.4
The effects of changing Q and R are to be noted in the writeup. The behavior of the responses for the
different cases must be explained (for example, the difference in response times).
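For planning purposes, the gains for all four sets can be precomputed in one sweep. The following
minimal Python sketch assumes SciPy is available, uses the plant of equation (20) at T = 100 msec,
and is a verification aid only; the values reported by the lab program remain the reference.

```python
# Sketch: precomputing the Part III gains for all four (Q, R) sets at T = 0.1 s.
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.signal import cont2discrete

Ac = np.array([[0., 1.], [0., -0.5]])
Bc = np.array([[0.], [0.5]])
Ad, Bd, *_ = cont2discrete((Ac, Bc, np.eye(2), np.zeros((2, 1))), 0.1, method='zoh')

penalty_sets = [
    (np.diag([2., 1.]), np.array([[0.1]])),
    (np.diag([2., 1.]), np.array([[1.0]])),
    (np.diag([9., 1.]), np.array([[0.4]])),
    (np.diag([2., 9.]), np.array([[0.4]])),
]
for Q, R in penalty_sets:
    P = solve_discrete_are(Ad, Bd, Q, R)
    G = -np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
    print(Q.diagonal(), R[0, 0], np.round(G, 3))
```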
REFERENCES
For the state variables and feedback control only introductory material is necessary. Suggested
reading:
Frederick, D. K. and Carlson, A. B., Linear Systems in Communication and Control, Section 3.5,
pp. 79-88.
DeRusso, P. M., Roy, R. J., and Close, C. M., State Variables for Engineers, Sections 8.1 - 8.5.
For the continuous and discrete optimal control two independent approaches exist: the Dynamic
Programming methodology and the Calculus of Variations approach. Kirk covers both approaches
from a tutorial point of view, yet the material is still extremely difficult to digest. The prospective
reader should concentrate on the notions and the examples rather than trying to verify the
mathematical derivations. Sage is more application-oriented and extends his results to the
Estimation Problem. Linear Algebra and Advanced Calculus are prerequisites for anything other
than browsing. Suggested reading:
Kirk, D. E., Optimal Control Theory: An Introduction, Prentice-Hall, NJ, 1970.
Chapter 1 pp. 3-21: general introduction to systems and terminology.
Chapter 2 pp. 34-42: Typical 2nd Order LQR example, very useful to read.
Chapter 3 pp. 84-86: Example of the Discrete LQR detailed in section 3.10.
Chapter 4: Complete introduction to Calculus of Variations, contains proofs of all subsequent
results.
Chapter 5: pp. 209-218: The LQR problem with examples, state transition matrix and Kalman
solutions.
Sage, A. P. and White, C. C., Optimum Systems Control, Prentice-Hall, NJ, 1977.
Chapter 5 pp. 88-100: The LQR problem with examples, only the Kalman solution.
Chapter 6 pp. 128-132 & pp. 135-136: The Discrete LQR derivation and examples.
APPENDIX – ANALOG COMPUTER WIRING DIAGRAMS
[Wiring diagram 1: open loop system. Components: POT 1, POT 2, POT 3; AMP 1, AMP 2, AMP 7.]

[Wiring diagram 2: closed loop system. Components: INTEGRATOR 1, INTEGRATOR 2, SUMMER 3; POT 1 through POT 5; AMP 1, AMP 2, AMP 3, AMP 7.]