8 LECTURE 8: LAGRANGE MULTIPLIERS

At the end of this lecture a learner should be able to:
i) Define the optimization problem
ii) Explain the use of Lagrange multipliers in optimization
iii) Explain the use of Lagrange multipliers in the optimal control problem
iv) Apply the Lagrange multiplier to a control problem

Concepts: optimization, minimum, maximum, quadratic forms, optimal control problem

8.1 Optimization problem

An optimization problem is the problem of finding the best solution from all feasible solutions. Most decisions involve trade-offs, so that one cannot get everything one wants; one therefore tries to maximize a desired quantity while minimizing the undesirable effects. In a control system, for example, one tries to minimize the steady-state error and reduce the percentage overshoot by choosing an optimal gain.

Maximization or minimization of a dynamic variable, f(x), with respect to changes in an independent variable, x, involves determining the maximum and/or minimum of that variable. For the dynamic variable to have an interior maximum or minimum it is normally quadratic in nature. There are different types of critical points: absolute minimum, relative minimum, relative maximum and absolute maximum, as shown in Fig. 8.1. These are related to the stability of a dynamic system.

Fig. 8.1 Function with minima and maxima

The optimization problem involves determining the maxima and minima of a given function. Many applications involve solutions around the minimum and maximum values of a function, f(.). At an extremum point the derivative df/dx = 0. The different extremum points are:

Absolute maximum: a point where the function attains its greatest possible value.
Absolute minimum: a point where the function attains its least possible value.
Relative maximum: a point where the function changes direction from increasing to decreasing but is not the absolute maximum.
Relative minimum: a point where the function changes direction from decreasing to increasing but is not the absolute minimum.

Functions which have extremum points are normally of quadratic form. Considering a function of two variables, x and y, the necessary (first-order) conditions for an extremum are:

F_x = \frac{\partial f(x,y)}{\partial x} = 0, \qquad F_y = \frac{\partial f(x,y)}{\partial y} = 0 ..........(1)

The type of extremum is obtained from the second-order conditions: if the second derivative is greater than zero the point is a minimum, if it is less than zero the point is a maximum, and if it is zero the critical point is an inflexion point. In two-variable problems this can also correspond to a saddle point.

F_{xx} = \frac{d^2 f}{dx^2} > 0 : \text{minimum}; \qquad F_{yy} = \frac{d^2 f}{dy^2} < 0 : \text{maximum}; \qquad \frac{d^2 f}{dx^2} = 0 : \text{inflexion}

8.1.1 Lagrange Multipliers and Constrained Optimization

There are two types of optimization problems:
i) Unconstrained optimization problems
ii) Constrained optimization problems

Unconstrained Optimization

Unconstrained optimization problems consider the problem of minimizing an objective function that depends on real variables with no restrictions on their values. In unconstrained optimization there are no constraints, and the function is optimized as it is.

Example 1. Optimize the unconstrained function given by

f(x, y) = 3x^2 + 6y^2 - xy

The solution gives a pair of simultaneous equations:

F_x = \frac{\partial f(x,y)}{\partial x} = 6x - y = 0, \qquad F_y = \frac{\partial f(x,y)}{\partial y} = 12y - x = 0

Substituting y = 6x into the second equation gives 71x = 0, so the only values of x and y that satisfy both equations are x = y = 0: the unconstrained minimum is at the origin.
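The first- and second-order tests above can be checked with a computer algebra system. The following is a minimal sketch, assuming SymPy is available; the solve call applies the first-order conditions and the Hessian gives the second-order test:

```python
# A minimal sketch (assuming SymPy) of the first- and second-order
# tests for Example 1: f(x, y) = 3x^2 + 6y^2 - xy.
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 3*x**2 + 6*y**2 - x*y

# First-order conditions: f_x = 0, f_y = 0
fx, fy = sp.diff(f, x), sp.diff(f, y)
print(sp.solve([fx, fy], [x, y], dict=True))   # [{x: 0, y: 0}] -- the origin

# Second-order test: the Hessian of f
H = sp.hessian(f, (x, y))
print(H)          # Matrix([[6, -1], [-1, 12]])
print(H.det())    # 71 > 0, and f_xx = 6 > 0, so the origin is a minimum
```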
Constrained Optimization

Sometimes there are constraints which restrict us from freely choosing the variables, for example:
i) maximize volume subject to limited material costs;
ii) minimize surface area subject to a fixed volume.

Constrained optimization adds conditions, g1(x,y) and g2(x,y), to the quadratic function f(x,y) to be optimized. The constraints give the allowable limits within which the optimization is required to hold. The constraints may be given as:
i) g1(x,y) = 0 : an equality constraint
ii) 0 < g1(x,y) < u_max : an equation within two limits

We shall consider the first case, where the constraint is given as an equation.

Example 2

Optimize the function

f(x, y) = 3x^2 + 6y^2 - xy

subject to the constraint

g(x, y): x + y = 20

There are several methods used in constrained optimization which include, among others:
1. Substitution method
2. Lagrange multipliers
3. Total differential
4. Dynamic programming

We shall illustrate the first two methods and discuss the Lagrange multiplier method in detail.

8.1.2 Substitution method

1. Rewrite the constraint to give x in terms of y:

g(x, y): x = 20 - y

2. Substitute back into the original function:

f(x, y) = 3(20 - y)^2 + 6y^2 - (20 - y)y = 1200 - 140y + 10y^2

3. Differentiate and equate to zero:

\frac{df}{dy} = 20y - 140 = 0, \quad \text{and } x + y = 20; \qquad y = 7, \; x = 13

4. Is the extremum a maximum or a minimum?

f_x = 6x - y, \quad f_{xx} = 6 > 0; \qquad f_y = 12y - x, \quad f_{yy} = 12 > 0

Both second derivatives are positive, so the point is a minimum.

The function f(x,y) is called the objective function and is in quadratic form; the constraint can take different forms.

8.1.3 Lagrange Multiplier method: Equality constraints

The method of Lagrange multipliers is an indirect optimization strategy for finding the local maxima and minima of a function subject to equality constraints. The basic idea is to convert a constrained problem into a form to which the derivative test of an unconstrained problem can still be applied. The relationship between the gradient of the function and the gradients of the constraints leads to a reformulation of the original problem, known as the Lagrangian function.

Consider the optimization of a function with a constraint:

Optimize f(x, y) = 3x^2 + 6y^2 - xy subject to g(x, y): x + y = 20.

This can be reformulated so that the function and the constraint become part of the same problem, in the following steps (a scripted sketch of the same steps is given below, and the worked example after it performs them by hand):

Steps
1. Introduce a parameter (multiplier), λ; this gives the multiple of the constraint that satisfies the function:

\lambda g(x, y) = \lambda(20 - x - y) = 0

2. Combine the function and the constraint as one function, called the Lagrangian, L(x, y, λ):

L(x, y, \lambda) = 3x^2 + 6y^2 - xy + \lambda(20 - x - y)

3. Differentiate L(x, y, λ) to obtain the critical points:

\frac{\partial L}{\partial x} = 0, \qquad \frac{\partial L}{\partial y} = 0, \qquad \frac{\partial L}{\partial \lambda} = 0

4. Solve for the optimal point. The solution corresponding to the original constrained optimization is always a saddle point of the Lagrangian function, which can be identified among the stationary points from the definiteness of the bordered Hessian matrix.
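The four steps can be scripted directly. The following is a minimal sketch, assuming SymPy is available; the symbol `lam` stands in for the multiplier λ:

```python
# A minimal sketch (assuming SymPy) of the Lagrange multiplier steps for
# the running example: optimize 3x^2 + 6y^2 - xy subject to x + y = 20.
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = 3*x**2 + 6*y**2 - x*y
g = 20 - x - y                       # constraint rewritten so that g = 0

# Step 2: the Lagrangian L(x, y, lambda) = f + lambda*g
L = f + lam*g

# Steps 3-4: differentiate with respect to x, y and lambda, then solve
eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, [x, y, lam], dict=True))
# [{x: 13, y: 7, lambda: 71}] -- the optimal point (x*, y*, lambda*)
```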
Worked example: Optimize the function

f(x, y) = 3x^2 + 6y^2 - xy

subject to the constraint g(x, y): x + y = 20.

1. Rewrite the constraint function so that the result is zero:

20 - x - y = 0 ..........(2)

2. Introduce another parameter, λ, a multiplier of the constraint function g(x, y), called the Lagrange multiplier; any multiple of the constraint still gives zero:

\lambda(20 - x - y) = 0 ..........(3)

3. Form and write the Lagrangian function, L(x, y, λ) = f(x, y) + λg(x, y), using the objective and the constraint. Note that at any feasible point L(x, y, λ) = f(x, y), since g(x, y) = 0.

L = 3x^2 + 6y^2 - xy + \lambda(20 - x - y)

4. Determine the first-order conditions (FOC) with respect to x, y and λ, assuming all other conditions remain constant (ceteris paribus):

L_x = \frac{\partial L}{\partial x} = 6x - y - \lambda = 0 ..........(4)
L_y = \frac{\partial L}{\partial y} = -x + 12y - \lambda = 0 ..........(5)
L_\lambda = \frac{\partial L}{\partial \lambda} = 20 - x - y = 0 ..........(6)

5. Solve (4) and (5) as simultaneous equations to eliminate λ:

7x - 13y = 0 ..........(7)

6. Substitute x = 20 - y from (6) into (7): 7(20 - y) = 13y gives y = 7, x = 13, and from (4), λ = 6(13) - 7 = 71.

NB: The optimal point is (x*, y*, λ*) = (13, 7, 71). The first-order derivatives there are L_x = L_y = L_λ = 0, and the optimal Lagrangian value is L(x*, y*, λ*) = 710.

The Lagrange multiplier is itself taken to be a variable, and we have to determine the optimal value of λ. In optimal control it is sometimes called the co-state vector.

Example 3

Minimize the function f(x_1, x_2) = x_1 + x_2^2 subject to the constraints

g_1(x_1, x_2): x_1 = 1, \qquad g_2(x_1, x_2): x_1^2 + x_2^2 = 1

The solution is (x_1, x_2) = (1, 0). With two constraints we form one multiplier per constraint:

L(x_1, x_2, \lambda_1, \lambda_2) = f(x_1, x_2) + \lambda_1 g_1(x_1, x_2) + \lambda_2 g_2(x_1, x_2)

Write the expression:

L(x_1, x_2, \lambda_1, \lambda_2) = x_1 + x_2^2 + \lambda_1(x_1 - 1) + \lambda_2(1 - x_1^2 - x_2^2)

Differentiate with respect to each variable:

L_{x_1} = \frac{\partial L}{\partial x_1} = 1 + \lambda_1 - 2\lambda_2 x_1 = 0
L_{x_2} = \frac{\partial L}{\partial x_2} = 2x_2 - 2\lambda_2 x_2 = 0
L_{\lambda_1} = \frac{\partial L}{\partial \lambda_1} = x_1 - 1 = 0
L_{\lambda_2} = \frac{\partial L}{\partial \lambda_2} = 1 - x_1^2 - x_2^2 = 0

The last two equations give x_1 = 1 and x_2 = 0, which is the stated solution.

8.1.4 Using the Total Derivative

For the function f(x, y) and constraint g(x, y), the total derivatives along the constraint must both vanish, dF(x, y) = dG(x, y) = 0:

dF = \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial y}dy = F_x dx + F_y dy = 0
dG = \frac{\partial G}{\partial x}dx + \frac{\partial G}{\partial y}dy = G_x dx + G_y dy = 0

1. From each equation, the slope along the constraint is

\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{G_x}{G_y}

2. Rearranging gives the ratio condition

\frac{F_x}{G_x} = \frac{F_y}{G_y}

In the two-variable form, setting df = 0 with the constraint g(x_1, x_2) = 0:

df = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 = 0, \qquad dg = \frac{\partial g}{\partial x_1}dx_1 + \frac{\partial g}{\partial x_2}dx_2 = 0

so that dx_2 = -\left(\frac{g_{x_1}}{g_{x_2}}\right)dx_1 and

\frac{df}{dx_1} = \frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2}\left(\frac{g_{x_1}}{g_{x_2}}\right) = 0

The ratio of partial derivatives in the parentheses, f_x/g_x, is a Lagrange multiplier, λ. Lagrange multipliers are a ratio of partial derivatives at the optimum. This is represented in Fig. 8.4: the red curve shows the constraint g(x, y) = c; the blue curves are contours of f(x, y); the point where the red constraint tangentially touches a blue contour is the maximum of f(x, y) along the constraint.

Fig. 8.4 Representation of f(x, y) and g(x, y)

Example

Find the maximum and minimum of the function

f(x_1, x_2) = x_1^2 + x_2^2 - 2x_1 - 6x_2 + 8

subject to the constraint x_1 + x_2 = 10. The Lagrangian is L = f(x_1, x_2) + \lambda(10 - x_1 - x_2). Equating the ratios of the partial derivatives:

\lambda = 2x_1 - 2 ..........(1)
\lambda = 2x_2 - 6 ..........(2)
x_1 + x_2 = 10

Subtracting 1) from 2): x_2 - x_1 = 2, and putting this back into the constraint, (x_2 - 2) + x_2 = 10, gives x_2 = 6 and x_1 = 4.

Example

Maximize the function F(x_1, x_2) = x_1 x_2 + 2x_1 subject to g(x_1, x_2): 60 - 4x_1 - 2x_2 = 0.

Taking first-order differentials of F and of the constraint (G):

dF = (x_2 + 2)dx_1 + x_1 dx_2 = 0, \qquad dG = -4dx_1 - 2dx_2 = 0

From these,

\frac{dx_2}{dx_1} = -\frac{x_2 + 2}{x_1} = -\frac{4}{2} = -2, \qquad \text{so } x_2 + 2 = 2x_1

Substituting back into the constraint, 60 = 4x_1 + 2(2x_1 - 2), which gives

x_1^* = 8, \qquad x_2^* = 14, \qquad F^*(8, 14) = 8 \times 14 + 2 \times 8 = 128
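The same example can be cross-checked with an explicit multiplier rather than differentials. A minimal sketch, again assuming SymPy:

```python
# A minimal sketch (assuming SymPy) solving the last example with a
# Lagrange multiplier: maximize F = x1*x2 + 2*x1 s.t. 60 - 4*x1 - 2*x2 = 0.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)
F = x1*x2 + 2*x1
g = 60 - 4*x1 - 2*x2

L = F + lam*g
stationary = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)],
                      [x1, x2, lam], dict=True)
print(stationary)             # [{x1: 8, x2: 14, lambda: 4}]
print(F.subs(stationary[0]))  # 128
```

Note that λ* = 4 equals the ratio of partial derivatives at the optimum, (x_2* + 2)/4 = x_1*/2, as the total-derivative argument predicts.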
8.2 Hamilton's Principle of Least Action

Hamilton's principle gives a minimization problem expressed as an integral, also of quadratic form; it involves quadratic energy functions. The Hamiltonian principle states that:

"The motion of the system from time t_1 to time t_2 is such that the line integral (called the action or action integral)

A = \int_{t_0}^{t_f} L \, dt

has a stationary value for the actual path of motion."

The integral, A, is called the action of the system. Equivalently:

"The path followed by a mechanical system during some time interval, [t_1, t_2], is the path that makes the integral of the difference between the kinetic and the potential energy, L = T - U, stationary."

"Of all possible paths along which a dynamical system may move from one point to another, in a given time interval (consistent with the constraints), the actual path followed is the one which minimizes the time integral of the difference between the kinetic and the potential energy."

That is, the actual path is the one which makes the variation of the following integral equal to zero:

\delta \int_{t_0}^{t_f} (T - V) \, dt = \delta \int_{t_0}^{t_f} L \, dt = 0

L = T - V is the Lagrangian of the system; T and V are respectively the kinetic and potential energy of the system. For a mass-spring system, for example,

L(\dot{x}, x, t) = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2

The principle gives the minimum quantity (time, path, energy, ...) for the system: every system moves in such a way as to minimize the action. For the mass-spring system,

\delta \int_{t_0}^{t_f} \left( \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2 \right) dt = 0

Examples of minimum or variational principles:
• the shape of a drop of water;
• the law of reflection of light, θ_i = θ_r;
• a string tied at both ends.

The principle of least action is used in defining the optimal control that minimizes the various forms of action on the state, the input, or the time required to move from one point to another. The minimizing solution is given by the Euler-Lagrange equation used in the calculus of variations:

\frac{\partial L(x, \dot{x})}{\partial x} - \frac{d}{dt}\left[ \frac{\partial L(x, \dot{x})}{\partial \dot{x}} \right] = 0

NB: We shall not derive this equation at this time.

8.2.1 Performance measures

Performance measures are intended to minimize or maximize a required parameter of interest. Some of the common objectives are to:
i) reduce the tracking error
ii) optimize the state vector
iii) minimize the control energy
iv) minimize wastage

The performance measures are a form of Lyapunov function, V(x), so they are quadratic functions that can be optimized. Given a state space model,

\dot{x} = Ax + bu; \qquad y = Cx

a performance index is defined using the input and state vectors as a quadratic function, an energy function, or a form of Lyapunov function that can be optimized, i.e. that has a minimum or a maximum:

J(x, u) = ax^2 + ru^2; \qquad L(x, u) = x^T Q x + u^T R u; \qquad J(x, u) = \int L(x, u) \, dt

Fig. 8.5 Performance measurement of a system

The control problem now has two parts:

1) \dot{x} = Ax + bu; \qquad y = Cx
2) L = x^T Q x + u^T R u

which together form the required optimization problem, with f(x, t) = Ax + bu as the constraint and the quadratic cost (in the scalar case, J = qx^2 + ru^2) as the objective function. The corresponding new Lagrangian, which includes the state equation as a constraint, is:

3) L = x^T Q x + u^T R u + \lambda(Ax + bu - \dot{x})

Using these tools the control problem may be solved. The performance measure is normally expressed as an integral equation:

J = S(x, t_f) + \int_{t_0}^{t_f} (x^T Q x + u^T R u) \, dt

where S(x, t_f) refers to the final-state cost and the integral term to the dynamic part.

8.2.2 Types of optimal control problems

Optimal control problems may be classified according to the parameter to be optimized:
i) state and output regulator problem
ii) tracking problem
iii) minimum energy
iv) minimum time
v) minimum fuel

In each case the performance criterion is in a quadratic mathematical form.

8.2.3 The State Regulator Problem

The state regulator problem transfers the system from an initial state x_0 = x_r to the final state, x_f, with minimum integral square error:

J(x, t) = \int_0^{t_f} (x - x_r)^T (x - x_r) \, dt = \int_0^{t_f} \| x - x_r \|^2 \, dt = \int_0^{t_f} \| e \|^2 \, dt

where \| x - x_r \| is the norm of the error vector, e = (x - x_r). With a change of the coordinate system so that the motion starts from the origin, this may be written in the more general infinite-time form as:

J(x, t) = \int_0^{\infty} x^T Q x \, dt

where Q is a symmetric positive definite constant matrix,

Q = \begin{bmatrix} q_1 & & & 0 \\ & q_2 & & \\ & & \ddots & \\ 0 & & & q_n \end{bmatrix}

This is the normal problem with the performance criterion defined as a Lyapunov function. The complete regulator problem defines both the state and control weighting matrices, for example:

\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -2 & -2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u, \qquad Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

and

J(x, u) = \int_0^{t_f} (x^T Q x + u^T R u) \, dt = \int_0^{t_f} \left[ (x_1^2 + x_2^2) + u^2 \right] dt

where R is a diagonal, positive definite matrix associated with the input, u(t).
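As a numerical illustration of how such a performance index is evaluated, the sketch below (assuming NumPy and SciPy, with a hypothetical gain K = [1, 1] chosen only for illustration) simulates the example system under a state feedback u = -Kx and accumulates J by augmenting the state with the running cost:

```python
# A minimal numerical sketch (assuming NumPy/SciPy) of evaluating the
# quadratic performance index J = integral of (x'Qx + u'Ru) dt for the
# regulator example, under a hypothetical feedback u = -Kx.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-2.0, -2.0]])
b = np.array([[0.0], [1.0]])
Q = np.eye(2)                # weights x1^2 + x2^2
R = np.array([[1.0]])        # weights u^2
K = np.array([[1.0, 1.0]])   # assumed gain, for illustration only

def dynamics(t, z):
    x = z[:2].reshape(2, 1)
    u = -K @ x                               # feedback law
    dx = A @ x + b @ u                       # state equation
    dJ = (x.T @ Q @ x + u.T @ R @ u).item()  # running cost x'Qx + u'Ru
    return [dx[0, 0], dx[1, 0], dJ]

# Augment the state with the running cost and integrate from x0 = [1, 0]
sol = solve_ivp(dynamics, (0.0, 20.0), [1.0, 0.0, 0.0], rtol=1e-8)
print("J =", sol.y[2, -1])
```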
8.2.4 Minimum Energy

The objective is to transfer the system from an initial state, x(0), to the final state, x(t_f), with minimum expenditure of the energy applied to drive the system. u^2(t) is a measure of the instantaneous rate of expenditure of energy; therefore, to minimize this we use

J(u, t) = \int_0^{t_f} u^2 \, dt

If there are several inputs, the squares of all the inputs are summed:

J(u, t) = \int_0^{t_f} \sum_i u_i^2 \, dt = \int_0^{t_f} u^T u \, dt

8.2.5 Minimum Time

The objective is to transfer the system from an initial state, x(0), to the final state in minimum time:

J(x, t) = \int_0^{t_f} 1 \, dt = t_f
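As a small numerical illustration (assuming NumPy; both inputs are hypothetical), the sketch below compares the energy measure J(u) = ∫u² dt for two inputs that each drive a pure integrator dx/dt = u from x(0) = 0 to x(1) = 1:

```python
# A minimal sketch (assuming NumPy) of the minimum-energy measure
# J(u) = integral of u^2 dt, comparing two hypothetical inputs that both
# move a pure integrator dx/dt = u from x(0) = 0 to x(1) = 1.
import numpy as np

t = np.linspace(0.0, 1.0, 1001)

u_const = np.ones_like(t)            # constant input: integral of u dt = 1
u_ramp = 3.0 * t**2                  # another input with integral of u dt = 1

J_const = np.trapz(u_const**2, t)    # = 1.0
J_ramp = np.trapz(u_ramp**2, t)      # = 9/5 = 1.8
print(J_const, J_ramp)
```

Among all inputs achieving the same net displacement, the constant input gives the smaller value of ∫u² dt, which is exactly what the minimum-energy criterion selects.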