Class V – Nonlinear Programming

Topics
- Unconstrained nonlinear programming – gradient descent
- Lagrange multipliers
- The Karush-Kuhn-Tucker (KKT) conditions for constrained optimization
- Types of nonlinear programming
- Algorithms for the solution of convex programming

General Problem

Maximize f(x), subject to

  g_i(x) \le b_i  for i = 1, 2, \ldots, m,

and x_j \ge 0 for all j.

Example: Portfolio Selection with Risky Securities

It is now common practice for professional managers of large stock portfolios to use computer models based partially on nonlinear programming to guide them. Because investors are concerned about both the expected return (gain) and the risk associated with their investments, nonlinear programming is used to determine a portfolio that, under certain assumptions, provides an optimal trade-off between these two factors. This approach is based largely on path-breaking research done by Harry Markowitz and William Sharpe that helped them win the 1990 Nobel Prize in Economics.

A nonlinear programming model can be formulated for this problem as follows. Suppose that n stocks (securities) are being considered for inclusion in the portfolio, and let the decision variables x_j (j = 1, 2, \ldots, n) be the number of shares of stock j to be included. Let \mu_j and \sigma_{ij} be the (estimated) mean return of stock j and the covariance between the returns of stocks i and j, and define

  R(x) = \sum_{j=1}^{n} \mu_j x_j ,    V(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j .

Take V(x) as the objective function and R(x) as a constraint:

  Minimize V(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j ,
  subject to R(x) = \sum_{j=1}^{n} \mu_j x_j \ge L ,
             B(x) = \sum_{j=1}^{n} P_j x_j \le B ,
             x_j \ge 0 for all j,

where L is the minimum acceptable expected return, P_j is the price of a share of stock j, and B is the budget. One could then solve this problem for any possible value of L.

Another example arises when the profit from a product is nonlinear: P(x) = x\,p(x) - cx, where p(x) is the price at which x units can be sold and c is the unit cost.

Gradient Descent

The general condition for a solution of an unconstrained minimization/maximization is \nabla f(x) = 0 with the Hessian positive definite (for a minimum; negative definite for a maximum).
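To make the portfolio model concrete, here is a minimal brute-force sketch in Python. The two-stock data (means, covariances, prices, budget, and return threshold L) are invented for illustration, not values from the notes; the code simply enumerates integer share counts and keeps the feasible portfolio of minimum variance V(x).

```python
from itertools import product

# Hypothetical two-stock data (illustrative assumptions, not from the notes):
mu = [0.12, 0.08]                      # estimated mean returns mu_j per share
sigma = [[0.050, -0.010],              # estimated covariances sigma_ij
         [-0.010, 0.020]]
price = [10.0, 5.0]                    # share prices P_j
budget = 100.0                         # budget B
L_min = 1.0                            # minimum acceptable expected return L

def R(x):
    """Expected portfolio return R(x) = sum_j mu_j x_j."""
    return sum(m * xi for m, xi in zip(mu, x))

def V(x):
    """Portfolio variance V(x) = sum_ij sigma_ij x_i x_j."""
    n = len(x)
    return sum(sigma[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Enumerate nonnegative integer share counts (ranges chosen to cover the
# budget) and keep the feasible portfolio of minimum variance.
best = None
for x in product(range(11), range(21)):
    if sum(p * xi for p, xi in zip(price, x)) <= budget and R(x) >= L_min:
        if best is None or V(x) < V(best):
            best = x

print(best, R(best), V(best))
```

A real solver would treat x as continuous and use a quadratic-programming method; the enumeration here only serves to show the model's ingredients.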
The way to find such a point is the gradient descent (here, ascent) algorithm:
- Choose any initial guess x_0.
- Form x^* = x_n + t \nabla f(x_n).
- Treat the objective as a function of the step size: f(t) = f(x^*).
- Minimize/maximize over t: solve df(t)/dt = 0 for t^*.
- Set x_{n+1} = x_n + t^* \nabla f(x_n) and repeat.

Example: f(x) = 2 x_1 x_2 + 2 x_2 - x_1^2 - 2 x_2^2.

Since this converging sequence of trial solutions never reaches its limit, in practice the procedure stops somewhere (slightly before the optimum), once the gradient falls below a chosen tolerance.

Lagrange Multipliers

One constraint – the Lagrangian

Consider a two-dimensional case. Suppose we have a function f(x, y) we wish to maximize or minimize subject to the constraint g(x, y) = c, where c is a constant. We can visualize contours of f, given by f(x, y) = d for various values of d, and the contour of g given by g(x, y) = c.

Suppose we walk along the contour line with g = c. In general the contour lines of f and g may be distinct, so traversing the contour line for g = c could intersect with or cross the contour lines of f. This is equivalent to saying that while moving along the contour line for g = c, the value of f can vary. Only where the contour line for g = c touches a contour line of f tangentially do we neither increase nor decrease the value of f; that is, where the contour lines touch but do not cross. This occurs exactly when the tangential component of the total derivative vanishes, df_\parallel = 0, which is at the constrained stationary points of f (these include the constrained local extrema, assuming f is differentiable). Computationally, this is when the gradient of f is normal to the constraint(s):

  \nabla f(x_0, y_0) = \lambda \nabla g(x_0, y_0)

for some scalar \lambda. Note that the constant \lambda is required because, even though the directions of the two gradient vectors are equal, their magnitudes generally are not. Geometrically, the tangency condition says that the gradients of f and g are parallel vectors at the maximum, since gradients are always normal to the contour lines.
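The steps above can be sketched on the example function. Since this f is concave we ascend rather than descend, and because f is quadratic the exact line-search step has a closed form, t^* = -g^T g / g^T H g, where H is the (constant) Hessian; both facts are used in this sketch.

```python
def grad(x1, x2):
    # gradient of f(x) = 2*x1*x2 + 2*x2 - x1**2 - 2*x2**2
    return (2*x2 - 2*x1, 2*x1 + 2 - 4*x2)

H = [[-2.0, 2.0], [2.0, -4.0]]   # constant Hessian of this quadratic f

x1, x2 = 0.0, 0.0                # initial guess x_0
for _ in range(100):
    g1, g2 = grad(x1, x2)
    gg = g1*g1 + g2*g2
    if gg < 1e-18:               # stop when the gradient is numerically zero
        break
    # exact line search: d/dt f(x + t g) = g.g + t g^T H g = 0
    gHg = g1*(H[0][0]*g1 + H[0][1]*g2) + g2*(H[1][0]*g1 + H[1][1]*g2)
    t = -gg / gHg                # gHg < 0 since H is negative definite
    x1, x2 = x1 + t*g1, x2 + t*g2

print(x1, x2)   # approaches the maximizer (1, 1), where f(1, 1) = 1
```

The iterates zigzag toward (1, 1) but, as noted above, only a stopping tolerance halts the procedure; it never lands on the limit exactly in general.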
Thus we want points (x, y) where g(x, y) = c and \nabla f(x_0, y_0) = \lambda \nabla g(x_0, y_0). To incorporate these conditions into one equation, we introduce an auxiliary function

  F(x, y, \lambda) = f(x, y) + \lambda (g(x, y) - c)

and solve \nabla F(x, y, \lambda) = 0.

Caveat: be aware that the solutions are the stationary points of the Lagrangian F, and some of them may be saddle points.

The weak Lagrangian principle – multiple constraints

Denote the objective function by f(x) and let the constraints be given by g_k(x) = 0. The domain of f(x) should be an open set containing all points satisfying the constraints. Furthermore, f(x) and the g_k(x) must have continuous first partial derivatives, and the gradients of the g_k(x) must not be zero on the domain. Now define the Lagrangian \Lambda as

  \Lambda(x, \lambda) = f(x) + \sum_k \lambda_k g_k(x).

The stationary points of this Lagrangian provide all possible solutions of the constrained problem.

Several constraints at once

Again, it is easy to understand this graphically. Consider the example shown at right: the solution is constrained to lie on the brown plane (as an equation, "g(P) = 0") and also to lie on the purple ellipsoid ("h(P) = 0"). For both to be true, the solution must lie on the black ellipse where the two intersect. I have drawn several normal vectors to each constraint surface along the intersection. The important observation is that both normal vectors are perpendicular to the intersection curve at each point. In fact, any vector perpendicular to the curve can be written as a linear combination of the two normal vectors. (Assuming the two are linearly independent! If not, the two constraints may already give a specific solution: in our example, this would happen if the plane constraint were exactly tangent to the ellipsoid constraint at a single point.) The pink ellipsoids at right all have the same two foci (faintly visible as black dots in the middle), and represent surfaces of constant total distance for travel from one focus to the surface and back to the other.
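As a sketch of solving \nabla F = 0 numerically, the following applies Newton's method to f(x, y) = x + y subject to x^2 + y^2 = 1 (the unit-circle constraint is chosen here purely for illustration). The 3x3 linear solves use Cramer's rule to keep the example dependency-free.

```python
def residual(x, y, lam):
    # gradient of F(x, y, lam) = (x + y) + lam*((x**2 + y**2) - 1)
    return [1 + 2*lam*x, 1 + 2*lam*y, x*x + y*y - 1]

def jacobian(x, y, lam):
    # Jacobian of the residual with respect to (x, y, lam)
    return [[2*lam, 0.0, 2*x],
            [0.0, 2*lam, 2*y],
            [2*x, 2*y, 0.0]]

def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(m):
        return (m[0][0]*(m[1][1]*m[2][2] - m[1][2]*m[2][1])
              - m[0][1]*(m[1][0]*m[2][2] - m[1][2]*m[2][0])
              + m[0][2]*(m[1][0]*m[2][1] - m[1][1]*m[2][0]))
    d = det(A)
    out = []
    for k in range(3):
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        out.append(det(Ak) / d)
    return out

# Newton's method on grad F = 0, starting near the positive stationary point
x, y, lam = 1.0, 1.0, -1.0
for _ in range(50):
    r = residual(x, y, lam)
    dx, dy, dlam = solve3(jacobian(x, y, lam), [-ri for ri in r])
    x, y, lam = x + dx, y + dy, lam + dlam

print(x, y, lam)   # tends to x = y = 1/sqrt(2), lam = -1/sqrt(2)
```

A different starting guess would find the other stationary point, x = y = -1/\sqrt{2}; as the caveat above says, the stationarity conditions deliver all candidates, and one must still classify them.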
As in two dimensions, the optimal ellipsoid is tangent to the constraint curve, and consequently its normal vector is perpendicular to the combined constraint (as shown). Thus, the normal vector can be written as a linear combination of the normal vectors of the two constraint surfaces. In equations, this statement reads

  \nabla f = \lambda \nabla g + \mu \nabla h .

Examples for Lagrange multipliers

1. f(x, y) = x + y, subject to x^2 + y^2 = r.
2. f(x, y) = x^2 y, subject to x^2 + a y^2 = r.
3. f(x, y, z) = 20 + 2x + 2y + z^2, subject to x^2 + y^2 + z^2 = 11 and x + y + z = 3.

The strong Lagrangian principle: Lagrange duality

Given a convex optimization problem in standard form,

  maximize f(x),
  subject to g_i(x) \le b_i for i = 1, 2, \ldots, m,
  and h_j(x) = c_j for j = 1, \ldots, p,

with the domain D having non-empty interior, the Lagrangian function is defined as

  L(x, \lambda, \nu) = f(x) + \sum_i \lambda_i (b_i - g_i(x)) + \sum_j \nu_j (c_j - h_j(x)) .

The vectors \lambda and \nu are called the dual variables or Lagrange multiplier vectors associated with the problem. The Lagrange dual function is defined as

  g(\lambda, \nu) = \sup_{x \in D} L(x, \lambda, \nu) = \sup_{x \in D} \Big[ f(x) + \sum_i \lambda_i (b_i - g_i(x)) + \sum_j \nu_j (c_j - h_j(x)) \Big] .

The dual function g is convex in (\lambda, \nu) (being a pointwise supremum of affine functions), even when the initial problem is not convex. The dual function yields upper bounds on the optimal value p^* of the initial problem: for any \lambda \ge 0 and any \nu we have g(\lambda, \nu) \ge p^*. If the original problem is convex, then we have strong duality, i.e.

  d^* = \min_{\lambda \ge 0, \nu} g(\lambda, \nu) = p^* .
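A minimal one-variable sketch of weak and strong duality, using an invented concave problem (maximize f(x) = -x^2 + 4x subject to x \le 1); the problem data and the grid over \lambda are illustrative assumptions, not from the notes.

```python
def f(x):
    return -x*x + 4*x          # concave objective (hypothetical example)

b = 1.0                        # single inequality constraint g(x) = x <= b

def L(x, lam):
    return f(x) + lam*(b - x)  # Lagrangian for: maximize f s.t. x <= b

def g_dual(lam):
    # sup over x of L(x, lam): L is concave in x, so set dL/dx = -2x + 4 - lam = 0
    x_star = (4.0 - lam) / 2.0
    return L(x_star, lam)

# The unconstrained maximizer x = 2 violates x <= 1, so the optimum is x = b.
p_star = f(b)

# Weak duality: every g_dual(lam) with lam >= 0 is an upper bound on p_star.
lams = [k * 0.001 for k in range(4001)]        # grid over lam in [0, 4]
assert all(g_dual(l) >= p_star - 1e-9 for l in lams)

# Strong duality (the problem is convex): minimizing the dual recovers p_star.
d_star = min(g_dual(l) for l in lams)
print(p_star, d_star)          # both equal 3 (up to grid resolution)
```

Here the dual minimum is attained at \lambda = 2, and the zero duality gap d^* = p^* illustrates the strong Lagrangian principle stated above.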