Lecture 9 – Nonlinear Programming Models

Topics
• Convex sets and convex programming
• First-order optimality conditions
• Examples
• Problem classes

General NLP
Minimize $f(x)$
s.t. $g_i(x) \; (\le, \ge, =) \; b_i, \quad i = 1,\dots,m$
• $x = (x_1,\dots,x_n)^T$ is the n-dimensional vector of decision variables
• $f(x)$ is the objective function
• $g_i(x)$ are the constraint functions
• $b_i$ are fixed known constants

Convex Sets
Definition: A set $S \subseteq \mathbb{R}^n$ is convex if every point on the line segment connecting any two points $x_1, x_2 \in S$ is also in $S$. Mathematically, this is equivalent to
$x_0 = \lambda x_1 + (1 - \lambda) x_2 \in S$ for all $\lambda$ such that $0 \le \lambda \le 1$.
[Figure: example sets, each showing two points x1 and x2 joined by a line segment]

(Nonconvex) Feasible Region
$S = \{(x_1, x_2) : (0.5x_1 - 0.6)x_2 \le 1, \; 2x_1^2 + 3x_2^2 \ge 27; \; x_1, x_2 \ge 0\}$
[Figure: plot of S in the (x1, x2) plane]

Convex Sets and Optimization
Let $S = \{x \in \mathbb{R}^n : g_i(x) \le b_i, \; i = 1,\dots,m\}$
Fact: If $g_i(x)$ is a convex function for each $i = 1,\dots,m$ then $S$ is a convex set.

Convex Programming Theorem: Let $x \in \mathbb{R}^n$ and let $f(x)$ be a convex function defined over a convex constraint set $S$. If a finite solution exists to the problem
Minimize $\{f(x) : x \in S\}$
then all local optima are global optima. If $f(x)$ is strictly convex, the optimum is unique.

Convex Programming
Min $f(x_1,\dots,x_n)$
s.t. $g_i(x_1,\dots,x_n) \le b_i, \; i = 1,\dots,m$; $x_1 \ge 0,\dots,x_n \ge 0$
is a convex program if $f$ is convex and each $g_i$ is convex.

Max $f(x_1,\dots,x_n)$
s.t. $g_i(x_1,\dots,x_n) \le b_i, \; i = 1,\dots,m$; $x_1 \ge 0,\dots,x_n \ge 0$
is a convex program if $f$ is concave and each $g_i$ is convex.
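The (nonconvex) feasible region S given earlier can be tested against the definition of a convex set directly. The Python sketch below picks two feasible points and checks their midpoint. The two points are illustrative choices, not taken from the lecture, and the constraints are read as $(0.5x_1 - 0.6)x_2 \le 1$, $2x_1^2 + 3x_2^2 \ge 27$, $x_1, x_2 \ge 0$.

```python
def in_S(x1, x2):
    """Membership test for S, with the constraints read as stated in the lead-in."""
    return (
        (0.5 * x1 - 0.6) * x2 <= 1
        and 2 * x1 ** 2 + 3 * x2 ** 2 >= 27
        and x1 >= 0
        and x2 >= 0
    )


def convex_combination(p, q, lam):
    """Return lam*p + (1 - lam)*q for two points given as tuples."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(p, q))


# Two feasible points (illustrative choices, not from the lecture).
p = (0.0, 3.0)
q = (4.0, 0.0)

# Their midpoint corresponds to lambda = 0.5 in the definition of convexity.
mid = convex_combination(p, q, 0.5)

p_ok, q_ok, mid_ok = in_S(*p), in_S(*q), in_S(*mid)
print(p_ok, q_ok, mid, mid_ok)
```

One infeasible convex combination of two feasible points is enough to show that S is not convex. Here the midpoint violates the constraint $2x_1^2 + 3x_2^2 \ge 27$.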
Linearly Constrained Convex Function with Unique Global Maximum
Maximize $f(x) = (x_1 - 2)^2 + (x_2 - 2)^2$
subject to
$-3x_1 - 2x_2 \le -6$
$-x_1 + x_2 \le 3$
$x_1 + x_2 \le 7$
$2x_1 - 3x_2 \le 4$
[Figure: feasible region in the (x1, x2) plane, both axes marked from 1 to 5]
(Nonconvex) Optimization Problem

First-Order Optimality Conditions
Minimize $\{f(x) : g_i(x) \le b_i, \; i = 1,\dots,m\}$
Lagrangian: $L(x,\mu) = f(x) + \sum_{i=1}^{m} \mu_i \left( g_i(x) - b_i \right)$
Optimality conditions
• Stationarity: $\nabla_x L(x,\mu) = \nabla f(x) + \sum_{i=1}^{m} \mu_i \nabla g_i(x) = 0$
• Complementarity: $\mu_i \left( g_i(x) - b_i \right) = 0, \; i = 1,\dots,m$
• Feasibility: $g_i(x) \le b_i, \; i = 1,\dots,m$
• Nonnegativity: $\mu_i \ge 0, \; i = 1,\dots,m$

Importance of Convex Programs
Commercial optimization software cannot guarantee that a solution to a nonconvex program is globally optimal.
NLP algorithms try to find a point where the gradient of the Lagrangian function is zero (a stationary point) and complementary slackness holds.
Given $L(x,\mu) = f(x) + \mu \left( g(x) - b \right)$ we want
$\nabla_x L(x,\mu) = \nabla f(x) + \mu \nabla g(x) = 0$
$\mu \left( g(x) - b \right) = 0$
$g(x) - b \le 0, \; \mu \ge 0$
For a convex program, all local solutions are global optima.

Example: Cylinder Design
We want to build a cylinder (with a top and a bottom) of maximum volume such that its surface area is no more than s units. The area limit is tight at the optimum, so it is written as an equality.
Max $V(r,h) = \pi r^2 h$
s.t. $2\pi r^2 + 2\pi r h = s$
$r \ge 0, \; h \ge 0$
[Figure: cylinder with radius r and height h]
There are a number of ways to approach this problem. One way is to solve the surface area constraint for h and substitute the result into the objective function.

Solution by Substitution
$h = \dfrac{s - 2\pi r^2}{2\pi r}$
Volume $= V = \pi r^2 \left[ \dfrac{s - 2\pi r^2}{2\pi r} \right] = \dfrac{rs}{2} - \pi r^3$
$\dfrac{dV}{dr} = \dfrac{s}{2} - 3\pi r^2 = 0 \;\Rightarrow\; r = \left( \dfrac{s}{6\pi} \right)^{1/2}$
$h = \dfrac{s}{2\pi r} - r = 2 \left( \dfrac{s}{6\pi} \right)^{1/2}$
$V = \pi r^2 h = 2\pi \left( \dfrac{s}{6\pi} \right)^{3/2}$
Is this a global optimal solution?

Test for Convexity
$V(r) = \dfrac{rs}{2} - \pi r^3$
$\dfrac{dV(r)}{dr} = \dfrac{s}{2} - 3\pi r^2$
$\dfrac{d^2V(r)}{dr^2} = -6\pi r$
$\dfrac{d^2V}{dr^2} \le 0$ for all $r \ge 0$
Thus $V(r)$ is concave on $r \ge 0$, so the solution is a global maximum.

Advertising (with Diminishing Returns)
• A company wants to advertise in two regions.
• The marketing department says that if $x_1$ dollars are spent in region 1, sales volume will be $6 x_1^{1/2}$.
• If $x_2$ dollars are spent in region 2, sales volume will be $4 x_2^{1/2}$.
• The advertising budget is 100 dollars.
Model: Max $f(x) = 6 x_1^{1/2} + 4 x_2^{1/2}$
s.t. $x_1 + x_2 \le 100, \; x_1 \ge 0, \; x_2 \ge 0$
Solution: $x_1^* = 69.2, \; x_2^* = 30.8, \; f(x^*) = 72.1$
Is this a global optimum? Yes: f is a sum of concave square-root terms and the constraints are linear, so this is a convex program.

Excel Add-in Solution
[Spreadsheet screenshot: nonlinear model "Adv100"; Type: NLP1; Goal: Max; Solver: Excel Solver; Sens.: Yes; Status: Optimal; Comp. Time: 00:00; Objective: 72.111]

Variables
Name                    X1        X2
Values                  69.231    30.769
Lower Bounds            0         0
Linear Obj. Coef.       0         0
Nonlinear Obj. Terms    8.3205    5.547
Nonlinear Obj. Coef.    6         4

Constraints
Num.   Name   Value   Rel.   RHS      Linear Constraint Coefficients (X1, X2)
1      Con1   100     <=     100      1, 1
2      Con2   0       <=     10000    0, 0

Portfolio Selection with Risky Assets (Markowitz)
• Suppose that we may invest in (up to) n stocks.
• Investors worry about (1) expected gain and (2) risk.
Let
$r_j$ = random variable associated with return on stock j
$\mu_j$ = expected return on stock j
$\sigma_{jj}$ = variance of return for stock j
We are also concerned with the covariance terms: $\sigma_{ij} = \mathrm{cov}(r_i, r_j)$
If $\sigma_{ij} > 0$ then returns on i and j are positively correlated. If $\sigma_{ij} < 0$ then returns are negatively correlated.

Decision Variables: $x_j$ = # of shares of stock j purchased
Expected return of the portfolio: $R(x) = \sum_{j=1}^{n} \mu_j x_j$
Variance (measure of risk): $V(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij} x_i x_j$

Example: $\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$
If $x_1 = x_2 = 1$, we get
$V(x) = \sigma_{11} x_1 x_1 + \sigma_{12} x_1 x_2 + \sigma_{21} x_2 x_1 + \sigma_{22} x_2 x_2 = 1 + (-1) + (-1) + 1 = 0$
Thus we can construct a "risk-free" portfolio (from a variance point of view) if we can find stocks that are "fully" negatively correlated.
If $\begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$, then buying stock 2 is just like buying additional shares of stock 1.

Nonlinear optimization models …
Let
$p_j$ = price of stock j
$b$ = our total budget
$\beta$ = risk-aversion factor (when $\beta = 0$ risk is not a factor)
Consider 3 different models:
1) Max $f(x) = R(x) - \beta V(x)$
s.t. $\sum_{j=1}^{n} p_j x_j \le b, \; x_j \ge 0, \; j = 1,\dots,n$
where $\beta \ge 0$ is determined by the decision maker.
2) Max $f(x) = R(x)$
s.t. $V(x) \le \alpha, \; \sum_{j=1}^{n} p_j x_j \le b, \; x_j \ge 0, \; j = 1,\dots,n$
where $\alpha \ge 0$ is determined by the investor. Smaller values of $\alpha$ represent greater risk aversion.
3) Min $f(x) = V(x)$
s.t. $R(x) \ge \gamma, \; \sum_{j=1}^{n} p_j x_j \le b, \; x_j \ge 0, \; j = 1,\dots,n$
where $\gamma \ge 0$ is the desired rate of return (minimum expectation) selected by the investor.

Hanging Chain with Rigid Links
[Figure: chain hanging across a 10 ft span; each link is 1 ft long; axes x and y]
What is the equilibrium shape of the chain?
Decision variables: Let $(x_j, y_j)$, $j = 1,\dots,n$, be the incremental horizontal and vertical displacement of each link, where $n \ge 10$.
Constraints:
$x_j^2 + y_j^2 = 1, \; j = 1,\dots,n$ (each link has length 1)
$x_1 + x_2 + \cdots + x_n = 10$ (net horizontal displacement)
$y_1 + y_2 + \cdots + y_n = 0$ (net vertical displacement)
Objective: Minimize the chain's potential energy, assuming that the center of mass of each link is at the center of the link. This is equivalent to minimizing
$\tfrac{1}{2} y_1 + \left( y_1 + \tfrac{1}{2} y_2 \right) + \left( y_1 + y_2 + \tfrac{1}{2} y_3 \right) + \cdots + \left( y_1 + y_2 + \cdots + y_{n-1} + \tfrac{1}{2} y_n \right)$
$= \left( n - 1 + \tfrac{1}{2} \right) y_1 + \left( n - 2 + \tfrac{1}{2} \right) y_2 + \left( n - 3 + \tfrac{1}{2} \right) y_3 + \cdots + \tfrac{3}{2} y_{n-1} + \tfrac{1}{2} y_n$

Summary
Min $\sum_{j=1}^{n} \left( n - j + \tfrac{1}{2} \right) y_j$
s.t. $x_j^2 + y_j^2 = 1, \; j = 1,\dots,n$
$x_1 + x_2 + \cdots + x_n = 10$
$y_1 + y_2 + \cdots + y_n = 0$

Is a local optimum guaranteed to be a global optimum?
No! The constraints $x_j^2 + y_j^2 = 1$ for all j yield a nonconvex feasible region, so there may be several local optima.
Consider a chain with 4 links:
[Figure: two configurations of a 4-link chain]
These solutions are both local minima.

Direct Current Network
[Figure: circuit with a 100 V source, resistors labeled 10, 20, 10, 20, and branch currents I1,…,I7]
Problem: Determine the current flows $I_1, I_2, \dots, I_7$ so that the total content is minimized.
Content: $G(I) = \int_0^I v(i)\,di$ for $I \ge 0$ and $G(I) = \int_I^0 v(i)\,di$ for $I < 0$

Solution Approach
Electrical Engineering: Use Kirchhoff's laws to find the currents when the power source is given.
Operations Research: Optimize a performance measure in the network, taking flow balance into account.
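The content integral defined above can be approximated numerically for any voltage characteristic $v(i)$. A minimal Python sketch using the midpoint rule follows. The function name `content`, the step count, the test current of 2, and the element values R = 10 and E = 100 are illustrative assumptions, and only the $I \ge 0$ case is exercised.

```python
def content(v, current, steps=1000):
    """Midpoint-rule approximation of the integral of v(i) from 0 to `current`."""
    h = current / steps
    return sum(v((k + 0.5) * h) for k in range(steps)) * h


R = 10.0   # resistance (illustrative value)
E = 100.0  # battery voltage (illustrative value)


def resistor_voltage(i):
    """Linear resistor: voltage proportional to current."""
    return i * R


def battery_voltage(i):
    """Battery: constant voltage -E, independent of current."""
    return -E


g_res = content(resistor_voltage, 2.0)
g_bat = content(battery_voltage, 2.0)
print(g_res, g_bat)
```

The midpoint rule is exact, up to rounding, for constant and linear integrands, which covers both element types used in this network.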
Linear resistor: Voltage, $v(I) = IR$; Content function, $G(I) = I^2 R / 2$
Battery: Voltage, $v(I) = -E$; Content function, $G(I) = -EI$

Network Flow Model
[Figure: network diagram with nodes 1 to 6 and arcs labeled with the cost terms $-100 I_1$, $5 I_2^2$, $5 I_3^2$, $10 I_4^2$, $10 I_5^2$, $0 \cdot I_6$, $0 \cdot I_7$]
Minimize $Z = -100 I_1 + 5 I_2^2 + 5 I_3^2 + 10 I_4^2 + 10 I_5^2$
subject to
$I_1 - I_2 = 0$
$I_2 - I_3 - I_4 = 0$
$I_4 - I_5 = 0$
$I_5 + I_7 = 0$
$I_3 + I_6 - I_7 = 0$
$-I_1 - I_6 = 0$
Solution: $I_1 = I_2 = 50/9$, $I_3 = 40/9$, $I_4 = I_5 = 10/9$, $I_6 = -50/9$, $I_7 = -10/9$

NLP Problem Classes
• Constrained vs. unconstrained
• Convex programming problem
• Quadratic programming problem: $f(x) = a + c^T x + \tfrac{1}{2} x^T Q x, \; Q \succeq 0$
• Separable programming problem: $f(x) = \sum_{j=1}^{n} f_j(x_j)$
• Geometric programming problem: $g(x) = \sum_{t=1}^{T} c_t P_t(x), \; P_t(x) = x_1^{a_{t1}} \cdots x_n^{a_{tn}}, \; x_j > 0$
• Equality constrained problems

What You Should Know About Nonlinear Programming
• How to identify a convex program.
• How to write out the first-order optimality conditions.
• The difference between a local and global solution.
• How to classify problems.
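The currents reported for the direct current network can be checked with exact rational arithmetic. The Python sketch below evaluates the flow-balance residuals and the gradient of Z after the constraints are eliminated. It takes the balance between the $I_4$ and $I_5$ branches to be $I_4 - I_5 = 0$, which is the form the reported currents satisfy. The reduction to two free currents, a = I1 and b = I3, is an illustrative device and not part of the lecture.

```python
from fractions import Fraction as F

# Currents reported in the lecture, keyed by branch index 1..7.
I = {1: F(50, 9), 2: F(50, 9), 3: F(40, 9), 4: F(10, 9),
     5: F(10, 9), 6: F(-50, 9), 7: F(-10, 9)}


def total_content(I):
    """Objective Z: battery term -100*I1 plus resistor terms R*I^2/2."""
    return (-100 * I[1] + 5 * I[2] ** 2 + 5 * I[3] ** 2
            + 10 * I[4] ** 2 + 10 * I[5] ** 2)


# Flow-balance residuals, one per node; each should be zero.
# Assumption: the third equation is I4 - I5 = 0.
balance = [
    I[1] - I[2],
    I[2] - I[3] - I[4],
    I[4] - I[5],
    I[5] + I[7],
    I[3] + I[6] - I[7],
    -I[1] - I[6],
]

# Eliminating the constraints leaves two free currents, a = I1 and b = I3:
#   Z(a, b) = -100a + 5a^2 + 5b^2 + 20(a - b)^2
a, b = I[1], I[3]
grad = (-100 + 10 * a + 40 * (a - b),   # dZ/da
        10 * b - 40 * (a - b))          # dZ/db

print(balance, grad, total_content(I))
```

Because Z is a convex quadratic and the constraints are linear, a feasible point with zero reduced gradient is a global minimum.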