Constrained optimization: indirect methods
Jussi Hakanen
Post-doctoral researcher
jussi.hakanen@jyu.fi
spring 2014, TIES483 Nonlinear optimization

On constrained optimization
- We have seen how to characterize optimal solutions of constrained problems:
  – the KKT optimality conditions include the balance of forces (−∇f(x*), ∇g_i(x*) for i ∈ I, where I is the set of active inequality constraints, and ∇h_j(x*)) and the complementarity conditions (μ_i g_i(x*) = 0 for all i)
  – regularity of x* needs to be assumed
- Now we are interested in how to find such solutions.

Methods for constrained optimization
- Many methods utilize knowledge about the constraints:
  – linear inequalities or linear equalities
  – nonlinear inequalities or equalities
- For example, if a linear constraint is active at some point, you know that it remains active when you take steps along the direction of the constraint.
- For nonlinear constraints there is no such direction.
- Methods for constrained optimization can be characterized by how they treat the constraints.

Classification of the methods
- Indirect methods: the constrained problem is converted into a sequence of unconstrained problems whose solutions approach the solution of the constrained problem; the intermediate solutions need not be feasible.
- Direct methods: the constraints are taken into account explicitly; the intermediate solutions are feasible.

Transforming the optimization problem
- The constraints of the problem can be transformed if needed.
- g_i(x) ≤ 0 ⟺ g_i(x) + y_i² = 0, where y_i is a slack variable; the constraint is active if y_i = 0.
  – By adding y_i² there is no need to require y_i ≥ 0.
  – If g_i(x) is linear, linearity is preserved by g_i(x) + y_i = 0, y_i ≥ 0.
- g_i(x) ≥ 0 ⟺ −g_i(x) ≤ 0
- h_j(x) = 0 ⟺ h_j(x) ≤ 0 and −h_j(x) ≤ 0

Examples of indirect methods
- Penalty function methods
- Lagrangian methods

Penalty function methods
- Include the constraints in the objective function with the help of penalty functions that penalize constraint violations, or even penalize approaching the boundary of the feasible region S.
- Different types:
  – penalty function: penalizes constraint violations
  – barrier function: prevents leaving the feasible region
  – exact penalty function
- The resulting unconstrained problems can be solved with the methods presented earlier in the course.

Penalty function methods
- Generate a sequence of points that approaches the feasible region from the outside.
- The constrained problem is converted into

    min_{x ∈ ℝⁿ} f(x) + r α(x),

  where α(x) is a penalty function and r is a penalty parameter.
- Requirements: α(x) ≥ 0 for all x ∈ ℝⁿ, and α(x) = 0 if and only if x ∈ S.

On convergence
- When r → ∞, the solutions x^r of the penalty function problems converge to a constrained minimizer (x^r → x* and r α(x^r) → 0), provided that
  – all the functions are continuous, and
  – for each r the penalty function problem has a solution, and {x^r} belongs to a compact subset of ℝⁿ.

Examples of penalty functions
- Can you give an example of a penalty function α(x)?
- For equality constraints h_j(x) = 0:

    α(x) = Σ_{j=1}^ℓ h_j(x)²   or   α(x) = Σ_{j=1}^ℓ |h_j(x)|^p, p ≥ 2

- For inequality constraints g_i(x) ≤ 0:

    α(x) = Σ_{i=1}^m max[0, g_i(x)]   or   α(x) = Σ_{i=1}^m max[0, g_i(x)]^p, p ≥ 2
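To make these penalty functions concrete, below is a minimal Python sketch (NumPy assumed; the names alpha and penalized are illustrative, not from the slides) of the quadratic penalty, using as data the equality-constrained example that appears later in these slides. Minimizing penalized for an increasing sequence of r values is exactly the loop formalized in the algorithm that follows.

```python
import numpy as np

def alpha(x, gs=(), hs=()):
    """Quadratic penalty: alpha(x) >= 0 everywhere, and alpha(x) = 0
    exactly when all g_i(x) <= 0 and all h_j(x) = 0 hold."""
    ineq = sum(max(0.0, g(x))**2 for g in gs)   # max[0, g_i(x)]^p with p = 2
    eq = sum(h(x)**2 for h in hs)               # h_j(x)^2
    return ineq + eq

# Illustrative problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0

def penalized(x, r):
    # The unconstrained surrogate f(x) + r * alpha(x) of the slides.
    return f(x) + r * alpha(x, hs=(h,))

print(penalized(np.array([0.0, 0.0]), r=10.0))  # 0 + 10 * 1^2 = 10.0
```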
How to choose r?
- r should be large enough for the solutions to end up close enough to the feasible region.
- If r is too large, there can be numerical problems in solving the penalty problems.
- For large values of r the emphasis is on finding feasible solutions, so the solution can be feasible but far from the optimum.
- Typically r is updated iteratively.
- Different parameters can be used for different constraints (e.g. a parameter r_i for each g_i and ρ_j for each h_j).
  – For the sake of simplicity, the same parameter is used here for all the constraints.

Algorithm (penalty function method)
1) Choose a final tolerance ε > 0 and a starting point x¹. Choose r¹ > 0 (not too large) and set h = 1.
2) Solve

    min_{x ∈ ℝⁿ} f(x) + r^h α(x)

   with some method for unconstrained problems, using x^h as the starting point. Let the solution be x^{h+1} = x(r^h).
3) Test optimality: if r^h α(x^{h+1}) < ε, stop; the solution x^{h+1} is close enough to the optimum. Otherwise set r^{h+1} > r^h (e.g. r^{h+1} = c r^h, where c can be initialized to be e.g. 10). Set h = h + 1 and go to 2).

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min x  s.t.  −x + 2 ≤ 0

Let α(x) = max[0, −x + 2]². Then α(x) = 0 if x ≥ 2, and α(x) = (−x + 2)² if x < 2.
The minimum of f + rα is at 2 − 1/(2r), which tends to the constrained minimizer x* = 2 as r → ∞.

Barrier function method
- Prevents leaving the feasible region.
- Suitable only for problems with inequality constraints:
  – the set {x | g_i(x) < 0 for all i} must not be empty.
- The problem to be solved is

    min θ(r)  s.t.  r ≥ 0,   where θ(r) = inf{ f(x) + r β(x) | g_i(x) < 0 for all i }.

- β is a barrier function: β(x) ≥ 0 when g_i(x) < 0 for all i, and β(x) → ∞ when x approaches the boundary of S.
- The constraints g_i(x) < 0 can be omitted since β → ∞ on the boundary of S.

On convergence
- Denote θ(r) = f(x^r) + r β(x^r).
- Under some assumptions, the solutions x^r of the barrier problems converge to a constrained minimizer (x^r → x* and r β(x^r) → 0) when r → 0+, provided that
  – all functions are continuous, and
  – {x | g_i(x) < 0 for all i} ≠ ∅.

Properties of barrier functions
- Nonnegative and continuous in {x | g_i(x) < 0 for all i}.
- Approach ∞ when the boundary of the feasible region is approached from the inside.
- Ideally, β = 0 in {x | g_i(x) < 0 for all i} and β = ∞ on the boundary:
  – this guarantees staying in the feasible region,
  – but this kind of discontinuity causes problems for any numerical method.
- Examples of barrier functions:

    β(x) = −Σ_{i=1}^m 1/g_i(x)   or   β(x) = −Σ_{i=1}^m ln(min[1, −g_i(x)])

Algorithm (barrier function method)
1) Choose a final tolerance ε > 0 and a starting point x¹ such that g_i(x¹) < 0 for all i. Choose r¹ > 0, not too small (and a parameter 0 < c < 1 for reducing r). Set h = 1.
2) Solve

    min f(x) + r^h β(x)  s.t.  g_i(x) < 0 for all i

   using x^h as the starting point. Let the solution be x^{h+1}.
3) Test optimality: if r^h β(x^{h+1}) < ε, stop; the solution x^{h+1} is close enough to the optimum. Otherwise set r^{h+1} < r^h (e.g. r^{h+1} = c r^h). Set h = h + 1 and go to 2).

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min x  s.t.  −x + 1 ≤ 0

Let β(x) = −1/(−x + 1) = 1/(x − 1) when x ≠ 1.
The minimum of f(x) + r β(x) = x + r(x − 1)⁻¹ is at 1 + √r, which tends to the constrained minimizer x* = 1 as r → 0+.
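A numerical companion to this barrier example: a minimal sketch of the algorithm above, assuming SciPy is available. Nelder-Mead is used for the inner solves because it tolerates the infinite values with which the infeasible region is guarded here; all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]                # objective of the example: min x
g = lambda x: -x[0] + 1.0         # constraint -x + 1 <= 0, i.e. x >= 1
beta = lambda x: -1.0 / g(x)      # inverse barrier, defined where g(x) < 0

def barrier_objective(x, r):
    if g(x) >= 0.0:               # guard: barrier undefined outside the interior
        return np.inf
    return f(x) + r * beta(x)

x, r, c, eps = np.array([2.0]), 1.0, 0.1, 1e-6
for _ in range(100):
    x = minimize(barrier_objective, x, args=(r,), method="Nelder-Mead").x
    if r * beta(x) < eps:         # stopping test from step 3
        break
    r *= c                        # r^{h+1} = c * r^h with 0 < c < 1
print(x)  # approaches x* = 1 from inside; analytically x(r) = 1 + sqrt(r)
```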
Summary: penalty and barrier function methods
- Penalty and barrier functions are usually differentiable.
- The minimum is obtained in the limit:
  – penalty function: r^h → ∞
  – barrier function: r^h → 0
- Choosing the sequence r^h is essential for convergence:
  – if r^h → ∞ or r^h → 0 too slowly, a large number of unconstrained problems needs to be solved;
  – if r^h → ∞ or r^h → 0 too fast, the solutions of successive unconstrained problems are far from each other and the solution time increases.

Exact penalty function
- The idea is to have a method where the solution can be found in a small number of iterations.
- Suitable for both equality and inequality constraints.
- An exact penalty function problem is, e.g., of the form

    min_{x ∈ ℝⁿ} f(x) + r( Σ_{i=1}^m max[0, g_i(x)] + Σ_{j=1}^ℓ |h_j(x)| )

Exact penalty function method
- Theorem: Consider a point x̄ where the necessary KKT conditions hold, and let the corresponding Lagrange multipliers be μ̄ and λ̄. Assume that the objective and the inequality constraint functions are convex and that the equality constraint functions are affine. Then x̄ is a solution of the exact penalty function problem whenever

    r ≥ max[ μ̄_i, i = 1, …, m, |λ̄_j|, j = 1, …, ℓ ].

- Thus the solution can be obtained with a finite value of the penalty parameter r.
- The algorithm is similar to the penalty function method, except that r^h is increased only if necessary,
  – e.g. when the feasible region is not approached fast enough.

Properties of the exact penalty function
- Not differentiable at points x where g_i(x) = 0 or h_j(x) = 0:
  – gradient-based methods are therefore not suitable.
- If r and the starting point could be chosen appropriately, only one minimization would be required in principle.
  – If r is too large and the starting point is not close enough to the optimum, minimizing the exact penalty function problem can become difficult.

Example

    min f(x) = x₁² + x₂²  s.t.  x₁ + x₂ − 1 = 0

The optimal solution is x* = (1/2, 1/2)^T with λ* = −2x₁* = −2x₂* = −1.
The exact penalty function problem is

    min_{x ∈ ℝ²} x₁² + x₂² + r|x₁ + x₂ − 1|.

Its solution is x* = (r/2, r/2)^T when 0 ≤ r < 1, and x* = (1/2, 1/2)^T when r ≥ 1
  – (obtained by using the KKT conditions of an equivalent differentiable problem in which the absolute value term is replaced with a new variable and two inequality constraints).
Thus the solution can be found with r ≥ 1 (= |λ*|).

Example: barrier function (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min f(x) = x₁x₂²  s.t.  x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)^T; the constraint is active at x*.
[Figure: (a) level curves of f(x) and the boundary of S; the logarithmic barrier function with (b) r = 0.2, (c) r = 0.001.]

Example: penalty function (from Miettinen, 2007)
Same problem as above.
[Figure: the quadratic penalty function with (a) r = 1, (b) r = 100.]

Example: exact penalty function (from Miettinen, 2007)
Same problem as above, with μ* = 0.8165.
[Figure: the exact penalty function with (a) r = 1.2, (b) r = 5, (c) r = 100.]
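To see the finite-r behaviour in the analytic exact penalty example above numerically, here is a minimal sketch (SciPy assumed; Nelder-Mead is chosen because the absolute-value term is nondifferentiable where the constraint is active; names are illustrative).

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0

def exact_penalty(x, r):
    # l1-type exact penalty: nonsmooth where h(x) = 0
    return f(x) + r * abs(h(x))

for r in (0.5, 1.5):
    res = minimize(exact_penalty, np.array([0.0, 0.0]), args=(r,),
                   method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-10})
    print(r, np.round(res.x, 4))
# expected: r = 0.5 -> (0.25, 0.25);  r = 1.5 >= |lambda*| = 1 -> (0.5, 0.5)
```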
Lagrangian function
- Consider the problem

    min f(x)  s.t.  h_j(x) = 0, j = 1, …, ℓ

- Lagrangian function: L(x, λ) = f(x) + Σ_{j=1}^ℓ λ_j h_j(x)
- KKT conditions:

    ∇f(x) + Σ_{j=1}^ℓ λ_j ∇h_j(x) = 0
    h_j(x) = 0, j = 1, …, ℓ

- Let x* be a minimizer and λ* the corresponding Lagrange multiplier vector.

Properties of the Lagrangian
- KKT conditions: x* is a critical point of the Lagrangian function,
  – but x* is not necessarily a minimizer of L(x, λ*).
- Thus, minimizing the Lagrangian function does not necessarily give a minimum of f(x):
  – the Hessian ∇²_xx L(x*, λ*) may be indefinite → a saddle point.
- Improve the Lagrangian function!

Augmented Lagrangian function
- Augmented Lagrangian function:

    L_A(x, λ, r) = f(x) + Σ_{j=1}^ℓ λ_j h_j(x) + (1/2) r Σ_{j=1}^ℓ h_j(x)²,  r > 0

  – Lagrangian function + quadratic penalty function.
- The point (x*, λ*) is a critical point of the augmented Lagrangian:
  – ∇_x L_A(x*, λ*, r) = 0 and (1/2) r Σ_{j=1}^ℓ h_j(x*)² = 0.
- Hessian:

    ∇²_xx L_A(x*, λ*, r) = ∇²_xx L(x*, λ*) + r ∇h(x*)^T ∇h(x*)

- It can be shown that for r > r̄, for some finite threshold r̄, ∇²_xx L_A(x*, λ*, r) is positive definite → x* is a local minimizer of L_A(x, λ*, r).
- Need to know λ*.

Properties of L_A(x, λ, r)
- Differentiable if the original functions are.
- x* is a minimizer of L_A(x, λ*, r) for a finite r.
- Lagrangian function + quadratic penalty function.

Algorithm (augmented Lagrangian method)
1) Choose a final tolerance ε > 0. Choose x¹, λ_j¹ (j = 1, …, ℓ) and r. Set h = 1.
2) Test optimality: if the optimality conditions are satisfied, stop. The solution is x^h.
3) Solve (with a suitable method)

    min_{x ∈ ℝⁿ} L_A(x, λ^h, r)

   using x^h as the starting point. Let the solution be x^{h+1}.
4) Update the Lagrange multipliers: e.g. λ^{h+1} = λ^h + r h(x^{h+1}).
5) Increase r if necessary: e.g. if ‖h(x^h)‖ − ‖h(x^{h+1})‖ < ε.
6) Set h = h + 1 and go to 2).
Note: x^h → x* only if λ^h → λ*.

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min f(x) = x₁x₂²  s.t.  x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)^T; the constraint is active at x*.
[Figure: the Lagrangian function has a saddle point at x*.]

Example (cont.)
[Figure: the augmented Lagrangian function with (a) r = 0.075, (b) r = 0.2, (c) r = 100. From Miettinen, 2007.]

Example (cont.)
[Figure: the augmented Lagrangian function with μ* = 0.8165 and r = 0.2: (a) μ = 0.5, (b) μ = 0.9, (c) μ = 1.0. From Miettinen, 2007.]

Topic of the lectures next week
- Mon, Feb 10th: Constrained optimization: gradient projection, active set method
- Wed, Feb 12th: Constrained optimization, SQP method & Matlab
Study this before the lecture! Questions to be considered:
- What is the basic idea of gradient projection?
- What is the basic idea of active set methods?
- What is the basic idea of Sequential Quadratic Programming (SQP)?
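To close, a minimal sketch of the augmented Lagrangian algorithm above, applied to the earlier equality-constrained example min x₁² + x₂² s.t. x₁ + x₂ − 1 = 0 (SciPy assumed; r is kept fixed and the optimality test is simplified to feasibility; names are illustrative).

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0     # single equality constraint

def L_A(x, lam, r):
    # Augmented Lagrangian: Lagrangian plus quadratic penalty term.
    return f(x) + lam * h(x) + 0.5 * r * h(x)**2

x, lam, r = np.array([0.0, 0.0]), 0.0, 10.0
for _ in range(50):
    # Step 3: the inner problem is smooth, so a gradient-based method applies.
    x = minimize(L_A, x, args=(lam, r), method="BFGS").x
    if abs(h(x)) < 1e-10:           # simplified optimality test
        break
    lam = lam + r * h(x)            # step 4: multiplier update
print(np.round(x, 4), round(lam, 4))  # expect x -> (0.5, 0.5), lam -> lambda* = -1
```

Unlike the pure penalty method, r stays fixed here: feasibility is driven by the multiplier update, which is why a finite r suffices, and the printed lam converging to λ* illustrates the note that x^h → x* only if λ^h → λ*.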