Constrained optimization: indirect methods
Jussi Hakanen
Post-doctoral researcher
jussi.hakanen@jyu.fi
spring 2014, TIES483 Nonlinear optimization

On constrained optimization
- We have seen how to characterize optimal solutions of constrained problems:
  – the KKT optimality conditions include the balance of forces (−∇f(x*), ∇g_i(x*) for i ∈ I, where I is the set of active inequality constraints, and ∇h_j(x*)) and the complementarity conditions (μ_i g_i(x*) = 0 for all i)
  – regularity of x* needs to be assumed
- Now we are interested in how to find such solutions.

Methods for constrained optimization
- Many methods utilize knowledge about the constraints:
  – linear inequalities or linear equalities
  – nonlinear inequalities or equalities
- For example, if a linear constraint is active at some point, you know that it remains active when you take steps along the direction of the constraint.
- For nonlinear constraints there is no such direction.
- Methods for constrained optimization can be characterized by how they treat the constraints.

Classification of the methods
- Indirect methods: the constrained problem is converted into a sequence of unconstrained problems whose solutions approach the solution of the constrained problem; the intermediate solutions need not be feasible.
- Direct methods: the constraints are taken into account explicitly; the intermediate solutions are feasible.

Transforming the optimization problem
- The constraints of the problem can be transformed if needed.
- g_i(x) ≤ 0 ⟺ g_i(x) + y_i² = 0, where y_i is a slack variable; the constraint is active if y_i = 0.
  – By adding y_i² there is no need to require y_i ≥ 0.
  – If g_i(x) is linear, linearity is preserved by g_i(x) + y_i = 0, y_i ≥ 0.
- g_i(x) ≥ 0 ⟺ −g_i(x) ≤ 0
- h_j(x) = 0 ⟺ h_j(x) ≤ 0 and −h_j(x) ≤ 0

Examples of indirect methods
- Penalty function methods
- Lagrangian methods

Penalty function methods
- Include the constraints in the objective function with the help of penalty functions that penalize constraint violations, or even penalize approaching the boundary of the feasible region S.
- Different types:
  – penalty function: penalizes constraint violations
  – barrier function: prevents leaving the feasible region
  – exact penalty function
- The resulting unconstrained problems can be solved with the methods presented earlier in the course.

Penalty function methods
- Generate a sequence of points that approaches the feasible region from the outside.
- The constrained problem is converted into

    min_{x ∈ ℝⁿ} f(x) + r α(x),

  where α(x) is a penalty function and r is a penalty parameter.
- Requirements: α(x) ≥ 0 for all x ∈ ℝⁿ, and α(x) = 0 if and only if x ∈ S.

On convergence
- When r → ∞, the solutions x^r of the penalty function problems converge to a constrained minimizer (x^r → x* and r α(x^r) → 0), provided that
  – all the functions are continuous, and
  – for each r the penalty function problem has a solution, and {x^r} belongs to a compact subset of ℝⁿ.

Examples of penalty functions
- Can you give an example of a penalty function α(x)?
- For equality constraints h_j(x) = 0:

    α(x) = Σ_{j=1}^ℓ h_j(x)²   or   α(x) = Σ_{j=1}^ℓ |h_j(x)|^p, p ≥ 2

- For inequality constraints g_i(x) ≤ 0:

    α(x) = Σ_{i=1}^m max[0, g_i(x)]   or   α(x) = Σ_{i=1}^m max[0, g_i(x)]^p, p ≥ 2
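To make these penalty functions concrete, below is a minimal Python sketch (NumPy assumed; the names alpha and penalized are illustrative, not from the slides) of the quadratic penalty, using as data the equality-constrained example that appears later in these slides. Minimizing penalized for an increasing sequence of r values is exactly the loop formalized in the algorithm that follows.

```python
import numpy as np

def alpha(x, gs=(), hs=()):
    """Quadratic penalty: alpha(x) >= 0 everywhere, and alpha(x) = 0
    exactly when all g_i(x) <= 0 and all h_j(x) = 0 hold."""
    ineq = sum(max(0.0, g(x))**2 for g in gs)   # max[0, g_i(x)]^p with p = 2
    eq = sum(h(x)**2 for h in hs)               # h_j(x)^2
    return ineq + eq

# Illustrative problem: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0

def penalized(x, r):
    # The unconstrained surrogate f(x) + r * alpha(x) of the slides.
    return f(x) + r * alpha(x, hs=(h,))

print(penalized(np.array([0.0, 0.0]), r=10.0))  # 0 + 10 * 1^2 = 10.0
```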
How to choose r?
- r should be large enough for the solutions to end up close enough to the feasible region.
- If r is too large, there can be numerical problems in solving the penalty problems.
- For large values of r the emphasis is on finding feasible solutions, so the solution can be feasible but far from the optimum.
- Typically r is updated iteratively.
- Different parameters can be used for different constraints (e.g. a parameter r_i for each g_i and ρ_j for each h_j).
  – For the sake of simplicity, the same parameter is used here for all the constraints.

Algorithm (penalty function method)
1) Choose a final tolerance ε > 0 and a starting point x¹. Choose r¹ > 0 (not too large) and set h = 1.
2) Solve

    min_{x ∈ ℝⁿ} f(x) + r^h α(x)

   with some method for unconstrained problems, using x^h as the starting point. Let the solution be x^{h+1} = x(r^h).
3) Test optimality: if r^h α(x^{h+1}) < ε, stop; the solution x^{h+1} is close enough to the optimum. Otherwise set r^{h+1} > r^h (e.g. r^{h+1} = c r^h, where c can be initialized to be e.g. 10). Set h = h + 1 and go to 2).

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min x  s.t.  −x + 2 ≤ 0

Let α(x) = max[0, −x + 2]². Then α(x) = 0 if x ≥ 2, and α(x) = (−x + 2)² if x < 2.
The minimum of f + rα is at 2 − 1/(2r), which tends to the constrained minimizer x* = 2 as r → ∞.

Barrier function method
- Prevents leaving the feasible region.
- Suitable only for problems with inequality constraints:
  – the set {x | g_i(x) < 0 for all i} must not be empty.
- The problem to be solved is

    min θ(r)  s.t.  r ≥ 0,   where θ(r) = inf{ f(x) + r β(x) | g_i(x) < 0 for all i }.

- β is a barrier function: β(x) ≥ 0 when g_i(x) < 0 for all i, and β(x) → ∞ when x approaches the boundary of S.
- The constraints g_i(x) < 0 can be omitted since β → ∞ on the boundary of S.

On convergence
- Denote θ(r) = f(x^r) + r β(x^r).
- Under some assumptions, the solutions x^r of the barrier problems converge to a constrained minimizer (x^r → x* and r β(x^r) → 0) when r → 0+, provided that
  – all functions are continuous, and
  – {x | g_i(x) < 0 for all i} ≠ ∅.

Properties of barrier functions
- Nonnegative and continuous in {x | g_i(x) < 0 for all i}.
- Approach ∞ when the boundary of the feasible region is approached from the inside.
- Ideally, β = 0 in {x | g_i(x) < 0 for all i} and β = ∞ on the boundary:
  – this guarantees staying in the feasible region,
  – but this kind of discontinuity causes problems for any numerical method.
- Examples of barrier functions:

    β(x) = −Σ_{i=1}^m 1/g_i(x)   or   β(x) = −Σ_{i=1}^m ln(min[1, −g_i(x)])

Algorithm (barrier function method)
1) Choose a final tolerance ε > 0 and a starting point x¹ such that g_i(x¹) < 0 for all i. Choose r¹ > 0, not too small (and a parameter 0 < c < 1 for reducing r). Set h = 1.
2) Solve

    min f(x) + r^h β(x)  s.t.  g_i(x) < 0 for all i

   using x^h as the starting point. Let the solution be x^{h+1}.
3) Test optimality: if r^h β(x^{h+1}) < ε, stop; the solution x^{h+1} is close enough to the optimum. Otherwise set r^{h+1} < r^h (e.g. r^{h+1} = c r^h). Set h = h + 1 and go to 2).

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min x  s.t.  −x + 1 ≤ 0

Let β(x) = −1/(−x + 1) = 1/(x − 1) when x ≠ 1.
The minimum of f(x) + r β(x) = x + r(x − 1)⁻¹ is at 1 + √r, which tends to the constrained minimizer x* = 1 as r → 0+.
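A numerical companion to this barrier example: a minimal sketch of the algorithm above, assuming SciPy is available. Nelder-Mead is used for the inner solves because it tolerates the infinite values with which the infeasible region is guarded here; all names are illustrative.

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]                # objective of the example: min x
g = lambda x: -x[0] + 1.0         # constraint -x + 1 <= 0, i.e. x >= 1
beta = lambda x: -1.0 / g(x)      # inverse barrier, defined where g(x) < 0

def barrier_objective(x, r):
    if g(x) >= 0.0:               # guard: barrier undefined outside the interior
        return np.inf
    return f(x) + r * beta(x)

x, r, c, eps = np.array([2.0]), 1.0, 0.1, 1e-6
for _ in range(100):
    x = minimize(barrier_objective, x, args=(r,), method="Nelder-Mead").x
    if r * beta(x) < eps:         # stopping test from step 3
        break
    r *= c                        # r^{h+1} = c * r^h with 0 < c < 1
print(x)  # approaches x* = 1 from inside; analytically x(r) = 1 + sqrt(r)
```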
Summary: penalty and barrier function methods
- Penalty and barrier functions are usually differentiable.
- The minimum is obtained in the limit:
  – penalty function: r^h → ∞
  – barrier function: r^h → 0
- Choosing the sequence r^h is essential for convergence:
  – if r^h → ∞ or r^h → 0 too slowly, a large number of unconstrained problems needs to be solved;
  – if r^h → ∞ or r^h → 0 too fast, the solutions of successive unconstrained problems are far from each other and the solution time increases.

Exact penalty function
- The idea is to have a method where the solution can be found in a small number of iterations.
- Suitable for both equality and inequality constraints.
- An exact penalty function problem is, e.g., of the form

    min_{x ∈ ℝⁿ} f(x) + r( Σ_{i=1}^m max[0, g_i(x)] + Σ_{j=1}^ℓ |h_j(x)| )

Exact penalty function method
- Theorem: Consider a point x̄ where the necessary KKT conditions hold, and let the corresponding Lagrange multipliers be μ̄ and λ̄. Assume that the objective and the inequality constraint functions are convex and that the equality constraint functions are affine. Then x̄ is a solution of the exact penalty function problem whenever

    r ≥ max[ μ̄_i, i = 1, …, m, |λ̄_j|, j = 1, …, ℓ ].

- Thus the solution can be obtained with a finite value of the penalty parameter r.
- The algorithm is similar to the penalty function method, except that r^h is increased only if necessary,
  – e.g. when the feasible region is not approached fast enough.

Properties of the exact penalty function
- Not differentiable at points x where g_i(x) = 0 or h_j(x) = 0:
  – gradient-based methods are therefore not suitable.
- If r and the starting point could be chosen appropriately, only one minimization would be required in principle.
  – If r is too large and the starting point is not close enough to the optimum, minimizing the exact penalty function problem can become difficult.

Example

    min f(x) = x₁² + x₂²  s.t.  x₁ + x₂ − 1 = 0

The optimal solution is x* = (1/2, 1/2)^T with λ* = −2x₁* = −2x₂* = −1.
The exact penalty function problem is

    min_{x ∈ ℝ²} x₁² + x₂² + r|x₁ + x₂ − 1|.

Its solution is x* = (r/2, r/2)^T when 0 ≤ r < 1, and x* = (1/2, 1/2)^T when r ≥ 1
  – (obtained by using the KKT conditions of an equivalent differentiable problem in which the absolute value term is replaced with a new variable and two inequality constraints).
Thus the solution can be found with r ≥ 1 (= |λ*|).

Example: barrier function (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min f(x) = x₁x₂²  s.t.  x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)^T; the constraint is active at x*.
[Figure: (a) level curves of f(x) and the boundary of S; the logarithmic barrier function with (b) r = 0.2, (c) r = 0.001.]

Example: penalty function (from Miettinen, 2007)
Same problem as above.
[Figure: the quadratic penalty function with (a) r = 1, (b) r = 100.]

Example: exact penalty function (from Miettinen, 2007)
Same problem as above, with μ* = 0.8165.
[Figure: the exact penalty function with (a) r = 1.2, (b) r = 5, (c) r = 100.]
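To see the finite-r behaviour in the analytic exact penalty example above numerically, here is a minimal sketch (SciPy assumed; Nelder-Mead is chosen because the absolute-value term is nondifferentiable where the constraint is active; names are illustrative).

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0

def exact_penalty(x, r):
    # l1-type exact penalty: nonsmooth where h(x) = 0
    return f(x) + r * abs(h(x))

for r in (0.5, 1.5):
    res = minimize(exact_penalty, np.array([0.0, 0.0]), args=(r,),
                   method="Nelder-Mead",
                   options={"xatol": 1e-10, "fatol": 1e-10})
    print(r, np.round(res.x, 4))
# expected: r = 0.5 -> (0.25, 0.25);  r = 1.5 >= |lambda*| = 1 -> (0.5, 0.5)
```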
Lagrangian function
- Consider the problem

    min f(x)  s.t.  h_j(x) = 0, j = 1, …, ℓ

- Lagrangian function: L(x, λ) = f(x) + Σ_{j=1}^ℓ λ_j h_j(x)
- KKT conditions:

    ∇f(x) + Σ_{j=1}^ℓ λ_j ∇h_j(x) = 0
    h_j(x) = 0, j = 1, …, ℓ

- Let x* be a minimizer and λ* the corresponding Lagrange multiplier vector.

Properties of the Lagrangian
- KKT conditions: x* is a critical point of the Lagrangian function,
  – but x* is not necessarily a minimizer of L(x, λ*).
- Thus, minimizing the Lagrangian function does not necessarily give a minimum of f(x):
  – the Hessian ∇²_xx L(x*, λ*) may be indefinite → a saddle point.
- Improve the Lagrangian function!

Augmented Lagrangian function
- Augmented Lagrangian function:

    L_A(x, λ, r) = f(x) + Σ_{j=1}^ℓ λ_j h_j(x) + (1/2) r Σ_{j=1}^ℓ h_j(x)²,  r > 0

  – Lagrangian function + quadratic penalty function.
- The point (x*, λ*) is a critical point of the augmented Lagrangian:
  – ∇_x L_A(x*, λ*, r) = 0 and (1/2) r Σ_{j=1}^ℓ h_j(x*)² = 0.
- Hessian:

    ∇²_xx L_A(x*, λ*, r) = ∇²_xx L(x*, λ*) + r ∇h(x*)^T ∇h(x*)

- It can be shown that for r > r̄, for some finite threshold r̄, ∇²_xx L_A(x*, λ*, r) is positive definite → x* is a local minimizer of L_A(x, λ*, r).
- Need to know λ*.

Properties of L_A(x, λ, r)
- Differentiable if the original functions are.
- x* is a minimizer of L_A(x, λ*, r) for a finite r.
- Lagrangian function + quadratic penalty function.

Algorithm (augmented Lagrangian method)
1) Choose a final tolerance ε > 0. Choose x¹, λ_j¹ (j = 1, …, ℓ) and r. Set h = 1.
2) Test optimality: if the optimality conditions are satisfied, stop. The solution is x^h.
3) Solve (with a suitable method)

    min_{x ∈ ℝⁿ} L_A(x, λ^h, r)

   using x^h as the starting point. Let the solution be x^{h+1}.
4) Update the Lagrange multipliers: e.g. λ^{h+1} = λ^h + r h(x^{h+1}).
5) Increase r if necessary: e.g. if ‖h(x^h)‖ − ‖h(x^{h+1})‖ < ε.
6) Set h = h + 1 and go to 2).
Note: x^h → x* only if λ^h → λ*.

Example (from Miettinen: Nonlinear optimization, 2007, in Finnish)

    min f(x) = x₁x₂²  s.t.  x₁² + x₂² − 2 ≤ 0

x* = (−0.8165, −1.1547)^T; the constraint is active at x*.
[Figure: the Lagrangian function has a saddle point at x*.]

Example (cont.)
[Figure: the augmented Lagrangian function with (a) r = 0.075, (b) r = 0.2, (c) r = 100. From Miettinen, 2007.]

Example (cont.)
[Figure: the augmented Lagrangian function with μ* = 0.8165 and r = 0.2: (a) μ = 0.5, (b) μ = 0.9, (c) μ = 1.0. From Miettinen, 2007.]

Topic of the lectures next week
- Mon, Feb 10th: Constrained optimization: gradient projection, active set method
- Wed, Feb 12th: Constrained optimization, SQP method & Matlab
Study this before the lecture! Questions to be considered:
- What is the basic idea of gradient projection?
- What is the basic idea of active set methods?
- What is the basic idea of Sequential Quadratic Programming (SQP)?
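To close, a minimal sketch of the augmented Lagrangian algorithm above, applied to the earlier equality-constrained example min x₁² + x₂² s.t. x₁ + x₂ − 1 = 0 (SciPy assumed; r is kept fixed and the optimality test is simplified to feasibility; names are illustrative).

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0     # single equality constraint

def L_A(x, lam, r):
    # Augmented Lagrangian: Lagrangian plus quadratic penalty term.
    return f(x) + lam * h(x) + 0.5 * r * h(x)**2

x, lam, r = np.array([0.0, 0.0]), 0.0, 10.0
for _ in range(50):
    # Step 3: the inner problem is smooth, so a gradient-based method applies.
    x = minimize(L_A, x, args=(lam, r), method="BFGS").x
    if abs(h(x)) < 1e-10:           # simplified optimality test
        break
    lam = lam + r * h(x)            # step 4: multiplier update
print(np.round(x, 4), round(lam, 4))  # expect x -> (0.5, 0.5), lam -> lambda* = -1
```

Unlike the pure penalty method, r stays fixed here: feasibility is driven by the multiplier update, which is why a finite r suffices, and the printed lam converging to λ* illustrates the note that x^h → x* only if λ^h → λ*.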