8. Constrained optimization: indirect methods

Jussi Hakanen
Post-doctoral researcher
jussi.hakanen@jyu.fi
TIES483 Nonlinear optimization, spring 2014
On constrained optimization
We have seen how to characterize optimal solutions of constrained problems:
– the KKT optimality conditions include the balance of forces ($-\nabla f(x^*)$, $\nabla g_i(x^*)$ for $i \in I$, and $\nabla h_j(x^*)$) and the complementarity conditions ($\mu_i g_i(x^*) = 0$ for all $i$)
– regularity of $x^*$ needs to be assumed
Now we are interested in how to find such solutions.
Methods for constrained optimization
Many methods utilize knowledge about the constraints:
– linear inequalities or linear equalities
– nonlinear inequalities or equalities
For example, if a linear constraint is active at some point, then steps taken along the constraint keep it active.
For nonlinear constraints there is, in general, no such direction.
Methods for constrained optimization can be characterized by how they treat the constraints.
Classification of the methods
Indirect methods: the constrained problem is converted into a sequence of unconstrained problems whose solutions approach the solution of the constrained problem; the intermediate solutions need not be feasible.
Direct methods: the constraints are taken into account explicitly; the intermediate solutions are feasible.
Transforming the optimization problem
The constraints of the problem can be transformed if needed:
$g_i(x) \le 0 \iff g_i(x) + y_i^2 = 0$, where $y_i$ is a slack variable; the constraint is active if $y_i = 0$
– by adding $y_i^2$ there is no need to require $y_i \ge 0$
– if $g_i(x)$ is linear, linearity is preserved by using $g_i(x) + y_i = 0$, $y_i \ge 0$ instead
$g_i(x) \ge 0 \iff -g_i(x) \le 0$
$h_i(x) = 0 \iff h_i(x) \le 0$ and $-h_i(x) \le 0$
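As a small worked illustration (my own example, not from the slides), the linear constraint $x_1 + x_2 \ge 1$ can pass through all three transformations:

```latex
\begin{align*}
  x_1 + x_2 \ge 1
  &\iff 1 - x_1 - x_2 \le 0
    && \text{(standard form $g(x) \le 0$)}\\
  &\iff 1 - x_1 - x_2 + y^2 = 0
    && \text{(squared slack, no sign bound on $y$)}\\
  &\iff 1 - x_1 - x_2 + y = 0,\ y \ge 0
    && \text{(linear slack preserves linearity)}
\end{align*}
```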
Examples of indirect methods
Penalty function methods
Lagrangian methods
Penalty function methods
Include the constraints in the objective function with the help of penalty functions that penalize constraint violations, or even penalize approaching the boundary of $S$.
Different types:
– penalty function: penalizes constraint violations
– barrier function: prevents leaving the feasible region
– exact penalty function
The resulting unconstrained problems can be solved with the methods presented earlier in the course.
Penalty function methods
Generate a sequence of points that approach the feasible region from outside.
The constrained problem is converted into
$$\min_{x \in R^n} f(x) + r\,\alpha(x),$$
where $\alpha(x)$ is a penalty function and $r > 0$ is a penalty parameter.
Requirements: $\alpha(x) \ge 0$ for all $x \in R^n$, and $\alpha(x) = 0$ if and only if $x \in S$.
On convergence
When π‘Ÿ → ∞, the solutions π‘₯ π‘Ÿ of penalty
function problems converge to a constrained
minimizer (π‘₯ π‘Ÿ → π‘₯ ∗ and π‘Ÿπ›Ό π‘₯ π‘Ÿ → 0)
– All the functions should be continuous
– For each π‘Ÿ, there should exist a solution for penalty
functions problem and {π‘₯ π‘Ÿ } belongs to a compact
subset of 𝑅𝑛
Examples of penalty functions
Can you give an example of a penalty function $\alpha(x)$?
For equality constraints:
– $h_i(x) = 0 \;\longrightarrow\; \alpha(x) = \sum_{i=1}^{l} h_i(x)^2$ or $\alpha(x) = \sum_{i=1}^{l} |h_i(x)|^p,\ p \ge 2$
For inequality constraints:
– $g_i(x) \le 0 \;\longrightarrow\; \alpha(x) = \sum_{i=1}^{m} \max[0, g_i(x)]$ or $\alpha(x) = \sum_{i=1}^{m} \max[0, g_i(x)]^p,\ p \ge 2$
How to choose $r$?
$r$ should be large enough for the solutions to be close enough to the feasible region.
If $r$ is too large, there can be numerical problems in solving the penalty problems.
For large values of $r$ the emphasis is on finding feasible solutions; thus the solution can be feasible but far from the optimum.
Typically $r$ is updated iteratively.
Different parameters can be used for different constraints (e.g. $g_i \longrightarrow r_i$, $g_j \longrightarrow r_j$)
– for simplicity, the same parameter is used here for all the constraints
Algorithm
1) Choose the final tolerance $\epsilon > 0$ and a starting point $x^1$. Choose $r^1 > 0$ (not too large) and set $h = 1$.
2) Solve
$$\min_{x \in R^n} f(x) + r^h \alpha(x)$$
with some method for unconstrained problems, using $x^h$ as the starting point. Let the solution be $x^{h+1} = x(r^h)$.
3) Test optimality: if $r^h \alpha(x^{h+1}) < \epsilon$, stop; the solution $x^{h+1}$ is close enough to the optimum. Otherwise, set $r^{h+1} > r^h$ (e.g. $r^{h+1} = \kappa r^h$ with e.g. $\kappa = 10$), set $h = h + 1$ and go to 2).
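A minimal Python sketch of this loop (my own illustration, not course code; it assumes NumPy and SciPy are available). It uses the quadratic penalty $\max[0, g(x)]^2$ and BFGS for the unconstrained subproblems, applied to the one-dimensional example on the next slide:

```python
import numpy as np
from scipy.optimize import minimize

# Example problem from the next slide: min x  s.t.  -x + 2 <= 0
f = lambda x: x[0]
g = lambda x: -x[0] + 2.0                       # constraint in the form g(x) <= 0

def alpha(x):
    """Quadratic penalty max[0, g(x)]^2; zero exactly on the feasible region."""
    return max(0.0, g(x)) ** 2

def penalty_method(x, r=1.0, kappa=10.0, eps=1e-6, max_iter=50):
    for _ in range(max_iter):
        # step 2: solve the unconstrained penalty problem from the current point
        x = minimize(lambda z: f(z) + r * alpha(z), x, method="BFGS").x
        # step 3: stop once the weighted penalty term is small enough
        if r * alpha(x) < eps:
            break
        r *= kappa                              # otherwise increase r and repeat
    return x

print(penalty_method(np.array([0.0])))          # approaches x* = 2 from outside
```

Each iterate is $x(r) = 2 - 1/(2r)$: infeasible, but converging to the constrained minimizer as $r$ grows.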
min π‘₯ 𝑠. 𝑑. −π‘₯ + 2 ≤ 0
Let 𝛼 π‘₯ = max [0, (−π‘₯ + 2)] 2
0, 𝑖𝑓 π‘₯ ≥ 2
Then 𝛼 π‘₯ =
−π‘₯ + 2 2 , 𝑖𝑓 π‘₯ < 2
Minimum of 𝑓 + π‘Ÿπ›Ό is at 2
spring 2014
1
−
2π‘Ÿ
TIES483 Nonlinear optimization
From Miettinen: Nonlinear optimization, 2007 (in Finnish)
Example
Barrier function method
Prevents leaving the feasible region.
Suitable only for problems with inequality constraints
– the set $\{x \mid g_i(x) < 0\ \forall i\}$ must not be empty
The problem to be solved is
$$\min_r \Theta(r) \quad \text{s.t.} \quad r \ge 0,$$
where $\Theta(r) = \inf\{f(x) + r\,\beta(x) \mid g_i(x) < 0\ \forall i\}$.
$\beta$ is a barrier function: $\beta(x) \ge 0$ when $g_i(x) < 0\ \forall i$, and $\beta(x) \to \infty$ when $x$ approaches the boundary of $S$.
The constraints $g_i(x) < 0$ can be omitted in practice, since $\beta \to \infty$ on the boundary of $S$.
On convergence
Denote $\Theta(r) = f(x_r) + r\,\beta(x_r)$.
Under some assumptions, the solutions $x_r$ of the barrier problems converge to a constrained minimizer ($x_r \to x^*$ and $r\,\beta(x_r) \to 0$) when $r \to 0^+$:
– all functions should be continuous
– $\{x \mid g_i(x) < 0\ \forall i\} \ne \emptyset$
Properties of barrier functions
Nonnegative and continuous on $\{x \mid g_i(x) < 0\ \forall i\}$.
Approaches $\infty$ when the boundary of the feasible region is approached from inside.
Ideally $\beta = 0$ on $\{x \mid g_i(x) < 0\ \forall i\}$ and $\beta = \infty$ on the boundary
– this would guarantee staying in the feasible region
– but such a discontinuity causes problems for any numerical method
Examples of barrier functions:
– $\beta(x) = -\sum_{i=1}^{m} \frac{1}{g_i(x)}$
– $\beta(x) = -\sum_{i=1}^{m} \ln(\min[1, -g_i(x)])$
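A tiny numeric illustration of these two barrier functions (my own example: a single constraint $g(x) = x - 2 \le 0$, so the interior is $x < 2$), assuming NumPy:

```python
import numpy as np

# Single example constraint g(x) <= 0 with g(x) = x - 2 (interior: x < 2)
g = lambda x: x - 2.0

inv_barrier = lambda x: -1.0 / g(x)                  # -sum 1/g_i, one term here
log_barrier = lambda x: -np.log(min(1.0, -g(x)))     # 0 while -g(x) >= 1

for x in (0.0, 1.5, 1.9, 1.999):
    print(x, inv_barrier(x), log_barrier(x))
# both are nonnegative inside the region and blow up as x approaches 2
```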
Algorithm
1) Choose the final tolerance $\epsilon > 0$ and a starting point $x^1$ such that $g_i(x^1) < 0\ \forall i$. Choose $r^1 > 0$, not too small, and a parameter $0 < \tau < 1$ for reducing $r$. Set $h = 1$.
2) Solve
$$\min_x f(x) + r^h \beta(x) \quad \text{s.t.} \quad g_i(x) < 0\ \forall i$$
using $x^h$ as the starting point. Let the solution be $x^{h+1}$.
3) Test optimality: if $r^h \beta(x^{h+1}) < \epsilon$, stop; the solution $x^{h+1}$ is close enough to the optimum. Otherwise, set $r^{h+1} < r^h$ (e.g. $r^{h+1} = \tau r^h$), set $h = h + 1$ and go to 2).
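A matching Python sketch (again my own illustration, assuming NumPy and SciPy), using the inverse barrier from the previous slide on the example of the next slide. Returning $\infty$ outside the interior is a simple way to keep the unconstrained solver inside $S$, so derivative-free Nelder-Mead is used for the subproblems:

```python
import numpy as np
from scipy.optimize import minimize

# Example problem from the next slide: min x  s.t.  -x + 1 <= 0
f = lambda x: x[0]
g = lambda x: -x[0] + 1.0                       # constraint in the form g(x) <= 0

def beta(x):
    """Inverse barrier -1/g(x); infinite outside the strict interior g(x) < 0."""
    return -1.0 / g(x) if g(x) < 0 else np.inf

def barrier_method(x, r=1.0, tau=0.1, eps=1e-4, max_iter=50):
    for _ in range(max_iter):
        # step 2: the +inf guard in beta keeps the iterates strictly feasible
        x = minimize(lambda z: f(z) + r * beta(z), x, method="Nelder-Mead").x
        # step 3: stop once the weighted barrier term is small enough
        if r * beta(x) < eps:
            break
        r *= tau                                # otherwise decrease r and repeat
    return x

print(barrier_method(np.array([3.0])))          # strictly feasible start: g(3) < 0
# approaches the constrained minimizer x* = 1 from inside the feasible region
```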
Example
$$\min x \quad \text{s.t.} \quad -x + 1 \le 0$$
Let $\beta(x) = \frac{-1}{-x + 1} = \frac{1}{x - 1}$ when $x \ne 1$.
The minimum of $f(x) + r\beta(x) = x + r(x - 1)^{-1}$ is at $1 + \sqrt{r}$: setting $1 - r(x - 1)^{-2} = 0$ gives $(x - 1)^2 = r$, and as $r \to 0^+$ the minimizers tend to the constrained minimizer $x^* = 1$.
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Summary: penalty and barrier function methods
Penalty and barrier functions are usually differentiable.
The minimum is obtained in the limit
– penalty function: $r^h \to \infty$
– barrier function: $r^h \to 0$
Choosing the sequence $r^h$ is essential for convergence
– if $r^h \to \infty$ or $r^h \to 0$ too slowly, a large number of unconstrained problems must be solved
– if $r^h \to \infty$ or $r^h \to 0$ too fast, the solutions of successive unconstrained problems are far from each other and the solution time increases
Exact penalty function
The idea is a method where the solution can be found in a small number of iterations.
Suitable for both equality and inequality constraints.
An exact penalty function problem is, e.g., of the form
$$\min_{x \in R^n} f(x) + r\left(\sum_{i=1}^{m} \max[0, g_i(x)] + \sum_{i=1}^{l} |h_i(x)|\right)$$
Exact penalty function method
Theorem: Consider a point $x$ where the necessary KKT conditions hold, with corresponding Lagrange multipliers $\mu$ and $\nu$. Assume that the objective and inequality constraint functions are convex and the equality constraint functions are affine. Then $x$ solves the exact penalty function problem whenever
$$r \ge \max\{\mu_i,\ i = 1, \ldots, m;\ |\nu_i|,\ i = 1, \ldots, l\}.$$
The solution can thus be obtained with a finite value of the penalty parameter $r$.
The algorithm is similar to the penalty function method, except that $r^h$ is increased only when necessary
– e.g. when the feasible region is not approached fast enough
Properties of exact penalty function
Not differentiable at points $x$ where $g_i(x) = 0$ or $h_i(x) = 0$
– gradient-based methods are not suitable
If $r$ and the starting point could be chosen appropriately, only one minimization would in principle be required
– if $r$ is too large and the starting point is not close enough to the optimum, minimizing the exact penalty function can become difficult
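Because of the kinks, a derivative-free method is the safer choice for the subproblem. Below is a small Python sketch (my own illustration, assuming NumPy and SciPy) that minimizes the exact penalty function of the example on the next slide with Nelder-Mead:

```python
import numpy as np
from scipy.optimize import minimize

# Example from the next slide: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0                 # equality constraint h(x) = 0

def exact_penalty(x, r):
    """Nonsmooth exact penalty f(x) + r*|h(x)| (no inequalities in this example)."""
    return f(x) + r * abs(h(x))

# Nelder-Mead copes with the kink along h(x) = 0 where the gradient does not exist
for r in (0.5, 1.0, 5.0):
    res = minimize(lambda z: exact_penalty(z, r), np.array([0.0, 0.0]),
                   method="Nelder-Mead", options={"xatol": 1e-10, "fatol": 1e-10})
    print(r, res.x)   # for r >= 1 = |nu*| the minimizer is already x* = (0.5, 0.5)
```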
Example
$$\min f(x) = x_1^2 + x_2^2 \quad \text{s.t.} \quad x_1 + x_2 - 1 = 0$$
The optimal solution is $x^* = \left(\frac{1}{2}, \frac{1}{2}\right)^T$, $\nu^* = -2x_1^* = -2x_2^* = -1$.
Exact penalty function problem:
$$\min_{x \in R^2} x_1^2 + x_2^2 + r\,|x_1 + x_2 - 1|$$
Solution: $x^* = \left(\frac{r}{2}, \frac{r}{2}\right)^T$ when $0 \le r < 1$, and $x^* = \left(\frac{1}{2}, \frac{1}{2}\right)^T$ when $r \ge 1$
– (obtained by using the KKT conditions of an equivalent differentiable problem where the absolute value term is replaced with a new variable and two inequality constraints)
Thus the solution can be found with $r \ge 1$ $(= |\nu^*|)$.
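For completeness, my sketch of the equivalent differentiable problem mentioned in the parenthesis (the reformulation itself is standard; the case analysis follows from its KKT conditions):

```latex
% |x_1 + x_2 - 1| replaced by an auxiliary variable t and two inequalities
\begin{align*}
  \min_{x,\,t}\;& x_1^2 + x_2^2 + r\,t\\
  \text{s.t. }\;& x_1 + x_2 - 1 \le t,\\
                & -(x_1 + x_2 - 1) \le t.
\end{align*}
% Stationarity in t forces the two multipliers to sum to r; solving the
% remaining KKT conditions gives x^* = (r/2, r/2)^T for 0 <= r < 1 and
% x^* = (1/2, 1/2)^T for r >= 1.
```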
Example: barrier function
$$\min f(x) = x_1 x_2^2 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 2 \le 0$$
$x^* = (-0.8165, -1.1547)^T$; the constraint is active at $x^*$.
[Figure: (a) level curves of $f(x)$ and the boundary of $S$; logarithmic barrier function with (b) $r = 0.2$, (c) $r = 0.001$]
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Example: penalty function
$$\min f(x) = x_1 x_2^2 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 2 \le 0$$
$x^* = (-0.8165, -1.1547)^T$; the constraint is active at $x^*$.
[Figure: quadratic penalty function with (a) $r = 1$, (b) $r = 100$]
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Example: exact penalty function
$$\min f(x) = x_1 x_2^2 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 2 \le 0$$
$x^* = (-0.8165, -1.1547)^T$; the constraint is active at $x^*$, $\mu^* = 0.8165$.
[Figure: exact penalty function with (a) $r = 1.2$, (b) $r = 5$, (c) $r = 100$]
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Lagrangian function
Consider the problem
$$\min f(x) \quad \text{s.t.} \quad h_i(x) = 0,\ i = 1, \ldots, l$$
Lagrangian function:
$$L(x, \nu) = f(x) + \sum_{i=1}^{l} \nu_i h_i(x)$$
KKT conditions:
$$\nabla f(x) + \sum_{i=1}^{l} \nu_i \nabla h_i(x) = 0, \qquad h_i(x) = 0,\ i = 1, \ldots, l$$
Let $x^*$ be a minimizer and $\nu^*$ the corresponding Lagrange multiplier vector.
Properties of Lagrangian
KKT conditions: $x^*$ is a critical point of the Lagrangian function
– but $x^*$ is not necessarily a minimizer of $L(x, \nu^*)$
Thus, minimizing the Lagrangian function does not necessarily give a minimum of $f(x)$
– the Hessian $\nabla_{xx}^2 L(x^*, \nu^*)$ may be indefinite $\to$ $x^*$ is a saddle point (a numeric check is sketched below)
We need to improve the Lagrangian function!
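To make the saddle claim concrete, here is a small check (my own illustration, assuming NumPy) on the example used later in these slides, $\min x_1 x_2^2$ s.t. $x_1^2 + x_2^2 - 2 \le 0$, whose active constraint plays the role of an equality at $x^*$:

```python
import numpy as np

# Example used later: min x1*x2^2  s.t.  x1^2 + x2^2 - 2 <= 0 (active at x*)
x1, x2, mu = -0.8165, -1.1547, 0.8165           # x* and its multiplier

# Hessian of L(x, mu) = x1*x2^2 + mu*(x1^2 + x2^2 - 2) at x*
H = np.array([[2.0 * mu,  2.0 * x2           ],
              [2.0 * x2,  2.0 * x1 + 2.0 * mu]])
print(np.linalg.eigvals(H))    # one positive, one negative eigenvalue: a saddle
```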
Augmented Lagrangian function
Augmented Lagrangian function:
$$L_A(x, \nu, \varrho) = f(x) + \sum_{i=1}^{l} \nu_i h_i(x) + \frac{1}{2}\varrho \sum_{i=1}^{l} h_i(x)^2, \quad \varrho > 0$$
– the Lagrangian function plus a quadratic penalty function
The point $(x^*, \nu^*)$ is a critical point of the augmented Lagrangian:
– $\nabla_x L_A(x^*, \nu^*, \varrho) = 0$ and $\frac{1}{2}\varrho \sum_{i=1}^{l} h_i(x^*)^2 = 0$
Hessian:
$$\nabla_{xx}^2 L_A(x^*, \nu^*, \varrho) = \nabla_{xx}^2 L(x^*, \nu^*) + \varrho\,\nabla h(x^*)^T \nabla h(x^*)$$
It can be shown that for $\varrho$ larger than some threshold $\bar{\varrho}$, $\nabla_{xx}^2 L_A(x^*, \nu^*, \varrho)$ is positive definite $\to$ $x^*$ is a local minimizer of $L_A(x, \nu^*, \varrho)$ (a numeric check follows below).
We need to know $\nu^*$.
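Continuing the numeric sketch from above (same numbers, treating the active constraint as the single equality $h$): adding $\varrho\,\nabla h^T \nabla h$ turns the indefinite Hessian positive definite once $\varrho$ is large enough.

```python
import numpy as np

x1, x2, mu = -0.8165, -1.1547, 0.8165           # x* and its multiplier, as before
H = np.array([[2.0 * mu,  2.0 * x2           ],  # indefinite Hessian of L at x*
              [2.0 * x2,  2.0 * x1 + 2.0 * mu]])
grad_h = np.array([[2.0 * x1, 2.0 * x2]])       # 1 x 2 Jacobian as a row vector

for rho in (0.1, 1.0, 10.0):
    H_A = H + rho * grad_h.T @ grad_h           # augmented Hessian at (x*, nu*)
    print(rho, np.linalg.eigvals(H_A))          # both eigenvalues > 0 for rho >= 1
```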
Properties of $L_A(x, \nu, \varrho)$
Differentiable if the original functions are.
$x^*$ is a minimizer of $L_A(x, \nu^*, \varrho)$ for a finite $\varrho$.
It is the Lagrangian function plus a quadratic penalty function.
Algorithm
1) Choose the final tolerance $\epsilon > 0$. Choose $x^1$, $\nu_i^1$ $(i = 1, \ldots, l)$ and $\varrho$. Set $h = 1$.
2) Test optimality: if the optimality conditions are satisfied, stop. The solution is $x^h$.
3) Solve (with a suitable method)
$$\min_{x \in R^n} L_A(x, \nu^h, \varrho)$$
using $x^h$ as the starting point. Let the solution be $x^{h+1}$.
4) Update the Lagrange multipliers: e.g. $\nu^{h+1} = \nu^h + \varrho\,h(x^{h+1})$.
5) Increase $\varrho$ if necessary, e.g. if $\|h(x^h)\| - \|h(x^{h+1})\| < \epsilon$.
6) Set $h = h + 1$ and go to 2).
Note: $x^h \to x^*$ only if $\nu^h \to \nu^*$.
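A compact Python sketch of this loop (my own illustration, assuming NumPy and SciPy), on the earlier equality-constrained example $\min x_1^2 + x_2^2$ s.t. $x_1 + x_2 - 1 = 0$. Feasibility is used as a simplified optimality test, and $\varrho$ is simply doubled each round rather than increased only when necessary:

```python
import numpy as np
from scipy.optimize import minimize

# Earlier example: min x1^2 + x2^2  s.t.  h(x) = x1 + x2 - 1 = 0
f = lambda x: x[0]**2 + x[1]**2
h = lambda x: x[0] + x[1] - 1.0

def L_A(x, nu, rho):
    """Augmented Lagrangian: the Lagrangian plus a quadratic penalty on h."""
    return f(x) + nu * h(x) + 0.5 * rho * h(x)**2

def augmented_lagrangian(x, nu=0.0, rho=1.0, eps=1e-8, max_iter=100):
    for _ in range(max_iter):
        if abs(h(x)) < eps:                     # step 2 (simplified): feasible -> stop
            break
        # step 3: minimize the smooth L_A from the current point with BFGS
        x = minimize(lambda z: L_A(z, nu, rho), x, method="BFGS").x
        nu = nu + rho * h(x)                    # step 4: multiplier update
        rho *= 2.0                              # step 5 (simplified): always increase
    return x, nu

x, nu = augmented_lagrangian(np.array([0.0, 0.0]))
print(x, nu)    # x -> (0.5, 0.5) and nu -> nu* = -1, matching the earlier example
```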
Example
$$\min f(x) = x_1 x_2^2 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 2 \le 0$$
$x^* = (-0.8165, -1.1547)^T$; the constraint is active at $x^*$.
Lagrangian function: a saddle point at $x^*$.
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Example (cont.)
Augmented Lagrangian function:
[Figure: (a) $\varrho = 0.075$, (b) $\varrho = 0.2$, (c) $\varrho = 100$]
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Example (cont.)
Augmented Lagrangian function with $\nu^* = 0.8165$, $\varrho = 0.2$:
[Figure: (a) $\nu = 0.5$, (b) $\nu = 0.9$, (c) $\nu = 1.0$]
(From Miettinen: Nonlinear optimization, 2007, in Finnish)
Topics of the lectures next week
Mon, Feb 10th: Constrained optimization: gradient projection, active set method
Wed, Feb 12th: Constrained optimization: SQP method & Matlab
Study this before the lecture! Questions to be considered:
– What is the basic idea of gradient projection?
– What is the basic idea of active set methods?
– What is the basic idea of Sequential Quadratic Programming (SQP)?