ChE 4G03: Optimization in Chemical Engineering
Non-Linear Programming (NLP): Single-Variable, Unconstrained

Benoît Chachuat <benoit@mcmaster.ca>
McMaster University, Department of Chemical Engineering

Solving Single-Variable, Unconstrained NLPs

Prerequisites: Basic Concepts for Optimization – Part I; Basic Concepts for Optimization – Part II

This lecture: Methods for Single-Variable Unconstrained Optimization
  Numerical solution methods: region elimination methods, interpolation methods, derivative-based methods
  Analytical solution methods

Outline

Single-Variable Optimization

"Aren't single-variable problems easy?" — Sometimes
"Won't the methods for multivariable problems work in the single-variable case?" — Yes

But,
1. A few important problems are single-variable — e.g., nonlinear regression
2. This will give us insight into multivariable solution techniques
3. Single-variable optimization is a subproblem for many nonlinear optimization methods and software! — e.g., linesearch in (un)constrained NLP

For additional details, see Rardin (1998), Chapter 13.2
Region Elimination Methods (Minimize Case)

Iteratively consider the function value at 4 carefully spaced points:
  Assume a unimodal function
  Leftmost x^ℓ is always a lower bound on the optimal value x^*
  Rightmost x^u is always an upper bound on the optimal value x^*
  Points x_1 and x_2 fall in between

Case 1: f(x_1) < f(x_2) — the optimum cannot lie right of x_2, so the part (x_2, x^u] is eliminated
Case 2: f(x_1) > f(x_2) — the optimum cannot lie left of x_1, so the part [x^ℓ, x_1) is eliminated

[Figure: a unimodal f(x) with the points x^ℓ, x_1, x_2, x^u and the eliminated subinterval shaded for each case]

How do we choose the intermediate points x_1, x_2?

Golden Section Search (Minimize Case)

The points x_1, x_2 are taken as:
  x_1 ≜ x^u − γ [x^u − x^ℓ]
  x_2 ≜ x^ℓ + γ [x^u − x^ℓ]
with γ ≜ (√5 − 1)/2 ≈ 0.618, a fraction known as the golden ratio

Golden section search proceeds by keeping both possible intervals [x^ℓ, x_2] or [x_1, x^u] equal, so the bracket is reduced by the factor γ at every iteration

What is the advantage of choosing this ratio γ?
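One way to answer this, as a short check: write L ≜ x^u − x^ℓ for the current bracket length. In Case 1 the new bracket is [x^ℓ, x_2], of length γL, and its right-hand interior point falls at x^ℓ + γ(γL) = x^ℓ + γ²L, while the surviving old point x_1 lies at x^ℓ + (1 − γ)L. The two coincide exactly when

  γ² = 1 − γ,  i.e.,  γ = (√5 − 1)/2 ≈ 0.618,

so each iteration reuses one of the two previous interior points, requires only a single new function evaluation, and shrinks the bracket by the constant factor γ.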
Golden Section Search Algorithm (Minimize Case)

Step 0: Initialization
  Choose lower and upper bounds, x^ℓ and x^u, that bracket x^*, as well as stopping tolerance ε > 0
  Set x_1 ← x^u − γ [x^u − x^ℓ] and x_2 ← x^ℓ + γ [x^u − x^ℓ]

Step 1: Stopping
  If x^u − x^ℓ < ε, stop — report x^* ← (x^ℓ + x^u)/2 as an approximate solution

Step 2a: Case f(x_1) < f(x_2) (optimum left)
  Narrow the search by eliminating the rightmost part:
    x^u ← x_2,  x_2 ← x_1,  x_1 ← x^u − γ [x^u − x^ℓ]
  and evaluate f(x_1) — Return to step 1

Step 2b: Case f(x_1) > f(x_2) (optimum right)
  Narrow the search by eliminating the leftmost part:
    x^ℓ ← x_1,  x_1 ← x_2,  x_2 ← x^ℓ + γ [x^u − x^ℓ]
  and evaluate f(x_2) — Return to step 1
Pros and Cons of Golden Section Search

Pros:
  Objective function can be nonsmooth or even discontinuous
  Calculations are straightforward (only function evaluations)

Cons:
  Assumes a unimodal function and requires prior knowledge of an enclosure x^* ∈ [x^ℓ, x^u]
  Golden section search is slow! Considerable computation may be needed to get an accurate solution

Example: Consider the problem: min f(x) ≜ (x − 20)^4 / 500 − 2x

  it. |   x^ℓ |   x_1 |   x_2 |   x^u | f(x^ℓ) | f(x_1) | f(x_2) | f(x^u) | x^u − x^ℓ
    0 |  0.00 | 15.28 | 24.72 | 40.00 | 320.00 | -29.57 | -48.45 | 240.00 |     40.00
    1 | 15.28 | 24.72 | 30.56 | 40.00 | -29.57 | -48.45 | -36.27 | 240.00 |     24.72
    2 | 15.28 | 21.12 | 24.72 | 30.56 | -29.57 | -42.23 | -48.45 | -36.27 |     15.28
    3 | 21.12 | 24.72 | 26.95 | 30.56 | -42.23 | -48.45 | -49.23 | -36.27 |      9.44
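A minimal Python sketch of the algorithm above; the function name golden_section and the default tolerance are illustrative choices. Run on the example problem with the starting bracket [0, 40], it reproduces the bracketing iterates tabulated above (up to rounding).

    import math

    def golden_section(f, xl, xu, eps=1e-4):
        """Minimize a unimodal f on [xl, xu] by golden section search."""
        gamma = (math.sqrt(5.0) - 1.0) / 2.0      # ~0.618
        x1 = xu - gamma * (xu - xl)
        x2 = xl + gamma * (xu - xl)
        f1, f2 = f(x1), f(x2)
        while xu - xl >= eps:
            if f1 < f2:                           # optimum left: drop (x2, xu]
                xu, x2, f2 = x2, x1, f1
                x1 = xu - gamma * (xu - xl)
                f1 = f(x1)
            else:                                 # optimum right: drop [xl, x1)
                xl, x1, f1 = x1, x2, f2
                x2 = xl + gamma * (xu - xl)
                f2 = f(x2)
        return 0.5 * (xl + xu)                    # midpoint of the final bracket

    # Example problem from the slides: min (x-20)^4/500 - 2x over [0, 40]
    xopt = golden_section(lambda x: (x - 20.0)**4 / 500.0 - 2.0*x, 0.0, 40.0)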
Interpolation Methods (Minimize Case)

Improve speed by taking full advantage of a 3-point pattern:
  Assume a unimodal, continuous function
  Fit a smooth curve through the points (x^ℓ, f(x^ℓ)), (x^m, f(x^m)), (x^u, f(x^u)); then, optimize the fitted curve (its optimum is denoted x^q)

Case 1A: x^q > x^m, and f(x^q) < f(x^m) — the part [x^ℓ, x^m) is eliminated
Case 1B: x^q > x^m, and f(x^q) > f(x^m) — the part (x^q, x^u] is eliminated

[Figure: a unimodal f(x) with the 3-point pattern x^ℓ, x^m, x^u, the fitted-curve optimum x^q, and the eliminated subinterval for each case]

How do we fit the curve?

Math Refresher: Lagrange Polynomials

For N + 1 data points, there is one and only one polynomial of order N, p_N(x) ≜ a_0 + a_1 x + · · · + a_N x^N, passing through all the points

The Lagrange polynomials provide one way of computing this interpolation:

  p_N(x) ≜ Σ_{k=0}^{N} p(x_k) L_k(x),   with: L_k(x) ≜ Π_{i=0, i≠k}^{N} (x − x_i) / (x_k − x_i)

[Figure: interpolating polynomials p_2(x) and p_3(x) through sample data points]
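As a sketch of how this construction can be evaluated in code (the helper name lagrange_eval and the sample points are illustrative assumptions):

    def lagrange_eval(xs, ys, x):
        """Evaluate the degree-N interpolating polynomial through (xs[k], ys[k]) at x."""
        total = 0.0
        for k, (xk, yk) in enumerate(zip(xs, ys)):
            # Basis polynomial L_k(x) = prod_{i != k} (x - x_i) / (x_k - x_i)
            Lk = 1.0
            for i, xi in enumerate(xs):
                if i != k:
                    Lk *= (x - xi) / (xk - xi)
            total += yk * Lk
        return total

    # e.g., the quadratic through three points of f(x) = (x-20)^4/500 - 2x
    f = lambda x: (x - 20.0)**4 / 500.0 - 2.0*x
    xs = [0.0, 32.0, 40.0]
    p2_at_25 = lagrange_eval(xs, [f(x) for x in xs], 25.0)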
Quadratic Fit Search (Minimize Case)

Quadratic fit search proceeds by fitting a 2nd-order polynomial through (x^ℓ, f(x^ℓ)), (x^m, f(x^m)), (x^u, f(x^u)):

  p_2(x) = f(x^ℓ) (x − x^m)(x − x^u) / [(x^ℓ − x^m)(x^ℓ − x^u)]
         + f(x^m) (x − x^ℓ)(x − x^u) / [(x^m − x^ℓ)(x^m − x^u)]
         + f(x^u) (x − x^ℓ)(x − x^m) / [(x^u − x^ℓ)(x^u − x^m)]

The unique optimum of the fitted quadratic function occurs at:

  x^q ≜ (1/2) [ f(x^ℓ) ((x^m)² − (x^u)²) + f(x^m) ((x^u)² − (x^ℓ)²) + f(x^u) ((x^ℓ)² − (x^m)²) ]
            / [ f(x^ℓ) (x^m − x^u) + f(x^m) (x^u − x^ℓ) + f(x^u) (x^ℓ − x^m) ]

Quadratic Fit Search Algorithm (Minimize Case)

Step 0: Initialization
  Choose starting 3-point pattern (x^ℓ, x^m, x^u), as well as stopping tolerance ε > 0

Step 1: Stopping
  If x^u − x^ℓ < ε, stop — report x^* ← x^m as an approximate solution

Step 2: Quadratic Fit
  Compute the quadratic fit optimum x^q from the formula above

Step 3a: Case x^q < x^m
  If f(x^q) > f(x^m), update x^ℓ ← x^q
  Otherwise, update x^u ← x^m, x^m ← x^q
  Return to step 1

Step 3b: Case x^q > x^m
  If f(x^q) > f(x^m), update x^u ← x^q
  Otherwise, update x^ℓ ← x^m, x^m ← x^q
  Return to step 1
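A minimal Python sketch of these steps, assuming the three fitted points are never collinear; the safeguard for x^q landing exactly on x^m is an added assumption, since the slide only lists the two strict cases.

    def quadratic_fit_search(f, xl, xm, xu, eps=1e-4, max_iter=100):
        """Minimize f over a 3-point pattern xl < xm < xu by quadratic fit search."""
        fl, fm, fu = f(xl), f(xm), f(xu)
        for _ in range(max_iter):
            if xu - xl < eps:
                break
            num = fl*(xm**2 - xu**2) + fm*(xu**2 - xl**2) + fu*(xl**2 - xm**2)
            den = fl*(xm - xu) + fm*(xu - xl) + fu*(xl - xm)
            xq = 0.5 * num / den             # optimum of the fitted quadratic
            if abs(xq - xm) < 1e-12:         # assumed safeguard: fit already centered at xm
                break
            fq = f(xq)
            if xq < xm:                      # Step 3a
                if fq > fm:
                    xl, fl = xq, fq
                else:
                    xu, fu = xm, fm
                    xm, fm = xq, fq
            else:                            # Step 3b
                if fq > fm:
                    xu, fu = xq, fq
                else:
                    xl, fl = xm, fm
                    xm, fm = xq, fq
        return xm

    # Same example problem, started from the 3-point pattern (0, 32, 40) as in the slides
    xopt = quadratic_fit_search(lambda x: (x - 20.0)**4 / 500.0 - 2.0*x, 0.0, 32.0, 40.0)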
Pros and Cons of Quadratic Fit Search

Pros:
  Calculations are also straightforward (only function evaluations)
  Typically faster than golden section search (but not so much...)

Cons:
  Assumes a unimodal function and requires prior knowledge of an enclosure x^* ∈ [x^ℓ, x^u]
  Objective function has to be smooth

Example: Consider the problem: min f(x) ≜ (x − 20)^4 / 500 − 2x

  it. |   x^ℓ |   x^m |   x^u | f(x^ℓ) | f(x^m) | f(x^u) |   x^q | f(x^q) | x^u − x^ℓ
    0 |  0.00 | 32.00 | 40.00 | 320.00 | -22.53 | 240.00 | 20.92 | -41.84 |     40.00
    1 |  0.00 | 20.92 | 32.00 | 320.00 | -41.84 | -22.53 | 25.00 | -48.75 |     32.00
    2 | 20.92 | 25.00 | 32.00 | -41.84 | -48.75 | -22.53 | 30.00 | -40.03 |     11.08
    3 | 20.92 | 25.00 | 30.00 | -41.84 | -48.75 | -40.03 |       |        |      9.08

Derivative-Based Methods

Improve convergence speed by using first- and second-order derivatives (rather than a 3-point pattern):
  Assume a smooth function f
  Consider the quadratic approximation of f passing through the current iterate x^k
  Optimize this approximation to determine the next iterate, x^{k+1}

[Figure: f(x), its quadratic approximation at x^k, and the next iterate x^{k+1}]
Newton's Method

How do we get a quadratic approximation at x^k? Use a Taylor series approximation:

  f(x) ≈ f(x^k) + f'(x^k) [x − x^k] + (1/2) f''(x^k) [x − x^k]²

Calculate the optimum, x^{k+1}, of the quadratic approximation:

  f'(x^{k+1}) = 0 = f'(x^k) + f''(x^k) [x^{k+1} − x^k]

Basic Newton's Search
The basic Newton's method proceeds iteratively as

  x^{k+1} ≜ x^k − f'(x^k) / f''(x^k)

The term d^{k+1} = −f'(x^k) / f''(x^k) is called the Newton step.

Basic Newton Algorithm

Step 0: Initialization
  Choose an initial guess x^0, as well as stopping tolerance ε > 0
  Set k ← 0

Step 1: Derivatives
  Compute first- and second-order derivatives f'(x^k), f''(x^k)

Step 2: Stopping
  If |f'(x^k)| < ε, stop — report x^* ← x^k as an approximate solution

Step 3: Newton Step
  Compute Newton step d^{k+1} ← −f'(x^k) / f''(x^k)
  Update iterate x^{k+1} ← x^k + d^{k+1}
  Increment k ← k + 1 and return to step 1
Variant: Quasi-Newton Algorithm

Idea: Approximate second-order derivatives using finite differences,

  f''(x^k) ≈ [f'(x^k) − f'(x^{k−1})] / [x^k − x^{k−1}]

Step 0: Initialization
  Choose initial points x^{−1} and x^0, as well as stopping tolerance ε > 0
  Set k ← 0

Step 1: Derivatives
  Compute the first-order derivative f'(x^k) and the approximate inverse of the second-order derivative B(x^k) ≜ [x^k − x^{k−1}] / [f'(x^k) − f'(x^{k−1})]

Step 2: Stopping
  If |f'(x^k)| < ε, stop — report x^* ← x^k as an approximate solution

Step 3: Newton Step
  Compute Newton step d^{k+1} ← −f'(x^k) B(x^k)
  Update iterate x^{k+1} ← x^k + d^{k+1}
  Increment k ← k + 1 and return to step 1
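The same idea in code, with the finite-difference estimate substituted for f''; a sketch assuming two starting points (x^{−1} = 29 is an arbitrary choice here, not taken from the slides, and f'(x^k) ≠ f'(x^{k−1}) is assumed throughout).

    def quasi_newton_1d(df, xm1, x0, eps=1e-4, max_iter=50):
        """Quasi-Newton (secant) search: f'' is replaced by a finite-difference estimate."""
        x_prev, x = xm1, x0
        g_prev = df(x_prev)
        for _ in range(max_iter):
            g = df(x)
            if abs(g) < eps:                   # Step 2: stationarity test
                break
            B = (x - x_prev) / (g - g_prev)    # Step 1: approximate 1/f''(x^k)
            x_prev, g_prev = x, g
            x = x - g * B                      # Step 3: quasi-Newton step
        return x

    # Running example with assumed starting points x^{-1} = 29 and x^0 = 30
    xopt = quasi_newton_1d(lambda x: (x - 20.0)**3 / 125.0 - 2.0, 29.0, 30.0)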
Pros and Cons of (Quasi-)Newton Search

Pros:
  Very fast convergence close to the optimal solution (quadratic convergence rate)
  No need to bound the optimum within a range

Cons:
  Requires a "good" initial guess, otherwise typically diverges
  No distinction between (local) minima and maxima!
  Objective function has to be smooth and first-order derivatives must be available (possibly second-order derivatives too)

Example: Consider the problem: min f(x) ≜ (x − 20)^4 / 500 − 2x

  k |   x^k | f(x^k) | f'(x^k) | f''(x^k) | d^{k+1}
  0 | 30.00 | -40.00 |   6.000 |     2.40 |   -2.50
  1 | 27.50 | -48.67 |   1.375 |     1.35 |   -1.02
  2 | 26.48 | -49.43 |   0.178 |     1.01 |