Nonlinear Optimization and Modelling

Optimization is the process of determining the values of
parameters at which a function reaches a maximum or
minimum value.
Examples:
- least-squares fitting → obtain a minimum in the sum of
squares for linear models.
- linear programming in business and economics.
- optimizing the performance of an instrument that has
many operating variables.
- computational methods used to determine molecular or
crystal structures → establish the most stable structure
(the one with the lowest potential energy).
Nonlinear models can also be used, but new
techniques are needed to perform the minimization /
maximization.
All solution strategies for nonlinear problems are
iterative.
Direct Search Methods
→ use function values only (NOT 1st / 2nd derivatives)
Example:
y = f(x1,x2)
Function with 2 variables and min. at (x1*,x2*)
[Figure: contour plot of f(x1,x2) in the (x1, x2) plane, with the minimum at (x1*, x2*) and search points A and B]
1) Start at some point A (combination of x1 and x2)
2) Look for a minimum in f(x1,x2) along x1-axis, say B.
3) From B, look for a minimum in f(x1,x2) along x2-axis.
4) Continue until changes in the response f(x1,x2)
become insignificant.
Slow process for more than two variables!
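A minimal sketch of this axis-by-axis search, assuming a hypothetical two-variable test function and using SciPy's one-dimensional minimizer for each line search:

```python
from scipy.optimize import minimize_scalar

def f(x1, x2):
    # Hypothetical test function with its minimum at (2, -1).
    return (x1 - 2)**2 + 2*(x2 + 1)**2 + 0.5*(x1 - 2)*(x2 + 1)

x1, x2 = 0.0, 0.0                                # step 1: starting point A
for sweep in range(100):
    f_old = f(x1, x2)
    x1 = minimize_scalar(lambda t: f(t, x2)).x   # step 2: minimum along x1
    x2 = minimize_scalar(lambda t: f(x1, t)).x   # step 3: minimum along x2
    if abs(f_old - f(x1, x2)) < 1e-12:           # step 4: change insignificant
        break

print(f"minimum near ({x1:.4f}, {x2:.4f}) after {sweep + 1} sweeps")
```

The cross term in f couples the two axes, so several sweeps are needed; this is exactly the slowness noted above.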
Simplex Optimization of Variables
Simplex: a geometric figure defined by a number of
points in space, equal to one more than the number of
dimensions of the space.
In optimization and curve fitting:
dimensions of space = no. of parameters to be
determined.
E.g.: Simplex in 2-D = triangle
(2 parameters, 3 points)
Simplex in 3-D = (distorted) tetrahedron
(3 parameters, 4 points)
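In code, a simplex can be stored simply as an array of its vertices, one row per point; a minimal sketch (the 2-D vertex values are illustrative and match the example used later):

```python
import numpy as np

# 2-D simplex for two parameters (a, b): m + 1 = 3 vertices.
simplex_2d = np.array([[1.0, 1.0],
                       [3.0, 1.0],
                       [2.0, 2.0]])

# 3-D simplex for three parameters: 4 vertices (a distorted tetrahedron).
simplex_3d = np.array([[0.0, 0.0, 0.0],
                       [1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
```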
Example:
Consider a 2-D simplex to optimize a function with
two parameters. Apply the simplex approach to
least-squares curve fitting, i.e. the response function is
the ‘sum of squares of errors’ (SSE), which must tend
to a minimum.
Consider a set of N observed experimental data points:
(x1, y1), (x2, y2), …, (xN, yN)
The data can be modelled by a function:
y = f(x; a, b)
e.g.,
y  aebx
Find values of a and b to minimize the SSE between
the observed and calculated data points.
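A short sketch of this response function (the data shown are hypothetical, generated near a = 2, b = 0.5):

```python
import numpy as np

def sse(params, x, y):
    """Sum of squared errors between observed y and the model a*exp(b*x)."""
    a, b = params
    residuals = y - a * np.exp(b * x)
    return np.sum(residuals**2)

# Hypothetical data, roughly following y = 2*exp(0.5*x).
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([2.05, 2.52, 3.35, 4.15, 5.52])

print(sse([2.0, 0.5], x, y))   # small SSE near the true parameters
print(sse([1.0, 1.0], x, y))   # larger SSE away from them
```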
The space we are searching might look like the
following, where the contours are values of SSE.
[Figure: SSE surface over the (a, b) parameter plane, showing the minimum in SSE]
1) Establish a starting simplex - Obtain SSE for three
pairs of a and b values to form a triangle.
(i.e. m+1 points for m parameters)
[Figure: starting simplex ABC in the (a, b) plane; SSE evaluated at A, B and C]
2) Move the simplex in the direction that will reduce
the SSE.
HOW?
Find the least desirable point. Replace it with its
mirror image across the face of the remaining points.
[Figure: simplex ABC with the least desirable vertex B replaced by its mirror image B']
How do I find the mirror image?
The vertices of the m-dimensional simplex are
represented by the coordinate vectors:
P1 P2 … Pj … Pm Pm+1
Say the most undesirable point Pj is eliminated.
It leaves
P1 P2 … Pj-1 Pj+1 … Pm Pm+1
The centroid (centre of gravity) is then:
P̄ = (1/m) (P1 + P2 + … + Pj−1 + Pj+1 + … + Pm+1)
The new (reflected) point of the simplex is given by:
Pj* = P̄ + (P̄ − Pj)
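A small sketch of these two formulas, generic over the number of parameters m (numpy assumed; the function name is hypothetical):

```python
import numpy as np

def reflect_worst(simplex, j):
    """Eliminate vertex j and replace it with its mirror image:
    Pbar = centroid of the remaining m vertices, Pj* = Pbar + (Pbar - Pj)."""
    others = np.delete(simplex, j, axis=0)   # the m remaining vertices
    p_bar = others.mean(axis=0)              # centroid (centre of gravity)
    p_star = p_bar + (p_bar - simplex[j])    # reflected point
    new_simplex = simplex.copy()
    new_simplex[j] = p_star
    return new_simplex, p_star
```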
Example:
[Figure: starting simplex with vertices (1;1), (3;1) and (2;2) in the (a, b) plane]
Eliminate (1 ; 1) from (1 ; 1), (3 ; 1), (2 ; 2)
Centre of gravity:
P̄ = (1/m) (P1 + P2 + … + Pj−1 + Pj+1 + … + Pm+1)
P̄ = (1/2) [(3 ; 1) + (2 ; 2)]
P̄ = (1/2) (5 ; 3)
P̄ = (5/2 ; 3/2)
New (reflected) point:
Pj* = P̄ + (P̄ − Pj)
Pj* = (5/2 ; 3/2) + [(5/2 ; 3/2) − (1 ; 1)]
Pj* = (5/2 ; 3/2) + (3/2 ; 1/2)
Pj* = (4 ; 2)
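The same arithmetic, checked numerically with the vertices above:

```python
import numpy as np

simplex = np.array([[1.0, 1.0],    # Pj, the point to be eliminated
                    [3.0, 1.0],
                    [2.0, 2.0]])
p_bar = simplex[1:].mean(axis=0)        # centroid: [2.5, 1.5] = (5/2 ; 3/2)
p_star = p_bar + (p_bar - simplex[0])   # reflection: [4.0, 2.0] = (4 ; 2)
print(p_bar, p_star)
```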
[Figure: new simplex with vertices (3;1), (2;2) and the reflected point (4;2); (1;1) has been eliminated]
This procedure continues until SSE no longer
improves.
Improve the performance of the simplex in two ways:
1) Expand in size if it is going in the right direction.
2) Contract near the minimum to improve resolution.
[Figure: simplex WNB in the (a, b) plane, with centroid P̄, reflected point R, expanded point S, and contracted points U and T]
Original simplex = WNB.
1) W → most undesirable point
(B = best and N = next best)
Elimination of W and reflection gives R.
2) If R is better than B → the simplex is moving in the
correct direction, which suggests a possible expansion
of the simplex in that direction:
P* = P̄ + γ(P̄ − PW)
where γ > 1 and we get S.
[Figure: same simplex diagram as above]
3) If R is not better than B, but better than N
→ new simplex is BNR (with γ = 1).
4) If R is less desirable than B and N, but better than
W → a contraction is indicated:
P* = P̄ + β(P̄ − PW)
where 0 < β < 1 and we get U.
5) If R is less desirable than W then:
P* = P̄ − β(P̄ − PW)
where 0 < β < 1 and we get T.
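A sketch of one such decision step, following rules 1)-5) above; the response function is anything SSE-like, and the defaults β = 0.5 and γ = 2.0 are illustrative choices, not values fixed by the method:

```python
import numpy as np

def simplex_step(simplex, response, beta=0.5, gamma=2.0):
    """One move of the simplex: reflect the worst vertex, then
    expand (rule 2), keep R (rule 3), or contract (rules 4 and 5)."""
    vals = np.array([response(p) for p in simplex])
    order = np.argsort(vals)                     # best ... worst
    f_b, f_n, f_w = vals[order[0]], vals[order[-2]], vals[order[-1]]
    p_w = simplex[order[-1]]

    p_bar = np.delete(simplex, order[-1], axis=0).mean(axis=0)
    p_r = p_bar + (p_bar - p_w)                  # reflected point R
    f_r = response(p_r)

    if f_r < f_b:                                # rule 2: expand, giving S
        p_new = p_bar + gamma * (p_bar - p_w)
        if response(p_new) > f_r:                # safeguard: keep R if the
            p_new = p_r                          # expansion overshoots
    elif f_r < f_n:                              # rule 3: accept R (gamma = 1)
        p_new = p_r
    elif f_r < f_w:                              # rule 4: contract toward R (U)
        p_new = p_bar + beta * (p_bar - p_w)
    else:                                        # rule 5: contract toward W (T)
        p_new = p_bar - beta * (p_bar - p_w)

    simplex[order[-1]] = p_new
    return simplex
```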
⇒ the simplex moves in the direction of the minimum
and gets smaller as it gets closer to it.
The iteration must be stopped at some point; e.g.
when the changes in the parameters become small.
PROS:
- easy to visualise and easy to program.
- no setting up of equations or finding derivatives.
CONS:
- may fail for large problems with many parameters.
- slow (no additional info which might indicate the
direction of the minimum).
Any minimization procedure is unlikely to find the
global minimum if it is started near a local minimum.
⇒ good initial estimates of the starting parameters are
required.
[Figure: SSE versus parameter a, showing a local minimum and the global minimum]
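A common safeguard, sketched below with an illustrative one-parameter response, is to run the minimization from several starting estimates and keep the best result:

```python
from scipy.optimize import minimize

def sse(params):
    a = params[0]
    # Illustrative response with a local minimum near a = -1.4
    # and the global minimum near a = 1.7.
    return 0.1*a**4 - 0.5*a**2 - 0.3*a + 1.0

starts = [-3.0, -0.5, 0.5, 3.0]              # several initial estimates
results = [minimize(sse, x0=[s]) for s in starts]
best = min(results, key=lambda r: r.fun)
print(best.x, best.fun)                      # the global minimum wins
```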
Gradient Methods
Look at the gradient of the response as the
parameters are changed to locate a minimum (or
maximum) ⇒ need to find derivatives.
Newton-Raphson method:
- simplest gradient method
- uses a Taylor series expansion to linearise the
function
- not very reliable in its basic form
Levenberg-Marquardt method:
- optimized for nonlinear curve fitting
- modification of the basic Newton-Raphson method
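For the exponential model used earlier, SciPy's curve_fit applies a Levenberg-Marquardt algorithm by default for unconstrained fits; a minimal sketch with the hypothetical data from before:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(b * x)

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
y = np.array([2.05, 2.52, 3.35, 4.15, 5.52])

# p0 supplies the good initial estimates the notes call for.
popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])
print(popt)   # fitted (a, b), close to (2, 0.5) for this data
```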