Nonlinear Optimization and Modelling

Optimization is the process of determining the values of parameters that cause a function to reach a maximum or minimum value. Examples:
- least-squares fitting, which obtains a minimum in the sum of squares for linear models;
- linear programming in business and economics;
- optimizing the performance of an instrument that has many operating variables;
- computational methods for determining molecular or crystal structures, which establish the most stable structure (the one with the lowest potential energy).

Nonlinear models can also be used, but new techniques are needed to perform the minimization / maximization. All solution strategies for nonlinear problems are iterative.

Direct Search Methods

Direct search methods use function values only (NOT first or second derivatives).

Example: y = f(x1, x2), a function of two variables with a minimum at (x1*, x2*).

[Figure: contour plot of f(x1, x2) with axes x1 and x2, showing the search path from the starting point A through B towards the minimum at (x1*, x2*).]

1) Start at some point A (a combination of x1 and x2).
2) Look for a minimum in f(x1, x2) along the x1-axis, arriving at, say, B.
3) From B, look for a minimum in f(x1, x2) along the x2-axis.
4) Continue until changes in the response f(x1, x2) become insignificant.

This is a slow process for more than two variables!

Simplex Optimization of Variables

A simplex is a geometric figure defined by a number of points in space equal to one more than the number of dimensions of the space. In optimization and curve fitting, the dimensions of the space equal the number of parameters to be determined. E.g.:
- a simplex in 2-D is a triangle (2 parameters, 3 points);
- a simplex in 3-D is a (distorted) tetrahedron (3 parameters, 4 points).

Example: Consider a 2-D simplex to optimize a function with two parameters. Apply the simplex approach to least-squares curve fitting, i.e. the response function is the sum of squares of errors (SSE), which must tend to a minimum.

Consider a set of N observed experimental data points:

(x1, y1), (x2, y2), ..., (xN, yN)

The data can be modelled by a function y = f(x; a, b), e.g. $y = a e^{bx}$. Find the values of a and b that minimize the SSE between the observed and calculated data points. The space we are searching might look like a contour plot over the (a, b) plane, where the contours are values of SSE.

[Figure: contour plot of SSE over the (a, b) plane, showing the minimum in SSE.]

1) Establish a starting simplex: obtain the SSE for three pairs of a and b values to form a triangle (i.e. m + 1 points for m parameters).

[Figure: starting simplex, a triangle with vertices A, B and C in the (a, b) plane; the SSE is evaluated at A, B and C.]

2) Move the simplex in the direction that will reduce the SSE. HOW? Find the least desirable point and replace it with its mirror image across the face formed by the remaining points.

[Figure: the triangle ABC with vertex A reflected through the face BC to give the new vertex B'.]

How do we find the mirror image? The vertices of the m-dimensional simplex are represented by the coordinate vectors

P1, P2, ..., Pj, ..., Pm, Pm+1

Say the most undesirable point, Pj, is eliminated. This leaves

P1, P2, ..., Pj-1, Pj+1, ..., Pm, Pm+1

The centroid (centre of gravity) of the remaining points is then

$$\bar{P} = \frac{1}{m}\left(P_1 + P_2 + \dots + P_{j-1} + P_{j+1} + \dots + P_{m+1}\right)$$

The new (reflected) point of the simplex is given by

$$P_j^* = \bar{P} + (\bar{P} - P_j)$$

Example: Eliminate (1, 1) from the simplex (1, 1), (3, 1), (2, 2).

Centre of gravity:

$$\bar{P} = \frac{1}{2}\left[(3, 1) + (2, 2)\right] = \left(\frac{5}{2}, \frac{3}{2}\right)$$

New (reflected) point:

$$P_j^* = \bar{P} + (\bar{P} - P_j) = \left(\frac{5}{2}, \frac{3}{2}\right) + \left(\frac{5}{2} - 1, \frac{3}{2} - 1\right) = \left(\frac{5}{2} + \frac{3}{2}, \frac{3}{2} + \frac{1}{2}\right) = (4, 2)$$

[Figure: the simplex (1, 1), (3, 1), (2, 2) in the (a, b) plane, with (1, 1) reflected across the opposite face to the new vertex (4, 2).]

This procedure continues until the SSE no longer improves.
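As a concrete illustration, here is a minimal Python sketch of the centroid-and-reflection step (the function name `reflect` and the use of NumPy are choices made here, not part of the notes); it reproduces the worked example above.

```python
import numpy as np

def reflect(simplex, j):
    """Reflect vertex j of a simplex through the centroid of the
    remaining vertices: P* = P_bar + (P_bar - P_j)."""
    others = np.delete(simplex, j, axis=0)   # drop the worst vertex
    p_bar = others.mean(axis=0)              # centroid of the remaining m points
    return p_bar + (p_bar - simplex[j])      # mirror image across the face

# Worked example from the notes: eliminate (1, 1) from (1, 1), (3, 1), (2, 2)
simplex = np.array([[1.0, 1.0], [3.0, 1.0], [2.0, 2.0]])
print(reflect(simplex, 0))                   # -> [4. 2.]
```

In a full optimization one would evaluate the SSE at each vertex, pick j as the vertex with the worst SSE, and repeat the reflection until the SSE no longer improves.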
The performance of the simplex can be improved in two ways:
1) expand it in size if it is moving in the right direction;
2) contract it near the minimum to improve resolution.

[Figure: contour plot in the (a, b) plane showing the original simplex WNB and the candidate points T, U, R and S along the line from W through the centroid P̄.]

For the original simplex WNB, where W is the most undesirable point, B the best and N the next best:

1) Eliminating W and reflecting gives R.
2) If R is better than B, the simplex is moving in the correct direction, which suggests a possible expansion in that direction:
$$P^* = \bar{P} + \gamma(\bar{P} - P_W), \quad \gamma > 1$$
and we get S.
3) If R is not better than B, but better than N, the new simplex is BNR (i.e. with γ = 1).
4) If R is less desirable than B and N, but better than W, a contraction is indicated:
$$P^* = \bar{P} + \beta(\bar{P} - P_W), \quad 0 < \beta < 1$$
and we get U.
5) If R is less desirable than W, the simplex contracts to the W side of the centroid:
$$P^* = \bar{P} - \beta(\bar{P} - P_W), \quad 0 < \beta < 1$$
and we get T.

In this way the simplex moves in the direction of the minimum and becomes smaller as it gets closer to it. The iteration must be stopped at some point, e.g. when the changes in the parameters become small.

PROS:
- easy to visualize and easy to program;
- no setting up of equations or finding of derivatives.

CONS:
- may fail for large problems with many parameters;
- slow (no additional information that might indicate the direction of the minimum).

Any minimization procedure is unlikely to find the global minimum if it is started near a local minimum, so good initial estimates of the starting parameters are required.

[Figure: SSE plotted against the parameter a, showing a shallow local minimum and a deeper global minimum.]

Gradient Methods

Gradient methods look at the gradient of the response as the parameters are changed in order to locate a minimum (or maximum); this requires finding derivatives.

Newton-Raphson method:
- the simplest gradient method;
- uses a Taylor series expansion to linearize the function;
- not very reliable in its basic form.

Levenberg-Marquardt method:
- optimized for nonlinear curve fitting;
- a modification of the basic Newton-Raphson method.
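As a sketch of a Levenberg-Marquardt fit in practice, the example below fits the model $y = a e^{bx}$ from earlier in the notes using SciPy's `curve_fit`, which defaults to the Levenberg-Marquardt algorithm for unconstrained problems. The synthetic data, the true values (a = 2, b = 0.5) and the starting guess `p0` are illustrative choices, not from the notes.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    """Model from the notes: y = a * exp(b * x)."""
    return a * np.exp(b * x)

# Illustrative synthetic data: a = 2, b = 0.5 plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(0, 2, 20)
y = model(x, 2.0, 0.5) + rng.normal(scale=0.05, size=x.size)

# Good initial estimates matter: a poor p0 may land in a local minimum
popt, pcov = curve_fit(model, x, y, p0=(1.0, 1.0))
print(popt)  # estimates of (a, b), close to (2, 0.5)
```

Note that the starting guess plays the same role here as the starting simplex did above: both methods are iterative and only converge to the minimum nearest their starting point.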