WI4087TU sheets week..


Criteria for optimality

If f is a differentiable function, then f'(x) = 0 is a necessary condition for x being a minimum. It is not a sufficient condition, however: f'(x) = 0 ⇒ stationary point.

f(x) = x²: x = 0 is an absolute minimum
f(x) = -x²: x = 0 is an absolute maximum
f(x) = x³: x = 0 is a saddle point
f(x) = 0: x = 0 is both a minimum and a maximum

For a twice differentiable function the condition f'(x) = 0, f''(x) > 0 is sufficient for x being a minimum. It is not a necessary condition, however: f(x) = x⁴ has f'(0) = f''(0) = f'''(0) = 0, f''''(0) > 0.

This function has an absolute minimum, but does not satisfy the above criterion.

For 2k times differentiable functions, a sufficient criterion is: f'(x) = f''(x) = … = f^(2k-1)(x) = 0, f^(2k)(x) > 0.
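For the f(x) = x⁴ example above this criterion can be checked symbolically. A minimal sympy sketch (the search loop is our own illustration) that finds the smallest derivative order not vanishing at x = 0:

import sympy as sp

x = sp.symbols('x')
f = x**4

# Find the smallest derivative order n with a non-vanishing derivative at x = 0.
n = 1
while sp.diff(f, x, n).subs(x, 0) == 0:
    n += 1
print(n, sp.diff(f, x, n).subs(x, 0))  # prints "4 24": n = 2k with k = 2, f^(4)(0) = 24 > 0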

Is this also necessary for an infinitely differentiable function, i.e., does every non-zero function that has a minimum satisfy this criterion for some k?

The answer is: no! Consider f(x) = exp(-1/x²) (if x ≠ 0), and f(0) = 0.

This function is continuous, infinitely many times differentiable, and f^(j)(0) = 0 for all j, but f has an absolute minimum at x = 0.

This function is not analytic at x = 0 (the Taylor series expansion does not converge to the function).
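The vanishing derivatives can be verified with sympy; a small sketch (the derivative at 0 is taken as a limit, since the closed form only holds for x ≠ 0):

import sympy as sp

x = sp.symbols('x')
f = sp.exp(-1 / x**2)

# The derivatives at 0 are the limits of the derivatives for x != 0.
for j in range(1, 5):
    print(j, sp.limit(sp.diff(f, x, j), x, 0))  # every limit is 0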

Generalization to higher dimensions: f: Rⁿ → R has a strict minimum at x if

∇f(x) = 0 and ∇²f(x) > 0.

Here ∇²f(x) > 0 means that the Hessian matrix of f is strictly positive definite (that is, (∇²f(x)y, y) > 0 for all y ≠ 0).
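Positive definiteness is easy to test numerically via the eigenvalues of the (symmetric) Hessian; a minimal numpy sketch (the helper name is ours, the matrix is the Hessian of Example 1 below):

import numpy as np

def is_positive_definite(H):
    # (Hy, y) > 0 for all y != 0 iff all eigenvalues of the symmetric H are > 0
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

print(is_positive_definite(np.array([[2., 1., 0.],
                                     [1., 2., 0.],
                                     [0., 0., 2.]])))  # True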

Example 1: consider f(x₁, x₂, x₃) = x₁² + x₁x₂ + x₂² + x₃². The first and second derivatives are:

∇f(x₁, x₂, x₃) = (2x₁ + x₂, x₁ + 2x₂, 2x₃)ᵀ,

∇²f(x₁, x₂, x₃) =
( 2  1  0 )
( 1  2  0 )
( 0  0  2 )

The eigenvalues of the Hessian matrix are 1, 2 and 3. These are all positive, so the Hessian is positive definite and (0, 0, 0) is a strict minimum.
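The derivatives and eigenvalues can be reproduced symbolically; a short sympy sketch:

import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
f = x1**2 + x1*x2 + x2**2 + x3**2

print([sp.diff(f, v) for v in (x1, x2, x3)])    # [2*x1 + x2, x1 + 2*x2, 2*x3]
print(sp.hessian(f, (x1, x2, x3)).eigenvals())  # eigenvalues 1, 2, 3 (each with multiplicity 1)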

Example 2: consider f(x₁, x₂, x₃) = x₁²/x₂ + x₃² for x₂ > 0.

The first and second derivatives are:

∇f(x₁, x₂, x₃) = (2x₁/x₂, -x₁²/x₂², 2x₃)ᵀ,

∇²f(x₁, x₂, x₃) =
(  2/x₂       -2x₁/x₂²    0 )
( -2x₁/x₂²    2x₁²/x₂³    0 )
(  0           0          2 )

The eigenvalues of the Hessian matrix are 0, 2 and 2(x₁² + x₂²)/x₂³.

These are all non-negative, so the matrix is non-negative definite and the function f is convex on its domain. This also follows from the definition of non-negative definite:

(∇²f(x)y, y) = (y₁, y₂, y₃) ·
(  2/x₂       -2x₁/x₂²    0 )
( -2x₁/x₂²    2x₁²/x₂³    0 )
(  0           0          2 )
· (y₁, y₂, y₃)ᵀ = (2/x₂)(y₁ - x₁y₂/x₂)² + 2y₃² ≥ 0.

All points (0, x₂, 0) with x₂ > 0 are (non-strict) minima.
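The completed-square identity above can be verified symbolically; a small sympy sketch (symbol names are ours, x₂ is declared positive to reflect the domain):

import sympy as sp

x1, x3, y1, y2, y3 = sp.symbols('x1 x3 y1 y2 y3')
x2 = sp.symbols('x2', positive=True)   # domain: x2 > 0

f = x1**2 / x2 + x3**2
H = sp.hessian(f, (x1, x2, x3))
y = sp.Matrix([y1, y2, y3])

quad = (y.T * H * y)[0]
completed = (2 / x2) * (y1 - x1 * y2 / x2)**2 + 2 * y3**2
print(sp.simplify(quad - completed))   # 0: the quadratic form equals the completed square
print(H.eigenvals())                   # 0, 2 and 2/x2 + 2*x1**2/x2**3, i.e. 2(x1² + x2²)/x2³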

Multivariable unconstrained optimization (Ch 12.5)

Steepest descent method.

The idea of this method is that, from a starting point, a minimum is found along the steep(est) descent direction. From that point the search is then repeated.

The steepest ascent direction is given by the gradient ∇f(x) = f'(x)ᵀ: from the Taylor approximation f(x+h) = f(x) + ∇f(x)ᵀh + O(|h|²) it is clear that ∇f(x) is the direction in which f locally increases maximally.
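This can be illustrated numerically: among unit directions h, the directional derivative ∇f(x)ᵀh is largest for h = ∇f(x)/|∇f(x)|. A small numpy sketch, using the gradient of the worked example further below at a point of our own choosing:

import numpy as np

rng = np.random.default_rng(0)

def grad(x):  # gradient of f(x1, x2) = 2*x1*x2 + 2*x2 - x1**2 - 2*x2**2
    return np.array([2*x[1] - 2*x[0], 2*x[0] + 2 - 4*x[1]])

x = np.array([1.0, 0.0])
g = grad(x)
h = rng.normal(size=(1000, 2))
h /= np.linalg.norm(h, axis=1, keepdims=True)  # random unit directions
print((h @ g).max())       # stays below ...
print(np.linalg.norm(g))   # ... |grad f(x)|, which is attained for h = g/|g|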

The steepest ascent algorithm works as follows:

0. Find a starting point x₀ and set k := 0
1. Find the value t* for which t ↦ f(xₖ + t∇f(xₖ)) is maximal
2. xₖ₊₁ := xₖ + t*∇f(xₖ)
3. If the stopping criterion is satisfied, stop; else k := k+1 and go to 1
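A minimal Python sketch of this iteration, assuming scipy is available for the one-dimensional line search (the function name, tolerance and iteration cap are ours; the exact line search of the example below would also do):

import numpy as np
from scipy.optimize import minimize_scalar

def steepest_ascent(f, grad, x0, tol=1e-8, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:                          # stopping criterion
            break
        t_star = minimize_scalar(lambda t: -f(x + t * g)).x  # step 1: maximize along g
        x = x + t_star * g                                   # step 2
    return x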

Example: f(x₁, x₂) = 2x₁x₂ + 2x₂ - x₁² - 2x₂²

f'(x₁, x₂) = (2x₂ - 2x₁, 2x₁ + 2 - 4x₂)

Starting point X₀ = (0, 0)

Iteration 1: f'(0, 0) = (0, 2)

Find the maximum of f((0,0) + t(0,2)) = f(0, 2t) = 4t - 8t²: t* = ¼

X₁ = (0,0) + ¼·(0,2) = (0, ½)

Iteration 2: f'(0, ½) = (1, 0)

Find the maximum of f((0,½) + t(1,0)) = f(t, ½) = ½ + t - t²: t* = ½

X₂ = (0,½) + ½·(1,0) = (½, ½)
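Continuing with the steepest_ascent sketch from above, these iterates converge to the maximizer (1, 1) of this f; a usage illustration under the same assumptions:

f = lambda x: 2*x[0]*x[1] + 2*x[1] - x[0]**2 - 2*x[1]**2
grad = lambda x: np.array([2*x[1] - 2*x[0], 2*x[0] + 2 - 4*x[1]])

print(steepest_ascent(f, grad, (0.0, 0.0)))  # approximately [1. 1.], the maximizer of f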
