Faculty of Sciences and Mathematics
University of Niš
Available at:
www.pmf.ni.ac.yu/sajt/publikacije/publikacije_pocetna.html
Filomat 21:1 (2007), 17–24
ON AN ALGORITHM IN $C^{1,1}$ OPTIMIZATION
Nada I. Đuranović-Miličić
Abstract: In this paper an algorithm for the minimization of $C^{1,1}$ functions which uses the second order Dini upper directional derivative is considered. The purpose of the paper is to establish, for this algorithm, general hypotheses under which convergence to optimal points occurs. A convergence proof is given, as well as an estimate of the rate of convergence.
Received: November 14, 2005
Keywords and phrases: DIRECTIONAL DERIVATIVE, SECOND ORDER DINI
UPPER DIRECTIONAL DERIVATIVE, UNIFORMLY CONVEX FUNCTIONS.
2000 Mathematics Subject Classification: 90C30; 65K05.
1. INTRODUCTION
We shall consider the following $C^{1,1}$ problem of unconstrained optimization
$$ \min \{ f(x) \mid x \in D \subseteq \mathbb{R}^n \}, \qquad (1) $$
where $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ is a $C^{1,1}$ function on the open convex set $D$, that means the objective function we want to minimize is continuously differentiable and its gradient $\nabla f$ is locally Lipschitzian, i.e.
$$ \|\nabla f(x) - \nabla f(y)\| \le L \|x - y\| \quad \text{for } x, y \in D \text{ for some } L > 0. $$
We shall present an iterative algorithm which is based on the results from [1], [2] and [4] for finding an optimal solution to problem (1), generating the sequence of points $\{x_k\}$ of the following form:
$$ x_{k+1} = x_k + \alpha_k s_k + \alpha_k^2 d_k, \qquad k = 0, 1, \ldots, \quad s_k \ne 0, \; d_k \ne 0, \qquad (2) $$
where the step-size $\alpha_k$ and the direction vectors $s_k$ and $d_k$ are defined by the particular algorithm.
2. PRELIMINARIES
We shall give some preliminaries that will be used in the remainder of the paper.
Definition (see [4]). The second order Dini upper directional derivative of the function $f \in C^{1,1}$ at $x_k \in \mathbb{R}^n$ in the direction $d \in \mathbb{R}^n$ is defined to be
$$ f_D''(x_k; d) = \limsup_{\alpha \to 0^+} \frac{\bigl[\nabla f(x_k + \alpha d) - \nabla f(x_k)\bigr]^T d}{\alpha}. $$
If $\nabla f$ is directionally differentiable at $x_k$, we have
$$ f_D''(x_k; d) = f''(x_k; d) = \lim_{\alpha \to 0^+} \frac{\bigl[\nabla f(x_k + \alpha d) - \nabla f(x_k)\bigr]^T d}{\alpha} $$
for all $d \in \mathbb{R}^n$.
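The limsup in this definition cannot be evaluated exactly in numerical experiments; a common workaround is to approximate it by a single difference quotient of the gradient at a small $\alpha$. The following Python sketch illustrates this. The function name `dini2_approx`, the test function and the choice of $\alpha$ are our illustrative assumptions, not part of the paper.

```python
import numpy as np

def dini2_approx(grad, x, d, alpha=1e-6):
    # Difference-quotient proxy for the second order Dini upper directional
    # derivative f_D''(x; d) = limsup_{a -> 0+} [grad f(x + a d) - grad f(x)]^T d / a.
    # A single small alpha is used instead of the limsup (an approximation).
    return (grad(x + alpha * d) - grad(x)) @ d / alpha

# Example on a C^{1,1} (but not C^2) function f(x) = 0.5*||x||^2 + 0.5*||max(0, x)||^2:
grad = lambda x: x + np.maximum(0.0, x)
x0 = np.zeros(2)
print(dini2_approx(grad, x0, np.array([1.0, 0.0])))   # approx. 2 = f_D''(x0; e1)
print(dini2_approx(grad, x0, np.array([-1.0, 0.0])))  # approx. 1 = f_D''(x0; -e1)
```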
Lemma 1 (see [4]). Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ be a $C^{1,1}$ function on $D$, where $D \subseteq \mathbb{R}^n$ is an open subset. If $x$ is a solution of the optimization problem (1), then
$$ \nabla f(x)^T d = 0 \quad \text{and} \quad f_D''(x; d) \ge 0 \quad \text{for all } d \in \mathbb{R}^n. $$
Lemma 2 (see [4]). Let $f : D \subseteq \mathbb{R}^n \to \mathbb{R}$ be a $C^{1,1}$ function on $D$, where $D \subseteq \mathbb{R}^n$ is an open subset. If $x$ satisfies
$$ \nabla f(x)^T d = 0 \quad \text{and} \quad f_D''(x; d) > 0 \quad \text{for all } d \ne 0, \; d \in \mathbb{R}^n, $$
then $x$ is a strict local minimizer of (1).
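As a concrete illustration of Lemma 2 (the example is ours, not from the paper), consider the $C^{1,1}$ function $f(x) = \tfrac12 x^2 + \tfrac12 \bigl(\max\{0, x\}\bigr)^2$ on $\mathbb{R}$. It is not $C^2$, since $f''$ jumps from $1$ to $2$ at the origin, but its gradient $\nabla f(x) = x + \max\{0, x\}$ is Lipschitzian. At $x = 0$ we have, for every $d \ne 0$,
$$ \nabla f(0)^T d = 0, \qquad f_D''(0; d) = \begin{cases} 2d^2, & d > 0,\\ d^2, & d < 0, \end{cases} $$
so $f_D''(0; d) > 0$ and, by Lemma 2, $x = 0$ is a strict local (in fact global) minimizer. Moreover $d^2 \le f_D''(x; d) \le 2d^2$ for every $x$ and $d$, so condition (5) below holds with $c_1 = 1$, $c_2 = 2$.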
3. THE OPTIMIZATION ALGORITHM
At the $k$-th iteration the direction vector $s_k \ne 0$ in (2) is any vector satisfying the descent property, i.e. $\nabla f(x_k)^T s_k < 0$, and the direction vector $d_k$ is a solution of the problem
$$ \min \{ \Phi_k(d) \mid d \in \mathbb{R}^n \}, \qquad (3) $$
where $\Phi_k(d) = \nabla f(x_k)^T d + \tfrac12 f_D''(x_k; d)$. For given $q$, where $0 < q < 1$, the step-size $\alpha_k > 0$ is a number satisfying $\alpha_k = q^{i(k)}$, where $i(k)$ is the smallest integer from $\{0, 1, \ldots\}$ such that $x_{k+1} = x_k + q^{i(k)} s_k + q^{2i(k)} d_k \in D$ and the following two inequalities are satisfied:
$$ f\bigl(x_k + \alpha_k s_k + \alpha_k^2 d_k\bigr) \le f(x_k) + \sigma \Bigl[ \alpha_k \nabla f(x_k)^T s_k - \tfrac12 \alpha_k^4 f_D''(x_k; d_k) \Bigr] \qquad (4A) $$
and
$$ f\bigl(x_k + 2\alpha_k s_k + (2\alpha_k)^2 d_k\bigr) > f(x_k) + \sigma \Bigl[ 2\alpha_k \nabla f(x_k)^T s_k - \tfrac12 (2\alpha_k)^4 f_D''(x_k; d_k) \Bigr], \qquad (4B) $$
where $0 < \sigma < 1$ is a preassigned constant, and $x_0 \in D$ is a given point.
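For readers who want to experiment, the following Python sketch implements the step-size rule literally as stated, scanning $i = 0, 1, \ldots$ until (4A) and (4B) both hold. It assumes $D = \mathbb{R}^n$, that a routine `dini2` returning (an approximation of) $f_D''(x; d)$ is supplied (for instance the `dini2_approx` sketch above), and illustrative values of $q$, $\sigma$ and the iteration cap; none of these choices is prescribed by the paper.

```python
import numpy as np

def step_size(f, grad, dini2, x_k, s_k, d_k, q=0.5, sigma=0.1, max_i=60):
    # Search for alpha_k = q**i(k), with i(k) the smallest integer such that
    # inequalities (4A) and (4B) hold (D = R^n is assumed, so membership is free).
    g_s = grad(x_k) @ s_k          # gradient term  grad f(x_k)^T s_k  (should be < 0)
    curv = dini2(x_k, d_k)         # second order term  f_D''(x_k; d_k)
    f_k = f(x_k)
    for i in range(max_i + 1):
        a = q ** i
        # (4A): sufficient decrease at the step a
        ok_4a = f(x_k + a * s_k + a**2 * d_k) <= \
            f_k + sigma * (a * g_s - 0.5 * a**4 * curv)
        # (4B): the doubled step 2a must violate the same decrease bound
        ok_4b = f(x_k + 2*a * s_k + (2*a)**2 * d_k) > \
            f_k + sigma * (2*a * g_s - 0.5 * (2*a)**4 * curv)
        if ok_4a and ok_4b:
            return a
    return q ** max_i  # fallback for the sketch; the paper shows a valid i(k) exists
```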
We make the following assumptions.
A1. We suppose that there exist constants $c_2 \ge c_1 > 0$ such that
$$ c_1 \|d\|^2 \le f_D''(x; d) \le c_2 \|d\|^2 \qquad (5) $$
for every $d \in \mathbb{R}^n$.
A2. $\|d_k\| = 1$ and $\|s_k\| = 1$, $k = 0, 1, \ldots$
A3. There exists a value $\delta > 0$ such that
$$ \nabla f(x_k)^T s_k \le -\delta \|s_k\|^2, \qquad k = 0, 1, 2, \ldots \qquad (6) $$
A4. $\nabla f(x_k)^T s_k \to 0 \;\Rightarrow\; \nabla f(x_k) \to 0$, as $k \to \infty$.
It follows from Lemma 3.1 in [4] that under the assumption A1 the optimal solution of
the problem (3) exists.
Lemma 3 (see [4]). The following statements are equivalent:
1. $d = 0$ is a globally optimal solution of the problem (3);
2. $0$ is the optimum of the objective function of the problem (3);
3. the corresponding $x_k$ is a stationary point of the function $f$.
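For orientation (this special case is our addition, not a statement from the paper): when $f$ happens to be twice continuously differentiable near $x_k$, the second order Dini upper directional derivative reduces to the usual quadratic form, $f_D''(x_k; d) = d^T \nabla^2 f(x_k)\, d$, so the subproblem (3) becomes the minimization of the quadratic model
$$ \Phi_k(d) = \nabla f(x_k)^T d + \tfrac12\, d^T \nabla^2 f(x_k)\, d, $$
and, since (5) makes $\nabla^2 f(x_k)$ positive definite, its unique solution is the Newton direction $d_k = -\bigl[\nabla^2 f(x_k)\bigr]^{-1} \nabla f(x_k)$. In particular $d_k = 0$ exactly when $\nabla f(x_k) = 0$, in agreement with Lemma 3.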
Proposition 1. If the function $f \in C^{1,1}$ satisfies the condition (5), then: 1) the function $f$ is uniformly, and hence strictly, convex; consequently 2) the level set $L(x_0) = \{x \in D : f(x) \le f(x_0)\}$ is a compact convex set; and 3) there exists a unique point $x^*$ such that $f(x^*) = \min_{x \in L(x_0)} f(x)$.
Proof. 1) From the assumption (5) and the mean value theorem it follows that for all $x \in L(x_0)$ there exists $\theta \in (0,1)$ such that
$$ f(x) - f(x_0) = \nabla f(x_0)^T (x - x_0) + \tfrac12 f_D''\bigl(x_0 + \theta(x - x_0);\, x - x_0\bigr) \ge \nabla f(x_0)^T (x - x_0) + \tfrac12 c_1 \|x - x_0\|^2 > \nabla f(x_0)^T (x - x_0), $$
that is, $f$ is uniformly and consequently strictly convex on $L(x_0)$.
2) From [5] it follows that the level set L( x0 ) is bounded. The set L( x0 ) is closed
because of the continuity of the function f ; hence, L( x0 ) is a compact set. L( x0 ) is
also (see [5]) a convex set.
3) The existence of $x^*$ follows from the continuity of the function $f$ on the bounded set $L(x_0)$. From the definition of the level set it follows that
$$ f(x^*) = \min_{x \in L(x_0)} f(x) = \min_{x \in D} f(x). $$
Since $f$ is strictly convex it follows from [5] that $x^*$ is the unique minimizer.
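For completeness (this derivation is our addition; the paper refers to [5]), the boundedness of $L(x_0)$ in part 2) can also be read off directly from the uniform convexity inequality above: for $x \in L(x_0)$ we have $f(x) \le f(x_0)$, hence
$$ 0 \ge f(x) - f(x_0) \ge \nabla f(x_0)^T (x - x_0) + \tfrac12 c_1 \|x - x_0\|^2 \ge -\|\nabla f(x_0)\|\, \|x - x_0\| + \tfrac12 c_1 \|x - x_0\|^2, $$
so that $\|x - x_0\| \le 2\|\nabla f(x_0)\| / c_1$.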
In order that a finite value $i(k)$ satisfying (4A) exists, it is sufficient that $s_k$ and $d_k$ have descent properties, i.e. $\nabla f(x_k)^T s_k < 0$ and $\nabla f(x_k)^T d_k < 0$ whenever $\nabla f(x_k) \ne 0$. The first relation follows from (6). As to the second condition, if $d_k \ne 0$ is a solution of (3), it follows that $\Phi_k(d_k) < 0 = \Phi_k(0)$. Consequently, we have by (5) that
$$ \nabla f(x_k)^T d_k < -\tfrac12 f_D''(x_k; d_k) \le -\tfrac12 c_1 \|d_k\|^2 < 0, \qquad (7) $$
i.e. $d_k$ is a descent direction at $x_k$.
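In the smooth special case sketched after Lemma 3 (our illustration, with $d_k = -[\nabla^2 f(x_k)]^{-1}\nabla f(x_k)$), these quantities can be computed explicitly:
$$ \Phi_k(d_k) = -\tfrac12\, \nabla f(x_k)^T \bigl[\nabla^2 f(x_k)\bigr]^{-1} \nabla f(x_k) < 0, \qquad \nabla f(x_k)^T d_k = 2\,\Phi_k(d_k) < 0 \quad \text{whenever } \nabla f(x_k) \ne 0, $$
in agreement with (7).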
Now suppose that for some $\alpha_k = q^{i(k)}$ the inequality (4A) holds, but the inequality (4B) does not hold. In other words, we have the following:
$$ f\bigl(x_k + \alpha_k s_k + \alpha_k^2 d_k\bigr) \le f(x_k) + \sigma\Bigl[\alpha_k \nabla f(x_k)^T s_k - \tfrac12 \alpha_k^4 f_D''(x_k; d_k)\Bigr] $$
and
$$ f\bigl(x_k + 2\alpha_k s_k + (2\alpha_k)^2 d_k\bigr) \le f(x_k) + \sigma\Bigl[2\alpha_k \nabla f(x_k)^T s_k - \tfrac12 (2\alpha_k)^4 f_D''(x_k; d_k)\Bigr]; $$
that is, if there is no $j$ for which $2^j \alpha_k$ satisfies (4B), we shall obtain
$$ f\bigl(x_k + 2^j \alpha_k s_k + (2^j \alpha_k)^2 d_k\bigr) \le f(x_k) + \sigma\Bigl[2^j \alpha_k \nabla f(x_k)^T s_k - \tfrac12 (2^j \alpha_k)^4 f_D''(x_k; d_k)\Bigr], \qquad j = 0, 1, 2, \ldots $$
The right-hand side of the above inequality tends to $-\infty$ as $j \to \infty$, that is,
$$ f\bigl(x_k + 2^j \alpha_k s_k + (2^j \alpha_k)^2 d_k\bigr) - f(x_k) \to -\infty \quad \text{as } j \to \infty, $$
and that is a contradiction, since $f$ is, because of the assumptions, bounded below on the compact set $L(x_0)$. Therefore, for some $j$ both inequalities (4A) and (4B) will be satisfied. Consequently, our algorithm is well-defined.
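Remark (our observation, not stated explicitly in the paper). Comparing the two inequalities shows that (4B) failing at $\alpha_k$ is precisely (4A) holding with $\alpha_k$ replaced by $2\alpha_k$:
$$ f\bigl(x_k + 2\alpha_k s_k + (2\alpha_k)^2 d_k\bigr) \le f(x_k) + \sigma\Bigl[2\alpha_k \nabla f(x_k)^T s_k - \tfrac12 (2\alpha_k)^4 f_D''(x_k; d_k)\Bigr]. $$
This is what makes the doubling argument above work: starting from a step satisfying (4A), one may keep doubling it as long as (4B) fails, and each doubled step still satisfies the sufficient-decrease bound.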
The inequality (4A) guarantees a suitable reduction in $f$, that is, that $f$ is decreased by at least a multiple of the modulus of the directional derivative, $\sigma \alpha_k \bigl|\nabla f(x_k)^T s_k\bigr|$, and of $\tfrac12 \sigma \alpha_k^4 f_D''(x_k; d_k)$. Since arbitrarily small $\alpha_k > 0$ satisfy this inequality, it is necessary to introduce another condition that prevents too small an $\alpha_k$ from being chosen; this is the purpose of the inequality (4B).
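To show how the pieces fit together, here is an illustrative Python driver for iteration (2) that reuses the `step_size` and `dini2_approx` sketches above. The concrete choices made in it, namely $D = \mathbb{R}^n$, $s_k$ the normalized negative gradient (which satisfies A2 and A4 but need not satisfy A3 in general), and $d_k$ computed from a finite-difference quadratic model of (3), are our assumptions for the sketch only; the paper itself leaves $s_k$ and $d_k$ to the particular algorithm.

```python
import numpy as np

def approx_hessian(grad, x, eps=1e-6):
    # Forward-difference approximation of the second order information
    # (illustrative stand-in for f_D'' in the smooth case).
    n = x.size
    g0 = grad(x)
    H = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n); e[j] = eps
        H[:, j] = (grad(x + e) - g0) / eps
    return 0.5 * (H + H.T)

def minimize_c11(f, grad, dini2, x0, q=0.5, sigma=0.1, tol=1e-8, max_iter=200):
    # Sketch of the iteration x_{k+1} = x_k + a_k s_k + a_k^2 d_k from (2).
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        s = -g / np.linalg.norm(g)                          # descent direction, ||s_k|| = 1
        d = -np.linalg.solve(approx_hessian(grad, x), g)    # model solution of (3)
        if np.linalg.norm(d) > 1.0:
            d = d / np.linalg.norm(d)                       # keep ||d_k|| <= 1, in the spirit of A2
        a = step_size(f, grad, dini2, x, s, d, q=q, sigma=sigma)
        x = x + a * s + a**2 * d                            # iteration (2)
    return x

# Illustrative run on the C^{1,1} function used earlier:
f = lambda x: 0.5 * x @ x + 0.5 * np.maximum(0.0, x) @ np.maximum(0.0, x)
grad = lambda x: x + np.maximum(0.0, x)
dini2 = lambda x, d: dini2_approx(grad, x, d)
print(minimize_c11(f, grad, dini2, np.array([2.0, -3.0])))  # should approach the origin
```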
Convergence theorem. Suppose that $f \in C^{1,1}$ and that the assumptions A1, A2, A3 and A4 hold. Then for any initial point $x_0 \in D$, $x_k \to x^*$ as $k \to \infty$, where $x^*$ is the unique minimal point.
Proof. From (4A), (6) and (7) it follows that
$$ f(x_{k+1}) - f(x_k) \le \sigma\Bigl[q^{i(k)} \nabla f(x_k)^T s_k - \tfrac12 q^{4i(k)} f_D''(x_k; d_k)\Bigr] < 0. \qquad (8) $$
Hence $\{f(x_k)\}$ is a decreasing sequence and consequently $\{x_k\} \subset L(x_0)$. Since $L(x_0)$ is by Proposition 1 a compact convex set, it follows that the sequence $\{x_k\}$ is bounded. Therefore there exist accumulation points of $\{x_k\}$. Since $\nabla f$ is by assumption continuous, then, if $\nabla f(x_k) \to 0$ as $k \to \infty$, it follows that every accumulation point $\bar{x}$ of the sequence $\{x_k\}$ satisfies $\nabla f(\bar{x}) = 0$. Since $f$ is by Proposition 1 strictly convex, it follows that there exists a unique point $x^* \in L(x_0)$ such that $\nabla f(x^*) = 0$. Hence $\{x_k\}$ has a unique limit point $x^*$, and it is a global minimizer. Therefore we have to prove that $\nabla f(x_k) \to 0$ as $k \to \infty$.
We first show that the set of indices $i(k)$ is uniformly bounded above by a number $I$, i.e. $i(k) \le I < \infty$. Suppose it is not. By the definition of $i(k)$, from (4A) it follows that
$$ f\bigl(x_k + q^{i(k)-1} s_k + q^{2i(k)-2} d_k\bigr) - f(x_k) > \sigma\Bigl[q^{i(k)-1} \nabla f(x_k)^T s_k - \tfrac12 q^{4i(k)-4} f_D''(x_k; d_k)\Bigr]. $$
By the definition of the Dini derivative and by (5) we have
$$ \begin{aligned} f\bigl(x_k + q^{i(k)-1} s_k + q^{2i(k)-2} d_k\bigr) - f(x_k) &\le q^{i(k)-1} \nabla f(x_k)^T s_k + q^{2i(k)-2} \nabla f(x_k)^T d_k \\ &\quad + \tfrac12 f_D''\bigl(x_k;\, q^{i(k)-1} s_k + q^{2i(k)-2} d_k\bigr) + o\bigl(q^{2i(k)-2}\bigr) \\ &\le q^{i(k)-1} \nabla f(x_k)^T s_k + q^{2i(k)-2} \nabla f(x_k)^T d_k \\ &\quad + \tfrac12 c_1 q^{2i(k)-2} \bigl\| s_k + q^{i(k)-1} d_k \bigr\|^2 + o\bigl(q^{2i(k)-2}\bigr), \end{aligned} $$
so that
$$ q^{i(k)-1} \nabla f(x_k)^T s_k + q^{2i(k)-2} \nabla f(x_k)^T d_k + \tfrac12 c_1 q^{2i(k)-2} \bigl\| s_k + q^{i(k)-1} d_k \bigr\|^2 + o\bigl(q^{2i(k)-2}\bigr) > \sigma\Bigl[q^{i(k)-1} \nabla f(x_k)^T s_k - \tfrac12 q^{4i(k)-4} f_D''(x_k; d_k)\Bigr]. $$
Accumulating all terms of order higher than $O\bigl(q^{2i(k)-2}\bigr)$ into the term $o\bigl(q^{2i(k)-2}\bigr)$ (because $\|s_k\| = \|d_k\| = 1$) and using the fact that $\nabla f(x_k)^T d_k < 0$ (by (7)) yields
$$ \tfrac12 c_1 q^{2i(k)-2} \|s_k\|^2 + o\bigl(q^{2i(k)-2}\bigr) > (\sigma - 1)\, q^{i(k)-1} \nabla f(x_k)^T s_k > 0, $$
since $0 < \sigma < 1$ and $\nabla f(x_k)^T s_k < 0$. Dividing by $q^{i(k)-1}$ yields
$$ \tfrac12 c_1 q^{i(k)-1} \|s_k\|^2 + o\bigl(q^{i(k)-1}\bigr) > (\sigma - 1)\, \nabla f(x_k)^T s_k > 0. $$
Dividing by $\tfrac{1}{2q} c_1 \|s_k\|^2$ yields
$$ q^{i(k)} + o\bigl(q^{i(k)}\bigr) > \frac{2(\sigma - 1)\, q}{c_1 \|s_k\|^2}\, \nabla f(x_k)^T s_k. $$
Taking the limit as $k \to \infty$ and having in view (6), we get
$$ q^{i(k)} \ge \frac{2(1 - \sigma)\, \delta\, q}{c_1} + o\bigl(q^{i(k)-1}\bigr) > 0. $$
Hence $\alpha_k = q^{i(k)}$ is bounded away from zero, which contradicts the assumption that the sequence $\{i(k)\}$ is unbounded. Hence $i(k) \le I < \infty$ for all $k$. From (4A) it follows that
f x k 1   f x k     q i k f  xk T sk  1 q 4i k  f D"  xk ; d k  
2

1
T


   q I f  xk  sk  q 4 I f D"  xk ; d k   .
2


Multiplying this inequality by (–1) we get

(9)
f  xk   f  xk 1     q I f  xk T sk  1 q 4 I f D"  xk ; d k   .
Since  f x k 
it follows that
2


is bounded below (on the compact set Lx 0  ) and monotone (by (9)),
f ( xk 1 )  f ( xk )  0 as k  . Hence from (9) it follows that f ( xk )T sk  0 and
f D"  xk ; dk   0, k  .
Finally,
from
A4
it
follows
that
f  xk   0, k  .
Convergence rate theorem. Under the assumptions of the previous theorem the following estimate holds for the sequence $\{x_k\}$ generated by the algorithm:
$$ f(x_n) - f(x^*) \le \Delta_0 \prod_{k=1}^{n} \left[ 1 - \frac{2\Delta_0 \bigl( f(x_{k-1}) - f(x_k) \bigr)}{\bigl( \ell\, \|\nabla f(x_{k-1})\| \bigr)^2} \right]^{1/2}, \qquad n = 1, 2, \ldots, $$
where $\Delta_0 = f(x_0) - f(x^*)$ and $\ell = \operatorname{diam} L(x_0) < \infty$, since by Proposition 1 it follows that $L(x_0)$ is bounded.
Proof. The proof follows directly from Theorem 9.2, page 167, in [3].
4. CONCLUSION
As we have already said, the algorithm presented in this paper is based on the results from [1], [2] and [4]. The idea was to define a second-order variant of the original Cea-Goldstein algorithm for $C^{1,1}$ functions, rather than for $C^2$ functions. Note that such an algorithm generates at every iteration a point closer to an optimal point than the algorithms given in [4]; this happens because in [4] we have minimization along one direction, whereas here we have minimization along a plane defined by the vectors $s_k$ and $d_k$. Compared with the algorithm in [2], the presented algorithm is defined and converges under weaker assumptions.
5. ACKNOWLEDGMENT
This research was supported by the Science Fund of Serbia, grant number 144015G, through the Institute of Mathematics, SANU.
REFERENCES
[1] Djuranovic-Milicic, N. (1992), A generalization of the Cea-Goldstein step-size algorithm, European Journal of Operational Research (EJOR) 63, 1-9.
[2] Goldfarb, D. (1980), Curvilinear path steplength algorithms for minimization which use directions of negative curvature, Mathematical Programming 18, 31-40.
[3] Karmanov, V.G. (1975), Matematicheskoe programmirovanie, Nauka, Moskva.
[4] Sun, W.J., Sampaio, R.J.B. and Yuan, J.Y. (2000), Two algorithms for LC1 unconstrained optimization, Journal of Computational Mathematics 18(6), 621-632.
[5] Vujčić, V., Ašić, M. and Miličić, N. (1980), Mathematical Programming (in Serbian), Institute of Mathematics, Belgrade.
Department of Mathematics
Faculty of Technology and Metallurgy
University of Belgrade
Serbia
E-mail: nmilicic@elab.tmf.bg.ac.yu