Faculty of Sciences and Mathematics University of Niš Available at: www.pmf.ni.ac.yu/sajt/publikacije/publikacije_pocetna.html Filomat 21:1 (2007), 17 – 24________________________________________________ ON AN ALGORITHM IN C1,1 OPTIMIZATION Nada I. Đuranović-Miličić Abstract: In this paper an algorithm for minimization of C1,1 functions, which uses the second order Dini upper directional derivative is considered. The purpose of the paper is to establish for this algorithm general hypotheses under which convergence occurs to optimal points. A convergence proof is given, as well as an estimate of the rate of convergence. Received: November 14, 2005 Keywords and phrases: DIRECTIONAL DERIVATIVE, SECOND ORDER DINI UPPER DIRECTIONAL DERIVATIVE, UNIFORMLY CONVEX FUNCTIONS. 2000 Mathematics Subject Classification : 90C30; 65K05. 1. INTRODUCTION We shall consider the following C1,1 problem of unconstrained optimization min f ( x) | x D R n , (1) where f : D R n R is a C1,1 function on the open convex set D , that means the objective function we want to minimize is continuously differentiable and its gradient f is locally Lipschitzian, i.e. f ( x) f ( y) L x y for x, y D or some L 0 . We shall present an iterative algorithm which is based on the results from [1], [2] and [4] for finding an optimal solution to problem (1) generating the sequence of points {xk } of the following form: xk 1 xk k sk k2 d k , k 0,1,..., sk 0, d k 0 (2) 18 where the step-size k and the direction vectors sk and d k are defined by the particular algorithms 2. PRELIMINARIES We shall give some preliminaries that will be used for the remainder of the paper. Definition (see [4]). The second order Dini upper directional derivative of the function f C1,1 at xk R n in the direction d Rn is defined to be . f ( xk d ) f ( xk ) f ( x ; d ) limsup T D k d 0 If f is directionally differentiable at xk , we have f D" x k ; d f " x k ; d lim f xk d f xk T d 0 for all d R n . Lemma 1 (See [4]). Let f : D R n R be a C1,1 function on D , where D R n is an open subset. If x is a solution of the optimization problem (1), then: f ( x; d ) 0 and f D( x; d ) 0, d R n . Lemma 2 (see [4]). Let f : D R n R be a C1,1 function on D , where D R n is an open subset . If x satisfies f ( x; d ) 0 and f D( x; d ) 0, d 0, d R n , then x is a strict local minimizer of (1). 2. THE OPTIMIZATION ALGORITHM At the k-th iteration the direction vector sk 0 in (2) is any vector satisfying the descent property, i.e. f ( xk )T sk 0 holds, and the direction vector d k presents a solution of the problem min k (d ) | d R n , (3) where k (d ) f ( xk )T d 12 f D( xk , d ) . For given q , where 0 q 1 , the step- 19 size k 0 is a number satisfying k q i ( k ) ,where i (k ) is the smallest integer from i 0,1,... such that xk 1 xk q i ( k ) sk q 2i ( k ) d k D and the following two inequalities are satisfied: 1 T f xk k sk k2 d k f xk k f xk sk k4 f D" xk ; d k (4A) 2 and f xk 2 k sk (2 k ) 2 d k f xk 1 2 k f xk T sk (2 k ) 4 f D" xk ; d k 2 where 0 1 is a preassigned constant, and x0 D is a given point. (4B) We make the following assumptions. A1. We suppose that there exist constants c2 c1 0 such that c1 d f D( x; d ) c2 d 2 2 (5) for every d R . A2. d k 1 and sk 1, k 0,1,... n A3. There exists a value 0 such that f ( xk )T sk sk (6) A4. 2 , k 0,1, 2,... f ( xk )T sk 0 f xk 0, k . It follows from Lemma 3.1 in [4] that under the assumption A1 the optimal solution of the problem (3) exists. Lemma 3 (see [4]). The following statements are equivalent: 1. d 0 is a globally optimal solution of the problem (3); 2. 0 is the optimum of the objective function of the problem (3); 3. the corresponding xk is a stationary point of the function f . Proposition 1 If the function f C1,1 satisfies the condition (5), then: 1) the function f is uniformly and, hence, strictly convex, and, consequently; 2) the level set L( x0 ) x D : f ( x) f ( x0 ) is a compact convex set; 3) there exists a unique point x* such that f ( x* ) min f ( x) xL ( x0 ) Proof. 1) From the assumption (5) and the mean value theorem it follows that for all x L( x0 ) there exists (0,1) such that 20 1 f D x0 ( x x0 ); x x0 2 f ( x) f ( x0 ) f ( x0 )T ( x x0 ) 1 2 f ( x0 )T ( x x0 ) c1 x x0 f ( x0 )T ( x x0 ), 2 that is, f is uniformly and consequently strictly convex on L( x0 ) . 2) From [5] it follows that the level set L( x0 ) is bounded. The set L( x0 ) is closed because of the continuity of the function f ; hence, L( x0 ) is a compact set. L( x0 ) is also (see [5]) a convex set. 3) The existence of x* follows from the continuity of the function f on the bounded set L( x0 ) . From the definition of the level set it follows that f ( x* ) min f ( x) min f ( x) xL ( x0 ) xD * Since f is strictly convex it follows from [5] that x is a unique minimizer. In order to have a finite value i (k ) satisfying (4A) to exist , it is sufficient that sk and d k have descent properties, i.e. f ( xk )T sk 0 and f ( xk )T d k 0 whenever f xk 0. The first relation follows from (6). Relating to the second condition, if d k 0 is a solution of (3), it follows that k (dk ) 0 k (0) . Consequently, we have by (5) that 1 1 f D( xk ; d k ) c1 d k 2 2 i.e. d k is a descent direction at xk . f ( xk )T d k 2 0 (7) Now suppose that for some k q i ( k ) the inequality (4A) holds, but the inequality (4B) does not hold. In other words we have the following: 1 T f xk k sk k2 d k f xk k f xk sk k4 f D" xk ; d k and 2 2 f xk 2 k sk (2 k ) d k f xk 2 k f xk T sk (2 k ) 4 f D" xk ; d k , 1 2 that is, if there is no j for which 2 j k satisfies (4B) we shall obtain f x k 2 j k s k (2 j k ) 2 d k f xk 1 2 j k f xk T sk (2 j k ) 4 f D" xk ; d k , j 0,1,2,.. 2 21 The right side of the above inequality tends to as j , that is, f xk 2 k sk (2 k ) d k f xk as j and that is a contradiction since f is, because of the assumptions, bounded below on the compact set L x0 . Therefore, for some j the both inequalities (4A) and (4B) will be satisfied. Consequently, our algorithm is well-defined. j j 2 The inequality (4B) guarantees a suitable reduction in f, that is that f(x) is decreased by at least a multiple of the modulus of the directional derivative k f x k s k T and k4 f D" x k ; d k . As k 0 satisfies this inequality it is necessary to introduce 1 2 another condition that prevents too small an k from being chosen. This is the purpose of the inequality (4A). Convergence theorem . Suppose that f C1,1 and that the assumptions A1, A2 ,A3 and A4 hold. Then for any initial point x0 D, xk x , as k , where x is a unique minimal point. Proof. From (4A), (6) and (7) it follows that 1 f ( xk 1 ) f ( xk ) q i ( k )f ( xk )T sk q 4i ( k ) f D( xk ; d k ) 0. 2 (8) xk L( x0 ) . Since L( x0 ) is by Proposition 1 a compact convex set, it follows that the sequence xk is bounded. Therefore there exist accumulation points of xk . Since f is by Hence f ( xk ) is a decreasing sequence and consequently assumption continuous, then, if f ( xk ) 0 as k , it follows that every accumulation point x of the sequence xk satisfies f ( x ) 0 . Since f is by the Proposition 1 strictly convex, it follows that there exists a unique point x L( x0 ) such that f ( x ) 0 . Hence, xk has a unique limit point x – and it is a global minimizer. Therefore we have to prove that f ( xk ) 0, k . We first show that he set of indices i(k ) is uniformly bounded above by a number I , i.e. i (k ) I . Suppose it is not. By the definition of i(k) from (4A) it follows that 22 f xk q i k 1 sk q 2i k 2 dk f xk 1 2 q i k 1f xk sk q 4i k 4 f D'' xk ; d k . T By the definition of the Dini derivative and by (5) we have f xk q i k 1 sk q 2i k 2 dk f xk q f xk sk q i k 1 T 2i k 2 f xk d k T 1 " T i k 1 2i k 2 2i k 2 i k 1 f D xk ; q sk q d k o q q f xk sk 2 2 1 T T 2i k 2 i k 1 i k 1 2i k 2 2i k 2 q f xk d k c1 q sk q d k o q q f xk sk 2 2 1 T 2i k 2 2i k 2 i k 1 2i k 2 q f xk d k c1q sk q d k o q 2 1 T q i k 1f xk sk q 4i k 4 f D" xk ; d k . 2 Accumulating all terms of order higher than O q 2i k 2 into the term o q 2i k 2 (because sk dk 1 ) and using the fact that f ( xk )T d k 0 (by (7)) yields 1 2i ( k ) 2 c1q sk 2 2 o(q 2i ( k ) 2 ) ( 1)q i ( k ) 1f ( xk )T sk 0 since 0 1 and f ( xk )T sk 0 . Dividing by q i ( k ) 1 yields 1 i ( k ) 1 c1q sk 2 2 o(q i ( k ) 1 ) ( 1)f ( xk )T sk 0. . 1 Dividing by c1 sk 2q . Taking qi ( k ) the limit 2 2( 1)q o( q i ( k ) ) q 1 i(k ) T f ( xk ) sk . c1 yields q c1 c1 2q as k , and having in view (6), we get 2(1 )q o(q i ( k ) 1 ) 0. c1 Hence k q i k is bounded away from zero, which contradicts the assumption that the sequence i(k ) is unbounded. Hence i (k ) I for all k. From (4A) it follows that 23 f x k 1 f x k q i k f xk T sk 1 q 4i k f D" xk ; d k 2 1 T q I f xk sk q 4 I f D" xk ; d k . 2 Multiplying this inequality by (–1) we get (9) f xk f xk 1 q I f xk T sk 1 q 4 I f D" xk ; d k . Since f x k it follows that 2 is bounded below (on the compact set Lx 0 ) and monotone (by (9)), f ( xk 1 ) f ( xk ) 0 as k . Hence from (9) it follows that f ( xk )T sk 0 and f D" xk ; dk 0, k . Finally, from A4 it follows that f xk 0, k . Convergence rate theorem Under the assumptions of the previous theorem we have that the following estimate holds for the sequence {xk } generated by the algorithm. 1 n 1 f x f x k k 1 f xn f x 0 1 20 , 2 f x k 0 k n=1,2,… where 0 f ( x0 ) f ( x ) , and diam L( x0 ) since by Proposition 1 it follows that L( x0 ) is bounded. Proof. The proof directly follows from the Theorem 9.2, page 167 in [3]. 3. CONCLUSION As we already have said, the algorithm presented in this paper is based on the results from [1], [2] and [4]. The idea was to define a second-order variant of the original Cea-Goldstein algorithm for C1,1 function, but not a C 2 function. Note that such an algorithm generates at every iteration a point closer to an optimal point than the algorithms given in [4]. It happens because in [4] we have minimization along one direction, and here we have minimization along a plane defined by the vectors sk and d k . Relating to the algorithm in [2], the presented algorithm is defined and converges under weaker assumptions than the algorithm given in [2]. 4. ACKNOWLEDGMENT This research was supported by Science Fund of Serbia, grant number 144015G, through Institute of Mathematics, SANU. 24 REFERENCES [1] Djuranovic-Milicic N., (1992.) A GENERALIZATION OF THE CEAGOLDSTEIN STEP-SIZE ALGORITHM, European Journal of Operational Research (EJOR), 63 , 1-9. [2] Goldfarb D.,(1980.) CURVILINEAR PATH PLENGTH ALGORITHMS FOR MINIMIZATION WHICH DIRECTIONS OF NEGATIVE CURVATURE, Mathematical Programming, 18 , 31-40. [3] Karmanov, V.G., (1975) Nauka, Moskva. MATEMATICESKOE PROGRAMMIROVANIE, [4] Sun W.J., Sampaio R.J.B and Yuan J.Y., (2000) TWO ALGORITHMS FOR LC1 UNCONSTRAINED OPTIMIZATION. JOURNAL OF COMPUTATIONAL MATHEMATICS, 6 621-632. [5] Vujčić V., Ašić M. and Miličić, N.,(1980) MATHEMATICAL PROGRAMMING (in Serbian), Institute of Mathematics, Belgrade. Department of Mathematics Faculty of Technology and Metallurgy University of Belgrade Serbia E-mail: nmilicic@elab.tmf.bg.ac.yu