Applied Mathematics E-Notes, 11(2011), 139-147 © Available free at mirror sites of http://www.math.nthu.edu.tw/~amen/ ISSN 1607-2510

A Conjugate Gradient Method With Sufficient Descent And Global Convergence For Unconstrained Nonlinear Optimization

Hailin Liu†, Sui Sun Cheng‡, Xiaoyong Li§

Received 9 March 2011

*Mathematics Subject Classifications: 15A99, 39A10.
†School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, P. R. China
‡Department of Mathematics, Tsing Hua University, Taiwan 30043, R. O. China
§School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, P. R. China

Abstract

In this paper a new conjugate gradient method for unconstrained optimization is introduced, which is sufficient descent and globally convergent, and which can also be used with the Dai-Yuan method to form a hybrid algorithm. Our methods do not require the strong convexity condition on the objective function. Numerical evidence shows that this new conjugate gradient algorithm may be considered one of the competitive conjugate gradient methods.

1 Introduction

There are now many conjugate gradient schemes for solving unconstrained optimization problems of the form
$$\min\{f(x) : x \in \mathbb{R}^n\},$$
where $f$ is a continuously differentiable function of $n$ real variables with gradient $g = \nabla f$. An essential feature of these schemes is to arrive at the desired extreme points through the following nonlinear conjugate gradient algorithm
$$x^{(k+1)} = x^{(k)} + \alpha_k d_k, \qquad (1)$$
where $\alpha_k$ is the stepsize and $d_k$ is the conjugate search direction defined by
$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k > 1, \end{cases} \qquad (2)$$
where $g_k = \nabla f(x^{(k)})$ and $\beta_k$ is an update parameter. In a recent survey paper by Hager and Zhang [5], a number of choices of the parameter $\beta_k$ are given in chronological order. Two well known choices are recalled here for later use:
$$\text{Dai-Yuan:}\qquad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_k}, \qquad (3)$$
$$\text{Hager-Zhang:}\qquad \beta_k^{HZ} = \frac{1}{d_{k-1}^T y_k}\left(y_k - 2 d_{k-1}\frac{\|y_k\|^2}{d_{k-1}^T y_k}\right)^T g_k, \qquad (4)$$
where $\|\cdot\|$ is the Euclidean norm and $y_k = g_k - g_{k-1}$. In (1) the stepsize $\alpha_k$ is obtained through the exact line search (i.e., $g(x^{(k)} + \alpha_k d_k)^T d_k = 0$) or an inexact line search with Wolfe's criterion defined by
$$f(x^{(k)} + \alpha_k d_k) \le f(x^{(k)}) + \delta \alpha_k g_k^T d_k \qquad (5)$$
and
$$g(x^{(k)} + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \qquad (6)$$
where $0 < \delta < \sigma < 1$.

In 1999, Dai and Yuan [2] proposed the DY conjugate gradient method using $\beta_k$ defined by (3). In 2001 [1], they introduced an updated formula for $\beta_k$ with three parameters, which may be regarded as a convex combination of several earlier choices of $\beta_k$ listed in [5]; but the three parameters are restricted to small intervals. Based on the ideas of Dai and Yuan, Andrei [3] presented yet another sufficient descent and globally convergent algorithm that avoids the strong convexity condition on the objective function $f(x)$ assumed by Hager and Zhang [4] in incorporating $\beta_k^{HZ}$ of (4) (to be named the HZ method in the sequel). However, the method by Andrei requires some additional conditions (see the statement following the proof of Theorem 1, and also additional conditions such as $g_{k+1}^T(g_{k+1} - g_k) > 0$ and $0 < \omega_k$ in Theorem 2 of [3]). Therefore it is of interest to find further alternate methods that are as competitive, yet require neither the strong convexity of the objective function nor the above mentioned conditions.
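The generic scheme (1)-(2) with the Dai-Yuan choice (3) and the Wolfe conditions (5)-(6) can be summarized by the following Python sketch. It is given only for illustration and under our own assumptions: the function names (wolfe_line_search, cg_dai_yuan), the bisection implementation of the weak Wolfe search and the parameter defaults are ours, and this is not an implementation used in the paper.

import numpy as np

def wolfe_line_search(f, grad, x, d, delta=0.2, sigma=0.3, alpha=1.0, max_iter=60):
    # Weak Wolfe line search by bracketing and bisection: returns a stepsize
    # satisfying (5) and (6), or the last trial value if max_iter is reached.
    fx = f(x)
    gTd = grad(x).dot(d)          # negative whenever d is a descent direction
    lo, hi = 0.0, np.inf
    for _ in range(max_iter):
        if f(x + alpha * d) > fx + delta * alpha * gTd:
            hi = alpha            # sufficient decrease (5) fails: shrink the step
        elif grad(x + alpha * d).dot(d) < sigma * gTd:
            lo = alpha            # curvature condition (6) fails: enlarge the step
        else:
            return alpha          # both Wolfe conditions hold
        alpha = 2.0 * lo if np.isinf(hi) else 0.5 * (lo + hi)
    return alpha

def cg_dai_yuan(f, grad, x0, eps=1e-6, max_iter=10000):
    # Nonlinear conjugate gradient iteration (1)-(2) with the Dai-Yuan update (3).
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                    # first direction, k = 1 in (2)
    for _ in range(max_iter):
        if np.linalg.norm(g) < eps:           # termination test on ||g_k||
            break
        alpha = wolfe_line_search(f, grad, x, d)
        x = x + alpha * d                     # iteration (1)
        g_new = grad(x)
        y = g_new - g
        # Under (6) and a descent direction, d^T y > 0, so (3) is well defined.
        beta = g_new.dot(g_new) / d.dot(y)    # Dai-Yuan formula (3)
        d = -g_new + beta * d                 # direction update (2)
        g = g_new
    return x

# Example use on the two-dimensional Rosenbrock function.
rosen = lambda x: 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2
rosen_grad = lambda x: np.array([-400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
                                 200.0 * (x[1] - x[0]**2)])
x_min = cg_dai_yuan(rosen, rosen_grad, [-1.2, 1.0])

Replacing the formula for beta in cg_dai_yuan is all that is needed to obtain the other methods discussed below.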
In this note, we introduce a new formulation of the update parameter $\beta_k$ defined by
$$\beta_k^{NEW} = \frac{\|g_k\|^2}{\mu|d_{k-1}^T g_k| + d_{k-1}^T y_k}, \qquad (7)$$
where $\mu > 1$. Note that if we use the exact line search, our new algorithm reduces to the algorithm of Dai and Yuan. In this paper, however, we consider general nonlinear functions and an inexact line search. By means of our $\beta_k^{NEW}$ and the $\beta_k^{DY}$ in (3), we may then introduce a hybrid algorithm for finding the extreme values of $f$. Global convergence of our methods will be established and numerical evidence will be listed to support our findings.

2 New Algorithm and Convergence

As in [2], we assume that the continuously differentiable function $f$ is bounded below in the level set $L_1 = \{x \mid f(x) \le f(x^{(1)})\}$, where $x^{(1)}$ is the starting point, and that $g(x)$ is Lipschitz continuous in $L_1$, i.e., there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L\|x - y\|$ for all $x, y \in L_1$. We remark that in Andrei [3] the level set $L_1$ is required to be bounded, instead of the slightly weaker condition of Dai and Yuan used here. Also, we use the same algorithm as in [2], which is restated here for the sake of convenience:

Step 1. Initialize the starting point $x^{(1)}$, a parameter $\mu > 1$ and a very small positive $\varepsilon > 0$. Compute $d_1 = -g_1$. Set $k = 1$.
Step 2. If $\|g_k\| < \varepsilon$, then stop and output $x^{(k)}$; else go to Step 3.
Step 3. Compute $x^{(k+1)} = x^{(k)} + \alpha_k d_k$, where $\alpha_k$ is obtained by the inexact line search (5) and (6).
Step 4. Compute $g_{k+1}$ and then $d_{k+1}$ by (2) and (7). Set $k = k + 1$ and go to Step 2 (a short computational sketch of this step is given after Theorem 1 below).

In order to consider convergence, we first notice, by (6), that
$$d_{k-1}^T(g_k - g_{k-1}) \ge (\sigma - 1)\, d_{k-1}^T g_{k-1}. \qquad (8)$$

LEMMA 1. If $\mu > 1$, then $g_k^T d_k < -\left(1 - \frac{1}{\mu}\right)\|g_k\|^2 < 0$ for $k = 1, 2, \ldots$.

PROOF. If $k = 1$ then $d_1 = -g_1$ and $g_1^T d_1 = -\|g_1\|^2 < -(1 - \frac{1}{\mu})\|g_1\|^2 < 0$ since $\mu > 1$. Assume by induction that $g_{k-1}^T d_{k-1} < -(1 - \frac{1}{\mu})\|g_{k-1}\|^2 < 0$. Then, by (8), $d_{k-1}^T(g_k - g_{k-1}) \ge (\sigma - 1)\, d_{k-1}^T g_{k-1} > 0$, so the denominator of (7) is positive and $\beta_k^{NEW} > 0$. By (2), (6), (7) and (8), we have
$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{NEW} g_k^T d_{k-1} \le -\|g_k\|^2 + \beta_k^{NEW}|d_{k-1}^T g_k| = -\|g_k\|^2 + \frac{\|g_k\|^2 |d_{k-1}^T g_k|}{\mu|d_{k-1}^T g_k| + d_{k-1}^T(g_k - g_{k-1})} < -\|g_k\|^2 + \frac{1}{\mu}\|g_k\|^2 = -\left(1 - \frac{1}{\mu}\right)\|g_k\|^2.$$
The proof is complete.

We remark that our Lemma 1 implies that $d_k$ is a sufficient descent direction.

LEMMA 2 (see [2]). If the sequence $\{x^{(k)}\}$ is generated by (1) and (2), the stepsize $\alpha_k$ satisfies (5) and (6), $d_k$ is a descent direction, $f$ is bounded below and $g(x)$ is Lipschitz continuous in the level set, then
$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \qquad (9)$$

THEOREM 1 (Global convergence). If $\mu > 1$ in (7), $f$ is bounded below and $g(x)$ is Lipschitz continuous in the level set, then our algorithm either terminates at a stationary point or $\liminf_{k \to \infty} \|g_k\| = 0$.
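Before turning to the proof, we note that Step 4 with formula (7) amounts to only a few lines of code. The following Python fragment is a minimal sketch under our own naming assumptions (beta_new, step4_direction); the default value mu = 1.1 is the one later used in Section 4, and this is not the code used for the experiments.

def beta_new(g_new, g_old, d_old, mu=1.1):
    # Update parameter (7); mu > 1 as required by Lemma 1 and Theorem 1.
    # By (6) and Lemma 1 the denominator is positive, so no safeguard is needed.
    y = g_new - g_old
    return g_new.dot(g_new) / (mu * abs(d_old.dot(g_new)) + d_old.dot(y))

def step4_direction(g_new, g_old, d_old, mu=1.1):
    # Step 4 of the algorithm: the next search direction via (2) and (7).
    return -g_new + beta_new(g_new, g_old, d_old, mu) * d_old

Substituting step4_direction for the direction update in the sketch given in Section 1 yields the NEW method tested in Section 4.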
PROOF. If our conclusion does not hold, then there exists a real number $\varepsilon > 0$ such that $\|g_k\| > \varepsilon$ for all $k = 1, 2, \ldots$. Since $d_k + g_k = \beta_k d_{k-1}$, we have
$$\|d_k\|^2 = \beta_k^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k. \qquad (10)$$
By (2), (7), (8) and Lemma 1, we have
$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{NEW} g_k^T d_{k-1} = \beta_k^{NEW}\left(d_{k-1}^T g_k - \mu|d_{k-1}^T g_k| - d_{k-1}^T(g_k - g_{k-1})\right) = \beta_k^{NEW}\left(d_{k-1}^T g_{k-1} - \mu|d_{k-1}^T g_k|\right) \le \beta_k^{NEW} d_{k-1}^T g_{k-1}.$$
Since $d_{k-1}^T g_{k-1} < 0$ and $g_k^T d_k < 0$, we see that
$$\beta_k^{NEW} \le \frac{g_k^T d_k}{d_{k-1}^T g_{k-1}} = \frac{|g_k^T d_k|}{|d_{k-1}^T g_{k-1}|}.$$
Replacing $\beta_k$ in (10) with $\beta_k^{NEW}$ and dividing by $(g_k^T d_k)^2$, we get
$$\frac{\|d_k\|^2}{(g_k^T d_k)^2} \le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} - \frac{\|g_k\|^2 + 2 g_k^T d_k}{(g_k^T d_k)^2} = \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} - \left(\frac{1}{\|g_k\|} + \frac{\|g_k\|}{g_k^T d_k}\right)^2 + \frac{1}{\|g_k\|^2} \le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} + \frac{1}{\|g_k\|^2} \le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} + \frac{1}{\varepsilon^2}.$$
Since $d_1 = -g_1$, it follows that
$$\frac{\|d_k\|^2}{(g_k^T d_k)^2} \le \frac{\|d_1\|^2}{(g_1^T d_1)^2} + \frac{k - 1}{\varepsilon^2} = \frac{1}{\|g_1\|^2} + \frac{k - 1}{\varepsilon^2} < \frac{1}{\varepsilon^2} + \frac{k - 1}{\varepsilon^2} = \frac{k}{\varepsilon^2}.$$
Thus
$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} \ge \sum_{k=1}^{\infty} \frac{\varepsilon^2}{k} = +\infty,$$
which is contrary to Lemma 2. The proof is complete.

3 Hybrid Algorithm

We may build a hybrid algorithm (see the discussions on hybrid algorithms in [5] for background information) based on $\beta_k^{DY}$ and our $\beta_k^{NEW}$ as follows. First we let
$$\beta_k^{mix} = \begin{cases} \beta_k^{DY}, & \text{if } |\beta_k^{DY}| \le \beta_k^{NEW} \text{ and } g_k^T d_{k-1} < 0, \\ \beta_k^{NEW}, & \text{otherwise}, \end{cases} \qquad (11)$$
and then we replace $\beta_k^{NEW}$ with $\beta_k^{mix}$ at Step 4 of the algorithm in the last section (a short computational sketch of this rule is given in the next section).

THEOREM 2. For $k = 1, 2, \ldots$, we have $g_k^T d_k < -\left(1 - \frac{1}{\mu}\right)\|g_k\|^2$ (so that our hybrid method is also sufficient descent).

PROOF. When $k = 1$, since $\mu > 1$ and $d_1 = -g_1$, we have $g_1^T d_1 = -\|g_1\|^2 < -(1 - \frac{1}{\mu})\|g_1\|^2 < 0$. Assume by induction that $g_{k-1}^T d_{k-1} < -(1 - \frac{1}{\mu})\|g_{k-1}\|^2 < 0$. If $\beta_k^{mix} = \beta_k^{DY}$, then $|\beta_k^{DY}| \le \beta_k^{NEW}$ and $g_k^T d_{k-1} < 0$, so that $\beta_k^{mix} g_k^T d_{k-1} \le |\beta_k^{DY}|\,|g_k^T d_{k-1}| \le \beta_k^{NEW}|g_k^T d_{k-1}|$. Therefore, whether $\beta_k^{mix} = \beta_k^{DY}$ or $\beta_k^{mix} = \beta_k^{NEW}$, we have
$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{mix} g_k^T d_{k-1} \le -\|g_k\|^2 + \beta_k^{NEW}|g_k^T d_{k-1}|.$$
From the proof of Lemma 1, we can then get $g_k^T d_k < -(1 - \frac{1}{\mu})\|g_k\|^2$.

THEOREM 3 (Global convergence). If $\mu > 1$, $f$ is bounded below and $g(x)$ is Lipschitz continuous in the level set, then our hybrid algorithm either terminates at a stationary point or $\liminf_{k \to \infty} \|g_k\| = 0$.

PROOF. As in the proof of Theorem 1, if our conclusion does not hold, then there exists $\varepsilon > 0$ such that $\|g_k\| > \varepsilon$ for all $k$, and by (11), Theorem 2 and (7),
$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{mix} g_k^T d_{k-1} \le -\|g_k\|^2 + \beta_k^{NEW}|d_{k-1}^T g_k| = -\|g_k\|^2 + \frac{\|g_k\|^2 |d_{k-1}^T g_k|}{\mu|d_{k-1}^T g_k| + d_{k-1}^T(g_k - g_{k-1})} \le \beta_k^{NEW} d_{k-1}^T g_{k-1}.$$
Since $g_k^T d_k < 0$ and $d_{k-1}^T g_{k-1} < 0$ for all $k$, it follows that
$$\beta_k^{NEW} \le \frac{g_k^T d_k}{d_{k-1}^T g_{k-1}} = \frac{|g_k^T d_k|}{|d_{k-1}^T g_{k-1}|}. \qquad (12)$$
On the other hand, by (11), (12) and (10), we have
$$\|d_k\|^2 = (\beta_k^{mix})^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k \le (\beta_k^{NEW})^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k \le \frac{(g_k^T d_k)^2}{(g_{k-1}^T d_{k-1})^2}\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k.$$
The remaining proof is the same as the proof of Theorem 1.

4 Numerical Evidence

In this section we test the DY, HZ and ANDREI (see [3]) methods as well as our NEW and HYBRID conjugate gradient methods with the weak Wolfe line search. For each method, we take $\delta = 0.2$ and $\sigma = 0.3$ in (5)-(6), $\mu = 1.1$ in (7), and the termination condition $\|g_k\| \le \varepsilon = 10^{-6}$. The test problems are extracted from [6]. Since the computational procedures are similar to those in [6] and in [7], we will not bother with detailed descriptions of the numerical data. Instead, we prepare two tables which summarize our numerical comparisons.
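The HYBRID method tested below selects its update parameter by rule (11), and the tables report iteration and evaluation counts for each run. The following Python sketch shows the selection rule and one simple way of counting function and gradient calls; the names beta_mix and CountingProblem, and the default mu = 1.1, are our own illustrative choices and not the code actually used for the experiments.

def beta_mix(g_new, g_old, d_old, mu=1.1):
    # Hybrid rule (11): use beta_DY when |beta_DY| <= beta_NEW and g_k^T d_{k-1} < 0,
    # and beta_NEW otherwise.
    y = g_new - g_old
    b_dy = g_new.dot(g_new) / d_old.dot(y)                                  # formula (3)
    b_new = g_new.dot(g_new) / (mu * abs(d_old.dot(g_new)) + d_old.dot(y))  # formula (7)
    return b_dy if (abs(b_dy) <= b_new and g_new.dot(d_old) < 0.0) else b_new

class CountingProblem:
    # Wraps f and its gradient so that the number of function and gradient
    # evaluations (NF and NG in the tables below) can be reported.
    def __init__(self, f, grad):
        self._f, self._grad = f, grad
        self.nf = 0
        self.ng = 0
    def f(self, x):
        self.nf += 1
        return self._f(x)
    def grad(self, x):
        self.ng += 1
        return self._grad(x)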
More specifically, in these tables the terms Problem, Dim, NI, NF, NG, -, * have the following meanings. Problem: the name of the test problem; Dim: the dimension of the problem; NI: the total number of iterations; NF: the number of function evaluations; NG: the number of gradient evaluations; -: the method is not applicable; *: the best method.

Our tables show that on some test problems our new methods outperform the Dai-Yuan method. Although the HZ method also performs well, it requires that the objective function be strongly convex and the level set be bounded, so the HZ method may not be applicable to some problems (such as the Gulf research problem). In conclusion, our new methods are competitive among the well known conjugate gradient methods for unconstrained optimization.

Acknowledgment. Research supported by the Natural Science Foundation of Guangdong Province of P. R. China under grant number 9151008002000012.

[Table 1: NI/NF/NG counts of the DY, NEW, HYBRID, HZ and ANDREI methods on the test problems Watson, Penalty I, Box 3-dimensional, Rosenbrock, Variably dimensioned, Discrete integral equation, Discrete boundary value, Broyden banded, Broyden tridiagonal, Linear rank-1, Gaussian and Linear full rank, with dimensions ranging from 2 to 10000.]
[Table 2: NI/NF/NG counts of the DY, NEW, HYBRID, HZ and ANDREI methods on the test problems Penalty II, Brown badly scaled, Brown and Dennis, Gulf research, Trigonometric, Extended Rosenbrock, Extended Powell singular, Beale, Wood, Chebyquad and Brown almost-linear, with dimensions ranging from 2 to 10000.]

References

[1] Y. H. Dai and Y. Yuan, A three-parameter family of nonlinear conjugate gradient methods, Math. Comp., 70(2001), 1155-1167.

[2] Y. H. Dai and Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim., 10(1999), 177-182.

[3] N. Andrei, A Dai-Yuan conjugate gradient algorithm with sufficient descent and conjugacy conditions for unconstrained optimization, Appl. Math. Lett., 21(2008), 165-171.

[4] W. W. Hager and H. C. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM J. Optim., 16(1)(2005), 170-192.

[5] W. W. Hager and H. C. Zhang, A survey of nonlinear conjugate gradient methods, Pac. J. Optim., 2(1)(2006), 35-58.

[6] N. Andrei, Test functions for unconstrained optimization, http://www.ici.ro/neculai/SCALCG/evalfg.for.

[7] W. W. Hager and H. Zhang, Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent, ACM Trans. Math. Software, 32(2006), 113-137.