Applied Mathematics E-Notes, 11(2011), 139-147 ©
Available free at mirror sites of http://www.math.nthu.edu.tw/~amen/
ISSN 1607-2510
A Conjugate Gradient Method With Sufficient Descent And Global Convergence For Unconstrained Nonlinear Optimization

Hailin Liu†, Sui Sun Cheng‡, Xiaoyong Li§
Received 9 March 2011
Abstract
In this paper a new conjugate gradient method for unconstrained optimization is introduced, which generates sufficient descent directions, is globally convergent, and can also be combined with the Dai-Yuan method to form a hybrid algorithm. Our methods do not require the strong convexity condition on the objective function. Numerical evidence shows that this new conjugate gradient algorithm may be considered one of the competitive conjugate gradient methods.
1 Introduction
There are now many conjugate gradient schemes for solving unconstrained optimization problems of the form

$$\min\,\{f(x) : x \in \mathbb{R}^n\},$$

where $f$ is a continuously differentiable function of $n$ real variables with gradient $g = \nabla f$. An essential feature of these schemes is to arrive at the desired extreme points through the following nonlinear conjugate gradient iteration:

$$x^{(k+1)} = x^{(k)} + \alpha_k d_k, \tag{1}$$

where $\alpha_k$ is the stepsize and $d_k$ is the conjugate search direction defined by

$$d_k = \begin{cases} -g_k, & k = 1,\\ -g_k + \beta_k d_{k-1}, & k > 1,\end{cases} \tag{2}$$

where $g_k = \nabla f\big(x^{(k)}\big)$ and $\beta_k$ is an update parameter. In a recent survey paper by Hager and Zhang [5], a number of choices of the parameter $\beta_k$ are given in chronological order. Two well known choices are recalled here for later use:
Mathematics Subject Classifications: 15A99, 39A10.
†School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, P. R. China
‡Department of Mathematics, Tsing Hua University, Taiwan 30043, R. O. China
§School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, P. R. China
Dai-Yuan:
$$\beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_k}; \tag{3}$$

Hager-Zhang:
$$\beta_k^{HZ} = \frac{1}{d_{k-1}^T y_k}\left(y_k - 2 d_{k-1}\,\frac{\|y_k\|^2}{d_{k-1}^T y_k}\right)^{\!T} g_k; \tag{4}$$

where $\|\cdot\|$ is the Euclidean norm and $y_k = g_k - g_{k-1}$. In (1) the stepsize $\alpha_k$ is obtained through the exact line search (i.e., $g(x^{(k)} + \alpha_k d_k)^T d_k = 0$) or an inexact line search with Wolfe's criterion defined by

$$f(x^{(k)} + \alpha_k d_k) \le f(x^{(k)}) + \delta\,\alpha_k\, g_k^T d_k \tag{5}$$

and

$$g(x^{(k)} + \alpha_k d_k)^T d_k \ge \sigma\, g_k^T d_k, \tag{6}$$

where $0 < \delta < \sigma < 1$.
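For concreteness, here is a minimal NumPy sketch of the two update formulas (3) and (4); the function names and calling convention are ours, not part of the original papers.

```python
import numpy as np

def beta_dy(g, d_prev, y):
    # Dai-Yuan (3): ||g_k||^2 / (d_{k-1}^T y_k)
    return g.dot(g) / d_prev.dot(y)

def beta_hz(g, d_prev, y):
    # Hager-Zhang (4): (y_k - 2 d_{k-1} ||y_k||^2/(d_{k-1}^T y_k))^T g_k / (d_{k-1}^T y_k)
    dy = d_prev.dot(y)
    return (y - 2.0 * d_prev * y.dot(y) / dy).dot(g) / dy

# toy example with synthetic vectors
g, g_prev = np.array([1.0, 2.0]), np.array([2.0, 1.0])
d_prev = -g_prev
print(beta_dy(g, d_prev, g - g_prev), beta_hz(g, d_prev, g - g_prev))
```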
In 1999, Dai and Yuan [2] proposed the DY conjugate gradient method using $\beta_k$ defined by (3). In 2001 [1], they introduced an update formula for $\beta_k$ with three parameters, which may be regarded as a convex combination of several earlier choices of $\beta_k$ listed in [5]; but the three parameters are restricted to small intervals. Based on the ideas of Dai-Yuan, Andrei [3] presented yet another sufficient descent and globally convergent algorithm that avoids the strong convexity condition on the objective function $f(x)$ assumed by Hager and Zhang [4], whose method incorporates $\beta_k^{HZ}$ in (4) (to be named the HZ method in the sequel).

However, the method of Andrei requires some additional conditions (see the statement following the proof of Theorem 1 of [3], and also the additional conditions, such as $g_{k+1}^T(g_{k+1} - g_k) > 0$ and a positivity condition $0 < \omega$, in Theorem 2 of [3]). Therefore it is of interest to find further alternative methods that are as competitive, yet require neither the strong convexity of the objective function nor the above mentioned conditions.
In this note, we introduce a new formulation of the update parameter $\beta_k$, defined by

$$\beta_k^{NEW} = \frac{\|g_k\|^2}{\mu\,|d_{k-1}^T g_k| + d_{k-1}^T y_k}, \tag{7}$$

where $\mu > 1$ is a fixed parameter. Note that under an exact line search $d_{k-1}^T g_k = 0$, so our new algorithm reduces to the algorithm of Dai and Yuan. In this paper, however, we consider general nonlinear functions and an inexact line search. By means of our $\beta_k^{NEW}$ and the $\beta_k^{DY}$ in (3), we may then introduce a hybrid algorithm for finding the extreme values of $f$. Global convergence of our methods will be established, and numerical evidence will be listed to support our findings.
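In code, (7) is a one-line variation on (3); the following sketch (ours, assuming NumPy vectors) makes the reduction to Dai-Yuan under an exact line search visible.

```python
import numpy as np

def beta_new(g, g_prev, d_prev, mu=1.1):
    # New parameter (7); with an exact line search d_{k-1}^T g_k = 0,
    # so the value coincides with the Dai-Yuan parameter (3).
    y = g - g_prev
    return g.dot(g) / (mu * abs(d_prev.dot(g)) + d_prev.dot(y))

g, g_prev = np.array([1.0, 2.0]), np.array([2.0, 1.0])
print(beta_new(g, g_prev, -g_prev))
```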
2 New Algorithm and Convergence
As in [2], we assume that the continuously differentiable function $f$ is bounded in the level set $L_1 = \{x \mid f(x) \le f(x^{(1)})\}$, where $x^{(1)}$ is the starting point, and that $g(x)$ is Lipschitz continuous in $L_1$, i.e., there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L\|x - y\|$ for all $x, y \in L_1$. We remark that in Andrei [3], it is required that the level set $L_1$ be bounded, instead of this slightly weaker condition of Dai-Yuan.
Also, we use the same algorithm as in [2], which is restated here for the sake of convenience (a sketch in code follows the steps):

Step 1. Initialize the starting point $x^{(1)}$, the parameter $\mu > 1$, and a very small positive $\varepsilon > 0$. Compute $d_1 = -g_1$. Set $k = 1$.

Step 2. If $\|g_k\| < \varepsilon$, then stop and output $x^{(k)}$; else go to Step 3.

Step 3. Compute $x^{(k+1)} = x^{(k)} + \alpha_k d_k$ through the inexact line search (5) and (6).

Step 4. Compute $d_{k+1}$ by (2) and (7). Compute $g_{k+1}$. Set $k = k + 1$ and go to Step 2.
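The four steps translate almost line for line into code. The sketch below is ours, not the authors' implementation; for brevity it borrows SciPy's `line_search`, which enforces the strong Wolfe conditions, slightly stronger than the weak conditions (5)-(6), with `c1` and `c2` playing the roles of $\delta$ and $\sigma$.

```python
import numpy as np
from scipy.optimize import line_search

def cg_new(f, grad, x, mu=1.1, delta=0.2, sigma=0.3, eps=1e-6, max_iter=10000):
    """Conjugate gradient method with the update parameter (7), a sketch
    of Steps 1-4 under a (strong) Wolfe line search."""
    g = grad(x)
    d = -g                                    # Step 1: d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g) < eps:           # Step 2: termination test
            return x
        alpha = line_search(f, grad, x, d, gfk=g, c1=delta, c2=sigma)[0]
        if alpha is None:                     # line search did not converge
            raise RuntimeError("Wolfe line search failed")
        x = x + alpha * d                     # Step 3: x_{k+1} = x_k + alpha_k d_k
        g_new = grad(x)                       # Step 4: new gradient and direction
        beta = g_new.dot(g_new) / (mu * abs(d.dot(g_new)) + d.dot(g_new - g))
        d = -g_new + beta * d
        g = g_new
    return x
```

For instance, `cg_new(rosen, rosen_der, np.full(10, -1.0))` with SciPy's built-in Rosenbrock test function and gradient minimizes the 10-dimensional Rosenbrock problem.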
In order to consider convergence, we first notice, by (6), that

$$d_{k-1}^T(g_k - g_{k-1}) \ge (\sigma - 1)\, d_{k-1}^T g_{k-1}. \tag{8}$$

LEMMA 1. If $\mu > 1$, then $g_k^T d_k < -(1 - 1/\mu)\|g_k\|^2 < 0$ for $k = 1, 2, \dots$.
PROOF. If $k = 1$, then $d_1 = -g_1$ and $g_1^T d_1 = -\|g_1\|^2 < -(1 - 1/\mu)\|g_1\|^2 < 0$ since $\mu > 1$. Assume by induction that $g_{k-1}^T d_{k-1} < -(1 - 1/\mu)\|g_{k-1}\|^2 < 0$. By (2), (6), (7) and (8), we have

$$\begin{aligned}
g_k^T d_k &= -\|g_k\|^2 + \beta_k^{NEW} g_k^T d_{k-1}\\
&= -\|g_k\|^2 + \frac{\|g_k\|^2}{\mu|d_{k-1}^T g_k| + d_{k-1}^T(g_k - g_{k-1})}\; g_k^T d_{k-1}\\
&\le -\|g_k\|^2 + \frac{\|g_k\|^2\,|d_{k-1}^T g_k|}{\mu|d_{k-1}^T g_k| + (\sigma - 1)\,d_{k-1}^T g_{k-1}}\\
&< -\|g_k\|^2 + \frac{1}{\mu}\,\|g_k\|^2 = -\Big(1 - \frac{1}{\mu}\Big)\|g_k\|^2,
\end{aligned}$$

where the first inequality uses (8) and $g_k^T d_{k-1} \le |d_{k-1}^T g_k|$, and the second uses $(\sigma - 1)\,d_{k-1}^T g_{k-1} > 0$, which holds by the induction hypothesis since $\sigma < 1$. The proof is complete.

We remark that Lemma 1 implies that $d_k$ is a sufficient descent direction.
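Lemma 1 can also be checked numerically on synthetic data: draw $g_{k-1}$, $d_{k-1}$, $g_k$ satisfying the descent condition $d_{k-1}^T g_{k-1} < 0$ and the curvature condition (6), and verify the sufficient descent bound. This check is ours, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.1, 0.3

for _ in range(10000):
    g_prev = rng.normal(size=5)
    d_prev = -g_prev + 0.5 * rng.normal(size=5)
    g = rng.normal(size=5)
    # keep only samples satisfying the hypotheses of Lemma 1:
    # d_{k-1} is a descent direction and (6) holds at the new point
    if d_prev.dot(g_prev) >= 0 or d_prev.dot(g) < sigma * d_prev.dot(g_prev):
        continue
    beta = g.dot(g) / (mu * abs(d_prev.dot(g)) + d_prev.dot(g - g_prev))
    d = -g + beta * d_prev
    assert g.dot(d) <= -(1 - 1/mu) * g.dot(g) + 1e-9   # sufficient descent
```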
LEMMA 2 (see [2]). If the sequence $\{x^{(k)}\}$ is generated by (1) and (2), the stepsize $\alpha_k$ satisfies (5) and (6), $d_k$ is a descent direction, and $f$ is bounded and $g(x)$ is Lipschitz continuous in the level set, then

$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \tag{9}$$
THEOREM 1 (Global convergence). If $\mu > 1$ in (7), $f$ is bounded and $g(x)$ is Lipschitz continuous in the level set, then our algorithm either terminates at a stationary point or $\liminf_{k\to\infty} \|g_k\| = 0$.
PROOF. If our conclusion does not hold, then there exists a real number $\varepsilon > 0$ such that $\|g_k\| > \varepsilon$ for all $k = 1, 2, \dots$. Since $d_k + g_k = \beta_k d_{k-1}$, expanding $\|\beta_k d_{k-1}\|^2 = \|d_k + g_k\|^2$ gives

$$\|d_k\|^2 = \beta_k^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k. \tag{10}$$

By (8) and Lemma 1, $d_{k-1}^T g_{k-1} < 0$, so the denominator of (7) may be written as $\mu|d_{k-1}^T g_k| + d_{k-1}^T g_k + |d_{k-1}^T g_{k-1}|$, and hence

$$-g_k^T d_k = \|g_k\|^2 - \beta_k^{NEW} g_k^T d_{k-1}
= \|g_k\|^2\,\frac{\mu|d_{k-1}^T g_k| + |d_{k-1}^T g_{k-1}|}{\mu|d_{k-1}^T g_k| + d_{k-1}^T(g_k - g_{k-1})}
\ge \beta_k^{NEW}\,|d_{k-1}^T g_{k-1}|.$$

Since $d_{k-1}^T g_{k-1} < 0$ and $g_k^T d_k < 0$, this says

$$\beta_k^{NEW} \le \frac{|g_k^T d_k|}{|d_{k-1}^T g_{k-1}|} = \frac{g_k^T d_k}{g_{k-1}^T d_{k-1}}.$$
Replacing $\beta_k$ in (10) with $\beta_k^{NEW}$ and dividing by $(g_k^T d_k)^2$, we get

$$\begin{aligned}
\frac{\|d_k\|^2}{(g_k^T d_k)^2}
&= \frac{(\beta_k^{NEW})^2\|d_{k-1}\|^2}{(g_k^T d_k)^2} - \frac{\|g_k\|^2 + 2 g_k^T d_k}{(g_k^T d_k)^2}\\
&\le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} - \left(\frac{1}{\|g_k\|} + \frac{\|g_k\|}{g_k^T d_k}\right)^2 + \frac{1}{\|g_k\|^2}\\
&\le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} + \frac{1}{\|g_k\|^2}
\le \frac{\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} + \frac{1}{\varepsilon^2}.
\end{aligned}$$

Since $d_1 = -g_1$, repeated application of this estimate yields

$$\frac{\|d_k\|^2}{(g_k^T d_k)^2} \le \frac{\|d_1\|^2}{(g_1^T d_1)^2} + \frac{k-1}{\varepsilon^2} = \frac{1}{\|g_1\|^2} + \frac{k-1}{\varepsilon^2} < \frac{1}{\varepsilon^2} + \frac{k-1}{\varepsilon^2} = \frac{k}{\varepsilon^2}.$$

Thus

$$\sum_{k=1}^{\infty}\frac{(g_k^T d_k)^2}{\|d_k\|^2} > \sum_{k=1}^{\infty}\frac{\varepsilon^2}{k} = +\infty,$$

which is contrary to Lemma 2. The proof is complete.
3 Hybrid Algorithm
We may build a hybrid algorithm (see the discussion of hybrid algorithms in [5] for background information) based on $\beta_k^{DY}$ and our $\beta_k^{NEW}$ as follows. First we let

$$\beta_k^{mix} = \begin{cases} \beta_k^{DY}, & \text{if } |\beta_k^{DY}| \le \beta_k^{NEW} \text{ and } g_k^T d_{k-1} < 0,\\ \beta_k^{NEW}, & \text{otherwise,}\end{cases} \tag{11}$$

and then we replace $\beta_k^{NEW}$ with $\beta_k^{mix}$ at Step 4 of the algorithm in the last section.
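The selection rule (11) is only a few lines of code; the sketch below is ours, reusing the formulas (3) and (7) on NumPy vectors.

```python
def beta_mix(g, g_prev, d_prev, mu=1.1):
    """Hybrid parameter (11); g, g_prev, d_prev are NumPy vectors."""
    y = g - g_prev
    b_dy = g.dot(g) / d_prev.dot(y)                               # (3)
    b_new = g.dot(g) / (mu * abs(d_prev.dot(g)) + d_prev.dot(y))  # (7)
    # take the Dai-Yuan value only when it is dominated in magnitude
    # by beta_new and g_k^T d_{k-1} < 0
    return b_dy if abs(b_dy) <= b_new and g.dot(d_prev) < 0 else b_new
```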
THEOREM 2. For $k = 1, 2, \dots$, we have $g_k^T d_k < -(1 - 1/\mu)\|g_k\|^2$ (so that our hybrid method also generates sufficient descent directions).

PROOF. When $k = 1$, since $\mu > 1$ and $d_1 = -g_1$, we have $g_1^T d_1 = -\|g_1\|^2 < -(1 - 1/\mu)\|g_1\|^2 < 0$. Assume by induction that $g_{k-1}^T d_{k-1} < -(1 - 1/\mu)\|g_{k-1}\|^2 < 0$. If $\beta_k^{mix} = \beta_k^{DY}$, then by (11) we have $|\beta_k^{DY}| \le \beta_k^{NEW}$ and $g_k^T d_{k-1} < 0$, so that $\beta_k^{mix} g_k^T d_{k-1} \le \beta_k^{NEW}|g_k^T d_{k-1}|$. Therefore, in either case $\beta_k^{mix} = \beta_k^{DY}$ or $\beta_k^{mix} = \beta_k^{NEW}$, we have

$$g_k^T d_k = -\|g_k\|^2 + \beta_k^{mix}\, g_k^T d_{k-1} \le -\|g_k\|^2 + \beta_k^{NEW}\,|g_k^T d_{k-1}|.$$

From the proof of Lemma 1, we can then get $g_k^T d_k < -(1 - 1/\mu)\|g_k\|^2$.
THEOREM 3 (Global convergence). If $\mu > 1$, $f$ is bounded and $g(x)$ is Lipschitz continuous in the level set, then our hybrid algorithm either terminates at a stationary point or $\liminf_{k\to\infty} \|g_k\| = 0$.
PROOF. As in the proof of Theorem 1, if our conclusion does not hold, then there exists $\varepsilon > 0$ such that $\|g_k\| > \varepsilon$ for all $k$. Since $g_k^T d_k < 0$ for all $k \ge 1$ by Theorem 2, the argument in the proof of Theorem 1 again gives

$$\beta_k^{NEW} \le \frac{g_k^T d_k}{g_{k-1}^T d_{k-1}} = \frac{|g_k^T d_k|}{|d_{k-1}^T g_{k-1}|}. \tag{12}$$

On the other hand, by (11), (12) and (10), we have

$$\|d_k\|^2 = (\beta_k^{mix})^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k
\le (\beta_k^{NEW})^2\|d_{k-1}\|^2 - \|g_k\|^2 - 2 g_k^T d_k
\le \frac{(g_k^T d_k)^2\,\|d_{k-1}\|^2}{(g_{k-1}^T d_{k-1})^2} - \|g_k\|^2 - 2 g_k^T d_k,$$

where $|\beta_k^{mix}| \le \beta_k^{NEW}$ follows from the definition (11). The remaining proof is the same as the proof of Theorem 1.
4 Numerical Evidence
In this section we test the DY, HZ and ANDREI (see [3]) methods, as well as our NEW and HYBRID conjugate gradient methods, with the weak Wolfe line search. For each method we take $\delta = 0.2$ and $\sigma = 0.3$ in (5)-(6) and $\mu = 1.1$ in (7), and the termination condition is $\|g_k\| \le \varepsilon = 10^{-6}$. The test problems are extracted from [6]. Since the computational procedures are similar to those in [6] and [7], we will not bother with detailed descriptions of the numerical data. Instead, we prepare tables which summarize our numerical comparisons. More specifically, in these tables, the terms Problem, Dim, NI, NF, NG, -, * have the following meanings:
Problem: the name of the test problem;
Dim: the dimension of the problem;
NI: the total number of iterations;
NF: the number of function evaluations;
NG: the number of gradient evaluations;
-: method not applicable;
*: the best method.
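The NI/NF/NG bookkeeping is easy to reproduce by wrapping the objective and gradient in counters. The sketch below is ours and, for illustration only, uses SciPy's built-in Rosenbrock test function and SciPy's own nonlinear CG solver, which is not one of the methods compared in the tables.

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

class Counter:
    """Wrap a callable and count how many times it is evaluated."""
    def __init__(self, fn):
        self.fn, self.n = fn, 0
    def __call__(self, x):
        self.n += 1
        return self.fn(x)

f, g = Counter(rosen), Counter(rosen_der)
res = minimize(f, np.full(10, -1.0), jac=g, method="CG",
               options={"gtol": 1e-6})
print(res.nit, f.n, g.n)   # NI / NF / NG for this run
```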
Our tables (reproduced at the end of the paper) show that our new methods outperform the Dai-Yuan method on some test problems. Although the HZ method also performs well, it requires that the objective function be strongly convex and the level set be bounded, so the HZ method may not be applicable (as in the Gulf research problem). In conclusion, our new methods are competitive among the well known conjugate gradient methods for unconstrained optimization.
Acknowledgment. Research supported by the Natural Science Foundation of
Guangdong Province of P. R. China under grant number 9151008002000012.
References

[1] Y. H. Dai and Y. Yuan, A three-parameter family of nonlinear conjugate gradient methods, Math. Comp., 70(2001), 1155-1167.

[2] Y. H. Dai and Y. Yuan, A nonlinear conjugate gradient method with a strong global convergence property, SIAM J. Optim., 10(1999), 177-182.

[3] N. Andrei, A Dai-Yuan conjugate gradient algorithm with sufficient descent and conjugacy conditions for unconstrained optimization, Appl. Math. Lett., 21(2008), 165-171.

[4] W. W. Hager and H. C. Zhang, A new conjugate gradient method with guaranteed descent and an efficient line search, SIAM J. Optim., 16(1)(2005), 170-192.

[5] W. W. Hager and H. C. Zhang, A survey of nonlinear conjugate gradient methods, Pac. J. Optim., 2(1)(2006), 35-58.

[6] N. Andrei, Test functions for unconstrained optimization, http://www.ici.ro/neculai/SCALCG/evalfg.for.

[7] W. W. Hager and H. C. Zhang, Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent, ACM Trans. Math. Software, 32(2006), 113-137.
[Table 1: NI/NF/NG counts of the DY, NEW, HYBRID, HZ and ANDREI methods on the test problems Watson, Penalty I, Box three-dimensional, Rosenbrock, Variably dimensioned, Discrete integral equation, Discrete boundary value, Broyden banded, Broyden tridiagonal, Linear rank-1, Gaussian and Linear full-rank, each in one or two dimensions between 2 and 10000; the individual entries are not reproduced here.]
[Table 2: NI/NF/NG counts of the DY, NEW, HYBRID, HZ and ANDREI methods on the test problems Penalty II, Brown badly scaled, Brown and Dennis, Gulf research, Trigonometric, Extended Rosenbrock, Extended Powell singular, Beale, Wood, Chebyquad and Brown almost-linear, each in one or two dimensions between 2 and 10000; the individual entries are not reproduced here.]