Research Article A Self-Adjusting Spectral Conjugate Gradient Method for Large-Scale Unconstrained Optimization

advertisement
Hindawi Publishing Corporation
Abstract and Applied Analysis
Volume 2013, Article ID 814912, 8 pages
http://dx.doi.org/10.1155/2013/814912
Research Article
A Self-Adjusting Spectral Conjugate Gradient Method for
Large-Scale Unconstrained Optimization
Yuanying Qiu,1 Dandan Cui,2 Wei Xue,2 and Gaohang Yu2
1
2
School of Foreign Languages, Gannan Normal University, Ganzhou 341000, China
School of Mathematics and Computer Sciences, Gannan Normal University, Ganzhou 341000, China
Correspondence should be addressed to Gaohang Yu; maghyu@163.com
Received 21 February 2013; Accepted 17 March 2013
Academic Editor: Guoyin Li
Copyright © 2013 Yuanying Qiu et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper presents a hybrid spectral conjugate gradient method for large-scale unconstrained optimization, which possesses a
self-adjusting property. Under the standard Wolfe conditions, its global convergence result is established. Preliminary numerical
results are reported on a set of large-scale problems in CUTEr to show the convergence and efficiency of the proposed method.
1. Introduction
Consider the following unconstrained optimization problem:
min {𝑓 (π‘₯) | π‘₯ ∈ R𝑛 } ,
π›½π‘˜FR =
(1)
where 𝑓 : R𝑛 → R is a nonlinear smooth function and
its gradient is available. Conjugate gradient methods are very
efficient for solving (1), especially when the dimension 𝑛 is
large, and have the following iterative form:
π‘₯π‘˜+1 = π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ ,
(2)
where π›Όπ‘˜ > 0 is a steplength obtained by a line search, and π‘‘π‘˜
is the search direction defined by
−𝑔 ,
π‘‘π‘˜ = { π‘˜
−π‘”π‘˜ + π›½π‘˜ π‘‘π‘˜−1 ,
for π‘˜ = 1,
for π‘˜ ≥ 2,
There are at least six formulas for π›½π‘˜ , which are given
below:
(3)
where π›½π‘˜ is a scalar and π‘”π‘˜ denotes the gradient of 𝑓 at point
π‘₯π‘˜ .
π‘”π‘˜π‘‡ π‘”π‘˜
,
𝑇 𝑔
π‘”π‘˜−1
π‘˜−1
π›½π‘˜CD = −
π›½π‘˜DY =
π‘”π‘˜π‘‡ π‘”π‘˜
,
𝑇 𝑦
π‘‘π‘˜−1
π‘˜−1
π›½π‘˜PR =
π›½π‘˜HS =
π‘”π‘˜π‘‡ π‘¦π‘˜−1
,
𝑇 𝑦
π‘‘π‘˜−1
π‘˜−1
π›½π‘˜LS = −
π‘”π‘˜π‘‡ π‘”π‘˜
,
𝑇 𝑦
π‘‘π‘˜−1
π‘˜−1
π‘”π‘˜π‘‡ π‘¦π‘˜−1
,
𝑇 𝑔
π‘”π‘˜−1
π‘˜−1
(4)
π‘”π‘˜π‘‡ π‘¦π‘˜−1
,
𝑇 𝑔
π‘‘π‘˜−1
π‘˜−1
where π‘¦π‘˜−1 = π‘”π‘˜ −π‘”π‘˜−1 and β€–⋅β€– denotes the Euclidean norm. In
the above six methods, HS, PR, and LS methods are especially
efficient in real computations, but one may not globally
converge for general functions. FR, CD, and DY methods
are globally convergent, but they perform much worse. To
combine the good numerical performance of HS method and
the nice global convergence property of DY method, Dai and
Yuan [1] proposed an efficient hybrid formula for π›½π‘˜ which is
defined as the following form:
π›½π‘˜HSDY = max {0, min {π›½π‘˜DY , π›½π‘˜HS }} .
(5)
Their studies suggested that the HSDY method (5) has the
same advantage of avoiding the propensity of short steps as
the HS method [1]. They also proved that the HSDY method
2
Abstract and Applied Analysis
with the standard wolfe line search produces a descent search
direction at each iteration and converges globally. Descent
condition may be crucial for the convergence analysis of
conjugate gradient methods with inexact line searches [2, 3].
Further, there are some modified conjugate gradient methods
[4–7] which possess the sufficiently descent property without
any line search condition. Recently, Yu [8] proposed a spectral
version of HSDY method:
π›½π‘˜S-HSDY = max {0, min {π›½π‘˜SDY , π›½π‘˜SHS }} ,
(6)
where
π›½π‘˜SDY
σ΅„©σ΅„© σ΅„©σ΅„©2
󡄩𝑔 σ΅„©
= 󡄩𝑇 π‘˜ σ΅„©
,
π›Ώπ‘˜ π‘¦π‘˜−1 π‘‘π‘˜−1
π›½π‘˜SHS =
π‘”π‘˜π‘‡ π‘¦π‘˜−1
,
𝑇 𝑑
π›Ώπ‘˜ π‘¦π‘˜−1
π‘˜−1
(7)
π›½π‘˜DSDY
∗
π‘¦π‘˜−1
= π‘¦π‘˜−1 +
π›½π‘˜DSHS =
σ΅„©σ΅„© σ΅„©σ΅„©2
󡄩𝑑 σ΅„©
π‘žπ‘˜ = σ΅„© π‘˜ σ΅„© 2 ,
(π‘”π‘˜π‘‡ π‘‘π‘˜ )
𝛿 𝑔𝑇 𝑑
π›Ύπ‘˜ = − σ΅„©π‘˜ π‘˜σ΅„©2π‘˜ .
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©
σ΅„© σ΅„©
σ΅„© σ΅„©2
π‘”π‘˜π‘‡ π‘‘π‘˜ ≤ −πΆσ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© .
(9)
A full description of DS-HSDY method is formally given
as follows.
Algorithm 1 (DS-HSDY conjugate gradient method).
Data. Choose constants 0 < 𝜌 < 𝜎 < 1, πœ‡ > 1, and 0 ≤ πœ– β‰ͺ 1.
Given an initial point π‘₯1 ∈ 𝑅𝑛 , set 𝑑1 = −𝑔1 . Let π‘˜ := 1.
Step 1. If β€–π‘”π‘˜ β€– ≤ πœ€, then stop.
Step 2. Determine π›Όπ‘˜ satisfying the standard Wolfe condition:
𝑔(π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ ) π‘‘π‘˜ > πœŽπ‘”π‘˜π‘‡ π‘‘π‘˜ ,
(10)
𝑓 (π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ ) − 𝑓 (π‘₯π‘˜ ) ≤ πœŒπ›Όπ‘˜ π‘”π‘˜π‘‡ π‘‘π‘˜ .
(11)
Then update π‘₯π‘˜+1 = π‘₯π‘˜ + π›Όπ‘˜ π‘‘π‘˜ .
DS-HSDY
. Then update π‘‘π‘˜+1
Step 3. Compute π‘”π‘˜+1 , π›Ώπ‘˜+1 and π›½π‘˜+1
such as
1
π›Ώπ‘˜+1
DS-HSDY
π‘”π‘˜+1 + π›½π‘˜+1
π‘‘π‘˜ .
Set π‘˜ := π‘˜ + 1 and go to Step 1.
(14)
(15)
On the other hand, it follows from (12) that
πœ—π‘˜ = 2 {𝑓 (π‘₯π‘˜ ) − 𝑓 (π‘₯π‘˜−1 )} + [𝑔 (π‘₯π‘˜ ) + 𝑔 (π‘₯π‘˜−1 )] π‘ π‘˜−1 .
π‘‘π‘˜+1 = −
(13)
otherwise, a stationary point has been found, and define the
two following important quantities:
𝑇
𝑇
∀π‘˜ ≥ 1,
The quantity π‘žπ‘˜ shows the size of π‘‘π‘˜ , where π›Ύπ‘˜ is a quantity
showing the descent degree of π‘‘π‘˜ . In fact, if π›Ύπ‘˜ > 0, π‘‘π‘˜ is a
descent direction. Furthermore, if π›Ύπ‘˜ ≥ 𝐢 for some constant
𝐢 > 0, then we have the sufficient descent condition
∗
π‘”π‘˜π‘‡ π‘¦π‘˜−1
,
∗ 𝑇𝑑
π›Ώπ‘˜ π‘¦π‘˜−1
π‘˜−1
max {πœ—π‘˜ , 0}
σ΅„©σ΅„©
σ΅„©2 π‘ π‘˜−1 ,
σ΅„©σ΅„©π‘ π‘˜−1 σ΅„©σ΅„©σ΅„©
π‘”π‘˜ =ΜΈ 0,
(8)
where
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©
=
,
∗ 𝑇𝑑
π›Ώπ‘˜ π‘¦π‘˜−1
π‘˜−1
2. Self-Adjusting Property
In this section, we prove that the DS-HSDY method possesses
a self-adjusting property. To begin with, we assume that
𝑇
with π›Ώπ‘˜ = π‘¦π‘˜−1
π‘ π‘˜−1 /β€–π‘ π‘˜−1 β€–2 , π‘ π‘˜−1 = π‘₯π‘˜ − π‘₯π‘˜−1 . The numerical
experiments show that this simple preconditioning technique
benefits to its performance.
In this paper, based on a new conjugate condition [9],
we propose a new hybrid spectral conjugate gradient method
with π›½π‘˜ defined by
π›½π‘˜DS-HSDY = max {0, min {π›½π‘˜DSDY , π›½π‘˜DSHS }} ,
The rest of the paper is organized as follows. In the next
section, we show that the DS-HSDY method possesses a
self-adjusting property. In Section 3, we establish its global
convergence result under the standard Wolfe line search
conditions. Section 4 gives some numerical results on a set
of large-scale unconstrained test problems in CUTEr to
illustrate the convergence and efficiency of the proposed
method. Finally we have a Conclusion section.
(12)
π‘‘π‘˜ +
1
𝑔 = π›½π‘˜DS-HSDY π‘‘π‘˜−1 .
π›Ώπ‘˜ π‘˜
(16)
Hence
1 σ΅„© σ΅„©2
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©2 2
DS-HSDY 2 σ΅„©
) σ΅„©σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„© − π‘”π‘˜π‘‡ π‘‘π‘˜ − 2 σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© .
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„© = (π›½π‘˜
π›Ώπ‘˜
π›Ώπ‘˜
(17)
Combining |π›½π‘˜DS-HSDY | ≤ |π›½π‘˜DSDY | ≤ |π›½π‘˜SDY | with (17) yields
1 σ΅„© σ΅„©2
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©2 2
DS-HSDY 2 σ΅„©
) σ΅„©σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„© − π‘”π‘˜π‘‡ π‘‘π‘˜ − 2 σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„©
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„© = (π›½π‘˜
π›Ώπ‘˜
π›Ώπ‘˜
2σ΅„©
1 σ΅„© σ΅„©2
σ΅„©2 2
≤ (π›½π‘˜SDY ) σ΅„©σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„© − π‘”π‘˜π‘‡ π‘‘π‘˜ − 2 σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© .
π›Ώπ‘˜
π›Ώπ‘˜
(18)
Dividing both sides of (18) by (π‘”π‘˜π‘‡ π‘‘π‘˜ )2 and using (7), we
obtain
σ΅„©2
σ΅„© σ΅„©2
σ΅„©σ΅„©
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„©
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
2 1
1 σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„©
≤
−
−
. (19)
2
2
π›Ώπ‘˜ π‘”π‘˜π‘‡ π‘‘π‘˜ π›Ώπ‘˜2 (𝑔𝑇 π‘‘π‘˜ )2
𝑇 𝑑
(π‘”π‘˜π‘‡ π‘‘π‘˜ )
(π‘”π‘˜−1
)
π‘˜
π‘˜−1
It follows from (19) and the definitions of π‘žπ‘˜ and π›Ύπ‘˜ that
1 2
1 1
π‘žπ‘˜ ≤ π‘žπ‘˜−1 + σ΅„© σ΅„©2 − σ΅„© σ΅„©2 2 .
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„© π›Ύπ‘˜ σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„© π›Ύπ‘˜
σ΅„© σ΅„©
σ΅„© σ΅„©
(20)
Abstract and Applied Analysis
3
Additionally, we assume that there exist positive constants
𝛾 and 𝛾 such that
σ΅„© σ΅„©
0 < 𝛾 ≤ σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© ≤ 𝛾, ∀π‘˜ ≥ 1,
(21)
Theorem 2. Consider the method (2), (8) and (12), where π‘‘π‘˜ is
a descent direction. If (21) holds, there exist positive constants
πœ‰1 , πœ‰2 , and πœ‰3 such that relations
πœ‰1
,
√π‘˜
σ΅„©σ΅„© σ΅„©σ΅„©2 πœ‰2
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„© ≥ ,
π‘˜
π›Ύπ‘˜ ≥
πœ‰3
√π‘˜
0.9
0.8
0.7
then we have the following result.
−π‘”π‘˜π‘‡ π‘‘π‘˜ ≥
1
0.6
𝑃 0.5
0.4
0.3
(22)
0.2
0.1
(23)
0
1
1.1
1.2
1.3
1.4
1.5
𝑑
(24)
hold for all π‘˜ ≥ 1.
PRP+
HSDY
S-HSDY
DS-HSDY
Figure 1: Performance profiles for CPU time.
Proof. Summing (20) over the iterates and noting that 𝑑1 =
−𝑔1 , we get
π‘˜
1
2
1
π‘žπ‘˜ ≥ ∑ σ΅„© σ΅„©2 ( − 2 ) .
σ΅„©
σ΅„©
𝛾𝑖 𝛾𝑖
𝑖=1 σ΅„©
󡄩𝑔𝑖 σ΅„©σ΅„©
(25)
Therefore, by Theorems 2 and 3, it was shown that DSHSDY method possesses a self-adjusting property which is
independent of the line search and the function convexity.
Since π‘žπ‘˜ ≥ 0, it follows from (25) that
π‘˜−1
1
1
2
1
2
1
σ΅„©σ΅„© σ΅„©σ΅„©2 ( 𝛾 − 𝛾2 ) ≤ ∑ σ΅„©σ΅„© σ΅„©σ΅„©2 ( 𝛾 − 𝛾2 ) .
𝑖
𝑖
󡄩󡄩𝑔𝑖 σ΅„©σ΅„©
𝑖
𝑖
𝑖=1 σ΅„©
󡄩𝑔𝑖 σ΅„©σ΅„©
(26)
Equations (21), (26), and 2/𝛾𝑖 − 1/𝛾𝑖2 ≤ 1 yield
1
2 𝛾2
−
−
(π‘˜ − 1) ≤ 0.
π›Ύπ‘˜2 π›Ύπ‘˜ 𝛾2
(27)
(28)
Thus (24) holds with πœ‰3 = 𝛾/2𝛾.
Noting that −π‘”π‘˜π‘‡ π‘‘π‘˜ = β€–π‘”π‘˜ β€–2 π›Ύπ‘˜ and β€–π‘‘π‘˜ β€– ≥ β€–π‘”π‘˜ β€–π›Ύπ‘˜ , it is easy
to derive that (22) and (23) hold with πœ‰1 = πœ‰3 𝛾2 and πœ‰2 = πœ‰32 𝛾2 ,
respectively. Hence the proof is complete.
Theorem 3. Consider the method (2), (8), and (12), where π‘‘π‘˜
is a descent direction. If (21) holds, then for any 𝑝 ∈ (0, 1),
there exist constants πœ‰4 , πœ‰5 , and πœ‰6 > 0 such that, for any k, the
relations
−𝑔𝑖𝑇𝑑𝑖 ≥ πœ‰4 ,
σ΅„©σ΅„© σ΅„©σ΅„©2
󡄩󡄩𝑑𝑖 σ΅„©σ΅„© ≥ πœ‰5 ,
πœ‰
𝛾𝑖 ≥ 6
√π‘˜
hold for at least [π‘π‘˜] values of 𝑖 ∈ [1, π‘˜].
3. Global Convergence
Throughout the paper, we assume that the following assumptions hold.
Furthermore, we have
2𝛾
𝛾2
𝛾2
1
≤ 1 + √ 1 + 2 (π‘˜ − 1) ≤ 1 + 2 √π‘˜ ≤ √π‘˜.
π›Ύπ‘˜
𝛾
𝛾
𝛾
Proof. The proof is similar to the Theorem 2 in [10], so we
omit it here.
(29)
Assumption 1. (1) 𝑓 is bounded below in the level set L =
{π‘₯ ∈ 𝑅𝑛 : 𝑓(π‘₯) ≤ 𝑓(π‘₯1 )};
(2) in a neighborhood N of L, 𝑓 is differentiable and
its gradient 𝑔 is Lipschitz continuous; namely, there exists a
constant 𝐿 > 0 such that
󡄩󡄩󡄩𝑔 (π‘₯) − 𝑔 (𝑦)σ΅„©σ΅„©σ΅„© ≤ 𝐿 σ΅„©σ΅„©σ΅„©π‘₯ − 𝑦󡄩󡄩󡄩 , ∀π‘₯, 𝑦 ∈ N.
(30)
σ΅„©
σ΅„©
σ΅„©
σ΅„©
Under Assumption 1 on 𝑓, we could get a useful lemma.
Lemma 4. Suppose that π‘₯1 is a starting point for which
Assumption 1 holds. Consider any method in the form (2),
where π‘‘π‘˜ is a descent direction and π›Όπ‘˜ satisfies the weak Wolfe
conditions; then one has that
2
(π‘”π‘˜π‘‡ π‘‘π‘˜ )
(31)
∑ σ΅„© σ΅„©2 < +∞.
σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
π‘˜≥1 σ΅„©
σ΅„© σ΅„©
For DS-HSDY method, one has the following global convergence result.
Theorem 5. Suppose that π‘₯1 is a starting point for which
Assumption 1 hold. Consider DS-HSDY method; if π‘”π‘˜ =ΜΈ 0 for
all π‘˜ ≥ 1, then one has that
π‘”π‘˜π‘‡ π‘‘π‘˜ < 0 ∀π‘˜ ≥ 1.
(32)
4
Abstract and Applied Analysis
Table 1: Numerical results for PRP+ method.
𝑛
10000
10000
10000
10000
10000
5000
5000
5000
10000
10000
10000
10000
10000
10000
5000
5000
5000
5000
5000
5000
5000
10000
10000
10000
10000
10000
10000
10000
10000
10000
5000
10000
Function
Quadratic QF2
Extended EP1
Extended Tridiagonal 2
ARGLINA
ARWHEAD
BDQRTIC
BDEXP
BRYBND
COSINE
CRAGGLVY
DIXMAANA
DIXMAANB
DIXMAANC
DIXMAAND
DIXMAANE
DIXMAANF
DIXMAANG
DIXMAANH
DIXMAANI
DIXMAANJ
DIXMAANK
DQDRTIC
DQRTIC
EDENSCH
EG2
ENGVAL1
EXTROSNB
FREUROTH
LIARWHD
NONDIA
NONDQUAR
NONSCOMP
NI
2227
4
39
5
7
157
6
5
21
129
6
8
11
13
558
558
519
379
593
492
653
11
33
26
209
30
29
61
25
9
1786
5001
(33)
π›½π‘˜DS-HSDY = πœ† π‘˜ π›½π‘˜SDY =
Proof. Since 𝑑1 = −𝑔1 , it is obvious that
< 0. Assume
𝑇
π‘‘π‘˜−1 < 0. By (10) and the definition of the π‘¦π‘˜∗ , we
that π‘”π‘˜−1
𝑇
∗
𝑇
have π‘‘π‘˜−1
π‘¦π‘˜−1
≥ π‘‘π‘˜−1
π‘¦π‘˜−1 > 0, then π›½π‘˜DSDY > 0. In addition,
from (8), we have
0≤
≤
π›½π‘˜DSDY
≤
π›½π‘˜SDY .
(34)
Let πœ† π‘˜ =
then we have 0 ≤ πœ† π‘˜ ≤ 1. By (12)
with π‘˜ + 1 replaced by π‘˜, and multiplying it by π‘”π‘˜ , we have
π›½π‘˜DS-HSDY /π›½π‘˜SDY ,
π‘”π‘˜π‘‡ π‘‘π‘˜ =
𝑇
π‘”π‘˜−1
π‘‘π‘˜−1
1) π‘”π‘˜π‘‡ π‘‘π‘˜−1
+ (πœ† π‘˜ −
𝑇 𝑦
π›Ώπ‘˜ π‘‘π‘˜−1
π‘˜−1
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„© .
(35)
πœ† π‘˜ π‘”π‘˜π‘‡ π‘‘π‘˜
𝑇 𝑑
𝑇
π‘”π‘˜−1
π‘˜−1 + (πœ† π‘˜ − 1) π‘”π‘˜ π‘‘π‘˜−1
𝑔𝑇 𝑑
= πœ‰π‘˜ 𝑇 π‘˜ π‘˜ ,
π‘”π‘˜−1 π‘‘π‘˜−1
𝑔1𝑇 𝑑1
π›½π‘˜DS-HSDY
‖𝑔(π‘₯)β€–∞
9.98𝐸 − 07
6.09𝐸 − 13
9.29𝐸 − 07
1.95𝐸 − 07
3.10𝐸 − 07
1.47𝐸 − 04
1.72𝐸 − 07
2.60𝐸 − 07
9.60𝐸 − 07
5.35𝐸 − 06
4.20𝐸 − 07
6.72𝐸 − 07
3.38𝐸 − 08
1.32𝐸 − 07
9.96𝐸 − 07
8.65𝐸 − 07
5.99𝐸 − 07
8.97𝐸 − 07
7.51𝐸 − 07
5.08𝐸 − 07
9.61𝐸 − 07
9.59𝐸 − 08
3.44𝐸 − 07
9.10𝐸 − 06
1.10𝐸 − 03
1.37𝐸 − 06
3.77𝐸 − 08
2.33𝐸 − 07
6.54𝐸 − 09
1.03𝐸 − 09
7.96𝐸 − 07
3.47𝐸 − 06
From this and the formula for π›½π‘˜SDY , we get
Further, the method converges in the sense that
σ΅„© σ΅„©
lim inf σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© = 0.
π‘˜→∞
𝑇 (0.01 S)
2016
3
47
4
21
526
7
1215
39
444
19
26
38
44
712
525
684
2469
755
651
863
19
42
78
473
21
25
81
36
17
1752
9751
Nfg
2885
7
98
15
14
720
8
11
45
250
12
16
23
29
799
598
784
3488
854
751
979
23
57
90
1426
93
63
145
49
17
3258
6799
(36)
where
πœ‰π‘˜ =
πœ†π‘˜
,
1 + (πœ† π‘˜ − 1) π‘™π‘˜−1
(37)
π‘”π‘˜π‘‡ π‘‘π‘˜−1
.
𝑇 𝑑
π‘”π‘˜−1
π‘˜−1
(38)
π‘™π‘˜−1 =
At the same time, if we define
πœπ‘˜ =
1 + (πœ† π‘˜ − 1) π‘™π‘˜−1
,
π‘™π‘˜−1 − 1
(39)
Abstract and Applied Analysis
5
Table 2: Numerical results for HSDY method.
𝑛
10000
10000
10000
10000
10000
5000
5000
5000
10000
10000
10000
10000
10000
10000
5000
5000
5000
5000
5000
5000
5000
10000
10000
10000
10000
10000
10000
10000
10000
10000
5000
10000
Function
Quadratic QF2
Extended EP1
Extended Tridiagonal 2
ARGLINA
ARWHEAD
BDQRTIC
BDEXP
BRYBND
COSINE
CRAGGLVY
DIXMAANA
DIXMAANB
DIXMAANC
DIXMAAND
DIXMAANE
DIXMAANF
DIXMAANG
DIXMAANH
DIXMAANI
DIXMAANJ
DIXMAANK
DQDRTIC
DQRTIC
EDENSCH
EG2
ENGVAL1
EXTROSNB
FREUROTH
LIARWHD
NONDIA
NONDQUAR
NONSCOMP
NI
1593
4
34
5
13
171
6
5
21
109
5
9
10
13
446
389
552
202
365
444
367
8
37
30
305
30
27
143
32
7
2049
58
Nfg
1902
7
55
15
58
567
8
11
46
255
10
18
21
29
541
876
660
5106
450
532
452
17
68
99
2811
52
54
283
62
14
3730
100
𝑇 (0.01 S)
1876
4
32
4
71
422
6
1222
41
434
16
29
33
45
493
690
602
3417
409
484
410
14
48
85
879
21
21
177
44
14
2011
98
‖𝑔(π‘₯)β€–∞
7.11𝐸 − 07
6.09𝐸 − 13
9.40𝐸 − 07
1.95𝐸 − 07
4.42𝐸 − 07
6.31𝐸 − 04
1.72𝐸 − 07
2.60𝐸 − 07
8.02𝐸 − 07
1.45𝐸 − 06
5.13𝐸 − 07
2.21𝐸 − 07
5.42𝐸 − 07
1.14𝐸 − 07
9.24𝐸 − 07
9.60𝐸 − 07
9.85𝐸 − 07
4.05𝐸 − 04
9.95𝐸 − 07
4.95𝐸 − 07
9.77𝐸 − 07
6.35𝐸 − 07
3.40𝐸 − 07
7.97𝐸 − 07
2.08𝐸 − 03
8.62𝐸 − 07
6.20𝐸 − 09
8.01𝐸 − 07
1.75𝐸 − 07
4.57𝐸 − 07
6.68𝐸 − 07
5.03𝐸 − 07
Since π‘‘π‘˜ + (1/π›Ώπ‘˜ )π‘”π‘˜ = π›½π‘˜DS-HSDY π‘‘π‘˜−1 , we have that
it follows from (39) that
π‘”π‘˜π‘‡ π‘‘π‘˜ =
πœπ‘˜ σ΅„©σ΅„© σ΅„©σ΅„©2
󡄩𝑔 σ΅„© .
π›Ώπ‘˜ σ΅„© π‘˜ σ΅„©
(40)
Then we have by (10), with π‘˜ replaced by π‘˜ − 1, that
π‘™π‘˜−1 ≤ 𝜎.
(41)
Furthermore, we have
1 + (πœ† π‘˜ − 1) π‘™π‘˜−1 ≥ 1 + (−
1−𝜎
1−𝜎
− 1) 𝜎 =
.
1+𝜎
1+𝜎
(42)
The above relation, (40), (41), and the fact that 𝜎 < 1 imply
that π‘”π‘˜π‘‡ π‘‘π‘˜ < 0. Thus by induction, (32) holds.
We now prove (33) by contradiction and assume that
there exists some constant 𝛾 > 0 such that
σ΅„©σ΅„© σ΅„©σ΅„©
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„© ≥ 𝛾
∀π‘˜ ≥ 1.
(43)
1 σ΅„© σ΅„©2
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©2 2
DS-HSDY 2 σ΅„©
) σ΅„©σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„© − π‘”π‘˜π‘‡ π‘‘π‘˜ − 2 σ΅„©σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©σ΅„© .
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„© = (π›½π‘˜
π›Ώπ‘˜
π›Ώπ‘˜
(44)
Dividing both sides of (44) by (π‘”π‘˜π‘‡ π‘‘π‘˜ )2 and using (36) and
(40), we obtain
σ΅„©σ΅„©
σ΅„©2
σ΅„©σ΅„© σ΅„©σ΅„©2
π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„©
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
1
2
1
2 σ΅„©
σ΅„©
= πœ‰π‘˜
− σ΅„© σ΅„©2 ( + 2 )
2
2
𝑇
σ΅„©
σ΅„©
𝜁
𝜁
(π‘”π‘˜−1 π‘‘π‘˜−1 )
(π‘”π‘˜ π‘‘π‘˜ )
π‘˜
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©
π‘˜
(45)
σ΅„©σ΅„©
σ΅„©σ΅„©2
2
󡄩𝑑 σ΅„©
1
1
= πœ‰π‘˜2 σ΅„© π‘˜−1 σ΅„© 2 + σ΅„© σ΅„©2 [1 − (1 + ) ] .
σ΅„©
σ΅„©
𝜁
(π‘”π‘˜−1 π‘‘π‘˜−1 )
π‘˜
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©
In addition, since π‘™π‘˜−1 < 1 and πœ† π‘˜ ≤ 1, we have that (1−πœ† π‘˜ )(1−
π‘™π‘˜−1 ) ≥ 0, or equivalently
1 + (πœ† π‘˜ − 1) π‘™π‘˜−1 ≥ πœ† π‘˜ ,
(46)
6
Abstract and Applied Analysis
Table 3: Numerical results for S-HSDY method.
𝑛
10000
10000
10000
10000
10000
5000
5000
5000
10000
10000
10000
10000
10000
10000
5000
5000
5000
5000
5000
5000
5000
10000
10000
10000
10000
10000
10000
10000
10000
10000
5000
10000
Function
Quadratic QF2
Extended EP1
Extended Tridiagonal 2
ARGLINA
ARWHEAD
BDQRTIC
BDEXP
BRYBND
COSINE
CRAGGLVY
DIXMAANA
DIXMAANB
DIXMAANC
DIXMAAND
DIXMAANE
DIXMAANF
DIXMAANG
DIXMAANH
DIXMAANI
DIXMAANJ
DIXMAANK
DQDRTIC
DQRTIC
EDENSCH
EG2
ENGVAL1
EXTROSNB
FREUROTH
LIARWHD
NONDIA
NONDQUAR
NONSCOMP
NI
1582
4
34
5
13
111
6
5
21
103
5
9
10
13
422
310
410
217
380
359
404
8
37
30
242
29
27
214
27
7
1782
58
which with (37) yields
𝑇 (0.01 S)
1836
3
34
3
75
377
5
1179
39
332
15
30
33
43
468
618
449
4642
417
402
448
14
49
84
570
22
22
260
37
14
1738
100
Nfg
1941
7
55
15
58
526
8
11
46
189
10
18
21
29
514
792
495
6957
450
438
485
17
68
99
1731
124
54
408
54
14
3210
100
‖𝑔(π‘₯)β€–∞
6.58𝐸 − 07
6.09𝐸 − 13
9.40𝐸 − 07
1.95𝐸 − 07
5.60𝐸 − 07
3.39𝐸 − 04
1.72𝐸 − 07
2.60𝐸 − 07
9.72𝐸 − 07
1.94𝐸 − 06
5.13𝐸 − 07
2.21𝐸 − 07
5.42𝐸 − 07
1.14𝐸 − 07
9.73𝐸 − 07
6.77𝐸 − 07
9.83𝐸 − 07
4.13𝐸 − 04
9.96𝐸 − 07
9.95𝐸 − 07
6.67𝐸 − 07
6.35𝐸 − 07
3.41𝐸 − 07
1.54𝐸 − 06
4.25𝐸 − 04
1.78𝐸 − 06
3.98𝐸 − 09
8.08𝐸 − 07
4.45𝐸 − 12
4.58𝐸 − 07
8.99𝐸 − 07
5.03𝐸 − 07
which indicates
󡄨󡄨 󡄨󡄨
σ΅„¨σ΅„¨πœ‰π‘˜ 󡄨󡄨 ≤ 1.
(47)
By (45) and (47), we obtain
σ΅„©σ΅„© σ΅„©σ΅„©2
σ΅„©2
σ΅„©σ΅„©
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
σ΅„©σ΅„©π‘‘π‘˜−1 σ΅„©σ΅„©σ΅„©
1
≤
+ σ΅„© σ΅„©2 .
2
2
𝑇
σ΅„©σ΅„©π‘”π‘˜ σ΅„©σ΅„©
(π‘”π‘˜−1 π‘‘π‘˜−1 )
(π‘”π‘˜ π‘‘π‘˜ )
σ΅„© σ΅„©
(48)
Using (48) recursively and noting that ‖𝑑1 β€–2 = −𝑔1𝑇 𝑑1 =
‖𝑔1 β€–2 ,
σ΅„©σ΅„© σ΅„©σ΅„©2
π‘˜
σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
1
≤
∑
2
σ΅„©σ΅„© σ΅„©σ΅„©2 .
𝑇
(π‘”π‘˜ π‘‘π‘˜ )
𝑖=1 σ΅„©
σ΅„©π‘”π‘˜ σ΅„©σ΅„©
(49)
Then we get from this and (43) that
2
(π‘”π‘˜π‘‡ π‘‘π‘˜ )
πœ†2
≥
,
2
σ΅„©σ΅„©σ΅„©π‘‘π‘˜ σ΅„©σ΅„©σ΅„©
π‘˜
σ΅„© σ΅„©
(50)
2
(π‘”π‘˜π‘‡ π‘‘π‘˜ )
∑ σ΅„© σ΅„©2 = +∞.
σ΅„©π‘‘π‘˜ σ΅„©σ΅„©
π‘˜≥1 σ΅„©
σ΅„© σ΅„©
(51)
This contradicts the Zoutendijk condition (31). Hence we
complete the proof.
4. Numerical Result
In this section, we compare the performance of DS-HSDY
method to PRP+ method [11], HSDY method [1], and SHSDY method [8]. The test problems are taken from CUTEr
(http://hsl.rl.ac.uk/cuter-www/problems.html) with the standard initial points. All codes are written in double precision
Fortran and complied with f77 (default compiler settings)
on a PC (AMD Athlon XP 2500 + CPU 1.84 GHz). Our
line search subroutine computes π›Όπ‘˜ such that the Wolfe
conditions (10) and (11) hold with 𝜌 = 10−4 and 𝜎 = 0.5. We
Abstract and Applied Analysis
7
Table 4: Numerical results for DS-HSDY method.
Function
Quadratic QF2
Extended EP1
Extended Tridiagonal 2
ARGLINA
ARWHEAD
BDQRTIC
BDEXP
BRYBND
COSINE
CRAGGLVY
DIXMAANA
DIXMAANB
DIXMAANC
DIXMAAND
DIXMAANE
DIXMAANF
DIXMAANG
DIXMAANH
DIXMAANI
DIXMAANJ
DIXMAANK
DQDRTIC
DQRTIC
EDENSCH
EG2
ENGVAL1
EXTROSNB
FREUROTH
LIARWHD
NONDIA
NONDQUAR
NONSCOMP
𝑛
10000
10000
10000
10000
10000
5000
5000
5000
10000
10000
10000
10000
10000
10000
5000
5000
5000
5000
5000
5000
5000
10000
10000
10000
10000
10000
10000
10000
10000
10000
5000
10000
NI
1623
4
34
5
13
165
6
5
14
110
5
9
10
13
410
432
476
442
397
445
403
10
35
29
251
29
65
50
47
7
1831
73
use the condition ‖𝑔(π‘₯π‘˜ )β€–∞ ≤ 10−6 or π›Όπ‘˜ π‘”π‘˜π‘‡ π‘‘π‘˜ < 10−20 |𝑓(π‘₯π‘˜ )|
as the stopping criterion. The numerical results are presented
in Tables 1, 2, 3, and 4 with the form NI/Nfg/T, where we
report the dimension of the problem (𝑛), the number of
iteration (NI), the number of function evaluations (Nfg), and
the CPU time (𝑇) in 0.01 seconds.
Figure 1 shows the performance of these test methods
relative to the CPU time, which were evaluated using the
profiles of Dolan and Moré [12]. That is, for each method,
we plot the fraction 𝑃 of problems for which the method
is within a factor 𝑑 of the best time. The top curve is the
method that solved the most problems in a time that was
within a factor 𝑑 of the best time. Clearly, the left side of the
figure gives the percentage of the test problems for which a
method is the fastest. As we can see from Figure 1, DS-HSDY
method has the best performance which performs better than
S-HSDY method, HSDY method, and the well-known PRP+
method.
Nfg
1978
7
55
15
58
448
8
11
38
150
10
18
21
29
493
546
582
1204
467
594
507
21
62
87
1121
50
122
133
94
14
3262
126
𝑇 (0.01 S)
1783
3
30
3
70
324
4
990
28
266
16
27
31
44
430
469
505
7792
408
503
438
17
43
70
381
19
44
67
61
12
1665
119
‖𝑔(π‘₯)β€–∞
9.81𝐸 − 07
6.09𝐸 − 13
1.95𝐸 − 07
5.60𝐸 − 07
5.60𝐸 − 07
2.93𝐸 − 03
1.72𝐸 − 07
2.60𝐸 − 07
5.78𝐸 − 07
8.92𝐸 − 07
5.17𝐸 − 07
2.21𝐸 − 07
5.42𝐸 − 07
1.10𝐸 − 07
9.57𝐸 − 07
3.65𝐸 − 07
5.89𝐸 − 07
4.05𝐸 − 04
9.45𝐸 − 07
9.66𝐸 − 07
9.05𝐸 − 07
1.19𝐸 − 07
9.73𝐸 − 07
5.74𝐸 − 06
4.02𝐸 − 03
4.26𝐸 − 07
7.11𝐸 − 07
1.59𝐸 − 07
1.16𝐸 − 08
4.60𝐸 − 07
9.66𝐸 − 07
5.10𝐸 − 07
5. Conclusion
In this paper, we proposed an efficient hybrid spectral conjugate gradient method with self-adjusting property. Under
some suitable assumptions, we established the global convergence result for the DS-HSDY method. Numerical results
indicated that the proposed method is efficient for large-scale
unconstrained optimization problems.
Acknowledgments
This work was partly supported by the National Natural
Science Foundation of China (no. 61262026), the JGZX program of Jiangxi Province (20112BCB23027), and the science
and technology program of Jiangxi Education Committee
(LDJH12088). The authors would also like to thank the
editor and an anonymous referees for their comments and
8
suggestions on the first version of the paper, which led to
significant improvements of the presentation.
References
[1] Y. H. Dai and Y. Yuan, “An efficient hybrid conjugate gradient
method for unconstrained optimization,” Annals of Operations
Research, vol. 103, pp. 33–47, 2001.
[2] Y. H. Dai and Y. Yuan, “A nonlinear conjugate gradient method
with a strong global convergence property,” SIAM Journal on
Optimization, vol. 10, no. 1, pp. 177–182, 1999.
[3] Z. X. Wei, G. Y. Li, and L. Q. Qi, “Global convergence of
the Polak-RibieΜ€re-Polyak conjugate gradient method with an
Armijo-type inexact line search for nonconvex unconstrained
optimization problems,” Mathematics of Computation, vol. 77,
no. 264, pp. 2173–2193, 2008.
[4] W. W. Hager and H. Zhang, “A new conjugate gradient method
with guaranteed descent and an efficient line search,” SIAM
Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[5] L. Zhang, W. Zhou, and D.-H. Li, “A descent modified PolakRibieΜ€re-Polyak conjugate gradient method and its global convergence,” IMA Journal of Numerical Analysis, vol. 26, no. 4, pp.
629–640, 2006.
[6] G. Yuan, “Modified nonlinear conjugate gradient methods
with sufficient descent property for large-scale optimization
problems,” Optimization Letters, vol. 3, no. 1, pp. 11–21, 2009.
[7] G. Yu, L. Guan, and W. Chen, “Spectral conjugate gradient
methods with sufficient descent property for large-scale unconstrained optimization,” Optimization Methods and Software, vol.
23, no. 2, pp. 275–293, 2008.
[8] G. Yu, Nonlinear self-scaling conjugate gradient methods for
large-scale optimization problems [Ph.D. thesis], Sun Yat-Sen
University, Guangzhou, China, 2007.
[9] G. Li, C. Tang, and Z. Wei, “New conjugacy condition and
related new conjugate gradient methods for unconstrained optimization,” Journal of Computational and Applied Mathematics,
vol. 202, no. 2, pp. 523–539, 2007.
[10] Y.-H. Dai, “New properties of a nonlinear conjugate gradient
method,” Numerische Mathematik, vol. 89, no. 1, pp. 83–98, 2001.
[11] J. C. Gilbert and J. Nocedal, “Global convergence properties of
conjugate gradient methods for optimization,” SIAM Journal on
Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[12] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming,
vol. 91, no. 2, pp. 201–213, 2002.
Abstract and Applied Analysis
Advances in
Operations Research
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Advances in
Decision Sciences
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Mathematical Problems
in Engineering
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Journal of
Algebra
Hindawi Publishing Corporation
http://www.hindawi.com
Probability and Statistics
Volume 2014
The Scientific
World Journal
Hindawi Publishing Corporation
http://www.hindawi.com
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
International Journal of
Differential Equations
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Volume 2014
Submit your manuscripts at
http://www.hindawi.com
International Journal of
Advances in
Combinatorics
Hindawi Publishing Corporation
http://www.hindawi.com
Mathematical Physics
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Journal of
Complex Analysis
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
International
Journal of
Mathematics and
Mathematical
Sciences
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com
Stochastic Analysis
Abstract and
Applied Analysis
Hindawi Publishing Corporation
http://www.hindawi.com
Hindawi Publishing Corporation
http://www.hindawi.com
International Journal of
Mathematics
Volume 2014
Volume 2014
Discrete Dynamics in
Nature and Society
Volume 2014
Volume 2014
Journal of
Journal of
Discrete Mathematics
Journal of
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Applied Mathematics
Journal of
Function Spaces
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Optimization
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Download