Appl. Math. J. Chinese Univ. Ser. B
2006, 21(3): 263-275
A TRUST REGION METHOD WITH A CONIC MODEL FOR
NONLINEARLY CONSTRAINED OPTIMIZATION
Wang Chengjing
Abstract. Trust region methods are powerful and effective optimization methods. The conic model method is a newer type of method that makes more information available at each iteration than standard quadratic-model methods. The advantages of these two methods can be combined to form a more powerful method for constrained optimization. The trust region subproblem of our method is to minimize a conic function subject to the linearized constraints and a trust region bound. At the same time, the new algorithm still possesses robust global properties. The global convergence of the new algorithm under standard conditions is established.
§1 Introduction
In this paper we consider a general optimization problem with nonlinear equality constraints
$$\min_{x \in \mathbb{R}^n} f(x) \qquad (1.1)$$
$$\text{s.t. } c(x) = 0, \qquad (1.2)$$
where $c(x) = (c_1(x), \ldots, c_m(x))^T$ and $f(x)$, $c_i(x)$ are twice continuously differentiable. The Lagrangian function for problem (1.1)-(1.2) is defined as
$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i c_i(x),$$
where $\lambda_i$, $i = 1, \ldots, m$, are the Lagrange multipliers. We use the notation $g(x) = \nabla f(x)$, and $A(x) = (a_1(x), \ldots, a_m(x)) = (\nabla c_1(x), \ldots, \nabla c_m(x))$ is an $n \times m$ matrix. The constraint gradients $\nabla c_i(x)$ are assumed to be linearly independent for all $x$. For any $x$, let $N(x)$ be an $n \times (n-m)$ matrix whose columns form a basis for the null space of $A(x)^T$. Throughout this paper we write $A(x_k) = A_k$, $g(x_k) = g_k$, $c(x_k) = c_k$, $N(x_k) = N_k$ at the $k$th iteration.
Received: 2006-02-23.
MR Subject Classification: 90C30.
Keywords: trust region method, conic model, constrained optimization, nonlinear programming.
The sequential quadratic programming (SQP) method for (1.1)-(1.2) computes a search direction by minimizing a quadratic model of the Lagrangian subject to the linearized constraints. That is, at the $k$th iteration, the subproblem
$$\min_{d \in \mathbb{R}^n} g_k^T d + \frac{1}{2} d^T B_k d$$
$$\text{s.t. } A_k^T d + c_k = 0$$
is solved to obtain a search direction $d_k$, where $x_k$ is the current iterate and $B_k$ is a symmetric approximation to the Hessian $\nabla_{xx} L(x_k, \lambda_k)$ of the Lagrangian of problem (1.1)-(1.2). The next iterate has the form
$$x_{k+1} = x_k + \alpha_k d_k,$$
where $\alpha_k > 0$ is a step length satisfying certain line search conditions (see [1-4]).
Trust region methods for problem (1.1)-(1.2) have been studied by many researchers, including Vardi [5], Byrd, Schnabel and Shultz [6], Toint [7], Zhang and Zhu [8], Powell and Yuan [3], and El-Alem [9]. It is especially worth mentioning that the book of Conn, Gould and Toint [10] is an excellent and comprehensive reference on trust region methods. Trust region methods are robust, can be applied to ill-conditioned problems, and have strong global convergence properties. Another advantage of trust region methods is that there is no need to require the approximate Hessian of the trust region subproblem to be positive definite. For unconstrained problems, Nocedal and Yuan [11] showed that a trust region trial step is always a descent direction for any approximate Hessian. It is well known that for line search methods one generally has to assume the approximate Hessian to be positive definite in order to ensure that the search direction is a descent direction.
The collinear scaling of variables and the conic model method for unconstrained optimization were first studied by Davidon [12]. Sorensen [13] published detailed results on a class of conic model methods and proved that a particular member of this class has Q-superlinear convergence. Ariyawansa [14] modified the derivation of Sorensen [13] and established the duality between the collinear scaling BFGS and DFP methods. Ariyawansa and Lau [15] derived the collinear scaling Broyden family and established its superlinear convergence results. Sheng [16] further studied the interpolation properties of conic model methods. Sun [17] analyzed several non-quadratic model methods and pointed out that the collinear scaling method is one of the nonlinear scaling methods with scale invariance. A typical conic model for unconstrained optimization is
$$\psi_k(s) = \frac{g_k^T s}{1 - h_k^T s} + \frac{1}{2} \frac{s^T B_k s}{(1 - h_k^T s)^2},$$
which is an approximation to $f(x_k + s) - f(x_k)$ and coincides with the objective function up to first order, where $B_k$ is an approximate Hessian of $f(x)$ at $x_k$. The vector $h_k$ is the vector associated with the collinear scaling at the $k$th iteration, and is normally called the horizontal vector. If $h_k = 0$, the conic model reduces to a quadratic model.
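To make the shape of this model concrete, the following small Python sketch (an illustration added here, not part of the paper's algorithm) evaluates $\psi_k(s)$ for given $g_k$, $B_k$, $h_k$; with $h_k = 0$ it returns exactly the quadratic model value.

import numpy as np

def conic_model(s, g, B, h):
    """Evaluate psi(s) = g's/(1-h's) + (1/2) s'Bs/(1-h's)^2.

    Requires 1 - h's > 0, which holds over the trust region when
    ||h|| * Delta < 1 (cf. assumption A3 below)."""
    gamma = 1.0 - h @ s                      # collinear scaling factor 1 - h^T s
    assert gamma > 0.0, "model undefined: 1 - h^T s must stay positive"
    return (g @ s) / gamma + 0.5 * (s @ B @ s) / gamma ** 2

# With h = 0 the conic model reduces to the usual quadratic model:
g = np.array([1.0, -2.0]); B = np.eye(2); s = np.array([0.3, 0.4])
quad = g @ s + 0.5 * (s @ B @ s)
assert np.isclose(conic_model(s, g, B, np.zeros(2)), quad)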
Many researchers have explored trust region methods based on conic models; for details, one may refer to [18-21], etc. In this paper, we also study this topic. However, when we solve for the trial step, we adopt the approach of splitting the step into a normal step and a tangential step, which is quite effective. The subproblem at the $k$th iteration is now
$$\min_s \psi_k(s) = \frac{g_k^T s}{1 - h_k^T s} + \frac{1}{2} \frac{s^T B_k s}{(1 - h_k^T s)^2} \qquad (1.3)$$
$$\text{s.t. } A_k^T s + \theta_k c_k = 0 \qquad (1.4)$$
$$\|s\| \le \Delta_k, \qquad (1.5)$$
where $\Delta_k$ is the trust region radius and $\theta_k \in (0, 1]$ is a relaxation parameter, chosen so that the feasible set defined by (1.4) and (1.5) is not empty.
Throughout this paper, $\|\cdot\|$ denotes the 2-norm.
The organization of this paper is as follows. In the following section, the derivation of our
method and a description of our algorithm are presented. The global convergence properties
are studied in §3. A short discussion is given in §4.
§2 The algorithm
First, according to constraints (1.4) and (1.5), we define the region
$$\mathcal{F}(x, \Delta, \theta) \stackrel{\rm def}{=} \{s \mid A(x)^T s + \theta c(x) = 0 \text{ and } \|s\| \le \Delta\}.$$
Its feasible points are the vectors satisfying the shifted linearized constraints whose norm is no greater than $\Delta$; in particular, when $c(x) = 0$ these are exactly the vectors in the null space of $A(x)^T$ with norm at most $\Delta$.
If $A(x)$ is of full column rank and an $\ell_2$-norm trust region is used, it is easy to show that the largest such $\theta$ is
$$\theta^{\max} = \min\Big[1, \frac{\Delta}{\|n^c(x)\|}\Big], \quad \text{where } n^c(x) \stackrel{\rm def}{=} -A(x)(A(x)^T A(x))^{-1} c(x).$$
The vector $n^c(x)$ is the minimal-$\ell_2$-norm solution of $A(x)^T n + c(x) = 0$, and $\theta^{\max} n^c(x)$ is the closest point to the linearized constraints within the trust region. If $\theta^{\max} < 1$, picking $\theta = \theta^{\max}$ reduces $\mathcal{F}(x, \Delta, \theta)$ to the single point $\theta^{\max} n^c(x)$.
In practice, it pays to have some "elbow room" in which to allow both the normal and tangential components to move. Thus, a value $\theta < \theta^{\max}$ should be chosen if $\theta^{\max} < 1$. It is also important not to allow $\theta$ to be too much smaller than $\theta^{\max}$, as otherwise there may be little progress towards feasibility. Furthermore, when the problem involves a large number of variables, it is wasteful to compute $n^c(x)$ exactly, and we might expect to get away with a suitable approximation $n^c$.
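As an illustration (a sketch under the assumption of dense linear algebra; note that numpy's lstsq returns the minimal-norm solution of an underdetermined consistent system), the following computes $n^c(x)$ and $\theta^{\max}$:

import numpy as np

def normal_step_and_theta_max(A, c, Delta):
    """Minimal-norm solution n_c of A^T n + c = 0 and the largest theta
    for which F(x, Delta, theta) is nonempty.

    A is the n-by-m matrix of constraint gradients (full column rank)."""
    n_c = -np.linalg.lstsq(A.T, c, rcond=None)[0]   # minimal-norm solution
    norm_nc = np.linalg.norm(n_c)
    theta_max = 1.0 if norm_nc == 0.0 else min(1.0, Delta / norm_nc)
    return n_c, theta_max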
Now we shall formally impose the following conditions on the trial step and its scaling.

A1 There is a constant $\kappa_{\rm bsc} > 0$ for which the trial step $n_k^c$ satisfies
$$A(x_k)^T n_k^c + c(x_k) = 0 \qquad (2.1)$$
and
$$\|n_k^c\| \le \kappa_{\rm bsc} \|c(x_k)\| \qquad (2.2)$$
for all $k$.

A2 The scaling factor $\theta_k$ satisfies
$$\theta_k \in \Big[\min\Big[1, \frac{\tau \xi^N \Delta_k}{\|n_k^c\|}\Big], \min\Big[1, \frac{\xi^N \Delta_k}{\|n_k^c\|}\Big]\Big] \qquad (2.3)$$
for given parameters $\xi^N, \tau \in (0, 1]$, and the normal step is
$$n_k = \theta_k n_k^c. \qquad (2.4)$$
We note that $n_k = 0$ if $c(x_k) = 0$, since A1 then requires that $n_k^c = 0$.
The requirements in A1 are easy to explain. Firstly, since the step is intended to move towards a feasible point, if possible it should satisfy the linearized constraints. Secondly, since there is (usually) a vast number of different steps satisfying the linearized constraints, we wish to ensure that the particular step we choose is not too large. Notice that the assumption is satisfied by the projection step $n^c(x_k)$ under reasonable assumptions on $A(x_k)$, and is intended to allow other values that are not too remote from this, perhaps ideal, step.
The scaling implied by A2 is also easy to justify. If the trial step is well inside the trust region, $\theta_k = 1$ and $n_k$ will satisfy both the linearized constraints and the trust region bound. On the other hand, if the trial step is outside the trust region, the scaling ensures that
$$\tau \xi^N \Delta_k \le \|n_k\| \le \xi^N \Delta_k,$$
and thus the step is within a fixed fraction of the maximum possible. Reasonable values might be $\xi^N$ in the range [0.8, 1] and $\tau = 0.8$.
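A hedged sketch of an A2-compliant choice follows; taking the upper endpoint of the interval (2.3) is one possibility among many, not a rule prescribed by the paper:

def choose_theta(norm_nc, Delta, xi_N=0.9, tau=0.8):
    """Pick theta_k from (2.3); here we take the upper endpoint, so the
    scaled normal step uses as much of the trust region as A2 allows."""
    if norm_nc == 0.0:          # c(x_k) = 0 forces n_k^c = 0; theta is immaterial
        return 1.0
    return min(1.0, xi_N * Delta / norm_nc)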
Then we turn to the calculation of the tangential step $t_k$, which plays the role of reducing a suitable model within the trust region. Since the normal step has been determined, the subproblem (1.3)-(1.5) is converted to
$$\min_w g_k^T w + \frac{1}{2} w^T B_k w \qquad (2.5)$$
$$\text{s.t. } w = \frac{n_k + t}{1 - h_k^T (n_k + t)} \qquad (2.6)$$
$$\|t\| \le \Delta_k^T, \qquad (2.7)$$
where
$$\Delta_k^T \stackrel{\rm def}{=} \sqrt{\Delta_k^2 - \|n_k\|^2}. \qquad (2.8)$$
Problem (2.5)-(2.7) can therefore be solved by the technique given in [19].
The merit function we apply is the $\ell_2$ penalty function
$$\phi(x) = f(x) + \sigma \|c(x)\|,$$
where $\sigma > 0$ is a penalty parameter.
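It is worth noting (a one-line check, immediate from (2.6), added here for clarity) that the objective of (2.5) is exactly the conic model (1.3) evaluated at $s = n_k + t$:
$$g_k^T w + \frac{1}{2} w^T B_k w = \frac{g_k^T s}{1 - h_k^T s} + \frac{1}{2} \frac{s^T B_k s}{(1 - h_k^T s)^2} = \psi_k(s),$$
so (2.5)-(2.7) is subproblem (1.3)-(1.5) with the normal component of the step held fixed.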
We define
$$m(x, B, \sigma, s) = f(x) + \frac{g^T(x) s}{1 - h^T(x) s} + \frac{1}{2} \frac{s^T B s}{(1 - h^T(x) s)^2} + \sigma \|c(x) + A(x)^T s\|,$$
$$q(x, B, s) = f(x) + \frac{g^T(x) s}{1 - h^T(x) s} + \frac{1}{2} \frac{s^T B s}{(1 - h^T(x) s)^2},$$
$$m^N(x, n) = \|c(x) + A(x)^T n\|,$$
$$m^T(x, B, t) = f(x) + \frac{g^T(x)(n + t)}{1 - h^T(x)(n + t)} + \frac{1}{2} \frac{(n + t)^T B (n + t)}{[1 - h^T(x)(n + t)]^2},$$
$$\delta m = m(x, B, \sigma, 0) - m(x, B, \sigma, s), \quad \delta m^N = m^N(x, 0) - m^N(x, n),$$
$$\delta m^T = m^T(x, B, 0) - m^T(x, B, t), \quad \delta q^N = q(x, B, 0) - q(x, B, n).$$
Obviously, $\delta m = \delta m^T + \sigma \delta m^N + \delta q^N$, and we might simply let $\sigma$ be large enough so that
$$\delta m^T + \sigma \delta m^N + \delta q^N \ge 0.$$
Since we prefer to have a significant decrease, we may ensure this by requiring
$$\delta m = \delta m^T + \sigma \delta m^N + \delta q^N \ge \nu \sigma \delta m^N \qquad (2.9)$$
for some $\nu \in (0, 1)$. In particular, we could compute
$$\sigma^c = -\frac{\delta q^N + \delta m^T}{(1 - \nu) \delta m^N} \qquad (2.10)$$
as the smallest value for which (2.9) is satisfied, and replace the current $\sigma$ by $\max[\sigma^c, \tau_1 \sigma, \sigma + \tau_2]$ whenever $\sigma < \sigma^c$, for some appropriate $\tau_1 \ge 1$ and $\tau_2 \ge 0$ for which the product $(\tau_1 - 1)\tau_2 > 0$. The extra terms $\tau_1 \sigma$ and $\sigma + \tau_2$ in the proposed update simply prevent a long sequence of barely increasing penalty parameters. Suitable values might be $\nu = 0.0001$, $\tau_1 = 2$ and $\tau_2 = 1$.
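A minimal sketch of this update rule, using the names from (2.9)-(2.10); the guard for $\delta m^N \le 0$ is an assumption of the sketch, since $\sigma^c$ is only defined when $\delta m^N > 0$:

def update_penalty(sigma, dmN, dmT, dqN, nu=1e-4, tau1=2.0, tau2=1.0):
    """Increase sigma, if necessary, so that (2.9) holds with margin nu."""
    if dmN <= 0.0:                                # sigma^c undefined: keep sigma
        return sigma
    sigma_c = -(dqN + dmT) / ((1.0 - nu) * dmN)   # smallest sigma satisfying (2.9)
    if sigma < sigma_c:
        return max(sigma_c, tau1 * sigma, sigma + tau2)
    return sigma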
For the convenience of the statements that follow, we define the actual reduction in the merit function by
$$Ared_k = f(x_k) - f(x_k + s_k) + \sigma_k (\|c_k\| - \|c_k + A_k^T s_k\|),$$
where $s_k$ is a trial step computed by the algorithm at $x_k$, and define the predicted reduction by
$$Pred_k = -\frac{g_k^T s_k}{1 - h_k^T s_k} - \frac{1}{2} \frac{s_k^T B_k s_k}{(1 - h_k^T s_k)^2} + \sigma_k (\|c_k\| - \|c_k + A_k^T s_k\|). \qquad (2.11)$$
Now we formally give a description of our algorithm.
Algorithm 2.1.
Step 0. Initialization. An initial point $x_0$, an initial trust region radius $\Delta_0 > 0$, and an initial penalty parameter $\sigma_{-1} > 0$ are given. The constants $\mu$, $\eta$, $\xi_1$, $\xi_2$, $\xi^N$, $\nu$, $\tau_1$ and $\tau_2$ are also given and satisfy $0 \le \mu \le \eta < 1$, $0 < \xi_1 < 1 < \xi_2$, $0 < \xi^N \le 1$, $0 < \nu < 1$, $\tau_1 > 1$, $\tau_2 > 0$, and $(\tau_1 - 1)\tau_2 > 0$, together with a stopping tolerance $\epsilon > 0$. Compute $f_0$ and $c_0$, and set $k = 0$.
Step 1. Stopping criterion. Compute $f_k$, $g_k$. If $\|N_k^T g_k\| \le \epsilon$ and $\|c_k\| \le \epsilon$, stop.
Step 2. Composite step calculation. Compute the composite step $s_k = n_k + t_k$ as follows:
Step 2a. Normal step calculation. Compute a step $n_k = \theta_k n_k^c$ for which $\|n_k\| \le \xi^N \Delta_k$ is satisfied and A1 and A2 hold.
Step 2b. Tangential step calculation. Solve problem (2.5)-(2.7) to obtain $t_k$.
Step 3. Increase the penalty parameter if necessary. Compute the model decreases $\delta m_k^N$ and $\delta m_k^T$, as well as the change $\delta q_k^N$ in the conic model following the normal step. Find
$$\sigma_k^c = -\frac{\delta q_k^N + \delta m_k^T}{(1 - \nu) \delta m_k^N},$$
and set
$$\sigma_k = \begin{cases} \max[\sigma_k^c, \tau_1 \sigma_{k-1}, \sigma_{k-1} + \tau_2] & \text{if } \sigma_{k-1} < \sigma_k^c, \\ \sigma_{k-1} & \text{otherwise.} \end{cases}$$
Step 4. Acceptance of the trial point. Compute
$$\rho_k = \frac{Ared_k}{Pred_k}.$$
If $\rho_k \ge \mu$, then define $x_{k+1} = x_k + s_k$ and possibly update $B_{k+1}$; otherwise define $x_{k+1} = x_k$.
Step 5. Trust region radius update. Set
$$\Delta_{k+1} = \begin{cases} \max[\Delta_k, \xi_2 \|s_k\|] & \text{if } \rho_k \ge \eta, \\ \xi_1 \|s_k\| & \text{otherwise,} \end{cases}$$
set $k := k + 1$, and go to Step 1.
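To make the flow of Algorithm 2.1 concrete, here is a compact Python sketch of one plausible implementation. It is illustrative only: the tangential subproblem (2.5)-(2.7) is replaced by a crude sampled search along the projected steepest-descent direction (the paper instead uses the technique of [19]), $B_k$ and $h_k$ are held fixed rather than updated, and the solver choices (QR null-space basis, lstsq normal step, the name conic_tr_solve) are assumptions of the sketch.

import numpy as np

def conic_tr_solve(f, g, c, A, x0, h=None, Delta=1.0, sigma=1.0,
                   eps=1e-8, max_iter=500):
    """Sketch of Algorithm 2.1 with fixed B_k and h_k (an assumption)."""
    mu, eta, xi1, xi2 = 0.1, 0.75, 0.5, 2.0   # 0 <= mu <= eta < 1, 0 < xi1 < 1 < xi2
    xiN, nu, tau1, tau2 = 0.9, 1e-4, 2.0, 1.0
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                          # fixed model Hessian (assumption)
    h = np.zeros(x.size) if h is None else h    # fixed horizontal vector (assumption)

    def psi(s, gk):                             # conic model (1.3); needs 1 - h's > 0
        gam = 1.0 - h @ s
        return (gk @ s) / gam + 0.5 * (s @ B @ s) / gam ** 2

    for _ in range(max_iter):
        gk, ck, Ak = g(x), c(x), A(x)           # Ak: n-by-m, columns are grad c_i
        Q = np.linalg.qr(Ak, mode='complete')[0]
        Nk = Q[:, Ak.shape[1]:]                 # orthonormal basis of null(Ak^T)
        if np.linalg.norm(Nk.T @ gk) <= eps and np.linalg.norm(ck) <= eps:
            return x                            # Step 1: stopping criterion
        # Step 2a: scaled minimal-norm normal step (A1, A2)
        nc = -np.linalg.lstsq(Ak.T, ck, rcond=None)[0]
        ncn = np.linalg.norm(nc)
        theta = 1.0 if ncn == 0.0 else min(1.0, xiN * Delta / ncn)
        n = theta * nc
        # Step 2b: crude tangential step, a stand-in for solving (2.5)-(2.7)
        DT = np.sqrt(max(Delta ** 2 - n @ n, 0.0))
        d = -Nk @ (Nk.T @ gk)
        dn = np.linalg.norm(d)
        s = n
        if dn > 0.0 and DT > 0.0:
            for z in np.linspace(0.0, DT / dn, 20)[1:]:
                cand = n + z * d                # sample the conic model along d
                if abs(h @ cand) < 1.0 and psi(cand, gk) < psi(s, gk):
                    s = cand
        # Step 3: penalty update via (2.9)-(2.10)
        dmN = np.linalg.norm(ck) - np.linalg.norm(ck + Ak.T @ n)
        dqN, dmT = -psi(n, gk), psi(n, gk) - psi(s, gk)
        if dmN > 0.0:
            sigma_c = -(dqN + dmT) / ((1.0 - nu) * dmN)
            if sigma < sigma_c:
                sigma = max(sigma_c, tau1 * sigma, sigma + tau2)
        # Steps 4-5: ratio test per (2.11) and radius update
        gain = np.linalg.norm(ck) - np.linalg.norm(ck + Ak.T @ s)
        pred = -psi(s, gk) + sigma * gain
        ared = f(x) - f(x + s) + sigma * gain
        rho = ared / pred if pred > 0.0 else -1.0
        if rho >= mu:
            x = x + s
        ns = np.linalg.norm(s)
        Delta = max(Delta, xi2 * ns) if rho >= eta else xi1 * ns
    return x

# Example: min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0; the solution is (0.5, 0.5).
print(conic_tr_solve(lambda x: x[0] ** 2 + x[1] ** 2,
                     lambda x: 2.0 * x,
                     lambda x: np.array([x[0] + x[1] - 1.0]),
                     lambda x: np.array([[1.0], [1.0]]),
                     np.array([2.0, -3.0])))

The sampled search above only guarantees a Cauchy-type model decrease along the projected steepest-descent direction, which is the weaker condition that §4 notes is already sufficient for global convergence.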
How to choose the scaling vector $h_k$ is one of the key issues of a conic model method. In general, $h_{k+1}$ and $B_{k+1}$ are chosen so that certain generalized quasi-Newton equations are satisfied, which means that the conic model function interpolates both the function values and the gradient values of the objective function at $x_k$ and $x_{k+1}$. Specific details on several updating formulae for $h_k$ and $B_{k+1}$ can be found in [12-14, 19, 20]. The conditions that we assume for proving global convergence are that the matrices $B_k$ are uniformly bounded and
$$\exists \delta_1 \in (0, 1) : \|h_k\| \Delta_k < \delta_1,$$
which ensures that the conic model function $\psi_k(s)$ is bounded over the trust region $\{s \mid \|s\| \le \Delta_k\}$.
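For concreteness, one common way to express these interpolation conditions (a hedged summary; the precise generalized quasi-Newton updates differ among [12-14, 19, 20]) is to require the model at $x_{k+1}$, with $s_k = x_{k+1} - x_k$, to satisfy
$$\psi_{k+1}(0) = 0, \quad \nabla \psi_{k+1}(0) = g_{k+1}, \quad \psi_{k+1}(-s_k) = f(x_k) - f(x_{k+1}), \quad \nabla \psi_{k+1}(-s_k) = g_k,$$
where the first two conditions hold automatically for any conic model of the above form.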
§3 Global convergence
In this section we establish the convergence results of our algorithm given in the previous
section.
First, we make the following assumptions.
A3 $\{x_k\}$ are uniformly bounded and $\|h_k\| \Delta_k$ are bounded above by a constant $\delta_1 \in (0, 1)$.
A4 There is a constant $\kappa_{\rm bns} > 0$ such that for all $k$,
$$0 < \frac{1}{\kappa_{\rm bns}} \le \sigma_{\min}[N(x_k)] \le \sigma_{\max}[N(x_k)] \le \kappa_{\rm bns}.$$
A5 f (x) and ci (x) are twice continuously differentiable in some open set containing all iterates
generated by the algorithm.
A6 f (x) is uniformly bounded from below, g(x) and c(x) are uniformly bounded.
A7 The sequence of second derivative approximations {Bk } is bounded.
Then we give the following lemmas.
Lemma 3.1. Let $s_k$ be the solution of subproblem (2.5)-(2.7). If A1, A3 and A4 hold, then there exist positive constants $\delta_2$ and $\delta_3$ such that
$$\psi_k(0) - \psi_k(s_k) \ge \delta_2 \|N_k^T g_k\| \min\Big[\Delta_k, \frac{\|N_k^T g_k\|}{\|B_k\|}\Big] - \delta_3 (1 + \|B_k\|) \min[\Delta_k, \|c_k\|] \qquad (3.1)$$
for all $k$.
Proof. Define
$$s_k(\zeta) = n_k - \zeta N_k N_k^T g_k;$$
then $s_k(\zeta)$ is feasible for (2.6)-(2.7) for all $\zeta \in [0, \alpha_k]$, where $\alpha_k = \Delta_k^T / \|N_k^T g_k\|$. Therefore, from the definitions of $s_k$ and $s_k(\zeta)$, we have
$$\psi_k(0) - \psi_k(s_k) \ge \psi_k(0) - \psi_k(s_k(\zeta)) \qquad (3.2)$$
for all $\zeta \in [0, \alpha_k]$. Using $h_k^T s_k(\zeta) \le \|h_k\| \Delta_k < \delta_1 < 1$, the Cauchy-Schwarz inequality and A3, we get
$$\psi_k(0) - \psi_k(s_k(\zeta)) = -\frac{g_k^T s_k(\zeta)}{1 - h_k^T s_k(\zeta)} - \frac{1}{2} \frac{s_k(\zeta)^T B_k s_k(\zeta)}{(1 - h_k^T s_k(\zeta))^2}$$
$$\ge -\frac{\|g_k\| \|n_k\|}{1 - \delta_1} - \frac{\|n_k\| \|B_k\| (\|n_k\| + \|N_k\| \Delta_k^T)}{(1 - \delta_1)^2} + \Big[\zeta \frac{\|N_k^T g_k\|^2}{1 + \delta_1} - \zeta^2 \frac{\|N_k^T g_k\|^2 \|B_k\| \|N_k\|^2}{2 (1 - \delta_1)^2}\Big]$$
for all $\zeta \in [0, \alpha_k]$. By calculus and (2.8), we have
$$\max_{0 \le \zeta \le \alpha_k} \Big\{\zeta \frac{\|N_k^T g_k\|^2}{1 + \delta_1} - \zeta^2 \frac{\|N_k^T g_k\|^2 \|B_k\| \|N_k\|^2}{2 (1 - \delta_1)^2}\Big\} \ge \qquad (3.3)$$
$$\frac{\|N_k^T g_k\|^2}{2 (1 + \delta_1)} \min\Big[\alpha_k, \frac{(1 - \delta_1)^2}{(1 + \delta_1) \|B_k\| \|N_k\|^2}\Big] \ge$$
$$\frac{\|N_k^T g_k\|}{2 (1 + \delta_1)} \min\Big[\Delta_k, \frac{(1 - \delta_1)^2 \|N_k^T g_k\|}{(1 + \delta_1) \|B_k\| \|N_k\|^2}\Big] - \frac{\|N_k^T g_k\|}{2 (1 + \delta_1)} \|n_k\|. \qquad (3.4)$$
Hence it follows from (3.2)-(3.4) that
$$\psi_k(0) - \psi_k(s_k) \ge \max_{0 \le \zeta \le \alpha_k} [\psi_k(0) - \psi_k(s_k(\zeta))] \ge$$
$$\frac{\|N_k^T g_k\|}{2 (1 + \delta_1)} \min\Big[\Delta_k, \frac{(1 - \delta_1)^2 \|N_k^T g_k\|}{(1 + \delta_1) \|B_k\| \|N_k\|^2}\Big] - \frac{\|N_k^T g_k\|}{2 (1 + \delta_1)} \|n_k\| - \frac{\|n_k\| [\|B_k\| (\|n_k\| + \|N_k\| \Delta_k^T) + \|g_k\|]}{(1 - \delta_1)^2}.$$
According to A1 and Step 2a of Algorithm 2.1, we have
$$\|n_k\| \le \min[\xi^N \Delta_k, \kappa_{\rm bsc} \|c_k\|]. \qquad (3.5)$$
The boundedness of $x_k$, the updating rule of Step 5 of Algorithm 2.1, inequality (3.5) and A4 imply that $\|n_k\| + \|N_k\| \Delta_k^T \le \kappa_1$ and $\|g_k\| \le \kappa_2$ uniformly. Thus if we let
$$\delta_2 = \min\Big[\frac{(1 - \delta_1)^2}{2 (1 + \delta_1)^2 \kappa_{\rm bns}^2}, \frac{1}{2 (1 + \delta_1)}\Big], \qquad \delta_3 = \frac{\max[\kappa_1, (1 + \kappa_{\rm bns}) \kappa_2] \min[\xi^N, \kappa_{\rm bsc}]}{(1 - \delta_1)^2},$$
then the result (3.1) follows.
Lemma 3.2. Suppose $n_k^c$ and $\theta_k$ satisfy A1 and A2. Then
$$\delta m_k^N = m^N(x_k, 0) - m^N(x_k, n_k) \ge \min\Big[\|c(x_k)\|, \frac{\tau \xi^N}{\kappa_{\rm bsc}} \Delta_k\Big] \qquad (3.6)$$
for all $k$.
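(The paper states Lemma 3.2 without proof; the following one-line verification, supplied here for completeness, uses only A1, A2 and (2.2). Since $A_k^T n_k^c = -c_k$ and $n_k = \theta_k n_k^c$ with $\theta_k \in (0, 1]$,
$$\delta m_k^N = \|c_k\| - \|(1 - \theta_k) c_k\| = \theta_k \|c_k\| \ge \min\Big[1, \frac{\tau \xi^N \Delta_k}{\|n_k^c\|}\Big] \|c_k\| \ge \min\Big[\|c_k\|, \frac{\tau \xi^N}{\kappa_{\rm bsc}} \Delta_k\Big],$$
where the last inequality uses $\|n_k^c\| \le \kappa_{\rm bsc} \|c_k\|$ from (2.2).)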
Lemma 3.3. Suppose A1 and A2 hold. Then there is a constant $\kappa_{\rm bon} > 0$ such that
$$\delta m_k^N \ge \kappa_{\rm bon} \|n_k\| \qquad (3.7)$$
for all $k$.
Proof. The result is trivially true if $c(x_k) = 0$, so we assume that $c(x_k) \ne 0$. First suppose that
$$\Delta_k \le \frac{\kappa_{\rm bsc}}{\tau \xi^N} \|c(x_k)\|;$$
then (3.6) gives
$$\delta m_k^N \ge \frac{\tau \xi^N}{\kappa_{\rm bsc}} \Delta_k \ge \kappa_{\rm bon} \xi^N \Delta_k \ge \kappa_{\rm bon} \|n_k\|,$$
since $\|n_k\| \le \xi^N \Delta_k$, where $\kappa_{\rm bon} \stackrel{\rm def}{=} \tau / \kappa_{\rm bsc}$. On the other hand, if
$$\Delta_k > \frac{\kappa_{\rm bsc}}{\tau \xi^N} \|c(x_k)\|,$$
it follows from (3.6) that
$$\delta m_k^N \ge \|c(x_k)\|. \qquad (3.8)$$
However, the fact that $\theta_k \le 1$ and (2.2) give
$$\|n_k\| = |\theta_k| \|n_k^c\| \le \|n_k^c\| \le \kappa_{\rm bsc} \|c(x_k)\|.$$
Combining this with (3.8) implies
$$\delta m_k^N \ge \frac{1}{\kappa_{\rm bsc}} \|n_k\| \ge \frac{\tau}{\kappa_{\rm bsc}} \|n_k\| = \kappa_{\rm bon} \|n_k\|, \qquad (3.9)$$
since $\tau \le 1$. Thus (3.7) holds in either case, and the lemma is true.
Lemma 3.4. Suppose that A3 to A7 hold and that (3.7) holds for all $k$ sufficiently large. Then there are constants $\sigma_{\max} > 0$ and $\kappa_{\rm bdm} \ge 0$, and an index $k_1$, such that
$$\delta m_k \ge \delta m_k^T + (\sigma_k - \kappa_{\rm bdm}) \delta m_k^N \qquad (3.10)$$
and
$$\sigma_k = \sigma_{\max} \qquad (3.11)$$
for all $k \ge k_1$.
Proof. Step 3 of Algorithm 2.1 ensures that $\sigma_k$ satisfies (2.9), where
$$\delta q_k^N = -\frac{g_k^T n_k}{1 - h_k^T n_k} - \frac{1}{2} \frac{n_k^T B_k n_k}{(1 - h_k^T n_k)^2}. \qquad (3.12)$$
Since (i) A6 and A7 imply that $g_k$ and $B_k$ are bounded, (ii) (3.7) gives that $\|n_k\|$ is bounded by a multiple of $\delta m_k^N$, and (iii) A3 implies that $1 - h_k^T n_k$ is bounded away from zero, we have from (3.12) that
$$\delta q_k^N \ge -\kappa_{\rm bdm} \delta m_k^N$$
for some constant $\kappa_{\rm bdm} \ge 0$. But then (2.9) immediately gives (3.10), from which we deduce
$$\delta m_k \ge (\sigma_k - \kappa_{\rm bdm}) \delta m_k^N,$$
since $\delta m_k^T > 0$. Hence (2.9) is automatically satisfied for all $\sigma_k \ge \kappa_{\rm bdm} / (1 - \nu)$. Every increase of an insufficient $\sigma_{k-1}$ must be by at least $\max[\tau_2, (\tau_1 - 1) \sigma_{k-1}]$, and therefore the penalty parameter can only be increased a finite number of times. The required value $\sigma_{\max}$ is at most the first value of $\sigma_k$ that exceeds $\kappa_{\rm bdm} / (1 - \nu)$, which proves the lemma.
Theorem 3.5. Suppose that A1 to A7 hold and that $\{x_k\}$ are uniformly bounded. Then
$$\lim_{k \to \infty} \|c_k\| = 0.$$
Proof. If the theorem is not true, there exists a constant $\delta_4 > 0$ such that
$$\limsup_{k \to \infty} \|c_k\| = \delta_4 > 0.$$
Let $\hat x$ be an accumulation point of $\{x_k\}$ satisfying $\|c(\hat x)\| = \delta_4$. There exists a positive constant $\delta_5$ such that
$$\|c(x)\| \ge \delta_4 / 2 \quad \text{for all } \|x - \hat x\| \le \delta_5. \qquad (3.13)$$
Define the set
$$K(\delta) = \{k \mid \|c_k\| \ge \delta\}. \qquad (3.14)$$
Then, using (2.9) and Lemma 3.2, for sufficiently large $k \in K(\delta_4/4)$ we have
$$Pred_k \ge \nu \sigma_k \delta m_k^N \ge \nu \sigma_k \min\Big[\|c_k\|, \frac{\tau \xi^N \Delta_k}{\kappa_{\rm bsc}}\Big] \ge \nu \sigma_k \delta_6 \min[\delta_4/4, \Delta_k], \qquad (3.15)$$
where $\delta_6 = \min[1, \tau \xi^N / \kappa_{\rm bsc}]$. Now define
$$K_0 = \Big\{k \,\Big|\, \frac{Ared_k}{Pred_k} \ge \eta\Big\}.$$
The boundedness of $\{x_k\}$ and the previous lemma imply that
$$\sum_{k \in K_0 \cap K(\delta_4/4)} Pred_k < \infty,$$
which, together with (3.15), implies that
$$\sum_{k \in K_0 \cap K(\delta_4/4)} \Delta_k < \infty.$$
Therefore there exists an integer $\hat k$ such that
$$\sum_{k \in K_0 \cap K(\delta_4/4),\, k \ge \hat k} \Delta_k < \frac{\delta_5 (1 - \xi_1)}{4}, \qquad (3.16)$$
where $\xi_1 \in (0, 1)$ is the constant of Algorithm 2.1. Because $\hat x$ is an accumulation point of $\{x_k\}$, there exists $\bar k > \hat k$ such that
$$\|x_{\bar k} - \hat x\| < \delta_5 / 4. \qquad (3.17)$$
By induction we can use inequalities (3.13), (3.16) and (3.17) to prove that
$$k \in K(\delta_4/2) \qquad (3.18)$$
for all $k \ge \bar k$. For $k = \bar k$, (3.18) follows from (3.17) and (3.13). Now assume that (3.18) is true for all $k = \bar k, \ldots, i$; we need to show that it is also true for $k = i + 1$. From (3.16) and the updating rule of Algorithm 2.1, which gives $\Delta_{k+1} \le \xi_1 \Delta_k$ for all $k \notin K_0$, we have
$$\sum_{j=\bar k}^{i} \Delta_j = \sum_{j=\bar k,\, j \in K_0}^{i} \Delta_j + \sum_{j=\bar k,\, j \notin K_0}^{i} \Delta_j \le \Big(1 + \frac{\xi_1}{1 - \xi_1}\Big) \sum_{j=\bar k,\, j \in K_0}^{i} \Delta_j \le \frac{\delta_5}{4}. \qquad (3.19)$$
Therefore
$$\|x_{i+1} - \hat x\| \le \|x_{\bar k} - \hat x\| + \sum_{j=\bar k}^{i} \Delta_j \le \frac{\delta_5}{2}.$$
Thus it follows from (3.13) and (3.14) that $i + 1 \in K(\delta_4/2)$. By induction, we see that (3.18) holds for all $k \ge \bar k$.
Now from (3.18) and (3.19), by letting $i \to \infty$, we see that
$$\sum_{k=\bar k}^{\infty} \Delta_k \le \delta_5 / 4,$$
which implies
$$\lim_{k \to \infty} x_k = \hat x, \qquad \lim_{k \to \infty} \Delta_k = 0. \qquad (3.20)$$
It can be seen that
$$Ared_k = Pred_k + o(\|s_k\|) = Pred_k + o(\Delta_k),$$
which, together with (3.20) and (3.15), gives
$$\lim_{k \to \infty} \rho_k = \lim_{k \to \infty} \frac{Ared_k}{Pred_k} = 1.$$
The above limit shows that $\Delta_{k+1} \ge \Delta_k$ for all sufficiently large $k$, which contradicts (3.20). This completes our proof.
Theorem 3.6. Suppose A1 to A7 hold. Then
$$\liminf_{k \to \infty} \|N_k^T g_k\| = 0. \qquad (3.21)$$
Proof. Assume that the theorem is false; then there exists a positive constant $\delta_7$ such that
$$\|N_k^T g_k\| \ge \delta_7 \qquad (3.22)$$
for all $k$. If
$$\delta_3 (1 + \|B_k\|) \|c_k\| > \delta_2 \|N_k^T g_k\| \frac{\Delta_k}{4}, \qquad (3.23)$$
then, by using (2.9) and Lemma 3.2, we have
$$Pred_k \ge \nu \sigma_k \delta_6 \min\Big[\frac{\delta_2 \|N_k^T g_k\|}{4 \delta_3 (1 + \|B_k\|)}, 1\Big] \Delta_k. \qquad (3.24)$$
If inequality (3.23) fails, the following relation follows from (2.11) and (3.1):
$$Pred_k \ge \frac{1}{2} \delta_2 \|N_k^T g_k\| \min\Big[\Delta_k, \frac{\|N_k^T g_k\|}{\|B_k\|}\Big] \qquad (3.25)$$
when $k$ is large, because
$$\delta_3 (1 + \|B_k\|) \le \frac{1}{2} \frac{\delta_2 \|N_k^T g_k\|^2}{\|B_k\|}$$
for all large $k$. Relations (3.24) and (3.25) tell us that there exists a positive constant $\delta_8$ such that
$$Pred_k \ge \delta_8 \min[1, \Delta_k] \qquad (3.26)$$
for all large $k$. From inequality (3.26) we can derive a contradiction similarly to the proof of the previous theorem. This shows that (3.21) holds.
§4 Discussion
We have presented a conic model trust region algorithm for nonlinearly constrained optimization and established its global convergence. In Step 2b of Algorithm 2.1, we solve the subproblem (2.5)-(2.7). In fact, it is not necessary to solve this subproblem exactly: any tangential step that satisfies a suitable descent condition is enough to maintain global convergence (see, for example, p.665 of [10]).
Acknowledgements. The author would like to thank Prof. X. H. Wang for his comments, which greatly improved the paper.
References
1 Han S P. A globally convergent method for nonlinear programming, J Optimization Theory and Applications, 1977, 22: 297-309.
2 Powell M J D. Variable metric methods for constrained optimization, In: Bachem A, Grötschel M, Korte B, eds, Mathematical Programming: The State of the Art, Berlin, New York: Springer, 1983: 283-311.
3 Powell M J D, Yuan Y. A trust region algorithm for equality constrained optimization, Math
Programming, 1991, 49: 189-211.
4 Yuan Y, Sun W. Optimization Theory and Methods, Beijing: Science Press, 1997.
5 Vardi A. A trust region algorithm for equality constrained minimization: convergence properties
and implementation, SIAM J Numer Anal, 1985, 22: 575-591.
6 Byrd R H, Schnabel R B, Shultz G A. A trust region algorithm for nonlinearly constrained optimization, SIAM J Numer Anal, 1987, 24: 1152-1170.
7 Toint Ph L. Global convergence of a class of trust region methods for nonconvex minimization in Hilbert space, IMA J Numerical Analysis, 1988, 8: 231-252.
8 Zhang J Z, Zhu D T. Projected quasi-Newton algorithm with trust region for constrained optimization, J Optimization Theory and Applications, 1990, 67: 369-393.
9 El-Alem M. A robust trust region algorithm with a nonmonotonic penalty parameter scheme for
constrained optimization, SIAM J Optimization, 1995, 5: 348-378.
10 Conn A R, Gould N I M, Toint Ph L. Trust Region Methods, USA: SIAM, Philadelphia, 2000.
11 Nocedal J, Yuan Y. Combining trust region and line search techniques, Report NAW 07, Department
of EECS, Northwestern University, 1991.
12 Davidon W C. Conic approximation and collinear scaling for optimizers, SIAM J Numer Anal, 1980,
17: 268-281.
13 Sorensen D C. The q-superlinear convergence of a collinear scaling algorithm for unconstrained
optimization, SIAM J Numer Anal, 1980, 17: 84-114.
14 Ariyawansa K A. Deriving collinear scaling algorithms as extensions of quasi-Newton methods and
the local convergence of DFP and BFGS related collinear scaling algorithm, Math Programming,
1990, 49: 23-48.
15 Ariyawansa K A, Lau D T M. Local and Q-superlinear convergence of a class of collinear scaling algorithms that extend quasi-Newton methods with Broyden's bounded-φ class of updates, Optimization, 1992, 23: 323-339.
16 Sheng S. Interpolation by conic model for unconstrained optimization, Computing, 1995, 54: 83-98.
17 Sun W. On nonquadratic model optimization methods, Asia and Pacific Journal of Operations
Research, 1996, 13: 43-63.
18 Sun W, Yuan Y X. A conic trust-region method for nonlinearly constrained optimization, Annals
of Operations Research, 2001, 103: 175-191.
19 Di S, Sun W. Trust region method for conic model to solve unconstrained optimization problems,
Optimization Methods and Software, 1996, 6: 237-263.
20 Han Q, Han J, Sun W. A modified quasi-Newton method with collinear scaling for unconstrained optimization, submitted to Optimization Methods and Software.
21 Ni Q, Hu S. A new derivative free algorithm based on conic interpolation model, Technical report,
Faculty of Science, Nanjing University of Aeronautics and Astronautics, 2001.
Dept. of Math., Zhejiang Univ., Hangzhou 310028, China.