On the superlinear local convergence of a filter-SQP method Stefan Ulbrich

advertisement
Stefan Ulbrich
On the superlinear local convergence of a filter-SQP
method
October 2002
Abstract. Transition to superlinear local convergence is shown for a modified version of the trust-region
filter-SQP method for nonlinear programming introduced by Fletcher, Leyffer, and Toint [8]. Hereby, the
original trust-region SQP-steps can be used without an additional second order correction. The main modification consists in using the Lagrangian function value instead of the objective function value in the filter
together with an appropriate infeasibility measure. Moreover, it is shown that the modified trust-region filterSQP method has the same global convergence properties as the original algorithm in [8].
Key words. nonlinear programming, superlinear convergence, global convergence, filter, SQP
1. Introduction
This paper is concerned with the transition of trust-region filter-SQP methods to superlinear local convergence without using second order corrections. Hereby, we consider
the general nonlinear programming problem (NLP)
min f (x) subject to
cE (x) = 0,
cI (x) ≤ 0
(P)
with twice continuously differentiable functions f : Rn → R, c : Rn → Rm and cE =
(ci )i∈E , cI = (ci )i∈I , where E = {1, . . . , mE } and I = {mE + 1, . . . , m} denote
the indices of equality and inequality constraints, respectively. With these notations the
Lagrangian function associated with (P) is given by
`(x, y) = f (x) + y T c(x),
where y = (yET , yIT )T ∈ Rm , yI ≥ 0.
We propose a modification of the globally convergent filter-SQP method by Fletcher,
Leyffer, and Toint [8] that has the same global convergence properties and ensures in
addition transition to superlinear local convergence under appropriate assumptions on
the Hessian approximation in the SQP-problems close to a regular minimum.
In the pioneering work [7], Fletcher and Leyffer proposed the filter technique for
the globalization of SQP methods, which avoids the use of classical merit functions
with penalty term. Instead, the the acceptability of steps is determined by comparing
the constraint violation and objective function value with previous iterates collected in
a filter. The new iterate is acceptable to the filter if either feasibility or objective function
value is sufficiently improved in comparison to all iterates bookmarked in the current
filter. The promising numerical results in [7] led to a growing interest in filter methods
Stefan Ulbrich: Zentrum Mathematik, Technische Universität München, Boltzmannstr. 3, 85747 Garching b.
München, Germany, E-mail: sulbrich@ma.tum.de.
S. Ulbrich: Superlinear local convergence of filter-SQP methods
2
in recent years. By augmenting the basic filter concept with a sufficient decrease condition for the objective function if the current iterate is sufficiently feasible, the global
convergence of three variants of trust-region filter-SQP methods was shown in Fletcher,
Gould, Leyffer, Toint, and Wächter [6], Conn, Gould, and Toint [4], Fletcher, Leyffer,
and Toint [8], and Fletcher, Gould, Leyffer, Toint, and Wächter [6]. Global convergence of a filter method for nonlinear programming was subsequently also shown by
Gonzaga, Karas, and Vanti [10]. A primal-dual interior-point filter method for NLP is
proposed in Ulbrich, Ulbrich, and Vicente [15] and global convergence is shown. An
alternative nonmonotone trust-region SQP-method without penalty function for equality constrained problems was proposed by Ulbrich and Ulbrich [14]. For this latter
algorithm, in addition to global convergence also the transition to superlinear local convergence was proven in [14]. Hereby, no second order correction is necessary. So far,
to the best of our knowledge, the only result on the superlinear local convergence of a
filter method is shown in Wächter and Biegler [16] for an interior point linesearch filter
method with second order correction steps.
In this paper we propose a modified version of the filter-SQP algorithm in Fletcher,
Leyffer, and Toint [8] that ensures transition to superlinear local convergence without
using second order correction steps. We believe that the modifications and proof techniques of the present paper can be adapted to ensure also the fast local convergence of
the filter methods in [4,6, 16] without second order corrections.
The original filter-SQP methods in [4, 6, 8] can–at least if no second order correction
steps are used–be affected by the well known Maratos effect. In fact, these algorithms
use constraint violation and objective function value in the filter. By the Maratos effect
a full SQP-step can lead to an increase of both these filter components even arbitrarily
close to a regular minimizer. This makes the full SQP-step unacceptable for the filter
and can prohibit fast local convergence. To avoid this problem, we use the value of
the Lagrangian function in the filter instead of the objective function and modify the
feasibility measure in the filter slightly.
The paper is organized as follows. In section 2 we describe briefly the trust-region
filter-SQP method of Fletcher, Leyffer, and Toint [8] and recall the global convergence
result proven in [8]. In section 3 we derive a modified filter-SQP method and prove its
transition to superlinear local convergence close to a point satisfying the strong second order sufficient condition. In subsection 3.1 we describe the local requirements on
Hessian approximation and Lagrange multipliers. In subsection 3.2 we motivate the
necessary changes to avoid the Maratos effect and formulate the modified trust-region
filter-SQP method. In subsection 3.3 we prove the transition to superlinear local convergence of the modified algorithm. In section 4 we show how the global convergence
proof in [8] can be adapted to the modified algorithm.
2. A trust-region filter-SQP algorithm
In this section we recall the globally convergent trust-region filter-SQP-framework proposed by Fletcher, Leyffer, and Toint [8] for the solution of (P).
Given the current iterate xk and a trust-region radius ∆k , the step computation is
based on the approximate solution of the trust-region SQP-subproblem QP(xk , ∆k )

minimize qk (s) subject to cE (xk ) + AE (xk )s = 0, 

cI (xk ) + AI (xk )s ≤ 0,
QP(xk , ∆k )


ksk∞ ≤ ∆k .
S. Ulbrich: Superlinear local convergence of filter-SQP methods
3
Hereby,
AE (xk ) = (ai (x)T )i∈E := ∇cE (xk )T ,
AI (xk ) = (ai (x)T )i∈I := ∇cI (xk )T
are the Jacobians of cE and cI , respectively,
1
qk (s) = fk + gkT s + sT Bk s
2
with gk = g(xk ) := ∇f (xk ) is a quadratic model of `(xk + s, yk ) − ykT A(xk )s,
A(xk ) = ∇c(xk )T , and Bk ∈ Rn,n is a symmetric approximation of the Hessian of
the Lagrangian
X
Hk := ∇xx `(xk , yk ) = ∇xx f (xk ) +
(yk )i ∇xx ci (xk ).
i∈E∪I
To ensure global convergence, filter methods evaluate the quality of a search direction
by using a filter
Fk ⊂ {(hj , fj ) : j = 0, . . . , k − 1}
that collects pairs
(hj , fj ) := (h(xj ), f (xj ))
from certain previous iterations. Hereby, h(x) is an infeasibility measure. Fletcher,
Leyffer, and Toint [8] propose to use
h(x) := kcE (x)k1 + kc+
I (x)k1 ,
where c+
I (x) = (max {ci (x), 0})i∈I .
Since a local solution x̄ of (P) minimizes f locally under the constraint h(x) = 0,
a promising trial step s should either reduce the contraint violation h or the function
value f . To ensure sufficient decrease of at least one of the two criteria, Fletcher et al.
[8] call x acceptable to the filter Fk if for all (hj , fj ) ∈ Fk
either
h(x) ≤ βhj
or
f (x) + γh(x) ≤ fj
(1)
is satisfied with constants 0 < γ < β < 1. Following [8] a pair (hj , fj ) is called
dominated by (h(x), f (x)) if
h(x) ≤ hj
and f (x) ≤ fj .
To update the information in the filter appropriate points x, which are acceptable to the
filter, are added to the filter Fk . This operation means that (h(x), f (x)) is added to the
set of pairs (hj , fj ) in Fk and subsequently any old pair is removed from the filter that
is dominated by the new pair (h(x), f (x)).
The acceptance test (1) has the useful inclusion property that after adding a pair
(h(x), f (x)) to the filter all unacceptable points of the old filter remain unacceptable
for the new filter (although dominated pairs are removed). This allows to show that the
infeasibility tends to zero for any infinite subsequence of iterates added to the filter,
which is an essential ingredient for the global convergence analysis of filter methods.
More precisely, we have the following result, cf. the corollary of Lemma 1 in [8].
Lemma 1. For any infinite sequence of iterates xk for which hk > 0, {fk } is bounded
below, and (hk , fk ) is entered to the filter, it follows that hk → 0.
Proof. See Lemma 1 and its corollary in [8].
u
t
S. Ulbrich: Superlinear local convergence of filter-SQP methods
4
Obviously, the filter concept alone does not necessarily enforce convergence to a local
minimum. For example if the iterates converge to an arbitrary feasible point and the
infeasibility measure h is in each iteration reduced by a factor of at least β then the
iterates will be acceptable to the filter. Therefore, the filter-SQP algorithm of Fletcher,
Leyffer, and Toint [8] imposes a sufficient reduction condition of the form
∆fk (s) := f (xk ) − f (xk + s) ≥ σ∆qk (s),
∆qk (s) := qk (0) − qk (s)
with a constant 0 < σ < 1 if the feasibility is good enough, more precisely if ∆qk (s) >
0.
Finally, it is obvious that the SQP-subproblem QP(xk , ∆) can be incompatible (infeasible). In this case, the algorithm invokes a restoration phase that finds an iterate
xk and a trust-region radius ∆ ≥ ∆min > 0 such that QP(xk , ∆) is compatible and
(hk , fk ) is acceptable to the filter Fk .
We have now collected all ingredients to give a precise statement of the filter-SQP
method of Fletcher, Leyffer, and Toint [8].
Algorithm 1.
Filter-SQP Algorithm of [8]
Let 0 < γ < β < 1, σ ∈]0, 1[, and ∆min > 0.
0. Initialize with a point x, k = 1 and filter F1 = {(u, −∞)} with some u ≥ βh(x).
1. Enter restoration phase to find a point xk acceptable to the filter Fk such that
e is compatible for some ∆
e ≥ ∆min and initialize ∆ = ∆.
e
QP(xk , ∆)
2. Try to compute a solution s of QP(xk , ∆). If the QP is incompatible then include
(hk , fk ) in the filter, set k := k + 1 and goto step 1 (h-type iteration).
3. If s = 0 stop with KT point xk .
4. If xk + s is not acceptable to Fk ∪ (hk , fk ) then set ∆ := ∆/2 and goto 2.
5. If
∆qk (s) := qk (0) − qk (s) > 0
and ∆fk (s) := fk − f (xk + s) < σ∆qk (s)
then set ∆ := ∆/2 and goto 2.
6. If ∆qk (s) ≤ 0 then include (hk , fk ) in the filter (h-type iteration).
7. Set sk = s, ∆k = ∆, xk+1 = xk + sk , and initialize ∆k+1 ≥ ∆min . Set k := k + 1
and goto step 2.
A convergence analysis of this algorithm is given in [8] under the following assumptions.
Assumptions (A1)–(A3).
(A1) All points that are sampled by the algorithm lie in a nonempty convex and compact set X.
(A2) The problem functions f (x) and c(x) are twice continuously differentiable on an
open set containing X.
(A3) There exists an MB > 0 such that the Hessian matrices Bk satisfy kBk k2 ≤ MB
for all k.
The following global convergence result is proven in [8].
Theorem 1. (Global convergence result, cf. [8, Thm. 7])
Under assumtions (A1)–(A3) the outcome of applying the filter-SQP algortihm of Algorithm 1 is one of the following:
S. Ulbrich: Superlinear local convergence of filter-SQP methods
5
(A) The restoration phase fails to find a point x which is both acceptable to the filter
and for which QP(x, ∆) is compatible for some ∆ > ∆min .
(B) A KT point of problem (P) is found.
(C) There exists an accumulation point that is feasible and either is a KT point or
fails to satisfy the Mangasarian-Fromowitz constraint qualification (MFCQ).
We recall that MFCQ holds at some point x if the rows of AE (x) are linearly independent and there exists a vector s with AE (x)s = 0, AA (x)s < 0, where A ⊂ I is the
index set of active inequality constraints at x and AA (x) = (ai (x)T )i∈A .
3. Transition to fast local convergence
Our aim is to modify Algorithm 1 in such a way that it allows transition to superlinear
local convergence if
– the computed iterates xk converge to a point x̄,
– x̄ satisfies the second order sufficient condition with strict complementarity and the
linear independence constraint qualification (LICQ),
– the Hessian approximations Bk are chosen appropriately for xk close to x̄.
More precisely, we assume the following.
Assumption (A4).
(A4) The iterates xk converge to a point x̄ that satisfies with a Lagrange multiplier
ȳ = (ȳET , ȳIT ) the following second order sufficient conditions with strict complementarity and LICQ:
h(x̄) = 0, ∇x `(x̄, ȳ) = 0,
ȳA = 0, ȳI\A > 0, where A := {i ∈ I : ci (x̄) = 0} ,
∇xx `(x̄, ȳ) is positive definite on {s : AE∪A (x̄)s = 0},
AE∪A (x̄) has full row rank.









(O)
We still have to specify our requirements on the choice of Bk .
3.1. Local requirement on the step computation
In this subsection we introduce a standard framework for the Lagrange-multiplier update that ensures with the choice Bk = ∇xx `(xk , yk ) q-superlinear local convergence
to a point x̄ satisfying (O) if full (local) SQP-steps are taken. To obtain suitable Lagrange multiplier estimates, we will make use of a multipliers function, see [9,5]:
Definition 1. Let Ux̄ be a neighborhood of x̄ satisfying (O). Then a function Y : Ux̄ →
Rm is called a multipliers function if
Y : Ux̄ → Rm is continuous at x̄ and ȳ = Y (x̄).
S. Ulbrich: Superlinear local convergence of filter-SQP methods
6
One can show that for Ux̄ small enough a continuously differentiable multipliers
function is for example given by the multipliers function of Glad and Polak
¡
¢−1
Y GP (x) = − A(x)A(x)T + γ1 diag(ci (x)2 )
A(x)g(x)
with γ1 > 0, see [11,5]. Moreover, by the strict complementarity assumption in (O) it
is easy to see that for any multipliers function Y and any constant ν > 0 for Ux̄ small
enough it holds
A = {i ∈ I : ci (x) ≥ −νYi (x)} =: A(x)
∀ x ∈ Ux̄ .
(2)
Thus, the active set can be identified exactly close to x̄.
To analyze the transition to fast local convergence we make for simplicity the following assumptions on the choice of Bk and yk . The use of inexact Hessian approximations, see for example [2,5], would be possible.
Assumption (A5).
(A5) There is a neighborhood Ux̄ of x̄ such that
Bk := Hk = ∇xx `(xk , Y (xk )),
yk = Y (xk )
∀ xk ∈ Ux̄ ,
where Y : Ux̄ → Rm is a Lipschitz continuous multipliers function that satisfies
YI (x) ≥ 0,
YI\A (x) = 0
∀ x ∈ Ux̄ .
Remark 1. If Y : Ux̄ → Rm is a Lipschitz continuous multipliers function then a
multipliers function Yb satisfying (A5) can for Ux̄ small enough be obtained by setting
YbE∪A(x) = YE∪A(x) , YbI\A(x) = 0 with A(x) in (2), since A(x) = A for Ux̄ sufficiently
small.
u
t
Under assumption (O) the point x̄ is a strict local minimum of (P) and we find
∆¯ > 0 such that for Ux̄ small enough the following holds: for all xk ∈ Ux̄ with Bk
according to (A5) we have by continuity with a constant κB > 0
2
sT Bk s ≥ κB ksk2 on {s : AE∪A (xk )s = 0} ,
AE∪A (xk ) has full row rank,
1
(yk )A ≥ ȳA > 0.
2
Moreover, for any given 0 < ∆ ≤ ∆¯ we can shrink Ux̄ such that for all xk ∈ Ux̄ the
¯ is unique, satisfies ksk < ∆ and coincides with the solution
solution s of QP(xk , ∆)
∞
of the equality constrained problem EQP(xk ) given by
minimize
qk (s)
subject to
cE∪A (xk ) + AE∪A (xk )s = 0,
(EQP)
where A ⊂ I is the set of active inequality constraints at x̄, see for example [1, Sec.
4.4.3], [2, Sec. 5] or [12, Thm. 18.1]. In particular, since (yk )I\A = 0 by (A5) the step
s satisfies the KT conditions
µ
¶µ ¶ µ
¶
s
−∇x `(xk , yk )
∇xx `(xk , yk ) AE∪A (xk )T
=
.
z
−cE∪A (xk )
AE∪A (xk )
0
S. Ulbrich: Superlinear local convergence of filter-SQP methods
7
Now it is well known that with the choice xk+1 := xk + s, yk+1 := Y (xk+1 ) the
sequence {xk } converges under assumption (A5) q-superlinearly to x̄ and that the convergence is q-quadratic if ∇xx f , and ∇xx ci are Lipschitz continuous in a neighborhood
of x̄, see for example [5, 14].
We collect these facts in the following proposition:
Proposition 1. Let x̄ satisfy (O) and let (A5) hold. Then there exists ∆¯ > 0 such that
e ≤ ∆¯ possibly after shrinking Ux̄ the following holds:
for any given ζ ∈]0, 1[ and 0 < ∆
AE∪A (x)
YA (x) ≥
has full row rank on a compact set Sx̄ ⊃ Ux̄ ,
1
ȳA > 0 ∀ x ∈ Ux̄ ,
2
(3)
(4)
and with a constant κB > 0
2
sT ∇xx `(x, Y (x))s ≥ κB ksk2
on {s : AE∪A (x)s = 0} ∀ x ∈ Ux̄ .
(5)
For any xk ∈ Ux̄ , yk = Y (xk ), the sequence defined by x0k := xk and
x0j+1 = x0j + s0j
¯ for j = k, k + 1, . . .,
with the unique solution s0j of QP(x0j , ∆)
where Bj0 = ∇xx `(x0j , Y (x0j )) are chosen according to (A5), is well defined, converges
q-superlinearly to x̄ (q-quadratically if ∇xx f , ∇xx ci are Lipschitz on Ux̄ ), and satisfies
ks0j+1 k ≤ ζks0j k,
kx0j+1 − x̄k ≤ ζkx0j − x̄k,
e
ks0j k∞ < ∆.
(6)
e ∆].
¯ Moreover, s0 is the unique solution
Thus, s0j solves also QP(x0j , ∆) for all ∆ ∈ [∆,
j
0
of EQP(xj ).
3.2. Modifications of the filter-SQP algorithm to avoid the Maratos effect
By the well known Maratos effect the full SQP-step s can lead to an increase of f and
h and therefore the iterate xk + s will not be acceptable to Fk ∪ (hk , fk ). To avoid this
difficulty, there are three promising approaches:
– Perform a second order correction step if the Maratos effect is detected. This approach is used successfully by Wächter and Biegler [16] in the context of an interiorpoint linesearch filter algorithm.
– Allow nonmonotonicity in the filter to admit an increase in f and h. This is in the
spirit of the well known watchdog technique. A promising approach in this direction
is the nonmonotone filter method proposed recently by Toint [13].
– Use the Lagrangian function instead of the objective function in the filter.
In this paper we will consider the last approach.
As already mentioned, the Maratos effect can cause that the full SQP-step is not
acceptable with respect to Fk ∪ (hk , fk ), since both h and f are increased. By using as
a technical tool the fact that the augmented Lagrangian function
ρ
2
`ρ (x, Y (x)) = `(x, Y (x)) + kcE∪A (x)k2
2
(7)
is an exact penalty function close to x̄ satisfying (O) if ρ > 0 is large enough and
Y : Ux̄ → Rm is a Lipschitz continuous multipliers function, we will see that full SQPsteps are eventually accepted by the filter, if the filter entries (h(x), f (x)) are replaced
S. Ulbrich: Superlinear local convergence of filter-SQP methods
8
by entries (θ(x, y), `(x, y)) with an appropriately defined infeasibility measure θ(x, y).
This leads to our first modification of Algorithm 1:
Modification 1.
– Instead of the filter entries (h(x), f (x)) we use the entries (θ(x, y), `(x, y)), where
2
2
T −
2
θ(x, y) := kcE (x)k2 + kc+
I (x)k2 + (yI cI (x)) .
(8)
Consequently, we call (x, y) acceptable to the filter Fk if
either
θ(x, y) ≤ βθj
or `(x, y) + γθ(x, y) ≤ `j
(9)
with constants 0 < γ < β < 1.
We remark that
0 ≤ h(x) ≤
√
mθ(x, y)1/2 .
(10)
Moreover, it is obvious from (8) that
θ(x, y) = 0
=⇒
y T c(x) = 0.
(11)
The second modification concerns step 5 of Algorithm 1. We note that in the case
∆qk (s) > 0 the step is only accepted if the sufficient decrease condition
∆fk (s) ≥ σ∆qk (s)
holds. This decrease condition can only be ensured if the error of the quadratic model
becomes small compared to the size of the predicted decrease ∆qk (s). We now estimate
the size of ∆qk (s) for a full SQP-step.
Let (O) and (A5) hold and consider xk ∈ Ux̄ with Ux̄ as in Proposition 1. To explore
¯
the decrease properties of the full (local) SQP-step s given as the solution of QP(xk , ∆)
we use the fact that s solves EQP(xk ). For convenience we set
bk := AE∪A (xk ),
A
b
ck := cE∪A (xk ).
Then EQP(xk ) can be written as
minimize
qk (s)
bk s = 0.
subject to b
ck + A
(EQP)
Now we decompose s in a normal step sn and a tangential step st . To this end, let
s = sn + st ,
bTk (A
bk A
bTk )−1 b
sn = − A
ck .
(12)
Then we have obviously
µ
¶
1 n T
1 t T
T n
n
n T t
t
∆qk (s) = − min gk s + (s ) Bk s + (gk + Bk s ) s̃ + (s̃ ) Bk s̃ .
bk s̃t =0
2
2
A
(13)
Since xk ∈ Ux̄ , we know from (3), (4), and (8) that there exists κn > 0 (independent
of k) with
1/2
ksn k ≤ κn θk .
(14)
Moreover, (5) yields
2
(st )T Bk st ≥ κB kst k .
S. Ulbrich: Superlinear local convergence of filter-SQP methods
9
Thus, we obtain by (13) with a constant C > 0
1 t T
1
(s ) Bk st − gkT sn − (sn )T Bk sn
2
2
MB n 2
κB t 2
n
≥
ks k − kgk kks k −
ks k
2
2
κB t 2
2
≥
ks k − C(ksn k + ksn k )
2
∆qk (s) =
(15)
2
and this lower bound can be sharp. Since ∆fk (s) = ∆qk (s) + O(ksk ), the full SQPstep will not necessarily satisfy ∆fk (s) ≥ σ∆qk (s), unless
2
∆qk (s) ≥ (1 − σ)|∆fk (s) − ∆qk (s)| = O(ksk ),
which is not ensured in view of (15).
To circumvent this difficulty we replace the predicted reduction ∆qk (s) and actual
2
reduction ∆fk (s) by quantities that differ only by o(ksk ) and we check the decrease
ratio only if the predicted reduction is large enough compared to θk . As we will see, the
first goal can be accomplished by using the actual reduction
∆`k (s) := `k − `(xk + s, yk (s))
(16)
instead of ∆fk (s), where yk (s) denotes the new multiplier estimate, and the predicted
reduction
(17)
∆b
qk (s) := ∆qk (s) + ykT ck
instead of ∆qk (s). Secondly, we require the sufficient decrease condition ∆`k (s) ≥
ψ/2
σ∆b
qk (s) only if ∆b
qk (s) > κθ θk with a constant ψ ∈]1/2, 1]. This leads to the
following modification of step 5.
Modification 2.
– To ensure that f -type steps allow eventually full (local) SQP-steps we replace the
condition in step 5 by
ψ/2
∆b
qk (s) := ∆qk (s) + ykT ck > κθ θk
and ∆`k (s) < σ∆b
qk (s)
with ψ ∈]1/2, 1]. Analogously, we replace the condition ∆qk (s) ≤ 0 in step 6 by
ψ/2
∆b
qk (s) ≤ κθ θk .
Finally, we enter restoration not only if QP(xk , ∆) is infeasible but also if the Lagrange
multipliers weight inactive constraints to strongly. More precisely, we define the compatibility of QP(xk , ∆) as follows.
Modification 3.
– We call QP(xk , ∆) compatible if
QP(xk , ∆) is feasible
1/2
and θk
≤ κ∆ ∆1+ξ
with constants κ∆ > 0 and ξ ∈]0, 1[.
This leads to the following modified version of Algorithm 1.
(18)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
Algorithm 2.
10
Modified Filter-SQP Algorithm
Let 0 < γ < β < 1, σ ∈]0, 1[, ξ ∈]0, 1[, ψ ∈]1/2, 1], κθ , κ∆ > 0, and ∆min > 0.
0. Initialize with a point (x, y), k = 1 and filter F1 = {(u, −∞)} with some u ≥
βθ(x, y).
1. Enter restoration phase to find a point (xk , yk ) acceptable to the (θ, `)-filter Fk such
e is compatible in the sense (18) for some ∆
e ≥ ∆min and initialize
that QP(xk , ∆)
e
∆ = ∆.
2. Try to compute a solution s of QP(xk , ∆). If the QP is incompatible according
to (18) then include (θk , `k ) in the filter, set k := k + 1 and goto step 1 (h-type
iteration).
3. If s = 0 stop with KT point xk .
Otherwise compute a Lagrange multiplier estimate y = yk (s).
4. If (xk + s, y) is not acceptable to Fk ∪ (θk , `k ) then set ∆ := ∆/2 and goto 2.
5. If
ψ/2
∆b
qk (s) := ∆qk (s) + ykT ck > κθ θk
and ∆`k (s) < σ∆b
qk (s)
then set ∆ := ∆/2 and goto 2.
ψ/2
6. If ∆b
qk (s) ≤ κθ θk then include (θk , `k ) in the filter (h-type iteration).
7. Set sk = s, ∆k = ∆, xk+1 = xk + sk , yk+1 = y, and initialize ∆ ≥ ∆min . Set
k := k + 1 and goto step 2.
Remark 2. We note that the modified filter-SQP method in Algorithm 2 has many similarities with the filter-SQP algortihms proposed by Fletcher, Gould, Leyffer, Toint, and
Wächter [6] and Conn, Gould, and Toint [4].
The convergence analysis in [8] of the original filter-SQP algorithm in Algorithm 1 uses
two essential observations. Firstly, ∆qk (s) never increases if the trust-region radius ∆ is
reduced, since the feasible set of QP(xk , ∆) shrinks if ∆ is reduced. Thus, the condition
∆qk (s) > 0 for trying the possibility of an f -type step can only switch from true to
false in the inner loop, but not vice versa. We note that the modified Algorithm 2 has
ψ/2
also this property, since we only replace ∆qk (s) > 0 by ∆qk (s) > −ykT ck + κθ θk
ψ/2
and −ykT ck + κθ θk is fixed in the inner iteration.
Remark 3. A reinspection of the convergence proof in [8] shows that the global convergence result of [8]–and also the results of the present paper–remain true if in step 1 only
the local minimizer of QP(xk , ∆) is determined, which is closest to xk . Essential is the
monotonicity of ∆qk (s) with respect to ∆.
u
t
The second key point in the analysis of [8] is the observation that Algorithm 1 only
adds pairs (hk , fk ) to the filter with hk > 0. Analogously, all pairs (θk , `k ) added to the
filter by Algorithm 2 satisfy θk > 0. In fact, new entries to the filter can only be added
ψ/2
in step 2 if QP(xk , ∆) is incompatible or in step 6 if ∆qk (s) ≤ ykT ck + κθ θk . But if
θk = 0 then QP(xk , ∆) is compatible in the sense (18) for all ∆ > 0, since s = 0 is
feasible. Thus, (θk , `k ) is not added to the filter in step 2. Moreover, if xk is not a KT
ψ/2
point then we must have ∆qk (s) > 0 and θk = 0 implies −ykT ck + κθ θk = 0 by
ψ/2
(11). Thus, we have ∆b
qk (s) = ∆qk (s) > 0 = κθ θk and also the condition in step 6
for adding (θk , `k ) to the filter is not satisfied. Consequently, the minimum constraint
violation in the filter satisfies always
τk :=
min
(θj ,`j )∈Fk
θj > 0.
(19)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
11
This will allow us to adjust in section 4 the elegant global convergence analysis in [8]
to the modified Algorithm 2. First, we will show that Algorithm 2 allows transition to
fast local convergence.
3.3. Proof of transition to fast local convergence
Our aim is to show the following result.
Theorem 2. Let assumptions (A1)–(A5) hold. Then there is K > 0 such that Algorithm
2 takes full (local) SQP-steps for all k ≥ K, i.e., with some ∆¯ > 0
xk+1 = xk + sk , ksk k∞ < ∆¯
¯ ∀ k ≥ K.
with the unique solution sk of QP(xk , ∆)
In particular, xk converges q-superlinearly to x̄. If ∇xx f and ∇xx ci are Lipschitz continuous in a neighborhood of x̄ then xk converges q-quadratically.
The proof of this result requires some preparation and is given at the end of this section.
In a first step we show that full SQP-steps according to Proposition 1 are eventually
accepted by the filter. To this end, we prove as an auxiliary result that the augmented
Lagrangian function `ρ (x, Y (x)) in (7) with a Lipschitz continuous multipliers function
Y : Ux̄ → Rm is an exact penalty function.
Lemma 2. Let x̄ satisfy (O) and let Y : Ux̄ → Rm be a Lipschitz continuous multipliers
function. Then there exist constants 0 < λ < η, ρ0 > 0 and a neighborhood Bδ (x̄) =
{x : kx − x̄k < δ} such that the augmented Lagrangian function `ρ in (7) satisfies
λ
ρ − ρ0
2
2
2
kx − x̄k2 +
kcE∪A (x)k2 ≤ `ρ (x, Y (x))−`ρ (x̄, ȳ) ≤ η(1+ρ)kx − x̄k2 (20)
2
2
for all ρ ≥ ρ0 and all x ∈ Bδ (x̄).
Moreover, possibly after increasing η > 0, ρ0 > 0, and reducing δ > 0 we have for
the generalized augmented Lagrangian
b̀ρ (x, Y (x)) := `(x, Y (x)) + ρ θ(x, Y (x))
2
(21)
with θ according to (8) that for all ρ ≥ ρ0 and all x ∈ Bδ (x̄)
λ
ρ − ρ0
2
2
kx − x̄k2 +
θ(x, Y (x)) ≤ b̀ρ (x, Y (x))− b̀ρ (x̄, ȳ) ≤ η(1+ρ)kx − x̄k2 . (22)
2
2
b
Proof. Let x ∈ Ux̄ . We use again the notation b
c(x) = cE∪A (x), A(x)
= AE∪A (x).
Using Taylor’s theorem and ∇x `ρ (x̄, ȳ) = 0, ∇y `ρ (x̄, ȳ) = 0, we have with some
(x̃, ỹ) on the line segment between (x̄, ȳ) and (x, Y (x))
`ρ (x, Y (x)) − `ρ (x̄, ȳ)
1
= (x − x̄)T ∇xx `ρ (x̃, ỹ)(x − x̄) + (x − x̄)T ∇xy `ρ (x̃, ỹ)(Y (x) − ȳ).
2
(23)
Now it is obvious that
³
´ ρ
ρ
b T A(x̃).
b
c(x̃) + A(x̃)
∇xx `ρ (x̃, ỹ) = ∇xx `ρ/2 x̃, ỹ + b
2
2
(24)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
12
It is well known that under assumption (O) there exists ρ̄ > 0 such that ∇xx `ρ/2 (x̄, ȳ)
is uniformly positive definite for all ρ ≥ ρ̄, i.e.,
2
sT ∇xx `ρ/2 (x̄, ȳ)s ≥ 4λksk2
∀ s ∈ Rn
with a constant λ > 0, see for example [12, Thm. 17.5].
¡
¢
Let Ly be the Lipschitz constant of Y (x) and let ρ0 := max ρ̄, 4L2y /λ . By continuity, we find δ > 0 such that for all (x, y) with kx − x̄k < δ, ky − ȳk < Ly δ
´
³
ρ0
2
(25)
c(x) s ≥ 2λksk2 ∀ s ∈ Rn ,
sT ∇xx `ρ0 /2 x, y + b
2
where we choose δ so small that the δ-ball Bδ (x̄) = {x : kx − x̄k < δ} is contained
in Ux̄ . Thus, we obtain for all x ∈ Bδ (x̄) by (23), (24), (25)
2
ρ0 b
kA(x̃)(x − x̄)k2
4
b
+ (Y (x) − ȳ)T A(x̃)(x
− x̄).
2
`ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) ≥ λkx − x̄k2 +
Now we have for all α > 0
2
1 b
2
b
2(Y (x) − ȳ)T A(x̃)(x
− x̄) ≥ − kA(x̃)(x
− x̄)k2 − αkY (x) − ȳk2
α
2
1 b
2
≥ − kA(x̃)(x
− x̄)k2 − αL2y kx − x̄k2 .
α
Hence, the choice α = λ/L2y yields (note that ρ0 /4 ≥ L2y /λ =
`ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) ≥
1
α)
2
λ
ρ0 b
2
kx − x̄k2 + kA(x̃)(x
− x̄)k2 .
2
8
(26)
Since `ρ (x̄, ȳ) = `ρ0 (x̄, ȳ) = f (x̄), we have
`ρ (x, Y (x)) − `ρ (x̄, ȳ) = `ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) +
ρ − ρ0
2
kb
c(x)k2 ,
2
(27)
and together with (26) the left inequality in (20) is shown.
On the other hand, it is obvious from (23) that for all x ∈ Ux̄
2
`ρ (x, Y (x)) − `ρ (x̄, ȳ) ≤ η(1 + ρ)kx − x̄k2
with a constant η > 0. This proves the right inequality in (20).
The estimates (22) for b̀ρ (x, Y (x)) in (21) follow now from the following observations. By (O) we can reduce δ > 0 such that for all x ∈ Bδ (x̄)
YA (x) ≥
1
ȳA > 0,
2
c+
I\A (x) = 0.
Thus, we find by (8) a constant µ > 0 with
2
2
2
µkb
c(x)k2 ≤ θ(x, Y (x)) ≤ kb
c(x)k2 + (YI (x)T c−
I (x)) .
(28)
With ρ̂0 = ρ0 /µ we obtain therefore from (26) and b̀ρ̂0 (x̄, ȳ) = `ρ0 (x̄, ȳ) = f (x̄) that
b̀ρ̂ (x, Y (x)) − b̀ρ̂ (x̄, ȳ) ≥ `ρ (x, Y (x)) − `ρ (x̄, ȳ) ≥ λ kx − x̄k2 .
0
0
0
0
2
2
S. Ulbrich: Superlinear local convergence of filter-SQP methods
13
Since an analogue of (27) holds for b̀ρ , this proves as above the left inequality in (22)
after increasing ρ0 to ρ̂0 . On the other hand we have by the right inequality in (28)
b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) ≤ `ρ (x, Y (x)) − `ρ (x̄, ȳ) + ρ (YI (x)T c− (x))2 .
I
2
T −
2
Since c−
A (x̄) = 0, YI\A (x̄) = 0, and Y is Lipschitz at x̄, we have (YI (x) cI (x)) =
2
O(kx − x̄k ). This shows the right inequality in (22) possibly after increasing η.
u
t
We are now in the position to show that for k large enough the sequence {x0j }j≥k of
full (local) SQP-steps is acceptable to the filter.
Lemma 3. Assume that xk → x̄ with x̄ satisfying (O) and that (A5) holds. Then with
e ≤ ∆,
¯ and M ≥ 1 there is an index K such that
some ∆¯ > 0 for any ζ ∈]0, 1[, 0 < ∆
for all k ≥ K the following holds:
The sequence {x0j }j≥k of full SQP-steps defined in Proposition 1 is well defined,
converges q-superlinearly to x̄, and satisfies (6). Moreover, the problems QP(x0j , ∆),
e are all compatible in the sense (18) and for all ∆ ∈ [∆,
e ∆]
¯ they have the same
∆ ≥ ∆,
0 ¯
solution as QP(xj , ∆).
Finally, for all k ≥ K with
kxk − x̄k ≤ M
min
(θl ,`l )∈Fk
kxl − x̄k
(29)
the points (x0j+1 , Y (x0j+1 )), j = k, k + 1, . . ., corresponding to full SQP-steps are
acceptable to
0
, `0k+1 ) ∪ · · · ∪ (θj0 , `0j ).
Fj0 := Fk ∪ (θk , `k ) ∪ (θk+1
Thus, full SQP-steps starting from xk pass the filter test in step 4 of Algorithm 2.
Proof. Let ∆¯ > 0 be as in Proposition 1 and choose Ux̄ as in Proposition 1 for any
e ≤ ∆¯ and ζ = 1/2, say. Moreover, let ρ0 and Bδ (x̄) ⊂ Ux̄ be given by
given 0 < ∆
Lemma 2. Then by Proposition 1 for all xk ∈ Bδ (x̄) the above sequence {x0j }j≥k is
well defined, satisfies (6) and converges q-superlinearly to x̄. In particular, we obtain
¯ and QP(x0 , ∆) coincide for all ∆ ∈ [∆,
e ∆].
¯
from (6) that the solutions of QP(x0j , ∆)
j
e1+ξ for
Without restriction we can choose δ > 0 so small that θ(x, Y (x))1/2 ≤ κ∆ ∆
e are all compatible in the sense (18), since
all x ∈ Bδ (x̄). Then QP(x0j , ∆), ∆ ≥ ∆,
0
{xj }j≥k ⊂ Bδ (x̄) for xk ∈ Bδ (x̄) by (6).
Choose K1 so large that xk ∈ Bδ (x̄) for all k ≥ K1 and choose ρ ≥ ρ0 so large
that
µ
¶
³ρ
´
ρ
ρ − ρ0
β ρ
β
−γ ≥β
and
≥ 1−
.
(30)
2
4
2
4 2
Then we know from (22) in Lemma 2 that for all x ∈ Bδ (x̄) with b̀ρ in (21)
µ
¶
λ
β ρ
2
2
kx − x̄k2 + 1 −
θ(x, Y (x)) ≤ b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) ≤ η(1 + ρ)kx − x̄k2
2
4 2
(31)
holds with constants 0 < λ < η. Let as in (19)
τK1 :=
min
(θj ,`j )∈FK1
θj .
S. Ulbrich: Superlinear local convergence of filter-SQP methods
14
We have already observed that τK1 > 0. By continuity we find 0 < δ1 < δ such
that θ(x, Y (x)) ≤ βτK1 for all x ∈ Bδ1 (x̄). Thus, for all points x ∈ Bδ1 (x̄) the pair
(x, Y (x)) is acceptable to FK1 . Since xk → x̄ we find K2 > K1 with xk ∈ Bδ1 (x̄) for
all k ≥ K2 . Using (6) in Proposition 1, we can choose δ1 so small that for all k ≥ K2
the sequence {x0j }j≥k of full SQP-steps converges linearly with a contraction factor of
at least
s µ
¶
1
1
1 − β/2
λ
0
kxj+1 − x̄k ≤
kx0j − x̄k ≤ kx0j − x̄k. (32)
1−
2
2
1 − β/4 2M η(1 + ρ)
2
Now consider an arbitrary k ≥ K2 such that (29) holds and set δ2 := kxk − x̄k. By (32)
we have x0k+1 ∈ Bδ2 (x̄) and thus (x0k+1 , Y (x0k+1 )) is acceptable to FK1 . We still have
to show that it is acceptable with respect to (θl , `l ) ∈ Fk ∪ (θk , `k ) for K1 ≤ l ≤ k.
For these l we have xl ∈ Bδ (x̄) \ Bδ2 /M (x̄) by (29) and therefore by (31)
µ
¶
b̀ρ (xl , Y (xl )) − b̀ρ (x̄, ȳ) ≥ λ δ 2 + 1 − β ρ θl .
(33)
2M 2 2
4 2
On the other hand, (31) and (32) yield
b̀ρ (x0
k+1 , Y
2
(x0k+1 )) − b̀ρ (x̄, ȳ) ≤ η(1 + ρ)kx0k+1 − x̄k
µ
¶
1
1 − β/2
λ 2
≤
1−
δ .
2
1 − β/4 2M 2 2
(34)
To prove that (x0k+1 , Y (x0k+1 )) is acceptable with respect to (θl , `l ) we show now that
any point (x, Y (x)) with x ∈ Bδ2 (x̄) that is not acceptable with respect to (θl , `l )
satisfies in contrast to (34)
¶
µ
λ 2
b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) > 1 − 1 − β/2
δ .
1 − β/4 2M 2 2
In fact, if (x, Y (x)) is not acceptable to (θl , `l ) then
θ(x, Y (x)) > βθl
and `(x, Y (x)) + γθ(x, Y (x)) > `l .
Thus, we have with (30)
³
´
b̀ρ (x, Y (x)) = `(x, Y (x)) + ρ θ(x, Y (x)) > `l + ρ − γ θ(x, Y (x))
2
2
µ
¶
³ρ
´
ρ
β ρ
b̀
> `l + β
− γ θl ≥ `l + β θl = ρ (xl , Y (xl )) − 1 −
θl
2
4
2 2
´
1 − β/2 ³ b̀
b̀
≥ b̀ρ (xl , Y (xl )) −
ρ (xl , Y (xl )) − ρ (x̄, ȳ) .
1 − β/4
Hereby, we have used (31) in the last step. This shows with (33) that
µ
¶³
´
b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) > 1 − 1 − β/2
b̀ρ (xl , Y (xl )) − b̀ρ (x̄, ȳ)
1 − β/4
¶
µ
λ 2
1 − β/2
δ .
≥ 1−
1 − β/4 2M 2 2
Since this does by (34) not hold for x0k+1 , we have shown that (x0k+1 , Y (x0k+1 )) is
acceptable to Fk ∪ (θk , `k ).
S. Ulbrich: Superlinear local convergence of filter-SQP methods
15
Obviously, the same argument can be applied to x0k+1 and Fk ∪ (θk , `k ) instead
of xk and Fk , since by (32) again (29) is satisfied for x0k+1 and Fk ∪ (θk , `k ). This
0
shows that x0k+2 is acceptable to Fk ∪ (θk `k ) ∪ (θk+1
, `0k+1 ). Thus the acceptability of
0
0
0
(θj+1 , `j+1 ) for Fj follows by induction.
u
t
Finally, we show that the sequence {x0j }j≥k of full SQP-steps passes for k large enough
also the decrease test in step 5.
Lemma 4. Let the assumptions of Lemma 3 hold and let K be as in Lemma 3 for any
e ≤ ∆.
¯ Then after a possible increase of K for all k ≥ K the
fixed ζ ≤ 1/2 and 0 < ∆
0
sequence {xj }j≥k of full SQP-steps defined in Proposition 1 satisfies with yj0 = Y (x0j )
the following implication for all j ≥ k:
∆b
qj0 (s0j ) := ∆qj0 (s0j ) + (yj0 )T c0j > κθ (θj0 )ψ/2
∆`0j (s0j ) ≥ σ∆b
qj0 (s0j ). (35)
=⇒
Hereby, we use the notations ∆`0j (s0j ) = `(x0j , yj0 )−`(x0j +s0j , Y (x0j +s0j )), c0j = c(x0j ),
θj0 = θ(x0j , yj0 ) and analogously for the other quantities.
e
Thus, full SQP-steps starting from xk pass step 5 of Algorithm 2 for all ∆ ≥ ∆
0
0 ¯
for which the solutions of QP(xj , ∆) and QP(xj , ∆) coincide, in particular for all
e ∆].
¯
∆ ∈ [∆,
Proof. By Lemma 3 we know that there exists K > 0 such that for all k ≥ K the
sequence {x0j }j≥k , corresponding to full SQP-steps converges q-superlinearly to x̄ and
satisfies (6) for ζ = 1/2. We then have
kx0j − x̄k ≤
ks0j k
Let
≤
kx0j
1
2j−k
kxk − x̄k,
− x̄k +
kx0j+1
− x̄k ≤
3
2j−k+1
(36)
kxk − x̄k.
∆b
qj0 (s0j ) := ∆qj0 (s0j ) + (yj0 )T c0j .
Obviously, the assertion is proven if we show that with K > 0 large enough for all
j ≥ k ≥ K in the case ∆b
qj0 (s0j ) > σ(θj0 )ψ/2 the following holds
¯ 0 0
¯
¯∆b
qj (sj ) − ∆`0j (s0j )¯ ≤ (1 − σ)∆b
qj0 (s0j ).
Since s0j is also the solution of EQP(x0j ), we have
b0j s0j
b
cj0 = −A
(37)
b0 = AE∪I (x0 ). Moreover, (y 0 )I\A = 0 and thus
with b
cj0 = cE∪I (x0j ), A
j
j
j
b0j s0j = −(yj0 )T A0j s0j .
(yj0 )T c0j = (yj0 )TE∪A b
cj0 = −(yj0 )TE∪A A
This shows that
1
∆b
qj0 (s0j ) = −∇x `(x0j , yj0 )T s0j − s0T
B 0 s0 .
2 j j j
Moreover, we have by (A5) that (Y (x0j + s0j ) − yj0 )I\A = 0 and thus
k(Y (x0j + s0j ) − yj0 )T c(xj + s0j )k ≤ k(Y (x0j + s0j ) − yj0 )E∪A kkcE∪A (x0j + s0j )k
2
= o(ks0j k ).
S. Ulbrich: Superlinear local convergence of filter-SQP methods
16
Hereby, we have used that kY (x0j + s0j ) − yj0 k ≤ Ly ks0j k and that cE∪A (x0j + s0j ) =
b0 s0 + O(ks0 k2 ) = O(ks0 k2 ) by (37). Thus, we obtain with some τ ∈ [0, 1]
b
c0 + A
j
j j
j
j
¯ 0 0
¯
1
¯∆b
qj (sj ) − ∆`0j (s0j )¯ ≤ ks0T
(∇xx `(x0j + τ s0j , yj0 ) − Bj0 )s0j k
2 j
+ k(Y (x0j + s0j ) − yj0 )T c(x0j + s0j )k
(38)
2
= o(ks0j k ).
On the other hand, using the step decomposition (12) and the obvious inequality
2
2
2
2
0t
0n
0t
ks0j k = ks0n
j + sj k ≤ 2(ksj k + ksj k )
we have by (15) and (14) after a possible increase of K for all j ≥ k ≥ K
κB 0 2 ³
κB ´
κB 0t 2
0n 2
0n 2
ksj k − C(ks0n
ksj k − C +
(ks0n
∆qj0 (s0j ) ≥
j k + ksj k ) ≥
j k + ksj k )
2
4
2
κB 0 2
0 ψ/2
ks k − C̃(θj )
≥
4 j
with a constant C̃ > 0. Hereby, we use (14) and ψ ∈]1/2, 1] in the last step and choose
K large enough such that θj0 ≤ 1 for all j ≥ k ≥ K.
01/2
0ψ/2
Let My = 1 + supx∈Ux̄ kY (x)k. Then (yj0 )T c0j ≤ My θj ≤ My θj
we conclude that
κB 0 2
∆b
qj0 (s0j ) ≥
ks k − (C̃ + My )(θj0 )ψ/2 .
4 j
by (8) and
(39)
We are now ready to show the implication (35). Let ∆b
qj0 (s0j ) > κθ (θj0 )ψ/2 . We
consider two cases:
2
2
If ∆b
qj0 (s0j ) ≤ κ8B ks0j k then (39) yields κ8B ks0j k ≤ (C̃ + My )(θj0 )ψ/2 and thus
∆b
qj0 (s0j ) > κθ (θj0 )ψ/2 ≥
κθ κB
2
ks0j k .
8(C̃ + My )
2
Otherwise, we have ∆b
qj0 (s0j ) > κ8B ks0j k . Thus, we obtain in both cases
)
(
κθ κB
κB
2
2
0 0
ks0j k =: κ1 ks0j k .
∆b
qj (sj ) > min
,
8 8(C̃ + My )
(40)
Since xk → x̄, we obtain by (36) and (38) possibly after increasing K that for all
j≥k≥K
¯ 0 0
¯
2
¯∆b
qj (sj ) − ∆`0j (s0j )¯ ≤ (1 − σ)κ1 ks0j k < (1 − σ)∆b
qj0 (s0j ),
where we use (40) in the last inequality. This proves the lemma.
u
t
By combining Lemma 3 and Lemma 4 we are now able to prove Theorem 2.
Proof of Theorem 2. Since (A4) and (A5) hold, we have xk → x̄ with x̄ satisfying
(O) and we can thus apply Proposition 1, Lemma 3, and© Lemma ª
4. Let ∆¯ > 0 be
¯ ∆min /4. Now choose
e := min ∆,
as in Proposition 1, fix some M ≥ 1 and set ∆
K > 0 large enough such that the assertions of Lemma 3 and Lemma 4 hold. Since
kxk+1 − xk k → 0 we find K1 ≥ K such that
©
ª
¯ ∆min
min
∆,
1/2
e1+ξ
e=
, θk = θ(xk , Y (xk ))1/2 ≤ κ∆ ∆
(41)
ksk k∞ < ∆
4
S. Ulbrich: Superlinear local convergence of filter-SQP methods
17
for all k ≥ K1 . Then by (41) and Lemma 3 the problems QP(xk , ∆) are compatible in
e
the sense (18) for all k ≥ K1 as long as ∆ ≥ ∆.
Now let xk , k ≥ K1 , be an arbitrary iterate satisfying (29). We show that xk+1 =
¯ as in Proposition 1.
x0k+1 = xk + s0k , where s0k is the unique solution of QP(xk , ∆)
e and QP(xk , ∆) are
Since the inner iteration halfs ∆ starting from ∆ ≥ ∆min ≥ 4∆
e it either finds an acceptable iterate xk+1 = xk + sk for some
compatible for ∆ ≥ ∆,
e
e 2∆].
e
∆ > 2∆ or it locates a trust-region radius ∆ ∈ [∆,
e < ∆ by (41) and thus sk must coincide with
In the first case we have ksk k∞ < ∆
0
¯
the unique solution sk of QP(xk , ∆) as asserted.
e ∆]
¯ and
In the second case denote the solution of QP(xk , ∆) by s. Since ∆ ∈ [∆,
k ≥ K, we know by Proposition 1 and Lemma 3 that s coincides with the unique
¯ and thus ksk < ∆ by (6). Moreover, by Lemma 3 the iterate
solution s0k of QP(xk , ∆)
∞
0
0
(xk+1 , yk+1 ) = (xk +s0k , Y (xk +s0k )) is acceptable to the filter Fk ∪(θk , `k ) and passes
by Lemma 4 the decrease test in step 5. Thus, the trial iterate (xk + s, Y (xk + s)) is
accepted by the algorithm and we have xk+1 = x0k+1 = xk + s0k , xk+1 = Y (xk+1 ).
Hence, in both cases the algorithm takes the full (local) SQP-step xk+1 = x0k+1 .
The next iterate satisfies again (29) and we obtain by induction that xj = x0j for all
j ≥ k with the iterates {x0j }j≥k according to Proposition 1. Now Proposition 1 yields
the q-superlinear (or q-quadratic) convergence of {xk }. The proof is complete.
u
t
4. Global convergence of the modified filter-SQP algorithm
In this section we show that the global convergence result of Theorem 1 remains true for
the modified filter-SQP method of Algorithm 2. In addition to assumptions (A1)–(A3)
we assume that the used Lagrange multipliers remain uniformly bounded and that the
Lagrange multiplier estimates weight inactive inequality constraints not to strongly.
Assumption (A6).
(A6) There exists a constant My > 0 such that all Lagrange multiplier estimates generated in step 1 or step 3 satisfy
kyk k ≤ My , kyk (s)k ≤ My ,
(yk )I ≥ 0, (yk (s))I ≥ 0
for all k. Moreover, there are constants ML > 0 and ζ > 0 with
½
¾
1
max ξ, − 1 < ζ ≤ 1
ψ
(42)
(43)
such that all Lagrange multiplier estimates y = yk (s) generated in step 3 satisfy
1+ζ
|yIT c−
.
I (xk + s)| ≤ ML ∆
(44)
We note that the condition (44) is not required for Lagrange multipliers generated in the
restoration. (42) is a natural requirement and is consistent with (A5).
Remark 4. In the equality constrained case (44) is void. Therefore, for equality constrained problems (A6) is obviously satisfied by the multipliers in assumption (A5).
S. Ulbrich: Superlinear local convergence of filter-SQP methods
18
Remark 5. Condition (44) can for example be ensured by applying the modification rule
©
ª
1+ζ
max (c−
=⇒ yi = 0, (45)
I (xk + s))i , (cI (xk ) + AI (xk )s)i < −MI ∆
1
2
since (c−
I (xk + s))i ≥ (cI (xk ) + AI (xk )s)i − 2 nMC ∆ , cf. Lemma 6 below.
This rule allows in particular yi 6= 0 for all constraints that are active at the solution
s of QP(xk , ∆).
u
t
Remark 6. If (A4) and (A5) hold and if QP(xk , ∆k ) has sufficiently close to x̄ once the
correct active set (which will fail to occur only in quite pathological situations if we use
Bk = ∇xx `(xk , Y (xk )) close to x̄) then from this iteration on the multipliers Y (x) can
be taken and thus (A5) is compatible with (A6).
This can be seen as follows. If (A4) holds and the solution of QP(xk , ∆k ) has for
xk ∈ Ux̄ the active set E ∪ A then the rule (45) allows to choose yk+1 = Y (xk+1 )
according to (A5). Thus, with k := k + 1 we have yk = Y (xk ) and if xk is close
1/2
enough to x̄ then the condition θk ≤ κ∆ ∆1+ξ in (18) ensures that
1+ξ
kc−
A (xk )k ≤ µ∆
with a constant µ > 0. Using this together with (O) one can show that for all xk ∈ Ux̄
possibly after shrinking Ux̄ the active set of QP(xk , ∆) with Bk = ∇xx (xk , Y (xk ))
is E ∪ A under the compatibility requirement (18) and thus the rule (45) leaves y =
Y (xk + s) unchanged. More precisely, we have the following lemma, which is proven
in the appendix.
Lemma 5. Let (A4) hold. Then we can find K > 0 such that for all k ≥ K the following is true: if Bk = ∇xx `(xk , Y (xk )), yk = Y (xk ) as in (A5) and if QP(xk , ∆) is
compatible in the sense (18) then QP(xk , ∆) has the active set E ∪ A.
u
t
The global convergence of Algorithm 2 can be obtained by a slight modification of the
convergence proof of Fletcher, Leyffer, and Toint [8] for Algorithm 1. Our aim is to
show the following analogue of Theorem 1.
Theorem 3. Let assumptions (A1)–(A3) and (A6) hold. Then the result of applying the
modified filter-SQP algorithm of Algorithm 2 is one of the scenarios (A), (B), or (C)
stated in Theorem 1.
To prove this theorem we only indicate the necessary modifications of the proof in [8].
We have already observed after the statement of Algorithm 2 that it shares the following properties with Algorithm 1 that are essential for the convergence analysis in [8]:
the predicted reduction ∆b
qk (s) depends monotone nonincresing on ∆ and the minimal
constraint violation τk in the filter Fk , see (19), is always positive.
In the following we sketch how the convergence analysis in [8] can be adjusted to
the framework of Algorithm 2.
We start by recalling the following auxiliary result in [8, Lem. 3].
Lemma 6. Let assumptions (A1)–(A3) hold. Then for any feasible point s of QP(xk , ∆)
the following holds.
∆fk (s) ≥ ∆qk (s) − nMB ∆2 ,
1
kcE (xk + s)k∞ ≤ nMC ∆2 ,
2
1
+
kcI (xk + s)k∞ ≤ nMC ∆2 ,
2
where MC = supx∈X k∇xx ci (x)k2 , 1 ≤ i ≤ m.
(46)
(47)
(48)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
19
Proof. This is a direct consequence of Taylor’s theorem, see [8, Lem. 3].
u
t
The next lemma is an analogue of [8, Lem. 4] and shows that the iterate xk + s is
acceptable to the filter if ∆ is small enough and QP(xk , ∆) is feasible.
Lemma 7. Let assumptions (A1)–(A3) and (A6)
QP(xk , ∆),
© hold. If s solves
ª then xk +
2+2ζ
2
2
2
s is acceptable to the filter if ∆
≤ min 1, 4βτk /(mn MC + 4ML ) with τk in
(19) and ζ in (A6).
Proof. By (47), (48), and (A6) we have
2
2
T −
2
θ(xk + s, yk (s)) = kcE (xk + s)k2 + kc+
I (xk + s)k2 + ((yk (s))I cI (xk + s))
1
≤ mn2 MC2 ∆4 + ML2 ∆2+2ζ .
4
©
ª
Thus, θ(xk + s) ≤ βτk holds for ∆2+2ζ ≤ min 1, 4βτk /(mn2 MC2 + 4ML2 ) . This
ensures by (9) and the definition of τk in (19) that the iterate xk + s is acceptable to the
filter.
u
t
The next lemma is an adaption of [8, Lem. 5] to Algorithm 2.
Lemma 8. Let assumptions (A1)–(A3) and (A6) hold and let x◦ ∈ X be a feasible
point of problem (P) at which MFCQ holds but which is not a KT point. Then there exists
a neighborhood N ◦ of x◦ and constants ε, µ
b, κ > 0 such that for all xk ∈ N ◦ ∩ X and
all ∆ for which
1/(2+2ξ)
ψ/2
1/2
µ
b(θk
+ θk + θk ) ≤ ∆ ≤ κ
(49)
it follows that QP(xk , ∆) is compatible in the sense (18) with solution s at which the
predicted reduction (17) satisfies
∆b
qk (s) ≥
1
ψ/2
ε∆ + κθ θk
4
(50)
and the actual reduction (16) satisfies the sufficient reduction condition
∆`k (s) ≥ σ∆b
qk (s)
(51)
∆`k (s) ≥ γθ(xk + s).
(52)
as well as
Proof. In [8, Lem. 5] it is shown that there exists a neighborhood N ◦ of x◦ and constants ε, µ, κ > 0 such that for all xk ∈ N ◦ ∩ X and all ∆ with
µhk ≤ ∆ ≤ κ
the problem QP(xk , ∆) has a feasible solution s at which
∆qk (s) ≥
1
ε∆,
3
∆fk (s) ≥ σ∆qk (s),
∆fk (s) ≥ γh(xk + s).
Without restriction we can choose κ ≤ 1. Now set
½
¾
κθ + My + 1
−1/(1+ξ) √
µ
b = max 1, κ∆
.
, mµ, 12
ε
Then (49) implies
1/2
∆1+ξ ≥ κ−1
∆ θk .
(53)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
20
Thus, QP(xk , ∆) is compatible in the sense (18).
√ 1/2
By (10) we have mθk ≥ hk . Hence, (49) implies by our choice of µ
b
1/2
µhk ≤ µ
bθk ≤ ∆ ≤ κ,
κθ + My + 1 ψ/2
1/2
∆ ≥ 12
(θk + θk ).
ε
(54)
Therefore, (53) is satisfied by [8, Lem. 5] and yields with (49), (54)
∆qk (s) ≥
1
1
1
ψ/2
1/2
ε∆ + ε∆ ≥ ε∆ + (κθ + My + 1)(θk + θk ).
4
12
4
This shows that
1/2
∆b
qk (s) = ∆qk (s) + ykT ck ≥ ∆qk (s) − (My + 1)θk
≥
1
ψ/2
ε∆ + κθ θk
4
and (50) is proven.
It remains to show (51) and (52). By (42) in (A6) we have (yk (s))TI c−
I (xk + s) ≤ 0.
This yields with (47), (48)
∆`k (s) = ∆fk (s) + ykT ck − yk (s)T c(xk + s)
1√
≥ ∆fk (s) + ykT ck − My
mnMC ∆2 .
2
Together with (46) and (50) we obtain
¶
µ
1√
mnMC ∆2
∆`k (s) ≥ ∆b
qk (s) − nMB + My
2
µ
¶
1
1√
≥ σ∆b
qk (s) + (1 − σ) ε∆ − nMB + My
mnMC ∆2 .
4
2
(55)
Thus, (51) holds for all
∆≤
(1 − σ) 14 ε
=: κ1
√
nMB + My 12 mnMC
(56)
and consequently (49) ensures (51) if we choose κ ≤ κ1 . Finally, (51), (50), (47), (48),
and (A6) yield
1
1
∆`k (s) − γθ(xk + s) ≥ σ ε∆ − γ mn2 MC2 ∆4 − γML2 ∆2+2ζ
4
4
and the right hand side is nonnegative for all
( µ
¶1/(1+2ζ) )
σε
=: κ2 .
∆ ≤ min 1,
γmn2 MC2 + 4γML2
Thus, also (52) is satisfied if we choose κ ≤ κ2 in (49).
(57)
(58)
u
t
As a last preparation for the proof of Theorem 3 we proceed as in [8] and show that the
inner loop of Algorithm 2 terminates finitely. The following lemma is a counterpart of
[8, Lem. 6].
Lemma 9. Let assumptions (A1)–(A3) and (A6) hold. Then the inner iteration of Algorithm 2 terminates finitely.
S. Ulbrich: Superlinear local convergence of filter-SQP methods
21
Proof. Again, we only describe the modifications of the proof of Lemma 6 in [8].
If xk is a KT point of (P) then s = 0 solves QP(xk , ∆) and the inner iteration
terminates in step 3.
Otherwise, we have ∆ → 0 if the inner iteration does not terminate finitely. If
θk > 0 then QP(xk , ∆) is obviously incompatible in the sense (18) for all sufficiently
small ∆ > 0. Thus, the inner iteration terminates finitely in step 2.
If θk = 0 then we find exactly as in the proof of Lemma 6 in [8, p. 55] a constant
η > 0 with ∆qk (s) ≥ 12 η∆ for all 0 < ∆ ≤ κ with a constant κ > 0 sufficiently
small. Now, θk = 0 implies ykT ck = 0 by (11) and thus ∆b
qk (s) = ∆qk (s) ≥ 12 η∆ >
ψ/2
κθ θk = 0. Thus, it follows that ∆b
qk (s) ≥ 41 ε∆ with ε = 2η. Now we can continue
exactly as in the proof of Lemma 8. In fact, we obtain (55) and thus ∆`k (s) ≥ σ∆b
qk (s)
folds for all ∆ ≤ κ1 with κ1 > 0 according to (56). Hence, the sufficient decrease
condition in step 5 is satisfied for ∆ ≤ min {κ, κ1 }. Moreover, we have also (57) and
thus ∆`k (s) ≥ γθ(xk + s) is satisfied for ∆ ≤ κ2 with κ2 > 0 in (58). Thus, xk is for
∆ ≤ min {κ, κ1 , κ2 } also acceptable relative to (θk , `k ), cf. (9). Moreover, for
( µ
¶1/(2+2ζ) )
4βτk
∆ ≤ min 1,
=: κ3
mn2 MC2 + 4ML2
the iterate xk is by Lemma 7 also acceptable to Fk . Hence, for ∆ ≤ min {κ, κ1 , κ2 , κ3 }
also the filter test of step 4 is satisfied and thus the inner iteration terminates in step 7.
The proof is complete.
u
t
We are now able to adapt the global convergence proof of [8] to Algorithm 2.
Proof of Theorem 3. We have only to consider case (C). Then the algorithm runs
infinitely and since the inner iteration is finite by Lemma 9, the outer iteration produces an infinite sequence of iterates xk . By (A1) the sequence {xk } has at least an
accumulation point.
Consider the case that the outer iteration produces an infinite sequence of h-type
iterations. Exactly as in [8] we obtain a subsequence S of h-type iterations such that
xk → x∞ with some x∞ , τk+1 = θk < τk for k ∈ S and Lemma 1 with (θk , `k )
instead of (hk , fk ) yields θk → 0 for k ∈ S. Thus, x∞ is feasible and if x∞ violates
MFCQ then (C) is satisfied.
Now consider the case where MFCQ is satisfied at x∞ and x∞ is not a KT point.
As in [8] we will derive a contradiction. Let N ∞ be the neighborhood of x∞ given in
Lemma 7 and choose K > 0 so large that xk ∈ N ∞ for all k ∈ S, k ≥ K. Lemma 7
and Lemma 8 show that with constants µ
b, κ > 0 for all k ∈ S, k ≥ K and all ∆ with
(
µ
¶1/(2+2ζ) )
4βτk
1/2
ψ/2
1/(2+2ξ)
µ
b(θk + θk + θk
) ≤ ∆ ≤ min κ, 1,
(59)
mn2 MC2 + 4ML2
the problem QP(xk , ∆) is compatible and the solution s satisfies ∆`k (s) ≥ σ∆b
qk (s),
ψ/2
∆b
qk (s) > κθ θk , ∆`k (s) ≥ γθ(xk + s) (see Lemma 8), and (θ(xk + s), `(xk +
s, yk (s))) is acceptable to Fk (see Lemma 7). In particular, for all
1/2
∆≥µ
b(θk
ψ/2
+ θk
1/(2+2ξ)
+ θk
)
no h-type iteration occurs, since QP(xk , ∆) is compatible in the sense (18) and ∆b
qk (s) >
ψ/2
κθ θk (recall that ∆b
qk (s) is nondecreasing in ∆) and for ∆ satisfying (59) an f -type
step will occur. Hence, the algorithm will take an f -type step in iteration k ∈ S, k ≥ K,
S. Ulbrich: Superlinear local convergence of filter-SQP methods
22
if halfing of ∆ beginning by some ∆ ≥ ∆min in the inner loop locates a ∆ satisfying
(59) (then the conditions for an f -type step will be met if this was not the case before).
But we have τk+1 = θk < τk and θk → 0 for k ∈ S. Since by (43)
¾
½
1
1 ψ
1
max
, ,
>
,
2 2 2 + 2ξ
2 + 2ζ
the upper bound in (49) will for all k ∈ S large enough be at least twice the lower
bound. Thus, for large k ∈ S an f -type step will be taken which contradicts the fact
that the iterations k ∈ S are composed of h-type steps. Thus, the assumption that x∞
is not a KT point was wrong.
Finally, consider the case where the algorithm takes only a finite number of h-type
ψ/2
steps. Then there is K > 0 such that `k − `k+1 ≥ σ∆b
qk (s) > κθ θk for all k ≥ K.
By (A1), (A2), (A6) `k is uniformly bounded and thus
X
X
X
ψ/2
∞>
(`k − `k+1 ) ≥
σ∆b
qk (s) >
σκθ θk ≥ 0.
k≥K
k≥K
k≥K
Therefore, we have θk → 0 and thus any accumulation point x∞ of {xk } which exists
by (A1) is feasible. Assume that x∞ satisfies MFCQ but is no KT point and let S be
a subsequence of iterates converging to x∞ . As above let N ∞ be the neighborhood of
x∞ given in Lemma 7 and choose K > 0 so large that xk ∈ N ∞ for all k ∈ S, k ≥ K.
Since no h-type steps occur for k ≥ K we have τk = τK > 0 for all k ≥ K and exactly
as above an f -type step is guaranteed for all k ∈ S, k ≥ K if
(
µ
¶1/(2+2ζ) )
4βτK
1/2
ψ/2
1/(2+2ξ)
µ
b(θk + θk + θk
) ≤ ∆ ≤ min κ, 1,
=: κ
b.
mn2 MC2 + 4ML2
(60)
This time the upper bound is constant and the lower bound is half the upper bound for
k ≥ K possibly after increasing K. Hence, for all k ∈ S, k ≥ K, the f -type step
occurs with ∆ ≥ min {∆min , κ
b/2} and thus (50) yields
`k − `k+1 ≥ σ∆b
qk (sk ) ≥
1
ε min {∆min , κ
b/2} .
4
ψ/2
Since `k − `k+1 > κθ θk ≥ 0 for all k ≥ K, this implies `k → −∞, which is a
contradiction to the boundedness of `k . Thus x∞ is a KT point or violates MFCQ. u
t
5. Conclusions
We have proposed a modified version of the globally convergent trust-region filterSQP method by Fletcher, Leyffer, and Toint [8] such that the transition to fast local
convergence is ensured under appropriate local assumptions on the Hessian approximation in the SQP-problems and on the Lagrange multipliers. Hereby, the original
SQP-steps without additional second-order correction can be used. The modified algorithm is globally convergent if the Lagrange multipliers used are bounded and weight
inactive inequality constraints not to strongly. Thus, in the equality constrained case
only the boundedness of the Lagrange multipliers is assumed globally which is always
compatible with our local requirement on the Lagrange multipliers. In the inequality
constrained case our global and local conditions on the Lagrange multipliers are–in
particular–compatible as soon as sufficiently close to the limit point x̄ a subproblem
S. Ulbrich: Superlinear local convergence of filter-SQP methods
23
QP(xk , ∆k ) identifies the active set correctly. From this point on the local and global
conditions on the Lagrange multipliers can be met.
Our modified algorithm is very similar to the filter-SQP methods in Fletcher, Gould,
Leyffer, Toint, and Wächter [6] and in Conn, Gould, and Toint [4], but uses the Lagrangian function value instead of the objective function value and a different feasibility measure in the filter. Thus our algorithm can also be viewed as a variant of the latter
algorithms. We decided to build on the results in [8], since the global convergence analysis is less technical than in [4,6]. However, it would be interesting to apply the proof
techniques of [4,6] to get an alternative convergence proof for our modified algorithm.
In conclusion, the results of the present paper show that the filter technique does
allow transition to fast local convergence without using second order correction steps
if appropriate filter entries are used. It is an interesting topic for future research if the
proposed modifications can help to further improve the encouraging numerical performance of filter-SQP methods reported in the past [3,7].
Appendix: Proof of Lemma 5
e with a constant
It is sufficient to show that the assertion is true for all 0 < ∆ ≤ ∆
e
e
∆ > 0. In fact, since by (A4) xk → x̄ holds, we find K > 0 with kxk+1 − xk k∞ < ∆
for all k ≥ K and thus for all k ≥ K Algorithm 2 produces the same sequence if we
e
replace ∆k in all iterations k ≥ K by min{∆k , ∆}.
b
Let again A(x) := AE∪A (x), b
c(x) := cE∪A (x). Moreover, we set
AE,k := AE (xk ),
AA,k := AA (xk ),
cE,k := cE (xk ),
cA,k := cA (xk ).
We know from (O) in (A4) that
b
c(x̄) = 0,
b T ȳE∪A ,
g(x̄) = −A(x̄)
ȳA ≥ 2ε > 0,
cI\A (x̄) < −ε
e > 0 so small that cI\A (x) < 0 for
for some ε > 0. Now choose K > 0 so large and ∆
e k ≥ K. Then QP(xk , ∆), 0 < ∆ ≤ ∆
e is for k ≥ K
all feasible points of QP(xk , ∆),
equivalent to
1
min qk (s) := gkT s + sT Bk s,
2
AE,k s = −cE,k ,
AA,k s ≤ −cA,k .
(61)
By continuity and (O), we can increase K such that with Bk = ∇xx `(xk , Y (xk )),
yk = Y (xk ) for all k ≥ K the following holds
n
o
2
T
b
s Bk s ≥ κB ksk2 on s : Ak s = 0 ,
bk A
bTk )−1 A
bk k + kA
bTk (A
bk A
bTk )−1 k ≤ MA ,
k(A
1
(yk )A ≥ ȳA ≥ ε.
2
Moreover, we have constants with kgk k ≤ Mg√
, kyk k ≤ My . Since (yk )I\A = 0 if
yk = Y (xk ) we obtain by (8) with µ = min{1, ε} > 0
1/2
θk
≥ µkb
ck k.
In addition, we observe that
T
bTk ȳE∪A + (gk − g(x̄)) + (A
bk − A(x̄))
b
bTk ȳE∪A + rk ,
gk = −A
ȳE∪A =: −A
(62)
S. Ulbrich: Superlinear local convergence of filter-SQP methods
24
where krk k1 ≤ Lg kxk − x̄k1 with a constant Lg . Finally, we increase K such that for
all k ≥ K
ε
kxk − x̄k1 ≤
.
(63)
16MA Lg
e k ≥ K, that is
Now let s be the solution of a problem QP(xk , ∆), 0 < ∆ ≤ ∆,
compatible in the sense (18). We know that s solves also (61) and have by (18)
1/2
µkb
ck k ≤ θk
≤ κ∆ ∆1+ξ .
(64)
bk s = −b
We want to show that A
ck . We first compute an upper bound for qk (s). If
we take
bk A
bTk (A
bTk )−1 b
ŝ = −A
ck ,
bk ŝ = −b
then A
ck and by (64)
kŝk ≤ MA kb
ck k ≤
−1/ξ
e C
Thus, for ∆ ≤ min{1, ∆,
1
e1
function value is for ∆ ≤ ∆
MA κ∆ 1+ξ
∆
=: C1 ∆1+ξ .
µ
e1 we have kŝk ≤ kŝk ≤ ∆ and the objective
} =: ∆
∞
qk (ŝ) ≤ Mg C1 ∆1+ξ + nMB C12 ∆2+2ξ ≤ (Mg C1 + nMB C12 )∆1+ξ .
e2 ≤ ∆
e1 such that for all 0 < ∆ ≤ ∆
e2
Thus we find a constant 0 < ∆
qk (ŝ) ≤
ε
∆.
8MA
(65)
Now we decompose the optimal solution s in a normal and tangential component
s = sn + st ,
bTk v,
sn = A
st = W z,
bk . The corresponding objecwhere the columns of W are an ONB of the nullspace of A
tive function value is by (62)
T
bk sn + rkT s + 1 sT Bs
qk (s) = −ȳE∪A
A
2
T
bk sn − (Lg kxk − x̄k ∆ + nMB ∆2 ).
≥ −ȳE∪A
A
1
(66)
e3 ≤ ∆
e2 such that for all 0 < ∆ ≤ ∆
e3
Using (63), we find 0 < ∆
T
bk sn −
qk (s) ≥ −ȳE∪A
A
ε
∆.
8MA
(67)
e4 ≤ ∆
e3 for all 0 < ∆ ≤ ∆
e4 the solution s of
We show next that with a constant 0 < ∆
n
a compatible problem QP(xk , ∆) has a normal component s with ksn k∞ ≤ ∆/2. To
this end we show that qk (s) > qk (ŝ) would hold if ksn k∞ > ∆/2.
bT v, we have v = (A
bk A
bT )−1 A
bk sn and thus
Since sn = A
k
k
kvk ≤ MA ksn k.
This yields
bk sn k ≥ v T A
bk sn = ksn k2 ≥ 1 kvkksn k
kvkkA
MA
S. Ulbrich: Superlinear local convergence of filter-SQP methods
25
and we conclude that
bk sn k ≥
kA
1
1
ksn k ≥
ksn k∞ .
MA
MA
Since
bA,k sn ≤ −cA,k ,
A
bA,k sn )+ ≤ −c− and thus
we have for the positive part 0 ≤ (A
A,k
°µ +
°
¶
n °
° A
b
κ∆ 1+ξ
° A,k s °
∆
=: C3 ∆1+ξ .
°
° ≤ kck k ≤
bE,k sn °
° A
µ
We conclude that
b− sn k ≥ kA
bk sn k − C3 ∆1+ξ ≥ 1 ksn k − C3 ∆1+ξ .
kA
∞
A,k
MA
This gives for ksn k∞ ≥ ∆/2
T
1+ξ
bk sn ≥ −ȳ T A
b− n
b− sn k − My C3 ∆1+ξ
−ȳE∪A
A
≥ 2εkA
A A,k s − My C3 ∆
A,k
1
2ε n
ks k∞ − C3 (2ε + My )∆1+ξ
(68)
MA
2ε
ε
≥
∆ − C3 (2ε + My )∆1+ξ ≥
∆
2MA
2MA
e4 with a constant 0 < ∆
e4 ≤ ∆
e3 . Thus, we obtain by combining (65), (67),
for ∆ ≤ ∆
e
and (68) for 0 < ∆ ≤ ∆4
ε
ε
ε
ε
∆−
∆−
∆=
∆>0
qk (s) − qk (ŝ) ≥
2MA
8MA
8MA
4MA
e4 ensures that ksn k ≤ ∆/2.
which contradicts the optimality of s. Therefore, ∆ < ∆
∞
bk s = −b
e4 . Assume the
We are now in the position to show that A
ck for ∆ < ∆
bk corresponding to an inequality constraint
contrary. Then there exists a row aTi of A
T
with ai s < (−b
ck )i . Set
u
bTk (A
bk A
bTk )−1 ei ,
δs = ∆
, u=A
2kuk∞
≥
where ei denotes the i-th unit vector. For all components with |(sn +st )j | = ∆ we have
|(st )j | ≥ ∆/2, since |(sn )j | ≤ ∆/2. Therefore, we have with s(α) := (1 − α)st +
sn + αδs for small α > 0
bk s(α) = A
bk sn + α∆ ei
ks(α)k∞ ≤ ∆,
A
≤ −b
ck .
2kuk∞
bT ȳE∪A + rk
Now we have by using that gk = −A
k
µ
¶
d
1
d
T
T
qk (s(α)) =
gk s(α) + s(α) Bk s(α)
dα
dα
2
= gkT (δs − st ) + s(α)T Bk (δs − st )
(ȳE∪A )i
+ rkT (δs − st ) + s(α)T Bk (δs − st )
2kuk∞
2ε
≤−
∆ + (2Lg kxk − x̄k1 ∆ + 2nMB ∆2 )
2MA
ε
ε
∆+
∆ < 0.
≤−
MA
4MA
= −∆
S. Ulbrich: Superlinear local convergence of filter-SQP methods
26
e3 , see (66), (67). But since s(α) is feasible
In the last step, we use the choice of ∆
d
for small α > 0, dα qk (s(α)) < 0 contradicts the optimality of s and the assumption
e=∆
e4 .
aTi s < (−cA,k )i for some i was wrong. This completes the proof by setting ∆
u
t
References
1. D. P. B ERTSEKAS, Constrained optimization and Lagrange multiplier methods, Computer Science and
Applied Mathematics, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1982.
2. P. T. B OGGS AND J. W. T OLLE, Sequential quadratic programming, in Acta numerica, 1995, Acta
Numer., Cambridge Univ. Press, Cambridge, 1995, pp. 1–51.
3. C. M. C HIN AND R. F LETCHER, Numerical results for SLPSQP, Filter-SQP and LANCELOT on
selected Cute test problems, Tech. Report NA/203, Department of Mathematics, Dundee University,
Dundee, Scotland, 2001.
4. A. R. C ONN , N. I. M. G OULD , AND P. L. T OINT, Trust-region methods, MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000.
5. F. FACCHINEI AND S. L UCIDI, Quadratically and superlinearly convergent algorithms for the solution
of inequality constrained minimization problems, J. Optim. Theory Appl., 85 (1995), pp. 265–289.
6. R. F LETCHER , N. I. M. G OULD , S. L EYFFER , P. L. T OINT, AND A. W ÄCHTER, Global convergence
of a trust-region SQP-filter algorithm for general nonlinear programming, Technical Report 99/03, Department of Mathematics, University of Namur, Namur, Belgium, 1999 (revised 2001). To appear in
SIAM J. Optim.
7. R. F LETCHER AND S. L EYFFER, Nonlinear programming without a penalty function, Math. Program.,
91 (2002), pp. 239–269.
8. R. F LETCHER , S. L EYFFER , AND P. L. T OINT, On the global convergence of a filter–SQP algorithm,
SIAM J. Optim., 13 (2002), pp. 44–59.
9. T. G LAD AND E. P OLAK, A multiplier method with automatic limitation of penalty growth, Math. Programming, 17 (1979), pp. 140–155.
10. C. C. G ONZAGA , E. K ARAS , AND M. VANTI, A globally convergent filter method for nonlinear programming, Technical Report, Department of Mathematics, Federal University of Santa Catarina, Brazil,
2001 (revised 2002).
11. S. L UCIDI, New results on a continuously differentiable exact penalty function, SIAM J. Optim., 2
(1992), pp. 558–574.
12. J. N OCEDAL AND S. J. W RIGHT, Numerical optimization, Springer Series in Operations Research,
Springer-Verlag, New York, 1999.
13. P. L. T OINT, Non-monotone filter methods, Presentation at the SIAM Conference on Optimization, May
20–22, 2002, Toronto, Canada, 2002.
14. M. U LBRICH AND S. U LBRICH, Non-monotone trust region methods for nonlinear equality constrained
optimization without a penalty function, Math. Program., Ser. B (2002), pp. DOI 10.1007/s10107–002–
0343–9, in press.
15. M. U LBRICH , S. U LBRICH , AND L. N. V ICENTE, A globally convergent primal-dual interior-point
filter method for nonconvex nonlinear programming, Technical Report TR00-12, Department of Computational and Applied Mathematics, Rice University, Houston, Texas 77005-1892, 2000 (revised 2002).
16. A. W ÄCHTER AND L. T. B IEGLER, Global and local convergence of line search filter methods for nonlinear programming, CAPD Technical Report B-01-09, Department of Chemical Engineering, Carnegie
Mellon University, Pittsburgh, Pennsylvania, 2001 (revised 2002).
Download