Stefan Ulbrich On the superlinear local convergence of a filter-SQP method October 2002 Abstract. Transition to superlinear local convergence is shown for a modified version of the trust-region filter-SQP method for nonlinear programming introduced by Fletcher, Leyffer, and Toint [8]. Hereby, the original trust-region SQP-steps can be used without an additional second order correction. The main modification consists in using the Lagrangian function value instead of the objective function value in the filter together with an appropriate infeasibility measure. Moreover, it is shown that the modified trust-region filterSQP method has the same global convergence properties as the original algorithm in [8]. Key words. nonlinear programming, superlinear convergence, global convergence, filter, SQP 1. Introduction This paper is concerned with the transition of trust-region filter-SQP methods to superlinear local convergence without using second order corrections. Hereby, we consider the general nonlinear programming problem (NLP) min f (x) subject to cE (x) = 0, cI (x) ≤ 0 (P) with twice continuously differentiable functions f : Rn → R, c : Rn → Rm and cE = (ci )i∈E , cI = (ci )i∈I , where E = {1, . . . , mE } and I = {mE + 1, . . . , m} denote the indices of equality and inequality constraints, respectively. With these notations the Lagrangian function associated with (P) is given by `(x, y) = f (x) + y T c(x), where y = (yET , yIT )T ∈ Rm , yI ≥ 0. We propose a modification of the globally convergent filter-SQP method by Fletcher, Leyffer, and Toint [8] that has the same global convergence properties and ensures in addition transition to superlinear local convergence under appropriate assumptions on the Hessian approximation in the SQP-problems close to a regular minimum. In the pioneering work [7], Fletcher and Leyffer proposed the filter technique for the globalization of SQP methods, which avoids the use of classical merit functions with penalty term. Instead, the the acceptability of steps is determined by comparing the constraint violation and objective function value with previous iterates collected in a filter. The new iterate is acceptable to the filter if either feasibility or objective function value is sufficiently improved in comparison to all iterates bookmarked in the current filter. The promising numerical results in [7] led to a growing interest in filter methods Stefan Ulbrich: Zentrum Mathematik, Technische Universität München, Boltzmannstr. 3, 85747 Garching b. München, Germany, E-mail: sulbrich@ma.tum.de. S. Ulbrich: Superlinear local convergence of filter-SQP methods 2 in recent years. By augmenting the basic filter concept with a sufficient decrease condition for the objective function if the current iterate is sufficiently feasible, the global convergence of three variants of trust-region filter-SQP methods was shown in Fletcher, Gould, Leyffer, Toint, and Wächter [6], Conn, Gould, and Toint [4], Fletcher, Leyffer, and Toint [8], and Fletcher, Gould, Leyffer, Toint, and Wächter [6]. Global convergence of a filter method for nonlinear programming was subsequently also shown by Gonzaga, Karas, and Vanti [10]. A primal-dual interior-point filter method for NLP is proposed in Ulbrich, Ulbrich, and Vicente [15] and global convergence is shown. An alternative nonmonotone trust-region SQP-method without penalty function for equality constrained problems was proposed by Ulbrich and Ulbrich [14]. For this latter algorithm, in addition to global convergence also the transition to superlinear local convergence was proven in [14]. Hereby, no second order correction is necessary. So far, to the best of our knowledge, the only result on the superlinear local convergence of a filter method is shown in Wächter and Biegler [16] for an interior point linesearch filter method with second order correction steps. In this paper we propose a modified version of the filter-SQP algorithm in Fletcher, Leyffer, and Toint [8] that ensures transition to superlinear local convergence without using second order correction steps. We believe that the modifications and proof techniques of the present paper can be adapted to ensure also the fast local convergence of the filter methods in [4,6, 16] without second order corrections. The original filter-SQP methods in [4, 6, 8] can–at least if no second order correction steps are used–be affected by the well known Maratos effect. In fact, these algorithms use constraint violation and objective function value in the filter. By the Maratos effect a full SQP-step can lead to an increase of both these filter components even arbitrarily close to a regular minimizer. This makes the full SQP-step unacceptable for the filter and can prohibit fast local convergence. To avoid this problem, we use the value of the Lagrangian function in the filter instead of the objective function and modify the feasibility measure in the filter slightly. The paper is organized as follows. In section 2 we describe briefly the trust-region filter-SQP method of Fletcher, Leyffer, and Toint [8] and recall the global convergence result proven in [8]. In section 3 we derive a modified filter-SQP method and prove its transition to superlinear local convergence close to a point satisfying the strong second order sufficient condition. In subsection 3.1 we describe the local requirements on Hessian approximation and Lagrange multipliers. In subsection 3.2 we motivate the necessary changes to avoid the Maratos effect and formulate the modified trust-region filter-SQP method. In subsection 3.3 we prove the transition to superlinear local convergence of the modified algorithm. In section 4 we show how the global convergence proof in [8] can be adapted to the modified algorithm. 2. A trust-region filter-SQP algorithm In this section we recall the globally convergent trust-region filter-SQP-framework proposed by Fletcher, Leyffer, and Toint [8] for the solution of (P). Given the current iterate xk and a trust-region radius ∆k , the step computation is based on the approximate solution of the trust-region SQP-subproblem QP(xk , ∆k ) minimize qk (s) subject to cE (xk ) + AE (xk )s = 0, cI (xk ) + AI (xk )s ≤ 0, QP(xk , ∆k ) ksk∞ ≤ ∆k . S. Ulbrich: Superlinear local convergence of filter-SQP methods 3 Hereby, AE (xk ) = (ai (x)T )i∈E := ∇cE (xk )T , AI (xk ) = (ai (x)T )i∈I := ∇cI (xk )T are the Jacobians of cE and cI , respectively, 1 qk (s) = fk + gkT s + sT Bk s 2 with gk = g(xk ) := ∇f (xk ) is a quadratic model of `(xk + s, yk ) − ykT A(xk )s, A(xk ) = ∇c(xk )T , and Bk ∈ Rn,n is a symmetric approximation of the Hessian of the Lagrangian X Hk := ∇xx `(xk , yk ) = ∇xx f (xk ) + (yk )i ∇xx ci (xk ). i∈E∪I To ensure global convergence, filter methods evaluate the quality of a search direction by using a filter Fk ⊂ {(hj , fj ) : j = 0, . . . , k − 1} that collects pairs (hj , fj ) := (h(xj ), f (xj )) from certain previous iterations. Hereby, h(x) is an infeasibility measure. Fletcher, Leyffer, and Toint [8] propose to use h(x) := kcE (x)k1 + kc+ I (x)k1 , where c+ I (x) = (max {ci (x), 0})i∈I . Since a local solution x̄ of (P) minimizes f locally under the constraint h(x) = 0, a promising trial step s should either reduce the contraint violation h or the function value f . To ensure sufficient decrease of at least one of the two criteria, Fletcher et al. [8] call x acceptable to the filter Fk if for all (hj , fj ) ∈ Fk either h(x) ≤ βhj or f (x) + γh(x) ≤ fj (1) is satisfied with constants 0 < γ < β < 1. Following [8] a pair (hj , fj ) is called dominated by (h(x), f (x)) if h(x) ≤ hj and f (x) ≤ fj . To update the information in the filter appropriate points x, which are acceptable to the filter, are added to the filter Fk . This operation means that (h(x), f (x)) is added to the set of pairs (hj , fj ) in Fk and subsequently any old pair is removed from the filter that is dominated by the new pair (h(x), f (x)). The acceptance test (1) has the useful inclusion property that after adding a pair (h(x), f (x)) to the filter all unacceptable points of the old filter remain unacceptable for the new filter (although dominated pairs are removed). This allows to show that the infeasibility tends to zero for any infinite subsequence of iterates added to the filter, which is an essential ingredient for the global convergence analysis of filter methods. More precisely, we have the following result, cf. the corollary of Lemma 1 in [8]. Lemma 1. For any infinite sequence of iterates xk for which hk > 0, {fk } is bounded below, and (hk , fk ) is entered to the filter, it follows that hk → 0. Proof. See Lemma 1 and its corollary in [8]. u t S. Ulbrich: Superlinear local convergence of filter-SQP methods 4 Obviously, the filter concept alone does not necessarily enforce convergence to a local minimum. For example if the iterates converge to an arbitrary feasible point and the infeasibility measure h is in each iteration reduced by a factor of at least β then the iterates will be acceptable to the filter. Therefore, the filter-SQP algorithm of Fletcher, Leyffer, and Toint [8] imposes a sufficient reduction condition of the form ∆fk (s) := f (xk ) − f (xk + s) ≥ σ∆qk (s), ∆qk (s) := qk (0) − qk (s) with a constant 0 < σ < 1 if the feasibility is good enough, more precisely if ∆qk (s) > 0. Finally, it is obvious that the SQP-subproblem QP(xk , ∆) can be incompatible (infeasible). In this case, the algorithm invokes a restoration phase that finds an iterate xk and a trust-region radius ∆ ≥ ∆min > 0 such that QP(xk , ∆) is compatible and (hk , fk ) is acceptable to the filter Fk . We have now collected all ingredients to give a precise statement of the filter-SQP method of Fletcher, Leyffer, and Toint [8]. Algorithm 1. Filter-SQP Algorithm of [8] Let 0 < γ < β < 1, σ ∈]0, 1[, and ∆min > 0. 0. Initialize with a point x, k = 1 and filter F1 = {(u, −∞)} with some u ≥ βh(x). 1. Enter restoration phase to find a point xk acceptable to the filter Fk such that e is compatible for some ∆ e ≥ ∆min and initialize ∆ = ∆. e QP(xk , ∆) 2. Try to compute a solution s of QP(xk , ∆). If the QP is incompatible then include (hk , fk ) in the filter, set k := k + 1 and goto step 1 (h-type iteration). 3. If s = 0 stop with KT point xk . 4. If xk + s is not acceptable to Fk ∪ (hk , fk ) then set ∆ := ∆/2 and goto 2. 5. If ∆qk (s) := qk (0) − qk (s) > 0 and ∆fk (s) := fk − f (xk + s) < σ∆qk (s) then set ∆ := ∆/2 and goto 2. 6. If ∆qk (s) ≤ 0 then include (hk , fk ) in the filter (h-type iteration). 7. Set sk = s, ∆k = ∆, xk+1 = xk + sk , and initialize ∆k+1 ≥ ∆min . Set k := k + 1 and goto step 2. A convergence analysis of this algorithm is given in [8] under the following assumptions. Assumptions (A1)–(A3). (A1) All points that are sampled by the algorithm lie in a nonempty convex and compact set X. (A2) The problem functions f (x) and c(x) are twice continuously differentiable on an open set containing X. (A3) There exists an MB > 0 such that the Hessian matrices Bk satisfy kBk k2 ≤ MB for all k. The following global convergence result is proven in [8]. Theorem 1. (Global convergence result, cf. [8, Thm. 7]) Under assumtions (A1)–(A3) the outcome of applying the filter-SQP algortihm of Algorithm 1 is one of the following: S. Ulbrich: Superlinear local convergence of filter-SQP methods 5 (A) The restoration phase fails to find a point x which is both acceptable to the filter and for which QP(x, ∆) is compatible for some ∆ > ∆min . (B) A KT point of problem (P) is found. (C) There exists an accumulation point that is feasible and either is a KT point or fails to satisfy the Mangasarian-Fromowitz constraint qualification (MFCQ). We recall that MFCQ holds at some point x if the rows of AE (x) are linearly independent and there exists a vector s with AE (x)s = 0, AA (x)s < 0, where A ⊂ I is the index set of active inequality constraints at x and AA (x) = (ai (x)T )i∈A . 3. Transition to fast local convergence Our aim is to modify Algorithm 1 in such a way that it allows transition to superlinear local convergence if – the computed iterates xk converge to a point x̄, – x̄ satisfies the second order sufficient condition with strict complementarity and the linear independence constraint qualification (LICQ), – the Hessian approximations Bk are chosen appropriately for xk close to x̄. More precisely, we assume the following. Assumption (A4). (A4) The iterates xk converge to a point x̄ that satisfies with a Lagrange multiplier ȳ = (ȳET , ȳIT ) the following second order sufficient conditions with strict complementarity and LICQ: h(x̄) = 0, ∇x `(x̄, ȳ) = 0, ȳA = 0, ȳI\A > 0, where A := {i ∈ I : ci (x̄) = 0} , ∇xx `(x̄, ȳ) is positive definite on {s : AE∪A (x̄)s = 0}, AE∪A (x̄) has full row rank. (O) We still have to specify our requirements on the choice of Bk . 3.1. Local requirement on the step computation In this subsection we introduce a standard framework for the Lagrange-multiplier update that ensures with the choice Bk = ∇xx `(xk , yk ) q-superlinear local convergence to a point x̄ satisfying (O) if full (local) SQP-steps are taken. To obtain suitable Lagrange multiplier estimates, we will make use of a multipliers function, see [9,5]: Definition 1. Let Ux̄ be a neighborhood of x̄ satisfying (O). Then a function Y : Ux̄ → Rm is called a multipliers function if Y : Ux̄ → Rm is continuous at x̄ and ȳ = Y (x̄). S. Ulbrich: Superlinear local convergence of filter-SQP methods 6 One can show that for Ux̄ small enough a continuously differentiable multipliers function is for example given by the multipliers function of Glad and Polak ¡ ¢−1 Y GP (x) = − A(x)A(x)T + γ1 diag(ci (x)2 ) A(x)g(x) with γ1 > 0, see [11,5]. Moreover, by the strict complementarity assumption in (O) it is easy to see that for any multipliers function Y and any constant ν > 0 for Ux̄ small enough it holds A = {i ∈ I : ci (x) ≥ −νYi (x)} =: A(x) ∀ x ∈ Ux̄ . (2) Thus, the active set can be identified exactly close to x̄. To analyze the transition to fast local convergence we make for simplicity the following assumptions on the choice of Bk and yk . The use of inexact Hessian approximations, see for example [2,5], would be possible. Assumption (A5). (A5) There is a neighborhood Ux̄ of x̄ such that Bk := Hk = ∇xx `(xk , Y (xk )), yk = Y (xk ) ∀ xk ∈ Ux̄ , where Y : Ux̄ → Rm is a Lipschitz continuous multipliers function that satisfies YI (x) ≥ 0, YI\A (x) = 0 ∀ x ∈ Ux̄ . Remark 1. If Y : Ux̄ → Rm is a Lipschitz continuous multipliers function then a multipliers function Yb satisfying (A5) can for Ux̄ small enough be obtained by setting YbE∪A(x) = YE∪A(x) , YbI\A(x) = 0 with A(x) in (2), since A(x) = A for Ux̄ sufficiently small. u t Under assumption (O) the point x̄ is a strict local minimum of (P) and we find ∆¯ > 0 such that for Ux̄ small enough the following holds: for all xk ∈ Ux̄ with Bk according to (A5) we have by continuity with a constant κB > 0 2 sT Bk s ≥ κB ksk2 on {s : AE∪A (xk )s = 0} , AE∪A (xk ) has full row rank, 1 (yk )A ≥ ȳA > 0. 2 Moreover, for any given 0 < ∆ ≤ ∆¯ we can shrink Ux̄ such that for all xk ∈ Ux̄ the ¯ is unique, satisfies ksk < ∆ and coincides with the solution solution s of QP(xk , ∆) ∞ of the equality constrained problem EQP(xk ) given by minimize qk (s) subject to cE∪A (xk ) + AE∪A (xk )s = 0, (EQP) where A ⊂ I is the set of active inequality constraints at x̄, see for example [1, Sec. 4.4.3], [2, Sec. 5] or [12, Thm. 18.1]. In particular, since (yk )I\A = 0 by (A5) the step s satisfies the KT conditions µ ¶µ ¶ µ ¶ s −∇x `(xk , yk ) ∇xx `(xk , yk ) AE∪A (xk )T = . z −cE∪A (xk ) AE∪A (xk ) 0 S. Ulbrich: Superlinear local convergence of filter-SQP methods 7 Now it is well known that with the choice xk+1 := xk + s, yk+1 := Y (xk+1 ) the sequence {xk } converges under assumption (A5) q-superlinearly to x̄ and that the convergence is q-quadratic if ∇xx f , and ∇xx ci are Lipschitz continuous in a neighborhood of x̄, see for example [5, 14]. We collect these facts in the following proposition: Proposition 1. Let x̄ satisfy (O) and let (A5) hold. Then there exists ∆¯ > 0 such that e ≤ ∆¯ possibly after shrinking Ux̄ the following holds: for any given ζ ∈]0, 1[ and 0 < ∆ AE∪A (x) YA (x) ≥ has full row rank on a compact set Sx̄ ⊃ Ux̄ , 1 ȳA > 0 ∀ x ∈ Ux̄ , 2 (3) (4) and with a constant κB > 0 2 sT ∇xx `(x, Y (x))s ≥ κB ksk2 on {s : AE∪A (x)s = 0} ∀ x ∈ Ux̄ . (5) For any xk ∈ Ux̄ , yk = Y (xk ), the sequence defined by x0k := xk and x0j+1 = x0j + s0j ¯ for j = k, k + 1, . . ., with the unique solution s0j of QP(x0j , ∆) where Bj0 = ∇xx `(x0j , Y (x0j )) are chosen according to (A5), is well defined, converges q-superlinearly to x̄ (q-quadratically if ∇xx f , ∇xx ci are Lipschitz on Ux̄ ), and satisfies ks0j+1 k ≤ ζks0j k, kx0j+1 − x̄k ≤ ζkx0j − x̄k, e ks0j k∞ < ∆. (6) e ∆]. ¯ Moreover, s0 is the unique solution Thus, s0j solves also QP(x0j , ∆) for all ∆ ∈ [∆, j 0 of EQP(xj ). 3.2. Modifications of the filter-SQP algorithm to avoid the Maratos effect By the well known Maratos effect the full SQP-step s can lead to an increase of f and h and therefore the iterate xk + s will not be acceptable to Fk ∪ (hk , fk ). To avoid this difficulty, there are three promising approaches: – Perform a second order correction step if the Maratos effect is detected. This approach is used successfully by Wächter and Biegler [16] in the context of an interiorpoint linesearch filter algorithm. – Allow nonmonotonicity in the filter to admit an increase in f and h. This is in the spirit of the well known watchdog technique. A promising approach in this direction is the nonmonotone filter method proposed recently by Toint [13]. – Use the Lagrangian function instead of the objective function in the filter. In this paper we will consider the last approach. As already mentioned, the Maratos effect can cause that the full SQP-step is not acceptable with respect to Fk ∪ (hk , fk ), since both h and f are increased. By using as a technical tool the fact that the augmented Lagrangian function ρ 2 `ρ (x, Y (x)) = `(x, Y (x)) + kcE∪A (x)k2 2 (7) is an exact penalty function close to x̄ satisfying (O) if ρ > 0 is large enough and Y : Ux̄ → Rm is a Lipschitz continuous multipliers function, we will see that full SQPsteps are eventually accepted by the filter, if the filter entries (h(x), f (x)) are replaced S. Ulbrich: Superlinear local convergence of filter-SQP methods 8 by entries (θ(x, y), `(x, y)) with an appropriately defined infeasibility measure θ(x, y). This leads to our first modification of Algorithm 1: Modification 1. – Instead of the filter entries (h(x), f (x)) we use the entries (θ(x, y), `(x, y)), where 2 2 T − 2 θ(x, y) := kcE (x)k2 + kc+ I (x)k2 + (yI cI (x)) . (8) Consequently, we call (x, y) acceptable to the filter Fk if either θ(x, y) ≤ βθj or `(x, y) + γθ(x, y) ≤ `j (9) with constants 0 < γ < β < 1. We remark that 0 ≤ h(x) ≤ √ mθ(x, y)1/2 . (10) Moreover, it is obvious from (8) that θ(x, y) = 0 =⇒ y T c(x) = 0. (11) The second modification concerns step 5 of Algorithm 1. We note that in the case ∆qk (s) > 0 the step is only accepted if the sufficient decrease condition ∆fk (s) ≥ σ∆qk (s) holds. This decrease condition can only be ensured if the error of the quadratic model becomes small compared to the size of the predicted decrease ∆qk (s). We now estimate the size of ∆qk (s) for a full SQP-step. Let (O) and (A5) hold and consider xk ∈ Ux̄ with Ux̄ as in Proposition 1. To explore ¯ the decrease properties of the full (local) SQP-step s given as the solution of QP(xk , ∆) we use the fact that s solves EQP(xk ). For convenience we set bk := AE∪A (xk ), A b ck := cE∪A (xk ). Then EQP(xk ) can be written as minimize qk (s) bk s = 0. subject to b ck + A (EQP) Now we decompose s in a normal step sn and a tangential step st . To this end, let s = sn + st , bTk (A bk A bTk )−1 b sn = − A ck . (12) Then we have obviously µ ¶ 1 n T 1 t T T n n n T t t ∆qk (s) = − min gk s + (s ) Bk s + (gk + Bk s ) s̃ + (s̃ ) Bk s̃ . bk s̃t =0 2 2 A (13) Since xk ∈ Ux̄ , we know from (3), (4), and (8) that there exists κn > 0 (independent of k) with 1/2 ksn k ≤ κn θk . (14) Moreover, (5) yields 2 (st )T Bk st ≥ κB kst k . S. Ulbrich: Superlinear local convergence of filter-SQP methods 9 Thus, we obtain by (13) with a constant C > 0 1 t T 1 (s ) Bk st − gkT sn − (sn )T Bk sn 2 2 MB n 2 κB t 2 n ≥ ks k − kgk kks k − ks k 2 2 κB t 2 2 ≥ ks k − C(ksn k + ksn k ) 2 ∆qk (s) = (15) 2 and this lower bound can be sharp. Since ∆fk (s) = ∆qk (s) + O(ksk ), the full SQPstep will not necessarily satisfy ∆fk (s) ≥ σ∆qk (s), unless 2 ∆qk (s) ≥ (1 − σ)|∆fk (s) − ∆qk (s)| = O(ksk ), which is not ensured in view of (15). To circumvent this difficulty we replace the predicted reduction ∆qk (s) and actual 2 reduction ∆fk (s) by quantities that differ only by o(ksk ) and we check the decrease ratio only if the predicted reduction is large enough compared to θk . As we will see, the first goal can be accomplished by using the actual reduction ∆`k (s) := `k − `(xk + s, yk (s)) (16) instead of ∆fk (s), where yk (s) denotes the new multiplier estimate, and the predicted reduction (17) ∆b qk (s) := ∆qk (s) + ykT ck instead of ∆qk (s). Secondly, we require the sufficient decrease condition ∆`k (s) ≥ ψ/2 σ∆b qk (s) only if ∆b qk (s) > κθ θk with a constant ψ ∈]1/2, 1]. This leads to the following modification of step 5. Modification 2. – To ensure that f -type steps allow eventually full (local) SQP-steps we replace the condition in step 5 by ψ/2 ∆b qk (s) := ∆qk (s) + ykT ck > κθ θk and ∆`k (s) < σ∆b qk (s) with ψ ∈]1/2, 1]. Analogously, we replace the condition ∆qk (s) ≤ 0 in step 6 by ψ/2 ∆b qk (s) ≤ κθ θk . Finally, we enter restoration not only if QP(xk , ∆) is infeasible but also if the Lagrange multipliers weight inactive constraints to strongly. More precisely, we define the compatibility of QP(xk , ∆) as follows. Modification 3. – We call QP(xk , ∆) compatible if QP(xk , ∆) is feasible 1/2 and θk ≤ κ∆ ∆1+ξ with constants κ∆ > 0 and ξ ∈]0, 1[. This leads to the following modified version of Algorithm 1. (18) S. Ulbrich: Superlinear local convergence of filter-SQP methods Algorithm 2. 10 Modified Filter-SQP Algorithm Let 0 < γ < β < 1, σ ∈]0, 1[, ξ ∈]0, 1[, ψ ∈]1/2, 1], κθ , κ∆ > 0, and ∆min > 0. 0. Initialize with a point (x, y), k = 1 and filter F1 = {(u, −∞)} with some u ≥ βθ(x, y). 1. Enter restoration phase to find a point (xk , yk ) acceptable to the (θ, `)-filter Fk such e is compatible in the sense (18) for some ∆ e ≥ ∆min and initialize that QP(xk , ∆) e ∆ = ∆. 2. Try to compute a solution s of QP(xk , ∆). If the QP is incompatible according to (18) then include (θk , `k ) in the filter, set k := k + 1 and goto step 1 (h-type iteration). 3. If s = 0 stop with KT point xk . Otherwise compute a Lagrange multiplier estimate y = yk (s). 4. If (xk + s, y) is not acceptable to Fk ∪ (θk , `k ) then set ∆ := ∆/2 and goto 2. 5. If ψ/2 ∆b qk (s) := ∆qk (s) + ykT ck > κθ θk and ∆`k (s) < σ∆b qk (s) then set ∆ := ∆/2 and goto 2. ψ/2 6. If ∆b qk (s) ≤ κθ θk then include (θk , `k ) in the filter (h-type iteration). 7. Set sk = s, ∆k = ∆, xk+1 = xk + sk , yk+1 = y, and initialize ∆ ≥ ∆min . Set k := k + 1 and goto step 2. Remark 2. We note that the modified filter-SQP method in Algorithm 2 has many similarities with the filter-SQP algortihms proposed by Fletcher, Gould, Leyffer, Toint, and Wächter [6] and Conn, Gould, and Toint [4]. The convergence analysis in [8] of the original filter-SQP algorithm in Algorithm 1 uses two essential observations. Firstly, ∆qk (s) never increases if the trust-region radius ∆ is reduced, since the feasible set of QP(xk , ∆) shrinks if ∆ is reduced. Thus, the condition ∆qk (s) > 0 for trying the possibility of an f -type step can only switch from true to false in the inner loop, but not vice versa. We note that the modified Algorithm 2 has ψ/2 also this property, since we only replace ∆qk (s) > 0 by ∆qk (s) > −ykT ck + κθ θk ψ/2 and −ykT ck + κθ θk is fixed in the inner iteration. Remark 3. A reinspection of the convergence proof in [8] shows that the global convergence result of [8]–and also the results of the present paper–remain true if in step 1 only the local minimizer of QP(xk , ∆) is determined, which is closest to xk . Essential is the monotonicity of ∆qk (s) with respect to ∆. u t The second key point in the analysis of [8] is the observation that Algorithm 1 only adds pairs (hk , fk ) to the filter with hk > 0. Analogously, all pairs (θk , `k ) added to the filter by Algorithm 2 satisfy θk > 0. In fact, new entries to the filter can only be added ψ/2 in step 2 if QP(xk , ∆) is incompatible or in step 6 if ∆qk (s) ≤ ykT ck + κθ θk . But if θk = 0 then QP(xk , ∆) is compatible in the sense (18) for all ∆ > 0, since s = 0 is feasible. Thus, (θk , `k ) is not added to the filter in step 2. Moreover, if xk is not a KT ψ/2 point then we must have ∆qk (s) > 0 and θk = 0 implies −ykT ck + κθ θk = 0 by ψ/2 (11). Thus, we have ∆b qk (s) = ∆qk (s) > 0 = κθ θk and also the condition in step 6 for adding (θk , `k ) to the filter is not satisfied. Consequently, the minimum constraint violation in the filter satisfies always τk := min (θj ,`j )∈Fk θj > 0. (19) S. Ulbrich: Superlinear local convergence of filter-SQP methods 11 This will allow us to adjust in section 4 the elegant global convergence analysis in [8] to the modified Algorithm 2. First, we will show that Algorithm 2 allows transition to fast local convergence. 3.3. Proof of transition to fast local convergence Our aim is to show the following result. Theorem 2. Let assumptions (A1)–(A5) hold. Then there is K > 0 such that Algorithm 2 takes full (local) SQP-steps for all k ≥ K, i.e., with some ∆¯ > 0 xk+1 = xk + sk , ksk k∞ < ∆¯ ¯ ∀ k ≥ K. with the unique solution sk of QP(xk , ∆) In particular, xk converges q-superlinearly to x̄. If ∇xx f and ∇xx ci are Lipschitz continuous in a neighborhood of x̄ then xk converges q-quadratically. The proof of this result requires some preparation and is given at the end of this section. In a first step we show that full SQP-steps according to Proposition 1 are eventually accepted by the filter. To this end, we prove as an auxiliary result that the augmented Lagrangian function `ρ (x, Y (x)) in (7) with a Lipschitz continuous multipliers function Y : Ux̄ → Rm is an exact penalty function. Lemma 2. Let x̄ satisfy (O) and let Y : Ux̄ → Rm be a Lipschitz continuous multipliers function. Then there exist constants 0 < λ < η, ρ0 > 0 and a neighborhood Bδ (x̄) = {x : kx − x̄k < δ} such that the augmented Lagrangian function `ρ in (7) satisfies λ ρ − ρ0 2 2 2 kx − x̄k2 + kcE∪A (x)k2 ≤ `ρ (x, Y (x))−`ρ (x̄, ȳ) ≤ η(1+ρ)kx − x̄k2 (20) 2 2 for all ρ ≥ ρ0 and all x ∈ Bδ (x̄). Moreover, possibly after increasing η > 0, ρ0 > 0, and reducing δ > 0 we have for the generalized augmented Lagrangian b̀ρ (x, Y (x)) := `(x, Y (x)) + ρ θ(x, Y (x)) 2 (21) with θ according to (8) that for all ρ ≥ ρ0 and all x ∈ Bδ (x̄) λ ρ − ρ0 2 2 kx − x̄k2 + θ(x, Y (x)) ≤ b̀ρ (x, Y (x))− b̀ρ (x̄, ȳ) ≤ η(1+ρ)kx − x̄k2 . (22) 2 2 b Proof. Let x ∈ Ux̄ . We use again the notation b c(x) = cE∪A (x), A(x) = AE∪A (x). Using Taylor’s theorem and ∇x `ρ (x̄, ȳ) = 0, ∇y `ρ (x̄, ȳ) = 0, we have with some (x̃, ỹ) on the line segment between (x̄, ȳ) and (x, Y (x)) `ρ (x, Y (x)) − `ρ (x̄, ȳ) 1 = (x − x̄)T ∇xx `ρ (x̃, ỹ)(x − x̄) + (x − x̄)T ∇xy `ρ (x̃, ỹ)(Y (x) − ȳ). 2 (23) Now it is obvious that ³ ´ ρ ρ b T A(x̃). b c(x̃) + A(x̃) ∇xx `ρ (x̃, ỹ) = ∇xx `ρ/2 x̃, ỹ + b 2 2 (24) S. Ulbrich: Superlinear local convergence of filter-SQP methods 12 It is well known that under assumption (O) there exists ρ̄ > 0 such that ∇xx `ρ/2 (x̄, ȳ) is uniformly positive definite for all ρ ≥ ρ̄, i.e., 2 sT ∇xx `ρ/2 (x̄, ȳ)s ≥ 4λksk2 ∀ s ∈ Rn with a constant λ > 0, see for example [12, Thm. 17.5]. ¡ ¢ Let Ly be the Lipschitz constant of Y (x) and let ρ0 := max ρ̄, 4L2y /λ . By continuity, we find δ > 0 such that for all (x, y) with kx − x̄k < δ, ky − ȳk < Ly δ ´ ³ ρ0 2 (25) c(x) s ≥ 2λksk2 ∀ s ∈ Rn , sT ∇xx `ρ0 /2 x, y + b 2 where we choose δ so small that the δ-ball Bδ (x̄) = {x : kx − x̄k < δ} is contained in Ux̄ . Thus, we obtain for all x ∈ Bδ (x̄) by (23), (24), (25) 2 ρ0 b kA(x̃)(x − x̄)k2 4 b + (Y (x) − ȳ)T A(x̃)(x − x̄). 2 `ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) ≥ λkx − x̄k2 + Now we have for all α > 0 2 1 b 2 b 2(Y (x) − ȳ)T A(x̃)(x − x̄) ≥ − kA(x̃)(x − x̄)k2 − αkY (x) − ȳk2 α 2 1 b 2 ≥ − kA(x̃)(x − x̄)k2 − αL2y kx − x̄k2 . α Hence, the choice α = λ/L2y yields (note that ρ0 /4 ≥ L2y /λ = `ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) ≥ 1 α) 2 λ ρ0 b 2 kx − x̄k2 + kA(x̃)(x − x̄)k2 . 2 8 (26) Since `ρ (x̄, ȳ) = `ρ0 (x̄, ȳ) = f (x̄), we have `ρ (x, Y (x)) − `ρ (x̄, ȳ) = `ρ0 (x, Y (x)) − `ρ0 (x̄, ȳ) + ρ − ρ0 2 kb c(x)k2 , 2 (27) and together with (26) the left inequality in (20) is shown. On the other hand, it is obvious from (23) that for all x ∈ Ux̄ 2 `ρ (x, Y (x)) − `ρ (x̄, ȳ) ≤ η(1 + ρ)kx − x̄k2 with a constant η > 0. This proves the right inequality in (20). The estimates (22) for b̀ρ (x, Y (x)) in (21) follow now from the following observations. By (O) we can reduce δ > 0 such that for all x ∈ Bδ (x̄) YA (x) ≥ 1 ȳA > 0, 2 c+ I\A (x) = 0. Thus, we find by (8) a constant µ > 0 with 2 2 2 µkb c(x)k2 ≤ θ(x, Y (x)) ≤ kb c(x)k2 + (YI (x)T c− I (x)) . (28) With ρ̂0 = ρ0 /µ we obtain therefore from (26) and b̀ρ̂0 (x̄, ȳ) = `ρ0 (x̄, ȳ) = f (x̄) that b̀ρ̂ (x, Y (x)) − b̀ρ̂ (x̄, ȳ) ≥ `ρ (x, Y (x)) − `ρ (x̄, ȳ) ≥ λ kx − x̄k2 . 0 0 0 0 2 2 S. Ulbrich: Superlinear local convergence of filter-SQP methods 13 Since an analogue of (27) holds for b̀ρ , this proves as above the left inequality in (22) after increasing ρ0 to ρ̂0 . On the other hand we have by the right inequality in (28) b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) ≤ `ρ (x, Y (x)) − `ρ (x̄, ȳ) + ρ (YI (x)T c− (x))2 . I 2 T − 2 Since c− A (x̄) = 0, YI\A (x̄) = 0, and Y is Lipschitz at x̄, we have (YI (x) cI (x)) = 2 O(kx − x̄k ). This shows the right inequality in (22) possibly after increasing η. u t We are now in the position to show that for k large enough the sequence {x0j }j≥k of full (local) SQP-steps is acceptable to the filter. Lemma 3. Assume that xk → x̄ with x̄ satisfying (O) and that (A5) holds. Then with e ≤ ∆, ¯ and M ≥ 1 there is an index K such that some ∆¯ > 0 for any ζ ∈]0, 1[, 0 < ∆ for all k ≥ K the following holds: The sequence {x0j }j≥k of full SQP-steps defined in Proposition 1 is well defined, converges q-superlinearly to x̄, and satisfies (6). Moreover, the problems QP(x0j , ∆), e are all compatible in the sense (18) and for all ∆ ∈ [∆, e ∆] ¯ they have the same ∆ ≥ ∆, 0 ¯ solution as QP(xj , ∆). Finally, for all k ≥ K with kxk − x̄k ≤ M min (θl ,`l )∈Fk kxl − x̄k (29) the points (x0j+1 , Y (x0j+1 )), j = k, k + 1, . . ., corresponding to full SQP-steps are acceptable to 0 , `0k+1 ) ∪ · · · ∪ (θj0 , `0j ). Fj0 := Fk ∪ (θk , `k ) ∪ (θk+1 Thus, full SQP-steps starting from xk pass the filter test in step 4 of Algorithm 2. Proof. Let ∆¯ > 0 be as in Proposition 1 and choose Ux̄ as in Proposition 1 for any e ≤ ∆¯ and ζ = 1/2, say. Moreover, let ρ0 and Bδ (x̄) ⊂ Ux̄ be given by given 0 < ∆ Lemma 2. Then by Proposition 1 for all xk ∈ Bδ (x̄) the above sequence {x0j }j≥k is well defined, satisfies (6) and converges q-superlinearly to x̄. In particular, we obtain ¯ and QP(x0 , ∆) coincide for all ∆ ∈ [∆, e ∆]. ¯ from (6) that the solutions of QP(x0j , ∆) j e1+ξ for Without restriction we can choose δ > 0 so small that θ(x, Y (x))1/2 ≤ κ∆ ∆ e are all compatible in the sense (18), since all x ∈ Bδ (x̄). Then QP(x0j , ∆), ∆ ≥ ∆, 0 {xj }j≥k ⊂ Bδ (x̄) for xk ∈ Bδ (x̄) by (6). Choose K1 so large that xk ∈ Bδ (x̄) for all k ≥ K1 and choose ρ ≥ ρ0 so large that µ ¶ ³ρ ´ ρ ρ − ρ0 β ρ β −γ ≥β and ≥ 1− . (30) 2 4 2 4 2 Then we know from (22) in Lemma 2 that for all x ∈ Bδ (x̄) with b̀ρ in (21) µ ¶ λ β ρ 2 2 kx − x̄k2 + 1 − θ(x, Y (x)) ≤ b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) ≤ η(1 + ρ)kx − x̄k2 2 4 2 (31) holds with constants 0 < λ < η. Let as in (19) τK1 := min (θj ,`j )∈FK1 θj . S. Ulbrich: Superlinear local convergence of filter-SQP methods 14 We have already observed that τK1 > 0. By continuity we find 0 < δ1 < δ such that θ(x, Y (x)) ≤ βτK1 for all x ∈ Bδ1 (x̄). Thus, for all points x ∈ Bδ1 (x̄) the pair (x, Y (x)) is acceptable to FK1 . Since xk → x̄ we find K2 > K1 with xk ∈ Bδ1 (x̄) for all k ≥ K2 . Using (6) in Proposition 1, we can choose δ1 so small that for all k ≥ K2 the sequence {x0j }j≥k of full SQP-steps converges linearly with a contraction factor of at least s µ ¶ 1 1 1 − β/2 λ 0 kxj+1 − x̄k ≤ kx0j − x̄k ≤ kx0j − x̄k. (32) 1− 2 2 1 − β/4 2M η(1 + ρ) 2 Now consider an arbitrary k ≥ K2 such that (29) holds and set δ2 := kxk − x̄k. By (32) we have x0k+1 ∈ Bδ2 (x̄) and thus (x0k+1 , Y (x0k+1 )) is acceptable to FK1 . We still have to show that it is acceptable with respect to (θl , `l ) ∈ Fk ∪ (θk , `k ) for K1 ≤ l ≤ k. For these l we have xl ∈ Bδ (x̄) \ Bδ2 /M (x̄) by (29) and therefore by (31) µ ¶ b̀ρ (xl , Y (xl )) − b̀ρ (x̄, ȳ) ≥ λ δ 2 + 1 − β ρ θl . (33) 2M 2 2 4 2 On the other hand, (31) and (32) yield b̀ρ (x0 k+1 , Y 2 (x0k+1 )) − b̀ρ (x̄, ȳ) ≤ η(1 + ρ)kx0k+1 − x̄k µ ¶ 1 1 − β/2 λ 2 ≤ 1− δ . 2 1 − β/4 2M 2 2 (34) To prove that (x0k+1 , Y (x0k+1 )) is acceptable with respect to (θl , `l ) we show now that any point (x, Y (x)) with x ∈ Bδ2 (x̄) that is not acceptable with respect to (θl , `l ) satisfies in contrast to (34) ¶ µ λ 2 b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) > 1 − 1 − β/2 δ . 1 − β/4 2M 2 2 In fact, if (x, Y (x)) is not acceptable to (θl , `l ) then θ(x, Y (x)) > βθl and `(x, Y (x)) + γθ(x, Y (x)) > `l . Thus, we have with (30) ³ ´ b̀ρ (x, Y (x)) = `(x, Y (x)) + ρ θ(x, Y (x)) > `l + ρ − γ θ(x, Y (x)) 2 2 µ ¶ ³ρ ´ ρ β ρ b̀ > `l + β − γ θl ≥ `l + β θl = ρ (xl , Y (xl )) − 1 − θl 2 4 2 2 ´ 1 − β/2 ³ b̀ b̀ ≥ b̀ρ (xl , Y (xl )) − ρ (xl , Y (xl )) − ρ (x̄, ȳ) . 1 − β/4 Hereby, we have used (31) in the last step. This shows with (33) that µ ¶³ ´ b̀ρ (x, Y (x)) − b̀ρ (x̄, ȳ) > 1 − 1 − β/2 b̀ρ (xl , Y (xl )) − b̀ρ (x̄, ȳ) 1 − β/4 ¶ µ λ 2 1 − β/2 δ . ≥ 1− 1 − β/4 2M 2 2 Since this does by (34) not hold for x0k+1 , we have shown that (x0k+1 , Y (x0k+1 )) is acceptable to Fk ∪ (θk , `k ). S. Ulbrich: Superlinear local convergence of filter-SQP methods 15 Obviously, the same argument can be applied to x0k+1 and Fk ∪ (θk , `k ) instead of xk and Fk , since by (32) again (29) is satisfied for x0k+1 and Fk ∪ (θk , `k ). This 0 shows that x0k+2 is acceptable to Fk ∪ (θk `k ) ∪ (θk+1 , `0k+1 ). Thus the acceptability of 0 0 0 (θj+1 , `j+1 ) for Fj follows by induction. u t Finally, we show that the sequence {x0j }j≥k of full SQP-steps passes for k large enough also the decrease test in step 5. Lemma 4. Let the assumptions of Lemma 3 hold and let K be as in Lemma 3 for any e ≤ ∆. ¯ Then after a possible increase of K for all k ≥ K the fixed ζ ≤ 1/2 and 0 < ∆ 0 sequence {xj }j≥k of full SQP-steps defined in Proposition 1 satisfies with yj0 = Y (x0j ) the following implication for all j ≥ k: ∆b qj0 (s0j ) := ∆qj0 (s0j ) + (yj0 )T c0j > κθ (θj0 )ψ/2 ∆`0j (s0j ) ≥ σ∆b qj0 (s0j ). (35) =⇒ Hereby, we use the notations ∆`0j (s0j ) = `(x0j , yj0 )−`(x0j +s0j , Y (x0j +s0j )), c0j = c(x0j ), θj0 = θ(x0j , yj0 ) and analogously for the other quantities. e Thus, full SQP-steps starting from xk pass step 5 of Algorithm 2 for all ∆ ≥ ∆ 0 0 ¯ for which the solutions of QP(xj , ∆) and QP(xj , ∆) coincide, in particular for all e ∆]. ¯ ∆ ∈ [∆, Proof. By Lemma 3 we know that there exists K > 0 such that for all k ≥ K the sequence {x0j }j≥k , corresponding to full SQP-steps converges q-superlinearly to x̄ and satisfies (6) for ζ = 1/2. We then have kx0j − x̄k ≤ ks0j k Let ≤ kx0j 1 2j−k kxk − x̄k, − x̄k + kx0j+1 − x̄k ≤ 3 2j−k+1 (36) kxk − x̄k. ∆b qj0 (s0j ) := ∆qj0 (s0j ) + (yj0 )T c0j . Obviously, the assertion is proven if we show that with K > 0 large enough for all j ≥ k ≥ K in the case ∆b qj0 (s0j ) > σ(θj0 )ψ/2 the following holds ¯ 0 0 ¯ ¯∆b qj (sj ) − ∆`0j (s0j )¯ ≤ (1 − σ)∆b qj0 (s0j ). Since s0j is also the solution of EQP(x0j ), we have b0j s0j b cj0 = −A (37) b0 = AE∪I (x0 ). Moreover, (y 0 )I\A = 0 and thus with b cj0 = cE∪I (x0j ), A j j j b0j s0j = −(yj0 )T A0j s0j . (yj0 )T c0j = (yj0 )TE∪A b cj0 = −(yj0 )TE∪A A This shows that 1 ∆b qj0 (s0j ) = −∇x `(x0j , yj0 )T s0j − s0T B 0 s0 . 2 j j j Moreover, we have by (A5) that (Y (x0j + s0j ) − yj0 )I\A = 0 and thus k(Y (x0j + s0j ) − yj0 )T c(xj + s0j )k ≤ k(Y (x0j + s0j ) − yj0 )E∪A kkcE∪A (x0j + s0j )k 2 = o(ks0j k ). S. Ulbrich: Superlinear local convergence of filter-SQP methods 16 Hereby, we have used that kY (x0j + s0j ) − yj0 k ≤ Ly ks0j k and that cE∪A (x0j + s0j ) = b0 s0 + O(ks0 k2 ) = O(ks0 k2 ) by (37). Thus, we obtain with some τ ∈ [0, 1] b c0 + A j j j j j ¯ 0 0 ¯ 1 ¯∆b qj (sj ) − ∆`0j (s0j )¯ ≤ ks0T (∇xx `(x0j + τ s0j , yj0 ) − Bj0 )s0j k 2 j + k(Y (x0j + s0j ) − yj0 )T c(x0j + s0j )k (38) 2 = o(ks0j k ). On the other hand, using the step decomposition (12) and the obvious inequality 2 2 2 2 0t 0n 0t ks0j k = ks0n j + sj k ≤ 2(ksj k + ksj k ) we have by (15) and (14) after a possible increase of K for all j ≥ k ≥ K κB 0 2 ³ κB ´ κB 0t 2 0n 2 0n 2 ksj k − C(ks0n ksj k − C + (ks0n ∆qj0 (s0j ) ≥ j k + ksj k ) ≥ j k + ksj k ) 2 4 2 κB 0 2 0 ψ/2 ks k − C̃(θj ) ≥ 4 j with a constant C̃ > 0. Hereby, we use (14) and ψ ∈]1/2, 1] in the last step and choose K large enough such that θj0 ≤ 1 for all j ≥ k ≥ K. 01/2 0ψ/2 Let My = 1 + supx∈Ux̄ kY (x)k. Then (yj0 )T c0j ≤ My θj ≤ My θj we conclude that κB 0 2 ∆b qj0 (s0j ) ≥ ks k − (C̃ + My )(θj0 )ψ/2 . 4 j by (8) and (39) We are now ready to show the implication (35). Let ∆b qj0 (s0j ) > κθ (θj0 )ψ/2 . We consider two cases: 2 2 If ∆b qj0 (s0j ) ≤ κ8B ks0j k then (39) yields κ8B ks0j k ≤ (C̃ + My )(θj0 )ψ/2 and thus ∆b qj0 (s0j ) > κθ (θj0 )ψ/2 ≥ κθ κB 2 ks0j k . 8(C̃ + My ) 2 Otherwise, we have ∆b qj0 (s0j ) > κ8B ks0j k . Thus, we obtain in both cases ) ( κθ κB κB 2 2 0 0 ks0j k =: κ1 ks0j k . ∆b qj (sj ) > min , 8 8(C̃ + My ) (40) Since xk → x̄, we obtain by (36) and (38) possibly after increasing K that for all j≥k≥K ¯ 0 0 ¯ 2 ¯∆b qj (sj ) − ∆`0j (s0j )¯ ≤ (1 − σ)κ1 ks0j k < (1 − σ)∆b qj0 (s0j ), where we use (40) in the last inequality. This proves the lemma. u t By combining Lemma 3 and Lemma 4 we are now able to prove Theorem 2. Proof of Theorem 2. Since (A4) and (A5) hold, we have xk → x̄ with x̄ satisfying (O) and we can thus apply Proposition 1, Lemma 3, and© Lemma ª 4. Let ∆¯ > 0 be ¯ ∆min /4. Now choose e := min ∆, as in Proposition 1, fix some M ≥ 1 and set ∆ K > 0 large enough such that the assertions of Lemma 3 and Lemma 4 hold. Since kxk+1 − xk k → 0 we find K1 ≥ K such that © ª ¯ ∆min min ∆, 1/2 e1+ξ e= , θk = θ(xk , Y (xk ))1/2 ≤ κ∆ ∆ (41) ksk k∞ < ∆ 4 S. Ulbrich: Superlinear local convergence of filter-SQP methods 17 for all k ≥ K1 . Then by (41) and Lemma 3 the problems QP(xk , ∆) are compatible in e the sense (18) for all k ≥ K1 as long as ∆ ≥ ∆. Now let xk , k ≥ K1 , be an arbitrary iterate satisfying (29). We show that xk+1 = ¯ as in Proposition 1. x0k+1 = xk + s0k , where s0k is the unique solution of QP(xk , ∆) e and QP(xk , ∆) are Since the inner iteration halfs ∆ starting from ∆ ≥ ∆min ≥ 4∆ e it either finds an acceptable iterate xk+1 = xk + sk for some compatible for ∆ ≥ ∆, e e 2∆]. e ∆ > 2∆ or it locates a trust-region radius ∆ ∈ [∆, e < ∆ by (41) and thus sk must coincide with In the first case we have ksk k∞ < ∆ 0 ¯ the unique solution sk of QP(xk , ∆) as asserted. e ∆] ¯ and In the second case denote the solution of QP(xk , ∆) by s. Since ∆ ∈ [∆, k ≥ K, we know by Proposition 1 and Lemma 3 that s coincides with the unique ¯ and thus ksk < ∆ by (6). Moreover, by Lemma 3 the iterate solution s0k of QP(xk , ∆) ∞ 0 0 (xk+1 , yk+1 ) = (xk +s0k , Y (xk +s0k )) is acceptable to the filter Fk ∪(θk , `k ) and passes by Lemma 4 the decrease test in step 5. Thus, the trial iterate (xk + s, Y (xk + s)) is accepted by the algorithm and we have xk+1 = x0k+1 = xk + s0k , xk+1 = Y (xk+1 ). Hence, in both cases the algorithm takes the full (local) SQP-step xk+1 = x0k+1 . The next iterate satisfies again (29) and we obtain by induction that xj = x0j for all j ≥ k with the iterates {x0j }j≥k according to Proposition 1. Now Proposition 1 yields the q-superlinear (or q-quadratic) convergence of {xk }. The proof is complete. u t 4. Global convergence of the modified filter-SQP algorithm In this section we show that the global convergence result of Theorem 1 remains true for the modified filter-SQP method of Algorithm 2. In addition to assumptions (A1)–(A3) we assume that the used Lagrange multipliers remain uniformly bounded and that the Lagrange multiplier estimates weight inactive inequality constraints not to strongly. Assumption (A6). (A6) There exists a constant My > 0 such that all Lagrange multiplier estimates generated in step 1 or step 3 satisfy kyk k ≤ My , kyk (s)k ≤ My , (yk )I ≥ 0, (yk (s))I ≥ 0 for all k. Moreover, there are constants ML > 0 and ζ > 0 with ½ ¾ 1 max ξ, − 1 < ζ ≤ 1 ψ (42) (43) such that all Lagrange multiplier estimates y = yk (s) generated in step 3 satisfy 1+ζ |yIT c− . I (xk + s)| ≤ ML ∆ (44) We note that the condition (44) is not required for Lagrange multipliers generated in the restoration. (42) is a natural requirement and is consistent with (A5). Remark 4. In the equality constrained case (44) is void. Therefore, for equality constrained problems (A6) is obviously satisfied by the multipliers in assumption (A5). S. Ulbrich: Superlinear local convergence of filter-SQP methods 18 Remark 5. Condition (44) can for example be ensured by applying the modification rule © ª 1+ζ max (c− =⇒ yi = 0, (45) I (xk + s))i , (cI (xk ) + AI (xk )s)i < −MI ∆ 1 2 since (c− I (xk + s))i ≥ (cI (xk ) + AI (xk )s)i − 2 nMC ∆ , cf. Lemma 6 below. This rule allows in particular yi 6= 0 for all constraints that are active at the solution s of QP(xk , ∆). u t Remark 6. If (A4) and (A5) hold and if QP(xk , ∆k ) has sufficiently close to x̄ once the correct active set (which will fail to occur only in quite pathological situations if we use Bk = ∇xx `(xk , Y (xk )) close to x̄) then from this iteration on the multipliers Y (x) can be taken and thus (A5) is compatible with (A6). This can be seen as follows. If (A4) holds and the solution of QP(xk , ∆k ) has for xk ∈ Ux̄ the active set E ∪ A then the rule (45) allows to choose yk+1 = Y (xk+1 ) according to (A5). Thus, with k := k + 1 we have yk = Y (xk ) and if xk is close 1/2 enough to x̄ then the condition θk ≤ κ∆ ∆1+ξ in (18) ensures that 1+ξ kc− A (xk )k ≤ µ∆ with a constant µ > 0. Using this together with (O) one can show that for all xk ∈ Ux̄ possibly after shrinking Ux̄ the active set of QP(xk , ∆) with Bk = ∇xx (xk , Y (xk )) is E ∪ A under the compatibility requirement (18) and thus the rule (45) leaves y = Y (xk + s) unchanged. More precisely, we have the following lemma, which is proven in the appendix. Lemma 5. Let (A4) hold. Then we can find K > 0 such that for all k ≥ K the following is true: if Bk = ∇xx `(xk , Y (xk )), yk = Y (xk ) as in (A5) and if QP(xk , ∆) is compatible in the sense (18) then QP(xk , ∆) has the active set E ∪ A. u t The global convergence of Algorithm 2 can be obtained by a slight modification of the convergence proof of Fletcher, Leyffer, and Toint [8] for Algorithm 1. Our aim is to show the following analogue of Theorem 1. Theorem 3. Let assumptions (A1)–(A3) and (A6) hold. Then the result of applying the modified filter-SQP algorithm of Algorithm 2 is one of the scenarios (A), (B), or (C) stated in Theorem 1. To prove this theorem we only indicate the necessary modifications of the proof in [8]. We have already observed after the statement of Algorithm 2 that it shares the following properties with Algorithm 1 that are essential for the convergence analysis in [8]: the predicted reduction ∆b qk (s) depends monotone nonincresing on ∆ and the minimal constraint violation τk in the filter Fk , see (19), is always positive. In the following we sketch how the convergence analysis in [8] can be adjusted to the framework of Algorithm 2. We start by recalling the following auxiliary result in [8, Lem. 3]. Lemma 6. Let assumptions (A1)–(A3) hold. Then for any feasible point s of QP(xk , ∆) the following holds. ∆fk (s) ≥ ∆qk (s) − nMB ∆2 , 1 kcE (xk + s)k∞ ≤ nMC ∆2 , 2 1 + kcI (xk + s)k∞ ≤ nMC ∆2 , 2 where MC = supx∈X k∇xx ci (x)k2 , 1 ≤ i ≤ m. (46) (47) (48) S. Ulbrich: Superlinear local convergence of filter-SQP methods 19 Proof. This is a direct consequence of Taylor’s theorem, see [8, Lem. 3]. u t The next lemma is an analogue of [8, Lem. 4] and shows that the iterate xk + s is acceptable to the filter if ∆ is small enough and QP(xk , ∆) is feasible. Lemma 7. Let assumptions (A1)–(A3) and (A6) QP(xk , ∆), © hold. If s solves ª then xk + 2+2ζ 2 2 2 s is acceptable to the filter if ∆ ≤ min 1, 4βτk /(mn MC + 4ML ) with τk in (19) and ζ in (A6). Proof. By (47), (48), and (A6) we have 2 2 T − 2 θ(xk + s, yk (s)) = kcE (xk + s)k2 + kc+ I (xk + s)k2 + ((yk (s))I cI (xk + s)) 1 ≤ mn2 MC2 ∆4 + ML2 ∆2+2ζ . 4 © ª Thus, θ(xk + s) ≤ βτk holds for ∆2+2ζ ≤ min 1, 4βτk /(mn2 MC2 + 4ML2 ) . This ensures by (9) and the definition of τk in (19) that the iterate xk + s is acceptable to the filter. u t The next lemma is an adaption of [8, Lem. 5] to Algorithm 2. Lemma 8. Let assumptions (A1)–(A3) and (A6) hold and let x◦ ∈ X be a feasible point of problem (P) at which MFCQ holds but which is not a KT point. Then there exists a neighborhood N ◦ of x◦ and constants ε, µ b, κ > 0 such that for all xk ∈ N ◦ ∩ X and all ∆ for which 1/(2+2ξ) ψ/2 1/2 µ b(θk + θk + θk ) ≤ ∆ ≤ κ (49) it follows that QP(xk , ∆) is compatible in the sense (18) with solution s at which the predicted reduction (17) satisfies ∆b qk (s) ≥ 1 ψ/2 ε∆ + κθ θk 4 (50) and the actual reduction (16) satisfies the sufficient reduction condition ∆`k (s) ≥ σ∆b qk (s) (51) ∆`k (s) ≥ γθ(xk + s). (52) as well as Proof. In [8, Lem. 5] it is shown that there exists a neighborhood N ◦ of x◦ and constants ε, µ, κ > 0 such that for all xk ∈ N ◦ ∩ X and all ∆ with µhk ≤ ∆ ≤ κ the problem QP(xk , ∆) has a feasible solution s at which ∆qk (s) ≥ 1 ε∆, 3 ∆fk (s) ≥ σ∆qk (s), ∆fk (s) ≥ γh(xk + s). Without restriction we can choose κ ≤ 1. Now set ½ ¾ κθ + My + 1 −1/(1+ξ) √ µ b = max 1, κ∆ . , mµ, 12 ε Then (49) implies 1/2 ∆1+ξ ≥ κ−1 ∆ θk . (53) S. Ulbrich: Superlinear local convergence of filter-SQP methods 20 Thus, QP(xk , ∆) is compatible in the sense (18). √ 1/2 By (10) we have mθk ≥ hk . Hence, (49) implies by our choice of µ b 1/2 µhk ≤ µ bθk ≤ ∆ ≤ κ, κθ + My + 1 ψ/2 1/2 ∆ ≥ 12 (θk + θk ). ε (54) Therefore, (53) is satisfied by [8, Lem. 5] and yields with (49), (54) ∆qk (s) ≥ 1 1 1 ψ/2 1/2 ε∆ + ε∆ ≥ ε∆ + (κθ + My + 1)(θk + θk ). 4 12 4 This shows that 1/2 ∆b qk (s) = ∆qk (s) + ykT ck ≥ ∆qk (s) − (My + 1)θk ≥ 1 ψ/2 ε∆ + κθ θk 4 and (50) is proven. It remains to show (51) and (52). By (42) in (A6) we have (yk (s))TI c− I (xk + s) ≤ 0. This yields with (47), (48) ∆`k (s) = ∆fk (s) + ykT ck − yk (s)T c(xk + s) 1√ ≥ ∆fk (s) + ykT ck − My mnMC ∆2 . 2 Together with (46) and (50) we obtain ¶ µ 1√ mnMC ∆2 ∆`k (s) ≥ ∆b qk (s) − nMB + My 2 µ ¶ 1 1√ ≥ σ∆b qk (s) + (1 − σ) ε∆ − nMB + My mnMC ∆2 . 4 2 (55) Thus, (51) holds for all ∆≤ (1 − σ) 14 ε =: κ1 √ nMB + My 12 mnMC (56) and consequently (49) ensures (51) if we choose κ ≤ κ1 . Finally, (51), (50), (47), (48), and (A6) yield 1 1 ∆`k (s) − γθ(xk + s) ≥ σ ε∆ − γ mn2 MC2 ∆4 − γML2 ∆2+2ζ 4 4 and the right hand side is nonnegative for all ( µ ¶1/(1+2ζ) ) σε =: κ2 . ∆ ≤ min 1, γmn2 MC2 + 4γML2 Thus, also (52) is satisfied if we choose κ ≤ κ2 in (49). (57) (58) u t As a last preparation for the proof of Theorem 3 we proceed as in [8] and show that the inner loop of Algorithm 2 terminates finitely. The following lemma is a counterpart of [8, Lem. 6]. Lemma 9. Let assumptions (A1)–(A3) and (A6) hold. Then the inner iteration of Algorithm 2 terminates finitely. S. Ulbrich: Superlinear local convergence of filter-SQP methods 21 Proof. Again, we only describe the modifications of the proof of Lemma 6 in [8]. If xk is a KT point of (P) then s = 0 solves QP(xk , ∆) and the inner iteration terminates in step 3. Otherwise, we have ∆ → 0 if the inner iteration does not terminate finitely. If θk > 0 then QP(xk , ∆) is obviously incompatible in the sense (18) for all sufficiently small ∆ > 0. Thus, the inner iteration terminates finitely in step 2. If θk = 0 then we find exactly as in the proof of Lemma 6 in [8, p. 55] a constant η > 0 with ∆qk (s) ≥ 12 η∆ for all 0 < ∆ ≤ κ with a constant κ > 0 sufficiently small. Now, θk = 0 implies ykT ck = 0 by (11) and thus ∆b qk (s) = ∆qk (s) ≥ 12 η∆ > ψ/2 κθ θk = 0. Thus, it follows that ∆b qk (s) ≥ 41 ε∆ with ε = 2η. Now we can continue exactly as in the proof of Lemma 8. In fact, we obtain (55) and thus ∆`k (s) ≥ σ∆b qk (s) folds for all ∆ ≤ κ1 with κ1 > 0 according to (56). Hence, the sufficient decrease condition in step 5 is satisfied for ∆ ≤ min {κ, κ1 }. Moreover, we have also (57) and thus ∆`k (s) ≥ γθ(xk + s) is satisfied for ∆ ≤ κ2 with κ2 > 0 in (58). Thus, xk is for ∆ ≤ min {κ, κ1 , κ2 } also acceptable relative to (θk , `k ), cf. (9). Moreover, for ( µ ¶1/(2+2ζ) ) 4βτk ∆ ≤ min 1, =: κ3 mn2 MC2 + 4ML2 the iterate xk is by Lemma 7 also acceptable to Fk . Hence, for ∆ ≤ min {κ, κ1 , κ2 , κ3 } also the filter test of step 4 is satisfied and thus the inner iteration terminates in step 7. The proof is complete. u t We are now able to adapt the global convergence proof of [8] to Algorithm 2. Proof of Theorem 3. We have only to consider case (C). Then the algorithm runs infinitely and since the inner iteration is finite by Lemma 9, the outer iteration produces an infinite sequence of iterates xk . By (A1) the sequence {xk } has at least an accumulation point. Consider the case that the outer iteration produces an infinite sequence of h-type iterations. Exactly as in [8] we obtain a subsequence S of h-type iterations such that xk → x∞ with some x∞ , τk+1 = θk < τk for k ∈ S and Lemma 1 with (θk , `k ) instead of (hk , fk ) yields θk → 0 for k ∈ S. Thus, x∞ is feasible and if x∞ violates MFCQ then (C) is satisfied. Now consider the case where MFCQ is satisfied at x∞ and x∞ is not a KT point. As in [8] we will derive a contradiction. Let N ∞ be the neighborhood of x∞ given in Lemma 7 and choose K > 0 so large that xk ∈ N ∞ for all k ∈ S, k ≥ K. Lemma 7 and Lemma 8 show that with constants µ b, κ > 0 for all k ∈ S, k ≥ K and all ∆ with ( µ ¶1/(2+2ζ) ) 4βτk 1/2 ψ/2 1/(2+2ξ) µ b(θk + θk + θk ) ≤ ∆ ≤ min κ, 1, (59) mn2 MC2 + 4ML2 the problem QP(xk , ∆) is compatible and the solution s satisfies ∆`k (s) ≥ σ∆b qk (s), ψ/2 ∆b qk (s) > κθ θk , ∆`k (s) ≥ γθ(xk + s) (see Lemma 8), and (θ(xk + s), `(xk + s, yk (s))) is acceptable to Fk (see Lemma 7). In particular, for all 1/2 ∆≥µ b(θk ψ/2 + θk 1/(2+2ξ) + θk ) no h-type iteration occurs, since QP(xk , ∆) is compatible in the sense (18) and ∆b qk (s) > ψ/2 κθ θk (recall that ∆b qk (s) is nondecreasing in ∆) and for ∆ satisfying (59) an f -type step will occur. Hence, the algorithm will take an f -type step in iteration k ∈ S, k ≥ K, S. Ulbrich: Superlinear local convergence of filter-SQP methods 22 if halfing of ∆ beginning by some ∆ ≥ ∆min in the inner loop locates a ∆ satisfying (59) (then the conditions for an f -type step will be met if this was not the case before). But we have τk+1 = θk < τk and θk → 0 for k ∈ S. Since by (43) ¾ ½ 1 1 ψ 1 max , , > , 2 2 2 + 2ξ 2 + 2ζ the upper bound in (49) will for all k ∈ S large enough be at least twice the lower bound. Thus, for large k ∈ S an f -type step will be taken which contradicts the fact that the iterations k ∈ S are composed of h-type steps. Thus, the assumption that x∞ is not a KT point was wrong. Finally, consider the case where the algorithm takes only a finite number of h-type ψ/2 steps. Then there is K > 0 such that `k − `k+1 ≥ σ∆b qk (s) > κθ θk for all k ≥ K. By (A1), (A2), (A6) `k is uniformly bounded and thus X X X ψ/2 ∞> (`k − `k+1 ) ≥ σ∆b qk (s) > σκθ θk ≥ 0. k≥K k≥K k≥K Therefore, we have θk → 0 and thus any accumulation point x∞ of {xk } which exists by (A1) is feasible. Assume that x∞ satisfies MFCQ but is no KT point and let S be a subsequence of iterates converging to x∞ . As above let N ∞ be the neighborhood of x∞ given in Lemma 7 and choose K > 0 so large that xk ∈ N ∞ for all k ∈ S, k ≥ K. Since no h-type steps occur for k ≥ K we have τk = τK > 0 for all k ≥ K and exactly as above an f -type step is guaranteed for all k ∈ S, k ≥ K if ( µ ¶1/(2+2ζ) ) 4βτK 1/2 ψ/2 1/(2+2ξ) µ b(θk + θk + θk ) ≤ ∆ ≤ min κ, 1, =: κ b. mn2 MC2 + 4ML2 (60) This time the upper bound is constant and the lower bound is half the upper bound for k ≥ K possibly after increasing K. Hence, for all k ∈ S, k ≥ K, the f -type step occurs with ∆ ≥ min {∆min , κ b/2} and thus (50) yields `k − `k+1 ≥ σ∆b qk (sk ) ≥ 1 ε min {∆min , κ b/2} . 4 ψ/2 Since `k − `k+1 > κθ θk ≥ 0 for all k ≥ K, this implies `k → −∞, which is a contradiction to the boundedness of `k . Thus x∞ is a KT point or violates MFCQ. u t 5. Conclusions We have proposed a modified version of the globally convergent trust-region filterSQP method by Fletcher, Leyffer, and Toint [8] such that the transition to fast local convergence is ensured under appropriate local assumptions on the Hessian approximation in the SQP-problems and on the Lagrange multipliers. Hereby, the original SQP-steps without additional second-order correction can be used. The modified algorithm is globally convergent if the Lagrange multipliers used are bounded and weight inactive inequality constraints not to strongly. Thus, in the equality constrained case only the boundedness of the Lagrange multipliers is assumed globally which is always compatible with our local requirement on the Lagrange multipliers. In the inequality constrained case our global and local conditions on the Lagrange multipliers are–in particular–compatible as soon as sufficiently close to the limit point x̄ a subproblem S. Ulbrich: Superlinear local convergence of filter-SQP methods 23 QP(xk , ∆k ) identifies the active set correctly. From this point on the local and global conditions on the Lagrange multipliers can be met. Our modified algorithm is very similar to the filter-SQP methods in Fletcher, Gould, Leyffer, Toint, and Wächter [6] and in Conn, Gould, and Toint [4], but uses the Lagrangian function value instead of the objective function value and a different feasibility measure in the filter. Thus our algorithm can also be viewed as a variant of the latter algorithms. We decided to build on the results in [8], since the global convergence analysis is less technical than in [4,6]. However, it would be interesting to apply the proof techniques of [4,6] to get an alternative convergence proof for our modified algorithm. In conclusion, the results of the present paper show that the filter technique does allow transition to fast local convergence without using second order correction steps if appropriate filter entries are used. It is an interesting topic for future research if the proposed modifications can help to further improve the encouraging numerical performance of filter-SQP methods reported in the past [3,7]. Appendix: Proof of Lemma 5 e with a constant It is sufficient to show that the assertion is true for all 0 < ∆ ≤ ∆ e e ∆ > 0. In fact, since by (A4) xk → x̄ holds, we find K > 0 with kxk+1 − xk k∞ < ∆ for all k ≥ K and thus for all k ≥ K Algorithm 2 produces the same sequence if we e replace ∆k in all iterations k ≥ K by min{∆k , ∆}. b Let again A(x) := AE∪A (x), b c(x) := cE∪A (x). Moreover, we set AE,k := AE (xk ), AA,k := AA (xk ), cE,k := cE (xk ), cA,k := cA (xk ). We know from (O) in (A4) that b c(x̄) = 0, b T ȳE∪A , g(x̄) = −A(x̄) ȳA ≥ 2ε > 0, cI\A (x̄) < −ε e > 0 so small that cI\A (x) < 0 for for some ε > 0. Now choose K > 0 so large and ∆ e k ≥ K. Then QP(xk , ∆), 0 < ∆ ≤ ∆ e is for k ≥ K all feasible points of QP(xk , ∆), equivalent to 1 min qk (s) := gkT s + sT Bk s, 2 AE,k s = −cE,k , AA,k s ≤ −cA,k . (61) By continuity and (O), we can increase K such that with Bk = ∇xx `(xk , Y (xk )), yk = Y (xk ) for all k ≥ K the following holds n o 2 T b s Bk s ≥ κB ksk2 on s : Ak s = 0 , bk A bTk )−1 A bk k + kA bTk (A bk A bTk )−1 k ≤ MA , k(A 1 (yk )A ≥ ȳA ≥ ε. 2 Moreover, we have constants with kgk k ≤ Mg√ , kyk k ≤ My . Since (yk )I\A = 0 if yk = Y (xk ) we obtain by (8) with µ = min{1, ε} > 0 1/2 θk ≥ µkb ck k. In addition, we observe that T bTk ȳE∪A + (gk − g(x̄)) + (A bk − A(x̄)) b bTk ȳE∪A + rk , gk = −A ȳE∪A =: −A (62) S. Ulbrich: Superlinear local convergence of filter-SQP methods 24 where krk k1 ≤ Lg kxk − x̄k1 with a constant Lg . Finally, we increase K such that for all k ≥ K ε kxk − x̄k1 ≤ . (63) 16MA Lg e k ≥ K, that is Now let s be the solution of a problem QP(xk , ∆), 0 < ∆ ≤ ∆, compatible in the sense (18). We know that s solves also (61) and have by (18) 1/2 µkb ck k ≤ θk ≤ κ∆ ∆1+ξ . (64) bk s = −b We want to show that A ck . We first compute an upper bound for qk (s). If we take bk A bTk (A bTk )−1 b ŝ = −A ck , bk ŝ = −b then A ck and by (64) kŝk ≤ MA kb ck k ≤ −1/ξ e C Thus, for ∆ ≤ min{1, ∆, 1 e1 function value is for ∆ ≤ ∆ MA κ∆ 1+ξ ∆ =: C1 ∆1+ξ . µ e1 we have kŝk ≤ kŝk ≤ ∆ and the objective } =: ∆ ∞ qk (ŝ) ≤ Mg C1 ∆1+ξ + nMB C12 ∆2+2ξ ≤ (Mg C1 + nMB C12 )∆1+ξ . e2 ≤ ∆ e1 such that for all 0 < ∆ ≤ ∆ e2 Thus we find a constant 0 < ∆ qk (ŝ) ≤ ε ∆. 8MA (65) Now we decompose the optimal solution s in a normal and tangential component s = sn + st , bTk v, sn = A st = W z, bk . The corresponding objecwhere the columns of W are an ONB of the nullspace of A tive function value is by (62) T bk sn + rkT s + 1 sT Bs qk (s) = −ȳE∪A A 2 T bk sn − (Lg kxk − x̄k ∆ + nMB ∆2 ). ≥ −ȳE∪A A 1 (66) e3 ≤ ∆ e2 such that for all 0 < ∆ ≤ ∆ e3 Using (63), we find 0 < ∆ T bk sn − qk (s) ≥ −ȳE∪A A ε ∆. 8MA (67) e4 ≤ ∆ e3 for all 0 < ∆ ≤ ∆ e4 the solution s of We show next that with a constant 0 < ∆ n a compatible problem QP(xk , ∆) has a normal component s with ksn k∞ ≤ ∆/2. To this end we show that qk (s) > qk (ŝ) would hold if ksn k∞ > ∆/2. bT v, we have v = (A bk A bT )−1 A bk sn and thus Since sn = A k k kvk ≤ MA ksn k. This yields bk sn k ≥ v T A bk sn = ksn k2 ≥ 1 kvkksn k kvkkA MA S. Ulbrich: Superlinear local convergence of filter-SQP methods 25 and we conclude that bk sn k ≥ kA 1 1 ksn k ≥ ksn k∞ . MA MA Since bA,k sn ≤ −cA,k , A bA,k sn )+ ≤ −c− and thus we have for the positive part 0 ≤ (A A,k °µ + ° ¶ n ° ° A b κ∆ 1+ξ ° A,k s ° ∆ =: C3 ∆1+ξ . ° ° ≤ kck k ≤ bE,k sn ° ° A µ We conclude that b− sn k ≥ kA bk sn k − C3 ∆1+ξ ≥ 1 ksn k − C3 ∆1+ξ . kA ∞ A,k MA This gives for ksn k∞ ≥ ∆/2 T 1+ξ bk sn ≥ −ȳ T A b− n b− sn k − My C3 ∆1+ξ −ȳE∪A A ≥ 2εkA A A,k s − My C3 ∆ A,k 1 2ε n ks k∞ − C3 (2ε + My )∆1+ξ (68) MA 2ε ε ≥ ∆ − C3 (2ε + My )∆1+ξ ≥ ∆ 2MA 2MA e4 with a constant 0 < ∆ e4 ≤ ∆ e3 . Thus, we obtain by combining (65), (67), for ∆ ≤ ∆ e and (68) for 0 < ∆ ≤ ∆4 ε ε ε ε ∆− ∆− ∆= ∆>0 qk (s) − qk (ŝ) ≥ 2MA 8MA 8MA 4MA e4 ensures that ksn k ≤ ∆/2. which contradicts the optimality of s. Therefore, ∆ < ∆ ∞ bk s = −b e4 . Assume the We are now in the position to show that A ck for ∆ < ∆ bk corresponding to an inequality constraint contrary. Then there exists a row aTi of A T with ai s < (−b ck )i . Set u bTk (A bk A bTk )−1 ei , δs = ∆ , u=A 2kuk∞ ≥ where ei denotes the i-th unit vector. For all components with |(sn +st )j | = ∆ we have |(st )j | ≥ ∆/2, since |(sn )j | ≤ ∆/2. Therefore, we have with s(α) := (1 − α)st + sn + αδs for small α > 0 bk s(α) = A bk sn + α∆ ei ks(α)k∞ ≤ ∆, A ≤ −b ck . 2kuk∞ bT ȳE∪A + rk Now we have by using that gk = −A k µ ¶ d 1 d T T qk (s(α)) = gk s(α) + s(α) Bk s(α) dα dα 2 = gkT (δs − st ) + s(α)T Bk (δs − st ) (ȳE∪A )i + rkT (δs − st ) + s(α)T Bk (δs − st ) 2kuk∞ 2ε ≤− ∆ + (2Lg kxk − x̄k1 ∆ + 2nMB ∆2 ) 2MA ε ε ∆+ ∆ < 0. ≤− MA 4MA = −∆ S. Ulbrich: Superlinear local convergence of filter-SQP methods 26 e3 , see (66), (67). But since s(α) is feasible In the last step, we use the choice of ∆ d for small α > 0, dα qk (s(α)) < 0 contradicts the optimality of s and the assumption e=∆ e4 . aTi s < (−cA,k )i for some i was wrong. This completes the proof by setting ∆ u t References 1. D. P. B ERTSEKAS, Constrained optimization and Lagrange multiplier methods, Computer Science and Applied Mathematics, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1982. 2. P. T. B OGGS AND J. W. T OLLE, Sequential quadratic programming, in Acta numerica, 1995, Acta Numer., Cambridge Univ. Press, Cambridge, 1995, pp. 1–51. 3. C. M. C HIN AND R. F LETCHER, Numerical results for SLPSQP, Filter-SQP and LANCELOT on selected Cute test problems, Tech. Report NA/203, Department of Mathematics, Dundee University, Dundee, Scotland, 2001. 4. A. R. C ONN , N. I. M. G OULD , AND P. L. T OINT, Trust-region methods, MPS/SIAM Series on Optimization, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000. 5. F. FACCHINEI AND S. L UCIDI, Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems, J. Optim. Theory Appl., 85 (1995), pp. 265–289. 6. R. F LETCHER , N. I. M. G OULD , S. L EYFFER , P. L. T OINT, AND A. W ÄCHTER, Global convergence of a trust-region SQP-filter algorithm for general nonlinear programming, Technical Report 99/03, Department of Mathematics, University of Namur, Namur, Belgium, 1999 (revised 2001). To appear in SIAM J. Optim. 7. R. F LETCHER AND S. L EYFFER, Nonlinear programming without a penalty function, Math. Program., 91 (2002), pp. 239–269. 8. R. F LETCHER , S. L EYFFER , AND P. L. T OINT, On the global convergence of a filter–SQP algorithm, SIAM J. Optim., 13 (2002), pp. 44–59. 9. T. G LAD AND E. P OLAK, A multiplier method with automatic limitation of penalty growth, Math. Programming, 17 (1979), pp. 140–155. 10. C. C. G ONZAGA , E. K ARAS , AND M. VANTI, A globally convergent filter method for nonlinear programming, Technical Report, Department of Mathematics, Federal University of Santa Catarina, Brazil, 2001 (revised 2002). 11. S. L UCIDI, New results on a continuously differentiable exact penalty function, SIAM J. Optim., 2 (1992), pp. 558–574. 12. J. N OCEDAL AND S. J. W RIGHT, Numerical optimization, Springer Series in Operations Research, Springer-Verlag, New York, 1999. 13. P. L. T OINT, Non-monotone filter methods, Presentation at the SIAM Conference on Optimization, May 20–22, 2002, Toronto, Canada, 2002. 14. M. U LBRICH AND S. U LBRICH, Non-monotone trust region methods for nonlinear equality constrained optimization without a penalty function, Math. Program., Ser. B (2002), pp. DOI 10.1007/s10107–002– 0343–9, in press. 15. M. U LBRICH , S. U LBRICH , AND L. N. V ICENTE, A globally convergent primal-dual interior-point filter method for nonconvex nonlinear programming, Technical Report TR00-12, Department of Computational and Applied Mathematics, Rice University, Houston, Texas 77005-1892, 2000 (revised 2002). 16. A. W ÄCHTER AND L. T. B IEGLER, Global and local convergence of line search filter methods for nonlinear programming, CAPD Technical Report B-01-09, Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, 2001 (revised 2002).