An Iterative Solver-Based Infeasible Primal-Dual Path-Following Algorithm for Convex QP

Zhaosong Lu∗, Renato D.C. Monteiro†, Jerome W. O'Neal‡

May 3, 2004
Abstract
In this paper we develop an interior-point primal-dual long-step path-following algorithm for convex quadratic programming (CQP) whose search directions are computed
by means of an iterative (linear system) solver. We propose a new linear system, which
we refer to as the augmented normal equation (ANE), to determine the primal-dual
search directions. Since the condition number of the matrix associated with the ANE
may become large for degenerate CQP problems, we use a maximum weight basis
preconditioner introduced in [16, 14] to better condition this matrix. Using a result
obtained in [13], we establish a uniform bound, depending only on CQP data, on the
number of iterations required for the iterative solver to obtain a sufficiently accurate
solution to the ANE. Since the iterative solver can only generate an approximate solution to the ANE, this solution does not yield a primal-dual search direction satisfying
all equations of the primal-dual Newton system. We propose a way to compute an
inexact primal-dual search direction so that the Newton equation corresponding to the
primal residual is satisfied exactly, while the one corresponding to the dual residual
contains a manageable error which allows us to establish a polynomial bound on the
number of iterations of our method.
Keywords: Convex quadratic programming, iterative solver, inexact search directions,
polynomial convergence.
AMS 2000 subject classification: 65F10, 65F35, 90C20, 90C25, 90C51.
∗ School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: zhaosong@isye.gatech.edu). This author was supported in part by NSF Grant CCR-0203113 and ONR Grant N00014-03-1-0401.
† School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: monteiro@isye.gatech.edu). This author was supported in part by NSF Grants CCR-0203113 and INT-9910084 and ONR Grant N00014-03-1-0401.
‡ School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: joneal@isye.gatech.edu). This author was supported in part by the NDSEG Fellowship Program sponsored by the Department of Defense.
1 Introduction
In this paper we develop an interior-point primal-dual long-step path-following algorithm for
convex quadratic programming (CQP) whose search directions are computed by means of
an iterative (linear system) solver. More specifically, the problem we consider is:
min_x { (1/2) xᵀQx + cᵀx : Ax = b, x ≥ 0 },    (1)
where the data are Q ∈ R^{n×n}, A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n, and the decision vector is x ∈ R^n. We also assume that Q is positive semidefinite, and that a factorization Q = VVᵀ is explicitly given, where V ∈ R^{n×l}.
A similar algorithm for solving the special case of linear programming (LP), i.e. problem
(1) with Q = 0, was developed by Monteiro and O’Neal in [12]. The algorithm studied in
[12] is essentially the long-step path-following infeasible algorithm studied in [6, 20], the only
difference being that the search directions are computed by means of an iterative solver. We
refer to the iterations of the iterative solver as the inner iterations and to the ones performed
by the interior-point method itself as the outer iterations. The main step of the algorithm
studied in [6, 12, 20] is the computation of the primal-dual search direction (∆x, ∆s, ∆y),
whose ∆y component can be found by solving a system of the form AD²Aᵀ∆y = g, referred to as the normal equation, where g ∈ R^m and the positive diagonal matrix D depends on
the current primal-dual iterate. In contrast to [6, 20], the algorithm studied in [12] uses
an iterative solver to obtain an approximate solution to the normal equation. Since the
condition number of the normal matrix AD²Aᵀ may become excessively large on degenerate
LP problems (see e.g. [9]), the maximum weight basis preconditioner T introduced in [16, 14]
is used to better condition this matrix and an approximate solution of the resulting equivalent
system with coefficient matrix TAD²AᵀTᵀ is then computed. By using a result obtained in [13], which establishes that the condition number of TAD²AᵀTᵀ is uniformly bounded by a quantity depending on A only, Monteiro and O'Neal [12] show that the number of inner iterations of the algorithm in [12] can be uniformly bounded by a constant depending on n
and A.
In the case of CQP, the standard normal equation takes the form
A(Q + X⁻¹S)⁻¹Aᵀ∆y = g,    (2)
for some vector g. When Q is not diagonal, the matrix (Q + X⁻¹S)⁻¹ is not diagonal, and
hence the coefficient matrix of (2) does not have the form required for the result of [13] to
hold. To remedy this difficulty, we develop in this paper a new linear system, referred to
as the augmented normal equation (ANE), to determine a portion of the primal-dual search
direction. This equation has the form ÃD̃²Ãᵀu = w, where w ∈ R^{m+l}, D̃ is an (n+l)×(n+l) positive diagonal matrix and Ã is a 2×2 block matrix of dimension (m+l)×(n+l) whose blocks consist of A, Vᵀ, the zero matrix and the identity matrix (see equation (21)). As
was done in [12], a maximum weight basis preconditioner T̃ for the ANE is computed and
an approximate solution of the resulting preconditioned equation with coefficient matrix
T̃ÃD̃²ÃᵀT̃ᵀ is generated using an iterative solver. Using the result of [13], which claims that the condition number of T̃ÃD̃²ÃᵀT̃ᵀ is uniformly bounded regardless of D̃, we obtain
a uniform bound (depending only on Ã) on the number of inner iterations performed by the
iterative solver to find a desirable approximate solution for the ANE (see Theorem 2.6).
Since the iterative solver can only generate an approximate solution to the ANE, it is
clear that not all equations of the Newton system, which determines the primal-dual search
direction, can be satisfied simultaneously. In the context of LP, Monteiro and O’Neal [12]
proposed a recipe to compute an inexact primal-dual search direction so that the Newton
equations corresponding to the primal and dual residuals were both satisfied. In the context
of CQP, such an approach is no longer possible. Instead, we propose a way to compute
an inexact primal-dual search direction so that the Newton equation corresponding to the
primal residual is satisfied exactly, while the one corresponding to the dual residual contains
a manageable error which allows us to establish a polynomial bound on the number of outer
iterations of our method. Interestingly, the presence of this error on the dual residual Newton
equation implies that the primal and dual residuals go to zero at different rates. This is a
unique feature of our algorithm and stands in contrast with other interior-point primal-dual path-following algorithms, where the primal and dual residuals go to zero at the same
rate.
The use of inexact search directions in interior-point methods has been extensively studied
in the context of LP (see e.g. [1, 5, 8, 11, 21]), but little research has been done regarding
the use of inexact search directions in interior-point methods for CQP. Moreover, the use of
iterative solvers to compute the primal-dual Newton search directions of interior-point path-following algorithms in the context of LP has been investigated in [2, 5, 8, 14, 15, 16]; again,
little research has been done regarding the use of iterative solvers in the context of CQP. In
the context of CQP, Bergamaschi et al. [3] have proposed a preconditioner for the augmented
system of equations and have developed an interior-point algorithm which uses an iterative
solver on the preconditioned augmented system in order to obtain the search directions.
However, it should be noted that this paper does not touch upon the issue of complexity
bounds either from the inner or outer iteration points of view. To our knowledge, no one
has used the ANE system to obtain the primal-dual search direction in the context of CQP,
nor has anyone presented an iterative solver-based CQP algorithm with strong complexity
bounds on both the inner and outer iterations.
Our paper is organized as follows. In Subsection 1.1, we give the terminology and notation
which will be used throughout our paper. Section 2 describes the algorithm which will be
the main focus of our study in this paper and presents both the inner and outer iteration-complexity results obtained for it. In Section 3, we prove the two convergence results stated in Section 2. Finally, we present some concluding remarks in Section 4.
1.1 Terminology and Notation
Throughout this paper, upper-case Roman letters denote matrices, lower-case Roman letters
denote vectors, and lower-case Greek letters denote scalars. We let R^n, R^n_+ and R^n_{++} denote the sets of n-vectors having real, nonnegative real, and positive real components, respectively. Also, we let R^{m×n} denote the set of m × n matrices with real entries. For a vector v ∈ R^n, we let |v| denote the vector whose i-th component is |v_i|, for every i = 1, . . . , n, and we let Diag(v) denote the diagonal matrix whose i-th diagonal element is v_i, for every i = 1, . . . , n.
Certain matrices bear special notation, namely the matrices X, ∆X, S, D, and D̃. These are the diagonal matrices corresponding to the vectors x, ∆x, s, d, and d̃, respectively, as described in the previous paragraph. The symbol 0 will be used to denote a
scalar, vector, or matrix of all zeroes; its dimensions should be clear from the context. Also,
we denote by e the vector of all 1’s, and by I the identity matrix; their dimensions should
be clear from the context.
For a symmetric positive definite matrix W , we denote by κ(W ) its condition number,
i.e. its maximum eigenvalue divided by its minimum eigenvalue. We will denote sets by
upper-case script Roman letters (e.g. B, N). For a finite set B, we denote its cardinality by |B|. Given a matrix A ∈ R^{m×n} and an ordered set B ⊆ {1, . . . , n}, we let A_B denote the submatrix whose columns are {A_i : i ∈ B} arranged in the same order as B. Similarly, given a vector v ∈ R^n and an ordered set B ⊆ {1, . . . , n}, we let v_B denote the subvector consisting of the elements {v_i : i ∈ B} arranged in the same order as B.
We will use several different norms throughout the paper. For a vector z ∈ R^n, ‖z‖ = √(zᵀz) is the Euclidean norm, ‖z‖₁ = Σᵢ|zᵢ| is the 1-norm, and ‖z‖∞ = max_{i=1,...,n} |zᵢ| is the infinity norm. For a matrix V ∈ R^{m×n}, ‖V‖ denotes the operator norm associated with the Euclidean norm: ‖V‖ = max_{z:‖z‖=1} ‖Vz‖. Finally, ‖V‖_F denotes the Frobenius norm: ‖V‖_F = (Σᵢ Σⱼ Vᵢⱼ²)^{1/2}.
2 Main Results
This section, which presents the main results of our paper, is divided into four subsections.
In Subsection 2.1, we will review a well-known polynomially convergent primal-dual long-step path-following algorithm for CQP whose search directions are computed by means of
an exact solver. Also, we will introduce the “augmented normal equation” (ANE) approach
for computing the primal-dual search directions for the above algorithm. In the next three
subsections, we discuss a variant of the above algorithm where the corresponding search
directions are (inexactly) computed by means of an iterative solver. In Subsection 2.2, we
will discuss the use of the ANE in connection with the preconditioning results developed in
Monteiro, O’Neal and Tsuchiya [13] and derive a bound on the number of iterations required
for the iterative solver to obtain a sufficiently accurate solution to the ANE. In Subsection
2.3, we will discuss a suitable way to define the primal-dual search direction using an inexact
solution to the ANE. Finally, in Subsection 2.4 we will present our main algorithm and the
corresponding (inner and outer) iteration-complexity results.
2.1 Preliminaries and Motivation
In this subsection, we discuss a primal-dual long-step path-following CQP algorithm, Algorithm CQP-exact, which will serve as the basis for Algorithm CQP-IS given in Section 2.4.
We will also give the complexity results which have been obtained for Algorithm CQP-exact.
In particular, the complexity results obtained in this section follow as an immediate corollary
of Theorem 2.5.
Consider the following primal-dual pair of CQP problems:

min_x { (1/2)‖Vᵀx‖² + cᵀx : Ax = b, x ≥ 0 },    (3)

max_{(x̂,s,y)} { −(1/2)‖Vᵀx̂‖² + bᵀy : Aᵀy + s − VVᵀx̂ = c, s ≥ 0 },    (4)

where the data are V ∈ R^{n×l}, A ∈ R^{m×n}, b ∈ R^m and c ∈ R^n, and the decision variables are x ∈ R^n and (x̂, s, y) ∈ R^n × R^n × R^m.
It is well-known that if x∗ is an optimal solution for (3) and (x̂∗ , s∗ , y ∗ ) is an optimal
solution for (4), then (x∗ , s∗ , y ∗ ) is also an optimal solution for (4). Now, let S denote the
set of all vectors w := (x, s, y, z) ∈ R^{2n+m+l} satisfying

Ax = b,  x ≥ 0,    (5)
Aᵀy + s + Vz = c,  s ≥ 0,    (6)
Xs = 0,    (7)
Vᵀx + z = 0.    (8)

It is clear that w ∈ S if and only if x is optimal for (3), (x, s, y) is optimal for (4), and z = −Vᵀx. (Throughout this paper, the symbol w will always denote the quadruple (x, s, y, z), where the vectors have the appropriate dimensions; similarly, ∆w = (∆x, ∆s, ∆y, ∆z), w^k = (x^k, s^k, y^k, z^k), w̄ = (x̄, s̄, ȳ, z̄), etc.)
We observe that the presentation of the primal-dual path-following algorithm based on
an exact solver in this subsection differs from the classical way of presenting it in that we
introduce an additional variable z as above. Clearly, it is easy to see that the variable z is
completely redundant and can be eliminated, thereby reducing the method described below
to the usual way of presenting it. However, we observe that the introduction of the variable
z plays a crucial role in the version of the algorithm below based on an iterative solver
(see Algorithm CQP-IS in Subsection 2.4), where the search directions are computed only
approximately.
We will make the following two assumptions throughout the paper:
Assumption 1 A has full row rank.
Assumption 2 The set S is nonempty.
For a point w ∈ R^{2n}_{++} × R^{m+l}, let us define

µ = µ(w) := xᵀs/n,    (9)
r_p = r_p(w) := Ax − b,    (10)
r_d = r_d(w) := Aᵀy + s + Vz − c,    (11)
r_V = r_V(w) := Vᵀx + z,    (12)
r = r(w) := (r_p(w), r_d(w), r_V(w)).    (13)
Moreover, given γ ∈ (0, 1) and an initial point w⁰ ∈ R^{2n}_{++} × R^{m+l}, we define the following neighborhood of the central path:

N_{w⁰}(γ) := { w ∈ R^{2n}_{++} × R^{m+l} : Xs ≥ (1 − γ)µe, ‖r‖/‖r⁰‖ ≤ µ/µ⁰ },    (14)

where r := r(w), r⁰ := r(w⁰), µ := µ(w), and µ⁰ := µ(w⁰). (Here, we use the convention that ν/0 is equal to 0 if ν = 0 and ∞ if ν > 0.)
The infeasible primal-dual algorithm which will serve as the basis for our iterative CQP
method is as follows:
Algorithm CQP-exact
1. Start: Let ε > 0, γ ∈ (0, 1), w⁰ ∈ N_{w⁰}(γ) and 0 < σ̲ < σ̄ < 1 be given. Set k = 0.
2. While µ_k := µ(w^k) > ε do
   (a) Let w := w^k and µ := µ_k; choose σ := σ_k ∈ [σ̲, σ̄].
   (b) Let ∆w = (∆x, ∆s, ∆y, ∆z) denote the solution of the linear system
       A∆x = −r_p,    (15)
       Aᵀ∆y + ∆s + V∆z = −r_d,    (16)
       X∆s + S∆x = −Xs + σµe,    (17)
       Vᵀ∆x + ∆z = −r_V.    (18)
   (c) Let α̃ = argmax{α ∈ [0, 1] : w + α′∆w ∈ N_{w⁰}(γ), ∀α′ ∈ [0, α]}.
   (d) Let ᾱ = argmin{(x + α∆x)ᵀ(s + α∆s) : α ∈ [0, α̃]}.
   (e) Let w^{k+1} = w + ᾱ∆w, and set k ← k + 1.
End (while)
(Note: we titled this algorithm “CQP-exact” to distinguish it from our main algorithm in
Section 2.4, where system (15)-(18) is solved inexactly using an iterative solver.)
The following result, which establishes an iteration-complexity bound for Algorithm
CQP-exact, is well-known.
Theorem 2.1 Assume that the constants γ, σ̲, and σ̄ are such that

max{γ⁻¹, (1 − γ)⁻¹, σ̲⁻¹, (1 − σ̄)⁻¹} = O(1),

and that the initial point w⁰ ∈ R^{2n}_{++} × R^{m+l} satisfies (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, Algorithm CQP-exact finds an iterate w^k ∈ R^{2n}_{++} × R^{m+l} satisfying µ_k ≤ εµ⁰ and ‖r^k‖ ≤ ε‖r⁰‖ within O(n² log(1/ε)) iterations.
A few approaches have been suggested in the literature for computing the Newton search
direction (15)-(18). Instead of pursuing one of them, we will discuss below a new approach,
referred to in this paper as the augmented normal equation (ANE) approach, that we believe
to be suitable not only for direct solvers but especially for iterative solvers as we will see in
the next subsection.
Let us begin by defining the following matrices:
D := X^{1/2} S^{−1/2},    (19)

D̃ := [ D  0 ; 0  I ] ∈ R^{(n+l)×(n+l)},    (20)

Ã := [ A  0 ; Vᵀ  I ] ∈ R^{(m+l)×(n+l)},    (21)

where [ ·  · ; ·  · ] denotes a 2 × 2 block matrix. Suppose that we first solve the following system of equations for (∆y, ∆z):

ÃD̃²Ãᵀ (∆y; ∆z) = Ã (x − σµS⁻¹e − D²r_d; 0) + (−r_p; −r_V) =: h,    (22)

where (u₁; u₂) denotes the vertical concatenation of u₁ and u₂.
This system is what we refer to as the ANE. Next, we obtain ∆s and ∆x according to:
∆s = −r_d − Aᵀ∆y − V∆z,    (23)
∆x = −D²∆s − x + σµS⁻¹e.    (24)
Clearly, the search direction ∆w = (∆x, ∆s, ∆y, ∆z) computed as above satisfies (16) and
(17) in view of (23) and (24). Moreover, it also satisfies (15) and (18) due to the fact that
by (20), (21), (22), (23) and (24), we have that
Ã (∆x; ∆z) = Ã (−D²∆s − x + σµS⁻¹e; ∆z)
= Ã (D²r_d + D²Aᵀ∆y + D²V∆z − x + σµS⁻¹e; ∆z)
= Ã (D²Aᵀ∆y + D²V∆z; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0)
= ÃD̃²Ãᵀ (∆y; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) = (−r_p; −r_V),    (25)
or equivalently, (15) and (18) hold.
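The recipe (19)–(24) is easy to exercise numerically. The sketch below is our own illustration with random, hypothetical data; a dense direct solve of the ANE stands in for the iterative solver discussed in the next subsection. It assembles Ã, D̃² and the right-hand side h, recovers (∆x, ∆s, ∆y, ∆z), and checks that all four Newton equations (15)–(18) hold:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 6, 3, 2

# Hypothetical problem data: A with full row rank, Q = V V^T.
A = rng.standard_normal((m, n))
V = rng.standard_normal((n, l))
b, c = rng.standard_normal(m), rng.standard_normal(n)

# A strictly positive (generally infeasible) iterate w = (x, s, y, z).
x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)
y, z = rng.standard_normal(m), rng.standard_normal(l)
sigma, mu = 0.5, x @ s / n

# Residuals (10)-(12) and the matrices (19)-(21).
rp = A @ x - b
rd = A.T @ y + s + V @ z - c
rV = V.T @ x + z
D2 = np.diag(x / s)                                  # D^2 = X S^{-1}
Dt2 = np.block([[D2, np.zeros((n, l))],
                [np.zeros((l, n)), np.eye(l)]])      # tilde-D squared
At = np.block([[A, np.zeros((m, l))],
               [V.T, np.eye(l)]])                    # tilde-A

# ANE (22): solve for (dy, dz); here with a direct dense solver.
h = At @ np.concatenate([x - sigma * mu / s - D2 @ rd, np.zeros(l)]) \
    + np.concatenate([-rp, -rV])
u = np.linalg.solve(At @ Dt2 @ At.T, h)
dy, dz = u[:m], u[m:]

# Recover ds and dx via (23)-(24).
ds = -rd - A.T @ dy - V @ dz
dx = -D2 @ ds - x + sigma * mu / s

# The direction satisfies the full Newton system (15)-(18).
assert np.allclose(A @ dx, -rp)
assert np.allclose(A.T @ dy + ds + V @ dz, -rd)
assert np.allclose(x * ds + s * dx, -x * s + sigma * mu)
assert np.allclose(V.T @ dx + dz, -rV)
```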
Theorem 2.1 assumes that we can compute ∆w exactly. This requires (22) to be solved exactly, which is normally done by computing the Cholesky factorization of its coefficient matrix. In this paper, we will consider a variant of Algorithm CQP-exact which computes an approximate solution to (22) by means of an iterative solver. We then modify equations (17) and (18) to obtain the other components of the search direction in such a manner as to ensure that the number of outer iterations is still polynomially bounded.
2.2 Iterative approaches for solving the ANE
In this subsection we discuss an approach for solving the ANE based on iterative solvers. Since the condition number of the ANE matrix ÃD̃²Ãᵀ may "blow up" for points w near an optimal solution, applying a generic iterative solver directly to the ANE (22) without first preconditioning it is generally ineffective. We discuss a natural remedy to this problem, which consists of using a preconditioner T̃, namely the maximum weight basis preconditioner, such that κ(T̃ÃD̃²ÃᵀT̃ᵀ) remains uniformly bounded regardless of the iterate w. Finally, we analyze the complexity of the resulting approach to obtain a suitable approximate solution to the ANE.
We start by describing the maximum weight basis preconditioner. Its construction essentially consists of building a basis B of à which gives higher priority to the columns of
à corresponding to larger diagonal elements of D̃. More specifically, the maximum weight
basis preconditioner is determined by the following algorithm:
Maximum Weight Basis Algorithm
Start: Given d̃ ∈ R^{n+l}_{++} and Ã ∈ R^{(m+l)×(n+l)} such that rank(Ã) = m + l,
1. Order the elements of d̃ so that d̃₁ ≥ . . . ≥ d̃_{n+l}; order the columns of Ã accordingly.
2. Let B = ∅, j = 1.
3. While |B| < m + l do
   (a) If Ã_j is linearly independent of {Ã_i : i ∈ B}, set B ← B ∪ {j}.
   (b) j ← j + 1.
4. Return to the original ordering of Ã and d̃; determine the set B according to this ordering and set N := {1, . . . , n + l}\B.
5. Set B := Ã_B and D̃_B := Diag(d̃_B).
6. Let T̃ = T̃(Ã, d̃) := D̃_B⁻¹ B⁻¹.
end
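The algorithm above admits a compact sketch in code. The version below is our own illustration: the linear-independence test of step 3(a) is done with `numpy.linalg.matrix_rank`, which is adequate for small dense examples (a real implementation would use a factorization-based test):

```python
import numpy as np

def max_weight_basis(A, d):
    """Sketch of the Maximum Weight Basis Algorithm for a full-row-rank A.

    Scans the columns of A in order of decreasing weight d, keeping each
    column that is linearly independent of those already kept (step 3),
    then returns T = D_B^{-1} B^{-1} (step 6) and the basic index list.
    """
    m, _ = A.shape
    basis, cols = [], np.empty((m, 0))
    for j in np.argsort(-d):                  # heaviest columns first
        cand = np.column_stack([cols, A[:, j]])
        if np.linalg.matrix_rank(cand) > cols.shape[1]:
            basis.append(j)
            cols = cand
            if len(basis) == m:
                break
    basis.sort()                              # step 4: original ordering
    B = A[:, basis]
    T = np.diag(1.0 / d[basis]) @ np.linalg.inv(B)
    return T, basis

# Tiny example: by construction the preconditioner maps the basic columns
# of A D to unit vectors, i.e. T A_B D_B = I (a fact used in Subsection 2.3).
A = np.array([[1., 0., 1.], [0., 1., 1.]])
d = np.array([3., 1., 2.])
T, basis = max_weight_basis(A, d)
assert basis == [0, 2]
assert np.allclose(T @ A[:, basis] @ np.diag(d[basis]), np.eye(2))
```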
Note that the above algorithm can be applied to the matrix à defined in (21) since this
matrix has full row rank due to Assumption 1. The maximum weight basis preconditioner
was originally proposed by Resende and Veiga in [16] in the context of the minimum cost
network flow problem. In this case, Ã = A is the node-arc incidence matrix of a connected
digraph (with one row deleted to ensure that Ã has full row rank), the entries of d̃ are
weights on the edges of the graph, and the set B generated by the above algorithm defines a
maximum spanning tree on the digraph. Oliveira and Sorensen [14] later proposed the use
of this preconditioner for general matrices Ã.
For the purpose of stating the next result, we now introduce some notation. Let us define
ϕ_Ã := max{‖B⁻¹Ã‖_F : B is a basis of Ã}.    (26)

It is easy to show that ϕ_Ã ≤ (n + l)^{1/2} χ̄_Ã, where χ̄_Ã is a well-known condition number (see [18]) defined as

χ̄_Ã := sup{‖Ãᵀ(ÃEÃᵀ)⁻¹ÃE‖ : E ∈ Diag(R^{n+l}_{++})}.

Indeed, this follows from the fact that ‖C‖_F ≤ (n + l)^{1/2}‖C‖ for any matrix C ∈ R^{(m+l)×(n+l)} and that an equivalent characterization of χ̄_Ã is

χ̄_Ã = max{‖B⁻¹Ã‖ : B is a basis of Ã},
as shown in [17] and [18].
Lemmas 2.1 and 2.2 of Monteiro, O’Neal and Tsuchiya [13] imply the following result.
Proposition 2.2 Let T̃ = T̃(Ã, d̃) be the preconditioner determined according to the Maximum Weight Basis Algorithm, and define W := T̃ÃD̃²ÃᵀT̃ᵀ. Then, ‖T̃ÃD̃‖ ≤ ϕ_Ã and κ(W) ≤ ϕ_Ã².

Note that the bound ϕ_Ã² on κ(W) is independent of the diagonal matrix D̃ and depends
only on Ã. This will allow us to obtain a uniform bound on the number of iterations needed
by an iterative solver to obtain a suitable approximate solution to (22). This topic is the
subject of the remainder of this subsection.
Instead of dealing directly with (22), we consider the application of an iterative solver to
the preconditioned ANE:
W u = v,    (27)

where

W := T̃ÃD̃²ÃᵀT̃ᵀ,  v := T̃h.    (28)
For the purpose of our analysis below, the only thing we will assume regarding the iterative
solver when applied to (27) is that it generates a sequence of iterates {uj } such that
‖v − W u^j‖ ≤ c(κ) [1 − 1/f(κ)]^j ‖v − W u⁰‖,  ∀ j = 0, 1, 2, . . . ,    (29)
where c and f are positive functions of κ ≡ κ(W ).
Examples of solvers which satisfy (29) include the steepest descent (SD) and conjugate
gradient (CG) methods, with the following values for c(κ) and f (κ):
Solver | c(κ) | f(κ)
SD     | √κ   | (κ + 1)/2
CG     | 2√κ  | (√κ + 1)/2

Table 2.2
The justification for the table above follows from Section 7.6 and Exercise 10 of Section 8.8
of [10].
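For concreteness, a minimal conjugate gradient loop of the kind assumed by (29) can be written as follows; this is our own generic implementation, not code from the paper, and it simply monitors the residual ‖v − Wu‖ directly:

```python
import numpy as np

def cg(W, v, u0, tol, max_iter=1000):
    """Minimal conjugate gradient for W u = v with W symmetric positive
    definite; stops once the residual norm ||v - W u|| drops below tol."""
    u = u0.copy()
    r = v - W @ u          # residual
    p = r.copy()           # search direction
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        Wp = W @ p
        alpha = (r @ r) / (p @ Wp)
        u += alpha * p
        r_new = r - alpha * Wp
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return u

# Example on a random, well-conditioned SPD matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
W = M @ M.T + 8 * np.eye(8)
v = rng.standard_normal(8)
u = cg(W, v, np.zeros(8), tol=1e-10)
assert np.linalg.norm(v - W @ u) <= 1e-10
```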
The following result gives an upper bound on the number of iterations that the generic iterative linear solver needs to perform to obtain an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ for some constant ξ > 0:

Proposition 2.3 Let u⁰ be an arbitrary starting point. Then, a generic iterative solver with a convergence rate given by (29) generates an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ in

O( f(κ) log( c(κ)‖v − W u⁰‖ / (ξ√µ) ) )    (30)

iterations, where κ := κ(W).
Proof: The proof is similar to that of Theorem 2.4 of [12].
We will refer to an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ as a ξ-approximate solution of (27). From Proposition 2.3, it is clear that different choices of the initial point u⁰ lead to different bounds on the number of iterations performed by the iterative solver. In Subsection 2.4, we will describe a suitable way of selecting u⁰ and ξ so that the resulting number of iterations performed by the iterative solver can be uniformly bounded by a universal constant depending on the quantities f(κ), c(κ), n and ϕ_Ã (see the bound (51)).
2.3 Computation of the Inexact Search Direction
In this subsection, we will discuss the impact an inexact solution to (27) has on equations
(15)–(18). More specifically, we will discuss one way to define the overall search direction
so that its residual with respect to system (15)–(18) is appropriate for the development of a
polynomially convergent algorithm based on that search direction.
Suppose that we solve (27) inexactly according to Section 2.2. Then our final solution u^j satisfies W u^j − v = f̃ for some vector f̃ such that ‖f̃‖ ≤ ξ√µ. Letting

(∆y; ∆z) = T̃ᵀ u^j,    (31)
we see from (28) that
ÃD̃²Ãᵀ (∆y; ∆z) = h + T̃⁻¹f̃.    (32)
The component ∆s of the search direction is still obtained as before, i.e. as in (23). However,
the component ∆x is computed as
∆x = −D²∆s − x + σµS⁻¹e − S⁻¹p,    (33)
where p ∈ R^n will be specified below. To motivate the choice of p, we first note that an argument similar to (25) implies that
Ã (∆x; ∆z) + (r_p; r_V)
= Ã (−D²∆s − x + σµS⁻¹e − S⁻¹p; ∆z) + (r_p; r_V)
= Ã (D²r_d + D²Aᵀ∆y + D²V∆z − x + σµS⁻¹e − S⁻¹p; ∆z) + (r_p; r_V)
= ÃD̃² (Aᵀ∆y + V∆z; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) − Ã (S⁻¹p; 0) + (r_p; r_V)
= ÃD̃²Ãᵀ (∆y; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) − Ã (S⁻¹p; 0) + (r_p; r_V)
= T̃⁻¹f̃ − Ã (S⁻¹p; 0).    (34)
Based on the above equation, one is naturally tempted to choose p so that the right-hand side of (34) is zero, and consequently (15) and (18) are satisfied exactly. However, the existence of such a p cannot be guaranteed and, even if it exists, its magnitude may be so large
as to yield a search direction which is not suitable for the development of a polynomially
convergent algorithm.
Instead, we consider an alternative approach where p is chosen so that the first component
of (34) is zero and the second component is small enough. More specifically, we choose
(p, q) ∈ Rn × Rl such that
µ −1 ¶
µ
¶
S p
(XS)−1/2 p
˜
˜
0 = f − T̃ Ã
= f − T̃ ÃD̃
.
(35)
q
q
Then, using (34) and (35), we conclude that
Ã (∆x; ∆z) + (r_p; r_V) = T̃⁻¹f̃ − Ã (S⁻¹p; 0),    (36)
from which we see that the first component of (34) is set to 0 and the second component is
exactly q. We now discuss a convenient way to choose (p, q) satisfying (35). (Note that the
coefficient matrix of system (35) has full row rank, implying that (35) has multiple solutions.) First, let B = (B₁, . . . , B_{m+l}) be the ordered set of basic indices computed by the Maximum Weight Basis Algorithm and note that, by step 6 of this algorithm, the B_i-th column of T̃ÃD̃ is the i-th unit vector for every i = 1, . . . , m + l. Then, the vector u ∈ R^{n+l} defined as u_{B_i} = f̃_i for i = 1, . . . , m + l and u_j = 0 for every j ∉ {B₁, . . . , B_{m+l}} clearly satisfies

f̃ = T̃ÃD̃u.    (37)

We then obtain a pair (p, q) ∈ R^n × R^l satisfying (35) by defining

(p; q) := [ (XS)^{1/2}  0 ; 0  I ] u.    (38)
We summarize the results of this subsection in the following proposition:
Proposition 2.4 The search direction ∆w derived in this subsection satisfies

A∆x = −r_p,    (39)
Aᵀ∆y + ∆s + V∆z = −r_d,    (40)
X∆s + S∆x = −Xs + σµe − p,    (41)
Vᵀ∆x + ∆z = −r_V + q,    (42)

where the vectors p and q satisfy:

‖p‖ ≤ ‖XS‖^{1/2}‖f̃‖,    (43)
‖q‖ ≤ ‖f̃‖.    (44)

Proof: Equations (39) and (42) immediately follow from (21) and (36). Equations (40) and (41) follow from (23) and (33), respectively. The bounds on ‖p‖ and ‖q‖ follow from (38) and the fact that ‖u‖ = ‖f̃‖.
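The recipe (37)–(38) can be checked numerically. The sketch below uses random, hypothetical data, rebuilds the maximum weight basis inline, and verifies both (35) and the bounds (43)–(44); it relies on the fact noted above that the B_i-th column of T̃ÃD̃ is the i-th unit vector:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, l = 5, 2, 2

A = rng.standard_normal((m, n))
V = rng.standard_normal((n, l))
x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)

# tilde-A and tilde-d (the diagonal of tilde-D) as in (20)-(21).
At = np.block([[A, np.zeros((m, l))], [V.T, np.eye(l)]])
dt = np.concatenate([np.sqrt(x / s), np.ones(l)])

# Maximum weight basis preconditioner (greedy rank test, step 3).
basis, cols = [], np.empty((m + l, 0))
for j in np.argsort(-dt):
    cand = np.column_stack([cols, At[:, j]])
    if np.linalg.matrix_rank(cand) > cols.shape[1]:
        basis.append(j)
        cols = cand
        if len(basis) == m + l:
            break
basis.sort()
Tt = np.diag(1.0 / dt[basis]) @ np.linalg.inv(At[:, basis])

# Given an ANE residual f~, build u by (37) and (p, q) by (38).
f = rng.standard_normal(m + l)     # stands in for W u^j - v
u = np.zeros(n + l)
u[basis] = f                       # u_{B_i} = f_i, zero elsewhere
p = np.sqrt(x * s) * u[:n]         # (38): p = (XS)^{1/2} u_{1:n}
q = u[n:]

# Check (35): f~ = T~ A~ (S^{-1}p; q), and the bounds (43)-(44).
assert np.allclose(f, Tt @ At @ np.concatenate([p / s, q]))
assert np.linalg.norm(q) <= np.linalg.norm(f) + 1e-12
assert np.linalg.norm(p) <= np.sqrt(np.max(x * s)) * np.linalg.norm(f) + 1e-12
```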
2.4 The Algorithm
In this subsection, we describe the primal-dual long-step path-following algorithm based
on an iterative solver that will be the main focus of the analysis of this paper. We also
state the main convergence results for this algorithm, whose proofs are given in the next
section. The first result essentially shows that the number of outer iterations is bounded in
the same way as for its exact counterpart discussed in Subsection 2.1. The second result
shows that the number of inner iterations performed by the iterative solver to compute an
approximate solution of (27) is uniformly bounded by a universal constant depending only on the quantities f(κ), c(κ), n and ϕ_Ã.
First, given a starting point w⁰ ∈ R^{2n}_{++} × R^{m+l} and the parameters η ≥ 0, γ ∈ [0, 1], and θ > 0, we will define the following set:

N_{w⁰}(η, γ, θ) = { w ∈ R^{2n}_{++} × R^{m+l} : Xs ≥ (1 − γ)µe, (r_p, r_d) = η(r_p⁰, r_d⁰), ‖r_V − ηr_V⁰‖ ≤ θ√µ, η ≤ µ/µ⁰ },    (45)

where µ = µ(w), µ⁰ = µ(w⁰), r = r(w) and r⁰ = r(w⁰). Next, we define the neighborhood

N_{w⁰}(γ, θ) = ⋃_{η∈[0,1]} N_{w⁰}(η, γ, θ).    (46)
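A membership test for the set (45) is straightforward to sketch; the function below is our own illustration with hypothetical data packed as `data = (A, V, b, c)`:

```python
import numpy as np

def in_neighborhood(w, w0, data, eta, gamma, theta):
    """Test membership of w = (x, s, y, z) in N_{w0}(eta, gamma, theta),
    the set defined in (45)."""
    A, V, b, c = data

    def parts(wt):
        x, s, y, z = wt
        mu = x @ s / len(x)
        rp = A @ x - b
        rd = A.T @ y + s + V @ z - c
        rV = V.T @ x + z
        return x, s, mu, rp, rd, rV

    x, s, mu, rp, rd, rV = parts(w)
    x0, s0, mu0, rp0, rd0, rV0 = parts(w0)
    return (np.all(x > 0) and np.all(s > 0)
            and np.all(x * s >= (1 - gamma) * mu)
            and np.allclose(rp, eta * rp0) and np.allclose(rd, eta * rd0)
            and np.linalg.norm(rV - eta * rV0) <= theta * np.sqrt(mu)
            and eta <= mu / mu0)

# Sanity check: w0 itself lies in N_{w0}(1, gamma, theta) whenever the
# centrality condition holds at w0; x0 = s0 = e gives X0 s0 = mu0 e.
rng = np.random.default_rng(3)
n, m, l = 4, 2, 1
data = (rng.standard_normal((m, n)), rng.standard_normal((n, l)),
        rng.standard_normal(m), rng.standard_normal(n))
w0 = (np.ones(n), np.ones(n), np.zeros(m), np.zeros(l))
assert in_neighborhood(w0, w0, data, eta=1.0, gamma=0.1, theta=1.0)
```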
We now present the algorithm which will be the focus of the analysis of this paper.
Algorithm CQP-IS:
1. Start: Let ε > 0, γ ∈ (0, 1), θ > 0, w⁰ ∈ N_{w⁰}(γ, θ), and 0 < σ̲ < σ̄ < 4/5 be given. Determine w̄ which satisfies equations (49). Set k = 0.
2. While µ_k := µ(w^k) > ε do
   (a) Let w := w^k and µ := µ_k; choose σ ∈ [σ̲, σ̄].
   (b) Define D, Ã and D̃ according to (19), (21) and (20). Set r_p = Ax − b, r_d = Aᵀy + s + Vz − c, r_V = Vᵀx + z, and η = ‖r_p‖/‖r_p⁰‖.
   (c) Build the preconditioner T̃ = T̃(Ã, d̃) using the Maximum Weight Basis Algorithm, and define W, v, and u⁰ according to (28) and (50).
   (d) Using u⁰ as the starting point for an iterative solver, find an approximate solution u^j of W u = v such that f̃ = W u^j − v satisfies ‖f̃‖ ≤ ξ√µ, where
       ξ := min{ γσ̲/(4√n), [√(1 + (1 − γ/2)σ̲) − 1] θ }.    (47)
   (e) Solve for ∆y and ∆z using equation (31). Let (p, q) be computed according to (38). Compute ∆s and ∆x by (23) and (33).
   (f) Compute α̃ := argmax{α ∈ [0, 1] : w + α′∆w ∈ N_{w⁰}(γ, θ), ∀α′ ∈ [0, α]}.
   (g) Compute ᾱ := argmin{(x + α∆x)ᵀ(s + α∆s) : α ∈ [0, α̃]}.
   (h) Let w^{k+1} = w + ᾱ∆w, and set k ← k + 1.
End (while)
Next, we present the main convergence results we have obtained for Algorithm CQP-IS.
The first result gives a bound on the number of “outer” iterations needed by the algorithm
to obtain an ε-solution to the KKT conditions (5)–(8).
Theorem 2.5 Assume that the constants γ, σ̲, σ̄ and θ are such that

max{ γ⁻¹, (1 − γ)⁻¹, σ̲⁻¹, (1 − (5/4)σ̄)⁻¹ } = O(1),  θ = O(√n),    (48)

and that the initial point w⁰ ∈ R^{2n}_{++} × R^{m+l} satisfies (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, Algorithm CQP-IS generates an iterate w^k ∈ R^{2n}_{++} × R^{m+l} satisfying µ_k ≤ ε²µ⁰, ‖(r_p^k, r_d^k)‖ ≤ ε²‖(r_p⁰, r_d⁰)‖, and ‖r_V^k‖ ≤ ε²‖r_V⁰‖ + εθ√µ⁰ within O(n² log(1/ε)) iterations.
We will prove Theorem 2.5 in Section 3.2.
We next describe a suitable way of selecting u⁰ so that the resulting number of iterations performed by the iterative solver can be uniformly bounded by a universal constant depending only on the quantities f(κ), c(κ), n and ϕ_Ã. First, compute a point w̄ = (x̄, s̄, ȳ, z̄) such that

Ã (x̄; z̄) = (b; 0),  Aᵀȳ + s̄ + Vz̄ = c.    (49)

Note that vectors x̄ and z̄ satisfying the first equation in (49) can be easily computed once a basis of Ã is available (e.g., the one computed by the Maximum Weight Basis Algorithm in the first outer iteration of Algorithm CQP-IS). Once ȳ is arbitrarily chosen, a vector s̄ satisfying the second equation of (49) is immediately available. We then define

u⁰ = −η T̃⁻ᵀ (y⁰ − ȳ; z⁰ − z̄).    (50)
For the specific starting point u⁰ defined in (50), the following result gives an upper bound on the number of iterations that the generic iterative linear solver needs to perform to obtain an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ, where ξ is given in (47). We will prove this result in Subsection 3.1.
Theorem 2.6 Assume that ξ is defined in (47), where σ̲, γ, θ are such that

max{σ̲⁻¹, γ⁻¹, (1 − γ)⁻¹, θ, θ⁻¹}

is bounded by a polynomial in n. Assume also that w⁰ and w̄ are such that (x⁰, s⁰) > |(x̄, s̄)| and (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, a generic iterative solver with a convergence rate given by (29) generates an approximate solution u^j as in step (d) of Algorithm CQP-IS in

O( f(κ) log( c(κ) n ϕ_Ã ) )    (51)

iterations, where κ := κ(W). As a consequence, the SD and CG methods generate this approximate solution u^j in O(ϕ_Ã² log(nϕ_Ã)) and O(ϕ_Ã log(nϕ_Ã)) iterations, respectively.
3 Technical Results
In this section, we will prove Theorems 2.5 and 2.6. Section 3.1 is devoted to the proof of
Theorem 2.6, and Section 3.2 will give the proof of Theorem 2.5.
3.1 "Inner" Iteration Results – Proof of Theorem 2.6
In this subsection, we will provide the proof of Theorem 2.6.
We begin by establishing three technical lemmas.
Lemma 3.1 Suppose that w⁰ ∈ R^{2n}_{++} × R^{m+l}, w ∈ N_{w⁰}(η, γ, θ) for some η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, and w* ∈ S. Then

(x − ηx⁰ − (1 − η)x*)ᵀ(s − ηs⁰ − (1 − η)s*) ≥ −(θ²/4) µ.    (52)
Proof: Let us define w̃ := w − ηw0 − (1 − η)w∗. Using the definitions of N_{w0}(η, γ, θ), r, and S, we have that

    A x̃ = 0,
    Aᵀ ỹ + s̃ + V z̃ = 0,
    Vᵀ x̃ + z̃ = rV − ηrV0.

Multiplying the second relation by x̃ᵀ on the left, and using the first and third relations along with the fact that w ∈ N_{w0}(η, γ, θ), we see that

    x̃ᵀs̃ = −x̃ᵀV z̃ = [z̃ − (rV − ηrV0)]ᵀ z̃ = ‖z̃‖² − z̃ᵀ(rV − ηrV0)
        ≥ ‖z̃‖² − ‖z̃‖ ‖rV − ηrV0‖ = ( ‖z̃‖ − ‖rV − ηrV0‖/2 )² − ‖rV − ηrV0‖²/4
        ≥ −‖rV − ηrV0‖²/4 ≥ −(θ²/4) µ.
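The chain of inequalities above rests on the elementary completing-the-square bound t² − tc ≥ −c²/4, applied with t = ‖z̃‖ and c = ‖rV − ηrV0‖. A quick numeric sanity check of this bound, with hypothetical values:

```python
# Sanity check of the completing-the-square bound used in Lemma 3.1:
# for all t >= 0 and c >= 0,  t**2 - t*c = (t - c/2)**2 - c**2/4 >= -c**2/4.
for t in [0.0, 0.1, 0.5, 1.0, 2.5, 10.0]:     # plays the role of ||z~||
    for c in [0.0, 0.3, 1.0, 4.0]:            # plays the role of ||rV - eta*rV0||
        assert t * t - t * c >= -c * c / 4.0 - 1e-12
```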
Lemma 3.2 Suppose that w0 ∈ R^{2n}_{++} × R^{m+l} is such that (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S. Then, for any w ∈ N_{w0}(η, γ, θ) with η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, we have

    η(xᵀs0 + sᵀx0) ≤ (3n + θ²/4) µ.   (53)
Proof: Using the fact that w ∈ N_{w0}(η, γ, θ) and (52), we obtain

    xᵀs − η(xᵀs0 + sᵀx0) + η² x0ᵀs0 − (1 − η)(xᵀs∗ + sᵀx∗)
        + η(1 − η)(x∗ᵀs0 + s∗ᵀx0) + (1 − η)² x∗ᵀs∗ ≥ −(θ²/4) µ.

Rearranging the terms in this inequality, and using the facts that η ≤ xᵀs/(x0ᵀs0), x∗ᵀs∗ = 0, (x, s) ≥ 0, (x∗, s∗) ≥ 0, (x0, s0) > 0, η ∈ [0, 1], x∗ ≤ x0, and s∗ ≤ s0, we conclude that

    η(xᵀs0 + sᵀx0) ≤ η² x0ᵀs0 + xᵀs + η(1 − η)(x∗ᵀs0 + s∗ᵀx0) + (θ²/4) µ
        ≤ η² x0ᵀs0 + xᵀs + 2η(1 − η) x0ᵀs0 + (θ²/4) µ
        ≤ 2η x0ᵀs0 + xᵀs + (θ²/4) µ
        ≤ 3xᵀs + (θ²/4) µ = (3n + θ²/4) µ.
Lemma 3.3 Suppose that w0 ∈ R^{2n}_{++} × R^{m+l}, w ∈ N_{w0}(η, γ, θ) for some η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, and w̄ satisfies (49). Let W, v and u0 be given by (28) and (50), respectively. Then,

    W u0 − v = T̃ Ã ( −x + σµS^{−1}e + η(x0 − x̄) + ηD²(s0 − s̄) ; rV − ηrV0 ).   (54)
Proof: Using the fact that w ∈ N_{w0}(η, γ, θ) along with (21), (45) and (49), we easily obtain that

    ( rp ; rV ) = ( ηrp0 ; ηrV0 + (rV − ηrV0) ) = η Ã ( x0 − x̄ ; z0 − z̄ ) + Ã ( 0 ; rV − ηrV0 ),   (55)
    s0 − s̄ = −Aᵀ(y0 − ȳ) − V(z0 − z̄) + rd0.   (56)

Using relations (20), (21), (28), (45), (50), (55) and (56), we obtain

    W u0 − v = T̃ Ã D̃² Ãᵀ T̃ᵀ u0 − T̃ Ã ( x − σµS^{−1}e − D²rd ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã D̃² Ãᵀ ( y0 − ȳ ; z0 − z̄ ) − T̃ Ã ( x − σµS^{−1}e − ηD²rd0 ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã ( D²Aᵀ(y0 − ȳ) + D²V(z0 − z̄) − D²rd0 ; z0 − z̄ )
            − T̃ Ã ( x − σµS^{−1}e ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã ( −D²(s0 − s̄) ; z0 − z̄ ) − T̃ Ã ( x − σµS^{−1}e ; 0 )
            + η T̃ Ã ( x0 − x̄ ; z0 − z̄ ) + T̃ Ã ( 0 ; rV − ηrV0 ),

which yields equation (54), as desired.
The next result shows that the initial residual of the iterative solver satisfies ‖v − W u0‖ ≤ Ψ√µ, where Ψ > 0 is a universal constant independent of the current iterate of Algorithm CQP-IS.
Lemma 3.4 Assume that T̃ = T̃(Ã, d̃) is given and that w0 and w̄ are such that (x0, s0) > |(x̄, s̄)| and (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S. Further, assume that w ∈ N_{w0}(γ, θ) for some γ ∈ [0, 1] and θ > 0, and that W, v and u0 are given by (28) and (50), respectively. Then, the initial residual in (29) satisfies ‖v − W u0‖ ≤ Ψ√µ, where

    Ψ := [ (7n + θ²/2)/√(1 − γ) + θ ] ϕ_Ã.   (57)
Proof: Since w ∈ N_{w0}(γ, θ), we have that x_i s_i ≥ (1 − γ)µ for all i, which implies

    ‖(XS)^{−1/2}‖ ≤ 1/√((1 − γ)µ).   (58)

Note that ‖Xs − σµe‖, when viewed as a function of σ ∈ [0, 1], is convex. Hence, it is maximized at one of its endpoints, which, together with the facts that ‖Xs − µe‖ < ‖Xs‖ and σ ∈ [σ̲, σ̄] ⊂ [0, 1], immediately implies that

    ‖Xs − σµe‖ ≤ ‖Xs‖ ≤ ‖Xs‖₁ = xᵀs = nµ.   (59)

Using the fact that (x0, s0) > |(x̄, s̄)| together with Lemma 3.2, we obtain

    η‖S(x0 − x̄) + X(s0 − s̄)‖ ≤ η{ ‖S(x0 − x̄)‖ + ‖X(s0 − s̄)‖ } ≤ 2η{ ‖Sx0‖ + ‖Xs0‖ }
        ≤ 2η(xᵀs0 + sᵀx0) ≤ (6n + θ²/2) µ.   (60)

Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). It is clear that the requirements of Lemma 3.3 are met, so equation (54) holds. By (19), (20) and (54), we see that

    ‖v − W u0‖ = ‖ T̃ Ã D̃ ( (XS)^{−1/2}{ Xs − σµe − η[S(x0 − x̄) + X(s0 − s̄)] } ; rV − ηrV0 ) ‖
        ≤ ‖T̃ Ã D̃‖ { ‖(XS)^{−1/2}‖ [ ‖Xs − σµe‖ + η‖X(s0 − s̄) + S(x0 − x̄)‖ ] + ‖rV − ηrV0‖ }
        ≤ ϕ_Ã { (1/√((1 − γ)µ)) [ nµ + (6n + θ²/2)µ ] + θ√µ } = Ψ√µ,

where the last inequality follows from Proposition 2.2, relations (58), (59), (60), and the assumption that w ∈ N_{w0}(γ, θ).
The proof of the first part of Theorem 2.6 immediately follows from Proposition 2.3 and
Lemma 3.4. The proof of the second part of Theorem 2.6 follows immediately from Table
2.2 and Proposition 2.2.
3.2 “Outer” Iteration Results – Proof of Theorem 2.5
In this subsection, we will present the proof of Theorem 2.5. Specifically, we will show that Algorithm CQP-IS obtains an ε-approximate solution to (5)–(8) in O(n² log(1/ε)) outer iterations.
Throughout this section, we use the following notation:

    w(α) := w + α∆w,   µ(α) := µ(w(α)),   r(α) := r(w(α)).
Lemma 3.5 Assume that ∆w satisfies (39)–(42) for some σ ∈ R, w ∈ R^{2n+m+l} and (p, q) ∈ R^n × R^l. Then, for every α ∈ R, we have:

(a) X(α)s(α) = (1 − α)Xs + ασµe − αp + α²∆X∆s;
(b) µ(α) = [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n;
(c) (rp(α), rd(α)) = (1 − α)(rp, rd);
(d) rV(α) = (1 − α)rV + αq.
Proof: Using (41), we obtain

    X(α)s(α) = (X + α∆X)(s + α∆s)
        = Xs + α(X∆s + S∆x) + α²∆X∆s
        = Xs + α(−Xs + σµe − p) + α²∆X∆s
        = (1 − α)Xs + ασµe − αp + α²∆X∆s,

thereby showing that (a) holds. Left multiplying the above equality by eᵀ and dividing the resulting expression by n, we easily conclude that (b) holds. Statement (c) can be easily verified by means of (39) and (40), while statement (d) follows from (42).
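Identities (a) and (b) are purely algebraic, so they can be verified numerically once ∆s is chosen to satisfy the linearized complementarity equation X∆s + S∆x = −Xs + σµe − p (equation (41) with its inexactness term p). A small sketch with made-up data:

```python
# Numeric check of Lemma 3.5(a)-(b). All data below are hypothetical; ds is
# solved from the perturbed Newton equation X*ds + S*dx = -Xs + sigma*mu*e - p.
x = [1.0, 2.0, 0.5]
s = [1.5, 0.25, 2.0]
p = [0.03, -0.01, 0.02]
sigma = 0.5
n = len(x)
mu = sum(xi * si for xi, si in zip(x, s)) / n
dx = [0.1, -0.2, 0.05]                                   # an arbitrary Delta x
ds = [(-x[i] * s[i] + sigma * mu - p[i] - s[i] * dx[i]) / x[i] for i in range(n)]

for alpha in [0.0, 0.3, 0.7, 1.0]:
    # (a): componentwise identity for X(alpha)s(alpha)
    for i in range(n):
        lhs = (x[i] + alpha * dx[i]) * (s[i] + alpha * ds[i])
        rhs = ((1 - alpha) * x[i] * s[i] + alpha * sigma * mu
               - alpha * p[i] + alpha ** 2 * dx[i] * ds[i])
        assert abs(lhs - rhs) < 1e-12
    # (b): the averaged identity for mu(alpha)
    mu_alpha = sum((x[i] + alpha * dx[i]) * (s[i] + alpha * ds[i]) for i in range(n)) / n
    pe = sum(p) / n
    dxds = sum(dx[i] * ds[i] for i in range(n)) / n
    assert abs(mu_alpha - ((1 - (1 - sigma) * alpha) * mu - alpha * pe + alpha ** 2 * dxds)) < 1e-12
```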
Lemma 3.6 Assume that ∆w satisfies (39)–(42) for some σ ∈ R, w ∈ R^{2n+m+l} and (p, q) ∈ R^n × R^l such that pᵀe/n ≤ σµ/2. Then, for every scalar α satisfying

    0 ≤ α ≤ min{ 1, σµ/(2‖∆X∆s‖∞) },   (61)

we have

    µ(α)/µ ≥ 1 − α.   (62)

Proof: Using Lemma 3.5(b) and the assumption that pᵀe/n ≤ σµ/2, we conclude for every α satisfying (61) that

    µ(α) = [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n
        ≥ [1 − α(1 − σ)]µ − (1/2)ασµ + α²∆xᵀ∆s/n
        ≥ (1 − α)µ + (1/2)ασµ − α²‖∆X∆s‖∞
        ≥ (1 − α)µ.
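Bound (62) can be spot-checked numerically: with pᵀe/n ≤ σµ/2, the quadratic µ(α) from Lemma 3.5(b) stays above (1 − α)µ throughout the interval (61). A sketch with hypothetical numbers:

```python
# Spot-check of Lemma 3.6: mu(alpha) >= (1 - alpha)*mu on the interval (61).
# All quantities are hypothetical scalars standing in for the paper's data.
n, mu, sigma = 3, 2.0, 0.3
pe_over_n = 0.25                       # p'e/n, satisfies p'e/n <= sigma*mu/2 = 0.3
dxds_over_n = -0.5                     # (Delta x)'(Delta s)/n; any sign is allowed
inf_norm = 0.6                         # ||Delta_X Delta_s||_inf; dominates |dxds_over_n|

alpha_max = min(1.0, sigma * mu / (2 * inf_norm))   # right endpoint of (61)
for i in range(101):
    alpha = alpha_max * i / 100
    mu_alpha = (1 - alpha * (1 - sigma)) * mu - alpha * pe_over_n + alpha ** 2 * dxds_over_n
    assert mu_alpha >= (1 - alpha) * mu - 1e-12
```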
Lemma 3.7 Assume that ∆w satisfies (39)–(42) for some σ > 0, w ∈ N_{w0}(γ, θ) with γ ∈ [0, 1], θ ≥ 0, and (p, q) ∈ R^n × R^l such that

    ‖p‖∞ ≤ γσµ/4   (63)

and

    ‖q‖ ≤ [ √(1 + (1 − γ/2)σ) − 1 ] θ√µ.   (64)

Then, w(α) ∈ N_{w0}(γ, θ) for every scalar α satisfying

    0 ≤ α ≤ min{ 1, γσµ/(4‖∆X∆s‖∞) }.   (65)
Proof: Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). We will show that w(α) ∈ N_{w0}((1 − α)η, γ, θ) ⊆ N_{w0}(γ, θ) for every α satisfying (65).

First, we note that (rp(α), rd(α)) = (1 − α)η(rp0, rd0) by Lemma 3.5(c) and the definition of N_{w0}(η, γ, θ). Next, since the assumption that γ ∈ [0, 1] and equation (63) imply that pᵀe/n ≤ σµ/2, it follows from Lemma 3.6 that (62) holds for every α satisfying (61), and hence (65). Thus, for every α satisfying (65), we have

    (1 − α)η ≤ (µ(α)/µ) η ≤ (µ(α)/µ)(µ/µ0) = µ(α)/µ0.   (66)

Now, it is easy to see that for every u ∈ R^n and τ ∈ [0, n], there holds ‖u − τ(uᵀe/n)e‖∞ ≤ (1 + τ)‖u‖∞. Using this inequality twice, the fact that w ∈ N_{w0}(η, γ, θ), relation (63) and statements (a) and (b) of Lemma 3.5, we conclude for every α satisfying (65) that

    X(α)s(α) − (1 − γ)µ(α)e
        = (1 − α)[Xs − (1 − γ)µe] + αγσµe − α[ p − (1 − γ)(pᵀe/n)e ]
            + α²[ ∆X∆s − (1 − γ)(∆xᵀ∆s/n)e ]
        ≥ α( γσµ − ‖p − (1 − γ)(pᵀe/n)e‖∞ − α‖∆X∆s − (1 − γ)(∆xᵀ∆s/n)e‖∞ ) e
        ≥ α( γσµ − 2‖p‖∞ − 2α‖∆X∆s‖∞ ) e ≥ α( γσµ − γσµ/2 − γσµ/2 ) e = 0.

Next, by Lemma 3.5(d), we have that

    rV(α) = (1 − α)rV + αq = (1 − α)ηrV0 + â,

where â := (1 − α)(rV − ηrV0) + αq. To complete the proof, it suffices to show that ‖â‖ ≤ θ√(µ(α)) for every α satisfying (65). Using equation (64) and Lemma 3.5(b) along with the facts that ‖rV − ηrV0‖ ≤ θ√µ and α ∈ [0, 1], we have

    ‖â‖² − θ²µ(α) = (1 − α)²‖rV − ηrV0‖² + 2α(1 − α)(rV − ηrV0)ᵀq + α²‖q‖² − θ²µ(α)
        ≤ (1 − α)²θ²µ + 2α(1 − α)θ√µ‖q‖ + α²‖q‖²
            − θ²{ [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n }
        ≤ α²‖q‖² + 2αθ√µ‖q‖ − αθ²σµ + αθ²pᵀe/n − α²θ²∆xᵀ∆s/n
        ≤ α[ ‖q‖² + 2θ√µ‖q‖ − (1 − γ/4)θ²σµ + θ²α‖∆X∆s‖∞ ]
        ≤ α[ ‖q‖² + 2θ√µ‖q‖ − (1 − γ/2)θ²σµ ] ≤ 0,

where the last inequality follows from the quadratic formula and equation (64).
Next, we consider the minimum step length allowed under our algorithm:
Lemma 3.8 In every iteration of Algorithm CQP-IS, the step length ᾱ satisfies
½
¾
min{γσ, 1 − 54 σ}µ
ᾱ ≥ min 1,
4 k∆X∆sk∞
and
·
µ(ᾱ) ≤
µ
5
1− 1− σ
4
20
¶ ¸
ᾱ
µ.
2
(67)
(68)
Proof: By equations (43)–(44) and the bound on ‖f̃‖ in step (d) of Algorithm CQP-IS, we conclude that

    ‖p‖∞ ≤ ‖p‖ ≤ ‖XS‖^{1/2} ‖f̃‖ ≤ √(nµ) · γσ√µ/(4√n) = γσµ/4,   (69)
    ‖q‖ ≤ ‖f̃‖ ≤ [ √(1 + (1 − γ/2)σ) − 1 ] θ√µ,   (70)

so (63) and (64) are satisfied. Hence, by Lemma 3.7, the quantity α̃ computed in step (g) of Algorithm CQP-IS satisfies

    α̃ ≥ min{ 1, γσµ/(4‖∆X∆s‖∞) }.   (71)

Moreover, by (69), it follows that the coefficient of α in the expression for µ(α) in Lemma 3.5(b) satisfies

    −(1 − σ)µ − pᵀe/n ≤ −(1 − σ)µ + ‖p‖∞ ≤ −(1 − σ)µ + γσµ/4 ≤ −(1 − (5/4)σ)µ < 0,   (72)

since σ ∈ (0, 4/5). Hence, if ∆xᵀ∆s ≤ 0, it is easy to see that ᾱ = α̃, and hence that (67) holds in view of (71). Moreover, by Lemma 3.5(b) and (72), we have

    µ(ᾱ) ≤ [1 − ᾱ(1 − σ)]µ − ᾱpᵀe/n ≤ [1 − (1 − (5/4)σ)ᾱ]µ ≤ [1 − (1 − (5/4)σ)ᾱ/2]µ,

showing that (68) also holds. We now consider the case where ∆xᵀ∆s > 0. In this case, we have ᾱ = min{α_min, α̃}, where α_min is the unconstrained minimum of µ(α). It is easy to see that

    α_min = [nµ(1 − σ) + pᵀe] / (2∆xᵀ∆s) ≥ n[µ(1 − σ) − σµ/4] / (2∆xᵀ∆s)
        ≥ µ(1 − (5/4)σ) / (2‖∆X∆s‖∞).

The last two observations together with (71) imply that (67) holds in this case too. Moreover, since the function µ(α) is convex, it must lie below the function φ(α) over the interval [0, α_min], where φ(α) is the affine function interpolating µ(α) at α = 0 and α = α_min. Hence,

    µ(ᾱ) ≤ φ(ᾱ) = [1 − (1 − σ)ᾱ/2]µ − ᾱpᵀe/(2n) ≤ [1 − (1 − (5/4)σ)ᾱ/2]µ,   (73)

where the second inequality follows from (72). We have thus shown that ᾱ satisfies (68).
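The two cases of the proof translate directly into the step-length rule ᾱ = min{α_min, α̃} when ∆xᵀ∆s > 0. A small numeric sketch checking the decrease bound (68); all quantities are hypothetical, and α̃ is simply capped at 1 rather than computed from step (g):

```python
# Hypothetical data consistent with the assumptions of Lemma 3.8:
# sigma in (0, 4/5), ||p||_inf <= gamma*sigma*mu/4, and (Delta x)'(Delta s) > 0.
n, mu = 2, 1.0
sigma, gamma = 0.4, 0.5
p = [0.04, -0.02]                     # ||p||_inf = 0.04 <= gamma*sigma*mu/4 = 0.05
dxds = 0.8                            # the inner product (Delta x)'(Delta s) (> 0)

# mu(alpha) = mu + c*alpha + a*alpha^2, by Lemma 3.5(b)
c = -(1 - sigma) * mu - sum(p) / n
a = dxds / n
alpha_min = -c / (2 * a)              # unconstrained minimizer of the quadratic mu(alpha)
alpha_tilde = 1.0                     # hypothetical value of the cap from step (g)
alpha_bar = min(alpha_min, alpha_tilde)

mu_bar = mu + c * alpha_bar + a * alpha_bar ** 2
bound = (1 - (1 - 1.25 * sigma) * alpha_bar / 2) * mu    # right-hand side of (68)
assert 0 < mu_bar <= bound
```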
Our next task will be to show that the stepsize ᾱ remains bounded away from zero. In view of (67), it suffices to show that the quantity ‖∆X∆s‖∞ remains bounded. The next lemma addresses this issue.
Lemma 3.9 Let w0 ∈ R^{2n}_{++} × R^{m+l} be such that (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S, and let w ∈ N_{w0}(γ, θ) for some γ ≥ 0 and θ ≥ 0. Then, the search direction ∆w generated by Algorithm CQP-IS satisfies

    max(‖D^{−1}∆x‖, ‖D∆s‖) ≤ ( 1 − 2σ + σ²/(1 − γ) )^{1/2} √(nµ)
        + (1/√(1 − γ)) ( γσ/4 + 6n + θ²/2 ) √µ + θ√µ.   (74)
Proof: Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). Let ∆w̃ := ∆w + η(w0 − w∗). Using relations (39), (40), (42), and the fact that w ∈ N_{w0}(η, γ, θ), we easily see that

    A∆x̃ = 0,   (75)
    Aᵀ∆ỹ + ∆s̃ + V∆z̃ = 0,   (76)
    Vᵀ∆x̃ + ∆z̃ = q − rV + ηrV0.   (77)

Pre-multiplying (76) by ∆x̃ᵀ and using (75) and (77), we obtain

    ∆x̃ᵀ∆s̃ = −∆x̃ᵀV∆z̃ = [∆z̃ − (q − rV + ηrV0)]ᵀ∆z̃ = ‖∆z̃‖² − (q − rV + ηrV0)ᵀ∆z̃
        ≥ ‖∆z̃‖² − ‖q − rV + ηrV0‖ ‖∆z̃‖ ≥ −‖q − rV + ηrV0‖²/4.   (78)

Next, we multiply equation (41) by (XS)^{−1/2} to obtain D^{−1}∆x + D∆s = H(σ) − (XS)^{−1/2}p, where H(σ) := −(XS)^{1/2}e + σµ(XS)^{−1/2}e. Equivalently, we have that

    D^{−1}∆x̃ + D∆s̃ = H(σ) − (XS)^{−1/2}p + η[ D(s0 − s∗) + D^{−1}(x0 − x∗) ] =: g.

Taking the squared norm of both sides of the above equation and using (78), we obtain

    ‖D^{−1}∆x̃‖² + ‖D∆s̃‖² = ‖g‖² − 2∆x̃ᵀ∆s̃ ≤ ‖g‖² + ‖q − rV + ηrV0‖²/2
        ≤ ( ‖g‖ + (‖q‖ + ‖rV − ηrV0‖)/√2 )² ≤ ( ‖g‖ + θ√µ )²,

since ‖q‖ + ‖rV − ηrV0‖ ≤ [√2 − 1]θ√µ + θ√µ = √2 θ√µ by (45), (70), and the fact that √(1 + (1 − γ/2)σ) ≤ √2. Thus, we have

    max(‖D^{−1}∆x̃‖, ‖D∆s̃‖) ≤ ‖g‖ + θ√µ
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + η[ ‖D(s0 − s∗)‖ + ‖D^{−1}(x0 − x∗)‖ ] + θ√µ.

This, together with the triangle inequality, the definitions of D and ∆w̃, and the fact that w ∈ N_{w0}(η, γ, θ), implies that

    max(‖D^{−1}∆x‖, ‖D∆s‖)
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + 2η[ ‖D(s0 − s∗)‖ + ‖D^{−1}(x0 − x∗)‖ ] + θ√µ
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + 2η‖(XS)^{−1/2}‖[ ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ] + θ√µ
        ≤ ‖H(σ)‖ + (1/√((1 − γ)µ)) { ‖p‖ + 2η[ ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ] } + θ√µ.   (79)

It is well-known (see e.g. [7]) that

    ‖H(σ)‖ ≤ ( 1 − 2σ + σ²/(1 − γ) )^{1/2} √(nµ).   (80)

Moreover, using the fact that s∗ ≤ s0 and x∗ ≤ x0 along with Lemma 3.2, we obtain

    η( ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ) ≤ η(sᵀx0 + xᵀs0) ≤ (3n + θ²/4) µ.   (81)

Finally, it is clear from (69) that

    ‖p‖ ≤ γσµ/4.   (82)

The result now follows by incorporating inequalities (80)–(82) into (79).
We are now ready to prove Theorem 2.5.

Proof of Theorem 2.5: Let ∆wk denote the search direction, and let rk = r(wk) and µk = µ(wk), at the k-th iteration of Algorithm CQP-IS. Clearly, wk ∈ N_{w0}(γ, θ). Hence, using Lemma 3.9, assumption (48) and the inequality

    ‖∆Xk∆sk‖∞ ≤ ‖∆Xk∆sk‖ ≤ ‖(Dk)^{−1}∆xk‖ ‖Dk∆sk‖,

we easily see that ‖∆Xk∆sk‖∞ = O(n²)µk. Using this conclusion together with assumption (48) and Lemma 3.8, we see that, for some universal constant β > 0, we have

    µ_{k+1} ≤ (1 − β/n²) µk,  ∀k ≥ 0.

Using this observation and some standard arguments (see, for example, Theorem 3.2 of [19]), we easily see that Algorithm CQP-IS generates an iterate wk ∈ N_{w0}(γ, θ) satisfying µk/µ0 ≤ ε² within O(n² log(1/ε)) iterations. The theorem now follows from this conclusion and the definition of N_{w0}(γ, θ).
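The final counting argument is standard: a per-iteration contraction factor of (1 − β/n²) forces µk/µ0 below ε² within O(n² log(1/ε)) iterations, since (1 − t)^k ≤ e^{−tk}. A quick numeric illustration with hypothetical constants:

```python
# Iteration count implied by the contraction mu_{k+1} <= (1 - beta/n^2) * mu_k.
import math

beta, n, eps = 0.1, 10, 1e-3          # hypothetical constants
t = beta / n ** 2                      # per-iteration contraction rate
k_bound = math.ceil((2 * n ** 2 / beta) * math.log(1 / eps))   # O(n^2 log(1/eps))

# Simulate the decrease of mu_k / mu_0 until it reaches eps^2.
ratio, iters = 1.0, 0
while ratio > eps ** 2:
    ratio *= 1 - t
    iters += 1
# (1-t)^k <= exp(-t*k) <= eps^2 once k >= (2/t)*ln(1/eps), so iters <= k_bound
assert iters <= k_bound
```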
4 Concluding Remarks
We have shown that the primal-dual long-step path-following LP algorithm based on an iterative solver presented in [12] can be extended to the CQP case. This was not immediately obvious, since the standard normal equation for CQP does not fit into the mold required for the results in [13] to hold. By creating the ANE, we were able to use the results about the maximum weight basis preconditioner developed in [13] in the context of CQP.
Another difficulty we encountered was the proper choice of the starting iterate u0 for the iterative solver. By choosing u0 = 0 as in the LP case, we obtain ‖v − W u0‖ = ‖v‖, which can only be shown to be O(max{µ, √µ}). In this case, for every µ > 1, Proposition 2.3 would guarantee that the number of inner iterations of the iterative solver is

    O( f(κ) max{ log(c(κ) n ϕ_Ã), log µ } ),

a bound which depends on the logarithm of the current duality gap. On the other hand, Theorem 2.6 shows that choosing u0 as in (50) results in a bound on the number of inner iterations of the iterative solver that does not depend on the current duality gap.
The usual way of defining the dual residual is as the quantity

    Rd := Aᵀy + s − V Vᵀx − c,

which, in view of (11) and (12), can be written in terms of the residuals rd and rV as

    Rd = rd − V rV.   (83)

Note that along the iterates generated by Algorithm CQP-IS, we have rd = O(µ) and rV = O(√µ), implying that Rd = O(√µ). Hence, while the usual primal residual converges to 0 according to O(µ), the usual dual residual does so according to O(√µ). This is an unusual feature of the algorithm presented in this paper, and one which contrasts it with other infeasible interior-point path-following algorithms, which require that the primal and dual residuals go to zero at an O(µ)-rate. It should be noted, however, that this feature is possible due to the specific form of the O(√µ)-term present in (83), i.e., one that lies in the range space of V.
CQP problems where V is explicitly available arise frequently in the literature. One
important example arises in portfolio optimization (see [4]), where the rank of V is often
small. In such problems, l represents the number of observation periods used to estimate
the data for the problem. We believe that Algorithm CQP-IS could be of particular use for
this type of application.
References
[1] V. Baryamureeba and T. Steihaug. On the convergence of an inexact primal-dual interior
point method for linear programming. Technical Report 188, Department of Informatics,
University of Bergen, 2000.
24
[2] V. Baryamureeba, T. Steihaug, and Y. Zhang. Properties of a class of preconditioners
for weighted least squares problems. Technical Report 16, Department of Computational
and Applied Mathematics, Rice University, 1999.
[3] L. Bergamaschi, J. Gondzio, and G. Zilli. Preconditioning indefinite systems in interior
point methods for optimization. Technical Report MS-02-002, Department of Mathematical Methods for Applied Sciences, University of Padua, Italy, 2002.
[4] T.J. Carpenter and R.J. Vanderbei. Symmetric indefinite systems for interior-point
methods. Mathematical Programming, 58:1–32, 1993.
[5] R.W. Freund, F. Jarre, and S. Mizuno. Convergence of a class of inexact interior-point
algorithms for linear programs. Mathematics of Operations Research, 24(1):50–71, 1999.
[6] M. Kojima, N. Megiddo, and S. Mizuno. A primal-dual infeasible-interior-point algorithm for linear programming. Mathematical Programming, 61(3):263–280, 1993.
[7] M. Kojima, S. Mizuno, and A. Yoshise. A primal-dual interior point algorithm for linear
programming. in Progress in Mathematical Programming: Interior-Point and Related
Methods, ch. 2, pages 29–47, 1989.
[8] J. Korzak. Convergence analysis of inexact infeasible-interior-point algorithms for solving linear programming problems. SIAM Journal on Optimization, 11(1):133–148, 2000.
[9] V.V. Kovacevic-Vujcic and M.D. Asic. Stabilization of interior-point methods for linear
programming. Computational Optimization and Applications, 14:331–346, 1999.
[10] D.G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, 1984.
[11] S. Mizuno and F. Jarre. Global and polynomial-time convergence of an infeasible-interior-point algorithm using inexact computation. Mathematical Programming, 84:357–373, 1999.
[12] R.D.C. Monteiro and J.W. O’Neal. Convergence analysis of a long-step primal-dual infeasible interior-point LP algorithm based on iterative linear solvers. Technical report, Georgia Institute of Technology, 2003. Submitted to Mathematical Programming.
[13] R.D.C. Monteiro, J.W. O’Neal, and T. Tsuchiya. Uniform boundedness of a preconditioned normal matrix used in interior point methods. Technical report, Georgia Institute
of Technology, 2003. Submitted to SIAM Journal on Optimization.
[14] A.R.L. Oliveira and D.C. Sorensen. Computational experience with a preconditioner
for interior point methods for linear programming. Technical Report 28, Department of
Computational and Applied Mathematics, Rice University, 1997.
[15] L.F. Portugal, M.G.C. Resende, G. Veiga, and J.J. Judice. A truncated primal-infeasible
dual-feasible network interior point method. Networks, 35:91–108, 2000.
25
[16] M.G.C. Resende and G. Veiga. An implementation of the dual affine scaling algorithm
for minimum cost flow on bipartite uncapacitated networks. SIAM Journal on Optimization, 3:516–537, 1993.
[17] M.J. Todd, L. Tunçel, and Y. Ye. Probabilistic analysis of two complexity measures for
linear programming problems. Mathematical Programming A, 90:59–69, 2001.
[18] S.A. Vavasis and Y. Ye. A primal-dual interior point method whose running time
depends only on the constraint matrix. Mathematical Programming A, 74:79–120, 1996.
[19] S.J. Wright. Primal-Dual Interior-Point Methods. SIAM, 1997.
[20] Y. Zhang. On the convergence of a class of infeasible interior-point methods for the
horizontal linear complementarity problem. SIAM Journal on Optimization, 4(1):208–
227, 1994.
[21] G. Zhou and K.-C. Toh. Polynomiality of an inexact infeasible interior point algorithm for semidefinite programming. Mathematical Programming, to appear.