An Iterative Solver-Based Infeasible Primal-Dual Path-Following Algorithm for Convex QP

Zhaosong Lu∗, Renato D.C. Monteiro†, Jerome W. O'Neal‡

May 3, 2004
Abstract
In this paper we develop an interior-point primal-dual long-step path-following algorithm for convex quadratic programming (CQP) whose search directions are computed
by means of an iterative (linear system) solver. We propose a new linear system, which
we refer to as the augmented normal equation (ANE), to determine the primal-dual
search directions. Since the condition number of the matrix associated with the ANE
may become large for degenerate CQP problems, we use a maximum weight basis
preconditioner introduced in [16, 14] to better condition this matrix. Using a result
obtained in [13], we establish a uniform bound, depending only on CQP data, on the
number of iterations required for the iterative solver to obtain a sufficiently accurate
solution to the ANE. Since the iterative solver can only generate an approximate solution to the ANE, this solution does not yield a primal-dual search direction satisfying
all equations of the primal-dual Newton system. We propose a way to compute an
inexact primal-dual search direction so that the Newton equation corresponding to the
primal residual is satisfied exactly, while the one corresponding to the dual residual
contains a manageable error which allows us to establish a polynomial bound on the
number of iterations of our method.
Keywords: Convex quadratic programming, iterative solver, inexact search directions,
polynomial convergence.
AMS 2000 subject classification: 65F10, 65F35, 90C20, 90C25, 90C51.
∗ School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: zhaosong@isye.gatech.edu). This author was supported in part by NSF Grant CCR-0203113 and ONR Grant N00014-03-1-0401.
† School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: monteiro@isye.gatech.edu). This author was supported in part by NSF Grants CCR-0203113 and INT-9910084 and ONR Grant N00014-03-1-0401.
‡ School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0205 (email: joneal@isye.gatech.edu). This author was supported in part by the NDSEG Fellowship Program sponsored by the Department of Defense.
1 Introduction
In this paper we develop an interior-point primal-dual long-step path-following algorithm for
convex quadratic programming (CQP) whose search directions are computed by means of
an iterative (linear system) solver. More specifically, the problem we consider is:
min_x { (1/2) xᵀQx + cᵀx : Ax = b, x ≥ 0 },    (1)
where the data are Q ∈ R^{n×n}, A ∈ R^{m×n}, b ∈ R^m, and c ∈ R^n, and the decision vector is x ∈ R^n. We also assume that Q is positive semidefinite, and that a factorization Q = VVᵀ is explicitly given, where V ∈ R^{n×l}.
A similar algorithm for solving the special case of linear programming (LP), i.e. problem
(1) with Q = 0, was developed by Monteiro and O’Neal in [12]. The algorithm studied in
[12] is essentially the long-step path-following infeasible algorithm studied in [6, 20], the only
difference being that the search directions are computed by means of an iterative solver. We
refer to the iterations of the iterative solver as the inner iterations and to the ones performed
by the interior-point method itself as the outer iterations. The main step of the algorithm
studied in [6, 12, 20] is the computation of the primal-dual search direction (∆x, ∆s, ∆y),
whose ∆y component can be found by solving a system of the form AD²Aᵀ∆y = g, referred to as the normal equation, where g ∈ R^m and the positive diagonal matrix D depends on
the current primal-dual iterate. In contrast to [6, 20], the algorithm studied in [12] uses
an iterative solver to obtain an approximate solution to the normal equation. Since the
condition number of the normal matrix AD²Aᵀ may become excessively large on degenerate
LP problems (see e.g. [9]), the maximum weight basis preconditioner T introduced in [16, 14]
is used to better condition this matrix and an approximate solution of the resulting equivalent
system with coefficient matrix TAD²AᵀTᵀ is then computed. By using a result obtained in [13], which establishes that the condition number of TAD²AᵀTᵀ is uniformly bounded by a quantity depending on A only, Monteiro and O'Neal [12] show that the number of inner iterations of the algorithm in [12] can be uniformly bounded by a constant depending on n
and A.
In the case of CQP, the standard normal equation takes the form
A(Q + X⁻¹S)⁻¹Aᵀ∆y = g,    (2)
for some vector g. When Q is not diagonal, the matrix (Q + X⁻¹S)⁻¹ is not diagonal, and
hence the coefficient matrix of (2) does not have the form required for the result of [13] to
hold. To remedy this difficulty, we develop in this paper a new linear system, referred to
as the augmented normal equation (ANE), to determine a portion of the primal-dual search
direction. This equation has the form ÃD̃²Ãᵀu = w, where w ∈ R^{m+l}, D̃ is an (n+l)×(n+l) positive diagonal matrix and Ã is a 2×2 block matrix of dimension (m+l)×(n+l) whose blocks consist of A, Vᵀ, the zero matrix and the identity matrix (see equation (21)). As
was done in [12], a maximum weight basis preconditioner T̃ for the ANE is computed and
an approximate solution of the resulting preconditioned equation with coefficient matrix
T̃ÃD̃²ÃᵀT̃ᵀ is generated using an iterative solver. Using the result of [13], which claims that the condition number of T̃ÃD̃²ÃᵀT̃ᵀ is uniformly bounded regardless of D̃, we obtain
a uniform bound (depending only on Ã) on the number of inner iterations performed by the
iterative solver to find a desirable approximate solution for the ANE (see Theorem 2.6).
Since the iterative solver can only generate an approximate solution to the ANE, it is
clear that not all equations of the Newton system, which determines the primal-dual search
direction, can be satisfied simultaneously. In the context of LP, Monteiro and O’Neal [12]
proposed a recipe to compute an inexact primal-dual search direction so that the Newton
equations corresponding to the primal and dual residuals were both satisfied. In the context
of CQP, such an approach is no longer possible. Instead, we propose a way to compute
an inexact primal-dual search direction so that the Newton equation corresponding to the
primal residual is satisfied exactly, while the one corresponding to the dual residual contains
a manageable error which allows us to establish a polynomial bound on the number of outer
iterations of our method. Interestingly, the presence of this error on the dual residual Newton
equation implies that the primal and dual residuals go to zero at different rates. This is a
unique feature of our algorithm and stands in contrast with other interior-point primal-dual path-following algorithms, where the primal and dual residuals go to zero at the same
rate.
The use of inexact search directions in interior-point methods has been extensively studied
in the context of LP (see e.g. [1, 5, 8, 11, 21]), but little research has been done regarding
the use of inexact search directions in interior-point methods for CQP. Moreover, the use of
iterative solvers to compute the primal-dual Newton search directions of interior-point path-following algorithms in the context of LP has been investigated in [2, 5, 8, 14, 15, 16]; again,
little research has been done regarding the use of iterative solvers in the context of CQP. In
the context of CQP, Bergamaschi et al. [3] have proposed a preconditioner for the augmented
system of equations and have developed an interior-point algorithm which uses an iterative
solver on the preconditioned augmented system in order to obtain the search directions.
However, it should be noted that this paper does not touch upon the issue of complexity
bounds either from the inner or outer iteration points of view. To our knowledge, no one
has used the ANE system to obtain the primal-dual search direction in the context of CQP,
nor has anyone presented an iterative solver-based CQP algorithm with strong complexity
bounds on both the inner and outer iterations.
Our paper is organized as follows. In Subsection 1.1, we give the terminology and notation
which will be used throughout our paper. Section 2 describes the algorithm which will be
the main focus of our study in this paper and presents both the inner and outer iteration-complexity results obtained for it. In Section 3, we prove the two convergence results stated in Section 2. Finally, we present some concluding remarks in Section 4.
1.1 Terminology and Notation
Throughout this paper, upper-case Roman letters denote matrices, lower-case Roman letters
denote vectors, and lower-case Greek letters denote scalars. We let R^n, R^n_+ and R^n_{++} denote the sets of n-vectors having real, nonnegative real, and positive real components, respectively. Also, we let R^{m×n} denote the set of m × n matrices with real entries. For a vector v ∈ R^n, we let |v| denote the vector whose i-th component is |v_i|, for every i = 1, . . . , n, and we let Diag(v) denote the diagonal matrix whose i-th diagonal element is v_i, for every i = 1, . . . , n.
Certain matrices bear special notation, namely the matrices X, ∆X, S, D, and D̃. These are the diagonal matrices corresponding to the vectors x, ∆x, s, d, and d̃, respectively, as described in the previous paragraph. The symbol 0 will be used to denote a
scalar, vector, or matrix of all zeroes; its dimensions should be clear from the context. Also,
we denote by e the vector of all 1’s, and by I the identity matrix; their dimensions should
be clear from the context.
For a symmetric positive definite matrix W , we denote by κ(W ) its condition number,
i.e. its maximum eigenvalue divided by its minimum eigenvalue. We will denote sets by
upper-case script Roman letters (e.g. B, N). For a finite set B, we denote its cardinality by |B|. Given a matrix A ∈ R^{m×n} and an ordered set B ⊆ {1, . . . , n}, we let A_B denote the submatrix whose columns are {A_i : i ∈ B} arranged in the same order as B. Similarly, given a vector v ∈ R^n and an ordered set B ⊆ {1, . . . , n}, we let v_B denote the subvector consisting of the elements {v_i : i ∈ B} arranged in the same order as B.
We will use several different norms throughout the paper. For a vector z ∈ R^n, ‖z‖ = √(zᵀz) is the Euclidean norm, ‖z‖₁ = Σᵢ|zᵢ| is the 1-norm, and ‖z‖∞ = max_{i=1,...,n} |zᵢ| is the infinity norm. For a matrix V ∈ R^{m×n}, ‖V‖ denotes the operator norm associated with the Euclidean norm: ‖V‖ = max_{z:‖z‖=1} ‖Vz‖. Finally, ‖V‖_F denotes the Frobenius norm: ‖V‖_F = (Σᵢ Σⱼ Vᵢⱼ²)^{1/2}.
2 Main Results
This section, which presents the main results of our paper, is divided into four subsections.
In Subsection 2.1, we will review a well-known polynomially convergent primal-dual long-step path-following algorithm for CQP whose search directions are computed by means of
an exact solver. Also, we will introduce the “augmented normal equation” (ANE) approach
for computing the primal-dual search directions for the above algorithm. In the next three
subsections, we discuss a variant of the above algorithm where the corresponding search
directions are (inexactly) computed by means of an iterative solver. In Subsection 2.2, we
will discuss the use of the ANE in connection with the preconditioning results developed in
Monteiro, O’Neal and Tsuchiya [13] and derive a bound on the number of iterations required
for the iterative solver to obtain a sufficiently accurate solution to the ANE. In Subsection
2.3, we will discuss a suitable way to define the primal-dual search direction using an inexact
solution to the ANE. Finally, in Subsection 2.4 we will present our main algorithm and the
corresponding (inner and outer) iteration-complexity results.
2.1 Preliminaries and Motivation
In this subsection, we discuss a primal-dual long-step path-following CQP algorithm, Algorithm CQP-exact, which will serve as the basis for Algorithm CQP-IS given in Section 2.4.
We will also give the complexity results which have been obtained for Algorithm CQP-exact.
In particular, the complexity results obtained in this section follow as an immediate corollary
of Theorem 2.5.
Consider the following primal-dual pair of CQP problems:

min_x { (1/2)‖Vᵀx‖² + cᵀx : Ax = b, x ≥ 0 },    (3)

max_{(x̂,s,y)} { −(1/2)‖Vᵀx̂‖² + bᵀy : Aᵀy + s − VVᵀx̂ = c, s ≥ 0 },    (4)

where the data are V ∈ R^{n×l}, A ∈ R^{m×n}, b ∈ R^m and c ∈ R^n, and the decision variables are x ∈ R^n and (x̂, s, y) ∈ R^n × R^n × R^m.
It is well-known that if x∗ is an optimal solution for (3) and (x̂∗ , s∗ , y ∗ ) is an optimal
solution for (4), then (x∗ , s∗ , y ∗ ) is also an optimal solution for (4). Now, let S denote the
set of all vectors w := (x, s, y, z) ∈ R^{2n+m+l} satisfying

Ax = b,  x ≥ 0,    (5)
Aᵀy + s + Vz = c,  s ≥ 0,    (6)
Xs = 0,    (7)
Vᵀx + z = 0.    (8)

It is clear that w ∈ S if and only if x is optimal for (3), (x, s, y) is optimal for (4), and z = −Vᵀx. (Throughout this paper, the symbol w will always denote the quadruple (x, s, y, z), where the vectors have the appropriate dimensions; similarly, ∆w = (∆x, ∆s, ∆y, ∆z), w^k = (x^k, s^k, y^k, z^k), w̄ = (x̄, s̄, ȳ, z̄), etc.)
We observe that the presentation of the primal-dual path-following algorithm based on
an exact solver in this subsection differs from the classical way of presenting it in that we
introduce an additional variable z as above. Clearly, it is easy to see that the variable z is
completely redundant and can be eliminated, thereby reducing the method described below
to the usual way of presenting it. However, we observe that the introduction of the variable
z plays a crucial role in the version of the algorithm below based on an iterative solver
(see Algorithm CQP-IS in Subsection 2.4), where the search directions are computed only
approximately.
We will make the following two assumptions throughout the paper:
Assumption 1 A has full row rank.
Assumption 2 The set S is nonempty.
For a point w ∈ R^{2n}_{++} × R^{m+l}, let us define

µ = µ(w) := xᵀs/n,    (9)
r_p = r_p(w) := Ax − b,    (10)
r_d = r_d(w) := Aᵀy + s + Vz − c,    (11)
r_V = r_V(w) := Vᵀx + z,    (12)
r = r(w) := (r_p(w), r_d(w), r_V(w)).    (13)
Moreover, given γ ∈ (0, 1) and an initial point w⁰ ∈ R^{2n}_{++} × R^{m+l}, we define the following neighborhood of the central path:

N_{w⁰}(γ) := { w ∈ R^{2n}_{++} × R^{m+l} : Xs ≥ (1 − γ)µe, ‖r‖/‖r⁰‖ ≤ µ/µ⁰ },    (14)

where r := r(w), r⁰ := r(w⁰), µ := µ(w), and µ⁰ := µ(w⁰). (Here, we use the convention that ν/0 is equal to 0 if ν = 0 and ∞ if ν > 0.)
The infeasible primal-dual algorithm which will serve as the basis for our iterative CQP
method is as follows:
Algorithm CQP-exact
1. Start: Let ε > 0, γ ∈ (0, 1), w⁰ ∈ N_{w⁰}(γ) and 0 < σ̲ < σ̄ < 1 be given. Set k = 0.
2. While µ_k := µ(w^k) > ε do
   (a) Let w := w^k and µ := µ_k; choose σ := σ_k ∈ [σ̲, σ̄].
   (b) Let ∆w = (∆x, ∆s, ∆y, ∆z) denote the solution of the linear system
       A∆x = −r_p,    (15)
       Aᵀ∆y + ∆s + V∆z = −r_d,    (16)
       X∆s + S∆x = −Xs + σµe,    (17)
       Vᵀ∆x + ∆z = −r_V.    (18)
   (c) Let α̃ = argmax{α ∈ [0, 1] : w + α′∆w ∈ N_{w⁰}(γ), ∀α′ ∈ [0, α]}.
   (d) Let ᾱ = argmin{(x + α∆x)ᵀ(s + α∆s) : α ∈ [0, α̃]}.
   (e) Let w^{k+1} = w + ᾱ∆w, and set k ← k + 1.
End (while)
(Note: we titled this algorithm “CQP-exact” to distinguish it from our main algorithm in
Section 2.4, where system (15)-(18) is solved inexactly using an iterative solver.)
The following result, which establishes an iteration-complexity bound for Algorithm
CQP-exact, is well-known.
Theorem 2.1 Assume that the constants γ, σ̲, and σ̄ are such that

max{γ⁻¹, (1 − γ)⁻¹, σ̲⁻¹, (1 − σ̄)⁻¹} = O(1),

and that the initial point w⁰ ∈ R^{2n}_{++} × R^{m+l} satisfies (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, Algorithm CQP-exact finds an iterate w^k ∈ R^{2n}_{++} × R^{m+l} satisfying µ_k ≤ εµ⁰ and ‖r^k‖ ≤ ε‖r⁰‖ within O(n² log(1/ε)) iterations.
A few approaches have been suggested in the literature for computing the Newton search
direction (15)-(18). Instead of pursuing one of them, we will discuss below a new approach,
referred to in this paper as the augmented normal equation (ANE) approach, that we believe
to be suitable not only for direct solvers but especially for iterative solvers as we will see in
the next subsection.
Let us begin by defining the following matrices:
D := X^{1/2} S^{−1/2},    (19)

D̃ := [ D  0 ; 0  I ] ∈ R^{(n+l)×(n+l)},    (20)

Ã := [ A  0 ; Vᵀ  I ] ∈ R^{(m+l)×(n+l)},    (21)

where [ ·  · ; ·  · ] denotes a 2 × 2 block matrix. Suppose that we first solve the following system of equations for (∆y, ∆z):

ÃD̃²Ãᵀ (∆y; ∆z) = Ã (x − σµS⁻¹e − D²r_d; 0) + (−r_p; −r_V) =: h,    (22)

where (u₁; u₂) denotes the vertical concatenation of u₁ and u₂.
This system is what we refer to as the ANE. Next, we obtain ∆s and ∆x according to:
∆s = −r_d − Aᵀ∆y − V∆z,    (23)
∆x = −D²∆s − x + σµS⁻¹e.    (24)
Clearly, the search direction ∆w = (∆x, ∆s, ∆y, ∆z) computed as above satisfies (16) and
(17) in view of (23) and (24). Moreover, it also satisfies (15) and (18) due to the fact that
by (20), (21), (22), (23) and (24), we have that
Ã (∆x; ∆z) = Ã (−D²∆s − x + σµS⁻¹e; ∆z)
= Ã (D²r_d + D²Aᵀ∆y + D²V∆z − x + σµS⁻¹e; ∆z)
= Ã (D²Aᵀ∆y + D²V∆z; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0)
= ÃD̃²Ãᵀ (∆y; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) = (−r_p; −r_V),    (25)
or equivalently, (15) and (18) hold.
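The recipe (19)–(24) is easy to exercise numerically. The sketch below is our own illustration with random, hypothetical data; a dense direct solve of the ANE stands in for the iterative solver discussed in the next subsection. It assembles Ã, D̃² and the right-hand side h, recovers (∆x, ∆s, ∆y, ∆z), and checks that all four Newton equations (15)–(18) hold:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l = 6, 3, 2

# Hypothetical problem data: A with full row rank, Q = V V^T.
A = rng.standard_normal((m, n))
V = rng.standard_normal((n, l))
b, c = rng.standard_normal(m), rng.standard_normal(n)

# A strictly positive (generally infeasible) iterate w = (x, s, y, z).
x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)
y, z = rng.standard_normal(m), rng.standard_normal(l)
sigma, mu = 0.5, x @ s / n

# Residuals (10)-(12) and the matrices (19)-(21).
rp = A @ x - b
rd = A.T @ y + s + V @ z - c
rV = V.T @ x + z
D2 = np.diag(x / s)                                  # D^2 = X S^{-1}
Dt2 = np.block([[D2, np.zeros((n, l))],
                [np.zeros((l, n)), np.eye(l)]])      # tilde-D squared
At = np.block([[A, np.zeros((m, l))],
               [V.T, np.eye(l)]])                    # tilde-A

# ANE (22): solve for (dy, dz); here with a direct dense solver.
h = At @ np.concatenate([x - sigma * mu / s - D2 @ rd, np.zeros(l)]) \
    + np.concatenate([-rp, -rV])
u = np.linalg.solve(At @ Dt2 @ At.T, h)
dy, dz = u[:m], u[m:]

# Recover ds and dx via (23)-(24).
ds = -rd - A.T @ dy - V @ dz
dx = -D2 @ ds - x + sigma * mu / s

# The direction satisfies the full Newton system (15)-(18).
assert np.allclose(A @ dx, -rp)
assert np.allclose(A.T @ dy + ds + V @ dz, -rd)
assert np.allclose(x * ds + s * dx, -x * s + sigma * mu)
assert np.allclose(V.T @ dx + dz, -rV)
```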
Theorem 2.1 assumes that we can compute ∆w exactly. This requires (22) to be solved exactly, which is normally done by computing the Cholesky factorization of its coefficient matrix. In this paper, we will consider a variant of Algorithm CQP-exact which computes an approximate solution to (22) by means of an iterative solver. We then modify equations (17) and (18) to obtain the other components of the search direction in such a manner as to ensure that the number of outer iterations is still polynomially bounded.
2.2 Iterative approaches for solving the ANE
In this subsection we discuss an approach for solving the ANE based on iterative solvers. Since the condition number of the ANE matrix ÃD̃²Ãᵀ may "blow up" for points w near an optimal solution, applying a generic iterative solver directly to the ANE (22) without first preconditioning it is generally ineffective. We discuss a natural remedy to this problem, which consists of using a preconditioner T̃, namely the maximum weight basis preconditioner, such that κ(T̃ÃD̃²ÃᵀT̃ᵀ) remains uniformly bounded regardless of the iterate w. Finally, we analyze the complexity of the resulting approach to obtain a suitable approximate solution to the ANE.
We start by describing the maximum weight basis preconditioner. Its construction essentially consists of building a basis B of à which gives higher priority to the columns of
à corresponding to larger diagonal elements of D̃. More specifically, the maximum weight
basis preconditioner is determined by the following algorithm:
Maximum Weight Basis Algorithm
Start: Given d̃ ∈ R^{n+l}_{++} and Ã ∈ R^{(m+l)×(n+l)} such that rank(Ã) = m + l,
1. Order the elements of d̃ so that d̃₁ ≥ . . . ≥ d̃_{n+l}; order the columns of Ã accordingly.
2. Let B = ∅, j = 1.
3. While |B| < m + l do
   (a) If Ã_j is linearly independent of {Ã_i : i ∈ B}, set B ← B ∪ {j}.
   (b) j ← j + 1.
4. Return to the original ordering of Ã and d̃; determine the set B according to this ordering and set N := {1, . . . , n + l}\B.
5. Set B := Ã_B and D̃_B := Diag(d̃_B).
6. Let T̃ = T̃(Ã, d̃) := D̃_B⁻¹ B⁻¹.
end
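The algorithm above admits a compact sketch in code. The version below is our own illustration: the linear-independence test of step 3(a) is done with `numpy.linalg.matrix_rank`, which is adequate for small dense examples (a real implementation would use a factorization-based test):

```python
import numpy as np

def max_weight_basis(A, d):
    """Sketch of the Maximum Weight Basis Algorithm for a full-row-rank A.

    Scans the columns of A in order of decreasing weight d, keeping each
    column that is linearly independent of those already kept (step 3),
    then returns T = D_B^{-1} B^{-1} (step 6) and the basic index list.
    """
    m, _ = A.shape
    basis, cols = [], np.empty((m, 0))
    for j in np.argsort(-d):                  # heaviest columns first
        cand = np.column_stack([cols, A[:, j]])
        if np.linalg.matrix_rank(cand) > cols.shape[1]:
            basis.append(j)
            cols = cand
            if len(basis) == m:
                break
    basis.sort()                              # step 4: original ordering
    B = A[:, basis]
    T = np.diag(1.0 / d[basis]) @ np.linalg.inv(B)
    return T, basis

# Tiny example: by construction the preconditioner maps the basic columns
# of A D to unit vectors, i.e. T A_B D_B = I (a fact used in Subsection 2.3).
A = np.array([[1., 0., 1.], [0., 1., 1.]])
d = np.array([3., 1., 2.])
T, basis = max_weight_basis(A, d)
assert basis == [0, 2]
assert np.allclose(T @ A[:, basis] @ np.diag(d[basis]), np.eye(2))
```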
Note that the above algorithm can be applied to the matrix à defined in (21) since this
matrix has full row rank due to Assumption 1. The maximum weight basis preconditioner
was originally proposed by Resende and Veiga in [16] in the context of the minimum cost
network flow problem. In this case, Ã = A is the node-arc incidence matrix of a connected
digraph (with one row deleted to ensure that Ã has full row rank), the entries of d̃ are
weights on the edges of the graph, and the set B generated by the above algorithm defines a
maximum spanning tree on the digraph. Oliveira and Sorensen [14] later proposed the use
of this preconditioner for general matrices Ã.
For the purpose of stating the next result, we now introduce some notation. Let us define
ϕ_Ã := max{‖B⁻¹Ã‖_F : B is a basis of Ã}.    (26)

It is easy to show that ϕ_Ã ≤ (n + l)^{1/2} χ̄_Ã, where χ̄_Ã is a well-known condition number (see [18]) defined as

χ̄_Ã := sup{‖Ãᵀ(ÃEÃᵀ)⁻¹ÃE‖ : E ∈ Diag(R^{n+l}_{++})}.

Indeed, this follows from the fact that ‖C‖_F ≤ (n + l)^{1/2}‖C‖ for any matrix C ∈ R^{(m+l)×(n+l)} and that an equivalent characterization of χ̄_Ã is

χ̄_Ã = max{‖B⁻¹Ã‖ : B is a basis of Ã},
as shown in [17] and [18].
Lemmas 2.1 and 2.2 of Monteiro, O’Neal and Tsuchiya [13] imply the following result.
Proposition 2.2 Let T̃ = T̃(Ã, d̃) be the preconditioner determined according to the Maximum Weight Basis Algorithm, and define W := T̃ÃD̃²ÃᵀT̃ᵀ. Then, ‖T̃ÃD̃‖ ≤ ϕ_Ã and κ(W) ≤ ϕ_Ã².

Note that the bound ϕ_Ã² on κ(W) is independent of the diagonal matrix D̃ and depends
only on Ã. This will allow us to obtain a uniform bound on the number of iterations needed
by an iterative solver to obtain a suitable approximate solution to (22). This topic is the
subject of the remainder of this subsection.
Instead of dealing directly with (22), we consider the application of an iterative solver to
the preconditioned ANE:
W u = v,    (27)

where

W := T̃ÃD̃²ÃᵀT̃ᵀ,  v := T̃h.    (28)
For the purpose of our analysis below, the only thing we will assume regarding the iterative
solver when applied to (27) is that it generates a sequence of iterates {uj } such that
‖v − W u^j‖ ≤ c(κ) [1 − 1/f(κ)]^j ‖v − W u⁰‖,  ∀ j = 0, 1, 2, . . . ,    (29)
where c and f are positive functions of κ ≡ κ(W ).
Examples of solvers which satisfy (29) include the steepest descent (SD) and conjugate
gradient (CG) methods, with the following values for c(κ) and f (κ):
Solver | c(κ) | f(κ)
SD     | √κ   | (κ + 1)/2
CG     | 2√κ  | (√κ + 1)/2

Table 2.2
The justification for the table above follows from Section 7.6 and Exercise 10 of Section 8.8
of [10].
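For concreteness, a minimal conjugate gradient loop of the kind assumed by (29) can be written as follows; this is our own generic implementation, not code from the paper, and it simply monitors the residual ‖v − Wu‖ directly:

```python
import numpy as np

def cg(W, v, u0, tol, max_iter=1000):
    """Minimal conjugate gradient for W u = v with W symmetric positive
    definite; stops once the residual norm ||v - W u|| drops below tol."""
    u = u0.copy()
    r = v - W @ u          # residual
    p = r.copy()           # search direction
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        Wp = W @ p
        alpha = (r @ r) / (p @ Wp)
        u += alpha * p
        r_new = r - alpha * Wp
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return u

# Example on a random, well-conditioned SPD matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 8))
W = M @ M.T + 8 * np.eye(8)
v = rng.standard_normal(8)
u = cg(W, v, np.zeros(8), tol=1e-10)
assert np.linalg.norm(v - W @ u) <= 1e-10
```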
The following result gives an upper bound on the number of iterations that the generic iterative linear solver needs to perform to obtain an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ for some constant ξ > 0:

Proposition 2.3 Let u⁰ be an arbitrary starting point. Then, a generic iterative solver with a convergence rate given by (29) generates an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ in

O( f(κ) log( c(κ)‖v − W u⁰‖ / (ξ√µ) ) )    (30)

iterations, where κ := κ(W).
Proof: The proof is similar to that of Theorem 2.4 of [12].
We will refer to an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ as a ξ-approximate solution of (27). From Proposition 2.3, it is clear that different choices of the initial point u⁰ lead to different bounds on the number of iterations performed by the iterative solver. In Subsection 2.4, we will describe a suitable way of selecting u⁰ and ξ so that the resulting number of iterations performed by the iterative solver can be uniformly bounded by a universal constant depending on the quantities f(κ), c(κ), n and ϕ_Ã (see the bound (51)).
2.3 Computation of the Inexact Search Direction
In this subsection, we will discuss the impact an inexact solution to (27) has on equations
(15)–(18). More specifically, we will discuss one way to define the overall search direction
so that its residual with respect to system (15)–(18) is appropriate for the development of a
polynomially convergent algorithm based on that search direction.
Suppose that we solve (27) inexactly according to Section 2.2. Then our final solution u^j satisfies W u^j − v = f̃ for some vector f̃ such that ‖f̃‖ ≤ ξ√µ. Letting

(∆y; ∆z) = T̃ᵀ u^j,    (31)
we see from (28) that
ÃD̃²Ãᵀ (∆y; ∆z) = h + T̃⁻¹f̃.    (32)
The component ∆s of the search direction is still obtained as before, i.e. as in (23). However,
the component ∆x is computed as
∆x = −D²∆s − x + σµS⁻¹e − S⁻¹p,    (33)
where p ∈ R^n will be specified below. To motivate the choice of p, we first note that an argument similar to (25) implies that
Ã (∆x; ∆z) + (r_p; r_V)
= Ã (−D²∆s − x + σµS⁻¹e − S⁻¹p; ∆z) + (r_p; r_V)
= Ã (D²r_d + D²Aᵀ∆y + D²V∆z − x + σµS⁻¹e − S⁻¹p; ∆z) + (r_p; r_V)
= ÃD̃² (Aᵀ∆y + V∆z; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) − Ã (S⁻¹p; 0) + (r_p; r_V)
= ÃD̃²Ãᵀ (∆y; ∆z) + Ã (D²r_d − x + σµS⁻¹e; 0) − Ã (S⁻¹p; 0) + (r_p; r_V)
= T̃⁻¹f̃ − Ã (S⁻¹p; 0).    (34)
Based on the above equation, one is naturally tempted to choose p so that the right-hand side of (34) is zero, and consequently (15) and (18) are satisfied exactly. However, the existence of such a p cannot be guaranteed and, even if it exists, its magnitude may be so large
as to yield a search direction which is not suitable for the development of a polynomially
convergent algorithm.
Instead, we consider an alternative approach where p is chosen so that the first component
of (34) is zero and the second component is small enough. More specifically, we choose
(p, q) ∈ Rn × Rl such that
µ −1 ¶
µ
¶
S p
(XS)−1/2 p
˜
˜
0 = f − T̃ Ã
= f − T̃ ÃD̃
.
(35)
q
q
Then, using (34) and (35), we conclude that
Ã (∆x; ∆z) + (r_p; r_V) = T̃⁻¹f̃ − Ã (S⁻¹p; 0),    (36)
from which we see that the first component of (34) is set to 0 and the second component is
exactly q. We now discuss a convenient way to choose (p, q) satisfying (35). (Note that the
coefficient matrix of system (35) has full row rank, implying that (35) has multiple solutions.) First, let B = (B₁, . . . , B_{m+l}) be the ordered set of basic indices computed by the Maximum Weight Basis Algorithm and note that, by step 6 of this algorithm, the B_i-th column of T̃ÃD̃ is the i-th unit vector for every i = 1, . . . , m + l. Then, the vector u ∈ R^{n+l} defined as u_{B_i} = f̃_i for i = 1, . . . , m + l and u_j = 0 for every j ∉ {B₁, . . . , B_{m+l}} clearly satisfies

f̃ = T̃ÃD̃u.    (37)

We then obtain a pair (p, q) ∈ R^n × R^l satisfying (35) by defining

(p; q) := [ (XS)^{1/2}  0 ; 0  I ] u.    (38)
We summarize the results of this subsection in the following proposition:
Proposition 2.4 The search direction ∆w derived in this subsection satisfies

A∆x = −r_p,    (39)
Aᵀ∆y + ∆s + V∆z = −r_d,    (40)
X∆s + S∆x = −Xs + σµe − p,    (41)
Vᵀ∆x + ∆z = −r_V + q,    (42)

where the vectors p and q satisfy:

‖p‖ ≤ ‖XS‖^{1/2}‖f̃‖,    (43)
‖q‖ ≤ ‖f̃‖.    (44)

Proof: Equations (39) and (42) immediately follow from (21) and (36). Equations (40) and (41) follow from (23) and (33), respectively. The bounds on ‖p‖ and ‖q‖ follow from (38) and the fact that ‖u‖ = ‖f̃‖.
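The recipe (37)–(38) can be checked numerically. The sketch below uses random, hypothetical data, rebuilds the maximum weight basis inline, and verifies both (35) and the bounds (43)–(44); it relies on the fact noted above that the B_i-th column of T̃ÃD̃ is the i-th unit vector:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, l = 5, 2, 2

A = rng.standard_normal((m, n))
V = rng.standard_normal((n, l))
x, s = rng.uniform(0.5, 2.0, n), rng.uniform(0.5, 2.0, n)

# tilde-A and tilde-d (the diagonal of tilde-D) as in (20)-(21).
At = np.block([[A, np.zeros((m, l))], [V.T, np.eye(l)]])
dt = np.concatenate([np.sqrt(x / s), np.ones(l)])

# Maximum weight basis preconditioner (greedy rank test, step 3).
basis, cols = [], np.empty((m + l, 0))
for j in np.argsort(-dt):
    cand = np.column_stack([cols, At[:, j]])
    if np.linalg.matrix_rank(cand) > cols.shape[1]:
        basis.append(j)
        cols = cand
        if len(basis) == m + l:
            break
basis.sort()
Tt = np.diag(1.0 / dt[basis]) @ np.linalg.inv(At[:, basis])

# Given an ANE residual f~, build u by (37) and (p, q) by (38).
f = rng.standard_normal(m + l)     # stands in for W u^j - v
u = np.zeros(n + l)
u[basis] = f                       # u_{B_i} = f_i, zero elsewhere
p = np.sqrt(x * s) * u[:n]         # (38): p = (XS)^{1/2} u_{1:n}
q = u[n:]

# Check (35): f~ = T~ A~ (S^{-1}p; q), and the bounds (43)-(44).
assert np.allclose(f, Tt @ At @ np.concatenate([p / s, q]))
assert np.linalg.norm(q) <= np.linalg.norm(f) + 1e-12
assert np.linalg.norm(p) <= np.sqrt(np.max(x * s)) * np.linalg.norm(f) + 1e-12
```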
2.4 The Algorithm
In this subsection, we describe the primal-dual long-step path-following algorithm based
on an iterative solver that will be the main focus of the analysis of this paper. We also
state the main convergence results for this algorithm, whose proofs are given in the next
section. The first result essentially shows that the number of outer iterations is bounded in
the same way as for its exact counterpart discussed in Subsection 2.1. The second result
shows that the number of inner iterations performed by the iterative solver to compute an
approximate solution of (27) is uniformly bounded by a universal constant depending only on the quantities f(κ), c(κ), n and ϕ_Ã.
First, given a starting point w⁰ ∈ R^{2n}_{++} × R^{m+l} and the parameters η ≥ 0, γ ∈ [0, 1], and θ > 0, we will define the following set:

N_{w⁰}(η, γ, θ) = { w ∈ R^{2n}_{++} × R^{m+l} : Xs ≥ (1 − γ)µe, (r_p, r_d) = η(r_p⁰, r_d⁰), ‖r_V − ηr_V⁰‖ ≤ θ√µ, η ≤ µ/µ⁰ },    (45)

where µ = µ(w), µ⁰ = µ(w⁰), r = r(w) and r⁰ = r(w⁰). Next, we define the neighborhood

N_{w⁰}(γ, θ) = ⋃_{η∈[0,1]} N_{w⁰}(η, γ, θ).    (46)
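A membership test for the set (45) is straightforward to sketch; the function below is our own illustration with hypothetical data packed as `data = (A, V, b, c)`:

```python
import numpy as np

def in_neighborhood(w, w0, data, eta, gamma, theta):
    """Test membership of w = (x, s, y, z) in N_{w0}(eta, gamma, theta),
    the set defined in (45)."""
    A, V, b, c = data

    def parts(wt):
        x, s, y, z = wt
        mu = x @ s / len(x)
        rp = A @ x - b
        rd = A.T @ y + s + V @ z - c
        rV = V.T @ x + z
        return x, s, mu, rp, rd, rV

    x, s, mu, rp, rd, rV = parts(w)
    x0, s0, mu0, rp0, rd0, rV0 = parts(w0)
    return (np.all(x > 0) and np.all(s > 0)
            and np.all(x * s >= (1 - gamma) * mu)
            and np.allclose(rp, eta * rp0) and np.allclose(rd, eta * rd0)
            and np.linalg.norm(rV - eta * rV0) <= theta * np.sqrt(mu)
            and eta <= mu / mu0)

# Sanity check: w0 itself lies in N_{w0}(1, gamma, theta) whenever the
# centrality condition holds at w0; x0 = s0 = e gives X0 s0 = mu0 e.
rng = np.random.default_rng(3)
n, m, l = 4, 2, 1
data = (rng.standard_normal((m, n)), rng.standard_normal((n, l)),
        rng.standard_normal(m), rng.standard_normal(n))
w0 = (np.ones(n), np.ones(n), np.zeros(m), np.zeros(l))
assert in_neighborhood(w0, w0, data, eta=1.0, gamma=0.1, theta=1.0)
```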
We now present the algorithm which will be the focus of the analysis of this paper.
Algorithm CQP-IS:
1. Start: Let ε > 0, γ ∈ (0, 1), θ > 0, w⁰ ∈ N_{w⁰}(γ, θ), and 0 < σ̲ < σ̄ < 4/5 be given. Determine w̄ which satisfies equations (49). Set k = 0.
2. While µ_k := µ(w^k) > ε do
   (a) Let w := w^k and µ := µ_k; choose σ ∈ [σ̲, σ̄].
   (b) Define D, Ã and D̃ according to (19), (21) and (20). Set r_p = Ax − b, r_d = Aᵀy + s + Vz − c, r_V = Vᵀx + z, and η = ‖r_p‖/‖r_p⁰‖.
   (c) Build the preconditioner T̃ = T̃(Ã, d̃) using the Maximum Weight Basis Algorithm, and define W, v, and u⁰ according to (28) and (50).
   (d) Using u⁰ as the starting point for an iterative solver, find an approximate solution u^j of W u = v such that f̃ = W u^j − v satisfies ‖f̃‖ ≤ ξ√µ, where
       ξ := min{ γσ̲/(4√n), [√(1 + (1 − γ/2)σ̲) − 1] θ }.    (47)
   (e) Solve for ∆y and ∆z using equation (31). Let (p, q) be computed according to (38). Compute ∆s and ∆x by (23) and (33).
   (f) Compute α̃ := argmax{α ∈ [0, 1] : w + α′∆w ∈ N_{w⁰}(γ, θ), ∀α′ ∈ [0, α]}.
   (g) Compute ᾱ := argmin{(x + α∆x)ᵀ(s + α∆s) : α ∈ [0, α̃]}.
   (h) Let w^{k+1} = w + ᾱ∆w, and set k ← k + 1.
End (while)
Next, we present the main convergence results we have obtained for Algorithm CQP-IS.
The first result gives a bound on the number of “outer” iterations needed by the algorithm
to obtain an ε-solution to the KKT conditions (5)–(8).
Theorem 2.5 Assume that the constants γ, σ̲, σ̄ and θ are such that

max{ γ⁻¹, (1 − γ)⁻¹, σ̲⁻¹, (1 − (5/4)σ̄)⁻¹ } = O(1),  θ = O(√n),    (48)

and that the initial point w⁰ ∈ R^{2n}_{++} × R^{m+l} satisfies (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, Algorithm CQP-IS generates an iterate w^k ∈ R^{2n}_{++} × R^{m+l} satisfying µ_k ≤ ε²µ⁰, ‖(r_p^k, r_d^k)‖ ≤ ε²‖(r_p⁰, r_d⁰)‖, and ‖r_V^k‖ ≤ ε²‖r_V⁰‖ + εθ√µ⁰ within O(n² log(1/ε)) iterations.
We will prove Theorem 2.5 in Section 3.2.
We next describe a suitable way of selecting u⁰ so that the resulting number of iterations performed by the iterative solver can be uniformly bounded by a universal constant depending only on the quantities f(κ), c(κ), n and ϕ_Ã. First, compute a point w̄ = (x̄, s̄, ȳ, z̄) such that

Ã (x̄; z̄) = (b; 0),  Aᵀȳ + s̄ + Vz̄ = c.    (49)

Note that vectors x̄ and z̄ satisfying the first equation in (49) can be easily computed once a basis of Ã is available (e.g., the one computed by the Maximum Weight Basis Algorithm in the first outer iteration of Algorithm CQP-IS). Once ȳ is arbitrarily chosen, a vector s̄ satisfying the second equation of (49) is immediately available. We then define

u⁰ = −η T̃⁻ᵀ (y⁰ − ȳ; z⁰ − z̄).    (50)
For the specific starting point u⁰ defined in (50), the following result gives an upper bound on the number of iterations that the generic iterative linear solver needs to perform to obtain an iterate u^j satisfying ‖v − W u^j‖ ≤ ξ√µ, where ξ is given in (47). We will prove this result in Subsection 3.1.
Theorem 2.6 Assume that ξ is defined in (47), where σ̲, γ, θ are such that

max{σ̲⁻¹, γ⁻¹, (1 − γ)⁻¹, θ, θ⁻¹}

is bounded by a polynomial in n. Assume also that w⁰ and w̄ are such that (x⁰, s⁰) > |(x̄, s̄)| and (x⁰, s⁰) ≥ (x*, s*) for some w* ∈ S. Then, a generic iterative solver with a convergence rate given by (29) generates an approximate solution u^j as in step (d) of Algorithm CQP-IS in

O( f(κ) log( c(κ) n ϕ_Ã ) )    (51)

iterations, where κ := κ(W). As a consequence, the SD and CG methods generate this approximate solution u^j in O(ϕ_Ã² log(nϕ_Ã)) and O(ϕ_Ã log(nϕ_Ã)) iterations, respectively.
3 Technical Results
In this section, we will prove Theorems 2.5 and 2.6. Section 3.1 is devoted to the proof of
Theorem 2.6, and Section 3.2 will give the proof of Theorem 2.5.
3.1 "Inner" Iteration Results – Proof of Theorem 2.6
In this subsection, we will provide the proof of Theorem 2.6.
We begin by establishing three technical lemmas.
Lemma 3.1 Suppose that w⁰ ∈ R^{2n}_{++} × R^{m+l}, w ∈ N_{w⁰}(η, γ, θ) for some η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, and w* ∈ S. Then

(x − ηx⁰ − (1 − η)x*)ᵀ(s − ηs⁰ − (1 − η)s*) ≥ −(θ²/4) µ.    (52)
Proof: Let us define w̃ := w − ηw0 − (1 − η)w∗. Using the definitions of N_{w0}(η, γ, θ), r, and S, we have that

    A x̃ = 0,
    Aᵀ ỹ + s̃ + V z̃ = 0,
    Vᵀ x̃ + z̃ = rV − ηrV0.

Multiplying the second relation by x̃ᵀ on the left, and using the first and third relations along with the fact that w ∈ N_{w0}(η, γ, θ), we see that

    x̃ᵀs̃ = −x̃ᵀV z̃ = [z̃ − (rV − ηrV0)]ᵀ z̃ = ‖z̃‖² − z̃ᵀ(rV − ηrV0)
        ≥ ‖z̃‖² − ‖z̃‖ ‖rV − ηrV0‖ = ( ‖z̃‖ − ‖rV − ηrV0‖/2 )² − ‖rV − ηrV0‖²/4
        ≥ −‖rV − ηrV0‖²/4 ≥ −(θ²/4) µ.
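The chain of inequalities above rests on the elementary completing-the-square bound t² − tc ≥ −c²/4, applied with t = ‖z̃‖ and c = ‖rV − ηrV0‖. A quick numeric sanity check of this bound, with hypothetical values:

```python
# Sanity check of the completing-the-square bound used in Lemma 3.1:
# for all t >= 0 and c >= 0,  t**2 - t*c = (t - c/2)**2 - c**2/4 >= -c**2/4.
for t in [0.0, 0.1, 0.5, 1.0, 2.5, 10.0]:     # plays the role of ||z~||
    for c in [0.0, 0.3, 1.0, 4.0]:            # plays the role of ||rV - eta*rV0||
        assert t * t - t * c >= -c * c / 4.0 - 1e-12
```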
Lemma 3.2 Suppose that w0 ∈ R^{2n}_{++} × R^{m+l} is such that (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S. Then, for any w ∈ N_{w0}(η, γ, θ) with η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, we have

    η(xᵀs0 + sᵀx0) ≤ (3n + θ²/4) µ.   (53)
Proof: Using the fact that w ∈ N_{w0}(η, γ, θ) and (52), we obtain

    xᵀs − η(xᵀs0 + sᵀx0) + η² x0ᵀs0 − (1 − η)(xᵀs∗ + sᵀx∗)
        + η(1 − η)(x∗ᵀs0 + s∗ᵀx0) + (1 − η)² x∗ᵀs∗ ≥ −(θ²/4) µ.

Rearranging the terms in this inequality, and using the facts that η ≤ xᵀs/(x0ᵀs0), x∗ᵀs∗ = 0, (x, s) ≥ 0, (x∗, s∗) ≥ 0, (x0, s0) > 0, η ∈ [0, 1], x∗ ≤ x0, and s∗ ≤ s0, we conclude that

    η(xᵀs0 + sᵀx0) ≤ η² x0ᵀs0 + xᵀs + η(1 − η)(x∗ᵀs0 + s∗ᵀx0) + (θ²/4) µ
        ≤ η² x0ᵀs0 + xᵀs + 2η(1 − η) x0ᵀs0 + (θ²/4) µ
        ≤ 2η x0ᵀs0 + xᵀs + (θ²/4) µ
        ≤ 3xᵀs + (θ²/4) µ = (3n + θ²/4) µ.
Lemma 3.3 Suppose that w0 ∈ R^{2n}_{++} × R^{m+l}, w ∈ N_{w0}(η, γ, θ) for some η ∈ [0, 1], γ ∈ [0, 1] and θ > 0, and w̄ satisfies (49). Let W, v and u0 be given by (28) and (50), respectively. Then,

    W u0 − v = T̃ Ã ( −x + σµS^{−1}e + η(x0 − x̄) + ηD²(s0 − s̄) ; rV − ηrV0 ).   (54)
Proof: Using the fact that w ∈ N_{w0}(η, γ, θ) along with (21), (45) and (49), we easily obtain that

    ( rp ; rV ) = ( ηrp0 ; ηrV0 + (rV − ηrV0) ) = η Ã ( x0 − x̄ ; z0 − z̄ ) + Ã ( 0 ; rV − ηrV0 ),   (55)
    s0 − s̄ = −Aᵀ(y0 − ȳ) − V(z0 − z̄) + rd0.   (56)

Using relations (20), (21), (28), (45), (50), (55) and (56), we obtain

    W u0 − v = T̃ Ã D̃² Ãᵀ T̃ᵀ u0 − T̃ Ã ( x − σµS^{−1}e − D²rd ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã D̃² Ãᵀ ( y0 − ȳ ; z0 − z̄ ) − T̃ Ã ( x − σµS^{−1}e − ηD²rd0 ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã ( D²Aᵀ(y0 − ȳ) + D²V(z0 − z̄) − D²rd0 ; z0 − z̄ )
            − T̃ Ã ( x − σµS^{−1}e ; 0 ) + T̃ ( rp ; rV )
        = −η T̃ Ã ( −D²(s0 − s̄) ; z0 − z̄ ) − T̃ Ã ( x − σµS^{−1}e ; 0 )
            + η T̃ Ã ( x0 − x̄ ; z0 − z̄ ) + T̃ Ã ( 0 ; rV − ηrV0 ),

which yields equation (54), as desired.
The next result shows that the initial residual of the iterative solver satisfies ‖v − W u0‖ ≤ Ψ√µ, where Ψ > 0 is a universal constant independent of the current iterate of Algorithm CQP-IS.
Lemma 3.4 Assume that T̃ = T̃(Ã, d̃) is given and that w0 and w̄ are such that (x0, s0) > |(x̄, s̄)| and (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S. Further, assume that w ∈ N_{w0}(γ, θ) for some γ ∈ [0, 1] and θ > 0, and that W, v and u0 are given by (28) and (50), respectively. Then, the initial residual in (29) satisfies ‖v − W u0‖ ≤ Ψ√µ, where

    Ψ := [ (7n + θ²/2)/√(1 − γ) + θ ] ϕ_Ã.   (57)
Proof: Since w ∈ N_{w0}(γ, θ), we have that x_i s_i ≥ (1 − γ)µ for all i, which implies

    ‖(XS)^{−1/2}‖ ≤ 1/√((1 − γ)µ).   (58)

Note that ‖Xs − σµe‖, when viewed as a function of σ ∈ [0, 1], is convex. Hence, it is maximized at one of its endpoints, which, together with the facts that ‖Xs − µe‖ < ‖Xs‖ and σ ∈ [σ̲, σ̄] ⊂ [0, 1], immediately implies that

    ‖Xs − σµe‖ ≤ ‖Xs‖ ≤ ‖Xs‖₁ = xᵀs = nµ.   (59)

Using the fact that (x0, s0) > |(x̄, s̄)| together with Lemma 3.2, we obtain

    η‖S(x0 − x̄) + X(s0 − s̄)‖ ≤ η{ ‖S(x0 − x̄)‖ + ‖X(s0 − s̄)‖ } ≤ 2η{ ‖Sx0‖ + ‖Xs0‖ }
        ≤ 2η(xᵀs0 + sᵀx0) ≤ (6n + θ²/2) µ.   (60)

Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). It is clear that the requirements of Lemma 3.3 are met, so equation (54) holds. By (19), (20) and (54), we see that

    ‖v − W u0‖ = ‖ T̃ Ã D̃ ( (XS)^{−1/2}{ Xs − σµe − η[S(x0 − x̄) + X(s0 − s̄)] } ; rV − ηrV0 ) ‖
        ≤ ‖T̃ Ã D̃‖ { ‖(XS)^{−1/2}‖ [ ‖Xs − σµe‖ + η‖X(s0 − s̄) + S(x0 − x̄)‖ ] + ‖rV − ηrV0‖ }
        ≤ ϕ_Ã { (1/√((1 − γ)µ)) [ nµ + (6n + θ²/2)µ ] + θ√µ } = Ψ√µ,

where the last inequality follows from Proposition 2.2, relations (58), (59), (60), and the assumption that w ∈ N_{w0}(γ, θ).
The proof of the first part of Theorem 2.6 immediately follows from Proposition 2.3 and
Lemma 3.4. The proof of the second part of Theorem 2.6 follows immediately from Table
2.2 and Proposition 2.2.
3.2 “Outer” Iteration Results – Proof of Theorem 2.5
In this subsection, we will present the proof of Theorem 2.5. Specifically, we will show that Algorithm CQP-IS obtains an ε-approximate solution to (5)–(8) in O(n² log(1/ε)) outer iterations.
Throughout this section, we use the following notation:

    w(α) := w + α∆w,   µ(α) := µ(w(α)),   r(α) := r(w(α)).
Lemma 3.5 Assume that ∆w satisfies (39)–(42) for some σ ∈ R, w ∈ R^{2n+m+l} and (p, q) ∈ R^n × R^l. Then, for every α ∈ R, we have:

(a) X(α)s(α) = (1 − α)Xs + ασµe − αp + α²∆X∆s;
(b) µ(α) = [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n;
(c) (rp(α), rd(α)) = (1 − α)(rp, rd);
(d) rV(α) = (1 − α)rV + αq.
Proof: Using (41), we obtain

    X(α)s(α) = (X + α∆X)(s + α∆s)
        = Xs + α(X∆s + S∆x) + α²∆X∆s
        = Xs + α(−Xs + σµe − p) + α²∆X∆s
        = (1 − α)Xs + ασµe − αp + α²∆X∆s,

thereby showing that (a) holds. Left multiplying the above equality by eᵀ and dividing the resulting expression by n, we easily conclude that (b) holds. Statement (c) can be easily verified by means of (39) and (40), while statement (d) follows from (42).
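Identities (a) and (b) are purely algebraic, so they can be verified numerically once ∆s is chosen to satisfy the linearized complementarity equation X∆s + S∆x = −Xs + σµe − p (equation (41) with its inexactness term p). A small sketch with made-up data:

```python
# Numeric check of Lemma 3.5(a)-(b). All data below are hypothetical; ds is
# solved from the perturbed Newton equation X*ds + S*dx = -Xs + sigma*mu*e - p.
x = [1.0, 2.0, 0.5]
s = [1.5, 0.25, 2.0]
p = [0.03, -0.01, 0.02]
sigma = 0.5
n = len(x)
mu = sum(xi * si for xi, si in zip(x, s)) / n
dx = [0.1, -0.2, 0.05]                                   # an arbitrary Delta x
ds = [(-x[i] * s[i] + sigma * mu - p[i] - s[i] * dx[i]) / x[i] for i in range(n)]

for alpha in [0.0, 0.3, 0.7, 1.0]:
    # (a): componentwise identity for X(alpha)s(alpha)
    for i in range(n):
        lhs = (x[i] + alpha * dx[i]) * (s[i] + alpha * ds[i])
        rhs = ((1 - alpha) * x[i] * s[i] + alpha * sigma * mu
               - alpha * p[i] + alpha ** 2 * dx[i] * ds[i])
        assert abs(lhs - rhs) < 1e-12
    # (b): the averaged identity for mu(alpha)
    mu_alpha = sum((x[i] + alpha * dx[i]) * (s[i] + alpha * ds[i]) for i in range(n)) / n
    pe = sum(p) / n
    dxds = sum(dx[i] * ds[i] for i in range(n)) / n
    assert abs(mu_alpha - ((1 - (1 - sigma) * alpha) * mu - alpha * pe + alpha ** 2 * dxds)) < 1e-12
```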
Lemma 3.6 Assume that ∆w satisfies (39)–(42) for some σ ∈ R, w ∈ R^{2n+m+l} and (p, q) ∈ R^n × R^l such that pᵀe/n ≤ σµ/2. Then, for every scalar α satisfying

    0 ≤ α ≤ min{ 1, σµ/(2‖∆X∆s‖∞) },   (61)

we have

    µ(α)/µ ≥ 1 − α.   (62)

Proof: Using Lemma 3.5(b) and the assumption that pᵀe/n ≤ σµ/2, we conclude for every α satisfying (61) that

    µ(α) = [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n
        ≥ [1 − α(1 − σ)]µ − (1/2)ασµ + α²∆xᵀ∆s/n
        ≥ (1 − α)µ + (1/2)ασµ − α²‖∆X∆s‖∞
        ≥ (1 − α)µ.
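Bound (62) can be spot-checked numerically: with pᵀe/n ≤ σµ/2, the quadratic µ(α) from Lemma 3.5(b) stays above (1 − α)µ throughout the interval (61). A sketch with hypothetical numbers:

```python
# Spot-check of Lemma 3.6: mu(alpha) >= (1 - alpha)*mu on the interval (61).
# All quantities are hypothetical scalars standing in for the paper's data.
n, mu, sigma = 3, 2.0, 0.3
pe_over_n = 0.25                       # p'e/n, satisfies p'e/n <= sigma*mu/2 = 0.3
dxds_over_n = -0.5                     # (Delta x)'(Delta s)/n; any sign is allowed
inf_norm = 0.6                         # ||Delta_X Delta_s||_inf; dominates |dxds_over_n|

alpha_max = min(1.0, sigma * mu / (2 * inf_norm))   # right endpoint of (61)
for i in range(101):
    alpha = alpha_max * i / 100
    mu_alpha = (1 - alpha * (1 - sigma)) * mu - alpha * pe_over_n + alpha ** 2 * dxds_over_n
    assert mu_alpha >= (1 - alpha) * mu - 1e-12
```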
Lemma 3.7 Assume that ∆w satisfies (39)–(42) for some σ > 0, w ∈ N_{w0}(γ, θ) with γ ∈ [0, 1], θ ≥ 0, and (p, q) ∈ R^n × R^l such that

    ‖p‖∞ ≤ γσµ/4   (63)

and

    ‖q‖ ≤ [ √(1 + (1 − γ/2)σ) − 1 ] θ√µ.   (64)

Then, w(α) ∈ N_{w0}(γ, θ) for every scalar α satisfying

    0 ≤ α ≤ min{ 1, γσµ/(4‖∆X∆s‖∞) }.   (65)
Proof: Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). We will show that w(α) ∈ N_{w0}((1 − α)η, γ, θ) ⊆ N_{w0}(γ, θ) for every α satisfying (65).

First, we note that (rp(α), rd(α)) = (1 − α)η(rp0, rd0) by Lemma 3.5(c) and the definition of N_{w0}(η, γ, θ). Next, since the assumption that γ ∈ [0, 1] and equation (63) imply that pᵀe/n ≤ σµ/2, it follows from Lemma 3.6 that (62) holds for every α satisfying (61), and hence (65). Thus, for every α satisfying (65), we have

    (1 − α)η ≤ (µ(α)/µ) η ≤ (µ(α)/µ)(µ/µ0) = µ(α)/µ0.   (66)

Now, it is easy to see that for every u ∈ R^n and τ ∈ [0, n], there holds ‖u − τ(uᵀe/n)e‖∞ ≤ (1 + τ)‖u‖∞. Using this inequality twice, the fact that w ∈ N_{w0}(η, γ, θ), relation (63) and statements (a) and (b) of Lemma 3.5, we conclude for every α satisfying (65) that

    X(α)s(α) − (1 − γ)µ(α)e
        = (1 − α)[Xs − (1 − γ)µe] + αγσµe − α[ p − (1 − γ)(pᵀe/n)e ]
            + α²[ ∆X∆s − (1 − γ)(∆xᵀ∆s/n)e ]
        ≥ α( γσµ − ‖p − (1 − γ)(pᵀe/n)e‖∞ − α‖∆X∆s − (1 − γ)(∆xᵀ∆s/n)e‖∞ ) e
        ≥ α( γσµ − 2‖p‖∞ − 2α‖∆X∆s‖∞ ) e ≥ α( γσµ − γσµ/2 − γσµ/2 ) e = 0.

Next, by Lemma 3.5(d), we have that

    rV(α) = (1 − α)rV + αq = (1 − α)ηrV0 + â,

where â := (1 − α)(rV − ηrV0) + αq. To complete the proof, it suffices to show that ‖â‖ ≤ θ√(µ(α)) for every α satisfying (65). Using equation (64) and Lemma 3.5(b) along with the facts that ‖rV − ηrV0‖ ≤ θ√µ and α ∈ [0, 1], we have

    ‖â‖² − θ²µ(α) = (1 − α)²‖rV − ηrV0‖² + 2α(1 − α)(rV − ηrV0)ᵀq + α²‖q‖² − θ²µ(α)
        ≤ (1 − α)²θ²µ + 2α(1 − α)θ√µ‖q‖ + α²‖q‖²
            − θ²{ [1 − α(1 − σ)]µ − αpᵀe/n + α²∆xᵀ∆s/n }
        ≤ α²‖q‖² + 2αθ√µ‖q‖ − αθ²σµ + αθ²pᵀe/n − α²θ²∆xᵀ∆s/n
        ≤ α[ ‖q‖² + 2θ√µ‖q‖ − (1 − γ/4)θ²σµ + θ²α‖∆X∆s‖∞ ]
        ≤ α[ ‖q‖² + 2θ√µ‖q‖ − (1 − γ/2)θ²σµ ] ≤ 0,

where the last inequality follows from the quadratic formula and equation (64).
Next, we consider the minimum step length allowed under our algorithm:
Lemma 3.8 In every iteration of Algorithm CQP-IS, the step length ᾱ satisfies
½
¾
min{γσ, 1 − 54 σ}µ
ᾱ ≥ min 1,
4 k∆X∆sk∞
and
·
µ(ᾱ) ≤
µ
5
1− 1− σ
4
20
¶ ¸
ᾱ
µ.
2
(67)
(68)
Proof: By equations (43)–(44) and the bound on ‖f̃‖ in step (d) of Algorithm CQP-IS, we conclude that

    ‖p‖∞ ≤ ‖p‖ ≤ ‖XS‖^{1/2} ‖f̃‖ ≤ √(nµ) · γσ√µ/(4√n) = γσµ/4,   (69)
    ‖q‖ ≤ ‖f̃‖ ≤ [ √(1 + (1 − γ/2)σ) − 1 ] θ√µ,   (70)

so (63) and (64) are satisfied. Hence, by Lemma 3.7, the quantity α̃ computed in step (g) of Algorithm CQP-IS satisfies

    α̃ ≥ min{ 1, γσµ/(4‖∆X∆s‖∞) }.   (71)

Moreover, by (69), it follows that the coefficient of α in the expression for µ(α) in Lemma 3.5(b) satisfies

    −(1 − σ)µ − pᵀe/n ≤ −(1 − σ)µ + ‖p‖∞ ≤ −(1 − σ)µ + γσµ/4 ≤ −(1 − (5/4)σ)µ < 0,   (72)

since σ ∈ (0, 4/5). Hence, if ∆xᵀ∆s ≤ 0, it is easy to see that ᾱ = α̃, and hence that (67) holds in view of (71). Moreover, by Lemma 3.5(b) and (72), we have

    µ(ᾱ) ≤ [1 − ᾱ(1 − σ)]µ − ᾱpᵀe/n ≤ [1 − (1 − (5/4)σ)ᾱ]µ ≤ [1 − (1 − (5/4)σ)ᾱ/2]µ,

showing that (68) also holds. We now consider the case where ∆xᵀ∆s > 0. In this case, we have ᾱ = min{α_min, α̃}, where α_min is the unconstrained minimum of µ(α). It is easy to see that

    α_min = [nµ(1 − σ) + pᵀe] / (2∆xᵀ∆s) ≥ n[µ(1 − σ) − σµ/4] / (2∆xᵀ∆s)
        ≥ µ(1 − (5/4)σ) / (2‖∆X∆s‖∞).

The last two observations together with (71) imply that (67) holds in this case too. Moreover, since the function µ(α) is convex, it must lie below the function φ(α) over the interval [0, α_min], where φ(α) is the affine function interpolating µ(α) at α = 0 and α = α_min. Hence,

    µ(ᾱ) ≤ φ(ᾱ) = [1 − (1 − σ)ᾱ/2]µ − ᾱpᵀe/(2n) ≤ [1 − (1 − (5/4)σ)ᾱ/2]µ,   (73)

where the second inequality follows from (72). We have thus shown that ᾱ satisfies (68).
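The two cases of the proof translate directly into the step-length rule ᾱ = min{α_min, α̃} when ∆xᵀ∆s > 0. A small numeric sketch checking the decrease bound (68); all quantities are hypothetical, and α̃ is simply capped at 1 rather than computed from step (g):

```python
# Hypothetical data consistent with the assumptions of Lemma 3.8:
# sigma in (0, 4/5), ||p||_inf <= gamma*sigma*mu/4, and (Delta x)'(Delta s) > 0.
n, mu = 2, 1.0
sigma, gamma = 0.4, 0.5
p = [0.04, -0.02]                     # ||p||_inf = 0.04 <= gamma*sigma*mu/4 = 0.05
dxds = 0.8                            # the inner product (Delta x)'(Delta s) (> 0)

# mu(alpha) = mu + c*alpha + a*alpha^2, by Lemma 3.5(b)
c = -(1 - sigma) * mu - sum(p) / n
a = dxds / n
alpha_min = -c / (2 * a)              # unconstrained minimizer of the quadratic mu(alpha)
alpha_tilde = 1.0                     # hypothetical value of the cap from step (g)
alpha_bar = min(alpha_min, alpha_tilde)

mu_bar = mu + c * alpha_bar + a * alpha_bar ** 2
bound = (1 - (1 - 1.25 * sigma) * alpha_bar / 2) * mu    # right-hand side of (68)
assert 0 < mu_bar <= bound
```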
Our next task will be to show that the stepsize ᾱ remains bounded away from zero. In view of (67), it suffices to show that the quantity ‖∆X∆s‖∞ remains bounded. The next lemma addresses this issue.
Lemma 3.9 Let w0 ∈ R^{2n}_{++} × R^{m+l} be such that (x0, s0) ≥ (x∗, s∗) for some w∗ ∈ S, and let w ∈ N_{w0}(γ, θ) for some γ ≥ 0 and θ ≥ 0. Then, the search direction ∆w generated by Algorithm CQP-IS satisfies

    max(‖D^{−1}∆x‖, ‖D∆s‖) ≤ ( 1 − 2σ + σ²/(1 − γ) )^{1/2} √(nµ)
        + (1/√(1 − γ)) ( γσ/4 + 6n + θ²/2 ) √µ + θ√µ.   (74)
Proof: Since w ∈ N_{w0}(γ, θ), there exists η ∈ [0, 1] such that w ∈ N_{w0}(η, γ, θ). Let ∆w̃ := ∆w + η(w0 − w∗). Using relations (39), (40), (42), and the fact that w ∈ N_{w0}(η, γ, θ), we easily see that

    A∆x̃ = 0,   (75)
    Aᵀ∆ỹ + ∆s̃ + V∆z̃ = 0,   (76)
    Vᵀ∆x̃ + ∆z̃ = q − rV + ηrV0.   (77)

Pre-multiplying (76) by ∆x̃ᵀ and using (75) and (77), we obtain

    ∆x̃ᵀ∆s̃ = −∆x̃ᵀV∆z̃ = [∆z̃ − (q − rV + ηrV0)]ᵀ∆z̃ = ‖∆z̃‖² − (q − rV + ηrV0)ᵀ∆z̃
        ≥ ‖∆z̃‖² − ‖q − rV + ηrV0‖ ‖∆z̃‖ ≥ −‖q − rV + ηrV0‖²/4.   (78)

Next, we multiply equation (41) by (XS)^{−1/2} to obtain D^{−1}∆x + D∆s = H(σ) − (XS)^{−1/2}p, where H(σ) := −(XS)^{1/2}e + σµ(XS)^{−1/2}e. Equivalently, we have that

    D^{−1}∆x̃ + D∆s̃ = H(σ) − (XS)^{−1/2}p + η[ D(s0 − s∗) + D^{−1}(x0 − x∗) ] =: g.

Taking the squared norm of both sides of the above equation and using (78), we obtain

    ‖D^{−1}∆x̃‖² + ‖D∆s̃‖² = ‖g‖² − 2∆x̃ᵀ∆s̃ ≤ ‖g‖² + ‖q − rV + ηrV0‖²/2
        ≤ ( ‖g‖ + (‖q‖ + ‖rV − ηrV0‖)/√2 )² ≤ ( ‖g‖ + θ√µ )²,

since ‖q‖ + ‖rV − ηrV0‖ ≤ [√2 − 1]θ√µ + θ√µ = √2 θ√µ by (45), (70), and the fact that √(1 + (1 − γ/2)σ) ≤ √2. Thus, we have

    max(‖D^{−1}∆x̃‖, ‖D∆s̃‖) ≤ ‖g‖ + θ√µ
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + η[ ‖D(s0 − s∗)‖ + ‖D^{−1}(x0 − x∗)‖ ] + θ√µ.

This, together with the triangle inequality, the definitions of D and ∆w̃, and the fact that w ∈ N_{w0}(η, γ, θ), implies that

    max(‖D^{−1}∆x‖, ‖D∆s‖)
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + 2η[ ‖D(s0 − s∗)‖ + ‖D^{−1}(x0 − x∗)‖ ] + θ√µ
        ≤ ‖H(σ)‖ + ‖(XS)^{−1/2}‖‖p‖ + 2η‖(XS)^{−1/2}‖[ ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ] + θ√µ
        ≤ ‖H(σ)‖ + (1/√((1 − γ)µ)) { ‖p‖ + 2η[ ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ] } + θ√µ.   (79)

It is well-known (see e.g. [7]) that

    ‖H(σ)‖ ≤ ( 1 − 2σ + σ²/(1 − γ) )^{1/2} √(nµ).   (80)

Moreover, using the fact that s∗ ≤ s0 and x∗ ≤ x0 along with Lemma 3.2, we obtain

    η( ‖X(s0 − s∗)‖ + ‖S(x0 − x∗)‖ ) ≤ η(sᵀx0 + xᵀs0) ≤ (3n + θ²/4) µ.   (81)

Finally, it is clear from (69) that

    ‖p‖ ≤ γσµ/4.   (82)

The result now follows by incorporating inequalities (80)–(82) into (79).
We are now ready to prove Theorem 2.5.

Proof of Theorem 2.5: Let ∆wk denote the search direction, and let rk = r(wk) and µk = µ(wk), at the k-th iteration of Algorithm CQP-IS. Clearly, wk ∈ N_{w0}(γ, θ). Hence, using Lemma 3.9, assumption (48) and the inequality

    ‖∆Xk∆sk‖∞ ≤ ‖∆Xk∆sk‖ ≤ ‖(Dk)^{−1}∆xk‖ ‖Dk∆sk‖,

we easily see that ‖∆Xk∆sk‖∞ = O(n²)µk. Using this conclusion together with assumption (48) and Lemma 3.8, we see that, for some universal constant β > 0, we have

    µ_{k+1} ≤ (1 − β/n²) µk,  ∀k ≥ 0.

Using this observation and some standard arguments (see, for example, Theorem 3.2 of [19]), we easily see that Algorithm CQP-IS generates an iterate wk ∈ N_{w0}(γ, θ) satisfying µk/µ0 ≤ ε² within O(n² log(1/ε)) iterations. The theorem now follows from this conclusion and the definition of N_{w0}(γ, θ).
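The final counting argument is standard: a per-iteration contraction factor of (1 − β/n²) forces µk/µ0 below ε² within O(n² log(1/ε)) iterations, since (1 − t)^k ≤ e^{−tk}. A quick numeric illustration with hypothetical constants:

```python
# Iteration count implied by the contraction mu_{k+1} <= (1 - beta/n^2) * mu_k.
import math

beta, n, eps = 0.1, 10, 1e-3          # hypothetical constants
t = beta / n ** 2                      # per-iteration contraction rate
k_bound = math.ceil((2 * n ** 2 / beta) * math.log(1 / eps))   # O(n^2 log(1/eps))

# Simulate the decrease of mu_k / mu_0 until it reaches eps^2.
ratio, iters = 1.0, 0
while ratio > eps ** 2:
    ratio *= 1 - t
    iters += 1
# (1-t)^k <= exp(-t*k) <= eps^2 once k >= (2/t)*ln(1/eps), so iters <= k_bound
assert iters <= k_bound
```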
4 Concluding Remarks
We have shown that the primal-dual long-step path-following LP algorithm based on an iterative solver presented in [12] can be extended to the CQP case. This was not immediately obvious, since the standard normal equation for CQP does not fit into the mold required for the results in [13] to hold. By creating the ANE, we were able to use the results about the maximum weight basis preconditioner developed in [13] in the context of CQP.
Another difficulty we encountered was the proper choice of the starting iterate u0 for the iterative solver. By choosing u0 = 0 as in the LP case, we obtain ‖v − W u0‖ = ‖v‖, which can only be shown to be O(max{µ, √µ}). In this case, for every µ > 1, Proposition 2.3 would guarantee that the number of inner iterations of the iterative solver is

    O( f(κ) max{ log(c(κ) n ϕ_Ã), log µ } ),

a bound which depends on the logarithm of the current duality gap. On the other hand, Theorem 2.6 shows that choosing u0 as in (50) results in a bound on the number of inner iterations of the iterative solver that does not depend on the current duality gap.
The usual way of defining the dual residual is as the quantity

    Rd := Aᵀy + s − V Vᵀx − c,

which, in view of (11) and (12), can be written in terms of the residuals rd and rV as

    Rd = rd − V rV.   (83)

Note that along the iterates generated by Algorithm CQP-IS, we have rd = O(µ) and rV = O(√µ), implying that Rd = O(√µ). Hence, while the usual primal residual converges to 0 according to O(µ), the usual dual residual does so according to O(√µ). This is an unusual feature of the algorithm presented in this paper, and one which contrasts it with other infeasible interior-point path-following algorithms, which require that the primal and dual residuals go to zero at an O(µ)-rate. It should be noted, however, that this feature is possible due to the specific form of the O(√µ)-term present in (83), i.e., one that lies in the range space of V.
CQP problems where V is explicitly available arise frequently in the literature. One
important example arises in portfolio optimization (see [4]), where the rank of V is often
small. In such problems, l represents the number of observation periods used to estimate
the data for the problem. We believe that Algorithm CQP-IS could be of particular use for
this type of application.
References
[1] V. Baryamureeba and T. Steihaug. On the convergence of an inexact primal-dual interior
point method for linear programming. Technical Report 188, Department of Informatics,
University of Bergen, 2000.
24
[2] V. Baryamureeba, T. Steihaug, and Y. Zhang. Properties of a class of preconditioners
for weighted least squares problems. Technical Report 16, Department of Computational
and Applied Mathematics, Rice University, 1999.
[3] L. Bergamaschi, J. Gondzio, and G. Zilli. Preconditioning indefinite systems in interior
point methods for optimization. Technical Report MS-02-002, Department of Mathematical Methods for Applied Sciences, University of Padua, Italy, 2002.
[4] T.J. Carpenter and R.J. Vanderbei. Symmetric indefinite systems for interior-point
methods. Mathematical Programming, 58:1–32, 1993.
[5] R.W. Freund, F. Jarre, and S. Mizuno. Convergence of a class of inexact interior-point
algorithms for linear programs. Mathematics of Operations Research, 24(1):50–71, 1999.
[6] M. Kojima, N. Megiddo, and S. Mizuno. A primal-dual infeasible-interior-point algorithm for linear programming. Mathematical Programming, 61(3):263–280, 1993.
[7] M. Kojima, S. Mizuno, and A. Yoshise. A primal-dual interior point algorithm for linear
programming. in Progress in Mathematical Programming: Interior-Point and Related
Methods, ch. 2, pages 29–47, 1989.
[8] J. Korzak. Convergence analysis of inexact infeasible-interior-point algorithms for solving linear programming problems. SIAM Journal on Optimization, 11(1):133–148, 2000.
[9] V.V. Kovacevic-Vujcic and M.D. Asic. Stabilization of interior-point methods for linear
programming. Computational Optimization and Applications, 14:331–346, 1999.
[10] D.G. Luenberger. Linear and Nonlinear Programming. Addison-Wesley, 1984.
[11] S. Mizuno and F. Jarre. Global and polynomial-time convergence of an infeasible-interior-point algorithm using inexact computation. Mathematical Programming, 84:357–373, 1999.
[12] R.D.C. Monteiro and J.W. O’Neal. Convergence analysis of a long-step primal-dual infeasible interior-point LP algorithm based on iterative linear solvers. Technical report, Georgia Institute of Technology, 2003. Submitted to Mathematical Programming.
[13] R.D.C. Monteiro, J.W. O’Neal, and T. Tsuchiya. Uniform boundedness of a preconditioned normal matrix used in interior point methods. Technical report, Georgia Institute
of Technology, 2003. Submitted to SIAM Journal on Optimization.
[14] A.R.L. Oliveira and D.C. Sorensen. Computational experience with a preconditioner
for interior point methods for linear programming. Technical Report 28, Department of
Computational and Applied Mathematics, Rice University, 1997.
[15] L.F. Portugal, M.G.C. Resende, G. Veiga, and J.J. Judice. A truncated primal-infeasible
dual-feasible network interior point method. Networks, 35:91–108, 2000.
25
[16] M.G.C. Resende and G. Veiga. An implementation of the dual affine scaling algorithm
for minimum cost flow on bipartite uncapacitated networks. SIAM Journal on Optimization, 3:516–537, 1993.
[17] M.J. Todd, L. Tunçel, and Y. Ye. Probabilistic analysis of two complexity measures for
linear programming problems. Mathematical Programming A, 90:59–69, 2001.
[18] S.A. Vavasis and Y. Ye. A primal-dual interior point method whose running time
depends only on the constraint matrix. Mathematical Programming A, 74:79–120, 1996.
[19] S.J. Wright. Primal-Dual Interior-Point Methods. SIAM, 1997.
[20] Y. Zhang. On the convergence of a class of infeasible interior-point methods for the
horizontal linear complementarity problem. SIAM Journal on Optimization, 4(1):208–
227, 1994.
[21] G. Zhou and K.-C. Toh. Polynomiality of an inexact infeasible interior point algorithm for semidefinite programming. Mathematical Programming, to appear.