Document 10943901

advertisement
(C) 1998 OPA (Overseas Publishers Association)
Amsterdam B.V. Published under license
under the Gordon and Breach Science
Publishers imprint.
Printed in Malaysia.
J. oflnequal. & Appl., 1998, Vol. 2, pp. 181-194
Reprints available directly from the publisher
Photocopying permitted by license only
A Conjugate Direction Method for
Approximating the Analytic Center
of a Polytope*
MASAKAZU KOJIMA a,f, NIMROD MEGIDDO b
and SHINJI MIZUNO c,:l:
a
Departments of Mathematical and Computing Science,
Tokyo Institute of Technology, Meguro-ku, Tokyo 152, Japan;
b
IBM Almaden Research Center, 650 Harry Road, San Jose, Cafifomia
95120-6099, and School of Mathematical Sciences, Tel Aviv University,
Tel Aviv, Israel; c The Institute of Statistical Mathematics,
4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan
(Received 21 November 1996)
Rn:ax-
The analytic center o of an n-dimensional polytope P {x E
bi >_ 0 (i
1,2
m)} with a nonempty interior Pint is defined as the unique minimizer of the
logarithmic potential function F(x)
7’-1 lg(a/Tx hi) over Pint. It is shown that one
cycle of a conjugate direction method, applied to the potential function at any v E Pint
such that e=
V/(Jc
V/(-o)TV2F(o)(v-o)< 1/6, generates a point J:
Pint such that
to)XVF(to)(Jc to) <_ 23v/-e 2.
Keywords: Linear inequality; Polytope; Linear programming; Interior-point method;
Conjugate Direction Method
AMS Subject Classification Numbers." 15A39 (Linear inequalities), 52A25 (Polyhedra
and polytopes) and 90C05 (Linear programming)
* Part of this research was done when M. Kojima and S. Mizuno visited at the IBM
Almaden Research Center in the summer of 1991. Partial support from the Office of
Naval Research under Contract N00014-91-C-0026 is acknowledged.
Corresponding author. Supported by Grant-in-Aids for Co-operative Research
(03832017) of The Japan Ministry of Education, Science and Culture.
Supported by Grant-in-Aids for Encouragement of Young Scientist (03740125) and
Co-operative Research (03832017) of The Japan Ministry of Education, Science and
Culture.
181
M. KOJIMA et al.
182
1. INTRODUCTION
-
Let P denote a polytope of the form {xR" aTix>_ bi (i-1,2,..., m)}. The symbol Pint stands for its interior {x 6 Rn" aTi x > bi
(i 1, 2,... ,m)}. We assume throughout that Pint is nonempty and
bounded. Let F denote the logarithmic potential function on Pint;
rn
F(x)--
log(a/Tx- be)
for every x
Pint.
i=1
The analytic center oJ of the polytope P [11] is defined as the unique
minimizer of the potential function F over the interior Pint of P. We
denote the gradient vector and the Hessian matrix at each x Pint by
VF(x) and V2F(x), respectively. By a simple calculation, we see that
VF(x)
.=
rn
aWi x
bi
.=
(aTi x bi) 2"
Assuming there exist n linearly independent ai’s, the Hessian matrix
is positive definite at every x Pint. Hence, the potential
function F is strictly convex over Pint, and if Pint k }, the analytic center
o of the polytope P is the unique solution of the system of equations
VF(x) =0. Furthermore, at any fixed x Pint, we can define a norm
over R by
X72F(x)
II[Ix
/jTV2F(x)
(1)
We use the norm ]Ix
wl] to measure the distance from any x 6 Pint to
the analytic center o as in the papers by Renegar [10] and Vaidya [15].
Consider the linear program
Minimize
cTx
subject to
aTi x >_ bi
1, 2,
m)
Here, x, c E R’, ai R’, and bi R. We assume that the linear program
has a bounded nonempty set of optimal solutions. Denote by A* the
optimal value of the objective function. It follows that for every A > A*
the interior Sint(A) of the parametric level set
S(/)
{x @ Rn" cTx <_
,, aTi
X
>_ bi (i- 1,2,...,m)}
A CONJUGATE DIRECTION METHOD
183
is nonempty and bounded. For each/ > A*, let oa denote the analytic
center of S(A). It is well known that the set {o)a: > }, consisting of
all the analytic centers, forms a smooth curve which runs through the
interior of the feasible region and converges to an optimal solution of
the linear program as
tends ,*. The curve is called the central
trajectory or the path of centers. Thus, if we numerically trace the
central trajectory till gets sufficiently close to ,V, then we obtain an
approximate optimal solution of the linear program 11]. Renegar 10]
embodied this idea in his polynomial-time algorithm for linear
programs using Newton’s method for approximating the analytic
center of a polytope. Since then, the approximation of the central
trajectory by Newton method has played a major role in many interior
point algorithms developed for linear programs (see e.g. [2,3,5,7,8,10,
13,14,16]) and their extensions to convex quadratic programs (see e.g.
[9]) and linear complementarity problems (see e.g. [6]).
The following theorem provides a theoretical basis for the polynomial-time convergence of Renegar’s algorithm:
*
,
e < 1, then the
THEOREM 1.1 (Theorem 3.2 of [10]) If Ilv-ol[
point x* v- 2F(v)-lVF(v), generated by one Newton iteration for
minimizing the potential function F at the point r, satisfies
IIx*-ll<
(1 -Jr- )22
1-e
(2)
Vaidya [15] also investigated the behavior of a Newton-type method
for approximating the analytic center. Our research is motivated by the
following questions: (i) Do conjugate direction and conjugate gradient
methods (see, e.g. [1,4,12]) work as effectively as Newton’s method for
approximating the analytic center? (ii) Can these methods be utilized
effectively in interior point algorithms, replacing Newton’s method?
Here we begin answering these questions. Let v EintP, and let
d 1, d2,..., dn be n V2F(v)-orthonormal vectors;
]ldill2v--(di)TV2F(l)d i- (i-- 1,2,...,n),
(di)TV2F(v)dj -0 if < < j _< n.
(3)
We are concerned with a conjugate direction method for approximating the analytic center.
M. KOJIMA et al.
184
Conjugate Direction Method (One Cycle)
Step 0: Let y0__ v and k 1.
Step 1" Define y/ Geint by F(y)=min{F(y + ud) uR}.
Step 2: If k n, then stop. Otherwise, set k k + and go to Step 1.
Our main result is:
=-
THEOREM 1.2
Ife
IIv- ll _< 1/6, then Ily n- toll -< 23x/e2.
In the remainder of this note we prove this theorem. As by-products,
we derive several interesting properties of the potential function and its
quadratic approximation. In particular, we will see in Corollary 2.5
that an inequality slightly stronger than (2) holds under the assumption
of Theorem 1.1.
2. BASIC ANALYSIS
2.1. Symbols and Notation
We begin by introducing the notation. Let
f (x) F(x) F(to)
for every x E Pint.
Obviously, the gradient and the Hessian of the function f at every
x E Pint coincide with those of the potential function F, respectively. By
the definition of the analytic center to of the polytope P, we have
f(to) --0 and f(x) > 0 for every
xE
Pint.
We also call f a potential function.
Define a quadratic approximation of the potential function fat every
Y Pint by
qy(X) :f(y)+ VF(y)T(x- y)
+ 1/2 (x y) v:r(y)(x y) for every x R n.
(4)
In particular, we have
qv(X)
qv(X*) +1/2(x- x*)TVZF(v)(x x*) for every x
R n. (5)
A CONJUGATE DIRECTION METHOD
185
Here, x* v-(V2F(v))- 1VF(v) denotes the point generated by one
Newton iteration for minimizing the potential function f at a point
v E Pint (see Theorem 1.1).
For each x E Pint, define an n m matrix
Ux
Ux by
am
a2
al
ax- axbl’
a Tm x-
b2’
bm
)
and for each pair x, y Pint, define the m x m diagonal matrix
RYx
diag (ayx bl
bl
\a
ay- b2
aTm y
aTm X
ax
With this notation, we can rewrite the norm
RYx by
bm
IIllx of/j
R defined by
(1) as
11411- v/TV2F(x) [[UxTII (x P).
This equality is used very often without any reference. Also, it is easy to
verify the following equalities for every pair {x,y} c Pint:
7F(x)
-Uxe,
(6)
(7)
(8)
(9)
72F(x)- UxUXx,
Uy Ux Ry,
e Ryx e
UyT (y x).
X
2.2. Some Properties of the Quadratic Approximation
In this subsection we present three lemmas. The first one estimates the
difference between the norms [ll[y and I[[[x of
Pint when y and x are
close. The second and the third ones evaluate the errors in the gradient
vector Vqy and the Hessian matrix 2qy of the quadratic approximation qy of the potential function f, respectively.
LEMMA 2.1
If I Rn, {x, y} C Pint, and IIx ylly <
, then
(0)
M. KOJIMA et al.
186
Proof For every (i--1,2,...,m), let ri denote the ith diagonal
element (a Xix bi)/(a iY- bi) of the diagonal matrix R;. By (9) and
the assumption,
m
Z(
2
x
T
x)ll 2
ri) 2 -lie- eell
-IIU(y-
i=l
yll y2 <2.
Ilx
This implies that
m
-(1 -ri) 2
e2
and
-e
-+-e (i-- 1,2,...,m). (11)
?’i
i=1
U
UTx
On the other hand, we know by (8) that
Hence, taking
R;
the Euclidean norm of both sides of this equality, we obtain
Thus, the desired inequalities in (10) follow.
E Rn,
LEMMA 2.2 If
IIx- yllx
<-
{x,y}
C Pint, and either
[[x-y[ly <- e
or
e, then
I(Vqy(x)
VF(x))TI
2
Proof As in the proof of Lemma 2.1, we denote the ith diagonal
element (aXix-bi)/(aXiY-bi) of the diagonal matrix R by ri
(i-l,2,...,m). If ][x-ylly <_ then the inequalities in (11) hold.
Since Rx (R) -1, we see by symmetry that if [Ix -Yllx < e, then the
,
inequalities
m (1 /.)2
< e2 and
i:1
-
<-_< +e (i --1, 2,
t’i
m) (12).
hold. We can easily verify that
I(Vqy(x)
VF(x)) TI
I(VF(y)+ VZF(y)(x- y)-
(by (4))
-
A CONJUGATE DIRECTION METHOD
(-e + UT(X y) Uxe) Tl
( -Uye + Uy(-e + Ry e) + Uy R e
x
187
(by (6) and (7))
(by (8) and (9))
U
-< II- 2e + R e + RYx ell" Ull
-2e+Rye+Re
i=1
(i=1 --ri)2i)ll]lY--(i=1 (1
(1
In either case ((11) or (12)) the desired inequality follows.
LEMMA 2.3 If ; E Rn, {x,y)
C
Pint and ]Ix ylly < e, then
[;T(V2qy(X)- V2F(x))I <_ (2 + e)ell/jllZy.
Proof By the definition (4) of the quadratic function qy and (7),
By Lemma 2.1,
IT (V2qy(X)
-< max{l(1<_ max{ (2
(2 +
2.3. Minimization of the Quadratic Approximation
The following lemma shows a relation between the minimizer x f of the
potential function f and the minimizer xq of its quadratic approximation qy over any affine subspace S that intersects Pint.
M. KOJIMA et al.
188
LEMMA 2.4 Suppose S is an affine subspace of R n which intersects Pint,
and let y E Pint be any point. Let x f be the minimizer of the potential
function foyer S N Pint, and let x q be the minimizer over S of the quadratic
approximation qy of f at y Pint. Under these conditions, if either
Ilxf ylly <_ or Ilxf yllx <_ then Ilx q -xfiiy <_ e2/(1 --e).
,
Proof From the definition of x f and x q,
z)TF(x f)
(x
0
(x z)TVqy(X q)
and
0
for any x and z in S. In particular, these inequalities hold for x
x q and
z x f. Hence,
xf)mVF(x f)
(x q
0 and
(x q xf)Tqy(X q)
O.
(13)
Therefore,
]iX q
Xf
liy2_ IlUyT(X q -xf)ll 2
(xq xf )TVy VyT(xq X f
(X q xf )Tv2F(y)(X q X f) (by (7))
(x q xf )T(Vqy(X q) Vqy(xf )) (see (4))"
(x q xf)T(vr(x f) Vqy(xf)) (by (13))
2
q<
l-Z- I[x xf[]y (by Lemma 2.2).
Using Lemma 2.4, we can strengthen the inequality (2) given in
Theorem 1.1.
COROLLARY 2.5
Under the assumptions
of Theorem 1.1, [[x* -w][ <_
+
Proof
If we apply Lemma 2.4 to the case where S
R and y- v, then
xf-oo and xq-x *, and hence [[x*-w[[ _< e2/(1 -e). On the other
hand, by Lemma 2.1, [[x* o[] <_ (1 + e)I[x* o[[. Thus the desired
inequality follows.
2.4. Neighborhoods of the Analytic Center
We define an ellipsoidal neighborhood E(e) of the analytic center o by
E(e)
{z: [Iz-
_<
{+
I1 11
and 0
A CONJUGATE DIRECTION METHOD
189
and the level set L(c) of the potential function f by
{x: f(x) < c)
L(c)
for each c >_ 0. The following lemma shows a relation between the
neighborhood E(e) of the analytic center to of the polytope P and the
level set L(c):
LEMMA 2.6 For every e E [0, ],
E(e)
Proof By Lemma 5.1
C
L(2e2/3) C E(’/e).
of [15],
3
If(a’ + 5) 1/2’2TvF()I < 3(1 5)
Rn with
for every 6 E [0, 1) and every tj
uwuTw, it follows that
T
<
to
1. Since TV2F(to)--
3
3( -5)
Rn with ]ll]
for every 6 E [0, 1) and every tj
be used later.
Now, if
x
I111
+ s/ E(e), 0 < s < e <
and
(14)
1. This inequality will
I111
then
f(x) =f(to + sty)
S2
2
S
3(1
s)
3(
)
(by (14))
2
2
2e
3
(since s < e < 1)
1,
M. KOJIMA et al.
190
This implies that x E L(2e2/3). Thus, we have shown the first inclusion
relation E(e) c L(2e2/3) of the lemma.
We now prove the second inclusion relation L(2e2/3) c E(x/e). It
suffices to show that if
x
o+
s E(x/e)
and
1111
1,
then
x
o)+
sj
L(2e2/3).
Since s > x/e and f is a strictly convex function attaining its minimum
value 0 at the analytic center o of the polytope P, we have
f(o + s) > f(o + v/e).
On the other hand, by (14) and 0 _< e _< 1/6,
2e2
f(o + x/et) > 2
Hence,
>
2e2/3.
xf L(2e2/3).
3. PROOF OF THE MAIN THEOREM
By applying Lemma 2.4 with S-R", xf--o0 and x q --X*, we first
observe that
2
Ilx*-wl[ <.
(15)
We also see by Lemma 2.6 that v eE(e)cL(2e2/3), which implies
f(yO) =f(v)< 2e2/3. It follows
from the construction of the sequence
(see the Conjugate Direction Method described in Section 1) that
f(y) <_f(yO)
22
<_---’
i.e.,
y: L(2e2/3)
(k- 0, 1,...,n).
A CONJUGATE DIRECTION METHOD
191
Hence, for every k-0, 1,..., n,
Ily
(by Lemma 2.1)
vii
"11 / IIv ’11)
(llY
(+)
(by y E L(22/3), Lemma 2.6 and v E E(e))
<_3e (since0<e<_
1/6).
Thus, we have shown that
Ily
Recall that
ing (3).
vllv
dl, d2,...,dn
(k
3e
0, 1,..., n).
V2F(v)-orthonormal
are n
(16)
vectors satisfy-
X* t_ -]iL1 Ai di, then for every fixedj (j- 1,...,n),
the minimum of the quadratic approximation qv of the potentialfunction f
at v Pint or/the line {y + c d J: ty R} is attained at x* + ij i d i.
LEMMA 3.1 If y
Proof By (5) and (3),
qv(y + adj)
qv(X*)
+
V2F(v)
dj
Ai di
Ai di @ dj
i=l
i=l
q(*) +g
_
Thus, the minimum of q(y + d with respect to
Let
R is attained at
.
(i- 1, 2,..., n) be real numbers such that
y0
x*
+d
i=1
For each k (k 1,2,..., n), let u denote the step length from y-i taken
at Step of the Conjugate Direction Method along the direction d;
y
yC- + ugd
.
(17)
M. KOJIMA et al.
192
Then each y k (k
1, 2,..., n) can be rewritten as
k
y= yo +
k
iai= x* + <.i + )d +
i=1
i=1
.d
i=k+l
.
For each k (k- 1,2,..., n), let p k be the minimizer of qv on the line
{yk-1 + a dk. a E R}. By Lemma 3.1,
k-1
k
X*
-’[-
Z
n
(#i -1- V’i) di -t-
i=1
Hence,
Z Izidi
yk (#k -1- tk) dk.
i=k+l
-
n
Ily"
x*ll 2v
Z(#i -’[-" ui)dill 2v
i=1
II(#/+ t/i)dill 2v (by (3))
i=1
i=1
-<
\
3Ji=l
_<n
Since 0
<_ e < 1/6,
( (3e)2
(by (16) and Lemma 2.4)
"3eJ
[lyn-x*[lv<-X/
Consequently,
2
3e
(18)
A CONJUGATE DIRECTION METHOD
193
4. CONCLUDING REMARKS
Our main theorem (Theorem 1.2) suggests a modification of Renegar’s
polynomial-time algorithm [10] for linear programs. Specifically, we
can replace Newton’s method, which is used to trace the central
trajectory in Renegar’s algorithm, by the Conjugate Direction Method.
Such a modification does not, however, seem so attractive. The authors
,
are not satisfied with the fact that in the Conjugate Direction Method
the search directions d d2,..., dn are chosen as conjugate directions
with respect to a fixed Hessian matrix 72F(v) of the potential function
F at v. Only the line search from yk- is performed along the direction
dk using the potential function f at each iteration of the method. In
standard applications of conjugate gradient methods to nonlinear
minimization, the search direction d is computed adaptively rather
than fixed in advance. Our ultimate goal is to see whether conjugate
gradient methods can be effectively incorporated in interior point
algorithms. Thus it would require more effort to establish a similar
result for conjugate gradient methods rather than the Conjugate
Direction Method given in Theorem 1.2.
References
[1] A.I. Cohen, Rate of convergence of several conjugate gradient algorithms, SIAM
J. Numer. Anal., 9 (1972), 248-259.
[2] R.M. Freund and K.-C. Tan, A method for the parametric center problem, with a
[3]
[4]
[5]
[6]
[7]
[8]
[9]
strictly monotone polynomial-time algorithm for linear programming, Mathematics
of Operations Research, 16 (1991), 775-801.
C.C. Gonzaga, An algorithm for solving linear programming programs in O(n3L)
operations, in: N.Megiddo, Ed., Progress in Mathematical Programming: InteriorPoint and Related Methods, Springer-Verlag, New York, 1989, pp. 1-28.
M. Hestenes, Conjugate Direction Methods in Optimization, Springer-Verlag, New
York, 1980.
M. Kojima, S. Mizuno and A. Yoshise, A primal-dual interior point algorithm for
linear programming, in: N. Megiddo, Ed., Progress in Mathematical Programming:
Interior-Point and Related Methods, Springer-Verlag, New York, 1989, pp. 29-47.
M. Kojima, S. Mizuno and A. Yoshise, A polynomial-time algorithm for a class of
linear complementary problems, Mathematical Programming, 44 (1989), 1-26.
N. Megiddo, Pathways to the optimal set in linear programming, in: N. Megiddo, Ed.,
Progress in Mathematical Programming: Interior-Point and Related Methods, SpringerVerlag, New York, 1989, pp. 131-158.
R.D.C. Monteiro and I. Adler, Interior path following primal-dual algorithms. Part I:
linear programming, Mathematical Programming, 44 (1989), 27-41.
R.D.C. Monteiro and I. Adler, Interior path following primal-dual algorithms. Part
II: convex quadratic programming, Mathematical Programming, 44 (1989), 43-66.
M. KOJIMA et al.
194
[10] J. Renegar, A polynomial-time algorithm based on Newton’s method for linear
programming, Mathematical Programming, 40 (1988), 59-94.
[11] G. Sonnevend, An ’analytical centre’ for polyhedrons and new classes of global
algorithms for linear (smooth, convex) programming, in: Lecture Notes in Control
and Information Sciences, Vol. 84, Springer-Verlag, New York, 1985, pp. 866-876.
[12] J. Stoer, On the relation between quadratic termination and convergence properties
of minimization algorithms, Numer. Math., 28 (1977), 343-366.
[13] K. Tanabe, Centered Newton method for mathematical programming, in: M. Iri and
K. Yajima, Eds., System Modeling and Optimization, Springer-Verlag, New York,
1988, pp. 197-206.
[14] K. Tanabe, Centered Newton method for linear programming: Interior and
’exterior’ point method (in Japanese), in: K. Tone, Ed., New Methods for Linear
Programming, 3, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu,
Minato-ku, Tokyo 106, Japan, 1990, pp. 98-100.
[15] P.M. Vaidya, A locally well-behaved potential function and a simple Newton-type
method for finding the center of a polytope, in: N. Megiddo, Ed., Progress in
Mathematical Programming: Interior-Point and Related Methods, Springer-Verlag,
New York, 1989, pp. 131-158.
[16] P.M. Vaidya, An algorithm for linear programming which requires O((m + n)nZ+
(m + n)S nL) arithmetic operations, Mathematical Programming, 47 (1990),
175-201.
Download