(C) 1998 OPA (Overseas Publishers Association) Amsterdam B.V. Published under license under the Gordon and Breach Science Publishers imprint.
J. of Inequal. & Appl., 1998, Vol. 2, pp. 181-194

A Conjugate Direction Method for Approximating the Analytic Center of a Polytope*

MASAKAZU KOJIMA a,†, NIMROD MEGIDDO b and SHINJI MIZUNO c,‡

a Department of Mathematical and Computing Sciences, Tokyo Institute of Technology, Meguro-ku, Tokyo 152, Japan; b IBM Almaden Research Center, 650 Harry Road, San Jose, California 95120-6099, and School of Mathematical Sciences, Tel Aviv University, Tel Aviv, Israel; c The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan

(Received 21 November 1996)

The analytic center $\omega$ of an $n$-dimensional polytope $P = \{x \in R^n : a_i^T x - b_i \ge 0 \ (i = 1, 2, \dots, m)\}$ with a nonempty interior $P_{int}$ is defined as the unique minimizer of the logarithmic potential function $F(x) = -\sum_{i=1}^{m} \log(a_i^T x - b_i)$ over $P_{int}$. It is shown that one cycle of a conjugate direction method, applied to the potential function at any $v \in P_{int}$ such that $\epsilon = \sqrt{(v - \omega)^T \nabla^2 F(\omega)(v - \omega)} < 1/6$, generates a point $\hat{x} \in P_{int}$ such that $\sqrt{(\hat{x} - \omega)^T \nabla^2 F(\omega)(\hat{x} - \omega)} \le 23\sqrt{n}\,\epsilon^2$.

Keywords: Linear inequality; Polytope; Linear programming; Interior-point method; Conjugate Direction Method

AMS Subject Classification Numbers: 15A39 (Linear inequalities), 52A25 (Polyhedra and polytopes) and 90C05 (Linear programming)

* Part of this research was done when M. Kojima and S. Mizuno visited the IBM Almaden Research Center in the summer of 1991. Partial support from the Office of Naval Research under Contract N00014-91-C-0026 is acknowledged.
† Corresponding author. Supported by Grant-in-Aids for Co-operative Research (03832017) of The Japan Ministry of Education, Science and Culture.
‡ Supported by Grant-in-Aids for Encouragement of Young Scientist (03740125) and Co-operative Research (03832017) of The Japan Ministry of Education, Science and Culture.

1. INTRODUCTION

Let $P$ denote a polytope of the form $\{x \in R^n : a_i^T x \ge b_i \ (i = 1, 2, \dots, m)\}$. The symbol $P_{int}$ stands for its interior $\{x \in R^n : a_i^T x > b_i \ (i = 1, 2, \dots, m)\}$. We assume throughout that $P_{int}$ is nonempty and bounded. Let $F$ denote the logarithmic potential function on $P_{int}$:

$$F(x) = -\sum_{i=1}^{m} \log(a_i^T x - b_i) \quad \text{for every } x \in P_{int}.$$

The analytic center $\omega$ of the polytope $P$ [11] is defined as the unique minimizer of the potential function $F$ over the interior $P_{int}$ of $P$. We denote the gradient vector and the Hessian matrix at each $x \in P_{int}$ by $\nabla F(x)$ and $\nabla^2 F(x)$, respectively. By a simple calculation, we see that

$$\nabla F(x) = -\sum_{i=1}^{m} \frac{a_i}{a_i^T x - b_i}, \qquad \nabla^2 F(x) = \sum_{i=1}^{m} \frac{a_i a_i^T}{(a_i^T x - b_i)^2}.$$

Assuming there exist $n$ linearly independent $a_i$'s, the Hessian matrix is positive definite at every $x \in P_{int}$. Hence, the potential function $F$ is strictly convex over $P_{int}$, and if $P_{int} \ne \emptyset$, the analytic center $\omega$ of the polytope $P$ is the unique solution of the system of equations $\nabla F(x) = 0$. Furthermore, at any fixed $x \in P_{int}$, we can define a norm over $R^n$ by

$$\|\xi\|_x = \sqrt{\xi^T \nabla^2 F(x)\, \xi}. \tag{1}$$

We use the norm $\|x - \omega\|_\omega$ to measure the distance from any $x \in P_{int}$ to the analytic center $\omega$ as in the papers by Renegar [10] and Vaidya [15].

Consider the linear program

$$\text{Minimize } c^T x \quad \text{subject to } a_i^T x \ge b_i \ (i = 1, 2, \dots, m).$$

Here, $x, c \in R^n$, $a_i \in R^n$, and $b_i \in R$. We assume that the linear program has a bounded nonempty set of optimal solutions. Denote by $\lambda^*$ the optimal value of the objective function. It follows that for every $\lambda > \lambda^*$ the interior $S_{int}(\lambda)$ of the parametric level set

$$S(\lambda) = \{x \in R^n : c^T x \le \lambda, \ a_i^T x \ge b_i \ (i = 1, 2, \dots, m)\}$$

is nonempty and bounded. For each $\lambda > \lambda^*$, let $\omega_\lambda$ denote the analytic center of $S(\lambda)$.
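To make the definitions of $F$, $\nabla F$, $\nabla^2 F$ and the norm (1) concrete, here is a small worked example. The triangle used below is a hypothetical illustration and does not appear in the paper: take $n = 2$, $m = 3$ with the constraints $x_1 \ge 0$, $x_2 \ge 0$ and $-x_1 - x_2 \ge -1$.

```latex
\begin{aligned}
F(x) &= -\log x_1 - \log x_2 - \log(1 - x_1 - x_2), \\
\nabla F(x) &=
  \begin{pmatrix}
    -\dfrac{1}{x_1} + \dfrac{1}{1 - x_1 - x_2} \\[2mm]
    -\dfrac{1}{x_2} + \dfrac{1}{1 - x_1 - x_2}
  \end{pmatrix}, \\
\nabla F(\omega) = 0 \;&\Longleftrightarrow\; \omega_1 = \omega_2 = 1 - \omega_1 - \omega_2
  \;\Longleftrightarrow\; \omega = (1/3,\, 1/3)^T, \\
\nabla^2 F(\omega) &= 9 \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \\
\|\xi\|_\omega^2 &= \xi^T \nabla^2 F(\omega)\, \xi = 18\,(\xi_1^2 + \xi_1 \xi_2 + \xi_2^2).
\end{aligned}
```

In this example all three slacks equal $1/3$ at $\omega$, which is why the Hessian there is simply $9 \sum_i a_i a_i^T$.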
It is well known that the set $\{\omega_\lambda : \lambda > \lambda^*\}$, consisting of all the analytic centers, forms a smooth curve which runs through the interior of the feasible region and converges to an optimal solution of the linear program as $\lambda$ tends to $\lambda^*$. The curve is called the central trajectory or the path of centers. Thus, if we numerically trace the central trajectory until $\lambda$ gets sufficiently close to $\lambda^*$, then we obtain an approximate optimal solution of the linear program [11]. Renegar [10] embodied this idea in his polynomial-time algorithm for linear programs using Newton's method for approximating the analytic center of a polytope. Since then, the approximation of the central trajectory by Newton's method has played a major role in many interior point algorithms developed for linear programs (see e.g. [2,3,5,7,8,10,13,14,16]) and their extensions to convex quadratic programs (see e.g. [9]) and linear complementarity problems (see e.g. [6]). The following theorem provides a theoretical basis for the polynomial-time convergence of Renegar's algorithm:

THEOREM 1.1 (Theorem 3.2 of [10]) If $\|v - \omega\|_\omega = \epsilon < 1$, then the point $x^* = v - \nabla^2 F(v)^{-1} \nabla F(v)$, generated by one Newton iteration for minimizing the potential function $F$ at the point $v$, satisfies

$$\|x^* - \omega\|_\omega \le \frac{(1 + \epsilon)^2 \epsilon^2}{1 - \epsilon}. \tag{2}$$

Vaidya [15] also investigated the behavior of a Newton-type method for approximating the analytic center.

Our research is motivated by the following questions:

(i) Do conjugate direction and conjugate gradient methods (see, e.g. [1,4,12]) work as effectively as Newton's method for approximating the analytic center?
(ii) Can these methods be utilized effectively in interior point algorithms, replacing Newton's method?

Here we begin answering these questions. Let $v \in P_{int}$, and let $d^1, d^2, \dots, d^n$ be $n$ $\nabla^2 F(v)$-orthonormal vectors:

$$\|d^i\|_v^2 = (d^i)^T \nabla^2 F(v)\, d^i = 1 \ (i = 1, 2, \dots, n), \qquad (d^i)^T \nabla^2 F(v)\, d^j = 0 \ \text{ if } 1 \le i < j \le n. \tag{3}$$

We are concerned with a conjugate direction method for approximating the analytic center.
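The following is a minimal numerical sketch of the objects just introduced, not the authors' implementation. It builds $\nabla^2 F(v)$-orthonormal directions as in (3), takes one Newton step as in Theorem 1.1, and runs one cycle of successive exact line searches along the directions, which is the method stated immediately after this passage. Everything specific here is an assumption made for illustration: the triangle example, the function names, the use of Gram-Schmidt on the coordinate vectors to obtain the directions, and bisection on the directional derivative as the line search.

```python
import math

# Hypothetical example: the triangle x1 >= 0, x2 >= 0, x1 + x2 <= 1,
# written as a_i^T x >= b_i. Its analytic center is (1/3, 1/3).
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B = [0.0, 0.0, -1.0]
OMEGA = [1.0 / 3.0, 1.0 / 3.0]

def dot(u, w):
    return sum(ui * wi for ui, wi in zip(u, w))

def slacks(x):
    return [dot(a, x) - b for a, b in zip(A, B)]

def grad(x):
    # Gradient of F: -sum_i a_i / (a_i^T x - b_i)
    s = slacks(x)
    return [-sum(a[j] / si for a, si in zip(A, s)) for j in range(len(x))]

def hess_times(x, u):
    # Hessian-vector product: sum_i a_i (a_i^T u) / (a_i^T x - b_i)^2
    s = slacks(x)
    return [sum(a[j] * dot(a, u) / si ** 2 for a, si in zip(A, s))
            for j in range(len(x))]

def local_norm(u, x):
    # The norm (1): sqrt(u^T Hessian(x) u)
    return math.sqrt(dot(u, hess_times(x, u)))

def conjugate_directions(v):
    # Assumed construction: Gram-Schmidt on the coordinate vectors in the
    # inner product u^T Hessian(v) w, so the result satisfies (3).
    n = len(v)
    ds = []
    for j in range(n):
        d = [1.0 if k == j else 0.0 for k in range(n)]
        for p in ds:
            c = dot(p, hess_times(v, d))
            d = [dk - c * pk for dk, pk in zip(d, p)]
        nrm = local_norm(d, v)
        ds.append([dk / nrm for dk in d])
    return ds

def line_search(y, d, iters=200):
    # Exact line search by bisection on phi'(alpha) = grad F(y + alpha d)^T d.
    # Boundedness of the polytope makes the feasible interval (lo, hi) finite.
    s = slacks(y)
    rates = [dot(a, d) for a in A]
    lo = max(-si / r for si, r in zip(s, rates) if r > 0)
    hi = min(-si / r for si, r in zip(s, rates) if r < 0)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        point = [yk + mid * dk for yk, dk in zip(y, d)]
        if dot(grad(point), d) > 0.0:
            hi = mid
        else:
            lo = mid
    alpha = 0.5 * (lo + hi)
    return [yk + alpha * dk for yk, dk in zip(y, d)]

def one_cycle(v):
    # One cycle: successive line searches along directions fixed at v.
    y = list(v)
    for d in conjugate_directions(v):
        y = line_search(y, d)
    return y

def newton_step(v):
    # With Hessian(v)-orthonormal d^i, Hessian(v)^{-1} = sum_i d^i (d^i)^T,
    # so x* = v - sum_i (grad F(v)^T d^i) d^i.
    g = grad(v)
    x = list(v)
    for d in conjugate_directions(v):
        c = dot(g, d)
        x = [xk - c * dk for xk, dk in zip(x, d)]
    return x

def dist_to_center(x):
    return local_norm([xk - wk for xk, wk in zip(x, OMEGA)], OMEGA)

v = [0.35, 0.32]
eps = dist_to_center(v)
x_star = newton_step(v)
y_n = one_cycle(v)
print(eps, dist_to_center(x_star), dist_to_center(y_n))
```

The sketch prints $\epsilon$ together with the distances of the Newton point and of the one-cycle point from the center, all measured in the norm $\|\cdot\|_\omega$, so the bound (2) can be checked numerically on this small instance. The Newton step reuses the conjugate directions only to avoid a separate linear solve; any solver for $\nabla^2 F(v)\, p = \nabla F(v)$ would serve equally well.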
Conjugate Direction Method (One Cycle)

Step 0: Let $y^0 = v$ and $k = 1$.
Step 1: Define $y^k \in P_{int}$ by $F(y^k) = \min\{F(y^{k-1} + \alpha d^k) : \alpha \in R\}$.
Step 2: If $k = n$, then stop. Otherwise, set $k = k + 1$ and go to Step 1.

Our main result is:

THEOREM 1.2 If $\epsilon \equiv \|v - \omega\|_\omega \le 1/6$, then $\|y^n - \omega\|_\omega \le 23\sqrt{n}\,\epsilon^2$.

In the remainder of this note we prove this theorem. As by-products, we derive several interesting properties of the potential function and its quadratic approximation. In particular, we will see in Corollary 2.5 that an inequality slightly stronger than (2) holds under the assumption of Theorem 1.1.

2. BASIC ANALYSIS

2.1. Symbols and Notation

We begin by introducing the notation. Let

$$f(x) = F(x) - F(\omega) \quad \text{for every } x \in P_{int}.$$

Obviously, the gradient and the Hessian of the function $f$ at every $x \in P_{int}$ coincide with those of the potential function $F$, respectively. By the definition of the analytic center $\omega$ of the polytope $P$, we have $f(\omega) = 0$ and $f(x) \ge 0$ for every $x \in P_{int}$. We also call $f$ a potential function. Define a quadratic approximation of the potential function $f$ at every $y \in P_{int}$ by

$$q_y(x) = f(y) + \nabla F(y)^T (x - y) + \tfrac{1}{2}(x - y)^T \nabla^2 F(y)(x - y) \quad \text{for every } x \in R^n. \tag{4}$$

In particular, we have

$$q_v(x) = q_v(x^*) + \tfrac{1}{2}(x - x^*)^T \nabla^2 F(v)(x - x^*) \quad \text{for every } x \in R^n. \tag{5}$$

Here, $x^* = v - (\nabla^2 F(v))^{-1} \nabla F(v)$ denotes the point generated by one Newton iteration for minimizing the potential function $f$ at a point $v \in P_{int}$ (see Theorem 1.1).

For each $x \in P_{int}$, define an $n \times m$ matrix $U_x$ by

$$U_x = \left( \frac{a_1}{a_1^T x - b_1}, \ \frac{a_2}{a_2^T x - b_2}, \ \dots, \ \frac{a_m}{a_m^T x - b_m} \right),$$

and for each pair $x, y \in P_{int}$, define the $m \times m$ diagonal matrix $R^y_x$ by

$$R^y_x = \mathrm{diag}\left( \frac{a_1^T y - b_1}{a_1^T x - b_1}, \ \frac{a_2^T y - b_2}{a_2^T x - b_2}, \ \dots, \ \frac{a_m^T y - b_m}{a_m^T x - b_m} \right).$$

With this notation, we can rewrite the norm $\|\xi\|_x$ of $\xi \in R^n$ defined by (1) as

$$\|\xi\|_x = \sqrt{\xi^T \nabla^2 F(x)\, \xi} = \|U_x^T \xi\| \quad (x \in P_{int}).$$

This equality is used very often without any reference. Also, with $e$ denoting the $m$-dimensional vector of ones, it is easy to verify the following equalities for every pair $\{x, y\} \subset P_{int}$:

$$\nabla F(x) = -U_x e, \tag{6}$$
$$\nabla^2 F(x) = U_x U_x^T, \tag{7}$$
$$U_y = U_x R^x_y, \tag{8}$$
$$e - R^x_y e = U_y^T (y - x). \tag{9}$$

2.2.
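Some Properties of the Quadratic Approximation

In this subsection we present three lemmas. The first one estimates the difference between the norms $\|\xi\|_y$ and $\|\xi\|_x$ of $\xi \in R^n$ when $y$ and $x$ are close. The second and the third ones evaluate the errors in the gradient vector $\nabla q_y$ and the Hessian matrix $\nabla^2 q_y$ of the quadratic approximation $q_y$ of the potential function $f$, respectively.

LEMMA 2.1 If $\xi \in R^n$, $\{x, y\} \subset P_{int}$, and $\|x - y\|_y \le \epsilon$, then

$$(1 - \epsilon)\|\xi\|_x \le \|\xi\|_y \le (1 + \epsilon)\|\xi\|_x. \tag{10}$$

Proof. For every $i$ $(i = 1, 2, \dots, m)$, let $r_i$ denote the $i$th diagonal element $(a_i^T x - b_i)/(a_i^T y - b_i)$ of the diagonal matrix $R^x_y$. By (9) and the assumption,

$$\sum_{i=1}^{m} (1 - r_i)^2 = \|e - R^x_y e\|^2 = \|U_y^T (y - x)\|^2 = \|x - y\|_y^2 \le \epsilon^2.$$

This implies that

$$\sum_{i=1}^{m} (1 - r_i)^2 \le \epsilon^2 \quad \text{and} \quad 1 - \epsilon \le r_i \le 1 + \epsilon \ (i = 1, 2, \dots, m). \tag{11}$$

On the other hand, we know by (8) that $U_y^T \xi = R^x_y U_x^T \xi$. Hence, taking the Euclidean norm of both sides of this equality, we obtain

$$\Big(\min_i r_i\Big) \|\xi\|_x \le \|\xi\|_y \le \Big(\max_i r_i\Big) \|\xi\|_x.$$

Thus, the desired inequalities in (10) follow.

LEMMA 2.2 If $\xi \in R^n$, $\{x, y\} \subset P_{int}$, and either $\|x - y\|_y \le \epsilon$ or $\|x - y\|_x \le \epsilon$, then

$$|(\nabla q_y(x) - \nabla F(x))^T \xi| \le \frac{\epsilon^2}{1 - \epsilon} \|\xi\|_y.$$

Proof. As in the proof of Lemma 2.1, we denote the $i$th diagonal element $(a_i^T x - b_i)/(a_i^T y - b_i)$ of the diagonal matrix $R^x_y$ by $r_i$ $(i = 1, 2, \dots, m)$. If $\|x - y\|_y \le \epsilon$, then the inequalities in (11) hold. Since $R^y_x = (R^x_y)^{-1}$, we see by symmetry that if $\|x - y\|_x \le \epsilon$, then the inequalities

$$\sum_{i=1}^{m} \left(1 - \frac{1}{r_i}\right)^2 \le \epsilon^2 \quad \text{and} \quad 1 - \epsilon \le \frac{1}{r_i} \le 1 + \epsilon \ (i = 1, 2, \dots, m) \tag{12}$$

hold. We can easily verify that

$$\begin{aligned}
|(\nabla q_y(x) - \nabla F(x))^T \xi|
&= |(\nabla F(y) + \nabla^2 F(y)(x - y) - \nabla F(x))^T \xi| && \text{(by (4))} \\
&= |(-U_y e + U_y U_y^T (x - y) + U_x e)^T \xi| && \text{(by (6) and (7))} \\
&= |(-U_y e + U_y(-e + R^x_y e) + U_y R^y_x e)^T \xi| && \text{(by (8) and (9))} \\
&\le \|-2e + R^x_y e + R^y_x e\| \cdot \|U_y^T \xi\| \\
&= \left( \sum_{i=1}^{m} \left( \frac{(1 - r_i)^2}{r_i} \right)^2 \right)^{1/2} \|\xi\|_y
 \le \left( \sum_{i=1}^{m} \frac{(1 - r_i)^2}{r_i} \right) \|\xi\|_y.
\end{aligned}$$

In either case ((11) or (12)) the desired inequality follows.

LEMMA 2.3 If $\xi \in R^n$, $\{x, y\} \subset P_{int}$ and $\|x - y\|_y \le \epsilon$, then

$$|\xi^T (\nabla^2 q_y(x) - \nabla^2 F(x))\, \xi| \le (2 + \epsilon)\epsilon \|\xi\|_x^2.$$

Proof. By the definition (4) of the quadratic function $q_y$ and (7),

$$\xi^T (\nabla^2 q_y(x) - \nabla^2 F(x))\, \xi = \xi^T (U_y U_y^T - U_x U_x^T)\, \xi = \|\xi\|_y^2 - \|\xi\|_x^2.$$

By Lemma 2.1,

$$\begin{aligned}
|\xi^T (\nabla^2 q_y(x) - \nabla^2 F(x))\, \xi|
&\le \max\{|(1 - \epsilon)^2 - 1|, \ |(1 + \epsilon)^2 - 1|\}\, \|\xi\|_x^2 \\
&\le \max\{(2 - \epsilon)\epsilon, \ (2 + \epsilon)\epsilon\}\, \|\xi\|_x^2 \\
&= (2 + \epsilon)\epsilon \|\xi\|_x^2.
\end{aligned}$$

2.3.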
Minimization of the Quadratic Approximation

The following lemma shows a relation between the minimizer $x^f$ of the potential function $f$ and the minimizer $x^q$ of its quadratic approximation $q_y$ over any affine subspace $S$ that intersects $P_{int}$.

LEMMA 2.4 Suppose $S$ is an affine subspace of $R^n$ which intersects $P_{int}$, and let $y \in P_{int}$ be any point. Let $x^f$ be the minimizer of the potential function $f$ over $S \cap P_{int}$, and let $x^q$ be the minimizer over $S$ of the quadratic approximation $q_y$ of $f$ at $y \in P_{int}$. Under these conditions, if either $\|x^f - y\|_y \le \epsilon$ or $\|x^f - y\|_{x^f} \le \epsilon$, then $\|x^q - x^f\|_y \le \epsilon^2/(1 - \epsilon)$.

Proof. From the definition of $x^f$ and $x^q$,

$$(x - z)^T \nabla F(x^f) = 0 \quad \text{and} \quad (x - z)^T \nabla q_y(x^q) = 0$$

for any $x$ and $z$ in $S$. In particular, these equalities hold for $x = x^q$ and $z = x^f$. Hence,

$$(x^q - x^f)^T \nabla F(x^f) = 0 \quad \text{and} \quad (x^q - x^f)^T \nabla q_y(x^q) = 0. \tag{13}$$

Therefore,

$$\begin{aligned}
\|x^q - x^f\|_y^2 &= \|U_y^T (x^q - x^f)\|^2 \\
&= (x^q - x^f)^T U_y U_y^T (x^q - x^f) \\
&= (x^q - x^f)^T \nabla^2 F(y)(x^q - x^f) && \text{(by (7))} \\
&= (x^q - x^f)^T (\nabla q_y(x^q) - \nabla q_y(x^f)) && \text{(see (4))} \\
&= (x^q - x^f)^T (\nabla F(x^f) - \nabla q_y(x^f)) && \text{(by (13))} \\
&\le \frac{\epsilon^2}{1 - \epsilon} \|x^q - x^f\|_y && \text{(by Lemma 2.2)}.
\end{aligned}$$

Dividing both sides by $\|x^q - x^f\|_y$ gives the desired inequality.

Using Lemma 2.4, we can strengthen the inequality (2) given in Theorem 1.1.

COROLLARY 2.5 Under the assumptions of Theorem 1.1,

$$\|x^* - \omega\|_\omega \le \frac{(1 + \epsilon)\epsilon^2}{1 - \epsilon}.$$

Proof. If we apply Lemma 2.4 to the case where $S = R^n$ and $y = v$, then $x^f = \omega$ and $x^q = x^*$, and hence $\|x^* - \omega\|_v \le \epsilon^2/(1 - \epsilon)$. On the other hand, by Lemma 2.1, $\|x^* - \omega\|_\omega \le (1 + \epsilon)\|x^* - \omega\|_v$. Thus the desired inequality follows.

2.4. Neighborhoods of the Analytic Center

We define an ellipsoidal neighborhood $E(\epsilon)$ of the analytic center $\omega$ by

$$E(\epsilon) = \{x : \|x - \omega\|_\omega \le \epsilon\} = \{\omega + s\xi : \|\xi\|_\omega = 1 \ \text{and} \ 0 \le s \le \epsilon\}$$

and the level set $L(c)$ of the potential function $f$ by

$$L(c) = \{x : f(x) \le c\}$$

for each $c \ge 0$. The following lemma shows a relation between the neighborhood $E(\epsilon)$ of the analytic center $\omega$ of the polytope $P$ and the level set $L(c)$:

LEMMA 2.6 For every $\epsilon \in [0, 1/6]$,

$$E(\epsilon) \subset L(2\epsilon^2/3) \subset E(\sqrt{2}\,\epsilon).$$

Proof. By Lemma 5.1
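of [15],

$$\left| f(\omega + \delta\xi) - \tfrac{1}{2}\delta^2\, \xi^T \nabla^2 F(\omega)\, \xi \right| \le \frac{\delta^3}{3(1 - \delta)}$$

for every $\delta \in [0, 1)$ and every $\xi \in R^n$ with $\xi^T U_\omega U_\omega^T \xi \le 1$. Since $\xi^T \nabla^2 F(\omega)\, \xi = \xi^T U_\omega U_\omega^T \xi = \|\xi\|_\omega^2$, it follows that

$$\left| f(\omega + \delta\xi) - \frac{\delta^2}{2} \right| \le \frac{\delta^3}{3(1 - \delta)} \tag{14}$$

for every $\delta \in [0, 1)$ and every $\xi \in R^n$ with $\|\xi\|_\omega = 1$. This inequality will be used later. Now, if $x = \omega + s\xi \in E(\epsilon)$, $0 \le s \le \epsilon \le 1/6$ and $\|\xi\|_\omega = 1$, then

$$\begin{aligned}
f(x) = f(\omega + s\xi) &\le \frac{s^2}{2} + \frac{s^3}{3(1 - s)} && \text{(by (14))} \\
&\le \frac{\epsilon^2}{2} + \frac{\epsilon^3}{3(1 - \epsilon)} \\
&\le \frac{2\epsilon^2}{3} && \text{(since } s \le \epsilon \le 1/6\text{)}.
\end{aligned}$$

This implies that $x \in L(2\epsilon^2/3)$. Thus, we have shown the first inclusion relation $E(\epsilon) \subset L(2\epsilon^2/3)$ of the lemma.

We now prove the second inclusion relation $L(2\epsilon^2/3) \subset E(\sqrt{2}\,\epsilon)$. It suffices to show that if $x = \omega + s\xi \notin E(\sqrt{2}\,\epsilon)$ and $\|\xi\|_\omega = 1$, then $x = \omega + s\xi \notin L(2\epsilon^2/3)$. Since $s > \sqrt{2}\,\epsilon$ and $f$ is a strictly convex function attaining its minimum value 0 at the analytic center $\omega$ of the polytope $P$, we have $f(\omega + s\xi) > f(\omega + \sqrt{2}\,\epsilon\,\xi)$. On the other hand, by (14) and $0 \le \epsilon \le 1/6$,

$$f(\omega + \sqrt{2}\,\epsilon\,\xi) \ge \frac{2\epsilon^2}{2} - \frac{(\sqrt{2}\,\epsilon)^3}{3(1 - \sqrt{2}\,\epsilon)} \ge \frac{2\epsilon^2}{3}.$$

Hence, $f(x) > 2\epsilon^2/3$, i.e., $x \notin L(2\epsilon^2/3)$.

3. PROOF OF THE MAIN THEOREM

By applying Lemma 2.4 with $S = R^n$, $y = v$, $x^f = \omega$ and $x^q = x^*$, we first observe that

$$\|x^* - \omega\|_v \le \frac{\epsilon^2}{1 - \epsilon}. \tag{15}$$

We also see by Lemma 2.6 that $v \in E(\epsilon) \subset L(2\epsilon^2/3)$, which implies $f(y^0) = f(v) \le 2\epsilon^2/3$. It follows from the construction of the sequence $\{y^k\}$ (see the Conjugate Direction Method described in Section 1) that

$$f(y^k) \le f(y^0) \le \frac{2\epsilon^2}{3}, \quad \text{i.e.,} \quad y^k \in L(2\epsilon^2/3) \ (k = 0, 1, \dots, n).$$

Hence, for every $k = 0, 1, \dots, n$,

$$\begin{aligned}
\|y^k - v\|_v &\le \frac{\|y^k - v\|_\omega}{1 - \epsilon} && \text{(by Lemma 2.1)} \\
&\le \frac{\|y^k - \omega\|_\omega + \|v - \omega\|_\omega}{1 - \epsilon} \\
&\le \frac{\sqrt{2}\,\epsilon + \epsilon}{1 - \epsilon} && \text{(by } y^k \in L(2\epsilon^2/3)\text{, Lemma 2.6 and } v \in E(\epsilon)\text{)} \\
&\le 3\epsilon && \text{(since } 0 \le \epsilon \le 1/6\text{)}.
\end{aligned}$$

Thus, we have shown that

$$\|y^k - v\|_v \le 3\epsilon \quad (k = 0, 1, \dots, n). \tag{16}$$

Recall that $d^1, d^2, \dots, d^n$ are $n$ $\nabla^2 F(v)$-orthonormal vectors satisfying (3).

LEMMA 3.1 If $y = x^* + \sum_{i=1}^{n} \lambda_i d^i$, then for every fixed $j$ $(j = 1, \dots, n)$, the minimum of the quadratic approximation $q_v$ of the potential function $f$ at $v \in P_{int}$ on the line $\{y + \alpha d^j : \alpha \in R\}$ is attained at $x^* + \sum_{i \ne j} \lambda_i d^i$.

Proof. By (5) and (3),

$$\begin{aligned}
q_v(y + \alpha d^j) &= q_v(x^*) + \frac{1}{2} \left( \sum_{i=1}^{n} \lambda_i d^i + \alpha d^j \right)^T \nabla^2 F(v) \left( \sum_{i=1}^{n} \lambda_i d^i + \alpha d^j \right) \\
&= q_v(x^*) + \frac{1}{2} \left( \sum_{i \ne j} \lambda_i^2 + (\lambda_j + \alpha)^2 \right).
\end{aligned}$$

Thus, the minimum of $q_v(y + \alpha d^j)$ with respect to $\alpha \in R$ is attained at $\alpha = -\lambda_j$.

Let $\mu_i$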
$(i = 1, 2, \dots, n)$ be real numbers such that

$$y^0 = x^* + \sum_{i=1}^{n} \mu_i d^i.$$

For each $k$ $(k = 1, 2, \dots, n)$, let $\nu_k$ denote the step length from $y^{k-1}$ taken at Step 1 of the Conjugate Direction Method along the direction $d^k$:

$$y^k = y^{k-1} + \nu_k d^k. \tag{17}$$

Then each $y^k$ $(k = 1, 2, \dots, n)$ can be rewritten as

$$y^k = y^0 + \sum_{i=1}^{k} \nu_i d^i = x^* + \sum_{i=1}^{k} (\mu_i + \nu_i) d^i + \sum_{i=k+1}^{n} \mu_i d^i.$$

For each $k$ $(k = 1, 2, \dots, n)$, let $p^k$ be the minimizer of $q_v$ on the line $\{y^{k-1} + \alpha d^k : \alpha \in R\}$. By Lemma 3.1,

$$p^k = x^* + \sum_{i=1}^{k-1} (\mu_i + \nu_i) d^i + \sum_{i=k+1}^{n} \mu_i d^i.$$

Hence, $y^k - p^k = (\mu_k + \nu_k) d^k$, and

$$\begin{aligned}
\|y^n - x^*\|_v^2 &= \left\| \sum_{i=1}^{n} (\mu_i + \nu_i) d^i \right\|_v^2 \\
&= \sum_{i=1}^{n} \|(\mu_i + \nu_i) d^i\|_v^2 && \text{(by (3))} \\
&= \sum_{i=1}^{n} \|y^i - p^i\|_v^2 \\
&\le \sum_{i=1}^{n} \left( \frac{(3\epsilon)^2}{1 - 3\epsilon} \right)^2 && \text{(by (16) and Lemma 2.4)} \\
&= n \left( \frac{(3\epsilon)^2}{1 - 3\epsilon} \right)^2.
\end{aligned}$$

Here Lemma 2.4 is applied with $S$ the line $\{y^{i-1} + \alpha d^i : \alpha \in R\}$, $y = v$, $x^f = y^i$ and $x^q = p^i$. Since $0 \le \epsilon \le 1/6$,

$$\|y^n - x^*\|_v \le \sqrt{n}\, \frac{(3\epsilon)^2}{1 - 3\epsilon} \le 18\sqrt{n}\,\epsilon^2.$$

Consequently, by Lemma 2.1 and (15),

$$\|y^n - \omega\|_\omega \le (1 + \epsilon)\left( \|y^n - x^*\|_v + \|x^* - \omega\|_v \right) \le (1 + \epsilon)\left( 18\sqrt{n}\,\epsilon^2 + \frac{\epsilon^2}{1 - \epsilon} \right) \le 23\sqrt{n}\,\epsilon^2. \tag{18}$$

4. CONCLUDING REMARKS

Our main theorem (Theorem 1.2) suggests a modification of Renegar's polynomial-time algorithm [10] for linear programs. Specifically, we can replace Newton's method, which is used to trace the central trajectory in Renegar's algorithm, by the Conjugate Direction Method. Such a modification does not, however, seem so attractive. The authors are not satisfied with the fact that in the Conjugate Direction Method the search directions $d^1, d^2, \dots, d^n$ are chosen as conjugate directions with respect to a fixed Hessian matrix $\nabla^2 F(v)$ of the potential function $F$ at $v$. Only the line search from $y^{k-1}$ is performed along the direction $d^k$ using the potential function $f$ at each iteration of the method. In standard applications of conjugate gradient methods to nonlinear minimization, the search direction $d^k$ is computed adaptively rather than fixed in advance. Our ultimate goal is to see whether conjugate gradient methods can be effectively incorporated in interior point algorithms. Thus it would require more effort to establish a result similar to Theorem 1.2 for conjugate gradient methods rather than for the Conjugate Direction Method.

References

[1] A.I.
Cohen, Rate of convergence of several conjugate gradient algorithms, SIAM J. Numer. Anal., 9 (1972), 248-259.
[2] R.M. Freund and K.-C. Tan, A method for the parametric center problem, with a strictly monotone polynomial-time algorithm for linear programming, Mathematics of Operations Research, 16 (1991), 775-801.
[3] C.C. Gonzaga, An algorithm for solving linear programming problems in O(n^3 L) operations, in: N. Megiddo, Ed., Progress in Mathematical Programming: Interior-Point and Related Methods, Springer-Verlag, New York, 1989, pp. 1-28.
[4] M. Hestenes, Conjugate Direction Methods in Optimization, Springer-Verlag, New York, 1980.
[5] M. Kojima, S. Mizuno and A. Yoshise, A primal-dual interior point algorithm for linear programming, in: N. Megiddo, Ed., Progress in Mathematical Programming: Interior-Point and Related Methods, Springer-Verlag, New York, 1989, pp. 29-47.
[6] M. Kojima, S. Mizuno and A. Yoshise, A polynomial-time algorithm for a class of linear complementarity problems, Mathematical Programming, 44 (1989), 1-26.
[7] N. Megiddo, Pathways to the optimal set in linear programming, in: N. Megiddo, Ed., Progress in Mathematical Programming: Interior-Point and Related Methods, Springer-Verlag, New York, 1989, pp. 131-158.
[8] R.D.C. Monteiro and I. Adler, Interior path following primal-dual algorithms. Part I: linear programming, Mathematical Programming, 44 (1989), 27-41.
[9] R.D.C. Monteiro and I. Adler, Interior path following primal-dual algorithms. Part II: convex quadratic programming, Mathematical Programming, 44 (1989), 43-66.
[10] J. Renegar, A polynomial-time algorithm based on Newton's method for linear programming, Mathematical Programming, 40 (1988), 59-94.
[11] G. Sonnevend, An 'analytical centre' for polyhedrons and new classes of global algorithms for linear (smooth, convex) programming, in: Lecture Notes in Control and Information Sciences, Vol. 84, Springer-Verlag, New York, 1985, pp. 866-876.
[12] J.
Stoer, On the relation between quadratic termination and convergence properties of minimization algorithms, Numer. Math., 28 (1977), 343-366.
[13] K. Tanabe, Centered Newton method for mathematical programming, in: M. Iri and K. Yajima, Eds., System Modeling and Optimization, Springer-Verlag, New York, 1988, pp. 197-206.
[14] K. Tanabe, Centered Newton method for linear programming: Interior and 'exterior' point method (in Japanese), in: K. Tone, Ed., New Methods for Linear Programming, 3, The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan, 1990, pp. 98-100.
[15] P.M. Vaidya, A locally well-behaved potential function and a simple Newton-type method for finding the center of a polytope, in: N. Megiddo, Ed., Progress in Mathematical Programming: Interior-Point and Related Methods, Springer-Verlag, New York, 1989, pp. 131-158.
[16] P.M. Vaidya, An algorithm for linear programming which requires O(((m + n)n^2 + (m + n)^{1.5} n)L) arithmetic operations, Mathematical Programming, 47 (1990), 175-201.