Using an Interior Point Method for the Master Problem in a
Decomposition Approach†
J. Gondzio‡, R. Sarkissian and J.-P. Vial
Logilab, HEC, Section of Management Studies, University of Geneva,
102 Bd Carl Vogt, CH-1211 Geneve 4, Switzerland
Technical Report 1995.30
October 24, 1995, revised May 15, 1996
Abstract
We address some of the issues that arise when an interior point method is used to handle
the master problem in a decomposition approach. The main points concern the efficient
exploitation of the special structure of the master problem to reduce the cost of a single
interior point iteration. The particular structure is the presence of GUB constraints and the
natural partitioning of the constraint matrix into blocks built of cuts generated by different
subproblems.
The method can be used in a fairly general case, i.e., in any decomposition approach
whenever the master is solved by an interior point method in which the normal equations
are used to compute orthogonal projections.
Computational results demonstrate its advantages for one particular decomposition approach: the Analytic Center Cutting Plane Method (ACCPM) applied to solve large scale
nonlinear multicommodity network flow problems (up to 5000 arcs and 10000 commodities).
Key words. Convex programming, interior point methods, cutting plane methods,
linear programming, network optimization.
This research has been supported by the Fonds National de la Recherche Scientifique Suisse, grant #12-34002.92.
† To appear in European Journal of Operational Research.
‡ On leave from the Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw,
Poland.
1 Introduction
Decomposition methods have attracted a lot of attention during the past few years. Some of them
have become the methods of choice for solving particular classes of very large scale optimization problems. This has been possible due to the development of well understood decomposition
approaches as well as due to their suitability to the new parallel and/or distributed computing environments. The analysis of different decomposition approaches is beyond the scope of this paper.
The reader interested in decomposition methods and in their modern (parallel) implementations
is encouraged to consult the references in the survey of Eckstein [9].
In this paper we are concerned with classical decomposition approaches such as [6, 2, 25].
Their common feature is the use of an oracle which generates cutting planes that continuously improve (shrink) the localization set containing the solution of the problem. The master problems
consist in the minimization of the associated min-max problem and can be replaced with the
minimization of some linear objective over the localization set. More generally, the master problem may consist in finding a particular point in the current localization set. It is not always
necessary, for example, to find the optimum of the relaxation; sometimes it may be advantageous
to find "central points" in it [29, 10, 14].
Independently of the goal of the master, which may look for the optimum or for some "center" of the
localization set, a variant of the linear programming approach is applied.
Interior point methods (IPMs) developed in the last decade have proved to be extremely efficient
tools for solving large scale linear optimization problems. These optimization techniques are
now well understood [19] and, as the computational results have shown (cf. [18, 26]), they
usually run significantly faster than modern implementations of the simplex method if the size
of the problem solved is considerable. It is thus natural to apply a variant of an interior point
method to solve large master problems in a decomposition approach.
A key issue in the solution of real-life large scale problems with the decomposition approach
is the ability to take advantage of the disaggregation of cuts. The reason for the superiority
of the disaggregate approach is simple. This technique allows one to accumulate all the necessary
information faster: every call to the oracle adds a considerable number of new cuts, and a
moderate number of such calls (in our experience, this number seldom exceeds 30) suffices
for the cutting plane process to converge. Clearly, the use of disaggregation causes a considerable
increase of the size of the restricted master problem since it adds an important number of special
GUB rows to it (cf. Dantzig [5]). Independently, it increases the total number of columns
generated. However, the resulting special structure of the restricted master problem can be
exploited efficiently both within the simplex method and within an interior point algorithm.
The techniques for exploiting it in the latter approach are the subject of this paper.
The computational effort of a single iteration of any IPM is dominated by the factorization of
the matrix AΘA^T and the following one or more backsolves with this factorization (cf. [18, 19]),
where A denotes the LP constraint matrix and Θ is a diagonal scaling matrix that changes in
subsequent iterations. In the master problem of the decomposition approach, matrix A has a lot
of special structure: it is naturally partitioned into blocks associated with different subproblems
and it contains a number of convexity constraints (GUB rows).
We take advantage of the presence of these GUB rows (sometimes a considerable number of
them) and explicitly pivot out their contribution from AΘA^T. The resulting smaller matrix, of
size equal to the number of coupling constraints, still displays a particular structure reflecting
the partition of the constraint matrix into blocks associated with different subproblems. We exploit
this structure to simplify the sparsity pattern analysis when building the adjacency matrix.
Let us mention that several authors have already addressed the issue of structure exploitation
within the context of IPMs [3, 4, 21, 30].
The paper is organized in the following way. In Section 2 we discuss the particular structure of
the LP constraint matrix in the master problem. A structure exploiting representation of the
inverse of the AΘA^T matrix is the subject of Section 3. The method presented in this paper has been
implemented and tested in the context of one particular decomposition approach, the Analytic
Center Cutting Plane Method (ACCPM) of Goffin, Haurie and Vial [14], applied to solving large
scale nonlinear multicommodity network flow problems [1]. In Section 4 we briefly recall the
ACCPM approach. In Section 5 we describe the multicommodity flow problems and in Section 6
we give the computational results. Finally, in Section 7 we give our conclusions.
2 Master problem
2.1 The decomposition principle
Let us consider a convex optimization problem

    minimize    ⟨c, x⟩ + f(x)
    subject to  x ∈ X,                                                    (1)
                x̲ ≤ x ≤ x̄,

where f: R^n → R is a convex function, X ⊂ R^n is a convex set such that X ⊆ {x : f(x) < ∞},
and the linear term ⟨c, x⟩ in the objective has been introduced only because it often
appears in applications.
We impose an additional (standard) assumption in non-differentiable optimization: both the
epigraph of f and the set X are approximated by intersections of hyperplanes. The procedure
called the oracle first tests whether x ∈ X. If x ∈ X, then the oracle generates a subgradient
ξ ∈ ∂f(x) at x:

    f(x') ≥ f(x) + ⟨ξ, x' − x⟩,  ∀ x' ∈ X.                                (2)
This inequality is a supporting hyperplane for the optimized function; we shall call it an optimality cut. If x ∉ X, then the oracle generates a hyperplane (η, τ) ∈ R^n × R for X that
separates x from X, i.e.,

    ⟨η, x⟩ < τ   and   ⟨η, x'⟩ ≥ τ,  ∀ x' ∈ X.                            (3)

This hyperplane improves the description of the feasible set X; we shall call it a feasibility cut.
Suppose a sequence of points {x^l : l ∈ Λ}, at which the oracle has been called, is given. Some
of these points were feasible, x^l ∈ X, l ∈ Λ_opt; the others were not, x^l ∉ X, l ∈ Λ_fsb = Λ \ Λ_opt.
The oracle generated a set of optimality cuts at x^l, l ∈ Λ_opt, that define a piecewise linear
approximation f̄: R^n → R to the convex function f:

    f̄(x) = max_{l ∈ Λ_opt} { f(x^l) + ⟨ξ^l, x − x^l⟩ }.                   (4)

It also generated a set of feasibility cuts at x^l, l ∈ Λ_fsb, that define an outer approximation X̄
of the feasible set X:

    X̄ = { x : ⟨η^l, x⟩ ≥ τ^l, ∀ l ∈ Λ_fsb }.                              (5)
The program

    minimize    ⟨c, x⟩ + f̄(x)
    subject to  x ∈ X̄,                                                    (6)
                x̲ ≤ x ≤ x̄

is an outer relaxation of the master problem (1). Note that X̄ is a polyhedron and f̄ is a
polyhedral (piecewise linear) function. The above program can thus be reformulated as a linear
problem, the relaxed master program

    minimize    ζ + ⟨c, x⟩
    subject to  ζ ≥ f(x^l) + ⟨ξ^l, x − x^l⟩,  ∀ l ∈ Λ_opt,                 (7)
                ⟨η^l, x⟩ ≥ τ^l,  ∀ l ∈ Λ_fsb,
                x̲ ≤ x ≤ x̄.

Its solution gives a lower bound for the master problem (1). Observe also that the best feasible
solution in the generated sequence provides an upper bound for the master problem:

    θ_u = min_{l ∈ Λ_opt} { ⟨c, x^l⟩ + f(x^l) }.                           (8)
For a given upper bound θ̄, let us consider the following polyhedral approximation, called a
localization set:

    L(θ̄) = { (ζ, x) :  ζ ≤ θ̄,
                        ζ ≥ ⟨c, x⟩ + f(x^l) + ⟨ξ^l, x − x^l⟩,  ∀ l ∈ Λ_opt,    (9)
                        ⟨η^l, x⟩ ≥ τ^l,  ∀ l ∈ Λ_fsb,
                        x̲ ≤ x ≤ x̄ }.

It is the best (outer) approximation of the optimal set in (1).
We are now ready to give a prototype cutting plane method [25]. Its main steps are the following:

1. Compute a point (ζ, x) ∈ L(θ_u) and an associated lower bound θ.
2. Call the oracle at (ζ, x). The oracle returns one or several cuts; if all of them are optimality
   cuts the oracle also generates an upper bound ⟨c, x⟩ + f(x).
3. Update the bounds:
   (a) if x is feasible, θ_u := min{⟨c, x⟩ + f(x), θ_u};
   (b) θ_l := max{θ, θ_l}.
4. Update the upper bound in the definition of the localization set (9) and add the new cuts.

These steps are repeated until a feasible point to (1) is found such that θ_u − θ_l falls below a
prescribed optimality tolerance.
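The following Python sketch makes the prototype concrete. It is only an illustration: the oracle and query_point callables, their return conventions and the cut container are hypothetical placeholders standing in for whatever master solver and subproblem solvers a particular decomposition method provides.

# A minimal sketch of the prototype cutting plane loop. The interfaces of
# `oracle` and `query_point` are illustrative assumptions, not part of the paper.
def cutting_plane(oracle, query_point, tol=1e-6, max_outer=100):
    theta_u, theta_l = float('inf'), -float('inf')   # upper and lower bounds
    cuts = []                                        # accumulated cuts defining L(.)
    for _ in range(max_outer):
        # Step 1: a point in the localization set L(theta_u) and a lower bound.
        (zeta, x), theta = query_point(cuts, theta_u)
        theta_l = max(theta_l, theta)                # Step 3(b)
        # Step 2: call the oracle at x; it returns optimality or feasibility cuts.
        kind, new_cuts, value = oracle(x)            # value = <c,x> + f(x) when x is feasible
        if kind == 'optimality':                     # Step 3(a): x is feasible
            theta_u = min(theta_u, value)
        cuts.extend(new_cuts)                        # Step 4: enlarge the cut collection
        if theta_u - theta_l < tol:                  # stop on a small optimality gap
            break
    return theta_u, theta_l, cuts

Which point query_point returns is exactly the strategic choice discussed next.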
There exists a number of possible strategies that can be applied at step 1. The optimal point
strategy, for example, consists in solving every relaxed master program (7) to optimality [2, 6].
The central point strategy consists in finding some "central point" in the localization set [10, 14, 29].
The discussion of their advantages is beyond the scope of this paper. Whatever strategy is used,
however, we are concerned with a linear optimization problem (7). Whenever its size is large
enough, say, when it grows to tens of thousands of rows and columns, one can expect advantages
from an application of an interior point method.
Let us finally observe that we have been concerned so far with a fairly general convex optimization problem (1). In many applications, one can benefit from its special structure, such as
the additivity of the objective, which writes

    f(x) = Σ_{i=1}^m f_i(x),                                              (10)

and the intersection type definition of the feasible set

    X = ∩_{j=1}^p X_j.                                                    (11)
In such a case, it is possible to modify the relaxed master program so that multiple cuts are
introduced at a time:

    minimize    Σ_{i=1}^m ζ_i + ⟨c, x⟩
    subject to  ζ_i ≥ f_i(x^l) + ⟨ξ_i^l, x − x^l⟩,  ∀ l ∈ Λ_opt, i = 1,...,m,     (12)
                ⟨η_j^l, x⟩ ≥ τ_j^l,  ∀ l ∈ Λ_fsb, j = 1,...,p,
                x̲ ≤ x ≤ x̄.

Although such a disaggregation results in an important increase of the size of the restricted
master program compared with (7), it has unquestionable computational advantages. It makes
it possible to collect all the necessary information (cuts active at the optimal solution) faster and
usually significantly reduces the number of outer iterations.
2.2 Special structure of the master problem
Most decomposition methods solve the dual of (12).
Let λ_i^l, l ∈ Λ_opt, i = 1,...,m, denote the dual variables associated with the optimality cuts that involve the function subgradients ξ_i^l, l ∈ Λ_opt, i = 1,...,m, respectively. For a given i = 1,...,m, we
group all variables λ_i^l, l ∈ Λ_opt, into a vector λ_i ∈ R^{|Λ_opt|} and, accordingly, group all subgradients
ξ_i^l, l ∈ Λ_opt, into a matrix G_i^T ∈ R^{|Λ_opt| × n}.
Analogously, let μ_j^l, l ∈ Λ_fsb, j = 1,...,p, denote the dual variables associated with the feasibility
cuts (separating hyperplanes) (η_j^l, τ_j^l), l ∈ Λ_fsb, j = 1,...,p. For a given j = 1,...,p, we group
all variables μ_j^l, l ∈ Λ_fsb, into a vector μ_j ∈ R^{|Λ_fsb|} and group all vectors η_j^l, l ∈ Λ_fsb, into a
matrix −H_j^T ∈ R^{|Λ_fsb| × n}.
With this new notation, (12) may be rewritten as

    minimize    Σ_{i=1}^m ζ_i + ⟨c, x⟩
    subject to  ζ_i e_i − G_i^T x ≥ g_i,  i = 1,...,m,                    (13)
                −H_j^T x ≥ τ_j,  j = 1,...,p,
                x̲ ≤ x ≤ x̄,

where e_i is a vector of ones in R^{|Λ_opt|}, and the right hand side vectors g_i ∈ R^{|Λ_opt|}, i = 1,...,m,
and τ_j ∈ R^{|Λ_fsb|}, j = 1,...,p, are defined accordingly. The dual to the above problem writes
    maximize    Σ_{i=1}^m ⟨g_i, λ_i⟩ + Σ_{j=1}^p ⟨τ_j, μ_j⟩ + ⟨x̲, s⁻⟩ − ⟨x̄, s⁺⟩
    subject to  s⁺ − s⁻ + Σ_{i=1}^m G_i λ_i + Σ_{j=1}^p H_j μ_j = −c,
                ⟨e_i, λ_i⟩ = 1,   i = 1,...,m,                            (14)
                λ_i ≥ 0,          i = 1,...,m,
                μ_j ≥ 0,          j = 1,...,p,
                s⁺ ≥ 0,  s⁻ ≥ 0,

where s⁺ and s⁻ are the dual variables corresponding to the box constraints. This linear program has
n + m constraints and 2n + m|Λ_opt| + p|Λ_fsb| variables. Usually, the number of columns
significantly exceeds the number of rows. In nontrivial applications, e.g., when the number of
subproblems grows to thousands, (14) can become a very large linear optimization problem. We
suppose that this is the case.
Consequently, we consider it natural to solve (14) with the most efficient linear programming
approach available nowadays, the interior point method. Moreover, we shall specialize the
interior point algorithm so that it can exploit the particular structure of the restricted master
problem, in which the LP constraint matrix is naturally partitioned into blocks associated with
different subproblems and contains a considerable number of GUB rows [5].
3 Projections in interior point methods
Interior point methods (IPMs) for linear programming are now deeply understood and very
efficient tools [19, 26] that can tackle very large problems. Their general discussion is beyond
the scope of this paper. Instead, we shall look closer at the computational effort of a single IPM
iteration.
3.1 Linear algebra of IPM iteration
A common feature of almost all IPMs is that they can be interpreted in terms of following the
path of centers that leads to the optimal solution (see, e.g., [20] and [28] for up to date
references). With some abuse of mathematics, a basic iteration of a path-following algorithm
consists of moving from one point in a neighborhood of the central path to another one, called the
target [22], that preserves the property of lying in a neighborhood of the central path and reduces
the distance to optimality measured with some estimation of the duality gap. Such a move can
in principle involve more than one step towards the target. Depending on how significant the
update of the target is (and, consequently, whether just one or more Newton steps are needed to
reach the vicinity of the new target), one distinguishes between short and long step methods. Due
to the considerable cost of every Newton step, usually (at least in implementations) one Newton
step is allowed before a new target is defined and a very loose requirement on the proximity is
used.
Every Newton step requires computing at least one orthogonal projection onto the null space of
a scaled linear operator AΘ^{1/2}, where A is the LP constraint matrix and Θ is a positive diagonal
scaling matrix that changes in subsequent iterations. Modern LP codes use direct methods to
solve the underlying linear algebra problem

    [ −Θ^{-1}   A^T ] [ Δx ]   [ r ]
    [    A       0  ] [ Δy ] = [ h ],                                     (15)

where Δx and Δy are the Newton directions and r and h define appropriate right hand side vectors.
Different variants of IPMs use different definitions of Θ, r and h, but the computational effort
needed to compute the Newton directions remains basically the same. (This explains why a
comparison of the efficiency of different algorithms is often limited to the comparison of the
numbers of iterations needed to reach the desired accuracy.)
One can either apply the Bunch-Parlett-Kaufmann factorization to the full (symmetric and
indefinite) system (15) or one can reduce it further to the normal equations. In the latter case,
the equation

    (A Θ A^T) Δy = A Θ r + h                                              (16)

is easily obtained from the former after the elimination of

    Δx = Θ (A^T Δy − r).
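Spelled out, the elimination works as follows: the first block row of (15), −Θ^{-1} Δx + A^T Δy = r, gives Δx = Θ(A^T Δy − r); substituting this into the second block row, A Δx = h, yields

    A Θ (A^T Δy − r) = h,   i.e.   (A Θ A^T) Δy = A Θ r + h,

which is exactly (16).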
The normal equations system is positive definite; hence, Cholesky decomposition can be applied
to solve it.
There are no general rules for choosing the most suitable factorization for a given problem. However,
in some special cases one of them has definite advantages over the other. In particular, this is
the case when one applies an IPM to solve the relaxed master problem (14). The hint comes from
the analysis of the problem dimensions. Note that in the LP constraint matrix of (14) the
number of variables always significantly exceeds the number of rows (it is usual to observe a
ratio of 10 to 20 between these two numbers). Consequently, we have to reject an application
of the augmented system approach (15) due to its prohibitively large size.
3.2 Normal equations matrix for restricted master problem
The constraint matrix in the (dual) restricted master problem (14) can be rearranged to have
the following form:

        [ G_0    G_1     G_2     ...    G_m   ]
        [        e_1^T                        ]
    A = [                e_2^T                ],                          (17)
        [                        ...          ]
        [                               e_m^T ]

where

    G_0 = [ I | −I | H_1 | H_2 | ... | H_p ].                             (18)
The IPM scaling matrix can be partitioned accordingly,

    Θ = diag(Θ_0, Θ_1, Θ_2, ..., Θ_m),                                    (19)

producing the following normal equations matrix:

    A Θ A^T = [ C     B ]
              [ B^T   D ],                                                (20)

where

    C = G_0 Θ_0 G_0^T + Σ_{i=1}^m G_i Θ_i G_i^T,                          (21)
    B = ( G_1 Θ_1 e_1, G_2 Θ_2 e_2, ..., G_m Θ_m e_m ),                   (22)
    D = diag( e_i^T Θ_i e_i ).                                            (23)
Let us now look closer at the matrices B, C and D. Column i of B ∈ R^{n×m} is a combination of
the columns of G_i, D ∈ R^{m×m} is a diagonal matrix, and C ∈ R^{n×n} is a union of adjacency matrices
associated with the optimality cuts (extreme points) of all subproblems and the adjacency matrix
G_0 Θ_0 G_0^T that gathers the contribution of all feasibility cuts (extreme rays).
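As a concrete illustration of (21)-(23), the following NumPy sketch assembles the blocks C, B and D from the cut blocks G_0, G_1, ..., G_m and the diagonal scaling blocks Θ_0, ..., Θ_m. Dense arrays are used only to keep the example short; the implementation discussed in this paper works with sparse structures, and the function and argument names are, of course, illustrative.

import numpy as np

def assemble_normal_blocks(G0, Gs, theta0, thetas):
    # G0     : n x k0 array, the block [I | -I | H_1 | ... | H_p]
    # Gs     : list of n x k_i arrays G_1, ..., G_m (optimality cuts of subproblem i)
    # theta0 : length-k0 vector, the diagonal of Theta_0
    # thetas : list of length-k_i vectors, the diagonals of Theta_1, ..., Theta_m
    n, m = G0.shape[0], len(Gs)
    C = G0 @ (theta0[:, None] * G0.T)          # G_0 Theta_0 G_0^T
    B = np.empty((n, m))
    d = np.empty(m)
    for i, (Gi, ti) in enumerate(zip(Gs, thetas)):
        C += Gi @ (ti[:, None] * Gi.T)         # + G_i Theta_i G_i^T          (21)
        B[:, i] = Gi @ ti                      # b_i = G_i Theta_i e_i        (22)
        d[i] = ti.sum()                        # d_ii = e_i^T Theta_i e_i     (23)
    return C, B, d                             # D = diag(d)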
3.3 Inversion of AΘA^T
As explained in the previous sections, a single IPM iteration requires the solution of the normal
equations system (16). The system writes

    [ C     B ] [ y_1 ]   [ z_1 ]
    [ B^T   D ] [ y_2 ] = [ z_2 ],                                        (24)

where the unknown y and the right hand side are partitioned according to (20), i.e., y_1, z_1 ∈ R^n
and y_2, z_2 ∈ R^m.
Our inversion technique exploits the special structure of the normal equations matrix. First, we
pivot out the whole diagonal block D, as this operation introduces no fill-in:

    [ C     B ]   [ I   B D^{-1} ] [ C − B D^{-1} B^T    0 ] [ I            0 ]
    [ B^T   D ] = [ 0   I        ] [ 0                   D ] [ D^{-1} B^T   I ],      (25)

and we build

    S = C − B D^{-1} B^T.                                                 (26)
Note that matrix S is symmetric by definition. It is also positive definite, as it is a Schur
complement resulting from the elimination of a diagonal block from the positive definite
matrix A Θ A^T (cf. Golub and Van Loan [16]). Consequently, we can compute its Cholesky
decomposition

    S = L_0 D_0 L_0^T,                                                    (27)

and obtain a complete inverse representation

    A Θ A^T = [ L_0   B D^{-1} ] [ D_0   0 ] [ L_0^T         0 ]
              [ 0     I        ] [ 0     D ] [ D^{-1} B^T    I ].         (28)
Thus the solution of the equation with A Θ A^T is dominated by the solution of one equation with L
and one equation with L^T, where

    L = [ L_0   B D^{-1} ]
        [ 0     I        ].                                               (29)
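A dense NumPy/SciPy sketch of the resulting solve is given below. It follows (25)-(29) literally: eliminate the diagonal block D, form the Schur complement S, factorize it by Cholesky, and recover y_1 and y_2 by block substitution. The dense cho_factor routine stands in here for the sparse Cholesky code [17] used in the actual implementation.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_normal_equations(C, B, d, z1, z2):
    # Solves [[C, B], [B^T, D]] [y1; y2] = [z1; z2] with D = diag(d), as in (24)-(29).
    S = C - (B / d) @ B.T                      # Schur complement S = C - B D^{-1} B^T   (26)
    factor = cho_factor(S, lower=True)         # Cholesky factorization of S             (27)
    y1 = cho_solve(factor, z1 - (B / d) @ z2)  # solve S y1 = z1 - B D^{-1} z2
    y2 = (z2 - B.T @ y1) / d                   # back substitution with the D block
    return y1, y2

# Example use with the blocks of the previous sketch (hypothetical data):
# C, B, d = assemble_normal_blocks(G0, Gs, theta0, thetas)
# y1, y2 = solve_normal_equations(C, B, d, z1, z2)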
3.4 Sparsity issues
In numerous applications of decomposition approaches, the cuts generated by subproblems are sparse.
In consequence, the matrices in the restricted master problem are also sparse. This fact has to
be taken into account in the decomposition of A Θ A^T, especially if we aim at solving very large
problems with thousands of subproblems.
Analysis and factorization of S
The most computationally involved operation in the solution of the normal equations is usually
the building and factorization of matrix S. Recall

    S = G_0 Θ_0 G_0^T + Σ_{i=1}^m G_i Θ_i G_i^T − Σ_{i=1}^m (1/d_ii) b_i b_i^T,      (30)

where b_i = G_i Θ_i e_i denotes column i of B.
There is a choice in the implementation whether matrix B should be computed explicitly or not.
In general, there is no such need; however, for reasons of efficiency in exploiting the
sparsity of the cuts, storing B explicitly is highly advantageous. It is even essential when building the adjacency
structure of S. Note that this adjacency structure is the union of the structure of G_0 G_0^T, those of
all subproblems G_i G_i^T, i = 1, 2, ..., m, and, finally, that of B B^T. The following results are useful.
Observation 1. The sparsity structure of column b_i is the union of the sparsity patterns of all
columns of G_i.
Proof follows directly from the definition of column b_i as a linear combination of the columns of
G_i.
Observation 2. The sparsity structure of the adjacency matrix of G_i Θ_i G_i^T is a subset of the
sparsity pattern of the product b_i b_i^T.
Proof follows immediately from Observation 1.
These observations have important consequences in practice. Observation 2, for example, simplifies the analysis of the adjacency matrix implied by subproblem i,

    G_i Θ_i G_i^T − (1/d_ii) b_i b_i^T,

to the analysis of only one clique [8] associated with column b_i. This not only yields important
savings in the time of building the adjacency matrix of S, but it also simplifies the reordering of this
matrix to get maximum sparsity in the Cholesky factor. The reordering algorithm takes
advantage of the presence of supernodes [12] implied by the columns of B. Note that the
number of such columns is equal to m, i.e., it is significantly smaller than the overall number of
columns (cuts) in A.
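In a sparse implementation, Observations 1 and 2 mean that the symbolic pattern of S can be predicted from one clique per subproblem (the pattern of b_i) plus the cliques coming from the columns of G_0. The short sketch below illustrates this idea with Python sets of row indices; the real code naturally works on compressed sparse structures instead, and the function name is illustrative.

def pattern_of_S(G0_patterns, G_patterns):
    # G0_patterns : list of sets, the row pattern of each column of G_0
    # G_patterns  : list of lists of sets; G_patterns[i][k] is the row pattern
    #               of column k of G_i
    # Returns the predicted lower-triangular pattern of S as a set of (row, col) pairs.
    pattern = set()
    # Contribution of G_0 Theta_0 G_0^T: one clique per column of G_0.
    for col in G0_patterns:
        pattern.update((r, c) for r in col for c in col if r >= c)
    # Contribution of subproblem i: by Observation 2 it is covered by the single
    # clique on the pattern of b_i, which by Observation 1 is the union of the
    # patterns of the columns of G_i (this also covers the term b_i b_i^T / d_ii).
    for cols in G_patterns:
        b_i = set().union(*cols) if cols else set()
        pattern.update((r, c) for r in b_i for c in b_i if r >= c)
    return pattern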
Matrix B
For the reasons mentioned earlier, we store matrix B in explicit form. This is in contrast
with the previous implementation [13], in which C and S were stored as dense matrices and B was
handled implicitly in order to make possible the exploitation of the supersparsity of the network
cuts in A. Knowing matrix B explicitly clearly simplifies its multiplications with a vector, which
contribute non-negligibly to the backsolves with L and L^T.
4 ACCPM: an example of a decomposition approach
A single (outer) iteration in any decomposition method [6, 2, 25] consists of two main steps:
first, one computes some solution of the restricted master problem; next, the oracle generates
(one or more) new cuts that improve (shrink) the localization set.
In the Analytic Center Cutting Plane Method (ACCPM) of Goffin, Haurie and Vial [14], the restricted master problem is not solved to optimality. Instead, an analytic center of the localization
set is found.
The dual variables associated with the solution are thus not "extreme points" as is the case in
the Dantzig-Wolfe algorithm. They are "central prices" that contain richer information about
the economic value of the common resources, taking into account all past proposals [15] and
not only a subset of them whose combination, locally, looks optimal. Cuts generated from the
analytic center are likely to be deeper, thereby entailing faster convergence than the Dantzig-Wolfe procedure.
Different interior point methods can be used to compute the analytic center. In ACCPM we
chose a variant [7] of Karmarkar's projective algorithm [24]. The reader interested in a detailed
presentation of this particular variant of the interior point method can consult [14, 15]. An
excellent, complete presentation of ACCPM is given by du Merle [27].
Although there exist important particularities of this interior point method imposed by the fact
that it looks for an analytic center (and not for an optimum of the problem), a single iteration
of it does not differ much from an iteration of any other IPM. Viewed from the perspective
of calculations, a single iteration is a Newton step in which the main computational effort is, as
usual, the factorization and solution of the normal equations system (16).
5 Nonlinear multicommodity network flow problems
We are given a graph G = (V, A), where V denotes the set of nodes and A ⊆ {(s,t) : s ∈ V, t ∈ V, s ≠ t}
is the set of (directed) arcs. We define the transpose of A as A^T := {(s,t) : (t,s) ∈ A}
and T(a) := (s,t) for a = (t,s) ∈ A, a mapping that associates to every directed arc (s,t) the
arc with the reverse orientation (t,s). Clearly, A^T = T(A). Now, two directed arcs a and T(a)
represent the undirected arc {s,t}.
Next, for the augmented graph Ḡ = (V, Ā), where Ā := A ∪ A^T, we denote by n_n and n_a the
numbers of nodes and arcs in it, respectively, and by N the n_n × n_a node-arc incidence matrix
of Ḡ.
The set of commodities I is defined by exogenous flow vectors (supplies and demands) d^i =
(d_s^i)_{s ∈ V} that satisfy e_V^T d^i = 0, with e_V a vector of ones. These flows must be shipped through
Ḡ. The feasible flows for commodity i, x^i = (x_a^i)_{a ∈ Ā}, are members of

    F^i = { x^i ≥ 0 : N x^i = d^i }.
The nonlinear multicommodity network flow problem can be formulated as

    minimize    Σ_{i ∈ I} ⟨c^i, x^i⟩ + Σ_{a ∈ A₀} y_a / (Γ_a − y_a)              (31)
    subject to  Σ_{i ∈ I} (x_a^i + x_{T(a)}^i) ≤ y_a,   ∀ a ∈ A₀,                (32)
                x^i ∈ F^i,   ∀ i ∈ I,                                            (33)
                0 ≤ y_a ≤ γ_a,   ∀ a ∈ A₀ ⊆ Ā.                                   (34)

The vector y = (y_a)_{a ∈ A₀} is meant to represent the joint total arc flow of a and T(a). It is in fact
an upper bound on it, which is penalized in the objective for approaching the arc capacity Γ_a.
We briefly recall the standard decomposition approach to solving multicommodity network flow
problems. First, the complicating capacity constraints are dualized to form a partial Lagrangian.
This Lagrangian is minimized in the variables x and y, leading naturally to an additive formulation of the objective function. We then take advantage of this additivity to disaggregate the
optimality cuts, which is a necessary condition for solving the problem efficiently.
The reader is encouraged to consult [13] for more details on the problem formulation and on the
application of the decomposition method to its solution.
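For intuition only, the sketch below evaluates the per-commodity part of the dualized problem as a shortest path computation under given arc prices, which is where the "shortest path type cuts" reported in Section 6 come from. It assumes, purely for illustration, that a commodity is a single origin-destination pair with a given demand; the price vector u, the arc list and the cost vector are hypothetical inputs, and the y-part of the Lagrangian is omitted.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def commodity_subproblem(n_nodes, arcs, cost, u, origin, dest, demand):
    # arcs : list of (tail, head) pairs, the directed arcs of the augmented graph
    # cost : linear arc costs c^i_a;  u : nonnegative prices of the coupling constraints
    # Returns the value of min <c^i + u, x^i> over F^i for a single origin-destination
    # commodity, i.e. the demand times the length of the shortest priced path.
    w = np.asarray(cost, dtype=float) + np.asarray(u, dtype=float)  # priced arc lengths
    tails, heads = zip(*arcs)
    graph = csr_matrix((w, (tails, heads)), shape=(n_nodes, n_nodes))
    dist = dijkstra(graph, directed=True, indices=origin)           # nonnegative weights
    return demand * dist[dest]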
6 Numerical results
The techniques for exploiting sparsity in the solution of the master problem presented in this paper
have been implemented within the context of ACCPM. We applied the method to solve very
large nonlinear multicommodity network flow problems as described in Section 5.
The exploitation of sparsity in the solution of the master problem has permitted
us to push the limit sizes of problems solved from 1000 arcs and 5000 commodities in a previous version to 5000 arcs and 10000 commodities in the current implementation of
ACCPM.
The new implementation of ACCPM is written in C++ (except for the library of routines
used for handling the sparse Cholesky factorization [17], which is written in FORTRAN 77). The
C++ and FORTRAN routines have been compiled with the xlC and xlf compilers, respectively.
The program was run on a Power PC workstation (66MHz, 64MB of RAM, type 7011, model
25T); the compilation was made with options -O -qarch=ppc.
Below we present the results of running the method on a set of 17 problems: 2 well known
examples, NDO22 and NDO148 from [11], and 15 randomly generated problems. A description of
our (public domain) generator can be found in [13].
Problem     Nodes   Arcs   Comm.   Subprobs
NDO22          14     22      23         45
NDO148         61    148     122        270
Random12      300    600    1000       1600
Random15      300   1000    4000       5000
Random16      300   1000    1000       2000
Random20      400   1000    5000       6000
Random21      500   1000    3000       4000
Random31      700   2000    2000       4000
Random32     1000   3000    2000       5000
Random33     1000   4000    2000       6000
Random41     1000   5000    2000       7000
Random42     1000   5000    3000       8000
Random51     1000   2000    6000       8000
Random52     1000   2000    8000      10000
Random53     1000   2000   10000      12000
Random54     1000   3000    6000       9000
Random55     1000   3000    8000      11000

Table 1: Problem statistics.
Problem statistics (the numbers of nodes,
arcs and commodities, respectively) are collected in Table 1. Additionally, Table 1 reports the
number of subproblems to facilitate the determination of the size of the restricted master problem.
In the formulation of nonlinear multicommodity network flow problems, the number of coupling
constraints n equals the number of arcs, while the number of subproblems m equals the
sum of the numbers of arcs and commodities.
Let us observe that the numbers given in Table 1 "hide" the large size of the problems solved.
The linear version of the Random33 problem, for example, in an equivalent compact LP formulation,
would involve 1000 blocks of 2000 constraints of commodity flow balance at each node and 4000
coupling constraints of total flow capacity on the arcs. This formulation comprises 1000 × 2000 +
4000 = 2,004,000 constraints and 4000 × 2000 = 8,000,000 variables. The reader interested in the
influence of the multicommodity flow problem formulation on the efficiency of different solution
methods is referred to [23].
Table 2 gives information on the sparsity of matrix S and its Cholesky factor (27). Its columns
report: the size n of S, the number of subdiagonal nonzero elements in the adjacency matrix
B B^T, the fill-in, i.e., the number of new nonzero entries created during the factorization, the
number of subdiagonal nonzero elements in the Cholesky factor L_0, and, finally, a measure of
the computational effort (flops) needed to find the Cholesky decomposition [17].
From the results collected in Table 2, one can see that the Cholesky factors in all problems solved
show considerable fill-in. The Random42 problem, for example, produces, in the worst case, about 2.9
million nonzeros.
Problem       n   nonz(BB^T)   Fill-in   nonz(L)     Flops
NDO22       122          143        36       179   1.83e+3
NDO148      148         6116      2205      8321   6.12e+5
Random12    600        33340     41837     75177   1.47e+7
Random15   1000        85972    124693    210665   7.73e+7
Random16   1000        54391    115642    170033   5.12e+7
Random20   1000       134551    176777    311328   1.46e+8
Random21   1000       101683    165204    266887   1.10e+8
Random31   2000       167440    582747    750187   5.13e+8
Random32   3000       232472   1296961   1529433   1.52e+9
Random33   4000       324426   2048411   2372837   2.94e+9
Random41   5000       330021   1459869   1789890   1.80e+9
Random42   5000       477296   2426969   2904265   3.91e+9
Random51   2000       328651    814771   1143422   1.02e+9
Random52   2000       395855    846064   1241919   1.17e+9
Random53   2000       423901    876762   1300663   1.27e+9
Random54   3000       436833   1643768   2080601   2.51e+9
Random55   3000       528839   1778717   2307556   2.95e+9

Table 2: Sparsity of Cholesky factors (worst case).
This leads to a pretty expensive factorization: its estimated effort is 3.91 × 10^9
flops. Observe that storing matrix S as dense, as was the case in [13], and applying a LAPACK
routine to its factorization would require space for 25 million nonzeros (LAPACK requires the
whole symmetric matrix to be stored in order to exploit level 3 BLAS efficiently). A single
factorization would in such a case cost (1/3) · 5000^3 = 4.17 × 10^10 flops, i.e., it would be, roughly speaking,
10 times more expensive.
Table 3 collects data on the solution of our collection of large scale problems. We report in it the
number of outer iterations, NITER, the number of inner iterations, Newton, the total number
of cuts (subgradients) added throughout the whole solution process, the number of shortest path
type cuts, and the CPU time (to reach a 6-digit accurate solution on a POWER PC computer).
To give a bit of insight into ACCPM's behavior, Table 3 additionally reports the time
spent in the factorizations of S (the dominating term in the master), tF, and the time spent solving the
subproblems, tS.
The reader may note that the solution times for these problems are considerable. It took
ACCPM about 32 hours of CPU, for example, to get a 6-digit optimal solution of Random42.
The dominating term in the solution time was the computation of 24 analytic centers, i.e.,
the solution of 24 subsequent relaxed master problems. This required 534 very expensive IPM
iterations (factorizations of S ). One has to be aware, however, that each master problem was a
nontrivial linear program with 13000 rows and a considerable number of columns that grew at
the end to about 150,000.
Problem     NITER  Newton  Cuts total  Cuts paths          tF          tS   Total CPU
NDO22          17      84         420          46        0.81        0.03        0.96
NDO148         13      83        2230         306       13.98        0.78       15.65
Random12       19     257       16519        5119      639.11       82.54      736.54
Random15       26     527       51888       25888     5595.43      518.16     6180.80
Random16       21     259       26494        5494     1322.05      102.28     1448.44
Random20       28     577       64241       36241    10714.02      876.20    11688.74
Random21       23     397       41857       18857     5103.80      523.03     5682.01
Random31       22     434       57700       13700    17008.75      568.70    17651.66
Random32       20     389       73795       13795    41542.88      825.13    42477.39
Random33       20     382       95902       15902    70354.10      921.51    71429.38
Random41       26     466      148214       18214    61853.98     1397.73    63452.67
Random42       24     534      149791       29791   114956.05     1541.63   116726.15
Random51       26     640       92503       40503    58364.35     2844.77    61387.06
Random52       30     688      115151       55151    73889.13     4379.27    78527.08
Random53       32     929      132816       68816    94361.97     4849.51    99485.27
Random54       26     593      122243       44243   111193.18     3316.83   114765.34
Random55       28     750      149046       65046   137348.32     3886.51   141534.21

Table 3: Efficiency of ACCPM (CPU times in seconds).
7 Conclusions
We have given in this paper a systematic discussion of the treatment of the relaxed master program
with an interior point algorithm. We have not imposed any condition on the decomposition
scheme used (it may follow either the "optimal point strategy" or the "central point strategy").
We have concentrated on the exploitation of the special structure of the relaxed master problem
within a single iteration of any interior point method in order to make that iteration as efficient as possible.
The techniques presented in this paper have been incorporated into the implementation of the
Analytic Center Cutting Plane Method. We have demonstrated their advantages when applying
ACCPM to the solution of large scale nonlinear multicommodity network flow problems.
The use of the structure exploiting techniques presented in this paper has allowed us to solve
significantly larger problems than in [13]. Their sizes have been pushed from 1000 arcs and 5000
commodities in the old version to 5000 arcs and 10000 commodities in the new one.
References
[1] Ahuja R.K., T.L. Magnanti and J.B. Orlin, Network Flows, Prentice-Hall, 1993.
[2] Benders J.F., Partitioning procedures for solving mixed-variables programming problems, Numerische Mathematik 4 (1962), pp. 238-252.
[3] Birge J. and L. Qi, Computing block-angular Karmarkar projections with applications to
stochastic programming, Management Science 34, No 12 (1988), pp. 1472-1479.
[4] Choi I.C. and D. Goldfarb, Exploiting special structure in a primal-dual path following algorithm,
Mathematical Programming 58 (1993) 33-52.
[5] Dantzig G.B., Linear Programming and Extensions, Princeton University Press, Princeton, 1963.
[6] Dantzig G.B. and P. Wolfe, The decomposition algorithm for linear programming, Econometrica 29,
4 (1961) 767-778.
[7] de Ghellinck G. and J.-P. Vial, A polynomial Newton method for linear programming, Algorithmica
1 (1986) 425-453.
[8] Duff I.S., A.M. Erisman and J.K. Reid, Direct methods for sparse matrices, Oxford University Press,
New York, 1987.
[9] Eckstein J., Large-scale parallel computing, optimization and operations research: a survey, ORSA
CSTS Newsletter 14 (1993), No 2, Fall 1993.
[10] Elzinga J. and T. G. Moore, A central cutting plane algorithm for the convex programming problem,
Mathematical Programming 8 (1973) 134-145.
[11] Gafni E.M. and D.P. Bertsekas, Two-metric projection methods for constrained optimization, SIAM
Journal on Control and Optimization 22, 6 (1984) 936-964.
[12] George A. and J.W.H. Liu, The Evolution of the Minimum Degree Ordering Algorithm, SIAM
Review 31 (1989), 1-19.
[13] Goffin J.-L., J. Gondzio, R. Sarkissian and J.-P. Vial, Solving nonlinear multicommodity flow problems by the analytic center cutting plane method, Technical Report 1994.21, Department of Management Studies, University of Geneva, September 1994, revised October 1995. Mathematical Programming (to appear).
[14] Goffin J.-L., A. Haurie and J.-P. Vial, Decomposition and nondifferentiable optimization with the
projective algorithm, Management Science 38, 2 (1992) 284-302.
[15] Goffin J.-L., A. Haurie, J.-P. Vial and D.L. Zhu, Using central prices in the decomposition of linear
programs, European Journal of Operational Research 64 (1993) 393-409.
[16] Golub G.H. and C. Van Loan, Matrix Computations, (2nd ed.) The Johns Hopkins University Press,
Baltimore and London, 1989.
[17] Gondzio J., Implementing Cholesky factorization for interior point methods of linear programming,
Optimization 27 (1993) pp. 121-140.
[18] Gondzio J., Multiple centrality corrections in a primal-dual method for linear programming, Technical
Report 1994.20, Department of Management Studies, University of Geneva, Switzerland, November
1994, revised May 1995. Computational Optimization and Applications (to appear).
[19] Gondzio J. and T. Terlaky, A computational view of interior point methods for linear programming,
in: Advances in Linear and Integer Programming, J. Beasley (ed.), Oxford University Press, Oxford,
England, 1996, pp. 106-147.
[20] Gonzaga C.C., Path following methods for linear programming, SIAM Review 34 (1992) 167-227.
[21] Hurd J.K. and F. M. Murphy, Exploiting special structure in primal-dual interior point methods,
ORSA Journal on Computing 4 (1992) pp. 39-44.
[22] Jansen B., C. Roos, T. Terlaky, and J.-P. Vial, Primal-Dual Target Following Algorithms for Linear Programming, Technical Report 93-107, Faculty of Technical Mathematics and Informatics,
Technical University Delft, Delft, The Netherlands, 1993, to appear in a special issue of Annals of
Operations Research, K. Anstreicher and R. Freund (eds.).
[23] Jones K.L., I.J. Lustig, J.M. Farvolden and W.B. Powell, Multicommodity network flows: the impact
of formulation on decomposition, Mathematical Programming 62 (1993) 95-117.
[24] Karmarkar N., A new polynomial time algorithm for linear programming, Combinatorica 4 (1984)
373-395.
[25] Kelley J.E., The cutting plane method for solving convex programs, Journal of the SIAM 8 (1960)
703-712.
[26] Lustig I.J., R.E. Marsten and D.F. Shanno, Interior Point Methods for Linear Programming: Computational State of the Art, ORSA Journal on Computing 6 (1994), 1-14.
[27] Merle O. du, Interior Points and Cutting Planes: a Development and Implementation of Methods for
Convex Optimization and Large Scale Structured Linear Programming, Ph.D Thesis, Department
of Management Studies, University of Geneva, Geneva, Switzerland, 1995 (in French).
[28] Roos C. and J.-P. Vial, Interior Point Methods, in: Advances in Linear and Integer Programming,
Beasley, J.E. (ed.), Oxford University Press, Oxford, England, 1996, pp. 49-104.
[29] Sonnevend G., New algorithms in convex programming based on a notion of "centre" (for systems of analytic inequalities) and on rational extrapolation, in: K.H. Hoffmann, J.B. Hiriart-Urruty,
C. Lemaréchal, and J. Zowe, eds., Trends in Mathematical Optimization: Proceedings of the 4th
French-German Conference on Optimization in Irsee, West Germany, April 1986, volume 84 of
International Series of Numerical Mathematics, pp. 311-327, Birkhäuser Verlag, Basel, Switzerland,
1988.
[30] Todd M.J., Exploiting special structure in Karmarkar's linear programming algorithm, Mathematical
Programming 41 (1988) pp. 81-103.