Using an Interior Point Method for the Master Problem in a
Decomposition Approach†
J. Gondzio‡, R. Sarkissian and J.-P. Vial
Logilab, HEC, Section of Management Studies, University of Geneva,
102 Bd Carl Vogt, CH-1211 Geneve 4, Switzerland
Technical Report 1995.30
October 24, 1995, revised May 15, 1996
Abstract
We address some of the issues that arise when an interior point method is used to handle
the master problem in a decomposition approach. The main points concern the efficient
exploitation of the special structure of the master problem to reduce the cost of a single
interior point iteration. The particular structure is the presence of GUB constraints and the
natural partitioning of the constraint matrix into blocks built of cuts generated by different
subproblems.
The method can be used in a fairly general case, i.e., in any decomposition approach
whenever the master is solved by an interior point method in which the normal equations
are used to compute orthogonal projections.
Computational results demonstrate its advantages for one particular decomposition approach: the Analytic Center Cutting Plane Method (ACCPM) applied to solve large scale
nonlinear multicommodity network flow problems (up to 5000 arcs and 10000 commodities).
Key words. Convex programming, interior point methods, cutting plane methods,
linear programming, network optimization.
This research has been supported by the Fonds National de la Recherche Scientifique Suisse, grant #12-34002.92.
† To appear in European Journal of Operational Research.
‡ On leave from the Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw,
Poland.
1 Introduction
Decomposition methods have attracted a lot of attention during the past few years. Some of them
have become the methods of choice for solving particular classes of very large scale optimization problems. This has been possible due to the development of well understood decomposition
approaches as well as due to their suitability to the new parallel and/or distributed computing environments. The analysis of different decomposition approaches is beyond the scope of this paper.
The reader interested in decomposition methods and in their modern (parallel) implementations
is encouraged to consult the references in the survey of Eckstein [9].
In this paper we are concerned with classical decomposition approaches such as [6, 2, 25].
Their common feature is the use of an oracle which generates cutting planes that continuously improve (shrink) the localization set containing the solution of the problem. The master problems
consist in the minimization of the associated min-max problem and can be replaced with the
minimization of some linear objective over the localization set. More generally, the master problem may consist in finding a particular point in the current localization set. It is not always
necessary, for example, to find the optimum of the relaxation; sometimes it may be advantageous
to find "central points" in it [29, 10, 14].
Independently of the goal of the master, which may look for the optimum or for some "center" of the
localization set, a variant of the linear programming approach is applied.
Interior point methods (IPMs) developed in the last decade have proved to be extremely efficient
tools for solving large scale linear optimization problems. These optimization techniques are
now well understood [19] and, as the computational results have shown (cf. [18, 26]), they
usually run significantly faster than modern implementations of the simplex method if the size
of the problem solved is considerable. It is thus natural to apply a variant of an interior point
method to solve large master problems in a decomposition approach.
A key issue in the solution of real-life large scale problems with the decomposition approach
is the ability to take advantage of the disaggregation of cuts. The reason for the superiority
of the disaggregate approach is simple. This technique allows one to accumulate all the necessary
information faster: every call to the oracle adds a considerable number of new cuts, and a
moderate number of such calls (in our experience, this number seldom exceeds 30) suffices
for the cutting plane process to converge. Clearly, the use of disaggregation causes a considerable
increase of the size of the restricted master problem since it adds an important number of special
GUB rows to it (cf. Dantzig [5]). Independently, it increases the total number of columns
generated. However, the resulting special structure of the restricted master problem can be
exploited efficiently both within the simplex method and within an interior point algorithm.
The techniques for exploiting it in the latter approach are the subject of this paper.
The computational effort of a single iteration of any IPM is dominated by the factorization of
the matrix AΘA^T and the following one or more backsolves with this factorization (cf. [18, 19]),
where A denotes the LP constraint matrix and Θ is a diagonal scaling matrix that changes in
subsequent iterations. In the master problem of the decomposition approach, matrix A has a lot
of special structure: it is naturally partitioned into blocks associated with different subproblems
and it contains a number of convexity constraints (GUB rows).
We take advantage of the presence of these GUB rows (sometimes a considerable number of
them) and explicitly pivot out their contribution from AΘA^T. The resulting smaller matrix, of
size equal to the number of coupling constraints, still displays a particular structure reflecting
the partition of the constraint matrix into blocks associated with different subproblems. We exploit
this structure to simplify the sparsity pattern analysis when building the adjacency matrix.
Let us mention that several authors have already addressed the issue of structure exploitation
within the context of IPMs [3, 4, 21, 30].
The paper is organized in the following way. In Section 2 we discuss the particular structure of
the LP constraint matrix in the master problem. A structure exploiting representation of the
inverse of the AΘA^T matrix is the subject of Section 3. The method presented in this paper has been
implemented and tested in the context of one particular decomposition approach, the Analytic
Center Cutting Plane Method (ACCPM) of Goffin, Haurie and Vial [14], applied to solving large
scale nonlinear multicommodity network flow problems [1]. In Section 4 we briefly recall the
ACCPM approach. In Section 5 we describe the multicommodity flow problems and in Section 6
we give the computational results. Finally, in Section 7 we give our conclusions.
2 Master problem
2.1 The decomposition principle
Let us consider a convex optimization problem

    minimize    ⟨c, x⟩ + f(x)
    subject to  x ∈ X,                                                    (1)
                x̲ ≤ x ≤ x̄,

where f: R^n → R is a convex function, X ⊂ R^n is a convex set such that X ⊆ {x : f(x) < ∞},
and the linear term ⟨c, x⟩ in the objective has been introduced only because it often
appears in applications.
We impose an additional (standard) assumption in non-differentiable optimization: both the
epigraph of f and the set X are approximated by intersections of hyperplanes. The procedure
called the oracle first tests whether x ∈ X. If x ∈ X, then the oracle generates a subgradient
ξ ∈ ∂f(x) at x:

    f(x') ≥ f(x) + ⟨ξ, x' − x⟩,  ∀ x' ∈ X.                                (2)
This inequality is a supporting hyperplane for the optimized function; we shall call it an optimality cut. If x ∉ X, then the oracle generates a hyperplane (η, τ) ∈ R^n × R for X that
separates x from X, i.e.,

    ⟨η, x⟩ < τ   and   ⟨η, x'⟩ ≥ τ,  ∀ x' ∈ X.                            (3)

This hyperplane improves the description of the feasible set X; we shall call it a feasibility cut.
Suppose a sequence of points {x^l : l ∈ Λ}, at which the oracle has been called, is given. Some
of these points were feasible, x^l ∈ X, l ∈ Λ_opt; the others were not, x^l ∉ X, l ∈ Λ_fsb = Λ \ Λ_opt.
The oracle generated a set of optimality cuts at x^l, l ∈ Λ_opt, that define a piecewise linear
approximation f̄: R^n → R to the convex function f:

    f̄(x) = max_{l ∈ Λ_opt} { f(x^l) + ⟨ξ^l, x − x^l⟩ }.                   (4)

It also generated a set of feasibility cuts at x^l, l ∈ Λ_fsb, that define an outer approximation X̄
of the feasible set X:

    X̄ = { x : ⟨η^l, x⟩ ≥ τ^l, ∀ l ∈ Λ_fsb }.                              (5)
The program

    minimize    ⟨c, x⟩ + f̄(x)
    subject to  x ∈ X̄,                                                    (6)
                x̲ ≤ x ≤ x̄

is an outer relaxation of the master problem (1). Note that X̄ is a polyhedron and f̄ is a
polyhedral (piecewise linear) function. The above program can thus be reformulated as a linear
problem, the relaxed master program

    minimize    ζ + ⟨c, x⟩
    subject to  ζ ≥ f(x^l) + ⟨ξ^l, x − x^l⟩,  ∀ l ∈ Λ_opt,                 (7)
                ⟨η^l, x⟩ ≥ τ^l,  ∀ l ∈ Λ_fsb,
                x̲ ≤ x ≤ x̄.

Its solution gives a lower bound for the master problem (1). Observe also that the best feasible
solution in the generated sequence provides an upper bound for the master problem:

    θ_u = min_{l ∈ Λ_opt} { ⟨c, x^l⟩ + f(x^l) }.                           (8)
For a given upper bound θ̄, let us consider the following polyhedral approximation, called a
localization set:

    L(θ̄) = { (ζ, x) :  ζ ≤ θ̄,
                        ζ ≥ ⟨c, x⟩ + f(x^l) + ⟨ξ^l, x − x^l⟩,  ∀ l ∈ Λ_opt,    (9)
                        ⟨η^l, x⟩ ≥ τ^l,  ∀ l ∈ Λ_fsb,
                        x̲ ≤ x ≤ x̄ }.

It is the best (outer) approximation of the optimal set in (1).
We are now ready to give a prototype cutting plane method [25]. Its main steps are the following:

1. Compute a point (ζ, x) ∈ L(θ_u) and an associated lower bound θ.
2. Call the oracle at (ζ, x). The oracle returns one or several cuts; if all of them are optimality
   cuts the oracle also generates an upper bound ⟨c, x⟩ + f(x).
3. Update the bounds:
   (a) if x is feasible, θ_u := min{⟨c, x⟩ + f(x), θ_u};
   (b) θ_l := max{θ, θ_l}.
4. Update the upper bound in the definition of the localization set (9) and add the new cuts.

These steps are repeated until a feasible point to (1) is found such that θ_u − θ_l falls below a
prescribed optimality tolerance.
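The following Python sketch makes the prototype concrete. It is only an illustration: the oracle and query_point callables, their return conventions and the cut container are hypothetical placeholders standing in for whatever master solver and subproblem solvers a particular decomposition method provides.

# A minimal sketch of the prototype cutting plane loop. The interfaces of
# `oracle` and `query_point` are illustrative assumptions, not part of the paper.
def cutting_plane(oracle, query_point, tol=1e-6, max_outer=100):
    theta_u, theta_l = float('inf'), -float('inf')   # upper and lower bounds
    cuts = []                                        # accumulated cuts defining L(.)
    for _ in range(max_outer):
        # Step 1: a point in the localization set L(theta_u) and a lower bound.
        (zeta, x), theta = query_point(cuts, theta_u)
        theta_l = max(theta_l, theta)                # Step 3(b)
        # Step 2: call the oracle at x; it returns optimality or feasibility cuts.
        kind, new_cuts, value = oracle(x)            # value = <c,x> + f(x) when x is feasible
        if kind == 'optimality':                     # Step 3(a): x is feasible
            theta_u = min(theta_u, value)
        cuts.extend(new_cuts)                        # Step 4: enlarge the cut collection
        if theta_u - theta_l < tol:                  # stop on a small optimality gap
            break
    return theta_u, theta_l, cuts

Which point query_point returns is exactly the strategic choice discussed next.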
There exists a number of possible strategies that can be applied at step 1. The optimal point
strategy, for example, consists in solving every relaxed master program (7) to optimality [2, 6].
The central point strategy consists in finding some "central point" in the localization set [10, 14, 29].
The discussion of their advantages is beyond the scope of this paper. Whatever strategy is used,
however, we are concerned with a linear optimization problem (7). Whenever its size is large
enough, say, when it grows to tens of thousands of rows and columns, one can expect advantages
from an application of an interior point method.
Let us finally observe that we have been concerned so far with a fairly general convex optimization problem (1). In many applications, one can benefit from its special structure, such as
the additivity of the objective, which writes

    f(x) = Σ_{i=1}^m f_i(x),                                              (10)

and the intersection type definition of the feasible set

    X = ∩_{j=1}^p X_j.                                                    (11)
In such a case, it is possible to modify the relaxed master program so that multiple cuts are
introduced at a time:

    minimize    Σ_{i=1}^m ζ_i + ⟨c, x⟩
    subject to  ζ_i ≥ f_i(x^l) + ⟨ξ_i^l, x − x^l⟩,  ∀ l ∈ Λ_opt, i = 1,...,m,     (12)
                ⟨η_j^l, x⟩ ≥ τ_j^l,  ∀ l ∈ Λ_fsb, j = 1,...,p,
                x̲ ≤ x ≤ x̄.

Although such a disaggregation results in an important increase of the size of the restricted
master program compared with (7), it has unquestionable computational advantages. It makes
it possible to collect all the necessary information (cuts active at the optimal solution) faster and
usually significantly reduces the number of outer iterations.
2.2 Special structure of the master problem
Most decomposition methods solve the dual of (12).
Let λ_i^l, l ∈ Λ_opt, i = 1,...,m, denote the dual variables associated with the optimality cuts that involve the function subgradients ξ_i^l, l ∈ Λ_opt, i = 1,...,m, respectively. For a given i = 1,...,m, we
group all variables λ_i^l, l ∈ Λ_opt, into a vector λ_i ∈ R^{|Λ_opt|} and, accordingly, group all subgradients
ξ_i^l, l ∈ Λ_opt, into a matrix G_i^T ∈ R^{|Λ_opt| × n}.
Analogously, let μ_j^l, l ∈ Λ_fsb, j = 1,...,p, denote the dual variables associated with the feasibility
cuts (separating hyperplanes) (η_j^l, τ_j^l), l ∈ Λ_fsb, j = 1,...,p. For a given j = 1,...,p, we group
all variables μ_j^l, l ∈ Λ_fsb, into a vector μ_j ∈ R^{|Λ_fsb|} and group all vectors η_j^l, l ∈ Λ_fsb, into a
matrix −H_j^T ∈ R^{|Λ_fsb| × n}.
With this new notation, (12) may be rewritten as

    minimize    Σ_{i=1}^m ζ_i + ⟨c, x⟩
    subject to  ζ_i e_i − G_i^T x ≥ g_i,  i = 1,...,m,                    (13)
                −H_j^T x ≥ τ_j,  j = 1,...,p,
                x̲ ≤ x ≤ x̄,

where e_i is a vector of ones in R^{|Λ_opt|}, and the right hand side vectors g_i ∈ R^{|Λ_opt|}, i = 1,...,m,
and τ_j ∈ R^{|Λ_fsb|}, j = 1,...,p, are defined accordingly. The dual to the above problem writes
    maximize    Σ_{i=1}^m ⟨g_i, λ_i⟩ + Σ_{j=1}^p ⟨τ_j, μ_j⟩ + ⟨x̲, s⁻⟩ − ⟨x̄, s⁺⟩
    subject to  s⁺ − s⁻ + Σ_{i=1}^m G_i λ_i + Σ_{j=1}^p H_j μ_j = −c,
                ⟨e_i, λ_i⟩ = 1,   i = 1,...,m,                            (14)
                λ_i ≥ 0,          i = 1,...,m,
                μ_j ≥ 0,          j = 1,...,p,
                s⁺ ≥ 0,  s⁻ ≥ 0,

where s⁺ and s⁻ are the dual variables corresponding to the box constraints. This linear program has
n + m constraints and 2n + m|Λ_opt| + p|Λ_fsb| variables. Usually, the number of columns
significantly exceeds the number of rows. In nontrivial applications, e.g., when the number of
subproblems grows to thousands, (14) can become a very large linear optimization problem. We
suppose that this is the case.
Consequently, we consider it natural to solve (14) with the most efficient linear programming
approach available nowadays, the interior point method. Moreover, we shall specialize the
interior point algorithm so that it can exploit the particular structure of the restricted master
problem, in which the LP constraint matrix is naturally partitioned into blocks associated with
different subproblems and contains a considerable number of GUB rows [5].
3 Projections in interior point methods
Interior point methods (IPMs) for linear programming are now deeply understood and very
efficient tools [19, 26] that can tackle very large problems. Their general discussion is beyond
the scope of this paper. Instead, we shall look closer at the computational effort of a single IPM
iteration.
3.1 Linear algebra of IPM iteration
A common feature of almost all IPMs is that they can be interpreted in terms of following the
path of centers that leads to the optimal solution (see, e.g., [20] and [28] for up to date
references). With some abuse of mathematics, a basic iteration of a path-following algorithm
consists of moving from one point in a neighborhood of the central path to another one, called the
target [22], that preserves the property of lying in a neighborhood of the central path and reduces
the distance to optimality measured with some estimation of the duality gap. Such a move can
in principle involve more than one step towards the target. Depending on how significant the
update of the target is (and, consequently, whether just one or more Newton steps are needed to
reach the vicinity of the new target), one distinguishes between short and long step methods. Due
to the considerable cost of every Newton step, usually (at least in implementations) one Newton
step is allowed before a new target is defined and a very loose requirement on the proximity is
used.
Every Newton step requires computing at least one orthogonal projection onto the null space of
a scaled linear operator AΘ^{1/2}, where A is the LP constraint matrix and Θ is a positive diagonal
scaling matrix that changes in subsequent iterations. Modern LP codes use direct methods to
solve the underlying linear algebra problem

    [ −Θ^{-1}   A^T ] [ Δx ]   [ r ]
    [    A       0  ] [ Δy ] = [ h ],                                     (15)

where Δx and Δy are the Newton directions and r and h define appropriate right hand side vectors.
Different variants of IPMs use different definitions of Θ, r and h, but the computational effort
needed to compute the Newton directions remains basically the same. (This explains why a
comparison of the efficiency of different algorithms is often limited to the comparison of the
numbers of iterations needed to reach the desired accuracy.)
One can either apply the Bunch-Parlett-Kaufmann factorization to the full (symmetric and
indefinite) system (15) or one can reduce it further to the normal equations. In the latter case,
the equation

    (A Θ A^T) Δy = A Θ r + h                                              (16)

is easily obtained from the former after the elimination of

    Δx = Θ (A^T Δy − r).
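Spelled out, the elimination works as follows: the first block row of (15), −Θ^{-1} Δx + A^T Δy = r, gives Δx = Θ(A^T Δy − r); substituting this into the second block row, A Δx = h, yields

    A Θ (A^T Δy − r) = h,   i.e.   (A Θ A^T) Δy = A Θ r + h,

which is exactly (16).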
The normal equations system is positive definite; hence, Cholesky decomposition can be applied
to solve it.
There are no general rules for choosing the most suitable factorization for a given problem. However,
in some special cases one of them has definite advantages over the other. In particular, this is
the case when one applies an IPM to solve the relaxed master problem (14). The hint comes from
the analysis of the problem dimensions. Note that in the LP constraint matrix of (14) the
number of variables always significantly exceeds the number of rows (it is usual to observe a
ratio of 10 to 20 between these two numbers). Consequently, we have to reject an application
of the augmented system approach (15) due to its prohibitively large size.
3.2 Normal equations matrix for restricted master problem
The constraint matrix in the (dual) restricted master problem (14) can be rearranged to have
the following form:

        [ G_0    G_1     G_2     ...    G_m   ]
        [        e_1^T                        ]
    A = [                e_2^T                ],                          (17)
        [                        ...          ]
        [                               e_m^T ]

where

    G_0 = [ I | −I | H_1 | H_2 | ... | H_p ].                             (18)
The IPM scaling matrix can be partitioned accordingly,

    Θ = diag(Θ_0, Θ_1, Θ_2, ..., Θ_m),                                    (19)

producing the following normal equations matrix:

    A Θ A^T = [ C     B ]
              [ B^T   D ],                                                (20)

where

    C = G_0 Θ_0 G_0^T + Σ_{i=1}^m G_i Θ_i G_i^T,                          (21)
    B = ( G_1 Θ_1 e_1, G_2 Θ_2 e_2, ..., G_m Θ_m e_m ),                   (22)
    D = diag( e_i^T Θ_i e_i ).                                            (23)
Let us now look closer at the matrices B, C and D. Column i of B ∈ R^{n×m} is a combination of
the columns of G_i, D ∈ R^{m×m} is a diagonal matrix, and C ∈ R^{n×n} is a union of adjacency matrices
associated with the optimality cuts (extreme points) of all subproblems and the adjacency matrix
G_0 Θ_0 G_0^T that gathers the contribution of all feasibility cuts (extreme rays).
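As a concrete illustration of (21)-(23), the following NumPy sketch assembles the blocks C, B and D from the cut blocks G_0, G_1, ..., G_m and the diagonal scaling blocks Θ_0, ..., Θ_m. Dense arrays are used only to keep the example short; the implementation discussed in this paper works with sparse structures, and the function and argument names are, of course, illustrative.

import numpy as np

def assemble_normal_blocks(G0, Gs, theta0, thetas):
    # G0     : n x k0 array, the block [I | -I | H_1 | ... | H_p]
    # Gs     : list of n x k_i arrays G_1, ..., G_m (optimality cuts of subproblem i)
    # theta0 : length-k0 vector, the diagonal of Theta_0
    # thetas : list of length-k_i vectors, the diagonals of Theta_1, ..., Theta_m
    n, m = G0.shape[0], len(Gs)
    C = G0 @ (theta0[:, None] * G0.T)          # G_0 Theta_0 G_0^T
    B = np.empty((n, m))
    d = np.empty(m)
    for i, (Gi, ti) in enumerate(zip(Gs, thetas)):
        C += Gi @ (ti[:, None] * Gi.T)         # + G_i Theta_i G_i^T          (21)
        B[:, i] = Gi @ ti                      # b_i = G_i Theta_i e_i        (22)
        d[i] = ti.sum()                        # d_ii = e_i^T Theta_i e_i     (23)
    return C, B, d                             # D = diag(d)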
3.3 Inversion of AΘA^T
As explained in the previous sections, a single IPM iteration requires the solution of the normal
equations system (16). The system writes

    [ C     B ] [ y_1 ]   [ z_1 ]
    [ B^T   D ] [ y_2 ] = [ z_2 ],                                        (24)

where the unknown y and the right hand side are partitioned according to (20), i.e., y_1, z_1 ∈ R^n
and y_2, z_2 ∈ R^m.
Our inversion technique exploits the special structure of the normal equations matrix. First, we
pivot out the whole diagonal block D, as this operation introduces no fill-in:

    [ C     B ]   [ I   B D^{-1} ] [ C − B D^{-1} B^T    0 ] [ I            0 ]
    [ B^T   D ] = [ 0   I        ] [ 0                   D ] [ D^{-1} B^T   I ],      (25)

and we build

    S = C − B D^{-1} B^T.                                                 (26)
Note that matrix S is symmetric by definition. It is also positive definite, as it is a Schur
complement resulting from the elimination of a diagonal block from the positive definite
matrix A Θ A^T (cf. Golub and Van Loan [16]). Consequently, we can compute its Cholesky
decomposition

    S = L_0 D_0 L_0^T,                                                    (27)

and obtain a complete inverse representation

    A Θ A^T = [ L_0   B D^{-1} ] [ D_0   0 ] [ L_0^T         0 ]
              [ 0     I        ] [ 0     D ] [ D^{-1} B^T    I ].         (28)
Thus the solution of the equation with A Θ A^T is dominated by the solution of one equation with L
and one equation with L^T, where

    L = [ L_0   B D^{-1} ]
        [ 0     I        ].                                               (29)
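A dense NumPy/SciPy sketch of the resulting solve is given below. It follows (25)-(29) literally: eliminate the diagonal block D, form the Schur complement S, factorize it by Cholesky, and recover y_1 and y_2 by block substitution. The dense cho_factor routine stands in here for the sparse Cholesky code [17] used in the actual implementation.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def solve_normal_equations(C, B, d, z1, z2):
    # Solves [[C, B], [B^T, D]] [y1; y2] = [z1; z2] with D = diag(d), as in (24)-(29).
    S = C - (B / d) @ B.T                      # Schur complement S = C - B D^{-1} B^T   (26)
    factor = cho_factor(S, lower=True)         # Cholesky factorization of S             (27)
    y1 = cho_solve(factor, z1 - (B / d) @ z2)  # solve S y1 = z1 - B D^{-1} z2
    y2 = (z2 - B.T @ y1) / d                   # back substitution with the D block
    return y1, y2

# Example use with the blocks of the previous sketch (hypothetical data):
# C, B, d = assemble_normal_blocks(G0, Gs, theta0, thetas)
# y1, y2 = solve_normal_equations(C, B, d, z1, z2)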
3.4 Sparsity issues
In numerous applications of decomposition approaches, the cuts generated by subproblems are sparse.
In consequence, the matrices in the restricted master problem are also sparse. This fact has to
be taken into account in the decomposition of A Θ A^T, especially if we aim at solving very large
problems with thousands of subproblems.
Analysis and factorization of S
The most computationally involved operation in the solution of the normal equations is usually
the building and factorization of matrix S. Recall

    S = G_0 Θ_0 G_0^T + Σ_{i=1}^m G_i Θ_i G_i^T − Σ_{i=1}^m (1/d_ii) b_i b_i^T,      (30)

where b_i = G_i Θ_i e_i denotes column i of B.
There is a choice in the implementation whether matrix B should be computed explicitly or not.
In general, there is no such need; however, for reasons of efficiency in exploiting the
sparsity of the cuts, storing B explicitly is highly advantageous. It is even essential when building the adjacency
structure of S. Note that this adjacency structure is the union of the structure of G_0 G_0^T, those of
all subproblems G_i G_i^T, i = 1, 2, ..., m, and, finally, that of B B^T. The following results are useful.
Observation 1. The sparsity structure of column b_i is the union of the sparsity patterns of all
columns of G_i.
Proof follows directly from the definition of column b_i as a linear combination of the columns of
G_i.
Observation 2. The sparsity structure of the adjacency matrix of G_i Θ_i G_i^T is a subset of the
sparsity pattern of the product b_i b_i^T.
Proof follows immediately from Observation 1.
These observations have important consequences in practice. Observation 2, for example, simplifies the analysis of the adjacency matrix implied by subproblem i,

    G_i Θ_i G_i^T − (1/d_ii) b_i b_i^T,

to the analysis of only one clique [8] associated with column b_i. This not only yields important
savings in the time of building the adjacency matrix of S, but it also simplifies the reordering of this
matrix to get maximum sparsity in the Cholesky factor. The reordering algorithm takes
advantage of the presence of supernodes [12] implied by the columns of B. Note that the
number of such columns is equal to m, i.e., it is significantly smaller than the overall number of
columns (cuts) in A.
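In a sparse implementation, Observations 1 and 2 mean that the symbolic pattern of S can be predicted from one clique per subproblem (the pattern of b_i) plus the cliques coming from the columns of G_0. The short sketch below illustrates this idea with Python sets of row indices; the real code naturally works on compressed sparse structures instead, and the function name is illustrative.

def pattern_of_S(G0_patterns, G_patterns):
    # G0_patterns : list of sets, the row pattern of each column of G_0
    # G_patterns  : list of lists of sets; G_patterns[i][k] is the row pattern
    #               of column k of G_i
    # Returns the predicted lower-triangular pattern of S as a set of (row, col) pairs.
    pattern = set()
    # Contribution of G_0 Theta_0 G_0^T: one clique per column of G_0.
    for col in G0_patterns:
        pattern.update((r, c) for r in col for c in col if r >= c)
    # Contribution of subproblem i: by Observation 2 it is covered by the single
    # clique on the pattern of b_i, which by Observation 1 is the union of the
    # patterns of the columns of G_i (this also covers the term b_i b_i^T / d_ii).
    for cols in G_patterns:
        b_i = set().union(*cols) if cols else set()
        pattern.update((r, c) for r in b_i for c in b_i if r >= c)
    return pattern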
Matrix B
For the reasons mentioned earlier, we store matrix B in explicit form. This is in contrast
with the previous implementation [13], in which C and S were stored as dense matrices and B was
handled implicitly in order to make possible the exploitation of the supersparsity of the network
cuts in A. Knowing matrix B explicitly clearly simplifies its multiplications with a vector, which
contribute non-negligibly to the backsolves with L and L^T.
4 ACCPM: an example of a decomposition approach
A single (outer) iteration in any decomposition method [6, 2, 25] consists of two main steps:
first, one computes some solution of the restricted master problem; next, the oracle generates
(one or more) new cuts that improve (shrink) the localization set.
In the Analytic Center Cutting Plane Method (ACCPM) of Goffin, Haurie and Vial [14], the restricted master problem is not solved to optimality. Instead, an analytic center of the localization
set is found.
The dual variables associated with the solution are thus not "extreme points" as is the case in
the Dantzig-Wolfe algorithm. They are "central prices" that contain richer information about
the economic value of the common resources, taking into account all past proposals [15] and
not only a subset of them whose combination, locally, looks optimal. Cuts generated from the
analytic center are likely to be deeper, thereby entailing faster convergence than the Dantzig-Wolfe procedure.
Different interior point methods can be used to compute the analytic center. In ACCPM we
chose a variant [7] of Karmarkar's projective algorithm [24]. The reader interested in a detailed
presentation of this particular variant of the interior point method can consult [14, 15]. An
excellent, complete presentation of ACCPM is given by du Merle [27].
Although there exist important particularities of this interior point method imposed by the fact
that it looks for an analytic center (and not for an optimum of the problem), a single iteration
of it does not differ much from an iteration of any other IPM. Viewed from the perspective
of calculations, a single iteration is a Newton step in which the main computational effort is, as
usual, the factorization and solution of the normal equations system (16).
5 Nonlinear multicommodity network flow problems
We are given a graph G = (V, A), where V denotes the set of nodes and A ⊆ {(s,t) : s ∈ V, t ∈ V, s ≠ t}
is the set of (directed) arcs. We define the transpose of A as A^T := {(s,t) : (t,s) ∈ A}
and T(a) := (s,t) for a = (t,s) ∈ A, a mapping that associates to every directed arc (s,t) the
arc with the reverse orientation (t,s). Clearly, A^T = T(A). Now, two directed arcs a and T(a)
represent the undirected arc {s,t}.
Next, for the augmented graph Ḡ = (V, Ā), where Ā := A ∪ A^T, we denote by n_n and n_a the
numbers of nodes and arcs in it, respectively, and by N the n_n × n_a node-arc incidence matrix
of Ḡ.
The set of commodities I is defined by exogenous flow vectors (supplies and demands) d^i =
(d_s^i)_{s ∈ V} that satisfy e_V^T d^i = 0, with e_V a vector of ones. These flows must be shipped through
Ḡ. The feasible flows for commodity i, x^i = (x_a^i)_{a ∈ Ā}, are members of

    F^i = { x^i ≥ 0 : N x^i = d^i }.
The nonlinear multicommodity network flow problem can be formulated as

    minimize    Σ_{i ∈ I} ⟨c^i, x^i⟩ + Σ_{a ∈ A₀} y_a / (Γ_a − y_a)              (31)
    subject to  Σ_{i ∈ I} (x_a^i + x_{T(a)}^i) ≤ y_a,   ∀ a ∈ A₀,                (32)
                x^i ∈ F^i,   ∀ i ∈ I,                                            (33)
                0 ≤ y_a ≤ γ_a,   ∀ a ∈ A₀ ⊆ Ā.                                   (34)

The vector y = (y_a)_{a ∈ A₀} is meant to represent the joint total arc flow of a and T(a). It is in fact
an upper bound on it, which is penalized in the objective for approaching the arc capacity Γ_a.
We briefly recall the standard decomposition approach to solving multicommodity network flow
problems. First, the complicating capacity constraints are dualized to form a partial Lagrangian.
This Lagrangian is minimized in the variables x and y, leading naturally to an additive formulation of the objective function. We then take advantage of this additivity to disaggregate the
optimality cuts, which is a necessary condition for solving the problem efficiently.
The reader is encouraged to consult [13] for more details on the problem formulation and on the
application of the decomposition method to its solution.
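For intuition only, the sketch below evaluates the per-commodity part of the dualized problem as a shortest path computation under given arc prices, which is where the "shortest path type cuts" reported in Section 6 come from. It assumes, purely for illustration, that a commodity is a single origin-destination pair with a given demand; the price vector u, the arc list and the cost vector are hypothetical inputs, and the y-part of the Lagrangian is omitted.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def commodity_subproblem(n_nodes, arcs, cost, u, origin, dest, demand):
    # arcs : list of (tail, head) pairs, the directed arcs of the augmented graph
    # cost : linear arc costs c^i_a;  u : nonnegative prices of the coupling constraints
    # Returns the value of min <c^i + u, x^i> over F^i for a single origin-destination
    # commodity, i.e. the demand times the length of the shortest priced path.
    w = np.asarray(cost, dtype=float) + np.asarray(u, dtype=float)  # priced arc lengths
    tails, heads = zip(*arcs)
    graph = csr_matrix((w, (tails, heads)), shape=(n_nodes, n_nodes))
    dist = dijkstra(graph, directed=True, indices=origin)           # nonnegative weights
    return demand * dist[dest]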
6 Numerical results
The techniques for exploiting sparsity in the solution of the master problem presented in this paper
have been implemented within the context of ACCPM. We applied the method to solve very
large nonlinear multicommodity network flow problems as described in Section 5.
The exploitation of sparsity in the solution of the master problem has permitted
us to push the limit sizes of problems solved from 1000 arcs and 5000 commodities in a previous version to 5000 arcs and 10000 commodities in the current implementation of
ACCPM.
The new implementation of ACCPM is written in C++ (except for the library of routines
used for handling the sparse Cholesky factorization [17], which is written in FORTRAN 77). The
C++ and FORTRAN routines have been compiled with the xlC and xlf compilers, respectively.
The program was run on a Power PC workstation (66MHz, 64MB of RAM, type 7011, model
25T); the compilation was made with options -O -qarch=ppc.
Below we present the results of running the method on a set of 17 problems: 2 well known
examples, NDO22 and NDO148 from [11], and 15 randomly generated problems. A description of
our (public domain) generator can be found in [13].
Problem     Nodes   Arcs   Comm.   Subprobs
NDO22          14     22      23         45
NDO148         61    148     122        270
Random12      300    600    1000       1600
Random15      300   1000    4000       5000
Random16      300   1000    1000       2000
Random20      400   1000    5000       6000
Random21      500   1000    3000       4000
Random31      700   2000    2000       4000
Random32     1000   3000    2000       5000
Random33     1000   4000    2000       6000
Random41     1000   5000    2000       7000
Random42     1000   5000    3000       8000
Random51     1000   2000    6000       8000
Random52     1000   2000    8000      10000
Random53     1000   2000   10000      12000
Random54     1000   3000    6000       9000
Random55     1000   3000    8000      11000

Table 1: Problem statistics.
Problem statistics (the numbers of nodes,
arcs and commodities, respectively) are collected in Table 1. Additionally, Table 1 reports the
number of subproblems to facilitate the determination of the size of the restricted master problem.
In the formulation of nonlinear multicommodity network flow problems, the number of coupling
constraints n equals the number of arcs, while the number of subproblems m equals the
sum of the numbers of arcs and commodities.
Let us observe that the numbers given in Table 1 "hide" the large size of the problems solved.
The linear version of the Random33 problem, for example, in an equivalent compact LP formulation,
would involve 1000 blocks of 2000 constraints of commodity flow balance at each node and 4000
coupling constraints of total flow capacity on the arcs. This formulation comprises 1000 × 2000 +
4000 = 2,004,000 constraints and 4000 × 2000 = 8,000,000 variables. The reader interested in the
influence of the multicommodity flow problem formulation on the efficiency of different solution
methods is referred to [23].
Table 2 gives information on the sparsity of matrix S and its Cholesky factor (27). Its columns
report: the size n of S, the number of subdiagonal nonzero elements in the adjacency matrix
B B^T, the fill-in, i.e., the number of new nonzero entries created during the factorization, the
number of subdiagonal nonzero elements in the Cholesky factor L_0, and, finally, a measure of
the computational effort (flops) needed to find the Cholesky decomposition [17].
From the results collected in Table 2, one can see that the Cholesky factors in all problems solved
show considerable fill-in. The Random42 problem, for example, produces, in the worst case, about 2.9
million nonzeros.
Problem       n   nonz(BB^T)   Fill-in   nonz(L)     Flops
NDO22       122          143        36       179   1.83e+3
NDO148      148         6116      2205      8321   6.12e+5
Random12    600        33340     41837     75177   1.47e+7
Random15   1000        85972    124693    210665   7.73e+7
Random16   1000        54391    115642    170033   5.12e+7
Random20   1000       134551    176777    311328   1.46e+8
Random21   1000       101683    165204    266887   1.10e+8
Random31   2000       167440    582747    750187   5.13e+8
Random32   3000       232472   1296961   1529433   1.52e+9
Random33   4000       324426   2048411   2372837   2.94e+9
Random41   5000       330021   1459869   1789890   1.80e+9
Random42   5000       477296   2426969   2904265   3.91e+9
Random51   2000       328651    814771   1143422   1.02e+9
Random52   2000       395855    846064   1241919   1.17e+9
Random53   2000       423901    876762   1300663   1.27e+9
Random54   3000       436833   1643768   2080601   2.51e+9
Random55   3000       528839   1778717   2307556   2.95e+9

Table 2: Sparsity of Cholesky factors (worst case).
This leads to a pretty expensive factorization: its estimated effort is 3.91 × 10^9
flops. Observe that storing matrix S as dense, as was the case in [13], and applying a LAPACK
routine to its factorization would require space for 25 million nonzeros (LAPACK requires the
whole symmetric matrix to be stored in order to exploit level 3 BLAS efficiently). A single
factorization would in such a case cost (1/3) · 5000^3 = 4.17 × 10^10 flops, i.e., it would be, roughly speaking,
10 times more expensive.
Table 3 collects data on the solution of our collection of large scale problems. We report in it the
number of outer iterations, NITER, the number of inner iterations, Newton, the total number
of cuts (subgradients) added throughout the whole solution process, the number of shortest path
type cuts, and the CPU time (to reach a 6-digit accurate solution on a POWER PC computer).
To give a bit of insight into ACCPM's behavior, Table 3 additionally reports the time
spent in the factorizations of S (the dominating term in the master), tF, and the time spent solving the
subproblems, tS.
The reader may note that the solution times for these problems are considerable. It took
ACCPM about 32 hours of CPU, for example, to get a 6-digit optimal solution of Random42.
The dominating term in the solution time was the computation of 24 analytic centers, i.e.,
the solution of 24 subsequent relaxed master problems. This required 534 very expensive IPM
iterations (factorizations of S ). One has to be aware, however, that each master problem was a
nontrivial linear program with 13000 rows and a considerable number of columns that grew at
the end to about 150,000.
Problem     NITER  Newton  Cuts total  Cuts paths          tF          tS   Total CPU
NDO22          17      84         420          46        0.81        0.03        0.96
NDO148         13      83        2230         306       13.98        0.78       15.65
Random12       19     257       16519        5119      639.11       82.54      736.54
Random15       26     527       51888       25888     5595.43      518.16     6180.80
Random16       21     259       26494        5494     1322.05      102.28     1448.44
Random20       28     577       64241       36241    10714.02      876.20    11688.74
Random21       23     397       41857       18857     5103.80      523.03     5682.01
Random31       22     434       57700       13700    17008.75      568.70    17651.66
Random32       20     389       73795       13795    41542.88      825.13    42477.39
Random33       20     382       95902       15902    70354.10      921.51    71429.38
Random41       26     466      148214       18214    61853.98     1397.73    63452.67
Random42       24     534      149791       29791   114956.05     1541.63   116726.15
Random51       26     640       92503       40503    58364.35     2844.77    61387.06
Random52       30     688      115151       55151    73889.13     4379.27    78527.08
Random53       32     929      132816       68816    94361.97     4849.51    99485.27
Random54       26     593      122243       44243   111193.18     3316.83   114765.34
Random55       28     750      149046       65046   137348.32     3886.51   141534.21

Table 3: Efficiency of ACCPM (CPU times in seconds).
7 Conclusions
We have given in this paper a systematic discussion of the treatment of the relaxed master program
with an interior point algorithm. We have not imposed any condition on the decomposition
scheme used (it may follow either the "optimal point strategy" or the "central point strategy").
We have concentrated on the exploitation of the special structure of the relaxed master problem
within a single iteration of any interior point method in order to make that iteration as efficient as possible.
The techniques presented in this paper have been incorporated into the implementation of the
Analytic Center Cutting Plane Method. We have demonstrated their advantages when applying
ACCPM to the solution of large scale nonlinear multicommodity network flow problems.
The use of the structure exploiting techniques presented in this paper has allowed us to solve
significantly larger problems than in [13]. Their sizes have been pushed from 1000 arcs and 5000
commodities in the old version to 5000 arcs and 10000 commodities in the new one.
References
[1] Ahuja R.K., T.L. Magnanti and J.B. Orlin, Network Flows, Prentice-Hall, 1993.
[2] Benders J.F., Partitioning procedures for solving mixed-variables programming problems, Numerische Mathematik 4 (1962), pp. 238-252.
[3] Birge J. and L. Qi, Computing block-angular Karmarkar projections with applications to
stochastic programming, Management Science 34, No 12 (1988), pp. 1472-1479.
[4] Choi I.C. and D. Goldfarb, Exploiting special structure in a primal-dual path following algorithm,
Mathematical Programming 58 (1993) 33-52.
[5] Dantzig G.B., Linear Programming and Extensions, Princeton University Press, Princeton, 1963.
[6] Dantzig G.B. and P. Wolfe, The decomposition algorithm for linear programming, Econometrica 29,
4 (1961) 767-778.
[7] de Ghellinck G. and J.-P. Vial, A polynomial Newton method for linear programming, Algorithmica
1 (1986) 425-453.
[8] Duff I.S., A.M. Erisman and J.K. Reid, Direct methods for sparse matrices, Oxford University Press,
New York, 1987.
[9] Eckstein J., Large-scale parallel computing, optimization and operations research: a survey, ORSA
CSTS Newsletter 14 (1993), No 2, Fall 1993.
[10] Elzinga J. and T. G. Moore, A central cutting plane algorithm for the convex programming problem,
Mathematical Programming 8 (1973) 134-145.
[11] Gafni E.M. and D.P. Bertsekas, Two-metric projection methods for constrained optimization, SIAM
Journal on Control and Optimization 22, 6 (1984) 936-964.
[12] George A. and J.W.H. Liu, The Evolution of the Minimum Degree Ordering Algorithm, SIAM
Review 31 (1989), 1-19.
[13] Goffin J.-L., J. Gondzio, R. Sarkissian and J.-P. Vial, Solving nonlinear multicommodity flow problems by the analytic center cutting plane method, Technical Report 1994.21, Department of Management Studies, University of Geneva, September 1994, revised October 1995. Mathematical Programming (to appear).
[14] Goffin J.-L., A. Haurie and J.-P. Vial, Decomposition and nondifferentiable optimization with the
projective algorithm, Management Science 38, 2 (1992) 284-302.
[15] Goffin J.-L., A. Haurie, J.-P. Vial and D.L. Zhu, Using central prices in the decomposition of linear
programs, European Journal of Operational Research 64 (1993) 393-409.
[16] Golub G.H. and C. Van Loan, Matrix Computations, (2nd ed.) The Johns Hopkins University Press,
Baltimore and London, 1989.
[17] Gondzio J., Implementing Cholesky factorization for interior point methods of linear programming,
Optimization 27 (1993) pp. 121-140.
[18] Gondzio J., Multiple centrality corrections in a primal-dual method for linear programming, Technical
Report 1994.20, Department of Management Studies, University of Geneva, Switzerland, November
1994, revised May 1995. Computational Optimization and Applications (to appear).
[19] Gondzio J. and T. Terlaky, A computational view of interior point methods for linear programming,
in: Advances in Linear and Integer Programming, J. Beasley (ed.), Oxford University Press, Oxford,
England, 1996, pp. 106-147.
[20] Gonzaga C.C., Path following methods for linear programming, SIAM Review 34 (1992) 167-227.
[21] Hurd J.K. and F. M. Murphy, Exploiting special structure in primal-dual interior point methods,
ORSA Journal on Computing 4 (1992) pp. 39-44.
[22] Jansen B., C. Roos, T. Terlaky, and J.-P. Vial, Primal-Dual Target Following Algorithms for Linear Programming, Technical Report 93-107, Faculty of Technical Mathematics and Informatics,
Technical University Delft, Delft, The Netherlands, 1993, to appear in a special issue of Annals of
Operations Research, K. Anstreicher and R. Freund (eds.).
[23] Jones K.L., I.J. Lustig, J.M. Farvolden and W.B. Powell, Multicommodity network flows: the impact
of formulation on decomposition, Mathematical Programming 62 (1993) 95-117.
[24] Karmarkar N., A new polynomial time algorithm for linear programming, Combinatorica 4 (1984)
373-395.
[25] Kelley J.E., The cutting plane method for solving convex programs, Journal of the SIAM 8 (1960)
703-712.
[26] Lustig I.J., R.E. Marsten and D.F. Shanno, Interior Point Methods for Linear Programming: Computational State of the Art, ORSA Journal on Computing 6 (1994), 1-14.
[27] Merle O. du, Interior Points and Cutting Planes: a Development and Implementation of Methods for
Convex Optimization and Large Scale Structured Linear Programming, Ph.D Thesis, Department
of Management Studies, University of Geneva, Geneva, Switzerland, 1995 (in French).
[28] Roos C. and J.-P. Vial, Interior Point Methods, in: Advances in Linear and Integer Programming,
Beasley, J.E. (ed.), Oxford University Press, Oxford, England, 1996, pp. 49-104.
[29] Sonnevend G., New algorithms in convex programming based on a notion of "centre" (for systems of analytic inequalities) and on rational extrapolation, in: K.H. Hoffmann, J.B. Hiriart-Urruty,
C. Lemaréchal, and J. Zowe, eds., Trends in Mathematical Optimization: Proceedings of the 4th
French-German Conference on Optimization in Irsee, West Germany, April 1986, volume 84 of
International Series of Numerical Mathematics, pp. 311-327, Birkhäuser Verlag, Basel, Switzerland,
1988.
[30] Todd M.J., Exploiting special structure in Karmarkar's linear programming algorithm, Mathematical
Programming 41 (1988) pp. 81-103.