Hyper-sparsity in the revised simplex method and how to exploit it

J. A. J. Hall
K. I. M. McKinnon
8th October 2002
Abstract
The revised simplex method is often the method of choice when
solving large scale sparse linear programming problems, particularly when
a family of closely-related problems is to be solved. Each iteration of
the revised simplex method requires the solution of two linear systems
and a matrix vector product. For a significant number of practical
problems the result of one or more of these operations is usually sparse, a
property we call hyper-sparsity. Analysis of the commonly-used techniques
for implementing each step of the revised simplex method shows them
to be inefficient when hyper-sparsity is present. Techniques to exploit
hyper-sparsity are developed and their performance is compared with the
standard techniques. For the subset of our test problems that exhibits
hyper-sparsity, the average speedup in solution time is 5.2 when these
techniques are used. For this problem set our implementation of the
revised simplex method which exploits hyper-sparsity is shown to be
competitive with the leading commercial solver and significantly faster
than the leading public-domain solver.
1 Introduction
Linear programming (LP) is a widely applicable technique both in its own right
and as a sub-problem in the solution of other optimization problems. The
revised simplex method and the barrier method are the two efficient methods
for solving general large sparse LP problems. In a context where families of
related LP problems have to be solved, such as in integer programming and
decomposition methods, the revised simplex method is usually the more efficient
method.
The constraint matrices for most practical LP problems are sparse and, for
an implementation of the revised simplex method to be efficient, it is crucial
that only non-zeros in the coefficient matrix are stored and operated on. Each
iteration of the revised simplex method requires the solution of two linear
systems, the computation of which are commonly referred to as FTRAN and
BTRAN, and a matrix vector product computed as the result of an operation
referred to as PRICE. The matrices involved in these operations are submatrices
of the coefficient matrix so are normally sparse. For many problems these three
operations yield dense vectors, in which case there is limited scope for improving
the performance by fully exploiting any zeros in the vectors. However, there is
a significant number of practical problems where the results of one or more
of these operations are usually sparse, a property referred to in this paper as
hyper-sparsity. This phenomenon has been reported independently by Bixby [3]
and Bixby et al [4]. A number of classes of problems are known to exhibit
hyper-sparsity, in particular those with a significant network structure such
as multicommodity flow problems. Specialist variants of the revised simplex
method which exploit this structure explicitly have been developed, in particular
by McBride and Mamer [16, 17]. This paper identifies hyper-sparsity in a
wider range of more general LP problems. Techniques are developed which
can be applied within a general revised simplex solver when hyper-sparsity
is identified during the solution process and, for problems exhibiting hyper-sparsity, significant performance improvements are demonstrated.
The computational components of the revised simplex method and standard
techniques for FTRAN and BTRAN are introduced in Section 2 of this paper.
Section 3 describes hyper-sparsity and gives statistics on its occurrence in a
test set of LPs drawn from the standard Netlib set [10], the Kennington test
set [5] and the authors’ personal collection [13]. Analysis given in Section 4
shows the commonly-used techniques for each of the computational components
of the revised simplex method to be inefficient when hyper-sparsity is present
and techniques to exploit hyper-sparsity are described. Section 5 presents a
computational comparison of the authors’ revised simplex solver, EMSOL, with
and without the techniques for exploiting hyper-sparsity. For those problems
which exhibit hyper-sparsity, a comparison is also made between EMSOL,
SOPLEX 1.2 [21] and the CPLEX 6.5 [15] primal simplex solver. Conclusions
are offered in Section 6.
Although it contains no new ideas, this paper represents a significant
reworking of [12]. The presentation has been greatly improved and the
comparisons between EMSOL, SOPLEX and CPLEX have been added.
2 The revised simplex method
The revised simplex method and its computational requirements are most
conveniently discussed in the context of LP problems in standard form
    minimize    c^T x
    subject to  Ax = b,
                x ≥ 0,

where x ∈ R^n and b ∈ R^m.
In the simplex method, the variables are partitioned into index sets B of
m basic variables and N of n − m nonbasic variables such that the basis
matrix B formed from the columns of A corresponding to the basic variables
is nonsingular. The set B itself is conventionally referred to as the basis. The
columns of A corresponding to the nonbasic variables form the matrix N and the
components of c corresponding to the basic and nonbasic variables are referred to
as, respectively, the basic costs cB and non-basic costs cN . When the nonbasic
variables are set to zero, the values b̂ = B^{-1} b of the basic variables, if non-negative, correspond to a vertex of the feasible region. If any basic variables
are infeasible they are given costs of −1, with other variables being given costs
of zero. Phase I iterations are then performed in order to force all variables to
be non-negative, if possible. An account of the details of the revised simplex
method is given by Chvátal in [6] and the computational components of each
iteration are summarised in Figure 1. Note that although the reduced costs may
be computed directly using the following BTRAN and PRICE operations
    π_B^T = c_B^T B^{-1}
    ĉ_N^T = c_N^T − π_B^T N,
it is more efficient computationally to update them by calculating the pivotal
row â_p^T = e_p^T B^{-1} N, where e_p is column p of the identity matrix, using
the BTRAN and PRICE operations defined in Figure 1. The only significant
computational requirement which is not indicated in Figure 1 occurs when, in a
phase I iteration, one or more variables which remain basic become feasible so
their cost coefficients increase by one to zero. In order to update the reduced
costs, it is necessary to compute the corresponding linear combination of tableau
rows. This composite row is formed by the following BTRAN and PRICE
operations
    π_δ^T = δ^T B^{-1}
    â_δ^T = π_δ^T N,
where the nonzeros in δ are equal to one for each variable which remains
basic and becomes feasible. In this paper, techniques are discussed which are
applicable to all BTRAN and PRICE operations. This is done with reference to
an unsubscripted vector π corresponding to a generic right-hand-side vector
r for the system B^T π = r solved by BTRAN. When techniques are only
applicable to a particular class of BTRAN or PRICE operations, the specific
notation for the right-hand-side and solution is used.
CHUZC:  Scan ĉ_N for a good candidate q to enter the basis.
FTRAN:  Form the pivotal column â_q = B^{-1} a_q, where a_q is column q of A.
CHUZR:  Scan the ratios b̂_i/â_{iq} for the row p of a good candidate to leave the
        basis. Let α = b̂_p/â_{pq}. Update b̂ := b̂ − α â_q.
BTRAN:  Form π_p^T = e_p^T B^{-1}.
PRICE:  Form the pivotal row â_p^T = π_p^T N.
        Update reduced costs ĉ_N^T := ĉ_N^T − ĉ_q â_p^T.
If (growth in factors) then
    INVERT: Form a factored representation of B^{-1}.
else
    UPDATE: Update the factored representation of B^{-1} corresponding to the
            basis change.
end if

Figure 1: Operations in an iteration of the revised simplex method
Efficient implementations of the revised simplex method generally weight
each reduced cost in CHUZC by some measure of the magnitude of the
corresponding column of the standard simplex tableau, commonly referred to
as an edge weight. Many solvers, including the authors’ solver EMSOL, use
the Harris Devex strategy [14]. This requires only the pivotal row to update
the edge weights so carries no significant computational overhead beyond that
outlined in Figure 1.
2.1 The representation of B^{-1}
In each iteration of the simplex method it is necessary to solve two systems,
one involving the current basis matrix B and the other its transpose. This
is achieved by passing respectively forwards and backwards through the data
structure corresponding to a factored representation of B −1 . There are a number
of procedures for updating the factored representation of B −1 , the original and
simplest being the product form update of Dantzig and Orchard-Hays [7]. This
approach is used by EMSOL and the techniques in this paper are developed
with the product form update in mind.
If B_0^{-1} is used to denote the factored representation obtained by INVERT, and E_U represents the transformation of B_0 corresponding to subsequent basis changes such that B = B_0 E_U, it follows that B^{-1} may be expressed as

    B^{-1} = E_U^{-1} B_0^{-1}.
In a typical implementation of the revised simplex method for large sparse LP
problems, B_0^{-1} is represented as a product of K_I elimination operations derived directly from the nontrivial columns of the matrices L and U which form an LU decomposition of (a row and column permutation of) B_0. This invertible representation allows B_0^{-1} to be expressed algebraically as

    B_0^{-1} = \prod_{k=K_I}^{1} E_k^{-1},

where

    E_k^{-1} = \begin{bmatrix}
                 1 &        & -\eta_1^k       &        &   \\
                   & \ddots & \vdots          &        &   \\
                   &        & -\eta_{p_k-1}^k &        &   \\
                   &        & \mu^k           &        &   \\
                   &        & -\eta_{p_k+1}^k &        &   \\
                   &        & \vdots          & \ddots &   \\
                   &        & -\eta_m^k       &        & 1
               \end{bmatrix},                                   (1)

with \mu^k in position (p_k, p_k) and the remaining nontrivial entries in column p_k.
Within an implementation, the nonzeros in the ‘eta’ vector
    η^k = [ η_1^k  . . .  η_{p_k−1}^k  0  η_{p_k+1}^k  . . .  η_m^k ]^T

are stored as value-index pairs and the data structure {p_k, µ^k, η^k}_{k=1}^{K_I} is referred
k=1 is referred
to as an eta file. Eta vectors associated with the matrices L and U coming from
Gaussian elimination are referred to as L-etas and U -etas respectively. Note that
the number of nonzeros in the eta file is no more than the number of nonzeros in the LU decomposition. The operations with p_k, µ^k and η^k required when forming E_k^{-1} r during the standard algorithms for FTRAN and E_k^{-T} r during BTRAN are illustrated in Figure 2.
The product form update leaves the factored form of B0−1 unaltered and
represents EU−1 as a product of pairs of elementary matrices of the form (1).
The representation of each UPDATE operation is obtained directly from the
pivotal column and is given by p_k = p, µ^k = 1/â_{pq} and η^k = â_q − â_{pq} e_p.
In solvers based on the product form, the representation of the UPDATE
operations can be appended to the eta file following INVERT, resulting in a
if (r_{p_k} ≠ 0) then
    r_{p_k} := µ^k r_{p_k}
    r := r − r_{p_k} η^k
end if

(a) FTRAN

r_{p_k} := µ^k (r_{p_k} − r^T η^k)

(b) BTRAN
Figure 2: Standard operations in FTRAN and BTRAN
single homogeneous data structure. However, in this paper and in EMSOL,
the particular properties of the INVERT and UPDATE etas and the nature of
the operations with them are exploited, so FTRAN is performed as the pair of
operations
    ã_q = B_0^{-1} a_q                                   (I-FTRAN)

followed by

    â_q = E_U^{-1} ã_q.                                  (U-FTRAN)

Conversely, BTRAN is performed as

    π̃^T = r^T E_U^{-1}                                   (U-BTRAN)

followed by

    π^T = π̃^T B_0^{-1}.                                  (I-BTRAN)
Note that the term RHS is used to refer, not just to the vector on the right hand
side of the system to be solved, but to the vector at any stage in the process of
transforming it into the solution of the system.
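To make the standard operations concrete, the following sketch applies an eta file in the FTRAN and BTRAN fashion of Figure 2. It is illustrative Python, not EMSOL's code: each eta is assumed to be stored as its pivot row p, multiplier mu = µ^k (the reciprocal of the pivot) and the packed index-value arrays (idx, val) of the nonzeros of η^k.

    import numpy as np

    def ftran(etas, r):
        """Overwrite r with B0^{-1} r: a forward pass through the eta file."""
        for p, mu, idx, val in etas:
            if r[p] != 0.0:              # the test for zero of Figure 2(a)
                r[p] *= mu
                r[idx] -= r[p] * val     # r := r - r_{p_k} eta^k

    def btran(etas, r):
        """Overwrite r with B0^{-T} r: a backward pass through the eta file."""
        for p, mu, idx, val in reversed(etas):
            # r_{p_k} := mu^k (r_{p_k} - r^T eta^k)
            r[p] = mu * (r[p] - np.dot(r[idx], val))

Here r, idx and val are NumPy arrays. When r is sparse, the dominant cost of ftran is the test r[p] != 0 executed once per eta, and most of btran's inner products are structurally zero; both observations motivate the hyper-sparse variants of Section 4.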
3 What is hyper-sparsity?
Each iteration of the revised simplex method performs FTRAN to obtain the
pivotal column âq = B −1 aq as the solution of one linear system, BTRAN to
obtain π Tp = eTp B −1 as the solution of a second linear system, and PRICE
to form the pivotal row âTp = π Tp N as the result of a matrix-vector product.
For many LP problems, even those for which the constraint matrix is sparse,
there are few zero values in the results of these operations. However, as this
paper demonstrates, there is a significant set of practical problems for which the
proportion of zero values in the results of these operations is such that exploiting
this yields very large computational savings.
For the sake of classifying test problems, in this paper the result of an
FTRAN, BTRAN or PRICE operation is considered to be sparse if no more
than 10% of its entries are nonzero and if at least 60% of the results of that
particular operation are sparse then this is taken to be a clear majority. An LP
problem is considered to exhibit hyper-sparsity if a clear majority of the results
of one or more of these operations is sparse. These thresholds are used for the
presentation of results only and are not parameters in the implementation of
the methods which exploit hyper-sparsity.
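For illustration only, the classification just described can be expressed as follows; the thresholds are the presentation thresholds above, not solver parameters, and the function names are ours.

    def is_sparse_result(nnz, dim):
        # a result is sparse if no more than 10% of its entries are nonzero
        return nnz <= 0.10 * dim

    def operation_is_hyper_sparse(result_nnz_counts, dim):
        # a clear majority: at least 60% of the operation's results are sparse
        sparse = sum(is_sparse_result(nnz, dim) for nnz in result_nnz_counts)
        return sparse >= 0.60 * len(result_nnz_counts)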
The extent to which hyper-sparsity exists in LP problems was investigated
for a subset of the standard Netlib test set [10], the Kennington test set [5]
and the authors’ personal collection [13]. Those problems from the Netlib set
whose solution requires less than ten seconds of CPU time were excluded, as
were FIT2D and the OSA problems from the Kennington set. For the latter
problems, the number of columns is particularly large relative to the number of
rows so the solution techniques developed in this paper are inappropriate.
Note that a simple standard scaling algorithm is applied to each of the problems
for reasons of numerical stability and each problem is solved from the initial basis
obtained using the crash procedure in CPLEX [15].
For each problem the density of each pivotal column âq following FTRAN, of
π p and π δ following BTRAN and of âTp and âTδ following PRICE was determined,
and the total number of each which was found to be sparse was counted. The
problems for which there is a clear majority of sparse results for at least one of
these three operations are listed in Table 1 and referred to as test set H. The
remaining problems, those which exhibit no hyper-sparsity in any of FTRAN,
BTRAN or PRICE, are listed in Table 2 and referred to as test set H′. For
each of the three operations, Tables 1 and 2 give the percentage of the results
which are sparse. Subsequent columns give, for each of the three operations,
the average density of those results which are sparse and the average density of
those results which are not. The final column of Table 1 summarises the extent
to which the problem exhibits hyper-sparsity by giving the initial letter of the
operation(s) for which a clear majority of the results is sparse.
The first thing to note from the results in Table 1 is that all but two of the
problems in H exhibit hyper-sparsity for BTRAN and most do so for all three
operations. It is interesting to consider why this is the case and why there are
exceptions.
Recall that the result of FTRAN is the pivotal column âq and the result
of (most) PRICE operations is the pivotal row. Since these are a column and
row from the same matrix (the standard simplex tableau B −1 N ) it might be
expected that a problem would exhibit hyper-sparsity in both FTRAN and
PRICE or in neither. Also, since π_p is just a single row of B^{-1}, and the pivotal column is usually a linear combination of several columns of B^{-1}, it might be expected that π_p would be less dense than â_q. These arguments would
lead us to expect all problems in H to have property B, and that problems in H
would be either FBP or B. The reasons for the exceptions are now explained.
There are two problems, DCP1 and DCP2, of type F, i.e. the pivotal columns
are typically sparse but the results of BTRAN and PRICE are not. These
problems are decentralised planning problems for which a typical standard
simplex tableau is very sparse with a few dense rows. Thus the pivotal columns
are usually sparse. However, the pivot is usually chosen from one of the dense
rows.
Conversely there are five problems of the opposite type, BP, i.e. the pivotal
rows are typically sparse but the pivotal columns are not. The most remarkable
of these is FIT2P: 81% of pivotal columns are essentially full and all but one of
the pivotal rows are sparse. For this problem, most columns of the constraint
matrix have only one nonzero entry, with the remaining columns being very
dense. Thus B −1 is largely diagonal with a small number of essentially full
columns. For this model it turns out that most variables chosen to enter the
basis have a single nonzero entry such that the pivotal column is (a multiple of)
one of these dense columns of B −1 . Each π p is a row of B −1 and its resulting
                      Dimensions             Sparse results (%)   Mean density (%):      Mean density (%):
                                                                  sparse results         non-sparse results     Hyper-
Problem      Rows   Columns  Nonzeros    FTRAN  BTRAN  PRICE   FTRAN  BTRAN  PRICE   FTRAN  BTRAN  PRICE    sparse
80BAU3B      2262      9799     21002       97     72     70    2.40   0.61   0.66   10.84  42.41  50.85    FBP
FIT2P        3000     13525     50284       19    100    100    0.17   0.65   0.91   91.10  16.93  19.46    BP
GREENBEA     2392      5405     30877       15     60     60    4.62   0.49   0.81   24.73  75.54  80.75    BP
GREENBEB     2392      5405     30877       16     61     60    4.55   0.48   0.77   27.81  77.75  83.78    BP
STOCFOR3    16675     15695     64875       29    100     75    2.09   2.26   2.86   62.18     —   14.97    BP
WOODW        1098      8405     37474       32     73     71    3.90   1.01   1.14   22.60  57.55  59.91    BP
DCP1         4950      3007     93853       98     56     47    4.52   1.33   1.26   11.00  13.90  82.60    F
DCP2        32388     21087    559390      100     59     53    1.25   0.68   0.45      —   15.08  82.34    F
CRE-A        3516      4067     14987      100     82     80    1.91   0.39   0.61      —   28.17  56.06    FBP
CRE-C        3068      3678     13244      100     83     80    2.86   0.45   0.59      —   33.71  58.43    FBP
KEN-11      14694     21349     49058      100     98     97    0.10   0.27   0.28      —   11.88  13.79    FBP
KEN-13      28632     42659     97246      100     92     92    0.12   0.16   0.17      —   35.85  42.07    FBP
KEN-18     105127    154699    358171      100     93     93    0.05   0.21   0.20      —   39.58  41.94    FBP
PDS-06       9881     28655     62524      100     97     97    0.94   0.34   0.46      —   18.95  17.71    FBP
PDS-10      16558     48763    106436      100     96     96    1.22   0.24   0.31      —   23.60  25.19    FBP
PDS-20      33874    105728    230200      100     94     94    2.55   0.21   0.29      —   41.87  41.11    FBP

Table 1: Problems exhibiting hyper-sparsity (set H): dimensions, percentage of the
results of FTRAN, BTRAN and PRICE which are sparse, average density of those results
which are sparse, average density of those results which are not and summary of those
operations for which more than 60% of the results are sparse.
                      Dimensions             Sparse results (%)   Mean density (%):      Mean density (%):
                                                                  sparse results         non-sparse results
Problem      Rows   Columns  Nonzeros    FTRAN  BTRAN  PRICE   FTRAN  BTRAN  PRICE   FTRAN  BTRAN  PRICE
BNL2         2324      3489     13999       29     36     35    2.28   0.61   0.79   28.56  40.03  70.22
D2Q06C       2171      5167     32417       12     25     25    1.84   0.48   0.86   49.32  66.32  87.24
D6CUBE        415      6184     37704        2      9      9    5.10   0.64   1.11   65.98  94.43  97.91
DEGEN3       1503      1818     24646       15     43     42    4.42   1.22   1.47   25.55  62.05  84.97
DFL001       6071     12230     35632        2     37     37    1.57   0.22   0.37   51.57  83.45  92.51
MAROS-R7     3136      9408    144848       24      1      1    0.54   0.59   1.43   81.25  30.78  62.30
PILOT        1441      3652     43167        9     27     24    3.89   1.59   2.06   60.86  76.31  88.51
PILOT87      2030      4883     73152        8     19     15    1.33   0.99   0.89   74.93  72.50  91.08
QAP8          912      1632      7296        0     11     11    1.46   0.55   1.92   83.92  75.51  98.45
TRUSS        1000      8806     27836        8     39     38    4.03   0.81   1.28   47.71  85.56  80.62
CRE-B        9648     72447    256095       58     55     55    4.15   0.19   0.37   14.01  55.81  89.16
CRE-D        8926     69980    242646       57     56     55    3.61   0.22   0.39   13.82  54.43  89.20

Table 2: Problems not exhibiting hyper-sparsity (set H′): dimensions, percentage of the
results of FTRAN, BTRAN and PRICE which are sparse, average density of those results
which are sparse and average density of those results which are not.
sparsity is inherited by the pivotal row since most columns of N in the PRICE
operation have only one nonzero entry.
Several observations can be made from the columns in Tables 1 and 2.
For those problems in H′ the density of the majority of results is such that
the techniques developed for hyper-sparsity are seen to be of limited value.
For the problems in H, those results which are not sparse may have a very
high proportion of nonzero entries. It is important therefore that techniques
for exploiting hyper-sparsity are ‘switched off’ when such cases are identified
during a particular FTRAN, BTRAN or PRICE operation. If this is not done
then any computational savings obtained when exploiting hyper-sparsity may
be compromised.
4 Exploiting hyper-sparsity
Each computational component of the revised simplex method either forms,
or operates with, the result of FTRAN, BTRAN or PRICE. Each of these
components is considered below and it is shown that typical computational
techniques are inefficient in the presence of hyper-sparsity. In each case,
equivalent computational techniques are developed which exploit hyper-sparsity.
4.1 Relative cost of computational components
Table 3 gives, for test set H, the percentage of solution time which can be
attributed to each of the major computational components in the revised
simplex method. This allows the value of exploiting hyper-sparsity in a
particular component to be assessed. Although PRICE and CHUZC together
are dominant for most problems, each of the other computational components,
with the exception of U-FTRAN and U-BTRAN, constitutes at least 10% of the
solution time for some of the problems. Thus, to obtain general improvement
in computational performance requires hyper-sparsity to be exploited in all
computational components of the revised simplex method.
For the problems in test set H′, only U-BTRAN, PRICE and INVERT benefit noticeably from the techniques developed below. The percentages of the solution time for these computational components are given as part of Table 6.
4.2 Hyper-sparse FTRAN
For most problems which exhibit hyper-sparsity, Table 3 shows that the
dominant computational cost of FTRAN is I-FTRAN, particularly so for the
larger problems. When the pivotal column âq computed by FTRAN is sparse,
only a very small proportion of the INVERT (and UPDATE) eta vectors needs
to be applied (unless there is an improbable amount of cancellation). Indeed the
number of floating point operations required to perform these few operations
can be expected to be of the same order as the number of nonzeros in âq . Gilbert
and Peierls [11] identified this property in the context of Gaussian elimination
when the pivotal column is formed as required and is expected to be sparse.
It follows that the cost of I-FTRAN (and hence FTRAN) will be dominated by
the test for zero if the INVERT etas are applied using the standard operation
illustrated in Figure 2(a).
            Solution                    Percentage of solution time
Problem      CPU (s)   CHUZC  I-FTRAN  U-FTRAN  CHUZR  I-BTRAN  U-BTRAN  PRICE  INVERT
80BAU3B        31.51   20.26     5.52     0.65   3.35     6.46     1.09  60.02    1.58
FIT2P         117.77    9.00     4.01     1.65  13.11    15.57     0.76  44.90    8.32
GREENBEA       66.71    4.74     9.34     3.59   5.92    16.87     5.17  42.00   11.12
GREENBEB       47.78    4.68     9.57     3.79   6.15    17.02     5.31  40.41   11.83
STOCFOR3      372.43    4.29     7.84     7.95  13.91    17.13     4.20  23.07   18.34
WOODW          10.52    9.94     4.48     0.68   2.77     7.86     1.70  68.60    2.99
DCP1           47.51    3.86     4.71     1.94   3.33    17.79     2.33  59.74    5.64
DCP2         2572.45    4.58     3.83     2.27   2.39    17.11     2.46  60.38    6.87
CRE-A          13.71   12.13     9.08     0.98   6.38    16.81     2.46  47.52    3.00
CRE-C          10.12   12.58     8.35     1.10   7.15    15.22     2.56  47.71    3.53
KEN-11        209.86    9.00    10.58     0.20   3.27    21.84     0.96  53.26    0.59
KEN-13       1221.09    8.60     9.96     0.15   2.80    20.08     0.66  57.03    0.59
KEN-18      21711.40    9.10    10.57     0.08   2.79    20.72     0.36  55.93    0.42
PDS-06        143.33   12.76     7.93     0.22   3.11    14.41     1.28  58.48    1.47
PDS-10        868.28   12.45     7.01     0.18   2.68    13.51     1.24  61.05    1.70
PDS-20      10967.10   11.88     5.58     0.27   2.66    11.01     2.02  63.80    2.65
Mean                    9.37     7.40     1.61   5.11    15.59     2.16  52.74    5.04

Table 3: Total solution time and percentage of solution time for computational
components when not exploiting hyper-sparsity for test set H.
The aim of this subsection is to develop a computational technique which
identifies the INVERT etas which have to be applied without passing through
the whole INVERT eta file and testing each value of rpk for zero. The Gilbert
and Peierls approach finds this minimal set of etas at a cost proportional to
their total number of nonzeros.
A limitation of the Gilbert and Peierls approach is that the entire minimal
set of etas has to be found before any of it can be used. If ãq is not sparse
then the cost of this approach could be significantly greater than the standard
I-FTRAN. This drawback is avoided by the hyper-sparse I-FTRAN algorithm
given as pseudo-code in Figure 3, and explained in the following paragraph.
K = {k : r_{p_k} ≠ 0}
while K ≠ ∅
    k_0 = min{k ∈ K}
    r_{p_{k_0}} := µ^{k_0} r_{p_{k_0}}
    for i ∈ E_{k_0} do
        if (r_i ≠ 0) then
            r_i := r_i − r_{p_{k_0}} [η^{k_0}]_i
        else
            r_i := −r_{p_{k_0}} [η^{k_0}]_i
            if (P_i^{(1)} > k_0) K := K ∪ {P_i^{(1)}}
            if (P_i^{(2)} > k_0) K := K ∪ {P_i^{(2)}}
            R := R ∪ {i}
        end if
    end do
    K := K \ {k_0}
end while

Figure 3: Hyper-sparse I-FTRAN algorithm
For a given RHS vector r and set of indices of nonzeros R = {i : r_i ≠ 0}
(which is always available without a search for the nonzeros), the hyper-sparse
I-FTRAN algorithm is initialised by forming a set K of indices k of etas for which
r_{p_k} is nonzero. This is done by passing through the indices in R and using the arrays P^{(1)} and P^{(2)}. These record, for each row, the index of the first and
second eta which have a pivot in the particular row. Note that, because the
etas used in I-FTRAN correspond to the LU factors, there can be at most two
etas with a pivot in any particular row. Unless cancellation occurs, all the etas
with indices in K will have to be applied. If K is empty then no more etas need to
be applied so I-FTRAN is complete. Otherwise, the least index k_0 ∈ K identifies the next eta which needs to be applied. In applying eta k_0 the algorithm steps through the nonzeros in η^{k_0} (whose indices are in E_{k_0}). For each fill-in row the algorithm checks if there are etas k with k > k_0 whose pivot is in this row, and
any such are added to K. (The check only requires two lookups using the P^{(1)} and P^{(2)} arrays.) Finally the entry k_0 is removed from K.
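The following Python transcription of Figure 3 is a minimal sketch rather than EMSOL's implementation: etas are stored as (p, mu, idx, val) as before, P1 and P2 hold the (at most two) eta indices with a pivot in each row (−1 when there is no such eta), and a heap stands in for the unordered list that EMSOL uses for K.

    import heapq

    def hyper_sparse_iftran(etas, P1, P2, r, R):
        """Overwrite r (nonzero pattern R) with B0^{-1} r, applying only
        the etas that can change it."""
        K, done = [], [False] * len(etas)
        for i in R:                         # initialise K = {k : r_{p_k} != 0}
            for k in (P1[i], P2[i]):
                if k >= 0:
                    heapq.heappush(K, k)
        while K:
            k0 = heapq.heappop(K)           # least index in K
            if done[k0]:
                continue                    # duplicate heap entry
            done[k0] = True
            p, mu, idx, val = etas[k0]
            if r[p] == 0.0:
                continue                    # improbable cancellation
            r[p] *= mu
            for i, v in zip(idx, val):
                if r[i] != 0.0:
                    r[i] -= r[p] * v
                else:
                    r[i] = -r[p] * v        # fill-in: row i becomes nonzero
                    for k in (P1[i], P2[i]):
                        if k > k0:          # later etas pivoting on row i
                            heapq.heappush(K, k)
                    R.append(i)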
The set K must be searched to determine the next eta to be applied, and
there is some scope for variation in the way that this is achieved. In EMSOL,
K is maintained as an unordered list and, if the number of entries in K becomes
large, there comes a point at which the cost of the search exceeds the cost of
the tests for zero which it seeks to avoid. To prevent this happening the average
skip through the eta file which has been achieved during the current FTRAN is
compared with a multiple of |K| to determine the point at which it is preferable
to complete I-FTRAN using the standard algorithm.
Although the set of possible nonzeros in the RHS is not required by the
algorithm in Figure 3, it is maintained as the set R. There is no real scope for
exploiting hyper-sparsity in U-FTRAN which, as a consequence, is performed
using the standard algorithm with the modification that R is maintained so
long as the RHS is sparse. The availability of set R which, on completion of
FTRAN, gives the possible positions of nonzeros in the pivotal column, allows
hyper-sparsity to be exploited in CHUZR, as indicated below.
4.3 Hyper-sparse CHUZR
CHUZR performs a small number of floating-point operations for each of the
nonzero entries in the pivotal column. If the indices of the nonzeros are not
known a priori then all entries in the pivotal column will have to be tested for
zero, and for problems when this vector is typically sparse, the cost of CHUZR
will be dominated by the cost of performing the tests for zero. If a list of indices
of entries in the pivotal column which are (or may be) nonzero is known, then
this overhead is eliminated. The nonzero entries in the pivotal column are also
required both to update the values of the basic variables following CHUZR and,
as described in Section 4.8, to update the product form UPDATE eta file. If
the nonzero entries in the workspace vector used to compute the pivotal column
are zeroed after being packed onto the end of the UPDATE eta file, this yields
a contiguous list of real values to update the values of the basic variables and
makes the UPDATE operation near-trivial. A further consequence is that, so
long as pivotal columns remain sparse, the only complete pass through the
workspace vector used to compute the pivotal column is that required to zero
it before the first simplex iteration.
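A sketch of a ratio test driven by the index list R is given below. It retains only the textbook test (the Harris and Devex refinements used in practice are omitted) and the names are illustrative.

    def chuzr(bhat, ahat_q, R, tol=1e-9):
        """Return the leaving row p and ratio alpha = bhat[p]/ahat_q[p],
        scanning only the (possible) nonzeros of the pivotal column."""
        p, alpha = -1, float('inf')
        for i in R:                    # no test for zero over the full column
            a = ahat_q[i]
            if a > tol:                # rows that can block the entering variable
                ratio = bhat[i] / a
                if ratio < alpha:
                    p, alpha = i, ratio
        return p, alpha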
4.4 Hyper-sparse BTRAN
When performing BTRAN using the standard operation illustrated in
Figure 2(b), most of the work comes from the evaluation of the inner product r^T η^k. However, when the RHS is sparse, it will usually be the case that there is no intersection of the nonzeros in r^T and η^k, so that the result is structurally zero. Unfortunately, checking directly whether the intersection is non-empty is slower than simply evaluating the inner product. Better techniques are
discussed in the remainder of this subsection.
4.4.1 Avoiding structurally zero inner products and operations with zero
When using the product form update, it is valuable to consider U-BTRAN
separately from I-BTRAN. When forming π Tp = eTp B −1 , which constitutes
the overwhelming majority of BTRAN operations, it is possible in U-BTRAN
to eliminate all the structurally zero inner products and significantly reduce
the number of operations with zero. For all BTRAN operations it is possible
to eliminate a significant number of the structurally zero inner products in
I-BTRAN.
Hyper-sparse U-BTRAN when forming π_p
Let K_U denote the number of UPDATE operations which have been performed since INVERT and let P denote the set of indices of those rows which have been pivotal. Note that the inequality |P| ≤ K_U is strict if a particular row has been pivotal more than once. Since the RHS of B^T π_p = e_p has only one nonzero, in the row which has just been pivotal, and fill-in during U-BTRAN can only occur in components of the RHS corresponding to pivots, it follows that the nonzeros in π̃_p^T = e_p^T E_U^{-1} are restricted to the components with indices in P. Thus, when applying the k-th UPDATE eta, only the nonzeros with indices in P contribute to r^T η^k. Since |P| is very much smaller than the dimension of B, it follows that
unless this observation is exploited, most of the floating point operations when
applying the UPDATE etas involve zero. A significant improvement in efficiency
is achieved by maintaining a rectangular array EP of dimension |P| × K_U which
holds the values of the entries corresponding to P in the UPDATE etas, allowing
π̃_p to be formed as a sequence of K_U inner products. These are computed by
indirection into EP using a list of indices of nonzeros in the RHS which is simple
and cheap to maintain.
When using this technique, if the update etas are sparse then EP will be
largely zero. As a consequence, most of the inner products r^T η^k will be structurally zero, and most of the values of r_{p_k} will be zero. The first (next) eta
for which this is not the case, and so must be applied, is identified by searching
for the first (next) nonzero in the rows of EP for which the corresponding
component of r is nonzero. The extra work of performing this search is usually
much less than the computation which is avoided. Indeed, the initial search
frequently identifies that none of the UPDATE etas needs to be applied.
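An illustrative sketch of this U-BTRAN follows; the data layout is an assumption, not EMSOL's. EP[j][k] holds entry j of UPDATE eta k restricted to the rows in P, in local indices 0, ..., |P|−1 and with a zero stored in the eta's own pivot position; piv[k] is the local index of eta k's pivot row and mu[k] its multiplier.

    def u_btran_pi(EP, piv, mu, r, nz):
        """Backward pass over the K_U UPDATE etas restricted to P; r is a
        dense length-|P| work vector and nz the list of its nonzeros."""
        for k in reversed(range(len(mu))):   # BTRAN applies the etas in reverse
            # r^T eta^k by indirection over the nonzeros of r; for sparse
            # update etas this is usually structurally zero
            s = sum(r[j] * EP[j][k] for j in nz)
            jp = piv[k]
            rp = mu[k] * (r[jp] - s)
            if rp != 0.0 and r[jp] == 0.0:
                nz.append(jp)                # record fill-in
            r[jp] = rp

In practice, as described above, the structurally zero inner products are not even formed: the next eta that must be applied is found by a row-wise search of EP over the rows listed in nz.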
If K_U is sufficiently large then a prohibitively large amount of storage is required to store EP in a rectangular array. However EP may be represented by ensuring that in the UPDATE eta file the indices and values of nonzeros in η^k for rows in P are stored before any remaining indices and values. It is
still possible to perform row-wise searches of EP by using an additional, more
compact, data structure from which the eta index of nonzeros in each row of EP
may be deduced. The overhead of maintaining this data structure and searching
it is usually much less than the floating-point operations with zero that would
otherwise be performed.
Note that since π̃_p is sparse for any LP problem, the technique for exploiting this is of benefit whether or not the update etas are sparse. However it is seen in Table 6 that, for problems in test set H′, the saving is a small proportion of
the overall cost of performing a simplex iteration.
Hyper-sparse I-BTRAN using a column-wise INVERT eta file
As in the U-BTRAN case above we wish to find a way of only applying those
INVERT eta vectors which require at least one nonzero operation. Assume we
have available a list, Q^{(1)}, of the indices of the last INVERT eta with a nonzero in each row, with an index of zero used to indicate that there is no such eta. The greatest index in Q^{(1)} corresponding to the nonzeros in π̃ then indicates the first
INVERT eta which must be applied. As with the UPDATE etas, the technique
may indicate that a significant number of the INVERT etas need not be applied
and, if the index is zero, it follows immediately that π = π̃. More generally,
if the list Q^{(l)} of the index of the l-th last INVERT eta with a nonzero in each
row is recorded, for l from 1 to some small limit, then several significant steps
backwards through the INVERT eta file may be made. However, to implement
this technique requires an integer array of dimension equal to that of B0 for
each list, and the backward steps are likely to become smaller for larger l, so in
EMSOL only Q^{(1)} and Q^{(2)} are recorded.
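For illustration, the skip itself is then a single reduction over the nonzero pattern of π̃; here Q1 plays the role of Q^{(1)}, with 0 meaning that no INVERT eta has a nonzero in the row.

    def ibtran_first_eta(Q1, nz):
        """Index of the last INVERT eta with a nonzero in any nonzero row
        of the RHS; 0 means no INVERT eta need be applied, i.e. pi = pi~."""
        return max((Q1[i] for i in nz), default=0)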
4.4.2 Hyper-sparse I-BTRAN using a row-wise INVERT eta file
The limitations and/or high storage requirements associated with exploiting
hyper-sparsity during I-BTRAN with the conventional column-wise (INVERT)
eta file motivate the formation of an equivalent representation stored row-wise.
This may be formed after the normal column-wise INVERT by passing twice
through the complete column-wise INVERT eta file. This row-wise eta file
permits I-BTRAN to be performed using the algorithm given in Figure 3 for
I-FTRAN. For problems in which π is typically sparse, the computational
overhead in forming the row-wise eta file is far outweighed by the savings
achieved when applying it, even when compared to the hyper-sparse I-BTRAN
using a column-wise INVERT eta file.
4.4.3 Maintaining a list of the indices of nonzeros in the RHS
During BTRAN, when using the standard operation illustrated in Figure 2(b),
it is simple and cheap to maintain a list of the indices of the possible nonzeros
in the RHS: if r_{p_k} is zero and r^T η^k is nonzero then the index p_k is added to the
end of a list. When performing I-BTRAN by applying the algorithm given in
Figure 3 with a row-wise INVERT eta file, the list of the indices of the possible
nonzeros in the RHS is maintained as set R. For problems when π is frequently
sparse, knowing the indices of those elements which are (or may be) nonzero
allows a valuable saving to be made in PRICE.
4.5 Row-wise (hyper-sparse) PRICE
For the problems in test set H, it is clear from Table 3 that PRICE accounts
for about half of the CPU time required for most problems and significantly
more than that for others. The matrix-vector product π^T N is commonly
formed as a sequence of inner products between π and (the packed form of)
the appropriate columns of the constraint matrix. In the case when π is full,
there will be no floating-point operations with zero so this simple technique
is optimal. However this is far from being true if π is sparse, in which case,
by forming π^T N as a linear combination of those rows of N which correspond to nonzero entries in π, all floating point operations with zero are avoided. Although the cost of maintaining the row-wise representation of N is non-trivial, this is far outweighed by the efficiency with which π^T N may then be formed.
Within reason, even for problems when π is not sparse, performing PRICE
with a row-wise representation of N is advantageous. For example, even if π
is on average half full, the overhead of maintaining the row-wise representation
of N is small compared to the half of the work of column-wise PRICE which is
saved.
For problems when π is particularly sparse, the time to test all entries for
zero dominates the time to do the small number of floating point operations
which involve the nonzeros in π. The cost of testing can be avoided if a list of
the indices of the non-zeros in π is known and, as identified in Section 4.4.3,
such a list can be constructed at negligible cost during BTRAN. After each
nonzero entry in π is used it is zeroed, so that by the end of BTRAN the entire
π workspace is zeroed in preparation for the next BTRAN. Thus, so long as
π vectors remain sparse, the only complete pass through this workspace array
when solving an LP problem is that required to zero it before the first simplex
iteration.
Provided π^T N remains sparse, it is advantageous to maintain a list of the indices of its nonzeros. This list can be used to avoid having to search for the nonzeros in π^T N, which would otherwise be the dominant cost when using this vector. Once again the list of indices of the non-zeros can be used to zero the workspace so, provided the π^T N vectors remain sparse, the only full pass through this workspace that is needed is to zero it before the first simplex iteration.
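A sketch of this row-wise PRICE is given below (the names and data layout are illustrative): N_rows[i] holds row i of N packed as (column, value) pairs, and the loop zeroes each entry of π as it is consumed while recording the nonzero pattern of the result, exactly as described above.

    def rowwise_price(N_rows, pi, nz_pi, result, nz_result):
        """Accumulate result := pi^T N as a combination of rows of N."""
        for i in nz_pi:
            v = pi[i]
            for j, a in N_rows[i]:
                if result[j] == 0.0:
                    nz_result.append(j)   # new (possible) nonzero in pi^T N
                result[j] += v * a
            pi[i] = 0.0                   # re-zero the pi workspace as we go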
4.6 Hyper-sparse CHUZC
Before discussing methods for CHUZC which exploit hyper-sparsity, it should
be observed that, since the vector cB of basic costs may be full, the vector of
reduced costs given by
    ĉ_N^T = c_N^T − c_B^T B^{-1} N,
may also be full. Further, for most of the solution time, a significant proportion
of the reduced costs are negative. Thus, even for LP problems exhibiting hyper-sparsity, the attractive nonbasic variables do not form a small set whose size could then be exploited. However, if the pivotal row, e_p^T B^{-1} N, is sparse, the
number of reduced costs which change each iteration is small and this can be
exploited to improve the efficiency of CHUZC.
The aim of the hyper-sparse CHUZC algorithm is to maintain a set, C,
consisting of a small number, s, of variables which is guaranteed to include the
variable with the most negative reduced cost. At the start of an LP iteration
let g be the most negative reduced cost of any variable in C and let h be the
lowest reduced cost of those non-basic variables not in C.
The steps of CHUZC for one LP iteration are:
• If h < g then reinitialise C by performing a full CHUZC: pass through all
the non-basic variables and find the s + 1 most attractive (i.e. those with
the lowest reduced costs). Store those with the lowest s reduced costs
in C, set g to the lowest reduced cost and h to the reduced cost of the
remaining variable.
• Find and remove the variable with the best reduced cost from C.
This provides the candidate to enter the basis.
• Whilst updating the reduced costs
– form a set D of the variables (not in C) with the lowest s reduced
costs which have changed;
– update h each time a variable is not included in D and has a lower
reduced cost than the current value of h.
Note that hyper-sparse PRICE generates a list of the non-zeros in the
pivot row, which is used to drive this step and simultaneously zero the
workspace for the pivot row.
• Update C to correspond to the lowest s reduced costs in C ∪ D, set g to
the lowest reduced cost in the resultant C and update h if a variable which
is not included in C has a lower reduced cost than the current value of h.
Note that the set operations with C and D can be performed in a time
proportional to their length. Also, observe that the technique for hyper-sparse
CHUZC described above extends naturally, and at no additional cost, when the
column selection strategy incorporates edge weights. Since the edge weight for
a column changes only if the corresponding entry in the pivotal row is nonzero,
the list D still contains the most attractive candidates not in C whose reduced
cost and weight have changed.
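A much-simplified sketch of this CHUZC follows; edge weights are omitted, a sorting-based merge stands in for the linear-time set operations described above, and all names are illustrative. Here c[j] is the reduced cost of nonbasic variable j.

    def chuzc_select(c, nonbasic, C, s, h):
        """Pick the entering variable, refilling C by a full pass only
        when the bound h admits a better candidate outside C."""
        g = min((c[j] for j in C), default=float('inf'))
        if h < g:                                  # full CHUZC
            ranked = sorted(nonbasic, key=lambda j: c[j])
            C = set(ranked[:s])                    # the s best candidates
            h = c[ranked[s]] if len(ranked) > s else float('inf')
        q = min(C, key=lambda j: c[j])             # most attractive candidate
        C.discard(q)
        return q, C, h

    def chuzc_merge(c, C, s, h, changed):
        """Fold the variables whose reduced costs changed (the nonzeros
        of the sparse pivotal row) back in, keeping the s best."""
        pool = sorted(set(C) | set(changed), key=lambda j: c[j])
        C = set(pool[:s])
        for j in pool[s:]:
            h = min(h, c[j])                       # best value outside C
        return C, h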
4.7 Hyper-sparse Tomlin INVERT
The default INVERT used in EMSOL is based on the procedure described by
Tomlin [19]. This procedure identifies, and uses as pivots for as long as possible,
rows and columns in the active submatrix which have only a single nonzero.
Following this triangularisation phase, any residual submatrix is then factorised
using Gaussian elimination with the order of the columns determined prior to the
numerical calculation by merit counts based on fill-in estimates. Since the pivot
in each stage of Gaussian elimination is selected from a predetermined pivotal
column, only this column of the active submatrix is required. Thus, rather than
apply elimination operations to maintain the up-to-date active submatrix, the
up-to-date pivotal column is formed each iteration. This procedure requires
much simpler data structure management and pivot search strategy compared
to a Markowitz-based procedure, which maintains and selects the pivot from
the whole up-to-date active submatrix.
The pivotal column in a given stage of the Tomlin INVERT is formed by
passing forwards through the L-etas for the residual block which have been
computed up to that stage. Even for problems which do not exhibit hyper-sparsity, the pivotal column of the active submatrix during Gaussian elimination
is very likely to be sparse. This is the situation where what is referred to in this
paper as hyper-sparsity was identified by Gilbert and Peierls [11]. This partial
FTRAN operation is particularly amenable to the exploitation of hyper-sparsity
using the algorithm illustrated in Figure 3. Note that the data structures
required to exploit hyper-sparsity during this stage in INVERT, as well as during
I-FTRAN itself, are generated at almost no cost during INVERT.
For the problems in test set H, the Tomlin INVERT yields factors which, in
the worst case, have an average of 4% more entries than those generated by a
Markowitz-based INVERT. The average fill-in with the Tomlin INVERT is no
more than 10%. For problems with such low fill-in, the Tomlin INVERT (when
exploiting hyper-sparsity) is at least as fast as a Markowitz-based INVERT. For
many problems in H the dimension of the residual block in the Tomlin INVERT
is very small (no more than a few percent) relative to the dimension of B0 so
the scope for exploiting hyper-sparsity is negligible. However, for others, the
average dimension of the residual block is between 10 and 25 percent of the
dimension of B_0. Since there is little fill-in, the scope for exploiting hyper-sparsity during INVERT for these problems is significant. Indeed, if hyper-sparsity is not exploited, the Tomlin INVERT may be significantly slower than
a Markowitz-based INVERT.
For some of the problems in test set H′, using the Tomlin INVERT results in
significantly more fill-in than would occur with a Markowitz-based INVERT, in
which case it would generally be preferable to use a Markowitz-based INVERT.
For the remaining problems, the Tomlin INVERT is at least competitive when
exploiting hyper-sparsity.
4.8 Hyper-sparse (product-form) UPDATE
The product-form UPDATE requires the nonzeros in the pivotal column to be
stored in packed form, with the pivot stored as its reciprocal (so that the
divisions in FTRAN and BTRAN are effected by multiplication). As explained
in Section 4.3 this packed form is produced at negligible cost during the course
of CHUZR.
4.9 Hyper-sparsity for other update procedures
The product form update is commonly criticised for its lack of numerical
stability and inefficiency with regard to sparsity. For this reason, some
implementations of the revised simplex method are based on the Forrest-Tomlin [9] or Bartels-Golub [1] update procedures which modify the U (but not the L) etas in the representation of B_0^{-1} in order to gain numerical stability
and efficiency with regard to sparsity. If such a procedure were used, the data
structure which enables hyper-sparsity to be exploited when applying U -etas
during BTRAN and FTRAN would have to be modified after each UPDATE.
The overhead of doing this is likely to limit severely the value of exploiting
hyper-sparsity. Also, the advantage of the Forrest-Tomlin and Bartels-Golub
update procedures with respect to sparsity is small for problems which exhibit
hyper-sparsity in the product form UPDATE etas. If greater numerical stability
is required than is offered by the product form update, the Schur complement
update [2] may be used. Like the product form update, the representation
of B_0^{-1} is unaltered so the data structures for exploiting hyper-sparsity when applying the INVERT etas remain static. Techniques analogous to those described above for the product form update may be used to exploit hyper-sparsity during U-BTRAN when using a Schur complement update.
4.10 Controlling the use of hyper-sparsity techniques
All the hyper-sparse techniques described above are less efficient than the
standard versions in the absence of hyper-sparsity and so should only be applied
when hyper-sparsity is present. For problems which do not exhibit hyper-sparsity at all, or for problems where a particular computational component
does not exhibit hyper-sparsity, this is easily recognised by monitoring a running
average of the density of the result over a number of iterations. The technique
would then be switched off for all subsequent iterations if hyper-sparsity is seen
to be absent. For a computational component which typically exhibits hyper-sparsity, it is important to identify the situation where the result for a particular
iteration is not going to be sparse, and switch to the standard algorithm which
will then be more efficient. This can be achieved by monitoring the density
of the result during the operation and switching on some tolerance. Practical
experience has shown that performance is very insensitive to changes in these
tolerances around the optimal value.
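A sketch of this control logic is given below; the thresholds and decay factor are illustrative, not the values used in EMSOL.

    class HyperSparsityControl:
        def __init__(self, dim, avg_tol=0.10, abort_tol=0.25, decay=0.95):
            self.dim, self.avg_tol, self.abort_tol = dim, avg_tol, abort_tol
            self.decay, self.avg_density = decay, 0.0

        def use_hyper_sparse(self):
            # running average decides whether to enter the hyper-sparse path
            return self.avg_density <= self.avg_tol

        def should_abort(self, nnz_so_far):
            # mid-operation: fall back to the standard algorithm for this result
            return nnz_so_far > self.abort_tol * self.dim

        def record(self, nnz):
            # exponentially-weighted running average of result density
            self.avg_density = (self.decay * self.avg_density
                                + (1.0 - self.decay) * nnz / self.dim)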
5 Results
Computational results in this section demonstrate the value of the techniques for
exploiting hyper-sparsity described in the previous section. A detailed analysis is
given of the speedup of the authors’ revised simplex solver, EMSOL, as a result
of exploiting hyper-sparsity. In addition, for problems which exhibit hyper-sparsity, EMSOL is compared with SOPLEX 1.2 [21] and the CPLEX 6.5 [15]
primal simplex solver. All results were obtained on a SunBlade 100 with 512Mb
of memory.
5.1 Speedup of EMSOL when exploiting hyper-sparsity
The efficiency of the techniques for exploiting hyper-sparsity is demonstrated by
the results in Table 4 for the problems in test set H and Table 6 for the problems
in test set H′. These tables give the speedup of both overall solution time and the time attributed to each computational component where significant hyper-sparsity may be exploited. In this paper, mean values of speedup are geometric, since this avoids bias when favourable and unfavourable speedups are combined: for example, speedups of 2 and 0.5 have a geometric mean of exactly 1, whereas their arithmetic mean would suggest a net gain.
5.1.1 Problems which exhibit hyper-sparsity
For test set H, Table 4 shows clearly the value of exploiting hyper-sparsity. The
solution time of all problems improves, by more than an order of magnitude
in the case of the larger problems, and all computational components show a
significant mean speedup. Note that, particularly for the larger problems, the
speedup in PRICE and CHUZC underestimates the efficiency of these operations
when the pivotal row is sparse: although only a small percentage of the pivotal
rows are not sparse, they dominate the time required for PRICE, and in addition,
if the pivotal row is not sparse, the set C in the hyper-sparse CHUZC must be
reinitialised, requiring a full CHUZC.
Although U-BTRAN exhibits the greatest mean speedup, it is seen in Table 3
that, when not exploiting hyper-sparsity, this operation makes a relatively small
contribution to overall solution time. It is the speedups in I-FTRAN, I-BTRAN,
PRICE and CHUZC which are of greatest value. However, the speedup in all
operations is of some value.
We now consider whether there is scope for further improvement. Table 5
gives the percentage of solution time attributable to the major computational
components when exploiting hyper-sparsity. The column headed ‘Hyper-sparsity’ is the percentage of the solution time which is attributable to creating
and maintaining the data structures required to exploit hyper-sparsity in the
                 Speedup in total solution time and computational components
Problem    Solution   CHUZC  I-FTRAN  CHUZR  I-BTRAN  U-BTRAN   PRICE  INVERT
80BAU3B        3.34    3.05     5.13   1.93     3.51     6.72    6.06    1.34
FIT2P          1.75   18.91     1.30   0.93    12.22     3.59   13.47    0.87
GREENBEA       2.71    1.33     1.30   1.13     3.60    19.87    3.45    2.83
GREENBEB       2.44    1.39     1.35   1.21     3.69    21.88    3.44    2.78
STOCFOR3       1.85    4.47     1.14   0.96     7.26    56.99    7.61    3.16
WOODW          3.40    1.82     1.70   1.30     4.17    11.72    5.14    1.53
DCP1           3.25    1.58     2.36   1.19     3.87     6.25    6.71    1.70
DCP2           5.32    1.60     8.24   2.51     6.21    13.99    6.20    8.63
CRE-A          3.05    2.54     4.00   1.72     3.64     6.50    4.48    1.14
CRE-C          2.89    2.88     4.67   1.97     3.58     6.53    4.97    1.08
KEN-11        22.84   19.36    98.04  13.93    27.22     9.90   66.36    1.02
KEN-13        12.12    6.31   104.09   7.31    12.87     9.19   17.60    0.94
KEN-18        15.27    6.63   263.94  15.27    13.91    13.07   19.92    1.01
PDS-06        17.48   15.25    24.07   3.57    21.58    35.76   28.18    1.02
PDS-10        10.36   10.67    11.24   1.85    16.60    49.99   17.55    0.96
PDS-20        10.35    8.58     5.96   1.68    14.33   189.19   15.40    1.44
Mean           5.21    4.38     7.03   2.28     7.64    15.44    9.71    1.55

Table 4: Speedup for test set H when exploiting hyper-sparsity.
computational components. PRICE and CHUZC are still the major cost for many
problems and some form of partial/multiple pricing might reduce the time per
iteration attributable to PRICE and CHUZC. However, the saving may be more
than offset by an increase in the number of iterations required to solve these
problems.
The one operation where no techniques for exploiting hyper-sparsity have
been developed is U-FTRAN. The contribution of this operation to overall
solution time has increased from an average of 1.61% when not exploiting hyper-sparsity in other components (see Table 3) to a far from dominant 6.11%.
5.1.2 Problems which do not exhibit hyper-sparsity
As identified in Section 4, for problems in test set H′ the only significant scope
for exploiting hyper-sparsity is in U-BTRAN, PRICE and INVERT. Table 6 gives
the percentage of solution time attributable to these three operations when
not exploiting hyper-sparsity. Although the overhead of INVERT is higher
than for problems in set H, the dominant operation is, again, PRICE. With
the exception of PILOT, all other problems show some speedup in solution
time when exploiting hyper-sparsity, with a modest but not insignificant mean
speedup of 1.45. Despite the significant speedup in U-BTRAN, much of the
overall performance gain can be put down to halving the time attributable to
PRICE. Although the mean speed of INVERT is doubled, it should be borne in mind that a Markowitz-based INVERT procedure may well be preferable for these problems. For the other computational components, there is a mean
speedup of between 1.02 and 1.10, indicating that the hyper-sparse techniques
are not significant relative to the rest of the calculation. However, the overhead
associated with creating and maintaining the data structures required to exploit
hyper-sparsity is not significant and for the only problems where it accounts for
more than 1% of the solution time, it yields a significant overall speedup.
5.2 Comparison with representative simplex solvers
In this section the performance of EMSOL is compared with other simplex
solvers for the problems in test set H. CPLEX [15] is commonly held to be
the leading commercial simplex solver, a view supported by benchmark tests
performed by Mittelmann [18]. SOPLEX [21] is a public-domain simplex solver
developed by Wunderling [20] which outperforms other such solvers in tests
performed by Mittelmann [18]. In the results presented below, EMSOL is
compared with the most recent version of CPLEX available to the authors
(version 6.5) and SOPLEX version 1.2. Although SOPLEX version 1.2.1 is
available, it shows no noticeable performance improvement over version 1.2.
It is important when comparing the performance of computational
components of different solvers that all solvers start from the same (or similar)
advanced basis and any algorithmic differences which affect the computational
requirements are reduced to a minimum. In the comparisons below, EMSOL is
started from the basis obtained using the CPLEX crash procedure. SOPLEX
cannot be started from a given advanced basis. However, since the SOPLEX
crash procedure described by Wunderling [20] appears to be closely related to
that of CPLEX, the SOPLEX initial basis may be expected to be comparable
to that of CPLEX, and hence EMSOL.
                                 Percentage of solution time
Problem    CHUZC  I-FTRAN  U-FTRAN  CHUZR  I-BTRAN  U-BTRAN  PRICE  INVERT  Hyper-sparsity
80BAU3B    26.24     4.23     4.00   6.89     7.34     1.13  39.22    2.81       4.83
FIT2P       1.21     7.78     4.66  35.34     3.21     0.51   8.39   18.56      13.01
GREENBEA    8.24    16.51     7.94  12.03    10.82     0.79  28.10    7.48       4.98
GREENBEB    7.97    16.87     7.98  12.14    10.95     0.90  27.89    7.37       4.90
STOCFOR3    1.97    14.18    13.91  29.68     4.85     0.21   6.23   13.55       9.07
WOODW      17.79     8.45     3.51   7.02     6.17     1.17  43.90    3.18       4.70
DCP1        8.16     6.69     8.92   9.42    15.39     1.80  29.85    8.73       7.93
DCP2       13.54     2.20     5.93   4.51    13.05     0.67  46.16    6.26       6.99
CRE-A      14.03     6.58     5.87  10.92    13.43     1.65  31.08    4.47       8.32
CRE-C      13.60     5.60     5.69  11.41    13.32     2.02  30.31    4.21       9.50
KEN-11     12.02     2.74     5.22   6.08    20.64     4.61  20.72    8.51      15.14
KEN-13     17.06     1.20     3.85   4.81    19.53     1.81  40.58    3.86       5.85
KEN-18     20.75     0.60     2.59   2.76    22.51     1.01  42.42    2.62       4.32
PDS-06     12.29     4.80     5.90  12.77     9.75     1.50  30.38    6.03      12.86
PDS-10     11.89     6.35     5.99  14.74     8.29     0.77  35.43    5.18       9.27
PDS-20     12.55     8.49     5.75  14.35     6.96     0.30  37.54    5.48       7.16
Mean       12.46     7.08     6.11  12.18    11.64     1.30  31.14    6.77       8.05

Table 5: Percentage of solution time for computational components and overhead
attributable to exploiting hyper-sparsity for test set H.
            Not exploiting hyper-sparsity          Exploiting hyper-sparsity
            Percentage of solution time   Solution         Speedup           Hyper-sparsity
Problem    U-BTRAN   PRICE  INVERT        speedup  U-BTRAN  PRICE  INVERT       CPU (%)
BNL2          6.29   32.61   14.74           1.87    14.61   2.43    2.80          4.79
D2Q06C        2.69   36.65   14.57           1.33     8.34   1.84    2.20          1.12
D6CUBE        0.45   72.35    3.83           1.65     1.37   2.04    1.37          0.23
DEGEN3        6.51   27.12   14.14           2.32    11.58   3.15    2.11          6.65
DFL001        2.90   24.22   22.52           1.69    10.93   1.97    5.21          0.59
MAROS-R7      1.79   18.50   26.22           1.05     2.14   2.06    0.98          0.51
PILOT         1.40   22.15   13.03           0.96     1.28   1.43    0.92          0.26
PILOT87       1.19   19.51   17.39           1.00     1.39   1.50    0.85          0.26
QAP8          1.25   15.56   16.68           1.05     1.29   1.24    1.48          0.40
TRUSS         2.62   54.50   10.26           1.80     6.47   2.49    4.40          0.88
CRE-B         1.31   74.61    2.97           1.82    50.74   2.33    2.93          0.60
CRE-D         1.25   74.95    2.82           1.52    41.97   2.16    3.14          0.65
Mean          2.47   39.39   13.26           1.45     5.80   1.99    2.01          1.41

Table 6: Percentage of solution time when not exploiting hyper-sparsity, speedup when
exploiting hyper-sparsity and percentage of solution time attributable to exploiting
hyper-sparsity for test set H′.
When CPLEX is run, the default approach to pricing is to start with an
inexpensive strategy and switch to Devex. For the test problems in this paper,
this approach leads to a speedup of 1.22 (1.14 for the problems in H) over the
performance when using only Devex pricing. However, since EMSOL uses only
Devex pricing, it is compared with CPLEX using only Devex pricing.
By default, SOPLEX uses the dual simplex method, although it often
switches to the primal, particularly when close to optimality and occasionally
significantly before. Although it is suggested that SOPLEX can run as a primal
simplex solver, in practice it soon switches to the dual simplex method for the
test problems in this paper. Thus, for the comparisons below, SOPLEX is
run using its default choice of method. The default pricing strategy used by
SOPLEX is steepest edge which is described by Forrest and Goldfarb in [8].
Even for the dual simplex method, steepest edge carries a higher computational
overhead than Devex since it requires an additional BTRAN operation in the
case of the primal (FTRAN in the dual) as well as some pricing in the primal.
SOPLEX can be forced to use Devex pricing which has the same computational
requirements in the dual as in the primal. Thus, in the results below, SOPLEX
is forced to use only Devex pricing. Although the use of the primal or dual
simplex method may significantly affect the number of iterations required to
solve a problem, comparing the average time per iteration of SOPLEX and
EMSOL gives a fair measure of the efficiency of the computational components
of the respective solvers.
SOPLEX has a presolve which is used by default. Since the presolve improves
the solution time by only about 10% and incorporates scaling, in the results for
SOPLEX given below it is run with the presolve. As a result, the relative
iteration speed of SOPLEX may give a slight overestimate of the efficiency of
the techniques used in its computational components.
The results of the comparisons of EMSOL with CPLEX and SOPLEX, when
run in the modes discussed above, are given in Table 7. Values of speedup which
are greater than unity indicate that EMSOL is faster.
5.2.1 Comparison with CPLEX
Table 7 shows that for the H problems EMSOL is faster on seven and CPLEX is faster on nine. On average, CPLEX is slightly faster: the mean speedup for EMSOL is 0.84. There is no consistent reason for the difference, with neither code dominating the other in time per iteration or number of iterations. Because the different solvers do not follow the same paths to the solution, it is difficult to make precise comparisons. We have also noticed that making an arbitrary perturbation to the pivot choice can double or halve the solution time (though the results presented in this paper are all for the EMSOL default settings). This difference is due both to variation in the number of iterations taken and to variation in the average amount of hyper-sparsity encountered on the solution path.
Brief comments by Bixby [3] and by Bixby et al. [4] indicate that methods to exploit hyper-sparsity in FTRAN, BTRAN and PRICE have been developed independently for use in CPLEX [15]. These techniques, for which no details have been published, were introduced to CPLEX between versions 6 and 6.5. There may be other unpublished features of CPLEX, unconnected with hyper-sparsity, which contribute to the differences in performance when compared with EMSOL.
5.2.2 Comparison with SOPLEX
In the comparison of EMSOL and SOPLEX, two values are given in Table 7
for each model: the relative speedup in total solution time when EMSOL is
used and the relative speedup in the time required by EMSOL to perform a
simplex iteration. In terms of solution speed, EMSOL is clearly far superior,
dramatically so for FIT2P and DCP1 for which SOPLEX takes an excessive
number of iterations. However, even after excluding these problems, the mean
speedup when using EMSOL is 2.72, which is considerable. Note that, for the
problems in test set H, SOPLEX with Devex pricing is 1.14 times faster than
with all its defaults when it uses steepest edge.
For the reasons set out above, the values for the speedup in iteration
time provide the most reliable comparison of the techniques used within the
computational components of EMSOL and SOPLEX. With the exception
of STOCFOR3, the iteration speed of EMSOL exceeds that of SOPLEX,
significantly so for the larger problems.
                 EMSOL       CPLEX (Devex)         SOPLEX (Devex)
              Solution   Solution     EMSOL   Solution     EMSOL      EMSOL
                  time       time  solution       time  solution  iteration
Problem        CPU (s)    CPU (s)   speedup    CPU (s)   speedup    speedup
80BAU3B           9.43      14.39      1.53      20.50      2.17       1.78
FIT2P            67.34      42.08      0.62    1058.79     15.72       1.30
GREENBEA         24.63      21.18      0.86     150.53      6.11       1.38
GREENBEB         19.55      18.31      0.94      76.36      3.91       1.56
STOCFOR3        201.21      41.15      0.20     154.11      0.77       0.64
WOODW             3.09       3.99      1.29      21.41      6.93       2.58
DCP1             14.64      16.30      1.11     330.35     22.56       1.69
DCP2            483.64    1412.89      2.92    1507.40      3.12       1.24
CRE-A             4.49       6.29      1.40       5.07      1.13       1.09
CRE-C             3.50       5.91      1.69       3.39      0.97       1.06
KEN-11            9.19      13.57      1.48      46.99      5.11       2.20
KEN-13          100.77      91.41      0.91     417.26      4.14       2.05
KEN-18         1421.50     786.08      0.55    5167.60      3.64       1.91
PDS-06            8.20       5.49      0.67      44.22      5.39       3.37
PDS-10           83.83      26.92      0.32     158.79      1.89       2.37
PDS-20         1059.19     238.48      0.23    1754.17      1.66       2.73
Mean                                   0.84                 3.47       1.67
Table 7: Performance of EMSOL relative to CPLEX 6.5 and SOPLEX 1.2
6 Conclusions and extensions
This paper has identified hyper-sparsity as a property of a significant number
of LP problems when solved by the revised simplex method. Techniques for
exploiting this property in each of the computational components of the revised
simplex method have been described. Although this has been done in the
context of the primal simplex method, the techniques developed in this paper
can be applied immediately to the dual simplex method since it has comparable
computational components.
For the subset of our test problems that do not exhibit hyper-sparsity (H′), the geometric mean speedup is 1.45; for the problems which do exhibit hyper-sparsity (H), the speedup is substantial, increases with problem size and has a mean value of 5.38.
For this latter subset of problems, our implementation of the revised simplex method which exploits hyper-sparsity has been shown to be comparable to the leading commercial simplex solver and several times faster than the leading public-domain solver. Although this performance gain is substantial for only a subset of LP problems, the amenable problems in this paper are of genuine practical value, and the techniques yield some performance improvement in all but one of the test problems.
The authors would like to thank John Reid who brought the Gilbert-Peierls
algorithm to their attention and made valuable comments on an earlier version
of this paper.
References
[1] R. H. Bartels. A stabilization of the simplex method. Numer. Math.,
16:414–434, 1971.
[2] J. Bisschop and A. J. Meeraus. Matrix augmentation and partitioning in
the updating of the basis inverse. Mathematical Programming, 13:241–254,
1977.
[3] R. E. Bixby. Solving real-world linear programs: A decade and more of
progress. Operations Research, 50(1):3–15, 2002.
[4] R. E. Bixby, M. Fenelon, Z. Gu, E. Rothberg, and R. Wunderling.
MIP: Theory and practice – closing the gap. In M. J. D. Powell and
S. Scholtes, editors, System Modelling and Optimization: Methods, Theory
and Applications, pages 19–49. Kluwer, The Netherlands, 2000.
[5] W. J. Carolan, J. E. Hill, J. L. Kennington, S. Niemi, and S. J. Wichmann.
An empirical evaluation of the KORBX algorithms for military airlift
applications. Operations Research, 38(2):240–248, 1990.
[6] V. Chvátal. Linear Programming. Freeman, 1983.
[7] G. B. Dantzig and W. Orchard-Hays. The product form for the inverse in
the simplex method. Math. Comp., 8:64–67, 1954.
[8] J. J. Forrest and D. Goldfarb. Steepest-edge simplex algorithms for linear programming. Mathematical Programming, 57:341–374, 1992.
[9] J. J. H. Forrest and J. A. Tomlin. Updated triangular factors of the basis
to maintain sparsity in the product form simplex method. Mathematical
Programming, 2:263–278, 1972.
[10] D. M. Gay. Electronic mail distribution of linear programming test
problems. Mathematical Programming Society COAL Newsletter, 13:10–
12, 1985.
[11] J. R. Gilbert and T. Peierls. Sparse partial pivoting in time proportional
to arithmetic operations. SIAM J. Sci. Stat. Comput., 9(5):862–874, 1988.
[12] J. A. J. Hall and K. I. M. McKinnon. Hyper-sparsity in the revised simplex method and how to exploit it. Technical Report MS00-015, Department of Mathematics and Statistics, University of Edinburgh, 2000. Submitted to SIAM Journal on Optimization.
[13] J. A. J. Hall and K. I. M. McKinnon. LP test problems. http://www.maths.ed.ac.uk/hall/PublicLP/, 2002.
[14] P. M. J. Harris. Pivot selection methods of the Devex LP code.
Mathematical Programming, 5:1–28, 1973.
[15] ILOG. CPLEX 6.5 Reference Manual, 1999.
[16] R. D. McBride and J. W. Mamer. Solving multicommodity flow problems
with a primal embedded network simplex algorithm. INFORMS Journal
on Computing, 9(2):154–163, Spring 1997.
[17] R. D. McBride and J. W. Mamer. A decomposition-based pricing procedure for large-scale linear programs: an application to the linear multicommodity flow problem. Management Science, 46(5):693–709, May 2000.
[18] H. D. Mittelmann. Benchmarks for optimization software. http://www.plato.la.asu.edu/bench.html, April 2002.
[19] J. A. Tomlin. Pivoting for size and sparsity in linear programming inversion
routines. J. Inst. Maths. Applics, 10:289–295, 1972.
[20] R. Wunderling. Paralleler und objektorientierter Simplex. Technical Report TR-96-09, Konrad-Zuse-Zentrum für Informationstechnik Berlin, 1996.
[21] R. Wunderling, A. Bley, T. Pfender, and T. Koch. SOPLEX 1.2.0, 2002.