Computational Aspects of the Simplex Method
MSIS 685: Linear Programming
Lecture 9
Professor: Farid Alizadeh
Scribe: Shuguang Liu
Lecture Date: Nov. 5 1998
We have discussed the simplex method for solving linear programming problems. By introducing matrix notation, we saw that solving linear systems can be viewed as performing matrix operations such as matrix addition, subtraction, multiplication, inversion, etc. When we implement the simplex method as a computer program, however, we do not perform these matrix operations literally but instead solve the corresponding equation systems. In order to know how large the computing load is, how we can reduce it, and to what degree we can reduce it, we first need some elementary knowledge of computing load. Then we will try to decrease the computation.
1. Basics of Computing Load
1.1 Computing Loads of Matrix and Vector Operations
Let $A, B \in \mathbb{R}^{n \times n}$ and $U, V \in \mathbb{R}^n$.
(1) $UV^T$ is a rank-one matrix.
(2) Computing $A^{-1}$ requires $O(n^3)$ flops, which means that solving the system $AX = b$ requires $O(n^3)$ flops. We denote this by $A^{-1} \sim O(n^3)$.
(3) $AB \sim O(n^3)$.
(4) $AU,\; U^T A \sim O(n^2)$.
(5) $UV^T,\; U^T A V \sim O(n^2)$.
If $L$ and $U$ denote lower and upper triangular matrices respectively, then decomposing $A$ into $L$ and $U$ requires $O(n^3)$ flops: $A = LU \sim O(n^3)$.
Although both are $O(n^3)$ operations, the LU decomposition takes roughly one third as many flops as computing the matrix inverse.
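To make the flop counts concrete, here is a minimal Python sketch (not from the lecture; numpy and scipy assumed) contrasting an explicit inverse with an LU-based solve:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 300
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

x_inv = np.linalg.inv(A) @ b     # explicit inverse: ~n^3 flops, then O(n^2)
lu, piv = lu_factor(A)           # LU factorization: roughly n^3 / 3 flops
x_lu = lu_solve((lu, piv), b)    # two triangular solves: O(n^2) flops
print(np.allclose(x_inv, x_lu))  # True: same solution, fewer flops
```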
1.2 Sherman-Morrison Formula (SMF)
Suppose we know $A$ and $A^{-1}$, and we need to compute $(A + UV^T)^{-1}$. We can use the Sherman-Morrison formula, which is as follows:
$$(A + UV^T)^{-1} = A^{-1} - \frac{A^{-1} U V^T A^{-1}}{1 + V^T A^{-1} U}$$
Computing Load of the SMF
Computing the right-hand side of the formula needs only $O(n^2)$ flops: $A^{-1}$ is known, by (5) computing $V^T A^{-1} U$ needs $O(n^2)$ flops, and computing $A^{-1} U V^T A^{-1}$ also requires $O(n^2)$ flops.
To compute $A^{-1} U V^T A^{-1}$, we first compute $(A^{-1} U)$ and $(V^T A^{-1})$ and then take the outer product of the results; these are three $O(n^2)$ operations, so computing $A^{-1} U V^T A^{-1}$ is an $O(n^2)$ operation.
Thus, by using the SMF, we reduce the computing load of $(A + UV^T)^{-1}$ from $O(n^3)$ to $O(n^2)$ flops.
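The following numpy sketch (illustrative, not from the lecture) implements this $O(n^2)$ update and checks it against a direct inverse; it assumes the update is nonsingular, i.e. $1 + V^T A^{-1} U \neq 0$:

```python
import numpy as np

def sherman_morrison(A_inv, u, v):
    """(A + u v^T)^{-1} from a known A^{-1} in O(n^2) flops.
    Assumes 1 + v^T A^{-1} u != 0, i.e. the updated matrix is nonsingular."""
    Au = A_inv @ u                                    # A^{-1} U, O(n^2)
    vA = v @ A_inv                                    # V^T A^{-1}, O(n^2)
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)  # outer product, O(n^2)

rng = np.random.default_rng(1)
n = 200
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)
v = rng.standard_normal(n)

A_inv = np.linalg.inv(A)                       # O(n^3), paid once
updated = sherman_morrison(A_inv, u, v)        # O(n^2) per rank-one update
direct = np.linalg.inv(A + np.outer(u, v))     # O(n^3), for comparison only
print(np.allclose(updated, direct))            # True
```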
Proof of the SMF
Bear in mind that $V^T A^{-1} U$ is a scalar, so we have the following identity:
$$UV^T A^{-1} U V^T A^{-1} = U (V^T A^{-1} U) V^T A^{-1} = (V^T A^{-1} U)\, UV^T A^{-1}$$
Using this identity, we prove the SMF as follows:
A 1UV T A 1
T
1
( A  UV )( A 
)
T
1
1

V
A
U
UV T A 1
U (V T A 1U )V T A 1
T
1
I

UV
A

1  VT T A11U
1 1V T A 1TU 1
T
1
T
UV A  (1  V A U )UV A  (V A U )UV T A 1
I
1  V T A 1U
I
2. Applying the SMF to the Simplex Method
Given a basis, let $N$ and $B$ denote the nonbasic and basic parts of the matrix $A$ respectively, so that $B \in \mathbb{R}^{m \times m}$ and $N \in \mathbb{R}^{m \times n}$. We need to solve the equation system $AX = b$, where $A \in \mathbb{R}^{m \times (n+m)}$, $X \in \mathbb{R}^{n+m}$, $b \in \mathbb{R}^m$. Write
$$B = \{i_1, i_2, \ldots, i_m\}, \qquad N = \{j_1, j_2, \ldots, j_n\}$$
where the $i$'s and $j$'s are indices of variables.
To some extent, the simplex method continuously constructs new basic variable sets in order to find an optimal one, and each new set differs from the old set in just one variable. If we solved the new equation system from scratch, without using the information obtained from the old system, we would face another $O(m^3)$-flop operation. Fortunately, we can apply the SMF to the new system derived from the old one.
2.1 At iteration 1:
At this time we need to compute $B^{-1}$.
First, from $B X_B = b$, compute $X_B = B^{-1} b \sim O(m^3)$.
Then compute $y^T = C_B^T B^{-1} \sim O(m^2)$, because $B^{-1}$ is now known.
Finally, compute $B^{-1} N = \tilde{N} \sim O(mn)$.
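As an illustration, here is a minimal numpy sketch of these first-iteration computations on hypothetical data (the matrices and vectors below are made up for the example):

```python
import numpy as np

# Hypothetical data for a tiny basis (m = 2 basic, n = 3 nonbasic variables).
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])
N = np.array([[1.0, 0.0, 4.0],
              [0.0, 1.0, 2.0]])
b = np.array([5.0, 10.0])
c_B = np.array([1.0, 2.0])

B_inv = np.linalg.inv(B)   # the single O(m^3) step (real codes factorize instead)
x_B = B_inv @ b            # basic solution X_B = B^{-1} b, O(m^2) once B_inv is known
y = c_B @ B_inv            # y^T = C_B^T B^{-1}, O(m^2)
N_tilde = B_inv @ N        # ~N = B^{-1} N, O(m n)
print(x_B, y, N_tilde, sep="\n")
```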
2.2 At the next iteration:
$B_{\text{new}} = (B \setminus \{i_k\}) \cup \{j_s\}$, where $i_k$ is the index of the leaving variable and $j_s$ is the index of the incoming variable. In matrix terms,
$$B_{\text{new}} = B - \begin{pmatrix} 0 & \cdots & 0 & b_{1r} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & b_{mr} & 0 & \cdots & 0 \end{pmatrix} + \begin{pmatrix} 0 & \cdots & 0 & N_{1 j_s} & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & N_{m j_s} & 0 & \cdots & 0 \end{pmatrix}$$
Let $b_r^T = (b_{1r}, \ldots, b_{mr})$, which is the $r$th column of $B$, and $a_q^T = (N_{1 j_s}, \ldots, N_{m j_s})$, which is the $q$th column of $N$. These two vectors change the $r$th column of $B$, and only that column, to produce $B_{\text{new}}$.
The second and third terms on the right-hand side of the equation can be written as $-b_r e_r^T$ and $a_q e_r^T$, where $e_r^T = (0, \ldots, 0, 1, 0, \ldots, 0)$ and the 1 appears in the $r$th position. Then we get
$$B_{\text{new}} = B + d\, e_r^T, \qquad d = a_q - b_r$$
and therefore
$$B_{\text{new}}^{-1} = (B + d\, e_r^T)^{-1}$$
Now we can use the SMF and obtain:
$$B_{\text{new}}^{-1} = B^{-1} - \frac{B^{-1} d\, e_r^T B^{-1}}{1 + e_r^T B^{-1} d}$$
It turns out that every quantity needed for $B_{\text{new}}^{-1}$ has already been calculated at iteration 1, hence needs no further calculation:
(1) $B^{-1}$ is known.
(2) $B^{-1} d = B^{-1}(a_q - b_r) = \tilde{N}_q - e_r$, where $B^{-1} a_q$ was calculated at iteration 1 (it is the $q$th column of $\tilde{N}$), and $B^{-1} b_r$ is the $r$th column of $B^{-1} B = I$.
(3) $1 + e_r^T B^{-1} d = 1 + e_r^T (\tilde{N}_q - e_r) = \tilde{N}_{rq}$, which is an element of $\tilde{N}$ calculated before.
From sections 2.1 and 2.2 we see that, with the SMF, solving an LP needs one $O(m^3)$-flop operation only at the first iteration, and $O(m^2)$-flop operations at each of the following iterations.
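Putting 2.1 and 2.2 together, here is a minimal numpy sketch of the $O(m^2)$ basis-inverse update; the function name and the random test data are illustrative, and it assumes the pivot element $\tilde{N}_{rq}$ is nonzero (which the simplex ratio test guarantees):

```python
import numpy as np

def update_basis_inverse(B_inv, N_tilde, q, r):
    """O(m^2) Sherman-Morrison update of B^{-1} when nonbasic column a_q
    replaces the r-th basic column; uses only quantities from iteration 1.
    Assumes the pivot element ~N[r, q] is nonzero."""
    m = B_inv.shape[0]
    e_r = np.zeros(m)
    e_r[r] = 1.0
    B_inv_d = N_tilde[:, q] - e_r                 # B^{-1} d = ~N_q - e_r (no new solve)
    denom = N_tilde[r, q]                         # 1 + e_r^T B^{-1} d = ~N_{rq}
    return B_inv - np.outer(B_inv_d, e_r @ B_inv) / denom

# Check against a from-scratch inverse on hypothetical random data.
rng = np.random.default_rng(2)
m, n = 4, 6
B = rng.standard_normal((m, m))
N = rng.standard_normal((m, n))
B_inv = np.linalg.inv(B)
N_tilde = B_inv @ N

q, r = 3, 1
B_new = B.copy()
B_new[:, r] = N[:, q]                             # column r leaves, a_q enters
print(np.allclose(update_basis_inverse(B_inv, N_tilde, q, r),
                  np.linalg.inv(B_new)))          # True
```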
3. LU Factorization
Sometimes even with the SMF it is not economical to compute $B^{-1}$ at the first iteration, because LP matrices $A$ are large (in rows and columns) and sparse (each column and row has only a few nonzero entries; the rest are zeros). With these properties we can use LU factorization to reduce the computing load even further.
3.1 Gaussian Elimination
Using Gaussian elimination to solve a system of equations amounts to reducing the matrix to an upper triangular form. Let us look at an example.
2 1 0

0 3 0
0 0 2

4 0 4

1
 2

3
 0
 0
0

 (4) (2)

2
 2


0
 0
r1 ( 2 )  r4
X  b  


1
0


 (4)
2 

0 2 
 2


0 0  r3 ( 2)  r4  0
 

2 1 
0


 (4)
4  2 

2 

0  r2 ( 2 / 3)  r4
 
0 2 1 

 2 4  2 
1
0
2 

3
0
0 
0
2
1 

(2) (4)  4 
1
3
0
0
The triangular matrix produced at the end has a nice property: collecting the elimination multipliers (with opposite sign) in a lower triangular matrix $L$ gives a factorization of $A$:
$$A = \begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 4 & 0 & 4 & 2 \end{pmatrix}
= \begin{pmatrix} 1 & & & \\ 0 & 1 & & \\ 0 & 0 & 1 & \\ 2 & -\frac{2}{3} & 2 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 1 & 0 & 2 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & -4 \end{pmatrix} = LU$$
By decomposing $A$ into $L$ and $U$, we reduce the computing load of solving $AX = b$ from $O(n^3)$ to $O(n^2)$:
$$AX = b \;\Leftrightarrow\; LUX = b \;\Leftrightarrow\; \begin{cases} LY = b \\ UX = Y \end{cases} \;\sim O(n^2)$$
If we have $L$ and $U$, then when $b$ changes we need just $O(n^2)$ flops (two triangular solves) instead of $O(n^3)$ to solve the new problem.
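Here is a short scipy sketch (illustrative, not from the lecture) of reusing one $O(n^3)$ factorization for many right-hand sides via the two triangular solves above:

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(3)
n = 300
A = rng.standard_normal((n, n))
P, L, U = lu(A)                  # A = P L U, computed once: O(n^3)

for _ in range(5):               # each new right-hand side costs only O(n^2)
    b = rng.standard_normal(n)
    y = solve_triangular(L, P.T @ b, lower=True)   # forward substitution: L Y = P^T b
    x = solve_triangular(U, y, lower=False)        # back substitution:    U X = Y
    assert np.allclose(A @ x, b)
print("all systems solved")
```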
The procedure of Gaussian elimination on a matrix is the procedure of successively multiplying the matrix by lower triangular matrices, each of which is essentially an identity matrix except for one column. The procedure can be shown as follows:
$$AX = b \;\Rightarrow\; L_1 A X = L_1 b \;\Rightarrow\; L_2 L_1 A X = L_2 L_1 b \;\Rightarrow\; \cdots \;\Rightarrow\; (L_{n-1} \cdots L_2 L_1 A) X = (L_{n-1} \cdots L_2 L_1) b$$
At the end, $L_{n-1} \cdots L_2 L_1 A$ is an upper triangular matrix. These operations require $O(n^3)$ flops.
It can be proved that the product of lower triangular matrices is still a lower triangular matrix.
Lemma: If $L_1, L_2$ are lower triangular matrices, then $L_1 L_2$ is also a lower triangular matrix.
Proof: Write $L_1$ and $L_2$ in block form as
$$L_1 = \begin{pmatrix} \tilde{L}_{11} & 0 \\ \tilde{B}_1 & \tilde{L}_{12} \end{pmatrix}, \qquad L_2 = \begin{pmatrix} \tilde{L}_{21} & 0 \\ \tilde{B}_2 & \tilde{L}_{22} \end{pmatrix}$$
then multiplying $L_1$ and $L_2$ gives
$$L_1 L_2 = \begin{pmatrix} \tilde{L}_{11} \tilde{L}_{21} & 0 \\ \tilde{B}_1 \tilde{L}_{21} + \tilde{L}_{12} \tilde{B}_2 & \tilde{L}_{12} \tilde{L}_{22} \end{pmatrix}$$
Since the diagonal blocks are themselves lower triangular matrices of smaller size, it follows by induction that $L_1 L_2$ is lower triangular.
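As a sanity check, the following sketch builds the elimination matrices $L_k$ for the example of section 3.1 and verifies both the factorization and the lemma (the helper function is hypothetical, written for this illustration):

```python
import numpy as np

def elim_matrix(n, i, j, mult):
    """Hypothetical helper: identity except one entry, adds mult * row j to row i."""
    L = np.eye(n)
    L[i, j] = mult
    return L

# The example matrix from section 3.1.
A = np.array([[2.0, 1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [4.0, 0.0, 4.0, 2.0]])

L1 = elim_matrix(4, 3, 0, -2.0)       # r4 <- r4 - 2 r1
L2 = elim_matrix(4, 3, 1, 2.0 / 3.0)  # r4 <- r4 + (2/3) r2
L3 = elim_matrix(4, 3, 2, -2.0)       # r4 <- r4 - 2 r3

U = L3 @ L2 @ L1 @ A                  # upper triangular result of elimination
M = L3 @ L2 @ L1                      # product of lower triangular matrices
print(np.allclose(U, np.triu(U)))     # True: U is upper triangular
print(np.allclose(M, np.tril(M)))     # True: the product is lower triangular (lemma)
print(np.allclose(np.linalg.inv(M) @ U, A))  # True: A = L U with L = M^{-1}
```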
3.2 Permutation Matrix
It is usually the case that LU decomposition decreases the number of zeros, i.e., the factors contain more nonzeros than $A$ did (fill-in). We want to keep as many zeros as possible in order to make further computation efficient, so we need to permute the matrix. The other reason we need permutation is to make Gaussian elimination possible and numerically accurate. As in the last section, a Gaussian elimination step pivots on a diagonal element, as shown below:
$$\begin{pmatrix}
1 & a_{12} & a_{13} & a_{14} & a_{15} \\
 & 1 & a_{23} & a_{24} & a_{25} \\
 & & a_{33} & a_{34} & a_{35} \\
 & & a_{43} & a_{44} & a_{45} \\
 & & a_{53} & a_{54} & a_{55}
\end{pmatrix}$$
In this matrix we need to pivot on the element $a_{33}$. If $a_{33}$ is zero, we have to rearrange rows and/or columns to put a nonzero element in this position. If $a_{33}$ is small, pivoting on it means dividing by a small number, which produces large numbers and hence loses accuracy. So we may instead pivot on, say, $a_{44}$ by exchanging rows 3 and 4 and columns 3 and 4. Switching rows and/or columns can be done by pre- and/or post-multiplying by a permutation matrix.
A permutation matrix is a matrix in which each row and each column has exactly one nonzero element, equal to 1. Pre-multiplying by a permutation matrix changes the position of rows, while post-multiplying changes the position of columns.
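A short numpy illustration of these two facts (hypothetical example):

```python
import numpy as np

A = np.arange(9.0).reshape(3, 3)   # hypothetical 3x3 matrix
P = np.eye(3)[[0, 2, 1]]           # permutation matrix exchanging indices 1 and 2

print(P @ A)   # pre-multiplying exchanges rows 2 and 3 of A
print(A @ P)   # post-multiplying exchanges columns 2 and 3 of A
```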
We can rearrange rows and columns before each Gaussian elimination step. The procedure then becomes:
$$AX = b \;\Rightarrow\; P_1 A X = P_1 b \;\Rightarrow\; L_1 P_1 A X = L_1 P_1 b \;\Rightarrow\; \cdots \;\Rightarrow\; (L_{n-1} P_{n-1} \cdots L_2 P_2 L_1 P_1 A) X = (L_{n-1} P_{n-1} \cdots L_1 P_1) b$$
How to choose a pivoting element is a complicated problem, in some ways even more complex than the LP itself. However, there is a heuristic rule we can follow.
Minimum-degree heuristic (quoted from textbook):
Before eliminating the nonzeros below a diagonal pivot element, scan all uneliminated
rows and select the sparsest row, i.e., that row having the fewest nonzeros in its
uneliminated part. Swap this row with pivot row. Then scan the uneliminated nonzeros in
this row and select that one whose column has the fewest nonzeros in its uneliminated
part. Swap this column with the pivot column so that this nonzero becomes the pivot
element. (Of course, provision should be made to reject such a pivot element if its value
is close to zero.)
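In practice, sparse LU codes apply fill-reducing orderings of this kind automatically. The sketch below (illustrative; it uses SuperLU through scipy, whose COLAMD option is an approximate minimum-degree column ordering, related to but not identical with the heuristic quoted above) compares the fill-in with and without reordering:

```python
import numpy as np
from scipy.sparse import identity, random as sprandom
from scipy.sparse.linalg import splu

# A large, sparse, nonsingular test matrix (hypothetical data).
n = 400
A = (sprandom(n, n, density=0.01, random_state=4) + 10.0 * identity(n)).tocsc()

# 'NATURAL' keeps the original column order; 'COLAMD' is an approximate
# minimum-degree column ordering. Fewer nonzeros in L and U means less fill-in.
for ordering in ("NATURAL", "COLAMD"):
    factor = splu(A, permc_spec=ordering)
    print(ordering, factor.L.nnz + factor.U.nnz)
```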
4. Updating the Factorization
From section 3, using LU factorization we reduce the computing load of the first iteration by roughly a factor of three. Now we address how to use the information obtained from the LU factorization at the first iteration to decrease the computation at the following iterations.
With $B X_B = b$ and $B = LU$ from the first iteration, at the next iteration we construct a new $B$ and want an LU factorization of this new $B$. If we computed a fresh factorization, it would need $O(m^3)$ flops. Fortunately, we can construct the new $L$ and $U$ without actually doing the factorization again.
We know from section 2.2 that:
$$B_{\text{new}} = B + (a_q - b_r) e_r^T \;\Rightarrow\; L^{-1} B_{\text{new}} = U + L^{-1} (a_q - b_r) e_r^T = U + (\tilde{a}_q - U_r) e_r^T$$
In the above equation, $\tilde{a}_q = L^{-1} a_q$, and $U_r = L^{-1} b_r$ is the $r$th column of $U$. The rank-one term is zero except in its $r$th column:
$$(\tilde{a}_q - U_r)\, e_r^T = \begin{pmatrix} 0 & \cdots & \tilde{a}_q - U_r & \cdots & 0 \end{pmatrix}$$
Adding this term to $U$ replaces the $r$th column of $U$ by the vector $\tilde{a}_q$, so that we have:
$$L^{-1} B_{\text{new}} = \begin{pmatrix} U_1 & \cdots & U_{r-1} & \tilde{a}_q & U_{r+1} & \cdots & U_m \end{pmatrix}$$
By a permutation $P$ we move the $r$th column to the $m$th position and get:
$$L^{-1} B_{\text{new}} P = \begin{pmatrix} U_1 & \cdots & U_{r-1} & U_{r+1} & \cdots & U_m & \tilde{a}_q \end{pmatrix}$$
Now this matrix looks like (shown for $m = 7$, $r = 3$; $\times$ marks a possible nonzero):
$$\begin{pmatrix}
\times & \times & \times & \times & \times & \times & \times \\
 & \times & \times & \times & \times & \times & \times \\
 & & \times & \times & \times & \times & \times \\
 & & \times & \times & \times & \times & \times \\
 & & & \times & \times & \times & \times \\
 & & & & \times & \times & \times \\
 & & & & & \times & \times
\end{pmatrix}$$
That is, it is upper triangular except for the entries just below the diagonal in columns $r, \ldots, m-1$.
In order to obtain the new $L$ and $U$ from this matrix, we pre-multiply by special lower triangular matrices. Each such matrix is essentially an identity matrix except for one column:
$$E_r = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & \epsilon_r & 1 & \\ & & & & \ddots \end{pmatrix}, \qquad
E_r^{-1} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & -\epsilon_r & 1 & \\ & & & & \ddots \end{pmatrix}$$
The nonzero off-diagonal element $\epsilon_r$ is in the $r$th column, chosen to eliminate the subdiagonal entry there. Writing $\tilde{B} = L^{-1} B_{\text{new}} P$ and continuously multiplying by $E$ matrices, we have:
$$E_r L^{-1} B_{\text{new}} P = E_r \tilde{B}$$
$$E_{m-1} \cdots E_{r+1} E_r L^{-1} B_{\text{new}} P = E_{m-1} \cdots E_{r+1} E_r \tilde{B}$$
At last we get an upper triangular matrix. From this triangular matrix we can derive the new $L$ and $U$.
Computing $E_{m-1} \cdots E_{r+1} E_r L^{-1} = L_{\text{new}}^{-1}$ is an $O(m^2)$ operation, and computing $E_{m-1} \cdots E_{r+1} E_r \tilde{B} = U_{\text{new}}$ is another $O(m^2)$ operation. As a whole, obtaining $B_{\text{new}} P = L_{\text{new}} U_{\text{new}}$ needs only $O(m^2)$ flops.
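The following dense numpy sketch walks through the whole update (illustrative only; a real implementation works on sparse factors, and the dense matrix products below are written for clarity rather than for the $O(m^2)$ operation count, since each $E_k$ differs from the identity in a single entry and really costs $O(m)$ to apply):

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(5)
m, r = 5, 1                                   # basis size, position of the leaving column
B = rng.standard_normal((m, m))
a_q = rng.standard_normal(m)                  # entering column
P0, L, U = lu(B)                              # B = P0 L U from the first iteration

B_new = B.copy()
B_new[:, r] = a_q                             # B_new = B + (a_q - b_r) e_r^T

# L^{-1} P0^T B_new is U with column r replaced by ~a_q = L^{-1} P0^T a_q;
# shift columns r..m-1 cyclically so the spiked column becomes the last one.
a_tilde = solve_triangular(L, P0.T @ a_q, lower=True)
perm = list(range(r)) + list(range(r + 1, m)) + [r]
B_tilde = U.copy()
B_tilde[:, r] = a_tilde
B_tilde = B_tilde[:, perm]                    # subdiagonal entries in columns r..m-2

E = np.eye(m)                                 # accumulates E_{m-1} ... E_r
for k in range(r, m - 1):                     # eliminate each subdiagonal entry (k+1, k)
    Ek = np.eye(m)
    Ek[k + 1, k] = -B_tilde[k + 1, k] / B_tilde[k, k]
    B_tilde = Ek @ B_tilde
    E = Ek @ E

U_new = B_tilde                               # upper triangular
L_new = P0 @ L @ np.linalg.inv(E)             # so that B_new P = L_new U_new
print(np.allclose(B_new[:, perm], L_new @ U_new))  # True
```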