Notes on Linear Programming
1 Introduction and Elementary Examples
Linear programming and its variants are certainly the most widely used optimization algorithms in applications. The main algorithm used for actual computation is the Simplex Algorithm, which is based on techniques from linear algebra. Computer codes are commonly available in most libraries of numerical routines and are used for diverse problems, ranging from the efficient use of limited resources to the design of routings for computer cables.
The basic numerical algorithm, the Simplex Algorithm, is designed to solve the linear optimization problem whose standard form can be written as

    Minimize   ∑_{j=1}^{n} cj xj

under the side conditions

    xj ≥ 0 ,   j = 1, . . . , n ,                                   (1.1)

and

    ∑_{j=1}^{n} aij xj = bi ,   i = 1, . . . , m .
This problem is a linear, finite dimensional problem since the cost functional, as
well as the side conditions, are written in terms of finitely many linear functionals.
Remark: Note that this problem is written as a minimization problem. In practice one is often confronted with a maximization problem. However, each such problem may be rewritten as a minimization problem simply by replacing the cost functional

    ∑_{j=1}^{n} cj xj    by    (−1) ∑_{j=1}^{n} cj xj .

We will, for the theoretical development, always speak in terms of minimization problems, although some of the examples are ones of maximization of an appropriate cost.
1.1 Some Examples
To make the general formulation above more concrete, we look at a pair of examples.
Example 1.1
Suppose a manufacturing concern produces four different products which we will call
A, B, C, and D. Suppose further that there are fixed amounts of three different resources,
such as labor, raw materials, equipment, etc. which can be combined in certain fixed and
known ways to produce the four products. In particular, it is known how much of each
resource must be used to produce one unit of each product and the variable cost of each
resource. Moreover, we will assume that the profit made for each unit of product sold is
known. To be specific, let us take the data as listed in the following table:
                        PRODUCT                 Limitation
    Resource       A     B     C     D         on Resource
       1           5     2     3     1             300
       2           1     2     1     2             200
       3           1     0     1     0             100
    Unit Price     6     4     2     1
From this data, we wish to make the decision of how much of each product to make in
order to maximize profit.
We want to write this problem mathematically as a linear programming problem. To
accomplish this, we let xj , j = 1, 2, 3, 4, represent the amounts of the products to be
manufactured. These are the numbers that we wish to determine. From the last line of
the table, we can write the objective functional as
z = 6x1 + 4x2 + 2x3 + x4 .
Since the resources are limited, we have certain constraints which are determined by the
limitations on the resources listed in the upper lines of the table. The three constraints
lead to the set of inequalities:
    5x1 + 2x2 + 3x3 + x4 ≤ 300
    x1 + 2x2 + x3 + 2x4 ≤ 200
    x1 + x3 ≤ 100
Likewise we have the usual non-negativity constraints,
xj ≥ 0, j = 1, 2, 3, 4.
This linear program is summarized by:


Maximize 6x1 + 4x2 + 2x3 + x4

under the side conditions

    x ∈ R4 and xi ≥ 0, i = 1, 2, 3, 4,

together with

    5x1 + 2x2 + 3x3 + x4 ≤ 300
    x1 + 2x2 + x3 + 2x4 ≤ 200
    x1 + x3 ≤ 100                                                   (1.2)
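As a quick illustration of how such a problem is handed to a standard solver, here is a minimal Python sketch (assuming SciPy is available; scipy.optimize.linprog minimizes, so the profit is negated, and non-negativity is its default bound):

```python
import numpy as np
from scipy.optimize import linprog

# Maximize 6x1 + 4x2 + 2x3 + x4  <=>  minimize the negated profit.
c = np.array([-6.0, -4.0, -2.0, -1.0])

# Resource constraints  A_ub x <= b_ub  (rows 1-3 of the table).
A_ub = np.array([[5.0, 2.0, 3.0, 1.0],
                 [1.0, 2.0, 1.0, 2.0],
                 [1.0, 0.0, 1.0, 0.0]])
b_ub = np.array([300.0, 200.0, 100.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x, -res.fun)   # expect roughly x = (25, 87.5, 0, 0) with profit 500
```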
This last example is one with four unknowns. If there are only two, it is possible to use
graphical methods to see the optimal solution. By doing so, we gain insight into the
general case.
Example 1.2
A bank has $100 million dedicated to investment. A portion will be invested in loans ($L million) and a portion in securities ($S million). Loans yield high returns (for example 10%) but have the drawback that they are long-term investments. On the other hand, securities yield, for example, 5% earnings. On a yearly basis, the bank receives 0.1 L + 0.05 S in returns on these investments. This amount is to be maximized subject to the obvious constraints.
1. Implicit side conditions: L ≥ 0, S ≥ 0.
2. Limited money supply: L + S ≤ 100.
3. Liquidity constraints: It is desired (or required by law) that the bank hold at least
25% of the available investment monies in liquid form, i.e., S ≥ 0.25 (L + S) or
L − 3 S ≤ 0.
4. Minimal investments: Certain companies expect sizable loans, e.g. L ≥ 30.
A pair (L, S) which satisfies the constraints (1)–(4) is called an admissible investment.
The managers of the bank wish to find, among all admissible investments, that which
maximizes the income from investments 0.1 L + 0.05 S. In other words, we have the linear
programming problem:


Maximize 0.1 L + 0.05 S

under the side conditions

    L ≥ 0 ,   S ≥ 0 ,

together with

    L + S ≤ 100
    L − 3S ≤ 0
    L ≥ 30                                                          (1.3)
We can solve this problem graphically as follows:
[Figure: the admissible region in the (L, S)-plane, the triangle bounded by the lines L = 30, L + S = 100 and L − 3S = 0, with the optimal corner P marked and dotted level lines of the income.]
The shaded region of the triangle represents the admissible investments. The question is: Which of the points in the triangle are optimal?
The dotted lines in the figure represent the lines (1/10) L + (1/20) S = constant, i.e., the lines 2L + S = c for different constants c > 0. The line with slope −2 is apparently optimal if it passes through P. Then P = (L∗, S∗) is optimal!
The point P is characterized by the system of equations

    L∗ + S∗ = 100 ,    L∗ − 3S∗ = 0 ,

and therefore

    L∗ = 75 ,   S∗ = 25   with value $8.75 million.
Remark: It is important to notice that the optimal choice occurs at a corner of the
polygonal domain of admissible investment pairs. Indeed, this is true in general for such
problems. It may well be that there are other optimal solutions as we will see in the next
example.
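The corner-point observation can be checked by brute force for Example 1.2: intersect each pair of constraint boundaries, keep the intersections that satisfy all the constraints, and evaluate the income at each. A minimal sketch, assuming NumPy is available:

```python
import itertools
import numpy as np

# Constraints of Example 1.2 written as  a_i . (L, S) <= b_i.
A = np.array([[-1.0,  0.0],   # L >= 0
              [ 0.0, -1.0],   # S >= 0
              [ 1.0,  1.0],   # L + S <= 100
              [ 1.0, -3.0],   # L - 3S <= 0
              [-1.0,  0.0]])  # L >= 30
b = np.array([0.0, 0.0, 100.0, 0.0, -30.0])

best = None
for i, j in itertools.combinations(range(len(b)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                               # parallel boundaries
    v = np.linalg.solve(M, b[[i, j]])          # candidate corner
    if np.all(A @ v <= b + 1e-9):              # keep only admissible corners
        income = 0.1 * v[0] + 0.05 * v[1]
        if best is None or income > best[1]:
            best = (v, income)

print(best)   # expect the corner (L, S) = (75, 25) with income 8.75
```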
Example 1.3
Let us consider the linear programming problem:


Maximize 2 x1 + 0.5 x2

under the side conditions

    x1 ≥ 0 ,   x2 ≥ 0 ,

together with

    4 x1 + 5 x2 ≤ 30
    4 x1 + x2 ≤ 12                                                  (1.4)
In the diagram below, the feasible region lies in the bounded polygonal region, and the
dotted lines are level lines of the cost functional. Here we see that the point P as well
as the point (3, 0) are optimal points as are all points on the line segment joining these
points. In fact, all these points represent optimal solutions for the problem. Optimal solutions (in fact two of them) lie at corner points of the feasible region. Notice that there are no interior points which are optimal.
[Figure: the feasible region for (1.4), bounded by the lines 4x1 + x2 = 12 and 4x1 + 5x2 = 30 and the coordinate axes, with the point P and the dotted level lines z = 2, 4, 6, 8 of the cost functional.]
The interesting thing is that we can, from these examples, see what the general picture
is regardless of the number of unknowns or the number of constraints that are involved.
Explicitly,
1. The intersection of the quadrant boundaries and the constraint boundaries generates
a convex polygon. This is also true in the n-dimensional case, where the constraints
are geometrically represented as hyperplanes and the domain is an n-dimensional
polygonal body.
2. A solution to the linear programming problem is at a “corner” or extreme point of
the polygon and not in the interior of the polygon of feasible solutions. We say that
the common boundaries which meet at the corner are the set of “active” constraints.
This is also true in the n-dimensional case with a polyhedron. Indeed, to find an
optimal solution, one need only search through the extreme points of the set of
feasible solutions.
Since, in higher dimensions, it is not possible to use geometric methods to arrive at a
solution, we will need an algebraic formulation of an algorithm that searches these points.
That algorithm is, of course, the Simplex Algorithm.
1.2 Formulation in Terms of Matrices
We start with some basic notation that we will use throughout these notes.
We will always write vectors a, b, c, x, y, z ∈ Rn as column vectors.
An (m × n)-matrix will be denoted by capital roman letters, as, for example, A = (aij), i = 1, . . . , m, j = 1, . . . , n, with A ∈ Rm×n.
The transposed matrix A⊤ is defined by A⊤ = (aji) ∈ Rn×m. We will also interpret x ∈ Rn as an (n × 1)-matrix, in which case x⊤ is exactly the row vector in R1×n.
We recall that there are different possibilities for multiplying vectors which are special
cases of the rule for matrix multiplication AB ∈ Rm×p for A ∈ Rm×n and B ∈ Rn×p .
Specifically, the scalar product or dot product is

    x⊤y = ∑_{j=1}^{n} xj yj    for x = (xj), y = (yj) ∈ Rn.
For x, y ∈ Rn we introduce the partial order:
x ≤ y ⇐⇒ xj ≤ yj for all j = 1, . . . , n.
With these conventions, we can write the problem (1.1) as:
    Minimize c⊤x

    under the side conditions

        x ≥ 0 ,                                                     (1.5)

    and

        Ax = b ,
where c ∈ Rn , A ∈ Rm×n , b ∈ Rm are given and x ∈ Rn is to be found.
Example 1.4
Let us look at an example which is not in the standard form and see how we might
introduce auxiliary variables to derive an equivalent problem in standard form.


Maximize x1 + 2 x2 + 3 x3

under the side conditions

    x1 ≥ 0 ,   x3 ≤ 0 ,

together with

    x1 − 2 x2 + x3 ≤ 4
    x1 + 3 x2 ≥ 5
    x1 + x3 = 10                                                    (1.6)
This problem is first replaced with a minimization problem with constraints in the standard form:


Minimize − x1 − 2 x2 − 3 x3

under the side conditions

    x1 ≥ 0 ,   −x3 ≥ 0 ,

together with

    x1 − 2 x2 + x3 ≤ 4
    −x1 − 3 x2 ≤ −5
    x1 + x3 = 10                                                    (1.7)
We then introduce the slack variables x4 ≥ 0 and x5 ≥ 0 and rewrite the inequality
constraints as equality constraints.


Minimize − x1 − 2 x2 − 3 x3

under the side conditions

    x1 ≥ 0 ,   −x3 ≥ 0 ,   x4 ≥ 0 ,   x5 ≥ 0 ,

together with

    x1 − 2 x2 + x3 + x4 = 4
    −x1 − 3 x2 + x5 = −5
    x1 + x3 = 10                                                    (1.8)
Now, since x2 is not subject to inequality constraints (such variables are often called free variables, which should not be confused with the use of the term free variable in Gaussian elimination), we introduce auxiliary variables
u ≥ 0 and v ≥ 0 and set x2 = u − v as well as x̂3 = −x3 . Substitution of these forms of
x2 and x3 leads to the new system


Minimize − x1 − 2 u + 2 v + 3 x̂3

under the side conditions

    x1 ≥ 0 ,   u ≥ 0 ,   v ≥ 0 ,   x̂3 ≥ 0 ,   x4 ≥ 0 ,   x5 ≥ 0 ,

together with

    x1 − 2 u + 2 v − x̂3 + x4 = 4
    −x1 − 3 u + 3 v + x5 = −5
    x1 − x̂3 = 10                                                    (1.9)
Finally, we can rename all the variables. Whichever way we do that, to recover the
solution of the original problem, we need to keep track of the renaming. For the purposes
of illustration, we make the following assignments:
    x1 → x1 ,    u → x2 ,    v → x3                                 (1.10)
    x̂3 → x4 ,    x4 → x5 ,   x5 → x6                                (1.11)
so that the problem can be written as


Minimize − x1 − 2 x2 + 2 x3 + 3 x4

under the side conditions

    x1 ≥ 0 ,  x2 ≥ 0 ,  x3 ≥ 0 ,  x4 ≥ 0 ,  x5 ≥ 0 ,  x6 ≥ 0 ,

together with

    x1 − 2 x2 + 2 x3 − x4 + x5 = 4
    −x1 − 3 x2 + 3 x3 + x6 = −5
    x1 − x4 = 10                                                    (1.12)
The matrix form is then


Minimize c⊤x

under the side conditions

    x ≥ 0

together with

    Ax = b                                                          (1.13)

where

    c = (−1, −2, 2, 3, 0, 0)⊤ ,    b = (4, −5, 10)⊤ ,    x = (x1, x2, x3, x4, x5, x6)⊤ ,

and

    A =  [  1   −2    2   −1    1    0 ]
         [ −1   −3    3    0    0    1 ]
         [  1    0    0   −1    0    0 ] .
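As a sanity check on the reduction, one can type in c, A and b from (1.13) and verify that a point satisfying the original constraints (1.6), for instance (x1, x2, x3) = (10, 3, 0), maps to a feasible point of the standard form with the same (negated) cost. A minimal sketch, assuming NumPy is available:

```python
import numpy as np

c = np.array([-1.0, -2.0, 2.0, 3.0, 0.0, 0.0])
A = np.array([[ 1.0, -2.0, 2.0, -1.0, 1.0, 0.0],
              [-1.0, -3.0, 3.0,  0.0, 0.0, 1.0],
              [ 1.0,  0.0, 0.0, -1.0, 0.0, 0.0]])
b = np.array([4.0, -5.0, 10.0])

# Original point (x1, x2, x3) = (10, 3, 0): split x2 = u - v, set x̂3 = -x3,
# and compute the slacks of the two inequality constraints.
x1, x2, x3 = 10.0, 3.0, 0.0
u, v, x3hat = max(x2, 0.0), max(-x2, 0.0), -x3
s1 = 4.0 - (x1 - 2 * x2 + x3)          # slack of   x1 - 2x2 + x3 <= 4
s2 = -5.0 - (-x1 - 3 * x2)             # slack of  -x1 - 3x2     <= -5
x = np.array([x1, u, v, x3hat, s1, s2])

print(np.allclose(A @ x, b))           # True: the point is feasible for (1.13)
print(c @ x, -(x1 + 2 * x2 + 3 * x3))  # both equal -16: the costs agree
```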
Notation: In the following we will always take M to be the set of feasible points, i.e.,
M = {x ∈ Rn : x ≥ 0, Ax = b} .
It is important to note that the set M ⊂ Rn is a convex set, a fact which follows from the
linearity of matrix multiplication.
We can then introduce the following definition:
Definition 1.5 Let (P ) stand for the linear programming problem in standard form. The
vector x∗ ∈ Rn is called a solution of the problem (P ) provided
(i) x∗ is feasible, i.e., x∗ ∈ M, and
(ii) x∗ is optimal, i.e., c⊤x∗ ≤ c⊤x for all x ∈ M.
In addition, it is useful to introduce two further terms:
Definition 1.6
A basic feasible solution of the linear programming problem is a feasible solution with
no more than m positive components xj .
Note that the positive components of x correspond to linearly independent columns of
the matrix A.
Definition 1.7
A non-degenerate basic feasible solution is a basic feasible solution with exactly m
positive components xj . A degenerate basic feasible solution has fewer than m positive
components.
2 The Simplex Method for Linear Programming
The Simplex Method is designed for problems written in the standard form, i.e.
(P)     Minimize f(x) := c⊤x on M = {x ∈ Rn : Ax = b , x ≥ 0}.
Here c ∈ Rn , A ∈ Rm×n and b ∈ Rm with m ≤ n are given.
In the case that M is bounded, M is compact, and there exists a solution of the optimization problem. Weaker assumptions that guarantee the existence of an optimal solution will be made later. In any case, no solution can exist if c⊤x is unbounded below on M, i.e. if inf_{x∈M} c⊤x = −∞.
Remarks:
(i) Every linear optimization problem can be rewritten in the form (P ) so that the
new problem is equivalent to the old. (To do so, one introduces so-called ”slack
variables” and the substitution of the unconstrained variables xj by uj − vj with
uj ≥ 0, vj ≥ 0).
(ii) If the rank A < m, then the equations are “ redundant”, and by dropping those
which are, the matrix can be replaced by one of full rank. From a theoretical
viewpoint, one can drop the requirement that rank A = m.
2.1 The Gauss-Jordan Method
The Gauss-Jordan method is used for the solution of the equation Ax = b where, again,
A ∈ Rm×n and b ∈ Rm are given and we are to find x ∈ Rn . It may well be that m ≤ n.
By no means do we need to require that rank A = m, and we ignore also the requirement
that x ≥ 0.
For Ax = b we introduce a shorthand, which we will call the Gauss-Jordan Tableau:

    a11   a12   · · ·   a1n  |  b1
     ·     ·             ·   |   ·
     ·     ·             ·   |   ·
    am1   am2   · · ·   amn  |  bm
The Gauss-Jordan Method consists of:
(i) Choice of pivot: One can take any non-zero element lying to the left of the b−column.
(ii) “Empty” the pivot column (both over and under the non-zero pivot element) and
normalize the pivot element to 1.
(iii) Choose a new pivot element from the rows below all the rows that already contain
a pivot element. If all rows either already contain a pivot element, or consist only
of zeros then STOP.
(iv) Return to step (ii).
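A minimal sketch of steps (i)-(iv) in Python (assuming NumPy; the function name gauss_jordan and the pivot choice — always the left-most usable column — are conventions chosen here, since the method allows any admissible pivot):

```python
import numpy as np

def gauss_jordan(A, b, tol=1e-12):
    """Reduce the augmented matrix [A | b] by the Gauss-Jordan steps above.

    Returns the reduced tableau and the list of pivot columns; columns
    without a pivot correspond to free variables.
    """
    T = np.hstack([np.asarray(A, float), np.asarray(b, float).reshape(-1, 1)])
    m = T.shape[0]
    pivot_cols, row = [], 0
    for col in range(T.shape[1] - 1):          # (i) never pivot in the b-column
        rows = np.nonzero(np.abs(T[row:, col]) > tol)[0]
        if rows.size == 0:                     # no usable pivot in this column
            continue
        r = row + rows[0]
        T[[row, r]] = T[[r, row]]              # bring the pivot row into place
        T[row] /= T[row, col]                  # (ii) normalize the pivot to 1
        for k in range(m):
            if k != row:
                T[k] -= T[k, col] * T[row]     # (ii) empty the pivot column
        pivot_cols.append(col)
        row += 1
        if row == m:                           # (iii) every row has a pivot
            break
    return T, pivot_cols
```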
Example 2.1 Find the general solution of the linear system
     x1 +  x2 +  x3 −  x4 = 1
    4x1 + 5x2 + 5x3 + 2x4 = 0
     x1 − 2x2       − 5x4 = 5 .
Using the steps listed above, we obtain the following sequence of Tableaus:
    1    1    1   −1 |  1
    4    5    5    2 |  0     subtract 5 × 1st eqn.
    1   −2    0   −5 |  5

    1    1    1   −1 |  1     add 2nd eqn.
   −1    0    0    7 | −5
    1   −2    0   −5 |  5     add 2nd eqn.

    0    1    1    6 | −4
   −1    0    0    7 | −5     ×(−1)
    0   −2    0    2 |  0     div (−2)

    0    1    1    6 | −4     subtract 3rd row
    1    0    0   −7 |  5
    0    1    0   −1 |  0

    0    0    1    7 | −4
    1    0    0   −7 |  5
    0    1    0   −1 |  0
We are now finished. Written out completely, the system of linear equations is:
    x1             − 7x4 =  5
          x2       −  x4 =  0
               x3  + 7x4 = −4
From this system, we can write down the general solution of the system. We take the
variable x4 as the parameter. (We have three equations in four unknowns, so we expect
that there will be one degree of freedom in describing the general solution!) We refer to
x4 as the “free variable” in contrast to the dependent variables x1 , x2 and x3 . With this
choice, x = (5 + 7t, t, −4 − 7t, t)⊤ ∈ R4 , t ∈ R is the general solution and
dim ker A = 1 and rank A = 3 (recall that ker A := {x ∈ Rn | A x = 0}).
Generally r = rank A is the number of pivot elements and n − r is the number of free
variables.
Of some interest is the special solution that one obtains when all the free variables are set
equal to 0. In our example, this special solution is x = (5, 0, −4, 0)⊤ . How do we obtain
this solution “ mechanically”? To do this, we simply set the variables xj which belong to
the columns which do not contain a pivot element equal to 0. The other variables appear
in the right-hand column b “suitably” permuted.
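Using the gauss_jordan sketch given above, the special basic solution of Example 2.1 can be read off mechanically in exactly this way:

```python
import numpy as np

A = np.array([[1.0,  1.0, 1.0, -1.0],
              [4.0,  5.0, 5.0,  2.0],
              [1.0, -2.0, 0.0, -5.0]])
b = np.array([1.0, 0.0, 5.0])

T, piv = gauss_jordan(A, b)        # pivots in columns 0, 1, 2; column 3 is free
x = np.zeros(4)
x[piv] = T[:len(piv), -1]          # basic variables take the values in the b-column
print(piv, x)                      # [0, 1, 2]  and  x = (5, 0, -4, 0)
```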
Although we are now finished with the linear system, we can nevertheless continue to look
for further “special” solutions, for example:
    0    0    1    7 | −4
    1    0    0   −7 |  5
    0    1    0   −1 |  0

(pivoting on the 7 in the x4-column)

    0    0   1/7   1 | −4/7
    1    0    1    0 |  1
    0    1   1/7   0 | −4/7
So we obtain as a special solution x3 = 0 and x1 = 1, x2 = −4/7, x4 = −4/7.
The Simplex Algorithm (in Phase II) searches for these so-called basic solutions.
2.2 The Idea of the Simplex Method through a Special Case
Before discussing the general method, we want to take a special case and work through
the details. We chose the following particular example.
Example 2.2
Maximize 20 x1 + 60 x2

under the side conditions

    [ 1    1 ]  [ x1 ]      [ 120 ]          [ x1 ]      [ 0 ]
    [ 5   10 ]  [ x2 ]  ≤   [ 700 ] ,        [ x2 ]  ≥   [ 0 ] .
    [ 2   10 ]              [ 520 ]
The next figure represents the feasible region.
[Figure 1: The Feasible Region — the polygon with vertices (0, 0), (120, 0), (100, 20), (60, 40) and (0, 52) in the (x1, x2)-plane.]

We first transform this problem into the normal form through the introduction of slack variables x3, x4, x5. Moreover, we rewrite the maximization problem as a minimization problem. Hence, we wish to
Minimize −20 x1 − 60 x2 ,    x ∈ R5 ,

under the side conditions

    [ 1    1   1   0   0 ]        [ 120 ]
    [ 5   10   0   1   0 ]  x  =  [ 700 ] ,        x ≥ 0 .
    [ 2   10   0   0   1 ]        [ 520 ]
We see now that the system Ax = b is already in the form that we would obtain by
application of the Gauss-Jordan method. Indeed, rank A = 3. The variables x1 and x2
are the free variables while x3 , x4 , x5 are dependent. The corresponding basic solution is
ẑ = (0, 0, 120, 700, 520)⊤ ∈ R5 .
We append the equation c⊤ x = γ to the Gauss-Jordan Tableau (γ is a parameter) in the
following way:

    −20   −60    0    0    0 |  γ
      1     1    1    0    0 | 120
      5    10    0    1    0 | 700                                  (2.14)
      2    10    0    0    1 | 520
This new tableau is called the Simplex Tableau. The basic solution ẑ1 = (0, 0, 120, 700, 520)⊤ is a solution of the top equation corresponding to the choice γ = 0 (since c⊤ẑ1 = 0 in this case). Moreover, this basic solution ẑ1 is admissible since b ≥ 0!
We now introduce a different tableau in which we choose the marked entry 10 as pivot
element:
    −8     0    0    0     6    | γ + 3120
    4/5    0    1    0   −1/10  |  68
     3     0    0    1    −1    | 180                               (2.15)
    1/5    1    0    0    1/10  |  52
Now, x1, x5 are the free variables and x2, x3, x4 the dependent ones. The corresponding basic solution is ẑ2 = (0, 52, 68, 180, 0)⊤. We have had some good luck here since we also see that ẑ2 ≥ 0. Thus ẑ2 satisfies the complete system (2.15) with γ + 3120 = 0, and therefore γ = −3120. Having made nothing more than equivalent transformations, ẑ2 is also a solution of (2.14) with γ = −3120. And at ẑ2 the cost functional takes a smaller value than at ẑ1, and hence ẑ2 is “better”!
Remarks:
(A) What happens, in the last example, when one takes, say, a12 as pivot element rather than a32? Then we obtain

        40    0   60    0    0 | γ + 60 · 120
         1    1    1    0    0 | 120
        −5    0  −10    1    0 | −500
        −8    0  −10    0    1 | −680

    with the basic solution (0, 120, 0, −500, −680)⊤. This solution is not admissible, for some of the components are negative. The step in the simplex algorithm apparently is valid in the case that the pivot element ars is positive and br/ars is minimal.
(B) When do we get something > γ in the upper right entry of the tableau? Obviously, this occurs exactly when cs < 0 and br/ars > 0. This tells us that if all the entries of the first row to the left of the last entry are non-negative, then we cannot obtain a better solution by performing further steps of the Simplex Algorithm.
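In code, a single simplex step is just the Gauss-Jordan pivot applied to the whole tableau, with the appended cost row treated like any other row. A minimal sketch, assuming NumPy (indices are 0-based, and γ is started at 0 so that its increment appears in the last entry of the cost row):

```python
import numpy as np

def pivot(T, r, s):
    """One simplex step on the tableau T (cost row first, constraint rows below):
    normalize row r so the pivot T[r, s] becomes 1, then clear column s in
    every other row, including the cost row."""
    T = np.array(T, dtype=float)
    T[r] /= T[r, s]
    for k in range(T.shape[0]):
        if k != r:
            T[k] -= T[k, s] * T[r]
    return T

# Tableau (2.14): last column holds the right-hand side; the cost row's last
# entry plays the role of the parameter gamma, started at 0.
T = np.array([[-20.0, -60.0, 0.0, 0.0, 0.0,   0.0],
              [  1.0,   1.0, 1.0, 0.0, 0.0, 120.0],
              [  5.0,  10.0, 0.0, 1.0, 0.0, 700.0],
              [  2.0,  10.0, 0.0, 0.0, 1.0, 520.0]])

print(pivot(T, 3, 1))   # pivot on the 10 in the x2-column -> tableau (2.15)
```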
We continue the example above and consider (2.15): we must look for the pivot element in the first column, since the first row of the tableau contains a negative element only in that column.
    −8     0    0    0       6    | γ + 3120
    4/5    0    1    0    −1/10   |  68
     3     0    0    1      −1    | 180
    1/5    1    0    0     1/10   |  52

     0     0    0   8/3    10/3   | γ + 3600
     0     0    1  −4/15    1/6   |  20
     1     0    0   1/3    −1/3   |  60
     0     1    0  −1/15    1/6   |  40
The corresponding basic solution is x∗ = (60, 40, 20, 0, 0)⊤ ∈ M. This vector satisfies ĉ⊤x∗ = 0, so the top row gives γ + 3600 = 0, i.e. γ = −3600. Here ĉ is the “current cost vector”, i.e., the first row in the current tableau. Hence x∗ also solves the initial system (2.14) for γ = −3600. Now we are finished, since the current cost vector is non-negative: each further simplex step would only increase the value of γ. We can now ignore the slack variables x3, x4, x5, and then we have found an optimal vector (60, 40)⊤ ∈ R2 as the solution of the initial problem.
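Reusing pivot() and the tableau T from the sketch above, the two pivots carried out by hand can be replayed to confirm this basic solution:

```python
T1 = pivot(T, 3, 1)      # pivot on the 10 in the x2-column  -> tableau (2.15)
T2 = pivot(T1, 2, 0)     # pivot on the 3 in the x1-column   -> final tableau
print(T2[0])             # cost row (0, 0, 0, 8/3, 10/3 | 3600): no negatives left
print(T2[1:, -1])        # basic values 20, 60, 40 for x3, x1, x2, i.e. x* = (60, 40, 20, 0, 0)
```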
Remark: It is important to note that the basic solutions produced in the various steps, namely ẑ1, ẑ2 and x∗, have first and second components which correspond to (actually neighboring) corner points of the feasible region!
We should now have some insight into the importance of the simplex tableau as a method of organizing the computations necessary to solve the linear programming problem. Indeed, its importance lies in the fact that it collects, in a particularly efficient manner, all the information necessary for carrying out the algorithm itself. In particular:
(a) the basic solution can be read off directly: the basic variables correspond to columns
which, taken together and properly permuted, correspond to an identity matrix.
Since the remaining variables are set equal to zero, we have
xj1 = bj1 , xj2 = bj2 , . . . xjm = bjm ,
and we have bj1 ≥ 0, . . . , bjm ≥ 0 if the basis is feasible;
(b) the value of the objective function is obtained by solving the equation γ + K = 0
from the upper right-hand corner of the tableau. Indeed, the first row reads, in
effect c⊤ z = 0 = γ + K;
(c) The reduced costs of the non-basic variables are obtained by reading directly the first
row of the simplex tableau. They allow us, in particular, to see at once whether
the current basis is optimal. This is the case when all entries of the first row
corresponding to the non-basic variables are non-negative.
We now wish to try to understand the observation made above that the solutions produced
by the various steps correspond to the vertices of the feasible region. Let us begin with
the following theorem.
Theorem 2.3 Consider the linear program

    min c⊤x    subject to    Ax = b  and  x ≥ 0 .

Then
1. If there is any feasible solution, then there is a basic feasible solution.
2. If there is any optimal solution, then there is a basic optimal solution.
Proof: Suppose that a feasible solution exists. Choose any feasible solution among those with the fewest non-zero components. If there are no non-zero components, then x = 0 and x is a basic solution by definition. Otherwise, take the index set J := {j1, j2, . . . , jr} with elements corresponding to those xji > 0. Then the matrix AJ := col (a(ji)) must be non-singular. Indeed, were AJ singular, then its columns {a(j1), a(j2), . . . , a(jr)} would be a linearly dependent set of vectors and hence, for some choice of scalars αi, not all zero,

    α1 a(j1) + α2 a(j2) + . . . + αr a(jr) = 0.                     (2.16)

Without loss of generality, we may take α1 ≠ 0 and, indeed, α1 > 0 (otherwise multiply (2.16) by −1).
Now, since xji > 0, the corresponding feasible solution x is just a linear combination of the columns a(ji). Hence

    Ax = ∑_{i=1}^{r} xji a(ji) = b.

Now, multiplying the dependence relation (2.16) by a real number λ and subtracting, we have

    ∑_{i=1}^{r} (xji − λαi) a(ji) = b.

Now if λ is taken to be sufficiently small, we still have a feasible solution with components (xji − λαi) ≥ 0. Indeed, we can ensure this inequality holds by taking λ < xji/αi for those i ∈ {1, 2, . . . , r} for which αi > 0. If we take λ sufficiently large, however, we can arrange that the component (xj1 − λα1) < 0, and this will be the case if and only if λ > xj1/α1, since α1 > 0.
On the other hand, if αi ≤ 0 then (xji − λαi) ≥ 0 for all λ ≥ 0. We see now that we may choose a number λ̃ by

    λ̃ := min { xji/αi : xji > 0, αi > 0 } .                         (2.17)

If the minimum quotient occurs for i = k, then xjk − λ̃αk = 0 (why?), and so we have found a new feasible solution with fewer non-zero components, contradicting our choice of x. Hence the a(ji) must be linearly independent. It follows that the matrix AJ is non-singular, and so x is feasible and basic.
Hence, if we have a feasible solution, then there exists a basic feasible solution.
Now assume that x∗ is an optimal solution. There is no guarantee that this optimal
solution is unique. In fact, in many cases there is no uniqueness. Again, some of these
solutions may have more positive components than others. Without loss of generality,
we assume that x∗ has a minimal number of positive components. If x∗ = 0 then x∗ is
basic and the cost is zero. If x∗ ≠ 0 and if J is the corresponding index set, then we wish to show that the matrix AJ is non-singular or, equivalently, that the columns of AJ are linearly independent.
If the {a(ji)} are, on the contrary, linearly dependent, then there exist coefficients αi, not all zero, such that ∑_{i=1}^{r} αi a(ji) = 0. As before, we may assume that α1 > 0. We claim that

    ∑_{i=1}^{r} αi cji = 0.                                         (2.18)
Since x∗ is feasible, we have Ax∗ = b, so that ∑_{i=1}^{r} x∗ji a(ji) = b. Now look at the equation ∑_{i=1}^{r} (x∗ji − λαi) a(ji) = b. Then the condition x∗ji > 0 implies that x∗ji − λαi ≥ 0 for sufficiently small |λ|. Hence

    ∑_{i=1}^{r} cji (x∗ji − λαi) = ∑_{i=1}^{r} cji x∗ji − λ ∑_{i=1}^{r} cji αi = c⊤x∗ − λ c⊤α ,
so that if (2.18) were false, then we could decrease the cost by letting λ be some small positive or negative number, and hence x∗ would not be optimal.
Now, set γλ := ∑_{i=1}^{r} cji (x∗ji − λαi). Then γ0 = c⊤x∗. If we let λ increase from zero, then, as long as the components x∗ji − λαi remain non-negative for all i = 1, . . . , r, the vector xλ with these components remains feasible and, in light of (2.18), is optimal with the same cost as x∗. Since α1 > 0 we see that for some λ at least one component will be negative, in particular for λ > xj1/α1. Again, set λ̃ as in (2.17). If the
minimum occurs for i = k then x∗jk − λ̃αk = 0 and so we have produced a new optimal
feasible solution with fewer positive components than x∗ which contradicts the choice of
the original optimal solution.
We conclude that the matrix AJ must be non-singular, and so the optimal solution is
basic.
□

3 The Simplex Algorithm: description
The Simplex Method or Simplex Algorithm can be described in terms of two phases,
namely
Phase I: We construct an admissible basic solution ẑ, Ĵ of Ax = b, a vector ĉ ∈ Rn and a representation M = {x ∈ Rn : Âx = b̂, x ≥ 0} so that M has the following properties:

(a) If Ĵ = {j1, . . . , jm}, then â(jk) = ek for k = 1, . . . , m, where ek is the k-th unit vector in Rm,

(b) ĉj = 0 for all j ∈ Ĵ (and hence ĉ⊤ẑ = 0), and b̂ ≥ 0,

(c) ĉ⊤x + f(ẑ) = c⊤x for all x with Ax = b.

Phase II: We start with the assumption that we have found a basic solution (ẑ, Ĵ) of Ax = b, a vector ĉ ∈ Rn and a representation of M, so that (a), (b), and (c) are satisfied. We then seek another admissible basic solution (z̃, J̃) of Ax = b, a vector c̃ and Ã, b̃ with the properties (a), (b), (c), for which f(z̃) < f(ẑ). Once a new basic solution is found, we then replace ẑ with z̃ and the other quantities in a similar manner. We continue this process to construct a sequence of feasible basic solutions {ẑk} and hope either that it converges to an optimal solution of the problem, or that the algorithm stops at an optimal solution, or that it gives us the information that inf(P) = −∞.
We illustrate this process with an example.
Example 3.1
Consider the problem
Maximize 3 x1 + x2 + 3 x3 ,
subject to

    2x1 +  x2 +  x3 ≤ 2
     x1 + 2x2 + 3x3 ≤ 5
    2x1 + 2x2 +  x3 ≤ 6

    x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.
In order to put this problem into a standard form so that the simplex procedure can
be applied, we change the maximization problem to minimization by multiplying the
objective function by −1 and we introduce three non-negative slack variables x4 , x5 , x6 .
We then have the initial tableau
INITIAL TABLEAU

    x1   x2   x3   x4   x5   x6 |  b
     2    1    1    1    0    0 |  2     ← eq'n 1
     1    2    3    0    1    0 |  5     ← eq'n 2
     2    2    1    0    0    1 |  6     ← eq'n 3
    −3   −1   −3    0    0    0 |  0     ← (−cost fct'n)
     ↑    ↑    ↑
    neg. neg. neg.
Note: This problem is in canonical form with the three slack variables as basic variables.
Simplex Method Phase II

Given (P̂) with an admissible basic solution (ẑ, Ĵ) satisfying the conditions (a), (b), (c):

1. Set γ̂ := f(ẑ) = c⊤ẑ.

2. If ĉj ≥ 0 for all j ∉ Ĵ, then ẑ is a solution of (P) and γ̂ = inf(P); STOP.

3. Otherwise choose s ∈ {1, . . . , n} \ Ĵ with ĉs < 0, e.g. ĉs := min_{j∉Ĵ} ĉj.

4. If â∗s ≤ 0 (the entire s-th column is non-positive), then (P) has no solution, inf(P) = −∞; STOP.

5. Otherwise determine r ∈ {1, . . . , m} with

       b̂r/ârs = min { b̂i/âis : i ∈ {1, . . . , m}, âis > 0 } .

6. New basis indices: j̃k = jk for k ≠ r, j̃r := s, i.e. J̃ = {j1, . . . , jr−1, s, jr+1, . . . , jm}.

7. New Ã, b̃, c̃ and γ̃:

       (ãr1, · · · , ãrn | b̃r) = (1/ârs)(âr1, · · · , ârn | b̂r)
       (ãk1, · · · , ãkn | b̃k) = (âk1, · · · , âkn | b̂k) − âks (ãr1, · · · , ãrn | b̃r) ,   k ≠ r
       (c̃1, · · · , c̃n | −γ̃) = (ĉ1, · · · , ĉn | −γ̂) − ĉs (ãr1, · · · , ãrn | b̃r)

8. New admissible basic solution: z̃j := 0 for j ∉ J̃, and z̃j̃k = b̃k, k = 1, . . . , m.

9. Replace the hatted quantities by the new (tilde) ones and return to step 2.
Rule for Pivots: Compute the ratios bi/xi,j for all positive elements xi,j. Find the element in the possible pivoting columns that corresponds to the minimal ratio. Pivoting on that element maintains feasibility and decreases cost. (Remember the non-degeneracy assumption!)
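A compact implementation of this Phase II iteration, following the update formulas in steps 5-9 above, might look as follows (a sketch assuming NumPy and data already in the canonical form produced by Phase I; degeneracy and cycling are ignored). It is applied to Example 3.1 after the final tableau below.

```python
import numpy as np

def simplex_phase2(A, b, c, basis, tol=1e-9):
    """Phase II for  min c^T x  subject to  Ax = b, x >= 0.

    Assumes the canonical form of Phase I: the columns listed in `basis`
    form (a permutation of) the identity, b >= 0, and c vanishes on the
    basic columns.  Returns (x, basis) at an optimum, or None if the
    problem is unbounded below.
    """
    A = np.asarray(A, float).copy()
    b = np.asarray(b, float).copy()
    c = np.asarray(c, float).copy()
    basis = list(basis)
    m, n = A.shape
    while True:
        s = int(np.argmin(c))                 # most negative reduced cost
        if c[s] >= -tol:                      # all c_j >= 0: current z is optimal
            x = np.zeros(n)
            x[basis] = b
            return x, basis
        if np.all(A[:, s] <= tol):            # whole column <= 0: inf(P) = -infinity
            return None
        ratios = [b[i] / A[i, s] if A[i, s] > tol else np.inf for i in range(m)]
        r = int(np.argmin(ratios))            # row attaining min b_r / a_rs
        basis[r] = s                          # exchange the basis index
        piv = A[r, s]
        A[r] /= piv                           # step 7: normalize the pivot row
        b[r] /= piv
        for k in range(m):                    # step 7: clear column s elsewhere
            if k != r:
                f = A[k, s]
                A[k] -= f * A[r]
                b[k] -= f * b[r]
        c -= c[s] * A[r]                      # step 7: update the reduced costs
```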
Look again at the given array; the circled entries correspond to the possible pivot elements
chosen by that rule!
FIRST TABLEAU

    x1   x2   x3   x4   x5   x6 |  b
     2    1    1    1    0    0 |  2     ← eq'n 1
     1    2    3    0    1    0 |  5     ← eq'n 2
     2    2    1    0    0    1 |  6     ← eq'n 3
    −3   −1   −3    0    0    0 |  0     ← (−cost fct'n)
     ↑    ↑    ↑
    neg. neg. neg.
If we actually carry out the divisions and look at the corresponding entries, we can see that this is the case, since the resulting ratios become

     1    2    2    2    0    0
     5   5/2  5/3   0    5    0
     3    3    6    0    0    6
We chose to pivot on the 1 in the second column (because of ease of hand computation) and obtain the following tableau. Note that the last row is computed by r4 − (−1) r1 = r4 + r1, which means that the entries are

    (−3 + 2), (−1 + 1), (−3 + 1), (0 + 1), (0 + 0), (0 + 0), (0 + 2) .
so that the second tableau is
SECOND TABLEAU

     2    1    1    1    0    0 |  2
    −3    0    1   −2    1    0 |  1
    −2    0   −1   −2    0    1 |  2
    −1    0   −2    1    0    0 |  2
     ↑         ↑
    neg.      neg.               (cost decreased to −2)
Notice that the first and third columns contain negative entries for the coefficients of
the cost function so represent appropriate columns for pivoting. We have circled the
appropriate pivot element. Using 1 in the third column as pivot element, we arrive at
the third tableau:
THIRD TABLEAU

     5    1    0    3   −1    0 |  1
    −3    0    1   −2    1    0 |  1
    −5    0    0   −4    1    1 |  3
    −7    0    0   −3    2    0 |  4
     ↑              ↑
    neg.           neg.          (cost decreased to −4)
Since there are still negative elements in the last row, we pivot again, this time choosing the (1,1) entry as pivot element.
FINAL TABLEAU

     1   1/5    0   3/5  −1/5    0 |  1/5
     0   3/5    1  −1/5   2/5    0 |  8/5
     0    1     0   −1     0     1 |   4
     0   7/5    0   6/5   3/5    0 | 27/5

(all cost-row entries are now non-negative; cost decreased to −27/5)
Looking at the last tableau, we can rewrite the linear system as
      x1 + (1/5) x2 + 0 x3 + (3/5) x4 − (1/5) x5 + 0 x6 = 1/5
    0 x1 + (3/5) x2 +   x3 − (1/5) x4 + (2/5) x5 + 0 x6 = 8/5
    0 x1 +       x2 + 0 x3 −       x4 + 0 x5     +   x6 = 4
which yields the basic solution (setting the variables x2 = x4 = x5 = 0)

    xB = (1/5 , 0 , 8/5 , 0 , 0 , 4)⊤ ,

which is the optimal solution.
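For readers following the code sketches, the simplex_phase2 routine given after the boxed Phase II steps reproduces this result when applied to the standard-form data of Example 3.1:

```python
import numpy as np

# Example 3.1 in standard form: minimize -3x1 - x2 - 3x3 with slacks x4, x5, x6.
A0 = np.array([[2.0, 1.0, 1.0, 1.0, 0.0, 0.0],
               [1.0, 2.0, 3.0, 0.0, 1.0, 0.0],
               [2.0, 2.0, 1.0, 0.0, 0.0, 1.0]])
b0 = np.array([2.0, 5.0, 6.0])
c0 = np.array([-3.0, -1.0, -3.0, 0.0, 0.0, 0.0])

x, basis = simplex_phase2(A0, b0, c0, basis=[3, 4, 5])
print(x, c0 @ x)    # x = (1/5, 0, 8/5, 0, 0, 4), value -27/5, matching x_B above
```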
4 Extreme Points and Basic Solutions
In Linear Programming, the feasible region in Rn is defined by P := {x ∈ Rn | Ax = b, x ≥
0}. The set P , as we have seen, is a convex subset of Rn . It is called a convex polytope.
The term convex polyhedron refers to a convex polytope which is bounded. Polytopes in
two dimensions are often called polygons. Recall that the vertices of a convex polytope
are defined as the extreme points of that set.
Extreme points of a convex set are those which cannot be represented as a proper convex combination of two other (distinct) points of the convex set. It may or may not be the case that a convex set has any extreme points, as shown by the example in R2 of the strip S := {(x, y) ∈ R2 | 0 ≤ x ≤ 1, y ∈ R}. On the other hand, the square defined by the inequalities |x| ≤ 1, |y| ≤ 1 has exactly four extreme points, while the unit disk described by the inequality x² + y² ≤ 1 has infinitely many. These examples raise the question of finding conditions under which a convex set has extreme points. The question in general vector spaces is answered by one of the “big theorems”, the Krein-Milman Theorem. However, as we will see presently, our study of the linear programming problem actually answers this question for convex polytopes without needing to call on that major result.
The algebraic characterization of the vertices of the feasible polytope confirms the observation that we made by following the steps of the Simplex Algorithm in our introductory example. Some of the techniques used in proving the preceding theorem come into play in making this characterization, as we will now discover.
Theorem 4.1
The set of extreme points, E, of the feasible region P is exactly the set, B of all basic
feasible solutions of the linear programming problem.
Proof:
We wish to show that E = B so, as usual, we break the proof into two parts.
Part (a) B ⊂ E.
Suppose that x(b) ∈ B. The index set J(x(b)) ⊂ {1, 2, . . . , n} is defined by j ∈ J(x(b)) if and only if the j-th component of x(b) is positive. Now suppose that x(b) ∉ E. Then there exist two distinct feasible points y, z ∈ P and a λ ∈ (0, 1) for which x(b) = (1 − λ) y + λ z.
Observe that for any index k ∉ J(x(b)), it must be true that (1 − λ) yk + λ zk = 0. Since 0 < λ < 1 and y, z ≥ 0, this implies that, for all such indices k, yk = zk = 0.
Now x(b) is basic, so that the columns of A corresponding to the non-zero components form a linearly independent set in Rm. Since the only non-zero components of y and z have the same indices as the non-zero components of x(b), all three of x(b), y and z lie in the set {x ∈ Rn | xi = 0, i ∉ J(x(b))}. Moreover, since y and z are feasible, we have Ay = b and Az = b, so that

    b = (1 − λ) b + λ b = (1 − λ) Ay + λ Az.

Now the system Ax = b is uniquely solvable on the set {x ∈ Rn | xi = 0, i ∉ J(x(b))}, so that we must have x(b) = y = z. Hence x(b) cannot be written as a proper convex combination of two other distinct points of P, which means that x(b) is an extreme point of P.
Part (b) E ⊂ B
If x(e) is an extreme point of P, then it has a minimal number of non-zero components. Indeed, suppose the number of non-zero components were not minimal; then there is a feasible solution y, i.e., y ≥ 0, Ay = b, with fewer non-zero components, and we may, without loss of generality, assume that y has a minimal number of such components. Let J(y) be the index set for y and let k ∈ J(y) be an index for which

    x(e)_k / y_k  =  min { x(e)_i / y_i : i ∈ J(y) }  =:  λ̃  >  0 .

Then x(e) − λ̃ y ≥ 0, and

    A(x(e) − λ̃ y) = (1 − λ̃) b.                                      (4.19)
We now consider two cases:
1. If λ̃ ≥ 1, write λ̃ = 1 + δ with δ ≥ 0. It follows that equation (4.19) can be rewritten as

       (1 − λ̃) b = −δ b = A(x(e) − y − δ y) = A(x(e) − y) − δ A(y) = A(x(e) − y) − δ b ,

   so that A(x(e) − y) = 0. Thus, since x(e) ≠ y, the vector z := x(e) − y satisfies z ≥ 0 (because x(e)_i / y_i ≥ λ̃ ≥ 1 for i ∈ J(y)), z ≠ 0 (recall y has fewer non-zero components than x(e)), and A z = 0. Hence we can write x(e) as

       x(e) = (1/2) ( y + (2/3) z ) + (1/2) ( y + (4/3) z ) =: (1/2) y1 + (1/2) y2 ,    y1 ≠ y2 ≠ x(e) .

   This being the case, we have A y1 = A( y + (2/3) z ) = A y, and likewise A y2 = A y, so that A y1 = A y2 = b and y1, y2 ≥ 0. So x(e) is a proper convex combination of two points of P and is therefore not an extreme point of P, a contradiction.

2. If λ̃ < 1, then the vector z := (x(e) − λ̃ y)/(1 − λ̃) satisfies A z = b and z ≥ 0. Furthermore, x(e) ≠ z (again since y has fewer non-zero components than x(e)), so that x(e) = (1 − λ̃) z + λ̃ y, and therefore x(e) cannot be an extreme point of P, again a contradiction.
From these two cases, we see that x(e) must have a minimal number of non-zero
components.
It remains to show that the number of non-zero components of x(e) is at most m. Indeed, if there were more than m such components, then the corresponding columns of A would be linearly dependent. Then there would exist a y ≠ 0 with fewer non-zero components such that J(y) ⊂ J(x(e)) and A y = 0, and consequently also y ≠ x(e). By choosing

    λ = min { −x(e)_i / y_i : y_i < 0 }   if y has a negative component,
    λ = − min { x(e)_i / y_i : i ∈ J(y) }   if y ≥ 0,

the vector x(e) + λ y would be a feasible solution, i.e., x(e) + λ y ≥ 0, with J(x(e) + λ y) ⊊ J(x(e)). This is a contradiction. Hence J(x(e)) can contain at most m indices, and hence x(e) is a basic feasible solution.
□
As a simple corollary of this last theorem, we can see that there are at most a finite
number of vertices that must be checked by the Simplex Algorithm.
Corollary 4.2
The number of vertices of P is at most Cnm = n! / (m! (n − m)!).
Proof: Cnm is the number of choices of m columns out of n, so the largest number of bases is Cnm, not all of which may be feasible. □
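Corollary 4.2 also suggests a (very inefficient) brute-force alternative to the simplex method: try every choice of m columns, solve for the corresponding basic solution, and keep the feasible ones. A minimal sketch, assuming NumPy, applied to Example 3.1:

```python
import itertools
import numpy as np

def basic_feasible_solutions(A, b, tol=1e-9):
    """Enumerate the basic feasible solutions of Ax = b, x >= 0 by trying
    every choice of m columns (at most C(n, m) of them, cf. Corollary 4.2)."""
    m, n = A.shape
    out = []
    for J in itertools.combinations(range(n), m):
        B = A[:, J]
        if abs(np.linalg.det(B)) < tol:
            continue                        # columns not linearly independent
        xJ = np.linalg.solve(B, b)
        if np.all(xJ >= -tol):              # feasible: all components non-negative
            x = np.zeros(n)
            x[list(J)] = xJ
            out.append(x)
    return out

# Example 3.1 in standard form: its basic feasible solutions are the vertices.
A0 = np.array([[2.0, 1.0, 1.0, 1.0, 0.0, 0.0],
               [1.0, 2.0, 3.0, 0.0, 1.0, 0.0],
               [2.0, 2.0, 1.0, 0.0, 0.0, 1.0]])
b0 = np.array([2.0, 5.0, 6.0])
c0 = np.array([-3.0, -1.0, -3.0, 0.0, 0.0, 0.0])

verts = basic_feasible_solutions(A0, b0)
best = min(verts, key=lambda x: c0 @ x)     # search the vertices, as in Theorem 2.3
print(len(verts), best)                     # best is again (1/5, 0, 8/5, 0, 0, 4)
```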
The final piece of the puzzle is stated in the final result of this subsection that shows that
if there is an optimal solution, it occurs at an extreme point of P .
Corollary 4.3
The optimum of the linear form c⊤ x on a convex polyhedron, P , is attained at at least
one vertex of P . If it is attained at more than one vertex, then it is attained at every
point which is a convex combination of the two vertices.
Proof:
Let x(ei), i = 1, 2, . . . , k, be the extreme points of P. Set v∗ = min { c⊤x(ei) : i = 1, . . . , k }. We show that v∗ is the minimum value of the cost on P.
Using the first corollary, we see that every x ∈ P can be written as a convex combination of the x(ei), and so if

    x = ∑_{i=1}^{k} λi x(ei) ,   with λi ≥ 0,  ∑_{i=1}^{k} λi = 1,

then c⊤x = ∑_{i=1}^{k} λi c⊤x(ei), by linearity of the dot product in Rn. Hence

    c⊤x ≥ v∗ ∑_{i=1}^{k} λi = v∗ .

Therefore v∗ is the minimum of c⊤x on P, and it is attained at at least one vertex. The second part of the corollary follows directly from the linearity of the cost function. □