Linear Programming
SOR 1320
Lecture Notes, 2018/2019
Department of Statistics & Operations Research
Faculty of Science
University of Malta
Organization

Subject: 4 ECTS; 1 hour per week for 28 hours + tutorials; course held during semesters 1 and 2
Method of assessment: Examination
Prerequisites: Advanced level in Mathematics

Lecturer: Dr. Maria Kontorinaki
Email: maria.kontorinaki@um.edu.mt
Room: 508

Authors: Dr. Mark Anthony Caruana, Dr. Natalie Attard
Contents
1. Introduction to Linear Programming
2. Review of Linear Algebra, Convex and Polyhedral Sets
3. Graphical Solution to Linear Programming Problems with two variables
4. Simplex Method to solve Linear Programming Problems with two or more variables
5. Duality Theory
6. Network Problems
Suggested Texts
1. Bazaraa, M.S., Jarvis, J.J. and Sherali, H.D. (1990) Linear Programming and Network Flows, Wiley
2. Dantzig, G.B. and Thapa, M.N. (1997) Linear Programming 1: Introduction, Springer
3. Dantzig, G.B. and Thapa, M.N. (2003) Linear Programming 2: Theory and Extensions, Springer
4. Luenberger, D.G. (1984) Linear and Non-linear Programming, Addison-Wesley
5. Walker, R.C. (1999) Introduction to Mathematical Programming, Prentice Hall Inc.
6. Williams, H.P. (1990) Model Building in Mathematical Programming, Wiley
7. Bazaraa, M.S., Sherali, H.D. and Shetty, C.M. (1993) Non-linear Programming: Theory, Algorithms and Applications, Prentice Hall Inc.
8. Wolsey, L.A. (1998) Integer Programming, Wiley
9. Sierksma, G. (1996) Linear and Integer Programming, Marcel Dekker, Inc.
10. Taha, H.A. (1997) Operations Research: An Introduction, Prentice Hall Inc.
11. Winston, W.L. (1994) Operations Research: Applications and Algorithms, Duxbury Press
Chapter 1
Introduction to Linear Programming
A Linear Program is a mathematical program in which a linear function is
maximized (or minimized) subject to a set of linear constraints. This problem
class is broad enough to cover many interesting applications, while at the
same time remaining tractable for large-scale problems.
1.1 Overview of the history of Linear Programming
• Linear Programming was developed in the 1940s to solve complex planning problems in wartime operations.
• The development of LP spread widely in the postwar period, as many industries found valuable uses for it.
• The father of the subject is George B. Dantzig, who introduced the simplex method to solve Linear Programming problems in 1947.
• In the same year, John von Neumann established the theory of duality.
• The mathematician Leonid Kantorovich and the economist Tjalling Koopmans were awarded the Nobel Prize in Economics in 1975 for their contribution to the theory of optimal allocation of resources, in which Linear Programming played a crucial part.
• Nowadays, many industries use linear programming as a standard tool to find an optimal allocation of resources. Other important applications of Linear Programming include airline crew scheduling, shipping and telecommunication networks, oil refining and blending, and stock and bond portfolio selection.
1.2 Examples of Linear Programming Problems
Example 1.1 Product Mix Problem:
An engineering factory produces electronics, electrical appliances, toys and
die-casting components using four production processes: assembling, grinding,
drilling and testing. Each electronic yields a profit of €200 per unit,
electrical appliances a profit of €300 per unit, while toys and die-casting
components produce a profit of €100 and €150 per unit respectively. Each unit
of the above products requires a certain time on each process, as shown in the
following table:

                   Electronics   Electrical Appliances   Toys   Die-Castings
Assembling (hrs)        3                 2.1             1.5        1.5
Grinding (hrs)         4.3                1.2              3          6
Drilling (hrs)          5                  3               2          2
Testing (hrs)           7                  6               6          5

The assembly machine works 40 hours per week, the grinding and drilling
machines 50 hours per week each, while the testing machine works for 20 hours
per week. The problem is to determine the weekly production plan that
maximizes the total profit.
Example 1.2 Diet Problem:
A hospital dietician must prepare breakfast menus every morning for the
patients. The dietician's responsibility is to make sure that the minimum
daily requirements for vitamins A and B are met; however, the menus must be
kept at a minimum cost to avoid waste. The breakfast items which supply
vitamins A and B are eggs, bacon and cereal bars. One egg contains 2mg of
Vitamin A and 3mg of Vitamin B, each bacon strip contains 4mg of Vitamin A
and 2mg of Vitamin B, and one cereal bar contains 1mg of both Vitamin A and
Vitamin B. The minimum daily requirements of vitamins A and B are 16mg and
12mg respectively. Each egg costs 4 cents, one bacon strip costs 3 cents,
whereas the cost of one cereal bar is 20 cents. The dietician wants to
determine how much of each item to serve in order to meet the minimum daily
requirements of vitamins A and B at a minimum cost.
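This problem translates directly into an LP that can be solved numerically. The following is a minimal Python sketch using SciPy's linprog (assuming SciPy is available); since linprog accepts only ≤ rows, each ≥ constraint is multiplied by −1:

from scipy.optimize import linprog

# Diet problem: minimize 4x1 + 3x2 + 20x3 (cents), where x1, x2, x3 are the
# numbers of eggs, bacon strips and cereal bars served.
c = [4, 3, 20]
A_ub = [[-2, -4, -1],   # vitamin A: 2x1 + 4x2 + x3 >= 16, negated to <= form
        [-3, -2, -1]]   # vitamin B: 3x1 + 2x2 + x3 >= 12, negated to <= form
b_ub = [-16, -12]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, res.fun)   # optimal servings and the minimum cost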
Example 1.3: Investment Problem
An investor wants to invest exactly €10000 in two types of funds: the
Accumulator Fund and the Vilhena Fund. The Accumulator Fund pays a dividend of
8% while the Vilhena Fund pays a dividend of 5%. The investor is advised to
invest no more than €3000 in the Vilhena Fund. In addition, the amount
invested in the Accumulator Fund must be at least twice the amount invested in
the Vilhena Fund. How much should the investor invest in each fund in order to
maximize his return?
1.3 Steps involved in building a good linear programming model

To solve Optimization Problems, in particular Linear Programming (LP)
problems, one must follow these steps:

i. Define the goal of the study: clarify the objective of the study and the decisions to be taken
ii. Construct the model: develop an appropriate mathematical description of the problem
iii. Solve the model constructed in step ii
iv. Interpret the results obtained
v. Perform sensitivity analysis: examine how the solution varies with some variation in the model
1.4 Formulation of an LP model

An LP model is made up of the following basic components:

i. Decision Variables: the unknown quantities (here, the allocation of resources) whose optimal values we need to determine. In the product mix problem, for example, the decision variables are the amounts of electronics, electrical appliances, toys and die-castings that we need to produce to maximize the profit of the company.

ii. Parameters: these are exact or approximate values which are known to the analyst. In the product mix problem the parameters are
   a. the profit per unit of each product
   b. the time per unit of each product required on each of the four processes
   c. the availability of the resources

iii. Constraints: these are the restrictions which limit the values of the decision variables. For example, in the product mix problem the total assembly hours per week must not exceed 40 hours.

iv. Objective function: a linear function of the decision variables which may represent profit/contribution or cost. Thus, the optimal decision variables are those which maximize profit or minimize cost, i.e. which maximize or minimize the objective function. In the product mix problem we need to maximize the profit of the company made from the four products.
LP formulation

min/max cTx
subject to
Ax • b
x ≥ 0

where • ∈ {≤, ≥, =} or a mixture of these (row by row).
Example 1.1 continued: Product Mix Problem

Identifying the decision variables:
x1 : amount of electronics to produce
x2 : amount of electrical appliances to produce
x3 : amount of toys to produce
x4 : amount of die-castings to produce

Identifying the objective function (maximize total profit):
max 200x1 + 300x2 + 100x3 + 150x4

Identifying the constraints:
i. Assembling hours: 3x1 + 2.1x2 + 1.5x3 + 1.5x4 ≤ 40
ii. Grinding hours: 4.3x1 + 1.2x2 + 3x3 + 6x4 ≤ 50
iii. Drilling hours: 5x1 + 3x2 + 2x3 + 2x4 ≤ 50
iv. Testing hours: 7x1 + 6x2 + 6x3 + 5x4 ≤ 20
v. x1, x2, x3 and x4 must be non-negative (we cannot produce negative amounts)
LP model:

max 200x1 + 300x2 + 100x3 + 150x4
subject to
3x1 + 2.1x2 + 1.5x3 + 1.5x4 ≤ 40
4.3x1 + 1.2x2 + 3x3 + 6x4 ≤ 50
5x1 + 3x2 + 2x3 + 2x4 ≤ 50
7x1 + 6x2 + 6x3 + 5x4 ≤ 20
x1, x2, x3, x4 ≥ 0
Note that there is no requirement to produce only integer amounts (fractions
may for example represent partially finished products), and also note that any
combination of the four products is acceptable, including producing just one
product. If these facts were not true, there would be other constraints.
Problems with integer requirements are solved by special methods called
Integer Linear Programming; however, problems of this type are beyond the
scope of this credit.
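As a quick numerical check of this model, it can be handed to an off-the-shelf solver. Below is a minimal Python sketch using SciPy's linprog (assuming SciPy is available); linprog minimizes by convention, so the profit coefficients are negated:

from scipy.optimize import linprog

c = [-200, -300, -100, -150]    # negated unit profits (maximize -> minimize)
A_ub = [[3, 2.1, 1.5, 1.5],     # assembling hours, <= 40
        [4.3, 1.2, 3, 6],       # grinding hours,   <= 50
        [5, 3, 2, 2],           # drilling hours,   <= 50
        [7, 6, 6, 5]]           # testing hours,    <= 20
b_ub = [40, 50, 50, 20]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print(res.x, -res.fun)          # optimal weekly plan and maximum total profit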
Chapter 2
Review of Linear Algebra, Convex and Polyhedral Sets
The aim of this chapter is to give some definitions and theoretical results
related to Linear Programming. In particular we shall review results from
• vector and matrix algebra
• convex analysis
These results will motivate the methods used for solving LP problems.
2.1 Vectors and Matrices
Definition 2.1: An n-vector (or a vector of dimension n) is an array (either a
row or a column) of n numbers. Vectors shall be represented with bold letters:

    ⎛ x1 ⎞
x = ⎜ x2 ⎟   (column vector),      xT = (x1, x2, …, xn)   (row vector)
    ⎜ ⋮  ⎟
    ⎝ xn ⎠

where xT denotes the transpose of the vector x.

Definition 2.2: The Euclidean space ℝⁿ is the collection of all n-vectors.
Definition 2.3: An m × n matrix A is a rectangular array of mn numbers, where
m represents the number of rows and n the number of columns:

    ⎛ a11  a12  …  a1n ⎞
A = ⎜ a21  a22  …  a2n ⎟
    ⎜  ⋮    ⋮        ⋮ ⎟
    ⎝ am1  am2  …  amn ⎠

If m = n then the matrix is said to be a square matrix of order n. Also, the
entries aii for 1 ≤ i ≤ n form the main diagonal of the matrix A.
2.2 Properties of Matrices
Let α and β be scalars and A, B and C matrices (of compatible dimensions), then:

i. A + B = B + A
ii. A + (B + C) = (A + B) + C
iii. (AB)C = A(BC)
iv. A(B + C) = AB + AC
v. (A + B)C = AC + BC
vi. α(βA) = (αβ)A
vii. (α + β)A = αA + βA
viii. α(A + B) = αA + αB
ix. A(αB) = α(AB)
x. (AT)T = A
xi. (A + B)T = AT + BT
xii. (AB)T = BTAT
xiii. (αA)T = αAT

Note: in general, AB ≠ BA.
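These identities are easy to verify numerically on random matrices; a small Python sketch (assuming NumPy is available) checking property xii and the non-commutativity note:

import numpy as np

rng = np.random.default_rng(0)
A, B = rng.random((3, 3)), rng.random((3, 3))
print(np.allclose((A @ B).T, B.T @ A.T))   # property xii: (AB)^T = B^T A^T
print(np.allclose(A @ B, B @ A))           # False in general: AB != BA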
2.3 Matrix Row Operations

An elementary row operation on an m × n matrix A consists of ONE of the
following operations:
• interchanging two rows of A
• multiplying a row of A by a non-zero constant
• adding a multiple of one row of A to another row of A
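For a matrix stored as a NumPy array, the three elementary row operations can be performed as follows (a minimal Python sketch, assuming NumPy is available):

import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
A[[0, 2]] = A[[2, 0]]       # interchange rows 1 and 3
A[1] = 2.0 * A[1]           # multiply row 2 by the non-zero constant 2
A[2] = A[2] + 3.0 * A[0]    # add 3 times row 1 to row 3
print(A)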
2.4 Systems of Linear Equations

A system of m linear equations in n unknowns

a11x1 + a12x2 + … + a1nxn = b1
a21x1 + a22x2 + … + a2nxn = b2
⋮
am1x1 + am2x2 + … + amnxn = bm
can be represented in matrix form as shown below:
⎛ a11  a12  …  a1n ⎞ ⎛ x1 ⎞   ⎛ b1 ⎞
⎜ a21  a22  …  a2n ⎟ ⎜ x2 ⎟   ⎜ b2 ⎟
⎜  ⋮    ⋮        ⋮ ⎟ ⎜ ⋮  ⎟ = ⎜ ⋮  ⎟      ⇔      Ax = b
⎝ am1  am2  …  amn ⎠ ⎝ xn ⎠   ⎝ bm ⎠

where

    ⎛ a11  a12  …  a1n ⎞         ⎛ x1 ⎞         ⎛ b1 ⎞
A = ⎜ a21  a22  …  a2n ⎟ ,   x = ⎜ x2 ⎟ ,   b = ⎜ b2 ⎟
    ⎜  ⋮    ⋮        ⋮ ⎟         ⎜ ⋮  ⎟         ⎜ ⋮  ⎟
    ⎝ am1  am2  …  amn ⎠         ⎝ xn ⎠         ⎝ bm ⎠
This also holds for ≤ and ≥ inequalities : Ax ≤ b and Ax ≥ b .
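When A is square and nonsingular, such a system can be solved directly; a minimal Python sketch (assuming NumPy is available):

import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
b = np.array([4.0, 1.0])
x = np.linalg.solve(A, b)   # solves Ax = b for square nonsingular A
print(x)                    # [2. 1.]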
2.5 Inverse of a Square Matrix
Definition 2.4: An n × n matrix A-1 is said to be the inverse of an n × n matrix
A if A-1 A = In where In is the n × n identity matrix. If A-1 exists, then A is said
to be nonsingular or invertible, otherwise it is said to be singular or
noninvertible.
Properties
i. If A is invertible, then A-1 is invertible and (A-1)-1 = A
ii. If A and B are invertible, then AB is invertible and (AB)-1 = B-1A-1
iii. If A is invertible, then AT is invertible and (AT)-1 = (A-1)T
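Properties ii and iii can be checked numerically; a small Python sketch (assuming NumPy; random matrices are almost surely invertible):

import numpy as np

rng = np.random.default_rng(1)
A, B = rng.random((3, 3)), rng.random((3, 3))
print(np.allclose(np.linalg.inv(A @ B), np.linalg.inv(B) @ np.linalg.inv(A)))  # ii
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))                     # iii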
2.6 Linear Independence, Basis and Spanning Set
Definition 2.5: A vector b in ℝⁿ is a linear combination of vectors a1, a2, …,
ak in ℝⁿ if

b = λ1a1 + λ2a2 + … + λkak

where λ1, λ2, …, λk are real numbers. If λj ≥ 0 for j = 1, 2, …, k, then b is
a non-negative linear combination of a1, a2, …, ak.

If the coefficients λj ∈ ℝ are restricted to satisfy

λ1 + λ2 + … + λk = 1

then b is an affine combination of a1, a2, …, ak. Furthermore, if the
coefficients λj are also restricted to be nonnegative then b is a convex
combination of a1, a2, …, ak, and if all coefficients λj are strictly positive
then b is a strict convex combination of a1, a2, …, ak.
Definition 2.6: A collection of vectors a1, a2, …, ak in ℝⁿ is called linearly
independent if

λ1a1 + λ2a2 + … + λkak = 0

implies that λj = 0 for j = 1, 2, …, k. Otherwise the vectors a1, a2, …, ak
are linearly dependent.

The vectors a1, a2, …, ak are affinely independent if a2 − a1, …, ak − a1 are
linearly independent.
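Linear independence can be tested numerically through the rank of the matrix whose columns are the given vectors; a minimal Python sketch (assuming NumPy is available):

import numpy as np

a1 = np.array([1.0, 0.0, 0.0])
a2 = np.array([0.0, 1.0, 0.0])
a3 = np.array([1.0, 1.0, 0.0])
M = np.column_stack([a1, a2, a3])
# Independent iff the rank equals the number of vectors.
print(np.linalg.matrix_rank(M) == M.shape[1])   # False here, since a3 = a1 + a2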
Definition 2.7: A collection of vectors a1, a2, …, ak in ℝⁿ is said to span ℝⁿ
if any vector in ℝⁿ can be represented as a linear combination of a1, a2, …,
ak. A collection of vectors a1, a2, …, ak in ℝⁿ forms a basis of ℝⁿ if they
span ℝⁿ and, if any of the vectors is removed, the remaining collection does
not span ℝⁿ. It can be shown that for a basis:
1) k = n
2) a1, a2, …, ak are linearly independent.
Definition 2.8: A set S in ℝⁿ is called a convex set if for any two points x1
and x2 in S the vector λx1 + (1−λ)x2 ∈ S for all λ ∈ [0,1]. In other words,
the line segment connecting x1 and x2 is part of S.

Note that λx1 + (1−λ)x2 for 0 ≤ λ ≤ 1 is a convex combination (weighted
average) of x1 and x2. For 0 < λ < 1 it is a strict convex combination (the
endpoints of the segment are excluded).
Lemma 2.9: Let S1 and S2 be convex sets in ℝⁿ. Then:
1) S1 ∩ S2 is convex.
2) S1 + S2 = {x1 + x2 : x1 ∈ S1, x2 ∈ S2} is convex.
3) S1 − S2 = {x1 − x2 : x1 ∈ S1, x2 ∈ S2} is convex.

Proof of 1): Any two points of S1 ∩ S2 are both in S1 and in S2. So the line
segment that connects them is also both in S1 and in S2, because these sets
are convex. This completes the proof. Also note that S1 ∩ S2 ∩ S3 =
(S1 ∩ S2) ∩ S3, so the intersection of any number of convex sets is also a
convex set. This result is very important because feasible sets of
mathematical programs generally are intersections of sets that satisfy the
individual constraints.

Proof of 2) and 3): Exercise.
Definition 2.10: Let S be any set in ℝⁿ. The convex hull of S, denoted
conv(S), is the collection of all convex combinations of points of S. In other
words, x ∈ conv(S) iff x can be expressed as

x = λ1x1 + λ2x2 + … + λkxk ,   λ1 + λ2 + … + λk = 1 ,   λj ≥ 0 for j = 1, …, k

where k is a positive integer and x1, x2, …, xk ∈ S.
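For a finite set of points, conv(S) can be computed numerically; a minimal Python sketch using scipy.spatial.ConvexHull (assuming SciPy is available):

import numpy as np
from scipy.spatial import ConvexHull

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
hull = ConvexHull(pts)
print(hull.vertices)   # indices of the corner points; the interior point is dropped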
Definition 2.11: The convex hull of a finite number of points x1, x2, …, xk+1
in ℝⁿ is called a polytope. If the points x1, x2, …, xk+1 are affinely
independent, then the convex hull conv(x1, x2, …, xk+1) is called a simplex
with vertices x1, x2, …, xk+1.

In ℝⁿ the maximum number of linearly independent vectors is n, so a simplex
cannot have more than n+1 vertices. Later we shall see that the simplex method
used to find optimal solutions of linear problems is basically a movement
along the edges of a simplex, which explains its name. Next, the so-called
Carathéodory Theorem shows that any point in the convex hull of a set S in ℝⁿ
can be represented as a convex combination of at most n+1 points of S. For a
simplex this means that all points of the simplex can be represented as a
convex combination of its corners.
Theorem 2.12: Let S be any set in ℝⁿ. If x ∈ conv(S) then x ∈ conv(x1, x2, …,
xn+1), where xj ∈ S for j = 1, …, n+1. In other words, x can be represented as

x = λ1x1 + λ2x2 + … + λn+1xn+1

where λ1 + λ2 + … + λn+1 = 1, λj ≥ 0 and xj ∈ S for j = 1, …, n+1.
Proof: The theorem is trivially true for x ∈ S (explain why). Since
x ∈ conv(S) then, by definition of a convex hull, x can be expressed as

x = λ1x1 + λ2x2 + … + λkxk ,   λ1 + λ2 + … + λk = 1 ,   λj ≥ 0 and xj ∈ S for j = 1, …, k.

If k ≤ n+1 the theorem is proved. Next we shall show that if k > n+1, it is
possible to eliminate one term with λi > 0. Because k ≥ n+2, the vectors
x2 − x1, x3 − x1, …, xk − x1 are linearly dependent, and so there exist
scalars µ2, …, µk, not all zero, such that

µ2(x2 − x1) + µ3(x3 − x1) + … + µk(xk − x1) = 0

The sum can be expressed as

µ2x2 + … + µkxk − (µ2 + … + µk)x1 = µ1x1 + µ2x2 + … + µkxk = 0

where µ1 = −(µ2 + … + µk), or equivalently µ1 + µ2 + … + µk = 0.

Now for any real number α, x can be represented as follows:

x = ∑λjxj + 0 = ∑λjxj − α∑µjxj = ∑(λj − αµj)xj      (sums over j = 1, …, k)

Now we choose α in such a way that one of the coefficients in the above sum
becomes zero. Note that x is represented as a convex combination, so the
coefficients must remain nonnegative. That's why we choose α as follows:

α = min { λj/µj : µj > 0 , 1 ≤ j ≤ k } = λi/µi      for some i ∈ {1, …, k}

Note that α ≥ 0, and also note that for µj ≤ 0 the coefficient λj − αµj ≥ 0.
For µj > 0 we have λj/µj ≥ λi/µi = α, and so λj ≥ αµj, or λj − αµj ≥ 0.

Now x is represented as

x = ∑(λj − αµj)xj      (sum over j = 1, …, k, j ≠ i)

Moreover,

∑(λj − αµj) = ∑λj − α∑µj = ∑λj = 1

and λj − αµj ≥ 0 for j = 1, …, k. In other words, x is represented as a convex
combination of at most k−1 points of S. This process can continue until we
reach at most n+1 points, which completes the proof.
Definition 2.13: A hyperplane H in ℝⁿ is a set of vectors {x : pTx = k} where
k is a scalar and p is the normal or gradient vector of H.

A hyperplane can be expressed by eliminating k. Let pTx0 = k for a certain
vector x0 on H. Then pTx = pTx0, or pT(x − x0) = 0. So H = {x : pT(x − x0) = 0}.
The vector p is orthogonal to all vectors x − x0 for x ∈ H, and so it is
perpendicular to the surface of the hyperplane H, which explains its name. A
hyperplane is a convex set (prove it).
Definition 2.14: A hyperplane divides ℝⁿ into two halfspaces. A closed
halfspace is a set of vectors {x : pTx ≥ k} or {x : pTx ≤ k}. The union of
these two sets is ℝⁿ; their intersection is the hyperplane. Open halfspaces
are defined by strict inequalities.

Halfspaces can also be expressed by eliminating k. Let pTx0 = k for a certain
vector x0 on the defining hyperplane H. Then a halfspace is a set
{x : pT(x − x0) ≤ 0} or {x : pT(x − x0) ≥ 0}. A halfspace is a convex set
(prove it).
Definition 2.15: A polyhedral set or polyhedron is the intersection of a
finite number m of closed halfspaces. Each halfspace can be represented by an
inequality pjTx ≤ bj, where bj is a scalar and pj is its normal vector.

So a polyhedron is a set

{ x : Ax ≤ b }

where A is an m × n matrix and b is an m-vector; A = (a1 a2 … an) with columns
aj, bT = (b1 b2 … bm), and pjT is the j-th row of A. A polyhedron is a convex
set (prove it). Note that ≥ inequalities can be converted to ≤ by multiplying
both sides by −1, and an equality can be expressed as two inequalities. Also,
nonnegativity conditions can be expressed in terms of halfspaces. So there are
several alternative ways to express a polyhedron, for example:

{ x : Ax ≥ b } , { x : Ax ≤ b, x ≥ 0 } , { x : Ax = b, x ≥ 0 }
Definition 2.16: A polyhedral cone is a polyhedral set whose defining
hyperplanes all pass through the origin. So it is a set

{ x : Ax ≤ 0 }

The above set definition is obtained by expressing the halfspaces in the form
pT(x − x0) ≤ 0, where x0 = 0.
The following theorem, known as Farkas' Theorem, is very important in the
derivation of optimality conditions of linear and non-linear problems.

Theorem 2.17: Let A be an m × n matrix and c an n-vector. Then exactly ONE of
the following two systems has a solution:

System 1: Ax ≤ 0 and cTx > 0 for some x in ℝⁿ
System 2: ATy = c and y ≥ 0 for some y in ℝᵐ
Assuming that aj, j = 1, …, m, are the columns of the matrix AT (i.e. the rows
of A), the vector c of System 2 is their nonnegative linear combination. In
System 1 the same vectors have the role of normal vectors of the hyperplanes
that define the halfspaces whose intersection is the closed convex cone
{x : Ax ≤ 0} of System 1. System 1 has a solution if this cone has a nonempty
intersection with the open halfspace {x : cTx > 0}. But in that case c cannot
be a nonnegative linear combination of the vectors aj.
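The theorem can be illustrated numerically: testing whether System 2 has a solution is itself a linear program with a zero objective, and if it is infeasible the theorem guarantees that System 1 is solvable. A minimal Python sketch assuming SciPy (the matrix A and vector c below are arbitrary illustrative data):

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0],
              [0.0, 1.0]])      # an m x n matrix
c = np.array([1.0, -1.0])
# System 2: find y >= 0 with A^T y = c (a pure feasibility LP).
res = linprog(np.zeros(A.shape[0]), A_eq=A.T, b_eq=c,
              bounds=[(0, None)] * A.shape[0])
# status 0: System 2 solvable; status 2 (infeasible): System 1 must be solvable.
print(res.status)               # 2 here; e.g. x = (0, -1) solves System 1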
Definition 2.18:
A ray is a set of points of the form {x0 + λd : λ ≥ 0} where x0 is the vertex and
d is the direction of the ray.
Definition 2.19: A direction d of a convex set S is a nonzero vector such that the
ray {x0 + λd, x0 ∈ S, λ ≥ 0} also belongs to the set S. For two directions d1 and
d2 of a convex set their convex combination λd1 + (1-λ)d2 is also a direction. An
extreme direction cannot be represented as a positive linear combination of two
distinct directions of the set. Clearly a bounded convex set has no directions.
Definition 2.20: A point x in a convex set S is called an extreme point of S
if x cannot be represented as a strict convex combination of two distinct
points in S. In other words, if x = λx1 + (1−λ)x2 with λ ∈ (0,1) and
x1, x2 ∈ S, then x = x1 = x2.
The next list is a summary of important facts about extreme points and extreme
directions of convex and polyhedral sets. Some will be proved later in the
context of the simplex method. In the following list we assume that
S = { x : Ax = b, x ≥ 0 }.

• The number of extreme points of S is finite.
• S has at least one extreme point.
• The number of extreme directions of S is finite (possibly zero).
• S has at least one extreme direction iff it is unbounded.
Next, the so-called Representation Theorem gives a way to describe a
polyhedral set by means of its extreme points and extreme directions. This
fact is fundamental to linear programming.

Theorem 2.21: Let S = { x : Ax = b, x ≥ 0 } be a nonempty polyhedral set.
Then the set of extreme points x1, x2, …, xk is nonempty and finite. The set
of extreme directions is empty iff S is bounded. If S is unbounded, then the
set of extreme directions d1, d2, …, dl is nonempty and finite. A point x
belongs to S iff it can be represented as a convex combination of extreme
points plus a nonnegative linear combination of extreme directions:

x = λ1x1 + … + λkxk + µ1d1 + … + µldl ,   where λ1 + … + λk = 1 , λj ≥ 0 , µj ≥ 0

Note: The above theorem also holds for constraint sets with various types of
inequalities (an equality can be replaced by two inequalities, ≥ and ≤).
Chapter 3
Graphical Solution to Linear Programming Problems with two
variables
An LP model consisting of just two decision variables can be solved using the
Graphical Method.
Step 1: Formulate the LP model.

Step 2: Draw axes for variables x1 and x2. The scales may be different, but
both must start at zero and both must be linear.

Step 3: Draw each limitation as a separate line on the graph. The lines define
the Feasible Region (the set of acceptable solutions). If not stated
explicitly, assume that x1 ≥ 0 and x2 ≥ 0.

Step 4: Draw a line (also called an iso-profit line) that represents a certain
value of the objective function. Then draw a parallel line that touches the
feasible region so as to maximize/minimize the value of the objective
function.

Step 5: Compute the exact values of the decision variables in the optimal
corner of the feasible region and the corresponding optimum value of the
objective function. The corner of the feasible region defines the binding
constraints (limiting factors).
Note: There may be more solutions if the iso-profit lines are parallel with a
limitation line.
Example 3.1
A certain manufacturer produces 2 products called A and B. Product A has a
contribution of 4 per unit, product B has a contribution of 5 per unit. To
produce the products the following resources are required:

            Decision    Machine   Labour   Material
            variable    hours     hours    [kg]
Product A      x1          4         3       1
Product B      x2          2         5       1

Resources available per week: 100 machine hours, 150 labour hours and 50
kilograms of material. There are no other limitations.

The manufacturer wants to establish a weekly production plan that maximizes
the total contribution.
Standard linear programming model:

Maximize    4x1 + 5x2
Subject to  4x1 + 2x2 ≤ 100   (machine hours)
            3x1 + 5x2 ≤ 150   (labour hours)
            x1 + x2 ≤ 50      (material)
            x1 ≥ 0 , x2 ≥ 0
Note that there is no requirement to produce only integer amounts (fractions
may for example represent partially finished products), and also note that any
combination of the two products is acceptable, including producing only one of
the two products. If these facts were not true, there would be other
limitations, like x1 ≥ 5 (produce at least 5 units of product A),
x1 − x2 ≤ 3 (production of A must not exceed production of B by more than 3), etc.
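The limitation lines and the feasible region of this model can also be drawn in a few lines of code; a minimal Python sketch assuming NumPy and Matplotlib are available:

import numpy as np
import matplotlib.pyplot as plt

x1 = np.linspace(0, 30, 300)
plt.plot(x1, (100 - 4 * x1) / 2, label='machine: 4x1 + 2x2 = 100')
plt.plot(x1, (150 - 3 * x1) / 5, label='labour: 3x1 + 5x2 = 150')
plt.plot(x1, 50 - x1, label='material: x1 + x2 = 50')
# The feasible region lies below all three lines and above the axes.
upper = np.minimum.reduce([(100 - 4 * x1) / 2, (150 - 3 * x1) / 5, 50 - x1])
plt.fill_between(x1, 0, np.clip(upper, 0, None), alpha=0.3)
# One iso-profit line 4x1 + 5x2 = z, drawn for the trial value z = 100.
plt.plot(x1, (100 - 4 * x1) / 5, '--', label='iso-profit: 4x1 + 5x2 = 100')
plt.xlim(0, 30); plt.ylim(0, 35)
plt.xlabel('x1'); plt.ylabel('x2'); plt.legend(); plt.show()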
3.1 Classification of Models
i) Unboundedness

Unboundedness can apply both to the feasible region and to the objective
value. An unbounded feasible region means that at least one variable is not
limited in value (always assuming nonnegativity). This is typical for
minimization problems. An unbounded objective value means that the objective
value can grow to +∞ (maximization) or −∞ (minimization) respectively.
Clearly a bounded feasible region implies a bounded objective value (obviously
we assume finite objective coefficients). For an unbounded feasible region
there can be either a bounded or an unbounded objective value. So an unbounded
model means a model with unbounded objective value.
ii) Feasibility

Feasibility means whether there is a solution or not. An infeasible model does
not have any feasible solution: no vector exists that would satisfy all
(in)equalities including nonnegativity. A feasible model is a model that has
at least one feasible solution. Often this adjective is used for models that
are both feasible and bounded (i.e. models that have at least one optimal
solution). So there are three types of models:

i. Feasible
ii. Unbounded
iii. Infeasible

Next we shall mostly assume that the model is feasible, because for practical
problems infeasibility and/or unboundedness are usually caused by wrong model
specification.
Chapter 4
Simplex Method to solve general Linear Programming Problems

The simplex method is one of the most famous algorithms of Operations
Research. It was originally described by the Russian mathematician Kantorovich
in 1939, but his work remained unknown internationally until 1959. Meanwhile,
Dantzig discovered the algorithm independently in 1947.

When using the simplex method it is convenient to convert all Linear
Programming problems into the standard form shown below.
Changing Constraint Type
1. Add a slack variable in each ≤ inequality constraint.
Example:
4x1 + 5x2 ≤ 150 →
4x1 + 5x2 + s1 = 150 , s1 ≥ 0
s1 = amount of the resource not used (s1 = 150 - amount used)
Feasible initial point: x1 = 0, x2 = 0, s1 = 150
2. Subtract a surplus variable in each ≥ inequality constraint and add an
artificial variable.

Example:
4x1 + 5x2 ≥ 130  →  4x1 + 5x2 − s1 = 130 , s1 ≥ 0
s1 = excess over the minimum (s1 = actual value − 130)

A feasible initial point cannot be chosen as above (x1 = 0, x2 = 0, s1 = −130)
because s1 has to be nonnegative, and selecting certain nonzero values of x1
and x2 would require testing for feasibility. To find a feasible initial point
quickly, add an artificial variable:

4x1 + 5x2 − s1 = 130  →  4x1 + 5x2 − s1 + a1 = 130 , s1, a1 ≥ 0
a1 = mathematical tool without practical (model) interpretation

Feasible initial point: x1 = 0, x2 = 0, s1 = 0, a1 = 130
3. Add an artificial variable in each equality constraint. This again provides
a feasible initial point quickly.

Example:
4x1 + 5x2 = 130  →  4x1 + 5x2 + a1 = 130 , a1 ≥ 0
a1 = mathematical tool without practical (model) interpretation

Feasible initial point: x1 = 0, x2 = 0, a1 = 130
Using the conversion methods shown above, all Linear Programming problems
can be transformed into the following standard form:
Find a vector x that minimizes (or maximizes) z = cTx, such that

Ax = b   and   x ≥ 0

where

    ⎛ a11  a12  …  a1n ⎞         ⎛ x1 ⎞         ⎛ b1 ⎞
A = ⎜ a21  a22  …  a2n ⎟ ,   x = ⎜ x2 ⎟ ,   b = ⎜ b2 ⎟ ,
    ⎜  ⋮    ⋮        ⋮ ⎟         ⎜ ⋮  ⎟         ⎜ ⋮  ⎟
    ⎝ am1  am2  …  amn ⎠         ⎝ xn ⎠         ⎝ bm ⎠

cT = (c1 c2 … cn) ,   m < n.
The set of equations Ax = b can also be expressed as:
x1a1 + x2a2 + … + xnan = b
where xj are elements of the vector x and aj are columns of the matrix A (1 ≤ j ≤
n ). Next we shall consider a minimization problem. Results for maximization
will mostly differ only in signs (max cTx ≡ min -cTx) and types of inequalities.
Note that the vector x contains the original solution variables together with all
slacks, surpluses, and artificial variables used to convert original constraints
into equalities with trivial initial feasible solution. The m x n matrix A contains
all coefficients and it is also supposed to contain a unity matrix as its part
(usually but not necessarily in the last m columns). b is the m vector of right
hand sides, and c is the n vector of objective coefficients that includes zeros for
slacks/surpluses and possible penalties (+M, -M) for artificial variables - see
later.
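For instance, the all-≤ model of Example 3.1 is brought into this standard form by appending one slack column per constraint; a minimal Python sketch (assuming NumPy is available):

import numpy as np

A_ineq = np.array([[4.0, 2.0], [3.0, 5.0], [1.0, 1.0]])   # Example 3.1 rows
b = np.array([100.0, 150.0, 50.0])
m = A_ineq.shape[0]
A = np.hstack([A_ineq, np.eye(m)])         # slack columns form the unity matrix
c = np.array([-4.0, -5.0, 0.0, 0.0, 0.0])  # max 4x1+5x2 = min -4x1-5x2; slacks cost 0
x0 = np.concatenate([np.zeros(2), b])      # trivial initial BFS: x1 = x2 = 0, slacks = b
print(A, c, x0)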
At this stage it is also convenient to mention the various types of solutions
that an LP problem may have.
Types of solutions (of a feasible LP model)

• A feasible solution satisfies all constraints including nonnegativity.
• An infeasible solution does not satisfy all constraints including nonnegativity.
• A basic solution has m basic variables and n−m zero nonbasic variables.
• A basic nondegenerate solution has m nonzero basic variables and n−m zero nonbasic variables. The short name "basic solution" is often used instead.
• A basic degenerate solution has m basic variables, some of which are zero, and n−m zero nonbasic variables.
• A nonbasic solution has more than m nonzero variables.

From the various possible combinations these are particularly important:

• A basic feasible nondegenerate solution is a feasible solution that has m positive basic variables and n−m zero nonbasic variables.
• A basic feasible degenerate solution has fewer than m positive variables and more than n−m zero variables.
• A basic feasible solution (BFS) covers both of the above cases.
• An optimal solution is a basic feasible solution such that no other feasible solution has a bigger (maximization) or smaller (minimization) objective value respectively. There can be more than one optimal solution.
4.1 Geometry of the Simplex Method

The simplex method is based on the so-called Fundamental Theorem of Linear
Programming, which says that if an LP model has an optimal feasible solution,
then there exists a basic feasible solution that is optimal. The aim of this
section is to state and prove four theorems, which will be used to prove the
Fundamental Theorem of Linear Programming.
Theorem 4.1: The feasible region of a feasible LP problem is a convex set.
Proof: We already know that the feasible region is a polyhedron and so it is
convex. Nevertheless, here is a proof formulated directly for LP problems. Let
x1, x2 be any two feasible solutions to the LP problem. Then all elements of
x1 and x2 are nonnegative and also:

Ax1 = b and Ax2 = b

Let x = λx1 + (1−λ)x2, 0 ≤ λ ≤ 1, be a convex combination of x1 and x2 (a
point of the line segment connecting x1 and x2). Clearly all elements of x are
nonnegative. Then the following holds:

Ax = A(λx1 + (1−λ)x2) = λAx1 + (1−λ)Ax2 = λb + (1−λ)b = b

Since x is also a feasible solution, the feasible region is convex.
Theorem 4.2: If an LP problem has an optimal feasible solution, there must be
an extreme point (corner) of the feasible region that is optimal.
Proof (i): Assumption: the feasible region is bounded.

Let's assume that the feasible region is bounded. Later we shall see that the
theorem is valid also for some cases with unbounded feasible regions. Let xp
be the optimal solution and let {x1, x2, …, xk} be the set of the extreme
points. From the representation theorem we know that this set is finite and
nonempty. Then obviously cTx ≥ cTxp for all points x of the feasible region
(we are minimizing). Now let's express the optimal point xp as a convex
combination of extreme points. Then it holds:

cTxp = cT(λ1x1 + λ2x2 + … + λkxk) = λ1cTx1 + λ2cTx2 + … + λkcTxk ≥
     ≥ λ1cTxq + λ2cTxq + … + λkcTxq = (λ1 + λ2 + … + λk)cTxq = cTxq

where xq is the extreme point with minimum value of the objective function:

cTxq = Min {cTx1, cTx2, …, cTxk}

So we have:

cTxp ≥ cTxq ≥ cTxp

where the second inequality holds because xq is feasible and xp is optimal.
From this it follows that cTxq = cTxp, so the extreme point xq is optimal.

It is possible that the objective function is optimal at more than one extreme
point. Then it is also optimal at any convex combination of these optimal
extreme points. To show this, let's assume that the LP problem has r optimal
extreme points x1, x2, …, xr with a certain objective value cTxp. Let

x = λ1x1 + λ2x2 + … + λrxr ,   λi ≥ 0 ,   λ1 + λ2 + … + λr = 1

be their convex combination. Then:

cTx = cT(λ1x1 + λ2x2 + … + λrxr) = λ1cTx1 + λ2cTx2 + … + λrcTxr
    = (λ1 + λ2 + … + λr) cTxp = cTxp

So the objective value of a convex combination of optimal extreme points is
also optimal.
Proof (ii): Assumption: the feasible region can be either bounded or unbounded.

Here we shall show another way to prove the above theorem, for problems with
both bounded and unbounded feasible regions. Using the representation theorem,
any point of the feasible region can be expressed as

x = λ1x1 + … + λkxk + µ1d1 + … + µldl ,   λ1 + … + λk = 1 ,   λj ≥ 0 ,   µj ≥ 0

where {x1, x2, …, xk} is the nonempty and finite set of extreme points, and
{d1, d2, …, dl} is the nonempty and finite set of extreme directions
(unbounded region); for a bounded feasible region there are no extreme
directions. So we can transform the original problem in the variables x1, x2,
…, xn into a problem in the variables λ1, …, λk, µ1, …, µl:

Minimize    z = λ1(cTx1) + … + λk(cTxk) + µ1(cTd1) + … + µl(cTdl)
Subject to  λ1 + … + λk = 1 ,  λj ≥ 0 , j = 1, 2, …, k ,  µj ≥ 0 , j = 1, 2, …, l

Since each µj is not limited (can be made arbitrarily large), there are two
cases:

1) If cTdj < 0 for some j = 1, 2, …, l, then the minimum is −∞ (the
corresponding µj can be arbitrarily large) and the problem is unbounded (in
objective value).

2) If cTdj ≥ 0 for all j = 1, 2, …, l, then all µj can be chosen as zero, and
we have to minimize the first term only over λ1, …, λk. To do this, we select
the minimum cTxj (say cTxp) for a certain extreme point xp (there may be more
with the same value). Then we let λp = 1 and all other λj equal to zero.
Summary: The optimal solution is finite iff cTdj ≥ 0 for all extreme
directions (for a bounded feasible region there are no extreme directions, so
the optimum is always finite). The optimal (minimum) value of cTxj then occurs
at at least one extreme point. If there is more than one optimal extreme
point, their convex combinations are also optimal (as was shown above).
Theorem 4.3: Let's consider a feasible region S = { x : Ax = b, x ≥ 0 }. If in
the equation

x1a1 + x2a2 + … + xnan = b

there are k (k ≤ m < n) linearly independent vectors ai (the above equation
can always be rearranged in such a way that these vectors are the first k
vectors) and if there is a linear combination with coefficients xi ≥ 0,
1 ≤ i ≤ k, such that

x1a1 + x2a2 + … + xkak = b

then the vector x = (x1 x2 … xk 0 … 0)T is an extreme point of the feasible
region S.
Proof: Let's assume that the vector x is not an extreme point. Then there must
be a strict convex combination x = λv + (1−λ)w, 0 < λ < 1, where v and w are
two distinct feasible solutions. Clearly the last n−k elements of the vectors
v and w must be zero, otherwise the last n−k elements of the vector x would
not be zero (all vectors of the feasible region have all elements
nonnegative). Since v and w are feasible solutions, they both satisfy the
equations Av = b and Aw = b. These two equations can be expressed in this way
(taking only the first k terms):

v1a1 + v2a2 + … + vkak = b
w1a1 + w2a2 + … + wkak = b

These two equations (sets of linear equations), together with the equation

x1a1 + x2a2 + … + xkak = b

all have the same coefficients. Because the columns are linearly independent,
the solutions to all three equations are uniquely defined and all are the
same. So

x1 = v1 = w1 , x2 = v2 = w2 , … , xk = vk = wk

and so

x = v = w

This shows that the vector x cannot be expressed as a strict convex
combination of two distinct feasible vectors. That's why it must be an extreme
point.
Theorem 4.4: Let's consider a feasible region S = { x : Ax = b, x ≥ 0 }. If
the vector x is an extreme point of S, then the vectors ai in the equation

x1a1 + x2a2 + … + xnan = b

that correspond to nonzero elements xi of x are linearly independent.

Proof: Let's assume again that the nonzero elements x1, x2, …, xk are the
first k elements of x. To prove the theorem, let's assume the opposite: that
the vectors a1, a2, …, ak are linearly dependent. This means that there exist
numbers r1, r2, …, rk, at least one of them nonzero, such that

r1a1 + r2a2 + … + rkak = 0

Now let's multiply the above equation by a real number q, and let's first add
and then subtract the product from the equation x1a1 + x2a2 + … + xkak = b. By
this we get:

(x1 + qr1)a1 + (x2 + qr2)a2 + … + (xk + qrk)ak = b
(x1 − qr1)a1 + (x2 − qr2)a2 + … + (xk − qrk)ak = b

It is possible to select a small q such that all coefficients of the vectors
ai remain positive. Then the vectors

x' = (x1 + qr1, x2 + qr2, …, xk + qrk, 0, …, 0)T
x'' = (x1 − qr1, x2 − qr2, …, xk − qrk, 0, …, 0)T

are both feasible solutions, and their sum divided by 2 gives:

x = ½x' + ½x''

This shows that x is a strict convex combination of two distinct feasible
vectors x' and x'', and so it cannot be a corner, which is a contradiction.
That's why the vectors a1, a2, …, ak must be linearly independent.
Theorem 4.5 (Fundamental Theorem of Linear Programming):
If an LP problem has an optimal feasible solution, then there is a basic
feasible solution that is optimal.

Proof: Theorem 4.2 says that the optimum of a feasible LP problem is attained
at an extreme point (or possibly at more extreme points) of the convex
feasible region. To prove the fundamental theorem, we have to prove that an
extreme point has at most m nonzero elements, so that it is a basic solution.
This follows from Theorem 4.4, which says that the nonzero elements of an
extreme point correspond to linearly independent columns of the matrix A. This
matrix has size m × n, so it can have at most m linearly independent columns.
So an extreme point is a basic solution (it has at most m nonzero elements).
This proves the fundamental theorem of linear programming.
We can also show that in fact all basic solutions are extreme points. Each
basic solution can be decomposed into the vector of basic variables xB, which
is a solution to the equation BxB = b (where B is an m × m nonsingular matrix
made of linearly independent columns of the matrix A), and the vector of
nonbasic variables xN = 0. It then follows from Theorem 4.3 that the vector
(xB xN) is an extreme point, so a basic solution is an extreme point.
The fundamental theorem has very important consequences. It says, that to
find an optimum, we can limit ourselves only to the corners (extreme
points) of the feasible region. This is the basic principle of the simplex
method, that starts at some initial (trivial) basic feasible solution - corner
and moves on the edges of the convex polyhedron until it reaches the
optimal corner. Algebraically it means moving from one basic feasible
solution to another one until the optimum is reached (or the problem is
found to be unbounded).
[Figure: extreme points (corners) of the feasible region]
The maximum possible number of basic solutions (corners) of an m × n LP
problem is the number of ways to select m basic variables (or m columns of the
matrix A):

(n choose m) = n! / ((n−m)! m!)
Note that some of the above combinations may not have a solution (they would
generate a singular basis) and some may produce an infeasible solution.
For large LP models this number can be very big, so in the worst case the
simplex method may need an exponential number of steps (it may have to move
through all corners). Fortunately, in practice it performs very well and in
fact it is considered a very fast algorithm. There was a study testing many LP
models with 50 variables; the average number of steps was 2m. Other studies
showed that for most models the optimum was found in fewer than 3m steps.
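For example, a small model with n = 10 variables and m = 4 constraints already has up to 210 candidate bases; in Python:

from math import comb

print(comb(10, 4))   # 210 ways to choose the 4 basic variables out of 10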
4.2 Algebra of the Simplex Method
The simplex method is based on the following steps:
1. finding an initial basic feasible solution
2. testing for optimality
3. if not optimal, finding the most promising improvement direction
4. finding the distance of the move (to reach the appropriate corner)
5. adjusting the model accordingly.
Next paragraphs explain the basic ideas on how to solve each of the above
steps.
1. To find an initial basic feasible solution, we can find for each constraint a
variable whose initial nonzero value is known. So we have m nonzero basic
variables. All other n-m variables will be nonbasic (zero). How to find such
a variable depends on the type of the (in)equality (see beginning of this
chapter – changing constraint type).
a) For an ≤ inequality add a slack variable.
b) For an ≥ inequality first subtract a surplus variable (add negative slack)
and then add an artificial variable.
c) For equality constraints add an artificial variable.
Artificial variables must be forced out of the solution because they are just
a tool to start the algorithm. They must eventually leave the solution
(become nonbasic). There are two methods to do this (they will be covered in
detail later on):

(i) The M-method
(ii) The 2-phase method

Note: In both cases it may happen that an artificial variable remains nonzero
in the optimal solution. This means infeasibility.
2. The most promising improvement can be found from the coefficients of the
objective function. Let's assume that in a certain maximization problem the
initial point has all original solution variables nonbasic (equal to zero).
Improving in the first step means inserting a certain solution variable as a
basic one. That will force one of the current basic variables (slacks) out of
the basic solution - see the next paragraph. The most promising variable has
the maximum positive coefficient in the objective function (the objective
function is linear, so in the first step these coefficients are the partial
derivatives of the objective function with respect to the particular nonbasic
variables). The column that contains the maximum coefficient is called the
pivot column. If there is no positive coefficient, the optimum (maximum) has
been found. Note that the simplex table typically contains the negative values
of the coefficients, so the optimality condition in case of maximization means
no negative values in the coefficients row of the table. For minimization the
optimality condition means no positive values in the coefficients row of the
table.
3. Selecting a nonbasic variable to be inserted as a basic one defines the
direction of the move in the n-dimensional feasible region. The maximum
distance of the move is given by the natural requirement to keep feasibility.
A certain constraint will thus stop the movement, so the corresponding slack
will become zero; that means it will change from a basic variable into a
nonbasic variable. The constraint involved can be found by using the
coefficients of the constraint matrix, namely those in the pivot column, and
the right hand side values. The ratio of the right hand side value to the
pivot column coefficient defines the maximum possible increase of the nonbasic
variable allowed by this constraint. The minimum such ratio is chosen; this is
called the ratio test. The row that contains the minimum value (the pivot row)
thus defines the distance of the move (the ratio) and the basic variable that
becomes nonbasic. The intersection of the pivot row and the pivot column is
called the pivot element.
4. To introduce a nonbasic variable it is necessary to take the amounts of all
resources needed by this variable. This is performed by making zeros in the
pivot column, except for the pivot element, which is made 1. A zero in the
coefficients row represents the fact that the variable is basic. The values in
the coefficients row will change, but they always represent the profit of
introducing one unit of the nonbasic variable into the solution.

Clearly these basic ideas neither justify the simplex method nor define the
algorithm exactly, including how to create the simplex table. The details are
covered by the algebra of the simplex method.
First let's recall the standard form of an LP minimization problem (results for
maximization will mostly differ only in signs and types of inequalities): find
such x to
Minimize
z = cTx
Subject to
Ax = b , x ≥ 0
where x and c are n-vectors, b is an m-vector, A is an m x n matrix (m ≤ n) and z
is the scalar objective value. Note that the set of equations Ax = b can be
expressed as:
x1a1 + x2a2 + … + xnan = b
(1)
where xi are elements of the vector x and ai are columns of the matrix A (1 ≤ i ≤
n ).
Now let's assume that the matrix A has the rank m, so there are m independent
columns. Then A can be re-arranged in this way:
A = (B N), where B is an m x m invertible matrix and N is an m x (n-m) matrix.
B is called the basic matrix (shortly the base) of A, N is called the nonbasic
matrix. Vector x can be decomposed accordingly: xT = (xBT xNT) where xBT =
(x1 x2 ... xm) and xNT = (xm+1 xm+2 ... xn). Then:
Ax = (B N) ⎛ xB ⎞ = BxB + NxN = b
           ⎝ xN ⎠

or

xB = B-1b − B-1NxN        (2)
The solution xT = (xBT xNT) such that xN = 0 and xB = B-1b is called a basic
solution of the system Ax = b. So a basic solution has m basic variables
(components of xB) and n−m zero nonbasic variables (components of xN). If
xB ≥ 0, then x is called a basic feasible solution. If the solution has fewer
than m nonzero variables, it is called a degenerate basic solution. From the
fundamental theorem of linear programming we know that if an LP problem has an
optimal feasible solution, then there is a basic feasible solution that is
optimal.
Example 4.1
Consider the LP model:

min  x1 + x2
st   x1 + 2x2 ≤ 4
     x2 ≤ 1
     x1, x2 ≥ 0

The model in standard form is given by:

min  x1 + x2 + 0x3 + 0x4              min (1 1 0 0) x
st   x1 + 2x2 + x3      = 4     ⇔     st  ⎛ 1 2 1 0 ⎞ x = ⎛ 4 ⎞
     x2           + x4  = 1               ⎝ 0 1 0 1 ⎠     ⎝ 1 ⎠
     x1, x2, x3, x4 ≥ 0                   x = (x1 x2 x3 x4)T ≥ 0

Clearly A has rank m = 2. Suppose we choose the third and fourth linearly
independent columns (a3 and a4) to form the basic matrix B. This means that we
can arrange the matrix A as shown below:

A = ⎛ 1 0 1 2 ⎞      (the first two columns form B, the last two form N)
    ⎝ 0 1 0 1 ⎠

so that the rearranged problem reads

min (0 0 1 1) x            (cBT cNT)
st  ⎛ 1 0 1 2 ⎞ x = ⎛ 4 ⎞
    ⎝ 0 1 0 1 ⎠     ⎝ 1 ⎠
    x = (x3 x4 x1 x2)T ≥ 0      (xBT xNT)

Thus

xB = B-1b − B-1NxN = ⎛ 1 0 ⎞-1 ⎛ 4 ⎞ − ⎛ 1 0 ⎞-1 ⎛ 1 2 ⎞ ⎛ 0 ⎞ = ⎛ 4 ⎞
                     ⎝ 0 1 ⎠   ⎝ 1 ⎠   ⎝ 0 1 ⎠   ⎝ 0 1 ⎠ ⎝ 0 ⎠   ⎝ 1 ⎠

so that the basic feasible solution is x = (xB xN)T = (4 1 0 0)T.
Equation (2) shows how the basic variables would change if the nonbasic
variables (currently all zero) changed their values. Let's express this equation in
terms of columns of the matrix B-1N and corresponding nonbasic variables:
xB = B-1b − B-1NxN = B-1b − ∑_{j∈R} (B-1aj)xj = b* − ∑_{j∈R} yjxj        (3)
where R is the index set of the columns that make the nonbasic matrix N, aj are
columns of the matrix N (and the matrix A), yj are columns of the matrix B-1N
and b* contains the current values of basic variables.
Example 4.1 Continued

xB = B-1b − B-1NxN = ⎛ 1 0 ⎞-1 ⎛ 4 ⎞ − ⎛ 1 0 ⎞-1 ⎛ 1 2 ⎞ ⎛ x1 ⎞
                     ⎝ 0 1 ⎠   ⎝ 1 ⎠   ⎝ 0 1 ⎠   ⎝ 0 1 ⎠ ⎝ x2 ⎠

   = ⎛ 4 ⎞ − [ ⎛ 1 ⎞ x1 + ⎛ 2 ⎞ x2 ]
     ⎝ 1 ⎠     ⎝ 0 ⎠      ⎝ 1 ⎠

   = b* − ∑_{j∈{1,2}} yjxj
Similarly, let's find out how the objective value would change with respect to
the nonbasic variables:

z = cTx = (cBT cNT) ⎛ xB ⎞ = cBTxB + cNTxN = cBT(B-1b − ∑_{j∈R} B-1ajxj) + cNTxN
                    ⎝ xN ⎠
  = cBTB-1b − ∑_{j∈R} cBTB-1ajxj + ∑_{j∈R} cjxj
  = z0 − ∑_{j∈R} (zj − cj)xj        (4)

where zj = cBTB-1aj is a scalar value for each nonbasic variable.
Example 4.1 Continued

z = cTx = (0 0 1 1)(x3 x4 x1 x2)T = (0 0) ⎛ x3 ⎞ + (1 1) ⎛ x1 ⎞
                                          ⎝ x4 ⎠         ⎝ x2 ⎠

Substituting xB from equation (3):

z = (0 0) ⎛ 1 0 ⎞-1 ⎛ 4 ⎞ − { (0 0) ⎛ 1 0 ⎞-1 ⎛ 1 ⎞ x1 + (0 0) ⎛ 1 0 ⎞-1 ⎛ 2 ⎞ x2 } + (1 1) ⎛ x1 ⎞
          ⎝ 0 1 ⎠   ⎝ 1 ⎠           ⎝ 0 1 ⎠   ⎝ 0 ⎠            ⎝ 0 1 ⎠   ⎝ 1 ⎠             ⎝ x2 ⎠

(the two terms in braces are z1x1 and z2x2)

  = 0 − {(0 − 1)x1 + (0 − 1)x2}
  = z0 − ∑_{j∈R} (zj − cj)xj
Exercise 4.2: Work out the values of the BV, the NBV and the objective
function if the basic feasible solution to be considered is
x = (x2 x3 x1 x4)T. Does this basic feasible solution improve the objective
function value?
We can use equation (4) to find a nonbasic variable that (if made nonzero)
would improve the current objective value z0. If such a variable does not
exist, we know that the current basic feasible solution is optimal. To do
this, let's optimize with respect to the nonbasic variables at the point x
(the origin in the sub-space of nonbasic variables) using equations (3)
and (4):
Minimize    z = z0 − ∑_{j∈R} (zj − cj)xj
Subject to  ∑_{j∈R} yjxj + xB = b* ,   xj ≥ 0 , j ∈ R ,   xB ≥ 0

Note that in the above problem the current basic variables play the role of
slacks. So the problem can be rewritten as:

Minimize    z = z0 − ∑_{j∈R} (zj − cj)xj
Subject to  ∑_{j∈R} yjxj ≤ b* ,   xj ≥ 0 , j ∈ R
From the objective function of the above LP problems, we can directly state the
optimality condition or optimality test:
If (zj - cj) ≤ 0 for all j ∈ R, then the current basic feasible solution is optimal.
The proof is simple: since xj ≥ 0 then for all nonpositive (zj - cj) we get z ≥ z0 for
any other solution. But currently z = z0 since xj = 0 for all j ∈ R.
If not all (zj - cj) ≤ 0, then we select one positive (zk - ck) - possibly but not
necessarily the greatest one - and we shall increase the corresponding xk as
much as possible by holding the remaining n-m-1 nonbasic variables at zero.
The new objective value will be:
z = z0 - (zk - ck) xk
(5)
Note: Look at the definition of simplex in chapter 2. At the beginning of each
iteration we are in the origin of the sub-space of nonbasic variables. Each
iteration represents a move along one axis of the nonbasic variables space to
the neighboring corner. So geometrically it is a move on one of the edges of
the simplex (which explains the name of the method).
Example 4.3
Consider the LP in standard form:
min  −x1 − x2 + 0x3 + 0x4
st   x1 + 2x2 + x3      = 4
     x2           + x4  = 1
     x1, x2, x3, x4 ≥ 0
T
Suppose that the IBFS is x = ( x3 x4 x1 x2 ) = (4 1 0 0) . Thus, the initial
objective function
z = c TB B −1b − ∑ cTB B −1a j x j + ∑ c j x j = z0 − ∑ ( z j − c j ) x j = 0
j∈R
j∈R
To check if zo can be improved:
R = {1,2}
- 48 -
j∈R
⇒ z1 − c1 = cTB B −1a1 − c1 = 0 − (−1) = 1 > 0
⇒ z2 − c2 = cTB B −1a2 − c2 = 0 − (−1) = 1 > 0
Clearly the IBFS is not the optimal solution, thus we need to choose a current
NBV to become a BV, so that the objective function value will be decreased.
We can choose any NBV xj corresponding to a positive value of z j − c j . As a
rule of thumb we choose that NBV xj for which z j − c j has the highest positive
value. Since in this case z j − c j = 1 for j = 1,2, then we can choose either x1 or
x2 , say x1.
From equation (3) we can find the new values of the basic variables:

xB = b* − ykxk        (all other xj, j ∈ R \ {k}, are zero)

Expanding this equation we obtain:

⎛ xB1 ⎞   ⎛ b1* ⎞   ⎛ y1k ⎞
⎜ xB2 ⎟ = ⎜ b2* ⎟ − ⎜ y2k ⎟ xk
⎜  ⋮  ⎟   ⎜  ⋮  ⎟   ⎜  ⋮  ⎟
⎝ xBm ⎠   ⎝ bm* ⎠   ⎝ ymk ⎠
The indices Bi depend on the current basis. The value of xk can be found from
the feasibility requirement of the new solution:

xBi ≥ 0   or   bi* − yikxk ≥ 0 ,  i = 1, 2, …, m

Feasibility is in danger only for positive yik, for which it must hold:

xk ≤ bi*/yik ,   i = 1, 2, …, m ,   yik > 0
So for yik ≤ 0 the corresponding xBi remains nonnegative. For yik > 0 the
corresponding xBi decreases. We can continue increasing xk until the first
basic variable drops to zero. Then we have to stop, otherwise the solution
would become infeasible. This gives the so-called feasibility condition, also
called the ratio test or feasibility test:

xk = Min { bi*/yik : yik > 0 , 1 ≤ i ≤ m }        (6)
If r is the row with the minimum ratio, then the new solution is:

xBi = bi* − yik (br*/yrk) ,  i = 1, 2, …, m   (xBr drops to zero: the leaving variable)
xk = br*/yrk   (the entering variable)
xj = 0 ,  j ∈ R \ {k}   (other nonbasic variables remain zero)
In this way we have reached a new (better) basic feasible solution. This
process must terminate (unless there is cycling) because the number of corners
(basic feasible solutions) is finite. Cycling can occur in case of degeneracy
and it represents a real problem in computer implementations of the simplex
method. There are methods to cope with cycling that are beyond the scope of
this material. Commercial LP packages mostly ignore cycling because cycling
prevention would slow down the computation considerably. Rounding of floating
point numbers can also help: due to the limited precision, cycling is in fact
often prevented.
Example 4.3 Continued

xB = b* − ykxk   ⇒   ⎛ xB1 ⎞ = ⎛ 4 ⎞ − ⎛ 1 ⎞ x1 ≥ ⎛ 0 ⎞
                     ⎝ xB2 ⎠   ⎝ 1 ⎠   ⎝ 0 ⎠      ⎝ 0 ⎠

⇒ x1 ≤ 4/1   (only the first row has a positive yik)
⇒ x1 = min {4/1} = 4

As the minimum value occurs in the first row, the current BV x3 (the slack s1)
must drop to zero, thus becoming a NBV. Hence the new basic feasible solution
is x = (x1 x4 x2 x3)T = (4 1 0 0)T and the new objective function value is

z = cBTB-1b − ∑_{j∈R} cBTB-1ajxj + ∑_{j∈R} cjxj = 0 − 0 + (−1)4 = −4

which is an improvement on z0 (remember we are minimizing!).
Exercise 4.4
Check if the new basic feasible solution is optimal; if not, determine the
next basic feasible solution and the new objective function value.
Practical interpretation of (zk - ck)
After entering a so far nonbasic (zero) variable xk into the solution, the new
value of the objective function is:
z = z0 - (zk - ck) xk = cBTb* - zkxk + ckxk
So,
ck = cost of entering one unit of xk
zk = saving caused by entering one unit of xk
That's why (ck - zk) is called reduced cost. We are working with its negative
value (zk - ck) because this is the value in the simplex table - see later. Now let's
expand zk :
zk = cBTB-1ak = cBTyk = ∑_{i=1}^m cBi yik
So
cBi = unit cost of the i-th basic variable
yik = by how much the i-th basic variable will decrease
cBi yik = saving caused by decreasing the i-th basic variable
zk = total saving caused by decreasing all basic variables.
Cases of termination

So far we have assumed that a unique optimal solution has been reached. This
is in fact one of three possible cases (we always assume a feasible problem):

1. If zj − cj < 0 for all j ∈ R, then there is a unique optimal solution.

2. If zj − cj ≤ 0 for all j ∈ R and zk − ck = 0 for some k in R, then there
are alternative optima. Entering xk into the solution would change the basic
feasible solution, but not the value of z.

3. If zk − ck > 0 for some k in R and yk ≤ 0, then the problem is unbounded.
All entries of the column yk are non-positive, so xk can be increased
arbitrarily and z → −∞.
Simplex Algorithm

Using the above results, we can express the simplex algorithm formally by
matrix operations (assuming that the optimal solution exists):

Find a basic matrix B of an initial basic feasible solution
Repeat
    Compute the current solution x = (xB xN)T = (B-1b 0)T
    Compute the objective z = cBTxB = cBTB-1b
    Compute the reduced costs (c − z)T = (cT − cBTB-1A)
    If (not optimal)
        Select the entering variable
        Use the ratio test to find the leaving variable
        Update the basic matrix B
    EndIf
Until (optimal)
Note that the current basis can be represented by an index vector that defines
which columns of A, and in which order, form the basic matrix B. Updating the
basis then means replacing the index of the leaving variable by the index of
the entering variable. The optimality test and the selection of the entering
variable depend on the problem (maximization or minimization); the ratio test
means division of xB by the pivot column (as explained above). Reduced costs
of basic variables are zero, so it is possible to compute only the reduced
costs of nonbasic variables: (cNT − cBTB-1N), where N is the nonbasic matrix
made of the columns of A not included in B. In order to compute the objective
and the reduced costs, we need the so-called simplex multipliers wT = cBTB-1,
also called shadow costs - see later. Then the objective can be computed as
z = wTb and the reduced costs as (cNT − wTN).
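The loop above can be coded compactly in its revised form. The following is a minimal Python sketch (assuming NumPy is available) for the minimization problem, using the optimality condition zj − cj ≤ 0 and the ratio test (6); degeneracy and cycling are ignored, and a starting basis (e.g. the slack columns) must be supplied:

import numpy as np

def simplex(A, b, c, basis):
    """Revised simplex for min c^T x s.t. Ax = b, x >= 0 (no anti-cycling)."""
    m, n = A.shape
    while True:
        B = A[:, basis]
        xB = np.linalg.solve(B, b)            # current basic solution B^-1 b
        w = np.linalg.solve(B.T, c[basis])    # simplex multipliers w^T = cB^T B^-1
        reduced = c - A.T @ w                 # reduced costs (cj - zj)
        k = int(np.argmin(reduced))           # most positive (zj - cj)
        if reduced[k] >= -1e-9:               # optimal: all (zj - cj) <= 0
            x = np.zeros(n)
            x[basis] = xB
            return x, float(c @ x)
        yk = np.linalg.solve(B, A[:, k])      # pivot column yk = B^-1 ak
        if np.all(yk <= 1e-9):
            raise ValueError('problem is unbounded')   # termination case 3
        ratios = np.full(m, np.inf)
        pos = yk > 1e-9
        ratios[pos] = xB[pos] / yk[pos]       # ratio test (6)
        basis[int(np.argmin(ratios))] = k     # leaving row takes the entering column

# Example 4.3: min -x1 - x2 with the slack basis {x3, x4} as the IBFS.
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
x, z = simplex(A, np.array([4.0, 1.0]), np.array([-1.0, -1.0, 0.0, 0.0]), [2, 3])
print(x, z)   # x = [4, 0, 0, 1], z = -4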
4.3 Simplex Table
To use the simplex algorithm in practice we (and the computer) need a table
that would store all information needed for tests and operations of the algorithm
and that would eventually contain the optimal solution together with its optimal
objective value. Later we shall learn that there will be in fact even more than
that. Let's summarize what we need:
• current basic feasible solution
• current objective value
• information whether the solution is optimal and if not what variable
should enter. We know that for this purpose we need reduced costs of
nonbasic variables
• information needed to find the values of entering variables that will also
show which variable leaves the solution. We know, that for that (ratio
test) we need columns yj of nonbasic variables and current basic feasible
solution.
Derivation of the simplex table
By using the equations (2) and (4) the original problem
Minimize    z = cTx
Subject to  Ax = b , x ≥ 0
can be restated as:

Minimize    z
Subject to  NxN + BxB = b
            -cNTxN - cBTxB + z = 0        (7)
To get the current basic solution, we multiply the first equation by B-1 from left
and then by cBT from left:
B-1NxN + IxB = B-1b
cBTB-1NxN + cBTxB = cBTB-1b        (8)
To get the reduced costs, let's add the second equations of (7) and (8):

(cBTB-1N - cNT)xN + 0TxB + z = cBTB-1b        (9)
Finally let's express the first equation of (8) and the equation (9) in a unified
way, where the first term in (9) has been written using zNT = cBTB-1N - see the
equation (4):

B-1NxN       + IxB  + 0z = B-1b
(zN - cN)TxN + 0TxB + 1z = cBTB-1b        (10)
Note that currently xN = 0, so right hand sides are equal to xB and z respectively.
The coefficients of equations (10) can be stored in a table that has m + 1 rows,
and columns corresponding to xN, xB, z and the right hand sides. We shall label
the columns accordingly. For practical reasons we can also label the rows by
the basic variables and by z. This makes a simplex table where BV means basic
variables:
BV    xNT             xBT    z    RHS
xB    B-1N            I      0    B-1b
z     cBTB-1N - cNT   0T     1    cBTB-1b
This table contains all we need to carry out simplex iterations. The last column
contains the values of the basic variables (the nonbasic ones are obviously zero)
and the value of z. The second equation of (10) shows that the z-row contains
negative reduced costs in xN columns.
Notes:
1. Some authors describe a table with positive reduced costs and negative
objective value (together with the coefficient -1 in the z-column). This can be
obtained by multiplying the second equation of (10) by -1.
2. The z-column always contains the same entries, so it is in fact redundant.
That's why it is omitted in most books.
3. Some authors place the z-row as the first one.
4. The above table has the basic and nonbasic variables grouped together. This
can always be done by re-arranging columns of the table. Doing this after
each iteration is time consuming and in fact useless. That's why it is typical
(for manual solution) that the above "nice" form of the table exists only at
the beginning, where the basic variables are slacks and/or artificial
variables that we are used to place at the right side in such an order to form
directly the unity matrix. Note that in this case the initial basis is a unity
matrix: B = B-1 = I, so the initial simplex table contains directly the
coefficients of the constraint equations and the objective function:
BV    xNT            xBT    z    RHS
xB    N              I      0    b
z     (cBTN - cN)T   0T     1    cBTb
5. The z row is created by first writing negative objective coefficients into the z
row. Slacks have zero coefficients in the objective function (cB = 0), so the so
called all-slack LP problems (with only ≤ inequalities) have directly zeros in
the z-row in xB columns and zero in the objective value (bottom right) entry.
z-row entries in the xN columns are negative coefficients of the objective
function. So the initial simplex table does not need any pre-processing.
However if there are artificial variables, the initial table depends on the
solution method. The M-method penalizes artificial variables by a big
coefficient M in the objective function. The 2-phase method first minimizes
the sum of artificial variables, so the objective coefficients of artificial
variables are 1. To get the initial simplex table in its consistent form, it is in
both cases necessary to perform some elementary row operations to obtain
zeros in the z-row in the xB columns (a small code sketch of this operation
follows these notes).
6. Note that after each iteration one basic variable leaves, one nonbasic
variable enters. So assuming we keep the heading labels fixed, after the first
iteration one column of the unity matrix moves to the place of the new basic
variable and becomes the column of the variable that is now nonbasic. After
several iterations the columns of the unity matrix are "scattered" in the
table, but all of them are always present. So it is convenient to keep labels of
basic variables in the first (label) column.
7. Each solution has its basis that can be created by taking the appropriate
columns of the original matrix A (indices are the same as the basic
variables). Note that at each iteration the table contains the result of
multiplying the original matrix A by B-1. So if there were originally the unity
matrix in A (typically at the right side), there is B-1 now.
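Note 5 mentions the row operations that make the initial table consistent. A small MATLAB sketch of that operation (the table s with the z-row last, the list of artificial rows arows and the penalty M are assumptions of this illustration):

% M-method: subtract M times each artificial constraint row from the z-row
for i = arows
    s(end,:) = s(end,:) - M*s(i,:);
end
% two-phase method (phase 1): artificial costs are 1, so the rows themselves
% are added instead: s(end,:) = s(end,:) + s(i,:) for each i in arows

This is exactly what the sessions in Examples 4.7 and 4.8 below do with s(5,:)=s(5,:)-(s(3,:)+s(4,:))*M and s(end,:)=s(end,:)+s(3,:)+s(4,:).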
Information found in the simplex table
In addition to the above requirements, the simplex table contains a lot of useful
information:
a) Objective value (z) in terms of nonbasic variables
Apart from zeros in the columns of basic variables, the z-row of the simplex
table contains the negative reduced costs. Using reduced costs let's once more
express the objective value in terms of nonbasic variables:
z = cBTB-1b - (cBTB-1N - cNT)xN = cBTB-1b + Σj∈R (cj - zj)xj
The rate of change of z as a function of a nonbasic variable xj is:
∂z/∂xj = cj - zj        (11)
This is another justification that to minimize z, xj should be increased if cj - zj <
0 or zj - cj > 0 (because this value is stored in the simplex table).
b) Basic variables in terms of nonbasic variables
From equation (10) we get for xB:
xB = B-1b - B-1NxN = B-1b - Σj∈R B-1aj xj = B-1b - Σj∈R yj xj        (12)

Vectors yj show how basic variables change in terms of nonbasic variables:

∂xB/∂xj = -yj ,    ∂xBi/∂xj = -yij
c) Objective value (z) in terms of the original right hand side values
From equation (11) we can compute the partial derivatives of z with respect to
b:
∂z/∂bi = (cBTB-1)i
These values are the so called shadow costs. Their interpretation depends on
the type of inequality. If bi represents availability of a certain resource ( ≤
inequality in a maximization problem) then ∂z/∂bi is the worth of one unit of the
particular resource. If bi represents some minimum acceptable amount ( ≥
inequality in a minimization problem like minimum production, minimum
weight and similar) then the derivative ∂z/∂bi is the cost that we pay for one unit
of that limitation. Shadow costs can also be found in the simplex table (actually
for some nonbasic variables shadow costs are equal to reduced costs). From the
equation (9) we know that in the z-row the entries in nonbasic variables
columns are cBTB-1N - cNT or cBTB-1aj - cj for one particular nonbasic variable xj.
Now lets assume that the nonbasic variable xj is a zero slack of a certain scarce
resource whose availability is bi (i-th component of b). Such slacks have zero
coefficients in the objective function, so cj = 0. Also initially in the simplex
table the slack was a basic variable with its associated column of a unity matrix,
so aj = ei (a column vector with i-th component equal to 1 and the remaining
components equal to zero):
cBTB-1aj - cj = cBTB-1ei = (cBTB-1)i
Thus the z-row entry of xj is the i-th entry of cBTB-1 that is the shadow cost
∂z/∂bi.
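This is how the MATLAB sessions below read off the shadow costs: if the slack columns originally carried the unity matrix in the last m columns before the RHS, the z-row entries there are exactly cBTB-1. A one-line sketch (s is the optimal table and m the number of constraints; both names are assumptions of this illustration):

w = s(end, end-m:end-1)';    % shadow costs = z-row entries in the slack columns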
d) Basic variables in terms of the original right hand side values
From equation (12) we can compute the partial derivatives of xB with respect to
b:
∂xBi/∂bj = (B-1)ij
So the i,j-th entry of B-1 shows how the i-th basic variable (the one in the i-th
row, not the one that is the i-th component of x) changes with the right hand
side value bj.
4.4 The Simplex Algorithm (using Simplex Table)
1. Create the initial (possibly inconsistent) simplex table:
BV    xNT     xBT     RHS
xB    N       I       b
z     -cNT    -cBT    0
where
(N I) = A is the matrix of coefficients with unity matrix in last m
columns.
b = vector of right-hand sides
cN = objective coefficients of nonbasic variables
cB = objective coefficients of basic variables
xB = basic variables
xN = nonbasic variables
2. Make the table consistent by performing such row operations, that the values
in the z-row in basic columns are zero. This is not necessary for all slack
models.
3. Use Optimality test to check whether the table is optimal:
Minimization: all z-row entries (zj - cj) must be negative or zero.
Maximization: all z-row entries (zj - cj) must be positive or zero.
If the table is optimal go to the step 6. If not, select the entering variable:
Minimization: Select variable with the greatest positive value.
Maximization: Select variable with the most negative value.
Entering variable defines the pivot column.
4. Use Feasibility test to find the leaving variable:
Compute the ratios of the right-hand sides and the positive
coefficients in the pivot column. Ignore rows with non-positive
coefficients.
If there are no positive coefficients, the problem is
unbounded. Otherwise select the row with the minimum ratio to find
the pivot row and the leaving variable.
5. Pivot on pivot element: perform such row operations to create zeros in the
pivot column except the pivot element that has to be 1. Go to the step 3.
6. Interpret the optimal simplex table that contains (among others):
- The objective value
- Values of basic variables (nonbasic ones are zero)
- Shadow costs of resources in slack columns
- Penalties caused by introducing nonbasic variables
Example 4.5
Use the simplex method to find the optimum production plan for the following
problem:
Product               A (x1)   B (x2)   C (x3)   Amount Available
Machine Hours            2        3        1          400
Components               1        0        1          150
Alloy                    2        0        4          200
Limits                   -      ≤ 50       -
Contribution/unit        8        5       10

1. Write the LP model for the above problem:

max 8x1 + 5x2 + 10x3
subject to
2x1 + 3x2 + x3 ≤ 400
x1 + x3 ≤ 150
2x1 + 4x3 ≤ 200
x2 ≤ 50
x1, x2, x3 ≥ 0

2. Express the LP model in standard form:

max z = 8x1 + 5x2 + 10x3 + 0x4 + 0x5 + 0x6 + 0x7
subject to
2x1 + 3x2 + x3 + x4 = 400
x1 + x3 + x5 = 150
2x1 + 4x3 + x6 = 200
x2 + x7 = 50
x1, x2, x3, x4, x5, x6, x7 ≥ 0
Interpretation of Slack variables:
• x4 - unused machine hours
• x5 - unused components
• x6 - unused alloy
• x7 - amount of product B not produced
3. Set up the initial simplex table:
Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          2     3     1       1    0    0      0       400
x5          1     0     1       0    1    0      0       150
x6          2     0     4       0    0    1      0       200
x7          0     1     0       0    0    0      1        50
z          -8    -5   -10       0    0    0      0         0
IBFS: xT = (x4, x5, x6, x7 | x1, x2, x3) = (400, 150, 200, 50 | 0, 0, 0)
(the first four variables are BV, the last three are NBV)

Initial objective function value: z = 0
4. Perform Optimality Test: Is the current BFS optimal? - No
i. Select the highest negative contribution in the z-row (i.e. -10, corresponding to x3)
⇒ x3 is the entering variable (i.e. x3 becomes a BV)
5. Perform Ratio Test: Which current BV has to be reduced to zero (i.e.
become a NBV)?

xB = b* - y3x3 ≥ 0

(x4)   (400)   (1)        (0)
(x5) = (150) - (1) x3  ≥  (0)
(x6)   (200)   (4)        (0)
(x7)   ( 50)   (0)        (0)

⇒ x3 ≤ min(400/1, 150/1, 200/4, ignore) = min(400, 150, 50) = 50
The amount of product C produced ( x3) can be increased by 50 (pivot row =
row 3), hence x6 must be reduced to 0, and thus becomes a NBV.
Interpretation: instead of x6 the basic variable will be x3 (no unused alloy,
all alloy used to produce 50 units of product C).
Consequence: production of 50 units of product C will also affect other
resources. Thus, we need to find the new values of the other BV.
6. Ring the element in both the pivot row and pivot column (pivot element) – 4.
Divide all the elements in the identified row (x6) by the pivot element (4)
and change the solution variable (i.e. x6 → x3)
New Row 3 is:

x3    0.5   0   1   0   0   0.25   0   50

New Simplex Table is:
Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          2     3     1       1    0    0      0       400
x5          1     0     1       0    1    0      0       150
x3         0.5    0     1       0    0   0.25    0        50
x7          0     1     0       0    0    0      1        50
z          -8    -5   -10       0    0    0      0         0
7. Make all other elements in the pivot column equal to zero by repetitive row
by row operations:

i.   New Row 1 = Old Row 1 - Row 3

     x4            2     3   1   1   0    0      0   400
     x3           0.5    0   1   0   0   0.25    0    50
     New Row 1    1.5    3   0   1   0  -0.25    0   350

ii.  New Row 2 = Old Row 2 - Row 3
iii. Row 4 = already zero
iv.  New Row 5 = Old Row 5 + 10(Row 3)
Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4         1.5    3     0       1    0  -0.25    0       350
x5         0.5    0     0       0    1  -0.25    0       100
x3         0.5    0     1       0    0   0.25    0        50
x7          0     1     0       0    0    0      1        50
z          -3    -5     0       0    0   2.5     0       500
New BFS: xT = (x3, x4, x5, x7 | x1, x2, x6) = (50, 350, 100, 50 | 0, 0, 0)
(BV first, NBV last)

New objective function value: z = 500
8. Repeat steps 4, 5 and 6 until there are no negative values in the z row. (i.e.
no improvement is possible).
Simplex Table after next step:

Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4         1.5    0     0       1    0  -0.25   -3       200
x5         0.5    0     0       0    1  -0.25    0       100
x3         0.5    0     1       0    0   0.25    0        50
x2          0     1     0       0    0    0      1        50
z          -3     0     0       0    0   2.5     5       750

New BFS: xT = (x2, x3, x4, x5 | x1, x6, x7) = (50, 50, 200, 100 | 0, 0, 0)
(BV first, NBV last)

New Objective function value: z = 750
Interpretation: Produce the maximum possible quantities of product C (the
alloy slack x6 is zero) and of product B (the limit slack x7 is zero); product A
is not produced (x1 = 0).
Optimal Simplex Table

Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          0     0    -3       1    0   -1     -3        50
x5          0     0    -1       0    1  -0.5     0        50
x1          1     0     2       0    0   0.5     0       100
x2          0     1     0       0    0    0      1        50
z           0     0     6       0    0    4      5      1050
Interpretation of table:
Produce 100 units of product A, 50 units of product B and no units of product C
to yield a maximum contribution of 1050.
Unused (Abundant) Resources:
50 machine hours (x4) and 50 components (x5)
Scarce (fully utilized) resources:
All alloy is used.
Limitation on x2 is exhausted.
Shadow prices:
i.   4 (corresponding to alloy (x6))
ii.  5 (corresponding to the x2 limitation (x7))
iii. 6 (corresponding to x3)

Interpretation:
i.   If the RHS of the alloy constraint is increased/decreased by 1, the total
     contribution will increase/decrease by 4.
ii.  If the RHS of the x2 limitation constraint is increased/decreased by 1, the
     total contribution will increase/decrease by 5.
iii. If production of product C (x3) is increased by 1, the contribution will
     be 6 units less.
Example 4.6 Continued (Matlab Session)
This session shows the Simplex Algorithm (all-slack problem) using the table
approach. Comments added later are in italics, some empty lines and spaces
have been removed. Assuming that the folder Z:\Matlab contains the file pivot.m
» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
    rmax=length(A(:,1));
    A(r,:)=A(r,:)/x;
    for i=1:rmax
        if i~=r
            A(i,:)=A(i,:)-A(r,:)*A(i,c);
        end
    end
end
a=A;
Entering A,b,c of the model:

max z = 8x1 + 5x2 + 10x3
ST   2x1 + 3x2 +  x3 ≤ 400
      x1       +  x3 ≤ 150
     2x1       + 4x3 ≤ 200
            x2       ≤  50
     xi ≥ 0
» A=[2 3 1;1 0 1;2 0 4;0 1 0]
A = 2   3   1
    1   0   1
    2   0   4
    0   1   0
» A=[A eye(4)]
A = 2   3   1   1   0   0   0
    1   0   1   0   1   0   0
    2   0   4   0   0   1   0
    0   1   0   0   0   0   1
» c=[8 5 10 zeros(1,4)]'
c = 8
    5
    10
    0
    0
    0
    0
» b=[400 150 200 50]'
b = 400
    150
    200
    50
Constructing the simplex table:
» s=[A b]
s = 2   3   1   1   0   0   0   400
    1   0   1   0   1   0   0   150
    2   0   4   0   0   1   0   200
    0   1   0   0   0   0   1    50
» s=[s;-c' 0]
s = 2   3    1   1   0   0   0   400
    1   0    1   0   1   0   0   150
    2   0    4   0   0   1   0   200
    0   1    0   0   0   0   1    50
   -8  -5  -10   0   0   0   0     0
3rd variable enters (max negative), computing ratios: last column/3rd column:
» r=s(:,end)./s(:,3)
Warning: Divide by zero.
Ignore this message; in the ratios ignore the Inf values and the last value
r = 400
    150
    50
    Inf
    0
3rd row has minimum ratio - pivot at (3,3):
» s=pivot(s,3,3)
s = ....
If necessary, format the output (best of fixed and floating point):
» format short g
» s
s = 1.5    3   0   1   0  -0.25   0   350
    0.5    0   0   0   1  -0.25   0   100
    0.5    0   1   0   0   0.25   0    50
    0      1   0   0   0   0      1    50
   -3     -5   0   0   0   2.5    0   500
2nd variable enters (max negative), computing ratios: last column/2nd column:
» r=s(:,end)./s(:,2)
r = 116.67
    Inf
    Inf
    50
    -100
4th row has minimum ratio - pivot at (4,2):
» s=pivot(s,4,2)
s = 1.5    0   0   1   0  -0.25  -3   200
    0.5    0   0   0   1  -0.25   0   100
    0.5    0   1   0   0   0.25   0    50
    0      1   0   0   0   0      1    50
   -3      0   0   0   0   2.5    5   750
1st variable enters (only negative), computing ratios: last column/1st column:
» r=s(:,end)./s(:,1)
r = 133.33
    200
    100
    Inf
    -250
3rd row has minimum ratio - pivot at (3,1):
» s=pivot(s,3,1)
s = 0   0  -3   1   0  -1     -3     50
    0   0  -1   0   1  -0.5    0     50
    1   0   2   0   0   0.5    0    100
    0   1   0   0   0   0      1     50
    0   0   6   0   0   4      5   1050
Optimal table

Retrieving results from the optimal table:
» z=s(end,end)
z = 1050                                  Objective value
Solution vector is found by basic columns ( x4 is the first basic variable
with the value 50, x1 is the third basic variable with the value 100,
etc.)
» x=[100 50 0 50 50 0 0]'
x =100
50
0
50
50
0
0
Computing left hand sides of inequalities using the first 3 columns of A and
first three items of x:
» A                                       Displaying A again
A = 2   3   1   1   0   0   0
    1   0   1   0   1   0   0
    2   0   4   0   0   1   0
    0   1   0   0   0   0   1
» lhs=A(:,1:3)*x(1:3)
lhs = 350
      100
      200
      50
» [lhs b]                                 Left and right hand sides
ans = 350   400
      100   150
      200   200
       50    50
» unused=b-lhs                            Unused resources = RHS - LHS
unused = 50
         50
         0
         0
Note that only the last two constraints are tight.
Shadow costs are in the last 4 columns of the z row:
» s
s = 0   0  -3   1   0  -1     -3     50
    0   0  -1   0   1  -0.5    0     50
    1   0   2   0   0   0.5    0    100
    0   1   0   0   0   0      1     50
    0   0   6   0   0   4      5   1050
» w=s(5,4:7)'                             Dual solution = Shadow costs
w = 0
    0
    4
    5
» c'*x                                    Primal objective
ans = 1050
» w'*b                                    Dual objective
ans = 1050
Inverted base is where the unity matrix was originally (last 4 columns):
» Binv=s(1:4,4:7)                         B-1
Binv = 1   0  -1     -3
       0   1  -0.5    0
       0   0   0.5    0
       0   0   0      1
» B=inv(Binv)                             Inversion of B-1 is B
B = 1   0   2   3
    0   1   1   0
    0   0   2   0
    0   0   0   1
Note that the base is made of the columns of A that correspond to the basic
variables x4, x5, x1, and x2 in this order. To check this, let's create the base
again. Note that it is possible to retrieve columns in any order by giving the
set - vector of indices:
» BB=A(:,[4 5 1 2])                       All rows, columns as given by the 2nd parameter
BB = 1   0   2   3
     0   1   1   0
     0   0   2   0
     0   0   0   1
BB = B
»
4.5 Big M Method and Two-Phase Method

Recall that to start the simplex algorithm, an initial Basic Feasible Solution is
required. In the problems we considered so far, an initial Basic Feasible
Solution was found by using slack variables as our initial basic variables. This
was possible since the problems considered contained only constraints of the
form

Ax ≤ b
x ≥ 0

However, if an LP model is made up of ≥ or = constraints (or a mixture of
these), an initial Basic Feasible Solution is not readily apparent. For
example, consider the following LP:
max z = 2 x1 + 5 x2
subject to
2 x1 + 3 x2 ≤ 6
2 x1 − x2 ≥ 2
− x1 + 6 x2 = 2
x1 , x2 ≥ 0
Inserting slack variables in each of the inequalities, we obtain the LP in
standard form:
max z = 2x1 + 5x2
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 = 2
-x1 + 6x2 = 2
x1, x2, x3, x4 ≥ 0
In the last two constraints, there is no readily apparent variable, which can act
as a basic variable for an initial basic feasible solution. Thus, in order to obtain
an initial basic feasible solution, two other variables have to be introduced into
the last two constraints. These are called artificial variables, and have no
practical interpretation. In fact they must be eliminated from the final solution.
Hence, the above LP is converted in the following form:
max z = 2x1 + 5x2 + 0x3 + 0x4
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 + a1 = 2
-x1 + 6x2 + a2 = 2
x1, x2, x3, x4, a1, a2 ≥ 0
The Big M Method
As already pointed out, artificial variables must be eliminated from the optimal
solution, after obtaining the IBFS. There are two types of methods, which
eliminate artificial variables: the Big M Method and the Two Phase Method.
First we shall examine the Big M Method.
To remove artificial variables, in case of Minimization problems, the Big M
Method adds a term Mai to the objective function for each artificial variable ai.
In case of Maximization problems, the term -Mai is added to the objective
function for each artificial variable ai. M represents some very large number and
can be interpreted as a huge penalty for underfulfillment of the requirements.
Thus the above example now becomes:
max z = 2x1 + 5x2 + 0x3 + 0x4 - Ma1 - Ma2
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 + a1 = 2
-x1 + 6x2 + a2 = 2
x1, x2, x3, x4, a1, a2 ≥ 0
Modifying the objective function makes it extremely “non-profitable” for an
artificial variable to be positive. Thus the optimal solution should force a1 and
a2 to be 0. The above problem is then solved using the simplex method.
Note: if any artificial variables are positive in the optimal solution, then the
problem is infeasible.
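The penalty can also be handled numerically instead of symbolically. A minimal sketch (the value 1e6 is an assumed "big enough" constant, not part of the original method; the variable ordering x1, x2, x3, x4, a1, a2 follows the model above):

M = 1e6;                      % assumed numeric penalty, must dominate all cost data
c = [2 5 0 0 -M -M]';         % maximization objective with penalized artificials
% after solving, any ai > 0 in the optimum signals an infeasible problem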
Two Phase Method
This method is divided in two phases: Phase 1 and Phase 2.
Phase 1:
For the time being we ignore the original LP's objective function and instead
we minimize an LP whose objective function z' is the sum of the artificial
variables. Thus Phase 1 will force the artificial variables to be zero.
Since each artificial variable is non-negative, solving the LP in phase one will
result in one of the following three cases:
Case 1: The optimal value of z ' > 0
Case 2: The optimal value of z ' = 0 and no artificial variables are in the optimal
solution of Phase 1.
Case 3: The optimal value of z ' = 0 and at least one artificial variable is in the
optimal solution of Phase 1 (i.e. it is 0).
Phase 2:
Case 1: Stop, the original LP is infeasible.
Case 2:
i.   Delete the columns corresponding to the artificial variables.
ii.  Combine the original objective function with the constraints from the
     optimal phase 1 tableau.
iii. Make the tableau consistent using row operations, so that the basic
     variables have 0 objective coefficients.
iv.  Apply the simplex method to this new consistent tableau.
Case 3:
i.   Delete the columns corresponding to both the non-basic and the basic
     zero artificial variables.
ii.  Combine the original objective function with the constraints from the
     optimal phase 1 tableau.
iii. Make the tableau consistent using row operations, so that the basic
     variables have 0 objective coefficients.
iv.  Apply the simplex method to this new consistent tableau.
Example 4.7
This session shows the Simplex Algorithm (M-method) using the table
approach. Comments added later are in italics, some empty lines and spaces
have been removed.
Assuming that the folder Z:\Matlab contains the file pivot.m

» cd Z:\Matlab
» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
    rmax=length(A(:,1));
    A(r,:)=A(r,:)/x;
    for i=1:rmax
        if i~=r
            A(i,:)=A(i,:)-A(r,:)*A(i,c);
        end
    end
end
a=A;
Entering A,b,c of the model:

Max z = 3x1 + 4x2 + 7x3
ST   2x1 + 3x2        ≤ 30
           4x2 + 7x3  ≤ 75
     2x1       + 9x3  ≥ 50
     7x1              ≥ 20
     (all nonnegative)

» A=[2 3 0;0 4 7;2 0 9;7 0 0]
A = 2   3   0
    0   4   7
    2   0   9
    7   0   0
» A=[A [0 0 -1 0]' [0 0 0 -1]']           Adding columns of negative slacks:
A = 2   3   0    0    0                   slack (surplus) s3, slack (surplus) s4
    0   4   7    0    0
    2   0   9   -1    0
    7   0   0    0   -1
» A=[A eye(4)]                            Adding columns of positive slacks and
A = 2   3   0    0    0   1   0   0   0   artificial variables: slack s1, slack s2,
    0   4   7    0    0   0   1   0   0   artificial a1, artificial a2
    2   0   9   -1    0   0   0   1   0
    7   0   0    0   -1   0   0   0   1
» b=[30 75 50 20]'
b = 30
    75
    50
    20
» M=sym('M')                              Definition of the symbol M
» cT=[3 4 7 0 0 0 0 -M -M]
cT = [ 3, 4, 7, 0, 0, 0, 0, -M, -M]
» s=[A b; -cT 0]                          Initial simplex table (column labels added)
s =
     x1  x2  x3  s3  s4  s1  s2  a1  a2  rhs
   [  2,  3,  0,  0,  0,  1,  0,  0,  0,  30]
   [  0,  4,  7,  0,  0,  0,  1,  0,  0,  75]
   [  2,  0,  9, -1,  0,  0,  0,  1,  0,  50]
   [  7,  0,  0,  0, -1,  0,  0,  0,  1,  20]
   [ -3, -4, -7,  0,  0,  0,  0,  M,  M,   0]
Not consistent: subtract M(row3 + row4) from rowz:
» s(5,:)=s(5,:)-(s(3,:)+s(4,:))*M
s =
   [      2,  3,      0,  0,  0,  1,  0,  0,  0,     30]
   [      0,  4,      7,  0,  0,  0,  1,  0,  0,     75]
   [      2,  0,      9, -1,  0,  0,  0,  1,  0,     50]
   [      7,  0,      0,  0, -1,  0,  0,  0,  1,     20]
   [ -3-9*M, -4, -7-9*M,  M,  M,  0,  0,  0,  0,  -70*M]

» col=3                                   x3 enters
» r=s(:,end)./s(:,col)                    ratio test
r = [             Inf]
    [            75/7]
    [            50/9]                    minimum
    [             Inf]
    [ -70*M/(-7-9*M)]                     ignore z-entry of ratio tests
» row=3                                   a1 leaves
» s=pivot(s,row,col)
s =
   [         2,  3,  0,    0,  0,  1,  0,      0,  0,           30]
   [     -14/9,  4,  0,  7/9,  0,  0,  1,   -7/9,  0,        325/9]
   [       2/9,  0,  1, -1/9,  0,  0,  0,    1/9,  0,         50/9]
   [         7,  0,  0,    0, -1,  0,  0,      0,  1,           20]
   [ -13/9-7*M, -4,  0, -7/9,  M,  0,  0,  7/9+M,  0,  -20*M+350/9]

» col=1                                   x1 enters
» r=s(:,end)./s(:,col)                    ratio test
r = [                        15]
    [                   -325/14]
    [                        25]
    [                      20/7]          minimum
    [ (-20*M+350/9)/(-13/9-7*M)]
» row=4                                   a2 leaves
» s=pivot(s,row,col)
s =
   [ 0,  3,  0,    0,    2/7,  1,  0,      0,     -2/7,   170/7]
   [ 0,  4,  0,  7/9,   -2/9,  0,  1,   -7/9,      2/9,   365/9]
   [ 0,  0,  1, -1/9,   2/63,  0,  0,    1/9,    -2/63,  310/63]
   [ 1,  0,  0,    0,   -1/7,  0,  0,      0,      1/7,    20/7]
   [ 0, -4,  0, -7/9, -13/63,  0,  0,  7/9+M,  13/63+M, 2710/63]

» col=2                                   x2 enters
» r=s(:,end)./s(:,col)                    ratio test
r = [    170/21]                          minimum
    [    365/36]
    [       Inf]
    [       Inf]
    [ -1355/126]
» row=1                                   s1 leaves
» s=pivot(s,row,col)
s =
   [ 0,  1,  0,    0,   2/21,   1/3,  0,      0,     -2/21,  170/21]
   [ 0,  0,  0,  7/9, -38/63,  -4/3,  1,   -7/9,     38/63,  515/63]
   [ 0,  0,  1, -1/9,   2/63,     0,  0,    1/9,     -2/63,  310/63]
   [ 1,  0,  0,    0,   -1/7,     0,  0,      0,       1/7,    20/7]
   [ 0,  0,  0, -7/9,  11/63,   4/3,  0,  7/9+M,  -11/63+M, 4750/63]

» col=4                                   s3 enters
» row=2                                   s2 leaves (the only positive)
» s=pivot(s,row,col)
s =
   [ 0,  1,  0,  0,   2/21,   1/3,    0,   0,  -2/21,   170/21]
   [ 0,  0,  0,  1, -38/49, -12/7,  9/7,  -1,  38/49,   515/49]
   [ 0,  0,  1,  0, -8/147, -4/21,  1/7,   0,  8/147,  895/147]
   [ 1,  0,  0,  0,   -1/7,     0,    0,   0,    1/7,     20/7]
   [ 0,  0,  0,  0,   -3/7,     0,    1,   M,  3/7+M,    585/7]

» col=5                                   s4 enters
» row=1                                   x2 leaves (the only positive)
» s=pivot(s,row,col)
s =
   [ 0, 21/2,  0,  0,  1,  7/2,    0,   0,  -1,     85]
   [ 0, 57/7,  0,  1,  0,    1,  9/7,  -1,   0,  535/7]
   [ 0,  4/7,  1,  0,  0,    0,  1/7,   0,   0,   75/7]
   [ 1,  3/2,  0,  0,  0,  1/2,    0,   0,   0,     15]
   [ 0,  9/2,  0,  0,  0,  3/2,    1,   M,   M,    120]

Retrieving results from the optimal table:
» x=[15 0 75/7 535/7 85 0 0]'             Solution (x1 x2 x3 s3 s4 s1 s2)
x = 15.0000   0   10.7143   76.4286   85.0000   0   0

Activities: x1 = 15.0000
            x2 = 0
            x3 = 10.7143
Slacks:     s1 = 0                        1st constraint tight
            s2 = 0                        2nd constraint tight
            s3 = 76.4286                  surplus above 50
            s4 = 85                       surplus above 20
» z=s(end,end)
z = 120                                   objective value

See shadow costs of RHS:
sc1 = 3/2                                 worth of resource 1 (30)
sc2 = 1                                   worth of resource 2 (75)
(sc3 = sc4 = 0 <- not tight)
Reduced cost of (nonbasic) x2 = -9/2 (one unit of x2 would decrease z by 9/2)
Example 4.8
This session shows the Simplex Algorithm (two-phase method) using the table
approach. Comments added later are in italics, some empty lines and spaces
have been removed.
Assuming that the folder Z:\Matlab contains the file pivot.m

» cd Z:\Matlab
» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
    rmax=length(A(:,1));
    A(r,:)=A(r,:)/x;
    for i=1:rmax
        if i~=r
            A(i,:)=A(i,:)-A(r,:)*A(i,c);
        end
    end
end
a=A;
Entering A,b,c of the model:

Max z = 3x1 + 4x2 + 7x3
ST   2x1 + 3x2        ≤ 30
           4x2 + 7x3  ≤ 75
     2x1       + 9x3  ≥ 50
     7x1              ≥ 20
     (all nonnegative)

» A=[2 3 0;0 4 7;2 0 9;7 0 0]
A = 2   3   0
    0   4   7
    2   0   9
    7   0   0
» A=[A [0 0 -1 0]' [0 0 0 -1]']           Adding columns of negative slacks:
A = 2   3   0    0    0                   slack (surplus) s3, slack (surplus) s4
    0   4   7    0    0
    2   0   9   -1    0
    7   0   0    0   -1
» A=[A eye(4)]                            Adding columns of positive slacks and
A = 2   3   0    0    0   1   0   0   0   artificial variables: slack s1, slack s2,
    0   4   7    0    0   0   1   0   0   artificial a1, artificial a2
    2   0   9   -1    0   0   0   1   0
    7   0   0    0   -1   0   0   0   1
» b=[30 75 50 20]'
b = 30
    75
    50
    20
Phase I = minimization of a1 + a2
» cT=[zeros(1,7) 1 1]
cT = 0   0   0   0   0   0   0   1   1
» s=[A b; -cT 0]                          Initial simplex table (column labels added)
s =
    x1  x2  x3  s3  s4  s1  s2  a1  a2  rhs
     2   3   0   0   0   1   0   0   0   30
     0   4   7   0   0   0   1   0   0   75
     2   0   9  -1   0   0   0   1   0   50
     7   0   0   0  -1   0   0   0   1   20
     0   0   0   0   0   0   0  -1  -1    0

Not consistent - add row3 and row4 to rowz:
» s(end,:)=s(end,:) + s(3,:) + s(4,:)
s =  2   3   0   0   0   1   0   0   0   30
     0   4   7   0   0   0   1   0   0   75
     2   0   9  -1   0   0   0   1   0   50
     7   0   0   0  -1   0   0   0   1   20
     9   0   9  -1  -1   0   0   0   0   70
» col=1                                   x1 enters
» r=s(:,end)./s(:,col)                    ratio test
r = 15.0000
    Inf
    25.0000
    2.8571                                minimum
    7.7778
» row=4                                   a2 leaves
» s=pivot(s,row,col)
s =  0   3.0000   0         0   0.2857   1   0   0   -0.2857   24.2857
     0   4.0000   7.0000    0   0        0   1   0    0        75.0000
     0   0        9.0000   -1   0.2857   0   0   1   -0.2857   44.2857
     1   0        0         0  -0.1429   0   0   0    0.1429    2.8571
     0   0        9.0000   -1   0.2857   0   0   0   -1.2857   44.2857

» col=3                                   x3 enters
» r=s(:,end)./s(:,col)                    ratio test
r = Inf
    10.7143
    4.9206                                minimum
    Inf
    4.9206
» row=3                                   a1 leaves
» s=pivot(s,row,col)
s =  0   3.0000   0    0        0.2857   1   0    0        -0.2857   24.2857
     0   4.0000   0    0.7778  -0.2222   0   1   -0.7778    0.2222   40.5556
     0   0        1   -0.1111   0.0317   0   0    0.1111   -0.0317    4.9206
     1   0        0    0       -0.1429   0   0    0         0.1429    2.8571
     0   0        0    0        0        0   0   -1.0000   -1.0000   -0.0000

End of phase I: optimal table (a1 + a2 = 0 <-> a1 and a2 nonbasic)
Removing columns 8 and 9 of artificial variables:
» s(:,8:9)=[]
s =  0   3.0000   0    0        0.2857   1   0   24.2857
     0   4.0000   0    0.7778  -0.2222   0   1   40.5556
     0   0        1   -0.1111   0.0317   0   0    4.9206
     1   0        0    0       -0.1429   0   0    2.8571
     0   0        0    0        0        0   0   -0.0000

Phase II = maximization of 3x1 + 4x2 + 7x3
» cT=[3 4 7 0 0 0 0]
cT = 3   4   7   0   0   0   0
» s(end,:)=[-cT 0]                        new row z in the simplex table!
s =  0   3.0000   0    0        0.2857   1   0   24.2857
     0   4.0000   0    0.7778  -0.2222   0   1   40.5556
     0   0        1   -0.1111   0.0317   0   0    4.9206
     1   0        0    0       -0.1429   0   0    2.8571
    -3  -4.0000  -7    0        0        0   0    0

Not consistent - add 7*row3 + 3*row4 to rowz:
» s(end,:)=s(end,:) + 7*s(3,:) + 3*s(4,:)
s =  0   3.0000   0    0        0.2857   1   0   24.2857
     0   4.0000   0    0.7778  -0.2222   0   1   40.5556
     0   0        1   -0.1111   0.0317   0   0    4.9206
     1   0        0    0       -0.1429   0   0    2.8571
     0  -4.0000   0   -0.7778  -0.2063   0   0   43.0159
» col=2                                   x2 enters
» r=s(:,end)./s(:,col)                    ratio test
r = 8.0952                                minimum
    10.1389
    Inf
    Inf
    -10.7540
» row=1                                   s1 leaves
» s=pivot(s,row,col)
s =  0   1.0000   0    0        0.0952    0.3333   0    8.0952
     0   0        0    0.7778  -0.6032   -1.3333   1    8.1746
     0   0        1   -0.1111   0.0317    0        0    4.9206
     1   0        0    0       -0.1429    0        0    2.8571
     0   0        0   -0.7778   0.1746    1.3333   0   75.3968
» col=4                                   s3 enters
» row=2                                   s2 leaves (the only positive)
» s=pivot(s,row,col)
s =  0   1.0000   0   0   0.0952    0.3333   0         8.0952
     0   0        0   1  -0.7755   -1.7143   1.2857   10.5102
     0   0        1   0  -0.0544   -0.1905   0.1429    6.0884
     1   0        0   0  -0.1429    0        0         2.8571
     0   0        0   0  -0.4286    0        1.0000   83.5714
» col=5                                   s4 enters
» row=1                                   x2 leaves (the only positive)
» s=pivot(s,row,col)
s =  0   10.5000   0   0   1   3.5000   0         85.0000
     0    8.1429   0   1   0   1.0000   1.2857    76.4286
     0    0.5714   1   0   0   0        0.1429    10.7143
     1    1.5000   0   0   0   0.5000   0         15.0000
     0    4.5000   0   0   0   1.5000   1.0000   120.0000
Retrieving results from the optimal table:
» x=[15 0 10.7143 76.4286 85 0 0]'        Solution (x1 x2 x3 s3 s4 s1 s2)
x = 15.0000   0   10.7143   76.4286   85.0000   0   0

Activities: x1 = 15.0000
            x2 = 0
            x3 = 10.7143
Slacks:     s1 = 0                        1st constraint tight
            s2 = 0                        2nd constraint tight
            s3 = 76.4286                  surplus above 50
            s4 = 85                       surplus above 20
» z=s(end,end)
z = 120                                   objective value

See shadow costs of RHS:
sc1 = 1.5                                 worth of resource 1 (30)
sc2 = 1                                   worth of resource 2 (75)
(sc3 = sc4 = 0 <- not tight)
Reduced cost of (nonbasic) x2 = -4.5 (one unit of x2 would decrease z by 4.5)
4.6. Sensitivity Analysis
Numerical parameters of LP models, and of other mathematical models in
Operations Research generally, are often not known exactly. Very often the
actual values that represent availability of resources, contributions or costs are
just estimates of future values. That's why it is very important to evaluate how
much the current (optimal) solution depends on the actual values of the
parameters, and to update the solution if some parameter changes its value,
without solving the problem again from scratch. Considering LP problems,
changes in the model can result in one of these four cases:
1. The current (basic feasible optimal) solution remains unchanged.
2. The current solution becomes infeasible.
3. The current solution becomes non-optimal.
4. The current solution becomes both non-optimal and infeasible.
The methods to recover optimality and feasibility in the above cases are these
(case 1 requires no action):
2. Use the dual simplex method to recover feasibility.
3. Use the (primal) simplex method to obtain the new optimum.
4. Use both the primal and the dual simplex methods to obtain the new
solution.
To assess how sensitive the solution is to changes of a particular parameter,
we need the range of its values that keeps feasibility/optimality of the current
solution (whose actual objective value can change). The next paragraphs deal
with both problems. We shall consider only the right hand side values of the
(in)equalities and the coefficients of the objective function, because most
software packages provide sensitivity data on these parameters only. We shall
also assume that always only one parameter is changed while the others remain
constant.
For convenience let's repeat the contents of the simplex table of a feasible LP
problem bounded in objective value after reaching the optimal feasible solution.
Note that the columns of basic and nonbasic variables are in fact scattered in the
table because usually the original column labels are not changed during simplex
iterations.
BV    xNT             xBT    z    RHS
xB    B-1N            I      0    B-1b
z     cBTB-1N - cNT   0T     1    cBTB-1b
In the table B is the optimal basic matrix and N is the corresponding nonbasic
matrix (both made of the columns of the original m x n matrix A). n-vectors x
and c are divided accordingly. b is the m vector of the RHS values. Note also
that the inverted basic matrix B-1 is available in the columns that originally
contained the unity matrix - typically the last m columns and that the dual
optimal solution equal to shadow prices of primal RHS values are also available
in the columns of the z-row corresponding to the slack variables of the primal
model. Of course the primal and dual optimal objective values are equal.
Changes in the right-hand side values
From the above table it is evident that these changes can not affect optimality of
the current solution because the values zj - cj in the z-row don't depend on b.
Changes in b change the values b* = B-1b of the basic variables and the
objective value cBTB-1b. If the new values of the basic variables are still
nonnegative, we have new feasible optimum. Otherwise it is necessary to apply
the dual simplex method to recover feasibility.
Range in which elements of b can vary (feasibility range)
Let's assume that the new i-th RHS value bi is changed to bi + di , i = 1 … m and
that all the other RHS values are not changed. Let's call the new vector of RHS
values b'. The condition B-1b' ≥ 0 can be in this case expressed as:
b* + (B-1)idi ≥ 0
where b* are the current values of basic variables and (B-1)i is the i-th column of
B-1. This is a set of inequalities:
bj* + (B-1)jidi ≥ 0 , j = 1 … m
All these inequalities must be satisfied, which defines the minimum and the
maximum possible value of di, or in other words the range of bi values that
keeps feasibility of the current solution.
Example 4.9
Consider the LP model:
max 8 x1 + 5 x2 + 10 x3
subject to
2 x1 + 3 x2 + x3 ≤ 400
x1 + x3 ≤ 150
2 x1 + 4 x3 ≤ 200
x2 ≤ 50
x1 , x2 , x3 ≥ 0
The optimal table is:

Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          0     0    -3       1    0   -1     -3        50
x5          0     0    -1       0    1  -0.5     0        50
x1          1     0     2       0    0   0.5     0       100
x2          0     1     0       0    0    0      1        50
z           0     0     6       0    0    4      5      1050

The basic matrix B corresponding to the optimal table (the columns of A for
x4, x5, x1, x2) and its inverse are:

B = 1   0   2   3        B-1 = 1   0  -1     -3
    0   1   1   0              0   1  -0.5    0
    0   0   2   0              0   0   0.5    0
    0   0   0   1              0   0   0      1
Suppose we would like to know the range in which b3 can vary such that the
feasibility conditions still hold, in other words, such that:

b' = b* + (B-1)3 d3 ≥ 0

    ( 50)   (-1  )         (0)
    ( 50) + (-1/2) d3  ≥   (0)
    (100)   ( 1/2)         (0)
    ( 50)   ( 0  )         (0)

⇒ d3 ≤ 50, d3 ≤ 100, d3 ≥ -200
Thus b3 can vary in the range [0, 250].
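This range can be checked mechanically. A MATLAB sketch using the data of the example (the variable names are assumptions of this illustration):

Binv = [1 0 -1 -3; 0 1 -0.5 0; 0 0 0.5 0; 0 0 0 1];
b    = [400 150 200 50]';
bs   = Binv*b;                          % current basic values b* = (50 50 100 50)'
col  = Binv(:,3);                       % (B-1)_3, the column multiplying d3
% feasibility requires bs + col*d3 >= 0 elementwise:
dmax = min(-bs(col<0)./col(col<0));     % d3 <= 50
dmin = max(-bs(col>0)./col(col>0));     % d3 >= -200
% so b3 = 200 + d3 ranges over [0, 250]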
Changes in the objective coefficients
From the above table it is evident that these changes can not affect directly the
values of the basic variables b* = B-1b, but they can affect optimality. So
generally it is necessary to apply the primal simplex method to reach optimality
again. Let's take separately the cases where the changing coefficient ck relates to
a nonbasic and to a basic variable xk.
1. Assume xk is a nonbasic variable
In this case cB (and thus the objective value) remains unchanged. The values
zj - cj in the z-row are also not changed, except the value zk - ck in the xk
column, which becomes zk - ck', where ck' is the new value of the coefficient.
If the new value zk - ck' keeps optimality (remains non-positive/non-negative
for minimization/maximization problems), the current solution is not changed.
Otherwise xk has to enter the basis, so we use the primal simplex method to do
so and to perform possibly more iterations until optimality is recovered. In this
case we get a new optimum (generally more basic variables can leave).
Optimality condition can also be used to find the range of ck values that don't
change the current solution.
Example 4.9 (Continued)
Recall the optimal table:

Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          0     0    -3       1    0   -1     -3        50
x5          0     0    -1       0    1  -0.5     0        50
x1          1     0     2       0    0   0.5     0       100
x2          0     1     0       0    0    0      1        50
z           0     0     6       0    0    4      5      1050
The NBV are x3, x6 and x7. Since only x3 corresponds to the decision variables
(x6 and x7 are slacks), we are interested in the range in which c3 can vary so
that the optimality condition z3 - c3' ≥ 0 still holds:

z3 - c3' = (z3 - c3) - (c3' - c3) = 6 - (c3' - 10) ≥ 0  ⇒  c3' ≤ 16

Thus the range in which c3 can vary from the current value of 10 is (-∞, 16].
2. xk is basic.
In this case cB is changed, which affects the objective value and the optimality
conditions generally. Let's assume that xk ≡ xBt and the old value cBt is replaced
by cBt' (note that xk is the t-th basic variable). Let the new value of zj be zj'. Then
we can calculate the new values of the (negative) reduced costs in the z-row:
zj' - cj = cB'TB-1aj - cj = cB'Tyj - cj = (cBTyj - cj) + (0, 0, …, cBt' - cBt, …, 0)yj
         = (zj - cj) + (cBt' - cBt)ytj        for all j ∈ R

where yj is the j-th column of the simplex table and R is the set of indices of
nonbasic variables. Note that the reduced costs of basic variables remain zero
by definition of the simplex table. So to get the new z-row, we multiply the
current row t of the optimal simplex table by the net change in the cost
(cBt' - cBt) and add it to the original z-row. Then it is necessary to restore zero
in the k-th column. This will also produce the new objective value

cB'TB-1b = cBTB-1b + (cBt' - cBt)bt*.
Of course the new values in the z row can violate optimality conditions, so
again it may be necessary to use the primal simplex method to find the new
optimum. Assuming minimization, optimality can be expressed by a set of
inequalities:
(zj - cj) + (cBt' - cBt)ytj ≤ 0 for all j ∈ R
These inequalities can be used to find the range of the cBt values that keep
optimality. For maximization the only change is the inequality sign.
Example 4.9 (Continued)
Recall the optimal table:

Solution   Products            Slack Variables         Solution
Variable   x1    x2    x3      x4   x5   x6     x7     Quantity
x4          0     0    -3       1    0   -1     -3        50
x5          0     0    -1       0    1  -0.5     0        50
x1          1     0     2       0    0   0.5     0       100
x2          0     1     0       0    0    0      1        50
z           0     0     6       0    0    4      5      1050
The BV are (xB1, xB2, xB3, xB4)T = (x4, x5, x1, x2)T. The solution variables are
x1 and x2. Consider the BV x1 = xB3.
Recall the set of indices R corresponding to the NBV: R = {3, 6, 7}.

i.   j = 3: (z3 - c3) + (c1' - c1)y33 = 6 + (c1' - 8)·2 ≥ 0      ⇒ c1' ≥ 5
ii.  j = 6: (z6 - c6) + (c1' - c1)y36 = 4 + (c1' - 8)·(1/2) ≥ 0  ⇒ c1' ≥ 0
iii. j = 7: (z7 - c7) + (c1' - c1)y37 = 5 + (c1' - 8)·0 ≥ 0      ⇒ no condition

Thus the range of c1 is [5, ∞).
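Such ranges are easy to verify from the optimal table. A small MATLAB sketch for the case above (c1new is a hypothetical trial value; the two vectors are read off the z-row and the x1-row in the nonbasic columns R = {3, 6, 7}):

zrow  = [6 4 5];                        % z_j - c_j for j in R
yt    = [2 0.5 0];                      % row t = 3 (the x1 row) in the R columns
c1new = 6;                              % hypothetical new coefficient of x1
d     = c1new - 8;                      % net change cBt' - cBt
ok    = all(zrow + d*yt >= 0)           % true exactly when c1new >= 5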
Chapter 5
Duality Theory
Together with every linear programming (LP) problem, there is an associated
LP problem referred to as its dual. The importance of introducing the dual
problem lies in the fact that it has many properties in common with its primal.
In fact, sometimes, one can solve the dual problem with the intent to solve the
primal. The notion of a dual model also leads to rich practical and economical
interpretation of both. To create a dual problem, there are these basic rules:
• Change the type of the problem (Max ↔ Min).
• There is one dual variable for each primal constraint.
• There is one dual constraint for each primal variable.
The following table defines the relationships:

Minimization Problem                Maximization Problem

Variables     ≥ 0                   Constraints   ≤
              ≤ 0                                 ≥
              unrestricted                        =

Constraints   ≥                     Variables     ≥ 0
              ≤                                   ≤ 0
              =                                   unrestricted
Moreover this is the relationship of the vectors involved, assuming that the
usual interpretation of A, cT, b, and x holds for the primal:

Primal:   A     b    cT    x
Dual:     AT    c    bT    w
Vectors are considered as column vectors, row vectors are transposed (c is a
column vector, cT is a row vector). The general pattern is the following where
the proper constraint signs and bounds of variables are defined by the above
table:
Primal                         Dual
Max (Min)  cTx                 Min (Max)  bTw
ST         Ax ? b              ST         ATw ? c
           x ? 0                          w ? 0
There are two special cases of duality with simplified conversion rules:

Canonical form of duality:
Min  cTx                       Max  bTw
ST   Ax ≥ b                    ST   ATw ≤ c
     x ≥ 0                          w ≥ 0

Standard form of duality:
Min  cTx                       Max  bTw
ST   Ax = b                    ST   ATw ≤ c
     x ≥ 0                          w unrestricted
or
Max  cTx                       Min  bTw
ST   Ax = b                    ST   ATw ≥ c
     x ≥ 0                          w unrestricted
Example 5.1:
Primal :
Min 3x1 + 4x2
ST 3x1 + 4x2 ≥ 3
6x1 + 9x2 ≥ 10
x1, x2 ≥ 0
Dual :
Max 3w1 + 10w2
ST 3w1 + 6w2 ≤ 3
4w1 + 9w2 ≤ 4
w1, w2 ≥ 0
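Assuming the Optimization Toolbox function linprog is available, the two problems of Example 5.1 can be solved numerically and strong duality checked; linprog minimizes, so the dual maximization is solved with a negated objective and the ≥ constraints are negated into ≤ form:

% primal: min 3x1 + 4x2  ST  3x1 + 4x2 >= 3, 6x1 + 9x2 >= 10, x >= 0
[x,zp] = linprog([3;4], -[3 4; 6 9], -[3;10], [], [], zeros(2,1));
% dual: max 3w1 + 10w2  ST  3w1 + 6w2 <= 3, 4w1 + 9w2 <= 4, w >= 0
[w,zd] = linprog(-[3;10], [3 6; 4 9], [3;4], [], [], zeros(2,1));
[zp, -zd]                               % the two optimal values should coincide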
5.1 Primal - Dual Relationships
Theorem 5.2: The dual form of the dual LP problem is the primal LP problem.
Proof: This theorem shall be proved for the canonical form of duality. Consider
the following dual LP problem D (the primal form is stated above):
Max bTw
ST
ATw ≤ c
w≥0
This LP problem may also be stated as:

Max  Σj=1..m bj wj
ST   Σj=1..m Aji wj ≤ ci ,   i = 1 … n

Note that b and w are m-vectors, c is an n-vector and AT is an n×m matrix, so
in the sum the i-th column of A appears in place of the i-th row of AT. The
same optimal values wj may be obtained if the objective function was changed
to:
Min  Σj=1..m (-bj) wj
So it holds that maximization of bTw can be converted to minimization of
(-bT)w. Similarly it is possible to multiply the inequalities by -1 to change the
signs. Thus using this transformation, the dual LP problem can be converted to:
Min (-bT)w
ST
(-AT)w ≥ (-c)
w≥0
This is the second form of the dual D with the same optimum and the same
optimal values of the variables w. Now, upon taking the dual of D - using the
method described above for the canonical form of duality - we get this problem:
Max (-cT)x
ST
(-A)x ≤ (-b)
x≥0
But by applying the above transformation in reverse order, we get:
Min cTx
ST
Ax ≥ b
x≥0
This is the primal problem. Therefore the dual of the dual is the primal. It
means that the terms primal or dual can be exchanged. The term primal is used
for the original problem from the application point of view.
The next theorem is known as the Weak Duality Property.
Theorem 5.3: The objective value for any feasible solution to the minimization
problem is always greater than or equal to the objective value for any feasible
solution to the maximization problem.
Proof: Consider the canonical form of duality. Let xo and wo be any two
feasible solutions for the respective primal and dual problems. To prove the
theorem, let's express the constraints in canonical form for the primal:
Axo ≥ b
(1)
xo ≥ 0
and for the dual solution:
ATwo ≤ c
(2)
wo ≥ 0
Now, multiply (1) by woT on the left-hand side, thus giving woTAxo ≥ woTb or
alternatively woTAxo ≥ bTwo. Then we can convert (2) into woTA ≤ cT and then
multiply by xo on the right hand side, thus giving woTA xo ≤ cTxo. Using these
two inequalities derived above, we get:
(From maximizing dual) bTwo ≤ cTxo (From minimizing primal)
which proves the theorem. With this result, the following two corollaries
follow:
Corollary 5.4: If xo and wo are feasible solutions to the primal and the dual
problems such that bTwo = cTxo, then xo and wo are optimal solutions to their
respective problems.
Corollary 5.5: If either problem has an unbounded objective value, then the
other problem possesses no feasible solution.
The second corollary suggests that if the primal has no minimum optimum
value (objective value → -∞), then there is no maximum optimum value that
can be reached for the dual according to the weak duality property. The same
argument may be applied if the maximum value of the dual may never be
reached (objective value → +∞). In this case there is no minimal optimal value
that may be reached in the primal.
Note that this corollary does not suggest that an infeasible primal/dual implies
an unbounded dual/primal (i.e. the theorem does not work in reverse order).
There are examples where both the primal and the dual are infeasible. Thus
infeasibility in the primal may imply also infeasibility in the dual. To
summarize: infeasibility in the primal implies infeasibility or unboundedness in
the dual.
The next important duality property is based on the so-called Karush-Kuhn-
Tucker (KKT) Optimality Conditions, given here without proof.
Theorem 5.6: Suppose that x* is an optimal point for the following LP (primal)
minimization problem:
Min cTx
ST
Ax ≥ b
x≥ 0
Then, there exists a vector w* such that:
1) Ax* ≥ b ; x* ≥ 0,
2) ATw* ≤ c ; w* ≥ 0,
3) w*T(Ax* - b) = 0 and x*T(c - ATw*) = 0
Similarly suppose that x* is an optimal point for the following LP (primal)
maximization problem:
Max cTx
ST
Ax ≤ b
x≥ 0
Then, there exists a vector w* such that:
4) Ax* ≤ b ; x* ≥ 0,
5) ATw* ≥ c ; w* ≥ 0,
6) w*T(b - Ax*) = 0 and x*T(ATw* - c) = 0
Note that w* can be interpreted as the dual optimal solution. The meaning of the
first condition (1) or (4) respectively is that if x* is to be optimal, then it has to
be a primal feasible point. The same thing may be said for the meaning of the
second condition (2) or (5) respectively. This time, it is the dual optimal
solution that must be feasible. The last condition (3) or (6) respectively is the
most important condition for it leads to the proof of the Strong Duality
Property:
Theorem 5.7: If one problem possesses an optimal solution, then both problems
possess optimal solutions and the two optimal values are equal.
Proof: Using both equations which are listed in the KKT condition (3), it may
be directly shown that cTx* = bTw*. The weak duality property suggests that w*
must be an optimal solution to the dual problem (since in general, bTw ≤ cTx and
equality is reached only if optimality is reached for both problems). Therefore
w* must maximize bTw over the dual feasible region. Similarly the KKT
optimality condition (6) for the dual (maximization) problem implies the
existence of a primal feasible solution whose objective is equal to that of the
optimal dual. These arguments complete the proof.
Strong duality property has important consequences. Let x and w be any two
feasible solutions in the primal and dual LP problems respectively. The strong
duality property suggests a simple method to check if the two solutions are
optimal values to their respective problems. The method would be to check if
cTx = bTw. If this is true, x and w would both be optimal. This is known as The
Supervisor’s Principle. Another important relationship is the so called
Complementary Slackness:
Theorem 5.8: Let x and w be any two feasible solutions to the primal and the
dual problems in the canonical form. Then, they are respectively optimal iff

• xj(cj - ajTw) = 0 ,  j = 1 … n
• wi(aiTx - bi) = 0 ,  i = 1 … m

Alternatively, at least one of the two terms of each equation must be zero:
• xj > 0    ⇒  cj = ajTw
• ajTw < cj ⇒  xj = 0
• wi > 0    ⇒  aiTx = bi
• aiTx > bi ⇒  wi = 0

where ai is the i-th row of A and aj is the j-th column of A.
Proof: The KKT condition (3) says that if both x and w are optimal to their
respective problems, then:
w*T(Ax* - b) = 0 and x* T(c – ATw*) = 0
The theorem is just the expansion of these conditions. Using the first two
conditions also the following is true:
Ax* – b ≥ 0, c – ATw* ≥ 0, w* ≥ 0, and x* ≥ 0
This is used in the above implications.
The complementary slackness can be stated verbally in this way:
1) If a variable in one problem is positive, then the corresponding constraint in
the other problem must be tight (it must be equality). The opposite is not true
(in case of degeneracy a zero variable can correspond to a tight constraint).
2) If a constraint in one problem is not tight, then the corresponding variable in
the other problem must be zero. Again the opposite is not true. A tight
constraint can correspond to a zero variable.
Complementary slackness can be used to find an optimal solution to a problem
provided the optimal solution to its dual is known. Assume that we know the
optimal dual solution w. The matrix A is known, so we can compute the
products ajTw and compare them with known values cj. If the values are not
equal, the corresponding primal variables are zero. In case of equality the
corresponding primal variables xj are shadow prices (costs) of the tight dual
constraints - see the next paragraph. By exchanging the words dual and primal
in this reasoning it is possible to find the optimal dual solution provided the
optimal primal solution is known. See also the example at the end of the next
paragraph.
Results from the above theorems and corollaries can be combined into the
Fundamental theorem of Duality:

Theorem 5.9: With regard to the primal and dual linear programming problems,
exactly one of the following statements is true:

1. Both possess optimal solutions x* and w* with cTx* = bTw*.
2. One problem has unbounded objective value, in which case the other
   problem must be infeasible.
3. Both problems are infeasible.
5.2 Economic Interpretation of Duality:
A linear programming problem may be viewed as an allocation of resources to
achieve the desired optimal value, that is, either to maximize the profit, or to
minimize the cost (loss). This is achieved by varying certain variables, which
may represent time for work, material, and other factors encountered in any
particular task. Generally they are called activities. The allocation is limited by
a certain number of constraints, which may represent the maximum amount of
material available (in case of distribution of materials), maximum time that may
be spent on each job, etc. An economic interpretation shall be given both when
the primal is a minimizing and maximizing LP problem.
First, suppose the primal is a minimizing LP problem. Then the corresponding
LP problems will be of the form:
Primal:  Min z = cTx           Dual:  Max z' = bTw
ST       Ax ≥ b                ST     ATw ≤ c
         x ≥ 0                        w ≥ 0
Note that in general, z’ ≤ z by the weak duality property, but at optimality the
inequality reduces to equality (by the strong duality property).
At optimality z* = cTx* = bTw*. Let's compute the first partial derivative of z*
with respect to the primal RHS value bi:

∂z*/∂bi = wi*
This means that wi* is equal to the rate of change of the optimal (primal/dual)
objective value with respect to bi (availability of the i-th resource), given that
the current nonbasic variables are held at zero. This value will be defined as the
shadow price (also called shadow cost).
dual problem, wi* ≥ 0. Thus the optimal value will increase or at least stay
constant as any variable of b increases. The opposite will happen if the value of
bi decreases. An increase in any binding resource should be accompanied (at
optimal values) by an increase in the net cost or profit (according to the type of
the problem).
Interpreting the minimizing primal: Suppose you are managing a company
which needs to produce m outputs in quantities of at least bi units each.
However, you are interested in a production that would satisfy the request of bi
outputs at the minimal cost of production for the company. If aij denotes the
amount of product i generated by one unit of activity j, and if xj represents the
number of units of activities employed, then Σj=1..n aij xj represents the units of
output i produced by the n activities. This expression should be greater than or
equal to the required amount bi - the minimum output that must be produced.
Finally, if cj denotes the cost of any activity j, then clearly the objective
function (which should lead to the minimization of the cost) should be

Minimize  Σj=1..n cj xj.
Interpreting the maximizing dual: Now, suppose that instead of managing the
company, you are its customer. You need to buy specified amounts bi of
goods/outputs (i = 1…m), on which you need to agree on the unit prices, wi, for
each product i. Since aij is the number of units of output i produced by one unit
of activity j, then Σi=1..m aij wi can be interpreted as the price which will be paid
for one unit of activity j. You stipulate that the price of each activity does not
exceed cj. This condition may be summarized by the inequality

Σi=1..m aij wi ≤ cj.

On the other side, the company needs to maximize the profit from the selling of
the products. Therefore, the objective function will be

Maximize  Σi=1..m wi bi.

Thus the dual LP problem may be formed.
The strong duality property suggests that there is equality in the two optimal LP
values. Therefore, the minimal production cost (deduced by the primal) is
equal to the maximal return. In fact, this should be intuitively true, since both
the objective values represent the fair charge of the customer.
Now, interchange the roles of the primal and dual problems to give an
economic interpretation for the following two problems:
Primal:  Max z = cTx           Dual:  Min z' = bTw
ST       Ax ≤ b                ST     ATw ≥ c
         x ≥ 0                        w ≥ 0
Interpreting the maximizing primal: Suppose n products are being produced
with m types of resources. In this case xj will represent the number of units (j =
1 … n) that are produced of product j, and bi will represent the number of units
available of the resource i (i = 1 … m). A product j would provide the company
with a profit of cj per unit. Note that in this case, aij will be the number of
units of resource i needed to produce one unit of product j.
Interpreting the minimizing dual: For the interpretation of the dual, let wi
denote the fair price to be put on one unit of resource i. Suppose that the
manufacturer is now renting out the m resources at unit prices w1, …, wm
instead of manufacturing the mix of products. Then every unit of product j not
manufactured would result in a loss of profit cj, since it has not been produced
and sold. The renting should at least compensate for this loss in profit. Thus the
prices set on the resources should be such that the renting income is not smaller
than the income from production for each product:

Σi=1..m aij wi ≥ cj.

Still, the renting company seeks to minimize the total rent to eliminate any
competition, and thus the dual objective function is formed in order to

Minimize  Σi=1..m bi wi.
Example 5.10 (Maximization Primal – Minimization Dual)
A company produces two types of paints, namely paint A, and paint B.
Production of both paints is made by the use of two basic materials, namely
M1, and M2. The production involves mixing specific quantities of each
material for every ton of either paint A, or paint B. These quantities in tons are
summarized in the following table:

                       Paint A (x1)   Paint B (x2)   Available Resources
Raw Material M1             6               4              24  (a)
Raw Material M2             1               2               6  (b)
Profit per ton of
paint (×$1000)              5               4
The aim of the company is obviously to maximize the profit.
Let x1, and x2 be the daily production (rates in tons) of paint A, and paint B
respectively.
Then, using the above table the primal LP problem may be
formulated as follows:
Maximize  z = 5x1 + 4x2
ST        6x1 + 4x2 ≤ 24     (a)
          x1 + 2x2 ≤ 6       (b)
          x1, x2 ≥ 0

Upon taking the dual of this LP problem we get:

Minimize  w = 24y1 + 6y2
ST        6y1 + y2 ≥ 5
          4y1 + 2y2 ≥ 4
          y1, y2 ≥ 0
Suppose that the company, instead of manufacturing the paints, now wants to
sell the resources; then y1 and y2 will be the unit prices that the company sets
for the sale. If one unit of, say, x1 (paint A) is not manufactured, this results
in the loss of $5,000 per ton of profit from manufacturing (since z = 5x1 + 4x2).
Therefore, to avoid running at a loss, selling the resources must not yield less
than manufacturing would. By selling resources, the company will have an
alternative income of 6y1 + y2 (×$1,000) per ton of paint A not produced,
because the coefficients (6, 1) represent the amounts of the resources needed to
produce one ton of paint A. Therefore, to run at a profit, this alternative income
should at least match the profit forgone: 6y1 + y2 ≥ 5.
Applying the same reasoning to paint B yields the other constraint of the dual.
The company's main objective is to eliminate competition. Thus the company
should aim to minimize the selling price of the resources (without producing
any loss in profit). This justifies the objective function of the dual, i.e.,
Minimize w = 24y1 + 6y2.
By the use of the simplex method, it may be shown that the optimal values of
the primal LP problem are:
x1 = 3 ; x2 = 1.5 ; z = 21
Interpreting the primal result: The maximum profit in manufacturing the two
types of paint is $21,000 daily. This is achieved by producing 3 tons of paint A
and 1.5 tons of paint B daily.
Applying the simplex method to solve the dual problem we get:
y1 = 0.75 ; y2 = 0.5 ; w = 21
Interpreting the dual result: The fair price to put on the resources is $750 per
unit of resource 1 (material M1, constraint (a) of the primal) and $500 per unit
of resource 2 (material M2, constraint (b) of the primal).
Now let's assume that we know only the optimal primal solution and apply the
complementary slackness property to find the optimal dual solution. Both
primal constraints are tight:
6x1 + 4x2 = 24    (a)
x1 + 2x2 = 6      (b)
So the corresponding dual variables are the shadow prices of these binding
constraints. Solving the primal by the simplex method gives shadow prices 0.75
and 0.5.
Similarly let's assume that we know only the optimal dual solution and apply
the complementary slackness property to find the optimal primal solution.
Both dual constraints are tight:
6y1 + y2 = 5
4y1 + 2y2 = 4
The primal variables are then their shadow prices; the simplex method gives
their values as 3 and 1.5.
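These results can be checked numerically. The following is a minimal MATLAB
sketch (assuming the Optimization Toolbox function linprog is available; since
linprog minimizes, the maximizing primal is entered with a negated objective,
and the ≥ constraints of the dual are multiplied by −1 into ≤ form):

» f=[-5;-4]; A=[6 4;1 2]; b=[24;6];
» x=linprog(f,A,b,[],[],[0;0])        % primal optimum: x = [3; 1.5]
» z=5*x(1)+4*x(2)                     % z = 21
» g=[24;6]; D=-[6 1;4 2]; d=-[5;4];
» y=linprog(g,D,d,[],[],[0;0])        % dual optimum: y = [0.75; 0.5]
» w=g'*y                              % w = 21 = z, as strong duality asserts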
Example 5.11 (Minimization Primal – Maximization Dual):
Suppose that a family is trying to construct a minimal cost diet from six
available primary foods (numbered 1–6) so that the diet contains at least 9 units
of vitamin A and 19 units of vitamin C. The following table shows the data on
the foods.
                      Number of Units of Nutrients per kg of Food    Minimum Daily
Nutrient               1     2     3     4     5     6               Requirement of Nutrient
Vitamin A              1     0     2     2     1     2               9
Vitamin C              0     1     3     1     3     2               19
Cost of food (c/kg)    35    30    60    50    27    22
The primal model:
min z = 35x1 + 30x2 + 60x3 + 50x4 + 27x5 + 22x6
s.t.
x1 + 2x3 + 2x4 + x5 + 2x6 ≥ 9
x2 + 3x3 + x4 + 3x5 + 2x6 ≥ 19
x1, …, x6 ≥ 0
Now suppose that a manufacturer proposes to make synthetic pills of each
nutrient and to sell them to this family. The manufacturer has to persuade the
family to meet all the nutrient requirements by using the pills instead of the
primary foods. However, the family will not use the pills unless the
manufacturer can convince them that the prices of the pills are competitive
when compared with each of the primary foods. This imposes several
constraints on the prices the manufacturer can charge for the pills. Let w1 and
w2 be the prices of vitamin A and vitamin C respectively in pill form.
Consider, say, primary food 5. One kg of this food contains one unit of vitamin
A and 3 units of vitamin C and costs 27 cents. Thus the family will not buy the
pills unless w1 + 3w2 ≤ 27. Similar constraints hold for the other primary foods.
Also, since the family is cost conscious, if they decide to use the pills instead of
the primary foods, they will buy just as many pills as are required to satisfy the
minimal nutrient requirements exactly.
Hence, the manufacturer’s sales
revenue will be v = 9w1+19w2, and the manufacturer wants to maximize his
revenue. Thus the prices that the manufacturer can charge for the pills are
obtained by solving the following dual LP model.
max v = 9w1 + 19w2
s.t.
w1 ≤ 35
w2 ≤ 30
2w1 + 3w2 ≤ 60
2w1 + w2 ≤ 50
w1 + 3w2 ≤ 27
2w1 + 2w2 ≤ 22
w1, w2 ≥ 0
The price w1 is associated with the nonnegative primal slack variable
x7 = x1 + 2x3 + 2x4 + x5 + 2x6 − 9, while the price w2 is associated with the
nonnegative primal slack variable x8 = x2 + 3x3 + x4 + 3x5 + 2x6 − 19.
The following are the results obtained by solving the primal and the dual
models by the package LINDO.
Results:
PRIMAL:
OBJECTIVE FUNCTION VALUE
1)    179.0000

VARIABLE    VALUE       REDUCED COST
X1          0.000000    32.000000
X2          0.000000    22.000000
X3          0.000000    30.000000
X4          0.000000    36.000000
X5          5.000000     0.000000
X6          2.000000     0.000000

            SLACK OR SURPLUS    SHADOW COSTS/PRICES
X7          0.000000            -3.000000
X8          0.000000            -8.000000
DUAL:
OBJECTIVE FUNCTION VALUE
1)    179.0000

VARIABLE    VALUE        REDUCED COST
W1          3.000000     0.000000
W2          8.000000     0.000000

            SLACK OR SURPLUS    SHADOW COSTS/PRICES
W3          32.000000           0.000000
W4          22.000000           0.000000
W5          30.000000           0.000000
W6          36.000000           0.000000
W7          0.000000            5.000000
W8          0.000000            2.000000
Recall that in an LP model, the rate of change in the optimal objective function
value per unit change in the right-hand-side value of a constraint (keeping all
other values fixed) is known as the shadow cost of that constraint.
Thus in this case, the shadow costs of the primal LP problem represent the
amount of extra money the family has to spend on an optimal diet per unit
increase in the requirement of each vitamin, i.e. 3 cents for vitamin A and 8
cents for vitamin C. Thus, the price charged by the manufacturer, per unit of
vitamin, is acceptable to the family if the price of each vitamin is less than or
equal to the shadow cost of that vitamin in the primal problem. Therefore, in
order to maximize his revenue, the manufacturer must price the vitamins at 3
and 8 cents per unit respectively.
Hence, in an optimum solution of the dual problem, the prices w1 and w2
correspond to the shadow costs of vitamins A and C respectively. Similarly, in
any LP, the dual variables are the shadow costs/prices of the resources
associated with the constraints in the primal problem.
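The LINDO results above can be reproduced with a similar MATLAB sketch
(again assuming linprog, with the ≥ constraints negated into ≤ form):

» f=[35;30;60;50;27;22];
» A=-[1 0 2 2 1 2;0 1 3 1 3 2]; b=-[9;19];
» x=linprog(f,A,b,[],[],zeros(6,1))   % x = [0 0 0 0 5 2]', cost 179
» g=-[9;19];                          % dual: maximize 9w1 + 19w2
» D=[1 0;0 1;2 3;2 1;1 3;2 2]; d=[35;30;60;50;27;22];
» w=linprog(g,D,d,[],[],[0;0])        % w = [3; 8], revenue 179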
5.3 Dual Simplex Method
Sometimes it happens that a basic solution to an LP problem is not feasible
though it is optimal, in the sense that the negative reduced costs zj − cj are ≥ 0
in the case of maximization and ≤ 0 in the case of minimization. This may
happen, for example, if an optimal solution has been calculated for a particular
LP problem and then a new problem has to be solved with different RHS
values. Since the optimality conditions are then satisfied (the primal problem is
optimal), the dual is feasible though not optimal, so we want to pivot in such a
way as to make the dual problem optimal. Instead of building a new simplex
table for the dual problem, it is more practical to work from the primal
"infeasible though optimal" tableau. This technique is known as the Dual
Simplex Method.
The idea involved in this method is to retain optimality of the primal tableau
while reaching for feasibility. Note that in the dual problem, the opposite
procedure is being carried out, that is maintaining feasibility while reaching
optimality.
Consider the following problem:
Min cTx
ST
Ax ≥ b
x≥0
Let B be a basic matrix of this LP problem. The basis need not be feasible:
after subtracting surplus variables, or equivalently adding negative slacks (no
artificial variables are introduced), we may get an infeasible initial solution in
the following simplex table:
BV     z    x1      x2      …    xn      xn+1       …    xn+m        RHS
xB1    0    y11     y12     …    y1n     y1,n+1     …    y1,n+m      b1*
xB2    0    y21     y22     …    y2n     y2,n+1     …    y2,n+m      b2*
.      .     .       .            .        .               .           .
xBm    0    ym1     ym2     …    ymn     ym,n+1     …    ym,n+m      bm*
z      1    z1-c1   z2-c2   …    zn-cn   zn+1-cn+1  …    zn+m-cn+m   cBTb*

(x1 … xn are the solution variables and xn+1 … xn+m the slack variables; the
an+i columns of the original constraints form −I, so after multiplying the
constraints by −1 the slack columns of the initial table form the identity
matrix.)
If for all i, bi* ≥ 0 , then the table represents a primal feasible solution (obtained
by multiplying all rows by -1). Also, if for all j, zj – cj ≤ 0 then optimality for
the primal problem has been reached.
Example 5.12
min 3x1 + 2x2
subject to
3x1 + x2 ≥ 3
4x1 + 3x2 ≥ 4
x1, x2 ≥ 0
Expressing the problem in standardized form by subtracting surplus variables
and multiplying the constraints by −1 (giving negative RHS values):
min 3x1 + 2x2 + 0x3 + 0x4
subject to
−3x1 − x2 + x3 = −3
−4x1 − 3x2 + x4 = −4
x1, x2, x3, x4 ≥ 0
Primal Tableau:
BV    x1    x2    x3    x4    RHS
x3    -3    -1     1     0    -3
x4    -4    -3     0     1    -4
z     -3    -2     0     0     0
Initial Tableau satisfies optimality conditions (zj-cj ≤ 0) but is infeasible since
b* < 0 .
Now we are going to show that optimality in the primal problem is equivalent
to feasibility in the dual problem.
Let's define wT = cBTB-1.
For all j = 1 … n, we have, by definition
zj – cj = cBTB-1aj - cj = wTaj - cj = ajTw - cj
At primal optimality, zj – cj ≤ 0 so using the above equation, ajTw ≤ cj , or in
matrix form:
ATw ≤ c
Further, an+i = -ei for i = 1 … m (because in this problem one negative slack is
added for every constraint and no artificial variables are introduced) and also
cn+i = 0 (because no objective function coefficients are assigned to slack
variables). Thus:
zn+i – cn+i = wTan+i – cn+i = wT(-ei) - 0 = -wi
or
zn+i = -wi
Thus if zn+i – cn+i ≤ 0 (for i = 1 … m), then wi ≥ 0, for all i or in matrix form:
w≥0
Thus, together we have ATw ≤ c and w ≥ 0 which defines the dual feasible
region. But these constraints have been derived from the primal optimality
conditions zj – cj ≤ 0. Thus the primal optimality implies dual feasibility.
Also at (primal) optimality w*T = cBTB-1, where B is the optimal basis. Then the
dual objective value is:
w*Tb = bTw* = (cBTB-1)b = cBT(B-1b) = cBTb* = z*
(the primal optimal objective value). Thus at feasibility, the primal and dual
optimal objectives will be equal (this has already been proved as the strong
duality property). These arguments lead to the following lemma:
Lemma 5.13:
At optimality of the primal minimizing problem in canonical form (i.e. zj − cj ≤ 0
∀j), w*T = cBTB-1 is an optimal solution to the dual problem. Furthermore,
wi* = −zn+i for i = 1 … m.
Looking again at the initial LP problem, we can add negative slack variables to
put the LP problem in the form:
Min cT x
ST Ax = b
x≥0
Without the use of artificial variables, generally, it is difficult to find an initial
basic feasible solution. So the starting basis B need not be feasible. However, B
will be dual-feasible since, for all j, zj – cj ≤ 0 which means that it will be
primal-optimal.
Algorithm 5.14 (Dual Simplex Algorithm)
Looking back at the above simplex table, the algorithm follows these steps:
- First check whether optimality in the primal is already reached: check if
  zj – cj ≤ 0 for all j.
- Check whether bi* ≥ 0 for all i in the constraint rows of the table. If yes, then
  feasibility is attained and no further work is required.
- If this is not the case, choose some r such that br* < 0 (for example the most
  negative one). This defines the pivot row.
- Once the pivot row yrT is chosen (the leaving basic variable is known), we
  have to choose the pivot column k of the entering variable xk. To find it, bear
  in mind the objective: a nonnegative value on the right-hand side. Because the
  pivot element yrk will eventually become 1, the new RHS value will be
  br*/yrk. This means that yrk must be negative, so we consider only negative
  entries in the row yrT. Furthermore, primal optimality has to be kept: after
  pivoting, the new values in the z row must remain non-positive. We shall
  show that this is achieved by using the ratio test to choose a column k such
  that:
      (zk − ck)/yrk = Min { (zj − cj)/yrj : yrj < 0 }
  Note that since both the denominator and the numerator are negative, each
  such fraction is positive. To bring the negative reduced cost corresponding to
  the entering variable to zero, we compute
      (zj − cj)' = (zj − cj) − (yrj/yrk)(zk − ck).
  The ratio (zk − ck)/yrk is positive. First suppose that yrj ≥ 0. Then
  (zj − cj)' ≤ (zj − cj) ⇒ (zj − cj)' ≤ 0, i.e., optimality is maintained. Now assume
  that yrj < 0. Then, by the choice of yrk, the following holds:
  (zk − ck)/yrk ≤ (zj − cj)/yrj. After multiplying both sides by the negative
  number yrj, we get zj − cj ≤ ((zk − ck)/yrk)·yrj, hence
  zj − cj − ((zk − ck)/yrk)·yrj ≤ 0, i.e. (zj − cj)' ≤ 0. Thus optimality in the primal
  is still retained.
- The new dual objective value after pivoting will be
  cBTB-1b − ((zk − ck)/yrk)·br*. But zk − ck ≤ 0, yrk < 0 and br* < 0, so
  −((zk − ck)/yrk)·br* ≥ 0. Thus the dual objective improves over the current
  value of bTw = cBTB-1b. (Note that in the dual problem we need to maximize,
  not minimize as in the primal.) Thus each iteration contributes to approaching
  the dual optimal solution, which at the end has the same value as the optimal
  objective value of the optimal feasible primal.
The above method moves from one dual feasible solution to the next until
optimality is reached. In the primal problem this procedure is equivalent to
moving from one optimal (not necessarily feasible) basic solution to the next
until, finally, optimal feasibility is reached. The following is the algorithm of
the dual simplex method, assuming an initial optimal but not necessarily
feasible simplex table:
Repeat
    If not (b* ≥ 0)                        (feasible optimum reached?)
        Select row r such that br* = min{bi*}
        If (yrj ≥ 0 ∀j)
            Stop (dual unbounded)
        Else
            Find k such that (zk − ck)/yrk = min{ (zj − cj)/yrj : yrj < 0 }
            Pivot at yrk
        EndIf
    EndIf
Until (b* ≥ 0)
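The pseudocode above is easily turned into a compact MATLAB function. The
following is a minimal sketch (the function name dualsimplex and the table
layout s = [A b; -c' 0] are our own choices, matching the session of Example
5.13 below; no anti-cycling safeguards are included):

function s=dualsimplex(s)
% dual simplex on a table s=[A b;-c' 0] that is optimal
% (z row <= 0) but infeasible (some RHS entries negative)
[mm,nn]=size(s);                  % last row = z row, last column = RHS
while true
    [brmin,r]=min(s(1:mm-1,nn));  % most negative RHS defines the pivot row
    if brmin >= 0, break; end     % feasible optimum reached
    J=find(s(r,1:nn-1) < 0);      % only negative entries may become pivots
    if isempty(J)
        error('dual unbounded - primal infeasible')
    end
    [~,jmin]=min(s(mm,J)./s(r,J));  % ratio test keeps the z row <= 0
    k=J(jmin);
    s(r,:)=s(r,:)/s(r,k);           % pivot at y_rk
    for i=[1:r-1, r+1:mm]
        s(i,:)=s(i,:)-s(i,k)*s(r,:);
    end
end

Applied to the initial table s of Example 5.13 below, dualsimplex(s) should
reproduce the final optimal and feasible table in a single call.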
Example 5.12 Continued
Initial Primal Simplex Table:
BV    x1    x2    x3    x4    RHS
x3    -3    -1     1     0    -3
x4    -4    -3     0     1    -4
z     -3    -2     0     0     0
Select the row r which gives the smallest RHS value:
br* = min{bi*} = min{−3, −4, 0}
Ignoring the last value 0 (since it belongs to the objective row), the minimum
value is −4, so the pivot row is r = 2.
⇒ xB2 = x4 is the leaving variable
⇒ y2T = (−4  −3  0  1 | −4)
Note: since not all y2j are ≥ 0 (the row contains negative entries), the dual is
not unbounded and we may proceed.
Ratio test (for keeping optimality) to choose the pivot column k representing
the entering variable:
(zk − ck)/y2k = min{ (zj − cj)/y2j : y2j < 0 } = min{ −3/−4 , −2/−3 , 0/0 , 0/1 , 0/−4 }
Ignoring the 5th value (it is the ratio of the RHS entry) and the 3rd and 4th
values (since y23 and y24 are non-negative), the minimum value occurs in the
second column. Therefore the pivot column is k = 2, which implies that x2 is
the entering variable. Pivoting at y22 we obtain the following tableau:

BV    x1      x2    x3    x4      RHS
x3    -5/3     0     1    -1/3    -5/3
x2     4/3     1     0    -1/3     4/3
z     -1/3     0     0    -2/3     8/3
Repeating the same procedure, the optimal and feasible simplex table is
obtained:

BV    x1    x2    x3      x4      RHS
x1     1     0    -0.6     0.2     1
x2     0     1     0.8    -0.6     0
z      0     0    -0.2    -0.6     3
The dual optimal table (computed using the normal simplex algorithm) is:

BV    w1    w2    w3      w4      RHS
w1     1     0     0.6    -0.8    0.2
w2     0     1    -0.2     0.6    0.6
z      0     0     1       0      3
Note:
• the optimal dual variables are equal to the negatives of the reduced costs of
the slack variables in the primal optimal and feasible table (wi* = −zn+i), giving
w = (0.2, 0.6).
• Since one of the reduced costs of a nonbasic variable in the dual optimal
table is 0, there are other optimal solutions (multiple optima) giving the same
optimal objective function value of 3.
Graphical Representation
Primal Model / Dual Model:
[Figure: two plots. The first shows the primal model (constraints 3x1 + x2 ≥ 3
and 4x1 + 3x2 ≥ 4, payoff 3x1 + 2x2) and the dual simplex path: the initial
solution at the origin is infeasible but optimal, the solution after the 1st
iteration is still infeasible, and the final solution is optimal and feasible with
objective value 3. The second shows the dual model (constraints
3w1 + 4w2 ≤ 3 and w1 + 3w2 ≤ 2, payoff 3w1 + 4w2) solved by the normal
simplex method, moving from an initial feasible but not optimal solution to the
feasible and optimal solution, also with objective value 3.]
Example 5.13
This session shows the Dual Simplex Algorithm using the table approach. Note
that the explanatory comments were added to the session afterwards and that
some empty lines have been removed.
Assuming that the folder Z:\Matlab contains the file pivot.m:
» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
    rmax=length(A(:,1));
    A(r,:)=A(r,:)/x;
    for i=1:rmax
        if i~=r
            A(i,:)=A(i,:)-A(r,:)*A(i,c);
        end
    end
end
a=A;
Entering A, b, c of the model:
min 2x1 + 3x2 + 4x3
ST
x1 + 2x2 + x3 ≥ 3
2x1 − x2 + 3x3 ≥ 4
xi ≥ 0
» A=[1 2 1;2 -1 3]
A =
     1     2     1
     2    -1     3
» A=[-A eye(2)]                    Note that constraints were multiplied by -1
A =
    -1    -2    -1     1     0
    -2     1    -3     0     1
» b=-[3 4]'
b =
    -3
    -4
» c=[2 3 4 0 0]'
c =
     2
     3
     4
     0
     0
» s=[A b;-c' 0]                    Initial simplex table
s =
    -1    -2    -1     1     0    -3
    -2     1    -3     0     1    -4
    -2    -3    -4     0     0     0
                                   Optimal, not feasible
» row=2                            Second row leaves (max. negative)
row =
     2
» rc=s(3,:)                        Reduced costs
rc =
    -2    -3    -4     0     0     0
» y=s(row,:)                       Pivot row
y =
    -2     1    -3     0     1    -4
» format short g                   Better format
» ra=rc./y                         Ratios
Warning: Divide by zero.           Ignore
ra =
     1    -3    1.3333    NaN     0     0
» [y; ra]                          To see pivot row together with ratios
ans =
    -2     1    -3        0       1    -4
     1    -3    1.3333    NaN     0     0
» col=1                            Minimum ratio and negative coefficient
col =
     1
» s=pivot(s,row,col)               Pivotting
s =
     0   -2.5    0.5     1    -0.5    -1
     1   -0.5    1.5     0    -0.5     2
     0     -4     -1     0      -1     4
» row=1                            Still not feasible, 1st row leaves
row =
     1
» y=s(row,:)                       Second iteration
y =
     0   -2.5    0.5     1    -0.5    -1
» rc=s(3,:)
rc =
     0     -4     -1     0      -1     4
» ra=rc./y
Warning: Divide by zero.
ra =
   NaN    1.6     -2     0       2    -4
» [y;ra]
ans =
     0   -2.5    0.5     1    -0.5    -1
   NaN    1.6     -2     0       2    -4
» col=2                            Minimum ratio (negative coefficient)
col =
     2
» s=pivot(s,row,col)
s =
     0      1   -0.2   -0.4    0.2    0.4
     1      0    1.4   -0.2   -0.4    2.2
     0      0   -1.8   -1.6   -0.2    5.6
                                   Optimal & Feasible
» z=s(3,6)                         Objective value
z =
   5.6
» x=[2.2 0.4 0 0 0]'               Solution vector (see columns of the optimal table)
x =
   2.2
   0.4
     0
     0
     0
» wT=-s(3,4:5)                     Shadow costs
wT =
   1.6    0.2
Chapter 6
Networks
6.1 Introduction
There is a group of linear programming problems defined on networks
(directed graphs) that have many special properties. These properties enable
some fast special algorithms and also an efficient version of the simplex
method called the network simplex method. This chapter introduces the basic
ideas, presents some practically important special cases of network problems
and describes selected algorithms. Special attention will be given to
transportation and assignment network models. Knowledge of graph theory is
not a precondition; all terms used are defined here.
Definition 6.1 A network is a simple directed graph (digraph) N = (V, A)
consisting of a finite non-empty set V = {v1, v2, … vm} of vertices (nodes) and a
set A ⊂ V × V of directed arcs, where each arc is an ordered pair of vertices
(i, j), i, j = 1 … m. Note that between two vertices there can be at most one arc
in each direction – a simple graph.
[Figure: a digraph with six vertices illustrating the definition.]
N = ({1, 2, 3, 4, 5, 6} , {(1, 2), (2, 5), (3, 1), (3, 4), (4, 6), (5, 3), (5, 5), (6, 5)})
In this text we shall assume that loops do not exist: (i, i) ∉ A, i = 1 … m. Also
let n be the number of arcs. Note that in graph theory n usually means the
number of vertices. Here we shall have one variable for each arc, so to keep
compatibility with linear programming notation (n variables) the meaning of
symbols is reversed.
6.2 Minimum Cost Network Flow Problem
The most general network optimization problem that covers most network
problems is the minimum cost network flow problem. The problem is based on
these assumptions. Let the flow (the movement of any commodity through an
arc – for example, current in an electric network) in the arc (i, j) connecting
vertices i and j (in this direction) be xij. Then, for each arc, there is generally:
- A lower bound lij ≤ xij (mostly 0 – nonnegativity)
- An upper bound uij ≥ xij (interpreted as the arc's capacity)
- A certain cost cij paid for a unit flow through the arc (i, j). The total cost
that we pay for the flow through the arc (i, j) is then cijxij.
Flow is in a certain way inserted into the network and somehow removed (for
example, a power station generates electricity for industries, households, etc.).
This can be generalized by introducing for each vertex i:
- An external input flow bi+
- An external output flow bi-.
Let Pi be the set of predecessors of the vertex i (the set of vertices where arcs
ending in i start) and similarly let Si be the set of successors of i. Graphically:
[Figure: a vertex i with its predecessor set Pi, its successor set Si, an external
input flow bi+ and an external output flow bi−.]
Example 6.3 – Electricity Network (flow = electric current)
[Figure: an electricity network. The power station (external input bPS+ = 42200)
feeds Transformers 1, 2 and 3 (external inputs bT1+ = 200 and bT2+ = 2500,
external output bT3− = 150), which through the households and Industry A
supply the sinks L1, A1, AD2, E2, L2, S2, A3, L3, A4 and L4 with external
outputs 3900, 1500, 5200, 15000, 10000, 2300, 1625, 1800, 1425 and 2000
respectively; each arc of the figure is labelled with its flow.]
The general condition that must be satisfied in all network problems is flow
conservation, stating that flow must neither originate nor vanish in a vertex. In
other words, for each vertex the total flow out must be equal to the total flow
in:
∑j∈Si xij + bi− = ∑k∈Pi xki + bi+ ,   i = 1 … m
Simple rearrangement gives:
∑j∈Si xij − ∑k∈Pi xki = bi+ − bi− = bi ,   i = 1 … m
According to the value of bi there are three types of vertices:
- Source with bi > 0 that adds flow to the network,
- Sink with bi < 0 that removes flow from the network,
- Transshipment vertex with bi = 0.
Example 6.3 – Continued
In the previous figure the source vertices are the Power Station and
Transformers 1 and 2, while the sink vertices are the third transformer (since it
removes current), L1, A1, AD2, E2, L2, S2, A3, L3, A4 and L4. The
transshipment nodes are the households and Industry A.
From the LP point of view the equations
∑j∈Si xij − ∑k∈Pi xki = bi+ − bi− = bi ,   i = 1 … m
represent restrictions (constraints). If we know the cost per unit flow associated
with each arc, then the objective is logically a minimum cost flow. All of this
can now be expressed in matrix form in the usual way:
Min z = cTx
ST Ax = b
L≤x≤U
where c, L and U are n-vectors of unit costs, lower bounds and upper bounds
respectively. The matrix A has m rows (one row for each vertex) and n columns
(one column for each arc). Compared with other LP problems there are a few
differences. First, the vectors x, c, L, U and the columns of A carry double
subscripts. This is just a formal difference in notation that avoids separate
indexing of vertices and arcs. The lower and upper bounds generally represent
2n additional constraints. Lower bounds are mostly zero – the usual
non-negativity requirement. If not, a simple change of variables can be used to
replace l ≤ x by 0 ≤ x − l = x*. Upper bounds can also be eliminated, if
necessary, by replacing a bounded variable by two variables: x ≤ u can be
replaced by the constraint x1 − x2 ≤ u, where x1 and x2 are nonnegative and
carry no upper bounds, and in the model x is replaced by x1 − x2. So without
loss of generality (and with some modifications in the model) the bounds can
be ignored.
Example 6.4
Suppose we want to find the flow in the following network, which generates the
total minimum cost.
[Figure: a network with source s (bs+ = 80), transshipment vertices 2, 3, 4, 5
and 6, and sink t (bt− = 80). Each arc is labelled with the triple cij, lij, uij – its
unit cost, lower bound and upper bound – e.g. the arc (s, 3) carries the label
17, 0, 75.]
The LP problem for this network is given below:
min 17xs3 + 19xs2 + 15x23 + 10x24 + 11x34 + 9x35 + 6x45 + 8x46 + 8x56 + 12x5t + 30x6t
st
  xs3 + xs2                    =  80    (Vertex s)
 −xs2 + x23 + x24              =   0    (Vertex 2)
 −xs3 − x23 + x34 + x35        =   0    (Vertex 3)
 −x24 − x34 + x45 + x46        =   0    (Vertex 4)
 −x35 − x45 + x56 + x5t        =   0    (Vertex 5)
 −x46 − x56 + x6t              =   0    (Vertex 6)
 −x5t − x6t                    = −80    (Vertex t)
0 ≤ xs3 ≤ 75, 0 ≤ xs2 ≤ 70, 0 ≤ x23 ≤ 60, 0 ≤ x24 ≤ 52
0 ≤ x34 ≤ 63, 0 ≤ x35 ≤ 45, 0 ≤ x45 ≤ 50, 0 ≤ x46 ≤ 50
0 ≤ x56 ≤ 55, 0 ≤ x5t ≤ 60, 0 ≤ x6t ≤ 62
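A model of this kind can be passed directly to an LP solver. The following is a
minimal MATLAB sketch for Example 6.4 (assuming the Optimization
Toolbox function linprog; the arc ordering and the vertex numbering
s = 1, …, t = 7 are our own bookkeeping):

» c=[17 19 15 10 11 9 6 8 8 12 30]';   % arcs s3 s2 23 24 34 35 45 46 56 5t 6t
» u=[75 70 60 52 63 45 50 50 55 60 62]';
» b=[80 0 0 0 0 0 -80]';
» arcs=[1 3;1 2;2 3;2 4;3 4;3 5;4 5;4 6;5 6;5 7;6 7];
» A=zeros(7,11);
» for k=1:11, A(arcs(k,1),k)=1; A(arcs(k,2),k)=-1; end
» x=linprog(c,[],[],A,b,zeros(11,1),u);   % minimum cost flow
» z=c'*x                                  % total minimum cost

(The rows of A are linearly dependent, as discussed next; linprog copes with
the redundant equality without difficulty.)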
What makes network problems special are the properties of A and b (in the
balanced case, i.e. supply = demand). In each row of A (vertex) there is:
- +1 for each arc starting at the vertex
- -1 for each arc ending at the vertex
- Zeros otherwise.
Similarly, in each column (arc) there is:
- +1 for the starting vertex
- -1 for the ending vertex
- Zeros otherwise.
In particular, the columns are very special: there is only one +1, one −1 and
m−2 zeros. The sum of the rows is thus 0, so the rows are linearly dependent. It
means that the maximum rank of A is m−1 (in fact it is exactly m−1). Here we
assume that n ≥ m, which is satisfied for all connected networks that are not
trees – see later.
6.2.1 Balancing a Network
For a balanced network, where the total inserted flow is equal to the total flow
removed, we have:
∑i=1…m bi = 0
Unbalanced networks can be balanced by adding artificial vertices and arcs that
make up the difference. Let S be the total supply to the network and D the total
demand:
S = ∑i: bi>0 bi ,   D = −∑i: bi<0 bi
A balanced network has S = D. This is the balancing algorithm:
- If S > D (excess supply) then add an artificial vertex with demand S-D,
and add artificial arcs connecting all sources to this artificial vertex.
These new arcs have costs that correspond to the cost (if any) of excess
production.
- If D > S (excess demand) then add an artificial vertex with supply D-S,
and add artificial arcs connecting this artificial vertex to all sinks. These
new arcs have costs that correspond to the cost (if any) of unmet demand.
Without loss of generality we shall assume that a network is balanced.
Obviously flow through artificial arcs represents excess supply or unmet
demand respectively. During optimization there is no difference between real
and artificial arcs and vertices.
6.2.2 Special cases of network flow problems
The general minimum cost network flow problem defined above is also called
the Transshipment problem, because all three types of vertices (sources, sinks
and transshipment vertices) may be present.
Another problem is called the Transportation problem. This has only sources
and sinks and every arc goes from a source to a sink. Conservation constraints
have one of two forms:
∑j xij = bi     for a source with bi > 0, and
−∑k xki = bi    for a sink with bi < 0.
Transportation problems model direct movement of goods from suppliers to
customers with cij coefficients interpreted as the unit cost of transportation from
a particular supplier to a particular customer. Objective value is the total cost of
shipment.
The Assignment problem is a special case of the transportation problem, where
bi = 1 for each source and bi = −1 for each sink. For a balanced problem there
is the same number, m/2, of sources and sinks. Assignment typically models
assigning people to jobs, with the cij coefficients interpreted as the value of
person i if assigned to job j, or assigning jobs to machines, with cij the unit cost
of assigning job i to machine j. The objective value is then interpreted as the
total profit (maximization) or the total cost (minimization) of all assignments.
Later we shall see that integrality of the flows is guaranteed, so the only
possible values are 1 and 0. There is a special fast algorithm for assignment.
Shortest path problem determines the shortest (fastest) path between an origin
and a destination. There are efficient shortest path algorithms in graph theory,
but linear programming can also solve the problem. Shortest path problem can
be represented as a minimum cost network flow problem with one source (the
origin) with supply equal to 1, and one sink (the destination) with demand equal
to 1. There are typically many transshipment vertices. The cij coefficients are
interpreted as lengths of arcs (that can be generalized as time required to
traverse an arc, or cost of using an arc). Unlike in graph theory algorithms, the
coefficients need not be nonnegative.
The Maximum flow problem determines the maximum amount of flow that can
be moved through a network from the source to the sink. Because the external
flow is not known a priori, a slight modification of the general problem is
necessary. Probably the simplest one adds an artificial arc with infinite capacity
from the sink to the source that returns the flow back to the source. Then all
vertices are transshipment vertices and the model maximizes the flow through
the artificial arc. Let s be the source and let t be the sink. The model is then:
Max xts, ST Ax = 0, 0 ≤ x ≤ U, where U contains the capacities of the arcs.
Note that costs are not used in this model.
The maximum flow problem has an interesting dual problem that deals with
cuts. A cut is defined as a division of the vertices into two disjoint sets V1 and
V2, the first V1 containing the source s, the second V2 containing the sink t:
V1 ∪ V2 = V, V1 ∩ V2 = ∅, s ∈ V1, t ∈ V2.
The capacity of the cut is the sum of the capacities of the arcs that lead from V1
to V2.
Now let’s first rewrite the primal maximum flow problem:
Max z = xts
ST
∑j∈Si xij − ∑k∈Pi xki = 0 ,   i = 1 … m
0 ≤ xij ≤ uij    (n inequalities representing arcs)
The dual problem has one variable for each primal constraint, so there will be
m+n dual variables. Let’s call them yi , i = 1 … m for the first m flow
conservation inequalities and vij for the second group of n capacity limitation
inequalities. The dual objective coefficients are the RHS of primal inequalities,
so the dual objective is
Min w = ∑ uij vij
Now let’s create the dual constraints. There is one for each primal variable, that
is, one for each arc (including the artificial one from t to s). Also note that the
only non-zero primal objective coefficient, 1, corresponds to this arc. The
capacity of this arc is infinite, so it is not included in the second group of n
capacity limitation inequalities. So we have these dual constraints:
yt − ys = 1           for the artificial arc
yi − yj + vij ≥ 0     for all the other arcs (i, j)
vij ≥ 0
The interpretation of the dual is the following: let yi = 0 if vertex i is in the set
V1, and let yi = 1 if vertex i is in the set V2. So the dual variables y define a cut.
The first dual constraint guarantees that s ∈ V1 and t ∈ V2. Let vij = 1 if the arc
(i, j) leads from V1 to V2 (for such arcs the dual constraints force
vij ≥ yj − yi = 1) and let vij = 0 otherwise. Then the dual objective is the
capacity of the cut and the dual optimum is the minimum capacity cut. Using
strong duality, we can formulate the famous max-flow min-cut theorem: “the
maximum flow in a network is equal to the minimum of the capacities of all
cuts in the network”.
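As a small illustration of the model Max xts, ST Ax = 0, 0 ≤ x ≤ U, consider the
following made-up network (our own example, not taken from the text):
vertices s, a, b, t with arcs s→a, s→b, a→b, a→t, b→t of capacities 4, 3, 2, 3, 4,
plus the artificial return arc t→s:

» arcs=[1 2;1 3;2 3;2 4;3 4;4 1];   % s=1, a=2, b=3, t=4; last arc is t->s
» u=[4;3;2;3;4;Inf];                % the artificial arc is uncapacitated
» A=zeros(4,6);
» for k=1:6, A(arcs(k,1),k)=1; A(arcs(k,2),k)=-1; end
» f=[0;0;0;0;0;-1];                 % maximize x_ts = minimize -x_ts
» x=linprog(f,[],[],A,zeros(4,1),zeros(6,1),u);
» maxflow=x(6)                      % 7

Here the cut V1 = {s} has capacity 4 + 3 = 7, confirming the max-flow min-cut
theorem on this example.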
The maximum flow problem can be extended to the Minimum cost maximal
flow problem. To avoid possibly conflicting criteria, one way to solve this
problem is the following: first find the maximum flow ignoring the costs; then
minimize the total cost subject to the additional constraint that the flow in the
artificial arc is kept at its maximum value. A modification of this problem is the
Minimum cost flow with given value (or Minimum cost flow with given
minimum acceptable value). Both can be solved by the general minimum cost
flow algorithm with one more constraint (= or ≥) on the flow through the
artificial arc.
There are other, practically less significant, special cases of the general
minimum cost network flow problem. Note that all network problems can be
solved by the standard simplex method, so any LP solver can generally be used.
However, there are two points to mention. First, network problems are mostly
degenerate (many zero basic variables), which can cause difficulties. On the
other hand, the special properties of the matrix A of network problems make it
possible to use special algorithms faster than the standard simplex algorithm.
Some will be presented, even though with today's fast computers their use is
justified only for very big models.
6.3 Summary of relevant Graph Theory terms
Definition 6.5
A subnetwork N1 = (V1, A1) of a network N = (V, A) has these properties:
V1 ⊆ V and A1 ⊆ A ∩ (V1× V1)
So a subnetwork (subgraph) is created by removing some vertices, all arcs
incident with these vertices, and possibly some more arcs.
[Figure: an original network and a subnetwork obtained from it by removing
vertices and arcs.]
Definition 6.6
A path from vertex i1 to the vertex ik is a subnetwork consisting of a sequence
of vertices i1, i2, … , ik, together with a set of distinct arcs connecting each
vertex in the sequence to the next. The arcs need not all point in the same
direction.
[Figure: an original network and a path i1, i2, i3, i4 within it.]
Definition 6.7
A network is said to be connected if there is a path between every pair of
vertices in the network. From now we shall assume that the network is
connected (if not, the problem can be decomposed into two or more smaller
problems).
[Figure: a connected network and a disconnected network.]
Definition 6.8
A cycle is a path from a vertex to itself.
[Figure: a cycle through the vertices i1, i2, i3, i4.]
Definition 6.9
A tree is a connected subnetwork containing no cycles.
Definition 6.10
A spanning tree is a tree that includes every vertex in the network.
[Figure: an original network and one of its spanning trees.]
6.4 Summary of relevant properties of trees
The properties of (spanning) trees that are relevant for the network simplex
method will be given as lemmas. Let’s recall that our assumption is a connected
network without loops.
Lemma 6.11
Every tree consisting of at least two vertices has at least one end (a vertex that
is incident to exactly one arc).
Proof: Select any vertex i and follow any path away from it (one must exist
because the network is connected). There are no cycles and the number of
vertices is finite, so an end will eventually be reached.
Lemma 6.12
A spanning tree for a network with m vertices contains exactly m-1 arcs.
Proof: The lemma can be proved by induction:
1. The lemma is true for m = 1 (no arc) and m = 2 (one arc).
2. Assume that it holds for some m ≥ 2.
3. Adding one more vertex to the tree means that this vertex is connected to a
vertex of the current tree by one more arc. So we obtain a tree with m + 1
vertices and m arcs. This completes the proof.
Lemma 6.13
If a spanning tree is augmented by adding to it an additional arc of the network,
then exactly one cycle is formed.
Proof: Suppose an arc (i, j) is added to the spanning tree. Since there was
already a path between the vertices i and j, this path together with the arc (i, j)
forms a cycle. Suppose that two (or more) distinct cycles were formed. They all
must contain the arc (i, j) because the spanning tree had no cycles. Then the
union of the two (or more) cycles minus the arc (i, j) still contains a cycle, but
this is a contradiction because before adding the arc (i, j) there were no cycles.
This shows that exactly one cycle is formed.
Lemma 6.14
Every connected network contains a spanning tree.
Proof: If the network contains no cycles, then it is also a spanning tree since it is
connected and contains all the vertices. Otherwise, there exists a cycle. Deleting
any arc from this cycle results in a subnetwork that is still connected. This
deleting can continue until there are no cycles. Finally a subnetwork is obtained
that contains no cycles, is connected, and contains all the vertices, so it is a
spanning tree.
Lemma 6.15
Let B be the submatrix of the constraint matrix A corresponding to a spanning
tree with m vertices. Then B can be rearranged to form a full-rank
lower-triangular matrix of dimension m × (m−1) with diagonal entries ±1.
Proof
By Lemma 6.12 a spanning tree consists of m vertices and m-1 arcs, so B is of
dimension m × (m-1). The rest can be proved by induction:
1. If m = 1 then B is empty. If m = 2 then the spanning tree consists of one arc,
so there are two possible forms of B, both of the required form:
B = (1, −1)T or B = (−1, 1)T
2. Assume that the lemma holds for some m ≥ 2.
3. Add one more vertex to the tree. This vertex will be connected to a vertex of
the current tree by one more arc. Assume that in the new matrix the added
vertex is in row 1 and the new arc is in column 1. Then the new matrix has
the following form:
( ±1   0 )
(  v   B )
where B is the original matrix. The rest of row 1 consists of zeros because
the newly added vertex is an end (only the new arc starts or ends in this
vertex: ±1 in the first position). The vector v contains all zeros except ±1 at
the position where the new arc is connected to the tree. If B has the required
form then the new matrix also has the required form.
A lower triangular matrix with nonzero diagonal entries has full rank. This
completes the proof.
6.5 Basis of network problems
To show the relationship between a spanning tree and a basis of network
problems we need two more definitions:
Definition 6.16
Given a spanning tree for a network, a spanning tree solution x is a set of flow
values that satisfy the flow conservation constraints Ax = b for the network, and
for which xij=0 for any arc (i, j) that is not part of the spanning tree.
Definition 6.17
A feasible spanning tree solution x is a spanning tree solution that satisfies the
nonnegativity constraints x ≥ 0.
Theorem 6.18: A flow x is a basic feasible solution for the network flow
constraints { x : Ax = b, x ≥ 0 } if and only if it is a feasible spanning tree
solution.
Proof: First assume that x is a feasible spanning tree solution. Then by Lemma
6.12 it has at most m−1 nonzero components. Let B be the submatrix of A
corresponding to the spanning tree. By Lemma 6.15, B has full rank with m−1
linearly independent columns, and hence x is a basic feasible solution.
For the second half of the proof assume that x is a basic feasible solution, so it
has at most m−1 nonzero components. Now consider the set of arcs
corresponding to the strictly positive components of x. We shall prove that
these arcs do not contain a cycle, so they either form a spanning tree or can be
augmented with zero-flow arcs to form a spanning tree. To prove it, assume the
opposite, that these arcs contain a cycle. In this cycle we can add a small flow ε
in one direction (the flow in all arcs with the same direction is increased by ε,
the flow in all arcs with the opposite direction is decreased by ε). We can select
ε small enough to keep all flows positive. Call this new flow x+ε. Similarly we
can add the same small flow ε in the opposite direction to get a new flow x−ε.
For these two flows we have:
x = (1/2)x+ε + (1/2)x−ε
But this contradicts the assumption that x is a basic feasible solution (an
extreme point), which cannot be expressed as a convex combination of two
distinct points. This completes the proof.
The constraint matrix A does not have full rank because its rows are not
linearly independent. That is why the basis B has only m−1 columns,
corresponding to the m−1 arcs of a spanning tree. Lemma 6.15 gives the
properties of a basis that have a very important consequence. Suppose that,
given a basis B, we want to compute the values of the basic variables. B has m
rows and m−1 columns. We can remove the last (dependent) row to get a set of
equations:
B'xB = b'
where B' is obtained from B by deleting the last row, b' is similarly obtained by
deleting the last component of b, and xB is the (m−1)-vector of basic flows. B'
is a square full-rank lower triangular matrix, so the solution can be found by
forward substitution. Moreover we know that the matrix entries Bij are either
zeros or ±1. So we know directly the first component x1 of xB:
B11x1 = b1 → x1 = ±b1
Similarly for the second component and so on. Generally we get:
xi = ±(bi − ∑j=1…i−1 (±xj)) ,   i = 2 … m−1
In other words, xi is computed by adding or subtracting the first i components
of b. This guarantees that for integer values of the external flows b the values
of the basic flows are also integer. This is a natural requirement of many
network problems (like assignment) that is thus automatically satisfied. This is
very important – an integer solution is obtained by the normal simplex method.
There is no need to use a time-consuming integer programming algorithm.
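A quick numerical illustration, using a spanning tree of the network of Example
6.4 (the choice of tree arcs is ours, and the arc capacities are ignored here, as
discussed above):

» tree=[1 3;1 2;2 4;3 5;5 6;5 7];   % arcs s3, s2, 24, 35, 56, 5t (s=1, t=7)
» B=zeros(7,6);
» for k=1:6, B(tree(k,1),k)=1; B(tree(k,2),k)=-1; end
» b=[80 0 0 0 0 0 -80]';
» xB=B(1:6,:)\b(1:6)                % drop the dependent last row
xB =
    80
     0
     0
    80
     0
    80

All basic flows are integers, exactly as the forward substitution argument
predicts.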
6.6 Network Simplex Method
For convenience, let's repeat the basic facts about the simplex method. This is
the content of the simplex table of a feasible LP problem bounded in objective
value:
BV     xN             xB    z    RHS
xB     B-1N           I     0    B-1b
z      cBB-1N − cN    0     1    cBB-1b
Note that the columns of basic and nonbasic variables are in fact scattered in the
table because usually the original column labels are not changed during simplex
iterations. In the table, B is the current basic matrix and N is the corresponding
nonbasic matrix (both made of the columns of the original m x n matrix A). The
n vectors x and c are divided accordingly. b is the m vector of the RHS values.
Note also that the inverted basic matrix B-1 is available in the columns that
originally contained the unity matrix - typically the last m columns. The
simplex algorithm is based on two tests. The Optimality test checks whether
optimum has been reached. It is based on negative reduced costs in the z row:
cBB-1N - cN = yTN - cN = z - cN
where yT = cBB-1 are the simplex multipliers. Individual negative reduced costs
are given by:
cBB-1Aj – cj = yTAj – cj = zj – cj
where Aj is the j-th column of A. If the table is not optimal, the most negative
value (maximization) or the greatest positive value (minimization) defines the
entering nonbasic variable. This means that one column of the base (leaving
variable) is replaced by the column of a selected so far nonbasic (entering)
variable.
The leaving variable is chosen by the Feasibility test (minimum ratio after
dividing RHS values by the values of the pivot column). The actual update is
done by pivoting. This is repeated until optimality is reached. After reaching the
optimum, the simplex multipliers form the dual optimal solution w equal to
shadow prices of primal RHS values. w is available in the columns of the z row
corresponding to the slack variables of the primal model. Of course the primal
and dual optimal objective values are equal: cTx = wTb.
The network simplex method is based on a special, simplified form of the
simplex operations. Note that we have to change the notation slightly (double
indexing of the variables).
Optimality test
Let's express directly the positive reduced cost: cij - zij = cij - yTAij where Aij is
the particular column of A. But we know that this column is made of zeros
except +1 in the i-th row and -1 in the j-th row. So the formula for the positive
reduced cost (let's call it rij) simplifies to:
rij = cij - yi + yj
To evaluate it we need the simplex multipliers. From the above equations we
get yT = cBB-1. After multiplication by B from right we get yTB = cB. Again,
column (i,j) of B is made of zeros except +1 in the i-th row and -1 in the j-th
row. So the equations for the simplex multipliers simplify to:
yi - yj = cij
for all basic variables xij
This makes m-1 equations for m variables, so one of them can be selected
arbitrarily. The others are then computed and used to compute positive reduced
costs for the optimality test. Initially any value of any variable can be chosen,
but to simplify computation it is convenient to assign 0 to a multiplier that
corresponds to an end of the spanning tree. The other values are then computed
by traversing the spanning tree starting from the selected vertex. Initial
assignment affects the values of simplex multipliers but not the values of
reduced costs because they are given by differences of the particular
multipliers.
Feasibility test
If the table is not optimal, the optimality test gives the entering variable xij.
Using the network terminology we are adding an arc to a spanning tree by
increasing its flow from the current value 0. Using Lemma 6.13, this will create
exactly one cycle. To keep flow conservation, we have to increase flow in all
arcs of the newly created cycle. Unless the problem is unbounded, some arcs in
the cycle have opposite direction compared with the new arc. So increasing a
flow in the cycle will actually decrease the flow in these arcs with opposite
direction. That's why we can increase the cycle flow until the flow in one (or
more) of these arcs drops to zero; any further increase would violate feasibility
(nonnegative flows). This also restores the spanning tree, because the arc whose
flow has dropped to zero can now be removed. If the flow drops to zero in
more than one arc, only one of them can be removed (degeneracy).
We can now summarize the steps of the network simplex method:
1. The optimality test - compute the simplex multipliers y: Start at an end of the
spanning tree and set the associated simplex multiplier to zero. Following the
arcs (i, j) of the spanning tree, use the formula yi - yj = cij to compute the
remaining simplex multipliers.
Compute the positive reduced costs. For each nonbasic arc (i, j) compute rij =
cij - yi + yj. If rij ≥ 0 (minimization) or rij ≤ 0 (maximization) for all nonbasic
arcs, then the current basis is optimal. Otherwise select the entering arc (i, j).
2. The feasibility test. Identify the cycle created by entering the arc (i, j) to the
spanning tree. Find the arc(s) with opposite orientation compared with the arc
(i, j) with minimum flow f. If no such arcs exist, the flow in the arc (i, j) can
be increased arbitrarily and the problem is unbounded.
3. The pivoting. Update the spanning tree. In the cycle subtract f from flows of
opposite arcs and add f to flows of same direction arcs. Remove the arc
whose flow dropped to zero (if there are more, select one arbitrarily).
To find an initial basic feasible solution there are various methods for different
forms of network problems. Generally it is possible to add artificial variables
(arcs) in such a way that an "obvious" initial basic feasible solution can easily
be found. Then the artificial variables have to be removed as in the standard
simplex method (M-method, two-phase method). For some problems there are
direct methods (like the North-West corner method for transportation). Next
we shall deal in more detail with the Transportation and Assignment problems.
6.7 Transportation Problem
Here we shall assume that each source is connected to each destination. This
allows a computer-friendly tabular representation of the problem. In any case, a
non-existent connection can always be modelled by an arc with a very high
prohibitive cost (or a very large negative profit in the case of maximization).
For tabular transportation problems there are simple ways to obtain an initial
basic feasible solution. We shall then describe two optimization methods. One
(the stepping stone method) is based directly on the spanning tree properties.
The other (the MODI method) is a version of the network simplex method.
We first recall the transportation problem:
Minimize the total transportation cost of a certain commodity from m1 sources
to m2 destinations, given the known unit transportation costs from each source
to each destination, the known amounts available (supplies) at each source and
the known demands of all destinations.
Remarks:
1) The number of vertices is m = m1 + m2, the number of arcs (variables) is n =
m1×m2.
2) Unbalanced problems can be balanced directly in the table by adding a
dummy row (dummy source) or a dummy column (dummy destination) with
zero costs and the supply or demand that makes the balance. Interpretation:
allocations to dummy cells represent the fact that the commodity is not
transported (unsatisfied demand for a dummy row, or commodity left at the
source for a dummy column, respectively).
Example 6.19
Balance the next table.

Sources \ Destinations    A      B      C      D     Supply
Demand                    20     30     15     5
I                         2      4      1      6     40
II                        4      3      3      3     20
III                       1      2      5      2     20
Using the table notation, we can express directly the LP model of a balanced
transportation problem (minimization):
si = the supply at the source i, dj = the demand at the destination j, cij = the unit
cost of the i to j transportation, xij = the solution variable (amount transported
from i to j), i = 1 … m1, j = 1 … m2.

Min ∑i=1…m1 ∑j=1…m2 cij xij
ST
∑j=1…m2 xij = si ,   i = 1, 2, … , m1
∑i=1…m1 xij = dj ,   j = 1, 2, … , m2
xij ≥ 0
Using the above model, we can solve the transportation problem by any LP
solver. Another possibility is to convert it into a network and solve it by the
network simplex method given in the previous chapter. Here we shall describe a
modification of the network simplex method that is performed directly in the
table. We assume a balanced transportation problem.
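Before turning to the tabular algorithms, note that the LP model above can also
be solved directly. A minimal MATLAB sketch for Example 6.19 (assuming
linprog; balancing is done by appending a dummy destination with zero costs,
and the allocation matrix X is stacked row by row into the variable vector):

» C=[2 4 1 6;4 3 3 3;1 2 5 2]; s=[40;20;20]; d=[20;30;15;5];
» C=[C zeros(3,1)]; d=[d; sum(s)-sum(d)];   % dummy column with demand 10
» [m1,m2]=size(C);
» Aeq=[kron(eye(m1),ones(1,m2)); kron(ones(1,m1),eye(m2))];
» x=linprog(reshape(C',[],1),[],[],Aeq,[s;d],zeros(m1*m2,1),[]);
» X=reshape(x,m2,m1)'                       % optimal allocation table
» total=reshape(C',[],1)'*x                 % minimum total cost: 140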
1. Algorithms to find an initial basic feasible solution
A spanning tree is represented in the table by m−1 basic allocations such that
all demands are satisfied (this also guarantees that all supplies are fully
utilized). Also there must not be any cycles. A cycle is represented in the table
by allocations that enable a return to a cell by moving only vertically or
horizontally along allocated cells. Examples of cycles in a table:

x  .  x         x  x  .
.  .  .    or   .  x  x    etc.
x  .  x         x  .  x
This is the general algorithm to find an initial basic feasible solution. The
algorithm guarantees that the allocations will form a spanning tree.
While (there are less than m-1 allocations) do
Select a next cell (see the following algorithms)
Allocate as much as possible to this cell (this allocation is equal to the
minimum of the supply in the row and the demand in the column)
Adjust the associated amounts of the supply and the demand (subtract the
allocation)
Cross out the column or the row with zero supply or demand, but not
both !
EndWhile
There are several algorithms to select the next cell to be allocated. One is
trivial; the other two attempt to allocate cells with low costs.
a) North-West corner method
Start with the upper left cell.
Allocate as much as possible.
If the row is crossed out move down, otherwise move right.
(A short code sketch of this rule is given after the list below.)
Note: It may happen that a zero is allocated (a zero basic variable). After
allocating the bottom right entry, there will be exactly one uncrossed
row or column and m-1 allocations.
b) Least cost method
Next cell is a not allocated cell with minimum cost. Break ties arbitrarily.
Note: Not allocated cell is a cell whose row and column are not crossed. It
may happen that a zero is allocated (a zero basic variable). Stop if
exactly one row or column with zero supply or demand remains. This
provides the required m-1 allocations.
c) Vogel’s approximation method (VAM)
1. For each not crossed-out row and column compute the penalty as the
   difference between the smallest and the next smallest cost in that row or
   column. The penalties have to be recomputed at each step (after crossing
   out a row, recalculate the column penalties; after crossing out a column,
   recalculate the row penalties).
2. Select a row or a column with the highest penalty. Break ties arbitrarily.
3. Allocate as much as possible to the cell with the minimum cost in the
   selected row or column. Break ties by the Least cost method.
Note: If all uncrossed-out rows and columns have zero remaining supply and
demand, determine the zero basic variables by the Least cost method, taking
care not to form a cycle. Stop when exactly one row or column with zero
supply or demand remains. This provides the required m−1 allocations.
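This is the North-West corner rule promised above, written as a small MATLAB
function (a minimal sketch for balanced problems, sum(s) = sum(d); the
function name is ours):

function X=nwcorner(s,d)
% North-West corner rule: s = supplies, d = demands (balanced problem)
m1=length(s); m2=length(d);
X=zeros(m1,m2);
i=1; j=1;
while i <= m1 && j <= m2
    X(i,j)=min(s(i),d(j));      % allocate as much as possible
    s(i)=s(i)-X(i,j);
    d(j)=d(j)-X(i,j);
    if s(i) == 0 && i < m1      % supply exhausted: cross out the row, move down
        i=i+1;
    else                        % otherwise cross out the column, move right
        j=j+1;
    end
end

For the balanced table of Example 6.19 (with the dummy column added),
nwcorner([40;20;20],[20;30;15;5;10]) produces the required m − 1 = 7
allocations, some of which may be zero.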
Comparison: All methods provide the m−1 allocations (basic variables), some
of which may be zero. Vogel's method often provides an optimum directly. The
Least cost method is probably a good compromise between complexity and the
number of improvement steps (for manual solutions). The NW corner method
is simple but may require many improvement steps, so it is the usual choice for
computerized solutions.
2. Algorithms to find an optimal solution
a) Stepping-Stone method
Idea: Repeatedly try all not-allocated cells to improve the total cost until no
improvement exists. In particular, do the following for each not-allocated cell:
Create a cycle starting and ending at this empty (nonbasic) cell, marked by +,
that is made of already allocated (basic) cells marked alternately by − and +.
The cycle can be made of horizontal and vertical segments only, not diagonal
ones. (Degenerate allocations with a zero entry can be used in the cycle.) By
Lemma 6.13 we know that there is exactly one such cycle created by entering a
given nonbasic variable.
Optimality test: Sum the costs of the cells marked by + and subtract from this
sum the costs of the cells marked by −. If the result is negative, the solution is
not optimal; entering the variable associated with this empty cell will improve
the solution. This can be applied immediately. Alternatively, we can evaluate
all nonbasic cells and select the variable with the maximum cost decrease. If no
such cell exists, the table is optimal.
Feasibility test: To allocate as much as possible to the new cell, find the cell in
the cycle marked by − with the minimum allocation. Add this value to the cells
in the cycle marked by + and subtract it from the cells in the cycle marked by −.
This enters the new solution variable with the maximum possible value; the
variable that changed to zero leaves the solution. If more than one variable
reaches zero (the so-called temporary degeneracy), only one of them can leave
the solution; it can be chosen arbitrarily. So there will always be m−1 basic
allocations, some of which may be zero. For a degenerate solution it may
happen that a zero is moved.
Note: Stepping - Stone method is simple but it involves many steps. If for some
not allocated cells the cost difference is zero and for all the others it is positive,
there are alternative optima. To find them, enter such a variable. The total cost
remains the same, but allocations will change.
b) Modified Distribution (MODI) method
The MODI method (also called "method of multipliers") improves the search
for entering variables. Compared with the stepping - stone method, all nonbasic
variables are evaluated in one step and then compared to select the most
promising one. It is basically the network simplex method. First multipliers di
are associated with each row, multipliers rj are associated with each column of
the table. They can be interpreted as unit dispatch and unit reception costs
respectively. Then for each basic variable xij the following holds:
di + rj = cij
Compare this equation with the general network simplex method and note that
the dispatch costs are directly the simplex multipliers, while the reception costs
are the negatives of the simplex multipliers. So we again have m−1 equations
for m variables. By selecting any value for one of them (usually d1 = 0), the
values of the others can easily be computed directly in the table – see the
worksheet in the appendix. The sum di + rj for a nonbasic variable is called its
shadow cost. Note that the shadow cost is equal to −zij in the simplex table.
The positive reduced cost is then cij − (di + rj), or verbally, the actual cost of
the table entry minus its shadow cost. Because the model seeks to minimize the
total cost, the presence of a negative value shows that the solution is not
optimal. In this case enter the variable with the most negative reduced cost,
breaking ties arbitrarily. The reduced cost can be interpreted as the cost saved
by transporting one unit of the commodity through this so far unallocated
(nonbasic) route.
After selecting the entering variable, the rest is done by creating the loop in the
same way as in the stepping-stone method. This finds the leaving variable. The
process is repeated until no improvement is possible. A zero reduced cost
indicates alternative optima.
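The multiplier computation can be sketched as follows (the function name
modi and the representation of the basic cells as a logical mask B are our own
choices; the sweep assumes the basic cells form a spanning tree, otherwise it
would not terminate):

function [d,r]=modi(C,B)
% solve d(i) + r(j) = C(i,j) over the basic cells marked in B,
% fixing d(1) = 0; repeated sweeps propagate values through the tree
[m1,m2]=size(C);
d=nan(m1,1); r=nan(1,m2); d(1)=0;
while any(isnan(d)) || any(isnan(r))
    for i=1:m1
        for j=1:m2
            if B(i,j) && ~isnan(d(i)) && isnan(r(j)), r(j)=C(i,j)-d(i); end
            if B(i,j) && isnan(d(i)) && ~isnan(r(j)), d(i)=C(i,j)-r(j); end
        end
    end
end

The reduced costs of all cells are then C(i,j) − (d(i) + r(j)); negative entries at
nonbasic cells indicate that the solution can still be improved.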
Complete MODI method algorithm (including balancing)
If (demands < supplies) then add a dummy destination (column) with zero
    transportation costs and the demand that makes the balance
If (supplies < demands) then add a dummy source (row) with zero
    transportation costs and the supply that makes the balance
Make an initial basic feasible allocation (all demands must be satisfied,
    supplies must not be exceeded). There must be m-1 allocations.
Repeat
    Calculate the dispatch and reception costs by using the basic (allocated)
        cells. Set the first dispatch cost to zero to get m-1 equations.
    Calculate the reduced cost of each empty (not allocated) cell as the
        difference:
        actual cost - shadow cost = actual cost - (dispatch cost + reception cost)
    If (there is a negative reduced cost) then reduce the total cost:
        Select the nonbasic cell with the most negative reduced cost, break ties
            arbitrarily
        Mark this cell by +
        Mark other basic cells by - and + to keep row and column balances
            (this creates a cycle that starts and ends in the selected nonbasic cell)
        Find the minimum allocation among the cells in the cycle marked by -
        Add this value to the + cells in the cycle, subtract it from the - cells in
            the cycle
    EndIf
Until (there is no negative reduced cost)
Compute the total cost of the optimal solution.
Transportation: maximization problems
Maximization is necessary if the table entries are interpreted as the contribution
associated with transporting a unit of the commodity through the route. There
are only minor modifications to the methods described above. Note that instead
of costs the table contains contributions.
Initial basic feasible allocation:
NW corner - no change.
"Least cost" method - select not allocated cells with maximum contribution.
- 161 -
VAM - compute penalty as the difference between two maximum contributions
in the row or the column. Select the row or the column with maximum
penalty, allocate the cell with maximum contribution in that row or
column.
Algorithms to find the optimal solution
Stepping-stone method: enter nonbasic cells with a positive contribution
difference.
MODI method: enter the nonbasic cell with the most positive reduced cost. The
optimal table has no positive reduced cost.
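Alternatively, a maximization table can be turned into an equivalent
minimization table by working with opportunity losses, in the same spirit as
the Hungarian method below. A minimal sketch, using an illustrative
contribution table of our own:

import numpy as np

# Illustrative contribution table (not an example from these notes).
contribution = np.array([[8, 6, 10],
                         [9, 12, 13]])

# Opportunity losses relative to the largest contribution: minimizing the
# total loss is equivalent to maximizing the total contribution.
loss = contribution.max() - contribution
print(loss)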
6.8 Assignment Problem
Assignment is a special case of transportation with all supplies and all demands
equal to one. As with transportation, we shall assume that every assignment is
possible, which allows a simplified tabular representation of the problem. If
this is not true, we can again assign prohibitive costs/contributions to the
non-existing entries of the table.
Problem Specification:
Minimization (maximization) of the total cost (contribution) of an assignment
of n sources to m destinations, based on known costs (contributions) for each
combination.
Remarks:
1) The problem is balanced if the number of sources is equal to the number of
destinations. Unbalanced problems can be balanced by adding dummy rows
(dummy sources) or dummy columns (dummy destinations) with zero costs.
Interpretation: assignments to dummy cells represent the fact that the
assignment is in fact not done (destinations left unsatisfied for dummy rows,
or sources left unassigned for dummy columns, respectively). In what follows
only balanced models are considered, so there are m sources and m
destinations. Using network terminology, there are 2m vertices and m² arcs.
2) The most common assignment application is the assignment of jobs to
applicants (and it actually does not matter whether applicants are listed in
the rows of the table and jobs in the columns or vice versa). Use the
following table as an example assignment problem:
Applicants \ Jobs     I    II   III    IV
A                     9    13     8    11
B                    12    14    10    15
C                     7    15    20    13
D                    15    10     6    10
Using the table notation, where cij is the cost of the i-j assignment and xij
is the solution variable (1 = i assigned to j, 0 = i not assigned to j), we
can express directly the LP model of a balanced assignment problem
(minimization):

Min  ∑_{i=1}^{m} ∑_{j=1}^{m} cij xij

subject to

∑_{j=1}^{m} xij = 1 ,  i = 1, 2, ..., m
∑_{i=1}^{m} xij = 1 ,  j = 1, 2, ..., m
xij = 0 or 1 ,  i, j = 1, 2, ..., m
Assignment, as a special case of transportation, can be solved by any LP
solver. Moreover, we know that if the right-hand sides are integer (here all
equal to 1), the solution is also integer, so in fact the integer requirement
in the above model is redundant. Note that in the solution table there is
exactly one cell with value 1 in each row and in each column, and the other
cells have value 0. Due to this property of an assignment table there are fast
methods to find an initial basic feasible solution and to perform simplex
iterations. The so-called Hungarian method reduces the cost matrix, which is
possible due to the next theorem. After reducing the matrix, only entries with
zeros are used for the assignment; for this it might be necessary to create
more zeros.
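The integrality of the LP relaxation can be checked numerically. The sketch
below, using scipy.optimize.linprog and our own construction of the constraint
matrix, solves the relaxation of the example problem with the 0-1 requirement
relaxed to 0 ≤ xij ≤ 1; the optimum comes out integer.

import numpy as np
from scipy.optimize import linprog

# LP relaxation of the example assignment problem (rows = applicants A-D,
# columns = jobs I-IV). The 0-1 requirement is relaxed to 0 <= x_ij <= 1.
cost = np.array([[ 9, 13,  8, 11],
                 [12, 14, 10, 15],
                 [ 7, 15, 20, 13],
                 [15, 10,  6, 10]])
m = cost.shape[0]

# One equality constraint per row sum and one per column sum;
# variable x_ij is stored at flat index i*m + j.
A_eq = np.zeros((2 * m, m * m))
for k in range(m):
    A_eq[k, k * m:(k + 1) * m] = 1        # sum over j of x_kj = 1
    A_eq[m + k, k::m] = 1                 # sum over i of x_ik = 1
b_eq = np.ones(2 * m)

res = linprog(cost.flatten(), A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.x.reshape(m, m).round(2))       # a 0-1 matrix: the relaxation is tight
print("optimal cost:", res.fun)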
Theorem 6.20: The optimal solution to a balanced assignment problem remains
unchanged if a constant is subtracted from (or added to) any row or column of
the cost matrix.
Proof: Let pi be the constant subtracted from row i and let qj be the constant
subtracted from column j. This also covers addition, because adding a constant
means subtracting a negative one. Then the entry cij of the cost matrix changes
to:
dij = cij - pi - qj
The new value of the objective function is:

∑i ∑j dij xij = ∑i ∑j (cij − pi − qj) xij
             = ∑i ∑j cij xij − ∑i pi (∑j xij) − ∑j qj (∑i xij)
             = ∑i ∑j cij xij − ∑i pi (1) − ∑j qj (1)
             = ∑i ∑j cij xij − C

where every sum runs from 1 to m and C = ∑i pi + ∑j qj.
The new objective value thus differs from the original one only by the
constant C, so the optimal solution is not changed.
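The theorem can also be verified numerically with SciPy's assignment solver;
the row and column constants below are arbitrary illustrative choices of ours:

import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[ 9, 13,  8, 11],
                 [12, 14, 10, 15],
                 [ 7, 15, 20, 13],
                 [15, 10,  6, 10]])
p = np.array([1, 2, 3, 4])          # arbitrary row constants (illustrative)
q = np.array([5, 0, 2, 1])          # arbitrary column constants (illustrative)
reduced = cost - p[:, None] - q[None, :]

rows1, cols1 = linear_sum_assignment(cost)
rows2, cols2 = linear_sum_assignment(reduced)
# An optimal assignment of `reduced` is also optimal for `cost`, and the two
# objective values differ exactly by C = sum(p) + sum(q).
print(cost[rows1, cols1].sum(), cost[rows2, cols2].sum())
print(cost[rows1, cols1].sum() - reduced[rows2, cols2].sum(), p.sum() + q.sum())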
Assignment algorithm (Hungarian method)
If (number of sources ≠ number of destinations) then add dummy row(s)
or dummy column(s) with zero cost entries to get a square matrix
If (maximization) then reduce each column by the largest number in the column:
New entry = Largest number in the column - Old entry
Else (minimization) reduce each column by the smallest number in the column
Reduce each row by the smallest number in the row
Repeat
Cover all zeros by the minimum necessary number of lines
If (number of lines < number of assignments) then
Find the smallest not covered value x
Subtract x from all not covered cells
Add x to all cells covered twice
EndIf
Until (number of necessary lines = number of assignments)
Make assignments to zeros that are unique in their column or row (taking into
account only rows and columns not yet assigned)
Using the original table entries, compute the total cost (contribution) of the
optimal assignment.
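SciPy ships a ready-made solver for this problem,
scipy.optimize.linear_sum_assignment; the sketch below applies it to the
example table (the label lists are our own formatting):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Example table: rows are applicants A-D, columns are jobs I-IV.
cost = np.array([[ 9, 13,  8, 11],
                 [12, 14, 10, 15],
                 [ 7, 15, 20, 13],
                 [15, 10,  6, 10]])

rows, cols = linear_sum_assignment(cost)
for r, c in zip(rows, cols):
    print("applicant", "ABCD"[r], "-> job", ["I", "II", "III", "IV"][c])
print("total cost:", cost[rows, cols].sum())

# For contribution tables, pass maximize=True instead of converting the
# entries to opportunity losses by hand.
rows, cols = linear_sum_assignment(cost, maximize=True)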
Remarks:
1) Note that a maximization problem was converted into minimization of
opportunity losses relative to the maximum values in columns (alternatively
it could be done in rows).
2) After reducing both the columns and the rows there is at least one zero in
each row and in each column. Subsequent assignments will be made only to
zero entries.
3) The minimum number of covering lines equals the maximum number of possible
zero assignments (this is König's theorem), because each assignment covers
both its row and its column. If this number is equal to the number of
necessary assignments, then all assignments can be made to zeros. Otherwise
it is necessary to further reduce the matrix to create more zeros.
4) The matrix cannot be reduced directly, because there is already at least
one zero in each row and in each column (considering, obviously, only
nonnegative costs). But the following can be done:
• Select the minimum not covered cell
• Add this value to all covered rows and columns
• Subtract it from the whole matrix.
These steps amount to the operations given in the algorithm (a sketch of the
resulting reduction step follows this remark):
• Select the minimum not covered cell
• Add it to cells covered twice
• Subtract it from all not covered cells
• (Don't change cells covered once)
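A minimal sketch of this reduction step in NumPy, assuming boolean masks
row_covered and col_covered have already been produced by the line-covering
step (the function name and the example matrix are illustrative):

import numpy as np

def reduce_matrix(cost, row_covered, col_covered):
    """Create additional zeros: subtract the minimum uncovered entry from all
    uncovered cells and add it to the cells covered twice."""
    uncovered = ~row_covered[:, None] & ~col_covered[None, :]
    x = cost[uncovered].min()
    cost = cost.copy()
    cost[uncovered] -= x
    cost[row_covered[:, None] & col_covered[None, :]] += x
    return cost

# All zeros of this matrix are covered by a single line (the first column),
# which is fewer than the 3 assignments needed, so the matrix must be reduced.
cost = np.array([[0, 1, 2],
                 [0, 3, 4],
                 [0, 5, 6]])
print(reduce_matrix(cost,
                    np.array([False, False, False]),    # no row covered
                    np.array([True, False, False])))    # first column covered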
Modifications of the method:
1) Impossible assignments can be modeled by giving them a very big cost (or a
very big negative contribution). The problem can then be solved by the above
method, which eliminates the cells with prohibitive entries. If cells with
prohibitive entries end up in the optimal assignment, it is not possible to
make all m assignments.
2) The so-called Bottleneck Assignment does not minimize the total cost, but
the objective is to minimize the value of the maximum assigned cell.
Consider this situation: a group of workers move to a certain place, each is
assigned a certain job and they can return after all jobs are finished.
Assuming that they cannot help each other, the whole group has to wait until
the longest job is finished. The assignment matrix would in this case contain
the times needed by the workers to complete the jobs (again, impossible
assignments can be expressed by a very long time). A simple trick can convert
this problem into a standard assignment problem: rank the times in the matrix
in increasing order and then replace each matrix entry by 2^RANK, where RANK
is the order of that entry. Then solve the problem by the Hungarian method.
The point is that the value 2^RANK is greater than the sum of all smaller
powers 2^n for 0 ≤ n < RANK, so minimizing the total of the transformed
entries minimizes, above all, the largest assigned entry.
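A minimal sketch of the 2^RANK transformation, again using SciPy's assignment
solver on an illustrative times matrix of our own (for large matrices the
powers outgrow floating-point precision, so exact arithmetic would be needed):

import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative times matrix (workers x jobs), not an example from these notes.
times = np.array([[4, 8, 3],
                  [7, 2, 6],
                  [5, 9, 1]])

# Rank all entries in increasing order (ranks 1..9) and replace each entry by
# 2**rank; minimizing the total then minimizes the largest assigned time.
ranks = np.argsort(np.argsort(times, axis=None)).reshape(times.shape) + 1
transformed = np.power(2.0, ranks)

rows, cols = linear_sum_assignment(transformed)
print("bottleneck (longest job):", times[rows, cols].max())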
The Hungarian method was developed by H.W. Kuhn in 1955. It is based on results
of the Hungarian mathematicians König and Egerváry from about 1931, hence its
name.