Linear Programming
SOR 1320 Lecture Notes
Department of Statistics & Operations Research, Faculty of Science, University of Malta

Organization
Subject: 4 ECTS; 1 hour per week for 28 hours + tutorials; course held during semesters 1 and 2
Method of assessment: Examination
Prerequisites: Advanced level in Mathematics
Lecturer: Dr. Maria Kontorinaki (Email: maria.kontorinaki@um.edu.mt, Room: 508)
Authors: Dr. Mark Anthony Caruana, Dr. Natalie Attard

Contents
1. Introduction to Linear Programming
2. Review of Linear Algebra, Convex and Polyhedral Sets
3. Graphical Solution of Linear Programming Problems with Two Variables
4. Simplex Method for Linear Programming Problems with Two or More Variables
5. Duality Theory
6. Network Problems

Suggested Texts
1. Bazaraa, M.S., Jarvis, J.J. and Sherali, H.D. (1990) Linear Programming and Network Flows, Wiley.
2. Dantzig, G.B. and Thapa, M.N. (1997) Linear Programming 1: Introduction, Springer.
3. Dantzig, G.B. and Thapa, M.N. (2003) Linear Programming 2: Theory and Extensions, Springer.
4. Luenberger, D.G. (1984) Linear and Non-linear Programming, Addison-Wesley.
5. Walker, R.C. (1999) Introduction to Mathematical Programming, Prentice Hall Inc.
6. Williams, H.P. (1990) Model Building in Mathematical Programming, Wiley.
7. Bazaraa, M.S., Sherali, H.D. and Shetty, C.M. (1993) Non-linear Programming: Theory, Algorithms and Applications, Prentice Hall Inc.
8. Wolsey, L.A. (1998) Integer Programming, Wiley.
9. Sierksma, G. (1996) Linear and Integer Programming, Marcel Dekker, Inc.
10. Taha, H.A. (1997) Operations Research: An Introduction, Prentice Hall Inc.
11. Winston, W.L. (1994) Operations Research: Applications and Algorithms, Duxbury Press.

Chapter 1
Introduction to Linear Programming

A Linear Program is a mathematical program in which a linear function is maximized (or minimized) subject to a set of linear constraints. This problem class is broad enough to encompass many interesting applications, while remaining tractable even for large-scale problems.

1.1 Overview of the history of Linear Programming

Linear Programming was developed in the 1940s to solve complex planning problems in wartime operations. Its use spread widely in the postwar period, as many industries found valuable applications for it. The father of the subject is George B. Dantzig, who introduced the simplex method for solving Linear Programming problems in 1947. In the same year, John von Neumann established the theory of duality. The mathematician Leonid Kantorovich and the economist Tjalling Koopmans were awarded the Nobel Prize in Economics in 1975 for their contributions to the theory of optimal allocation of resources, in which Linear Programming played a crucial part. Nowadays many industries use linear programming as a standard tool to find an optimal allocation of resources. Other important applications of Linear Programming include airline crew scheduling, shipping and telecommunication networks, oil refining and blending, and stock and bond portfolio selection.

1.2 Examples of Linear Programming Problems

Example 1.1 Product Mix Problem: An engineering factory produces electronics, electrical appliances, toys and die-casting components using four production processes: assembling, grinding, drilling and testing. Each electronic yields a profit of €200 per unit, electrical appliances a profit of €300 per unit, while toys and die-casting components produce a profit of €100 and €150 per unit respectively.
Each unit of the above products requires a certain time on each process, as shown in the following table:

                    Electronics   Electrical Appliances   Toys   Die-Castings
Assembling (hrs)         3                 2.1             1.5        1.5
Grinding (hrs)           4.3               1.2             3          6
Drilling (hrs)           5                 3               2          2
Testing (hrs)            7                 6               6          5

The assembly machine works 40 hours per week, the grinding and drilling machines 50 hours per week each, while the testing machine works for 20 hours. The problem is to determine the weekly production plan that maximizes the total profit.

Example 1.2 Diet Problem: A hospital dietician must prepare breakfast menus every morning for the patients. The dietician's responsibility is to make sure that the minimum daily requirements for vitamins A and B are met; at the same time the menus must be kept at a minimum cost to avoid waste. The breakfast supplements which include vitamins A and B are eggs, bacon and cereal bars. One egg contains 2 mg of vitamin A and 3 mg of vitamin B, each bacon strip contains 4 mg of vitamin A and 2 mg of vitamin B, and one cereal bar contains 1 mg of each of vitamin A and vitamin B. The minimum daily requirements of vitamins A and B are 16 mg and 12 mg respectively. Each egg costs 4 cents, one bacon strip costs 3 cents, whereas one cereal bar costs 20 cents. The dietician wants to determine how much of each supplement to serve in order to meet the minimum daily requirements of vitamins A and B at minimum cost.

Example 1.3 Investment Problem: An investor wants to invest exactly €10000 in two types of funds: the Accumulator Fund and the Vilhena Fund. The Accumulator Fund pays a dividend of 8% while the Vilhena Fund pays a dividend of 5%. The investor is advised to invest no more than €3000 in the Vilhena Fund. In addition, the amount invested in the Accumulator Fund must be at least twice the amount invested in the Vilhena Fund. How much should the investor invest in each fund in order to maximize his revenue?

1.3 Steps involved in building a good linear programming model

To solve optimization problems, and Linear Programming (LP) problems in particular, one must follow these steps:
i. Define the goal of the study - clarify the objective of the study and the decisions to be taken
ii. Construct the model - develop an appropriate mathematical description of the problem
iii. Solve the model constructed in step ii
iv. Interpret the results obtained
v. Perform sensitivity analysis - establish how the solution varies with some variation in the model

1.4 Formulation of an LP model

An LP model is made up of the following basic components:

i. Decision variables: the quantities whose optimal values we need to determine. In the product mix problem, for example, the decision variables are the amounts of electronics, electrical appliances, toys and die-castings that we need to produce to maximize the profit of the company.

ii. Parameters: exact or approximate values which are known to the analyst. In the product mix problem the parameters are
a. the profit per unit of each product
b. the time per unit of each product required on each of the four processes
c. the availability of the resources

iii. Constraints: the restrictions which limit the values of the decision variables. For example, in the product mix problem the total assembly hours per week must not exceed 40 hours.

iv. Objective function: a linear function of the decision variables which may represent profit/contribution or cost. Thus the optimal decision variables are those which maximize profit or minimize cost, i.e. which maximize or minimize the objective function. In the product mix problem we need to maximize the profit the company makes from the four products.
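To see all four components together, the diet problem of Example 1.2 can be written down directly (a worked formulation; x1, x2, x3 denote the number of eggs, bacon strips and cereal bars served, costs in cents, vitamins in mg):

$$\begin{aligned}
\min\quad & 4x_1 + 3x_2 + 20x_3 \\
\text{subject to}\quad & 2x_1 + 4x_2 + x_3 \ge 16 \quad \text{(vitamin A)} \\
& 3x_1 + 2x_2 + x_3 \ge 12 \quad \text{(vitamin B)} \\
& x_1, x_2, x_3 \ge 0
\end{aligned}$$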
LP formulation:

$$\min / \max\ c^T x \quad \text{subject to} \quad Ax \bullet b, \quad x \ge 0$$

where • ∈ {≤, ≥, =} or a mixture of these.

Example 1.1 continued: Product Mix Problem

Identifying the decision variables:
x1 : amount of electronics to produce
x2 : amount of electrical appliances to produce
x3 : amount of toys to produce
x4 : amount of die-castings to produce

Identifying the objective function - maximize total profit:
max 200x1 + 300x2 + 100x3 + 150x4

Identifying the constraints:
i. Assembling hours: 3x1 + 2.1x2 + 1.5x3 + 1.5x4 ≤ 40
ii. Grinding hours: 4.3x1 + 1.2x2 + 3x3 + 6x4 ≤ 50
iii. Drilling hours: 5x1 + 3x2 + 2x3 + 2x4 ≤ 50
iv. Testing hours: 7x1 + 6x2 + 6x3 + 5x4 ≤ 20
v. x1, x2, x3 and x4 must be non-negative (we cannot produce negative amounts)

LP model:

max 200x1 + 300x2 + 100x3 + 150x4
subject to
3x1 + 2.1x2 + 1.5x3 + 1.5x4 ≤ 40
4.3x1 + 1.2x2 + 3x3 + 6x4 ≤ 50
5x1 + 3x2 + 2x3 + 2x4 ≤ 50
7x1 + 6x2 + 6x3 + 5x4 ≤ 20
x1, x2, x3, x4 ≥ 0

Note that there is no requirement to produce only integer amounts (fractions may, for example, represent partially finished products), and note that any combination of the four products is acceptable, including producing just one product. If these facts were not true, there would be other constraints. Problems with integrality requirements are solved by special methods known as Integer Linear Programming; such problems are beyond the scope of this credit.
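Although the solution methods only come in later chapters, it may help to see how such a model is handed to an off-the-shelf solver. A minimal sketch using scipy.optimize.linprog (linprog minimizes, so the profits are negated; the numbers are exactly those of the model above):

```python
from scipy.optimize import linprog

# Product mix model of Example 1.1: maximize profit = minimize (-profit).
c = [-200, -300, -100, -150]          # negated profits per unit
A_ub = [[3,   2.1, 1.5, 1.5],         # assembling hours
        [4.3, 1.2, 3,   6  ],         # grinding hours
        [5,   3,   2,   2  ],         # drilling hours
        [7,   6,   6,   5  ]]         # testing hours
b_ub = [40, 50, 50, 20]               # weekly machine availability

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)
print(res.x, -res.fun)                # optimal plan and maximum profit
```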
Chapter 2
Review of Linear Algebra, Convex and Polyhedral Sets

The aim of this chapter is to give some definitions and theoretical results related to Linear Programming. In particular we shall review results from vector and matrix algebra and from convex analysis. These results will motivate the methods used for solving LP problems.

2.1 Vectors and Matrices

Definition 2.1: An n-vector (or a vector of dimension n) is an array (either a row or a column) of n numbers. Vectors shall be represented with bold letters:

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \ \text{(column vector)}, \qquad x^T = (x_1, x_2, \ldots, x_n) \ \text{(row vector)}$$

where x^T denotes the transpose of the vector x.

Definition 2.2: The Euclidean space ℝⁿ is the collection of all n-vectors.

Definition 2.3: An m × n matrix A is a rectangular array of mn numbers, where m represents the number of rows and n the number of columns:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

If m = n, then the matrix is said to be a square matrix of order n. Also, the entries aii for 1 ≤ i ≤ n form the main diagonal of the matrix A.

2.2 Properties of Matrices

Let α and β be scalars and A, B and C matrices (of compatible dimensions). Then:
i. A + B = B + A
ii. A + (B + C) = (A + B) + C
iii. (AB)C = A(BC)
iv. A(B + C) = AB + AC
v. (A + B)C = AC + BC
vi. α(βA) = (αβ)A
vii. (α + β)A = αA + βA
viii. α(A + B) = αA + αB
ix. A(αB) = α(AB)
x. (A^T)^T = A
xi. (A + B)^T = A^T + B^T
xii. (AB)^T = B^T A^T
xiii. (αA)^T = αA^T

Note: in general AB ≠ BA.

2.3 Matrix Row Operations

An elementary row operation on an m × n matrix A consists of ONE of the following operations:
- interchanging two rows of A
- multiplying a row of A by a non-zero constant
- adding a multiple of one row of A to another row of A

2.4 Systems of Linear Equations

A system of m linear equations in n unknowns

$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$

can be represented in matrix form as Ax = b, where

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

The same notation is used for systems of ≤ and ≥ inequalities: Ax ≤ b and Ax ≥ b.

2.5 Inverse of a Square Matrix

Definition 2.4: An n × n matrix A⁻¹ is said to be the inverse of an n × n matrix A if A⁻¹A = Iₙ, where Iₙ is the n × n identity matrix. If A⁻¹ exists, then A is said to be nonsingular or invertible; otherwise it is said to be singular or noninvertible.

Properties:
i. If A is invertible, then A⁻¹ is invertible and (A⁻¹)⁻¹ = A.
ii. If A and B are invertible, then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.
iii. If A is invertible, then A^T is invertible and (A^T)⁻¹ = (A⁻¹)^T.

2.6 Linear Independence, Basis and Spanning Set

Definition 2.5: A vector b in ℝⁿ is a linear combination of vectors a1, a2, ..., ak in ℝⁿ if

$$b = \sum_{j=1}^{k} \lambda_j a_j$$

where λ1, λ2, ..., λk are real numbers. If λj ≥ 0 for j = 1, 2, ..., k, then b is a non-negative linear combination of a1, a2, ..., ak. If the coefficients λj are restricted to satisfy Σ_{j=1}^{k} λj = 1, then b is an affine combination of a1, a2, ..., ak. Furthermore, if the coefficients λj are in addition restricted to be nonnegative, then b is a convex combination, and if all coefficients λj are positive, then b is a strict convex combination of a1, a2, ..., ak.

Definition 2.6: A collection of vectors a1, a2, ..., ak in ℝⁿ is called linearly independent if Σ_{j=1}^{k} λj aj = 0 implies λj = 0 for j = 1, 2, ..., k; otherwise the vectors a1, a2, ..., ak are linearly dependent. The vectors a1, a2, ..., ak are affinely independent if a2 − a1, ..., ak − a1 are linearly independent.

Definition 2.7: A collection of vectors a1, a2, ..., ak in ℝⁿ is said to span ℝⁿ if any vector in ℝⁿ can be represented as a linear combination of a1, a2, ..., ak. The collection forms a basis of ℝⁿ if it spans ℝⁿ and, whenever any one of the vectors is removed, the remaining collection does not span ℝⁿ. It can be shown that for a basis: 1) k = n, and 2) a1, a2, ..., ak are linearly independent.

Definition 2.8: A set S in ℝⁿ is called a convex set if for any two points x1 and x2 in S the vector λx1 + (1−λ)x2 belongs to S for all λ ∈ [0,1]. In other words, the line segment connecting x1 and x2 is part of S. Note that λx1 + (1−λ)x2 for 0 ≤ λ ≤ 1 is a convex combination (weighted average) of x1 and x2; for 0 < λ < 1 it is a strict convex combination (the endpoints of the segment are excluded).

Lemma 2.9: Let S1 and S2 be convex sets in ℝⁿ. Then:
1) S1 ∩ S2 is convex.
2) S1 + S2 = {x1 + x2 : x1 ∈ S1, x2 ∈ S2} is convex.
3) S1 − S2 = {x1 − x2 : x1 ∈ S1, x2 ∈ S2} is convex.

Proof of 1): Any two points of S1 ∩ S2 lie both in S1 and in S2. So the line segment connecting them also lies both in S1 and in S2, because these sets are convex; hence it lies in S1 ∩ S2. This completes the proof.

Also note that S1 ∩ S2 ∩ S3 = (S1 ∩ S2) ∩ S3, so the intersection of any number of convex sets is also a convex set. This result is very important because the feasible set of a mathematical program is generally the intersection of the sets satisfying the individual constraints.

Proof of 2) and 3): Exercise.
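These definitions are easy to experiment with numerically. A small sketch (illustrative data, assuming numpy is available) checking Definition 2.8 for a polyhedron {x : Ax ≤ b}: every convex combination of two feasible points stays feasible.

```python
import numpy as np

# A polyhedron {x : Ax <= b} in R^2 (arbitrary illustrative data).
A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([4.0, 0.0, 0.0])

x1 = np.array([1.0, 2.0])   # feasible: A @ x1 <= b holds
x2 = np.array([3.0, 0.5])   # feasible as well

for lam in np.linspace(0.0, 1.0, 5):
    x = lam * x1 + (1.0 - lam) * x2       # convex combination of x1, x2
    assert np.all(A @ x <= b + 1e-12)     # it stays inside the polyhedron
```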
Definition 2.10: Let S be any set in ℝⁿ. The convex hull of S, denoted conv(S), is the collection of all convex combinations of points of S. In other words, x ∈ conv(S) iff x can be expressed as

$$x = \sum_{j=1}^{k} \lambda_j x_j, \qquad \sum_{j=1}^{k} \lambda_j = 1, \qquad \lambda_j \ge 0 \ \text{for } j = 1, \ldots, k$$

where k is a positive integer and x1, x2, ..., xk ∈ S.

Definition 2.11: The convex hull of a finite number of points x1, x2, ..., x_{k+1} in ℝⁿ is called a polytope. If the points x1, x2, ..., x_{k+1} are affinely independent, then the convex hull conv(x1, x2, ..., x_{k+1}) is called a simplex with vertices x1, x2, ..., x_{k+1}. In ℝⁿ the maximum number of linearly independent vectors is n, so a simplex cannot have more than n+1 vertices. Later we shall see that the simplex method used to find optimal solutions of linear problems is basically a movement along the edges of a simplex, which explains its name.

Next, the so-called Carathéodory Theorem shows that any point in the convex hull of a set S in ℝⁿ can be represented as a convex combination of at most n+1 points of S. For a simplex this means that every point of the simplex can be represented as a convex combination of its corners.

Theorem 2.12: Let S be any set in ℝⁿ. If x ∈ conv(S), then x ∈ conv(x1, x2, ..., x_{n+1}) for some points xj ∈ S, j = 1, ..., n+1. In other words, x can be represented as

$$x = \sum_{j=1}^{n+1} \lambda_j x_j, \qquad \sum_{j=1}^{n+1} \lambda_j = 1, \qquad \lambda_j \ge 0 \ \text{for } j = 1, \ldots, n+1$$

Proof: The theorem is trivially true for x ∈ S (explain why). Since x ∈ conv(S), then by the definition of a convex hull x can be expressed as

$$x = \sum_{j=1}^{k} \lambda_j x_j, \qquad \sum_{j=1}^{k} \lambda_j = 1, \qquad \lambda_j \ge 0, \qquad x_j \in S \ \text{for } j = 1, \ldots, k$$

If k ≤ n+1 the theorem is proved. Next we shall show that if k > n+1, it is possible to eliminate one term with λi > 0. Because k ≥ n+2, the vectors x2−x1, x3−x1, ..., xk−x1 are linearly dependent, so there exist scalars µj, not all zero, such that

$$\sum_{j=2}^{k} \mu_j (x_j - x_1) = 0$$

The sum can be expressed as

$$\sum_{j=2}^{k} \mu_j x_j - \left(\sum_{j=2}^{k} \mu_j\right) x_1 = \sum_{j=1}^{k} \mu_j x_j = 0,
\qquad \text{where } \mu_1 = -\sum_{j=2}^{k} \mu_j, \ \text{so that } \sum_{j=1}^{k} \mu_j = 0$$

Now for any real number α, x can be represented as

$$x = \sum_{j=1}^{k} \lambda_j x_j + 0 = \sum_{j=1}^{k} \lambda_j x_j - \alpha \sum_{j=1}^{k} \mu_j x_j = \sum_{j=1}^{k} (\lambda_j - \alpha \mu_j) x_j$$

We now choose α in such a way that one of the coefficients in the above sum becomes zero. Note that x is represented as a convex combination, so the coefficients must remain nonnegative. That is why we choose α as

$$\alpha = \min_{1 \le j \le k} \left\{ \frac{\lambda_j}{\mu_j} : \mu_j > 0 \right\} = \frac{\lambda_i}{\mu_i} \quad \text{for some } i \in \{1, \ldots, k\}$$

(at least one µj is positive, since the µj are not all zero and sum to zero). Note that α ≥ 0, and for µj ≤ 0 the coefficient λj − αµj ≥ 0. For µj > 0 we have λj/µj ≥ λi/µi = α, so λj ≥ αµj, i.e. λj − αµj ≥ 0. Now x is represented as

$$x = \sum_{\substack{j=1 \\ j \ne i}}^{k} (\lambda_j - \alpha \mu_j) x_j$$

Moreover,

$$\sum_{j=1}^{k} (\lambda_j - \alpha \mu_j) = \sum_{j=1}^{k} \lambda_j - \alpha \sum_{j=1}^{k} \mu_j = \sum_{j=1}^{k} \lambda_j = 1$$

and λj − αµj ≥ 0 for j = 1, ..., k. In other words, x is represented as a convex combination of at most k−1 points of S. This can be repeated until at most n+1 points remain, which completes the proof.

Definition 2.13: A hyperplane H in ℝⁿ is a set of vectors {x : p^T x = k}, where k is a scalar and p ≠ 0 is the normal or gradient vector of H. A hyperplane can also be expressed by eliminating k: let p^T x0 = k for a certain vector x0 on H. Then p^T x = p^T x0, i.e. p^T(x − x0) = 0, so H = {x : p^T(x − x0) = 0}. The vector p is orthogonal to all vectors (x − x0) for x ∈ H, and so it is perpendicular to the surface of the hyperplane H, which explains its name. A hyperplane is a convex set (prove it).
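Definitions 2.10-2.11 also have a computational side: deciding whether a point x lies in conv(x1, ..., xk) amounts to finding weights λj ≥ 0 with Σλj = 1 and Σλj xj = x, which is a linear feasibility problem. A minimal sketch with made-up points, assuming scipy is available:

```python
import numpy as np
from scipy.optimize import linprog

# Is x in conv(x1, ..., xk)?  Find lambda >= 0 with sum(lambda) = 1
# and sum_j lambda_j x_j = x  (pure feasibility LP: zero objective).
pts = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]])   # x1, x2, x3 in R^2
x = np.array([1.0, 1.0])                                # query point

k = len(pts)
A_eq = np.vstack([pts.T, np.ones(k)])   # coordinate rows, then the weight-sum row
b_eq = np.append(x, 1.0)

res = linprog(np.zeros(k), A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * k)
print("in hull:", res.status == 0, "weights:", res.x)   # status 0 = feasible/optimal
```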
Definition 2.14: A hyperplane divides ℝⁿ into two halfspaces. A closed halfspace is a set of vectors {x : p^T x ≥ k} or {x : p^T x ≤ k}. The union of these two sets is ℝⁿ; their intersection is the hyperplane itself. Open halfspaces are defined by strict inequalities. Halfspaces can also be expressed by eliminating k: let p^T x0 = k for a certain vector x0 on H; then a halfspace is a set {x : p^T(x − x0) ≤ 0} or {x : p^T(x − x0) ≥ 0}. A halfspace is a convex set (prove it).

Definition 2.15: A polyhedral set or polyhedron is the intersection of a finite number m of closed halfspaces. Each halfspace can be represented by an inequality pj^T x ≤ bj, where bj is a scalar and pj is its normal vector. So a polyhedron is a set {x : Ax ≤ b}, where A is an m × n matrix whose j-th row is pj^T, and b^T = (b1 b2 ... bm). A polyhedron is a convex set (prove it). Note that an inequality can be reversed by multiplying both sides by −1, and an equality can be expressed as two inequalities; nonnegativity conditions can also be expressed in terms of halfspaces. So there are several alternative ways to express a polyhedron, for example:

{x : Ax ≥ b}, {x : Ax ≤ b, x ≥ 0}, {x : Ax = b, x ≥ 0}

Definition 2.16: A polyhedral cone is a polyhedral set whose defining hyperplanes all pass through the origin, i.e. a set {x : Ax ≤ 0}. This set definition is obtained by expressing the hyperplanes in the form p^T(x − x0) ≤ 0 with x0 = 0.

The following theorem, known as Farkas' Theorem, is very important in the derivation of optimality conditions for linear and non-linear problems.

Theorem 2.17: Let A be an m × n matrix and c an n-vector. Then exactly ONE of the following two systems has a solution:
System 1: Ax ≤ 0 and c^T x > 0 for some x in ℝⁿ
System 2: A^T y = c and y ≥ 0 for some y in ℝᵐ

Writing aj, j = 1, ..., m, for the columns of the matrix A^T (i.e. the rows of A), System 2 says that the vector c is a nonnegative linear combination of the aj. In System 1 the same vectors play the role of normal vectors of the hyperplanes defining the halfspaces whose intersection is the closed convex cone {x : Ax ≤ 0}. System 1 has a solution iff this cone has a nonempty intersection with the open halfspace {x : c^T x > 0}; but in that case c cannot be a nonnegative linear combination of the vectors aj. See the illustrating pictures taken from [15].

Definition 2.18: A ray is a set of points of the form {x0 + λd : λ ≥ 0}, where x0 is the vertex and d ≠ 0 is the direction of the ray.

Definition 2.19: A direction d of a convex set S is a nonzero vector such that for every x0 ∈ S the ray {x0 + λd : λ ≥ 0} also belongs to the set S. For two directions d1 and d2 of a convex set, any convex combination λd1 + (1−λ)d2 is also a direction. An extreme direction is a direction that cannot be represented as a positive linear combination of two distinct directions of the set. Clearly a bounded convex set has no directions.

Definition 2.20: A point x in a convex set S is called an extreme point of S if x cannot be represented as a strict convex combination of two distinct points of S. In other words, if x = λx1 + (1−λ)x2 with λ ∈ (0,1) and x1, x2 ∈ S, then x = x1 = x2.
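Farkas' Theorem can be probed numerically: System 2 is itself a linear feasibility problem, and by the theorem its infeasibility certifies that System 1 is solvable. A small illustrative sketch (made-up data, assuming scipy is available):

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 0.0], [0.0, 1.0]])   # example data
c = np.array([1.0, -1.0])                # try a c with a negative component

# System 2: does some y >= 0 solve A^T y = c?  (feasibility LP, zero objective)
res = linprog(np.zeros(A.shape[0]), A_eq=A.T, b_eq=c,
              bounds=[(0, None)] * A.shape[0])

if res.status == 0:
    print("System 2 solvable, y =", res.x)   # c is a nonneg. combination of rows of A
else:
    print("System 2 infeasible, so System 1 has a solution")  # Farkas' alternative
```

Here System 2 is infeasible, and indeed x = (0, −1) solves System 1: Ax = (0, −1) ≤ 0 while c^T x = 1 > 0.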
The next list is a summary of important facts about extreme points and extreme directions of convex and polyhedral sets. Some will be proved later in the context of the simplex method. In the following list we assume that S = {x : Ax = b, x ≥ 0} is nonempty.

- The number of extreme points of S is finite.
- S has at least one extreme point.
- The number of extreme directions of S is finite (possibly zero).
- S has at least one extreme direction iff it is unbounded.

Next, the so-called Representation Theorem gives a way to describe a polyhedral set by means of its extreme points and extreme directions. This fact is fundamental to linear programming.

Theorem 2.21: Let S = {x : Ax = b, x ≥ 0} be a nonempty polyhedral set. Then the set of extreme points x1, x2, ..., xk is nonempty and finite. The set of extreme directions is empty iff S is bounded; if S is unbounded, then the set of extreme directions d1, d2, ..., dl is nonempty and finite. A point x belongs to S iff it can be represented as a convex combination of the extreme points plus a nonnegative linear combination of the extreme directions:

$$x = \sum_{j=1}^{k} \lambda_j x_j + \sum_{j=1}^{l} \mu_j d_j, \qquad \sum_{j=1}^{k} \lambda_j = 1, \quad \lambda_j \ge 0, \quad \mu_j \ge 0$$

Note: The above theorem also holds for constraint sets with various types of inequalities (an equality can be replaced by the two inequalities ≥ and ≤).

Chapter 3
Graphical Solution of Linear Programming Problems with Two Variables

An LP model consisting of just two decision variables can be solved using the Graphical Method.

Step 1: Formulate the LP model.
Step 2: Draw axes for the variables x1 and x2. The scales may be different, but both must start at zero and both must be linear.
Step 3: Draw each limitation as a separate line on the graph. The lines define the Feasible Region (the set of acceptable solutions). If not stated explicitly, assume that x1 ≥ 0 and x2 ≥ 0.
Step 4: Draw a line (also called an iso-profit line) that represents a certain value of the objective function. Then draw a parallel line that touches the feasible region so as to maximize/minimize the value of the objective function.
Step 5: Compute the exact values of the decision variables at the optimal corner of the feasible region and the corresponding optimal value of the objective function. The corner of the feasible region determines the binding constraints (limiting factors).

Note: There may be more than one optimal solution if the iso-profit lines are parallel to a limitation line.

Example 3.1: A certain manufacturer produces two products, called A and B. Product A has a contribution of 4 per unit, product B has a contribution of 5 per unit. To produce the products, the following resources are required:

            Decision   Machine   Labour   Material
            variable   hours     hours    [kg]
Product A   x1         4         3        1
Product B   x2         2         5        1

Resources available per week: 100 machine hours, 150 labour hours and 50 kilograms of material. There are no other limitations. The manufacturer wants to establish a weekly production plan that maximizes the total contribution.

Standard linear programming model:

Maximize 4x1 + 5x2
Subject to
4x1 + 2x2 ≤ 100
3x1 + 5x2 ≤ 150
x1 + x2 ≤ 50
x1 ≥ 0, x2 ≥ 0

Note that there is no requirement to produce only integer amounts (fractions may, for example, represent partially finished products), and note that any combination of the two products is acceptable, including producing only one of the two products. If these facts were not true, there would be other limitations, such as x1 ≥ 5 (produce at least 5 units of product A) or x1 − x2 ≤ 3 (production of A must not exceed production of B by more than 3), etc.
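Before (or after) solving Example 3.1 graphically, the answer can be cross-checked numerically. A minimal sketch with scipy.optimize.linprog, which minimizes, hence the negated contributions:

```python
from scipy.optimize import linprog

# Example 3.1: maximize 4*x1 + 5*x2 subject to the three resource limits.
res = linprog(c=[-4, -5],                       # negate to turn max into min
              A_ub=[[4, 2], [3, 5], [1, 1]],    # machine, labour, material rows
              b_ub=[100, 150, 50],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # expect roughly x1 = 100/7, x2 = 150/7, z = 1150/7
```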
3.1 Classification of Models

i) Unboundedness

Unboundedness can apply both to the feasible region and to the objective value. An unbounded feasible region means that at least one variable is not limited in value (nonnegativity is always assumed); this is typical for minimization problems. An unbounded objective value means that the objective value can grow to +∞ (maximization) or decrease to −∞ (minimization). Clearly a bounded feasible region implies a bounded objective value (we obviously assume finite objective coefficients). For an unbounded feasible region the objective value can be either bounded or unbounded. So an unbounded model means a model with an unbounded objective value.

ii) Feasibility

Feasibility refers to whether a solution exists or not. An infeasible model does not have any feasible solution: no vector exists that would satisfy all (in)equalities, including nonnegativity. A feasible model is a model that has at least one feasible solution. Often this adjective is used for models that are both feasible and bounded (i.e. models that have at least one optimal solution). So there are three types of models:
i. Feasible
ii. Unbounded
iii. Infeasible

Next we shall mostly assume that the model is feasible, because for practical problems infeasibility and/or unboundedness are usually caused by wrong model specification.
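All three model types can be observed directly in a solver, which reports a status code alongside any solution. A small sketch using scipy.optimize.linprog (its status codes include 0 = optimal, 2 = infeasible, 3 = unbounded):

```python
from scipy.optimize import linprog

# Infeasible: x1 <= 1 and x1 >= 2 cannot both hold.
infeasible = linprog(c=[1], A_ub=[[1], [-1]], b_ub=[1, -2])
print(infeasible.status)   # 2 = infeasible

# Unbounded: maximize x1 + x2 (min of the negation) with only x1 + x2 >= 1.
unbounded = linprog(c=[-1, -1], A_ub=[[-1, -1]], b_ub=[-1])
print(unbounded.status)    # 3 = unbounded
```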
Chapter 4
Simplex Method to Solve General Linear Programming Problems

The simplex method represents one of the most famous algorithms of Operations Research. The underlying problem was originally described by the Russian mathematician Kantorovich in 1939, but his work remained unknown internationally until 1959; meanwhile Dantzig discovered the algorithm in 1947. When using the simplex method, it is convenient to convert all Linear Programming problems into a standard form, as shown below.

Changing Constraint Type

1. Add a slack variable to each ≤ inequality constraint.
Example: 4x1 + 5x2 ≤ 150 → 4x1 + 5x2 + s1 = 150, s1 ≥ 0
s1 = amount of the resource not used (s1 = 150 − amount used)
Feasible initial point: x1 = 0, x2 = 0, s1 = 150.

2. Subtract a surplus variable from each ≥ inequality constraint and add an artificial variable.
Example: 4x1 + 5x2 ≥ 130 → 4x1 + 5x2 − s1 = 130, s1 ≥ 0
s1 = excess over the minimum (s1 = actual value − 130)
The feasible initial point cannot be chosen as above (x1 = 0, x2 = 0, s1 = −130), because s1 has to be nonnegative, and selecting certain nonzero values of x1 and x2 would require testing for feasibility. To find a feasible initial point quickly, add an artificial variable:
4x1 + 5x2 − s1 = 130 → 4x1 + 5x2 − s1 + a1 = 130, s1, a1 ≥ 0
a1 = a mathematical tool without practical (model) interpretation
Feasible initial point: x1 = 0, x2 = 0, s1 = 0, a1 = 130.

3. Add an artificial variable to each equality constraint. This again quickly provides a feasible initial point.
Example: 4x1 + 5x2 = 130 → 4x1 + 5x2 + a1 = 130, a1 ≥ 0
a1 = a mathematical tool without practical (model) interpretation
Feasible initial point: x1 = 0, x2 = 0, a1 = 130.

Using the conversion methods shown above, all Linear Programming problems can be transformed into the following standard form: find a vector x that minimizes (or maximizes) z = c^T x, such that Ax = b and x ≥ 0, where

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}, \quad
c^T = (c_1\ c_2\ \cdots\ c_n), \quad m < n$$

The set of equations Ax = b can also be expressed as

x1a1 + x2a2 + … + xnan = b

where the xj are elements of the vector x and the aj are columns of the matrix A (1 ≤ j ≤ n). Next we shall consider a minimization problem; results for maximization will mostly differ only in signs (max c^T x ≡ min −c^T x) and types of inequalities.

Note that the vector x contains the original solution variables together with all slack, surplus and artificial variables used to convert the original constraints into equalities with a trivial initial feasible solution. The m × n matrix A contains all coefficients and is also supposed to contain a unity matrix as a submatrix (usually, but not necessarily, in the last m columns). b is the m-vector of right-hand sides, and c is the n-vector of objective coefficients, which includes zeros for slacks/surpluses and possibly penalties (+M, −M) for artificial variables - see later.

At this stage it is also convenient to mention the various solutions that an LP program may have.

Types of solutions (of a feasible LP model)
• A feasible solution satisfies all constraints, including nonnegativity.
• An infeasible solution does not satisfy all constraints, including nonnegativity.
• A basic solution has m basic variables and n−m zero nonbasic variables.
• A basic nondegenerate solution has m nonzero basic variables and n−m zero nonbasic variables. The short name "basic solution" is often used instead.
• A basic degenerate solution has m basic variables, some of which are zero, and n−m zero nonbasic variables.
• A nonbasic solution has more than m nonzero variables.

Of the various possible combinations, these are particularly important:
• A basic feasible nondegenerate solution is a feasible solution that has m positive basic variables and n−m zero nonbasic variables.
• A basic feasible degenerate solution has fewer than m positive variables and more than n−m zero variables.
• A basic feasible solution (BFS) covers both of the above cases.
• An optimal solution is a basic feasible solution such that no other feasible solution has a bigger (maximization) or smaller (minimization) objective value. There can be more than one optimal solution.

4.1 Geometry of the Simplex Method

The simplex method is based on the so-called Fundamental Theorem of Linear Programming, which says that if an LP model has an optimal feasible solution, then there exists a basic feasible solution that is optimal. The aim of this section is to state and prove four theorems, which will then be used to prove the Fundamental Theorem of Linear Programming.

Theorem 4.1: The feasible region of a feasible LP problem is a convex set.

Proof: We already know that the feasible region is a polyhedron and hence convex. Nevertheless, here is a proof tailored to LP problems. Let x1, x2 be any two feasible solutions to the LP problem.
Then all elements of x1 and x2 are nonnegative, and Ax1 = b and Ax2 = b. Let x = λx1 + (1−λ)x2 with 0 ≤ λ ≤ 1 be a convex combination of x1 and x2 (a point of the line segment connecting x1 and x2). Clearly all elements of x are nonnegative, and the following holds:

Ax = A(λx1 + (1−λ)x2) = λAx1 + (1−λ)Ax2 = λb + (1−λ)b = b

Since x is therefore also a feasible solution, the feasible region is convex.

Theorem 4.2: If an LP problem has an optimal feasible solution, there must be an extreme point (corner) of the feasible region that is optimal.

Proof (i): assumption - the feasible region is bounded.

Let us assume that the feasible region is bounded; later we shall see that the theorem is also valid for some cases with unbounded feasible regions. Let xp be the optimal solution and let {x1, x2, ..., xk} be the set of extreme points. From the Representation Theorem we know that this set is finite and nonempty. Since we are minimizing, c^T x ≥ c^T xp for all points x of the feasible region. Now let us express the optimal point xp as a convex combination of extreme points, xp = λ1x1 + λ2x2 + ... + λkxk. Then it holds that

c^T xp = c^T(λ1x1 + λ2x2 + ... + λkxk) = λ1c^T x1 + λ2c^T x2 + ... + λkc^T xk ≥ λ1c^T xq + λ2c^T xq + ... + λkc^T xq = (λ1 + λ2 + ... + λk)c^T xq = c^T xq

where xq is the extreme point with the minimum value of the objective function:

c^T xq = min{c^T x1, c^T x2, ..., c^T xk}

So we have c^T xp ≥ c^T xq ≥ c^T xp, from which it follows that c^T xq = c^T xp, so the extreme point xq is optimal.

It is possible that the objective function is optimal at more than one extreme point. Then it is also optimal at any convex combination of these optimal extreme points. To show this, assume that the LP problem has r optimal extreme points x1, x2, ..., xr with a certain objective value c^T xp, and let

x = λ1x1 + λ2x2 + ... + λrxr, λi ≥ 0, Σ_{i=1}^{r} λi = 1

be their convex combination. Then:

c^T x = c^T(λ1x1 + λ2x2 + ... + λrxr) = λ1c^T x1 + λ2c^T x2 + ... + λrc^T xr = (λ1 + λ2 + ... + λr)c^T xp = c^T xp

So the objective value of a convex combination of optimal extreme points is also optimal.

Proof (ii): assumption - the feasible region can be either bounded or unbounded.

Here we show another way to prove the above theorem, valid for problems with both bounded and unbounded feasible regions. Using the Representation Theorem, any point of the feasible region can be expressed as

$$x = \sum_{j=1}^{k} \lambda_j x_j + \sum_{j=1}^{l} \mu_j d_j, \qquad \sum_{j=1}^{k} \lambda_j = 1, \quad \lambda_j \ge 0, \quad \mu_j \ge 0$$

where {x1, x2, ..., xk} is the nonempty and finite set of extreme points, and {d1, d2, ..., dl} is the nonempty and finite set of extreme directions in the unbounded case (for a bounded feasible region there are no extreme directions). So we can transform the original problem in the variables x1, x2, ..., xn into a problem in the variables λ1, ..., λk, µ1, ..., µl:

$$\text{Minimize } z = \sum_{j=1}^{k} \lambda_j (c^T x_j) + \sum_{j=1}^{l} \mu_j (c^T d_j)
\quad \text{subject to } \sum_{j=1}^{k} \lambda_j = 1, \ \lambda_j \ge 0 \ (j = 1, \ldots, k), \ \mu_j \ge 0 \ (j = 1, \ldots, l)$$

Since the µj are not limited (each can be made arbitrarily large), there are two cases:
1) If c^T dj < 0 for some j = 1, 2, ..., l, then the minimum is −∞ (the corresponding µj can be made arbitrarily large) and the problem is unbounded (in objective value).
2) If c^T dj ≥ 0 for all j = 1, 2, ..., l, then all µj can be chosen as zero, and we have to minimize the first term only, over λ1, ..., λk. To do this, we select the minimum c^T xj (say c^T xp) attained at a certain extreme point xp (there may be several with the same value). Then we let λp = 1 and all other λj equal to zero.
Summary: The optimal solution is finite iff c^T dj ≥ 0 for all extreme directions (for a bounded feasible region there are no extreme directions, so the optimum is always finite). The optimal (minimum) value of c^T xj then occurs at at least one extreme point. If there are several optimal extreme points, their convex combinations are also optimal (as was shown above).

Theorem 4.3: Consider a feasible region S = {x : Ax = b, x ≥ 0}. Suppose that in the equation

x1a1 + x2a2 + … + xnan = b

there are k (k ≤ m < n) linearly independent vectors ai; the equation can always be rearranged in such a way that these vectors are the first k vectors. If there are coefficients xi ≥ 0, 1 ≤ i ≤ k, such that

x1a1 + x2a2 + … + xkak = b

then the vector x = (x1 x2 … xk 0 … 0)^T is an extreme point of the feasible region S.

Proof: Let us assume that the vector x is not an extreme point. Then there must be a strict convex combination

x = λv + (1−λ)w, 0 < λ < 1

where v and w are two distinct feasible solutions. Clearly the last n−k elements of the vectors v and w must be zero, otherwise the last n−k elements of the vector x would not be zero (all vectors of the feasible region have nonnegative elements). Since v and w are feasible solutions, they both satisfy Av = b and Aw = b, which, taking only the first k terms, can be expressed as

v1a1 + v2a2 + … + vkak = b
w1a1 + w2a2 + … + wkak = b

These two equations, together with the equation x1a1 + x2a2 + … + xkak = b, involve the same columns a1, ..., ak. Because these columns are linearly independent, the representation of b as their linear combination is unique, so

x1 = v1 = w1, x2 = v2 = w2, …, xk = vk = wk

and hence x = v = w, contradicting the assumption that v and w are distinct. This shows that the vector x cannot be expressed as a strict convex combination of two distinct feasible vectors; that is why it must be an extreme point.

Theorem 4.4: Consider a feasible region S = {x : Ax = b, x ≥ 0}. If the vector x is an extreme point of S, then the vectors ai in the equation

x1a1 + x2a2 + … + xnan = b

that correspond to the nonzero elements xi of x are linearly independent.

Proof: Let us assume again that the nonzero elements x1, x2, …, xk are the first k elements of x. To prove the theorem, suppose the opposite, namely that the vectors a1, a2, …, ak are linearly dependent. This means there exist numbers r1, r2, …, rk, at least one of which is nonzero, such that

r1a1 + r2a2 + … + rkak = 0

Now multiply this equation by a real number q and first add, then subtract, the product from the equation x1a1 + x2a2 + … + xkak = b. This gives:

(x1 + qr1)a1 + (x2 + qr2)a2 + … + (xk + qrk)ak = b
(x1 − qr1)a1 + (x2 − qr2)a2 + … + (xk − qrk)ak = b

It is possible to select q > 0 small enough that all coefficients of the vectors ai remain positive. Then the vectors

x′ = (x1 + qr1, x2 + qr2, …, xk + qrk, 0, …, 0)^T
x″ = (x1 − qr1, x2 − qr2, …, xk − qrk, 0, …, 0)^T

are two distinct feasible solutions, and their average gives

x = ½x′ + ½x″

This shows that x is a strict convex combination of two distinct feasible vectors, so it cannot be a corner - a contradiction. That is why the vectors a1, a2, …, ak must be linearly independent.

Theorem 4.5 (Fundamental Theorem of Linear Programming): If an LP problem has an optimal feasible solution, then there is a basic feasible solution that is optimal.
Proof: Theorem 4.2 says that the optimum of a feasible LP problem is attained at an extreme point (possibly at several extreme points) of the convex feasible region. To prove the fundamental theorem, we have to show that an extreme point has at most m nonzero elements, so that it is a basic solution. This follows from Theorem 4.4, which says that the nonzero elements of an extreme point correspond to linearly independent columns of the matrix A. This matrix has size m × n, so it can have at most m linearly independent columns. Hence an extreme point is a basic solution (it has at most m nonzero elements). This proves the fundamental theorem of linear programming.

We can also show that, in fact, all basic feasible solutions are extreme points. Each basic solution can be decomposed into the vector of basic variables xB, which is a solution to the equation BxB = b (where B is an m × m nonsingular matrix made of linearly independent columns of the matrix A), and the vector of nonbasic variables xN = 0. It then follows from Theorem 4.3 that the vector (xB xN) is an extreme point, so a basic feasible solution is an extreme point.

The fundamental theorem has very important consequences. It says that to find an optimum we may restrict attention to the corners (extreme points) of the feasible region. This is the basic principle of the simplex method, which starts at some initial (trivial) basic feasible solution - a corner - and moves along the edges of the convex polyhedron until it reaches the optimal corner. Algebraically this means moving from one basic feasible solution to another until the optimum is reached (or the problem is found to be unbounded).

The maximum possible number of basic solutions (corners) of an m × n LP problem is the number of ways to select m basic variables (i.e. m columns of the matrix A):

$$\binom{n}{m} = \frac{n!}{(n-m)!\,m!}$$

Note that some of these combinations may not yield a solution (they would generate a singular basis) and some may produce an infeasible solution. For large LP models this number can be very big, so in the worst case the simplex method may have to move through almost all corners and its worst-case complexity is exponential. Fortunately, in practice it performs very well and is in fact considered a very fast algorithm. One study testing many LP models with 50 variables found an average of 2m steps; other studies showed that for most models the optimum was found in fewer than 3m steps.

4.2 Algebra of the Simplex Method

The simplex method is based on the following steps:
1. finding an initial basic feasible solution
2. testing for optimality
3. if not optimal, finding the most promising improvement direction
4. finding the distance of the move (to reach the appropriate corner)
5. adjusting the model accordingly.

The next paragraphs explain the basic ideas behind each of the above steps.

1. To find an initial basic feasible solution, we can find for each constraint a variable whose initial nonzero value is known. This gives m basic variables; all other n−m variables will be nonbasic (zero). How to find such a variable depends on the type of (in)equality (see the beginning of this chapter - changing constraint type); a small code sketch follows this list.
a) For a ≤ inequality, add a slack variable.
b) For a ≥ inequality, first subtract a surplus variable (add a negative slack) and then add an artificial variable.
c) For an equality constraint, add an artificial variable.
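Here is the promised sketch of these conversions in numpy (the placement of the slack/surplus/artificial columns is one possible convention, not the only one):

```python
import numpy as np

def to_standard_form(A, senses):
    """Append slack/surplus/artificial columns so every row reads '='.
    senses: '<=', '>=' or '=' per row."""
    A = np.asarray(A, dtype=float)
    m = A.shape[0]
    extra = []
    for i, s in enumerate(senses):
        e = np.zeros((m, 1)); e[i, 0] = 1.0
        if s == '<=':
            extra.append(e)               # slack column
        elif s == '>=':
            extra.extend([-e, e])         # surplus column, then artificial
        else:                             # '=' : artificial column only
            extra.append(e)
    return np.hstack([A] + extra)

print(to_standard_form([[4, 5]], ['>=']))   # [[ 4.  5. -1.  1.]]  i.e. 4x1 + 5x2 - s1 + a1
```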
Artificial variables must be forced out of the solution because they are just a tool for starting the algorithm; they must eventually leave the solution (become nonbasic). There are two methods for doing this (covered in detail later on):
(i) the M-method
(ii) the 2-phase method
Note: In both cases it may happen that an artificial variable remains nonzero in the optimal solution. This means the problem is infeasible.

2. The most promising improvement can be found from the coefficients of the objective function. Let us assume that in a certain maximization problem the initial point has all original solution variables nonbasic (equal to zero). Improving in the first step means inserting a certain solution variable as a basic one; this will force one of the current basic variables (slacks) out of the basic solution - see the next paragraph. The most promising variable has the maximum positive coefficient in the objective function (the objective function is linear, so in the first step these coefficients are the partial derivatives of the objective function with respect to the particular nonbasic variables). The column that contains the maximum coefficient is called the pivot column. If there is no positive coefficient, the optimum (maximum) has been found. Note that the simplex table typically contains the negated values of these coefficients, so for maximization the optimality condition means no negative values in the coefficients row of the table; for minimization the optimality condition means no positive values in the coefficients row of the table.

3. Selecting a nonbasic variable to be inserted as a basic one defines the direction of the move in the n-dimensional feasible region. The maximum distance of the move is given by the natural requirement to keep feasibility. A certain constraint will thus stop the movement, so the corresponding slack will become zero, meaning it will change from a basic variable into a nonbasic variable. The constraint involved can be found by using the coefficients of the constraint matrix in the pivot column together with the right-hand side values. The ratio of the right-hand side value to the pivot column coefficient defines the maximum possible increase of the nonbasic variable allowed by each constraint. The minimum of these ratios is chosen; this is called the ratio test. The row that contains the minimum ratio (the pivot row) thus defines the distance of the move (the ratio) and the basic variable that becomes nonbasic. The intersection of the pivot row and the pivot column is called the pivot element.

4. To introduce a nonbasic variable, it is necessary to take the amounts of all resources needed by this variable. This is performed by creating zeros in the pivot column, except for the pivot element, which becomes 1. The zero in the coefficients row represents the fact that the variable is now basic. The values in the coefficients row will change, but they always represent the profit of introducing one unit of the corresponding nonbasic variable into the solution.

Clearly these basic ideas neither fully justify the simplex method nor define the algorithm and the construction of the simplex table exactly. The details are covered by the algebra of the simplex method.
First let's recall the standard form of an LP minimization problem (results for maximization will mostly differ only in signs and types of inequalities): find x to

Minimize z = c^T x
Subject to Ax = b, x ≥ 0

where x and c are n-vectors, b is an m-vector, A is an m × n matrix (m ≤ n) and z is the scalar objective value. Note that the set of equations Ax = b can be expressed as

x1a1 + x2a2 + … + xnan = b    (1)

where the xi are elements of the vector x and the ai are columns of the matrix A (1 ≤ i ≤ n).

Now let's assume that the matrix A has rank m, so there are m independent columns. Then A can be rearranged as A = (B N), where B is an m × m invertible matrix and N is an m × (n−m) matrix. B is called the basic matrix (or shortly the base) of A, and N is called the nonbasic matrix. The vector x can be decomposed accordingly: x^T = (xB^T xN^T), where xB^T = (x1 x2 ... xm) and xN^T = (x_{m+1} x_{m+2} ... xn). Then:

$$Ax = (B\ N)\begin{pmatrix} x_B \\ x_N \end{pmatrix} = Bx_B + Nx_N = b
\qquad \text{or} \qquad x_B = B^{-1}b - B^{-1}Nx_N \tag{2}$$

The solution x^T = (xB^T xN^T) such that xN = 0 and xB = B⁻¹b is called a basic solution of the system Ax = b. So a basic solution has m basic variables (the components of xB) and n−m zero nonbasic variables (the components of xN). If xB ≥ 0, then x is called a basic feasible solution. If the solution has fewer than m nonzero variables, it is called a degenerate basic solution. From the fundamental theorem of linear programming we know that if an LP problem has an optimal feasible solution, then there is a basic feasible solution that is optimal.

Example 4.1: Consider the LP model

min x1 + x2
st x1 + 2x2 ≤ 4
   x2 ≤ 1
   x1, x2 ≥ 0

The model in standard form is given by

min x1 + x2 + 0x3 + 0x4
st x1 + 2x2 + x3 = 4
   x2 + x4 = 1
   x1, x2, x3, x4 ≥ 0

or equivalently

$$\min\ (1\ 1\ 0\ 0)\,x \quad \text{st} \quad
\begin{pmatrix} 1 & 2 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix} x = \begin{pmatrix} 4 \\ 1 \end{pmatrix},
\qquad x = (x_1\ x_2\ x_3\ x_4)^T \ge 0$$

Clearly A has rank m = 2. Suppose we choose the linearly independent third and fourth columns (a3 and a4) to form the basic matrix B. This means we can arrange the matrix A as

$$(B\ N) = \begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 1 & 0 & 1 \end{pmatrix},
\qquad x = (x_3\ x_4\ x_1\ x_2)^T, \qquad c_B^T = (0\ 0), \quad c_N^T = (1\ 1)$$

Thus

$$x_B = B^{-1}b - B^{-1}Nx_N
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 4 \\ 1 \end{pmatrix}
- \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}^{-1}\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$

so that the basic feasible solution is x = (xB xN)^T = (4 1 0 0)^T in the order (x3 x4 x1 x2).
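Equation (2) and Example 4.1 are easy to verify numerically; a quick sketch with numpy (the columns a3, a4 form the basis, so xN = 0 and xB = B⁻¹b):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 1.0])

basis = [2, 3]                 # 0-based indices of the columns a3, a4
B = A[:, basis]
x_B = np.linalg.solve(B, b)    # B^{-1} b, with x_N = 0
print(x_B)                     # [4. 1.]  ->  x = (0, 0, 4, 1) in the original order
```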
Equation (2) shows how the basic variables would change if the nonbasic variables (currently all zero) changed their values. Let's express this equation in terms of the columns of the matrix B⁻¹N and the corresponding nonbasic variables:

$$x_B = B^{-1}b - B^{-1}Nx_N = B^{-1}b - \sum_{j \in R}(B^{-1}a_j)x_j = b^* - \sum_{j \in R} y_j x_j \tag{3}$$

where R is the index set of the columns that make up the nonbasic matrix N, the aj are columns of the matrix N (and of the matrix A), the yj are columns of the matrix B⁻¹N, and b* contains the current values of the basic variables.

Example 4.1 continued:

$$x_B = \begin{pmatrix} 4 \\ 1 \end{pmatrix} - \left[\begin{pmatrix} 1 \\ 0 \end{pmatrix}x_1 + \begin{pmatrix} 2 \\ 1 \end{pmatrix}x_2\right] = b^* - \sum_{j \in \{1,2\}} y_j x_j$$

Similarly, let's find out how the objective value would change with respect to the nonbasic variables:

$$z = c^T x = (c_B^T\ c_N^T)\begin{pmatrix} x_B \\ x_N \end{pmatrix} = c_B^T x_B + c_N^T x_N
= c_B^T B^{-1}b - \sum_{j \in R} c_B^T B^{-1}a_j x_j + \sum_{j \in R} c_j x_j
= z_0 - \sum_{j \in R}(z_j - c_j)x_j \tag{4}$$

where zj = cB^T B⁻¹aj is a scalar value for each nonbasic variable.

Example 4.1 continued: here cB^T = (0 0) and cN^T = (1 1), so z0 = cB^T B⁻¹b = 0 and zj = cB^T B⁻¹aj = 0 for j = 1, 2. Hence

z = z0 − {(z1 − c1)x1 + (z2 − c2)x2} = 0 − {(0 − 1)x1 + (0 − 1)x2} = x1 + x2

Exercise 4.2: Work out the values of the basic variables, the nonbasic variables and the objective function if the basic feasible solution considered is x = (x2 x3 x1 x4)^T. Does this basic feasible solution improve the objective function?

We can use equation (4) to find a nonbasic variable that (if made nonzero) would improve the current objective value z0. If no such variable exists, we know that the current basic feasible solution is optimal. To do this, let's optimize with respect to the nonbasic variables at the point x (the origin in the subspace of nonbasic variables), using equations (3) and (4):

$$\text{Minimize } z = z_0 - \sum_{j \in R}(z_j - c_j)x_j
\quad \text{subject to } \sum_{j \in R} y_j x_j + x_B = b^*, \quad x_j \ge 0 \ (j \in R), \quad x_B \ge 0$$

Note that in the above problem the current basic variables play the role of slacks, so the problem can be rewritten as

$$\text{Minimize } z = z_0 - \sum_{j \in R}(z_j - c_j)x_j
\quad \text{subject to } \sum_{j \in R} y_j x_j \le b^*, \quad x_j \ge 0 \ (j \in R)$$

From the objective function of the above LP problem we can directly state the optimality condition or optimality test: if (zj − cj) ≤ 0 for all j ∈ R, then the current basic feasible solution is optimal. The proof is simple: since xj ≥ 0, for all nonpositive (zj − cj) we get z ≥ z0 for any other solution; but currently z = z0, since xj = 0 for all j ∈ R.

If not all (zj − cj) ≤ 0, then we select one positive (zk − ck) - possibly, but not necessarily, the greatest one - and we increase the corresponding xk as much as possible while holding the remaining n−m−1 nonbasic variables at zero. The new objective value will be

z = z0 − (zk − ck)xk    (5)

Note: Look at the definition of a simplex in Chapter 2. At the beginning of each iteration we are at the origin of the subspace of nonbasic variables. Each iteration represents a move along one axis of the nonbasic variables space to a neighbouring corner. So geometrically it is a move along one of the edges of a simplex (which explains the name of the method).

Example 4.3: Consider the LP in standard form

min −x1 − x2 + 0x3 + 0x4
st x1 + 2x2 + x3 = 4
   x2 + x4 = 1
   x1, x2, x3, x4 ≥ 0

Suppose that the initial basic feasible solution (IBFS) is x = (x3 x4 x1 x2)^T = (4 1 0 0)^T.
Thus the initial objective value is

z = cB^T B⁻¹b − Σ_{j∈R} cB^T B⁻¹aj xj + Σ_{j∈R} cj xj = z0 − Σ_{j∈R} (zj − cj)xj = 0

To check whether z0 can be improved, take R = {1, 2}:

z1 − c1 = cB^T B⁻¹a1 − c1 = 0 − (−1) = 1 > 0
z2 − c2 = cB^T B⁻¹a2 − c2 = 0 − (−1) = 1 > 0

Clearly the IBFS is not the optimal solution, so we need to choose a current nonbasic variable (NBV) to become a basic variable (BV), so that the objective value decreases. We can choose any NBV xj corresponding to a positive value of zj − cj; as a rule of thumb we choose the NBV xj for which zj − cj has the highest positive value. Since in this case zj − cj = 1 for j = 1, 2, we can choose either x1 or x2, say x1.

From equation (3) we can find the new values of the basic variables:

xB = b* − yk xk (all other xj, j ∈ R \ {k}, stay zero)

Expanding this equation we obtain:

$$\begin{pmatrix} x_{B1} \\ x_{B2} \\ \vdots \\ x_{Bm} \end{pmatrix} =
\begin{pmatrix} b_1^* \\ b_2^* \\ \vdots \\ b_m^* \end{pmatrix} -
\begin{pmatrix} y_{1k} \\ y_{2k} \\ \vdots \\ y_{mk} \end{pmatrix} x_k$$

The indices Bi depend on the current basis. The value of xk can be found from the feasibility requirement for the new solution:

xBi ≥ 0, i.e. bi* − yik xk ≥ 0, i = 1, 2, ..., m

Feasibility is endangered only by positive yik, for which it must hold that

xk ≤ bi*/yik, i = 1, 2, ..., m, yik > 0

So for yik ≤ 0 the corresponding xBi remains nonnegative, while for yik > 0 the corresponding xBi decreases. We can continue increasing xk until the first basic variable drops to zero; then we have to stop, otherwise the solution would become infeasible. This gives the so-called feasibility condition, also called the ratio test or feasibility test:

$$x_k = \min_{1 \le i \le m}\left\{\frac{b_i^*}{y_{ik}} : y_{ik} > 0\right\} \tag{6}$$

If r is the row with the minimum ratio, then the new solution is:

xBi = bi* − yik (br*/yrk), i = 1, 2, …, m (xBr drops to zero - the leaving variable)
xk = br*/yrk (the entering variable)
xj = 0, j ∈ R \ {k} (the other nonbasic variables remain zero)

In this way we have reached a new (better) basic feasible solution. This process must terminate (unless there is cycling) because the number of corners (basic feasible solutions) is finite. Cycling can occur in case of degeneracy, and it represents a real problem in computer implementations of the simplex method. There are methods to cope with cycling, but they are beyond the scope of this material. Commercial LP packages mostly ignore cycling, because cycling prevention would slow the computation down considerably. The rounding of floating point numbers also helps: due to the limited precision, cycling is in fact often prevented.

Example 4.3 continued: with entering variable x1 we have y1 = B⁻¹a1 = (1 0)^T, so

$$\begin{pmatrix} x_{B1} \\ x_{B2} \end{pmatrix} = \begin{pmatrix} 4 \\ 1 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \end{pmatrix}x_1 \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\qquad \Rightarrow \qquad x_1 = \min\{4/1\} = 4$$

(the second row has y21 = 0 ≤ 0, so it does not restrict x1). As the minimum ratio occurs in the first row, the current basic variable x3 must drop to zero, thus becoming nonbasic. Hence the new basic feasible solution is x = (x1 x4 x2 x3)^T = (4 1 0 0)^T, and the new objective value is

z = z0 − (z1 − c1)x1 = 0 − 1·4 = −4

which is an improvement on z0 (remember we are minimizing!).

Exercise 4.4: Check whether the new basic feasible solution is optimal; if not, determine the next basic feasible solution and the new objective function value.
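The pricing step (the values zj − cj) and the ratio test above are easy to script. A sketch of a single iteration of Example 4.3 in numpy (0-based indices, so column 0 is x1):

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 1.0])
c = np.array([-1.0, -1.0, 0.0, 0.0])            # Example 4.3 objective

basis = [2, 3]                                  # x3, x4 basic
B_inv = np.linalg.inv(A[:, basis])
b_star = B_inv @ b                              # current basic values b*
w = c[basis] @ B_inv                            # simplex multipliers c_B^T B^{-1}
reduced = w @ A - c                             # z_j - c_j for every column
k = int(np.argmax(reduced))                     # entering column (most positive)

y = B_inv @ A[:, k]                             # pivot column y_k
ratios = np.where(y > 0, b_star / np.where(y > 0, y, 1.0), np.inf)
r = int(np.argmin(ratios))                      # leaving row (ratio test)
print(k, r, ratios[r])                          # column 0 (x1) enters, row 0 (x3) leaves, step 4
```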
Practical interpretation of (zk − ck)

After entering a so far nonbasic (zero) variable xk into the solution, the new value of the objective function is:

z = z0 − (zk − ck)xk = cB^T b* − zk xk + ck xk

So:
ck = cost of entering one unit of xk
zk = saving caused by entering one unit of xk

That is why (ck − zk) is called the reduced cost. We work with its negative value (zk − ck) because this is the value stored in the simplex table - see later. Now let's expand zk:

$$z_k = c_B^T B^{-1}a_k = c_B^T y_k = \sum_{i=1}^{m} c_{Bi} y_{ik}$$

So:
cBi = unit cost of the i-th basic variable
yik = the amount by which the i-th basic variable decreases per unit of xk
cBi yik = saving caused by decreasing the i-th basic variable
zk = total saving caused by decreasing all basic variables.

Cases of termination

So far we have assumed that a unique optimal solution is reached. This is in fact one of three possible cases (we always assume a feasible problem):
1. If zj − cj < 0 for all j ∈ R, then there is a unique optimal solution.
2. If zj − cj ≤ 0 for all j ∈ R and zk − ck = 0 for some k ∈ R, then there are alternative optima: entering xk into the solution would change the basic feasible solution, but not the value of z.
3. If zk − ck > 0 for some k ∈ R but yk ≤ 0 (all entries of the column yk are non-positive), then the problem is unbounded: xk can be increased arbitrarily, so z → −∞.

Simplex Algorithm

Using the above results, we can express the simplex algorithm formally by matrix operations (assuming that the optimal solution exists):

Find a basic matrix B of an initial basic feasible solution
Repeat
    Compute the current solution x = (xB xN)^T = (B⁻¹b 0)^T
    Compute the objective z = cB^T xB = cB^T B⁻¹b
    Compute the reduced costs (c − z)^T = c^T − cB^T B⁻¹A
    If (not optimal)
        Select the entering variable
        Use the ratio test to find the leaving variable
        Update the basic matrix B
    EndIf
Until (optimal)

Note that the current basis can be represented by an index vector that defines which columns of A, and in which order, form the basic matrix B. Updating the basis then means replacing the index of the leaving variable by the index of the entering variable. The optimality test and the selection of the entering variable depend on the problem (maximization or minimization); the ratio test means dividing xB by the pivot column (as explained above). Reduced costs of basic variables are zero, so it is possible to compute only the reduced costs of the nonbasic variables, (cN^T − cB^T B⁻¹N), where N is the nonbasic matrix made of the columns of A not included in B. In order to compute the objective and the reduced costs, we need the so-called simplex multipliers w^T = cB^T B⁻¹, also called shadow costs - see later. The objective can then be computed as z = w^T b and the reduced costs as (cN^T − w^T N).
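The pseudocode above translates almost line for line into numpy. The following is a bare-bones sketch of the revised simplex method for min c^T x, Ax = b, x ≥ 0; it assumes the column indices of an initial basic feasible solution are supplied (e.g. the slack columns) and ignores degeneracy and cycling:

```python
import numpy as np

def simplex(A, b, c, basis, tol=1e-9):
    """Minimize c@x subject to A@x = b, x >= 0, from the given starting basis.
    A sketch only: no anti-cycling safeguards, dense inverse at every step."""
    A, b, c = np.asarray(A, float), np.asarray(b, float), np.asarray(c, float)
    basis = list(basis)
    while True:
        B_inv = np.linalg.inv(A[:, basis])
        x_B = B_inv @ b                      # current basic variable values
        w = c[basis] @ B_inv                 # simplex multipliers w^T = c_B^T B^{-1}
        reduced = w @ A - c                  # z_j - c_j for every column
        reduced[basis] = 0.0                 # basic columns price out to zero
        k = int(np.argmax(reduced))          # entering variable (most positive)
        if reduced[k] <= tol:                # optimality test (minimization)
            x = np.zeros(A.shape[1])
            x[basis] = x_B
            return x, float(c @ x)
        y = B_inv @ A[:, k]                  # pivot column in the current table
        if np.all(y <= tol):
            raise ValueError("problem is unbounded")
        ratios = np.where(y > tol, x_B / np.where(y > tol, y, 1.0), np.inf)
        basis[int(np.argmin(ratios))] = k    # ratio test: leaving row receives k

# Example 4.3: min -x1 - x2 with the slacks x3, x4 as the initial basis.
x, z = simplex([[1, 2, 1, 0], [0, 1, 0, 1]], [4, 1], [-1, -1, 0, 0], basis=[2, 3])
print(x, z)   # x = [4. 0. 0. 1.], z = -4.0
```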
4.3 Simplex Table

To use the simplex algorithm in practice, we (and the computer) need a table that stores all the information needed for the tests and operations of the algorithm, and that will eventually contain the optimal solution together with its optimal objective value. Later we shall learn that it in fact contains even more than that. Let's summarize what we need:
• the current basic feasible solution
• the current objective value
• information on whether the solution is optimal and, if not, which variable should enter; we know that for this purpose we need the reduced costs of the nonbasic variables
• information needed to find the value of the entering variable, which will also show which variable leaves the solution; we know that for this (the ratio test) we need the columns yj of the nonbasic variables and the current basic feasible solution.

Derivation of the simplex table

By using equations (2) and (4), the original problem

Minimize z = c^T x
Subject to Ax = b, x ≥ 0

can be restated as: minimize z subject to

NxN + BxB = b
−cN^T xN − cB^T xB + z = 0    (7)

To get the current basic solution, we multiply the first equation of (7) by B⁻¹ from the left, and then multiply the result by cB^T from the left:

B⁻¹NxN + IxB = B⁻¹b
cB^T B⁻¹NxN + cB^T xB = cB^T B⁻¹b    (8)

To get the reduced costs, let's add the second equations of (7) and (8):

(cB^T B⁻¹N − cN^T)xN + 0^T xB + z = cB^T B⁻¹b    (9)

Finally, let's express the first equation of (8) and equation (9) in a unified way, where the first term in (9) has been rewritten as (zN − cN)^T - see equation (4):

B⁻¹NxN + IxB + 0z = B⁻¹b
(zN − cN)^T xN + 0^T xB + 1z = cB^T B⁻¹b    (10)

Note that currently xN = 0, so the right-hand sides are equal to xB and z respectively. The coefficients of equations (10) can be stored in a table that has m + 1 rows and columns corresponding to xN, xB, z and the right-hand sides. We label the columns accordingly; for practical reasons we can also label the rows by the basic variables and by z. This gives the simplex table, where BV means basic variables:

BV    xN^T                  xB^T    z    RHS
xB    B⁻¹N                  I       0    B⁻¹b
z     cB^T B⁻¹N − cN^T      0^T     1    cB^T B⁻¹b

This table contains all we need to carry out simplex iterations. The last column contains the values of the basic variables (the nonbasic ones are obviously zero) and the value of z. The second equation of (10) shows that the z-row contains the negative reduced costs in the xN columns.

Notes:
1. Some authors describe a table with positive reduced costs and a negative objective value (together with the coefficient −1 in the z-column). This can be obtained by multiplying the second equation of (10) by −1.
2. The z-column always contains the same entries, so it is in fact redundant. That is why it is omitted in most books.
3. Some authors place the z-row first.
4. The above table has the basic and nonbasic variables grouped together. This can always be done by rearranging the columns of the table; doing so after each iteration is time consuming and in fact useless. That is why it is typical (for manual solution) that the above "nice" form of the table exists only at the beginning, where the basic variables are slacks and/or artificial variables, which we are used to placing at the right side in such an order that they directly form the unity matrix. Note that in this case the initial basis is a unity matrix, B = B⁻¹ = I, so the initial simplex table contains directly the coefficients of the constraint equations and the objective function:

BV    xN^T               xB^T    z    RHS
xB    N                  I       0    b
z     cB^T N − cN^T      0^T     1    cB^T b

5. The z-row is created by first writing the negated objective coefficients into the z-row. Slacks have zero coefficients in the objective function (cB = 0), so the so-called all-slack LP problems (with only ≤ inequalities) directly have zeros in the z-row in the xB columns and zero in the objective value (bottom right) entry; the z-row entries in the xN columns are the negated coefficients of the objective function. So the initial simplex table does not need any pre-processing. However, if there are artificial variables, the initial table depends on the solution method: the M-method penalizes artificial variables by a big coefficient M in the objective function, while the 2-phase method first minimizes the sum of the artificial variables, so the objective coefficients of the artificial variables are 1.
6. Note that after each iteration one basic variable leaves and one nonbasic variable enters. So, assuming we keep the heading labels fixed, after the first iteration one column of the unity matrix moves to the place of the new basic variable and becomes the column of the variable that is now nonbasic. After several iterations the columns of the unity matrix are "scattered" in the table, but all of them are always present. It is therefore convenient to keep the labels of the basic variables in the first (label) column.

7. Each solution has its basis, which can be created by taking the appropriate columns of the original matrix A (the indices are those of the basic variables). Note that at each iteration the table contains the result of multiplying the original matrix A by B^-1. So if the unity matrix was originally present in A (typically at the right side), that block now contains B^-1.

Information found in the simplex table

In addition to the above requirements, the simplex table contains a lot of useful information:

a) Objective value (z) in terms of nonbasic variables

Apart from the zeros in the columns of the basic variables, the z-row of the simplex table contains the negative reduced costs. Using reduced costs, let's once more express the objective value in terms of the nonbasic variables:

z = cB^T B^-1 b - (cB^T B^-1 N - cN^T) xN = cB^T B^-1 b + Σ_{j∈R} (cj - zj) xj     (11)

The rate of change of z as a function of a nonbasic variable xj is:

∂z/∂xj = cj - zj

This is another justification that, to minimize z, xj should be increased if cj - zj < 0, i.e. zj - cj > 0 (because this is the value stored in the simplex table).

b) Basic variables in terms of nonbasic variables

From equation (10) we get for xB:

xB = B^-1 b - B^-1 N xN = B^-1 b - Σ_{j∈R} B^-1 aj xj = B^-1 b - Σ_{j∈R} yj xj     (12)

The vectors yj show how the basic variables change in terms of the nonbasic variables:

∂xB/∂xj = -yj ,   ∂xBi/∂xj = -yij

c) Objective value (z) in terms of the original right-hand side values

From equation (11) we can compute the partial derivatives of z with respect to b:

∂z/∂bi = (cB^T B^-1)i

These values are the so-called shadow costs. Their interpretation depends on the type of inequality. If bi represents the availability of a certain resource (≤ inequality in a maximization problem), then ∂z/∂bi is the worth of one unit of that particular resource. If bi represents some minimum acceptable amount (≥ inequality in a minimization problem, such as minimum production, minimum weight and similar), then ∂z/∂bi is the cost we pay for one unit of that limitation. Shadow costs can also be found in the simplex table (in fact, for some nonbasic variables shadow costs are equal to reduced costs). From equation (9) we know that the z-row entries in the nonbasic columns are cB^T B^-1 N - cN^T, or cB^T B^-1 aj - cj for one particular nonbasic variable xj. Now let's assume that the nonbasic variable xj is a zero slack of a certain scarce resource whose availability is bi (the i-th component of b). Such slacks have zero coefficients in the objective function, so cj = 0. Also, initially the slack was a basic variable in the simplex table, with an associated column of the unity matrix, so aj = ei (the column vector with the i-th component equal to 1 and the remaining components equal to zero):

cB^T B^-1 aj - cj = cB^T B^-1 ei = (cB^T B^-1)i

Thus the z-row entry of xj is the i-th entry of cB^T B^-1, which is the shadow cost ∂z/∂bi.
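Numerically, the shadow costs are just the simplex multipliers. A minimal sketch, assuming A, c and the optimal index vector basis are available as in the earlier sketches:

% Shadow costs w' = cB' B^-1; entry i approximates dz/db_i.
w = (c(basis)' * inv(A(:, basis)))';
% Increasing b_i by a small amount d changes the optimal objective by
% about w(i)*d, as long as the current basis remains feasible.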
d) Basic variables in terms of the original right-hand side values

From equation (12) we can compute the partial derivatives of xB with respect to b:

∂xBi/∂bj = (B^-1)ij

So the (i,j)-th entry of B^-1 shows how the i-th basic variable (the one in the i-th row, not the one that is the i-th component of x) changes with the right-hand side value bj.

4.4 The Simplex Algorithm (using the Simplex Table)

1. Create the initial (possibly inconsistent) simplex table:

BV | xN^T   | xB^T   | RHS
xB | N      | I      | b
z  | -cN^T  | -cB^T  | 0

where
(N I) = A is the matrix of coefficients, with the unity matrix in the last m columns
b = vector of right-hand sides
cN = objective coefficients of the nonbasic variables
cB = objective coefficients of the basic variables
xB = basic variables
xN = nonbasic variables

2. Make the table consistent by performing such row operations that the z-row values in the basic columns become zero. This is not necessary for all-slack models.

3. Use the optimality test to check whether the table is optimal:
Minimization: all z-row entries (zj - cj) must be negative or zero.
Maximization: all z-row entries (zj - cj) must be positive or zero.
If the table is optimal, go to step 6. If not, select the entering variable:
Minimization: select the variable with the greatest positive z-row value.
Maximization: select the variable with the most negative z-row value.
The entering variable defines the pivot column.

4. Use the feasibility test to find the leaving variable: compute the ratios of the right-hand sides to the positive coefficients in the pivot column, ignoring rows with non-positive coefficients. If there are no positive coefficients, the problem is unbounded. Otherwise select the row with the minimum ratio; this is the pivot row and it defines the leaving variable.

5. Pivot on the pivot element: perform such row operations that zeros are created in the pivot column, except for the pivot element, which has to be 1. Go to step 3.

6. Interpret the optimal simplex table, which contains (among others):
- the objective value
- the values of the basic variables (the nonbasic ones are zero)
- the shadow costs of the resources in the slack columns
- the penalties caused by introducing nonbasic variables.

Example 4.5
Use the simplex method to find the optimum production plan for the following problem:

Product (quantity)    A (x1)   B (x2)   C (x3)   Amount Available
Machine Hours           2        3        1         400
Components              1        -        1         150
Alloy                   2        -        4         200
Limits                  -      ≤ 50       -
Contribution/unit       8        5       10

1. Write the LP model for the above problem:

max 8x1 + 5x2 + 10x3
subject to
2x1 + 3x2 + x3 ≤ 400
 x1       + x3 ≤ 150
2x1      + 4x3 ≤ 200
       x2      ≤ 50
x1, x2, x3 ≥ 0

2. Express the LP model in standard form:

max z = 8x1 + 5x2 + 10x3 + 0x4 + 0x5 + 0x6 + 0x7
subject to
2x1 + 3x2 + x3 + x4                = 400
 x1       + x3      + x5           = 150
2x1      + 4x3           + x6      = 200
       x2                     + x7 = 50
x1, x2, x3, x4, x5, x6, x7 ≥ 0

Interpretation of the slack variables:
x4 - unused machine hours
x5 - unused components
x6 - unused alloy
x7 - amount of product B not produced
3. Set up the initial simplex table:

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |   2     3     1    |   1     0     0      0    |   400
x5       |   1     0     1    |   0     1     0      0    |   150
x6       |   2     0     4    |   0     0     1      0    |   200
x7       |   0     1     0    |   0     0     0      1    |    50
z        |  -8    -5   -10    |   0     0     0      0    |     0

IBFS: x^T = (x4, x5, x6, x7 | x1, x2, x3) = (400, 150, 200, 50 | 0, 0, 0), where the first group are the BV and the second group the NBV. Initial objective function value: z = 0.

4. Perform the optimality test: is the current BFS optimal? - No.
Select the most negative entry in the z-row (i.e. -10, corresponding to x3) ⇒ x3 is the entering variable (x3 becomes a BV).

5. Perform the ratio test: which current BV has to be reduced to zero (i.e. become a NBV)?

xB = b* - y3 x3 ≥ 0:

x4:  400 - 1·x3 ≥ 0  ⇒  x3 ≤ 400
x5:  150 - 1·x3 ≥ 0  ⇒  x3 ≤ 150
x6:  200 - 4·x3 ≥ 0  ⇒  x3 ≤ 50
x7:   50 - 0·x3 ≥ 0  ⇒  ignore

⇒ x3 = min(400, 150, 50) = 50

The amount of product C produced (x3) can be increased to 50 (pivot row = row 3), hence x6 must be reduced to 0 and thus becomes a NBV.
Interpretation: instead of x6 the basic variable will be x3 (no unused alloy; all alloy is used to produce 50 units of product C).
Consequence: producing 50 units of product C will also affect the other resources, so we need to find the new values of the other BV.

6. Circle the element lying in both the pivot row and the pivot column (the pivot element) - here 4. Divide all elements of the identified row (x6) by the pivot element (4) and change the solution variable (x6 → x3). New Row 3 is:

x3       |  0.5    0     1    |   0     0    0.25    0    |    50

The new simplex table is:

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |   2     3     1    |   1     0     0      0    |   400
x5       |   1     0     1    |   0     1     0      0    |   150
x3       |  0.5    0     1    |   0     0    0.25    0    |    50
x7       |   0     1     0    |   0     0     0      1    |    50
z        |  -8    -5   -10    |   0     0     0      0    |     0

7. Make all other elements in the pivot column equal to zero by row-by-row operations:

i. New Row 1 = Old Row 1 - Row 3:

x4        |  2    3    1 |  1   0    0     0 | 400
x3        | 0.5   0    1 |  0   0   0.25   0 |  50
New Row 1 | 1.5   3    0 |  1   0  -0.25   0 | 350

ii. New Row 2 = Old Row 2 - Row 3
iii. Row 4: already zero
iv. New Row 5 = Old Row 5 + 10 (Row 3)  (the z-row coefficient in the pivot column is -10)

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |  1.5    3     0    |   1     0   -0.25    0    |   350
x5       |  0.5    0     0    |   0     1   -0.25    0    |   100
x3       |  0.5    0     1    |   0     0    0.25    0    |    50
x7       |   0     1     0    |   0     0     0      1    |    50
z        |  -3    -5     0    |   0     0    2.5     0    |   500

New BFS: x^T = (x3, x4, x5, x7 | x1, x2, x6) = (50, 350, 100, 50 | 0, 0, 0). New objective function value z = 500.

8. Repeat steps 4, 5 and 6 until there are no negative values in the z-row (i.e. until no improvement is possible). The simplex table after the next step (x2 enters, x7 leaves):

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |  1.5    0     0    |   1     0   -0.25   -3    |   200
x5       |  0.5    0     0    |   0     1   -0.25    0    |   100
x3       |  0.5    0     1    |   0     0    0.25    0    |    50
x2       |   0     1     0    |   0     0     0      1    |    50
z        |  -3     0     0    |   0     0    2.5     5    |   750

New BFS: x^T = (x2, x3, x4, x5 | x1, x6, x7) = (50, 50, 200, 100 | 0, 0, 0). New objective function value z = 750.
Interpretation: produce the maximum possible quantity of product B (its limit is reached, x7 = 0) and of product C (all alloy is used, x6 = 0).

Optimal Simplex Table (after x1 enters and x3 leaves):

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |   0     0    -3    |   1     0    -1     -3    |    50
x5       |   0     0    -1    |   0     1   -0.5     0    |    50
x1       |   1     0     2    |   0     0    0.5     0    |   100
x2       |   0     1     0    |   0     0     0      1    |    50
z        |   0     0     6    |   0     0     4      5    |  1050

Interpretation of the table: produce 100 units of product A, 50 units of product B and no units of product C, to yield a maximum contribution of 1050.

Unused (abundant) resources: 50 machine hours (x4) and 50 components (x5).
Scarce (fully utilized) resources: all alloy is used; the limitation on x2 is exhausted.

Shadow prices:
i. 4 (corresponding to alloy (x6))
ii. 5 (corresponding to the x2 limitation (x7))
iii. 6 (corresponding to x3)

Interpretation:
i. If the RHS of the alloy constraint is increased/decreased by 1, the total contribution will increase/decrease by 4.
ii. If the RHS of the x2 limitation constraint is increased/decreased by 1, the total contribution will increase/decrease by 5.
iii. If the production of product C (x3) is increased by 1, the contribution will be 6 units less.
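Before moving to the Matlab table implementation below, the result can be cross-checked with a solver. A minimal sketch using linprog from the Optimization Toolbox, assuming it is installed (linprog minimizes, so the profit vector is negated):

% Cross-check of Example 4.5 with linprog.
A  = [2 3 1; 1 0 1; 2 0 4; 0 1 0];
b  = [400; 150; 200; 50];
f  = -[8; 5; 10];                       % maximize 8x1 + 5x2 + 10x3
lb = zeros(3,1);
[x, fval] = linprog(f, A, b, [], [], lb);
% Expected result: x = (100, 50, 0) and -fval = 1050, as in the optimal table.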
Example 4.6 (Matlab Session - continuation of Example 4.5)
This session shows the simplex algorithm (all-slack problem) using the table approach. Comments added later are in italics; some empty lines and spaces have been removed. We assume that the folder Z:\Matlab contains the file pivot.m.

» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
   rmax=length(A(:,1));
   A(r,:)=A(r,:)/x;
   for i=1:rmax
      if i~=r
         A(i,:)=A(i,:)-A(r,:)*A(i,c);
      end
   end
end
a=A;

Entering A, b, c of the model:

max  z = 8x1 + 5x2 + 10x3
ST   2x1 + 3x2 +  x3 ≤ 400
      x1 +        x3 ≤ 150
     2x1 +       4x3 ≤ 200
            x2       ≤  50
     xi ≥ 0

» A=[2 3 1;1 0 1;2 0 4;0 1 0]
A =  2  3  1
     1  0  1
     2  0  4
     0  1  0
» A=[A eye(4)]
A =  2  3  1  1  0  0  0
     1  0  1  0  1  0  0
     2  0  4  0  0  1  0
     0  1  0  0  0  0  1
» c=[8 5 10 zeros(1,4)]'
c = 8  5  10  0  0  0  0  (column vector)
» b=[400 150 200 50]'
b = 400  150  200  50  (column vector)

Constructing the simplex table:

» s=[A b]
s =  2  3  1  1  0  0  0  400
     1  0  1  0  1  0  0  150
     2  0  4  0  0  1  0  200
     0  1  0  0  0  0  1   50
» s=[s;-c' 0]
s =  2  3   1  1  0  0  0  400
     1  0   1  0  1  0  0  150
     2  0   4  0  0  1  0  200
     0  1   0  0  0  0  1   50
    -8 -5 -10  0  0  0  0    0

3rd variable enters (most negative), computing ratios: last column / 3rd column:

» r=s(:,end)./s(:,3)
Warning: Divide by zero.
r = 400  150  50  Inf  0

Ignore this message; in the ratios ignore the Inf values and the last value. The 3rd row has the minimum ratio - pivot at (3,3):

» s=pivot(s,3,3)
» format short g        (if necessary, format the output: best of fixed and floating point)
» s
s =  1.5   3  0  1  0  -0.25  0  350
     0.5   0  0  0  1  -0.25  0  100
     0.5   0  1  0  0   0.25  0   50
       0   1  0  0  0      0  1   50
      -3  -5  0  0  0    2.5  0  500

2nd variable enters (most negative), computing ratios: last column / 2nd column:

» r=s(:,end)./s(:,2)
r = 116.67  Inf  Inf  50  -100

The 4th row has the minimum ratio - pivot at (4,2):

» s=pivot(s,4,2)
s =  1.5   0  0  1  0  -0.25  -3  200
     0.5   0  0  0  1  -0.25   0  100
     0.5   0  1  0  0   0.25   0   50
       0   1  0  0  0      0   1   50
      -3   0  0  0  0    2.5   5  750

1st variable enters (the only negative), computing ratios: last column / 1st column:

» r=s(:,end)./s(:,1)
r = 133.33  200  100  Inf  -250

The 3rd row has the minimum ratio - pivot at (3,1):

» s=pivot(s,3,1)
s =  0  0  -3  1  0    -1  -3    50
     0  0  -1  0  1  -0.5   0    50
     1  0   2  0  0   0.5   0   100
     0  1   0  0  0     0   1    50
     0  0   6  0  0     4   5  1050

Optimal table. Retrieving results from the optimal table:

» z=s(end,end)
z = 1050                 Objective value

The solution vector is found via the basic columns (x4 is the first basic variable with the value 50, x1 is the third basic variable with the value 100, etc.)
» x=[100 50 0 50 50 0 0]'
x = 100  50  0  50  50  0  0  (column vector)

Computing the left-hand sides of the inequalities using the first 3 columns of A and the first three items of x:

» A                              Displaying A again
A =  2  3  1  1  0  0  0
     1  0  1  0  1  0  0
     2  0  4  0  0  1  0
     0  1  0  0  0  0  1
» lhs=A(:,1:3)*x(1:3)
lhs = 350  100  200  50
» [lhs b]                        Left- and right-hand sides
ans = 350  400
      100  150
      200  200
       50   50
» unused=b-lhs                   Unused resources = RHS - LHS
unused = 50  50  0  0

Note that only the last two constraints are tight. The shadow costs are in the last 4 columns of the z-row:

» w=s(5,4:7)'
w = 0  0  4  5                   Dual solution = shadow costs
» c'*x
ans = 1050                       Primal objective
» w'*b
ans = 1050                       Dual objective

The inverted basis is found where the unity matrix was originally (the last 4 columns):

» Binv=s(1:4,4:7)
Binv = 1  0   -1  -3             B^-1
       0  1 -0.5   0
       0  0  0.5   0
       0  0    0   1
» B=inv(Binv)
B = 1  0  2  3                   The inverse of B^-1 is B
    0  1  1  0
    0  0  2  0
    0  0  0  1

Note that the basis is made of the columns of A that correspond to the basic variables x4, x5, x1 and x2, in this order. To check this, let's create the basis again. Note that it is possible to retrieve columns in any order by giving a set - a vector of indices:

» BB=A(:,[4 5 1 2])              All rows, columns as given by the 2nd parameter
BB = 1  0  2  3                  BB = B
     0  1  1  0
     0  0  2  0
     0  0  0  1

4.5 Big M Method and Two-Phase Method

Recall that to start the simplex algorithm an initial basic feasible solution is required. In the problems considered so far, an initial basic feasible solution was found by using the slack variables as the initial basic variables. This was possible because those problems contained only constraints of the form

Ax ≤ b ,  x ≥ 0

However, if an LP model contains ≥ or = constraints (or a mixture of these), an initial basic feasible solution is not readily apparent. For example, consider the following LP:

max z = 2x1 + 5x2
subject to
2x1 + 3x2 ≤ 6
2x1 - x2 ≥ 2
-x1 + 6x2 = 2
x1, x2 ≥ 0

Inserting slack variables in the inequalities, we obtain the LP in standard form:

max z = 2x1 + 5x2
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 = 2
-x1 + 6x2 = 2
x1, x2, x3, x4 ≥ 0

In the last two constraints there is no readily apparent variable which can act as a basic variable of an initial basic feasible solution. Thus, in order to obtain an initial basic feasible solution, two other variables have to be introduced into the last two constraints. These are called artificial variables; they have no practical interpretation and must in fact be eliminated from the final solution. Hence the above LP is converted into the following form:

max z = 2x1 + 5x2 + 0x3 + 0x4
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 + a1 = 2
-x1 + 6x2 + a2 = 2
x1, x2, x3, x4, a1, a2 ≥ 0

The Big M Method

As already pointed out, the artificial variables must be eliminated from the optimal solution after obtaining the IBFS. There are two methods which eliminate artificial variables: the Big M method and the two-phase method. First we examine the Big M method. To remove the artificial variables in minimization problems, the Big M method adds a term Mai to the objective function for each artificial variable ai; in maximization problems, the term -Mai is added instead. M represents some very large number and can be interpreted as a huge penalty for underfulfillment of the requirements. Thus the above example becomes:

max z = 2x1 + 5x2 + 0x3 + 0x4 - Ma1 - Ma2
subject to
2x1 + 3x2 + x3 = 6
2x1 - x2 - x4 + a1 = 2
-x1 + 6x2 + a2 = 2
x1, x2, x3, x4, a1, a2 ≥ 0

Modifying the objective function in this way makes it extremely "non-profitable" for an artificial variable to be positive, so the optimal solution should force a1 and a2 to be 0. The problem is then solved using the simplex method. Note: if any artificial variable is positive in the optimal solution, then the problem is infeasible.
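In a purely numeric implementation, the symbolic M used in the session of Example 4.7 below can be replaced by a sufficiently large constant. A minimal sketch for the example above - the value 1e6 is an arbitrary assumption; M only has to dominate the other objective coefficients:

% M-method objective with a numeric penalty (columns x1 x2 x3 x4 a1 a2).
M  = 1e6;                      % "big enough" penalty, an assumption
cT = [2 5 0 0 -M -M];          % max 2x1 + 5x2 - M(a1 + a2)
% The table is then built and pivoted exactly as in the all-slack case; if
% an artificial variable is still positive at optimality, the LP is infeasible.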
The Two-Phase Method

This method is divided into two phases.

Phase 1: For the time being we ignore the original LP's objective function and instead minimize an LP whose objective function z' is the sum of the artificial variables. Phase 1 will thus force the artificial variables to be zero. Since each artificial variable is non-negative, solving the Phase 1 LP results in one of the following three cases:

Case 1: The optimal value of z' > 0.
Case 2: The optimal value of z' = 0 and no artificial variables are in the optimal solution of Phase 1.
Case 3: The optimal value of z' = 0 and at least one artificial variable is in the optimal basis of Phase 1 (with value 0).

Phase 2:

Case 1: Stop; the original LP is infeasible.
Case 2:
i. Delete the columns corresponding to the artificial variables.
ii. Combine the original objective function with the constraints of the optimal Phase 1 tableau.
iii. Make the tableau consistent using row operations, so that the basic variables have 0 objective coefficients.
iv. Apply the simplex method to this new consistent tableau.
Case 3:
i. Delete the columns corresponding to both the non-basic and the basic-zero artificial variables.
ii. Combine the original objective function with the constraints of the optimal Phase 1 tableau.
iii. Make the tableau consistent using row operations, so that the basic variables have 0 objective coefficients.
iv. Apply the simplex method to this new consistent tableau.
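The Phase 1 set-up for the example above could be sketched as follows (variable order x1, x2, x3, x4, a1, a2 assumed):

% Phase 1 objective: minimize a1 + a2, ignoring the original costs.
cT1 = [0 0 0 0 1 1];
% After Phase 1 ends with objective value 0, the artificial columns are
% deleted, the original costs are restored in the z-row (made consistent
% by row operations) and Phase 2 continues with the ordinary simplex method.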
Example 4.7 (Matlab Session)
This session shows the simplex algorithm (M method) using the table approach. Comments added later are in italics; some empty lines and spaces have been removed. We assume that the folder Z:\Matlab contains the file pivot.m.

» cd Z:\Matlab
» type pivot
(the same pivot function as listed in Example 4.6)

Entering A, b, c of the model:

max  z = 3x1 + 4x2 + 7x3
ST   2x1 + 3x2        ≤ 30
           4x2 + 7x3  ≤ 75
     2x1 +       9x3  ≥ 50
     7x1              ≥ 20
     (all nonnegative)

» A=[2 3 0;0 4 7;2 0 9;7 0 0]
A =  2  3  0
     0  4  7
     2  0  9
     7  0  0
» A=[A [0 0 -1 0]' [0 0 0 -1]']      Adding columns of negative slacks (surplus) s3, s4
A =  2  3  0   0   0
     0  4  7   0   0
     2  0  9  -1   0
     7  0  0   0  -1
» A=[A eye(4)]                       Adding columns of positive slacks s1, s2 and artificial variables a1, a2
A =  2  3  0   0   0  1  0  0  0
     0  4  7   0   0  0  1  0  0
     2  0  9  -1   0  0  0  1  0
     7  0  0   0  -1  0  0  0  1
» b=[30 75 50 20]'
b = 30  75  50  20  (column vector)
» M=sym('M')                         Definition of the symbol M
» cT=[3 4 7 0 0 0 0 -M -M]
cT = [ 3, 4, 7, 0, 0, 0, 0, -M, -M]
» s=[A b; -cT 0]                     Initial simplex table (column labels added)
s =  x1  x2  x3  s3  s4  s1  s2  a1  a2  rhs
   [  2,  3,  0,  0,  0,  1,  0,  0,  0,  30]
   [  0,  4,  7,  0,  0,  0,  1,  0,  0,  75]
   [  2,  0,  9, -1,  0,  0,  0,  1,  0,  50]
   [  7,  0,  0,  0, -1,  0,  0,  0,  1,  20]
   [ -3, -4, -7,  0,  0,  0,  0,  M,  M,   0]

Not consistent: subtract M(row3 + row4) from the z-row:

» s(5,:)=s(5,:)-(s(3,:)+s(4,:))*M
s = [      2,  3,      0,  0,  0,  1,  0,  0,  0,    30]
    [      0,  4,      7,  0,  0,  0,  1,  0,  0,    75]
    [      2,  0,      9, -1,  0,  0,  0,  1,  0,    50]
    [      7,  0,      0,  0, -1,  0,  0,  0,  1,    20]
    [ -3-9*M, -4, -7-9*M,  M,  M,  0,  0,  0,  0, -70*M]

» col=3                              x3 enters
» r=s(:,end)./s(:,col)               ratio test
r = [ Inf, 75/7, 50/9, Inf, -70*M/(-7-9*M) ]'
                                     50/9 is the minimum; ignore the z-entry of the ratio test
» row=3                              a1 leaves
» s=pivot(s,row,col)
s = [         2,  3,  0,    0,    0,  1,  0,      0,  0,          30]
    [     -14/9,  4,  0,  7/9,    0,  0,  1,   -7/9,  0,       325/9]
    [       2/9,  0,  1, -1/9,    0,  0,  0,    1/9,  0,        50/9]
    [         7,  0,  0,    0,   -1,  0,  0,      0,  1,          20]
    [ -13/9-7*M, -4,  0, -7/9,    M,  0,  0,  7/9+M,  0, -20*M+350/9]

» col=1                              x1 enters
» r=s(:,end)./s(:,col)               ratio test
r = [ 15, -325/14, 25, 20/7, (-20*M+350/9)/(-13/9-7*M) ]'
                                     20/7 is the minimum
» row=4                              a2 leaves
» s=pivot(s,row,col)
s = [ 0,  3,  0,    0,    2/7,  1,  0,      0,    -2/7,   170/7]
    [ 0,  4,  0,  7/9,   -2/9,  0,  1,   -7/9,     2/9,   365/9]
    [ 0,  0,  1, -1/9,   2/63,  0,  0,    1/9,   -2/63,  310/63]
    [ 1,  0,  0,    0,   -1/7,  0,  0,      0,     1/7,    20/7]
    [ 0, -4,  0, -7/9, -13/63,  0,  0,  7/9+M, 13/63+M, 2710/63]

» col=2                              x2 enters
» r=s(:,end)./s(:,col)               ratio test
r = [ 170/21, 365/36, Inf, Inf, -1355/126 ]'
                                     170/21 is the minimum
» row=1                              s1 leaves
» s=pivot(s,row,col)
s = [ 0,  1,  0,    0,   2/21,   1/3,  0,      0,    -2/21,  170/21]
    [ 0,  0,  0,  7/9, -38/63,  -4/3,  1,   -7/9,    38/63,  515/63]
    [ 0,  0,  1, -1/9,   2/63,     0,  0,    1/9,    -2/63,  310/63]
    [ 1,  0,  0,    0,   -1/7,     0,  0,      0,      1/7,    20/7]
    [ 0,  0,  0, -7/9,  11/63,   4/3,  0,  7/9+M, -11/63+M, 4750/63]

» col=4                              s3 enters
» row=2                              s2 leaves (the only positive coefficient)
» s=pivot(s,row,col)
s = [ 0,  1,  0,  0,   2/21,    1/3,    0,   0,  -2/21,  170/21]
    [ 0,  0,  0,  1, -38/49,  -12/7,  9/7,  -1,  38/49,  515/49]
    [ 0,  0,  1,  0, -8/147,  -4/21,  1/7,   0,  8/147, 895/147]
    [ 1,  0,  0,  0,   -1/7,      0,    0,   0,    1/7,    20/7]
    [ 0,  0,  0,  0,   -3/7,      0,    1,   M,  3/7+M,   585/7]

» col=5                              s4 enters
» row=1                              x2 leaves (the only positive coefficient)
» s=pivot(s,row,col)
s = [ 0, 21/2,  0,  0,  1,  7/2,    0,   0,  -1,     85]
    [ 0, 57/7,  0,  1,  0,    1,  9/7,  -1,   0,  535/7]
    [ 0,  4/7,  1,  0,  0,    0,  1/7,   0,   0,   75/7]
    [ 1,  3/2,  0,  0,  0,  1/2,    0,   0,   0,     15]
    [ 0,  9/2,  0,  0,  0,  3/2,    1,   M,   M,    120]

Retrieving results from the optimal table:

» x=[15 0 75/7 535/7 85 0 0]         Solution (x1 x2 x3 s3 s4 s1 s2)
x = 15.0000  0  10.7143  76.4286  85.0000  0  0
Activities: x1 = 15.0000, x2 = 0, x3 = 10.7143
Slacks:
s1 = 0            1st constraint tight
s2 = 0            2nd constraint tight
s3 = 76.4286      surplus above 50
s4 = 85           surplus above 20

» z=s(end,end)
z = 120           objective value

See the shadow costs of the RHS values:
sc1 = 3/2         worth of resource 1 (30)
sc2 = 1           worth of resource 2 (75)
(sc3 = sc4 = 0 - the corresponding constraints are not tight)

Reduced cost of the (nonbasic) x2 = -9/2 (one unit of x2 would decrease z by 9/2).

Example 4.8 (Matlab Session)
This session shows the simplex algorithm (two-phase method) using the table approach. Comments added later are in italics; some empty lines and spaces have been removed. We assume that the folder Z:\Matlab contains the file pivot.m.

» cd Z:\Matlab
» type pivot
(the same pivot function as listed in Example 4.6)

Entering A, b, c of the model:

max  z = 3x1 + 4x2 + 7x3
ST   2x1 + 3x2        ≤ 30
           4x2 + 7x3  ≤ 75
     2x1 +       9x3  ≥ 50
     7x1              ≥ 20
     (all nonnegative)

» A=[2 3 0;0 4 7;2 0 9;7 0 0]
A =  2  3  0
     0  4  7
     2  0  9
     7  0  0
» A=[A [0 0 -1 0]' [0 0 0 -1]']      Adding columns of negative slacks (surplus) s3, s4
A =  2  3  0   0   0
     0  4  7   0   0
     2  0  9  -1   0
     7  0  0   0  -1
» A=[A eye(4)]                       Adding columns of positive slacks s1, s2 and artificials a1, a2
A =  2  3  0   0   0  1  0  0  0
     0  4  7   0   0  0  1  0  0
     2  0  9  -1   0  0  0  1  0
     7  0  0   0  -1  0  0  0  1
» b=[30 75 50 20]'
b = 30  75  50  20  (column vector)

Phase I = minimization of a1 + a2:

» cT=[zeros(1,7) 1 1]
cT = 0  0  0  0  0  0  0  1  1
» s=[A b; -cT 0]                     Initial simplex table (column labels added)
s = x1  x2  x3  s3  s4  s1  s2  a1  a2  rhs
     2   3   0   0   0   1   0   0   0   30
     0   4   7   0   0   0   1   0   0   75
     2   0   9  -1   0   0   0   1   0   50
     7   0   0   0  -1   0   0   0   1   20
     0   0   0   0   0   0   0  -1  -1    0

Not consistent - add row3 and row4 to the z-row:

» s(end,:)=s(end,:) + s(3,:) + s(4,:)
s =  2   3   0   0   0   1   0   0   0   30
     0   4   7   0   0   0   1   0   0   75
     2   0   9  -1   0   0   0   1   0   50
     7   0   0   0  -1   0   0   0   1   20
     9   0   9  -1  -1   0   0   0   0   70

» col=1                              x1 enters
» r=s(:,end)./s(:,col)               ratio test
r = 15.0000  Inf  25.0000  2.8571  7.7778
                                     2.8571 is the minimum
» row=4                              a2 leaves
» s=pivot(s,row,col)
s =  0  3  0   0   0.2857  1  0  0  -0.2857  24.2857
     0  4  7   0        0  0  1  0        0  75.0000
     0  0  9  -1   0.2857  0  0  1  -0.2857  44.2857
     1  0  0   0  -0.1429  0  0  0   0.1429   2.8571
     0  0  9  -1   0.2857  0  0  0  -1.2857  44.2857

» col=3                              x3 enters
» r=s(:,end)./s(:,col)               ratio test
r = Inf  10.7143  4.9206  Inf  4.9206
                                     4.9206 (row 3) is the minimum
» row=3                              a1 leaves
» s=pivot(s,row,col)
s =  0  3  0        0   0.2857  1  0        0  -0.2857  24.2857
     0  4  0   0.7778  -0.2222  0  1  -0.7778   0.2222  40.5556
     0  0  1  -0.1111   0.0317  0  0   0.1111  -0.0317   4.9206
     1  0  0        0  -0.1429  0  0        0   0.1429   2.8571
     0  0  0        0        0  0  0  -1.0000  -1.0000        0

End of Phase I: the table is optimal (a1 + a2 = 0, i.e. a1 and a2 are nonbasic). Removing columns 8 and 9 of the artificial variables:

» s(:,8:9)=[]
s =  0  3  0        0   0.2857  1  0  24.2857
     0  4  0   0.7778  -0.2222  0  1  40.5556
     0  0  1  -0.1111   0.0317  0  0   4.9206
     1  0  0        0  -0.1429  0  0   2.8571
     0  0  0        0        0  0  0        0

Phase II = maximization of 3x1 + 4x2 + 7x3:

» cT=[3 4 7 0 0 0 0]
cT = 3  4  7  0  0  0  0
» s(end,:)=[-cT 0]                   the new z-row in the simplex table!
s =  0   3  0        0   0.2857  1  0  24.2857
     0   4  0   0.7778  -0.2222  0  1  40.5556
     0   0  1  -0.1111   0.0317  0  0   4.9206
     1   0  0        0  -0.1429  0  0   2.8571
    -3  -4 -7        0        0  0  0        0

Not consistent - add 7·row3 + 3·row4 to the z-row:

» s(end,:)=s(end,:) + 7*s(3,:) + 3*s(4,:)
s =  0   3  0        0   0.2857  1  0  24.2857
     0   4  0   0.7778  -0.2222  0  1  40.5556
     0   0  1  -0.1111   0.0317  0  0   4.9206
     1   0  0        0  -0.1429  0  0   2.8571
     0  -4  0  -0.7778  -0.2063  0  0  43.0159

» col=2                              x2 enters
» r=s(:,end)./s(:,col)               ratio test
r = 8.0952  10.1389  Inf  Inf  -10.7540
                                     8.0952 is the minimum
» row=1                              s1 leaves
» s=pivot(s,row,col)
s =  0  1  0        0   0.0952   0.3333  0   8.0952
     0  0  0   0.7778  -0.6032  -1.3333  1   8.1746
     0  0  1  -0.1111   0.0317        0  0   4.9206
     1  0  0        0  -0.1429        0  0   2.8571
     0  0  0  -0.7778   0.1746   1.3333  0  75.3968

» col=4                              s3 enters
» row=2                              s2 leaves (the only positive coefficient)
» s=pivot(s,row,col)
s =  0  1  0  0   0.0952   0.3333       0   8.0952
     0  0  0  1  -0.7755  -1.7143  1.2857  10.5102
     0  0  1  0  -0.0544  -0.1905  0.1429   6.0884
     1  0  0  0  -0.1429        0       0   2.8571
     0  0  0  0  -0.4286        0       1  83.5714

» col=5                              s4 enters
» row=1                              x2 leaves (the only positive coefficient)
» s=pivot(s,row,col)
s =  0  10.5000  0  0  1  3.5000       0   85.0000
     0   8.1429  0  1  0  1.0000  1.2857   76.4286
     0   0.5714  1  0  0       0  0.1429   10.7143
     1   1.5000  0  0  0  0.5000       0   15.0000
     0   4.5000  0  0  0  1.5000  1.0000  120.0000

Retrieving results from the optimal table:

» x=[15 0 10.7143 76.4286 85 0 0]
x = 15.0000  0  10.7143  76.4286  85.0000  0  0

Activities: x1 = 15.0000, x2 = 0, x3 = 10.7143
Slacks:
s1 = 0            1st constraint tight
s2 = 0            2nd constraint tight
s3 = 76.4286      surplus above 50
s4 = 85           surplus above 20

» z=s(end,end)
z = 120           objective value

See the shadow costs of the RHS values:
sc1 = 1.5         worth of resource 1 (30)
sc2 = 1           worth of resource 2 (75)
(sc3 = sc4 = 0 - the corresponding constraints are not tight)

Reduced cost of the (nonbasic) x2 = -4.5 (one unit of x2 would decrease z by 4.5).

4.6 Sensitivity Analysis

Numerical parameters of LP models - and of mathematical models in Operations Research generally - are often not known exactly. Very often the actual values that represent availability of resources, contributions or costs are just estimates of future values. That is why it is very important to evaluate how much the current (optimal) solution depends on the actual values of the parameters, and to update the solution if some parameter changes its value, without solving the problem again from scratch. For LP problems, a change in the model can result in one of these four cases:

1. The current (basic feasible optimal) solution remains unchanged.
2. The current solution becomes infeasible.
3. The current solution becomes non-optimal.
4. The current solution becomes both non-optimal and infeasible.

The methods to recover optimality and feasibility in the above cases are these (case 1 requires no action):

2. Use the dual simplex method to recover feasibility.
3. Use the (primal) simplex method to obtain the new optimum.
4. Use both the primal and the dual simplex method to obtain the new solution.

To assess how sensitive the solution is to changes of a particular parameter, we need the range of its values that keeps feasibility/optimality of the current solution (whose actual objective value can change). The next paragraphs deal with both problems.
We shall consider only the right-hand side values of the (in)equalities and the coefficients of the objective function, because most software packages provide sensitivity data on these parameters only. We shall also assume that only one parameter changes at a time, while the others remain constant. For convenience, let's repeat the contents of the simplex table of a feasible LP problem with bounded objective value after reaching the optimal feasible solution. Note that the columns of the basic and nonbasic variables are in fact scattered in the table, because the original column labels are usually not changed during the simplex iterations.

BV | xN^T               | xB^T | z | RHS
xB | B^-1 N             | I    | 0 | B^-1 b
z  | cB^T B^-1 N - cN^T | 0^T  | 1 | cB^T B^-1 b

In the table, B is the optimal basic matrix and N the corresponding nonbasic matrix (both made of columns of the original m × n matrix A). The n-vectors x and c are divided accordingly; b is the m-vector of the RHS values. Note also that the inverted basic matrix B^-1 is available in the columns that originally contained the unity matrix - typically the last m columns - and that the dual optimal solution, equal to the shadow prices of the primal RHS values, is available in the z-row entries of the columns corresponding to the slack variables of the primal model. Of course, the primal and dual optimal objective values are equal.

Changes in the right-hand side values

From the above table it is evident that these changes cannot affect the optimality of the current solution, because the values zj - cj in the z-row do not depend on b. Changes in b change the values b* = B^-1 b of the basic variables and the objective value cB^T B^-1 b. If the new values of the basic variables are still non-negative, we have a new feasible optimum; otherwise it is necessary to apply the dual simplex method to recover feasibility.

Range in which the elements of b can vary (feasibility range)

Let's assume that the i-th RHS value bi is changed to bi + di (i = 1 … m) and that all the other RHS values are unchanged. Let's call the new vector of RHS values b'. The condition B^-1 b' ≥ 0 can in this case be expressed as:

b* + (B^-1)i di ≥ 0

where b* are the current values of the basic variables and (B^-1)i is the i-th column of B^-1. This is a set of inequalities:

bj* + (B^-1)ji di ≥ 0 ,  j = 1 … m

All these inequalities must be satisfied, which defines the minimum and maximum possible values of di - in other words, the range of bi values that keeps the current solution feasible.

Example 4.9
Consider the LP model:

max 8x1 + 5x2 + 10x3
subject to
2x1 + 3x2 + x3 ≤ 400
 x1       + x3 ≤ 150
2x1      + 4x3 ≤ 200
       x2      ≤ 50
x1, x2, x3 ≥ 0

The optimal table is:

Solution |      Products      |      Slack Variables      | Solution
Variable |  x1    x2    x3    |  x4    x5    x6     x7    | Quantity
x4       |   0     0    -3    |   1     0    -1     -3    |    50
x5       |   0     0    -1    |   0     1   -0.5     0    |    50
x1       |   1     0     2    |   0     0    0.5     0    |   100
x2       |   0     1     0    |   0     0     0      1    |    50
z        |   0     0     6    |   0     0     4      5    |  1050

The basic matrix B corresponding to the optimal table, and its inverse, are:

    | 1  0  2  3 |           | 1  0   -1   -3 |
B = | 0  1  1  0 |   B^-1 =  | 0  1  -1/2   0 |
    | 0  0  2  0 |           | 0  0   1/2   0 |
    | 0  0  0  1 |           | 0  0    0    1 |

Suppose we would like to know the range in which b3 can vary such that the feasibility conditions still hold, in other words such that b' = b* + (B^-1)3 d3 ≥ 0:

|  50 |   |  -1  |        | 0 |        d3 ≤ 50
|  50 | + | -1/2 | d3  ≥  | 0 |   ⇒    d3 ≤ 100
| 100 |   |  1/2 |        | 0 |        d3 ≥ -200
|  50 |   |   0  |        | 0 |

⇒ -200 ≤ d3 ≤ 50. Thus b3 can vary in the range [0, 250].
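The same inequalities are easy to evaluate in Matlab; a minimal sketch for the range of b3 above, with the data copied from the optimal table:

% Feasibility range of b3 in Example 4.9.
Binv  = [1 0 -1 -3; 0 1 -0.5 0; 0 0 0.5 0; 0 0 0 1];   % B^-1 (slack columns)
bstar = [50; 50; 100; 50];                             % current basic values
col   = Binv(:,3);                                     % 3rd column of B^-1
% Require bstar + col*d3 >= 0 componentwise:
d3max = min(-bstar(col<0) ./ col(col<0));              % gives d3 <= 50
d3min = max(-bstar(col>0) ./ col(col>0));              % gives d3 >= -200
% So b3 = 200 + d3 can vary in [0, 250], as computed by hand above.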
Changes in the objective coefficients

From the above table it is evident that these changes cannot directly affect the values of the basic variables b* = B^-1 b, but they can affect optimality. So, in general, it is necessary to apply the primal simplex method to reach optimality again. Let's treat separately the cases where the changing coefficient ck belongs to a nonbasic and to a basic variable xk.

1. Assume xk is a nonbasic variable.

In this case cB (and thus the objective value) remains unchanged. The values zj - cj in the z-row are also unchanged, except for the value zk - ck in the xk column, which becomes zk - ck', where ck' is the new value of the coefficient. If the new value zk - ck' keeps optimality (remains non-positive/non-negative for minimization/maximization problems), the current solution is not changed. Otherwise xk has to enter the basis, so we use the primal simplex method to do so, possibly performing more iterations until optimality is recovered. In this case we get a new optimum (in general, more basic variables can leave). The optimality condition can also be used to find the range of ck values that do not change the current solution.

Example 4.9 (continued)
Recall the optimal table above. The NBV are x3, x6 and x7. Since only x3 corresponds to a decision variable (x6 and x7 are slacks), we are interested in the range in which c3 can vary so that the optimality condition z3 - c3' ≥ 0 still holds:

z3 - c3' = z3 - c3 + c3 - c3' = (z3 - c3) - (c3' - c3) = 6 - (c3' - 10) ≥ 0  ⇒  c3' ≤ 16

Thus the range in which c3 can vary from its current value of 10 is (-∞, 16].

2. Assume xk is a basic variable.

In this case cB is changed, which in general affects the objective value and the optimality conditions. Let's assume that xk ≡ xBt (xk is the t-th basic variable) and that the old value cBt is replaced by cBt'. Let the new value of zj be zj'. Then we can calculate the new values of the (negative) reduced costs in the z-row:

zj' - cj = cB'^T B^-1 aj - cj = cB'^T yj - cj = (cB^T yj - cj) + (0, 0, …, cBt' - cBt, …, 0) yj
         = (zj - cj) + (cBt' - cBt) ytj     for all j ∈ R

where yj is the j-th column of the simplex table and R is the set of indices of the nonbasic variables. Note that the reduced costs of the basic variables remain zero by the definition of the simplex table. So, to get the new z-row, we multiply the current row t of the optimal simplex table by the net change in the cost (cBt' - cBt) and add it to the original z-row; it is then necessary to restore the zero in the k-th column. This also produces the new objective value cB'^T B^-1 b = cB^T B^-1 b + (cBt' - cBt) bt*. Of course, the new values in the z-row can violate the optimality conditions, so again it may be necessary to use the primal simplex method to find the new optimum. Assuming minimization, optimality can be expressed by a set of inequalities:

(zj - cj) + (cBt' - cBt) ytj ≤ 0     for all j ∈ R

These inequalities can be used to find the range of cBt values that keep optimality. For maximization the only change is the sign of the inequality.

Example 4.9 (continued)
Recall the optimal table above. The BV are (xB1, xB2, xB3, xB4) = (x4, x5, x1, x2).
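As a quick numerical check (the same computation is carried out by hand below for the basic variable x1), the optimality inequalities can be evaluated directly from the optimal table; a minimal sketch:

% Range of the basic cost c1 in Example 4.9 (maximization, so >= 0).
zc = [6 4 5];          % z_j - c_j in the nonbasic columns x3, x6, x7
y  = [2 0.5 0];        % row t = 3 (the x1 row) in the same columns
c1 = 8;
% (z_j - c_j) + (c1' - c1)*y_j >= 0  =>  c1' >= c1 - zc(j)/y(j) for y(j) > 0
lower = max(c1 - zc(y>0) ./ y(y>0));    % lower = 5
% No y_j < 0 occurs here, so there is no upper bound: c1 ranges over [5, Inf).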
The solution variables among the BV are x1 and x2. Consider the BV x1 = xB3 and recall the set of indices R of the NBV: R = {3, 6, 7}.

i.   j = 3: (z3 - c3) + (c1' - c1) y33 = 6 + (c1' - 8)·2 ≥ 0      ⇒  c1' ≥ 5
ii.  j = 6: (z6 - c6) + (c1' - c1) y36 = 4 + (c1' - 8)·(1/2) ≥ 0  ⇒  c1' ≥ 0
iii. j = 7: (z7 - c7) + (c1' - c1) y37 = 5 + (c1' - 8)·0 ≥ 0      ⇒  no restriction

Thus the range of c1 is [5, ∞).

Chapter 5
Duality Theory

Together with every linear programming (LP) problem there is an associated LP problem referred to as its dual. The importance of introducing the dual problem lies in the fact that it has many properties in common with its primal; in fact, one can sometimes solve the dual problem with the intent of solving the primal. The notion of a dual model also leads to a rich practical and economic interpretation of both. To create a dual problem, there are these basic rules:

• Change the type of the problem (Max ↔ Min).
• There is one dual variable for each primal constraint.
• There is one dual constraint for each primal variable.

The following table defines the relationships:

              Minimization Problem    Maximization Problem
Variables     ≥ 0                     ≤              Constraints
              ≤ 0                     ≥
              unrestricted            =
Constraints   ≥                       ≥ 0            Variables
              ≤                       ≤ 0
              =                       unrestricted

Moreover, this is the relationship of the vectors involved, assuming that the usual interpretation of A, c^T, b and x holds for the primal:

Primal:  A    b    c^T   x
Dual:    A^T  c    b^T   w

Vectors are considered column vectors; row vectors are transposed (c is a column vector, c^T is a row vector). The general pattern is the following, where the proper constraint signs and bounds of the variables are defined by the above table:

Primal:  Max (Min)  c^T x        Dual:  Min (Max)  b^T w
         ST  Ax ? b                     ST  A^T w ? c
             x ? 0                          w ? 0

There are two special cases of duality with simplified conversion rules:

Canonical form of duality:

Min  c^T x          Max  b^T w
ST   Ax ≥ b         ST   A^T w ≤ c
     x ≥ 0               w ≥ 0

Standard form of duality:

Min  c^T x          Max  b^T w
ST   Ax = b         ST   A^T w ≤ c
     x ≥ 0               w unrestricted

or

Max  c^T x          Min  b^T w
ST   Ax = b         ST   A^T w ≥ c
     x ≥ 0               w unrestricted

Example 5.1:

Primal:  Min 3x1 + 4x2            Dual:  Max 3w1 + 10w2
         ST  3x1 + 4x2 ≥ 3               ST  3w1 + 6w2 ≤ 3
             6x1 + 9x2 ≥ 10                  4w1 + 9w2 ≤ 4
             x1, x2 ≥ 0                      w1, w2 ≥ 0

5.1 Primal - Dual Relationships

Theorem 5.2: The dual of the dual LP problem is the primal LP problem.

Proof: We prove the theorem for the canonical form of duality. Consider the following dual LP problem D (its primal form is stated above):

Max b^T w
ST  A^T w ≤ c
    w ≥ 0

This LP problem may also be stated as:

Max Σ_{j=1}^{m} bj wj
ST  Σ_{j=1}^{m} Aji wj ≤ ci ,  i = 1 … n

Note that b and w are m-vectors, c is an n-vector and A^T is an n×m matrix, so the i-th sum involves the i-th column of A (the i-th row of A^T). The same optimal values wj may be obtained if the objective function is changed to:

Min Σ_{j=1}^{m} (-bj) wj

So maximization of b^T w can be converted to minimization of (-b^T) w. Similarly, it is possible to multiply the inequalities by -1 to change their signs. Using this transformation, the dual LP problem can be converted to:

Min (-b^T) w
ST  (-A^T) w ≥ (-c)
    w ≥ 0

This is a second form of the dual D, with the same optimum and the same optimal values of the variables w. Now, upon taking the dual of D - using the rule described above for the canonical form of duality - we get this problem:

Max (-c^T) x
ST  (-A) x ≤ (-b)
    x ≥ 0

But by applying the above transformation in reverse order, we get:

Min c^T x
ST  Ax ≥ b
    x ≥ 0

This is the primal problem; therefore the dual of the dual is the primal. It means that the terms primal and dual can be exchanged; the term primal is used for the original problem from the application point of view.
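Mechanically, forming the canonical dual just transposes A and swaps b with c; a minimal sketch for Example 5.1:

% Forming the canonical dual of Example 5.1.
A = [3 4; 6 9];  b = [3; 10];  c = [3; 4];   % primal: min c'x, Ax >= b, x >= 0
Ad = A';                                     % dual constraint matrix
cd = b;                                      % dual objective: max cd'w = b'w
bd = c;                                      % dual RHS: Ad*w <= bd, w >= 0
% Written out: max 3w1 + 10w2  s.t.  3w1 + 6w2 <= 3,  4w1 + 9w2 <= 4.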
The next result is known as the Weak Duality Property.

Theorem 5.3: The objective value of any feasible solution to the minimization problem is always greater than or equal to the objective value of any feasible solution to the maximization problem.

Proof: Consider the canonical form of duality. Let xo and wo be any two feasible solutions of the respective primal and dual problems. To prove the theorem, let's express the constraints in canonical form for the primal solution:

Axo ≥ b ,  xo ≥ 0     (1)

and for the dual solution:

A^T wo ≤ c ,  wo ≥ 0     (2)

Now multiply (1) by wo^T on the left, giving wo^T A xo ≥ wo^T b, or alternatively wo^T A xo ≥ b^T wo. Then convert (2) into wo^T A ≤ c^T and multiply by xo on the right, giving wo^T A xo ≤ c^T xo. Combining the two inequalities derived above, we get:

b^T wo ≤ wo^T A xo ≤ c^T xo

i.e. the (maximizing) dual objective never exceeds the (minimizing) primal objective, which proves the theorem.

With this result, the following two corollaries follow:

Corollary 5.4: If xo and wo are feasible solutions to the primal and dual problems such that b^T wo = c^T xo, then xo and wo are optimal solutions to their respective problems.

Corollary 5.5: If either problem has an unbounded objective value, then the other problem possesses no feasible solution.

The second corollary says that if the primal has no minimum optimal value (objective value → -∞), then by the weak duality property there is no maximum value that can be reached in the dual. The same argument applies if the maximum value of the dual can never be reached (objective value → +∞): in this case there is no minimal optimal value that can be reached in the primal. Note that this corollary does not work in reverse order: an infeasible primal/dual does not imply an unbounded dual/primal. There are examples where both the primal and the dual are infeasible, so infeasibility in the primal may also imply infeasibility in the dual. To summarize: infeasibility in the primal implies infeasibility or unboundedness of the dual.

The next important duality property is based on the so-called Karush-Kuhn-Tucker (KKT) optimality conditions, given here without proof.

Theorem 5.6: Suppose that x* is an optimal point of the following LP (primal) minimization problem:

Min c^T x
ST  Ax ≥ b
    x ≥ 0

Then there exists a vector w* such that:
1) Ax* ≥ b ;  x* ≥ 0,
2) A^T w* ≤ c ;  w* ≥ 0,
3) w*^T (Ax* - b) = 0  and  x*^T (c - A^T w*) = 0.

Similarly, suppose that x* is an optimal point of the following LP (primal) maximization problem:

Max c^T x
ST  Ax ≤ b
    x ≥ 0

Then there exists a vector w* such that:
4) Ax* ≤ b ;  x* ≥ 0,
5) A^T w* ≥ c ;  w* ≥ 0,
6) w*^T (b - Ax*) = 0  and  x*^T (A^T w* - c) = 0.

Note that w* can be interpreted as the dual optimal solution. The meaning of the first condition, (1) or (4) respectively, is that if x* is to be optimal, then it has to be a primal feasible point. The same may be said of the second condition, (2) or (5) respectively: this time it is the dual optimal solution that must be feasible. The last condition, (3) or (6) respectively, is the most important one, for it leads to the proof of the Strong Duality Property:

Theorem 5.7: If one problem possesses an optimal solution, then both problems possess optimal solutions and the two optimal values are equal.
Proof: Using both equations listed in the KKT condition (3), it may be shown directly that c^T x* = b^T w*. The weak duality property then implies that w* must be an optimal solution of the dual problem (since in general b^T w ≤ c^T x, and equality is reached only at optimality of both problems); therefore w* maximizes b^T w over the dual feasible region. Similarly, the KKT optimality condition (6) for the dual (maximization) problem implies the existence of a primal feasible solution whose objective is equal to that of the optimal dual. These arguments complete the proof.

The strong duality property has important consequences. Let x and w be any two feasible solutions of the primal and dual LP problems respectively. The strong duality property suggests a simple method to check whether the two solutions are optimal for their respective problems: check whether c^T x = b^T w. If this is true, x and w are both optimal. This is known as the Supervisor's Principle.

Another important relationship is the so-called Complementary Slackness:

Theorem 5.8: Let x and w be any two feasible solutions to the primal and the dual problems in canonical form. Then they are respectively optimal if and only if

• xj (cj - aj^T w) = 0 ,  j = 1 … n
• wi (ai^T x - bi) = 0 ,  i = 1 … m

Alternatively, at least one of the two factors of each equation must be zero:

• xj > 0 ⇒ cj = aj^T w
• aj^T w < cj ⇒ xj = 0
• wi > 0 ⇒ ai^T x = bi
• ai^T x > bi ⇒ wi = 0

where ai is the i-th row of A and aj is the j-th column of A.

Proof: The KKT condition (3) says that if both x and w are optimal for their respective problems, then:

w*^T (Ax* - b) = 0  and  x*^T (c - A^T w*) = 0

The theorem is just the expansion of these conditions. Using the first two KKT conditions, the following also holds:

Ax* - b ≥ 0 ,  c - A^T w* ≥ 0 ,  w* ≥ 0 ,  x* ≥ 0

This is used in the above implications.

Complementary slackness can be stated verbally in this way:
1) If a variable in one problem is positive, then the corresponding constraint in the other problem must be tight (it must hold as an equality). The opposite is not true: in case of degeneracy a zero variable can correspond to a tight constraint.
2) If a constraint in one problem is not tight, then the corresponding variable in the other problem must be zero. Again the opposite is not true: a tight constraint can correspond to a zero variable.

Complementary slackness can be used to find an optimal solution to a problem provided the optimal solution of its dual is known. Assume that we know the optimal dual solution w. The matrix A is known, so we can compute the products aj^T w and compare them with the known values cj. If the values are not equal, the corresponding primal variables are zero; in case of equality, the corresponding primal variables xj are the shadow prices (costs) of the tight dual constraints - see the next paragraph. By exchanging the words dual and primal in this reasoning, it is possible to find the optimal dual solution provided the optimal primal solution is known. See also the example at the end of the next paragraph.

The results of the above theorems and corollaries can be combined into the Fundamental Theorem of Duality:

Theorem 5.9: With regard to the primal and dual linear programming problems, exactly one of the following statements is true:
1. Both possess optimal solutions x* and w*, with c^T x* = b^T w*.
2. One problem has an unbounded objective value, in which case the other problem must be infeasible.
3. Both problems are infeasible.
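The Supervisor's Principle is easy to check numerically. A minimal sketch for Example 5.1, assuming linprog from the Optimization Toolbox is available (linprog handles ≤ rows, so the ≥ rows of the primal are negated, and the dual maximization is run as a minimization of -b^T w):

% Supervisor's Principle on Example 5.1.
A = [3 4; 6 9];  b = [3; 10];  c = [3; 4];
[x, zp] = linprog( c, -A, -b, [], [], zeros(2,1));   % primal: min c'x, Ax >= b
[w, zd] = linprog(-b, A',  c, [], [], zeros(2,1));   % dual:   max b'w, A'w <= c
% zp and -zd coincide (both equal 40/9 here), so x and w are optimal; the
% products x.*(c - A'*w) and w.*(A*x - b) vanish (complementary slackness).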
5.2 Economic Interpretation of Duality

A linear programming problem may be viewed as an allocation of resources to achieve a desired optimal value - either to maximize the profit or to minimize the cost (loss). This is achieved by varying certain variables, which may represent working time, material and other factors encountered in the particular task; generally they are called activities. The allocation is limited by a number of constraints, which may represent the maximum amount of material available (in case of distribution of materials), the maximum time that may be spent on each job, etc. An economic interpretation shall be given both when the primal is a minimizing and a maximizing LP problem.

First, suppose the primal is a minimizing LP problem. Then the corresponding LP problems will be of the form:

Primal:  Min z = c^T x        Dual:  Max z' = b^T w
         ST  Ax ≥ b                  ST  A^T w ≤ c
             x ≥ 0                       w ≥ 0

Note that in general z' ≤ z by the weak duality property, but at optimality the inequality reduces to an equality (by the strong duality property): z* = c^T x* = b^T w*. Let's compute the first partial derivative of z* with respect to the primal RHS value bi:

∂z*/∂bi = wi*

This means that wi* equals the rate of change of the optimal (primal/dual) objective value with respect to bi (the availability of the i-th resource), given that the current nonbasic variables are held at zero. This value is defined as the shadow price (also called shadow cost). But by the bounds attached to the dual problem, wi* ≥ 0; thus the optimal value will increase, or at least stay constant, as any component of b increases. The opposite happens if the value of bi decreases. An increase in any binding resource should be accompanied (at optimal values) by an increase in the net cost or profit (according to the type of the problem).

Interpreting the minimizing primal: Suppose you are managing a company which needs to produce m outputs in quantities of at least bi units each, and you are interested in a production plan that satisfies the request of bi outputs at the minimum production cost for the company. If aij denotes the amount of product i generated by one unit of activity j, and xj represents the number of units of activity j employed, then Σ_{j=1}^{n} aij xj represents the units of output i produced by the n activities. This expression should be greater than or equal to
Therefore, the minimal production cost (deduced by the primal) is equal to the maximal return. In fact, this should be intuitively true, since both the objective values represent the fair charge of the customer. Now, interchange the roles of the primal and dual problems to give an economic interpretation for the following two problems: - 107 - Primal: Max z = cTx ST Dual: Min: z’ = bTw Ax ≤ b ST ATw ≥ c x≥0 w≥0 Interpreting the maximizing primal: Suppose n products are being produced with m types of resources. In this case xj will represent the number of units (j = 1 … n) that are produced of product j, and bi will represent the number of units available of the resource i (i = 1 … m). A product j would provide the company with a profit of cj per unit. Note that in this case, aij will be the number of resources i needed to produce one unit of product j. Interpreting the minimizing dual: For interpretation of the dual, let wi denote the fair price to be put on one unit of resource i. Suppose that the manufacturer is now renting out the m resources at unit prices w1, …, wm instead of manufacturing the mix of products. Then every unit of product j not manufactured would result in a loss of profit cj, since it has not been produced and sold. The renting should at least compensate for this loss in profit. Thus the prices set on the resources should be such that the renting income is not smaller than income from production for each product: m ∑a w i =1 ij i ≥ cj. Still, the renting company seeks to minimize the total rent to eliminate any competition, and thus the dual objective function is formed in order to Minimize m ∑b w . i i =1 Example 5.10 (Maximization Primal – Minimization Dual) - 108 - i A company produces two types of paints, namely paint A, and paint B. Production of both paints is made by the use of two basic materials, namely M1, and M2. The production involves mixing specific quantities of each material for every ton of either paint A, or paint B. These quantities in tons are summarized in the following table: Available Paint A (x1) Paint B (x2) Resources Raw Material M1 6 4 24 (a) Raw Material M2 1 2 6 (b) Profit per ton of 5 4 paint (×$1000) The aim of the company is obviously to maximize the profit. Let x1, and x2 be the daily production (rates in tons) of paint A, and paint B respectively. Then, using the above table the primal LP problem may be formulated as follows: Maximize z = 5x1 + 4x2 ST 6x1 + 4x2 ≤ 24 (a) x1 + 2 x2 ≤ 6 x1, x2 ≥ 0 Upon taking the dual of this LP problem we get: Minimize w = 24y1 + 6y2 - 109 - ST 6 y1 + y2 ≥5 4 y1 + 2 y2 ≥4 y 1, y 2 ≥ 0 Supposing that the company now instead of manufacturing the paints, wants to sell the resources, then y1 and y2 will be the unit prices that are decided by the company for the selling. Now, if one unit of, say, x1 (paint A) is not manufactured then this would result in the loss of $5,000 per ton of profit in manufacturing (since z = 5x1 + 4x2). Therefore, not to run at a loss, the cost of selling should not result in a diminishing profit compared with manufacturing. By selling resources, the company will have an alternative income of 6y1 + y2 (×$1,000) per unit of not produced x1 because the coefficients (6, 1) represent the amount of resources needed to produce one unit of x1. Therefore, to run at a profit, the alternative compensation should exceed the cost: 6y1 + y2 ≥ 5. Applying the same reasoning to the selling of paint B, the constraints of the dual are formed. The company’s main objective is to eliminate competition. 
By the use of the simplex method it may be shown that the optimal values of the primal LP problem are:

x1 = 3 ;  x2 = 1.5 ;  z = 21

Interpreting the primal result: the maximum profit from manufacturing the two types of paint is $21,000 daily. This is achieved by mixing 3 tons of paint A and 1.5 tons of paint B daily.

Applying the simplex method to the dual problem we get:

y1 = 0.75 ;  y2 = 0.5 ;  w = 21

Interpreting the dual result: the fair prices to put on the resources are $750 per unit of resource 1 (material M1 - constraint (a) of the primal) and $500 per unit of resource 2 (material M2 - constraint (b) of the primal).

Now let's assume that we know only the optimal primal solution, and let's apply the complementary slackness property to find the optimal dual solution. The first two primal constraints are tight:

6x1 + 4x2 = 24     (a)
 x1 + 2x2 = 6      (b)

So the corresponding dual variables are the shadow prices of these binding constraints; the solution of the primal by the simplex method gives the shadow prices 0.75 and 0.5. Similarly, let's assume that we know only the optimal dual solution and apply the complementary slackness property to find the optimal primal solution. Both dual constraints are tight:

6y1 +  y2 = 5
4y1 + 2y2 = 4

The primal variables are then their shadow prices; the simplex method gives their values as 3 and 1.5.

Example 5.11 (Minimization Primal - Maximization Dual)
Suppose that a family is trying to compose a minimal-cost diet from six available primary foods (called 1, 2, 3, 4, 5, 6), so that the diet contains at least 9 units of vitamin A and 19 units of vitamin C. The following table shows the data on the foods:

                      Number of units of nutrients      Minimum daily
                      per kg of food                    requirement
Nutrient              1    2    3    4    5    6        of nutrient
Vitamin A             1    0    2    2    1    2          9
Vitamin C             0    1    3    1    3    2         19
Cost of food (c/kg)  35   30   60   50   27   22

The primal model:

min z = 35x1 + 30x2 + 60x3 + 50x4 + 27x5 + 22x6
s.t.  x1 + 2x3 + 2x4 + x5 + 2x6 ≥ 9
      x2 + 3x3 + x4 + 3x5 + 2x6 ≥ 19
      x1, …, x6 ≥ 0

Now suppose that a manufacturer proposes to make synthetic pills of each nutrient and to sell them to this family. The manufacturer has to persuade the family to meet all the nutrient requirements by using the pills instead of the primary foods. However, the family will not use the pills unless the manufacturer can convince them that the prices of the pills are competitive when compared with each of the primary foods. This forces several constraints on the prices the manufacturer can charge for the pills. Let w1 and w2 be the prices of vitamin A and vitamin C respectively in pill form. Consider, say, primary food 5: one kg of this food contains one unit of vitamin A and 3 units of vitamin C and costs 27 cents. Thus the family will not buy the pills unless w1 + 3w2 ≤ 27; similarly for the other primary foods. Also, since the family is cost conscious, if they decide to use the pills instead of the primary foods, they will buy just as many pills as are required to satisfy the minimal nutrient requirements exactly. Hence the manufacturer's sales revenue will be v = 9w1 + 19w2, and the manufacturer wants to maximize this revenue. Thus the prices the manufacturer can charge for the pills are obtained by solving the following dual LP model:
Example 5.11 (Minimization Primal – Maximization Dual): Suppose that a family is trying to make a minimal-cost diet from six available primary foods (called 1, 2, 3, 4, 5, 6) such that the diet contains at least 9 units of vitamin A and 19 units of vitamin C. The following table shows the data on the foods.

                       Number of units of nutrient per kg of food    Minimum daily
Nutrient                 1     2     3     4     5     6             requirement
Vitamin A                1     0     2     2     1     2                  9
Vitamin C                0     1     3     1     3     2                 19
Cost of food (c/kg)     35    30    60    50    27    22

The primal model:

min z = 35x1 + 30x2 + 60x3 + 50x4 + 27x5 + 22x6
s.t. x1 + 2x3 + 2x4 + x5 + 2x6 ≥ 9
     x2 + 3x3 + x4 + 3x5 + 2x6 ≥ 19
     x1, …, x6 ≥ 0

Now suppose that a manufacturer proposes to make synthetic pills of each nutrient and to sell them to this family. The manufacturer has to persuade the family to meet all the nutrient requirements by using the pills instead of the primary foods. However, the family will not use the pills unless the manufacturer can convince them that the prices of the pills are competitive with each of the primary foods. This forces several constraints on the prices the manufacturer can charge for the pills. Let w1 and w2 be the prices of vitamin A and vitamin C respectively in pill form. Consider, say, primary food 5. One kg of this food contains one unit of vitamin A and 3 units of vitamin C and costs 27 cents. Thus the family will not buy the pills unless w1 + 3w2 ≤ 27; similarly for the other primary foods. Also, since the family is cost conscious, if they decide to use the pills instead of the primary foods, they will buy just as many pills as are required to satisfy the minimal nutrient requirements exactly. Hence the manufacturer's sales revenue will be v = 9w1 + 19w2, which the manufacturer wants to maximize. Thus the prices that the manufacturer can charge for the pills are obtained by solving the following dual LP model:

max v = 9w1 + 19w2
s.t. w1 ≤ 35
     w2 ≤ 30
     2w1 + 3w2 ≤ 60
     2w1 + w2 ≤ 50
     w1 + 3w2 ≤ 27
     2w1 + 2w2 ≤ 22
     w1, w2 ≥ 0

The price w1 is associated with the nonnegative primal slack variable x7 = x1 + 2x3 + 2x4 + x5 + 2x6 − 9, while the price w2 is associated with the nonnegative primal slack variable x8 = x2 + 3x3 + x4 + 3x5 + 2x6 − 19.

The following results are obtained by solving the primal and the dual models with the package LINDO.

Results:

PRIMAL: OBJECTIVE FUNCTION VALUE  1) 179.0000

VARIABLE    VALUE       REDUCED COST
X1          0.000000    32.000000
X2          0.000000    22.000000
X3          0.000000    30.000000
X4          0.000000    36.000000
X5          5.000000     0.000000
X6          2.000000     0.000000

            SLACK OR SURPLUS    SHADOW COSTS/PRICES
X7          0.000000            -3.000000
X8          0.000000            -8.000000

DUAL: OBJECTIVE FUNCTION VALUE  1) 179.0000

VARIABLE    VALUE       REDUCED COST
W1          3.000000    0.000000
W2          8.000000    0.000000

            SLACK OR SURPLUS    SHADOW COSTS/PRICES
W3          32.000000           0.000000
W4          22.000000           0.000000
W5          30.000000           0.000000
W6          36.000000           0.000000
W7           0.000000           5.000000
W8           0.000000           2.000000

Recall that in an LP model, the rate of change in the optimal objective function value per unit change in the right-hand-side value of a constraint (keeping the other values fixed) is known as the shadow cost. Thus, in this case, the shadow costs of the primal LP problem represent the amount of extra money the family has to spend on an optimal diet per unit increase in the requirement of each vitamin, i.e. 3 cents for vitamin A and 8 cents for vitamin C. The price charged by the manufacturer per unit of a vitamin is acceptable to the family only if it does not exceed the shadow cost of that vitamin in the primal problem. Therefore, to maximize his revenue, the manufacturer must price the vitamins at 3 and 8 cents per unit respectively. Hence, in an optimal solution of the dual problem, the prices w1 and w2 correspond to the shadow costs of vitamins A and C respectively. Similarly, in any LP, the dual variables are the shadow costs/prices of the resources associated with the constraints in the primal problem.
5.3 Dual Simplex Method

Sometimes a basic solution to an LP problem is not feasible though it is optimal, in the sense that the negative reduced costs zj − cj are ≥ 0 in the case of maximization and ≤ 0 in the case of minimization. This may happen, for example, when an optimal solution has been calculated for a particular LP problem and a new problem then has to be solved with different RHS values. Since the optimality conditions are satisfied (the primal problem is optimal), the dual is feasible though not optimal, so we want to pivot in such a way as to make the dual problem optimal. Instead of building a new simplex table for the dual problem, it is more practical to work from the primal "infeasible though optimal" tableau. This technique is known as the Dual Simplex Method. The idea is to retain optimality of the primal tableau while reaching for feasibility. Note that in the dual problem the opposite procedure is being carried out: maintaining feasibility while reaching optimality.

Consider the following problem:

Min cTx ST Ax ≥ b, x ≥ 0

Let B be a basic matrix of this LP problem. The basis need not be feasible, because after subtracting surplus variables (equivalently, adding negative slacks, with no artificial variables introduced) we get an infeasible initial solution in the following simplex table:

BV     x1      x2      …   xn      xn+1        …   xn+m        RHS
xB1    y11     y12     …   y1n     y1,n+1      …   y1,n+m      b1*
xB2    y21     y22     …   y2n     y2,n+1      …   y2,n+m      b2*
 ⋮      ⋮       ⋮           ⋮       ⋮               ⋮            ⋮
xBm    ym1     ym2     …   ymn     ym,n+1      …   ym,n+m      bm*
z      z1−c1   z2−c2   …   zn−cn   zn+1−cn+1   …   zn+m−cn+m   cBTb*

(in the initial table the slack columns form −I, since negative slacks were added). If bi* ≥ 0 for all i, then the table (obtained by multiplying the constraint rows by −1) represents a primal feasible solution. Also, if zj − cj ≤ 0 for all j, then optimality for the primal problem has been reached.

Example 5.12

min 3x1 + 2x2
subject to 3x1 + x2 ≥ 3
           4x1 + 3x2 ≥ 4
           x1, x2 ≥ 0

Expressing the problem in standardized form by subtracting negative slack variables, i.e. multiplying the constraints by −1 to get negative RHS values:

min 3x1 + 2x2 + 0x3 + 0x4
subject to −3x1 − x2 + x3 = −3
           −4x1 − 3x2 + x4 = −4
           x1, x2, x3, x4 ≥ 0

Primal tableau:

BV   x1   x2   x3   x4   RHS
x3   -3   -1    1    0   -3
x4   -4   -3    0    1   -4
z    -3   -2    0    0    0

The initial tableau satisfies the optimality conditions (zj − cj ≤ 0) but is infeasible since b* < 0.

Now we show that optimality in the primal problem is equivalent to feasibility in the dual problem. Define wT = cBTB−1. For all j = 1 … n we have, by definition,

zj − cj = cBTB−1aj − cj = wTaj − cj = ajTw − cj .

At primal optimality zj − cj ≤ 0, so ajTw ≤ cj, or in matrix form: ATw ≤ c.

Further, an+i = −ei for i = 1 … m (because in this problem one negative slack is added for every constraint and no artificial variables are introduced) and cn+i = 0 (no objective coefficients are assigned to slack variables). Thus

zn+i − cn+i = wTan+i − cn+i = wT(−ei) − 0 = −wi ,  i.e.  zn+i = −wi .

So if zn+i − cn+i ≤ 0 for i = 1 … m, then wi ≥ 0 for all i, or in matrix form: w ≥ 0.

Together, ATw ≤ c and w ≥ 0 define the dual feasible region, and these constraints were derived from the primal optimality conditions zj − cj ≤ 0. Thus primal optimality implies dual feasibility. Also, at (primal) optimality w*T = cBTB−1, where B is the optimal basis. The dual objective value is then

w*Tb = bTw* = (cBTB−1)b = cBT(B−1b) = cBTb* = z* ,

the primal optimal objective value. Thus, at feasibility, the primal and dual optimal objectives are equal (this has already been proved as the strong duality property). These arguments lead to the following lemma:

Lemma 5.13: At optimality of the primal minimizing problem in canonical form (i.e. zj − cj ≤ 0 for all j), w*T = cBTB−1 is an optimal solution to the dual problem. Furthermore, wi* = −zn+i for i = 1 … m.

Looking again at the initial LP problem, we can add negative slack variables to put the LP problem in the form:

Min cTx ST Ax = b, x ≥ 0

Without the use of artificial variables it is generally difficult to find an initial basic feasible solution, so the starting basis B need not be feasible. However, B will be dual-feasible, since zj − cj ≤ 0 for all j, which means that it is primal-optimal.

Algorithm 5.14 (Dual Simplex Algorithm) Referring to the simplex table above, the algorithm follows these steps. First check whether optimality in the primal has been reached: check whether zj − cj ≤ 0 for all j. Then check whether bi* ≥ 0 for all i; if yes, feasibility is attained and no further work is required. If not, choose some r such that br* < 0 (for example the most negative one). This defines the pivot row.
Once the pivot row yrT has been chosen (so the leaving basic variable xBr is known), we choose the pivot column k of the entering variable xk. Bear in mind the objective: a nonnegative value on the right-hand side. Because the pivot entry yrk will eventually become 1, the new RHS value will be br*/yrk; for this to be positive, yrk must be negative (since br* < 0). So we consider only negative entries in the row yrT. Furthermore, primal optimality has to be kept: after pivoting, the entries in the z row must remain non-positive. We shall show that this is achieved by the ratio test, choosing a column k such that

(zk − ck)/yrk = min { (zj − cj)/yrj : yrj < 0 } .

Note that since the numerators are non-positive and the denominators are negative, each fraction is nonnegative. To bring the negative reduced cost of the entering variable to zero we compute

(zj − cj)' = (zj − cj) − (yrj/yrk)(zk − ck) .

The ratio (zk − ck)/yrk is nonnegative. First suppose yrj ≥ 0. Then (zj − cj)' ≤ (zj − cj) ≤ 0, i.e. optimality is maintained. Now assume yrj < 0. Then, by the choice of yrk,

(zk − ck)/yrk ≤ (zj − cj)/yrj .

Multiplying both sides by the negative number yrj reverses the inequality:

((zk − ck)/yrk) yrj ≥ zj − cj  ⇒  (zj − cj) − ((zk − ck)/yrk) yrj ≤ 0 ,

i.e. (zj − cj)' ≤ 0. Thus optimality in the primal is still retained.

The new dual objective value after pivoting will be cBTB−1b − ((zk − ck)/yrk) br*. But zk − ck ≤ 0, yrk < 0 and br* < 0, so the correction term −((zk − ck)/yrk) br* ≥ 0. Thus the dual objective improves over the current value bTw = cBTB−1b. (Note that in the dual problem we maximize, not minimize as in the primal.) Each iteration therefore contributes to approaching the dual optimal solution, which at the end has the same value as the optimal objective value of the optimal feasible primal.

The method moves from one dual feasible solution to the next until optimality is reached. In the primal problem these steps correspond to moving from one optimal (not necessarily feasible) basic solution to the next until, finally, optimal feasibility is reached.

This is the algorithm of the dual simplex method, assuming an initial optimal, not necessarily feasible, simplex table:

Repeat
  If not (b* ≥ 0)                                   (feasible optimum reached?)
    Select row r such that br* = min{bi*}
    If (yrj ≥ 0 for all j) Stop                      (dual unbounded)
    Else
      (zk − ck)/yrk = min{ (zj − cj)/yrj : yrj < 0 }
      Pivot at yrk
    EndIf
  EndIf
Until (b* ≥ 0)
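The loop below is a minimal MATLAB sketch of Algorithm 5.14 (our own illustration, not part of the notes). It runs on the tableau of Example 5.12, stored as T = [A b; z-row 0], and reproduces the pivots carried out by hand below:

% Tableau of Example 5.12: constraint rows, then the z row; last column = RHS.
T = [-3 -1 1 0 -3;
     -4 -3 0 1 -4;
     -3 -2 0 0  0];
m = 2;                                    % number of constraints
while any(T(1:m,end) < -1e-9)
    [~, r] = min(T(1:m,end));             % most negative RHS: pivot row
    row = T(r,1:end-1);
    if all(row >= 0), error('dual unbounded'); end
    cols = find(row < 0);                 % only negative entries qualify
    [~, q] = min(T(end,cols) ./ row(cols));   % ratio test on the z row
    k = cols(q);                          % pivot column
    T(r,:) = T(r,:) / T(r,k);             % pivot at (r,k)
    for i = [1:r-1, r+1:m+1]
        T(i,:) = T(i,:) - T(i,k) * T(r,:);
    end
end
T                                         % final feasible optimal tableau

The final tableau should match the optimal table of the worked example below (x1 = 1, x2 = 0, z = 3).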
Example 5.12 Continued

Initial simplex (primal) table:

BV   x1   x2   x3   x4   RHS
x3   -3   -1    1    0   -3
x4   -4   -3    0    1   -4
z    -3   -2    0    0    0

Select the row r that gives the smallest RHS value: br* = min{bi*} = min{−3, −4} (the value 0 in the objective row is ignored). The minimum is −4, so the pivot row is r = 2; xB2 = x4 is the leaving variable and y2T = (−4, −3, 0, 1).

Note: not all entries y2j are ≥ 0, so a pivot is possible; if they were all nonnegative, the dual would be unbounded (and the primal infeasible).

Ratio test (to keep optimality) to choose the pivot column k of the entering variable:

(zk − ck)/y2k = min{ (zj − cj)/y2j : y2j < 0 } = min{ −3/−4 , −2/−3 } = min{ 0.75, 0.667 } .

The 5th ratio is ignored (it is the RHS), and the 3rd and 4th are excluded since y23 and y24 are nonnegative. The minimum occurs in the second column, so the pivot column is k = 2 and x2 is the entering variable.

Pivoting at y22 we obtain the following tableau:

BV    x1     x2   x3    x4     RHS
x3   -5/3     0    1   -1/3   -5/3
x2    4/3     1    0   -1/3    4/3
z    -1/3     0    0   -2/3    8/3

Repeating the same procedure, the optimal and feasible simplex table is obtained:

BV   x1   x2   x3     x4     RHS
x1    1    0   -0.6    0.2    1
x2    0    1    0.8   -0.6    0
z     0    0   -0.2   -0.6    3

The dual optimal table (computed using the normal simplex algorithm) is:

BV   w1   w2   w3     w4     RHS
w1    1    0    0.6   -0.8   0.2
w2    0    1   -0.2    0.6   0.6
z     0    0    1      0     3

Notes:
• The optimal dual variables equal the negatives of the reduced costs of the slack columns in the primal optimal and feasible table (Lemma 5.13): w1 = 0.2, w2 = 0.6.
• Since one of the reduced costs in the dual optimal table is 0, there are alternative optimal solutions (multiple optima) giving the same optimal objective function value of 3.

Graphical representation: the original figure (omitted here) shows the two models side by side. For the primal model it traces the dual simplex path from the initial infeasible but optimal point, through an intermediate point that is still infeasible, to the final solution, which is optimal and feasible with objective value 3. For the dual model it shows the normal simplex path from an initial feasible but non-optimal point to the feasible and optimal solution with the same objective value 3.

Example 5.13 This session shows the Dual Simplex Algorithm using the table approach in MATLAB. (In the original, comments added later are set in italics; here they appear in parentheses. Some empty lines have been removed.) Assuming that the folder Z:\Matlab contains the file pivot.m:

» type pivot

function a=pivot(A,r,c)
% pivot matrix A at row r and column c
% for zero pivot item no operation
% no other tests
x=A(r,c);
if x ~= 0
    rmax=length(A(:,1));
    A(r,:)=A(r,:)/x;
    for i=1:rmax
        if i~=r
            A(i,:)=A(i,:)-A(r,:)*A(i,c);
        end
    end
end
a=A;

Entering A, b, c of the model:

min 2x1 + 3x2 + 4x3
ST  x1 + 2x2 + x3 ≥ 3
    2x1 − x2 + 3x3 ≥ 4
    xi ≥ 0

» A=[1 2 1;2 -1 3]
A =
     1     2     1
     2    -1     3
» A=[-A eye(2)]              (constraints were multiplied by -1)
A =
    -1    -2    -1     1     0
    -2     1    -3     0     1
» b=-[3 4]'
b =
    -3
    -4
» c=[2 3 4 0 0]'
c =
     2
     3
     4
     0
     0
» s=[A b;-c' 0]              (initial simplex table: optimal, not feasible)
s =
    -1    -2    -1     1     0    -3
    -2     1    -3     0     1    -4
    -2    -3    -4     0     0     0
» row=2                      (second row leaves: most negative RHS)
row =
     2
» rc=s(3,:)                  (reduced costs)
rc =
    -2    -3    -4     0     0     0
» y=s(row,:)                 (pivot row)
y =
    -2     1    -3     0     1    -4
» format short g             (better format)
» ra=rc./y                   (ratios)
Warning: Divide by zero.     (ignore)
ra =
     1    -3    1.3333   NaN     0     0
» [y; ra]                    (pivot row together with ratios)
ans =
    -2     1    -3       0     1    -4
     1    -3    1.3333   NaN   0     0
» col=1                      (minimum ratio and negative coefficient)
col =
     1
» s=pivot(s,row,col)         (pivoting)
s =
     0   -2.5   0.5    1   -0.5   -1
     1   -0.5   1.5    0   -0.5    2
     0     -4    -1    0     -1    4
» row=1                      (still not feasible, 1st row leaves)
row =
     1
» y=s(row,:)
y =
     0   -2.5   0.5    1   -0.5   -1
» rc=s(3,:)                  (second iteration)
rc =
     0     -4    -1    0     -1    4
» ra=rc./y
Warning: Divide by zero.
ra =
   NaN    1.6    -2    0      2   -4
» [y;ra]
ans =
     0   -2.5   0.5    1   -0.5   -1
   NaN    1.6    -2    0      2   -4
» col=2                      (minimum ratio, negative coefficient)
col =
     2
» s=pivot(s,row,col)         (optimal & feasible)
s =
     0     1   -0.2   -0.4    0.2   0.4
     1     0    1.4   -0.2   -0.4   2.2
     0     0   -1.8   -1.6   -0.2   5.6
» z=s(3,6)                   (objective value)
z =
   5.6
» x=[2.2 0.4 0 0 0]'         (solution vector; see columns of the optimal table)
x =
   2.2
   0.4
     0
     0
     0
» wT=-s(3,4:5)               (shadow costs)
wT =
   1.6   0.2
Chapter 6

Networks

6.1 Introduction

There is a group of linear programming problems, defined on networks (directed graphs), that have many special properties. These properties enable some fast special algorithms and also an efficient version of the simplex method called the network simplex method. This chapter introduces the basic ideas and some practically important special versions of network problems, and presents selected algorithms. Special attention will be given to transportation and assignment network models. Knowledge of graph theory is not a precondition; all terms used are defined here.

Definition 6.1 A network is a simple directed graph (digraph) N = (V, A) made of a finite non-empty set V = {v1, v2, …, vm} of vertices (nodes) and a set A ⊆ V × V of directed arcs, where each arc is an ordered pair of vertices (i, j), i, j = 1 … m. Note that between two vertices there can be at most one arc in each direction (simple graph).

[Figure: an example network on six vertices]
N = ({1, 2, 3, 4, 5, 6}, {(1, 2), (2, 5), (3, 1), (3, 4), (4, 6), (5, 3), (5, 5), (6, 5)})

In this text we shall assume that loops do not exist: (i, i) ∉ A, i = 1 … m. Also let n be the number of arcs. Note that in graph theory n usually denotes the number of vertices; here we have one variable for each arc, so to keep compatibility with linear programming notation (n variables) the meaning of the symbols is reversed.

6.2 Minimum Cost Network Flow Problem

The most general network optimization problem, which covers most network problems, is the minimum cost network flow problem. It is based on these assumptions. Let the flow (movement of any commodity through an arc – for example, current in an electric network) in the arc (i, j) connecting vertices i and j (in this direction) be xij. Then, for each arc, there is in general:

- a lower bound lij ≤ xij (mostly 0 – nonnegativity),
- an upper bound uij ≥ xij (interpreted as the arc's capacity),
- a cost cij paid per unit of flow through the arc (i, j); the total cost paid for the flow through the arc (i, j) is then cijxij.

Flow is in some way inserted into the network and somehow removed (for example, a power station generates electricity for industries, households, etc.). This is formalized by introducing, for each vertex i:

- an external input flow bi+,
- an external output flow bi−.

Let Pi be the set of predecessors of vertex i (the set of vertices at which arcs ending in i start) and similarly let Si be the set of successors of i.

[Figure: vertex i with external input bi+ and external output bi−, predecessor set Pi and successor set Si]

Example 6.3 – Electricity Network (flow = electric current)

[Figure: a power station with external input b+PS = 42200 feeding transformers T1 (b+ = 200), T2 (b+ = 2500) and T3 (b− = 150), which in turn supply loads and consumers with the external outputs shown, e.g. b−L1 = 3900, b−A1 = 1500, b−AD2 = 5200, b−E2 = 15000, b−L2 = 10000, b−S2 = 2300, b−A3 = 1625, b−L3 = 1800, b−A4 = 1425, b−L4 = 2000.]

The general condition that is supposed to be satisfied in all network problems is flow conservation, stating that flow must neither originate nor vanish in a vertex. In other words, for each vertex the total flow out must be equal to the total flow in:

∑_{j∈Si} xij + bi− = ∑_{k∈Pi} xki + bi+ ,  i = 1 … m.

Simple rearrangement gives:

∑_{j∈Si} xij − ∑_{k∈Pi} xki = bi+ − bi− = bi ,  i = 1 … m.

According to the value of bi there are three types of vertices:

- a source, with bi > 0, which adds flow to the network,
- a sink, with bi < 0, which removes flow from the network,
- a transshipment vertex, with bi = 0.
Example 6.3 – Continued In the previous figure, the source vertices are the power station and transformers 1 and 2, while the sink vertices are the third transformer (since it removes current), L1, A1, AD2, E2, L2, S2, A3, L3, A4 and L4. The transshipment nodes are the households and the industry.

From the LP point of view, the equations

∑_{j∈Si} xij − ∑_{k∈Pi} xki = bi+ − bi− = bi ,  i = 1 … m

represent restrictions (constraints). If we know the cost per unit flow associated with each arc, then the objective is naturally a minimum cost flow. All this can now be expressed in matrix form in the usual way:

Min z = cTx ST Ax = b, L ≤ x ≤ U

where c, L and U are n-vectors of unit costs, lower bounds and upper bounds respectively. The matrix A has m rows (each row represents one vertex) and n columns (one column for each arc). Compared with other LP problems there are a few differences. First, the entries of the vectors x, c, L, U and the columns of A carry double subscripts. This is just a formal difference in notation, to avoid separate indexing of vertices and arcs. The lower and upper bounds generally represent an additional 2n constraints. Lower bounds are mostly zero – the usual non-negativity requirement. If not, a simple change of variables can replace l ≤ x by 0 ≤ x − l = x*. Upper bounds can also be eliminated if necessary, by replacing a bounded variable by two variables: x ≤ u can be replaced by x1 − x2 ≤ u, where x1 and x2 are nonnegative and unbounded, with x replaced by x1 − x2 in the model. So, without loss of generality (and with some modifications of the model), the bounds can be ignored.

Example 6.4 Suppose we want to find the flow in the following network which generates the total minimum cost. Each arc is labelled with the triple cij, lij, uij (cost, lower bound, capacity); the source s has bs+ = 80 and the sink t has bt− = 80.

[Figure: vertices s, 2, 3, 4, 5, 6, t with arc labels 17,0,75 on (s,3); 19,0,70 on (s,2); 15,0,60 on (2,3); 10,0,52 on (2,4); 11,0,63 on (3,4); 9,0,45 on (3,5); 6,0,50 on (4,5); 8,0,50 on (4,6); 8,0,55 on (5,6); 12,0,60 on (5,t); 30,0,62 on (6,t).]

The LP problem for this network is given below:

min 17xs3 + 19xs2 + 15x23 + 10x24 + 11x34 + 9x35 + 6x45 + 8x46 + 8x56 + 12x5t + 30x6t
st
  xs3 + xs2 = 80                      (vertex s)
  −xs2 + x23 + x24 = 0                (vertex 2)
  −xs3 − x23 + x34 + x35 = 0          (vertex 3)
  −x24 − x34 + x45 + x46 = 0          (vertex 4)
  −x35 − x45 + x56 + x5t = 0          (vertex 5)
  −x46 − x56 + x6t = 0                (vertex 6)
  −x5t − x6t = −80                    (vertex t)

  0 ≤ xs3 ≤ 75, 0 ≤ xs2 ≤ 70, 0 ≤ x23 ≤ 60, 0 ≤ x24 ≤ 52,
  0 ≤ x34 ≤ 63, 0 ≤ x35 ≤ 45, 0 ≤ x45 ≤ 50, 0 ≤ x46 ≤ 50,
  0 ≤ x56 ≤ 55, 0 ≤ x5t ≤ 60, 0 ≤ x6t ≤ 62

What makes network problems special are the properties of A and b (in the balanced case, i.e. supply = demand). In each row of A (vertex) there is:

- +1 for each arc starting at the vertex,
- −1 for each arc ending at the vertex,
- zeros otherwise.

Similarly, in each column (arc) there is:

- +1 for the starting vertex,
- −1 for the ending vertex,
- zeros otherwise.

In particular, the columns are very special: each contains exactly one +1, one −1 and m − 2 zeros. The sum of the rows is thus 0, so the rows are linearly dependent. This means that the maximum rank of A is m − 1 (in fact it is exactly m − 1). Here we assume n ≥ m, which is satisfied for all connected networks that are not trees – see later. A sketch of this formulation in MATLAB follows.
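The sketch below (ours, not part of the notes) builds the vertex-arc incidence matrix of Example 6.4 and solves the LP with linprog. The rows of A sum to zero, so one equation is redundant; linprog copes with the dependent row:

% Vertices numbered s=1, 2,...,6, t=7; one column of A per arc.
arcs = [1 3; 1 2; 2 3; 2 4; 3 4; 3 5; 4 5; 4 6; 5 6; 5 7; 6 7];
cost = [17 19 15 10 11 9 6 8 8 12 30]';
ub   = [75 70 60 52 63 45 50 50 55 60 62]';
m = 7;  n = size(arcs,1);
A = zeros(m, n);                     % vertex-arc incidence matrix
for k = 1:n
    A(arcs(k,1), k) =  1;            % +1 where the arc starts
    A(arcs(k,2), k) = -1;            % -1 where the arc ends
end
b = [80 0 0 0 0 0 -80]';             % source s, sink t
x = linprog(cost, [], [], A, b, zeros(n,1), ub)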
6.2.1 Balancing a Network

For a balanced network, where the total inserted flow is equal to the total flow removed, we have:

∑_{i=1}^{m} bi = 0.

Unbalanced networks can be balanced by adding artificial vertices and arcs that make up the difference. Let S be the total supply to the network and D the total demand:

S = ∑_{i: bi>0} bi ,  D = −∑_{i: bi<0} bi .

A balanced network has S = D. This is the balancing algorithm:

- If S > D (excess supply), add an artificial vertex with demand S − D, and add artificial arcs connecting all sources to this artificial vertex. These new arcs carry costs corresponding to the cost (if any) of excess production.
- If D > S (excess demand), add an artificial vertex with supply D − S, and add artificial arcs connecting this artificial vertex to all sinks. These new arcs carry costs corresponding to the cost (if any) of unmet demand.

Without loss of generality, we shall therefore assume that a network is balanced. Flow through the artificial arcs obviously represents excess supply or unmet demand respectively. During optimization there is no difference between real and artificial arcs and vertices.

6.2.2 Special cases of network flow problems

The general minimum cost network flow problem defined above is also called the transshipment problem, because all three types of vertices (sources, sinks, transshipment vertices) can be present.

Another problem is the transportation problem. This has only sources and sinks, and every arc goes from a source to a sink. The conservation constraints take one of two forms:

∑_j xij = bi for a source with bi > 0, and −∑_k xki = bi for a sink with bi < 0.

Transportation problems model the direct movement of goods from suppliers to customers, with the coefficients cij interpreted as the unit cost of transportation from a particular supplier to a particular customer. The objective value is the total cost of shipment.

The assignment problem is a special case of the transportation problem where bi = 1 for every source and bi = −1 for every sink. For a balanced problem there are the same number m/2 of sources and sinks. Assignment typically models assigning people to jobs, with cij interpreted as the value of person i if assigned to job j, or assigning jobs to machines, with cij the unit cost of assigning job i to machine j. The objective value is then interpreted as the total profit (maximization) or the total cost (minimization) of all assignments. Later we shall see that integrality of the flows is guaranteed, so the only possible values are 1 and 0. There is a special fast algorithm for assignment.

The shortest path problem determines the shortest (fastest) path between an origin and a destination. There are efficient shortest path algorithms in graph theory, but linear programming can also solve the problem. The shortest path problem can be represented as a minimum cost network flow problem with one source (the origin) with supply 1 and one sink (the destination) with demand 1; there are typically many transshipment vertices. The cij coefficients are interpreted as the lengths of arcs (which can be generalized as the time required to traverse an arc, or the cost of using an arc). Unlike in graph theory algorithms, the coefficients need not be nonnegative.

The maximum flow problem determines the maximum amount of flow that can be moved through a network from the source to the sink. Because the external flow is not known a priori, a slight modification of the general problem is necessary. Probably the simplest one adds an artificial arc with infinite capacity from the sink back to the source, returning the flow to the source. Then all vertices are transshipment vertices and the model maximizes the flow through the artificial arc. Let s be the source and t the sink. The model is then:

Max xts ST Ax = 0, 0 ≤ x ≤ U

where U contains the capacities of the arcs. Note that costs are not used in this model.
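A tiny self-contained sketch of this model (ours; the network and capacities are made up for illustration), with vertices s = 1, 2, 3, t = 4 and the artificial return arc as the last column:

arcs = [1 2; 1 3; 2 3; 2 4; 3 4; 4 1];   % last arc is the artificial t -> s
u    = [4 3 1 2 4 Inf]';
m = 4;  n = size(arcs,1);
A = zeros(m, n);
for k = 1:n
    A(arcs(k,1), k) =  1;                % +1 at the tail
    A(arcs(k,2), k) = -1;                % -1 at the head
end
f = zeros(n,1);  f(end) = -1;            % maximize x_ts (minimize -x_ts)
x = linprog(f, [], [], A, zeros(m,1), zeros(n,1), u);
maxflow = x(end)                         % 6 for these capacities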
The maximum flow problem has an interesting dual problem that deals with cuts. A cut is defined as a division of the vertices into two disjoint sets V1 and V2, with V1 containing the source s and V2 containing the sink t: V1 ∪ V2 = V, V1 ∩ V2 = ∅, s ∈ V1, t ∈ V2. The capacity of the cut is the sum of the capacities of the arcs that lead from V1 to V2.

Let's first rewrite the primal maximum flow problem:

Max z = xts
ST ∑_{j∈Si} xij − ∑_{k∈Pi} xki = 0 ,  i = 1 … m
   0 ≤ xij ≤ uij   (n inequalities representing arcs)

The dual problem has one variable for each primal constraint, so there are m + n dual variables. Call them yi, i = 1 … m, for the first m flow-conservation equations, and vij for the second group of n capacity-limitation inequalities. The dual objective coefficients are the RHS values of the primal constraints, so the dual objective is

Min w = ∑ uij vij .

Now let's create the dual constraints. There is one for each primal variable, that is, one for each arc (including the artificial arc from t to s). Note that the only non-zero primal objective coefficient, 1, corresponds to this arc; its capacity is infinite, so it is not included in the second group of n capacity-limitation inequalities. So we have these dual constraints:

yt − ys = 1           for the artificial arc
yi − yj + vij ≥ 0     for all the other arcs (i, j)
vij ≥ 0

The interpretation of the dual is the following: let yi = 0 if vertex i is in the set V1, and yi = 1 if vertex i is in the set V2. The dual variables y thus define a cut, and the first dual constraint guarantees that s ∈ V1 and t ∈ V2. Let vij = 1 if the arc (i, j) leads from V1 to V2 (the dual constraints guarantee this) and vij = 0 otherwise. Then the dual objective is the capacity of the cut, and the dual optimum is the minimum-capacity cut. Using strong duality, we can formulate the famous max-flow min-cut theorem: "the maximum flow in a network is equal to the minimum of the capacities of all cuts in the network".

The maximum flow problem can be extended to the minimum cost maximal flow problem. To avoid possibly conflicting criteria, one way to solve this problem is the following: first find the maximum flow, ignoring costs; then minimize the total cost subject to the additional constraint that the flow in the artificial arc is kept at its maximum value. A modification of this problem is the minimum cost flow with given value (or with given minimum acceptable value). Both can be solved by the general minimum cost flow algorithm with one more constraint (= or ≥) on the flow through the artificial arc.

There are other, practically less significant, special cases of the general minimum cost network flow problem. Note that all network problems can be solved by the standard simplex method, so any LP solver can in principle be used. However, two points deserve mention. First, network problems are mostly degenerate (many zero basic variables), which can cause difficulties. On the other hand, the special properties of the matrix A of network problems make it possible to use special algorithms faster than the standard simplex algorithm. Some will be presented, even though with today's fast computers their use is justified only for very big models.

6.3 Summary of relevant Graph Theory terms

Definition 6.5 A subnetwork N1 = (V1, A1) of a network N = (V, A) has these properties: V1 ⊆ V and A1 ⊆ A ∩ (V1 × V1). So a subnetwork (subgraph) is created by removing some vertices, all arcs incident with these vertices, and possibly some more arcs.
[Figure: an original network and a subnetwork obtained from it]

Definition 6.6 A path from vertex i1 to vertex ik is a subnetwork consisting of a sequence of vertices i1, i2, …, ik, together with a set of distinct arcs connecting each vertex in the sequence to the next. The arcs need not all point in the same direction.

[Figure: an original network and a path i1, i2, i3, i4 within it]

Definition 6.7 A network is said to be connected if there is a path between every pair of vertices in the network. From now on we shall assume that the network is connected (if not, the problem can be decomposed into two or more smaller problems).

[Figure: a connected network and a disconnected network]

Definition 6.8 A cycle is a path from a vertex to itself.

[Figure: a cycle through vertices i1, i2, i3, i4]

Definition 6.9 A tree is a connected subnetwork containing no cycles.

Definition 6.10 A spanning tree is a tree that includes every vertex of the network.

[Figure: an original network and one of its spanning trees]

6.4 Summary of relevant properties of trees

The properties of (spanning) trees that are relevant for the network simplex method are given as lemmas. Recall our standing assumption: a connected network without loops.

Lemma 6.11 Every tree with at least two vertices has at least one end (a vertex incident to exactly one arc).

Proof: Select any vertex and follow any path away from it (one must exist because the tree is connected). There are no cycles and the number of vertices is finite, so an end is eventually reached.

Lemma 6.12 A spanning tree of a network with m vertices contains exactly m − 1 arcs.

Proof: The lemma can be proved by induction:
1. It is true for m = 1 (no arc) and m = 2 (one arc).
2. Assume that it holds for some m ≥ 2.
3. Adding one more vertex to the tree means connecting the new vertex to a vertex of the current tree by one more arc. So we obtain a tree with m + 1 vertices and m arcs. This completes the proof.

Lemma 6.13 If a spanning tree is augmented by adding to it an additional arc of the network, then exactly one cycle is formed.

Proof: Suppose an arc (i, j) is added to the spanning tree. Since there was already a path between vertices i and j, this path together with the arc (i, j) forms a cycle. Suppose two (or more) distinct cycles were formed. They must all contain the arc (i, j), because the spanning tree had no cycles. Then the union of the cycles minus the arc (i, j) still contains a cycle – a contradiction, because before adding the arc (i, j) there were no cycles. So exactly one cycle is formed.

Lemma 6.14 Every connected network contains a spanning tree.

Proof: If the network contains no cycles, then it is itself a spanning tree, since it is connected and contains all the vertices. Otherwise there exists a cycle. Deleting any arc of this cycle leaves a subnetwork that is still connected. This deletion can be repeated until no cycles remain. The final subnetwork contains no cycles, is connected, and contains all the vertices, so it is a spanning tree.

Lemma 6.15 Let B be the submatrix of the constraint matrix A corresponding to a spanning tree with m vertices. Then B can be rearranged to form a full-rank lower-triangular matrix of dimension m × (m − 1) with diagonal entries ±1.

Proof: By Lemma 6.12 a spanning tree consists of m vertices and m − 1 arcs, so B is of dimension m × (m − 1). The rest can be proved by induction:

1. If m = 1 then B is empty.
If m = 2 the spanning tree consists of one arc, so there are two possible forms of B, both of the required form (column vectors with entries ±1):

B = ( 1 )    or    B = ( −1 )
    ( −1 )             (  1 )

2. Assume the lemma holds for some m ≥ 2.
3. Add one more vertex to the tree; it is connected to a vertex of the current tree by one more arc. Place the added vertex in row 1 and the new arc in column 1 of the new matrix, which then has the form

( ±1   0 )
(  v   B )

where B is the original matrix. The rest of row 1 consists of zeros, because the newly added vertex is an end (only the new arc starts or ends at this vertex: ±1 in the first position). The vector v contains all zeros except a ±1 in the row where the new arc is connected to the tree. If B has the required form, then so does the new matrix. A lower-triangular matrix with nonzero diagonal entries has full rank. This completes the proof.

6.5 Basis of network problems

To show the relationship between a spanning tree and a basis of a network problem we need two more definitions:

Definition 6.16 Given a spanning tree for a network, a spanning tree solution x is a set of flow values that satisfy the flow conservation constraints Ax = b for the network, with xij = 0 for every arc (i, j) that is not part of the spanning tree.

Definition 6.17 A feasible spanning tree solution x is a spanning tree solution that satisfies the nonnegativity constraints x ≥ 0.

Theorem 6.18: A flow x is a basic feasible solution of the network flow constraints { x : Ax = b, x ≥ 0 } if and only if it is a feasible spanning tree solution.

Proof: First assume that x is a feasible spanning tree solution. By Lemma 6.12 it has at most m − 1 nonzero components. Let B be the submatrix of A corresponding to the spanning tree. By Lemma 6.15, B has full rank with m − 1 linearly independent columns, and hence x is a basic feasible solution.

For the second half of the proof, assume that x is a basic feasible solution, so it has at most m − 1 nonzero components. Consider the set of arcs corresponding to the strictly positive components of x. We shall prove that these arcs contain no cycle, so they either form a spanning tree or can be augmented with zero-flow arcs to form one. Suppose, to the contrary, that these arcs contain a cycle. In this cycle we can add a small flow ε in one direction (the flow in all arcs with the same orientation increases by ε, the flow in all arcs with the opposite orientation decreases by ε), choosing ε small enough to keep all flows positive. Call this new flow x+ε. Similarly, adding the same small flow ε in the opposite direction gives a new flow x−ε. For these two flows we have

x = ½ x+ε + ½ x−ε ,

which contradicts the assumption that x is a basic feasible solution (an extreme point), since an extreme point cannot be expressed as a convex combination of two distinct points. This completes the proof.
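A quick numerical illustration of Lemma 6.15 and Theorem 6.18 (ours): take the spanning tree (1,2), (2,5), (5,3), (3,4), (4,6) of the network of Definition 6.1 and check that its incidence submatrix has full column rank m − 1:

B = zeros(6, 5);
arcs = [1 2; 2 5; 5 3; 3 4; 4 6];    % a spanning tree of the example network
for k = 1:5
    B(arcs(k,1), k) =  1;
    B(arcs(k,2), k) = -1;
end
rank(B)                              % returns 5 = m-1, as Lemma 6.15 asserts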
The constraint matrix A does not have full rank, because its rows are not linearly independent. That is why the basis B has only m − 1 columns, corresponding to the m − 1 arcs of a spanning tree. Lemma 6.15 gives the properties of a basis, and these have a very important consequence. Suppose that, having a basis B, we want to compute the values of the basic variables. B has m rows and m − 1 columns. We can remove the last, dependent row to get a set of equations:

B'xB = b'

where B' is obtained from B by deleting the last row, b' is similarly obtained by deleting the last component of b, and xB is the (m − 1)-vector of basic flows. B' is a square, full-rank, lower-triangular matrix, so the solution can be found by forward substitution. Moreover, we know that the entries of B' are zeros or ±1, so we know the first component x1 of xB directly:

B'11 x1 = b1  →  x1 = ±b1 .

Similarly for the second component, and so on. In general we get:

xi = ±( bi − ∑_{j=1}^{i−1} (±xj) ) ,  i = 2 … m − 1.

In other words, xi is computed by adding or subtracting the first i components of b. This guarantees that for integer values of the external flows b, the values of the basic flows are also integer. This is a natural requirement of many network problems (such as assignment) that is thus automatically satisfied. This is very important – an integer solution is obtained by the normal simplex method; there is no need to use a time-consuming integer programming algorithm.

6.6 Network Simplex Method

For convenience, let's recall the basic facts about the simplex method. This is the content of the simplex table of a feasible LP problem with bounded objective value:

BV    xN              xB    RHS
xB    B−1N            I     B−1b
z     cBB−1N − cN     0     cBB−1b

Note that the columns of basic and nonbasic variables are in fact scattered in the table, because the original column labels are usually not changed during simplex iterations. In the table, B is the current basic matrix and N the corresponding nonbasic matrix (both made of columns of the original m × n matrix A). The n-vectors x and c are divided accordingly; b is the m-vector of RHS values. Note also that the inverse basic matrix B−1 is available in the columns that originally contained the identity matrix – typically the last m columns.

The simplex algorithm is based on two tests. The optimality test checks whether the optimum has been reached. It is based on the negative reduced costs in the z row:

cBB−1N − cN = yTN − cN ,

where yT = cBB−1 are the simplex multipliers. The individual negative reduced costs are given by:

cBB−1Aj − cj = yTAj − cj = zj − cj ,

where Aj is the j-th column of A. If the table is not optimal, the most negative value (maximization) or the greatest positive value (minimization) defines the entering nonbasic variable. This means that one column of the basis (the leaving variable) is replaced by the column of a selected, so far nonbasic, entering variable. The leaving variable is chosen by the feasibility test (minimum ratio after dividing the RHS values by the positive entries of the pivot column). The actual update is done by pivoting. This is repeated until optimality is reached. After reaching the optimum, the simplex multipliers form the dual optimal solution w, equal to the shadow prices of the primal RHS values; w is available in the z-row entries of the columns corresponding to the slack variables of the primal model. Of course, the primal and dual optimal objective values are equal: cTx = wTb.

The network simplex method is based on specially simplified forms of these simplex operations. Note that we have to change the notation slightly (double indexing of the variables).

Optimality test

Let's express the positive reduced cost directly:

cij − zij = cij − yTAij ,

where Aij is the corresponding column of A. But we know that this column is made of zeros except for +1 in the i-th row and −1 in the j-th row.
So the formula for the positive reduced cost (call it rij) simplifies to:

rij = cij − yi + yj .

To evaluate it we need the simplex multipliers. From the equations above, yT = cBTB−1; multiplying by B from the right gives yTB = cBT. Again, the column of B corresponding to the basic arc (i, j) is made of zeros except for +1 in the i-th row and −1 in the j-th row, so the equations for the simplex multipliers simplify to:

yi − yj = cij   for all basic variables xij .

This gives m − 1 equations for m variables, so one of them can be selected arbitrarily. The others are then computed and used to evaluate the positive reduced costs for the optimality test. Initially any value of any multiplier can be chosen, but to simplify computation it is convenient to assign 0 to a multiplier that corresponds to an end of the spanning tree. The other values are then computed by traversing the spanning tree from the selected vertex. The initial assignment affects the values of the simplex multipliers but not the values of the reduced costs, because these are given by differences of the particular multipliers.

Feasibility test

If the table is not optimal, the optimality test gives the entering variable xij. Using the network terminology, we are adding an arc to a spanning tree by increasing its flow from the current value 0. By Lemma 6.13 this creates exactly one cycle. To keep flow conservation, we have to increase the flow in all arcs of the newly created cycle. Unless the problem is unbounded, some arcs in the cycle have the opposite orientation to the new arc, so increasing the cycle flow actually decreases the flow in these arcs. We can therefore increase the cycle flow until the flow in one (or more) of these opposite arcs drops to zero; any further increase would violate feasibility (nonnegative flows). This also restores the spanning tree, because the arc whose flow has dropped to zero can now be removed. If the flow drops to zero in several arcs, only one of them can be removed (degeneracy).

We can now summarize the steps of the network simplex method:

1. The optimality test – compute the simplex multipliers y: start at an end of the spanning tree and set the associated simplex multiplier to zero. Following the arcs (i, j) of the spanning tree, use the formula yi − yj = cij to compute the remaining simplex multipliers. Then compute the positive reduced costs: for each nonbasic arc (i, j), rij = cij − yi + yj. If rij ≥ 0 (minimization) or rij ≤ 0 (maximization) for all nonbasic arcs, the current basis is optimal; otherwise select the entering arc (i, j). (A sketch of this step appears after the list.)

2. The feasibility test – identify the cycle created by entering the arc (i, j) into the spanning tree. Among the arcs with orientation opposite to (i, j), find the minimum flow f. If no such arcs exist, the flow in the arc (i, j) can be increased arbitrarily and the problem is unbounded.

3. The pivoting – update the spanning tree: in the cycle, subtract f from the flows of the opposite arcs and add f to the flows of the same-direction arcs. Remove the arc whose flow dropped to zero (if there are several, select one arbitrarily).
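A minimal MATLAB sketch of step 1 (ours; the basic arc costs below are made up for illustration). It sweeps the spanning tree of the earlier rank example, solving yi − yj = cij with the multiplier of the end vertex 6 fixed at zero:

tree = [1 2; 2 5; 5 3; 3 4; 4 6];    % basic arcs (i, j) of a spanning tree
c    = [4 2 3 1 5]';                 % hypothetical costs of the basic arcs
m = 6;
y = nan(m, 1);  y(6) = 0;            % vertex 6 is an end of the tree
while any(isnan(y))                  % sweep until all multipliers are known
    for k = 1:size(tree, 1)
        i = tree(k,1);  j = tree(k,2);
        if  isnan(y(i)) && ~isnan(y(j)), y(i) = c(k) + y(j); end
        if ~isnan(y(i)) &&  isnan(y(j)), y(j) = y(i) - c(k); end
    end
end
y                                    % reduced cost of nonbasic (i,j): cij - y(i) + y(j)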
To find an initial basic feasible solution there are various methods for the different forms of network problems. Generally it is possible to add artificial arcs in such a way that an "obvious" initial basic feasible solution can easily be found. The artificial variables then have to be removed, as in the standard simplex method (M-method, two-phase method). For some problems there are direct methods (like the North-West corner method for transportation). Next we shall deal in more detail with the Transportation and Assignment problems.

6.7 Transportation Problem

Here we shall assume that each source is connected to each destination. This allows a computer-friendly tabular representation of the problem; in any case, a non-existent connection can always be modelled by an arc with a very high prohibitive cost (or a very large negative profit in the case of maximization). For tabular transportation problems there are simple ways to obtain an initial basic feasible solution. We shall then describe two optimization methods. One (the stepping-stone method) is based directly on the spanning tree properties; the other (the MODI method) is a version of the network simplex method.

We first recall the transportation problem: minimize the total transportation cost of a certain commodity from m1 sources to m2 destinations, based on known unit transportation costs from each source to each destination, known amounts available (supplies) at each source, and known demands at all destinations.

Remarks:
1) The number of vertices is m = m1 + m2; the number of arcs (variables) is n = m1 × m2.
2) Unbalanced problems can be balanced directly in the table by adding a dummy row (dummy source) or a dummy column (dummy destination) with zero costs and the supply or demand that makes the balance. Interpretation: allocations to dummy cells represent commodity that is not transported (unsatisfied demand for a dummy row, or commodity left at a source for a dummy column, respectively).

Example 6.19 Balance the next table.

Sources \ Destinations    A    B    C    D    Supply
I                         2    4    1    6      40
II                        4    3    3    3      20
III                       1    2    5    2      20
Demand                   20   30   15    5

Using the table notation, we can express directly the LP model of a balanced transportation problem (minimization). Let cij be the unit cost of the i-to-j transportation, si the supply at source i, dj the demand at destination j, and xij the solution variable (the amount transported from i to j):

Min ∑_{i=1}^{m1} ∑_{j=1}^{m2} cij xij
ST ∑_{j=1}^{m2} xij = si ,  i = 1 … m1
   ∑_{i=1}^{m1} xij = dj ,  j = 1 … m2
   xij ≥ 0

Using the above model, we can solve the transportation problem with any LP solver (a sketch is given below). Another possibility is to convert it into a network and solve it by the network simplex method given in the previous section.
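A linprog sketch for Example 6.19 (ours). Total supply is 80 against total demand 70, so a dummy destination with demand 10 and zero costs balances the table first; one equality is redundant in a balanced problem, which linprog tolerates:

C = [2 4 1 6; 4 3 3 3; 1 2 5 2];
C = [C, zeros(3,1)];                 % dummy destination, demand 10
s = [40 20 20];
d = [20 30 15 5 10];
[m1, m2] = size(C);
Arow = kron(ones(1,m2), eye(m1));    % row sums of X = supplies
Acol = kron(eye(m2), ones(1,m1));    % column sums of X = demands
Aeq = [Arow; Acol];
beq = [s'; d'];
x = linprog(C(:), [], [], Aeq, beq, zeros(m1*m2,1));
X = reshape(x, m1, m2)               % optimal transportation plan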
Here we shall describe a modification of the network simplex method that is performed directly in the table. We assume a balanced transportation problem.

1. Algorithms to find an initial basic feasible solution

A spanning tree is represented in the table by m − 1 nonzero allocations such that all demands are satisfied (this also guarantees that all supplies are fully utilized). Also, there must not be any cycles. A cycle is represented in the table by allocations that enable a return to a cell by moving only vertically or horizontally along nonzero allocations. Examples of cycles in a table (x marks an allocation):

x  .  x        x  x  .
x  .  x   or   .  x  x
               x  .  x

This is the general algorithm to find an initial basic feasible solution; it guarantees that the allocations form a spanning tree:

While (there are fewer than m − 1 allocations) do
  Select the next cell (see the following algorithms)
  Allocate as much as possible to this cell (the minimum of the remaining supply in its row and the remaining demand in its column)
  Adjust the associated supply and demand (subtract the allocation)
  Cross out the row or the column with zero remaining supply or demand – but not both!
EndWhile

There are several algorithms for selecting the next cell to be allocated. One is trivial; the other two attempt to allocate cells with low costs.

a) North-West corner method

Start with the upper left cell.
Allocate as much as possible.
If the row is crossed out, move down; otherwise move right.

Note: it may happen that a zero is allocated (a zero basic variable). After allocating the bottom-right entry there will be exactly one uncrossed row or column and m − 1 allocations. (A MATLAB sketch of this rule is given after the comparison below.)

b) Least cost method

The next cell is an unallocated cell with minimum cost; break ties arbitrarily.

Note: an unallocated cell is a cell whose row and column are not crossed out. It may happen that a zero is allocated (a zero basic variable). Stop when exactly one row or column with zero remaining supply or demand remains; this provides the required m − 1 allocations.

c) Vogel's approximation method (VAM)

a) For each uncrossed row and column compute the penalty: the difference between the smallest and the next smallest cost in that row or column. This has to be recomputed at each step (after crossing out a row, recalculate the column penalties; after crossing out a column, recalculate the row penalties).
b) Select a row or a column with the highest penalty; break ties arbitrarily.
c) Allocate as much as possible to the cell with the minimum cost in the selected row or column; break ties by the least cost rule.

Note: if all uncrossed rows and columns have zero remaining supply and demand, determine the zero basic variables by the least cost rule, taking care not to form a cycle. Stop when exactly one row or column with zero remaining supply or demand remains; this provides the required m − 1 allocations.

Comparison: All methods provide the m − 1 allocations (basic variables), some of which may be zero. Vogel's method often provides an optimum directly. The least cost method is probably a good compromise between complexity and the number of improvement steps (for manual solutions). The NW corner method is simple but may require many improvement steps, so it is the usual choice for computerized solutions.
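A minimal MATLAB sketch of the North-West corner rule (ours; the function name is hypothetical). When supply and demand hit zero simultaneously, the sweep moves down and a zero basic variable is allocated next, matching the degenerate case described above:

function X = nwcorner(s, d)
% North-West corner initial allocation for a balanced table:
% supplies s, demands d; touches exactly m1+m2-1 = m-1 cells.
X = zeros(numel(s), numel(d));
i = 1;  j = 1;
while i <= numel(s) && j <= numel(d)
    a = min(s(i), d(j));             % allocate as much as possible
    X(i,j) = a;
    s(i) = s(i) - a;  d(j) = d(j) - a;
    if s(i) == 0, i = i + 1; else, j = j + 1; end
end
end

For the balanced table of Example 6.19, X0 = nwcorner([40 20 20], [20 30 15 5 10]) gives the seven allocations 20, 20, 10, 10, 5, 5, 10 along the staircase.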
2. Algorithms to find an optimal solution

a) Stepping-Stone method

Idea: repeatedly try all unallocated cells to improve the total cost, until no improvement exists. In particular, for each unallocated cell do the following. Create a cycle starting and ending at this empty (nonbasic) cell, marked +, made otherwise of already allocated (basic) cells marked alternately − and +. The cycle may consist of horizontal and vertical segments only, not diagonal ones. (Degenerate allocations with zero entries can be used in the cycle.) By Lemma 6.13 we know that there is exactly one such cycle created by entering a given nonbasic variable.

Optimality test: sum the costs of the cells marked +, and subtract from this sum the costs of the cells marked −. If the result is negative, the solution is not optimal: entering the variable associated with this empty cell will improve the solution, and this can be applied immediately. Alternatively, we can evaluate all nonbasic cells and select the variable with the maximum cost decrease. If no such cell exists, the table is optimal.

Feasibility test: to allocate as much as possible to the new cell, find the cell in the cycle marked − with the minimum allocation. Add this value to the cells in the cycle marked +, and subtract it from the cells marked −. This enters the new solution variable at its maximum possible value; the variable that has dropped to zero leaves the solution. If more than one variable reaches zero (so-called temporary degeneracy), only one of them can leave the solution; it can be chosen arbitrarily. So there will always be m − 1 basic allocations, some of which may be zero. For a degenerate solution it may happen that a zero allocation is moved.

Note: the stepping-stone method is simple, but it involves many steps. If for some unallocated cells the cost difference is zero while for all others it is positive, there are alternative optima; to find them, enter such a variable – the total cost remains the same, but the allocations change.

b) Modified Distribution (MODI) method

The MODI method (also called the "method of multipliers") improves the search for the entering variable: compared with the stepping-stone method, all nonbasic variables are evaluated in one step and then compared to select the most promising one. It is basically the network simplex method. First, multipliers di are associated with each row and multipliers rj with each column of the table; they can be interpreted as unit dispatch and unit reception costs respectively. Then for each basic variable xij the following holds:

di + rj = cij .

Compare this equation with the general network simplex method, and note that the dispatch costs are directly the simplex multipliers, while the reception costs are the negatives of the simplex multipliers. So we again have m − 1 equations for m variables. By selecting any value for one of them (usually d1 = 0), the values of the others can easily be computed directly in the table – see the worksheet in the appendix.

The sum di + rj for a nonbasic variable is called its shadow cost; it equals zij in the simplex table, so the positive reduced cost is cij − (di + rj), or verbally: the actual cost of the table entry minus its shadow cost. Because the model seeks to minimize the total cost, the presence of a negative value shows that the solution is not optimal. In this case, enter the variable with the maximum negative reduced cost, breaking ties arbitrarily. The reduced cost can be interpreted as the cost saved by transporting one unit of the commodity through this so-far unallocated (nonbasic) route. After selecting the entering variable, the rest is done by creating the loop in the same way as in the stepping-stone method; this finds the leaving variable. This is repeated until no improvement is possible. A zero reduced cost indicates alternative optima. (A sketch of the multiplier computation follows.)
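A MATLAB sketch of the multiplier step (ours; the function name is hypothetical). It solves di + rj = cij over the basic cells with d1 = 0; the sweep terminates only if the basic cells form a spanning tree, and note that a mask like X0 > 0 misses zero basic allocations in degenerate cases:

function [d, r] = modi_multipliers(C, basic)
% Dispatch costs d and reception costs r from the m-1 basic cells
% (logical matrix 'basic'), with d(1) = 0.
[m1, m2] = size(C);
d = nan(m1,1);  r = nan(m2,1);  d(1) = 0;
while any(isnan(d)) || any(isnan(r))
    for i = 1:m1
        for j = 1:m2
            if basic(i,j)
                if ~isnan(d(i)) && isnan(r(j)), r(j) = C(i,j) - d(i); end
                if isnan(d(i)) && ~isnan(r(j)), d(i) = C(i,j) - r(j); end
            end
        end
    end
end
end

Usage: [d, r] = modi_multipliers(C, X0 > 0); then reduced = C - d - r' (implicit expansion) gives the reduced costs, and a negative entry at a nonbasic cell identifies an entering variable.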
Complete MODI method algorithm (including balancing)

If (demands < supplies) then add a dummy destination (column) with zero transportation costs and the demand that makes the balance
If (supplies < demands) then add a dummy source (row) with zero transportation costs and the supply that makes the balance
Make an initial basic feasible allocation (all demands must be satisfied, supplies must not be exceeded); there must be m − 1 allocations
Repeat
  Calculate the dispatch and reception costs from the basic (allocated) cells, setting the first dispatch cost to zero to solve the m − 1 equations
  Calculate the reduced cost of each empty, unallocated cell as the difference: actual cost − shadow cost = actual cost − (dispatch cost + reception cost)
  If (there is a negative reduced cost) then reduce the total cost:
    Select the nonbasic cell with the maximum negative reduced cost, breaking ties arbitrarily
    Mark this cell by +
    Mark other basic cells by − and + so as to keep the row and column balances; this creates a cycle that starts and ends at the selected nonbasic cell
    Find the minimum allocation among the cells in the cycle marked −
    Add this value to the + cells in the cycle, subtract it from the − cells
  EndIf
Until (there is no negative reduced cost)
Compute the total cost of the optimal solution.

Transportation: maximization problems

Maximization is necessary if the table entries are interpreted as the contribution associated with transporting a unit of the commodity through the route. Only minor modifications of the methods described above are needed; note that instead of costs the table now contains contributions.

Initial basic feasible allocation: NW corner method – no change. Least cost method – select the unallocated cell with maximum contribution. VAM – compute the penalty as the difference between the two largest contributions in the row or column; select the row or column with the maximum penalty and allocate the cell with the maximum contribution in that row or column.

Algorithms to find an optimal solution: Stepping-stone method – enter a nonbasic cell with a positive contribution difference. MODI method – enter the nonbasic cell with the most positive reduced cost; the optimal table has no positive reduced cost.

6.8 Assignment Problem

Assignment is a special case of transportation with all supplies and all demands equal to one. As with transportation, we shall assume that all assignments are possible, which allows a simplified tabular representation of the problem; if this is not true, we can again assign prohibitive costs/contributions to the non-existent entries of the table.

Problem specification: minimization (maximization) of the total cost (contribution) of an assignment of sources to destinations, based on known costs (contributions) for each combination.

Remarks:
1) The problem is balanced if the number of sources is equal to the number of destinations. Unbalanced problems can be balanced by adding dummy rows (dummy sources) or dummy columns (dummy destinations) with zero costs. Interpretation: assignments to dummy cells represent assignments that are in fact not made (destinations left unsatisfied for dummy rows, or sources left unassigned for dummy columns, respectively). In what follows only balanced models are considered, so there are m sources and m destinations. Using network terminology, there are 2m vertices and m2 arcs.
2) The most common assignment application is the assignment of jobs to applicants (and it does not actually matter whether applicants are listed in the rows of the table and jobs in the columns, or vice versa). We use the following table as an example assignment problem:

Applicants \ Jobs    A    B    C    D
I                    9   12    7   15
II                  13   14   15   10
III                  8   10   20    6
IV                  11   15   13   10

Using the table notation, we can express directly the LP model of a balanced assignment problem (minimization). Let cij be the cost of the i–j assignment and xij the solution variable (1 if i is assigned to j, 0 if not):

Min ∑_{i=1}^{m} ∑_{j=1}^{m} cij xij
ST ∑_{j=1}^{m} xij = 1 ,  i = 1 … m
   ∑_{i=1}^{m} xij = 1 ,  j = 1 … m
   xij ≥ 0

Assignment, as a special case of transportation, can be solved by any LP solver. Moreover, we know that if the right-hand sides are integer (here ±1 in the network form), the solution is also integer, so an explicit integrality requirement in the above model is in fact redundant – as the sketch below illustrates.
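A linprog sketch on the example table (ours). Because a basic optimal solution of the assignment LP is integral, the solver returns a 0/1 matrix without any integer constraint:

C = [9 12 7 15; 13 14 15 10; 8 10 20 6; 11 15 13 10];
n = 4;
Aeq = [kron(ones(1,n), eye(n));      % each applicant does exactly one job
       kron(eye(n), ones(1,n))];     % each job gets exactly one applicant
beq = ones(2*n, 1);
[x, fval] = linprog(C(:), [], [], Aeq, beq, zeros(n*n,1), ones(n*n,1));
X = reshape(x, n, n)                 % a 0/1 matrix; fval should be 38 here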
However, note that in the table there is exactly one cell with value 1 in each row and in each column; the other cells have value 0. Due to this property of an assignment table there are fast methods to find an initial basic feasible solution and to perform simplex iterations. The so-called Hungarian method reduces the cost matrix, which is possible due to the next theorem. After reducing the matrix, only entries with zeros are used for the assignment; for this it may be necessary to create more zeros.

Theorem 6.20: The optimal solution to a balanced assignment problem remains unchanged if a constant is subtracted from (or added to) any row or column of the cost matrix.

Proof: Let pi be the constant subtracted from row i and let qj be the constant subtracted from column j (this also covers addition, since adding means subtracting a negative constant). Then the entry cij of the cost matrix changes to:

dij = cij − pi − qj .

The new value of the objective function is:

∑i ∑j dij xij = ∑i ∑j (cij − pi − qj) xij
             = ∑i ∑j cij xij − ∑i pi (∑j xij) − ∑j qj (∑i xij)
             = ∑i ∑j cij xij − ∑i pi − ∑j qj
             = ∑i ∑j cij xij − C ,

using the assignment constraints ∑j xij = 1 and ∑i xij = 1. The difference between the new and the original objective values is the constant C, so the optimal solution is not changed.

Assignment algorithm (Hungarian method)

If (number of sources ≠ number of destinations) then add dummy row(s) or dummy column(s) with zero cost entries to get a square matrix
If (maximization) then reduce each column by the largest number in the column: new entry = largest number in the column − old entry
Else reduce each column by the smallest number in the column (minimization)
Reduce each row by the smallest number in the row
Repeat
  Cover all zeros by the minimum necessary number of lines
  If (number of lines < number of assignments) then
    Find the smallest uncovered value x
    Subtract x from all uncovered cells
    Add x to all cells covered twice
  EndIf
Until (number of necessary lines = number of assignments)
Make the assignments to zeros, unique in rows and columns (taking into account only rows and columns not yet assigned)
Using the original table entries, compute the total cost (contribution) of the optimal assignment.
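The first reduction pass on the example table, in MATLAB (ours; minimization, columns first and then rows, as in the algorithm):

C = [9 12 7 15; 13 14 15 10; 8 10 20 6; 11 15 13 10];
R = C - min(C);                      % subtract the column minima
R = R - min(R, [], 2)                % subtract the row minima

This yields R = [1 2 0 9; 1 0 4 0; 0 0 13 0; 0 2 3 1], which has at least one zero in every row and column. In this particular case the zeros already admit a complete assignment (e.g. I→C, II→B, III→D, IV→A), with total original cost 38, matching the LP result above.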
Remarks:
1) Note that a maximization problem is converted into the minimization of opportunity losses relative to the maximum values in the columns (alternatively it could be done in the rows).
2) After reducing both columns and rows there is at least one zero in each row and in each column. Subsequent assignments are made only to zero entries.
3) The minimum number of covering lines is the number of possible zero assignments, because each assignment covers both its row and its column. If this number is equal to the number of necessary assignments, then all assignments can be made at zeros. Otherwise it is necessary to reduce the matrix further to create more zeros.
4) The matrix cannot be reduced directly, because there is already at least one zero in each row and in each column (considering, obviously, only nonnegative costs). But the following can be done:
   • select a minimum uncovered cell,
   • add its value to all covered rows and columns,
   • subtract it from the whole matrix.
The above steps result in the operations given in the algorithm:
   • select a minimum uncovered cell,
   • add its value to the cells covered twice,
   • subtract it from all uncovered cells,
   • (leave the cells covered once unchanged).

Modifications of the method:

1) Impossible assignments can be modelled by giving them a very large cost (or a very large negative contribution). The problem can then be solved by the above method, which eliminates the cells with prohibitive entries. If cells with prohibitive entries end up in the optimal assignment, it is not possible to make all m assignments.

2) The so-called bottleneck assignment does not minimize the total cost; the objective is instead to minimize the value of the maximum assigned cell. Consider this situation: a group of workers travel to a certain place, each is assigned a certain job, and they can return only after all jobs are finished. Assuming that they cannot help each other, the whole group has to wait until the longest job is finished. The assignment matrix would in this case contain the times needed by the workers to complete the jobs (again, impossible assignments can be expressed by a very long time). A simple trick converts this problem into a standard assignment problem: rank the times in the matrix in increasing order and then replace each matrix entry by 2^RANK, where RANK is the order of that entry. Then solve the problem by the Hungarian method. The point is that the value 2^RANK is greater than the sum of all powers 2^n for 0 ≤ n < RANK, so minimizing the total converted cost minimizes the largest assigned time. (A small sketch of this conversion closes the chapter.)

The Hungarian method was developed by H. W. Kuhn in 1955. It is based on theories of the Hungarian mathematicians Kőnig and Egerváry from about 1931 – hence its name.
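Returning to the bottleneck conversion above, a minimal MATLAB sketch (ours; the 2×2 time matrix is made up for illustration):

T = [3 7; 5 4];                      % hypothetical times
[~, order] = sort(T(:));             % entries in increasing order
rk = zeros(size(T(:)));
rk(order) = 1:numel(T);              % rank of each entry
B = reshape(2 .^ rk, size(T))        % bottleneck cost matrix [2 16; 8 4]

Minimizing the total cost over B selects the diagonal assignment (total 6, longest time 4) rather than the alternative (total 24, longest time 7), so the bottleneck time is minimized.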