Mathematical Programming

Let $\Omega \subseteq \mathbb{R}^n$ and $f : \Omega \to \mathbb{R}$. The optimization problems $\max\{ f(x) : x \in \Omega \}$ and $\min\{ f(x) : x \in \Omega \}$ are also called mathematical programs. $\Omega$ is called the feasible set and $f(x)$ the objective function.

In the problem $\max\{ f(x) : x \in \Omega \}$, we seek $\hat{x} \in \Omega$ such that $f(\hat{x}) \ge f(x)$ for all $x \in \Omega$. If such an optimal solution $\hat{x}$ exists, then $f(\hat{x})$ is called the optimal value of the problem, and we can also write

$$f(\hat{x}) = \max\{ f(x) : x \in \Omega \}, \qquad \hat{x} = \operatorname{argmax}\{ f(x) : x \in \Omega \}.$$

Note that there can be several optimal solutions but only one optimal value.

Cases where a maximization problem fails to have an optimal solution:
1. $\Omega = \emptyset$ (problem is infeasible).
2. $\forall k \in \mathbb{R},\ \exists x \in \Omega$ such that $f(x) > k$ ($f(x)$ unbounded over $\Omega$: problem unbounded).
3. $f(x)$ bounded over $\Omega$, but does not attain a maximum value.

Similar statements apply to the minimization problem $\min\{ f(x) : x \in \Omega \}$.

Every maximization problem can be formulated as a minimization problem since

$$\max\{ f(x) : x \in \Omega \} = -\min\{ -f(x) : x \in \Omega \}.$$

Examples of mathematical optimization problems

1. Linear program (LP): $\Omega$ is the set of solutions to a given system of linear inequalities and/or equations.

$$\Omega = \{\, x \in \mathbb{R}^n : g^i x \le k_i \ (\text{for } i = 1,\dots,r),\ h^j x = l_j \ (\text{for } j = 1,\dots,q) \,\}$$

Here, each $g^i$ and $h^j$ is an n-dimensional (data) row vector, each $k_i$ and $l_j$ is a scalar (data), and $x$ is an n-dimensional column vector of variables.

$f(x)$ is a linear function of the variables $x_1, x_2, \dots, x_n$:

$$f(x) = \sum_{j=1}^{n} w_j x_j$$

Every linear program can be (re)formulated into any of the following forms:

$\max\{ cx : Ax \le b,\ x \ge 0 \}$ (a maximization LP in canonical form)
$\min\{ cx : Ax \ge b,\ x \ge 0 \}$ (a minimization LP in canonical form)
$\max\{ cx : Ax = b,\ x \ge 0 \}$ (a maximization LP in standard form)
$\min\{ cx : Ax = b,\ x \ge 0 \}$ (a minimization LP in standard form)

where $c$ is an n-dimensional row vector, $x$ is an n-dimensional column vector, matrix $A$ has m rows and n columns, and $b$ is an m-dimensional column vector.

Examples of LPs: transportation problem, network flow problems.

2. Nonlinear program (NLP): $\Omega$ is the set of solutions to a given system of inequalities and/or equations.
$$\Omega = \{\, x \in \mathbb{R}^n : g_i(x) \le 0 \ (\text{for } i = 1,\dots,p),\ h_k(x) = 0 \ (\text{for } k = 1,\dots,q) \,\}$$

where $g_i$ and $h_k$ represent arbitrary functions.

3. (Linear) Integer program (IP): An LP with the additional requirement that all variables take integer values.

$$\max\{ cx : Ax \le b,\ x \ge 0,\ x \in \mathbb{Z}^n \}$$

4. (Linear) Mixed Integer program (MIP): An LP where some variables must take integer values.

$$\max\{ cx + hy : Ax + Gy \le b,\ x \ge 0,\ x \in \mathbb{R}^n,\ y \ge 0,\ y \in \mathbb{Z}^p \}$$

Matrices $A$ and $G$ have n and p columns, respectively, and the same number of rows.

Examples: network design problems, facility location problems.

5. (Linear) Binary Integer program (BIP): an IP with 0-1 variables.

$$\max\{ cx : Ax \le b,\ x \in \{0,1\}^n \}$$

or, equivalently,

$$\max\{ cx : Ax \le b,\ 0 \le x \le 1,\ x \in \mathbb{Z}^n \}$$

Remarks:
- Any IP with a bounded feasible set can be reformulated as a BIP, using the binary (base 2) representation of integers.
- Any BIP can be reformulated as an NLP. Use the fact that $x_i^2 - x_i = 0 \iff x_i \in \{0,1\}$.

6. (Linear objective) Combinatorial Optimization Problems (CO):

Some examples:
- Minimum Spanning Tree Problem (MST)
- Assignment Problem
- Shortest Path Problem
- Traveling Salesman Problem (TSP)
- Chinese Postman Problem (CPP)
- Knapsack Problem (KP)

Typically, in a combinatorial optimization problem we have a finite set of n objects (e.g., edges, vertices, items for a knapsack) that we identify with the set $N = \{1, 2, \dots, n\}$. Additionally, there is a set $\mathcal{F}$ composed of subsets of $N$.

Examples:
MST: $N = E(G)$, $\mathcal{F} = \{ S \subseteq N : S \text{ induces a spanning tree of } G \}$
TSP: $N = E(G)$, $\mathcal{F} = \{ S \subseteq N : S \text{ induces a Hamiltonian cycle of } G \}$
KP: $N$ = set of items, $\mathcal{F} = \{ S \subseteq N : \text{total weight of } S \le \text{knapsack capacity} \}$

In a linear objective combinatorial optimization problem each $j \in N$ has a value $c_j$. The objective is to determine the maximum (or minimum) value of

$$\Big\{ \sum_{j \in S} c_j : S \in \mathcal{F} \Big\}$$

and identify an (optimal) set $\hat{S} \subseteq N$, $\hat{S} \in \mathcal{F}$, that yields this optimal value. That is, find $\hat{S} \in \mathcal{F}$ such that

$$\sum_{j \in \hat{S}} c_j = \max\Big\{ \sum_{j \in S} c_j : S \in \mathcal{F} \Big\} = \max_{S \in \mathcal{F}} \sum_{j \in S} c_j \quad \text{(maximization problem)}$$

$$\sum_{j \in \hat{S}} c_j = \min\Big\{ \sum_{j \in S} c_j : S \in \mathcal{F} \Big\} = \min_{S \in \mathcal{F}} \sum_{j \in S} c_j \quad \text{(minimization problem).}$$
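To make the definition concrete, the following is a minimal sketch that solves a tiny knapsack instance by enumerating every subset $S \subseteq N$, keeping those that belong to the family of feasible subsets (written $\mathcal{F}$ here), and maximizing $\sum_{j \in S} c_j$. The function name and the item data are hypothetical, chosen only for illustration, and items are indexed from 0 rather than 1. Enumeration takes exponential time, so this illustrates the definition and is not a practical algorithm.

```python
from itertools import combinations


def solve_knapsack_by_enumeration(values, weights, capacity):
    """Return (best_value, best_set) over all feasible subsets of N = {0, ..., n-1}."""
    n = len(values)
    best_value, best_set = None, None
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            # Membership test for the feasible family: total weight within capacity.
            if sum(weights[j] for j in subset) <= capacity:
                value = sum(values[j] for j in subset)
                if best_value is None or value > best_value:
                    best_value, best_set = value, set(subset)
    return best_value, best_set


# Made-up instance: values c_j, weights, and knapsack capacity (all assumptions).
values = [10, 13, 7, 8]
weights = [3, 4, 2, 3]
capacity = 7

best_value, best_set = solve_knapsack_by_enumeration(values, weights, capacity)
print(best_value, sorted(best_set))  # 23 [0, 1]
```

The empty set is always feasible here, so the function returns a value even when no single item fits.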
A combinatorial optimization problem can be formulated as an integer program (typically a BIP) or an MIP, since any set $S \subseteq N$ is completely described by its incidence vector $x \in \{0,1\}^n$, where $x_j = 1$ if $j \in S$, and $x_j = 0$ otherwise.

Continuous optimization vs discrete optimization:
- Continuous optimization: variables can vary continuously (e.g., LP, NLP).
- Discrete optimization: feasible set is discrete (e.g., CO, IP, and also LP, since we can solve linear programs by considering only the finite set of extreme points of the polyhedral feasible region).

Computational complexity of optimization problems:

Problems with polynomial time algorithms:
- Assignment Problem
- Minimum Spanning Tree
- Shortest Path
- Chinese Postman Problem
- Linear Programming

Problems proven to be NP-Hard:
- Traveling Salesman Problem
- Knapsack Problem
- Integer Programming

Importance of linear programming for discrete optimization:
- Linear programming theory has been used to develop polynomial time (combinatorial) algorithms for some CO problems (e.g., nonbipartite weighted matching, network flow problems).
- Linear programming algorithms and theory are also useful for many efficient current algorithms to solve NP-Hard problems such as TSP, Knapsack Problem, and general IP or MIP problems.

Difficulty of solving continuous optimization problems:
- Descent algorithms for minimization problems. (Ascent algorithms for maximization problems.)
- Local optimum vs global optimum.
- Convexity theory identifies problems which are easier to solve.

Convexity: Definitions
1. Convex sets
   - Convex combinations
   - Extreme points
2. Convex functions and concave functions. Linear functions are both convex and concave.
3. Convex (mathematical) programs:
   $\min\{ f(x) : x \in \Omega \}$ where $f$ is a convex function and $\Omega$ is a convex set.
   $\max\{ f(x) : x \in \Omega \}$ where $f$ is a concave function and $\Omega$ is a convex set.

Convexity results:
- In a convex program, a local optimum is a global optimum.
- If $\Omega$ is a convex set, $f$ is a convex function, and $\max\{ f(x) : x \in \Omega \}$ has an optimal solution, then there is an optimal solution that is an extreme point (this holds, for instance, when $\Omega$ is a polytope; some such assumption is needed, since a convex set such as a halfspace has no extreme points at all).
- The intersection of a collection of convex sets is a convex set.
- Hyperplanes and halfspaces are convex sets.

Polyhedral sets:

Polyhedron: solution set of a system of linear inequalities and/or equations.
- The feasible region of a linear program is a polyhedron.
- A polyhedron is a convex set.
  1. Linear programs are convex programs (local optima are also global).
  2. If an LP has an optimal solution, then there is one that is an extreme point of the feasible region (provided the feasible region has at least one extreme point, as it does when the constraints include $x \ge 0$).

Polytope: a bounded polyhedron.
- Representation theorem of polytopes.

Solvability of linear programs:
Given an LP, one of the following must be true:
1. LP has an optimal solution.
2. LP is infeasible.
3. LP is unbounded (in this case, the feasible region must be unbounded).

Algorithms for solving linear programs:
1. Simplex Method (Dantzig 1947): not polynomial, but very efficient in practice. Views LP as a discrete problem, since the method focuses on the extreme points. Primal Simplex and Dual Simplex are the two main variants of the Simplex method.
2. Ellipsoid Method (Khachiyan 1979): first polynomial method, not practical.
3. Interior Point Methods (Karmarkar 1984): polynomial and practical. Generates points in the interior of the polyhedron: LP as a continuous problem.

Review of Matrix multiplication:

Product of a row vector $w$ and a column vector $x$, both of dimension n:

$$wx = \sum_{j=1}^{n} w_j x_j$$

A matrix of size (or order) $m \times n$: an array with m rows and n columns

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$

where $a_{ij}$ denotes the entry in the i-th row and j-th column of matrix $A$ (the (i,j)-th entry of $A$). We will also refer to $a_{ij}$ by $[A]_{ij}$.

Vectors:
- a column vector with n entries is a matrix of size $n \times 1$.
- a row vector with n entries is a matrix of size $1 \times n$.
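The size conventions and the row-by-column product above can be sketched in a few lines. This is only an illustration under an assumed representation: a matrix is stored as a list of rows, so a row vector is a list holding one row and a column vector is a list of one-entry rows. The helper names `size` and `row_times_column` are hypothetical.

```python
def size(A):
    """Return (m, n) for a matrix stored as a list of m rows with n entries each."""
    m = len(A)
    n = len(A[0]) if m else 0
    if any(len(row) != n for row in A):
        raise ValueError("all rows must have the same number of entries")
    return m, n


def row_times_column(w, x):
    """Product wx of a 1 x n row vector w and an n x 1 column vector x."""
    (rows_w, n), (rows_x, cols_x) = size(w), size(x)
    if rows_w != 1 or cols_x != 1 or rows_x != n:
        raise ValueError("expected a 1 x n row vector and an n x 1 column vector")
    return sum(w[0][j] * x[j][0] for j in range(n))


A = [[1, 2, 3], [4, 5, 6]]   # a 2 x 3 matrix
w = [[2, -1, 3]]             # a 1 x 3 row vector
x = [[1], [4], [2]]          # a 3 x 1 column vector

print(size(A), size(w), size(x))
print(row_times_column(w, x))  # 2*1 + (-1)*4 + 3*2 = 4
```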
i-th row of $A$:

$$A_i = \begin{bmatrix} a_{i1} & a_{i2} & \cdots & a_{in} \end{bmatrix}$$

j-th column of $A$:

$$A^j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}$$

Thus, we can write:

$$A = \begin{bmatrix} A_1 \\ A_2 \\ \vdots \\ A_m \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} A^1 & A^2 & \cdots & A^n \end{bmatrix}$$

Matrix Multiplication:
If $A$ is $m \times n$ and $B$ is $n \times p$, then $AB$ is an $m \times p$ matrix with (i,j)-th entry

$$[AB]_{ij} = A_i B^j$$

Duality Theory of Linear Programming:
- Provides some theoretical insight on why LP has polynomial algorithms while none have been found for IP (and other NP-Hard problems).
- LP duality theory is key in the development of polynomial time algorithms for some CO problems (e.g., nonbipartite weighted matching problem).

Dual of a linear program:
Consider the (primal) LP:

$$\max\{ cx : Ax \le b,\ x \ge 0 \}$$

where $A$ is an m by n matrix. Then, $x$ must be an n-dimensional column vector, $c$ an n-dimensional row vector, and $b$ an m-dimensional column vector.

The dual of the LP above is the linear program:

$$\min\{ yb : yA \ge c,\ y \ge 0 \}$$

where, for the products to be defined, $y$ must be an m-dimensional row vector of variables.

Remember: the dual of a canonical maximization LP is a canonical minimization LP.

Weak Duality Lemma:
Let $\tilde{x} \in \{ x : Ax \le b,\ x \ge 0 \}$ and $\tilde{y} \in \{ y : yA \ge c,\ y \ge 0 \}$. Then $c\tilde{x} \le \tilde{y}b$.

Corollary: If one of the above LPs is unbounded, the other is infeasible.

Strong Duality Theorem:
Consider $\max\{ cx : Ax \le b,\ x \ge 0 \}$ and its dual $\min\{ yb : yA \ge c,\ y \ge 0 \}$. Then exactly one of the following must occur:
1. Both LPs are feasible and $\max\{ cx : Ax \le b,\ x \ge 0 \} = \min\{ yb : yA \ge c,\ y \ge 0 \}$.
2. One LP is unbounded and the other is infeasible.
3. Both LPs are infeasible.

Dual of a general linear program:
- Weak and strong duality results apply to any primal-dual pair of LPs.
- The dual of the dual is the original primal LP (involutory property of duality).

Complementary Slackness Theorem:
Let $A$ be the m by n constraint matrix, $b$ the rhs vector, and $c$ the objective function vector of a (primal) LP. Let the column vector $\tilde{x} \in \mathbb{R}^n$ be a feasible solution to the primal LP and let the row vector $\tilde{y} \in \mathbb{R}^m$ be a feasible solution to the dual LP.
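Then $\tilde{x}$ and $\tilde{y}$ are respectively optimal for each of the problems if and only if:

$$(c_j - \tilde{y}A^j)\,\tilde{x}_j = 0 \quad \forall j = 1,\dots,n$$

$$\tilde{y}_i\,(A_i\tilde{x} - b_i) = 0 \quad \forall i = 1,\dots,m$$

Linear combinations

A vector $b \in \mathbb{R}^n$ is a linear combination of vectors $a_1, a_2, \dots, a_k$ (also all in $\mathbb{R}^n$) if

$$b = \sum_{j=1}^{k} \lambda_j a_j$$

for some choice of real numbers $\lambda_1, \lambda_2, \dots, \lambda_k$.

We also say that:
- $b$ is an affine combination of $a_1, a_2, \dots, a_k$ if $b = \sum_{j=1}^{k} \lambda_j a_j$ and $\sum_{j=1}^{k} \lambda_j = 1$.
- $b$ is a nonnegative combination of $a_1, a_2, \dots, a_k$ if $b = \sum_{j=1}^{k} \lambda_j a_j$ and $\lambda_j \ge 0\ \forall j$.
- $b$ is a convex combination of $a_1, a_2, \dots, a_k$ if $b = \sum_{j=1}^{k} \lambda_j a_j$, $\sum_{j=1}^{k} \lambda_j = 1$, and $\lambda_j \ge 0\ \forall j$.

$\sum_j \lambda_j a_j$ will always denote a linear combination of a finite number of vectors.

More definitions:
Let $\Omega \subseteq \mathbb{R}^n$ (the set $\Omega$ may have an infinite number of vectors). Then
- linear span of $\Omega$: $\operatorname{lin}(\Omega) = \{ \sum_j \lambda_j a_j : a_j \in \Omega \}$
- affine span of $\Omega$: $\operatorname{aff}(\Omega) = \{ \sum_j \lambda_j a_j : a_j \in \Omega,\ \sum_j \lambda_j = 1 \}$
- cone generated by $\Omega$: $\operatorname{cone}(\Omega) = \{ \sum_j \lambda_j a_j : a_j \in \Omega,\ \lambda_j \ge 0\ \forall j \}$
- convex hull of $\Omega$: $\operatorname{conv}(\Omega) = \{ \sum_j \lambda_j a_j : a_j \in \Omega,\ \sum_j \lambda_j = 1,\ \lambda_j \ge 0\ \forall j \}$

Let $\Omega \subseteq \mathbb{R}^n$ ($\Omega$ may have an infinite number of vectors). We say that
- $\Omega$ is a subspace if $\Omega = \operatorname{lin}(\Omega)$
- $\Omega$ is an affine set if $\Omega = \operatorname{aff}(\Omega)$
- $\Omega$ is a convex cone if $\Omega = \operatorname{cone}(\Omega)$
- $\Omega$ is a convex set if $\Omega = \operatorname{conv}(\Omega)$

Linear Independence: the set $\{ a_1, a_2, \dots, a_k \}$ is linearly independent if

$$\sum_{j=1}^{k} \lambda_j a_j = 0 \ \text{ implies that } \ \lambda_j = 0 \quad \forall j = 1,\dots,k$$

Equivalently, the set $\{ a_1, a_2, \dots, a_k \}$ is linearly independent if no vector in the set can be written as a linear combination of the remaining vectors.

Affine Independence: the set $\{ a_1, a_2, \dots, a_k \}$ is affinely independent if

$$\sum_{j=1}^{k} \lambda_j a_j = 0 \ \text{ and } \ \sum_{j=1}^{k} \lambda_j = 0 \ \text{ imply that } \ \lambda_j = 0 \quad \forall j = 1,\dots,k$$

Equivalently, the set $\{ a_1, a_2, \dots, a_k \}$ is affinely independent if no vector in the set can be written as an affine combination of the remaining vectors.

Spanning set of a subspace: (review definition)
Basis of a subspace: (review definition)
Dimension of a subspace: number of vectors in a basis
Dimension of a convex set $\Omega$: $\max\{ |S| : S \subseteq \Omega,\ S \text{ affinely independent} \} - 1$.
Notation: $|S|$ = cardinality of the (finite) set $S$.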
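The two independence tests above can be checked numerically. The sketch below relies on one observation: the conditions $\sum_j \lambda_j a_j = 0$ and $\sum_j \lambda_j = 0$ together say that $\sum_j \lambda_j (a_j, 1) = 0$, so the $a_j$ are affinely independent exactly when the lifted vectors $(a_j, 1)$ are linearly independent. Linear independence is tested by computing a rank with exact rational arithmetic. The function names are hypothetical and the points are made-up illustrative data.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix given as a list of rows, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)


def is_affinely_independent(vectors):
    # Append a 1 to every vector, then test linear independence of the lifted set.
    return is_linearly_independent([list(v) + [1] for v in vectors])


triangle = [[0, 0], [1, 0], [0, 1]]
print(is_linearly_independent(triangle))             # False: contains the zero vector
print(is_affinely_independent(triangle))             # True
print(is_affinely_independent(triangle + [[1, 1]]))  # False: four points in the plane
```

Under the dimension formula above, the triangle with these three vertices therefore has dimension 3 - 1 = 2: it contains three affinely independent points, and no four points of the plane are affinely independent.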