Linear Programming (Optimization)

Chapter 1 Introduction
❑ Mathematical programming problem:
    min/max  f(x)
    subject to
        g_i(x) ≤ 0,  i = 1, …, m,
        (h_j(x) = 0,  j = 1, …, k)
        (x ∈ X ⊂ R^n)
    where f, g_i, h_j : R^n → R
❑ If f, g_i, h_j are linear (affine) functions → linear programming problem
  If f, g_i, h_j (or some of them) are nonlinear functions → nonlinear programming problem
  If the solution set (or some of the variables) is restricted to integer points → integer programming problem
❑ Linear programming: the problem of optimizing (maximizing or minimizing) a linear (objective) function subject to linear inequality (and equality) constraints.
❑ General form:
    {max, min}  c'x
    subject to
        a_i'x ≥ b_i,  i ∈ M_1
        a_i'x ≤ b_i,  i ∈ M_2
        a_i'x = b_i,  i ∈ M_3
        x_j ≥ 0,  j ∈ N_1,    x_j ≤ 0,  j ∈ N_2
    c, a_i, x ∈ R^n
    (There may also exist variables unrestricted in sign.)
❑ Inner product of two column vectors x, y ∈ R^n:
    x'y = Σ_{i=1}^n x_i y_i
  If x'y = 0 and x, y ≠ 0, then x and y are said to be orthogonal. In 3-D, the angle between the two vectors is 90 degrees.
  (Vectors are column vectors unless specified otherwise.)
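(A minimal sketch of this computation in Python with NumPy; the vectors below are made-up examples, not data from the text.)

```python
# Minimal sketch: inner product x'y and an orthogonality check with NumPy.
# The vectors are made-up illustration data.
import numpy as np

x = np.array([1.0, 2.0, -1.0])
y = np.array([2.0, 0.0, 2.0])

inner = x @ y            # x'y = sum_i x_i y_i
print(inner)             # 0.0 -> x and y are orthogonal
```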
❑ The big differences from systems of linear equations are the existence of an objective function and of linear inequalities (instead of equalities).
❑ Much deeper theoretical results and wider applicability than systems of linear equations.
๏ฑ ๐‘ฅ1 , ๐‘ฅ2 , … , ๐‘ฅ๐‘› : (decision) variables
๐‘๐‘– : right-hand-side
๐‘Ž๐‘–′ ๐‘ฅ { ๏‚ณ, ๏‚ฃ, ๏€ฝ } ๐‘๐‘– : i th constraint
๐‘ฅ๐‘— { ๏‚ณ, ๏‚ฃ } 0 : nonnegativity (nonpositivity) constraint
๐‘ ′ ๐‘ฅ : objective function
❑ Other terminology:
feasible solution, feasible set (region), free (unrestricted) variable, optimal
(feasible) solution, optimal cost, unbounded
Important submatrix multiplications
❑ Interpretation of constraints: see them as submatrix multiplications.
  A: m × n matrix
  Write A in terms of its rows a_1', …, a_m' and its columns A_1, …, A_n.

  Ax = Σ_{j=1}^n A_j x_j = Σ_{i=1}^m (a_i'x) e_i,  where e_i is the i-th unit vector

  y'A = Σ_{i=1}^m y_i a_i' = Σ_{j=1}^n (y'A_j) e_j'

  Denote the constraints as Ax { ≥, ≤, = } b.
❑ Any LP can be expressed as min c'x, Ax ≥ b:
    max c'x  →  min (−c'x), then take the negative of the optimal cost
    a_i'x ≤ b_i  →  −a_i'x ≥ −b_i
    a_i'x = b_i  →  a_i'x ≥ b_i, −a_i'x ≥ −b_i
  Nonnegativity (nonpositivity) constraints are special cases of inequalities, which will be handled separately in the algorithms.
  The feasible solution set of an LP can always be expressed as Ax ≥ b (or Ax ≤ b) (called a polyhedron: a set which can be described as the solution set of finitely many linear inequalities).
❑ We may sometimes use the max c'x, Ax ≤ b form (especially when we study polyhedra).
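(The sign flips above are mechanical; the following is a minimal NumPy sketch of the conversion, using made-up matrices.)

```python
# Sketch: put a max / mixed-inequality LP into the form  min c'x, Ax >= b.
# All data below are made-up illustration values.
import numpy as np

c = np.array([3.0, 2.0])                                # objective of  max c'x
A_le, b_le = np.array([[1.0, 1.0]]), np.array([4.0])    # a'x <= b rows
A_ge, b_ge = np.array([[1.0, 0.0]]), np.array([1.0])    # a'x >= b rows
A_eq, b_eq = np.array([[0.0, 1.0]]), np.array([2.0])    # a'x  = b rows

c_min = -c                                   # max c'x  ->  min (-c'x)
A = np.vstack([A_ge, -A_le, A_eq, -A_eq])    # <= rows negated, = rows doubled
b = np.concatenate([b_ge, -b_le, b_eq, -b_eq])
# Now the problem reads  min c_min'x  s.t.  Ax >= b ;
# the optimal cost of the original max problem is the negative of the new one.
```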
Standard form problems
❑ Standard form: min c'x, Ax = b, x ≥ 0
    Ax = Σ_{j=1}^n A_j x_j = Σ_{i=1}^m (a_i'x) e_i
  Two viewpoints:
  ➢ Find optimal (nonnegative) weights for a nonnegative linear combination of the columns of A that yields the vector b
  ➢ Find an optimal solution that satisfies the linear equations and nonnegativity
❑ Reduction to standard form
    Free (unrestricted) variable x_j  →  x_j = x_j^+ − x_j^-,  with x_j^+, x_j^- ≥ 0
    Σ_j a_ij x_j ≤ b_i  →  Σ_j a_ij x_j + s_i = b_i,  s_i ≥ 0  (slack variable)
    Σ_j a_ij x_j ≥ b_i  →  Σ_j a_ij x_j − s_i = b_i,  s_i ≥ 0  (surplus variable)
❑ Any (practical) algorithm can solve the LP problem in equality form only (apart from the nonnegativity constraints).
❑ A modified form of the simplex method can solve the problem with free variables directly (without using the difference of two variables). It gives a more sensible interpretation of the behavior of the algorithm.
1.2 Formulation examples
❑ See other examples in the text.
❑ Minimum cost network flow problem
  Directed network G = (N, A), (|N| = n)
  Arc capacity u_ij, (i, j) ∈ A; unit flow cost c_ij, (i, j) ∈ A
  b_i: net supply at node i (b_i > 0: supply node, b_i < 0: demand node). (We may assume Σ_{i∈N} b_i = 0.)
  Find the minimum cost transportation plan that satisfies the supply and demand at each node and the arc capacities.
      minimize    Σ_{(i,j)∈A} c_ij x_ij
      subject to  Σ_{j:(i,j)∈A} x_ij − Σ_{j:(j,i)∈A} x_ji = b_i,   i = 1, …, n
                  (out-flow − in-flow = net flow at node i;
                   some people use in-flow − out-flow = net flow)
                  x_ij ≤ u_ij,   (i, j) ∈ A
                  x_ij ≥ 0,      (i, j) ∈ A
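(A minimal sketch of this formulation solved as a generic LP with scipy.optimize.linprog; the network, costs, capacities, and supplies below are made-up illustration data.)

```python
# Sketch: min cost network flow as an LP with scipy.optimize.linprog.
# Network, costs, capacities and supplies are made-up illustration data.
import numpy as np
from scipy.optimize import linprog

nodes = [1, 2, 3, 4]
arcs  = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
cost  = [2.0, 4.0, 1.0, 6.0, 3.0]          # c_ij per unit flow
cap   = [4.0, 3.0, 2.0, 4.0, 5.0]          # u_ij
b     = {1: 5.0, 2: 0.0, 3: 0.0, 4: -5.0}  # net supply b_i, sums to 0

# Node-arc incidence rows: out-flow - in-flow = b_i for every node i.
A_eq = np.zeros((len(nodes), len(arcs)))
for col, (i, j) in enumerate(arcs):
    A_eq[nodes.index(i), col] = 1.0
    A_eq[nodes.index(j), col] = -1.0
b_eq = np.array([b[i] for i in nodes])

res = linprog(c=cost, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0.0, u) for u in cap])   # 0 <= x_ij <= u_ij
print(res.x, res.fun)
```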
❑ Choosing paths in a communication network ((fractional) multicommodity flow problem)
❑ Multicommodity flow problem: several commodities share the network. For each commodity it is a min cost network flow problem, but the commodities must share the capacities of the arcs. A generalization of the min cost network flow problem, with many applications in communication and distribution/transportation systems.
  ➢ Several-commodities case
  ➢ Actually one commodity, but there are multiple origin-destination pairs of nodes (telecom, logistics, …). Each origin-destination pair represents a commodity.
❑ Given a telecommunication network (directed) with arc set A, arc capacity u_ij bits/sec, (i, j) ∈ A, unit flow cost c_ij /bit, (i, j) ∈ A, and demand b^{kl} bits/sec for traffic from node k to node l.
  Data can be sent using more than one path.
  Find paths to direct the demands with minimum cost.
  Decision variables:
    x_ij^{kl} : amount of data with origin k and destination l that traverses link (i, j) ∈ A
    Let b_i^{kl} =  b^{kl}    if i = k
                   −b^{kl}    if i = l
                    0         otherwise
❑ Formulation (flow-based formulation)
      minimize    Σ_{(i,j)∈A} Σ_k Σ_l c_ij x_ij^{kl}
      subject to  Σ_{j:(i,j)∈A} x_ij^{kl} − Σ_{j:(j,i)∈A} x_ji^{kl} = b_i^{kl},   i, k, l = 1, …, n
                  (out-flow − in-flow = net flow at node i for the
                   commodity from node k to node l)
                  Σ_k Σ_l x_ij^{kl} ≤ u_ij,   (i, j) ∈ A
                  (the sum over all commodities must not exceed the
                   capacity of link (i, j))
                  x_ij^{kl} ≥ 0,   (i, j) ∈ A,  k, l = 1, …, n
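(A small sketch of the flow-based formulation on a made-up network with two origin-destination pairs, again using scipy.optimize.linprog as a generic LP solver.)

```python
# Sketch: flow-based multicommodity LP on a tiny made-up network.
import numpy as np
from scipy.optimize import linprog

nodes = [1, 2, 3, 4]
arcs  = [(1, 2), (1, 3), (2, 4), (3, 4), (2, 3)]
cost  = {(1, 2): 1.0, (1, 3): 2.0, (2, 4): 2.0, (3, 4): 1.0, (2, 3): 1.0}
cap   = {a: 4.0 for a in arcs}                      # u_ij
demands = {(1, 4): 5.0, (2, 4): 3.0}                # b^{kl} for O-D pairs (k, l)

K = list(demands)                                   # commodities
pairs = [(k, a) for k in K for a in arcs]           # one variable x_a^k per pair
col = {pa: idx for idx, pa in enumerate(pairs)}
n_var = len(pairs)
c = np.array([cost[a] for (k, a) in pairs])         # objective coefficients

# Flow conservation: out-flow - in-flow = b_i^{kl} for every node and commodity.
A_eq, b_eq = [], []
for (o, d) in K:
    for i in nodes:
        row = np.zeros(n_var)
        for a in arcs:
            if a[0] == i:
                row[col[((o, d), a)]] += 1.0
            if a[1] == i:
                row[col[((o, d), a)]] -= 1.0
        A_eq.append(row)
        b_eq.append(demands[(o, d)] if i == o else -demands[(o, d)] if i == d else 0.0)

# Shared capacity: sum of all commodities on arc (i, j) <= u_ij.
A_ub = []
for a in arcs:
    row = np.zeros(n_var)
    for k in K:
        row[col[(k, a)]] = 1.0
    A_ub.append(row)
b_ub = [cap[a] for a in arcs]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print(res.status, res.fun)
```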
❑ Alternative formulation (path-based formulation)
  Let K: set of origin-destination pairs (commodities)
      b^k: demand of commodity k ∈ K
      P(k): set of all possible paths for sending commodity k ∈ K
      P(k;e): set of paths in P(k) that traverse arc e ∈ A
      E(p): set of links contained in path p
  Decision variables:
      y_p^k : fraction of commodity k sent on path p

      minimize    Σ_{k∈K} Σ_{p∈P(k)} w_p^k y_p^k
      subject to  Σ_{p∈P(k)} y_p^k = 1,                   for all k ∈ K
                  Σ_{k∈K} Σ_{p∈P(k;e)} b^k y_p^k ≤ u_e,   for all e ∈ A
                  0 ≤ y_p^k ≤ 1,                          for all p ∈ P(k), k ∈ K,
      where w_p^k = b^k Σ_{e∈E(p)} c_e

❑ If y_p^k ∈ {0, 1}, it is a single path routing problem (path selection problem, integer multicommodity flow problem).
❑ The path-based formulation has a smaller number of constraints, but an enormous number of variables. It can still be solved fairly easily by the column generation technique (later). The integer version is more difficult to solve.
❑ Extensions: network design - also determine the number and type of facilities to be installed on the links (and/or nodes), together with the routing of traffic.
❑ Variations: integer flow; bifurcation of traffic may not be allowed; determine capacities and routing considering rerouting of traffic in case of network failure; robust network design (data uncertainty), …
❑ Pattern classification (linear classifier)
  Given m objects with feature vectors a_i ∈ R^n, i = 1, …, m.
  The objects belong to one of two classes, and we know the class to which each sample object belongs. (Let S denote the index set of the objects in one of the two classes.)
  We want to design a criterion that determines the class of a new object from its feature vector.
  Want to find a vector (x, x_{n+1}) ∈ R^{n+1} with x ∈ R^n such that, if i ∈ S, then a_i'x ≥ x_{n+1}, and if i ∉ S, then a_i'x < x_{n+1} (if it is possible).
❑ Find a feasible solution (x, x_{n+1}) that satisfies
      a_i'x ≥ x_{n+1},   i ∈ S
      a_i'x < x_{n+1},   i ∉ S
  for all sample objects i.
  Is this a linear programming problem?
  (There is no objective function, and there are strict inequalities in the constraints.)
❑ Is strict inequality allowed in LP?
  Consider min x, x > 0  →  there is no minimum point; only the infimum of the objective value exists.
❑ If the system has a feasible solution (x, x_{n+1}), we can make the gap between the two sides of each inequality as large as we like by using the scaled solution M(x, x_{n+1}) for M > 0 large. Hence, if the system has a solution, there exists a solution that makes the gap at least 1.
  Remedy: use   a_i'x ≥ x_{n+1},        i ∈ S
                a_i'x ≤ x_{n+1} − 1,    i ∉ S
❑ Important problem in data mining with applications in target marketing,
bankruptcy prediction, medical diagnosis, process monitoring, …
❑ Variations
  ➢ What if there are many choices of hyperplanes? Any reasonable criteria?
  ➢ What if there is no hyperplane separating the two classes?
  ➢ Do we have to use only one hyperplane?
  ➢ Is the use of nonlinear functions possible? How do we solve them?
      • SVM (support vector machine), convex optimization
  ➢ More than two classes?
1.3 Piecewise linear convex objective functions
❑ Some problems involving nonlinear functions can be modeled as LPs.
❑ Def: A function f: R^n → R is called a convex function if for all x, y ∈ R^n and all λ ∈ [0, 1],
      f(λx + (1−λ)y) ≤ λf(x) + (1−λ)f(y).
  (The domain may be restricted.)
  f is called concave if −f is convex.
  (Picture: the line segment joining (x, f(x)) and (y, f(y)) in R^{n+1} is not below the graph of f.)
❑ Def: x, y ∈ R^n, λ_1, λ_2 ≥ 0, λ_1 + λ_2 = 1.
  Then λ_1 x + λ_2 y is said to be a convex combination of x, y.
  Generally, Σ_{i=1}^k λ_i x^i, where Σ_{i=1}^k λ_i = 1 and λ_i ≥ 0, i = 1, …, k, is a convex combination of the points x^1, …, x^k.
❑ Def: A set S ⊆ R^n is convex if for any x, y ∈ S, we have λ_1 x + λ_2 y ∈ S for any λ_1, λ_2 ≥ 0, λ_1 + λ_2 = 1.
  Picture:
      λ_1 x + λ_2 y = λ_1 x + (1 − λ_1) y = y + λ_1 (x − y),   0 ≤ λ_1 ≤ 1
  (the line segment joining x and y lies in S; x corresponds to λ_1 = 1 and y to λ_1 = 0)
❑ If we take λ_1 x + λ_2 y with λ_1 + λ_2 = 1 (without requiring λ_1, λ_2 ≥ 0), it is called an affine combination of x and y.
  Picture:
      λ_1 x + λ_2 y = λ_1 x + (1 − λ_1) y = y + λ_1 (x − y),   λ_1 arbitrary
  (the line passing through the points x and y)
Picture of a convex function
(Figure: the graph of f over x ∈ R^n, with points (x, f(x)) and (y, f(y)) in R^{n+1}; the chord point (λx + (1−λ)y, λf(x) + (1−λ)f(y)) lies above the graph value f(λx + (1−λ)y).)
❑ Relation between convex functions and convex sets
❑ Def: f: R^n → R. Define the epigraph of f as epi(f) = {(x, μ) ∈ R^{n+1} : μ ≥ f(x)}.
❑ Then the previous definition of a convex function is equivalent to epi(f) being a convex set. When dealing with convex functions, we frequently consider epi(f) to exploit the properties of convex sets.
❑ Consider operations on functions that preserve convexity and operations on sets that preserve convexity.
❑ Example:
  Consider f(x) = max_{i=1,…,m} (c_i'x + d_i),  c_i ∈ R^n, d_i ∈ R
  (the maximum of affine functions, called a piecewise linear convex function)
  (Figure: the graph of f(x) over x ∈ R^n as the upper envelope of the lines c_1'x + d_1, c_2'x + d_2, c_3'x + d_3.)
❑ Thm: Let f_1, …, f_m : R^n → R be convex functions. Then f(x) = max_{i=1,…,m} f_i(x) is also convex.
  pf)  f(λx + (1−λ)y) = max_{i=1,…,m} f_i(λx + (1−λ)y)
                      ≤ max_{i=1,…,m} ( λf_i(x) + (1−λ)f_i(y) )
                      ≤ max_{i=1,…,m} λf_i(x) + max_{i=1,…,m} (1−λ)f_i(y)
                      = λf(x) + (1−λ)f(y)   ∎
❑ Min of piecewise linear convex functions:
      minimize    max_{i=1,…,m} (c_i'x + d_i)
      subject to  Ax ≥ b
  is equivalent to the LP
      minimize    z
      subject to  z ≥ c_i'x + d_i,   i = 1, …, m
                  Ax ≥ b
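(A sketch of this reformulation with scipy.optimize.linprog over the variables (x, z); the c_i, d_i, A, b below are made up.)

```python
# Sketch: minimize max_i (c_i'x + d_i) s.t. Ax >= b, via the auxiliary variable z.
# Variables are (x, z); all data below are made-up illustration values.
import numpy as np
from scipy.optimize import linprog

C = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])   # rows c_i'
d = np.array([0.0, 0.0, 2.0])                           # d_i
A = np.array([[1.0, 1.0]]); b = np.array([1.0])         # Ax >= b

m, n = C.shape
obj = np.concatenate([np.zeros(n), [1.0]])              # minimize z
# c_i'x + d_i <= z  ->  c_i'x - z <= -d_i ;   Ax >= b  ->  -Ax <= -b
A_ub = np.vstack([np.hstack([C, -np.ones((m, 1))]),
                  np.hstack([-A, np.zeros((len(A), 1))])])
b_ub = np.concatenate([-d, -b])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (n + 1))          # x and z are free
print(res.x, res.fun)
```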
❑ Q: What can we do about finding the maximum of a piecewise linear convex function?
  The maximum of a piecewise linear concave function (which can be expressed as the minimum of affine functions)?
  The minimum of a piecewise linear concave function?
❑ A convex function has the nice property that a local minimum point is a global minimum point (when the domain is R^n or a convex set). (HW later)
  Hence finding the minimum of a convex function defined over a convex set is usually easy, but finding the maximum of a convex function is difficult: basically, we need to examine all local maximum points.
  Similarly, finding the maximum of a concave function is easy, but finding the minimum of a concave function is difficult.
❑ Suppose we have f(x) ≤ h in the constraints, where f(x) is a piecewise linear convex function f(x) = max_{i=1,…,m} (f_i'x + g_i).
  ⇒  f_i'x + g_i ≤ h,   i = 1, …, m
  Q: What about the constraint f(x) ≥ h? Can it be modeled as an LP?
❑ Def: Let f: R^n → R be a convex function and α ∈ R. The set C = {x : f(x) ≤ α} is called a level set of f.
❑ A level set of a convex function is a convex set. (HW later)
  The solution set of an LP is convex (easy)  →  a non-convex solution set cannot be modeled as an LP.
Problems involving absolute values
❑     minimize    Σ_{i=1}^n c_i |x_i|
      subject to  Ax ≥ b
      (assume c_i ≥ 0)
  More direct formulations than the piecewise linear convex function approach are possible:

  (1)  min  Σ_{i=1}^n c_i z_i
       s.t.  Ax ≥ b
             x_i ≤ z_i,    i = 1, …, n
             −x_i ≤ z_i,   i = 1, …, n

  (2)  min  Σ_{i=1}^n c_i (x_i^+ + x_i^-)
       s.t.  Ax^+ − Ax^- ≥ b
             x^+, x^- ≥ 0
       (want x_i^+ = x_i if x_i ≥ 0, x_i^- = −x_i if x_i < 0, and x_i^+ x_i^- = 0, i.e., at most one of x_i^+, x_i^- is positive in an optimal solution; c_i ≥ 0 guarantees that.)
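(A sketch of formulation (1) with scipy.optimize.linprog over the variables (x, z); A, b, c below are made-up illustration data.)

```python
# Sketch: minimize sum_i c_i |x_i|  s.t.  Ax >= b, via formulation (1) above.
# Variables are (x, z); A, b, c are made-up illustration data.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -2.0], [1.0, 1.0]]);  b = np.array([2.0, 1.0])
c = np.array([1.0, 3.0])                  # c_i >= 0
n = len(c)
I = np.eye(n)

obj = np.concatenate([np.zeros(n), c])    # cost only on the z_i
A_ub = np.vstack([np.hstack([-A, np.zeros((A.shape[0], n))]),   #  -Ax <= -b
                  np.hstack([ I, -I]),                          #   x_i - z_i <= 0
                  np.hstack([-I, -I])])                         #  -x_i - z_i <= 0
b_ub = np.concatenate([-b, np.zeros(2 * n)])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * (2 * n))   # x free, z_i >= 0 implied
print(res.x[:n], res.fun)                        # optimal x and cost
```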
Data Fitting
❑ Regression analysis using the absolute value function
  Given m data points (a_i, b_i), i = 1, …, m, with a_i ∈ R^n, b_i ∈ R.
  Want to find x ∈ R^n that predicts the result b from a via the function b = a'x.
  Want x that minimizes the largest prediction error max_i |b_i − a_i'x|:
      minimize    z
      subject to  b_i − a_i'x ≤ z,    i = 1, …, m
                  −b_i + a_i'x ≤ z,   i = 1, …, m
❑ Alternative criterion:
      minimize  Σ_{i=1}^m |b_i − a_i'x|
  which can be written as
      minimize    z_1 + … + z_m
      subject to  b_i − a_i'x ≤ z_i,    i = 1, …, m
                  −b_i + a_i'x ≤ z_i,   i = 1, …, m
  A quadratic error function cannot be modeled as an LP; it is handled by calculus methods instead (closed-form solution).
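(A sketch of both fitting criteria with scipy.optimize.linprog; the data points are made up.)

```python
# Sketch: l_infinity and l_1 data fitting as LPs.  (a_i, b_i) are made-up points.
import numpy as np
from scipy.optimize import linprog

Adat = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]])  # rows a_i'
bdat = np.array([1.1, 1.9, 3.2, 3.9])
m, n = Adat.shape
free = lambda k: [(None, None)] * k

# (1) minimize z      s.t.  |b_i - a_i'x| <= z     (variables x, z)
A_ub = np.vstack([np.hstack([-Adat, -np.ones((m, 1))]),   # b_i - a_i'x <= z
                  np.hstack([ Adat, -np.ones((m, 1))])])  # a_i'x - b_i <= z
b_ub = np.concatenate([-bdat, bdat])
res_inf = linprog(np.r_[np.zeros(n), 1.0], A_ub=A_ub, b_ub=b_ub, bounds=free(n + 1))

# (2) minimize sum z_i  s.t.  |b_i - a_i'x| <= z_i  (variables x, z_1..z_m)
A_ub = np.vstack([np.hstack([-Adat, -np.eye(m)]),
                  np.hstack([ Adat, -np.eye(m)])])
b_ub = np.concatenate([-bdat, bdat])
res_l1 = linprog(np.r_[np.zeros(n), np.ones(m)], A_ub=A_ub, b_ub=b_ub, bounds=free(n + m))

print(res_inf.x[:n], res_l1.x[:n])        # fitted coefficients for each criterion
```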
❑ Special case of a piecewise linear objective function: a separable piecewise linear objective function.
  A function f: R^n → R is called separable if f(x) = f_1(x_1) + f_2(x_2) + … + f_n(x_n).
  (Figure: a piecewise linear convex f_i(x_i) with breakpoints 0 < a_1 < a_2 < a_3 and segment slopes c_1 < c_2 < c_3 < c_4; the portions of x_i on the four segments are labeled x_{1i}, x_{2i}, x_{3i}, x_{4i}. Such a function can also be used as an approximation of a nonlinear function.)
❑ Express the variable x_i in the constraints as x_i ≡ x_{1i} + x_{2i} + x_{3i} + x_{4i}, where
      0 ≤ x_{1i} ≤ a_1, 0 ≤ x_{2i} ≤ a_2 − a_1, 0 ≤ x_{3i} ≤ a_3 − a_2, 0 ≤ x_{4i}
  In the objective function, use
      min  c_1 x_{1i} + c_2 x_{2i} + c_3 x_{3i} + c_4 x_{4i}
  Since we solve a min problem (and the slopes satisfy c_1 < c_2 < c_3 < c_4), it is guaranteed that x_{ki} > 0 in an optimal solution implies that the x_{ji}, j < k, are at their upper bounds.
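(A minimal sketch of this segment-variable trick for a single variable x; breakpoints, slopes, and the constraint are made up.)

```python
# Sketch: piecewise linear convex cost of one variable x via segment variables.
# Breakpoints a = (2, 4, 6), slopes c = (1, 2, 3, 4); made-up constraint x >= 5.
from scipy.optimize import linprog

c = [1.0, 2.0, 3.0, 4.0]                      # c1 < c2 < c3 < c4 (convexity)
seg_bounds = [(0.0, 2.0), (0.0, 2.0), (0.0, 2.0), (0.0, None)]  # segment lengths

# x = x1 + x2 + x3 + x4;  constraint x >= 5  ->  -(x1 + x2 + x3 + x4) <= -5
A_ub = [[-1.0, -1.0, -1.0, -1.0]]
b_ub = [-5.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=seg_bounds)
print(res.x, res.fun)     # segments fill in order: x = (2, 2, 1, 0), cost 9
```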
1.4 Graphical representation and solution
❑ Let a ∈ R^n, b ∈ R.
  Geometric intuition for the solution sets of
      {x : a'x = 0},  {x : a'x ≤ 0},  {x : a'x ≥ 0},
      {x : a'x = b},  {x : a'x ≤ b},  {x : a'x ≥ b}
❑ Geometry in 2-D
  (Figure: the vector a drawn from the origin; the line {x : a'x = 0} passes through 0 perpendicular to a, with the half-plane {x : a'x ≥ 0} on the side of a and {x : a'x ≤ 0} on the opposite side.)
❑ Let z be a (any) point satisfying a'x = b. Then
      {x : a'x = b} = {x : a'x = a'z} = {x : a'(x − z) = 0}
  Hence x − z = y, where y is any solution of a'y = 0, i.e., x = y + z.
  Similarly for {x : a'x ≤ b} and {x : a'x ≥ b}.
  (Figure: {x : a'x = b} is the hyperplane {x : a'x = 0} translated by z; {x : a'x ≥ b} is the half-space on the side of a and {x : a'x ≤ b} is the half-space on the opposite side.)
❑ min  c_1 x_1 + c_2 x_2
  s.t.  −x_1 + x_2 ≤ 1,  x_1 ≥ 0,  x_2 ≥ 0
  (Figure: the feasible region in the (x_1, x_2)-plane, the level lines {x : x_1 + x_2 = 0} and {x : x_1 + x_2 = z}, and several objective vectors c = (1, 1), c = (1, 0), c = (0, 1), c = (−1, −1); the optimal solution depends on the direction of c.)
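(A small numerical check of this example for two of the cost vectors, using scipy.optimize.linprog; status code 3 indicates an unbounded problem.)

```python
# Sketch: the example LP  min c1*x1 + c2*x2  s.t. -x1 + x2 <= 1, x1, x2 >= 0,
# solved for two of the cost vectors shown in the figure.
import numpy as np
from scipy.optimize import linprog

A_ub = np.array([[-1.0, 1.0]])    # -x1 + x2 <= 1
b_ub = np.array([1.0])

for c in ([1.0, 1.0], [-1.0, -1.0]):
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    # status 0: optimal solution found; status 3: problem is unbounded
    print(c, res.status, res.x if res.status == 0 else None)
```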
❑ Representing a complex solution set in 2-D
  (n variables, m equations (with linearly independent coefficient vectors), nonnegativity, and n − m = 2)
  (Figure: the feasible set drawn in the 2-dimensional set defined by the equations, bounded by the lines x_1 = 0, x_2 = 0, x_3 = 0.)
❑ See text Sections 1.5 and 1.6 for more background.