Nonlinear Programming: Introduction & Examples

Nonlinear Programming
Junlong Zhang
February 21, 2022
Junlong Zhang (Tsinghua IE)
Nonlinear Programming
Textbook
Practical Optimization: Algorithms and Engineering Applications
By Andreas Antoniou and Wu-Sheng Lu, Springer, 2007
A second edition is also available
Reference books
Convex Optimization
By Stephen Boyd and Lieven Vandenberghe, Cambridge University Press,
Tsinghua University library (online access)
Nonlinear Programming: Theory and Algorithms
By Mokhtar S. Bazaraa, Hanif D. Sherali, and Chitharanjan M. Shetty,
John Wiley & Sons, 2006
Nonlinear Programming
By Dimitri P. Bertsekas, Athena Scientific, 2016
Grading (subject to change)
Homework: 25%
Group project (slides and presentation): 15%
Mid-term exam: 20%
Final exam: 40%
This Class
1. Introductory Examples
2. The Basic Optimization Problem
3. The Feasible Region
4. Branches of Mathematical Programming
5. Types of Extrema
1. Introductory Examples: Portfolio Optimization
Objective: for an investment portfolio comprising n securities, design an
optimal portfolio that minimizes the risk involved subject to achieving an
acceptable expected return.
x_i: random parameter representing the return of security i at some
specified time in the future
µ_i: expected return of security i, µ_i = E[x_i]
σ_i^2: variance of the return of security i, σ_i^2 = E[(x_i − µ_i)^2]
ρ_ij: correlation between the returns of securities i and j,
ρ_ij = E[(x_i − µ_i)(x_j − µ_j)]/(σ_i σ_j)
µ*: acceptable expected return
Decision variable:
w_i: fraction of the available resources allocated to security i, 0 ≤ w_i ≤ 1
Objective function:
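risk of the investment, measured by the variance of the portfolio return:
\[
E\Big[\Big(\sum_{i=1}^{n} w_i x_i - E\Big[\sum_{i=1}^{n} w_i x_i\Big]\Big)^2\Big]
= \sum_{i=1}^{n} \sum_{j=1}^{n} (\sigma_i \sigma_j \rho_{ij})\, w_i w_j
\]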
Model formulation:
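\[
\begin{aligned}
\min_{w_i \ge 0,\ 1 \le i \le n} \quad & \sum_{i=1}^{n} \sum_{j=1}^{n} (\sigma_i \sigma_j \rho_{ij})\, w_i w_j \\
\text{subject to} \quad & \sum_{i=1}^{n} \mu_i w_i \ge \mu^* \\
& \sum_{i=1}^{n} w_i = 1
\end{aligned}
\]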
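To make the portfolio model concrete, here is a minimal Python sketch that evaluates the variance objective and solves a two-security instance by brute-force search over the weights. The data (`mu`, `sigma`, `rho`, `mu_star`) and the function names are hypothetical and chosen only for illustration; a grid search is used for transparency and is not a recommended solution method.

```python
def portfolio_variance(w, sigma, rho):
    """Return sum_i sum_j (sigma_i * sigma_j * rho_ij) * w_i * w_j."""
    n = len(w)
    return sum(
        sigma[i] * sigma[j] * rho[i][j] * w[i] * w[j]
        for i in range(n)
        for j in range(n)
    )


def min_variance_two_assets(mu, sigma, rho, mu_star, steps=1000):
    """Brute-force search over w = (w1, 1 - w1) with w1 on a uniform grid."""
    best_w, best_var = None, None
    for k in range(steps + 1):
        w = [k / steps, 1 - k / steps]          # w_i >= 0 and sum(w) = 1 hold by construction
        expected_return = mu[0] * w[0] + mu[1] * w[1]
        if expected_return < mu_star - 1e-12:   # violates the return constraint
            continue
        var = portfolio_variance(w, sigma, rho)
        if best_var is None or var < best_var:
            best_w, best_var = w, var
    return best_w, best_var                     # (None, None) if no grid point is feasible


# Hypothetical data for two securities (not from the slides).
mu = [0.10, 0.05]
sigma = [0.20, 0.10]
rho = [[1.0, 0.2], [0.2, 1.0]]
w_opt, var_opt = min_variance_two_assets(mu, sigma, rho, mu_star=0.07)
print(w_opt, var_opt)
```

With this data the unconstrained minimum-variance mix does not reach the required return, so the return constraint is active at the solution found.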
1. Introductory Examples: Text Classification
Problem: assign natural-language texts to predefined classes based on
their contents
Two classes of news articles: Sports and Politics
A news article with the headline “China’s Sui Wenjing and Han Cong
win gold in pairs figure skating” may be classified as Sports
A news article with the headline “Harris says US ‘stands with Ukraine’
while warning Russia of ‘swift, severe and united’ consequences” may be
classified as Politics
A machine learning approach:
Feature extraction:
Define a dictionary for each class, e.g., {athlete, baseball, basketball,
champion, gold, medal, Olympics, skating}
Vectorize the text, e.g., “China’s Sui Wenjing and Han Cong win gold in
pairs figure skating” is vectorized as (0, 0, 0, 0, 1, 0, 0, 1)
Construct a classification model
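The feature-extraction step above can be sketched in a few lines of Python. The tokenization rule (lowercase, alphabetic runs only) and the names `SPORTS_DICTIONARY` and `vectorize` are assumptions made for this illustration; the slides do not specify how the text is split into words.

```python
import re

# Dictionary for the Sports class, in the order given on the slide.
SPORTS_DICTIONARY = [
    "athlete", "baseball", "basketball", "champion",
    "gold", "medal", "olympics", "skating",
]


def vectorize(text, dictionary):
    """Return a 0/1 tuple: entry k is 1 if dictionary[k] occurs in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return tuple(1 if word in tokens else 0 for word in dictionary)


headline = "China's Sui Wenjing and Han Cong win gold in pairs figure skating"
print(vectorize(headline, SPORTS_DICTIONARY))  # (0, 0, 0, 0, 1, 0, 0, 1)
```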
Construct a classification model
{(x_1, y_1), ..., (x_n, y_n)}: a collection of examples
x_i: a vector representing the features of a text document
y_i: a label indicating whether the text document belongs (y_i = 1) or not
(y_i = −1) to a particular class
h: prediction function
R_n(h): empirical risk of misclassification, defined as
\[
R_n(h) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}[h(x_i) \ne y_i],
\quad \text{where } \mathbf{1}[A] =
\begin{cases} 1 & \text{if } A \text{ is true,} \\ 0 & \text{otherwise} \end{cases}
\]
Objective: search for a prediction function h that minimizes the
frequency of observed misclassifications
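As a small illustration of the empirical risk, the Python sketch below counts the fraction of misclassified examples for a given prediction function. The thresholded linear rule in `make_linear_classifier` and the four toy examples are hypothetical; they only serve to produce a labeled prediction in {−1, 1}.

```python
def empirical_risk(h, examples):
    """Fraction of examples (x, y) for which the prediction h(x) differs from y."""
    return sum(1 for x, y in examples if h(x) != y) / len(examples)


def make_linear_classifier(omega, tau):
    """Hypothetical rule: predict 1 if omega . x - tau >= 0, else -1."""
    def h(x):
        score = sum(w * xk for w, xk in zip(omega, x)) - tau
        return 1 if score >= 0 else -1
    return h


# Toy data: 8-dimensional 0/1 feature vectors with labels in {-1, 1}.
toy_examples = [
    ((0, 0, 0, 0, 1, 0, 0, 1), 1),
    ((0, 0, 0, 0, 0, 0, 0, 0), -1),
    ((1, 0, 0, 0, 0, 0, 0, 0), 1),
    ((0, 0, 0, 0, 0, 0, 0, 0), 1),   # no dictionary word, so the rule below misclassifies it
]
h = make_linear_classifier(omega=[1.0] * 8, tau=0.5)
print(empirical_risk(h, toy_examples))  # 0.25
```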
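Consider prediction functions of the form h(x; ω, τ) = ω^T x − τ
The indicator function 1[·] is not continuous, which makes R_n(h) hard to
minimize directly
Consider instead a log-loss function of the form
\[
\ell(h, y) = \log(1 + e^{-hy})
\]
Solve the convex optimization problem:
\[
\min_{(\omega, \tau) \in \mathbb{R}^d \times \mathbb{R}} \quad
\frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i; \omega, \tau), y_i) + \|\omega\|_2^2
\]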
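The sketch below shows one way such a problem could be attacked numerically: plain gradient descent with a fixed step size. It assumes the objective is the average log-loss plus the squared 2-norm of ω; the data, the step size, and the iteration count are hypothetical, and gradient descent is used here only as a simple illustrative method.

```python
import math


def objective(omega, tau, examples):
    """Average log-loss of h(x) = omega . x - tau plus the squared 2-norm of omega."""
    total = 0.0
    for x, y in examples:
        h = sum(w * xk for w, xk in zip(omega, x)) - tau
        total += math.log(1.0 + math.exp(-h * y))
    return total / len(examples) + sum(w * w for w in omega)


def gradient(omega, tau, examples):
    """Gradient of the objective with respect to (omega, tau)."""
    n = len(examples)
    g_omega = [2.0 * w for w in omega]           # gradient of the regularizer
    g_tau = 0.0
    for x, y in examples:
        h = sum(w * xk for w, xk in zip(omega, x)) - tau
        dl_dh = -y / (1.0 + math.exp(h * y))     # derivative of log(1 + exp(-h*y)) in h
        for k, xk in enumerate(x):
            g_omega[k] += dl_dh * xk / n         # dh/d(omega_k) = x_k
        g_tau += -dl_dh / n                      # dh/d(tau) = -1
    return g_omega, g_tau


def gradient_descent(examples, dim, step=0.1, iters=500):
    omega, tau = [0.0] * dim, 0.0
    for _ in range(iters):
        g_omega, g_tau = gradient(omega, tau, examples)
        omega = [w - step * g for w, g in zip(omega, g_omega)]
        tau -= step * g_tau
    return omega, tau


# Hypothetical 3-feature data with labels in {-1, 1}.
examples = [((1, 1, 0), 1), ((1, 0, 0), 1), ((0, 0, 1), -1), ((0, 0, 0), -1)]
omega, tau = gradient_descent(examples, dim=3)
print(objective([0.0] * 3, 0.0, examples), objective(omega, tau, examples))
```

The first printed value is the objective at the starting point (ω, τ) = (0, 0), which equals log 2; the second is the value after the descent iterations.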
2. The Basic Optimization Problem
x_1, x_2, ..., x_n: n independent decision variables that can be adjusted
f(x_1, x_2, ..., x_n): objective or cost function, f : R^n → R
The basic (unconstrained) optimization problem:
\[
\min_{x_1, \ldots, x_n \in \mathbb{R}} \quad f(x_1, x_2, \ldots, x_n)
\]
Let x be a column vector with
\[
x^T = [x_1 \ x_2 \ \cdots \ x_n]
\]
The basic (unconstrained) optimization problem in matrix notation:
\[
\min_{x \in \mathbb{R}^n} \quad f(x)
\]
A collection of equality and/or inequality constraints might be imposed
on the variable vector x, for example
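\[
\sum_{i=1}^{n} w_i = 1
\]
\[
x_1^2 + x_2^2 \le 4
\]
Given functions a_i : R^n → R and c_j : R^n → R, the general constrained
optimization problem is stated as:
\[
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & f(x) \\
\text{subject to} \quad & a_i(x) = 0, \quad \forall i = 1, \ldots, p \\
& c_j(x) \ge 0, \quad \forall j = 1, \ldots, q
\end{aligned}
\]
Note: restrictions on variables like x_i ≥ 0 and l_i ≤ x_i ≤ u_i might also be
treated as constraints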
3. The Feasible Region
Feasible point: any point x that satisfies both the equality and the
inequality constraints
Feasible region: the set of all feasible points
\[
S = \{x \in \mathbb{R}^n : a_i(x) = 0 \ \forall i = 1, \ldots, p \ \text{ and } \ c_j(x) \ge 0 \ \forall j = 1, \ldots, q\}
\]
The general constrained optimization problem:
\[
\min_{x \in S} \quad f(x)
\]
An optimal solution:
\[
x^* \in \arg\min_{x \in S} f(x)
\]
Note: there can be multiple optimal solutions to an optimization problem
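Membership in the feasible region S is straightforward to test numerically. The following Python sketch checks a point against lists of equality and inequality constraint functions; the tolerance `tol` and the two sample constraints (a sum-to-one equality and a disk of radius 2) are assumptions chosen for illustration.

```python
def is_feasible(x, equalities, inequalities, tol=1e-9):
    """True if a_i(x) = 0 for every equality and c_j(x) >= 0 for every inequality."""
    return (
        all(abs(a(x)) <= tol for a in equalities)
        and all(c(x) >= -tol for c in inequalities)
    )


# Hypothetical constraints in two variables:
#   a_1(x) = x_1 + x_2 - 1 = 0
#   c_1(x) = 4 - x_1^2 - x_2^2 >= 0   (i.e. x_1^2 + x_2^2 <= 4)
equalities = [lambda x: x[0] + x[1] - 1]
inequalities = [lambda x: 4 - x[0] ** 2 - x[1] ** 2]

print(is_feasible((0.5, 0.5), equalities, inequalities))   # True
print(is_feasible((3.0, -2.0), equalities, inequalities))  # False: outside the disk
print(is_feasible((1.0, 1.0), equalities, inequalities))   # False: equality violated
```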
Suppose that the constraints in an optimization problem are all inequalities,
i.e., p = 0 and q ≥ 1.
Interior point: a point x ∈ R^n for which c_j(x) > 0 for all j = 1, ..., q
Boundary point: a feasible point x ∈ R^n for which c_j(x) = 0 for at least one
j ∈ {1, ..., q}
Exterior point: a point x ∈ R^n for which c_j(x) < 0 for at least one
j ∈ {1, ..., q}
Active constraint: the jth constraint is active at x ∈ R^n if c_j(x) = 0
Constrained optimal solution: an optimal solution x* with c_j(x*) = 0 for some
j ∈ {1, ..., q}, i.e., at least one constraint is active at x*
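These definitions translate directly into a small classifier. The Python sketch below labels a point as interior, boundary, or exterior and lists the active constraints; the tolerance and the two sample constraints are hypothetical, and a boundary point is taken to be a feasible point with at least one active constraint.

```python
def classify_point(x, inequalities, tol=1e-9):
    """Label x relative to the region {x : c_j(x) >= 0 for all j}."""
    values = [c(x) for c in inequalities]
    if any(v < -tol for v in values):
        return "exterior"                  # some constraint is violated
    if any(abs(v) <= tol for v in values):
        return "boundary"                  # feasible, with at least one active constraint
    return "interior"                      # every constraint holds strictly


def active_set(x, inequalities, tol=1e-9):
    """1-based indices j of the constraints with c_j(x) = 0 (up to tol)."""
    return [j for j, c in enumerate(inequalities, start=1) if abs(c(x)) <= tol]


# Hypothetical constraints: c_1(x) = 4 - x_1^2 - x_2^2 >= 0 and c_2(x) = x_1 >= 0.
constraints = [lambda x: 4 - x[0] ** 2 - x[1] ** 2, lambda x: x[0]]

for point in [(1.0, 1.0), (2.0, 0.0), (0.0, 2.0), (3.0, 0.0)]:
    print(point, classify_point(point, constraints), active_set(point, constraints))
```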
Example 1
\[
\begin{aligned}
\min \quad & f(x) = x_1^2 + x_2^2 - 4x_1 + 4 \\
\text{s.t.} \quad & c_1(x) = x_1 - 2x_2 + 6 \ge 0 \\
& c_2(x) = -x_1^2 + x_2 - 1 \ge 0 \\
& c_3(x) = x_1 \ge 0 \\
& c_4(x) = x_2 \ge 0
\end{aligned}
\]
Contour: a set of points in the (x_1, x_2) plane for which f(x_1, x_2) is constant.
The optimal point is A, which is a constrained optimum point.
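Example 1 can be checked numerically without the figure. Note that f(x) = (x_1 − 2)^2 + x_2^2, so the unconstrained minimizer (2, 0) violates c_2; the constrained optimum must therefore lie on the boundary. The Python sketch below runs a coarse grid search over a box that contains the feasible region. The box size, grid spacing, and tolerance are assumptions, and the result is only accurate to about the grid spacing.

```python
def f(x1, x2):
    return x1 ** 2 + x2 ** 2 - 4 * x1 + 4


def constraint_values(x1, x2):
    """Values of c_1, ..., c_4 from Example 1; the point is feasible if all are >= 0."""
    return [x1 - 2 * x2 + 6, -x1 ** 2 + x2 - 1, x1, x2]


def grid_search(points_per_unit=100, x1_max=3, x2_max=5, tol=1e-9):
    """Return (f value, x1, x2) of the best feasible grid point in [0, x1_max] x [0, x2_max]."""
    best = None
    for i in range(x1_max * points_per_unit + 1):
        x1 = i / points_per_unit
        for j in range(x2_max * points_per_unit + 1):
            x2 = j / points_per_unit
            if min(constraint_values(x1, x2)) < -tol:
                continue                              # infeasible grid point
            value = f(x1, x2)
            if best is None or value < best[0]:
                best = (value, x1, x2)
    return best


best_value, best_x1, best_x2 = grid_search()
print(best_value, best_x1, best_x2)
print(constraint_values(best_x1, best_x2))            # c_2 is (nearly) zero at the result
```

Working along the active constraint c_2(x) = 0 by hand gives the stationarity condition 2x_1^3 + 3x_1 − 2 = 0, whose root puts the optimum at roughly x ≈ (0.55, 1.31); the grid result should land close to this point.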
Example 2
\[
\begin{aligned}
\min \quad & f(x) = x_1^2 + x_2^2 + 2x_2 \\
\text{s.t.} \quad & a_1(x) = x_1^2 + x_2^2 - 1 = 0 \\
& c_1(x) = x_1 + x_2 - 0.5 \ge 0 \\
& c_2(x) = x_1 \ge 0 \\
& c_3(x) = x_2 \ge 0
\end{aligned}
\]
The optimal point is A, which is a constrained optimum point.
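Example 2 can be solved by hand using only the constraints stated above. The short derivation below is a sketch; identifying the resulting point with the point labeled A is an assumption, since the figure is not reproduced here.

```latex
% On the feasible set the equality constraint gives x_1^2 + x_2^2 = 1, hence
\[
f(x) = (x_1^2 + x_2^2) + 2x_2 = 1 + 2x_2 .
\]
% Minimizing f therefore amounts to making x_2 as small as c_3(x) = x_2 >= 0 allows:
\[
x_2^* = 0, \qquad x_1^* = 1 \ \ (\text{from } x_1^2 = 1 \text{ and } x_1 \ge 0), \qquad f(x^*) = 1 .
\]
% Feasibility check at x^* = (1, 0):
\[
a_1(x^*) = 0, \qquad c_1(x^*) = 0.5 > 0, \qquad c_2(x^*) = 1 > 0, \qquad c_3(x^*) = 0 \ \ (\text{active}).
\]
```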
4. Branches of Mathematical Programming
Linear Programming
Standard form
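\[
\begin{aligned}
\min \quad & c^T x \\
\text{subject to} \quad & Ax = b \\
& x \ge 0
\end{aligned}
\]
General form (or alternative form)
\[
\begin{aligned}
\min \quad & c^T x \\
\text{subject to} \quad & Ax \ge b
\end{aligned}
\]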
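The two forms are interchangeable. As a sketch, assuming the standard form also requires nonnegative variables (the usual convention), a general-form problem can be rewritten by splitting x into two nonnegative parts and adding surplus variables; the symbols x^+, x^- and s are introduced here for illustration.

```latex
% General form:  min c^T x  subject to  Ax >= b,  with x free.
% Write x = x^+ - x^-  with  x^+, x^- >= 0,  and introduce surplus variables s >= 0:
\[
\begin{aligned}
\min_{x^+,\, x^-,\, s} \quad & c^T x^+ - c^T x^- \\
\text{subject to} \quad & A x^+ - A x^- - s = b \\
& x^+ \ge 0, \quad x^- \ge 0, \quad s \ge 0
\end{aligned}
\]
% This is a standard-form problem in the stacked variable (x^+, x^-, s).
```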
Integer Programming
Pure Integer Linear Programming
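\[
\begin{aligned}
\min \quad & c^T x \\
\text{subject to} \quad & Ax \ge b \\
& x \in \mathbb{Z}_+^{n}
\end{aligned}
\]
Mixed-Integer Linear Programming
\[
\begin{aligned}
\min \quad & c^T x + h^T y \\
\text{subject to} \quad & Ax + Gy \ge b \\
& x \in \mathbb{Z}_+^{n_1}, \quad y \in \mathbb{R}_+^{n_2}
\end{aligned}
\]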
Quadratic Programming
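\[
\begin{aligned}
\min \quad & c^T x + x^T Q x \\
\text{subject to} \quad & Ax = a \\
& Bx \ge b
\end{aligned}
\]
Quadratically Constrained Quadratic Programming
\[
\begin{aligned}
\min \quad & c^T x + x^T Q x \\
\text{subject to} \quad & x^T P^i x + (q^i)^T x + r_i \le 0, \quad \forall i = 1, \ldots, m
\end{aligned}
\]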
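The portfolio optimization model from the introductory examples is an instance of quadratic programming. The mapping below is a sketch that matches the portfolio data to the QP template above; the stacked matrix B and the symbols \mathbf{1} and I_n are notation introduced here.

```latex
% Portfolio model as a QP in the variable x = w:
\[
c = 0, \qquad Q_{ij} = \sigma_i \sigma_j \rho_{ij}, \qquad
A = \mathbf{1}^T, \quad a = 1, \qquad
B = \begin{bmatrix} \mu^T \\ I_n \end{bmatrix}, \quad
b = \begin{bmatrix} \mu^* \\ 0 \end{bmatrix}.
\]
% Ax = a   encodes  \sum_i w_i = 1;
% Bx >= b  encodes  \sum_i \mu_i w_i >= \mu^*  and  w_i >= 0 for all i.
```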
Convex Programming
\[
\begin{aligned}
\min \quad & f(x) \\
\text{subject to} \quad & c_j(x) \le 0, \quad \forall j = 1, \ldots, q,
\end{aligned}
\]
where the functions f, c_1, ..., c_q : R^n → R are convex.
Multiobjective Optimization (Vector Optimization)
\[
\min \quad f(x) = (f_1(x), \ldots, f_m(x))
\]
5. Types of Extrema
The extrema of a function are its minima and maxima.
Minimizers (maximizers) are points at which a function attains its minima (maxima).
Definition 1 Weak local minimizer
A point x* ∈ S, where S is the feasible region, is said to be a weak local
minimizer of f(x) if there exists a distance ε > 0 such that
\[
f(x) \ge f(x^*)
\]
for all x ∈ S with ∥x − x*∥_2 < ε.
Definition 2 Weak global minimizer
A point x* ∈ S is said to be a weak global minimizer of f(x) if
\[
f(x) \ge f(x^*)
\]
for all x ∈ S.
Definition 3 Strong local minimizer
A point x* ∈ S, where S is the feasible region, is said to be a strong local
minimizer of f(x) if there exists a distance ε > 0 such that
\[
f(x) > f(x^*)
\]
for all x ∈ S with x ≠ x* and ∥x − x*∥_2 < ε.
Definition 4 Strong global minimizer
A point x* ∈ S is said to be a strong global minimizer of f(x) if
\[
f(x) > f(x^*)
\]
for all x ∈ S with x ≠ x*.
Global minimizer ⇒ Local minimizer
Global minimizer ⇐ Local minimizer ?
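A one-dimensional example suggests that the converse does not hold in general. The Python sketch below scans a hypothetical function f(x) = x^4 − 3x^2 + x on S = [−2, 2] (chosen for illustration, not taken from the slides) and reports the grid points that are lower than both neighbors. It finds two local minimizers, only one of which is global.

```python
def f(x):
    return x ** 4 - 3 * x ** 2 + x


def local_minimizers_on_grid(func, lo, hi, steps):
    """Grid points in (lo, hi) whose value is strictly below both neighboring grid values."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    ys = [func(x) for x in xs]
    return [
        xs[k]
        for k in range(1, steps)
        if ys[k] < ys[k - 1] and ys[k] < ys[k + 1]
    ]


local_mins = local_minimizers_on_grid(f, -2.0, 2.0, 4000)
global_min = min(local_mins, key=f)   # valid here because the endpoints have larger f

for x in local_mins:
    kind = "global" if x == global_min else "local only"
    print(round(x, 3), round(f(x), 3), kind)
```

The minimizer near x ≈ 1.13 is a strong local minimizer but not a global one, since f takes a smaller value near x ≈ −1.30.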
[Figure: a strong global minimizer]
[Figure: types of minima]
Read Chapter 1 of the textbook
