Optim Lett
DOI 10.1007/s11590-008-0090-9
ORIGINAL PAPER
A smoothing algorithm for finite min–max–min problems
Angelos Tsoukalas · Panos Parpas · Berç Rustem
Received: 19 February 2008 / Accepted: 27 May 2008
© Springer-Verlag 2008
Abstract We generalize a smoothing algorithm for finite min–max to finite min–
max–min problems. We apply a smoothing technique twice, once to eliminate the inner
min operator and once to eliminate the max operator. In min–max problems, where only the max operator is eliminated, the approximation function is decreasing with respect to the smoothing parameter. Such a property is convenient for establishing algorithm
convergence, but it does not hold when both operators are eliminated. To maintain
the desired property, an additional term is added to the approximation. We establish
convergence of a steepest descent algorithm and provide a numerical example.
1 Introduction
We propose an algorithm for the finite min–max–min problem
$$\min_{x\in\mathbb{R}^n} \Phi(x) = \max_{i\in I}\,\min_{j\in J}\, f_{i,j}(x)$$
with finite sets $I$, $J$ and continuously differentiable functions $f_{i,j}$ with bounded first derivatives for all $i\in I$, $j\in J$.
The min–max–min problem has applications in multiple fields including computer
aided design [1], facility location [2] and civil engineering [3].
A. Tsoukalas (B) · P. Parpas · B. Rustem
Department of Computing, Imperial College, London, UK
e-mail: at102@doc.ic.ac.uk
P. Parpas
e-mail: pp500@doc.ic.ac.uk
B. Rustem
e-mail: br@doc.ic.ac.uk
To tackle the problem, we generalize the smoothing technique described in [3–7], which approximates a max function
$$\max_i g_i(x)$$
with the smooth approximation
$$\frac{1}{\lambda}\ln\left(\sum_i \exp\big(\lambda\, g_i(x)\big)\right), \qquad \lambda > 0.$$
The same idea, where in this case the maximization is over an interval and an integral appears in place of the summation, appears in Laplace's method, which studies the asymptotic behavior of integrals [8]. In the context of optimization, it has been used in convergence proofs of simulated annealing methods [9] to describe a distribution over the state space of solutions that converges to the global optimum as the parameter increases [10].
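As a small illustration of this smoothing, the sketch below (our own; Python with NumPy, with an arbitrary vector of values $g_i$) evaluates the log-sum-exp approximation and checks that it exceeds the true maximum by at most $\ln(m)/\lambda$, where $m$ is the number of terms.

```python
import numpy as np

def smooth_max(g, lam):
    """Log-sum-exp approximation (1/lam) * ln(sum_i exp(lam * g_i)) of max_i g_i."""
    g = np.asarray(g, dtype=float)
    m = g.max()
    # Shifting by the true maximum avoids overflow; the shift cancels analytically.
    return m + np.log(np.sum(np.exp(lam * (g - m)))) / lam

g = [1.0, 2.5, 2.4, -3.0]
for lam in (1.0, 10.0, 100.0):
    approx = smooth_max(g, lam)
    # max g <= approx <= max g + ln(m)/lam, so the error vanishes as lam grows.
    print(lam, approx, max(g), max(g) + np.log(len(g)) / lam)
```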
This smoothing technique has been used to solve the minimax problem in [5–7] by transforming it into a smooth minimization problem. In [7] an adaptive method for the update of the parameter is used, which provides robustness in view of potential ill-conditioning problems. Furthermore, the same technique has been used to transform the min–max–min problem into minimax problems [3], which are solved with a different method.
In Sect. 2 we extend this methodology to eliminate both the inner minimization and the maximization operators and reduce the min–max–min problem to the minimization of a smooth function. In Sect. 3 we provide a steepest descent algorithm for the approximation function and show that it converges to a point satisfying a first order optimality condition for the original min–max–min problem. In Sect. 4 we provide a computational example.
2 Smoothing of max–min functions
Proposition 2.1 provides a smooth approximation of $\max_{i\in I}\min_{j\in J} f_{i,j}(x)$.
Proposition 2.1 Let
$$\Phi(x) = \max_{i\in I}\,\min_{j\in J}\, f_{i,j}(x).$$
Let $M_I = |I|$, $M_J = |J|$ and $\lambda_I > 0$, $\lambda_J < 0$. Then
$$\Phi(x) + \frac{\ln(M_J)}{\lambda_J} \;\le\; \frac{1}{\lambda_I}\ln\left(\sum_{i\in I}\left[\sum_{j\in J}\exp\big(\lambda_J f_{i,j}(x)\big)\right]^{\lambda_I/\lambda_J}\right) \;\le\; \Phi(x) + \frac{\ln(M_I)}{\lambda_I}. \tag{1}$$
Proof Let
$$\phi_i(x) = \min_{j\in J} f_{i,j}(x).$$
Then
$$\phi_i \le f_{i,j}, \quad \forall i\in I,\ j\in J,$$
$$\exp(\lambda_J\phi_i) \le \sum_{j\in J}\exp(\lambda_J f_{i,j}) \le M_J\exp(\lambda_J\phi_i), \quad \forall i\in I,$$
$$\lambda_J\phi_i \le \ln\left(\sum_{j\in J}\exp(\lambda_J f_{i,j})\right) \le \ln(M_J) + \lambda_J\phi_i, \quad \forall i\in I,$$
$$\phi_i + \frac{1}{\lambda_J}\ln(M_J) \le \frac{1}{\lambda_J}\ln\left(\sum_{j\in J}\exp(\lambda_J f_{i,j})\right) \le \phi_i, \quad \forall i\in I,$$
$$\lambda_I\phi_i + \frac{\lambda_I}{\lambda_J}\ln(M_J) \le \ln\left(\left[\sum_{j\in J}\exp(\lambda_J f_{i,j})\right]^{\lambda_I/\lambda_J}\right) \le \lambda_I\phi_i, \quad \forall i\in I,$$
$$M_J^{\lambda_I/\lambda_J}\exp(\lambda_I\phi_i) \le \left[\sum_{j\in J}\exp(\lambda_J f_{i,j})\right]^{\lambda_I/\lambda_J} \le \exp(\lambda_I\phi_i), \quad \forall i\in I,$$
$$\ln\left(\sum_{i\in I}\exp(\lambda_I\phi_i)\right) + \frac{\lambda_I}{\lambda_J}\ln(M_J) \le \ln\left(\sum_{i\in I}\left[\sum_{j\in J}\exp\big(\lambda_J f_{i,j}\big)\right]^{\lambda_I/\lambda_J}\right) \le \ln\left(\sum_{i\in I}\exp(\lambda_I\phi_i)\right). \tag{2}$$
We have
$$\Phi = \max_{i\in I}\phi_i,$$
$$\exp(\lambda_I\Phi) \le \sum_{i\in I}\exp(\lambda_I\phi_i) \le M_I\exp(\lambda_I\Phi),$$
$$\lambda_I\Phi \le \ln\left(\sum_{i\in I}\exp(\lambda_I\phi_i)\right) \le \ln(M_I) + \lambda_I\Phi. \tag{3}$$
From (2) and (3), we have
$$\ln(M_I) + \lambda_I\Phi \ge \ln\left(\sum_{i\in I}\exp(\lambda_I\phi_i)\right) \ge \ln\left(\sum_{i\in I}\left[\sum_{j\in J}\exp\big(\lambda_J f_{i,j}\big)\right]^{\lambda_I/\lambda_J}\right) \ge \ln\left(\sum_{i\in I}\exp(\lambda_I\phi_i)\right) + \frac{\lambda_I}{\lambda_J}\ln(M_J) \ge \lambda_I\Phi + \frac{\lambda_I}{\lambda_J}\ln(M_J),$$
and dividing by $\lambda_I > 0$,
$$\Phi + \frac{1}{\lambda_J}\ln(M_J) \le \frac{1}{\lambda_I}\ln\left(\sum_{i\in I}\left[\sum_{j\in J}\exp\big(\lambda_J f_{i,j}\big)\right]^{\lambda_I/\lambda_J}\right) \le \frac{1}{\lambda_I}\ln(M_I) + \Phi.$$

Set
$$\Phi_{\lambda_I,\lambda_J}(x) = \frac{1}{\lambda_I}\ln\left(\sum_{i\in I}\left[\sum_{j\in J}\exp\big(\lambda_J f_{i,j}(x)\big)\right]^{\lambda_I/\lambda_J}\right).$$
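To make the definition concrete, here is a small numerical sketch (our own illustration, not from the paper; Python/NumPy with a hypothetical $2\times 2$ family of functions $f_{i,j}$) that evaluates $\Phi_{\lambda_I,\lambda_J}(x)$ and checks the bounds of Proposition 2.1.

```python
import numpy as np

# Hypothetical test functions f_{i,j}(x); any smooth functions would do.
F = [[lambda x: (x - 1.0) ** 2, lambda x: x + 2.0],
     [lambda x: -x,             lambda x: 0.5 * x ** 2 - 1.0]]

def phi(x):
    """Exact max-min value Phi(x) = max_i min_j f_{i,j}(x)."""
    return max(min(f(x) for f in row) for row in F)

def phi_smooth(x, lam_I, lam_J):
    """(1/lam_I) ln sum_i [ sum_j exp(lam_J f_ij) ]^(lam_I/lam_J), with lam_I > 0 > lam_J."""
    assert lam_I > 0 and lam_J < 0
    inner = [np.sum(np.exp([lam_J * f(x) for f in row])) for row in F]
    return np.log(np.sum(np.power(inner, lam_I / lam_J))) / lam_I

x, lam_I, lam_J = 0.3, 20.0, -20.0
M_I, M_J = len(F), len(F[0])
val = phi_smooth(x, lam_I, lam_J)
# Proposition 2.1: Phi + ln(M_J)/lam_J <= approximation <= Phi + ln(M_I)/lam_I.
print(phi(x) + np.log(M_J) / lam_J, "<=", val, "<=", phi(x) + np.log(M_I) / lam_I)
```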
It is shown in [5] that the convergence of the approximation method described in the Introduction is monotonic. Proposition 2.2 shows that this is not the case when we eliminate both operators.
Proposition 2.2 For $\lambda_{J_2} < \lambda_{J_1} < 0$,
$$\Phi_{\lambda_I,\lambda_{J_1}}(x) \le \Phi_{\lambda_I,\lambda_{J_2}}(x),$$
and for $\lambda_{I_2} > \lambda_{I_1} > 0$,
$$\Phi_{\lambda_{I_1},\lambda_J}(x) \ge \Phi_{\lambda_{I_2},\lambda_J}(x).$$

Proof Let
$$g_i(x,\lambda_J) = \left(\sum_{j\in J}\exp\big(\lambda_J f_{i,j}(x)\big)\right)^{1/\lambda_J} > 0,$$
so that $\Phi_{\lambda_I,\lambda_J}(x) = \frac{1}{\lambda_I}\ln\big(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_I}\big)$. Then
$$\frac{g_i(x,\lambda_{J_2})}{g_i(x,\lambda_{J_1})} = \frac{\left(\sum_{j\in J}\exp(\lambda_{J_2} f_{i,j}(x))\right)^{1/\lambda_{J_2}}}{\left(\sum_{j\in J}\exp(\lambda_{J_1} f_{i,j}(x))\right)^{1/\lambda_{J_1}}} = \left[\frac{\sum_{j\in J}\exp(\lambda_{J_2} f_{i,j}(x))}{\left(\sum_{j\in J}\exp(\lambda_{J_1} f_{i,j}(x))\right)^{\lambda_{J_2}/\lambda_{J_1}}}\right]^{1/\lambda_{J_2}} = \left[\sum_{j\in J}\left(\frac{\exp(\lambda_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\lambda_{J_1} f_{i,k}(x))}\right)^{\lambda_{J_2}/\lambda_{J_1}}\right]^{1/\lambda_{J_2}}.$$
Since
$$\frac{\lambda_{J_2}}{\lambda_{J_1}} > 1, \qquad \frac{\exp(\lambda_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\lambda_{J_1} f_{i,k}(x))} < 1 \quad\text{and}\quad \lambda_{J_2} < 0,$$
we have
$$\frac{g_i(x,\lambda_{J_2})}{g_i(x,\lambda_{J_1})} \ge \left[\sum_{j\in J}\frac{\exp(\lambda_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\lambda_{J_1} f_{i,k}(x))}\right]^{1/\lambda_{J_2}} = 1.$$
Therefore
$$g_i(x,\lambda_{J_2}) \ge g_i(x,\lambda_{J_1}), \quad\forall i\in I,$$
$$\frac{1}{\lambda_I}\ln\left(\sum_{i\in I} g_i(x,\lambda_{J_2})^{\lambda_I}\right) \ge \frac{1}{\lambda_I}\ln\left(\sum_{i\in I} g_i(x,\lambda_{J_1})^{\lambda_I}\right),$$
$$\Phi_{\lambda_I,\lambda_{J_2}}(x) \ge \Phi_{\lambda_I,\lambda_{J_1}}(x).$$
For the second part we have
$$\frac{\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_2}}\right)^{1/\lambda_{I_2}}}{\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_1}}\right)^{1/\lambda_{I_1}}} = \left[\frac{\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_2}}}{\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_1}}\right)^{\lambda_{I_2}/\lambda_{I_1}}}\right]^{1/\lambda_{I_2}} = \left[\sum_{i\in I}\left(\frac{g_i(x,\lambda_J)^{\lambda_{I_1}}}{\sum_{k\in I} g_k(x,\lambda_J)^{\lambda_{I_1}}}\right)^{\lambda_{I_2}/\lambda_{I_1}}\right]^{1/\lambda_{I_2}} \le 1.$$
Therefore
$$\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_2}}\right)^{1/\lambda_{I_2}} \le \left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_1}}\right)^{1/\lambda_{I_1}},$$
$$\ln\left(\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_2}}\right)^{1/\lambda_{I_2}}\right) \le \ln\left(\left(\sum_{i\in I} g_i(x,\lambda_J)^{\lambda_{I_1}}\right)^{1/\lambda_{I_1}}\right),$$
$$\Phi_{\lambda_{I_2},\lambda_J}(x) \le \Phi_{\lambda_{I_1},\lambda_J}(x).$$
We see that when we increase $|\lambda_I|$, $\Phi_{\lambda_I,\lambda_J}$ decreases, and when we increase $|\lambda_J|$, $\Phi_{\lambda_I,\lambda_J}$ increases. In order to prove convergence of the local optimization algorithm it would be convenient if the approximation function were decreasing in both $|\lambda_I|$ and $|\lambda_J|$. Consider the alternative approximation function
$$\bar\Phi_{\lambda_I,\lambda_J}(x) = \Phi_{\lambda_I,\lambda_J}(x) - \frac{\ln(M_J)}{\lambda_J}.$$
It is obvious that $\bar\Phi_{\lambda_I,\lambda_J}$ decreases as $|\lambda_I|$ increases. Next we examine what happens when we increase $|\lambda_J|$.
Lemma 2.3 For all $b > 1$ and $0 \le a_i \le 1$, $i = 1,\ldots,m-1$, we have
$$\left(\frac{1}{1+\sum_{i=1}^{m-1}a_i}\right)^{b} + \sum_{i=1}^{m-1}\left(\frac{a_i}{1+\sum_{k=1}^{m-1}a_k}\right)^{b} \ge m^{1-b}.$$

Proof Let
$$f(a_1,a_2,\ldots,a_{m-1}) = 1 + \sum_{i=1}^{m-1}a_i^b - m^{1-b}\left(1+\sum_{i=1}^{m-1}a_i\right)^{b}, \quad 0\le a_i\le 1.$$
The set $\{(a_1,\ldots,a_{m-1}) : 0\le a_i\le 1\}$ is closed and bounded and therefore $f$ has a global minimum $a^*$.
The partial derivatives of $f$ are
$$\frac{\partial f}{\partial a_j} = b\,a_j^{b-1} - m^{1-b}\,b\left(1+\sum_{i=1}^{m-1}a_i\right)^{b-1}.$$
Since
$$\frac{\partial f(a_1,\ldots,a_j=0,\ldots,a_{m-1})}{\partial a_j} < 0,$$
$a_i^*$ is strictly positive for all $i$. Solving the system
$$\frac{\partial f}{\partial a_j} = 0, \quad \forall j = 1,\ldots,m-1,$$
we obtain the unique solution $a_j = 1$ for all $j$. Since this boundary point is the only stationary point and there is no other boundary point eligible for a global minimum, we have $a_j^* = 1$ for all $j$.
As a result we obtain
$$f(a_1,\ldots,a_{m-1}) \ge f(1,1,\ldots,1) = 1 + m - 1 - m^{1-b}m^{b} = 0.$$
Dividing by $\left(1+\sum_{i=1}^{m-1}a_i\right)^{b}$ and re-arranging, we have
$$\left(\frac{1}{1+\sum_{i=1}^{m-1}a_i}\right)^{b} + \sum_{i=1}^{m-1}\left(\frac{a_i}{1+\sum_{k=1}^{m-1}a_k}\right)^{b} \ge m^{1-b}.$$
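A quick numerical sanity check of Lemma 2.3 (our own sketch, not part of the paper) samples random points of the cube and verifies the inequality:

```python
import numpy as np

rng = np.random.default_rng(0)

def lemma_lhs(a, b):
    """Left-hand side of Lemma 2.3 for a in [0,1]^(m-1) and b > 1."""
    s = 1.0 + np.sum(a)
    return (1.0 / s) ** b + np.sum((a / s) ** b)

for _ in range(5):
    m = rng.integers(2, 6)          # number of terms, m >= 2
    b = 1.0 + 3.0 * rng.random()    # exponent b > 1
    a = rng.random(m - 1)           # coordinates in [0, 1]
    assert lemma_lhs(a, b) >= m ** (1.0 - b) - 1e-12
```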
Proposition 2.4 For $\lambda_{J_2} < \lambda_{J_1} < 0$,
$$\bar\Phi_{\lambda_I,\lambda_{J_1}}(x) \ge \bar\Phi_{\lambda_I,\lambda_{J_2}}(x).$$
Proof Let
$$\bar g_i(x,\lambda_J) = \left(\sum_{j\in J}\exp\big(\lambda_J f_{i,j}(x)\big)\right)^{1/\lambda_J} M_J^{-1/\lambda_J} > 0,$$
so that $\bar\Phi_{\lambda_I,\lambda_J}(x) = \frac{1}{\lambda_I}\ln\big(\sum_{i\in I}\bar g_i(x,\lambda_J)^{\lambda_I}\big)$. Then
$$\frac{\bar g_i(x,\lambda_{J_2})}{\bar g_i(x,\lambda_{J_1})} = \frac{M_J^{1/\lambda_{J_1}}\left(\sum_{j\in J}\exp(\lambda_{J_2} f_{i,j}(x))\right)^{1/\lambda_{J_2}}}{M_J^{1/\lambda_{J_2}}\left(\sum_{j\in J}\exp(\lambda_{J_1} f_{i,j}(x))\right)^{1/\lambda_{J_1}}} = \left[\sum_{j\in J}\left(\frac{\exp(\lambda_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\lambda_{J_1} f_{i,k}(x))}\right)^{\lambda_{J_2}/\lambda_{J_1}}\right]^{1/\lambda_{J_2}}\frac{M_J^{1/\lambda_{J_1}}}{M_J^{1/\lambda_{J_2}}}.$$
Let
$$f_i^o(x) = \min_{j\in J} f_{i,j}(x) \quad\text{and}\quad a_{i,j} = \frac{\exp\big(\lambda_{J_1} f_{i,j}(x)\big)}{\exp\big(\lambda_{J_1} f_i^o(x)\big)}.$$
We have $0 \le a_{i,j} \le 1$ for all $i, j$, and for every $i$ there exists $j$ with $a_{i,j} = 1$. Therefore
$$\frac{\bar g_i(x,\lambda_{J_2})}{\bar g_i(x,\lambda_{J_1})} = \left[\sum_{j\in J}\left(\frac{a_{i,j}\exp\big(\lambda_{J_1} f_i^o(x)\big)}{\sum_{k\in J} a_{i,k}\exp\big(\lambda_{J_1} f_i^o(x)\big)}\right)^{\lambda_{J_2}/\lambda_{J_1}}\right]^{1/\lambda_{J_2}}\frac{M_J^{1/\lambda_{J_1}}}{M_J^{1/\lambda_{J_2}}} = \left[\sum_{j\in J}\left(\frac{a_{i,j}}{\sum_{k\in J} a_{i,k}}\right)^{\lambda_{J_2}/\lambda_{J_1}}\right]^{1/\lambda_{J_2}}\frac{M_J^{1/\lambda_{J_1}}}{M_J^{1/\lambda_{J_2}}} \le \left(M_J^{\,1-\frac{\lambda_{J_2}}{\lambda_{J_1}}}\right)^{1/\lambda_{J_2}}\frac{M_J^{1/\lambda_{J_1}}}{M_J^{1/\lambda_{J_2}}} = 1,$$
where the inequality holds due to Lemma 2.3 (applied with $m = M_J$) and the facts that $\frac{\lambda_{J_2}}{\lambda_{J_1}} > 1$ and $\frac{1}{\lambda_{J_2}} < 0$.
Therefore,
$$\bar g_i(x,\lambda_{J_2}) \le \bar g_i(x,\lambda_{J_1}), \quad\forall i\in I,$$
$$\bar\Phi_{\lambda_I,\lambda_{J_2}}(x) \le \bar\Phi_{\lambda_I,\lambda_{J_1}}(x).$$
Choosing
$$\lambda_I = -\lambda_J = \lambda > 0,$$
we obtain the approximation function
$$\bar\Phi_\lambda(x) = \frac{1}{\lambda}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp\big(-\lambda f_{i,j}(x)\big)}\right) + \frac{\ln(M_J)}{\lambda},$$
which is a decreasing function of $\lambda$.
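The following sketch (our own illustration in Python/NumPy, reusing the hypothetical $f_{i,j}$ of the earlier sketch) evaluates $\bar\Phi_\lambda(x)$ for increasing $\lambda$ and illustrates that the values decrease towards $\Phi(x)$.

```python
import numpy as np

F = [[lambda x: (x - 1.0) ** 2, lambda x: x + 2.0],
     [lambda x: -x,             lambda x: 0.5 * x ** 2 - 1.0]]

def phi_bar(x, lam):
    """bar-Phi_lambda(x) = (1/lam) ln( sum_i 1 / sum_j exp(-lam f_ij(x)) ) + ln(M_J)/lam."""
    M_J = len(F[0])
    inner = [np.sum(np.exp([-lam * f(x) for f in row])) for row in F]
    return np.log(np.sum(1.0 / np.asarray(inner))) / lam + np.log(M_J) / lam

x = 0.3
exact = max(min(f(x) for f in row) for row in F)   # Phi(x)
values = [phi_bar(x, lam) for lam in (1.0, 2.0, 5.0, 20.0, 100.0)]
# Each value upper-bounds Phi(x) and the sequence is non-increasing in lambda.
print(exact, values)
```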
3 A steepest-descent algorithm
The following theorem from [11] gives a first order optimality condition for min–max–min problems.

Theorem 3.1 If the $f_{i,j}$ are continuously differentiable and $\hat x$ is a local minimizer of $\Phi(x)$, then $0 \in \bar G(\hat x)$, where
$$\bar G(x) = \operatorname{conv}_{i\in I}\left\{\operatorname{conv}_{j\in J}\left\{\begin{bmatrix} f_{i,j}(x) - \phi_i(x) \\ \Phi(x) - \phi_i(x) \\ \nabla f_{i,j}(x)\end{bmatrix}\right\}\right\}.$$
We have
$$\nabla\bar\Phi_\lambda(x) = \sum_{i\in I,\,j\in J}\mu_{i,j}^\lambda\,\nabla f_{i,j}(x), \tag{4}$$
where
$$\mu_{i,j}^\lambda = \frac{\exp\big(-\lambda f_{i,j}(x)\big)\Big/\Big(\sum_{k\in J}\exp\big(-\lambda f_{i,k}(x)\big)\Big)^{2}}{\sum_{\rho\in I}\dfrac{1}{\sum_{k\in J}\exp\big(-\lambda f_{\rho,k}(x)\big)}}. \tag{5}$$
Note that
$$\sum_{i\in I,\,j\in J}\mu_{i,j}^\lambda = 1, \quad\forall\lambda > 0,$$
and
$$\lim_{\lambda\to\infty}\mu_{i,j}^\lambda = 0 \quad\text{for all } (i,j) \text{ with } i\notin\hat I \text{ or } j\notin\hat J_i,$$
where
$$\hat I = \{i\in I : \Phi(x) = \min_{j\in J} f_{i,j}(x)\}, \qquad \hat J_i = \{j\in J : f_{i,j}(x) = \min_{k\in J} f_{i,k}(x)\}.$$

Corollary 3.2 If
$$\lim_{\lambda\to\infty}\nabla\bar\Phi_\lambda(\hat x) = 0,$$
then $\hat x$ satisfies the stationarity condition of Theorem 3.1.
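The gradient formula (4)–(5) is straightforward to check against a finite-difference quotient; the sketch below (our own, with hypothetical one-dimensional $f_{i,j}$ and their derivatives) prints the two values side by side.

```python
import numpy as np

lam = 5.0
# Hypothetical functions f_{i,j} and their derivatives (one-dimensional x).
F  = [[lambda x: (x - 1.0) ** 2, lambda x: x + 2.0],
      [lambda x: -x,             lambda x: 0.5 * x ** 2 - 1.0]]
dF = [[lambda x: 2.0 * (x - 1.0), lambda x: 1.0],
      [lambda x: -1.0,            lambda x: x]]

def phi_bar(x):
    inner = [np.sum(np.exp([-lam * f(x) for f in row])) for row in F]
    return np.log(np.sum(1.0 / np.asarray(inner))) / lam + np.log(len(F[0])) / lam

def grad_phi_bar(x):
    """Gradient via (4)-(5): sum_{i,j} mu_ij * f'_{i,j}(x)."""
    S = [np.sum(np.exp([-lam * f(x) for f in row])) for row in F]   # inner sums
    T = np.sum(1.0 / np.asarray(S))                                  # denominator of (5)
    g = 0.0
    for i, row in enumerate(F):
        for j, f in enumerate(row):
            mu = np.exp(-lam * f(x)) / S[i] ** 2 / T                 # weight (5)
            g += mu * dF[i][j](x)
    return g

x, h = 0.3, 1e-6
print(grad_phi_bar(x), (phi_bar(x + h) - phi_bar(x - h)) / (2 * h))
```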
We present the following Armijo gradient algorithm (Algorithm 1).
Concerning step 7, we remark that any increasing updating rule $u(\lambda_k)$ can be used as long as
$$\lim_{k\to\infty}\lambda_k = \infty. \tag{6}$$
For example, we can choose $u(\lambda_k) = \lambda_k + \gamma$. Condition (6) is also satisfied by the adaptive parameter update proposed in [3,7], which has the advantage of avoiding ill-conditioning due to a quick increase of $\lambda$.
Proposition 3.3 Let {xk } be a bounded sequence generated by Algorithm 1. Then the
sequence {xk } converges to a stationary point.
Algorithm 1 Armijo Gradient Algorithm
1: Set $\sigma, \beta \in (0, 1)$, $s > 0$.
2: Choose $\lambda_0 > 0$.
3: Set $k = 0$.
4: Compute the steepest descent direction
$$h_k = -\nabla\bar\Phi_{\lambda_k}(x_k).$$
5: Compute the step-size
$$\alpha_k = \max_{l\in\mathbb{N}}\big\{s\beta^l : \bar\Phi_{\lambda_k}(x_k + s\beta^l h_k) - \bar\Phi_{\lambda_k}(x_k) \le -\sigma\beta^l s\,\|h_k\|^2\big\}.$$
6: Set $x_{k+1} = x_k + \alpha_k h_k$.
7: Set $\lambda_{k+1} = u(\lambda_k)$, $k = k + 1$.
8: Go to 4.
Proof To arrive at a contradiction, assume that $\hat x$ is a limit point of $\{x_k\}$ with
$$\lim_{\lambda\to\infty}\|\nabla\bar\Phi_\lambda(\hat x)\| \ge \rho,$$
with $\rho > 0$.
Since $\bar\Phi_\lambda(x)$ is continuous and monotonically non-increasing with respect to $\lambda$, the sequence $\{\bar\Phi_{\lambda_k}(x_k)\}$ is non-increasing, and given
$$\left|\bar\Phi_{\lambda_1}(x) - \bar\Phi_{\lambda_2}(x)\right| \le \frac{M_I + M_J}{\min\{\lambda_1,\lambda_2\}}, \quad\forall \lambda_1,\lambda_2 > 0,$$
it follows that $\bar\Phi_{\lambda_k}(x_k)$ converges to the finite value
$$\Phi(\hat x) = \lim_{\lambda\to\infty}\bar\Phi_\lambda(\hat x).$$
We have
$$\bar\Phi_{\lambda_k}(x_k) - \bar\Phi_{\lambda_{k+1}}(x_{k+1}) \ge \bar\Phi_{\lambda_k}(x_k) - \bar\Phi_{\lambda_k}(x_{k+1}) \ge \sigma\alpha_k\|\nabla\bar\Phi_{\lambda_k}(x_k)\|^2,$$
where the first inequality follows from Propositions 2.2 and 2.4 and the second from the definition of Armijo's rule.
Hence, since $\sigma > 0$,
$$\alpha_k\|\nabla\bar\Phi_{\lambda_k}(x_k)\|^2 \to 0,$$
and therefore, by the hypothesis,
$$\{\alpha_k\} \to 0.$$
From the definition of Armijo's rule, there is an index $\hat k \ge 0$ such that
$$\bar\Phi_{\lambda_k}(x_k) - \bar\Phi_{\lambda_k}\!\left(x_k - \frac{\alpha_k}{\beta}\nabla\bar\Phi_{\lambda_k}(x_k)\right) < \frac{\sigma\alpha_k}{\beta}\|\nabla\bar\Phi_{\lambda_k}(x_k)\|^2, \quad\forall k\ge\hat k,$$
that is, the initial step-size will be reduced at least once for all $k\ge\hat k$. This can be written as
$$\frac{\bar\Phi_{\lambda_k}(x_k) - \bar\Phi_{\lambda_k}\!\left(x_k - \dfrac{\alpha_k\|\nabla\bar\Phi_{\lambda_k}(x_k)\|}{\beta}\,\dfrac{\nabla\bar\Phi_{\lambda_k}(x_k)}{\|\nabla\bar\Phi_{\lambda_k}(x_k)\|}\right)}{\dfrac{\alpha_k\|\nabla\bar\Phi_{\lambda_k}(x_k)\|}{\beta}} < \sigma\|\nabla\bar\Phi_{\lambda_k}(x_k)\|, \quad\forall k\ge\hat k.$$
By the Mean Value Theorem, we obtain
$$\nabla\bar\Phi_{\lambda_k}\!\left(x_k - \tilde\alpha_k\frac{\nabla\bar\Phi_{\lambda_k}(x_k)}{\|\nabla\bar\Phi_{\lambda_k}(x_k)\|}\right)^{\!T}\frac{\nabla\bar\Phi_{\lambda_k}(x_k)}{\|\nabla\bar\Phi_{\lambda_k}(x_k)\|} < \sigma\|\nabla\bar\Phi_{\lambda_k}(x_k)\|, \quad\forall k\ge\hat k,$$
where $\tilde\alpha_k$ is a scalar in the interval $\left[0, \frac{\alpha_k\|\nabla\bar\Phi_{\lambda_k}(x_k)\|}{\beta}\right]$.
Since $\|\nabla\bar\Phi_\lambda\|$ is bounded, taking limits we obtain
$$\lim_{\lambda\to\infty}\|\nabla\bar\Phi_\lambda(\hat x)\|(1-\sigma) \le 0,$$
$$\lim_{\lambda\to\infty}\|\nabla\bar\Phi_\lambda(\hat x)\| \le 0,$$
which contradicts the initial assumption.
Therefore,
$$\lim_{\lambda\to\infty}\|\nabla\bar\Phi_\lambda(\hat x)\| = 0,$$
and $\hat x$ is a stationary point of $\Phi(x)$ by Corollary 3.2.
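A minimal implementation sketch of Algorithm 1 (our own; it assumes callables `phi_bar(x, lam)` and `grad_phi_bar(x, lam)` implementing $\bar\Phi_\lambda$ and (4)–(5), uses the simple rule $u(\lambda_k) = \lambda_k + \gamma$ with a hypothetical $\gamma$, runs a fixed number of iterations, and adds a small practical guard for a vanishing gradient that is not part of the algorithm as stated):

```python
import numpy as np

def armijo_smoothing_descent(phi_bar, grad_phi_bar, x0, lam0,
                             sigma=0.01, beta=0.5, s=1.0, gamma=0.1, iters=100):
    """Steepest descent on bar-Phi_lambda with Armijo step-size and increasing lambda."""
    x, lam = np.asarray(x0, dtype=float), lam0
    for _ in range(iters):
        h = -grad_phi_bar(x, lam)                   # step 4: descent direction
        if np.dot(h, h) < 1e-16:                    # practical guard: gradient ~ 0
            lam += gamma
            continue
        alpha = s                                   # step 5: Armijo backtracking
        while phi_bar(x + alpha * h, lam) - phi_bar(x, lam) > -sigma * alpha * np.dot(h, h):
            alpha *= beta
        x = x + alpha * h                           # step 6
        lam = lam + gamma                           # step 7: u(lam) = lam + gamma
    return x, lam
```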
4 Numerical example
To prevent overflows, the values of $\bar\Phi_\lambda(x)$ and $\mu_{i,j}^\lambda$ have to be computed carefully, in a similar way to the approximations of min–max problems in [4,6]:
$$\bar\Phi_\lambda(x) = \frac{1}{\lambda}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp\big(-\lambda f_{i,j}(x)\big)}\right) + \frac{\ln(M_J)}{\lambda} = \Phi(x) + \frac{1}{\lambda}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp\big(-\lambda(f_{i,j}(x)-\Phi(x))\big)}\right) + \frac{\ln(M_J)}{\lambda},$$
and
$$\mu_{i,j}^\lambda = \frac{\exp\big(-\lambda f_{i,j}(x)\big)\Big/\Big(\sum_{k\in J}\exp\big(-\lambda f_{i,k}(x)\big)\Big)^{2}}{\sum_{\rho\in I}\dfrac{1}{\sum_{k\in J}\exp\big(-\lambda f_{\rho,k}(x)\big)}} = \frac{\exp\big(-\lambda(f_{i,j}(x)-\Phi(x))\big)\Big/\Big(\sum_{k\in J}\exp\big(-\lambda(f_{i,k}(x)-\Phi(x))\big)\Big)^{2}}{\sum_{\rho\in I}\dfrac{1}{\sum_{k\in J}\exp\big(-\lambda(f_{\rho,k}(x)-\Phi(x))\big)}}. \tag{7}$$

Fig. 1 Approximations $\bar\Phi_\lambda$ for increasing values of $\lambda$ (panels a–g)
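In code, these shifts amount to the usual log-sum-exp stabilization. The sketch below (our own; it additionally shifts each inner sum by its own row minimum, a slight refinement of the single shift by $\Phi(x)$ in the displays above, so that every exponent is non-positive) evaluates $\bar\Phi_\lambda$ from a matrix of function values.

```python
import numpy as np

def phi_bar_stable(fvals, lam):
    """Overflow-safe evaluation of bar-Phi_lambda from a matrix fvals[i, j] = f_{i,j}(x).

    Each inner sum is shifted by its own row minimum and the outer sum by
    Phi(x) = max_i min_j f_{i,j}(x), so every exponent is non-positive.
    """
    fvals = np.asarray(fvals, dtype=float)
    row_min = fvals.min(axis=1)                      # min_j f_{i,j}(x)
    phi = row_min.max()                              # Phi(x)
    inner = np.exp(-lam * (fvals - row_min[:, None])).sum(axis=1)   # values in [1, M_J]
    outer = np.sum(np.exp(lam * (row_min - phi)) / inner)           # values in (0, M_I]
    return phi + np.log(outer) / lam + np.log(fvals.shape[1]) / lam

# Usage with arbitrary (hypothetical) values and a large smoothing parameter.
print(phi_bar_stable([[985.0, 67.0, 2550.0], [195.0, 2025.0, -10.0]], lam=50.0))
```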
Example 1 Consider the problem
$$\min_{x\in\mathbb{R}}\ \max_{i\in\{1,2\}}\ \min_{j\in\{1,2,3\}}\ f_{i,j}(x)$$
with
$$f_{1,1}(x) = 10x^2 - 15, \quad f_{1,2}(x) = (x+2)^2 + 3, \quad f_{1,3}(x) = 10(x-6)^2 - 10,$$
$$f_{2,1}(x) = 2x^2 - 5, \quad f_{2,2}(x) = (3x-15)^2 - 10, \quad f_{2,3}(x) = x.$$
Figure 1 shows how the approximation function $\bar\Phi_\lambda$ converges to $\Phi$ for increasing values of $\lambda$.
We run our algorithm with parameters $\sigma = 0.01$, $\beta = 0.5$, $s = 1$, $\lambda_0 = 0.02$ and the updating parameter rule
$$\lambda_{k+1} = \begin{cases}\lambda_k & \text{if } \|\nabla\bar\Phi_{\lambda_k}(x_k)\| > 0.5,\\ 2\lambda_k & \text{if } \|\nabla\bar\Phi_{\lambda_k}(x_k)\| \le 0.5.\end{cases}$$
From the starting point $x_0 = -10$ our algorithm converges to $x^* = 0$ in 11 iterations. From the starting point $x_0 = 6$ the point $x^* = 5.51318$ is reached in 25 iterations.
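For completeness, here is a self-contained sketch (our own, Python/NumPy) that runs a steepest descent of this kind on Example 1, using the parameters and the doubling rule quoted above together with row-wise shifts for numerical stability. It is only an illustration: the exact iterates depend on implementation details, so the numbers it prints need not coincide with those reported above.

```python
import numpy as np

F  = [[lambda x: 10 * x**2 - 15, lambda x: (x + 2)**2 + 3, lambda x: 10 * (x - 6)**2 - 10],
      [lambda x: 2 * x**2 - 5,   lambda x: (3 * x - 15)**2 - 10, lambda x: x]]
dF = [[lambda x: 20 * x, lambda x: 2 * (x + 2), lambda x: 20 * (x - 6)],
      [lambda x: 4 * x,  lambda x: 6 * (3 * x - 15), lambda x: 1.0]]

def fmat(x, funcs):
    return np.array([[g(x) for g in row] for row in funcs])

def phi_bar(x, lam):
    """Stable evaluation of bar-Phi_lambda(x) for the 2 x 3 example (row-wise shifts)."""
    f = fmat(x, F)
    m = f.min(axis=1)
    phi = m.max()
    inner = np.exp(-lam * (f - m[:, None])).sum(axis=1)
    return phi + np.log(np.sum(np.exp(lam * (m - phi)) / inner)) / lam + np.log(3) / lam

def grad_phi_bar(x, lam):
    """Gradient (4)-(5), evaluated with the same shifts (they cancel, as in (7))."""
    f = fmat(x, F)
    m = f.min(axis=1)
    phi = m.max()
    e = np.exp(-lam * (f - m[:, None]))          # entries in (0, 1]
    c = e.sum(axis=1)                            # inner sums, in [1, 3]
    t = np.sum(np.exp(lam * (m - phi)) / c)      # outer sum, in (0, 2]
    mu = np.exp(lam * (m - phi))[:, None] * e / c[:, None]**2 / t   # weights of (5)
    return np.sum(mu * fmat(x, dF))

sigma, beta, s, lam, x = 0.01, 0.5, 1.0, 0.02, -10.0
for k in range(60):
    h = -grad_phi_bar(x, lam)                    # steepest descent direction
    alpha = s
    while phi_bar(x + alpha * h, lam) - phi_bar(x, lam) > -sigma * alpha * h * h:
        alpha *= beta                            # Armijo backtracking
    x += alpha * h
    if abs(grad_phi_bar(x, lam)) <= 0.5:         # adaptive doubling rule from the text
        lam *= 2.0
print(x, lam)
```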
Acknowledgment
The authors wish to acknowledge support from EPSRC grant EP/C513584/1.
References
1. Mayne, D.Q., Polak, E., Trahan, R.: An outer approximations algorithm for computer-aided design
problems. J. Optim. Theory Appl. 28(3), 331–352 (1979)
2. Drezner, Z., Thisse, J.-F., Wesolowsky, G.O.: The minimax–min location problem. J. Reg. Sci. 26(1),
87–101 (1986)
3. Polak, E., Royset, J.O.: Algorithms for finite and semi-infinite min–max–min problems using adaptive
smoothing techniques. J. Optim. Theory Appl. 119(3), 421–457 (2003)
4. Bertsekas, D.: Constrained Optimization and Lagrange Multiplier Methods. Athena Scientific, Nashua
(1996)
5. Xingsi, L.: An entropy-based aggregate method for minimax optimization. Eng. Opt. 18, 277–285
(1992)
6. Xu, S.: Smoothing method for minimax problems. Comput. Optim. Appl. 20(3), 267–279 (2001)
7. Polak, E., Royset, J.O., Womersley, R.S.: Algorithms with adaptive smoothing for finite minimax
problems. J. Optim. Theory Appl. 119(3), 459–484 (2003)
8. De Bruijn, N.G.: Asymptotic Methods in Analysis, 2nd edn. Bibliotheca Mathematica, vol. IV. North-Holland, Amsterdam (1961)
9. Parpas, P., Rustem, B., Pistikopoulos, E.N.: Linearly constrained global optimization and stochastic
differential equations. J. Global Optim. 36(2), 191–217 (2006)
10. Hwang, C.: Laplace’s method revisited: weak convergence of probability measures. Ann. Probab.
8(6), 1177–1182 (1980)
11. Polak, E.: Optimization: Algorithms and Consistent Approximations. Applied Mathematical Sciences, vol. 124. Springer, New York (1997)
12. Bertsekas, D.: Nonlinear Programming. Athena Scientific, Nashua (1999)