Optim Lett
DOI 10.1007/s11590-008-0090-9

ORIGINAL PAPER

A smoothing algorithm for finite min–max–min problems

Angelos Tsoukalas · Panos Parpas · Berç Rustem

Received: 19 February 2008 / Accepted: 27 May 2008
© Springer-Verlag 2008

A. Tsoukalas · P. Parpas · B. Rustem
Department of Computing, Imperial College, London, UK
e-mail: at102@doc.ic.ac.uk (A. Tsoukalas), pp500@doc.ic.ac.uk (P. Parpas), br@doc.ic.ac.uk (B. Rustem)

Abstract  We generalize a smoothing algorithm for finite min–max problems to finite min–max–min problems. We apply a smoothing technique twice, once to eliminate the inner min operator and once to eliminate the max operator. In minimax problems, where only the max operator is eliminated, the approximation function is decreasing with respect to the smoothing parameter. Such a property is convenient for establishing algorithm convergence, but it does not hold when both operators are eliminated. To maintain the desired property, an additional term is added to the approximation. We establish convergence of a steepest descent algorithm and provide a numerical example.

1 Introduction

We propose an algorithm for the finite min–max–min problem

$$\min_{x\in\mathbb{R}^n} \Phi(x), \qquad \Phi(x) = \max_{i\in I}\,\min_{j\in J} f_{i,j}(x),$$

with finite sets $I, J$ and continuously differentiable functions $f_{i,j}$ with bounded first derivatives for all $i \in I$, $j \in J$. The min–max–min problem has applications in multiple fields including computer aided design [1], facility location [2] and civil engineering [3].

To tackle the problem, we generalize the smoothing technique described in [3–7], which approximates a max function $\max_i g_i(x)$ with the smooth approximation

$$\frac{1}{\mu}\ln\left(\sum_i \exp(\mu\, g_i(x))\right).$$

The same idea, where in this case the maximization is over an interval and an integral appears in place of the summation, appears in the Laplace method, which studies the asymptotic behavior of integrals [8]. In the context of optimization, it has been used in convergence proofs of simulated annealing methods [9] to describe a distribution over the state space of solutions that converges to the global optimum as the parameter increases [10].

This smoothing technique has been used to solve the minimax problem in [5–7] by transforming it into a smooth minimization problem. In [7] an adaptive method for the update of the parameter is used, which provides robustness in view of potential ill-conditioning problems. Furthermore, the same technique has been used to transform the min–max–min problem into minimax problems [3], which are then solved with a different method.

In Sect. 2 we extend this methodology to eliminate both the inner minimization and the maximization operators and reduce the min–max–min problem to the minimization of a smooth function. In Sect. 3 we provide a steepest descent algorithm for the approximation function and show that it converges to a point satisfying a first order optimality condition for the original min–max–min problem. In Sect. 4 we provide a computational example.

Proposition 2.1 provides a smooth approximation of $\max_{i\in I}\min_{j\in J} f_{i,j}(x)$.
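Before proceeding, the single-level smoothing just described can be illustrated in a few lines of code. The following Python sketch is illustrative only and is not taken from the paper; the functions $g_i$ and the parameter values are arbitrary choices. It evaluates the log-sum-exp approximation and the elementary bound $\max_i g_i(x) \le \frac{1}{\mu}\ln\sum_i\exp(\mu\, g_i(x)) \le \max_i g_i(x) + \ln(M)/\mu$, where $M$ is the number of functions.

```python
# Minimal sketch (not from the paper) of the log-sum-exp smoothing of a finite max.
# The functions g_i and the parameter values below are illustrative choices.
import numpy as np

def smooth_max(values, mu):
    """(1/mu) * ln(sum_i exp(mu * g_i)), shifted by the max for numerical stability."""
    values = np.asarray(values, dtype=float)
    m = values.max()
    return m + np.log(np.exp(mu * (values - m)).sum()) / mu

x = 1.3
g = [np.sin(x), x**2 - 1.0, 0.5 * x]          # example g_i(x)
true_max = max(g)
for mu in (1.0, 10.0, 100.0):
    approx = smooth_max(g, mu)
    # bound: true_max <= approx <= true_max + ln(M)/mu
    print(mu, true_max, approx, true_max + np.log(len(g)) / mu)
```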
2 Smoothing of max–min functions

Proposition 2.1 Let
$$\Phi(x) = \max_{i\in I}\,\min_{j\in J} f_{i,j}(x).$$
Let $M_I = |I|$, $M_J = |J|$ and $\mu_I > 0$, $\mu_J < 0$. Then
$$\Phi(x) + \frac{\ln(M_J)}{\mu_J} \;\le\; \frac{1}{\mu_I}\ln\left[\sum_{i\in I}\left(\sum_{j\in J}\exp(\mu_J f_{i,j}(x))\right)^{\mu_I/\mu_J}\right] \;\le\; \Phi(x) + \frac{\ln(M_I)}{\mu_I}. \qquad (1)$$

Proof Let $\psi_i(x) = \min_j f_{i,j}(x)$. Then $\psi_i \le f_{i,j}$ for all $i \in I$, $j \in J$, and since $\mu_J < 0$,
$$\exp(\mu_J \psi_i) \le \sum_{j\in J}\exp(\mu_J f_{i,j}) \le M_J \exp(\mu_J \psi_i), \quad \forall i\in I,$$
$$\mu_J \psi_i \le \ln\left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right) \le \ln(M_J) + \mu_J \psi_i, \quad \forall i\in I,$$
$$\psi_i + \frac{\ln(M_J)}{\mu_J} \le \frac{1}{\mu_J}\ln\left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right) \le \psi_i, \quad \forall i\in I,$$
$$\mu_I \psi_i + \frac{\mu_I}{\mu_J}\ln(M_J) \le \ln\left[\left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right)^{\mu_I/\mu_J}\right] \le \mu_I \psi_i, \quad \forall i\in I,$$
$$M_J^{\mu_I/\mu_J}\exp(\mu_I\psi_i) \le \left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right)^{\mu_I/\mu_J} \le \exp(\mu_I\psi_i), \quad \forall i\in I,$$
$$\ln\left(\sum_{i\in I}\exp(\mu_I\psi_i)\right) + \frac{\mu_I}{\mu_J}\ln(M_J) \le \ln\left[\sum_{i\in I}\left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right)^{\mu_I/\mu_J}\right] \le \ln\left(\sum_{i\in I}\exp(\mu_I\psi_i)\right). \qquad (2)$$

We have $\Phi = \max_i \psi_i$, and since $\mu_I > 0$,
$$\exp(\mu_I\Phi) \le \sum_{i\in I}\exp(\mu_I\psi_i) \le M_I\exp(\mu_I\Phi),$$
$$\mu_I\Phi \le \ln\left(\sum_{i\in I}\exp(\mu_I\psi_i)\right) \le \ln(M_I) + \mu_I\Phi. \qquad (3)$$

From (2) and (3), we have
$$\ln(M_I) + \mu_I\Phi \;\ge\; \ln\left(\sum_{i\in I}\exp(\mu_I\psi_i)\right) \;\ge\; \ln\left[\sum_{i\in I}\left(\sum_{j\in J}\exp(\mu_J f_{i,j})\right)^{\mu_I/\mu_J}\right] \;\ge\; \ln\left(\sum_{i\in I}\exp(\mu_I\psi_i)\right) + \frac{\mu_I}{\mu_J}\ln(M_J) \;\ge\; \mu_I\Phi + \frac{\mu_I}{\mu_J}\ln(M_J),$$
and dividing by $\mu_I > 0$,
$$\Phi(x) + \frac{1}{\mu_J}\ln(M_J) \le \frac{1}{\mu_I}\ln\left[\sum_{i\in I}\left(\sum_{j\in J}\exp(\mu_J f_{i,j}(x))\right)^{\mu_I/\mu_J}\right] \le \Phi(x) + \frac{1}{\mu_I}\ln(M_I). \qquad\square$$

Set
$$\Phi_{\mu_I,\mu_J}(x) = \frac{1}{\mu_I}\ln\left[\sum_{i\in I}\left(\sum_{j\in J}\exp(\mu_J f_{i,j}(x))\right)^{\mu_I/\mu_J}\right].$$

It is shown in [5] that the convergence of the approximation method described in the Introduction is monotonic. Proposition 2.2 shows that this is not the case when we eliminate both operators.

Proposition 2.2 For $\mu_{J_2} < \mu_{J_1} < 0$, $\Phi_{\mu_I,\mu_{J_1}}(x) \le \Phi_{\mu_I,\mu_{J_2}}(x)$, and for $\mu_{I_2} > \mu_{I_1} > 0$, $\Phi_{\mu_{I_1},\mu_J}(x) \ge \Phi_{\mu_{I_2},\mu_J}(x)$.

Proof Let
$$g_i(x,\mu_J) = \left(\sum_{j\in J}\exp(\mu_J f_{i,j}(x))\right)^{1/\mu_J} > 0,$$
so that $\Phi_{\mu_I,\mu_J}(x) = \frac{1}{\mu_I}\ln\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_I}\right)$. Then
$$\frac{g_i(x,\mu_{J_2})}{g_i(x,\mu_{J_1})} = \frac{\left(\sum_{j\in J}\exp(\mu_{J_2} f_{i,j}(x))\right)^{1/\mu_{J_2}}}{\left(\sum_{j\in J}\exp(\mu_{J_1} f_{i,j}(x))\right)^{1/\mu_{J_1}}} = \left[\frac{\sum_{j\in J}\exp(\mu_{J_2} f_{i,j}(x))}{\left(\sum_{j\in J}\exp(\mu_{J_1} f_{i,j}(x))\right)^{\mu_{J_2}/\mu_{J_1}}}\right]^{1/\mu_{J_2}} = \left[\sum_{j\in J}\left(\frac{\exp(\mu_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\mu_{J_1} f_{i,k}(x))}\right)^{\mu_{J_2}/\mu_{J_1}}\right]^{1/\mu_{J_2}}.$$
Since
$$\frac{\mu_{J_2}}{\mu_{J_1}} > 1, \qquad \frac{\exp(\mu_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\mu_{J_1} f_{i,k}(x))} < 1 \qquad \text{and} \qquad \mu_{J_2} < 0,$$
we have
$$\frac{g_i(x,\mu_{J_2})}{g_i(x,\mu_{J_1})} \ge \left[\sum_{j\in J}\frac{\exp(\mu_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\mu_{J_1} f_{i,k}(x))}\right]^{1/\mu_{J_2}} = 1.$$
Therefore
$$\frac{1}{\mu_I}\ln\left(\sum_{i\in I} g_i(x,\mu_{J_2})^{\mu_I}\right) \ge \frac{1}{\mu_I}\ln\left(\sum_{i\in I} g_i(x,\mu_{J_1})^{\mu_I}\right),$$
that is, $\Phi_{\mu_I,\mu_{J_2}}(x) \ge \Phi_{\mu_I,\mu_{J_1}}(x)$.

For the second part we have
$$\frac{\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_2}}\right)^{1/\mu_{I_2}}}{\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_1}}\right)^{1/\mu_{I_1}}} = \left[\frac{\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_2}}}{\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_1}}\right)^{\mu_{I_2}/\mu_{I_1}}}\right]^{1/\mu_{I_2}} = \left[\sum_{i\in I}\left(\frac{g_i(x,\mu_J)^{\mu_{I_1}}}{\sum_{k\in I} g_k(x,\mu_J)^{\mu_{I_1}}}\right)^{\mu_{I_2}/\mu_{I_1}}\right]^{1/\mu_{I_2}} \le 1.$$
Therefore
$$\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_2}}\right)^{1/\mu_{I_2}} \le \left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_1}}\right)^{1/\mu_{I_1}},$$
$$\ln\left[\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_2}}\right)^{1/\mu_{I_2}}\right] \le \ln\left[\left(\sum_{i\in I} g_i(x,\mu_J)^{\mu_{I_1}}\right)^{1/\mu_{I_1}}\right],$$
that is, $\Phi_{\mu_{I_2},\mu_J}(x) \le \Phi_{\mu_{I_1},\mu_J}(x)$. $\square$

We see that when we increase $|\mu_I|$, $\Phi_{\mu_I,\mu_J}$ decreases, and when we increase $|\mu_J|$, $\Phi_{\mu_I,\mu_J}$ increases. In order to prove convergence of the local optimization algorithm, it would be convenient if the approximation function were decreasing in both $|\mu_I|$ and $|\mu_J|$.
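A small numerical check makes Propositions 2.1 and 2.2 concrete. The following Python sketch is illustrative only; the values $f_{i,j}(x)$ in the matrix F and the parameter values are arbitrary choices, and the helper name phi_smooth is ours. It evaluates $\Phi_{\mu_I,\mu_J}$ at a fixed point, verifies the sandwich bound (1), and checks that the value grows as $|\mu_J|$ increases. No overflow safeguards are included here; those are discussed in Sect. 4.

```python
# Illustrative sketch (not from the paper): evaluate Phi_{mu_I,mu_J} for a toy
# max-min instance, check the bounds of Proposition 2.1 and the monotonicity
# of Proposition 2.2. The f_ij values below are arbitrary examples.
import numpy as np

def phi_smooth(F, mu_I, mu_J):
    """(1/mu_I) * ln( sum_i ( sum_j exp(mu_J * f_ij) )^(mu_I/mu_J) ),
    for a matrix F of values f_ij(x) at a fixed x; mu_I > 0, mu_J < 0."""
    inner = np.exp(mu_J * F).sum(axis=1)            # sum_j exp(mu_J * f_ij)
    return np.log((inner ** (mu_I / mu_J)).sum()) / mu_I

F = np.array([[1.0, 3.0, 2.0],                      # f_1j(x)
              [4.0, 0.5, 2.5]])                     # f_2j(x)
phi_true = F.min(axis=1).max()                      # Phi(x) = max_i min_j f_ij(x)
M_I, M_J = F.shape

mu_I, mu_J = 5.0, -5.0
val = phi_smooth(F, mu_I, mu_J)
# Proposition 2.1: Phi + ln(M_J)/mu_J <= Phi_{mu_I,mu_J} <= Phi + ln(M_I)/mu_I
print(phi_true + np.log(M_J) / mu_J, "<=", val, "<=", phi_true + np.log(M_I) / mu_I)

# Proposition 2.2: increasing |mu_J| increases Phi_{mu_I,mu_J}
print(phi_smooth(F, mu_I, -5.0) <= phi_smooth(F, mu_I, -10.0))
```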
Consider the alternative approximation function
$$\tilde\Phi_{\mu_I,\mu_J}(x) = \Phi_{\mu_I,\mu_J}(x) - \frac{\ln(M_J)}{\mu_J}.$$
It is obvious that $\tilde\Phi_{\mu_I,\mu_J}$ decreases as $|\mu_I|$ increases. Next we examine what happens when we increase $|\mu_J|$.

Lemma 2.3 For all $b > 1$ and $0 \le a_i \le 1$, $i = 1,\dots,m-1$, we have
$$\frac{1}{\left(1 + \sum_{k=1}^{m-1} a_k\right)^b} + \sum_{i=1}^{m-1}\left(\frac{a_i}{1 + \sum_{k=1}^{m-1} a_k}\right)^b \ge m^{1-b}.$$

Proof Let
$$f(a_1,a_2,\dots,a_{m-1}) = 1 + \sum_{i=1}^{m-1} a_i^b - m^{1-b}\left(1 + \sum_{i=1}^{m-1} a_i\right)^b, \qquad 0 \le a_i \le 1.$$
The set $\{(a_1,\dots,a_{m-1}) \mid 0 \le a_i \le 1\}$ is closed and bounded and therefore $f$ has a global minimizer $a^*$. The partial derivatives of $f$ are
$$\frac{\partial f}{\partial a_j} = b\, a_j^{b-1} - m^{1-b}\, b\left(1 + \sum_{i=1}^{m-1} a_i\right)^{b-1}.$$
Since
$$\frac{\partial f(a_1,\dots,a_j = 0,\dots,a_{m-1})}{\partial a_j} < 0,$$
$a_i^*$ is strictly positive for all $i$. Solving the system
$$\frac{\partial f}{\partial a_j} = 0, \qquad \forall j = 1,\dots,m-1,$$
we obtain the unique solution $a_j = 1$ for all $j$. Since this boundary point is the only stationary point and no other boundary point is eligible for a global minimum, we have $a_j^* = 1$ for all $j$. As a result we obtain
$$f(a_1,\dots,a_{m-1}) \ge f(1,1,\dots,1) = 1 + m - 1 - m^{1-b} m^b = 0.$$
Dividing by $\left(1 + \sum_{i=1}^{m-1} a_i\right)^b$ and rearranging, we obtain the stated inequality. $\square$

Proposition 2.4 $\tilde\Phi_{\mu_I,\mu_{J_1}}(x) \ge \tilde\Phi_{\mu_I,\mu_{J_2}}(x)$ for $\mu_{J_2} < \mu_{J_1} < 0$.

Proof Let
$$\tilde g_i(x,\mu_J) = \left(\sum_{j\in J}\exp(\mu_J f_{i,j}(x))\right)^{1/\mu_J} M_J^{-1/\mu_J} > 0,$$
so that $\tilde\Phi_{\mu_I,\mu_J}(x) = \frac{1}{\mu_I}\ln\left(\sum_{i\in I}\tilde g_i(x,\mu_J)^{\mu_I}\right)$. Then
$$\frac{\tilde g_i(x,\mu_{J_2})}{\tilde g_i(x,\mu_{J_1})} = \frac{\left(\sum_{j\in J}\exp(\mu_{J_2} f_{i,j}(x))\right)^{1/\mu_{J_2}} M_J^{-1/\mu_{J_2}}}{\left(\sum_{j\in J}\exp(\mu_{J_1} f_{i,j}(x))\right)^{1/\mu_{J_1}} M_J^{-1/\mu_{J_1}}} = \left[\sum_{j\in J}\left(\frac{\exp(\mu_{J_1} f_{i,j}(x))}{\sum_{k\in J}\exp(\mu_{J_1} f_{i,k}(x))}\right)^{\mu_{J_2}/\mu_{J_1}}\right]^{1/\mu_{J_2}}\frac{M_J^{1/\mu_{J_1}}}{M_J^{1/\mu_{J_2}}}.$$
Let $f_i^o(x) = \min_j f_{i,j}(x)$ and
$$a_{i,j} = \frac{\exp(\mu_{J_1} f_{i,j}(x))}{\exp(\mu_{J_1} f_i^o(x))}.$$
We have $0 \le a_{i,j} \le 1$ for all $i, j$, and for every $i$ there exists $j$ with $a_{i,j} = 1$. Therefore
$$\frac{\tilde g_i(x,\mu_{J_2})}{\tilde g_i(x,\mu_{J_1})} = \left[\sum_{j\in J}\left(\frac{a_{i,j}\exp(\mu_{J_1} f_i^o(x))}{\sum_{k\in J} a_{i,k}\exp(\mu_{J_1} f_i^o(x))}\right)^{\mu_{J_2}/\mu_{J_1}}\right]^{1/\mu_{J_2}}\frac{M_J^{1/\mu_{J_1}}}{M_J^{1/\mu_{J_2}}} = \left[\sum_{j\in J}\left(\frac{a_{i,j}}{\sum_{k\in J} a_{i,k}}\right)^{\mu_{J_2}/\mu_{J_1}}\right]^{1/\mu_{J_2}}\frac{M_J^{1/\mu_{J_1}}}{M_J^{1/\mu_{J_2}}} \le M_J^{\left(1-\frac{\mu_{J_2}}{\mu_{J_1}}\right)\frac{1}{\mu_{J_2}}}\,\frac{M_J^{1/\mu_{J_1}}}{M_J^{1/\mu_{J_2}}} = 1,$$
where the inequality holds due to Lemma 2.3 and the facts that $\mu_{J_2}/\mu_{J_1} > 1$ and $1/\mu_{J_2} < 0$. Therefore
$$\tilde g_i(x,\mu_{J_2}) \le \tilde g_i(x,\mu_{J_1}) \qquad \text{and hence} \qquad \tilde\Phi_{\mu_I,\mu_{J_2}}(x) \le \tilde\Phi_{\mu_I,\mu_{J_1}}(x). \qquad\square$$

Choosing $\mu_I = -\mu_J = \mu > 0$ we obtain the approximation function
$$\Phi_\mu(x) = \frac{1}{\mu}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp(-\mu f_{i,j}(x))}\right) + \frac{\ln(M_J)}{\mu},$$
which is a decreasing function of $\mu$.

3 A steepest-descent algorithm

The following theorem from [11] gives a first order optimality condition for min–max–min problems.

Theorem 3.1 If the $f_{i,j}$ are continuously differentiable and $\hat x$ is a local minimizer of $\Phi(x)$, then $0 \in \bar G(\hat x)$, where
$$\bar G(x) = \operatorname{conv}_{i\in I}\left\{\operatorname{conv}_{j\in J}\begin{bmatrix} f_{i,j}(x) - \psi_i(x) \\ \Phi(x) - \psi_i(x) \\ \nabla f_{i,j}(x)\end{bmatrix}\right\}.$$

We have
$$\nabla\Phi_\mu(x) = \sum_{i\in I,\, j\in J}\mu_{i,j}\,\nabla f_{i,j}(x), \qquad (4)$$
where
$$\mu_{i,j} = \frac{\exp(-\mu f_{i,j}(x))\Big/\Big(\sum_{k\in J}\exp(-\mu f_{i,k}(x))\Big)^2}{\sum_{\rho\in I}\Big(\sum_{k\in J}\exp(-\mu f_{\rho,k}(x))\Big)^{-1}}. \qquad (5)$$

Note that
$$\sum_{i\in I,\, j\in J}\mu_{i,j} = 1 \quad \text{for all } \mu > 0,$$
and
$$\lim_{\mu\to\infty}\mu_{i,j} = 0 \quad \forall (i,j)\notin(\hat I, \hat J_i),$$
where
$$\hat I = \{i\in I : \Phi(x) = \min_j f_{i,j}(x)\}, \qquad \hat J_i = \{j\in J : \min_k f_{i,k}(x) = f_{i,j}(x)\}.$$

Corollary 3.2 If
$$\lim_{\mu\to\infty}\nabla\Phi_\mu(\hat x) = 0,$$
then $\hat x$ satisfies the stationarity condition of Theorem 3.1.

We present the following Armijo gradient algorithm (Algorithm 1). Concerning step 7, we remark that any increasing updating rule $u(\mu_k)$ can be used as long as
$$\lim_{k\to\infty}\mu_k = \infty. \qquad (6)$$
For example we can choose $u(\mu_k) = \mu_k + \gamma$. Condition (6) is also satisfied by the adaptive parameter update proposed in [3,7], which has the advantage of avoiding ill-conditioning due to a quick increase of $\mu$.

Algorithm 1 Armijo Gradient Algorithm
1: Set $\sigma, \beta \in (0,1)$, $s > 0$
2: Choose $\mu_0 > 0$
3: Set $k = 0$
4: Compute the steepest descent direction $h_k = -\nabla\Phi_{\mu_k}(x_k)$
5: Compute the step-size $\alpha_k = \max_{l\in\mathbb{N}}\{s\beta^l : \Phi_{\mu_k}(x_k + s\beta^l h_k) - \Phi_{\mu_k}(x_k) \le -\sigma\beta^l s\|h_k\|^2\}$
6: Set $x_{k+1} = x_k + \alpha_k h_k$
7: Set $\mu_{k+1} = u(\mu_k)$, $k = k + 1$
8: Go to 4
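A compact Python sketch of Algorithm 1 may help fix ideas. It is illustrative only and is not the authors' implementation: the one-dimensional functions $f_{i,j}$ and their derivatives, the parameter values, the fixed iteration count and the linear update $u(\mu_k) = \mu_k + \gamma$ are arbitrary choices, and the helper names phi_mu, grad_phi_mu and armijo_gradient are ours. The objective and its gradient are evaluated directly from $\Phi_\mu$ and (4)–(5), without the overflow safeguards discussed in Sect. 4.

```python
# Illustrative sketch of Algorithm 1 (not the authors' code): Armijo gradient
# descent on Phi_mu with an increasing smoothing parameter.
import numpy as np

# f_ij(x) and their derivatives for a small 1-D example (hypothetical data)
F = [[lambda x: (x - 1.0) ** 2, lambda x: (x + 2.0) ** 2],
     [lambda x: 0.5 * x ** 2 + 1.0, lambda x: x + 3.0]]
dF = [[lambda x: 2.0 * (x - 1.0), lambda x: 2.0 * (x + 2.0)],
      [lambda x: x, lambda x: 1.0]]

def phi_mu(x, mu):
    vals = np.array([[f(x) for f in row] for row in F])
    inner = np.exp(-mu * vals).sum(axis=1)                 # sum_j exp(-mu f_ij)
    return np.log((1.0 / inner).sum()) / mu + np.log(vals.shape[1]) / mu

def grad_phi_mu(x, mu):
    vals = np.array([[f(x) for f in row] for row in F])
    grads = np.array([[g(x) for g in row] for row in dF])
    e = np.exp(-mu * vals)
    inner = e.sum(axis=1)
    weights = (e / inner[:, None] ** 2) / (1.0 / inner).sum()   # the mu_ij of (5)
    return (weights * grads).sum()                               # formula (4), 1-D case

def armijo_gradient(x0, mu0, sigma=0.01, beta=0.5, s=1.0, gamma=1.0, iters=50):
    x, mu = x0, mu0
    for _ in range(iters):
        h = -grad_phi_mu(x, mu)                # step 4: steepest descent direction
        alpha = s
        while phi_mu(x + alpha * h, mu) - phi_mu(x, mu) > -sigma * alpha * h ** 2:
            alpha *= beta                      # step 5: Armijo backtracking
        x = x + alpha * h                      # step 6
        mu = mu + gamma                        # step 7: u(mu_k) = mu_k + gamma
    return x

print(armijo_gradient(x0=5.0, mu0=1.0))
```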
The proof of the next proposition is similar to the proof of stationarity of limit points for gradient methods in [12]. We repeat the arguments for completeness.

Proposition 3.3 Let $\{x_k\}$ be a bounded sequence generated by Algorithm 1. Then the sequence $\{x_k\}$ converges to a stationary point.

Proof To arrive at a contradiction, assume that $\hat x$ is a limit point of $\{x_k\}$ with
$$\lim_{\mu\to\infty}\|\nabla\Phi_\mu(\hat x)\| \ge \rho$$
for some $\rho > 0$. Since $\Phi_\mu$ is continuous, the sequence $\Phi_{\mu_k}(x_k)$ is non-increasing (both because $\Phi_\mu$ is non-increasing in $\mu$ and because each iteration is a descent step), and
$$|\Phi_{\mu_1}(x) - \Phi_{\mu_2}(x)| \le \frac{\ln(M_I) + \ln(M_J)}{\min\{\mu_1,\mu_2\}}, \qquad \forall \mu_1,\mu_2 > 0,$$
it follows that $\Phi_{\mu_k}(x_k)$ converges to the finite value $\Phi(\hat x) = \lim_{\mu\to\infty}\Phi_\mu(\hat x)$.

We have
$$\Phi_{\mu_k}(x_k) - \Phi_{\mu_{k+1}}(x_{k+1}) \ge \Phi_{\mu_k}(x_k) - \Phi_{\mu_k}(x_{k+1}) \ge \sigma\alpha_k\|\nabla\Phi_{\mu_k}(x_k)\|^2,$$
where the first inequality follows from Propositions 2.2 and 2.4 and the second inequality from the definition of Armijo's rule. Hence, since $\sigma > 0$,
$$\alpha_k\|\nabla\Phi_{\mu_k}(x_k)\|^2 \to 0,$$
and therefore, by the hypothesis, $\{\alpha_k\} \to 0$.

From the definition of Armijo's rule, there is an index $\hat k \ge 0$ such that
$$\Phi_{\mu_k}(x_k) - \Phi_{\mu_k}\!\left(x_k - \frac{\alpha_k}{\beta}\nabla\Phi_{\mu_k}(x_k)\right) < \sigma\frac{\alpha_k}{\beta}\|\nabla\Phi_{\mu_k}(x_k)\|^2, \qquad \forall k \ge \hat k,$$
that is, the initial step-size will be reduced at least once for all $k \ge \hat k$. This can be written as
$$\frac{\Phi_{\mu_k}(x_k) - \Phi_{\mu_k}\!\left(x_k - \frac{\alpha_k\|\nabla\Phi_{\mu_k}(x_k)\|}{\beta}\,\frac{\nabla\Phi_{\mu_k}(x_k)}{\|\nabla\Phi_{\mu_k}(x_k)\|}\right)}{\dfrac{\alpha_k\|\nabla\Phi_{\mu_k}(x_k)\|}{\beta}} < \sigma\|\nabla\Phi_{\mu_k}(x_k)\|, \qquad \forall k \ge \hat k.$$
By the Mean Value Theorem, we obtain
$$\nabla\Phi_{\mu_k}\!\left(x_k - \tilde\alpha_k\frac{\nabla\Phi_{\mu_k}(x_k)}{\|\nabla\Phi_{\mu_k}(x_k)\|}\right)^{\!T}\frac{\nabla\Phi_{\mu_k}(x_k)}{\|\nabla\Phi_{\mu_k}(x_k)\|} < \sigma\|\nabla\Phi_{\mu_k}(x_k)\|, \qquad \forall k \ge \hat k,$$
where $\tilde\alpha_k$ is a scalar in the interval $\left[0, \frac{\alpha_k\|\nabla\Phi_{\mu_k}(x_k)\|}{\beta}\right]$. Since $\|\nabla\Phi_\mu\|$ is bounded, taking limits we obtain
$$\lim_{\mu\to\infty}\|\nabla\Phi_\mu(\hat x)\|(1 - \sigma) \le 0, \qquad \lim_{\mu\to\infty}\|\nabla\Phi_\mu(\hat x)\| \le 0,$$
which contradicts the initial assumption. Therefore
$$\lim_{\mu\to\infty}\|\nabla\Phi_\mu(\hat x)\| = 0$$
and $\hat x$ is a stationary point of $\Phi(x)$ by Corollary 3.2. $\square$

4 Numerical example

To prevent overflows, the values of $\Phi_\mu(x)$ and $\mu_{i,j}$ have to be computed carefully, in a similar way to the approximations of minimax problems in [4,6]. We use
$$\Phi_\mu(x) = \frac{1}{\mu}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp(-\mu f_{i,j}(x))}\right) + \frac{\ln(M_J)}{\mu} = \Phi(x) + \frac{1}{\mu}\ln\left(\sum_{i\in I}\frac{1}{\sum_{j\in J}\exp\big(-\mu(f_{i,j}(x)-\Phi(x))\big)}\right) + \frac{\ln(M_J)}{\mu},$$
and
$$\mu_{i,j} = \frac{\exp(-\mu f_{i,j}(x))\Big/\Big(\sum_{k\in J}\exp(-\mu f_{i,k}(x))\Big)^2}{\sum_{\rho\in I}\Big(\sum_{k\in J}\exp(-\mu f_{\rho,k}(x))\Big)^{-1}} = \frac{\exp\big(-\mu(f_{i,j}(x)-\Phi(x))\big)\Big/\Big(\sum_{k\in J}\exp\big(-\mu(f_{i,k}(x)-\Phi(x))\big)\Big)^2}{\sum_{\rho\in I}\Big(\sum_{k\in J}\exp\big(-\mu(f_{\rho,k}(x)-\Phi(x))\big)\Big)^{-1}}. \qquad (7)$$

Fig. 1  Approximations $\Phi_\mu$ for increasing values of $\mu$ (panels a–g).

Example 1 Consider the problem
$$\min_{x\in\mathbb{R}}\;\max_{i\in\{1,2\}}\;\min_{j\in\{1,2,3\}} f_{i,j}(x)$$
with
$$f_{1,1}(x) = 10x^2 - 15, \qquad f_{1,2}(x) = (x+2)^2 + 3, \qquad f_{1,3}(x) = 10(x-6)^2 - 10,$$
$$f_{2,1}(x) = 2x^2 - 5, \qquad f_{2,2}(x) = (3x-15)^2 - 10, \qquad f_{2,3}(x) = x.$$

Figure 1 shows how the approximation function $\Phi_\mu$ converges to $\Phi$ for increasing values of $\mu$. We run our algorithm with parameters $\sigma = 0.01$, $\beta = 0.5$, $s = 1$, $\mu_0 = 0.02$ and the updating rule
$$\mu_{k+1} = \begin{cases}\mu_k & \text{if } \|\nabla\Phi_{\mu_k}(x_k)\| > 0.5,\\[2pt] 2\mu_k & \text{if } \|\nabla\Phi_{\mu_k}(x_k)\| \le 0.5.\end{cases}$$
From the starting point $x_0 = -10$ our algorithm converges to $x^* = 0$ in 11 iterations. From the starting point $x_0 = 6$ the point $x^* = 5.51318$ is reached in 25 iterations.
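The shifted formulas above translate directly into code. The following Python sketch is illustrative and is not the authors' implementation; the helper names f_values, phi_mu and weights are ours. It applies the shifted expressions to the functions of Example 1 at $x = 0$, where simple arithmetic gives $\Phi(0) = -5$, and prints $\Phi_\mu(0)$ decreasing towards $\Phi(0)$ as $\mu$ grows, together with the check that the weights of (4)–(5) sum to one.

```python
# Illustrative sketch (not the authors' code) of the shifted evaluation of
# Phi_mu(x) and the weights of (7), using the f_ij of Example 1.
import numpy as np

def f_values(x):
    # The six functions of Example 1, arranged as a 2 x 3 array of f_ij(x).
    return np.array([[10 * x**2 - 15, (x + 2) ** 2 + 3, 10 * (x - 6) ** 2 - 10],
                     [2 * x**2 - 5, (3 * x - 15) ** 2 - 10, x]])

def phi_mu(x, mu):
    vals = f_values(x)
    phi = vals.min(axis=1).max()                  # Phi(x) = max_i min_j f_ij(x)
    inner = np.exp(-mu * (vals - phi)).sum(axis=1)   # shifted exponents, as above
    return phi + np.log((1.0 / inner).sum()) / mu + np.log(vals.shape[1]) / mu

def weights(x, mu):
    vals = f_values(x)
    phi = vals.min(axis=1).max()
    e = np.exp(-mu * (vals - phi))
    inner = e.sum(axis=1)
    return (e / inner[:, None] ** 2) / (1.0 / inner).sum()   # formula (7)

phi_true = f_values(0.0).min(axis=1).max()        # Phi(0) = -5
for mu in (1.0, 10.0, 20.0):
    print(mu, phi_true, phi_mu(0.0, mu))          # Phi_mu(0) decreases towards Phi(0)
print(weights(0.0, 10.0).sum())                   # the weights sum to 1
```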
Acknowledgment The authors wish to acknowledge support from EPSRC grant EP/C513584/1.

References

1. Mayne, D.Q., Polak, E., Trahan, R.: An outer approximations algorithm for computer-aided design problems. J. Optim. Theory Appl. 28(3), 331–352 (1979)
2. Drezner, Z., Thisse, J.-F., Wesolowsky, G.O.: The minimax–min location problem. J. Reg. Sci. 26(1), 87–101 (1986)
3. Polak, E., Royset, J.O.: Algorithms for finite and semi-infinite min–max–min problems using adaptive smoothing techniques. J. Optim. Theory Appl. 119(3), 421–457 (2003)
4. Bertsekas, D.: Constrained Optimization and Lagrange Multiplier Methods. Athena Scientific, Nashua (1996)
5. Xingsi, L.: An entropy-based aggregate method for minimax optimization. Eng. Opt. 18, 277–285 (1992)
6. Xu, S.: Smoothing method for minimax problems. Comput. Optim. Appl. 20(3), 267–279 (2001)
7. Polak, E., Royset, J.O., Womersley, R.S.: Algorithms with adaptive smoothing for finite minimax problems. J. Optim. Theory Appl. 119(3), 459–484 (2003)
8. De Bruijn, N.G.: Asymptotic Methods in Analysis, 2nd edn. Bibliotheca Mathematica, vol. IV. North-Holland, Amsterdam (1961)
9. Parpas, P., Rustem, B., Pistikopoulos, E.N.: Linearly constrained global optimization and stochastic differential equations. J. Global Optim. 36(2), 191–217 (2006)
10. Hwang, C.: Laplace's method revisited: weak convergence of probability measures. Ann. Probab. 8(6), 1177–1182 (1980)
11. Polak, E.: Optimization: Algorithms and Consistent Approximations. Applied Mathematical Sciences, vol. 124. Springer, New York (1997)
12. Bertsekas, D.: Nonlinear Programming. Athena Scientific, Nashua (1999)