Canonical Duality Theory for Solving General Mixed Integer Nonlinear Programming Problems with Applications

David Gao
Alex Rubinov Prof. of Mathematics, Federation University
Research Prof. of Eng. Science, Australian National University

1. Duality Gap between Math and Physics: conceptual problems
2. Canonical Duality-Triality: Unified Modeling, Unified Solutions
3. Challenges and Breakthrough

Gap between Math and Mechanics

Nonlinear/Global Optimization Problem:
$\min f(x)$ s.t. $g(x) \le 0$
$f(x)$ is an "objective" function; $g(x)$ is a general constraint.

(Naive) questions: What is the objective function? Target and cost? What is the Lagrangian? …

"Mathematics is a part of physics. … In the middle of the twentieth century it was attempted to divide physics and mathematics. The consequences turned out to be catastrophic." (V.I. Arnold, 1997)

"Mathematics needs to remarry physics." (A. Jaffe)
Gao-Ogden-Ratiu, Springer

"Duality in mathematics is not a theorem, but a 'principle'." (Sir M.F. Atiyah)

Duality gap is not allowed in mathematical physics!

Canonical Duality-Triality Theory
Gao-Strang, 1989 (MIT) and Gao, 1991 (Harvard)

A methodological theory comprising mainly:
1. Canonical dual transformation: unified modeling
2. Complementary-dual principle: unified solution
3. Triality theory: identify both global and local extrema, design powerful algorithms, unified understanding of complexities

[Figure: graphs of $P(x)$ and $P^d(\sigma)$, with panels labeled max = max, min = min and min = max.]

"Nothing is too wonderful to be true, if it be consistent with the laws of nature." (Michael Faraday, 1860)

Philosophical Foundation

I-Ching (易經, 2800 BC-2737 BC): The fundamental Law of Nature is the Dao: the complementarity of one Yin and one Yang (一陰一陽之謂道, "one Yin and one Yang are called the Dao").
Laozi: All things have the receptivity of the yin and the activity of the yang. Through union with the life-giving force (chi) they blend in harmony.

Everything = {(Yin, Yang); Chi} = {(subj., obj.)
; verb}
Canonical System = {(Yin, Yang) | H-Chi} = {$(X, X^*)$ | $A$}

Convex Canonical System: Unified Modeling

(P): $\min P(x) = W(Dx) - F(x)$ s.t. $x \in X_c = \{ x \in X_a \mid Dx \in Y_a \}$

$F(x) = f^T x$: subjective function.
The 1st duality: $x^* = \partial F(x) = f$ (action-reaction).

$W(y)$: objective function (Gao, 2000): $W(Qy) = W(y)$ for all $Q$ with $Q^T = Q^{-1}$, $\det Q = 1$.
Example: $W(y) = \tfrac12 |y|^2$, since $|Qy|^2 = y^T Q^T Q y = |y|^2$.
The 2nd duality: $y^* = \partial W(y)$ (constitutive law).

[Diagram: input, convex system, output. $x \in X_a$ is paired with $x^* = f \in X_a^*$ through $\partial F$; $y \in Y_a$ is paired with $y^* \in Y_a^*$ through $\partial W$; $D$ maps $x$ to $y = Dx$ and $D^* = D^T$ maps $y^*$ back to $x^*$.]

Legendre transformation: $W^*(y^*) = y^T y^* - W(y)$
Lagrangian: $L(x, y^*) = (Dx)^T y^* - W^*(y^*) - f^T x = x^T (D^T y^* - f) - W^*(y^*)$
($P^d$): $\max P^d(y^*) = -W^*(y^*)$ s.t. $D^T y^* = f$
$\min P(x) = \max P^d(y^*)$

Objectivity (frame-indifference) is not a hypothesis, but a principle.
Objectivity: Gao, 2000; P.G. Ciarlet, Nonlinear Functional Analysis, 2013, SIAM.

Manufacturing Company System

[Diagram: products $x \in X_a$ and price $x^* \in X_a^*$, with $F(x) = x^T x^*$; workers $y \in Y_a$ and salary $y^* \in Y_a^*$; the company links the two levels through $D$ and $D^*$.]

(P): $\min P(x) = W(Dx) - F(x)$
Here $P(x)$ is the target (loss), $W(Dx)$ the cost and $F(x)$ the income.

Unified Understanding of Constraints (Gao, 1997)

(P): $\min P(x) = W(Dx) - U(x)$
$W(y)$: $Y_a = \{ y \in Y \mid g(y) \ge 0 \}$, physically feasible
$U(x)$: $X_a = \{ x \in X \mid Bx \le 0 \}$, geometrically feasible

Boundary (external) constraints in $X_a$, with $u = Bx$ and $u^* = B^* x^*$. External KKT conditions:
$0 \ge Bx = u \perp u^* = B^* x^* \ge 0$

Constitutive (objective) constraints in $Y_a$, with $u = g(y)$ and $u^* = g^*(y^*)$. Internal KKT conditions:
$0 \le g(y) = u \perp u^* = g^*(y^*) \le 0$

Indicator (J.-J. Moreau, 1963):
$\overline{W}(y) = W(y)$ if $g(y) \ge 0$, $+\infty$ otherwise.
$\partial \overline{W}$ gives the constitutive law and the KKT conditions.

Math = {$(X, X^*)$; $A$}
(P): $\min P(x) = W(Dx) - U(x)$, $x \in X$, that is, Obj. - Subj.

Canonical Duality-Triality Theory

Nonconvex $W(y)$
(P): $\min P(x) = W(Dx) - x^T f$

1. Canonical transformation:
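choose an objective measure $\varepsilon = \Lambda(x)$ such that $W(Dx) = V(\Lambda(x))$, with $V$ convex in $\varepsilon$.
Canonical dual equation (one-to-one): $\sigma = \partial V(\varepsilon)$
Legendre transformation: $V^*(\sigma) = \varepsilon^T \sigma - V(\varepsilon)$
Total complementary function (Gao-Strang, 1989):
$\Xi(x, \sigma) = \Lambda(x)^T \sigma - V^*(\sigma) - x^T f$
For quadratic $\Lambda$: $\Xi(x, \sigma) = \tfrac12 x^T G(\sigma) x - V^*(\sigma) - x^T f$
$\partial_x \Xi = 0$ gives the analytic solution $x = G(\sigma)^{-1} f$.
Canonical dual: $P^d(\sigma) = \Xi(x(\sigma), \sigma) = -\tfrac12 f^T G(\sigma)^{-1} f - V^*(\sigma)$

[Diagram: the pairs $(x, x^*)$, $(y, y^*)$ with $y^* = \partial W$, and $(\varepsilon, \varepsilon^*)$ with $\varepsilon^* = \partial V$, linked by $D$, $D^*$, $\Lambda$ and $\Lambda_t^*$; one label reads "gap function".]

2. Complementary-Dual Principle:
If $\sigma_c$ is a critical point of $P^d(\sigma)$, then $x_c = G(\sigma_c)^{-1} f$ is a critical solution of (P) and $P(x_c) = P^d(\sigma_c)$.

3. Triality Theory (Gao-Strang, 1989):
Let $S^+ = \{ \sigma \mid G(\sigma) \succ 0 \}$ and $S^- = \{ \sigma \mid G(\sigma) \prec 0 \}$.
If $\sigma_c \in S^+$, then $P(x_c) = \min P(x) = \max P^d(\sigma) = P^d(\sigma_c)$.
If $\sigma_c \in S^-$, then either $P(x_c) = \max P(x) = \max P^d(\sigma) = P^d(\sigma_c)$ (Gao, 1996) or $P(x_c) = \min P(x) = \min P^d(\sigma) = P^d(\sigma_c)$.

Example: nonconvex in $\mathbb{R}^n$, convex in $\mathbb{R}^1$

$P(x) = W(Dx) - F(x) = \tfrac12 (\tfrac12 |x|^2 - 1)^2 - x^T f$
$W(y) = \tfrac12 (\tfrac12 y^2 - 1)^2$
$\varepsilon = \tfrac12 |x|^2$, $V(\varepsilon) = \tfrac12 (\varepsilon - 1)^2$, $\sigma = \partial V(\varepsilon) = \varepsilon - 1$
$P^d(\sigma) = -\tfrac12 |f|^2 \sigma^{-1} - \tfrac12 \sigma^2 - \sigma$
$\partial P^d(\sigma) = 0$ gives $\sigma^2 (\sigma + 1) = \tfrac12 |f|^2$, with roots $\sigma_3 \le \sigma_2 \le 0 \le \sigma_1$.

Complementary-Dual Principle, analytic solutions: $x_k = \sigma_k^{-1} f$, $P(x_k) = P^d(\sigma_k)$, $k = 1, 2, 3$.
Triality Theory: $P(x_1) = P^d(\sigma_1)$, $P(x_2) = P^d(\sigma_2)$, $P(x_3) = P^d(\sigma_3)$.

[Figures: graphs of $P(x)$ and $P^d(\sigma)$ for $n = 1$ (double-well) and $n = 2$ (Mexican hat).]

Open problem (2003): if $\dim x \ne \dim \sigma$, then $P(x_2) = \min P(x) \ne \min_{\sigma < 0} P^d(\sigma) = P^d(\sigma_2)$. Solved in 2012.

$f = 0$: multiple solutions (Buridan's donkey). Perturbation $f \ne 0$: unique solution.

Quadratic Boolean Programming

(P): $\min P(x) = \tfrac12 x^T A x - f^T x$ s.t. $x \in \{-1, 1\}^n$
Canonical transformation: $\varepsilon_i = x_i^2 - 1 \le 0$
$\Xi(x, \sigma) = P(x) + \sum_i \sigma_i (x_i^2 - 1) = \tfrac12 x^T G(\sigma) x - \sum_i \sigma_i - f^T x$, where $G(\sigma) = A + 2\,\mathrm{Diag}(\sigma)$
$\partial_x \Xi(x, \sigma) = 0$ gives $x = G(\sigma)^{-1} f$.
($P^d$): $\max P^d(\sigma) = -\tfrac12 f^T [G(\sigma)]^{-1} f - \sum_i \sigma_i$ s.t. $\sigma \in S^+ = \{ \sigma \in \mathbb{R}^n \mid \sigma \ge 0,\ G(\sigma) \succ 0 \}$
KKT: $\sigma_i \ge 0$, $\varepsilon_i = x_i^2 - 1 \le 0$, $(x_i^2 - 1)\,\sigma_i = 0$. Hence $\sigma_i \ne 0$ implies $x_i^2 = 1$: integer!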
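The Boolean quadratic dual above can be checked numerically on a tiny instance. The sketch below is only an illustration, not the solution method of the talk: it enumerates the candidates in {-1, 1}^n, recovers the dual point that belongs to each candidate from G(sigma) x = f together with x_i^2 = 1 (which gives sigma_i = x_i (f_i - (A x)_i) / 2, a relation derived here rather than stated on the slides), and accepts the candidate when sigma is strictly positive and G(sigma) is positive definite. The 2-by-2 data `A` and `f`, the helper names and the pure-Python Cholesky test are all assumptions made for the example.

```python
from itertools import product
from math import sqrt


def matvec(A, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def primal(A, f, x):
    """P(x) = 1/2 x^T A x - f^T x."""
    return 0.5 * dot(x, matvec(A, x)) - dot(f, x)


def cholesky(G, tol=1e-12):
    """Lower-triangular L with G = L L^T, or None if G is not positive definite."""
    n = len(G)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if acc <= tol:
                    return None
                L[i][i] = sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    return L


def solve_spd(L, b):
    """Solve (L L^T) x = b by forward and back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def dual(A, f, sigma):
    """Return (P^d(sigma), x(sigma)), or None when G(sigma) is not positive definite."""
    n = len(f)
    G = [[A[i][j] + (2.0 * sigma[i] if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(G)
    if L is None:
        return None
    x = solve_spd(L, f)
    return -0.5 * dot(f, x) - sum(sigma), x


def certified_minimizer(A, f):
    """Search {-1, 1}^n for an x whose dual point lies in S+ (illustration only)."""
    for x in product((-1, 1), repeat=len(f)):
        Ax = matvec(A, x)
        sigma = [0.5 * xi * (fi - ai) for xi, fi, ai in zip(x, f, Ax)]
        if all(s > 0 for s in sigma):
            result = dual(A, f, sigma)
            if result is not None:
                return list(x), sigma, result[0]
    return None


# Hypothetical 2-variable instance chosen for the example.
A = [[2.0, 1.0], [1.0, 2.0]]
f = [4.0, -3.0]
x_c, sigma_c, dual_value = certified_minimizer(A, f)
brute_force = min(primal(A, f, x) for x in product((-1, 1), repeat=2))
print(x_c, sigma_c, dual_value, brute_force)
```

For this instance the search accepts x = (1, -1) with sigma = (1.5, 1), and the dual value agrees with the brute-force minimum of -6 up to rounding. The enumeration is exponential in n and is used here only to make the certificate visible; in practice one would maximize the concave dual over S+ directly.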
[Figure: graphs of $P(x)$ and $P^d(\sigma)$, labeled min = min and $\min P(x) = \max P^d(\sigma)$.]

Thm (Gao, 2007): For each critical point $\sigma_c \ne 0$, the vector $x_c = G^{-1}(\sigma_c) f \in \{-1, 1\}^n$ is a KKT point of $P(x)$ and $P(x_c) = P^d(\sigma_c)$.
If $G(\sigma_c) \succ 0$, then $P(x_c) = \min P(x) = \max P^d(\sigma) = P^d(\sigma_c)$.
If $G(\sigma_c) \prec 0$, then $P(x_c) = \min P(x) = \min P^d(\sigma) = P^d(\sigma_c)$.
(P) could be NP-hard if $P^d(\sigma)$ has no critical point in $S^+$.

Results for the Max-Cut Problem (NP-Complete)
Wang-Fang-Gao-Xing (2012), J. Global Optimization

$\max P(x) = \tfrac12 x^T A x - f^T x$ s.t. $x \in \{0, 1\}^n$, where $f^T x$ is a linear perturbation
($P^d$): $\max P^d(\sigma) = -\tfrac12 f^T [G(\sigma)]^{-1} f - \sum_i \sigma_i$ s.t. $G(\sigma) \succeq 0$

[Table: comparison of the running time of the canonical dual approach and of GW's approach (Goemans and Williamson).]

Max-Cut Problem (continued)
■ 50 instances are generated randomly on graphs of sizes 20, 50, 100, 150, 200 and 500. The weight of each edge is drawn uniformly from [0, 10].
■ "Ave ratio" is the average approximation ratio; the ratio gets close to 1 as the dimension increases.

The 2nd Canonical Dual for Integer Programming

(P): $\min P(x) = \tfrac12 x^T A x - f^T x$ s.t. $x \in \{-1, 1\}^n$
The second canonical dual (Gao, 2009):
($P^g$): $\min P^g(\sigma) = -\tfrac12 \sigma^T A^{-1} \sigma - \sum_i | f_i - \sigma_i |$ s.t. $\sigma \in \mathbb{R}^n$
Nonconvex/nonsmooth minimization: DIRECT method (deterministic)

Thm: If $\sigma_c$ is a solution of ($P^g$), then
$x_{c,i} = 1$ if $f_i > \sigma_{c,i}$, $x_{c,i} = -1$ if $f_i < \sigma_{c,i}$
is a feasible solution of (P) and $P(x_c) = P^g(\sigma_c)$.
If $A \succ 0$, then $P(x_c) = \min P(x) = \max P^g(\sigma) = P^g(\sigma_c)$.
If $A \prec 0$, then $P(x_c) = \min P(x) = \min P^g(\sigma) = P^g(\sigma_c)$.
If $A = -B^T B$ with $B \in \mathbb{R}^{m \times n}$, $m < n$, then $P^g(\sigma) = \tfrac12 \sigma^T \sigma - \sum_{i=1}^{n} \big| f_i - \sum_{j=1}^{m} B_{ji} \sigma_j \big|$.

[Figures: graphs of $P(x)$ and $P^g(\sigma)$.]

General MINLP Problems

(P): $\min P(x, y) = W(x, y) + a^T x - b^T y$, $x \in X_a$, $y \in Y_a$
s.t. $C_1 x + C_2 y \le c$, $D_1 x + D_2 y = d$,
$X_a = \{ x \in \mathbb{R}^n \mid 0 \le x \le u \}$, $Y_a = \{ y \in \mathbb{Z}^m \mid 0 \le y \le v \}$

Let $z = (x, y)$ and assume $W(z)$ is objective, so that there exist an objective measure $\varepsilon = \Lambda(z)$ and a convex $V(\varepsilon)$ with $W(z) = V(\Lambda(z))$.
Canonical form: $\min P(z) = V(\Lambda(z)) - f^T z$ s.t. $z \in Z_a$

Mixed Integer (Fixed Cost) Problem (with H.D. Sherali and N. Ruan)

(P): $\min P(x, y) = \tfrac12 x^T A x + c^T x - f^T y$ s.t.
$-y \le x \le y$, $y \in \{0, 1\}^n$
($P^d$): $\max P^d(\sigma) = -\tfrac12 c^T G(\sigma)^{-1} c - \tfrac12 \sum_i (\sigma_i - f_i)^+$ s.t. $\sigma \ge 0$, $G(\sigma) = A + 2\,\mathrm{Diag}(\sigma)$ positive definite

Thm: If $\sigma_c$ is a solution of ($P^d$), then
$x_c = -G(\sigma_c)^{-1} c$, with $y_{c,i} = 1$ if $f_i < \sigma_{c,i}$ and $y_{c,i} = 0$ if $f_i > \sigma_{c,i}$,
is a global solution of (P) and $P(x_c, y_c) = P^d(\sigma_c)$.

Applications to scheduling and decision science

$x \in \mathbb{R}^{d \times n}$

Problems that can be solved

Benchmark problems:
1. Rosenbrock function
2. Lennard-Jones potential minimization
3. Three Hump Camel Back Problem
4. Goldstein-Price Problem
5. Minimization of polynomials of order 2n
6. Canonical functions …

New math: nonlinear space

Nonconvex constrained problems

(P): $\min P(x) = \| y - z \|^2$
s.t. $h(y) = \tfrac12 y^T A y - r$ (ellipsoid), $g(z) = \tfrac12 ( \| z - c \|^2 - b )^2 - d^T (z - c)$

Lagrangian, with $x = (y, z) \in \mathbb{R}^{2n}$:
$L(x, \lambda, \mu) = \| y - z \|^2 + \lambda h(y) + \mu g(z)$

Let $\varepsilon = \Lambda(z) = \| z - c \|^2$, $V(\varepsilon) = \tfrac12 (\varepsilon - b)^2$.
Then $\sigma = \partial V(\varepsilon) = \varepsilon - b$ and $V^*(\sigma) = \varepsilon \sigma - V(\varepsilon) = \tfrac12 \sigma^2 + b \sigma$.

Total complementary function:
$\Xi(x, \lambda, \mu, \sigma) = \| y - z \|^2 + \lambda h(y) + \mu [ \Lambda(z) \sigma - V^*(\sigma) - d^T (z - c) ]$
$G(\lambda, \mu, \sigma) = 0$
($P^d$): $P^d(\sigma) = \min_x \Xi(x, \lambda, \mu, \sigma) = -\tfrac12 F^T G(\lambda, \mu, \sigma)^{-1} F - \mu V^*(\sigma)$

Thm: If $G(\lambda, \mu, \sigma) \succ 0$, then ($P^d$) has at least one critical solution, which gives a global optimal solution of (P).

[Figure: the feasible sets for $y$ and $z$.]

Challenges: Super-Duality

Since 2010, Zalinescu (+2) has written 11 papers + 1 letter challenging the Canonical Duality Theory. They can be grouped into three categories:

1. Conceptual Duality (4 papers, two published and two rejected)
• $\min P(x) = V(\Lambda(x)) - F(x)$
$F(x)$: external energy (must be a linear function), $\partial F(x) = x^* = f$
$V(\varepsilon)$: internal (stored) energy (must be objective), $\partial V(\varepsilon) = \sigma$

2. Moral Duality (6 papers), all on the same open problem left in 2003: $\min P(x) \ne \min_{\sigma \in S^-} P^d(\sigma)$

3.
Multi-scale duality (1 paper): locally correct but globally wrong. If $\dim P \ne \dim P^d$, a certain condition in $S^+$ is missing.

Total complementary function $\Xi(x, \lambda, \mu, \sigma)$, $x = (y, z) \in \mathbb{R}^{2n}$

[Figure: a "counter-example" in the variables $y$ and $z$, and the hidden truth behind it.]

Conclusion quoted from the challenge: "The consideration of the Gao-Strang function $\Xi(x, \lambda, \mu, \sigma)$ is useless, at least for the problem studied in [3]."
Morales-Gao (2012): linear perturbation, $\Xi(x, \lambda, \mu, \sigma) - k^{-1} x^T f$

Unified Global Optimization

[Diagram: Canonical Duality-Triality Theory as the common core of the following areas.]
Discrete optimization: Combinatorial Optim., Integer Programming
Combinatorial Algebra: graph, lattice, fuzzy, max-plus algebra
Mixed Integer Optim.: supply chain process
Nonconvex/nonsmooth Variational/V.I. Analysis
Continuous Optimization
Numerical Analysis: FEM, FDM, FVM, SDP, Meshless, Wavelet, SIP

Duality Principles in Nonconvex Systems: Theory, Methods and Applications
David Yang Gao
Kluwer Academic Publishers, 2000, 454pp

Part I Symmetry in Convex Systems
1. Mono-duality in static systems
2. Bi-duality in dynamical systems
Part II Symmetry Breaking: Triality Theory in Nonconvex Systems
3. Tri-duality in nonconvex systems
4. Multi-duality and classifications of general systems
Part III Duality in Canonical Systems
5. Duality in geometrically linear systems
6. Duality in finite deformation systems
7. Applications, open problems and concluding remarks

Duality in fluid mechanics?

"All happy families are alike" (reason: canonical duality); "every unhappy family is unhappy in its own way" (reason: different duality gaps).
Anna Karenina, Leo N. Tolstoy

Philosophy = Love of Canonical Duality
Proof:
1. By the Greeks: Philosophy = Love of Wisdom
2. By Confucius: The highest Wisdom = Dao
3. By I-Ching (4000 BC): Dao = one Yin + one Yang = Canonical Duality (一陰一陽之謂道, 易經)

Open Problem: How to correctly understand the Triality

Canonical Duality-Triality Theory:
1. Non-convex → concave
2. Discrete → continuous
3. Non-smooth → smooth
4. Rescaling: $\mathbb{R}^n \to \mathbb{R}^m \to \mathbb{R}^r$, $n > m > r$
5. Diff. eqn → algebraic eqn.
6. Non-deterministic → deterministic
7. Challenges → Breakthrough

[Diagram: the operators $\Lambda$ and $\Lambda^*$ acting between $\mathbb{R}^n$, $\mathbb{R}^m$ and $\mathbb{R}^r$ on the pairs $(x, x^*)$ and $(y, y^*)$.]

Open Problem: Is (P) NP-hard if ($P^d$) has no solution in $S_a^+$?
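As a closing numerical check of the triality statements, the double-well example from earlier in the talk can be evaluated directly for n = 1. The sketch below is a hedged illustration, not the author's algorithm: it finds the three roots of sigma^2 (sigma + 1) = f^2 / 2 by bisection (a method chosen here for simplicity), maps each root to x_k = f / sigma_k, and compares the primal and dual values. The value f = 0.3 and all function names are assumptions made for the example; three real roots require 0 < f^2 < 8/27, because the cubic sigma^3 + sigma^2 has its local maximum 4/27 at sigma = -2/3.

```python
def primal(x, f):
    """P(x) = 1/2 (1/2 x^2 - 1)^2 - x f for scalar x."""
    return 0.5 * (0.5 * x * x - 1.0) ** 2 - x * f


def dual(sigma, f):
    """P^d(sigma) = -1/2 f^2 / sigma - 1/2 sigma^2 - sigma."""
    return -0.5 * f * f / sigma - 0.5 * sigma * sigma - sigma


def bisect(g, lo, hi, iters=200):
    """Root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def dual_critical_points(f):
    """Roots s1 > 0 > s2 > s3 of sigma^2 (sigma + 1) = f^2 / 2."""
    c = 0.5 * f * f
    if not 0.0 < c < 4.0 / 27.0:
        raise ValueError("need 0 < f^2 < 8/27 for three real roots")

    def g(s):
        return s * s * (s + 1.0) - c

    s1 = bisect(g, 0.0, 1.0)           # positive root, in S+
    s2 = bisect(g, -2.0 / 3.0, 0.0)    # negative root closer to zero
    s3 = bisect(g, -1.0, -2.0 / 3.0)   # most negative root
    return s1, s2, s3


f = 0.3  # hypothetical perturbation chosen for the example
sigmas = dual_critical_points(f)
points = [f / s for s in sigmas]
for k, (s, x) in enumerate(zip(sigmas, points), start=1):
    print(k, s, x, primal(x, f), dual(s, f))
```

With this choice of f the positive root corresponds to the global minimizer of P, while the two negative roots correspond to the local minimizer in the other well and to the local maximizer near the origin; at each of the three points the primal and dual values coincide up to rounding.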