Semi-Matchings for Bipartite Graphs and Load Balancing
Nick Harvey, Richard Ladner, Laszlo Lovasz, Tami Tamir

Talk Outline
• “Swiss Bank” Problem
• Formal Definitions
• Optimal Semi-Matchings
• Algorithms
• Experiments

“Swiss Bank” Problem
• 5 bank tellers, each speaks a different language
• 10 bank customers, each speaks one or more languages
• Assume servicing a customer takes 1 time unit
• Problem: Assign each customer to a teller

Problem Model: Bipartite Graph
[Figure: bipartite graph with Customers on one side and Tellers (German, Italian, French, Romansh, English) on the other. An edge joins a customer to each teller whose language the customer speaks; for example, one customer speaks German and Italian.]

Customer Assignment
[Figure: the same graph, with each customer assigned to one teller.]

Optimization Objectives
Minimize:
1. Flow time: total time customers wait (or average time)
2. Makespan: maximum time a customer waits
3. Variance: load balance of tellers’ queue lengths

Customer Wait Time (Flow Time)
Teller     Wait time of its queue
German     1
Italian    1
French     4+3+2+1 = 10
Romansh    2+1 = 3
English    2+1 = 3
Total Wait Time: 1+1+10+3+3 = 18 units

Teller Load and Variance
Teller     Load    Square of difference from mean
German     1       $(1-2)^2 = 1$
Italian    1       $(1-2)^2 = 1$
French     4       $(4-2)^2 = 4$
Romansh    2       $(2-2)^2 = 0$
English    2       $(2-2)^2 = 0$
Mean Load: 10/5 = 2
Variance: (1+1+4+0+0)/5 = 6/5

Optimal Assignment
[Figure: an assignment in which every teller serves exactly 2 customers.]
Variance: 0
Total Wait Time: (2+1)*5 = 15 units

Formal Definitions
Let $G = (U \cup V, E)$ be a bipartite graph.
[Figure: vertex sets U and V.]

Matchings
$M \subseteq E$ is a matching if each vertex is incident with at most one edge in M.
[Figure: U and V; the blue edges are a matching, M.]

Matchings
• First studied by Philip Hall (1904-1982) of Cambridge University
• “Marriage Theorem” characterizes the existence of perfect matchings
P. Hall, On representatives of subsets, J. London Math. Soc. 10 (1935), 26-30.

Matchings
• “Hungarian Algorithm” used to find matchings of maximum cardinality
• Harold W. Kuhn, Princeton University
H. W. Kuhn, The Hungarian method for the assignment problem, Naval Res. Logist. Quart. 2:83-97, 1955.

Semi-Matchings
$M \subseteq E$ is a semi-matching if each U-vertex is incident with exactly one edge in M.
[Figure: U and V; the blue edges are a semi-matching, M.]

Max-Weight Semi-Matchings
• Let $w: E \to \mathbb{R}$ be an edge-weight function
• Problem (Lawler ’76): Find a semi-matching M maximizing $\sum_{e \in M} w(e)$
• Solvable by a simple greedy algorithm
Eugene Lawler, Combinatorial Optimization: Networks and Matroids. Holt, Rinehart & Winston, 1976.

Optimal Semi-Matching, M
• Edges are unweighted
• Let $\deg_M(v)$ denote the number of M-edges incident with $v \in V$
• Define the cost of M at a vertex $v \in V$: $c_M(v) = \sum_{i=1}^{\deg_M(v)} i = 1 + 2 + \dots + \deg_M(v)$
• Define $c(M) = \sum_{v \in V} c_M(v)$
• M is an optimal semi-matching if $c(M)$ is minimal

Optimal Semi-Matchings
$c(M)$ gives the total waiting time of the “customers” in U.
[Figure: U and V, with $c_M(v)$ shown for each V-vertex: 1, 1, 4+3+2+1 = 10, 2+1 = 3, 2+1 = 3.]
$c(M) = 1+1+10+3+3 = 18$

Talk Outline
• “Swiss Bank” Problem
• Formal Definitions
• Optimal Semi-Matchings
    • Cost Reducing Paths
    • Optimality Criterion
    • Lp-norm
• Algorithms
• Experiments

Optimality Properties
Optimal Semi-Matchings have useful load balancing properties:
1. Minimize variance of $\{ \deg_M(v) \}$
2. Minimize $\max \{ \deg_M(v) \}$
3. Minimize Lp-norm of $\{ \deg_M(v) \}$
Optimal Semi-Matchings contain a maximum matching as a subset
• And a max matching is easy to find

Alternating Paths
• Let $P = (v_1, u_1, \dots, u_{k-1}, v_k)$ be a path in G
• If $\{v_i, u_i\} \in M$ and $\{u_i, v_{i+1}\} \in E \setminus M$ for all i, then P is called an alternating path
[Figure: path P through $v_1, u_1, v_2, u_2, v_3$. Blue edges are in M; white edges are in $E \setminus M$. P is an alternating path.]

Cost-Reducing Paths (CRPs)
• Let P be an alternating path in G
• If $\deg_M(v_k) < \deg_M(v_1) - 1$ then P is called a cost-reducing path
[Figure: $\deg_M(v_1) = 2$ and $\deg_M(v_3) = 1$. P is not a cost-reducing path.]
[Figure: $\deg_M(v_1) = 4$ and $\deg_M(v_2) = 1$. P is a cost-reducing path.]

Improvement with CRPs
• Let $P = (v_1, u_1, \dots, u_{k-1}, v_k)$ be a CRP
• Remove $\{v_i, u_i\}$ from M for all i
• Add $\{u_i, v_{i+1}\}$ to M for all i
[Figure: after switching along P, $\deg_M(v_1)$ goes from 4 to 3 and $\deg_M(v_2)$ goes from 1 to 2.]

Improvement with CRPs
• $c_M(v_1)$ decreases by $\deg_M(v_1)$
• $c_M(v_k)$ increases by $\deg_M(v_k) + 1$
• Total decrease is $\deg_M(v_1) - \deg_M(v_k) - 1 > 0$
[Figure: $c_M(v_1)$ goes from 10 to 6 and $c_M(v_2)$ goes from 1 to 3.]
$c(M)$: Decrease of 4 + Increase of 2 = Decrease of 2

Optimality Criterion
Theorem: A semi-matching is optimal if and only if no cost-reducing path exists.
Proof:
• Any CRP can reduce the cost of a semi-matching (this was just shown)
• If a semi-matching is not optimal then a cost-reducing path exists (this requires some proof)

Proof of Optimality Criterion
• Let M be a suboptimal semi-matching
• Let O be an optimal semi-matching with smallest symmetric difference with M
• Let $\Delta = (M \setminus O) \cup (O \setminus M)$
• In $\Delta$, color the edges of M red and the edges of O green
• Let $G_\Delta$ be G restricted to edge set $\Delta$

Construction of $G_\Delta$
Direct red edges $V \to U$ and green edges $U \to V$.
[Figure: suboptimal M, optimal O, and the resulting $G_\Delta$ with red edges from $M \setminus O$ and green edges from $O \setminus M$.]

Properties of $G_\Delta$
• Acyclicity: $G_\Delta$ contains no alternating red/green cycle
• Monotonicity: If there is an alternating red/green path in $G_\Delta$ from $v_1$ to $v_2$ in V then $\deg_O(v_1) \ge \deg_O(v_2)$
Both properties hold by choice of O.

Properties of $G_\Delta$
[Figure: $G_\Delta$ annotated with $\deg_O(v)$ = 2, 2, 2, 2, 2 on the V-vertices. It is acyclic and monotone.]

$G_\Delta$ yields CRP for M
[Figure: $G_\Delta$ annotated with $\deg_M(v)$ = 3, 2, 3, 2, 0 on the V-vertices.]
There is a cost-reducing red/green path for M.

Existence of CRP Proof
• Choose a V-vertex $v_1$ such that $\deg_M(v_1) > \deg_O(v_1)$
• Build a red/green path in $G_\Delta$ until we find a V-vertex $v_2$ with $\deg_{M \setminus O}(v_2) = 0$
• Then $\deg_M(v_2) \le \deg_O(v_2) - 1 \le \deg_O(v_1) - 1 < \deg_M(v_1) - 1$
    • First inequality: we arrived at $v_2$ on an $O \setminus M$ edge
    • Second inequality: monotonicity
    • Third inequality: choice of $v_1$
• The path from $v_1$ to $v_2$ is a cost-reducing path for M!

Lp-norm of load vector
• Let $x_i = \deg_M(v_i)$
• The Lp-norm of the vector $X = (x_1, x_2, \dots, x_m)$ is $\|X\|_p = \left( \sum_i x_i^p \right)^{1/p}$
• $\|X\|_1$ is always $|U|$
• For any p > 1, $\|X\|_p$ is a measure of the balance of the load on V-vertices
    • $\|X\|_2$ is the square root of the sum of squares
    • $\|X\|_\infty$ is the load of the most loaded V-vertex
    • $\|X\|_p$ is everything in between

Optimality of Lp-norm
Theorem: Let p > 1. A semi-matching is optimal iff the Lp-norm of its load vector is optimal.
Proof outline: Based on the following claims
1. A cost-reducing path can reduce the Lp-norm of the load vector
   Proof: Simple calculation
2. A semi-matching M has optimal Lp-norm iff no cost-reducing path relative to M exists
   Proof: Similar to the proof for optimal total cost

Optimality of $L_\infty$-norm
Theorem: An optimal semi-matching is optimal with respect to $L_\infty$ (load on most loaded teller).
Proof: more complicated
The converse does not hold:
                         Loads $x_i$    Total cost
Optimal $L_\infty$          2, 2, 0        6
Optimal semi-matching    1, 2, 1        5

Talk Outline
• “Swiss Bank” Problem
• Formal Definitions
• Optimal Semi-Matchings
• Algorithms
    • Network Flow Algorithms
    • Algorithm SM1
    • Algorithm SM2
• Experiments

Network Flow Algorithms
Can reduce the semi-matching problem to known network-flow problems:
1. Assignment Problem: requires $O(n^{0.5} \cdot m \cdot \log n)$ time (Gabow and Tarjan, 1989)
2. Min-cost Max-flow Problem: requires $O(n \cdot m \cdot \log^2 n)$ time (Goldberg and Tarjan, 1987)
where n = number of vertices and m = number of edges.

Assignment Problem
[Figure: G transformed into an instance of the Assignment Problem.]

Min-cost Max-flow Problem
[Figure: flow network with layers Source, U, V, Cost Centers, Sink.]

Algorithm SM1
• Simple modification of “Hungarian Algorithm” for Bipartite Matching
• Runtime: $O(n \cdot m)$, where $n = |U| + |V|$ and $m = |E|$
    • Same as Hungarian Algorithm
• Actual performance is not good

Algorithm SM1 Pseudocode
• Initially M is empty
• For each $u \in U$
    • Build tree T of alternating paths rooted at u
    • Let v be a V-vertex in T such that $\deg_M(v)$ is minimum
    • Switch matching and non-matching edges on path from v to u
    • Note: u is matched and $|M|$ increased by one

Algorithm SM1 Example
[Figure: bipartite graph with U-vertices added one at a time and V-vertices 1 to 5.]
• Initially: no one is assigned
• Step 1: assign $u_1$ to a least loaded V-vertex
• Step 2: assign $u_2$
• Step 3: assign $u_3$. Can increase the load on $v_2$ or $v_1$ or $v_3$; $v_3$ is the least loaded

Algorithm SM2
• General idea: Find and remove cost-reducing paths
• Runtime: $O(|U|^{3/2} \cdot |E|)$
    • Worse bound than Algorithm SM1
• Actual performance is very good!

Algorithm SM2 Pseudocode
• Quickly find an initial semi-matching M
• While M contains a cost-reducing path P
    • Improve M by switching edges along P
• Stop: M is optimal

Step 1: Initial Semi-Matching
Any semi-matching will work, but a near-optimal one is better.
Easy approach
• Match each $u \in U$ with its least-loaded V-neighbor
Better approach
• Sort vertices in U by increasing degree
• Match each $u \in U$ with its least-loaded V-neighbor. In case of a tie, choose the V-neighbor with least degree.
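The "better approach" for the initial semi-matching can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `greedy_semi_matching` and `total_cost`, the dictionary-based graph representation, and the sample graph are all assumptions made for this example, and every U-vertex is assumed to have at least one neighbor.

```python
from collections import defaultdict


def greedy_semi_matching(adj):
    """Greedy initial semi-matching (hypothetical helper, not from the talk).

    adj maps each U-vertex to a non-empty list of its V-neighbors.
    Returns a dict mapping each U-vertex to the V-vertex it is matched with.
    """
    # Degree of every V-vertex in G, used only to break ties.
    v_degree = defaultdict(int)
    for neighbors in adj.values():
        for v in neighbors:
            v_degree[v] += 1

    load = defaultdict(int)  # current deg_M(v) for each V-vertex
    match = {}
    # Process U-vertices by increasing degree.
    for u in sorted(adj, key=lambda x: len(adj[x])):
        # Least-loaded neighbor; ties go to the neighbor with least degree.
        v = min(adj[u], key=lambda x: (load[x], v_degree[x]))
        match[u] = v
        load[v] += 1
    return match


def total_cost(match):
    """c(M): sum over V-vertices of 1 + 2 + ... + deg_M(v)."""
    load = defaultdict(int)
    for v in match.values():
        load[v] += 1
    return sum(k * (k + 1) // 2 for k in load.values())


# Assumed sample graph: 4 customers (a-d), 3 tellers (x-z).
sample = {"a": ["x"], "b": ["x", "y"], "c": ["y", "z"], "d": ["z"]}
initial = greedy_semi_matching(sample)
print(initial, total_cost(initial))
```

The greedy result is only a starting point; in general it still has to be improved by removing cost-reducing paths.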
Step 1: Greedy Example
[Figure: greedy initial semi-matching on a bipartite graph with sides U and V.]

Step 2: Find CRP
Easy approach:
• For each $v \in V$
    • Build tree T of alternating paths rooted at v
    • If T contains a cost-reducing path, return it
• Return false
Runtime: $O(|V| \cdot |E|)$

Step 2: Find CRP
Better approach:
• Build forest F of alternating paths where each tree root is a least-loaded V-vertex that is not in F
• If F contains a cost-reducing path, return it
• Return false
Runtime: $O(|E|)$

Step 2: Find CRP Example
[Figure: search over U and V. One vertex has load 2: no CRP yet. One vertex has load 1. One vertex has the highest load (3 matched neighbors): a CRP has been found!]

Algorithm SM2 Analysis
• Step 1: Find Greedy Matching: $O(|E|)$
• Step 2: Find CRP: $O(|E|)$
• Step 3: Eliminate CRP: $O(|U| + |V|)$
How many CRPs must be eliminated to achieve optimality?
• Depends on cost of Greedy Assignment

Algorithm SM2 Num Iterations
• Worst-possible Greedy Assignment has Total Cost = $|U| \cdot (|U| + 1)/2$
• Each iteration reduces Total Cost by at least 1
• Therefore at most $O(|U|^2)$ iterations
• Total Runtime: $O(|U|^2 \cdot |E|)$
• Can prove tighter bound: $O(|U|^{3/2} \cdot |E|)$

Coin Towers Problem
• Start: Tower of coins C stories tall
• Goal: C towers of coins, each 1 story tall
• Coins can only move down and right
• Minimum number of moves is obviously C-1
• Problem: What is the maximum number of moves?

Coin Towers Example
[Figure: Towers 1 through 6. Total: 8 moves.]

Coin Towers Analysis
• Assume tower heights are non-increasing from left to right
• For any K, each coin moves at most K times before passing beyond Tower K (because each move goes right)
• Tower K has maximum height C/K. Thus, each coin moves at most C/K times after passing Tower K (because each move goes down)
• For arbitrary K, can prove that each coin moves at most K + C/K times
• Fix $K = \sqrt{C}$. Then the maximum possible number of moves is $O(C \cdot \sqrt{C}) = O(C^{1.5})$

Semi-Matching Experiments
• Compute Optimal Semi-Matchings
• Compare SM1 & SM2 to reduction to assignment problem
    • CSA: Goldberg & Kennedy, 1993
    • LEDA: www.algorithmic-solutions.com
• Use input graph generators from Cherkassky et al., 1998

Computing Optimal Semi-Matching
[Charts, one per graph family (FewG, Grid, Hilo, Hexa, ManyG, Rope): Runtime (s) against Number of Vertices (32768, 65536, 131072, 262144, 524288), with one series each for SM1, SM2, CSA, and LEDA.]

Maximum Matching Experiments
• Compute Maximum Matchings
• Compare SM1 & SM2 to existing matching algorithms
    • BFS: Breadth-First Search based alternating-path algorithm
    • LO: Push-relabel algorithm with Lo heuristic
    • Both from Cherkassky et al., 1998
• Use input graph generators from Cherkassky et al., 1998

Computing Maximum Matching
[Charts, one per graph family (FewG, Grid, Hexa, Hilo, ManyG, Rope, Zipf): Runtime (s) against Number of Vertices (32768, 65536, 131072, 262144, 524288), with one series each for SM1, SM2, BFS, and LO.]

Conclusions
• Optimal Semi-Matchings solve simple load balancing problems
    • Minimize maximum load and variance
• Optimal Semi-Matchings contain Maximum Bipartite Matchings
• Algorithm SM1 has an efficient theoretical bound
• Algorithm SM2 is efficient in practice at computing Optimal Semi-Matchings and Maximum Matchings

Questions?

Algorithm SM1 Example
[Figure: alternating tree rooted at u, over U-vertices 1 to 3 and V-vertices 1 to 5.]
• Build an alternating tree rooted at u. Edges $(u_i, v_j)$ are in $E \setminus M$ and edges $(v_j, u_i)$ are in M.
• Select v, the least loaded V-vertex in the tree.
• Re-assign matching edges on the path from u to v.

Algorithm SM1 Example (continued)
[Figure: U-vertices 1 to 7 added one at a time; V-vertices 1 to 5.]
• Step 4: assign $u_4$. Can increase the load on $v_1$, $v_2$, or $v_3$; all have the same load. Assign $u_4$ to $v_3$.
• Step 5: assign $u_5$
• Step 6: assign $u_6$
• Step 7: assign $u_7$. Can increase the load on $v_1$, $v_2$, or $v_3$; $v_1$ is the least loaded.
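
The SM1 steps walked through in the example (build an alternating tree rooted at u, select the least loaded V-vertex in it, re-assign edges along the path) can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function names `sm1` and `semi_matching_cost`, the dictionary-based graph representation, and the sample graph are hypothetical, and every U-vertex is assumed to have at least one neighbor.

```python
from collections import defaultdict, deque


def sm1(adj):
    """Sketch of Algorithm SM1 (hypothetical helper names).

    adj maps each U-vertex to a non-empty list of its V-neighbors.
    Returns a dict mapping each U-vertex to its matched V-vertex.
    """
    match = {}                   # u -> v currently matched with u
    assigned = defaultdict(set)  # v -> set of U-vertices matched with v

    for root in adj:
        # Breadth-first search over alternating paths rooted at `root`:
        # U -> V steps use non-matching edges, V -> U steps use matching edges.
        parent = {}              # v -> U-vertex from which v was reached
        seen_u = {root}
        queue = deque([root])
        best = None              # least-loaded V-vertex found so far
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in parent or match.get(u) == v:
                    continue
                parent[v] = u
                if best is None or len(assigned[v]) < len(assigned[best]):
                    best = v
                for w in assigned[v]:
                    if w not in seen_u:
                        seen_u.add(w)
                        queue.append(w)

        # Switch matching and non-matching edges on the path from best to root.
        v = best
        while True:
            u = parent[v]
            old = match.get(u)
            if old is not None:
                assigned[old].discard(u)
            match[u] = v
            assigned[v].add(u)
            if u == root:
                break
            v = old              # continue from the vertex u just left
    return match


def semi_matching_cost(match):
    """c(M): sum over V-vertices of 1 + 2 + ... + deg_M(v)."""
    load = defaultdict(int)
    for v in match.values():
        load[v] += 1
    return sum(k * (k + 1) // 2 for k in load.values())


# Assumed sample graph: u1 is first matched with v1, then moved to v2 when u2 arrives.
sample = {"u1": ["v1", "v2"], "u2": ["v1"], "u3": ["v1"]}
result = sm1(sample)
print(result, semi_matching_cost(result))
```

In the sample graph, assigning u2 forces a re-assignment: the alternating tree rooted at u2 reaches v2 through u1, and switching edges along that path moves u1 off v1.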