CS5234 Combinatorial and Graph Algorithms

Welcome!

Administrative
– Problem Sets
– Programming Assignment
– Exams

CS5234 Overview
Problem sets:
– Problem set 1 due. PS1: Hand it in now!
– Problem set 2 released tonight. PS2: Submit next week.
– IVLE (optional, electronic submission). Soon: once registration completes…

CS5234 Overview
Problem set grading:
– Distributed grading scheme: once registration completes, there will be an IVLE survey where you choose a week.
– Grading is supervised (and verified) by the TA.

CS5234 Overview
How to draw pictures?
– By hand: either submit hardcopy, or scan, or take a picture with your phone! Or use a tablet / iPad…
– Digitally:
  1. xfig (ugh)
  2. OmniGraffle (mac)
  3. Powerpoint (hmmm)
  4. ???

CS5234 Overview
Programming assignment(s):
– Berth assignment problem: last 3-4 weeks of the semester.
– Very hard algorithmic problem (related to graph partitioning).
– Design the best solution you can!
– Must know or learn: C++
– More details later…

CS5234 Overview
– Mid-term exam: October 8, in class
– Final exam: November 19, Reading Week
– Exams will be graded and returned.

Quick Review: Vertex Cover

Vertex Cover
Definition:
– Given: graph G = (V,E)
– Find: smallest cover C ⊆ V such that every edge e ∈ E is covered by some node v ∈ C.
Challenge:
– NP-complete: No polynomial time algorithm unless P = NP.

Vertex Cover
Example: (Suboptimal) cover (size 9)
Example: Optimal cover? (size 6)

Vertex Cover
Greedy approximation algorithm:
Repeat until every edge is covered:
1. Let e = (u,v) be an uncovered edge.
2. Add u and v to the cover.
Details:
– Graph representation: Sorted adjacency list…

Vertex Cover
Analysis:
1. For every matching M: OPT ≥ |M|.
2. The cover C contains a matching of size |C|/2.
3. Therefore: OPT ≥ |C|/2.
Conclusion: |C| ≤ 2OPT, so this is a 2-approximation algorithm.

Today’s Plan
– Vertex Cover (Review)
– Set Cover (Review + Analysis)
– Steiner Trees

Set Cover
Definition:
– Given: set X; subsets S1, S2, …, Sm where Sj ⊆ X
– Find: smallest collection I ⊆ {1, …, m} such that ⋃_{j ∈ I} Sj = X.
Challenge:
– NP-complete: No polynomial time algorithm unless P = NP.

Set Cover
Example: X = {C, C++, Java, Ruby, Python}
– Alice: C, C++
– Bob: C++, Java
– Collin: C++, Ruby, Python
– Dave: C, Java
Choose a good team: Collin and Dave form an optimal solution.

Set Cover
Greedy algorithm: repeatedly choose the set that contains the most uncovered elements.

Set Cover
Example (number of uncovered elements in each set, before and after each greedy step):

Step             S1  S2  S3  S4  S5
Start             4   6   3   5   4
1. Choose S2      2   0   3   3   2
2. Choose S3      1   0   0   2   1
3. Choose S4      1   0   0   0   0
4. Choose S1      0   0   0   0   0

Output: S1, S2, S3, S4
OPT: S1, S4, S5

Set Cover
To get a good approximation, show a lower bound on OPT.
– “OPT has to be AT LEAST this large.”
– E.g., “OPT ≥ |M|”

Set Cover
Example, before the first step (uncovered per set: S1 4, S2 6, S3 3, S4 5, S5 4):
Note: 12 elements, ≤ 6 elements per set, so OPT ≥ 2.

Set Cover
Example, after choosing S2 (uncovered per set: S1 2, S2 0, S3 3, S4 3, S5 2):
Note: 6 uncovered elements, ≤ 3 elements per set, so OPT ≥ 2.
General rule: k uncovered elements, ≤ t elements per set, so OPT ≥ k / t.

Notation
Assume GREEDY covers elements in order: x1, x2, x3, x4, x5, x6, x7, x8, …, xn
– xj = the jth item covered
– Example: S3 covers 3 items, S1 and S6 each cover 2 new items, S4 covers 1 new item, so the items are covered in groups of 3, 2, 2, 1.
– cj = number of items covered in the same step as xj
  c1, c2, c3, c4, c5, c6, c7, c8, … = 3, 3, 3, 2, 2, 2, 2, 1, …
– cost(xj) = 1/cj = amount paid to cover xj
  cost(x1) = cost(x2) = cost(x3) = 1/3, cost(x4) = cost(x5) = 1/2, …

Analysis
Assume (x1, x2, …, xj-1) are covered:
– How does OPT cover (xj, xj+1, …, xn)?
Fact: No set covers more than cj elements in the set (xj, xj+1, …, xn).
Why? If it did, GREEDY would have chosen it! (Tip: use the fact that the algorithm is GREEDY.)
As before: n – j + 1 uncovered elements, ≤ cj elements per set, so
OPT ≥ (n – j + 1) / cj = (n – j + 1) · cost(xj)
Hence: cost(xj) ≤ OPT / (n – j + 1)

Algebra
GREEDY pays 1 for each set it chooses, split evenly over the items that set newly covers, so:
|GREEDY| = cost(x1) + cost(x2) + … + cost(xn)
         ≤ OPT · (1/n + 1/(n – 1) + … + 1/2 + 1)
         ≤ OPT · (ln n + 1)

Set Cover
Conclusion:
Theorem: Greedy-Set-Cover is an O(log n) approximation algorithm for set cover.

A Warm-Up Problem
Problem Statement:
– Given: 2d (Euclidean) map & set of points of interest
– Find: shortest set of roads connecting all points

A Warm-Up Problem
Algorithm:
– Compute: for every pair (u,v): distance(u,v)
– Build: complete graph G = (V, E) where w(u,v) = distance(u,v)
– Find: minimum spanning tree of G.

Spanning Tree
Definition: a spanning tree is an acyclic subset of the edges that connects all nodes.
[Figure: example weighted graph; the highlighted spanning tree has weight 32.]

Minimum Spanning Tree
Definition: a spanning tree with minimum weight.
[Figure: the same graph; the minimum spanning tree has weight 20.]

Properties of MST
– Property 1: No cycles.
– Property 2: If you cut an MST, the two pieces are both MSTs.
– Property 3 (cycle property): For every cycle, the maximum weight edge is not in the MST.
– Property 4 (cut property): For every partition of the nodes, the minimum weight edge across the cut is in the MST.

Prim’s Algorithm
Prim’s Algorithm. (Jarnik 1930, Dijkstra 1957, Prim 1959)
Basic idea:
– S : set of nodes connected by blue edges.
– Initially: S = {A}
– Repeat:
  • Identify cut: {S, V–S}
  • Find minimum weight edge on cut.
  • Add new node to S.
[Figure: example weighted graph on nodes A–H.]

Kruskal’s Algorithm
Kruskal’s Algorithm. (Kruskal 1956)
Basic idea:
– Sort edges by weight.
– Consider edges in ascending order:
  • If both endpoints are in the same blue tree, then color the edge red.
  • Otherwise, color the edge blue.
Data structure:
– Union-Find
– Two nodes are connected if they are in the same blue tree.
[Figure: the same example graph on nodes A–H.]

A Warm-Up Problem
Algorithm: O(n² log n)
– Compute: for every pair (u,v): distance(u,v)
– Build: complete graph G = (V, E) where w(u,v) = distance(u,v)
– Find: minimum spanning tree of G.

Try again…
Problem Statement:
– Given: 2d (Euclidean) map & set of points of interest
– Find: shortest set of roads connecting all points

Thinking outside the box…
Euclidean Steiner Tree:
– Given: 2d (Euclidean) map & set of points of interest (POI)
– Find: set of extra points, and the shortest set of links connecting all POI

Steiner Tree Problems
Metric Steiner Tree:
– Given: set of required points R; set of optional points S; distance metric d(., .)
– Find: tree T = (V,E) : R ⊆ V ⊆ (S∪R)
  • Tree includes all required points.
  • Tree may include some optional points.
  • cost of tree is minimized: cost(T) = Σ_{(u,v) ∈ E} d(u,v)

Steiner Tree Problems
General Steiner Tree:
– Given: set of required points R; set of optional points S; set of edges E; edge weights w(.)
– Find: tree T = (V,E) : R ⊆ V ⊆ (S∪R)
  • Tree includes all required points.
  • Tree may include some optional points.
  • cost of tree is minimized

Steiner Tree Problems
Three variants:
– Euclidean Steiner Tree: points in a 2d-plane
– Metric Steiner Tree: distances form (any) metric
– General Steiner Tree: arbitrary distances
All three variants are NP-hard.
– No polynomial time solutions (unless P = NP).
– Find a good approximation algorithm.

Steiner Tree Problems
Proposed Algorithm:
– Step 1: Construct a graph G from points R. Ignore points S.
– Step 2: Find an MST of G.
Question: Is an MST a good approximation of a Steiner Tree?

Steiner Tree Problems
Challenge: Find the worst example you can for the proposed algorithm.
– Euclidean
– Metric
– General
Worst: maximize cost(MST) / Steiner_OPT.

Euclidean Steiner Tree
Given: A set R of points in the plane
Find: Minimum-cost tree spanning R (Euclidean metric)
[Figure: three points at pairwise distance 1. MST: cost = 2. With a Steiner point in the middle: cost = √3.]
Conjecture: the worst-case ratio of (MST-to-Steiner) is 2/√3 ≈ 1.155.

Metric Steiner Tree
Given: Sets R and S of points, and a distance metric
Find: Minimum-cost tree spanning R
[Figure: required points at pairwise distance 10, plus one optional centre point at distance 5 from each.]
– 3 required points: MST: 20, Steiner Tree: 15, Ratio: 1.33
– 4 required points: MST: 30, Steiner Tree: 20, Ratio: 1.5
– 5 required points: MST: 40, Steiner Tree: 25, Ratio: 1.6
– Generalize (n-gon): MST: 10(n-1), Steiner Tree: 5n, Ratio: 2(1 – 1/n)

General Steiner Tree
Given: Sets R and S of points, and a set of weighted edges
Find: Minimum-cost tree spanning R
[Figure: three required points joined pairwise by edges of weight 3000; one optional point joined to each of them by an edge of weight 1.]
MST: 6000, Steiner Tree: 3, Ratio: 2000
Conclusion: MST is NOT a good approximation for a general Steiner Tree.

Today’s Plan
– Euclidean Steiner Tree (skip) (left as an exercise)
– Metric Steiner Tree: Show that MST is a 2-approximation.
– General Steiner Trees: Reduce to Metric Steiner Tree.

Metric Steiner Tree
To get a good approximation, show a lower bound on OPT.
– “OPT has to be AT LEAST this large.”
– E.g., “OPT ≥ |M|”

Notation
Given:
– set of required points R
– set of optional points S
– distance metric d(., .)
Define:
– T = (V, E) be the optimal (minimum) Steiner Tree.
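Before the proof, the n-gon numbers from the Metric Steiner Tree example can be checked with a few lines of Python. This is only a sketch under stated assumptions: the function names `mst_cost` and `ngon_ratio` are hypothetical, the instance is taken to be n required points at pairwise distance 10 with a single optional centre `"c"` at distance 5 from each, and the star through the centre is taken to be the Steiner tree quoted on the slide.

```python
def mst_cost(points, dist):
    """Prim's algorithm (simple O(n^2) version) on a complete graph.

    `points` is a list of hashable nodes, `dist(u, v)` a symmetric distance.
    Returns the total weight of a minimum spanning tree.
    """
    points = list(points)
    if not points:
        return 0
    # best[p] = cheapest edge from p to the tree built so far
    best = {p: dist(points[0], p) for p in points[1:]}
    total = 0
    while best:
        v = min(best, key=best.get)   # minimum weight edge across the cut
        total += best.pop(v)          # add v to the tree
        for u in best:                # update the cut
            best[u] = min(best[u], dist(v, u))
    return total


def ngon_ratio(n):
    """MST-on-R cost, star-through-centre cost, and their ratio (assumed instance)."""
    def dist(u, v):
        if u == v:
            return 0
        if u == "c" or v == "c":      # optional centre point
            return 5
        return 10                     # any two required points

    required = list(range(n))
    mst = mst_cost(required, dist)               # proposed algorithm: ignore S
    steiner = mst_cost(required + ["c"], dist)   # here the MST on R plus c is the star
    return mst, steiner, mst / steiner


for n in (3, 4, 5, 100):
    print(n, ngon_ratio(n))
```

For n = 3, 4, 5 this reproduces the MST and Steiner costs listed for the example (20/15, 30/20, 40/25), and for large n the ratio approaches 2, which is the bound the following slides prove.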
Build a cycle…
Consider a DFS traversal of T: (x0, x1, x2, x3, …, xm) where x0 = xm
Example:
[Figure: example tree T on nodes a, b, c, d, e, f, g, h with edge lengths 1 and 2.]
All other pairs are distance 2.
DFS: a g d g f g a h c h b h a e a

Build a cycle…
Consider a DFS traversal of T:
Each edge is included in the DFS traversal twice.
cost(DFS) = 2cost(T) = 2OPT
Example: cost(T) = 8, cost(DFS) = 16
DFS: a g d g f g a h c h b h a e a

Shortcut Steiner nodes…
Example: Replace (a g d) with (a d)
DFS: a g d g f g a h c h b h a e a
NEW: a d g f g a h c h b h a e a
Triangle Inequality: d(a,d) ≤ d(a,g) + d(g,d)
Conclusion: cost(NEW) ≤ cost(DFS)
Hint: use the fact that distance is a metric.

Shortcut Steiner nodes…
Example: Replace (d g f) with (d f)
DFS: a g d g f g a h c h b h a e a
NEW: a d f g a h c h b h a e a
Conclusion: cost(NEW) ≤ cost(DFS)
Hint: use the fact that distance is a metric.
Shortcut Steiner nodes…
Example: Continue until done…
DFS: a g d g f g a h c h b h a e a
NEW: a d f a c b a e a
Conclusion: cost(NEW) ≤ cost(DFS)

Remove repeats…
Example: Replace (f a c) with (f c)
NEW: a d f a c b a e a
NEW2: a d f c b a e a
Conclusion: cost(NEW2) ≤ cost(DFS)

Remove repeats…
Example: Replace (b a e) with (b e)
NEW: a d f a c b a e a
NEW2: a d f c b e a
Conclusion: cost(NEW2) ≤ cost(DFS)

Remove repeats…
Example: Final: a d f c b e a
11 = cost(Final) ≤ cost(DFS) = 16 = 2OPT

Break the cycle…
Example: Path: a d f c b e
9 = cost(Path) ≤ cost(DFS) = 16 = 2OPT

Path is a spanning tree…
Example: Spanning tree: a d f c b e
9 = cost(Spanning tree) ≤ cost(DFS) = 16 = 2OPT
cost(MST) ≤ cost(Spanning tree) ≤ 2OPT

Approximation Proof
Analysis:
1. Let T be an optimal Steiner tree.
2. Let DFS be a DFS-traversal of T.
3. Let NoSteiner be DFS where we short-cut past Steiner nodes.
4. Let Rcycle be NoSteiner where we short-cut past repeated nodes.
5. Let Path be Rcycle where we remove the last edge.

Approximation Proof
Analysis:
1. cost(MST) ≤ cost(Path)             By definition of MST.
2. cost(Path) ≤ cost(Rcycle)          Trivial.
3. cost(Rcycle) ≤ cost(NoSteiner)     By triangle inequality.
4. cost(NoSteiner) ≤ cost(DFS)        By triangle inequality.
5. cost(DFS) ≤ 2cost(T) = 2OPT        By construction.

Metric Steiner Tree
Theorem: A minimum spanning tree of the required points R is a 2-approximation of the optimal metric Steiner Tree.
Question: Is this analysis tight? 2(1 – 1/n) approximation?
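The two shortcutting steps in the proof (skip Steiner nodes, then skip repeated nodes) amount to a single pass over the DFS walk. The sketch below is illustrative only: `shortcut` and `walk_cost` are hypothetical helper names, and `dist` stands for whatever metric d(., .) the instance provides.

```python
def shortcut(walk, steiner):
    """Turn a closed DFS walk into a simple path over the required nodes.

    Keeps the first occurrence of every node that is not a Steiner node.
    By the triangle inequality each skip can only shorten the walk.
    """
    seen = set()
    path = []
    for v in walk:
        if v in steiner or v in seen:
            continue                  # short-cut past Steiner nodes and repeats
        seen.add(v)
        path.append(v)
    return path


def walk_cost(walk, dist):
    """Sum of distances between consecutive nodes of a walk."""
    return sum(dist(u, v) for u, v in zip(walk, walk[1:]))


# The DFS traversal from the example, with Steiner nodes g and h.
dfs = "a g d g f g a h c h b h a e a".split()
path = shortcut(dfs, steiner={"g", "h"})
print(path)  # ['a', 'd', 'f', 'c', 'b', 'e']
```

For any metric `dist`, `walk_cost(path, dist) <= walk_cost(dfs, dist)`, and since `path` is itself a spanning tree of the required nodes, the MST on R can cost no more than it.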
Steiner Tree Problems
General Steiner Tree:
– Given: set of required points R; set of optional points S; set of edges E; edge weights w(.)
– Find: tree T = (V,E) : R ⊆ V ⊆ (S∪R)
  • Tree includes all required points.
  • Tree may include some optional points.
  • cost of tree is minimized

General Steiner Tree
Problem: Minimum spanning tree of R is not a good approximation.

General Steiner Tree
Idea: Reduction
1. Construct an instance of Metric Steiner Tree from the input.
2. Solve the Metric Steiner Tree problem (by finding an MST).
3. Translate the solution back.

General Steiner Tree
Beware: Reductions are tricky for approximation algorithms.
Typical example:
– Assume two problems ABC and XYZ
– Function f : ABC → XYZ
– Function g : “solutions to XYZ” → “solutions to ABC”
– Show: If S is an optimal solution for f(A), then g(S) is an optimal solution for A.
[Figure: instance A of ABC is mapped by f to instance f(A) of XYZ; ALG solves f(A) to get S; g maps S back to a solution g(S) for A.]
Problem: ALG does not find optimal solution
– Function g may not preserve approximation ratio.

Construct a Metric
Construction:
1. Required and optional points stay the same.
2. For every pair of points (u,v) define: d(u,v) = distance of shortest path from u to v.

Construct a Metric
Example:
– d(A,B) = 10
– d(H,E) = 11
– d(B,H) = 12
– …
[Figure: example weighted graph on nodes A–H.]
How do we find all the shortest paths?
• Dijkstra’s Algorithm (run from every node): O(VE log V)
• Floyd-Warshall: O(V³)

Construct a Metric
Claim: The function d(.,.) is a metric.
Usual properties (don’t matter):
• d(u, u) = 0
• d(u, v) = d(v, u)
• d(u, v) ≥ 0
Triangle Inequality: Fix some (u, v, w).
– d(u,w) ≤ d(u,v) + d(v,w)
  • If not, find a shorter path from u to w by going u → v → w.
  • Shortest paths always satisfy triangle inequality, by definition!

General Steiner Tree
Idea: Reduction
1. Construct an instance of Metric Steiner Tree from the input (via shortest paths).
2. Solve the Metric Steiner Tree problem (by finding an MST).
3. Translate the solution back.

Translate back…
Given a Steiner Tree T’ for the metric problem:
1. For every edge (u,v) in T’, add the shortest path from u to v to the graph G. (Note, G may not be a tree.)
2. Find an MST of G.

Analysis
Overview:
1. Input: IN = (R, S, G, w)
2. Construct: MET = (R, S, d) where d = shortest paths.
   Show: OPT(MET) ≤ OPT(IN)
3. Solve: Let T’ be the approximately-optimal Steiner tree for MET.
   Fact: cost(T’) ≤ 2OPT(MET)
4. Translate: Convert T’ into a Steiner tree T for IN.
   Show: cost(T) ≤ cost(T’)
Conclude: cost(T) ≤ cost(T’) ≤ 2OPT(MET) ≤ 2OPT(IN)

Analysis
Construct: MET = (R, S, d) where d = shortest paths.
Show: OPT(MET) ≤ OPT(IN)
1. Let T be an optimal Steiner tree for IN.
2. Let T’ be the same tree in MET.
3. cost(T’) ≤ cost(T)
   • For every edge (u,v): d(u,v) ≤ w(u,v)
   • Hence the tree only costs less under the distance metric.
4. OPT(MET) ≤ cost(T’) ≤ cost(T) = OPT(IN)

Analysis
Translate: Convert T’ into a Steiner tree T for IN.
Show: cost(T) ≤ cost(T’)
1. Let G be the graph constructed from T’.
2. cost(G) ≤ cost(T’)
   • Every edge in T’ corresponds to a “shortest path.”
   • G is constructed by adding these paths.
   • Not always equal due to overlapping paths.
3. cost(T) ≤ cost(G)
   • Remove edges from G to find an MST.

Steiner Tree Example
[Figure: example weighted graph on nodes A–H; the required points are A, H, D, E.]
1. Shortest paths between required points (look ahead: ignore Steiner nodes):
   (A,H) = 2, (A,D) = 7, (A,E) = 9, (H,D) = 9, (H,E) = 11, (D,E) = 10
2. Run MST algorithm on these distances.
3. Convert solution back to original graph.

General Steiner Tree
Theorem: If A is a c-approximation algorithm for Metric Steiner Tree, then we can construct a c-approximation algorithm for General Steiner Tree.
Theorem: There exists a 2-approximation algorithm for General Steiner Tree.
Best known approximation: 1.55
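The full reduction (construct the shortest-path metric, take an MST over the required points, translate back) can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the names `kruskal` and `steiner_2approx` are hypothetical, the metric is built with Floyd-Warshall, and the input is assumed to be a connected undirected graph with non-negative weights and no self-loops.

```python
from itertools import combinations


def kruskal(vertices, weighted_edges):
    """Minimum spanning forest; returns the chosen (weight, u, v) triples."""
    parent = {v: v for v in vertices}

    def find(v):                          # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(weighted_edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru != rv:                      # endpoints in different trees: keep edge
            parent[ru] = rv
            tree.append((w, u, v))
    return tree


def steiner_2approx(nodes, edges, required):
    """nodes: list; edges: {(u, v): weight}, undirected; required: list of nodes."""
    INF = float("inf")
    w = {}                                # cheapest original edge per node pair
    for (u, v), weight in edges.items():
        key = frozenset((u, v))
        w[key] = min(weight, w.get(key, INF))

    # Step 1: metric d = all-pairs shortest paths; nxt[u][v] = first hop u -> v.
    d = {u: {v: (0 if u == v else INF) for v in nodes} for u in nodes}
    nxt = {u: {} for u in nodes}
    for key, weight in w.items():
        u, v = tuple(key)
        d[u][v] = d[v][u] = weight
        nxt[u][v], nxt[v][u] = v, u
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
                    nxt[i][j] = nxt[i][k]

    # Step 2: MST over the required points only, under the metric d.
    metric_tree = kruskal(required,
                          [(d[u][v], u, v) for u, v in combinations(required, 2)])

    # Step 3: translate back. Expand each metric edge into its shortest path,
    # then take an MST of the resulting subgraph.
    sub = {}
    for _, u, v in metric_tree:
        while u != v:
            step = nxt[u][v]
            key = frozenset((u, step))
            sub[key] = w[key]
            u = step
    sub_vertices = set().union(*sub) if sub else set(required)
    return kruskal(sub_vertices,
                   [(weight, *tuple(key)) for key, weight in sub.items()])


# The 3000-versus-1 example: required a, b, c; optional point s.
edges = {("a", "b"): 3000, ("b", "c"): 3000, ("a", "c"): 3000,
         ("a", "s"): 1, ("b", "s"): 1, ("c", "s"): 1}
tree = steiner_2approx(["a", "b", "c", "s"], edges, ["a", "b", "c"])
print(sum(weight for weight, _, _ in tree))  # 3
```

On this instance the metric distances between required points become 2, so the reduction routes through the optional point and returns a tree of cost 3, whereas an MST on R alone in the original graph costs 6000.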