UNIVERSITY OF BERGEN

Parameterized Algorithms: The Basics
Bart M. P. Jansen
August 18th 2014, Będlewo

Why we are here
• To create the recipes that make computers solve our problems efficiently
  – With a bounded number of resources (memory, time)
• We measure the quality of an algorithm by the dependence of its running time on the size $n$ of the input
  – For an $n$-bit input, the running time can be $n^2$, $6n \log n$, $2^n \cdot 3n^6$, …
  – Smaller functions are better, but as a general guideline:
    • Polynomials are good, exponential functions are bad
• Unfortunately, many problems are NP-complete
• We believe that for NP-complete problems, there is no algorithm that:
  – always gives the right answer, and whose
  – running time is bounded by a polynomial function of the input size

Dealing with NP-complete problems
• Approximation
  – Sacrifice quality of the solution: quickly find a solution that is provably not very bad
• Local search
  – Quickly find a solution for which you cannot give any quality guarantee (but which might often be good)
• Branch & bound
  – Sacrifice running time guarantees: create an algorithm for which you do not know how long it will take (but which might do well on the inputs you use)
• Parameterized algorithms
  – Sacrifice the running time: allow the running time to have an exponential factor, but ensure that the exponential dependence is not on the entire input size but just on some parameter that is hopefully small
• Kernelization
  – Quickly shrink the input by preprocessing so that afterward running an exponential-time algorithm on the shrunk instance is fast enough

History of parameterized complexity
[Figure: timeline from 1940 to the 2010s with the milestones: Simplex algorithm, MATCHING algorithm, NP-completeness, Graph Minors Theorem, Parameterized (in)tractability, PCP Theorem, Downey & Fellows book, Planar DOMINATING SET kernel, Kernelization lower bounds, Będlewo school]

Google Scholar Papers on FPT and Kernelization
[Figure: number of papers over the years 1985 to 2015, vertical axis from 0 to 1200; series: FPT, Kernelization]

This lecture
Fixed-parameter tractability
Kernelization algorithms
• VERTEX COVER
• FEEDBACK ARC SET in Tournaments
Bounded-depth search trees
• VERTEX COVER
• FEEDBACK VERTEX SET
Dynamic programming
• SET COVER

FIXED-PARAMETER TRACTABILITY

Parameterized problems
• As usual in complexity theory, we primarily study decision problems (YES/NO questions)
  – OPTIMIZATION: “Find the shortest path from $x$ to $y$”
  – DECISION: “Is there a path from $x$ to $y$ of length at most $k$?”
• Having an efficient algorithm for one typically gives an efficient algorithm for the other
• A parameterized problem is a decision problem where we associate an integer parameter to each instance
  – The parameter measures some aspect of the instance

Problem parameterizations
• PACKET DELIVERY PROBLEM
  Input: A graph $G$, a starting vertex $s$, a set $T$ of delivery vertices, and an integer $k$
  Question: Is there a cycle in $G$ that starts and ends in $s$, visits all vertices in $T$, and has length at most $k$?
• There are many possible parameters for this problem:
  – The length $k$ of the tour
  – The number of delivery points $|T|$
  – Graph-theoretic measures of how complex $G$ is (treewidth, cliquewidth, vertex cover number)
• Parameterized complexity investigates: can the problem be solved efficiently if the parameter is small?

Fixed-parameter tractability – informally
• A parameterized problem is fixed-parameter tractable if there is an algorithm that solves size-$n$ inputs with parameter value $k$ in time $f(k) \cdot n^c$ for some constant $c$ and function $f$
• For each fixed $k$, there is a polynomial-time $O(n^c)$ algorithm
• VERTEX COVER:
  – “Can all the edges of this $n$-vertex graph be covered by at most $k$ vertices?”
  – Solvable in time $1.2738^k \cdot n$, so FPT

Fixed-parameter tractability – formally
• Let $\Sigma$ be a finite alphabet used to encode inputs
  – ($\Sigma = \{0, 1\}$ for binary encodings)
• A parameterized problem is a set $Q \subseteq \Sigma^* \times \mathbb{N}$
  – $Q = \{(x_1, k_1), (x_2, k_2), \dots\}$
• The set $Q$ contains the tuples $(x, k)$ where the answer to the question encoded by $x$ is yes; $k$ is the parameter
• The parameterized problem $Q$ is fixed-parameter tractable if there is an algorithm that, given an input $(x, k)$,
  – decides if $(x, k)$ belongs to $Q$ or not, and
  – runs in time $f(k) \cdot |x|^c$ for some function $f$ and constant $c$

KERNELIZATION

Data reduction with a guarantee
• Kernelization is a method for parameterized preprocessing
  – Efficiently reduce an instance $(x, k)$ to an equivalent instance of size bounded by some $f(k)$
• One of the simplest ways of obtaining FPT algorithms
  – Apply a brute force algorithm on the shrunk instance to get an FPT algorithm
• Kernelization also allows a rigorous mathematical analysis of efficient preprocessing

The VERTEX COVER problem
Input: An undirected graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $S$ of at most $k$ vertices in $G$, such that each edge of $G$ has an endpoint in $S$?
• Such a set $S$ is a vertex cover of $G$

Reduction rules for VERTEX COVER – (R1)
(R1) If there is an isolated vertex $v$, delete $v$ from $G$
  – Reduce to the instance $(G - v, k)$
[Figure: example instance $(G, k = 7)$ reduced to $(G', k' = 7)$]
• To ensure that a reduction rule does not change the answer, we have to prove safeness of the reduction rule
• If $(G, k)$ is transformed into $(G', k')$ then we should prove that:
  $(G, k)$ is a YES-instance $\Leftrightarrow$ $(G', k')$ is a YES-instance

Reduction rules for VERTEX COVER – (R2)
(R2) If there is a vertex $v$ of degree more than $k$, then delete $v$ (and its incident edges) from $G$ and decrease the parameter by 1
  – Reduce to the instance $(G - v, k - 1)$
[Figure: example instance $(G, k = 7)$ reduced to $(G', k' = 6)$]

Reduction rules for VERTEX COVER – (R3)
(R3) If the previous rules are not applicable and $G$ has more than $k^2 + k$ vertices or more than $k^2$ edges, then conclude that we are dealing with a NO-instance

Correctness of the cutoff rule
• Claim. If $G$ is exhaustively reduced under (R1)-(R2) and has more than $k^2 + k$ vertices or more than $k^2$ edges, then there is no vertex cover of size at most $k$
• Proof.
  – Suppose $G$ has a vertex cover $S$
  – Since (R2) does not apply, every vertex has degree at most $k$, and every edge has an endpoint in $S$: $|E(G)| \le k \cdot |S|$
  – Since (R1) does not apply, every vertex of $G - S$ has at least one edge, and no edge has both endpoints outside $S$: $|V(G - S)| \le |E(G)| \le k \cdot |S|$
  – So $|V(G)| \le (k + 1) \cdot |S|$
  – So if $G$ has a vertex cover of size at most $k$, then $|V(G)| \le k^2 + k$ and $|E(G)| \le k^2$

Preprocessing for VERTEX COVER
• (R1)-(R3) can be exhaustively applied in polynomial time
• In polynomial time, we can reduce a VERTEX COVER instance $(G, k)$ to an instance $(G', k')$ such that:
  – the two instances are equivalent: $(G, k)$ has answer YES if and only if $(G', k')$ has answer YES
  – instance $(G', k')$ has at most $k^2 + k$ vertices and $k^2$ edges
  – $k' \le k$
• This gives an FPT algorithm to solve an instance $(G, k)$:
  – Compute reduced instance $(G', k')$
  – Solve $(G', k')$ by brute force: try all $2^{k^2 + k}$ vertex subsets $S$
    • For each $S$, test if it is a vertex cover of size at most $k'$
Theorem. $k$-VERTEX COVER is fixed-parameter tractable

Kernelization – formally
• Let $Q \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem and $f: \mathbb{N} \to \mathbb{N}$
• A kernelization (or kernel) for $Q$ of size $f$ is an algorithm that, given $(x, k) \in \Sigma^* \times \mathbb{N}$, takes time polynomial in $|x| + k$, and outputs an instance $(x', k') \in \Sigma^* \times \mathbb{N}$ such that:
  – $(x, k) \in Q \Leftrightarrow (x', k') \in Q$
  – $|x'|, k' \le f(k)$
• A polynomial kernel is a kernel whose function $f$ is a polynomial
Theorem. A parameterized problem is fixed-parameter tractable if and only if it is decidable and has a kernel (of arbitrary size)

Kernel for FEEDBACK ARC SET IN TOURNAMENTS
Input: A tournament $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $S$ of at most $k$ directed edges in $G$, such that $G - S$ is acyclic?
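
To make the question above concrete, here is a minimal sketch of a checker that tests whether a given arc set $S$ leaves an acyclic graph after deletion. It is not part of the lecture: the function name `is_feedback_arc_set`, the vertex numbering 0..n-1 and the arc-list representation are assumptions made for illustration, and the acyclicity test uses Kahn's topological-sort algorithm, a standard technique that the slides do not mention.

```python
from collections import deque


def is_feedback_arc_set(n, arcs, removed):
    """Return True if deleting `removed` leaves no directed cycle.

    Hypothetical helper: vertices are 0..n-1, `arcs` and `removed`
    are collections of (u, v) pairs.
    """
    removed = set(removed)
    indegree = [0] * n
    out = [[] for _ in range(n)]
    for u, v in arcs:
        if (u, v) in removed:
            continue
        out[u].append(v)
        indegree[v] += 1
    # Kahn's algorithm: repeatedly strip vertices without incoming arcs.
    queue = deque(v for v in range(n) if indegree[v] == 0)
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1
        for v in out[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Every vertex gets stripped exactly when no directed cycle remains.
    return processed == n


# Tournament on three vertices forming the directed triangle 0 -> 1 -> 2 -> 0.
triangle = [(0, 1), (1, 2), (2, 0)]
print(is_feedback_arc_set(3, triangle, []))
print(is_feedback_arc_set(3, triangle, [(2, 0)]))
```

On the directed triangle, the empty set is not a solution, while deleting any single arc is. A checker like this only verifies a candidate $S$; finding one of size at most $k$ is the hard part.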

Reduction rules for FEEDBACK ARC SET
(R1) If vertex $v$ is not in any triangle, then remove $v$
(R2) If edge $(u, v)$ is in at least $k + 1$ distinct triangles, reverse it and decrease $k$ by one
(R3) If the previous rules are not applicable and $G$ has more than $k(k + 2)$ vertices, then conclude that we are dealing with a NO-instance
Theorem. $k$-FEEDBACK ARC SET IN TOURNAMENTS has a kernel with $k(k + 2)$ vertices

High-level kernelization strategy
• Compare to VERTEX COVER:
  – (R1) deals with elements that do not constrain the solution
  – (R2) deals with elements that must be in any solution
  – (R3) deals with graphs that remain large after reduction

BOUNDED-DEPTH SEARCH TREES

Background
• A branching algorithm that explores a search tree of bounded depth is one of the simplest types of FPT algorithms
• Main idea:
  – Reduce problem instance $(x, k)$ to solving a bounded number of instances with parameter $< k$
• If you can solve $(x, k)$ in polynomial time using the answers to two instances $(x_1, k - 1)$ and $(x_2, k - 1)$, then the problem can be solved in $2^k \cdot n^c$ time
  – (assuming the case $k = 0$ is polynomial-time solvable)
• If you generate $k$ subproblems instead of 2, then the problem can be solved in $k^k \cdot n^c = 2^{O(k \log k)} \cdot n^c$ time

A search tree
[Figure: binary search tree of depth 3 with root $(x, k = 3)$, children $(x_1, 2)$ and $(x_2, 2)$, four nodes $(x_3, 1), \dots, (x_6, 1)$, and eight leaves $(x_7, 0), \dots, (x_{14}, 0)$]

Analysis of bounded-depth search trees
• If the parameter decreases for each recursive call, the depth of the tree is at most $k$
• The number of nodes in a depth-$k$ tree with $\ell$ leaves is $O(k \cdot \ell)$
  – Usually sufficient to bound the number of leaves
• If the computation in each node takes polynomial time, total running time is $O(k \cdot \ell \cdot n^c)$

VERTEX COVER revisited
Input: A graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $S$ of at most $k$ vertices in $G$, such that each edge has an endpoint in $S$?
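
As a reminder of the first approach to this problem, the kernelization rules (R1)-(R3) for VERTEX COVER followed by brute force on the kernel can be sketched as below. This is an illustrative sketch under stated assumptions, not code from the lecture: the names `kernelize_vertex_cover` and `has_vertex_cover` are hypothetical, the graph is assumed to be simple and given as a list of edges, and isolated vertices are never stored, so (R1) is implicit in the representation.

```python
from itertools import combinations


def kernelize_vertex_cover(edges, k):
    """Apply (R1)-(R3); return (edges', k') or None for a NO-instance.

    Assumes a simple graph given as 2-element edges. Vertices without
    edges are not represented, which takes care of (R1).
    """
    edges = {frozenset(e) for e in edges}
    while True:
        if k < 0:
            return None
        degree = {}
        for e in edges:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # (R2): a vertex of degree more than k is in every small cover.
        high = next((v for v, d in degree.items() if d > k), None)
        if high is None:
            break
        edges = {e for e in edges if high not in e}
        k -= 1
    # (R3): a reduced YES-instance has <= k^2 + k vertices, <= k^2 edges.
    if len(degree) > k * k + k or len(edges) > k * k:
        return None
    return edges, k


def has_vertex_cover(edges, k):
    """Decide VERTEX COVER by brute force on the kernel."""
    reduced = kernelize_vertex_cover(edges, k)
    if reduced is None:
        return False
    kernel_edges, k_reduced = reduced
    vertices = list({v for e in kernel_edges for v in e})
    for size in range(min(k_reduced, len(vertices)) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            # A cover must hit every remaining edge.
            if all(e & chosen for e in kernel_edges):
                return True
    return False


path = [(0, 1), (1, 2)]
print(has_vertex_cover(path, 1), has_vertex_cover(path, 0))
```

The brute-force step only ever sees at most $k^2 + k$ vertices, so its cost depends on $k$ alone; the dependence on the original graph size is confined to the polynomial-time reduction loop.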

Algorithm for VERTEX COVER
• Algorithm VC(Graph $G$, integer $k$)
• if $k < 0$ then return NO
• if $G$ has no edges then return YES
• else pick an edge in $G$ and let $u$ and $v$ be its endpoints
  – return (VC($G - u$, $k - 1$) OR VC($G - v$, $k - 1$))
• Correct because any vertex cover must use $u$ or $v$
• A size-$k$ vertex cover in $G$ that uses $u$ yields a size-$(k - 1)$ vertex cover in $G - u$

Running time for VERTEX COVER
• Every iteration either solves the problem directly or makes two recursive calls with a decreased parameter
• The branching factor of the algorithm, and therefore of the search tree, is two
• Tree of depth $k$ with branching factor 2 has at most $2^k$ leaves
  – Running time is $2^k \cdot n^c$
  – Much better than $2^{k^2 + k}$ from the kernelization algorithm
• One way to faster algorithms:
  – Pick a vertex $v$ of maximum degree, recurse on $(G - v, k - 1)$ and $(G - N[v], k - |N(v)|)$

The FEEDBACK VERTEX SET problem
Input: An undirected (multi)graph $G$ and an integer $k$
Parameter: $k$
Question: Is there a set $S$ of at most $k$ vertices in $G$, such that each cycle contains a vertex of $S$?
• We allow multiple edges and self-loops
• Such a set $S$ is a feedback vertex set of $G$
  – Removing $S$ from $G$ results in an acyclic graph, a forest

Branching for FEEDBACK VERTEX SET
• For VERTEX COVER, we could easily identify a set of vertices to branch on: the two endpoints of an edge
• For FEEDBACK VERTEX SET, a solution may not contain any endpoint of an edge
  – How should we branch?
• We will find a set $W$ of $O(k)$ vertices such that any size-$k$ feedback vertex set contains a vertex of $W$
• To find $W$ we first have to simplify the graph using reduction rules that do not change the answer

Reduction rules
(R1) If there is a loop at vertex $v$, then delete $v$ and decrease $k$ by one
(R2) If there is an edge of multiplicity larger than 2, then reduce its multiplicity to 2
(R3) If there is a vertex $v$ of degree at most 1, then delete $v$
(R4) If there is a vertex $v$ of degree two, then delete $v$ and add an edge between $v$’s neighbors
If (R1)-(R4) cannot be applied anymore, then the minimum degree is at least 3
Observation. If $(G, k)$ is transformed into $(G', k')$, then:
1. FVS of size $\le k$ in $G$ $\Leftrightarrow$ FVS of size $\le k'$ in $G'$
2. Any feedback vertex set in $G'$ is a feedback vertex set in $G$ when combined with the vertices deleted by (R1)

Identifying a set to branch on
• Let $G$ be a graph whose vertices have degree three or more
  – Order the vertices as $v_1, v_2, \dots, v_n$ by decreasing degree
  – Let $V_{3k} := \{v_1, \dots, v_{3k}\}$ be the $3k$ largest-degree vertices
• Lemma. If all vertices of $G$ have degree 3 or more, then any size-$\le k$ feedback vertex set of $G$ contains a vertex from $V_{3k}$
• So if there is a size-$\le k$ solution, it contains a vertex of $V_{3k}$
  – For each $v \in V_{3k}$ recurse on the instance $(G - v, k - 1)$
• Gives an algorithm with running time $(3k)^k \cdot n^c$
  – Apply the reduction rules, compute $V_{3k}$, then branch

A useful claim
• Claim. If $S$ is a feedback vertex set of $G$, then
  $\sum_{v \in S} (d(v) - 1) \ge |E(G)| - |V(G)| + 1$
• Proof. Graph $F := G - S$ is a forest
  – So $|E(F)| \le |V(F)| - 1 = |V(G)| - |S| - 1$
  – Every edge not in $F$ is incident with a vertex of $S$:
    $\sum_{v \in S} d(v) + |V(G)| - |S| - 1 \ge |E(G)|$
• With this claim, we can prove the degree lemma

Proving the degree lemma
• Lemma. If all vertices of $G$ have degree 3 or more, then any size-$\le k$ feedback vertex set of $G$ contains a vertex from $V_{3k}$
• Proof by contradiction.
  – Let $S$ be a size-$\le k$ feedback vertex set with $S \cap V_{3k} = \emptyset$
  – By choice of $V_{3k}$ we have $\min_{v \in V_{3k}} d(v) \ge \max_{v \in S} d(v)$, so, using $|V_{3k}| = 3k \ge 3 \cdot |S|$ and the previous claim:
    $\sum_{v \in V_{3k}} (d(v) - 1) \ge 3 \cdot \sum_{v \in S} (d(v) - 1) \ge 3 \cdot (|E(G)| - |V(G)| + 1)$
  – Define $S^+ := V(G) \setminus V_{3k}$. Since $S \subseteq S^+$:
    $\sum_{v \in S^+} (d(v) - 1) \ge \sum_{v \in S} (d(v) - 1) \ge |E(G)| - |V(G)| + 1$
  – Adding the two inequalities:
    $\sum_{v \in V(G)} (d(v) - 1) \ge 4 \cdot (|E(G)| - |V(G)| + 1)$

Proving the degree lemma (II)
• $\sum_{v \in V(G)} (d(v) - 1) \ge 4 \cdot (|E(G)| - |V(G)| + 1)$
• The degree sum counts every edge twice:
  $\sum_{v \in V(G)} d(v) = 2 \cdot |E(G)|$
• Combining these:
  $4 \cdot (|E(G)| - |V(G)| + 1) \le \sum_{v \in V(G)} (d(v) - 1) = 2 \cdot |E(G)| - |V(G)|$
• So $2 \cdot |E(G)| < 3 \cdot |V(G)|$
• But since all vertices have degree $\ge 3$ we have:
  $2 \cdot |E(G)| = \sum_{v \in V(G)} d(v) \ge 3 \cdot |V(G)|$
• Contradiction

A final word on bounded-depth search trees
• The degree lemma proves the correctness of our branching strategy for FEEDBACK VERTEX SET
• When building a branching algorithm for a parameterization by the solution size:
  – Find an $O(k)$-size set that contains a vertex of the solution
  – Branch in $O(k)$ directions, trying all possibilities
  – We get a search tree of depth $k$ and branching factor $O(k)$
• You can think of the branching process as guessing

DYNAMIC PROGRAMMING

The SET COVER problem
Input: A set family $\mathcal{F}$ over a universe $U$ and an integer $k$
Parameter: $|U|$
Question: Is there a subfamily $\mathcal{F}' \subseteq \mathcal{F}$ of at most $k$ sets, such that $\bigcup_{F \in \mathcal{F}'} F = U$?
• The subfamily $\mathcal{F}'$ covers the universe $U$
• SET COVER parameterized by the universe size is FPT
  – Algorithm with running time $2^{|U|} \cdot (|U| + |\mathcal{F}|)^{O(1)}$
  – Based on dynamic programming
[Figure: a universe $U$ covered by sets $F_1, F_2, F_3, F_4$]

Dynamic programming for SET COVER
• Let $\mathcal{F} = \{F_1, F_2, \dots, F_m\}$
• We define a DP table for $X \subseteq U$ and $j \in \{0, 1, \dots, m\}$
  $T[X, j]$ = minimum number of sets from $F_1, \dots, F_j$ needed to cover $X$, or $+\infty$ if impossible
• The value $T[U, m]$ gives the minimum size of a set cover
  – To solve the problem, compute $T$ using base cases and a recurrence

Filling the dynamic programming table
• $T[X, j]$ = minimum number of sets from $F_1, \dots, F_j$ needed to cover $X$
Base case: $j = 0$
  $T[X, 0] = 0$ if $X = \emptyset$, otherwise it is $+\infty$
Recursive step: $j > 0$
  $T[X, j] = \min(T[X, j - 1],\ 1 + T[X \setminus F_j, j - 1])$
• Skip set $F_j$, or pay for $F_j$ and afterwards cover $X \setminus F_j$
• Each entry can be computed in polynomial time
  – $(|\mathcal{F}| + 1) \cdot 2^{|U|}$ entries in total

More on dynamic programming
• Dynamic programming is a memory-intensive algorithmic paradigm that yields FPT algorithms in various situations
  – Here: dynamic programming over subsets of $U$
  – Later: dynamic programming over tree decompositions
• Research challenge:
  – Determine whether the $2^{|U|}$ factor can be improved to $(2 - \epsilon)^{|U|}$ for some $\epsilon > 0$

Exercises
From this lecture:
• Prove that the reduction rules for FEEDBACK ARC SET in tournaments are safe
• Improve the running time of the VERTEX COVER branching algorithm to $1.6181^k$
Kernelization
• 2.4, 2.7, 2.9, 2.14
Branching
• 3.2, 3.4, 3.7, 3.8
Dynamic programming
• 6.2

Summary
• Parameterized algorithmics is a young, vibrant research area that investigates how to cope with NP-completeness
• We saw three ways of building FPT algorithms:
1. Kernelization
2. Bounded-depth search trees
3. Dynamic programming over subsets
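
As a concrete companion to the third technique, the SET COVER recurrence $T[X, j] = \min(T[X, j - 1], 1 + T[X \setminus F_j, j - 1])$ can be sketched as follows. This is a minimal illustration under assumptions that are not in the lecture: the function name `min_set_cover` is hypothetical, subsets $X \subseteq U$ are encoded as bitmasks, and only the layer for $j - 1$ is kept in memory rather than the full table.

```python
import math


def min_set_cover(universe, family):
    """Return the minimum number of sets of `family` that cover `universe`,
    or math.inf if no subfamily covers it (hypothetical helper)."""
    elements = list(universe)
    index = {x: i for i, x in enumerate(elements)}
    full = (1 << len(elements)) - 1
    # Encode each set F_j as a bitmask over the universe.
    masks = [sum(1 << index[x] for x in set(F) if x in index) for F in family]
    # Base case j = 0: only the empty set is covered by zero sets.
    T = [math.inf] * (full + 1)
    T[0] = 0
    for mask in masks:
        # Recursive step: skip F_j, or pay for F_j and cover X \ F_j.
        # The comprehension reads the old layer j - 1 and builds layer j.
        T = [min(T[X], 1 + T[X & ~mask]) for X in range(full + 1)]
    return T[full]


U = {1, 2, 3, 4}
F = [{1, 2}, {3, 4}, {2, 3}, {1, 2, 3}]
print(min_set_cover(U, F))
```

To answer the decision question, compare the returned value with $k$. Keeping a single layer reduces the memory from $(|\mathcal{F}| + 1) \cdot 2^{|U|}$ entries to $2^{|U|}$ entries, at the price of not being able to read off which sets were chosen.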