UNIVERSITY OF BERGEN
Parameterized Algorithms
Randomized Techniques
Bart M. P. Jansen
August 18th 2014, Będlewo

Randomized computation
• For some tasks, finding a randomized algorithm is much easier than finding a deterministic one
• We consider algorithms that have access to a stream of uniformly random bits
  – So we do not consider randomly generated inputs
• The actions of the algorithm depend on the values of the random bits
  – Different runs of the algorithm may give different outcomes for the same input

Monte Carlo algorithms
• A Monte Carlo algorithm with false negatives and success probability p ∈ (0, 1) is an algorithm for a decision problem that
  – given a NO-instance, always returns NO, and
  – given a YES-instance, returns YES with probability ≥ p
• Since the algorithm is always correct on NO-instances, but may fail on YES-instances, it has one-sided error
• If p is a positive constant, we can repeat the algorithm a constant number of times
  – to ensure that the probability of failure is smaller than the probability that cosmic radiation causes a bit to flip in memory
  – (which would invalidate even a deterministic algorithm)
• If p is not a constant, we can still boost the success probability

Independent repetitions increase success probability
• Suppose we have a Monte Carlo algorithm with one-sided error and success probability p, which may depend on the parameter k
  – For example, p = 1/2^k
• If we repeat the algorithm t times, the probability that all runs fail is at most (1 − p)^t ≤ e^{−p·t}, since 1 + x ≤ e^x
• So with probability ≥ 1 − e^{−p·t} the repeated algorithm is correct
  – Using t = ⌈1/p⌉ trials gives success probability ≥ 1 − 1/e
  – For example, if p = 1/2^k, then t = ⌈1/p⌉ = O(2^k) repetitions suffice

This lecture
• Color coding: the LONGEST PATH problem
• Random separation: the SUBGRAPH ISOMORPHISM problem
• Chromatic coding: p-CLUSTERING
• Derandomization
• A Monte Carlo algorithm for FEEDBACK VERTEX SET

COLOR CODING

Color coding
• Randomly assign colors to the input structure
• If there is a solution and we are lucky with the coloring, every element of the solution has received a different color
• Then find an algorithm to detect such colorful solutions
  – Solutions whose elements have pairwise different colors

The odds of getting lucky
• Lemma.
  – Let U be a set of size n, and let X ⊆ U have size k
  – Let χ: U → [k] = {1, 2, …, k} be a coloring of the elements of U, chosen uniformly at random
    • Each element of U is colored with one of k colors, uniformly and independently at random
  – The probability that the elements of X are colored with pairwise distinct colors is at least e^{−k}
• Proof.
  – There are k^n possible colorings χ
  – In k! · k^{n−k} of them, all colors on X are distinct
  – So the probability is k! · k^{n−k} / k^n = k! / k^k > (k^k / e^k) / k^k = e^{−k}
  – We used k! > k^k / e^k
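As a sanity check on this bound, here is a minimal Python sketch (the function name and parameter values are choices made for this illustration): it estimates the probability that a fixed k-element subset becomes colorful under a uniformly random k-coloring and compares it with the exact value k!/k^k and the lower bound e^{−k}.

import math
import random

def estimate_colorful_probability(k, trials=200_000):
    """Estimate the probability that a fixed k-element subset X receives
    pairwise distinct colors under a uniformly random coloring with k colors.
    (Colors of elements outside X do not affect the event, so only X is colored.)"""
    hits = 0
    for _ in range(trials):
        colors = [random.randrange(k) for _ in range(k)]
        if len(set(colors)) == k:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    k = 6
    print("empirical:          ", estimate_colorful_probability(k))
    print("exact k!/k^k:       ", math.factorial(k) / k ** k)
    print("lower bound e^{-k}: ", math.exp(-k))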
The LONGEST PATH problem
Input: An undirected graph G and an integer k
Parameter: k
Question: Is there a simple path on k vertices in G?
• A solution is a k-path
• The LONGEST PATH problem is a restricted version of the problem of finding patterns in graphs

Color coding for LONGEST PATH
• Color the vertices of G randomly with k colors
• We want to detect a colorful k-path if one exists
  – Use dynamic programming over subsets

The dynamic programming table
• For every subset of colors S ⊆ [k] and vertex v, define T[S, v] = TRUE iff there is a colorful path whose colors are exactly S and that has v as an endpoint
• There is a colorful k-path iff T[[k], v] = TRUE for some v ∈ V(G)

A recurrence to fill the table
• If S is a singleton set, containing some color c:
  T[{c}, v] = TRUE if and only if χ(v) = c
• If |S| > 1:
  T[S, v] = ⋁_{u ∈ N(v)} T[S \ {χ(v)}, u]   if χ(v) ∈ S
  T[S, v] = FALSE                            otherwise
• Fill the table in time 2^k · n^{O(1)}

Randomized algorithm for LONGEST PATH
• Algorithm LongPath(Graph G, integer k)
  – repeat e^k times:
    • Color the vertices of G uniformly at random with k colors
    • Fill the DP table T
    • if ∃v ∈ V(G) such that T[[k], v] = TRUE then return YES
  – return NO
• By standard DP techniques we can construct the path as well
  – For each cell, store a backlink to the earlier cell that determined its value

Analysis for the LONGEST PATH algorithm
• Running time is e^k · 2^k · n^{O(1)} = 2^{O(k)} · n^{O(1)}
  – By the get-lucky lemma, if there is a k-path, it becomes colorful with probability p ≥ e^{−k}
  – If the coloring produces a colorful k-path, the DP finds it
  – By the independent-repetition lemma, e^k repetitions give constant success probability

Theorem. There is a Monte Carlo algorithm for LONGEST PATH with one-sided error that runs in time 2^{O(k)} · n^{O(1)} and has constant success probability.

Discussion of color coding
• When doing dynamic programming, color coding effectively allows us to reduce the number of states from
  – keeping track of all vertices visited by the path, n^k, to
  – keeping track of all colors visited by the path, 2^k
• The technique extends to finding size-k occurrences of other “thin” patterns in graphs
  – A size-k pattern graph of treewidth t can be found in time 2^{O(k)} · n^{O(t)}, with constant probability
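The recurrence and the repetition loop above can be written down concisely. Below is a minimal Python sketch, assuming an adjacency-list representation; the function names colorful_path_exists and long_path are choices made for this illustration.

import math
import random
from itertools import combinations

def colorful_path_exists(adj, coloring, k):
    """Subset DP: T[(S, v)] is True iff some colorful path uses exactly the
    colors in frozenset S and ends at vertex v.  Runs in 2^k * poly(n) time."""
    T = {}
    for v, c in enumerate(coloring):
        T[(frozenset([c]), v)] = True          # base case: single-vertex paths
    for size in range(2, k + 1):
        for S in map(frozenset, combinations(range(k), size)):
            for v in range(len(adj)):
                c = coloring[v]
                if c in S and any(T.get((S - {c}, u), False) for u in adj[v]):
                    T[(S, v)] = True
    full = frozenset(range(k))
    return any(T.get((full, v), False) for v in range(len(adj)))

def long_path(adj, k):
    """Color-coding Monte Carlo test for a simple path on k vertices.
    One-sided error: YES answers are always correct; a single round misses an
    existing k-path with probability at most 1 - e^{-k}."""
    repetitions = math.ceil(math.e ** k)       # boosts success probability to a constant
    for _ in range(repetitions):
        coloring = [random.randrange(k) for _ in range(len(adj))]
        if colorful_path_exists(adj, coloring, k):
            return True
    return False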
RANDOM SEPARATION

The SUBGRAPH ISOMORPHISM problem
Input: A host graph G and a pattern graph H (undirected)
Parameter: k ≔ |V(H)|
Question: Does G have a subgraph Ĥ isomorphic to H?

Background
• The traditional color coding technique gives FPT algorithms for LONGEST PATH
  – Even for SUBGRAPH ISOMORPHISM when the pattern graph has constant treewidth
• If the pattern graph is unrestricted, we expect that no FPT algorithm exists for SUBGRAPH ISOMORPHISM
  – It generalizes the k-CLIQUE problem
  – The canonical W[1]-complete problem used to establish parameterized intractability (more later)
• If the host graph G (and therefore the pattern graph H) has constant maximum degree, there is a nice randomized FPT algorithm

Random 2-coloring of host graphs
• Suppose G is a host graph that contains a subgraph Ĥ isomorphic to a connected k-vertex pattern graph H
  – Color the edges of G uniformly and independently at random with colors red (R) and blue (B)
  – If all edges of Ĥ are colored red, and all other edges incident to V(Ĥ) are colored blue, it is easy to identify Ĥ
    • The pattern occurs as a connected component of G[R], the graph on the red edges
    • Isomorphism of two k-vertex graphs can be tested in k! · k^{O(1)} time

Probability of isolating the pattern subgraph
• Let Ĥ be a k-vertex subgraph of graph G
• A 2-coloring of E(G) isolates Ĥ if the following holds:
  – All edges of E(Ĥ) are red
  – All other edges incident to V(Ĥ) are blue
• Observation. If the maximum degree of G is d, the probability that a random 2-coloring of E(G) isolates a fixed k-vertex subgraph Ĥ is at least 2^{−d·k}
  – There are at most d · k edges incident on V(Ĥ)
  – Each such edge is colored correctly with probability 1/2

Randomized algorithm for SUBGRAPH ISOMORPHISM
• Algorithm SubIso(Host graph G, connected pattern graph H)
  – Let d be the maximum degree of G
  – Let k be the number of vertices in H
  – repeat 2^{d·k} times:
    • Color the edges of G uniformly at random with colors R, B
    • for each k-vertex connected component Ĥ of G[R]:
      – if Ĥ is isomorphic to H, then return YES
  – return NO
• Easy to extend the algorithm to disconnected patterns H

Theorem. There is a Monte Carlo algorithm for SUBGRAPH ISOMORPHISM with one-sided error and constant success probability. For k-vertex pattern graphs in a host graph of maximum degree d, the running time is 2^{O(d·k)} · k! · n^{O(1)}.
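A single round of this scheme is easy to implement. The Python sketch below is an illustration, assuming simple graphs given as edge lists and a connected pattern with at least one edge; the names isomorphic and random_separation_round are choices made here. It colors each host edge red with probability 1/2, extracts the connected components of the red subgraph with a union-find structure, and tests k-vertex components against the pattern with a brute-force k!-time isomorphism check.

import random
from itertools import permutations

def isomorphic(edges_a, verts_a, edges_b, verts_b):
    """Brute-force isomorphism test for two small simple graphs (k! * poly(k) time)."""
    if len(verts_a) != len(verts_b) or len(edges_a) != len(edges_b):
        return False
    verts_a, verts_b = list(verts_a), list(verts_b)
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    for perm in permutations(verts_b):
        phi = dict(zip(verts_a, perm))
        if all(frozenset((phi[u], phi[v])) in eb for u, v in ea):
            return True
    return False

def random_separation_round(host_edges, pattern_edges, k):
    """One round of random separation: 2-color the host edges at random and look
    for the k-vertex pattern among the connected components of the red subgraph."""
    red = [e for e in host_edges if random.random() < 0.5]
    parent = {}
    def find(x):                                  # union-find with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in red:
        parent[find(u)] = find(v)
    components = {}                               # red edges grouped by component
    for u, v in red:
        components.setdefault(find(u), []).append((u, v))
    pattern_verts = {x for e in pattern_edges for x in e}
    for comp_edges in components.values():
        comp_verts = {x for e in comp_edges for x in e}
        if len(comp_verts) == k and isomorphic(comp_edges, comp_verts,
                                               pattern_edges, pattern_verts):
            return True
    return False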
CHROMATIC CODING

The p-CLUSTERING problem
Input: A graph G and an integer k
Parameter: k
Question: Is there a set A ⊆ (V(G) choose 2) of at most k adjacencies such that G ⊕ A consists of p disjoint cliques?
• Here G ⊕ A denotes G with the adjacencies in A toggled
• Such a graph is called a p-cluster graph

How to color
• p-CLUSTERING looks for a set of (non-)edges, instead of vertices
• We solve the problem on general graphs
• By randomly coloring the input, we again hope to highlight a solution with good probability, making it easier to find
  – We color the vertices of the graph

Proper colorings
• A set of adjacencies A is properly colored by a coloring of the vertices if:
  – For all pairs uv ∈ A, the colors of u and v are different
• As before, two crucial ingredients:
  1. What is the probability that a random coloring has the desired property?
  2. How can that property be exploited algorithmically?
• We assign colors to the vertices and hope to obtain a property for the (non-)edges in a solution
  – This allows us to save on colors

Probability of finding a proper coloring
• Lemma. If the vertices of a simple graph G with k edges are colored independently and uniformly at random with ⌈√(8k)⌉ colors, then the probability that E(G) is properly colored is at least 2^{−√(k/2)}
• Corollary. If a p-CLUSTERING instance (G, k) has a solution set A of k adjacencies, the probability that A is properly colored by a random coloring with ⌈√(8k)⌉ colors is at least 2^{−√(k/2)}
  – For constant success probability, 2^{O(√k)} repetitions suffice

Detecting a properly colored solution (I)
• Suppose χ: V(G) → [q] properly colors a solution A of (G, k)
  – The graph G ⊕ A is a p-cluster graph
• For i ∈ [q], let V_i be the vertices colored i
  – As A is properly colored, no (non-)edge of A has both ends in V_i
  – No changes are made to G[V_i] by the solution
    • G[V_i] is an induced subgraph of a p-cluster graph
  – For all i ∈ [q], the graph G[V_i] is a ≤ p-cluster graph
    • G[V_i] consists of ≤ p cliques that are not broken by the solution
• Observation. The q-coloring partitions V(G) into ≤ q · p cliques that are unbroken by the solution

Detecting a properly colored solution (II)
• For each of the ≤ q · p cliques into which G is partitioned, guess into which of the p final clusters it belongs
• For each guess, compute the cost of this solution
  – Count edges between subcliques in different clusters
  – Count non-edges between subcliques in the same cluster
• Total of p^{q·p} guesses, polynomial cost computation for each
  – Running time is p^{q·p} · n^{O(1)} = 2^{O(p·√k·log p)} · n^{O(1)} to detect a properly colored solution, if one exists
• Using dynamic programming (exercise), this can be improved to 2^{O(p·√k)} · n^{O(1)} time

Randomized algorithm for p-CLUSTERING
• Algorithm p-Cluster(graph G, integer k)
  – Define q ≔ ⌈√(8k)⌉
  – repeat 2^{√(k/2)} times:
    • Color the vertices of G uniformly at random with q colors
    • if there is a properly colored solution of size ≤ k then return YES
  – return NO

Theorem. There is a Monte Carlo algorithm for p-CLUSTERING with one-sided error and constant success probability that runs in time 2^{O(√k)} · 2^{O(p·√k)} · n^{O(1)} = 2^{O(p·√k)} · n^{O(1)}.
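The probability bound of the lemma above can be checked empirically. The following Python sketch is an illustration only; the function name and the example edge set (a hypothetical solution set A with k = 10 edits) are invented here. It colors the endpoints of a k-edge set with ⌈√(8k)⌉ random colors and estimates how often every edge is properly colored, alongside the stated lower bound 2^{−√(k/2)}.

import math
import random

def proper_coloring_probability(edges, k, trials=100_000):
    """Estimate the probability that a random coloring with ceil(sqrt(8k))
    colors properly colors every edge of a k-edge graph (chromatic coding)."""
    q = math.ceil(math.sqrt(8 * k))
    vertices = sorted({v for e in edges for v in e})
    hits = 0
    for _ in range(trials):
        coloring = {v: random.randrange(q) for v in vertices}
        if all(coloring[u] != coloring[v] for u, v in edges):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    # Hypothetical solution set A with k = 10 edits, viewed as a graph on its endpoints.
    edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5),
             (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)]
    k = len(edges)
    print("empirical:", proper_coloring_probability(edges, k))
    print("claimed lower bound 2^(-sqrt(k/2)):", 2 ** -math.sqrt(k / 2))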
DERANDOMIZATION

Why derandomize?
• Truly random bits are very hard to come by
  – The usual approach is to track radioactive decay
• Standard pseudo-random generators might work
  – But when spending exponential time on an answer, we do not want to get it wrong
• Luckily, we can replace most applications of randomization by deterministic constructions
  – Without significant increases in the running time

How to derandomize
• Different applications require different pseudorandom objects
• Main idea: instead of picking a random coloring χ: U → [k], construct a family ℱ of functions f_i: U → [k]
  – Ensure that at least one function in ℱ has the property that we hope to achieve by the random choice
• Instead of independent repetitions of the Monte Carlo algorithm, run it once for every coloring in ℱ
• If the success probability of the random coloring is 1/f(k), we can often construct such a family ℱ of size f(k) · log n

Splitting evenly
• Consider a q-coloring f: U → [q] of a universe U
• A subset S ⊆ U is split evenly by f if the following holds:
  – For every i, i′ ∈ [q], the sizes |f^{−1}(i) ∩ S| and |f^{−1}(i′) ∩ S| differ by at most one
  – All colors occur almost equally often within S
• If a set S of size k ≤ q is split evenly, then S is colorful

Splitters
• For n, k, q ∈ ℕ, an (n, k, q)-splitter is a family ℱ of functions from [n] to [q] such that:
  – For every set S ⊆ [n] of size k, there is a function f ∈ ℱ that splits S evenly

Theorem. For any n, k ≥ 1 one can construct an (n, k, k²)-splitter of size k^{O(1)} · log n in time k^{O(1)} · n · log n.

Perfect hash families derandomize LONGEST PATH
• The special case of an (n, k, k)-splitter is called an (n, k)-perfect hash family

Theorem. For any n, k ≥ 1 one can construct an (n, k)-perfect hash family of size e^k · k^{O(log k)} · log n in time e^k · k^{O(log k)} · n · log n.

• Instead of trying e^k random colorings in the LONGEST PATH algorithm, try all colorings in a perfect hash family
  – If X is the vertex set of a k-path, then |X| = k, so some function splits X evenly
  – Since |X| = k = q, this causes X to be colorful
• The DP then finds a colorful path

Universal sets
• For n, k ∈ ℕ, an (n, k)-universal set is a family ℱ of subsets of [n] such that for any S ⊆ [n] of size k, all 2^k subsets of S are contained in the family: {A ∩ S : A ∈ ℱ} = 2^S

Theorem. For any n, k ≥ 1 one can construct an (n, k)-universal set of size 2^k · k^{O(log k)} · log n in time 2^k · k^{O(log k)} · n · log n.

• Universal sets can be used to derandomize the random separation algorithm for SUBGRAPH ISOMORPHISM (exercise)

Coloring families
• For n, k, q ∈ ℕ, an (n, k, q)-coloring family is a family ℱ of functions from [n] to [q] with the following property:
  – For every graph G on the vertex set [n] with at most k edges, there is a function f ∈ ℱ that properly colors E(G)

Theorem. For any n, k ≥ 1 one can construct an (n, k, 2⌈√k⌉)-coloring family of size 2^{O(√k·log k)} · log n in time 2^{O(√k·log k)} · n · log n.

• Coloring families can be used to derandomize the chromatic coding algorithm for p-CLUSTERING
  – Instead of trying 2^{O(√k)} random colorings, try all colorings in an (n, k, 2⌈√k⌉)-coloring family

A RANDOMIZED ALGORITHM FOR FEEDBACK VERTEX SET

The FEEDBACK VERTEX SET problem
Input: A graph G and an integer k
Parameter: k
Question: Is there a set X of at most k vertices in G, such that each cycle contains a vertex of X?

Reduction rules for FEEDBACK VERTEX SET
(R1) If there is a loop at vertex v, then delete v and decrease k by one
(R2) If there is an edge of multiplicity larger than 2, then reduce its multiplicity to 2
(R3) If there is a vertex v of degree at most 1, then delete v
(R4) If there is a vertex v of degree two, then delete v and add an edge between v’s neighbors
• If (R1)-(R4) cannot be applied anymore, then the minimum degree is at least 3
• Observation. If (G, k) is transformed into (G′, k′), then:
  1. G has an FVS of size ≤ k ⇔ G′ has an FVS of size ≤ k′
  2. Any feedback vertex set of G′ is a feedback vertex set of G when combined with the vertices deleted by (R1)

How randomization helps
• We have seen a deterministic algorithm with runtime 5^k · n^{O(1)}
• There is a simple randomized 4^k · n^{O(1)} Monte Carlo algorithm
• In polynomial time, we can find a size-k solution with probability at least 4^{−k}, if one exists
  – Repeating this 4^k times gives an algorithm with running time 4^k · n^{O(1)} and constant success probability
• The key insight is a simple procedure to select a vertex that is contained in a solution with constant probability

Feedback vertex sets in graphs of minimum degree ≥ 3
• Lemma. Let G be an n-vertex multigraph with minimum degree at least 3. For every feedback vertex set X of G, more than half the edges of G have at least one endpoint in X.
• Proof. Consider the forest H ≔ G − X
  – We prove that |E(G) \ E(H)| > |E(H)|
  – Since |V(H)| > |E(H)| for any forest H, it suffices to prove |E(G) \ E(H)| > |V(H)|
  – Let J be the edges with one end in X and the other in V(H)
  – Let V_{≤1}, V_2, and V_{≥3} be the vertices of H with H-degrees ≤ 1, exactly 2, and ≥ 3
    • Every vertex of V_{≤1} contributes ≥ 2 edges to J (its degree in G is at least 3)
    • Every vertex of V_2 contributes ≥ 1 edge to J
  – So |E(G) \ E(H)| ≥ |J| ≥ 2|V_{≤1}| + |V_2| > |V_{≤1}| + |V_2| + |V_{≥3}| = |V(H)|,
    using that |V_{≥3}| < |V_{≤1}| in any forest

Monte Carlo algorithm for FEEDBACK VERTEX SET

Theorem. There is a randomized polynomial-time algorithm that, given a FEEDBACK VERTEX SET instance (G, k),
• either reports a failure, or
• finds a feedback vertex set of G of size at most k.
If G has an FVS of size ≤ k, it returns a solution with probability at least 4^{−k}.

Monte Carlo algorithm for FEEDBACK VERTEX SET
• Algorithm FVS(Graph G, integer k)
  – Exhaustively apply (R1)-(R4) to obtain (G′, k′)
    • Let X₀ be the vertices with loops removed by (R1)
  – if k′ < 0 then return FAILURE
  – if G′ is a forest then return X₀
  – Uniformly at random, pick an edge e of G′
  – Uniformly at random, pick an endpoint v of e
  – return {v} ∪ X₀ ∪ FVS(G′ − v, k′ − 1)
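One run of this procedure can be sketched in a few lines of Python. The sketch below is an illustration under the following assumptions: the multigraph is represented as a plain edge list (loops allowed), and the names reduce_instance and fvs_one_run are chosen here. It applies (R1)-(R4) exhaustively and then branches on a random endpoint of a random edge; repeating it on the order of 4^k times gives constant success probability, as in the analysis that follows.

import random
from collections import Counter

def reduce_instance(edges, k):
    """Exhaustively apply reduction rules (R1)-(R4) to a multigraph given as an
    edge list.  Returns (edges', k', X0), where X0 collects the looped vertices
    deleted by (R1)."""
    edges = list(edges)
    X0 = set()
    changed = True
    while changed and k >= 0:
        changed = False
        loops = [u for u, v in edges if u == v]
        if loops:                                        # (R1) delete a looped vertex
            x = loops[0]
            X0.add(x)
            edges = [(u, v) for u, v in edges if x not in (u, v)]
            k -= 1
            changed = True
            continue
        multiplicity = Counter(frozenset(e) for e in edges)
        if any(c > 2 for c in multiplicity.values()):    # (R2) cap multiplicities at 2
            edges = []
            for e, c in multiplicity.items():
                u, v = tuple(e)
                edges.extend([(u, v)] * min(c, 2))
            changed = True
            continue
        degree = Counter()
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        low = [x for x, d in degree.items() if d <= 1]
        if low:                                          # (R3) delete a vertex of degree <= 1
            x = low[0]
            edges = [(u, v) for u, v in edges if x not in (u, v)]
            changed = True
            continue
        two = [x for x, d in degree.items() if d == 2]
        if two:                                          # (R4) bypass a degree-2 vertex
            x = two[0]
            nbrs = [u if v == x else v for u, v in edges if x in (u, v)]
            edges = [(u, v) for u, v in edges if x not in (u, v)]
            edges.append((nbrs[0], nbrs[1]))             # may create a loop or multi-edge
            changed = True
    return edges, k, X0

def fvs_one_run(edges, k):
    """One run of the randomized FVS procedure: returns a feedback vertex set or
    None on failure.  If a size-<=k FVS exists, a single run succeeds with
    probability >= 4^{-k}; repeat about 4^k times for constant success probability."""
    edges, k, X0 = reduce_instance(edges, k)
    if k < 0:
        return None
    if not edges:                                        # reduced graph is a forest
        return X0
    u, v = random.choice(edges)                          # random edge of G'
    x = random.choice((u, v))                            # random endpoint
    rest = [(a, b) for (a, b) in edges if x not in (a, b)]
    sub = fvs_one_run(rest, k - 1)
    return None if sub is None else X0 | {x} | sub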
Correctness (I)
• The algorithm outputs a feedback vertex set or FAILURE
• Claim: If G has a size-k FVS, then the algorithm finds a solution with probability at least 4^{−k}
• Proof by induction on k
  – Assume G has a size-k feedback vertex set X
  – By the safety of (R1)-(R4), G′ has a size-k′ FVS X′
    • We have k′ = k − |X₀|
    • Since vertices with loops are in any FVS, we have X₀ ⊆ X
  – If k′ = 0, then |X₀| = |X|, so X₀ = X and G′ is a forest
    • The algorithm outputs X₀ = X, which is a valid solution
  – If k′ ≥ 1, we will use the induction hypothesis

Correctness (II)
• Case k′ ≥ 1:
  – The probability that the random edge e has an endpoint in X′ is ≥ 1/2
  – The probability that v ∈ X′ is therefore ≥ 1/4
  – If v ∈ X′, then G′ − v has an FVS of size ≤ k′ − 1
    • Then, by induction, with probability ≥ 4^{−(k′−1)} the recursion gives a size-(k′ − 1) FVS X* of G′ − v
    • So {v} ∪ X* is a size-k′ FVS of G′
  – By the reduction rules, the output {v} ∪ X₀ ∪ X* is an FVS of G
    • Its size is at most k′ + |X₀| = k
  – The probability of success is ≥ (1/4) · 4^{−(k′−1)} = 4^{−k′} ≥ 4^{−k}

Theorem. There is a Monte Carlo algorithm for FEEDBACK VERTEX SET with one-sided error and constant success probability that runs in time 4^k · n^{O(1)}.

Discussion
• This simple, randomized algorithm is faster than the deterministic algorithm from the previous lecture
• The method generalizes to ℱ-MINOR-FREE DELETION problems: delete vertices from the graph to ensure that the resulting graph contains no member of the fixed set ℱ as a minor
  – FEEDBACK VERTEX SET is {K₃}-MINOR-FREE DELETION

Exercises
• Color coding: 5.1, 5.2, 5.8, 5.15, 5.19

Summary
• Several variations of color coding give efficient FPT algorithms
• The general recipe is as follows:
  – Randomly color the input, such that if a solution exists, one is highlighted with probability 1/f(k)
  – Show that a highlighted solution can be found in a colored instance in time g(k) · n^{O(1)}
• For most problems we obtained single-exponential algorithms
  – For p-CLUSTERING we obtained a subexponential algorithm