NP and Computational Intractability
叶德仕 yedeshi@zju.edu.cn

Decision Problems
Decision problem. The answer is simply "yes" or "no". X is a set of strings. Instance: a string s. Algorithm A solves problem X if A(s) = yes iff s ∈ X.
Optimization problem. Each feasible (i.e., "legal") solution has an associated value, and we wish to find a feasible solution with the best value.

Polynomial time. Algorithm A runs in poly-time if, for every string s, A(s) terminates in at most p(|s|) "steps", where p(·) is some polynomial.
Certification algorithm intuition. The certifier views things from a "managerial" viewpoint. It does not determine whether s ∈ X on its own; rather, it checks a proposed certificate t that s ∈ X. That is, given the instance s and a proposed solution t, the certifier checks whether C(s, t) = yes.

Def. Algorithm C(s, t) is a certifier for problem X if for every string s, s ∈ X iff there exists a string t such that C(s, t) = yes.
NP. Decision problems for which there exists a poly-time certifier, that is, C(s, t) runs in poly-time and |t| ≤ p(|s|) for some polynomial p(·).
Remark. NP stands for nondeterministic polynomial time.

Certifiers and Certificates: Composite Numbers
COMPOSITES. Given an integer s, is s composite?
Certificate. A nontrivial factor t of s. Such a certificate exists iff s is composite. Moreover, |t| ≤ |s|.
Certifier.
    boolean C(s, t) {
        if (t ≤ 1 or t ≥ s) return false
        else if (s is a multiple of t) return true
        else return false
    }
Example instance. s = 437,669.
Certificate. t = 541 or t = 809 (437,669 = 541 × 809).
Conclusion. COMPOSITES is in NP.

Certifiers and Certificates: 3-Satisfiability
SAT. Given a CNF formula φ, is there a satisfying assignment?
Certificate. An assignment of truth values to the n boolean variables.
Certifier. Check that each clause in φ has at least one true literal.
Instance: (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3 ∨ x4)
Certificate: x1 = 1, x2 = 1, x3 = 0, x4 = 1
Conclusion. SAT is in NP.

Certifiers and Certificates: Hamiltonian Cycle
HAM-CYCLE.
Given an undirected graph G = (V, E), does there exist a simple cycle C that visits every node?
Certificate. A permutation of the n nodes.
Certifier. Check that the permutation contains each node in V exactly once, and that there is an edge between each pair of adjacent nodes in the permutation (including the last node and the first).
Conclusion. HAM-CYCLE is in NP.

Halting problem
The halting problem is a decision problem that can be stated as follows: given a description of a program and a finite input, decide whether the program finishes running or runs forever on that input.
The halting problem is undecidable. (Alan Turing, 1936)

P, NP, EXP
P. Decision problems for which there is a poly-time algorithm.
EXP. Decision problems for which there is an exponential-time algorithm.
NP. Decision problems for which there is a poly-time certifier.

Claim. P ⊆ NP.
Pf. Consider any problem X in P. By definition, there exists a poly-time algorithm A(s) that solves X. The certificate can be the empty string, and the certifier is C(s, t) = A(s).
Claim. NP ⊆ EXP.
Pf. Consider any problem X in NP. By definition, there exists a poly-time certifier C(s, t) for X. To solve input s, run C(s, t) on all strings t with |t| ≤ p(|s|). Return yes if C(s, t) returns yes for any of these.

The Main Question: P Versus NP
Does P = NP? [Cook 1971, Edmonds, Levin, Yablonski, Gödel]
Is the decision problem as easy as the certification problem?
Clay $1 million prize.
[Figure: two diagrams. If P ≠ NP, then P is a proper subset of NP, and NP ⊆ EXP. If P = NP, the two classes coincide inside EXP.]

A Formal-Language Framework
An alphabet Σ is a finite set of symbols.
A language L over Σ is any set of strings made up of symbols from Σ.
For example, with Σ = {0, 1}, L = {10, 11, 101, 111, 1011, 1101, 10001, ...} is the language of binary representations of prime numbers.
Denote the empty string by ε.
Denote the empty language by Ø.

The language of all strings over Σ is denoted Σ*.
For example, if Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...} is the set of all binary strings.
Every language L over Σ is a subset of Σ*.
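The NP ⊆ EXP argument and the enumeration of Σ* can be tried out on a small case. The following is a minimal Python sketch, not part of the original slides: the names `strings_up_to`, `composites_certifier`, and `brute_force_decide` are hypothetical, instances and certificates are assumed to be binary strings, and `max_cert_len` stands in for the bound p(|s|).

```python
from itertools import product


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)


def composites_certifier(s, t):
    """C(s, t) for COMPOSITES, with s and t given as binary strings (an assumption)."""
    if t == "":
        return False  # the empty string encodes no factor
    n, d = int(s, 2), int(t, 2)
    # t must be a nontrivial factor: 1 < t < s and s is a multiple of t
    return 1 < d < n and n % d == 0


def brute_force_decide(certifier, s, max_cert_len):
    """Decide whether s is a yes-instance by trying every certificate t
    with |t| <= max_cert_len. This takes exponential time in max_cert_len."""
    return any(certifier(s, t) for t in strings_up_to("01", max_cert_len))


print(list(strings_up_to("01", 2)))  # ['', '0', '1', '00', '01', '10', '11']
print(brute_force_decide(composites_certifier, "1111", 4))  # 15 is composite: True
print(brute_force_decide(composites_certifier, "1101", 4))  # 13 is prime: False
```

For COMPOSITES the bound |t| ≤ |s| from the certificate slide suffices, so the sketch passes the length of s as `max_cert_len`. The loop visits on the order of 2^|s| candidate certificates, which is why this only places the problem in EXP.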
Set-theoretic operations
Set-theoretic operations such as union and intersection apply to languages directly. The complement of L is L̄ = Σ* - L.
The concatenation of two languages L1 and L2 is the language L = {x1x2 : x1 ∈ L1 and x2 ∈ L2}.
The closure or Kleene star of a language L is the language L* = {ε} ∪ L ∪ L^2 ∪ L^3 ∪ ···, where L^k is the language obtained by concatenating L to itself k times.

Decision problem and language
The set of instances for any decision problem Q is simply the set Σ*, where Σ = {0, 1}.
Since Q is entirely characterized by those problem instances that produce a 1 (yes) answer, we can view Q as a language L over Σ = {0, 1}, where L = {x ∈ Σ* : Q(x) = 1}.
Example: PATH = {〈G, u, v, k〉 : G = (V, E) is an undirected graph, u, v ∈ V, k ≥ 0 is an integer, and there exists a path from u to v in G consisting of at most k edges}.

We say that an algorithm A accepts a string x ∈ {0, 1}* if, given input x, the algorithm's output A(x) is 1.
The language accepted by an algorithm A is the set of strings L = {x ∈ {0, 1}* : A(x) = 1}.
An algorithm A rejects a string x if A(x) = 0.
A language L is decided by an algorithm A if every binary string in L is accepted by A and every binary string not in L is rejected by A.

A language L is accepted in polynomial time by an algorithm A if it is accepted by A and if, in addition, there is a constant k such that for any length-n string x ∈ L, algorithm A accepts x in time O(n^k).
A language L is decided in polynomial time by an algorithm A if there is a constant k such that for any length-n string x ∈ {0, 1}*, the algorithm correctly decides whether x ∈ L in time O(n^k).
To accept a language, an algorithm need only worry about strings in L, but to decide a language, it must correctly accept or reject every string in {0, 1}*.

Class P
P = {L ⊆ {0, 1}* : there exists an algorithm A that decides L in polynomial time}

Polynomial-time verification
For example, suppose that for a given instance 〈G, u, v, k〉 of the decision problem PATH, we are also given a path p from u to v.
We can easily check whether the length of p is at most k. So, we can view p as a "certificate" that the instance indeed belongs to PATH.

Verification algorithms
We define a verification algorithm as a two-argument algorithm A, where one argument is an ordinary input string x and the other is a binary string y called a certificate.
A two-argument algorithm A verifies an input string x if there exists a certificate y such that A(x, y) = 1.
The language verified by a verification algorithm A is L = {x ∈ {0, 1}* : there exists y ∈ {0, 1}* such that A(x, y) = 1}.

Class NP
The complexity class NP is the class of languages that can be verified by a polynomial-time algorithm.
More precisely, a language L belongs to NP if and only if there exist a two-input polynomial-time algorithm A and a constant c such that L = {x ∈ {0, 1}* : there exists a certificate y with |y| = O(|x|^c) such that A(x, y) = 1}.
We say that algorithm A verifies language L in polynomial time.

Class co-NP
Is NP closed under complement? That is, does L ∈ NP imply L̄ ∈ NP?
We define the complexity class co-NP as the set of languages L such that L̄ ∈ NP.
[Figure: four possible relationships among the classes. (a) P = NP = co-NP. (b) NP = co-NP, with P a proper subset. (c) P = NP ∩ co-NP, with NP ≠ co-NP. (d) NP ≠ co-NP, with P a proper subset of NP ∩ co-NP.]

NP-completeness and reducibility
Polynomial Transformation
Def. Problem X polynomial reduces (Cook) to problem Y if arbitrary instances of problem X can be solved using:
  a polynomial number of standard computational steps, plus
  a polynomial number of calls to an oracle that solves problem Y.
Def. Problem X polynomial transforms (Karp) to problem Y if, given any input x to X, we can construct an input y in polynomial time such that x is a yes instance of X iff y is a yes instance of Y.
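To make the verification-algorithm definition concrete, here is a minimal Python sketch of a verifier for the PATH example. It is an illustration under stated assumptions, not part of the original slides: the name `verify_path` is hypothetical, the graph is assumed to be an adjacency dictionary mapping each vertex to the set of its neighbours, and the certificate is assumed to be a list of vertices.

```python
def verify_path(graph, u, v, k, path):
    """Return True iff `path` certifies that <G, u, v, k> belongs to PATH.

    graph: dict mapping each vertex to the set of its neighbours (undirected,
           so each edge is stored in both directions).
    path:  the certificate, a list of vertices.
    """
    if not path or path[0] != u or path[-1] != v:
        return False  # must start at u and end at v
    if any(x not in graph for x in path):
        return False  # every vertex on the path must belong to V
    if len(path) - 1 > k:
        return False  # the path may use at most k edges
    # every consecutive pair of vertices must be joined by an edge of G
    return all(b in graph[a] for a, b in zip(path, path[1:]))


G = {1: {2, 3}, 2: {1, 4}, 3: {1}, 4: {2}}
print(verify_path(G, 1, 4, 2, [1, 2, 4]))  # True: two edges, both in G
print(verify_path(G, 1, 4, 1, [1, 2, 4]))  # False: needs two edges but k = 1
print(verify_path(G, 1, 4, 3, [1, 3, 4]))  # False: (3, 4) is not an edge
```

The check runs in time linear in the length of the certificate. It does not insist that the certificate is a simple path, since any walk from u to v with at most k edges contains a simple path with at most k edges.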
We say that a language L1 is polynomial-time reducible to a language L2, written L1 ≤P L2, if there exists a polynomial-time computable function f : {0, 1}* → {0, 1}* such that for all x ∈ {0, 1}*, x ∈ L1 if and only if f(x) ∈ L2.
We call the function f the reduction function, and a polynomial-time algorithm F that computes f is called a reduction algorithm.

NP-Complete
NP-complete. A problem Y in NP with the property that for every problem X in NP, X ≤P Y.
A language L ⊆ {0, 1}* is NP-complete if
  1. L ∈ NP, and
  2. L′ ≤P L for every L′ ∈ NP.
If L satisfies property 2, but not necessarily property 1, we say that L is NP-hard. We write NPC for the class of NP-complete languages.

Lemma. If L1, L2 ⊆ {0, 1}* are languages such that L1 ≤P L2, then L2 ∈ P implies L1 ∈ P.
[Figure: algorithm A1 for L1. On input x, the reduction algorithm F computes f(x), and algorithm A2 decides whether f(x) ∈ L2. A1 answers yes (x ∈ L1) if A2 answers yes (f(x) ∈ L2), and no (x ∉ L1) if A2 answers no (f(x) ∉ L2).]

Theorem. If any NP-complete problem is polynomial-time solvable, then P = NP.
Pf. Suppose that L ∈ P and also that L ∈ NPC. For any L′ ∈ NP, we have L′ ≤P L by property 2 of the definition of NP-completeness. Thus, by the above Lemma, we also have L′ ∈ P, which proves the theorem.
[Figure: if P ≠ NP, then P and NPC are disjoint subsets of NP.]

Circuit Satisfiability
CIRCUIT-SAT. Given a combinational circuit built out of AND, OR, and NOT gates, is there a way to set the circuit inputs so that the output is 1?
[Figure: example circuit with output 1. Yes instance: inputs 1 0 1.]

The "First" NP-Complete Problem
CIRCUIT-SAT: given a boolean combinational circuit composed of AND, OR, and NOT gates, is it satisfiable?
Theorem. CIRCUIT-SAT is NP-complete. [Cook 1971, Levin 1973]
Pf sketch.
  The circuit-satisfiability problem belongs to the class NP.
  Show that every language in NP is polynomial-time reducible to CIRCUIT-SAT.

Establishing NP-Completeness
Method for showing that a problem Y is NP-complete:
  Show that Y is in NP.
  Choose an NP-complete problem X.
  Prove that X ≤P Y.
Justification. If Y is a problem such that X ≤P Y for some X ∈ NPC, then Y is NP-hard. Moreover, if Y ∈ NP, then Y ∈ NPC.

3-CNF satisfiability
A boolean formula is in conjunctive normal form, or CNF, if it is expressed as an AND of clauses, each of which is the OR of one or more literals.
A boolean formula is in 3-conjunctive normal form, or 3-CNF, if each clause has exactly three distinct literals.
Example: (x1 ∨ ¬x1 ∨ ¬x2) ∧ (x3 ∨ x2 ∨ x4) ∧ (¬x1 ∨ ¬x3 ∨ ¬x4)

3-SAT is NP-Complete
Theorem. 3-SAT is NP-complete.
Pf sketch. Since 3-SAT is in NP, it suffices to show that CIRCUIT-SAT ≤P 3-SAT.

The clique problem
A clique in an undirected graph G = (V, E) is a subset V' ⊆ V of vertices, each pair of which is connected by an edge in E.
In other words, a clique is a complete subgraph of G. The size of a clique is the number of vertices it contains.
The clique problem is the optimization problem of finding a clique of maximum size in a graph. As a decision problem, we ask simply whether a clique of a given size k exists in the graph. The formal definition is
CLIQUE = {〈G, k〉 : G is a graph with a clique of size k}.

Clique problem
Theorem. The clique problem is NP-complete.
Pf. CLIQUE is in NP: for a given graph G = (V, E), we use the set V' ⊆ V of vertices in the clique as a certificate for G. Checking whether V' is a clique can be accomplished in polynomial time by checking whether, for each pair u, v ∈ V', the edge (u, v) belongs to E.
We next prove that 3-CNF-SAT ≤P CLIQUE.
Reduction algorithm: let φ = C1 ∧ C2 ∧ ··· ∧ Ck be a boolean formula in 3-CNF with k clauses. We shall construct a graph G such that φ is satisfiable if and only if G has a clique of size k.

Construction of graph G
The graph G = (V, E) is constructed as follows.
For each clause Cr = (r1 ∨ r2 ∨ r3), we place a triple of vertices vr1, vr2, vr3 in V.
We put an edge between two vertices vri and vsj if both of the following hold:
  vri and vsj are in different triples, that is, r ≠ s;
  their corresponding literals are consistent, that is, ri is not the negation of sj.
This graph can easily be computed from φ in polynomial time.

Example
φ = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (x1 ∨ x2 ∨ x3)

Proof (cont.)
We must show that this transformation of φ into G is a reduction.
First, suppose that φ has a satisfying assignment.
Then each clause Cr contains at least one literal rj that is assigned 1, and each such literal corresponds to a vertex vrj.
Picking one such "true" literal from each clause yields a set V' of k vertices.
We claim that V' is a clique. For any two vertices vri, vsj ∈ V' where r ≠ s, both corresponding literals ri and sj are mapped to 1 by the given satisfying assignment, and thus the literals cannot be complements. Hence, by the construction of G, the edge (vri, vsj) belongs to E.

Proof (cont.)
Conversely, suppose that G has a clique V' of size k.
No edges in G connect vertices in the same triple, and so V' contains exactly one vertex per triple.
We can assign 1 to each literal ri such that vri ∈ V'. Assigning 1 to both a literal and its complement cannot happen, since G contains no edges between inconsistent literals.
Each clause is satisfied, and so φ is satisfied.
Any variables that do not correspond to a vertex in the clique may be set arbitrarily.

The vertex-cover problem
A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V' (or both).
A vertex cover for G is a set of vertices that covers all the edges in E.
As a decision problem, we define
VERTEX-COVER = {〈G, k〉 : graph G has a vertex cover of size k}.

Vertex-cover is NPC
Theorem. The vertex-cover problem is NP-complete.
We first show that VERTEX-COVER ∈ NP. Suppose we are given a graph G = (V, E) and an integer k. The certificate is the vertex cover V' ⊆ V itself.
The verification algorithm affirms that |V'| = k, and then it checks, for each edge (u, v) ∈ E, that u ∈ V' or v ∈ V'. This verification can be performed straightforwardly in polynomial time.

We prove that the vertex-cover problem is NP-hard by showing that CLIQUE ≤P VERTEX-COVER.
This reduction is based on the notion of the "complement" of a graph. Given an undirected graph G = (V, E), we define the complement of G as Ḡ = (V, Ē), where Ē = {(u, v) : u, v ∈ V, u ≠ v, and (u, v) ∉ E}. In other words, Ḡ is the graph containing exactly those edges that are not in G.
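The two ingredients just introduced, the VERTEX-COVER verification algorithm and the complement of a graph, can be sketched in a few lines of Python. This is an illustrative sketch rather than part of the original slides: the names `complement_edges` and `is_vertex_cover` are hypothetical, and edges are assumed to be given as pairs of distinct vertices.

```python
from itertools import combinations


def complement_edges(vertices, edges):
    """Edge set of the complement graph: every unordered pair of distinct
    vertices that is NOT an edge of the original graph."""
    present = {frozenset(e) for e in edges}
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    return all_pairs - present


def is_vertex_cover(vertices, edges, cover, k):
    """Verification algorithm for VERTEX-COVER: affirm that |cover| = k,
    that cover is a subset of V, and that every edge has an endpoint in it."""
    cover = set(cover)
    if len(cover) != k or not cover <= set(vertices):
        return False
    return all(cover & set(e) for e in edges)


V = [1, 2, 3, 4]
E = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(sorted(sorted(e) for e in complement_edges(V, E)))  # [[1, 4], [2, 4]]
print(is_vertex_cover(V, E, {1, 3}, 2))  # True: every edge touches 1 or 3
print(is_vertex_cover(V, E, {3, 4}, 2))  # False: edge (1, 2) is uncovered
```

Both functions run in time polynomial in |V| and |E|: the complement examines each of the |V|(|V| - 1)/2 vertex pairs once, and the verifier examines each edge once.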
Example: complement of G
[Figure: a graph G and its complement Ḡ.]

Vertex-cover
The reduction algorithm takes as input an instance 〈G, k〉 of the clique problem.
It computes the complement Ḡ, which is easily done in polynomial time.
The output of the reduction algorithm is the instance 〈Ḡ, |V| - k〉 of the vertex-cover problem.
The graph G has a clique of size k if and only if the graph Ḡ has a vertex cover of size |V| - k.

Vertex-cover
Suppose that G has a clique V' ⊆ V with |V'| = k. We claim that V - V' is a vertex cover in Ḡ.
Let (u, v) be any edge in Ē. Then (u, v) ∉ E, which implies that at least one of u or v does not belong to V', since every pair of vertices in V' is connected by an edge of E.
Equivalently, at least one of u or v is in V - V', which means that edge (u, v) is covered by V - V'.
Hence, the set V - V', which has size |V| - k, forms a vertex cover for Ḡ.

Vertex-cover (cont.)
Conversely, suppose that Ḡ has a vertex cover V' ⊆ V, where |V'| = |V| - k.
Then, for all u, v ∈ V, if (u, v) ∈ Ē, then u ∈ V' or v ∈ V' or both.
Taking the contrapositive, for all distinct u, v ∈ V, if u ∉ V' and v ∉ V', then (u, v) ∈ E.
In other words, V - V' is a clique, and it has size |V| - |V'| = k.

Hamiltonian Cycle
HAM-CYCLE: given an undirected graph G = (V, E), does there exist a simple cycle that contains every node in V?
YES: vertices and faces of a dodecahedron.
NO: bipartite graph with an odd number of nodes.

Directed Hamiltonian Cycle
DIR-HAM-CYCLE: given a digraph G = (V, E), does there exist a simple directed cycle that contains every node in V?
Claim. DIR-HAM-CYCLE ≤P HAM-CYCLE.
[Figure: a node v of the digraph and the corresponding nodes vin, v, vout of the undirected graph.]

3-SAT Reduces to Directed Hamiltonian Cycle
Claim. 3-SAT ≤P DIR-HAM-CYCLE.
Pf. Given an instance φ of 3-SAT, we construct an instance of DIR-HAM-CYCLE that has a Hamiltonian cycle iff φ is satisfiable.
Construction.
First, create a graph that has 2^n Hamiltonian cycles, which correspond in a natural way to the 2^n possible truth assignments.
Intuition: traverse path i from left to right iff variable xi is set to 1.

Hamiltonian Path
Claim. HAM-CYCLE ≤P HAM-PATH.

Longest Path
SHORTEST-PATH. Given a digraph G = (V, E), does there exist a simple path of length at most k edges?
LONGEST-PATH. Given a digraph G = (V, E), does there exist a simple path of length at least k edges?

Longest Path
Claim. HAM-CYCLE ≤P LONGEST-PATH.
Let G = (V, E) be an instance of HAM-CYCLE.
If the longest simple cycle in G has length |V|, then every vertex is visited, and thus there is a Hamiltonian cycle in G.
Reduction: compute the longest simple cycle in G. If the length of this cycle is |V|, there is a Hamiltonian cycle. Otherwise, there is no Hamiltonian cycle.

The longest path

Woh-oh-oh-oh, find the longest path!
Woh-oh-oh-oh, find the longest path!

If you said P is NP tonight,
There would still be papers left to write,
I have a weakness,
I'm addicted to completeness,
And I keep searching for the longest path.

The algorithm I would like to see
Is of polynomial degree,
But it's elusive:
Nobody has found conclusive
Evidence that we can find a longest path.

I have been hard working for so long.
I swear it's right, and he marks it wrong.
Some how I'll feel sorry when it's done:
GPA 2.1
Is more than I hope for.

Garey, Johnson, Karp and other men (and women)
Tried to make it order N log N.
Am I a mad fool
If I spend my life in grad school,
Forever following the longest path?

Woh-oh-oh-oh, find the longest path!
Woh-oh-oh-oh, find the longest path!
Woh-oh-oh-oh, find the longest path.

Lyrics. Copyright © 1988 by Daniel J. Barrett.
Music. Sung to the tune of The Longest Time by Billy Joel.
Recorded by Dan Barrett while a grad student at Johns Hopkins during a difficult algorithms final.

The traveling-salesman problem (TSP)
Traveling-salesman problem: a salesman spends his time visiting n cities (or nodes) cyclically.
In one tour he visits each city just once, and finishes up where he started.
In what order should he visit them to minimise the distance travelled?
There is an integer cost c(i, j) to travel from city i to city j. As a decision problem, we ask whether there is a tour of total cost at most k.

TSP: NPC
Theorem. The traveling-salesman problem is NP-complete.
Pf. TSP is in NP, since the tour itself serves as a certificate. We show that HAM-CYCLE ≤P TSP.
Let G = (V, E) be an instance of HAM-CYCLE. We construct an instance of TSP as follows.
We form the complete graph G' = (V, E'), where E' = {(i, j) : i, j ∈ V and i ≠ j}, and we define the cost function c by
  c(i, j) = 0 if (i, j) ∈ E,
  c(i, j) = 1 if (i, j) ∉ E.

Proof.
Graph G has a hamiltonian cycle if and only if graph G' has a tour of cost at most 0.
Suppose that graph G has a hamiltonian cycle h. Each edge in h belongs to E and thus has cost 0 in G'. Thus, h is a tour in G' with cost 0.
Conversely, suppose that graph G' has a tour h' of cost at most 0. Since the costs of the edges in E' are 0 and 1, the cost of tour h' is exactly 0 and each edge on the tour must have cost 0. Therefore, h' contains only edges in E.
We conclude that h' is a hamiltonian cycle in graph G.

NP Problems

Subset sum problem
Given a set of integers and an integer s, does any non-empty subset sum to s?

Partition Problem
Given a set S of integers, is there a way to partition S into two subsets S1 and S2 such that the sum of the numbers in S1 equals the sum of the numbers in S2?
The subsets S1 and S2 must form a partition in the sense that they are disjoint and they cover S.

3-Partition problem
A variation of the partition problem is the 3-partition problem, in which the set S must be partitioned into |S|/3 triples, each with the same sum.

Exercise
Prove that the Knapsack problem is NP-hard.
Prove that the Bin packing problem is NP-hard.

A compendium of NP optimization problems
Editors: Pierluigi Crescenzi and Viggo Kann
Subeditors:
Magnús Halldórsson (retired): Graph Theory: Covering and Partitioning, Subgraphs and Supergraphs; Sets and Partitions.
Marek Karpinski: Graph Theory: Vertex Ordering; Network Design: Cuts and Connectivity.
Gerhard Woeginger: Sequencing and Scheduling.
http://www.nada.kth.se/~viggo/problemlist/

The NP-Completeness Column
David Johnson. ACM Transactions on Algorithms 1:1, 160-176 (2005)

Open Problem
Open: the Visibility Graph Recognition problem is not even known to be in NP.
Visibility Graph Recognition problem: given a visibility graph G and a Hamiltonian circuit C, determine in polynomial time whether there is a simple polygon whose vertex visibility graph is G and whose boundary corresponds to C.
Visibility graph: let S be a set of simple polygonal obstacles in the plane. The nodes of the visibility graph of S are just the vertices of S, and there is an edge (called a visibility edge) between two vertices if these vertices are mutually visible.
A simple polygon is a polygon that does not intersect itself anywhere. These are also called Jordan polygons.