MCS 312: NP Completeness and Approximation Algorithms
Instructor: Neelima Gupta
ngupta@cs.du.ac.in

Table of Contents
• Circuit Satisfiability is NP-Complete
• (CNF) SAT is NP-Complete
• 3(CNF) SAT is NP-Complete

C-SAT is NP-Complete
Cook’s Theorem
Thanks to Stuti Chawla (28) MCS '11

CIRCUIT-SAT is in NP
Do it yourself.

CIRCUIT-SAT (or C-SAT) is the set of all combinational circuits C such that C is satisfiable:
C-SAT = { <C> : C is a satisfiable combinational circuit }
For C-SAT to be NP-hard, every problem in NP must be reducible to it in polynomial time.

Proof:
Let P be an arbitrary problem in NP. P is a decision problem (say). Let L be the language corresponding to P.
Since P ∈ NP, there exists an algorithm A that verifies P (or L) in polynomial time.
This implies that for every x ∈ L, there exists a certificate y, of length polynomial in the length of x, such that A(x, y) = 1 (by definition of NP).

Aim: To reduce the given NP problem P to C-SAT.
The algorithm A changes the state of the system from one configuration to another.
Suppose |x| = n.
Then |y| = O(n^k) for some constant k (by definition).
A runs in time polynomial in |x| + |y|, so T(n) = O(n^k) for a suitably large constant k.

Reduction of an NP problem to Circuit-SAT
Claim:
• If there exists a certificate y such that A(x, y) = 1, then C_x is satisfiable.
• If C_x is satisfiable, then there exists a certificate y such that A(x, y) = 1.
Proof: The claim follows from the construction, which guarantees C_x(y) = A(x, y).

Reduction of an NP problem to Circuit-SAT
Claim: The reduction runs in polynomial time.
Proof: We’ll prove that the size of the circuit C_x is polynomial in |x|.
1. The size of the program for A is independent of the size of x. (The size of any program is independent of its input size; its execution time may depend on the input, but the size of the code does not.)
2. |y| is polynomial in |x|.
3. The working storage area in each configuration is O(T(n)), where T(n) is the running time of A. (That’s true for any algorithm.)
4. The size of M, the circuit that maps one configuration to the next, is polynomial in the length of a configuration.
5. The number of configurations is T(n).
Hence the size of C_x is polynomial in |x|.

CIRCUIT-SAT is in NP
We shall construct a two-input, polynomial-time algorithm that can verify CIRCUIT-SAT.

The CNF-SAT Problem
Terminology
• A Boolean formula is a parenthesized expression formed from Boolean variables using Boolean operations such as AND, OR, NOT, IMPLIES, IF AND ONLY IF.
• A clause is the OR of literals, where a literal is a Boolean variable or its negation.
• A Boolean formula is in Conjunctive Normal Form (CNF) if it is formed as a collection of subexpressions, called clauses, which are combined using AND.
• For example, the following Boolean formula is in CNF:
(x1’ + x2 + x4 + x7’) (x3 + x5’) (x2’ + x4 + x6’ + x8) (x1 + x3 + x5 + x8’)

The CNF-SAT Problem
Statement: CNF-SAT takes a Boolean formula in CNF as input and asks if there is an assignment of Boolean values to its variables so that the formula evaluates to 1.

The CNF-SAT Problem
To prove: CNF-SAT is NP-Complete
Step 1: Show that CNF-SAT belongs to NP.
It is possible to design a polynomial-time algorithm that takes the following two inputs and checks whether the assignment specified by S satisfies every clause in I:
• An instance I of the problem
• A proposed solution S

CNF-SAT is NP-Hard
Step 2: Show that CNF-SAT is NP-hard.
We shall do this through the local-replacement approach, by reducing the CIRCUIT-SAT problem to CNF-SAT in polynomial time.
Given: A Boolean circuit C.
Assume that each AND / OR gate has 2 inputs, and each NOT gate has 1 input.

CNF-SAT is NP-Hard: Reduction from Circuit-SAT
• ϕ = y4 ∧ (y1 ↔ x1 ∧ x2) ∧ (y2 ↔ ~x3) ∧ (y3 ↔ x2 ∧ y2) ∧ (y4 ↔ y3 ∨ y1)
Note that for every assignment of the xi’s there is a corresponding assignment of the yj’s (the gate outputs). These values of the xi’s and yj’s always satisfy the last four clauses above by construction.
Suppose the circuit is satisfiable. Then there exists an assignment of the xi’s which makes y4 true. Hence the above formula is satisfiable.
If the circuit is not satisfiable, then every assignment of the xi’s forces y4 to be false in any assignment that satisfies the gate clauses. Hence the above formula is not satisfiable.

Converting Implications to CNF
• Create a formula Bg corresponding to each gate g in C as follows:
1. If g is an AND gate with inputs a and b (which could be either xi’s or yi’s) and output c, then Bg = (c ↔ (a.b)).
2. If g is an OR gate with inputs a and b, and output c, then Bg = (c ↔ (a+b)).
3. If g is a NOT gate with input a and output b, then Bg = (b ↔ a’).
• Convert each Bg to CNF as follows:
1. Construct a truth table for Bg.
2. Derive a formula for Bg in CNF. (If you don’t know how to do it directly, write the complement Bg’ in DNF and then convert it into CNF by De Morgan’s Law.)

CNF-SAT : An Example
Conversion of Bg1 = (y1 ↔ (x1.x2)) to CNF

y1  x1  x2 | Bg1 = (y1 ↔ (x1.x2))
 1   1   1 |  1
 1   1   0 |  0
 1   0   1 |  0
 1   0   0 |  0
 0   1   1 |  0
 0   1   0 |  1
 0   0   1 |  1
 0   0   0 |  1

CNF-SAT : An Example
• DNF formula for Bg1’ = x1.x2.y1’ + x1.x2’.y1 + x1’.x2.y1 + x1’.x2’.y1
• CNF formula for Bg1 = (x1’+x2’+y1)(x1’+x2+y1’)(x1+x2’+y1’)(x1+x2+y1’)
Similarly, CNF formula for Bg3 = (x2’+y2’+y3)(x2’+y2+y3’)(x2+y2’+y3’)(x2+y2+y3’)

CNF-SAT : An Example
Conversion of Bg2 = (y2 ↔ x3’) to CNF

x3  y2 | Bg2 = (y2 ↔ x3’)
 0   0 |  0
 0   1 |  1
 1   0 |  1
 1   1 |  0

CNF-SAT : An Example
• DNF formula for Bg2’ = x3’.y2’ + x3.y2
• CNF formula for Bg2 = (x3+y2)(x3’+y2’)
Bg3 is similar to Bg1.

CNF-SAT : An Example
Conversion of Bg4 = (y4 ↔ (y1 V y3)) to CNF

y1  y3  y4 | Bg4 = (y4 ↔ (y1 V y3))
 0   0   0 |  1
 0   0   1 |  0
 0   1   0 |  0
 0   1   1 |  1
 1   0   0 |  0
 1   0   1 |  1
 1   1   0 |  0
 1   1   1 |  1

CNF-SAT : An Example
• DNF formula for Bg4’ = y1’.y3’.y4 + y1’.y3.y4’ + y1.y3’.y4’ + y1.y3.y4’
• CNF formula for Bg4 = (y1+y3+y4’)(y1+y3’+y4)(y1’+y3+y4)(y1’+y3’+y4)

CNF-SAT : An Example
CNF for the entire circuit is given by
Ф = y4 (x1’+x2’+y1)(x1’+x2+y1’)(x1+x2’+y1’)(x1+x2+y1’) (x3+y2)(x3’+y2’) (x2’+y2’+y3)(x2’+y2+y3’)(x2+y2’+y3’)(x2+y2+y3’) (y1+y3+y4’)(y1+y3’+y4)(y1’+y3+y4)(y1’+y3’+y4)
Note that for every assignment of the xi’s there is a corresponding assignment of the yj’s.
These values of the xi’s and yj’s always satisfy all of the above clauses (except possibly y4) by construction.
Suppose the circuit is satisfiable. Then there exists an assignment of the xi’s which makes y4 true. Hence the above formula is satisfiable.
If the circuit is not satisfiable, then every assignment of the xi’s forces y4 to be false. Hence the above formula is not satisfiable.

THE 3SAT PROBLEM
Statement: This problem takes a Boolean formula S in CNF, in which each clause has exactly 3 literals, and asks if S is satisfiable.
Observation: This is a restricted version of the CNF-SAT problem.

THE 3SAT PROBLEM
To prove: 3SAT is NP-Complete
Step 1: 3SAT is obviously in NP (by restriction: 3SAT is a special case of CNF-SAT).
Step 2: Show that 3SAT is NP-hard.
Observe that the restriction form of NP-hardness proof does not apply to this situation. Why?
We shall use the local-replacement form of proof and reduce the CNF-SAT problem to 3SAT in polynomial time.

THE 3SAT PROBLEM
Given: A Boolean formula C in CNF.
Perform the following local replacement for each clause Ci in C:
• If Ci = (a), that is, it has a single term, replace Ci with Si = (a + b + c) . (a + b’ + c) . (a + b + c’) . (a + b’ + c’), where b and c are new variables not used anywhere else.
• If Ci = (a+b), that is, it has 2 terms, replace Ci with Si = (a + b + c) . (a + b + c’), where c is a new variable not used anywhere else.
• If Ci = (a+b+c), that is, it has 3 terms, set Si = Ci.
• If Ci = (a1 + a2 + ... + ak), that is, it has k > 3 terms, replace Ci with Si = (a1 + a2 + b1) . (b1’ + a3 + b2) . (b2’ + a4 + b3) ... (b_{k-3}’ + a_{k-1} + a_k), where b1, b2, ..., b_{k-3} are new variables not used anywhere else.

Reducing SAT to 3SAT
Consider a clause Ci with k > 3 terms (the cases k ≤ 3 are immediate, since the new variables appear in every combination of signs).
• If Ci is satisfied (Ci = 1), let a_r be a literal of Ci that is true.
• If r = 1 or 2, then (a1 + a2 + b1) is satisfied, so set all b_i’s to false to satisfy all other clauses.
• If r = k-1 or k, then (b_{k-3}’ + a_{k-1} + a_k) is satisfied, so set all b_i’s to true to satisfy all other clauses.
• If 2 < r < k-1, then (b_{r-2}’ + a_r + b_{r-1}) is satisfied, so set b1 = b2 = ... = b_{r-2} = true to satisfy the first r-2 clauses and set b_{r-1} = b_r = ... = b_{k-3} = false to satisfy the remaining clauses.
Hence, Si is satisfied.

Reducing SAT to 3SAT
• If Si is satisfied by some assignment, then:
• If none of the new variables b1, ..., b_{k-3} is set to false, then (b_{k-3}’ + a_{k-1} + a_k) can only be satisfied by setting either a_{k-1} = true or a_k = true.
• If b1 is set to false, then (a1 + a2 + b1) can only be satisfied by setting either a1 = true or a2 = true.
• Else let b_r be the first new variable set to false, i.e. b1 = b2 = ... = b_{r-1} = true. Then (b_{r-1}’ + a_{r+1} + b_r) can only be satisfied by setting a_{r+1} = true.
• Thus in all cases, one of the original literals must be set to true. Hence Ci is satisfied.

Reducing SAT to 3SAT
Reduction is polynomial time
• Each clause increases in size by at most a constant factor, and the computations involved are simple substitutions.
Thus 3SAT is NP-Complete.

POLYNOMIAL TIME VARIANTS OF SAT
The following 2 variants of the SAT problem can be solved in polynomial time, and therefore belong to the complexity class P.
• If all clauses contain at most one positive literal, then the Boolean formula is called a Horn formula, and a satisfying truth assignment, if one exists, can be found by a greedy algorithm.
• If the clauses have only 2 literals, then SAT can be solved in linear time by finding the strongly connected components of a particular graph constructed from the instance.
We’ll not do them here. Do them yourself.

The 2SAT Problem
• Instance: A collection C = {c1, …, cm} of clauses on a set U of n Boolean variables such that |ci| = 2 for 1 ≤ i ≤ m.
• Question: Is there a truth assignment for the variables in U that satisfies all clauses in C?

The 2SAT Problem
2SAT can be solved by formulating it as a graph problem:
• Let Ф be an instance of 2SAT. Construct a directed graph G(Ф) such that the vertices of G(Ф) are the variables of Ф and their negations.
• For literals x and y, there is an arc (x, y) in G(Ф) if and only if there is a clause (x’ + y) or (y + x’) in Ф.

The 2SAT Problem
Note: a + b is equivalent to each of a’ ⇒ b and b’ ⇒ a. Thus a 2SAT formula may be viewed as a set of implications.
Accordingly, if we have the formula (a’ + b) (b’ + c) (c’ + d), then we have a string of implications a ⇒ b ⇒ c ⇒ d, which leads to a ⇒ d.
• If for some variable a there is a string of implications a ⇒ ... ⇒ a’ and another string of implications a’ ⇒ ... ⇒ a, then the formula is not satisfiable; otherwise the formula is satisfiable.

The 2SAT Problem
Consider the following formula:
(x1+x2) (x2’+x3) (x1’+x2’) (x3+x4) (x3’+x5) (x4’+x5’) (x3’+x4)
The implication graph is as follows:
[Figure: implication graph on the vertices x1, x1’, x2, x2’, x3, x3’, x4, x4’, x5, x5’; each clause (a + b) contributes the arcs a’ → b and b’ → a.]

The 2SAT Problem
• The 2SAT problem thus reduces to the graph problem of finding strongly connected components (SCC) in the implication graph.
• A 2SAT formula is unsatisfiable if and only if some variable and its complement reside in the same SCC.
• As the SCC problem is known to have a linear-time solution and the implication graph is constructible in linear time, 2SAT can be decided within the same time bound.

Horn Formula
References:
1. S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani, Algorithms, Chapter 5: Greedy Algorithms.
2. http://www.cs.berkeley.edu/~daw/teaching/cs170-s03/Notes/lecture15.pdf (UC Berkeley, CS 170: Efficient Algorithms and Intractable Problems, Lecturer: David Wagner)

In Horn formulas, knowledge about variables is represented by two kinds of clauses:
1. Implications, whose left-hand side is an AND of any number of positive literals and whose right-hand side is a single positive literal. These express statements of the form
x1 ^ x2 ^ … ^ xk implies xk+1
which can alternatively be written as
x1’ v x2’ v … v xk’ v xk+1
• Here, if the conditions on the left hold, then the one on the right must also be true.
• A degenerate type of implication is the singleton "implies x" (an empty left-hand side), meaning simply that x is true.
2. Pure negative clauses, consisting of an OR of any number of negative literals, as in (u’ v v’ v y’).

Horn Formula
• The implications tell us to set some of the variables to true.
• The negative clauses encourage us to make them false.
Strategy for solving a Horn formula:
• Start with all variables set to false.
• Proceed to set some of them to true, one by one, only if an implication would otherwise be violated.
• Once done with this phase, when all implications are satisfied, turn to the negative clauses and make sure they are all satisfied. If any negative clause is violated, the formula is not satisfiable.
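
The greedy strategy described above can be sketched in Python as follows. This is a minimal illustration, not code from the referenced texts: the function name horn_sat and the input encoding (implications as (premises, conclusion) pairs, negative clauses as lists of variable names) are assumptions made for this example, and the loop is the simple repeated-scan version rather than a linear-time implementation.

```python
def horn_sat(variables, implications, negative_clauses):
    """Greedy Horn-formula solver (illustrative sketch).

    implications: list of (premises, conclusion) pairs; premises is a list
        of variable names (empty for the singleton "implies x").
    negative_clauses: list of lists of variable names, each read as
        (x1' v x2' v ... v xk').
    Returns a satisfying assignment as a dict, or None if none exists.
    """
    # Start with all variables set to false.
    assignment = {v: False for v in variables}

    # Set a variable to true only when an implication would otherwise be
    # violated (all premises true, conclusion still false).
    changed = True
    while changed:
        changed = False
        for premises, conclusion in implications:
            if all(assignment[p] for p in premises) and not assignment[conclusion]:
                assignment[conclusion] = True
                changed = True

    # A pure negative clause holds iff at least one of its variables is false.
    for clause in negative_clauses:
        if all(assignment[v] for v in clause):
            return None  # every literal in this clause is false
    return assignment


# Hypothetical example: (=> x), (x => y), (x ^ y => z), (w ^ z => u), (x' v y' v w')
example = horn_sat(
    ["x", "y", "z", "w", "u"],
    [([], "x"), (["x"], "y"), (["x", "y"], "z"), (["w", "z"], "u")],
    [["x", "y", "w"]],
)
print(example)
```

Because a variable is set to true only when some implication forces it, every variable that ends up true must be true in any satisfying assignment; this is why a violated negative clause in the last phase means no satisfying assignment exists.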