Restricted Satisfiability (SAT) Problem
By Mutaz Flmban

Outline
•CSAT & 3SAT
•Normal Forms for Boolean Expressions
•Converting Expressions to CNF
•Theorem 10.12
•NP-Completeness of CSAT
•Theorem 10.13
•NP-Completeness of 3SAT
•Theorem 10.15

CSAT & 3SAT
CSAT: an intermediate step used to show that SAT can be reduced to 3SAT.
3SAT: a problem about satisfiability of Boolean expressions that have a very regular form:
•they are the AND of "clauses";
•each clause is the OR of exactly three variables or negated variables.

OUR GOAL
Reduce satisfiability for any expression to satisfiability for expressions in the normal form for the 3SAT problem.
Convert each expression E in SAT to another expression F in the normal form for 3SAT.
Note: F is satisfiable if and only if E is satisfiable.

Normal Forms for Boolean Expressions
Essential definitions
A literal is either a variable or a negated variable, e.g. x, ¬x.
A clause is the logical OR of one or more literals, e.g. x, x ∨ ¬y ∨ z.
A Boolean expression is said to be in conjunctive normal form, or CNF, if it is the AND of clauses.

Important notation
OR (∨) is treated as a sum, using the + operator.
AND (∧) is treated as a product; we normally use juxtaposition (no operator).
NOT is written ¬.

Example
The expression (x ∨ ¬y) ∧ (¬x ∨ z) will be written as (x + ¬y)(¬x + z).
It is in conjunctive normal form, since it is the AND (product) of the clauses (x + ¬y) and (¬x + z).

Question
Are the expressions (x + y¬z)(x + y + z)(¬y + ¬z) and xyz in CNF?
The first expression is not in CNF. It is the AND of three subexpressions; the last two are clauses, but the first is not: it is the sum of a literal and a product of two literals.
The second expression is in CNF. A clause can have only one literal, so xyz is the product of three clauses: (x), (y), and (z).

k-Conjunctive Normal Form (k-CNF)
•An expression is in k-CNF if it is a product of clauses, each of which is the sum of exactly k distinct literals.
•For example: (x + ¬y)(y + ¬z)(z + ¬x) is in 2-CNF, because each of its clauses has exactly two literals.

CSAT and kSAT
•CSAT (as a language): the set of Boolean expressions in CNF that are satisfiable.
•kSAT (as a language): the set of Boolean expressions in k-CNF that are satisfiable.

Some problems
CSAT is the problem: Given a Boolean expression in CNF, is it satisfiable?
kSAT is the problem: Given a Boolean expression in k-CNF, is it satisfiable?
We shall see that CSAT, 3SAT and kSAT for all k higher than 3 are NP-complete.

Converting Expressions to CNF
We will take a SAT instance E and convert it to a CSAT instance F such that F is satisfiable if and only if E is satisfiable.

Reduction of SAT to CSAT
•The reduction of SAT to CSAT has two parts:
i. Push all ¬'s down the expression tree so that the only negations are of variables; the Boolean expression becomes an AND and OR of literals.
ii. Write the resulting expression in CNF; this step introduces new variables.
•The new expression F will, in general, not be equivalent to the expression E; only satisfiability is preserved.
•If T is a truth assignment that makes E true, then there is an extension of T, say S, that makes F true.
•S is an extension of T if S assigns the same value as T to each variable that T assigns.

i. DeMorgan's Laws
I. ¬(E ∧ F) ⇒ ¬(E) ∨ ¬(F): to push ¬ below ∧, the ∧ is changed to an ∨.
II. ¬(E ∨ F) ⇒ ¬(E) ∧ ¬(F): to push ¬ below ∨, the ∨ is changed to an ∧.
III. ¬(¬E) ⇒ E: the law of double negation.

Example
E = ¬((¬(x + y))(¬x + y))
Expression                      Rule
¬((¬(x + y))(¬x + y))           Start
¬(¬(x + y)) + ¬(¬x + y)         (I)
x + y + ¬(¬x + y)               (III)
x + y + (¬(¬x))¬y               (II)
x + y + x¬y                     (III)

Theorem 10.12
Every Boolean expression E is equivalent to an expression F in which the only negations occur in literals. Moreover, the length of F is linear in the number of symbols of E, and F can be constructed from E in polynomial time.

The PROOF
•The proof is an induction on the number of operators (¬, ∧ and ∨) in E.
•Claim: there is an equivalent expression F with ¬'s only in literals.
•If E has n ≥ 1 operators, then F has no more than 2n - 1 operators.
The PROOF (continued)
•F need not have more than one pair of parentheses per operator.
•The number of variables in an expression cannot exceed the number of operators by more than one.
•Hence the length of F is linearly proportional to the length of E.
•The time it takes to construct F is proportional to its length, and therefore to the length of E.

BASIS
•If E has one operator, it must be of the form:
• ¬x
• x ∧ y
• x ∨ y
•In each case, E is already in the required form, so F = E.
•Note: E and F each have one operator, so the relationship "F has at most twice the number of operators of E, minus 1" holds.

INDUCTION
Suppose the statement is true for all expressions with fewer operators than E.
If the highest operator of E is not ¬, then E must be of the form:
E1 ∧ E2
E1 ∨ E2
In either case the inductive hypothesis applies to E1 and E2.

The inductive hypothesis
There are equivalent expressions F1 and F2 for E1 and E2, respectively, in which all ¬'s occur in literals only.
Then F = F1 ∧ F2 or F = F1 ∨ F2, respectively, is equivalent to E.
Let E1 and E2 have a and b operators, respectively. Then E has a + b + 1 operators.
By the inductive hypothesis, F1 and F2 have at most 2a - 1 and 2b - 1 operators, so F has at most 2a + 2b - 1 operators.
This is no more than 2(a + b + 1) - 1, i.e. twice the number of operators of E, minus 1.

Induction (continued)
Consider the case where E is of the form ¬E1.
There are three cases, depending on what the top operator of E1 is:
E1 = ¬E2
E1 = E2 ∧ E3
E1 = E2 ∨ E3

The Case "E1 = ¬E2"
By the law of double negation, E = ¬(¬E2) is equivalent to E2.
E2 has fewer operators than E.
By the inductive hypothesis, there is an equivalent F for E2 in which the only ¬'s are in literals; F is also equivalent to E.
The number of operators of F is at most twice the number in E2, minus 1, so the bound surely holds for E, which has two more operators than E2.

The Case "E1 = E2 ∧ E3"
By DeMorgan's law (I): E = ¬(E2 ∧ E3) is equivalent to (¬E2) ∨ (¬E3).
Both (¬E2) and (¬E3) have fewer operators than E.
By the inductive hypothesis, they have equivalents F2 and F3 that have ¬'s only in literals.
F = F2 ∨ F3 is equivalent to E.
Let E2 have a operators and E3 have b operators. Then E has a + b + 2 operators.
Since ¬(E2) and ¬(E3) have a + 1 and b + 1 operators, F2 has at most 2(a + 1) - 1 operators and F3 has at most 2(b + 1) - 1 operators.
Counting the top-level operator that joins them, F has at most 2a + 2b + 3 operators.
This number is exactly twice the number of operators of E, minus 1.

The Case "E1 = E2 ∨ E3"
This argument, using the second of DeMorgan's laws, is essentially the same as case (2).

Theorem 10.13
CSAT is NP-complete.
PROOF: We show how to reduce SAT to CSAT in polynomial time:
•Use the method of Theorem 10.12 to convert a SAT instance to an expression E whose ¬'s are only in literals.
•Convert E to a CNF expression F in polynomial time.
•F is satisfiable if and only if E is.

BASIS
If E consists of one or two symbols, then it is a literal.
A literal is a clause, so E is already in CNF.

Induction
Assume that every expression shorter than E can be converted to a product of clauses, and that the conversion takes at most cn² time on an expression of length n.
There are two cases, depending on the top-level operator of E:
E = E1 ∧ E2
E = E1 ∨ E2

Case 1: E = E1 ∧ E2
By the inductive hypothesis:
•There are CNF expressions F1 and F2 derived from E1 and E2, respectively.
•All and only the satisfying assignments for E1 can be extended to a satisfying assignment for F1.
•All and only the satisfying assignments for E2 can be extended to a satisfying assignment for F2.
Let F = F1 ∧ F2. This is a CNF expression, because F1 and F2 are CNF expressions.
We must show that a truth assignment T for E can be extended to a satisfying assignment for F if and only if T satisfies E.

First Part (if)
Suppose T satisfies E:
•Let T1 be T restricted so that it applies only to the variables that appear in E1.
•Let T2 be T restricted so that it applies only to the variables that appear in E2.
By the inductive hypothesis, T1 and T2 can be extended to assignments S1 and S2 that satisfy F1 and F2.
Let S agree with S1 and S2 on each of the variables they define. S1 and S2 both extend T, so they agree on the variables of E, and the variables introduced for F1 and F2 can be assumed disjoint; hence S is well defined.
S is then an extension of T that satisfies F.

Second Part (only if)
Suppose that T has an extension S that satisfies F.
•Let T1 be T restricted to the variables of E1.
•Let T2 be T restricted to the variables of E2.
•Let S1 be S restricted to the variables of F1.
•Let S2 be S restricted to the variables of F2.
Then S1 is an extension of T1, and S2 is an extension of T2.
Because F is the AND of F1 and F2, S1 satisfies F1 and S2 satisfies F2.
By the inductive hypothesis, T1 must satisfy E1 and T2 must satisfy E2.
Therefore T satisfies E.

Case 2: E = E1 ∨ E2
By the inductive hypothesis, there are CNF expressions F1 and F2 with the properties:
•A truth assignment for E1 satisfies E1 if and only if it can be extended to a satisfying assignment for F1.
•A truth assignment for E2 satisfies E2 if and only if it can be extended to a satisfying assignment for F2.
•The variables of F1 and F2 are disjoint, except for those variables that appear in E.
•F1 and F2 are in CNF.
Note: we cannot simply take the OR of F1 and F2 to construct F, because the resulting expression would not be in CNF.

Case 2 (continued)
Suppose F1 = g1 ∧ g2 ∧ … ∧ gp and F2 = h1 ∧ h2 ∧ … ∧ hq, where the g's and h's are clauses.
Introduce a new variable y, and let:
F = (y + g1)(y + g2)…(y + gp)(¬y + h1)(¬y + h2)…(¬y + hq)
We want to prove that a truth assignment T for E satisfies E if and only if T can be extended to a truth assignment S that satisfies F.

Case 2: (If) Part
Assume T satisfies E.
•Let T1 be T restricted to the variables of E1.
•Let T2 be T restricted to the variables of E2.
•Since E = E1 ∨ E2, either T satisfies E1 or T satisfies E2.
•Let us assume T satisfies E1; the other case is symmetric, with S(y) = 1. Then T1, which is T restricted to the variables of E1, can be extended to S1, which satisfies F1.

Case 2: (If) Part (continued)
Construct an extension S of T that satisfies the expression F as follows:
1) For all variables x in F1, S(x) = S1(x).
2) S(y) = 0; this makes all the clauses of F that are derived from F2 true.
3) For all variables x that are in F2 but not in F1, S(x) = T(x) if T assigns a value to x; otherwise S(x) can be either 0 or 1.
S makes all the clauses derived from the g's true because of rule 1.
S makes all the clauses derived from the h's true by rule 2.
Therefore S satisfies F.

Case 2: (Only-If) Part
Suppose that a truth assignment T for E is extended to a truth assignment S for F, and S satisfies F.
•If S(y) = 0, every clause (y + gi) can only be true because gi is true, so S satisfies F1; by the inductive hypothesis, T restricted to the variables of E1 satisfies E1.
•If S(y) = 1, every clause (¬y + hj) can only be true because hj is true, so S satisfies F2; by the inductive hypothesis, T restricted to the variables of E2 satisfies E2.
In either case T satisfies E = E1 ∨ E2, so T satisfies E whenever S satisfies F.

NP-Completeness of 3SAT
The problem 3SAT is: Given a Boolean expression E that is the product of clauses, each of which is the sum of three distinct literals, is E satisfiable?

Theorem 10.15
3SAT is NP-complete.
Proof:
3SAT is in NP, since SAT is in NP.
To prove NP-completeness we shall reduce CSAT to 3SAT.

PROOF
Given a CNF expression E = e1 ∧ … ∧ ek, replace each clause ei as follows to create a new expression F:
•If ei is a single literal (x), introduce new variables u and v. Replace (x) by the four clauses:
(x + u + v)(x + u + ¬v)(x + ¬u + v)(x + ¬u + ¬v)
•If ei is the sum of two literals, (x + y), introduce a new variable z. Replace (x + y) by the product of two clauses:
(x + y + z)(x + y + ¬z)
•If ei is the sum of three literals, it is already in the form required.
•If ei = (x1 + x2 + … + xm) for some m ≥ 4, introduce new variables y1, y2, …, ym-3, then replace ei by the product of clauses:
(x1 + x2 + y1)(x3 + ¬y1 + y2)(x4 + ¬y2 + y3)…(xm-2 + ¬ym-4 + ym-3)(xm-1 + xm + ¬ym-3)

PROOF (continued)
Thus each instance E of CSAT can be reduced to an instance F of 3SAT such that F is satisfiable if and only if E is satisfiable.
The construction time is linear in the length of E.
Since CSAT is NP-complete, 3SAT is NP-complete.

Reference
John E. Hopcroft, Rajeev Motwani and Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation.

Questions
1. How can we reduce SAT to CSAT?
•Push all ¬'s down the expression tree.
•Write the Boolean expression in CNF.
2. Show the first step of converting ¬((¬(x + y))(¬x + y)) to CNF: push the ¬'s down to the literals.
¬((¬(x + y))(¬x + y))
¬(¬(x + y)) + ¬(¬x + y)
x + y + ¬(¬x + y)
x + y + (¬(¬x))¬y
x + y + x¬y
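
Sketch: the CSAT to 3SAT replacement in code
The clause-by-clause replacement in the proof of Theorem 10.15 can be sketched in Python. This is a minimal illustration, not part of the original slides. It assumes a hypothetical encoding in which a literal is a nonzero integer (k for variable k, -k for its negation), a clause is a non-empty list of distinct literals, and new variables are numbered above num_vars. The names to_3cnf and satisfiable are made up for this sketch; satisfiable is a brute-force check meant only for very small examples.

```python
from itertools import product


def to_3cnf(clauses, num_vars):
    """Rewrite a CNF formula so every clause has exactly three literals.

    Assumed encoding (not from the slides): literal k means variable k,
    -k means its negation; variables are numbered 1..num_vars.
    Returns (new_clauses, new_num_vars).
    """
    out = []
    next_var = num_vars

    def fresh():
        # Allocate a variable that does not occur anywhere else.
        nonlocal next_var
        next_var += 1
        return next_var

    for clause in clauses:
        m = len(clause)
        if m == 0:
            raise ValueError("empty clauses are not handled in this sketch")
        if m == 1:
            x = clause[0]
            u, v = fresh(), fresh()
            out += [[x, u, v], [x, u, -v], [x, -u, v], [x, -u, -v]]
        elif m == 2:
            x, y = clause
            z = fresh()
            out += [[x, y, z], [x, y, -z]]
        elif m == 3:
            out.append(list(clause))
        else:
            # Chain of m - 3 new variables y1..y(m-3).
            ys = [fresh() for _ in range(m - 3)]
            out.append([clause[0], clause[1], ys[0]])
            for j in range(m - 4):
                out.append([clause[j + 2], -ys[j], ys[j + 1]])
            out.append([clause[m - 2], clause[m - 1], -ys[-1]])
    return out, next_var


def satisfiable(clauses, num_vars):
    """Brute-force satisfiability test; exponential, for tiny inputs only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in c)
               for c in clauses):
            return True
    return False
```

For example, calling to_3cnf([[1], [-1, 2], [1, 2, 3, 4, 5]], 5) returns a formula whose clauses all have three literals, and satisfiable can be used on both the input and the output to check, on small cases, that satisfiability is preserved. The output grows by a constant factor per literal, which matches the linear construction time claimed in the proof.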