Upper bounds on k-SAT Chris Calabro ccalabro@cs.ucsd.edu 9/23/04 1 What is k-SAT? Variables – x1, x2, x3, … Literals – x1, ¬x1, x2, ¬x2,… Clauses – (x1 or ¬x2 or x3), (x2), () CNFs – (x1 or ¬x2 or x3) and (x2) k-CNFs – (x__________ x3) and (x2) 1 or ¬x2 or___ k 9/23/04 2 Assignment (x1 or ¬x2 or x3) and (x2) 9/23/04 3 Assignment (x1 or ¬x2 or x3) and (x2) x1 0 9/23/04 x2 0 x3 0 4 Assignment (x1 or ¬x2 or x3) and (x2) x1 1 9/23/04 x2 1 x3 1 5 Formal problem Decision version – k-SAT={F ε k-CNF | exists assignment a with F(a)=1} Search version – given F ε k-CNF find an assignment a with F(a)=1 or output “not satisfiable” k-SAT ε NPC 9/23/04 6 Outline Motivation Why care about NPC problems? Why upper bounds? Why SAT? How hard is SAT? Algorithms Design Conclusion 9/23/04 7 NPC Language L ε NPC iff L ε NP forall A ε NP, A ≤p L VERIFY={<M,x> | M is a Turing machine, x ε Σ*, exists w ε Σ|x|, M(x,w) accepts in time |x|} ε NPC Solve L ε NPC efficiently => forall A ε NP, solve A efficiently forall L ε NPC, (L ε P iff P=NP) 9/23/04 8 The big question P = NP? Posed by Kurt Gödel to John von Neumann in 1956 Almost the same question: NP ≤ BPP? 9/23/04 9 Who cares? Complexity theorists Cryptographers Engineers Everybody else! 9/23/04 10 Possible worlds P=NP? Unresolved Theoreticians need to keep saying, “unless P=NP” Could be wasting time on pointless theory Can’t fully trust current crypto systems P=NP Poly hierarchy collapses: for each fixed m, can answer questions of form x1x2 xm ( x1,, xm ) in poly time 9/23/04 11 If SAT ε k DTIME(n ), k large Crypto theory changes No OWFs No trapdoor functions Crypto practice? Could there be a more efficient algorithm out there? PRIMES ε DTIME(n12) [AKS02] PRIMES ε RTIME(n3) [M76,R80] What about BQP? 9/23/04 12 If SAT ε k DTIME(n ), k small No RSA, Diffie-Hellman, DES, AES, SHA1 Symmetric key crypto not necessarily doomed Info theoretically secure key exchange in boundedstorage model [AR99] Quantum key exchange [BB84] 9/23/04 13 Secure symmetric encryption 9/23/04 14 The good stuff Test circuits/programs for correctness Automatically generate proofs of mathematical theorems Build better machines Faster CPU Perfect airplane wing Solve AI problems 9/23/04 15 Black-box mistake-bounded learning Online problem: at time t Input – (x1,y1),…,(xt,yt) where |xi|=n, |yi|=1 coming from circuit C Output – circuit Ct Goal – Ct equivalent to C Can bound # times t with Ct(xt+1)≠yt+1 by O(|C|2 lg|C|) If |C| can be estimated, O(|C| lg|C|) 9/23/04 16 Why upper bounds? Most researchers (61 to 9) think P≠NP [Gas02] Why look for good upper bounds? There aren’t any, right? NP problems could be easy in average case There could be practical algorithms 9/23/04 17 Why SAT? Why not subset sum, vertex cover, hampath, TSP? Formulas look like natural language questions Common problems efficiently reduce to SAT, often preserving structure of solution space Many practical algorithms for SAT 9/23/04 18 An application Circuit equivalence checking Standard techniques – random simulation, BDDs Alternative – reduce to SAT [BCCFZ99] compared performance on real circuits with 1000s of gates BDDs > 1 day 9/23/04 SATO 1 min – 1 h 19 How hard is SAT? Define sk=inf{ε>0 | time O(2εn)} exists random alg A solving k-SAT in σk=inf{ε>0 | exists random alg A solving unique kSAT in time O(2εn)} s∞=limk→∞ sk σ∞=limk→∞ σk Exponential time hypothesis (ETH) – s3>0 forall k≥3 (ETH↔sk>0) 9/23/04 20 Upper bounds on sk k 3 4 ∞ Upper bound .415037 [Sch99] .75 [PPZ97] 1 .411706 [SSW01] .58496 [Sch99] .411635 [HSSW02] .56249 [PPSZ03] .411112 [Rol03] 9/23/04 21 Outline Motivation Why care about NPC problems? Why upper bounds? Why SAT? How hard is SAT? Algorithms Design Conclusion 9/23/04 22 Outline Motivation Algorithms Problem variants Incremental assignment Local search Heuristics Design Conclusion 9/23/04 23 Problem variants General k-SAT – worst case Unique k-SAT – promise problem Random k-SAT YES – exactly 1 solution NO – 0 solutions Choose m random k-clauses with replacement Plant a solution – choose random assignment a, then choose m clauses that agree with a k-SAT from “real world” problem 9/23/04 Bounded model checking Circuit fault analysis Planning 24 Algorithmic paradigms Incremental assignment (DPLL) – assign variables 1 at a time, simplifying formula as you go Local search – choose an initial assignment, perform random walk directed by formula Heuristics – combine features of other algorithms ad hoc 9/23/04 25 Incremental assignment Ordered-DLL(F) do t times for i=1,…,n choose unassigned var x at random if clause {x} ε F, assign x ← 1 else if {¬x} ε F, assign x ← 0 else assign x at random simplify F by removing true clauses, false literals if F=1, return assignment 9/23/04 26 Incremental assignment - analysis Idea – lower bound probability p that each variable gets forced Definition Clause C ε F is critical at solution a iff C has exactly 1 true literal at a Solution a is isolated iff flipping any variable in a yields a nonsolution Claim – a isolated => F has ≥ 1 critical clause for each variable => p ≥ 1/k, conditioned on making random assignments in agreement with a 9/23/04 27 Incremental assignment - analysis Corollary – a isolated => Pr(find a in 1 iteration) ≥ 2-(1-1/k)n Either there is a nearly isolated solution or there are many solutions => running time ≤ poly(n)2(1-1/k)n [PPZ97] Strengthened analysis – poly(n)(2n/s)1-1/k [CIKP03] 9/23/04 28 Resolution Each clause provides info about solution set Why not add preprocessing step that adds clauses implied by formula F? (x1 or α) and (¬x1 or β) => α or β Definition – F is fully resolved iff F is closed under resolution 9/23/04 29 Resolution is complete Let S = solution space of formula F Definition – variable x is determined iff forall a ε S, a(x)=1 or forall a ε S, a(x)=0 Theorem – let F be fully resolved, x be a variable, b ε {0,1}, then 9/23/04 F is satisfiable iff ¬(Ø ε F) x is determined iff F contains a unit clause in x F|x←b is fully resolved 30 Bounded resolution Problems with full resolution Resolvent of 2 k-clauses could have size 2k-2 Exponentially many large clauses Shortest resolution refutation of unsatisfiable formulas may be exponentially large Simple solution – s-bounded resolution 9/23/04 For some constant s, produce resolvent R only if |R| ≤ s 31 Resolve SAT Algorithm – preprocess F by performing s-bounded resolution (s=O(lg n)), then use ordered-DLL Bound improves from poly(n)2(1-1/k)n to 1 (1-μ /(k-1))n poly(n)2 k where j( j ) , k≥5 Same bound for unique k-SAT, k≥3 k j 1 1 k 1 Best k-SAT alg for k≥4. O(2.5625n) for k=4. Best unique k-SAT alg for k≥3. O(2.3863n) for k=3. 9/23/04 32 Local search Local-search(F) do t times choose random assignment a do 3n times if F(a)=1, return a choose an unsatisfied clause C choose var x in C at random flip x in a 9/23/04 33 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents dist(a,z)=i State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 9/23/04 34 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 35 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 36 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 37 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 38 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 39 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 40 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 41 Local search - analysis Let z be a particular solution Can model each iteration as a Markov process States – 0,1,2,… State i represents d(a,z)=I State 0 is absorbing Initial state is binomial (n,1/2), then algorithm performs a random walk 0---1---2---3---4---5---6---7---… 9/23/04 42 Local search - analysis Pr(go left) ≥ 1/k Pr(go right) ≤ 1-1/k Let p = Pr(reach 0 in one iteration) Naïve analysis – p ≥ n 1 1 ( ( 1 )) Pr(reach 0 by successive left moves) ≥ 2 k Better analysis – use reflection principle to show p ≥ poly (n)1 ( 12 (1 k11 )) n Running time – poly (n)( 2(1 k1 )) n [Sch99] 9/23/04 43 Heuristics Complete – gives correct answer with probability 1 (satz, SATO, zChaff) Incomplete – sometimes fails to find witness Probabilistically Approximately Complete (PAC) – succeed with probability 1 without restarts if given enough time (UnitWalk) Essentially incomplete – succeed with probability <1 unless restarted (GSAT, GWSAT, HSAT, HWSAT, WalkSAT) 9/23/04 44 UnitWalk choose initial assignment a do t times use ordered-DLL (with a as randomness!) to find next assignment 9/23/04 45 Variants UnitWalk+WalkSAT – alternate steps of UnitWalk and WalkSAT! UnitWalk+incBinSAT+WalkSAT – add preprocessing step of 2-bounded resolution Common benchmarks 9/23/04 Random formulas, circuit problems, bounded-model checking, planning problems Variables and clauses – 100s to 10,000s Running times – seconds to minutes 46 Outline Motivation Algorithms Problem variants Incremental assignment Local search Heuristics Design Conclusion 9/23/04 47 Outline Motivation Algorithms Design Sparsification Isolation Conclusion 9/23/04 48 Sparsification Given k-CNF F, ε > 0, we can generate in time poly(n)2εn a disjunction G of 2εn k-CNFs Gi, each of size O(n) [IPZ98] Each Gi can be produced in time poly(n) Can be used as a preprocessing step when Algorithm is exponential time anyway Linear sized formulas are needed 9/23/04 49 Sparsification example Theorem – σ3 = 0 => s∞ = 0 [CIKP03] Proof shows a reduction from general k-SAT to unique 3-SAT To do this requires clause width reduction Standard algorithm produces formula with O(n+m) variables (m = #clauses in input formula) Sparsify first! Then O(n+m)=O(n) 9/23/04 50 Isolation Reducing general k-SAT to unique k-SAT [VV86] Intersect solution space with O(n) linear equations in GF(2) Reduces the number of solutions to 1 with inverse polynomial success probability Need to square number of variables to represent such equations 9/23/04 51 Isolation Solution use sparse linear equations (εn variables each) to constrain solutions to a small Hamming ball (radius εn) Use random restriction – set O(εn) variables at random to select just 1 solution from ball The above isolates with subexponential success probability Theorem 9/23/04 sk ≤ σk+o(1) s∞ = σ∞ 52 Outline Motivation Algorithms Design Sparsification Isolation Conclusion 9/23/04 53 Outline Motivation Algorithms Design Conclusion Fate of k-SAT Open problems 9/23/04 54 Fate of k-SAT Last few improvements to best 3-SAT algorithms have improved constant in exponent by only a puny amount – .4150, .4117, .4116, .4111 – are these techniques running out of steam? Paradigm shifts yield greater improvements – .667, .415 Conjecture – ETH false (s3 = 0). 9/23/04 55 Open problems sk = σk? ETH => sk increases infinitely often. Does sk strictly increase? Local search variants usually involve clever initial assignments. Can we also find better search directions? Can resolve SAT analysis be tightened? Other algorithmic paradigms? 9/23/04 56 Thanks M. Paturi, R. Impagliazzo Theory lab denizens 9/23/04 57