Propositional Satisfiability
(SAT)
Toby Walsh
Cork Constraint Computation Centre
University College Cork
Ireland
4c.ucc.ie/~tw/sat/
Outline

What is SAT?

How do we solve
SAT?

Why is SAT
important?
Propositional satisfiability
(SAT)

Given a propositional
formula, does it have
a “model” (satisfying
assignment)?


1st decision problem
shown to be NP-complete
Usually focus on
formulae in clausal
normal form (CNF)
Clausal Normal Form

Formula is a conjunction of clauses
  C1 & C2 & …
Each clause is a disjunction of literals
  L1 v L2 v L3, …
  Empty clause contains no literals (=False)
  Unit clause contains a single literal
Each literal is a variable or its negation
  P, -P, Q, -Q, …
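A minimal sketch of how a CNF formula might be held in code. The integer encoding (variable i is written i, its negation -i, as in the DIMACS convention) and the function names are assumptions made for illustration, not something the slides prescribe.

```python
# Assumed representation (DIMACS-style, not from the slides):
# a literal is a non-zero int, +i for variable i and -i for its negation;
# a clause is a list of literals; a formula is a list of clauses.

def literal_true(lit, assignment):
    """assignment maps variable number -> bool."""
    value = assignment[abs(lit)]
    return value if lit > 0 else not value

def clause_satisfied(clause, assignment):
    # A disjunction: true if any literal is true, so the empty clause is False.
    return any(literal_true(lit, assignment) for lit in clause)

def formula_satisfied(formula, assignment):
    # A conjunction: true only if every clause is true.
    return all(clause_satisfied(c, assignment) for c in formula)

# (P v -Q) & (Q v R) with P=1, Q=2, R=3
formula = [[1, -2], [2, 3]]
model = {1: True, 2: True, 3: False}
print(formula_satisfied(formula, model))   # True
print(clause_satisfied([], model))         # False (empty clause)
```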
Clausal Normal Form

k-CNF
  Each clause has k literals
3-CNF
  NP-complete
  Best current complete methods are exponential
2-CNF
  Polynomial (indeed, linear time)
How do we solve SAT?

Systematic methods
  Truth tables
  Davis Putnam procedure
Local search methods
  GSAT
  WalkSAT
  Tabu search, SA, …
Exotic methods
  DNA, quantum computing, …
Procedure DPLL(C)
(SAT)   if C={} then SAT
(Empty) if empty clause in C then UNSAT
(Unit)  if C contains a unit clause {l} then DPLL(C[l/True])
(Split) choose a literal l; if DPLL(C[l/True]) then SAT else DPLL(C[l/False])
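The four rules of the DPLL procedure can be sketched in Python roughly as follows. This is an illustrative sketch, not a production solver: literals are assumed to be non-zero integers (i for a variable, -i for its negation), the helper name simplify is hypothetical, and the split rule naively picks the first literal of the first clause, where real solvers use branching heuristics.

```python
def simplify(clauses, lit):
    """Return C[lit/True]: drop clauses containing lit, delete -lit elsewhere."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                           # clause is already satisfied
        result.append([x for x in clause if x != -lit])
    return result

def dpll(clauses):
    """Return True if the list of clauses is satisfiable, else False."""
    if not clauses:                            # (SAT) no clauses left
        return True
    if any(len(c) == 0 for c in clauses):      # (Empty) empty clause present
        return False
    for c in clauses:                          # (Unit) propagate a unit clause
        if len(c) == 1:
            return dpll(simplify(clauses, c[0]))
    lit = clauses[0][0]                        # (Split) naive choice (assumption)
    return dpll(simplify(clauses, lit)) or dpll(simplify(clauses, -lit))

print(dpll([[1, -2], [2, 3], [-1, -3]]))       # True
print(dpll([[1], [-1]]))                       # False
```

The sketch only reports SAT or UNSAT; returning the model would need the chosen literals to be threaded through the recursion.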
GSAT

[Selman, Levesque, Mitchell AAAI 92]
Repeat MAX-TRIES times or until clauses
satisfied


T:= random truth assignment
Repeat MAX-FLIPS times or until clauses satisfied


v := variable whose flipping maximizes number of SAT
clauses
T := T with v’s value flipped
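A small Python sketch of the GSAT loop above. The integer literal encoding, the parameter defaults and the random tie-breaking between equally good variables are assumptions for illustration; the scoring is recomputed from scratch on every flip, which is simple but far slower than the incremental bookkeeping a real implementation would use.

```python
import random

def num_satisfied(clauses, T):
    # Literal l is true under T when the sign of l matches T's value for its variable.
    return sum(any(T[abs(l)] == (l > 0) for l in c) for c in clauses)

def gsat(clauses, n, max_tries=10, max_flips=1000, rng=random):
    """Return a satisfying assignment (dict var -> bool) or None if none was found."""
    for _ in range(max_tries):
        T = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(max_flips):
            if num_satisfied(clauses, T) == len(clauses):
                return T
            # Score every variable by the number of satisfied clauses after flipping it.
            best_score, best_vars = -1, []
            for v in range(1, n + 1):
                T[v] = not T[v]
                score = num_satisfied(clauses, T)
                T[v] = not T[v]
                if score > best_score:
                    best_score, best_vars = score, [v]
                elif score == best_score:
                    best_vars.append(v)
            v = rng.choice(best_vars)          # ties broken at random (assumption)
            T[v] = not T[v]
        if num_satisfied(clauses, T) == len(clauses):
            return T                           # the last flip may have succeeded
    return None                                # incomplete: None does not prove UNSAT

clauses = [[1, -2], [2, 3], [-1, -3]]
print(gsat(clauses, 3))
```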
WalkSAT

[Selman, Kautz, Cohen AAAI 94]
Repeat MAX-TRIES times or until clauses
satisfied


T:= random truth assignment
Repeat MAX-FLIPS times or until clauses satisfied



c := unsat clause chosen at random
v:= var in c chosen either greedily or at random
T := T with v’s value flipped
Focuses on UNSAT clauses
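The WalkSAT loop can be sketched in Python along these lines. The noise parameter (the probability of taking the random rather than the greedy step), the integer literal encoding and the greedy criterion used here (fewest unsatisfied clauses after the flip) are simplifying assumptions; the sketch also assumes the formula contains no empty clause.

```python
import random

def is_sat(clause, T):
    return any(T[abs(l)] == (l > 0) for l in clause)

def broken_after_flip(clauses, T, var):
    """Number of unsatisfied clauses if var were flipped (T is left unchanged)."""
    T[var] = not T[var]
    broken = sum(not is_sat(d, T) for d in clauses)
    T[var] = not T[var]
    return broken

def walksat(clauses, n, max_tries=10, max_flips=1000, noise=0.5, rng=random):
    """Return a satisfying assignment (dict var -> bool) or None if none was found."""
    for _ in range(max_tries):
        T = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(max_flips):
            unsat = [c for c in clauses if not is_sat(c, T)]
            if not unsat:
                return T
            c = rng.choice(unsat)              # focus on an unsatisfied clause
            if rng.random() < noise:
                v = abs(rng.choice(c))         # random walk step
            else:                              # greedy step within the clause
                v = min((abs(l) for l in c),
                        key=lambda var: broken_after_flip(clauses, T, var))
            T[v] = not T[v]
        if all(is_sat(c, T) for c in clauses):
            return T                           # the last flip may have succeeded
    return None

print(walksat([[1, -2], [2, 3], [-1, -3]], 3))
```

Unlike GSAT, only the variables of one unsatisfied clause are candidates for flipping, so each step is cheap and always repairs at least the chosen clause.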
Why is SAT important?

Computational complexity



1st problem shown NP-complete
Can therefore be used in theory to solve any
NP-complete problem
Many direct applications
Some applications of SAT


Hardware design
Signals



Hi = True
Lo = False
Gates



AND gate = and
connective
INVERTER gate = not
connective
..
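To make the gate-to-connective mapping concrete, here is a sketch that writes each gate as a few clauses over its input and output signals and checks them against the gate's truth table. The clause sets follow the standard Tseitin-style encoding, which the slides do not name; the variable numbering and the helper name holds are assumptions.

```python
from itertools import product

# Hypothetical numbering: 1 = x, 2 = y (inputs), 3 = z (gate output).
AND_GATE = [[-3, 1], [-3, 2], [3, -1, -2]]     # z <-> (x AND y)
INVERTER = [[3, 1], [-3, -1]]                  # z <-> (NOT x)

def holds(clauses, values):
    """values[i-1] is the signal on variable i (Hi = True, Lo = False)."""
    return all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in clauses)

# The clauses should accept exactly the rows of each gate's truth table.
for x, y, z in product([False, True], repeat=3):
    assert holds(AND_GATE, (x, y, z)) == (z == (x and y))
    assert holds(INVERTER, (x, y, z)) == (z == (not x))
print("encodings match the truth tables")
```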
Some applications of SAT


Hardware design
State of the art




HP verified 1/7th of the DEC Alpha chip using a DP
solver
100,000s of variables
1,000,000s of clauses
Modelling environment is one of the biggest problems
Some applications of SAT



Planning
But planning is
undecidable in general
Even propositional
STRIPS planning is
PSPACE complete!

How can a SAT solver,
which only solves NP-hard problems, be used
then?
Some applications of SAT




Planning as SAT
Put bound on plan length
If bound too small, UNSAT
Introduce new propositional variables for each
time step
Some applications of SAT



Diagnosis as SAT
Otherwise known as “SAT
in space”
Deep Space One
spacecraft


Propositional theory to
monitor, diagnose and
repair faults
Runs in LISP!
Computational complexity

Study of “problem hardness”


Typically worst case
Big O analysis


Sorting is easy, O(n log n)
Chess and Go are hard, EXP-time


“Can I be sure to win?”
Need to generalize problem to n by n board
Where do things start getting hard?
Computational complexity

Hierarchy of complexity classes


Polynomial (P), NP, PSpace, ….
NP-complete problems mark boundary of
tractability


No known polynomial time algorithm
Though it is still open whether P=/=NP
NP-complete problems

Non-deterministic Polynomial time



If I guess a solution, I can check it in polynomial time
But no known easy way to guess solution correctly!
Complete



Representative of all problems in this class
If this problem can be solved in polynomial time, all
problems in the class can be solved
Any NP-complete problem can be mapped into any
other
NP-complete problems

Many examples





Propositional satisfiability (SAT)
Graph colouring
Travelling salesperson problem
Exam timetabling
…
SAT is NP-complete

Cook (1971) showed
that any polynomial-time
non-deterministic Turing machine
can be reduced to SAT
=> There is a polynomial
reduction of any problem
in NP to SAT
But not all SAT problems
are equally hard!
SAT phase transition

Random k-SAT [Mitchell, Selman, Levesque AAAI-92]

sample uniformly from space of all possible k-clauses
n variables, l clauses

Rapid transition in satisfiability

2-SAT transition occurs at l/n=1 [Chvátal & Reed 92, Goerdt 92]
3-SAT transition occurs at 3.26 < l/n < 4.598
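A sketch of a random k-SAT generator in the spirit of the model above. Two details are assumptions, since the slides do not spell them out: each clause uses k distinct variables, and clauses are drawn independently, so repeated clauses are possible.

```python
import random

def random_ksat(n, l, k=3, rng=random):
    """l clauses over n variables; each clause has k distinct variables, random signs."""
    clauses = []
    for _ in range(l):
        variables = rng.sample(range(1, n + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# An instance at clause-to-variable ratio l/n = 4.3
instance = random_ksat(n=50, l=215)
print(len(instance), instance[:3])
```

Sweeping l/n with n fixed and recording the fraction of satisfiable instances is the usual way to observe the transition empirically.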
Random 3-SAT

Which are the hard
instances?

around l/n = 4.3
What happens with
larger problems?
Why are some dots red
and others blue?
Random 3-SAT
Complexity peak
coincides with solubility
transition



l/n < 4.3 problems
under-constrained and
SAT
l/n > 4.3 problems over-constrained and UNSAT
l/n=4.3, problems on
“knife-edge” between
SAT and UNSAT
Random 3-SAT

Varying problem size, n

Complexity peak appears
to be largely invariant of
algorithm


backtracking algorithms
like Davis-Putnam
local search procedures
like GSAT
3SAT phase transition

Lower bounds (hard)


Analyse algorithm that almost always solves
problem
Backtracking hard to reason about so
typically without backtracking


Complex branching heuristics needed to ensure
success
But these are complex to reason about
3SAT phase transition

Upper bounds (easier)

Typically by estimating count of solutions
E.g. Markov (or 1st moment) method
For any statistic X
prob(X>=1) <= E[X]
No assumptions about the distribution of X except
non-negative!
Let X be the number of satisfying assignments for a
3SAT problem
The expected value of X can be easily calculated
E[X] = 2^n * (7/8)^l
If E[X] < 1, then prob(X>=1) = prob(SAT) < 1
If E[X] < 1, then 2^n * (7/8)^l < 1
n + l log2(7/8) < 0
l/n > 1/log2(8/7) = 5.19…
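The arithmetic behind the first moment bound is easy to check numerically. In the sketch below the function name expected_models is hypothetical; the formula is the one on the slide, where each of the 2^n assignments satisfies a random 3-clause with probability 7/8.

```python
import math

def expected_models(n, l):
    """E[X] = 2^n * (7/8)^l for random 3-SAT with n variables and l clauses."""
    return 2.0 ** n * (7.0 / 8.0) ** l

# Ratio l/n above which E[X] < 1, and hence prob(SAT) < 1
threshold = 1.0 / math.log2(8.0 / 7.0)
print(round(threshold, 2))                 # 5.19

n = 100
print(expected_models(n, 500) > 1)         # True: l/n = 5.0 is below the bound
print(expected_models(n, 530) < 1)         # True: l/n = 5.3 is above the bound
```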
3SAT phase transition

Upper bounds (easier)


Typically by estimating count of solutions
To get tighter bounds than 5.19, can refine
the counting argument

E.g. not count all solutions but just those maximal
under some ordering
SAT phase transition

Shape of transition


“sharp” both for 2-SAT and 3-SAT [Friedgut 99]
Backbone (dis)continuity [Monasson et al. 1998],…
  2-SAT transition is "2nd order", continuous
  3-SAT transition is "1st order", discontinuous
  backbone = truth assignments that are fixed when we
  satisfy as many clauses as possible

2+p-SAT
Morph between 2-SAT
and 3-SAT

fraction p
of 3-clauses

fraction (1-p) of 2-clauses
[Monasson et al 1999]
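A sketch of a 2+p-SAT instance generator matching the description above. It assumes the fraction p is applied exactly (rounded to a whole number of clauses) rather than per clause at random, and that each clause uses distinct variables; the function name is hypothetical.

```python
import random

def random_2p_sat(n, l, p, rng=random):
    """l clauses over n variables: a fraction p are 3-clauses, the rest 2-clauses."""
    num3 = round(p * l)                        # fraction p of 3-clauses
    sizes = [3] * num3 + [2] * (l - num3)      # fraction (1-p) of 2-clauses
    rng.shuffle(sizes)
    clauses = []
    for k in sizes:
        variables = rng.sample(range(1, n + 1), k)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

instance = random_2p_sat(n=30, l=100, p=0.4)
print(sum(len(c) == 3 for c in instance), sum(len(c) == 2 for c in instance))
```

Setting p=0 gives random 2-SAT and p=1 gives random 3-SAT, so the same generator covers both ends of the morph.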
2+p-SAT

Maps from P to NP

NP-complete for any
p>0

Insight into change from P to
NP, continuous to
discontinuous, …?
[Monasson et al 1999]
2+p-SAT


Observed search cost

linear for p<0.4

exponential for p>0.4
But NP-hard for all
p>0!
2+p-SAT
Continuous (2-SAT like)
Discontinuous (3-SAT like)
Simple bound

Are the 2-clauses
UNSAT?


2-clauses are more
constraining than 3-clauses
For p<0.4,
transition occurs
at lower bound!

3-clauses are not
contributing
2+p-SAT trajectories
The real world isn’t random?

Very true!
Can we identify structural
features common in real
world problems?

Consider graphs met in
real world situations




social networks
electricity grids
neural networks
...
Real versus Random

Real graphs tend to be sparse
  dense random graphs contain lots of (rare?) structure
Real graphs tend to have short path lengths
  as do random graphs
Real graphs tend to be clustered
  unlike sparse random graphs
L, average path length
C, clustering coefficient
(fraction of neighbours connected to
each other, cliqueness measure)
mu, proximity ratio is C/L normalized
by that of random graph of same
size and density
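The two measures L and C can be computed directly from an adjacency structure. The sketch below assumes an undirected graph stored as a dict mapping each node to the set of its neighbours, averages L over connected pairs only, and treats nodes of degree below 2 as having clustering 0; these conventions are assumptions, and the proximity ratio mu would additionally need the same two numbers for a random graph of equal size and density.

```python
from collections import deque

def average_path_length(adj):
    """L: mean shortest-path length over pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}                     # breadth-first search from source
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(adj):
    """C: average fraction of a node's neighbour pairs that are themselves linked."""
    fractions = []
    for u, nbrs in adj.items():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            fractions.append(0.0)              # convention assumed for degree < 2
            continue
        links = sum(1 for i in range(k) for j in range(i + 1, k)
                    if nbrs[j] in adj[nbrs[i]])
        fractions.append(links / (k * (k - 1) / 2))
    return sum(fractions) / len(fractions)

# Triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(average_path_length(adj), clustering_coefficient(adj))
```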
Small world graphs

Sparse, clustered, short
path lengths

Six degrees of separation



Stanley Milgram’s
famous 1967 postal
experiment
recently revived by Watts
& Strogatz
shown to apply to:




actors database
US electricity grid
neural net of a worm
...
An example

1994 exam timetable at
Edinburgh University
  59 nodes, 594 edges so
  relatively sparse
  but contains 10-clique
less than 10^-10 chance
in a random graph
  assuming same size and
  density
clique totally dominated
cost to solve problem
Small world graphs

To construct an ensemble of small world graphs


morph between regular graph (like ring lattice) and
random graph
with prob p include edge from ring lattice, with prob 1-p from random
graph
real problems often contain similar structure and
stochastic components?
Small world graphs


ring lattice is clustered but has long paths
random edges provide shortcuts without
destroying clustering
Small world graphs

Other bad news


disease spreads more
rapidly in a small
world
Good news

cooperation breaks
out quicker in iterated
Prisoner’s dilemma
Other structural features
It’s not just small world graphs that have been studied

Large degree graphs
  Barabási et al’s power-law model
Ultrametric graphs
  Hogg’s tree based model
Numbers following Benford’s Law
  1 is much more common than 9 as a leading digit!
  prob(leading digit=i) = log10(1+1/i)
  such clustering makes number partitioning much
  easier
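A quick numerical check of the Benford's Law formula, taking the logarithm to base 10 (the base that makes the nine probabilities sum to 1). The function name benford is hypothetical.

```python
import math

def benford(d):
    """Probability that the leading decimal digit is d, for d in 1..9."""
    return math.log10(1 + 1 / d)

probs = {d: benford(d) for d in range(1, 10)}
print(round(probs[1], 3), round(probs[9], 3))   # 0.301 0.046
print(round(sum(probs.values()), 6))            # 1.0
```

So a leading 1 is about six and a half times as likely as a leading 9.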
Open questions

Prove random 3-SAT transition occurs at l/n = 4.3




random 2-SAT proved to be at l/n = 1
random 3-SAT transition proved to be in range
3.26 < l/n < 4.598
random 3-SAT phase transition proved to be “sharp”
2+p-SAT


heuristic argument based on replica symmetry
predicts discontinuity at p=0.4
prove it exactly!
Open questions

Impact of structure on phase transition
behaviour



some initial work on quasigroups (alias Latin
squares/sports tournaments)
morphing useful tool (e.g. small worlds, 2-d to
3-d TSP, …)
Optimization v decision

some initial work by Slaney & Thiebaux
Open questions

Does phase transition behaviour give insights to
help answer P=NP?



it certainly identifies hard problems!
problems like 2+p-SAT and ideas like backbone also
show promise
But problems away from phase boundary can
be hard to solve


over-constrained 3-SAT region has exponential
resolution proofs
under-constrained 3-SAT region can throw up occasional
hard problems (early mistakes?)
Conclusions



SAT is fundamental
problem in logic, AI, CS,
…
There exist both complete
and incomplete methods
for solving SAT
We can often solve larger
problems than (worst-case) complexity would
suggest is possible!