M. Krivelevich, D. Vilenchik SODA 2006

Solving Random Satisfiable 3CNF
Formulas in Expected Polynomial Time
M. Krivelevich, D. Vilenchik
SODA 2006
Lecture Outline

What is expected polynomial time and some motivation

The planted SAT distribution and related work

Description of our algorithm

Outline of the analysis

Open problems
Why Consider Prob. Models ?

Many interesting problems are known to be NP-hard

Hardness results only show that there exist hard instances

Should not discourage us from trying to design heuristics
that work well for “almost all” instances

For rigorous analysis - define “almost all” in a meaningful way

One possibility - use probabilistic models such as G_{n,p}
Expected Polynomial Time


D - a distribution on the inputs
An algorithm works whp over D if it succeeds whp when the instance is
sampled according to D

Such an algorithm may fail completely on some instances

E.g. Greedy Coloring Algorithm:

Fix the vertices in some arbitrary order

For every vertex, assign minimal possible color
Expected Polynomial Time

Greedy uses whp at most n/log n colors for G_{n,1/2} [GM75]

χ(G_{n,1/2}) ~ n/(2 log n) whp
Therefore,

Greedy yields whp a 2-approximation of χ(G) for G ∈ G_{n,1/2}
However,

Let G = K_{n/2,n/2} minus some perfect matching

Greedy uses n/2 colors - order vertices according to matching

χ(G) = 2
Greedy fails completely
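
The bad case above can be checked directly. The following is a minimal sketch, assuming vertices 0..half-1 on one side, half..2*half-1 on the other, and the removed matching {(i, half+i)}; the function names are illustrative, not from the talk.

```python
def greedy_coloring(adj, order):
    """Give each vertex, in the given order, the smallest color unused by its neighbors."""
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def bipartite_minus_matching(half):
    """K_{half,half} minus the perfect matching {(i, half + i)} (assumed labeling)."""
    adj = {v: set() for v in range(2 * half)}
    for i in range(half):
        for j in range(half):
            if i != j:
                adj[i].add(half + j)
                adj[half + j].add(i)
    return adj


half = 5
adj = bipartite_minus_matching(half)

# Order "according to the matching": a_0, b_0, a_1, b_1, ...
bad_order = [v for i in range(half) for v in (i, half + i)]
# One side first, then the other.
good_order = list(range(2 * half))

bad_colors = len(set(greedy_coloring(adj, bad_order).values()))
good_colors = len(set(greedy_coloring(adj, good_order).values()))
print(bad_colors, good_colors)
```

With the matching order each matched pair shares a fresh color, so Greedy uses n/2 colors, while the side-by-side order finds the optimal 2.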
Expected Polynomial Time Cont.
Alternatively, demand success for all instances while keeping
an overall average polynomial time
Formally …
Def. Algorithm A with running time t_A(I) on instance I runs in
expected polynomial time over distribution D if
Σ_I Pr_D[I]·t_A(I) is polynomial in n
Expected Polynomial Time Cont.

To achieve this – separate “easy” instances (can be handled
in polynomial time) from “hard” ones (rare, but may require
super-polynomial time)

Requires a better understanding of the probability space

Encourages efficient, natural and more robust algorithms
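
To make the definition concrete, here is a small sketch on a toy distribution. The numbers are hypothetical and not from the talk: an instance of size n is assumed "hard" with probability 2^-n and then costs 2^n steps, and otherwise costs n^2 steps.

```python
from fractions import Fraction


def expected_time(n):
    """Expected running time under the toy model (all parameters are assumptions)."""
    p_hard = Fraction(1, 2**n)   # probability of a hard instance
    t_easy = n**2                # polynomial time on easy instances
    t_hard = 2**n                # exponential time on hard instances
    return (1 - p_hard) * t_easy + p_hard * t_hard


for n in (5, 10, 20):
    print(n, float(expected_time(n)))
```

The rare exponential runs contribute exactly 1 to the expectation, so the average stays below n^2 + 1 even though the worst case is exponential.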
What’s Next ?

What is expected polynomial time and some motivation

The planted SAT distribution and related work.

Description of our algorithm.

Outline of the analysis.

Open problems.
3SAT - Definition
3CNF form (a conjunction of clauses, each a disjunction of three literals):
(x1 ∨ x2 ∨ ¬x5) ∧ (x3 ∨ ¬x4 ∨ ¬x1) ∧ (x1 ∨ x2 ∨ x6) ∧ …
Partial truth assignment (* = unassigned):
x1  x2  x3  x4  x5  x6
T   F   T   F   T   *

3SAT = {all satisfiable 3CNF formulas}.

3SAT is NP-complete [Cook71].
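
As an illustration of a partial truth assignment, the sketch below evaluates the three clauses shown on this slide. The encoding is an assumption of the sketch (signed integers, +i for x_i and -i for ¬x_i), as is the helper name `clause_status`.

```python
def clause_status(clause, assignment):
    """Return 'sat', 'unsat', or 'open' for a clause under a partial assignment.

    Variables missing from `assignment` are unassigned (the * on the slide).
    """
    undecided = False
    for lit in clause:
        var = abs(lit)
        if var not in assignment:
            undecided = True
        elif (lit > 0) == assignment[var]:
            return "sat"          # one true literal satisfies the clause
    return "open" if undecided else "unsat"


formula = [(1, 2, -5), (3, -4, -1), (1, 2, 6)]
partial = {1: True, 2: False, 3: True, 4: False, 5: True}  # x6 unassigned
statuses = [clause_status(c, partial) for c in formula]
print(statuses)
```

All three clauses are already satisfied by the assigned variables, so the value of x6 does not matter for them.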
Different SAT Distributions

(Arguably) most natural distribution - P_{n,p}, the analog of G_{n,p}

Include every possible clause w.p. p = p(n)

Let ρ = expected number of clauses / n, i.e. ρ = p·8·C(n,3)/n

Satisfiability shows sharp threshold behavior [Fri99]

ρ < 3.42, almost all instances are satisfiable [KKL02]

ρ > 4.5, almost all are unsatisfiable [KKS+01]

Our focus is ρ = d, d a sufficiently large constant
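
The relation between the clause probability p and the clauses/variables ratio can be computed directly. This sketch assumes the counting on the slide: 8·C(n,3) possible clauses over n variables, each included independently with probability p. The function names are illustrative.

```python
from math import comb


def density(n, p):
    """Expected number of clauses divided by n in P_{n,p}."""
    return p * 8 * comb(n, 3) / n


def prob_for_density(n, rho):
    """Clause probability p that gives expected density rho."""
    return rho * n / (8 * comb(n, 3))


n = 1000
p = prob_for_density(n, 4.5)
print(p, density(n, p))
```

For large n this gives p close to 0.75·ρ/n^2, so a constant ratio corresponds to p of order 1/n^2.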
Different SAT Distributions

P_{n,p} not interesting at such ratios (for satisfiability algorithms)
Alternatively …

Consider distributions over satisfiable instances

One possibility, P^{SAT}_{n,p}, where P^{SAT}_{n,p}(I) = P_{n,p}(I | I is sat.)

P^{SAT}_{n,p} is hard to sample (experimentally)

P^{SAT}_{n,p} seems hard to tackle rigorously (no efficient algorithm
known for ρ = o(log n))
Different SAT Distributions

Planted SAT can serve as an intermediate step towards P^{SAT}_{n,p}

It is interesting and well studied in its own right



It is the analog of Planted k-Coloring [BS95], [AK97],
Planted Clique [AKS98], [FK00]
It is a random distribution over satisfiable 3CNF formulas
with arbitrarily large clauses/variables ratio
Can be efficiently sampled
The Planted 3SAT Distribution

Generating an instance:

Randomly pick a truth assignment φ

Include every clause satisfied by φ w.p. p = d/n^2
E.g.
x1  x2  x3  x4  x5  x6
T   F   T   F   T   F
(x1 ∨ x2 ∨ ¬x5) ∧ (x3 ∨ ¬x4 ∨ x1) ∧ (¬x1 ∨ x2 ∨ x6) ∧ …
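
The two generation steps can be sketched as a naive sampler. It enumerates all 8·C(n,3) candidate clauses, so it is only meant for small n. The signed-integer clause encoding and the function name are assumptions of this sketch.

```python
import itertools
import random


def sample_planted_3sat(n, d, rng):
    """Sample (phi, clauses) from planted 3SAT with p = d / n^2 (naive O(n^3) sketch)."""
    # Step 1: pick the planted assignment phi uniformly at random.
    phi = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    p = d / n**2
    clauses = []
    # Step 2: include every clause satisfied by phi independently w.p. p.
    for trio in itertools.combinations(range(1, n + 1), 3):
        for signs in itertools.product((1, -1), repeat=3):
            clause = tuple(s * v for s, v in zip(signs, trio))
            satisfied = any((lit > 0) == phi[abs(lit)] for lit in clause)
            if satisfied and rng.random() < p:
                clauses.append(clause)
    return phi, clauses


phi, clauses = sample_planted_3sat(n=12, d=30, rng=random.Random(0))
print(len(clauses), "clauses, all satisfied by the planted assignment")
```

Exactly 7 of the 8 sign patterns on each variable triple are satisfied by φ, so the expected number of clauses is 7·C(n,3)·d/n^2.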
Planted Distributions: Related Work

[KP92] - greedy variable assignment, p ≥ d/n
(Implicitly) works in expected polynomial time

[AK97] – spectral technique for coloring sparse planted
3-colorable graphs (np = d)

[BSBG02] – majority vote suffices for p ≥ d·log n/n^2

[Fla03] – techniques similar to [AK97], solves whp planted
3SAT, p ≥ d/n^2
Related Work Cont.

[CO04] – SDP based expected polynomial time algorithm for
(semi-random) planted k-colorable graphs, np ≥ d·k·log n

[Böt05] – SDP based expected polynomial time algorithm for
planted k-colorable graphs, np ≥ d·k^2
What’s Next ?

What is expected polynomial time and some motivation

The planted SAT distribution and related work

Description of our algorithm

Outline of the analysis

Open problems
Our Results

An algorithm that decides 3SAT

Expected polynomial running time over planted 3SAT, p = d/n^2

Result extends to any constant k (in which case d=d0k)

First work to address the issue of expected poly. time
algorithms for satisfiable SAT distributions.
Algorithm: General Outline
The algorithm proceeds in 2 steps:
1. Find a partial correct solution containing a large fraction
   of variables (always poly time)
   (correct means coincides with the planted solution; typically, all but a
   small constant, e^{-Θ(d)}, fraction)
2. a. Try to complete the partial solution to a satisfying
      assignment
   b. If not possible, gradually fix the partial solution until
      step 2.a ends up successfully
   (steps a+b run in expected poly. time)
Note: most expected poly. time heuristics discard the solution and
exhaustively search for a correct one
Algorithm: Basic Ingredients
The Majority Vote:
(x1 ∨ x2 ∨ ¬x3) ∧ (x4 ∨ x2 ∨ ¬x1) ∧ (¬x1 ∨ x2 ∨ x4) ∧ (x3 ∨ ¬x2 ∨ x4)
x1  x2  x3  x4
F   T   T   T
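
A minimal sketch of the Majority Vote on the four-clause example of this slide: each variable is set to T when it appears positively at least as often as negatively. Breaking ties toward T is an assumption of the sketch, chosen because it matches the value shown for x3; the clause encoding (signed integers) is also assumed.

```python
from collections import Counter


def majority_vote(n, clauses):
    """Assign each variable by the majority polarity of its occurrences."""
    balance = Counter()
    for clause in clauses:
        for lit in clause:
            balance[abs(lit)] += 1 if lit > 0 else -1
    # Ties (and variables that never occur) default to True: an assumption.
    return {v: balance[v] >= 0 for v in range(1, n + 1)}


clauses = [(1, 2, -3), (4, 2, -1), (-1, 2, 4), (3, -2, 4)]
maj = majority_vote(4, clauses)
print(maj)
```

Here x1 occurs once positively and twice negatively, so it is set to F; x3 is a tie; x2 and x4 lean positive.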
Basic Ingredients Cont.
The Unassignment Procedure:

If C = (x ∨ ¬y ∨ z) evaluates to (T ∨ F ∨ F) under ψ, then x supports C w.r.t. ψ

Note: all three variables are assigned by ψ

Repeatedly unassign every variable that supports fewer than t clauses
E.g. unassignment with threshold t = 1
(x1 ∨ x2 ∨ ¬x3) ∧ (x4 ∨ x2 ∨ ¬x1) ∧ (¬x1 ∨ x2 ∨ ¬x4) ∧ (x3 ∨ ¬x1 ∨ ¬x4)
[Animation: the T/F value under each literal is replaced by * as its variable is unassigned]
Unassignment stops when all remaining variables
support at least t clauses
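
The procedure can be sketched as a fixed-point loop. The starting assignment used below (x1=T, x2=F, x3=T, x4=F) is an assumption for illustration, as are the function name and the signed-integer clause encoding. Removing all weak variables in one round gives the same final result as removing them one at a time, since support only decreases as variables are unassigned.

```python
def unassign(clauses, assignment, t):
    """Unassign variables until every remaining one supports at least t clauses."""
    psi = dict(assignment)            # var -> bool, assigned variables only
    changed = True
    while changed:
        changed = False
        support = {v: 0 for v in psi}
        for clause in clauses:
            # A clause can only be supported if all three variables are assigned.
            if any(abs(lit) not in psi for lit in clause):
                continue
            true_lits = [lit for lit in clause if (lit > 0) == psi[abs(lit)]]
            if len(true_lits) == 1:   # unique true literal supports the clause
                support[abs(true_lits[0])] += 1
        for v in [v for v, s in support.items() if s < t]:
            del psi[v]
            changed = True
    return psi


clauses = [(1, 2, -3), (4, 2, -1), (-1, 2, -4), (3, -1, -4)]
start = {1: True, 2: False, 3: True, 4: False}   # assumed starting assignment
print(unassign(clauses, start, t=1))
```

In this run x2 and x3 support nothing and are unassigned first; that leaves no fully assigned clause, so x1 and x4 lose their support and are unassigned as well.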

Basic Ingredients Cont.
The Exhaustive Search:

Given 3CNF formula I, define its induced graph G_I = (V,E):

V = {x1, x2, …, xn} - the set of variables

(xi,xj) ∈ E if ∃ clause C containing both (polarity disregarded)

Given I, find the connected components in G_I

Search every component separately for a satisfying assignment

If every component is of size O(log n), the procedure is polynomial.
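
A minimal sketch of the component-wise search, assuming signed-integer clauses whose variables all belong to `variables`; the function names are illustrative. Every clause lies inside a single component, so each component can be brute-forced independently at cost 2^(component size).

```python
import itertools


def components(variables, clauses):
    """Connected components of the induced graph G_I (polarity disregarded)."""
    adj = {v: set() for v in variables}
    for clause in clauses:
        vs = [abs(lit) for lit in clause]
        for a in vs:
            adj[a].update(b for b in vs if b != a)
    seen, comps = set(), []
    for v in variables:
        if v in seen:
            continue
        seen.add(v)
        stack, comp = [v], []
        while stack:                      # depth-first traversal
            u = stack.pop()
            comp.append(u)
            for w in adj[u] - seen:
                seen.add(w)
                stack.append(w)
        comps.append(sorted(comp))
    return comps


def solve_by_components(variables, clauses):
    """Brute-force each component separately; None if some component is unsatisfiable."""
    solution = {}
    for comp in components(variables, clauses):
        comp_set = set(comp)
        local = [c for c in clauses if abs(c[0]) in comp_set]
        for values in itertools.product((False, True), repeat=len(comp)):
            cand = dict(zip(comp, values))
            if all(any((l > 0) == cand[abs(l)] for l in c) for c in local):
                solution.update(cand)
                break
        else:
            return None
    return solution


clauses = [(1, 2, -3), (-1, -2, 3), (4, -5, 6), (-4, 5, -6)]
print(components(range(1, 7), clauses))
print(solve_by_components(range(1, 7), clauses))
```

With components of size O(log n), each brute-force loop runs over polynomially many candidate assignments.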
Basic Ingredients: Motivation



Assume input sampled according to planted 3SAT

Suppose φ(x) = T

In every clause, x appears w.p. 4/7·3/n, ¬x w.p. 3/7·3/n
Therefore,

Majority Vote approximates φ closely whp

But we also expect the Majority to wrongly assign some
variables whp (small fraction)
We call such a variable a wrong variable.

Suppose a wrongly assigned variable x survives unassignment

x supports a clause: (T ∨ F ∨ F) under the Majority, but x's literal is F under φ
Since φ satisfies the clause, one of the other two literals is T under φ and F under the Majority
Must be another wrong variable surviving the unassignment
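
The 4/7 and 3/7 fractions can be recovered by enumerating the sign patterns of a clause containing x. The sketch below assumes φ sets all three variables of the clause to T, which loses no generality by symmetry. It also computes the expected support of x under p = d/n^2, which comes out close to d/2 for large n.

```python
from fractions import Fraction
from itertools import product
from math import comb

phi = (True, True, True)      # position 0 is x; phi(x) = T (values assumed by symmetry)
pos = neg = support = 0
for signs in product((True, False), repeat=3):    # True = positive literal
    values = [sign == val for sign, val in zip(signs, phi)]
    if not any(values):
        continue                                  # clause not satisfied by phi: excluded
    if signs[0]:
        pos += 1
    else:
        neg += 1
    if values == [True, False, False]:
        support += 1                              # x is the unique true literal

total = pos + neg
print(Fraction(pos, total), Fraction(neg, total), Fraction(support, total))


def expected_support(n, d):
    """Expected number of clauses supported by x w.r.t. phi, with p = d / n^2."""
    return comb(n - 1, 2) * d / n**2


print(expected_support(10**6, 100))
```

Of the 7 admissible sign patterns, 4 contain x positively and 3 negatively, which is the bias the Majority Vote exploits.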
Motivation Cont.





W - the set of wrong variables surviving unassignment

There exist at least t·|W| clauses, each containing at least
2 variables from W (each clause was counted once, as the
support is unique)

We call such W dense

If |W| is small, this is analogous to a small subgraph with
atypically high average degree

This happens with small probability in random graphs, G_{n,p}
Algorithm: General Outline
The algorithm basically proceeds in 2 steps:
1. Find a partial correct solution containing a large fraction
   of the variables (Majority Vote + Unassignment)
2. a. Try to complete the partial solution to a satisfying
      assignment (Exhaustive Search)
   b. If not possible, gradually fix the partial solution until
      step 2.a ends up successfully (make sure the algorithm
      always succeeds)
Putting Everything Together
Algorithm SAT(I):
1. MAJ ← Majority Vote of I.
2. Carry out unassignment with threshold 0.999d/2 w.r.t. MAJ
   (d/2 is the expected support).
3. Let ψ be the partial assignment.
4. Let U be the set of unassigned variables.
5. Construct G=(U,E).
6. For all subsets Y ⊆ V\U, |Y| = 0..|V\U|, and for
   all possible assignments ψ_Y of Y (Y is the fixing set of variables):
   1. Fix ψ according to ψ_Y.
   2. Using exhaustive search on G(U,E), try to complete ψ to a
      satisfying assignment.
   3. If success, return the assignment.
Slide callouts: completeness, soundness
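
Step 6 can be sketched as follows, taking the partial assignment ψ as given. As a simplification, this sketch completes ψ by plain brute force over the unassigned variables instead of the component-wise search; the function names and the signed-integer clause encoding are assumptions.

```python
import itertools


def residual(clauses, psi):
    """Clauses left after applying psi; None if psi falsifies some clause."""
    out = []
    for c in clauses:
        if any(abs(l) in psi and (l > 0) == psi[abs(l)] for l in c):
            continue                       # already satisfied by psi
        rest = tuple(l for l in c if abs(l) not in psi)
        if not rest:
            return None                    # an (F or F or F) clause
        out.append(rest)
    return out


def brute_force(free_vars, clauses):
    """Stand-in for the exhaustive search on G(U,E) (simplified)."""
    for values in itertools.product((False, True), repeat=len(free_vars)):
        cand = dict(zip(free_vars, values))
        if all(any((l > 0) == cand[abs(l)] for l in c) for c in clauses):
            return cand
    return None


def fix_and_complete(n, clauses, psi):
    """Try fixing sets Y of increasing size; return (assignment, Y) or (None, None)."""
    assigned = sorted(psi)
    free_vars = [v for v in range(1, n + 1) if v not in psi]
    for size in range(len(assigned) + 1):
        for Y in itertools.combinations(assigned, size):
            for values in itertools.product((False, True), repeat=size):
                fixed = dict(psi)
                fixed.update(zip(Y, values))          # fix psi according to psi_Y
                rest = residual(clauses, fixed)
                if rest is None:
                    continue
                completion = brute_force(free_vars, rest)
                if completion is not None:
                    fixed.update(completion)
                    return fixed, Y
    return None, None


clauses = [(1, 2, -3), (1, 2, 3)]
psi = {1: False, 2: False}                 # a deliberately wrong partial assignment
result = fix_and_complete(3, clauses, psi)
print(result)
```

In this run the empty fixing set fails, and the search succeeds with Y = (1,). Because the loop eventually covers every assignment of the assigned variables, it finds a satisfying assignment whenever one exists.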
What’s Next ?

What is expected polynomial time and some motivation.

The planted SAT distribution and related work.

Description of our algorithm.

Outline of the analysis.

Open problems.
Analyzing the Running Time
Algorithm SAT(I):
1. MAJ ← Majority Vote of I.
2. Carry out unassignment with threshold 0.999d/2 w.r.t. MAJ.
3. Let ψ be the partial assignment.
4. Let U be the set of unassigned variables.
5. Construct G=(U,E).
6. For all subsets Y ⊆ V\U, and for all
   possible assignments ψ_Y of Y:
   1. Fix ψ according to ψ_Y.
   2. Using exhaustive search on G(U,E), try to
      complete ψ to a satisfying assignment.
   3. If success, return the assignment.
Running time annotations:
Steps 1-5: always polynomial. In fact expected linear time
Step 6 loop: expected to perform O(1) times
Exhaustive search (step 6.2): expected running time O(n^{1+…})
Analysis Outline
Typically (for Planted 3SAT), the following happens:

Distance between MAJ and the planted assignment is e^{-Θ(d)}·n

Almost all correct variables, (1-e^{-Θ(d)})·n, survive unassignment

Only correct variables survive the unassignment (“density” arguments)

G=(U,E) breaks down to O(log n)-size connected components
(arguments similar to G_{n,p}, np<1)
Therefore,

Exhaustive search is successful and polynomial
Analysis Outline

What can go wrong, preventing successful execution ?

Wrong variables survived the unassignment:

The partial assignment induces a (F ∨ F ∨ F) clause

Formula induced by unassigned variables is not satisfiable

Y0 - the set of fixing variables with which the algorithm ends

Typically, Y0 = ∅
Analysis Outline Cont.
Key observation: if Y0 ≠ ∅ then:
1. The Majority Vote is wrong for at least |Y0| variables
   (otherwise, the algorithm would have ended with a smaller set Y' ⊂ Y0)
2. Y0 is a dense set of variables:
   Suppose x ∈ Y0
   → x survives the unassignment
   → x supports ~d/2 clauses
   → each such clause is (T ∨ F ∨ F) under MAJ, with x the T literal and some y an F literal
   → y ∈ Y0, otherwise the algorithm cannot end

For “large” |Y0|, (1) happens with small probability

For “small” |Y0|, (2) happens with small probability

It remains to carry out the exact calculations
A Taste of Rigorous Analysis
The following properties hold whp for Planted 3SAT:

Let α0 = e^{-d/C0}

F_MAJ - the set of variables on which MAJ and φ disagree

Claim: for y ≥ α0·n, Pr[|F_MAJ| ≥ y] ≤ e^{-yd/C1}

For J ⊆ V, F(J) is the set of clauses in I containing at least
2 variables from J

Claim: for j ≤ α0·n, Pr[∃J, |J| = j, |F(J)| ≥ jd/3] ≤ e^{-j·log(n/j)·d/12}

Properties proved using standard probabilistic techniques
(union bound, Chernoff)
A Taste of Rigorous Analysis
The expected number of fixing iterations is at most:
\[
E[\#\text{iterations}] \le \sum_{y=0}^{n} \Pr[|Y_0| = y] \cdot \sum_{y'=0}^{y} \binom{n}{y'} 2^{y'}
\]
for y ≤ n/2,
\[
\sum_{y'=0}^{y} \binom{n}{y'} 2^{y'} \le (y+1) \binom{n}{y} 2^{y} \le \exp\Bigl(O\bigl(y \log \tfrac{n}{y}\bigr)\Bigr)
\]
always,
\[
\sum_{y'=0}^{y} \binom{n}{y'} 2^{y'} \le 3^{n}
\]
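
The counting bounds on the number of fixing attempts can be sanity-checked numerically. The sketch below checks, for one value of n, that the number of pairs (Y, assignment of Y) with |Y| ≤ y never exceeds 3^n, and that for y ≤ n/2 it is at most (y+1)·C(n,y)·2^y. The function name and the choice n = 40 are illustrative assumptions.

```python
from math import comb


def fixing_attempts(n, y):
    """Number of pairs (Y, assignment of Y) with |Y| <= y among n variables."""
    return sum(comb(n, k) * 2**k for k in range(y + 1))


n = 40
always_ok = all(fixing_attempts(n, y) <= 3**n for y in range(n + 1))
small_ok = all(
    fixing_attempts(n, y) <= (y + 1) * comb(n, y) * 2**y
    for y in range(n // 2 + 1)
)
print(always_ok, small_ok, fixing_attempts(n, n) == 3**n)
```

The 3^n bound is tight at y = n by the binomial theorem; the second bound holds because the terms C(n,k)·2^k are increasing for k ≤ n/2.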
A Taste of Rigorous Analysis
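\[
\begin{aligned}
E[\#\text{iterations}] \le{} & 1 + \sum_{y=1}^{\alpha_0 n} \exp\Bigl(O\bigl(y \log \tfrac{n}{y}\bigr)\Bigr) \cdot \underbrace{\exp\Bigl(-y \log \tfrac{n}{y} \cdot \tfrac{d}{12}\Bigr)}_{Y_0 \text{ is a dense set of size } y} \\
& + \sum_{y=\alpha_0 n}^{n/2} \exp\Bigl(O\bigl(y \log \tfrac{n}{y}\bigr)\Bigr) \cdot \underbrace{\exp\Bigl(-y \cdot \tfrac{d}{C_1}\Bigr)}_{\text{MAJ wrong for } |Y_0| \text{ variables}} \\
& + \sum_{y=n/2}^{n} 3^{n} \cdot \exp\Bigl(-y \cdot \tfrac{d}{C_1}\Bigr) = O(1)
\end{aligned}
\]
In the second sum, log(n/y) ≤ log(1/α0) = d/C0.
In the third sum, yd/C1 ≥ dn/(2C1) ≥ 2n.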
Open Problems

[FV04] show a k-opt based heuristic solving whp Planted
3SAT, p = d/n^2

Change k-opt version to run in expected polynomial time
Challenge: no explicit distinction between wrong and
correct variables

Simplify [Böt05], e.g. replacing SDP approximation with
simpler and stronger procedure (similar to Majority Vote)

Design an efficient algorithm for random (not planted)
satisfiable formulas, p = d/n^2