
Time-Space Tradeoffs
in Proof Complexity:
Superpolynomial Lower Bounds for
Superlinear Space
Chris Beck
Princeton University
Joint work with
Paul Beame & Russell Impagliazzo
Jakob Nordstrom & Bangsheng Tang
SAT
• The satisfiability problem is a central problem
in computer science, in theory and in practice.
• Terminology:
– A Clause is a boolean formula which is an OR of
variables and negations of variables.
– A CNF formula is an AND of clauses.
• Object of study for this talk:
CNF-SAT: Given a CNF formula, is it satisfiable?
Resolution Proof System
• Lines are clauses, one simple proof step
• Proof is a sequence of clauses each of which is
– an original clause or
– follows from previous ones via resolution step
• a CNF is UNSAT iff can derive empty clause ⊥
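The resolution step is simple enough to state in a few lines of code. The sketch below (my own illustration, not from the talk) represents a clause as a frozenset of signed integers, with +v a variable and -v its negation:

```python
def resolve(c1, c2, x):
    """Resolve clauses c1 and c2 on variable x.

    Requires x in c1 and -x in c2; the resolvent drops the
    clashing pair of literals and unions whatever remains.
    """
    assert x in c1 and -x in c2
    return (c1 - {x}) | (c2 - {-x})

# Refuting the UNSAT CNF (x) ∧ (¬x ∨ y) ∧ (¬y) in two steps:
c = resolve(frozenset({1}), frozenset({-1, 2}), 1)   # derives {y}
empty = resolve(c, frozenset({-2}), 2)               # derives ⊥
assert empty == frozenset()
```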
Proof DAG
General resolution: Arbitrary DAG
Tree-like resolution: DAG is a tree
SAT Solvers
• Well-known connection between Resolution
and SAT solvers based on Backtracking
• These algorithms are very powerful
– sometimes can quickly handle CNFs with millions
of variables.
• On UNSAT formulas, computation history
yields a Resolution proof.
– Tree-like Resolution ≈ DPLL algorithm
– General Resolution ≿ DPLL + “Clause Learning”
• Best current SAT solvers use this approach
SAT Solvers
• DPLL algorithm: backtracking search for
satisfying assignment
– Given a formula which is not constant, guess a
value for one of its variables 𝑥, and recurse on the
simplified formula.
– If we don’t find an assignment, set 𝑥 the other
way and recurse again.
– If one of the recursive calls yields a satisfying
assignment, adjoin the good value of 𝑥 and return,
otherwise report fail.
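The recursion above can be sketched as follows. This is a minimal illustration (clause representation and branching choice are mine; real solvers add unit propagation, learning, and heuristics):

```python
def dpll(clauses, assignment=None):
    """Backtracking search for a satisfying assignment.

    clauses: iterable of sets of nonzero ints, +v / -v for the
    variable v and its negation.  Returns a dict var -> bool on
    success, or None if the formula is unsatisfiable.
    """
    if assignment is None:
        assignment = {}
    simplified = []
    for clause in clauses:
        # Drop clauses already satisfied under the partial assignment.
        if any(abs(l) in assignment and (l > 0) == assignment[abs(l)]
               for l in clause):
            continue
        # Remove falsified literals; an emptied clause kills this branch.
        rest = {l for l in clause if abs(l) not in assignment}
        if not rest:
            return None
        simplified.append(rest)
    if not simplified:
        return assignment  # every clause satisfied
    # Guess a value for one variable and recurse; try the other on failure.
    x = abs(next(iter(simplified[0])))
    for value in (True, False):
        result = dpll(simplified, {**assignment, x: value})
        if result is not None:
            return result
    return None
```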
SAT Solvers
• DPLL requires very little memory
• Clause learning adds a new clause to the input
CNF every time the search backtracks
– Uses lots of memory to try to beat DPLL.
– In practice, must use heuristics to guess which
clauses are “important” and store only those.
Hard to do well! Memory becomes a bottleneck.
• Question: Is this inherent? Or can the right
heuristics avoid the memory bottleneck?
Proof Complexity & Sat Solvers
• Proof Size ≤ Time for Ideal SAT Solver
• Proof Space ≤ Memory for Ideal SAT Solver
• Many explicit hard UNSAT examples known
with exponential lower bounds for Resolution
Proof Size.
• Question: Is this also true for Proof Space?
Space in Resolution
[Figure: snapshot of a resolution proof at time step t; the clauses
currently in memory — C1, C2, x∨y, x∨z, y∨z, …, down to ⊥ — are the
“active” ones]
• Clause space ≔ max over time steps t of (# active clauses at time t).
[Esteban, Torán ‘99]
Lower Bounds on Space?
• Considering Space alone, tight lower and upper
bounds of Θ(n), for explicit tautologies of size n.
• Lower Bounds: [ET’99, ABRW’00, T’01, AD’03]
• Upper Bound: All UNSAT formulas on 𝑛 variables
have tree-like refutation of space ≤ 𝑛.
[Esteban, Torán ‘99]
• For a tree-like proof, Space is at most the height of
the tree which is the stack height of DPLL search
• But, these tree-like proofs are typically too large
to be practical, so we should instead ask if both
small space and time are simultaneously feasible.
Size-Space Tradeoffs for Resolution
• [Ben-Sasson ‘01] Pebbling formulas with linear
size refutations, but for which all proofs have
Space ∙ log Size = Ω(n/log n).
• [Ben-Sasson, Nordström ‘10] Pebbling formulas
which can be refuted in Size O(n), Space O(n),
but Space O(n/log n) ⇒ Size exp(n^Ω(1)).
But, these are all for Space < 𝑛, and SAT solvers
generally can afford to store the input formula in
memory. Can we break the linear space barrier?
Size-Space Tradeoffs
• Informal Question: Can we formally show that
memory rather than time can be a real
bottleneck for resolution proofs and SAT solvers?
• Formal Question (Ben-Sasson):
“Does there exist a 𝑐 such that any CNF with a
refutation of size T also has a refutation of size T𝑐
in space O(𝑛)?”
• Our results: Families of formulas of size n having
refutations in Time, Space ≈ n^k, but all resolution
refutations have T > (n^(0.58 k)/S)^Ω(log log n / log log log n)
Tseitin Tautologies
Given an undirected graph G = (V, E) and a charge
function χ: V → 𝔽₂, define a CSP:
• Boolean variables: one variable x_e per edge e ∈ E
• Parity constraints (linear equations): for each vertex v,
the XOR of the variables on edges incident to v equals χ(v)
When χ has odd total parity, the CSP is UNSAT: each edge is
counted twice, so summing all equations over 𝔽₂ gives 0 = Σ_v χ(v) = 1.
Tseitin Tautologies
• When  odd, G connected, corresponding CNF is
called a Tseitin tautology. [Tseitin ‘68]
• Only total parity of  matters
• Hard when G is a constant degree expander:
[Urqhart 87]: Resolution size = 2Ω(𝐸)
[Torán 99]: Resolution space =Ω(E)
• This work: Tradeoffs on 𝒏 × 𝒍 grid, 𝒍 ≫ 𝒏, and
similar graphs, using isoperimetry.
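For concreteness, here is a small sketch that builds the grid Tseitin parity constraints and brute-force checks tiny instances. The encoding (one variable per grid edge, vertices as (row, column) pairs) is my own choice for illustration:

```python
import itertools

def grid_tseitin(n, l, charge):
    """Tseitin CSP on the n x l grid: one Boolean variable per edge,
    and for each vertex v a parity constraint
        XOR of incident edge variables = charge[v]."""
    verts = [(i, j) for i in range(n) for j in range(l)]
    edges = []
    for (i, j) in verts:
        if i + 1 < n:
            edges.append(((i, j), (i + 1, j)))
        if j + 1 < l:
            edges.append(((i, j), (i, j + 1)))
    incident = {v: [] for v in verts}
    for idx, (u, v) in enumerate(edges):
        incident[u].append(idx)
        incident[v].append(idx)
    return edges, [(incident[v], charge[v]) for v in verts]

def satisfiable(edges, constraints):
    """Brute-force satisfiability check; only for tiny instances."""
    return any(all(sum(bits[i] for i in inc) % 2 == par
                   for inc, par in constraints)
               for bits in itertools.product([0, 1], repeat=len(edges)))

# Odd total charge on the 2x2 grid: unsatisfiable, as the slide states.
odd = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0}
assert not satisfiable(*grid_tseitin(2, 2, odd))
# Even total charge (all zero): satisfied by the all-zero assignment.
even = {v: 0 for v in odd}
assert satisfiable(*grid_tseitin(2, 2, even))
```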
Tseitin formula on Grid
• Consider Tseitin formula on 𝒏 × 𝒍 grid, 𝒍=4𝒏𝟐
[Figure: the n × l grid graph]
• How can we build a resolution refutation?
Tseitin formula on Grid
• One idea: Mimic linear algebra refutation
• If we add the lin eqn’s in any order, get 1 = 0.
• A linear equation on k variables corresponds
to 2^(k−1) clauses. Resolution is implicationally
complete, so it can simulate any step of this
with 2^O(k) blowup.
Tseitin formula on Grid
• One idea: Mimic linear algebra refutation
• If we add the lin eqn’s in column order:
• Never have an intermediate equation with
more than n variables. Get a proof of Size ≈
#vertices ∙ 2^n, Space ≈ 2^n.
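One way to see the width bound is to simulate the column-order summation and track the support of the running 𝔽₂ sum; the support is exactly the set of edges crossing the processed/unprocessed boundary. A sketch (vertex/edge encoding is mine; with this ordering the maximum is n + 1, matching the ≈ n bound on the slide):

```python
def max_intermediate_width(n, l):
    """Add the grid-Tseitin parity equations over GF(2) in column-major
    vertex order, and report the largest support of any partial sum."""
    def incident(i, j):
        nbrs = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
        return {frozenset({(i, j), (a, b)})
                for a, b in nbrs if 0 <= a < n and 0 <= b < l}

    support, widest = set(), 0
    for j in range(l):              # process the grid column by column
        for i in range(n):
            support ^= incident(i, j)   # GF(2) addition of one equation
            widest = max(widest, len(support))
    # Every edge appears in exactly two equations, so the total sum is 0.
    assert not support
    return widest

assert max_intermediate_width(3, 8) == 4   # stays at n + 1 = O(n)
```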
Tseitin formula on Grid
• Different idea: Divide and conquer
• In DPLL, repeatedly bisect the graph
• Each time we cut, one of the components is
unsat. Done after n log|V| queries; tree-like
proof with Space ≈ n log(nl), Size ≈ (nl)^n.
• Savitch-like savings in space, for 𝒍 = 𝐩𝐨𝐥𝐲(𝒏).
Tseitin formula on Grid
• “Chris, isn’t this just tree-width?” – Exactly.
• Our work is about whether this can be improved.
• The lower bound shows that a quasipolynomial
blowup in size is necessary when the space is
below 2^n, for the proof systems we studied.
• For technical reasons, work with “doubled” grid.
High Level Overview of Lower Bound
• [Beame Pitassi ’96] simplified previous Proof
Size lower bounds using random restrictions.
• A restriction 𝜌 to a formula 𝜑 is a partial
assignment of truth values to variables,
resulting in some simplification. We denote
the restricted formula by 𝜑|𝜌 .
• If Π is a proof of 𝜑, Π|𝜌 is a proof of 𝜑|𝜌 .
High Level Overview of Lower Bound
• Philosophy of [Beame, Pitassi ’96]
– Find 𝜑 such that any refutation contains a
“complex clause”
– Find φ′ and a random restriction which restricts φ′
to φ, but also kills any complex clause whp.
– Modern form of “bottleneck counting” due to
Haken. Can reinterpret this as finding a large
collection of assignments, one extending each
restriction, each passing through a wide clause.
High Level Overview of Lower Bound
• This work: Multiple classes of complex clauses
– Can state main idea as a pebbling argument.
– Suppose G is a DAG with T vertices that can be
scheduled in space S, and the vertices can be
broken into 4 classes, μ: V → {1,2,3,4}.
– Assume that for arcs (u, v), μ(v) ≤ μ(u) + 1
– All sources in 1, all sinks in 4. Everything in 2 is a
“complex clause” of the first type, everything in 3 is a
“complex clause” of the second type.
High Level Overview of Lower Bound
• Suppose that there is a distribution on
source-to-sink paths π and a real number p s.t.:
– For any vertex v with μ(v) ∈ {2,3}, Pr_π[v ∈ π] ≤ p.
– For any two vertices v1, v2 with μ(v1) = 2, μ(v2) = 3,
Pr_π[v1, v2 ∈ π] ≤ p².
• Then, T²S = Ω(p⁻³).
High Level Overview of Lower Bound
• Proof:
– Let 𝑣1 … 𝑣𝑇 be sequence of pebbling steps using
only S pebbles. Divide time into 𝑚 epochs, 𝑚 to
be determined later.
– Plot μ vs. time
[Plot: μ levels 1–4 vs. time, divided into epochs]
High Level Overview of Lower Bound
• Let 𝜋 be any source-to-sink path. Claim:
– Either, π hits a vertex of level 2 and a vertex of
level 3 during the same epoch
– Or, 𝜋 hits a vertex of level 2 or 3 during one of the
breakpoints between epochs.
High Level Overview of Lower Bound
• Now choose 𝜋 randomly.
– Either, π hits a vertex of level 2 and a vertex of
level 3 during the same epoch
Probability of this: by a union bound, at most
m ∙ (T/m)² ∙ p²
High Level Overview of Lower Bound
• Now choose 𝜋 randomly.
– Or, 𝜋 hits a vertex of level 2 or 3 during one of the
breakpoints between epochs.
Probability of this: by a union bound, at most
m ∙ S ∙ p
High Level Overview of Lower Bound
• Now choose 𝜋 randomly.
– Conclude 1 ≤ m⁻¹T²p² + mSp
• Optimizing m yields T²S = Ω(p⁻³).
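Filling in the optimization step (a routine calculation not spelled out on the slide): minimize the right-hand side of

```latex
1 \;\le\; m^{-1}T^2p^2 + mSp
```

over m. The two terms balance at $m = Tp/\sqrt{Sp}$, giving $1 \le 2Tp\sqrt{Sp}$, i.e. $T^2p^2 \cdot Sp = \Omega(1)$, hence $T^2 S = \Omega(p^{-3})$.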
• Previous such results only for expanders, etc.
• Gets stronger when you have more levels.
Later we will see that with l levels, we get
T ≥ (1/(pS))^Ω(log l / log log l)
High Level Overview of Lower Bound
• Goal for remainder of analysis:
– Show that any proof of Tseitin on doubled grid has
such a flow (using restrictions), for the best 𝑝 and
as many complex clause classes as possible.
– It is enough that any k complex clauses from
distinct classes have collective width ≥ 𝑘𝑊 for
some large W. (For n x l grid, W will be n.)
– How did we get complex clauses before in [Beame
Pitassi ‘96]? A “measure of progress” argument…
Progress Measure
• The following progress measure construction
appeared in [Ben-Sasson and Wigderson ’01].
• Let 𝐴1 … 𝐴𝑚 be a partition of clauses of an
unsat CNF. Assume it is minimally unsat.
• For any clause C, define
μ(C) ≔ min{ |S| : S ⊆ [m], {A_i}_{i∈S} ⊨ C }
Progress Measure
• μ is a sub-additive complexity measure:
– μ(initial clause) = 1,
– μ(⊥) = m,
– μ(C) ≤ μ(C1) + μ(C2), when C1, C2 ⊦ C.
• In any refutation, there must occur a clause C
with μ(C)/m ∈ [1/3, 2/3]. In previous work,
choose A1 … Am so that this implies C is wide.
Progress Measure for Tseitin
• For Tseitin tautologies, natural choice is to
have one 𝐴𝑖 for each vertex v, containing all
associated clauses. So 𝐴𝑖 is an 𝔽2 lin. eqn.
• Claim: Let C be any clause, and let S ⊆ V,
S ⊨ C be any subset attaining the min, |S| =
μ(C). Then, δ(S) ⊆ Vars(C).
Progress Measure for Tseitin
• Claim: Let C be any clause, and let S ⊆ V,
S ⊨ C, |S| = μ(C). Then δ(S) ⊆ Vars(C).
• Proof:
• If S ⊨ C, then S ∧ ¬C is an unsat linear system, so
we can derive 1 = 0 by a linear combination.
• If any equation of S has coefficient 0, S wasn’t
minimal. In the sum of the equations of S, the noncancelling
variables are exactly the boundary edge variables. These must
cancel with ¬C to get 1 = 0, so δ(S) ⊆ Vars(C).
Isoperimetry → Lower Bounds
• Consequence: Tseitin formula has minimum
refutation width at least the balanced
bisection width of graph G.
• Corollary: Tseitin on a constant degree
expander requires width Ω(𝑛).
• Corollary: On 𝑛 ⨉ 𝑙 grid, needs width 𝑛.
• Here, “Complex clause” := μ(C)/m ∈ [1/3, 2/3]
Size Lower Bounds from Width
• We have formulas with width lower bounds,
can use XOR-substitution and restrictions to
get size lower bounds.
• For F a CSP, F[⨁] is the CSP over the variable set
with two copies x1, x2 of each var x of F, and
each constraint of F interpreted as
constraining the parities (x1⨁x2) rather than
the variables x. (Then expand as CNF.)
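The CNF expansion can be sketched as code (the pairing x ↦ (2x−1, 2x) and the clause representation are my own conventions): each literal becomes a two-clause XOR gadget, and the OR distributes over the gadgets, so a width-k clause expands to 2^k clauses of width 2k.

```python
import itertools

def xor_substitute(clauses):
    """F[⊕]: replace each variable x by x1 ⊕ x2 and expand as CNF.

    The literal x means x1 ⊕ x2 = 1, i.e. (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2);
    the literal ¬x means x1 ⊕ x2 = 0, i.e. (x1 ∨ ¬x2) ∧ (¬x1 ∨ x2).
    A k-literal clause (an OR of XORs) expands to 2^k clauses by
    distributing the OR over the two conjuncts of each XOR gadget.
    """
    def xor_cnf(lit):
        x = abs(lit)
        a, b = 2 * x - 1, 2 * x
        if lit > 0:   # x1 ⊕ x2 = 1
            return [frozenset({a, b}), frozenset({-a, -b})]
        return [frozenset({a, -b}), frozenset({-a, b})]

    out = []
    for clause in clauses:
        parts = [xor_cnf(l) for l in clause]
        for choice in itertools.product(*parts):
            out.append(frozenset().union(*choice))
    return out

# F = {x} becomes (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2):
assert set(xor_substitute([frozenset({1})])) == {frozenset({1, 2}),
                                                 frozenset({-1, -2})}
```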
Size Lower Bounds from Width
• 𝐹[⨁] has a natural random restriction 𝜌:
– For each 𝑥, independently choose one of 𝑥1 , 𝑥2 to
restrict to {0,1} at random. Substitute either 𝑥 or
¬𝑥 for the other, so that 𝑥1 ⨁𝑥2 |𝜌 = 𝑥.
• Previously used in [B’01]. Nice properties:
– 𝐹 ⨁ |𝜌 = 𝐹, always
– Fairly dense, kills all wide clauses whp.
– Black box, doesn’t depend on 𝐹.
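A sketch of ρ in code, under the same hypothetical pairing x ↦ (2x−1, 2x) as above: one copy of each variable is fixed to a random bit and the other is renamed (possibly negated) back to x, so F[⨁]|ρ = F no matter which coins are tossed.

```python
import random

def random_restriction(num_vars, rng):
    """Sample rho: for each x, fix one of x1 = 2x-1, x2 = 2x to a random
    bit b, and map the other to x (if b = 0) or ¬x (if b = 1), so that
    (x1 ⊕ x2)|rho = x."""
    rho = {}
    for x in range(1, num_vars + 1):
        a, b = 2 * x - 1, 2 * x
        fixed, free = (a, b) if rng.random() < 0.5 else (b, a)
        bit = rng.randrange(2)
        rho[fixed] = ('bit', bit)
        rho[free] = ('lit', x if bit == 0 else -x)
    return rho

def apply_restriction(clauses, rho):
    """Restrict each clause: drop satisfied clauses, delete falsified
    literals, and rename surviving literals back to the original vars."""
    out = []
    for clause in clauses:
        new, sat = set(), False
        for lit in clause:
            kind, val = rho[abs(lit)]
            if kind == 'bit':
                if (lit > 0) == bool(val):
                    sat = True   # clause satisfied by the fixed bit
                    break
                # falsified literal: just drop it
            else:
                new.add(val if lit > 0 else -val)
        if not sat:
            out.append(frozenset(new))
    return out

# F[⊕] for F = {x} is (x1 ∨ x2) ∧ (¬x1 ∨ ¬x2); every rho restores F.
doubled = [frozenset({1, 2}), frozenset({-1, -2})]
rho = random_restriction(1, random.Random(0))
assert apply_restriction(doubled, rho) == [frozenset({1})]
```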
Size Lower Bounds from Width
• When 𝐹 is a Tseitin tautology, 𝐹[⨁] is
“multigraph-Tseitin” on same graph but where
each edge has been doubled.
• Suppose Tseitin on the doubled grid has a proof of
size less than 2^(cn). Then by a union bound,
some restriction kills all clauses of width ≥ n, so
Tseitin on the grid has a proof of width < n.
Contradiction, for small enough c.
Additional Complex Clause Levels
• Idea: Instead of just considering C with
μ(C)/m ∈ [1/3, 2/3], what about [1/6, 1/3],
[1/12, 1/6], etc.?
• Gives us the levelled property we need.
• Task: Show that clauses from k different levels
are collectively wide.
• Given the machinery we have, this is some
“extended isoperimetry” property of the grid.
Extended Isoperimetry
• Claim: Let S1 … Sk be subsets of vertices,
|S_i| ∈ [n², nl/2], of superincreasing sizes.
Then, Σ_i |δ(S_i)| ≥ Ω(kn).
• Several simple combinatorial proofs of this.
• Implies clauses from distinct levels are
collectively wide.
Main Lemma
• For any set S of M clauses in doubled-Tseitin,
Pr[S|ρ contains ≥ k complex clauses
of distinct complexity levels] ≤ M^k ∙ 2^(−Ω(nk))
Proof: For any k-tuple, Ω(kn) opportunities for
ρ to kill at least one. Union bound over k-tuples.
Complexity vs. Time
• Consider the time ordering of any proof, and
plot complexity of clauses in memory v. time
[Plot: clause complexity μ vs. time, rising from the input clauses up to ⊥]
• Sub-additivity implies cannot skip over any
[t, 2t] window of μ-values on the way up.
Complexity vs. Time
• Consider the time ordering of any proof, and
divide time into 𝑚 equal epochs (fix 𝑚 later)
[Plot: complexity bands Hi / Med / Low vs. time, divided into epochs]
Two Possibilities
• Consider the time ordering of any proof, and
divide time into 𝑚 equal epochs (fix 𝑚 later)
• Either, a clause of medium complexity
appears in memory for at least one of the
breakpoints between epochs,
Two Possibilities
• Consider the time ordering of any proof, and
divide time into 𝑚 equal epochs (fix 𝑚 later)
• Or, all breakpoints only have Hi and Low. Must
have an epoch which starts Low ends Hi, and
so has clauses of all ≈log 𝑛 Medium levels.
Analysis
• Consider the restricted proof. With probability
1, one of the two scenarios applies.
• For the first scenario, M = mS clauses, so
Pr[1st scenario] ≤ mS ∙ exp(−Ω(n))
• For second scenario, 𝑚 epochs with 𝑀 = 𝑇/𝑚
clauses each. Union bound and main lemma:
Pr[2nd scenario] ≤ m ∙ (T/m)^k ∙ exp(−Ω(nk)),
where k ≈ log n
Analysis
• Optimizing m yields something of the form
T ∙ S = exp(Ω(n)), for k = ω(1).
• Can get a nontrivial tradeoff this way, but
need to do a lot of work on constants.
Better tradeoff
• Don’t just divide into epochs once
• Recursively divide proof into epochs and
sub-epochs where each sub-epoch contains 𝑚
sub-epochs of the next smaller size
• Prove that, if an epoch does a lot of work,
either
– Breakpoints contain many complexities
– A sub-epoch does a lot of work
An even better tradeoff
• If an epoch contains a clause at its end of
level b, but every clause at its start is of level
≤ b − a (so the epoch makes a progress),
• and the breakpoints of its m child epochs
together contain ≤ k complexity levels,
• then some child epoch makes a/k progress.
[Recursion-tree figure: internal node breakpoint sets have size mS;
leaf node (epoch) sets have size T/m^(h−1). Previous slide shows:
some set has ≥ k complexities, where k^h = log n]
An even better tradeoff
• Choose h = k, so k^k = log n and k = ω(1)
• Choose m so all sets are the same size:
T/m^(k−1) ≈ mS, so all events are rare together.
• Have m^k sets in total
• Finally, a union bound yields T ≥ (exp(Ω(n))/S)^k
Final Tradeoff
• Tseitin formulas on n × l grid for l = 2n⁴
– are of size ≈ n⁵
– have resolution refutations in Size, Space ≈ 2^n
– have Space ≈ n log n refutations of Size ≈ 2^(n log n)
– and any resolution refutation with Size T and
Space S requires T > (2^(0.58 n)/S)^ω(1)
If space is at most 2^(n/2), then size blows up
by a super-polynomial amount
Open Questions
• More than quasi-polynomial separations?
– For Tseitin formulas, the upper bound for small space
is only a log n power of the unrestricted size
– Candidate formulas? Are these even possible?
• Other proof systems?
– In [BNT’12], got Polynomial Calculus Resolution.
– Cutting Planes? Frege subsystems?
• Other cases for separating search paradigms:
“dynamic programming” vs. “binary search”?
Thanks!
Extended Isoperimetric Inequality
If the sets aren’t essentially blocks, we’re done.
If they are blocks, reduce to the line:
Intervals on the line
• Let [a1, b1], …, [ak, bk] be intervals on the
line, such that b_i − a_i ≥ 2(b_{i−1} − a_{i−1}) for each i
• Let μ(k) be the minimum number of distinct
endpoints of intervals in such a configuration.
• Then, a simple inductive proof shows μ(k) ≥ k + 1
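The bound is tight: nested dyadic intervals share a single left endpoint and realize exactly k + 1 distinct endpoints, as this small check (my own illustration) confirms:

```python
def distinct_endpoints(intervals):
    """Count distinct endpoints among intervals given as (a, b) pairs."""
    return len({p for ab in intervals for p in ab})

# Tight configuration: [0, 1], [0, 2], [0, 4], ... each at least
# doubles in length, and together they have exactly k + 1 endpoints.
k = 5
tight = [(0, 2 ** i) for i in range(k)]
assert all(b2 - a2 >= 2 * (b1 - a1)
           for (a1, b1), (a2, b2) in zip(tight, tight[1:]))
assert distinct_endpoints(tight) == k + 1
```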
Proof DAG
“Regular”: on every
root-to-leaf path,
no variable is resolved
more than once.
Tradeoffs for Regular Resolution
• Theorem: For any k, 4-CNF formulas (Tseitin
formulas on long and skinny grid graphs)
of size n with
– Regular resolution refutations in Size n^(k+1),
Space n^k.
– But with Space only n^(k−ε), for any ε > 0, any regular
resolution refutation requires Size at least
n^Ω(log log n / log log log n).
Regular Resolution
• Can define partial information more precisely
• Complexity is monotonic wrt proof DAG
edges. This part uses regularity assumption,
simplifies arguments with complexity plot.
• Random Adversary selects random
assignments based on proof
– No random restrictions, conceptually clean and
don’t lose constant factors here and there.