Time-space tradeoff lower bounds for non-uniform computation
Paul Beame
University of Washington
4 July 2000
1
Why study time-space tradeoffs?
• To understand relationships between the two most critical measures of computation
  - unified comparison of algorithms with varying time and space requirements
• Non-trivial tradeoffs arise frequently in practice
  - avoid storing intermediate results by recomputing them
2
e.g. Sorting n integers from [1,n²]
• Merge sort
  - S = O(n log n), T = O(n log n)
• Radix sort
  - S = O(n log n), T = O(n)
• Selection sort
  - only need the smallest value output so far and the index of the current element
  - S = O(log n), T = O(n²)
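
To make the selection-sort bullet concrete, here is a minimal Python sketch (mine, not from the talk). The output list plays the role of a write-only output tape; the working storage is only a few O(log n)-bit values, and the price is Θ(n²) time.

def low_space_sort(x):
    # x is read-only; only O(log n)-bit counters are kept as working storage.
    out = []                 # write-only output "tape"
    last = None              # largest value output so far
    while len(out) < len(x):
        # one full scan to find the smallest value not yet output ...
        nxt = min(v for v in x if last is None or v > last)
        # ... and another to count its multiplicity
        cnt = sum(1 for v in x if v == nxt)
        out.extend([nxt] * cnt)
        last = nxt
    return out

print(low_space_sort([3, 1, 2, 1]))   # [1, 1, 2, 3]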
3
Complexity theory
• Hard problems
  - prove L ≠ P
  - prove non-trivial time lower bounds for natural decision problems in P
• First step
  - prove a space lower bound, e.g. S = ω(log n), given an upper bound on time T, e.g. T = O(n), for a natural problem in P
4
An annoyance
• Time hierarchy theorems imply
  - there are (unnatural) problems in P not solvable in time O(n)
• This makes the 'first step' vacuous for unnatural problems
5
Non-uniform computation
• Non-trivial time lower bounds are still open for problems in P
• The first step is still very interesting even without the restriction to natural problems
• Can yield bounds with precise constants
• But proving lower bounds may be harder
6
Talk outline
• The right non-uniform model (for now)
  - branching programs
• Early success
  - multi-output functions, e.g. sorting
• Progress on problems in P
  - Crawling: restricted branching programs
  - That breakthrough first step (and more): true time-space tradeoffs
• The path ahead
7
Branching programs
[Figure: a branching program: a directed acyclic graph whose internal nodes are labeled by variables x1,…,x8, with sinks labeled 0 and 1]
8
Branching programs
To compute f: {0,1}^n → {0,1} on input (x1,…,xn), follow the path from the source to a sink: at a node labeled xi, take the outgoing edge labeled by the value of xi; the label of the sink reached is f(x).
[Figure: the same branching program with the path followed on input x = (0,0,1,0,...) highlighted]
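
A minimal Python sketch (mine, not from the talk) of this evaluation rule; the node encoding is an illustrative choice, not a standard format.

def evaluate_bp(nodes, source, x):
    # Each non-sink node is (variable index, successor if 0, successor if 1);
    # sinks are the booleans False/True.  Follow the path from source to a sink.
    v = nodes[source]
    while not isinstance(v, bool):
        var, succ0, succ1 = v
        v = nodes[succ1] if x[var] else nodes[succ0]
    return v

# Example: a tiny read-once program computing x0 AND x1.
nodes = {
    "s":    (0, "zero", "t"),     # query x0
    "t":    (1, "zero", "one"),   # query x1
    "zero": False,
    "one":  True,
}
assert evaluate_bp(nodes, "s", [1, 1]) is True
assert evaluate_bp(nodes, "s", [1, 0]) is False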
9
Branching program properties
• Length = length of the longest path
• Size = # of nodes
• Simulate TMs
  - node = configuration with the input bits erased
  - time T = Length
  - space S = log₂(Size) = TM space + log₂ n (input head position) = space on an index TM
  - polysize = non-uniform L
10
TM space complexity
[Figure: a TM with a read-only input tape holding x1 x2 x3 x4 … xn, a working storage tape, and an output tape; Space = # of bits of working storage]
11
Branching program properties
• Simulate random-access machines (RAMs)
  - not just sequential access
• Generalizations
  - multi-way version for xi in an arbitrary domain D
    good for modeling RAM input registers
  - outputs on the edges
    good for modeling the output tape for multi-output functions such as sorting
• BPs can be leveled w.l.o.g.
  - like adding a clock to a TM
12
Talk outline
• The right non-uniform model (for now)
  - branching programs
• Early success
  - multi-output functions, e.g. sorting
• Progress on problems in P
  - Crawling: restricted branching programs
  - That breakthrough first step (and more): true time-space tradeoffs
• The path ahead
13
Success for multi-output problems
• Sorting
  - T·S = Ω(n²/log n) [Borodin-Cook 82]
  - T·S = Ω(n²) [Beame 89]
• Matrix-vector product
  - T·S = Ω(n³) [Abrahamson 89]
• Many others, including
  - matrix multiplication
  - pattern matching
14
Proof ideas: layers and trees
[Figure: the length-T program cut into r layers of height T/r, with boundary nodes v0, v1, …, vr-1, vr; each layer started at a boundary node v unwinds into a decision tree Tv of height T/r]
• m outputs on input x ⇒ at least m/r outputs are produced in some tree Tv
• Only 2^S trees Tv
• Typical claim: if T/r = εn, each tree Tv outputs p correct answers on only a c^(-p) fraction of inputs
• Correctness for all x implies 2^S · c^(-m/r) ≥ 1
• So S = Ω(m/r) = Ω(mn/T)
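
Spelling out how the last two bullets give the tradeoff (a routine calculation; c > 1 is the constant from the claim and r is chosen so that T/r = εn):

\[
2^{S}\,c^{-m/r} \ge 1
\;\Longrightarrow\;
S \ge \frac{m}{r}\,\log_2 c
   = \frac{\varepsilon\,m\,n}{T}\,\log_2 c
   = \Omega\!\left(\frac{mn}{T}\right).
\]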
15
Limitation of the technique
• Never more than T·S = Ω(nm), where m is the number of outputs
• "It is unfortunately crucial to our proof that sorting requires many output bits, and it remains an interesting open question whether a similar lower bound can be made to apply to a set recognition problem, such as recognizing whether all n input numbers are distinct."
  [Cook: Turing Award Lecture, 1983]
16
Talk outline
• The right non-uniform model (for now)
  - branching programs
• Early success
  - multi-output functions, e.g. sorting
• Progress on problems in P
  - Crawling: restricted branching programs
  - That breakthrough first step (and more): true time-space tradeoffs
• The path ahead
17
Restricted branching programs
• Constant-width: only a constant number of nodes per level
  - [Chandra-Furst-Lipton 83]
• Read-once: every variable read at most once on each path
  - [Wegener 84], [Simon-Szegedy 89], etc.
• Oblivious: the same variable is queried at every node of a given level
  - [Babai-Pudlak-Rodl-Szemeredi 87], [Alon-Maass 87], [Babai-Nisan-Szegedy 89]
• BDD = oblivious read-once
18
BDDs and best-partition communication complexity
[Figure: a BDD on x1,…,x8 whose levels are split between Player A's variables and Player B's variables]
• Given f: {0,1}^8 → {0,1}
• Two-player game
  - Player A holds {x1,x3,x6,x7}
  - Player B holds {x2,x4,x5,x8}
• Goal: communicate the fewest bits possible to compute f
• Possible protocol: Player A simulates its levels of the BDD and sends the name of the node reached
• BDD space ≥ # of bits sent for the best partition into A and B
19
Communication complexity ideas
• Each conversation for f: {0,1}^A × {0,1}^B → {0,1} corresponds to a rectangle Y_A × Y_B of inputs, with Y_A ⊆ {0,1}^A and Y_B ⊆ {0,1}^B
• BDD lower bounds
  - size ≥ min over partitions (A,B) of the # of rectangles in a tiling of the inputs by f-constant rectangles with partition (A,B)
• Read-once bounds
  - same tiling as the BDD bounds, but each rectangle in the tiling may have a different partition
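
A toy illustration of the tiling bound (mine, not from the talk): for a fixed partition (A,B), any tiling of the 1-inputs by f-constant rectangles needs at least rank(M_f) rectangles, where M_f is the 2^|A| × 2^|B| communication matrix, so computing that rank gives a concrete lower bound. The sketch handles the 4-bit inner product function and assumes numpy is available.

import itertools
import numpy as np

def inner_product(xa, xb):                 # f(x) = <x_A, x_B> mod 2
    return sum(a & b for a, b in zip(xa, xb)) % 2

A_bits, B_bits = 2, 2                      # partition: A = first 2 bits, B = last 2 bits
rows = list(itertools.product([0, 1], repeat=A_bits))
cols = list(itertools.product([0, 1], repeat=B_bits))
M = np.array([[inner_product(r, c) for c in cols] for r in rows])

print(M)
print("any tiling of the 1-inputs needs at least",
      np.linalg.matrix_rank(M), "f-constant rectangles")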
20
Restricted branching programs
• Read-k: no variable queried more than k times on ...
  - ... any path: syntactic read-k
    [Borodin-Razborov-Smolensky 89], [Okol'nishnikova 89], etc.
  - ... any consistent path: semantic read-k
• Many years of no results
  - nothing for general branching programs either
21
Uniform tradeoffs
• SAT is not solvable using O(n^(1-ε)) space if time is n^(1+o(1)) [Fortnow 97]
  - uses diagonalization
  - works for co-nondeterministic TMs
• Extensions for SAT
  - S = log^O(1) n implies T = Ω(n^(1.4142…-ε)), deterministic [Lipton-Viglas 99]
    - with up to n^o(1) advice [Tourlakis 00]
  - S = O(n^(1-ε)) implies T = Ω(n^(1.618…-ε)) [Fortnow-van Melkebeek 00]
22
Non-uniform computation
• [Beame-Saks-Thathachar FOCS 98]
  - Syntactic read-k branching programs are exponentially weaker than semantic read-twice.
  - f(x) = "x^T M x = 0 (mod q)" for x ∈ GF(q)^n
    - εn·loglog n time ⇒ Ω(n·log^(1-ε) n) space for q ~ n
  - f(x) = "x^T M x = 0 (mod 3)" for x ∈ {0,1}^n
    - 1.017n time implies Ω(n) space
    - first Boolean result above time n for general branching programs
23
Non-uniform computation
• [Ajtai STOC 99]
  - (0.5·log n)-Hamming distance for x ∈ [1,n²]^n
    - kn time implies Ω(n log n) space
    - follows from [Beame-Saks-Thathachar 98]
    - improved to Ω(n·√(log n)) time by [Pagter 00]
  - element distinctness for x ∈ [1,n²]^n
    - kn time implies Ω(n) space
    - requires a significant extension of the techniques
24
That breakthrough first step!
• [Ajtai FOCS 99]
  - f(x,y) = x^T M_y x (mod 2), for x ∈ {0,1}^n and y ∈ {0,1}^(2n-1)
  - kn time implies Ω(n) space
• First result for non-uniform Boolean computation showing
  - time O(n) ⇒ space ω(log n)
25
Ajtai’s Boolean function
y1
0
y2
f(x,y)= xTMyx (mod 2)
y3
y4
yn
y6
y7
y8
y2n-1
My
My is a modified Hankel matrix
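
A small Python sketch (mine) of this function, using the plain Hankel convention M_y[i][j] = y[i+j] with 0-based indexing; the "modified" matrix Ajtai actually uses differs in details not shown on the slide.

def f(x, y):
    # x has n bits, y has 2n-1 bits; return x^T M_y x over GF(2)
    # where (M_y)[i][j] = y[i+j] (a plain Hankel matrix).
    n = len(x)
    assert len(y) == 2 * n - 1
    total = 0
    for i in range(n):
        for j in range(n):
            total ^= x[i] & y[i + j] & x[j]
    return total

print(f([1, 0, 1], [1, 1, 0, 1, 0]))   # n = 3, so y has 5 bits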
26
Superlinear lower bounds
• [Beame-Saks-Sun-Vee FOCS 00]
• Extension to ε-error randomized non-uniform algorithms
• Better time-space tradeoffs:
  T = Ω(n·√(log(n/S) / loglog(n/S)))
• Applies to both element distinctness and f(x,y) = x^T M_y x (mod 2)
27
(m,α)-rectangles
• An (m,α)-rectangle R ⊆ D^X is a subset defined by
  - disjoint sets A, B ⊆ X,
  - an assignment s ∈ D^(X∖(A∪B)) to the remaining variables, and
  - sets S_A ⊆ D^A, S_B ⊆ D^B, such that
• R = { z | z agrees with s outside A∪B, z_A ∈ S_A, z_B ∈ S_B }
• |A|, |B| ≥ m
• |S_A|/|D^A|, |S_B|/|D^B| ≥ α
28
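
A minimal Python sketch (names and data are mine) of what membership in such a rectangle means:

def in_rectangle(z, A, B, s, S_A, S_B):
    # z is a dict coordinate -> value; s fixes every coordinate outside A and B;
    # S_A and S_B are the allowed partial assignments on A and on B.
    outside = set(z) - set(A) - set(B)
    if any(z[i] != s[i] for i in outside):       # z must agree with s off A u B
        return False
    z_A = tuple(z[i] for i in sorted(A))         # projection of z onto A
    z_B = tuple(z[i] for i in sorted(B))         # projection of z onto B
    return z_A in S_A and z_B in S_B

# Toy example over D = {0,1}, X = {0,...,5}, A = {0,1}, B = {4,5}:
A, B = {0, 1}, {4, 5}
s = {2: 0, 3: 1}                                 # the fixed part
S_A = {(0, 0), (1, 1)}                           # density 2/4, so alpha = 1/2
S_B = {(0, 1), (1, 0)}                           # density 2/4, so alpha = 1/2
z = {0: 1, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
print(in_rectangle(z, A, B, s, S_A, S_B))        # True: a (2, 1/2)-rectangle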
An (m,α)-rectangle
[Figure: the coordinates x1,…,xn split into a block A and a block B, each of size ≥ m, with all remaining coordinates fixed to the assignment s; the allowed sets S_A ⊆ D^A and S_B ⊆ D^B are shaded]
• S_A and S_B each have density at least α
• In general A and B may be interleaved in [1,n]
29
Key lemma [BST 98]
• Let program P use
  - time T = kn
  - space S
  - and accept a fraction δ of its inputs in D^n
• Then P accepts all inputs in some (m,α)-rectangle where
  - m = βn
  - α is at least δ · 2^(-4(k+1)m - (S+1)r)
  - β^(-1) ~ 2^k and r ~ k²·2^k
30
Improved key lemma [Ajtai 99]
• Let program P use
  - time T = kn
  - space S
  - and accept a fraction δ of its inputs in D^n
• Then P accepts all inputs in some (m,α)-rectangle where
  - m = βn
  - α is at least δ · 2^(-β^(1/50k)·m - S·r)
  - β^(-1) and r are constants depending on k
31
Proving lower bounds using the key lemmas
• Show that the desired function f
  - evaluates to 1 on a large fraction of inputs
    - i.e., δ is large
  - evaluates to 0 on some input in any large (m,α)-rectangle
    - where "large" is given by the lemma bounds
• or ... do the same for ¬f
32
Our new key lemma
• Let program P use time T = kn, space S, and accept a fraction δ of its inputs in D^n
• Almost all inputs P accepts are in (m,α)-rectangles accepted by P, where
  - m = βn
  - α is at least δ² · 2^(-β^(1/8k)·m - S·r)
  - β^(-1) and r are 2^(O(k²))
• No input is in more than O(k) rectangles
33
Proving randomized lower bounds from our key lemma
• Show that the desired function f
  - evaluates to 1 on a large fraction of inputs
    - i.e., δ is large
  - evaluates to 0 on a γ fraction of the inputs in any large-enough (m,α)-rectangle
• or ... do the same for ¬f
• Gives a space lower bound for O(γδ/k)-error randomized algorithms running in time kn
34
Proof ideas: layers and trees
[Figure: the length-kn program cut into r layers of height kn/r, with boundary nodes v0, v1, v2, …, vr-1, vr]
• f = ∨ over all tuples (v1,…,vr-1) of f_(v1,…,vr-1)
  - the # of tuples (v1,…,vr-1) is 2^(S(r-1))
• f_(v1,…,vr-1) = ∧ for i = 1,…,r of f_(vi-1,vi)
  - each f_(vi-1,vi) can be computed by a decision tree of height kn/r
35
(r,ε)-decision forest
• The conjunction of r decision trees (BPs that are trees), each of height εn
• Each f_(v1,…,vr-1) is computed by an (r, k/r)-decision forest
• Only 2^(S(r-1)) of them
• The various f_(v1,…,vr-1) accept disjoint sets of inputs
36
Decision forest
[Figure: the r decision trees T1, T2, T3, T4, …, Tr, each of height kn/r]
• Assume w.l.o.g. all variables are read on every input
• Fix an input x accepted by the forest
• Each tree reads only a small fraction of the variables on input x
• Fix two disjoint subsets of trees, F and G
37
Core variables
[Figure: the trees T1, T2, T3, T4, …, Tt, with the subsets F and G of trees marked]
• Can split the set of variables into
  - core(x,F) = variables read only in F (= not read outside F)
  - core(x,G) = variables read only in G (= not read outside G)
  - the remaining variables
• stem(x,F,G) = the assignment to the remaining variables
• General idea: use core(x,F), core(x,G), and stem(x,F,G) to define (m,α)-rectangles, as in the sketch below
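
A small Python sketch (mine; read_sets is hypothetical data) of the core/stem split for a fixed input x:

def core_and_stem(x, read_sets, F, G):
    # read_sets[t] = set of variables that tree t reads on input x.
    # Returns core(x,F), core(x,G), and stem(x,F,G) as defined above.
    all_trees = set(read_sets)
    read_outside_F = set().union(*(read_sets[t] for t in all_trees - set(F)))
    read_outside_G = set().union(*(read_sets[t] for t in all_trees - set(G)))
    core_F = set(range(len(x))) - read_outside_F    # read only within F
    core_G = set(range(len(x))) - read_outside_G    # read only within G
    rest = set(range(len(x))) - core_F - core_G
    stem = {i: x[i] for i in rest}                  # assignment to the rest
    return core_F, core_G, stem

# Toy example with 6 variables and 4 trees:
x = [1, 0, 0, 1, 1, 0]
read_sets = {1: {0, 2}, 2: {1, 2}, 3: {3, 4}, 4: {4, 5}}
print(core_and_stem(x, read_sets, F={1}, G={3}))
# ({0}, {3}, {1: 0, 2: 0, 4: 1, 5: 0})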
38
A partition of the accepted inputs
• Fix F, G, and an input x accepted by P
• R_(x,F,G) = { y | core(y,F) = core(x,F), core(y,G) = core(x,G), stem(y,F,G) = stem(x,F,G), and P accepts y }
• For each F, G the R_(x,F,G) partition the accepted inputs into equivalence classes
• Claim: the R_(x,F,G) are (m,α)-rectangles
39
Classes are rectangles
• Let A = core(x,F), B = core(x,G), s = stem(x,F,G)
• S_A = { y_A | y in R_(x,F,G) }, S_B = { z_B | z in R_(x,F,G) }
• Let w = (s, y_A, z_B)
  - w agrees with y in all trees outside G, so core(w,G) = core(y,G) = core(x,G)
  - w agrees with z in all trees outside F, so core(w,F) = core(z,F) = core(x,F)
  - stem(w,F,G) = s = stem(x,F,G)
  - P accepts w since it accepts y and z
• So... w is in R_(x,F,G)
40
Few partitions suffice
• Only 4k pairs F, G suffice to cover almost all inputs accepted by P by large (m,α)-rectangles R_(x,F,G)
• Choose F, G uniformly at random, of a suitable size depending on the access pattern of the input
  - the probability that F, G isn't good is tiny
  - one such pair will work for almost all inputs with the given access pattern
  - only 4k sizes are needed
41
Special case: oblivious BPs
• core(x,F) and core(x,G) don't depend on x
• Choose each tree Ti to be in F with probability q, in G with probability q, and in neither with probability 1-2q
42
x^T M_y x on an (m,α)-rectangle
[Figure: the matrix M_y with its rows and columns split into the index blocks A and B]
For every assignment s to the variables outside A∪B,
  f(x_A, x_B, s, y) = x_A^T M_AB x_B + g(x_A, y) + h(x_B, y)
where M_AB is the A×B submatrix of M_y.
43
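
For completeness, one way to see this decomposition (my own derivation, not on the slide): write the input as x = x_A + x_B + s, viewing x_A, x_B, s as vectors supported on the disjoint coordinate sets A, B, and the remaining positions, and expand the quadratic form over GF(2):

\begin{align*}
x^{\mathsf{T}} M_y\, x
 &= (x_A + x_B + s)^{\mathsf{T}} M_y\, (x_A + x_B + s)\\
 &= \underbrace{x_A^{\mathsf{T}} M_y\, x_B + x_B^{\mathsf{T}} M_y\, x_A}_{\text{cross terms}}
  \;+\; \underbrace{x_A^{\mathsf{T}} M_y (x_A + s) + s^{\mathsf{T}} M_y\, x_A}_{g(x_A,\,y)}
  \;+\; \underbrace{x_B^{\mathsf{T}} M_y (x_B + s) + s^{\mathsf{T}} M_y (x_B + s)}_{h(x_B,\,y)}.
\end{align*}

The cross terms involve only the A×B and B×A blocks of M_y, which is what the slide abbreviates as x_A^T M_AB x_B; every other term depends on x_A alone or on x_B alone (together with the fixed s and y).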
Rectangles, rank, & rigidity
• The largest rectangle on which x_A^T M x_B is constant has α ≤ 2^(-rank(M))
  - [Borodin-Razborov-Smolensky 89]
• Lemma [Ajtai 99]: can fix y s.t. every βn×βn minor M_AB of M_y has rank(M_AB) ≥ cβn/log₂(1/β)
  - improvement of the bounds of [Beame-Saks-Thathachar 98] & [Borodin-Razborov-Smolensky 89] for Sylvester matrices
44
High rank implies balance
• For any rectangle S_A × S_B ⊆ {0,1}^A × {0,1}^B with μ(S_A × S_B) ≥ |A||B|·2^(3-rank(M)):
  Pr[ x_A^T M x_B = 1 | x_A ∈ S_A, x_B ∈ S_B ] ≥ 1/32
  Pr[ x_A^T M x_B = 0 | x_A ∈ S_A, x_B ∈ S_B ] ≥ 1/32
  - derived from a result for the inner product in r dimensions
• So rigidity also implies balance for all large rectangles, and so T = Ω(n·√(log(n/S)/loglog(n/S)))
• Also follows for element distinctness [Babai-Frankl-Simon 86]
45
Talk outline
• The right non-uniform model (for now)
  - branching programs
• Early success
  - multi-output functions, e.g. sorting
• Progress on problems in P
  - Crawling: restricted branching programs
  - That breakthrough first step (and more): true time-space tradeoffs
• The path ahead
46
Improving the bounds
• What is the limit?
  - T = Ω(n·log(n/S))?
  - T = Ω(n²/S)?
• Current bounds for general BPs are almost equal to the best current bounds for oblivious BPs!
  - T = Ω(n·log(n/S)) using 2-party CC [AM]
  - T = Ω(n·log²(n/S)) using multi-party CC [BNS]
47
Improving the bounds
• (m,α)-rectangles are a 2-party CC idea
  - insight: generalizing to non-oblivious BPs
  - yields the same bound as [AM] for oblivious BPs
• Generalize multi-party CC ideas to get better bounds for general BPs?
  - a similar framework yields the same bound as [BNS] for oblivious BPs
• Improve the oblivious BP lower bounds?
  - ideas other than communication complexity?
48
Extension to other problems
• The problem should be hard for (best-partition) 2-party communication complexity (after most variables are fixed)
  - try oblivious BPs first
• Prime candidate: (directed) st-connectivity
  - many non-uniform lower bounds in structured JAG models [Cook-Rackoff], [BBRRT], [Edmonds], [Barnes-Edmonds], [Achlioptas-Edmonds-Poon]
  - best-partition communication complexity bounds are known
49
Limitations of the current method
• Need n > T/r = decision tree height
  - else all functions are trivial
  - so r > T/n
• A decision forest works on a 2^(-Sr) fraction of the accepted inputs
  - the only place the space bound is used
• So we need Sr < n, else the decision forest need only work on one input
  - implies S·T/n < n, i.e. T < n²/S
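
The arithmetic behind the final bullet, written out:

\[
r > \frac{T}{n} \quad\text{and}\quad S\,r < n
\;\Longrightarrow\;
S\,\frac{T}{n} \;<\; S\,r \;<\; n
\;\Longrightarrow\;
T \;<\; \frac{n^{2}}{S}.
\]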
50