Discrepancy and SDPs
Nikhil Bansal (TU Eindhoven)
1/30
Outline
Discrepancy: definitions and applications
Basic results: upper/lower bounds
Partial Coloring method (non-constructive)
SDPs: basic method
Algorithmic Spencer’s Result
Lovett-Meka result
Lower bounds via SDP duality (Matousek)
2/30
Material
Classic: Geometric Discrepancy by J. Matousek
Papers:
Bansal. Constructive algorithms for discrepancy minimization, FOCS 2010
Matousek. The determinant lower bound is almost tight
Lovett, Meka. Discrepancy minimization by walking on the edges
Survey with fewer technical details:
Bansal. …
3/30
Discrepancy: What is it?
Study of gaps in approximating the continuous by the discrete.
Original motivation: Numerical Integration/ Sampling
Problem: How well can you approximate a region by discrete points
Discrepancy:
Max over intervals I
|(# points in I) – (length of I)|
4/30
Discrepancy: What is it?
Study of gaps in approximating the continuous by the discrete.
Problem: How uniformly can you distribute points in a grid.
“Uniform” : For every axis-parallel rectangle R
| (# points in R) - (Area of R) | should be low.
Discrepancy:
Max over rectangles R
|(# points in R) – (Area of R)|
[Figure: n^{1/2} \times n^{1/2} grid of points]
5/30
Distributing points in a grid
Problem: How uniformly can you distribute points in a grid.
“Uniform” : For every axis-parallel rectangle R
| (# points in R) - (Area of R) | should be low.
n = 64 points
Uniform: n^{1/2} discrepancy
Random: n^{1/2} (log log n)^{1/2}
Van der Corput Set: O(log n) discrepancy!
6/30
Quasi-Monte Carlo Methods
With n random samples: Error \propto 1/\sqrt{n}
Quasi-Monte Carlo Methods: Error \propto Disc/n
Can discrepancy be O(1) for 2d grid?
No. \Omega(log n) [Schmidt …]
d-dimensions: O(log^{d-1} n) [Halton-Hammersley]
\Omega(log^{(d-1)/2} n) [Roth]
\Omega(log^{(d-1)/2 + \eta} n) [Bilyk, Lacey, Vagharshakyan ’08]
7/30
Discrepancy: Example 2
Input: n points placed arbitrarily in a grid.
Color them red/blue such that each rectangle is colored as
evenly as possible
Discrepancy: max over rect. R
( | # red in R - # blue in R | )
Continuous: Color each element
1/2 red and 1/2 blue (0 discrepancy)
Discrete:
Random has about O(n^{1/2} log^{1/2} n)
Can achieve O(log^{2.5} n)
8/30
Combinatorial Discrepancy
Universe: U = [1,…,n]
Subsets: S_1, S_2, …, S_m
[Figure: overlapping sets S_1, S_2, S_3, S_4]
Color elements red/blue so each set is colored as evenly as possible.
Find \chi: [n] \to {-1,+1} to
minimize |\chi(S)|_\infty = max_S |\sum_{i \in S} \chi(i)|
If A is the m \times n incidence matrix:
Disc(A) = min_{x \in {-1,1}^n} |Ax|_\infty
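For intuition, the definition can be evaluated directly on tiny instances. The sketch below is a brute force over all 2^n colorings, so it is only usable for very small n; the function name `discrepancy` and the example set system are illustrative, not from the talk.

```python
from itertools import product

def discrepancy(sets, n):
    """Brute-force Disc: min over colorings in {-1,+1}^n of the worst set imbalance.

    `sets` is a list of sets of element indices in range(n).
    Exponential in n; meant only to illustrate the definition.
    """
    best, best_coloring = None, None
    for x in product((-1, 1), repeat=n):
        # |Ax|_infty for this coloring: largest |sum of colors| over all sets
        worst = max((abs(sum(x[i] for i in s)) for s in sets), default=0)
        if best is None or worst < best:
            best, best_coloring = worst, x
    return best, best_coloring

# Hypothetical example: all three pairs on 3 elements.
# Two of the three elements must share a color, so some pair sums to +-2.
triangle = [{0, 1}, {1, 2}, {0, 2}]
disc, coloring = discrepancy(triangle, 3)
```

Dropping one pair, e.g. `[{0, 1}, {1, 2}]`, lets the alternating coloring balance every set.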
9/30
Applications
CS: Computational Geometry, Comb. Optimization, Monte-Carlo
simulation, Machine learning, Complexity, Pseudo-Randomness, …
Math: Dynamical Systems, Combinatorics, Mathematical Finance,
Number Theory, Ramsey Theory, Algebra, Measure Theory, …
10/30
Hereditary Discrepancy
11/30
Rounding
Lovász-Spencer-Vesztergombi ’86
Given any matrix A, and x \in R^n
can round x to \tilde{x} \in Z^n s.t.
|Ax – A\tilde{x}|_\infty < Herdisc(A)
Proof: Round the bits one by one.
12/30
Can we find it efficiently?
Nothing known until recently.
Thm [B’10]. Can efficiently round so that
Error \leq O(\sqrt{log m log n})
Herdisc(A)
13/30
More rounding approaches
Bin Packing
Refined further by Rothvoss (Entropy rounding
method)
14/30
Dynamic Data Structures
N points in a 2-d region.
Weights update over time.
Query: Given an axis-parallel rectangle R,
determine the total weight on points in R.
Preprocess:
1) Low query time
2) Low update time (upon weight change)
15/30
Example
Line:
Query = O(n), Update = 1
Query = 1, Update = O(n^2)
Query = 2, Update = O(n)
Query = O(log n), Update = O(log n)
Recursively can get for 2-d.
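One standard structure that achieves the O(log n) query / O(log n) update trade-off on a line is a Fenwick (binary indexed) tree. The slide does not name a structure, so this choice, and the class and method names, are assumptions made for illustration.

```python
class FenwickTree:
    """Weights on points 0..n-1 of a line; interval sums and point updates in O(log n)."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # 1-indexed internal array

    def update(self, i, delta):
        """Add `delta` to the weight of point i."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)  # move to the next node covering position i

    def prefix(self, i):
        """Total weight on points 0..i-1."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)  # strip the lowest set bit
        return total

    def query(self, lo, hi):
        """Total weight on points lo..hi (inclusive)."""
        return self.prefix(hi + 1) - self.prefix(lo)

ft = FenwickTree(8)
for point, weight in enumerate([5, 1, 4, 2, 0, 3, 7, 6]):
    ft.update(point, weight)
```

Each interval query is a difference of two prefix sums, which is the same idea as the "Query = 2" option, but with the prefix sums stored hierarchically so that updates stay cheap.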
16/30
What about other objects?
Query
Circles
arbitrary rectangles aligned triangle
Turns out t_q t_u \geq n^{1/2}/log^2 n ?
Larsen: t_q t_u \geq disc(S)^2/log^2 n
17/30
Sketch of idea
A good data structure implies
D=AP
A = row sparse P = Column sparse
(low query time) (low update time)
18/30
Outline again
19/30
Basic Results
20/30
Best Known Algorithm
Random: Color each element i independently as
x(i) = +1 or -1 with probability ½ each.
Thm: Discrepancy = O((n log n)^{1/2})
Pf: For each set, expect O(n^{1/2}) discrepancy
Standard tail bounds: Pr[ |\sum_{i \in S} x(i)| \geq c n^{1/2} ] \approx e^{-c^2}
Union bound + Choose c \approx (log n)^{1/2}
Analysis tight: Random actually incurs \Omega((n log n)^{1/2}).
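A quick experiment makes the random-coloring bound concrete. The sketch below colors a randomly generated set system (the instance, its size, and the seed are arbitrary choices, not from the talk) and compares the observed discrepancy with the (n log n)^{1/2} scale.

```python
import math
import random

def random_coloring_discrepancy(sets, n, rng):
    """Color each element +1/-1 independently with prob. 1/2; return the worst set imbalance."""
    x = [rng.choice((-1, 1)) for _ in range(n)]
    return max(abs(sum(x[i] for i in s)) for s in sets)

rng = random.Random(0)           # fixed seed so the run is repeatable
n = m = 200                      # hypothetical instance size
sets = [[i for i in range(n) if rng.random() < 0.5] for _ in range(m)]

observed = random_coloring_discrepancy(sets, n, rng)
scale = math.sqrt(n * math.log(n))   # the (n log n)^{1/2} reference scale
```

Printing `observed / scale` for a few seeds and sizes gives a feel for the hidden constant; the code does not assert any particular value.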
21/30
Better Colorings Exist!
[Spencer 85]: (Six standard deviations suffice)
Always exists coloring with discrepancy \leq 6 n^{1/2}
(In general for arbitrary m, discrepancy = O(n^{1/2} log^{1/2}(m/n)))
Tight: For m=n, cannot beat 0.5 n^{1/2} (Hadamard Matrix, “orthogonal” sets)
Inherently non-constructive proof
(pigeonhole principle on exponentially large universe)
Challenge: Can we find it algorithmically ?
Certain algorithms do not work [Spencer]
Conjecture [Alon-Spencer]: May not be possible.
22/30
Beck Fiala Thm
U = [1,…,n]
Sets: S_1, S_2, …, S_m
[Figure: overlapping sets S_1, S_2, S_3, S_4]
Suppose each element lies in at most t sets (t << n).
[Beck Fiala ’81]: Discrepancy \leq 2t - 1.
(elegant linear algebraic argument, algorithmic result)
Beck Fiala Conjecture: O(t^{1/2}) discrepancy possible
Other results: O(t^{1/2} log t log n) [Beck]
O(t^{1/2} log n) [Srinivasan]
O(t^{1/2} log^{1/2} n) [Banaszczyk]
Non-constructive
23/30
Approximating Discrepancy
Question: If a set system has low discrepancy (say << n^{1/2}),
can we find a good discrepancy coloring?
[Charikar, Newman, Nikolov 11]:
Even 0 vs. \Omega(n^{1/2}) is NP-Hard
[Figure: set system S_1, S_2, … on elements 1, 2, …, n next to a copy S'_1, S'_2, … on elements 1', 2', …, n']
(Matousek): What if system has low Hereditary discrepancy?
herdisc(U,S) = max_{U' \subseteq U} disc(U', S|_{U'})
Robust measure of discrepancy (often same as discrepancy)
Widely used: TU set systems, Geometry, …
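The gap between discrepancy and hereditary discrepancy can be seen on a toy instance. The sketch below computes both by brute force (doubly exponential, tiny inputs only). The example glues a 3-element system to a copy of itself, so the full system has discrepancy 0 while a restriction does not; the function names and the instance are illustrative assumptions.

```python
from itertools import combinations, product

def disc(sets, universe):
    """Discrepancy of the set system restricted to `universe` (brute force)."""
    elems = list(universe)
    best = len(elems)
    for signs in product((-1, 1), repeat=len(elems)):
        color = dict(zip(elems, signs))
        # elements outside the restricted universe are ignored
        worst = max((abs(sum(color[i] for i in s if i in color)) for s in sets), default=0)
        best = min(best, worst)
    return best

def herdisc(sets, n):
    """Hereditary discrepancy: max of disc over all restrictions U' of range(n)."""
    return max(
        disc(sets, sub)
        for k in range(n + 1)
        for sub in combinations(range(n), k)
    )

# Elements 3, 4, 5 play the role of copies 0', 1', 2' of elements 0, 1, 2.
doubled = [{0, 1, 3, 4}, {1, 2, 4, 5}, {0, 2, 3, 5}]
full_disc = disc(doubled, range(6))   # color originals +1 and copies -1
her = herdisc(doubled, 6)
```

Restricting to the elements 0, 1, 2 recovers the three pairs on three elements, whose discrepancy is 2, so hereditary discrepancy is at least 2 even though the full system balances perfectly.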
24/30
Our Results
Thm 1: Can get Spencer’s bound constructively.
That is, O(n^{1/2}) discrepancy for m=n sets.
Thm 2: If each element lies in at most t sets, get bound of
O(t^{1/2} log n) constructively (Srinivasan’s bound)
Thm 3: For any set system, can find
Discrepancy \leq O(log(mn)) \cdot Hereditary discrepancy.
Other Problems: Constructive bounds (matching current best)
k-permutation problem [Spencer, Srinivasan,Tetali]
Geometric problems , …
25/30
Relaxations: LPs and SDPs
Not clear how to use.
Linear Program is useless. Can color each element ½ red and
½ blue. Discrepancy of each set = 0!
SDPs
(LP on v_i \cdot v_j, cannot control dimension of v’s)
|\sum_{i \in S} v_i|^2 \leq n   \forall S
|v_i|^2 = 1
Intended solution v_i = (+1,0,…,0) or (-1,0,…,0).
Trivially feasible: v_i = e_i (all v_i’s orthogonal)
Yet, SDPs will be a major tool.
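The trivial feasibility claim is easy to check numerically: with orthonormal vectors, the squared norm of the sum over a set is just the set size, which never exceeds n. A minimal sketch, with hypothetical helper names and an arbitrary small instance:

```python
def squared_norm_of_sum(vectors, subset):
    """Return |sum_{i in subset} v_i|^2 for vectors given as lists of numbers."""
    dim = len(vectors[0])
    total = [sum(vectors[i][d] for i in subset) for d in range(dim)]
    return sum(c * c for c in total)

n = 6
# v_i = e_i: the standard basis, all pairwise orthogonal, each of norm 1
basis = [[1 if d == i else 0 for d in range(n)] for i in range(n)]
subsets = [{0, 1, 2}, {2, 3}, set(range(n))]   # illustrative sets
values = [squared_norm_of_sum(basis, s) for s in subsets]
```

Here `values` equals the set sizes, so every constraint of the form "at most n" holds while the solution carries no information about a good coloring.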
26/30
Punch line
SDP very helpful if “tighter” bounds needed for some sets.
|\sum_{i \in S} v_i|^2 \leq 2n
|\sum_{i \in S'} v_i|^2 \leq n/log n
|v_i|^2 \leq 1
Tighter bound for S'
Not apriori clear why one can do this.
Entropy Method.
Algorithm will construct coloring over time and
use several SDPs in the process.
27/30
Talk Outline
Introduction
The Method
Low Hereditary discrepancy -> Good coloring
Additional Ideas
Spencer’s O(n^{1/2}) bound
28/30
Partial Coloring Method
29/30
A Question
[Figure: number line from -n to n]
30/30
Slight improvement
Can be improved to O(\sqrt{n})/2^n
If you pick a random {-1,1} coloring s,
w.p. say \geq ½, |a \cdot s| \leq c \sqrt{n}
So at least 2^{n-1} colorings s have |a \cdot s| \leq c \sqrt{n}
31/30
Algorithmically
Easy: 1/poly(n)
(How?)
Answer: Pick any poly(n) colorings.
[Karmarkar-Karp’81]: \approx 1/n^{\log n}
Huge gap: Major open question
Remark: {-1,+1} not enough. Really need
color 0 also.
32/30
Yet another enhancement
There is a {-1,0,1} coloring with at least
n/2 {-1,1}’s s.t. |\sum_i a_i s_i| \leq n/2^{n/5}
Make buckets of size 2n/2^{n/5}
At least 2^{4n/5} sums fall in same bucket
Claim: Some two s’ and s’’ in same bucket and differ in at
least n/2 coordinates
Again consider s = (s’-s’’)/2
33/30
Proof of Claim
Claim: Any set of 2^{4n/5} vertices of the
boolean cube has two vertices that differ in at least n/2 coordinates.
[Kleitman’66] Isoperimetry for cube.
Hamming ball B(v,r) has the smallest
diameter for a given number of vertices.
|B(v,n/4)| < 2^{4n/5}
34/30
Spencer’s proof
35/30
Our Approach
36/30
Algorithm (at high level)
Cube: {-1,+1}^n
Each dimension: An Element
Each vertex: A Coloring
[Figure: walk in the cube from “start” to “finish”]
Algorithm: “Sticky” random walk
Each step generated by rounding a suitable SDP
Move in various dimensions correlated, e.g. \eta^t_1 + \eta^t_2 \approx 0
Analysis: Few steps to reach a vertex (walk has high variance)
Disc(Si) does a random walk (with low variance)
37/30
An SDP
Hereditary disc. \lambda => the following SDP is feasible
SDP:
Low discrepancy: |\sum_{i \in S_j} v_i|^2 \leq \lambda^2
|v_i|^2 = 1
Obtain v_i \in R^n
Rounding:
Pick random Gaussian g = (g_1, g_2, …, g_n)
each coordinate g_i is iid N(0,1)
For each i, consider \eta_i = g \cdot v_i
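The rounding step itself is only a few lines once the SDP vectors are available. The sketch below assumes the vectors are already given (solving the SDP is out of scope here) and uses a hand-made example in which two vectors are antipodal, so their projections cancel exactly; the function name and vectors are illustrative.

```python
import random

def gaussian_projections(vectors, rng):
    """Project every vector onto one shared random Gaussian direction g."""
    dim = len(vectors[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # g_k iid N(0,1)
    return [sum(gk * vk for gk, vk in zip(g, v)) for v in vectors]

rng = random.Random(1)
# Hypothetical SDP solution: v_0 = -v_1, and v_2 orthogonal to both.
vectors = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
eta = gaussian_projections(vectors, rng)
pair_sum = eta[0] + eta[1]   # projection of v_0 + v_1 = 0
```

Because all projections share the same g, correlations between vectors carry over to the projected values: a set whose vectors sum to a short vector gets a small sum of projections.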
38/30
Properties of Rounding
Lemma: If g \in R^n is random Gaussian, then for any v \in R^n,
g \cdot v is distributed as N(0, |v|^2)
Pf:
N(0,a^2) + N(0,b^2) = N(0,a^2+b^2)
g \cdot v = \sum_i v(i) g_i \sim N(0, \sum_i v(i)^2)
Recall: \eta_i = g \cdot v_i
1. Each \eta_i \sim N(0,1)
2. For each set S,
\sum_{i \in S} \eta_i = g \cdot (\sum_{i \in S} v_i) \sim N(0, \leq \lambda^2)
(std deviation \leq \lambda)
SDP:
|v_i|^2 = 1
|\sum_{i \in S} v_i|^2 \leq \lambda^2
The \eta’s mimic a low discrepancy coloring (but are not {-1,+1})
39/30
Algorithm Overview
Construct coloring iteratively.
Initially: Start with coloring x_0 = (0,0,0,…,0) at t = 0.
At Time t: Update coloring as x_t = x_{t-1} + \gamma (\eta^t_1, …, \eta^t_n)
(\gamma tiny: 1/n suffices)
x_t(i) = \gamma (\eta^1_i + \eta^2_i + … + \eta^t_i)
[Figure: x(i) plotted against time, moving between -1 and +1]
Color of element i: Does random walk
over time with step size \approx \gamma N(0,1)
Fixed if reaches -1 or +1.
Set S: x_t(S) = \sum_{i \in S} x_t(i) does a random walk w/ step \gamma N(0, \leq \lambda^2)
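The "sticky" behaviour of a single coordinate can be simulated without any SDP. The sketch below is a deliberate simplification: each coordinate takes independent Gaussian steps, whereas the talk draws correlated steps from a freshly solved SDP at every time step. It only illustrates the freezing at -1 or +1; the function name, the step size 0.1, and the step cap are assumptions.

```python
import random

def sticky_walk(n, step, rng, max_steps=1_000_000):
    """Run n independent Gaussian walks from 0; freeze each one when it reaches -1 or +1.

    Simplification: steps are independent across coordinates here,
    not derived from an SDP as in the algorithm described above.
    """
    x = [0.0] * n
    alive = set(range(n))
    steps = 0
    while alive and steps < max_steps:
        steps += 1
        for i in list(alive):
            x[i] += step * rng.gauss(0.0, 1.0)
            if abs(x[i]) >= 1.0:
                x[i] = 1.0 if x[i] > 0 else -1.0   # clamp and fix the color
                alive.discard(i)
    return x, steps

rng = random.Random(0)
coloring, steps_used = sticky_walk(n=8, step=0.1, rng=rng)
```

With step size 0.1 a coordinate typically needs on the order of 1/0.1^2 = 100 steps to reach the boundary, which matches the random-walk heuristic that distance grows like the square root of the number of steps.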
40/30
Analysis
Consider time T = O(1/\gamma^2)
Claim 1: With prob. ½, at least n/2 elements reach -1 or +1.
Pf: Each element doing random walk with step size \approx \gamma.
Recall: Random walk with step 1 is \approx O(t^{1/2}) away in t steps.
A Trouble: Various element updates are correlated
Consider basic walk x(t+1) = x(t) \pm 1 with prob ½
Define Energy \Phi(t) = x(t)^2
E[\Phi(t+1)] = ½ (x(t)+1)^2 + ½ (x(t)-1)^2 = x(t)^2 + 1 = \Phi(t) + 1
Expected energy = n at t = n.
Claim 2: Each set has O(\lambda) discrepancy in expectation.
Pf: For each S, x_t(S) doing random walk with step size \approx \gamma \lambda
41/30
Analysis
Consider time T = O(1/\gamma^2)
Claim 1: With prob. ½, at least n/2 variables reach -1 or +1.
=> Everything colored in O(log n) rounds.
Claim 2: Each set has O(\lambda) discrepancy in expectation per round.
=> Expected discrepancy of a set at end = O(\lambda log n)
Thm: Obtain a coloring with discrepancy O(\lambda log(mn))
Pf: By Chernoff, Prob. that disc(S) \geq 2 Expectation + O(\lambda log m)
= O(\lambda log(mn))
is tiny (poly(1/m)).
42/30
Recap
At each step of walk, formulate SDP on unfixed variables.
Use some (existential) property to argue SDP is feasible
Rounding SDP solution -> Step of walk
Properties of walk:
High Variance -> Quick convergence
Low variance for discrepancy on sets -> Low discrepancy
43/30
Refinements
Spencer’s six std deviations result:
Goal: Obtain O(n^{1/2}) discrepancy for any set system on m = O(n) sets.
Random coloring has n^{1/2} (log n)^{1/2} discrepancy
Previous approach seems useless:
Expected discrepancy for a set O(n^{1/2}),
but some random walks will deviate by up to (log n)^{1/2} factor
Need an additional idea to prevent this.
44/30
Spencer’s O(n1/2) result
Partial Coloring Lemma: For any system with m sets, there exists a
coloring on \geq n/2 elements with discrepancy O(n^{1/2} log^{1/2}(2m/n))
[For m=n, disc = O(n^{1/2})]
Algorithm for total coloring:
Repeatedly apply partial coloring lemma
Total discrepancy
O( n^{1/2} log^{1/2} 2 ) [Phase 1]
+ O( (n/2)^{1/2} log^{1/2} 4 ) [Phase 2]
+ O( (n/4)^{1/2} log^{1/2} 8 ) [Phase 3]
+ …
= O(n^{1/2})
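The phase sum converges because the square-root factor shrinks geometrically while the logarithmic factor grows only like the square root of the phase number. A small numeric check, taking the hidden constant to be 1 and logarithms base 2 (both are assumptions made only for illustration):

```python
import math

def total_phase_bound(n):
    """Sum sqrt(n_k) * sqrt(log2(2n / n_k)) over phases, where n_k = n / 2^k uncolored elements remain."""
    total = 0.0
    remaining = float(n)
    while remaining >= 1.0:
        total += math.sqrt(remaining) * math.sqrt(math.log2(2.0 * n / remaining))
        remaining /= 2.0   # each phase colors at least half of what is left
    return total

ratios = {n: total_phase_bound(n) / math.sqrt(n) for n in (2**10, 2**20)}
```

The ratio to n^{1/2} stays bounded by a small constant as n grows, which is the content of the "= O(n^{1/2})" line.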
45/30
Proving Partial Coloring Lemma
Beautiful Counting argument (entropy method + pigeonhole)
Idea: Too many colorings (2^n), but few “discrepancy profiles”
Key Lemma: There exist k = 2^{4n/5} colorings X_1,…,X_k such that
every two X_i, X_j are “similar” for every set S_1,…,S_n.
Some X_1, X_2 differ on \geq n/2 positions
Consider X = (X_1 - X_2)/2
X_1 = ( 1,-1, 1, …, 1,-1,-1)
X_2 = (-1,-1,-1, …, 1, 1, 1)
X = ( 1, 0, 1, …, 0,-1,-1)
Pf: X(S) = (X_1(S) - X_2(S))/2 \in [-10 n^{1/2}, 10 n^{1/2}]
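The half-difference trick is mechanical and can be checked on a concrete pair. The two colorings below are a hypothetical 6-element instance (the slide's vectors elide their middle entries, so these are not the same vectors), and the set S is arbitrary.

```python
def half_difference(x1, x2):
    """Return (x1 - x2)/2 coordinate-wise; entries are in {-1, 0, +1} for +-1 inputs."""
    return [(a - b) // 2 for a, b in zip(x1, x2)]

def set_sum(x, s):
    """Signed sum of the coloring x over the set s."""
    return sum(x[i] for i in s)

x1 = [1, -1, 1, 1, -1, -1]
x2 = [-1, -1, -1, 1, 1, 1]
x = half_difference(x1, x2)          # nonzero exactly where x1 and x2 differ

S = {0, 1, 4}                        # illustrative set
lhs = set_sum(x, S)
rhs = (set_sum(x1, S) - set_sum(x2, S)) // 2
```

Coordinates where the two colorings agree become 0 (uncolored), and the discrepancy of every set under X is half the difference of its discrepancies under the two colorings, so "similar" colorings yield a low-discrepancy partial coloring.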
46/30
A useful generalization
There exists a partial coloring with non-uniform
discrepancy bound \Delta_S for set S
Even if \Delta_S = \Theta(n^{1/2}) in some average sense
47/30
An SDP
Suppose there exists partial coloring X:
1. On \geq n/2 elements
2. Each set S has |X(S)| \leq \Delta_S
SDP:
Low discrepancy: |\sum_{i \in S_j} v_i|^2 \leq \Delta_{S_j}^2
Many colors: \sum_i |v_i|^2 \geq n/2
|v_i|^2 \leq 1
Obtain v_i \in R^n
Pick random Gaussian g = (g_1, g_2, …, g_n)
each coordinate g_i is iid N(0,1)
For each i, consider \eta_i = g \cdot v_i
48/30
Algorithm
Initially write SDP with \Delta_S = c n^{1/2}
Each set S does random walk and expects to reach
discrepancy of O(\Delta_S) = O(n^{1/2})
Some sets will become problematic.
Reduce their \Delta_S on the fly.
Not many problematic sets, and entropy penalty low.
[Figure: discrepancy axis starting at 0 with thresholds Danger 1 at 20 n^{1/2}, Danger 2 at 30 n^{1/2}, Danger 3 at 35 n^{1/2}, …]
…
49/30
Concluding Remarks
Construct coloring over time by solving sequence of SDPs
(guided by existence results)
Works quite generally
Can be derandomized [Bansal-Spencer]
(use entropy method itself for derandomizing + usual tech.)
E.g. Deterministic six standard deviations can be viewed as a way to
derandomize something stronger than Chernoff bounds.
50/30
Thank You!
51/30
Rest of the talk
1. How to generate \eta_i with required properties.
2. How to update \Delta_S over time.
Show n^{1/2} (log log log n)^{1/2} bound.
54/30
Why so few algorithms?
• Often algorithms rely on continuous relaxations.
– Linear Program is useless. Can color each element ½
red and ½ blue.
• Improved results of Spencer, Beck, Srinivasan, …
based on clever counting (entropy method).
– Pigeonhole Principle on exponentially large systems
(seems inherently non-constructive)
55/30
Partial Coloring Lemma
Suppose we have discrepancy bound \Delta_S for set S.
Consider 2^n possible colorings
Signature of a coloring X: (b(S1), b(S2),…, b(Sm))
Want partial coloring with signature (0,0,0,…,0)
56/30
Progress Condition
Energy increases at each step:
E(t) = \sum_i x_i(t)^2
Initially energy =0, can be at most n.
Expected value of E(t) = E(t-1) + \sum_i \gamma_i(t)^2
Markov’s inequality.
57/30
Missing Steps
1. How to generate the \eta_i
2. How to update \Delta_S over time
58/30
Partial Coloring
If exist two colorings X1,X2
1. Same signature (b1,b2,…,bm)
2. Differ in at least n/2 positions.
X1 = (1,-1, 1 , …, 1,-1,-1)
X2 = (-1,-1,-1, …, 1,1, 1)
Consider X = (X1 –X2)/2
1. -1 or 1 on at least n/2 positions, i.e. partial coloring
2. Has signature (0,0,0,…,0)
X(S) = (X_1(S) - X_2(S)) / 2, so |X(S)| \leq \Delta_S for all S.
Can show that there are 2^{4n/5} colorings with same signature.
So, some two will differ on > n/2 positions. (Pigeon Hole)
59/30
Spencer’s O(n1/2) result
Partial Coloring Lemma: For any system with m sets,
there exists a coloring on \geq n/2 elements with
discrepancy O(n^{1/2} log^{1/2}(2m/n))
[For m=n, disc = O(n^{1/2})]
Algorithm for total coloring:
Repeatedly apply partial coloring lemma
Total discrepancy
O( n^{1/2} log^{1/2} 2 ) [Phase 1]
+ O( (n/2)^{1/2} log^{1/2} 4 ) [Phase 2]
+ O( (n/4)^{1/2} log^{1/2} 8 ) [Phase 3]
+ …
= O(n^{1/2})
Let us prove the lemma for m = n
61/30
Proving Partial Coloring Lemma
[Figure: real line cut at -30 n^{1/2}, -10 n^{1/2}, 10 n^{1/2}, 30 n^{1/2}; buckets labeled -2, -1, 0, 1, 2]
Pf: Associate with coloring X, signature = (b_1, b_2, …, b_n)
(b_i = bucket in which X(S_i) lies)
Wish to show: There exist 2^{4n/5} colorings with same signature
Choose X randomly: Induces distribution \mu on signatures.
Entropy(\mu) \leq n/5 implies some signature has prob. \geq 2^{-n/5}.
Entropy(\mu) \leq \sum_i Entropy(b_i) [Subadditivity of Entropy]
b_i = 0 w.p. \approx 1 - 2e^{-50},
= 1 w.p. \approx e^{-50}
= 2 w.p. \approx e^{-450}
Ent(b_1) \leq 1/5
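The per-set entropy bound can be sanity-checked numerically from the bucket probabilities listed above. The sketch keeps only buckets 0, ±1, ±2 (the ones the slide lists) and assumes the negative buckets mirror the positive ones; entropy is measured in bits.

```python
import math

def bucket_entropy_bits(tail_probs):
    """Entropy of a symmetric bucket variable b.

    tail_probs[k-1] is assumed to be Pr[b = k] = Pr[b = -k];
    bucket 0 receives the remaining probability mass.
    """
    tail_mass = 2.0 * sum(tail_probs)
    entropy = 0.0
    for p in tail_probs:
        if p > 0.0:
            entropy += 2.0 * (-p * math.log2(p))   # buckets +k and -k
    # bucket 0: log1p keeps precision because tail_mass is extremely small
    entropy += -(1.0 - tail_mass) * math.log1p(-tail_mass) / math.log(2.0)
    return entropy

ent = bucket_entropy_bits([math.exp(-50), math.exp(-450)])
```

With buckets of width 20 n^{1/2} the result is many orders of magnitude below 1/5, which shows how much slack there is for narrowing the buckets of a few sets.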
62/30
A useful generalization
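Partial coloring with non-uniform discrepancy \Delta_S for set S
For each set S, consider the “bucketing”
[Figure: real line cut at …, -3\Delta_S, -\Delta_S, \Delta_S, 3\Delta_S, 5\Delta_S, …; buckets labeled …, -2, -1, 0, 1, 2, …]
Suffices to have \sum_S Ent(b_S) \leq n/5
Or, if \Delta_S = \lambda_S n^{1/2}, then \sum_S g(\lambda_S) \leq n/5
g(\lambda) \approx e^{-\lambda^2/2} for \lambda > 1
g(\lambda) \approx ln(1/\lambda) for \lambda < 1
Bucket of n^{1/2}/100
has penalty \approx ln(100)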
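The entropy budget can be explored with a small calculator. The sketch below uses the two approximate regimes quoted above literally (ignoring the constants they hide, so values near 1 are not meaningful), and the instance with one tightly bounded set is hypothetical.

```python
import math

def penalty(lam):
    """Approximate entropy penalty for a set whose bound is lam * n^{1/2}."""
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    if lam > 1.0:
        return math.exp(-lam * lam / 2.0)   # wide buckets: tiny penalty
    return math.log(1.0 / lam)              # narrow buckets: logarithmic penalty

n = 100
# Hypothetical budget check: 99 sets with bound 10 n^{1/2}, one set with bound n^{1/2}/100.
lams = [10.0] * 99 + [0.01]
total = sum(penalty(lam) for lam in lams)
within_budget = total <= n / 5
```

Here the single tight set costs about ln(100), roughly 4.6, while the 99 loose sets cost essentially nothing, so the total stays below n/5 = 20.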
63/30
Recap
Partial Coloring: \Delta_S \approx 10 n^{1/2} gives low entropy
=> 2^{4n/5} colorings exist with same signature.
=> some X_1, X_2 with large Hamming distance.
(X_1 - X_2)/2 gives the desired partial coloring.
Only if we could find the partial coloring
efficiently…
Trouble: 2^{4n/5}/2^n is an exponentially small fraction.
64/30