Yuan Zhou Carnegie Mellon University

Joint works with Boaz Barak, Fernando G.S.L. Brandão,
Aram W. Harrow, Jonathan Kelner, Ryan O'Donnell and
David Steurer
Constraint Satisfaction Problems
• Given:
– a set of variables: V
– a set of values: Ω
– a set of "local constraints": E
• Goal: find an assignment σ : V -> Ω to maximize
#satisfied constraints in E
• α-approximation algorithm: always outputs a
solution of value at least α*OPT
Example 1: Max-Cut
• Vertex set: V = {1, 2, 3, ..., n}
• Value set: Ω = {0, 1}
• Typical local constraint: (i, j) ∈ E wants σ(i) ≠ σ(j)
• Alternative description:
– Given G = (V, E), divide V into two parts,
– to maximize #edges across the cut
• Best approx. alg.: 0.878-approx. [GW'95]
• Best NP-hardness: 0.941 [Has'01, TSSW'00]
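To make the Max-Cut objective concrete, here is a minimal brute-force sketch. It is only an illustration for tiny graphs (it tries all 2^n assignments); the function name `max_cut` and the 0-indexed vertices are assumptions of this example, not part of the talk.

```python
from itertools import product

def max_cut(n, edges):
    """Return (best_value, best_assignment) by trying all 2^n assignments.

    An edge (i, j) is satisfied when sigma[i] != sigma[j].
    """
    best_value, best_sigma = -1, None
    for sigma in product((0, 1), repeat=n):
        value = sum(1 for i, j in edges if sigma[i] != sigma[j])
        if value > best_value:
            best_value, best_sigma = value, sigma
    return best_value, best_sigma

triangle = [(0, 1), (1, 2), (0, 2)]
four_cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(max_cut(3, triangle)[0])    # 2
print(max_cut(4, four_cycle)[0])  # 4
```

The triangle value 2 is the same quantity that the Sum-of-Squares example later in the talk certifies.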
Example 2: Balanced Separator
• Vertex set: V = {1, 2, 3, ..., n}
• Value set: Ω = {0, 1}
• Minimize #satisfied local constraints:
(i, j) ∈ E : σ(i) ≠ σ(j)
• Global constraint: n/3 ≤ |{i : σ(i) = 0}| ≤ 2n/3
• Alternative description:
– given G = (V, E)
– divide V into two "balanced" parts,
– to minimize #edges across the cut
Example 2: Balanced Separator (cont'd)
• Best approx. alg.: O(√(log n))-approx. [ARV'04]
• Hardness: only (1+ε)-approx. is ruled out, even assuming
3-SAT has no subexponential-time algorithm [AMS'07]
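The role of the global balance constraint is easiest to see on a small instance. The sketch below is a brute-force illustration only; the name `min_balanced_cut` and the 0-indexed vertices are assumptions of this example.

```python
from itertools import product

def min_balanced_cut(n, edges):
    """Return (cut_size, assignment) minimizing crossing edges
    subject to n/3 <= |{i : sigma(i) = 0}| <= 2n/3."""
    best = None
    for sigma in product((0, 1), repeat=n):
        zeros = sigma.count(0)
        # Balance constraint, written with integers to avoid rounding.
        if not (n <= 3 * zeros <= 2 * n):
            continue
        cut = sum(1 for i, j in edges if sigma[i] != sigma[j])
        if best is None or cut < best[0]:
            best = (cut, sigma)
    return best

path6 = [(i, i + 1) for i in range(5)]
k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
print(min_balanced_cut(6, path6)[0])  # 1
print(min_balanced_cut(4, k4)[0])     # 4
```

Without the balance constraint the minimum would trivially be 0 (put every vertex on one side).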
Example 3: Unique Games
• Vertex set: V = {1, 2, 3, ..., n}
• Value set: Ω = {0, 1, 2, ..., q - 1}
• Maximize #satisfied local constraints:
(i, j) ∈ E : σ(i) - σ(j) = c (mod q)
• Unique Games Conjecture (UGC) [Kho'02,
KKMO'07]
No poly-time algorithm, given an instance
where the optimal solution satisfies a (1-ε)
fraction of constraints, finds a solution
satisfying an ε fraction of constraints
• Stronger than (implies) "no constant approx."
Example 3: Unique Games (cont'd)
• UG(ε): to tell whether an instance has a solution
satisfying a (1-ε) fraction of constraints, or no
solution satisfying an ε fraction of constraints
• Unique Games Conjecture (UGC). UG(ε) is hard
for sufficiently large q
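A small sketch of how a Unique Games instance is evaluated may help. The encoding of a constraint as a triple `(i, j, c)` meaning σ(i) - σ(j) = c (mod q), and the names `ug_fraction` and `ug_opt`, are assumptions made for this example; the exhaustive search is only viable for toy sizes.

```python
from itertools import product

def ug_fraction(sigma, constraints, q):
    """Fraction of constraints (i, j, c) with sigma[i] - sigma[j] = c (mod q)."""
    satisfied = sum(1 for i, j, c in constraints
                    if (sigma[i] - sigma[j]) % q == c % q)
    return satisfied / len(constraints)

def ug_opt(n, q, constraints):
    """Best satisfiable fraction, by exhaustive search over all q^n assignments."""
    return max(ug_fraction(sigma, constraints, q)
               for sigma in product(range(q), repeat=n))

q = 3
# Consistent with the planted assignment (0, 1, 2): every difference is 2 mod 3.
consistent = [(0, 1, 2), (1, 2, 2), (2, 0, 2)]
# Around the cycle the shifts must sum to 0 mod 3; here they sum to 4.
inconsistent = [(0, 1, 2), (1, 2, 2), (2, 0, 0)]
print(ug_opt(3, q, consistent))
print(ug_opt(3, q, inconsistent))
```

The "unique" in the name refers to the fact that fixing σ(j) determines the only value of σ(i) that satisfies the constraint.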
Example 3: Unique Games (cont'd)
• Implications of UGC
– For large class of problems, BASIC-SDP
(semidefinite programming relaxation) achieves
optimal approximation ratio
Max-Cut: 0.878-approx.
Vertex-Cover: 2-approx.
Max-CSP
[KKMO '07, MOO '10, KV '03, Rag '08]
Open questions
• Is UGC true?
• Are the implications of UGC true?
– Is Max-Cut hard to approximate better than
0.878?
– Is Balanced Separator hard to approximate
within a constant factor?
SDP Relaxation hierarchies
• A systematic way to write tighter and tighter
SDP relaxations
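[Diagram: a ladder of relaxations starting from BASIC-SDP; the r-round SDP relaxation is solvable in roughly $n^{O(r)}$ time. Labels on the diagram: GW SDP for Max-Cut (0.878-approx.), ARV SDP for Balanced Separator, and UG(ε) marked with "?".]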
• Examples
– Sherali-Adams+SDP [SA'90]
– Lasserre hierarchy [Par'00, Las'01]
How many rounds of tightening
suffice?
• Upper bounds
– $n^{\epsilon^{\Omega(1)}}$ rounds of SA+SDP suffice for UG(ε)
[ABS'10, BRS'11]
• Lower bounds [KV'05, DKSV'06, RS'09, BGHMRS '12]
(also known as constructing integrality gap instances)
– $\exp((\log\log n)^{\Omega(1)})$ rounds of SA+SDP needed
for UG(ε)
– $\exp((\log\log n)^{\Omega(1)})$ rounds of SA+SDP needed
for better-than-0.878 approx. for Max-Cut
– $(\log\log n)^{\Omega(1)}$ rounds of SA+SDP needed for
constant approx. for Balanced Separator
Our Results
• We study the performance of Lasserre SDP
hierarchy against known lowerbound instances
for SA+SDP hierarchy, and show that
• 8-round Lasserre solves the Unique Games
lowerbound instances [BBHKSZ'12]
• 4-round Lasserre solves the Balanced Separator
lowerbound instances [OZ'12]
• Constant-round Lasserre gives better-than-0.878
approximation for Max-Cut lowerbound
instances [OZ'12]
Proof overview
• Integrality gap instance
– SDP completeness: a good vector solution
– Integral soundness: no good integral solution
• A common method to construct gaps (e.g. [RS'09])
– Use the instance derived from a hardness
reduction
– Lift the completeness proof to vector world
– Use the soundness proof directly
Proof overview (cont'd)
• Our goal: to prove there is no good vector
solution
– Rounding algorithms?
• Instead,
– we bound the value of the dual of the SDP
– interpret the dual of the SDP as a proof
system ("Sum-of-squares proof system")
– lift the soundness proof to the proof system
Remarks
• Using a connection between SDP hierarchies and
algebraic proof systems, we refute all known UG
lowerbound instances and many instances for its
related problems
• We provide new insight in designing integrality
gap instances -- should avoid soundness proofs
that can be lifted to the powerful Sum-of-Squares
proof system
• We show that Lasserre is strictly stronger than
other hierarchies on UG and its related
problems (as it was believed to be)
Outline of the rest of the talk
• Sum-of-Squares proof system and
Lasserre hierarchy
• Lift the soundness proofs to the SoS
proof system
Sum-of-Squares proof system
and Lasserre hierarchy
Polynomial optimization
• Maximize/Minimize $p(x)$
• Subject to
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
all functions are low-degree n-variate
polynomial functions
• Max-Cut example:
Maximize $\mathbb{E}_{(i,j)\in E}\,[(x_i - x_j)^2]$
s.t. $x_i(1 - x_i) = 0,\ \forall i$
Polynomial optimization (cont'd)
• Maximize/Minimize $p(x)$
• Subject to
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
all functions are low-degree n-variate
polynomial functions
• Balanced Separator example:
Minimize $\mathbb{E}_{(i,j)\in E}\,[(x_i - x_j)^2]$
s.t. $x_i(1 - x_i) = 0,\ \forall i$
$\mathbb{E}_i[x_i] \ge 1/3,\ \mathbb{E}_i[x_i] \le 2/3$
Certifying no good solution
• Maximize $p(x)$
• Subject to
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
• To certify that there is no solution better than
$\beta$, simply say that the following equations &
inequalities are infeasible
$p(x) \ge \beta$
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
The Sum-of-Squares proof system
• To show the following equations & inequalities
are infeasible,
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
• Show that
$-1 = \sum_{i=1}^{m} f_i(x)\,q_i(x) + h(x)$
• where each $f_i(x)$ is a sum of squared polynomials, and
$h(x)$ is a sum of squared polynomials plus polynomial
multiples of the $r_i(x)$'s
• A degree-d "Sum-of-Squares" refutation, where
$d = \max_i\{\deg(f_i) + \deg(q_i),\ \deg(h)\}$
Example 1
• To refute
$x \ge 2$
$x(1 - x) = 0$
• We simply write
$-1 = x(1 - x) + (x - 2) + (x - 1)^2$
• A degree-2 SoS refutation
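The identity in Example 1 can be checked mechanically. The sketch below evaluates the right-hand side x(1 - x) + (x - 2) + (x - 1)^2 with exact rational arithmetic; since this is a polynomial of degree at most 2, agreeing with the constant -1 at three or more points already proves the identity. The helper name `rhs` is an assumption of this example.

```python
from fractions import Fraction

def rhs(x):
    """Right-hand side of the refutation: x(1-x) + (x-2) + (x-1)^2."""
    return x * (1 - x) + (x - 2) + (x - 1) ** 2

# Exact evaluation at 13 rational points; 3 points would already suffice
# for a polynomial of degree at most 2.
points = [Fraction(k, 3) for k in range(-6, 7)]
values = {rhs(x) for x in points}
print(values == {Fraction(-1)})  # True
```

Reading off the pieces: x(1 - x) is the equality constraint, (x - 2) is the inequality constraint with multiplier 1, and (x - 1)^2 is the sum-of-squares part.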
Example 2: Max-Cut on triangle graph
• To refute
$(x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_1)^2 \ge 2 + \epsilon$
$x_1(1 - x_1) = 0,\ x_2(1 - x_2) = 0,\ x_3(1 - x_3) = 0$
• We "simply" write
... ...
Example 2: Max-Cut on triangle graph
(cont'd)


 ( x1  x2 ) 2  ( x2  x3 ) 2  ( x3  x1 ) 2  2  

 ( x1 x2  x2 x3  x1 x3  x2 ) 2  ( x1  x2  1) 2  ( x2  x3  1) 2
 x1 (1  x1 )( x22  x32  2 x2 x3  1)
 x2 (1  x2 )( x1  x32  2 x1 x3  2 x1  2 x3  3)
 x3 (1  x3 )( x1  x2  2 x1 x2  1)
• A degree-4 SoS refutation
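A certificate of this shape can be checked by machine. The sketch below uses a different, self-derived certificate, not the identity from the slide: with s = x_1 + x_2 + x_3, it checks that 2 minus the cut polynomial equals the single square (1/2)((s - 1)(s - 2))^2 at every point of {0,1}^3. Agreement on the whole Boolean cube means the difference is a combination of the x_i(1 - x_i), and reducing a degree-4 polynomial via x_i^2 = x_i does not raise the degree, so this is again a degree-4 argument. The function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import product

def cut_poly(x):
    """(x1-x2)^2 + (x2-x3)^2 + (x3-x1)^2 for the triangle graph."""
    x1, x2, x3 = x
    return (x1 - x2) ** 2 + (x2 - x3) ** 2 + (x3 - x1) ** 2

def square_term(x):
    """The single square (1/2) * ((s-1)(s-2))^2 with s = x1 + x2 + x3."""
    s = sum(x)
    return Fraction(1, 2) * ((s - 1) * (s - 2)) ** 2

# Residual of 2 - cut(x) = square_term(x) on every Boolean point.
residuals = {x: 2 - cut_poly(x) - square_term(x)
             for x in product((0, 1), repeat=3)}
print(all(r == 0 for r in residuals.values()))  # True
```

A degree-2 square such as (s - 3/2)^2 only yields the weaker bound 9/4, which is why degree 4 is needed to certify the value 2.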
Relation between SoS proof system and
Lasserre SDP hierarchy
Finding SoS refutation by SDP
• A degree-d SoS refutation corresponds to a
solution of an SDP with $O(n^d)$ variables
• The SDP is the same as the dual of the $\Theta(d)$-round
Lasserre relaxation
Bounding SDP value by SoS refutation
• An SoS refutation => upper bound on the optimum of
the dual of Lasserre => upper bound on the value
of Lasserre
– e.g. 4-round Lasserre says that Max-Cut of
the triangle graph is at most 2
(BASIC-SDP gives 9/4)
Remarks
• Positivstellensatz. [Krivine'64, Stengle'73] If the
given equalities & inequalities are infeasible,
there is always an SoS refutation (degree not
bounded).
• The degree-d SoS proof system was first
proposed by Grigoriev and Vorobjov in 1999
• Grigoriev showed $\Omega(n)$ degree is needed to
refute unsatisfiable sparse $\mathbb{F}_2$-linear equations
– later rediscovered by Schoenebeck in Lasserre
world
SoS proofs (in contrast to refutations)
• Given assumptions
$q_1(x) \ge 0,\ q_2(x) \ge 0,\ \dots,\ q_m(x) \ge 0$
$r_1(x) = 0,\ r_2(x) = 0,\ \dots,\ r_{m'}(x) = 0$
to prove that $p(x) \le \beta$
• A degree-d SoS proof writes
$\beta - p(x) = \sum_{i=1}^{m} f_i(x)\,q_i(x) + h(x)$
where $f_i(x), h(x)$ are sums of squared polynomials and
$d = \max_i\{\deg(f_i) + \deg(q_i),\ \deg(h)\}$
• Remark. Degree-d SoS proof => degree-d SoS
refutation for $p(x) \ge \beta + \epsilon$, $\epsilon > 0$
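The remark follows from a one-line manipulation. The derivation below is a sketch, writing $\beta$ for the bound being proved and $\epsilon > 0$ for the slack, and assuming $\deg p \le d$ so that the added constraint does not raise the degree.

```latex
\begin{align*}
-\epsilon &= \bigl(p(x) - \beta - \epsilon\bigr) + \bigl(\beta - p(x)\bigr) \\
          &= \bigl(p(x) - \beta - \epsilon\bigr)
             + \sum_{i=1}^{m} f_i(x)\,q_i(x) + h(x), \\
-1        &= \tfrac{1}{\epsilon}\bigl(p(x) - \beta - \epsilon\bigr)
             + \sum_{i=1}^{m} \tfrac{1}{\epsilon}\, f_i(x)\,q_i(x)
             + \tfrac{1}{\epsilon}\, h(x).
\end{align*}
```

Since $1/\epsilon$ is a positive constant, the multipliers $f_i/\epsilon$ and $h/\epsilon$ are still sums of squares, and the new constraint $p(x) - \beta - \epsilon \ge 0$ is multiplied by the constant square $1/\epsilon$.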
Technical Part:
Lift the proofs to SoS proof
system
Components of the soundness proof
(of known UG instances)
• Cauchy-Schwarz/Hölder's inequality
• Hypercontractivity inequality
• Small sets expand in the noisy hypercube
• Invariance Principle
• Influence decoding
Hypercontractivity Inequality
• 2->4 hypercontractivity inequality:
for a low-degree polynomial
$f(x) = \sum_{S \subseteq [n],\ |S| \le d} \alpha_S \prod_{i \in S} x_i$
we have
$\mathbb{E}_{x \in \{-1,1\}^n}[f(x)^4] \le 9^d \left(\mathbb{E}_{x \in \{-1,1\}^n}[f(x)^2]\right)^2$
• Goal of an SoS proof:
write
$9^d \left(\mathbb{E}_{x \in \{-1,1\}^n}[f(x)^2]\right)^2 - \mathbb{E}_{x \in \{-1,1\}^n}[f(x)^4] = \sum_i h_i(\alpha_{\emptyset}, \alpha_{\{1\}}, \alpha_{\{2\}}, \dots)^2$
Note that the $\alpha_S$'s are indeterminates
Traditional proof of hypercontractivity
• 2->4 hypercontractivity inequality:
for a low-degree polynomial
$f(x) = \sum_{S \subseteq [n],\ |S| \le d} \alpha_S \prod_{i \in S} x_i$
we have
$\mathbb{E}_{x \in \{-1,1\}^n}[f(x)^4] \le 9^d \left(\mathbb{E}_{x \in \{-1,1\}^n}[f(x)^2]\right)^2$
• (Traditional) proof. Apply induction on d and n.
– Let $f = x_1 g + h$
– g and h are (n-1)-variate polynomials,
$\deg(g) \le d - 1,\ \deg(h) \le d$
Traditional proof of hypercontractivity (cont'd)
$\mathbb{E}[f^4] = \mathbb{E}[(x_1 g + h)^4]$
$= \mathbb{E}[x_1^4 g^4 + h^4 + 6 x_1^2 g^2 h^2 + 4 x_1 g h^3 + 4 x_1^3 g^3 h]$
$= \mathbb{E}[g^4] + \mathbb{E}[h^4] + 6\,\mathbb{E}[g^2 h^2]$
$\le \mathbb{E}[g^4] + \mathbb{E}[h^4] + 6 \sqrt{\mathbb{E}[g^4]\,\mathbb{E}[h^4]}$ (Cauchy-Schwarz)
$\le 9^d\,\mathbb{E}[g^2]^2 + 9^d\,\mathbb{E}[h^2]^2 + 6 \cdot 9^{(d-1)/2}\,\mathbb{E}[g^2] \cdot 9^{d/2}\,\mathbb{E}[h^2]$ (induction)
$= 9^d\,(\mathbb{E}[g^2] + \mathbb{E}[h^2])^2$
$= 9^d\,(\mathbb{E}[f^2])^2$
All equalities are polynomial identities
about the indeterminates $\alpha_S$
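As a numerical sanity check of the 2->4 inequality (not a proof, and not the SoS certificate), the sketch below draws random multilinear polynomials of degree at most d and compares E[f^4] with 9^d E[f^2]^2 by exhaustive enumeration of {-1,1}^n. The representation of a polynomial as a dictionary from index tuples S to coefficients, and all function names, are assumptions of this example.

```python
import random
from itertools import combinations, product
from math import prod

def random_poly(n, d, rng):
    """Random multilinear polynomial of degree <= d: maps index tuple S to alpha_S."""
    return {S: rng.uniform(-1.0, 1.0)
            for size in range(d + 1)
            for S in combinations(range(n), size)}

def evaluate(poly, x):
    """f(x) = sum_S alpha_S * prod_{i in S} x_i."""
    return sum(alpha * prod(x[i] for i in S) for S, alpha in poly.items())

def moment(poly, n, k):
    """E_x[f(x)^k] under the uniform distribution on {-1,1}^n (exhaustive)."""
    cube = list(product((-1, 1), repeat=n))
    return sum(evaluate(poly, x) ** k for x in cube) / len(cube)

def hypercontractivity_ratio(poly, n, d):
    """E[f^4] / (9^d * E[f^2]^2); the inequality says this is at most 1."""
    return moment(poly, n, 4) / (9 ** d * moment(poly, n, 2) ** 2)

rng = random.Random(0)
n, d = 5, 2
ratios = [hypercontractivity_ratio(random_poly(n, d, rng), n, d)
          for _ in range(20)]
print(max(ratios) <= 1.0)  # True
```

Random instances are typically far from tight, so this only illustrates the statement; the point of the SoS proof is that the gap itself is a sum of squares in the coefficients.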
SoS proof of hypercontractivity?
• The square root in the Cauchy-Schwarz step
looks difficult for polynomials
• Solution: Prove a stronger statement -- two-function
hypercontractivity inequality
• Theorem. Suppose
$f(x) = \sum_{S \subseteq [n],\ |S| \le d} \alpha_S \prod_{i \in S} x_i,\quad g(x) = \sum_{S \subseteq [n],\ |S| \le e} \beta_S \prod_{i \in S} x_i$
• then
$\mathbb{E}[f^2 g^2] \le 9^{\frac{d+e}{2}}\,\mathbb{E}[f^2]\,\mathbb{E}[g^2]$
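SoS proof of two-fcn hypercontractivity
• Write $f = x_1 f_0 + f_1,\ g = x_1 g_0 + g_1$
(so $\deg f_0 \le d - 1$ and $\deg g_0 \le e - 1$)
$\mathbb{E}[f^2 g^2] = \mathbb{E}[(x_1 f_0 + f_1)^2 (x_1 g_0 + g_1)^2]$
$= \mathbb{E}[f_0^2 g_0^2 + f_1^2 g_1^2 + f_0^2 g_1^2 + f_1^2 g_0^2 + 4 f_0 f_1 g_0 g_1]$
$\le \mathbb{E}[f_0^2 g_0^2 + f_1^2 g_1^2 + f_0^2 g_1^2 + f_1^2 g_0^2] + 2\,\mathbb{E}[f_0^2 g_1^2] + 2\,\mathbb{E}[f_1^2 g_0^2]$
using $(f_0 g_1 - f_1 g_0)^2 \ge 0$
$= \mathbb{E}[f_0^2 g_0^2 + f_1^2 g_1^2 + 3 f_0^2 g_1^2 + 3 f_1^2 g_0^2]$
$\le 9^{\frac{d+e}{2}}\,\mathbb{E}[f_0^2]\,\mathbb{E}[g_0^2] + 9^{\frac{d+e}{2}}\,\mathbb{E}[f_1^2]\,\mathbb{E}[g_1^2] + 3 \cdot 9^{\frac{d+e-1}{2}}\,\mathbb{E}[f_0^2]\,\mathbb{E}[g_1^2] + 3 \cdot 9^{\frac{d+e-1}{2}}\,\mathbb{E}[f_1^2]\,\mathbb{E}[g_0^2]$ (induction)
$= 9^{\frac{d+e}{2}}\,(\mathbb{E}[f_0^2] + \mathbb{E}[f_1^2])(\mathbb{E}[g_0^2] + \mathbb{E}[g_1^2])$
$= 9^{\frac{d+e}{2}}\,\mathbb{E}[f^2]\,\mathbb{E}[g^2]$
unroll the induction to
get the SoS proof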
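The two-function inequality can be sanity-checked numerically in the same spirit. The sketch below is an illustration under assumptions (dictionary representation of multilinear polynomials, exhaustive enumeration of {-1,1}^n, hypothetical function names); it does not construct the SoS certificate itself.

```python
import random
from itertools import combinations, product
from math import prod

def random_poly(n, deg, rng):
    """Random multilinear polynomial of degree <= deg as {index tuple: coefficient}."""
    return {S: rng.uniform(-1.0, 1.0)
            for size in range(deg + 1)
            for S in combinations(range(n), size)}

def evaluate(poly, x):
    """Evaluate sum_S coeff_S * prod_{i in S} x_i at the point x."""
    return sum(c * prod(x[i] for i in S) for S, c in poly.items())

def two_function_ratio(f, g, n, d, e):
    """E[f^2 g^2] / (9^((d+e)/2) * E[f^2] * E[g^2]); claimed to be at most 1."""
    cube = list(product((-1, 1), repeat=n))
    fv = [evaluate(f, x) for x in cube]
    gv = [evaluate(g, x) for x in cube]
    mean = lambda vals: sum(vals) / len(cube)
    lhs = mean([a * a * b * b for a, b in zip(fv, gv)])
    rhs = 9 ** ((d + e) / 2) * mean([a * a for a in fv]) * mean([b * b for b in gv])
    return lhs / rhs

rng = random.Random(1)
n, d, e = 5, 2, 1
ratios = [two_function_ratio(random_poly(n, d, rng), random_poly(n, e, rng), n, d, e)
          for _ in range(20)]
print(max(ratios) <= 1.0)  # True
```

Taking g = f (so e = d) recovers the single-function statement E[f^4] <= 9^d E[f^2]^2, which is why the two-function form is the stronger one.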
Small-set expansion of noisy hypercube
• For $f : \{-1,1\}^n \to \mathbb{R}$, let $T_{1-\eta} f(x) = \mathbb{E}_{y \sim_{1-\eta} x}[f(y)]$
• Theorem. If $f(x)(1 - f(x)) = 0,\ \forall x$ and
$\mathbb{E}[f] \le \delta$
• then $\mathbb{E}_x[f(x)\,T_{1-\eta} f(x)] \le \delta^{1+\Omega(\eta)}$
• Traditional proof. Let P be the projection
operator onto the eigenspace of $T_{1-\eta}$ with
eigenvalue $\ge \gamma$, i.e. the space spanned by
$\{\chi_S(x) = \prod_{i \in S} x_i : |S| \le \log_{1-\eta} \gamma\}$
Traditional proof of SSE of noisy
hypercube (cont'd)
$\mathbb{E}_x[f(x)\,T_{1-\eta} f(x)]$
$= \mathbb{E}_x[f(x)\,T_{1-\eta} P f(x)] + \mathbb{E}_x[f(x)\,T_{1-\eta} P^{\perp} f(x)]$ (poly. identity)
$\le \mathbb{E}_x[f(x)\,P f(x)] + \gamma\,\mathbb{E}_x[f(x)^2]$ (SoS friendly)
$\le \mathbb{E}_x[f(x)^{4/3}]^{3/4}\,\mathbb{E}_x[(P f(x))^4]^{1/4} + \gamma\,\mathbb{E}_x[f(x)^2]$ (Hölder's)
$= \mathbb{E}_x[f(x)]^{3/4}\,\mathbb{E}_x[(P f(x))^4]^{1/4} + \gamma\,\mathbb{E}_x[f(x)]$ (SoS friendly)
$\le \mathbb{E}_x[f(x)]^{3/4} \cdot 3^{\log_{1-\eta} \gamma}\,\mathbb{E}_x[(P f(x))^2]^{1/2} + \gamma\,\mathbb{E}_x[f(x)]$ (hypercontractivity)
$\le \mathbb{E}_x[f(x)]^{3/4} \cdot 3^{\log_{1-\eta} \gamma}\,\mathbb{E}_x[f(x)^2]^{1/2} + \gamma\,\mathbb{E}_x[f(x)]$ (SoS friendly)
$= 3^{\log_{1-\eta} \gamma}\,\mathbb{E}_x[f(x)]^{5/4} + \gamma\,\mathbb{E}_x[f(x)]$ (SoS friendly)
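Traditional proof of SSE of noisy
hypercube (cont'd)
$\mathbb{E}_x[f(x)\,T_{1-\eta} f(x)]$
$\le 3^{\log_{1-\eta} \gamma}\,\mathbb{E}_x[f(x)]^{5/4} + \gamma\,\mathbb{E}_x[f(x)]$
$\le 3^{\log_{1-\eta} \gamma}\,\delta^{5/4} + \gamma\,\delta$ (SoS friendly)
$\le \delta^{1+\Omega(\eta)}$ (take $\gamma = \delta^{\eta/100}$)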
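To make the quantity in the small-set expansion theorem concrete, the sketch below computes E_x[f(x) T f(x)] exactly for the indicator of a subcube. It assumes the standard convention that, for correlation rho (playing the role of 1 minus the noise rate), each coordinate of y equals the corresponding coordinate of x with probability (1 + rho)/2, independently. The names `noise_stability`, `subcube`, `eta` and `delta` are assumptions of this example.

```python
from fractions import Fraction
from itertools import product

def noise_stability(f, n, rho):
    """E_x[f(x) * (T_rho f)(x)], where y is a rho-correlated copy of x."""
    cube = list(product((-1, 1), repeat=n))
    total = Fraction(0)
    for x in cube:
        for y in cube:
            # Probability of y given x: coordinates agree w.p. (1 + rho) / 2.
            prob = Fraction(1)
            for xi, yi in zip(x, y):
                prob *= (1 + rho) / 2 if xi == yi else (1 - rho) / 2
            total += f(x) * f(y) * prob
    return total / len(cube)

def subcube(x):
    """0/1 indicator of the subcube {x : x_0 = x_1 = 1}; its mean is 1/4."""
    return 1 if x[0] == 1 and x[1] == 1 else 0

eta = Fraction(1, 2)
delta = Fraction(1, 4)
stability = noise_stability(subcube, 4, 1 - eta)
print(stability)  # 9/64
```

For this set the value 9/64 sits strictly between delta^2 = 1/16 (what a fully random y would give) and delta = 1/4 (the trivial bound), which is the regime the theorem quantifies.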
Key problem: fractional power
involved in the Hölder's step
Solution: Cauchy-Schwarz/Hölder's
with no fractional power
SoS-izable Cauchy-Schwarz
• Theorem. For any constant a > 0
$\frac{a}{2}\,\mathbb{E}[f^2] + \frac{1}{2a}\,\mathbb{E}[g^2] - \mathbb{E}[fg] = \mathrm{SoS}$
where SoS is a sum of squared polynomials of
degree at most 2
• Remark. $\frac{a}{2} + \frac{X}{2a} \ge \sqrt{X}$ and the equality holds
when $a = \sqrt{X}$
• Proof. Skipped.
• Corollary. (Hölder's) For any constants a, b > 0
$\frac{ab}{4}\,\mathbb{E}[f^4]^2\,\mathbb{E}[g^4] + \frac{a}{4b}\,\mathbb{E}[f^4] + \frac{1}{2a} - \mathbb{E}[f^3 g] = \mathrm{SoS}$
• Proof. Apply C-S twice
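One way to see both statements is sketched below; this is an assumed route, not necessarily the argument intended on the slide. The theorem follows from completing the square pointwise, and the corollary from the same completion applied to the number $t = \mathbb{E}[f^3 g]$, followed by two uses of Cauchy-Schwarz.

```latex
\begin{align*}
\tfrac{a}{2} f^2 + \tfrac{1}{2a} g^2 - f g
  &= \tfrac{1}{2}\Bigl(\sqrt{a}\, f - \tfrac{1}{\sqrt{a}}\, g\Bigr)^2
  && \text{(take expectations for the theorem)} \\[4pt]
\mathbb{E}[f^3 g]
  &\le \tfrac{a}{2}\,\mathbb{E}[f^3 g]^2 + \tfrac{1}{2a}
  && \text{since } \tfrac{a}{2} t^2 + \tfrac{1}{2a} - t = \tfrac{1}{2a}(a t - 1)^2 \\
  &\le \tfrac{a}{2}\,\mathbb{E}[f^4]\,\mathbb{E}[f^2 g^2] + \tfrac{1}{2a}
  && \text{(Cauchy-Schwarz on } f^2 \cdot f g) \\
  &\le \tfrac{a}{2}\,\mathbb{E}[f^4]\Bigl(\tfrac{b}{2}\,\mathbb{E}[f^4]\,\mathbb{E}[g^4] + \tfrac{1}{2b}\Bigr) + \tfrac{1}{2a}
  && \text{(Cauchy-Schwarz, then } \sqrt{Y} \le \tfrac{b}{2} Y + \tfrac{1}{2b}) \\
  &= \tfrac{ab}{4}\,\mathbb{E}[f^4]^2\,\mathbb{E}[g^4] + \tfrac{a}{4b}\,\mathbb{E}[f^4] + \tfrac{1}{2a}.
\end{align*}
```

No step takes a root of an expression in the indeterminates, which is the property the SoS proof needs.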
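SoS proof of SSE
$\mathbb{E}_x[f(x)\,P f(x)] = \mathbb{E}_x[f(x)^3\,P f(x)]$ (SoS friendly)
$\le \frac{ab}{4}\,\mathbb{E}_x[f(x)^4]^2\,\mathbb{E}_x[(P f(x))^4] + \frac{a}{4b}\,\mathbb{E}_x[f(x)^4] + \frac{1}{2a}$ (Hölder's)
$\le \frac{ab}{4}\,\delta^2\,\mathbb{E}_x[(P f(x))^4] + \frac{a}{4b}\,\delta + \frac{1}{2a}$
$\le \frac{ab}{4}\,\delta^2 \cdot 3^{2\log_{1-\eta} \gamma}\,\mathbb{E}_x[(P f(x))^2]^2 + \frac{a}{4b}\,\delta + \frac{1}{2a}$ (hypercontractivity)
$\le \frac{ab}{4}\,\delta^4 \cdot 3^{2\log_{1-\eta} \gamma} + \frac{a}{4b}\,\delta + \frac{1}{2a}$
$= \frac{1}{4}\,\delta^{5/4} \cdot 3^{2\log_{1-\eta} \gamma} + \frac{3}{4}\,\delta^{5/4}$ (take $a = \delta^{-5/4},\ b = \delta^{-6/4}$)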
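SoS proof of SSE (cont'd)
$\mathbb{E}_x[f(x)\,T_{1-\eta} f(x)]$
$\le \mathbb{E}_x[f(x)\,P f(x)] + \gamma\,\mathbb{E}_x[f(x)^2]$
$\le \frac{1}{4}\,\delta^{5/4} \cdot 3^{2\log_{1-\eta} \gamma} + \frac{3}{4}\,\delta^{5/4} + \gamma\,\delta$
$\le \delta^{1+\Omega(\eta)}$ (take $\gamma = \delta^{\eta/100}$)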
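The choice of the constants a and b in the Hölder step can be checked by tracking exponents of the set-size bound (called `delta` here, an assumed name). With a = delta^(-5/4) and b = delta^(-6/4), the three terms (ab/4) delta^4, (a/(4b)) delta and 1/(2a) should all scale as delta^(5/4). The sketch below does this bookkeeping with exact fractions; the variable names are assumptions of this example.

```python
from fractions import Fraction

# Exponents of delta in the constants a and b.
a_exp = Fraction(-5, 4)
b_exp = Fraction(-6, 4)

# Exponent of delta in each term after substituting a and b.
term_ab = a_exp + b_exp + 4      # (ab/4) * delta^4
term_a_over_b = a_exp - b_exp + 1  # (a/(4b)) * delta
term_inv_a = -a_exp              # 1/(2a)

# Numeric coefficients: 1/4 (carries the hypercontractivity factor), 1/4, 1/2.
plain_coefficient = Fraction(1, 4) + Fraction(1, 2)

print(term_ab, term_a_over_b, term_inv_a)  # 5/4 5/4 5/4
print(plain_coefficient)                   # 3/4
```

So the bound collapses to one quarter of delta^(5/4) times the hypercontractivity factor, plus three quarters of delta^(5/4).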
A few words on Invariance Principle
• trickier
• "bump function" is used in the original proof
--- not a polynomial!
• but... a polynomial substitution is enough for UG
Max-Cut and Balanced Separator
• An SoS proof for the "Majority Is Stablest"
theorem is needed for Max-Cut instances
– We don't know how to get around the bump
function issue in the invariance step
– Instead, we proved a weaker theorem: "2/π
theorem" -- suffices to give better-than-0.878
algorithms for known Max-Cut instances
• Balanced Separator. Key is to SoS-ize the proof
of the KKL theorem
– Hypercontractivity and SSE are also useful
there
– Some more issues to be handled
Summary
• SoS/Lasserre hierarchy refutes all known UG
instances and Balanced Separator instances, gives
better-than-0.878 approximation for known Max-Cut
instances,
– certain types of soundness proofs do not work
for showing a gap of SoS/Lasserre hierarchy
Open problems
• Show that SoS/Lasserre hierarchy fully refutes
Max-Cut instances?
– SoS-ize Majority Is Stablest theorem...
• More lowerbound instances for SoS/Lasserre
hierarchy?
Thank you!