Tutorial: The Lasserre Hierarchy in Approximation Algorithms
Thomas Rothvoß, Department of Mathematics, MIT
MAPSP 2013

Motivation

Problem: a weak LP relaxation $K = \{x \in \mathbb{R}^n \mid Ax \ge b\}$.

Example: Independent Set
◮ Relaxation: $K = \{x \in \mathbb{R}^V \mid x_u + x_v \le 1 \;\; \forall (u,v) \in E\}$
[Figure: a triangle with $x_1 = x_2 = x_3 = \frac{1}{2}$, a feasible fractional point]

◮ Option I: Find and add violated inequalities.
  Cons: Which ones? Solvable in polytime?
◮ Option II: Apply the Lasserre hierarchy!
  Pros: Solvable in polytime + known properties.

I will try to convince you: Option II is better!

Lift-and-project methods

[Figure: the relaxations $\mathrm{Las}_1(K) \supseteq \mathrm{Las}_2(K) \supseteq \mathrm{Las}_3(K) \supseteq \dots \supseteq \mathrm{Las}_n(K)$ shrink from $K$ down to the integer hull of $K$]

Positive semidefiniteness

For a symmetric matrix $M$ the following are equivalent:
◮ $M$ is positive semidefinite
◮ $x^T M x \ge 0 \;\; \forall x$
◮ Every principal submatrix has $\det \ge 0$
◮ There are vectors $v_i$ with $M_{ij} = \langle v_i, v_j \rangle$

Round-$t$ Lasserre relaxation

◮ Given: $K = \{x \in \mathbb{R}^n \mid Ax \ge b\}$ with $A \in \mathbb{R}^{m \times n}$.
◮ Introduce variables $y_I$ for $I \subseteq [n]$, $|I| \le t$, with the intended meaning
$$y_I \equiv \bigwedge_{i \in I} (x_i = 1).$$
◮ $y_{\{i\}} = x_i$
◮ $y_\emptyset = 1$

The round-$t$ Lasserre relaxation $\mathrm{Las}_t(K)$ consists of all $y$ with
$$M_t(y) := \big(y_{I \cup J}\big)_{|I|,|J| \le t} \succeq 0 \qquad \text{(moment matrix)}$$
$$M_t^\ell(y) := \Big(\sum_{i \in [n]} A_{\ell i}\, y_{I \cup J \cup \{i\}} - b_\ell\, y_{I \cup J}\Big)_{|I|,|J| \le t} \succeq 0 \quad \forall \ell \in [m] \qquad \text{(slack moment matrices)}$$
$$y_\emptyset = 1.$$
◮ Solvable in time $n^{O(t)} m^{O(1)}$.
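To make the definition concrete, here is a minimal Python/numpy sketch (the helper names are my own, not from the talk) that assembles the moment matrix and the slack moment matrices from a dictionary $y$ mapping frozensets to reals; $y$ must carry values for all sets of size up to $2t+1$. Membership in $\mathrm{Las}_t(K)$ then amounts to all of these matrices being positive semidefinite — actually *finding* a feasible $y$ requires an SDP solver, which this sketch omits.

import itertools
import numpy as np

def subsets(n, t):
    """All index sets I ⊆ {0,...,n-1} with |I| ≤ t, as frozensets."""
    return [frozenset(c) for k in range(t + 1)
            for c in itertools.combinations(range(n), k)]

def moment_matrix(y, n, t):
    """M_t(y): entry (I, J) is y_{I ∪ J}."""
    idx = subsets(n, t)
    return np.array([[y[I | J] for J in idx] for I in idx])

def slack_moment_matrix(y, n, t, a, beta):
    """Slack moment matrix for a^T x ≥ beta:
    entry (I, J) is sum_i a_i y_{I∪J∪{i}} − beta · y_{I∪J}."""
    idx = subsets(n, t)
    return np.array([[sum(a[i] * y[I | J | {i}] for i in range(n))
                      - beta * y[I | J] for J in idx] for I in idx])

def is_psd(M, tol=1e-9):
    """PSD test via the smallest eigenvalue."""
    return np.linalg.eigvalsh(M).min() >= -tol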
Example: Independent Set

Moment matrix for $t = 1$ (triangle graph; rows and columns indexed by $\emptyset, \{1\}, \{2\}, \{3\}$):
$$M_1(y) = \begin{pmatrix} 1 & y_1 & y_2 & y_3 \\ y_1 & y_1 & y_{12} & y_{13} \\ y_2 & y_{12} & y_2 & y_{23} \\ y_3 & y_{13} & y_{23} & y_3 \end{pmatrix}$$

Slack moment matrix for edge $(1,2)$ (constraint $1 - x_1 - x_2 \ge 0$) for $t = 1$, with entries $y_{I\cup J} - y_{I\cup J\cup\{1\}} - y_{I\cup J\cup\{2\}}$:
$$\begin{pmatrix}
1 - y_1 - y_2 & y_1 - y_1 - y_{12} & y_2 - y_{12} - y_2 & y_3 - y_{13} - y_{23} \\
y_1 - y_1 - y_{12} & y_1 - y_1 - y_{12} & y_{12} - y_{12} - y_{12} & y_{13} - y_{13} - y_{123} \\
y_2 - y_{12} - y_2 & y_{12} - y_{12} - y_{12} & y_2 - y_{12} - y_2 & y_{23} - y_{123} - y_{23} \\
y_3 - y_{13} - y_{23} & y_{13} - y_{13} - y_{123} & y_{23} - y_{123} - y_{23} & y_3 - y_{13} - y_{23}
\end{pmatrix}$$

Basic properties (1)

Lemma. $\mathrm{conv}(K \cap \{0,1\}^n) \subseteq \mathrm{Las}_t^{proj}(K)$, the projection of $\mathrm{Las}_t(K)$ onto the singleton variables.

◮ For $x \in K \cap \{0,1\}^n$ define
$$y_I := \begin{cases} 1 & x_i = 1 \;\forall i \in I \\ 0 & \text{otherwise.} \end{cases}$$
◮ Then $(M_t(y))_{I,J} = y_{I\cup J} = y_I \cdot y_J = (yy^T)_{I,J}$, and $yy^T \succeq 0$.
◮ Similarly, for a constraint $ax \ge \beta$: $\sum_{i=1}^n a_i y_{I\cup J\cup\{i\}} - \beta\, y_{I\cup J} = (ax - \beta) \cdot y_I \cdot y_J$, and $(ax - \beta) \cdot yy^T \succeq 0$.

Basic properties (2)

Lemma. For $y \in \mathrm{Las}_t(K)$:
a) $(y_1, \dots, y_n) \in K$
b) $0 \le y_I \le 1$ for all $|I| \le t$
c) $y_J \le y_I$ for $I \subseteq J$ with $|I|, |J| \le t$.

a) The slack moment matrix for $ax \ge \beta$ contains the diagonal entry $(\emptyset, \emptyset)$: $\;\sum_{i=1}^n a_i y_i - y_\emptyset \beta \ge 0$.
b) The submatrix of $M_t(y)$ indexed by $\{\emptyset, I\}$:
$$\det \begin{pmatrix} y_\emptyset & y_I \\ y_I & y_I \end{pmatrix} = y_I (1 - y_I) \ge 0$$
c) The submatrix of $M_t(y)$ indexed by $\{I, J\}$ (using $I \cup J = J$):
$$\det \begin{pmatrix} y_I & y_J \\ y_J & y_J \end{pmatrix} = y_J (y_I - y_J) \ge 0$$
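As a sanity check of the lemma, the helpers from the earlier sketch can be instantiated on the triangle. The feasible point below is my own hypothetical choice: the distribution that outputs each singleton independent set with probability 1/3, which lies in the integer hull and hence in every round.

import itertools

# Triangle: vertices 0, 1, 2 and all three edges.
n, t = 3, 1
y = {frozenset(): 1.0}
for k in range(1, 2 * t + 2):
    for c in itertools.combinations(range(n), k):
        I = frozenset(c)
        y[I] = 1.0 / 3.0 if len(I) == 1 else 0.0   # Pr[all of I are picked]

assert is_psd(moment_matrix(y, n, t))               # moment matrix is PSD
for (u, v) in [(0, 1), (0, 2), (1, 2)]:             # x_u + x_v <= 1, i.e. -x_u - x_v >= -1
    a = [0.0] * n
    a[u] = a[v] = -1.0
    assert is_psd(slack_moment_matrix(y, n, t, a, -1.0))
# properties b) and c): 0 <= y_I <= 1, and y_J <= y_I for I ⊆ J
assert all(0.0 <= yI <= 1.0 for yI in y.values())
assert all(y[J] <= y[I] for I in y for J in y if I <= J)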
Inducing on one variable

Lemma. For $y \in \mathrm{Las}_t(K)$, $t \ge 1$ and $i \in [n]$ (say with $0 < y_i < 1$; otherwise $y_i$ is already integral):
$$y \in \mathrm{conv}\{z \in \mathrm{Las}_{t-1}(K) \mid z_i \in \{0,1\}\}.$$

[Figure: in $\mathbb{R}^{2^n - 1}$, the point $y \in \mathrm{Las}_t$ is a convex combination of $z^{(0)} \in \mathrm{Las}_{t-1}$ with $z^{(0)}_{\{i\}} = 0$ and $z^{(1)} \in \mathrm{Las}_{t-1}$ with $z^{(1)}_{\{i\}} = 1$]

◮ Define $z_I^{(1)} := \frac{y_{I\cup\{i\}}}{y_i}$ and $z_I^{(0)} := \frac{y_I - y_{I\cup\{i\}}}{1 - y_i}$.
◮ Clearly $y = y_i \cdot z^{(1)} + (1 - y_i) \cdot z^{(0)}$.
◮ Moreover $z_i^{(0)} = 0$ and $z_i^{(1)} = 1$. It remains to show PSDness.
◮ Take vectors $v_I$ with $\langle v_I, v_J \rangle = y_{I\cup J}$ for $|I|, |J| \le t$.
◮ Then
$$\Big\langle \frac{v_{I\cup\{i\}}}{\sqrt{y_i}}, \frac{v_{J\cup\{i\}}}{\sqrt{y_i}} \Big\rangle = \frac{y_{I\cup J\cup\{i\}}}{y_i} = z^{(1)}_{I\cup J},$$
and hence $M_{t-1}(z^{(1)}) \succeq 0$.
◮ Moreover
$$\Big\langle \frac{v_I - v_{I\cup\{i\}}}{\sqrt{1-y_i}}, \frac{v_J - v_{J\cup\{i\}}}{\sqrt{1-y_i}} \Big\rangle = \frac{\langle v_I, v_J\rangle - \langle v_I, v_{J\cup\{i\}}\rangle - \langle v_J, v_{I\cup\{i\}}\rangle + \langle v_{I\cup\{i\}}, v_{J\cup\{i\}}\rangle}{1 - y_i} = \frac{y_{I\cup J} - y_{I\cup J\cup\{i\}}}{1 - y_i} = z^{(0)}_{I\cup J}.$$
Thus $M_{t-1}(z^{(0)}) \succeq 0$.
◮ Similar for the slack moment matrices.
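On the dictionary representation the decomposition from the proof is a few lines. A sketch (naming mine) of this conditioning step — exactly the "induce on $x_i = b$" operation used in the rest of the talk; it assumes $0 < y_i < 1$:

def condition(y, i, b):
    """Induce on x_i = b for b in {0,1}: returns z^(b) from the lemma.
    The result lives one level lower, so only sets I with I ∪ {i} still
    present in y receive a value. Assumes 0 < y_i < 1."""
    yi = y[frozenset({i})]
    z = {}
    for I in y:
        J = I | {i}
        if J in y:
            z[I] = y[J] / yi if b == 1 else (y[I] - y[J]) / (1.0 - yi)
    return z

# Sanity check of the convex decomposition y = y_i z^(1) + (1 - y_i) z^(0),
# e.g. on the triangle example above with i = 0:
#   z1, z0 = condition(y, 0, 1), condition(y, 0, 0)
#   yi = y[frozenset({0})]
#   assert all(abs(y[I] - (yi * z1[I] + (1 - yi) * z0[I])) < 1e-9 for I in z1)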
Consequences (1)

Operation: "induce on $x_i = 1$".

[Figure: conditioning moves $y \in \mathrm{Las}_t$ to a point $z \in \mathrm{Las}_{t-1}$ with $z_i = 1$]

Lemma. For $y \in \mathrm{Las}_t(K)$ and any set $S$ with $|S| \le t$:
$$y \in \mathrm{conv}\{z \in \mathrm{Las}_{t-|S|}(K) \mid z_i \in \{0,1\} \;\forall i \in S\}$$
◮ Explicit formula for each $z$ via the inclusion-exclusion formula
◮ Gap closed after $n$ rounds

Consequences (2)

Lemma (locally consistent probability distributions). For $y \in \mathrm{Las}_t(K)$ and $|S| \le t$, there is a random variable $X \in \{0,1\}^S$ with
$$\Pr\Big[\bigwedge_{i\in I}(X_i = 1)\Big] = y_I \quad \forall I \subseteq S.$$
◮ For example, for Independent Set: for any set $S$ of $|S| \le t$ nodes, $y$ gives a probability distribution over independent sets in $G[S]$.

Application 1: Scheduling on 2 Machines with Precedence Constraints

$P2 \mid prec, p_j = 1 \mid C_{\max}$. Source: [Svensson, unpublished, 2011]

Input:
◮ jobs $J$ with unit processing time
◮ precedence constraints
◮ 2 identical machines
Goal: minimize the makespan.

[Figure: a precedence graph on jobs 1..6 and a schedule of makespan 3 on the two machines]

Known results:
◮ NP-hard for general $m$
◮ poly-time solvable for 2 machines [Coffman, Graham '72]

Time-indexed LP

$$\sum_{t=1}^{T} x_{jt} = 1 \quad \forall j \in J$$
$$\sum_{j \in J} x_{jt} \le 2 \quad \forall t \in [T]$$
$$\sum_{t' \le t} x_{it'} \ge \sum_{t' \le t+1} x_{jt'} \quad \forall i \prec j \;\; \forall t \in [T]$$
$$x_{jt} \ge 0 \quad \forall j \in J \;\; \forall t \in [T]$$

◮ The integrality gap is $\ge \frac{4}{3}$ [Figure: a 6-job instance where the LP is feasible for $T = 3$ while every schedule needs 4 slots]
◮ Claim: $y \in \mathrm{Las}_1(LP(T))$ $\Rightarrow$ $\exists$ schedule of length $T$.

Scheduling algorithm

(1) Find $y \in \mathrm{Las}_1(LP(T))$.
(2) Fractional completion time $C_j^* := \max\{t \mid y_{\{(j,t)\}} > 0\}$.
(3) Sort jobs s.t. $C_1^* \le C_2^* \le \dots \le C_n^*$.
(4) Run list scheduling w.r.t. this ordering $\Rightarrow \sigma : J \to \mathbb{N}$.

Analysis

Lemma. For any job $j$: $\sigma_j \le C_j^*$.
◮ Consider the job $j_1$ with smallest index s.t. $\sigma_{j_1} > C_{j_1}^*$ (delete all jobs $> j_1$).
◮ Let $j_0 < j_1$ be the highest-index job scheduled alone; between $\sigma_{j_0}$ and $\sigma_{j_1}$ the schedule is fully busy.
◮ All jobs in $J_0 := \{j < j_1 \mid \sigma_j > \sigma_{j_0}\} \cup \{j_1\}$ depend on $j_0$ — otherwise list scheduling would have placed them next to $j_0$!
◮ Induce on $x_{j_0, C_{j_0}^*} = 1$ $\to$ $\tilde{y} \in \mathrm{Las}_0(LP(T))$.
◮ For $\tilde{y}$ the LP is infeasible: all $|J_0| \ge 2(\sigma_{j_1} - \sigma_{j_0} - 1) + 1$ jobs of $J_0$ must now be fractionally scheduled in the slots $\{C_{j_0}^* + 1, \dots, C_{j_1}^*\}$, whose capacity is at most $2(\sigma_{j_1} - 1 - \sigma_{j_0})$ since $\sigma_{j_0} \le C_{j_0}^*$ (minimality of $j_1$) and $C_{j_1}^* < \sigma_{j_1}$. Contradiction!
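Step (4) is ordinary list scheduling for unit jobs. A self-contained sketch; the instance encoding via a predecessor map and the example at the end are my own hypothetical choices:

def list_schedule(order, pred, machines=2):
    """List scheduling for unit jobs: fill time slots 1, 2, ... greedily.
    A job is available once all its predecessors ran in strictly earlier
    slots; each slot takes the available jobs that come first in `order`."""
    slot = {}                 # job -> time slot (1-indexed)
    remaining = list(order)
    t = 0
    while remaining:
        t += 1
        avail = [j for j in remaining
                 if all(slot.get(p, t) < t for p in pred.get(j, ()))]
        if not avail:
            continue          # assumes the precedence relation is a DAG
        for j in avail[:machines]:
            slot[j] = t
        remaining = [j for j in remaining if j not in slot]
    return slot               # makespan = max(slot.values())

# Hypothetical instance with chains 1≺4, 2≺5, 3≺6, jobs ordered by C*_j:
#   list_schedule([1, 2, 3, 4, 5, 6], {4: [1], 5: [2], 6: [3]})
# yields a schedule of makespan 3.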
A PTAS for 3 machines?

Open problem. For $m = 3$ machines, is the integrality gap of the $f(\varepsilon)$-round Lasserre relaxation at most $1 + \varepsilon$?
◮ The problem is neither known to be NP-hard, nor is a PTAS known!

Application 2: Set Cover

Source: [Chlamtac, Friggstad, Georgiou 2012]

Subexponential Set Cover approximation

Set Cover:
◮ Input: a family of sets $S_1, \dots, S_m \subseteq [n]$ with costs $c_{S_j}$
◮ Goal: cover all elements at minimum cost

Known:
◮ $\ln(n)$-apx [Chvátal '79]
◮ Reduction from Sat to Set Cover with gap $(1-\varepsilon)\ln(n)$ in time $n^{O(1/\varepsilon)}$ [Dinur, Steurer '13]

Theorem. There is a $(1-\varepsilon)\ln(n)$-apx in time $2^{\tilde{O}(n^\varepsilon)}$.

The algorithm

(1) Compute $y \in \mathrm{Las}_{n^\varepsilon}(K)$ with
$$K := \Big\{x \in \mathbb{R}^m_{\ge 0} \;\Big|\; \sum_{i:\, j \in S_i} x_i \ge 1 \;\;\forall \text{ element } j;\;\; c^T x \le OPT\Big\}$$
(2) FOR $i = 1, \dots, n^\varepsilon$ DO
(3)   Find the set $S_i$ covering the most new elements
(4)   Induce on $x_i = 1$ $\to \tilde{y}$
(5) Run a $\ln(|\text{largest set}|)$-apx for the rest.

[Figure: elements as dots; the greedily picked sets $S_1, S_2, \dots, S_{n^\varepsilon}$, after which every fractionally used set covers few remaining elements]

The final $\tilde{y}$ has:
◮ $\tilde{y}_1 = \dots = \tilde{y}_{n^\varepsilon} = 1$
◮ $\tilde{y} \in \mathrm{Las}_0(K)$
◮ $c^T \tilde{y} \le OPT$
◮ $0 < \tilde{y}_S < 1 \Rightarrow |S \cap \text{remaining elements}| \le n^{1-\varepsilon}$ (intuitively: each of the $n^\varepsilon$ greedy rounds covered at least that many new elements, and there are only $n$ elements)
◮ Apx-ratio: $\ln(n^{1-\varepsilon}) = (1-\varepsilon)\ln(n)$
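A sketch of the algorithm's combinatorial skeleton in Python. The Lasserre solution enters only through the conditioning of step (4), represented here by a hypothetical callback condition_on; guessing OPT and solving the SDP are omitted:

def two_phase_cover(sets, costs, n, eps, condition_on=lambda i: None):
    """Phase 1: ~n^eps rounds of 'pick the set covering the most new
    elements', conditioning the hierarchy solution on each pick.
    Phase 2: the classical greedy ln-approximation on what remains.
    Assumes the instance is feasible; sets are Python sets over range(n)."""
    uncovered = set(range(n))
    picked = []
    for _ in range(max(1, round(n ** eps))):       # phase 1
        if not uncovered:
            break
        i = max(range(len(sets)), key=lambda k: len(sets[k] & uncovered))
        picked.append(i)
        uncovered -= sets[i]
        condition_on(i)                             # induce on x_i = 1, step (4)
    while uncovered:                                # phase 2
        i = min((k for k in range(len(sets)) if sets[k] & uncovered),
                key=lambda k: costs[k] / len(sets[k] & uncovered))
        picked.append(i)
        uncovered -= sets[i]
    return picked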
Application 3: Max Cut

Source: [Goemans, Williamson '95]

MaxCut: given $G = (V, E)$, maximize $|\delta(S)|$ over $S \subseteq V$.

LP:
$$\max\Big\{\sum_{e \in E} z_e \;\Big|\; z_{ij} \le \min\{x_i + x_j,\; 2 - x_i - x_j\} \;\;\forall (i,j) \in E\Big\}$$
◮ Always $OPT_f = |E|$: take $x_i = \frac{1}{2}$ and $z_{ij} = 1$.
◮ Instead, consider $(x, z) \in \mathrm{Las}_3(K)$.

Vector representation

Observation. For $y \in \mathrm{Las}_t(K)$ there are vectors $v_I$ with $\langle v_I, v_J \rangle = y_{I\cup J}$ for $|I|, |J| \le t$.
◮ $\|v_I\|_2^2 = y_I$
◮ $\|v_\emptyset\|_2^2 = 1$
◮ $\langle v_I, v_J \rangle \ge 0$
◮ $v_I$ lies on the sphere with center $\frac{1}{2} v_\emptyset$ and radius $\frac{1}{2}$, since
$$\big\|v_I - \tfrac{1}{2}v_\emptyset\big\|_2^2 = \|v_I\|_2^2 - 2 \cdot \tfrac{1}{2}\langle v_I, v_\emptyset\rangle + \tfrac{1}{4}\|v_\emptyset\|_2^2 = \tfrac{1}{4}.$$
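Such vectors can be read off any PSD factorization of the moment matrix. A numpy sketch (helper name mine) based on an eigendecomposition:

import numpy as np

def gram_vectors(M):
    """Given a symmetric PSD matrix M, return a matrix V whose rows v_I
    satisfy <v_I, v_J> = M[I, J]. (A Cholesky factor would do just as
    well when M is nonsingular.)"""
    w, U = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)   # zero out tiny negative eigenvalues from rounding
    return U * np.sqrt(w)       # V @ V.T == M up to numerical error

With the row indexing of the earlier moment-matrix sketch, the first row is $v_\emptyset$, and one can verify numerically that every $v_I$ has distance $\frac{1}{2}$ from $\frac{1}{2}v_\emptyset$.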
Vector representation for MaxCut

◮ $\langle v_i, v_j \rangle = x_{\{i,j\}}$ and $z_{ij} = x_i + x_j - 2x_{\{i,j\}}$.
◮ Now hyperplane rounding?
◮ Problem: all angles are in $[0^\circ, 90^\circ]$ (since $\langle v_i, v_j\rangle \ge 0$), so $\Pr[\text{cut } (i,j)] \le \frac{1}{2}$.

Vector transformation

$$u_i := 2v_i - v_\emptyset$$
Observations:
◮ the $u_i$ are unit vectors
◮ $\langle u_i, u_j \rangle = 1 - 2z_{ij}$ (angles are now in $[0^\circ, 180^\circ]$)
◮ the $u_i$ are solutions to the Goemans-Williamson SDP
$$\max\Big\{\sum_{(i,j)\in E} c_{ij} \cdot \frac{1 - \langle u_i, u_j\rangle}{2} \;\Big|\; \|u_i\|_2 = 1 \;\;\forall i \in V\Big\}$$

Hyperplane rounding

Algorithm:
(1) Pick a random Gaussian $g$.
(2) Set $S := \{i \in V \mid \langle g, u_i \rangle \ge 0\}$.

[Figure: $\Pr[e \in \delta(S)] = \frac{1}{\pi}\arccos(1 - 2z_e)$ plotted against $z_e$, lying above the line $0.878 \cdot z_e$]

Analysis:
$$\Pr[(i,j) \in \delta(S)] = \frac{\text{angle of } u_i \text{ and } u_j}{\pi} = \frac{\arccos(1 - 2z_e)}{\pi} \ge 0.878 \cdot z_e.$$
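A compact numpy sketch of this rounding step (function names mine); U is a matrix whose rows are the unit vectors $u_i$, e.g. obtained from gram_vectors plus the transformation above:

import numpy as np

def hyperplane_round(U, rng=None):
    """Goemans-Williamson rounding: cut along a uniformly random hyperplane
    through the origin. Rows of U are unit vectors; returns the side S."""
    rng = rng or np.random.default_rng()
    g = rng.standard_normal(U.shape[1])          # random Gaussian direction
    return {i for i in range(U.shape[0]) if U[i] @ g >= 0}

def expected_cut(U, edges):
    """E|δ(S)| = sum over edges of angle(u_i, u_j) / pi."""
    return sum(np.arccos(np.clip(U[i] @ U[j], -1.0, 1.0)) / np.pi
               for i, j in edges)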
Theory: Global Correlation Rounding

Sources: [Barak, Raghavendra, Steurer '11], [Guruswami, Sinop '11]

Random variables $X_1, X_2 \in \{0,1\}$ are uncorrelated / independent iff
$$\mathrm{Cov}[X_1, X_2] = \Pr[X_1 = X_2 = 1] - \Pr[X_1 = 1]\cdot\Pr[X_2 = 1] \overset{!}{=} 0.$$

Theorem. For any $y \in \mathrm{Las}_t(K)$ one can induce on $\le O(\frac{1}{\varepsilon^3})$ variables to obtain $y' \in \mathrm{Las}_{t - O(1/\varepsilon^3)}(K)$ s.t.
$$\Pr_{i,j \in [n]}\Big[\big|y'_i \cdot y'_j - y'_{\{i,j\}}\big| \ge \varepsilon\Big] \le \varepsilon.$$

Proof outline:
◮ Consider a random variable $X \in \{0,1\}^n$.
◮ Small calculation:
$$\mathbb{E}_{X_2}\big[\mathrm{Var}[X_1 \mid X_2]\big] = \dots \le \mathrm{Var}[X_1] - 4 \cdot \mathrm{Cov}[X_1, X_2]^2$$
◮ Pick a random variable and condition on it $\to X' \in \{0,1\}^n$:
$$\mathbb{E}_{i \in [n]}\big[\mathrm{Var}[X_i] - \mathrm{Var}[X'_i]\big] \ge 4 \cdot \mathbb{E}_{i,j}\big[\mathrm{Cov}[X_i, X_j]^2\big] \ge 4\varepsilon^3 \quad \text{if still correlated}$$
◮ Since each $\mathrm{Var}[X_i] \le \frac{1}{4}$, this cannot go on for more than $O(\frac{1}{\varepsilon^3})$ rounds.
◮ Variance and covariance also exist for Lasserre solutions (via the locally consistent distributions)!

Application 4: Max Cut in Dense Graphs

Source: [de la Vega, Kenyon-Mathieu '95]

PTAS for MaxCut in dense graphs

Problem:
◮ Given $G = (V, E)$ with $|E| \ge \varepsilon n^2$, maximize $|\delta(S)|$.
◮ LP: $\max\{\sum_{e\in E} z_e \mid z_{ij} \le \min\{x_i + x_j,\, 2 - x_i - x_j\} \;\forall (i,j) \in E\}$

PTAS:
◮ Compute in time $n^{O(1/\varepsilon^4)}$ an uncorrelated $(x, z) \in \mathrm{Las}_3(K)$:
$$\Pr_{i,j \in V}\big[|x_i x_j - x_{\{i,j\}}| \ge \varepsilon\big] \le \varepsilon^2$$
◮ Observe: a $1 - \varepsilon$ fraction of the edges is $\le \varepsilon$ correlated (at most $\varepsilon^2 n^2 \le \varepsilon |E|$ pairs are correlated).
◮ Take $S \subseteq V$ with $\Pr[i \in S] = x_i$, independently!!
◮ Then for each uncorrelated edge
$$\Pr[(i,j) \in \delta(S)] = x_i + x_j - 2x_i x_j \approx x_i + x_j - 2x_{\{i,j\}} = z_e.$$

The end

Open problems:
◮ Do $f(\varepsilon, k)$ rounds solve Unique Games?
◮ Do $O(1)$ rounds give an $O(\log n)$-apx for coloring 3-colorable graphs?
◮ Do $f(\varepsilon)$ rounds give a PTAS for $P3 \mid prec, p_j = 1 \mid C_{\max}$?
◮ Do $O(1)$ rounds give a $(2 - \varepsilon)$-apx for Unrelated Machine Scheduling $R \mid p_{ij} \mid C_{\max}$?

Thanks for your attention!

Slides and lecture notes can be found under http://www-math.mit.edu/~rothvoss/lecturenotes.html