Ryan O'Donnell (CMU, IAS), Yi Wu (CMU, IBM), Yuan Zhou (CMU)

Solving linear equations
• Given a set of linear equations over the reals, is there a solution satisfying all the equations?
  – Easy: Gaussian elimination.

Noisy version
• Given a set of linear equations for which some solution satisfies 99% of the equations,
  – can we find a solution that satisfies at least 1% of the equations?
• I.e., is there a 99% vs 1% approximation algorithm for linear equations over the reals?

Hardness of Max-3Lin(q)
• Theorem. [Håstad '01] Given a set of linear equations modulo q, it is NP-hard to distinguish between
  – there is a solution satisfying a (1 - ε)-fraction of the equations
  – no solution satisfies more than a (1/q + ε)-fraction of the equations
• Equations are sparse, of the form x_i + x_j - x_k = c (mod q)
• (1 - ε) vs (1/q + ε) approximation for Max-3Lin(q) is NP-hard
• A 3-query PCP with completeness (1 - ε) and soundness (1/q + ε)

Sparser equations: Max-2Lin(q)
• Theorem. [KKMO '07] Assuming the Unique Games Conjecture, for any ε, δ > 0, there exists q > 0 such that (1 - ε) vs δ approximation for Max-2Lin(q) is NP-hard

                      Max-3Lin                      Max-2Lin
over [q]              (1 - ε) vs (1/q + ε)          (1 - ε) vs δ
                      NP-hardness [Håstad '01]      UG-hardness [KKMO '07]
over integers/reals   ?                             ?

Equations over integers: Max-3Lin(Z)
• Approximate Max-3Lin/Max-2Lin over large domains?
• Intuitively, it should be harder, because as the domain size increases,
  – the soundness becomes smaller in both [Håstad '01] and [KKMO '07]
• Obstacle to getting hardness
  – the "Long code" becomes too long (even infinitely long)

Hardness of Max-3Lin(Z)
• Theorem.
[Guruswami-Raghavendra '07] For all ε, δ > 0, it is NP-hard to (1 - ε) vs δ approximate Max-3Lin(Z)
  – 3-query PCP over integers
  – Implies the hardness for Max-3Lin(R)
• Proof follows [Håstad '01], but is much more involved
  – derandomized Long Code testing
  – Fourier analysis with respect to an exponential distribution on Z+

                      Max-3Lin                      Max-2Lin
over [q]              (1 - ε) vs (1/q + ε)          (1 - ε) vs δ
                      NP-hardness [Håstad '01]      UG-hardness [KKMO '07]
over integers/reals   (1 - ε) vs δ                  ?
                      NP-hardness [GR '07]

Unique Games over Integers?
• Can we use the techniques of [Guruswami-Raghavendra '07] to prove a (1 - ε) vs δ UG-hardness for Max-2Lin(Z)?
  – Seems difficult
  – Open question from Raghavendra's thesis [Raghavendra '09]

Our results
• Relatively easy to modify the KKMO proof to get
  – Theorem. For all ε, δ > 0, it is UG-hard to (1 - ε) vs δ approximate Max-2Lin(Z)
• Also applies to Max-2Lin over reals and large domains
  – Simpler proof (and better parameters) of Max-3Lin(Z) hardness

Dictatorship Test
• Theorem. For all ε, δ > 0, it is UG-hard to (1 - ε) vs δ approximate Max-2Lin(Z)
• By [KKMO '07], we only need to design a (1 - ε) vs δ 2-query dictatorship test over the integers.
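
To make the (1 - ε) vs δ statements concrete: the quantity being approximated is the fraction of equations that an assignment satisfies. The minimal Python sketch below evaluates that fraction for a Max-2Lin instance, either over Z or modulo q. The helper name `satisfied_fraction` and the four-equation instance are illustrative assumptions, not taken from the talk. Any equation that holds over Z also holds modulo q, so the mod-q fraction is never smaller than the integer one.

```python
from typing import Dict, List, Optional, Tuple

# An equation (i, j, c) stands for x_i - x_j = c (hypothetical encoding).
Equation = Tuple[int, int, int]


def satisfied_fraction(
    eqs: List[Equation], x: Dict[int, int], q: Optional[int] = None
) -> float:
    """Fraction of equations x_i - x_j = c satisfied by the assignment x.

    With q=None the equations are checked over the integers,
    otherwise they are checked modulo q.
    """
    def holds(i: int, j: int, c: int) -> bool:
        diff = x[i] - x[j] - c
        return diff == 0 if q is None else diff % q == 0

    return sum(holds(*e) for e in eqs) / len(eqs)


# Toy instance: the last equation fails over Z but holds modulo 7.
eqs = [(0, 1, 2), (1, 2, 3), (0, 2, 5), (2, 0, 2)]
x = {0: 7, 1: 5, 2: 2}

over_z = satisfied_fraction(eqs, x)          # checked over the integers
mod_7 = satisfied_fraction(eqs, x, q=7)      # checked modulo 7
print(over_z, mod_7)
```

On this toy instance the assignment satisfies three of the four equations over Z and all four modulo 7, which is why a soundness guarantee stated modulo q is the stronger one.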
Dictatorship Test (cont'd)
• f: [q]^d -> Z is called a dictator if f(x_1, x_2, ..., x_d) = x_i (for some i)
• Dictatorship test over [q]: a distribution over equations f(x) - f(y) = c (mod q)
  – Completeness: for dictators, Pr[equation holds] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[equation holds] < δ
⇒ (1 - ε) vs δ hardness of Max-2Lin(q)

Dictatorship Test over Integers
• A distribution over equations f(x) - f(y) = c
  – Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ
• It is UG-hard to distinguish between
  – a Max-2Lin(Z) instance is (1 - ε)-satisfiable
  – the instance is not δ-satisfiable even when the equations are taken modulo q

Recap of KKMO Dictatorship Test

Back to KKMO Dictatorship Test
• Dictatorship test over [q]: a distribution over equations f(x) - f(y) = c (mod q)
• Completeness: for dictators, Pr[equation holds] ≥ 1 - ε
• Soundness: for functions far from dictators, Pr[equation holds] < δ
• KKMO Test
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Test f(x) - f(y) = 0 (mod q)

Back to KKMO Dictatorship Test (cont'd)
• KKMO Test
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Test f(x) - f(y) = 0 (mod q)
• Soundness analysis: "Majority Is Stablest" Theorem [MOO '05]
  – If f is far from dictators and "β-balanced", then Pr[f passes the test] < β^(ε/2)
  – f is β-balanced: Pr[f(x) = a mod q] < β for all 0 ≤ a < q

Back to KKMO Dictatorship Test (cont'd)
• KKMO Test
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p.
ε
• Test f(x) - f(y) = 0 (mod q)
• Soundness analysis
  – "Folding" trick: to make sure f is β-balanced
  – Idea: when f(x) = f(x_1, x_2, ..., x_d) is queried, return g(x) = f(0, (x_2 - x_1) mod q, ..., (x_d - x_1) mod q) + x_1
  – Dictators are not affected in the completeness analysis
  – g(x) is 1/q-balanced

Dictatorship Test for Max-2Lin(Z)
• A distribution over equations f(x) - f(y) = c
  – Completeness: for dictators, Pr[f(x) - f(y) = c] ≥ 1 - ε
  – Soundness: for functions far from dictators, Pr[f(x) - f(y) = c mod q] < δ
• If we use the KKMO test...
  – Soundness: the same
  – Completeness does not hold, because
    • when f(x) is queried, we get g(x) = (x_i - x_1) mod q + x_1
    • when f(y) is queried, we get g(y) = (y_i - y_1) mod q + y_1
• Max-2Lin(q): Pr[g(x) - g(y) = 0 mod q] ≥ 1 - ε
• Max-2Lin(Z): Pr[g(x) - g(y) ≠ 0] ≥ Pr["wrap-around" (exactly one of g(x), g(y) ≥ q)] ≈ 1/2

Our method, Step I: introducing the new "active folding"

The new "active folding"
• KKMO Test with active folding mod q
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Pick c, c' ∈ [q] uniformly at random, test f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c' (mod q)
• Completeness: unchanged for dictators, since (x_i - c) + c = x_i (mod q)
• Soundness:
  – Claim. g(x) = f(x_1 - c, ..., x_d - c) + c is 1/q-balanced
  – Proof.
    Pr_{x,c}[f(x_1 - c, ..., x_d - c) + c = a mod q]
    = E_c[ Pr_x[f(x_1 - c, ..., x_d - c) = a - c mod q] ]
    = E_c[ Pr_x[f(x) = a - c mod q] ]
    = E_x[ Pr_c[f(x) = a - c mod q] ]
    ≤ 1/q

Our method, Step II: "Partial active folding"

"Partial active folding"
• KKMO Test with partial active folding for Max-2Lin(Z)
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Pick c, c' ∈ [q^0.5] uniformly at random, test f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c'
• Completeness:
  – f(x_1 - c, ..., x_d - c) + c = (x_i - c) mod q + c = (x_i - c) + c = x_i   w.p. 1 - 1/q^0.5
  – f(y_1 - c', ..., y_d - c') + c' = y_i   w.p.
1 - 1/q^0.5
  ⇒ Pr[f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c'] ≥ 1 - ε - 2/q^0.5

"Partial active folding" (cont'd)
• KKMO Test with partial active folding for Max-2Lin(Z)
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Pick c, c' ∈ [q^0.5] uniformly at random, test f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c'
• Completeness: for dictators, Pr[test passes] ≥ 1 - ε - 2/q^0.5, as above
• Soundness:
  – Claim. g(x) = f(x_1 - c, ..., x_d - c) + c is 1/q^0.5-balanced
  – Proof.
    Pr_{x,c}[f(x_1 - c, ..., x_d - c) + c = a mod q]
    = E_c[ Pr_x[f(x_1 - c, ..., x_d - c) = a - c mod q] ]
    = E_c[ Pr_x[f(x) = a - c mod q] ]
    = E_x[ Pr_c[f(x) = a - c mod q] ]
    ≤ 1/q^0.5

"Partial active folding" (cont'd)
• KKMO Test with partial active folding for Max-2Lin(Z)
• Pick x ∈ [q]^d uniformly at random
• Get y by rerandomizing each coordinate of x w.p. ε
• Pick c, c' ∈ [q^0.5] uniformly at random, test f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c'
• Completeness: for dictators, Pr[test passes] ≥ 1 - ε - 2/q^0.5, as above
• Soundness:
  – Claim. g(x) = f(x_1 - c, ..., x_d - c) + c is 1/q^0.5-balanced
  – By the Majority Is Stablest Theorem, when f is far from dictators,
    Pr[f(x_1 - c, ..., x_d - c) + c = f(y_1 - c', ..., y_d - c') + c' mod q] < 1/q^(ε/4)

Application to Max-3Lin(Z)
• Key idea in Max-2Lin(Z): "partial folding" to deal with the "wrap-around" event

Håstad's reduction for Max-3Lin(q)
• Håstad's Matching Dictatorship Test for f: [q]^L -> Z, g: [q]^R -> Z, π: [R] -> [L]
• Pick x ∈ [q]^L, y ∈ [q]^R uniformly at random
• Let z ∈ [q]^R s.t. z_i = (y_i + x_π(i)) mod q
• Rerandomize each coordinate of x, y, z w.p. ε
• Test f(0, x_2 - x_1, ..., x_L - x_1) + x_1 + g(y) = g(z) mod q
• Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f, g pass the test] ≥ 1 - 3ε
• Soundness: if f and g are far from being "matching dictators", Pr[f, g pass the test] < 1/q + δ
⇒ (1 - 3ε) vs (1/q + δ) NP-hardness of Max-3Lin(q)

Our reduction for Max-3Lin(Z)
• Matching Dictatorship Test with partial active folding for f: [q^2]^L -> Z, g: [q^3]^R -> Z, π: [R] -> [L]
• Pick x ∈ [q^2]^L, y ∈ [q^3]^R uniformly at random
• Let z ∈ [q^3]^R s.t. z_i = (y_i + x_π(i)) mod q
• Rerandomize each coordinate of x, y, z w.p.
ε
• Pick c ∈ [q] uniformly at random
• Test f(x_1 - c, ..., x_L - c) + c + g(y) = g(z)
• Completeness: if g is the i-th dictator and f is the π(i)-th dictator, Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z)] ≥ 1 - 3ε - 2/q
• Soundness: if f and g are far from being "matching dictators", Pr[f(x_1 - c, ..., x_L - c) + c + g(y) = g(z) mod q] < 1/q + δ
⇒ (1 - 3ε - 2/q) vs (1/q + δ) NP-hardness of Max-3Lin(Z)

The End. Any questions?
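
As a sanity check on the completeness bound 1 - ε - 2/q^0.5 for the Max-2Lin(Z) test with partial active folding, the minimal Python sketch below computes the exact pass probability over the integers for a dictator f(x) = x_i. It rests on assumptions not spelled out in the talk: [q] is taken to be {0, ..., q-1}, the shifted arguments x_j - c are reduced mod q before being fed to f, and q is a perfect square. The names `folded_dictator` and `pass_probability` are illustrative only.

```python
from fractions import Fraction
from math import isqrt


def folded_dictator(v: int, c: int, q: int) -> int:
    """Value of f(x_1 - c, ..., x_d - c) + c for the dictator f(x) = x_i, with v = x_i.

    Assumption: arguments are reduced mod q, the final "+ c" is over the integers.
    A "wrap-around" happens when v < c, in which case the result is v + q.
    """
    return (v - c) % q + c


def pass_probability(q: int, eps: Fraction, shifts: int) -> Fraction:
    """Exact Pr[g(x) = g(y) over Z] for a dictator, with c, c' uniform in {0..shifts-1}.

    Only the dictator coordinate matters, so we enumerate x_i, y_i, c, c'.
    """
    total = Fraction(0)
    for xi in range(q):
        for yi in range(q):
            # y_i keeps x_i w.p. 1 - eps, otherwise it is redrawn uniformly from [q].
            p_y = eps / q + (1 - eps if yi == xi else 0)
            hits = sum(
                folded_dictator(xi, c, q) == folded_dictator(yi, cp, q)
                for c in range(shifts)
                for cp in range(shifts)
            )
            total += Fraction(1, q) * p_y * Fraction(hits, shifts * shifts)
    return total


q, eps = 16, Fraction(1, 10)
partial = pass_probability(q, eps, isqrt(q))  # c, c' in [q^0.5]: partial active folding
full = pass_probability(q, eps, q)            # c, c' in [q]: full-range shifts, for comparison
bound = 1 - eps - Fraction(2, isqrt(q))       # the 1 - eps - 2/q^0.5 bound
print(float(partial), float(full), float(bound))
```

For these toy parameters the partial-folding probability stays above the bound, and it is strictly larger than what full-range shifts achieve over the integers, which is the effect the restriction of c, c' to [q^0.5] is meant to produce.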