The Complexity of Linear Dependence Problems in Vector Spaces
David Woodruff, IBM Almaden
Joint work with Arnab Bhattacharyya, Piotr Indyk, and Ning Xie from MIT

The 3-SUM Problem
• Given a set S containing r real numbers, are there $a, b, c \in S$ with $a + b + c = 0$?
• Solvable in $O(r^2)$ time
– Interview question
• Conjectured to require $\tilde{\Omega}(r^2)$ time
• Useful for hardness results in P. Many problems are "3-SUM Hard"

Generalizations
• We study generalizations of this problem:
– Replace 3 summands with k summands
– Replace real field $\mathbb{R}$ with a finite field
– Replace sum of field elements with sum of vectors
– Replace sum with a fixed linear combination
– Replace sum with any linear combination
– Require vectors be minimally linearly dependent
– Replace target 0 with an arbitrary vector
– and so on…

Applications
• Maximum Likelihood Decoding
– Given $x_1, \dots, x_r$ in $\mathbb{F}_q^n$ and $z$ in $\mathbb{F}_q^n$, do there exist $x_{i_1}, \dots, x_{i_k}$ that contain $z$ in their span?
– The $x_i$ are the columns of a parity-check matrix
– $z$ is the syndrome
– There is a codeword corrupted in at most k positions with syndrome $z$ iff the k-span contains $z$
• Weight Distribution Problem
– Let A be an $n \times r$ matrix over $\mathbb{F}_2$
– Define the code $C = \{x \mid Ax = 0\}$
– C has a codeword of weight k iff k columns of A sum to 0

Formal Definitions
• In this talk, we focus on two problems:
• (k,r)-LinDependence: given r elements $x_1, \dots, x_r$ in $\mathbb{F}_2^n$ and $z$ in $\mathbb{F}_2^n$, do there exist $x_{i_1}, \dots, x_{i_k}$ whose span contains $z$?
• (k,r)-ZeroSum: given r elements $x_1, \dots, x_r$ in $\mathbb{F}_2^n$, do there exist $x_{i_1}, \dots, x_{i_k}$ with $x_{i_1} + x_{i_2} + \dots + x_{i_k} = 0$?
• We allow k and r to be functions of n
• The first problem is at least as hard as the second

Results
• Assume 3-SAT cannot be solved in time less than $2^{cn}$ for a constant $c > 0$
• Then (k,r)-ZeroSum requires $\min(r^k, 2^n)$ time, up to polynomial factors
– So, (k,r)-LinDependence requires $\min(r^k, 2^n)$ time
– Other variants also require this time
• Have matching upper bounds:
– $r^k$ is trivial.
Can get roughly $r^{k/2}$
– Can get $2^n$ with the FFT

Implications
• (k,r)-LinDependence reduces to Maximum Likelihood Decoding, so $\min(r^k, 2^n)$ lower bound
• (k,r)-ZeroSum reduces to the Weight Distribution Problem, so $\min(r^k, 2^n)$ lower bound
• Results improve previous best lower bounds for these coding theory problems [Downey, Fellows, Vardy, Whittle]
• Hold for r and k functions of n

Our starting point:
[PW] showed an $r^{\Omega(k)}$ bound for k-SUM over $\mathbb{R}$, assuming 3-SAT on n variables requires $2^{cn}$ time:
• Start with a 3-SAT formula F with n variables and m clauses
• [CIP]: F is replaced by formulas $\varphi_1, \varphi_2, \dots, \varphi_s$
– $s = 2^{\varepsilon n}$. Each $\varphi_i$ has n variables and O(n) clauses
– This ensures the bit complexity of the resulting numbers is small
• Each $\varphi_i$ is replaced with a 1-in-3-SAT formula $\psi_i$, giving $\psi_1, \psi_2, \dots, \psi_s$
• Each $\psi_i$ is converted to a k-SUM instance
• Each k-SUM instance is on a set of $r = 2^{\Theta(n/k)}$ real numbers. If we can solve k-SUM in time $r^{o(k)}$, we can solve 3-SAT in time $r^{o(k)} \cdot 2^{\varepsilon n}$

Reducing a 1-in-3-SAT formula $\psi_i$ on n variables and O(n) clauses to k-SUM on $r = 2^{\Theta(n/k)}$ real numbers
• Partition variables into k groups $G_1, \dots, G_k$ of n/k variables
• $\psi_i$ is true iff there are k real numbers that sum to $1^{k + O(n)}$ (every digit equal to 1)
• In each group $G_i$, create a real number $v_{i,j}$ for each possible assignment to its n/k variables
(Figure: groups $G_1, \dots, G_i, \dots, G_k$; group $G_i$ yields the numbers $v_{i,1}, v_{i,2}, v_{i,3}, \dots, v_{i,2^{n/k}}$)
$v_{i,j}$: base-k representation, with k group indicator digits followed by O(n) clause digits
• i-th indicator digit is 1 iff $v \in G_i$
• j-th clause digit is 1 iff A(v) sets exactly 1 literal of the j-th clause to 1
• All other digits are 0

Can we do the same for $\mathbb{F}_2$?
• Partition variables into k groups $G_1, \dots, G_k$ of n/k variables
– A sum of k vectors over $\mathbb{F}_2$ can equal $1^{k + O(n)}$, but this just means an odd number of literals in each clause are true
– Odd-SAT is easy
• In each group $G_i$, create a vector $v_{i,j}$ for each possible assignment to its n/k variables
(Figure: groups $G_1, \dots, G_i, \dots, G_k$; group $G_i$ yields the vectors $v_{i,1}, v_{i,2}, v_{i,3}, \dots, v_{i,2^{n/k}}$)
$v_{i,j}$: k + O(n) coordinates, with k group coordinates followed by O(n) clause coordinates
• i-th indicator coordinate is 1 iff $v \in G_i$
• j-th clause coordinate is 1 iff A(v) sets exactly 1 literal of the j-th clause to 1
• All other coordinates are 0

Our Modifications
• Start with a 3-SAT formula F with n variables and m clauses
• [CIP]: F is replaced by formulas $\varphi_1, \varphi_2, \dots, \varphi_s$
– $s = 2^{\varepsilon n}$. Each $\varphi_i$ has n variables and O(n) clauses
– Before, this was used for bit complexity
– Now it determines the number of dimensions
• Each $\varphi_i$ is replaced with a NAE-SAT formula $\psi_i$, giving $\psi_1, \psi_2, \dots, \psi_s$
– A NAE-SAT formula $\psi$ is 1 if, for each clause, at least one but not all literals are true
– We need interaction between groups
– With 1-in-3-SAT over $\mathbb{R}$, variables in different groups independently update the clause digit
• Each $\psi_i$ is converted to (k,r)-ZeroSum
• Each (k,r)-ZeroSum instance is on a set of $r = 2^{\Theta(n/k)}$ vectors.
If we can solve (k,r)-ZeroSum in time $r^{o(k)}$, we can solve 3-SAT in time $r^{o(k)} \cdot 2^{\varepsilon n}$

Interacting Variables
• We can replace duplicates of a variable with distinct variables and introduce equality constraints
– This preserves NAE-SAT and $\le 3$ literals per clause
– Each variable occurs in a constant number of clauses
• For each clause $(a \vee b \vee c)$, we introduce pairvars
– 1 variable is [a, b], 1 variable is [b, c], and 1 variable is [c, a]
• Partition the original n variables into k groups $G_i$ of n/k variables
• For a pairvar [a, b],
– if original variables a and b occur in the same group $G_i$, place [a, b] in $G_i$
– else, if $a \in G_i$ and $b \in G_j$, place [a, b] in $G_{\min(i,j)}$
(Callout: $G_i$ still has O(n/k) variables)

New Reduction
• In each group $G_i$, create a vector $v_{i,j}$ for each assignment to its n/k variables as well as the variables in $G_i$'s pairvars
$v_{i,j}$: k + O(n) coordinates, with k group coordinates, O(n) clause coordinates, and O(n) consistency coordinates (a pair of consistency coordinates for each pairvar [a, b])
• i-th group coordinate is 1, the others are 0
• Clause coordinates are more complicated
– They depend on the variables and pairvars assigned to the group
• Consistency coordinates allow assignments to the same variable from different groups to be patched together

Clause Coordinates
• Clause coordinates are set so that, for a consistent assignment (i.e., group and consistency coordinates are ok), the coordinate for a clause with literals a, b, c equals
– $v(a) + v(b) + v(c) - v(a) \cdot v(b) - v(b) \cdot v(c) - v(a) \cdot v(c)$
– $v(\cdot)$ denotes the value assigned
• Case analysis
– The clause coordinate equals 1 only if exactly 1 or 2 literals are true

Upper Bounds
• Consider functions $f: \mathbb{Z}_2^n \to \{0,1\}$
• The Fourier transform $F: \mathbb{Z}_2^n \to \mathbb{R}$ is $F(x) = 2^{-n} \sum_y f(y) \cdot (-1)^{\langle x, y \rangle}$
• The Fast Fourier Transform computes F from f in $O(n \cdot 2^n)$ time
• Let f be the indicator function of the input set of r vectors.
Then $\sum_{v_1 + v_2 + \dots + v_k = 0} f(v_1) \cdot f(v_2) \cdots f(v_k)$ is what we want
• This is the k-fold convolution of f with itself evaluated at $0^n$; with the normalization above it equals $2^{n(k-1)} \sum_x F(x)^k$
• So we can get $O(n \cdot 2^n)$ time instead of the trivial $2^{nk}$

Conclusion
Assuming 3-SAT cannot be solved in time less than $2^{cn}$ for a constant $c > 0$,
– (k,r)-LinDependence and (k,r)-ZeroSum require $\min(r^k, 2^n)$ time (up to polynomial factors)
– Same bound holds for many similar problems
– Almost matching upper bounds
– New way to prove hardness in coding theory
– Optimal hardness of basic problems in coding theory
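
Illustration of the FFT upper bound. The sketch below is one possible way to evaluate the sum from the Upper Bounds slide, not code from the talk. It assumes vectors in $\mathbb{F}_2^n$ are encoded as Python integers (bitmasks), and it works with the unnormalized transform $H(x) = \sum_y f(y) (-1)^{\langle x, y \rangle} = 2^n F(x)$, so the count is $2^{-n} \sum_x H(x)^k$. The function names walsh_hadamard and count_zero_sum_tuples are hypothetical. Like the sum on the slide, it counts ordered k-tuples with repetition allowed; restricting to distinct indices would need extra work that is not shown here.

```python
def walsh_hadamard(values):
    """Unnormalized transform over Z_2^n: H(x) = sum_y values[y] * (-1)^<x,y>.

    `values` must have length 2^n. Uses the standard butterfly scheme,
    which performs O(n * 2^n) additions.
    """
    h = list(values)
    size = len(h)
    step = 1
    while step < size:
        for start in range(0, size, 2 * step):
            for i in range(start, start + step):
                a, b = h[i], h[i + step]
                h[i], h[i + step] = a + b, a - b
        step *= 2
    return h


def count_zero_sum_tuples(vectors, n, k):
    """Count ordered k-tuples (repetition allowed) of input vectors that sum to 0.

    Assumption: each vector in F_2^n is given as an integer in [0, 2^n),
    so vector addition is bitwise XOR.
    """
    f = [0] * (1 << n)
    for v in set(vectors):          # f is the indicator function of the input set
        f[v] = 1
    h = walsh_hadamard(f)
    total = sum(x ** k for x in h)  # equals 2^n times the k-fold convolution at 0
    return total // (1 << n)        # the division is exact


# Example: the vectors 011, 101, 110 sum to 0; 001 takes part in no zero-sum triple.
print(count_zero_sum_tuples([0b011, 0b101, 0b110, 0b001], n=3, k=3))  # 6
```

The example prints 6 because the single zero-sum triple {011, 101, 110} is counted once for each of its 3! orderings. The transform costs $O(n \cdot 2^n)$ arithmetic operations regardless of r and k, which is why this route is attractive only when $2^n$ is small compared with $r^k$.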