Slides - Events @ CSA Dept., IISc Bangalore

Algebraic Property Testing: A Unified Perspective
Arnab Bhattacharyya
Indian Institute of Science

Property Testing
Distinguish between inputs that have property P and inputs that are $\epsilon$-far from property P.

Testing and Learning
• Proper learning (with membership queries) is at least as hard as testing, for any property
• For many natural properties, testing is much easier than learning
• Learning always requires time at least as large as the size of the hypothesis, but testing can be done in constant time!

A brief history
• Initially appeared as a tool for program checking [Blum-Luby-Rubinfeld ’90]
• [Babai-Fortnow-Lund ’90, Rubinfeld-Sudan ’96]: application to PCPs and low-degree testing
• [Goldreich-Goldwasser-Ron ’98] considered property testing as a full-fledged computational problem in its own right
• Now, many other variants and connections known (e.g., implicit learning, active testing, coding theory, inapproximability, …)

Testable Properties
Property P is testable if there exist functions $q$ and $\delta$ and a randomized algorithm A such that, given an input object and a parameter $\epsilon$:
• A makes $q(\epsilon)$ queries into the input object
• A rejects with probability $\geq \delta(\epsilon)$ if the input is $\epsilon$-far from P, and with probability 0 if the input is in P ($\epsilon = 0$)
[Figure: plot of $\delta$ against $\epsilon$; $\delta$ is positive for $\epsilon > 0$ and zero at $\epsilon = 0$.]

Testing all-orangeness
P = {all-orange rectangle}
• $q = 1$
• Query hits non-orange region with probability $\geq \epsilon$

Testing linearity
P = {linear functions $L: \{0,1\}^n \to \{0,1\}$}
Functions of the form $L(x_1, \ldots, x_n) = x_{i_1} + x_{i_2} + x_{i_3} + \cdots + x_{i_k}$ over characteristic 2
[Figure: truth tables listing inputs 000…, 001…, 010…, … and their output bits.]
• $q = 3$
• $f(x) + f(y) \neq f(x + y)$ with probability $\Omega(\epsilon)$.
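The three-query linearity test stated above (attributed on the slide to Blum, Luby and Rubinfeld) can be sketched as follows. This is a minimal illustration, not the speaker's code: points of $\{0,1\}^n$ are encoded as n-bit integers (an assumption), so XOR plays the role of addition over characteristic 2, and the names `blr_test` and `parity_on` are hypothetical.

```python
import random


def blr_test(f, n, trials, rng=None):
    """Repeat the 3-query check `trials` times; return the fraction of rejections.

    f maps an n-bit integer (a point of {0,1}^n) to 0 or 1.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    rejections = 0
    for _ in range(trials):
        x = rng.getrandbits(n)
        y = rng.getrandbits(n)
        # A linear f satisfies f(x) + f(y) = f(x + y) over characteristic 2.
        if (f(x) ^ f(y)) != f(x ^ y):
            rejections += 1
    return rejections / trials


def parity_on(mask):
    """Hypothetical helper: the linear function summing the bits selected by mask."""
    return lambda x: bin(x & mask).count("1") % 2
```

A linear function such as `parity_on(0b1011)` is never rejected, while a function like `lambda x: (x & 1) & ((x >> 1) & 1)` (the product $x_1 x_2$) is rejected on a constant fraction of trials.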
[Blum-Luby-Rubinfeld]

Testing low-degree
P = {polynomials $P: \mathbb{F}_p^n \to \mathbb{F}_p$ of degree $\leq d$}
Functions of the form $P(x_1, \ldots, x_n) = \sum_{i_1 + \cdots + i_n \leq d} c_{i_1 \cdots i_n} \cdot x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$ over characteristic $p$
Check that the function restricted to a random $O(1)$-dimensional subspace is a polynomial of degree $\leq d$
Test detects a violation with probability $\Omega(\epsilon)$ [Kaufman-Ron, Haramaty-Shpilka-Sudan]

Affine-invariant properties
Subject of this talk: “algebraic properties” of multivariate functions over finite fields.
Definition (affine-invariant property): If a function $f$ on domain $\mathbb{F}_p^n$ satisfies an affine-invariant property P, then $f \circ A$ must also satisfy P for every affine map $A: \mathbb{F}_p^n \to \mathbb{F}_p^n$ [Kaufman-Sudan]

Why?
Introspective:
• A principled understanding of testability (~ VC dimension theory for PAC learning)
• What kinds of tools are needed to prove testability? Limitations?
Extrospective:
• Search for locally testable codes (applications to inapproximability)

Testing Fourier Sparsity
P = {functions $P: \{0,1\}^n \to \{0,1\}$ that have at most $k$ non-zero Fourier coefficients}
[Gopalan-O’Donnell-Servedio-Shpilka-Wimmer] showed P is testable (though only with $\delta(0) > 0$).

Testing Decompositions
• P = {functions $P: \mathbb{F}_p^n \to \mathbb{F}_p$ that are a product of two quadratics}
• P = {functions $P: \mathbb{F}_p^n \to \mathbb{F}_p$ that are the square of a quartic}
• P = {polynomials $P: \mathbb{F}_p^n \to \mathbb{F}_p$ of the form $ab + cd$ where $a, b, c, d$ are all cubics}
Testability of all such properties open previously!

Main Result
An exact characterization of testability for affine-invariant properties:
Theorem: An affine-invariant property is testable* if and only if it is locally characterized
[Joint work with Fischer, H&P Hatami, Lovett ’13]

Locally Characterized
Definition (locally characterized): A property P is locally characterized if there always exists a constant-sized witness to non-membership in P.
Examples:
• Linearity: If $f$ isn’t linear, then there exist $x, y$ such that $f(x) + f(y) \neq f(x + y)$.
• Low degree: If $\deg f > d$, then there exists a point at which the $(d + 1)$-th order derivative is nonzero.

Necessity of locality
Theorem: An affine-invariant property is testable if and only if it is locally characterized.
Proof of $\Rightarrow$ (testable implies locally characterized): The set of queries by the tester that make it reject is itself a $q$-sized witness to non-membership in the property.
Proof of $\Leftarrow$: Rest of this talk…

Degree-structural properties
• Are properties like “Fourier sparsity”, “product of two quadratics”, “square of a quartic”, “sum of two products of quadratics”, “splitting into d linear forms” locally characterized?
Theorem: Any property described as the property of decomposing into a known structure of low-degree polynomials is locally characterized.

Subsequent work
• Degree-structural properties are not only testable but reconstructible!
◦ Given a degree-$d$ polynomial $P = Q_1 Q_2 + Q_3 Q_4$ where $Q_1, \ldots, Q_4$ are degree-$(d/2)$ polynomials, we can reconstruct $Q_1, \ldots, Q_4$ by making $O(n^d)$ queries to $P$. [B. ’14]
• Characterization of two-sided testability [Yoshida ’14]

Some drawbacks…
Unfolded proof uses non-constructive results. We get no bound on the locality and query complexity for even simple properties like “product of two quadratics”!

A Running Example: RBG
[Figure: a square with vertices $x$, $x + y$, $x + z$, $x + y + z$.]
Witness to non-membership: RBG square
A function $f: \mathbb{F}^n \to \{Red, Blue, Green\}$ satisfies the RBG property if there are no $x, y, z \in \mathbb{F}^n$ such that $f(x) = f(x + y) = Red$, $f(x + z) = Blue$, and $f(x + y + z) = Green$.

The RBG claim
Claim: The RBG property is testable with 4 queries.
Suffices to show that if $f: \mathbb{F}^n \to \{Red, Blue, Green\}$ is far from RBG, then a random tuple $(x, x + y, x + z, x + y + z)$ is an RBG square with constant probability.
[Figure: the square with vertices $x$, $x + y$, $x + z$, $x + y + z$.]

Dreams of a proof
[Image © Kozmic Konstructions]

The dreamiest situation
Suppose non-RBG function $f$ has the form: $f(x) = \Gamma(x_1, x_2, \ldots, x_c)$
[Figure: grid of cells labeled (0,0), (0,1), …, (0,4).]
• $p^c$ cells of equal size
• Must exist at least one RBG square
• Probability of selecting an RBG square is at least $p^{-3c}$.
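The 4-query test in the RBG claim, and the counting in the dreamiest situation, can be made concrete with a small sketch. It assumes points of $\mathbb{F}_p^n$ are tuples of integers mod $p$; the function names and the example coloring are hypothetical, chosen only for illustration.

```python
import random
from itertools import product

RED, BLUE, GREEN = "Red", "Blue", "Green"


def add(u, v, p):
    """Coordinate-wise addition in F_p^n."""
    return tuple((a + b) % p for a, b in zip(u, v))


def is_rbg_square(f, x, y, z, p):
    """True if (x, x+y, x+z, x+y+z) is colored (Red, Red, Blue, Green)."""
    xy = add(x, y, p)
    return (f(x) == RED and f(xy) == RED
            and f(add(x, z, p)) == BLUE
            and f(add(xy, z, p)) == GREEN)


def rbg_test(f, n, p, trials, rng=None):
    """Accept (True) unless some sampled square is an RBG square; 4 queries per trial."""
    if rng is None:
        rng = random.Random(0)
    for _ in range(trials):
        x, y, z = (tuple(rng.randrange(p) for _ in range(n)) for _ in range(3))
        if is_rbg_square(f, x, y, z, p):
            return False
    return True


def rbg_density(f, n, p):
    """Exact fraction of triples (x, y, z) forming an RBG square (small n only)."""
    points = list(product(range(p), repeat=n))
    hits = sum(is_rbg_square(f, x, y, z, p)
               for x, y, z in product(points, repeat=3))
    return hits / len(points) ** 3


def two_coordinate_coloring(x):
    """Hypothetical example over F_2: the color depends only on (x_1, x_2)."""
    if x[0] == 0:
        return RED
    return BLUE if x[1] == 0 else GREEN
```

With $p = 2$ and $c = 2$, `rbg_density(two_coordinate_coloring, 2, 2)` counts 2 of the 64 triples, i.e. two cell patterns of probability $p^{-3c} = 1/64$ each, which agrees with the lower bound on the slide.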

A slight nudge
Same analysis works if non-RBG function $f$ has the form: $f(x) = \Gamma(\ell_1(x), \ell_2(x), \ldots, \ell_c(x))$ where $\ell_1, \ell_2, \ldots, \ell_c$ are linearly independent linear forms.
Also works if “linearly independent” is replaced by “random”.

More rocking of the bed
Suppose non-RBG function $f$ has the form: $f(x) = \Gamma(Q_1(x), Q_2(x), \ldots, Q_c(x))$ where $Q_1, \ldots, Q_c$ are random bounded degree nonlinear polynomials.

Partitioning by random polys
[Figure: grid of cells labeled (0,0), (0,1), …, (0,4).]
Joint distribution of $(Q_1, \ldots, Q_c)$ close to uniform distribution
$p^c$ cells of roughly equal size
Joint distribution of $(Q_1(x), \ldots, Q_c(x), Q_1(x + y), \ldots, Q_c(x + y), Q_1(x + z), \ldots, Q_c(x + z), Q_1(x + y + z), \ldots, Q_c(x + y + z))$ close to uniform
Probability of selecting an RBG square is $\approx p^{-3c}$.

Returning to intermediate dream
Suppose non-RBG function $f$ has the form: $f(x) = \Gamma(Q_1(x), Q_2(x), \ldots, Q_c(x))$ where $Q_1, \ldots, Q_c$ are arbitrary low degree polynomials.
Instead of insisting $Q_1, \ldots, Q_c$ be truly random, can we weaken the requirement?
How to ensure $(Q_1, \ldots, Q_c)$ distributed close to uniform?
How to even ensure $Q_1$ distributed close to uniform?

High rank polynomials
Theorem [Green-Tao, Kaufman-Lovett]: A polynomial $Q: \mathbb{F}^n \to \mathbb{F}$ is distributed close to uniform if $Q$ is of high rank.
Definition: A polynomial $Q: \mathbb{F}^n \to \mathbb{F}$ is of rank $> r$ if there are no polynomials $P_1, \ldots, P_r$ of degrees less than $\deg(Q)$ such that $Q = \Lambda(P_1, \ldots, P_r)$ for some function $\Lambda$.

High rank polynomial collection
Theorem [Green-Tao, Kaufman-Lovett]: Polynomials $Q_1, \ldots, Q_c: \mathbb{F}^n \to \mathbb{F}$ are jointly distributed close to uniform if every nontrivial linear combination of $Q_1, \ldots, Q_c$ is high rank.
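The theorems above are qualitative; a small numeric illustration may help. The sketch below works over $\mathbb{F}_2$ only (an assumption; the talk is over general fields) and measures how far a quadratic is from balanced via $|E_x (-1)^{Q(x)}|$. The family $x_1 x_2 + x_3 x_4 + \cdots + x_{2k-1} x_{2k}$ is used as an example that needs more linear forms to describe as $k$ grows; the names `bias` and `paired_quadratic` are hypothetical.

```python
from itertools import product


def bias(Q, n):
    """|E_x (-1)^{Q(x)}| over all x in F_2^n; 0 means Q is exactly balanced."""
    total = sum((-1) ** Q(x) for x in product((0, 1), repeat=n))
    return abs(total) / 2 ** n


def paired_quadratic(k):
    """x_1 x_2 + x_3 x_4 + ... + x_{2k-1} x_{2k} over F_2 (uses the first 2k coordinates)."""
    return lambda x: sum(x[2 * i] * x[2 * i + 1] for i in range(k)) % 2
```

Since $E[(-1)^{x_1 x_2}] = 1/2$ and the pairs are independent, the bias of `paired_quadratic(k)` is $2^{-k}$: the single product $x_1 x_2$, a function of just two linear forms, is visibly unbalanced, while the longer sums approach the uniform distribution.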

Polynomial Regularity Lemma
A straightforward inductive argument shows that, given $f(x) = \Gamma(Q_1(x), Q_2(x), \ldots, Q_c(x))$, we can find a high rank collection of polynomials $P_1, \ldots, P_{c'}$ such that: $f(x) = \Gamma'(P_1(x), P_2(x), \ldots, P_{c'}(x))$
$P_1, \ldots, P_{c'}$ form cells of roughly equal size

Equidistribution of squares?
Recall we also wanted equidistribution of squares. Is high rank sufficient for this purpose?
Wanted Lemma: If $P_1, \ldots, P_{c'}$ are of sufficiently high rank, then: $(P_1(x), \ldots, P_{c'}(x), P_1(x + y), \ldots, P_{c'}(x + y), P_1(x + z), \ldots, P_{c'}(x + z), P_1(x + y + z), \ldots, P_{c'}(x + y + z))$ is distributed close to uniform.

A bit of a nightmare…
Wanted lemma not necessarily true!
Suppose $P_1$ is of degree 1. Then: $P_1(x) - P_1(x + y) - P_1(x + z) + P_1(x + y + z) = 0$ identically.
Lesson: some correlations may be present because of degree, no matter how large the rank is

Corrected equidistribution
Lemma: Given polynomials $P_1, \ldots, P_c$ of high rank, either there’s some linear identity among $P_i(x), P_i(x + y), P_i(x + z), P_i(x + y + z)$ for some $i$, or the tuple $(P_1(x), \ldots, P_c(x), P_1(x + y), \ldots, P_c(x + y), P_1(x + z), \ldots, P_c(x + z), P_1(x + y + z), \ldots, P_c(x + y + z))$ is close to uniformly distributed.
Proof uses “Gowers inverse theorem” [Tao-Ziegler]

Applying corrected equidistribution
[Figure: numeric cell labels.]
Suppose $P_1$ satisfies: $P_1(x) - P_1(x + y) - P_1(x + z) + P_1(x + \ldots$

Almost awake to reality…
Now, let’s remove all restrictions on RBG-far function $f: \mathbb{F}^n \to \{Red, Blue, Green\}$.
Can we “approximate” $f$ by some function $g$ of the form: $g(x) = \Gamma(Q_1(x), Q_2(x), \ldots, Q_c(x))$ where $Q_1, \ldots, Q_c$ are low degree polynomials?
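The degree-1 obstruction from the “A bit of a nightmare…” slide can be checked numerically. The sketch below, with hypothetical names and example polynomials over $\mathbb{F}_3^4$ chosen only for illustration, tabulates the alternating sum $P(x) - P(x + y) - P(x + z) + P(x + y + z)$ over random squares.

```python
import random
from collections import Counter


def add(u, v, p):
    """Coordinate-wise addition in F_p^n."""
    return tuple((a + b) % p for a, b in zip(u, v))


def square_sum(P, x, y, z, p):
    """P(x) - P(x+y) - P(x+z) + P(x+y+z), reduced mod p."""
    xy, xz = add(x, y, p), add(x, z, p)
    xyz = add(xy, z, p)
    return (P(x) - P(xy) - P(xz) + P(xyz)) % p


def square_sum_histogram(P, n, p, trials, seed=0):
    """Histogram of the alternating sum over `trials` random squares."""
    rng = random.Random(seed)
    hist = Counter()
    for _ in range(trials):
        x, y, z = (tuple(rng.randrange(p) for _ in range(n)) for _ in range(3))
        hist[square_sum(P, x, y, z, p)] += 1
    return hist


# Hypothetical examples over F_3^4.
linear = lambda x: (2 * x[0] + x[2] + 1) % 3            # degree 1
quadratic = lambda x: (x[0] * x[1] + x[2] * x[3]) % 3   # degree 2
```

For `linear` the histogram is concentrated entirely on 0, which is the identity from the slide. For `quadratic` the alternating sum equals $y_1 z_2 + z_1 y_2 + y_3 z_4 + z_3 y_4$, which takes all three values, so no such identity holds at degree 2.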

Regularity Lemma
Given any function $f: \mathbb{F}^n \to \{0,1\}$ and integer $d$, there exist constantly many polynomials $Q_1, \ldots, Q_c: \mathbb{F}^n \to \mathbb{F}$ of degrees at most $d$, a function $\Gamma: \mathbb{F}^c \to [0,1]$ and a function $h: \mathbb{F}^n \to [-1,1]$ such that:
• $f(x) = \Gamma(Q_1(x), \ldots, Q_c(x)) + h(x)$
• Collection $Q_1, \ldots, Q_c$ has high rank
• $\Gamma(a_1, \ldots, a_c)$ is the average value of $f$ on the cell indexed $(a_1, \ldots, a_c)$
• $(d + 1)$-th order Gowers norm of $h$ is small
Uses “Gowers inverse theorem” [Tao-Ziegler]

Victory in dreamland!
• Let $f^c: \mathbb{F}^n \to \{0,1\}$ be the indicator function of color $c$. Then the rejection probability is: $E_{x,y,z}[f^{red}(x) \, f^{red}(x + y) \, f^{blue}(x + z) \, f^{green}(x + y + z)]$
• We apply the regularity lemma. The term with low Gowers norm contributes negligibly to the expectation.
• A combinatorial argument finishes the claim.

Waking up to hard reality
• “Inverse conjecture for the Gowers norm is false” [Lovett-Meshulam-Samorodnitsky]
• Turns out that in the conjecture, one needs to modify the definition of polynomials to include functions that are not $\mathbb{F}$-valued.
• Need to reprove regularity lemma and equidistribution claims for such “non-classical polynomials” (technical core of our work)
• Also, several lies encountered in this dream sequence addressed

Non-classical polynomials
Definition (non-classical polynomial): A function $f: \mathbb{F}_p^n \to \mathbb{R}/\mathbb{Z}$ is of degree $\leq d$ if $\Delta_{h_1} \Delta_{h_2} \cdots \Delta_{h_{d+1}} f \equiv 0$ for any $h_1, h_2, \ldots, h_{d+1} \in \mathbb{F}_p^n$, where: $\Delta_h f(x) = f(x + h) - f(x)$
• If the range of $f$ is $\{0, \frac{1}{p}, \frac{2}{p}, \ldots, \frac{p-1}{p}\}$, same as classical; otherwise, non-classical
• The function $f: \mathbb{F}_2 \to \{0, \frac{1}{4}, \frac{1}{2}, \frac{3}{4}\}$ defined as $f(0) = 0$, $f(1) = \frac{1}{4}$ is a non-classical polynomial of degree 2.

Gowers norm and inverse theorem
• For functions $f: \mathbb{F}_p^n \to \mathbb{C}$, the Gowers norm of order $d$ measures the expected multiplicative derivative of $f$ at a random point in $d$ random directions.
• Fact: If the Gowers norm of $f$ of order $d$ is 1 and $f$ takes values inside the unit disk, then $f$ is the exponential of a non-classical polynomial of degree $\leq d - 1$.
• Gowers Inverse Theorem: If the Gowers norm of $f$ of order $d$ is $\delta > 0$ and $f$ takes values inside the unit disk, then $f$ is correlated with a non-classical polynomial of degree $\leq d - 1$.

Open Questions
• Proof for degree-structural property also uses the same core technical ideas, so no good locality bounds even for “simple” properties like product of two quadratics.
• Give non-trivial lower bound for testing any natural affine-invariant property. No superpolynomial lower bounds known in $1/\epsilon$.
• Find other uses of equidistribution of high rank polynomials.

THANK YOU!
