Algebraic Property Testing: A Unified Perspective Arnab Bhattacharyya Indian Institute of Science Property Testing Distinguish between and and Property P π-far from property P Testing and Learning ο Proper learning (with membership queries) is as hard as testing, for any property ο For many natural properties, testing is much easier than learning ο Learning always requires time at least as large as size of hypothesis but testing can be in constant time! A brief history ο Initially appeared as a tool for program checking [BlumLuby-Rubinfeld ‘90] ο [Babai-Fortnow-Lund ’90, Rubinfeld-Sudan ’96]: application to PCPs and low-degree testing ο [Goldreich-Goldwasser-Ron ‘98] considered property testing as full-fledged computational problem in its own right ο Now, many other variants and connections known (e.g., implicit learning, active testing, coding theory, inapproximability, …) Testable Properties Property P is testable if there exist functions π and πΏ and a randomized algorithm A such that, given an input object and a parameter π: • A makes π(π) queries into input object • A rejects with prob. ≥ πΏ(π) if input is π-far from P and with prob. 0 if π = 0 πΏ Positive at π>0 Zero at π = 0 Ο΅ Testing all-orangeness P = {all-orange rectangle} • π=1 • Query hits non-orange region with probability ≥ π Testing linearity P = {linear functions πΏ: {π, π}π → {π, π} } 000… 0 000… 1 001… 1 001… 1 010… 1 010… 0 100… 0 100… 1 101… 1 111… 0 111… 0 Functions of the form 101… 1 π³ π , … , ππ = πππ + πππ + π110… ππ + β―0+ πππ 110… π 1 over characteristic011… 2 011… 0 1 • π=3 • π π₯ + π π¦ ≠ π π₯ + π¦ with probability Ω(π). [BlumLuby-Rubinfeld] Testing low-degree P = {polynomials π: πππ → ππ of degree ≤ π} } Functions of the form π· ππ , … , ππ = π π π πππ β―ππ ⋅ πππ πππ β― πππ Check that function restricted to a random O(1)-dim ππ +β―+ππ ≤π subspace is poly of degree ≤ π over characteristic p Test detects a violation with probability Ω(π) [KaufmanRon, Haramaty-Shpilka-Sudan] Affine-invariant properties Subject of this talk: “algebraic properties” of multivariate functions over finite fields. Definition (affine-invariant property): If a function f on domain πππ satisfies an affine-invariant property P, then π β π΄ must also satisfy P for every affine map π΄: πππ → πππ [Kaufman-Sudan] Why? Introspective: ο A principled understanding of testability (~ VC dimension theory for PAC learning) ο What kinds of tools are needed to prove testability? Limitations? Extrospective: ο Search for locally testable codes (applications to inapproximability) Testing Fourier Sparsity P = {functions π: 0,1 π → {0,1} have at most k non-zero Fourier coefficients} [Gopalan-O’Donnell-Servedio-Shpilka-Wimmer] showed P is testable (though only with πΏ 0 > 0). Testing Decompositions ο P = {functions π: πππ → ππ that are a product of two quadratics} ο P = {functions π: πππ → ππ that are the square of a quartic} ο P = {polynomials π: πππ → ππ of the form ππ + ππ where π, π, π, π are all cubics} Testability of all such properties open previously! Main Result An exact characterization of testability for affineinvariant properties: Theorem: An affine-invariant property is testable* if and only if it is locally characterized [Joint work with Fischer, H&P Hatami, Lovett ‘13] Locally Characterized Definition (locally characterized): A property P is locally characterized if there always exists a constant-sized witness to non-membership in P. Examples: • Linearity: If π isn’t linear, then there exist π₯, π¦ such that π π₯ + π π¦ ≠ π(π₯ + π¦). • Low degree: If deg π > π, then there exists a point at which the (π + 1)-th order derivative is nonzero. Necessity of locality Theorem: An affine-invariant property is testable if and only if it is locally characterized. Proof of : The set of queries by the tester that make it reject is itself a π-sized witness to nonmembership in the property. Proof of : Rest of this talk… Degree-structural properties ο Are properties like “Fourier sparsity”, “product of two quadratics”, “square of a quartic”, “sum of two products of quadratics”, “splitting into d linear forms” locally characterized? Theorem: Any property described as the property of decomposing into a known structure of low-degree polynomials is locally characterized. Subsequent work ο Degree-structural properties are not only testable but reconstructible! β¦ Given degree-π poly π = π1 π2 + π3 π4 where π1 , … , π4 are degree-(π/2) polys, we can reconstruct π1 , … , π4 by making π(ππ ) queries to π. [B. ‘14] ο Characterization of two-sided testability [Yoshida ‘14] Some drawbacks… Unfolded proof uses non-constructive results. We get no bound on the locality and query complexity for even simple properties like “product of two quadratics”! A Running Example: RBG π₯+π¦ π₯+π¦+π§ Witness to non-membership: π₯ π₯+π§ RBG square A function π: ππ → {π ππ, π΅ππ’π, πΊππππ} satisfies the RBG property if there are no π₯, π¦, π§ ∈ ππ such that π π₯ = π π₯ + π¦ = π ππ, π π₯ + π§ = π΅ππ’π, and π π₯ + π¦ + π§ = πΊππππ. The RBG claim Claim: The RBG property is testable with 4 queries. Suffices to show that if π: ππ → {π ππ, π΅ππ’π, πΊππππ} is far from RBG, then a random tuple (π₯, π₯ + π¦, π₯ + π§, π₯ + π¦ + π§) is an RBG square with constant probability. π₯+π¦ π₯ π₯+π¦+π§ π₯+π§ Dreams of a proof (Image © Kozmic Konstructions) The dreamiest situation Suppose non-RBG function π has the form: π(π₯) = Γ π₯1 , π₯2 , … , π₯π (0,4) • ππ cells of equal size (0,3) (0,2) (0,1) (0,0) • Must exist at least one RBG square • Probability of selecting an RBG square is π−3π . A slight nudge Same analysis works if non-RBG function π has the form: π(π₯) = Γ β1 π₯ , β2 π₯ , … , βπ (π₯) where β1 , β2 , … , βπ are linearly independent linear forms. Also works if “linearly independent” replaced by “random” More rocking of the bed Suppose non-RBG function π has the form: π(π₯) = Γ π1 π₯ , π2 π₯ , … , ππ (π₯) where π1 , … , ππ are random bounded degree nonlinear polynomials. Partitioning by random polys (0,4) (0,3) Joint distribution of π1 , … , ππ close to uniform distribution (0,2) (0,1) (0,0) ππ cells of roughly equal size Partitioning by random polys (0,4) (0,3) (0,2) Joint distribution of (π1 (π₯), . . , ππ (π₯), π1 (π₯ + π¦), . . . , ππ (π₯ + π¦), π1 (π₯ + π§), . . . , ππ (π₯ + π§), π1 (π₯ + π¦ + π§), . . . , ππ (π₯ + π¦ + π§))close to uniform (0,1) (0,0) Probability of selecting an RBG square is ≈ π−3π . Returning to intermediate dream Suppose non-RBG function π has the form: π(π₯) = Γ π1 π₯ , π2 π₯ , … , ππ (π₯) where π1 , … , ππ are arbitrary low degree polynomials. Instead of insisting π1 , … , ππ be truly random, can we weaken the requirement? Returning to intermediate dream Suppose non-RBG function π has the form: π(π₯) = Γ π1 π₯ , π2 π₯ , … , ππ (π₯) where π1 , … , ππ are arbitrary low degree polynomials. How to ensure (π1 , … , ππ ) distributed close to uniform? How to even ensure π1 distributed close to uniform? High rank polynomials Theorem [Green-Tao, Kaufmann-Lovett]: A polynomial π: πΉ π → πΉ is distributed close to uniform if Q is of high rank. Definition: A polynomial π: πΉ π → πΉ is of rank > π if there are no polynomials π1 , … , ππ of degrees less than deg(Q) such that π = Λ π1 , … , ππ for some function Λ. High rank polynomial collection Theorem [Green-Tao, Kaufmann-Lovett]: Polynomials π1 , … , ππ : πΉ π → πΉ jointly distributed close to uniform if every nontrivial linear combination of π1 , … , ππ is high rank. Polynomial Regularity Lemma A straightforward inductive argument shows that, given π(π₯) = Γ π1 π₯ , π2 π₯ , … , ππ (π₯) we can find a high rank collection of polynomials π1 , … , ππ ′ such that: π(π₯) = Γ′ π1 π₯ , π2 π₯ , … , ππ ′ (π₯) π1 , … , ππ ′ form cells of roughly equal size Equidistribution of squares? Recall we also wanted equidistribution of squares. Is high rank sufficient for this purpose? Wanted Lemma: If π1 , … , ππ ′ of sufficiently high rank, then: π1 π₯ , . . , ππ ′ π₯ , π1 π₯ + π¦ , . . . , ππ ′ π₯ + π¦ , π1 π₯ + π§ , . . . , ππ ′ π₯ + π§ , π1 π₯ + π¦ + π§ , . . . , ππ ′ π₯ + π¦ + π§ distributed close to uniform. A bit of a nightmare… Wanted lemma not necessarily true! Suppose π1 is of degree 1. Then: π1 π₯ − π1 π₯ + π¦ − π1 π₯ + π§ + π1 π₯ + π¦ + π§ = 0 identically. Lesson: some correlations may be present because of degree, no matter how large the rank is Corrected equidistribution Lemma: Given polynomials π1 , … , ππ of high rank, either there’s some linear identity among ππ π₯ , ππ π₯ + π¦ , ππ π₯ + π§ , ππ (π₯ + π¦ + π§) for some i, or the tuple π1 π₯ , . . , ππ π₯ , π1 π₯ + π¦ , . . . , ππ π₯ + π¦ , π1 π₯ + π§ , . . . , ππ π₯ + π§ , π1 π₯ + π¦ + π§ , . . . , ππ π₯ + π¦ + π§ is close to uniformly distributed. Proof uses “Gowers inverse theorem” [Tao-Ziegler] Applying corrected equidistribution π π 2 2 π 4 π π π π π Suppose π1 satisfies: π1 π₯ − π1 π₯ + π¦ − π1 π₯ + π§ + π1 (π₯ + Almost awake to reality… Now, let’s remove all restrictions on RBG-far function π: ππ → {π ππ, π΅ππ’π, πΊππππ}. Can we “approximate” f by some function g of the form: π(π₯) = Γ π1 π₯ , π2 π₯ , … , ππ (π₯) where π1 , … , ππ are low degree polynomials? Regularity Lemma Given any function π: ππ → {0,1} and integer π, there exist constantly many polynomials π1 , … , ππ : ππ → π of degrees at most π, a function Γ: ππ → [0,1] and a function β: ππ → [−1,1] such that: • π π₯ = Γ π1 π₯ , … , ππ π₯ +β π₯ • Collection π1 , … , ππ has high rank Uses “Gowers inverse theorem” [Tao-Ziegler] • Γ(π1 , … , ππ ) is the average value of π on the cell indexed (π1 , … , ππ ) • (π + 1)-th order Gowers norm of h is small Victory in dreamland! ο Letting π π : πΉ π → {0,1} be indicator function of color c. Then, rejection probability is: πΈπ₯,π¦,π§ [π πππ π₯ π πππ π₯ + π¦ π πππ’π π₯ + π§ π πππππ π₯ + π¦ + π§ ] ο We apply the regularity lemma. The term with low Gowers norm contributes negligibly to expectation. ο A combinatorial argument finishes the claim. Waking up to hard reality ο “Inverse conjecture for the Gowers norm is false” [LovettMeshulam-Samorodnitsky] ο Turns out that in the conjecture, one needs to modify the definition of polynomials to include functions that are not πvalued. ο Need to reprove regularity lemma and equidistribution claims for such “non-classical polynomials” (technical core of our work) ο Also, several lies encountered in this dream sequence addressed Non-classical polynomials Definition (non-classical polynomial): A function π: πππ → β/β€ is of degree ≤ π if Δβ1 Δβ2 β― Δβπ+1 π ≡ 0 for any β1 , β2 , … , βπ+1 ∈ πππ , where: Δβ π π₯ = π π₯ + β − π(π₯) • If range of π is 1 2 π−1 0, , , … , π π π same as classical , non-classical Non-classical polynomials Definition (non-classical polynomial): A function π: πππ → β/β€ is of degree ≤ π if Δβ1 Δβ2 β― Δβπ+1 π ≡ 0 for any β1 , β2 , … , βπ+1 ∈ πππ , where: Δβ π π₯ = π π₯ + β − π(π₯) 1 1 3 4 2 4 • Function π: πΉ2 → 0, , , defined as 1 π 0 = 0, π 1 = 4 is non-classical poly of degree 2. Gowers norm and inverse theorem ο For functions π: πΉππ → β, the Gowers norm of order d measures the expected multiplicative derivative of π at a random point in π random directions. ο Fact: If Gowers norm of f of order d is 1 and f takes values inside the unit disk, then f is the exponential of a non-classical polynomial of degree ≤ π − 1. ο Gowers Inverse Theorem: If Gowers norm of f of order d is πΏ > 0 and f takes values inside the unit disk, then f is correlated with a non-classical polynomial of degree ≤ π − 1. Open Questions ο Proof for degree-structural property also uses the same core technical ideas, so no good locality bounds even for “simple” properties like product of two quadratics. ο Give non-trivial lower bound for testing any natural affine-invariant property. No superpolynomial lower bounds known in 1/π. ο Find other uses of equidistribution of high rank polynomials. THANK YOU!