Slides - Events @ CSA Dept., IISc Bangalore

Algebraic Property Testing:
A Unified Perspective
Arnab Bhattacharyya
Indian Institute of Science
Property Testing
Distinguish between inputs that have property P and inputs that are ε-far from property P
Testing and Learning
• Proper learning (with membership queries) is at least as hard as testing, for any property
• For many natural properties, testing is much easier than learning
• Learning always requires time at least as large as the size of the hypothesis, but testing can be in constant time!
A brief history
• Initially appeared as a tool for program checking [Blum-Luby-Rubinfeld ’90]
• [Babai-Fortnow-Lund ’90, Rubinfeld-Sudan ’96]: application to PCPs and low-degree testing
• [Goldreich-Goldwasser-Ron ’98] considered property testing as a full-fledged computational problem in its own right
• Now, many other variants and connections known (e.g., implicit learning, active testing, coding theory, inapproximability, …)
Testable Properties
Property P is testable if there exist functions q and δ and a randomized algorithm A such that, given an input object and a parameter ε:
• A makes q(ε) queries into the input object
• A rejects with prob. ≥ δ(ε) if the input is ε-far from P, and with prob. 0 if ε = 0 (i.e., the input has property P)
[Plot: δ as a function of ε; positive at ε > 0, zero at ε = 0]
Testing all-orangeness
P = {all-orange rectangle}
• q = 1
• If the rectangle is ε-far from all-orange, a query hits a non-orange region with probability ≥ ε
Testing linearity
P = {linear functions L: {0,1}^n → {0,1}}
[Figure: truth table listing inputs 000…, 001…, 010…, … with their 0/1 function values]
Functions of the form
L(x_1, …, x_n) = x_{i_1} + x_{i_2} + x_{i_3} + ⋯ + x_{i_k}
over characteristic 2
• q = 3
• f(x) + f(y) ≠ f(x + y) with probability Ω(ε). [Blum-Luby-Rubinfeld]
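The 3-query check above can be sketched in a few lines. This is a minimal illustration, assuming points of {0,1}^n are encoded as n-bit integers (so addition is XOR); the names `blr_test`, `parity_of_low_bits` and `and_of_two_bits` are hypothetical and chosen only for this example.

```python
import random

def blr_test(f, n, trials, rng):
    """Return False as soon as a sampled pair violates f(x) + f(y) = f(x + y)."""
    for _ in range(trials):
        x = rng.getrandbits(n)  # points of {0,1}^n encoded as n-bit integers
        y = rng.getrandbits(n)
        # Addition in F_2 and in {0,1}^n is XOR; each trial makes 3 queries.
        if (f(x) ^ f(y)) != f(x ^ y):
            return False
    return True

def parity_of_low_bits(x):
    """Linear: x_1 + x_2 + x_3 over F_2 (the three lowest bits)."""
    return bin(x & 0b111).count("1") % 2

def and_of_two_bits(x):
    """Not linear: the product x_1 * x_2."""
    return (x & 1) & ((x >> 1) & 1)
```

A linear function passes every trial, so the tester never rejects it. A single trial makes 3 queries; repeating trials only amplifies the rejection probability for functions that are far from linear.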
Testing low-degree
P = {polynomials P: F_p^n → F_p of degree ≤ d}
Functions of the form
P(x_1, …, x_n) = Σ_{i_1 + ⋯ + i_n ≤ d} c_{i_1 ⋯ i_n} · x_1^{i_1} x_2^{i_2} ⋯ x_n^{i_n}
over characteristic p
Check that the function restricted to a random O(1)-dim subspace is a poly of degree ≤ d
Test detects a violation with probability Ω(ε) [Kaufman-Ron, Haramaty-Shpilka-Sudan]
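As a rough illustration of a subspace check, here is a sketch specialized to F_2. It swaps in a derivative-based variant: it sums f over a random (d + 1)-dimensional affine cube, which must give 0 when deg f ≤ d. This is an assumption made for the example and not necessarily the exact test analyzed in the cited works; `degree_test`, `quadratic` and `cubic` are hypothetical names.

```python
import random

def degree_test(f, n, d, trials, rng):
    """Over F_2: reject if f does not sum to 0 on a random (d+1)-dim affine cube."""
    for _ in range(trials):
        x = rng.getrandbits(n)
        hs = [rng.getrandbits(n) for _ in range(d + 1)]
        total = 0
        for mask in range(1 << (d + 1)):
            point = x
            for j, h in enumerate(hs):
                if (mask >> j) & 1:
                    point ^= h  # add direction h_j to the base point
            total ^= f(point)
        # total is the (d+1)-fold derivative of f at x in directions hs.
        if total != 0:
            return False
    return True

def quadratic(x):
    """Degree 2: x_1 x_2 + x_3."""
    return ((x & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1)

def cubic(x):
    """Degree 3: x_1 x_2 x_3."""
    return (x & 1) & ((x >> 1) & 1) & ((x >> 2) & 1)
```

Each trial queries 2^(d+1) points, a constant independent of n.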
Affine-invariant properties
Subject of this talk: “algebraic properties” of
multivariate functions over finite fields.
Definition (affine-invariant property):
A property P of functions on domain F_p^n is affine-invariant if, whenever f satisfies P, f ∘ A also satisfies P for every affine map A: F_p^n → F_p^n
[Kaufman-Sudan]
Why?
Introspective:
• A principled understanding of testability (analogous to VC dimension theory for PAC learning)
• What kinds of tools are needed to prove testability? Limitations?
Extrospective:
• Search for locally testable codes (applications to inapproximability)
Testing Fourier Sparsity
P = {functions P: {0,1}^n → {0,1} that have at most k non-zero Fourier coefficients}
[Gopalan-O’Donnell-Servedio-Shpilka-Wimmer] showed P is testable (though only with δ(0) > 0).
Testing Decompositions
• P = {functions P: F_p^n → F_p that are a product of two quadratics}
• P = {functions P: F_p^n → F_p that are the square of a quartic}
• P = {polynomials P: F_p^n → F_p of the form ab + cd where a, b, c, d are all cubics}
Testability of all such properties was previously open!
Main Result
An exact characterization of testability for affine-invariant properties:
Theorem: An affine-invariant property is testable* if and only if it is locally characterized
[Joint work with Fischer, H. & P. Hatami, Lovett ’13]
Locally Characterized
Definition (locally characterized):
A property P is locally characterized if there
always exists a constant-sized witness to
non-membership in P.
Examples:
• Linearity: If f isn’t linear, then there exist x, y such that f(x) + f(y) ≠ f(x + y).
• Low degree: If deg f > d, then there exists a point at which the (d + 1)-th order derivative is nonzero.
Necessity of locality
Theorem: An affine-invariant property is
testable if and only if it is locally
characterized.
Proof of ⇒ (testable implies locally characterized): The set of queries by the tester that make it reject is itself a q-sized witness to non-membership in the property.
Proof of ⇐ (locally characterized implies testable): Rest of this talk…
Degree-structural properties
• Are properties like “Fourier sparsity”, “product of two quadratics”, “square of a quartic”, “sum of two products of quadratics”, “splitting into d linear forms” locally characterized?
Theorem: Any property described as the property of decomposing into a known structure of low-degree polynomials is locally characterized.
Subsequent work
• Degree-structural properties are not only testable but reconstructible!
  ◦ Given degree-d poly P = Q_1 Q_2 + Q_3 Q_4 where Q_1, …, Q_4 are degree-(d/2) polys, we can reconstruct Q_1, …, Q_4 by making O(n^d) queries to P. [B. ’14]
• Characterization of two-sided testability [Yoshida ’14]
Some drawbacks…
Unfolded proof uses non-constructive results. We get no bound on the locality and query complexity even for simple properties like “product of two quadratics”!
A Running Example: RBG
[Figure: witness to non-membership, an “RBG square” on the points x, x + y, x + z, x + y + z]
A function f: F^n → {Red, Blue, Green} satisfies the RBG property if there are no x, y, z ∈ F^n such that f(x) = f(x + y) = Red, f(x + z) = Blue, and f(x + y + z) = Green.
The RBG claim
Claim: The RBG property is testable with 4
queries.
Suffices to show that if f: F^n → {Red, Blue, Green} is far from RBG, then a random tuple (x, x + y, x + z, x + y + z) is an RBG square with constant probability.
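The 4-query tester implied by the claim can be sketched as follows. This is an illustrative sketch, assuming F = F_p with points stored as tuples of residues; `rbg_test`, `all_red` and `has_squares` are hypothetical names, and the trial count is an arbitrary choice for the demo.

```python
import random

RED, BLUE, GREEN = "Red", "Blue", "Green"

def add(u, v, p):
    """Coordinate-wise addition in F_p^n."""
    return tuple((a + b) % p for a, b in zip(u, v))

def random_point(n, p, rng):
    return tuple(rng.randrange(p) for _ in range(n))

def rbg_test(f, n, p, trials, rng):
    """Reject (return False) if a sampled tuple is an RBG square."""
    for _ in range(trials):
        x, y, z = (random_point(n, p, rng) for _ in range(3))
        xy, xz = add(x, y, p), add(x, z, p)
        xyz = add(xy, z, p)
        # Four queries per trial: f(x), f(x+y), f(x+z), f(x+y+z).
        if (f(x), f(xy), f(xz), f(xyz)) == (RED, RED, BLUE, GREEN):
            return False
    return True

def all_red(x):
    """Satisfies the RBG property trivially (no Blue or Green points)."""
    return RED

def has_squares(x):
    """Violates the RBG property; a random tuple is a square w.p. 1/32 over F_2."""
    if x[0] == 0:
        return RED
    return BLUE if x[1] == 0 else GREEN
```

The tester has one-sided error: a function with the RBG property contains no RBG square, so it is never rejected.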
Dreams of a proof
(Image © Kozmic Konstructions)
The dreamiest situation
Suppose non-RBG function f has the form:
f(x) = Γ(x_1, x_2, …, x_c)
[Figure: domain partitioned into cells labeled (0,0), (0,1), (0,2), (0,3), (0,4), …]
• p^c cells of equal size
• Must exist at least one RBG square
• Probability of selecting an RBG square is at least p^{-3c}.
A slight nudge
Same analysis works if non-RBG function f has the form:
f(x) = Γ(ℓ_1(x), ℓ_2(x), …, ℓ_c(x))
where ℓ_1, ℓ_2, …, ℓ_c are linearly independent linear forms.
Also works if “linearly independent” is replaced by “random”
More rocking of the bed
Suppose non-RBG function f has the form:
f(x) = Γ(Q_1(x), Q_2(x), …, Q_c(x))
where Q_1, …, Q_c are random bounded-degree non-linear polynomials.
Partitioning by random polys
[Figure: domain partitioned into cells labeled (0,0), (0,1), (0,2), (0,3), (0,4), …]
Joint distribution of Q_1, …, Q_c close to uniform distribution
p^c cells of roughly equal size
Partitioning by random polys
[Figure: domain partitioned into cells labeled (0,0), (0,1), (0,2), (0,3), (0,4), …]
Joint distribution of
(Q_1(x), …, Q_c(x), Q_1(x + y), …, Q_c(x + y), Q_1(x + z), …, Q_c(x + z), Q_1(x + y + z), …, Q_c(x + y + z))
close to uniform
Probability of selecting an RBG square is ≈ p^{-3c}.
Returning to intermediate dream
Suppose non-RBG function f has the form:
f(x) = Γ(Q_1(x), Q_2(x), …, Q_c(x))
where Q_1, …, Q_c are arbitrary low-degree polynomials.
Instead of insisting Q_1, …, Q_c be truly random, can we weaken the requirement?
How to ensure (Q_1, …, Q_c) is distributed close to uniform? How to even ensure Q_1 is distributed close to uniform?
High rank polynomials
Theorem [Green-Tao, Kaufman-Lovett]: A polynomial Q: F^n → F is distributed close to uniform if Q is of high rank.
Definition: A polynomial Q: F^n → F is of rank > r if there are no polynomials P_1, …, P_r of degrees less than deg(Q) such that
Q = Λ(P_1, …, P_r)
for some function Λ.
High rank polynomial collection
Theorem [Green-Tao, Kaufman-Lovett]: Polynomials Q_1, …, Q_c: F^n → F are jointly distributed close to uniform if every nontrivial linear combination of Q_1, …, Q_c is of high rank.
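A small numerical illustration of "high rank implies close to uniform", under assumptions made only for this example: take F = F_2 and the quadratic Q(x) = x_1 x_2 + x_3 x_4 + ⋯ + x_{2k−1} x_{2k}, a standard example that needs more lower-degree polynomials to express as k grows. The helper name `bias` is hypothetical. The sketch computes |E_x[(−1)^{Q(x)}]| by brute force; a value near 0 means Q takes the values 0 and 1 almost equally often.

```python
from itertools import product

def bias(k):
    """|E_x[(-1)^Q(x)]| for Q(x) = x_1 x_2 + ... + x_{2k-1} x_{2k} over F_2."""
    total = 0
    for x in product((0, 1), repeat=2 * k):
        q = sum(x[2 * i] * x[2 * i + 1] for i in range(k)) % 2
        total += -1 if q else 1
    return abs(total) / 2 ** (2 * k)
```

For this family the bias is exactly 2^{-k}: each product of two fresh variables contributes a factor of 1/2, so the distribution of Q approaches uniform as k grows.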
Polynomial Regularity Lemma
A straightforward inductive argument shows that, given
f(x) = Γ(Q_1(x), Q_2(x), …, Q_c(x))
we can find a high rank collection of polynomials P_1, …, P_{c'} such that:
f(x) = Γ'(P_1(x), P_2(x), …, P_{c'}(x))
P_1, …, P_{c'} form cells of roughly equal size
Equidistribution of squares?
Recall we also wanted equidistribution of squares. Is
high rank sufficient for this purpose?
Wanted Lemma: If P_1, …, P_{c'} are of sufficiently high rank, then:
(P_1(x), …, P_{c'}(x),
 P_1(x + y), …, P_{c'}(x + y),
 P_1(x + z), …, P_{c'}(x + z),
 P_1(x + y + z), …, P_{c'}(x + y + z))
is distributed close to uniform.
A bit of a nightmare…
Wanted lemma not necessarily true!
Suppose P_1 is of degree 1. Then:
P_1(x) − P_1(x + y) − P_1(x + z) + P_1(x + y + z) = 0
identically.
Lesson: some correlations may be present because of degree, no matter how large the rank is
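The identity is easy to confirm by brute force on a small field. The sketch below assumes F = F_3 and n = 2 purely for illustration; `second_difference_vanishes`, `affine` and `quadratic` are hypothetical names. It also shows that the same identity fails for a degree-2 polynomial.

```python
from itertools import product

def second_difference_vanishes(P, n, p):
    """Check P(x) - P(x+y) - P(x+z) + P(x+y+z) = 0 in F_p for all x, y, z."""
    points = list(product(range(p), repeat=n))

    def add(u, v):
        return tuple((a + b) % p for a, b in zip(u, v))

    for x, y, z in product(points, repeat=3):
        value = P(x) - P(add(x, y)) - P(add(x, z)) + P(add(add(x, y), z))
        if value % p != 0:
            return False
    return True

def affine(x):
    """Degree 1 over F_3: 2*x_1 + x_2 + 1."""
    return (2 * x[0] + x[1] + 1) % 3

def quadratic(x):
    """Degree 2 over F_3: x_1 * x_2."""
    return (x[0] * x[1]) % 3
```

For the quadratic, the expression equals y_1 z_2 + y_2 z_1, which is nonzero for y = (1, 0), z = (0, 1), so the forced correlation is specific to degree 1.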
Corrected equidistribution
Lemma: Given polynomials P_1, …, P_c of high rank, either there’s some linear identity among P_i(x), P_i(x + y), P_i(x + z), P_i(x + y + z) for some i, or the tuple
(P_1(x), …, P_c(x),
 P_1(x + y), …, P_c(x + y),
 P_1(x + z), …, P_c(x + z),
 P_1(x + y + z), …, P_c(x + y + z))
is close to uniformly distributed.
Proof uses “Gowers inverse theorem” [Tao-Ziegler]
Applying corrected equidistribution
[Figure: cells labeled with numeric values]
Suppose P_1 satisfies: P_1(x) − P_1(x + y) − P_1(x + z) + P_1(x + …
Almost awake to reality…
Now, let’s remove all restrictions on RBG-far function f: F^n → {Red, Blue, Green}.
Can we “approximate” f by some function g of the form:
g(x) = Γ(Q_1(x), Q_2(x), …, Q_c(x))
where Q_1, …, Q_c are low-degree polynomials?
Regularity Lemma
Given any function f: F^n → {0,1} and integer d, there exist constantly many polynomials Q_1, …, Q_c: F^n → F of degrees at most d, a function Γ: F^c → [0,1] and a function h: F^n → [−1,1] such that:
• f(x) = Γ(Q_1(x), …, Q_c(x)) + h(x)
• Collection Q_1, …, Q_c has high rank
• Γ(a_1, …, a_c) is the average value of f on the cell indexed (a_1, …, a_c)
• (d + 1)-th order Gowers norm of h is small
Uses “Gowers inverse theorem” [Tao-Ziegler]
Victory in dreamland!
• Let f^c: F^n → {0,1} be the indicator function of color c. Then the rejection probability is:
  E_{x,y,z}[f^red(x) f^red(x + y) f^blue(x + z) f^green(x + y + z)]
• We apply the regularity lemma. The term with low Gowers norm contributes negligibly to the expectation.
• A combinatorial argument finishes the claim.
Waking up to hard reality
• “Inverse conjecture for the Gowers norm is false” [Lovett-Meshulam-Samorodnitsky]
• Turns out that in the conjecture, one needs to modify the definition of polynomials to include functions that are not F-valued.
• Need to reprove regularity lemma and equidistribution claims for such “non-classical polynomials” (technical core of our work)
• Also, several lies encountered in this dream sequence are addressed
Non-classical polynomials
Definition (non-classical polynomial):
A function f: F_p^n → ℝ/ℤ is of degree ≤ d if
Δ_{h_1} Δ_{h_2} ⋯ Δ_{h_{d+1}} f ≡ 0
for any h_1, h_2, …, h_{d+1} ∈ F_p^n, where:
(Δ_h f)(x) = f(x + h) − f(x)
• If the range of f is {0, 1/p, 2/p, …, (p − 1)/p}, same as classical; otherwise, non-classical
• Function f: F_2 → {0, 1/4, 1/2, 3/4} defined as f(0) = 0, f(1) = 1/4 is a non-classical poly of degree 2.
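The degree claim for this example can be checked directly from the definition. The following is a brute-force sketch using exact rational arithmetic modulo 1; the helper names `derivative` and `has_degree_at_most` are hypothetical, and the approach only scales to tiny p and n.

```python
from fractions import Fraction
from itertools import product

def derivative(f, h, p):
    """Additive derivative (Δ_h f)(x) = f(x + h) - f(x), with values in R/Z."""
    def g(x):
        shifted = tuple((xi + hi) % p for xi, hi in zip(x, h))
        return (f(shifted) - f(x)) % 1
    return g

def has_degree_at_most(f, d, p, n):
    """True if every (d+1)-fold derivative of f vanishes identically on F_p^n."""
    points = list(product(range(p), repeat=n))
    for hs in product(points, repeat=d + 1):
        g = f
        for h in hs:
            g = derivative(g, h, p)
        if any(g(x) != 0 for x in points):
            return False
    return True

def f(x):
    """The slide's example on F_2: f(0) = 0, f(1) = 1/4."""
    return Fraction(x[0], 4)

def classical(x):
    """A classical degree-1 example on F_2: values in {0, 1/2}."""
    return Fraction(x[0], 2)
```

With h = 1, the second derivative of f is the constant 1/2, so f does not have degree ≤ 1, while its third derivative vanishes, so the degree is exactly 2.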
Gowers norm and inverse theorem
• For functions f: F_p^n → ℂ, the Gowers norm of order d measures the expected multiplicative derivative of f at a random point in d random directions.
• Fact: If the Gowers norm of f of order d is 1 and f takes values inside the unit disk, then f is the exponential of a non-classical polynomial of degree ≤ d − 1.
• Gowers Inverse Theorem: If the Gowers norm of f of order d is δ > 0 and f takes values inside the unit disk, then f is correlated with a non-classical polynomial of degree ≤ d − 1.
Open Questions
• Proof for degree-structural properties also uses the same core technical ideas, so no good locality bounds even for “simple” properties like product of two quadratics.
• Give a non-trivial lower bound for testing any natural affine-invariant property. No superpolynomial lower bounds known in 1/ε.
• Find other uses of equidistribution of high rank polynomials.
THANK YOU!