Complexity Lower Bounds, P vs NP & Gowers Blog

$P \neq NP$?
The approach: try to prove that they are different by using circuit complexity.
Consider Boolean functions $f: \{0,1\}^n \to \{\pm 1\}$.
To build a circuit we define the following:
1) "Basic functions": $f_i(x_1, \dots, x_n) = (-1)^{x_i}$
2) "Basic operations":
a. $f \mapsto -f$
b. $f, g \mapsto f \vee g$
c. $f, g \mapsto f \wedge g$
3) Straight-line composition: a sequence $f_1, f_2, \dots, f_m$ such that each $f_i$ is either a basic function or obtained from earlier functions $f_{i_1}, f_{i_2}$ with $i_1, i_2 < i$ by a basic operation.
This definition describes a DAG (directed acyclic graph).
Definition: a function $F$ has (circuit) complexity $m$ if $m$ is the length of the shortest circuit $f_1, \dots, f_m$ as above with $f_m = F$.
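The straight-line definition above can be sketched in a few lines of Python. This is only an illustration: the gate encoding, the names `eval_circuit` and `XOR_GATES`, and the convention that $-1$ encodes "true" (so that OR is `min` and AND is `max` on $\{\pm 1\}$) are assumptions made here, not something fixed by the notes.

```python
def eval_circuit(gates, x):
    """Evaluate a straight-line circuit on input bits x (a tuple of 0/1).

    gates is a list; gate i is one of
      ("var", j)      basic function (-1)**x[j]
      ("neg", a)      -f_a
      ("or", a, b)    f_a OR f_b
      ("and", a, b)   f_a AND f_b
    where a, b < i index earlier gates.
    Assumed convention: -1 encodes "true", so OR is min and AND is max.
    """
    vals = []
    for i, gate in enumerate(gates):
        op, args = gate[0], gate[1:]
        if op == "var":
            vals.append((-1) ** x[args[0]])
            continue
        assert all(a < i for a in args), "gates may only use earlier gates"
        if op == "neg":
            vals.append(-vals[args[0]])
        elif op == "or":
            vals.append(min(vals[args[0]], vals[args[1]]))
        elif op == "and":
            vals.append(max(vals[args[0]], vals[args[1]]))
        else:
            raise ValueError(f"unknown gate {op!r}")
    return vals[-1]  # the last function in the sequence is the output

# XOR of x0 and x1 written as (x0 OR x1) AND NOT (x0 AND x1): a circuit of length 6.
XOR_GATES = [
    ("var", 0), ("var", 1),
    ("or", 0, 1), ("and", 0, 1), ("neg", 3),
    ("and", 2, 4),
]
```

In this encoding the circuit complexity of a function is the length of the shortest `gates` list computing it; the list above only gives an upper bound of 6 for XOR on two bits.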
$P/\mathrm{poly}$: all functions with polynomial circuit complexity are our equivalent of $P$.
$\{f: \{0,1\}^n \to \{\pm 1\} \mid f \text{ has circuit complexity} < n^{\log\log n}\}$
If we find $h \in NP$ s.t. $h \notin P/\mathrm{poly}$, then $P \neq NP$.
For that purpose, let's define a complexity-measuring function $\kappa$.
Such a $\kappa$ must satisfy:
1) $\kappa(f) = 1$ if $f$ is basic
2) $\kappa(f) = \kappa(-f)$
3) If $\kappa(f)$ and $\kappa(g)$ are small, then $\kappa(f \vee g)$ is small
4) $\kappa(h)$ is large for some $h \in NP$
Attempts to find such κ
Idea 1
Take the Fourier representation $f = \sum_{S \subseteq [n]} \hat f(S)\, \chi_S$ and let
$\hat f_{\max} = \max_S |\hat f(S)|$.
Define $\kappa(f) = \dfrac{1}{\hat f_{\max}}$.
Here $f: \{0,1\}^n \to \{\pm 1\}$ is viewed as a vector $f \in \mathbb{R}^{2^n}$.
Instead of the standard basis, we use the Fourier basis:
$\chi_{\{i\}}(x) = (-1)^{x_i}$
$\forall S \subseteq \{1, \dots, n\}: \ \chi_S = \prod_{i \in S} \chi_{\{i\}}$
The Fourier basis is $\{\chi_S\}_{S \subseteq [n]}$.
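To make Idea 1 concrete, here is a brute-force sketch that expands a small function in the Fourier basis and evaluates $\kappa(f) = 1/\hat f_{\max}$. The function names are illustrative choices, indices are 0-based, and the $4^n$ running time restricts it to very small $n$.

```python
from itertools import product

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} for f: {0,1}^n -> {+1,-1}, with S a tuple of indices.

    f_hat(S) = E_x[f(x) * chi_S(x)], chi_S(x) = (-1)**sum(x[i] for i in S).
    """
    points = list(product((0, 1), repeat=n))
    coeffs = {}
    for mask in points:  # mask[i] == 1 means i is in S
        S = tuple(i for i in range(n) if mask[i])
        total = sum(f(x) * (-1) ** sum(x[i] for i in S) for x in points)
        coeffs[S] = total / len(points)
    return coeffs

def kappa(f, n):
    """Idea 1's measure: 1 / max_S |f_hat(S)|."""
    return 1 / max(abs(c) for c in fourier_coefficients(f, n).values())

# A character of the Fourier basis: all its weight sits on one coefficient.
parity = lambda x: (-1) ** (x[0] ^ x[1])
# The cyclic quadratic phase (-1)^(x0 x1 + x1 x2 + ... + x_{n-1} x0).
g = lambda x: (-1) ** sum(x[i] * x[(i + 1) % len(x)] for i in range(len(x)))
```

Calling `kappa(parity, 3)` and `kappa(g, 4)` shows the contrast: a single character has the smallest possible value, while the quadratic phase spreads its weight over several coefficients.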
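Problem
By Parseval, $1 = \mathbb{E}_x[f(x)^2] = \|f\|_2^2 = \sum_S |\hat f(S)|^2$.
There are $2^n$ coefficients, so $\forall f: \{0,1\}^n \to \{\pm 1\}$ $\exists S$ s.t. $|\hat f(S)| \ge \dfrac{1}{2^{n/2}}$; that is, $\kappa(f) \le 2^{n/2}$ for every $f$.
And the function $g(x) = (-1)^{x_1 x_2 + x_2 x_3 + x_3 x_4 + \dots + x_n x_1}$, which is easy to compute, also has $\hat g_{\max} = \dfrac{1}{2^{n/2}}$ (up to a constant factor), so this $\kappa$ is essentially maximal on an easy function.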
Could there be a good measure κ?
$\kappa(f \wedge g) \le \kappa(f) + \kappa(g)$
$\kappa(f \vee g) \le \kappa(f) + \kappa(g)$
A $\kappa$ satisfying both is a formal complexity measure.
Related to the formula size of $f$, denoted $\lambda(f)$?
Formulas are trees with basic functions on the leaves and basic operations in the internal nodes. Formula size is the number of leaves.
Claim: for any formal complexity measure $\kappa$, $\kappa(f) \le \lambda(f)$.
Proof: by induction on $\lambda(f)$. If $f$ is basic (up to negation), then $\kappa(f) = 1 = \lambda(f)$.
Otherwise, assume the smallest formula for $f$ writes $f = g \vee h$ (the case $f = g \wedge h$ is identical), with
$\lambda(g) = r$, $\lambda(h) = s$, so $\lambda(f) = r + s$.
(For arbitrary $g, h$ the smallest formula for $g \vee h$ might be even smaller than $\lambda(g) + \lambda(h)$; here equality holds because the formula for $f$ is optimal.)
$\kappa(f) = \kappa(g \vee h) \le \kappa(g) + \kappa(h) \le \lambda(g) + \lambda(h) = \lambda(f)$
So far it seems as if there isn't a good $\kappa$…
On one hand, $\kappa(f) = \dfrac{1}{\hat f_{\max}}$ is bad because it is large for easy functions.
On the other hand, $\kappa(f) = \lambda(f)$ is bad because it is tautological!
We haven't made any progress...
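Natural Proofs – Razborov & Rudich
Let $A = \{0,1\}^n$. A point of $\{0,1\}^A$ is a string $x = (0,1,1,0,\dots)$ of length $|A|$, i.e. the truth table of a Boolean function on $n$ bits.
$X \subset \{0,1\}^A$
$\mathcal{F} \subset \{F: \{0,1\}^A \to \{\pm 1\}\}$
$X$ is $\epsilon$-pseudorandom w.r.t. $\mathcal{F}$ if
$\forall F \in \mathcal{F}: \ \big|\Pr[F(x) = 1 \mid x \in X] - \Pr[F(x) = 1]\big| < \epsilon$
Extreme opposite: $1_X \in \mathcal{F}$ (the indicator of $X$ is itself a distinguisher).
$X$ is pseudorandom if no $F \in \mathcal{F}$ can distinguish $x \sim X$ from $x \sim \{0,1\}^A$.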
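As a toy illustration of the definition, the sketch below computes the distinguishing advantage $|\Pr[F(x) = 1 \mid x \in X] - \Pr[F(x) = 1]|$ exactly on a tiny cube. The set `X` (even-weight strings) and the two distinguishers are arbitrary examples chosen here, not objects from the lecture.

```python
from itertools import product

def advantage(F, X, m):
    """|Pr[F(x)=1 | x in X] - Pr[F(x)=1]| for F on {0,1}^m and a subset X."""
    cube = list(product((0, 1), repeat=m))
    p_X = sum(F(x) == 1 for x in X) / len(X)
    p_all = sum(F(x) == 1 for x in cube) / len(cube)
    return abs(p_X - p_all)

m = 4
# Toy choice: X = strings of even Hamming weight.
X = [x for x in product((0, 1), repeat=m) if sum(x) % 2 == 0]
# A distinguisher that looks at one coordinate cannot tell X from the cube.
first_bit = lambda x: 1 if x[0] == 1 else -1
# The "extreme opposite": the indicator of X itself.
indicator_X = lambda x: 1 if sum(x) % 2 == 0 else -1
```

Here `advantage(first_bit, X, m)` is 0 while `advantage(indicator_X, X, m)` is 1/2, so whether `X` counts as pseudorandom depends entirely on which class of distinguishers is allowed.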
Main point: random functions of low complexity look like truly random functions to a poly-time distinguisher.
Let $X$ = all points in $\{0,1\}^A$ that are truth tables of poly-time functions, and
$\mathcal{F} = \{F: \{0,1\}^A \to \{\pm 1\} \mid F \text{ is polynomial time (in its input length } |A|)\}$
Statement: $\forall \epsilon > 0$, $X$ is $\epsilon$-pseudorandom w.r.t. $\mathcal{F}$.
A "natural proof" for $P \neq NP$ would
- Devise a "simplicity" property $S$ of Boolean functions, so that $S(f) = 1$ for all simple (poly-time complexity) functions $f \in X$.
- If $S$ itself is poly-time computable (in its input length $|A| = 2^n$), then since $X$ is $\epsilon$-pseudorandom for $\mathcal{F}$ and $S \in \mathcal{F}$, it follows that $S(f) = 1$ for almost all $f \in \{0,1\}^A$.
This is bad because a random function $f \in \{0,1\}^A$ shouldn't be simple!
So either $S \notin \mathcal{F}$ or $S(f) = 1$ for almost all functions.
A proof is "natural" if it defines a simplicity property $S$ such that:
(1) All low-complexity functions are simple
(2) A random function is not simple
(3) Whether or not a function is simple can be determined in poly-time
(4) Some NP function is not simple
By the statement above, (1), (2) and (3) cannot hold together!
-------- end of lesson 1
Connection between P, NP and Circuits
$L \subset \{0,1\}^*$
$L_n := L \cap \{0,1\}^n$
$L = \{L_n\}_{n=1}^{\infty}$
The Clique language has circuit complexity $\ge S(n)$
$\Leftrightarrow$
for any sequence of circuits $\{C_n\}_{n=1}^{\infty}$ that solves Clique, $\exists n_0\ \forall n > n_0: \ \mathrm{size}(C_n) \ge S(n)$.
The class $P/\mathrm{poly}$ ≡ all languages computable by polynomial-size circuits.
The set of poly-time functions looks like the set of all functions.
"Looks like" = to a simple observer (another polynomial-time algorithm).
Exercise: let $\kappa$ be a formal complexity measure. Prove that if there exists $h: \{0,1\}^n \to \{\pm 1\}$ with $\kappa(h) > 4 \cdot s$, then $\Pr_f[\kappa(f) > s] \ge \frac{1}{4}$, where $f: \{0,1\}^n \to \{\pm 1\}$ is uniformly random.
Things that are known
- The discrete log function
Let $\mathbb{Z}_N$ be the cyclic group with $N$ elements.
Let $g$ be a generator: $G = \{g^1, g^2, g^3, \dots\}$ (e.g. $G = \mathbb{Z}_p^*$).
$f_g: G \to G$, $f_g(x) = g^x$
$f_g^{-1}$ ← the discrete log function; $f_g$ is 1-1 and its inverse is believed hard to compute.
CONJ: there exists some $\epsilon > 0$ s.t. the complexity of this problem is $\ge 2^{n^{\epsilon}}$.
Goldreich-Levin "hard-core bit": any one-way permutation gives rise to a pseudorandom generator.
A pseudorandom generator is a map
$\{0,1\}^k \to \{0,1\}^{k+1}$
such that you cannot tell the difference between the half of the range that is hit by the domain and the half that isn't.
$PRG(\underbrace{x}_{k/2 \text{ bits}}, \underbrace{r}_{k/2 \text{ bits}}) = (\underbrace{f(x)}_{k/2 \text{ bits}}, \underbrace{r}_{k/2 \text{ bits}}, \underbrace{\textstyle\sum_i x_i r_i \bmod 2}_{\text{one more bit}})$
$PRG: \{0,1\}^k \to \{0,1\}^{2k}$ (a length-doubling generator, obtained by repeating the one-bit stretch)
Next, a pseudorandom function generator was constructed from it (the Goldreich-Goldwasser-Micali tree construction).
They took a seed (denoted $y$) and split the generator's output into halves, $PRG(y) = (g_0(y), g_1(y))$, building a binary tree:
level 0: $y$
level 1: $g_0(y)$, $g_1(y)$
level 2: $g_0(g_0(y))$, $g_1(g_0(y))$, $g_0(g_1(y))$, $g_1(g_1(y))$
$F(y, x) := \mathrm{MSB}\big(g_{x_n} \circ \dots \circ g_{x_2} \circ g_{x_1}(y)\big)$
Define $f_y: \{0,1\}^n \to \{0,1\}$, $f_y(x) = F(y, x)$.
Consider the distribution $\{f_y\}_{y \in \{0,1\}^k}$, $k > n^c$, $c > 2$.
- For each $y$, $f_y$ is poly-time computable.
- This distribution is pseudorandom against distinguishers running in time polynomial in $2^n$.
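The tree walk that defines $F(y, x)$ can be sketched as follows. The length-doubling generator `prg` below is a stand-in built from SHA-256 purely so the example runs; it is not the discrete-log generator from the notes and comes with no pseudorandomness guarantee. All function names are hypothetical.

```python
import hashlib

def prg(y: bytes) -> tuple[bytes, bytes]:
    """Stand-in length-doubling generator: returns (g0(y), g1(y)).

    SHA-256 is only a placeholder here (an assumption of this sketch),
    not a proven pseudorandom generator.
    """
    k = len(y)
    out = b""
    counter = 0
    while len(out) < 2 * k:
        out += hashlib.sha256(bytes([counter]) + y).digest()
        counter += 1
    return out[:k], out[k:2 * k]

def F(y: bytes, x: str) -> int:
    """Walk the tree: apply g_{x_1}, then g_{x_2}, ...; return the MSB of the leaf."""
    node = y
    for bit in x:
        node = prg(node)[int(bit)]
    return node[0] >> 7

def truth_table(y: bytes, n: int) -> list[int]:
    """All values f_y(x) for x in {0,1}^n, in lexicographic order."""
    return [F(y, format(i, f"0{n}b")) for i in range(2 ** n)]
```

Each evaluation of `F` costs only `n` calls to the generator, which is the sense in which every $f_y$ is poly-time computable even though its truth table has $2^n$ entries.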
On the other hand…
$\kappa$, or any property used in a lower-bound proof, shouldn't be too complex either!
"$\kappa(f) = 1$ iff $f$ has low circuit complexity" is not good (trivial).
Note: this $\kappa \in NP$!
Take the basic functions, with a column of $n^{100}$ copies of each:
$(-1)^{x_1}, \ (-1)^{x_2}, \ (-1)^{x_3}, \ \dots, \ (-1)^{x_n}$
A model for generating a random formula.
I have $n \cdot n^{100} \cdot 2$ functions.
But this is false! Why? Because using AND or OR changes the distribution from
Gowers Norms (U^k Norms)
Fix a finite set $A$ and consider $\mathbb{R}^A$ (the vector space of functions $f: A \to \mathbb{R}$).
A norm on this space is a function $\|\cdot\|: \mathbb{R}^A \to \mathbb{R}_+$
s.t.
$\|\alpha f\| = |\alpha| \, \|f\|$
$f \neq 0 \Rightarrow \|f\| \neq 0$
$\|f + g\| \le \|f\| + \|g\|$
Example: $\|f\|_\infty = \max_{x \in A} |f(x)|$
Definition (dual norm):
$\|f\|^* = \max\{\langle f, g\rangle \mid \|g\| \le 1\}$, where $\langle f, g\rangle = \frac{1}{|A|}\sum_{x \in A} f(x) g(x)$ ← "correlation"
Example: for $\|f\|_\infty$, the dual is
$\|f\|_\infty^* = \max\{\langle f, g\rangle \mid \|g\|_\infty \le 1\} = \frac{1}{|A|}\max\Big\{\sum_x f(x) g(x) \ \Big|\ \forall x: |g(x)| \le 1\Big\} = \frac{1}{|A|}\sum_x f(x)\,\mathrm{sign}(f(x)) = \frac{1}{|A|}\sum_x |f(x)| = \|f\|_1$
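The computation above can be checked numerically. The sketch below, with illustrative names and an arbitrary random `f` on 16 points, compares the value attained by $g = \mathrm{sign}(f)$ with the normalized $\ell_1$ norm, and confirms that random test functions with $\|g\|_\infty \le 1$ never do better.

```python
import random

def inner(f, g):
    """Normalized inner product <f, g> = (1/|A|) * sum_x f(x) g(x)."""
    return sum(a * b for a, b in zip(f, g)) / len(f)

def l1_norm(f):
    """Normalized l1 norm: (1/|A|) * sum_x |f(x)|."""
    return sum(abs(a) for a in f) / len(f)

def sign(a):
    return 1.0 if a >= 0 else -1.0

random.seed(0)
f = [random.uniform(-1, 1) for _ in range(16)]   # a function on |A| = 16 points

# g = sign(f) has sup norm 1 and attains the maximum in the dual norm.
best = inner(f, [sign(a) for a in f])
# Random candidates g with |g(x)| <= 1 for comparison.
trials = [inner(f, [random.uniform(-1, 1) for _ in f]) for _ in range(1000)]
```

Both quantities are cheap to evaluate here, which matches the remark that this particular pair of norms is easy; the harder cases in the notes are the duals of the higher Gowers norms.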
In general: if $\|\cdot\|$ is in P, it doesn't mean that $\|\cdot\|^*$ is in P.
Another example: let $A$ be an abelian group.
$\|f\|_{U^2}^4 = \mathbb{E}_{x,a,b}[f(x) f(x+a) f(x+b) f(x+a+b)]$
Is it in P? YES.
Is $\|f\|^*_{U^2}$ in P? It turns out that $\|f\|_{U^2} = \|\hat f\|_4 = \Big(\sum_{S \subseteq [n]} \hat f(S)^4\Big)^{1/4}$, so the dual is $\|f\|^*_{U^2} = \|\hat f\|_{4/3}$, which is also easy to compute.
Exercise: prove that the $U^2$ norm is a norm. Hint: Cauchy-Schwarz.
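The identity between the $U^2$ norm and the $\ell_4$ norm of the Fourier transform can be verified by brute force on a small cube. This is a sketch for $A = \{0,1\}^n$ with addition being XOR; the function names and the random test function are assumptions of the example.

```python
from itertools import product
import random

def xor(x, y):
    return tuple(a ^ b for a, b in zip(x, y))

def u2_fourth_power(f, n):
    """E_{x,a,b}[f(x) f(x+a) f(x+b) f(x+a+b)] by brute force over {0,1}^n."""
    pts = list(product((0, 1), repeat=n))
    total = sum(f[x] * f[xor(x, a)] * f[xor(x, b)] * f[xor(xor(x, a), b)]
                for x in pts for a in pts for b in pts)
    return total / len(pts) ** 3

def fourier_l4_fourth_power(f, n):
    """sum_S f_hat(S)^4, with f_hat(S) = E_x[f(x) (-1)^<S,x>]."""
    pts = list(product((0, 1), repeat=n))
    total = 0.0
    for S in pts:
        coeff = sum(f[x] * (-1) ** sum(s * xi for s, xi in zip(S, x))
                    for x in pts) / len(pts)
        total += coeff ** 4
    return total

random.seed(1)
n = 3
# A random +-1 valued function, stored as a dict from points to values.
f = {x: random.choice((1, -1)) for x in product((0, 1), repeat=n)}
```

The two functions agree up to floating-point error for any `f`, which is the content of $\|f\|_{U^2} = \|\hat f\|_4$.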
However, one can also define the $U^3$ norm:
$\|f\|_{U^3} = \Big(\mathbb{E}_{x,a,b,c}\big[f(x) f(x+a) f(x+b) f(x+a+b) f(x+c) f(x+a+c) f(x+b+c) f(x+a+b+c)\big]\Big)^{1/8}$
We do not know a poly-time algorithm for $\|\cdot\|^*_{U^3}$.
The "goal" in introducing these norms was to extend Fourier analysis to "higher degree".
$f: \{0,1\}^n \to \{\pm 1\}$. $\hat f(S)$ = correlation of $f$ with the linear phase function $(-1)^{\sum_{i \in S} x_i}$.
$\chi_S: \{0,1\}^n \to \{\pm 1\}$, $\chi_S(x_1, \dots, x_n) = \prod_{i \in S} (-1)^{x_i}$ ← linear phase functions
Consider degree-$d$ phase functions $(-1)^{g(x)}$ where $\deg g \le d$.
Fix $f: \{0,1\}^n \to \{\pm 1\}$. For $y \in \{0,1\}^n$, let
$f_y: \{0,1\}^n \to \{\pm 1\}$ be defined by $f_y(x) = f(x) \cdot f(x + y)$.
Write $f(x) = (-1)^{\phi(x)}$.
Say, for instance, $f(x) = (-1)^{x_1 + x_2 + x_3}$. Then
$f_y(x) = (-1)^{x_1 + x_2 + x_3} \cdot (-1)^{x_1 + x_2 + x_3 + y_1 + y_2 + y_3} = (-1)^{y_1 + y_2 + y_3}$ = constant! It doesn't depend on $x$.
If $f(x) = (-1)^{q(x)}$ with $\deg q \le d$,
then $f_y(x) = (-1)^{q(x)} \cdot (-1)^{q(x+y)} = (-1)^{q'(x)}$ where $\deg q' \le d - 1$.
Define $f_{y,z} \equiv (f_y)_z$.
Similarly: $f_{y_1 \dots y_d} = (\dots((f_{y_1})_{y_2})\dots)_{y_d}$
---- end of lesson 2
๐‘“: {0,1}๐‘› → {±1}
๐‘˜
โ€–๐‘“โ€–2๐‘ข๐‘˜ =
๐ธ
๐‘ฅ1 …๐‘ฅ๐‘˜ ∈{0,1}๐‘›
∏
๐‘“(๐‘ฅ0 + ๐œ–1 ๐‘ฅ1 + โ‹ฏ + ๐œ–๐‘˜ ๐‘ฅ๐‘˜ )
๐œ–1 …๐œ–๐‘˜ ∈{0,1}
๐‘ฅ0 = ๐‘™๐‘–๐‘›๐‘’๐‘Ž๐‘Ÿ ๐‘ ๐‘๐‘Ž๐‘› (๐‘ฅ1 … ๐‘ฅ๐‘˜ ) (dimension k affine subspace)
๐‘ฅ0 , ๐‘ฅ0 + ๐‘ฅ1 , ๐‘ฅ0 + ๐‘ฅ2 , ๐‘ฅ0 + ๐‘ฅ1 + ๐‘ฅ2
Example: Suppose ๐‘“ is a linear function
∃๐‘Ž1 , … , ๐‘Ž๐‘› ๐‘ . ๐‘ก. ๐‘“(๐‘ฅ) = ∑ ๐‘Ž๐‘– ๐‘ฅ๐‘– (๐‘š๐‘œ๐‘‘2)
Then for any choice of ๐‘ฅ0 , ๐‘ฅ1 , ๐‘ฅ2
๐‘“(๐‘ฅ0 ) + +๐‘“(๐‘ฅ0 ) + ๐‘“(๐‘ฅ1 ) + + โ‹ฏ = 0
Hence for any choise of ๐‘ฅ0 , ๐‘ฅ1 , ๐‘ฅ2
+ + ๐‘“(๐‘ฅ1 + ๐œ–1 ๐‘ฅ1 + ๐œ–2 ๐‘ฅ2 ) ≡ 0
๐œ–1 ,๐œ–2
Similarly the expectancy is 0 as well.
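A direct, brute-force reading of the $U^k$ definition is sketched below for $\pm 1$ valued functions on a tiny cube. The name `uk_norm` and the two sample phases are assumptions of this example, and the enumeration is only feasible for very small $n$ and $k$.

```python
from itertools import product

def uk_norm(f, n, k):
    """Brute-force Gowers U^k norm of f: {0,1}^n -> {+1,-1}.

    Averages prod_eps f(x0 + eps_1 x_1 + ... + eps_k x_k) over every choice of
    x0, x1, ..., xk and takes the 2^k-th root. Small n and k only.
    """
    pts = list(product((0, 1), repeat=n))
    signs = list(product((0, 1), repeat=k))
    total = 0
    for x0 in pts:
        for dirs in product(pts, repeat=k):
            term = 1
            for eps in signs:
                point = list(x0)
                for e, d in zip(eps, dirs):
                    if e:
                        point = [p ^ di for p, di in zip(point, d)]
                term *= f(tuple(point))
            total += term
    mean = total / len(pts) ** (k + 1)
    return max(mean, 0.0) ** (1 / 2 ** k)   # clamp guards against rounding below 0

linear_phase = lambda x: (-1) ** (x[0] ^ x[2])        # (-1)^(x0 + x2)
quadratic_phase = lambda x: (-1) ** (x[0] & x[1])     # (-1)^(x0 x1)
```

On three bits, `uk_norm(linear_phase, 3, 2)` equals 1 as in the example above, while `quadratic_phase` has $U^2$ norm below 1 and $U^3$ norm equal to 1.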
Linearity Testing
(proven by Blum-Luby-Rubinfeld)
$f: \{0,1\}^n \to \{0,1\}$, with arithmetic mod 2
Question: is $f$ a linear function?
Definition 1: $\exists a_1, \dots, a_n$ s.t. $\forall x: \ f(x) = \sum_i a_i x_i$
Definition 2: $\forall x, y: \ f(x) + f(y) = f(x + y)$
Definition 2 implies Definition 1, since
we can define $a_i = f(e_i)$; then $f(x) = f\big(\sum_i x_i e_i\big) \overset{\text{linearity}}{=} \sum_i x_i \cdot f(e_i) = \sum_i a_i x_i$.
Testing
Global object, e.g. $f: \{0,1\}^n \to \{0,1\}$
Want to test if $f \in \mathcal{P}$, where $\mathcal{P}$ is a property (a set of functions);
in our example, all linear functions.
Only willing to invest limited resources, but willing to randomize.
Question: can we deduce a global property by considering local behavior?
If the answer is yes, we say that this $\mathcal{P}$ is testable.
Non-testable property:
$x_1, \dots, x_n$ are Boolean variables; $\neg x_1, \dots, \neg x_n$ are their negations.
From the list above, I select $m$ 3-CNF clauses at random; call the resulting formula $\varphi$.
Most of the time, a small sample of clauses will not share any variables, so locally $\varphi$ looks satisfiable.
If $m > 50n$ then with high probability $\varphi$ is unsatisfiable!
PCP "theory" implies that every polynomially verifiable property (e.g. formula satisfiability)
can be cast ("encoded") in testable form.
Definition: $\forall f, g: \{0,1\}^n \to \{0,1\}$
$\mathrm{Distance}(f, g) = \Pr_{x \in \{0,1\}^n}[f(x) \neq g(x)]$
$\mathrm{Distance}(f, S) = \min_{g \in S} \mathrm{Distance}(f, g)$
Theorem (BLR):
Let $f: \{0,1\}^n \to \{0,1\}$. If $\mathrm{Distance}(f, \mathrm{LINEAR}) \ge \delta$,
then $\Pr_{x,y}\big[f(x) + f(y) + f(x + y) \neq 0\big] \ge \Omega(\delta)$
Theorem (AKKLR):
Let $f: \{0,1\}^n \to \{0,1\}$. If $\mathrm{Distance}(f, \mathrm{DEGREE}(k)) \ge \delta$,
then $\Pr_{x_0, \dots, x_{k+1}}\Big[\bigoplus_{\epsilon_1, \dots, \epsilon_{k+1}} f(x_0 + \epsilon_1 x_1 + \dots + \epsilon_{k+1} x_{k+1}) \neq 0\Big] \ge \Omega(\delta \cdot 2^{-k})$
So we select an affine space of dimension $k + 1$ (that is, $2^{k+1}$ points).
We look at $f$ on all of these points and check whether the XOR of the values is zero.
$f$ is a degree-$k$ function if
$f(x_1, \dots, x_n) = \sum_{|S| \le k} a_S \prod_{i \in S} x_i$
e.g. $f(x_1, \dots, x_n) = x_1 x_2 \cdots x_k$
If $f$ has degree $k$, then $\forall x_0, \dots, x_{k+1}$:
$\bigoplus_{\epsilon_1, \dots, \epsilon_{k+1}} f(x_0 + \epsilon_1 x_1 + \dots + \epsilon_{k+1} x_{k+1}) = 0$
Proof: let $f_y(x) = f(x) + f(x + y)$.
The function $g = f_{x_{k+1}, x_k, \dots, x_2}$ has degree 0 (it is constant).
The above expression equals $g(x_0) \oplus g(x_0 + x_1) \equiv 0$.
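The completeness direction just proved, and the fact that higher-degree functions do get rejected, can be observed with a small simulation of the $(k+1)$-dimensional XOR test. Function names, the trial count, the seed, and the two sample polynomials are assumptions of this sketch.

```python
from itertools import product
import random

def cube_xor(f, x0, dirs):
    """XOR of f over the points x0 + sum_i eps_i * dirs[i], eps in {0,1}^len(dirs)."""
    acc = 0
    for eps in product((0, 1), repeat=len(dirs)):
        point = list(x0)
        for e, d in zip(eps, dirs):
            if e:
                point = [p ^ di for p, di in zip(point, d)]
        acc ^= f(tuple(point))
    return acc

def degree_test_rejection_rate(f, n, k, trials=500, rng=None):
    """Fraction of random (x0, x1, ..., x_{k+1}) on which the XOR is nonzero."""
    rng = rng or random.Random(0)
    rand_point = lambda: tuple(rng.randint(0, 1) for _ in range(n))
    rejects = sum(cube_xor(f, rand_point(), [rand_point() for _ in range(k + 1)])
                  for _ in range(trials))
    return rejects / trials

n = 6
quadratic = lambda x: (x[0] & x[1]) ^ (x[2] & x[5]) ^ x[4]      # degree 2
cubic = lambda x: (x[0] & x[1] & x[2]) ^ (x[3] & x[4] & x[5])   # degree 3
```

With `k = 2` the quadratic passes every trial, while the cubic is rejected on a noticeable fraction of the sampled subspaces.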
Claim: let $\mathrm{corr}(f, k-1)$ be the correlation of $f$ with degree $k-1$ polynomials. If $\delta$ is the distance of $f$ from degree $k-1$ functions, then
$\mathrm{corr}(f, k-1) = (1 - \delta) - \delta = 1 - 2 \cdot \mathrm{Distance}(f, \text{degree } k-1 \text{ functions})$
$\mathrm{corr}(f, k-1) \le \|f\|_{U^k}$
Reed-Muller (Low-Degree) Test
Let $p$ be the closest degree-$k$ polynomial to $f$.
$\delta := \mathrm{Distance}(p, f)$
1. $\delta$ is tiny: $\delta < \beta\, 2^{-k}$
If an affine $(k+1)$-dimensional space contains exactly one point $x$ s.t. $f(x) \neq p(x)$, then the test rejects.
We will prove that this happens with constant probability.
Assume $\delta \sim 2^{-k}$.
Choose a random affine subspace $A$ by choosing a random full-rank matrix $M_{n \times l}$ over $\mathbb{F}_2$ (with $l = k + 1$) and a random $b \in \mathbb{F}_2^n$:
$A = \{a_x = Mx + b \mid x \in \mathbb{F}_2^l\}$
For each $x$:
$E_x$: the event $f(a_x) \neq p(a_x)$
$F_x$: $E_x$ holds and $\forall y \neq x: \ f(a_y) = p(a_y)$
$a_x$ is distributed uniformly in $\mathbb{F}_2^n$.
For $y \neq x$, given $a_x$, $a_y$ is distributed uniformly on $\mathbb{F}_2^n \setminus \{a_x\}$.
$\forall x: \ \Pr_{M,b}[E_x] = \delta$
$\forall x \neq y: \ \Pr_{M,b}[E_x \text{ and } E_y] \le \delta^2$
$\Pr[F_x] \ge \Pr[E_x] - \sum_{y \neq x} \Pr[E_y \wedge E_x] \ge \delta - 2^l \cdot \delta^2 \approx \delta$
The events $F_x$ are disjoint, so
$\Pr_{M,b}\Big(\bigcup_x F_x\Big) = \sum_x \Pr[F_x] \approx 2^l \cdot \delta = \text{constant}$
2.
--- End of lesson 3
Last week we:
- Defined the $U^k$ norm ⇔ degree $k-1$ test ("low-degree test")
- Proved the following theorem: if $\|f\|_{U^k} > 1 - \delta$ then there $\exists p$ of degree $k - 1$ with
$\langle f, p\rangle = \mathbb{E}_x[f(x) p(x)] > 1 - \delta l$
Lemma 1: let $f: \{0,1\}^n \to \{\pm 1\}$ and let $p$ range over polynomials.
Denote $\mathrm{corr}(f, \deg k) = \max\{\langle f, g\rangle \mid g = (-1)^{p(x)},\ \deg p \le k\}$
Then:
$\mathrm{corr}(f, \deg k) \le \|f\|_{U^{k+1}}$
Lemma 2: for every $h: \{0,1\}^n \to \{\pm 1\}$,
$\|h\|_{U^k} \le \|h\|_{U^{k+1}}$
Proof of 2:
We shall use the fact that
$\mathbb{E}[Z^2] \ge (\mathbb{E}[Z])^2$
$\|h\|_{U^{k+1}}^{2^{k+1}} = \mathbb{E}_{x, y_1, \dots, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_{k+1}} h\Big(x + \sum_{i=1}^{k+1} \epsilon_i y_i\Big)$
$= \mathbb{E}_{y_1, \dots, y_k} \mathbb{E}_{x, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(x + \sum_{i=1}^{k} \epsilon_i y_i\Big) \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(x + y_{k+1} + \sum_{i=1}^{k} \epsilon_i y_i\Big)$
Now substitute $x' \leftarrow x$, $y' \leftarrow x + y_{k+1}$ (these are independent and uniform):
$= \mathbb{E}_{y_1, \dots, y_k} \mathbb{E}_{x', y'} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(x' + \sum_{i=1}^{k} \epsilon_i y_i\Big) \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(y' + \sum_{i=1}^{k} \epsilon_i y_i\Big)$
$= \mathbb{E}_{y_1, \dots, y_k} \Big(\mathbb{E}_{x'} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(x' + \sum_{i=1}^{k} \epsilon_i y_i\Big)\Big)^2$
$\ge \Big(\mathbb{E}_{y_1, \dots, y_k} \mathbb{E}_{x'} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big(x' + \sum_{i=1}^{k} \epsilon_i y_i\Big)\Big)^2 = \Big(\|h\|_{U^k}^{2^k}\Big)^2 = \|h\|_{U^k}^{2^{k+1}}$ ∎
Proof of 1:
For any $h: \{0,1\}^n \to \{\pm 1\}$:
1) $|\mathbb{E}_x h(x)| = \|h\|_{U^1}$
(and $\|h\|_{U^2} = \big(\sum_S |\hat h(S)|^4\big)^{1/4}$)
2) $\forall k: \ \|h\|_{U^k} \le \|h\|_{U^{k+1}}$
3) $\forall f$, $\forall p: \{0,1\}^n \to \{\pm 1\}$ a degree-$k$ polynomial phase: $\|f \cdot p\|_{U^{k+1}} = \|f\|_{U^{k+1}}$, where $f \cdot p$ is pointwise multiplication.
Proof for 1:
$\|h\|_{U^1}^2 = \mathbb{E}_{x, y_1} \prod_{\epsilon = 0,1} h(x + \epsilon y_1) = \mathbb{E}_{x, y_1} h(x)\, h(x + y_1)$
Denote $x' = x$, $y' = x + y_1$:
$= \big(\mathbb{E}_{x'} h(x')\big)\big(\mathbb{E}_{y'} h(y')\big) = \big(\mathbb{E}_x h(x)\big)^2$
Proof for 3:
$\|f \cdot p\|_{U^{k+1}}^{2^{k+1}} = \mathbb{E}_{x, y_1, \dots, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_{k+1}} f\Big(x + \sum_{i=1}^{k+1} \epsilon_i y_i\Big)\, p\Big(x + \sum_{i=1}^{k+1} \epsilon_i y_i\Big) = \|f\|_{U^{k+1}}^{2^{k+1}}$
because $\forall x, y_1, \dots, y_{k+1}$: $\prod_{\epsilon_1, \dots, \epsilon_{k+1}} p\Big(x + \sum_{i=1}^{k+1} \epsilon_i y_i\Big) = 1$.
Let $p: \{0,1\}^n \to \{\pm 1\}$ be the degree-$k$ function closest to $f$ (i.e. attaining max correlation).
Define $h(x) = f(x) \cdot p(x)$.
$\mathrm{corr}(f, p) = |\mathbb{E}_x f(x) p(x)| = |\mathbb{E}_x h(x)| \overset{\text{step 1}}{=} \|h\|_{U^1} \overset{\text{step 2}}{\le} \|h\|_{U^{k+1}} \overset{\text{step 3}}{=} \|f\|_{U^{k+1}}$ ∎
We considered $\|f\|_{U^3}$ as a "formal complexity measure".
Consider dual norms:
$\|f\|^* = \max\{\langle f, g\rangle \mid \|g\| \le 1\}$
Motivation for the dual:
1) "NP-ish" definition, possibly circumvents RR (Razborov-Rudich)
2) More robust
Suppose our world (the domain) has two parts, $A$ and $B$,
and we have two functions $f, g$ such that $f$ is random on $A$ and is 1 on $B$, and $g$ is the exact opposite.
$\|f\|_{U^3}$ = constant.
$\|g\|_{U^3}$ = constant as well.
$h = f \vee g$
$\|h\|_{U^3}$ = zero!
This is a problem! We used just one OR and got such a dramatic difference.
For the dual norm:
$\|h\|^* \ge \langle h, h\rangle = 1$, and more generally $\langle h, \alpha h\rangle = \alpha$.
We can take $\alpha = \frac{1}{\|h\|}$, which is very large.
$\|f\|^*_{U^3} = ?$
Need to find a "norming function" for $f$.
Use $h$! Since $\Big\| \frac{h}{\|h\|} \Big\| = 1$,
$\|f\|^* \ge \Big\langle f, \frac{h}{\|h\|} \Big\rangle = \frac{1}{\|h\|} \mathbb{E}_x f(x) h(x) = \frac{1}{\|h\|}\Big(\Pr(A) \cdot \underbrace{\mathbb{E}_{x \in A} f(x) h(x)}_{= 1} + \Pr(B) \cdot \mathbb{E}_{x \in B} h(x) \cdot 1\Big) \approx \frac{1}{2} \cdot \frac{1}{\|h\|}$ = very large!
(Here $h$ agrees with $f$ on $A$, $f = 1$ on $B$, $\mathbb{E}_{x \in B} h(x) \approx 0$, and we take $\Pr(A) = \frac{1}{2}$.)
Note: $\langle f, f\rangle \le \|f\| \cdot \|f\|^*$
Question 1: given $f: \{0,1\}^n \to \{\pm 1\}$, can we compute $\|f\|^*_{U^k}$ in polytime?
(an open question even for $k = 3$)
Question 2: suppose you know that $\|f\|_{U^3} \ge \epsilon$. By [Samorodnitsky '07], $\exists$ a degree-2 polynomial $p$
that correlates with $f$. Can $p$ be found in time $\mathrm{poly}(2^n)$? (The search space has size $2^{n^2}$.)
Gopalan-Klivans-Zuckerman '08: "List-decoding Reed-Muller codes".
If $f$ is $\epsilon$-correlated with some degree-2 function, then the following is true:
1) The number of degree-2 polynomials correlated with $f$ is $\le 2^{O_\epsilon(n)}$
2) The list above can be found in time $2^{O_\epsilon(n)}$
Interpretation: if $\|f\|$ is small then $\|f\|^*$ is large.
Simplicity Property $= \{f \mid \|f\|^* \text{ is small}\} \subseteq \{f \mid \|f\| \text{ is large}\}$, and the latter is a property in $P$!
If the first is $P_1$ and the second is $P_2$, then $P_2$ is a poly-time property that contains all functions in $P_1$.
Next idea: $U^k$ norm for super-constant $k$.
Now the naïve algorithm for computing $\|f\|_{U^k}$
takes $(2^n)^k$ time. With $N = 2^n$ this is $N^k$; if $k = k(N)$ grows with $N$, this is not polytime.
Still intend to use the dual $\|\cdot\|^*_{U^k}$ norm (want robustness).
Question 3: is there an algorithm for $\|\cdot\|_{U^k}$ running in time better than $N^{k-1} \cdot \log N$?