Complexity Lower Bounds, P vs NP & Gowers Blog

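P ≠ NP?

Try to prove they are different by using circuit complexity. All functions are of the form $f: \{0,1\}^n \to \{\pm 1\}$.

To build a circuit we define the following:
1) "Basic functions": $f_i(x_1, \dots, x_n) = (-1)^{x_i}$
2) "Basic operations":
   a. $f \to -f$
   b. $f, g \to f \vee g$
   c. $f, g \to f \wedge g$
3) Straight-line composition: a sequence $f_1, f_2, \dots, f_m$ such that each $f_i$ is either a basic function or obtained from $f_{i_1}, f_{i_2}$ with $i_1, i_2 < i$ by a basic operation.

The definition represents a DAG (directed acyclic graph).

Definition: a function $F$ has (circuit) complexity $m$ if there exists a circuit $f_1, \dots, f_m$ as above that computes it, i.e. $f_m = F$.

$P/poly$: all functions with polynomial circuit complexity. This is our equivalent of $P$. We consider the set
$$\{\, f: \{0,1\}^n \to \{\pm 1\} \;\mid\; f \text{ has circuit complexity} < n^{\log\log n} \,\}.$$
If we find $h \in NP$ such that $h \notin P/poly$, then $P \ne NP$.

For that purpose, let's define a complexity-measuring function $\kappa$. Such a $\kappa$ must satisfy:
1) $\kappa(f) = 1$ if $f$ is basic
2) $\kappa(f) = \kappa(-f)$
3) If $\kappa(f), \kappa(g)$ are small then $\kappa(f \vee g)$ is small
4) $\kappa(h)$ is large for some $h \in NP$

Attempts to find such κ

Idea 1

View $f: \{0,1\}^n \to \{\pm 1\}$ as a vector $f \in \mathbb{R}^{2^n}$. Instead of the standard basis, we use the Fourier basis:
$$\chi_{\{i\}}(x) = (-1)^{x_i}, \qquad \chi_S = \prod_{i \in S} \chi_{\{i\}} \quad \text{for all } S \subseteq \{1, \dots, n\}.$$
The Fourier basis is $\{\chi_S\}_S$. Take the Fourier representation
$$f = \sum_{S \subseteq [n]} \hat f(S)\, \chi_S, \qquad \hat f_{max} = \max_S |\hat f(S)|,$$
and define $\kappa(f) = 1 / \hat f_{max}$.

Problem: by Parseval,
$$1 = E_x[f(x)^2] = \|f\|_2^2 = \sum_S |\hat f(S)|^2,$$
so for every $f: \{0,1\}^n \to \{\pm 1\}$ there exists $S$ such that $|\hat f(S)| \ge 2^{-n/2}$. And the function
$$g(x) = (-1)^{x_1 x_2 + x_2 x_3 + x_3 x_4 + \dots + x_n x_1}$$
also has $\hat g_{max} \approx 2^{-n/2}$ (up to a small constant factor), although $g$ is easy to compute. So this $\kappa$ is as large as it can possibly be on an easy function.

Could there be a good measure κ?

$$\kappa(f \vee g) \le \kappa(f) + \kappa(g), \qquad \kappa(f \wedge g) \le \kappa(f) + \kappa(g).$$
A $\kappa$ satisfying these two inequalities (together with properties 1 and 2 above) is called a formal complexity measure.

Is it related to the formula size of $f$, denoted $\lambda(f)$? Formulas are trees with basic functions on the leaves and basic operations in the internal nodes. The formula size is the number of leaves.

Claim: For any formal complexity measure $\kappa$, $\kappa(f) \le \lambda(f)$.

Proof: By induction on $\lambda(f)$. Basic functions have $\kappa = \lambda = 1$, and negation changes neither quantity. Assume the smallest formula for $f$ writes $f = g \vee h$ (the case $f = g \wedge h$ is identical), where the two subformulas have $r$ and $s$ leaves, so that $\lambda(f) = r + s$, $\lambda(g) \le r$ and $\lambda(h) \le s$. Note that the smallest formulas for $g$ and $h$ might be even smaller! Then
$$\kappa(f) \le \kappa(g) + \kappa(h) \le \lambda(g) + \lambda(h) \le \lambda(f).$$

So far it seems as if there isn't a good $\kappa$. On one hand, $\kappa(f) = 1/\hat f_{max}$ is bad for easy functions. On the other hand, $\kappa(f) = \lambda(f)$ is bad because it is tautological! We haven't made any progress...

Natural Proofs (Razborov & Rudich)

Let $A = \{0,1\}^n$. A Boolean function on $A$ is a point $x \in \{0,1\}^A$, i.e. a truth table $x = (0,1,1,0,\dots)$ of length $|A|$. Let $X \subset \{0,1\}^A$ and let $\mathcal{F} \subset \{ F: \{0,1\}^A \to \{\pm 1\} \}$.

$X$ is $\epsilon$-pseudorandom w.r.t. $\mathcal{F}$ if for all $F \in \mathcal{F}$
$$\big|\, P[F(x) = 1 \mid x \in X] - P[F(x) = 1] \,\big| < \epsilon.$$
The extreme opposite is $1_X \in \mathcal{F}$. In words: $X$ is pseudorandom if no $F \in \mathcal{F}$ can distinguish $x$ drawn from $X$ from $x$ drawn from $\{0,1\}^A$.

Main point: random functions of low complexity look like truly random functions w.r.t. a poly-time distinguisher.

Let $X$ be the set of all points in $\{0,1\}^A$ that are truth tables of low-complexity functions, and consider it with respect to the polytime functions
$$\mathcal{F} = \{\, F: \{0,1\}^A \to \{\pm 1\} \;\mid\; F \text{ is polynomial time in its input length } |A| \,\}.$$
Statement (it relies on the hardness assumptions discussed in lesson 2): for all $\epsilon > 0$, $X$ is $\epsilon$-pseudorandom w.r.t. $\mathcal{F}$.

A "natural proof" for $P \ne NP$ would:
- Devise a "simplicity" property $S$ of Boolean functions, so that $S(f) = 1$ for all simple (poly-time complexity) functions $f \in X$.
- If $S$ itself is poly-time computable (in its input length $|A| = 2^n$), then since $X$ is $\epsilon$-pseudorandom for $\mathcal{F}$ and $S \in \mathcal{F}$, it follows that $S(f) = 1$ for almost all $f \in \{0,1\}^A$. This is bad because a random function $f \in \{0,1\}^A$ shouldn't be simple!

So either $S \notin \mathcal{F}$ or $S(f) = 1$ for almost all functions.

A proof is "natural" if it defines a simplicity property $S$ such that:
(1) All low-complexity functions are simple
(2) A random function is not simple
(3) Whether or not a function is simple can be determined in poly-time
(4) Some NP function is not simple

1, 2, 3 cannot hold together!

-------- end of lesson 1

Connection between P, NP and Circuits

$L \subset \{0,1\}^*$, $L_n := L \cap \{0,1\}^n$, $L = \{L_n\}_{n=1}^{\infty}$.

The Clique language has circuit complexity $\ge s(n)$ if and only if for any sequence of circuits $\{C_n\}_{n=1}^{\infty}$ that solves Clique, $\exists n_0\ \forall n > n_0:\ \mathrm{size}(C_n) \ge s(n)$.

The class $P/poly$ consists of all languages computable by polynomial-size circuits.

The set of poly-time functions looks like the set of all functions. "Looks like" means: to a simple observer (another polynomial-time algorithm).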
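Going back to Idea 1 from the first lesson, the following minimal sketch computes the Fourier coefficients of a small truth table by brute force and evaluates $\kappa(f) = 1/\hat f_{max}$. The function names (`fourier_coefficients`, `kappa`, `cycle_quadratic`) and the encoding of inputs and sets $S$ as $n$-bit integers are assumptions made for this illustration only; they do not come from the lecture.

```python
def fourier_coefficients(f, n):
    """Brute-force Fourier transform of a truth table f (list of +1/-1 of length 2^n).

    Inputs x and sets S are both encoded as n-bit integers (an assumption of this
    sketch); chi_S(x) = (-1)^{|S & x|}.
    """
    N = 1 << n
    coeffs = []
    for S in range(N):
        total = 0
        for x in range(N):
            sign = -1 if bin(x & S).count("1") % 2 else 1
            total += f[x] * sign
        coeffs.append(total / N)
    return coeffs


def kappa(f, n):
    """The candidate measure from Idea 1: 1 / max_S |f^(S)|."""
    return 1 / max(abs(c) for c in fourier_coefficients(f, n))


def cycle_quadratic(n):
    """Truth table of g(x) = (-1)^(x1 x2 + x2 x3 + ... + xn x1)."""
    table = []
    for x in range(1 << n):
        bits = [(x >> i) & 1 for i in range(n)]
        exponent = sum(bits[i] * bits[(i + 1) % n] for i in range(n)) % 2
        table.append(-1 if exponent else 1)
    return table


if __name__ == "__main__":
    n = 4
    g = cycle_quadratic(n)
    coeffs = fourier_coefficients(g, n)
    print("Parseval sum:", sum(c * c for c in coeffs))  # should be 1
    print("kappa(g):", kappa(g, n))
    dictator = [(-1) ** (x & 1) for x in range(1 << n)]  # the basic function (-1)^{x1}
    print("kappa(basic):", kappa(dictator, n))
```

For $n = 4$ the cycle function equals $(-1)^{(x_1 + x_3)(x_2 + x_4)}$, whose largest coefficient is $1/2$, so $\kappa(g) = 2$ while a basic function has $\kappa = 1$. The brute-force transform costs $4^n$ steps, so this is only meant for very small $n$.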
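Exercise: Let $\kappa$ be a formal complexity measure. Prove that if there exists $h: \{0,1\}^n \to \{\pm 1\}$ with $\kappa(h) > 4 \cdot s$, then
$$Prob_f[\kappa(f) > s] \ge \tfrac{1}{4},$$
where $f: \{0,1\}^n \to \{\pm 1\}$ is random.

Things that are known

- The discrete log function. Let $G$ be a cyclic group with $N$ elements (for example $\mathbb{Z}_p^*$). Let $g$ be a generator, $G = \{g^1, g^2, g^3, \dots\}$. Define $f_g: G \to G$ by $f_g(x) = g^x$. Its inverse $f_g^{-1}$ is the discrete log function; $f_g$ is 1-1 and its inverse is believed hard to compute.
  CONJ: There exists some $\epsilon > 0$ such that the complexity of this problem is $\ge 2^{n^{\epsilon}}$.
- Goldreich-Levin "hard core bit": any one-way permutation gives rise to a pseudorandom generator.

A pseudorandom generator is a map $\{0,1\}^k \to \{0,1\}^{k+1}$ such that you cannot tell the difference between the half of the range that emerged from the domain and the half that didn't:
$$PRG(x, r) = \Big( f(x),\ r,\ \sum_i x_i r_i \bmod 2 \Big),$$
where $x$ and $r$ have $k/2$ bits each, and the output has one more bit than the input. Stretching further gives $PRG: \{0,1\}^k \to \{0,1\}^{2k}$; write $g_0(y), g_1(y)$ for the two halves of its output.

Next, a pseudorandom function generator is constructed (the Goldreich-Goldwasser-Micali tree construction). Take a seed (denoted $y$) and build a binary tree:
- root: $y$
- level 1: $g_0(y)$, $g_1(y)$
- level 2: $g_0(g_0(y))$, $g_1(g_0(y))$, $g_0(g_1(y))$, $g_1(g_1(y))$
- and so on.

$$F(y, x) := MSB\big( g_{x_n} \circ \dots \circ g_{x_2} \circ g_{x_1}(y) \big)$$
Define $f_y: \{0,1\}^n \to \{0,1\}$ by $f_y(x) = F(y, x)$. Consider the distribution $\{f_y\}_{y \in \{0,1\}^k}$ with $k > n^c$, $c > 2$.
- For each $y$, $f_y$ is poly-time computable.
- This distribution is pseudorandom against distinguishers that run in time polynomial in $2^n$.

On the other hand, $\kappa$, or any property being used in a lower bound proof, shouldn't be too complex either! "$\kappa(f) = 1$ iff $f$ has low circuit complexity" is not good (trivial). Note that this $\kappa \in NP$!

Take the basic functions $(-1)^{x_1}, (-1)^{x_2}, (-1)^{x_3}, \dots, (-1)^{x_n}$, arranged in $n^{100}$ rows. This is a model for generating a random formula: I have $n \cdot n^{100} \cdot 2$ functions. But this is false! Why? Because using AND or OR changes the distribution from ...

Gowers Norms ($U^k$ Norms)

Fix a finite set $A$ and consider $\mathbb{R}^A$, the vector space of functions $f: A \to \mathbb{R}$. A norm on this space is a function $\|\cdot\|: \mathbb{R}^A \to \mathbb{R}^+$ such that
$$\|\alpha f\| = |\alpha| \|f\|, \qquad f \ne 0 \Rightarrow \|f\| \ne 0, \qquad \|f + g\| \le \|f\| + \|g\|.$$

Example: $\|f\|_\infty = \max_{x \in A} |f(x)|$.

Definition (dual norm):
$$\|f\|^* = \max\{ \langle f, g \rangle \mid \|g\| \le 1 \}, \qquad \text{where } \langle f, g \rangle = \frac{1}{|A|} \sum_{x \in A} f(x) g(x) \quad \text{("correlation")}.$$

Example: for $\|f\|_\infty$, the dual is
$$\|f\|_\infty^* = \max\{ \langle f, g \rangle \mid \|g\|_\infty \le 1 \} = \frac{1}{|A|} \max\Big\{ \sum_x f(x) g(x) \;\Big|\; \forall x: |g(x)| \le 1 \Big\} = \frac{1}{|A|} \sum_x f(x)\, \mathrm{sign}(f(x)) = \frac{1}{|A|} \sum_x |f(x)| = \|f\|_1.$$

In general: if $\|\cdot\|$ is in P, it doesn't mean that $\|\cdot\|^*$ is in P.

Another example: let $A$ be an abelian group and define
$$\|f\|_{U^2}^4 = E_{x,a,b}\big[ f(x) f(x+a) f(x+b) f(x+a+b) \big].$$
Is it in P? YES.
Is $\|f\|_{U^2}^*$ in P? It turns out that
$$\|f\|_{U^2} = \|\hat f\|_4 = \Big( \sum_{S \subseteq [n]} \hat f(S)^4 \Big)^{1/4},$$
so the dual norm is the dual of the $\ell_4$ norm on the Fourier side, $\|f\|_{U^2}^* = \|\hat f\|_{4/3}$, which is again computable in time polynomial in $2^n$.

Exercise: Prove that the $U^2$ norm is a norm. Hint: Cauchy-Schwarz.

However, we can also define the $U^3$ norm:
$$\|f\|_{U^3} = \Big( E_{x,a,b,c}\, f(x) f(x+a) f(x+b) f(x+a+b) f(x+c) f(x+a+c) f(x+b+c) f(x+a+b+c) \Big)^{1/8}.$$
We do not know a poly-time algorithm for $\|\cdot\|_{U^3}^*$.

The "goal" for introducing these norms was to extend Fourier analysis to "higher degree". For $f: \{0,1\}^n \to \{\pm 1\}$, $\hat f(S)$ is the correlation of $f$ with the linear function $(-1)^{\sum_{i \in S} x_i}$:
$$\chi_S: \{0,1\}^n \to \{\pm 1\}, \qquad \chi_S(x_1, \dots, x_n) = \prod_{i \in S} (-1)^{x_i} \quad \text{(linear phase functions)}.$$
Consider degree $d$ phase functions $(-1)^{g(x)}$ where $\deg g \le d$.

Fix $f: \{0,1\}^n \to \{\pm 1\}$. For $y \in \{0,1\}^n$ define $f_y: \{0,1\}^n \to \{\pm 1\}$ by
$$f_y(x) = f(x) \cdot f(x + y).$$
Write $f(x) = (-1)^{\phi(x)}$. Say, for instance, $f(x) = (-1)^{x_1 + x_2 + x_3}$. Then
$$f_y(x) = (-1)^{x_1 + x_2 + x_3} \cdot (-1)^{x_1 + x_2 + x_3 + y_1 + y_2 + y_3} = (-1)^{y_1 + y_2 + y_3} = \text{constant!}$$
It doesn't depend on $x$.

If $f(x) = (-1)^{q(x)}$ with $\deg q \le d$, then
$$f_y(x) = (-1)^{q(x)} \cdot (-1)^{q(x+y)} = (-1)^{q'(x)}, \qquad \text{where } \deg q' \le d - 1.$$
Define $f_{y,z} \equiv (f_y)_z$. Similarly: $f_{y_1 \dots y_d} = (\dots((f_{y_1})_{y_2})\dots)_{y_d}$.

---- end of lesson 2

For $f: \{0,1\}^n \to \{\pm 1\}$:
$$\|f\|_{U^k}^{2^k} = E_{x_0, x_1, \dots, x_k \in \{0,1\}^n} \prod_{\epsilon_1, \dots, \epsilon_k \in \{0,1\}} f(x_0 + \epsilon_1 x_1 + \dots + \epsilon_k x_k).$$
The points in the product form $x_0 + \mathrm{span}(x_1, \dots, x_k)$, a dimension $k$ affine subspace. For $k = 2$ these are $x_0,\ x_0 + x_1,\ x_0 + x_2,\ x_0 + x_1 + x_2$.

Example: Suppose $f$ is a linear function, i.e. $\exists a_1, \dots, a_n$ such that
$f(x) = \sum_i a_i x_i \pmod 2$. Then for any choice of $x_0, x_1, x_2$:
$$f(x_0) + f(x_0 + x_1) + f(x_0 + x_2) + f(x_0 + x_1 + x_2) = 0.$$
Hence for any choice of $x_0, x_1, x_2$:
$$\bigoplus_{\epsilon_1, \epsilon_2} f(x_0 + \epsilon_1 x_1 + \epsilon_2 x_2) \equiv 0.$$
Similarly the expectation of this XOR is 0 as well. Equivalently, in $\pm 1$ notation every product equals 1, so $\|(-1)^f\|_{U^2} = 1$.

Linearity Testing (proven by Blum-Luby-Rubinfeld)

$f: \{0,1\}^n \to \{0,1\}$, arithmetic mod 2. Question: Is $f$ a linear function?

Definition 1: $\exists a_1, \dots, a_n$ s.t. $\forall x:\ f(x) = \sum_i a_i x_i$
Definition 2: $\forall x, y:\ f(x) + f(y) = f(x + y)$

Definition 2 implies definition 1: we can define $a_i = f(e_i)$, and then by linearity
$$f(x) = f\Big( \sum_i x_i e_i \Big) = \sum_i x_i \cdot f(e_i).$$

Testing

We have a global object, e.g. $f: \{0,1\}^n \to \{0,1\}$, and want to test whether $f \in \mathcal{P}$ (in our example, $\mathcal{P}$ is the set of all linear functions). We are only willing to invest limited resources, but we are willing to randomize.

Question: Can we deduce a global property by considering local behavior? If the answer is yes we say that this $\mathcal{P}$ is testable.

A non-testable property: let $x_1, \dots, x_n$ be Boolean variables and $\sim x_1, \dots, \sim x_n$ their negations. From this list, I select $m$ 3-CNF clauses at random. Most of the time, a small sample of clauses will not have any shared variables, so locally the formula looks satisfiable. Yet if $m > 50n$ then with high probability $\varphi$ is unsatisfiable!

PCP "theory" implies that every polynomially verifiable property (e.g. formula satisfiability) can be cast ("encoded") in testable form.

Definition: for $f, g: \{0,1\}^n \to \{0,1\}$,
$$\mathrm{Distance}(f, g) = Prob_{x \in \{0,1\}^n}[f(x) \ne g(x)], \qquad \mathrm{Distance}(f, S) = \min_{g \in S} \mathrm{Distance}(f, g).$$

Theorem (BLR): Let $f: \{0,1\}^n \to \{0,1\}$. If $\mathrm{Distance}(f, LINEAR) \ge \delta$ then
$$Prob_{x,y}\big( f(x) + f(y) + f(x + y) \ne 0 \big) \ge \Omega(\delta).$$

Theorem (AKKLR): Let $f: \{0,1\}^n \to \{0,1\}$. If $\mathrm{Distance}(f, DEGREE(k)) \ge \delta$ then
$$Prob_{x_0, \dots, x_{k+1}}\Big( \bigoplus_{\epsilon_1, \dots, \epsilon_{k+1}} f(x_0 + \epsilon_1 x_1 + \dots + \epsilon_{k+1} x_{k+1}) \ne 0 \Big) \ge \Omega(\delta \cdot 2^{-k}).$$
So we select an affine subspace of dimension $k + 1$ (that is, $2^{k+1}$ points), look at $f$ on all of these points, and check whether the XOR of the values is zero.

$f$ is a degree $k$ function if
$$f(x_1, \dots, x_n) = \sum_{S \subseteq [n],\ |S| \le k} a_S \prod_{i \in S} x_i \pmod 2,$$
for example $f(x_1, \dots, x_k) = x_1 x_2 \cdots x_k$.

If $f$ has degree $k$ then $\forall x_0, \dots, x_{k+1}$:
$$\bigoplus_{\epsilon_1, \dots, \epsilon_{k+1}} f(x_0 + \epsilon_1 x_1 + \dots + \epsilon_{k+1} x_{k+1}) = 0.$$
Proof: Let $f_y(x) = f(x) + f(x + y)$; each such derivative lowers the degree by one. The function $g = f_{x_{k+1}, x_k, \dots, x_2}$ has degree 0 (it is constant). The above expression equals $g(x_0) \oplus g(x_0 + x_1) \equiv 0$.

Claim: Let $corr(f, k - 1)$ be the correlation of $f$ with degree $k - 1$ polynomials. If $\delta$ is the distance of $f$ from the degree $k - 1$ functions, then
$$corr = (1 - \delta) - \delta = 1 - 2 \cdot \mathrm{Distance}(f, \text{degree } k - 1 \text{ functions}),$$
and
$$corr(f, k - 1) \le \|f\|_{U^k}.$$

Reed-Muller (Low-Degree) Test

Let $p$ be the closest degree $k$ polynomial to $f$, and $\delta := \mathrm{Distance}(p, f)$.

1. $\delta$ is tiny: $\delta < \beta 2^{-k}$.
If an affine subspace of dimension $k + 1$ contains exactly one point $x$ such that $f(x) \ne p(x)$, then the test rejects. We will prove that this happens with constant probability. Assume $\delta \sim 2^{-k}$.

Choose a random affine subspace $A$ by choosing a random full-rank matrix $M_{n \times l}$ over $\mathbb{F}_2$ (here $l = k + 1$) and a random $b \in \mathbb{F}_2^n$:
$$A = \{\, a_x = Mx + b \mid x \in \mathbb{F}_2^l \,\}.$$
For each $x$ define the events
- $E_x$: $f(a_x) \ne p(a_x)$
- $F_x$: $E_x$ holds and $\forall y \ne x:\ f(a_y) = p(a_y)$

$a_x$ is distributed uniformly in $\mathbb{F}_2^n$, and for $y \ne x$, given $a_x$, the point $a_y$ is distributed uniformly on $\mathbb{F}_2^n \setminus \{a_x\}$. Hence for all $x \ne y$:
$$Prob[E_x] = \delta, \qquad Prob_{M,b}[E_x \text{ and } E_y] \le \delta^2,$$
$$Prob[F_x] \ge Prob[E_x] - \sum_{y \ne x} Prob[E_y \wedge E_x] \ge \delta - 2^l \cdot \delta^2 \approx \delta.$$
The events $F_x$ are disjoint, so
$$Prob_{M,b}\Big( \bigcup_x F_x \Big) = \sum_x Prob[F_x] \approx 2^l \cdot \delta = \text{constant}.$$

2. ...

--- End of lesson 3

Last week we:
- Defined the $U^k$ norm ⇔ degree $k - 1$ test ("low degree test")
- Proved the following theorem: if $\|f\|_{U^k} > 1 - \delta$ then there exists $p$ of degree $k - 1$ with $\langle f, p \rangle = E_x[f(x) p(x)] > 1 - \delta'$ (TODO: the exact error term is illegible in the notes).

Lemma 1: Let $f: \{0,1\}^n \to \{\pm 1\}$. Denote
$$corr(f, \deg k) = \max\{ \langle f, g \rangle \mid g = (-1)^{p(x)},\ p \text{ a polynomial of degree } k \}.$$
Then $corr(f, \deg k) \le \|f\|_{U^{k+1}}$.

Lemma 2: For every $h: \{0,1\}^n \to \{\pm 1\}$: $\|h\|_{U^k} \le \|h\|_{U^{k+1}}$.

Proof of 2: We shall use the fact that $E[Z^2] \ge (E[Z])^2$.
$$\|h\|_{U^{k+1}}^{2^{k+1}} = E_{x, y_1, \dots, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_{k+1}} h\Big( x + \sum_{i=1}^{k+1} \epsilon_i y_i \Big)$$
$$= E_{y_1, \dots, y_k}\ E_{x, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( x + \sum_{i=1}^{k} \epsilon_i y_i \Big) \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( x + y_{k+1} + \sum_{i=1}^{k} \epsilon_i y_i \Big)$$
Now substitute $x' \leftarrow x$, $y' \leftarrow x + y_{k+1}$; these are independent and uniform:
$$= E_{y_1, \dots, y_k}\ E_{x', y'} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( x' + \sum_{i=1}^{k} \epsilon_i y_i \Big) \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( y' + \sum_{i=1}^{k} \epsilon_i y_i \Big)$$
$$= E_{y_1, \dots, y_k} \Big( E_{x'} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( x' + \sum_{i=1}^{k} \epsilon_i y_i \Big) \Big)^2$$
$$\ge \Big( E_{x', y_1, \dots, y_k} \prod_{\epsilon_1, \dots, \epsilon_k} h\Big( x' + \sum_{i=1}^{k} \epsilon_i y_i \Big) \Big)^2 = \big( \|h\|_{U^k}^{2^k} \big)^2 = \|h\|_{U^k}^{2^{k+1}}. \qquad ∎$$

Proof of 1: For any $h: \{0,1\}^n \to \{\pm 1\}$:
1) $|E_x h(x)| = \|h\|_{U^1}$ (and $\|h\|_{U^2} = ( \sum_S |\hat h(S)|^4 )^{1/4}$)
2) $\forall k:\ \|h\|_{U^k} \le \|h\|_{U^{k+1}}$
3) $\forall f$ and every degree $k$ polynomial $p: \{0,1\}^n \to \{\pm 1\}$: $\|f \cdot p\|_{U^{k+1}} = \|f\|_{U^{k+1}}$, where $f \cdot p$ is pointwise multiplication.

Proof for 1:
$$\|h\|_{U^1}^2 = E_{x, y_1} \prod_{\epsilon = 0,1} h(x + \epsilon y_1) = E_{x, y_1}\, h(x)\, h(x + y_1).$$
Denote $x' = x$, $y' = x + y_1$:
$$= \big( E_{x'} h(x') \big) \big( E_{y'} h(y') \big) = \big( E_x h(x) \big)^2.$$

Proof for 3:
$$\|f \cdot p\|_{U^{k+1}}^{2^{k+1}} = E_{x, y_1, \dots, y_{k+1}} \prod_{\epsilon_1, \dots, \epsilon_{k+1}} f\Big( x + \sum_{i=1}^{k+1} \epsilon_i y_i \Big)\, p\Big( x + \sum_{i=1}^{k+1} \epsilon_i y_i \Big) = \|f\|_{U^{k+1}}^{2^{k+1}},$$
because $\forall x, y_1, \dots, y_{k+1}$:
$$\prod_{\epsilon_1, \dots, \epsilon_{k+1}} p\Big( x + \sum_{i=1}^{k+1} \epsilon_i y_i \Big) = 1.$$

Let $p: \{0,1\}^n \to \{\pm 1\}$ be the degree $k$ function closest to $f$ (i.e. attaining max correlation). Define $h(x) = f(x) \cdot p(x)$. Then
$$corr(f, p) = |E_x f(x) p(x)| = |E_x h(x)| \overset{\text{step 1}}{=} \|h\|_{U^1} \overset{\text{step 2}}{\le} \|h\|_{U^k} \le \|h\|_{U^{k+1}} \overset{\text{step 3}}{=} \|f\|_{U^{k+1}}. \qquad ∎$$

We considered $\|f\|_{U^3}$ as a "formal complexity measure". Now consider dual norms:
$$\|f\|^* = \max\{ \langle f, g \rangle \mid \|g\| \le 1 \}.$$
Motivation for the dual:
1) "NP-ish" definition, possibly circumvents RR
2) More robust

TODO: Draw world

Suppose we have two parts in our world, $A$ and $B$, and two functions $f, g$ such that $f$ is random on $A$ and is 1 on $B$, and $g$ is the exact opposite. Then $\|f\|_{U^3}$ is a constant, and $\|g\|_{U^3}$ is a constant as well. Let $h = f \vee g$. Then $\|h\|_{U^3}$ is essentially zero! This is a problem: we just used an OR and got such a dramatic difference.

For the dual norm: $\|h\|^* \ge \langle h, h \rangle = 1$, and $\langle h, \alpha h \rangle = \alpha$. We can take $\alpha = 1 / \|h\|$, which is very large!

What about $\|f\|_{U^3}^*$? We need to find a "norming function". Use $h$! Since $\big\| h / \|h\| \big\| = 1$,
$$\|f\|^* \ge \Big\langle f, \frac{h}{\|h\|} \Big\rangle = \frac{1}{\|h\|} E_x f(x) h(x) = \frac{1}{\|h\|} \Big( Prob(A) \cdot \underbrace{E_{x \in A} f(x) h(x)}_{=1} + Prob(B) \cdot E_{x \in B}\, h(x) \cdot 1 \Big) = \frac{1}{2 \|h\|} = \text{very large!}$$
Here $h = f$ on $A$, $E_{x \in B}\, h(x) \approx 0$, and $Prob(A) = 1/2$.

Note: $\langle f, f \rangle \le \|f\| \cdot \|f\|^*$.

Question 1: Given $f: \{0,1\}^n \to \{\pm 1\}$, can we compute $\|f\|_{U^k}^*$ in polytime? (An open question even for $k = 3$.)

Question 2: Suppose you know that $\|f\|_{U^3} \ge \epsilon$. By [Samorodnitsky '07] there exists a degree 2 polynomial $p$ that correlates with $f$. Can $p$ be found in time $poly(2^n)$? (The search space has size $2^{n^2}$.)

Gopalan-Klivans-Zuckerman '08: "list-decoding Reed-Muller codes".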
If $f$ is $\epsilon$-correlated with some degree-2 function, then the following is true:
1) The number of degree 2 polynomials correlating with $f$ is $\le 2^{O_\epsilon(n)}$
2) The list above can be found in time $2^{O_\epsilon(n)}$

Interpretation: if $\|f\|$ is small then $\|f\|^*$ is large.
$$\text{Simplicity property} = \{ f \mid \|f\|^* \text{ is small} \} \subseteq \{ f \mid \|f\| \text{ is large} \}.$$
The set on the right is a property in P! Call the first property $P_1$ and the second $P_2$; then $P_2$ is a poly-time property that contains every function in $P_1$.

Next idea: the $U^k$ norm for super-constant $k$.
The naive algorithm for computing $\|f\|_{U^k}$ takes $(2^n)^k = N^k$ time, where $N = 2^n$. If $k = k(N)$ grows with $N$, this is not polytime. We still intend to use the dual norm $\|\cdot\|_{U^k}^*$ (we want robustness).

Question 3: Is there an algorithm for $\|\cdot\|_{U^k}$ running in time better than $N^{k-1} \cdot \log N$?
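To make the naive algorithm concrete, here is a minimal sketch of it, assuming points of $\{0,1\}^n$ are encoded as $n$-bit integers with XOR as addition. It uses the identity from the proof of Lemma 2, $\|f\|_{U^k}^{2^k} = E_{y_1, \dots, y_{k-1}} ( E_x \prod_{\epsilon} f(x + \sum_i \epsilon_i y_i) )^2$, so it costs roughly $N^k$ steps (times a $2^{k-1}$ factor for the products). The names `uk_norm_pow`, `uk_norm` and `fourier_l4_pow` are hypothetical helpers for this illustration, not something defined in the lecture.

```python
from itertools import product
from math import prod


def uk_norm_pow(f, n, k):
    """Return ||f||_{U^k}^{2^k} for a truth table f (list of +1/-1 of length 2^n)."""
    N = 1 << n
    total = 0.0
    for ys in product(range(N), repeat=k - 1):
        # All 2^(k-1) XOR-combinations eps_1*y_1 + ... + eps_{k-1}*y_{k-1}.
        shifts = [0]
        for y in ys:
            shifts += [s ^ y for s in shifts]
        # Inner expectation over x of the product over the (k-1)-dimensional cube.
        inner = sum(prod(f[x ^ s] for s in shifts) for x in range(N)) / N
        total += inner * inner
    return total / N ** (k - 1)


def uk_norm(f, n, k):
    """Return ||f||_{U^k} itself."""
    return uk_norm_pow(f, n, k) ** (1 / 2 ** k)


def fourier_l4_pow(f, n):
    """Return sum_S f^(S)^4, which should equal ||f||_{U^2}^4."""
    N = 1 << n
    result = 0.0
    for S in range(N):
        coeff = sum(f[x] * (-1) ** bin(x & S).count("1") for x in range(N)) / N
        result += coeff ** 4
    return result


if __name__ == "__main__":
    n = 3
    # Quadratic phase (-1)^{x1 x2}, viewed as a function of 3 bits.
    quad = [(-1) ** ((x & 1) * ((x >> 1) & 1)) for x in range(1 << n)]
    for k in (1, 2, 3):
        print(k, uk_norm(quad, n, k))
    print("Fourier side:", fourier_l4_pow(quad, n))
```

On this example the norms increase with $k$ as Lemma 2 predicts, and the $U^3$ norm of the quadratic phase is exactly 1, matching step 3 of the proof of Lemma 1. The loop over `ys` is what makes the running time $N^{k-1}$ times the cost of one inner expectation; Question 3 asks whether one can do substantially better than this kind of enumeration.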
