Lecture 5. Formula complexity I

A Boolean function can be computed by different models, and an important one is the class of formulas. Roughly speaking, formulas are circuits with fan-out 1. More precisely, a formula is a rooted tree such that

- each internal node is associated with a gate, which is either AND or OR;
- each leaf is associated with a variable x_i, a negated variable ¬x_i, or a constant 0 or 1.

(Note that we have used De Morgan's rule to push all negations down to the leaf level.) The formula computes a function f on input x by evaluating the gates in a bottom-up manner, and the root outputs f(x).

Several measures of formulas are studied, among which depth, size (number of gates), and leaf size (number of leaves) are the most important. The leaf size L(F) of a formula F is simply its number of leaves. The leaf size L(f) of a function f is the minimum of L(F) over all formulas F that compute f.

A function f: {0,1}^n → {0,1} is monotone if f(x) ≤ f(y) whenever x_i ≤ y_i for all i ∈ [n]. In other words, changing an input variable from 0 to 1 never decreases the function value. For monotone functions, we can require the formula to contain only variables and constants at the leaves, with no negated variables. (Convince yourself.) Such formulas are called monotone formulas. So for a monotone function f we can define the monotone leaf size L^+(f) as the minimum number of leaves of any monotone formula that computes f.

Since circuit lower bounds are too hard to prove (the best unconditional lower bound is only cn for some constant c), people stepped back and tried to prove formula lower bounds. In this and the next lectures, we will learn various methods for proving lower bounds on formula complexity.

1. (Random) Restriction

The idea of restriction is as follows. We pick some variables and fix their values to constants, obtaining a "subfunction" f'. The variables are picked and fixed in such a way that the complexity of the subfunction is considerably smaller than that of f. If we keep doing this, the complexity bound eventually drops to 0. But if the resulting small function is still not constant, it has positive complexity, and this contradiction forces the original complexity to be large. This idea goes all the way back to Subbotovskaya [Sub61].

Lemma 1.1. For every Boolean function f of n variables, it is possible to fix one variable such that the resulting subfunction f' satisfies

    L(f') ≤ (1 − 1/n)^{3/2} · L(f).

Before proving the lemma, let us get a feeling for the parameters and see why it is nontrivial. Suppose F is a minimal formula computing f, with s = L(f) leaves. Since f has n variables, some variable x_i appears at least s/n times at the leaf level (possibly some of the appearances are in the negated form ¬x_i). If we fix x_i to a constant, the leaf size drops to at most s − s/n = (1 − 1/n)s. The lemma says that one can in fact reduce the size by more.

Now we prove it.

Proof. Define F and s as in the discussion above. The idea is that if F contains a subtree H of the form x_i ∧ G for some subtree G, then setting x_i = 0 kills the whole subtree G. (The cases where the gate is OR, or where x_i occurs negated, are handled in a similar way.) Therefore, if variable i appears t ≥ s/n times, then for each appearance there is an assignment x_i = 0 or x_i = 1 that kills its "neighbor" subtree G. One of the two assignments kills at least half of these neighbor subtrees, that is, at least t/2 of them. Use that assignment. The leaf size then drops to at most

    s − t − t/2 ≤ (1 − 3/(2n)) s ≤ (1 − 1/n)^{3/2} s,

as claimed: the t occurrences of x_i disappear, and so does at least one leaf from each of the ≥ t/2 killed neighbor subtrees.

The above analysis is almost correct, except that it ignores the possibility that the subtrees G intersect. Consider the extreme case of a subtree H of the form x_i ∧ x_i ∧ x_i ∧ … This looks weird, since clearly some of the occurrences of x_i are redundant. We can make this rigorous by the following claim: for any leaf x_i or ¬x_i of a minimal formula, its neighbor subtree G does not contain variable i. (Thus the different subtrees G do not intersect, and the counting above is valid.) The claim is not hard to prove. Basically, if we see x_i ∧ G and G contains x_i, then we can simply replace the occurrence of x_i inside G by the constant 1 without changing the function: if x_i = 1, we "guessed correctly" by replacing x_i in G by 1; if x_i = 0, then x_i ∧ G = 0 anyway. The other cases, such as G containing ¬x_i, or the gate connecting x_i and G being OR, are handled similarly. This completes the proof of the lemma. ∎
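To make the restriction step concrete, here is a small Python sketch (my own illustration; the tuple encoding and all helper names are ad hoc, not part of the lecture). It represents a formula as a nested tuple, finds the most frequently occurring variable, tries both constants for it, and compares the leaf size of the simplified formula with the bound of Lemma 1.1.

    # Sketch of the restriction step in Lemma 1.1 (ad hoc illustration, not from
    # the lecture).  A formula is a nested tuple: ('AND', F, G), ('OR', F, G),
    # ('VAR', i), ('NOT', i) for the negated variable, or ('CONST', 0 or 1).

    from collections import Counter

    def leaf_size(F):
        """Number of leaves of a formula."""
        if F[0] in ('VAR', 'NOT', 'CONST'):
            return 1
        return leaf_size(F[1]) + leaf_size(F[2])

    def occurrences(F, cnt=None):
        """Count how often each variable occurs at the leaf level (negated or not)."""
        if cnt is None:
            cnt = Counter()
        if F[0] in ('VAR', 'NOT'):
            cnt[F[1]] += 1
        elif F[0] in ('AND', 'OR'):
            occurrences(F[1], cnt)
            occurrences(F[2], cnt)
        return cnt

    def restrict(F, i, b):
        """Fix x_i = b and simplify by constant propagation:
        0 AND G -> 0, 1 AND G -> G, 1 OR G -> 1, 0 OR G -> G."""
        if F[0] == 'VAR':
            return ('CONST', b) if F[1] == i else F
        if F[0] == 'NOT':
            return ('CONST', 1 - b) if F[1] == i else F
        if F[0] == 'CONST':
            return F
        gate, L, R = F[0], restrict(F[1], i, b), restrict(F[2], i, b)
        absorbing = 0 if gate == 'AND' else 1       # this constant kills the gate
        for X, Y in ((L, R), (R, L)):
            if X == ('CONST', absorbing):
                return X                            # the neighbor subtree dies
            if X == ('CONST', 1 - absorbing):
                return Y                            # the neutral constant disappears
        return (gate, L, R)

    # Toy example: F = (x0 AND x1) OR ((NOT x0) AND x2), so n = 3 and s = 4 leaves.
    F = ('OR', ('AND', ('VAR', 0), ('VAR', 1)),
               ('AND', ('NOT', 0), ('VAR', 2)))
    s, n = leaf_size(F), 3
    i, t = occurrences(F).most_common(1)[0]         # most frequent variable, t >= s/n
    best = min(leaf_size(restrict(F, i, b)) for b in (0, 1))
    print(f"s = {s}, x_{i} occurs {t} times, best restriction leaves {best} leaves,"
          f" Lemma 1.1 bound: {(1 - 1/n) ** 1.5 * s:.2f}")

On this toy formula the better of the two restrictions shrinks 4 leaves down to 1, well within the bound of about 2.18; note that the code only upper-bounds L(f') by the size of one particular simplified formula.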
Repeatedly applying the lemma gives the following theorem.

Theorem 1.2. For every Boolean function f of n variables and every k ∈ [n], there is a way to fix n − k variables such that the resulting subfunction f' of k variables satisfies

    L(f') ≤ (k/n)^{3/2} · L(f).

Proof. Applying the lemma n − k times, we obtain a subfunction of k variables with leaf size at most

    L(f') ≤ (1 − 1/n)^{3/2} (1 − 1/(n−1))^{3/2} ⋯ (1 − 1/(k+1))^{3/2} · L(f) = (k/n)^{3/2} · L(f),

where the equality is by telescoping, since 1 − 1/j = (j−1)/j. ∎

In the above results, we carefully constructed an assignment to shrink the leaf size. Next we show that the construction does not actually need to be so careful: a random assignment does the job with good probability.

Theorem 1.3. Let f be a Boolean function of n variables. Pick a set of k variables uniformly at random, and assign each of the remaining n − k variables an independent uniformly random constant in {0,1}. Then with probability at least 3/4, the resulting subfunction f' satisfies

    L(f') ≤ 4 (k/n)^{3/2} · L(f).

Proof. Let us choose one variable uniformly at random and assign it a random value. (We will choose variables one by one; note that this process is the same as the random assignment in the statement of the theorem.) Denote by s_1 the number of leaves after this step, and let s = L(f). The proof of Lemma 1.1 actually shows that the expected number of leaves removed is at least 3s/(2n). Thus

    E[s_1] ≤ s − 3s/(2n) ≤ (1 − 1/n)^{3/2} s.

Continuing this process for n − k variables, we obtain a subfunction of k variables with expected leaf size

    E[s_{n−k}] ≤ (1 − 1/n)^{3/2} (1 − 1/(n−1))^{3/2} ⋯ (1 − 1/(k+1))^{3/2} · s = (k/n)^{3/2} · s.

The claimed statement now follows from Markov's inequality. ∎

(The above analysis is a bit informal about how the expectations compose across the stages. Can you make it more formal? Just compute stage 2, namely E[s_2], and you will see why it is correct.)

We can apply Theorem 1.2 to prove a lower bound for the parity function f(x) = x_1 ⊕ x_2 ⊕ … ⊕ x_n.

Theorem 1.4. For the parity function, L(f) ≥ n^{3/2}.

Proof. Fix all but one variable as in Theorem 1.2 (with k = 1). The result is the one-variable parity function or its negation, which is not constant and hence has leaf size at least 1. So

    1 ≤ L(f') ≤ (1/n)^{3/2} L(f),

which implies the claimed lower bound. ∎
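Here is a quick brute-force check (my own illustration; the helper names are ad hoc) of the fact used in the proof of Theorem 1.4: no matter which n − 1 variables of the parity function we fix, and to which constants, the restricted function still depends on the remaining variable.

    # Brute-force check of the key step in Theorem 1.4 (ad hoc illustration):
    # fixing all but one variable of the n-variable parity function always
    # leaves a non-constant one-variable function, so L(f') >= 1.

    from itertools import product

    def parity(bits):
        return sum(bits) % 2

    n = 5
    for i in range(n):                              # the variable left unfixed
        for fixed in product((0, 1), repeat=n - 1):
            def restricted(b, i=i, fixed=fixed):
                x = list(fixed[:i]) + [b] + list(fixed[i:])
                return parity(x)
            assert restricted(0) != restricted(1)   # never constant
    print(f"all restrictions are non-constant; Theorem 1.2 with k = 1 then gives "
          f"L(parity on {n} variables) >= n^1.5 = {n ** 1.5:.2f}")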
2. Rectangle and tiling number

We now introduce another lower bound method for leaf size, due to Khrapchenko [Khr72] and Rychkov [Ryc85].

A rectangle is a pair (A, B) of disjoint sets A, B ⊆ {0,1}^n.

A rectangle (A, B) is monochromatic if there exists an i ∈ [n] such that for all x ∈ A and y ∈ B, it holds that x_i ≠ y_i.

A rectangle (A, B) is 1-monochromatic if there exists an i ∈ [n] such that for all x ∈ A and y ∈ B, it holds that x_i = 1 and y_i = 0.

A rectangle (A, B) is 0-monochromatic if there exists an i ∈ [n] such that for all x ∈ A and y ∈ B, it holds that x_i = 0 and y_i = 1.

Any monochromatic rectangle is either 1-monochromatic or 0-monochromatic. (See why?)

The tiling number χ(R) of a rectangle R is the minimum number r such that R can be partitioned into r monochromatic rectangles. Note that by "partition" we emphasize that the rectangles are disjoint (and, of course, their union must be the whole set R). The tiling number χ(f) of a function f is χ(f^{-1}(1) × f^{-1}(0)), the tiling number of the rectangle f^{-1}(1) × f^{-1}(0).

Theorem 2.1. For every Boolean function f, L(f) ≥ χ(f). If f is monotone, then L^+(f) ≥ χ^+(f), where χ^+(f) is defined in the same way except that the partition may use only 1-monochromatic rectangles.

Proof. We prove the first inequality by induction on L(f); the second one is similar. The base case is easy to confirm: if L(f) = 1, then f(x) = x_i or f(x) = ¬x_i, and the rectangle f^{-1}(1) × f^{-1}(0) is itself monochromatic by definition.

Now the induction step. Take a formula F for f with minimal leaf size. Assume that the root gate is AND; the other case is handled similarly. So f = f_0 ∧ f_1, where f_0 and f_1 are the functions computed by the two subtrees, and L(f) = L(f_0) + L(f_1). (Convince yourself that in a minimal formula the subtrees are also optimal for the corresponding subfunctions.)

Define B_0 = f^{-1}(0) ∩ f_0^{-1}(0) and B_1 = f^{-1}(0) ∩ f_0^{-1}(1). First, we observe that

    f^{-1}(1) ⊆ f_0^{-1}(1),  f^{-1}(1) ⊆ f_1^{-1}(1),  B_0 ⊆ f_0^{-1}(0),  B_1 ⊆ f_1^{-1}(0),  f^{-1}(0) = B_0 ∪ B_1.

The first two inclusions hold because f(x) = 1 implies f_0(x) = f_1(x) = 1 (recall that f = f_0 ∧ f_1). The third one is by the definition of B_0. The fourth one holds because f(x) = 0 and f_0(x) = 1 imply f_1(x) = 0. The last one follows directly from the definitions of B_0 and B_1. Now

    χ(f) = χ(f^{-1}(1) × f^{-1}(0))                                        // def. of χ(f)
         ≤ χ(f^{-1}(1) × B_0) + χ(f^{-1}(1) × B_1)                         // tile the two parts separately
         ≤ χ(f_0^{-1}(1) × f_0^{-1}(0)) + χ(f_1^{-1}(1) × f_1^{-1}(0))     // by the four inclusions above
         = χ(f_0) + χ(f_1)                                                 // def. of χ(f_b)
         ≤ L(f_0) + L(f_1)                                                 // induction hypothesis
         = L(f).                                                           ∎

Now we need to further lower bound χ(f).

Theorem 2.2. Let A = f^{-1}(1), B = f^{-1}(0), and S = {(x, y) : x ∈ A, y ∈ B, |x ⊕ y| = 1}, where |x ⊕ y| denotes the Hamming weight of x ⊕ y. Then

    L(f) ≥ |S|^2 / (|A| · |B|).

Proof. Suppose we have a partition of A × B into r monochromatic rectangles R_i of dimensions s_i × t_i. Let c_i be the number of elements of S in R_i. We claim that c_i ≤ min{s_i, t_i} ≤ √(s_i t_i). Indeed, suppose that all pairs (x, y) in R_i differ at position j, i.e., x_j ≠ y_j. Then each x has at most one y with (x, y) ∈ S ∩ R_i (namely the y obtained from x by flipping the j-th bit), so c_i ≤ s_i. Similarly, c_i ≤ t_i. Then

    |S|^2 = (Σ_{i=1}^r c_i)^2 ≤ r Σ_{i=1}^r c_i^2 ≤ r Σ_{i=1}^r s_i t_i = r |A × B| = r |A| · |B|,

where the first inequality is Cauchy–Schwarz. Rearranging gives r ≥ |S|^2 / (|A| · |B|). Since this holds for every partition, χ(f) ≥ |S|^2 / (|A| · |B|), and the bound on L(f) follows from Theorem 2.1. ∎

We can use this to show a quadratic lower bound for the parity function, improving the bound from the last section.

Theorem 2.3. Let f be the parity function of n variables. Then L(f) ≥ n^2.

Proof. Clearly |A| = |B| = 2^{n−1}, and |S| = n · 2^{n−1}, since flipping any one of the n bits of an x ∈ A gives a y ∈ B at Hamming distance 1. So L(f) ≥ |S|^2 / (|A| · |B|) = n^2. ∎

One may wonder whether this lower bound can be further improved. The answer is no: there is in fact a formula computing the parity function with O(n^2) leaves. Can you find one? (Hint: consider the two-variable version, x_1 ⊕ x_2, first.)
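Before moving on to the references, here is a small brute-force confirmation (my own illustration; the helper names are ad hoc) of the counting in the proof of Theorem 2.3: it enumerates A = f^{-1}(1), B = f^{-1}(0), and the set S of pairs at Hamming distance 1, and prints the Khrapchenko bound |S|^2 / (|A| · |B|) for parity on small n.

    # Brute-force check of the quantities in the proof of Theorem 2.3
    # (ad hoc illustration of the Khrapchenko bound of Theorem 2.2).

    from itertools import product

    def khrapchenko_bound(f, n):
        A = [x for x in product((0, 1), repeat=n) if f(x) == 1]
        B = [y for y in product((0, 1), repeat=n) if f(y) == 0]
        # S: pairs (x, y) in A x B at Hamming distance exactly 1
        S = sum(1 for x in A for y in B
                if sum(a != b for a, b in zip(x, y)) == 1)
        return len(A), len(B), S, S * S / (len(A) * len(B))

    parity = lambda x: sum(x) % 2
    for n in range(2, 7):
        a, b, s, bound = khrapchenko_bound(parity, n)
        # expect |A| = |B| = 2^(n-1), |S| = n * 2^(n-1), bound = n^2
        print(f"n = {n}: |A| = {a}, |B| = {b}, |S| = {s}, |S|^2/(|A||B|) = {bound:.0f}")

The output shows |A| = |B| = 2^{n−1}, |S| = n · 2^{n−1}, and a bound of exactly n^2, matching the proof.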
References

[Sub61] B. A. Subbotovskaya. Realizations of linear functions by formulas using +, ·, −. Soviet Math. Dokl. 2, 110–112, 1961.

[Khr72] V. M. Khrapchenko. A method of obtaining lower bounds for the complexity of Π-schemes. Math. Notes Acad. Sci. USSR 10, 474–479, 1972.

[Ryc85] K. L. Rychkov. A modification of Khrapchenko's method and its application to lower bounds for Π-schemes of code functions. Metody Diskretnogo Analiza 42 (Novosibirsk), 91–98, 1985.