Permanent Does Not Have Succinct Polynomial Size Arithmetic Circuits of Constant Depth

Maurice Jansen and Rahul Santhanam
School of Informatics, The University of Edinburgh
maurice.julien.jansen@gmail.com, rsanthan@inf.ed.ac.uk

Abstract

We show that over fields of characteristic zero there does not exist a polynomial $p(n)$ and a constant-free succinct arithmetic circuit family $\{\Phi_n\}$ using division by constants¹, where $\Phi_n$ has size at most $p(n)$ and depth $O(1)$, such that $\Phi_n$ computes the $n \times n$ permanent. A circuit family $\{\Phi_n\}$ is succinct if there exists a nonuniform Boolean circuit family $\{C_n\}$ with $O(\log n)$ many inputs and size $n^{o(1)}$ such that $C_n$ can correctly answer direct connection language queries about $\Phi_n$. Succinctness is a relaxation of uniformity. To obtain this result we develop a novel technique that further strengthens the connection between black-box derandomization of polynomial identity testing and lower bounds for arithmetic circuits. From this we obtain the lower bound by explicitly constructing a hitting set against arithmetic circuits in the polynomial hierarchy.

1 Introduction

Proving super-polynomial arithmetic circuit lower bounds for explicit polynomials is one of the hardest challenges in theoretical computer science. For unrestricted circuits the best-known lower bounds are $\Omega(n \log r)$, for polynomials in $n$ variables and degree $r$, due to Baur and Strassen [BS82]. To make further progress, one popular approach has been to aim at proving lower bounds for small depth circuits first. This restriction is well-motivated by a recent result of Agrawal and Vinay [AV08], who show that proving a $2^{\Omega(n)}$ size lower bound for a multilinear polynomial in $n$ variables for depth four arithmetic circuits translates into a $2^{\Omega(n)}$ lower bound for the unrestricted model.
Over finite fields, Grigoriev and Karpinski [GK98] show that depth three arithmetic formulas that are sums-of-products-of-sums require size $2^{\Omega(n)}$ to compute the determinant polynomial $\det_n = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma) \prod_{j=1}^{n} x_{j\sigma(j)}$. Over fields of characteristic zero, e.g. $\mathbb{Q}$ or $\mathbb{R}$, super-polynomial lower bounds have been especially difficult to obtain. The best-known lower bound for depth three sums-of-products-of-sums formulas for $\det_n$, and also for the permanent polynomial $\mathrm{per}_n = \sum_{\sigma \in S_n} \prod_{j=1}^{n} x_{j\sigma(j)}$, is $\Omega(n^4 / \log n)$, due to Shpilka and Wigderson [SW01]. Over fields of characteristic zero, no lower bounds for this depth three model are known beyond $\Omega(n^2)$, for explicit polynomials in $n$ variables of degree at most $\mathrm{poly}(n)$.

¹ In the constant-free model the only allowed constants in the circuit are $-1$, $0$ and $1$, cf. [Bür09,KP07]. Additionally, we allow the circuit to use divisions of the form $1/g$, for any previously computed nonzero constant $g$.

For higher depth, the best-known bounds are due to Raz [Raz10], who constructs a polynomial with coefficients in $\{0, 1\}$ of degree $O(d)$ in $n$ variables that requires depth $d$ circuits of size $n^{1+\Omega(1/d)}$ over any field. Adding the restriction of multilinearity², Raz and Yehudayoff [RY09] prove that $\det_n$ and $\mathrm{per}_n$ require size $2^{n^{\Omega(1/d)}}$ for multilinear circuits of product depth $d$.

We remark that for nonuniform constant depth arithmetic circuits using integer constants an exponential lower bound may be derived for Permanent from the Razborov-Smolensky [Raz87,Smo87] result that $\mathrm{MOD}_3$ requires exponential size bounded depth Boolean circuits with parity gates. Namely, consider the following argument³, which was communicated to us by Pavel Hrubeš: By the Razborov-Smolensky result we have a Boolean function $f$ with a poly size Boolean formula $F$ that requires exponential size bounded depth Boolean circuits with parity gates. Interpret $F$ as an arithmetic circuit over $GF(2)$ by replacing AND with multiplication and parity with addition.
Let $\tilde{f}$ be the computed polynomial. Observe that $\tilde{f}$ requires exponential size arithmetic circuits of constant depth over $GF(2)$, as we can also perform the reverse replacement of gates to obtain a Boolean circuit for $f$. By Valiant's [Val79a] completeness result for determinant, we can write $\tilde{f} = \det(A)$, where $A$ is a poly-size matrix with variables, 0s and 1s, working over $GF(2)$. However, over $GF(2)$, Determinant equals Permanent, and thus Permanent also requires exponential size arithmetic bounded depth circuits over $GF(2)$. This implies the exponential lower bound over $\mathbb{Z}$.

For constant-free uniform⁴ arithmetic circuits it is known that depth $o(\log \log n)$ circuits for $\mathrm{per}_n$ must have super-polynomial size, as proved by Koiran and Perifel [KP09]. We interpret this result as a lower bound for circuits that are succinct in some extreme sense. In order to make progress we will prove lower bounds for a far less restrictive notion of succinctness. To make this precise, we give some definitions.

Consider a family $\{\Phi_n\}$ of constant-free arithmetic circuits, i.e. only using constants in $\{-1, 0, 1\}$. We say that $\{\Phi_n\}$ is $(a(n), b(n))$-succinct⁴ if there exists a family of nonuniform Boolean circuits $\{C_n\}$, where $C_n$ has at most $a(n)$ inputs and is of size at most $b(n)$, such that $C_n$ can answer direct connection language queries about $\Phi_n$. This means that given names of gates in $\Phi_n$, the circuit $C_n$ should be able to answer queries like whether two gates are connected, and what the type of a gate is ($+$, $\times$, $/$, or 'input'), and in the latter case whether the variable or constant label equals some given string. Regarding division gates, to express our lower bound result in its strongest form, we allow divisions of the form $1/g$, where $g$ is any nonzero constant previously computed by the circuit. We will call such circuits "constant-free arithmetic circuits using divisions by constants".

² This means that at any gate the computed polynomial must be multilinear.
³ We missed this point previously. This has prompted us to strengthen the model to include divisions by constants in order to properly demonstrate the power of our techniques.

⁴ See Section 3 for a formal definition.

Succinctness interpolates between uniformity and non-uniformity. It can be observed that DLOGTIME-uniform circuits of polynomial size are $(O(\log n), O(\log n \log \log n))$-succinct. On the other hand, if a sequence $\{\Phi_n\}$ of circuits is $(a(n), b(n))$-succinct, the circuit $\Phi_n$ can be constructed in time $2^{O(a(n))} b(n)^{O(1)}$, given $O(b(n) \log(b(n)))$ bits of advice. In what follows we take $a(n) = O(\log n)$, which limits the size of the arithmetic circuit to be $n^{O(1)}$. By convention, whenever $a(n) = O(\log n)$, we will drop it from the notation, and just write "$b(n)$-succinct".

Within this setting there is a spectrum of scenarios to study. At one extreme, $\mathrm{poly}(n)$-succinctness forms no restriction at all. At the other end, for $\mathrm{polylog}(n)$-succinctness, a lower bound can be derived with an advice elimination argument using the known uniform $\mathrm{TC}^0$ lower bound for permanent by Allender [All99]. Our main result is to push the envelope towards the high end by proving the following theorem:

Theorem 1. Let $d \geq 0$ be an integer. Over fields of characteristic zero, there does not exist a polynomial $p(n)$ such that $\{\mathrm{per}_n\}$ has $n^{o(1)}$-succinct size $p(n)$ depth $d$ constant-free arithmetic circuits using divisions by constants.

Thus either the permanent does not have polynomial size constant depth constant-free arithmetic circuits using divisions by constants, or, in the (unlikely) case that it does, the direct connection language $L$ for such circuits is hard in the sense that for deciding inputs of size $k = O(\log n)$, for some $\epsilon > 0$ one requires nonuniform Boolean circuits of size $n^{\epsilon} = 2^{\Omega(k)}$, i.e. exponential in the input size⁵.

Why are we interested in lower bounds against succinct circuits?
If we're interested in actually computing the permanent, the mere fact of there being polynomial-size circuits for it doesn't seem to be sufficient, for there is the question of how we can find these circuits. Succinctness is a measure of how easy the circuits themselves are to describe. Also, ultimately, much of the motivation for arithmetic complexity comes from classical problems like NP vs P and PSPACE vs P, which have resisted solution, or even progress, for many decades. Often the nonuniform versions of these questions are studied because we have few ways of taking advantage of uniformity apart from diagonalization.

One aspect of our proof is that we do not make use of hierarchy theorems, although it is conceivable that a more traditional diagonalization argument⁶ can also yield results similar to ours. However, we think our approach has more potential when it comes to breaking new ground, depending on whether one can manage to use more sophisticated algebraic techniques for constructing hitting sets. As this paper shows, already a naive counting argument for constructing a hitting set in the polynomial hierarchy manages to establish a strong lower bound theorem. We next describe the actual techniques used in the proof.

⁵ It is possible to give a uniform upper bound of $\mathrm{E}^{\mathrm{NP}^{\mathrm{RP}}}$ for $L$.

⁶ One of the referees of this paper suggested such an argument.

1.1 Techniques

At a high level, we exploit the well-known connection between lower bounds and derandomization of arithmetic circuit identity testing (ACIT). In the ACIT problem one is given an arithmetic circuit, and the question is to decide whether the polynomial computed by the circuit is identically zero or not. Using the Schwartz-Zippel-deMillo-Lipton Lemma [DL78,Sch80,Zip79], Ibarra and Moran [IM83] show this problem is in coRP. It is known that non-trivial deterministic algorithms would lead to circuit lower bounds against Boolean or arithmetic circuits [HS80,KI04,Agr05].
This is often interpreted as evidence for the difficulty of derandomizing ACIT. However, Agrawal turns this interpretation on its head and advocates derandomization of ACIT as an approach toward proving circuit lower bounds [Agr05]. Our result is a step in this direction.

There are two major parts to our argument. The first is to show that the lower bound of Theorem 1 follows from a black-box derandomization hypothesis (Working Hypothesis 1) for ACIT. Our Working Hypothesis 1 poses the existence of integer sequences definable by $n^{o(1)}$-succinct $\mathrm{TC}^0$ circuits that form a hitting set against small constant-free arithmetic circuits. We prove the implication by combining ideas of Agrawal [Agr05], Bürgisser [Bür09] and Koiran [Koi10], and making critical use of our succinctness assumption to indirectly perform computations which we cannot afford to do directly.

We do not know how to prove Working Hypothesis 1. Instead, what we will establish for the second part of our argument is a weaker derandomization of ACIT using integer sequences that are so-called weakly-definable in the polynomial hierarchy. However, this statement appears not to be strong enough to get the lower bound of Theorem 1 directly. To resolve this, we argue by contradiction as follows. Assume that the conclusion of Theorem 1 fails. Then, due to Toda's Theorem [Tod91] and an improvement by Zankó [Zan91], cf. [All99], of Valiant's [Val79b] completeness result for $\mathrm{per}_n$, this induces a collapse of the polynomial hierarchy, which makes the integer sequence we explicitly construct good enough to satisfy the requirements of Working Hypothesis 1. This way we obtain a lower bound for $\mathrm{per}_n$ after all, but this contradicts the assumption that the conclusion of Theorem 1 does not hold.

1.2 Overview

The rest of this paper is organized as follows. In Section 2 we import some required definitions and results from the literature.
In Section 3 we define our computational model of succinct circuits, and we prove some preliminary results pertaining to this model. In Section 4 we state our derandomization hypothesis (Working Hypothesis 1), and we show this statement implies a lower bound for succinct circuits computing permanent. Section 5 contains a simple construction of a hitting set against arithmetic circuits in the polynomial hierarchy. Putting everything together, we prove Theorem 1 in Section 6. In Section 7 we switch gears to give the advice elimination argument alluded to before. Finally, we discuss some open problems in Section 8.

2 Preliminaries

We use standard base 2 representation of integers by binary strings, with one bit representing the sign. By abuse of notation, type casting between integers and binary strings is omitted throughout the paper.

An arithmetic circuit $\Phi$ over a set of variables $X = \{x_1, x_2, \ldots, x_n\}$ and a field $\mathbb{F}$ is given by a directed acyclic graph such that nodes with in-degree 0 are labeled with elements of $X \cup \mathbb{F}$. Nodes with higher in-degree are labeled by $+$ or $\times$. To each vertex in $\Phi$ we can associate an element of the polynomial ring $\mathbb{F}[X]$ in the obvious way. A polynomial $f \in \mathbb{F}[X]$ is said to be computed by $\Phi$ if there exists a node in $\Phi$ where the associated polynomial equals $f$. The size $s(\Phi)$ of the circuit is defined to be the number of edges in $\Phi$. We will also write $|\Phi|$ for $s(\Phi)$. The depth $d(\Phi)$ of the circuit is defined to be the length of the longest directed path in $\Phi$. For a polynomial $f \in \mathbb{F}[X]$, its arithmetic circuit complexity $s(f)$ is defined by $s(f) = \min\{s(\Phi) : \Phi \text{ computes } f\}$. If the underlying graph of $\Phi$ is a tree, then $\Phi$ is called a formula. Arithmetic formula size of $f$ is denoted by $e(f)$.

For constant-free arithmetic circuits the only field constants that are allowed as labels are those in $\{-1, 0, 1\}$, cf. [Bür09,KP07]. Constant-free arithmetic circuits compute polynomials with integer coefficients.
To express our main lower bound result in its strongest form, we extend the constant-free model to include so-called divisions by constants by allowing gates of the form $f/g$, where the inputs $f$ and $g$ are computed previously in the circuit, and such that $g$ is a nonzero constant. Constant-free arithmetic circuits using division by constants defined as such compute polynomials in $\mathbb{Q}[X]$.

For definitions of standard complexity classes like P, PH, PP, etc., we refer the reader to Scott Aaronson's Complexity Zoo⁷. Next we list several of the important ones we use, for the reader's convenience. #P is the class of all functions $f : \{0,1\}^* \to \{0,1\}^*$ such that there exists a language $A \in \mathrm{P}$ and a polynomial $p(n)$ such that $f(x) = |\{w \in \{0,1\}^{p(|x|)} : (x, w) \in A\}|$. GapP is the class of all functions $f - g$, where $f, g \in \#\mathrm{P}$. We will make combined use of Valiant's [Val79b] result that computing $\mathrm{per}_n(M)$ for $M$ with entries in $\{0, 1\}$ over $\mathbb{Z}$ is complete for #P, and Toda's [Tod91] Theorem, which states that $\mathrm{PH} \subseteq \mathrm{P}^{\#\mathrm{P}[1]}$.

We define the majority operator $\mathbf{C}.$ acting on a complexity class as follows. Given a complexity class $\mathcal{C}$, $\mathbf{C}.\mathcal{C}$ is the class of all languages $L$ for which there exists $L' \in \mathcal{C}$ and a polynomial $p(n)$ such that $x \in L \Leftrightarrow |\{w \in \{0,1\}^{p(|x|)} : (x, w) \in L'\}| > 2^{p(|x|)-1}$. The counting hierarchy, introduced by Wagner [Wag86], is defined to be $\bigcup_{i \geq 0} \mathrm{C}_i\mathrm{P}$, where $\mathrm{C}_0\mathrm{P} = \mathrm{P}$, and for all $i \geq 1$, $\mathrm{C}_i\mathrm{P} = \mathbf{C}.\mathrm{C}_{i-1}\mathrm{P}$. The first level $\mathrm{C}_1\mathrm{P}$ of this hierarchy corresponds to the standard complexity class PP. We will use Torán's [Tor91] characterization of the counting hierarchy, which states that $\mathrm{C}_{i+1}\mathrm{P} = \mathrm{PP}^{\mathrm{C}_i\mathrm{P}}$, for all $i \geq 0$.

Next we define nonuniform versions of complexity classes. An advice function is a function of type $h : \mathbb{N} \to \{0,1\}^*$. For a complexity class $\mathcal{C}$, we define $\mathcal{C}/\mathrm{poly}$ to be the class of languages $L$ for which there exists $L' \in \mathcal{C}$ and an advice function $h$ with $|h(n)| = n^{O(1)}$, such that $x \in L \Leftrightarrow (x, h(|x|)) \in L'$.

⁷ http://qwiki.stanford.edu/index.php/Complexity_Zoo
We use the Boolean circuit complexity classes $\mathrm{AC}^0$, $\mathrm{TC}^0$ and $\mathrm{NC}^1$. $\mathrm{AC}^0$ is the class of all Boolean functions computable by polynomial size constant depth circuits with unbounded fan-in gates in $\{\vee, \wedge, \neg\}$. $\mathrm{TC}^0$ is the class of all Boolean functions computable by polynomial size constant depth unbounded fan-in threshold circuits. A threshold circuit is a Boolean circuit in which all gates compute either the negation or the majority function of their inputs. $\mathrm{NC}^1$ is the class of all Boolean functions that can be decided by polynomial size $O(\log n)$ depth circuits of bounded fan-in. For these classes we have that $\mathrm{AC}^0 \subseteq \mathrm{TC}^0 \subseteq \mathrm{NC}^1$. For a Boolean circuit family $\{C_n\}$, if there are no requirements on constructability, we call the family nonuniform. For the uniform versions of Boolean complexity classes we will always be using the notion of DLOGTIME-uniformity. We postpone defining this until Section 3.

We will use the notion of definability of Ref. [KP07]. To distinguish this from definability as in Ref. [Bür09], we use the term weakly-definable. An integer sequence of bit size $q(n)$ is given by a function $a(n, k_1, k_2, \ldots, k_t)$, for some fixed number $t$, such that there exist polynomials $p(n)$ and $q(n)$ so that $a(n, k_1, k_2, \ldots, k_t) \in \mathbb{Z}$ is defined for all $n \geq 0$ and all $0 \leq k_1, k_2, \ldots, k_t < 2^{p(n)}$, and where the bit size of $a(n, k_1, k_2, \ldots, k_t)$ is bounded by $q(n)$. We will often write $a_n(k_1, k_2, \ldots, k_t)$ instead of $a(n, k_1, k_2, \ldots, k_t)$. We define the language $\mathrm{uBit}(a) = \{(1^n, k_1, k_2, \ldots, k_t, j, b) : \text{the } j\text{th bit of } a(n, k_1, k_2, \ldots, k_t) \text{ equals } b\}$. Here $k_1, k_2, \ldots, k_t$ and $j$ are encoded in binary. For a sequence $a(n, k_1, k_2, \ldots, k_t)$ and a complexity class $\mathcal{C}$, if $\mathrm{uBit}(a) \in \mathcal{C}$, then we say that the sequence $a(n, k_1, k_2, \ldots, k_t)$ is weakly-definable in $\mathcal{C}$.

We make use of the well-known Schwartz-Zippel-deMillo-Lipton Lemma.

Lemma 1 ([DL78,Sch80,Zip79]). Let $A$ be an arbitrary nonempty subset of the field $\mathbb{F}$.
Then for any nonzero polynomial $f \in \mathbb{F}[X]$ of degree $d$, $\Pr[f(a_1, a_2, \ldots, a_n) = 0] \leq \frac{d}{|A|}$, where the $a_i$'s are picked independently and uniformly at random from $A$.

Finally, we require the following completeness result for permanent:

Proposition 1 (Proposition 2.17 in [Bür00]). Suppose $\mathrm{char}(\mathbb{F}) \neq 2$. Assume the polynomial $g \in \mathbb{F}[x_1, \ldots, x_n, y_1, \ldots, y_m]$ can be computed by a formula of size $< e$ that uses constants in $A \subset \mathbb{F}$. Put $f(X) = \sum_{a \in \{0,1\}^m} g(X, a)$. Then there exists a digraph $G$ of size $|G| \leq 6e$ with edge weights in $A \cup \{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1\} \cup \{x_1, \ldots, x_n\}$ such that $f = \mathrm{per}(G)$. In other words, $f$ is a projection of $\mathrm{per}_{6e}$.

3 Representing Arithmetic Circuits by Boolean Circuits

For the set $\{x_1, x_2, \ldots, x_n\} \cup \{-1, 0, 1\} \cup \{+, \times, /\}$, we assume that we have fixed some naming scheme that assigns to each element an $O(\log n)$ bit binary string, which is called a type. We assume that all gates in a circuit have been labeled by unique binary strings that also specify the type.

Definition 1. A representation of a constant-free arithmetic circuit $\Phi$ using divisions by constants is given by a Boolean circuit $C_n$ that accepts precisely all tuples $(t, a, b, q)$ such that
1. In case $q = 1$ (connection query), $a$ and $b$ are numbers of gates in $\Phi$, $b$ is a child of $a$, and $a$ has type $t$.
2. In case $q = 0$ (type query only), $a$ is a number of a gate in $\Phi$, and $a$ is of type $t$.

For a constant-free arithmetic circuit using divisions by constants over $n$ variables that is of size $\mathrm{poly}(n)$, gate names can be encoded using $O(\log n)$ bits.

Definition 2. Let $a(n), b(n)$ be two functions $\mathbb{N} \to \mathbb{N}$. For a family of arithmetic circuits $\{\Phi_n\}$, we say it is $(a(n), b(n))$-succinct if there exists a non-uniform family of Boolean $\{\vee, \wedge, \neg\}$-circuits $\{C_n\}$, such that $C_n$ represents $\Phi_n$, where for all large enough $n$, $C_n$ has $\leq a(n)$ inputs and is of size $\leq b(n)$.

As a matter of convention, if $a(n) = O(\log n)$, we drop it from the notation, and just write $b(n)$-succinct.
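To illustrate which tuples a representation in the sense of Definition 1 must accept, here is a minimal sketch. It decides membership by looking the answer up in an explicit gate table, so it only shows the language being represented; it says nothing about how a small Boolean circuit $C_n$ would achieve this, which is the whole point of succinctness. The table `GATES`, the type strings, and the function name `accepts` are hypothetical choices made for the example.

```python
# Hypothetical explicit circuit: gate number -> (type, children).
# Types are written as short strings here instead of O(log n)-bit names.
GATES = {
    0: ("x1", ()),
    1: ("x2", ()),
    2: ("1", ()),
    3: ("+", (0, 1)),
    4: ("*", (3, 2)),
}


def accepts(t, a, b, q):
    """Return True iff the tuple (t, a, b, q) satisfies Definition 1."""
    if a not in GATES:
        return False
    gate_type, children = GATES[a]
    if q == 0:
        # Type query only: the value of b is irrelevant.
        return gate_type == t
    if q == 1:
        # Connection query: a has type t and b is a child of a.
        return gate_type == t and b in GATES and b in children
    return False
```

For example, `accepts("+", 3, 0, 1)` holds because gate 0 is a child of the addition gate 3, while `accepts("+", 3, 2, 1)` fails because gate 2 is not a child of gate 3.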
We want to study the notion of $n^{o(1)}$-succinctness. We will fix some arbitrarily slow growing function $\gamma(n)$ and consider $n^{1/\gamma(n)}$-succinctness instead. A typical example to think of would be $\gamma(n) = \log^* n$.

Notation 1. For the rest of the paper we let $\gamma(n) : \mathbb{N} \to \mathbb{N}$ be an unbounded monotone function, such that $\forall n$, $\gamma(n) < \log \log n$.

Similarly to the above, we define the notion of $(a(n), b(n))$-succinct Boolean circuits. In this case type names are assumed to form a naming scheme for the elements of $\{x_1, x_2, \ldots, x_n\} \cup \{0, 1\} \cup \{\vee, \wedge, \neg, \mathrm{MAJ}\}$.

Regarding DLOGTIME-uniformity we refer the reader to Ref. [BIS90] for an extensive treatment. In our set-up this can be defined as follows. A poly size Boolean circuit family $\{C_n\}$ is DLOGTIME-uniform if, given $(n, t, a, b, q)$ with $n$ in binary, we can answer the queries of Definition 1 in time $O(\log n)$ on a Turing machine. Using standard conversions from Turing machines to Boolean circuits, observe that if a Boolean circuit family $\{C_n\}$ is DLOGTIME-uniform, then it is $O(\log n \log \log n)$-succinct, but that a converse of this does not generally hold.

We use the following results. For iterated integer multiplication one is given $n$ integers $A_1, A_2, \ldots, A_n$ of $n$ bits each, and the problem is to compute the bits of $A_1 A_2 \cdots A_n$. For the problem division, one is given two $n$ bit numbers $A$ and $B$ and the required output is $\lfloor A/B \rfloor$. Hesse, Allender and Barrington [HAB01] show that both these computational tasks can be performed with DLOGTIME-uniform $\mathrm{TC}^0$ circuits. The problem of iterated integer addition is well known to be in DLOGTIME-uniform $\mathrm{TC}^0$, cf. [Vol99].

Next we prove several technical lemmas that deduce consequences from the assumption that $\{\mathrm{per}_n\}$ has $n^{1/\gamma(n)}$-succinct arithmetic circuits. First, however, we prove an easy proposition which shows that we can assume that for a constant-free arithmetic circuit using divisions by constants only the output gate computes a division.

Proposition 2.
Let $\Phi$ be a $(t, t')$-succinct constant-free arithmetic circuit using division by constants of size $s$ and depth $d$ computing $f \in \mathbb{Q}[X]$. Then there exists an equivalent $(O(t), O(t'))$-succinct constant-free arithmetic circuit $\Phi'$ using divisions by constants of size $O(s^2)$ and depth $2d$ that has a single division-by-constant gate $p/c$ at the output such that $f = p/c$ for some $c \in \mathbb{Z} \setminus \{0\}$. Furthermore, for the gate $u$ computing $c$ we have that the subcircuit rooted at $u$ does not contain variables.

Proof. For a gate $u$ in an arithmetic circuit let us denote the polynomial it computes by $[u]$. The arithmetic circuit $\Phi'$ is obtained by allocating for each gate $u$ in $\Phi$ a triple of gates $(u_1, u_2, u_3)$ in $\Phi'$ with the following properties:
– $[u_1] \in \mathbb{Z}[X]$ and $[u_2] \in \mathbb{Z} \setminus \{0\}$.
– $[u] = [u_1]/[u_2]$.
– $[u_3] = [u_1](0)$.
– The subcircuits rooted at $u_2$ and $u_3$ do not contain variables.

The construction goes as follows. The above properties are easily established using structural induction, which we do not write in full, but we will make some remarks when appropriate. For an input gate labeled with $x_i$ we create a triple of gates computing $(x_i, 1, 0)$. Similarly, for an input gate labeled with $c \in \{-1, 0, 1\}$ the triple is $(c, 1, c)$. For a $\times$-gate $u$ in $\Phi$ with inputs $g^1, g^2, \ldots, g^k$, say $(g_1^1, g_2^1, g_3^1), \ldots, (g_1^k, g_2^k, g_3^k)$ are the corresponding triples already created. We then allocate a triple of gates in $\Phi'$ computing $[g_1^1][g_1^2] \cdots [g_1^k]$, $[g_2^1][g_2^2] \cdots [g_2^k]$, and $[g_3^1][g_3^2] \cdots [g_3^k]$. Similarly, if $u$ is an addition gate we first allocate a layer of $k$-ary product gates computing $[g_1^j] \cdot \prod_{i \in [k], i \neq j} [g_2^i]$ for each $j \in [k]$. Then we allocate a gate computing the sum of these. For the second component of the triple we compute $[g_2^1][g_2^2] \cdots [g_2^k]$. For the third component of the triple we compute $\sum_{j \in [k]} [g_3^j] \cdot \prod_{i \in [k], i \neq j} [g_2^i]$. For a division gate with inputs $g$ and $h$, say the corresponding triples are $(g_1, g_2, g_3)$ and $(h_1, h_2, h_3)$.
Then we allocate gates computing the triple $([g_1][h_2], [g_2][h_3], [g_3][h_2])$. Note that as division is allowed only if $[h]$ is a constant, and we are given by the induction hypothesis that $[h_2] \in \mathbb{Z} \setminus \{0\}$ and $[h] = [h_1]/[h_2]$, we get that $[h_1] \in \mathbb{Z}$. Therefore $[h_1] = [h_1](0) = [h_3]$, and so $[g_2][h_1] = [g_2][h_3]$. Similarly, we have that $([g_1][h_2])(0) = [g_1](0)[h_2] = [g_3][h_2]$. Note that for the second and third component of each triple, inputs are never taken from a first component. This means that subcircuits rooted at such nodes do not contain variables.

Finally, say for the output gate $u$ we have created the triple $(u_1, u_2, u_3)$. Then we add a single division gate computing $[u_1]/[u_2]$. The construction we have described yields a circuit of size $O(s^2)$ and depth $2d$. If we have a Boolean circuit of size $t'$ with $t$ inputs representing $\Phi$, we can easily obtain a representation for $\Phi'$ using $O(t)$ input bits and of size $O(t')$. □

Lemma 2. Assume that for some constant $c_0 > 0$, $\{\mathrm{per}_n\}$ can be computed by $n^{1/\gamma(n)}$-succinct size $n^{c_0}$ depth $d$ constant-free arithmetic circuits using division by constants. Then for some constants $d', d''$, we can compute $\mathrm{per}_n(M)$ over $\mathbb{Z}$, where entries of $M$ are in $\{0, 1\}$, by $O(n^{1/\gamma(n)})$-succinct $\mathrm{TC}^0$ circuits of depth $d' \cdot d + d''$.

Proof. First we apply Proposition 2 to obtain two constant-free arithmetic circuits computing $f \in \mathbb{Z}[X]$ and $c \in \mathbb{Z} \setminus \{0\}$ such that $\mathrm{per}_n = f/c$. The circuit computing $c$ does not contain variables. We can bound the size and depth of these circuits by $s := O(n^{2c_0})$ and $2d$, respectively. They can be represented by Boolean circuits of size $O(n^{1/\gamma(n)})$. By a straightforward induction we can bound the magnitude of values at gates in these circuits for $0, 1$ inputs by $2^{s^{2d}}$. The value of $\mathrm{per}_n(M)$ is less than $2^{n \log n}$. Let $m = \max(2^{n^2}, 2^{n^{c_0(2d+1)}})$. In order to compute the permanent we evaluate both depth $2d$ arithmetic circuits, where arithmetic is done modulo $m$.
To do this we use that addition and multiplication can be done in DLOGTIME-uniform $\mathrm{TC}^0$ [HAB01]. Computing remainders modulo $m$ is easily done by discarding bits of too large significance. Finally, using the fact that integer division can be done in DLOGTIME-uniform $\mathrm{TC}^0$ [HAB01], we add circuitry for computing the division $f(M)/c$. As remarked before, DLOGTIME-uniformity implies $O(\log n \log \log n)$-succinctness. Regarding succinctness of the circuits obtained this way, we represent the resulting circuits by modifying the Boolean circuit family representing the original arithmetic circuit for $\{\mathrm{per}_n\}$ with additional circuitry of at most $O(\log n \log \log n)$ many gates. □

Using an improvement of Valiant's completeness result by Zankó [Zan91], cf. [All99], who shows that $0,1$-$\mathrm{per}_n$ over $\mathbb{Z}$ is complete for #P under DLOGTIME-uniform $\mathrm{AC}^0$ reductions, one can now easily verify the following statement:

Lemma 3. Assume that for some constant $c_0 > 0$, $\{\mathrm{per}_n\}$ can be computed by $n^{1/\gamma(n)}$-succinct size $n^{c_0}$ depth $d$ constant-free arithmetic circuits using division by constants. Then for any $F \in \mathrm{GapP}$ there exist constants $d', d''$ and $c' \geq 1$ such that $F$ can be computed by $n^{c'/\gamma(n)}$-succinct depth $d' \cdot d + d''$ $\mathrm{TC}^0$ circuits.

The following corollary is now easy, and is left as an exercise:

Corollary 1. If for some constant $c_0 > 0$, $\{\mathrm{per}_n\}$ can be computed by $n^{1/\gamma(n)}$-succinct size $n^{c_0}$ depth $d$ constant-free arithmetic circuits using division by constants, then $\mathrm{CH}/\mathrm{poly} \subseteq \text{nonuniform-}\mathrm{TC}^0$.

Finally, we need the following lemma.

Lemma 4. Assume for some constant $c_0 > 0$, $\{\mathrm{per}_n\}$ can be computed by $n^{1/\gamma(n)}$-succinct size $n^{c_0}$ depth $d$ constant-free arithmetic circuits using division by constants. Let $F : \{0,1\}^* \to \{0,1\}^*$ be a GapP function, and let $c'$ and $d', d''$ be the constants provided by Lemma 3 for this $F$. Let $(A_n)$ be an integer sequence of bit size at most $\ell(n)$ that is weakly-definable in $\mathrm{CH}/\mathrm{poly}$.
If it holds that $\ell(n)^{c'/\gamma(\ell(n))} = n^{O(1)}$ and $\log \ell(n) \leq n^{O(1)}$, then $b_n := F(A_n)$ is an integer sequence of bit size $\mathrm{poly}(\ell(n))$ weakly-definable in $\mathrm{CH}/\mathrm{poly}$.

Proof. We use the technique of 'scaling up to the counting hierarchy' as in [Bür09]. First we use Lemma 3 to get a $\mathrm{poly}(\ell(n))$ size depth $d' \cdot d + d''$ circuit $C_n$ with majority and negation gates for computing $F$ on inputs of size $\ell(n)$. Furthermore, there is a Boolean circuit $B_n$ which can answer connectivity queries of this circuit. This circuit has $O(\log \ell(n))$ inputs and is of size $\ell(n)^{c'/\gamma(\ell(n))} = n^{O(1)}$.

For $0 \leq i \leq d' \cdot d + d''$, let $L_i$ be the language of tuples $(G, 1^n, b)$ such that the condition "$G$ is the name of a gate on level $i$ in $C_n$. It outputs $b$ when $C_n$ is given input $A_n$" holds. We will prove by induction on $i$ that $L_i \in \mathrm{CH}/\mathrm{poly}$. For the base case $i = 0$ this is clear, since $\mathrm{uBit}(A_n) \in \mathrm{CH}/\mathrm{poly}$. Now assume $L_i \in \mathrm{CH}/\mathrm{poly}$. By Torán's [Tor91] characterization of the counting hierarchy it suffices to show that we can decide $L_{i+1}$ in $\mathrm{PP}^{L_i}/\mathrm{poly}$. This is done as follows. We assume the gate $G$ is of majority type. Negation gates are handled similarly. Let $N$ be a nondeterministic Turing machine that on input $(G, 1^n, b)$ nondeterministically guesses the $O(c \log \ell(n)) = \mathrm{poly}(n)$ size name of a gate $H$, and uses the advice $B_n$ to check that $H \to G$ is a wire in $C_n$. If this is not true, it nondeterministically flips a bit $b'$ and accepts if $b' = 1$, rejects if $b' = 0$. Otherwise, it queries whether $(H, 1^n, b) \in L_i$, and accepts if the answer to this query is yes, rejects otherwise. Observe that $N$ accepts on the majority of its nondeterministic guesses iff the majority of the inputs to $G$ are outputting $b$ in $C_n(A_n)$. □

4 Conditional Lower Bound for Permanent

In this section we will pose a derandomization hypothesis for the polynomial identity testing problem, and show that if the hypothesis can be met, then a lower bound for the permanent follows.
Given an integer sequence $a_n(i, j)$ with $0 \leq i < n$, $0 \leq j < p(n)$ for some function $p(n)$, we think of this as encoding a collection $\{H_n\}$ of subsets $H_n \subseteq \mathbb{Z}^n$ with $|H_n| = p(n)$, where $H_n = \{(a_n(0, j), a_n(1, j), \ldots, a_n(n-1, j)) : 0 \leq j < p(n)\}$.

Working Hypothesis 1. There exists an integer sequence $a_n(i, j)$ of polynomial bit size with $0 \leq i < n$, $0 \leq j < p(n)$, with $p$ polynomially bounded, such that $\mathrm{uBit}(a_n(i, j))$ can be decided by $n^{1/\gamma(n)}$-succinct $\mathrm{TC}^0$ circuits, and for which the following holds:
– For any constant-free arithmetic circuit $\Phi$ of size $n$ over $m \leq n$ variables, if $\Phi(x_1, x_2, \ldots, x_m)$ computes a nonzero polynomial, then there exists $0 \leq j < p(n)$ such that $\Phi(a_n(0, j), a_n(1, j), \ldots, a_n(m-1, j)) \neq 0$.

The following is our randomness-to-hardness theorem:

Theorem 2. If Working Hypothesis 1 is true, then there does not exist a polynomial $p(n)$ and integer constant $d$ such that $\{\mathrm{per}_n\}$ has size $p(n)$ depth $d$ constant-free arithmetic circuits using division by constants, where in addition these circuits are $n^{1/\gamma(n)}$-succinct.

Proof. We will argue by contradiction, hence we start by assuming that for some constant $c_0 > 0$, $\{\mathrm{per}_n\}$ can be computed by a family $\{\Phi_n\}$ of size $n^{c_0}$ depth $d$ constant-free arithmetic circuits using division by constants. Furthermore, we assume that for all large enough $n$, these circuits are represented by a family of Boolean circuits $\{C_n\}$ with $O(\log n)$ inputs and size $n^{1/\gamma(n)}$. Hence by Corollary 1, we have the collapse $\mathrm{CH}/\mathrm{poly} \subseteq \text{nonuniform-}\mathrm{TC}^0$.

We proceed as in Ref. [Agr05] by using the black-box derandomization assumption to construct a hard polynomial by solving a system of linear equations. Let $a_n(i, j)$ be the integer sequence given by Working Hypothesis 1. Let $\{H_n\}$ be given as mentioned in the remark before Working Hypothesis 1. Let $k$ be such that we can take $p(n) \leq n^k$ in Working Hypothesis 1, and let $c$ be such that the bit size of any integer $a_n(i, j)$ is bounded by $n^c$. Choose positive $\epsilon < \min(k^{-1}, c^{-1})$.
Let $N = \lfloor n^{\epsilon\gamma(n)} \rfloor$, and let $m = \lceil \gamma(n) \log n \rceil$. Then $N^k < 2^m$. Let

$$f_m = \sum_{e=0}^{2^m - 1} c_m(e)\, x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m}, \qquad (1)$$

where $e_j$ denotes the $j$th bit of $e$. In other words, we sum over all strings in $\{0,1\}^m$ in the above. We want to take $c_m(e)$ to be a nonzero integer solution to the system (S) given by $f_m(b) = 0$, for all $b \in H_N$. These are at most $N^k$ linear equations in $2^m$ variables. Since $2^m > N^k$, we can get a nonzero solution.

The system (S) is slightly too large to manipulate directly. Coding (S) as an integer sequence weakly-definable in $\mathrm{CH}/\mathrm{poly}$ will allow us to indirectly let a solution finding procedure act on (S), due to Lemma 4. For this purpose we think of (S) as presented by a $2^m \times 2^m$ matrix $M_S$ (with $2^m - N^k$ zero rows). We code $M_S$ as an integer sequence by letting $A_n$ be the integer represented by the binary string $1^{2^m} 0 1^r 0\, \mathrm{list}(M_S)$, where $r := N^c m$ is an upper bound on the maximum bit length of entries of $M_S$, and $\mathrm{list}(M_S)$ is the concatenation of length $r$ binary representations of the entries of $M_S$ (say left-to-right, top-to-bottom). Define $\ell(n)$ to be the bit length of $A_n$.

4.1 Checking that $(A_n)$ is Weakly-Definable in CH/poly

The goal of this subsection is to prove the following lemma:

Lemma 5. Given that $\gamma(n)$ is an unbounded monotone nondecreasing function such that for all $n$, $\gamma(n) < \log \log n$, we have that $(A_n)$ is an integer sequence of bit size $\ell(n)$, where for all but finitely many $n$, $\ell(n) \leq n^{4\gamma(n)}$. Furthermore, $(A_n)$ is weakly-definable in $\mathrm{CH}/\mathrm{poly}$.

Proof. For the bit size of $A_n$ we can give an upper bound of $2^m + 2 + N^c m + 2^{2m} N^c m = O(n^{\gamma(n)(2+\epsilon c)} \lceil \gamma(n) \log n \rceil)$. This is at most $n^{4\gamma(n)}$, provided $n$ is large enough.

Next we show that $(A_n)$ is weakly-definable in $\mathrm{CH}/\mathrm{poly}$. On input $(1^n, i, b)$, we assume that $1^m = 1^{\lceil \gamma(n) \log n \rceil}$ and $N = \lfloor n^{\epsilon\gamma(n)} \rfloor$ (in binary) are given as advice (for technical convenience). Similarly, we assume that the bit length $r = N^c m$ (in binary) in the coding of $M_S$ is given as advice.
From the binary index $i$, we can then easily compute $e \in \{0,1\}^m$, $j$ with $0 \le j < p(n)$, and $k \le r$ such that $(A_n)_i$ equals the $k$th bit of $d_N(j, e)$, where, indexing the bits of $e$ as $e_0, e_1, \ldots, e_{m-1}$,

$$d_N(j, e) := \prod_{p=0}^{m-1} a_N(p, j)^{e_p}.$$

Claim 1 There exists a TC$^0$-circuit family $\{D_n\}$ such that

– $|D_n| = n^{O(\gamma(n))}$.
– The family $\{D_n\}$ is $(O(\gamma(n) \log n), 3n^{\epsilon})$-succinct.
– $D_n$ has input gates so it can be given $e \in \{0,1\}^m$, $0 \le j < p(n)$ in binary, and $k \le r$ in binary.
– $D_n(e, j, k)$ outputs the $k$th bit of $d_N(j, e)$.

Proof. Recall that $\mathrm{uBit}(a_n) = \{(1^n, i, j, k, b) : \text{the } k\text{th bit of } a_n(i, j) \text{ equals } b\}$. Working Hypothesis 1 gives that we have an $(O(\log n), n^{1/\gamma(n)})$-succinct family $\{E_n\}$ of TC$^0$ circuits, such that $E_n(1^n, i, j, k, b) = 1$ iff the $k$th bit of $a_n(i, j)$ equals $b$. By adding $O(\log n)$ circuitry to the nonuniform circuits representing $\{E_n\}$, we can hardcode the first $n$ inputs to the circuit $E_n$ to be 1, so let us assume without loss of generality that $E_n$ takes the input tuple $(i, j, k, b)$ instead.

Let us now describe the circuit $D_n$. We take $mr$ copies of $E_N$ by ranging over all values for $0 \le i \le m-1$ and $1 \le k \le r$, and hardcoding these values into the corresponding copy. One single copy of $E_N$ is of size $N^{O(1)} = n^{O(\gamma(n))}$, and it is represented by Boolean circuits with $O(\gamma(n) \log n)$ many inputs of size $N^{1/\gamma(N)} \le n^{\epsilon\gamma(n)/\gamma(N)} \le n^{\epsilon}$. We have that $mr = N^c m^2 = n^{O(\gamma(n))} (\lceil \gamma(n) \log n \rceil)^2 = n^{O(\gamma(n))}$. Now the important observation is that we can do this $mr$-fold duplication by adding $O(\log mr) = O(\gamma(n) \log n)$ bits to gate names, and adding $\mathrm{poly}(\log mr) = \mathrm{polylog}(n)$ to the size of the Boolean circuit that represents the individual copy. This way we construct TC$^0$ circuits that are $(O(\gamma(n) \log n), 2n^{\epsilon})$-succinct and that compute $m$ many $r$-bit sequences encoding $a_N(0, j), a_N(1, j), \ldots, a_N(m-1, j)$. Within the succinctness constraints, we obtain $m$ many $r$-bit strings giving $a_N(0, j)^{e_0}, a_N(1, j)^{e_1}, \ldots, a_N(m-1, j)^{e_{m-1}}$, by just masking using the bits $e_0, e_1, \ldots, e_{m-1}$.
We add below this the DLOGTIME-uniform TC$^0$ circuits that compute iterated multiplication from Ref. [HAB01]. Note the input size in this case is $n^{O(\gamma(n))}$. This means we have a Boolean circuit of size $O(\gamma(n) \log n (\log \gamma(n) + \log\log n))$ with $O(\gamma(n) \log n)$ inputs representing the circuit for iterated multiplication. Merging this representation with the representation of the circuit computing $a_N(0, j), a_N(1, j), \ldots, a_N(m-1, j)$ can easily be done to yield a $(O(\gamma(n) \log n), 3n^{\epsilon})$-succinct TC$^0$ circuit family. This ends the description of the circuit $D_n$. To summarize, we have given TC$^0$ circuits computing $d_N(j, e)$ of size at most $n^{O(\gamma(n))}$ that are represented by Boolean circuits with $O(\gamma(n) \log n)$ many inputs, and having size $3n^{\epsilon}$, for all large enough $n$. □

We now use the above claim to finish the proof of Lemma 5. Similar to the proof of Lemma 4, we use the circuit family $\{D_n\}$ to scale up to the counting hierarchy. The Boolean circuits of size $3n^{\epsilon}$ with $O(\gamma(n) \log n)$ many inputs that represent this circuit family are given as advice. Let $d'$ be a bound on the depth of the TC$^0$ family $\{D_n\}$. For $0 \le i \le d'$, let $L_i$ be the language of tuples $(G, 1^n, (e, j, k), b)$ such that the condition “$G$ is the name of a gate on level $i$ in $D_n$. It outputs $b$ when $D_n$ is given input $(e, j, k)$ with $e \in \{0,1\}^m$, $0 \le j < p(n)$ in binary, and $k \le r$ in binary, where $m = \lceil \gamma(n) \log n \rceil$, $N = \lfloor n^{\epsilon\gamma(n)} \rfloor$, $r = N^c m$” holds.

We will prove by induction on $i$ that $L_i \in$ CH/poly. Consider the input $(G, 1^n, (e, j, k), b)$. Note that it is of length $n + O(\gamma(n) \log n) = n + O(\log n \log\log n)$, where $n$ bits of this input are used for expressing $n$ in unary. Note that for technical convenience we can again assume $N$, $r$ (in binary) and $1^m$ are given as advice. Names of gates in $D_n$ have length $O(\gamma(n) \log n) = O(\log n \log\log n)$. We have Boolean circuits $B_n$ of size $O(n^{\epsilon})$ with $O(\gamma(n) \log n) = O(\log n \log\log n)$ inputs that can answer connection queries about $D_n$. These will be part of the advice as well.
For the base case $i = 0$, it is easy to check which variable the gate $G$ is labeled with, since for a gate labeled with a variable $x_\ell$, $\ell$ (in binary of length $O(\log n \log\log n)$) is part of the gate name. Then one just needs to fetch the $\ell$th bit of $(e, j, k)$. Note that $(e, j, k)$ has length $O(\gamma(n) \log n)$, so we can just scan the input string and keep a counter. Gates labeled by Boolean constants are dealt with even more easily, as these constants appear in the gate name itself. To check whether $G$ is on level 0 we can assume without loss of generality that this information can be obtained from the gate name (or, if we want, we can add another level of oracle calling to the argument below by making existential queries to $B_n$ of the form “Does there exist $H$ such that $G$ is a child of $H$?”), so we will ignore this detail.

Now assume $L_i \in$ CH/poly. By Torán’s [Tor91] characterization of the counting hierarchy it suffices to show that we can decide $L_{i+1}$ in $\mathrm{PP}^{L_i}$/poly. This is done as follows. We assume the gate $G$ is of majority type. Negation gates are handled by a straightforward extension of the argument below. Let $\mathcal{N}$ be the following nondeterministic Turing machine. On input $(G, 1^n, (e, j, k), b)$ it nondeterministically guesses the $O(\gamma(n) \log n) = O(\log n \log\log n)$ size name of a gate $H$, and uses the advice $B_n$ to check that $H \to G$ is a wire in $D_n$. If this is not true, it nondeterministically flips a bit $b'$ and accepts if $b' = 1$, rejects if $b' = 0$. Otherwise, it queries whether $(H, 1^n, (e, j, k), b) \in L_i$. It accepts if the answer to this query is yes, and rejects otherwise. Observe that $\mathcal{N}$ accepts on the majority of its nondeterministic guesses iff the majority of the inputs to $G$ are outputting $b$ in $D_n(e, j, k)$. The machine $\mathcal{N}$ runs in nondeterministic time $O(n)$. This puts $L_{i+1}$ in $\mathrm{PP}^{L_i}$/poly.

The conclusion is that the language consisting of all tuples $(1^n, j, e, k, b)$ subject to the condition

– the $k$th bit of $d_N(j, e)$ equals $b$, where $N = \lfloor n^{\epsilon\gamma(n)} \rfloor$, and
– $k \le r$, $0 \le j < p(n)$, and $e \in \{0,1\}^m$,

is in CH/poly.
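The counting step of the induction can be made concrete with a small sequential simulation. The sketch below is only an illustration under simplifying assumptions: it enumerates every guessed gate name instead of branching nondeterministically, it takes the wire relation as an explicit set standing in for the advice circuits, and it assumes majority gates of odd fan-in so that ties do not occur. All names are hypothetical.

```python
def pp_step_accepts(gate, b, all_gates, is_wire, outputs):
    """Count the computation paths of the machine for one majority gate.

    Each guessed name h that is not a child of `gate` contributes two paths
    (coin flip: one accepting, one rejecting). Each child contributes one
    path, accepting iff the oracle says the child outputs b.
    """
    accepting = 0
    total = 0
    for h in all_gates:
        if not is_wire(h, gate):
            accepting += 1
            total += 2
        else:
            total += 1
            if outputs(h, b):
                accepting += 1
    # Acceptance in the PP sense: strictly more than half of all paths accept.
    return 2 * accepting > total


def gate_outputs(gate, b, inputs, wires, all_gates):
    """Decide whether `gate` outputs bit b (the role played by membership in L_i)."""
    if gate in inputs:  # level 0: read the input bit directly
        return inputs[gate] == b
    return pp_step_accepts(
        gate,
        b,
        all_gates,
        is_wire=lambda h, g: (h, g) in wires,
        outputs=lambda h, bit: gate_outputs(h, bit, inputs, wires, all_gates),
    )


# Toy circuit (hypothetical): one majority gate g over three input gates.
toy_inputs = {"x0": 1, "x1": 1, "x2": 0}
toy_wires = {("x0", "g"), ("x1", "g"), ("x2", "g")}
toy_gates = ["x0", "x1", "x2", "g"]
```

Here `gate_outputs("g", 1, toy_inputs, toy_wires, toy_gates)` holds, because the non-children contribute equally many accepting and rejecting paths and two of the three children output 1.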
Hence, by the remarks before Claim 1, we get that $(A_n)$ is weakly-definable in CH/poly. □

4.2 Coefficients of $f_m$ Are Weakly-Definable in CH/poly

Next we want to apply a solution finding procedure to (S). For this purpose, we let $F : \{0,1\}^* \to \{0,1\}^*$ be the following poly-time computable mapping: On input $x$ of length $\tilde{n}$, try to parse $x = 1^{2^{\tilde{m}}}\, 0\, 1^{\tilde{r}}\, 0\, y$, for some integers $\tilde{r}$, $\tilde{m}$ and $y \in \{0,1\}^*$ with $|y| = 2^{2\tilde{m}} \tilde{r}$. If this fails, output 0. Otherwise, construct the $2^{\tilde{m}} \times 2^{\tilde{m}}$ matrix $M$ whose (left-to-right, top-to-bottom) entries are given by consecutive $\tilde{r}$-bit blocks of $y$. Then using standard tools$^8$, try to compute a nonzero integer $2^{\tilde{m}}$-vector $c$ such that $Mc = 0$. Output $c$ if this succeeds, 0 otherwise.

We define $\alpha$ to be the absolute integer constant such that for all large enough $\tilde{n}$, $F$ runs within time $\tilde{n}^{\alpha}$. This implies that for all large enough $\tilde{n}$, for any $x \in \{0,1\}^{\tilde{n}}$, $|F(x)| \le \tilde{n}^{\alpha}$.

By Lemma 5, $(A_n)$ is weakly-definable in CH/poly. Clearly, since $\gamma(n)$ is assumed to be a monotone growing function, for any constant $c' \ge 1$ we have $\ell(n)^{c'/\gamma(\ell(n))} = n^{O(1)}$, as for all but finitely many $n$, $n \le \ell(n) \le n^{4\gamma(n)}$. Hence we can apply Lemma 4. We get that $F(A_n)$ is an integer sequence weakly-definable in CH/poly. We have that $F(A_n)$ encodes a $2^m$-vector of integers. Indexing this vector by $e \in \{0,1\}^m$, we let $c_m(e)$ be the integer encoded by the $e$th entry, i.e. $c_m(e) = F(A_n)_e$, and let $f_m$ be the polynomial given by fixing these integer coefficients in Equation (1). We have the following properties:

– $\forall^\infty n$, $c_m(e)$ has bit size at most $\ell(n)^{\alpha} \le 2^{4\alpha\gamma(n) \log n} \le 2^{4\alpha m}$.
– $L := \{(1^n, e, j, b) : e \in \{0,1\}^m,\ j \in \{0,1\}^{4\alpha m}, \text{ and } c_m(e)_j = b, \text{ where } m = \lceil \gamma(n) \log n \rceil\} \in$ CH/poly.

4.3 Finishing Up using the Completeness of Permanent

First we prove the following claim:

Claim 2 $f_m = \mathrm{per}_{m'}(M')$ for $m' = 2^{o(m)}$, where $M'$ is a matrix whose entries are in $\{x_1, x_2, \ldots, x_m\} \cup \{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1\} \cup \{2^{2^0}, 2^{2^1}, \ldots, 2^{2^{4\alpha m}}\}$.

Proof.
Recall the bit sizes of $c_m(e)$ are bounded by $2^{4\alpha m}$. Let $c_m(e)_i$ denote the $i$th bit of $c_m(e)$, and let $s_m(e)$ be the sign bit of $c_m(e)$. Write

$$f_m = \sum_{e=0}^{2^m-1} \sum_{i=0}^{2^{4\alpha m}-1} (-1)^{s_m(e)}\, c_m(e)_i\, 2^{i}\, x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m}.$$

Since $L \in$ nonuniform-TC$^0$ $\subseteq$ nonuniform-NC$^1$, we have a Boolean formula $C_m$ for computing $c_m(e)_i$, given $i$ in binary of length $4\alpha m$ bits and $e \in \{0,1\}^m$ and $n$ in unary, of size $\mathrm{poly}(4\alpha m + n + \log m) = n^{O(1)}$. The depth of $C_m$ is $O(\log n)$ and the fan-in can be assumed to be bounded by two for each gate.

8 If $M$ is of full rank, 0 must be returned. Otherwise, one way to compute $c$, while avoiding divisions so the result is integer, would be to first compute a set of independent rows of $M$, say with indices in $I$. Then form an invertible matrix $M'$ by incrementally extending this set with standard basis vectors. Now a nonzero integer solution can be selected to be any column with index $\notin I$ of the adjugate $\mathrm{adj}(M')$ of $M'$. The latter matrix is defined entirely in terms of determinants of minors of $M'$ and satisfies $M'\, \mathrm{adj}(M') = \det(M') I$.

Fact 1 Since $\gamma(n)$ is an unbounded monotone function and $m = \lceil \gamma(n) \log n \rceil$, we can say that the family $\{C_m\}$ of Boolean formulas has depth $o(m)$.

Hardcoding the unary input $1^n$ into the formula $C_m$, and performing a straightforward arithmetization where we replace $(x \vee y)$ by $x + y - xy$, $(x \wedge y)$ by $xy$, and $\neg x$ by $(1 - x)$, we obtain an arithmetic circuit $\Psi'_m$ on inputs $\alpha_1, \alpha_2, \ldots, \alpha_m, \beta_1, \beta_2, \ldots, \beta_{4\alpha m}$ of depth $o(m)$ that coincides with $C_m$ on 0,1-inputs. Similarly, we obtain an arithmetic formula $\Psi_m$ of depth $o(m)$ that coincides with $(-1)^{s_m(e)} C_m(e, i)$ on 0,1-inputs. By duplicating nodes, we can assume $\Psi_m$ is a formula of depth $o(m)$ and size $2^{o(m)}$. Hence

$$f_m = \sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Psi_m(e, i)\, 2^{\mathrm{nat}(i)}\, x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m},$$

where $\mathrm{nat}(i) \in \mathbb{N}$ denotes the integer represented by the binary string $i \in \{0,1\}^{4\alpha m}$. It is now easy to write $f_m$ as an exponential sum of the form

$$f_m = \sum_{b \in \{0,1\}^{s(m)}} \Gamma_m(x_1, x_2, \ldots, x_m, b, 2^{2^0}, 2^{2^1}, \ldots, 2^{2^{4\alpha m - 1}}),$$

for a $2^{o(m)}$ size arithmetic formula $\Gamma_m$ and $s(m) = \Theta(m)$. Namely, take

$$\Gamma_m(x_1, \ldots, x_m, \alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_{4\alpha m}, \delta_1, \ldots, \delta_{4\alpha m}) = \Psi_m(\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_{4\alpha m}) \prod_{j=1}^{4\alpha m} (\beta_j \delta_j + 1 - \beta_j) \prod_{j=1}^{m} (\alpha_j x_j + 1 - \alpha_j).$$

Then

$$\sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Gamma_m(x, e, i, 2^{2^0}, 2^{2^1}, \ldots, 2^{2^{4\alpha m - 1}})$$
$$= \sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Psi_m(e, i) \prod_{j=1}^{4\alpha m} (i_j 2^{2^{j-1}} + 1 - i_j) \prod_{j=1}^{m} (e_j x_j + 1 - e_j)$$
$$= \sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Psi_m(e, i) \prod_{j=1}^{4\alpha m} 2^{2^{j-1} i_j}\; x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m}$$
$$= \sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Psi_m(e, i)\, 2^{\sum_{j=1}^{4\alpha m} 2^{j-1} i_j}\; x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m}$$
$$= \sum_{e \in \{0,1\}^m} \sum_{i \in \{0,1\}^{4\alpha m}} \Psi_m(e, i)\, 2^{\mathrm{nat}(i)}\, x_1^{e_1} x_2^{e_2} \cdots x_m^{e_m} = f_m.$$

By Proposition 1, the above implies $f_m = \mathrm{per}_{m'}(M')$ for $m' = 2^{o(m)}$, where $M'$ is a matrix whose entries are in $\{x_1, x_2, \ldots, x_m\} \cup \{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1\} \cup \{2^{2^0}, 2^{2^1}, \ldots, 2^{2^{4\alpha m - 1}}\}$. This proves the claim. □

We can now finish the proof of Theorem 2. Applying Proposition 2 to the constant-free arithmetic circuit $\Phi_{m'}$ computing $\mathrm{per}_{m'}$ gives a constant-free arithmetic circuit of size $s := O((m')^{2c_0}) = 2^{o(m)}$ computing $a_{m'} \cdot \mathrm{per}_{m'}$, for some $a_{m'} \in \mathbb{Z} \setminus \{0\}$, together with a constant-free arithmetic circuit of size $2^{o(m)}$ computing $a_{m'}$. Consider $f'_m = 2^{m'} a_{m'} \cdot f_m$. Observe that $f'_m$ is a nonzero polynomial in $m \ll N$ variables that vanishes on $H_N$. Hence we have that

Fact 2 $f'_m$ does not have constant-free arithmetic circuits of size $N$.

By Claim 2, we have that $f'_m = a_{m'} \cdot \mathrm{per}_{m'}(2M')$. We can put together $2M'$ using $O((m')^2) = 2^{o(m)}$ constant-free circuitry. Combining with the constant-free arithmetic circuits computing $a_{m'}$, we obtain $s' := 2^{o(m)}$ size constant-free arithmetic circuits for $f'_m$. Recalling that $N = \lfloor 2^{\epsilon\gamma(n) \log n} \rfloor$ and $m = \lceil \gamma(n) \log n \rceil$, we have that for all but finitely many $n$, $s' < N$. We have arrived at a contradiction with Fact 2.
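The two product identities used in the proof of Claim 2, namely that $\prod_j (i_j 2^{2^{j-1}} + 1 - i_j) = 2^{\mathrm{nat}(i)}$ and $\prod_j (e_j x_j + 1 - e_j) = x_1^{e_1} \cdots x_m^{e_m}$ on 0,1-valued $i$ and $e$, can be checked exhaustively for small lengths. The following Python sketch does this; the function names are introduced only for this illustration, and `nat` is assumed to read its first bit as the least significant one, matching the exponent $2^{j-1}$ attached to $i_j$.

```python
from itertools import product


def selector_product(bits, values):
    """prod_j (bit_j * value_j + 1 - bit_j): keeps value_j exactly when bit_j = 1."""
    result = 1
    for bit, value in zip(bits, values):
        result *= bit * value + 1 - bit
    return result


def nat(bits):
    """Integer sum_j 2^(j-1) * i_j, with bits[0] playing the role of i_1."""
    return sum(bit << position for position, bit in enumerate(bits))


def check_power_identity(length):
    """Check prod_j (i_j * 2^(2^(j-1)) + 1 - i_j) == 2^nat(i) for all i."""
    constants = [2 ** (2 ** j) for j in range(length)]  # 2^(2^0), 2^(2^1), ...
    return all(
        selector_product(i, constants) == 2 ** nat(i)
        for i in product((0, 1), repeat=length)
    )


def check_monomial_identity(point):
    """Check prod_j (e_j * x_j + 1 - e_j) == x^e at one integer point x."""
    for e in product((0, 1), repeat=len(point)):
        expected = 1
        for x_j, e_j in zip(point, e):
            expected *= x_j ** e_j
        if selector_product(e, point) != expected:
            return False
    return True
```

For instance, with bits $(1, 0, 1)$ the product selects $2^{2^0} \cdot 2^{2^2} = 2 \cdot 16 = 32 = 2^5$, and $\mathrm{nat}((1,0,1)) = 5$.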
□

5 Proving a Weak Derandomization Hypothesis Unconditionally

We do not know how to prove Working Hypothesis 1, but as we will see in Section 6, we do not need to in order to obtain the sought-after lower bound for the permanent as in the conclusion of Theorem 2. What we can establish is the following theorem. Note that there is no restriction to constant depth.

Theorem 3. There exists an integer sequence $a_n(i)$ of polynomial bit size with $0 \le i < n$, such that $a_n(i)$ is weakly-definable in the polynomial hierarchy, and for which the following holds:

– For any arithmetic circuit $\Phi$ of size $n$ over $m \le n$ variables and constants in $\Gamma_n := \{2^{2^0}, 2^{2^1}, \ldots, 2^{2^n}\} \cup \{-1, 0, 1\}$, if $\Phi(x_1, x_2, \ldots, x_m)$ computes a nonzero polynomial, then $\Phi(a_n(0), a_n(1), \ldots, a_n(m-1)) \neq 0$.

Proof. Let $\mathcal{C}_n$ be the set$^9$ of all arithmetic circuits of size $n$ over $m \le n$ variables using constants in $\Gamma_n$. For some constant $c > 0$, we can bound $|\mathcal{C}_n| \le 2^{cn^2}$, provided $n$ is large enough. Note that circuits in $\mathcal{C}_n$ can compute polynomials with degree at most $2^n$. Let $S = \{1, 2, \ldots, 2^{n^3+n}\}$. If we pick $s_1, \ldots, s_n$ independently and uniformly at random from $S$, then by Lemma 1, for any nonzero polynomial $f$ computed by a circuit in $\mathcal{C}_n$,

$$\Pr[f(s_1, s_2, \ldots, s_n) = 0] \le 2^n / 2^{n^3+n} = 2^{-n^3}.$$

Hence by the union bound,

$$\Pr[\exists \text{ nonzero } f \text{ computed by a circuit} \in \mathcal{C}_n,\ f(s_1, s_2, \ldots, s_n) = 0] \le 2^{cn^2 - n^3} < 1.$$

This means there exists at least one $s \in S^n$ such that for any nonzero polynomial $f$ computed by a circuit from $\mathcal{C}_n$, $f(s_1, s_2, \ldots, s_n) \neq 0$. For $s = (s_1, s_2, \ldots, s_n) \in S^n$ define the predicate $P(s)$ by

$$\forall \Psi \in \mathcal{C}_n,\ (\text{if } \Psi(s_1, s_2, \ldots, s_n) = 0, \text{ then } \forall t \in S^n,\ \Psi(t_1, t_2, \ldots, t_n) = 0). \qquad (2)$$

9 If $f$ is computed by a circuit in the class $\mathcal{C}_n$, then it is defined over $m \le n$ variables, but in our notation we will still write $f(s_1, s_2, \ldots, s_n)$ with it being understood that $s_{m+1}, s_{m+2}, \ldots, s_n$ are not being used.
Observe that if $P(s)$ is true, then for any nonzero $f$ computed by a circuit from $\mathcal{C}_n$, $f(s_1, s_2, \ldots, s_n) \neq 0$. Also observe that both universal quantifiers in (2) range over sets whose elements can be described by $\mathrm{poly}(n)$ size strings, assuming encodings of circuits using type names of size $O(\log n)$ as defined in Section 3. Hence if we can argue that given $\Psi \in \mathcal{C}_n$ and $u \in S^n$, we can decide whether $\Psi(u_1, u_2, \ldots, u_n) = 0$ in PH, then the predicate $P(s)$ is PH-decidable. Consequently, we can define $a_n$ to be the lexicographically least element $s$ of $S^n$ such that $P(s)$ holds. This makes $a_n$ computable in $\mathrm{P}^{\mathrm{PH}}$ by binary search. This implies that $a_n$ is weakly-definable in PH.

We complete the proof by showing that given $\Psi \in \mathcal{C}_n$ and $u \in S^n$, we can decide whether $\Psi(u_1, u_2, \ldots, u_n) = 0$ in PH. The only problem that may arise when evaluating $\Psi(u_1, u_2, \ldots, u_n)$ is that intermediate values get too large. For this we employ the idea of Ibarra and Moran [IM83] by evaluating modulo a prime number in some large range. Namely, consider primes $p$ in the range $\{1, 2, \ldots, 2^{n^2}\}$. Each such prime takes $n^2$ many bits. We can evaluate $\Psi(u_1, u_2, \ldots, u_n) \bmod p$ in polynomial time$^{10}$. By the Prime Number Theorem, the number of primes in this range is at least $\frac{2^{n^2}}{\log 2^{n^2}} > 2^{2n}$, provided $n$ is large enough. By a simple induction, it follows that the value $|\Psi(u_1, u_2, \ldots, u_n)|$ can be bounded by $2^{2^{2n}}$. Since a product of more than $2^{2n}$ distinct primes exceeds $2^{2^{2n}}$, it cannot be that all primes $\le 2^{n^2}$ divide $\Psi(u_1, u_2, \ldots, u_n)$, if $\Psi(u_1, u_2, \ldots, u_n) \neq 0$. In other words,

$$\Psi(u_1, u_2, \ldots, u_n) = 0 \iff \forall p \in \{1, 2, \ldots, 2^{n^2}\}, \text{ if } p \text{ is prime, then } \Psi(u_1, u_2, \ldots, u_n) \bmod p = 0.$$

Agrawal, Kayal and Saxena [AKS02] proved primality testing is in polynomial time. Hence we get that the above is a $\Pi_1^{\mathrm{P}}$-predicate.
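The zero test by modular evaluation can be sketched as follows. This is a toy illustration, not the procedure at the scale used in the proof: the circuit encoding as a list of tuples, the gate type names, and the use of trial division in place of a polynomial-time primality test are all assumptions of the example, and the prime bound must be chosen by the caller so that the product of the primes below it exceeds the absolute value being tested.

```python
def tower_mod(k, p):
    """Compute 2^(2^k) mod p by k repeated squarings modulo p."""
    value = 2 % p
    for _ in range(k):
        value = value * value % p
    return value


def eval_mod(circuit, point, p):
    """Evaluate a straight-line arithmetic circuit modulo p, gate by gate.

    Gates: ("var", index), ("const", c) with c in {-1, 0, 1},
    ("tower", k) for the constant 2^(2^k), and ("add", a, b) / ("mul", a, b)
    referring to earlier gates by position. The last gate is the output.
    """
    values = []
    for gate in circuit:
        kind = gate[0]
        if kind == "var":
            values.append(point[gate[1]] % p)
        elif kind == "const":
            values.append(gate[1] % p)
        elif kind == "tower":
            values.append(tower_mod(gate[1], p))
        elif kind == "add":
            values.append((values[gate[1]] + values[gate[2]]) % p)
        elif kind == "mul":
            values.append(values[gate[1]] * values[gate[2]] % p)
        else:
            raise ValueError(f"unknown gate type: {kind}")
    return values[-1]


def is_prime(q):
    """Trial division; a simple stand-in for a polynomial-time primality test."""
    if q < 2:
        return False
    d = 2
    while d * d <= q:
        if q % d == 0:
            return False
        d += 1
    return True


def vanishes(circuit, point, prime_bound):
    """True iff the circuit value is 0 modulo every prime up to prime_bound."""
    return all(
        eval_mod(circuit, point, q) == 0
        for q in range(2, prime_bound + 1)
        if is_prime(q)
    )


# Toy circuit computing x_0^2 - 2^(2^2) = x_0^2 - 16.
EXAMPLE = [
    ("var", 0),
    ("mul", 0, 0),
    ("tower", 2),
    ("const", -1),
    ("mul", 2, 3),
    ("add", 1, 4),
]
```

On `EXAMPLE`, the test reports zero at the point `(4,)` and nonzero at `(5,)` with `prime_bound = 50`; intermediate values never exceed the modulus, which is the point of evaluating modulo $p$.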
□

6 Proof of Theorem 1

It is sufficient to show the following claim:

Claim 3 For any unbounded monotone function $\gamma(n) = o(\log\log n)$, there does not exist a polynomial $p(n)$ such that $\forall^\infty n$, $\mathrm{per}_n$ has size $p(n)$ depth $d$ constant-free arithmetic circuits using divisions by constants, where in addition these circuits are $n^{1/\gamma(n)}$-succinct.

Proof. We argue by contradiction. Let $\gamma(n)$ be given as in the claim, and suppose that for some polynomial $p(n)$, for all large enough $n$, $\mathrm{per}_n$ has size $p(n)$ depth $d$ constant-free arithmetic circuits using divisions by constants, where in addition these circuits are $n^{1/\gamma(n)}$-succinct.

10 As a technical detail, we remark that for large numbers like $2^{2^n}$, which appear as an $O(\log n)$ type name in the encoding of $\Psi$, we can compute $(2^{2^n} \bmod p)$ by performing $n$ repeated squarings computed modulo $p$.

By Lemma 2, we get that we can compute $\mathrm{per}_n(M)$ over $\mathbb{Z}$, where entries of $M$ are in $\{0, 1\}$, by $O(n^{1/\gamma(n)})$-succinct TC$^0$ circuits. By Toda’s Theorem and Valiant’s completeness result for $\mathrm{per}_n$, any $L \in$ PH can be decided in polynomial time with a single query to the 0,1-permanent. We can think of this as a three-stage process: first apply a function $Q \in$ FP to the input $x$, then obtain $\mathrm{per}(Q(x))$, finally apply $R \in$ FP to produce the output $R(\mathrm{per}(Q(x)))$. Since FP $\subseteq$ #P, Lemma 3 implies that for some constant $b$, we obtain $n^{b/\gamma(n)}$-succinct TC$^0$-circuits for $Q$ and $R$. Combining all three levels of TC$^0$ circuits yields TC$^0$-circuits for $L$, where for some constant $c$ depending on $L$, this family is $n^{c/\gamma(n)}$-succinct. The constant $c$ can be picked larger than $b$ to accommodate the increase in size from joining the three representations. Without loss of generality assume $c > 1$.

In particular, for the integer sequence $a_n(i)$ of Theorem 3, we get that $\mathrm{uBit}(a_n)$ can be decided by $n^{c/\gamma(n)}$-succinct TC$^0$ circuits, for some constant $c > 1$. This means that Working Hypothesis 1 is satisfied for $\gamma'(n) := c^{-1}\gamma(n)$ instead of $\gamma(n)$.
Since Theorem 2 holds for any unbounded monotone growing function $\gamma(n)$ where $\forall^\infty n$, $\gamma(n) < \log\log n$, we can apply Theorem 2 with $\gamma'(n)$ instead, to yield that there does not exist a polynomial $q(n)$ and integer constant $d'$ such that, for all large enough $n$, $\mathrm{per}_n$ has size $q(n)$ depth $d'$ constant-free arithmetic circuits using divisions by constants, where in addition these circuits are $n^{1/\gamma'(n)}$-succinct. Since $1/\gamma'(n) > 1/\gamma(n)$, this contradicts the assumption that for all large enough $n$, $\mathrm{per}_n$ has size $p(n)$ depth $d$ constant-free arithmetic circuits using divisions by constants, where in addition these circuits are $n^{1/\gamma(n)}$-succinct. □

7 Advice Elimination

In this section TC$^0$ refers to the DLOGTIME-uniform version of the class. QTC$^0$ is the quasi-polynomial analogue. QP is deterministic quasi-polynomial time, and BPQP is probabilistic quasi-polynomial time.

We show how to extend Allender’s lower bound for the permanent to a lower bound against a small amount of advice. The technique used is advice elimination - we show that if the permanent is in TC$^0$ with a small (polylogarithmic) amount of advice, then the permanent is in fact in QTC$^0$.

Allender [All99] showed that the permanent does not have uniform threshold circuits of “fractional sub-exponential” size, where a fractional sub-exponential function $s$ is one such that for any integer $k$, $s^k(n) = o(2^n)$. The following is a consequence.

Theorem 4. The permanent is not in QTC$^0$.

In order to show our advice elimination result, we use the following lemma, which is implicit in work by Trevisan and Vadhan [LS07], and follows from the fact that the permanent has instance checkers [LFKN92].

Lemma 6. If the permanent is in BPP/O(polylog(n)), then the permanent is in BPQP.

We now show how to do the advice elimination when the lower bound is against very weak circuits, i.e., uniform constant-depth threshold circuits. Our proof has three stages.
First we use Lemma 6 to show that if the permanent is in TC$^0$ with polylogarithmic advice, then it is in BPQP. Next, we use the downward self-reducibility property for SAT to show that the assumption also implies that BPP $\subseteq$ QP and hence by a translation argument the permanent is in QP. Finally, we use local checkability for the P-complete Circuit Value Problem (CVP) to get the collapse all the way down to QTC$^0$. Indeed, our advice elimination works against any “reasonable” class of uniform circuits between AC$^0$ and P, but we will not state the result in its full generality.

Theorem 5. If the permanent is in TC$^0$/O(polylog(n)), then the permanent is in QTC$^0$.

Proof. If the permanent is in TC$^0$/O(polylog(n)), then the permanent is in BPP/O(polylog(n)), since TC$^0$ $\subseteq$ BPP. Using Lemma 6, we have that the permanent is in BPQP.

Now using the assumption again, we have that SAT $\in$ P/O(polylog(n)), since SAT m-reduces to the permanent by Toda’s theorem [Tod91], and TC$^0$ $\subseteq$ P. SAT $\in$ P/O(polylog(n)) implies that NP $\subseteq$ QP [BU92], and hence PH $\subseteq$ QP. Since BPP is in the polynomial hierarchy [Lau83,Sip83], we get that BPP $\subseteq$ QP, from which it follows by translation that BPQP $\subseteq$ QP and hence the permanent is in QP.

Now using the assumption a third time, we have that CVP $\in$ TC$^0$/O(polylog(n)), since TC$^0$ $\subseteq$ P and CVP m-reduces to the permanent via an AC$^0$ reduction [Zan91,All99]. We will show that this implies that CVP $\in$ QTC$^0$ and hence QP $\subseteq$ QTC$^0$. The basic idea is to use local checkability of CVP to verify that the answer given by a certain advice string for an instance is correct. More specifically, suppose < C, x > is the input. Given an advice string a, we would like to verify that the uniform TC$^0$ circuit with advice a gives the correct answer on < C, x >.
To do this, it is enough to check that for every gate g of C with input gates g1 and g2 (which might be constants or literals), the answer given by the TC$^0$ circuit with advice a on < g, x > is consistent with the answers given on < g1, x > and < g2, x > (or with the values for g1 and/or g2 specified by x, if g1 and/or g2 are constants or literals). Such a check is what we call a local check. Note that these are also questions about CVP. We assume an encoding of CVP in which inputs can be padded so that it makes sense to use advice a for all questions about smaller sub-circuits.

Note that if an advice string a is the correct one for length | < C, x > |, then all the local checks on input < C, x > succeed, whereas if an advice string gives the wrong answer for < C, x >, at least one of the local checks must fail. All the local checks for all (quasi-polynomially many) possible advice strings can be performed in parallel by uniform TC$^0$ circuits. If there is some advice string for which all the local checks succeed and the advice-taking circuit outputs 1, then the uniform circuit outputs 1, otherwise it outputs 0. The uniform circuits are of depth at most 2 greater than the advice-taking TC$^0$ circuits presumed to compute CVP, and they are of quasi-polynomial size, hence we have that P $\subseteq$ QTC$^0$. Whence by a translation argument, QP $\subseteq$ QTC$^0$. Together with our earlier conclusion that the permanent is in QP, we have that the permanent is in QTC$^0$.

Using Theorem 5 in conjunction with Theorem 4 and Lemma 2, we have the following corollary.

Corollary 2. The permanent does not have O(polylog(n))-succinct constant-depth arithmetic circuits of polynomial size.

Since the advice elimination argument requires multiple uses of the assumption, we do not know how to use it to rule out $n^{o(1)}$-succinct circuits, which is what is done in the main result of this paper.

8 Conclusion

Let us briefly discuss some open problems. One obvious task would be to improve the parameters of our result.
As a matter of fact we have already obtained some recent improvements using a different framework involving high degree univariate polynomial identity testing [JS11]. Still, using low-degree multivariate polynomial identity testing as the target for derandomization may be easier, but there is the problem of connecting this up to lower bounds for the permanent. In this paper we established this connection for the succinct circuit model. A main question is whether more sophisticated methods for constructing hitting sets can be developed against arithmetic circuits, which of course leads us into the general area of polynomial identity testing. However, to pose a more directed question here, can Theorem 3 be improved beyond the constant-free model to give a hitting set against circuits that use constants from some field F of characteristic zero, e.g. the complex numbers, at unit cost?

Acknowledgments

We thank the anonymous referees for their valuable comments. We thank Pavel Hrubeš for pointing out to us that without division gates a lower bound can be obtained for succinct circuits of constant depth by a reduction to the Razborov-Smolensky lower bound.

References

[Agr05] M. Agrawal. Proving lower bounds via pseudo-random generators. In Proc. 25th Annual Conference on Foundations of Software Technology and Theoretical Computer Science, pages 92–105, 2005.
[AKS02] M. Agrawal, N. Kayal, and N. Saxena. Primes is in P. Ann. of Math, 2:781–793, 2002.
[All99] E. Allender. The permanent requires large uniform threshold circuits. Chicago Journal of Theoretical Computer Science, 1999. Article 7.
[AV08] M. Agrawal and V. Vinay. Arithmetic circuits: A chasm at depth four. In Proc. 49th Annual IEEE Symposium on Foundations of Computer Science, pages 67–75, 2008.
[BIS90] D. Mix Barrington, N. Immerman, and H. Straubing. On uniformity within NC1. J. Comp. Sys. Sci., 41:274–306, 1990.
[BS82] W. Baur and V. Strassen. The complexity of partial derivatives. Theor. Comp. Sci., 22:317–330, 1982.
[BU92] J. Balcázar and U. Schöning. Logarithmic advice classes. Theoretical Computer Science, 99(2):279–290, 1992.
[Bür00] P. Bürgisser. Completeness and Reduction in Algebraic Complexity Theory. Springer Verlag, 2000.
[Bür09] P. Bürgisser. On defining integers and proving arithmetic circuit lower bounds. Computational Complexity, 18:81–103, 2009.
[DL78] R. DeMillo and R. Lipton. A probabilistic remark on algebraic program testing. Inf. Proc. Lett., 7:193–195, 1978.
[GK98] D. Grigoriev and M. Karpinski. An exponential lower bound for depth 3 arithmetic circuits. In Proc. 13th Annual ACM Symposium on the Theory of Computing, pages 577–582, 1998.
[HAB01] W. Hesse, E. Allender, and D.A.M. Barrington. Uniform constant-depth threshold circuits for division and iterated multiplication. J. Comp. Sys. Sci., 64(4):695–716, 2001.
[HS80] J. Heintz and C.P. Schnorr. Testing polynomials which are easy to compute (extended abstract). In Proc. 12th Annual ACM Symposium on the Theory of Computing, pages 262–272, 1980.
[IM83] O. Ibarra and S. Moran. Probabilistic algorithms for deciding equivalence of straight-line programs. J. Assn. Comp. Mach., 30:217–228, 1983.
[JS11] M. Jansen and R. Santhanam. Marginal hitting sets imply super-polynomial lower bounds for permanent, 2011. Manuscript.
[KI04] V. Kabanets and R. Impagliazzo. Derandomizing polynomial identity testing means proving circuit lower bounds. Computational Complexity, 13(1–2):1–44, 2004.
[Koi10] P. Koiran. Shallow circuits with high powered inputs. In Proc. 2nd Symp. on Innovations in Computer Science, 2010.
[KP07] P. Koiran and S. Perifel. Interpolation in Valiant’s theory, 2007. To appear.
[KP09] P. Koiran and S. Perifel. A superpolynomial lower bound on the size of uniform non-constant-depth threshold circuits for the permanent. In Proc. 24th Annual IEEE Conference on Computational Complexity, 2009.
[Lau83] C. Lautemann. BPP and the polynomial hierarchy. Inf. Proc. Lett., 17:215–217, 1983.
[LFKN92] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic methods for interactive proof systems. J. Assn. Comp. Mach., 39:859–868, 1992.
[LS07] L. Trevisan and S. Vadhan. Pseudorandomness and average-case complexity via uniform reductions. Computational Complexity, 16(4):331–364, 2007.
[Raz87] A. Razborov. Lower bounds for the size of circuits of bounded depth with basis {∧, ⊕}. Mathematical Notes (formerly of the Academy of Natural Sciences of the USSR), 41:333–338, 1987.
[Raz10] R. Raz. Elusive functions and lower bounds for arithmetic circuits. Theory of Computing, 6, 2010.
[RY09] R. Raz and A. Yehudayoff. Lower bounds and separations for constant depth multilinear circuits. Computational Complexity, 18(2), 2009.
[Sch80] J.T. Schwartz. Fast probabilistic algorithms for polynomial identities. J. Assn. Comp. Mach., 27:701–717, 1980.
[Sip83] M. Sipser. A complexity-theoretic approach to randomness. In Proc. 15th Annual ACM Symposium on the Theory of Computing, pages 330–335, 1983.
[Smo87] R. Smolensky. Algebraic methods in the theory of lower bounds for Boolean circuit complexity. In Proc. 19th Annual ACM Symposium on the Theory of Computing, pages 77–82, 1987.
[SW01] A. Shpilka and A. Wigderson. Depth-3 arithmetic formulae over fields of characteristic zero. Journal of Computational Complexity, 10(1):1–27, 2001.
[Tod91] S. Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. Comput., 20:865–877, 1991.
[Tor91] J. Torán. Complexity classes defined by counting quantifiers. J. Assn. Comp. Mach., 38(3):753–774, 1991.
[Val79a] L. Valiant. Completeness classes in algebra. In Proc. 11th Annual ACM Symposium on the Theory of Computing, pages 249–261, 1979.
[Val79b] L. Valiant. The complexity of computing the permanent. Theor. Comp. Sci., 8:189–201, 1979.
[Vol99] H. Vollmer. Introduction to Circuit Complexity. A uniform approach. Springer-Verlag, 1999.
[Wag86] K. Wagner. The complexity of combinatorial problems with succinct input representation. Acta Informatica, 23:325–356, 1986.
[Zan91] V. Zankó. #P-completeness via many-one reductions. International Journal of Foundations of Computer Science, 2:77–82, 1991.
[Zip79] R. Zippel. Probabilistic algorithms for sparse polynomials. In Proceedings of the International Symposium on Symbolic and Algebraic Manipulation (EUROSAM ’79), volume 72 of Lect. Notes in Comp. Sci., pages 216–226. Springer Verlag, 1979.