>>Alexandra Kolla: It's a pleasure to have Ran Raz with us this week from Weizmann, and I don't think I should say much more, but you already know he's so famous that, you know, my words are unnecessary. So there you go.
>>Ran Raz: Thank you for the introduction and for inviting me to give this talk.
So actually -- okay. So thanks for inviting me to give this talk. Actually I wasn't sure if I would give this talk or another talk, a survey about the parallel repetition theorem and its applications. I finally decided to give this talk, so if you want to hear the other talk, you will have to invite me again.
The title of this talk is How to Fool People to Work on Circuit Lower Bounds, and I will explain the title in a few minutes, but at this point I just want to say, because [inaudible] already said that he's not fooled to work on circuit lower bounds.
So I want to say I'm not going to show any lower bounds here and I'm not going to fool
you to work on circuit lower bounds. What I'm going to do is to show you how to fool
other people to work on circuit lower bounds. That's why it's very important to come to
this talk.
So what do I mean by lower bounds? In general I talk about lower bounds for circuit
complexity, but in this particular talk I'm going to talk about arithmetic circuit complexity.
When we talk about arithmetic circuits and formulas -- I will tell you what they are -- we have some field F that can be any field. Sometimes we talk about C or R, but many times about finite fields. Sometimes I will talk about C, but usually almost everything that I'm going to say will be true over any field. So we have some field F, and we have n input variables x1 to xn. We have the gates plus and product. An arithmetic formula is a directed tree
with edges directed from the leaves to the root of the formula.
Every leaf of the formula is labeled by either an input variable from x1 to xn or a field
element from the field F. So you see 1 or x1, x2. And every other node in the formula
is labeled by one of the gates, either plus or product. And you can see that if we do
that, every gate in the formula computes a polynomial in the ring F of x1 to xn in the
obvious way.
So, for example, this gate computes the product of x1 and x1, this gate x2 plus 1, and this gate the final polynomial computed by the formula, which is in this particular case x1 times x1 times x2 plus 1. So this is, in general, the polynomial computed by a formula.
When we talk about arithmetic circuits, it's exactly the same. Again, we have the same
gates, the same variables, the same field, except that we allow arbitrary directed acyclic
graphs rather than trees. So this time this can be any directed acyclic graph, which
means that we can use a computation that we've already done, we can use it more than
once. So this is an arithmetic circuit, and, again, each gate in the circuit computes a
polynomial in F of x1 to xn in the obvious way.
And we usually look at two parameters of the circuit or formula that we use: the size of the circuit or formula and the depth of the circuit or formula. In this talk I will talk mainly about the size, but I will mention the depth several times.
The size is defined to be the number of edges in the circuit or formula, that is, the size of the underlying graph. And the depth is the length of the longest path. When we talk about the depth, there are two versions. Sometimes we talk about the depth when the fan-in of every gate is unbounded and sometimes we talk about the depth when the fan-in of every gate is 2. The fan-in is the in-degree in the graph.
So usually when we talk about depth, we mean for gates of fan-in 2, except when we talk about circuits of constant size, and then we refer to gates of unbounded fan-in. I will usually say if I mean this or that. If not, you can assume in this talk that --
>>: [inaudible].
>>Ran Raz: Constant depth. Sorry. Oh, I'm sorry. When we talk about circuits of
constant depth, we talk about unbounded fan-in.
For now you can think of the depth as the length of the longest path where the fan-in of
each gate is 2.
But I will talk mainly about the size of the circuit and formula.
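To make the size and depth definitions concrete, here is a small illustrative sketch (my own toy code, not from the talk): a formula modeled as an expression tree, where the size is the number of edges and the depth is the longest leaf-to-root path.

```python
# A minimal sketch of an arithmetic formula as an expression tree.
# The class and the example formula are illustrative, not from the talk.
class Gate:
    def __init__(self, op=None, children=(), leaf=None):
        self.op = op                  # '+' or '*' for internal gates
        self.children = tuple(children)
        self.leaf = leaf              # variable name or field element

    def evaluate(self, env):
        """Value of the polynomial computed at this gate, at the point env."""
        if self.leaf is not None:
            return env.get(self.leaf, self.leaf)  # variable or constant
        vals = [c.evaluate(env) for c in self.children]
        if self.op == '+':
            return sum(vals)
        result = 1
        for v in vals:
            result *= v
        return result

    def size(self):
        """Size = number of edges of the tree."""
        return sum(1 + c.size() for c in self.children)

    def depth(self):
        """Depth = length of the longest leaf-to-root path."""
        return 1 + max(c.depth() for c in self.children) if self.children else 0

# The formula x1 * x1 * (x2 + 1), with a fan-in-3 product at the root.
f = Gate('*', [Gate(leaf='x1'), Gate(leaf='x1'),
               Gate('+', [Gate(leaf='x2'), Gate(leaf=1)])])
```

Evaluating f at x1 = 2, x2 = 3 gives 2 * 2 * (3 + 1) = 16; the tree has 5 edges and depth 2 (note the root here has fan-in 3, so this matches the unbounded-fan-in convention).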
Arithmetic circuits and formulas are the standard model for algebraic computations. So,
for example, if you want to compute the determinant of a matrix or the permanent of a
matrix or the product of two matrices, then the standard model to look at is arithmetic
circuits. And sometimes for simplifications, we look on arithmetic formulas.
So these are the standard model for arithmetic computations. If you want to ask how many operations we need in order to compute the determinant, the model to look at is probably arithmetic circuits.
And we would like to prove lower bounds for computations. We would like to prove
lower bounds for the size of arithmetic circuits in formulas. For example, for the size of
an arithmetic circuit for computing the permanent. The Holy Grail of this research is to
prove super-polynomial lower bounds for circuit or formula size. At this point we are
very far from it. For circuits the only bounds that we know are [inaudible] n times log r, where r is the degree -- n times log the degree of the polynomial computed. That's what we can prove.
And for formulas we can prove lower bounds of [inaudible] n squared, but not super-polynomial lower bounds. And that's the Holy Grail.
Now, for the $1 million question: will it give the result that P is different from NP? It won't give this result, because we talk here about arithmetic circuits and not boolean circuits. It's a little bit different. But it's a result of the same flavor. Actually, it gives a lower bound for a computation, except that it's algebraic computation, so you're not going to win the $1 million prize, but it will be very important scientifically.
So why are super-polynomial lower bounds still not proved? Maybe it's because not enough people are working on it. So some people think these problems are too hard and are not fooled to work on them. So there are also rumors about all kinds of
barriers for proving lower bounds, and sometimes partial progress is not very
appreciated. And some people think, okay, the problems are not nice mathematically.
You have an arithmetic circuit, it can be something very -- it can have many forms, so
there are not enough starting points for working on these problems.
So I have a secret plan. The plan is to fool people to work on circuit lower bounds. The idea is to come up with a set of problems that are innocent looking -- it seems as if they are completely unrelated to circuit lower bounds -- and they should be, hopefully, clean and simple problems that can be attacked from many different points of view and that seem very nice mathematically. And then, if they are really interesting mathematically, many people will work on them from many different points of view and maybe we will make some progress. So that's the plan. It's a good plan,
right?
>>: Are these the problems you give to your students?
>>: [inaudible].
>>Ran Raz: Which problem?
>>: The coming up with --
>>Ran Raz: No. Coming up with, no, but I will show you some problems. So that's
what I wanted to say. It's not only a secret plan, it's also my new hobby, to come up
with problems that look nice and clean mathematically and the solution will give strong
circuit lower bounds.
So this is the plan, and I will tell you today about two results in this direction, both of them from the last two years. One of them is an approach for proving super-polynomial lower bounds for circuits, and the other an approach for proving super-polynomial lower bounds for formulas, and more or less what I will try to do today is describe the problems plus give some comments about them.
So first I want to start with elusive functions and lower bounds for arithmetic circuits, and this is an approach for proving super-polynomial lower bounds for arithmetic circuits, even exponential lower bounds for arithmetic circuits.
So just to see what we're talking about, I want to define what a polynomial mapping is. A polynomial mapping from a field to the n to a field to the m -- for this part of the talk I'm going to take the field to be C, the complex numbers, although almost everything that I'm going to say is going to be true over any other field.
So a polynomial mapping F, which is F1 to Fm from C to the n to C to the m is just a
mapping such that each coordinate of the mapping can be described as a polynomial in
n variables. And we say that the polynomial mapping is of degree d if each of these polynomials is of total degree at most d. So this is a polynomial mapping, very simple.
And I also need the notion of what it means for a polynomial mapping to be explicit. And for this notion I use the notion of polynomially definable, of Valiant. So anything that is polynomially definable is explicit from my point of view, and in particular, the following will be explicit.
If you're able, given a monomial M and an index i of the polynomial that you want to compute -- so you get an index i between 1 and m and you get a monomial M -- then the coefficient of the monomial M in the polynomial fi should be computable in polynomial time. That's what I mean by explicit. Actually the definition is even more general, but in particular this will be explicit, and in practice most of the things that we are able to construct, that we think about, are explicit.
>>: Sorry. If this is polynomial time, what's that a polynomial in?
>>Ran Raz: Okay. That was exactly the next comment.
So this is a very important point. The polynomial time is polynomial in the length of
what you get. So you get a monomial and you get an index i, and the polynomial time is
polynomial in the length of the monomial, the length of the description of the monomial
and the length of the index i. And this is a very important point, that I allow you polynomial time in the length of the index, and so it's polynomial time in log n -- sorry, in log m and not in m itself.
And this is important because I will work with M that is super-polynomial in n. So the
polynomial time is in the length of the index.
Nevertheless, I think most of the mappings that one usually considers are explicit in this sense of explicitness.
So this is what I mean by polynomial mapping, and this is what I mean by explicit.
And this is one example for a polynomial mapping, an example that most of you
probably saw, the Moments Curve, and this is a polynomial mapping from C to C to the m: x is mapped to x, x squared, x cubed, up to x to the m. This is the Moments Curve.
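The Moments Curve also makes the explicitness requirement concrete. Here is a hedged sketch (my own toy code, not from the talk) of a coefficient oracle for it: since the i-th coordinate is f_i(x) = x to the i, the coefficient of the monomial x to the e in f_i is 1 exactly when e = i, and comparing the two integers takes time polynomial in their bit-length, that is, polynomial in log m.

```python
# Coefficient oracle for the Moments Curve f_i(x) = x^i (illustrative sketch).
# Inputs are an index i (1 <= i <= m) and an exponent e describing the
# monomial x^e; the running time is polynomial in the bit-lengths of i and e,
# so the oracle is efficient even when m is super-polynomial in n.
def moments_coefficient(i: int, e: int) -> int:
    # f_i(x) = x^i, so x^i is the only monomial with a nonzero coefficient.
    return 1 if e == i else 0
```

The point is that the oracle never touches m itself, only the bit strings describing the index and the monomial.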
And there is one basic fact that probably you have all seen, a basic fact in linear algebra: for any affine subspace A of dimension m minus 1 -- A is a subspace of C to the m of dimension m minus 1 -- the image of f is not contained in the affine subspace A.
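This fact can be checked concretely (an illustrative sketch of my own, with arbitrarily chosen points): if the image of the curve were contained in an affine subspace of dimension m minus 1, then any m + 1 points on the curve would be affinely dependent; but for distinct points the corresponding Vandermonde determinant is nonzero, so they are affinely independent.

```python
from itertools import permutations

def det(M):
    """Exact determinant by the Leibniz formula (fine for tiny matrices)."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        inversions = sum(1 for a in range(n)
                         for b in range(a + 1, n) if p[a] > p[b])
        term = -1 if inversions % 2 else 1
        for row in range(n):
            term *= M[row][p[row]]
        total += term
    return total

m = 4
ts = [1, 2, 3, 5, 7]  # m + 1 distinct points (chosen arbitrarily)
# Row for t is (1, t, t^2, ..., t^m): the image point f(t) prefixed by 1, so
# a nonzero determinant certifies the m + 1 points are affinely independent.
V = [[t ** k for k in range(m + 1)] for t in ts]
d = det(V)
```

Here d is the Vandermonde determinant, the product of t_j minus t_i over i < j, which is nonzero whenever the points are distinct.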
This is very easy to prove and very basic. I can write it equivalently as follows: For any
polynomial mapping gamma from C to the m minus 1 to C to the m of total degree 1, so
gamma is just an affine mapping, for any affine mapping from C to the M minus 1 to C
to the m, the image of f is not contained in the image of gamma.
And this is because the image of gamma is an affine subspace, and any affine
subspace can be written as the image of some affine gamma from C to the M minus 1
to C to the m. So we have this property, that the image of f is not contained in the image of gamma for any mapping of total degree 1. That's the property that we have. And what I'm asking is what happens if we consider mappings of total degree 2.
And what I would like to get is an explicit example for a polynomial mapping from C to C
to the m such that for any mapping from C to the m minus 1 to C to the m of degree 2,
the image of F will not be contained in the image of gamma. That's what I want.
And I allow you to use mappings f of degree almost exponential, 2 to the m to the little o of 1, which means smaller than 2 to the m to the epsilon for any epsilon. So I don't allow you exponential degree, but almost exponential degree, and that's what I'm looking for: a mapping F such that the image of F is not contained in the image of gamma for any gamma of degree 2.
>>: Can you give us some lower bound [inaudible].
>>Ran Raz: I think that m squared. More than m squared you can probably do, and less than m squared, probably not.
>>: Less than m squared?
>>Ran Raz: Less than m squared, probably you cannot do, and a little bit more than m squared you can do. So I can prove non-explicitly that for more than m squared you have such a mapping.
But here I'm asking if there exists an explicit mapping, and that's the problem. And if
you find such a mapping, you prove exponential lower bounds for computing -- actually
super-polynomial bounds for computing the permanent over C.
>>: [inaudible].
>>Ran Raz: Okay. So if the mapping is not explicit, you still get the lower bound, but for a non-explicit function. And we know lower bounds for non-explicit polynomials by counting arguments. Actually here it's dimension arguments, not counting, because the field is C.
So this is not very hard to do. If you want it to be true for explicit polynomials, you need
to give an explicit mapping, and then it follows that it's also true for the permanent.
So this is what I'm looking for. And you can see that this problem seems very innocent.
It seems unrelated to complexity or to arithmetic circuit complexity. It seems that one can try to study it from many different points of view, maybe a geometric approach, an algebraic approach --
>>: [inaudible].
>>Ran Raz: The Moments Curve, no, it's not such a function. It can actually be -- the image is contained in the image of a gamma of degree 2 and dimension square root of m [inaudible].
So it's not such a function. I can give you many examples for such functions, but I
cannot prove it for any of them.
>>: [inaudible].
>>Ran Raz: For such functions, it cannot be proved. There are many candidates, but --
>>: [inaudible].
>>Ran Raz: I don't want to get into it. Almost everything you think of will be a good candidate, because that's the whole difficulty in this research, how to prove the lower bounds. For example, the permanent is a hard candidate for arithmetic circuits, but you cannot prove it. I cannot prove it, at least.
>>: If you find something that has degree much more than two [inaudible].
>>Ran Raz: Yes. So if this is the degree, you'll get a super-polynomial lower bound. If the degree is much smaller than this, you get a better lower bound, up to an exponential lower bound. You can get really strong lower bounds by this approach if only you find such a function.
Another comment is that the problem is even easier: gamma doesn't need to really depend on m minus 1 variables. It doesn't really need to be of dimension m minus 1; it can be of dimension m to the 0.9. And F can depend on n variables where n is m to the little o of 1. And this will still give what I need. So, again, this problem seems
really unrelated to complexity.
I didn't mention the word complexity. I said explicit. But that's the only thing that I said
that is somehow related to complexity, and in this word, explicit, this word hides all the
complexity of the word complexity.
So this is the first problem that I wanted to talk about. Actually, I can generalize it and define an elusive function. I say that f from C to the n to C to the m is elusive if for any gamma from C to the s to C to the m of degree d, the image of f is not contained in the image of gamma. And I can show for many settings of the parameters that explicit constructions of elusive functions imply lower bounds for the size of arithmetic circuits. So in general, elusive functions are very related to lower bounds for arithmetic circuits.
Okay. I'm not going to really give the proof, just the proof idea -- actually the idea is
kind of simple and I just want to give the proof idea.
So what we consider is a mapping gamma from C to the s to C to the m that maps a
circuit, a description of a circuit to the polynomial that is computed by the circuit. And if
we have such a mapping gamma, the image of gamma is just the set of all polynomials
that can be computed by small circuits. And the trick is to show that this gamma can be
taken as a mapping of small degree. Actually, this is not exact: the image of gamma will be contained in the image of a mapping of small degree, sometimes even degree 2, depending exactly on how we work.
So when we have such gamma, the whole business of proving lower bounds becomes
just -- is reduced to the problem of finding points outside the image of gamma.
Now suppose that we have an elusive function F, a function F such that the image of F is not contained in the image of gamma. Then F hits a hard polynomial, because it hits a polynomial that is not in the image of gamma.
And the last point is that we take that polynomial F and we add these variables -- so we
take the variables that are needed to describe F and we add them as additional
variables to the polynomials here.
>>: [inaudible].
>>Ran Raz: I mean that in the image of F, there is a point that corresponds to a
polynomial that is hard to compute. So this is -- that's it. I'm not going to do -- I don't
really want to do the proof or talk more about it. It's just the proof idea, and this is what
gives the result.
Now, you can ask how to get the result for the permanent, and the result for the permanent you get just by Valiant's theorem that the permanent is a complete function for VNP, Valiant's arithmetic version of NP. So if you get a lower bound for any explicit polynomial, you also get a lower bound for the permanent. So that's more or less -- in one slide, that's the proof.
>>: [inaudible].
>>Ran Raz: Yeah. There is a [inaudible].
>>: [inaudible].
>>Ran Raz: I just want to mention that there is one class of circuits for which the best known lower bounds are proved by this approach, and this is lower bounds for depth-d, constant-depth, arithmetic circuits. Unlike boolean circuits -- I guess many of you know that for boolean circuits we are able to prove exponential lower bounds.
For arithmetic circuits this is not the case. The best lower bounds that we know for constant-depth arithmetic circuits are size lower bounds of n to the 1 plus omega of 1 over d, and this is for circuits of depth d. And the proof of this lower bound uses exactly this approach of elusive functions. This was implicit in the work of Shoup and Smolensky in '91 and was made explicit in my work.
Okay. So what I want to talk about now is another approach in this direction, and this is a work from actually last year, from a year ago: tensor rank and lower bounds for arithmetic formulas.
So let me tell you what tensor rank is. First, what is a tensor? By n in parentheses I denote the numbers 1 to n, and here n and r are two integers; r is the dimension. A tensor is just a function from n to the r to F, where F is some field. So this is a tensor of dimension r, exactly like matrices: where r equals 2, it's a matrix; where r is larger, it's a tensor.
We say that the tensor A from n to the r to F is of rank 1 if it can be written as a product of vectors, exactly like matrices of rank 1. So A is of rank 1 if there exist r vectors a1 to ar from 1 to n to F, so vectors with n coordinates, such that A is the tensor product of these r vectors: A is a1 tensor product a2 tensor product, up to, ar. Or, more formally, if the entry i1 to ir of the tensor A is just a1 of i1 times a2 of i2 times, up to, ar of ir. So this is a tensor of rank 1, of tensor rank 1.
In general, we define the rank of a as the minimal k such that we can write A as the sum
of k tensors of rank 1. And this is exactly a generalization of matrix rank, except that we talk about tensors of higher dimension. For any tensor A, the rank of A is at most n to the r
minus 1, exactly like the rank of a matrix is at most n. So this we know. So everything
here is just a generalization of matrix rank.
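These definitions are easy to state in code. Here is a toy sketch of my own (not from the talk), representing a tensor as a dict from index tuples to field elements:

```python
from itertools import product

def rank_one(vectors):
    """Tensor product a1 (x) ... (x) ar of r vectors of length n:
    the entry at (i1, ..., ir) is a1[i1] * a2[i2] * ... * ar[ir]."""
    n = len(vectors[0])
    out = {}
    for idx in product(range(n), repeat=len(vectors)):
        entry = 1
        for vec, i in zip(vectors, idx):
            entry *= vec[i]
        out[idx] = entry
    return out

def tensor_sum(tensors):
    """Entrywise sum; a sum of k rank-1 tensors has rank at most k."""
    out = {}
    for t in tensors:
        for idx, v in t.items():
            out[idx] = out.get(idx, 0) + v
    return out

# For r = 2 this is ordinary matrix rank: the sum below is the matrix
# [[1, 2], [3, 4]], written as a sum of two rank-1 matrices.
A = tensor_sum([rank_one([[1, 0], [1, 2]]),
                rank_one([[0, 1], [3, 4]])])
```

The rank of a tensor is then the smallest number of rank-1 summands needed; computing it in general is hard, as mentioned later in the talk.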
Given a tensor A from n to the r to F and nr input variables x11 to xrn, I can think of these nr variables as partitioned into r sets of n variables each, and define a polynomial fA in these nr variables that corresponds to the tensor A as follows.
So the polynomial fA: in the polynomial fA I take every multilinear monomial that contains exactly one variable from each of the r sets, and each of these monomials I take with the corresponding coefficient -- with the coefficient which is the corresponding entry of the tensor. So this is the definition. Actually, it's not really important what is
the definition. I just gave the definition, but I will only need to know that there is a
polynomial fA that corresponds to the tensor A. I will not really give the proofs in this
talk. So I will only state the results, and to state the results it's only important to know
that for a tensor A I can define a polynomial fA. And it's not really important for the rest
of the talk what is exactly the definition of this polynomial fA. But if you want, this is the
definition.
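For concreteness, here is a small sketch of my own (not from the talk) that evaluates fA at a point, with the tensor given as a dict from r-tuples of indices to coefficients and the point given as an r-by-n table, one row per set of variables:

```python
def evaluate_f_A(A, x):
    """Evaluate f_A: each monomial picks exactly one variable from each of
    the r sets, weighted by the corresponding tensor entry A[(i1, ..., ir)]."""
    total = 0
    for idx, coeff in A.items():
        term = coeff
        for j, i in enumerate(idx):
            term *= x[j][i]     # the i-th variable of the j-th set
        total += term
    return total

# A 3-dimensional tensor with two nonzero entries, over n = 2:
A = {(0, 0, 0): 1, (1, 1, 1): 2}
point = [[1, 2], [3, 4], [5, 6]]   # x[j][i]: value of the i-th var in set j
```

At this point the value is 1 * (1 * 3 * 5) + 2 * (2 * 4 * 6), one term per nonzero tensor entry.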
So given a tensor A, we have a polynomial fA that is defined as follows. And it is very
well known that there is a connection between tensor rank and lower bounds for
arithmetic circuits. And this connection goes back to Strassen, the work of Strassen from '73. And what Strassen showed is that given a tensor A of dimension 3, A from n to the 3 to F, the rank of A is a lower bound for the arithmetic circuit complexity of the polynomial fA. So in particular, if you're able to give an explicit tensor A from n to the 3 to F of rank m, then you get a lower bound of omega of m for an explicit polynomial fA. So this was observed by Strassen, and actually this is practically an observation -- given some known results, this is practically an observation.
When you use this, you can potentially prove lower bounds of up to omega of n squared for arithmetic circuits, because such a tensor can be of rank omega of n squared. But at
this point we still don't know any tensor, any explicit example of a tensor of rank larger
than omega of n. So this approach doesn't give any bound better than omega of n, and
actually no other approach gives any bound better than omega of n for polynomials of
degree 3. In this particular case, fA is a polynomial of degree 3.
So -- but potentially we can get lower bounds of omega of n squared by this approach. And what I tried to do is to consider tensors of larger dimension, and what I consider are tensors of dimension r which is super constant. Think of r as any number larger than any constant and smaller than log n over log log n.
And what I can prove: if you have a tensor A from n to the r to F of rank n to the r times 1 minus little o of 1. So what you need is a tensor A of dimension r such that the rank of A is larger than n to the r times 1 minus epsilon for any epsilon. So it should be larger than n to the 0.99r, and the same for any constant smaller than 1.
That's what you need. You need the tensor -- a tensor with tensor rank close to n to the
r in the sense that it's larger than n to the r times 1 minus epsilon for any epsilon. And if
you get such a tensor, if you get an explicit example for such a tensor A, this gives you super-polynomial lower bounds for arithmetic formulas, this time for the polynomial fA.
And, again, the lower bound can also be proved for the permanent. If you get an
explicit example for such a tensor with high tensor rank, you get -- you prove
super-polynomial lower bound for computing -- for arithmetic formulas for the
permanent. So it doesn't give you for the more general notion of arithmetic circuits, but
it does give you for arithmetic formulas. So this is the approach, more or less.
>>: [inaudible].
>>Ran Raz: So, again, for random tensors, the rank is very close to n to the r minus 1.
Also, in the previous approach --
>>: [inaudible].
>>Ran Raz: I wanted an explicit one, one that you can describe. Again, the notion of
explicit is exactly the same as before. If I give you the index of an element of the
tensor, I want you to be able, in polynomial time, to output the --
>>: [inaudible].
>>Ran Raz: Actually in this particular case even polynomial time in n would be fine,
would be enough.
Also in the approach of elusive function, it's very easy to do it non-explicitly, to show that
some random construction works, but here, also, if you take some random tensor, you
know that its rank is high, but you need it for an explicit example. That's the name of
the game here.
But if you do give such an example, you'll get super-polynomial lower bounds for computing the permanent. So, again, in order to prove lower bounds for the permanent, you just need to give one example of a tensor with high rank. That's all you need.
Again, it's a problem that seems unrelated to computational complexity, but if you solve
it, you prove an extremely strong and important result in arithmetic circuit complexity.
>>: [inaudible].
>>Ran Raz: Here not so much. But, yes. In general, yeah.
So in order to satisfy [inaudible], now I will give a problem that doesn't use the word
explicit -- oh, not now. In the next slide.
Okay. Before I'm going to satisfy -- before I'm going to do that, I want to highlight one
aspect of this result. And this is the point of view of depth 3 versus general arithmetic
formulas.
Here when I talk about depth-3 formulas, I talk about formulas with unbounded fan-in for every gate. And it's not hard to see that tensor rank corresponds in some way to depth-3 formulas for the polynomial fA. Actually, set-multilinear formulas, but I don't want to get into now what set-multilinear is exactly.
But, anyway, if you have a description of A with small tensor rank, you get small arithmetic formulas of depth 3 for the polynomial fA. And this means that if you prove strong enough lower bounds for depth-3 arithmetic formulas, you get super-polynomial lower bounds for general arithmetic formulas. But the strong enough needs to be very strong: you need to prove a lower bound of close to n to the r for arithmetic formulas of depth 3. But if you do that --
>>: [inaudible].
>>Ran Raz: Yes. Sum of all the -- sums of products of sums. Exactly.
Usually we don't count the product with a constant as part of the depth. That's the -- okay, I should say that.
So it was a folklore theorem that if we prove strong enough lower bounds for depth-4 circuits, arithmetic circuits, we get exponential lower bounds for general arithmetic circuits. And, also, a recent result by [inaudible] shows that any exponential lower bound for depth-4 circuits implies an exponential lower bound for general circuits. So the folklore result is that only a very strong exponential lower bound for depth-4 circuits implies such a bound, and this result is that any, even relatively weak, exponential lower bound for depth-4 circuits implies an exponential lower bound for general arithmetic circuits.
>>: [inaudible].
>>Ran Raz: What?
>>: [inaudible].
>>Ran Raz: I don't know what you mean.
>>: [inaudible].
>>Ran Raz: I don't know.
>>: [inaudible].
>>Ran Raz: Okay. So now I want to talk about an approach of how to construct a tensor with high tensor rank. And, again, this approach goes back to Strassen. And the idea is the following. Suppose that you have two tensors, A1 and A2. A1 is a tensor from n1 to the r to F and A2 from n2 to the r to F, and you take the tensor product of A1 and A2. And this is a tensor from n1 times n2 to the r to F. And this is what I mean by tensor product: the r indices of A are in n1 times n2, so I can think of each of them as a pair (i, j) with i in n1 and j in n2, and A of (i1, j1), (i2, j2), up to (ir, jr) is just A1 of i1, i2, up to ir times A2 of j1, j2, up to jr. So this is the tensor product of two tensors.
Now, for dimension 2 for matrices, we know that the rank of the tensor product is the
product of the ranks. This is well known.
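For matrices this multiplicativity is easy to check numerically. Here is a toy sketch of my own (not from the talk), with arbitrarily chosen example matrices; the rank routine is exact Gaussian elimination over the rationals:

```python
from fractions import Fraction

def matrix_rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rank = 0
    for c in range(len(M[0])):
        pivot = next((r for r in range(rank, len(M)) if M[r][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(len(M)):
            if r != rank and M[r][c] != 0:
                f = M[r][c] / M[rank][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def kron(A, B):
    """Kronecker product: the r = 2 case of the tensor product above."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

A1 = [[1, 2], [2, 4]]   # rank 1
A2 = [[1, 0], [0, 1]]   # rank 2
```

Here the rank of kron(A1, A2) equals rank(A1) times rank(A2), illustrating the dimension-2 case; the open question in the talk is how far this can fail for dimension r larger than 2.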
For dimension r larger than 2, this is not true in general. But the question is whether we can prove that it's approximately true, that the rank of the tensor product is larger than the rank of A1 times the rank of A2 divided by something: divided by 2, divided by 100, or maybe even divided by n to the little o of 1.
If you prove such a thing, you get super-polynomial lower bounds for arithmetic circuits
for the permanent.
>>: [inaudible].
>>Ran Raz: Formulas. Sorry. Thank you.
>>: You need to prove this -- do you have to prove it for r equals 3 or can you do it for --
>>Ran Raz: So you need to prove it for -- if you want super-polynomial lower bounds, you need to prove it for r that is super constant. For any r that is super constant and smaller than log n over log log n.
>>: [inaudible].
>>Ran Raz: So it depends what exactly you prove. But if you prove something very strong for 3 you already get very strong lower bounds, but polynomial, like up to omega of n squared. So even for 3, it's extremely interesting. But for r larger than 3 you can even allow a larger [inaudible].
>>: So is there -- are you going to talk about what's known about how much less -- how much less can the rank of A --
>>Ran Raz: Actually, I'm not sure exactly, but it's very little. There are examples in which it's less than the product of the ranks, but by very little. Maybe a constant or something like that. Maybe even less than a constant. I'm not sure what the best examples are.
So the proof for that is very easy. I'm going to do it fast because it's less important.
So if we have such a theorem, what we're going to do is to take m to be n to the 1 over r, and then suppose that you have r tensors A1 to Ar from m to the r to F of high rank, of high tensor rank. We're just going to take the tensor product of these r tensors. And this is a tensor from n to the r to F. And if all of these are of high rank, then this will also be of sufficiently high rank.
Now, of course, you can ask where we get A1 to Ar from. So the idea is that -- since m is n to the 1 over r, each of these tensors depends on only n coefficients, and therefore you can take these coefficients as additional input variables for your formula, as kind of auxiliary input variables, and since the tensor rank of a random tensor is high, you will get that this will be, in general, of high rank. So that's more or less how you get the proof.
So you can see this problem doesn't mention at all the word complexity. I can even write here one million so you will not even have the little o of 1. It doesn't mention complexity. It's a problem in algebra. If the people in algebra were more serious, they would have solved it a long time ago.
And if you prove it, you get super-polynomial lower bounds for computing the permanent by arithmetic formulas.
>>: Well, the question is whether rank is actually [inaudible]. I think the algebraists -- I
can't speak for them, but I'd respond that a rank is not really such an algebraic
[inaudible].
>>Ran Raz: Okay. [inaudible].
Okay. I'm not going to do the proof. The proof in this case is a little bit more
complicated.
I'm going to describe the main steps of the proof, although I know in this forum you will
not get much out of it, but it's only one slide so it's okay.
So suppose that we have a small formula for fA, and we want to prove that the tensor
rank of A is small. So the first step is to put the formula in some nice form: we need to
homogenize the formula and make it homogeneous and multilinear. I will talk a little bit
about this step later on.
Then, when the formula is in a nice form, I define a notion of syntactic rank of a formula,
and for this notion, unlike tensor rank, you can just go over the formula and compute the
syntactic rank exactly. Tensor rank is hard to compute: if you are given a tensor, it's
actually NP-hard to figure out its exact tensor rank. This is a result by [inaudible], and it
holds even for tensors of dimension 3.
But the syntactic rank can be very easily computed by going -- by looking at the formula.
And once we have this notion, and this notion will bound the tensor rank from above, we
can actually, for any formula, compute its syntactic rank and we can actually find the
formula with the highest syntactic rank, and this way we will bound the tensor rank. So I
know that this is not really a proof -- it's just the steps that need to be done -- but I just
want to talk a little bit more about the first step of the proof because it might be
interesting in its own right, although it's less related to proving lower bounds.
So the problem here is the problem of homogenization. So, first, what's a homogeneous
polynomial? A homogeneous polynomial is a polynomial such that all the monomials that
appear in it are of the exact same degree. That's a homogeneous polynomial.
A homogeneous formula is a formula in which every intermediate stage computes a
homogeneous polynomial. So all the gates compute homogeneous polynomials.
In general, if you want to compute a homogeneous polynomial f, you can compute it by a
non-homogeneous formula: maybe the formula computes in intermediate steps some
other polynomials that are not homogeneous, but finally it all cancels and the final output
is homogeneous.
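The definition can be checked mechanically. Here is a minimal sketch using sympy (our own helper, not something from the talk): a polynomial is homogeneous iff all its monomials have the same total degree.

```python
from sympy import symbols, Poly

x, y, z = symbols('x y z')

def is_homogeneous(expr, gens):
    """True iff every monomial of expr has the same total degree."""
    degrees = {sum(mono) for mono in Poly(expr, *gens).monoms()}
    return len(degrees) <= 1

print(is_homogeneous(x*y + y*z + x*z, (x, y, z)))  # True: all monomials have degree 2
print(is_homogeneous(x*y + z, (x, y, z)))          # False: degrees 2 and 1 both appear
```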
Now, suppose that you have a formula C of size s that computes a polynomial f which is
homogeneous and of degree r, but the formula itself is not necessarily homogeneous.
Can you construct a homogeneous formula D for the polynomial f?
And this question again goes back to Strassen in '73, and Strassen gave a very simple,
kind of the obvious, procedure to construct D from C, but the problem is that D is of size
s to the O of log r, where r is the degree. So it's actually a very efficient procedure
when you look at circuits, but if you look at formula size, it increases the formula size:
if r is more than a constant, it blows up the formula size super-polynomially. And this
bound was actually conjectured to be optimal by several people.
And the first step of what I described is actually a different analysis of the same
procedure, which shows that it actually takes the size to (d + r choose r) times s, where
d is the product depth of the formula C (with the fan-in of every gate being 2) and r is
the degree. It's not really important what this expression is exactly. The important thing
is that it is always smaller than the previous bound, and when s is polynomial and r is
less than log n, you can without loss of generality assume that d is less than log n and
get that this size is polynomial.
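In symbols, the two homogenization bounds compare as follows (reconstructed from the talk's description; $d$ denotes the product depth of $C$, and the asymptotic constants are our reading, not stated exactly in the talk):

```latex
\text{Strassen '73:}\quad |D| \le s^{O(\log r)},
\qquad\text{new analysis:}\quad |D| \le \binom{d+r}{r}\cdot s .
```

When $s = \mathrm{poly}(n)$, $r \le \log n$, and without loss of generality $d \le \log n$, the new bound gives $\binom{d+r}{r} \le \binom{2\log n}{\log n} \le 2^{2\log n} = n^2$, so the homogenized formula has polynomial size, whereas $s^{O(\log r)}$ is super-polynomial once $r$ is super-constant.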
So if you have a homogeneous polynomial of degree log n and you have any formula for
it of size s, you can, from this formula, get a homogeneous formula for the same
polynomial. So this is kind of a new homogenization procedure.
And thus, as the corollary of this step, what you get is that if you prove
super-polynomial lower bounds for homogeneous formulas for degree up to log n over
log log n -- actually, for homogeneous formulas for degree up to log n -- you get
super-polynomial lower bounds for general formulas. And the same can be proved also
for what I call set-multilinear formulas. I will not tell you what a set-multilinear formula
is, only what a multilinear formula is.
So -- okay, so multilinear formulas. A multilinear polynomial is a polynomial such that in
each monomial each variable appears with degree at most 1 -- either zero or 1, but not
with degree 2. A multilinear formula is a formula such that at every step it computes a
multilinear polynomial.
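As with homogeneity, multilinearity is easy to check mechanically. A minimal sympy sketch (our own helper, not from the talk): a polynomial is multilinear iff every variable has individual degree at most 1 in each monomial.

```python
from sympy import symbols, Poly

a, b, c, d = symbols('a b c d')

def is_multilinear(expr, gens):
    """True iff each variable has degree at most 1 in every monomial."""
    return all(max(mono) <= 1 for mono in Poly(expr, *gens).monoms())

det2 = a*d - b*c  # determinant of the 2x2 matrix [[a, b], [c, d]]
print(is_multilinear(det2, (a, b, c, d)))        # True
print(is_multilinear(a**2*d + b, (a, b, c, d)))  # False: a appears with degree 2
```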
So, for example, the determinant is a multilinear polynomial, right? But if you want to
compute the determinant by a formula, some formulas for computing the determinant
are not multilinear: they compute in intermediate stages polynomials that are not
multilinear, and somehow at the end all the monomials that are not multilinear cancel.
We say that the formula is multilinear if it computes a multilinear polynomial at every
intermediate stage of the computation. So every gate of the formula
computes a multilinear polynomial. And today we know how to prove super-polynomial
lower bounds for multilinear formulas for the determinant and the permanent. So we
know that any multilinear formula for the determinant or for the permanent is of
super-polynomial size. So if you want to compute the determinant by a formula such
that all intermediate computations will compute multilinear polynomials, you need
super-polynomial size.
We know how to prove that, so we are able to get super-polynomial lower bounds for
multilinear formulas, but unfortunately only for polynomials of degree larger than log n.
And in order to get super-polynomial lower bounds for general formulas by this
approach, we need to get super-polynomial lower bounds for multilinear formulas for
polynomials of degree up to log n. So somehow it didn't work this way.
Thank you.
[applause].
>>: [inaudible].
>>Ran Raz: Even slightly less than log n. No, actually, log n. Log n.
So log n over log log n is actually for -- oh, no, you're right. Log n over log log n. Sure.
Yeah. So we need to prove it for log n over log log n.
>>: Any more questions?
>>: Didn't Strassen give a talk in the '70s on how to get people to work on tensor
rank?
>>Ran Raz: Well, you got me [laughter].
>>: [inaudible].
>>Ran Raz: So -- okay, so I just want to mention that although it might seem that the
new problem is harder than the old one, it's not necessarily the case. I mean, it might
be that you can prove a lower bound of n to the r minus 2, for example. So it would
solve this problem, but it wouldn't solve that problem. So it might be that this problem is
even easier.
>>: I have another question, too. So what -- did people try to introduce other things,
other gates, like max, things like that?
>>Ran Raz: So then you don't -- you compute other objects, not only polynomials.
Yeah, there is work on other computational models, like the [inaudible] model, and
there are other computational models. So this is only for the arithmetic circuit model.
>>: [inaudible] after the end of the survey of hard problem [inaudible].
>>Ran Raz: [inaudible].
>>: You have to ask him are there any problems in this area that haven't been solved
not because they're so hard but because they haven't been noticed.
>>Ran Raz: Was it in Beijing? I think in Beijing I heard about it, actually, because a
student of mine --
>>: And do you know what the answer is?
>>Ran Raz: The answer is that he would have solved it [laughter].
>>: [inaudible].
>>Ran Raz: Well, certainly -- certainly it makes all the problems harder if you do not
solve the [inaudible], but I wouldn't say that they're impossible.
So I don't know. Like, if you ask me, I was very excited about the elusive functions
approach and tried to work hard on it, but at this point I'm more discouraged than
before. I'm still very optimistic about the tensor rank approach, but I guess I'll have to
fool some -- not you, but some other people -- to work on it.
>>: Who put you to work [laughter]?
>>Ran Raz: [inaudible] thank you.
[applause]