>> Amir Dembo: So it's a pleasure to introduce David Wilson from Microsoft,
who will talk about Oded's work on Boolean functions.
>> David Wilson: Thank you. Okay. I'm going to be talking about how these
Boolean functions tie into random-turn games and percolation. So remember at
one point Oded declared that he didn't intend to work on Boolean functions
anymore, but then we had Ryan O'Donnell as a post-doc and Itai was visiting and
Mike Saks was visiting, and somehow this was too much to resist continuing
work on Boolean functions.
So we're going to -- there are two conventions. So we're going to have the
functions map {-1, 1} to the N into {-1, 1}, and so one particular
question that we looked at, and this was with Itai and Oded, was the following. So
if you have a function and you have a way of evaluating it by a decision tree, so
what the tree does is it looks at bits one at a time, makes a decision as to what
bit to look at next, then reads the next bit, and at some point it stops reading
bits and outputs the value of the function. So delta i is the probability that the
i-th bit is read. Okay.
And we set delta of T to be the maximum over i of delta i of T, and delta of the
function is the minimum over decision trees T of the maximum probability that a
bit is read. Okay. So how small can delta be?
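To make the quantity concrete, here is a minimal sketch, not part of the talk: a decision tree for majority that reads bits in a random order until the outcome is forced, with a Monte Carlo estimate of delta i; the function, the particular tree, and all names are my own illustrative choices.

```python
import random

def estimate_delta(n=9, trials=20000):
    """Estimate delta_i = P[bit i is read] for one random-order decision tree
    evaluating majority on n bits, and return the max over i."""
    reads = [0] * n                            # how many runs read each bit
    for _ in range(trials):
        x = [random.choice((-1, 1)) for _ in range(n)]
        plus = minus = 0
        for i in random.sample(range(n), n):   # read bits in a random order
            reads[i] += 1
            if x[i] == 1:
                plus += 1
            else:
                minus += 1
            if plus > n // 2 or minus > n // 2:  # majority is now forced: stop
                break
    return max(r / trials for r in reads)      # delta of this particular tree

print(estimate_delta())   # of order one: majority must read most of the bits
```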
All right. So just to give some example Boolean functions. Okay, so if the
function is always one then delta is zero. Okay. And so to rule out this
degenerate case, we require the function to be balanced, and what a balanced
function is, is one whose expected value on a random input is zero. So
half the time it's one and half the time it's minus one.
Okay. So there's the dictator function, which always returns the value of the first
bit. And for that, delta is one, because you always have to look at that bit. The
majority -- like I said, this is order of one. You have to look at most of the bits in
order to figure out what the majority is. And this is in contrast to the related
concept of the influence of a bit on the function. So what the influence is
is the probability that f of x is not equal to f of x with the i-th bit flipped.
Okay. So this is the influence of the i-th bit.
Okay. And so here, for majority, the influence is pretty small. It's order one over
root N. Okay. So another classic example of a Boolean function, which is due to
Ben-Or and Linial, is the tribes function. For this you have two to the N blocks
of N bits. And if any of these blocks is all zeros, the function is zero. Or I should
say minus one.
Okay. So for tribes the influence is log N over N, which is small, but delta is still
pretty big because you have to look at most of the tribes in order to verify that the
function isn't minus one. But for any given tribe you only have to look at order
one bits typically, and so it's order one over log N.
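To keep these examples straight, here is how I would summarize them in symbols, up to constant factors; the notation Delta(f) for the best achievable delta and I_i(f) for the influence is mine, not the speaker's.

```latex
\begin{align*}
  \text{dictator:} &\quad \Delta(f) = 1, & \max_i I_i(f) &= 1,\\
  \text{majority:} &\quad \Delta(f) = \Theta(1), & \max_i I_i(f) &= \Theta\!\big(n^{-1/2}\big),\\
  \text{tribes:}   &\quad \Delta(f) = \Theta\!\big(1/\log n\big), & \max_i I_i(f) &= \Theta\!\big((\log n)/n\big).
\end{align*}
```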
Okay. So we constructed some example for which delta was one over N to the
half times square root of log N, and I don't really want to describe the example in
great detail, but there's an easy lower bound of one over root N for any function,
and that's just because if each bit is read with probability one over root
N, you're looking at fewer than root N bits, and if you take an independent input
and an independent run of the algorithm, there's a good chance that the two runs
will look at different bits, different collections of bits, and furthermore that they'll
have different answers, and then you can just sort of combine the two inputs, and
the algorithm -- I mean, it can't decide what to output in that case.
So, all right, so that's the lower bound here. And for this example, the influence
is also one over root N times root log N. Okay. So then we also looked at
monotone functions, and for that there's some construction that gives one over
the cube root of N, and it turns out the influence is one over N to the two-thirds log
N. So this is based on a branching process, but I won't say too much about the
actual function.
So what I want to talk about is this. So there's a matching lower bound, one over
N to the one-third, which holds for any Boolean function --
>>: [inaudible].
>> David Wilson: Any monotone Boolean function, yes. Thanks. That was
proved by Oded. And I'll say a few words about how this proof goes. Okay. So
this is the Fourier coefficient corresponding to a set S. This is the expected value
of f of x times the product over i in S of x i. And for monotone functions, the
influence of the i-th bit is just the Fourier coefficient of the singleton set containing i.
Okay. So there's a bound due to Servedio and O'Donnell, or O'Donnell and
Servedio, which says that if you sum up these singleton Fourier coefficients, this is
at most the square root of the expected number of bits read by any algorithm that
evaluates the function.
>>: Is that just [inaudible].
>> David Wilson: This is for any function f. Okay. And so this is O'Donnell and
Servedio. Basically the way the proof works is that this sum can be expressed
as an inner product. And then this is by Cauchy-Schwarz, and let's rewrite this slightly.
>>: [inaudible] remove the letter from [inaudible].
>> David Wilson: All right. All right. So this is the square root of the expectation
of f of x squared, times the square root of the expectation of the square of the sum
over i of x i times the indicator that the i-th bit is read. Okay. So f is plus or minus
one valued, so the first factor evaluates to one, and here, if you expand the second
one, you get a term where both bit i and bit j are read, and if i and j are different the
expectation is just going to be zero, so you only get the diagonal terms here, and okay,
those are x i times x i, which is always going to be one, and so you end up getting the
expected number of bits read.
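Here is the chain of inequalities being described, written out in my own notation; it is a reconstruction of the argument on the slide, not a verbatim copy. J denotes the random set of bits read by the algorithm.

```latex
\begin{align*}
  \sum_i \hat f(\{i\})
    &= \sum_i \mathbb{E}\big[f(x)\,x_i\big]
     = \mathbb{E}\Big[f(x)\sum_i x_i\,\mathbf{1}_{\{i\in J\}}\Big]
     && \text{(an unread bit is mean zero given the run)}\\
    &\le \Big(\mathbb{E}\big[f(x)^2\big]\Big)^{1/2}
         \Big(\mathbb{E}\Big[\big(\textstyle\sum_i x_i\,\mathbf{1}_{\{i\in J\}}\big)^{2}\Big]\Big)^{1/2}
     && \text{(Cauchy--Schwarz)}\\
    &= \sqrt{\mathbb{E}\,|J|}
     && \text{since } f^2\equiv 1 \text{ and the cross terms } i\neq j \text{ vanish.}
\end{align*}
```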
And like I said, this also follows from an inequality due to Schramm and Steif, which
is that if you sum the squared Fourier coefficients over sets of size K, this is at most
delta times K times the L2 norm of f squared, which for Boolean functions is always
going to be one. So here you take K equal to one, and do some manipulations, and
get a similar inequality.
Okay. So there's another inequality, which is due to O'Donnell, Saks, Schramm,
and Servedio, which we need, and that's that the variance of a Boolean
function is at most the sum over the input bits i of delta i, for any decision tree, times the
influence of the i-th bit.
Okay. So for balanced functions, the variance is going to be one. This is upper
bounded by delta times the sum of the influences. And for
monotone functions the influences are just these singleton Fourier coefficients, which then by
O'Donnell-Servedio is at most this, the square root of the expected number of bits read.
And the expected number of bits read is at most delta times N, and in the end this shows
that delta has to be at least N to the minus one-third.
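In symbols, my reading of the chain just described, for a balanced monotone f evaluated by a decision tree whose maximum read-probability is delta:

```latex
\begin{align*}
  1 \;=\; \mathrm{Var}(f)
    \;\le\; \sum_i \delta_i\, I_i(f)
    \;\le\; \delta \sum_i \hat f(\{i\})
    \;\le\; \delta\, \sqrt{\mathbb{E}\,[\#\text{bits read}]}
    \;\le\; \delta\, \sqrt{\delta\, n},
\end{align*}
```

since each bit is read with probability at most delta; hence delta cubed is at least 1/n, i.e. delta is at least n to the minus one-third.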
Okay. All right. So I'll say a few words about how the O'Donnell-Saks-Schramm-Servedio
inequality is derived. So what they do is they say that if you look at, say,
X and Y, independent bit patterns, and look at the expected difference
between f of X and f of Y, and if the decision tree is going to look at bits
i1, i2, and so on, up to iS, in that order, we're going to define U sub zero to be equal
to Y except -- the notation that they use is X at i1 up to iS times Y -- so what this
means is that it's equal to Y except at these bit positions, where it's equal to
X. U1 is the same thing except starting at i2. And then U sub S is equal to Y.
Okay. And so by the triangle inequality, this expected difference is at most the sum
of the differences between f at U sub t minus one and f at U sub t. Okay. And so if
you take a generic term in there, f at U sub t minus one, minus f at U sub t, this is the
sum over i of the same thing times the indicator that the t-th bit read is equal to i.
Okay. So if you condition on the bits that you've read up to time t minus one,
then what you have in U sub t minus one -- well, that was just equal to Y, except
at the bits that you haven't read yet, where it's equal to X. And so this is
conditionally just a random input, and if you compare it with U sub t, that's just
the same random input except that we rerandomize the bit i sub t, and okay, so if
you subtract these, then conditional on what you've read so far the expected
value of this is just the influence of the i-th bit on f times the probability that you
read bit i at time t, okay? And so then if you sum over all i's and sum over all t's,
this gives the right-hand side over here, and the expected absolute difference
here is basically the variance of the function. So that's how they got their bound.
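A compact version of this hybrid argument, in my notation and hedged as a sketch rather than the authors' exact write-up: run the tree on x, reading bits i1, ..., iS, and let U_t be y with the bits i_{t+1}, ..., i_S replaced by those of x, so that f(U_0) = f(x) and U_S = y.

```latex
\begin{align*}
  \mathrm{Var}(f) \;=\; \mathbb{E}\,\big|f(x)-f(y)\big|
    \;\le\; \sum_{t}\mathbb{E}\,\big|f(U_{t-1})-f(U_{t})\big|
    \;=\; \sum_{t}\sum_i \mathbb{P}[\,i_t = i\,]\; I_i(f)
    \;=\; \sum_i \delta_i\, I_i(f),
\end{align*}
```

where the middle equality is the conditioning step of the talk: given the bits read before step t, U_{t-1} is a uniform random input and U_t is the same input with bit i_t rerandomized, and summing P[i_t = i] over t gives delta i.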
Okay. All right. So basically the O'Donnell-Saks-Schramm-Servedio bound says
that the product of these two has to be at least one over N, and it's
essentially tight for several of these examples.
Okay. So I guess I have a few minutes left, and so I'll tie this into random-turn
games and okay, so how about if I start the projector. All right. All right. I
apologize for the technical difficulties here. All right. So this is -- this is one of
Oded's favorite Boolean functions here. It's percolation, that is.
Okay. So we're going to tie this into random-turn games. Okay. So this is the
game of Hex where at each turn a coin is flipped to decide which player moves
next. And in this case the computer is playing against itself. And so this is from
work with Yuval Peres, Scott Sheffield, and Oded. And so one thing that we
show is that the probability that one of the players wins is equal to the probability
that you'd have a crossing in random percolation.
And you can show this by induction. So --
>>: [inaudible].
>> David Wilson: On any board, yes, that's right. Okay. So it's true if there are
no moves left, or if there's one move left, and -- okay. So suppose that you've
shown it if there are K moves left; then if there are K plus one moves left you can
verify that, you know, the best move for black is the site that's most likely to be
pivotal, and it's also the best move for white, and -- all right. So anyway.
The probability that black wins is the probability of a black crossing in this game.
Okay. So all right, so from work of Smirnov and Werner --
>>: [inaudible].
>> David Wilson: Yes. I believe I said that.
>>: [inaudible] what do you have to do to win? [laughter].
>> David Wilson: All right. So white wants to connect the white sides, black wants to
connect the black sides, and they toss a coin to decide who moves next. And
they can't move tiles. Once they're placed, they're placed.
>>: What was the last thing you said?
>> David Wilson: You can't move tiles. Once you lay them down, then there
they stay.
>>: [inaudible].
>> David Wilson: Okay. So this can be viewed as a decision tree algorithm. So
the players play optimally. They decide where they're going to move, and instead of having
the coin toss decide who moves next, you have -- they agree on their common
best move, and the coin toss is the random bit, and from Smirnov and Werner we
know that the influence of the bits near the middle is going to be, for an L by L
board, L to the minus 5 over 4.
And then from the O'Donnell-Servedio bound, okay, first of all, this is a monotone
function. If we do a summation over all sites, this is going to be L squared times L to the
minus 5 over 4.
>>: [inaudible].
>> David Wilson: Right. So this is L to the three-quarters, and we square. And so this
is -- all right. So this is at least that, at least L to the three halves plus little o of one.
So this says that if you are to evaluate this percolation crossing function, no matter which
strategy you use, it's always going to take at least L to the three halves bits to
uncover in order to decide what the function is.
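The arithmetic being done on the board, as I reconstruct it, combining the influence estimate with the O'Donnell--Servedio bound for this monotone crossing function:

```latex
\begin{align*}
  L^{2}\cdot L^{-5/4+o(1)} \;=\; L^{3/4+o(1)}
  \;\le\; \sum_i \hat f(\{i\})
  \;\le\; \sqrt{\mathbb{E}\,[\#\text{bits read}]}
  \quad\Longrightarrow\quad
  \mathbb{E}\,[\#\text{bits read}] \;\ge\; L^{3/2+o(1)} .
\end{align*}
```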
And for random-turn Hex, experimentally the exponent seems to be between 1.5 and
1.6.
And so I think I'm about out of time, so I'll stop here.
[applause].
>> Amir Dembo: Questions or comments?
>>: So you don't expect this to be optimal [inaudible].
>>: It's an optimal strategy for the players, but it's --
>>: No, no, but it's not optimal [inaudible].
>> David Wilson: We don't know any better algorithm. Maybe it's optimal,
maybe it isn't. We do not have an opinion. Yes?
>>: I have one comment. One is that this whole project actually also started
from a hike with [inaudible], where he, you know, on the hike suggested
that there should be a connection between percolation and Hex, because they're
happening on the same board, and so you should take seriously these
suggestions to, you know, play various games.
And the other comment is that there's a -- so there's a lower bound of L to the
three halves for the length of the game that we get from the argument [inaudible],
but the embarrassing thing is we don't know any upper bounds. So the game
lasts at most L squared; it's about L to the 1.6 [inaudible] show less than L
to the 1.99, L to the 2 minus epsilon, for the length of the game. We don't
know -- we don't know even [inaudible].
>>: I think we'll have --
>>: [inaudible].
[applause].
>> Amir Dembo: Okay. So it's a pleasure to resume the session with the next
talk by Christophe Garban on Oded's work on Noise Sensitivity.
>> Christophe Garban: Okay. So first I would like to thank the organizers for this
conference, which enables all of us to remember Oded and his mathematics. I
had the chance to work with him during about two years, and let me tell you it
was an amazing experience. Maybe I should say quite an amazing experience.
[laughter]. So I could experience many things that [inaudible] described this
morning, like for example when we worked with Gabor in the [inaudible] and
some strategy completely collapsed, which happened very frequently, then the
next morning Oded would come with something completely new. So I
experienced many of these things. Maybe the only thing that changed
from '99 to 2007 was that [inaudible] was into foosball maybe.
So in this talk I will present one of the many fields in which Oded made great
contributions, namely his work on noise sensitivity of Boolean functions. But I will
restrict things to the case of percolation. So we will encounter Boolean functions,
but applied to the case of percolation.
So what we will see is that when you look at critical percolation in the plane and
you're interested in macroscopic properties of this percolation, these macroscopic
properties are very sensitive to perturbations. If you change the picture just a little
bit, your large-scale connectivity properties will be completely different.
So we will see that this will correspond to the following phenomenon, the
phenomenon that macroscopic events in some sense are of high frequency. So
we will see what I mean by this. So just to illustrate this sensitivity to small
perturbations, here I have a simulation of a critical Z2 percolation. So I have
p c equals one-half, and I have colored the three largest clusters
on the left, so the red one is the biggest and then the blue and the green, and on
the right it's the same configuration except there are some mistakes, so -- or
rather I noised the left picture a little bit.
So if you have good eyes you will see that the two discrete configurations are in
a way very close, but nevertheless the macroscopic properties there are very
different.
>>: What's the noise level?
>> Christophe Garban: Well, I cheated a little bit because it's not that
small. Here the noise is maybe 0.1 or something like this. So the noise
here, it means that for each single edge you keep the edge with probability 1
minus epsilon, and with probability epsilon you resample it.
So here I -- so one of the motivations that Oded [inaudible] had when they looked at
these things was to start a route towards proving conformal invariance. So I won't
go into this, because that's a different route, but another motivation was to
study a model called dynamical percolation, which is a model which is the analog
of Glauber dynamics but in the case of percolation, and here I have a movie
done by Oded, but I don't know if -- okay -- it's going to work.
So in this movie done by Oded you see a percolation on the triangular lattice, so
imagine this thing in the whole plane, and each [inaudible] is resampled [inaudible],
so you have your critical percolation which is evolving in time, and what was
expected from the very first paper on dynamical percolation was
that even though at each fixed time you see a critical percolation, so you see only
finite clusters, when you run the dynamic you will have some
explosion times where an infinite cluster should appear.
So this took several years to be proven, and it was proved by Oded and Jeff;
I'll mention this later. But in order to prove that you have such explosion times,
exceptional times, you need to show that somehow the [inaudible] the system is
moving really fast, so that sometimes it can catch infinite paths. So you need to
prove these kinds of sensitivity statements, that large-scale properties move
very -- really fast. There is a fast mixing property for these things.
Okay. So those are the motivations to study this instability. So in percolation
we are mainly interested in clusters and connectivity and things like this, and
these large-scale connectivity properties are naturally encoded by Boolean
functions.
So for example, if you take a large rectangle, you may ask if there is a crossing
of it from left to right. So this is a Boolean function, and this is the same kind
of Boolean function as the one David talked about at the end of his talk. So it's
defined very simply like that. It's going to be one if there's a crossing, and zero
otherwise. And we are going to consider this Boolean function on larger and larger
scales.
So recall the movie that I showed at the beginning. So omega zero would
be the initial configuration. And if you look at the configuration at a later time T,
a small time T, then again you have a noised configuration of the initial one. So
between omega zero and omega T, you change a small proportion of
hexagons.
So we will look at these functions of the large-scale percolation configuration and we
will wonder how sensitive they are to this type of noise. So how to quantify that a big
connectivity property like the Boolean function we had before, how to quantify
that it mixes fast or that it loses its memory very fast: very simply, we can quantify
this with the covariance. So we can look at the covariance of having a crossing
at time zero and having a crossing at time T. And if this covariance
converges to zero it means you lose the information. So the decorrelation of
macroscopic properties will correspond to these covariances going to zero.
So noise sensitivity corresponds to the vanishing of these covariances here.
Maybe the second one you can think of it as: you know what the initial
configuration is, and you know that a small proportion, roughly T, of bits has
changed. What can you guess about the outcome F N of omega T, the
crossing at time T? If it's noise sensitive this covariance is going to zero, so basically
this means that you can't guess anything.
So note that -- note that defined in such a way, noise sensitivity is not a quantitative
statement, and in order to prove that you have explosion times we will need more
quantitative versions of noise sensitivity. And what I mean by this is we will need
to know at what speed the large-scale system decorrelates. So we will need to
know at what -- at which speed these covariances converge to zero.
So the natural setup to look at these things -- and I think Itai once told me that
initially Oded was not so much convinced about this, but his later contributions
show that finally he was convinced and he was very comfortable with it -- is to work
on the Fourier side. So I quickly recall harmonic analysis of Boolean functions,
but David used some of it just before. So we view all Boolean functions in the
larger L2 space on the hypercube, and
when you do Fourier analysis you just need to find a natural and convenient
orthonormal basis. And here there is a very natural and convenient one, which is the
characters of this group. So this orthonormal basis is indexed by the subsets of the bits,
S a subset of the bits, and for each of these subsets S, the character corresponding to
the subset S is just the product of the values of the bits in this subset S.
So this gives you 2 to the N such functions, and it's easy to see that they are
orthogonal, and so Fourier analysis on the hypercube is just projecting your
function onto each of these 2 to the N characters. So this gives you the Fourier
coefficients. And for example, the Fourier coefficient corresponding to the empty
set is just projecting onto the constant, so this is the average. So we will see that
when you take any Boolean function and you look at the sequence of its Fourier
coefficients, they fully encode the properties that are interesting for us.
So why is it helpful for the mixing, the sensitivity of the Boolean functions? It is just
because the covariances are easily expressed in terms of the Fourier
coefficients. So if you want to compute the correlation between time zero and
time T, then you write down your observable and you project it on your basis, and
then you use the fact that two different characters are orthogonal, so only
the diagonal terms remain. And you end up with this simple formula here.
So try to remember this formula at least until the next slide. The covariance is the
sum of the Fourier coefficients squared times E to the minus T times the size of S. So
you notice here that if your Boolean function has its coefficients on sets of high cardinality,
then the covariance is going to be very small.
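For the record, here is the formula on the slide as I reconstruct it, for the dynamics in which each bit is resampled at rate one:

```latex
\begin{align*}
  \mathrm{Cov}\big(f(\omega_0),\, f(\omega_t)\big)
  \;=\; \sum_{\emptyset \neq S} \hat f(S)^2\, e^{-t\,|S|},
\end{align*}
```

so the covariance is small exactly when the squared Fourier coefficients put little mass on small, low-frequency sets S.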
On the other hand, if your Boolean function is supported on the low frequencies,
then this is going to be stable. Okay. So this agrees with the usual intuition. But
what will be important for us when you have a Boolean function will be what I
think physicists call the energy spectrum, which represents the contribution, for each K
between 1 and N, of the level K Fourier coefficients. So if your function has a
distribution which is more to the right, then it's going to be more sensitive than if
it's stuck on the left.
So just using Parseval you can see that the total mass of this spectrum, where we
don't include the empty set, corresponds to the variance of your Boolean
function. So now in percolation what we will need to do: we have these Boolean
functions F of N, which correspond to left to right crossings in boxes of scale N,
and we'd like to know what is the shape of the energy spectrum of this Boolean
function. Is it located near finite frequencies, or does it spread to infinity, or how
does it look?
So we'd like to describe the shape, for what we called F N, these types of
Boolean functions -- we'd like to describe the shape of the energy spectrum.
And if we want a quantitative statement on the sensitivity of percolation, we want
more: we would like to know at which speed the spectral mass will diverge to
infinity.
So just as a comparison, I drew here the energy spectrum of the majority function.
So in this case, you can compute things explicitly, and you can see that in some
sense most of the spectral mass will be localized in finite frequencies. So we
expect that in the case of percolation it should look different. So there are
basically three very different approaches to try to localize where the spectral mass
is. So the first one was done by Oded, Itai and Gil in '98, and it used techniques
from analysis.
And so I'll mention what their result was later. The second technique was a
technique based on randomized algorithms, and this is something which is related
to the previous talk a little bit. And the last one was the work where we had many
attempts, Gabor and Oded and I, and finally [inaudible].
And so you can -- when you look at these approaches, which turn out to be very
different, you can see that there is a common denominator to these three. And if
you don't see it, then I can help you. So okay. So now I just present these results.
So the first one -- from the beginning there were heuristic arguments which say at
which speed the spectral mass should diverge to infinity, and it's even more
than this. One can prove that a positive fraction of the mass will spread at this
speed here, if you know the critical exponents.
But what is a little bit embarrassing is that even though you know that a positive
fraction diverges really fast, it could still be that you have a positive fraction lying
at the bottom part of this distribution. And it turned out to be hard to
handle the lower tail of the distribution.
So the first result that I mention, which uses hypercontractivity and which
uses ideas that came from analysis but also from computer science at the time,
proves that the spectral mass diverges at least logarithmically fast. So stated in
terms of Fourier coefficients, it means that asymptotically, you don't have mass
in the finite frequencies. So this already proves a noise sensitivity statement.
So the second result, the one that Oded did with Jeff, went a little bit further and proved
that the spectral mass diverges at least polynomially fast. But they proved more.
They gave a bound on the lower tail of this energy spectrum, and this lower
tail control enabled them to prove that there are exceptional times in the
dynamical percolation model. And finally with Oded and Gabor we could go all
the way to the expected N to the three-fourths with a sharp control on the lower
tail.
So maybe I should insist here that even though only with the third one can we go all
the way to N to the three-fourths, these first two approaches have the advantage of
being much more general than the one we did. For example, the one that uses
hypercontractivity in some sense says that if you take any Boolean function
such that each of its variables has very small influence on the outcome, then you will
have a control -- a logarithmic control -- on its Fourier spectrum.
So somehow these are methods that we can apply to many cases. And as
well, in the case of the work by Oded and Jeff, if you have a randomized
algorithm which computes your function and which looks at very few bits, then you
will have a very good control on the Fourier spectrum. So maybe this third approach
could be applied to other things, but it's not clear yet. So just to say one word on
the last approach, and then I will describe in more detail the green one, the one
with Jeff.
The last approach: in the real case, if you have a function in L2 and you
look at its Fourier transform, then the square of this Fourier transform
defines a probability measure on the real line, so you can see this as a
random variable; for each function F you can associate a random
variable on the line. For Boolean functions you can do exactly the same. But this
time, instead of having a probability distribution on the real line, you will have a
probability distribution on the characters. So you will have a probability
distribution on subsets of your bits.
And very roughly speaking, the idea of this third approach is to sample
a frequency according to this measure. So this will be some subset
of the bits -- so if your Boolean function is the left to right crossing from before, it will
be some random subset of the square. And this random subset was believed to be
very close to the pivotal points of percolation. But it turns out that this actually should
be different. And the goal was to try to study the properties of this random set and
to try to describe it in a way. It is a little bit close to a random Cantor set. But the
difficulty was that the dependency structure of this random set is a little bit hard
to analyze. So I won't describe more the red and the blue approaches, and I will
now try to describe more the algorithmic approach that Oded has done with Jeff. So
this is based on randomized algorithms.
So what is this? So if you take a Boolean function F from the cube into {0, 1}, a
randomized algorithm is something that computes the function F by examining
bits one at a time. So you take a first bit at random or according to some
procedure, and depending on the value of this first bit you choose a new one, and so
on, until you discover the output of the function F. And an algorithm for us will be
efficient if you manage to compute F using the smallest number of bits.
So to quantify this, I use the same thing as what David talked about before,
which is the maximum over all the variables of the probability that the variable is
used along the algorithm. So this is called the revealment.
actually used along the algorithm. So David already gave the example of
majority, so majority there is nothing really smart to do, you just need to ask
people one at a time and you'll at least to ask half of the people, and it's easy to
see that with high probability you need to ask many, many, so the revealment is
not going to be very good.
So you could take other examples like recursive majority. And here you can -- I
won't do it, but you can easily find an algorithm which will ask very few people.
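Since the speaker skips the details, here is one minimal sketch of such an algorithm, not taken from the talk: for the 3-wise recursive majority on 3**h bits, evaluate two random children of each node first and look at the third only when the first two disagree; the code and parameters are my own illustration.

```python
import random

def eval_node(bits, start, size, queried):
    """Evaluate recursive 3-majority on bits[start:start+size] (size = 3**k),
    recording in `queried` which leaves were actually read."""
    if size == 1:                       # a leaf: read the actual bit
        queried.add(start)
        return bits[start]
    third = size // 3
    kids = [start, start + third, start + 2 * third]
    random.shuffle(kids)                # randomized order of the three children
    a = eval_node(bits, kids[0], third, queried)
    b = eval_node(bits, kids[1], third, queried)
    if a == b:                          # majority already determined
        return a
    return eval_node(bits, kids[2], third, queried)

def revealment(h=5, trials=20000):
    n = 3 ** h
    counts = [0] * n
    for _ in range(trials):
        bits = [random.choice((-1, 1)) for _ in range(n)]
        queried = set()
        eval_node(bits, 0, n, queried)
        for i in queried:
            counts[i] += 1
    return max(c / trials for c in counts)

print(revealment())   # roughly (5/6)**h, i.e. a small negative power of n
```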
And so we are interested in percolation. So what would be a natural randomized
algorithm?
Well, we've seen it already several times today, so say you want to have the left
to right event here by white hexagons; so what you do -- and this is present in all the
work of Oded on 2D percolation -- is to use the exploration path. So here the
randomized algorithm can ask what is the value of the first hexagon, and depending on
it you ask -- you continue your exploration path and so on.
So if your exploration path ends up being here, the Boolean function is going to
be zero, and [inaudible] it will be one. So this is a nice randomized algorithm, but
not so nice, because it makes the revealment equal to one. Why? Because here
I always ask the value of the first hexagon. There is an easy fix. You can
randomize the first -- the departure point of the exploration path, and maybe you need
at most two exploration paths to be sure of the result, but eventually by
randomizing the beginning you can have a nice randomized algorithm computing
the left-right crossing. So let me just tell you what the revealment will be for such a
function. Asymptotically the set of bits used by the algorithm, the set J, will be the
SLE(6) curve that [inaudible] mentioned this morning. So asymptotically you see that
the amount of bits which are used by the algorithm is very small, because this
curve occupies a very small fraction of the window. And since we know that
this is something of dimension, as David mentioned, seven-fourths, it even
implies that the revealment is of order N to the minus one-fourth. So there are
some issues to play with, but this is standard in percolation, so for left to right
crossing on the triangular grid you can have algorithms with very small revealment.
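A back-of-the-envelope version of that count, in my own paraphrase and gliding over the boundary issues just mentioned: the exploration path visits about N to the 7/4 of the roughly N squared hexagons in the box, and after randomizing its starting point no single hexagon is visited much more often than average, so

```latex
\begin{align*}
  \delta \;\approx\; \frac{N^{7/4+o(1)}}{N^{2}} \;=\; N^{-1/4+o(1)} .
\end{align*}
```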
So now how does it -- what does it say about the spectrum itself? How does it
help us to localize where the spectral mass is? Well, there is an, I think, very nice
theorem, and the proof is very short, which was done by Oded and Jeff
and was mentioned by David before. So if you take a Boolean function, or
[inaudible] even a real valued function, and if you have a randomized algorithm
which computes F with revealment delta, then this information alone gives you
information deep in the spectrum. For any level K, it will tell you that the level K mass
of the Fourier distribution is dominated by K times the revealment times the L2 norm
squared of F.
So in particular, in the case of the percolation crossing, since we have a very
small revealment, then we directly have information on the lower tail of the
Fourier distribution.
So in the time remaining I'd like to explain how to get this result here. So we
start with some function like this, and we have this algorithm with revealment
delta, and we would like to have a bound on the level K Fourier coefficients. So to
do this, we can look at the projection of the function F onto the level K characters.
So we look at the function F, but only on the level K frequencies. And so the only
goal is to estimate what is the L2 norm squared of this projection. So what we
want to prove is that this is less than K times delta times this. So how to do this?
Well, first notice that the L2 norm squared, this is the same as this. And now we
want to use the knowledge that we have an algorithm, and the
fact that it has a small revealment and so on. So, knowing the information coming
from the algorithm -- now, the function F is computed by the algorithm, so it is
measurable with respect to this algorithm A -- this is going to be this. And by
Cauchy-Schwarz this is less than the norm of F. So now the goal is to try to bound this
expectation here. So let me just take a simple example to see what's going
on. Assume we have six bits, X1, X2, up to X6. And say we have
some Boolean function, but we only look at the projection onto the level 2
coefficients, and say that once projected we have some coefficient alpha times X1 X2,
plus some coefficient beta times X1 X4, plus some coefficient gamma times X5 X6.
So basically we have this frequency here, this frequency here, and this one. And
now say that you -- you compute your function F, which sits above this function G,
right, and say that the randomized algorithm is going to look at these three
variables, X1, X2 and X3. So F is computed, but G is not computed completely. So
what do we get for G from the algorithm? In that case, if you know what the algorithm
saw, the first term is going to be alpha times X1 X2, with both values known. For the
last frequency here, you didn't learn anything. And for the middle one you will have
beta times X1 times X4, where X1 is now known. So let me write it like this. So this
first term is constant, because you know what it is. The algorithm told you what it is.
And then you have the last term, which is gamma times X5 X6. So this is a constant
term, this is a level one term, and this is a level two term.
So all of this is to say that once you apply the algorithm, you have a kind of
collapsing of the frequencies from size K to smaller sizes. And the whole
game is to get a bound on the collapse onto the empty coefficient.
>>: When you say [inaudible].
>> Christophe Garban: Yes. So -- right. So here it was not the expected value,
it was -- I don't know how to write it, but -- oh, actually, I want to write it like this.
So this is the random function which depends on the algorithm and on the set J
that you visited, and this is the expectation of this thing, which is the expectation
of G knowing A. And here you only keep this term, which is constant, and the
other ones, they vanish.
So this term is basically the average of this random function when you average over
all the other bits. So we can write it like that. Okay. So here when you run the
algorithm, you discover some sites and you have all your frequencies floating
around. And when you discover a whole frequency, it goes to the empty
coefficient, and this accumulates, and you want to say that if the revealment is
small, this is not going to accumulate too much.
So now, just to bound this thing: this is the expectation of the square of the
empty coefficient of this random function here, and we can write it like this. So
by Parseval, this is going to be exactly this.
So in some sense it's not so easy to study this collapsing process, so Oded and
Jeff had this trick to reverse the study. And now what to do with it? Well, the
expectation of this, this is just the expectation of G squared. And okay, so we get
this. And now we don't really know what to do with the frequencies which are of
size other than K, so we just bound. So far we didn't lose anything in the
control of the Fourier mass, but here we bound by saying that this is less than G
squared minus the mass on the size K coefficients only. And this gives us -- so this
gives us, for E of G squared, the sum of this, and we subtract the sum of this thing.
So these are also the coefficients of G. This is the same. So now what do we do
with it? Well, when you have a frequency somewhere, a set of bits of size K,
then the algorithm comes in and reveals sites one at a time.
If for this frequency here you reveal one of its sites, then the frequency
here will collapse to a smaller size. So for this to be nonzero, you need -- you need
the algorithm to avoid this frequency here.
But now, individually, each site of this size K set has probability at most delta
of being visited. So the probability that the set is visited at all will be less than K
times delta. So this is nonzero only with probability at most K times delta, so this is
going to be less than K times delta times the sum over S of size K, like this, and this
is G squared. And if you plug this in in the [inaudible], this gives you this. Okay. So
this estimate gave a lower bound on the energy spectrum of percolation, and also for
the radial event, and that's what gives [inaudible] the existence of exceptional times.
Okay.
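To summarize the argument of the last few slides in one place, here is my attempt to write it out; it is a reconstruction in my notation, not the speaker's board. Let g be the level-K part of f, J the random set of bits examined, delta the revealment, and write g_J for the "collapsed" random function obtained by plugging in the revealed bits.

```latex
\begin{align*}
  \|g\|_2^2 \;=\; \mathbb{E}[f\,g]
          \;=\; \mathbb{E}\big[f\;\mathbb{E}[g \mid \mathcal{F}_J]\big]
          \;\le\; \|f\|_2\,\big\|\mathbb{E}[g \mid \mathcal{F}_J]\big\|_2
  \qquad\text{(Cauchy--Schwarz, $f$ being determined by the run).}
\end{align*}
% The conditional expectation is the empty coefficient of g_J, and Parseval for
% g_J gives  \widehat{g_J}(\emptyset)^2 \le \mathbb{E}_{\omega'}[g_J(\omega')^2]
%            - \sum_{|S'|=K}\widehat{g_J}(S')^2 .
% A level-K coefficient of g_J survives only if the algorithm avoided S, i.e.
% \widehat{g_J}(S) = \hat f(S)\,\mathbf{1}_{\{S\cap J=\emptyset\}}, so taking
% expectations over the run and a union bound over the K bits of S:
\begin{align*}
  \big\|\mathbb{E}[g \mid \mathcal{F}_J]\big\|_2^2
    \;\le\; \|g\|_2^2 - \sum_{|S|=K}\hat f(S)^2\,\mathbb{P}[S\cap J=\emptyset]
    \;=\; \sum_{|S|=K}\hat f(S)^2\,\mathbb{P}[S\cap J\neq\emptyset]
    \;\le\; K\,\delta\,\|g\|_2^2 .
\end{align*}
% Combining the two displays, \|g\|_2^2 \le K\,\delta\,\|f\|_2^2, which is the
% level-K bound stated above.
```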
[applause].
>> Amir Dembo: Questions or comments? I think we'll thank Christophe again.
And we will start without a break.
So it's a pleasure to introduce Gabor Pete who will present a talk on how to
produce -- prove tightness for the size of strange random sets.
>> Gabor Pete: Thanks. Okay. So this is Oded fetching water for our dinner
with Christophe at Mt. Rainier. So the water is coming from inside the glacier. So
officially the [inaudible] was unsuccessful, in the sense that we didn't get to the
top. We had to turn back from about 4,000 meters. But I really enjoyed it, and it
was a great experience for me. So still I'm very grateful to Oded for the hike.
>>: I have a question.
>> Gabor Pete: Yes?
>>: [inaudible]. [laughter].
>>: How did you get him to turn back? [inaudible]. [laughter].
>> Gabor Pete: Okay. Ask Christophe. But he was okay.
>>: Probably worried about [inaudible].
>> Gabor Pete: Okay. So the photo is not related to the talk, except that the
people involved are the same. So it will be a continuation of Christophe's talk, in
the sense that this strange random set I'm talking about is the Fourier coefficients
of critical percolation. But some of the story, I think, is interesting also
more generally. So I'm trying -- we'll try to explain that. And also I think
whenever we gave a talk on this paper we never managed to get to this
last point, so I will try to remedy this. Okay. Okay. So Christophe gave the
introduction to what the Fourier coefficients have to do with noise sensitivity, but
I'll recall it very briefly. So, okay, this is difficult.
Okay. Anyway, so when we have a finite set, V, and we have all the possible
plus-minus configurations, say for simplicity with uniform measure -- so say plus-minus,
or black and white colorings of the hexagon board -- and you have -- you
have functions on that, for example the crossing function, the indicator function of
having the left-right crossing. And the -- so that L2 space has a very nice
orthonormal basis with respect to this measure, [inaudible]. The basis element for a
subset is just the product of the bits in that subset, and the inner product is just the
expectation of the product. So this is an
orthonormal basis. And you can take the Fourier expansion in this basis. This is
called the Fourier-Walsh expansion. And you can do a slightly strange thing -- this
is a bit like what quantum mechanics does -- you can encode all this
information about the Fourier coefficients into a random set.
So by Parseval the sum of the Fourier coefficients squared is just the L2 norm of
the function squared. Okay. So if I normalize by this, I can just define a random
set, a random subset of the bits corresponding to F. So the probability that the
set, the spectral sample, is a given S is just the normalized Fourier coefficient
squared.
So this way I get a random subset of the vertices. So I give it in one piece; I tell
you the distribution.
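In symbols, my reading of the definition just given: for f on {-1,1}^V with Fourier expansion over the characters chi_S, the spectral sample is the random subset of V with

```latex
\begin{align*}
  \mathbb{P}\big[\mathscr{S}_f = S\big] \;=\; \frac{\hat f(S)^2}{\|f\|_2^2},
  \qquad S \subseteq V,
\end{align*}
```

which is a probability distribution precisely because of Parseval, the sum of the squared coefficients being the squared L2 norm.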
And so how is this useful for noise sensitivity? So Christophe showed you this
formula: if you have a configuration omega and you resample each bit with
probability epsilon independently -- so this is the noised version of the
configuration -- and you look at the correlation between before and after the
noise, then you can write this very nicely in terms of the Fourier coefficients. So
this -- you have the sum over non-empty sets, because I subtracted this thing
here, which was exactly corresponding to the zero, to the empty set. Okay. So if
you do this little arithmetic you get that -- you get this formula. So the
correlation is basically in some way measured by the size of the -- somehow the
typical size of the spectral sample. So typically in what sense? You can see
from here that -- so you are taking a noise epsilon. If there is no weight
between zero and some large constant over epsilon -- so, yeah, if the
probability of those sizes is very small -- then that part of this expectation will
be small. And for larger sizes, it will be small because of this exponential factor.
So this whole thing will be small if you don't have mass between zero and K over
epsilon.
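The point of that display, as I read it, for a plus-minus one valued f (so the L2 norm is one):

```latex
\begin{align*}
  \mathbb{E}\big[f(\omega)\,f(\omega_\epsilon)\big]
    \;=\; \sum_{S}\hat f(S)^2\,(1-\epsilon)^{|S|}
    \;=\; \mathbb{E}\Big[(1-\epsilon)^{|\mathscr{S}_f|}\Big],
\end{align*}
```

so the correlation survives the noise only if the spectral sample is likely to be either empty or of size at most a constant over epsilon, and decorrelation amounts to saying that the sample is unlikely to be nonempty but of size below that threshold.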
So which means that if you want to show that epsilon noise, or whatever noise,
makes you decorrelate, then you want to show that this strange
random set is unlikely to be non-empty but of at most this size. So
we are after such lower tail estimates for noise sensitivity. And so, at least to us, it
was Gil Kalai who suggested that although we are really interested only in the
size -- I mean, what enters here is only the size of the set -- to understand that
size maybe you should look at the whole distribution as a random set. In critical
percolation, say for the crossing in an N by N square, it's a random subset of
the N by N square.
So we should look at it as a random set. And the -- so it's a strange set, as I
said. We don't know a way to sample from this set effectively, for example.
There is, I mean, some relation to the quantum world: there's a theorem due to
Bernstein and Vazirani that if the Boolean function itself is polynomial time
computable, then there's a quantum algorithm to sample from this spectral sample.
Okay. I don't know. Maybe there are some conjectures that say that it shouldn't be
possible. I have no idea whether it is possible to sample or not. Don't expect anything.
I don't know anything about quantum algorithms and stuff.
Okay. One sign that looking at the spectral sample for critical percolation could
be interesting is that Tsirelson, Boris Tsirelson, has a theory of noises, and this,
combined with the Schramm-Smirnov theorem that the scaling limit of critical
percolation is a noise -- they don't tell you what that noise is, but Tsirelson's
theory of noises does apply -- and from that, and from Smirnov's theorem
that critical percolation has a conformally invariant scaling limit, the combination
of these things says that the spectral sample -- so take the left-right crossing in a
conformal rectangle, just a domain with four marked points, and I'm interested in the
left-right crossing in this conformal rectangle with mesh 1 over N --
so this sequence of random sets, as N goes to infinity, has a scaling limit, and
the scaling limit is conformally invariant. It doesn't say [inaudible] anything about
the scaling limit, but certainly it should be some sort of interesting object.
Now, also maybe this came up in the previous talk. At least -- so it has
something -- the spectral sample has something to do with pivotals. So if F is plus
minus 1 valued, then you can talk about pivotal bits; those are the bits such that if
you flip the bit then the outcome of the event changes from plus one to minus one or
vice versa. And so it's not hard to see that the probability that a bit is pivotal
is exactly the probability that it is contained in the strange
spectral sample. And this is also true for pairs of points, okay? But it's not true
for more than pairs. So for triples it's not, nor for four. So this is also Gil Kalai's
observation.
Which sort of says that this set does have a lot to do with the
set of pivotals, but it is different. So both random subsets measure the influence
or relevance of bits in some sense, but in a little bit different senses. Okay. At
least for the left-right crossing in a quad, you can show that the probability that the
spectral sample intersects some simply connected subdomain B is comparable --
up to constants it is the same -- as the probability that that subdomain
B is pivotal for the left-right crossing. So a subdomain being pivotal means that if you
change all the bits to all black inside the domain, or all white in the domain, this
can make a difference.
And that is basically the probability that you have the four-arm event from the
boundary of B to the boundary of Q. So what is this four-arm event? So a bit is
pivotal if and only if -- it's very easy to see -- you must have a -- so if you are
interested in a white left-right crossing, either you have a white left-right
crossing or you have a black top-down crossing, exactly one of these two, and a bit
is pivotal if you have a black arm connecting to the bottom and a black arm
connecting to the top, a white arm connecting to the right, a white arm connecting
to the left. But if it's white, then you do have this, and if it's black you do have
that.
So this is the four-arm event. So this four-arm event is something that is understood
well for critical percolation [inaudible]. Okay. There's another small fact: for
example, the probability that the spectral sample is contained in some, say, box B
but is not empty is the square of the four-arm probability. So this is very special to
percolation. So for other Boolean functions you won't get something like that.
And also, for example -- so this is only one sign that for pivotals something is
different than for the spectral sample -- the probability that all the pivotals in
the configuration are inside this box B, and you do have pivotals, is not the square
of the four-arm probability but is the six-arm probability. Which is different from this.
Okay. So what do I want to say? Okay. So we have this spectral sample -- just to
-- I mean, you have sort of seen these examples in the previous two talks. So for
example for the dictator, which is the first bit, it's very noise stable, because you
have to toss the dictator in order to make a change. And there the spectral
sample is concentrated on the dictator: with probability 1 it is the dictator.
Majority is still noise stable. The spectral sample -- most of the mass
of the spectral sample is concentrated on singletons, of course
distributed evenly over all the singletons. However, there is some interesting thing,
which is that since the probability of being pivotal and of being in the spectral sample
is the same, the expected sizes of the two sets are the same, and the probability that
a bit is pivotal for majority -- so with probability one over square root of N, you will
have the same number of plus ones and minus ones, and once you have this
situation, every single bit is pivotal. So with this small probability one over
square root of N, you will have N pivotals, which means that the expected number of
pivotals will be large, huge: square root of N. So the expected size of this
spectral sample is square root of N. On the other hand, most of the mass is on
singletons. So this shows that it is not at all the case that the spectral sample
is always typically of the size of its expectation; here it is not at all
true.
Okay. The last simple example is parity: you just multiply the bits, and that's the
most noise sensitive function that you can imagine. Its spectral sample is
concentrated on the entire set; whatever you change, you change everything.
Okay. Now, the spectral sample for the left-right crossing in an N by N square has --
basically following from the conformal invariance of -- maybe just the existence
of -- the scaling limit of critical percolation, it has some very nice self-similarity
properties.
So just on the level of expectations, to start with: the expected size is -- so it's
N squared, the number of bits in an N by N box, times the four-arm probability from
distance one to distance N, which is known to be N to the minus 5 quarters
by Smirnov and Werner, so you get N to the three-quarters as the
expected size of the spectral sample.
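In symbols, my reconstruction of that expectation count, writing alpha_4(r, N) for the four-arm probability between radii r and N:

```latex
\begin{align*}
  \mathbb{E}\,\big|\mathscr{S}_{f_N}\big|
   \;=\; \sum_{x} \mathbb{P}\big[x \in \mathscr{S}_{f_N}\big]
   \;\asymp\; N^{2}\,\alpha_4(1,N)
   \;=\; N^{2}\cdot N^{-5/4+o(1)}
   \;=\; N^{3/4+o(1)} .
\end{align*}
```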
Now, if you look at this on a coarser scale -- so you take this super-lattice of
medium size boxes, so R by R boxes, and you look at the super-lattice, and look
at which boxes -- which R by R boxes are intersected by the spectral sample.
Now, by this previous result that I was showing you, the number of boxes is N
squared over R squared, and the probability that the spectral sample intersects
an R by R box is the four-arm probability from distance R to N.
It turns out that this expected number is exactly as if you were looking at the
spectral sample in an N over R by N over R box. So somehow when you look at
it on the large scale, it looks like the same function on that super-grid.
Okay. So this is one form of self-similarity. And another form of self-similarity is
that if you condition on the spectral sample intersecting some R by R box,
then inside the R by R box it will look somehow just as if you had taken the
crossing in an R by R box.
So at least for the expectations, the expected size there is the expected size of
S R, the spectral sample of the left-right crossing in an R by R square. Of course
these two results are compatible with each other, in the statements but also in the
proofs. You use the quasi-multiplicativity of the four-arm probabilities. What does
that mean?
So if you look at the -- so if you multiply the expected number of R boxes
intersected times the expected size of the intersection once you do intersect, this
product of course should give the expected size of the entire set. And it does.
So this is the compatibility I told you about. So the point is that the
probability that you have the four-arm event from distance one to N is, up to
constants, the same as having it from one to R and then having it from R to N.
So of course one is bigger than the other, because if you have it from one to N,
then you also have it from one to R and you have it from R to N. But also the other
way: once you have it from one to R and you have it from R to N, with positive
probability you get the connection, you get the four arms from one to N. So this is a
nice property of critical percolation. So all these formulas would be true for the
pivotals as well. There is nothing special about the spectral sample. But they are
also like that.
A more classical example: if you take the zero set of a one-dimensional simple
random walk or a one-dimensional Brownian motion, of length N, then the size
of the zero set is around square root of N. Now if you take R boxes on the line,
so R-intervals, then -- so for example the second thing is that if you condition on
having zeros in a specific R-interval, then of course the expected size
there is around square root of R, which is about the number of zeros
of a length R simple random walk. And maybe from this or from other
considerations you also get that the number of R boxes that you intersect is
typically of the size of the number of zeros of a simple random walk of length N over R.
So you do get this self-similarity in many instances in probability. Maybe this
self-similarity can be looked at as somewhat a result of having
a scaling limit for these sets. Okay.
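A small simulation of this classical example, my own and not part of the talk, just to see the two scalings side by side: the zero set of a length-N simple random walk has about sqrt(N) points, and it touches about sqrt(N/R) of the R-blocks.

```python
import random
from math import sqrt

def zero_set_stats(N=40000, R=400, trials=100):
    """Average number of zeros of a length-N simple random walk, and the
    average number of length-R blocks that contain at least one zero."""
    tot_zeros = tot_blocks = 0.0
    for _ in range(trials):
        s, zeros, blocks = 0, 0, set()
        for t in range(1, N + 1):
            s += random.choice((-1, 1))
            if s == 0:
                zeros += 1
                blocks.add(t // R)       # which R-block this zero falls in
        tot_zeros += zeros
        tot_blocks += len(blocks)
    return tot_zeros / trials, tot_blocks / trials

zeros, blocks = zero_set_stats()
print(zeros, sqrt(40000))        # both of order a couple of hundred
print(blocks, sqrt(40000 / 400)) # both of order ten
```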
Now, so you have this self-similarity. What type of concentration do you expect?
So Christophe explained that the intuition, or the hope, was for a long while that
the spectral sample should be concentrated near its expectation. So why do
you expect that, and how concentrated -- what kind of concentration do we
expect? Okay. So these are just silly examples. If you know probability it's
obvious.
So say if you take a uniform set, so just IID with the same density. Well,
surprise, surprise, it will be more uniform than the spectral sample. So it will -- it
will intersect more boxes, but the intersection within a given box will be smaller. So it
doesn't have the clustering effect that the spectral sample does. And then you
get just the CLT, the central limit theorem, that for the concentration of the size --
really most of the mass is within square root of the expectation.
Okay. A bit more similar to the spectral sample is where you take -- so I fix some
R, which is some medium box size, N to the gamma, so gamma between zero and
one, and I take the super-grid and I say, well, let's take an IID sequence again, but
now -- so with some small probability, which is the probability that the
spectral sample will intersect that box, I take a large value X, which is exactly the
expected size of the spectral sample given that it intersects the box, and
otherwise it's zero. So it is more similar. And then I take the sum of these
independent things.
Now this is still -- some generalization of the central limit theorem tells
you what concentration you get. It is a bit more spread out than this completely
uniform thing. But it is still concentrated around its expectation.
Now, if you have this self-similarity stuff going on on every level that you can
imagine, then you don't expect this [inaudible] anymore. So somehow you get
the same [inaudible] on every level, and you get that you expect only tightness
around the mean. So you expect that the probability that the spectral sample is
non-empty but has size smaller than lambda times its expectation should go
to zero as lambda goes to zero. And we did get -- so we would like to get the
exact rate depending on lambda, and we did get the exact rate depending on
lambda. So how can you get such a thing? Okay.
So if you had a lot of independence, then how would the proof of the tightness
work? So for example for the zero set of a one-dimensional simple random walk
you can run the following proof. So what is the probability that -- so conditioned
on the event that the zero set intersects an R box, and conditioned
on everything else, so the configuration -- the set of zeros in all other R boxes on
the interval -- with all this conditioning, still, for zeros it's obvious that the
probability that you have at least a constant times the expected number of zeros
there -- you have some positive probability for that. So this is just the
second moment method, [inaudible] or what is it, [inaudible] inequality or something
like that? Okay. So this is -- this is easy for the zeros.
And another one -- so this is somehow -- what is the probability that you
intersect only very few R boxes? It should be very close to the probability that
you intersect only one R box. So you need this sub-exponential in K
growth. This is the statement. So this statement is somewhat -- it's a result
of the clustering effect that I was telling you about. So if you have these K
R-intervals, if you think of them as being pretty far from each other, then you have to
pay K times the probability to get to the set and leave the set.
And this probability is too big; it's not balanced. So, yes, you have
more ways if the K R-boxes are far from each other -- you have a lot of
combinatorial ways to play with them -- but this combinatorial entropy does not
balance the cost that it takes. So you can -- with a
little bit of work you can prove this for the zeros. Okay. And once you have
these two properties, what -- how does the proof look? So you take
some whatever R, your favorite R: what is the probability that the zero set is at
most c times this expectation? Well, you break it into the events that the number of
R boxes intersected is K. Now, given that you have -- so conditioned on having
K R boxes intersected, for each of them, independently of all the others -- this
is the conditioning -- you have probability c of having a lot of points
in there.
So the probability that you fail every time is at most (one minus c) to the K.
And so you have this line, and from this, by point two, this thing grows
sub-exponentially in K and this thing decays exponentially in K, so you get that
basically this probability, up to constant factors, is the probability that you have only
one intersection. And that you know for simple random walk: well, the number of
these boxes is N over R, and the probability that you get there and leave and don't
come back is N over R to the minus 3 over 2.
And of course the opposite direction is obvious: once you condition on exactly
one box being intersected, then this is the number of intersections you expect.
And then you take lambda to be the ratio between c times the expected number
of zeros in an R-box and the expected total number of zeros. If you take this
lambda and plug it in, you get for simple random walk that the probability that the
number of zeros is positive but less than lambda times its expectation is roughly
lambda.
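Assembling the two ingredients, still for the zero set of the walk, the calculation being described is roughly the following, where $a > 0$ is a name introduced here for the positive success probability from the second-moment step, to keep it separate from the fraction $c$:
$$ \mathbb{P}\big[\, 0 < |Z| < c\,\mathbb{E}|Z\cap B_R| \,\big] \;\le\; \sum_{K\ge 1} \mathbb{P}\big[\, Z \text{ meets exactly } K \ R\text{-boxes} \,\big]\,(1-a)^{K} \;\asymp\; \mathbb{P}\big[\, Z \text{ meets exactly one } R\text{-box} \,\big], $$
since the first factor grows sub-exponentially in $K$ while the second decays exponentially. Taking $\lambda$ to be $c\,\mathbb{E}|Z\cap B_R|$ divided by the total expectation $\mathbb{E}|Z|$ then turns this into the statement $\mathbb{P}[\,0 < |Z| < \lambda\,\mathbb{E}|Z|\,] \approx \lambda$, as just stated.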
Okay, I have three minutes. So the problem is that we don't have that much
independence -- we have no idea whether this amount of independence is true
for the spectral sample or not. We have only a very limited independence. If you
condition on the spectral sample intersecting a box, an R by R box, then you
cannot condition on anything else you like; you can only condition on some set W
not being intersected. This W could be anything not too close to the box, but all
the conditioning that we can handle is completely negative conditioning: you
must condition on not intersecting. And under this conditioning we can prove this
conditional second moment estimate, and hence this positive probability result.
So where does this negative conditioning come from? Why is it only negative
conditioning? Well, the probability that we can handle is the probability that the
spectral sample is contained entirely in some set U, some subset of the bits.
That's just taking all the Fourier coefficients corresponding to subsets of U. And
this is like a projection, so it's not very surprising that it's exactly the projection
given by taking a conditional expectation. So this is this conditional variance --
one can easily check that. Which means, for example, that for two disjoint sets A
and B, the probability that the spectral sample intersects B but doesn't intersect A
can be interpreted as the probability of being contained in something minus the
probability of being contained in something else. So that's this formula. And then
you use the [inaudible] theorem for bounding this, or something like that.
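To record these formulas in symbols, with the usual normalization for a $\pm 1$-valued function $f:\{-1,1\}^n \to \{-1,1\}$ (so the squared Fourier coefficients sum to one) and writing $\hat{\mathbb{Q}}[\mathscr{S} = S] = \hat f(S)^2$ for the law of the spectral sample, the quantities that can be handled are
$$ \hat{\mathbb{Q}}\big[\, \mathscr{S} \subseteq U \,\big] \;=\; \sum_{S \subseteq U} \hat f(S)^2 \;=\; \mathbb{E}\Big[\, \mathbb{E}\big[f \mid x_U\big]^2 \,\Big], $$
which is exactly the squared $L^2$-norm of a projection, hence the remark about conditional expectations; and for disjoint sets $A$ and $B$,
$$ \hat{\mathbb{Q}}\big[\, \mathscr{S}\cap B \neq \emptyset,\ \mathscr{S}\cap A = \emptyset \,\big] \;=\; \hat{\mathbb{Q}}\big[\, \mathscr{S} \subseteq A^{c} \,\big] \;-\; \hat{\mathbb{Q}}\big[\, \mathscr{S} \subseteq (A\cup B)^{c} \,\big]. $$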
So the point is that the whole strategy for the spectral sample of [inaudible] --
the spectral sample of critical percolation -- depends on the fact that this sort of
inclusion formula, these sorts of probabilities, we can understand in physical
space, because these conditional probabilities and conditional expectations can
be rephrased in terms of four-arm events here and four-arm events there. So
whenever you have a Boolean function for which you can understand this sort of
probability in physical space, you have a chance to run our strategy. Okay. Now
here comes -- so we have these two results; actually I didn't say how to prove
them. We have this replacement for when you only have this negative
conditioning, and you also have some clustering effect, similar to the zeros of
Brownian motion.
So each of them alone was a lot of work to prove. But assuming you have these
two, what do you do? Well, the trouble is that you cannot repeat the calculation
that I showed you here, because I cannot just take the K-th power that I would
like: whenever I fail to have a large enough intersection, I cannot just condition
on that and go on, because I don't have the independence. I could go on if
failure meant not just too few points in a box but no points at all -- if failure meant
that we found nothing in the box, then we sort of could go on. Okay. So there is a
simple remedy for that; this is a nice idea. You take an independent random
dilute sample: a random subset of the bits, chosen independently of everything,
with the right density, so that if the size of the spectral sample in an R-box is
small, then it's likely that it didn't intersect the dilute sample, and if the size is
large, then it's likely that it did intersect it. Which means you can measure
whether it's small or big just by looking at whether the intersection with the dilute
sample is empty or not. So failure will mean empty.
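A back-of-the-envelope version of why this works, with $p$ a placeholder for the density of the dilute sample, taken to be roughly one over the expected size of the intersection with an R-box: if the actual intersection of the spectral sample with a box has $m$ points, then
$$ \mathbb{P}\big[\, \text{the dilute sample misses all } m \text{ points} \,\big] \;=\; (1-p)^{m} \;\approx\; e^{-pm}, $$
which is close to $1$ when $m \ll 1/p$ and close to $0$ when $m \gg 1/p$. So emptiness of the dilute sample inside a box is a proxy for a small intersection, and it is purely negative information about the spectral sample.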
So you gain only negative information by looking at this dilute sample, and now
you have a chance to proceed. However, there was another problem: when you
want to put one and two together, you also want to use the conditioning on the
number of R-boxes intersected, and that is not negative conditioning. So we
cannot just take the K-th power of this (1 - c), because there is this positive
conditioning in the way. And what do you do with the positive conditioning?
Okay, I mean, it looks like a silly technical problem, and while we were with
Christophe digesting how silly this problem is, that's when he first came up with
one solution, and then, while we were digesting that solution, he came up with
another solution. So the first solution: you can try to do something like scanning
the boxes sequentially with this random dilute sample, and if you see that you
didn't succeed -- you didn't find anything, it's empty, empty, empty, empty -- you
go on. So if you could say that with good probability you had many, many
chances, you had to find something, then you would be fine.
So the question is how you put in the fact that we actually had a large number of
boxes intersected. The first solution was a filtered Markov inequality, which is
this. You have some nonnegative variables -- it's extremely general -- some
nonnegative X_k's and a monotone increasing filtration F_k; this is sort of what
you learn during the scanning process. And the Y_k's are the conditional
expectations of the X_k's given the past. Then the probability that the sum of the
unconditioned ones is large while the sum of the conditioned ones is small, is
small. There is a very simple proof -- [inaudible] -- you can try it. So this sort of
gave a solution to our troubles.
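A hedged version of the kind of statement meant here (the exact formulation in the paper may differ): let $X_1, X_2, \dots$ be nonnegative and adapted to an increasing filtration $(\mathcal{F}_k)$, and let $Y_k = \mathbb{E}[X_k \mid \mathcal{F}_{k-1}]$; then for any $a, b > 0$,
$$ \mathbb{P}\Big[\, \sum_k X_k \ge a \ \text{ and } \ \sum_k Y_k \le b \,\Big] \;\le\; \frac{b}{a}. $$
One short proof: stop the scan at the first $k$ for which $\sum_{j \le k} Y_j$ exceeds $b$. Up to, but not including, that time the expected sum of the $X_k$'s equals the expected sum of the $Y_k$'s, which is at most $b$; and on the event $\{\sum_k Y_k \le b\}$ the scan is never stopped, so Markov's inequality gives the bound.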
The trouble is that this is just a Markov-type inequality; it's too weak, and we
didn't get the full result that we wanted. So he came up with a better solution,
which is where I will end. Almost.
Okay. So, sorry. Yes. Okay. So this is a strange large deviation lemma with very
dependent things, and it is exactly of the right type. Instead of scanning
sequentially, we know that if we have the information, for any set of boxes, that
the dilute sample didn't intersect them there, then, given that the spectral sample
is there, we have a positive probability that there are actually a lot of
intersections, so the random dilute sample will find that. So we have this type of
inequality. And in order to get an exponential large deviation bound, instead of
doing the sequential scanning, you average these inequalities all at once: you
take a suitably chosen random J, take the expectation of this inequality, and you
get something. That's where we asked him how he came up with this proof and
this result, and he said that he tried not to think probabilistically -- well, we had a
bunch of inequalities, he wanted to get another inequality, so he tried to be an
analyst and do something.
Okay. So this is what [inaudible]. So with this, the final result that we actually get
is that the probability that the spectral sample is non-empty but smaller than its
expected size is basically equivalent to the probability of its being contained in a
single R by R sub-square, which is what I showed you on the second slide. And
so we do get the result, and we get that the scaling limit of the spectral sample is
a conformally invariant Cantor set with Hausdorff dimension three-quarters. You
could run the same machinery with pivotals and you would get a different
exponent here, but it's overkill to do this, because pivotals have much more
independence, so you don't really want to do that. And I will stop here. Thanks.
[applause].
>> Amir Dembo: Comments or questions?
>>: So if I understand, one of the worries is that this thing is not an [inaudible].
So can you prove [inaudible] -- do you expect clustering, I guess?
>> Gabor Pete: Yes.
>>: So can you prove that it's [inaudible].
>> Gabor Pete: Well, there is something: pivotals have a lot of independence.
There you can say that, however the set of pivotals looks out here, if you
condition to see pivotals in here, then the configuration inside here will be
basically independent of all the other conditioning that you've made. So for
pivotals we know that. And pivotals still have the clustering effect -- the same
type of clustering as the spectral sample -- but they have this approximate
independence. And we don't have a clue about that for the spectral sample.
Actually, there is one thing that I wanted to mention: Gil Kalai had a conjecture
that the entropy of such sets -- similar sets -- should be bounded just by the
expectation of the size of the set, so that the log factor that you would get from a
uniformly spread set shouldn't be there. That's a fact I think I can prove for
pivotals but not for the spectral sample.
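For the record, what is presumably being referred to is the entropy/influence conjecture of Friedgut and Kalai: for $f:\{-1,1\}^n\to\{-1,1\}$,
$$ -\sum_{S} \hat f(S)^2 \log \hat f(S)^2 \;\le\; C \sum_{S} |S|\,\hat f(S)^2 \;=\; C\,\mathbb{E}\,|\mathscr{S}|, $$
that is, the entropy of the spectral sample is at most a constant times its expected size, without the extra logarithmic factor that a uniformly spread measure of the same expected size would carry.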
[applause]