>> Chris: It’s a great pleasure to introduce Jeff Steif. I would like to start by saying that he is the Birnbaum lecturer this year. Every year we have one distinguished mathematician or probabilist as the Birnbaum lecturer, so let me say a few words about William Birnbaum. Some of you know who he was, but some of you might be attending this conference for the first time. William Birnbaum lived almost the whole 20th century, from 1903 to 2000. It gives me special pleasure to say a few words about him because he was born in Poland. He got his PhD in Lwów, which was in Poland before the war, and that is the place where much of pre-war Polish mathematics was done. Some of the best known names associated with Lwów were Banach, Ulam, Kac, and Birnbaum. I will not go into detailed history, but in 1939 he came to the University of Washington and he stayed there until the end of his life; of course at the end he was retired. He is probably best known as a statistician. He was one of the editors of the Annals of Mathematical Statistics, the journal that preceded the separate Annals of Statistics and Annals of Probability, and he was also President of the IMS. He also made a very important contribution to mathematics: nowadays many people know about Orlicz spaces, but some people have started calling them Birnbaum–Orlicz spaces, because Birnbaum and Orlicz co-authored the original paper. Orlicz kept working on these spaces while Birnbaum moved to different problems and made most of his contributions in statistics. Anyhow, it’s a great pleasure to have a distinguished speaker today, who will talk about Boolean functions, noise sensitivity, influences, and percolation. >> Jeff Steif: Thanks a lot, Chris. First I want to thank the organizers for inviting me to give this talk. I’m visiting here at Microsoft for a year and having a great time talking with the theory group and all the visitors; it’s a great opportunity and I appreciate being here. I’m going to talk about the following topics. It’s going to be very much an overview lecture, so feel free to interrupt me at any time with comments or questions. These are the four concepts I’ll be discussing: Boolean functions, noise sensitivity, influences, and how they arise in a particular model called percolation. I think [indiscernible] said all his pictures were due to someone else; these pictures are also borrowed from elsewhere. There are some lecture notes that Christophe Garban and I wrote, called Noise Sensitivity of Boolean Functions and Percolation. They are already published, but we are in the process of extending them into a book, so if anyone were to look at the lecture notes and have any comments, we would welcome them for the coming book. Today’s lecture will be a brief survey of some of the topics covered in those notes. Okay, so the first things we’re going to talk about are Boolean functions and noise sensitivity, and the basic setup is quite elementary. We have n random variables x1 through xn, i.i.d. and equal to +1 or -1 each with probability 1/2, so we just have n coin flips, and we denote the vector by x. Then we have a function f from {-1,1}^n, the sequences of +1’s and -1’s of length n, into {-1,1}; this is called a Boolean function, it just has two possible outputs. So this function will be a function of these n inputs x1 through xn.
So that’s what a Boolean function is. The next notion we need in order to introduce noise sensitivity is a small perturbation of x. So x is as before, and x^epsilon is again a vector of length n, (x1^epsilon, ..., xn^epsilon), which we think of as a small perturbation of x. The way it is perturbed is very simple: you go to each of the n bits and, independently with probability epsilon, you throw that bit away and replace it by a new +1 or -1, each with probability 1/2, completely independently. So each bit is erased with a very small probability epsilon and replaced by a new coin flip. That’s the perturbation. Note that because everything is independent and symmetric, the perturbed vector x^epsilon is again a sequence of n fair coin flips. The basic question of noise sensitivity is: if we look at f(x) and f(x^epsilon), f of the small perturbation, are these going to be close to independent, or are they going to be highly correlated? Obviously, if epsilon is extremely small then x^epsilon and x will be the same with very high probability — you wouldn’t have perturbed anything — and f(x) and f(x^epsilon) will be the same value with probability close to 1, so they are highly correlated. So what we should think instead is that epsilon is very small but fixed, and f is some complicated function of very many variables. Then it is not at all clear whether things will be independent, and this brings us to the first definition, due to Benjamini, Kalai, and Schramm, who introduced the concept. We say a sequence of Boolean functions fn, mapping sequences of length n to +1 or -1 as above, is noise sensitive if for any fixed epsilon bigger than zero the covariance between fn(x) and fn(x^epsilon) goes to zero. Since fn takes only two values, being uncorrelated is the same as being independent, so this simply says that fn(x) and fn(x^epsilon) are asymptotically independent: they become independent as n goes to infinity, for each fixed epsilon. Okay, I said interrupt me if there are any comments or questions. So let’s take three quick examples that are easy. The first is the simplest Boolean function in the world, the Dictator function: fn(x1, ..., xn) = x1, just the first bit. This of course doesn’t really depend on n, and it’s not hard to see that if you do a small perturbation, most likely nothing changes, so this is not noise sensitive. An example of a noise sensitive function is the Parity function. The bits are +1 or -1, so I can multiply them out, and the product of all of them is called the Parity function; this one turns out to be noise sensitive, basically because if epsilon is fixed and n is enormous, then with very high probability you will resample at least one of the bits, and once a bit is resampled the product is just as likely to be +1 as -1, so the two values become completely uncorrelated and independent.
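As a small numerical illustration of this definition, here is a minimal Monte Carlo sketch of the epsilon-perturbation and the covariance of f(x) and f(x^epsilon) for the two examples just mentioned. The code, parameter choices, and function names are illustrative additions, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def dictator(x):
    return x[:, 0]

def parity(x):
    return np.prod(x, axis=1)

def noise_covariance(f, n, eps, trials=20000):
    """Estimate Cov(f(x), f(x^eps)), where x^eps rerandomizes each bit
    independently with probability eps."""
    x = rng.choice([-1, 1], size=(trials, n))
    resample = rng.random((trials, n)) < eps          # which bits get rerandomized
    fresh = rng.choice([-1, 1], size=(trials, n))     # independent new coin flips
    x_eps = np.where(resample, fresh, x)
    fx, fx_eps = f(x), f(x_eps)
    return np.mean(fx * fx_eps) - np.mean(fx) * np.mean(fx_eps)

for n in (11, 101, 1001):
    print(n,
          round(noise_covariance(dictator, n, eps=0.1), 3),  # stays near 1 - eps for every n
          round(noise_covariance(parity, n, eps=0.1), 3))    # roughly (1 - eps)^n, so tends to 0
```

Running this, the dictator column barely moves as n grows, while the parity column collapses to zero, which is exactly the dichotomy being described.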
An example where things are maybe not so clear at first is the Majority function. For this function n, the number of variables, has to be odd, and the function simply looks at the n bits and asks whether there is a majority of +1’s or a majority of -1’s: if the majority is +1 the function outputs 1, and if the majority is -1 it outputs -1. One way to write this is to sum up the bits and take the sign of the sum. So that is the Majority function, and the question is whether it is noise sensitive. It’s not as obvious as for the other examples, but it turns out this one is not noise sensitive. If you fix epsilon very small, imagine an election with Democrats and Republicans where everyone votes i.i.d., 1/2, 1/2, and say the Democrats won; if you then change a very small percentage of people’s votes, the Democrats will still be the winners. So this is not noise sensitive. Our main example of interest, which will be the focus of the talk, is percolation theory. In percolation we take an R by R square piece of the hexagonal lattice, exactly like this, and each hexagon is painted black or white, each with probability 1/2, independently. I think of those as my input bits: since the region is R by R, I have about R squared of these +1’s and -1’s, thinking of black as 1 and white as -1. Now I can define a Boolean function which describes the percolation picture: I ask whether there is a crossing from the left side to the right side consisting only of black hexagons. In this particular realization there is a black path from the left to the right side, going as I just showed you, and the probability of this is about a half. So we define a Boolean function which is 1 if there is a left-to-right black crossing and -1 if there is not. This function simply tells you whether there is such a crossing, and you can ask whether it is noise sensitive: are percolation crossings noise sensitive? The way to think about it is that on the left we have the original percolation configuration omega — in the picture I have drawn the black crossing in red — and let’s assume there is a left-to-right crossing. Now I apply the epsilon noise: an epsilon proportion of these hexagons are re-flipped to determine their values. The question is, is there still a crossing from left to right? Given that there was a crossing before the noise, how much information does that give you about whether there will be a crossing afterwards? Noise sensitivity means basically that no information is transferred. The theorem concerning noise sensitivity was proven by Benjamini, Kalai, and Schramm: percolation crossings are noise sensitive. That is, if you have a crossing initially and you do this very small perturbation, you get no information whatsoever about whether there will be a crossing after the perturbation. We call this noise sensitive: it is sensitive to a small amount of noise. Now, this is for epsilon completely fixed: the statement is that for every fixed epsilon these become asymptotically uncorrelated. Clearly, as epsilon gets smaller the perturbation is closer to the original configuration, and so it becomes harder for things to become independent.
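To make the crossing function concrete, here is a rough simulation sketch. Each hexagon of the picture corresponds to a site of the triangular lattice, which can be stored as an R by R array with six neighbours per site; the exact region shape, the trial counts, and the helper names below are my own choices, not from the talk, and the run is slow since it is plain Python.

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(1)

# Six neighbours of a triangular-lattice site stored at array position (i, j);
# each site plays the role of one hexagon in the talk's picture.
NBRS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def crosses(black):
    """True if there is a left-to-right path of black sites."""
    R = black.shape[0]
    seen = np.zeros_like(black)
    queue = deque((i, 0) for i in range(R) if black[i, 0])
    while queue:
        i, j = queue.popleft()
        if seen[i, j]:
            continue
        seen[i, j] = True
        if j == R - 1:
            return True
        for di, dj in NBRS:
            a, b = i + di, j + dj
            if 0 <= a < R and 0 <= b < R and black[a, b] and not seen[a, b]:
                queue.append((a, b))
    return False

def crossing_noise_covariance(R, eps, trials=800):
    before, after = [], []
    for _ in range(trials):
        black = rng.random((R, R)) < 0.5            # critical density 1/2
        fresh = rng.random((R, R)) < 0.5            # independent re-flips
        resample = rng.random((R, R)) < eps         # which hexagons get noised
        before.append(1 if crosses(black) else -1)
        after.append(1 if crosses(np.where(resample, fresh, black)) else -1)
    b, a = np.array(before, float), np.array(after, float)
    return np.mean(a * b) - a.mean() * b.mean()

for R in (8, 16, 32):
    print(R, round(crossing_noise_covariance(R, eps=0.1), 3))  # shrinks as R grows
```

The decrease with R is slow at these small sizes, so this is only a qualitative illustration of the theorem, not a test of it.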
Still, we can ask what happens if we let epsilon not be fixed but instead go to zero with R at some rate. Of course, if epsilon_R decreases to zero too quickly you will never get independence of the crossing before and after: if epsilon is too small the perturbation won’t change anything. But one can still ask whether epsilon can go to zero, and at what rate. In the initial paper where they proved noise sensitivity, Benjamini, Kalai, and Schramm proved something stronger: you can even let epsilon_R go to zero, as long as it doesn’t go too quickly — it has to be at least some specified, sufficiently large constant divided by log R. So as long as epsilon_R doesn’t go to zero faster than essentially 1 over log R, you still get this asymptotic independence of the percolation picture before and after. The next question one can ask is what happens with even less noise: that result says you can go logarithmically, but can you let epsilon_R go to zero even quicker, like an inverse power of R? What happens if the amount of noise epsilon_R decreases to zero as an inverse power of R, like 1 over R to the 1/2 — could you still have these things asymptotically uncorrelated? This will be the motivating question, which we’ll come back to in a few minutes. Now I’m going to introduce a couple of other concepts that arise for Boolean functions and turn out to be very key to the study of all these things. These key players are two notions called pivotality and influence. We go back to the general context of a Boolean function, and I want to talk about a particular bit, the ith bit — say the 5th bit or the 10th bit — being pivotal. For a Boolean function f, the event that i is pivotal is defined as follows; it turns out to be an event measurable with respect to the other variables. It is the event that if you were to change the ith bit, the output of the function would change. In other words, take a realization and look at whether f is 1 or -1; now go to the ith bit and change it. Maybe f changed, maybe it didn’t. If f changed, then I say i was pivotal: it is pivotal for whether f is going to be +1 or -1. That’s the notion of pivotal. The influence of the ith bit, which we denote I_i(f) — the big I is for influence, the little i says we’re talking about the ith bit, and f is our Boolean function — is exactly the probability that i is pivotal: the probability that, in this realization, changing that particular bit changes the output of f. Let me go back to our three Boolean functions, Dictator, Parity, and Majority, and just check what the influences are, so that the concept is clear. For the simplest one, the Dictator function, fn(x1, ..., xn) = x1 is just the first bit. Obviously the first bit has influence 1, because it is always pivotal: if you change the first bit the outcome always changes. The other bits have influence zero, because those variables don’t even enter into f. For the Parity function it’s also very simple: every variable has influence 1, because every bit is always pivotal. If I change any bit, the function changes sign, because you’re taking a product of +1’s and -1’s, so trivially all the influences are 1.
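Since influence is just a probability, it can be estimated by brute force: flip one bit and see whether the output changes. A minimal sketch, with my own choice of sample sizes; the functions named here are the three examples from the talk.

```python
import numpy as np

rng = np.random.default_rng(2)

def influence(f, n, i, trials=20000):
    """Estimate I_i(f) = P(bit i is pivotal): flip bit i and check whether f changes."""
    x = rng.choice([-1, 1], size=(trials, n))
    x_flipped = x.copy()
    x_flipped[:, i] *= -1
    return np.mean(f(x) != f(x_flipped))

dictator = lambda x: x[:, 0]
parity   = lambda x: np.prod(x, axis=1)
majority = lambda x: np.sign(x.sum(axis=1))   # n odd, as in the talk

n = 101
print(influence(dictator, n, 0), influence(dictator, n, 1))  # ~1 and ~0
print(influence(parity, n, 5))                               # ~1
print(influence(majority, n, 5))                             # try it, then compare with what comes next
```

The dictator and parity numbers match the values just computed by hand; the majority case is the next topic.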
For the Majority function it’s slightly more interesting, yet not so hard to see. Of course the influences of all the variables are the same, because all the variables play the same role, and the influences are approximately 1 over n to the 1/2. This is basically because, if you look at the first bit, what does it mean for it to be pivotal? The first bit can only change the outcome if there is a tie among all of the other bits. With n - 1 other bits, a tie between the +1’s and -1’s is basically like a random walk being back at the origin at time n, and that probability decays like a constant over n to the 1/2. So for Majority all the influences are about 1 over n to the 1/2. Let me mention a theorem which is not so directly related to noise sensitivity, but which is an interesting theorem in the area if you find the notion of influence interesting. The theorem answers the following question: how small can all the influences be? If you take f to be a constant function all the influences are zero, which is uninteresting, so we have to avoid degenerate cases; let’s stick to functions f with the probability that f is 1 somewhere between 1/4 and 3/4, so f is non-degenerate. For such f, how small can the largest influence be? Is there any guarantee that some variable has reasonably large influence, and if so, how large? The majority functions show that the maximum influence can be as small as 1 over n to the 1/2. Could it get even smaller than that? The answer is that it can be quite a bit smaller. Here is a particular Boolean function: take the n variables x1 through xn and partition them into disjoint blocks — first block, second block, third block, et cetera — where the length of each block is about log base 2 of n minus log base 2 of log base 2 of n. Having partitioned that way, we say that f is 1 if at least one of these blocks consists entirely of 1’s; if there is no such block, f is -1. This is a Boolean function, and it turns out to be non-degenerate: the probability that f is 1 is between 1/4 and 3/4. It’s easy to check that these functions are non-degenerate, and it’s also easy to check that the influences come out to be about log n over n. So these influences are much smaller than the 1 over n to the 1/2 of the majority function. It turns out there is a theorem by Kahn, Kalai, and Linial that says this is best possible: for any non-degenerate Boolean function you can always find at least one variable whose influence is at least a constant times log n over n. So the answer to the question of how small the influences can be is: they can get as small as log n over n, but they can’t get any smaller. >>: [inaudible] types of [indiscernible] constant? >> Jeff Steif: What’s that? >>: Does types of [indiscernible] constant? >>: No. >> Jeff Steif: I don’t know. No, Jeffrey says no. >>: So what does? >>: We don’t know – >> Jeff Steif: What’s that? >>: There’s a gap of, I think, [indiscernible] 2.
>> Jeff Steif: [inaudible] the optimal constant – >>: [inaudible] best [indiscernible]. >> Jeff Steif: Okay, okay. >>: But the best example is something like that? >>: The best example is [indiscernible]. >> Jeff Steif: Okay. So we originally talked about noise sensitivity and then we moved to influences. I should say that this argument uses Fourier analysis, hypercontractivity of certain operators, and the so-called Bonami–Beckner inequality, but I’m not going to go into that — this is more of an overview — these are just the types of tools that come in. So why are influences relevant to noise sensitivity? If you’re only interested in noise sensitivity, why care about influences? One of the fundamental theorems in the area, originally in the Benjamini, Kalai, and Schramm paper, says that if you take any sequence of Boolean functions and look at the sum of the squared influences — square the influences and sum them up — and if that sum goes to zero, then the sequence is noise sensitive. So in this sense the influences give you information about whether you are noise sensitive. This condition is certainly not necessary: the parity function is very noise sensitive, but it certainly doesn’t satisfy the condition, because all of its influences are 1, so the sum is in fact n, which goes to infinity. So it is not necessary in general, but it turns out to be necessary for a very large class of functions: the condition is necessary for monotone functions, meaning increasing functions. A Boolean function is monotone if, whenever you change some of the bits from -1 to 1, the output of the function can only increase. For monotone functions the condition is necessary, and so this gives a necessary and sufficient condition for noise sensitivity. Now take the majority functions, which I told you are not noise sensitive. What happens when you plug them into this criterion? The influences are 1 over the square root of n; you square them and get 1 over n; you sum over n bits and get 1. So the sum is of order 1 for majority, and it just barely misses satisfying the condition. It turns out that the majority function is an extremal sequence in many respects for Boolean functions. This theorem is also proved using Fourier analysis and hypercontractivity, via an inequality, or a generalization of an inequality, due to Talagrand. Okay, so given this general theorem, let’s get back to percolation crossings. I told you that percolation crossings are noise sensitive, so how do we get the noise sensitivity of percolation crossings from this? Although it was not done this way in the original paper, I want to explain basically how it would follow from this theorem. Assuming the theorem, we have to somehow compute the influences in the particular case of percolation. At this point the event is whether there is a left-to-right black crossing of an R by R square, and I want some way of computing what the influences are. The answer is going to bring us to what are called critical exponents in percolation, so let me describe what these are. First of all, I have only described percolation on an R by R box, but normally we actually do percolation on an infinite lattice. So imagine you do percolation on an infinite hexagonal lattice.
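As a brief aside before the infinite-lattice picture, here is a rough numerical check of two things from the last few minutes: the block example with influences of order log n over n, and the sum-of-squared-influences quantity in the Benjamini–Kalai–Schramm criterion. The values of n, the rounding of the block length, and the Monte Carlo estimators are my own choices, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 512
m = int(round(np.log2(n) - np.log2(np.log2(n))))   # block length ~ log2 n - log2 log2 n
blocks = n // m                                     # ignore a few leftover bits for simplicity

def block_function(x):
    """+1 if at least one block of length m is all +1's, else -1."""
    trimmed = x[:, :blocks * m].reshape(x.shape[0], blocks, m)
    return np.where((trimmed == 1).all(axis=2).any(axis=1), 1, -1)

def majority(x):
    return np.sign(x.sum(axis=1))

def influence_of_first_bit(f, nn, trials=20000):
    x = rng.choice([-1, 1], size=(trials, nn))
    y = x.copy()
    y[:, 0] *= -1
    return np.mean(f(x) != f(y))

# Every bit inside a block has (roughly) the same influence, so the BKS quantity
# sum_i I_i(f)^2 is about n * I_1(f)^2.
for name, f, nn in [("blocks", block_function, n), ("majority", majority, n + 1)]:
    I1 = influence_of_first_bit(f, nn)
    print(name, round(I1, 4), round(nn * I1 ** 2, 3))
# blocks: the sum of squared influences is small, consistent with noise sensitivity;
# majority: it is of order one, just barely missing the criterion.
```

The two printed sums line up with the discussion above: the block-style example satisfies the criterion, while majority sits right at the borderline.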
Now, we’ve always been taking p equal to 1/2 as the probability of a black hexagon, but you could take the probability of black to be anything. So imagine you take the probability of black to be p and of white to be 1 - p, independently, and ask: is there an infinite black component? I do this on the entire infinite hexagonal lattice — is there an infinite black component? The answer depends on p and there is a critical value. It was essentially shown in 1960 by Harris that when p is 1/2 there is no infinite black component, and it was proven 20 years later by Kesten that if p is slightly bigger than 1/2 — if p is bigger than 1/2 — then you suddenly do get an infinite black component. So we say the critical value for this percolation model is 1/2, and at the critical value there is no infinite component. Okay, now look at this picture: we take our infinite lattice, p is 1/2, and I look at the event that there is an open path, a black path, from the origin to distance R away. Since the probability of having an infinite black path is zero, we know this probability goes to zero; the question is how fast. This was answered by Lawler, Schramm, and Werner in 2002: the probability of this event, which we denote alpha_1(R), decays like R to the -5/48 + little o(1) in the exponent — don’t worry about that — so it decays like 1 over R to the 5/48, and we call 5/48 a critical exponent. Let me just mention the difference between the critical value, p equal to 1/2, which determines whether there is an infinite cluster, and the critical exponent 5/48: there is a big difference between them. People like to say that the 5/48 is universal and that the p equal to 1/2 is not universal. What this means is that 1/2 happened to be the right value for this model, but if you changed the model a little bit and did something else, you could get a different critical value for p. However, if you took a different model, looked at the percolation picture at its new critical value, and looked at this event again, it is believed that the probability should still decay like 1 over R to the 5/48. So 5/48 should be the number that comes up for all of these models, while the 1/2 was special to this one, and that is why the exponent is called universal. Now there is another critical exponent that is going to be relevant for the influences, called the four-arm exponent. I look at the following event: from the origin there are two black paths going out and two white paths going out, in alternating clockwise order — black, white, black, white. This is of course even less likely than the one-arm event, so its probability also goes to zero, and again the question is how fast. Smirnov and Werner showed that the probability of this event decays like 1 over R to a different power, and this other power is 5/4. So we say that 5/4 is the critical exponent for the four-arm event, because there are four arms in the picture. Okay, now this is exactly the right picture to capture the notion of influence. What is the probability of a hexagon being pivotal for the percolation crossing? We are back in the crossing picture — we are now looking for a blue path from left to right — and we want to know when a particular hexagon x (forget the D in the picture) is pivotal. So what has to happen for x to be pivotal?
It means that if x is blue there is a left-to-right crossing, and if x is red there is no crossing. Well, if x is on, that means there has to be some blue crossing from left to right, and this blue crossing has to pass through x: if the blue crossing didn’t pass through x, then turning x off wouldn’t have gotten rid of it. So for x to be pivotal, a blue crossing has to go through x. In addition, there has to be no blue path connecting the part of the crossing on one side of x to the part on the other side: if there were such a path, you could get from here to there avoiding x, and x would not be pivotal. For there to be no such blue connection, there has to be a red path — red and blue now playing the roles of black and white — going from x up to the top and a red path going down to the bottom. That is the picture for being pivotal, and as long as you are not near the boundary this is exactly what we had on the previous slide: the four-arm event, with four arms going out. So the probability of being pivotal, as long as you are away from the boundary, is at most about 1 over R to the 5/4, by the critical exponent on the previous slide. What does this do to the sum of the squared influences? There are about R squared hexagons, each with influence about 1 over R to the 5/4. I square the 1 over R to the 5/4 and get 1 over R to the 5/2; multiplying by R squared, the sum goes to zero. Beyond that you have to deal with boundary issues — this picture is really only appropriate away from the boundary — but that is not very difficult, and this basically tells you that the sum of the squared influences goes to zero, so you get noise sensitivity. Okay, so now we come to quantitative noise sensitivity: the question of how fast epsilon_R can go to zero. Recall that Benjamini, Kalai, and Schramm, as I told you, showed that percolation crossings de-correlate under noise even when epsilon_R goes to zero, as long as it doesn’t go too quickly — as long as it is bigger than a constant over log R for a sufficiently large constant C. Might we believe that percolation crossings de-correlate — and de-correlate can mean lots of things; it can mean partially independent or completely independent; for me de-correlate means essentially completely independent, that they become asymptotically independent, not merely not completely correlated — might we believe that percolation crossings de-correlate even if epsilon_R is 1 over R to the alpha? If so, what would be the largest alpha we could use? If we take alpha too big, the noise becomes so small that things won’t change and you can’t possibly de-correlate, so this is a well-defined question: how big can we take alpha? A lot of the talk will be about telling you what is known and how to get at what the alpha is. The heuristic answer — I don’t know if this is the final answer; we’ll see the heuristic on the next slide — is yes, we might believe that crossings de-correlate, and the guess for the largest alpha is 3/4. What I want to do on the next slide is explain why you might believe the best exponent is 3/4.
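In symbols, the sum-of-squared-influences bound sketched a moment ago (ignoring boundary effects, and with the exponents as stated in the talk) is:

```latex
\sum_{i} I_i(f_R)^2 \;\lesssim\; R^{2}\cdot\bigl(R^{-5/4}\bigr)^{2} \;=\; R^{-1/2} \;\longrightarrow\; 0 ,
```

so the Benjamini–Kalai–Schramm criterion applies and the crossing events are noise sensitive.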
Okay, so we have repeated the question here: the heuristic for the noise sensitivity exponent for percolation. Might we believe that we de-correlate even when epsilon_R goes like 1 over a power of R? Yes, and the largest alpha should be 3/4. By the bottom of this slide I hope to convince you heuristically that 3/4 is the answer. So recall, as we already said, that the four-arm exponent says the probability of a hexagon being pivotal is about 1 over R to the 5/4. Now look at the total number of pivotals: what is the expected number of pivotal hexagons? We have about R squared hexagons, each pivotal with about that probability, so the expected number of pivotal hexagons is about R to the 3/4 — that is the 3/4 above. Now imagine the noise epsilon_R, the probability with which you resample, is 1 over R to the alpha. What is the expected number of pivotal hexagons that we resample? We have this random set of pivotals, and the expected number resampled is just the expected number of pivotal hexagons times epsilon_R, which is R to the 3/4 times epsilon_R. So if alpha is bigger than 3/4, then the expected number of pivotals that we resample goes to zero; that means we don’t actually resample any pivotal, and things don’t change. You might object right away: hold on, I don’t have to touch a pivotal bit to change things — resampling several bits, none of them pivotal on its own, can together change the outcome. So it is certainly not literally true that missing the pivotals means nothing changes, but it is a heuristic, and in fact making this direction rigorous is quite easy — a five-minute argument. If alpha is less than 3/4, then the expected number of pivotal hexagons that we resample is R to the 3/4 minus alpha, which is now very big, so we are very likely to hit a pivotal. If we hit a pivotal, things change and everything should get mixed up: you’ve lost all your information, and what you knew originally about the picture — that you had a crossing — gives you no information afterwards. So that is the heuristic for the 3/4. This second part is much, much harder to make rigorous; it is only a heuristic for where the 3/4 comes from. Okay, so that explains the heuristic. Now let me tell you about two different approaches that were used to get partial results. The first one didn’t give the 3/4, but it was the first argument that allowed epsilon_R to go like 1 over R to some power. This approach and the approach I’m going to describe afterwards are very different from each other, and they are also very different from the original Benjamini, Kalai, and Schramm argument that said things de-correlate if the noise is bigger than a constant over log R. I should say — and I’ll explain some of the Fourier analysis very simply later — that all three approaches use Fourier analysis, but beyond that common component the arguments are quite different.
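A compact way to record the heuristic arithmetic just given (using the four-arm exponent 5/4 from earlier):

```latex
\mathbb{E}\bigl[\#\{\text{pivotal hexagons}\}\bigr] \approx R^{2}\cdot R^{-5/4} = R^{3/4},
\qquad
\mathbb{E}\bigl[\#\{\text{resampled pivotals}\}\bigr] \approx \varepsilon_R\, R^{3/4} = R^{3/4-\alpha}
\quad\text{for } \varepsilon_R = R^{-\alpha},
```

which tends to 0 when alpha > 3/4 (the noise typically misses every pivotal hexagon, so the crossing is unchanged) and to infinity when alpha < 3/4 (pivotal hexagons typically do get resampled).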
On this slide I’m going to describe the second approach, which is related to theoretical computer science, so now you have to think a bit like a theoretical computer scientist, if you know how they think. We have a Boolean function, and I want to consider randomized algorithms. Before reading the slide, let me tell you in words what you should imagine a randomized algorithm is. Say you have a Boolean function of n variables. You know what the Boolean function is — maybe it’s majority, maybe it’s something else — and I ask you to compute f(x1, ..., xn). You know the function f, but unfortunately you can’t see the bits: x1 through xn are covered, you don’t know what they are. What you can do is ask me for some of the values of x1 through xn. You might say, “Jeff, tell me what the 3rd bit is, please,” and I say, “oh, that was a 1,” and you look at that and say, “ah, that’s a 1.” Then, “please tell me what the 10th bit is,” and maybe I say that’s -1. As you get more and more information you keep asking about new bits, but which bit you ask about next may depend on what you’ve seen up to that point in time. At some point, presumably before you have found out every bit, you might say, “don’t tell me anything else, I already know what the outcome is.” For example, in the percolation picture, if you have already found a left-to-right black crossing, then even though you haven’t seen some of the other hexagons you don’t need any more information. What you want to do is play this game — ask me these questions — while asking as few questions as possible. Okay, given that, the slide should be easier to understand. A randomized algorithm A for f examines the bits one by one, where the choice of the next bit to examine may depend on the values of the bits examined so far. It is also allowed to be random: which bit you choose next may depend on what you’ve seen and on some exterior randomness, and even the first bit is random — the algorithm might pick a bit uniformly at random, or according to some other distribution; that’s allowed. That is what we call a randomized algorithm. The algorithm stops as soon as the output of f is determined. Now, I said you want to ask as few questions as possible, and here is one way of quantifying that. When the game is over and you know what the output of f is, there is a certain set of bits you have asked me about; we let J be this random set of bits examined by the algorithm. We define the revealment of A, denoted delta of A, which represents the degree to which the algorithm A reveals the bits: for each of the n bits I can ask, what is the probability that this bit was ever looked at by the randomized algorithm — in other words, the probability that i is in J? The revealment is the maximum of this probability over all the bits. With Oded, one version of one of the theorems we have is the following: let fn be a sequence of Boolean functions with An a randomized algorithm for fn.
The first theorem says that if the algorithms are such that the revealments go to zero — so for large n, for any fixed bit, it is very unlikely you ever ask for its value — then the sequence is noise sensitive. I’ll explain concretely how you do this in percolation. There is also a quantitative version, which says the following: if the revealment goes down like an inverse power of n, say at most C over n to the alpha for some alpha, then for any beta less than alpha over 2 you get the noise sensitivity you want when the noise is 1 over n to the beta. That is, fn(x) and fn of the perturbed configuration, with this very small amount of noise 1 over n to the beta, become asymptotically independent whenever beta is smaller than alpha over 2, given such a bound on the revealment. Okay, so what does this give you for percolation? To explain that I have to tell you what the interface is. I also, unfortunately, have to swap white and black now — I got these pictures from someone else and I have no idea how to change them. [laughter] Rotating doesn’t help either. Anyway, now we wonder whether there is a left-to-right crossing of white or not, and this red curve is called the interface between the two colors. When you look at SLE(6) for critical percolation there is a famous picture of Oded’s, and that is what is being described here. Here is how you determine whether there is a left-to-right crossing of whites. I start the path here, and I continue it always keeping white hexagons on its right and black hexagons on its left. So it comes this way to keep the white on the right, then it wants a black on the left, so it comes down here; this defines the path precisely. It comes around, always keeping the whites on the right, comes here, bounces, stays here, then comes back here and goes up there — whites on the right, blacks on the left. Now this red path tells you whether there is a left-to-right crossing. Start the path: it will bounce off this side for a while and that side for a while, and eventually it hits either the top or the left side. If it hits the top, there is no left-to-right crossing, because on the left side of the red path there is a vertical black path that blocks any white crossing. Conversely, if the interface hits the left side before the top, then what lies just above the interface is exactly your white crossing. So this red path tells you whether there is a crossing. Okay, now the algorithm is the following. It starts here — you may not like this, because it means I always examine this corner bit with probability 1 and the revealment won’t go to zero, but let’s not worry about that. The algorithm simply asks about the bits it needs in order to follow the interface. It says: what are you? It’s white, okay. The path goes this way, so it asks about the next bit it needs to know; that’s black, so the path goes up here, then it needs to ask about this one, and so on. In the end the path will have asked about the bits on its two sides, but no others. That is how the algorithm works. Now let’s ignore the boundary; you have to modify the algorithm to handle it, and I’m not going to explain how.
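The percolation interface algorithm is a bit involved to code, so here is a toy illustration of the revealment notion on a different example of my own choosing, not the one in the talk: recursive 3-wise majority on n = 3^depth bits, evaluated by the randomized rule "evaluate two randomly chosen children; only if they disagree, evaluate the third." A one-line recursion shows each leaf is read with probability (2/3 + 1/3 · 1/2)^depth = (5/6)^depth, so the revealment goes to zero and the theorem above applies.

```python
import random
import numpy as np

rng = random.Random(4)

def rec_majority_eval(bits, lo, hi, queried):
    """Lazily evaluate recursive 3-wise majority on bits[lo:hi].

    Evaluate two randomly chosen thirds first; the remaining third is only
    evaluated if they disagree.  Records which leaves were read in `queried`.
    """
    if hi - lo == 1:
        queried.add(lo)
        return bits[lo]
    third = (hi - lo) // 3
    children = [(lo + k * third, lo + (k + 1) * third) for k in range(3)]
    rng.shuffle(children)
    a = rec_majority_eval(bits, *children[0], queried)
    b = rec_majority_eval(bits, *children[1], queried)
    if a == b:
        return a
    # a != b, so the majority equals whatever the third child evaluates to.
    return rec_majority_eval(bits, *children[2], queried)

depth = 7
n = 3 ** depth
trials = 2000
counts = np.zeros(n)
for _ in range(trials):
    bits = [rng.choice((-1, 1)) for _ in range(n)]
    queried = set()
    rec_majority_eval(bits, 0, n, queried)
    for i in queried:
        counts[i] += 1

# By symmetry every leaf is queried with the same probability, so averaging over
# bits estimates the revealment delta_A = max_i P(i in J).
print(n, round((counts / trials).mean(), 3), round((5 / 6) ** depth, 3))
```

The estimated revealment comes out close to (5/6)^depth, which decays like an inverse power of n, exactly the kind of bound the quantitative theorem asks for.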
I’ll tell you why it works away from the boundary. If a hexagon is somewhere in the middle and I ask what the probability is that this particular hexagon is looked at by the algorithm — that is what we need to know. To be looked at, the hexagon has to be adjacent to the interface, and if you take a point of this interface you will see it has a black path to the boundary and a white path to the boundary. So a hexagon near the center of the picture is examined only if there is both a white and a black path coming out of it. That is a critical exponent that I have not described for you, called the two-arm exponent; it is also a known critical exponent, and it decays like 1 over R to the 1/4. So for points not near the boundary, the probability of being revealed is at most about 1 over R to the 1/4, and you can do some extra randomization to get rid of the problems at the boundary. Hence, by the previous theorem, you get de-correlation as long as epsilon_R is larger than 1 over R to the 1/8. So this gives a proof that you can let epsilon_R decay as a power of R — this power — and still get noise sensitivity, the de-correlation. Of course the 1/8 is a factor of 6 off from the 3/4 conjecture. Yeah. >>: Do you have an example of a nice monotone function that is noise sensitive but has no algorithm – >> Jeff Steif: Yeah – >>: [inaudible]? >> Jeff Steif: Yeah: take G(n,p) with p equal to 1/2 and do clique containment at the right clique size — you ask whether there is a clique of size about 2 log n. It turns out to be noise sensitive and there is no such algorithm. >>: [inaudible] noise it’s whatever it should be. >> Jeff Steif: Yeah, yeah — you want to choose the clique size to make it non-degenerate, it’s about 2 log n, and it only becomes a non-degenerate function at certain values of n. That’s an example, and it’s known that there are no such algorithms. >>: [inaudible] >> Jeff Steif: Yeah, it’s known. That’s the best example I know in answer to your question. I can also say: you might say, okay, you’re off by a factor of 6, but can you find a better algorithm? This is one algorithm, and maybe you can find better ones; we don’t know, it’s an interesting question. Nonetheless there are theorems that say that even if there are better algorithms, they can’t be that much better: there are bounds on how good an algorithm can be, and it is known that you can’t possibly get up to the 3/4 via this method. Okay. Do you know what time I started? >> Chris: 10 more minutes. >> Jeff Steif: I have 10 more minutes, okay. So the way this is done is the Fourier setup, which is the following. Consider the set of functions from {-1,1}^n into R. If we look at all functions, not just the Boolean ones, this is a 2 to the n dimensional vector space, because the domain has 2 to the n elements. There is a very, very nice orthogonal basis for this vector space, the functions called chi_S, where S is a subset of {1, ..., n}. If S is the empty set, chi_S is taken to be the constant function 1; otherwise chi_S is the following Boolean function: chi_S(x1, ..., xn) is simply the product of the bits sitting inside S. That’s chi_S.
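Here is a tiny brute-force sketch of this basis; the choice of n = 5 and of majority as the test function is mine, not from the talk.

```python
import itertools
import numpy as np

n = 5
# All 2^n points of {-1,1}^n, one row per point.
points = np.array(list(itertools.product([-1, 1], repeat=n)))

def chi(S, x):
    """Character chi_S(x): product of the bits indexed by S (empty product = 1)."""
    return np.prod(x[:, list(S)], axis=1) if S else np.ones(len(x), dtype=int)

f = np.sign(points.sum(axis=1))        # majority on n = 5 bits as a test function

# Fourier coefficient f_hat(S) = E[f(x) * chi_S(x)], expectation uniform over {-1,1}^n.
fhat = {S: np.mean(f * chi(S, points))
        for k in range(n + 1) for S in itertools.combinations(range(n), k)}

# Expanding in the orthogonal basis recovers f exactly, and since f is +/-1 valued
# the squared coefficients sum to 1.
reconstructed = sum(c * chi(S, points) for S, c in fhat.items())
print(np.allclose(reconstructed, f))                   # True
print(round(sum(c ** 2 for c in fhat.values()), 6))    # 1.0
```

The dictionary fhat is exactly the collection of Fourier coefficients that the next part of the talk works with.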
If you know about group theory, these are the characters of the group Z_2 to the n, but you don’t need that in this context; everything can stay combinatorial and you don’t have to deal with it. Okay, so those are the characters. They are our basis elements, they are orthogonal, and therefore if I give you any function f you can simply write it out in this orthogonal basis: f is the sum, over all subsets S of {1, ..., n}, of f-hat of S times chi_S, where f-hat of S is called the Fourier coefficient of S. What is elementary to check — very elementary — is that the correlation we are interested in can be expressed in the following very simple way in terms of these Fourier coefficients. Since f maps into +1 or -1 it has L2 norm 1, which means that if I sum up the squares of these Fourier coefficients, then just by the Pythagorean theorem they add up to 1. What you end up getting is that this covariance is the sum, over k from 1 to n, of (1 - epsilon) to the k times the sum of f-hat of S squared over the sets S of size k. Summing f-hat of S squared over all S gives 1, but here we sum only over the S of size k; we call this the weight at level k. Now, epsilon is small, but if I take (1 - epsilon) to a big power it becomes small. So noise sensitivity corresponds to the Fourier weights being concentrated on large sets S: most of the weight should be on large values of |S|. That is what the formula tells us. So let me give you an interesting way to package this, the spectral sample; it’s not strictly necessary, but it’s nice to have the following picture. Given a Boolean function f, its spectral sample, which we call script-S sub f, is a random subset of {1, ..., n}, defined distributionally as follows: the probability that this random set equals S is simply the Fourier coefficient of S squared. These all add up to 1, so this gives a probability distribution on the subsets. In terms of this random set you can rewrite the correlation that we want to go to zero very nicely: it is basically the expected value of (1 - epsilon) raised to the size of the random set. So noise sensitivity is basically equivalent to the cardinality of this set going off to infinity, and to understand noise sensitivity what one has to understand is the structure of the cardinality of this set. It is a very complicated random set, but this tells us that that is what we want to understand, so quantitative noise sensitivity can be obtained by understanding the typical behavior of this random set. Now, what is not so hard to get hold of is its expected size; the hard thing is to show that the expected size is actually telling you the typical situation. So, there turns out to be an interesting relation between two different random sets attached to a Boolean function. We have the spectral sample, that’s one random set, and we have the pivotal set — the set of bits which are pivotal for the Boolean function — that’s another random set. It turns out that there is a very interesting relationship between them.
They are not defined on the same probability space, so you can’t ask whether they are independent, but you can ask how they are related distributionally, and there are some amazing facts. The probability that a given bit belongs to the spectral sample is the same as the probability that it belongs to the pivotal set; therefore the expected sizes of these two sets are the same, and it also turns out that the two random sets have the same two-dimensional marginals. This is very useful, because it means that second moment methods can be transferred: if you want to apply a second moment argument to the spectral sample, you can instead carry it over to the set of pivotal points, which turns out to be a bit easier to handle. Unfortunately they do not have the same distribution in general — only these low-dimensional marginals agree — so this cannot carry you all the way to what you want. So what happened with percolation? For the expected sizes of these two sets, if you put together the slides I showed you before, the total influence is R to the 3/4, but what you need to show is that this is also the typical behavior of the spectral sample. In their very long paper, Garban, Pete, and Schramm proved — as just one corollary of what they did; they did a lot, but this is the one thing relevant to the talk, indeed the main point of the talk — that the typical size of the spectral sample is R to the 3/4: not only the mean, but the typical behavior. This was a very difficult project, and the fact that this is the typical behavior in the end yields the conjectured noise sensitivity exponent. So that’s what you need to do. The point is that even though one is really interested in the cardinality, their method didn’t just look at the cardinality: it looked at the spectral sample as a random subset and tried to analyze it as such — it has some relationship to fractal percolation — and they were able to get enough information on it to prove this result. Okay, I’ll skip that part then. Okay. [applause]