>> Yuval Peres: And for those with stamina who have stayed with us this far, we have the great closing talk by Madhu Sudan on Imperfectly Shared Randomness in Communication.

>> Madhu Sudan: Okay, so thanks Yuval. Thanks for arranging this Theory Day. It's really great to be celebrating theory within Microsoft, even if we're a little less strong than we used to be. Or maybe not in a strict sense; we've done well since. But we've definitely lost a few people in the recent past. Raghu thankfully is now at UCLA. This was joint work with Raghu, Clement, and Venkatesan, from when they were all at MSR New England; that's when we started thinking about this question.

What's this talk about? I'm going to talk about communication complexity; a very quick introduction for those of you who haven't seen this. You have two players, Alice and Bob. They have some private inputs x and y. They want to compute some joint function f of x and y, and typically zero-one valued functions are the ones we're thinking about. They're allowed to do some interaction. The way I want to think about it, Bob wants to compute this function: Bob outputs f of x, y at the end. The question we are usually interested in asking is: over all the possible interaction protocols we could consider, which one minimizes the number of bits they exchange? Okay, and today it'll be very important to talk about randomized communication strategies. Here there's some randomness which is shared between Alice and Bob, and they're now only required to output the correct value with probability at least two thirds. That's the change in the model. Once again we can continue to ask the same questions.

Why do people usually study communication complexity? Well, Noam is not here anymore; he wrote the book on it, so I suspect he may agree or not. But by and large, if you look at the bulk of the literature, it's focused on proving lower bounds in communication complexity. If you ever give a lower bound which is a little less than n, then you know your paper is rejected. The Holy Grail is to get good strong lower bounds. These things have lots of applications: complexity, streaming, data structures, etcetera, etcetera. But today I really want to think about it as a positive concept. I mean, look, if I'm out here trying to figure out how to design a web form or some such thing, I want to understand what the user is likely to know and care about. What are the things that I should tell them about all the services we have to offer? What I would like to do at the end is to minimize the total amount of communication. The user is not going to sit around on this website forever and ever; they're going to do something quickly and then go away. What kind of website should I offer them? What's the right way to study this? I would say that this communication complexity is actually the right model to be studying these kinds of things.

A particular thing that I want to talk about in the long run, though not something we'll get to today: when we talk about human-human communication or computer-to-computer communication, these are often cases where the inputs are huge. The x and the y and so on that you're thinking about are huge. What is the context for today's talk of mine? Well, all the knowledge of English that I have, all the, you know, current social and political things that we might want to invoke.
All the mathematics that I know, and the same on [indiscernible], but these two are not exactly matched. Okay, so do I really need to know all the words in English that you know in order to be able to design this talk? If I do, it's going to fail miserably. There's going to be a large amount of context, and the communication is relatively brief. I really want to think of communication as extremely short: rapid-fire communication, a tweet; Twitter is a good example. On the other hand the context is large, and the contexts are not going to be shared perfectly. This is roughly the kind of thing we want to talk about; this is the kind of thing I would like communication complexity to be able to address. I'm not going to get there today. Today we're going to talk about one very simple idealized mathematical problem in this space. But this is the general picture that I want to start thinking about.

What is in this talk? Two things. One is basically just a little bit of pedagogy. We've had lots and lots of, I mean there's a book, there are other surveys on communication complexity. Most of them start off with: here is one protocol; now let's start doing what we really want to do, lower bounds. I want to change that. I want to give you a few different problems which have low communication complexity protocols. I want to just tell you that these exist, mainly because I want to say that low communication complexity is actually very interesting. This is the context in which I want to talk about some extra levels of uncertainty and how to overcome them — how to make communication protocols more reliable under a new notion of error or unreliability, okay.

Alright, so this is one slide which I hope will be very, very useful to many of you. There's nothing here that's new; however, these are things that you might not have all put together on the same page in the past. There are a few problems where communication complexity is very small deterministically, but those are sort of not the interesting ones. The interesting things happen when I really cannot take all the inputs in my head, partition them into some small number of equivalence classes, and still manage to communicate very little to you and get you to figure out the function. One of the simplest problems of this type is equality. I have a big string x in my head. You have a big string y in your head. We want to know if these two strings are equal or not, okay. The communication complexity of this problem under this shared randomness model is a constant, a fixed constant; it depends only on the error probability, which we fixed by requiring success two thirds of the time, so it's constant.

What is this protocol? It's actually very, very elementary, but I'll mention it anyway because it starts introducing some things that I'll use later. One way to check for equality: the randomness tells us to look at a particular coordinate of x and y, and we compare it. But that's clearly going to be a very bad idea if x and y happen to differ in only one coordinate. Okay, so what do you do? The most obvious thing: you don't compare x and y, you compare the encodings of x and y in some error correcting code. Use an error correcting code; now any two distinct strings get encodings that differ in a constant fraction of the places. Your randomness tells you to look at a particular coordinate, you just exchange that particular bit of the encodings of x and y, and you check whether they're equal. Clearly a constant number of bits, and you get as small an error probability as you want.
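To make this concrete, here is a minimal sketch of that equality protocol — my own illustration, not code from the talk — using a polynomial-evaluation code as the error correcting code; the prime P and the helper names are assumptions of the example, and one round gives error below one third.

```python
import random

P = 2**61 - 1  # prime field; assumes inputs have far fewer than P bits

def codeword_symbol(bits, point):
    """Evaluate the polynomial whose coefficients are `bits` at `point` mod P.
    Two distinct inputs of length n yield codewords (evaluations at 1..3n)
    that agree in fewer than n of the 3n coordinates."""
    value = 0
    for b in reversed(bits):          # Horner's rule
        value = (value * point + b) % P
    return value

def equality_protocol(x_bits, y_bits, shared):
    """One round: the shared randomness picks a coordinate of the encodings,
    Alice sends her symbol (a constant number of bits), Bob compares.
    Errs with probability < 1/3 on unequal inputs; repeat to shrink the error."""
    n = len(x_bits)
    coord = shared.randrange(1, 3 * n + 1)        # shared random coordinate
    return codeword_symbol(x_bits, coord) == codeword_symbol(y_bits, coord)

shared = random.Random(2024)                      # stand-in for shared randomness
print(equality_protocol([1, 0, 1, 1], [1, 0, 1, 1], shared))  # True
print(equality_protocol([1, 0, 1, 1], [1, 1, 1, 1], shared))  # usually False
```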
Okay, that's protocol number one. Now there are a few other problems. Actually, for most people I think it stops right here: that's the constant-communication protocol, that's the landscape of constant communication complexity. No it's not. There are a few other problems, and they are all generalizations of this one. Hamming distance, but in a restricted sense: we're given two strings x and y, and we want to say one if the Hamming distance between them is at most k, where k is, think of it as a constant, five or ten, and we want to say zero if the Hamming distance is large. This is an extension of the previous problem; the previous problem was k equal to zero, and k equal to zero is the equality problem. Here it turns out you can actually do something fairly reasonable which gets us to about k log k bits of communication; I'll tell you some weak approximation to this in a minute. That's one of the problems where you have constant communication, independent of the length of the input.

Here's another problem: small set intersection. Think of x as the characteristic vector of some set and y as the characteristic vector of some set. Both sets are of size at most k, okay. The universe is massive. There are two sets, x and y, of size at most k, and you want to know if they intersect or not. Here there's a very clever protocol due to Hastad and Wigderson which gets k bits of communication. If you don't want to be very clever then there's a very simple way of getting some poly in k bits. How do these things work? Very simple: we have lots of common randomness, so we can pick a random hash function which maps the universe to a range of size about k squared. There'll be no collisions. Now if you use this hash function in some appropriate way in each of these cases, you'll be able to reduce the universe size to something like k squared, and then you can just exchange all the bits you want to exchange in this small universe. You get something like poly k communication, okay; a sketch follows below. That works out quite nicely. You again have constant communication protocols, independent of the length of the input.
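Here is a minimal sketch of that simple hashing reduction for small set intersection — an illustration only, with hypothetical names; the actual Hastad-Wigderson protocol is cleverer and gets O(k) bits, while this version sends about k log k bits.

```python
import random

def small_set_intersection(set_a, set_b, k, shared):
    """Decide whether two sets of size <= k (over a huge universe) intersect.
    A shared random hash maps the universe into ~k**2 buckets, so with
    constant probability none of the <= 2k relevant elements collide; then
    comparing the k hashed values (O(k log k) bits of communication) suffices."""
    buckets = 10 * k * k
    salt = shared.getrandbits(64)                 # shared random hash function
    h = lambda e: hash((salt, e)) % buckets
    alice_msg = {h(e) for e in set_a}             # what Alice sends to Bob
    return any(h(e) in alice_msg for e in set_b)  # Bob's answer

shared = random.Random(7)
print(small_set_intersection({3, 900001, 42}, {17, 42}, 3, shared))     # True
print(small_set_intersection({3, 900001, 5}, {17, 600000}, 3, shared))  # usually False
```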
Ravi?

>>: In all of these, the shared randomness is only log n bits?

>> Madhu Sudan: In all of these the shared randomness is only log n bits. It's not a coincidence; it comes without loss of generality. There's a theorem due to Newman which says that for any problem you want to solve with shared randomness, you only need log n bits of randomness. It's a variation of the same proof that went into Adleman's theorem. But when you apply it here you still need some randomness.

Now comes a nice problem which I don't think has been studied in the communication complexity literature itself; however, it's actually well studied in a different literature, where they offer a solution which looks like a communication complexity solution. This is what I call the Gap Inner Product question. In communication complexity there's an inner product problem, but here we're talking about real vectors and the real inner product. I have in my head a unit vector in n-dimensional real space. You have in your head a unit vector in n-dimensional real space. What we want to know is the inner product, but we can't estimate it exactly. We are happy to estimate it to within a gap of plus or minus epsilon, something like that. Or in general you want to say yes if the inner product is sufficiently large — at least some number c, for completeness — and you want to say zero if the inner product is less than s, s for soundness. Can you do this? It turns out that you can actually do this with communication complexity which is something like one over c minus s squared. If c and s are separated by epsilon, then one over epsilon squared bits suffice to solve this problem.

I won't give you a full explanation of how this is done, but a very quick insight is the following. I have a huge vector in my head. I don't want to send you the huge vector. I will be able to reduce it to a single real number by the following observation. Pick a random Gaussian vector G in n dimensions: each coordinate is normal with mean zero and variance one, independent across coordinates. I look at the inner product of G with x and the inner product of G with y, and take the product. The expectation over G of this quantity is exactly the inner product of x and y; it's an unbiased estimator of the inner product, okay, and it's even got pretty good variance. So I use the shared randomness to define this vector G, I compute G times x, [indiscernible], and send it to you. That more or less gives you this protocol, okay.

By the way, if you want to talk about communication complexity and applying it in the real world, to me this is a pretty interesting problem. I could imagine a large number of communication problems being explained in terms of: I have some objective in my head, you have some offering in your head, and the gain we're going to get by collaborating or interacting is roughly this inner product, okay. It's a pretty reasonable thing, and the fact that it can be solved very efficiently is important.

>>: In statistics they call this dimension reduction, right?

>> Madhu Sudan: This is probably due to them; yeah, maybe I have the wrong reference there. But for whatever reason…

>>: [indiscernible] the argument is, you know, usually for vector differences. But you subtract out the lengths [indiscernible] and so on, everything will work out.

>> Madhu Sudan: I think so, right, yeah, exactly. It's…

>>: But for [indiscernible] products it depends on the lengths of the vectors, which you've put into your c minus s.

>> Madhu Sudan: Right, over here I just normalize everything, so, right, yeah, good, okay.

I should thank my students Badih and Pritish who sort of, you know, put this taxonomy of what was known together for me. Alright, so those are the examples of problems which have short communication protocols. That was not the goal of the talk, but I did want to emphasize that there are many of these problems. These are the problems we'll be coming back to, to motivate the rest of the talk.
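As a concrete illustration of the Gaussian sketching observation above — that the expectation of the product of the two projections is the inner product — here is a minimal numerical sketch; the function names and the use of numpy are assumptions of the example.

```python
import numpy as np

def sketch_inner_product(x, y, trials, rng):
    """Estimate <x, y> via the shared-randomness sketch: for a standard
    Gaussian vector G, E[<G, x> * <G, y>] = <x, y>, so Alice can send the
    single real number <G, x> and Bob multiplies it by <G, y>.
    Averaging over `trials` independent G's reduces the variance."""
    n = len(x)
    estimates = []
    for _ in range(trials):
        g = rng.standard_normal(n)            # shared random Gaussian vector
        estimates.append(np.dot(g, x) * np.dot(g, y))
    return float(np.mean(estimates))

rng = np.random.default_rng(0)
x = np.ones(1000) / np.sqrt(1000)             # unit vectors
y = np.ones(1000) / np.sqrt(1000)
print(sketch_inner_product(x, y, 2000, rng))  # close to <x, y> = 1.0
```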
The general question that I want to talk about is reliability in communication, and how to model some notion of uncertainty. Now what could uncertainty mean? Modeling this takes a lot of effort, and I will not be talking about any general settings, but I just want to draw this picture. You have Alice and Bob talking to each other, and there are lots of things that are supposed to be common to the two. Anything that's common to the two, you can put it under the lens and ask: is it really common? You can look at the wire they're using to interact with each other. If you decide that it's not a reliable wire, then you get Schulman's problem of interactive communication; there's been a lot of work on that in the recent past. You could put the dividing line anywhere and ask, say, what about this function? Bob wants to compute f of x, y, but what does Alice know about f? Maybe she doesn't know everything. It's probably a pretty reasonable setting, but we don't get to it. What we're going to look at is this middle arrow: the randomness. You're assuming there's common randomness which is shared between both players. What happens if it is not really perfectly shared? Okay, so that's the question: Alice and Bob don't share randomness perfectly, only approximately; what can you do? I'll tell you what the model is, I'll tell you about some positive results, and maybe a couple of slides on the negative results.

Alright, so here's the model. I'll come up with a very simple, clean model of imperfectly shared randomness. This is all we'll work with in this talk; there can be other variations, but not in this talk. Alice gets a sequence of random bits. These are going to be plus-minus-one bits, just so that I can write everything nicely, formally. Bob gets a sequence of bits. The pairs r i, s i are independent across i and identically distributed; within a pair, r i is correlated with s i. Each of them has marginal expectation zero, and the two of them are correlated, which means the expected value of r i times s i is rho. Okay, so when rho is positive these are positively correlated. If rho were equal to one these would be identical. If rho is zero these are independent. Maybe in this slide I'll use ISR to denote the communication complexity of a function with imperfectly shared randomness, where the correlation between the individual pairs is rho. Extreme cases: perfectly shared randomness is rho equal to one; private randomness is rho equal to zero, okay.

Starting point, for Boolean functions; I mentioned this already earlier. The communication complexity of a function with perfectly shared randomness is upper bounded by its communication complexity with some correlation, which in turn is upper bounded by its complexity with no correlation — perfect correlation, some correlation, no correlation. But even the last one is upper bounded by the perfectly shared communication complexity plus log n. Why is this? Because of Newman's theorem, which says that in any problem the total amount of randomness you need is log n bits, okay. This already gives us a pretty good approximation to how good the communication complexity of a function can be, up to an additive log n. But I don't like additive log n's. This talk, yeah?

>>: Everybody knows this rho?

>> Madhu Sudan: Yes, everybody knows this rho; it's fixed and constant. Actually you only need to know a lower bound on rho.

>>: [inaudible].

>> Madhu Sudan: And the protocols will work, but, yeah, sorry.

>>: I'm just [indiscernible] to show that it's less than [indiscernible]. I guess you don't even need to know it beforehand; you could [indiscernible] Alice uses the odd bits and Bob the even bits and…

>> Madhu Sudan: It works out exactly. In general actually there's an even stronger [indiscernible] you can take anywhere. You can look in [indiscernible]. I mean I can just re-randomize every bit with a little bit of extra work and then it's fine.
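A minimal sketch of the correlation model just defined, showing one standard way (an assumption of this illustration, not necessarily how the paper phrases it) to generate rho-correlated plus-minus-one bits:

```python
import random

def correlated_bits(n, rho, rng):
    """Sample n pairs (r_i, s_i) of +/-1 bits matching the model above:
    each marginal is uniform, pairs are independent across i, and
    E[r_i * s_i] = rho (s_i copies r_i with probability (1 + rho) / 2)."""
    r, s = [], []
    for _ in range(n):
        ri = rng.choice([-1, 1])
        si = ri if rng.random() < (1 + rho) / 2 else -ri
        r.append(ri)
        s.append(si)
    return r, s

rng = random.Random(1)
r, s = correlated_bits(100000, 0.5, rng)
print(sum(ri * si for ri, si in zip(r, s)) / len(r))   # roughly 0.5
```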
Alright, but what if the communication complexity of some function is much, much less than log n? We've seen a few examples where it's a constant. Can you get a constant in those cases?

Okay, so a little bit of history. We started looking at this question, and Clement, at ICALP, came across this paper by Bavarian, Gavinsky, and Ito with exactly the same model. Now this wouldn't be such a dramatic surprise except that Bavarian is my student; I had no idea he was working on this problem. [laughter] Anyway, fortunately they didn't have the same collection of goals and questions, so it wasn't scooping us completely. But that's the only other paper I know in the literature which really talked about communication complexity with this kind of correlation. In general, in the literature in probability, as well as information theory and signal processing, people have been looking at what you can do with correlated randomness — even cryptography, quantum key exchange, etcetera, etcetera. But not in the language of communication complexity, which is what we're going to do today. Bavarian et al. said, well, this plus log n makes the two-party problem uninteresting; we'll look at it in the setting of simultaneous communication, which is a different model of communication complexity. For us, I really am interested in two-party communication complexity, but in constant-communication protocols, so for us it still made sense. One of the things they show, even in their model where you can do less, is that equality testing actually has a constant-bit communication protocol, even with imperfectly shared randomness. It's not completely trivial: if you look at the protocols we have, they're very, very sensitive to the randomness that you're using; you can't just convert them to a different source. One of the things that I'll try to do is show you this.

What we did was come up with a very general result which says: if you have a problem with communication complexity k, its communication complexity with imperfectly shared randomness is at most exponential in k.

>>: [indiscernible] for every constant rho between zero and one?

>> Madhu Sudan: For every constant rho that is strictly positive, yes.

We were very unhappy with this result and tried to improve it. Finally, we were able to prove that we couldn't improve it. There is a function — it's not a function, it's a promise problem — for which the communication complexity is at most k, but with imperfectly shared randomness its communication complexity bumps up to two to the k. For every choice of k there is one such thing.

>>: I guess the two depends on rho, or?

>> Madhu Sudan: Okay, so there are some constants in front of all of these that depend on rho.

>>: But it's a constant, it's not in the exponent?

>> Madhu Sudan: It's not in the exponent. Yeah, I mean this is sort of theoretical computer science notation. [laughter] Constants can be suppressed; O's can be suppressed, alright. [laughter]

Alright, so in the rest of the talk I'll try to tell you a little bit about the positive result and how it's obtained, and then hint at the negative result. The positive result comes from thinking about inner products — again, real inner products. What we're going to do is encode my vector x in an error correcting code.
But an error correcting code where the symbols are plus and minus one, okay, just so that if you think of a codeword as a real vector, it's a real vector all of whose coordinates have the same constant magnitude, okay. You encode x into this big string capital X, and y into this big string capital Y, and X and Y have the following property. If little x and little y are equal, then so are capital X and capital Y, and the inner product between them is N, okay. If little x is different from little y, and your encoding is good enough, then the inner product is at most N over two, okay. In general it can be anywhere from plus N to minus N [indiscernible]. Okay, so what we're going to do is try and estimate this inner product. The way we're going to do this is based on the sketching protocol I already mentioned earlier, but we do it slightly differently. What we're going to do is the following. Alice will pick a collection of Gaussians, each one a capital-N-dimensional vector, and t of them. What she's going to send to Bob is the index i which maximizes the inner product between G i and X, okay. If you've seen the analysis of the coloring algorithm based on semi-definite programming and so on, this is very similar: you have a vector in N dimensions, you pick t random vectors, and you pick the nearest one, okay. Bob does not have these Gaussians — these are random Gaussians, Bob does not have them — but he has the ability to manufacture Gaussians which are rho-correlated with Alice's Gaussians. Okay, just take a collection of correlated bits and combine them, and that gives you Gaussians, okay; the correlation, or at least the positivity of the correlation, is preserved. Now, Alice sent i, so Bob is going to look at his own i-th Gaussian and check whether the inner product between his G i and Y is positive or not. If it's positive he accepts. It turns out that this works out; some constant t works for this example. But in general, if you're trying to measure inner products to within plus or minus epsilon when X and Y are normalized to unit length, then you have to pick exponentially many (in one over epsilon squared) different Gaussians. Then you send the index of one of them, which is log of exponential of one over epsilon squared — that is, one over epsilon squared bits, okay. This manages to work out. This is what we call the Gaussian protocol. It turns out to be very, very useful.
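Here is a minimal sketch of this Gaussian protocol acting on the encoded vectors — an illustration under assumptions: the rho-correlated Gaussians are produced directly by Gaussian mixing rather than from correlated bits, and the unequal case is simulated with a roughly uncorrelated codeword.

```python
import numpy as np

def isr_equality_decision(X, Y, rho, t, rng):
    """Sketch of the 'Gaussian protocol' above, run on the +/-1 codewords
    X (Alice's) and Y (Bob's).  Alice draws t Gaussian vectors and sends only
    the index i maximizing <G_i, X> (log t bits).  Bob, who holds
    rho-correlated copies G'_i, accepts iff <G'_i, Y> > 0.  The correlated
    copies are produced here by Gaussian mixing, a stand-in for building them
    from correlated bits."""
    N = len(X)
    G = rng.standard_normal((t, N))                        # Alice's Gaussians
    G_bob = rho * G + np.sqrt(1 - rho**2) * rng.standard_normal((t, N))
    i_star = int(np.argmax(G @ X))                         # Alice's message
    return float(G_bob[i_star] @ Y) > 0                    # Bob's decision

rng = np.random.default_rng(3)
N, rho, t = 4000, 0.8, 64
X = rng.choice([-1.0, 1.0], size=N)
print(isr_equality_decision(X, X.copy(), rho, t, rng))     # equal inputs: mostly True
Y = np.where(rng.random(N) < 0.5, X, -X)                   # roughly uncorrelated codeword
print(isr_equality_decision(X, Y, rho, t, rng))            # unequal inputs: about 50/50
```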
How do you go from this very simple inner product problem — this equality testing problem — to the general case? It turns out inner products really capture everything in communication complexity. This is quite standard for people who have looked at it; if not, I'll just give you a very brief idea why this is the case. For simplicity let's think of one-way communication. Alice is going to send some message to Bob, and Bob is just going to spit out the value of the function. What does this protocol look like? Well, Alice is going to send some message, some number between one and two to the k; k bits of information. Bob has some function which says, given whatever Alice sends, what should he do: accept or reject? So Alice's side is represented by an integer between one and two to the k, and Bob's by a function from one through two to the k to zero-one, okay.

What we know is that they have a perfectly shared random string. If you look at all the different random strings they can use, for each random string R they've come up with a possible message and a possible function to work with, and for a random choice of R the value f R of i R is the right value. Okay, that's the definition of having a communication protocol with k bits of communication and perfectly shared randomness. We want to take this problem and solve it without the perfectly shared randomness. The first thing we're going to do is convert their strategies. Alice's strategy is a collection of integers i of R, ranging over R; Bob's strategy is a collection of functions f of R, ranging over R. We're going to take both of these strategies and represent them as vectors — the same thing that I had earlier. The representation is simple: i of R is represented by a unit vector of length two to the k, with a one in coordinate i of R and zeros everywhere else, and f of R is represented by the truth table of the function. Both of these are vectors of length two to the k, and their inner product is the value of the function f R at i R, okay. The only thing happening over here is that there is a certain normalization: these vectors have dimension two to the k, and the L2 norm of the truth-table vector can be as large as the square root of two to the k. You want to distinguish this inner product to within plus or minus one, which after normalizing means roughly a two to the minus k approximation to this inner product. That's the approximation that we will eventually have to compute, and the Gaussian protocol can do it with about two to the k bits of communication, okay, alright, that's it. So this is how you do one-way communication. Two-way communication is just the same thing again: you look at the definition of two-way communication, represent strategies as vectors and acceptance as inner products, and do it. It turns out to be all similar; I won't do it in this talk. At the end you get a theorem: no matter what function you have, if it has k bits of communication with perfectly shared randomness, two to the k bits of communication with imperfectly shared randomness suffice, okay. This may actually be four to the k because of some error terms; I'm not so sure.

>>: [inaudible]

>> Madhu Sudan: Some power of two. [laughter] Okay, the epsilon squared — I don't know what happens to it. I don't, okay, never mind.
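A tiny illustration of the representation just described — Alice's message as an indicator vector, Bob's decision rule as a truth table, and acceptance as their inner product; the names and the toy decision rule are assumptions of the example.

```python
import numpy as np

def message_as_vector(message, k):
    """Alice's k-bit message, viewed as a 0/1 indicator vector of length 2**k."""
    v = np.zeros(2**k)
    v[message] = 1.0
    return v

def accept_via_inner_product(message, truth_table, k):
    """Bob's decision f(message) equals the inner product of the two vectors,
    which is why estimating inner products (to within +/- 1, before
    normalization) simulates any one-way k-bit protocol."""
    return float(message_as_vector(message, k) @ np.asarray(truth_table, dtype=float))

k = 3
truth_table = [0, 1, 1, 0, 1, 0, 0, 1]                # an arbitrary decision rule for Bob
print(accept_via_inner_product(2, truth_table, k))    # 1.0 -> accept
print(accept_via_inner_product(5, truth_table, k))    # 0.0 -> reject
```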
Alright, so the main technical result in the work, which I might not really get to, is the lower bound. It says that there's a promise problem which can be solved with k bits of communication, but its imperfectly-shared communication complexity is exponentially large in k — actually some constant times two to the k. Once again this lower bound works as long as rho is less than one; for any rho less than one you get this lower bound. The first thing I want to tell you is what this problem is, because I think even the problem itself is natural and interesting; we should probably think about it occasionally. This is what I call the sparse Gap Inner Product problem. The problems that we've been working with so far are Gap Inner Product problems: the inner product is large or small, distinguish between them; there's a gap between the good case and the bad case. Now we're going to have an additional feature, which is sparsity.

What's going to happen? Alice is going to get a sparse vector in zero-one to the n. The vector is sparse in the sense that only one out of every two to the k bits is a one and the others are zeros, okay. Bob gets a vector in zero-one to the n, and they want to estimate the inner product. Now the largest that the inner product can be is two to the minus k times n, right, because that's the weight of the vector x, so it cannot be larger than that. I'm just looking at raw inner products, no normalization. The promise that I'll give you is: in the yes case the inner product is ninety percent of the maximum possible, and in the bad case it's almost as if x and y are uncorrelated, so almost like fifty percent. We have to decide which case we're in, okay. This is the problem. We want to do two things with it. We want to show that it has small communication complexity with perfectly shared randomness, and we want to show that it has very high communication complexity if the randomness is not perfect. I'll do one of the two, okay. That's just to remind you what the question is; hopefully you'll stick around for the next slide.

If you just decided to run the Gaussian protocol, we would have communication complexity of about two to the k for this problem, okay. That would be ignoring sparsity. We want to do better by exploiting sparsity. What are we going to do? The main idea is the following. If you can discover a random coordinate i such that x i is non-zero, then y i is positively correlated with the answer, right. Because ninety percent of the time, if x i is one then y i will be one in the good case, but it'll only be one with probability around sixty percent in the bad case. That gives us enough of a distinguishing advantage. So we want to discover a random coordinate i such that x i is one. How do we do this? Well, they share randomness, and they're sharing it perfectly; that's the model we have here. What do they do? The shared randomness is a sequence of indices, each uniform over n — just a random sequence. Alice will send to Bob the position j of the first index i j such that x at i j is non-zero, okay. Clearly this generates for us a random coordinate where x is non-zero, okay. How large is j going to be? Well, for each one of these indices the probability that x at i j is non-zero is two to the minus k, so amongst the first roughly two to the k indices you're going to find a non-zero coordinate. I have to send you the position, which means I have to send you about k bits. We got an exponential advantage over the trivial protocol, okay. By looking at this problem carefully we're able to get a significant advantage over the naive protocol. Now the question is, can you give a lower bound for this?

>>: Just a question.

>> Madhu Sudan: Sure, yeah.

>>: I mean, point six or point four would make no difference…

>> Madhu Sudan: Would make no difference. I think point six we just used to make sure that in the bad distribution we would actually pick x and y uncorrelated. We just want to make…

>>: But it could be correlated negatively…

>> Madhu Sudan: Negatively — in that case the result would still hold, yeah, actually. You'd just have to say things slightly differently, yeah.
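A minimal sketch of this perfectly-shared-randomness protocol for the sparse problem — an illustration with assumed parameter choices and a cap on the search, not the exact protocol from the paper:

```python
import random

def sparse_gip_protocol(x, y, k, shared, cap=None):
    """Perfect-shared-randomness protocol for sparse gap inner product:
    the shared randomness is a stream of uniform indices; Alice sends the
    position j of the first index hitting a 1 in her (2**-k)-sparse vector x
    (about k bits, since j is typically around 2**k), and Bob answers with y
    at that index.  `cap` just bounds the search in this sketch; on failure
    we guess 0."""
    n = len(x)
    cap = cap or 100 * 2**k
    for j in range(cap):
        idx = shared.randrange(n)        # both parties can compute this index
        if x[idx] == 1:                  # Alice sends this position j (~k bits)
            return y[idx]                # Bob's one-bit answer
    return 0

shared = random.Random(5)
n, k = 1 << 16, 4
ones = set(shared.sample(range(n), n >> k))                      # a 2**-k fraction of ones
x = [1 if i in ones else 0 for i in range(n)]
y_yes = [xi if shared.random() < 0.9 else 1 - xi for xi in x]    # ~90% agreement on x's support
print(sparse_gip_protocol(x, y_yes, k, shared))                  # usually 1 in the yes case
```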
Great, okay, so that's the protocol, and that tells you why it works. Now, for the lower bound: I won't give you the full lower bound, but I'll tell you a few interesting things about it.

First, you can't do the usual strategy in communication complexity lower bounds. What's the usual strategy? You fix a distribution, and then you say that on this distribution the communication complexity is large. Okay, so you don't look at the worst-case instance; you look at the average-case complexity over the distribution. You can't fix a distribution anymore. If you fix a distribution, there is a deterministic strategy which has k bits of communication. Why? Because for every input there is a strategy which works with k bits of communication with perfectly shared randomness, which means that for every distribution there's a randomized strategy which works with k bits of communication with perfectly shared randomness. But then, once the distribution is fixed, you can sort of flip the minimum and the maximum, fix the best choice of the randomness, and you now have a deterministic protocol. So you really have to do things in the reverse order: you have to look at the strategy the players are going to employ, and then fix the distribution later. The distribution has to be tailor-made to the strategy. This won't come out explicitly in the proof that I'll describe, but it is an ingredient that has to be there.

One thing that we do need to do is to say: look, there was this perfectly-shared-randomness protocol; you might try to implement something like it with your imperfectly shared randomness, and this should not work. Why will this not work? We need to rule out the following: somehow or other, Alice and Bob manage to agree on a common index i which is uniformly distributed between one and n and where x i is non-zero. We need to rule this out. How? We show that this immediately implies that they're agreeing on a large amount of randomness with high probability — randomness of entropy log n bits — with very little communication. We prove that if you want to agree perfectly on a large amount of randomness when you only have correlations to start with, this takes an amount of communication proportional to the entropy you're trying to share. You cannot agree on an index between one and n without log n bits of communication, even with the correlations you are sharing, okay. Once we phrased it this way, the lower bound was in the literature, but [indiscernible].

Now, the other thing: I showed you a protocol which ignores the sparsity and achieves two to the k bits with imperfect sharing. Is that the best possible? We prove that yes, essentially, you cannot do better than ignoring the sparsity. These are some of the ideas behind the proof. The unfortunate thing is that a protocol could look like anything; these cases don't look like an exhaustive list, but the remarkable thing is that they almost are. The formal proof of this goes through something called the Invariance Principle. It was an amazing miracle when I saw the way this principle can be applied to get a reduction. Take any protocol: either Alice and Bob are really agreeing on some variable with high probability — their strategies place high influence on some common coordinate — and then you get to the agreement setting; or it turns out the protocol is not placing too high an influence on any coordinate.
In which case you can actually replace the bits by Gaussians, and for Gaussians there's no notion of sparsity anymore — you really can't exploit it — so you're forced into the two to the k lower bound. This is sort of a remarkable discovery which comes in perfectly suited for our setting, and we were able to use it. Summarizing: if you have k bits of communication with perfect sharing, then two to the k bits of communication with imperfect sharing suffice, and this is tight.

Along the way — the invariance principle is very nice, very cute, but I feel that, for the sake of utility, especially in either the PCP literature or the probability literature, a different articulation of it would be very, very useful and nice. We have in our paper what I think is a very convenient way of stating it. At first we had a lower bound saying only that one-way communication would be at least two to the k; with almost no change in the whole proof we were able to get the general two to the k bound, just by making the invariance principle sufficiently general. I'll encourage you to look at our paper for that portion.

Back to the big picture. I think this imperfect agreement of context is important. It's something that we've been wishing away, but it doesn't go away. In communication complexity in the classical sense, we tried to abstract things as either you and I both know this, or you and I don't — a very firm division between me and you — but most natural communication is not along these lines. For everything that I know, there's some chance that you might know it and some chance that you don't. We manage to leverage this uncertainty and still work with it, anecdotally, in common settings. Can we make it more rigorous and prove that this happens, in a mathematical sense? That's where we're going here. The other thing that I do want to say is that large context and short communication is certainly an interesting range of parameters to think about. Okay, thanks very much. I'll stop here.

[applause]

>> Yuval Peres: Any additional questions for Madhu?

>> Madhu Sudan: I'm tired. [laughter]

>>: Yes, very. [laughter]

>>: Well, just something about your first slide, the motivation?

>> Madhu Sudan: Okay.

>>: It doesn't seem to be the case that [indiscernible] communication tries to minimize the information it transmits? There are certain types of information that are much easier to comprehend than others, for example pictures versus words.

>> Madhu Sudan: Right, okay, so the measure of complexity may not simply be the number of bits; it's also a function of what is easy to achieve, right. I mean, I can certainly point to an object in the room and that's maybe a large number of bits of communication, if you look at all the different things I could have pointed to. But we do want to minimize the total time it takes to get this communication done.

>>: Some of us do maximize. [laughter]

>> Madhu Sudan: Some of us maximize, I agree. [laughter]

>>: Yeah, I guess a lot of these lower bounds are used to prove, let's say, lower bounds [indiscernible]. Like if you have sparseness. But morally, like if you have a good [indiscernible] graph that ensures [indiscernible] other [indiscernible]. So morally, are there other examples where you have low communication protocols for calculating certain functions as well?

>> Madhu Sudan: In streaming or?
>>: Yeah, let's say, let's take an example from [indiscernible]. Because you're saying there are very few examples which are known which have low communication and where you have the [indiscernible] function.

>> Madhu Sudan: Right, so up to an exponential factor in the communication, the inner product problem that I described is everything — in fact we almost proved that during the course of the talk; everything else reduces to it. But then within that scope, once again, there are other things. You know, small set intersection seems like a very reasonable thing to be thinking about. We've narrowed our interest down to some reasonably small sets where we might have an overlap, and we play the game of: now can we find it within this space? In that sense I do think that it makes a lot of sense to model things in terms of communication complexity. Is that the question?

>>: [indiscernible] can I talk to you privately perhaps?

>> Madhu Sudan: Okay.

>> Yuval Peres: Let's thank Madhu again. [applause]