>> Tom McMail: I'm Tom McMail from Microsoft Research Connections, and I'm very glad to welcome you here today and very glad to welcome Venkatesan Guruswami from Carnegie Mellon University. Dr. Guruswami received his Bachelor's degree from the Indian Institute of Technology at Madras and his Ph.D. from MIT. He has been a faculty member in Computer Science and Engineering at the University of Washington right here in Seattle, and he was a Miller Research Fellow at the University of California, Berkeley. So with that, I'd like to introduce Dr. Guruswami. >> Venkatesan Guruswami: Thanks, Tom. And thanks, all of you, for coming early in the morning. So this is a talk about bridging two lines of work in coding theory, pioneered by Shannon and Hamming, and it brings in some ideas from complexity theory and cryptography to design codes for a new model. And this is joint work with Adam Smith at Penn State. And so directly jumping to the outline: in this talk I'm going to spend a fair bit of time on the model, because it's sort of a new model and so on -- not an entirely new model, but a new twist on an old model in this context. So really more than half the talk will be sort of survey-like. I will talk about what was done in these lines of work, and then I'll state our results and show some ideas in the proof. So, specifically, I'll begin with the two popular error models used to model errors in coding theory, Shannon and Hamming; then we'll talk about one way to bridge between these, which is via list decoding, and we'll see what the challenges of that are; and that will motivate our main topic, which is computationally bounded channels, and then we'll see some previous results and then state our results. Okay. So that's the plan for the talk. And I have a longer slot than usual, so feel free to ask any questions in between. We should have time for questions. So there are two classic channel noise models. Suppose Alice wants to send a message to Bob across a noisy channel. We'll assume that Alice is sending a sequence of n bits. The noisy channel is going to distort some of these bits, and there are two popular models. One, pioneered by Shannon, is a probabilistic view where you say that as the bits stream by, each bit is flipped independently with probability p. It's called the binary symmetric channel. Another way to say it is that the error distribution is simply a binomial with mean p times n. Okay. So typically pn errors happen, and they happen independently across bit positions. The other model is the Hamming model, where you say that the channel can do whatever it wants in the most nasty way. The only restriction is that it never corrupts more than a p fraction of the bits as the codeword goes through. And here p is some error fraction which is between zero and one half. And in both of these settings the challenge is to understand what is the best possible rate at which you can send bits. So if you actually send n bits, how many actual bits of information can you get across to the receiver in the presence of this noise? That's really the basic question. And to make this formal, let me define things. This is probably known to all of you, but just to set up some notation: what's an error-correcting code? We'll think of it as a way to encode k bits of information -- that's the amount of information you have in your message -- into an n-bit string, which is a redundant string. m is typically called the message, c is the codeword, and c of m is the codeword associated with m.
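Just to make the encoding-map picture concrete, here is a toy illustration (my own example, not a code from the talk): a 3-fold repetition code, which is an encoding map C from k bits to n = 3k bits, so only a third of the transmitted bits carry information.

```python
# Toy illustration of an encoding map C: {0,1}^k -> {0,1}^n.
# Here: 3-fold repetition, so n = 3k and the rate is k/n = 1/3.
def encode(message_bits):
    return [b for b in message_bits for _ in range(3)]

def decode(received_bits):
    # Majority vote inside each block of 3; corrects one flipped bit per block.
    blocks = [received_bits[i:i + 3] for i in range(0, len(received_bits), 3)]
    return [int(sum(block) >= 2) for block in blocks]

m = [1, 0, 1, 1]          # the message
c = encode(m)             # the codeword c(m), of length n = 12
c[4] ^= 1                 # the channel flips one bit
assert decode(c) == m     # the receiver still recovers m
```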
And the other -- one important parameter for this talk is p, the error fraction which we saw in the previous slide. The other important quantity is the rate R, which is simply the ratio k over n, and this really conveys how much actual information you have per bit of the codeword, because you're sending k bits of information through n bits of communication. And the asymptotics will be such that we'll think of the error fraction p as a constant between zero and one half, and we would also like the rate to be bounded away from zero. So we'll think of k and n as growing with k over n fixed. And just to sort of give some reason -- that's what a code basically is, but in order to use codes for error correction, we would like the codewords to be well separated. So if you take all the images of this map, you'd like these blue points to be well separated. Why? Because if you send a codeword, then maybe you sent c, but there was some noise e added to it, so you actually received r equals c plus e, and here the assumption is e is some unknown error vector which you're trying to remove, which obeys some noise model like the Shannon or the Hamming model. And the hope is that if the codewords are well separated, you can, from r, go back to c. That's the idea. Okay. So the question then becomes what is the best rate you can achieve if you want to correct errors. So let's look at Shannon's capacity limit. Suppose you send a codeword c, and note that the error is a binomially distributed random variable, so when you send c, the set of possible received words you can get is really a Hamming ball of radius p times n centered around c. What you see will typically lie inside this ball. And note that for all of these points in this yellow ball, you should really go back to c, because you may get any one of these and you should decode that back to c. That's the -- for most of these things, you should decode them back to c. And what is the volume of this ball? This ball, since it has radius p times n, has a number of points roughly 2 to the h of p times n, where h is the binary entropy function. And just as I said earlier, if you want to do decoding, if you take these balls around all these codewords, they have to be more or less disjoint so that there is a good chance that you will recover the center of the ball when you get some arbitrary point in the ball. And a simple volume argument shows that each ball has this volume, and the total number of points is 2 to the n, so you can't pack more than 2 to the n divided by 2 to the h of p times n codewords. And that basically tells you that the largest rate you can have is 1 minus h of p. Okay. So all I've really done here is just a hint of the converse to Shannon's coding theorem. So even on the binary symmetric channel, the largest rate you can have is 1 minus h of p. And I'll have a plot of this a few slides down. But it's a quantity which for p small is very close to 1, which makes sense: if there's no error, you can send at information rate 1. As p goes to half, this rate approaches zero. And the amazing thing in Shannon's work was that he showed the converse to the converse, which is Shannon's theorem, that you can in fact achieve this bound.
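To make the volume argument concrete, here is a small numeric sketch (my own illustration, not from the slides): it compares the exact size of a Hamming ball of radius pn to the 2^{h(p)n} estimate, which gives the packing bound of roughly 2^{(1-h(p))n} codewords, i.e. rate at most 1 - h(p).

```python
from math import comb, log2

def h(p):
    """Binary entropy function h(p) = -p log2 p - (1-p) log2 (1-p)."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def ball_volume(n, radius):
    """Exact number of points in a Hamming ball of the given radius in {0,1}^n."""
    return sum(comb(n, i) for i in range(radius + 1))

n, p = 1000, 0.11
vol = ball_volume(n, int(p * n))
print(f"log2(ball volume)/n = {log2(vol)/n:.3f}   vs   h(p) = {h(p):.3f}")
# Packing bound: at most 2^n / 2^{h(p) n} disjoint balls, so rate <= 1 - h(p).
print(f"upper bound on rate: 1 - h(p) = {1 - h(p):.3f}")
```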
So he said that there exists a code of rate as close as you like to this capacity limit, which we now call Shannon's capacity, such that no matter what message you want to send, if you send it and it gets hit with binomially distributed error, then the probability that the [inaudible] lies in the ball centered around that codeword but in nothing else is very, very high. So basically, in principle, if you send this codeword, with high probability it will be inside this ball and nothing else. It won't lie in these little intersections, so you'll go back to c. And the closer you get to the capacity, there's some tradeoff here. But in principle, you can be as close to that as you want. And this was one of the early uses of the probabilistic method, and over the last 50, 60 years we now know various efficient explicit constructions with polynomial time algorithms which realize this dream both in theory and practice: various constructions of concatenated codes due to [inaudible], LDPC codes and, you know, recently, as of two years back, these things called polar codes. So, largely speaking, this problem is very well solved. Now, the motivation for this talk is that this is all nice, but it assumes that each bit is flipped independently, and this is a rather strong assumption, because errors often tend to happen in bursty or unpredictable ways. To assume that the tenth bit is independent of the ninth bit may not always be realistic. So we would like to understand what happens in the Hamming model, completely worst-case errors. All you assume is that the fraction of errors is at most p. So what can we say there? Okay. So let's ask the same question. Now, if you think about it, in the Hamming model, the largest rate at which you can communicate is determined by the largest number of points you can pack so that if you take these Hamming balls of radius pn, they're going to be completely disjoint. So there is no chance for any confusion. So you can ask the same question. It's a very basic combinatorial question. What's the largest number of such balls you can pack in Hamming space? Very nice question. The answer is unknown. It's one of the major open problems in combinatorial coding theory, indeed in all of combinatorics. So we don't know that. But what we do know is that it is strictly less than Shannon's capacity. So you cannot quite achieve Shannon's capacity in this model. In particular, as soon as the error fraction exceeds one fourth, the rate must go to zero. So [inaudible] Shannon's capacity remains positive all the way up to half. That's a very stark contrast. And also the best known rate, even non-constructively, is something like 1 minus the entropy of 2p instead of p. So another way to say this is that for a similar rate, in the Hamming model you can only correct about one half the fraction of errors you can correct in the Shannon model, and that is even allowing you to pick objects which we don't know how to construct efficiently, so even non-constructively. So that's really the pitch. So I think I've said a few tradeoffs, but it's good to see a picture here. So on the horizontal axis is the error fraction, which is between zero and half, and on the vertical axis is the rate, between zero and 1. The green curve is the Shannon capacity limit, which is the best you can do in the Shannon world. And the best you can do in the Hamming world currently is this blue curve. As you can see, it hits zero at .25 and it's well bounded away from the green curve.
And we also have upper bounds which say that you cannot go above this red line. So there is a provable separation. So given this, the [inaudible] question is: is there any way you can get the best of both worlds, where you have codes operating at Shannon capacity but maybe resilient to more expressive classes of errors rather than just i.i.d. errors? And ideally we would like to handle worst-case errors, but we know something has to give if you do that. So why do we care about worst-case errors, first of all? I mean, Shannon's problem is solved. Why don't we just go home? Well, worst-case errors are very nice. We like worst-case models in computer science. And the more serious answer is that a lot of extraneous applications of codes, which don't, per se, have anything to do with communication -- and there are many such examples in theoretical computer science today -- almost all of them require resilience to worst-case errors, so it's a natural model for us. And even in communication, maybe channels are not completely adversarial, but allowing yourself worst-case errors lets you model channels whose behavior may be sort of unpredictable and varying with time; for example, bursty errors and so on. And if you design a code for worst-case errors you are somehow robust against all, you know, possible classes of things you can imagine. So it's a very strong notion for the code design problem. Okay. All that is good, but the problem is that if you want this, you have to pay a big price. So in the rest of the talk we're going to see some ways by which you can maybe bridge between these two worlds. Okay. So the first such method is list decoding, and this is not directly the focus of this talk, but it's going to figure in several places as a useful [inaudible]. So what's the list decoding model? Here it's very similar to earlier. You have a message, it's encoded by a code, and then, as in the Hamming model, you flip an arbitrary p fraction of the bits and then you have to decode. The only change is that you make the decoder's life a little bit easier. You say that the decoder does not have to recover m uniquely, but it has to recover a small list of candidates, one of which should be m. So that's the relaxation. So the model remains worst-case errors; we said something has to give, and what gives is that the decoder's life is made a little bit easier. And a combinatorial notion associated with this is that we say a code is (p, L)-list-decodable if, no matter what point you take, the ball of radius p times n centered around it contains at most L codewords. And if you think about it, this means that if you started with this codeword, went to this red point, and you tried to decode it back, you wouldn't get only this codeword, of course; you might get L minus 1 [inaudible] as well. And if L is not too large, then this is -- you know, we consider this a good notion. Okay. That's the list decoding notion. And why this notion? I'll come to that in a second. But in this notion it turns out that you can actually achieve Shannon capacity even for worst-case errors. So an old result, almost 30 years old, shows that if you take a random code of rate arbitrarily close to 1 minus h of p, then it is going to be (p, L)-list-decodable for a list size of about 1 over epsilon. So if you allow 1 over epsilon ambiguity, you can be within epsilon of Shannon capacity. And this, I should say, is an existential result. I'll come to that again.
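As a tiny concrete illustration of the combinatorial definition (the code and parameters here are made up for illustration, not from the talk), here is a brute-force check of (p, L)-list-decodability: every Hamming ball of radius pn must contain at most L codewords.

```python
from itertools import product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def is_list_decodable(codewords, n, p, L):
    """Brute-force check that every radius-p*n ball holds at most L codewords."""
    radius = int(p * n)
    for center in product([0, 1], repeat=n):      # every possible received word
        hits = sum(hamming(center, c) <= radius for c in codewords)
        if hits > L:
            return False
    return True

# Toy example: the length-4 repetition code {0000, 1111} is (1/4, 1)-list-decodable.
code = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(is_list_decodable(code, n=4, p=0.25, L=1))  # True
```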
And another way to really think about this is that, you know, in the earlier picture, when you pack Hamming balls around these codewords, there is a way to pack a lot of balls, essentially an optimal number, such that -- Shannon said that these Hamming balls are going to be mostly disjoint. This is a more worst-case guarantee which says that these balls will never overlap at any point more than 1 over epsilon times. And these two statements are the same as what I said on the previous slide. Okay. So it's some sort of an almost disjoint packing instead of a perfect packing of the space. And last year we showed that a similar thing is also true for linear codes, which actually comes up later in this talk. And now one thing, if you've seen this notion for the first time, you're probably thinking maybe it doesn't make much sense. What use do I have if I don't recover my message uniquely? So I don't have time to dwell on that here, but just a few items. So for various reasons this is good. Well, first of all, it's better than simply giving up. So that's hard to argue with, but -- unless you spend a lot of time and do that. And also you can show that for various interesting codes and noise models, with high probability you won't have more than one element in the list. So you essentially have the same benefits as unique decoding, but you've obviated the need to change your model, and you still work with worst-case errors. And for many of these extraneous applications I mentioned, list decoding fits the bill perfectly. And more specifically for this talk, you will see that even if you don't care about list decoding, it comes up as a nice primitive for other models. So it's a useful tool. Okay. All that is good. So why don't we just consider list-decodable codes and, you know, be done with it? I would love to, but unfortunately we don't know how to. So unfortunately there is no constructive result which achieves this bound of Zyablov and Pinsker. So here is the plot again. Just for variety, I've flipped the axes -- this is the rate now and this is the error fraction. And you would like to be on this green curve. That's the optimal tradeoff. And the best we can construct so far is this pink curve. So there's a huge gap. And even this was achieved by works with [inaudible] from two, three years back where we constructed optimal codes over large alphabets. So a similar thing: if instead of binary codes you allow me large fields, then we can actually get the optimal tradeoffs, but for binary it's unknown. So we can take such codes and use other ideas to get this tradeoff, but as you can see, there's a big gap, and closing this gap remains a major open question, and that's not what we will address in this talk. So just a summary is that list decoding gives one nice way to bridge between Shannon and Hamming. You can work with worst-case errors, and you just have to relax the decoding requirement a little bit. And by doing that you can achieve Shannon capacity, but the problem is we do not know explicit codes. So that's the sort of introduction or preamble. Now we come to the main object of this talk, which is another way to bridge between Shannon and Hamming. But are there any questions so far? Okay. So the main things you need to remember for the rest of the talk are Shannon capacity, this tradeoff between p and R, and some sense of what list decoding is. Okay. So, computationally bounded channels.
So here the model is now -- in list decoding we didn't change the noise model, but now we'll actually change the noise model. We won't allow worst-case errors. What we will do is say that the channel, instead of being an adversary, is somehow computationally simple. It's not solving NP-hard problems to cause errors, for example. So what does simple mean? And this model is not new. It was put forth by Lipton in 1994, and his suggestion was maybe we will just model the channel as having a small circuit or as being a polynomial time machine. And this seems very well motivated, because maybe natural processes which cause errors are unpredictable and hard to model, but they are probably not arbitrarily malicious in solving very hard problems. So it seemed like a reasonable way to bridge between i.i.d., which is a very simple computational model -- you're just flipping coins at each position -- and adversarial, which could be doing arbitrary things. Okay. This is a fairly expressive class which bridges between these two things. And pretty much if you allow circuits of linear size or quadratic size, that is going to encompass every model considered in the literature. Yes? >>: So in the other error model -- so, I mean, you can obviously convert a randomized circuit into a deterministic one, but it might blow up the size. And so I just -- >> Venkatesan Guruswami: So here we'll allow randomized circuits. >>: Oh, you are allowing randomized circuits? Okay. >> Venkatesan Guruswami: Yeah. That's right. And I think it's okay. I think the classes -- yeah, right. I think when we talk about logspace, it may be an issue, so we'll just allow randomized machines. And we work with non-uniform models just for convenience. If you want, you can work with [inaudible] machines and so on. Okay. And just to be very clear about what I'm saying, what does it mean here? So the channel class now is going to be specified by two things. One is a complexity class -- like poly-sized circuits, logspace machines, all those things -- and the other is the error parameter. Of course, the channel can flip every bit with -- you know, if p becomes very large, there is nothing you can do. The channel can always zero out the whole thing and output zeros. There's nothing you can do against such channels. So you still put a [inaudible] so the channel doesn't corrupt too many bits. So the error parameter p is going to remain. So these are the two things. So some examples of complexity classes: we might allow polynomial-sized circuits, or a logspace channel, which is going to be important in this talk. As the codeword streams by, the channel maybe remembers something about the codeword, but not too much -- maybe a log number of bits. And it causes the errors [inaudible]. So that's the other model. And the third and the weakest model is an oblivious model where the channel can do arbitrary things, it can add an arbitrary error, but it somehow decides to add that error before seeing the codeword. So its actual behavior is oblivious to the codeword. Okay. This is a somewhat restricted model, but we'll get some results for all these models. And for all these things you must design a single code which will work for an arbitrary channel which belongs to this class. So you have to build one code for all possible instantiations of channels from these classes. Okay. So, again, this model is not new.
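Purely as a schematic sketch of the channel classes just described (my own framing, not the paper's formalism), one can think of every channel in a class as getting an error budget of at most a p fraction of bit flips, with the classes differing in what those flips may depend on: nothing at all (oblivious/additive) or the codeword seen so far through a bounded memory (online logspace).

```python
# Schematic sketch: every channel may flip at most a p fraction of the bits;
# the classes differ in what those flips can depend on.

def oblivious_channel(error_vector, p):
    """Additive/oblivious: the error e is chosen before seeing the codeword."""
    def act(codeword):
        assert sum(error_vector) <= p * len(codeword)
        return [c ^ e for c, e in zip(codeword, error_vector)]
    return act

def online_bounded_memory_channel(p, memory_bits, update, decide):
    """Online channel with bounded memory (log n bits in the talk): it reads the
    codeword bit by bit, keeps only `memory_bits` bits of state, and decides
    each flip on the fly, subject to the same p*n budget."""
    def act(codeword):
        budget = int(p * len(codeword))
        state, out = 0, []
        for bit in codeword:
            flip = budget > 0 and decide(state, bit)
            out.append(bit ^ int(flip))
            budget -= int(flip)
            state = update(state, bit) % (1 << memory_bits)   # bounded memory
        return out
    return act
```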
So the model was put forth, and there was some previous work, so let's go over that before we say what our results are. So Lipton already in his work said that if you set up some assumptions where you, let's say, assume that the sender and receiver share some random bits which are hidden from the channel, then you can actually achieve good results in this model. So here the model is [inaudible] Alice and Bob. Alice wants to send a message to Bob, but there's some secret randomness shared just between Alice and Bob. And that's the model. And this is also studied in the recent work of Ding, Gopalan and Lipton. And in this model it turns out there is a simple reduction for worst-case errors. So if you have shared randomness you can essentially reduce worst-case errors to random errors, and therefore you can get the same results for worst-case errors. And actually it's a very simple idea, so let's go over that. So what you do is you assume that the sender and receiver share a random permutation Pi of the n bit positions and a random n-bit offset delta. So now, what will the sender do? We know that the Shannon problem is well solved, so let REC be a random error code which can handle a p fraction of random bit flips. So we'll take any such code, your favorite one, and we'll encode the message with it. Now, before sending this we'll scramble this codeword according to this Pi and then we'll add this offset and then we'll send it along the channel, which is this adversarial, worst-case error channel. And now what the receiver must do is that, well, the receiver also knows delta, so he can add the delta back and remove the offset, and he also knows Pi, so he can unscramble the codeword. And now the key point is that after unscrambling, the error which has been added to the codeword of the random error code is really Pi of e. Now e is an arbitrary error vector, and because of this offset delta you can essentially assume that the error e is independent of the permutation. So this Pi of e is really like a random error. Even though e is a worst-case error, you're applying a random permutation to it, so it looks like a random error. And therefore this REC was, of course, designed to correct random errors, so the decoder will succeed. And that's it. And the rate of this code is exactly the same, right, because you just took the random error code, and nothing else happened. The rest of the thing was just shuffling around information. So if you have shared randomness, worst-case errors reduce to random errors. And, of course, one immediate complaint about this is that this is a lot of random bits, right? So Pi is n log n random bits. That's more random bits than the bits you're trying to communicate, so that's not such a great thing. But then Lipton's idea was that this is where the [inaudible] computational restriction comes in -- what I said in the last slide was for adversarial channels. Lipton suggested that if you have a computationally bounded channel, then you don't need a purely random string. You can maybe take a seed for a pseudorandom generator and expand it to a string which looks random to this channel, and therefore you can make your seed much shorter. So, in fact, if you want polynomially small error probability, instead of sharing about n bits you can share only, like, log [inaudible] bits. Yes? >>: I think I got it, but I want to confirm. Like Pi, you don't mean a permutation on 0, 1 to the n, you mean a permutation on -- >> Venkatesan Guruswami: Oh, yeah, Pi is just a permutation of the 1 to n bit positions. Thanks.
It just permutes around the code bits. >>: Got it. >> Venkatesan Guruswami: Yeah. Any other questions? Okay. So with pseudorandom generators you can make the seed length small, but what I'll show in the next slide is that you actually don't need any computational assumption to reduce the amount of shared randomness. And that's what I want to show in the next slide: that the errors can actually be adversarial and you can do what I did in the previous slide with a much smaller random string. I'll show how in the next slide. And what this shows is that in this model of shared randomness you really don't need a computational assumption. It's an extra thing which makes your life easy, but it doesn't really -- in some sense it's not [inaudible] to the problem. If you have shared randomness, you can still work with worst-case errors. And this will also illustrate -- and one of the reasons I'm showing this is that it will show how to use list decoding in this context, and you'll see similar things later in the talk. So, again, what I want to show here is that -- suppose we had in our hand optimal rate list-decodable codes, with the caveat that, of course, we don't, so there will be a few slides like this where, if we had such list-decodable codes, we'd be in good shape. Then we can get optimal rate codes even on worst-case channels with a very small random seed. That's what I want to show. And, again, the difference from what I showed two slides back is that this random seed will be rather small [inaudible]. So how do we do this? So the idea is going to be this picture. You have in your hand a list-decodable code, and you want to send this message. So, of course, one thing you could do is just encode the message by the list-decodable code, but that won't quite be good, because then you'll get a list of possibilities and you won't know which one is the correct one. So what you do is first you essentially hash the message and get some hash tag t. So this MAC is a message authentication code. It's an information theoretic MAC. Or if you don't know what that is, just think of it as some hash function applied to m. It's really the same thing. And then you encode m, t together by the list-decodable code. Then you send it on this worst-case channel, which adds as nasty an error as it wants, but you still have the guarantee that no matter what it did, because of the list-decoding guarantee, if it didn't add more than a p fraction of errors you can recover a list of L possibilities, one of which will be correct. So you'll get a list of L pairs, m1, t1 and so on. And you should think of L as something like polynomial in n here, or a large constant, so the exponential number of possibilities has been whittled down to just some polynomial number. And now the goal is, okay, how do you know which one is correct. Well, this is where the shared randomness comes in. This hash function was produced by -- this s is really the randomness used to sample this hash function or the MAC, and you also know it as the decoder, so you compute the MAC of each of these candidate mi's and see which one matches. And the one which matches, you pass through. And essentially the idea is that the sender is authenticating the message using this shared randomness, this key, and if the [inaudible] property says that the chance a spurious mi's tag matches what you sent is at most delta, then by a union bound there's at most an L delta chance that something else will mess up this thing.
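Here is a minimal sketch of the shared-randomness scheme just described, with heavy caveats: `ld_encode`/`ld_decode` stand in for an (assumed) optimal-rate list-decodable code, which we do not actually know how to construct, and a truncated HMAC is used here only as a stand-in for the information-theoretic MAC from the talk.

```python
import hmac, hashlib, os

def tag(key, message):
    # Stand-in for the information-theoretic MAC keyed by the shared randomness.
    return hmac.new(key, message, hashlib.sha256).digest()[:8]

def send(message, shared_key, ld_encode):
    # Encode (m, t) together with the list-decodable code.
    return ld_encode(message + tag(shared_key, message))

def receive(received_word, shared_key, ld_decode):
    # ld_decode returns a small list of candidate payloads; if at most a p
    # fraction of bits was flipped, one of them is the true (m, t).
    for payload in ld_decode(received_word):
        m, t = payload[:-8], payload[-8:]
        if hmac.compare_digest(tag(shared_key, m), t):
            return m            # w.h.p. only the genuine message verifies
    return None                 # nothing verified: too many errors, reject

shared_key = os.urandom(16)     # the shared random string s
```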
And you just assume that the delta is 1 in polynomial in n and L is polynomial in n, so this will be polynomially small. And what about the rate? So I still have to say that. Well, first of all, the rate of the list-decodable code is very good by assumption. And the other point is that the amount of shared randomness can essentially be log of 1 over delta, so if you want 1 over poly, you can have just logarithmically many shared random bits. And this tag is also often much shorter than the message. So you can assume that this t only has log n bits rather than being as long as m, so instead of encoding n bits you're encoding n plus log n bits. That causes a negligible loss in rate. So, again, if you didn't get this, just think of t and s as being small, and that ensures that the shared randomness is small and the loss in rate is also negligible. So this way, if you had good list-decodable codes and a small amount of shared randomness, you can work with arbitrary channels. Yes? >>: [inaudible] >> Venkatesan Guruswami: Okay. So we'll come to this later, but the question is how do you know which hash function you sample here. So the decoder needs to know how to compute the hash. But actually we'll see a scheme later on where we'll precisely do that. We'll kind of get rid of the shared random thing, but something has to give for that, and we'll weaken the noise model and then we'll see that you actually don't need the shared randomness. Again, this is not directly central to the talk because there is no computational assumption here, but you'll see schemes like this later in the talk as well. Yes? >>: Is it possible to maybe add some redundancy to a [inaudible] instead of sampling hash functions? >> Venkatesan Guruswami: In effect, this is kind of what -- the hash function is like some sort of checksum, so it's really -- I mean, if you sort of go into the innards of some of these things, that's what it will be doing. >>: But by adding redundancy I mean you could get away with shared randomness -- >> Venkatesan Guruswami: Oh, yeah. But, remember, that's kind of what this step is doing. This step is adding redundancy in a way so that you can remove the adverse effects of errors. So you can do what you want, but you must remember that there's always a channel between what you send and what you receive. And redundancy is being added in this box, which, you know -- I'm not telling you what this box is. >>: But if the channel can remove the redundancy -- the point of the secret is that the channel doesn't know how to remove the redundancy. >> Venkatesan Guruswami: Okay. So one last bit of previous work and then we'll actually talk about our results. So another model in which you can solve this, put forth by Micali and others, is to use a public key setting. This is another setup assumption where you assume that there is some public key infrastructure, the sender has a public key/private key pair, and only the sender knows the private key. Everyone knows the public key. And there is a very clean solution in this model, once again using list decoding and digital signatures. So now the information theoretic MAC will be replaced by digital signatures, and you bring in computational assumptions. So how does this work? Again, the picture is very similar to this. Now we'll assume that the channel is a polynomially bounded thing, so it's a poly-sized circuit, let's say. And now what you'll do is that, again, you have access to a list-decodable code.
There is no shared randomness or anything. But what you have is a secret key/public key pair. So the sender will sign the message using -- Alice will sign the message using her secret key to produce a signature sigma, and then you will encode the m, sigma pair by the list-decodable code. And as before, the decoder will recover L possibilities, m1 sigma 1 pairs, and so the goal is now to figure out which one is correct. Now, using the public key of the signature scheme, certainly m will pass through, and you just have to make sure that nobody else passes through. And once again, because it's a polynomially bounded channel, if the forgery probability of the digital signature was small enough, then it's unlikely anybody else will fool you. And there are some additional ideas to handle the case where you send multiple messages and so on, which is also discussed in this paper. And so another way to say this is, because of this, if you had optimal rate codes for list decoding -- and we certainly know very good digital signatures -- then you can also get optimal rate codes for polynomially bounded channels in this model where you have a public key. So the summary of both of these things is that if you had good list-decodable codes, then with shared randomness you don't need any computational assumptions -- you can handle worst-case errors nicely. Or if you make computational assumptions and have a public key infrastructure, then you can also handle this. So list decoding -- both of these problems are reduced to list decoding. That's the context. But now I'll come to our results. And the main difference in our results is that we will do away with these setup assumptions. So these things are nice, but in some sense they deviate a little bit from the simple setting where a sender sends a message to a receiver, and that's it. We agree upon a code and that's it -- there's no shared randomness, there's no other extra setup. So we would like to go back to that model which we saw in the first slide. So I'm now going to state our results. So we get explicit codes which essentially achieve optimal rates where there's no setup assumption. So it's just the good old setting where I send a message and the decoder has to figure out what it is. There is nothing we've agreed upon other than this. Of course, I tell you the code I use. So the first result, for the simplest model, is the model of additive errors. And I'll formally state the theorem next. But really, at a high level, it basically gives explicit codes which achieve Shannon capacity for this model of additive errors; previously only existential results were known. We also give a simpler existence proof, again using list decoding, which actually helps with some of our explicit results. So what's formally our result? So this is the formal result. We give an explicit randomized code. So this may worry you, but I'll come to this in a second. So you have a message and you pick some random coins and then you take a randomized encoding of this into n bits with the rate arbitrarily close to Shannon capacity, and there is an efficient decoder such that the following is true no matter what message you want to send and no matter what error vector the channel might pick.
So if you add the error vector -- hence the name additive errors -- if you add the error vector to the codeword, which is a random variable depending on m and your random string omega, the probability that it is decoded correctly is very close to 1. And the point is the decoder does not know the encoder's random bits. So it's just a randomized code where instead of sending a fixed codeword for each message, I sample from a space of codewords randomly. But you don't need to know my random string, so this is still a very reasonable model. You just need some private randomness at the sender's end. And if this is the case, you can essentially handle any additive error with very high probability. Is the statement clear? I'll flash the statement again a few times. And the code is explicit, and the rate cannot exceed 1 minus h of p but can be arbitrarily close to that. >>: Again, what is important here? That e doesn't depend on the -- >> Venkatesan Guruswami: Omega. So e is -- yeah, what is important is that e doesn't -- e is picked -- I believe this is a better way to put it: e is picked obliviously to your codeword. e can depend on m, but, of course, the codeword -- basically e may depend on m, but not omega. >>: So how low is the failure probability of the coding? I mean, maybe -- >> Venkatesan Guruswami: Okay. I'm hiding the [inaudible] at the start, but I think what we can get is it can be exponentially small. It can be something like 2 to the minus n over log squared n or something. >>: Oh, okay. So it's not so far from -- >> Venkatesan Guruswami: So it's not so far, yeah. And the existential [inaudible] can even achieve exponentially small error probability. For this result we don't quite know how to achieve 2 to the minus omega of n error probability. That's an open question. >>: But it's not bad. >> Venkatesan Guruswami: No, it's not bad. It's like 2 to the minus n over log squared n. >>: [inaudible] [laughter]. >>: So you get error correction as long as the errors are no more than [inaudible]? >> Venkatesan Guruswami: Wait for the answer, yes. Actually, that's -- so Leo's question was: if the number of errors is not too big, then you succeed with high probability; what if the errors are too big? What would be nice is if you could gracefully say, oh, too many errors have happened, and detect that. So we don't quite achieve that. But the building block for this, which is an existential thing, does have that, and we use this. So that's another open question in this result. So no result is ever final, right? There are always things you can improve in this result. But the high-order bits are correct in this result. Okay. That's the first result. And that's sort of -- a little bit of a [inaudible] noise model, right? Oblivious seems very simple. You don't even look at the codeword before deciding the error. The more realistic model is perhaps that the channel has limited memory. So that's the logspace model. So here you just assume that the channel has logspace memory and it flips bits in an online way. And here we can get the optimal rate, but there is a caveat that we don't quite get unique decoding; we'll only get list decoding. And this would mean that the decoder will output a bunch of messages, one of which will be the correct message -- a slight difference from the earlier notion. Let's not worry about it. And one comment about this is that I'll sketch in the next slide that the list decoding restriction is necessary.
So logspace channels are already powerful enough that the moment you have more than a one-fourth fraction of errors, you cannot do unique decoding. So in some sense list decoding is not the bad part here. The bad part in some sense is the logspace restriction, because what I told you earlier is that if you allow me list decoding, then I don't need any computational assumptions. I can just work with adversarial errors. And the reason we're not doing that is we don't know how to do it. So with the logspace restriction on the channel we are able to construct these codes. Another way to think about this is that we're not able to solve the list decoding problem, but we're solving some restriction of it when you assume that the channel is somewhat nice. Okay. And the only thing about online logspace we use here is that we have Nisan's beautiful generator. So as long as you have suitable PRGs, you can do this for any channel model, so we can certainly handle poly-sized circuits if you make some computational assumptions. Okay. And one comment is that in both of these cases, contrary to many uses in cryptography, the decoder is going to be more powerful than the channel. So if you want to handle logspace channels, maybe your decoder will use 2 log n space. And this somehow seems necessary because, otherwise, the channel can essentially play the decoder, right? So the channel can pretend it's the decoder, decode the message, and then maybe change it to something else. So somehow for this kind of thing, it seems necessary. But that may be a reasonable assumption in communication settings. Okay. Those are our results. And so just one comment about the logspace list decoding thing before I give the details on [inaudible] is -- so, again, the reason you need the list decoding restriction is that even if you have constant space -- you don't even need logspace -- the moment the error fraction becomes more than one-fourth, your rate goes to zero. So there's no chance of getting 1 minus h of p. And there is an open question for p less than one-fourth; let me postpone this to the last slide. And the proof idea is very simple. Suppose you actually sent some codeword. What you do is you sample a random message m prime and a random string omega prime -- basically you're sampling another codeword -- and essentially these two codewords will agree in about half the places in expectation, and wherever they disagree you move one towards the other. So at the end you won't know whether it was c which was sent or c prime which was sent. Sort of a simple fooling idea which is quite common in coding. So this is why one-fourth is sort of a bad [inaudible], and this is a very simple channel which just used constant space. So now to the technical part. So I've stated our results. So I think all I'll probably do in the remaining time is give some ideas about the additive error things. >>: Can you go back to [inaudible]? So why is a random codeword something that a logspace machine can actually do with part of its -- >> Venkatesan Guruswami: Okay. So we're assuming a non-uniform model [inaudible] so -- >>: Oh, I see -- >> Venkatesan Guruswami: [inaudible] we just assume it samples m prime, omega prime and computes this, let's say, in a non-uniform way. >>: Because it does seem like that might be a -- I see. It's some fixed random codeword. >> Venkatesan Guruswami: And then you could just instantiate it. There probably is a single c prime which works for many things. >>: So each channel has its own c prime non-uniform [inaudible]?
>> Venkatesan Guruswami: Exactly. So because the code must work for all non-uniform channels in this class, this is a problem. And there are some weaknesses in this result in the sense that we only get a small failure probability and so on. There are some technical ways that you can strengthen it. But my feeling is, for any reasonable thing you do, I think one-fourth is going to be the barrier for logspace kinds of channels. Another way -- sort of a philosophical way to maybe think about this -- is that there are various combinatorial bounds in coding theory which say that if you have an adversary, you cannot achieve something against this adversary. And what this calls for is that you go and look at those combinatorial proofs and see whether you can implement those adversaries in some simple model. And this is saying, roughly speaking, that this Plotkin bound which we have can be implemented by a very simple process. And this raises some interesting questions. If you go to even the next level of sophisticated bounds, those really seem to require adversarial things, so I don't know. There may be some nice set of questions here. Any other questions? Okay. So now I'll try to give some ideas behind this construction where we give a randomized code to handle arbitrary additive errors. And first I want to talk about -- show why these codes exist. And, actually, even that is not entirely clear in this setting. So I'll first show a new existence proof which is much simpler than the previous existence proofs and also helps us in our construction. And this, again, is going to combine list decoding with a certain crypto primitive. So now we will take a list-decodable code which further has to be a linear code. And this simply means that the map from k bits to n bits is a linear transformation. That's all it means. And recall what I said many slides back, that we can also show that good linear list-decodable codes exist. So that comes in handy here. And, plus, a certain special kind of message authentication code -- an algebraic manipulation detection code, or AMD code -- by Cramer and others, which was discovered for a crypto purpose. So I won't formally define it, but let's go back to this picture of how we are going to do this. So now the change from the earlier picture is that the errors are not arbitrary or caused by a poly-bounded thing, but they are a fixed offset. It's a fixed -- you know, it's some string e. So you're given m and you're given e, and you have to succeed with high probability. So now what you do is that there's going to be your private randomness omega, which is going to be like the hash function or the key for this AMD code, and that will produce a tag. And then what you do is you take this linear list-decodable code and encode m, omega and t together by this code. And now e acts on it, and the list decoder doesn't care. Whatever it gets, it can decode up to a p fraction of errors. So you'll get L triples now, m1, omega 1, t1 and so on. And now you have to figure out what m is. In the earlier setting you knew what omega is because of the shared randomness. So you could compute this and everything worked. But now you don't know omega, so this seems to be a problem. But now what we're going to leverage is the fact that this error is not arbitrary but additive. And really the key point is that by the linearity of this code, because this error vector is fixed, the offsets of these spurious guys from the correct one are fixed. They only depend on e.
And this just follows from the linearity of the code. We can essentially move everything to the origin, so these offsets only depend on e. And since you're picking your omega and so on independently of e, it's unlikely that these L fixed offsets will essentially fool the AMD code. And this AMD code is essentially a special MAC which is well designed to handle these additive corruptions. So that's -- I mean, I'm not formally defining it, but the high order bits are that because of linearity, these offsets are going to be fixed, and for a fixed offset an AMD code is unlikely to be fooled. Just remember these two high-level points, and this [inaudible] plugs in and you get this result. And the only point is what is the rate. If you assume the optimal list-decodable codes, this rate is good. And instead of encoding m you're encoding a little bit of extra stuff, omega and t, but those are going to be much smaller. So they don't cause much loss in rate. And that's your overall scheme. So this shows that these kinds of objects exist where for any fixed additive error you can get the unique answer with high probability. The probability is now over the choice of the key for the AMD code. And now, going back to Leo's question, one nice thing about this point is that if the error fraction is more than p, then certainly the correct thing won't figure in the list, but none of the other guys will fool the decoder either, so you will detect this with high probability. >>: [inaudible] is small? If the [inaudible] is exponential, then a single error e can expand to too many things. >> Venkatesan Guruswami: Okay, that's a good point, yeah. >>: You have to push e -- >> Venkatesan Guruswami: No, no, no -- >>: You have to code e in a list decodable way, right? >> Venkatesan Guruswami: No, no, no. Sorry. The list decoder always decodes up to a p fraction of errors, and it will never output more than L things. This is true. If the error is more than -- you worried me there for a second, because we've used this. If the error fraction is more than p, the correct guy won't be in this list. That's all. But the list will never have more than L things. The list is always small. The correct thing won't occur, but none of the other bad guys will pass the check either, so you will gracefully reject. So you somehow get this very nice threshold behavior from this thing. And this we don't quite get in our explicit construction, and this may be a nice thing to do. Okay. Now for the explicit thing. So let's try to -- so now what I want is that -- all that was fine, but the problem with that was, again, we don't know these good list-decodable codes. Right? So we need some other way. So how do we do this? And as a warmup, we'll actually go back to our solution with shared randomness. And, of course, we don't want shared randomness, but we'll first see that and we'll try to get rid of it. So now we'll take -- for additive errors we have a very simple solution. We don't even need the offset. The shared randomness is simply a random permutation Pi. So we'll take our random error code and scramble the codeword positions according to Pi and then send it along. And now the key point is that for any fixed e, if Pi is a random permutation, Pi of e is a random error, so the decoder can work. Exactly what I said earlier. So it's very simple. And now the whole idea is we want to get rid of this Pi. So the idea is going to be -- we'll try to keep the same scheme but instead hide this Pi itself in the codeword.
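Before getting to that, here is a minimal sketch of the shared-permutation warmup just described, assuming a black-box random-error code `rec_encode`/`rec_decode` that corrects a p fraction of randomly placed bit flips; the names are placeholders, not the paper's notation.

```python
import random

def scramble(bits, pi):
    # Send bit i to position pi[i].
    out = [0] * len(bits)
    for i, b in enumerate(bits):
        out[pi[i]] = b
    return out

def unscramble(bits, pi):
    return [bits[pi[i]] for i in range(len(bits))]

def send(m, pi, rec_encode):
    return scramble(rec_encode(m), pi)

def receive(received, pi, rec_decode):
    # For a *fixed* additive error e, unscrambling turns e into a uniformly
    # random error pattern of the same weight, which is exactly what REC handles.
    return rec_decode(unscramble(received, pi))

pi = list(range(1000))
random.shuffle(pi)              # the shared random permutation
```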
So we'll try to hide the shared randomness in the codeword itself. Okay. But, of course, one way to do it is we write, in the beginning of the codeword, what Pi is. But that's not such a smart idea because the additive error could just completely destroy it. So somehow -- you know, errors act on that part also. So, first of all, you cannot just hide that information, which we call the control information. You still have to encode it. But then you cannot encode it and put it right in the beginning because the error might just completely destroy it, so you have to encode it and then also hide it in some random locations which are hard to figure out. Okay. Those two things seem necessary. But if you do this, you want the decoder -- remember, the decoder doesn't know the shared randomness, doesn't know omega, so the decoder must be able to figure out what Pi is without your telling it explicitly. So somehow the places where you are going to hide Pi should also be part of the control information. And that control information must include this data as well, but there's no -- the recursion stops there, so that's all you need to include. So that's good. And the idea is going to be to somehow do this. You'll take some control information, which is going to be Pi and also where you're going to hide this Pi, and you encode it by some code and hide it in those places. And now the idea is that if you could somehow correctly get this control information, then it's good, because you know the places where it was hidden; the remaining part you unscramble according to Pi, and you go back to the old solution. Okay. I left a gap here because -- there's some problem, right? If you think about it, you're trying to encode the control information in a way which is resilient to additive errors. But that's really the problem you're trying to solve, because, you know -- and also, the control information, if I do it as such, that's n log n bits. If I even just write it in encoded form, my rate goes to zero, so how can I afford something like this? Okay. And both of these have an intertwined solution. Well, first of all, to address the second point, because I cannot hide a lot of bits, I better make my control information small. And for this you use some pseudorandom ideas. Since you don't really need Pi to be a completely random permutation, it suffices if it has limited independence, which can be sampled with some, say, n over log n bits or some small fraction of n bits. That's good. So once you have this, your control information is a teeny fraction of your message. And once you achieve this, even though the problem you're trying to solve is essentially the same, you don't have to solve it with great efficiency. This is such a small fraction of the thing, you can encode it by a really crummy code which has much lower rate than capacity and try to protect it against adversarial errors. And that we know how to do, because this is a much weaker goal, and we know how to do this. Okay. That's really the hope that this is all going to work out. Okay. So I think both the ideas will become clear when I show these things. So really there are going to be two main pieces. The first is how do you protect this control information which is going to tell the decoder its bearings, and that's going to consist of three things. Actually -- yeah, so, first of all, what you'll do is -- okay. So two things come directly.
So one is the random permutation, and we'll see that the offset comes back in a crucial way, so there will also be an offset. So now what you're really going to do is -- so the first part is -- sorry, this is the data part. So you sample your Pi and delta and do exactly what we did earlier. So you take the random error encoding, scramble the bits, and add the offset, and you keep your data part ready. So this is one part. Now, the other part is going to be the control information, which is -- we already need to tell the receiver what Pi and delta are, but there's also going to be a third thing, which is the locations where you're going to hide this control information, which is going to be some random subset of blocks. Let's say each block is of length log n, so it's some random subset of blocks of some suitable size from 1 to n over log n. And that's also a small amount. So this triple is the control information, which is random and which you have to hide in the codeword. And now the point is we're going to take this omega and we'll encode it by some low-rate code -- something like a Reed-Solomon code works well -- into these control blocks. And now each of -- and these things you better protect against additive errors, so you can encode them by this existential code I showed a few slides back. And that's really the whole encoding. So you take this omega, which is the control information, take some Reed-Solomon encoding, which simply means you think of it as a polynomial evaluated at various places, you add some header information because you might move these things around in some crazy ways, so you just add alpha i's, and then this inner part SC is some stochastic code which we assumed exists from the previous slide, and which -- because it doesn't have to be a particularly good rate code -- we can even construct explicitly. And those blocks are all going to be the control blocks. And these are the two components. And once you do this, you have the data part, you have the control part, and you just interleave them. Namely, you now use the third thing, T, and place the control blocks at the locations specified by T. That's your final code. Okay. That specifies the encoding. >>: [inaudible]? >> Venkatesan Guruswami: There is no shared randomness, because all this -- so far I haven't talked about the decoder at all. All of this is at the encoding end. So the omega is random, and the encoder does all this, and he'll send this. Nothing else will be sent to the decoder. >>: So the decoder has to tell which blocks are payload, which blocks are -- >> Venkatesan Guruswami: Yeah. So the decoder first has to -- yeah, that leads to my next slide. Somehow you have to figure out which one -- you have to figure out what T is, you have to figure out what Pi and delta are. You don't have to get all the details here, just the high-level ideas. Okay. You have some control blocks which are kind of important. They will tell you your bearings. They are hidden in random places. You have to figure it out at the decoder. Yes? >>: [inaudible] >> Venkatesan Guruswami: Okay. Very good question. So I hid that point, which is -- so this analysis -- these constructions for random error codes also work for errors which are only limited [inaudible] independent. So there are some constructions which are known to do that. But that's a valid point. But we leverage the fact that certain constructions only need the errors to be limited [inaudible] independent.
Yeah, but that's an excellent point. So we don't really need the errors to be fully random. Okay. So now to decode. Obviously decoding is just the inverse of this, so what you need to figure out is which ones are the control blocks, then try to decode this part of the code, figure out your Pi and delta, and then go back and handle the data part. Okay. That's going to be the plan. So once you get the control part correct, you'll be in good shape, because we're back in the old setting. So you know Pi and delta, you'll be fine. Okay. So now how do you do the control? So this is sort of the most technical slide, but I'll try to at least make you believe that it should work out. Well, the first point is -- so, remember, this is a codeword. And what are you hitting this codeword with? You're hitting it with an arbitrary -- a fixed error vector of size pn. So you're flipping some fixed set of bits. Okay. That's your error part. Now, because these blocks are all randomly chosen, if you fix the error -- remember, you fix the error before I pick my randomness -- there will be a decent fraction of control blocks, which are these pink guys, which don't have too many errors, just by standard averaging. Okay. So all but a small fraction of the control blocks will not have more than a p plus epsilon fraction of errors, just by averaging. But, of course, what you can really try and do then is, if you knew those blocks, you could try to decode them up to this radius, and you know that those control blocks are designed to handle such errors. But the problem is you do not know which are the control blocks. So you'll have to kind of figure this out. So the idea is simply going to be: you just assume -- you know, give everybody the benefit of the doubt. Assume it can be a control block. So you go to each block and try to decode it up to an error fraction of p plus epsilon. Now, decode it according to what code? You're going to try to decode it according to this inner code SC. Okay, you try to decode it, some fixed code. If that succeeds -- like you find somebody within p plus epsilon -- then you sort of say, okay, I think it's a control block. And what I said in the first bullet says that enough of these control blocks will be successfully decoded. And another point, which leads to what Leo was asking earlier -- and that's why I was worried for a second -- is that it's also important that for the blocks where you might have too many errors, you'll detect them and then you'll say that, well, okay, too many errors have happened, I cannot decode this control block. So that's fine. There's one other problem. What if some data block gets mistaken for a control block? Okay. And that's where the third piece, the random offset, comes in. Remember, this random offset is applied only to the data blocks. So any data block essentially locally looks like a completely random string, and you can essentially never decode a completely random string with respect to any code. It won't be close to any of these codewords, so those guys will just fail. So the summary of all this is that a decent fraction of the control blocks will get successfully decoded by this and their correct values found out, and a very small number of payload blocks will get mistaken for control blocks. And the probability that any control block gets incorrectly decoded is also very small.
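To summarize the decoding steps just described, here is a purely schematic sketch; every component (`sc_decode`, `rs_decode`, `rec_decode`, `unscramble`, the block length) is an assumed black box, and the names are illustrative, not the paper's.

```python
def decode(received, block_len, sc_decode, rs_decode, rec_decode, unscramble):
    blocks = [received[i:i + block_len] for i in range(0, len(received), block_len)]

    # 1. Give every block the benefit of the doubt: try to decode it as a
    #    control block, up to an error fraction of p + eps.  Payload blocks
    #    look random because of the offset delta and essentially never decode;
    #    control blocks with few errors do, and blocks with too many errors
    #    are detected and skipped.
    candidates = []
    for index, block in enumerate(blocks):
        symbol = sc_decode(block)              # None if decoding fails
        if symbol is not None:
            candidates.append((index, symbol))

    # 2. Reed-Solomon decode the surviving (position, symbol) pairs to
    #    recover the control information omega = (pi, delta, T).
    pi, delta, T = rs_decode(candidates)

    # 3. Throw out the control blocks (positions in T), remove the offset
    #    delta, undo the scrambling pi, and run the random-error decoder.
    payload = [b for i, blk in enumerate(blocks) if i not in T for b in blk]
    payload = [b ^ d for b, d in zip(payload, delta)]
    return rec_decode(unscramble(payload, pi))
```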
>>: So the use of delta here seems to actually be quite different from the other case, because --

>> Venkatesan Guruswami: Yeah, it's sort of a very different object, yeah. It's some sort of a masking thing, yeah. And I think -- so with these things, the Reed-Solomon code is just designed to handle this. It's a low-rate Reed-Solomon code, so as long as you get a small number of things correctly and very few errors, it can handle this and correctly recover omega. Okay. So I'm not going to sketch all the parameters, but it's completely standard. The main points are these: the offset prevents mistaking payload for control, and the random placement and the special property of these existential codes make sure you get enough of the control things correctly and very few incorrectly. And as I said, once you know omega, you're in good shape. You know what delta and Pi are, so go back, remove the offset, unscramble, and decode.

>>: Where did the random offset come from again?

>> Venkatesan Guruswami: Where does the random offset come from?

>>: Yeah.

>> Venkatesan Guruswami: Oh, so that's part of the encoding. So we'll just go back to this picture. So you added this random offset to the payload part.

>>: Is this shared by encoder and decoder?

>> Venkatesan Guruswami: No, nothing is shared. So, see, what the decoder first tries to do is decode this thing: it tries to find enough of these pink guys and decodes omega. And we kind of argued that we'll get omega correctly with high probability. What is omega? Omega is just the triple: Pi, delta, and the random locations. So once it gets omega correctly, it knows the locations of these pink blocks, so it'll throw those out. It knows delta, so it will remove the offset. It knows Pi, so it will remove the scrambling. And at this point what it sees is an error it can handle with high probability. So it will correct it.

>>: [inaudible]

>> Venkatesan Guruswami: Oh, yeah, yeah. But that's not random, that's fixed -- obviously the sender and receiver need to agree upon a code, and that includes both this random-error code and the Reed-Solomon code. Yes?

>>: Do you need to encode the positions of the pink blocks? Because it seems like the decoding itself will figure out the positions of the pink blocks. Right? The decoder figures out which blocks are pink and which blocks are blue and then confirms that the positions of the pink blocks were correct. Why do you need that?

>> Venkatesan Guruswami: Oh, so you're asking why encode T as well? No, no. But how do you know -- so the decoder will figure out what omega is, but then how will it know which of these encoded blocks correspond to it?

>>: So it really doesn't decode anything. It's already made a good, solid guess at it, right?

>> Venkatesan Guruswami: Yeah, but it needs to get every one of them correct, right? So one thing it could do is re-encode omega and go and check which of the blocks match with it, but --

>>: I see.

>> Venkatesan Guruswami: Maybe. But this is safer, right? I haven't thought about it, but maybe there's an issue, maybe not. But certainly this way it gets the whole set [inaudible]. Okay. So that's it. And so I'll just have a couple of -- I mean, I don't want to get into too much detail about the online logspace thing. It somehow has a similar high-level structure, similar components, but the details are a lot more complicated.
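Continuing the same toy sketch (same hypothetical names and stand-in codes, not the actual construction), the final step once omega = (Pi, delta, T) has been recovered is purely mechanical: drop the control blocks, undo the offset, undo the permutation, and hand the result to the random-error decoder.

```python
def recover_payload(received_blocks, omega):
    """Once omega = (pi, delta, T) is known: discard the control blocks,
    remove the offset delta, invert the permutation pi, then run the
    random-error decoder (omitted here) on what remains."""
    pi, delta, T = omega
    T_set = set(T)
    data = [bit for j, block in enumerate(received_blocks)
            if j not in T_set for bit in block]
    unmasked = [b ^ d for b, d in zip(data, delta)]   # remove the offset
    unscrambled = [0] * len(unmasked)
    for i, bit in enumerate(unmasked):                # invert the scrambling
        unscrambled[pi[i]] = bit
    # The remaining errors now look oblivious/random to the payload code, so the
    # capacity-achieving random-error decoder (not shown) finishes the job.
    return unscrambled
```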
First of all, hiding the location of the control blocks becomes a problem, because earlier we just said the error vector was fixed, so if you place the control blocks in random places, then enough of them will have few errors -- and that argument no longer applies once the errors can depend on the codeword. So now we have to mask the control blocks with some sort of pseudorandom code to hide their location. And really the purpose of this is to ensure what we had earlier, that enough control blocks have few errors.

But maybe I don't have -- so the details are complicated, but let me at least hint at why we suddenly get list decoding in this case. And the reason is that once you go to a channel which is non-additive, what the channel can do is take, say, the first block and just change it to a control codeword, simply overwrite it. That's something a logspace machine can do. And so, therefore, the second argument we had -- that very few payload blocks get mistaken for control blocks -- is just completely false. Because of this, the channel can inject many fake-looking control blocks, and this way the amount of wrong information might far exceed the amount of correct information you have. So in such a case you cannot uniquely recover the control information; all you can do is recover a small list, one of which will be correct. So that's, in the proof, where the list decoding comes up. And as I said earlier, this seems fundamental also because for a lot of errors, this seems [inaudible].

So once you do this -- okay, that's the control part. How do you argue about the payload part? So here the argument uses the sort of arguments that are standard in crypto but maybe less standard in coding theory. We know that this random-error code is designed to handle random errors -- when the error was oblivious, the distribution was nice. So what we show is that the channel's error distribution is indistinguishable, by a logspace machine, from an oblivious distribution. And this is where we use the fact that the offset delta is now something which can fool online logspace machines. Okay. And the next step is, given this, we have to argue that the events which ensure you handle the oblivious case correctly also occur with high probability against the online channel. And that event is essentially the error being well distributed across the various blocks. And one problem here is that you have to check the condition in online logspace. This is sort of a standard distinguisher argument where, if you assume this is not the case, you have to build an online logspace machine which breaks Nisan's generator, and you get a contradiction. And because of this online restriction, this step ends up being a little bit harder here. So actually the proof for poly-time channels is a lot easier than the logspace one. Okay. That's what I just said. So there's a somewhat weaker condition we identify and work with, and that gives the whole result. I think I'm not going to remember the details, but I think because of this, our error probability here may be quite a bit weaker, maybe 1 in poly or -- yeah, I think something like that. Okay.
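To illustrate, in the same toy Python setting and not as the paper's argument, why unique decoding of the control information breaks down: a non-additive channel can simply overwrite some data blocks with exact codewords of the inner code, and those fakes pass the same radius test as genuine control blocks, so the decoder's candidate set is polluted and the best it can do is return a short list of possibilities. The helper names below are hypothetical.

```python
import random

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def inject_fake_control_blocks(blocks, inner_codewords, positions, rng):
    # A non-additive (e.g. online logspace) channel can overwrite chosen data
    # blocks with valid inner codewords -- an attack that an additive error
    # vector fixed in advance cannot mount.
    corrupted = [list(b) for b in blocks]
    for pos in positions:
        corrupted[pos] = list(rng.choice(inner_codewords))
    return corrupted

def candidate_control_blocks(blocks, inner_codewords, radius):
    # Same radius test as before: fakes and genuine control blocks both pass,
    # so the candidates no longer determine omega uniquely -- the decoder can
    # only output a short list of consistent decodings.
    out = {}
    for j, block in enumerate(blocks):
        best = min(inner_codewords, key=lambda c: hamming(block, c))
        if hamming(block, best) <= radius:
            out[j] = best
    return out
```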
And the same thing works if you change the Nisan generator to some other PRG; everything works for poly-time channels, poly-size circuits, and the analysis is actually easier in this case because you have a richer class of things to build a distinguisher with. Okay. So I think I'll conclude with that. Just to summarize, list decoding allows you to communicate at optimal rates even against worst-case errors, but the problem is we do not know explicit constructions. So given that, we saw another way to bridge between Shannon and Hamming, which is to consider channels which are limited in computational power, which seems like a reasonably well-motivated model. And for this setting we got the first results which do not use any setup assumptions like shared randomness or public keys, and we got optimal-rate codes for oblivious errors, and also for more powerful channels once you have pseudorandom generators, but in the model of list decoding.

Okay. So there are many open questions, I think. In this area of bridging complexity and coding, there are a lot of things we do not know, and many things seem quite hard. So one question is that I said that for p more than one-fourth, you really need list decoding even for simple channels. What happens for p less than one-fourth? Can you do something for logspace channels which is better than for completely worst-case channels? We do not know such a result. Or is it possible that you can actually prove you cannot achieve Shannon capacity with unique decoding? Okay. So this is one question. So if you didn't get this, don't worry. I think a nicer question, which is more purely information theoretic, is: forget all these complexity restrictions. We'll just work with worst-case channels, but now I'll place an information-theoretic restriction, namely, that the channel is online. So the channel can cause any error it wants, but it can flip [inaudible] bit only depending on the past, not the future. Just a classic online model for errors. And in this case, really, a lot of things seem to be open. We are quite far from understanding the correct tradeoff between the rate and the error-correction fraction; we have some upper bounds and some lower bounds, and I don't think any of these are tight, and understanding this would be very nice. So I think that's all I have.

[applause]

>> Venkatesan Guruswami: Questions?

>> Tom McMail: Thank you very much.

>> Venkatesan Guruswami: Thanks.

[applause]