>> Tom McMail: I'm Tom McMail from Microsoft Research Connections, and I'm
very glad to welcome you here today and very glad to welcome Venkatesan
Guruswami from Carnegie Mellon University. Dr. Guruswami received his
Bachelor's degree from the Indian Institute of Technology at Madras and his
Ph.D. from MIT. He has been a faculty member in Computer Science and
Engineering at the University of Washington right here in Seattle, and he was a Miller
Research Fellow at the University of California, Berkeley.
So with that, I'd like to introduce Dr. Guruswami.
>> Venkatesan Guruswami: Thanks, Tom. And thanks, all of you, for coming
early in the morning.
So this is a talk about bridging two lines of work in coding theory, pioneered by
Shannon and Hamming, and it brings in some ideas from complexity theory and
cryptography to design codes for a new model. And this is joint work with Adam
Smith at Penn State.
And so directly jumping to the outline: in this talk I'm going to spend a fair bit of
time on the context, because it's sort of a new model -- not an entirely new model, but a
new twist on an old one. So really more than half the talk will be
sort of survey-like. I will talk about what was done in these lines of work, and
then I'll state our results and show some ideas in the proof.
So, specifically, I'll begin with two popular error models used to model errors in
coding theory, the Shannon and Hamming models, then we'll talk about one way to bridge
between these, which is list decoding, and we'll see what the
challenges of that are, and that will motivate our main topic, which is
computationally bounded channels, and then we'll see some previous results and
then state our results. Okay. So that's the plan for the talk.
And I have a longer slot than usual so feel free to ask any questions in between.
So we should have time for questions.
So there are two classic channel noise models. Suppose Alice wants to send a
message to Bob across a noisy channel. So we'll assume that Alice is sending a
sequence of n bits. So the noisy channel is going to distort some of these bits,
so two popular models are: one, pioneered by Shannon, is a probabilistic
view where you say that as the bit stream goes by, each bit is flipped independently
with probability p. It's called a binary symmetric channel; another way to
say it is that the error distribution is simply a binomial with mean p times n.
Okay. So typically pn errors happen, and they happen independently across bit
positions.
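(Purely as an illustration -- a sketch of mine, not something from the talk -- the binary symmetric channel just described can be written in a few lines of Python: each bit is flipped independently with probability p, so the number of errors is binomially distributed with mean p times n.)

```python
import random

def binary_symmetric_channel(codeword, p):
    """Flip each bit independently with probability p (the BSC with crossover p)."""
    return [bit ^ (random.random() < p) for bit in codeword]

n, p = 1000, 0.1
sent = [0] * n
received = binary_symmetric_channel(sent, p)
# The number of flips is Binomial(n, p), so typically around p*n = 100 errors.
print(sum(received))
```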
Another model is the Hamming model, where you say that the channel can do
whatever it wants in the most nasty way. The only restriction is that it never
corrupts more than a p fraction of the bits as it goes through. And here p is some
error fraction which is between zero and one half.
And in both of these settings the challenge is to understand what is the best
possible rate at which you can send bits. So if you actually send n bits through,
how many actual bits of information can you get across to the receiver despite
this noise? That's really the basic question.
And to motivate this formula, let me define this. This is probably known to all of
you, but just to set up some notation, what's an error correcting code? We'll think
of it as a way to encode k bits of information -- that's the amount of information
you have in your message -- into an n long string which is a redundant string,
and m is typically called the message and c is the codeword, and c of m is the
codeword associated with m.
And the other -- one important parameter for this talk is p, the error fraction which
we saw in the previous slide. The other important quantity is the rate R, which is
simply the ratio k over n, and this really conveys how much actual information you have per bit of the codeword, because
you're sending k bits of information through n bits of communication.
And the asymptotics will be such that we'll think of the error fraction p as a
constant between zero and one half, and we would also like the
rate to be bounded away from zero. So we'll think of k and n as growing with
k over n being fixed.
And just to sort of give some reason -- that's what a code basically is, but in
order to use codes for error correction, we would like the codewords to be well
separated. So if you take all the images of this map, you'd like these blue points
to be well separated. Why? Because if you send a codeword, then maybe you
sent c, but there was some noise e added to it, so you actually received r = c
plus e, and here the assumption is e is some unknown error vector which you're
trying to remove, which obeys some noise model like the Shannon or the
Hamming model. And the hope is that because the codewords are well separated, you can,
from r, go back to c. That's the idea.
Okay. So the question then becomes what is the best rate you can achieve if
you want to correct errors. So let's look at Shannon's capacity limit. Suppose
you send a codeword c, and note that the error is a binomially distributed random
variable, so when you send c, the set of possible received words you can get is
really a Hamming ball of radius p times n centered around c. What you receive
will typically lie inside this ball.
And note that for all of these points in this yellow ball, you should really go back
to c, because you may get any one of these and you should decode that back to
c. That's the -- for most of these things, you should decode them back to c.
And what is the volume of this ball? This ball, since it has radius p times n, has
roughly 2 to the h of p times n points, where h is the binary entropy function.
And just as I said earlier, if you want to do decoding, if you take these balls
around all these codewords, they have to be more or less disjoint so that there
is a good chance that you will recover the center of the ball when you get some
arbitrary point in the ball.
And a simple volume argument shows that each ball has this volume, and the
total number of points is 2 to the n, so you can't pack more than 2 to the n divided by 2 to
the h of p times n codewords. And that basically tells you that the largest
rate you can have is 1 minus h of p.
Okay. So all I've really done here is just a hint of the converse to Shannon's
coding theorem. So on even the binary symmetric channel, the largest rate you
can have is 1 minus h of p. And I'll have a plot of this a few slides down. But it's
a quantity which for p small is very close to 1, which makes sense: if there's no
error, you can send information at rate 1. And as p goes to one half, this rate
approaches zero.
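(As a quick numerical aside -- again my own sketch, not the speaker's -- the binary entropy function and the capacity limit 1 minus h(p) he refers to look like this:)

```python
import math

def h(p):
    """Binary entropy h(p) = -p*log2(p) - (1-p)*log2(1-p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Capacity of the binary symmetric channel with crossover p is 1 - h(p):
# close to 1 for small p, approaching 0 as p approaches 1/2.
for p in (0.01, 0.11, 0.25, 0.45):
    print(f"p = {p:.2f}   rate limit 1 - h(p) = {1 - h(p):.3f}")
```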
And the amazing thing in Shannon's work was that he showed the converse to
this converse, which is Shannon's theorem: you can in fact achieve this
bound. So he said that there exists a code of rate as close as you want to this capacity limit,
which we now call Shannon capacity, such that no matter what
message you want to send, if you send it and it gets hit with binomially distributed
error, then the probability that the received word lies in the ball centered around
that codeword and in nothing else is very, very high.
So basically, in principle, when you send this codeword, with high probability the
received word will be inside this ball and in no other. It won't lie in these little
intersections, so you'll go back to c.
And the closer you get to the capacity, there's some tradeoff here. But in
principle, you can be as close to that as you want.
And this was one of the early uses of the probabilistic method, and over the last
50, 60 years we now know various efficient explicit constructions with polynomial
time algorithms which realize this dream both in theory and practice, various
constructions of concatenated codes due to [inaudible], LDPC codes, and, you
know, recently, as of two years back, these things called polar codes. So, largely
speaking, this problem is very well solved.
Now, the motivation for this talk is that this is all nice, but it assumes
that each bit is flipped independently, and this is a rather strong assumption,
because errors often tend to happen in bursty or unpredictable ways. To assume
that the tenth bit is independent of the ninth bit may not always be realistic.
So we would like to understand what happens in the Hamming model, completely
worst-case errors. All you assume is that the fraction of errors is at most p. So
what can I say there?
Okay. So let's ask the same question. Now, if you think about it, in the Hamming
model the largest rate at which you can communicate is determined by the largest
number of points you can pack such that the Hamming balls of radius
pn around them are completely disjoint. So there is no chance for any
confusion.
So you can ask that question. It's a very basic combinatorial question:
what's the largest number of such balls you can pack in Hamming space? Very
nice question. The answer is unknown. It's one of the major open problems in
combinatorial coding theory, indeed in all of combinatorics. So we don't know
that.
But what we do know is that it is strictly less than Shannon's capacity. So you
cannot quite achieve Shannon's capacity in this model. In particular, as soon as
the error fraction exceeds one fourth, the rate must go to zero. So [inaudible]
Shannon's capacity remains positive all the way up to half. That's a very stark
contrast. And also the best known rate even non-constructively is something like
1 minus the entropy of 2p instead of p.
So another way to say this is that for a similar rate, in the Hamming model you
can only correct about one half the fraction of errors you can correct in the
Shannon model, and that is even allowing you to pick objects which we don't
know how to construct efficiently, so even non-constructively. So that's really the
pitch.
So I think I've said a few tradeoffs, but it's good to see a picture here. So on the
horizontal axis is the error fraction which is between zero and half, and then the
vertical axis is the rate between zero and 1, and this is -- the green curve is the
Shannon capacity limit, which is the best you can do in the Shannon world.
And the best you can do in the Hamming world currently is this blue curve. As
you can see, it hits zero at .25 and it's well bounded away from this. And we also
have upper bounds which say that you cannot go above this red line. So there is
a provable separation.
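(To see the separation he is describing numerically -- an illustrative sketch of mine; the Hamming-model column uses the 1 minus h(2p) rate mentioned above, which is only the best known bound, not the true answer -- one can tabulate the two curves:)

```python
import math

def h(p):
    """Binary entropy function."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(" p     Shannon: 1 - h(p)    Hamming model: 1 - h(2p)")
for p in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
    shannon = 1 - h(p)
    # Best known rate for worst-case errors; it hits zero once p reaches 1/4.
    hamming = 0.0 if p >= 0.25 else 1 - h(2 * p)
    print(f"{p:.2f}      {shannon:.3f}                {hamming:.3f}")
```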
So given this, the [inaudible] question is: is there any way you
can get the best of both worlds, where you have codes operating at Shannon
capacity but resilient to more expressive classes of errors rather
than just i.i.d. errors?
And ideally we would like to handle worst-case errors, but we know something
has to give if you do that. So why do we care about worst-case errors, first of
all? I mean, Shannon's problem is solved. Why don't we just go home?
Well, worst-case errors are very nice; we like worst-case models in computer
science. And the more serious answer is that there are a lot of extraneous applications
of codes which don't, per se, have anything to do with communication -- there
are many such examples in theoretical computer science today -- and almost all of
them require resilience to worst-case errors, so it's a natural model for us.
And even in communication -- maybe channels are not
completely adversarial, but allowing yourself worst-case errors lets you
model channels whose behavior may be unpredictable and varying with
time; for example, bursty errors and so on. And if you design a code for
worst-case errors, you are somehow robust against all possible
classes of things you can imagine. So it's a very strong notion for the code design
problem.
Okay. All that is good, but the problem is that if you want this, you have to pay a
big price. So the rest of the talk we're going to see some ways by which you can
maybe bridge between these two worlds.
Okay. So the first such method is list decoding, and this is not directly the focus
of this talk, but it's going to figure in several places as a useful [inaudible].
So what's the list decoding model? It's very similar to earlier. You
have a message, it's encoded by a code, and then, in the Hamming model, the channel flips
an arbitrary p fraction of the bits and then you have to decode.
The only change is that you make the decoder's life a little bit easier. You say
that the decoder does not have to pin down m uniquely, but it has to recover a
small list of candidates, one of which should be m. So that's the relaxation. The
model remains worst-case errors; something has to give, and
what gives is that the decoder's life is a little bit easier.
And a combinatorial notion associated with this is that we say a code is (p, L)
list-decodable if, no matter what point you take, the ball of radius p times
n centered around it contains at most L codewords.
And if you think about it, this means that if you started with this codeword, went
to this red point, and you tried to decode it back, you wouldn't get this codeword,
of course, but you might get L minus 1 [inaudible]. And if L is not too large, then
this is -- you know, we consider this a good notion.
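(Just to make the combinatorial definition concrete, here is a brute-force check of (p, L) list-decodability -- my own toy sketch, feasible only for tiny block lengths since it enumerates all of {0,1}^n:)

```python
from itertools import product

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def is_list_decodable(code, n, p, L):
    """Brute-force check of the definition: every Hamming ball of radius p*n
    in {0,1}^n contains at most L codewords of the code."""
    radius = int(p * n)
    return all(
        sum(hamming_distance(center, c) <= radius for c in code) <= L
        for center in product((0, 1), repeat=n)
    )

# Toy example: the length-5 repetition code. Balls of radius 2 around any
# center contain at most one of the two codewords, so it is (0.4, 1)-list-decodable.
code = [(0,) * 5, (1,) * 5]
print(is_list_decodable(code, n=5, p=0.4, L=1))  # True
```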
Okay. That's the list decoding notion. And why this notion? I'll come to that in a
second. But with this notion it turns out that you can actually achieve Shannon
capacity even for worst-case errors.
So an old result, almost 30 years old, shows that if you take a random code of rate
arbitrarily close to 1 minus h of p, then it is going to be (p, L) list-decodable for a
list size of about 1 over epsilon. So if you allow 1 over epsilon ambiguity, you can be
within epsilon of Shannon capacity. And this, I should say, is an existential
result. I'll come to that again.
And another way to really think about this is that, you know, in an old view, when
you pack Hamming balls around these codewords, there is a way to pack a lot of
balls, essentially an optimal number, such that -- Shannon said that these
Hamming balls are going to be mostly disjoint. This is a more worst-case
guarantee which says that these balls will never overlap at any point more than 1
over epsilon times. And these two notions are the same as what I said in the
previous slide.
Okay. So it's some sort of an almost-disjoint packing instead of a perfectly disjoint packing of the space.
And last year we showed that a similar thing is also true for linear codes,
which actually comes up later in this talk.
And now one thing, if you've seen this notion for the first time, you're probably
thinking maybe it doesn't make much sense. What use do I have if I don't
recover my message uniquely? So I don't have time to dwell on that yet, but just
a few items here.
So for various reasons this is good. Well, first of all, it's better than simply giving
up. So that's hard to argue with, but -- unless you spend a lot of time and do
that.
And also you can show that for various interesting codes and noise models, with
high probability you won't have more than one element in the list. So you
essentially have the same benefits as unique decoding, but you've
obviated the need to change your model, and you still work with worst-case
errors. And for many of these extraneous applications I mentioned, list decoding fits the bill exactly.
And more specifically for this talk, you will see that even if you don't care about
list decoding, it comes up as a nice primitive for other models. So it's a useful
tool.
Okay. All that is good. So why don't we just consider list-decodable codes and,
you know, be done with it? I would love to, but unfortunately we don't know how
to.
So unfortunately there is no constructive result which achieves this bound of
Zyablov and Pinsker. So here is the plot again. Just for variety, I've flipped the
axes: this is the rate now and this is the error fraction. And you would like to be
on this green curve. That's the optimal tradeoff. And the best we can construct
so far is this pink curve. So there's a huge gap.
And even this was achieved by work with [inaudible] from two, three years back
where we constructed optimal codes over large alphabets. So a similar thing: if
instead of binary codes you allow me large fields, then we can actually get the
optimal tradeoffs, but for binary it's unknown. So we can take such codes and use other
ideas to get this tradeoff, but as you can see, there's a big gap, and closing this
gap remains a major open question, and that's not what we will address in
this talk.
So just a summary is that list decoding gives one nice way to bridge between
Shannon and Hamming. You can work with worst-case errors, and you just have
to relax the decoding requirement a little bit. And by doing that you can achieve
Shannon capacity, but the problem is we do not know explicit codes.
So that's the sort of introduction or preamble. Now we come to the main object
of this talk, which is another way to bridge between Shannon and Hamming.
But are there any questions so far?
Okay. So the main things you need to remember are Shannon capacity, this
tradeoff between p and R and some sense of what list decoding is for the rest of
the talk.
Okay. So computationally bounded channels. So here's the model: in
list decoding we didn't change the noise model, but now we will actually change the
noise model. We won't allow fully worst-case errors. What we will say is that the
channel, instead of being an all-powerful adversary, is somehow computationally
simple. It's not solving NP-hard problems to cause errors, for example.
So what does simple mean? And this model is not new. It was put forth by
Lipton in 1994, and his suggestion was maybe we will just model the channel as
having a small circuit or as being a polynomial time machine.
And this seems very well motivated, because maybe natural processes which
cause errors are unpredictable and hard to model, but they are probably not
arbitrarily malicious in solving very hard problems. So it seemed like a
reasonable way to bridge between i.i.d. errors, which is a very simple computational
model -- you're just flipping coins at each position -- and adversarial errors, which could
be doing arbitrary things.
Okay. This is a fairly expressive class which bridges between these two things.
And pretty much if you allow circuits of linear size, of quadratic size, that is going
to encompass every model considered in the literature.
Yes?
>>: So in the other error model -- so, I mean, you can obviously
convert a randomized circuit into a deterministic one, but it might blow up the
size. And so I just --
>> Venkatesan Guruswami: So here we'll allow randomized circuits.
>>: Oh, you are allowing randomized circuits? Okay.
>> Venkatesan Guruswami: Yeah. That's right. And I think it's okay. I think the
classes -- yeah, right. I think when we talk about log space, it may be an issue,
so we'll just allow randomized machines.
And we work with non-uniform models just for convenience. If you want, you can
work with [inaudible] machines and so on.
Okay. And just to be very clear what I'm saying, what does it mean here? So the
channel class now is going to be specified by two things. One is a complexity
class like poly-sized circuits, log space [phonetic] machine, all those things, and
the error parameter. Of course, if p becomes very large, there is nothing you can do.
The channel can always zero out the whole thing and output all zeros. There's
nothing you can do against such channels. So you still put a [inaudible] so the
channel doesn't corrupt too many bits. So the error parameter p is going to remain.
So these are the two things. Some examples of complexity classes: we
might allow polynomial-sized circuits, or a log-space channel, which is going to
be important in this talk. Here the channel, as the codeword streams by, maybe
remembers something about the codeword, but not too much -- maybe a logarithmic
number of bits -- and it causes the errors [inaudible]. So that's the other model.
And the third and the weakest model is an oblivious model where the channel
can do arbitrary things, it can add an arbitrary error, but it somehow decides to
add that error before seeing the codeword. So its actual behavior is oblivious to
the codeword. Okay. This is a somewhat restricted model, but we'll get
some results for all of these models.
And for all these things you must design a single code which will work for an
arbitrary channel which belongs to this class. So you have to build one code for
all possible instantiations of channels from these classes.
Okay. So, again, this model is not new. So the model was put forth, and there
was some previous work, so let's go over that before we say what our results are.
So Lipton already in his work said that if you set up some assumptions where
you, let's say, assume that the sender and receiver share some random bits
which are hidden from the channel, then you can actually achieve good results in
this model.
So here the model is [inaudible] Alice and Bob. Alice wants to send a message
to Bob, but there's some secret randomness just shared between Alice and Bob.
And that's the model.
And this is also studied in the recent work of Ding, Gopalan and Lipton.
And in this model it turns out you can give a simple reduction from worst-case errors
to random errors. So if you have shared randomness you can essentially reduce worst-case errors
to random errors, and therefore you can get the same results for worst-case
errors. And actually it's a very simple idea, so let's go over that.
So what you do is you assume that the sender and receiver share a random
permutation of the n bit positions and a random n-bit offset delta. Now, we know
that the Shannon problem is well solved, so let REC be a random-error code which
can handle a p fraction of random bit flips. We'll take any such code, your favorite
one, and encode the message with it.
Now, before sending this, we'll scramble the codeword according to Pi,
then add the offset delta, and then send it along the channel, which is this
adversarial, worst-case-error channel.
And now what the receiver must do is that, well, the receiver also knows delta, so
he can add the delta back and remove the offset, and he also knows Pi, so he
can unscramble the codeword.
And now the key point is that after unscrambling, really for the codeword of the
random error code, the error which is being added is Pi of e. Now e is an
arbitrary error vector, and because of this offset delta you can essentially assume
that the error e is independent of the permutation.
So this Pi of e is really like a random error. Even though e is a worst-case
string, you're hitting it with a random permutation, so it looks like a random error.
And therefore this REC was, of course, designed to correct random errors, so the
decoder will succeed. And that's it.
And the rate of this code is exactly the same, right, because you just took the
random error code, and nothing else happened. The rest of the thing was just
shuffling around information.
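(Here is a minimal sketch -- mine, not the speaker's -- of the shared-randomness reduction just described. The random-error code REC is left as a stand-in function; in reality it would be a capacity-achieving code for the binary symmetric channel.)

```python
import random

def encode_with_shared_randomness(rec_encode, message, pi, delta):
    """Sender: REC-encode, scramble positions by pi, then XOR the offset delta."""
    c = rec_encode(message)
    scrambled = [c[pi[i]] for i in range(len(c))]
    return [b ^ d for b, d in zip(scrambled, delta)]

def receive_with_shared_randomness(rec_decode, received, pi, delta):
    """Receiver: remove the offset, unscramble, then run the random-error decoder."""
    unmasked = [b ^ d for b, d in zip(received, delta)]
    unscrambled = [0] * len(unmasked)
    for i, bit in enumerate(unmasked):
        unscrambled[pi[i]] = bit
    return rec_decode(unscrambled)

# Shared randomness: a permutation pi of the n positions and an n-bit offset delta.
n = 16
pi = list(range(n))
random.shuffle(pi)
delta = [random.randint(0, 1) for _ in range(n)]

# Trivial stand-in for REC (the identity map) just to show the bookkeeping round-trips;
# a worst-case error chosen independently of pi and delta looks random after unscrambling.
identity = lambda x: list(x)
msg = [random.randint(0, 1) for _ in range(n)]
sent = encode_with_shared_randomness(identity, msg, pi, delta)
assert receive_with_shared_randomness(identity, sent, pi, delta) == msg
```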
So if you have shared randomness, worst-case errors reduce to random
errors. And, of course, one immediate complaint about this is that this is a lot of
random bits, right? Pi is about n log n random bits. That's more random bits than
the bits you're trying to communicate, so that's not such a great thing.
But then Lipton's addition was that this is where the computational
restriction comes in -- what I said in the last slide was for adversarial channels.
Lipton suggested that if you have a computationally bounded channel, then you
don't need a purely random string. You can maybe take a seed for a
pseudorandom generator and expand it to a string which looks random to this
channel, and therefore you can make your shared seed much shorter.
So, in fact, if you want polynomially small error probability, instead of sharing
about n bits you can share only, like, log [inaudible] bits.
Yes?
>>: I think I got it, but I want to confirm. Like Pi, you don't mean a permutation on
{0,1}^n, you mean a permutation on --
>> Venkatesan Guruswami: Oh, yeah, Pi is just a permutation of the 1 to n bit
positions. Thanks. It just permutes the code bits around.
>>: Got it.
>> Venkatesan Guruswami: Yeah. Any other questions?
Okay. So with pseudorandom generators you can make the seed length small,
but what I'll show next is that you actually don't need any
computational assumption to reduce the amount of shared randomness.
The errors can actually be adversarial, and you can do what I did in the previous
slide with a much smaller random string. I'll show how in the next slide.
And what this shows is that in this model of shared randomness you really don't
need a computational assumption. It's an extra assumption which makes your life easier,
but in some sense it's not [inaudible] to the problem. If you
have shared randomness, you can still work with worst-case errors.
And this will also illustrate -- and one of the reasons I'm showing this is that it will
show how to use list decoding in this context, and you'll see similar things later in
the talk.
So, again, what I want to show here is this: suppose we had in our hands
optimal-rate list-decodable codes -- with the caveat that, of course, we don't, so
there will be a few slides like this where, if we did have such codes, we'd be in good
shape. Then we can get optimal-rate codes even for worst-case channels with a
very small shared random seed. That's what I want to show.
And, again, the difference from what I showed two slides back is that
this random seed will be rather small [inaudible]. So how do we do this? So the
idea is going to be this picture. So you have in your hand a list-decodable code,
and you want to send this message.
So, of course, one thing you could do is you could just encode the message by
the list-decodable code, but that won't be quite good because then you'll get a list
of possibilities and you won't know which one is the correct one.
So what you do is first you essentially hash the message and get some hash tag
t. So this MAC is a message authentication code -- it's an information-theoretic
MAC. Or if you don't know what that is, just think of it as some hash function applied
to m. It's really the same thing.
And then you encode m and t together by the list-decodable code. Now, you
send it on this worst-case channel, which adds as nasty an error as it wants, but
you still have the guarantee, because of the list-decoding property, that if it didn't
add more than a p fraction of errors, you can recover a
list of L possibilities, one of which will be correct. So you'll get a list of L pairs:
m1 with tag t1, m2 with t2, and so on.
And you should think of L as something like a polynomial in n here, or a large
constant. So the exponential number of possibilities has been whittled down to just
some polynomial number.
And now the goal is, okay, how do you know which one is correct? Well, this is
where the shared randomness comes in. This s is really the randomness used to
sample the hash function, or the MAC key, and the decoder also knows it, so you
compute the MAC of each of these candidate messages and see which one matches.
The one which matches, you pass through.
And essentially the idea is that the sender is authenticating the message
using this shared randomness, this key. If the [inaudible] property guarantees that the
chance that a spurious candidate's tag matches what you sent
is at most delta, then by a union bound there's at most an L times delta chance that
something else will pass the check. And you'll just set delta to be 1
over a polynomial in n, and L is polynomial in n, so this will be a polynomially small
failure probability.
And what about the rate? So I still have to say that. Well, first of all, the rate of
the list-decodable code is very good by assumption. And the other
point is that the amount of shared randomness can
essentially be log of 1 over delta, so if you want 1 over poly error, you
can have just logarithmically many shared random bits. And this tag is also much shorter
than the message. So you can assume that this t only has about log n bits, so
instead of encoding k bits you're encoding k plus log n bits.
That causes a negligible loss in rate.
So, again, if you didn't get this, just think of t and s as being small, and that
ensures that the shared randomness is small and the loss in rate is also
negligible.
So this way, if you had good list-decodable codes, then with a small amount of
shared randomness you can work with arbitrary channels.
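(Schematically, the selection step looks like this -- my sketch; the list decoder is an abstract hook, and SHA-256 is only a stand-in for the information-theoretic MAC the scheme actually uses:)

```python
import hashlib

def mac(key: bytes, message: bytes) -> bytes:
    """Stand-in tag function; the talk's scheme uses an information-theoretic MAC."""
    return hashlib.sha256(key + message).digest()[:4]

def send(list_encode, message: bytes, shared_key: bytes):
    """Sender: append the tag and encode the pair with the list-decodable code."""
    return list_encode(message + mac(shared_key, message))

def receive(list_decode, received, shared_key: bytes, tag_len: int = 4):
    """Receiver: list-decode to L candidates, then keep the one whose tag checks out."""
    for candidate in list_decode(received):        # at most L candidates
        m, tag = candidate[:-tag_len], candidate[-tag_len:]
        if mac(shared_key, m) == tag:
            return m
    return None  # detected failure: no candidate authenticated

# Usage with dummy hooks: a "list decoder" returning spurious candidates plus the truth.
key = b"shared-secret"
encoded = send(lambda x: x, b"hello", key)
fake_list = [b"garbage\x00\x00\x00\x00", encoded, b"junk\x00\x00\x00\x00"]
print(receive(lambda _: fake_list, encoded, key))  # b'hello'
```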
Yes?
>>: [inaudible]
>> Venkatesan Guruswami: Okay. So we'll come to this later, but the question
is how do you know which hash function you sample here. So the decoder
needs to know how to compute the hash. But actually we'll see a scheme later
on where we'll precisely do that. We'll kind of get rid of the shared random thing,
but something has to give for that, and we'll weaken the noise model and then
we'll see that you actually don't need the shared randomness.
Again, this is not directly central to the talk because there is no computational
assumption here, but you'll see schemes like this later in the talk as well.
Yes?
>>: Is it possible to maybe add some redundancy to a [inaudible] instead of
sampling hash functions?
>> Venkatesan Guruswami: In effect, this is kind of what -- the hash function is
like some sort of check sum, so it's really -- I mean, if you sort of go into the
innards of some of these things, that's what it will be doing.
>>: But by adding redundancy I mean you could get away with shared
randomness --
>> Venkatesan Guruswami: Oh, yeah. But, remember, that's kind of what this
step is doing. This step is adding redundancy in a way so that you can remove the
adverse effects of errors.
So you can do what you want, but you must remember that there's always a
channel between what you send and what you receive. And redundancy is being
added in this box, which, you know -- I'm not telling you what this box is.
>>: But if the channel can remove the redundancy -- the point of the secret is
that the channel doesn't know how to remove redundancy.
>> Venkatesan Guruswami: Okay. So one last bit of previous work and then
we'll actually talk about our results.
So another model in which you can solve this, put forth by Micali
and others, is a public-key setting. This is another setup assumption
where you assume that there is some public-key infrastructure: the sender has a
public key/private key pair, and only the sender knows the private key. Everyone
knows the public key.
And this is a very clean solution, once again using list decoding and
digital signatures. So now the information-theoretic MAC will be replaced by
digital signatures, and you bring in computational assumptions.
So how does this work? Again, the picture is very similar to this. Now we'll
assume that the channel is a polynomial-bounded thing, so it's a poly-sized
circuit, let's say. And now what you'll do is that, again, you have access to a
list-decodable code. There is no shared randomness or anything. But what you
have is a secret key/public key pair.
So the sender will sign the message using -- Alice will sign the message using
her secret key to produce a signature sigma, and then you will encode the m,
sigma pair by the list-decodable code.
And as before, the decoder will recover L possibilities, m1 sigma 1 pairs, and so
the goal is now to figure out which one is correct.
Now, using the public key of the signature scheme, certainly m will pass the check,
and you just have to make sure that nobody else passes. And once
again, because it's a polynomially bounded channel, if the forgery
probability of the digital signature scheme is small enough, then it's unlikely anybody else
will fool you.
And there are some additional ideas to handle the fact that what if you send
multiple messages and so on, which is also discussed in this paper.
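(The public-key variant has the same shape -- again only a schematic of mine, with the signature scheme and the list decoder left abstract: Alice signs with her secret key, encodes the pair, and Bob keeps the candidate that verifies under her public key.)

```python
def send_signed(list_encode, sign, secret_key, message):
    """Alice: sign m with her secret key, then encode the (m, sigma) pair."""
    return list_encode((message, sign(secret_key, message)))

def receive_signed(list_decode, verify, public_key, received):
    """Bob: list-decode to L candidate pairs and keep one with a valid signature.
    A poly-bounded channel cannot forge, so spurious candidates fail verification."""
    for m, sigma in list_decode(received):
        if verify(public_key, m, sigma):
            return m
    return None
```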
And so another way to say this is that if you had optimal-rate codes for list
decoding -- and we certainly know very good digital signatures -- then you can also
get optimal-rate codes for polynomially bounded channels in this model where you
have a public key.
So the summary of both of these things is that if you had good list-decodable
codes, then with shared randomness you don't need any computational
assumptions -- you can handle worst-case errors nicely -- or if you make
computational assumptions and have a public-key infrastructure, then you can also
handle this. So both of these problems are reduced to list
decoding. That's the context.
But now I'll come to our results. And the main difference in our results is that we
will do away with these setup assumptions. So these things are nice, but in
some sense they deviate a little bit from the simple setting where a sender sends
a message to a receiver, and that's it. We agree upon a code, and that's all --
there's no shared randomness, there's no other extra setup. So we would like to
go back to that model which we saw in the first slide.
So I'm now going to state our results. So we get explicit codes which essentially
achieve optimal rates where there's no setup assumption. So it's just a good old
setting where I send a message and the decoder has to figure out what it is.
There is no -- nothing we've agreed upon other than this. Of course, I tell you the
code I use.
So the first result, for the simplest model, is for the model of additive errors. I'll
formally state the theorem next. But really, at a high level, it gives
explicit codes which achieve Shannon capacity for this model of additive errors,
and previously only existential results were known. We also give a new
existence proof, again using list decoding, which actually helps some of our explicit results.
So what's formally our result? So this is the formal result. So we give an explicit
randomized code. So this may worry you, but I'll come to this in a second. So
you have a message and you pick some random coins and then you take a
randomized encoding of this into n bits with the rate arbitrarily close to Shannon
capacity, and there is an efficient decoder such that the following is true no
matter what message you want to send and no matter what error vector the
channel might pick.
So if you add the error vector -- hence the name additive errors -- if you add the
error vector to the codeword, which is a random variable depending on m and
your random string omega, the probability that it is decoded correctly is very close to
1.
And the point is the decoder does not know the encoder's random bits. So it's
just a randomized code where instead of sending a fixed codeword for each
message, I sample from a space of codewords randomly. But you don't know my
random string, so this is still a very reasonable model. You just need some
private randomness at the sender's end.
And if this is the case, you can essentially handle any additive error with very
high probability.
Is the statement clear? I'll flash the statement again a few times. But this is the
first such result: the code is explicit, and the rate can be arbitrarily close to 1 minus
h of p, which it cannot exceed.
>>: Again, what is important here? That e doesn't depend on the --
>> Venkatesan Guruswami: Omega. So e is -- yeah, what is important is that e is picked
obliviously to your codeword. E can depend on m, but not on omega.
>>: So how low is the failure probability of the decoding? I mean, maybe --
>> Venkatesan Guruswami: Okay. I'm hiding the [inaudible] at the start, but I
think what we can get is that it can be exponentially small. It can be something like 2
to the minus n over log squared n or something.
>>: Oh, okay. So it's not so far from --
>> Venkatesan Guruswami: So it's not so far, yeah. And the existential [inaudible]
can even achieve exponentially small error probability. For this result we don't
quite know how to achieve 2 to the minus Omega(n) error probability. That's an
open question.
>>: But it's not bad.
>> Venkatesan Guruswami: No, it's not bad. It's like 2 to the minus n over log
squared n.
>>: [inaudible] [laughter].
>>: So you get error correction as long as the errors are no more than
[inaudible]?
>> Venkatesan Guruswami: Yes. Actually, that relates to -- so Leo's
question was: if the number of errors is not too big, then you succeed
with high probability, but what if the errors are too big? What would be
nice is if you could gracefully say, oh, too many errors have happened, and detect
and report that. So we don't quite achieve that. But the building block for this, which is the
existential construction, does have that property, and we use it. So that's another open question
in this result.
So no result is ever final, right? There's always things you can improve in this
result. But the high-order bits are correct in this result.
Okay. That's the first result. And that's sort of the -- a little bit of a [inaudible]
noise model, right? Oblivious seems very simple. You don't even look at the
codeword before deciding the error.
The more realistic model is perhaps the channel has limited memory. So that's
the logspace model. So here you just assume that the channel has logspace
memory and it flips bits in an online way.
And here we can get the optimal rate, but there is a caveat that we don't quite get
unique decoding, we'll only get list decoding. And this would mean that the
decoder will output a bunch of messages, one of which will be the correct
message, a slight difference from the earlier notion. Let's not worry about it.
And one comment about this: I'll sketch in the next slide that the
list decoding relaxation is necessary. Logspace channels are already powerful
enough that the moment you have more than a one-fourth fraction of errors, you cannot do
any unique decoding. So in some sense list decoding is not the bad part here. The bad part
in some sense is the logspace restriction, because what I told you earlier is that if you
allow me list decoding, then I don't need any computational assumptions. I can
just work with adversarial errors.
And the reason we're not doing that is we don't know how to do it. So with
logspace restriction on the channel we are able to construct these codes.
Another way to think about this is that we're not able to solve the list decoding
problem, but we're solving some restrictions of it when you assume that the
channel is somewhat nice.
Okay. And the only thing about online logspace that we use here is that we have Nisan's
[phonetic] beautiful generator. So as long as you have suitable PRGs, you can
do this for any channel model; for example, we can certainly handle poly-sized circuits if
you make some computational assumptions.
Okay. And one comment: in both of these cases, contrary to many uses in
cryptography, the decoder is going to be more powerful than the channel. So if
you handle logspace channels, maybe your decoder will use 2 log n space.
And this somehow seems necessary because, otherwise, the channel can
essentially play the decoder, right? So the channel can pretend it's the decoder,
decode the message, and then maybe change it to something else. So
somehow for this kind of thing, it seems necessary. But that may be a
reasonable assumption in communication settings.
Okay. Those are our results. And just one comment about the logspace list
decoding result before I give the details on [inaudible]: the reason you need the
list decoding relaxation is that even with constant space -- you don't even need
logspace -- the moment the error fraction becomes more than one-fourth, your rate
goes to zero. So there's no chance of getting 1 minus h of p.
And there is an open question for p less than one-fourth. Let me postpone that
to the last slide.
And the proof idea is very simple. Suppose you actually sent some codeword c.
What the channel does is sample a random message m prime and a random string
omega prime -- basically it's sampling another codeword c prime -- and these
two codewords will agree in about half the places in expectation. In half of the
places where they disagree, the channel moves the bit towards c prime, so at the end
the received word is just as close to c prime as to c, and you won't know whether
it was c or c prime which was sent. Sort of a
simple fooling argument which is quite common in coding theory.
So this is why one-fourth is sort of a bad [inaudible], and this is a very simple
channel which just used constant space.
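(The fooling channel can be written out in a few lines -- my rendering of the argument; c_prime is the second codeword, which a non-uniform channel can simply have hardwired.)

```python
import random

def confuse(sent, c_prime, rng=random):
    """Online, constant-space fooling channel: wherever the streamed codeword
    disagrees with the hardwired codeword c_prime, flip the bit to c_prime's
    value with probability 1/2. Only about n/4 bits get changed in expectation,
    yet the output is symmetric with respect to the two codewords, so no decoder
    can tell whether c or c_prime was sent."""
    out = []
    for a, b in zip(sent, c_prime):
        out.append(b if a != b and rng.random() < 0.5 else a)
    return out
```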
So now to the technical part. So I've stated our results. So I think all I'll probably
do in the remaining time is give some ideas about additive error things.
>>: Can you go back to [inaudible]? So why is a random codeword something
that a logspace machine can actually do with part of its --
>> Venkatesan Guruswami: Okay. So we're assuming a non-uniform model
[inaudible] so --
>>: Oh, I see --
>> Venkatesan Guruswami: [inaudible] we just assume a sample m prime,
omega prime, and the channel computes this, let's say, in a non-uniform way.
>>: Because it does seem like that might be a -- I see. It's some fixed random
codeword.
>> Venkatesan Guruswami: And then you could just instantiate it. There
probably is a single c prime which works for many things.
>>: So each channel has its own c prime non-uniform [inaudible]?
>> Venkatesan Guruswami: Exactly. And because the code has to work for all non-uniform
channels in this class, this is a problem.
And there are some weaknesses in this result in the sense that we only get a
small failure probability and so on. There are some technical ways that you can
strengthen it. But my feeling is that for any reasonable thing you do, one-fourth is
going to be the barrier for logspace kind of channels.
Another way -- sort of a philosophical way to think about this is that there
are various combinatorial bounds in coding theory which say that against an
adversary you cannot achieve something.
And what this calls for is to go and look at those combinatorial proofs and see
whether you can implement those adversaries in some simple computational model. And this is
saying, roughly speaking, that the Plotkin bound can be implemented
by a very simple process.
And this raises some interesting questions. If you go to even the next level of
sophisticated bounds, those really seem to require adversarial things, so I don't
know. There may be some nice set of questions here.
Any other questions?
Okay. So now I'll try to give a sketch of the ideas behind this construction,
where we give a randomized code to handle arbitrary additive errors. And first I
want to show why these codes exist at all. Actually, even that is
not entirely clear in this setting.
So I'll first show a new existence proof which is much simpler than the previous
existence proofs and also helps us in our construction. And this, again, is going
to combine list decoding with a certain crypto primitive. So now we will take a
list-decodable code which further has to be a linear code.
And this simply means that the map from k bits to n bits is a linear
transformation. That's all it means. And recall what I said many slides back is
that we can also show that good linear list decodable codes exist. So that comes
in handy here.
And, plus, a certain special kind of message authentication code called
an additive MAC, or AMD code, due to Cramer and others, which was devised for a
crypto purpose. So I won't formally define it, but let's go back to this picture of
how we are going to do this.
So now the change from the earlier picture is that the errors are not arbitrary, and
they're not caused by a poly-bounded thing; the error is a fixed offset -- you know, some
fixed string e. So you're given m and you're given e, and you
have to succeed with high probability.
So now what you do is this: there's going to be your private randomness omega,
which is going to be like the hash function key -- the key for this AMD code -- and that
will produce a tag t. And then you take this linear list-decodable code
and encode m, omega and t together by this code.
And now e acts on it, and the list decoder doesn't care -- whatever
it gets, it can decode up to an error fraction of p. So you'll get L triples now, each of
the form (m_i, omega_i, t_i). And now you have to figure out what m is.
In the earlier setting you knew what omega was because of the shared
randomness, so you could compute this and everything worked.
But now the decoder doesn't know omega, so this seems to be a problem. What
we're going to leverage is the fact that this error is not arbitrary but additive.
And really the key point is that by the linearity of this code, because this error
vector is fixed, the offsets of these spurious guys from the correct one are fixed.
They only depend on e. And this just follows from the linearity of the code.
We can essentially translate everything to the origin, so these offsets only
depend on e. And since you're picking your omega and so on independently of e,
it's unlikely that these L fixed offsets will fool the AMD
code. And the AMD code is essentially a special MAC which is designed exactly to
handle these additive corruptions. I'm not formally defining it,
but the high-order bits are that because of linearity, these offsets are going to
be fixed, and for a fixed offset an AMD code is unlikely to be fooled. Just
remember these two high-level points, plug them in, and you get this
result.
And the only remaining point is the rate. If you assume optimal list-decodable
codes, the rate is good. And instead of encoding just m you're encoding a little bit of
extra stuff, omega and t, but those are much smaller, so they don't
cause much loss in rate. And that's the overall scheme.
So this shows that these kinds of objects exist where for any fixed additive error
you can get the unique answer with high probability. The probability is now over
the choice of the key for the AMD code.
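(Schematically, the existence argument looks like this -- my sketch; the linear list-decodable code and the AMD code of Cramer et al. are both treated as abstract hooks.)

```python
def encode_additive(linear_list_encode, amd_tag, message, omega):
    """Encode the triple (m, omega, t) with a *linear* list-decodable code,
    where t is the AMD tag of m under the private key omega."""
    return linear_list_encode((message, omega, amd_tag(omega, message)))

def decode_additive(list_decode, amd_tag, received):
    """List-decode, then keep a triple whose AMD tag is consistent.
    Because the code is linear and e is a fixed offset, the wrong triples
    differ from the right one by offsets depending only on e, which an AMD
    code detects except with small probability over the choice of omega."""
    for m, omega, t in list_decode(received):   # at most L triples
        if amd_tag(omega, m) == t:
            return m
    return None  # more than a p fraction of errors: gracefully detected
```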
And now going back to Leo's question, one nice thing about this construction is that if
the error fraction is more than p, then certainly the correct triple won't
appear in the list, but none of the other guys will fool the decoder either, so you will detect
the failure with high probability.
>>: [inaudible] is small? If the [inaudible] is exponential, then a single error e can
expand to too many things.
>> Venkatesan Guruswami: Okay, that's a good point, yeah.
>>: You have to push e --
>> Venkatesan Guruswami: No, no, no --
>>: You have to code e in a list decodable way, right?
>> Venkatesan Guruswami: No, no, no. Sorry. The list decoder always
decodes up to an error fraction of p, and it will never output more than L things. This is true.
If the error is more than -- you worried me there for a second, because we've
used this.
If the error fraction is more than p, the correct guy won't be in this list. That's all.
But the list will never have more than L things. The list is always small. The
correct thing won't occur, but none of the other bad guys will pass the check
either, so you will gracefully reject. So you somehow get this very nice threshold
behavior from this thing.
And this we don't quite get in our explicit construction, and this may be a nice
thing to do.
Okay. Now for the explicit thing. So let's try to -- so now what I want is that -- all
that was fine, but the problem with that was, again, we don't know these good list
decodable codes. Right? So we need some other way.
So how do we do this? And as a warmup, we'll actually go back to our solution
with shared randomness. And, of course, we don't want shared randomness, but
we'll first see that and we'll try to get rid of it.
So now we'll take -- for the additive errors we have a very simple solution. We
don't even need the offset. The shared randomness is simply a random
permutation Pi. So we'll take our random error code and scramble the codeword
positions according to Pi and then send it along. And now the key point is for any
fixed e, if Pi is a random permutation, Pi of e is a random error, so the decoder
can work. Exactly what I said earlier.
So it's very simple. And now the whole idea is that we want to get rid of this shared Pi.
The idea is going to be: we'll keep the same scheme, but instead we'll hide this Pi
itself in the codeword. So we'll try to hide the shared randomness in the
codeword itself.
Okay. But, of course, one way to do it is to write at the beginning of the
codeword what Pi is. But that's not such a smart idea, because the additive
error could just completely destroy it -- the errors act on that part too. So,
first of all, you cannot just write down that information, which we call
the control information, in the clear. You still have to encode it.
But then you cannot encode it and put it right at the beginning, because the error
might just completely destroy it, so you have to encode it and then also hide it in
some random locations which are hard to figure out.
Okay. Those two things seem necessary.
But if you do this, you want the decoder -- remember, the decoder doesn't know
the shared randomness, doesn't know omega -- the decoder must be able to
figure out what Pi is without your telling it explicitly. So somehow the places where
you're going to hide Pi should also be part of the control information. And the
control information must include this data as well, but the recursion
stops there, so that's all you need to include. So that's good.
And the idea is going to be somehow to do this. You'll take some control
information, which is going to be Pi and also where you're going to hide this Pi,
and you encode it by some code and hide it in those places. And now if the decoder
could somehow correctly recover this control information, then it's good,
because it knows the places where the control was hidden, it unscrambles the
remaining part according to Pi, and you're back to the old solution.
Okay. I left a gap here because there's a problem, right? If you think
about it, you're trying to encode the control information in a way which is resilient
to additive errors -- but that's really the problem you're trying to solve in the
first place. And also, the control information, if I do it as such, is n log n bits.
If I write even that in encoded form, my rate goes to zero, so how can I
afford something like this?
Okay. And both of these have an intertwined solution. Well, first of all, to
address the second point, because I cannot hide a lot of bits, I better make my
control information small. And for this you use some pseudorandom ideas.
Since you don't really need Pi to be a completely random permutation, it suffices
if it has limited independence which can be sampled with some, say, n over log n
bits or some small fraction of n bits.
That's good. So once you have this, your control information is a tiny fraction
of your message. And once you achieve this, while the problem you're trying
to solve is essentially the same, you don't have to solve it with great rate efficiency.
This is such a small fraction of the whole thing that you can encode it by a really crummy code
which has much lower rate than capacity and still protect it against adversarial
errors.
And that we know how to do, because this is a much weaker goal, and we know
how to do this.
Okay. That's really the hope for why this is all going to work out.
Okay. So I think both the ideas will become clear when I show these things.
So really there are going to be two main pieces. The first is how you protect
this control information, which is going to tell the decoder its bearings, and that's
going to consist of three things. Actually -- yeah, so, first of all, here's what
you'll do.
So two things come directly. So one is the random permutation, and we'll see
that the offset comes back in a crucial way, so there will also be an offset.
So now what you're really going to do is this -- the first part is the data part.
You sample your Pi and delta and do exactly what we did
earlier: you take the random-error encoding, scramble the bits, and add the
offset, and you keep your data part ready. So this is one part.
Now, the other part is going to be the control information. We already
need to tell the receiver what Pi and delta are, but there's also going to be a third
thing, which is the locations where you're going to hide this control information,
which is going to be some random subset of blocks. Let's say each block is of
length log n, so it's some random subset of blocks of some suitable size from 1 to
n over log n. And that's also a small amount.
So this triple is the control information, which is random and which you have to
hide in the codeword. And now the point is we're going to take this omega and
encode it by some low-rate code -- something like a Reed-Solomon code
works well -- into these control blocks. And each of these blocks you'd
better protect against additive errors, so you encode it by this existential
code I showed a few slides back. And that's really the whole encoding.
So you take this omega, which is the control information, and take a
Reed-Solomon encoding of it, which simply means you think of it as a polynomial
evaluated at various places. You add some header information, because you
might move these things around in some crazy ways, so you just add the alpha_i's,
and then this inner part SC is some stochastic code which you assumed existed
from the previous slide -- and because it doesn't have to be a particularly
good-rate code, we can even construct it explicitly. And those blocks are all
going to be the control blocks.
And these are the two components. Once you do this, you have the data
part, you have the control part, and you just interleave them. Namely, you use
the third thing, T, and place the control blocks at the locations
specified by T. That's your final codeword.
Okay. That specifies the encoding.
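(Putting the encoder together, the skeleton looks like this -- my own schematic; rec_encode, rs_encode and the stochastic inner code sc_encode are placeholders for the components named on the slide, and the block bookkeeping is simplified.)

```python
import random

def encode_explicit(message, n, block_len, num_control_blocks,
                    rec_encode, rs_encode, sc_encode):
    """Encoder sketch: data part = REC-encode, permute by pi, mask with delta;
    control part = RS-encode omega = (pi, delta, T) with per-symbol headers,
    each symbol wrapped in the stochastic code SC; interleave at positions T."""
    num_blocks = n // block_len

    # Control information omega.  (The real construction uses a limited-independence
    # permutation so omega is short; a fully random one stands in here.)
    pi = list(range(n)); random.shuffle(pi)
    delta = [random.randint(0, 1) for _ in range(n)]
    T = sorted(random.sample(range(num_blocks), num_control_blocks))

    # Data part: scramble the random-error codeword and mask it with the offset.
    c = rec_encode(message)                      # assumed to produce n bits
    data = [c[pi[i]] ^ delta[i] for i in range(n)]

    # Control part: header alpha_i plus the i-th RS symbol, protected by SC.
    control_blocks = [sc_encode((i, sym))        # each assumed to fill one block
                      for i, sym in enumerate(rs_encode((pi, delta, T)))]

    # Interleave: put the control blocks at the block positions listed in T.
    # (Simplification: here they overwrite data blocks; the real construction
    # reserves those positions so no payload is lost.)
    blocks = [data[j * block_len:(j + 1) * block_len] for j in range(num_blocks)]
    for j, cb in zip(T, control_blocks):
        blocks[j] = cb
    return [bit for block in blocks for bit in block]
```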
>>: [inaudible]?
>> Venkatesan Guruswami: There is no shared randomness, because all of this --
so far I haven't talked about the decoder at all. All of this is at the encoding end.
So the omega is random, and the encoder does all this and sends it.
Nothing else is sent to the decoder.
>>: So the decoder has to tell which blocks are payload, which blocks are --
>> Venkatesan Guruswami: Yeah. So the decoder first has to -- yeah, that
leads into my next slide. Somehow you have to figure out what T is, and you have
to figure out what Pi and delta are.
You don't have to get all the details here, just the high-level ideas. Okay. You
have some control blocks which are kind of important -- they will tell you your
bearings -- and they are hidden in random places. The decoder has to figure out
where they are.
Yes?
>>: [inaudible]
>> Venkatesan Guruswami: Okay. Very good question. So I hid that point,
which is -- these constructions for random-error codes, their analysis
also works for errors which are only limited [inaudible] independent. So there
are some constructions which are known to do that. But that's a valid point.
We leverage the fact that certain constructions only need
the errors to be limited [inaudible] independent. Yeah, but that's an excellent
point. So we don't really need the errors to be fully random. Okay.
So now to decode. Obviously decoding is just the inverse of this: you need to
figure out which blocks are the control blocks, then try to decode that
part of the code, figure out your Pi and delta, and then go back and handle the data part.
Okay. That's going to be the plan.
So once you get the control part correct, you'll be in good shape, because we're
back in the old setting. So you know Pi and delta, you'll be fine.
Okay. So now how do you decode the control part? This is sort of the most technical
slide, but I'll try to at least make you believe that it should work out.
Well, the first point: remember, this is a codeword, and
what are you hitting this codeword with? You're hitting it with an arbitrary but
fixed error vector of weight pn. So there's some fixed set
of bits you're going to flip. Okay. That's your error.
Now, because these blocks are all randomly placed, and you fix the error before I pick my
randomness, there will be a decent fraction of control blocks -- these pink guys -- which
don't have too many errors, just by a standard averaging argument.
Okay. So some decent -- some small fraction epsilon n of the control blocks will
not have more than p plus epsilon fraction of errors just by average.
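A back-of-the-envelope version of that averaging step (only a sketch; the exact parameters and tail bounds in the paper may differ): if each control block has length $B$ and sits at positions chosen independently of the fixed error vector of weight at most $pn$, then

\[
  \mathbb{E}\bigl[\#\,\text{errors hitting a control block}\bigr] \le pB
  \quad\Longrightarrow\quad
  \Pr\bigl[\text{its error fraction exceeds } p+\epsilon\bigr] \le \frac{p}{p+\epsilon}
\]

by Markov's inequality, so in expectation at least an $\epsilon/(p+\epsilon)$ fraction of the control blocks have error fraction at most $p+\epsilon$; a concentration bound over the random placement then makes this hold with high probability, not just in expectation.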
But, of course, what you can really try to do then is, if you knew those blocks,
you could try to decode them up to this error fraction, and you know that those
control blocks are designed to handle such errors.
But the problem is you do not know which are the control blocks. So you'll have
to kind of figure this out.
So the idea is simply going to be you just assume -- you know, give everybody
the benefit of the doubt. Assume it can be a control block.
So you go to each block and try to decode it up to errors of p plus epsilon.
Now, decode it according to what code? You're going to try to decode it
according to this inner code SC. Okay, you try to decode it as some fixed code.
If that succeeds, like you find somebody within p plus epsilon, then you sort of
say, okay, I think it's a control block.
And what I said in the first bullet says that enough of these control blocks will be
successfully decoded.
And another point that leads to what Leo was asking earlier, and that's why I was
worried for a second, is that it's also important that for the blocks where you
might have too many errors, you'll detect them, and then you'll say, well, okay,
too many errors have happened; I cannot decode this control block. So that's
fine.
There's one other problem. What if some data block gets mistaken for a control
block? Okay. And that's where the third piece, the random offset, comes in.
Remember, this is the random offset which is applied only to the data blocks. So
any data block essentially locally looks like a completely random string, and
you can never decode a completely random string under any code. It won't be
close to any of these codewords, so those guys will just fail.
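A rough sanity check on that last claim, under the simplifying assumptions that the inner code SC has block length $B$, at most $M$ codewords, and $p+\epsilon \le 1/2$ (a sketch only, not the bound from the paper): for a uniformly random block $r$,

\[
  \Pr_{r \in \{0,1\}^B}\Bigl[\exists\, c \in \mathrm{SC} : d(r,c) \le (p+\epsilon)B\Bigr]
  \;\le\; M \cdot \frac{\mathrm{Vol}\bigl(B,(p+\epsilon)B\bigr)}{2^B}
  \;\le\; M \cdot 2^{-B\,(1 - H(p+\epsilon))},
\]

which is exponentially small in $B$ as long as SC's rate stays below $1 - H(p+\epsilon)$; a union bound over the data blocks then says that, with high probability, none of them gets mistaken for a control block.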
So the summary of all this is that a decent fraction of the control blocks will get
successfully decoded by this and their correct values found out, and a very small
number of payload blocks will get mistaken for control blocks.
And no control block -- or rather, the probability that any control block gets
incorrectly decoded is also very small.
>>: So the use of delta here seems to actually be quite different from the other case because --
>> Venkatesan Guruswami: Yeah, it's sort of a very different object, yeah. It's some sort of a masking thing, yeah.
And I think -- so with these things, the Reed-Solomon code is just designed to
handle this. It's a low-rate Reed-Solomon code, so as long as you get a small
number of things correctly and very few errors, it can handle this and correctly
recover omega. Okay.
So I'm not going to sketch all the parameters, but it's completely
standard. The main points are these things. The offset prevents mistaking
payload for control, and the random placement and these -- the special property
of these existential codes make sure you get enough of the control things
correctly and very few incorrectly.
And as I said, once you know omega, you're in good shape. So you'll just
recover -- you know what delta and Pi are, so go back, remove the offset,
unscramble, and decode.
>>: Where did the random offset come from again?
>> Venkatesan Guruswami: Where does the random offset come from?
>>: Yeah.
>> Venkatesan Guruswami: Oh, so that's part of the encoding. So we'll just go
back to this picture.
So you added this random offset to the payload part.
>>: Is this shared by encoder and decoder?
>> Venkatesan Guruswami: No, nothing is shared by the encoder and decoder. So,
see, what the decoder first tries to do is decode this thing: it tries to find
enough of these pink guys and decodes omega. And we kind
of argued that we'll get omega correctly with high probability.
What is omega? Omega is just the triple Pi delta and random locations.
So once it gets omega correctly, it knows the locations of these pink blocks, so
it'll throw them out. It knows delta, so it will remove the offset. It knows Pi.
It will remove the scrambling. And at this point what it sees is an error which it
can handle with high probability. So it will correct it.
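To pin down that sequence of steps, here is a minimal companion to the encoder sketch above, again in Python. The helpers sc_try_decode, rs_decode_omega and rec_decode are stubs that only fix the interfaces -- they are not the actual decoders -- and the point is just the order of operations: try every block as a control block, recover omega, then undo the offset and the scrambling and decode the payload.

    def sc_try_decode(block):
        # Stub: the real SC decoder returns (header alpha_i, Reed-Solomon symbol)
        # when the block is within relative distance p + epsilon of an SC codeword,
        # and reports failure (None) otherwise -- which is what a data block,
        # masked by the random offset, should almost always do.
        return None

    def rs_decode_omega(candidates):
        # Stub: decode the Reed-Solomon code from the surviving (position, symbol)
        # candidates; enough correct candidates and few incorrect ones suffice.
        raise NotImplementedError

    def rec_decode(bits):
        # Stub for decoding the code for additive/oblivious errors.
        raise NotImplementedError

    def decode(received):
        blocks = [received[j * BLOCK:(j + 1) * BLOCK] for j in range(NUM_BLOCKS)]

        # Step 1: give every block the benefit of the doubt and try to decode it
        # as a control block under the inner code SC.
        candidates = [(j, sc_try_decode(b)) for j, b in enumerate(blocks)]
        candidates = [(j, sym) for j, sym in candidates if sym is not None]

        # Step 2: recover omega = (pi, delta, T) from those candidates.
        pi, delta, T = rs_decode_omega(candidates)

        # Step 3: throw out the control blocks, remove the offset, undo the
        # scrambling, and decode the payload with the additive-error code.
        payload = [bit for j, b in enumerate(blocks) if j not in T for bit in b]
        payload = [payload[j] ^ delta[j] for j in range(len(payload))]
        inverse = [0] * len(pi)
        for j, p in enumerate(pi):
            inverse[p] = j
        unscrambled = [payload[inverse[j]] for j in range(len(payload))]
        return rec_decode(unscrambled)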
>>: [inaudible]
>> Venkatesan Guruswami: Oh, yeah, yeah. So -- but that's not random, that's
a fixed -- obviously the sender and receiver need to agree upon a code, and that
includes both this random error code and the Reed-Solomon code.
Yes?
>>: Do you need to encode the positions of the pink blocks? Because it seems
like the decoding itself will figure out the position of the pink blocks. Right? The
decoder figures out which blocks are pink and which blocks are blue and then
confirms that the positions of the pink blocks were correct. Why do you need
that?
>> Venkatesan Guruswami: Oh, so you're asking why encode T as well?
No, no. But how do you know -- so the decoder will figure out what omega is, but
then how will it know which of these encoded blocks correspond to it?
>>: So it really doesn't decode anything. It's already made a good, solid guess
at it, right?
>> Venkatesan Guruswami: Yeah, but it needs to get every one of them correct,
right? So one thing it could do is it could re-encode it back and go and check
which of the blocks match with it, but --
>>: I see.
>> Venkatesan Guruswami: Maybe. But this is safer, right? I haven't thought
about it, but maybe there's an issue, maybe not. But certainly this way it gets
the whole set [inaudible].
Okay. So that's it. And so I'll just have a couple of -- I mean, I don't want to
get into too much detail about the online logspace thing. So that somehow has a
similar high-level structure. It has similar components. But the details are a
lot more complicated somehow.
First of all, hiding the location of the control blocks becomes a problem,
because earlier we could just say the error vector was fixed, so if you place
the blocks in random places, then enough control blocks will have few errors --
and that argument no longer works once the channel can adapt to what it sees.
So to deal with this, now we have to hide -- we have to mask the control blocks
with some sort of pseudorandom code to hide their location. And this -- really
the purpose of this is to ensure what we did earlier, that enough control blocks
have few errors.
But maybe I don't have -- so the details are complicated, but let me at least hint
at why we suddenly get list decoding in this case.
And the reason for that is that once you go to a channel like -- which is
non-additive, what the channel can do is it can take the first block and always just
change it to a control codeword, simply alter it. That's something a logspace
machine can do.
And so, therefore, the second argument we had that there are very few payload
blocks which get mistaken for control blocks is just completely false. So because
of this, the channel can inject many fake-looking control blocks, and this way the
amount of wrong information might far exceed the amount of correct information
you have. So in such a case you cannot uniquely recover the control
information, so all you can do is recover a small list, one of which will be correct.
So that's kind of, in the proof, where the list decoding comes up.
And as I said earlier, this seems fundamental also because for a lot of errors, this
seems [inaudible].
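For reference -- the talk does not name the specific algorithm, so take this as the standard tool rather than necessarily the exact one used in the paper -- the quantitative statement behind "recover a small list" for Reed-Solomon codes is the Guruswami-Sudan guarantee: given $n'$ candidate (position, symbol) pairs of which $t$ agree with a polynomial of degree less than $k$, the list of all consistent polynomials can be found in polynomial time whenever

\[
  t > \sqrt{k\,n'},
\]

so even when the fake control symbols injected by the channel outnumber the genuine ones, a small list containing the true omega survives, as long as the genuine symbols clear this square-root threshold.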
So once you do this -- okay, that's the control part. How do you argue about the
payload part?
So here the argument uses some standard sorts of arguments that are common in
crypto but may be less standard in coding theory. The point is to say something
about the error distribution which is caused by this channel: we know that this
random-error code is designed to handle sort-of-random errors, and when the
error was oblivious, the distribution was nice.
So what we somehow show is that the channel's error distribution is
indistinguishable by a logspace machine from an oblivious distribution. And this
is where we use the fact that the offset delta is now something which can fool
online logspace machines.
Okay. And the next step is, somehow, given this indistinguishability, we have to
argue that the events which ensure that you handle the oblivious case correctly
are also going to occur with high probability against the online channel.
And that event is essentially the same thing as the error being well distributed
between various blocks. And one problem here is that you have to check the
condition in online logspace. This is sort of a standard distinguisher argument:
if you assume this is not the case, you have to build an online logspace machine
which beats Nisan's generator, and you get a contradiction.
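Here is a schematic of that reduction in the same Python-sketch style as before -- not the paper's actual construction; the channel, the codeword, and the errors_well_spread check are all hypothetical placeholders. The shape of the argument: use the candidate string as the offset, run the channel, and output whether the bad event occurred; if the bad event is noticeably more likely under the generator's output than under true randomness, this procedure distinguishes the two, contradicting the generator's guarantee.

    def xor(a, b):
        return [x ^ y for x, y in zip(a, b)]

    def errors_well_spread(error_vector):
        # Placeholder for the event "the errors are well distributed between the
        # blocks"; in the actual proof this check must itself be computable by an
        # online logspace machine, which is the delicate part mentioned above.
        raise NotImplementedError

    def distinguisher(candidate, channel, codeword_without_offset):
        # candidate is either a truly random string or an output of Nisan's
        # generator; it plays the role of the offset delta (applied here to the
        # whole word for simplicity, rather than only the payload positions).
        transmitted = xor(codeword_without_offset, candidate)
        received = channel(transmitted)
        error_vector = xor(transmitted, received)
        # Output 1 exactly when the bad event happens.
        return 0 if errors_well_spread(error_vector) else 1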
And because of this online thing, this step ends up being a little bit harder here.
So actually the proof for the poly-time channels is a lot easier than the
logspace thing. Okay. That's what I just said.
So there's somehow a weaker condition we identify and work with, and that gives
the whole result.
I think I'm not going to remember the details, but I think because of this, our error
probability for this may be quite a bit weaker, maybe 1 in poly or -- yeah, I think
something like that.
Okay. And the same thing works if you change the Nisan generator to some other
PRG: everything works for poly-time, poly-size circuits, and the analysis is
actually easier in this case because you have a richer class of things to build
a distinguisher with.
Okay. So I think I'll conclude with that. Just to summarize, list decoding
allows you to communicate at optimal rates even against worst-case errors, but
the problem is we do not know explicit constructions. And given that, we saw
another way to bridge between Shannon and Hamming, which is to consider
channels which are limited in computational power, which seems like a
reasonably well-motivated model.
And for this setting we got the first results which do not use any setup
assumptions like shared randomness or public keys, and we got optimal rate
codes for oblivious errors and also for more powerful channels once you have
pseudorandom generators, but in the model of list decoding.
Okay. So there are many open questions, I think. So in this area of bridging
complexity and coding, you know -- there are a lot of things we do not know, and
many things seem quite hard.
So one question is that I said that for p more than one-fourth, you kind of really
need list decoding even for simple channels. What happens for p less than
one-fourth? Can you do something for logspace channels which is better than
completely worst-case channels? We do not know such a result. Or is it
possible that you can actually prove you cannot achieve Shannon capacity with
unique decoding? Okay. So this is one question.
So if you didn't get this, don't worry. I think sort of a nicer question, which
is more purely information theoretic, is to forget all these complexity
restrictions. We'll just work with worst-case channels, but now I'll just
restrict -- place an information-theoretic restriction, namely, that the channel
is online. So the channel can cause any error it wants, but it can flip
[inaudible] bit only depending on the past, but not the future. Just a classic
online model for errors.
And in this case, really, a lot of things seem to be open. So we are quite far off
from understanding the correct tradeoff between rate and the error-correction
fractions, so we have some upper bounds and some lower bounds, and I don't
think any of these are tight, and understanding this would be very nice.
So I think that's all I have.
[applause]
>> Venkatesan Guruswami: Questions?
>> Tom McMail: Thank you very much.
>> Venkatesan Guruswami: Thanks.
[applause]