>> Raymond Yeung: Thank you very much for -- the microphone is on, I
suppose. Thank you very much for the kind introduction. Okay. So the talk I'm
going to give today has the title BATS Code: Coding for a Network Coded Fountain.
Okay. So we're going to first talk about the problem. Now, in this picture we
show a network with packets generated at the source node S being multicast
to two sinks T1 and T2. This has nothing to do with the butterfly network, but
we're using it anyway. So here we see packets in green and red, where the
green packets are those packets that successfully arrive at the destination and
the red ones are the ones that got dropped along the way.
We're talking about sending files which are relatively big, for example 20
megabytes, consisting of about 20,000 packets. Okay. So we'd like to find a
practical solution to this problem, such that it has low computational and storage
costs. Storage cost refers to the amount of storage that you need at the
intermediate nodes. And you want one with a high transmission rate and also with
small protocol overhead.
Okay. So one possibility to do this is through TCP. By doing so you need to do
acknowledgment hop by hop, which is not scalable for multicast,
mainly because the cost of feedback is too high. Okay. We can also
consider using some well-known [inaudible] scheme such as fountain codes. It's
very scalable for multicast because you don't need feedback very often.
And so therefore in the case of multicast, it can be implemented quite efficiently.
Okay. This is regarding the complexity of fountain codes with routing, which is in
fact extremely efficient. So we consider a file consisting of K packets here. So
remember this notation K; K is an important parameter in this problem.
We've got a file consisting of K packets, and each packet consists of T symbols.
So T is pretty much a constant. Now, for a fountain code, the encoding is extremely
efficient because it uses a kind of sparse coding. It's O(T) per packet, which
means that the encoding complexity does not depend on the file size at all.
Whenever you don't see a K, that's something good, okay?
Now, for decoding, it uses belief propagation decoding, which is again constant
per packet and does not depend on the file size. And routing, in between, is a very
simple operation, and you need only a very small buffer for that.
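(To make the O(T)-per-packet claim concrete, here is a minimal Python sketch of this kind of sparse fountain encoding. It assumes binary XOR coding and a toy degree distribution; a real LT code would use the robust soliton distribution, and all names here are illustrative, not from the talk.)

import random

def fountain_encode_one(source_packets, degree_weights):
    """Emit one coded packet: the XOR of a small random subset of sources.

    The cost is O(d*T) for degree d and packet length T, independent of
    the file size K, which is why fountain encoding is so cheap.
    """
    d = random.choices(range(1, len(degree_weights) + 1),
                       weights=degree_weights)[0]
    chosen = random.sample(range(len(source_packets)), d)
    coded = bytearray(source_packets[chosen[0]])
    for i in chosen[1:]:
        for j, byte in enumerate(source_packets[i]):
            coded[j] ^= byte
    return chosen, bytes(coded)

# Example: K = 8 source packets of T = 4 bytes, degrees 1..3 only.
K, T = 8, 4
sources = [bytes(random.randrange(256) for _ in range(T)) for _ in range(K)]
neighbors, coded_pkt = fountain_encode_one(sources, [0.1, 0.5, 0.4])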
Now, this picture shows how things work. So at the top, the input packets are
the source packets. And there's an encoder at S here which encodes them into
coded packets, and they're sent to the intermediate node U, and
what it does is nothing but store and forward; along the way some red
packets are lost.
Now, the decoder at the receiving node T employs a belief propagation decoder, so
it doesn't have to receive all the packets; as long as it receives enough packets it
can start decoding properly. And at the end you would be able to recover all
the source packets.
Okay. However, there's a drawback with using fountain codes. Let's consider
this case where we send from S through U to T, and each of these links has
packet drop rate equal to .2. I was told in [inaudible] this is not something too uncommon to
happen because the rate is so high. Now, for this network, the capacity is equal
to .8. The reason is the following. Because the packet drop rate is equal to .2,
you can send a file from S to U, using a fountain code or TCP for error correction,
at rate .8.
Then you can decode the whole file at U, re-encode it, and send it on the other
link, again with packet drop rate equal to .2. So by doing so, what you're
doing at U repeats what you're doing at S, and you can actually send
things from S to T at rate equal to .8. But the way I've shown you is
not necessarily the best thing to do.
Now, if you apply forwarding with retransmission or an end-to-end fountain code,
the maximum rate you get is only .64. The reason is that this node does nothing
but forward the packets that it receives. So from here to here you lose
20 percent, so you have 80 percent remaining; from here to here you lose
another 20 percent. So you get a maximum rate equal to .8 times .8, which is .64.
This is for both retransmission and also for fountain codes. So what [inaudible] is
not the rate but the efficiency. Okay. Now, this is the theoretical upper
limit for the rate that can be achieved by a fountain code. But in reality, if you work
with a fountain code with a small field size, then there's actually a gap between this
upper bound and the rate that you can actually achieve.
Now, I mentioned that in principle you can get a rate equal to .8 by decoding the file
here at U, and re-encoding. There are two problems with this implementation.
First of all, there's a delay incurred, because you have to decode before you
re-encode, which means that you have to wait until the whole file arrives. And if
you have a multi-hop network, then at every hop you incur a delay, which is not
something very desirable.
Also, this node has to store all the packets before it can re-encode,
which means that the buffer size required at the intermediate nodes grows with
the file size, also something not good to have.
Okay. Now, we know that if you apply random linear network coding at U, in
principle you can achieve rate equal to .8. By random linear network coding I
mean the following. You have a buffer at node U; this implementation
is actually the same thing as the Avalanche system that was invented by
the Microsoft people at Cambridge. The buffer stores all the packets
that arrive at node U. And whenever you send out a packet you just take a
random linear combination of whatever you have on hand.
Even if no new packets have arrived, the next time you send out a packet you
take a different random linear combination. So there's no delay incurred at all.
It's just pipelining. Now, before I tell you the drawbacks
of this straightforward implementation of network coding, I will try to convince you
that with random linear network coding you can actually achieve rate equal to .8.
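(As a rough sketch of this buffer-and-recombine operation, in Python with GF(2) coefficients for brevity; the Avalanche-style systems use a larger field, and the class and method names here are illustrative, not from the talk.)

import random

class RLNCRelay:
    def __init__(self):
        self.buffer = []              # keeps every packet received so far

    def receive(self, packet):
        self.buffer.append(packet)

    def send(self, packet_len):
        # Emit a fresh random linear combination of the whole buffer.
        # Called on every transmission opportunity, whether or not anything
        # new has arrived: pure pipelining, with no decoding delay.
        out = bytearray(packet_len)
        for pkt in self.buffer:
            if random.getrandbits(1):     # random GF(2) coefficient
                for j in range(packet_len):
                    out[j] ^= pkt[j]
        return bytes(out)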
Now, this picture actually shows the operation of random linear network coding.
The first row represents node S, the middle row represents node U, and the
last row represents node T. Going horizontally represents time. So this node
here is node S at T equals 0, T equals 1, T equals 2, so on and so forth. Same
thing for node U: node U at T equals 0, node U at T equals 1, so on and so forth.
Now here we have some red arrows. These arrows represent the transmission
from node S to node U in every time unit, each with capacity equal to 1. And the
crosses here, once in a while, represent packet drops.
On average you lose about 20 percent. Now, the same thing from the U
layer to the T layer: again we have packet drops here and there. Now,
there is also another kind of arrow, in black, that goes horizontally. These
represent the memory of the past. Because we buffer everything, we
assume that you remember things from the past.
And for the sake of convenience, we assume that this link has capacity infinity,
but in reality it doesn't have to be larger than the file size.
Now, the way to think of this is water pipes. You try to
pump water in at node S at time zero, and you want to know how much water
you can get at the bottom layer as time goes by. Now, because 20 percent of
these pipes are broken -- actually, I think
it's better to assume that these pipes are blocked instead of the pipes being
broken, because with a broken pipe you lose the water. For this discussion,
think of these pipes as being blocked.
So because 20 percent of these pipes are blocked, you can press water down
from the top layer to the middle layer at rate equal to .8. Now, you can also
press water down from the middle layer to the bottom layer at rate .8, because
about 20 percent of those pipes are blocked as well.
The question is whether you can press water down from the top layer to the
bottom layer at rate equal to .8. Now, the reason why it can be done is the
existence of these thick pipes that go horizontally.
The point is, you press water down from the middle layer to the bottom layer at
rate equal to .8, and the water can travel horizontally and find its way down to the
bottom layer. Now, if we examine one of these nodes in the middle layer
carefully, it has two input links: one is the link from the past, and the other
input link is the link on which it receives the new packets.
So what you do, by applying random linear network coding, is take
a random linear combination of the past and newly arrived packets, and you send
out a new packet.
So this picture, this time-parameterized graph, or you may want to call it a [inaudible]
diagram, depicts precisely what the Avalanche system is doing. In random
linear coding, all we care about is the maximum flow. So now we've seen that the
maximum flow from S to the bottom row actually
grows with time at rate equal to .8. That's why, with random linear network coding,
you actually can send information from S to T at rate equal to .8. This is the
intuitive explanation of why random linear network coding does the job. Okay.
However, you sacrifice efficiency. The reason why fountain codes are so
efficient is that the encoding is very sparse, whereas for random linear network
coding, which is depicted here, this packet is formed by taking a
random linear combination of whatever you have, and this is a dense encoding.
Again, when you do random linear network coding at the intermediate node it's a
dense encoding, so you can still decode.
But decoding is not efficient. In particular, the encoding complexity is O(TK) per
packet; as I told you, K is the size of the file, and as soon as a K
turns up, it's not something good. We don't want that.
For decoding, you use straightforward Gaussian elimination. The complexity is
O(K^2 + TK) per packet. Since T is essentially a constant, it's essentially
K^2 per packet. Again, it's not something very desirable.
And for the intermediate node, if you apply network coding, again there's some
complexity associated with it. And this straightforward implementation
requires you to buffer all K packets, which is essentially the whole file.
So if you want to transmit a bigger file, then you require the intermediate
nodes to have a bigger buffer, which is not very desirable either.
Okay. So after seeing these slides, we come to a quick summary. On the
one hand we can have routing plus a fountain code, which is low complexity, but the
rate is not satisfactory. If you want a high rate you can go for network
coding, but the complexity is high.
So let us first review some existing schemes that try to tackle the problem. Okay.
Now, the very reason why applying random linear network coding at the middle
node would screw things up is that network coding changes the degree
distribution of the received packets.
In designing a fountain code, choosing the right degree distribution is the main thing.
If you do random coding in between, you screw up the degree distribution, and so low
decoding complexity cannot be guaranteed. Okay.
So there have been some efforts trying to get around the problem. Okay. The
main idea is to try to trick the random linear network coding at the intermediate
node so that end-to-end it still looks like a fountain code.
But this is something rather ad hoc, and it's very hard to extend beyond very
simple networks. And even if you do that, the computational cost at the
intermediate node is still high. And you also need to store all the K packets.
And there has been some work coming from this group -- actually, quite a
while ago, almost ten years ago. You guys really know what problems are
important. Okay. So the idea is to use so-called chunks to reduce the coding
complexity.
So we know that the coding complexity grows at a rate higher than linear. But
as long as we keep things small, things are still manageable: you chop
things up into chunks, and you do random linear network coding within this chunk
here and within this chunk here. Here you see this coded packet depends on the first
chunk and this one depends on the second chunk; this is
a kind of constraint on the coding.
So the idea is to keep things small. If you do that, the encoding complexity is
O(TKL), where L is the chunk size; the K is still there. And the decoding complexity
is O(KL^2 + TKL).
This is a little better, because previously we had K squared, but here it is only
linear in K, where we can regard T and L essentially as constants.
Okay. As for the buffer requirement at the intermediate nodes, it really depends
on the implementation. But one big problem with using the chunk
approach is how to transmit the chunks. So there have been different
approaches. The obvious approach is to do sequential scheduling of chunks,
which means that I transmit this chunk and wait until everybody is done
with it, and then I move on to the second chunk.
But one drawback of this scheduling is that it's not scalable for multicast.
For multicast there can be many receivers, and some can be faster,
some slower. So it's not really scalable for multicast.
Another approach is to use random scheduling of chunks, meaning I have all
these chunks and I randomly pick one to transmit, and hopefully it's a new
one for you.
Now, with this implementation, however, the intermediate node again has to
cache all the K packets, unlike with sequential scheduling, where you don't
have to cache all the K packets. However, such a
scheme becomes less efficient when a major fraction of all the chunks has been
decoded. The thing is, if there are 100 chunks and you have already
received 90 of them, and I randomly pick one, then 90 percent of the time it will
be a redundant chunk for you.
Okay. And there have also been efforts along the line of overlapped chunks, where
the chunks are not totally independent of each other. This can improve the
throughput of random scheduling, but it still cannot reduce the buffer size.
So what we learn from all these discussions is that fountain codes are not
really compatible with network coding, but ratelessness is a good property we
want in multicast applications.
Chunks can be used for network coding, but they are difficult to schedule. So we are
trying to address all these issues using a new approach we call BATS codes.
BATS code stands for batched sparse code.
And the operation of a BATS code is shown in this picture here. So at the top
again we have all these source packets. We organize the coded packets into
what we call chunks -- I'm sorry, batches.
A batch is different from a chunk in the sense that we kind of think of chunks
as operating independently, but as we're going to see, batches actually
interoperate with each other.
So here a batch has size equal to 3. And so let's see how we form the whole
batch. Let's see how we form the first packet in the first batch. To form the first
packet in the first batch, we draw a degree from a degree distribution.
Let's say that the degree is equal to four; then we randomly pick four of these
source packets, say this one, this one, this one and this one. So what we do is
take a random linear combination of these four source packets to form the first
packet of the first batch.
To form the second packet of the first batch, we stick with the same subset
of source packets, but we take a different random linear combination, so on and
so forth. So we're done with the first batch.
Now, to form the second batch we draw another degree from the degree
distribution, and this time let's say the degree is equal to 5. So we
randomly pick five of the source packets and form a random
linear combination to form the first packet here.
Now, to form the second packet here we use the same five source packets but
a different random linear combination.
So it goes on like this. And then at the intermediate nodes -- there can be
more than one, but here I only show one -- you do random linear coding, but only
within the same batch.
All right. So this picture gives a little more detail of the operation. So first you
obtain a degree D by sampling a certain degree distribution Psi, and then you
pick D distinct input packets randomly, as I said. And then you generate a batch
of M coded packets using those D packets. So capital M is the size of
a batch.
So here we form the batches X1, X2, X3, X4 and so forth. So here are the
details. Xi is the i-th batch, and the degree that we used for batch Xi is
equal to di.
Okay. And the packets Bi,1, Bi,2, up to Bi,di are the packets involved in forming
the i-th batch. Then you generate a random matrix Gi, and
you form the i-th batch as Xi = Bi Gi, where Bi is the row vector (Bi,1, ..., Bi,di).
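(As a sketch of this batch-generation step, again in Python over GF(2) for brevity; the talk allows a general finite field, and the degree distribution here is a placeholder, not the optimized Psi.)

import random

def make_batch(source_packets, degree_weights, M):
    T = len(source_packets[0])
    d = random.choices(range(1, len(degree_weights) + 1),
                       weights=degree_weights)[0]
    d = min(d, len(source_packets))
    B = random.sample(source_packets, d)   # the d chosen source packets
    # Random d x M generator matrix G_i: every packet in the batch is a
    # different random combination of the SAME d source packets.
    G = [[random.getrandbits(1) for _ in range(M)] for _ in range(d)]
    batch = []
    for m in range(M):
        x = bytearray(T)
        for k in range(d):
            if G[k][m]:
                for j in range(T):
                    x[j] ^= B[k][j]
        batch.append(bytes(x))
    return batch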
Okay. Any question?
>>: The size of the batch is always the same.
>> Raymond Yeung: Always the same. Capital M.
>>: Alternatively, I imagine you could keep the ratio constant, like it's always alpha times D of --
>> Raymond Yeung: Alpha times --
>>: So when you choose the degree D, right, and you say that, keep the batch size constant, multiply --
>> Raymond Yeung: It's actually a good idea to keep the batch size small, as
we're going to see. Okay. So we form these batches. And okay, maybe I can go
back to this picture and try to explain a little bit more.
So this coding scheme can actually be understood as an outer code, which is
kind of like a matrix generalization of a fountain code, together with an inner
code. The inner code is the random linear coding applied within the network,
within each batch.
You don't do cross-batch random linear coding. Okay. So you form these
batches, and they're sent through the network, which can do arbitrary linear
network coding; it doesn't matter, as long as things are linear. And you get
batches out as Y1, Y2, Y3 and so forth, where Yi = Xi Hi: Xi is the batch input
into the network, and Hi is the transfer matrix that it goes through
within the network.
Now, because we don't do cross-batch linear network coding, Yi depends only on
Xi and not on the other batches. That's how you keep the structure of
the outer code intact, even though you do random linear network coding within
the network.
Okay. So the end-to-end effect is the following. On the top we have these input
packets, and on the bottom we have check nodes characterized by the matrix
Gi Hi. If you're familiar with belief propagation decoding of fountain codes, what
you do is find a check node with degree 1 and you start propagating. In this
case, instead of doing the same thing, we look for a check node i whose degree
di is equal to the rank of Gi Hi.
For example, for this check node here, suppose the degree is equal to 2.
If the rank is also equal to 2, you can decode B1 and B3 and you can start
propagating.
So specifically, the linear equation associated with check node i is Yi = Bi Gi Hi,
where Bi is the vector of all the packets that are involved in batch i, Gi is the
generator matrix, and Hi is the transfer matrix in the network.
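(A small sketch of this rank test in Python over GF(2), with matrix rows encoded as integer bitmasks; a full decoder would also substitute already-decoded packets back into the remaining check nodes, which is omitted here.)

def gf2_rank(rows):
    # Rank of a GF(2) matrix given as a list of integer row bitmasks.
    rank = 0
    rows = list(rows)
    while True:
        pivot = next((r for r in rows if r), None)
        if pivot is None:
            return rank
        rank += 1
        low = pivot & -pivot          # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows if r != pivot]

def check_node_decodable(GH_rows, degree):
    # Batch i is solvable for its d_i source packets iff rank(G_i H_i) = d_i.
    return gf2_rank(GH_rows) == degree

# Example: a degree-2 check node whose combined matrix G_i H_i has full
# rank 2, so its two source packets (B1 and B3 above) can be recovered.
print(check_node_decodable([0b10, 0b01], 2))   # True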
Okay. So you can also apply the technique of precoding, as in Raptor codes.
The idea is that you precode with a fixed-rate erasure correction code. So these are
the source packets; you expand them into a larger number of packets with an erasure
code. Then, when you do the belief propagation here, upon being able to decode
a fraction 1 - epsilon of all these packets, you are able to recover the
original source packets by means of the erasure code.
This can actually give you a higher rate and also lower complexity. Okay. So we
need a degree distribution Psi such that the belief propagation can decode
successfully with high probability, the encoding and decoding complexity is low, and
the coding rate is high.
So I'm not going to get into the details of this asymptotic optimization program.
All I want to say is that this optimization has to do with the rank distribution of the
transfer matrix H. It does not depend on the details of the transfer matrix H itself,
but it does depend on its rank distribution.
Okay. And then one can do some optimization accordingly. And this is the
complexity with sequential scheduling of these batches. Now, you may ask:
just a moment ago we said that sequential scheduling is not efficient because
it's not scalable for multicasting, so why do we use sequential scheduling
here?
Now, the reason why sequential scheduling was not efficient for chunk-based
random linear network coding is that, as a receiver, you need
to receive every chunk. If you cannot receive it, you have to wait until you do
before you can move on to the next chunk. But for BATS codes, because there
is a kind of matrix fountain code as the outer code, you don't actually have to
receive all the batches. You only have to receive a sufficient number of batches,
and then you can start the belief propagation decoding.
So the result is that the source node encoding complexity is O(TM) per packet.
M is the batch size, so these two are constants; it doesn't depend on K.
The destination node decoding complexity is O(M^2 + TM) per packet; again, it
doesn't depend on K. No matter how large your file is, the per-packet encoding
and decoding complexity remains constant.
Now, for the intermediate node, for this particular configuration, okay, which I will
elaborate on a little bit: this particular configuration is such that from the source to
all the sinks, the network has a tree structure, so that packets cannot overtake
each other.
I'm going to elaborate further on why this is important. The buffer size only needs to
be O(TM), and it is independent of the file size. And for the network coding
operation, the complexity is O(TM) per packet.
>>: I thought last time when you gave complexity numbers, the reason you
had a K was because it was the complexity for decoding the entire file.
>> Raymond Yeung: Actually, I clarified this with my post-doc last night. In fact
the slide you saw last time had a K in it because it was the total complexity. But
now I'm talking about complexity per packet.
>>: So if you stated this as total file complexity, you'd have a K in there.
>> Raymond Yeung: Yes, of course. To decode a larger file you need to work
harder. But per packet, at least, you don't have to work harder.
Yeah. Okay. So T, you pretty much can forget about it; it's just the length of the
packet, and that doesn't change. K is the number of packets, which depends on the
file size. And M is a parameter to choose; it's the batch size.
Okay. So the one thing I would like to mention is that here the optimal value of
theta is almost the same as the rate of the code, which is very close to the
expectation of the rank of H. When you only have one hop,
the rank of H corresponds to the erasure probability. When there are multiple
hops, then this is what you have to look at.
It can be proved that the optimal value of theta
is exactly equal to the expectation of rank(H) when the expectation of rank(H)
is equal to M, the batch size, times the probability that rank(H)
is actually equal to M.
So let's go back to this example with packet loss equal to .2. So here we apply
the BATS code at node S, which encodes the K packets. And node U only needs to
cache one batch. The reason is that from S to T, there's only one path, and so
the packets cannot overtake each other. So at node U, you only need to store
one batch: as soon as you see packets from a new batch coming in, you know
the old batch is just over and you can just throw away everything.
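(A minimal Python sketch of this flush-on-new-batch rule. It assumes each packet carries a batch id and that batches cannot overtake each other on the single path; the names are illustrative, not from the talk.)

class SinglePathRelay:
    def __init__(self):
        self.current_batch_id = None
        self.batch = []               # at most one batch: O(T*M) storage

    def receive(self, batch_id, packet):
        if batch_id != self.current_batch_id:
            # First packet of a new batch: the old batch is over, flush.
            self.current_batch_id = batch_id
            self.batch = []
        self.batch.append(packet)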
And node T only needs to send one feedback after successful decoding. Okay.
So here are some numbers we obtained by simulation, and I want to note that
here we have not applied the precoding technique yet. If you apply the
precoding technique, the numbers look even better.
So here K is the file size: 16,000, 32,000, 64,000 packets. And q is the size of the
finite field. q equal to 2 is binary; q equal to 4 is a very small finite field.
And here we choose batch size equal to 32. And as we see in the lower right
corner, the rate can already exceed .64, which is the theoretical upper
bound on the rate of a fountain code, and which in fact cannot even be achieved
with a small finite field for a fountain code in any case.
So things look pretty encouraging. And in fact, from the theoretical point of view,
what we have obtained is actually a framework, which on one extreme
encompasses Raptor codes, fountain codes, that family: when M is equal to 1, when
the batch size is equal to 1, the BATS code degenerates to a fountain code, which
has low complexity but doesn't enjoy the benefit of network coding. On the
other extreme, when you take M equal to K, which is the whole file, and the degree
also equal to K, that is, you take random linear combinations of all the packets in
the source file, then the BATS code becomes full-fledged random linear network
coding, which has high complexity but at the same time enjoys the full
benefit of network coding. Somewhere in between, we're trying to choose the
parameters such that it performs well and at the same time the
complexity is low. That's what we're trying to do.
Okay. So let's talk about some recent developments. The one thing I would like
to mention is that there's one --
>>: May I ask a question about this code? So, let's look back. So this discussion
-- I mean, the specific code whose performance you're showing -- is about one or
two hops, right?
>> Raymond Yeung: Yeah.
>>: So if I have multiple hops, how does the code perform?
>> Raymond Yeung: The more hops you have, the better it is. Because if you
have packet loss, then you lose packets along the way. If you don't do anything
in between, then you keep losing packets.
>>: You say the performance -- compared with basically --
>> Raymond Yeung: Routing. Compared with routing.
>>: Also compared with random linear code.
>> Raymond Yeung: It cannot be -- in terms of the rate, it cannot beat random
linear network coding.
>>: Of course. But how much does the performance gap to the random linear code
change when you have multiple hops?
>> Raymond Yeung: Okay.
>>: So I mean, for example --
>> Raymond Yeung: For BATS codes, that's something we are in the process of
investigating. Okay. So we need to do much more simulation to see how it
actually works in a real environment.
>>: Seems like the .6-something there -- it should be like .8.
>> Raymond Yeung: It should be close to .8, yeah, because what the .8 is, actually
-- let's say in a bona fide network, nominally the capacity is equal to 1.
But because of a 10 percent drop here you get .9; here you get .8; this is .85;
and blah, blah, blah.
So in principle, if you applied random linear network coding, you would be able to
achieve the min cut of this graph.
So the advantage of such a coding scheme is that you prevent the packet loss
from accumulating, and at the same time you prevent delay from
accumulating.
>>: It would be good to quantify it.
>> Raymond Yeung: Exactly.
>>: The operation. Simply because with two hops there's already a drop in
performance from .8 to .68, something like that.
>> Raymond Yeung: .64.
>>: After more hops, I feel the performance may degenerate.
>> Raymond Yeung: In fact, we just got funding from the government to build
a prototype using BATS codes applied to P2P networks. It's the same thing.
Also, BATS codes can also handle the situation where you have some
intermediate nodes which are just helper nodes. They're just there to help;
they don't want to decode the whole file.
Okay. One thing I'd like to mention is that for fountain codes, the asymptotically
optimal degree distribution actually does not depend on the erasure probability,
which is something good. Okay? So you don't need to know the channel
condition before you decide on the degree distribution.
Having said that, the actual fountain code being used actually deviates from the
theoretical asymptotically optimal degree distribution. I think for the fountain codes
that got into the standard, like RaptorQ, the recommended degree distribution
is obtained by very extensive simulation in different situations.
So even though the theoretically optimal degree distribution doesn't depend on
the erasure probability of the channel, they don't use that one in practice.
So this may or may not be a very big issue; we are trying to see. Now, although the
theoretically optimal degree distribution does depend on the rank distribution
of the transfer matrix, we see that the dependence might not be that severe. So
we're trying to come up with a degree distribution that is robust across different
rank distributions and see how things work out.
And we are also conducting some finite-length analysis, which is being done by one
of my undergraduate project students, and we are building test systems for
multi-hop wireless transmission, and also, as I mentioned, for P2P file transfer
systems.
Okay. So as a summary, BATS codes provide a digital fountain solution for
networks employing linear network coding. Also, as I mentioned, the more hops
between the source node and the sink node, the larger the advantage over
applying end-to-end fountain codes.
Further development would include a proof that BATS codes are near
capacity-achieving in a more general setup, and also the design of the intermediate
node operations to maximize the expectation of the rank of the transfer matrix H
and to minimize the buffer size.
As I mentioned, if you have a tree structure such that packets cannot overtake
each other, then you only need to buffer one batch. But in a general topology,
where there are multiple paths connecting the nodes, it's not clear how big a
buffer you need.
I think it takes a lot more experiments before we can tell what is a
good size for the buffer that one needs to maintain.
So that's the end of my talk. I thank you very much.
[applause].
>> Philip Chou: Questions?
>>: So what if you pulled a degree out of your distribution that was
bigger than M?
>> Raymond Yeung: Bigger than M. That's okay.
>>: So how do you do the decoding purely with belief propagation?
>> Raymond Yeung: Let's see. Actually, I forgot the details. The
degree distribution they use actually has an upper limit. I don't remember
exactly how it is chosen.
I have to go back and look at the details.
>>: It seems like you presented the decoding.
>> Raymond Yeung: I know what it means. I think probably the support was set
from 1 to something which is smaller than M, I think.
>>: That would be --
>> Raymond Yeung: That's a very good question.
>>: That would be a strong correlation between that and the degree.
>> Raymond Yeung: That's a very good question. I have to look back into the
details before I can answer the question.
>>: Just kind of following up on his question. You only do the local decoding
within the batch. You never try to put all the equations for the entire file
together into a Gaussian matrix and --
>> Raymond Yeung: Exactly.
>>: So you must lose something there, right?
>> Raymond Yeung: Let's see. Yeah, you lose something. I mean, in the
extreme case, when you have M equal to K and the degree goes to K, it basically
falls back to full-fledged random linear coding.
>>: It seems like in the intermediate case you essentially have like two [inaudible],
so you would decode within the batch, BATS code, and propagate back.
But as Phil was asking, if you cannot decode a batch because of rank, do you
piece some other ranks together so maybe it might decode, right?
>>: If you restrict your decoding --
>> Raymond Yeung: There are -- if you stop propagating -- well, we are also
looking into the literature on fountain codes there. There are many techniques
that can be employed for moving forward. We're looking into that, too.
>>: Seems like it could be very fast decoding for most of the stuff. And if you
have any ambiguity left over, then you can try to eke out the last remaining bits
of it.
>>: Because in fountain codes you have these cycles, essentially, so you don't
have degree-one nodes and you cannot decode them. But you have the
coefficients here, so you don't have to stop there and --
>> Raymond Yeung: Yeah.
>>: I have two questions. One is on [inaudible] and one is on
implementation. So for this network, which is a two-hop network with
some packet loss, the example you showed had a fixed loss rate.
Now, let's assume that the loss rate fluctuates across time.
My understanding is that, information-theoretically, the performance should
basically revert to the average loss of the network, simply because with
network coding you can have an infinite amount of memory, with the right --
>> Raymond Yeung: Memory doesn't have to be larger than the file size.
>>: Basically you need a larger memory to average out the losses.
>> Raymond Yeung: Actually, not quite so for fountain codes. It
really depends on the statistical model of the channel. Think about a fountain
code. Okay, if there is a blackout, then you don't receive packets during that
period of time. When transmission resumes, you just pretend that nothing has
happened. You don't even have to know that something happened.
>>: My point is, it seems to me that if the rate fluctuates quite a lot, and your
memory is not large enough, you may not be able to take advantage of the
higher rate [inaudible] --
>> Raymond Yeung: Let me see... I think the issue that you brought up
becomes significant if the packets can overtake each other.
>>: I mean or if there's significant fluctuation.
>>: Basically what you care about is if it shuts off -- packet loss goes to
100 percent for some period of time. That's a pretty big fluctuation.
>>: It's like this, as far as I'm concerned. In the beginning the first pipe has full
capacity, no loss, for half of the time, and then for the later half of the time its
capacity is zero. The second pipe has zero capacity for the first half and
full capacity for the later half.
Now, to achieve the full capacity, you basically need memory equal to the
whole file, to be able to pour down the water. If you do batches, then you are
not able to take advantage of the capacity fluctuations.
>> Raymond Yeung: Actually, it is not sensitive to the fluctuation, in the sense
that all that matters is the number of batches that can arrive. Because you think
of this -- this --
>>: Number of batches.
>> Raymond Yeung: Yeah, yeah. You think about this in terms of the encoding
graph. If a packet cannot arrive, then you just delete it from the graph. And
because it's random, it actually looks the same everywhere.
>>: My second question is related to the implementation. What if I
do the coding separately? I mean, the BATS code in a sense has two stages,
right? In the first stage, some of the information arrives at the intermediate node
in this batch, right? And then you have a second stage, which is the coding
within the batch.
Now let's say we are operating on a packet basis, and the overhead of each
packet is relatively small. So can I simply, basically during the second stage,
convey what happened in the first stage?
>>: So you mean the intermediate nodes are trying to decode?
>>: No, the intermediate nodes are not trying to decode.
>> Raymond Yeung: Let's go back to this picture. Let's see if we can make --
okay. So which stage are you referring to?
>>: What I'm saying is this.
>> Raymond Yeung: This is not the right picture. Not the right picture.
>>: I need a two-stage graph. So it's like this. In the first stage, for example,
the intermediate nodes basically have the information.
>> Raymond Yeung: You mean here?
>>: Yes, here. Too brief.
>> Raymond Yeung: These are -- okay. For this case, this one arrives at the
intermediate node and this one doesn't arrive.
>>: This doesn't arrive. So basically you put that pattern into the message.
>> Raymond Yeung: Put that pattern into the message? I see.
>>: So let's say I use a binary vector that says: here are the packets which have arrived.
>> Raymond Yeung: Okay. I see. Okay. So this is fixed, the size of this is
fixed. And then you --
>>: This is --
>> Raymond Yeung: So you tell the downstream nodes that the third one actually
has not arrived, and that I'm only taking a random linear combination of the first two.
So they're trying to see whether this can help.
>>: Whether the scheme is simpler.
>> Raymond Yeung: Whether the scheme is simpler -- but how would you make use
of the information?
>>: Let's say this can be transmitted as overhead.
>> Raymond Yeung: But in what way is this information useful to those
downstream?
>>: The downstream nodes only need to know how many valid packets are in that
batch, and then simply design a code which is sized to be able to decode that
batch. So, for example, the batch here is size K. Okay. Now --
>> Raymond Yeung: Let's stick with M.
>>: M. Okay. After the first stage, the number of packets that arrive is anywhere
between 1 and M. I only need to basically get the information of which packets
arrived into the vector and send it to the receivers. The receiver basically has
a rateless code, to recover exactly the number of packets received at the
intermediate node, and once it has received them it basically says, okay, I don't
need any more of the code. So here, of course, the receiver is in a sense
attached to the intermediate node; it already basically has these K packets.
>> Raymond Yeung: I'll think about it more carefully and see whether we can take
advantage of this. Thanks for the input.
>>: I want to go back to that question of why M is constant.
>> Raymond Yeung: Why M is constant. In this case, in this example, you only need
to store M packets.
>>: I think the main reason is the intermediate node's buffer, right?
>> Raymond Yeung: Right.
>>: If you set it equal to D then you avoid the [inaudible] issue.
>> Raymond Yeung: Well, it really depends on the application. For example, in
the current generation of P2P systems there are no helper nodes; everybody
wants to help out anyway. So buffering is not an issue, and for some applications
it doesn't matter. But this is mainly for helper nodes, or something like a router,
where you don't want the buffer size of the router to grow with the file.
>>: I guess if you've already got D, and an upper limit on D, and you already
picked M to be --
>> Raymond Yeung: Well, but even with the upper limit, M can still be big, right?
>>: Yeah, but I guess -- so you pick M to be bigger than the upper limit on D. But
maybe if you picked a D that's smaller than the upper limit, then maybe you
could just use a smaller batch.
>> Raymond Yeung: Well, the size of M cannot be too small; otherwise you
cannot get the benefit of random linear network coding. But actually, what we find
a little surprising is that M can be chosen to be quite small, and yet the rate
already exceeds .64.
>> Philip Chou: Okay. Thanks.
>> Raymond Yeung: Thank you very much.
[applause]