>> Raymond Yeung: Thank you very much for -- the microphone is on, I suppose. Thank you very much for the kind introduction. Okay. So the talk I'm going to give today has the title BATS Codes: Coding for a Network Coded Fountain. Okay. So we're going to first talk about the problem. Now, in this picture we show a network with packets generated at the source node S being multicast to two sinks T1 and T2. And this has nothing to do with the butterfly network, but we're using it anyway. So here we see packets in green and red, where the green packets are those packets that successfully arrive at the destination and the red ones are the ones that got dropped along the way. We're talking about sending files which are relatively big, for example, 20 megabytes, consisting of about 20,000 packets. Okay. So we would like to find a practical solution to this problem, such that it has low computational and storage costs. Storage cost refers to the amount of storage that you need at the intermediate nodes. And you want one with high transmission rate and also with small protocol overhead. Okay. So one possibility is to use TCP. By doing so you need to do acknowledgment hop by hop, which is not scalable for multicast, mainly because the cost of feedback is too high. Okay. We can also consider using some well-known [inaudible] scheme such as fountain codes. It's very scalable for multicast because you don't need feedback very often. And so therefore, in the case of multicast, it can be implemented quite efficiently. Okay. This is regarding the complexity of fountain codes with routing, which is in fact extremely efficient. So we consider a file consisting of K packets here. So remember this notation K. K is an important parameter in this problem. We've got a file consisting of K packets, and each packet consists of T symbols. So T is pretty much a constant. So for a fountain code, the encoding is extremely efficient because it uses a kind of sparse coding. It's O(T) per packet, which means that the encoding complexity does not depend on the file size at all. Whenever you don't see a K, that's something good, okay? Now, for decoding, it uses belief propagation decoding, which is again constant per packet and does not depend on the file size. And routing, in between, is a very simple operation, and you need only a very small buffer for that. Now, this picture shows how things work. So on the top, the input packets are the source packets. And here we go. Okay. So at the top are the source packets. And there's an encoder at S which encodes them into coded packets, and they're sent to the intermediate node U, and what it does is nothing but store and forward, and along the way there are some red packets that are being lost. Now, the decoder at the receiving end T employs a belief propagation decoder, so it doesn't have to receive all the packets; as long as it receives enough packets it can decode properly. And at the end you would be able to recover all the source packets. Okay. However, there's a drawback with using fountain codes. Let's consider this case where we send from S through U to T, and these links each have packet drop rate equal to .2. I was told that this is not something too uncommon to happen, because the rate is so high. Now, for this network, the capacity is equal to .8. The reason is the following. Suppose you send the file from S to U; because the packet drop rate is equal to .2, you can do it by using forward error correction -- a fountain code -- or TCP.
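To make the sparse-encoding point concrete, here is a minimal Python sketch (my own illustration, not from the talk) of LT-style fountain encoding over GF(2); the function name lt_encode_packet and the toy degree distribution are assumptions for illustration only:

    import random

    def lt_encode_packet(source_packets, degree_dist):
        """Form one coded packet: sample a degree, pick that many source
        packets at random, and XOR them together. Cost is O(T) per coded
        packet (T = packet length), independent of the file size K."""
        d = random.choices(range(1, len(degree_dist) + 1), weights=degree_dist)[0]
        chosen = random.sample(range(len(source_packets)), d)
        packet = bytearray(len(source_packets[0]))   # T bytes
        for i in chosen:
            for j, b in enumerate(source_packets[i]):
                packet[j] ^= b
        return chosen, bytes(packet)

    # Example: K = 4 source packets of T = 8 bytes; the degree distribution
    # is a toy stand-in for a real (robust soliton style) distribution.
    K, T = 4, 8
    source = [bytes([i] * T) for i in range(K)]
    toy_dist = [0.4, 0.3, 0.2, 0.1]                  # P(degree = 1..4)
    idx, coded = lt_encode_packet(source, toy_dist)
    print(idx, coded.hex())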
You can send things from S to U at rate .8. And then you can decode the whole file, re-encode it, and send it on the next link, again with packet drop rate equal to .2. So by doing so, what you're doing at U repeats what you're doing at S, and you can actually send things from S to T at rate equal to .8. But the way I've shown you is not necessarily the best thing to do. Now, if you apply forward error correction and retransmission end-to-end with a fountain code, the maximum rate you get is only .64. The reason is that this guy does nothing but forward the packets that it receives. So from here to here you lose 20 percent; you get 80 percent remaining. From here to here you lose another 20 percent. So you get a maximum rate equal to .64. This is for both retransmission and also for fountain codes. So what [inaudible] is not the rate but the efficiency. Okay. Now, this is the theoretical upper limit for the rate that can be achieved by a fountain code. But in reality, if you work with a fountain code with a small field size, then there's actually a gap between this upper bound and the real rate that you can achieve. Now, I mentioned that in principle you can get a rate equal to .8 by decoding the file here at U and re-encoding. There are two problems with this implementation. First of all, there's a delay incurred, because you have to decode before you re-encode, which means that you have to wait until the whole file comes. And if you have a multi-hop network, then at every hop you incur a delay, which is not something very desirable. Also, this node has to store all the packets before you can re-encode, which means that the buffer size required at the intermediate nodes grows with the file size, also something not good to have. Okay. Now, we know that if you apply random linear network coding at U, in principle you can achieve rate equal to .8. By random linear network coding I mean the following. You have a buffer at node U -- and this implementation is actually the same thing as the Avalanche system that was invented by the Microsoft people at Cambridge. You have a buffer that stores all the packets that arrive at node U. And whenever you send out a packet, you just take a random linear combination of whatever you have on hand. Even if there are no newly arrived packets, the next time you send out a packet you take a different random linear combination. So there's no delay incurred at all. It's just pipelining. However -- okay, before I tell you the drawbacks of this straightforward implementation of network coding, I will try to convince you that with random linear network coding you can actually achieve rate equal to .8. Now, this picture shows the operation of random linear network coding. The first row represents node S, the middle row represents node U, and the last row represents node T. Going horizontally represents time. So this node here is node S at t = 0, t = 1, t = 2, and so on and so forth. Same thing for node U: node U at t = 0, node U at t = 1, and so on. Now, here we have some red arrows. These arrows represent the transmission from node S to node U in each time unit, and each has capacity equal to 1. And a cross here, once in a while, represents a packet drop. On average you lose about 20 percent. Now, the same thing from the U layer to the T layer; again we have packet drops here and there.
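Here is a rough sketch of the recoding operation just described, assuming GF(2) for simplicity (a real system such as Avalanche uses a larger field like GF(2^8)): the node buffers whatever has arrived and emits a fresh random linear combination each time slot, with no decode/re-encode delay. The class name RLNCNode is hypothetical.

    import random

    class RLNCNode:
        def __init__(self):
            self.buffer = []                  # all packets received so far

        def receive(self, packet):
            self.buffer.append(packet)

        def send(self):
            """Random linear combination over GF(2): XOR a random subset.
            (Occasionally this yields the zero packet; a real implementation
            would avoid sending that.)"""
            if not self.buffer:
                return None
            out = bytearray(len(self.buffer[0]))
            for pkt in self.buffer:
                if random.getrandbits(1):     # random coefficient in {0, 1}
                    for j, b in enumerate(pkt):
                        out[j] ^= b
            return bytes(out)

    node = RLNCNode()
    node.receive(b"\x01" * 8)
    node.receive(b"\x02" * 8)
    print(node.send())                        # a fresh combination every call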
Now, there is also another kind of arrow, in black, that goes horizontally. These represent the memory from the past, because we buffer everything and we assume that you remember things in the past. And for the sake of convenience, we assume that this link has capacity infinity, but in reality it doesn't have to be larger than the file size. Now, think of these as water pipes. You pump water in at node S at time zero, and you want to know how much water you can get at the bottom layer as time goes by. Now, 20 percent of these pipes are broken -- actually, I think it's better to assume that these pipes are blocked instead of broken, because with a broken pipe you lose the water. For this discussion, think of these pipes as being blocked. So because 20 percent of these pipes are blocked, you can press water down from the top layer to the middle layer at rate equal to .8. You can also press water down from the middle layer to the bottom layer at rate .8, because about 20 percent of those pipes are blocked too. The question is whether you can press water down from the top layer to the bottom layer at rate .8. Now, the reason why it can be done is the existence of these thick pipes that go horizontally. The point is that as you press water down from the middle layer to the bottom layer at rate .8, the water can travel horizontally and find its way down to the bottom layer. Now, if we examine one of these nodes in the middle layer carefully, it has two input links: one is the link from the past, and the other is the link on which you receive the new packets. So what you do, by applying random linear network coding, is take a random linear combination of the past and newly arrived packets and send out a new packet. So this time-parameterized graph, or [inaudible] diagram if you want to call it that, depicts precisely what the Avalanche system is doing. In random linear coding, all we care about is the maximum flow. And we've seen that the maximum flow from S to the bottom row actually grows with time at rate .8. That's why, with random linear network coding, you actually can send information from S to T at rate .8. This is the intuitive explanation of why random linear network coding does the job. Okay. However, you sacrifice efficiency. The reason why fountain codes are so efficient is that the encoding is very sparse, whereas for random linear network coding, which is depicted here, a packet is formed by taking a random linear combination of whatever you have, and this is a dense encoding. Again, when you do random linear network coding at the intermediate node, it's dense encoding. You can still decode, but decoding is not efficient. In particular, the encoding complexity is O(TK) per packet. As I told you, K is the size of the file, and as long as K turns up, it's not something good. We don't want that. For decoding, you use straightforward Gaussian elimination. The complexity is O(K² + TK) per packet. T is essentially a constant, so it's essentially K². Again, it's not something very desirable. And for the intermediate node, if you apply network coding, again there's some complexity associated with it.
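A small sketch (my own, over GF(2)) of why dense RLNC decoding costs on the order of K² per packet: Gaussian elimination on the K x K coefficient matrix. Coefficient rows are stored here as Python ints used as bit vectors; gf2_rank is a hypothetical helper name.

    import random

    def gf2_rank(rows, K):
        """Row-reduce K-bit rows over GF(2). The inner elimination loop
        touches every row for every pivot, which is where the K^2 term
        in the decoding complexity comes from."""
        rank = 0
        for col in range(K - 1, -1, -1):
            piv = next((i for i in range(rank, len(rows)) if rows[i] >> col & 1), None)
            if piv is None:
                continue
            rows[rank], rows[piv] = rows[piv], rows[rank]
            for i in range(len(rows)):
                if i != rank and rows[i] >> col & 1:
                    rows[i] ^= rows[rank]
            rank += 1
        return rank

    K = 8
    rows = [random.getrandbits(K) for _ in range(K)]
    print(gf2_rank(rows, K))   # full rank with high probability for random rows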
And this straightforward implementation requires you to buffer all K packets, which is essentially the whole file. So if you want to transmit a bigger file, then you require the intermediate nodes to have a bigger buffer, which is not very desirable either. Okay. So after seeing these slides, we come to a quick summary. On the one hand we can have routing plus fountain codes, which is low complexity, but the rate is not satisfactory. If you want high rate you can go for network coding, but the complexity is high. So let us first review some existing schemes that try to tackle the problem. Okay. Now, the very reason why applying random linear network coding at the middle node would screw things up is that network coding changes the degree distribution of the received packets. In designing a fountain code, choosing the right degree distribution is the main thing. If you do random coding in between, you screw up the degree distribution, and so low decoding complexity cannot be guaranteed. Okay. So there have been some efforts trying to get around the problem. The main idea is to design the random linear network coding at the intermediate nodes so that end-to-end it still looks like a fountain code. But this is rather ad hoc, and it's very hard to extend beyond very simple networks. And even if you do that, the computational cost at the intermediate node is still high, and you also need to store all K packets. And there has been some work coming from this group -- actually, it's quite a while ago, almost ten years ago. You guys really know what problems are important. Okay. So the idea is to use so-called chunks to reduce the coding complexity. We know that the coding complexity grows at a rate higher than linear, but as long as we keep things small, things are still manageable. So you chop things up into chunks, and you do random linear network coding within this chunk here and within this chunk here. Here you see the first chunk depends on the first set of packets, and the second chunk depends on the first and second sets; this is a kind of causal constraint on the coding. So the idea is to keep things small. If you do that, the encoding complexity is O(TKL), where L is the chunk size -- the K is still there -- and the decoding complexity is O(KL² + TKL). This is a little better, because previously we had K squared, but here it is only linear in K, where we can regard T and L essentially as constants. Okay. As for the buffer requirement at the intermediate nodes, it really depends on the implementation. But one big problem with the chunk approach is how to schedule the transmission of the chunks. There have been different approaches. The obvious approach is sequential scheduling of chunks, which means that I transmit this chunk and wait until everybody is done with it, and then I move on to the second chunk. But one drawback of this scheduling is that it's not scalable for multicast, because for multicast there can be many receivers, and some can be faster, some can be slower. So it's not really scalable for multicast. Another approach is random scheduling of chunks, meaning I have all these chunks and I randomly pick one to transmit, and hopefully it's a new one to you. With this implementation, however, the intermediate node again has to cache all K packets -- unlike sequential scheduling, where you don't have to cache all K packets.
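A minimal sketch, assuming GF(2) and non-overlapping chunks (my own illustration), of the chunk idea: random linear combinations are taken only within one chunk of L packets, so the per-packet cost is O(TL) instead of growing with the file size K.

    import random

    def encode_within_chunk(chunk):
        """chunk: a list of L packets (bytes). Returns one coded packet
        formed as a random GF(2) combination of the chunk only."""
        out = bytearray(len(chunk[0]))
        for pkt in chunk:
            if random.getrandbits(1):         # random GF(2) coefficient
                for j, b in enumerate(pkt):
                    out[j] ^= b
        return bytes(out)

    K, L, T = 8, 4, 16                         # file of K packets, chunk size L
    packets = [bytes([i] * T) for i in range(K)]
    chunks = [packets[i:i + L] for i in range(0, K, L)]
    coded = encode_within_chunk(chunks[0])     # coding never crosses a chunk
    print(coded.hex())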
Now, however, such a scheme becomes less efficient when a major fraction of all the chunks have been decoded. The thing is that if there are 100 chunks and you have already received 90 of them, then if I randomly pick one, 90 percent of the time it will be a redundant chunk for you. Okay. And there have also been efforts along the line of overlapping chunks, where the chunks are not totally independent of each other. This can improve the throughput of random scheduling, but it still cannot reduce the buffer size. So what we learn from all these discussions is that fountain codes are not really compatible with network coding, but ratelessness is a good property we want in multicast applications. Chunks can be used for network coding, but they are difficult to schedule. So we are trying to address all these issues using a new approach we call BATS codes. BATS code stands for batched sparse code. And the operation of a BATS code is shown in this picture here. So on the top again we have all these source packets. We organize the coded packets into what we call chunks -- I'm sorry, batches. A batch is different from a chunk in the sense that we think of chunks as operating independently, but, as we're going to see, batches actually interoperate with each other. So here a batch has size equal to 3. Let's see how we form the first packet in the first batch. To form the first packet in the first batch, we draw a degree from a degree distribution. Let's say the degree is equal to four, and then we randomly pick four of these source packets -- say we picked this one, this one, this one and this one. So we take a random linear combination of these four source packets to form the first packet of the first batch. To form the second packet of the first batch, we stick with the same subset of source packets, but we take a different random linear combination, and so on and so forth. So we're done with the first batch. Now, to form the second batch, we draw another degree from the degree distribution, and this time let's say the degree is equal to 5. So we randomly pick five of the source packets and form a random linear combination to form the first packet here. To form the second packet here, we use the same five source packets but a different random linear combination. So it goes on like this. And then at the intermediate nodes -- there can be more than one, but here I only show one -- you do random linear coding, but only within the same batch. All right. So this picture gives a little more detail of the operation. First you obtain a degree D by sampling a certain degree distribution Ψ, and then you pick D distinct input packets randomly, as I said, and then you generate a batch of M coded packets using the D packets. So capital M is the size of a batch. So here we form the batches X1, X2, X3, X4 and so forth. Here are the details. Xi is the i-th batch. The degree that we used for batch Xi is equal to di. And the packets bi,1, bi,2, up to bi,di are the packets involved in forming the i-th batch. Then you randomly generate a matrix Gi, and you form the i-th batch as Xi = Bi Gi, where Bi is this row vector of packets. Okay. Any questions? >>: The size of the batch is always the same? >> Raymond Yeung: Always the same. Capital M.
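Here is a sketch of the BATS outer-code step just described, over GF(2) for simplicity (the actual code uses a larger field): sample a degree d_i from the distribution Ψ, pick d_i source packets, and reuse that same subset with M different random coefficient vectors to form one batch of M packets. The helper name make_batch is hypothetical.

    import random

    def make_batch(source_packets, degree_dist, M):
        d = random.choices(range(1, len(degree_dist) + 1), weights=degree_dist)[0]
        subset = random.sample(range(len(source_packets)), d)    # B_i: d_i packets
        G = [[random.getrandbits(1) for _ in range(M)] for _ in range(d)]  # d x M
        batch = []
        for col in range(M):                   # X_i = B_i * G_i, column by column
            pkt = bytearray(len(source_packets[0]))
            for row, src in enumerate(subset):
                if G[row][col]:
                    for j, b in enumerate(source_packets[src]):
                        pkt[j] ^= b
            batch.append(bytes(pkt))
        return subset, G, batch

    K, T, M = 16, 8, 3
    source = [bytes([i] * T) for i in range(K)]
    subset, G, batch = make_batch(source, [0.2, 0.3, 0.3, 0.2], M)
    print(subset, len(batch))                  # same subset reused across the batch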
>>: Alternatively, I imagine you keep the ratio constant, like it's always alpha times D of -- >> Raymond Yeung: Alpha times -- >>: So when you choose the degree D, right, instead of keeping the batch size constant, multiply -- >> Raymond Yeung: It's actually a good idea to keep the batch size small, as we're going to see. Okay. So we form these batches. And maybe I can go back to this picture and explain a little bit more. So this coding scheme can actually be understood as an outer code, which is kind of a matrix generalization of a fountain code, together with an inner code. The inner code is the random linear coding applied within the network, within each batch. You don't do cross-batch random linear coding. Okay. So you form these batches, and they're sent through the network, which can do arbitrary linear network coding -- it doesn't matter, as long as things are linear. And you get batches out as Y1, Y2, Y3 and so forth, where Yi = Xi Hi: Xi is the batch input into the network, and Hi is the transfer matrix it goes through within the network. Now, because we don't do cross-batch linear network coding, Yi depends only on Xi and not on other batches. That's how you keep the structure of the outer code intact, even though you do random linear network coding within the network. Okay. So the end-to-end effect is the following. On the top we have these input packets, and on the bottom we have check nodes characterized by the matrix Gi Hi. If you're familiar with belief propagation for fountain codes, what you do is find a check node with degree 1 and start propagating. In this case, instead of doing the same thing, we look for a check node i with degree di equal to the rank of Gi Hi. For example, for this check node here, the degree is equal to 2. Suppose the rank is also equal to 2; then you can decode B1 and B3, and you can start propagating. So specifically, the linear equation associated with a check node is Yi = Bi Gi Hi, where Bi is the vector of all the packets involved in batch i, Gi is the generator matrix, and Hi is the transfer matrix in the network. Okay. You can also apply the technique of precoding, as in Raptor codes. The idea is to precode by a fixed-rate erasure correction code. So these are the source packets; you expand them into a larger number of packets by an erasure code. And when you do the belief propagation here, upon being able to decode a fraction 1 - ε of all these packets, you would be able to recover the original source packets by means of the erasure code. This actually gives you a higher rate and also lower complexity. Okay. So we need a degree distribution Ψ such that belief propagation can decode successfully with high probability, the encoding and decoding complexity is low, and the coding rate is high. I'm not going to get into the details of this asymptotic optimization program. All I want to say is that this optimization has to do with the rank distribution of the transfer matrix H. It does not depend on the details of the transfer matrix H, but it does depend on its rank distribution. Okay. And then one can do some optimization accordingly. And this is the complexity with sequential scheduling of these batches. Now, you may ask: just a moment ago we said that sequential scheduling is not efficient because it's not scalable for multicasting.
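A sketch of the decodability test just described, over GF(2) (my own illustration): a batch i can be solved for its d_i source packets exactly when rank(Gi Hi) = d_i, the matrix analogue of finding a degree-1 check node in a fountain code. The helper names are hypothetical.

    def gf2_matmul(A, B):
        n, k, m = len(A), len(B), len(B[0])
        return [[sum(A[i][t] & B[t][j] for t in range(k)) % 2 for j in range(m)]
                for i in range(n)]

    def gf2_rank(M):
        M = [row[:] for row in M]
        rank = 0
        for col in range(len(M[0])):
            piv = next((r for r in range(rank, len(M)) if M[r][col]), None)
            if piv is None:
                continue
            M[rank], M[piv] = M[piv], M[rank]
            for r in range(len(M)):
                if r != rank and M[r][col]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[rank])]
            rank += 1
        return rank

    G = [[1, 0, 1], [0, 1, 1]]          # d_i = 2 rows, batch size M = 3 columns
    H = [[1, 0], [0, 1], [1, 1]]        # M x (received packets) transfer matrix
    decodable = gf2_rank(gf2_matmul(G, H)) == len(G)
    print(decodable)                    # True: start belief propagation here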
Then why do we use sequential scheduling here? The reason sequential scheduling was not efficient for chunk-based random linear network coding is that, as a receiver, you need to receive every chunk. If you cannot receive one, you have to wait until you receive it before you can move on to the next chunk. But for BATS codes, because there is a kind of matrix fountain code as the outer code, you don't actually have to receive all the batches. You only have to receive a sufficient number of batches, and then you can start the belief propagation decoding. So the result is that at the source node the encoding complexity is O(TM), where M is the batch size -- these two are constants, so it doesn't depend on K. At the destination node, the decoding complexity is O(M² + TM). Again, it doesn't depend on K. No matter how large your file is, the encoding and decoding complexity per packet remains constant. Now, for the intermediate node, for this particular configuration, which I will elaborate on a little bit -- this particular configuration is such that from the source to all these sinks it has a tree structure, so that packets cannot overtake each other. I'm going to elaborate further on why this is important. The buffer size only needs to be O(TM), and it's independent of the file size. And for the network coding operation, the complexity is O(TM) per packet. >>: I thought last time when you were giving complexity numbers, the reason you had K was because it was the complexity for decoding the entire file. >> Raymond Yeung: Actually, I clarified with my post-doc last night. In fact, the slide you saw last time had a K in it because that was the total complexity. But now I'm talking about complexity per packet. >>: So if you stated this as total file complexity you'd have a K in there. >> Raymond Yeung: Yes, of course. To decode a larger file you need to work harder. But per packet, at least, you don't have to work harder. Yeah. Okay. So T you pretty much can forget about; it's just the length of a packet, which doesn't change. K is the number of packets, which depends on the file size. And M is a parameter to choose; it's the batch size. Okay. So one thing I would like to mention is that here the optimal value of θ is almost the same thing as the rate of the code, which is very close to E[rank(H)]. When you only have one hop, the rank distribution of H corresponds to the erasure probability. When there are multiple hops, this is what you have to look at. It can be proved that the optimal value of θ is exactly equal to E[rank(H)] when E[rank(H)] = M · Pr[rank(H) = M], where M is the batch size. So let's go back to this example with packet loss equal to .2. Here we apply the BATS code at node S, which encodes the K packets, and node U only needs to cache one batch. The reason is that from S to T there's only one path, and so the packets cannot overtake each other. So at node U, you only need to store one batch. As soon as you see packets from a new batch coming in, you know the old batch is over and you can throw everything away. And node T only needs to send one feedback after successful decoding. Okay. So here are some parameters we obtained by simulation, and I want to note that here we have not applied the precoding technique yet. If you apply the precoding technique, the numbers would look even better.
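A quick simulation sketch (my own numbers, GF(2), erasure rate 0.2) of the quantity the talk says governs the achievable rate: E[rank(H)] for a batch of M packets crossing one lossy hop with random linear recoding.

    import random

    def gf2_rank_rows(rows, width):
        rank = 0
        rows = rows[:]
        for col in range(width - 1, -1, -1):
            piv = next((i for i in range(rank, len(rows)) if rows[i] >> col & 1), None)
            if piv is None:
                continue
            rows[rank], rows[piv] = rows[piv], rows[rank]
            for i in range(len(rows)):
                if i != rank and rows[i] >> col & 1:
                    rows[i] ^= rows[rank]
            rank += 1
        return rank

    M, p, trials = 32, 0.2, 2000
    total = 0
    for _ in range(trials):
        # H's rows: one random coefficient vector per packet surviving the hop
        received = [random.getrandbits(M) for _ in range(M) if random.random() > p]
        total += gf2_rank_rows(received, M)
    print(total / trials / M)   # roughly 0.8 (slightly below 1 - p over GF(2))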
So here K is the file size: 16,000, 32,000, 64,000. And q is the size of the finite field. q = 2 is binary; q = 4 is a very small finite field. And here we choose batch size equal to 32. As we see in the lower right corner, the rate can already exceed .64, which is the theoretical upper bound on the rate of a fountain code -- a bound that actually cannot even be achieved with a small finite field in any case. So things look pretty encouraging. And in fact, from the theoretical point of view, what we have obtained is a framework. At one extreme, when M is equal to 1, when the batch size is equal to 1, the BATS code degenerates to a fountain code -- Raptor codes, fountain codes, that family -- which has low complexity but doesn't enjoy the benefit of network coding. At the other extreme, when you take M = K, the whole file, and the degree also equal to K -- that is, you take random linear combinations of all the packets in the source file -- then the BATS code becomes full-fledged random linear network coding, which has high complexity but at the same time enjoys the full benefit of network coding. Somewhere in between, we're trying to choose parameters such that it performs well and at the same time the complexity is low. That's what we're trying to do. Okay. So let's talk about some recent developments. The one thing I would like to mention is that there's one -- >>: May I ask a question about this code? So, let's look back. This discussion is about -- I mean, the specific code showing this performance is about one -- two hops, right? >> Raymond Yeung: Yeah. >>: So if I have multiple hops, how does the code perform? >> Raymond Yeung: The more hops you have, the better it is. Because if you have packet loss, then you lose packets along the way. If you don't do anything in between, then you keep losing packets. >>: You say the performance -- compared with basically -- >> Raymond Yeung: Routing. Compared with routing. >>: Also compared with random linear code. >> Raymond Yeung: In terms of the rate, it cannot beat random linear network coding. >>: Of course. But how much is the performance gap to the random linear code, and how does it change when you have multiple hops? >> Raymond Yeung: Okay. >>: So I mean, for example -- >> Raymond Yeung: For BATS codes, that's something we are in the process of investigating. We need to do much more simulation to see how it actually works in a real environment. >>: Seems like the .64 there -- should be like .8. >> Raymond Yeung: It should be close to .8, yeah. What the .8 is, actually -- let's say, in the network, nominally the capacity is equal to 1. But because of a 10 percent drop here you get .9. Here you get .8. This is .85. And so on. So in principle, if you applied random linear network coding, you would be able to achieve the min cut of this graph. So the advantage of such a coding scheme is that you prevent the packet loss from accumulating, and at the same time you prevent delay from accumulating. >>: It would be good to quantify it. >> Raymond Yeung: Exactly. >>: The operation. Simply because if it's two hops and there's already a drop-off in performance from .8 to .68, something like that -- >> Raymond Yeung: .64. >>: After more hops, I feel this performance may degenerate. >> Raymond Yeung: In fact, we just got funding from the government to build a prototype using BATS codes applied to P2P networks.
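A small arithmetic sketch of the multi-hop comparison being discussed: with per-hop loss rates p_i, store-and-forward with an end-to-end fountain code gets the product of the (1 - p_i), while network coding can approach the min cut, min(1 - p_i). The loss rates here are the illustrative ones from the talk (.1, .2, .15).

    from math import prod   # Python 3.8+

    losses = [0.1, 0.2, 0.15]
    fountain_rate = prod(1 - p for p in losses)   # 0.9 * 0.8 * 0.85 = 0.612
    min_cut = min(1 - p for p in losses)          # 0.8
    print(fountain_rate, min_cut)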
It's the same thing. Also, BATS codes can handle the situation where you have some intermediate nodes which are just helper nodes. They're just there to help; they don't want to decode the whole file. Okay. One thing I'd like to mention is that for fountain codes, the asymptotically optimal degree distribution does not depend on the erasure probability, which is something good, okay? You don't need to know the channel condition before you decide on the degree distribution. Having said that, the actual fountain codes being used actually deviate from the theoretical asymptotically optimal degree distribution. I think for the fountain code that got into the standard, like RaptorQ, the recommended degree distribution is obtained by very extensive simulation in different situations. So even though the theoretically optimal degree distribution doesn't depend on the erasure probability of the channel, they don't use that one in practice. So this may or may not be a very big issue; we are trying to see. Although the theoretically optimal degree distribution does depend on the rank distribution of the transfer matrix, we see that the dependence might not be that severe. So we're trying to come up with some robust degree distribution that works across different rank distributions and see how things work out. We are also conducting some finite-length analysis, which is being done by one of my undergraduate project students, and we are building test systems for multi-hop wireless transmission, and also, as I mentioned, for P2P file transfer systems. Okay. So as a summary, BATS codes provide a digital fountain solution for networks employing linear network coding. Also, as I mentioned, the more hops between the source node and the sink node, the bigger the advantage over applying end-to-end fountain codes. Further development would include a proof of the near-capacity achievement of BATS codes in a more general setup, and also the design of the intermediate node operations to maximize the expectation of the rank of the transfer matrix H and to minimize the buffer size. As I mentioned, if you have a tree structure such that packets cannot overtake each other, then you only need to buffer one batch. But in a general topology, when there are multiple paths connecting the nodes, it's not clear how big a buffer you need. I think it takes a lot more experiments before we can tell what is a good size for the buffer that one needs to maintain. So that's the end of my talk. Thank you very much. [applause]. >> Philip Chou: Questions? >>: So what if you pulled a degree out of your distribution that was bigger than M? >> Raymond Yeung: Bigger than M. That's okay. >>: So how do you do the decoding properly then with belief propagation? >> Raymond Yeung: Let's see. You get a -- actually, I forget the details. The degree distributions they use actually have an upper limit. I don't remember exactly how it is chosen. I have to go back and look at the details. >>: It seems like you presented the decoding. >> Raymond Yeung: I know what it means. I think probably the support was set from 1 to something, which is smaller than M, I think. >>: That would be -- >> Raymond Yeung: That's a very good question. >>: There would be a strong correlation between that and the degree. >> Raymond Yeung: That's a very good question. I have to look back into the details before I can answer the question.
>>: Just kind of following up on his question. You only do the local decoding within a batch. You never try to put all the equations across the entire file together into one big matrix and do Gaussian elimination -- >> Raymond Yeung: Exactly. >>: So you must lose something there, right? >> Raymond Yeung: Let's see. Yeah, you lose something. I mean, in the extreme case, when you have M = K and the degree goes to K, it basically falls back to full-fledged random linear coding. >>: It seems like in the intermediate case you essentially have two [inaudible], so you would decode within the batch and propagate back. But as Phil was asking, if you cannot decode a batch because of its rank, do you piece some other batches together so that maybe it might decode, right? >>: If you restrict your decoding -- >> Raymond Yeung: If you stop propagating -- well, we are also looking into the literature on fountain codes there. There are many techniques that can be employed for moving forward. We're looking into that, too. >>: Seems like it could be very fast decoding for most of the stuff, and if you have any ambiguity left over, then you can try to eke out the last remaining bits of it. >>: Because in fountain codes you have the cycles, essentially, so you don't have degree-one nodes and you cannot decode them. But here you have extra coefficients, so you don't have to stop there and -- >> Raymond Yeung: Yeah. >>: I have two questions. One is on [inaudible] and one is on implementation. So for this network, which is a two-hop network with some packet loss, the example you showed had a fixed loss rate. Now, let's assume that the loss rate fluctuates across time. My understanding is that the information-theoretic performance should basically be related to the average loss of the network, simply because if you use network coding you can have an infinite amount of memory with the right -- >> Raymond Yeung: The memory doesn't have to be larger than the file size. >>: Basically you need a larger memory to average out the losses. >> Raymond Yeung: Actually, not quite, for fountain codes. It really depends on the statistical model of the channel. Think about a fountain code. If I have a blackout, then you don't receive packets during that period of time. When you resume, you just pretend that nothing has happened. You don't even have to know that something happened. >>: My point is it seems to me that, basically, if the rate fluctuates quite a lot, then in the end, if your memory is not large enough, you may not be able to take advantage of the higher rate [inaudible] -- >> Raymond Yeung: Let me see... I think the issue that you brought up becomes significant if the packets can overtake each other. >>: I mean, or if there's significant fluctuation. >>: Basically what you care about is if it shuts off -- packet loss goes to 100 percent for some period of time. That's a pretty big fluctuation. >>: It's like this, as far as I'm concerned: in the beginning, the first pipe has full capacity, no loss, for half of the time, and then for the later half of the time its capacity is zero. And the second pipe has zero capacity for the first half and full capacity for the later half.
Now, to achieve the full capacity, you basically need memory equal to the whole file to be able to pour the water down -- if you do batches, then you are not able to take advantage of the capacity fluctuations. >> Raymond Yeung: Actually, it is not sensitive to the fluctuation, in the sense that all that matters is the number of batches that arrive. Because you think of this -- >>: The number of batches. >> Raymond Yeung: Yeah, yeah. You think about this in terms of the decoding graph. If a packet cannot arrive, then you just delete it from the graph. And because it's random, it actually looks the same everywhere. >>: My second question is related to the implementation. What if I do the coding separately? I mean, the BATS code in a sense has two stages, right? In the first stage, at the intermediate node, some of the information of a batch arrives, right? And then you have a second stage, which is the coding within the batch. Now let's say we are operating on a packet basis, and the overhead of each packet is relatively low. So can I simply, during the second stage, report what happened in the first stage? >> Raymond Yeung: So you mean the intermediate nodes are trying to decode? >>: No, the intermediate nodes are not trying to decode. >> Raymond Yeung: Let's go back to this picture. Let's see if we can make -- okay. So at what stage, exactly? >>: What I'm saying is this. >> Raymond Yeung: This is not the right picture. Not the right picture. >>: I need a two-stage graph. So it's like this. In the first stage, for example, the intermediate nodes basically have the information. >> Raymond Yeung: You mean here? >>: Yes, here. [inaudible]. >> Raymond Yeung: These are -- okay. For this case, this one arrives at the intermediate node, this one doesn't arrive. >>: This doesn't arrive. So basically put that pattern into the message. >> Raymond Yeung: Put that pattern into the message? I see. >>: So let's say I use a binary vector that says: here are the packets which have arrived. >> Raymond Yeung: Okay. I see. So the size of this is fixed. And then you -- >>: This is -- >> Raymond Yeung: So you tell the downstream nodes that the third guy actually has not arrived and I'm only taking random linear combinations of the first two things. So we're trying to see whether this can help. >>: Whether the scheme is simpler. >> Raymond Yeung: Whether the scheme is simpler -- but how will you make use of the information? >>: Let's say this can be transmitted as overhead. >> Raymond Yeung: But in what way is this information useful to those downstream? >>: The downstream nodes only need to know how many valid packets are in that batch, and they simply design a code which is sized to be able to decode that batch. So, for example, the batch here has size K. Okay. Now -- >> Raymond Yeung: Let's stick with M. >>: M. Okay. After the first stage, the number of packets that arrive is anywhere between 1 and M. I only need to get the information of which packets arrived, in the vector, and send it to the receivers. The receiver basically has a rateless code to recover exactly the number of packets received at the intermediate node, and when it has received enough it basically says, okay, I don't need any more of the code. So here, of course, the receiver is in a sense attached to the intermediate node; it already knows the pattern of these packets.
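A sketch of the audience suggestion (not something from the talk itself): the intermediate node could report, in a small fixed-size per-batch header, the bitmap of which of the M packets of a batch actually arrived, so downstream nodes know how many independent packets the batch can contain. All names here are hypothetical.

    M = 8
    arrived = [True, True, False, True, False, True, True, False]

    bitmap = 0
    for i, ok in enumerate(arrived):
        if ok:
            bitmap |= 1 << i                       # one bit per packet slot

    header = bitmap.to_bytes((M + 7) // 8, "big")  # fixed-size overhead per batch
    print(header.hex(), bin(bitmap).count("1"))    # 5 of the 8 packets arrived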
>> Raymond Yeung: I'll think about it more carefully and see whether we can take advantage of this. Thanks for the input. >>: I want to go back to that question about the batch size being a constant M. >> Raymond Yeung: A constant M. In this case, in this example, you only need to store M packets. >>: I think the main reason is the intermediate node's buffer, right? >> Raymond Yeung: Right. >>: If you set it equal to D, then you avoid the [inaudible] issue. >> Raymond Yeung: Well, it really depends on the application. For example, in the current generation of P2P systems, there's no helper node; everybody wants to help out anyway. So buffering is not an issue. So for some applications it doesn't matter. But this is mainly for helper nodes, or something like a router, where you don't want the buffer size of the router to grow with the file. >>: I guess if you've already got D and an upper limit on D, and you already picked M to be -- >> Raymond Yeung: Well, but even with the upper limit, M can still be big, right? >>: Yeah, but I guess -- so you pick M to be bigger than the upper limit on D. But maybe if you picked a D that's smaller than the upper limit, then maybe you could just use a smaller batch. >> Raymond Yeung: Well, the size of M cannot be too small; otherwise you cannot get the benefit of random linear network coding. But actually, what we find a little surprising is that M can be chosen to be quite small, and yet the performance already exceeds that of fountain codes. >> Philip Chou: Okay. Thanks. >> Raymond Yeung: Thank you very much. [applause]