>> Philip Chou: It's my great pleasure to introduce Alex Dimakis. He is an
assistant professor at the Viterbi School of Engineering at USC. And before
that he was a post-doc at Caltech, and before that a graduate student at U.C.
Berkeley, and during that time he was an MSR graduate fellow here. So it's a
great pleasure for me to welcome him back to MSR. Alex.
>> Alex Dimakis: Thanks, Phil. So it's great to be back in the new building. So I
will talk about network coding for distributed storage.
And before I talk about network coding, I want to talk about coding. So how to
store information using erasure codes. So this is a very basic slide. But very
useful for anything else that follows. So let's say you have a file or a data object
which is this big yellow box here and you cut it into two pieces.
So K is going to be 2 in these examples. And you can use a code, a 3, 2 code to
store information distributedly. So the way you store is you store the first bucket,
the first part of the file in one server. The second part in another server, and you
store the bit-wise XOR of the packets in a third server. And this is just a
single parity disk and this is used all over the place, of course.
And the key point here is that you have generated three packets so that any two
out of the three allow you to recover the original two. So that's why this is a 3, 2
maximum distance separable code and it's also known as single parity.
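To make this concrete, here is a minimal Python sketch of this (3, 2)
single-parity scheme; the toy file and the byte-level split are illustrative,
not from the talk:

```python
# (3, 2) single-parity code: split a file into two halves and store each
# half plus their bitwise XOR on three different servers.
data = b"hello world!"                          # toy 12-byte "file"
a, b = data[:6], data[6:]                       # K = 2 pieces
parity = bytes(x ^ y for x, y in zip(a, b))     # third server: A XOR B

# Any two of the three packets recover the file, e.g. if server 1 dies:
recovered_a = bytes(x ^ y for x, y in zip(b, parity))
assert recovered_a + b == data
```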
This is a little more interesting case where I generate four packets. This is a
single parity. And this is A plus 2B. A plus 2B what does that mean? That
means I group my bits into groups of two, let's say. I think of them as numbers in
a larger finite field.
And then I do my operations in a larger finite field. If you don't know about finite
fields you can just think of these as linear equations over the reals and
everything will work out fine.
The key point here is that these are four linear equations. And any two out of
these four allow me to reconstruct back A and B. For example, if I get B and A
plus 2B, these are two linear equations I can solve to get back A and B.
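And as a sketch of that recovery with numbers, using integers mod a prime to
stand in for the larger finite field (the specific values are made up):

```python
# (4, 2) MDS code: store A, B, A+B, A+2B as elements of a finite field.
p = 257                                   # toy field GF(257)
A, B = 42, 99
packets = [A, B, (A + B) % p, (A + 2 * B) % p]

# Recover from packets 2 and 4, i.e. B and A+2B: two equations, two unknowns.
got_B, got_A_2B = packets[1], packets[3]
got_A = (got_A_2B - 2 * got_B) % p        # subtract 2B from A+2B
assert (got_A, got_B) == (A, B)
```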
The point of this slide is that erasure codes are better than replication. What is
replication? Replication says just store the packet twice and store the other
packet twice, and this is in four disks, and let's compare this with a 4, 2 erasure
code, which is the one I just showed.
And what I want to say here is that this scheme here, the code, is much better
than replication, because on the right any two allow us to recover the file,
whereas for this 2X replication scheme, if I lose the first and the third disk I'm
fine, but if I lose the first two disks I have lost A and therefore I have lost the file.
So this scheme is more reliable than replication, of course. So this is well known.
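One way to see the difference is to enumerate the two-disk failure patterns; a
small sketch (the disk labels are illustrative):

```python
# Which pairs of failed disks lose data? 2x replication stores A, A, B, B;
# the (4, 2) MDS code survives ANY two failures, since any two of its four
# packets decode.
from itertools import combinations

disks = ["A", "A", "B", "B"]
fatal = [fail for fail in combinations(range(4), 2)
         if {"A", "B"} - {disks[i] for i in range(4) if i not in fail}]
print(fatal)   # [(0, 1), (2, 3)]: losing both copies of A, or both of B
```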
And in fact erasure codes are introducing redundancy in an optimal way. It's
optimal because any K packets allow you to recover your file and you could not
hope to get your file back from any K minus 1 because you don't get enough bits.
And that's why people use erasure codes like Reed-Solomon, fountain codes and
error-correcting codes more generally and they use them all over the place.
Still, however, most peer-to-peer storage systems and most information stored in
data centers today uses replication. So they make three, four copies of the
objects and they don't use parity. They don't mix information together. Now, why
is that? So this guy, this is Claude Shannon, and he was in the phone company.
And he basically, one of the main results that he showed is that when people
were sending bits over noisy channels, they were observing that, okay, we can
repeat the bits. Therefore, our rate goes down. But our probability of error
reduces. And people thought that you can, to reduce the probability of error you
have to reduce the rate. And so you have to repeat your symbol more and more
times.
And if you want your probability of error to go to 0, the ratio of useful bits
over the total number of bits you would be sending would have to
vanish. That's what people believed.
But the breakthrough result and the change of thinking was that you don't
actually have to do that. You can have arbitrarily small probability of error with a
fixed rate, as long as you're below the capacity.
So practically what I'm saying is that replication, which is a repetition code, is a
terribly bad code and every coding theorist knows that, but still this is what we
use in many distributed storage systems.
And the question is can we improve the efficiency? Of course we use replication
for other things, too, for load balancing, for efficiency. But at least for archival
storage, when the main bottleneck is you want reliability, then for those cases we
should be using codes. Why don't we? Well, people have worked on using
codes. There are many problems when you try to use codes over networks.
And this is great, because there are new open problems for people to look at.
So let me show you a few of them. One is when you have two servers, let's say,
that have two packets A and B and they want to create this very basic code here
that is A, B and A XOR B. Every time you send a packet over a network now you
have to pay communication costs.
Typically these costs were not considered important because everything was
happening centrally. But now when you have a network of computers and the
network is the bottleneck, you want to minimize the communication that's
happening in the network. So this is one problem, how to create codes that have
the smallest number of edges or the smallest amount of information when they
are created.
The other issue, for example, is update complexity: if I'm storing A on this
disk and A plus B, the XOR, on this disk, then when one packet changes I
have to go to all the disks that are storing parities combining that packet and I have
to change those.
So graph-theoretically that is related to the degree of each parity node, and I
would like to make these degrees as small as possible. The problem is these
degrees are in direct conflict with reliability. So what is the minimum update
complexity that you can have if you want to tolerate any N minus K failures, for
example, is one question that has been addressed a little bit, but there's many
open problems there, too.
The main open problem that I will talk about is this one here where, okay, you
have your code. So this was 3, 2 code. But now you have a failure. So you
have to get a new disk and you have to communicate some information to the
new disk so that these two guys with the new guy form a good code again. So
this is going to be called the code repair problem, and the repair communication
is a new problem that has not been looked at before because there was no
network in the picture.
Okay. So the story is this guy. This is Sean Rhea, a Ph.D. student at Berkeley,
and he was building this system. Has anyone heard of this OpenDHT system?
It's a distributed peer-to-peer storage system. And later, as I read, the Dynamo
system Amazon is building draws on that, and many other systems are building on these
ideas, it's a system where you store distributed information over the Internet and
they use a lot of replication or coding. So I was the coding theory guy around.
And I went, we went for coffee and we were talking and said, okay, you know
there's all these amazing developments in coding theory, we can do very fast
decoding, sparse graph codes, fountain codes, network coding, all that, perhaps
we could use any of that in the distributed storage system. And basically after
talking with him, I realized that the main problem, or one of the most
important problems, is this repair issue. It's not coding or decoding complexity or
sparsity or any of these things.
So here is an example. Let's say you have, your file is seven packets total. And
you use, you encode it into 14. That means any seven out of these 14 allow you
to get your file back. So this is your original data object here and these are
parities. Seven parities. So any seven packets out of these 14 will allow you to
recover your file back. And each packet is one megabyte in size, let's say. The
total file is seven megabytes. And in OpenDHT they were using a
Reed-Solomon code, or the information dispersal algorithm idea, which is very
similar.
When they had one failure, they had a new peer here that had to create a new
parity. Either a parity or X4, the systematic symbol that was lost. The problem is,
to create a Reed-Solomon code symbol, for example, to create a new parity, you
need to have all the data. So to create one packet here that had size one
megabyte they had to send all these things to this guy so that was seven
megabytes in communication over a network.
And then this guy would use these, solve for X1, X2 up to X7, and create a
new packet or new linear combination here and store that. So this is the punch
line that the amount of network traffic required to reconstruct one lost data block
was the main argument against using erasure codes in peer-to-peer storage
systems.
And several studies have pointed out that this is a big problem. And when you
use codes as a black box, repairing one failure basically requires all the data to
be present. Yes?
>>: Do you have the same problem if some of your original data changes? Or
your Xi changes?
>> Alex Dimakis: If some of your Xi changes then you have to go to your parities
and update them. That's another problem. It's called the update complexity
problem. One way around this is say I'm never going to change things, I'm only
going to use codes for stuff that is archival. So that's -- typically this is the case
of most interest. But there has been work on the update complexity.
So it's a separate related problem. The B-codes, for example, have optimal
update complexity; they are codes designed exactly for that issue. But for the repair
problem there was nothing --
>>: If you consider archival, it's also true that usually archives are updated. You
append. You only append, you never change things.
>> Alex Dimakis: Sure.
>>: So if you add things you still require a lot of communication, right, so you will
have to update.
>> Alex Dimakis: Yes, update is a separate problem that was known. And there
is work -- as far as I know, the optimal construction theoretically is a B-code, which
has minimum update complexity for given reliability. So they use these in RAID
6. So this is a problem that's relevant even when the storage is centralized.
The novelty in these problems is that when the storage is distributed, the network
is the bottleneck and now you also have these repair issues. Any other
questions so far?
Okay. So I will talk about this. My main, the main problem is how do we repair a
code? Well, one way to repair a code is bring all the data in one point. But you
can do better than that. So this is setting up the problem more formally. Assume
you have your code. Let's say it's a 4, 2 MDS code, and one node here, this
guy, leaves the system. The question is how much data do we have to
communicate to this guy, the newcomer, so that these three guys combined with
the newcomer form a good N, K MDS code. So is the problem clear to
everyone?
Okay. First idea. Definitely I can communicate two megabytes, because any two
megabytes out of this give me back all the data, as we said. So I can send two
megabytes here. This guy can solve for A and B. And store here A. Or it could store
a different linear combination. So it doesn't have to be A here. You could store A
plus 50B; as long as this is linearly independent of these, this is still a good code.
But the newcomer is going to download two megabytes and store only one. Is it
possible to download less? This is a quote from a peer-to-peer storage paper that
says if you use any of the codes we know, to make one packet you need all the
data. And the main point is that this is not always true. You don't always need
all the data to make a new encoded packet. If you use network coding you can
do it with much less. So this is the main message of this talk.
Okay. So it is possible to download 1.5 megabytes. This is for this example.
This is the information-theoretic minimum. Okay. How do you do it? The first
thing you have to do is what's called sub-packetization. So you have to take
every packet and cut it into two. So now this is a code that has four
variables. Any two boxes now contain four linear equations in four variables.
You can check that you can actually solve these equations. This is a 4, 2 MDS
code. When you have a failure each guy is going to make a small linear
combination of the two packets they have. Each packet here has size half a
meg. So this is half a meg. So these are three packets of total size 1.5. You
send these three packets to the newcomer. The newcomer makes a linear
combination and stores it. And makes another linear combination and stores it
here. And this is what is being stored. So you see what the scheme I'm
suggesting is?
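Here is a sketch of that repair scheme with random coefficients, over the reals
for simplicity; a real system would use a finite field, and the seed and sizes
here are arbitrary:

```python
# Functional repair of a (4, 2) code over four half-packets, by random
# linear combinations. Each node stores 2 of 4 "half-packet" equations;
# a repair downloads one combination (half a packet) from each of the
# three survivors: 1.5 packets of traffic instead of 2.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
nodes = [rng.standard_normal((2, 4)) for _ in range(4)]   # coefficient rows

def is_mds(ns):
    # any two nodes together must hold 4 independent equations
    return all(np.linalg.matrix_rank(np.vstack([ns[i], ns[j]])) == 4
               for i, j in combinations(range(len(ns)), 2))

assert is_mds(nodes)
sent = [rng.standard_normal(2) @ nodes[i] for i in (1, 2, 3)]  # node 0 died
newcomer = rng.standard_normal((2, 3)) @ np.vstack(sent)
assert is_mds([newcomer] + nodes[1:])                     # still a (4,2) MDS
```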
One question is instead of cutting it in two packets, why don't we cut it into 100
packets and send around 33 from each. And the total communication would not
be 1.5. The total communication would be approaching one megabyte as I cut
smaller and smaller. Well, if you do that, however, you can prove that this
packet, these linear equations here will not be good. They will not be in general
position compared to this. So they will not form a good MDS code. So you can
only reduce your communication to 1.5 in this example. So let me try to show
you why this is the case. Okay. So this is called information flow graph.
I take every storage node and I make two copies and I connect them with one
edge that is the capacity of the storage node. And here I put a source, a virtual
source where all the data began. This is the newcomer, and the communication
is going to be beta. This is to be minimized to repair this failure. Now, I want any
two disks to contain enough information to recover the data. So I claim that if the
minimum cut separating this source from this data collector is smaller than the file
size, which is two megabytes in my example, it is information theoretically
impossible to get the data here.
So if you compute the minimum cut in this example, it is one, from cutting this edge,
plus 2 beta, and you find that 1 plus 2 beta has to be at least two
megabytes. So beta has to be at least one-half. So this is how you find that
you could not communicate less than 1.5.
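Spelling out the arithmetic of that cut: it crosses one stored node of capacity
1 megabyte plus two repair edges of capacity beta each, so recovering the
2-megabyte file requires

    1 + 2 * beta >= 2,  that is,  beta >= 1/2,

and the total repair traffic is 3 * beta >= 1.5 megabytes, which is exactly
what the sub-packetized scheme above achieves.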
So now but how do you actually achieve that? Is it possible to achieve -- this is
just -- this is called a cut set lower bound. Okay. The problem is why is it not
achievable, because it's not only this pair of nodes that could fail. You want any
pair of two nodes that fail to allow you to construct the data. So you have to
somehow simultaneously serve all the possible 4-choose-2 pairs that could appear.
And here, after a lot of pain in PowerPoint, I added all the possible 4-choose-2 pairs
and you want a code that simultaneously sends the data to all of these guys.
And how can we hope to do that? This is called multi-casting in information
theory. And the main message is that repairing a code is equivalent to
multi-casting on this graph that you can construct.
And the breakthrough results of network coding, originally by Ahlswede et al.,
Koetter and Medard, and Tracey Ho and coauthors, showed basically that if
the minimum of the min cuts, so the minimum from the source to each one of
these data collectors is sufficient, then there exists a code that sends the
information to all of them simultaneously. This is highly nontrivial, right?
Because I'm serving many people at the same time. And I'm serving every one
at the rate of the poorest.
And further, you can achieve that with a linear code. And further you don't even
have to think too much about it. You can just make linear combinations of
everything arriving. And with high probability all these guys will get linear
equations that will be solvable as long as you are not trying to communicate
more than the min cut.
So is it clear what I'm saying here? So the only thing we have to worry about, if
we're trying to repair our code, is basically what is the right amount of
information, and just random linear combinations will suffice.
And you can -- the thing you have to evaluate is the minimum cuts on these
graphs that are formed by nodes failing. So if you have an N, K MDS code,
and when a node fails a newcomer communicates beta from each existing node,
you can do the graph theory and find this is the minimum storage and this is the
communication required to repair a failure.
And it's a reduction to a flow problem, but the graph is infinite. I will show
you why the graph is infinite in one second. But before that, if you just plug into
this 14, 7 example that we had for peer-to-peer systems, where repairing a single
failure naively costs you seven megabytes of network traffic, if you evaluate this
bound, you find you can repair with only 1.85. Very large reduction in the
communication required.
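The 1.85 is consistent with the minimum-storage point of the cut-set bound,
assuming the newcomer contacts all d = n - 1 = 13 surviving nodes; a quick
check:

```python
# Minimum-storage regenerating (MSR) point of the cut-set bound:
# store alpha = M/k per node, repair with gamma = (M/k) * d / (d - k + 1).
def msr_point(M, k, d):
    return M / k, (M / k) * d / (d - k + 1)

alpha, gamma = msr_point(M=7.0, k=7, d=13)   # the (14, 7), 7 MB example
print(alpha, gamma)       # 1.0 MB stored, ~1.857 MB moved (the 1.85 quoted)
```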
Of course, there is one key problem that I don't know if you have seen already. I
have not promised you that this box is going to be exactly what is lost. I'm going
to form here something that is a new linear equation that I only promise that any
seven out of these gives you back the data. Right? So this is much different
than just having exactly X4 here. X4 was part of the data. But now I formed a
new parity. So I'm changing my code as I go. So that's one weakness of these
results.
Okay. Now, why do you want -- why is this graph infinite? So in general you
have your code here, any four out of these -- sorry any K. This is general. So
your file has size M. You cut it into K pieces. Each one is M over K in size and
this is alpha, the stored information. So when you have a failure here, you repair
it by having a new node connecting to D existing nodes and communicating beta
bits from each. Now you have repaired this failure but now maybe there's
another failure.
And you repair it again and then maybe there's another failure and you repair it
again. So this graph is, you don't know which failures are going to happen, right?
And these failures could be going on forever. So the graph here is unbounded in
size. And you want to make sure that no matter what, when you connect to any K
nodes throughout this infinite graph you have enough flow.
So you have to compute these parameters so at any given time during the
evolution process there's enough flow on this infinite graph. So you have to find
what is the trade-off between this beta D and alpha. The storage and the repair
communication.
So this is what we did. The punch line is if you give me a little bit more storage,
the repair bandwidth can be greatly reduced. So I'll just give you some numbers
here. If you have a file that's let's say 20 megabytes you cut it into 20 pieces and
you make 25 out of those. So that any 20 out of the 25 give you back your file.
So you can tolerate five disks failing. If you use a Reed-Solomon code, then
each disk will store one megabyte and each failure will cost you 20, because for
repairing one failure you need all the data. If you use what we call minimum
storage regenerating codes, then you will be storing the same, but a repair will
only require 4.8 megabytes. And the trade-off now was the following. Okay, if
you allow me to store a little more, so I'm going to inflate my storage. So each
storage node stores 1.65, then you can compute the cut set bounds and you will
find that the repair bandwidth is reduced to 1.65 also. So I increased my
storage in the system by 60 percent but my repair bandwidth goes down four
times.
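Those numbers match the two extreme points of the same bound (a check, again
assuming the newcomer contacts all d = n - 1 = 24 remaining nodes):

```python
# The (25, 20) example: file M = 20 MB, k = 20, d = 24 helper nodes.
M, k, d = 20.0, 20, 24
msr_alpha, msr_gamma = M / k, (M / k) * d / (d - k + 1)
mbr_alpha = mbr_gamma = (M / k) * 2 * d / (2 * d - k + 1)
print(msr_alpha, msr_gamma)     # 1.0 MB stored, 4.8 MB repair (MSR point)
print(mbr_alpha, mbr_gamma)     # ~1.655 MB stored = repair (MBR point)
# versus Reed-Solomon: 1.0 MB stored, 20 MB downloaded per repair.
```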
So is there --
>>: Do you have any example maybe later of this reconstruction code, is it just a
linear code?
>> Alex Dimakis: These are all linear network codes. The reason they are
network codes is because to construct them, to maintain them, you have to mix
packets. So I will show some examples in a second.
>>: So just random coefficients and elements in a finite field?
>> Alex Dimakis: Yes.
>>: But still a --
>> Alex Dimakis: Yes. The thing is if you do that, you are not going to have
exact repair. You're going to be changing your code as you go. The parities will
be changing. They will be linear equations but they will be different linear
equations. And that is a big problem in practice and the more exciting thing is
how do you actually keep the code fixed.
So I will talk about that in one second.
>>: The goal is to reconstruct the file, right, so why is it a big problem if you can
always reconstruct the file?
>> Alex Dimakis: You can always reconstruct the file from any K. But most of
the time you don't want to -- most of the time you just want to read one thing. So
if you want to read a subset of the file, like you want to read one specific
bit of the file, you don't want to have to get all K, solve these equations, and get all your
data. You could ask for a partial read. Because most of the time you have zero
failures and you just want to read something.
So that was -- that's why it's always good to keep half of the code uncoded.
That's called systematic. So half of the code is the data itself -- yeah.
>>: Can you reduce by increasing the [inaudible] storage [inaudible] can you
reduce the amount of bandwidth to the exact size of the --
>> Alex Dimakis: Well, yes. Yes.
>>: So the minimum --
>> Alex Dimakis: Yes.
>>: So if storage is not an issue then bandwidth can be as [inaudible].
>> Alex Dimakis: Yes. Yes. So there's an interesting point here: at this
operation point you communicate exactly what you store, which is the minimum
possible. You see, because you only -- there's no way you could go below that,
right, because you're storing 1.65. You can always achieve that. At the minimum
bandwidth point these two are equal.
Okay. So there's a trade-off between storage and communication here. And the
question is what are the achievable points? So, okay, you can pose this as a
graph problem. You have this infinite graph. Everybody connects to D nodes and
communicates beta. You want any data collector, so any K nodes, to give you your
data back. So choose these parameters so that everybody here gets
enough flow.
And this is the general theorem; I will spare you the details. The general
idea is, if you have an N, K code, you store alpha bits, you connect to D nodes
and download D times beta. So beta from each. D times beta is the gamma.
That's the total communication.
Then there is the crazy formula that describes the region, the trade-off region
between communication and storage. And if I just plot it, it's much easier. So
this is the region. This is how much you store per node. This is how much you
repair totally, total communication. This point here is called the minimum storage
point. So minimum storage regenerating codes stands for MSR. This is no
coincidence, because we did half of this while I was here at Microsoft Research.
This is Microsoft -- no, minimum storage point. This is the minimum bandwidth
point. So there's a trade-off between the two. Everything above this blue
line is achievable with random linear network coding and everything below is
information theoretically impossible by a cut set bound, like the one I showed.
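The bound behind the blue line is that the file size M must satisfy
M <= sum over i = 0..k-1 of min(alpha, (d - i) * beta); here is a sketch of
tracing the curve numerically from that inequality (the sampled beta values
are arbitrary):

```python
# Trace the storage/repair trade-off: for each repair parameter beta,
# find the smallest per-node storage alpha satisfying the cut-set bound.
def min_alpha(M, k, d, beta):
    lo, hi = 0.0, M                       # bisection on alpha
    for _ in range(60):
        mid = (lo + hi) / 2
        flow = sum(min(mid, (d - i) * beta) for i in range(k))
        lo, hi = (mid, hi) if flow < M else (lo, mid)
    return hi

M, n, k = 20.0, 25, 20
d = n - 1
for beta in (0.2, 0.12, 0.08, 0.07):      # repair traffic gamma = d * beta
    print(round(d * beta, 2), round(min_alpha(M, k, d, beta), 3))
# beta = 0.2 is the minimum-storage point (alpha = 1, gamma = 4.8); pushing
# beta down toward 2/29 approaches the minimum-bandwidth point (~1.655 both).
```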
Okay. So this is all good. And this is going to appear. But there's one very
important problem. So we characterize this region but only if you're changing
your code. So if you talk to a practitioner, they will say, okay, that's all nice but I
actually want to repair exactly what I lost.
So this looks like a trivial extension initially, right? So you say, okay, now I lost
this guy and here I want to create a packet that is not just any linear combination
that's in general position. But I want to repair exactly what I lost. And I have
these bounds that were the cut set bounds, and now I want to ask, okay, this is a
strictly harder problem. Can I achieve the same cut set bounds? This was the
open problem.
And this is a very, very hard problem. Because as I said before, when we reduce
this problem to a network coding problem, you only have these data collector
guys who want all the data. Now, when all your clients want all the data, this is
called a multi-casting problem, and all multi-casting problems have been
characterized, and they're easy to characterize. You only have to serve the
poorest and everybody else will get the information.
However, now when we have exact repair, we have these intermediate guys who
want data themselves. They want specific linear combinations. So that
problem -- so this is the picture, repair is multi-casting, but exact repair is
multi-casting with intermediate nodes having requests, and the requests are
overlapping. So different people might want stuff that overlaps.
And therefore the cut set region, the region I showed you before, the blue line,
might not be achievable, linear codes might not suffice, and we don't know,
basically -- it's a very, very difficult problem. If you have network coding with
multiple sources, we have crazy examples of linear codes not
sufficing, and we don't know how to characterize that region. So this was a very
difficult problem and it was open for a few years. Let me tell you the story. So in
general the question is, this is a blue region. What can you achieve with exact
repair? So the two points that we will mainly focus on are this point and this point.
There are no results as far as I know for any intermediate points. The
intermediate region is open. But let's only look at these two points. These two
points have received some work. The minimum storage point here and the
minimum bandwidth point. So starting with Leong Ho: when I was an intern
here, we had this paper that said, if K is equal to 2, so you
separate the original file in two pieces, then systematic repair is
exact repair, it's the same thing. You can match the cut set bound if K equals 2.
We had constructions, codes, linear codes that achieved the cut set bound for
this case.
And then when I was a post-doc at Caltech, we were trying to find 5, 3; this was
the smallest case that was open. And Dan Cullina, an undergraduate at the time,
ran a huge computer search over all possible codes, and there were some
optimizations to make this feasible, and we could find some codes that were
5, 3, exact. So they were matching the cut set bound for this case.
And then there were some results by these two groups that showed that if the
rate of the code is less than one-half, then exact repair can match the cut set
bound. So this generalizes this result but not necessarily this one.
But they have a very specific code. So your error-correcting code has to be
constructed in a very special way from these Cauchy matrices, and
then they show you can actually repair these codes exactly, as long as your
rate is below one-half, and you can match the point on the blue curve, the cut set
bound. Is it clear what I'm saying? Okay.
One obvious question is, for high rates, what can you do? So Cadambe and
coauthors independently showed that you can actually approach -- approach, not
match -- the cut set bound for any N, K, for the minimum storage point, using this
technique that's called the symbol extension technique. That's quite
remarkable because this technique was developed for an entirely different
problem, the interference channel in wireless. Nothing to do with network coding,
nothing to do with wired networks. It's a problem over the reals where wireless
channels interfere.
The exact same technique can be applied, and they show you can approach the
cut set bound for all N and K. This is a quite remarkable result. The problem is
that it requires an enormous field size and enormous packetization. Remember I
was cutting every packet into two. Now you have to cut the packet into 10,000 or
billions of pieces; it's exponential in N and K.
So but it shows that it's possible. It can be done. So linear codes suffice to
approach the cut set region for exact repair, for the whole range of parameters,
for the minimum storage point. So this is one point where we know we can
approach it.
Okay. So now I want to give you -- how much time do I have? I should have
quite a lot of time?
>>: Half an hour.
>> Alex Dimakis: Half an hour. Good. So since most people are still awake, I'm
going to give you my 5-minute -- 10-minute crash course on interference
alignment and how it's possible to achieve these results.
Okay. So what's happening here? Imagine I give you three linear equations in
four variables. Okay. In general, if I give you three equations in four variables, in
four unknowns, you cannot solve for any of the variables, right? You would hope
to solve for three of them, and if the equations were trivial, if the equations were
A1 is 5 and A2 is 11 and A3 is 12, then three equations and three unknowns,
you get them. But now I have three equations, four unknowns. In general, you
cannot recover anything from them; the only thing you learn is that they lie on
a subspace. Well, let's look at these three equations, three equations and four
unknowns. However, as you can probably see, I can use these three equations.
So this equation says B1 plus B2 is Y3. I can subtract this equation from
these two and get two linear equations in A1 and A2, and I can solve for A1 and
A2. So this is three equations in four unknowns, but I can solve for two of them.
Why? Because these coefficients here and these coefficients here are aligned.
That means basically that the rank of this matrix of interference coefficients,
which is all ones, is only 1. And therefore I can take this equation, cancel the
interference, and get two linear equations in the two things I actually want and
recover them. Do you see what I'm trying to say here?
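A numerical version of that toy example; the coefficients multiplying A1 and A2
are made up, and what matters is that the interference enters every equation
only through the aligned combination B1 + B2:

```python
# Toy alignment example: 3 equations, 4 unknowns (a1, a2, b1, b2), but the
# interference (b1, b2) appears only through the aligned combination b1+b2,
# so a1 and a2 are solvable.
import numpy as np

a1, a2, b1, b2 = 5.0, -2.0, 7.0, 3.0
y1 = 1*a1 + 2*a2 + (b1 + b2)      # interference coefficients (1, 1)
y2 = 3*a1 + 1*a2 + (b1 + b2)      # same (1, 1) direction: aligned
y3 = b1 + b2                      # interference observed directly

# Cancel the interference, then solve 2 equations in 2 unknowns.
A = np.array([[1.0, 2.0], [3.0, 1.0]])
sol = np.linalg.solve(A, np.array([y1 - y3, y2 - y3]))
assert np.allclose(sol, [a1, a2])
```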
This is, of course, something you can do in high school. The difficult thing is how
do you do many of these alignments at the same time. So how do we form good
codes that have these crazy alignments, a lot of them at the same time? This is
really the question.
So here is my example of a 4, 2 code that is exactly repairable. This is what you
asked about before.
So, first of all, observe this is a systematic code. The first two nodes are the
data themselves. These are linear equations. A box now, a node, stores two
linear equations. Observe that any two nodes contain four equations that you
can solve for your four variables. So this is indeed the 4, 2 code. Any two boxes
give you back your data. Now, you lose one.
What am I allowed to do? I'm allowed to send one linear equation from each of
these guys to this -- this is the newcomer. The newcomer wants to solve for X1,
X2. What I'm allowed to choose is my coefficients here. These are called
repair coefficients. So, for example, I can do 1 times X3 plus 1 times X4 and I
form this linear equation. The size of this is half a meg. I can choose these
coefficients here again and form another equation that has half a meg and
another equation here. I can choose any of these the way I like. This is my
choice. What is this? These are three linear equations in four variables. There
is no way I can make these equations contain only three variables, because this
has to be a good code. So there's no way I can choose my coefficients to only
have the stuff that this guy wants. This guy wants X1 and X2 here. He really
doesn't want X3 and X4. That's why this is red. So X3 and X4 are interference
to our friend here.
But the point is, I can choose 1, 1 here, 1, 1 here, and 2 inverse and 3 inverse
here, so that the interference part is the same here and here. And then, using this
equation -- this is exactly the same situation as the one I had before -- I can
cancel this stuff out, and this guy now has two equations in the two things he
wants.
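Here is a sketch of this example end to end, with the 1, 1 and 1, 1 and
2-inverse, 3-inverse repair coefficients mentioned above (real arithmetic for
readability; an actual construction lives in a finite field, and this
particular pair of parities is one illustrative choice):

```python
# An exactly repairable (4, 2) code on half-packets x1..x4. Nodes 1 and 2
# are systematic; nodes 3 and 4 hold parities chosen so any two nodes
# give four independent equations.
import numpy as np

x = np.array([5.0, -2.0, 7.0, 3.0])                 # x1, x2, x3, x4
node2 = np.array([[0, 0, 1, 0], [0, 0, 0, 1.0]])    # stores x3, x4
node3 = np.array([[1, 0, 1, 0], [0, 1, 0, 1.0]])    # x1+x3, x2+x4
node4 = np.array([[1, 0, 2, 0], [0, 1, 0, 3.0]])    # x1+2x3, x2+3x4

# Node 1 (x1, x2) fails. Each survivor sends ONE half-packet combination;
# the repair coefficients make the x3, x4 interference align to x3 + x4:
s2 = np.array([1, 1]) @ node2 @ x          # x3 + x4          (pure interference)
s3 = np.array([1, 1]) @ node3 @ x          # x1 + x2 + (x3 + x4)
s4 = np.array([1/2, 1/3]) @ node4 @ x      # x1/2 + x2/3 + (x3 + x4)

# Cancel the aligned interference and solve two equations in x1, x2.
A = np.array([[1.0, 1.0], [1/2, 1/3]])
x1, x2 = np.linalg.solve(A, np.array([s3 - s2, s4 - s2]))
assert np.allclose([x1, x2], x[:2])        # exact repair, 1.5 packets moved
```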
So is that example clear? Okay. So the interesting thing is that this code, if you
lose the first node you can do this. If you lose the second node you can choose
again different coefficients to repair exactly. Now if you lose this parity, now you
have to recover this specific parity, right? Well, again, you can choose the
coefficients so you can solve for these linear equations and for these two. So this
code is a 4, 2 exactly repairable code and it matches the 1.5 bound. So this was
the construction that we had before. But this is not a generalizable construction.
Okay. So then there was the symbol extension idea showing how you
could actually generalize this. So first of all, I want to go from this
example into matrices. How am I doing that? The first row is X1, X2.
Imagine multiplying by a vector here that's X1, X2, X3, X4. This is the
first equation, this is the second equation, and the third equation here is X1
plus X3. X1 plus X3 is the third row here. This is my code. You can
represent the code by the coefficients in this form. The repair coefficients are
these things that are sitting here.
What is happening in the previous example is that this is the interference part.
This is the X3 plus X4, and the key is that this matrix has low rank. This
sub-matrix here has rank 1, and I get this extra equation on X3 and X4, and I can
cancel these two and get a full rank matrix on the part I want.
This is really what happened before. Let's look at it more abstractly now. This is
a systematic part. These are the diagonal ones. Now, this is the general code. I
can choose any code I want. And I can choose any repair coefficients I want. In
fact, I chose different repair coefficients here and here. But the symbol extension
scheme uses the same coefficients. So I'm going to restrict my freedom and
choose the same coefficients here and choose the same coefficients for all the
systematic blocks so that's one assumption.
I'm going to form all these matrices. And what is my goal here? So choose the
same V prime for all systematic, the same V for all nonsystematic. And they also
chose the matrices to be IID diagonal. This is their choice. They can do
whatever they want with the matrices. So they chose the matrices to be IID
diagonal. What's the requirement? The requirement is that all these things here
are contained in this small matrix so I can cancel. And at the same time all this
stuff here is full rank. So I can get the stuff I actually want.
All right. Now, how am I going to do that? That's actually a very difficult problem
when there are many matrices. All right. So I say we want this full rank and we
want these vectors to be in the span of this. So we have to choose V
and V prime, because we already chose these matrices A. In general, I could
choose the matrices A and the Vs jointly, but those are quadratic equations that I cannot
solve. So they fix A, and now I have to choose the Vs. Okay. So this is my
one-slide crash course on this symbol extension thing.
Let's start by choosing V prime here. Let's start by choosing V to be one vector.
Sorry not V prime, V. Let's say this was one vector, only one vector. If this was
one vector what do I need? I need V prime to contain V times this matrix and V
times this matrix. Okay. That's two extra vectors. If I chose V to be this, then V
prime would have to be these three vectors, right? The blue here is the extra
stuff. The suboptimal stuff. Ideally if I was operating at the cut set bound there
would be no blue stuff. But now when I start from one vector after mapping it I
get three. That's a lot of overhead. It's a huge overhead. But now I'm going to
do the following thing. I'm going to say now I'm going to take this and pretend
this was V. So if I -- so I call this folding V prime back into V. So if
V was these three vectors now, well, now I would have to make sure
that every vector multiplied by A32 and A42 would stay invariant. Now, I
multiply again by the As and I get six vectors. So observe now I have some
overlap because W times A is already in here, and W times A squared is not, but
in the next step I'm going to fold it back in. I'm going to have more overlap. So
this is how this construction is working. And if you keep -- now you pretend this
is V and you see what would V prime have to be. And you keep on doing that.
You will see that the blue stuff, the extra stuff I have to cancel, is actually
a vanishing fraction of the whole. So almost everything is aligned, and the part
that is not aligned is vanishing as I keep on doing this process. Now, why is that a
problem? What is the problem with that? The problem is that until this becomes
very small I have to use a very large number of equations.
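The counting behind that statement, sketched for the simplest case of two
diagonal, hence commuting, repair matrices:

```python
# Symbol extension bookkeeping: the vectors w * A1^i * A2^j with
# 0 <= i, j <= m span the "aligned" set V_m of size (m+1)^2; applying
# A1 or A2 maps V_m inside V_{m+1}, so the misaligned overhead fraction
# ((m+2)^2 - (m+1)^2) / (m+2)^2 vanishes -- but only as the number of
# sub-symbols (m+1)^2 blows up.
for m in (1, 3, 10, 100, 1000):
    inner, outer = (m + 1) ** 2, (m + 2) ** 2
    print(m, inner, round((outer - inner) / outer, 4))
```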
So this is the general statement that says if you use this idea of folding your
equations back again and again, you get perfect alignment. But to get close to
perfect -- you actually never get perfect alignment. You get super close to
perfect alignment. To get super close alignment you need an exponential
number of foldings. Can you do it better? I don't know. I think nobody knows.
It's the million-dollar question how can you do this without extending your field so
much, without cutting it into so many small vectors. Okay. But this shows that
linear codes suffice to approach the cut set region for exact repair -- for the
minimum storage point, of course. For the other points in the trade-off we don't
know. It's only for one more point that we do know. But for the region we don't.
And the key question is, can you do it with a small field and small sub-packetization? Okay.
So some other new results -- so this technique was done for the wireless
interference channel, right? It's a completely different problem. So a
surprising result we had is the following: if you give me a code, I choose the
repair coefficients that reduce the repair communication over a field. This is a
computational problem. If you give me a fixed code, this is a computational
problem. It's NP hard. It's a rank minimization problem over a finite field. This is
another problem. You give me channel matrices in the wireless interference
channel, and I have to choose the beamforming matrices that maximize the
degrees of freedom of that wireless interference channel. So one result that we
established recently with my student is that both of these problems are basically
the same problem. Basically if you had the box that could solve one, you could
use the box to solve the other. But, of course, one is over a finite field and the other
over the reals. But the problem is essentially the same.
It's minimizing the rank of some matrices, subject to the rank of some other
matrices being full. Like I think you can see that for the repair case. For the
interference channel, it's another story, but it's very simple. So these problems
are connected.
And there are other problems that can be put into that framework. So, for
example, you can think of security and secrecy problems where I want to
communicate to some guys, and I want other guys to not get anything. Again,
you can pose that as a problem: I want full rank at this receiver and I want 0 or
rank 1 at the bad guys. One example is the multi-access
channel with eavesdroppers, and using the interference technique we can
find the degrees of freedom for that.
So recently also this was applied for the problem of multiple unicast in network
coding. Probably the most important problem when you have multiple single
source single destination pairs. Again you can apply interference alignment
techniques and you can show in some cases very good performance. And
another family of problems is, of course: I assumed my topology was fixed
and everybody was distance one from everybody else and I was counting bits. But
instead maybe there are cheap bits and there are expensive bits.
There's some people who are close by and some people who are far away. And
maybe you want to get more bits from some people and fewer from
others. So what's the right repair if you have a given topology with costs? That
is one open problem. There was one paper by [inaudible] Li and his group at
Infocom recently on that, on repairing over trees, and I'm going to talk about
allocations now.
How much time do I have? Maybe 15 or 10?
>>: 10 or 15.
>> Alex Dimakis: So before I move to that, are there any questions on repair
problems before I move to a slightly different -- yes.
>>: Is there not a lot of difference between the reals and finite fields? Like, suppose -- as
far as the repair problem over the reals, there are some, like, finite number of bits.
>> Alex Dimakis: Yeah.
>>: Packetization, do you think it automatically -- because you have the
constraint.
>> Alex Dimakis: If I had a box that could solve both. If I had a box that could
minimize the rank over both, something like that.
>>: Still the question, the box is the same.
>> Alex Dimakis: So we don't know of any scheme that works in one and does
not work in the other. Of course, I would expect if you limit my field size, if you
limit my field size to be binary, then of course I have much more restricted -- so I
mean we don't even know the repair bandwidth for binary. Even for functional
repair. Bounding the field size is also a very difficult problem. But all the
techniques that work for the reals so far work for -- so I don't know. Any other
questions? Okay. Let me move on to this allocation and there will be more.
All right. So this is just a motivating slide that says that everybody is watching
videos on their iPhones and that's a huge problem because 3G cannot tolerate
that. And what are we going to do, right? And you can put more antennas, but
this is not going to scale in the right way.
And the key approach, I think, is to do delivery of content with opportunistic
contacts. So use some idea like femtocells or even device-to-device
communication to cache the content and deliver the content in a device-to-device
way rather than getting it from the server, using 3G.
Okay. What is the point of this slide? Basically the video you want to watch is
very likely to be downloaded by someone nearby in the near future or past. I
claim this. This is not always true. Depends where you are. But in many cases
it is true. This is one of those plots that shows, you know, 10 percent of the content is
responsible for 90 percent of the traffic on YouTube and everywhere else. The
other interesting thing is that storage is increasing more than anything else.
Storage in phones is increasing more than anything else and storage in boxes is
increasing. So you can have a lot of storage. So the idea is can you do
distributed storage of the popular content and deliver it in a device-to-device,
localized way. A lot of problems here.
So let me tell you a few of them. So again, of course, you might want to use
coding instead of replication. You might want to code across the content rather
than store the content in different storage nodes. You have again the problem of
maintaining the code, and all this regenerating code stuff is relevant here. But
there's also many other problems. So you would like nodes to cache different
content in a distributed way. You don't want everybody to cache the latest Lady
Gaga video while other content is nowhere to be found. So we have
to coordinate about what we cache and we have to find a way to cache
the popular content but in a somewhat balanced way. So which content to cache
is one question. How much to store, this is the most basic question, how much
to store on each of the storage nodes. How to find who has the stuff I want is
another question. And how do you give incentives maybe to people to donate
their storage and their resources, maybe you know in a [inaudible] way you will
get faster.
>>: Going to be over WIFI?
>> Alex Dimakis: You could do it over WIFI. You could actually do it over 3G.
>>: Same bottleneck.
>> Alex Dimakis: No, device-to-device over 3G. That would require -- it's
technologically very feasible. It's not done in current technology. But so you
could talk -- yeah, over different models. The key point is you want to limit your
power. So you want to go to the model where I talk very quietly in a very small
radius. So any technology that will allow that.
So I want to -- so these are very relevant problems for this. I'm going to just
mention the most trivial one. I'm going to mention a trivial problem. How much
to store. So, for example, I have two files and I want to store them in five storage
nodes. One thing you could do is say I'm going to store the first file in
the first two nodes and the second file in the next two. And then somebody
is going to drive by and with some probability access each one of these. That's
one storage scheme. But you could take the first file, cut it into pieces and code.
So any two out of the orange give you the yellow file. Any two out of these give
you the blue file. And then store this.
Now, any two storage nodes contain both files. So this is again strictly better
storage than using replication. So you might want to code across your storage
devices and get better access to your content. This is what I want to say here.
Okay. But you could also change the allocation. So maybe you store both files
at the first guy, both files at the second guy and you leave these guys empty. By
empty I mean you store other stuff there. This is a different allocation. Was
there a question? Okay. So this is a different allocation. So which allocation is
better? You say, okay, this one is better, obviously. But that's not clear. So I'm
going to make an even more trivial problem. My most trivial problem is I have
one file to store and each one of my storage devices is going to be like a bucket.
And each of my buckets is going to be killed with some probability independently.
And the same probability. So every device will be crashed with probability .1. I
have a fixed redundancy of two liters of water, say, and I have the code, so even
one liter out of the two gives me back my file. What's the best way to allocate my
redundancy in these five symmetric storage devices? It's like the most basic thing
in the world. I have five minutes, maybe? Basically. Give or take.
All right. So this is the most basic thing in the world, right? Okay. So let me give
you an example. So you have your fixed storage budget. So you're going to
allow two units of storage. So one thing you could do is this. 1, 1 and leave the
others empty for another file. We call this minimum spreading. The other
extreme is maximal spreading: spread your budget equally over all of them. Or maybe
you could do something in between: one-half, one-half, one-half, one-half and empty.
So when somebody told me this problem a while ago, I said, okay, everybody
knows that maximal spreading is the best thing to do. That's the most reliable. Turns out it is not.
And then, okay, maybe that's only if the probabilities are different. No: even
if the probabilities are the same and everybody fails independently, this is
not always the most reliable thing.
Okay. Why? You just look at -- you just play with some examples. What's the
problem? Maximize the probability that the sum of the Xi's that survive -- so with
survival indicators -- is at least 1. So if it's at least 1, my code suffices to
recover the data. If it's smaller than 1, even if it's .99, I get nothing, because from
a code, unless you do something clever, you will get nothing. Subject to the total
storage being less than your budget.
And of course you can generalize to different failure models. This is nonconvex
and harder than it looks. This is what I want to tell you. Even this very basic
allocation problem we don't know how to solve.
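Stated as code, with a brute-force evaluator over all survival patterns (a
sketch; the survival probability and the example allocations are illustrative):

```python
# Allocation problem: choose x_1..x_n with sum(x_i) <= T to maximize the
# probability that the surviving x_i sum to at least 1 (one file's worth),
# when each node independently survives with probability p.
from itertools import product

def success_prob(x, p):
    return sum(p ** sum(alive) * (1 - p) ** (len(x) - sum(alive))
               for alive in product([0, 1], repeat=len(x))
               if sum(a * xi for a, xi in zip(alive, x)) >= 1 - 1e-9)

p = 0.9                                    # budget T = 2 in all three cases
print(success_prob([1, 1, 0, 0, 0], p))    # minimum spreading: 0.99
print(success_prob([0.5] * 4 + [0], p))    # four halves:       0.9963
print(success_prob([0.4] * 5, p))          # maximal spreading: 0.99144
```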
Why? Okay. So first claim: symmetric allocations can be suboptimal. What do
I mean? Let's say you give me five storage nodes and you give me this budget,
12 over 5. That's my total.
You can prove that this crazy allocation that is not symmetric, so it's 3 over 5, 3
over 5, 2 over 5, 2 over 5, 2 over 5, is better than anything else. This is the best
way to allocate information.
If you restrict yourself to symmetric allocations, which means they are all the
same or 0, the best symmetric allocation is not to store evenly on all the nodes
but to store evenly on four and leave the fifth guy empty. Why? Because two
nodes now contain enough information to get one unit whereas if you were
spreading evenly over all of them you would need three nodes to get one. So
finding the optimal allocation is very difficult. We don't know how to do it. Finding
even the optimal symmetric allocation is nontrivial. Of course you can check, but
we don't have a closed form for even the best symmetric allocation. This
problem has been discussed, it was discussed at Berkeley by several people for
a while, and in general it is open.
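You can check the claim by enumeration; a sketch (the comparison is at one
particular survival probability p, here 0.7):

```python
# The budget-12/5 example on five nodes: the asymmetric allocation beats
# the symmetric candidates, checked by enumerating all 2^5 survival patterns.
from itertools import product

def success_prob(x, p):
    return sum(p ** sum(alive) * (1 - p) ** (len(x) - sum(alive))
               for alive in product([0, 1], repeat=len(x))
               if sum(a * xi for a, xi in zip(alive, x)) >= 1 - 1e-9)

p = 0.7
print(success_prob([3/5, 3/5, 2/5, 2/5, 2/5], p))   # ~0.9295 (asymmetric)
print(success_prob([3/5, 3/5, 3/5, 3/5, 0], p))     # ~0.9163 (even on four)
print(success_prob([12/25] * 5, p))                 # ~0.8369 (even on five)
```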
Do you understand the problem? Is the problem clear at all? So I can tell you
what few intermediate results we had on this. Okay. So for the IID model we
proved the following thing. Maximal spreading, which is the intuitive
thing to do -- spread the eggs as much as you can -- is not
optimal, but it is asymptotically at 0 gap from optimal. So the gap from an upper
bound vanishes, if you're in the regime where TP is greater than one. T is the
budget. P is the probability of success. TP is the expected amount of bits that
will survive in the system. Under this condition, this is an allocation
that approaches optimality. This is the only result we have right now. We
have other results for symmetric allocations, but for the best allocation, it's quite
challenging. So this will appear in Globecom. So even for storing one thing,
what's the best way to allocate is highly nontrivial. That's what I wanted to say.
Other problems: repair problems are, of course, very difficult under errors. So if
somebody introduces incorrect linear
equations into your code, then these equations will poison everything else
afterwards. That's a big problem with combining stuff.
If some of your equations are wrong, even after the guy leaves and you repair, this
other guy gets poisoned equations. Your whole system will get poisoned,
even if one linear equation is incorrect. How do you deal with that? You need
codes that can tolerate errors. How do we repair codes so that they tolerate errors,
or how do we use some hashes, some signature schemes? All these are very
interesting questions.
So I will conclude. Before I conclude, I actually maintain a wiki of the
bibliography on storage stuff. So if anyone is interested you can go there and
see there's like a lot of literature for the repair problem and the allocation problem
and a few other things, so if anyone is interested, you can find it on my page.
Okay. A few open problems. Are the cut set bounds tight? We don't know; we only
know for a few points. And what practical codes can achieve them? This is of
course a relevant problem. What's the limit of interference alignment techniques
is a very fascinating question for network coding. I think actually interference
alignment is more useful for network coding than wireless, because for wireless
you have to assume you know the channels perfectly, whereas for network
coding you actually can do what you want: you design the problem.
Repairing codes in small fields, as we discussed, is very tricky and interesting.
Repairing existing codes that people have already deployed is a very relevant
problem. So, for example, the B-code and EVENODD are codes used in
RAID systems; how do you repair those given codes? And we have some prior,
some preliminary work on that.
How do you deal with bit errors and with security? There's a paper on security that
appeared at ISIT. And finally, what's the role of nontrivial network topologies,
as I mentioned? And the last one, you know, allocations: the case with multiple
objects is the real problem, and it's highly nontrivial, because even for one object we
don't know what the optimal thing is.
So I think I'll stop here and any questions would be welcome. [applause].
>>: The interference alignment seemed to, I mean, there must be some special
structure of the problem of the exact repair that makes it possible to find, to
achieve the same capacity region. Because in light of the other results of
Dougherty and Zeger and so forth, in general it seems that you can't do as well. So
do you have any sense of what that special structure is, and what kinds of
networks these tricks will work on?
>> Alex Dimakis: It's a very good question. So far, all the networks that I
know of reduce to these rank minimizations subject to full rank constraints. So I
can write repair as an optimization problem of choosing these repair
coefficients to minimize the rank of some matrices -- say, the sum of the ranks --
subject to some other matrices being full rank. And interference,
for the wireless interference channel, I can write again in the same fashion.
Now, I cannot write every network coding problem in that fashion. So this is --
but it is a fairly general framework. I do not know if -- so there are many techniques
for interference alignment. But the symbol extension is the technique that
actually achieves the cut set bound asymptotically. For example, we don't know
if this technique achieves the cut set bound for the intermediate points because
we have shown it achieves it for the minimum storage point. So for the intermediate
points, for example, it is unclear -- so I mean the minimum storage point is a very
interesting point because it corresponds to MDS codes. So it corresponds to
people using Reed-Solomon. So it's exactly the same point. I don't know what is
the magical structure of the property that allows it. But the fact that it was used
for multi unicasts also shows me that it's very -- it's not just interference and the
repair. It seems to be more general.
>>: On the topic -- very good work and an interesting talk. But the
spreading problem which you discussed later, have you considered that with
interference alignment, and any intuitions on that?
>> Alex Dimakis: No, because I have not -- you mean the allocation problems?
Yeah, no, I have not tried to apply any. Well, one interesting question there is if
someone gets a smaller subset of the equations, so, for example, there are some
users that get K equations and some users get, let's say, K over 2. In general, if
you have an MDS code, any K will give you all the data. But K over 2 will give
you nothing. The reason the allocation problem is difficult is that if you get .99
of the water you get 0 useful information.
>>: I think here is this -- I mean basically this allocation problem generally
targeted data survival. So basically let's say I have data I want to store in the
systems, I want to make sure that the whole copy of the data is safe in this, right?
But the systematic property of the code is important for data retrieval. So many
times where you want data, you may not want the whole data. Just may want
basically a piece. And there's a systematic [inaudible] that would make the
retrieval much more efficient. So I think these two properties together make a very
desirable system. And ideally you want both.
>> Alex Dimakis: Yes. I see. Basically you're saying that -- that's a very good
point. So the reason the allocation problem is hard is exactly this hard all-or-nothing.
It's this all-or-nothing that makes it hard. Because, the probability
that I get one: if I get one, I get all the data. If I get .99, I get nothing. So that's a
step function that you're trying to optimize. That's why it's so difficult. If I had
a softer function here, so if I said, okay, if you get .99 you still get some utility,
then this problem would become easy. But now the question becomes how do
you design codes that have good graceful degradation. Of course, as you said, a
systematic code, a systematic MDS code, has some graceful degradation. Is that
the best you can do? I don't know, for example. And you could use interference
alignment for that.
>>: Basically my point is for practical reasons. I would rather have a systematic
code.
>> Alex Dimakis: Of course.
>>: Even if suboptimal, rather than an optimal code, but --
>> Alex Dimakis: Actually, I believe that the systematic code is also optimal for
graceful -- in terms of graceful degradation, I don't think you can do anything
better than keeping some bits in the clear and some parities. Practically, of
course, I agree with you. But I think even in theory it's the best. Perhaps you
can use alignment for that, because it's an alignment type of goal. Very good point.
>>: Well, this resource allocation problem you presented today, it is [inaudible],
but have you -- what do you think about using coding that is not matched to that model
but rather [inaudible]? Would that help?
>> Alex Dimakis: Coding makes it like water because -- so I'm thinking of taking the
data and multiplying by a matrix that is random IID, an N by K matrix. Any K
will give me back all my data. So that's why any one liter of water gives me back
my original liter of water. And then of course I'm saying I make the packets
super small so I don't have to worry about the discreteness of the packets.
But as we were saying, if you had some graceful degradation then the problem
would be different, because I don't just want this, I want to maximize some utility
which depends on the surviving amount of water, and then if I had the very
simple objective function here -- maximize the amount of water, just the amount of
water -- then it's trivial. But that's not going to be -- if you had the magical code
where any K gave you the original data, and any K minus 1 gave you a K minus 1 over K
fraction of the data, and any K over 2 gave you half the data, then this would be trivial. But of
course I don't think this is possible. What's the best thing you can achieve with a
code that performs well at any one point or any two points? It's very good -- I
conjecture that the best code is the systematic one. A systematic MDS code:
I don't think you can beat that actually with any other scheme.
>> Philip Chou: All right. Let's thank the speaker again. [applause]