>> Dave Maltz: Hi. It's my pleasure to introduce Hitesh Ballani, who is visiting us here from Cornell. He's a student of Paul Francis. He'll be talking to us today about his work on router table space exhaustion and what can be done to fix that. His other work includes extensive work in Internet measurement, working with DNS and anycast problems, as well as work in top-down management of IT infrastructure and networks. So thank you very much for joining us. And Hitesh, please take it away. >> Hitesh Ballani: Thanks, Dave. Thank you for having me here. Before we jump into the technical details, I wanted to give you a brief overview of the kind of work I do. And I figured what better way than a [inaudible] of my research statement. Now, I'm sure you've all heard this argument in one form or the other that the explosive growth of the Internet over the past decade has meant that many of the original assumptions underlying Internet design no longer hold true. This has led to a plethora of problems, which is good for us, because there's work to be done. However, there's a flip side to the story. The success of the Internet has meant that you can't really go out and say, well, let's take this protocol out and put this new one in and voilà, the problem goes away. Throughout my graduate career I've tried to stay cognizant of this ground truth. And I've really strived to strike a balance between two complete [inaudible] regarding my research. On one hand is the freedom of doing blue sky research, openly questioning Internet design in the face of new needs and challenges. For instance, as Dave mentioned, as part of my dissertation I look at the problem of network management. And I argue that networks today are so difficult to manage because the Internet was not designed with manageability in mind. So what I mean by that is protocols today tend to expose their internal bells and whistles. 
And humans and management applications are sort of expected to understand these low-level parameters in order to be able to manage the network. Hence, we proposed a new network management architecture called Complexity Oblivious Network Management, or CONMan for short. As part of CONMan we precisely defined the kind of information that should be exposed by protocols in order to make them amenable to management. All these protocols were implemented and deployed on a real test bed, and what we found was that the resulting test bed was much easier to configure. When things went wrong you could debug and diagnose the network faults in a structured fashion. So overall it was a lot of fun showing that this complex problem of network management can be tackled through such an invasive approach. Question? >>: Are you going to give us detail later in the talk about how you demonstrate this? >> Hitesh Ballani: On network management? No. The talk is generally about my routing work, but if you're interested I'll be more than happy to speak to you about the network management stuff and my vision there. Okay? On the other hand, I do want my research to have real impact through solutions that are immediately deployable. Now, this may seem to contradict what I said earlier when I claimed that a lot of these network problems arise due to a mismatch between design and use. And I still stand by that. However, what I have found is that in many cases, simply by focusing on a subset of the given problem space, you can come up with a solution that does not require wholesale change. And this is the recurring theme in my work, a trick that I use over and over again. And these are not just point solutions. As I'll explain later, these can actually serve as small incremental steps towards the eventual solution that you would like anyway, so there's a clear progress path between what can be done today and where you'd like to be 10, 15 years from now, which I think is pretty neat. 
To give you a few examples, I have worked on how to deploy IP anycast, which is essentially a network layer service discovery mechanism, in a practical and scalable fashion. So effectively I've built the service, I deployed it, I have seven servers across the Internet, which included BGP peerings with various ISPs, and [inaudible] this test bed for the past four years, and it has also been used by other researchers for active routing experiments. On a similar note, I came up with a minor modification to the caching behavior of DNS servers and showed that this can substantially mitigate the impact of denial-of-service attacks on DNS. Now, this was a very, very trivial idea. But I'm still very proud of it because at the time when these papers were published, a lot of people were proposing novel peer-to-peer architectures for DNS. Now, that's all great work because DNS does suffer from severe problems. What I showed was that if you focus on the specific problem of flooding attacks, you can come up with a solution that does not require infrastructural change. And as it turns out, there is a connection between the simple idea and the other clean-slate proposals that I just mentioned. And finally, there is my more recent work on routing scalability, which is what the rest of this talk is about. So we'll be switching gears here a little bit and moving on to the technical component of the talk. Now, I'm sure many of you must have heard about the growth in the Internet routing table. And you must have seen this graph. And you can see that the growth has been rather steep over the past few years. What is more [inaudible] is that this is probably going to get worse in the near future. This is because as the IPv4 address space runs out, more and more small prefixes will be advertised, resulting in a larger routing table. And if, lo and behold, IPv6 deployment takes off the ground, we could end up with a very bloated routing table. 
Now, this routing table specifies how a given router should forward data packets. And so routers need to maintain this in fast memory as part of something called the forwarding information base, or the FIB for short. And I'll explain how that works later. Hence, a larger routing table means that ISPs need routers with more and more FIB space. Given this, there are two questions that come to mind. The first is: why is the routing table growing so rapidly? And I'll comment on that very briefly. The scalability of the Internet routing system relies on hierarchy, which in turn requires that the addressing of ISPs on the Internet be in line with the actual physical topology. So if you had a nice figure like this, wherein the address space of any given node is a subset of the address space of its parent, you'll get good scalability. For instance, if you look at the routing state at this ISP, it comprises only other top-level ISPs and its immediate children, a small amount of state. However, the growth in the Internet has meant that this nice picture is no longer true. In the figure I've shown how [inaudible], wherein a given site connects to multiple upstream ISPs, can lead to a mismatch between addressing and topology. And there are multiple factors that can lead to this. But the point here is that this mismatch between addressing and topology is the root cause for the rapid growth in the Internet routing table. The second question that I'm often asked is: so what? [inaudible] throw more RAM at the problem. And that's exactly what has been happening. Every few years or so router vendors come out with a new generation of routers that have more memory and more processing power. And I'm not arguing that they can't keep on doing this forever; however, the problem is the scalability properties of this FIB memory. On the technical side, there's a concern about the amount of power these massive chips consume. 
Not to mention the concomitant heat dissipation problems, which are especially relevant in the kinds of locations where these routers are housed. However, off-chip SRAM, which is commonly used for router FIBs, is a low volume component that has been shown to not track Moore's law. So the cost curves are not in our favor. And what I mean by that is a larger routing table reduces the cost effectiveness of ISPs because the price per byte forwarded increases. And of course there's the cost of actually upgrading the router because it ran out of memory. However, in spite of all these reasons, I have to admit that FIB size is a contentious issue. There are people who think FIB size is an important problem, and then there are the non-believers. As a matter of fact, in one of my own papers a few years ago, I had argued that it is technologically feasible to add more state to routers, and I still stand by that. And while my thinking has evolved over time, the stance I'm going to take in this talk is that, given the technical and the business implications of FIB growth, you can't be sure of the criticality of the FIB growth problem given existing data. Looking ahead, there are several very good reasons why you would want a smaller routing table. And of course these are speculative in nature. But what really sealed the deal for me were recent events and discussions with ISPs and operators. And I'll give you a quick example. ISPs in Australia have already started filtering out small prefixes, mostly slash 24s, which is the smallest address block that you can advertise into Internet routing. This is because these ISPs don't want to upgrade their installed router base, which is scary because it implies that parts of the Internet may not have reachability to each other. So hopefully this quick anecdote convinces you that ISPs are concerned about the FIB on their routers, and they're actually willing to undergo some pain to extend the lifetime of the installed router base. 
Which is good for me because there's a specific real problem to solve. Question? >>: The ISPs do that kind of filtering on the slash 24s; does that really mean you have [inaudible] reachability or does it mean you just get inefficient routes? >> Hitesh Ballani: It could mean inefficient routes for the most part, but it could lead to non-reachability, so they're not just filtering out. >>: [inaudible]. >> Hitesh Ballani: I don't think there has been an active study. I have done some random hacks or measurements and it seems that in some cases there's a decent amount of inefficiency because you're not just throwing the slash 24 away, you're assuming there is some super-prefix, such that you have a slash 16 and you're throwing the underlying slash 24s away. But because of [inaudible], in some cases it might be the case that you don't even get reachability. And I've seen a couple of instances of that. But I don't think there has been any systematic measurement. >>: I'm curious. Who is the Australian ISP? >> Hitesh Ballani: This was a Pacnar [phonetic] mailing list, and they had mentioned several Australian ISPs. I don't have specific names. All right? Okay. So this is good because there's a specific problem. So motivated by this, we propose Virtual Aggregation, ViAggre, a configuration-only approach to shrinking the FIB on ISP routers. By configuration-only what I mean is that ViAggre does not require changes to routers and routing protocols, and it can be deployed independently by any ISP on the Internet today. In case you were wondering, I did not come up with that name. The credit or the discredit for that goes to Brad Karp at UCL. Also, it's ViAggre with a V, not with a W as it will sound when I say it, because my Vs and Ws are messed up and it doesn't really come out with my pronunciation. But anyway. This talk focuses mostly on the research component of ViAggre. 
Now, I do want to point out that these features have allowed ViAggre to have real world impact. For instance, there's an ongoing standardization effort in the IETF on ViAggre. This is spearheaded by Huawei, the largest router manufacturer in China, which provides equipment to more than 70 percent of the top telcos. As a matter of fact, Huawei is also implementing ViAggre natively in its routers. And I'll explain later what the advantage of such an approach would be. But the point here is that this possibility of real impact is what is so exciting about all this research, at least from my point of view. The basic idea behind ViAggre is really simple. Today every router needs to maintain the entire routing table. In ViAggre, we allow the ISP to divide the routing table into parts such that individual routers only need to maintain routes to a few of these parts, and so you get a shrinkage in FIB size. And I'll explain that as we go along in the talk. So the way I have set up the rest of the talk is that I'll begin with some basic information on Internet routing, which I'm sure most of you know anyway; it will help me come up with a crisp problem statement, and it will help me place this work in context. So Internet routing. Internet routing is domain based, wherein the domains are independent, autonomous entities. These could be ISPs like AT&T and Sprint and enterprises like Microsoft and [inaudible] and so on. Such domains lend themselves nicely to a two-tiered routing architecture wherein you have intra-domain routing to establish routes within a domain and inter-domain routing to establish routes between domains. On the Internet BGP serves as the de facto inter-domain routing protocol, and BGP is what the rest of this talk is about. Moving from this very high level view to the innards of actual routers, what I've shown here is a router that is connected to two other routers. At the top there's a route processor, which is responsible for the router's [inaudible] tasks. 
This route processor maintains what is effectively a database of routes that it obtains from other routers. This is known as the routing information base, or the RIB for short. And it's generally maintained in slower memory; for instance, DRAM might be used. On the other end we have the line cards, which are responsible for actually sending and receiving the packets. These line cards maintain a table of routes based on which to forward these packets. And as I mentioned earlier, this is known as the forwarding information base or forwarding table or routing table, or FIB for short. Obviously the FIB needs to be accessible at line rates and so it's generally maintained in fast memory; for instance, off-chip SRAM may be used. And here I've shown how packets might come in through one line card and be switched on to another line card. So two important things to note: the RIB resides in slow memory, the FIB resides in fast memory. Given this basic routing information, we can take a look at the scalability problems afflicting Internet routing. As I mentioned earlier, a larger routing table has led to concerns regarding FIB growth. Beyond this, there are concerns about the RIB size problem and routing convergence, and all this has been known for a while, which shows up in the massive amount of work done in this area. And at a very high level, there are approaches that argue for separating edge networks from core networks, there is geographical routing. There's been a whole lot of work, especially in the theory community, on compact routing. There are elimination approaches, and without going into any details at all, the common thing here is that all these approaches require architectural change. I don't mean that in a bad way because I do realize that a lot of these problems arise from the way Internet routing works, and so some change is in order. However, this very need for change has meant that none of these approaches have seen any deployment. 
Frustrated by this lack of deployment, I started wondering if you can do away with the need for change by focusing on a subset of these problems. Guided by this, in ViAggre we tackle the FIB size problem in an incremental fashion. And as it turns out, our current techniques can also help with both RIB size and routing convergence. So again we have this notion of achieving deployability by narrowing down our focus. So in this talk, I'm going to focus mostly on FIB size, and if time permits I'll comment on how ViAggre can be applied in a more invasive fashion to [inaudible] the inter-routing scalability problem space. And with that big picture in mind, we can delve into the ViAggre design. Question? >>: So you showed us how the [inaudible] over time. Can you tell us how the FIB grows compared to that? >> Hitesh Ballani: Say that again, please. >>: How does the FIB grow compared to the growth in the routing table? >> Hitesh Ballani: So the routing table sizes are the [inaudible] the actual FIB. So it's the FIB size, the global FIB of -- there are routers in the default-free zone, which are routers that can't use default routes, and when I said Internet routing table I essentially meant the FIB size in the core of the Internet. I didn't want to use the term FIB at that point in time because I hadn't introduced it. Obviously the RIB can be much bigger because it depends on the number of peers you have and other factors. So that graph was for the FIB in the core of the Internet. Okay? So ViAggre design. In the figure here, I've shown an ISP with three PoPs, or points of presence. These are routing centers in different geographical areas, with a few routers in each PoP. Today, as I mentioned, each of these routers needs to maintain the entire routing table. In ViAggre, we allow the ISP to divide the IPv4 address space into parts. Here I have divided it into four parts, which are color coded. 
Each of these parts can be represented by a slash 2 prefix, which I'm going to refer to as a virtual prefix. Now, I'm assuming that people are aware of the slash notation. Since I have four parts, I only need two bits to represent them, and so only the two highest bits of the IP address are relevant; everything else is masked off, and that's what slash 2 means. Beyond this, the ISP assigns its routers to these virtual prefixes. So the two green routers are assigned to the green virtual prefix, or are aggregation points for the green virtual prefix. What this means is that these routers are responsible for routes to any prefix in the green part of the address space. And you can imagine how this will ensure that on average every router maintains a quarter of the Internet routing table. Of course this assumes that the prefixes are distributed uniformly across the address space, which is obviously not true for the Internet. However, you can choose these divisions such that the distribution of prefixes across them is relatively uniform. Now, this basic idea of dividing the routing table can be achieved in a number of ways. For instance, you could change the routing protocol to do this. However, we wanted a deployable system. And so a key design challenge was how do you achieve this without changes to the routers and without requiring external cooperation? Further, once you divide the routing table, how do packets flow through the ISP's network? Because routers now only have partial information. So in the next part of the talk, I'm going to elaborate on ViAggre design by answering these questions in order. First up, the control plane design. In ViAggre, we need to make sure that the forwarding table on any given router -- question? >>: Yes. So is there only one router per point of presence in this picture? >> Hitesh Ballani: This is a simplified picture and that's what is misleading here. 
So in a typical deployment, and I'll get to that later, a typical PoP has 20, 30 routers, and so you want to make sure that there are two routers, or at least two routers, responsible for every color in each PoP for things like stretch and [inaudible], and I'll get to that in like four or five slides from now. Okay? So in ViAggre, you need to make sure that the forwarding table in any given router contains only routes for the prefixes it is aggregating. So a blue router should only contain blue prefixes, which is a problem when you consider interaction with external ISPs. In this case, the external router is advertising the complete routing table to the blue edge router, even though it's responsible only for blue prefixes. And so we need some mechanism for selectively inserting routes into a given router's forwarding table. We came up with two mechanisms to achieve this. As it turns out, one of them doesn't really work in practice; the performance overhead is too high. So what I'm going to do is explain both the mechanisms at this point in time, and once we go on to the performance evaluation section I'll explain why one works while the other doesn't. The first mechanism, or design one, takes advantage of a router feature that we call FIB suppression. With FIB suppression you can configure a router to only load a subset of its RIB into its FIB, so in this case the blue router gets the entire routing table, loads this in its RIB, and only loads the blue routes into its FIB. The advantage of this scheme is that the ISP does not need to modify its internal routing setup at all. Everything remains the same as today [inaudible] the entire routing table and selectively inserts into its forwarding table. The disadvantage is that there's no RIB shrinkage at all, because every router still needs to maintain the complete routing table in its RIB, which is the slow memory. So that's the disadvantage. Question? >>: [inaudible] change your hardware for this? 
>> Hitesh Ballani: Yes. >>: Is that true? >> Hitesh Ballani: We don't change the hardware or the software for this. As it turns out, existing routers have this capability of selectively putting entries from the RIB into the FIB. Whether that's practical or not, I'll show you later that it turns out to be not practical. That's the catch here. But there is a second approach that does end up working. Okay? >>: [inaudible] I mean the RIB is something external, DRAM, right? >> Hitesh Ballani: Yes. >>: Presumably only a few gigabytes [inaudible], right? >> Hitesh Ballani: Yes. >>: And so we're talking about gigabytes of DRAM that costs hundreds of dollars. So is that a problem? >> Hitesh Ballani: No, the DRAM on the RIB side is not a problem, per se. It turns out, and I'll show this later, that reducing the RIB, which the second design does, not this design, actually helps with internal routing convergence. What ends up happening is that you are sending fewer routes to your routers because the RIB size is smaller, and it helps with internal routing convergence. So the memory cost on the RIB side is not important. I should have clarified that. The second design is slightly more invasive. It offloads the task of maintaining this complete routing table on to machines that are off the data path, that is, machines that don't forward data packets. As it turns out, ISPs already use such machines for scalable internal distribution of BGP routes, something called BGP route reflectors. So without going into any design details of how route reflectors work, in this design the external router, instead of advertising routes to a blue edge router, has a peering with a PoP route reflector, and it is the route reflector that selectively forwards these advertisements to the ISP's routers. So the blue router gets only the blue prefixes, the green router gets only the green prefixes. 
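
An illustrative sketch of the selective advertisement just described, not taken from the talk: it splits the IPv4 space into four slash 2 virtual prefixes and lets a route reflector hand each router only the routes that fall inside the virtual prefix it aggregates. The mapping of colors to particular slash 2 blocks, the router names, and the function names `color_of` and `reflect` are all assumptions made for illustration.

```python
import ipaddress

# Four /2 virtual prefixes covering the IPv4 space. The colors follow the
# talk's figure; which color maps to which /2 is an assumption.
VIRTUAL_PREFIXES = {
    "red": ipaddress.ip_network("0.0.0.0/2"),
    "blue": ipaddress.ip_network("64.0.0.0/2"),
    "green": ipaddress.ip_network("128.0.0.0/2"),
    "orange": ipaddress.ip_network("192.0.0.0/2"),
}


def color_of(prefix):
    """Return the color of the virtual prefix that contains `prefix`."""
    net = ipaddress.ip_network(prefix)
    for color, virtual_prefix in VIRTUAL_PREFIXES.items():
        if net.subnet_of(virtual_prefix):
            return color
    raise ValueError(f"{prefix} is not inside any virtual prefix")


def reflect(routes, router_colors):
    """Hypothetical route reflector: give each router only its own color."""
    advertised = {router: [] for router in router_colors}
    for prefix in routes:
        color = color_of(prefix)
        for router, router_color in router_colors.items():
            if router_color == color:
                advertised[router].append(prefix)
    return advertised


# Routes learned from an external peer (example prefixes only).
full_table = ["10.1.0.0/16", "72.14.0.0/16", "130.1.2.0/24", "203.0.113.0/24"]
per_router = reflect(full_table, {"blue-1": "blue", "green-1": "green"})
```

Here `per_router["blue-1"]` holds only the blue prefix and `per_router["green-1"]` only the green one; the reflector itself still sees the full table, which matches the idea that it sits off the data path and can use slow memory.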
And these routes are propagated to route reflectors in other PoPs and so on. Question? >>: Since you have route reflectors, why don't you run another routing algorithm with the third level of [inaudible] BGP [inaudible]? >> Hitesh Ballani: So a lot of people have proposed things like RCP and 4D, which separate this logic from the actual data plane routers, and you can take advantage of that. If you notice, the route reflector here is similar to a decision element in 4D and IRSCP. And you'll notice that, for instance, RCP was focusing on routing convergence and 4D was focusing on network management. I come up with a solution that is pretty similar and compatible with those models, so you can use ViAggre in the context of those models to achieve a reduction in routing table size. So I'll answer the question once I explain the process; I'll explain why ViAggre fits into the RCP model and what would happen if you had the flexibility of changing routing protocols. Question? Yes. >>: Do [inaudible] how many iBGP configurations do [inaudible]. >> Hitesh Ballani: How many iBGP configurations do these route reflectors have? >>: Yes, how many iBGP configurations inside the [inaudible]. >> Hitesh Ballani: Oh, how many iBGP configurations inside ASs don't use route reflectors? As far as I know, and maybe people here might be able to, as far as I know most tier one ISPs and tier two ISPs use route reflector based configurations, and so we should be safe in that aspect. Okay. >>: So are you saying the route reflector does need [inaudible]. >> Hitesh Ballani: DRAM because -- >>: [inaudible]. >> Hitesh Ballani: Yeah, because it's not on the forwarding path. Yes. Yeah, the only point was that it's not on the forwarding path, so it doesn't need fast memory, it only needs slow memory. Question? >>: [inaudible] ISPs that you use reflectors are not running route reflectors like traditional routers that are in the forwarding path? 
>> Hitesh Ballani: I know at least a couple of ISPs that don't have the route reflectors on their forwarding path. I also know of an ISP where the route reflectors are on the forwarding path. So, yes, that's not completely compatible with our model. >>: [inaudible]. >> Hitesh Ballani: Okay. But that being said, these could be actual PCs. So you could shift away from these machines being on the data path. So, yes, there might be slight changes for some ISPs. Okay. And I guess the disadvantage here is that it is slightly more invasive: because it uses route reflectors, you need to reconfigure your external peerings, which is a drawback because I wanted it to be completely transparent to neighboring ISPs. On the other hand, the advantage here is that apart from shrinking the FIB, you shrink the RIB on all data plane routers. And again the motivation there is not memory but routing convergence, which present evaluations [inaudible]. >>: Could you back up? I'm missing something. >> Hitesh Ballani: Sure. >>: Fundamental here. So you divide the Internet up into a subset of colors. You stick the routes for one particular color in a router. >> Hitesh Ballani: Yes. >>: Okay. Now it's external facing. It gets a packet that's in some other section. What does it do with it? >> Hitesh Ballani: Excellent question, which is my next slide. I'll explain that, because that's the thing I wanted to answer: now routers only have partial information, so how is a packet going to flow? In this case, packets destined to a prefix in the red part of the address space come in to a blue ingress router, and it doesn't know what to do with them. These packets are routed from ingress router I to aggregation point A, which is segment 1, and then from aggregation point A to external router X, which is segment 2. And I'll explain how both these segments work. 
When the packets come in, the blue ingress router doesn't have a route to the destination prefix, and so it somehow needs to know that these packets need to be sent to a close-by aggregation point, which in this case happens to be router A. To achieve this in ViAggre, routers advertise the virtual prefix they are aggregating into the ISP's internal routing. So the red routers advertise the red virtual prefix. And if you look at the forwarding table of the blue router, it contains routes for all the blue prefixes and it contains one entry for every virtual prefix that it is not aggregating. So this is how the red route -- the blue router knows that these packets need to be sent to router A. Once the packets get there, router A has a route to their destination, and so it knows that the packets need to be forwarded to external router X. However, the packets can't be forwarded in the normal hop-by-hop fashion because the routers along the path, the blue, the green, and the orange router, don't have a route to the destination prefix, so they'll probably end up sending the packets back to router A, resulting in a routing loop. To avoid this, router A tunnels the packets; that is, it encapsulates the packets in an extra header such that intermediate routers only need to forward packets destined to external router X, which they can do normally. However, you can't tunnel the packets directly to the external router because that would require cooperation from the neighboring ISP: the external router X is going to get these packets with an extra header and it won't know what to do with them. Hence it is the egress router E that detunnels the packets; that is, it strips off the outer header before forwarding them on to the external router X. And all this can be achieved with standard router configuration. As a matter of fact, those of you doing VPN research might realize that the behavior of the egress router here is similar to the [inaudible] in an MPLS VPN scenario. 
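
A minimal sketch of the two-segment data path described above, under stated assumptions rather than as the talk's implementation: ingress router I holds one entry per virtual prefix it does not aggregate, aggregation point A holds the specific route and tunnels to egress router E, and E maps the tunnel label to external router X without consulting a full table. The prefixes, the label name `label-X`, the extra neighbor `Y`, and the helper names `lookup` and `forward` are hypothetical.

```python
import ipaddress


def lookup(fib, dst):
    """Longest-prefix match over a dict of {prefix string: action}."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, action in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, action)
    if best is None:
        raise KeyError(f"no route to {dst}")
    return best[1]


# Blue ingress router I: a real route for a blue prefix (simplified here to a
# directly attached external neighbor Y), plus one entry for the red virtual
# prefix pointing at aggregation point A.
FIB_I = {
    "72.14.0.0/16": ("external", "Y"),
    "0.0.0.0/2": ("aggregation_point", "A"),
}

# Red aggregation point A: the specific red route, tunnelled to egress E.
# The label tells E which external neighbor to hand the packet to.
FIB_A = {"10.1.0.0/16": ("E", "label-X")}

# Egress router E: a label table only, no per-prefix routes needed.
LABELS_E = {"label-X": "X"}


def forward(dst):
    """Return the sequence of routers a packet to `dst` visits."""
    hops = ["I"]
    kind, target = lookup(FIB_I, dst)
    if kind == "external":
        return hops + [target]
    hops.append(target)                 # segment 1: I -> aggregation point
    egress, label = lookup(FIB_A, dst)  # segment 2: tunnel A -> E
    hops.append(egress)
    hops.append(LABELS_E[label])        # E strips the header, hands to X
    return hops
```

Intermediate routers between A and E are not modelled; in the talk's design they forward on the tunnel header alone, which is why they never need a route to the destination prefix.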
And the important point to take away here is that we figured out a way to achieve all this with standard router configuration today. Question? >>: [inaudible] the header includes in the next hop so that when he strips it -- when he strips off the header and knows that it's -- >> Hitesh Ballani: This is the next hop, and that's the trick. That's why E doesn't need to have the entire routing table even though it's orange. >>: [inaudible] operation on the line card? >> Hitesh Ballani: Yes, it's [inaudible] on the line card. >>: It's not an exception case that goes to general [inaudible]. >> Hitesh Ballani: No. That's what saves us. What saves us is the fact that tunnels have been adopted in mainstream networks for things like traffic engineering, and so most routers are equipped with line cards that can do tunnelling at line rates. And that's why the computational [inaudible] is not high and -- >>: This is [inaudible] tunnel plus this extra bit of [inaudible] like it's [inaudible]. >> Hitesh Ballani: So as it turns out, existing MPLS tunnels have that sort of technology built in. So I'm just using standard MPLS technology, wherein the label that we use to identify a tunnel embeds the information of what the next hop is, and so this is no exception case, it's all on the data path, the fast path. Okay? Question? >>: So basically you change the -- although you cannot reduce the routing table size you change the network topology [inaudible] to evidence prefix, right? >> Hitesh Ballani: Sure. >>: So how does it deal with [inaudible] inside of the ISP? >> Hitesh Ballani: Excellent question. You are two slides ahead of me and I'll address that point in a second. I'll get to that. Okay. Question. >>: [inaudible]. >> Hitesh Ballani: Excellent question. So tunnels represent two kinds of overhead. There is an overhead in terms of computational cost, which is encapsulation, and there is a storage overhead. 
As it turns out, and this is again a cop-out, because ISPs use tunnels for existing purposes, it turns out that layer two technologies and [inaudible] technologies have been designed in such a fashion that you don't have MTU issues. For instance, MPLS can work as a layer two technology, and even [inaudible], the MTUs have been designed in such a fashion that there is no IP fragmentation. If there was IP fragmentation, this would not work. Okay. Yes? >>: An alternative would be to beef up the memory in some of the [inaudible] so you have to tunnel to the red one to tunnel to ones that know everything and have [inaudible] memory [inaudible]. >> Hitesh Ballani: Yes. So there is an entire design space wherein this is -- what I'm going to explain is under the assumption that you don't want a lot of SRAM on any router. There are cases where you would have, oh, this router is just a new router and so it can keep all the memory, and then you have these end routers that rely on that specific router, and we have explored that in a little bit of detail with engineers [inaudible]. I'll comment on that very briefly at the end of the talk, and if you're interested we can talk about that offline. Okay? >>: Is there a slight bit more [inaudible] might have a path from I to E or I to X, but you might not have it back [inaudible] goes through one of the red routers. >> Hitesh Ballani: Excellent question. Which is my next slide. So this basic design leads to a couple of design concerns, the first of which is failover. What happens when router A fails, because there's a path that exists from external router to external router? If you remember, the blue router receives two routes to the red virtual prefix, one from A, another from A2. Hence when router A fails, it installs the alternate route into its forwarding table. And the packets are rerouted automatically. 
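
The failover behavior described above can be sketched as follows. This is a toy model under assumptions, not router code: the RIB keeps every route learned for a virtual prefix, the FIB holds only the preferred one, and losing the peering to A promotes the route via A2. The class name `Router`, the cost values, and the tie-break on lowest cost are illustrative choices.

```python
class Router:
    """Toy router with a RIB (all candidate routes) and a FIB (best route)."""

    def __init__(self):
        self.rib = {}  # prefix -> list of (cost, next_hop)
        self.fib = {}  # prefix -> next_hop currently used for forwarding

    def learn(self, prefix, next_hop, cost):
        """Record a route in the RIB and refresh the FIB entry."""
        self.rib.setdefault(prefix, []).append((cost, next_hop))
        self._install(prefix)

    def peer_down(self, next_hop):
        """Drop every route via a failed peer and fall back to alternates."""
        for prefix, routes in self.rib.items():
            routes[:] = [route for route in routes if route[1] != next_hop]
            self._install(prefix)

    def _install(self, prefix):
        routes = self.rib[prefix]
        if routes:
            self.fib[prefix] = min(routes)[1]  # lowest cost wins
        else:
            self.fib.pop(prefix, None)         # no aggregation point left


# The blue router learns the red virtual prefix from both A and A2.
blue = Router()
blue.learn("0.0.0.0/2", "A", cost=10)
blue.learn("0.0.0.0/2", "A2", cost=20)
before = blue.fib["0.0.0.0/2"]

blue.peer_down("A")
after = blue.fib["0.0.0.0/2"]
```

With these inputs `before` is `"A"` and `after` is `"A2"`. The last branch of `_install` shows why the talk stresses placing more than one aggregation point per virtual prefix: with none left, the prefix has no FIB entry at all.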
So the point here is that failover in ViAggre happens automatically using existing mechanisms and the ISP doesn't need to do anything fancy other than to ensure that there is some aggregation point to fail over onto. And so you have this management overhead wherein you want to make sure that, in order to get the same quality of robustness, you pick and choose your aggregation points properly; for instance, one way to do it would be to ensure that every PoP has a couple of aggregation points. And so the robustness doesn't suffer much. And the [inaudible] don't need to do anything fancy, other than to be smart about how you place the aggregation points. >>: [inaudible]. >> Hitesh Ballani: So what happens here is that I'm assuming a mesh of internal peerings. This guy already had two routes to the red virtual prefix in the RIB, and so essentially what he needs to do is send stuff from the RIB into the FIB, which in our measurements takes sub-millisecond time. As -- >>: [inaudible] that says A is [inaudible]. >> Hitesh Ballani: So what happens is, assuming we have a mesh of peerings, your TC -- your BGP peering session goes down. So if you are using aggressive time-outs there, you get that instantly and you fall back onto the other option [inaudible] FIB. And our measurements, and this is on actual hardware routers, have shown that if you configure that properly or engineer that properly, that comes out to be in the sub-millisecond range. Essentially detecting that your peering has gone down -- >>: [inaudible]. >> Hitesh Ballani: Yes. So one [inaudible] while [inaudible]. So it turns out [inaudible]. Question? >>: One way to look at this is you're taking what was one router and distributing it into four other routers. >> Hitesh Ballani: Yes. It's essentially, yes. >>: Does the cost go up, but do you not have to buy four more routers, four times as many routers as you have in the past? I mean, what extra load does this put on the internal routing structure?
>> Hitesh Ballani: Excellent question, which is my next slide. So as you mentioned, ViAggre requires traffic to be routed through an aggregation point, which as people mentioned can impose stretch on traffic; it can increase load across the ISP's routers and across the ISP's links. And I guess I don't -- I'm not sure if I say this at some point in the slides, but the stretch answer is the fact that most ISPs are designed in the form of PoPs, where you have a few routers in each PoP and you want to make sure you divide the routing table within each PoP, so you're not going from New York to Chicago. The load answer is slightly tricky. As it turns out, traffic on the Internet follows a power-law distribution. That is, 95 percent of the traffic is destined to 5 percent of the prefixes. Hence, we propose that these popular prefixes should be loaded into the forwarding table of every router. This will ensure that a majority of traffic follows direct paths, a small fraction of traffic takes this detour, and so it substantially reduces the impact of ViAggre on the ISP's network. And this is what makes ViAggre a good trade-off: the substantial reduction in FIB size while having a minimal increase in load and stretch. And that's what the evaluation results are going to show. So unless there are questions at this point of time, I'll move on to evaluation. Yes? >>: So are those traffic to [inaudible] that doesn't mean that those traffic are not important for example some traffic might be [inaudible] but they are not very much latency-sensitive but some traffic if you're going into this [inaudible]. >> Hitesh Ballani: That is very true. My argument is that, well, for most of the traffic things work well and for a small fraction of traffic things could go bad. If things can go really bad in some application-specific scenarios, this would not be acceptable to the ISPs, it might break SLAs, and you notice that I'll come to that in four slides from now.
I'll address that issue in terms of how you want to assign the aggregation points. Because you want to be smart about that. Yes? >>: [inaudible] when you said 95 percent, you mean 95 percent of the bytes, 95 percent of the what? >> Hitesh Ballani: 95 percent of bytes. >>: Isn't -- isn't bytes [inaudible] by BitTorrent? >> Hitesh Ballani: No, it used to be the case. It's not now. If you look at [inaudible] BitTorrent doesn't -- >>: [inaudible] for four years. Could you go back one slide? >> Hitesh Ballani: Yes. This is actually 10 years. Jennifer Rexford's paper in the late '90s even now what it [inaudible]. >>: [inaudible]. >> Hitesh Ballani: So what happens is that, for instance, I'm AT&T. I have two prefixes that belong to me that are my customers and so a lot of traffic is going to them. There are three, four prefixes that belong to AOL and Comcast that are getting services from me. And their traffic is again AT&T local. So you have these four prefixes that are carrying about 40 percent of the traffic. What turns out is the rest of the traffic is essentially going to Google and Microsoft data centers that are again connected to AT&T. So it turns out any traffic coming from AOL, Comcast, and AT&T customers going to these data centers or peer-to-peer applications is AT&T local, which again belongs to these 10, 15 or 50 or 100 prefixes which turn out to be popular. Actually I was very confused about that point, too. When I came up and did this evaluation based on network records, I was like, this doesn't make sense. And I have some insight into why that works. And I'll be happy to explain that to you offline. Okay? So, evaluation results. And I'll address that question of what happens for really bad cases. So first up, evaluation metrics. When an ISP is choosing to deploy ViAggre it is looking to shrink the FIB on its routers. On the other hand, the use of ViAggre imposes stretch on traffic and it increases load across the ISP's routers.
So there are positives and there are negatives. Further, ViAggre provides the ISP with a number of deployment options and I listed three of these here. So the theme of the evaluation is going to be how can the ISP use these deployment options to tune the positives and the negatives? And the main result that I'm going to try and show is that the positives far outweigh the negatives, which makes this a good trade-off. In the interest of time in this talk I'm going to focus on the latter two deployment options. Now, if you think about it, the choice of which routers aggregate a given virtual prefix is an important one because the more aggregation points you have, the less stretch you'll impose on traffic. On the other hand, the more aggregation points you have, the more cumulative FIB space you'll end up using. So there's a trade-off between FIB size and stretch. And the ISP can use its choice of aggregation points to tune this trade-off. In our work, we consider a simple constraint-based optimization problem wherein the ISP is trying to minimize the worst-case FIB size across all its routers while constraining the worst-case stretch. Now, this is a simple constraint. But I'm sensing that it's [inaudible] and it answers your question because I want to make sure that my existing [inaudible] are not breached and my latency-sensitive traffic is not completely hosed. So I could say, well, even in the worst-case scenario my stretch should not be more than four milliseconds or five milliseconds, and that's the kind of exercise an ISP can do, and you can get more sophisticated. On the other hand, the worst-case FIB size is important because that's what I need to provision for. As it turns out, the simple constraint problem can be mapped to the multi-commodity facility location problem, which turns out to be NP-hard and has been studied quite a bit in the theory community.
As a matter of fact there was a SODA paper in 2004 that proposed an approximation algorithm with logarithmic bounds for the problem. For our purpose we implemented a simple greedy approximation algorithm and we applied this to data from an actual tier-1 ISP on the Internet. So we took the ISP's topology, their routing tables, their traffic matrix, and we assumed that the ISP wanted to deploy ViAggre and was using our tool to determine an allocation of aggregation points, the results for which are plotted on the graph here. So the X axis is the constraint on worst-case stretch, the Y axis is FIB size. As you relax the constraint, the worst-case FIB size drops. And with a constraint of about four milliseconds you get a worst-case FIB size across all the routers of around 10,000 prefixes, which is four percent of the global routing table. On the same graph, on the right-hand-side Y axis, I have plotted the actual stretch. And there are two points to note here. The worst-case stretch, which is the dark blue line, is always less than the constraint, which is a sanity check for the algorithm, while the average-case stretch is pretty much negligible throughout, .2 milliseconds. So FIB size reduces, stretch increases, stretch is pretty much negligible throughout. Another way to look at this reduction in FIB size is to look at the extension in lifetime of routers due to the use of ViAggre. >>: I have a question. >> Hitesh Ballani: Sure. >>: So what you're saying is that that blue line, there's this imaginary X equals Y line that it lies under? >> Hitesh Ballani: Yes. >>: So that's the measured worst-case stretch as opposed to the constraint that you saw before? >> Hitesh Ballani: Yes. >>: Okay. >> Hitesh Ballani: So that's the measured stretch, and obviously you would want to make sure, if your algorithm is working right, it's below that diagonal line. >>: Right. >> Hitesh Ballani: Yes. Question?
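[Illustration, not part of the talk.] The trade-off between worst-case FIB size and a worst-case stretch constraint can be made concrete with a small greedy placement sketch. It is not the speaker's algorithm, whose details are not given here; it is a generic set-cover-style greedy under stated assumptions. The function name `place_aggregation_points`, the input shapes, and the router names are hypothetical, and stretch is simplified to the latency between a router and its nearest aggregation point.

```python
from typing import Dict, List, Set, Tuple


def place_aggregation_points(
    routers: List[str],
    latency_ms: Dict[Tuple[str, str], float],
    vp_sizes: Dict[str, int],
    max_stretch_ms: float,
) -> Tuple[Dict[str, Set[str]], Dict[str, int]]:
    """Greedy sketch: give every router a nearby aggregation point per
    virtual prefix while spreading FIB load. Returns (placement, fib_load)."""

    def lat(a: str, b: str) -> float:
        if a == b:
            return 0.0
        return latency_ms.get((a, b), latency_ms.get((b, a), float("inf")))

    fib_load = {r: 0 for r in routers}
    placement: Dict[str, Set[str]] = {}

    # Place the largest virtual prefixes first.
    for vp, size in sorted(vp_sizes.items(), key=lambda kv: -kv[1]):
        uncovered = set(routers)
        chosen: Set[str] = set()
        while uncovered:
            # Prefer the router that covers the most uncovered routers
            # within the stretch bound; break ties by lowest current load.
            def score(r: str) -> Tuple[int, int]:
                covered = sum(1 for u in uncovered if lat(u, r) <= max_stretch_ms)
                return (covered, -fib_load[r])

            best = max((r for r in routers if r not in chosen), key=score)
            chosen.add(best)
            fib_load[best] += size
            uncovered = {u for u in uncovered if lat(u, best) > max_stretch_ms}
        placement[vp] = chosen
    return placement, fib_load


# Two routers in one PoP, one router in a distant PoP (made-up numbers).
routers = ["a1", "a2", "b1"]
latency = {("a1", "a2"): 0.1, ("a1", "b1"): 10.0, ("a2", "b1"): 10.0}
sizes = {"vp1": 100, "vp2": 100}

tight, tight_load = place_aggregation_points(routers, latency, sizes, 1.0)
loose, loose_load = place_aggregation_points(routers, latency, sizes, 20.0)
print(max(tight_load.values()), max(loose_load.values()))
```

With the tight bound, the lone router in the distant PoP must hold both virtual prefixes, so the worst-case FIB is larger; relaxing the bound lets it rely on the other PoP. That mirrors the remark later in the discussion that the smallest PoP limits how far the FIB can shrink.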
>>: So the FIB size bottoms out around four percent, what's the intuition for why -- what was the -- >> Hitesh Ballani: The [inaudible] there is that we are limited by the smallest PoP that this ISP had. So if you -- if this guy had a PoP with five routers, I'm limited to dividing the routing table amongst those routers and then relying on some close-by PoPs, but my stretch constraint ensures that I can't rely on some random PoP far away. >>: [inaudible] about five reflects the -- your choice of number of colors in your [inaudible]. >> Hitesh Ballani: Yes. >>: So that was fixed? >> Hitesh Ballani: Yes. It was fixed. >>: Wait. Now I didn't understand. I didn't think that the number of colors was fixed, I thought you were saying that the stretch constraint forced the number of colors was fixed because it was of the -- because you didn't have more colors than that in one PoP? >> Hitesh Ballani: No, what I said was that in this exercise I started out with a certain number of colors, let's say 200 colors, and then I said -- >>: [inaudible]. >> Hitesh Ballani: Yes. A decent number of colors. So that I get an even distribution of prefixes across these various colors. >>: Okay. >> Hitesh Ballani: That's why you're decent -- that's why I needed a decent size, not four or eight or something like that. All right? And another way to look at this reduction in FIB size is to look at the extension in lifetime of routers due to the use of ViAggre. And we conducted a study to that effect, and the highlight of that study was the fact that ViAggre can be used to extend the lifetime of already outdated routers by 7 to 10 years, while imposing no stretch on the ISP's traffic, which I'm sure many ISPs would be very excited about. Now, [inaudible] from ViAggre requires traffic to be routed through an aggregation point, which as I mentioned, can impose load on the ISP's -- >>: I don't [inaudible]. >>: So this is a zero stretch.
>> Hitesh Ballani: I should have clarified, zero stretch means that if you're taking a hop within the PoP, that is zero. >>: Okay. Within a PoP. >> Hitesh Ballani: Within a PoP. And as it turns -- >>: So the number of routers -- >> Hitesh Ballani: Number of routers. So I should have clarified that. >>: So [inaudible]. >> Hitesh Ballani: It's no magic. >>: [inaudible] exactly the same routers you had before? >> Hitesh Ballani: Yes. Sure. >>: All right. >> Hitesh Ballani: The only problem is that, well, you can still take a hop within the PoP, but the load is a problem. Which is -- >>: [inaudible]. >> Hitesh Ballani: Yeah. But in terms of stretch they're okay, in terms of load they're not okay. >>: Right. >> Hitesh Ballani: And this is where the popular prefixes come in. Question? >>: So in 10 years assuming that the growth rate continues [inaudible]. >> Hitesh Ballani: So there I used two models of FIB growth rate. One was an exponential model proposed by Geoff Huston, one was a quadratic model proposed in some IETF document, a quadratic-based model. So, yes, people based on past growth have assumed extra [inaudible] future growth, and I took two models and based on that, I got this range. And of course if IPv6 deployment takes off or IPv4 deaggregation takes off, the growth might be more, in which case this number would reduce a little bit, yes. Okay? So the load problem, and this is where popular prefixes come in. We performed a pretty long-term and comprehensive study to determine the fraction of traffic carried by different fractions of prefixes. And the highlight of that result was the fact that a small fraction of prefixes carry a vast majority of traffic. This is what past results have shown, this is what we found. Hence, we proposed that these popular prefixes should be loaded into the forwarding table of every router. Given this use of popular prefixes, we conducted a load analysis to determine the increase in load across the ISP's routers.
So the X axis is [inaudible] popular prefixes, the Y axis is [inaudible] for the increase in load across the access routers. As we increase the popular prefixes, load drops sharply. And with around 5 percent popular prefixes, we get a maximum load increase of 1.38 percent. Which should be pretty acceptable. So hopefully this quick set of results convinces you that ViAggre can be used by ISPs to extend the lifetime of their routers while imposing negligible traffic stretch and also almost no increase in load across the routers. Beyond this, ViAggre has a number of other advantages; for instance, ISPs don't need to buy into the ViAggre model completely. They can play around with it on a limited scale so that they get comfortable. And there are several other advantages. Question? >>: So [inaudible] selection of aggregation only consider the stretch, have they considered the load on the [inaudible]. >> Hitesh Ballani: So in the simple optimization that I presented in this case, the idea was: put a constraint on the stretch, measure the load; it turns out that load is pretty small. Beyond that, obviously for an ISP to deploy this, you would want to constrain both the stretch and the load. I have a mathematical formulation of these constraints that I fed into an ILP [inaudible], ILOG. So essentially I have a tool that takes these constraints and generates a deployment model that would satisfy those constraints. But the results in this talk are based on the simple constraint model. >>: If -- so the model you've done is with no failures of the routers. If you start thinking -- well, maybe -- I say this without knowing. >> Hitesh Ballani: So what I'll [inaudible] failure model, the failure model I wanted to ensure was that when you are placing a color in a given PoP you want to make sure that there are two routers for that color. >>: In that PoP? >> Hitesh Ballani: In that PoP. >>: Okay.
>> Hitesh Ballani: Because, well, if that fails you can land something on the same PoP, otherwise you go to some -- >>: [inaudible] property then you won't affect stretch with any single failure? >> Hitesh Ballani: Yes. >>: Right? >> Hitesh Ballani: Yes. >>: All right. >> Hitesh Ballani: That being said, you could come up with more complex constraints where you want to constrain the stretch even in the face of failure. >>: [inaudible] going to be if you took a failure would you have much worse long haul. >> Hitesh Ballani: Question. >>: So I'll be impressed if you have a slide for this one. [laughter]. You don't do this work in [inaudible] service attacks, and it looks like if I were a [inaudible] service attacker, I could go after the layout of the routing and make sure that load ended up just where I wanted and make sure that things get dropped. >> Hitesh Ballani: That's an excellent question. I don't have a slide for this. Sorry. Essentially as an attacker, what you want to do -- and I will answer this very briefly and we can have a discussion later, if you -- you could send traffic to popular prefixes. That being said, in ViAggre, when you have a packet destined to a prefix which is not popular, you're not relying on the control plane. So it's not as if you're taking a cache hit and getting something from the control plane, you're essentially forwarding it to some other router. Packets always stay on the data path. So that would ensure that no matter -- well, not no matter, but for decent-size attack traffic, you should be -- you should be fine. Obviously that can be accounted for in how you deploy ViAggre. And I have some very preliminary results on that that I'll be happy to share with you. Okay. Not in this talk. So evaluation was fine, but there's a question that -- question? >>: [inaudible] followed by a constraint but [inaudible]. So here's one question. [laughter]. >>: Research [inaudible] [laughter]. >>: So here is a question.
There's one thing that sort of -- maybe it's not [inaudible] but it's the [inaudible] and there have already been so many papers talking about routing [inaudible] how hard it is to keep them [inaudible]. >>: Yes, how do you deploy this? >> Hitesh Ballani: Excellent question. [laughter]. Which is my next slide. So we wanted a deployable system. And we went out and spoke to ISPs and operators, and there were two main concerns. First was, well, you're using all these control plane hacks, and what happens with installation time and convergence time and all those data metrics. The second and perhaps more important concern was the management overhead. I went out and spoke to operators and I was happy because I found a solution that was going to save the world, but they were concerned about the operational costs of this extra configuration. And having done some network management I really appreciate that concern. Now, to answer these questions, we went out and deployed our system on a test bed of actual hardware routers at the WAIL lab in Wisconsin. And the figure shows a very simple topology, although we experimented with all kinds of different topologies. In the figure we have an ISP that has deployed ViAggre and it's exchanging routes with two neighboring ISPs, AS2 and AS3. We configured these routers to propagate routes using three different mechanisms and I'll briefly recap them for you. First is status quo, which is what happens today. The external router advertises routes to the edge router, and the edge router forwards these using a mesh of internal peerings. Second, we have design one, wherein the internal routing setup remains exactly the same, except that routers use FIB suppression to only load the relevant routes into their FIB.
We achieved this FIB suppression using something called access lists, which is a standard route-filtering mechanism available on all routers, and the only thing that you need to know about access lists from this talk's point of view is that they can be massive due to the use of popular prefixes. Because if you have a thousand popular prefixes, the access lists need to enumerate them, to tell the router that these need to go into the forwarding table. And finally we have design two, wherein external routers advertise the routes to a route reflector, and these are selectively forwarded. We achieve this selective forwarding using something called prefix lists, which is again a standard [inaudible] mechanism available on all routers, and these can be massive due to the use of popular prefixes. Now, we conducted a whole slew of experiments, and in the one experiment that I'm going to focus on in this talk, what happened was you tear down and you reestablish the peering between the external router and the edge router, and you measure the amount of time it takes for the route to be advertised, installed into the edge router's FIB and then forwarded on to it, something that we call installation time. The experiment had two key parameters. First is the number of routes being advertised by the external router, which in some sense represents the routing overhead, and second is the number of popular prefixes, which represents the size of the access list or the prefix list that you'll be using in the ViAggre deployment. And -- >>: [inaudible]. >> Hitesh Ballani: Yes. So these results are based on Cisco 7300s. I think we have experimented with Juniper routers, too, M20s, I think, but the results that I'm going to show in this talk are based on Cisco router results. Okay? Which is shown here. So design one imposes substantial overhead in terms of installation time. And this increases dramatically as we increase the fraction of popular prefixes.
This is because, as it turns out, routers today just aren't designed to deal with massive access lists. The overhead is too high, which is what I was alluding to at the beginning of the talk. So this is not a practical implementation, and we still are working on it. However, design two actually reduces installation time. This is because in design two the route reflector only needs to forward a subset of the routing table to the ISP's routers. In contrast, with status quo, the edge router needs to advertise the entire routing table to the ISP's routers, and that's where that advantage is coming from. Further, the installation time doesn't increase much as we increase the fraction of popular prefixes. So this is great news because this shows that not only are routers designed to [inaudible] the massive prefix lists, but reducing the forwarding table can also help with internal routing convergence. So we started by focussing on FIB size, and it seems we can help with both RIB size on data plane routers and internal routing convergence, which is one of the advantages of reducing the RIB size. Question? >>: I don't have a good feeling for how many prefixes are often advertised within networks on the X axis. >> Hitesh Ballani: So on the X axis we are here. So today the Internet routing table is around 280,000, 300,000 prefixes. So we are here and we've done stuff here. And it's done sort of recent. Next is the management -- question? >>: [inaudible] the router [inaudible] fails. >> Hitesh Ballani: What happens if the route reflector fails? You need to be very careful about how you do the route reflector deployment because all the external routers peer with them. But that is not qualitatively any different from what happens today, because today ISPs use redundancy to ensure that route reflector failure does not lead to route propagation failure, which is the same for ViAggre. Question? >>: [inaudible]. >> Hitesh Ballani: Why the [inaudible]. >>: [inaudible].
>>: [brief talking over]. >>: [inaudible] the opposite. Why is it so [inaudible]? >>: [brief talking over]. >>: [inaudible] linear scale, it would look like that. >> Hitesh Ballani: This curve? >>: Yes. [inaudible]. >> Hitesh Ballani: This curve would increase -- I don't have an answer to that question. I'll have to think about that. I guess it could be a function of, as you're increasing the number of routes, you're sending more data in, and things are getting congested. I don't have an answer to that question. I'll think about that and I'll get back to you in a second, okay, after the talk. The other question was the management overhead. Now, to address this concern, as part of our deployment we developed a management tool that can help with the ViAggre configuration. It's a simple tool that takes an ISP's existing configuration files and the NetFlow records for the traffic statistics and it generates configuration files that are ViAggre compliant. So effectively you have an automated means to go from a status quo network to a ViAggre-compliant network without any manual intervention. Of course this [inaudible] was specific to the Cisco 7300s we were using and the [inaudible] knowledge we were using, but the simplicity suggests that the configuration problem might not be in some [inaudible]. That being said, it's an excellent question, because we chose a configuration-only path for ViAggre because we thought it would lead to easier deployment, which is not necessarily true. And another way to approach the problem would be to assume router vendor support and to build all these primitives directly into the router, which is what we are working on with engineers at Huawei. So this would reduce the configuration on the ISPs and it would make them more comfortable because they would have router vendor support. So we have a couple of IETF drafts on this, but I would say this is more of a work in progress.
So with that, I'm going to conclude the ViAggre component of the talk. ViAggre, or virtual aggregation, is a configuration-only approach to shrinking the FIB on ISP routers. ISPs today can use ViAggre to extend the lifetime of their installed router base. Of course ISPs may need to upgrade the routers for other reasons, but at least their hand is not going to be forced by factors beyond their control, namely the growth in the Internet routing table. Further, I don't think of ViAggre as a be-all and end-all solution to the routing scalability problem. And so I think of it as a simple yet effective technique to sort of hold the fort until a more clean-slate solution can come along and save the day. So that was ViAggre. And so I have a few minutes left, and so I would like to talk briefly about future direction. Question? >>: [inaudible] this argument, you just said that you know, this is a first step solution and a clean-slate solution. Isn't the fact that the lack of clean-slate solutions coming in is because we [inaudible] isn't it kind of [inaudible]. >> Hitesh Ballani: That's an excellent question. So you -- I [inaudible]. >> Hitesh Ballani: Yes. Yes. If I were arguing for dirty-slate solutions where you come up with these dirty solutions and place band-aids on the network architecture, you'll never get to the clean-slate solution. And so as researchers, we have to be very smart about how we design the dirty-slate solutions so that there is a clear progress path. In the context of ViAggre, the idea there is you convince ISPs to use this. They'll run into trouble, they'll see management overhead, all those things, and you will get these changes implemented into routers. Once you have things implemented into routers, which is the second step, and that's what we are currently pursuing, you can actually move to inter-domain ViAggre.
Because if I'm an ISP and I have a routing table size problem, I can turn the switch on in my routers without any management overhead and get these advantages. So now you have various ISPs that have deployed ViAggre. And beyond that point, there would be incentives for them to cooperate to reduce the FIB size even more and to ensure that end-to-end latency does not suffer. So there is a clear progress path, at least for ViAggre, of why this is not a band-aid solution but would actually encourage our movement towards a clean-slate solution, which I think I'm very proud of. Okay? Question? Yes. >>: How [inaudible] are FIB caching [inaudible]. >> Hitesh Ballani: FIB caches are -- went out of play ten years ago. And there were reasons for that. ISP -- ISPs weren't happy because when there was a cache miss, you'd have a reduction in throughput and an increase in loss. Vendors weren't happy because they couldn't benchmark their product. And that's why FIB caches went out of business. That being said, it's a very insightful observation because what I've designed is essentially a distributed caching system. So centralized caching wherein you -- when you have a cache miss and you get stuff from the control plane didn't work 10 years ago and it won't work today. That being said, that's why you need the distributed caching system, and that's what ViAggre is. >>: So if you had routers today that didn't have FIB caching, didn't use the FIB cache, it seems that it would do some of what you're proposing in that, you know, for the [inaudible] of the most popular traffic you know is [inaudible] and FIB and anything else you go out to the router and [inaudible] going to take a hit, but maybe that hit would be smaller than a routing stretch hit. >> Hitesh Ballani: That is actually what we are doing with Huawei. You actually do these popular prefixes calculation online. You populate the FIB without any management overhead.
If you do have to take a hit, you go on to some other line card on some other router. And so that's sort of the kind of thing that we are exploring with -- >>: So is that pretty much exactly the same as the old FIB cache stuff, or is there some new [inaudible]. >> Hitesh Ballani: There's some new stuff there, because the old FIB cache stuff had to rely completely on the control plane, fetching data from the route processor. You don't want to do that, because of the reduction in throughput. You either want to rely on some other line card in the same router or some other router in the same PoP. So there are some differences there. But the dynamic calculation of popular prefixes, what you just mentioned, is something that we are implementing with Huawei and others. Albert? >>: Yes, I was just thinking about the economics of the [inaudible] ISPs [inaudible] upgrading line cards but they wouldn't care about upgrade routers that they do and they don't even have to update all of them and you [inaudible] so again this brings me back to my question. Why don't you buy a small number of core routers that know everything? Why do you need a partition by -- >> Hitesh Ballani: That's a good observation. So, yes, there's one deployment model wherein you can buy some core routers that have a lot of memory and you rely on them, and that leads to a simpler deployment model. And we have explored that a little bit. That being said, you brought up an interesting point when you said that, well, you need to upgrade routers every three, four years anyway because of data rates. >>: Core routers. >> Hitesh Ballani: Core routers because of data rates, and so why do you need to bother? Because the memory sort of -- memory upgrades come along with the upgrade of the actual router. Right. >>: Compared to the cost of the links for most ISPs. >> Hitesh Ballani: Yes. >>: It's [inaudible] a big deal. >> Hitesh Ballani: Yes.
As it turns out, well, this thing -- there's a separation between what happens for medium ISPs and large ISPs, medium ISPs being tier 2s, tier 3s. For medium ISPs, the concern is that they need medium links and medium routers but they need big routing tables. As a [inaudible] bigger ISPs need big links and big routers and big routing tables. So there's a mismatch there. That's where ViAggre helps. That being said, on the side of tier 1 ISPs or big ISPs, as far as my understanding goes, and please help me here, five, 10 years ago, every few years you changed a router because you needed new data rates. Data rates went up and you needed to carry more traffic. Now, router line cards have reached lambda speeds, which in turn implies that most of the upgrades will come because of memory. And so it is going to hurt the bottom line of ISPs. Is it going to be an influential factor for [inaudible] deployment? I don't know. But we can take that conversation offline. Question? >>: I have a specific question. You were talking about [inaudible] that's used to set up your [inaudible]. >> Hitesh Ballani: Yes. >>: Do you have an [inaudible] of how many [inaudible] you need to set up? Because it almost feels like for each -- >> Hitesh Ballani: That's an excellent question. I never figured that somebody would get to that in detail. >>: Actually, I'll tell you why I care about it. Because you were talking about routers 7 to 10 years old and hadn't talked [inaudible] like that they have limits on how many [inaudible] ends can sit on routers, especially the older routers don't support very many of them. >> Hitesh Ballani: That is very true. And [inaudible] which I explained in this talk, you'd have tunnels equal to the number of directly connected external routers that you have in the deployment model that I showed.
That being said, you can do an optimization wherein you only advertise tunnels for every edge router, which reduces the number of MPLS tunnels that you need to hold by an order of magnitude. So we're talking about 2,000, 1,000, 1,000 tunnels. That's the number of routers that you have in the ISP's network, which can vary from 200 to 1,000 to 1,500, and as far as I know, at least the three-, four-year-old Cisco 7300s we're working with can't support that many MPLS LSPs. So there is an optimization that reduces the number of tunnels from 10,000, which is the number of routers directly connected to the ISP, to the number of edge routers, which is a direct function of what your topology looks like. Okay? That is a good question. Well, I wanted to speak about future directions. I don't know how things work. Do I have five minutes? Okay. I'll try to keep this very brief. Throughout my graduate career I focused on lower level networking problems. This has allowed me to delve deeply into one area so as to have the most impact. I think I've been relatively successful at this. And it has led to a number of new ideas that I plan to pursue. Now, while there is a decent amount of work to be done in core Internet research, I do believe that the field's best days are behind it. And I say that with some amount of sadness. In this context, I think my past work is useful for two reasons. First, it taught me that good research can comprise both clean-slate and dirty-slate components. Second, it equipped me with the requisite tools to diversify my research beyond these core interests. And to give you a quick example, in one of my current projects I am building a system that facilitates the transfer of delay-tolerant bulk data. So this would be users downloading movies and software that they don't necessarily need immediately. And my insight here is to take advantage of this delay tolerance to reduce user cost and to improve performance.
So this is an example of a project which is at the borderline of traditional networks and peer-to-peer systems and distributed systems. And so I'm already moving up the stack. Beyond this, my current pipeline of projects is a mixed bag, and I'd be happy to speak to you about these if people are interested. To put this all together, I want my research to be geared towards building network systems, both traditional and emerging, that are better along several dimensions, including their scalability, their manageability, their reliability, and their security. Further, I want my research to retain this dual theme of instant and delayed gratification, and I'll explain what I mean by that. Given a problem space, I want my projects to follow a specific roadmap, wherein the first step is to find the largest subset of the problem space that can be tackled without invasive [inaudible]. So this is the [inaudible] solution that can help things today. Next, I would like to use insight from this solution to work towards a more complete solution that most likely will be invasive in nature, which is the long-term solution that can help things eventually. In the context of past work, I have done work on clean-slate and dirty-slate projects, and the idea now is to sort of combine them and have a unifying thread going through them. And I'll give you a quick example. In a couple of my past projects, I've used tunnels in several ways. For instance, in ViAggre, we used tunnels to shrink the FIB on ISP routers. I think tunneling is a very powerful primitive. And I'm actually surprised at how little attention tunnels have received in the research community. Although that's changing a little bit. For instance, I noticed that Dave and Albert have done [inaudible] stuff where they use tunnels for load balancing and scalability. I think tunneling is a very powerful primitive.
And I think that a properly designed and cleverly incentivized inter-domain tunnel architecture represents a very good opportunity to have genuine impact on many longstanding networking problems, like routing scalability and traffic engineering and load balancing and perhaps even network security. So this is an example where the same notion can be used in the near term to help things today and it can be used in the long term to get to the eventual solution. Note that I'm not claiming that these solutions will spur architectural change. All I'm saying is that when the severity of the problem justifies the cost of change, my solutions will have a better alignment of costs and benefits, and hence, I think, a better chance of deployment. And on that optimistic note, I think I'm going to conclude my talk. Thank you all for your patience. Thank you for having me here. And I'll be happy to answer questions. >> Dave Maltz: Well, let's thank the speaker. [applause]. >> Dave Maltz: Any more questions? >>: So I think you have done a great job of actually showing how this solution [inaudible] but if you look at [inaudible] do you think any of it has had impacts in [inaudible]. >> Hitesh Ballani: In terms of the older work. So the IP -- >>: [inaudible] examples where your previous work has been [inaudible]. >> Hitesh Ballani: Sure. Sure. That's a fair question. So in terms of the anycast stuff, I think it had some impact because I do have deployed servers. These are seven boxes that are actually advertising these prefixes. And people are interested in growing that deployment. Because for IP Anycast, and this is technical, so I'm sure we can take this offline, for IP Anycast to actually compete with DNS based anycast, I need a substantial size deployment. I have seven servers right now.
And if I can get to 20, 25 servers, that would be a substantial enough deployment where other services, other research projects might be able to use my IP Anycast based servers instead of DNS based anycast for some advantages. >>: So when you say I have seven servers, you mean you have a start? I mean, what would that mean, you have seven servers? >> Hitesh Ballani: I have seven servers in the sense I took one used server in [inaudible] boxes and shipped them to different ISPs or labs that were interested in hosting my boxes. I went there, installed my server, spoke to their upstream ISP, set up a BGP peering, advertised my prefix and installed my software on those machines, which is running right now. So as a graduate student, I think it's not quite the kind of impact I would like to have, but given the persistence, I think I'm pretty happy with the kind of things I've been able to achieve. Albert? >>: The ViAggre work is [laughter]. >> Hitesh Ballani: Sorry about that. >>: It's a [inaudible] being pitched in a very impactive [inaudible] both have an IETF effort and IRTF effort, there must be a dozen [inaudible]. So how are you doing? I mean, how is it stacked up and what's the difference [inaudible] proposal and [inaudible]. >> Hitesh Ballani: Sure. So there is a whole set of proposals that I mentioned that are clean-slate in the sense they require changes to software and hardware. And yes, they are amazing proposals because they tackle the entire problem space. But they don't get enough traction. There's a second set of proposals that do not require substantial change to protocols but they do require cooperation from router vendors, for instance FIB compression. The problem there is that Cisco doesn't want to cooperate because it wants to sell more routers, which I can understand, because they have their economic incentives.
And the reason ViAggre, and this is me being optimistic or speculative, might just take off, and it's getting some traction, is because there is a form of ViAggre that does not require any cooperation. If you're an ISP and you have a problem, you go out and deploy this. Obviously there are some management concerns, which we have a management tool to address. And if you wanted to, there could be a startup where they have a management solution and people might deploy ViAggre. And I think that is the only reason why there is some hope of this actually getting somewhere -- beyond some odysseys, actually getting deployed in some networks -- which is why I'm slightly pursuing that. Because it can be a lot of pain in terms of my time. And the only reason I'm pursuing that is it's something I came up with, and I think there is some hope that this can help with real network problems. Question? >>: So I like the approach, the way one of the slides was worded that says that you want to -- you would be able to increase the lifetime from 7 to 10 years from now. There's this battle I'm trying to understand, whether you're trying to engineer software beyond the projected hardware reliability guarantees that could be provided for these routers anyway, is there area -- >> Hitesh Ballani: No, there are no projected reliability guarantees. The RAM works just fine. It's just that somebody else -- I'm an ISP; you, as a network operator, might be lazy and you're not doing your aggregation job and you are advertising 10,000 prefixes, even though you're supposed to advertise 1,000 prefixes, and now I'm supposed to maintain all that in fast memory. And my router vendor won't help me because he wants me to buy new routers. And so as an ISP, I need some mechanism of being able to solve this problem, which is where ViAggre comes in. So I don't think it runs into problems of reliability.
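[Illustrative sketch, not part of the talk. The "aggregation job" mentioned above refers to announcing one covering prefix instead of many contiguous more-specific ones. The example below shows the idea with Python's standard `ipaddress` module; the function name `aggregate` and the prefixes are hypothetical and chosen only for illustration.]

```python
import ipaddress


def aggregate(prefixes):
    """Collapse contiguous or overlapping prefixes into the fewest covering ones."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    # collapse_addresses merges adjacent and overlapping networks.
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Hypothetical announcements: four contiguous /24s plus one unrelated /24.
deaggregated = [
    "10.0.0.0/24",
    "10.0.1.0/24",
    "10.0.2.0/24",
    "10.0.3.0/24",
    "10.0.8.0/24",
]

# The four contiguous /24s collapse into a single /22; the fifth stays as is.
print(aggregate(deaggregated))  # ['10.0.0.0/22', '10.0.8.0/24']
```

[Every prefix that an operator leaves de-aggregated is an extra entry that other ISPs' routers must hold in fast memory, which is the pressure on the FIB described in the answer above.]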
Yes, if you were -- for instance, most ISPs, when they upgrade the routers, it's because, well, data rates went up and I have more customers, which is fine because at least you're getting more revenue. When you need to upgrade a router because of RAM constraints and [inaudible] constraints, it's not because you are getting more revenue. Although I'd like to avoid the purely economic discussion because I'm not very good at that. Question. >>: This is a question [inaudible] and in a sense we're trying to get [inaudible] if you will. So what you've done essentially is add an extra layer of hierarchy that uses an amount of state at the load level [inaudible]. >> Hitesh Ballani: Yes. >>: [inaudible]. >> Hitesh Ballani: Sure, sure, sure. >>: [inaudible]. >> Hitesh Ballani: There's no unique insight. That's essentially the insight. I came up with a distributed caching system or a level of indirection which tackles a pressing problem, and I figured out a way to achieve that without any changes to hardware and software. That's the key point. Yes, I'm not going to get any big prize or something like that. It's more that I came up with a cool idea for which there was a pressing problem, and I came up with a way to achieve that without requiring any cooperation. Because as [inaudible] researchers you have this constraint, and I'm sure you must have experienced this when you have written your papers, that you want to do cool, interesting work but it's difficult to do cool and interesting work because there's all this legacy stuff out there. And the fact that I was able to do this cool solution in the context of legacy stuff is something that I really like. >>: [inaudible] essentially making it work with the right infrastructure. >> Hitesh Ballani: Making it work with the right infrastructure, ensuring that it provides good tradeoffs. Because at a simple level you have a reduction in FIB size, but load goes up and stretch goes up.
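[Illustrative sketch, not part of the talk. The tradeoff just described can be made concrete with a toy model of a ViAggre-style split, assuming the address space is divided into virtual prefixes and each router keeps specific routes only for the virtual prefixes it is an aggregation point for, plus one tunnel entry per virtual prefix it does not own. The function `fib_sizes`, the router names, and all prefixes are hypothetical; this is not the authors' implementation.]

```python
import ipaddress


def fib_sizes(prefixes, virtual_prefixes, assignment):
    """Return {router: FIB entry count} under a toy virtual-aggregation split.

    assignment maps each router to the virtual prefixes it aggregates.
    """
    nets = [ipaddress.ip_network(p) for p in prefixes]
    vps = [ipaddress.ip_network(v) for v in virtual_prefixes]
    sizes = {}
    for router, owned in assignment.items():
        owned_nets = {ipaddress.ip_network(v) for v in owned}
        # Specific routes are kept only inside the router's own virtual prefixes.
        specific = sum(1 for n in nets if any(n.subnet_of(v) for v in owned_nets))
        # Every other virtual prefix costs a single entry pointing at a tunnel.
        tunneled = sum(1 for v in vps if v not in owned_nets)
        sizes[router] = specific + tunneled
    return sizes


# Hypothetical routing table of five prefixes, split into two virtual prefixes.
table = ["10.0.0.0/8", "20.0.0.0/8", "30.0.0.0/8", "150.0.0.0/16", "200.1.0.0/16"]
halves = ["0.0.0.0/1", "128.0.0.0/1"]
owners = {"r1": ["0.0.0.0/1"], "r2": ["128.0.0.0/1"]}

print(fib_sizes(table, halves, owners))  # {'r1': 4, 'r2': 3}
```

[Without the split, each router would hold all five entries. The cost is that traffic for a virtual prefix a router does not own detours through an aggregation point, which is the extra stretch and load mentioned in the answer.]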
And so you want to make sure that you balance the right elements, and you want to make sure that the negatives are tiny enough that ISPs might be willing to take those costs while the positives are high enough that ISPs might actually be willing to take the trouble of deploying these things. >>: So are you [inaudible] but let me ask. >> Hitesh Ballani: No, please, please. >>: Are you only constraining yourself by the constraints that exist [inaudible]. >> Hitesh Ballani: No, I'm not only constraining -- >>: So, so. >> Hitesh Ballani: Yes. Please continue. >>: [inaudible] you talk about that, which is I want us to work with parameters that are given to me, and I want to make sure that my solutions actually work within the parameters and sort of [inaudible] clean slate and dirty slate and all this stuff. Can you go back and tell me where you've actually done things [inaudible]. >> Hitesh Ballani: Yes. This is the stuff that I had done on network management. I guess the slide is pretty close. >>: [inaudible]. >> Hitesh Ballani: Yeah, okay. So the idea there was network management is really hard. People here have really done a lot of work on network management. And the idea there was that we came up with this -- we looked at the reason why network management was difficult, and we came up with this explanation that it's hard because today protocols and devices tend to expose too much information. If you look at the management interface of the main repository, which is the management information base, it comprises thousands of variables, and you need to build management applications that understand these variables. >>: [inaudible]. >> Hitesh Ballani: Yes. >>: [inaudible]. So the question is that do you feel gone from that, or do you feel that that's just -- do you feel like you get enough out of it? >> Hitesh Ballani: Yes. So there are pros and cons, I guess: the positive side and the negative side.
From a research point of view, I had a lot of fun in CONMan. From a deployment point of view, I had a lot of fun because I was able to show that you take these protocols and you model them according to the CONMan abstraction and you get all these benefits. It was a lot of fun publishing this [inaudible] fun on paper. What was not fun was going to Cisco and talking to actual engineers and them saying, doesn't seem like we'll be interested in this in the next five years. So, yes, you have to balance the tradeoffs. That's what I was saying. Yes, there are cases where I would like to make sure that, oh, I want this functionality to be provided in systems of the Internet 10 years from now, or this problem to be solved. I want to distill the architectural reasons why that might be the case, I want to pursue where we'd like to be and come up with solutions like CONMan. On the other hand, I would like to keep my engineer hat on and come up with solutions that can help with things today, too. So I'm very happy with what CONMan ended up with. So I didn't want to give the impression that, oh, there's some work in the past that I'm not very proud of. I really enjoyed that a lot. >> Dave Maltz: That's great. >> Hitesh Ballani: Thank you. Thank you. [applause]