>> Kristin Lauter: Okay. So today we're very pleased to have Sharon Goldberg
here visiting us from Boston University. She was a post-doc at MSR New
England and has been very involved with various research efforts at Microsoft
Research, and so we're very pleased to have her visiting and talking to us about
finding incentives to secure internet routing.
>> Sharon Goldberg: Okay. Hi. Thank you so much for having me and giving
me this venue to speak.
So what I want to talk about is a problem that we've been working on for about
two years. We've been trying -- well, for three or four years, but this particular
problem is we're trying to understand why the secure routing protocols, which
from a cryptographic perspective and a security perspective are pretty trivial, the
protocols themselves, why is it so hard to actually get them used in the internet.
So what we're trying to understand is the incentives for really deploying these
secure routing protocols that will prevent all of the attacks, why is it so hard to
do this and what ways we might use to encourage people to actually deploy
them.
So this is two different papers -- actually three different papers I'm going to talk
about, joint work with Phillipa Gill, who's a student at Toronto, Pete Hummon,
who was a student at Princeton when I worked with him, Zhenming Liu is a
student at Harvard, Jennifer Rexford, who's my advisor, and Michael Schapira,
who's been my collaborator for the past four years. He's going to be at the
Hebrew University very soon.
Okay. So what I want to talk about today is I'll start off by showing some of the
attacks that we worry about in interdomain routing, so in particular we're going to
be focusing on the security of the global internet routing system and attacks on it
and some of the security protocols that have been proposed to defend against
these attacks. And if you look at the security protocols and you're not too excited
about how novel they are, that shouldn't surprise you. They're very simple.
The real focus of the talk is trying to understand how do we actually get networks
to deploy these secure routing protocols. And to this end I have two different
approaches that we've been working on for the past two years, one of which is
because it somehow improves their security, so ISPs or networks have an
incentive to deploy secure routing because it makes their networks more
secure. That seems logical.
But that may be a little bit of a hard sell to networks, so one of the other
approaches that we propose is not because they care about security but because
it's somehow going to impact how much money they make. So they're going
to somehow lose money from their customers unless they become secure, and
we're going to do a study of seeing how they actually -- the dynamics of how this
all happens.
So before I start talking about economics, let's focus on the actual problem. So
the setting that we consider is the global internet routing system. So you can
think of a graph where each node is a network. So, for example, Level 3 is an
ISP. This is an organization that provides [inaudible] service across the world.
This is China Telecom. You can see it's a residential provider in China.
This is Verizon Wireless, and this is another network owned by Verizon Wireless,
so you can see that each one of our nodes is a giant network. Inside the node
there's hundreds of routers. But we're not going to worry about that. We're just
going to worry about the view of these networks talking to each other, and that's
the system we're going to analyze.
So what this system does is it sets up paths from ASes, so we call these ASes
autonomous systems, these nodes, to destination addresses. So you can think
of -- this is an IP address that all these guys want to reach. And when you think
of IP addresses we sometimes think of IP addresses being mobile. Don't think
about them this way. IP addresses are allocated to these particular networks.
So this IP address belongs to this network 22394. That's the number of the
network. And this is the IP address block that belongs to this network.
So what's happening here is this is the routing protocol message, and the routing
protocol message is saying if you want to reach this IP address block, then route
to 22394.
Yes?
>>: [inaudible]
>> Sharon Goldberg: So slash 24 means that the first 24 bits, that's the block of
addresses, and the remaining 32 minus 24 bits are sort of all the different
addresses that are subparts of this, so we call it a prefix. So this is a contiguous
group of IP addresses with the first 24 bits being this.
So you can have different groups. You can have slash 24, slash 22 is a bigger
group of IP addresses, slash 8 is a bigger group of IP addresses. So this block
belongs to this network.
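
As a rough illustration of the prefix notation described above, here is a minimal Python sketch using the standard `ipaddress` module. The prefix `203.0.113.0/24` and the `10.0.0.0` blocks are placeholder values chosen for the example, not the addresses shown on the slide.

```python
import ipaddress

# Hypothetical prefix from a documentation range; not the block from the talk.
prefix = ipaddress.ip_network("203.0.113.0/24")

# A /24 fixes the first 24 bits, leaving 32 - 24 = 8 bits for individual addresses.
host_bits = 32 - prefix.prefixlen
num_addresses = prefix.num_addresses

# Shorter prefix lengths cover larger groups of addresses.
sizes = {n: ipaddress.ip_network(f"10.0.0.0/{n}").num_addresses for n in (24, 22, 8)}

# An address is in the prefix when its first 24 bits match.
inside = ipaddress.ip_address("203.0.113.77") in prefix
outside = ipaddress.ip_address("203.0.114.1") in prefix
```

Here `sizes` shows the ordering mentioned in the talk: a slash 22 holds more addresses than a slash 24, and a slash 8 holds more again.
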
And so this is how the routing protocol goes. And let's assume that this network
routes this way, its traffic, to reach this IP prefix. And then when the next
network wants to announce how to reach the prefix, it does what's called a path
vector protocol. So this protocol is BGP, a path vector protocol. What it's doing
is it says the path to reach this prefix is Verizon Wireless and then this network.
So everyone, when they announce their path, they just put their name in the
path. So the next guy is just going to say the path to reach this prefix is Level 3
Verizon 22394. That's what the protocol does. It puts down paths. Each node in
the path puts down its name and it says this is the destination we're trying to
reach, this block of addresses that belongs to this guy. Okay. And that's the
system we're going to focus on securing.
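
The path-building step described above can be sketched in a few lines of Python. This is only a toy model of the idea that each network puts its own name in front of the path it learned; the function name `announce` and the prefix value are assumptions made for the example, and real BGP messages carry much more than this.

```python
def announce(received_path, my_name):
    """Return the path a network announces: its own name, then the path it learned."""
    return [my_name] + received_path

prefix = "203.0.113.0/24"   # hypothetical prefix, not the one on the slide

origin_path = ["22394"]                                  # the owner announces itself
verizon_path = announce(origin_path, "Verizon Wireless")
level3_path = announce(verizon_path, "Level 3")
```

After two hops `level3_path` reads Level 3, Verizon Wireless, 22394, which is the announcement the next neighbor would receive for this prefix.
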
So just to say a little bit before I get into the attacks, the reason that this is
interesting is each one of these networks is a separate business. They're not
under a single jurisdiction. They're owned by different companies. So, for
example, Microsoft is one of these -- is an example of one of these networks.
And when we want to actually go and upgrade the routing protocol we have to
find some way of making all these different networks that are controlled by
different entities actually agree on doing an upgrade, and that's not easy to do
because it's not like, you know, Windows wants to do an update, Windows does
the update. It doesn't have to tell anyone else because it's just Windows, right?
In this system we have multiple companies that have to cooperate with each
other if we want them to actually change what they're doing.
>>: [inaudible] in the network is basically like a set of trees and the next level is
a --
>> Sharon Goldberg: It's not exactly a set of trees. You can think of the
[inaudible] of the network in the following ways. So think of the core of the
network, what we call the tier 1s, there's a very small number of them. You can
think about 15ish networks that are all sort of connected to each other. And then
from there you have these trees, but they're not really trees. Like they're sort
of -- a node here might be in multiple trees. So it doesn't really look like a tree.
It's sort of -- there's sort of these offshoots, like two level offshoots from them, but
one node may be connected to this tree and this tree too, something like this.
So we'll talk more about the actual structure of the network later on in the talk.
Okay. So let me show you an attack. So who heard about what happened with
China Telecom in April 2010? This was in the New York Times. Okay. So let
me show you what happened.
So what happened was there was a router here -- and, by the way, this is from
data that we actually looked at. So there was a router here at China Telecom
that claimed to own a set of IP prefixes. So actually this was the address that's
really owned by this guy, but this address really lives here, but this China
Telecom router was lying and saying that it owned this address.
And it wasn't just this set of addresses, it was 50,000 different addresses.
And so this is obviously a lie because the address belonged in this example to
22394. So what happens -- so let's look at ISP 1. So this is an actual ISP that
I've anonymized. What does ISP 1 get? So it sees two announcements, it
doesn't know where the prefix really belongs, so it sees a long one, three hops,
and a short one, one hop. So it has some process of deciding what path it's
going to choose. And it turns out that in this case this path was shorter, and so it
routes this way.
So really what happened was the traffic from this ISP 1, which is a large ISP,
actually went to the wrong place. It didn't go here, it went over there. So that
was the attack.
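
A minimal sketch of the decision ISP 1 faces, assuming for simplicity that it picks purely on path length. That assumption is a simplification made for this example: the talk only says ISP 1 has "some process" for choosing, and the shorter path happened to win in this case.

```python
def choose_route(announcements):
    """Pick the announcement with the fewest hops (a simplification of real policy)."""
    return min(announcements, key=len)

legit = ["Level 3", "Verizon Wireless", "22394"]   # the long, true path: three hops
bogus = ["China Telecom"]                          # the false claim of ownership: one hop

chosen = choose_route([legit, bogus])
```

ISP 1 has no way of knowing which announcement names the real owner, so the one-hop claim wins and the traffic goes to the wrong place.
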
Even more interestingly, there was some interception going on. So what
happened was there was some honest router here -- well, anyway, a router that
wasn't misconfigured that knew how to reach the actual destination so it routed
that way through Level 3. That's how you really go. And then the cool thing was
the traffic went internally through the China Telecom network, went out the other
side, and went to its destination.
So what ended up happening is we call this a traffic interception attack because
the traffic was travelling through this network and went to the destination.
Nobody saw any disruption in service for this set of prefixes except that they
were sort of controlled by this network for a little while. So this is the kind of stuff
we worry about in this space.
>>: [inaudible]
>> Sharon Goldberg: Yeah. And it was potentially going through a network that
it didn't know it was going through. Like it didn't want the traffic for this
destination to go through that network.
But when you look at this as a cryptographer you say, okay, well, this should just
be -- if it's sensitive data, it should be encrypted, it should be authenticated, it
should never be going through the China Telecom network. And that's what we'd
like, but that's not actually what happens with all of the traffic that's going through
these networks.
>>: [inaudible]
>> Sharon Goldberg: Well, okay, so what's wrong in this case is that
technically what's happening here is that this ISP 1 is thinking that this prefix
belongs to China Telecom, but it really doesn't. It belongs to this one. So this is
sort of an incorrect piece of information. The correct piece of information that
should have been here was 22394 and then the prefix.
So we look at this as an attack because the decision from this ISP was that this
is the way the traffic should go. So, I mean, one way to look at all of this stuff --
and we have argued this. In fact, I've even argued this a couple years ago -- was
that, you know, don't assume BGP sends your traffic anywhere that you think it's
going and just encrypt and authenticate all your traffic.
But in reality people don't encrypt and authenticate all their traffic, and so this is
why we worry about this kind of attack. So this is false routing information
propagating.
Yes?
>>: [inaudible]
>> Sharon Goldberg: It could be denial of service, yes. So you have denial of
service.
>>: [inaudible]
>> Sharon Goldberg: So availability attacks are a big problem. Yeah.
>>: Did they put their name in the list [inaudible]?
>> Sharon Goldberg: No. No. So what this guy saw was a short announcement
through China Telecom, but the real path was ISP 1 China Telecom Level 3
Verizon Wireless block, but what ISP 1 saw was this.
>>: But did you say that as it goes through it appends the name of the path?
>> Sharon Goldberg: Oh, okay. So let's distinguish between the routing
announcements and the traffic flow.
So what you're seeing here is just routing announcements. It's saying if you
send traffic to me, this is how it's going to go. The traffic itself, as it goes through,
it doesn't get written. No writing -- the packets are not written with the path they
traverse. They just go there. Does that make sense? That's a really important
point. These are just routing announcements. So think of them like one
announcement every five seconds as opposed to packets every two
nanoseconds. Packets every two nanoseconds don't get written with their path.
That's too expensive.
>>: So how much announcement is made, like, from each node if you don't
obviously announce all the paths [inaudible]?
>> Sharon Goldberg: You do, actually. You do announce all the paths that
you're willing to carry for everyone. So basically there's 300,000 prefixes in the
internet, and you would make announcements for all 300,000 of them to your
neighbors for which you're willing to carry traffic for this. That's how much traffic
you have.
Okay. So slide No. 1. So let's go to slide No. 2. So this is bad.
Okay. So now what's actually nice is that we've been working on this problem for
20 years, and after 20 years I should impress upon you the excitement that
they're actually deploying a PKI for this system. And this took 20 years. So this
is actually happening. Like if you go to network operator conferences, which I went
to last week, half the talks are on this, how you should get this into your network.
So this is very exciting.
And what they have here is this public infrastructure that's a PKI which assigns
basically public keys to these networks. And in addition to the public keys, it also
assigns mappings for prefixes, so it will have an entry that says this prefix belongs to
22394. And that will be included in the database as well, so there's an allocation
of addresses to ASes as well as public keys.
So of course the attack we saw before, to all of the crypto people in the room this
is not going to succeed, right, because we can check the RPKI, see that China
Telecom doesn't own that prefix, and this announcement should get dropped.
So the thing that we tell operators is that this would stop most of the attacks we've
seen to date because most of the attacks we've seen just look like this. It's a
false announcement of a prefix you don't own.
But, of course, you know, if we have really malicious attackers -- so here's a
hypothetical situation. The malicious attacker can still fool the RPKI if it wants to
do the attack we saw before. All it has to do is announce itself, right, attached to
the actual destination that owns the IP address, and then the attack will still
succeed because this path is shorter and he wins.
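
The two cases above can be sketched as follows. The table `roa_table`, the function `origin_valid`, and the prefix value are hypothetical names invented for this example; the sketch only models the check that the last network on the path is the registered owner, and again assumes the receiver picks the shortest surviving path.

```python
# Hypothetical RPKI-style table: prefix -> network authorized to originate it.
roa_table = {"203.0.113.0/24": "22394"}

def origin_valid(prefix, as_path):
    """Check only that the origin (last entry on the path) is the registered owner."""
    return roa_table.get(prefix) == as_path[-1]

prefix = "203.0.113.0/24"
hijack = ["China Telecom"]                          # claims to own the prefix itself
forged_link = ["China Telecom", "22394"]            # claims to sit next to the real owner
legit = ["Level 3", "Verizon Wireless", "22394"]

# Drop announcements whose origin does not match the table.
accepted = [p for p in (hijack, forged_link, legit) if origin_valid(prefix, p)]

# Among what is left, the shorter path still wins.
shortest = min(accepted, key=len)
```

The plain hijack is dropped, but the forged two-hop path passes the origin check and is still shorter than the true path.
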
So obviously RPKI on its own just having a mapping from prefixes to the owners
is not going to stop all possible attacks, so I think in five seconds every single
person in this room can come up with a solution to -- now that we have a public
key infrastructure, how do we prevent this kind of attack. So we'll use digital
signatures, right?
So basically we have this protocol called BGPsec, or secure BGP. You may
have heard of it. All it is is the standard thing that we would expect with digital
signatures. Now instead of announcing the prefix in the clear you're going to sign
it.
So what this says is I'm 22394, here's the prefix you want, and, Verizon, I
authorize you to use this announcement. And then that whole business gets
signed by the key of this guy.
So now we can see we can start building up chains of signatures where because
you've signed the forward hop saying Verizon is authorized to use this, you can
start chaining this together and see that when you get an announcement,
everyone on the path has actually made the announcement.
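
A toy sketch of the chaining idea, under loud assumptions: real BGPsec uses public-key signatures, while this example substitutes HMAC from the Python standard library so that it runs without extra packages. With HMAC the verifier holds the same keys as the signers, so this models only the structure of the chain (each network signs the prefix, the path so far, and the next hop it authorizes), not the actual security. All names and keys below are made up for the example.

```python
import hashlib
import hmac

# Stand-in keys. In the real protocol these would be private keys whose public
# halves are published in the PKI; HMAC is only a stdlib substitute here.
KEYS = {
    "22394": b"key-22394",
    "Verizon": b"key-verizon",
    "Level3": b"key-level3",
    "ChinaTelecom": b"key-chinatelecom",
}

def sign(signer, prefix, path, next_hop):
    """Signer attests: this path reaches prefix, and next_hop may use it."""
    msg = "|".join([prefix, ",".join(path), next_hop]).encode()
    return hmac.new(KEYS[signer], msg, hashlib.sha256).hexdigest()

def extend(announcement, signer, next_hop):
    """Put signer at the front of the path and add its signature for next_hop."""
    prefix, path, sigs = announcement
    new_path = [signer] + path
    return (prefix, new_path, [sign(signer, prefix, new_path, next_hop)] + sigs)

def verify(announcement, receiver):
    """Each network on the path must have signed for the network in front of it."""
    prefix, path, sigs = announcement
    if len(sigs) != len(path):
        return False
    for i, signer in enumerate(path):
        next_hop = receiver if i == 0 else path[i - 1]
        expected = sign(signer, prefix, path[i:], next_hop)
        if not hmac.compare_digest(expected, sigs[i]):
            return False
    return True

prefix = "203.0.113.0/24"   # hypothetical prefix

# Honest chain: 22394 authorizes Verizon, Verizon authorizes Level3.
honest = extend(extend((prefix, [], []), "22394", "Verizon"), "Verizon", "Level3")
ok = verify(honest, "Level3")

# Attacker reuses the announcement that 22394 signed for Verizon and puts itself in front.
forged = extend(extend((prefix, [], []), "22394", "Verizon"), "ChinaTelecom", "ISP1")
bad = verify(forged, "ISP1")
```

The forged chain fails because the signature from 22394 names Verizon as the authorized next hop, not the attacker.
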
So that's the solution to these types of attacks. Now, are there any questions on
that before I go on?
Okay. So what are the challenges in deploying this protocol? Well, first of all, it's
an online protocol, right? So whenever you have an announcement you have to
sign it. That means you have to pull out your private key and do the signature.
There's also a need for keeping a cache with lots of public keys so you can do
the verifies, because you're getting announcements from all over the internet
because you need to know how to reach every destination in the internet which
means you'll eventually get a signature from someone really far away you're
going to have to verify. So there's a lot of overhead involved in deploying this
and it's very expensive. So that's really the reason that people sort of have
argued against using this.
Another interesting thing -- okay, so obviously this doesn't work. Okay. So the
attack fails, right? Because with the attack from before he couldn't announce his path
because he doesn't have a signature saying that he's authorized to announce a
path to that prefix through him.
Okay. So another interesting thing about this setting is that we have to consider
partial deployment scenarios. So like I said, each one of these networks is
owned by a different company, so we're never going to be living in a world in
which one day everybody wakes up, turns on their router that's equipped with the
latest crypto accelerator that can do lots of signatures really fast and, boom, the
whole internet is secure, right? That's not going to happen. We're going to have
multiple, like, islands of secure networks, some people are going to be on, some
people are going to be off.
And so one thing that scares network operators is the following. Let me first
show you -- let's consider the following scenario. Let's imagine our China
Telecom router is doing the same attack as before. This network is going to be
insecure. Let's assume he hasn't deployed secure routing. All of these guys
have deployed secure routing. So if this network here gets these two
announcements, if it just says, okay, the path is shorter, then the attack will still
go through. This is -- I'm saying something very obvious here, but the point is
that if this network, the one that's secure, doesn't use security in its decision
process, it doesn't change its routing based on security properties, then the
attacks will still go through.
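
The point about the decision process can be sketched like this. The ranking used when `use_security` is on (fully signed paths first, then shorter paths) is an assumption made for the example, one possible policy among several and not one prescribed in the talk.

```python
def choose_route(announcements, use_security):
    """announcements: list of (path, fully_signed) pairs; returns the chosen path."""
    if use_security:
        # Assumed policy: prefer fully signed paths, break ties by length.
        return min(announcements, key=lambda a: (not a[1], len(a[0])))[0]
    # Security ignored: shortest path wins, signed or not.
    return min(announcements, key=lambda a: len(a[0]))[0]

legit = (["Level 3", "Verizon Wireless", "22394"], True)   # every hop signed
bogus = (["China Telecom", "22394"], False)                # attacker has not deployed

without_security = choose_route([legit, bogus], use_security=False)
with_security = choose_route([legit, bogus], use_security=True)
```

With security left out of the decision the shorter unsigned path is still chosen, even though the receiving network has deployed the protocol.
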
So something that scares network operators in addition to upgrading their routers
and having all this expensive hardware in their routers is it'll change their traffic
engineering. And if you think of what these blobs are, these are hundreds or
thousands of routers that are engineered to have particularly -- you know, certain
capacity links between them, and everything is designed within an inch of its life,
and if all of the traffic in the world just shifts in one direction because of security
or not, then you can have big issues. So even convincing the operators to turn
this on and use this in their decision process is a big deal.
Yes?
>>: Is it true that in your threat model you preclude any possibility of collusion
attacks?
>> Sharon Goldberg: I haven't even given the threat model. I'm not really going
to in this talk. There are works that look at threat models. I'm not going to even
talk about security in this talk at all. I just wanted to motivate why the protocol is
needed, but --
>>: Because it seems like even when you deploy this kind of public key system,
if China Telecom and Verizon collude, it can still announce as a legitimate owner
of that IP address.
>> Sharon Goldberg: Yeah, exactly. So it's very vulnerable to those types of
collusions. People don't really worry about them as much just because I don't
see a good way to prevent those kinds of collusions.
>>: I was just wondering how that had an impact on the incentive if you just went
out without the deployment of this kind.
>> Sharon Goldberg: Yeah. So one thing is -- when I talk about incentives for
deployment I'm going to sort of divorce the security considerations from the
incentives in some sense. So we want to eventually get to a world where
everyone's secure, so I'm sort of going to think about how do we get there and
not exactly what happens along the way. That's sort of what we're working on
now.
But in terms of collusion, I haven't seen anything that's really satisfactory. And in
some sense I don't know how to do anything because if these guys are in
collusion, then they can share secret keys and start doing all kinds of stuff.
Yeah, it's not really something people worry about too much.
Okay. So the point is routing policies have to be impacted by security as well.
So just to recap, why is it taking so long to deploy this stuff? And when I say so
long, [inaudible] has been working on this for 20 years. That's what he told me
when I talked to him. He's one of the people that I work with on this stuff.
So for many years they said that we'd never have a public key infrastructure. If
you think about it, there's a lot of politics involved. You have to get, like, Russia
and China and the U.S. to agree on a single root of trust. That may not be so
easy.
But now it's actually happening, believe it or not, and these political issues are
somehow actually being sorted out. It's kind of amazing. So you actually have
the Asia routing organization and the U.S. and the European routing organization
sort of all advocating for the same RPKI. So it's pretty exciting.
So now that this is happening, this problem becomes more realistic. It's more
realistic to start thinking about actual secure routing deployments. But now we
sort of have to think about what's going to drive these ASes to do the deployment
locally.
So the first problem is they're economically motivated. They're not just going to
turn it on when we say turn it on. It costs them money.
And the second thing is that if you think about how these systems look, imagine
you're the first one to become secure. Nobody else in the world is secure except
for you. What have you gained? You've gained nothing. Because you can't sign
anything from anyone. You can't send anyone signatures because no one can
read them. You can't read anything else because nobody else has keys.
So sort of the initial adoption -- the initial benefits of adopting this thing are very
low, so we need some way of sort of creating a large number of people that are
secure so that they can start signing things for each other before the protocol will
even do anything at all.
>>: So it seems like you're saying that this thing is going to be deployed, so
presume that somebody's pushing for it, so I imagine [inaudible] 15 top level
guys. So the way it usually works, they would just agree [inaudible].
>> Sharon Goldberg: What's wrong with that? Okay. So that is actually the
scenario we consider. There are some issues. So what's wrong with that? Can
I answer that question later? It's just -- like what [inaudible] is saying is like we
have -- so there's 36,000 -- let me put some numbers. So there's 36,000
networks in the world, and there's 15 top level guys. And this is very nice for us,
actually, because if you want to look at the degree distribution of this graph, it's
like one, this is 30 percent, two is 30 percent, three is like 15 percent, and then
it's like up to a thousand. So it's a really, really, really, really skewed degree
distribution. It's extremely skewed.
So, yeah, what [inaudible] is saying is that if the top level ASes deploy, that should be
enough. So that's actually what we simulated, and we showed that does actually
work under some set of conditions. It doesn't always work. You need to create
the right set of conditions, and that's what this work is about, how do we create
the right set of conditions for that. So that's --
>>: [inaudible] other people who are pushing this technology [inaudible] you're
saying it's finally happened, so somebody is making it happen --
>> Sharon Goldberg: So this is not finally happening. Only the PKI is
happening. This is happening. PKI is happening.
>>: [inaudible]
>> Sharon Goldberg: It's a combination of governments, the top guys, Cisco.
There's these things called routing information registries which is where you have
to register your IP addresses. They're pushing for it. If you go to the network
operator conferences, there's a very strong social network. There's sort of a lot
of peer pressure. So it's not exactly like the top 15 guys. There's all these
organizations that are in play here.
This thing is not happening. This thing is only being standardized right now. And
so if you're working on aggregate signatures, for example, this is your
application. So there's actually a new standard that has many different
requirements that are interesting for aggregate signatures. We actually have a
paper on that right now.
But this is not happening yet. This is like ten years out or five years out.
Okay. So this is our setting. And now what we want to understand is how can
we create local incentives for adoption that will drive global deployment. So this
is a very common problem in the mechanism design social networks literature.
This is a very beautiful paper for computer scientists to read, so I really
recommend reading it if you're even remotely interested in what I'm talking about
today.
So there's a paper called Technology Adoption in Social Networks. So what's
the idea? So basically you have this game, and it's a graph, and nodes are going
to deploy the new technology, so in our case it would be secure routing, when
their utility, for some measure of utility, exceeds some threshold. So if I get some
utility from the protocol, I'll become secure.
And all of these models basically look at the following thing. For example, am I
going to go see a movie? Well, I'll go see a movie if enough of my friends tell me
the movie was good and I should go see it.
So all of these utility models are a function of your immediate neighbors. In our
setting that doesn't work because in our setting what we want is paths to
destinations, so I want to reach IP addresses that are far away on the other side
of the world. So my utility won't only derive from what my neighbors are doing,
it's going to derive from how many paths I can see. So fundamentally our
problem is different from all of the problems that have been considered that base
themselves on neighbors. So technically it's a different problem.
There's also been sort of folkloric and all kinds of discussions about how we can
do protocol adoption, like new protocols like IP Version 6. If anyone's interested,
there's currently IP Version 4. We ran out of addresses. There's IP Version 6
which gives us more addresses. So people have been talking about how do we
get to IP Version 6.
So one thing people say is if you're not up to the latest standard of whatever
protocol you're supposed to be using, you're going to lose customers. So people
have talked about this idea. Nobody's actually shown that it will happen, and
that's what we're going to do. So we actually run simulations and show that this
is exactly what happens.
So that idea was out there in the literature. And then there were some works that
talk about how if you have a large enough group of people deploying, then the
security incentives increase and so you're going to deploy because you want the
security properties.
So there's sort of two approaches. One of them is you want money, customers,
you don't want to lose your customers. The other one is you want security.
So what we do in this work, this set of works, is that we're trying to develop
guidelines for how you should deploy these protocols, so who should the early
adopters be, like what [inaudible] suggested. How does it interact with routing
decisions, how does it impact routing decisions? And the approach we're going
to use is going to be to leverage the properties of the internet graph. So I'll tell
you a bit later about what the internet graph looks like and what properties we
can use. But our approaches are really strongly tailored to the specific structure
of the internet graph which is very interesting in particular with this kind of degree
distribution and other things that I'll show you.
So now that we developed our guidelines, how can we evaluate them? And
that's where we need modeling. So I don't look at these works as just modeling.
The models exist to evaluate the guidelines.
So the model involves how do we model ISP utility, how does routing happen.
We use the same sort of framework as Morris and Kempe-Kleinberg-Tardos
which is that ISPs are myopic and they're going to upgrade if their utility gets high
enough.
So we, of course, need to figure out what utility means for our context, but we
use the same sort of idea.
We analyze the model for tractability and convergence, which is the basic stuff,
but the really interesting stuff is running simulations. What we do is we take
maps of the internet or our best guess of what the internet looks like and simulate
these deployment processes to see if they work. And these simulations are quite
large scale.
Questions before I go on?
Okay. So now I'll show you some results. I want to show you two different works
that we're doing. The first one is more stylized, more theoretical, it's a very
pretty theoretical model, and in this model we're sort of assuming that ISPs' utility
depends on how much security they get from the protocol.
And then I'll show you a more realistic kind of work, much more simulation based,
where ISPs are going to upgrade because they care about revenue.
So let me start with the first one. So here's the problem that we define. We have
an input which is a graph, and each node is given a threshold. So this is node 7,
and it has threshold 4.
So this is the input to our problem. Nodes, a graph, and a threshold. Initially
everyone is insecure except for the early adopters. So let's assume that these
guys are the early adopters. And now each node is going to become secure if
his utility exceeds his threshold. So at the end of the game we want everyone to
be secure, but we want to find the smallest set of early adopters that will cause
this to happen.
So if you've read Kempe-Kleinberg-Tardos, this is very similar except that our
utility function is given by the size of the secure connected component adjacent
to a node. So let's look at what that means.
So if we look at the utility of this guy, his utility is going to be 2 because he can
reach 2 nodes that are secure from his network. So he'll turn on because his
threshold is 2, he's got 2 nodes he can reach, and that's enough for him.
If we look at this guy's utility, his utility is 3 because he can reach these 3 nodes,
and so his threshold is 2, and so he's going to become secure. And that's how
the game goes.
And what we want to do is find the smallest set of early adopters that we can.
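
The game described above can be sketched on a toy graph. Everything concrete here is an assumption made for illustration: the four-node path graph, the thresholds, and the reading of "exceeds" as "at least" (which matches the example where a node with utility 2 and threshold 2 turns on). The search for the smallest set of early adopters is plain brute force over subsets, which is exponential and only workable on tiny inputs.

```python
from itertools import combinations

def utility(graph, secure, node):
    """Count secure nodes reachable from node along paths that stay inside the secure set."""
    seen, stack = {node}, [node]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v in secure and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) - 1   # do not count the node itself

def cascade(graph, thresholds, early_adopters):
    """Repeat until stable: an insecure node turns on once its utility meets its threshold."""
    secure = set(early_adopters)
    changed = True
    while changed:
        changed = False
        for node in graph:
            if node not in secure and utility(graph, secure, node) >= thresholds[node]:
                secure.add(node)
                changed = True
    return secure

def smallest_seed_set(graph, thresholds):
    """Brute force: try seed sets in order of size until one makes everyone secure."""
    nodes = list(graph)
    for k in range(len(nodes) + 1):
        for seeds in combinations(nodes, k):
            if cascade(graph, thresholds, seeds) == set(nodes):
                return set(seeds)

# Toy input: a path a - b - c - d with made-up thresholds.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
thresholds = {"a": 1, "b": 1, "c": 2, "d": 3}

from_a = cascade(graph, thresholds, {"a"})
from_d = cascade(graph, thresholds, {"d"})
best = smallest_seed_set(graph, thresholds)
```

Because a node's utility never decreases as more nodes become secure, the final secure set does not depend on the order in which nodes are examined within a round.
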
>>: [inaudible]
>> Sharon Goldberg: Yeah.
>>: [inaudible]
>> Sharon Goldberg: Yeah. So there's sort of rounds and it's sort of a cascade.
So you want to figure out who should be the first set of nodes that you turn on, so
this thing cascades and the whole network turns on.
>>: What's the motivation for the different thresholds? Just their cost to
upgrade?
>> Sharon Goldberg: Yeah. So the threshold will represent a cost to upgrade.
So this is really a very general formulation --
>>: [inaudible]
>> Sharon Goldberg: No, actually at this point I don't have a good argument for
what threshold should be assigned for what person. At this point this is really like
much more theoretical, and the next set of works I'll show you is much more
grounded, but this one is more about like is there a way to actually figure out who
these early adopters should be if you had the graph and if you knew the
thresholds.
And there's actually a huge amount of work on this in the mechanism design EC
type of people. It's [inaudible], those people. They've done a lot of work on
understanding, like, what the threshold should look like, like if the thresholds are
random can you get better algorithms, if the thresholds are not random it
becomes much harder. We haven't been able to do anything with giving models
to the thresholds very much with this particular problem.
Yes?
>>: Could you explain from the ISP's perspective, what does it mean when the
utility exceeds the threshold?
>> Sharon Goldberg: Yeah. Okay. That's what I want to talk about. So why did
I choose this utility function?
We chose this utility function for the following reason. What this looks like to him
is if he's sitting here and he's saying if I became secure, how many places in the
internet could I reach along paths that are secure? So what this is saying is if I
can reach a destination here and I can reach a destination here, so that's sort of
two destinations that I can get to that are secure. If I continue, for example --
what this guy is getting is if he becomes secure, he can now walk to one, two,
three, four places along secure paths.
>>: [inaudible]
>> Sharon Goldberg: Yeah. So basically you can't -- so the way the BGP
protocol -- the way the signature protocol works, you only have guarantees that
that everyone announced the path if everyone on the path was already secure. So a
path is only secure if everyone on that path is secure. Does that make sense?
So you're sort of only looking at the secure connected component because, for
example, if this guy was secure here, it's not going to contribute -- well, I need
another node, but if there was some other node here that was secure, it wouldn't
contribute to the utility of this guy because he can't reach it along secure paths,
so he's only looking at the nodes that he can reach because those are the only
ones he can get secure routing announcements from. Make sense? So this is a
really, really stylized simplified model of this problem.
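A minimal sketch of the stylized game as described so far, assuming an undirected graph stored as an adjacency dict and assuming a node turns on when its utility is at least its threshold. The names `secure_reach` and `cascade` and the toy graph are illustrative, not from the paper.

```python
from collections import deque

def secure_reach(graph, secure, node):
    """Number of secure nodes `node` could reach along all-secure paths
    if it became secure itself (the node does not count itself)."""
    seen = {node}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            # Only walk through nodes that are already secure.
            if v in secure and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) - 1

def cascade(graph, thresholds, early_adopters):
    """Run rounds until nobody else turns on; return the final secure set."""
    secure = set(early_adopters)
    while True:
        joiners = {n for n in graph
                   if n not in secure
                   and secure_reach(graph, secure, n) >= thresholds[n]}
        if not joiners:
            return secure
        secure |= joiners

# Toy path graph a - b - c - d with hypothetical thresholds.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
thresholds = {"a": 1, "b": 1, "c": 2, "d": 3}

from_a = cascade(graph, thresholds, {"a"})  # cascades down the whole path
from_d = cascade(graph, thresholds, {"d"})  # gets stuck immediately
```

In this toy example, seeding `a` turns on the whole path one round at a time, while seeding `d` goes nowhere because `c` can only reach one secure node and needs two.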
Okay. So we came up with this problem in 2009, and I couldn't solve it for a
really long time, and then Zhenming came along. Okay, we knew it was hard.
So it's NP-hard. Guess why? Because of set cover [phonetic]. So I think most
people here can probably come up with that.
But after two and a half years of not having any results on this Zhenming has
figured out how to do this. So here's the result that we have with Zhenming.
So suppose the graph has a small diameter. So what does diameter mean? The
largest distance between any pair of nodes. So actually the internet graph has an
extremely small diameter. On average the distance is 4. In the worst case it's like 11. So
it's extremely small diameter.
So suppose you have a graph with small diameter and you have a small number
of thresholds. Then we actually have an approximation algorithm to choose the
optimal early adopters so our set is going to be too big. We're going to be off by
this much. So radius squared times the number of thresholds times log squared
the number of vertices. I think the squared is gone now, actually. Yes, the
squared is gone.
Okay. So what does this mean? So suppose that we only have -- so suppose
this set of thresholds that nodes can have, suppose they can be only 2, 4, and 7.
That means I have a total of three thresholds. So that's the parameter here, T,
how many thresholds I have.
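Written out, the bound as spoken here would look roughly like the following. The symbols are assumed notation, not taken from the slide: r for the radius (or diameter) of the graph, T for the number of distinct thresholds, |V| for the number of vertices, S_alg for the set the algorithm returns and S_opt for the smallest possible set of early adopters.

```latex
|S_{\mathrm{alg}}| \;\le\; O\!\left( r^{2} \cdot T \cdot \log^{2} |V| \right) \cdot |S_{\mathrm{opt}}|
```

With thresholds drawn only from {2, 4, 7}, T = 3. The remark that "the squared is gone now" means one of the exponents has since been dropped, but the transcript does not make clear which one.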
So why is this -- why does this work? So if you think about this problem, think
about a graph where there's a blob of secure guys here and a blob of secure
guys there and they're not connected. Imagine someone in the middle turns on
and these two blobs now all of a sudden are connected. That means that the
utility of everyone surrounding these two blobs has just shot up by a lot, right?
The idea is that if this is secure and this is secure and all of a sudden this guy
turns on and bridges them, then the utility of everyone here has just changed
dramatically the moment that this guy turns on.
What Zhenming observed is that if you assume that the set of people that are
turning on are always connected, so if there's only one secure connected blob,
someone new turns on, the blob grows by one. Another new person turns on,
the blob grows by another one. So it's very easy to keep track of the changes in
the utility of each of the nodes.
So what he observed was that if we can make sure that the early adopters are
connected, then this blob grows in a very controlled way and we can actually
build this into an LP and solve it with an LP.
When we have this kind of complicated jumps in growth when these different
blobs get connected, then we don't know how to deal with the problem. And so
to me that was the big insight on this problem.
So we haven't actually run any simulations and seen how it works yet, but this is
the approximation.
Yes?
>>: So why is it the case that these blobs only grow by a little bit? That's the
way the algorithm works?
>> Sharon Goldberg: Yeah, the algorithm is designed so that if the early
adopters are connected, then the blob is always going to be connected, because
anyone who turns on has to be -- the only part that's on is going to be one
component, and anyone who turns on has to be connected to that component.
Otherwise it has utility zero. So the blob always grows by one at each time.
>>: [inaudible]
>> Sharon Goldberg: Yeah. You have to make sure that everyone's connected.
So if the optimal guys that you should turn on are these three guys but they're not
connected, you just connect them all and then you lose a factor of R.
>>: If I understand you right, now you're saying that I'm going to assume that the
adopters that we decide will adopt first will be connected.
>> Sharon Goldberg: Yes.
>>: And then we detect the algorithm and this gives them good approximation?
>> Sharon Goldberg: Yeah. Exactly. So the LP will solve for the -- it will search
for the optimal solution given that the early adopters are connected. That's what
the LP does.
And then there's many other things that it does, but basically that's the basic
thing.
So I can tell you more about this if you're interested.
Okay. So that's our theory view, and now we have to be pessimists. So this was
all very nice because what it was assuming is that networks, when they see more
security, they're happier and they're like, okay, now I'm going to spend a lot of
money on my routers, upgrade them all, and become secure.
But the pessimistic view says they're never going to do this. Even if everybody in
the world is secure, I don't really care if I'm not secure. It's just too expensive. I
don't want to do it.
So what we want to say is that -- so the pessimistic view says there's not really
any economic incentives to do this. The only gain you're getting from these
things is security incentives, and so no one's going to ever do this.
But here's -- we challenge this view with this next set of works where we're going
to show how this protocol actually has an advantage. Like I showed you earlier,
it affects route selection. So if you wanted to have any security impact, it should
impact the routes that your network chooses. Otherwise it's not doing anything
for you.
And so because it impacts route selection, it controls traffic flows. And if you
think about the way these networks work, they make money by attracting more
traffic. They get paid to carry traffic from their customers. That's where their
revenue comes from. So the more traffic they carry, the more money they make.
So the idea is, is because route selection is impacting traffic flows, the more
secure you are, you're going to draw more traffic into your network, you're going
to have more customers, and you're going to make more money. That's what
we're going to use to drive the next model, so that's what this is about.
Yes?
>>: So there is also this question -- I don't know if it's addressed -- but if you're
saying [inaudible] just want to get more traffic to generate more things, they can
potentially lie and try to claim that [inaudible].
>> Sharon Goldberg: No, it doesn't. It doesn't assume. So lying is another thing
altogether, and we've looked at lying in a separate work. What we're talking
about is let's assume that they're not going to lie in this model.
So let's assume the only way they can get traffic -- that everyone's going to be
honest, but they will get more money if they're getting more customers.
So in order to talk about money, I need to tell you a little bit about how money
flows in these systems.
So I haven't told you where the money is in the system. So let's start with a very
simple model that we use to understand the relationships between these
networks. So the first relationship that we have is customer-provider.
So this is a network that is a customer of this network.
So what's happening here is that this customer is going to pay the provider both
to send and to receive traffic. So you might think of -- like let's say Telecom Italia
might be a provider to Turkish Telecom. So Turkish Telecom might pay some
money to Telecom Italia both to send and receive traffic because Telecom Italia
is a much larger network. They have much bigger coverage, and so if you want
to reach more distant places, you'd have to pay.
So in this graph I have two customer-provider relationships. These guys -- this is
a customer of this one and also of that one.
The other kind of relationships we have is settlement-free peering agreements, a
peer-to-peer relationship. So what's happening here is you have two networks,
usually networks of the same size, and what they do is they freely exchange
traffic because the idea is that the amount of traffic that I'm sending and receiving
from this guy is equal to what he's sending and receiving to me.
So AT&T and Verizon have this type of relationship. The other place you see
this relationship is, for example, Google has this relationship with almost
everyone it talks to because Google wants everyone to get to them as fast as
possible so they can see more advertisements. So they're not in the business of
making money from internet routing, they're in the business of making money
from advertisements, so they allow these sort of relationships. So this is sort of
how money flows in this system.
And now a little bit more about the structure of the graph. 85 percent of the
nodes in this graph are a special kind of node called a stub. So what a stub is is
it's a node that only has providers. So this guy is a stub. He has only providers.
So if we think about the stub, do we ever expect to see the stub carrying traffic
like this? And the answer is no, because if it was doing that it would have to pay
this guy to receive the traffic. He's paying to receive and then he's also paying to
send over here. So that doesn't really make sense.
So the way to think about this network is it's sort of like 15 percent of the graph is
in the middle like this, and it's actually carrying traffic from one point to another.
So this actually happens. This is what we call transit. It goes from a to b to c.
And then there's these 85 percent of nodes that just basically act like sources
and sinks. And they may have degree 1, they may have degree 2. They're
usually low-degree nodes that are acting like sources and sinks.
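The stub definition above can be sketched as a small classification over the relationship graph. This assumes "only has providers" means no customers and no peers; the function name `find_stubs` and the toy network names are hypothetical.

```python
def find_stubs(customer_provider_edges, peer_edges=()):
    """Return nodes that have at least one provider and no customers or peers.

    customer_provider_edges: iterable of (customer, provider) pairs.
    peer_edges: iterable of (a, b) settlement-free peering pairs.
    """
    has_provider, has_customer, has_peer = set(), set(), set()
    for customer, provider in customer_provider_edges:
        has_provider.add(customer)
        has_customer.add(provider)
    for a, b in peer_edges:
        has_peer.update((a, b))
    return has_provider - has_customer - has_peer

# Toy topology: one multihomed stub, two regional ISPs, two peering backbones.
cp_edges = [
    ("stub", "isp_a"),
    ("stub", "isp_b"),
    ("isp_a", "backbone_1"),
    ("isp_b", "backbone_2"),
]
peers = [("backbone_1", "backbone_2")]

stubs = find_stubs(cp_edges, peers)
```

Here only `stub` qualifies: the two ISPs have a customer, so they carry transit traffic, and the backbones have both customers and a peer.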
So these are sort of customers. These are people who don't make money from
internet service. They're just consumers of internet.
So we also have to think about what we're going to do with these guys because
these guys are never going to make any money from becoming secure.
Everything is a cost to them. And so what we're going to have to worry about is
how do we find incentives for them also to become secure.
Any questions on this?
So here's an example. Suppose everyone's insecure, and let me just show you
how routing is going to happen in this example. We have Sprint here, and it's
learned two paths. Both of them are equal length paths. So how should it route?
It's just going to route this way because there's some tie-breaking algorithm that
routes it that way. I haven't told you what the tie break is. It was actually a hash
function in the simulation we ran here, but anyway, there's some sort of
tie-breaking algorithm.
So suppose traffic's flowing this way. Let's look at the incentives now.
So this network here has now attracted traffic from Sprint to the customer and the
customer is paying it, which means it's delivering more traffic to this customer
and means it's getting more money from this customer which means it's making
more money.
So the idea is because he's serving from Sprint to the customer, his utility has
gone up because he gets to charge the customer more.
So if these guys all become secure -- so I have three secure networks here.
Sprint learns now two announcements. One is secure and one is insecure.
So now let's make this assumption. So this is the first guideline that we're going
to make. Assume that secure networks break ties on secure paths. So if Sprint
learns two paths, they're both equally long, he's going to choose the secure one,
so he's going to route this way.
And now let's look at the utility. This guy's earned the right to deliver a lot of
traffic to the stub, but this guy hasn't, so his utility has gone down because he's
now serving less traffic to this customer and he's making less money. So this is
really the premise behind the work.
But because the secure routing is going to shift the traffic this way, the utility of
this ISP goes up and this ISP goes down, and that's going to create pressure for
him to become secure. Does that make sense?
Yes?
>>: Is it common that one customer is served by multiple [inaudible]?
>> Sharon Goldberg: Yes. Absolutely. Yeah. Very common. So I think it's like
level more than three quarters. Probably more than three quarters. A lot more
than three quarters. But I can tell you for sure it's three quarters.
>>: [inaudible]
>> Sharon Goldberg: So usually one is for backup and the other one is for
primary. That's one of the examples. So if one of the providers goes down, you
want to have another one that you can talk to. Sometimes you do load balancing
that way. But, yeah, most of them have --
>>: [inaudible]
>> Sharon Goldberg: No, no, no. Don't think of them as residential customers
like you at home. It's not [inaudible] at home, it's Bank of America, it's CIBC, it's
Royal Bank. Microsoft is not because Microsoft is huge and it has all sorts of
internet business that it does, but Boston University is a stub. So universities,
small businesses, things like that. Does that make sense? So it's not you at
home. You at home have only one provider. It's usually Comcast, and Comcast
for us is one network.
>>: So you're talking about the 36,000 ASes. Three quarters of those are
connected to more than one?
>> Sharon Goldberg: At least, yeah.
>>: [inaudible]
>> Sharon Goldberg: 85 percent. So of 36,000, about 30,000 of them are stubs.
This is the number of ASes, and we have about 5- or 6,000 ISPs, so people who
actually make money from service.
>>: Do those stubs pay in proportion to how much traffic they send?
>> Sharon Goldberg: Uh-huh.
>>: It's not like a [inaudible]?
>> Sharon Goldberg: Well, it could be capped, but you may have granularity, but it's
commensurate. So we model it as -- we can't model the granularity of the
payment agreements, but, yeah, there is scaling that way.
Okay. So here's our guideline for deploying security routing. So the first thing
that we say is we want secure ISPs to at the very least break ties in favor of
secure paths. So the example I showed you before, what we had was this guy,
he's learning two paths, one of which is secure and one of which is insecure.
Because these are of equal length, he should choose the secure one.
So we could have said that the very, very first thing you should do is prefer
secure routes. That would also be nice for us. That would be better for security.
That would be better for everything. But the problem is the ISPs probably won't
buy it if we tell them that now you have to change your routing policies that
secure routes is the most important thing, you know, bypass all your other
considerations and take security first. They're not going to buy it. So we said,
okay, just make sure that you break ties in favor of secure routes.
So this was sort of the weakest condition we could think of to get this to work.
The next thing is we have early adopters. So let's pick a couple of early adopters
to deploy this protocol.
And the final thing is that we have to deal with these stubs. So like I said, the
stub is not [inaudible] at home, it's Bank of America or Boston University. So
these are pretty decently sized networks that speak BGP, but they're not really --
they're customers, they're not providers.
So how do we get these guys to become secure? So here's what our strategy
suggests. As soon as this ISP here becomes secure, what it should do is its
job is to now upgrade its customers. So if I'm a stub and I'm connected to this ISP,
my ISP should go and have a project in which it upgrades me.
And how should it upgrade me and why is this even a reasonable thing to claim?
So the idea is that these stubs, because they only act as sources and sinks, the
only routing announcements they make are for their own prefixes. If I'm a sink of
traffic the only announcement I have to make are for the prefixes that originate at
me. So for the guy that we saw before it was, let's say, 10.10.10.0/24. The only
announcement I'm going to make is pertaining to this particular prefix. So it's
basically one prefix out of 36,000. So that's the only announcement that's going
to come out of me. Everything else I receive, the only thing I send is stuff
pertaining to me.
So because we're only sending announcements that are one hop long and
because they're only basically one prefix out of 36,000, you can have these guys
only sign and not verify, and the overhead of that is much lower. It's a software
upgrade rather than a hardware upgrade.
So now I don't need to keep the public keys of the entire internet. I just need my
own secret key and I only need to sign.
So the idea is that because this upgrade is so much cheaper, we want the ISPs
to do it to their customers. And actually the fact that it has no hardware upgrade
doesn't come from me, it comes from the router manufacturer, so I feel really
good saying this now because one of the router guys got up and said this at the
last meeting that I was at.
So it's cheap, and the security impact is only minor because you're not verifying,
but because the announcements terminate at you, by not verifying you're only
hurting yourself. So the impact of not verifying is quite small.
Yes?
>>: [inaudible]
>> Sharon Goldberg: In what sense?
>>: [inaudible] like when they join they just tell once --
>> Sharon Goldberg: Yeah. But there is a time stamp on these messages that I
haven't shown to prevent replay attacks. So you do have to what they call
re-beacon. You have to re-announce just because otherwise the time stamp will
go forever. So it's not very dynamic, it's very slightly dynamic. Like you could
think of days or half day or something like this.
>>: I mean, why should those guys be secure at all? Because if ISP knows that
this Bank of America is just pretty much connected to me and never changes, it
will just kind of --
>> Sharon Goldberg: So one solution is to have the ISP do the signing for him.
So he could take the public key of him, own the public key of Bank of America,
and do the signing for him. And that's actually another solution we proposed.
But that means Bank of America loses its ability to do the announcement itself,
but that's also fine.
>>: But that's my question. Why would they even do the announcements
because if it never changes --
>> Sharon Goldberg: So if it has two providers and maybe it wants to announce
it this way but not that way, this allows him to do it. It allows him to decide that I
want this prefix to go out on this edge and then not on this edge, for example. So
it gives it control.
Is there another question? Yes?
>>: It would seem like, more importantly, it would allow him to prevent, you
know, the Chinese ISP from hijacking his traffic.
>> Sharon Goldberg: Right. So he doesn't have to put his trust in anyone except
for himself. I mean, ideally if this ISP is not the Chinese ISP, then he might trust
him to do the announcement for him.
>>: Right. But if he only signs things for -- to localize anybody he's got a secure
group to, he's not going to get tricked into routing through the Chinese ISP
because that's not going to be able to get signed.
>> Sharon Goldberg: Right. Okay.
So I want to quickly finish. So this is basically what we say. So if you want to
know what our main contribution is, it's right on this slide.
So this is how we've proposed deploying the protocol. Make sure it impacts
routing decisions. Make sure that this process happens. And this is extremely
important because of the structure of the network, right? We want the edges to
have some reason of becoming secure, and the way we do that is by having their
providers become secure.
So we need to model if this works. So how are we going to model if it works?
Well, let's take our network, let's define the state: the state is the set of nodes
that are secure. So in this graph, those three yellow ones are secure. That's the
state.
And now we want to sort of model how deployment will go. Initially the early
adopters become secure, and every state, what's going to happen is that we're
going to compute the utility for each ISP given its current state. So, for example,
8359, we compute its utility in the current state and then we'd say what happens
if he became secure, how would his utility change.
If his utility changes enough -- so if it changes by, let's say, 5 percent, when he
becomes secure, just looking at the current state of the world, then he'll also
become secure.
So this is a myopic process. You look at the world right now, you say how much
do I gain by becoming secure, if it's enough, I become secure. That's basically
the whole model right there.
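A minimal sketch of this myopic process, assuming the utility computation is supplied as a callback `utility(isp, secure_set)` and that an ISP turns on when its projected utility exceeds its current utility by more than the threshold. The stub upgrades and the routing model are assumed to be folded into that callback; the toy utility below is invented purely to exercise the loop.

```python
def simulate_adoption(isps, early_adopters, utility, threshold=0.05):
    """Round-based myopic deployment.

    Returns (final secure set, number of rounds in which someone joined).
    """
    secure = set(early_adopters)
    rounds = 0
    while True:
        joiners = set()
        for isp in isps:
            if isp in secure:
                continue
            now = utility(isp, secure)                # utility in the current state
            projected = utility(isp, secure | {isp})  # same state, but with isp secure
            if projected > now * (1 + threshold):
                joiners.add(isp)
        if not joiners:
            return secure, rounds
        secure |= joiners
        rounds += 1

# Hypothetical toy utility: an insecure ISP earns 100; a secure one earns a
# per-ISP bonus for every other secure network.
bonus = {"A": 10, "B": 3, "C": 1}

def toy_utility(isp, secure):
    if isp not in secure:
        return 100
    return 100 + bonus[isp] * len(secure - {isp})

final_state, n_rounds = simulate_adoption(["A", "B", "C"], {"E"}, toy_utility)
```

In the toy run, A joins in the first round, B joins in the second once two networks are secure, and C never sees a gain above 5 percent, so the process stops there.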
So now this is really hard to work with from an analytic perspective. We can
show that things converge, blah, blah, blah, but actually the real work is to
simulate that this works.
So let me -- oh, okay. So I'm missing something very important. Sorry.
How do we actually measure the utility of these nodes? So the utility of the node
is how much traffic routes through it. So this is how we measure utility. We look
at the ISP, we look at its destination. So for every destination we see a
destination that can be reached along a customer edge, which means if the
destination is somehow down here I'm getting paid for delivering to this
destination, I count how much traffic I deliver to that destination, and that's the
utility for that particular destination. And then I sum over all destinations. That's
my utility.
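The utility just described can be sketched as a sum over destinations. This assumes the routing has already been computed, so that for each ISP we know how much traffic it carries toward each destination and which destinations it reaches along a customer edge; both input tables and the numbers in them are hypothetical.

```python
def isp_utility(isp, traffic_through, reached_via_customer):
    """Sum the traffic an ISP delivers to destinations it reaches over a
    customer edge; traffic toward other destinations earns nothing here."""
    return sum(volume
               for dest, volume in traffic_through[isp].items()
               if dest in reached_via_customer[isp])

# Toy numbers: the ISP carries traffic toward three destinations, but only
# d1 and d3 sit behind one of its paying customers.
traffic_through = {"isp": {"d1": 40, "d2": 25, "d3": 10}}
reached_via_customer = {"isp": {"d1", "d3"}}

u = isp_utility("isp", traffic_through, reached_via_customer)
```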
And now in order to compute utility -- all I need to do to compute utility is look at
the state and decide how these networks make their routing decisions. So to
decide how they make their routing decisions -- so the point is I need to compute
utility. I know the state. And that state fully determines how routing will go given
that I know how these networks make their routing decisions.
So we have a model for how they make routing decisions which is basically
you're going to prefer to route through your customer, someone who pays you is
preferred to route to over someone who doesn't pay you over a provider, over
someone you pay. So that's the first step in the decision process. So that's sort
of what is the cost of this route.
The second step is how long is this route. The third step is is this route secure,
and then, finally, there's a tie break if you need to do a tie break, and that will
allow us to determine all the routing.
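The four-step decision process above can be sketched as a sort key. The `Route` type and its field names are illustrative, and the SHA-256 tie break is an assumption standing in for the unspecified hash function mentioned earlier in the talk.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple

# Step 1: routes through a customer beat peers, which beat providers.
REL_RANK = {"customer": 0, "peer": 1, "provider": 2}

@dataclass(frozen=True)
class Route:
    next_hop_rel: str       # "customer", "peer", or "provider"
    path: Tuple[str, ...]   # sequence of networks to the destination
    secure: bool            # True if every network on the path is secure

def tie_break(route):
    """Deterministic stand-in for the simulation's hash-based tie break."""
    return hashlib.sha256("/".join(route.path).encode()).hexdigest()

def preference_key(route):
    # Lower tuples are preferred: relationship, then path length, then
    # security (False sorts before True, so secure wins), then the tie break.
    return (REL_RANK[route.next_hop_rel], len(route.path),
            not route.secure, tie_break(route))

def best_route(routes):
    return min(routes, key=preference_key)

insecure = Route("provider", ("A", "X", "D"), secure=False)
secure = Route("provider", ("B", "Y", "D"), secure=True)
shorter = Route("provider", ("C", "D"), secure=False)
via_customer = Route("customer", ("E", "F", "G", "D"), secure=False)
```

Security only matters when the first two steps tie: `secure` beats `insecure`, but `shorter` beats `secure`, and `via_customer` beats everything despite being the longest.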
>>: [inaudible]
>> Sharon Goldberg: So we did it empirically. We assumed that they had a
global view. In reality, it's not as easy to compute, but we just did it to do the
simulations because we didn't know how else to estimate.
>>: [inaudible]
>> Sharon Goldberg: We assume that they have a global view, yeah.
Okay. So --
>>: [inaudible]
>> Sharon Goldberg: Yeah, yeah. So estimating how much more you get by
becoming secure is something you might be able to test. It's not super easy to
test, but we put it in the model because we didn't know how else to measure a
gain, so we just said let's look myopically and see how much you gain.
>>: [inaudible]
>>: [inaudible]
>>: But one piece of change may have cascading impact --
>> Sharon Goldberg: Yeah, it has a cascading impact. So it is in fact the
cascading impact that makes things interesting. So it's really -- it is quite hard to
measure. Yeah, it is quite hard to measure.
But we just -- again, there's probably ways to estimate it. We could have noised
it up a little bit to account for estimation errors, but we didn't. That is actually one
thing that we talked about doing, but we didn't actually have time to do it.
>>: I think the global thing -- I mean, you can do the experiments, but even the
global thing, it's not understood how to compute it.
>> Sharon Goldberg: No, it's not. It's not. No, it's not.
Okay. So actually just to state, since I'm at Microsoft, this simulation is, by the
way, an order N cubed simulation. I don't know if you can see that from here.
The reason is you need to compute the all-pairs shortest paths for each one of
these ISPs. So all-pairs shortest paths is N squared and then there's N ISPs,
so it's an N cubed algorithm. N is this, 36,000. So it's big. So this is why I had a
lot of help from Microsoft SVC from Frank McSherry and Mihai Budiu who let me
use DryadLINQ. And so each one of these simulations took half a day to do. So
they were pretty fun. That was our big job last semester.
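To put a number on "big", here is the arithmetic for the figures quoted above, taking N = 36,000 and the N squared times N accounting exactly as stated in the talk.

```python
n = 36_000         # number of ASes quoted in the talk
pairs = n ** 2     # all-pairs work, as described
total = n ** 3     # repeated once per ISP

print(f"N^2 = {pairs:,}")
print(f"N^3 = {total:,}")
```

That is on the order of 10^13 basic steps per simulation, which is consistent with needing a cluster framework rather than a single machine.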
Okay. So let me show you some of the simulation results. So here's a
simulation that we ran. We picked 10 early adopters. By the way, these are their
network numbers, not their size. So ignore these numbers if you don't -- don't
get scared by these numbers. This is the wrong slide. I should have taken this
off.
Anyway, these are the 10 early adopters that we picked. So the early adopters
included content providers like Google and Microsoft, Facebook, Akamai and
Limelight. So these are the biggest content providers on the internet.
Then the 10 biggest providers, sort of networks that act like backbones, we
turned those guys on and then big content providers and we saw what we got.
And the exciting thing is that if we assume that the thresholds of all these
nodes -- so they need to see 5 percent increases in traffic before they turn on --
then we get 85 percent of these networks becoming secure.
By the way, it's not just 85 percent where these are all stubs. It's 85 percent
where 65 percent are ISPs. So 65 percent of the nodes in the core became
secure and a total of 85 percent of nodes became secure. So we really did see
significant adoption in the simulation.
And let me show you why. Sprint was one of our early adopters, and this was
the flow of traffic. So what happens in the fourth round -- first round is that this
guy says if I become secure, then my stub becomes secure and the traffic shifts.
And then the next thing that happens is this guy will say, wait a minute, I'm losing
money, let me become secure, and then he becomes secure and the traffic shifts
back, so his utility goes up.
And what you start to see is longer and longer paths. So this guy is going to say
if I become secure, if my stub becomes secure, the traffic shifts and I get more
money because I'm delivering traffic to a customer. And then we get even longer
paths where this guy says, okay, now I have this long, nice path here, I can
become secure and my stub will become secure and I can grab the traffic this
way. So we really just see this jumping in traffic.
And the interesting thing is that we only need to assume that they break ties on
secure routes. We don't have to assume that they prefer secure as the top thing.
Yes?
>>: What if you give them a little bit of ability to see in the future? Won't they say
in the long run it's not going to gain me anything?
>> Sharon Goldberg: Okay. Let me answer [inaudible] question. I have a slide
for you.
>>: What are the guys who resist this process? Are they the guys who have just
one stub --
>> Sharon Goldberg: Yeah, exactly. When you look at the guys who never
secure themselves, they tend to be providers with a lot of stubs that are
single-homed, and they don't have any competition, so they just don't become secure in
our simulation. That's true.
And then there's also the frog boiling guys. So I'll show you a frog boiling guy in
a second.
So here are some guys, and let's look at how their utility changes. So this one
became secure in round 4, 5, and never.
So this is the round 4 guy. What's happening is that people are becoming secure
and he's losing traffic, but he never gains enough by becoming secure. At this
point he sort of sees that by becoming secure his utility will increase by more
than 5 percent so he secures himself. And then we can see that his utility goes
up for a while.
Same with this guy. This was the guy who became secure in the fifth round. He
turns on and his utility goes up, but then it drops.
So what we're seeing is that sort of like at this point what the simulation was
showing was around here basically everyone who was going to be secure was
already secure. And so what's happening is this traffic shifted back to normal.
This guy was like frog boiling in the sense that he only loses about 4 percent of
traffic. He never gets to 5 percent, so he never turned on. The threshold was
too high for him.
But what we do see is the ones that are secure, their utility is sort of higher at the
end of the game than it was at the beginning, and the ones that don't become
secure, their utility's lower.
So the reason we see this kind of behavior was that this is actually a zero sum
game. We didn't assume that more traffic appears in the course of simulation.
It's just this customer traffic that's moving around. So it should actually go back
to this state. But what we do see is we see these periods where they're sort of
getting a lot of utility out of the protocol and then it goes back down to normal.
Yes?
>>: [inaudible]
>> Sharon Goldberg: They stole all my traffic so it's much worse for me.
>>: It seems like it could happen. I don't know if it's --
>> Sharon Goldberg: I have an example I'll show you. I think I want to end, so I
won't talk about that too much, but, yeah, like all kinds of strange things can
happen.
When we looked at this on average, we saw that generally when you become
secure, your utility will go up and stay up for a while.
Now, in fact, there are simulations where it got stuck very quickly, like it got stuck
with, like, 20 percent becoming secure. When it gets stuck, if you're one of the
secure ones, your utility is up and it stays up because the traffic doesn't shift
back.
So sort of two outcomes. One is the process gets stuck and you get more traffic
or, two, the process doesn't get stuck, you lose most of what you gained, but now
you have a secure internet. So it's sort of better for you to become secure.
So that was --
>>: [inaudible]
>> Sharon Goldberg: Well, this guy didn't -- he lost relative to here, but he didn't
lose relative to the initial state. He's still better off than he was at the beginning.
Okay. So we did a lot of work trying to understand who the early adopters should
be. I guess I'm going to skip this, but we have many, many simulations looking at
the effect of different thresholds and who should be secure.
What we ended up finding is that it's important to have the tier 1s. The tier 1s are
the providers at the core of the backbone networks becoming secure. The
reason for that is because the backbone networks have a lot of stubs attached to
them, and so turning on a backbone network sort of gets you a lot of stubs and
it's just going to amplify the effect because they have such high degree and so
many stubs attached to them.
The content providers have very high degree, but most of the degree is through
peering. So what Microsoft does, by the way, is it says, hi, I'm Microsoft, I'm in
this data center, I'm in this internet access point, please come be my peer, it's
free, and everybody does. So they have very, very high degree, but they don't
charge any money, so they don't have any stubs. So in our model they didn't
really upgrade any of their neighbors when they became secure.
>>: [inaudible]
>> Sharon Goldberg: Oh, I see. I see. Yeah, that would have been folded into
our early adopters. So our early adopters are the only ones with security-based
incentives and everybody else is just playing on traffic volume. Yeah. So
hopefully -- yeah, so hopefully these content providers, you can make the
argument that they have security-based incentives to do this, but we didn't mix
that -- no, we didn't mix that in.
Yes?
>>: I didn't quite get the incentive of why a [inaudible] will turn the stubs on,
especially when the stubs are provided mainly by multiple ISPs. So why I should
use my money to turn it on --
>> Sharon Goldberg: Yeah, so actually one of the goals of this stuff was to talk
about government incentives. So we actually do talk to the government about
this, and this is where we think that the government incentive should go.
So it's not completely clear what his incentive is to do this, so we want to create
artificial incentives for this to happen. So this is really the place where we think,
like, if you want to inject money into this process, inject it into having him turn on
his stubs and inject it into the early adopters, because that's where there's, like,
a lack of incentives.
There is also sort of like if you become secure and your stub becomes secure,
then you can steer the traffic over to the stub. Like if you see -- like you see what
happened here was the traffic used to be here, he becomes secure, turns on his
stub, and now he can steer it over because this destination -- he won't be able to
steer it over if this was insecure, you see, because this would not be a secure
path. The traffic wouldn't shift this way because this is not a secure path to this
destination.
>>: In your model [inaudible] sort of a passive entity instead of a very --
>> Sharon Goldberg: Yeah, yeah. He's not a player. So in the game, the only
people -- the stubs are non-players. We still have to compute the routes to them
and the routes that they choose, but they're non-players. The players are
everyone else, are the non-stubs. So the 15 percent of nodes are the players.
>>: But in reality are they actually players themselves because they choose --
as you said, they choose to announce the ownership of a prefix to maybe
different providers.
>> Sharon Goldberg: So one way to model the fact that they're players is they
could start creating links. In some sense they act as players in the following way.
They choose routes. So somehow -- you can't see it in this picture, but imagine a
stub choosing a route through him as opposed to through him. He's going to
increase his utility because he's got more traffic. So in some sense our model
does incorporate the stubs as players because they make routing decisions
based on security.
We found that we didn't even need that in our model. Like we basically get the
same results whether or not they make decisions based on security, but one way
to model them as players is to have them make decisions based on security.
Another thing that we didn't do was that you might imagine people forming and
breaking links. Because one guy's secure, all of a sudden everybody's forming
links to him. We didn't do forming links because it seemed like a very difficult
thing to model because you might need to incorporate geographic information,
and we didn't have good data on that, so we just didn't do it. But that's another
way you might think about, like, the graph reforming as a result of this, which we
didn't do.
Okay. So [inaudible] asked me if anyone ever wants to turn off. Yes. The
answer is yes. You can have weird things.
Can I take two minutes and show this?
>> Kristin Lauter: Sure.
>> Sharon Goldberg: Okay. So let me show you if you use a different utility
model why someone might want to turn off. So one thing if you start working on
BGP, anything crazy that you want to happen can basically happen. So let me
show you something crazy that can happen.
So these guys are secure. This is, by the way, out of a simulation that I set up
very, very carefully to make this bad example.
So these guys are all secure, this one's not secure. So let's look at routing in this
network.
Our destination is here. This guy, Akamai, who's, by the way, a big content
provider, is going to learn two paths, an insecure one, because this node's
insecure so this path becomes insecure, and a secure one. They're equally long.
They're both through providers so they're equally good paths, so he's going to
choose the secure route and he's going to route like this because this is secure.
Now let's look at the utility of him. He's paying to receive traffic this way. That's
not so great. Now let's imagine what happens -- so he's paying to receive traffic
from his provider.
Let's see what happens when 4755 becomes insecure. Now all of these paths
are insecure because this one in the middle is insecure, so what used to be
secure here is now no longer. So he's learned two insecure paths, and what
happened in this simulation was the tie break went this way.
So now if we look at the utility of this node, he's getting money from a customer,
and that's better for him. So his utility goes up.
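The undeployment example can be replayed with a small sketch. It is a toy under stated assumptions, not the simulator behind the slides: the ranking order (business relationship, then path length, then security, then an arbitrary tie break), the node names other than 4755, and the plus or minus one utility values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple

# Assumed preference order on the next hop: customer, then peer, then provider.
REL_RANK = {"customer": 0, "peer": 1, "provider": 2}

# Assumed utility for the node in the middle, by who hands it the traffic:
# it is paid by customers and pays its providers.
ENTRY_UTILITY = {"customer": 1, "peer": 0, "provider": -1}


@dataclass(frozen=True)
class Route:
    name: str
    next_hop_rel: str        # relationship of the next hop, as seen by the chooser
    nodes: Tuple[str, ...]   # ASes on the path, excluding the chooser
    enters_4755_from: str    # relationship, as seen by 4755, of the hop before it
    tiebreak: int            # arbitrary final tie break, lower wins


def is_secure(route: Route, secure: Set[str]) -> bool:
    """A path counts as secure only if every node on it is secure."""
    return all(node in secure for node in route.nodes)


def best_route(routes: Iterable[Route], secure: Set[str]) -> Route:
    """Rank by relationship, then length, then security, then tie break."""
    return min(
        routes,
        key=lambda r: (REL_RANK[r.next_hop_rel], len(r.nodes),
                       not is_secure(r, secure), r.tiebreak),
    )


# Two equally long provider routes learned by the content provider.
routes = [
    Route("via_provider_side", "provider", ("as_a", "4755", "dest"), "provider", 1),
    Route("via_customer_side", "provider", ("as_b", "4755", "dest"), "customer", 0),
]

secure_before = {"as_a", "4755", "dest"}   # as_b is the insecure node
secure_after = secure_before - {"4755"}    # 4755 turns security off

for label, secure in (("4755 secure", secure_before), ("4755 insecure", secure_after)):
    chosen = best_route(routes, secure)
    print(label, chosen.name, ENTRY_UTILITY[chosen.enters_4755_from])
```

While 4755 is secure, the secure route wins and traffic arrives from its provider; once it turns security off, both routes are insecure, the tie break picks the other one, and the traffic arrives from a customer instead.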
So what I'm showing you is an example where you might want to undeploy
because it might get you more revenue. And in fact this was so much fun that we
thought we can show oscillation examples, and it was even more fun when you
can show PSPACE-completeness.
So we can show that it's PSPACE-complete to determine if oscillations will
ever occur in this graph. So you know what you can do with these things is you
can make the game of chicken out of these things. If you're interested, I can
show you how to make chicken out of this thing. And then if you have a chicken
gadget you can actually hold states, and then you can build a Turing machine,
and then you can start to show oscillations. So we actually did do that.
And so we were able to show that it's hard -- if you have a utility function that
allows a node's utility to depend on where the traffic comes from -- by the way, the
other stuff that I showed you, we did not make the utility depend on where a node's
traffic came from. We just made it dependent on where it was going. In this
model I'm caring about the fact that it's coming in from a customer here versus a
provider there.
So we didn't -- in most of our simulations we took this out because we didn't want
to deal with oscillations, but if you want to start thinking about where the traffic
came in, then you can start to have oscillations and all kinds of bad things.
Although realistically I don't expect oscillations to happen because the structures
were so crazy that we don't really expect to see them, but in terms of actual
undeploy, yeah, I expect to see cases where you don't want to get the traffic this
way, so you may want to become insecure.
Okay. So just to conclude, what drives BGP security deployment? So we've
looked at multiple approaches, one of which is security, which was sort of a very
clean graph theoretic formulation, and then the other one, which is attracted
traffic, which was much more messy because we had to incorporate business
relationships and count traffic volumes through networks so it's much harder to
work analytically with these things.
If you ask me how we actually create a market for this deployment, the idea that
stubs need to be turned on by their providers is extremely important. Our
simulations would just get stuck without this.
We think it's very important that route selection is influenced by BGP security, so
the more secure you are, the more traffic you get. And the government
incentives should go in two places, having the ISPs turn on their customers and
having the early adopters turn on.
There's many, many open problems in this. We don't have good algorithms to
find early adopters so in our second paper we just did heuristics, large degree
nodes, high volume traffic nodes, things like this.
In the first approach we have some very restricted algorithms so far.
The next question that we're currently working on is trying to understand what
happens in settings where there's secure nodes and insecure nodes, what are
the security guarantees there. So it sounds like something people should have
looked at, and there has been a little bit of work on that, but there's no even sort
of metric of, like, you have 100 secure nodes on, what kind of security does the
internet get. We don't really know how to measure that, so we're sort of working
on that right now.
And that's it. And this is three different papers, one of which we're still working
on. Okay. Thanks.
[applause].
>> Kristin Lauter: Questions?
>>: So what is the -- you didn't mention it. So what is the effect on functionality?
Because presumably security is [inaudible], but if you're saying [inaudible]?
>> Sharon Goldberg: Yeah. So we haven't included capacities of links in the
models, and that's really problematic. So last week I spoke to the Network
Operators Conference on this, and I thought they were going to crucify me
because we didn't incorporate capacities, and they actually didn't crucify me, so I
was very happy. But we don't know how to do that yet.
So some of the things that bother them is that because their links are very
carefully engineered, even the claim that we're making that they should break
ties on secure paths is sort of scary to them because they're afraid that they'll
burn out the links by doing that.
>>: So you're saying nobody has done studies how security will affect
functionality?
>> Sharon Goldberg: We don't know. We don't know. So that's another
important open thing that we don't even, at the moment, have the data to even
analyze. So that would be an important thing to try to understand.
Some of the guys I spoke to said they would be willing to do it if, like, all their
neighbors did it at the same time and they could sort of minimize the impact or
they could pretest with one destination and see what it does. But I haven't
really -- yeah, we don't really understand that. In fact, the idea that this should
even impact traffic engineering is not really -- wasn't really thought about until we
did this paper.
>>: When you were looking sort of at the payments models that you're
optimizing, do people tend to pay by the packet or pay by the width of the pipe?
>> Sharon Goldberg: It depends. Sometimes it's by the width of the pipe,
sometimes it's by the packet. I think in Europe it tends to be more by the
packet -- not exactly by the packet but sort of much smoother, and then a lot of
other places it's sort of by the pipe. So sometimes it's like if you exceed the pipe
by something, then you get paid. But a lot of the time I think it's like 95th
percentile utilization, and that's what you pay on.
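As a rough illustration of paying on 95th percentile utilization, here is a minimal sketch. The nearest-rank percentile method, the idea of periodic utilization samples, and the function name `percentile_95` are assumptions made for illustration; actual contracts may define the measurement differently.

```python
from typing import Sequence


def percentile_95(samples_mbps: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a list of utilization samples."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # ceil(0.95 * n) computed with integers to avoid floating-point surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Twenty hypothetical samples: steady 10 Mbps with a single burst to 1000 Mbps.
samples = [10.0] * 19 + [1000.0]
print(percentile_95(samples))
```

With these samples the single burst falls in the top 5 percent and is ignored, so the billable rate is the steady 10 Mbps rather than the peak.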
>>: Does it ever vary according to where the packets are going?
>> Sharon Goldberg: I don't think so. I haven't heard --
>>: Because it would seem like when you're talking to an ISP, sometimes you're
sending a packet to somebody else that's this charge and sometimes it's listed as
somebody that's going to charge him.
>> Sharon Goldberg: No, I don't --
>>: [inaudible]
>> Sharon Goldberg: They should charge you more for packets that are going
further? Yeah, that would violate net neutrality --
>>: So net neutrality says that they're not allowed to charge differently?
>>: Well, it's not in place --
>> Sharon Goldberg: Let me show you an example of something that they might
do --
>>: Either they're going to allow that or not allow that.
>> Sharon Goldberg: Here's an example of something they might do. A routing
policy that sometimes people use that we don't really see often is that they'll have
a customer here that pays it for servicing and they'll only announce to the
customer links through other customers and through peers, but they won't
announce to this particular customer any paths through its provider because it
doesn't want to have to pay for this customer.
So you have these sort of semi-customers where they don't get a full set of
routes from this guy, just the peering and customer routes. So that's one way to
enforce what you're saying. But we didn't model that in this paper. This is kind
of more rare. But you do see it.
>>: Do you have plans in the future to model more complex incentives? Because
right now the model is a little bit simplified as the market economies [inaudible]
incentive, if there's other political issues --
>> Sharon Goldberg: No, we haven't figured out how to do that. That would be
good to do, but we don't have plans, nor do we know how to do that.
Okay, thanks.
>> Kristin Lauter: Let's thank Sharon.
[applause]