>> Jitu Padhye: This is a little bit of an older work, but now we can see what's happening again with solving
the problem of mapping the reverse path in traceroute, so… yeah.
>> Ethan Katz-Bassett: Okay. Thanks Jitu, thanks everybody for coming. I gave some of this talk years
ago when I interviewed here actually, possibly in this same… or at least in a very similar looking room.
So there’s kind of two things going on: one is, I’m gonna talk about the paper which is from when I was
at University of Washington, 2010, and so that’s the set of authors down here and we’ve just started
reviving the project and I’m only gonna briefly touch on what we’re doing with that, but I’m happy to
answer questions about it—and that’s the set of people up here, so Harsha’s carried over. And please
interrupt with questions at any time.
So the motivating example that I’m gonna give is from a Google paper that appeared at IMC in 2009.
They found that twenty percent of client prefixes were experiencing really inflated network latencies
even though the clients had servers nearby, and so they built a system called WhyHigh to debug these
problems. I’m gonna use one example that’s in that paper as a motivating example. So in this case
there were clients in Taiwan—shown in the red box there—they were experiencing a network latency of
five hundred milliseconds, so this is enough to circle the equator twice. And so this system—WhyHigh—set
out to debug this problem. And, of course, the first thing that someone like Google does to try to get
good performance is to replicate servers at locations around the world and try to send the client to a
nearby one. So one potential problem could be that the client was actually sent to a distant server, but
they used server logs and verified that actually the clients were being served from a nearby server in
Southeast Asia. So the next thing you might think to investigate is the directness of the path from the
server to the clients. And so they looked at traceroutes from their datacenter, so looking to see if
perhaps the path was really indirect; and using traceroutes they’re actually able to verify that the
forward path was quite direct. But most communication is two way—including Google’s—and most
paths on the internet are asymmetric, and so it seems like there’s probably a problem on the reverse
path, but they have almost no visibility into it. So their conclusion was that to more precisely
troubleshoot paths they needed the ability to gather information about the reverse path back from
clients to Google. In fact, this is a problem for more than just Google. In the past I've built systems
for geolocation, for monitoring internet outages, for predicting internet paths and performance, and all of
these techniques take measurements as their input, but we can only see in one direction, and in each
case the biggest limitation that we had was the lack of reverse path information—it hurt the accuracy.
And it’s not just my systems. Many systems use sophisticated measurements, including things like
Netdiff from here, and they were all forced to assume symmetry and could immediately benefit from
the ability to see reverse path information. So I’ve listed a few examples on the slides here, but really,
many systems that use traceroute would benefit from reverse paths as well. There’s been a widespread
recognition of this limitation. This is a quote from a January 2009 NANOG tutorial on troubleshooting
paths with traceroute, and this talk actually happens at most NANOG meetings still. So, “The number
one go-to tool is traceroute, but asymmetric paths are the number one plague of traceroute, and that’s
because the reverse path itself is completely invisible.” You can get a good idea about the problem
when it’s on the forward path, but you’re basically flying blind if it’s on the reverse path. So that's our
goal: we want to build a reverse traceroute system that’s deployable without requiring any new
functionality from routers.
So to show the setting, in this case we control S—we can install software there—we don’t control D; we
want to be able to measure the reverse path from D back to S. And to bring this back to that Google
example: in this case S would be the datacenter and D would be the client. Of course, we can issue a
forward traceroute, which gives us some information, but the path is probably asymmetric. We’re not gonna
be able to just use TTL limiting like traceroute does because the destination is gonna set the TTL to a
normal value for the reply… yeah?
>>: Do you want us to hold questions til the end or do you…
>> Ethan Katz-Bassett: No, no. Please jump in.
>>: Okay, ‘cause one of the things that we’ve seen before here is that because of multi-bundling…
>> Ethan Katz-Bassett: Mmhmm.
>>: …sometimes even the forward path is your only one particular path of a multi-bundle route…
>> Ethan Katz-Bassett: Mmhmm.
>>: … and so there’s some asymmetry or there’s some other sort of issue that could exist on one leg of the
path, for example. The point is that the radix of paths that connects S to D could be higher than one. This
assumes one and treats it as true for everything, which may not be true going forward.
>> Ethan Katz-Bassett: Yeah…
>>: [indiscernible] not true in our network and neither it is true in Google’s network.
>> Ethan Katz-Bassett: Yeah. So we’ve looked a little bit at kind of how to incorporate our sorts of
measurements into the type of like multipath measurement that’s meant to uncover some things like
that—you know—starting with Paris traceroute which was just looking at load balancers that showed up
at the IP level and now there’s been kind of follow-on work looking at the issue where even within a
given IP-level path you might get… go through different links in a bundle that might have really different
performance or they might have problems in them. We haven’t… we’ve done a little bit with this work,
with extending it with thinking about how to control which links we’re seeing on the reverse path and
maybe how to expose them to kind of uncover differences along the reverse path. I don’t have any
particular slides on that today, but we’re kind of in this process of—just in the last couple weeks—
starting to reboot the project, and that’s one of the things we’re hoping to kind of incorporate better
going forward.
>>: I think there are, like, multipath traceroute or some other tools which do port and—I mean—tuple
hashing…
>> Ethan Katz-Bassett: Yeah. So this is…
>> … figure out some of those things, but I mean we use those things today because we get… some of
the problems that we’re dealing with here are also visible in production today where—you know—like
one link inside each bundle going bad, and it’s very hard to distinguish or to differentiate
because ninety percent of the time it looks perfectly fine except for the ten percent of people who are not and…
yeah.
>> Ethan Katz-Bassett: Yeah, so that… the first place I saw this kind of multi-path measurement was a
tool called Paris traceroute which originally appeared at IMC 2006, I think, and then there were a series
of follow-on papers. So one of the students on that is Italo, who is now faculty in Brazil and is working
with me on the reverse traceroute project, so we’re trying to kind of incorporate some of those ideas.
Randy Bush also had a paper at IMC last year, I think, where they looked at how, even
if you think you’re getting the same path ‘cause it’s the same IP-level path, you would
see really big performance differences within just kind of individual links in a bundle…
>>: Correct.
>> Ethan Katz-Bassett: … and so they had a tool called Tokyo ping which is building off the Paris
traceroute; the same sort of idea. We’ve been looking at—I’m trying to think of the setting we’d
originally looked at this… oh, I know what it was. So one issue you can get with this load balancing is if
you have reverse path load balancing but traceroute IP addresses are coming from the outgoing
interfaces, then you can see what looks like varying forward paths, but it’s really just the reverse path is
being load balanced, so we’re looking at kind of building tools that incorporate some of that information
a little better. I don’t have anything on that in the slides today, but we’re trying to think about how to
kind of control for those.
So we need something other than TTL limiting if we’re gonna do it over the reverse path because we’re
not gonna be able to control what the client sets the TTL to, plus the error messages are gonna go back
to the destination—we don’t control it. The next thing you might think to do is use other vantage points
that we have access to. And, in fact, we do have other vantage points around the world; so in my case I
use things like PlanetLab a lot. These aren’t gonna directly give us the path from D, but we’re gonna use
them to get a view that’s unavailable from S and, in fact, we’re gonna combine them to get a view that’s
unavailable from any particular vantage point. So what can we do with them? Well, first thing we can
do is start issuing traceroutes from them—oh, I’m missing a slide there… but one of the sets of
traceroutes we can issue are the traceroutes from all of our vantage points back to the source. And
we’re gonna use these as a baseline that we can basically build off of. So the key idea that we’re gonna
use is that internet routing is generally destination-based, and so from the perspective of a packet on
the reverse path, destination-based routing means that it’s headed towards S, and the next
hop depends only on the destination, not on the source or where the packet has been so far. So this
means that once we intersect one of these paths that we know about, we can assume that the rest of
our path is going to be the same. In fact, destination-based routing actually gives us something even
better than that; it means that we can build the path back incrementally. So if we somehow learn that
the path from D goes through R1, now we just need to measure the path from R1 back to S and we can
ignore D. So this means we can stitch together the path a hop at a time. So destination-based routing
doesn’t always hold. We actually had a follow up paper looking at kind of how often is it violated due to
things like load balancing or tunneling, but it holds often enough that we found that it was actually safe
to base this system around it. Just like traceroute, there’s gonna be some caveats that you need to be
aware of when you use this and assume this, but the results are still generally useful. So because of
destination-based routing, we were able to learn the path goes through R1, now we focus on measuring
from R1. Let’s say that we somehow figure out that the path from R1 goes through R2 and R3, we then
just need to measure back from R3. We somehow learn that the path from R3 goes to R4. We’ve now
intersected a path we know about and we can assume that we follow V1’s path the rest of the way, and
so we now have stitched together the path. Of course, I left out a really key detail of how we actually
get any of these segments and the… basically, what we need is something that can tell us a few hops at
a time what IP addresses are on the reverse path.
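To make the stitching idea concrete, here is a minimal sketch in Python of the hop-by-hop loop just described; the helper names known_paths, measure_next_hops, and assume_symmetric_hop are hypothetical stand-ins, not the actual reverse traceroute implementation.

```python
# A minimal sketch of the incremental stitching loop, assuming destination-based
# routing holds. known_paths, measure_next_hops, and assume_symmetric_hop are
# hypothetical helpers, not the real system's API.

def stitch_reverse_path(destination, source, known_paths, measure_next_hops,
                        assume_symmetric_hop):
    """Build the reverse path from `destination` back to `source` a few hops at a time.

    known_paths:             router -> rest of its already-measured path toward `source`
                             (e.g. from traceroutes issued by vantage points back to S).
    measure_next_hops(r):    next few reverse-path hops from router r toward `source`,
                             e.g. learned via IP options.
    assume_symmetric_hop(r): fallback when nothing can be measured from r; returns the
                             hop the forward path used, assuming that one hop is symmetric.
    """
    path = [destination]
    current = destination
    while current != source:
        if current in known_paths:
            # Destination-based routing: once we intersect a path we already know
            # toward S, the rest of the reverse path is assumed to follow it.
            path.extend(known_paths[current])
            break
        next_hops = measure_next_hops(current) or [assume_symmetric_hop(current)]
        path.extend(next_hops)
        current = path[-1]
    return path
```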
>>: Is it router-based or [indiscernible]?
>> Ethan Katz-Bassett: So we’re measuring at the IP level, so router-based with some aliasing going on
and things like this. The… in terms of the impact of this destination-based assumption, part of the way
that we can get away with it is that even if you make some little errors within an AS, you’ll often still be right
at the AS level, but we do measure everything at the IP level. So IP options are gonna give us the tool
that we need to measure a couple of these hops at a time. IP options, of course, are a built-in part of
the internet protocol, but we’re the first to use them to measure the reverse paths. The key is that,
unlike TTL, the destination’s gonna copy over the options into its reply if you ping it, and so the source
can ping the destination with options set and get back the response including the options. The first
option that we’re gonna use is the Record Route option. So, record route allocates space in the header
of the packet for nine IP addresses. If the destination is reached before all the slots fill up, the remaining
slots can fill up on the reverse path. So to give you an example of that, if the paths look like this, then
record route will give me these yellow hops, five along the forward path, the destination, and then the
last three slots along the reverse path.
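To make the slot arithmetic concrete, a tiny sketch: record route carries nine slots, the forward path and the destination consume some of them, and only the leftovers can capture reverse hops. The function name here is just for illustration.

```python
# Toy illustration of the Record Route budget: the option holds at most nine
# addresses, filled first along the forward path (plus the destination), so only
# the leftover slots can record reverse-path hops.

RR_SLOTS = 9

def reverse_slots_available(forward_hops_to_destination):
    """How many reverse-path hops an RR ping can capture, given the forward distance."""
    used = forward_hops_to_destination + 1  # +1 because the destination records itself
    return max(0, RR_SLOTS - used)

print(reverse_slots_available(5))   # -> 3, matching the example on the slide
print(reverse_slots_available(15))  # -> 0, a typical internet path is too long for RR alone
```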
>>: Have you tried asking people if there’s filtering of RR and other IP options on entry and exit?
>> Ethan Katz-Bassett: So asking them if what?
>>: I mean, have you tried experimenting with this? Some folks do squash record route.
>> Ethan Katz-Bassett: Yeah.
>>: Okay.
>> Ethan Katz-Bassett: So I actually tried to put those results into my slides and I couldn’t find any recent
ones that seemed worth including, but Rob Sherwood did a big study that actually kind of got me
thinking about using options where he looked at option support for record route across the internet and
that was like SIGCOMM 2008. We had a paper in IMC 2010 looking at timestamp option coverage
across the internet, and I’m trying to get a student in class right now to look—‘cause we’ve now
accumulated a lot of data on record route coverage over time—and trying to get that student to look at
how it’s changed over time and what holes are there. Unfortunately the student hasn’t finished the
project, so I don’t have any great numbers on it. We’ve done measurements; we’re hoping to redo the
coverage study. I also… I’m gonna defer them ‘til the end, but I have some extra slides if we have time
that show things that you can do to help get better coverage where there’s options problems. So for
example, it’s sometimes the case that the source might have options limited on its forward path, but the
reverse path might be clear. And so in cases like that you can have, like, this vantage point send a probe
and spoof as the source and so you’ll get around that filter. And I have a graph at the end that I can show if you
guys are interested and there’s time that kind of shows that that gets you a lot better coverage. There’s
other tricks you can do when there’s limited option support in different ways, and a lot of the meat of
the work in the paper was figuring out how to handle these different cases where people support
options in different ways in order to get enough coverage that the system will work. And I have… kind
of… what I would say in the end is there isn’t going to be coverage some places, you’re gonna be
filtering some places. The results that we had so far show that you can piece together enough coverage
to have a useful system even if it doesn’t always work.
>>: Do you know when it does not work?
>> Ethan Katz-Bassett: Can you detect when it doesn’t work?
>>: I guess one resource maybe… so I was trying to avoid it, but since it’s come up, so I’ll ask: so in my
mind, I think the… one of the big difficulties in using natural diagnostics tools of this ilk is like, it’s hard to
tell when they’re wrong, like… and it’s almost like, I think, tools that may give wrong information
without telling you that this may be wrong…
>> Ethan Katz-Bassett: Yeah.
>>: … are actually worse than no tools at all sometimes.
>> Ethan Katz-Bassett: Yeah.
>>: And this is something kind of like I always wonder about, like some of the
diagnostic work that I’ve done that uses machine learning or things like that, or even this thing where the
answer may not be accurate and whatnot, how do you… in this context, how do you reason about
correctness of the answers since you are extrapolating from other factors?
>> Ethan Katz-Bassett: Yeah, so our approach so far has been to include… so, I mean, we're kind of just
starting to get into it, but, like, what you’re kind of hearing from this is we have to piece together
different techniques in different places in order to deal with different issues that come up along the
paths. And so our approach so far has been to kind of label how we learn different pieces of
information so that if you’re… at least if you’re looking at it as a user you can understand, this data came
from this source and there are these caveats with that type of data; this other data came from this other
source and there is this other set of caveats. And you can reason about it, like, fairly similarly to how—
you know—when I’m using traceroute [indiscernible]… I don’t understand some routes, I understand
that there’s caveats that I have to be aware of with traceroute. The other thing that we didn’t
incorporate into the tool but that we plan to when we’re rebooting it now: so my
student, Toby—whom some of you know from his TCP work—before he started working on that, he
looked at how often were these assumptions we were making violated, and the way we tested for that
was to measure the same path in multiple ways and see when we got different answers. And so you can
imagine building those tests into the tool so that you can trade off having extra probe overhead to remeasure the path multiple ways and then point out places where you’re getting the same result
different ways which gives you confidence, or where there’s maybe divergence between the paths which
might mean that you might be a little more suspicious in those places.
>>: Is there a way to, like, distinguish between false positive and false negative like quartiles under ROC
curve or has that effort been made? Is it [indiscernible]?
>> Ethan Katz-Bassett: So we tried to tease out places… so it’ll come up later on in the evaluation, but
the basic evaluation baseline we tried to go for is: do we get the same results as traceroute or not—
because that’s kind of what people are used to using? And we tried to tease out the different causes for
differences between us and traceroute. And so I’ll get into that in a little bit later on. Did that kind of
answer your question Virgil?
>>: Yeah, I think you’re hinting at things you might be able to do, but I… so I think that maybe if I would
paraphrase your answer it’s like… which is: the output of the tool needs to be carefully interpreted by
itself… in the sense that, like, I can put traceroute in the context of a larger diagnostic system,
because its output can be trusted in some ways, and it will be harder for me to put reverse traceroute in
an automated diagnostic system, where the output of reverse traceroute is one of the factors that go in,
because it needs to be interpreted by humans—at least, as things stand.
>> Ethan Katz-Bassett: So I think that you can put in… I think that we can tell… we have some pretty
good information about when you should probably trust the tool and when you should probably not
trust it. And the main thing is that—and again, this is kind of like jumping forward and backwards—but
if we can’t measure a particular hop because we… none of our techniques for that… work for that hop,
we just have to assume that particular hop is symmetric—it’s kind of just our fallback. And what we’ve
found on a large scale analysis is that if you only have to assume a hop of symmetry inside one AS, you
don’t really get screwed up. If you assume symmetry across an AS boundary then you’re probably in
trouble, you kind of lose track of what’s going on with the path. And so I think that in cases when we
don’t have to make those assumptions, it’s fine to trust the tool at roughly the level of what you would
trust traceroute for as long as you’re aware that it’s a slightly different set of caveats in terms of the
types of confusing information it can give, just like traceroute gives types of confusing information in
some… certain places. So I think we can get… I think we can flag the cases where you shouldn’t trust it
at that level and the cases when you should and then it’s safe to build something around it. Okay?
So we’ve kind of jumped and—you know—already talked about some of the stuff in the future but to
get… to come back to here, with record route you’d get those three hops. The problem is that
the average path on the internet is like fifteen hops in each direction and so you usually won’t be close
enough to use record route. So what do we do in those cases? Well, in those cases we can take
advantage of the fact that some of our vantage points are able to source spoof, so they can send out
packets with the source address set to something other than their own IP address. And what we’re
gonna do is set it to S. So I already mentioned this in that other context of getting around filters, but for
this nine hop, eight hop limitation, what we’ll do is find a vantage point that’s close to the destination—
in this case V3—and we can have it send the record route probe to D. So when it gets to D, some
number of the slots will be filled up—in this case, seven—D will add itself in, but then it thinks that the
packet came from S, so it’s gonna reply back to S, and we’ve now learned that R1 is on the reverse path
from D back to S. And once we know this, we just need to determine the path from R1 back to S so we
can repeat the process. We find a vantage point that’s close to R1—in this case let’s say V2 is—it can
send a probe spoofing as S, it gets back, and we’ve learned R2 and R3 are on the reverse path.
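As a rough sketch of that step, here is what the logic looks like, assuming hypothetical helpers hop_distance and send_spoofed_rr; the real system's machinery is more involved.

```python
# A sketch of the spoofed Record Route step: pick a spoofing-capable vantage point
# close to the destination, send an RR ping from it with the source field set to S,
# and let S read the reverse-path hops out of the option in the reply. hop_distance
# and send_spoofed_rr are hypothetical helpers, not the actual system's API.

def spoofed_rr_hops(destination, source, vantage_points, hop_distance, send_spoofed_rr):
    """Return reverse-path hops toward `source` learned from one spoofed RR ping."""
    # Prefer a vantage point close enough that RR slots remain for the reverse path
    # (nine slots total, as in the earlier example).
    candidates = [vp for vp in vantage_points if hop_distance(vp, destination) < 8]
    if not candidates:
        return []  # nobody close enough; fall back to the other techniques
    vp = min(candidates, key=lambda v: hop_distance(v, destination))
    # The reply goes to `source` because that is what the spoofed packet claimed;
    # send_spoofed_rr returns the list of addresses recorded in the option.
    recorded = send_spoofed_rr(vp, destination, spoofed_source=source)
    if destination in recorded:
        # Everything recorded after the destination's own entry is on the reverse path.
        return recorded[recorded.index(destination) + 1:]
    return []
```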
But what do we do in cases where none of our vantage points are close enough to R3? Well, we still
might have some idea of what the set of possible next hops is. And what we’ll do is consult an internet
atlas and figure out what we think the options are for where it might go. So we might have observed paths
like this in the past when we were just issuing traceroutes, and so R4 and R5 are candidates for where
this path might go. We need a way, given one of these candidate guesses, to verify whether or not it’s
on the path. And this is where we’re gonna use a different option, the IP Timestamp Option.
We’re gonna use the prespecified version of the IP Timestamp Option, and what this does is let you
specify up to four IP addresses in the header of the packet and each of them will record a timestamp if
they’re traversed in order. So this ordering property is what we’re gonna take advantage of. We’re
gonna have S test for R4: it’s gonna first ask for R3’s timestamp, then ask for R4’s timestamp, and that
will force R4 to only fill out a timestamp after R3 has filled one out, which means R4 must be on the reverse
path. So in this case we send this probe, R3 fills out a timestamp; if we get back a stamp for R4 we know
it was on the path because the stamps are ordered and we asked for it after R3, so it has to be on the
reverse path. In this case, let’s say we do get back that stamp, so we now can assume the path goes
through R4. So we have a way now—if we have a guess—of testing that guess and verifying it. Now,
we’ve measured the path to R4…
>>: Do you know that there was nothing in between?
>> Ethan Katz-Bassett: So we don’t know that. So we assume that if we observed them adjacent in a
traceroute before and we observe R4 being on the reverse path now that we can assume that it’s gonna
be adjacent now. That’s another potential source of error, although if they were adjacent in another
path, it’s pretty likely that they’re probably adjacent again.
>>: Okay, are you looking at it as adjacency at the IP level or adjacency at the connectivity level, ‘cause
you could have, for example, IP hops, but, like, say, an ATM or frame relay network underneath may have
multiple hops in between.
>> Ethan Katz-Bassett: There are certainly measurement… yeah, things like that that could be hidden at
the IP level.
>>: That was my question, actually.
>> Ethan Katz-Bassett: Yeah, so we assume that because we observe them adjacent at IP level before
that probably the path that has them both on it is gonna be a direct path again but… If it wasn’t,
basically what we’d have is we’d be on a good path, but there should be some kind of dot, dot, dot
between there, so.
Now I’ve intersected the path from V1 and so we’re just gonna assume that we can piece it together like
this.
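Put as code, the ordering test from a moment ago looks roughly like this; send_ts_probe is a hypothetical helper that issues a prespecified-timestamp ping and reports which of the requested addresses actually stamped.

```python
# A sketch of the prespecified IP Timestamp test: ask for a stamp from the last
# known reverse hop (e.g. R3) and then from a candidate next hop (e.g. R4). Since
# stamps are only filled in order, the candidate can stamp only if the packet
# traverses it after the known hop, i.e. on the reverse path. send_ts_probe is a
# hypothetical helper, not the real system's API.

def confirm_next_hop(destination, last_known_hop, candidates, send_ts_probe):
    """Return the first atlas candidate confirmed to follow `last_known_hop` on the reverse path."""
    for candidate in candidates:
        stamped = send_ts_probe(destination, prespecified=[last_known_hop, candidate])
        if last_known_hop in stamped and candidate in stamped:
            # Modulo the adjacency caveat discussed above (a hidden layer-2 path
            # could sit between them), take the candidate as the next reverse hop.
            return candidate
    return None
```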
So there are kind of a number of key ideas that enabled this. Of course, the goal was working without
control of the destination and so we’re gonna do this by using destination-based routing and multiple
vantage points to stitch together the path incrementally. There are a bunch of other things you have to
do to kind of get this to work well and I’m gonna refer you to the paper for learning about those, but as
I mentioned to Vijay before, different routers process options in different ways. You have to account for
this sort of variation. You have to deal with filters because some people are gonna filter options and, of
course, some people filter source spoof probes. And then you want to make sure that it doesn’t take
tons and tons of measurements to do this, so we have some techniques for kind of correctly picking the
right measurements that’ll get us the most bang for the buck so we don’t have to issue tons.
Now, as I mentioned before, when none of the vantage points is able to measure a hop using any of the
techniques, we’re gonna have to assume symmetry for that particular hop and then measure from
there. So this means that the coverage and the accuracy are really tied to the set of vantage points that
you have to use, and in particular to the ones that can spoof. So in the evaluation in our paper which is
what I’m gonna talk about now, this was using PlanetLab and Measurement Lab sites that could spoof,
and at that point, when we did the original study about ninety sites allowed spoofing—that was about a
third of the sites that existed. And then in the study we measured back to PlanetLab sites, so we used
PlanetLab sites as our sources and our vantage points. In the paper we evaluate coverage and
overhead; here I’m just gonna talk about accuracy. And we’d really like our tool to be equivalent to
what you would get if you had control of the destination and could just run traceroute on it. People are
kind of used to traceroute so it’s nice if our output can look similar. We lack ground truth data on reverse
paths for the entire internet, of course, otherwise we wouldn’t need to build this. So what we’re gonna
do is evaluate this in a controlled setting. We’re gonna use paths between PlanetLab nodes to let us
compare a reverse traceroute to a direct traceroute. And when we measure the reverse traceroute,
we’re gonna pretend that we don’t actually control the destination; for the direct traceroute, of course,
we do control it. And we had a similar study in the paper using traceroute servers—public traceroute
servers—instead of PlanetLab nodes to get a little bit more path diversity, and the results were very
similar. So it’s not just PlanetLab biases that got us the results. So this graph is showing that
comparison. On the x axis we have the fraction of the direct traceroute hops that we also saw using our
reverse path measurements, and then it’s a CDF of the paths that we looked over. So the… what you
would be forced to do now is just to assume that the paths are symmetric and this black line shows how
well you do if you just assume and guess that the paths are symmetric. So you get a median of thirty-eight
percent of the hops right. The red line shows our system, so in the median case we got eighty-seven
percent of the hops that you would get if you were able to issue a direct traceroute. And then the
benefit of the system is basically the shaded region here. Now, it can be hard to know if two hops are
actually the same to make this path comparison because the routers have multiple interfaces and we
might see different ones; so we might actually be doing better than this. Traceroute is generally gonna
give you the incoming interface; record route will often give you the outgoing one. It can be hard to tie
those two to each other.
>>: Couldn’t you also be doing worse than the… assuming symmetric [indiscernible]? Say it’s thirty-eight
percent, but couldn’t it actually be doing better than that if you’re counting something as being
asymmetric just because of the interfaces?
>> Ethan Katz-Bassett: Ah, so you’re saying that just as our red line might look worse than it is, the black
line also might look worse than it is.
>>: Better.
>> Ethan Katz-Bassett: The black line might look better. So the black line is our baseline comparison;
we’re trying to do better than the black line.
>>: I’m saying that, you’re saying that it’s wrong when you assume symmetric because you have some
sort of—what—ground truths, right…
>> Ethan Katz-Bassett: Yeah.
>>: … in these tests? But that ground truth, aren’t you assuming that the paths are different because
you see different IPs in the hop?
>> Ethan Katz-Bassett: Yeah. So…
>>: [indiscernible] might actually be the same…
>> Ethan Katz-Bassett: Right. So the black line might actually… we might… the real black line might be
somewhere further to the right because we might be saying it doesn’t match, but it actually does match
in some cases is what you’re saying. Yeah. So that’s totally true, and so we apply the same… we used IP
aliasing datasets to try to identify all these cases. We used a number of techniques to try to pin them
down. The techniques don’t have complete coverage and so in cases where they’re missing coverage,
we would count it as a miss, both for us and for the Guess Forward line. So we’re kind of in both cases, I
think, conflating hops that are wrong with hops where we’re right or Guess Forward is right, but we just
don’t know. And we tried to tease apart those two; unfortunately, in this talk I don’t have the line for
where the Guess Forward would move—it’s in the paper. But basically what we did in those cases was
instead of looking at things at the IP alias level, we tried to look at them at the PoP level, where we’re
defining PoP as basically all of the routers for a particular AS in a particular city with a particular airport
code. And so you can see, for us in the median case, that gives us a hundred percent accuracy.
I don’t have the line up here, but we have a similar line moving the Guess Forward according to PoPs. I
forget exactly what the numbers look like, but it’s still a pretty big gap and that’s kind of the best we can
do in terms of trying to account for these cases where it’s right but we don’t realize it. Does that answer
the question?
>>: Yes.
>> Ethan Katz-Bassett: So in the paper we discuss other reasons why the paths might differ, but I think
the main take-away is that we might not always be a hundred percent accurate and sometimes we’re
not gonna be able to give you the whole thing, but the differences are largely due to alias information—
like I was just talking about—and then also to load balancing which you can try to account for, like we
were talking about with Vijay before. So it gives you enough to… I mean, I think that we’ve… I came
away feeling like we could trust the data in most cases and identify many of the cases when you
wouldn’t want to trust it.
So I started out with the problem that Google was having with clients in Taiwan, and I’m gonna just
walk through an example of how you could use reverse traceroute to troubleshoot those sorts of
problems. So in this case, we observed a round-trip latency of a hundred and fifty milliseconds from a
PlanetLab machine in Orlando to a network in Seattle—it wasn’t UW-affiliated, it just happened to be in
Seattle. And that’s about two or three times what you would expect. So currently, if you were trying to
troubleshoot this, you might issue a forward traceroute and then look for circuitousness in the
traceroute. And so that’s the first thing we did; this is the traceroute from Florida to Seattle. And if you
look at it, what happens is it starts in Florida, it goes up to DC then it comes back down from DC to
Florida, and then continues across the country. And so there is this big detour up to DC and back down
to Florida, and that’s explaining about half of the inflation, but it still seems like there’s some problem
on the reverse path—that doesn’t… the forward path didn’t explain all of it. And with existing tools
you’d really have no idea what was going on. So with reverse traceroute we could actually just measure
that path, so that’s what this is. And if you look at this path, what’s happening is it’s starting in Seattle,
it’s going down to LA then it’s going back up to Seattle, and continuing across the country to Florida.
And if you look a little bit closer at the hops, you can see that it’s starting out in Seattle on Internap, it’s
going down to LA on Internap then it’s switching over to TransitRail, coming back up the coast on
TransitRail and continuing across the country. And so one possible reason for this might be that
TransitRail and Internap only peer in LA and the traffic had to go down there to get over a peering link,
but I was able to verify with a traceroute from University of Washington that actually they peer here in
Seattle as well. And so it appeared to be some sort of misconfiguration, out of date information. I
actually was able to track down operators from both of those companies at NANOG when I was talking
about this, and they verified that it was kind of an unintentional detour. So without access to both
directions of the path, you really have no way to understand what’s going on; you don’t know who to
contact to fix it. And with access to the information—you know—even I, when I was in grad school, was
able to actually track down people who were responsible for this routing and had the… and were able to
fix it.
So that was one motivating example, but I do think that this information is potentially useful in many
cases when using traceroute, and so I just wanted to walk through one other example of what you can
use the information for. So there are many cases when you want link latency information when you’re
trying to understand routing. And the traditional technique to get at that would be to issue a forward
traceroute and then assume that if you subtract the round-trip time to A from the round-trip time
to B and divide by two, you have some approximation of the A,B link latency. Now, maybe you’re gonna
have a filter to throw out obviously wrong values, but this is pretty commonly used. And this is gonna
be a good assumption if the reverse path is symmetric—like this case—but if the reverse path is
asymmetric, you really don’t know what you’re getting when you subtract these latencies because you
could be following a completely different path from B back to A… sorry, from B back to S, from the path
from A back to S, and subtracting the round-trip times doesn’t necessarily mean anything. So the first
thing that you can do with reverse traceroute is actually use it to tell which of these cases you’re in, and
in cases when you’re able to find symmetric traversals, then you can safely use the subtracting
technique to get good link latencies. So we’ve been able to solve these two because they’re symmetric.
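The traditional estimate just described is literally a subtraction, something like the following, with made-up numbers; it only means anything when the reverse path is symmetric over that segment.

```python
# The traditional per-link estimate: half the difference of round-trip times to the
# near and far ends of the link. Only meaningful when the reverse paths are symmetric
# over that segment; the numbers here are invented for illustration.

def naive_link_latency_ms(rtt_to_near_end_ms, rtt_to_far_end_ms):
    return (rtt_to_far_end_ms - rtt_to_near_end_ms) / 2.0

print(naive_link_latency_ms(20.0, 28.0))  # -> 4.0 ms for the A,B hop
```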
But you can actually do better than that. You can throw in constraints from measurements to other
destinations or from other sources and build up a system of constraints that says that the round-trip
time for any particular path has to equal the latency of the forward links plus the latency of the reverse
links. And if you issue enough measurements, you can get enough constraints in the system to solve it.
>>: So why won’t you… for such a thing, why won’t you directly use the atlas data you have, like, why do
I… this is essentially merging information that was collected offline, so you’re not
actually getting real-time queue sizes. So you’re basically getting some… maybe some low-pass filter
on the queue size, or are you getting propagation delays?
>> Ethan Katz-Bassett: Yes.
>>: So I’m just thinking if that’s what the problem you want to solve, just use the atlas data and process
it. Like I don’t need… what did reverse trace route bring to the table here?
>> Ethan Katz-Bassett: Yeah. So in our case we’re trying just to get propagation delay or some sort of
measure of minimum latency over time, and the thing that you need reverse traceroute for is if these
paths are asymmetric like this, I have no way of figuring out what the latency of A,B is versus the latency
to B,C.
>>: I was thinking, the way you build your atlas you already had all possible bag of traceroutes you can
conduct given your set of vantage points and destinations, right?
>> Ethan Katz-Bassett: Yes.
>>: So just process that data to do this, right?
>> Ethan Katz-Bassett: Yes. So there might be some hops that don’t appear on any traceroutes from
any of our vantage points and only appear on reverse paths. And I’ll get to that—actually I have an
example using the Sprint backbone—in a second that will show how often we saw that. But basically,
there’s some links you’re not gonna observe at all. There’s gonna be some links that you don’t have
enough constraints to solve for and so you need this to, first of all, tell which cases you’re in; second of
all, build up enough constraints. So in this case, we used a measurement from V, let’s say it observed
paths like this, we can then constrain these paths, and now we have enough constraints in the system to
solve for A,D. So we solve for all the other links and can get that one. So… yeah?
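As a toy version of that constraint system, here is a small worked example; all link names and latencies are invented. Each measured round trip contributes one equation saying the RTT equals the sum of the link latencies on its forward path plus those on its reverse path, and solving the system recovers links, like A,D here, that no single symmetric measurement could give you.

```python
# Toy constraint system: each RTT = sum of link latencies on the forward path plus
# those on the reverse path. Links are treated as symmetric one-way latencies, and
# all names and numbers are invented; the point is only the shape of the computation.

import numpy as np

links = ["S-A", "A-B", "A-D", "V-D", "V-B"]
idx = {name: i for i, name in enumerate(links)}

# (links on the forward path, links on the reverse path, measured RTT in ms)
measurements = [
    (["S-A"],        ["S-A"],                      20.0),  # symmetric ping S <-> A
    (["S-A", "A-B"], ["A-B", "S-A"],               28.0),  # symmetric traversal S <-> B
    (["V-D"],        ["V-D"],                      12.0),  # vantage point V to D
    (["V-B"],        ["V-B"],                      16.0),  # vantage point V to B
    (["S-A", "A-D"], ["V-D", "V-B", "A-B", "S-A"], 45.0),  # asymmetric S <-> D round trip
]

A = np.zeros((len(measurements), len(links)))
b = np.zeros(len(measurements))
for row, (fwd, rev, rtt) in enumerate(measurements):
    for link in fwd + rev:
        A[row, idx[link]] += 1.0   # a link may appear in both directions of one round trip
    b[row] = rtt

solution, *_ = np.linalg.lstsq(A, b, rcond=None)
for name, value in zip(links, solution):
    print(f"{name}: {value:.1f} ms")   # the otherwise-unmeasurable A-D comes out to 7.0 ms
```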
>>: Let’s see this.
>> Ethan Katz-Bassett: Okay. So we… Sprint published their topology and had a traceroute server that let
you issue pings between all of their routers, and so between all their PoPs, so we were able to get a good
ground truth understanding of their topology and the link latencies, and so we ran a comparison to see
how accurate our technique was versus the technique where you just use traceroutes and subtract
adjacent hops. So the first issue is that with traceroute alone, you might miss some of the paths. So
there were eighty-nine links in their topology, with traceroute alone, from all of our vantage points—
hundreds of PlanetLab vantage points—we only saw sixty-one of the links. With reverse traceroute
we were able to see seventy-nine of them. And then the graph shows the error in link latency
estimation, so x axis is error in milliseconds; y axis, it’s the cumulative fraction of the links. And the red
line is our technique; blue line is if you use traceroute. So in the median, mean and worst case we’re
doing ten times better than traceroute. Median, mean are both under a millisecond, so you can get a
really good understanding of backbone link latencies using this constraint-based approach.
>>: Really, I think [indiscernible] what you’re saying is backbone topological latencies, not link latencies,
right? Because the link latencies are also dependent on the instantaneous queuing.
>> Ethan Katz-Bassett: Yes, sorry. Yeah, so propagation delay between PoPs is really what I mean.
>>: Yes.
>> Ethan Katz-Bassett: It’s… I haven’t thought that much about it, but kind of the… what we were going
for with this, we had a couple of other techniques also to measure link latencies and one-way latencies
that appeared in other papers, and the kind of goal for it was build an offline propagation delay map,
and then that’s potentially useful for comparing to live measurements and understanding current
queuing, things like this. We haven’t pursued that direction but that was kind of what we were thinking
with it would be, this would be not for live debugging, but for annotating a map offline, essentially, that
you can then use… combine with live measurements.
>>: This could be useful for things like, for example, building, like, global traffic management type solutions
and annotating those, rather than any kind of, like, higher-speed use, which would not… probably not use them…
generally because…
>> Ethan Katz-Bassett: Yes. Yeah, yeah, I don’t think an ISP would want this for their own
measurements, they don’t need it. Yeah, exactly.
And to maybe put them—latencies—in a little bit of a relative scale, the median actual link latency was
eleven milliseconds—a lot of links in the U.S. are about eleven milliseconds—and the longest link in the
dataset was ninety-seven milliseconds. So traceroute in the median case is adding about fifty percent
error. So most of that material was from the NSDI paper in 2010. I wanted to tell you a little bit about
what we’re doing with the project now. We’ve just started, basically, rebooting; it’s been shut down for
a while. We’re trying to build a better version of it. The goal is to have an open system that people,
researchers, and operators can actually plug into, use our vantage points to make measurements, and to
have an operational staff to help support that. So we’re building client source code that you can use
to plug into the system. We’re adding vantage points, so we used Measurement Lab in our original
measurements… or measure—everybody familiar with Measurement Lab? So it’s a test bed that Google
and some other people started that’s meant to expose information about broadband home networks,
essentially. It’s basically that data… it’s supposed to give transparency into edge networks, basically, to
give the consumers a better idea of what’s going on with their traffic.
>>: Sure, the sites are not inside homes or residential ISPs…
>> Ethan Katz-Bassett: No.
>>: The M-Lab sites are still very much, like, server-centric.
>>: Yes.
>> Ethan Katz-Bassett: Yes, exactly, but the nice thing from our perspective is we don’t actually want
vantage points in the home networks, what we want is vantage points that are in well-peered locations
so that they’re close to lots of things for doing our spoof measurements, and so from our perspective,
Measurement Lab is really well located—so it’s at major PoPs basically.
>>: Intersection appearance…
>>: Like, I think the thing about… illustrates in general, like, I think you want to continuously keep
calibrating the system. Like, your original calibration and accuracy measurements was done on
PlanetLab with some augmentation like you mentioned.
>> Ethan Katz-Bassett: Yep.
>>: Most of the times, like, people like us, we are interested in, like, users inside homes.
>> Ethan Katz-Bassett: Mmhmm.
>>: So from that perspective I would say—and you can correct me if I'm wrong—reverse traceroute is
completely uncalibrated.
>> Ethan Katz-Bassett: So it’s true that we haven’t done a validation study in that sort of context.
>>: Well, I’m not sure we actually really need to yet, right? So the way I’m thinking of this is: I’m
assuming you think you’ll be publishing, for example, what your data structures are for your annotation
and the maps. So you can use those data structures, map them with real user measurements and stuff
that Jack’s team is doing on the client side—then you’ll actually have an idea of what’s happening.
‘Cause the client side will… is blind, right? The client side will go from measurement from the client to
the server, but in between, it goes away, and yet you—the next layer down—you …
>>: No, I’m sorry, but actually it’s a simpler problem. I think that’s useful outside just the… so there are
some coverage and accuracy numbers you can get based on what is built. Those numbers are driven
by where you could get the sites. Now, the data we’re interested in is around, like, people sitting in
homes and trying to reach Microsoft or Google or whatnot. Those sets of clients
may have completely different characteristics with respect to the routing, where those
tricks may or may not work. So…
>>: That would be mostly in this [indiscernible] type stuff—right—which is the local access networks?
Access networks, from the measurement perspective, are going to have things like, for example,
interleaving in DSL adding on, like, twenty milliseconds to whatever you’re…
>>: You can just forget performance, if you just want the path, let’s say. I think it would be good to
know… Okay, I think here’s what I’m asking, like, what do you need to do to calibrate for home users,
because that’s what I think content providers are interested in, so…
>>: He has these RIPE nodes, right?
>>: What?
>>: He has those RIPE nodes and a large number of them are gonna be…
>>: Actually that’s a good point. So right… right… some… yeah. So that would be interesting to see…
okay, so that’s…
>> Ethan Katz-Bassett: Yeah. So this is one thing that we’re working on it right now, is incorporating
RIPE Atlas. So RIPE Atlas is these little USB Ethernet dongles and they have kind of very quickly gotten a very
large deployment, so they have seven thousand nodes in thousands of ISPs, most of them are in access
ISPs. And so right now we’re working on incorporating those into our system as traceroute vantage
points essentially, but also as potentially a validation data set that has more access measurements. So a
couple things to think about in terms of coverage: one is that usually what we’re trying to do is measure
a couple hops until we intersect a path that we already know about, and so you don’t need to measure
across really long distances usually. The other is… I don’t know, but my instinct would be that most
home users’ paths are symmetric for the first couple hops, like, you’re not gonna have lots of
asymmetry before the PoP where your traffic enters…
>>: The data we’ve seen, I think, from large residential ISPs, is they could be doing weird things
with… you go blind inside, let’s say, Comcast. Comcast is a large residential ISP, and I don’t know
to what extent symmetry would hold in there…
>> Ethan Katz-Bassett: So not across…
>>: … [indiscernible] those—like the large residential ISPs—my understanding is they do a lot of
tinkering with how traffic flows through the network, and that is what leads to asymmetry.
>> Ethan Katz-Bassett: Yeah, so I would need to look back at measurements to see which of those we
have good visibility into, but at least in many cases, if it’s that large of… like when someone like AT&T
who has a—you know—globally-distributed backbone, like, in those cases we can get pretty reasonable
coverage in a lot of ti…
>>: But the residential ASNs are different, so I don’t know…
>> Ethan Katz-Bassett: Yeah. Anyway, it’s worth looking into. I haven’t looked at it so… I’ll stop…
>>: … [indiscernible] a good source and I’d be very curious if you went back and did the same analysis
using…
>> Ethan Katz-Bassett: So I imagine that once we have the system fully up and running we’ll do that
analysis. We’re also trying to add in optimizations to make the system more efficient at scale. So some
kind of obvious directions that we’ve been pursuing are caching of partial measurements and figuring out
how long you can safely cache partial measurements for and how you can either use those to construct
a path because you believe that the path hasn’t changed or use them to give you hints on what
measurements you should issue so that you can much… you can very quickly refresh a path rather than
having to measure it from scratch. We’re also working on inferring rate limits at routers to our different
types of probes so that we can kind of safely push the limit without kind of triggering a situation where
our probes are getting ignored and kind of doing smart centralized scheduling. So the piece that we’re
doing right now with RIPE Atlas is trying to build a system that smartly selects what traceroutes to issue
that are likely to kind of help you with a particular query that you’re working on. So there are—you
know—thousands and thousands of RIPE nodes; we can’t constantly be issuing measurements from
them to every destination in the world, and so we have kind of a prediction engine that can smartly
allocate our budget to try to get the most use out of the RIPE nodes.
So lack of reverse path information was identified as a key limitation by Google in their IMC paper, by
operators, by researchers—you really had no way to get this information and so anyone who cared
about internet routing was really only seeing half of the information. We built this reverse traceroute
system to try to give us return path information and address this key limitation in traceroute. I had this
example of using it to debug performance problems; we hope that it’s also useful in a range of other
settings, and I’d kind of like to think that in most settings when you’re using traceroute, you could get a
useful addition from using reverse traceroute as well. We can essentially tell you how anyone gets to you, and
so our goal is to really build this into a scalable system that can support community measurements and
not just have it as a little thing that I run on my own desk. Happy to take more questions.
>>: Since you said you’re rebooting it and looking to add new things in the future.
>> Ethan Katz-Bassett: Yes.
>>: Let me ask you about two things which you may not have thought about yet. That’s fine, but it may be
a request for things to do. Number one is IPv6—right—as we start to see more traffic shift to IPv6. It
used to be the case that you could figure out reverse traceroute very easily in IPv6, because there was
the routing header that you could use to do your round trip and record routes and so on, but since that was
deprecated you can’t. And so it seems to me like some of your same techniques might translate, but
since you don’t have IP options maybe you need to find other mechanisms. That would be the first
request because I think it would be an increasing need going forward.
>> Ethan Katz-Bassett: Yeah. It sounds like a great idea. Kind of the level that I’ve gotten to is exactly what
you say, which is, like, there used to be support for it; there’s also a proposal for record route where you
could tell it which hop to turn on, which would have been really useful—again, deprecated.
>>: What you want right now is: how can you do it in today’s internet rather than what might…
>> Ethan Katz-Bassett: Yeah. So I haven’t looked into it. It’s a great question and we’re moving
towards… I kind of have this coupled with a BGP test bed that we have and we’re moving towards IPv6
support for the BGP test bed, and so it’s a great thing for us to think about going forward in this. I
haven’t looked into it yet though.
>>: The second thing is, as people start deploying things like carrier-grade NATs at the edge of large ISPs,
can you do something piecing together two things—you know—on this side of the carrier-grade NAT it might
be symmetric, but it’s the same point that we were talking about before, which is that it might actually not be
for these very large ISPs and so on. So if you can do something there it might be part of the same work as
we’re talking about there. And then, if you can do work on both of those—just as a stretch goal—you can
put them together and try to go through a NAT64 and do a traceroute through that, which would be the
ultimate, right?
>> Ethan Katz-Bassett: Yeah, that sounds interesting. So like you hypothesized, carrier-grade NAT is also
not something we’ve looked into a ton. And in kind of all these cases, I’m… I think it’s pretty interesting to
try to figure out new techniques that you can use to try to get some visibility like in the tunnels and
behind NATs and things like this. I don’t have great answers on what exactly those techniques are.
>>: [indiscernible]… but I would think it would be a great topic for future papers, right? [laughs]
>> Ethan Katz-Bassett: Sounds perfect. I agree, those are all interesting.
Any other questions or any more questions on some of the things like the filtering that… I can actually
show that result if you want.
>>: Okay so, I mean, yes I think we should, but I’m looking at operationalizing some of this, right? So
two things now that come to mind—three things—first one is: how are you funded for this? B—what is
the long-term context if we, for example, asked the global load balancing team—Chad’s team here—to
build something that depends on this? What happens if you get bored or somebody gets bored, how do
you productionize that—that particular component? The system here would help, I mean, it’s not the
final answer, but it would help in improving our visibility. And the third one is, like, if you open source it or
plan to open the system, is there a mechanism for constant care and feeding of this or some sort of
other… I mean, I’ve seen grad school code, right? But how does this work in production environments?
>> Ethan Katz-Bassett: So this is kind of what we’re trying to answer right now. So the previous funding
was NSF research funding, Google research funding; the new funding is NSF Community Research
Infrastructure funding, so it’s designed to build infrastructure. We just got that funding and just put out
an ad for a full-time developer on the project, and then have funding for operational support staff for a
couple years on this combined with the BGP test bed. But we also want to, in that time, start thinking
about how to have a long-term home for it that shouldn’t be grad students because that’s not
sustainable. You know, we’re in this new version of it, we’re planning on building in more automated
recovery, more automated monitoring, things like this, and we’ll have full time staff, but we want to
open source the code for sure, we want to find some way for this to live long term. Right now we have a
three year plan for it living, but I’d love to chat more about kind of ways of making sure that it lives past
that. I mean, kind of where we’re at right now is, like, as a grad student, I was able to kind of maintain a
version that we could use at University of Washington for two years, but then I got too—you know—
busy with other things and I mean, that model is clearly not gonna work.
>>: So most of your infrastructure is based on, like, PlanetLab which is not open to the industry at all, so
how can we utilize that infrastructure to build our own production system?
>>: You’re not currently planning to use PlanetLab for future right? It was RIPE, Measurement Lab,
and…
>> Ethan Katz-Bassett: So it’s a little bit up in the air if we’re using PlanetLab or not. It’s gon…
>>: I think the important point, the release point is, like, the users of reverse traceroute system don’t
need to use PlanetLab directly.
>> Ethan Katz-Bassett: Yeah. So the model is: we have a live distributed system, you have a piece of
client code that can hook into our distributed system, make a request to it, it will orchestrate your
measurements with PlanetLab, but you don’t need to directly access PlanetLab. So that’s the direct
answer to your question. In terms of whether or not we’re using PlanetLab going forward, we’re kind of
assessing PlanetLab’s long term health at the moment and trying to figure this out, and also assessing
how much benefit it provides now that Measurement Lab has expanded a lot in the last four
years and we’re being incorporated as a core service in Measurement Lab now, and now that RIPE Atlas
has provided some of the diversity. So we’re gonna assess whether or not we still need it.
>>: So, like, what’s the longevity of RIPE Atlas? Because, you know, with the access we get in limited mode,
the communication model is like it’s this one person who sometimes sends emails, right? I mean, that’s
the… [laughter]
>> Ethan Katz-Bassett: Yeah, it’s a good question. So RIPE Atlas years ago, I think, even possibly before I
did reverse traceroute, maybe from Hubble, they contacted Arvind and me and asked if we wanted to
help them kind of design this system that they were thinking about at that point. And at that point they
went to their body—like, we had a proposal—they went to the body and it didn’t get funded and we
dropped off the project, didn’t hear about it, and then next thing you knew they had hundreds or
thousands of nodes. So they are reliant on their members to fund the project. I don’t… I mean, that’s a
really good question, but I don’t know, like, how long it’s funded for right now, what their plan is.
They’ve done a good job with every other system that I can think of that they’ve had: once it was funded
it continued to operate—like, RIPE RIS, their BGP collector, has been around forever.
>>: Are you exploring the FCC Samknows stuff? Like that could be another kind of source of funding.
The FCC’s…
>> Ethan Katz-Bassett: Yeah.
>>: They want to move away like… because NSF—I think you’re right—NSF is not designed to fund
infrastructure.
>> Ethan Katz-Bassett: So [crowd noise] for SamKnows, if Matt wasn’t so busy with you guys maybe he would have made progress on that project. He and Toby are supposedly… like, they approved a project proposal from us and just nobody’s had time to work on it, ‘cause it was Matt and Toby, but they both… like, Toby lives at Google, and Matt lives at…
>>: Well, Matt [indiscernible] boring stuff, so you should use Toby. [laughs]
>>: [indiscernible] We hear they’re talking.
>> Ethan Katz-Bassett: Yeah and actually this new stuff is…
>>: It would be interesting to hear feedback on it.
>>: So is it possible that—and I’ve talked to Russell about it in the past—like, we have development resources in-house, and we use that to, like, avoid [indiscernible] in our organization. We build it and we kind of go back and forth and…
>> Ethan Katz-Bassett: Yeah. I’d love to figure out a good model for this. I mean, a big problem for us is—you know—we’re not a software engineering shop, like it’s… when we had students work on it, it worked for a while, and then pieces died. Right now we’re trying to hire a developer, but that’s not something we have experience with, so we’d love to find ways to do this.
>>: For us, like, the hard part is figuring out the continuity of it, like, kind of—you know—have… keep the business running.
>> Ethan Katz-Bassett: If we can find a good model for it, I’d love to do that.
>>: So what… you mentioned there’s another infrastructure—like you used tens of exchange point locations to do the reverse traceroute—so how… is that open to everyone, or is it just something where you deployed a server there?
>> Ethan Katz-Bassett: So, in terms of our BGP infrastructure that we have?
>>: No. Go to previous slides. You mentioned there is another…
>> Ethan Katz-Bassett: This one?
>>: Yeah.
>> Ethan Katz-Bassett: Oh, yeah. Sorry. Yeah, yeah, right, I didn’t mention… I forgot that that… in my mind that said PlanetLab, so that was my answer to your question. That said PlanetLab until this morning. Yeah, so we have this BGP test bed that’s now called PEERING—it used to be called Transit Portal—and what we have there is, I have a slash nineteen prefix and we’re just adding a slash twenty-two, and I have an ASN, and we’re building basically a research ISP that researchers can use if they want to do inter-domain routing research and do BGP research. So we currently peer with hundreds of real ISPs, including Microsoft; the hope is that by the end of the year it will be thousands, and we’re putting routers that run our software into internet exchange points. So right now we’re in the Amsterdam Internet Exchange, which is one of the couple biggest ones, and we’re in a smaller one, and we’re kind of looking to expand that over the spring semester. And so those routers that we have for that will also be running reverse traceroute code because they’re really close to a lot of networks. They have, like, hundreds of peers and so they’re good as vantage points. In terms of using the internet exchange access, again, we’re building a community test bed there and so we’re accepting proposals from people who want to use it. That’s… I gave a talk on Friday at UW about that.
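On the BGP side, a testbed like the one just described ultimately comes down to announcing your own prefixes from your own ASN to the peers you have at the exchange. As a loose, generic sketch (not PEERING’s actual tooling), a tiny ExaBGP “process” script that announces a couple of documentation prefixes might look like this; the prefixes and timing are placeholder assumptions.

```python
#!/usr/bin/env python3
# Generic sketch of an ExaBGP process script: announce research prefixes to peers.
# The prefixes below are documentation addresses, not the testbed's real space.
import sys
import time

ROUTES = [
    "announce route 198.51.100.0/24 next-hop self",
    "announce route 203.0.113.0/24 next-hop self",
]


def main() -> None:
    # Give ExaBGP a moment to bring its BGP sessions up before announcing.
    time.sleep(5)
    for line in ROUTES:
        sys.stdout.write(line + "\n")
        sys.stdout.flush()
    # Stay alive so ExaBGP does not treat the process as dead and withdraw routes.
    while True:
        time.sleep(60)


if __name__ == "__main__":
    main()
```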
>>: [indiscernible] open that for public use, like, we do a traceroute from the exchange point to a server that you have located there, so…
>> Ethan Katz-Bassett: So in terms of traceroute experiments with that, the plan is to expose it as part of the reverse traceroute system. I hadn’t thought at all about exposing direct traceroutes from it. A lot of the same exchange points are gonna already have traceroute servers—AMS-IX, I’m pretty sure, does, at least. So the two ways—at least so far—I’d thought about exposing it were: one, it’ll be part of the reverse traceroutes, so if you hook into that, one of the vantage points that might get used to make some of your measurements might be at an exchange point; and then two, if you’re running BGP experiments, we accept proposals for the BGP test bed.
Okay. And I don’t know, Vijay, do you want to see the results showing the filtering that you were asking
about?
>> Vijay: Yeah, that would be very helpful… ‘cause we know…
>> Ethan Katz-Bassett: Let me figure out how to skip ahead.
>> Vijay: The reason I’m asking for that is that if you can tie it to per AS…
>> Ethan Katz-Bassett: Yeah.
>> Vijay: …expose that in the data structure, we can then say, “Okay, for these ASes, move the error bar to useless…”
>> Ethan Katz-Bassett: Yep.
>> Vijay: …and for the others…
>> Ethan Katz-Bassett: Yep. This is exactly what a student was doing for a class project this semester, but he didn’t finish it—but I agree that that’s how you want… you want to know, like, here I should look, here I shouldn’t.
>>: What kind of student is this guy?
>> Ethan Katz-Bassett: I don’t know. He was in the theory group, so maybe this is a little bit too close to the… So to give just some idea of IP option support, I couldn’t find any great numbers to pull from Rob’s papers, but he has this SIGCOMM…
>>: These things are so old.
>> Ethan Katz-Bassett: Yeah, it’s true.
>>: Then don’t even bother.
>> Ethan Katz-Bassett: Okay, okay. So I’ll skip this, but this is showing… this is, again, old data, but it shows that you can use vantage points to get around filtering in a lot of cases. So the scenario here is you’re trying to measure from D back to S again. When S sends the options probes, they’re filtered. How often can we find another vantage point that can get around that filter and still measure the reverse path? And so this is showing, over a thousand destinations that responded to options from somewhere, how many responded if you sent a direct probe and how many responded if you bounced the probe like this. So the white—and this is just showing for PlanetLab nodes—the white is how much they could get on their own; the red is how much they could get with help. And so you can see, in a lot of cases, using other vantage points to get around your own filters actually is a really big benefit.
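As a loose illustration of the idea just described (when a source’s own options probes are filtered, fall back to another vantage point that isn’t filtered toward that destination), here is a minimal sketch. The function names and the way reachability is tested are assumptions for illustration, not the system’s actual code.

```python
# Sketch: pick an unfiltered vantage point to measure on a filtered source's behalf.
# 'options_probe_reaches' is a stand-in for whatever measurement actually tests
# whether a vantage point's IP-options probes reach the destination; it is assumed,
# not part of the real system.
from typing import Callable, Iterable, Optional


def choose_helper_vantage_point(
    source: str,
    destination: str,
    vantage_points: Iterable[str],
    options_probe_reaches: Callable[[str, str], bool],
) -> Optional[str]:
    """Return a vantage point whose options probes reach the destination.

    If the source itself is unfiltered, use it directly; otherwise bounce the
    measurement through another vantage point that can get past the filter.
    """
    if options_probe_reaches(source, destination):
        return source
    for vp in vantage_points:
        if vp != source and options_probe_reaches(vp, destination):
            return vp
    return None  # every vantage point we have is filtered toward this destination
```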
>>: This isn’t as bad as I was thinking. I mean, I thought this was going to be pretty bad because… might we also use [indiscernible] to break it up, like, per AS and then say which ones—I mean, if you can have, like, some… a bunch of small ASes who are okay with spoofing because they don’t know any better—right—I mean, this is the clown category. And then there are some larger ones who are… who may be… and they’re… the thing is, how much of the path or the customer base is covered by people that squash and how much is covered by people that don’t? Right now this treats all ASes equally.
>> Ethan Katz-Bassett: Yep. Yep. Yeah, I agree. I don’t have results that directly look at that here, but
certainly that’s kind of…
>>: We can just eyeball it, and I can tell you.
>> Ethan Katz-Bassett: Yeah. It’s certainly what you want to think about. The nice thing is that, in general, the closer you are to the core, the less likely you are to be filtered, ‘cause it’s really hard to set up the filters. And so of all… last time I checked—which was maybe last year—all of the Measurement Lab nodes except for one could spoof, whereas the PlanetLab percentage is much lower. And the other thing is, I mean, we’re spoofing, but it’s in this very restricted way that’s safe, where we only ever spoof as one of our other vantage points, and so, like, when I initially talked about this at Google, they didn’t even really consider it spoofing, ‘cause really what you have is one of your vantage points pretending to be another one of yours; they’re both Microsoft IPs—that’s not spoofing, that’s just like anycasting the address. Like it’s… in some sense it’s more similar to that. And so if we have kind of a real deployed infrastructure that’s not just me piecing together PlanetLab, I can imagine that it might end up working out more like that, where this range of IP addresses is actually a perfectly valid set of IP addresses to source from here; it’s not spoofing arbitrary things. And, like, at USC we were able to set up kind of that sort of… whatever… the opposite of filtering—the ACLs still, like, let us do that three ISPs up from us so that we can do these sorts of measurements.
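To make the “restricted spoofing” design point concrete: the platform only ever sources probes from addresses it itself controls. A minimal sketch of that kind of safety check might look like the following; the prefixes, data structures, and function names are assumptions for illustration, not the system’s real code.

```python
# Sketch of the safety constraint described above: a vantage point may only
# "spoof" source addresses that belong to the measurement platform itself.
# The prefixes below are illustrative documentation ranges, not real infrastructure.
import ipaddress

PLATFORM_PREFIXES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def allowed_spoofed_source(source_ip: str) -> bool:
    """Allow a probe only if its (spoofed) source is one of our own addresses."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in prefix for prefix in PLATFORM_PREFIXES)


def send_probe(vantage_point: str, spoofed_source: str, destination: str) -> None:
    """Refuse to send anything whose source isn't a platform address.

    The actual packet transmission is omitted; this only shows the check that
    keeps the spoofing restricted to the platform's own vantage-point IPs.
    """
    if not allowed_spoofed_source(spoofed_source):
        raise ValueError(f"{spoofed_source} is not a platform address; refusing to spoof it")
    # ... craft and send the probe from vantage_point toward destination ...
```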
>>: Pointless.
>> Ethan Katz-Bassett: Yeah.
>>: That’s very cool.
>> Ethan Katz-Bassett: Any other questions? Okay, thanks. [applause]