>> Helen Wang: It's my great pleasure to have Nick Weaver from the ICSI research institute at Berkeley. Nick's institute, as well as a UCSD research group, have been working together to uncover the underground economy, as well as on a bunch of other network security work which has made a really significant impact. So I'm really happy that Nick is here with us, telling us about two pieces of work. So there are two talks. The first talk is about Netalyzr, and he has some very exciting stories about some ISPs; I will not spoil them, you will hear all the exciting stories from Nick firsthand. The next talk is about the spam underground economy. Okay. Thank you, Nick. >> Nicholas Weaver: Thank you very much. The way I figured it, if I'm flying up to Microsoft, I might as well give two talks rather than one. Also, it gave me an excuse to prepare the second talk. This first one I've given in a couple of places, because Netalyzr is this big network measurement tool that we've built at ICSI. It has been wildly successful at this point, as network measurement tools go. How many people have run it before? Well, if you haven't, you probably want to run it on your own network just to find out what exactly is broken. If you have, run it again, because we keep adding new tests. Also it's really fun to run on random hot spots. So this is joint work with Christian, Boris, Gregor and Vern, Boris while he was at ICSI as a visitor, and the work is sponsored by the National Science Foundation, with additional support from Google and Comcast, as well as EC2 time from Amazon. Again, opinions are my own and not those of the sponsors or co-authors. So let's start off with an anecdote of how Web search is supposed to work for people. They type something into the browser's search bar. In this case it's Internet Explorer 8, and the browser goes, okay, ask the DNS resolver what's the address of the search engine, which by default is Bing. I'm using this example not because I'm at Microsoft but because this is what actually happens for most people. The result comes back. The request goes to the Web server. And the reply comes back. However, we saw something interesting for about one to two percent of Netalyzr sessions. So not a huge number, but enough to be really significant. And I will get back to what this was later. But as a teaser: the ISP had installed some sort of DNS appliance in their network. And some third party, for purposes at that point unknown, was running a proxy server. Again, the user does a query. The DNS resolver returns the address of the local Akamai node that's serving Bing, and the appliance in between the resolver and the user goes, oh, wait, this is a request for Bing, I'm going to change it to the address of a proxy server. The request goes to the proxy server, and in this example nothing happens. The request gets forwarded to Bing. Results come back. And, well, nothing happens. The proxy forwards it on, and the user has no understanding of what happened. This is just one example of how the network really works in practice for people. And it gets to this lovely theoretical view of the Internet we have, where we've got our server over on that side and the user on this side. Traffic goes back and forth. But in reality, the path packets take might as well be Napoleon's march on Moscow. Fragments are just not allowed by the network these days. We have hidden proxies on various networks for DNS and http and other protocols.
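[Editor's note: a minimal sketch, in Python 3 (standard library only), of the kind of resolver cross-check that can surface the DNS answer rewriting described in the opening anecdote. The resolver addresses and the test name are illustrative assumptions, not Netalyzr's actual implementation; and CDNs such as Akamai legitimately return different addresses to different resolvers, so a mismatch is a hint, not proof of tampering.]

    import random
    import socket
    import struct

    def skip_name(msg, off):
        # Skip a (possibly compression-pointer) DNS name starting at `off`.
        while True:
            length = msg[off]
            if length == 0:
                return off + 1
            if length & 0xC0 == 0xC0:      # compression pointer: two bytes total
                return off + 2
            off += 1 + length

    def a_records(name, resolver):
        # Build a one-question A query with recursion desired and send it raw.
        txid = random.randint(0, 0xFFFF)
        header = struct.pack(">HHHHHH", txid, 0x0100, 1, 0, 0, 0)
        qname = b"".join(bytes([len(p)]) + p.encode() for p in name.split(".")) + b"\x00"
        query = header + qname + struct.pack(">HH", 1, 1)   # QTYPE=A, QCLASS=IN
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(3.0)
            s.sendto(query, (resolver, 53))
            msg, _ = s.recvfrom(4096)
        ancount = struct.unpack(">H", msg[6:8])[0]
        off = skip_name(msg, 12) + 4                        # skip the question section
        addrs = set()
        for _ in range(ancount):
            off = skip_name(msg, off)
            rtype, _, _, rdlen = struct.unpack(">HHIH", msg[off:off + 10])
            off += 10
            if rtype == 1 and rdlen == 4:                   # an A record
                addrs.add(socket.inet_ntoa(msg[off:off + 4]))
            off += rdlen
        return addrs

    # 192.168.1.1 stands in for the ISP-supplied resolver; 8.8.8.8 is a third party.
    local = a_records("www.bing.com", "192.168.1.1")
    outside = a_records("www.bing.com", "8.8.8.8")
    if local and outside and local.isdisjoint(outside):
        print("resolvers disagree:", local, "vs", outside, "- possible in-path rewriting")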
We have this problem called buffer bloat, where the network can do great file transfers or do interactive tasks, but if you try to do both at the same time, you're in a world of pain. We have this problem of NXDOMAIN wildcarding. Anybody who thinks that an error should be properly reported back to the computer doing the lookup is unfortunately disagreeing with most of the ISPs out there. And we have this one, where ISPs were man-in-the-middling search sites for, up until a month ago, completely unknown reasons. And, finally, this lovely gentleman created the home gateway. The home gateway, or NAT, is an abomination unto God and man. No fragments. Broken DNS resolvers. FTP proxies. SIP proxies. If a NAT can break the network in a new and interesting way, there's at least one vendor whose NAT breaks the network in that particular interesting way. I think the way to put it is: all NATs are broken, and all NATs are broken differently. So we wanted a tool to know what the network actually is. We wanted it to fill two roles. First, an easy-to-use network survey for anybody. Basically the motto is: so easy my mother can run it. And indeed it is actually easy enough that I can have my mother run this tool. >>: Your mother installs an applet, a Java applet? >> Nicholas Weaver: She doesn't have to install the applet; it's two mouse clicks for the typical user. She's running a Mac. Windows XP has Java installed, and Windows XP is still your biggest market share, so for all those users it's two mouse clicks. Second, a detailed diagnostic tool for experts, including what we colloquially call mom-mode debugging: tell your mother to run Netalyzr, have her send you the results, and now you know all the different ways that her network is broken. And so we targeted the Web browser, with just two mouse clicks. We did it in Java rather than JavaScript, because JavaScript doesn't allow us to do raw traffic, and we didn't do it in Flash because Flash gives us headaches. But there's a lot of sneakiness in our tests. >>: Why would Flash give you headaches? >> Nicholas Weaver: I've just looked at it, and their same-origin policy is weird. One of the nice things about Java is it has blame-the-user bypass of same origin. So our architecture is rather distributed. We built a cloud-based service. Our front-end Web server and DNS server are running at ICSI. All our servers are custom. We wrote them ourselves, because they fold, spindle and mutilate protocols in interesting ways. Then the test runs on an EC2 node. We right now tend to run five EC2 nodes at a time, but we've run as many as 20 when we're expecting a flash crowd. And this probes the network for everything. And then a result page is delivered, and we fetch the result back. And now I try the bold thing of a live demo on the network. So let's get on the guest network. This doesn't count in the mouse clicks, okay? >>: Well, while that's happening, let me ask you something. A couple of weeks ago I was at a hot spot and I tried to do a search from the browser just as you described. And instead of some hidden proxy, what actually happened was that instead of getting a Bing result I got a result from some other search engine. It redirected, hijacked me away from Bing to another search engine. >> Nicholas Weaver: That's interesting.
We've contemplated adding a test for that, direct to our server, because if it's done using DNS, Netalyzr would capture it; if it's done using direct injection, we don't currently capture it. But we do capture 404 rewriting, which a couple of ISPs are doing, where the ISP looks at the 404 page and rewrites it. So this is the basic user experience. They see our Web page. It's got a big start-analysis button. Although, if we ask my sister, who does marketing stuff, it's not big enough. They click the start analysis button. It gets redirected to a random back-end node, and there's a Java signed-applet dialogue; that's mouse click No. 2. If they click deny, most tests still run, just not all of them. And then it starts running, and the results display after way too long. And the guest network here is relatively clean. There's a little bit of port filtering, namely the Windows ports. Yes, even Microsoft is unwilling to allow the Windows ports onto the Internet. And DNS passes through a firewall that's protocol-aware. And there are a couple of minor bugs in the DNS server; basically it's not doing failover to TCP right. Common, annoying, doesn't really matter until you want to start doing DNSSEC. When you start doing DNSSEC, this may be a problem. So back to the talk. Yes, it works well enough that I'm willing to do live demos in all my talks. So to give you an idea of the test suite: this is actually most of the test suite. We've added in 404 rewriting detection. We've added in some more DNS probing. We've added in UPnP queries. And it really depends on the program committee how they react to the paper. The IMC program committee loved our paper. The SIGCOMM program committee said this reads like a data dump. One thing we have is lots of biases. Netalyzr is run by geeks. Far more Linux, far more Firefox than you'd ever expect in the real world. Far more OpenDNS than you'd expect in the real world. We have a tendency towards flash crowds. So our initial flash crowd was biased towards Comcast users. We recently received 20,000 visitors from Germany, because Heise provided a German translation and a link to it on their page, and as a result a lot of German geeks went, oh, this is a really cool tool. We've participated in at least two IPv6 deployment tests: Tore Anderson's trial, where if users had problems it threw up Netalyzr, and Comcast's IPv6 deployment trial. And one online game has decided, through no involvement of our own, that Netalyzr is a cool debugging tool. And this is a quote from their FAQ: if none of this worked, my Internet is pwned. And the answer is: run Netalyzr, post the results in the forum, and somebody who is a geek in the forum will actually be able to find out what's going wrong. We didn't intend this, but it shows just how usable this is for mom-level debugging. So one thing we do is really exploit Java's same-origin policy. If the signature is rejected, we are confined to same origin, and we wanted as many tests as possible to run within that. Fortunately, Java same origin is amazingly permissive from our point of view. We can contact our own origin server on any port using raw TCP and UDP, as well as http through the Java API. No objections. It allows us to do whatever we want. And since we have raw TCP and UDP, we can do anything that's network layer and above. We can play games with protocols, we can do direct DNS, all sorts of things. We can perform any DNS lookup, but the Java VM enforces same origin on DNS, sort of. If the answer is the origin server's IP address, that answer is returned.
Otherwise it raises a security exception: this request would violate same origin; you aren't supposed to know about any other IP addresses. But this allows us Boolean property tests on the DNS infrastructure. You do a query. It either returns the server's IP address or an error. So as long as it's a Boolean property of the resolver itself, we can test it without violating same origin. However, if the signature is accepted, Java sandboxing is gone. Java believes in a blame-the-user sandbox. There's one click to bypass the sandbox. And if the applet does something wrong -- >>: What are examples of the Boolean tests? >> Nicholas Weaver: For example, how is glue policy handled? Does the DNS resolver accept arbitrary in-bailiwick glue records? Does it accept in-bailiwick glue records only when they're in the NS RRset? Does the resolver validate DNSSEC? That's one we will be able to do. Can the resolver fetch large responses over TCP? For that one, we send back a truncated response over UDP carrying a false value; then, when the resolver retries over TCP, we send the true value. So this allows a large number of tests to be done. >>: Who accepts this signature, if you're Java -- >> Nicholas Weaver: Almost everybody. >>: The user? >> Nicholas Weaver: The user. >>: So that's another click? >> Nicholas Weaver: Yes, that's click two. >>: The allow/deny click? >> Nicholas Weaver: Yes. And I hate to admit it, but there are basically UI bugs involved that we are inadvertently but happily exploiting. The Java dialogue actually is fairly reasonable on the Mac. It's not great, but it says the applet is requesting access to your computer: deny/allow. Deny means run within same origin. We can't touch the disk. We can't touch the computer. So it could be much clearer, but it's okay. It's great for our purposes. It's kind of okay from a user-education standpoint. Sun's dialogue box for Windows, however: what does this mean? This application's digital signature cannot be verified. It can't be verified because I'm running a debug build, so I'm running with a non-verified, self-signed certificate in this case. This is its dialogue when it has no clue where the code's coming from. Run or cancel. Do you know what cancel means? Run within same origin. Run means run bypassing same origin. This is a textbook example of how not to design a dialogue box to educate users. But it's great for us, because it means everybody bypasses same origin. So the first result of interest is fragments and path MTU discovery. We test fragments by sending a 2,000-byte UDP datagram. It should just be fragmented by the network. Ones sent from our server are sent without DF set, so the client side should defragment them, et cetera. Eight percent of sessions can't send fragments; eight percent can't receive fragments. So basically fragmentation to the client does not work on the modern Internet. This is due to stateless firewalls and NATs. It's easy to make a stateless firewall break all fragments. People do it all the time. Those who can send fragments may still have an MTU hole. Three percent of the sessions can send a 2,000-byte UDP datagram but cannot send a 1,500-byte UDP datagram. This is largely due to, we'll get to it in a bit, PPP over Ethernet, which is still distressingly common. >>: That doesn't exactly match what's on the slide. Did you mean that three percent of all sessions can send 2,000-byte frames, or three percent of those which can send fragments? >> Nicholas Weaver: Three percent of the sessions which can send fragments. That is, they can't send an unfragmented datagram of a size that would only need fragmenting on something other than an Ethernet network. So I hope that's precise enough. Basically, they don't have a full 1,500-byte Ethernet MTU, and something keeps fragments from working.
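[Editor's note: a toy version of the fragment test just described, assuming a cooperating UDP echo service at a placeholder address. A 2,000-byte UDP payload must be fragmented on a 1,500-byte Ethernet path, so a lost echo suggests fragments are being dropped; Netalyzr's real server additionally reads raw packets so it can report fragment sizes.]

    import socket

    ECHO_SERVER = ("echo.example.net", 1947)    # hypothetical cooperating server

    def echoed(size):
        # Send `size` zero bytes and report whether any echo comes back.
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(3.0)
            try:
                s.sendto(b"\x00" * size, ECHO_SERVER)
                s.recvfrom(65535)
                return True
            except OSError:                     # timeout, or EMSGSIZE from the local stack
                return False

    if echoed(2000):
        print("2,000-byte datagram echoed: send-path fragmentation works")
    else:
        print("2,000-byte datagram lost: fragments are likely being dropped")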
Java does not allow us to send raw packets; our servers do. So we have a UDP echo server that reads and writes raw UDP packets. It can return the size of the initial fragment received, so we can see what the fragmentation is. It can send packets of a specific size with DF set and see the ICMP too-big replies. It basically supports traceroute functionality from the server to the client with path MTU discovery, so we can figure out where the path MTU bottleneck is. And the network is mostly, but not all, Ethernet. There's this great quote: "I don't know what the physical layer 20 years from now will look like, but it will be called Ethernet and have a 1500-byte MTU." Well, at least for right now. But there's a lot of PPP over Ethernet still out there. There's a lot of old DSL modems, and they're configured with PPPoE. >>: What's the denominator for all this? How many sessions? >> Nicholas Weaver: What do you mean? This is for all of our user base of 300,000-plus sessions. So this is a good representative survey of the Internet, modulo the biases. We would have hoped that with the number of geeks involved this would be lower, because geeks are willing to pay for better Internet and upgrade their modem. So that's kind of distressing. In these cases, where you should get an ICMP too-big sent, only 60 percent sent them. So path MTU discovery for IPv4 has been decreed broken on the Internet. It doesn't work. Forty percent of the time you never get the ICMP too-big. And basically path MTU discovery must use fallback when the ICMP isn't received. Oh, and you guys don't have this problem, but we do: Linux uses path MTU discovery on UDP by default, and it sets the DF bit on all UDP datagrams that aren't fragmented by the network driver. Don't ask me why. >>: Question: what's eight percent of sessions? So this is drawn from how many hosts? >> Nicholas Weaver: About 250,000 hosts or so. Over 200,000-plus. >>: Okay. >>: So I have a question. How did you decide to look into -- [inaudible] among the characteristics, what motivated you to look at these two? >> Nicholas Weaver: Basically we were motivated by problems that we've personally experienced. We've personally experienced a problem where in the middle of the network there was a path MTU hole, and that broke SSH connections from Germany to ICSI but not from Germany through Comcast to ICSI, because it was in the middle of the network. So that's how we chose that. So proxies are another thing that we see way too many of. They're delicate things. Proxies assume that the traffic conforms to the protocol. But we control both ends, because for everything but http and DNS we're connecting to a custom server on our end over a custom protocol; namely, you connect and it sends back the IP and port number you connected from. So this lets us map port filtering policy; a sketch of the classification follows below. If the response is received as expected, we don't detect filtering. There could be a smarter proxy that goes, oh, this is just somebody playing games, I'm going to allow it. If the IP address is changed, this says either this protocol is redirected through a different computer, or the user has a load-balancing network connection between multiple IP addresses, or the user started Netalyzr, closed it and opened it elsewhere, which is why these days we do tests in order. So we'll see that.
If the connection fails altogether, so the SYN never completes, then the port is blocked somewhere in the network. But the interesting case is when the connection succeeds and different data is returned: it passed through a declared proxy. The proxy actually sent something back that was unexpected to us. And if the connection succeeds but no data is returned -- [cell phone] -- sorry -- then the request or response was blocked by a network device that was probably protocol-aware. Something in the network goes, oh, this is a protocol violation, stop it. That's actually quite common. So, a few surprises. POP proxies are amazingly common. Seven percent of sessions reject our protocol violation, and another six percent actually send us a proxy banner. And it's almost invariably local-host antivirus; the local antivirus's rationale is to block mail worms by intercepting POP connections. They don't actually end up intercepting IMAP, SSL or any of those, so it's questionable how well it works in practice. But we see a lot of it. NATs do FTP proxies, for those who remember networking 101: FTP is broken and bass-ackwards, so the data connection comes from the server to the client. So a large number of NATs just don't bother doing port forwarding and parsing the protocol; they just run a proxy. >>: Is the NAT, is the network translation -- >> Nicholas Weaver: The home gateways. >>: Okay. The -- >> Nicholas Weaver: Sorry. I capitalized the S. SIP-aware devices are amazingly common. >>: These local antivirus programs that [inaudible] to the client, how do you know? Do you have screen capture? >> Nicholas Weaver: We don't have screen capture. We have the banner that it returns. And often the banner says AVG POP proxy. So the data coming back is specifically saying what the product is. SIP-aware network devices are surprisingly common. This is a big headache for those doing VoIP work. We saw less SMTP filtering than we expected. Only about 25 percent blocked SMTP requests to our server, and eight percent rejected the protocol violation, which basically suggests that many ISPs are doing port 25 filtering dynamically. They're looking at the network: if you have a little port 25 traffic, they allow it; if you have too much, then you're blocked. This is a compromise. It keeps the spam bots from spamming but keeps the support calls down. Expected results: port 443 is almost completely unmolested. Run an SSH server on port 443 back at your home institution; you will thank yourself when you're on oddball networks. And the Windows port blocking is very common. And this was distressing: UDP port 1434, the Slammer worm port of, what, eight years ago. That single UDP port, which is in the ephemeral range, so you can have stuff randomly assigned to use that port, and 20, 30 percent of the net still blocks it. That's distressingly high. DNS is commonly firewalled. Ninety-nine percent have DNS access; that is, they can contact our DNS server. But of those, 10 percent fail to fetch non-DNS traffic over port 53, which says there is some device in the network path going, oh, this doesn't look like DNS, kill it. Which means you'd better hope, as you deploy DNSSEC, that this device actually understands DNS well enough not to kill your legitimate DNS traffic. DNS is somewhat proxied: in 1.4 percent of sessions, a DNS request sent from the client directly to a remote machine is redirected through some other system.
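[Editor's note: a sketch of the probe-and-classify approach described above. The test server name and ports are placeholders, and the assumed protocol is Netalyzr-like but simplified: on connect, the server writes back the source IP and port it saw, so the client can separate "clean" from "blocked", "silently dropped", and "answered by something else".]

    import socket

    SERVER = "probe.example.net"                # hypothetical test server

    def banner(port):
        try:
            s = socket.create_connection((SERVER, port), timeout=5)
        except OSError:
            return None                         # SYN never completed: blocked outright
        try:
            s.settimeout(5)
            data = s.recv(128)
            return data.decode("ascii", errors="replace")
        except socket.timeout:
            return ""                           # connected, but the reply was eaten in-path
        finally:
            s.close()

    control = banner(1948)                      # arbitrary, normally unfiltered control port
    for port in (25, 80, 110, 443, 445):
        b = banner(port)
        if b is None:
            print(port, "blocked: connection failed")
        elif b == "":
            print(port, "request/response eaten by a protocol-aware device")
        elif control and b.split(":")[0] != control.split(":")[0]:
            print(port, "server saw a different source IP: redirected through another machine")
        else:
            print(port, "looks clean, or a declared proxy answered:", b.strip())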
On DNS proxying, I've seen NATs where, if you don't set the recursion-desired bit, they will actually even block those connections. DNS filtering on the wire: you saw that 10 percent number. A lot of these devices, we've already shown, don't actually understand the DNS protocol. AAAA records get blocked. TXT records get blocked. Unknown resource records get blocked. EDNS gets blocked. Oh, you wanted to use DNSSEC to validate certificates so you don't have to deal with DigiNotar? Sorry, you are out of luck. The network has decreed DNSSEC shall not work for one to two percent of users. This is actually a big problem for those who want to use DNSSEC to reduce a lot of the certificate problems. Http proxies: well, we took advantage of the custom http server. Known headers, random capitalization; any proxy that resynthesizes headers will change the capitalization. If you see the capitalization of just the Connection header changed, that may be a censorware proxy; at least one vendor changes just that header's capitalization. Http proxies might not handle unknown requests. And they may have the bug, now two, three years old in this field, where the proxy follows the Host field. About five percent plus show evidence of some http proxy. So it's significant. We also detect caches, because a hidden cache must cache. Basically, having a custom Web server, we can lie about the cache-control headers. So the first time you fetch an image it shows up like this; the second time you fetch the image it shows up like this. Same size, different MD5. Known size. Four percent of IPs show in-path http caching, which would be fine and good, but 50 percent of the caches are broken. Fifty percent of the caches cache data where the server says, oh, any cache that may exist in the network, don't cache this, please; they cache it anyway. Hidden filtering must filter. About ten percent of hosts have virus filters that are running on the end host itself. We test with the EICAR test virus. >>: Which headers are you using for don't-cache? Are you using all of them? >> Nicholas Weaver: We do weakly uncacheable, "kindly, sir, don't cache," and strongly uncacheable, everything that says "do not cache this thing!" >>: So 50 percent is even with the strong one? >> Nicholas Weaver: A lot of this is with weak; a lot of what's weakly uncacheable will still get cached. We do see a fair number where strongly uncacheable will still get cached. Like, during development we had problems where the MSF network would change the Connection header from close to keep-alive, and we weren't handling that. And it would cache things that were explicitly marked as, oh my God, do not cache. >>: Are these ISPs? >> Nicholas Weaver: Usually ISPs in foreign countries where bandwidth costs more, or corporate networks. >>: A similar question. So basically four percent of IPs, that's more than 10,000 IPs. So did you try looking into whether it's particular areas? >> Nicholas Weaver: We haven't looked into geography beyond the country level. Certain countries are much more prevalent than others; in New Zealand, for example, you see a lot of Web caches. >>: So what would be a hunch: would it be a few ISPs, or -- >> Nicholas Weaver: It's very, very rare among U.S. ISPs. It is much more common among foreign ISPs, where they have high transit costs. >>: Those undesired caches: if the client sends a reload or an If-Modified-Since, I would assume that they honor that and fetch it fresh -- >> Nicholas Weaver: We don't check for that.
But I would not bet on it, since they were caching stuff that was marked by the server as oh-my-God-do-not-cache, which the browser itself should always be generating new requests for. Another problem we see is big buffers. They're all over the place. They're in access devices. They're in network drivers. They're in wireless access cards. Basically, everywhere there's a buffer, the odds are somebody involved in building it thought bigger was better. And so what we do is a UDP hammer: we do a max-rate UDP flow, see how much the latency increases, and then go the other way and see what happens in that direction. And it's just an endemic problem. So this is what we call the graph of pain. Basically, it's the upload bandwidth versus the inferred buffer capacity, based on the delay and packet size; a sketch of that arithmetic follows below. We get nice vertical striations, suggesting that we're getting realistic measurements of buffers. We get nice horizontal striations that indicate common service tiers. Anything right of the green line: under stress, the user experiences half a second of latency. Anything right of the red line: under stress, you can induce two seconds of latency into the user's connection. >>: Can you elaborate a little bit on why the big buffers are bad? >> Nicholas Weaver: Okay. Say I'm a full-rate TCP flow. I fill up the full buffer, and I'll alternate between full and half full with the sawtoothing. If the buffer is a simple drop-tail queue and I have some latency-sensitive thing like a SYN, that SYN now sees that full-buffer delay until the packet goes through, and basically the net can't walk and chew gum. >>: Are most of the buffers [inaudible]? >> Nicholas Weaver: Yes, because a lot of these are either in the wireless card or in the access device, the cable modem and DSL modem hardware. And the downlink is fairly bad; the uplink is worse. And basically, until you do a full-rate transfer at the bottleneck, there's no problem. Once you do, everything interactive goes to heck. This is why, if you have roommates, you don't allow them to run BitTorrent: your gaming will suffer. Even if we use user self-reporting to restrict to wired connections, to exclude the wireless buffers, which are unfortunately big, we still see a lot of systems with very large buffers. So we're able to fill up buffers, and if we can fill them up, a TCP flow can. If it's just a drop-tail queue, then they're in trouble. We have this lovely graph of just Comcast users; God bless biases in your dataset. There we only really see two big bands of capacity, which is what you would expect: they're more homogeneous, and they have a lot of rental cable modems. There's some suggestion that some cable modems actually implemented buffering properly; that is, in a simple drop-tail queue you put a max delay, not a max capacity. You do that and you get good behavior. It's great if you're doing just dumb drop-tail queues; it gets you 90 percent of the way there with a comparator and a timestamp. However, we tracked down these cable modems. It looked like one old Linksys. I bought one of these off of Flea Bay, plugged it in, and got the Comcast walled garden saying this cable modem is too old to connect to our network, which suggests that at least one hardware vendor used to get it right and then doesn't anymore. But basically it's the walk-and-chew-gum problem. There's some work on 90 percent solutions: smarter NATs. I hate NATs, but if you're going to have them anyway, you can have them do queue management, so that you can fix your network with a better NAT. The 100 percent solution is very hard.
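[Editor's note: the back-of-the-envelope arithmetic behind the inferred-buffer axis of the graph of pain: if a full-rate flood adds D seconds of delay on an uplink of R bits per second, roughly R times D over 8 bytes are sitting in a buffer. The numbers below are made-up examples, not measurements from the talk.]

    def inferred_buffer_bytes(uplink_bps, added_delay_s):
        # Bytes queued = rate (bits/s) * added delay (s) / 8 bits per byte.
        return uplink_bps * added_delay_s / 8

    # A 1 Mbps uplink showing 2 extra seconds of latency under a full-rate
    # UDP flood implies roughly a 250 KB buffer somewhere in the path:
    print(inferred_buffer_bytes(1_000_000, 2.0))    # -> 250000.0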
Basically, every buffer which might be congested needs smart queue management. And this is required if you want sub-hundred-millisecond latency under load. So if you want clean VoIP while you're doing full BitTorrent, you need smart queues. And NXDOMAIN wildcarding is endemic at this point. This is the problem where you make a typo and you get a helpful search page from your ISP. Strangely enough, this helpful search page is filled with ads. When they do wildcard, they're often broken. Seriously, they don't just wildcard www, which the browser generates; they'll wildcard anything. They basically assume that every DNS lookup comes from a Web browser, because the only thing they return is the address of a Web server. Six percent, most of these being OpenDNS, but some not OpenDNS, wildcard SERVFAIL. So they don't care to limit collateral damage. One percent of the non-OpenDNS sessions, five percent overall, wildcard an A record for a zero-answer reply. If I have an IPv6-only test site, it returns a AAAA record, and queries for the A record return a valid answer with zero answers; the server is saying this name is valid, there's just no A record for it. OpenDNS wildcards that up with an A record. You cannot build an IPv6 test site and expect it to work reliably if you have OpenDNS clients. This is a bug. I reported it to them months ago. As far as I know, this bug is still there. If you're going to use third-party DNS, use Google Public DNS. It doesn't have these issues. And this is disturbing: we actually tracked down the vendor literature. This goes for one to three dollars per customer per year. For that, which for even a big ISP is slightly more than pocket lint, they will drastically change how the Internet operates. But this is the interesting one, and it directly affects this company. Some ISPs were manipulating DNS results for Google, Yahoo! and Bing. This is what I showed at the start. Yahoo! and Bing lookups always returned the IP address of a third-party-controlled proxy in one of two address ranges. Google was sometimes faithful, sometimes an ISP-controlled proxy, sometimes a third-party proxy. And the behavior was set per ISP, so the ISPs had specific control over how Google was manipulated. Google put a halt to all Google proxying back in May by throwing up a CAPTCHA on the search page with a link to a page on why did I get this CAPTCHA: your ISP is doing this stuff without your permission. Presumably they also got their lawyers involved; most lawyers you talk to call this wiretapping. And the records are clearly for proxies; these aren't Akamai nodes. When Yahoo! and Bing addresses don't have clean reverse DNS, we now do a direct probe of the server, because on a lot of them, Akamai, you don't get clean reverse DNS. But these are not Akamai nodes. These are proxies. Proxy traffic appears to be unmolested; as we will see in a sec, that's not quite the case. Https is proxied through to Google unmolested, because Google is the only one of the three that runs https service on these search domains. So if you are pointed at one of these rogue DNS resolvers and go to https://www.bing.com, you get a complaint from your Web browser that this certificate is only for Google. Requests for non-proxied hosts return a redirect to 255.255.255.255, so we can see that it's proxying all three of these, and the banner reveals that it is a Squid proxy.
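[Editor's note: a sketch of the certificate cross-check just described, assuming Python 3.7 or later. Connecting to the suspect address while asking for www.bing.com makes standard verification fail with a hostname mismatch if a hidden proxy is really fronting Google; the address below is a documentation placeholder.]

    import socket
    import ssl

    def cert_matches(ip, hostname="www.bing.com"):
        ctx = ssl.create_default_context()
        try:
            with socket.create_connection((ip, 443), timeout=5) as raw:
                with ctx.wrap_socket(raw, server_hostname=hostname):
                    return True             # chain validates and the name matches
        except ssl.SSLCertVerificationError as err:
            print("certificate complaint:", err.verify_message)
            return False

    # e.g. cert_matches("203.0.113.7")      # the address your resolver handed back for Bing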
An invalid request puts up a page referring to phishing-warning-site.com, which is a GoDaddy-parked domain name with anonymous registration. That would be a dead end, except that if you Google for phishing-warning-site.com you get complaints about Google being down, because if the proxy has an internal error, it also throws up this banner. So when the proxy breaks, the user thinks Google is down, sees the banner, and if you search for it you get this. So, anybody who has friends on the DNS library side at Microsoft? Client-side DNSSEC validation is a must-have. We cannot trust the recursive resolver. So why were they doing this? Well, these ISPs had this behavior as of this summer: Cavalier, Cogent, DirecPC, Frontier, Fuse, IBBS, Insight Broadband, MegaPath, PAETEC, RCN, Wide Open West and XO. Who here uses Frontier? Double-check your network or use a third-party DNS resolver. >>: How did you conclude these ISPs have this problem? >> Nicholas Weaver: Because when we do these lookups on the client, ship them to the server and record them, we saw that these ISPs were routing the users' requests for Yahoo! and Bing through these proxies. >>: You have sufficient clients -- >> Nicholas Weaver: Yes. Dozens to hundreds, depending on the ISP. >>: And they all experienced the problem? >> Nicholas Weaver: Almost all of them. It depends on the ISP. Like PAETEC: only one of their resolver clusters used this, and the other ones did not. Frontier: pretty much all their resolver clusters did this. Actually, Weidong, I'd like to talk to you later; we'd like to do some testing on Frontier, your home network using Frontier. >>: I'm kidding. >> Nicholas Weaver: So Charter was doing this, but they stopped. And Iowa Telecom was doing this, but when they got bought by Windstream, they stopped. Who was responsible? This lovely company called Paxfire, which specializes in monetizing DNS. We found Paxfire demonstration servers and a bunch of other evidence. We also found corroborating data online. And this was the one that stumped us until just a little over a month ago: why we couldn't get the proxies to change anything. Note that there's an MSR paper that also noticed this, but as far as I can tell it didn't answer the why question either, let alone get the publicity. It turns out the Google post had a pointer to another blog post that had a URL that was changed, and that allowed us to determine the trigger condition: if the query was from the browser search or URL bar only, and the query matched one of at least 165 specific keywords, then replace the search results with a 302 redirect through an affiliate network. >>: How do you know whether the query is from the browser -- >> Nicholas Weaver: I will get to that, because this is what happened. Let's say the search is for CA; they're interested in the state of California. Standard flow: it goes to the resolver, the DNS appliance lies, and the query goes to the proxy. Now, the proxy actually observes two things here. The first thing it observes is that it's a search for a keyword, and the second is that the URL actually does say this is from the browser URL or search bar, not from the Bing home page.
So it looks for both of these conditions and does a 302 redirect, in this case to Paxfire's own analysis server, which goes, oh, this is for a specific affiliate program, and does a 302 redirect through the affiliate program, which then does a 302 redirect back to the final merchant. And after the final merchant gets the request, well, in this case the user ended up at the Computer Associates Web store. In the process, presumably, the merchant pays the affiliate program, which pays Paxfire, which splits the money with the ISP. This sounds very complicated, but let's see: it happens basically in real time. It bounces back and forth, and the page renders. And how are users supposed to know about this? >>: How did you get these queries? >> Nicholas Weaver: What we did was take the Alexa top 10,000, strip out the suffix so it was just the domain name, like google, amazon, ec2, and we bombarded this server with queries to see which ones it would modify. >>: I have this question. >>: Has anyone sued you for modifying their network traffic? >> Nicholas Weaver: No, they sued us for libel because we pointed out that they -- or rather, they've sued another party for libel, because we pointed out that this can monitor user searches. They don't claim to monitor user searches, just modify them, so it's okay. >>: How long does it take to run an analysis from end to end? >> Nicholas Weaver: Netalyzr takes about three, four minutes. We do a lot of tests. So they do modify and monitor, but they claim not to record and profile. So from their point of view it's okay. >>: The user clearly notices. >> Nicholas Weaver: How? >>: Because he didn't get the query results. >> Nicholas Weaver: How many users would actually notice this? Weidong, do you use Paxfire; do you use Frontier's DNS resolver? >>: [inaudible] -- I assume he's working somewhere. >>: Working from -- >>: And I never observe anything bad. I search Bing; I always get Bing. Basically -- >>: The search bar. >>: Search bar, yes. >> Nicholas Weaver: Do you hit the keywords? >>: No, and IE doesn't have a search bar; it's only in the address bar. >>: It's interesting stuff. >>: So let me finish what I want to say. Basically, I know Frontier is doing this: if I type an address that's not resolved correctly, it redirects to their own search engine. >> Nicholas Weaver: That's their -- >>: Search this website. >> Nicholas Weaver: Actually, talk to us. That opt-out is -- if Frontier has not added a new opt-out, the opt-out they provide at the bottom of the page is fictional. It just sets a browser cookie, so the next time you go to the page it sends up a fake browser error message. I have a protocol for testing their opt-out. So let's communicate afterwards. So on the 26th we measured the monetization web of the things they manipulate. Some were redirected direct to the merchant: if you did Amazon, eBay, PayPal, it went direct, because they have direct affiliate programs. I got to C before I got tired of copying company logos. Alibris, Computer Associates, CompUSA, DinoDirect: they went through LinkShare. The Google Affiliate Network got a lot, two of which are of note: if you typed HBO or FIFA into your address bar on the 26th, you would get to a page saying this DoubleClick URL is wrong. So you wouldn't even get to the final destination. Some went through Ask. All of these ended up going to YP.com, with a banner on the top, did you really mean the site you were looking for or YP.com, and a random yellowpages.com listing in it. Kohl's was especially amusing on the 26th.
You would get a Kohl's Web page framed with an ad for JC Penney. A lot went through Commission Junction; they're the biggest. I gave up at A. And some Commission Junction traffic was then redirected to DoubleClick for these airlines. During this process it's unclear whether anybody understood how Paxfire was driving the traffic. So -- >>: So are these people actually paying Paxfire? >> Nicholas Weaver: Presumably, because that's how it works. >>: Then to them it looks like someone went to a search site and clicked on a link or something. >> Nicholas Weaver: Or an affiliate ad; an ad on somebody's Web page. >>: Google knew about Paxfire? >> Nicholas Weaver: All three big affiliate programs, LinkShare, Google and Commission Junction, suspended the accounts within two business days. >>: So they didn't know. >> Nicholas Weaver: They didn't know, as far as we know. So the current situation: they halted the proxy-server redirection to affiliate links within 24 hours of public disclosure. But as of the 16th, they've maintained the DNS redirections that route user traffic through the proxy servers, so presumably they can resume modifying results at a later date. Why they haven't stopped, I don't know. Commission Junction, LinkShare and Google have all publicly suspended Paxfire's account pending further investigation. We haven't heard anything more from them, but they've all publicly stated the account is at least temporarily suspended. And so that's the amazing stuff ISPs will do to manipulate traffic. Like, we now see at least two ISPs that do in-path 404 rewriting, where they have an IDS box that looks at http replies, and if the http reply is a 404 they inject a JavaScript redirect to their advertisement page. >>: So Paxfire -- >> Nicholas Weaver: But that's a different company that's doing that. >>: So it sounds like the ISPs are the problem. >> Nicholas Weaver: Both. There are ISPs that use Paxfire for NXDOMAIN wildcarding that don't do this behavior. It's both. >>: It sounds like Paxfire is just offering a service, providing an additional -- >> Nicholas Weaver: That's what they sell to the ISP, but they're direct participants in the traffic. They're the ones running the proxy servers. So they are the ones providing the infrastructure and running the systems that are monitoring user search requests and modifying the results, or at least were monitoring and modifying. So, conclusions. Well, this worked. We actually did write-once-run-anywhere, and actually got it to work. You can, as a small group, build a robust and comprehensive network measurement and diagnostic tool. We got slashdotted, and the sysadmin never noticed, with a tool that does full-rate UDP flows in ten-second bursts. We didn't get noticed. We did a lot of sneakiness; well, the network is your adversary these days. We viewed the network as an adversary and tried to trip it up, and we found interesting things. [applause]. >>: How do you turn this into a sort of longer-term monitoring strategy? It seems like people show up once or twice. >> Nicholas Weaver: A lot of users use it for debugging now, so it has become something people run on a regular basis. We're at 20 to 70 sessions an hour just as a random baseline at this point, because of random links, people running it, people getting to Starbucks and running it, et cetera, et cetera. >>: Administrators running it? >> Nicholas Weaver: A fair number of network administrators really like the tool, because it allows them to help debug their own and their parents' networks. >>: Isn't debugging performance, connection --
>> Nicholas Weaver: Connection problems, like NAT issues. >>: But if the connection doesn't work and you can't connect to your server, how can -- >> Nicholas Weaver: Because there are a lot of problems where the connection sort of works. So, for example, the Internet seems slow. What causes the -- >>: The problem. >> Nicholas Weaver: Yeah. Or some sites aren't loading right. That could be a DNS error. So, for example, with OpenDNS wildcarding SERVFAIL, if the site's DNS server is down you'll get oddball OpenDNS stuff rather than the legitimate response. So there are a lot of things like that. There are a lot of just weird edge cases. One of the tests we do is background outages: the test takes three minutes, and during those three minutes we do one ping every tenth of a second; if we lose three packets in a row, we alert on that. So that says that your network is flaky and dropping packets. >>: So you also did the tripwire work with Charlie Reis before. Can you put them together somehow? >> Nicholas Weaver: We actually do use the tripwire test to see if we're in a frame, and we do direct http requests to see if some replies are modified. So we do do similar detection; we include tripwire functionality in this. >>: So in that work you discovered a bunch of ISPs injecting content into the result page? >> Nicholas Weaver: Yeah. And we see two ISPs injecting content into 404 pages, which is something that you actually really would not be able to do with the tripwire, because of how the framing and the redirections work. So there's a lot of that stuff. >>: So for IPv6, it seems like there are a lot of negative lessons here, a lot of things that will break. Is there anything we can take as a positive, like here's a way in which we could deploy IPv6? >> Nicholas Weaver: IPv6 is looking remarkably good in our data; about five percent of users have IPv6 connectivity. The gross problem, where broken IPv6 meant a broken Internet for dual-stack sites, is no longer the case. Apple and Opera fixed those showstoppers, so now basically everybody's doing happy eyeballs: if you get a AAAA record and an A record, you try them both, and whichever one finishes first is the one you use; a sketch follows below. IPv6 path MTU discovery is much less broken than v4, mostly because most IPv6 is going through tunnels, but at the same time it has that minimum 1280-byte MTU that's enforced in the spec. So the path MTU and fragment issues for v6 seem to be a lot better. One thing that's nice for v6 is it gets the NATs out of the way, because the NATs just pass the traffic, so it reduces the fragment problems. And port filtering on v6 is vastly less than on v4. A lot of computers have v6 without knowing it, because of Teredo and 6to4. The situation with 6to4 is now vastly better than it was a year ago, when 6to4 turned on would mean your computer broke for dual-stack sites. Now, if 6to4 doesn't work, it doesn't work and it doesn't matter. >>: 6to4 -- is that deployed on the server side? >> Nicholas Weaver: Those relays are deployed in the network infrastructure. How 6to4 works is: you encapsulate your v6 packet in v4 to a special anycast address. It gets to the first relay, which strips it out and sends the v6 on, and your v6 address has the v4 address embedded in it. The reply goes to a special v6 anycast address, gets to a relay that re-encapsulates it and sends it on. >>: Our own networks -- major ISPs? >> Nicholas Weaver: Very few ISPs are -- 6to4 is actually something that's just on the client.
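[Editor's note: a minimal happy-eyeballs sketch along the lines described above: resolve both address families and race the connection attempts, keeping whichever succeeds first. A real implementation would close the losing sockets; Python's own asyncio.open_connection has this behavior built in via its happy_eyeballs_delay argument.]

    import socket
    from concurrent.futures import ThreadPoolExecutor, as_completed

    def attempt(info):
        family, socktype, proto, _, sockaddr = info
        s = socket.socket(family, socktype, proto)
        s.settimeout(5)
        s.connect(sockaddr)                  # raises OSError on failure
        return s

    def happy_eyeballs(host, port):
        # getaddrinfo returns both AAAA- and A-derived endpoints when available.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        with ThreadPoolExecutor(max_workers=len(infos)) as pool:
            for fut in as_completed([pool.submit(attempt, i) for i in infos]):
                try:
                    return fut.result()      # first successful connection wins
                except OSError:
                    continue                 # that address failed; keep racing
        raise OSError("all connection attempts failed")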
Teredo is just on the client, and Microsoft has it on Windows, and it actually works most of the time now. So the v6 story is: it's still pretty rare, on the order of 5 percent, but it's not breaking things, which means turning on v6 is no longer the catastrophe it was a year ago. And some retail ISPs are shifting to it. Comcast, for example, is supporting v6 in trial deployments as native v6 to clients, I think mostly because they want to use dual-stack lite to handle the v4 address-space exhaustion problem. >> Helen Wang: Great. So we have a half-an-hour break. I'm sorry, are there any more questions? We have a half-an-hour break, and then the next talk starts at 11:30.