DNS Cache Poisoning
Chris Racki
CMPT-585
Dr. Robila
8 Dec 2008

Abstract

Using the internet, any computer can theoretically communicate with any other computer in the world so long as both are connected to the internet. Amidst all of the computers connected to the internet, how do you know where the computer that you want to talk to is located? The answer is to use DNS, which serves as an internet phone book of sorts, linking us to anywhere that we might want to go. As many requests to DNS servers will be repeated, it makes sense to cache the results and reuse them to improve performance. This caching functionality puts the integrity of the DNS server at risk by making it susceptible to DNS cache poisoning attacks, which can potentially assign any IP address to any internet address. DNS cache poisoning is not new, but recently a new approach to the attack has been discovered that makes it much more effective and potent. Finally, as a curiosity, some commonly available DNS server safety check tools were tested to determine the current state of this author’s internet service provider’s DNS server.

Introduction

The internet has had a very colorful history so far. Its reach and influence continue to increase. We continually learn of new applications for the internet, and we continue to push more and more of our lives onto the web. More and more of the electronic devices that we rely on in daily life depend on the internet, which in turn makes us dependent on the internet to some degree. We are placing a tremendous amount of blind faith in the reliability of the internet. This would not be a problem if the internet were in fact reliable. However, the internet has many points of weakness. One such point of weakness is in the very foundation of the infrastructure of the internet. The internet is at its base flawed and vulnerable to severe compromise.
The system on the internet responsible for navigating users to their appropriate locations, DNS, is exposed to attack by its very design. In the summer of 2008 a new threat to this system was discovered that drove fear and almost panic into the entire internet security community. This threat is a new application of the classic DNS cache poisoning attack.

Internet navigation

Using the internet, any computer can theoretically communicate with any other computer in the world so long as both are connected to the internet. The obvious question becomes: amidst all of the computers connected to the internet, how do you know where the computer that you want to talk to is located? We can find a very convenient real-world analogy for this. If you want to telephone Dr. Robila at Montclair State University in New Jersey, how do you know which phone is his? Unless we call Dr. Robila on a regular basis, we don’t know his direct number. So in this case we need to look up his number in a telephone book. Once we get his number from the phone book we can call him.

The internet works in much the same way. When a user attempts to connect to a computer on the internet, in other words to visit a website or access some other resource, the client computer sends that request to its internet service provider. When the request is for a commonly visited website, such as www.google.com, the internet service provider may already know where it is and provides the connection information to the client. If the request is for something that the internet service provider doesn’t know, then it must look it up, much as we looked up Dr. Robila’s phone number in the telephone book. In technical terms, this phone book is called DNS (Domain Name System) [1].

DNS

In the early days that pre-dated the internet as we know it today, there weren’t very many computers interconnected, and so the problem of locating computers was much simpler.
Back then there was no internet, just ARPAnet (Advanced Research Projects Agency Network). Many people consider this to be the beginning of the internet. ARPAnet was a computer network developed by the United States Department of Defense to facilitate communication among computers [6]. When a computer wanted to connect to another in the network, it would look up the receiving computer’s address in a file called HOSTS.TXT [1]. This file was stored on a computer at SRI (Stanford Research Institute), which is now known as SRI International [7].

As networks began to grow it became apparent that this system would not be feasible. Using a host file to store lookup information has many drawbacks that become painfully more apparent as the network grows. The main drawback is that when there are many host files and the address of a computer changes, all of the host files that refer to it must be updated. The answer to this problem came in 1983 when DNS was invented by Paul Mockapetris. In 1984 the first UNIX-based implementation was developed at the University of California, Berkeley. The system was called BIND (Berkeley Internet Name Domain). BIND has evolved greatly since that time and currently exists at version 9. There are other DNS systems available and in use, but BIND remains the most commonly used DNS software on the internet [1].

How DNS works

When we wanted to look up Dr. Robila’s phone number we looked in the phone book. This assumes that we know which phone book to look in. There are in fact many different phone books, which keeps the records manageable. In certain cases we may not even know which phone book we should consult. DNS functions in much the same way. DNS is composed of a number of servers. Each server knows something that the others don’t, and combined they can offer us all of the information that we need. DNS can be viewed as a distributed database system.
There are 13 named DNS root servers, hosted at locations throughout the world, serving the internet. Since they are a sort of starting point for many DNS requests, their IP addresses do not change often. This makes their addresses relatively constant and quite reliable. They are named A.ROOT-SERVERS.NET, B.ROOT-SERVERS.NET, …, M.ROOT-SERVERS.NET [9]. Each root server is operated independently. For instance, A.ROOT-SERVERS.NET is located at two sites, in Dulles, VA, USA and Ashburn, VA, USA, and is operated by VeriSign, Inc. B.ROOT-SERVERS.NET is located in Marina del Rey, CA, USA and is operated by the Information Sciences Institute. Similarly, all of the other root servers are distributed around the world [8].

The root servers don’t know everything, and alone are not very helpful. However, they do know where to find other servers that have more information. The root servers are responsible for knowing where to find the Global Top Level Domain Servers. The Global Top Level Domain Servers know all about the top level domains. The top level domains include the following: COM, EDU, GOV, MIL, NET, ORG, ARPA, (country code), AERO, BIZ, COOP, INFO, MUSEUM, NAME, and PRO [9]. These servers know how to route requests further so that they can be answered. Beyond that there are lower level servers that know even more details for routing a request. Eventually this granularity leads to the authoritative server for a domain, that is, a server that knows directly how a local domain is organized because it is responsible for managing it [2].

What’s in a request

DNS servers are best known for their ability to provide us with the IP addresses of the computers that we want to find. While this is their primary function, they can perform others as well. Some of the common record types that DNS servers work with are A, NS, MX, SOA, CNAME, and TXT [5].
For the purpose of this discussion we will focus only on type A and type NS records.

A type A record holds the IP address of a given host name. When we ask for the IP address of www.google.com, the DNS server will respond with an A record that has this information [5]. An A record is what is returned when the server actually knows the address of what we are looking for.

A type NS record is what is returned when a server doesn’t know directly about what we are requesting. Essentially it’s a referral to a different server that might have more information. The NS record has the names of servers that might have further information. It is also accompanied by what is called glue information, which is the IP address of each referred server name. If the server returned just the domain name of another server, the requesting server would likely have to look up that server’s IP address as well. So as a convenience the IP address of the referred server is also provided, so that the requesting server can contact it directly [5].

A typical DNS lookup

Let us follow the path of a typical DNS request.

1. A client issues a request to connect to www.google.com by asking the local DNS server for the location. If the local DNS server knows the location it supplies it, and that’s the end of the interaction with the DNS server for this request.
2. If the local DNS server doesn’t know the location it has to look it up. What it will do first is send a request to a root server, asking whether it knows the IP address of www.google.com.
3. The root server will respond that it does not know about www.google.com, but it does know about .com. It will respond with an NS record referring our server to some top level domain servers that are responsible for the .com domain and might have more information.
4. Our requesting server, now armed with the names and addresses of the .com servers, will send the same request to one of them, asking whether it knows the IP address of www.google.com.
5. Again, it is unlikely that any of these servers will know the whole address, but they will know about google.com. The responding server will send an NS record referring our server to a DNS server that handles the google.com domain.
6. With this information the requesting server will now ask the google.com DNS server for the IP address of www.google.com. This DNS server will know the IP address and will respond with an A record containing the appropriate IP address [5].

Caching for later

When the IP address of a computer is not known to the local DNS server, it requires a lot of work to resolve it, as we see in the example above. If www.google.com is a popular resource and is accessed many times, then this lookup needs to be performed very often. This is time consuming and inefficient. This is where the concept of caching comes in. When a DNS server goes through all of this work to resolve an IP address, it saves the result in its cache. This way, when a subsequent request is made for the same name, the DNS server can simply check its cache rather than performing this costly lookup all over again [1].

Fresh data

Caching known IP addresses is a great timesaver that can eliminate unnecessary lookups and improve overall performance. However, there is a limit to this convenience. IP addresses change, and if cached data were used perpetually it would eventually be out of date and point to the wrong resource. Because of this, cached data also has a Time To Live (TTL) attribute associated with it. The TTL is set by the authoritative server and allows DNS servers to reuse known data without letting it become outdated. The TTL can be anything from a few seconds to several weeks.
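The lookup steps and the TTL-based caching described above can be sketched as a small toy model. This is only an illustration under stated assumptions, not how BIND or any real DNS server is implemented: the server names (`root`, `com-tld`, `google-auth`), the address 192.0.2.10, the 300-second TTL, and the helper names `ask` and `CachingResolver` are all hypothetical, and no network traffic is involved.

```python
import time

# Hypothetical toy data: which names each server can answer for.
# "NS" entries are referrals to another server; "A" entries carry an
# address and a TTL in seconds. All names and addresses are made up.
SERVERS = {
    "root": {"com": ("NS", "com-tld")},
    "com-tld": {"google.com": ("NS", "google-auth")},
    "google-auth": {"www.google.com": ("A", "192.0.2.10", 300)},
}


def ask(server, name):
    """Return the record `server` holds for the longest matching suffix of `name`."""
    labels = name.split(".")
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in SERVERS[server]:
            return SERVERS[server][suffix]
    raise LookupError(f"{server} has no record covering {name}")


class CachingResolver:
    """Toy local DNS server: follows referrals from the root and caches answers."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock    # injectable so the TTL logic can be exercised
        self.cache = {}       # name -> (address, expiry time)
        self.lookups = 0      # number of full lookups actually performed

    def resolve(self, name):
        entry = self.cache.get(name)
        if entry is not None and self.clock() < entry[1]:
            return entry[0]                   # cached and TTL not yet expired
        self.lookups += 1
        server = "root"                       # step 2: start at a root server
        while True:
            record = ask(server, name)
            if record[0] == "A":              # step 6: a real answer
                _, address, ttl = record
                self.cache[name] = (address, self.clock() + ttl)
                return address
            server = record[1]                # steps 3 to 5: follow the NS referral
```

Calling `resolve("www.google.com")` twice in a row on the same `CachingResolver` performs the full walk only once; the second call is served from the cache until the TTL set by the (hypothetical) authoritative server runs out.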
As long as the cached data has not passed its TTL it can be reused by the DNS server for subsequent requests. Once the TTL expires, any subsequent request will need to be looked up again [1].

Vulnerabilities of DNS

DNS servers are extremely important to the proper functioning of the internet. Without DNS servers the internet would be like a road system without street signs. Computers would be unable to find the resources that they are seeking to connect to. With such great responsibility it can be somewhat frightening how open DNS servers are. But after all, they have to be open by their very nature. If an attacker can subvert a DNS server then he can cause a lot of chaos and potentially a lot of damage.

Phishing is a serious threat to people’s personal information on the internet. One of the main mitigations against phishing is to avoid clicking links in emails. Instead, users are urged to type the name of the resource they want to reach directly into their browsers. We place a lot of faith in the fact that when we type www.mybank.com into our web browsers we will in fact arrive at www.mybank.com. This is precisely why DNS cache poisoning is so dangerous. If an attacker can successfully poison a DNS cache he can assign any arbitrary IP address to www.mybank.com. Users trying to reach www.mybank.com would then be unwittingly directed to a resource of the attacker’s choosing. The implications are staggering. Users can be sent to malicious websites or just randomly redirected away from their intended destination in an effort to cause chaos. This form of highly advanced phishing is cutely called pharming [4].

This may sound like a paranoid delusion. Unfortunately it is entirely possible, and in fact it has already been done.
In March of 2005, through the use of a cache poisoning attack, people who were trying to access popular websites such as Google, eBay [3], CNN, and MSN were redirected to malicious sites that installed spyware on the victims’ computers [4].

Anatomy of a cache poisoning

A cache poisoning attack is an attack in which the attacker modifies the IP addresses stored in a DNS server’s cache. This causes all future requests to that DNS server for the affected resource to be sent to wherever the attacker chooses.

To see how a cache poisoning attack works we need to revisit the DNS server request. When a DNS server doesn’t know the IP address of a particular resource it sends a request to other DNS servers asking them about the resource. The DNS server attaches to this request a query ID, sometimes called a transaction ID [5]. This ID serves as a signature for the request. When a response comes back to the requesting server, the server checks that its ID matches the ID that was sent with the request. This serves two purposes. First, it allows requests and responses to be associated with each other, since a DNS server might be handling multiple requests concurrently. Second, it helps prevent denial of service attacks. Since the DNS server ignores all unsolicited responses, it insulates itself from being flooded by pointless traffic [5].

Therefore, if the attacker knows the query ID he can forge a reply to the DNS server, and if he can deliver his forged reply before the real reply arrives then the DNS server will accept it as legitimate. That forged reply will then be forwarded back to the original requester and treated as a proper address. When the real reply finally arrives, the transaction has already been closed by the arrival of the forged reply, and the real reply is simply ignored as unsolicited [5]. This is very much like a race.
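The acceptance rule behind this race can be sketched in a few lines. The following is a simplified model, assuming that the query ID is the only thing a reply is checked against (which is how the text above describes it); the class name `PendingQuery`, the example ID, and the addresses are hypothetical and chosen only for illustration.

```python
class PendingQuery:
    """One outstanding request from a DNS server, identified by its query ID."""

    def __init__(self, name, query_id):
        self.name = name
        self.query_id = query_id
        self.answer = None    # filled in by the first reply that is accepted

    def receive(self, reply_id, address):
        """Accept a reply only if the transaction is open and the ID matches."""
        if self.answer is not None:
            return False      # transaction already closed: treated as unsolicited
        if reply_id != self.query_id:
            return False      # wrong query ID: ignored
        self.answer = address
        return True


# The race: a forged reply with the right ID arrives before the real one.
query = PendingQuery("www.mybank.com", query_id=0x1A2B)
query.receive(0x0001, "203.0.113.66")   # forged, wrong ID: ignored
query.receive(0x1A2B, "203.0.113.66")   # forged, right ID: accepted and stored
query.receive(0x1A2B, "192.0.2.10")     # real reply arrives too late: ignored
```

After these three calls `query.answer` holds the forged address, because the first reply carrying the matching ID closes the transaction regardless of who sent it.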
When a DNS server sends a request, the attacker tries to beat the real reply with his forged one. But how does the attacker know what the query ID is, or when a DNS server is making a request?

This requires a little bit of preparation before the actual attack. An attacker will set up his own domain and, using the victim DNS server, request a resource in his domain. Since the victim DNS server doesn’t know about the attacker’s domain, it needs to perform a lookup. That lookup request will eventually arrive at the DNS server for the attacker’s domain. The attacker’s DNS server will oblige and reply to the request as expected; however, at this point the attacker is also able to see the query ID associated with the request. Now the attacker knows the victim DNS server’s current query ID. Since many versions of DNS software use sequential query IDs or easily predicted random IDs, the attacker can predict what the query IDs will be in the near future.

Now comes the real assault. The attacker makes a request to the victim server for a resource, let’s say www.google.com. The victim server goes to perform the lookup and the attacker immediately begins flooding the victim DNS server with forged A record replies. The attacker doesn’t know the exact query ID that the victim DNS server is using for this request, but he has a good idea from the preliminary work that he already performed. So the attacker sends many forged A record responses, each carrying a different but highly likely query ID. The victim DNS server will of course ignore the replies whose IDs do not match, but if the attacker manages to guess the proper query ID before the real reply returns, then the victim DNS server will accept the forged reply. It will also cache the forged result for future use. Now the victim DNS server is poisoned.
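A rough calculation shows why predictable query IDs matter so much in this race. The sketch below relies on one fact not stated above, namely that the DNS query ID is a 16-bit field, so there are 65,536 possible values; the function name `hit_probability` and the example numbers are illustrative assumptions, and the model supposes the true ID is equally likely to be any of the candidates.

```python
ID_SPACE = 2 ** 16   # the DNS query ID is a 16-bit field


def hit_probability(forged_replies, candidate_ids=ID_SPACE):
    """Chance that one of `forged_replies` distinct guesses matches the real ID,
    assuming the real ID is equally likely to be any of `candidate_ids` values."""
    return min(forged_replies, candidate_ids) / candidate_ids


# Truly random IDs: 100 forged replies land before the real answer.
blind = hit_probability(100)

# Sequential IDs: the attacker has just observed the current ID, so the next
# one is assumed to fall within a small window of, say, 10 candidates.
informed = hit_probability(10, candidate_ids=10)
```

With unpredictable IDs, a burst of 100 forged replies succeeds only a small fraction of the time (100 out of 65,536), whereas an attacker who has narrowed the candidates down to a handful of values by observing sequential IDs needs only a handful of forged replies to be certain of a match.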
When any subsequent users try to access www.google.com using this DNS server they will be sent to whatever address the attacker has placed into the victim DNS server’s cache [5].

Why isn’t the security community panicked?

This is the way in which a DNS cache poisoning attack goes down. The results are profound, but can this attack be effectively, reliably, and consistently pulled off? The answer is no, and for as long as DNS cache poisoning has been around it has not been touted as a common threat. It’s a theoretical nightmare, but a practical improbability.

There are several problems with successfully executing this attack. The first problem is that the attacker may not guess the proper query ID in time. If the real response comes back before the attacker succeeds, the attack is thwarted. Another problem is that once a lookup is performed the IP address is cached, so future requests will be satisfied using cached data and not through DNS lookups. The attacker needs to know precisely when a cache entry’s TTL expires so that he can try to forge a reply when the victim server has to perform a DNS lookup. This all requires perfect timing and a great deal of luck.

The scope of this attack is also quite limited. Even if it succeeds as planned, the attack poisons only one IP address. The other IP addresses in that domain are still intact. So the net effect of the attack (if it even succeeds) is not terribly great [5]. This is why we are not terribly worried about cache poisoning.

Ok, now they’re panicked!

In 2008 Dan Kaminsky, a renowned internet security expert, was researching improved methods of streaming video to users by modifying the way the associated DNS entries are treated. To his shock, Kaminsky stumbled upon a new way to carry out the classic DNS cache poisoning attack much more effectively, much more reliably, much more readily, and with much greater reach.
Keeping his discovery close to his chest, Kaminsky worked closely with other security experts to quickly fashion a fix for his newly discovered threat, which was deployed during the summer of 2008 [10].

A more toxic poison

Kaminsky’s discovery meant that DNS cache poisoning attacks could now be executed more reliably and with much more devastating effects. The classic DNS cache poisoning attack was limited because it had to be timed perfectly with the expiration of the TTL and, if successful, it could affect only one resource at a time [5]. Kaminsky discovered that he didn’t have to wait for the TTL to expire on a cached entry. He was able to force a DNS server to perform a lookup at will. This meant that he could perform his attack at any time and that he could repeat it as often as he liked in case of failure. He also discovered that he could take over an entire domain in one shot with his new attack, whereas the classic version was only able to pick off one resource in a domain at a time [10].

The new attack is essentially the same as the classic DNS cache poisoning attack but with some modifications. First, the attacker needs to perform some preparation. The attacker needs to set up a fake authoritative DNS server for whatever domain he is trying to target; in this case our attacker is hoping to attack google.com. On its own, creating a fake google.com DNS server is useless, because all DNS servers will point to the legitimate google.com DNS server and not his. His fake DNS server won’t be seen by anyone (yet). We’ll come back to the purpose of this fake DNS server later.

The next issue is overcoming the TTL expiration. After all, if the cache entry has not expired then the server will not perform a lookup and will not offer the attacker a chance to poison it. This attack forces a lookup every time. The old attack targeted a specific resource such as www.google.com.
It’s very likely that www.google.com will be cached, which limits the window of opportunity for the attacker. However, RandomThingThatDoesNotExist0001.google.com will not be cached because, first, it does not exist, and second, no one is trying to access it. So if the attacker requests the IP address of that resource, the DNS server has no choice but to perform a DNS lookup.

Now, just as with the classic attack, the attacker floods the victim DNS server with forged replies. The difference is in what the response is. In the classic attack the response was a forged A record, which is the IP address of a particular resource. In this case, the forged reply is an NS record. An NS record is a referral to a different DNS name server for further resolution. In this case the forged NS record refers the victim DNS server to the attacker’s fake google.com DNS server. The victim server will now use the attacker’s fake google.com DNS server instead of the real one. When someone requests anything inside the google.com domain from this poisoned DNS server, they will actually be making requests to the attacker’s fake google.com DNS server. As part of the forged NS record the attacker also sets a very long TTL, so that the fake entry remains in the cache as long as possible before the DNS server performs another lookup and resets the entry to the legitimate google.com DNS server [10].

This is now a devastating attack. You can imagine how it could be taken a step higher to redirect all .com traffic or all .net traffic. In the hacker lexicon, the attacker has just “pwned” the whole internet.

Mitigation

Short of changing the DNS protocol itself, there is not much that can be done to completely protect against this threat.
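A back-of-the-envelope estimate illustrates why the ability to retry at will is so damaging, and why making replies harder to guess is the natural defense. The sketch below is a simplification under stated assumptions: each forced lookup is treated as an independent race, the attacker lands a fixed number of distinct forged replies per race, and the unpredictable part of a reply is modeled as a number of random bits (16 for the query ID alone, which is the width of that field). The function name `success_after` and the example figures are hypothetical.

```python
def success_after(races, replies_per_race, entropy_bits):
    """Probability that at least one of `races` independent races is won,
    when each race delivers `replies_per_race` distinct guesses against
    2**entropy_bits equally likely values."""
    space = 2 ** entropy_bits
    p_single = min(replies_per_race, space) / space
    return 1 - (1 - p_single) ** races


# Query ID only (16 unpredictable bits), 100 forged replies per race.
one_try = success_after(1, 100, entropy_bits=16)
many_tries = success_after(1000, 100, entropy_bits=16)

# Same attack if 16 further unpredictable bits had to be guessed as well
# (an illustrative figure, not a measurement of any deployed fix).
hardened = success_after(1000, 100, entropy_bits=32)
```

A single race is a long shot, but because a fresh nonexistent name forces a new lookup every time, the attacker can simply keep racing, and under these assumptions the cumulative chance of success climbs quickly. Every additional unpredictable bit halves the chance of winning any one race, which pushes the number of races needed up accordingly.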
The best solution currently available, and the one that was implemented by Kaminsky and colleagues this past summer, is to increase the entropy in the way query IDs are generated and to add randomness to the communication port that is used. Typically DNS servers would operate exclusively on port 53. To thwart attackers, the source port used for outgoing queries can be randomized, making it more difficult for an attacker to forge a reply that will be accepted [10].

Experiment

After learning about DNS cache poisoning and the recent developments in the way this attack can be carried out, I became curious as to how susceptible my internet service provider’s DNS server was to these threats. As a result I performed some informal tests using web-based tools to check my DNS server, and here are the results.

From Dan Kaminsky’s website, http://www.doxpara.com/, I tested my DNS server with the tool seen on the right. Based on the results of this test, my DNS server appears to be reasonably safe from DNS cache poisoning attacks.

In addition to the above test I wanted to compare the result against another tool. For this test I used the DNS Vulnerability Check tool found at http://member.dnsstuff.com/tools/vu800113_results.php, seen below. The result of using this tool was not as reassuring as the previous one. Curiously, the tool yielded different results when the test was performed several times. On certain occasions it reported more problems than on other attempts. Since this test was performed more out of curiosity than as an actual analysis, the differing results were not investigated further.

Conclusion

In conclusion, there are many security threats on the internet, and one of the most frightening is DNS cache poisoning. If a DNS server is compromised then effectively the entire internet has been compromised for the affected user.
Even though DNS cache poisoning is not new and has been with us for a long time, we saw recently that it is not an old threat with nothing new to offer. The threat is alive and well, and may very well come back in the future in new forms.

Works Cited

[1] "Domain Name System". Wikipedia. 06 Dec 2008 <http://en.wikipedia.org/wiki/Domain_name_system>.
[2] "Name Server". Wikipedia. 06 Dec 2008 <http://en.wikipedia.org/wiki/Authoritative_name_server#Authoritative_name_server>.
[3] Evers, J. (2005, August). DNS servers--an Internet Achilles' heel. CNET News. Retrieved December 6, 2008 from <http://news.cnet.com/2100-7349_35816061.html>.
[4] Evers, J. (2005, March). Phishers using DNS servers to lure victims? CNET News. Retrieved December 6, 2008 from <http://news.cnet.com/Phishers-using-DNSservers-to-lure-victims/2100-7349_3-5604555.html>.
[5] Friedl, S. (2008). An Illustrated Guide to the Kaminsky DNS Vulnerability. Retrieved 06 Dec 2008 from <http://unixwiz.net/techtips/iguide-kaminsky-dns-vuln.html>.
[6] "ARPANET". Wikipedia. 06 Dec 2008 <http://en.wikipedia.org/wiki/ARPAnet>.
[7] "SRI International". Wikipedia. 06 Dec 2008 <http://en.wikipedia.org/wiki/SRI_International>.
[8] root-servers.org. 06 Dec 2008 <http://www.root-servers.org/>.
[9] Du, W. (2007). DNS Protocol and Attacks. Retrieved 06 Dec 2008 from <http://www.cis.syr.edu/~wedu/Teaching/cis483/LectureNotes/DNS.pdf>.
[10] Naone, E. (2008, November). The Flaw at the Heart of the Internet. Technology Review, 111(6), 62-67. Retrieved December 7, 2008, from ABI/INFORM Global database.