RES240 / RES224 TD DNS: Architecture Performance

D.Rossi
Resources: http://www.enst.fr/~drossi
Note:
For historical reasons, this document is written in English; sorry for the inconvenience.
Preliminary questions
o What are the most important properties that define DNS ?
DNS is an application-layer protocol that 1) implements a distributed hierarchical database mapping Internet names to IP addresses and 2) defines a protocol to query that database
o What is the essence of DNS functionalities, and why are they so important ?
Applications query DNS services to resolve Internet names into IP addresses: only once the IP address is known can the application open a socket toward that host and start the data exchange (or VoIP call, or mail message, ...)
o In which case is DNS used by users ?
Anytime, but implicitly (when sending mail, when using web, when opening an SSH tunnel, when using an Instant Messaging application, when converting a CD into MP3 using a CDDB database, when you ping a given host, etc.)
o In which case is DNS directly used by users ?
When you use nslookup, dig, host or similar commands on a console (for example, when you suspect that your browser is stuck because of a DNS server problem.)
o What is the most common DNS server (on *nix machines) ?
BIND, the Berkeley Internet Name Domain
o What protocol and port pair does DNS listen to ?
UDP:53 (and TCP:53)
o What is the most common DNS resolver program (on *nix machines) ?
nslookup, dig, host
o What is the standard DNS resolver library call?
gethostbyname
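To make the "protocol to query the database" more concrete, here is a minimal Python sketch of what a resolver puts on the wire for a type A question. It follows the standard DNS message layout (12-byte header, then the question); the function name `build_query` and the query ID `0x1234` are assumptions chosen for illustration only.

```python
import struct


def build_query(hostname: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Encode a DNS question for `hostname` (qtype 1 = A record, class IN)."""
    # Header: ID, flags (only RD, "recursion desired", is set),
    # 1 question, 0 answer / authority / additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


packet = build_query("www.enst.fr")
print(len(packet), packet.hex())
```

Such a message would typically be sent in a single UDP datagram to port 53 of the local DNS server; the sketch deliberately stops before any network I/O.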
True or false ?
1. DNS is a hierarchical naming scheme [y]
2. DNS is defined in an IEEE informational standard [n]
3. DNS is a distributed database [y]
4. DNS database is implemented in a logical hierarchy of servers [y]
5. DNS is an application layer protocol to query the distributed database [y]
6. DNS can be considered a network layer protocol as it has to handle network layer address translation (to application layer names) [n]
7. DNS stands for Distributed Naming System [n]
8. DNS needs to be a distributed system for scalability purposes [y]
9. DNS needs to be a distributed system mainly for robustness against failure (i.e., avoid single point of failure) [n]
10. DNS uses only UDP at the transport layer [n]
11. DNS uses TCP at the transport layer only if the application requiring the address conversion uses TCP [n]
12. DNS can use TCP, although rarely, depending on the size of the query [y]
13. DNS queries for separate hosts need to be carried over separate packets [n]
14. DNS queries for separate hosts can be carried in the same packet only if the hosts have a common prefix [n]
15. Originally, there were seven DNS top-level domains (.com, .edu, .gov, .int, .mil, .org, .net) [y]
16. DNS root servers know all the top-level domain servers [y]
17. Nowadays, there are about a dozen root servers [y]
18. Nowadays, there are more than 100 DNS top-level domains [y]
19. Nowadays, there are more than 100 top-level servers [y]
20. DNS top-level domain names are managed by IANA [n]
21. DNS domain names are case sensitive [n]
22. The leaf domain must be a single host [n]
23. Leaf domain names cannot be longer than 16 characters [n]
24. Full domain names cannot be longer than 255 characters [y]
25. DNS limits the number of valid sub-domains [y]
26. DNS subdomains are 4, one per each byte of the IP address [n]
27. No organization can use more than two sub-domains (e.g., infres.enst, cs.yale) [n]
28. Only public organizations can use more than two sub-domains [n]
29. Domain names can be absolute or relative [y]
30. Relative domain names always end with a dot “.” [n]
31. Relative domain names always start with a dot “.” [n]
32. To avoid database inconsistency, an organization cannot register under two distinct top-level domains (e.g., sony.com, sony.biz) [n]
33. DNS defines formal rules that dictate how sub-domains of top-level domains should be allocated (so, only for the first level) [n]
34. DNS is used to map host names to IP addresses [y]
35. DNS is used to map email addresses to IP addresses of the mail server [y]
36. DNS is used to map MAC addresses to IP addresses [n]
37. DNS can be used to map host names to MAC addresses [n]
38. DNS resource records include a Time To Live field [y]
39. The time to live field is normally expressed in minutes [n]
40. The time to live information is usually tied to the DHCP lease information [n]
41. Each host has at most one type A resource record [n]
42. Some hosts (e.g., multi-homed) may have more than one type A record [y]
43. The MX record specifies the name of the server that accepts incoming mail for the domain [y]
44. A cached record cannot be out of date because of the Time to Live field [n]
45. An authoritative record cannot be out of date by design [y]
46. The PTR field contains an alias of the host name used to perform reverse lookup (i.e., query for an IP address and return the corresponding domain name) [y]
47. The HINFO field contains the hardware (i.e., MAC) address of the host [n]
48. DNS is also used to perform load distribution among replicated servers [y]
49. The CNAME field enables load distribution by providing an alias to the hostname [n]
50. The CNAME field of the DNS database contains the canonical name, that is, the name of the host without spaces and control characters [n]
51. The use of the CNAME field is discouraged [n]
52. DNS queries must be either recursive or iterative [n]
53. DNS prefers iterative queries because they minimize the delay [n]
54. DNS lookups cannot mix recursive and iterative queries [n]
55. DNS caching is used to reduce the network load at the expense of the lookup latency [n]
56. DNS caching rules forbid caching top-level domain server queries [n]
57. A DNS server that is non-authoritative for a host may contain a type A record of that host due to caching [y]
58. Applications (e.g., Web browsers) can cache DNS queries and responses too [y]
Strange Correlations
(this question is heavily inspired by J. Rexford question “A rose by any other name”)
 Before the 9/11 attacks, the top­level domain (TLD) server for South Africa was located in New York City. Explain why the physical destruction on 9/11 disrupted Internet communication within South Africa (e.g., for a Web user in South Africa accessing a Web site in South Africa)?
Web transfers within South Africa rely on DNS to resolve domain names for South African Web site names (e.g., www.gov.za). The Web user’s local DNS server needs to contact the TLD server for the .za domain as part of translating the Web server name into an IP address, making the Web download fail because the DNS information is not available.
Following up on the previous question, explain why the effects in South Africa took place gradually, disrupting progressively more communication within the country in the hours (and even days) after connectivity to NYC was lost.
The local DNS servers in South Africa presumably had many name-to-address mappings cached already, particularly for popular sites. However, these cached entries had a time-to-live field that eventually expired, causing the local DNS servers to evict the expired entries. Once the expired entries were evicted, future requests from end hosts would fail due to the inability to contact the TLD server.

Thoughts on Root Servers



How do the local DNS servers know the identity of the root servers?
The local DNS servers are statically configured with the identity of the root servers. The corresponding file is available through FTP (at which address?)
Do you think root servers have to be treated differently from the other servers ?
Yes, and indeed they are. For instance, the identity of the root servers is stored as hints rather than as cache entries: aside from the name, the important difference is that cached items simply vanish when their time to live (TTL) is exceeded. Thus, cached items are implemented as “soft state”: in other words, cached items are forgotten by DNS servers after TTL seconds. Hints, instead, are refreshed after the TTL expires: they are never forgotten and never expire; rather, the TTL expiration triggers the DNS server to download the hint file again with FTP.
Why do you think that there are exactly 13 root servers?
Because that's about how many servers you can put in a 512-byte UDP packet
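As a rough check of the 512-byte argument, the following Python sketch estimates the size of a response listing every root server as an NS record plus one IPv4 A record each. It assumes the usual DNS record layout, server names of the form x.root-servers.net, and name compression (a repeated suffix is replaced by a 2-byte pointer); IPv6 addresses and protocol extensions are deliberately ignored, and `response_size` is a hypothetical helper name.

```python
HEADER = 12               # fixed DNS header
QUESTION = 1 + 2 + 2      # root name ".", QTYPE, QCLASS
RR_FIXED = 2 + 2 + 4 + 2  # TYPE, CLASS, TTL, RDLENGTH

# First NS record: owner "." (1 byte) + fixed fields
# + "a.root-servers.net." written in full (20 bytes).
FIRST_NS = 1 + RR_FIXED + 20
# Other NS records: owner "." + fixed fields
# + one-letter label (2 bytes) + 2-byte compression pointer.
OTHER_NS = 1 + RR_FIXED + 2 + 2
# A records: 2-byte pointer as owner + fixed fields + 4-byte IPv4 address.
A_RECORD = 2 + RR_FIXED + 4


def response_size(n_servers: int) -> int:
    """Estimated size in bytes of a response listing n_servers root servers."""
    return (HEADER + QUESTION + FIRST_NS
            + (n_servers - 1) * OTHER_NS
            + n_servers * A_RECORD)


print(response_size(13))  # 436
```

Under these assumptions, 13 servers fit in 512 bytes with a few tens of bytes to spare, which is consistent with the "about how many" in the answer above.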
How do you think a resolver chooses between the different root servers ?
Recent versions of BIND implement a simple but effective technique based on an estimate of the round trip time (RTT) of the responses from each server, so that the closest server is chosen automatically, by inference from purely passive measurements. In more detail, the RTT is measured as the time that elapses between sending a request and receiving the corresponding reply: denoting the k-th root server by k, we have RTT_k = t_REPLY_k - t_QUESTION_k
At the beginning, each server is assigned a random bootstrap value RTT_0_k, which is typically lower than any real-world value:
RTT_0_k < RTT_j for any j,k
This could be done, e.g., by choosing RTT_0_k uniformly at random in [0,k]. Then, the resolver will always use the DNS root server with the lowest RTT value, so that the server i with
RTT_0_i = min k ( RTT_0_k )
will be chosen first and its actual RTT value RTT_i will be computed. Since the RTT_0_k of the other servers are by definition < RTT_i, each of the root servers will be evaluated in turn. At the end of the evaluation, the resolver will preferentially use the server whose RTT_k is minimum, and use the others (in order of increasing RTT) only when the preferred servers fail.
This is actually a very simple but effective means of automatically tuning a distributed protocol based on actual network performance.
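The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not BIND's actual code: the class name `ServerSelector`, the bootstrap range, and the simulated RTT values are all assumptions, and failure handling is left out.

```python
import random


class ServerSelector:
    """Pick the server with the lowest known RTT estimate."""

    def __init__(self, servers, bootstrap_max=0.001, seed=None):
        rng = random.Random(seed)
        # Bootstrap estimates are random and assumed to be below any real
        # RTT, so every server gets tried once before real values dominate.
        self.rtt = {s: rng.uniform(0.0, bootstrap_max) for s in servers}

    def choose(self):
        # Always use the server whose current estimate is the smallest.
        return min(self.rtt, key=self.rtt.get)

    def record(self, server, t_question, t_reply):
        # RTT_k = t_REPLY_k - t_QUESTION_k, measured on real queries.
        self.rtt[server] = t_reply - t_question


# Hypothetical true RTTs (seconds), used here only to simulate replies.
true_rtt = {"a": 0.120, "b": 0.030, "c": 0.075}
selector = ServerSelector(true_rtt, seed=1)

tried = []
for _ in range(len(true_rtt)):
    server = selector.choose()
    tried.append(server)
    selector.record(server, t_question=0.0, t_reply=true_rtt[server])

print(sorted(tried))      # every server has been probed once
print(selector.choose())  # the closest server is now preferred
```

Because each measured RTT replaces a bootstrap value that was smaller than any real one, the first queries rotate through all servers; afterwards the minimum is the genuinely closest server.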
Imagine sending a DNS query to your local resolver that will surely end up at a DNS root server: what could such a query possibly look like (i.e., what are you querying ?)
Bogus queries will definitely end up being routed up the DNS server hierarchy to a root server. Indeed, assume you look for the nonexistent domain name le.diner.des.cons, which could happen if, for instance, you directly typed the URL http://le.diner.des.cons in your favorite browser. Since the top-level domain “.cons” does not exist (yet :), the query will be routed until some DNS server is able to answer that such a domain does not exist. The root server's answer will be authoritative in this case.
What should happen if you repeated the same query above ?
Negative caching should prevent the query from being routed further than the local DNS resolver (i.e., the stub resolver). Actually, negative caching should prevent any further query for the “.cons” top-level domain, since the root server already replied to this query. In practice, most such queries will still end up at a root server, since in the real world bogus servers, non-conformant implementations or simply buggy software exist. See e.g.:
Wessels et al. “Wow, that's a lot of packets” in Passive and Active Measurement 2003, www.caida.org/outreach/papers/2003/dnspackets/wessels-pam2003.pdf
How many DNS questions do you think a root­server receives on a typical day ?
Again, see
Wessels et al. “Wow, that's a lot of packets” in Passive and Active Measurement 2003, www.caida.org/outreach/papers/2003/dnspackets/wessels-pam2003.pdf

The above is a research paper. If you find it interesting, you may find that research could be an interesting challenge for your future career, e.g., by pursuing a PhD after your undergraduate Engineering studies. Feel free to contact me anytime if you wish to have a chat about that.
Caching (again, courtesy of J. Rexford)

Who determines the value of the time-to-live field that determines how long DNS servers cache a name-to-address mapping? What are the pros and cons of using a small value?
The operator of the responding DNS server (e.g., the authoritative DNS server) assigns the TTL value. The advantages of a small TTL are: (i) rapid failover if the IP address associated with a name changes and (ii) enabling content distribution networks to exert fine-grained control (e.g., for load balancing) over which Web server replica handles the Web client request. The disadvantages of a small TTL are: (i) extra DNS requests (which place extra load on the network and the DNS servers) and (ii) extra latency for the user, who has to wait for these DNS queries to complete successfully.

A local DNS server typically discards cached name-to-address mappings when the time-to-live expires. Alternatively, the local DNS server could optimistically issue a new query for the cached domain name. Give one advantage and one disadvantage of that approach.
By “prefetching” the name-to-address mapping, the local DNS server can hide the DNS look-up delay, improving the performance experienced by the end user. However, prefetching introduces extra DNS queries and network load to look up name-to-address mappings that may never be needed.
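The TTL-driven eviction discussed in this section can be illustrated with a toy cache in Python. This is a sketch under simplifying assumptions: the class `TtlCache` is hypothetical, time is passed in explicitly as a number of seconds to keep the example deterministic, and the address is a documentation-only placeholder.

```python
class TtlCache:
    """Toy DNS cache: entries are soft state and vanish once their TTL is exceeded."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            # Expired: evict, so the caller has to issue a new query.
            del self._entries[name]
            return None
        return address


cache = TtlCache()
cache.put("www.gov.za", "192.0.2.10", ttl=3600, now=0)
print(cache.get("www.gov.za", now=1800))  # still cached
print(cache.get("www.gov.za", now=7200))  # expired, a new lookup is needed
```

A prefetching variant would, instead of simply evicting, trigger a fresh query shortly before `expires_at`, trading extra queries for lower latency.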
Get your hands dirty
Comment on the output of these lookup commands, and infer which DNS lookup command was sent (i.e., what type of question, A/NS/..., and for which host). You can play with DNS using the nslookup and dig commands in a Linux/Unix shell (sorry for those of you who still use Windows: no help is provided in that case).
ssh.enst.fr is an alias for ares.enst.fr
 Type HINFO query
 The response is in reality a CNAME response telling you that ssh is an alias for ares
 [root@kadath]# host -t HINFO ssh.enst.fr
ares.enst.fr has no HINFO record
 Type HINFO query
 This time, the response is that there is no HINFO record associated with ares.enst.fr
 [root@kadath]# host -t HINFO ares.enst.fr
;; ANSWER SECTION:
1.86.192.130.in-addr.arpa. 86400 IN PTR serverlipar.polito.it.
 Reverse query
 dig -x 130.192.86.1
 ;; QUESTION SECTION:
 ;1.86.192.130.in-addr.arpa. IN PTR
;; ANSWER SECTION:
enst.fr. 170229 IN MX 10 smtp2.enst.fr.
;; AUTHORITY SECTION:
enst.fr. 141710 IN NS minos.enst.fr.
enst.fr. 141710 IN NS enst.enst.fr.
enst.fr. 141710 IN NS phoenix.uneec.eurocontrol.fr.
enst.fr. 141710 IN NS ns3.enst.fr.
enst.fr. 141710 IN NS infres.enst.fr.
;; ADDITIONAL SECTION:
smtp2.enst.fr. 955 IN A 137.194.2.14
infres.enst.fr. 12885 IN A 137.194.160.3
infres.enst.fr. 12885 IN A 137.194.192.1
ns3.enst.fr. 12885 IN A 137.194.32.84
ns3.enst.fr. 12885 IN AAAA 2001:660:330f:20::54
enst.enst.fr. 12885 IN A 137.194.2.16



Type MX query
[root@servermisure svrload]# dig -t MX enst.fr
Notice that complete NS information is also provided in the authority/additional sections
;; ANSWER SECTION:
www.l.google.com. 250 IN A 209.85.135.103
www.l.google.com. 250 IN A 209.85.135.104
www.l.google.com. 250 IN A 209.85.135.147
www.l.google.com. 250 IN A 209.85.135.99
;; AUTHORITY SECTION:
l.google.com. 77445 IN NS g.l.google.com.
l.google.com. 77445 IN NS f.l.google.com.
l.google.com. 77445 IN NS e.l.google.com.
l.google.com. 77445 IN NS c.l.google.com.
l.google.com. 77445 IN NS d.l.google.com.
l.google.com. 77445 IN NS a.l.google.com.
l.google.com. 77445 IN NS b.l.google.com.
;; ADDITIONAL SECTION:
b.l.google.com. 79531 IN A 64.233.179.9
c.l.google.com. 79531 IN A 64.233.161.9
d.l.google.com. 79531 IN A 66.249.93.9
e.l.google.com. 79531 IN A 209.85.137.9
f.l.google.com. 79531 IN A 72.14.235.9
g.l.google.com. 79531 IN A 64.233.167.9
a.l.google.com. 79531 IN A 209.85.139.9
;; Query time: 2 msec
;; SERVER: 130.192.3.24#53(130.192.3.24)
;; WHEN: Mon Oct 1 18:15:16 2007
;; MSG SIZE rcvd: 322



[root@kadath]# dig www.l.google.com
Note the load balancing. Note also the query time: 2 ms is typical of a LAN environment; indeed, google is a very popular page, so it would be impossible for it not to have been (freshly) cached by the local DNS server.
Geeky stuff with DNS
DNS can also be fun... at least, if you are a geek. Below, some examples of « creative » ways to use DNS functionalities that system administrators around the world have found.
Helpful hints (1)
> quit
Server: rns1.earthlink.net
Address: 207.217.126.81
Name: type-exit-you-idiot.it.earthlink.net
Address: 206.149.249.11
Aliases: quit.it.earthlink.net
> exit
Helpful hints (2)

10:52pm ~/ARIN/CRC>nslookup quit
Server: nic1.concentric.net
Address: 205.158.16.5
Name: use-exit-to-quit.or-is-your-brain.missing.to
Address: 207.88.46.254
Aliases: quit.internex.net
Helpful hints (3)

lion$ nslookup 192.168.1.1
Server: nic1.concentric.net
Address: 205.158.16.5
Name: read-rfc1918-for-details.iana.net
Address: 192.168.1.1
You can find a few more at http://www.netgeek.net/
If you like this kind of humor, you may consider reading the adventures of the BoFH at http://members.iinet.net.au/~bofh/ (this won't be in the exam, however)
Case study: birth of a new domain
1. Assume you and your mates, of the Computer Science department of the recently founded Miskatonic University, want to start a Lovecraft fan-club webpage lovecraft.cs.miskatonic.edu . What is the logical chain, and which are the entities involved, in the process of domain name assignment (supposing that the CS department maintains its own server) ? Describe the Resource Records (Name, Value, Type) at each server in the chain at the end of the process.
The IT department at the University asks the ICANN organization whether the miskatonic domain is available. Hopefully, it is still available (in France, unfortunately, it is not), so the university buys the right to use it, and advertises this to the top-level .edu domain. Then, the DNS server at the University sets up the proper entries to point to the Computer Science department DNS server. As the last step, a PhD student of the CS department writes the correct entry in the CS DNS server, so your lovecraft server is accessible from the outside.
Addresses are not given, so they are denoted symbolically (e.g., IP-DNS-MISKATONIC = 137.x.x.x, IP-DNS-CS-MISKATONIC).
TLD .edu: (miskatonic.edu, dns.miskatonic.edu, NS)
          (dns.miskatonic.edu, IP-DNS-MISKATONIC, A)
dns.miskatonic.edu: (cs.miskatonic.edu, dns.cs.miskatonic.edu, NS)
          (dns.cs.miskatonic.edu, IP-DNS-CS-MISKATONIC, A)
dns.cs.miskatonic.edu: (lovecraft.cs.miskatonic.edu, IP-LOVECRAFT, A)
2. Now, the university decides to change its name to Kadath College. What should you and your mates do to reflect the change to lovecraft.cs.kadath.edu ?
Nothing: no change to the lovecraft machine is needed. Hopefully, the change will be reflected in the University and CS databases, so your page will still be accessible from the outside.
3. Assuming all caches are empty, describe a DNS query from lovecraft.cs.kadath.edu for hp.cs.kadath.edu. Do the same for a DNS query from your home machine (assuming you live outside the Kadath campus). As the host is new, and so is the university domain, it is unlikely that servers around the world have that entry already cached.
When you are inside the campus, the authoritative server is dns.cs.kadath.edu, so a simple query/reply exchange will take place. When you are at home, you send a recursive query to the DNS server of your ISP. The local ISP server will contact the .edu TLD server in an iterative fashion (unless it has never issued a .edu query, or did so too long ago so that its cache is out of date, in which case it will start from a root server). The TLD server will return to the ISP server a reply containing an A record for the DNS server responsible for the kadath.edu domain. The local ISP server will then query dns.kadath.edu for the cs.kadath.edu domain, and receive a reply containing an A record for dns.cs.kadath.edu. Then, the local ISP server asks dns.cs.kadath.edu for lovecraft.cs.kadath.edu. Once the reply is received, it completes the recursive query initiated from your home machine by returning the IP address of lovecraft.
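The chain of referrals followed by the ISP server can be mimicked with a toy Python model. Everything here is an assumption made for illustration: the `ZONES` table, the server label `tld.edu`, the function `resolve_iteratively`, and the symbolic IP-... strings, which are placeholders in the spirit of the case study rather than real addresses.

```python
# Toy zone data: each server holds NS delegations and/or A records.
ZONES = {
    "tld.edu": {
        ("kadath.edu", "NS"): "dns.kadath.edu",
        ("dns.kadath.edu", "A"): "IP-DNS-KADATH",
    },
    "dns.kadath.edu": {
        ("cs.kadath.edu", "NS"): "dns.cs.kadath.edu",
        ("dns.cs.kadath.edu", "A"): "IP-DNS-CS-KADATH",
    },
    "dns.cs.kadath.edu": {
        ("lovecraft.cs.kadath.edu", "A"): "IP-LOVECRAFT",
    },
}


def resolve_iteratively(name, start="tld.edu"):
    """Follow NS referrals from `start` until a server holds an A record for `name`."""
    server, path = start, []
    while True:
        path.append(server)
        records = ZONES[server]
        if (name, "A") in records:
            return records[(name, "A")], path
        # Find the delegation whose domain is a suffix of the queried name.
        referral = next(
            (target for (domain, rtype), target in records.items()
             if rtype == "NS" and name.endswith("." + domain)),
            None,
        )
        if referral is None:
            return None, path  # no such name known to this server
        server = referral


address, path = resolve_iteratively("lovecraft.cs.kadath.edu")
print(address, path)
```

The returned `path` lists the servers contacted in order, so its length is the number of query/reply exchanges the ISP server performs on behalf of the home machine.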
4. Assume the webpage at lovecraft.cs.kadath.edu contains the words “Work in progress”. Assume again that you are at a geek-party and want to show your nerd-mate this wonderful webpage you are clearly proud of. Assume further that your mate has never visited the page before, so his web browser starts a recursive DNS query. Assume that, during the query, n distinct DNS servers must be contacted in order to obtain the IP of lovecraft.cs.kadath.edu . Denoting by RTTi the round trip delay between server i and server i+1, write an expression of the time that it takes before the webpage starts to be loaded. Assuming that the query is non-recursive and that RTTi denotes this time the round trip delay between the party machine and server i, how does the expression need to be changed ?
Sum all RTTs from 1 to n in both cases, so no change is needed (it is the semantics of RTT that differs from one case to the other). Also, if your mate has the same ISP as you, the DNS address of kadath.edu could already be cached in the latter case, so RTT1 is the right answer in this case
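Written out, and counting only the DNS lookup delay as in the answer above, the expression is the same in both cases; only the meaning of each term changes (server i to server i+1 in the recursive case, party machine to server i in the non-recursive case). The symbol T_DNS is introduced here for illustration only.

```latex
T_{\mathrm{DNS}} = \sum_{i=1}^{n} RTT_i
```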
5. What happens when you misspell lovecraft.cs.kadath.edu as lovecrqft.cs.kqdqth.edu (which is what happens when you're using a QWERTY keyboard as an AZERTY one, and vice-versa) ? And what if the misspelling is lovecrqft.cs.kadath.edu ?
Same as in question 4), but the kqdqth domain may or may not exist. The length of the query depends on the position of the typos: i.e., if only the first a in lovecraft is wrong, then the queries will proceed until dns.cs.kadath.edu; otherwise they will possibly end up at the top-level domain. Also, if your mate has the same ISP as you, the DNS address of kadath.edu could already be cached in the latter case
6. Suddenly, the server where your webpage was hosted stops working. What should you do in case the Ethernet card of the previous server was not damaged and can be reused ? What should you do if the Ethernet card has been damaged and cannot be reused ?
In neither case is the DNS affected, only the DHCP database
7. The IT staff decides to physically move all the servers from the CS to the EE department. What should you do to keep your lovecraft.cs.kadath.edu working ? Is it better to change the domain name into lovecraft.kadath.edu to avoid such problems ? In the latter case, what would you need to do ?
Nothing needs to be done in this case either. If you want to change the name of the machine, you will have to push the administration of your university: in case they accept, the authoritative server for lovecraft will be dns.miskatonic.edu, which will contain the additional entry (lovecraft.miskatonic.edu, IP-LOVECRAFT, A)