How DNS Works In Practice

DNS
Domain Name Systems
Source
http://isr.uncc.edu/ITIS3100/3100PowerPoint/DNS.ppt
Domain Name System
DNS Overview
DNS Zones
» Forward
» Reverse
» Forwarding
DNS Delegation/Parenting
Mail Exchangers
DNS Overview
http://en.wikipedia.org/wiki/Dns
Overview
On the Internet, the Domain Name System (DNS)
associates various sorts of information with
domain names
» Serves as the "phone book" for the Internet
» Translates human-readable computer hostnames into IP
addresses
» Required by networking equipment to deliver
information
» Also stores other information
» Such as the list of mail exchange servers that accept email
for a given domain.
By providing a worldwide keyword-based
redirection service, the Domain Name System is
an essential component of the modern Internet
Uses
Uses
The most basic use of DNS is to translate hostnames
to IP addresses.
» Very much like a phone book
» For example, what is the internet address of en.wikipedia.org?
» The Domain Name System can be used to tell you it is
66.230.200.100
Uses
DNS also has other important uses
» DNS makes it possible
» Assign Internet destinations to the human organization or
concern they represent
» Independent of the physical routing hierarchy represented
by the numerical IP address.
» Hyperlinks and Internet contact information can remain
the same
» Whatever the current IP routing arrangements may be
» Can take a human-readable form (such as "wikipedia.org")
Easier to remember than an IP address (such as
66.230.200.100).
» People take advantage of this when they recite
meaningful URLs and e-mail addresses
» Do not need to care how the machine will actually locate
them
Uses
The Domain Name System distributes the responsibility for
assigning domain names and mapping them to IP
networks
» allows an authoritative server for each domain to keep track of its
own changes
» avoids the need for a central registrar to be continually consulted
and updated
History
History
Using a name as a more human-legible abstraction of a machine's
numerical address on the network predates even TCP/IP
» All the way to the ARPAnet era
Back then however, a different system was used, as DNS was only
invented in 1983, shortly after TCP/IP was deployed.
» With the older system, each computer on the network retrieved a file
called HOSTS.TXT from a computer at SRI (now SRI International).
» The HOSTS.TXT file mapped numerical addresses to names.
» A hosts file still exists on most modern operating systems, either by
default or through configuration
» Allows users to specify an IP address (e.g. 192.0.34.166) to use for a
hostname (e.g. www.example.net) without checking DNS.
» Nowadays, the hosts file serves primarily for troubleshooting DNS errors or
for mapping local addresses to more organic names
» Systems based on a hosts file have inherent limitations
» Most obviously, every time a given computer's address
changed, every computer that sought to communicate with it needed an
update to its hosts file
On Windows the hosts file is found in C:\WINDOWS\system32\drivers\etc
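The lookup that a hosts file provides can be sketched in a few lines of Python. This is only an illustration of the idea, not how any operating system implements it; the function name `parse_hosts` and the sample text are assumptions made for the example.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # First entry wins; hostnames compare case-insensitively.
            table.setdefault(name.lower(), address)
    return table


# Hypothetical file contents, reusing the address and name from the slide.
SAMPLE = """
# address       hostname(s)
127.0.0.1       localhost
192.0.34.166    www.example.net
"""

hosts = parse_hosts(SAMPLE)
print(hosts["www.example.net"])
```

Every machine holding such a table needs its own copy updated by hand, which is the scaling limitation described above.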
History
The growth of networking called for a more scalable
system
» Records a change in a host's address in one place only
» Other hosts would learn about the change dynamically through a
notification system
» Completes a globally accessible network of all hosts' names and
their associated IP Addresses
History
At the request of Jon Postel, Paul
Mockapetris invented the Domain Name
System in 1983 and wrote the first
implementation.
» The original specifications appear in RFC 882 and
883
» In 1987, the publication of RFC 1034 and RFC 1035
updated the DNS specification
» Made RFC 882 and RFC 883 obsolete.
» Several more-recent RFCs have proposed various
extensions to the core DNS protocols.
History
In 1984, four Berkeley students [1] wrote the
first UNIX implementation
» In 1985 Kevin Dunlap of DEC significantly re-wrote
the DNS implementation
» Renamed it BIND (Berkeley Internet Name Domain)
» BIND was ported to the Windows NT platform in
the early 1990s.
Due to BIND's long history of security issues
and exploits, several alternative name
server/resolver programs have been written
and distributed in recent years.
[1] Douglas Terry, Mark Painter, David Riggle and Songnian Zhou
How DNS works
Theory
How DNS Works - Theory
Domain names
» Arranged in a tree
» Cut into zones
» Each served by a name server
[Diagram: domain name tree showing the root ".", "com." and "example.com."]
How DNS Works - Theory
The domain name space consists of a tree
of domain names.
» Each node or leaf in the tree has one or more
resource records, which hold information
associated with the domain name.
» The tree sub-divides into zones.
» A zone consists of a collection of connected
nodes authoritatively served by an
authoritative DNS name server.
» Note that a single name server can host several
zones
How DNS Works - Theory
When a system administrator wants to let
another administrator control a part of the
domain name space within his or her zone of
authority
» Can delegate control to the other administrator.
» Splits a part of the old zone off into a new zone
» Comes under the authority of the second
administrator's name servers
» The old zone ceases to be authoritative for
what goes under the authority of the new zone.
How DNS Works - Theory
A resolver looks up the information
associated with nodes.
» A resolver knows how to communicate with name
servers by sending DNS requests, and heeding
DNS responses.
» Resolving usually entails iterating through
several name servers to find the needed
information.
Some resolvers function simplistically and
can only communicate with a single name
server.
» These simple resolvers rely on a recursing name
server to perform the work of finding information
for them.
How DNS Works - Theory
Parts of a domain name
A domain name usually consists of two or
more parts (labels), separated by dots
» Example: wikipedia.org.
» The rightmost label conveys the top-level domain
The address en.wikipedia.org has the top-level domain
org
» Each label to the left specifies a subdivision or
subdomain of the domain above it.
» Note that "subdomain" expresses relative
dependence, not absolute dependence:
wikipedia.org comprises a subdomain of the org
domain
en.wikipedia.org comprises a subdomain of the domain
wikipedia.org
How DNS Works - Theory
Parts of a domain name
A domain name usually consists of two or
more parts (labels), separated by dots
» In theory
» this subdivision can go down to 127 levels deep
» each label can contain up to 63 characters
» the whole domain name may not exceed a total
length of 255 characters
» In practice
» some domain registries have shorter limits than that.
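The limits listed above can be checked mechanically. The following is a minimal sketch that counts characters in the dotted text form, as the slide does; the on-the-wire encoding counts octets, so the exact limit on the textual form differs slightly. The function name `check_domain_name` is hypothetical.

```python
MAX_LEVELS = 127   # maximum depth of the subdivision
MAX_LABEL = 63     # characters per label
MAX_NAME = 255     # characters in the whole name


def check_domain_name(name):
    """Return a list of limit violations (empty if the name fits)."""
    problems = []
    if len(name) > MAX_NAME:
        problems.append("name longer than 255 characters")
    # Ignore the trailing dot that denotes the root.
    text = name[:-1] if name.endswith(".") else name
    labels = text.split(".")
    if len(labels) > MAX_LEVELS:
        problems.append("more than 127 levels")
    for label in labels:
        if not label:
            problems.append("empty label")
        elif len(label) > MAX_LABEL:
            problems.append("label longer than 63 characters")
    return problems


print(check_domain_name("en.wikipedia.org."))
```

A registry with tighter rules would simply lower these constants.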
How DNS works in theory
Parts of a domain name
A hostname refers to a domain name that has one
or more associated IP addresses
» For example, the en.wikipedia.org and wikipedia.org
domains are both hostnames, but the org domain is not
The Domain Name System consists of a
hierarchical set of DNS servers
» Each domain or subdomain has one or more
authoritative DNS servers that publish information about
that domain and the name servers of any domains
"beneath" it.
» The hierarchy of authoritative DNS servers matches the
hierarchy of domains
» At the top of the hierarchy stand the root nameservers:
the servers to query when looking up (resolving) a top-level domain name (TLD)
How DNS works in theory
Parts of a domain name
Iterative and recursive queries:
» Iterative query: the DNS server may provide a
partial answer to the query (or give an error)
» DNS servers must support non-recursive queries.
» Recursive query: the DNS server will fully
answer the query (or give an error)
» DNS servers are not required to support recursive
queries; the resolver (or another DNS server acting
recursively on behalf of a resolver) and the server
negotiate use of recursive service using bits in the
query headers
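The header bits mentioned above can be made concrete. The sketch below builds a query by hand following the RFC 1035 message layout and sets or clears the "recursion desired" (RD) bit; the function names and the fixed query ID are assumptions chosen for illustration, and nothing is sent over the network.

```python
import struct

RD = 0x0100  # "recursion desired" flag, set by the client
RA = 0x0080  # "recursion available" flag, set by the server in its reply


def build_query(name, recursive, query_id=0x1234):
    """Build a DNS query for the A record of `name`."""
    flags = RD if recursive else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A (1), QCLASS=IN (1)
    return header + question


def flag_is_set(message, flag):
    """Report whether a header flag (RD or RA) is set in a DNS message."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & flag)


query = build_query("www.wikipedia.org", recursive=True)
print(flag_is_set(query, RD))
```

A client that wants an iterative answer leaves RD clear; a server that offers recursion signals it by setting RA in its response.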
How DNS works in theory
Address resolution mechanism
A full host name may have several name segments
» e.g. ahost.ofasubnet.ofabiggernet.inadomain.example
In practice full host names typically consist of three segments
» ahost.inadomain.example
» www.inadomain.example
Software interprets the name segment by segment, right to left
» Using an iterative search procedure
» At each step along the way, the program queries a corresponding
DNS server to provide a pointer to the next server which it should
consult.
Example:
» A DNS recursor consults three nameservers to resolve the address
www.wikipedia.org.
(This description deliberately uses the fictional .example TLD in accordance with the DNS guidelines themselves.)
How DNS works in theory
Address resolution mechanism
As originally envisaged, the process was as simple
as:
» The local system is pre-configured with the known
addresses of the root servers in a file of root hints
» Needed to be updated periodically by the local administrator
from a reliable source to be kept up to date with the changes
which occur over time
» Query one of the root servers to find the server authoritative
for the next level down
» Query this second server for the address of a DNS server
with detailed knowledge of the second-level domain
» Repeat the previous step to progress down the name, until
the final step which would return the final address sought
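The steps above can be simulated without touching the network. In this sketch the three "servers" are plain dictionaries and every name, server label and address is hypothetical (the address comes from a range reserved for documentation); it only illustrates how referrals are followed from the root downwards.

```python
# Hypothetical delegation data: each "server" maps a name suffix either to a
# referral (the next server to ask) or to a final address.
SERVERS = {
    "root": {"example.": ("referral", "tld-server")},
    "tld-server": {"inadomain.example.": ("referral", "domain-server")},
    "domain-server": {"www.inadomain.example.": ("answer", "192.0.2.10")},
}


def resolve_iteratively(name, start="root"):
    """Follow referrals from the root until a server returns an address."""
    if not name.endswith("."):
        name += "."
    server = start
    consulted = []
    while True:
        consulted.append(server)
        zone_data = SERVERS[server]
        # Find the entry whose suffix matches the name being resolved.
        match = next(
            (key for key in zone_data if name == key or name.endswith("." + key)),
            None,
        )
        if match is None:
            raise LookupError(f"{server} knows nothing about {name}")
        kind, value = zone_data[match]
        if kind == "answer":
            return value, consulted
        server = value  # referral: ask the next server down the tree


address, path = resolve_iteratively("www.inadomain.example")
print(address, path)
```

Note that every call starts at "root", which is exactly the load problem that caching is meant to relieve.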
How DNS works in theory
Address resolution mechanism
The diagram illustrates this process for the
real host www.wikipedia.org.
How DNS works in theory
Address resolution mechanism
A search done in this simple form has a
major problem:
» a huge operating burden on the root servers
» each and every search for an address would be
started by querying one of them
Root nameservers are critical to the overall
function of the system
» Such a heavy use would create an
insurmountable bottleneck for trillions of
queries placed every day
In practice preemptive measures are taken
How DNS works in theory
Circular dependencies and glue records
Name servers in delegations appear listed by
name, rather than by IP address.
» This means that a resolving name server must
issue another DNS request to find out the IP
address of the server to which it has been referred
» This can introduce a circular dependency if
the nameserver referred to is under the domain
for which it is authoritative; in that case the
nameserver providing the delegation must also
provide the IP address of the next nameserver
» This record is called a glue record
How DNS works in theory
Circular dependencies and glue records
For example, assume that the sub-domain
en.wikipedia.org contains further sub-domains
(such as something.en.wikipedia.org) and that the
authoritative nameserver for these is at
ns1.en.wikipedia.org
» A computer trying to resolve something.en.wikipedia.org
will thus first have to resolve ns1.en.wikipedia.org
» Since ns1 is also under the en.wikipedia.org subdomain,
resolving ns1.en.wikipedia.org would itself require first
resolving ns1.en.wikipedia.org, which is exactly the
circular dependency mentioned above
» The dependency is broken by the glue record in the
nameserver of wikipedia.org that provides the IP address
of ns1.en.wikipedia.org directly to the requestor,
enabling it to bootstrap the process by figuring out
where ns1.en.wikipedia.org is located
How DNS Works
In Practice
How DNS Works In Practice
When an application tries to find the IP address of
a domain name
» It doesn't necessarily follow all of the steps outlined in the
Theory section
» Uses caching
How DNS works In practice
Caching and time to live
» Huge volume of requests generated by a system
like DNS
» Need to provide a mechanism to reduce the load
on individual DNS servers
» DNS resolution process allows for caching for a given
period of time after a successful answer
Caching: the local recording and subsequent
consultation of the results of a DNS query
» How long a resolver caches a DNS response is
determined by a value called the time to live (TTL)
» The TTL is set by the administrator of the DNS server
handing out the response
The period of validity may vary from just seconds to
days or even weeks or years
How DNS Works In Practice
- Caching time
As a consequence of the distributed and caching
architecture, changes to DNS do not always take effect
immediately and globally
» This is best explained with an example:
» An administrator has set a TTL of 6 hours for the host
www.wikipedia.org
Then changes the IP address to which www.wikipedia.org resolves at 12:01pm
Administrator must consider that a person who cached a response with the old IP
address at 12:00pm will not consult the DNS server again until 6:00pm.
» The period between 12:01pm and 6:00pm in this example is called
caching time
The period of time that begins when you make a change to a DNS
record and ends after the maximum amount of time specified by the
TTL expires
» This essentially leads to an important logistical consideration
when making changes to DNS: not everyone is necessarily seeing
the same thing you're seeing.
RFC 1537 helps to convey basic rules for how to set the TTL.
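The arithmetic in the example above can be written out as follows. The calendar date is an arbitrary assumption (the slide only gives times of day), and `cache_expiry` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def cache_expiry(cached_at, ttl_seconds):
    """Return the moment a cached response stops being valid."""
    return cached_at + timedelta(seconds=ttl_seconds)


ttl = 6 * 60 * 60                           # 6 hours, as in the example
cached_at = datetime(2024, 1, 1, 12, 0)     # response cached at 12:00pm
changed_at = datetime(2024, 1, 1, 12, 1)    # record changed at 12:01pm

expires = cache_expiry(cached_at, ttl)      # when the old answer is dropped
stale_for = expires - changed_at            # window in which old data is served
print(expires.strftime("%I:%M%p"), stale_for)
```

Lowering the TTL before a planned change shrinks this window, because the window can never be longer than the TTL that was in force when the answer was cached.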
How DNS Works In Practice
- Caching time
Note that the term "propagation", although
very widely used in this context, does not
describe the effects of caching well
» Specifically, it implies that
» [1] when you make a DNS change, it somehow spreads
to all other DNS servers (instead, other DNS servers
check in with yours as needed)
» [2] that you do not have control over the amount of time
the record is cached
(in fact, you control the TTL values for all DNS records in your
domain,
except your NS records and any authoritative DNS servers
that use your domain name)
How DNS Works In Practice
- Caching time
Some resolvers may override TTL values
» Protocol supports caching
» up to 68 years
» no caching at all
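The 68-year upper bound presumably follows from the TTL being a signed 32-bit count of seconds. A quick check, assuming an average year of 365.25 days:

```python
MAX_TTL_SECONDS = 2**31 - 1             # largest value of a signed 32-bit TTL
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed average year length

max_ttl_years = MAX_TTL_SECONDS / SECONDS_PER_YEAR
print(round(max_ttl_years, 1))
```

At the other extreme, a TTL of 0 asks resolvers not to cache the answer at all.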
Negative caching (the non-existence of records) is
determined by name servers authoritative for a zone
which MUST include the SOA record (Start Of
Authority) when reporting no data of the requested
type exists.
The MINIMUM field of the SOA record and the TTL of
the SOA itself are used to establish the TTL for the
negative answer (the lower of the two values applies)
How DNS Works In Practice
- Caching time
Many people incorrectly refer to a mysterious 48 hour
or 72 hour propagation time when you make a DNS
change.
» When one changes the NS records for one's domain or the IP
addresses for hostnames of authoritative DNS servers using
one's domain (if any), there can be a lengthy period of time
before all DNS servers use the new information
» Those records are handled by the zone parent DNS servers
» Typically cache those records for 48 hours
» However, those DNS changes will be immediately available for
any DNS servers that do not have them cached
» And any DNS changes on your domain other than the NS
records and authoritative DNS server names can be nearly
instantaneous, if you choose for them to be (by lowering the
TTL once or twice ahead of time, and waiting until the old TTL
expires before making the change)
How DNS Works In Practice
- In the Real World
DNS resolving from program to OS-resolver to
ISP-resolver to greater system.
Users generally do not communicate directly
with a DNS resolver
» Instead DNS-resolution takes place transparently in
client-applications such as web-browsers, mail-clients,
and other Internet applications
» When an application makes a request which
necessitates a DNS lookup
» Such programs send a resolution request to the local DNS
resolver in the local operating system
» Which in turn handles the communications required
How DNS Works In Practice
- In the Real World
The DNS resolver will almost invariably have a cache containing
recent lookups
» If the cache can provide the answer to the request, the resolver will
return the value in the cache to the program that made the request
» If the cache does not contain the answer, the resolver will send the
request to one or more designated DNS servers
» In the case of most home users, the Internet Service Provider to
which the machine connects will usually supply this DNS server:
» Such a user will either have configured that server's address manually or
allowed DHCP to set it
» Where system administrators have configured systems to use their own DNS
servers,
their DNS resolvers point to separately maintained nameservers of the
organization
» In any event, the name server thus queried will follow the process
outlined above
» Until it either successfully finds a result or does not
» It then returns its results to the DNS resolver
» Assuming it has found a result, the resolver duly caches that result for
future use
Hands the result back to the software which initiated the request
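The flow just described (answer from the cache if possible, otherwise forward the request and cache the result) might look like the sketch below. The class name, the shape of the `upstream` callable and the injectable clock are all assumptions made so the example is self-contained; a real resolver library is considerably more involved.

```python
import time


class CachingResolver:
    """Answer from a local cache when possible, else ask an upstream server."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream   # callable: name -> (address, ttl_seconds)
        self._clock = clock
        self._cache = {}            # name -> (address, expiry time)

    def resolve(self, name):
        now = self._clock()
        hit = self._cache.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]                       # fresh cached answer
        address, ttl = self._upstream(name)     # cache miss: forward the request
        self._cache[name] = (address, now + ttl)
        return address


def fake_upstream(name):
    """Stand-in for the ISP's DNS server: fixed answer with a 60 second TTL."""
    return ("192.0.2.1", 60)


resolver = CachingResolver(fake_upstream)
print(resolver.resolve("www.example.net"))
```

The second and later calls for the same name are served locally until the TTL runs out, at which point the upstream server is consulted again.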
How DNS Works In Practice
- Broken Resolvers
Broken resolvers
» An additional level of complexity emerges when resolvers violate
the rules of the DNS protocol
» Some people have suggested that a number of large ISPs have
configured their DNS servers to violate rules (presumably to
allow them to run on less-expensive hardware than a fully-compliant
resolver), such as by disobeying TTLs, or by
indicating that a domain name does not exist just because one of
its name servers does not respond
» As a final level of complexity, some applications (such as
web-browsers) also have their own DNS cache, in order to reduce the
use of the DNS resolver library itself.
» This practice can add extra difficulty when debugging DNS
issues, as it obscures the freshness of data, and/or what data
comes from which cache.
» These caches typically use very short caching times, of the
order of one minute
» Internet Explorer offers a notable exception: recent versions cache DNS
records for half an hour
How DNS Works In Practice
- Other Applications
The system outlined above provides a somewhat simplified
scenario.
The Domain Name System includes several other functions:
» Hostnames and IP addresses do not necessarily match on a one-to-one basis.
» Many hostnames may correspond to a single IP address: combined
with virtual hosting
» this allows a single machine to serve many web sites
» Alternatively a single hostname may correspond to many IP
addresses:
» Facilitates fault tolerance and load distribution
» Allows a site to move physical location seamlessly
Many uses of DNS besides translating names to IP addresses
» E.g. Mail transfer agents use DNS to find out where to deliver e-mail
for a particular address
The domain to mail exchanger mapping provided by MX records
accommodates another layer of fault tolerance and load
distribution on top of the name to IP address mapping
How DNS Works In Practice
- Other Applications
Sender Policy Framework and DomainKeys
» Instead of creating their own record types
» Designed to take advantage of another DNS record type, the TXT record
To provide resilience in the event of computer failure,
multiple DNS servers are usually provided for coverage of
each domain, and at the top level, thirteen very powerful
root servers exist, with additional "copies" of several of
them distributed worldwide via Anycast
DNS primarily uses UDP on port 53 to serve requests.
» Almost all DNS queries consist of a single UDP request from
the client followed by a single UDP reply from the server
» TCP comes into play only when the response data size
exceeds 512 bytes, or for such tasks as zone transfer
» Some operating systems such as HP-UX are known to have
resolver implementations that use TCP for all queries, even
when UDP would suffice
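When a reply does not fit in a UDP message, the server marks it as truncated with the TC bit of the header defined in RFC 1035, and the client repeats the query over TCP. The check below is a minimal sketch of that decision; the function name is hypothetical and the sample header is constructed by hand rather than received from a server.

```python
import struct

TC = 0x0200  # "truncated" flag in the DNS header


def needs_tcp_retry(udp_response):
    """Return True if the server flagged the UDP reply as truncated."""
    (flags,) = struct.unpack("!H", udp_response[2:4])
    return bool(flags & TC)


# Hand-built 12-byte header: ID=1, flags = response bit (0x8000) plus TC.
truncated_header = struct.pack("!HHHHHH", 1, 0x8000 | TC, 0, 0, 0, 0)
print(needs_tcp_retry(truncated_header))
```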
DNS Extensions
Extensions to DNS
» EDNS is an extension of the DNS protocol
» Enhances the transport of DNS data in UDP packets
» Adds support for expanding the space of request and
response codes
» Described in RFC 2671
Types of DNS records
Important categories of data stored in DNS include the following:
» An A record or address record maps a hostname to a 32-bit IPv4
address.
» An AAAA record or IPv6 address record maps a hostname to a 128-bit
IPv6 address.
» A CNAME record or canonical name record is an alias of one name to
another
» The A record to which the alias points can be either local or remote - on a
foreign name server.
» This is useful when running multiple services (like an FTP and a webserver)
from a single IP address.
» Each service can then have its own entry in DNS (like ftp.example.com. and
www.example.com.)
» An MX record or mail exchange record maps a domain name to a list of
mail exchange servers for that domain.
» A PTR record or pointer record maps an IPv4 address to the canonical
name for that host.
» Setting up a PTR record for a hostname in the in-addr.arpa. domain that
corresponds to an IP address implements reverse DNS lookup for that
address.
» For example (at the time of writing), www.icann.net has the IP address
192.0.34.164, but a PTR record maps 164.34.0.192.in-addr.arpa to its
canonical name, referrals.icann.org.
» An NS record or name server record maps a domain name to a list of
DNS servers authoritative for that domain.
» Delegations depend on NS records.
Types of DNS records
Important categories of data stored in DNS include the following: (cont.)
» An SOA record or start of authority record specifies the DNS server
providing authoritative information about an Internet domain, the email of
the domain administrator, the domain serial number, and several timers
relating to refreshing the zone.
» An SRV record is a generalized service location record.
» A TXT Record allows an administrator to insert arbitrary text into a DNS
record.
» For example, this record is used to implement the Sender Policy Framework and
DomainKeys specifications.
» NAPTR records ("Naming Authority Pointer") are a newer type of DNS record
that support regular expression based rewriting.
» Other types of records simply provide information (for example, a LOC
record gives the physical location of a host), or experimental data (for
example, a WKS record gives a list of servers offering some well known
service such as HTTP or POP3 for a domain).
» When sent over the internet, all records use the common format specified in
RFC 1035 shown below.
» RR (Resource Record) fields:
Field      Description                                        Length (octets)
NAME       Name of the node to which this record pertains     (variable)
TYPE       Type of RR; for example, MX is type 15             2
CLASS      Class code                                         2
TTL        Signed time in seconds that the RR stays valid     4
RDLENGTH   Length of the RDATA field                          2
RDATA      Additional RR-specific data                        (variable)
» For a complete list of DNS record types consult IANA DNS Parameters.
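The fixed-length fields of this format map directly onto a `struct` layout. The sketch below packs and unpacks everything that follows the variable-length NAME field; the helper names are hypothetical, and the sample values (type 1 for an A record, class 1 for IN, a 3600 second TTL) are chosen only for illustration.

```python
import struct

# TYPE (2 octets), CLASS (2), TTL (4, signed), RDLENGTH (2), network byte order.
FIXED_FIELDS = struct.Struct("!HHiH")


def pack_rr_tail(rr_type, rr_class, ttl, rdata):
    """Encode everything after NAME for one resource record."""
    return FIXED_FIELDS.pack(rr_type, rr_class, ttl, len(rdata)) + rdata


def unpack_rr_tail(data):
    """Decode TYPE, CLASS, TTL and RDATA from the bytes following NAME."""
    rr_type, rr_class, ttl, rdlength = FIXED_FIELDS.unpack_from(data)
    rdata = data[FIXED_FIELDS.size:FIXED_FIELDS.size + rdlength]
    return rr_type, rr_class, ttl, rdata


# An A record's RDATA is the four octets of the IPv4 address.
tail = pack_rr_tail(1, 1, 3600, bytes([69, 17, 158, 109]))
print(unpack_rr_tail(tail))
```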
DNS Records
Complete List
http://www.iana.org/assignments/dns-parameters
Example DNS Record for logicbbs.org
IN NS ns.planix.com
IN NS ns1.mydyndns.org
IN NS ns2.mydyndns.org
IN MX 10 mail
IN A 69.17.158.109
www IN A 69.17.158.109
mail IN A 69.17.158.109
First three lines describe valid name servers for
logicbbs.org.
Following entry indicates that the mail exchanger for
logicbbs.org has a priority of 10 and messages should
be directed to mail.logicbbs.org.
Priority values indicate where to send e-mail if a server is
unavailable; the lower the priority value, the higher the
priority of that server.
» Mail servers send e-mail to the server with the lowest priority
value, and then work their way up the values listed as
necessary.
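The ordering rule for mail exchangers amounts to a sort on the priority value. In this sketch the second MX entry is hypothetical (the example zone lists only one), and `delivery_order` is an illustrative name.

```python
def delivery_order(mx_records):
    """Sort (priority, host) pairs so the most preferred exchanger comes first."""
    return [host for priority, host in sorted(mx_records)]


# Hypothetical MX set; only the priority-10 entry appears in the example zone.
records = [(20, "backup.logicbbs.org"), (10, "mail.logicbbs.org")]
print(delivery_order(records))
```

A sending mail server would try the hosts in this order, moving to the next one only if delivery to the previous one fails.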
The last three lines indicate that logicbbs.org (the
second-level domain) points to 69.17.158.109.
» The www and mail subdomains (www.logicbbs.org,
mail.logicbbs.org) also point to 69.17.158.109.
The DNS record is the reason why some internet
addresses do not need the www prefix, while others do.
» If that particular domain has a www A record that differs
from the basic A record, then anydomain.com may be
different from www.anydomain.com, and the former may not
work.
» Other sites, like logicbbs.org, have both the second-level domain
and the www subdomain pointing to the same IP address,
which reduces confusion and ambiguity
Internationalized Domain Names
While domain names technically have no restrictions on the
characters they use and can include non-ASCII characters,
the same is not true for host names
» Host names are the names most people see and use for things
like e-mail and web browsing.
» Host names are restricted to a small subset of the ASCII
character set that includes
» the Roman alphabet in upper and lower case
» the digits 0 through 9
» the dot
» the hyphen
» This prevented the native representation of names and words
of many languages
» ICANN has approved the Punycode-based IDNA system, which
maps Unicode strings into the valid DNS character set, as a
workaround to this issue
» Some registries have adopted IDNA
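Python's standard library ships an `idna` codec that performs this Unicode-to-ASCII mapping, so the effect of IDNA is easy to observe. Note that the built-in codec implements an older revision of the standard, so treat this as a demonstration rather than a complete IDNA implementation; the wrapper names and the sample hostname are chosen for illustration.

```python
def to_ascii(hostname):
    """Convert a Unicode hostname to its ASCII (Punycode) form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Convert an ASCII-encoded hostname back to Unicode."""
    return hostname.encode("ascii").decode("idna")


print(to_ascii("bücher.example"))
```

Only labels that contain non-ASCII characters are rewritten (with an `xn--` prefix); plain ASCII labels pass through unchanged.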
Security issues
DNS was not originally designed with security in mind, and thus has a
number of security issues.
» DNS responses are traditionally not cryptographically signed, leading to many
attack possibilities;
» DNSSEC modifies DNS to add support for cryptographically signed responses
» There are various extensions to support securing zone transfer information as well
Even cryptographic signing does not rule out the possibility that a DNS
server is compromised (by a virus or, for that matter, by a disgruntled
employee) so that names served by it are redirected
to a malicious address with a long TTL.
» This could have far reaching impact to potentially millions of internet users if
busy DNS servers cache the bad IP data.
» This would require manual purging of all affected DNS caches as required by
the long TTL (up to 68 years).
Some domain names can spoof other, similar-looking domain names.
» For example, "paypal.com" and "paypa1.com" are different names, yet users
may be unable to tell the difference when the user's typeface (font) does not
clearly differentiate the letter l and the number 1.
» This problem is much more serious in systems that support internationalized
domain names, since many characters that are different, from the point of view
of ISO 10646, appear identical on typical computer screens.
Legal users of domains
Registrant
» Most of the NICs in the world receive an annual fee from a legal user in order
for the legal user to utilize the domain name (i.e. a sort of a leasing agreement
exists, subject to the registry's terms and conditions)
» Depending on the various naming convention of the registries, legal users become
commonly known as "registrants" or as "domain holders"
» ICANN holds a complete list of domain registries in the world
» One can find the legal user of a domain name by looking in the WHOIS database
held by most domain registries
» For most of the more than 240 country code top-level domains (ccTLDs), the
domain registries hold the authoritative WHOIS (Registrant, name servers,
expiry dates, etc.).
» For instance, DENIC, Germany NIC, holds the authoritative WHOIS to a .DE domain
name
» However, some domain registries, such as for .COM, .ORG, .INFO, etc., use a
registry-registrar model
» There are hundreds of Domain Name Registrars that actually perform the domain
name registration with the end user (see lists at ICANN or VeriSign)
» By using this method of distribution, the registry only has to manage the
relationship with the registrar, and the registrar maintains the relationship with the
end users, or 'registrants'
» For .COM and .NET domain names, the domain registry, VeriSign, holds a basic WHOIS
(registrar and name servers, etc.)
» One can find the detailed WHOIS (registrant, name servers, expiry dates, etc.) at the
registrars
» Since about 2001, most gTLD registries (.ORG, .BIZ, .INFO) have adopted a so-called
"thick" registry approach, i.e. keeping the authoritative WHOIS with the
various registries instead of the registrars
Legal users of domains
Administrative contact
» A registrant usually designates an administrative contact to manage the
domain name
» The administrative contact usually has the most immediate power over a domain
» Management functions delegated to the administrative contacts may include:
the obligation to conform to the requirements of the domain registry in order to retain the right to use
a domain name
authorization to update the physical address, e-mail address and telephone number etc. in WHOIS
Technical contact
» A technical contact manages the name servers of a domain name. The
many functions of a technical contact include:
» making sure the configurations of the domain name conforms to the
requirements of the domain registry
» updating the domain zone
» providing the 24×7 functionality of the name servers
(which keeps the domain name accessible)
Billing contact
» The party whom a NIC invoices
Name servers
» Namely the authoritative name servers that host the domain name zone of a
domain name
Politics
Many investigators have voiced criticism of the methods currently used to
control ownership of domains
» Critics commonly claim abuse by monopolies or near-monopolies, such as
VeriSign, Inc
» Particularly noteworthy was the VeriSign Site Finder system which redirected all
unregistered .com and .net domains to a VeriSign webpage
» Despite widespread criticism, VeriSign only reluctantly removed it after the
Internet Corporation for Assigned Names and Numbers (ICANN) threatened to
revoke its contract to administer the root name servers
There is also significant disquiet regarding the United States' political
influence over ICANN
» This was a significant issue in the attempt to create a .xxx top-level domain and
sparked greater interest in alternative DNS roots that would be beyond the
control of any single country
Truth in Domain Names Act
» In the United States, the "Truth in Domain Names Act" (actually the
"Anticybersquatting Consumer Protection Act"), in combination with the
PROTECT Act, forbids the use of a misleading domain name with the intention of
attracting people into viewing a visual depiction of sexually explicit conduct on
the Internet
Other Internet Resources
See also
» Dynamic DNS
» Alternative DNS root
» Comparison of DNS server software
DNS Zones
Forward
DNS Zones
Reverse
http://en.wikipedia.org/wiki/Reverse_DNS_lookup
DNS Reverse Lookup
Overview
» Typically, the Domain Name System is used to
determine what IP address is associated with a
given domain name.
» So, to reverse-resolve a known IP address is to
look up the domain name associated with
that IP address.
» A reverse lookup is often referred to as reverse
resolving, or more specifically reverse DNS
lookup, and is accomplished using a "reverse
IN-ADDR entry" in the form of a PTR record
DNS Reverse Lookup
IPv4 Reverse DNS
» Reverse DNS lookups for IPv4 addresses use a reverse
IN-ADDR entry in the special domain in-addr.arpa.
» An IPv4 address is represented in the in-addr.arpa
domain by a sequence of bytes in reverse order,
represented as decimal numbers, separated by dots with
the suffix .in-addr.arpa.
» For example
the reverse lookup domain name corresponding to the IPv4
address 10.12.13.140 is 140.13.12.10.in-addr.arpa.
A host name for 1.2.3.4 can be obtained by issuing a DNS
query for the PTR record for that special address 4.3.2.1.in-addr.arpa.
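Building the reverse lookup name is a purely textual transformation, sketched below. The function name `reverse_name` is hypothetical; `ipaddress.IPv4Address` is used only to reject malformed input.

```python
import ipaddress


def reverse_name(ipv4):
    """Build the in-addr.arpa name used for a PTR lookup."""
    # Validating first raises ValueError for anything that is not IPv4.
    octets = str(ipaddress.IPv4Address(ipv4)).split(".")
    return ".".join(reversed(octets)) + ".in-addr.arpa."


print(reverse_name("10.12.13.140"))
```

The octets are reversed because DNS names grow more specific from right to left, while IP addresses grow more specific from left to right.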
DNS Reverse Lookup
Classless Reverse DNS
» Historically, IP addresses were allocated in blocks of 256
» Each block fell upon an octet boundary
» Configuration of the PTR records was easy
Dot separators delimited each block
» IP addresses are now allocated in much smaller
blocks
» The traditional way of configuring a nameserver to perform reverse
DNS no longer works for them
» A means of overcoming this problem was devised and
published as RFC 2317
» Uses a CNAME entry which corresponds to each block
Multiple PTR records
» While most rDNS entries only have one PTR record, it is
perfectly legal to have many different PTR records
» Although it is perfectly legal to have multiple PTR
records for the same IP address, it is generally not
recommended unless you have a specific need
» For example, if a webserver supports many virtual hosts
Can be one PTR record for each host
Some versions of name server software will automatically add a PTR record for
each host
» Multiple PTR records can cause a couple of problems
» Including triggering bugs in programs that only expect there to
ever be a single PTR record
» In the case of a large webserver, having hundreds of PTR records
can cause the DNS packets to be much larger than normal
Records other than PTR records
» While uncommon compared with PTR records,
it is also legal to put other types of records in
the reverse DNS tree.
» In particular, encryption keys can be placed
there
» For example,
IPsec (RFC 4025)
SSH (RFC 4255)
IKE (RFC 4322)
» Less standardized usages include
» comments placed in TXT records and LOC records to
identify the location of the IP address
DNS
Forward -vs- Reverse
Lookups
Forward DNS lookup
» Forward DNS lookup is using an Internet domain name to find
an IP address.
Reverse DNS lookup
» Reverse DNS lookup is using an Internet IP address to find a
domain name.
http://searchsmb.techtarget.com/sDefinition/0,,sid44_gci213968,00.html
Lookups
When you enter an address for a Web site at your browser
» The name is handed to a DNS resolver (typically a nearby
nameserver run by your network or ISP)
» The resolver does a forward DNS lookup to locate the IP
address
Forward DNS lookup is the more common lookup
» Most users think in terms of domain names rather than IP
addresses
Occasionally you may see a Web page with a URL in which
the domain name part is expressed as an IP address
(sometimes called a dot address) and want to be able to see
its domain name.
A tool that lets you do either forward or reverse
DNS lookup yourself is called nslookup
» Comes with some operating systems
» Otherwise you can download the program and install it on your computer
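Both kinds of lookup can also be tried from a script. The sketch below uses Python's standard socket module; the wrapper names forward_lookup and reverse_lookup are assumptions made for illustration, and the answers depend on whatever resolver the machine is configured to use.

```python
import socket

def forward_lookup(name):
    """Forward lookup: host name -> IPv4 address string."""
    return socket.gethostbyname(name)

def reverse_lookup(ip):
    """Reverse lookup: IP address -> primary host name."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
    return hostname
```

For example, forward_lookup("en.wikipedia.org") returns the address the resolver currently reports, and reverse_lookup on that address returns the name in its PTR record, which need not be the name you started with.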
DNS Zones
Forwarding
DNS Forwarding
In large, well organized, academic or ISP networks you will
sometimes find that the network people have set up a
forwarder hierarchy of DNS servers
» Helps lighten the internal network load and the load on the
outside servers
It's not easy to know if you're inside such a network or not
By using the DNS server of your network provider as a
forwarder you can make the responses to queries faster
and less of a load on your network
Your nameserver forwards queries to your ISP nameserver
Each time this happens you will dip into the big cache of
your ISP's nameserver
» Thus speeding your queries up, your nameserver does not
have to do all the work itself
If you use a modem this can be quite a win
http://tldp.org/HOWTO/DNS-HOWTO-4.html
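The effect of a forwarder can be illustrated with a toy model. This is a simplified sketch, not real nameserver code: the class ForwardingResolver, the dictionary standing in for the ISP nameserver, and the example address (from a documentation range) are all assumptions, and record lifetimes (TTLs) are ignored.

```python
class ForwardingResolver:
    """Toy nameserver: answers from its own cache, forwards misses upstream."""

    def __init__(self, forwarder):
        self.forwarder = forwarder   # callable standing in for the ISP nameserver
        self.cache = {}
        self.forwarded = 0           # number of queries sent upstream

    def resolve(self, name):
        if name not in self.cache:
            self.forwarded += 1
            self.cache[name] = self.forwarder(name)
        return self.cache[name]

# Hypothetical contents of the ISP nameserver's big cache.
isp_cache = {"www.example.com": "192.0.2.10"}
local = ForwardingResolver(isp_cache.__getitem__)
local.resolve("www.example.com")
local.resolve("www.example.com")
print(local.forwarded)  # 1
```

Only the first query leaves the local nameserver; the repeat is answered locally.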
DNS
Delegation/Parenting
Mail Exchangers
Covered in previous section:
How DNS Works In Practice
- Other Applications
Content Delivery Networks
These slides are provided by Saverio Nicolini of NEC
Laboratories Europe
AN INTRODUCTION
Retrieving Content WITHOUT a CDN
Imagine you want to get the latest iPhone software update* and
there is no CDN
»
apple.com/update
Do you see any problem with this?
*the same concepts apply to the downloading of a web page
Retrieving Content WITHOUT a CDN: problems
Server Scalability
» Origin server becomes a bottleneck
» Too many requests to be served in a short time
» Too little bandwidth to deliver the (huge) content
Network Performance
» Network utilization and congestion: same data flows
repeatedly over links between clients and origin server (even
if clients are next to each other)
» Delay: content can encounter big delays when flowing each
time from remote locations
Server Scalability Solutions: server farms
Trivial: use a large number of servers and balance the
requests
Load Balancer
» Load balancer can balance at random (or on L3/4/7 parameters)
» Does not improve network utilization
» Does not improve user latency due to network congestion
Network Performance Solutions: caching
proxies
Solution: ISP intercepts and redirects specific traffic (e.g., TCP traffic on
port 80) to a local cache (if content was previously seen)
» Example: Squid Web Proxy Cache (http://www.squid-cache.org/)
Major benefits (for ISPs)
» Reduced network utilization
» Improved user latency
Drawbacks of caching proxies
» Serve only ISP customers
» Content providers can not assume
caching proxies are deployed
» not all ISPs deploy caching proxies
» Content providers can not assume
they work correctly
» correct handling of time to live
» correct caching of what can be
cached and what can not
» Accounting and reporting is very
important to service operators
» E.g., would need to retrieve number of
hits from caching proxies to monetize
advertisement
[Figure: an ISP cache sitting between the ISP's users and content on the Internet (e.g., corriere.it)]
Retrieving Content WITH a CDN
What if the iPhone software update has been
previously distributed to multiple locations
all over the world?
[Figure: the origin server for apple.com/update and four replica/surrogate servers in different locations]
Do you see the improvement?
Retrieving Content WITH a CDN:
advantages
The same data traverses potentially congested links
on the Internet only a few times
» Most importantly: the number of times is now independent
of the number of user requests
The origin server load is highly reduced
» And again: the load is now independent of the number of
user requests
Users receive their content quicker
» The nearer the location, the smaller the delay
Further optimizations are possible (and independent
of the network operators' routing strategies)
» The content delivery can avoid congested regions (by
intelligently redirecting the content request to a non-congested
zone)
A number of geographically distributed servers
deployed to facilitate the distribution of contents
in a timely and efficient manner
CDN Arena: players
CDN (Service) Providers
» The (traditional) operators of a CDN
» Owners of the geographically distributed infrastructure hosting the content
replicas
» Operating the software that manages the CDN (either developed in-house or acquired)
» Delivering content to users on behalf of the content providers
Content Providers (CP)
» The owner of the contents (and normally of the origin server as well)
» The ones willing to pay a CDN Service Provider in order to distribute their
contents efficiently
» E.g., by delegating the URI name space of the objects to be distributed
» Example(s): CNN, Netflix, YouTube, Disney, Warner Bros.
CDN Solution Vendors
» Technology providers developing turnkey solutions to run a CDN (or just some
software components)
» Example(s): CoBlitz, Velocix, Jet-Stream
CDN Arena: players
Internet Service Providers (ISPs)
» (Traditionally) bit pipes (the losers in the CDN arena)
» They upgrade their infrastructure to face the huge increase in content
demand but they do not earn anything from the content
» Example(s): Telecom Italia, France Telecom, etc.
(Network) Equipment Vendors
» Vendors (traditionally) developing network infrastructure devices that
transport the bits intelligently and efficiently
» Example(s): Cisco, Alcatel-Lucent, Ericsson, NEC
End Users
What is the difference between
a CDN and Caching Proxies?
Who wants them and why?
» An ISP deploys a caching proxy to reduce bandwidth
consumption, save costs (OPEX: bandwidth, CAPEX: less
network upgrades) and satisfy its own users
» A content provider gives its content to a CDN Service Provider
in order to improve the quality of service towards its users
(independently of the ISP, i.e., all Internet users)
How do they operate?
» Caching proxies operate in a reactive manner
» They start caching after they have seen the content one or
multiple times
» CDNs operate in a proactive manner
» Content Providers push the latest blockbuster movie to
multiple locations around the world
What is the difference between
a CDN and Caching Proxies?
Whom are they pleasing?
»
providers)
» A CDN satisfies all end users and content providers (who are in
full control of their contents)
we will discuss later why ISPs are not fully satisfied with the
traditional CDN model
QUIZ: NAME THE 5 BIGGEST
CDN SERVICE PROVIDERS
WORLDWIDE
Five Biggest CDN Service Providers
Worldwide
Source: Yankee Group 2009 CDN Scorecard
1. Akamai
» 84,000 servers in 1,000 networks*
2. Limelight
3. EdgeCast
4. AT&T
5. Level3
» Became the primary CDN for Netflix in Nov. 2010
* source: Mike Afergan, Akamai CTO
CDN FUNCTIONAL
ARCHITECTURE
CDN Functionalities
Content Ingestion/Acquisition
» Content entry point inside the CDN from the content source
» i.e., origin server, content provider
» Content preparation for the CDN: multiple formats and codec
rates
Content Distribution
» Placement: moving or replicating content to surrogate servers
(replica placement)
» Outsourcing/Retrieval: moving or replicating content within
surrogate servers
» Selection/Delivery: selecting and delivering the right content
(in the right format) from the surrogate server to the end user
CDN Functionalities
Request Routing
» Directing a request to the closest suitable surrogate server
CDN Management
» Monitoring performance, logging, reporting and accounting on
distribution and delivery activities
» Can have interfaces towards Content Provider and/or other
CDNs (e.g., for interconnection)
CDN Functional Architecture
[Figure: block diagram with the origin server outside the CDN and, inside it, Ingestion/Acquisition, Preparation, Placement, Retrieval, Delivery, Distribution, Request Routing, CDN Management and two surrogates]
Origin Server Talking to CDN
[Figure: the origin server next to the CDN blocks Ingestion/Acquisition, Preparation, Placement, Retrieval, Delivery, Distribution and CDN Management]
Origin server interactions with Content Ingestion/Acquisition
» New content is pushed to the CDN (e.g., via http, nfs, webdav, etc.);
afterwards the content is prepared for Placement
Origin server interactions with CDN Management
» Exchange of logs and other accounting info
End-user Interactions with CDN:
request routing
End-user requests for www.very-popular.com
» Request Routing intercepts the request and redirects it towards
a surrogate server (e.g., nap.very-popular.akamai.com)
Request Routing and Delivery make sure that the surrogate server has the right
content in the right format (otherwise the surrogate asks Distribution, and
Retrieval kicks in)
[Figure: Request Routing and Distribution (Placement, Retrieval, Delivery) inside the CDN, with two surrogates: nap.akamai.very-popular.com (IT) and fra.akamai.very-popular.com (DE)]
Request Routing
Responsible for routing client requests to best surrogate
servers
» Metrics used to select the surrogate server to route the request to
» Network proximity
» Client-perceived latency
» Surrogate server load
» Algorithms
» Adaptive (take into account current system conditions)
commonly used by successful commercial systems (e.g., Akamai, Cisco,
etc.) with proprietary combinations of the metrics above
» Non-adaptive (based on heuristics ignoring current system
conditions)
E.g., round-robin load-balancing
» Mechanisms
» See next slides
» For a complete taxonomy read
http://tools.ietf.org/html/rfc3568
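The difference between adaptive and non-adaptive selection can be sketched as follows. This is an illustrative toy, not how any commercial system works: the Surrogate fields, the weights and the sample numbers are all assumptions, and real systems combine the metrics in proprietary ways.

```python
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Surrogate:
    name: str
    hops: int          # network proximity to the client (fewer is closer)
    latency_ms: float  # client-perceived latency
    load: float        # current load, 0.0 (idle) to 1.0 (saturated)

def adaptive_pick(surrogates, w_hops=1.0, w_latency=0.1, w_load=20.0):
    """Adaptive: choose the lowest weighted cost from current measurements."""
    return min(
        surrogates,
        key=lambda s: w_hops * s.hops + w_latency * s.latency_ms + w_load * s.load,
    )

def round_robin(surrogates):
    """Non-adaptive: hand out surrogates in fixed rotation, ignoring conditions."""
    return cycle(surrogates)

nap = Surrogate("nap", hops=3, latency_ms=20.0, load=0.9)
fra = Surrogate("fra", hops=6, latency_ms=35.0, load=0.2)
print(adaptive_pick([nap, fra]).name)  # fra: nap is closer but heavily loaded
```

With these made-up weights the closer surrogate loses because of its load; round_robin would alternate between the two regardless.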
Request Routing Mechanisms
(requiring no interaction with origin server)
DNS-based (most used by CDN systems)
» A modified DNS server is inserted in the DNS resolution process
» capable of returning a different set of A, NS or CNAME records based on
user defined policies, metrics, or a combination of both
» mapping the surrogate server symbolic name to IP address
» Single reply
» DNS server is authoritative for the entire DNS domain
» The DNS server returns the IP address of the best surrogate in an A record
to the requesting DNS server
» Could also be a virtual IP address of the best set of surrogates for
requesting DNS server
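A single-reply decision can be sketched as a policy lookup keyed on the address of the requesting DNS server. Everything concrete here is hypothetical: the POLICY table, the prefixes and the surrogate addresses (all taken from documentation ranges) only illustrate the idea of returning one A record per request.

```python
import ipaddress

# Hypothetical policy: which surrogate serves resolvers in which prefix.
POLICY = [
    (ipaddress.ip_network("192.0.2.0/25"), "nap.akamai.very-popular.com", "198.51.100.10"),
    (ipaddress.ip_network("192.0.2.128/25"), "fra.akamai.very-popular.com", "198.51.100.20"),
]
DEFAULT = ("fra.akamai.very-popular.com", "198.51.100.20")

def single_reply(resolver_ip):
    """Return (surrogate name, A-record address) for the requesting DNS server."""
    addr = ipaddress.ip_address(resolver_ip)
    for prefix, name, a_record in POLICY:
        if addr in prefix:
            return name, a_record
    return DEFAULT

print(single_reply("192.0.2.7"))
```

Note that the decision is made from the resolver's address, not the end user's, since that is all the request-routing DNS server sees.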
[Figure: the Local DNS server, Request Routing in the CDN, and the surrogates nap.akamai.very-popular.com (IT) and fra.akamai.very-popular.com (DE)]
Request Routing Mechanisms
(requiring no interaction with origin server)
» Multiple reply
» Request routing DNS server returns multiple replies such as several A
records for various surrogates
» Common implementations of client-side DNS servers cycle through the
multiple replies in a round-robin fashion
» Multi-level resolution
» Multiple request routing DNS servers can be involved in a single DNS resolution
» Allows distributing more complex decisions from a single server to multiple,
more specialized, request-routing DNS servers
DNS-based request routing is not error-free
» Request routing function guesses the location of the user based on
the location of the Local DNS server
» Possible solution: ISPs communicate the location of their Local DNS
servers to CDN Service Providers
» Even more complicated: end-users manually configure their DNS servers
» E.g., they do not trust the DNS servers of their ISPs
» Possible solution: recent IETF standardization on ALTO (Application Layer
Traffic Optimization)
» ALTO: an ISP service that gives applications guidance on the underlying
network; not deployed yet
ALTO Service:
P2P-use case
Request Routing with the help of an ALTO
service
[Figure: message flow between the end user, the ISP's Local DNS, Request Routing in the CDN, an ALTO server reached via the ALTO protocol, and the surrogates nap.akamai.very-popular.com (IT) and fra.akamai.very-popular.com (DE); steps 1a to 4b are listed below]
1a. Give me www.very-popular.com
2a. Please resolve www.very-popular.com
3a. Which one to prefer between nap.akamai and
fra.akamai for that end-user?
3b. Go to nap.akamai
2b. Go to nap.akamai
1b. Go to IP address of nap.akamai
4a. Give me content
4b. Here it is...
Request Routing Mechanisms
(requiring interaction with origin server)
HTTP redirection
» Information about surrogate servers is propagated in the HTTP headers
» Web server answers the client, telling it to resubmit its request to
another server
Flexible and simple
Lacks transparency, introduces an extra round-trip, introduces extra
processing on the origin server
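HTTP redirection can be sketched with Python's standard http.server module. This is a minimal illustration under stated assumptions: the surrogate is fixed rather than chosen per request, and the names SURROGATE, redirect_target, RedirectHandler and serve are invented for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

SURROGATE = "http://nap.akamai.very-popular.com"  # hypothetical, fixed choice

def redirect_target(path, surrogate=SURROGATE):
    """Build the Location header value for a request path."""
    return surrogate.rstrip("/") + "/" + path.lstrip("/")

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 302 tells the client to resubmit the same request to another server.
        self.send_response(302)
        self.send_header("Location", redirect_target(self.path))
        self.end_headers()

def serve(port=8080):
    """Run the redirecting origin server until interrupted."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Every request still reaches the origin first, which is where the extra round-trip and the extra origin processing come from.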
URL rewriting (used by some CDN systems)
» Origin server rewrites the URLs of embedded objects in the dynamically
generated pages so that the end-user fetches them from the surrogate servers
Allows fine granularity, flexible (routing can potentially change at every
request)
Introduces extra load on the origin server
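URL rewriting can be illustrated with a small text substitution. This is a deliberately naive sketch: it assumes embedded objects appear as absolute src= URLs, uses a regular expression rather than a real HTML parser, and the function name rewrite_embedded and the host names are assumptions for the example.

```python
import re

def rewrite_embedded(html, origin_host, cdn_host):
    """Point src= attributes that reference origin_host at cdn_host instead."""
    pattern = re.compile(
        r'(src=["\']https?://)' + re.escape(origin_host) + r'(?=[/"\'])'
    )
    return pattern.sub(lambda m: m.group(1) + cdn_host, html)

page = ('<a href="http://www.very-popular.com/">home</a>'
        '<img src="http://www.very-popular.com/big.jpg">')
print(rewrite_embedded(page, "www.very-popular.com", "nap.akamai.very-popular.com"))
```

The page link keeps pointing at the origin while the embedded image is fetched from the surrogate; the CDN host could be chosen differently on every request.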
Distribution: Placement (I)
[Figure: two origin servers (cntv.cn/videos and apple.com/update) and four replica/surrogate servers]
Distribution: Placement (II)
Replica Placement
» Objective: optimize the placement of replicas (choosing which contents to
replicate where, among the surrogate servers available)
» Performance improvements targeted
» Reduce user-perceived latency for accessing content
» Minimize the overall network bandwidth consumption for transferring
content to clients
» Benefits
» for the CDN Service Provider: reduced infrastructure (CAPEX)
» for the ISP: reduced communications costs (OPEX)
» for the users: improved quality of experience
» Related optimization problem: choice of physical locations where to place
surrogate servers
» In reality: locations are given by a mix of theoretical analysis and
business constraints (e.g., hosting/storage facilities actually available
from third-parties and their costs)
» Single-ISP approach (e.g., AT&T): many choices on locations (up to base
station level)
» Multi-ISP approach: caches placed at global ISP Points of Presence
(POPs) by CDN Service Provider (e.g., Akamai, Amazon CloudFront)
Replica Placement Strategies
» The placement problem is very hard to solve computationally (i.e., NP-complete)
» Theoretical approaches
»
centers and servers inside the storage centers, minimize the
maximum distance between a node (end-user) and the nearest
center
Minimum K-center problem [1]
K-hierarchically well-separated trees (K-HST) [2] (graph-theory)
» Heuristics
» Reducing the computational complexity as a trade-off of optimality:
find an acceptably good solution in a fixed amount of time (rather
than the best possible solution)
Greedy algorithms [3]
Topology-informed algorithms [4]
Hot-spot algorithms [5]
Probabilistic meta-heuristics for combinatorial optimization [6]
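To make the K-center objective concrete, here is a sketch of the classic farthest-first greedy heuristic, which is known to stay within a factor of two of the optimal maximum distance in a metric space. It is offered as one simple example of a heuristic, not as the specific algorithm of any reference cited on the slide; the function name and the sample coordinates are made up.

```python
import math

def greedy_k_center(points, k):
    """Farthest-first heuristic: pick k centers, return (centers, max distance)."""
    centers = [points[0]]                                  # arbitrary first center
    dist = [math.dist(p, centers[0]) for p in points]      # distance to nearest center
    while len(centers) < k:
        worst = max(range(len(points)), key=dist.__getitem__)  # worst-served node
        centers.append(points[worst])
        dist = [min(d, math.dist(p, points[worst])) for d, p in zip(dist, points)]
    return centers, max(dist)

nodes = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 8)]
centers, radius = greedy_k_center(nodes, 3)
print(centers, radius)  # [(0, 0), (11, 0), (5, 8)] 1.0
```

Each round places the next center at the node that is currently farthest from every chosen center, which runs in a fixed amount of time but does not guarantee the best possible placement.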
[...]th Annual IEEE Symposium on Foundations of Computer Science, 1996
[6] http://en.wikipedia.org/wiki/Metaheuristic
Distribution: Retrieval (I)
After content was placed
»
origin)?
[Figure: two origin servers (cntv.cn/videos and apple.com/update) and four replica/surrogate servers, with question marks on the paths between them]
Distribution: Retrieval (II)
Outsourcing/Retrieval
» Objective: once contents are placed in surrogate servers, how do other
surrogate servers that lack a specific content obtain it?
» Cooperative push-based
Pre(e.g., based on P2P technologies) the delivering of the content to the surrogates not
having the content
» Cooperative pull-based
used by many commercial systems
» Non-cooperative pull-based
used by many commercial systems
» Related issue: Once the content gets to the surrogate server of interest,
should it stay there (placement)?
many successful commercial systems (e.g., YouTube) always replicate the content to the
surrogate server located in the region where it was created and/or requested at least once
» given the huge number of contents they do not bother with replica placement (same
design strategy as Verivue, a successful startup in the CDN space,
http://www.verivue.com/blog/, from Larry Peterson, Chief Scientist at Verivue and
Professor at Princeton University)
Distribution: Cache Update
Cache Update
» Objective: declare content cached in the
surrogate servers as obsolete
» Periodic updates (most common cache
update method)
Origin server conveys information
(e.g., Cache-Control HTTP settings)
on what is cacheable, for how long, etc.
Such information is stored in the surrogate servers
Surrogate servers are inspected periodically and cleaned of expired contents
» Update propagation
Triggered by a change in the content
Requires active content pushing to the surrogate servers in order to replace the
content
Used only for slowly varying contents
» On-demand update
Update for the content is requested only when there is a demand for the content
» Invalidation
Invalidation message sent to all surrogate servers for a specific content
» Benefit: serve end-users with up-to-date information/contents
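Periodic updates and invalidation can be sketched with a small in-memory cache. This is a toy model under stated assumptions: only the max-age and no-store directives of Cache-Control are looked at, and the names max_age and SurrogateCache are invented for the example.

```python
import re
import time

def max_age(cache_control):
    """Extract max-age (seconds) from a Cache-Control value, or None."""
    if re.search(r"\bno-store\b", cache_control):
        return None
    m = re.search(r"\bmax-age=(\d+)", cache_control)
    return int(m.group(1)) if m else None

class SurrogateCache:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.store = {}                      # url -> (body, expiry timestamp)

    def put(self, url, body, cache_control):
        ttl = max_age(cache_control)
        if ttl:                              # keep only explicitly cacheable content
            self.store[url] = (body, self.clock() + ttl)

    def purge_expired(self):
        """Periodic update: drop every object whose lifetime has run out."""
        now = self.clock()
        self.store = {u: v for u, v in self.store.items() if v[1] > now}

    def invalidate(self, url):
        """Invalidation: remove one object on request, whatever its lifetime."""
        self.store.pop(url, None)
```

A surrogate would call purge_expired on a timer, while invalidate models the message the CDN sends to all surrogates for a specific content.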
Distribution: Cache Replacement
Cache Replacement
» Objective: action to be taken when the object store is full
» Simplest and most popular replacement rules
LRU (replace Least Recently Used object)
» I.e., replace the object which has not been requested for the longest time
LFU (replace Least Frequently Used object)
» I.e., replace the object with the fewest requests since it was stored
» Cache pollution: highly popular contents from the past could prevent new
popular contents from being cached (long term vs. short term popularity)
» What to take into account for designing an efficient replacement
strategy
What is the probability of the object being requested in the future?
What is the history of requests of that object in the past?
When was the object requested last?
What is the (network) cost of the object? (e.g., bandwidth to the origin server
for fetching it)
What is the (storage) cost of the object? (i.e., object size)
What are premium users requesting?
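The two simple rules can be sketched as follows. These are minimal illustrations that count objects rather than bytes and ignore object cost; the class names LRUCache and LFUCache are assumptions made for the example.

```python
from collections import Counter, OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity, self.items = capacity, OrderedDict()

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)       # mark as most recently used
            return self.items[key]
        return None

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)    # evict the least recently used

class LFUCache:
    def __init__(self, capacity):
        self.capacity, self.items, self.hits = capacity, {}, Counter()

    def get(self, key):
        if key in self.items:
            self.hits[key] += 1
            return self.items[key]
        return None

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = min(self.items, key=lambda k: self.hits[k])  # fewest requests
            del self.items[victim]
            del self.hits[victim]
        self.items[key] = value
        self.hits.setdefault(key, 0)
```

Because LFUCache never ages its counters, an object that was very popular long ago can keep newer popular objects out, which is the cache pollution effect mentioned above.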
Distribution: Delivery
Selection and Delivery
» Objective: right selection of the content to be delivered to the end-user
» Full-site vs. Partial-site content selection
Full-site: entire origin server content is outsourced to CDN (e.g., DNS configuration
by content provider thus all requests are resolved by the CDN DNS server)
» Pros: simplicity
» Cons: huge storage required at surrogate servers
Partial-site: partial replication of heavy objects only (embedded videos, huge
contents) (e.g., objects have host names in a domain for which the CDN provider is
authoritative)
» Network and/or Device-based content selection
Content to be delivered is chosen based on network and/or end-user terminal
characteristics
» If CDN could detect end-user device, it could adapt the content selection
based on device characteristics (e.g., http://wurfl.sourceforge.net/)
» Benefit: a correct content selection and delivery can dramatically reduce
the client download time, the server load and the network usage
CDN Management: Performance
Measurement (I)
Performance measurement of a CDN
»
» Most important metrics to measure the performance of a CDN
» Cache hit ratio: ratio of the number of requests served from cache
to the total number of objects requested (high is better)
» Origin server bandwidth: bandwidth used by the origin server
(low is better)
» Latency: time for the user to receive the objects requested
(e.g., a web page, startup delay of a video) (low is better)
» Surrogate server utilization: busy time of the surrogate server
(high is better)
» Availability: how reliable a CDN is (e.g., measured with packet losses, requests timed out, etc.)
» Good performance measurement (via detailed content-access
logs) is one of the differentiators of the most successful CDN
Service Providers
» it is about how you present it and how you make it usable to
your customers (i.e., content providers)
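Some of these metrics can be computed directly from content-access logs. The sketch below assumes a made-up log format of (served_from_cache, bytes_sent, latency_ms) tuples and assumes that every miss is fetched in full from the origin; neither assumption comes from the slides.

```python
def cdn_metrics(log):
    """Summarize a log of (served_from_cache, bytes_sent, latency_ms) entries."""
    entries = list(log)
    if not entries:
        raise ValueError("empty log")
    hits = sum(1 for hit, _, _ in entries if hit)
    origin_bytes = sum(size for hit, size, _ in entries if not hit)
    return {
        "cache_hit_ratio": hits / len(entries),       # high is better
        "origin_bytes": origin_bytes,                 # low is better
        "mean_latency_ms": sum(lat for _, _, lat in entries) / len(entries),
    }

sample = [(True, 500, 20.0), (True, 700, 30.0), (False, 900, 120.0), (True, 100, 30.0)]
print(cdn_metrics(sample))
```

For the sample log this gives a hit ratio of 0.75, 900 bytes fetched from the origin and a mean latency of 50 ms.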
CDN Management: Performance
Measurement (II)
* source: http://www.akamai.com/html/technology/dataviz2.html
* source: http://www.akamai.com/worldcup
CDN Management: Performance
Measurement (III)
* source: http://www.akamai.com/html/technology/dataviz1.html
* source: http://www.akamai.com/html/technology/nui/news/index.html
Video: Akamai's
Network Operation
and Control Center (NOCC)
(from min 2:58 to 5:52)
[Figure: Centralized Model vs. CDN Model, showing content providers, content servers, cache servers, global CDN providers, content distribution revenues, and the (mobile) carrier network, whose backhaul and core are the bottleneck]