Overview: Naming

Naming
Jennifer Rexford
Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Tuesdays/Thursdays 1:30pm-2:50pm
Goals of Today’s Lecture
• Names in the Internet
• Domain Name System (DNS)
– DNS server hierarchy
– DNS queries and responses
– DNS caching
– Improving DNS reliability
• DNS security vulnerabilities
– DNS cache poisoning and home-network attacks
• Use of DNS for (Web) server load balancing
• Beyond today’s naming and DNS
Names
Names in the Internet
• What gets named?
– Hosts, especially servers
– E.g., www.cnn.com or ftp.cs.princeton.edu
• What format do names have?
– Human-readable for ease of remembering
– Decentralized, hierarchical allocation of names
• Why are names separate from addresses?
– Names are easier (for humans) to remember
– Allows IP addresses to change over time
– Allows many-to-one and one-to-many mapping
Names in the Internet
• When are names translated to addresses?
– Before IP-level communication begins
– To learn the IP address of the remote end-point
• Who requests the translation?
– The end-host initiating the communication
• Can addresses be translated back to names?
– Yes, useful for access control, customization of
content, interpreting measurement data, etc.
– Though not always necessary or possible
Domain Name System (DNS)
Proposed in 1983 by Paul Mockapetris
Key Concepts Underlying DNS
• Indirection
– Use of names in place of addresses
– Queries from local servers rather than end hosts
• Hierarchy
– For scalability
• Many servers to handle the large load of queries
– For decentralized control
• Of assigning unique names
• Of deploying and running DNS servers
• Caching
– Of information from each level in the hierarchy
– On behalf of a variety of users at an organization
Variable-Depth Tree
[Figure: DNS name tree with an unnamed root. Generic domains: com, edu, org. Country domains: uk, zw. Also arpa, with in-addr beneath it. Example names: my.east.bar.edu, usr.cam.ac.uk, and the in-addr.arpa subtree 12 → 34 → 56 for the prefix 12.34.56.0/24.]
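The tree can be read off a name by splitting it at the dots, from the top-level domain downward; reverse lookups use the same tree under in-addr.arpa, with the address octets reversed. A minimal sketch, assuming a made-up host address 12.34.56.78 inside the prefix shown in the figure (the function names are illustrative):

```python
import ipaddress


def ancestor_domains(name):
    """Return the domains visited when walking from the TLD down to `name`."""
    labels = name.rstrip(".").split(".")
    # Build suffixes from shortest (the TLD) to longest (the full name).
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


def reverse_lookup_name(address):
    """Return the in-addr.arpa name used to map an IPv4 address to a name."""
    return ipaddress.ip_address(address).reverse_pointer


print(ancestor_domains("my.east.bar.edu"))
print(reverse_lookup_name("12.34.56.78"))  # hypothetical host in 12.34.56.0/24
```

The octets appear reversed in the in-addr.arpa name so that the most significant part of the address sits closest to the root, matching how ordinary names are delegated.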
DNS Root Servers
• 13 root servers (see http://www.root-servers.org)
– Labeled A through M
• A: Verisign, Dulles, VA
• B: USC-ISI, Marina del Rey, CA
• C: Cogent, Herndon, VA (also Los Angeles)
• D: U Maryland, College Park, MD
• E: NASA, Mt View, CA
• F: Internet Software C., Palo Alto, CA (and 17 other locations)
• G: US DoD, Vienna, VA
• H: ARL, Aberdeen, MD
• I: Autonomica, Stockholm (plus 3 other locations)
• J: Verisign (11 locations)
• K: RIPE, London (also Amsterdam, Frankfurt)
• L: ICANN, Los Angeles, CA
• M: WIDE, Tokyo
TLD and Authoritative DNS Servers
• Top-level domain (TLD) servers
– Generic domains (e.g., com, org, edu)
– Country domains (e.g., uk, fr, ca, jp)
– Typically managed professionally
• Network Solutions maintains servers for “com”
• Educause maintains servers for “edu”
• Authoritative DNS servers
– Provide public records for hosts at an organization
– For organization’s servers (e.g., Web and mail)
– Can be maintained locally or by a service provider
Local DNS Server and End-Host Resolver
• Local DNS server (“default name server”)
– Usually near the end hosts that use it
– Local hosts configured with local server
(e.g., /etc/resolv.conf) or learn via DHCP
• End-host resolver
– Triggered by application making system call
– E.g., gethostbyname() or gethostbyaddr()
– Sends query to the local DNS server
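To make the configuration step concrete, here is a minimal sketch of how a resolver might read its local DNS servers from resolv.conf-style text. The sample contents and addresses are made up, and the parser handles only `nameserver` lines and full-line comments, which is a simplification of what a real resolver library does.

```python
def parse_nameservers(resolv_conf_text):
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and full-line comments ('#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical file contents; the addresses are documentation addresses.
SAMPLE = """\
# local DNS servers
search poly.edu
nameserver 192.0.2.53
nameserver 192.0.2.54
"""

print(parse_nameservers(SAMPLE))
```

In Python, an application reaches the end-host resolver through calls such as socket.gethostbyname() or socket.getaddrinfo(), which in turn query the servers configured this way.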
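Example
Host at cis.poly.edu wants IP address for gaia.cs.umass.edu
[Figure: message flow among the requesting host (cis.poly.edu), the local DNS server (dns.poly.edu), the root DNS server, the TLD DNS server, and the authoritative DNS server (dns.cs.umass.edu) for gaia.cs.umass.edu.]
• 1: requesting host queries local DNS server
• 2, 3: local DNS server queries root DNS server and gets a reply
• 4, 5: local DNS server queries TLD DNS server and gets a reply
• 6, 7: local DNS server queries authoritative DNS server and gets a reply
• 8: local DNS server replies to requesting host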
Recursive vs. Iterative Queries
• Recursive query
– Ask server to get answer for you
– E.g., request 1 and response 8
• Iterative query
– Ask server who to ask next
– E.g., all other request-response pairs
[Figure: same message flow as in the preceding example, among the requesting host (cis.poly.edu), local DNS server (dns.poly.edu), root DNS server, TLD DNS server, and authoritative DNS server (dns.cs.umass.edu), with messages numbered 1 through 8.]
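The difference between the two query styles can be sketched with a toy, in-memory model: the host makes one recursive request to its local server, and the local server issues iterative queries, following referrals until it gets an answer. The server tables, the "tld-edu" label, and the address 192.0.2.10 are all made up for illustration.

```python
# Toy in-memory "servers": each holds referrals (domain suffix -> next server)
# and answers (host name -> address). Nothing here touches the network.
SERVERS = {
    "root": {"referrals": {"edu": "tld-edu"}, "answers": {}},
    "tld-edu": {"referrals": {"umass.edu": "dns.cs.umass.edu"}, "answers": {}},
    "dns.cs.umass.edu": {
        "referrals": {},
        "answers": {"gaia.cs.umass.edu": "192.0.2.10"},
    },
}


def iterative_query(server_name, host):
    """One iterative query: return ("answer", ip), ("referral", server),
    or ("nxdomain", None)."""
    server = SERVERS[server_name]
    if host in server["answers"]:
        return ("answer", server["answers"][host])
    for suffix, next_server in server["referrals"].items():
        if host == suffix or host.endswith("." + suffix):
            return ("referral", next_server)
    return ("nxdomain", None)


def local_server_resolve(host, max_hops=10):
    """Handle a recursive request by issuing iterative queries from the root.

    Returns (address or None, list of servers asked).
    """
    server, asked = "root", []
    for _ in range(max_hops):
        asked.append(server)
        kind, value = iterative_query(server, host)
        if kind == "answer":
            return value, asked
        if kind == "nxdomain":
            return None, asked
        server = value  # follow the referral to the next server
    return None, asked


print(local_server_resolve("gaia.cs.umass.edu"))
```

The host sees a single request and a single response; the three iterative exchanges happen entirely between the local server and the hierarchy.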
DNS Caching
• Performing all these queries takes time
– All before the actual communication takes place
– E.g., 1 sec latency before starting Web download
• Caching can substantially reduce overhead
– The top-level servers very rarely change
– Popular sites (e.g., www.cnn.com) visited often
– Local DNS server often has the information cached
• How DNS caching works
– DNS servers cache responses to queries
– Responses include a “time to live” (TTL) field
– Server deletes the cached entry after TTL expires
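A minimal sketch of TTL-based caching, assuming a simple in-memory dictionary and an injectable clock so that expiry can be shown without waiting. The class name and the address are made up; a real server would cache whole resource records rather than bare addresses.

```python
import time


class DnsCache:
    """Cache of name -> (address, expiry time)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def put(self, name, address, ttl):
        """Store an answer together with the time at which it expires."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # TTL expired: delete the cached entry
            return None
        return address


# Usage with a fake clock so the example runs instantly.
now = [0.0]
cache = DnsCache(clock=lambda: now[0])
cache.put("www.cnn.com", "192.0.2.20", ttl=300)  # made-up address and TTL
print(cache.get("www.cnn.com"))  # within the TTL
now[0] = 301.0
print(cache.get("www.cnn.com"))  # after the TTL
```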
Negative Caching
• Remember things that don’t work
– Misspellings like www.cnn.comm and
www.cnnn.com
– These can take a long time to fail the first time
– Good to remember that they don’t work
• Benefits of negative caching
– Reduce time to respond the next time
– Avoid placing high load on other DNS servers
DNS Resource Records (RRs)
• Distributed database storing resource records
RR format: (name, value, type, ttl)
• Type=A
– name is hostname
– value is IP address
• Type=NS
– name is domain (e.g., foo.com)
– value is hostname of authoritative name server for this domain
• Type=CNAME
– name is alias name for some “canonical” (the real) name
– www.ibm.com is really east.backup2.ibm.com
• Type=MX
– value is name of the mail server associated with name
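The record types can be illustrated with a small in-memory table that uses the same (name, value, type, ttl) field order. This is a sketch only: the addresses, TTLs, and the names dns.foo.com and mail.foo.com are made up, and the lookup logic is far simpler than a real name server's.

```python
from collections import namedtuple

# Field order follows the RR format above: (name, value, type, ttl).
RR = namedtuple("RR", ["name", "value", "type", "ttl"])

RECORDS = [
    RR("www.ibm.com", "east.backup2.ibm.com", "CNAME", 3600),
    RR("east.backup2.ibm.com", "192.0.2.30", "A", 3600),  # made-up address
    RR("foo.com", "dns.foo.com", "NS", 86400),             # made-up server name
    RR("foo.com", "mail.foo.com", "MX", 3600),             # made-up server name
]


def lookup(name, rtype):
    """Return the values of all records matching a name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rtype]


def resolve_address(name, max_aliases=8):
    """Follow CNAME aliases until an A record is found."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses
        aliases = lookup(name, "CNAME")
        if not aliases:
            return []
        name = aliases[0]  # continue with the canonical name
    return []


print(resolve_address("www.ibm.com"))
print(lookup("foo.com", "MX"))
```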
Inserting Resource Records into DNS
• Example: just created startup “FooBar”
• Register foobar.com at Network Solutions
– Provide registrar with names and addresses of
your authoritative name server (primary and
secondary)
– Registrar inserts two RRs into the com TLD server:
• (foobar.com, dns1.foobar.com, NS)
• (dns1.foobar.com, 212.212.212.1, A)
• Put in authoritative server dns1.foobar.com
– Type A record for www.foobar.com
– Type MX record for foobar.com
DNS Protocol
DNS protocol: query and reply messages, both with the same message format
Message header
• Identification: 16-bit number for query; reply to query uses the same number
• Flags:
– Query or reply
– Recursion desired
– Recursion available
– Reply is authoritative
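A sketch of how the header fields might be packed and unpacked. The slide lists only the identification and flags; the four 16-bit count fields and the flag bit positions used here follow the standard header layout in RFC 1035 and are not taken from the slide. The function names are illustrative.

```python
import struct

# Flag bits within the 16-bit flags word (RFC 1035 layout).
QR_REPLY = 0x8000                # 0 = query, 1 = reply
AA_AUTHORITATIVE = 0x0400        # reply is authoritative
RD_RECURSION_DESIRED = 0x0100
RA_RECURSION_AVAILABLE = 0x0080


def pack_header(ident, flags, qdcount=1, ancount=0, nscount=0, arcount=0):
    """Pack the 12-byte header: six 16-bit fields in network byte order."""
    return struct.pack("!6H", ident, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data):
    """Decode the identification and the flags discussed above."""
    ident, flags, *counts = struct.unpack("!6H", data[:12])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AA_AUTHORITATIVE),
        "recursion_desired": bool(flags & RD_RECURSION_DESIRED),
        "recursion_available": bool(flags & RA_RECURSION_AVAILABLE),
        "counts": counts,
    }


query_header = pack_header(0x1234, RD_RECURSION_DESIRED)
print(unpack_header(query_header))
```

A reply would carry the same identification (0x1234 here) with the query/reply bit set, which is how the sender matches it to the outstanding query.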
Reliability
• DNS servers are replicated
– Name service available if at least one replica is up
– Queries can be load balanced between replicas
• UDP used for queries
– Need reliability: must implement this on top of
UDP
• Try alternate servers on timeout
– Exponential backoff when retrying same server
• Same identifier for all queries
– Don’t care which server responds
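One way to picture the retry policy is as a schedule of (server, timeout) attempts: rotate through the replicas, and double the timeout each time the same server comes around again. This is a sketch under assumed parameters (three rounds, a 1-second initial timeout); `send_query` is a hypothetical callable standing in for the UDP exchange.

```python
def retry_schedule(servers, rounds=3, initial_timeout=1.0):
    """Yield (server, timeout) attempts.

    Each round tries every server once; the timeout doubles per round, so a
    given server is retried with exponential backoff.
    """
    for attempt in range(rounds):
        timeout = initial_timeout * (2 ** attempt)
        for server in servers:
            yield server, timeout


def query_with_retries(send_query, servers, **kwargs):
    """Try each scheduled attempt until one returns a reply.

    `send_query(server, timeout)` is a hypothetical callable that returns a
    reply or raises TimeoutError. The same query (and identifier) is reused
    for every attempt, since any replica's answer is acceptable.
    """
    for server, timeout in retry_schedule(servers, **kwargs):
        try:
            return send_query(server, timeout)
        except TimeoutError:
            continue  # move on to the next server or round
    return None


print(list(retry_schedule(["192.0.2.53", "192.0.2.54"], rounds=2)))
```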
Reliability: IP Anycast
• Multiple replicas with same IP address
– Replicas located at multiple geographic locations
– Routing system directs query to “closest” replica
• Used especially for the root DNS servers
– Can add more servers and locations without
adding new IP addresses for the root servers
[Figure: two root server replicas at different locations, each with the same address 1.2.3.4 in the prefix 1.2.3.0/24.]
Bogus Queries at Root Server (Wessels03 Paper)
• Many kinds of bogus queries
– Undefined DNS query types
– Name-to-address queries on IP addresses
– Unknown TLD (e.g., “.elvis”) or ill-formed address
(e.g., “209.17.66.80.196.200.64.in-addr.arpa”)
– Queries on private IP addresses (e.g., 10.0.0.0)
– Repeated queries (e.g., retransmissions due to
packet filters dropping the DNS responses)
• Less than 2% of queries were legitimate!
DNS Security
DNS Cache Poisoning
• Suppose an attacker owns sub.example.com
– And wants to control wikipedia.org’s domain
• Receives a legitimate request for the address
records of sub.example.com
– “sub.example.com IN A”
• Redirects to target domain & assigns address
– “example.com. 3600 IN NS ns.wikipedia.org”
– “ns.wikipedia.org IN A w.x.y.z” (a glue record)
• Vulnerable server caches additional A record
– Now attacker, who controls w.x.y.z can resolve
queries for the entire wikipedia.org domain
DNS Cache Poisoning (Continued)
• DNS forgery is another approach
– Beating the real answer to a recursive DNS query
• DNS server tries to resolve www.wikipedia.org
– Attacker sends a forged response
– Challenging: needs to match 16-bit ID and port #
• Overcoming the challenges
– Some servers increment the ID and use a fixed port
– Some servers accept queries from anyone
• So attacker can send *queries* to the server for
www.wikipedia.org to trigger the server to make more
queries of its own
Preventing DNS Cache Poisoning
• Making DNS servers less trusting
– Ignore records not directly relevant to the query
• Making it harder to guess query ID
– Using cryptographically secure random numbers
– (Some early servers used bad random number generators)
• Disallowing DNS queries from outsiders
– Filtering DNS queries based on source IP address
• Ensuring the authenticity of the data
– DNSSEC using digital signatures (not widely deployed)
• Ensuring the right transport or application connection to avoid talking to the wrong endpoint
– Using HTTPS or SSH with digital certificates
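The first defense, ignoring records not directly relevant to the query, can be sketched as a filter that keeps only records whose names fall inside the zone the responding server is responsible for. The function names and the (name, type, value) tuple layout are illustrative; real resolvers apply more detailed rules.

```python
def in_zone(record_name, zone):
    """True if record_name is the zone itself or a name beneath it."""
    record_name = record_name.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    return record_name == zone or record_name.endswith("." + zone)


def filter_records(records, zone):
    """Keep only (name, type, value) records whose name lies inside `zone`."""
    return [record for record in records if in_zone(record[0], zone)]


# The response from the cache-poisoning example: the server for example.com
# answers a query about sub.example.com with an unrelated glue record.
response = [
    ("example.com.", "NS", "ns.wikipedia.org"),
    ("ns.wikipedia.org", "A", "w.x.y.z"),
]
print(filter_records(response, "example.com"))
```

With this check, the A record for ns.wikipedia.org is dropped because the server for example.com has no authority over names in wikipedia.org.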
Recent DNS Attack
• Poisoning authoritative records
– For the entire domain (e.g., bankofsteve.com)
– Rather than an individual address
• Even against well-protected servers
– E.g., those that randomize the query ID
– By sending many, many requests to the server
• Need to make sure query results aren’t cached
– Send many queries for random domain names
– E.g., www12345678.bankofsteve.com
– Attack can be successful within (say) 10 seconds
• The patch: randomize source port number, too
http://unixwiz.net/techtips/iguide-kaminsky-dns-vuln.html
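A rough calculation shows why randomizing the source port helps. The sketch below assumes each forged reply is an independent, uniform guess over the 16-bit ID (and over the port, when randomized); the figures of 100,000 forged replies and 60,000 usable ports are assumptions chosen for illustration, not measurements.

```python
def forgery_success_probability(forged_replies, id_bits=16, port_choices=1):
    """Chance that at least one forged reply matches the outstanding query.

    Assumes uniform, independent guesses over the ID space multiplied by
    the number of possible source ports.
    """
    search_space = (2 ** id_bits) * port_choices
    return 1.0 - (1.0 - 1.0 / search_space) ** forged_replies


fixed_port = forgery_success_probability(100_000)
random_port = forgery_success_probability(100_000, port_choices=60_000)
print(f"fixed source port:      {fixed_port:.3f}")
print(f"randomized source port: {random_port:.6f}")
```

With a fixed port the attacker only has to cover 65,536 possibilities, so a large burst of forged replies is likely to hit; randomizing the port multiplies the search space by the number of ports in use.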
DNS Attacks on Edge Networks
• Many end hosts check local “hosts” file first
– … before sending queries to local DNS server
• Malware can add entries to this file
– … to direct certain domains to different addresses
• Many home networks have a local DNS server
– … running on a local network router
• Attacker can compromise the router
– … and reconfigure the next DNS server
– … or completely overwrite the firmware
DNS-Based Load Balancing
Directing Web Clients to Replicas: Different Names
• Simple approach: different names
– www1.cnn.com, www2.cnn.com, www3.cnn.com
– But, this requires users to select specific replicas
[Figure: two Web server replicas, www1 and www2.]
Directing Web Clients to Replicas: Different Addresses
• More elegant approach: different IP addresses
– Single name (e.g., www.cnn.com), multiple addresses
– E.g., 64.236.16.20, 64.236.16.52, 64.236.16.84, …
• Authoritative DNS server returns many addresses
– And the local DNS server selects one address
– Authoritative server may vary the order of addresses
[Figure: two Web server replicas at addresses 1.2.3.4 and 5.6.7.8.]
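Varying the order of addresses can be sketched as a simple rotation: every reply contains all addresses, but each query sees a different one first. The class name is illustrative, and real authoritative servers may use other orderings (for example, random shuffles).

```python
from collections import deque


class RoundRobinRecord:
    """Return all addresses for a name, rotating the order on every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        reply = list(self._addresses)
        self._addresses.rotate(-1)  # next query starts with the next address
        return reply


www = RoundRobinRecord(["64.236.16.20", "64.236.16.52", "64.236.16.84"])
for _ in range(3):
    print(www.answer())
```

If clients tend to use the first address in the reply, rotation spreads successive queries across the replicas.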
Directing Web Clients to Replicas: Finer Control
• Web sites need greater flexibility
– For load balancing over the Web server replicas
– Directing Web clients to the closest server
– Directing clients to customized version of content
• Different DNS responses to different queries!
[Figure: two Web server replicas at addresses 1.2.3.4 and 5.6.7.8.]
Challenges of Fine-Grain Control
• Frequent modification to DNS records
– To exercise fine-grain control
– To remove IP addresses for failed replicas
• Inferring the Web client location
– Based on the IP address of local DNS server
– And mapping to topological or geographic location
• Caching of query results at the local DNS server
– Sending the same cached result to many users
• Even setting a small TTL is not fully effective
– Many Web browsers cache the resolved address
– And smaller TTLs add latency and DNS server load
• Load balancing at machine level, not Web object
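Inferring client location from the local DNS server's address might look like the sketch below: the authoritative server keeps a table from resolver prefixes to replicas and answers accordingly. The prefixes, the table, and the default choice are all hypothetical; the replica addresses reuse those shown in the figures.

```python
import ipaddress

# Hypothetical mapping from local-DNS-server prefixes to the nearest replica.
REPLICA_BY_PREFIX = [
    (ipaddress.ip_network("198.51.100.0/24"), "1.2.3.4"),
    (ipaddress.ip_network("203.0.113.0/24"), "5.6.7.8"),
]
DEFAULT_REPLICA = "1.2.3.4"  # used when the resolver's location is unknown


def choose_replica(resolver_ip):
    """Pick a replica based on the address of the querying local DNS server."""
    address = ipaddress.ip_address(resolver_ip)
    for prefix, replica in REPLICA_BY_PREFIX:
        if address in prefix:
            return replica
    return DEFAULT_REPLICA


print(choose_replica("203.0.113.9"))
print(choose_replica("192.0.2.1"))
```

The decision is keyed on the resolver, not the Web client, so every user behind the same local DNS server gets the same answer, and that answer is then cached for the TTL.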
Beyond Today’s Naming and DNS
Problems with DNS and Naming/Addressing
• Many levels of lookup are slow
– Sometimes > 1 sec when all queries miss in cache
• Cache expiry is clumsy
– Low TTL leads to poor scaling and higher delays
– High TTL leads to slow failover and poor control
• Operates at the level of host names/addresses
– Yet many apps (like CDNs) care about objects
• Increasingly an address is not a host anyway
– Multiple servers (anycast), front-end for a load
balancer, NAT box, …
Questions
• Is hierarchical allocation necessary?
– E.g., to ensure uniqueness?
• Is hierarchical lookup necessary?
– E.g., for scalability?
• Are mnemonic names necessary?
– E.g., for human readability?
• Should the name correspond to a host?
– E.g., rather than to an object?
• Should the lookup map to a machine address?
– E.g., rather than to a direction to follow?