EE 122: Computer Networks

DNS: Domain Name System
EE 122: Intro to Communication Networks
Fall 2010 (MW 4-5:30 in 101 Barker)
Scott Shenker
TAs: Sameer Agarwal, Sara Alspaugh, Igor Ganichev, Prayag Narula
http://inst.eecs.berkeley.edu/~ee122/
Materials with thanks to Jennifer Rexford, Ion Stoica, Vern Paxson
and other colleagues at Princeton and UC Berkeley
1
Announcements
• HW #2 Problem 7 has been corrected
2
Goals of Today’s Lecture
• Finish Transport
– We left off talking about UDP
– Ready to move on to TCP
• Concepts & principles underlying the Domain
Name System (DNS)
• Inner workings of DNS
• Security problems with DNS
3
Transmission Control Protocol (TCP)
• Connection oriented
– Explicit set-up and tear-down of TCP session
• Stream-of-bytes service
– Sends and receives a stream of bytes, not messages
• Congestion control
– Dynamic adaptation to network path’s capacity
• Reliable, in-order delivery
– TCP tries very hard to ensure byte stream (eventually)
arrives intact
o In the presence of corruption and loss
• Flow control
– Ensure that sender doesn’t overwhelm receiver
4
Reliable Delivery
• How do we design for reliable delivery?
– How do you converse on a noisy cell phone connection?
• Positive acknowledgment (“Ack”)
– Explicit confirmation by receiver
o On cell phone, “OK”
o But how do you know you heard correctly?
– TCP acknowledgments are cumulative (“I’ve received
everything up through sequence #N”)
o With an option for acknowledging individual segments (“SACK”)
• Negative acknowledgment (“Nack”)
– “I’m missing the following: …”
– How might the receiver tell something’s missing?
Can they always do this?
– (Only used by TCP in implicit fashion - “fast retransmit”)
5
Reliable Delivery, con’t
• Timeout
– If haven’t heard anything from receiver, send again
– Problem: for how long do you wait?
o TCP uses function of estimated RTT
– Problem: what if no Ack for retransmission?
o TCP (and other schemes) employs exponential backoff
o Double timer up to maximum - tapers off load during congestion
• A very different approach to reliability: send
redundant data
– Cell phone analogy: “Meet me at 3PM - repeat 3PM”
– Forward error correction
– Recovers from lost data nearly immediately!
– But: only can cope with a limited degree of loss
– And: adds load to the network (interesting tradeoff)
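The exponential backoff rule above ("double timer up to maximum") can be sketched in a few lines. This is only an illustration: the function name and the initial and maximum timer values (1 s and 60 s) are assumptions, not TCP's actual parameters.

```python
def backoff_schedule(initial_rto, max_rto, retries):
    """Return the timeout used for each successive retransmission attempt."""
    timeouts = []
    rto = initial_rto
    for _ in range(retries):
        timeouts.append(rto)
        # Double the timer after every unanswered retransmission,
        # but never let it grow past the maximum.
        rto = min(rto * 2, max_rto)
    return timeouts


# Assumed values: 1 s initial timer, 60 s cap, 8 attempts.
print(backoff_schedule(1.0, 60.0, 8))
# prints [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

Each unanswered retransmission waits twice as long as the previous one, which is what tapers off the load a sender puts on a congested path.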
6
TCP Support for Reliable Delivery
• Sequence numbers
– Used to detect missing data
– ... and for putting the data back in order
• Checksum
– Used to detect corrupted data at the receiver
– …leading the receiver to drop the packet
– No error signal sent - recovery via normal retransmission
• Retransmission
– Sender retransmits lost or corrupted data
– Timeout based on estimates of round-trip time (RTT)
– Fast retransmit algorithm for rapid retransmission
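To make the checksum bullet concrete, here is a minimal sketch of the 16-bit ones' complement Internet checksum. It covers only the sum over the bytes it is given; the real TCP checksum also covers a pseudo-header, which is left out here. The function names are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def looks_intact(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute and compare; on mismatch, drop."""
    return internet_checksum(data) == checksum


segment = b"hello, world"
checksum = internet_checksum(segment)
print(looks_intact(segment, checksum))          # intact data passes
print(looks_intact(b"hellp, world", checksum))  # corrupted data fails
```

When the check fails the receiver simply drops the segment and sends no error signal; the sender's normal retransmission recovers the data.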
7
Efficient Transport Reliability
8
Automatic Repeat reQuest (ARQ)
• Automatic Repeat Request
– Receiver sends acknowledgment (ACK) when it receives packet
– Sender waits for ACK and times out if does not arrive within some time period
• Simplest ARQ protocol
– Stop and Wait
– Send a packet, stop and wait until ACK arrives
[Figure: time diagram between Sender and Receiver, with the sender's Timeout marked]
9
How Fast Can Stop-and-Wait Go?
• Suppose we’re sending from UCB to New York:
– Bandwidth = 1 Mbps (megabits/sec)
– RTT = 100 msec
– Maximum Transmission Unit (MTU) = 1500 B = 12,000 b
– No other load on the path and no packet loss
• What (approximately) is the fastest we can
transmit using Stop-and-Wait?
• How about if Bandwidth = 1 Gbps?
10
Computation
• Latency: 100msec = .1sec
• Transmission time of data packet:
– 12000bits/(1000000bits/sec) = .012sec
• Throughput = 12000bits/.112sec ≈ 110kbits/sec
• With linespeed of 1Gbits/sec
– Transmission time negligible
– Throughput ≈ 120kbits/sec
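The computation above can be checked with a short script. It is a sketch under the slide's assumptions (one 12,000-bit packet per round trip, no loss, ACK transmission time ignored); the function name is illustrative.

```python
def stop_and_wait_throughput(packet_bits, bandwidth_bps, rtt_s):
    """Bits per second when only one packet may be outstanding at a time."""
    transmission_time = packet_bits / bandwidth_bps
    # One packet is delivered per (transmission time + round-trip time).
    return packet_bits / (transmission_time + rtt_s)


for bandwidth in (1e6, 1e9):  # 1 Mbps and 1 Gbps
    rate = stop_and_wait_throughput(12_000, bandwidth, 0.1)
    print(f"{bandwidth:.0e} bps link: {rate / 1e3:.1f} kbits/sec")
```

Raising the link speed a thousandfold barely changes the result, because the 100 msec round trip dominates the per-packet time.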
11
Allowing Multiple Packets in Flight
• “In Flight” = “Unacknowledged”
• Sender-side issue: how many packets (bytes)?
• Receiver-side issue: how much buffer for data
that’s “above a sequence hole”?
– I.e., data that can’t be delivered since previous data is
missing
– Assumes service model is in-order delivery (like TCP)
12
Sliding Window
• Allow a larger amount of data “in flight”
– Allow sender to get ahead of the receiver
– … though not too far ahead
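[Figure: sender side (sending process, TCP buffer) with markers “Last byte ACKed”, “Last byte can send”, “Last byte written” and the Sender Window; receiver side (receiving process, TCP buffer) with markers “Last byte read”, “Next byte needed”, “Last byte received” and the Receiver Window]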
13
Sliding Window, con’t
• Both sender & receiver maintain a window that
governs amount of data in flight (sender) or not-yet-delivered (receiver)
• Left edge of window:
– Sender: beginning of unacknowledged data
– Receiver: beginning of undelivered data
• For the sender:
– Window size = maximum amount of data in flight
o Determines rate
o Sender must have at least this much buffer (maybe more)
• For the receiver:
– Window size = maximum amount of undelivered data
o Receiver has this much buffer
14
Sliding Window
• For the sender, when receives an
acknowledgment for new data, window advances
(slides forward)
[Figure: sender side (sending process, TCP buffer) with markers “Last byte ACKed”, “Last byte can send”, “Last byte written”; the Sender Window slides forward]
15
Sliding Window
• For the receiver, as the receiving process
consumes data, the window slides forward
[Figure: receiver side (receiving process, TCP buffer) with markers “Last byte read”, “Next byte needed”, “Last byte received”; the Receiver Window slides forward]
17
Sliding Window, con’t
• Sender: window advances when new data ack’d
• Receiver: window advances as receiving process
consumes data
• What happens if sender’s window size exceeds
the receiver’s window size?
• Receiver advertises to the sender where the
receiver window currently ends (“righthand edge”)
– Sender agrees not to exceed this amount
– It makes sure by setting its own window size to a value
that can’t send beyond the receiver’s righthand edge
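As a sketch of the rule above, the sender can compute how much more it may transmit from its own window and the receiver's advertised right-hand edge. The function and parameter names are hypothetical, and byte positions are treated as plain integers (no sequence-number wraparound).

```python
def usable_window(last_byte_acked, last_byte_sent,
                  sender_window, advertised_right_edge):
    """Bytes the sender may still transmit right now."""
    # The sender's own limit: at most sender_window bytes past the last ACK.
    sender_limit = last_byte_acked + sender_window
    # Never send beyond the receiver's advertised right-hand edge.
    last_byte_can_send = min(sender_limit, advertised_right_edge)
    return max(0, last_byte_can_send - last_byte_sent)


# The receiver's edge (5000) is tighter than the sender's own limit (9000).
print(usable_window(1000, 3000, 8000, 5000))   # prints 2000
# With a generous receiver, the sender's own window is the limit.
print(usable_window(1000, 3000, 8000, 20000))  # prints 6000
```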
19
Performance with Sliding Window
• Given previous UCB → New York 1 Mbps path
with 100 msec RTT
and Sender (and Receiver) window = 100 Kb = 12.5 KB
• How fast can we transmit?
20
Computation
• Ignoring per-packet transmission time:
• Throughput = 100000bits/.1sec ≈ 1Mbps
– Link is fully utilized!
– “Pipe is filled”
• With linespeed of 1Gbits/sec, still 1Mbps
• What size window would reach 1Gbps?
• Bandwidth-delay product
• 1 Gbps * 100 msec = 100 Mb
• Note: large window = many packets in flight
21
Summary
• IP packet forwarding
– Based on longest-prefix match
– End systems use subnet mask to determine if traffic
destined for their LAN …
o In which case they send directly, using ARP to find MAC address
– … or for some other network
o In which case they send to their local gateway (router)
– This info either statically config’d or learned via DHCP
• Transport protocols
– Multiplexing and demultiplexing via port numbers
– UDP gives simple datagram service
– TCP gives reliable byte-stream service
– Reliability immediately raises performance issues
o Stop-and-Wait vs. Sliding Window
22
DNS
23
Host Names vs. IP addresses
• Host names
–Mnemonic name appreciated by humans
–Variable length, full alphabet of characters
–Provide little (if any) information about location
–Examples: www.cnn.com and bbc.co.uk
• IP addresses
–Numerical address appreciated by routers
–Fixed length, binary number
–Hierarchical, related to host location
–Examples: 64.236.16.20 and 212.58.224.131
24
Separating Naming and Addressing
• Names are easier to remember
– www.cnn.com vs. 64.236.16.20 (but not tiny urls)
• Addresses can change underneath
– Move www.cnn.com to 4.125.91.21
– E.g., renumbering when changing providers
• Name could map to multiple IP addresses
– www.cnn.com to multiple (8) replicas of the Web site
– Enables
o Load-balancing
o Reducing latency by picking nearby servers
o Tailoring content based on requester’s location/identity
• Multiple names for the same address
– E.g., aliases like www.cnn.com and cnn.com
25
Scalable (Name → Address) Mappings
• Originally: per-host file
–Flat namespace
–/etc/hosts
–SRI (Menlo Park) kept master copy
–Downloaded regularly
• Single server doesn’t scale
–Traffic implosion (lookups & updates)
–Single point of failure
–Amazing politics
Needed a distributed, hierarchical collection of servers
26
Domain Name System (DNS)
• Properties of DNS
–Hierarchical name space divided into zones
–Zones distributed over collection of DNS servers
• Hierarchy of DNS servers
–Root (hardwired into other servers)
–Top-level domain (TLD) servers
–Authoritative DNS servers
• Performing the translations
–Local DNS servers
–Resolver software
27
Distributed Hierarchical Database
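[Figure: DNS name tree. The unnamed root sits above the Top-Level Domains (TLDs): generic domains com, edu, org; country domains ac, uk, zw; and arpa. Nodes shown below the TLDs: bar, west, east, foo, my (example name my.east.bar.edu); ac, cam, usr (example name usr.cam.ac.uk); and in-addr under arpa]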
28
DNS Root
• Located in Virginia, USA
• How do we make the root scale?
Verisign, Dulles, VA
29
DNS Root Servers
• 13 root servers (see http://www.root-servers.org/)
– Labeled A through M
• Does this scale?
A Verisign, Dulles, VA
B USC-ISI, Marina del Rey, CA
C Cogent, Herndon, VA
D U Maryland, College Park, MD
E NASA, Mt View, CA
F Internet Software Consortium, Palo Alto, CA
G US DoD, Vienna, VA
H ARL, Aberdeen, MD
I Autonomica, Stockholm
J Verisign
K RIPE, London
L ICANN, Los Angeles, CA
M WIDE, Tokyo
30
DNS Root Servers
• 13 root servers (see http://www.root-servers.org/)
– Labeled A through M
• Replication via any-casting (localized routing for addresses)
A Verisign, Dulles, VA
B USC-ISI, Marina del Rey, CA
C Cogent, Herndon, VA (also Los Angeles, NY, Chicago)
D U Maryland, College Park, MD
E NASA, Mt View, CA
F Internet Software Consortium, Palo Alto, CA (and 37 other locations)
G US DoD, Vienna, VA
H ARL, Aberdeen, MD
I Autonomica, Stockholm (plus 29 other locations)
J Verisign (21 locations)
K RIPE, London (plus 16 other locations)
L ICANN, Los Angeles, CA
M WIDE, Tokyo (plus Seoul, Paris, San Francisco)
31
TLD and Authoritative DNS Servers
• Top-level domain (TLD) servers
– Generic domains (e.g., com, org, edu)
– Country domains (e.g., uk, fr, cn, jp)
– Special domains (e.g., arpa)
– Typically managed professionally
o Network Solutions maintains servers for “com”
o Educause maintains servers for “edu”
• Authoritative DNS servers
– Provide public records for hosts at an organization
– For the organization’s servers (e.g., Web and mail)
– Can be maintained locally or by a service provider
32
Question
• Could we replace DNS with a Google-like
infrastructure?
33
Using DNS
• Local DNS server (“default name server”)
–Usually near the endhosts that use it
–Local hosts configured with local server (e.g.,
/etc/resolv.conf) or learn server via DHCP
• Client application
–Extract server name (e.g., from the URL)
–Do gethostbyname() to trigger resolver code
• Server application
–Extract client IP address from socket
–Optional gethostbyaddr() to translate into name
34
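A minimal sketch of the client-side steps above, using Python's standard socket module, whose gethostbyname() hands the lookup to the system resolver. The helper names extract_host and resolve are hypothetical.

```python
import socket
from urllib.parse import urlparse


def extract_host(url):
    """Extract the server name from a URL."""
    return urlparse(url).hostname


def resolve(url):
    """Hand the server name to the resolver; raises OSError on failure."""
    return socket.gethostbyname(extract_host(url))


print(extract_host("http://inst.eecs.berkeley.edu/~ee122/"))
# prints inst.eecs.berkeley.edu
# resolve("http://inst.eecs.berkeley.edu/~ee122/") would then query the
# locally configured DNS server (it needs network access to succeed).
```

On the server side, socket.gethostbyaddr() plays the role of the optional reverse translation from the client's IP address to a name.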
Example
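Host at cis.poly.edu wants IP address for gaia.cs.umass.edu
[Figure: numbered message sequence among the requesting host (cis.poly.edu), the local DNS server (dns.poly.edu), the root DNS server, the TLD DNS server, and the authoritative DNS server (dns.cs.umass.edu) for gaia.cs.umass.edu]
– 1: requesting host → local DNS server
– 2, 3: local DNS server → root DNS server, and its reply
– 4, 5: local DNS server → TLD DNS server, and its reply
– 6, 7: local DNS server → authoritative DNS server, and its reply
– 8: local DNS server → requesting host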
35
Recursive vs. Iterative Queries
• Recursive query
– Ask server to get answer for you
– E.g., request 1 and response 8
• Iterative query
– Ask server who to ask next
– E.g., all other request-response pairs
[Figure: the numbered message sequence from the previous example, among the requesting host (cis.poly.edu), the local DNS server (dns.poly.edu), the root DNS server, the TLD DNS server, and the authoritative DNS server (dns.cs.umass.edu)]
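The difference between the two query styles can be sketched with a toy, in-memory model. Everything here is hypothetical: the server labels "root" and "tld-edu", the table layout, and the address 192.0.2.10 (taken from a range reserved for documentation) are made up for illustration.

```python
# Each toy "server" maps a name suffix either to a referral ("NS", next
# server to ask) or to a final answer ("A", address).
SERVERS = {
    "root": {"edu": ("NS", "tld-edu")},
    "tld-edu": {"umass.edu": ("NS", "dns.cs.umass.edu")},
    "dns.cs.umass.edu": {"gaia.cs.umass.edu": ("A", "192.0.2.10")},
}


def iterative_query(server, name):
    """One iterative query: the server answers or says who to ask next."""
    records = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):  # try the longest suffix first
        suffix = ".".join(labels[i:])
        if suffix in records:
            return records[suffix]
    return None


def recursive_lookup(name):
    """What the local DNS server does for a client's recursive query."""
    server, path = "root", ["root"]
    while True:
        record = iterative_query(server, name)
        if record is None:
            return None, path
        rtype, value = record
        if rtype == "A":
            return value, path
        server = value  # follow the referral
        path.append(server)


print(recursive_lookup("gaia.cs.umass.edu"))
# prints ('192.0.2.10', ['root', 'tld-edu', 'dns.cs.umass.edu'])
```

The client sees a single request and response (1 and 8 in the figure), while the local server performs the chain of iterative exchanges on its behalf.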
36
Reverse Mapping (Address → Host)
• How do we go the other direction, from an IP
address to the corresponding hostname?
• Addresses already have natural “quad” hierarchy:
– 12.34.56.78
• But: quad notation has most-sig. hierarchy element
on left, while www.cnn.com has it on the right
• Idea: reverse the quads = 78.56.34.12 …
– … and look that up in the DNS
• Under what TLD?
– Convention: in-addr.arpa
– So lookup is for 78.56.34.12.in-addr.arpa
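The quad reversal is easy to sketch. The function name reverse_name is illustrative; the second print uses the standard library's ipaddress module, which provides the same mapping as the reverse_pointer attribute.

```python
import ipaddress


def reverse_name(ipv4):
    """Build the in-addr.arpa name used for a reverse (PTR) lookup."""
    quads = ipv4.split(".")
    return ".".join(reversed(quads)) + ".in-addr.arpa"


print(reverse_name("12.34.56.78"))
# prints 78.56.34.12.in-addr.arpa
print(ipaddress.ip_address("12.34.56.78").reverse_pointer)
# prints 78.56.34.12.in-addr.arpa
```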
37
Distributed Hierarchical Database
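[Figure: the DNS name tree from the earlier "Distributed Hierarchical Database" slide (root; generic domains com, edu, org; country domains ac, uk, zw; arpa), now extended below arpa: in-addr → 12 → 34 → 56, the node that corresponds to the address block 12.34.56.0/24]
38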
DNS Caching
• Performing all these queries takes time
– And all this before actual communication takes place
– E.g., 1-second latency before starting Web download
• Caching can greatly reduce overhead
– The top-level servers very rarely change
– Popular sites (e.g., www.cnn.com) visited often
– Local DNS server often has the information cached
• How DNS caching works
– DNS servers cache responses to queries
– Responses include a “time to live” (TTL) field
– Server deletes cached entry after TTL expires
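A minimal sketch of TTL-based caching as described above. The class and method names are illustrative, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time


class DnsCache:
    """Cache of name -> value where each entry expires after its TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expiry time)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached: a real server would query upstream
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # TTL expired: delete the cached entry
            return None
        return value


now = [0.0]  # fake clock so the example runs instantly
cache = DnsCache(clock=lambda: now[0])
cache.put("www.cnn.com", "64.236.16.20", ttl=300)
print(cache.get("www.cnn.com"))  # prints 64.236.16.20
now[0] = 300.0
print(cache.get("www.cnn.com"))  # prints None
```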
39
Negative Caching
• Remember things that don’t work
– Misspellings like www.cnn.comm and www.cnnn.com
– These can take a long time to fail the first time
– Good to remember that they don’t work
– … so the failure takes less time the next time around
• But: negative caching is optional
– And not widely implemented
40
DNS Resource Records
DNS: distributed DB storing resource records (RR)
RR format: (name, value, type, ttl)
• Type=A
– name is hostname
– value is IP address
• Type=NS
– name is domain (e.g. foo.com)
– value is hostname of authoritative name server for this domain
• Type=PTR
– name is reversed IP quads
o E.g. 78.56.34.12.in-addr.arpa
– value is corresponding hostname
• Type=CNAME
– name is alias name for some “canonical” name
o E.g., www.cs.mit.edu is really eecsweb.mit.edu
– value is canonical name
• Type=MX
– value is name of mailserver associated with name
– Also includes a weight/preference
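The (name, value, type, ttl) tuples map naturally onto a small record type. The sketch below is illustrative only: the TTL values, the PTR hostname, and the mail server name are hypothetical, and the MX weight/preference is omitted for brevity.

```python
from collections import namedtuple

# Field order follows the slide's RR format: (name, value, type, ttl).
ResourceRecord = namedtuple("ResourceRecord", ["name", "value", "type", "ttl"])

RECORDS = [
    ResourceRecord("www.foobar.com", "212.44.9.144", "A", 300),
    ResourceRecord("foobar.com", "dns1.foobar.com", "NS", 600),
    ResourceRecord("www.cs.mit.edu", "eecsweb.mit.edu", "CNAME", 3600),
    # Hypothetical hostname and mail server, for illustration only.
    ResourceRecord("78.56.34.12.in-addr.arpa", "host.example.com", "PTR", 3600),
    ResourceRecord("foobar.com", "mail.foobar.com", "MX", 3600),
]


def lookup(records, name, rtype):
    """Return the values of all records matching both name and type."""
    return [r.value for r in records if r.name == name and r.type == rtype]


print(lookup(RECORDS, "foobar.com", "NS"))         # prints ['dns1.foobar.com']
print(lookup(RECORDS, "www.cs.mit.edu", "CNAME"))  # prints ['eecsweb.mit.edu']
```

Note that the same name (foobar.com here) can carry records of several types, which is why a lookup is keyed on both name and type.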
41
DNS Protocol
DNS protocol: query and reply messages, both with same message format
Message header:
• Identification: 16 bit # for query, reply to query uses same #
• Flags:
– Query or reply
– Recursion desired
– Recursion available
– Reply is authoritative
• Plus fields indicating size (0 or more) of optional header elements
Message layout (each header row holds two 16-bit fields):
  Identification | Flags
  # Questions | # Answer RRs
  # Authority RRs | # Additional RRs
  Questions (variable # of resource records)
  Answers (variable # of resource records)
  Authority (variable # of resource records)
  Additional information (variable # of resource records)
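The fixed part of the header (six 16-bit fields) can be packed with Python's struct module. This is a sketch: the helper names are illustrative, and the flag bit positions are the ones defined in RFC 1035 rather than anything stated on the slide.

```python
import struct

# Flag bits (positions per RFC 1035).
QR_REPLY = 0x8000  # message is a reply (0 = query)
AA = 0x0400        # reply is authoritative
RD = 0x0100        # recursion desired
RA = 0x0080        # recursion available

HEADER = struct.Struct("!6H")  # six 16-bit fields, network byte order


def pack_header(ident, flags, questions=0, answers=0, authority=0, additional=0):
    """Identification, Flags, then the four section counts."""
    return HEADER.pack(ident, flags, questions, answers, authority, additional)


def unpack_header(data):
    return HEADER.unpack(data[:HEADER.size])


query = pack_header(0x1234, RD, questions=1)
print(len(query), query.hex())
# prints 12 123401000001000000000000
print(unpack_header(query))
# prints (4660, 256, 1, 0, 0, 0)
```

A reply would reuse the same identification value and set QR_REPLY, which is how the querier matches replies to outstanding queries.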
42
Reliability
• DNS servers are replicated
– Name service available if at least one replica is up
– Queries can be load-balanced between replicas
• Usually, UDP used for queries
– Need reliability: must implement this on top of UDP
– Spec supports TCP too, but not always implemented
• Try alternate servers on timeout
– Exponential backoff when retrying same server
• Same identifier for all queries
– Don’t care which server responds
43
Inserting Resource Records into DNS
• Example: just created startup “FooBar”
• Get a block of address space from ISP
– Say 212.44.9.128/25
• Register foobar.com at Network Solutions (say)
– Provide registrar with names and IP addresses of your
authoritative name server (primary and secondary)
– Registrar inserts RR pairs into the com TLD server:
o (foobar.com, dns1.foobar.com, NS)
o (dns1.foobar.com, 212.44.9.129, A)
• Put in your (authoritative) server
dns1.foobar.com:
– Type A record for www.foobar.com
– Type MX record for foobar.com
44
Setting up foobar.com, con’t
• In addition, need to provide reverse PTR bindings
– E.g., 212.44.9.129 → dns1.foobar.com
• Normally, these would go in 9.44.212.in-addr.arpa
• Problem: you can’t run the name server for that
domain. Why not?
– Because your block is 212.44.9.128/25, not
212.44.9.0/24
– And whoever has 212.44.9.0/25 won’t be happy with you
owning their PTR records
• Solution: ISP runs it for you
– Now it’s more of a headache to keep it up-to-date :-(
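The delegation problem comes down to prefix arithmetic, which the standard ipaddress module can illustrate. The variable names are hypothetical; the prefixes are the ones from the slide.

```python
import ipaddress

block = ipaddress.ip_network("212.44.9.128/25")   # FooBar's addresses
zone = ipaddress.ip_network("212.44.9.0/24")      # 9.44.212.in-addr.arpa
neighbor = ipaddress.ip_network("212.44.9.0/25")  # someone else's half

# FooBar's block is only half of the /24 that the reverse zone covers.
print(block.subnet_of(zone))                     # prints True
print(block.num_addresses, zone.num_addresses)   # prints 128 256
# The other half belongs to a different customer of the ISP.
print(list(zone.subnets(new_prefix=25)) == [neighbor, block])  # prints True
```

Because in-addr.arpa names split only at whole-octet boundaries, the smallest zone containing FooBar's addresses is the /24, which also holds the neighbor's PTR records.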
45
DNS Measurements (MIT data from 2000)
• What is being looked up?
– ~60% requests for A records
– ~25% for PTR records
– ~5% for MX records
– ~6% for ANY records
• How long does it take?
– Median ~100msec (but 90th percentile ~500msec)
– 80% have no referrals; 99.9% have fewer than four
• Query packets per lookup: ~2.4
46
DNS Measurements (MIT data from 2000)
• Top 10% of names accounted for ~70% of lookups
– Caching should really help!
• 9% of lookups are unique
– Cache hit rate can never exceed 91%
• Cache hit rates ~ 75%
– But caching for more than 10 hosts doesn’t add much
47
DNS Measurements (MIT data from 2000)
• Does DNS give answers?
– ~23% of lookups fail to elicit an answer!
– ~13% of lookups result in NXDOMAIN (or similar)
o Mostly reverse lookups
– Only ~64% of queries are successful!
o How come the web seems to work so well?
• ~ 63% of DNS packets in unanswered queries!
– Failing queries are frequently retransmitted
– 99.9% successful queries have ≤2 retransmissions
48
Moral of the Story
• If you design a highly resilient system, many things
can be going wrong without you noticing it!
49
Security Analysis of DNS
• What security issues does the design & operation
of the Domain Name System raise?
• Degrees of freedom:
[Figure: DNS message format, as on the DNS Protocol slide: 16-bit Identification and Flags fields; # Questions, # Answer RRs, # Authority RRs, # Additional RRs; then the Questions, Answers, Authority, and Additional information sections (each a variable # of resource records)]
50
Security Problem #1: Starbucks
• As you sip your latte and surf the Web, how does
your laptop find google.com?
• Answer: it asks the local name server it learned via the Dynamic
Host Configuration Protocol (DHCP) …
– … which is run by Starbucks or their contractor
– … and can return to you any answer they please
– … including a “man in the middle” site that forwards your
query to Google, gets the reply to forward back to you,
yet can change anything they wish in either direction
• How can you know you’re getting correct data?
– Today, you can’t. (Though if site is HTTPS, that helps)
– One day, hopefully: DNSSEC extensions to DNS
51
Security Problem #2: Cache Poisoning
• Suppose you are a Bad Guy and you control the
name server for foobar.com. You receive a request
to resolve www.foobar.com and reply:
  ;; QUESTION SECTION:
  ;www.foobar.com.            IN   A

  ;; ANSWER SECTION:
  www.foobar.com.      300    IN   A    212.44.9.144

  ;; AUTHORITY SECTION:
  foobar.com.          600    IN   NS   dns1.foobar.com.
  foobar.com.          600    IN   NS   google.com.

  ;; ADDITIONAL SECTION:
  google.com.          5      IN   A    212.44.9.155

– 212.44.9.155 is a foobar.com machine, not google.com
– With a TTL of 5, evidence of the attack disappears 5 seconds later!
52
Cache Poisoning, con’t
• Okay, but how do you get the victim to look up
www.foobar.com in the first place?
• Perhaps you connect to their mail server and send
– HELO www.foobar.com
– Which their mail server then looks up to see if it
corresponds to your source address (anti-spam
measure)
• Note, with compromised name server we can also
lie about PTR records (address → name mapping)
– E.g., for 212.44.9.155 = 155.9.44.212.in-addr.arpa return
google.com (or whitehouse.gov, or whatever)
o If our ISP lets us manage those records as we see fit, or we
happen to directly manage them
53
Cache Poisoning, con’t
• Suppose Bad Guy is at Starbucks and they can
sniff (or even guess) the identification field the
local server will use in its next request
[Figure: first row of the DNS header, the 16-bit Identification and 16-bit Flags fields]
• They:
– Ask local server for a (recursive) lookup of google.com
– Locally spoof subsequent reply from correct name server
using the identification field
– Bogus reply arrives sooner than legit one
• Local server duly caches the bogus reply!
– Now: every future Starbucks customer is served the
bogus answer out of the local server’s cache
o In this case, the reply uses a large TTL
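To get a feel for the "or even guess" case, here is a back-of-the-envelope sketch. It rests on assumptions that go beyond the slide: the identification value is picked uniformly at random, every forged reply carries a distinct guess, all forgeries arrive before the legitimate reply, and everything else in the reply already matches. The function name is illustrative.

```python
ID_SPACE = 2 ** 16  # the identification field is 16 bits wide


def spoof_success_probability(num_forged_replies):
    """Chance that one of the forged replies carries the right ID."""
    distinct_guesses = min(num_forged_replies, ID_SPACE)
    return distinct_guesses / ID_SPACE


for n in (1, 100, 10_000):
    print(n, f"{spoof_success_probability(n):.4%}")
```

A 16-bit field leaves only 65,536 possibilities, so an attacker who can send many forged replies per lookup has a non-negligible chance even without sniffing.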
54
Summary
• Domain Name System (DNS)
– Distributed, hierarchical database
– Distributed collection of servers
– Caching to improve performance
• DNS lacks authentication
– Can’t tell if reply comes from the correct source
– Can’t tell if correct source tells the truth
– Malicious source can insert extra (mis)information
– Malicious bystander can spoof (mis)information
– Playing with caching lifetimes adds extra power to
attacks
55