EE122: DNS and the Web (after a little more multicast)
October 27, 2003
Katz, Stoica F04

EECS 122: Introduction to Computer Networks
DNS and WWW
Computer Science Division
Department of Electrical Engineering and Computer Sciences
University of California, Berkeley
Berkeley, CA 94720-1776

Barriers to Multicast
Hard to change IP
  - multicast means change to IP
  - details of multicast were very hard to get right
Not always consistent with ISP economic model
  - charging done at edge, but single packet from edge can explode into millions of packets within network
Troublesome security model
  - Anyone can send to a group
  - Denial-of-service attacks on known groups

6/27/2016 7:55 PM Katz, Stoica F04

Application Layer Multicast (ALM)
Let the hosts do all the “special” work
  - only require unicast from infrastructure
Basic idea:
  - hosts do the copying of packets
  - set up tree between hosts
Example: Narada [Chu et al., 2000]
  - Small group sizes: up to hundreds of nodes
  - Typical application: chat

Narada: End System Multicast
[Figure: physical topology connecting Gatech, Stanford (Stan1, Stan2), CMU, and Berkeley (Berk1, Berk2), next to the “overlay” tree built among the hosts Stan1, Stan2, Gatech, CMU, Berk1, and Berk2.]

Algorithmic Challenge
Choosing replication/forwarding points among hosts
  - how do the hosts know about each other
  - and know which hosts should forward to other hosts

Advantages of ALM
No need for changes to IP or routers
No need for ISP cooperation
End hosts can prevent other hosts from sending
Easy to implement reliability
  - use hop-by-hop retransmissions

Performance Concerns
Stretch
  - ratio of latency in the overlay to latency in the underlying network
Stress
  - number of duplicate packets sent over the same physical link

Performance Concerns
[Figure: the overlay tree drawn over the physical network. Delay from CMU to Berk1 increases (stretch); duplicate packets on the same physical link waste bandwidth (stress).]
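
The two metrics on the Performance Concerns slides can be made concrete with a small calculation. The sketch below is only an illustration: the host names come from the Narada figure, but the physical links, their latencies, the unicast routes, and the shape of the overlay tree are all assumptions invented for the example.

```python
from collections import Counter

# Hypothetical physical links with one-way latencies in ms (not from the slides).
LATENCY_MS = {"cmu-stanford": 30, "stanford-berkeley": 5, "cmu-berkeley": 32}

# Assumed unicast route (list of physical links) between pairs of hosts.
ROUTE = {
    ("CMU", "Stan1"): ["cmu-stanford"],
    ("CMU", "Stan2"): ["cmu-stanford"],
    ("Stan1", "Berk1"): ["stanford-berkeley"],
    ("CMU", "Berk1"): ["cmu-berkeley"],  # direct path, used only as the baseline
}

# Assumed overlay tree rooted at the sender: child -> parent.
SOURCE = "CMU"
PARENT = {"Stan1": "CMU", "Stan2": "CMU", "Berk1": "Stan1"}


def path_latency(links):
    """Sum the latency of the physical links on one unicast route."""
    return sum(LATENCY_MS[name] for name in links)


def overlay_latency(host):
    """Latency from the source to `host` when packets follow the overlay tree."""
    total = 0
    while host != SOURCE:
        parent = PARENT[host]
        total += path_latency(ROUTE[(parent, host)])
        host = parent
    return total


def stretch(host):
    """Overlay latency divided by the latency of the direct unicast path."""
    return overlay_latency(host) / path_latency(ROUTE[(SOURCE, host)])


def stress():
    """Number of copies of each packet carried by every physical link."""
    copies = Counter()
    for child, parent in PARENT.items():
        copies.update(ROUTE[(parent, child)])
    return copies
```

With these made-up numbers, Berk1 is reached through Stan1 rather than directly, so its stretch is above 1, and the CMU-to-Stanford link carries one copy per Stanford host, so its stress is 2.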

Single Sender Multicast
Many problems with IP multicast disappear if each group is associated with a single source
Hosts joining multicast group can send join messages to source
  - this sets up delivery tree
  - no worry about “root” being in wrong place
This solves several problems:
  - better security and charging model
  - simple algorithm

Example
[Figure: group members M1, M2, M3 and a source; control (join) messages flow from the members toward the source, and data flows from the source to the members.]

What’s Wrong with SSM?
Multiple sources?
  - can set up group per source, or...
  - source can serve as relay for other senders
Algorithm?
  - trivial
So, why isn’t SSM the answer?
  - multicast no longer serves as “rendezvous”
  - ok for “broadcast” apps, not good for “meeting” apps

What Do You Need to Know?
DVMRP
CBT
SSM
How they compare

Today’s Lecture: 17
[Figure: protocol-stack roadmap (Application, Transport, Network (IP), Link, Physical) annotated with the lecture numbers that cover each layer.]

Internet Names & Addresses
Names: e.g. ariachne.berkeley.edu
  - human-usable labels for machines
  - conforms to “organizational” structure
Addresses: e.g. 169.229.131.109
  - router-usable labels for machines
  - conforms to “network” structure
How do you map from one to another?
  - Domain Name System (DNS)

DNS: History
Initially all host-address mappings were in a file called hosts.txt (in /etc/hosts)
  - Changes were submitted to SRI by email
  - New versions of hosts.txt ftp’d periodically from SRI
  - An administrator could pick names at their discretion
As the Internet grew this system broke down because:
  - SRI couldn’t handle the load
  - Names were not unique
  - Many hosts had inaccurate copies of hosts.txt
Internet growth was threatened!
  - Domain Name System (DNS) was born

Basic DNS Features
Hierarchical namespace
  - as opposed to original flat namespace
Distributed storage architecture
  - as opposed to centralized storage (plus replication)
Client-server interaction on UDP port 53
  - but can use TCP if desired

Naming Hierarchy
[Figure: naming tree with root at the top; node labels shown: edu, gov, com, mil, org, net, uk, fr, etc.; berkeley, mit; eecs, sims; argus.]
“Top Level Domains” are at the top
Depth of tree is arbitrary (limit 128)
Domains are subtrees
  - E.g.: .edu, berkeley.edu, eecs.berkeley.edu
Name collisions avoided
  - E.g.: berkeley.edu and berkeley.com can coexist, but uniqueness is job of domain

Host names are administered hierarchically
[Figure: the same naming tree (root; edu, com, gov, mil, org, net, uk, fr; berkeley, mit; eecs, sims; argus) with zones marked.]
A zone corresponds to an administrative authority that is responsible for that portion of the hierarchy
  - eecs controls names: x.eecs.berkeley.edu
  - berkeley controls names: x.berkeley.edu and y.sims.berkeley.edu

Server Hierarchy
Each server has authority over a portion of the hierarchy
  - A server maintains only a subset of all names
Each server contains all the records for the hosts in its zone
  - might be replicated for robustness
Each server needs to know other servers that are responsible for the other portions of the hierarchy
  - Every server knows the root
  - Root server knows about all top-level domains

DNS Name Servers
Local name servers:
  - Each ISP (company) has local default name server
  - Host DNS query first goes to local name server
Authoritative name servers:
  - For a host: stores that host’s (name, IP address)
  - Can perform name/address translation for that host’s name
Can also do IP to name translation, but won’t discuss

DNS: Root Name Servers
Contacted by local name server that cannot resolve name
Root name server:
  - Contacts authoritative name server if name mapping not known
  - Gets mapping
  - Returns mapping to local name server
About a dozen root name servers worldwide

Simple DNS Example
Host whistler.cs.cmu.edu wants IP address of www.berkeley.edu
1. Contacts its local DNS server, mango.srv.cs.cmu.edu
2. mango.srv.cs.cmu.edu contacts root name server, if necessary
3. Root name server contacts authoritative name server, ns1.berkeley.edu, if necessary
[Figure: requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, authoritative name server ns1.berkeley.edu, and target www.berkeley.edu; messages numbered 1 to 6.]

Example of Recursive DNS Query
Root name server:
  - May not know authoritative name server
  - May know intermediate name server: who to contact to find authoritative name server?
Recursive query:
  - Puts burden of name resolution on contacted name server
  - Heavy load?
[Figure: requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, intermediate name server (edu server), authoritative name server ns1.berkeley.edu, and target www.berkeley.edu; messages numbered 1 to 8.]

Example of Iterated DNS Query
Iterated query:
  - Contacted server replies with name of server to contact
  - “I don’t know this name, but ask this server”
[Figure: requesting host whistler.cs.cmu.edu, local name server mango.srv.cs.cmu.edu, root name server, intermediate name server (edu server), authoritative name server ns1.berkeley.edu, and target www.berkeley.edu; messages numbered 1 to 8, with the local name server issuing the iterated queries.]

DNS Records
Four fields: (name, value, type, TTL)
Type = A:
  - name = hostname
  - value = IP address
Type = NS:
  - name = domain
  - value = name of DNS server for domain

DNS Records (cont’d)
Type = CNAME:
  - name = hostname
  - value = canonical name
Type = MX:
  - name = domain in email address
  - value = canonical name of mail server

DNS as Indirection Service
Can refer to machines by name, not address
  - not only easier for humans
  - also allows machines to change IP addresses without having to change the way you refer to the machine
Can refer to machines by alias
  - www.berkeley.edu can be generic web server
  - but DNS can point this to a particular machine that can change over time
But, this flexibility applies only within domain!

Special Topics
DNS caching
  - improve performance by saving results of previous lookups
DNS “hacks”
  - return records based on requesting IP address
Dynamic DNS
  - allows remote updating of IP address for mobile hosts
DNS politics (ICANN) and branding battles

Important Properties of DNS
Administrative delegation and distributed server architecture result in:
  - Easy unique naming
  - Fate sharing for network failures
  - Reasonable trust model

The Web
A distributed database of “pages”
Core components:
  - Servers: store files and execute remote commands
  - Browsers: retrieve and display “pages”
  - URLs: way to refer to pages
Need a protocol to transfer information between clients and servers
  - HTTP

Uniform Resource Locator
protocol://host-name:port/directorypath/resource
Extend the idea of hierarchical namespaces to include anything in a file system
  - ftp://www.eecs.berkeley.edu/122/Lecture6/presentation.ppt
Extend to program executions as well…
  - http://us.f413.mail.yahoo.com/ym/ShowLetter?box=%40B%40Bulk&MsgId=2604_1744106_29699_1123_1261_0_28917_3552_1289957100&Search=&Nhead=f&YY=31454&order=down&sort=date&pos=0&view=a&head=b
  - Server-side processing can be incorporated in the name

Web and DNS
URLs use hostnames
Thus, content names are tied to specific hosts
This is bad!
URNs are one proposal to achieve persistence

Hyper Text Transfer Protocol
Client-server architecture
Synchronous request/reply protocol
  - Runs over TCP, port 80
Stateless
Uses unicast

Big Picture
[Figure: timeline between client and server: establish connection, client request, request response, ..., close connection.]

Hyper Text Transfer Protocol Commands
GET: transfer resource from given URL
HEAD: GET resource metadata (headers) only
PUT: store/modify resource under given URL
DELETE: remove resource
POST: provide input for a process identified by the given URL (usually used to post CGI parameters)

Response Codes
1xx informational
2xx success
3xx redirection
4xx client error in request
5xx server error; can’t satisfy the request

Client Request
Steps to get the resource: http://www.eecs.berkeley.edu/index.html
1. Use DNS to obtain the IP address of www.eecs.berkeley.edu
2. Send an HTTP request:
   GET /index.html HTTP/1.0

Server Response
HTTP/1.0 200 OK
Content-Type: text/html
Content-Length: 1234
Last-Modified: Mon, 19 Nov 2001 15:31:20 GMT

<HTML>
<HEAD>
<TITLE>EECS Home Page</TITLE>
</HEAD>
…
</BODY>
</HTML>

Example (from Kurose and Ross)
http://www.mylife.org/mypictures.htm
After finding out the IP address of the host…
1. HTTP client initiates a TCP connection on port 80
2. Client sends the GET request via the socket established in step 1
3. Server sends the HTML file, which is encapsulated in its response
4. HTTP server tells TCP to terminate the connection
5. HTTP client receives the file and the browser parses it … it contains ten JPEG images
6. Client repeats steps 1-4 for each of the ten images

HTTP/1.0 Example
[Figure: timeline between client and server, ending when the client finishes displaying the page.]

HTTP/1.0 Performance
Create a new TCP connection for each resource
  - Large number of embedded objects in a web page
  - Many short-lived connections
TCP transfer
  - Too slow for small objects
  - May never exit slow-start phase
Connections may be set up in parallel (5 is default in most browsers)

HTTP/1.0 Caching
Exploit locality of reference
A modifier to the GET request:
  - If-Modified-Since: return a “not modified” response if resource was not modified since specified time
A response header:
  - Expires: specify to the client for how long it is safe to cache the resource
A request directive:
  - No-cache: ignore all caches and get resource directly from server
These features can be best taken advantage of with HTTP proxies
  - Locality of reference increases if many clients share a proxy

Web Proxies
Intermediaries between client and server
[Figure: Client 1, Client 2, ..., Client N reach the server through proxies.]

HTTP/1.1 (1996)
Performance:
  - Persistent connections
  - Pipelined requests/responses
  - …
Support for virtual hosting
Efficient caching support
  - Network cache assumed more explicitly in the design
  - Gives more control to the server on how it wants data cached

Persistent Connections
Allow multiple transfers over one connection
Avoid multiple TCP connection setups
Avoid multiple TCP slow starts

Pipelined Requests/Responses
Buffer requests and responses to reduce the number of packets
Multiple requests can be contained in one TCP segment
Note: order of responses has to be maintained
[Figure: timeline of pipelined requests and responses between client and server.]

What You Need to Know
DNS: record types, and how they are used
HTTP basics (and essential differences between 1.0 and 1.1)

What’s the Moral of this Story?
QoS and IP Multicast:
  - interesting algorithmic and architectural issues
  - thousands of academic papers
  - ubiquitous in routers, but not deployed by ISPs
  - little or no impact on end users
DNS and the Web:
  - no research papers on topic before deployment
  - really boring designs
  - they changed the world....
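
As a recap of the DNS Records and Example of Iterated DNS Query slides, here is a minimal sketch of how a local name server might follow referrals using (name, value, type, TTL) records. It is a toy in-memory model, not a real resolver: the server label `edu-server`, the host `web1.berkeley.edu`, the address `192.0.2.10`, and all TTL values are placeholders invented for the example, and resolving a CNAME at the same server is a simplifying assumption.

```python
# Toy zone data: server -> list of (name, value, type, ttl) records.
# Server names follow the slides where possible; "edu-server",
# "web1.berkeley.edu", the address, and the TTLs are hypothetical.
SERVERS = {
    "root": [("edu", "edu-server", "NS", 86400)],
    "edu-server": [("berkeley.edu", "ns1.berkeley.edu", "NS", 86400)],
    "ns1.berkeley.edu": [
        ("www.berkeley.edu", "web1.berkeley.edu", "CNAME", 3600),
        ("web1.berkeley.edu", "192.0.2.10", "A", 3600),
    ],
}


def query(server, name):
    """Answer one query as a single server would.

    Returns an A or CNAME record for `name` if the server holds one,
    otherwise the NS record of the longest domain containing `name`
    ("I don't know this name, but ask this server"), otherwise None.
    """
    records = SERVERS[server]
    for record in records:
        if record[0] == name and record[2] in ("A", "CNAME"):
            return record
    referrals = [
        r for r in records
        if r[2] == "NS" and (name == r[0] or name.endswith("." + r[0]))
    ]
    return max(referrals, key=lambda r: len(r[0])) if referrals else None


def resolve_iteratively(name, max_steps=10):
    """Follow referrals from the root, as the local name server does.

    Returns (address or None, trace), where trace lists the
    (server asked, record type returned) pairs in order.
    """
    server, trace = "root", []
    for _ in range(max_steps):
        answer = query(server, name)
        if answer is None:
            return None, trace
        _, value, rtype, _ttl = answer
        trace.append((server, rtype))
        if rtype == "A":
            return value, trace
        if rtype == "CNAME":
            name = value    # continue with the canonical name (same server: assumption)
        else:
            server = value  # NS referral: ask the named server next
    return None, trace
```

Calling `resolve_iteratively("www.berkeley.edu")` walks root, the edu server, and ns1.berkeley.edu in turn, which mirrors the iterated query figure; a name with no matching referral at the root simply returns `None`.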