Computer Networks 364 Protocols John Morris Computer Science/Electrical Engineering University of Auckland Email: jmor159@cs.auckland.ac.nz URL: http:/www.cs.auckland.ac.nz/~jmor159 Protocols - HTTP ► HyperText Transfer Protocol (HTTP) WWW application layer protocol Client: browser (Netscape, Opera, that other one, … ) Server: a web server (source of Web pages - Apache, … ) Defines the language used by clients to request web pages ► RFC 2616 (HTTP/1.1) [ RFC 1945 (HTTP/1.0) ] RFC = Request for Comment Now managed by the Internet Engineering Task Force (IETF) Over 2000 RFCs Standards for the Internet ► Default port is 80 Protocols - HTTP ► HyperText Transfer Protocol (HTTP) Web pages consist of a number of objects ► Basic page ► Embedded images, etc ► Each object is fetched from the server in a single session Open TCP connection GET message from client Response from server with object Close connection ► HTTP is stateless • Server does not keep track of state of session with client • Each request/response pair is independent of any other Stateless protocols are simpler! • Suitable for information serving only applications • Transaction oriented applications eg database update generally require some state to be maintained • HTTP makes it difficult to implement ‘safe’ transaction based systems but • Cookies provide a simple mechanism for maintaining state Protocols - HTTP ► HyperText Transfer Protocol (HTTP) Web pages consist of a number of objects ► ... ► Each object is fetched from the server in a single session Open TCP connection GET message from client Response from server with object Close connection ► Obviously rather inefficient TCP connection establishment is expensive Persistent connections ► TCP connection is left open for subsequent requests ► Further efficiency from pipelining Send additional requests before first response received Allows browser to do useful work while server is fetching objects ► Parsing to discover embedded objects, ► Formatting and displaying pages, etc HTTP example Method URL Version GET /somedir/page.html HTTP/1.1 Host: www.someschool.edu Connection: close User-agent: Mozilla/4.0 Accept-language:fr (extra carriage return, line feed) } Methods: GET POST HEAD Request line Header lines HTTP request messages General form HTTP response Version Status code Status message Status line HTTP/1.1 200 OK Connection: close Header Date: Thu, 06 Aug 1998 12:00:15 GMT lines Server: Apache/1.3.0 (Unix) Last-Modified: Mon, 22 Jun 1998 09:23:24 GMT Content-Length: 6821 Content-Type: text/html (data data data data data . . .) Entity body HTTP response - general format HTTP response - common status codes ► * 200 OK: Request succeeded and the information is returned in the response. ► * 301 Moved Permanently: Requested object has been permanently moved; new URL is specified in Location: header of the response message. The client software will automatically retrieve the new URL. ► * 400 Bad Request: A generic error code indicating that the request could not be understood by the server. ► * 404 Not Found: The requested document does not exist on this server. ► * 505 HTTP Version Not Supported: The requested HTTP protocol version is not supported by the server. Try out http (client side) for yourself 1. Telnet to your favorite Web server: Opens TCP connection to port 80 telnet www.eurecom.fr 80 (default http server port) at www.eurecom.fr. Anything typed in sent to port 80 at www.eurecom.fr 2. Type in a GET http request: GET /~ross/index.html HTTP/1.0 By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to http server 3. Look at response message sent by http server! User-server interaction: authentication Authentication : control access to server content ► authorization credentials: typically name, password ► stateless: client must present authorization in each request authorization: header line in each request if no authorization: header, server refuses access, sends WWW authenticate: header line in response client server usual http request msg 401: authorization req. WWW authenticate: usual http request msg + Authorization: <cred> usual http response msg usual http request msg + Authorization: <cred> usual http response msg time Cookies: keeping “state” ► Server-generated # rembered by server ► ► Later used for: authentication remembering ► user preferences, ► previous choices Server sends “cookie” to client in response msg client server usual http request msg usual http response + Set-cookie: # usual http request msg cookie: # usual http response msg cookiespectific action Set-cookie: 1678453 ► Client presents cookie in later requests cookie: 1678453 usual http request msg cookie: # usual http response msg cookiespectific action Conditional GET: client-side caching ► Goal Don’t send object if client has up-to-date cached version ► server http request message If-modified-since: <date> Client Specify date of cached copy in http request If-modified-since: <date> ► client http response object not modified HTTP/1.0 304 Not Modified Server Response has no object if cached copy is up-to-date: HTTP/1.0 304 Not Modified http request message If-modified-since: <date> http response HTTP/1.1 200 OK <data> object modified Web Caches (proxy server) Goal Satisfy client request without involving original server ► origin server User sets browser Web accesses via web cache ► Client sends all http requests to web cache client Object in web cache ► web cache returns object else web cache requests object from original server, returns object to client client Proxy server origin server Why Web Caching? origin servers Assume Cache is “close” to client eg in same network Smaller response time public Internet Cache “closer” to client Decrease traffic to distant servers 1.5 Mbps access link Link out of local network often bottleneck ► Cache works on locality of reference principle institutional network Recently used objects more likely to be needed again ► Temporal locality Keep them ‘closer’ Processor caches use the same principle (+ spatial locality!) 10 Mbps LAN institutional cache FTP: File Transfer Protocol user at host ► FTP FTP user client interface local file system file transfer FTP 21 server remote file system Transfer file(s) to/from remote host ► Client/Server model Client: side that initiates transfer (either to/from remote) Server: remote host ► RFC 959 ► ftp server: port 21 FTP: Separate Control and Data Connections ► FTP client contacts server at port 21 Specifies TCP as transport protocol ► Two parallel TCP connections opened: Control connection ► Exchange commands, responses between client and server ► Out of band control Data connection ► File data to / from server ► FTP server maintains state Current directory Earlier authentication FTP client TCP control connection port 21 TCP data connection port 20 FTP server FTP Commands and Responses Sample commands: Sample return codes ► ► Sent as ASCII text over control channel USER username PASS password ► ► LIST ► Return list of files in current directory RETR filename ► Retrieves (gets) file STOR filename ► Stores (puts) file onto remote host ► ► Status Code and Phrase (as in HTTP) 331 Username OK, password required 125 data connection already open; transfer starting 425 Can’t open data connection 452 Error writing file Electronic Mail outgoing message queue Three major components: user agent ► User agents ► Mail servers ► Simple Mail Transfer Protocol: SMTP User Agent ► Mail reader ► Composing, editing, reading mail messages ► Examples Eudora, Outlook, elm, Netscape Messenger ► user mailbox mail server SMTP SMTP mail server Outgoing, incoming messages stored on server SMTP user agent user agent user agent mail server user agent user agent eMail: Mail servers Mail Servers user agent ► Mailbox contains incoming messages (yet to be read) for user ► Message queue of outgoing (to be sent) mail messages ► SMTP protocol mail server Used between mail servers to send email messages SMTP Client: sending mail server Server: receiving mail server mail server SMTP SMTP user agent user agent user agent mail server user agent user agent eMail: SMTP ► RFC 821 First published: 1982 ► Uses TCP to reliably transfer email message ► Port 25 ► Direct transfer Sending server to receiving server ► Three phases of transfer Handshaking (greeting) Transfer of messages Closure ► Command/response interaction Commands: ASCII text Response: status code and phrase ► Messages must be in 7-bit ASCII Legacy of 1982 Binary data must be encoded before transfer Sample SMTP interaction S: C: S: C: S: C: S: C: S: C: C: C: S: C: S: 220 hamburger.edu HELO crepes.fr 250 Hello crepes.fr, pleased to meet you MAIL FROM: <alice@crepes.fr> 250 alice@crepes.fr... Sender ok RCPT TO: <bob@hamburger.edu> 250 bob@hamburger.edu ... Recipient ok DATA 354 Enter mail, end with "." on a line by itself Do you like ketchup? How about pickles? This is why 7-bit ASCII . is required! 250 Message accepted for delivery QUIT 221 hamburger.edu closing connection Try SMTP yourself ► telnet servername 25 See 220 reply from server Enter ► HELO ► MAIL FROM ► RCPT TO ► DATA ► QUIT ► HELP commands You can send email Using telnet to send commands yourself By writing a simple program to do it for you! Exercise: make a simple Java mail sender Able to send messages directly from programs SMTP: Final Words ► Uses persistent connections ► Requires message (header & body) to be in 7-bit ASCII ► Certain character strings not permitted in message Comparison with HTTP: ► HTTP: pull ► eMail: push Command and response, interaction and status codes ► Example ► CRLF.CRLF Thus message has to be encoded ► Usually All ASCII in both ► Each object encapsulated in its own response message base-64 or quoted printable Server uses CRLF.CRLF to determine end of message HTTP ► SMTP Multiple objects sent in multipart message Mail message format RFC 821 SMTP protocol for exchanging email messages RFC 822 Text message format Header lines, e.g. ► To: ► From: ► Subject: Different from SMTP commands! Defines semantics (interpretation) also ► Body The “message” ASCII characters only! header body blank line Message format: Multimedia extensions ► RFC 822 format OK for text messages Inefficient for multimedia Multipurpose Internet Mail Extensions (MIME) RFC 2045, 2056 ► Additional lines in message header declare MIME content type MIME version Method used to encode data Multimedia data type, subtype, parameters Encoded data From: alice@crepes.fr To: bob@hamburger.edu Subject: Picture of yummy crepe MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Type: image/jpeg base64 encoded data ..... ......................... ......base64 encoded data MIME types Content-Type: type/subtype; parameters Text ► Video Subtypes: plain, html, ... Image ► Subtypes: mpeg, quicktime Application Subtypes: jpeg, gif Audio ► ► Subtypes basic ► 8-bit -law encoded 32kadpcm ► 32 Kbps coding (RFC 1911) ► Other data that must be processed by reader before becoming “viewable” ► Subtypes msword octet-stream ► Arbitrary binary data MIME: Multipart Type From: alice@crepes.fr To: bob@hamburger.edu Subject: Picture of yummy crepe. MIME-Version: 1.0 Content-Type: multipart/mixed; boundary=98766789 --98766789 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain Dear Bob, Please find a picture of a crepe. --98766789 Content-Transfer-Encoding: base64 Content-Type: image/jpeg base64 encoded data ..... ......................... ......base64 encoded data --98766789-- Arbitrary ASCII string which defines boundaries of a part Mail Access Protocols user agent SMTP SMTP sender’s mail server ► ► POP3 or IMAP receiver’s mail server SMTP: delivery to and storage on receiver’s server Mail access protocol: retrieval from server POP: Post Office Protocol [RFC 1939] ► Simple, limited functions (agent server) and download IMAP: Internet Mail Access Protocol [RFC 2060] ► More features (more complex) ► Manipulation of stored messages on server ► Authorization Set up, search folders, etc HTTP: Hotmail , Yahoo! Mail, etc user agent POP3 Protocol Authorization phase ► ► Client commands: user: declare username pass: password Server responses +OK -ERR Transaction phaseclient: list: list message numbers ► retr: retrieve message by number ► dele: delete ► quit ► S: C: S: C: S: +OK POP3 server ready user alice +OK pass hungry +OK user successfully logged C: S: S: S: C: S: S: C: C: S: S: C: C: S: list 1 498 2 912 . retr 1 <message 1 contents> . dele 1 retr 2 <message 1 contents> . dele 2 quit +OK POP3 server signing off on DNS: Domain Name System People: Many identifiers: IRD #, name, passport # Internet hosts, routers: IP address (32 bit) ► Used for in datagrams “name” eg gaia.cs.umass.edu ► Used by humans (with some exceptions!) ? Map between IP addresses and name? Domain Name System: ► Distributed database Implemented in hierarchy of many name servers ► Application-layer protocol Host, routers, name servers to communicate to resolve names (address name translation) ► Note Core Internet function, implemented as applicationlayer protocol Complexity at network’s “edge” DNS name servers Why not centralize DNS? ► Single point of failure ► Congestion Traffic volume on central server ► Distance Time to reach centralized database ► Maintenance Doesn’t scale! No server has all name IP address mappings Local name servers: ► Each ISP, company has local (default) name server DNS query first goes to local name server Authoritative name server: For a host: stores that host’s IP address, name Can perform name/address translation for that host’s name DNS: Root Name Servers ► Contacted by local name server that can not resolve name ► Root Name Server: Contacts authoritative name server if name mapping not known Gets mapping Returns mapping to local name server a NSI Herndon, VA c PSInet Herndon, VA d U Maryland College Park, MD g DISA Vienna, VA h ARL Aberdeen, MD j NSI (TBD) Herndon, VA k RIPE London i NORDUnet Stockholm m WIDE Tokyo e NASA Mt View, CA f Internet Software C. Palo Alto, CA b USC-ISI Marina del Rey, CA l ICANN Marina del Rey, CA 13 root name servers worldwide Simple DNS example root name server Host surf.eurecom.fr wants IP address of gaia.cs.umass.edu 1. Contacts its local DNS server, dns.eurecom.fr 2. dns.eurecom.fr contacts root name server, if necessary 3. Root name server contacts authoritative name server, dns.umass.edu, if necessary 2 4 5 local name server 3 authorititive name server dns.umass.edu dns.eurecom.fr 1 6 requesting host surf.eurecom.fr gaia.cs.umass.edu DNS example root name server Root name server: May not know authoritative name server ► May know intermediate name server: who to contact to find authoritative name server 6 2 ► 7 local name server dns.eurecom.fr 1 8 3 intermediate name server dns.umass.edu 4 5 authoritative name server dns.cs.umass.edu requesting host surf.eurecom.fr gaia.cs.umass.edu DNS: iterated queries root name server iterated query Recursive query: 2 ► Puts burden of name resolution on contacted name server Heavy load? Iterated query: ► Contacted server replies with name of server to contact ► “I don’t know this name, but ask this server” 3 4 7 local name server dns.eurecom.fr 1 8 intermediate name server dns.umass.edu 5 6 authoritative name server dns.cs.umass.edu requesting host surf.eurecom.fr gaia.cs.umass.edu DNS: Caching and updating records ► ► Once (any) name server learns mapping, it caches mapping Cache entries timeout (disappear) after some time Update/notify mechanisms being designed by IETF RFC 2136 http://www.ietf.org/html.charters/dnsind-charter.html DNS records DNS: Distributed database storing resource records (RR) RR format: (name, value, type, ttl) ► Type=A name is hostname value is IP address ► Type=NS ► Type=CNAME name is alias name for some “cannonical” (real) name ► www.ibm.com is really servereast.backup2.ibm.com value is cannonical name name is domain (eg foo.com) value is IP address of authoritative name server for ► Type=MX this domain value is name of mailserver associated with name DNS protocol, messages DNS protocol query and reply messages both with same message format Message header ► Identification 16 bit # for query, reply to query uses same # ► Flags query or reply recursion desired recursion available reply is authoritative DNS protocol, messages Name, type fields for a query RRs in reponse to query Records for authoritative servers Additional “helpful” info that may be used