SWE-306 Computer Communication & Networks (CC&N)
Application Layer: HTTP, FTP, SMTP, POP3, IMAP & DNS

Application Layer
The application layer is responsible for providing services to the user. It provides user interfaces and support for services such as electronic mail, file access and transfer, access to system resources, surfing the World Wide Web, and network management.

Goals:
• conceptual and implementation aspects of network application protocols
  – client-server paradigm
  – Peer-to-Peer (P2P) paradigm
  – service models
• learn about protocols by examining popular application-level protocols

More goals:
• specific protocols: HTTP, FTP, SMTP, POP, DNS
• programming network applications: socket programming

Applications and application-layer protocols
Application: communicating, distributed processes
• running in network hosts in "user space"
• exchange messages to implement the app
• e.g., email, file transfer, the Web
Application-layer protocols
• one "piece" of an app
• defined in RFCs
• define the messages exchanged by apps (their syntax and semantics) and the actions taken
• use services provided by lower-layer protocols
[Figure: three hosts, each with the application, transport, network, data link, and physical layers]

THREE PARADIGMS
• Client-server
• Peer-to-Peer
• Hybrid

2: Application Layer

Client-server paradigm
Typical network app has two pieces: client and server
server:
• always-on host
• permanent IP address
• data centers for scaling
clients:
• communicate with server
• may be intermittently connected
• may have dynamic IP addresses
• do not communicate directly with each other

Peer-to-Peer paradigm
• Peers communicate with each other
• Peers are desktops/laptops controlled by users
• Communication takes place without passing through a dedicated server
• Examples: file distribution applications (BitTorrent, LimeWire, eMule), Internet telephony (Skype), IPTV (PPLive)
• Hybrid applications include instant messaging services (MSN, Yahoo): servers are used to track peer IP addresses, and messages are exchanged between users without passing through intermediate servers

Internet apps: their protocols and transport protocols

Application               Application layer protocol          Underlying transport protocol
e-mail                    smtp [RFC 821]                      TCP
remote terminal access    telnet [RFC 854]                    TCP
Web                       http [RFC 2068]                     TCP
file transfer             ftp [RFC 959]                       TCP
streaming multimedia      proprietary (e.g., RealNetworks)    TCP or UDP
remote file server        NFS                                 TCP or UDP
Internet telephony        proprietary (e.g., Vocaltec)        typically UDP

WWW & HTTP ARCHITECTURE
The WWW is a system of interlinked hypertext documents accessed via the Internet. With a web browser, one can view web pages that may contain text, images, videos, and other multimedia, and navigate between them via hyperlinks.
The WWW today is a distributed client/server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called sites. Each site holds one or more documents, referred to as Web pages.
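As a rough illustration of a browser acting as the client, the sketch below builds the text of a request for one Web page held at one site. It assumes HTTP/1.1 GET syntax; the helper name build_get_request and the URL are hypothetical, chosen only for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlsplit

def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request (illustrative helper)."""
    parts = urlsplit(url)
    host = parts.netloc          # address of the site
    path = parts.path or "/"     # page within the site; "/" if none is given
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # header naming the site being addressed
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

# Hypothetical URL, used only to show the shape of the request.
request = build_get_request("http://www.example.edu/dept/index.html")
print(request)
```

The request names both the site (the Host header) and the page (the path in the request line), which together correspond to the two parts of the address the client asks for.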
A client's request, among other information, includes the address of the site and the Web page, called the URL.

Web and HTTP
First, a review…
• a web page consists of objects
• an object can be an HTML file, JPEG image, Java applet, audio file, …
• a web page consists of a base HTML file which includes several referenced objects
• each object is addressable by a URL, e.g., www.someschool.edu/someDept/pic.gif, where www.someschool.edu is the host name and /someDept/pic.gif is the path name

HTTP overview
HTTP: hypertext transfer protocol
• Web's application layer protocol
• client/server model
  – client: browser that requests, receives (using HTTP protocol), and "displays" Web objects
  – server: Web server sends (using HTTP protocol) objects in response to requests
[Figure: PC running Firefox browser, iPhone running Safari browser, and server running Apache Web server]

HTTP overview (continued)
uses TCP:
• client initiates TCP connection (creates socket) to server, port 80
• server accepts TCP connection from client
• HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
• TCP connection closed
HTTP is "stateless"
• server maintains no information about past client requests
aside: protocols that maintain "state" are complex!
• past history (state) must be maintained
• if server/client crashes, their views of "state" may be inconsistent and must be reconciled

HTTP connections
non-persistent HTTP
• at most one object sent over a TCP connection
• connection then closed
• downloading multiple objects requires multiple connections
persistent HTTP
• multiple objects can be sent over a single TCP connection between client and server

Non-persistent HTTP
Suppose user enters URL www.someSchool.edu/someDepartment/home.index (contains text, references to 10 jpeg images)
1a. HTTP client initiates TCP connection to HTTP server (process) at www.someSchool.edu on port 80
1b. HTTP server at host www.someSchool.edu, waiting for TCP connection at port 80, "accepts" connection, notifying client
2. HTTP client sends HTTP request message (containing URL) into TCP connection socket. Message indicates that client wants object someDepartment/home.index
3. HTTP server receives request message, forms response message containing requested object, and sends message into its socket

Non-persistent HTTP (cont.)
4. HTTP server closes TCP connection
5. HTTP client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects
non-persistent connection: one object in each TCP connection
• some browsers create multiple TCP connections simultaneously, one per object
persistent connection: multiple objects transferred within one TCP connection

Non-persistent HTTP: response time
RTT (definition): time for a small packet to travel from client to server and back
HTTP response time:
• one RTT to initiate TCP connection
• one RTT for HTTP request and first few bytes of HTTP response to return
• file transmission time
• non-persistent HTTP response time = 2RTT + file transmission time
[Figure: timeline showing one RTT to initiate the TCP connection, one RTT to request the file, and the time to transmit the file before the file is received]

Persistent HTTP
non-persistent HTTP issues:
• requires 2 RTTs per object
• OS overhead for each TCP connection
• browsers often open parallel TCP connections to fetch referenced objects
persistent HTTP:
• server leaves connection open after sending response
• subsequent HTTP messages between same client/server sent over open connection
• client sends requests as soon as it encounters a referenced object
• as little as one RTT for all the referenced objects

Cookies
A cookie, also known as an HTTP cookie, web cookie, or browser cookie, is a small piece of data sent from a website and stored in a user's web browser while the user is browsing that website.
Every time the user loads the website, the browser sends the cookie back to the server to notify the website of the user's previous activity. Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in a shopping cart) or to record the user's browsing activity (including clicking particular buttons, logging in, or recording which pages were visited by the user as far back as months or years ago).

Cookies: keeping "state"
1. The client's cookie file initially holds an entry from an earlier site: ebay 8734
2. The client sends a usual http request msg to the Amazon server
3. The Amazon server creates ID 1678 for the user and creates an entry in its backend database
4. The server returns a usual http response plus the header line set-cookie: 1678; the client's cookie file now holds ebay 8734 and amazon 1678
5. The client's next usual http request msg carries cookie: 1678; the server accesses the backend database, performs a cookie-specific action, and returns a usual http response msg
6. One week later: the client again sends a usual http request msg with cookie: 1678; the server again accesses the database and performs a cookie-specific action before returning a usual http response msg

Web Caches (proxy server)
Goal: satisfy client request without involving origin server
• user sets browser: WWW accesses via web cache
• client sends all http requests to web cache
  – if object at web cache, web cache immediately returns object in http response
  – else requests object from origin server, then returns http response to client
[Figure: clients, proxy server, and origin servers]

Web Caching Hierarchy
• national/international proxy cache
• regional proxy cache
• local proxy cache (e.g., local ISP, University)
• client

FTP: the file transfer protocol
• transfer file to/from remote host
• client/server model
  – client: side that initiates transfer (either to/from remote)
  – server: remote host
• ftp: RFC 959
• ftp server: port 21
[Figure: user at host with FTP user interface, FTP client, and local file system; file transfer to and from FTP server with remote file system]

FTP: separate control, data connections
• FTP client contacts FTP server at port 21, using TCP
• client authorized over control connection
• client browses remote directory, sends commands over control connection
• when server receives file transfer command, server opens 2nd TCP data connection (for file) to client
• after transferring one file, server closes data connection
• server opens another TCP data connection to transfer another file
• TCP control connection uses server port 21; TCP data connection uses server port 20
• control connection: "out of band"
• FTP server maintains "state": current directory, earlier authentication

FTP connections
[Figure: FTP connections, showing the control connection and the data connection]

Electronic mail
Three major components:
• user agents
• mail servers
• simple mail transfer protocol: SMTP
User Agent
• a.k.a. "mail reader"
• composing, editing, reading mail messages
• e.g., Outlook, Thunderbird, iPhone mail client
• outgoing, incoming messages stored on server
[Figure: user agents and mail servers connected by SMTP; each mail server holds user mailboxes and an outgoing message queue]

User Agent types
• Command-driven, e.g., mail, pine and elm
• GUI-based, e.g., Eudora, Outlook and Netscape

Electronic mail: mail servers
mail servers:
• mailbox contains incoming messages for user
• message queue of outgoing (to be sent) mail messages
• SMTP protocol between mail servers to send email messages
  – client: sending mail server
  – "server": receiving mail server

Electronic Mail: SMTP [RFC 2821]
• uses TCP to reliably transfer email message from client to server, port 25
• direct transfer: sending server to receiving server
• three phases of transfer
  – handshaking (greeting)
  – transfer of messages
  – closure
• command/response interaction (like HTTP, FTP)
  – commands: ASCII text
  – response: status code and phrase
• messages must be in 7-bit ASCII

Email address
An email address has two parts:
• Local part: defines the name of the user mailbox
• Domain name: name of the mail exchanger

Scenario: Alice sends message to Bob
1) Alice uses UA to compose message "to" bob@someschool.edu
2) Alice's UA sends message to her mail server; message placed in message queue
3) client side of SMTP opens TCP connection with Bob's mail server
4) SMTP client sends Alice's message over the TCP connection
5) Bob's mail server places the message in Bob's mailbox
6) Bob invokes his user agent to read message
[Figure: Alice's user agent and mail server, Bob's mail server and user agent, annotated with steps 1 to 6]

SMTP: final words
• SMTP uses persistent connections
• SMTP requires message (header & body) to be in 7-bit ASCII
• SMTP server uses CRLF.CRLF to determine end of message
comparison with HTTP:
• HTTP: pull
• SMTP: push
• both have ASCII command/response interaction, status codes
• HTTP: each object encapsulated in its own response msg
• SMTP: multiple objects sent in multipart msg

Mail access protocols
• SMTP: delivery/storage to receiver's server
• mail access protocol: retrieval from server
  – POP: Post Office Protocol [RFC 1939]: authorization, download
  – IMAP: Internet Mail Access Protocol [RFC 1730]: more features, including manipulation of stored msgs on server
  – HTTP: gmail, Hotmail, Yahoo! Mail, etc.
[Figure: sender's user agent, SMTP, sender's mail server, SMTP, receiver's mail server, mail access protocol (e.g., POP, IMAP), receiver's user agent]

POP3 and IMAP
more about POP3
• in POP3 "download and delete" mode, Bob cannot re-read email if he changes client
• POP3 "download-and-keep": copies of messages on different clients
• POP3 is stateless across sessions
IMAP
• keeps all messages in one place: at server
• allows user to organize messages in folders
• keeps user state across sessions: names of folders and mappings between message IDs and folder name

Domain Name System (DNS)
IP Address and Name
TCP/IP protocols use IP addresses to uniquely identify the connection of a host to the Internet. However, people prefer to use names instead of numeric addresses. Therefore, we need a system that can map a name to an address or an address to a name.
When the Internet was small, mapping was done by using a host file. It had only 2 columns: name and address. Now that the Internet has grown, host files would be too large and impossible to keep up to date.
Solution: Domain Name System (DNS), a client-server based system/protocol.
[Figure: Example of using DNS service]

DNS: Domain Name System
People: many identifiers
– SSN, name, passport #
Internet hosts, routers:
– IP address (32 bit) - used for addressing datagrams
– "name", e.g., hermite.cs.smith.edu - used by humans
Q: how to map between IP addresses and names?
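To make the host-file idea concrete, here is a minimal sketch of a two-column name/address table with lookups in both directions. The column order follows the description above; the host names, the addresses, and the helper parse_hosts are hypothetical, and the last lines only confirm that an IPv4 address occupies 32 bits.

```python
import ipaddress

# Hypothetical host file contents: one name and one address per line.
HOSTS_TEXT = """\
# name                address
alpha.example.edu     192.0.2.10
beta.example.edu      192.0.2.20
"""

def parse_hosts(text):
    """Parse a two-column host file into name->address and address->name maps."""
    name_to_addr = {}
    addr_to_name = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and surrounding blanks
        if not line:
            continue
        name, addr = line.split()              # exactly two columns expected
        name_to_addr[name] = addr
        addr_to_name[addr] = name
    return name_to_addr, addr_to_name

name_to_addr, addr_to_name = parse_hosts(HOSTS_TEXT)
print(name_to_addr["alpha.example.edu"])   # name -> address
print(addr_to_name["192.0.2.20"])          # address -> name

# An IPv4 address packs into 4 bytes, i.e. 32 bits.
bits = len(ipaddress.IPv4Address(name_to_addr["alpha.example.edu"]).packed) * 8
print(bits)
```

A single flat table like this has to be copied to every host and edited whenever any machine changes, which is the scaling problem that motivates a distributed system.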
Domain Name System:
• distributed database implemented in hierarchy of many name servers
• application-layer protocol: hosts, routers, name servers communicate to resolve names (address/name translation)
  – note: core Internet function implemented as application-layer protocol
  – complexity at network's "edge"

DNS
DNS services
• Hostname to IP address translation
• Host aliasing
  – Canonical and alias names
• Mail server aliasing
• Load distribution
  – Replicated Web servers: set of IP addresses for one canonical name
Why not centralize DNS?
• single point of failure
• traffic volume
• distant centralized database
• maintenance
A centralized design doesn't scale!

Structure of DNS Names
• Each name consists of a sequence of alphanumeric components separated by periods
• Examples:
  – www.ssuet.edu.pk
  – ssuet.edu.pk
  – khi.comsats.net.pk
  – aurangzeb.ssuet.edu.pk
• Top Level Domains (right-most components; also known as TLDs) are defined by a global authority
• Organizations apply for names in a top-level domain:
  – fsu.edu
  – macdonalds.com
• Organizations determine their own internal structure
  – eng.fsu.edu
  – cs.purdue.edu

TLD and Authoritative Servers
Root DNS servers
• com DNS servers
  – yahoo.com DNS servers
  – amazon.com DNS servers
• org DNS servers
  – pbs.org DNS servers
• edu DNS servers
  – poly.edu DNS servers
  – umass.edu DNS servers
Distributed, hierarchical database maintained at the servers

NAME Servers
• Root name servers: 13 root name servers worldwide (e.g., Verisign, Sprint, AT&T). A root name server contacts the authoritative name server if the name mapping is not known, gets the mapping, and returns the mapping to the local name server.
• Top-level domain (TLD) servers: responsible for com, org, net, edu, etc., and all top-level country domains such as uk, fr, ca, jp.
• Authoritative DNS servers: organization's DNS servers, providing authoritative hostname to IP mappings for organization's servers (e.g., Web and mail).
• Local DNS servers: organization's DNS servers located on various subnets to provide DNS lookups for hosts on the subnet. May not be accessible from outside the subnet.

DNS Name Resolution Example
[Figure: DNS name resolution example]

DNS: caching and updating records
• once (any) name server learns a mapping, it caches the mapping (domain name to IP address)
  – cache entries time out (disappear) after some time, governed by a time-to-live (TTL) (usually 20 minutes)
  – TLD server addresses are typically cached in local name servers, so root name servers are not often visited
• update/notify mechanisms under design by IETF
  – RFC 2136
  – http://www.ietf.org/html.charters/dnsind-charter.html

DNS messages
• DNS can use the services of UDP or TCP, using the well-known port 53
• DNS has two types of messages: query and response; both have the same format
• The query message consists of a header and question records
• The response message consists of a header, question records, answer records, authoritative records, and additional records
[Figure: DNS message header]

Dynamic DNS
In DNS, changes to the DNS master file are made manually. To rectify this shortfall, DDNS was designed. In DDNS, when a binding between a name and an address is determined, the information is sent to a primary DNS server. The primary server updates the zone.
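The query message layout described under DNS messages (a header followed by question records) can be sketched as follows. This is a minimal illustration assuming the standard wire format of RFC 1035, with a 12-byte header and one question asking for an address (type A, class IN); the helper names, the query ID, and the host name are hypothetical, and the message is only built, not sent.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw     # one length byte, then the label itself
    return out + b"\x00"                   # zero-length label terminates the name

def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a DNS query: 12-byte header followed by one question record."""
    flags = 0x0100                         # standard query, recursion desired
    # Header fields: ID, flags, then question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qtype_a, qclass_in = 1, 1              # ask for an IPv4 address, Internet class
    question = encode_name(name) + struct.pack("!HH", qtype_a, qclass_in)
    return header + question

msg = build_query("www.example.com")       # hypothetical name
print(len(msg), msg.hex())
```

Because a response uses the same format, a resolver can match it to the query by the ID field and then read the answer records that follow the echoed question.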
Pure P2P architecture
• no always-on server
• arbitrary end systems communicate directly
• peers are intermittently connected and change IP addresses

BitTorrent Overview
• P2P file-distribution system allowing peers to share music, video and other media files
• Central server helps users find an initial set of peers that have pieces of the file
• Tracker server keeps track of peers possessing content of individual files
• Users download the file by participating in an exchange:
  – they give pieces that they have
  – in return for pieces that they don't have
• Therefore, for the system to work, users must have an incentive to give
• Users who just get, but do not give, are called free riders
• Protocol must discourage free riding
[Figures: BitTorrent; P2P: centralized directory]

Skype
• Proprietary application-layer protocol
• IP telephony system (P2P)
• Allows users to make phone calls
  – to Skype users
  – to regular phone users
• Skype components
  – Skype client (SC): the client program used to make phone calls to known Skype users; a host cache is maintained at each Skype client
  – Central login server: a centralized component that processes account information and authentication
  – Super-nodes (SNs): nodes with powerful CPU and bandwidth that know the location of other nodes and are contacted by other Skype clients/ordinary nodes to make calls
• Calls are routed via Skype nodes

Skype User Search Procedure
• A Skype client making a phone call needs to find other users
• It contacts super-nodes from its host cache, asking them to help find the user
• Super-nodes return a list of nodes to contact
• The client contacts those nodes
• If unsuccessful, the client asks for more nodes
• Guarantees to find any user that has logged in within the last 72 hours
• Not much specific information on Skype protocol is available…
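Returning to the BitTorrent exchange rule (peers give pieces they have in return for pieces they lack), the toy model below shows why a free rider gains nothing under a strict give-to-get policy. This is a simplification for illustration, not the actual BitTorrent algorithm; the function names and the piece numbers are hypothetical.

```python
def tradeable(have_a: set, have_b: set):
    """Return (pieces A can give B, pieces B can give A)."""
    return have_a - have_b, have_b - have_a

def exchange(have_a: set, have_b: set) -> bool:
    """Swap one piece each way if both peers have something the other lacks.

    Returns True if a swap happened. In this toy model a peer with nothing
    to give (a free rider) receives nothing.
    """
    a_gives, b_gives = tradeable(have_a, have_b)
    if not a_gives or not b_gives:
        return False                  # one side cannot contribute: no trade
    piece_from_a = min(a_gives)       # pick the lowest-numbered piece to send
    piece_from_b = min(b_gives)
    have_b.add(piece_from_a)
    have_a.add(piece_from_b)
    return True

alice = {0, 1, 2}
bob = {2, 3}
free_rider = set()

print(exchange(alice, bob))           # both sides contribute a piece
print(exchange(alice, free_rider))    # free rider has nothing to offer
print(sorted(alice), sorted(bob), sorted(free_rider))
```

A policy this strict would also lock out a newly joined peer that has no pieces yet, so a practical protocol needs some way to bootstrap newcomers while still discouraging free riding.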