Ch4 Traditional Internet Applications

Department of Engineering Science
ES465/CES 440, Intro. to Networking & Network Management
Traditional Internet Applications
http://www.sonoma.edu/users/k/kujoory
References
• “Computer Networks & Internet,” Douglas Comer, 6th ed., Pearson, 2014, Ch 4. Textbook 5th ed. slides by Lami Kaya (LKaya@ieee.org) with some changes.
• “Computer Networks,” A. Tanenbaum, 5th ed., Prentice Hall, 2011, ISBN-13: 978013212695-3.
• “Computer & Communication Networks,” Nader F. Mir, 2nd ed., Prentice Hall, 2015, ISBN-13: 9780133814743.
• “Data Communications Networking,” Behrouz A. Forouzan, 4th ed., McGraw-Hill, 2007.
• “Data & Computer Communications,” W. Stallings, 7th ed., Prentice Hall, 2004.
• “Computer Networks: A Systems Approach,” L. Peterson, B. Davie, 4th ed., Morgan Kaufmann, 2007.
Ali Kujoory
6/30/2016
Not to be reproduced without permission
Topics Covered
• 4.1 Introduction
• 4.2 Application-Layer Protocols
• 4.3 Representation & Transfer
• 4.4 Web Protocols
• 4.5 Document Representation with HTML
• 4.6 Uniform Resource Locators & Hyperlinks
• 4.7 Web Document Transfer with HTTP
• 4.8 Caching in Browsers
• 4.9 Browser Architecture
• 4.10 File Transfer Protocol (FTP)
• 4.11 FTP Communication Paradigm
• 4.12 Electronic Mail
• 4.13 The Simple Mail Transfer Protocol (SMTP)
• 4.14 ISPs, Mail Servers, & Mail Access
• 4.15 Mail Access Protocols (POP, IMAP)
• 4.16 Email Representation Standards (RFC2822, MIME)
• 4.17 Domain Name System (DNS)
• 4.18 Domain Names That Begin with www
• 4.19 The DNS Hierarchy & Server Model
• 4.20 Name Resolution
• 4.21 Caching in DNS Servers
• 4.22 Types of DNS Entries
• 4.23 Aliases & CNAME Resource Records
• 4.24 Abbreviations & the DNS
• 4.25 Internationalized Domain Names
• 4.26 Extensible Representations (XML)
4.1 Introduction
The chapter
– Explains that Internet services are defined by application
programs
– Characterizes the client-server model that such programs use to
interact
– Covers the socket API
– Examines Internet applications
– Defines the concept of a transfer protocol
– Explains how applications implement transfer protocols
– Considers standard Internet applications
– Describes the transfer protocol each uses
4.2 Application-Layer Protocols
• Whenever a programmer creates two network
applications, the programmer specifies some details,
such as:
– The syntax & semantics of messages that can be exchanged
– Whether the client or server initiates interaction
– Actions to be taken if an error arises
– How the two sides know when to terminate communication
• There are two broad types of application-layer
protocols that depend on the intended use:
– Private communication
– Standardized service
Here, “network applications” means two applications that can communicate.
4.2 Application-Layer Protocols
• Private communication
– A programmer creates a pair of applications that communicate
over the Internet with the intention that the pair is for private use
– Interaction between the two applications is straightforward
• code can be written without writing a formal protocol specification
• Standardized service
– The expectation is that many programmers will create server software
to offer the service or client software to access the service; in this
case:
• The application protocol must be documented independent of
implementation
• The specification must be precise & unambiguous
• The size of a protocol specification depends on the
complexity of the service
4.3 Representation & Transfer
• Application-layer protocols specify two aspects of
interaction
– Representation
– Transfer
• Fig. 4.1 explains the distinction
Figure 4.1 Two key aspects of an application-layer protocol
4.4 Web Protocols
• The World Wide Web (WWW)
is one of the most widely used
services in the Internet
• The Web is complex
– many protocol standards have
been devised to specify various
aspects & details
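[Diagram: a browser program on the client follows a hyperlink to sonoma.edu, using HTTP over a TCP connection across the Internet, to reach the HTTP server at sonoma.edu, which serves a file from its disk; the client also has a local disk.]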
• Fig. 4.2 illustrates major WWW
standards
Figure 4.2 Three key standards that the World Wide Web service uses.
4.5 Document Representation with HTML
• HyperText Markup Language (HTML) is a
representation standard that specifies the syntax of a
web page
• HTML has the following general characteristics:
– Uses a textual representation
– Describes pages that contain multimedia
– Follows a declarative rather than procedural paradigm
– Provides markup specifications instead of formatting
– Permits a hyperlink to be embedded in an arbitrary object
– Allows a document to include metadata
• It allows a programmer to specify a complex web page
that contains graphics, audio, video, as well as text
– “Hypermedia” would have been a more accurate name than “hypertext”
4.5 Document Representation with HTML
• HTML is classified as declarative
– It allows one to specify what is to be done, not how to do it
• HTML is classified as a markup language
– It only gives general guidelines for display & does not include
detailed formatting instructions
– HTML allows a page to specify the level of importance of a
heading
– HTML does not require the author to specify the exact font,
typeface, point size, or spacing for the heading
• HTML extensions have been created that do allow the specification of
an exact font, typeface, point size, & formatting
• A browser chooses all display details
– The use of a markup language is important
• because it allows a browser to adapt the page to the underlying display
hardware
• a page can be formatted for a high resolution or low resolution display, a
large screen or a small hand-held device such as an iPhone or PDA
4.5 Document Representation with HTML
• To specify markup
– HTML uses tags embedded in the document (see Fig. 4.3)
• Tags provide structure as well as formatting
• Tags control all display
– white space (i.e., extra lines & blank characters) can be inserted
at any point in the HTML document
• without any effect on the formatted version that a browser displays
• HTML tags are case insensitive
– does not distinguish between uppercase & lowercase letters
• Examples:
– IMG tag to encode a reference to an external image
• Additional parameters can be specified in an IMG tag to specify the
alignment of the figure with surrounding text
• An example is given in Fig. 4.4
4.5 Document Representation with HTML
HTML uses a declaration (a tag) for each
piece of markup.
<HTML>
<HEAD>
<TITLE>
text that forms the document title
</TITLE>
</HEAD>
<BODY>
body of the document appears here
</BODY>
</HTML>
Figure 4.3 The general form of an HTML document
An example of a web display looks like
this:
Sonoma State University (SSU)
Engineering Science Department:
The Engineering Science Department offers
an Electrical Engineering program.
The simplified page source looks like
this:
<html>
<h1> Sonoma State University (SSU) </h1>
<b><h2> Engineering Science Department:
</h2></b>
The Engineering Science Department offers
an Electrical Engineering program.
</html>
HTML uses the IMG tag to encode a
reference to an external image:
Here is an icon of a house. <IMG
SRC="house_icon.jpg" ALIGN=middle>
4.6 Uniform Resource Locators & Hyperlinks
• The Web uses a syntactic form known as a Uniform
Resource Locator (URL) to specify a web page
• The general form of a URL is:
protocol://computer_name:port/document_name%parameters
• where
– protocol is the name of the protocol used to access the document
– computer_name is the domain name of the computer on which
the document resides
– :port (optional) is the protocol port number at which the server is listening
– document_name (optional) is the name of the document
– %parameters (optional) gives parameters for the page
– Example:
4.6 Uniform Resource Locators & Hyperlinks
• In a typical URL, a user can omit many of the parts
• A URL that gives only a computer name omits the
– protocol (http is assumed)
– port (80 is assumed)
– document name (index.html is assumed), &
– parameters (none are assumed)
• A URL contains the information a browser needs to retrieve a page
• Browser uses the separator characters
– colon, slash, & percent, to divide the URL into four components:
• a protocol, a computer name, a document name, & parameters
• Browser uses the computer name & protocol port to form a
connection to the server on which the page resides
• Browser uses the document name & parameters to request a
page
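The way a browser divides a URL & fills in omitted parts can be sketched in Python. This is only an illustration: it relies on the standard library's urllib.parse.urlsplit, which follows the current URL syntax in which a question mark (rather than the percent sign described in these slides) introduces the parameters. The function name url_parts, the dictionary keys, & the example.com addresses are assumptions made for the example.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the components a browser needs, applying defaults."""
    # If the protocol is omitted, assume http (as the slides describe).
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        # Port 80 is assumed when no port is given.
        "port": parts.port or 80,
        # index.html is assumed when no document name is given.
        "document_name": parts.path.lstrip("/") or "index.html",
        "parameters": parts.query,
    }


print(url_parts("www.example.com"))
print(url_parts("http://www.example.com:8080/docs/page.html?lang=en"))
```

The computer name & port are what the browser would use to open the connection; the document name & parameters are what it would send in the request.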
4.7 Web Document Transfer with HTTP
• HyperText Transfer
Protocol (HTTP) is the
primary transfer protocol
that a browser uses to
interact with a web server
• A browser is a client that
extracts a server name
from a URL & contacts the
server
• Most URLs contain an
explicit protocol reference
of http:// or omit the
protocol altogether (HTTP
is assumed)
• HTTP can be
characterized as follows:
– Uses textual control
messages
– Transfers binary data files
– Can download or upload data
– Incorporates caching
4.7 Web Document Transfer with HTTP
• Once it establishes a connection
– a browser sends an HTTP request to the server
• Fig. 4.5 lists the four major request types; GET is the most
common
Figure 4.5 The four major HTTP request types.
4.7 Web Document Transfer with HTTP
• The most common form of
interaction begins with the
browser requesting a page
from the server
• The browser (client) sends a
GET request over the connection
• The server responds by
sending a header, a blank line,
& the requested document
• A GET request has the
following form:
GET /item version CRLF
– item gives the URL for the item
being requested
– version specifies a version of
the protocol (HTTP/1.0 or
HTTP/1.1)
– CRLF denotes two ASCII
characters
• carriage return & linefeed, that
are used to signify the end of a
line of text
• Version information is important
in HTTP
– it allows the protocol to change &
yet remain backward
compatible
– a browser sends version
information which allows
• a server to choose the highest
version that they can both
understand
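The GET request form above can be illustrated with a few lines of Python. This is a minimal sketch, not a full client: the function name build_get_request & the item index.html are assumptions, & the sketch defaults to HTTP/1.0 because an HTTP/1.1 request would also need a Host header line.

```python
def build_get_request(item, version="HTTP/1.0"):
    """Return the bytes of a minimal GET request for the given item."""
    # Request line: GET /item version CRLF
    request_line = "GET /" + item.lstrip("/") + " " + version + "\r\n"
    # A blank line (a second CRLF) marks the end of the request header.
    return (request_line + "\r\n").encode("ascii")


print(build_get_request("index.html"))
```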
4.7 Web Document Transfer with HTTP
• Fig. 4.6 shows the general format of lines in a basic
response header
Figure 4.6 General format of lines in a basic response header.
Figure 4.7 Example of status codes used in HTTP.
4.7 Web Document Transfer with HTTP
• The first line of a response
header contains a status code
– that tells the browser whether the
server handled the request
– If the request was incorrectly
formed or the requested item
was not available, the status
code pinpoints the problem
• Additional lines of the header
give further information about
the item, such as
– its length
– when it was last modified
– and the content type
• E.g., a server returns status
code 404 if the requested item
cannot be found
• When it honors a request, a
server returns status code 200
4.7 Web Document Transfer with HTTP
• Fig. 4.8 shows sample output from an Apache web server
• The item being requested is a text file containing 16 characters
– i.e., the text “This is a test.” plus a NEWLINE character
• Although the GET request specifies HTTP version 1.0, the server
runs version 1.1
• The server returns 9 lines of header, a blank line, & the contents of
the file
Figure 4.8 Sample HTTP response from an Apache web server.
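A response of this shape (status line, header lines, blank line, contents) can be taken apart as sketched below. The SAMPLE bytes are a shortened, hypothetical response written for this example, not the actual output of Fig. 4.8; only the 16-character body follows the description above. The function name parse_response is likewise an assumption.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, & body."""
    # The first blank line separates the header from the contents.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # First line: version, numeric status code, textual reason.
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


# Hypothetical, shortened response for illustration only.
SAMPLE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 16\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"This is a test.\n"
)

version, status, reason, headers, body = parse_response(SAMPLE)
print(status, reason, headers["content-length"], body)
```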
4.8 Caching in Browsers
• Caching provides an important
optimization for web access
– when users tend to visit the
same web sites repeatedly
• Much of the content at a given
site consists of large images
– Graphics Interchange Format (GIF)
– Joint Photographic Experts Group
(JPEG)
• Such images often contain
backgrounds or banners
– they do not change frequently
• A browser can reduce
download times significantly
– by saving a copy of each image
in a cache on the user's disk &
using the cached copy
• What happens if the document
on the web server changes
after a browser stores a copy in
its cache?
– How can a browser tell whether
its cached copy is up-to-date?
4.8 Caching in Browsers
• Whenever a browser obtains a
document from a web server,
the header specifies the last
time the document was
changed
• A browser saves the Last-Modified date information
along with the cached copy
– Before it uses a cached copy, a browser makes a HEAD
request to the server &
compares the Last-Modified date
of the server's copy to the Last-Modified date of the cached
copy
– If the cached version is stale, the
browser downloads the new
version
• Algorithm 4.1 summarizes
caching, but omits several
minor details, e.g.,
– HTTP allows a web site to
include a No-cache header that
specifies a given item should not
be cached
• Browsers do not cache small
items
– because the time to download
the item with a GET request is
almost the same as the time to
make a HEAD request & keeping
many small items in a cache can
increase cache lookup times
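The freshness check described above comes down to comparing two Last-Modified dates. The sketch below shows only that comparison; it does not contact a server, & the function name cached_copy_is_stale & the sample dates are assumptions. It uses email.utils.parsedate_to_datetime from the Python standard library to read the date format used in HTTP headers.

```python
from email.utils import parsedate_to_datetime


def cached_copy_is_stale(cached_last_modified, server_last_modified):
    """Return True if the server's copy is newer than the cached copy."""
    cached = parsedate_to_datetime(cached_last_modified)
    server = parsedate_to_datetime(server_last_modified)
    return server > cached


# Hypothetical header values: one saved with the cached copy,
# one returned by a later HEAD request.
saved = "Mon, 06 Jun 2016 10:00:00 GMT"
current = "Tue, 07 Jun 2016 09:30:00 GMT"
print(cached_copy_is_stale(saved, current))
```

If the function returns True, the browser would issue a GET to download the new version; otherwise it can use the cached copy.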
4.8 Caching in Browsers
Browser Architecture
Parts of Web model, “Computer Networks,” A. Tanenbaum, 4th ed.
4.9 Browser Architecture
• A browser's structure is complex
• It must understand HTTP
• A browser also provides
support for other protocols
– It must contain client code for
each of the protocols used
– It must know how to interact with
a server & how to interpret
responses
– It must know how to access the
File Transfer Protocol (FTP)
service
• Fig. 4.9 illustrates components
of a browser
Figure 4.9 Architecture of a browser that
can access multiple services.
4.10 File Transfer Protocol (FTP)
• A file is the fundamental storage abstraction
• A file can hold an arbitrary object, e.g.,
– a document, spreadsheet, computer program, graphic image, or data
• FTP can send a copy of a file from one computer to another
– provides a powerful mechanism for the exchange of data
• File transfer across the Internet is complicated because computers are heterogeneous
• Each computer system may have different:
– file representations
– information types
– naming
– file access mechanisms
• Examples of differences in OSes
– file extension .jpg or .jpeg for a JPEG image
– Line termination in a text file by a LINEFEED character or CARRIAGE RETURN & LINEFEED
– a slash (/) or a backslash (\) for a separator in file names
http://en.wikipedia.org/wiki/File_Transfer_Protocol
4.10 File Transfer Protocol (FTP)
• FTP, RFC 959, a standardized file transfer service, provides for:
– Transfer of any file content & data type
– Bidirectional transfer (download/get or upload/put)
– Use of TCP for reliability
– Authentication & ownership support
• Allows each file to have ownership & access restrictions
– Browsing of folders
– Use of ASCII text for textual control messages
– Heterogeneity among computers & operating systems
• FTP protocol is usually invisible
– Invoked automatically by a browser when a user requests a file download
• There is also TFTP (Trivial FTP, RFC 1350)
http://en.wikipedia.org/wiki/Trivial_File_Transfer_Protocol
– A simple FTP version to get a file from or put a file onto a remote host
– But not as reliable as FTP
• Uses UDP
• Mainly over LANs
– No authentication
4.11 FTP Communication Paradigm
• FTP uses a client-server approach for interactions
– A client establishes a connection to an FTP server which is listening &
sends a series of requests to which the server responds
– At the server, FTP uses a
• control connection (port 21) &
• data connection (port 20)
– Each time the server needs to download or upload a file, the server opens
a new connection
– Most commands & interactions are transparent to the users
[Diagram: FTP client & server. On the client, the user at a terminal works through a user interface function, a user protocol interpreter, & a user data transfer function attached to the client's file system. The server has its own protocol interpreter & a server data transfer function attached to the server's file system. FTP commands/replies travel over the control connection (port #21); file data travels over the data connection (port #20).]
Fig. 4.10 Illustration of FTP connections during a typical session
The exchanges for security (password) are not shown
4.11 FTP Communication Paradigm
• Fig. 4.10 omits several important details, e.g.,
– after creating the control connection, a client must log into the server; FTP provides
• a USER command that the client sends to provide a login name
• a PASS command that the client sends to provide a password
– The server sends a numeric status response over the control connection to let the client know whether the login was successful
• A client can only send other commands after a login is successful
• When accessing public files, a client uses anonymous login
– which consists of user name anonymous & a password, usually guest
• What protocol port number should a server specify when connecting to the client?
4.11 FTP Communication Paradigm
• A client allocates a
protocol port on its local
OS & sends the port # to
the server
– i.e., the client binds to the
port to await a connection
– Then transmits a PORT
command over the control
connection to inform the
server about the port # being
used
• Algorithm 4.2 summarizes
the steps
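The PORT command can be illustrated with a short sketch. The argument format used here, four address bytes followed by the high & low bytes of the port number, comes from RFC 959 rather than from these slides; the function names & the sample address & port are assumptions chosen for the example.

```python
def format_port_command(ip, port):
    """Build a PORT command announcing where the client awaits the data connection."""
    address_bytes = ip.split(".")
    # The 16-bit port number is sent as two decimal bytes: high, then low.
    high, low = divmod(port, 256)
    fields = address_bytes + [str(high), str(low)]
    return "PORT " + ",".join(fields) + "\r\n"


def parse_port_command(line):
    """Recover the (ip, port) pair from a PORT command, as a server would."""
    fields = line.strip().split(" ", 1)[1].split(",")
    ip = ".".join(fields[:4])
    port = int(fields[4]) * 256 + int(fields[5])
    return ip, port


command = format_port_command("128.10.2.1", 2363)
print(repr(command))
print(parse_port_command(command))
```

Because the address & port travel as text inside the control connection, any device that changes the client's address must also rewrite this command.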
• FTP protocol may face
problems in certain cases
– transmission of a protocol
port # will fail if one of the two
endpoints lies behind a
Network Address
Translation (NAT) device
• e.g., a wireless router used
in a residence or small office
• Ch 23 explains that FTP is an
exception
◦ A NAT device recognizes an
FTP control connection,
◦ inspects the contents of the
connection, &
◦ rewrites the values in a PORT
command
FTP Commands & Examples
• Ftp commands:
abort    cd          get    put
ascii    close       help   pwd
bell     delete      ls     rename
binary   debug       mkdir  rmdir
bye      disconnect  open   status
• Example 1:
– ftp> help ls
• ls list content of remote directory
– ftp> help bell
• bell beep when command completed
• Anonymous ftp session
– User does not need an account or password
– Used for publicly available files
• Example 2: Obtain a copy of tcpbook.tar
$ ftp arthur.cs.purdue.edu
Connected to arthur.cs.purdue.edu
200 arthur.cs.purdue.edu FTP Server (DYNIX V3.0.12) ready
Name (arthur:usra): anonymous
331 Guest login ok, send ident as password
Password: guest
230 Guest login ok, access restrictions apply
ftp> get pub/comer/tcpbook.tar bookfile
200 PORT Command okay
150 Opening data connection for /bin/ls (128.10.2.1, 2363) (7897088 bytes)
226 Transfer complete
8272793 bytes received in 98.04 seconds (82 Kbytes/s)
ftp> close
221 Goodbye
ftp> quit
4.12 Electronic Mail
• One of the most widely used
Internet applications
• Fig. 4.11 illustrates a simplified
architecture of an email system
• Email software is divided into
two conceptually separate pieces:
– An email interface application
• A mechanism for a user to
compose & edit outgoing
messages as well as read &
process incoming email
– A mail transfer program
• Acts as a client to send a message
to the mail server on the
destination computer
• the mail server accepts incoming
messages & deposits each in the
appropriate user's mailbox
• Email system is architecturally
based on postal system.
– Message (Content) & Envelope
are separate.
– Very helpful in handling content &
envelope separately.
Algorithm 4.3 lists the steps taken to send an email
4.12 Electronic Mail
• The specifications used for Internet email can be divided
into three broad categories as Fig. 4.12 lists
Figure 4.12 The three types of protocols used with email
Specifications & Standards:
• IETF Simple Mail Transfer Protocol (SMTP) delivers simple text messages. Originally RFC
821, currently RFC 5321; both run over TCP/IP & carry ASCII text only.
• IETF Multi-purpose Internet Mail Extensions (MIME), RFC 2045, can deliver other types of data (voice,
images, video clips). It extends the message format defined originally in RFC 822, currently RFC 5322.
• ITU-T (ISO/OSI) Message Handling System (MHS), X.400, is the counterpart of SMTP. It is not used
as much due to its complexity.
Email Architecture & Operation
• User Agent (UA) program creates, reads, sends, & receives email.
– Uses a local (client) program, e.g., MS Outlook.
– Can be command-based or graphical.
• Message Transfer Agent (MTA, mail server) is a server process.
– Queues & moves the message from source to destination.
– Uses SMTP over TCP to transfer the message.
• MTA also implements mailing lists to deliver a message to a list of recipients.
• UA can configure MTA.
• Mailboxes can be implemented in MTA (mail server) to store the email received by a user.
• Users can use different UAs to access a mailbox.
4.13 The Simple Mail Transfer Protocol (SMTP)
• The Simple Mail Transfer
Protocol (SMTP) is the
standard protocol that a mail
transfer program uses
• SMTP can be characterized as:
– Follows a stream paradigm (TCP)
– Uses textual control messages
– Only transfers text messages
– Allows a sender to specify recipients’ names &
• check each name
– Sends one copy of a given message
• SMTP has a restriction to send only textual content
– The MIME (Multipurpose Internet Mail Extensions) standard allows email to include attachments, e.g.,
• graphic images or binary files
• SMTP can send a single message to multiple recipients
– The protocol allows a client to list users & then send a single copy of a message for all users on the list
Differences between FTP & Email
• FTP provides point-to-point, peer-to-peer, 2-way transfer.
– More efficient for file transfer.
• Knowledge of peer’s status, data retrieval, ..
– More suitable for file operation.
• Access control & management, file format, file size
• Files can be big (GB)
• SMTP provides point-to-multipoint transfer, based on store-and-forward at the application layer.
– More efficient for mail service with features.
• Blind carbon copy, with or without receipt, ..
– Has limitation on file size & type.
– No file operation
• No file access & management
Email Message
• Email comprises Envelope & Message that are separate.
– Envelope encapsulates the message for routing.
• Added & read by MTA.
• Has info needed for transporting the message.
– Message consists of Header & Body that are separate.
• Made by UA
• Consists of
◦ Header (control info for UA)
◦ Body (transparent to MTA)
Figure: Envelopes & messages: (a) Paper mail (b) Electronic mail.
An Example of SMTP Session
Message from John
To: Paul (OK)
To: Matthew (no such user)
CR = Carriage Return, LF = Line Feed
<CR><LF> = end of line & move to the next line
<CR><LF>.<CR><LF> = end of data (a line containing only a period)
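The client side of such a session can be sketched as a list of text lines. The command names (HELO, MAIL FROM, RCPT TO, DATA, QUIT) are standard SMTP commands; the function name, host name, & example.com addresses are assumptions for illustration. The sketch only builds the lines a client would send: it does not open a connection or read the server's replies, & it omits details such as escaping message lines that begin with a period.

```python
def smtp_client_commands(sender, recipients, message_lines,
                         client_host="client.example.com"):
    """Return the lines an SMTP client would send, each ending in CRLF."""
    lines = ["HELO " + client_host, "MAIL FROM:<" + sender + ">"]
    # One RCPT TO per recipient; the server accepts or rejects each name.
    for recipient in recipients:
        lines.append("RCPT TO:<" + recipient + ">")
    # A single copy of the message follows DATA, whatever the recipient count.
    lines.append("DATA")
    lines.extend(message_lines)
    # A line containing only a period marks the end of the data.
    lines.append(".")
    lines.append("QUIT")
    return [line + "\r\n" for line in lines]


commands = smtp_client_commands(
    "john@example.com",
    ["paul@example.com", "matthew@example.com"],
    ["Subject: test", "", "Hello from John."],
)
for command in commands:
    print(repr(command))
```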
User Agent (UA)
• Called the email reader that can
accept a variety of commands.
– Composing, receiving, & replying to
messages, & managing mailboxes.
– E.g., MS Outlook, Google Gmail,
Mozilla Thunderbird.
• Has menu or icon-driven interface
using mouse or touch screen.
• Displays message folders,
message summary, message
search, & sometimes the calendar
• An auto responder works on behalf of
the UA but runs on the mail server
– Can forward incoming email to a
different address, or work as
vacation agent.
Typical elements of the
user agent interface.
Message Formats
• Messages sent by UA must be placed in a standard format to be handled by the message transfer agent.
• ASCII email is based on RFC 822.
• RFC 822 was later revised as RFC 5322; MIME adds support for multimedia.
• SMTP messages use a simple envelope based on RFC 5321.
• UA builds a message & passes it to MTA.
• MTA uses some of the header fields to construct the envelope.
• “To” field in the header gives the address of primary recipients, 1st party.
• Cc (2nd party) & Bcc (3rd party) give secondary addresses.
Header          Meaning
To:             E-mail address(es) of primary recipient(s)
Cc:             E-mail address(es) of secondary recipient(s)
Bcc:            E-mail address(es) for blind carbon copies
From:           Person or people who created the message
Sender:         E-mail address(es) of the actual sender
Received:       Line added by each transfer agent along the route
Return-Path:    Can be used to identify a path back to the sender
RFC 5322 header fields related to message transport.
4.14 ISPs, Mail Servers, & Mail Access
4.14 ISPs, Mail Servers, & Mail Access
• The web browser (webmail)
approach is straightforward:
– an ISP provides a special web
page that displays messages
from a user's mailbox
• Advantages of webmail
– ability to read email from any
computer anywhere connected
to Internet
– a user does not need to run a
special mail interface application
• Disadvantage
– No access to email when off line
– May lose the emails in the
mailbox by changing the provider
• A special mail
application can download an
entire mailbox onto a local
computer, such as a laptop
– When connected to the Internet,
a user can run an email program
that downloads an entire mailbox
onto the laptop
• Advantages
– Can process email when the
laptop is offline (on an airplane)
– Once online can upload emails
the user has created & download
any new email
– Always has access to emails in
the laptop
IMAP vs POP3
(a) IMAP - Sending & reading email when
receiver has a permanent Internet connection.
• UA runs on the same machine as the MTA.
• Client connects to server using a secure
transport & begins to issue commands.
• Assumes all emails remain on server
indefinitely.
• Displays all messages on a computer
– Do not use IMAP on slow modems.
• RFC 3501 (over TCP port 143)
(b) POP3 - Reading e-mail when receiver has an
Internet connection to an email provider.
• Allows a UA to contact email provider’s MTA.
• No need for receiver to have a connection
after download.
• Can clear out of provider mailbox after read.
• Emails spread over multiple PCs when read.
• RFC 1939 (over TCP port 110); allows
– Authorization - login
– Transaction - collect emails, mark/delete
– Update - delete email
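The three POP3 stages can be pictured as a small state machine. The sketch below is an in-memory toy, not a real POP3 client or server: the class name Pop3Sketch, its method names, & the sample mailbox are assumptions, & credential checking is left out. The point it illustrates is that messages marked for deletion during the transaction stage are only removed when the session moves to the update stage.

```python
class Pop3Sketch:
    """Toy model of the POP3 authorization, transaction, & update stages."""

    def __init__(self, mailbox):
        self.mailbox = dict(mailbox)   # message number -> message text
        self.marked = set()            # messages marked for deletion
        self.state = "AUTHORIZATION"

    def login(self, user, password):
        # Real servers verify the credentials; this sketch accepts anything.
        if self.state != "AUTHORIZATION":
            raise RuntimeError("login is only valid in the authorization stage")
        self.state = "TRANSACTION"

    def retrieve(self, number):
        if self.state != "TRANSACTION":
            raise RuntimeError("not in the transaction stage")
        return self.mailbox[number]

    def mark_deleted(self, number):
        if self.state != "TRANSACTION":
            raise RuntimeError("not in the transaction stage")
        self.marked.add(number)

    def quit(self):
        # Leaving the transaction stage triggers the update stage,
        # where the marked messages are actually removed.
        if self.state == "TRANSACTION":
            self.state = "UPDATE"
            for number in self.marked:
                del self.mailbox[number]
        return self.mailbox


session = Pop3Sketch({1: "first message", 2: "second message"})
session.login("user", "secret")
print(session.retrieve(1))
session.mark_deleted(1)
print(session.quit())
```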
IMAP = Internet Message Access
Protocol
POP3 = Post Office Protocol ver 3
A Comparison of POP3 & IMAP
[Comparison table not recoverable; surviving entries: RFC 3501 (IMAP); Used by: ISPs, Corporations.]
4.15 Mail Access Protocols (POP3, IMAP)(Skip to MIME)
• Protocols have been created that
provide email access
• An access protocol is distinct
from a transfer protocol
– access only involves a single user
interacting with a single mailbox
– transfer protocols allow a user to
send mail to other users
• Access protocols have the
following characteristics:
– Provide access to a user’s mailbox
– Permit a user to view headers,
download, delete, or send
messages
– Client runs on user’s personal
computer
– Server runs on a computer that
stores user’s mailbox
• Viewing a list of messages without
downloading the message contents
is useful
– Especially, in cases where the link
between two parties is slow
– E.g., a user browsing on a cell phone
may look at headers & delete spam
without waiting to download the
message contents
4.15 Mail Access Protocols (POP, IMAP)
• A variety of mechanisms available for email access
– Some ISPs provide free email access software to their subscribers
– In addition, two standard email access protocols have been
created
• Fig. 4.15 lists the standard protocol names
• Two protocols differ in many details
– In particular, each provides its own authentication mechanism that
a user follows to identify themselves
Figure 4.15 The email access protocols.
4.16 Email Representation Standards (RFC2822, MIME)
• Two important email
representation standards
exist:
– Mail Message Format RFC
2822
– Multi-purpose Internet Mail
Extensions (MIME) RFC
2045
• RFC 2822 Mail Message Format:
– takes its name from the IETF
standards document RFC
2822
– an email message is
represented as a text file &
consists of
• a header section
• a blank line, &
• a body
– Header lines each have the
form:
Keyword: information
• where the set of keywords is
defined to include
◦ From:, To:, Subject:, Cc:
• MIME, next slide
4.16 Email Representation Standards (RFC 2822, MIME)
• Multi-purpose Internet
Mail Extensions (MIME)
– MIME standard extends the
functionality of email to allow
the transfer of non-text data
in a message
– MIME specifies how a binary
file can be encoded into
printable characters,
included in a message, &
decoded by the receiver
– The Base64 encoding
standard is most popular
• Maps each 6-bit block of input to
one 8-bit printable ASCII character
• But MIME does not restrict
encoding to a specific form
• MIME permits a sender
/receiver to choose a
convenient encoding
• The sender includes extra
lines in the header to specify
encoding used
– MIME allows a sender to
divide a message into several
parts &
– To specify an encoding for
each part independently
• a user can send a plain text
message & attach a graphic
image, a spreadsheet, & an
audio clip, each with their
own encoding
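The Base64 mapping can be tried with Python's standard base64 module. In this sketch the helper name to_base64 & the sample input are assumptions; the input "Man" is 3 bytes (24 bits), which Base64 splits into four 6-bit blocks & emits as 4 printable characters.

```python
import base64


def to_base64(data):
    """Encode arbitrary bytes as printable ASCII text using Base64."""
    return base64.b64encode(data).decode("ascii")


encoded = to_base64(b"Man")
print(encoded)                       # 3 input bytes become 4 characters
print(base64.b64decode(encoded))     # the receiver recovers the original bytes
```

Because every 3 bytes of input become 4 characters of output, a Base64-encoded attachment is roughly one third larger than the original file.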
4.16 Email Representation Standards (RFC2822, MIME)
• MIME adds two lines to an
email header
– One to declare that MIME has
been used to create the
message, &
– Another to specify how MIME
information is included in the
body, e.g.,
– The header lines:
• MIME-Version: 1.0
• Content-Type: Multipart/Mixed;
Boundary=Mime_separator
– Mime_separator will appear
in the message body before
each part
• When MIME is used to
send a standard text
message, the 2nd line
becomes
Content-Type: text/plain
• MIME is backward
compatible with email
systems that do not
understand the MIME
standard or encoding
– such systems have no way of
extracting non-text
attachments
– they treat the body as a
single block of text
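The two MIME header lines can be observed by building a message with Python's standard email package. This is a sketch under assumptions: the function name, the example.com addresses, the subject, & the attachment contents are all made up for illustration, & the library chooses its own boundary string when the message is serialized.

```python
from email.message import EmailMessage


def build_message_with_attachment(text, data, filename):
    """Build a message with a plain-text part & one binary attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "MIME sketch"
    # The text part; this also adds the MIME-Version header line.
    msg.set_content(text)
    # Adding an attachment turns the message into Multipart/Mixed;
    # the binary part is Base64-encoded by default.
    msg.add_attachment(data, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


msg = build_message_with_attachment("See the attached file.",
                                    bytes(range(8)), "data.bin")
print(msg["MIME-Version"])
print(msg.get_content_type())
print(msg.as_string())
```

Printing the whole message shows the separator line appearing in the body before each part, as described above.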
4.17 Domain Name System (DNS)
• DNS provides a service that
maps human-readable
symbolic names to computer
addresses
• Provides a directory service for TCP/IP applications, e.g.,
– Browsers, mail software
• Whenever an application needs to translate a name to an address, the application becomes a client of the naming system
– client sends a request message to a name server
– server finds the corresponding address & sends a reply message
• if it cannot answer a request, a name server temporarily becomes the client of another name server, until a server is found that can answer the request
• DNS is an interesting example of client-server interaction; the mapping
– Is not performed by a single server
– Is distributed among many servers located at sites across the Internet
• RFC 1034 & 1035, Domain
Names
• ITU X.500, Directory Service
4.17 Domain Name System
• Syntactically, each name
consists of a sequence of
alpha-numeric segments
separated by periods, e.g.,
– A computer in the Computer
Science Department at Purdue
University has the domain
name:
mordred.cs.purdue.edu
– A computer at Cisco, Inc. has
the domain name:
anakin.cisco.com
• Domain names are hierarchical, with the most significant part of the name on the right (e.g., edu, com)
– The left-most segment of a
name (mordred & anakin in
the examples) is the name of
an individual computer
– Other segments in a domain
name identify the group that
owns it, e.g., the segment
• purdue gives the name of a
university, &
• cisco gives the name of a
company
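The hierarchy in a name can be made concrete with a short Python sketch. The function names are illustrative; the domain names are the ones used in the examples above.

```python
def labels_most_significant_first(name: str) -> list[str]:
    """Split a domain name into segments, top-level domain first."""
    return name.rstrip(".").split(".")[::-1]


def host_and_domain(name: str) -> tuple[str, str]:
    """Separate the left-most segment (the computer) from the rest."""
    host, _, domain = name.partition(".")
    return host, domain
```

For `mordred.cs.purdue.edu`, the first function returns `['edu', 'purdue', 'cs', 'mordred']`, and the second returns `('mordred', 'cs.purdue.edu')`.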
DNS Structure (Partly from A. Tanenbaum)
• A portion of the Internet domain name space
– A hierarchy, a tree structure
• Each domain is partitioned into subdomains
– Subdomains are further partitioned
• Leaves may be a single host or a company with many hosts
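[Figure: tree of the domain name space below an unnamed root. Generic top-level domains shown: com, edu, gov, mil, org, net, int. Country top-level domains shown: jp, us, nl. Lower-level labels shown: att, sonoma, cs, es, nsf, acm, ieee, ac, co, nec. Legend: conceptual server for domain; server for sub-domain. Example leaf: ali.kujoory@ieee.org]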
4.17 Domain Name System
• DNS does not specify the
number of segments in a name
• DNS does specify values for
the most significant segment,
which is called a top-level
domain (TLD)
– Controlled by the Internet
Corporation for Assigned
Names & Numbers (ICANN)
– ICANN designates one or more
domain registrars to administer
a given top-level domain &
approve specific names
• Some TLDs are generic, meaning they are generally available
– Other TLDs are restricted to specific groups or government agencies
• Fig. 4.16 lists example top-level DNS domains
• An organization applies for a name under one of the existing top-level domains
– most US corporations choose to register under the com domain
• DNS allows organizations to use a geographic registration
– E.g., the Corporation For National Research Initiatives registered the domain:
cnri.reston.va.us
Fig. 4.16 Example top-level domains & the group to which each is assigned
DNS Operation
• To map a name onto an IP address, an application program calls a library procedure called the resolver
– Provides to it the name as a parameter
• Resolver (DNS client) sends a UDP packet to a local DNS server
– which then looks up the name & returns the IP address to the resolver
– the resolver then returns the IP address to the caller
• Armed with the IP address, the program can then
– establish a TCP connection with the destination, or
– send it UDP packets
Name Servers
• Finding the IP address for a given
hostname is called resolution & is
done with the DNS protocol.
• Resolution:
– Computer requests local name server to resolve.
– Local name server asks the root name server.
– Root returns the name server for a lower zone.
– Continue down zones until name server can answer.
• DNS protocol:
– Runs on UDP port 53, retransmits lost messages.
– Caches name server answers for better performance.
[Figure: example of a resolver looking up a remote name in 10 steps.]
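The walk down the zones can be sketched as a toy simulation in Python. Nothing here touches the network: the server table, the server names, and the address (taken from the 192.0.2.0/24 documentation range) are invented for illustration, and a real name server holds far more state.

```python
# Each hypothetical server either knows the answer or refers to a lower zone.
SERVERS = {
    "root": {"referrals": {"edu": "edu-server"}, "answers": {}},
    "edu-server": {"referrals": {"purdue.edu": "purdue-server"}, "answers": {}},
    "purdue-server": {"referrals": {},
                      "answers": {"mordred.cs.purdue.edu": "192.0.2.10"}},
}


def resolve(name, start="root"):
    """Return (address, servers asked); address is None if nobody can answer."""
    server, path = start, []
    while True:
        path.append(server)
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], path
        # Follow the referral whose zone suffix matches the requested name.
        next_server = next(
            (s for suffix, s in zone["referrals"].items()
             if name == suffix or name.endswith("." + suffix)),
            None)
        if next_server is None:
            return None, path
        server = next_server
```

With this table, `resolve("mordred.cs.purdue.edu")` asks the root, then the edu server, then the purdue.edu server, which answers.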
DNS - Resource Records
• Records that make up the database are known as
“Resource Records”.
• Namespace stored on a “Name Server”.
• Every domain, whether a single host or a top-level domain, can have a set of resource records associated with it.
– For a single host, the most common resource record is its IP address.
• When a resolver gives a domain name to DNS,
– it gets back the resource record associated with that name.
• Real function of DNS is to map domain names onto
resource records.
DNS - Resource Records (2)
• A resource record is a five-tuple:
– Encoded in binary for efficiency, but represented in ASCII text
• One line per resource record
Example
    Domain_name           Time_to_live   Class   Type   Value
a)  ucbvax.berkeley.edu   60             IN      A      128.32.137.3
b)  berkeley.edu          86400          IN      NS     ucbvax.berkeley.edu
1. Domain_name field indicates the domain to which this record applies.
– There are many records for each domain.
– This field is the primary search key to satisfy queries.
2. Time_to_live field indicates how stable the record is.
– E.g., 86400 (# of sec in 1 day).
– Large value for a highly stable record.
3. Class field is IN for Internet information.
– Other codes for non-Internet.
4. Type field indicates what kind of record this is (next slide).
5. Value field - a number, a domain name, or an ASCII string.
– Semantics depend on the record type.
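The five-tuple can be modeled directly. The sketch below parses the one-line ASCII form of a record into its five fields; the class and function names are illustrative, and the sample records are the two from the example above.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    domain_name: str    # primary search key
    time_to_live: int   # seconds; large for a highly stable record
    rr_class: str       # IN for Internet information
    rr_type: str        # A, NS, MX, ...
    value: str          # meaning depends on rr_type


def parse_record(line: str) -> ResourceRecord:
    """Parse one whitespace-separated resource record line."""
    name, ttl, rr_class, rr_type, value = line.split(maxsplit=4)
    return ResourceRecord(name, int(ttl), rr_class, rr_type, value)
```

For instance, `parse_record("berkeley.edu 86400 IN NS ucbvax.berkeley.edu")` yields a record whose `time_to_live` is 86400, i.e., 24 × 60 × 60 seconds.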
DNS - Resource Records (3)
Type    Meaning                   Value
SOA     Start of authority        Parameters for this zone
A       IPv4 address of a host    32-bit integer
AAAA    IPv6 address of a host    128-bit integer
MX      Mail exchange             Priority, domain willing to accept email
NS      Name server               Name of a server for this domain
CNAME   Canonical name            Domain name
PTR     Pointer                   Alias for an IP address
SPF     Sender policy framework   Text encoding of mail sending policy
SRV     Service                   Host that provides it
TXT     Text                      Descriptive ASCII text
The principal DNS resource record types.
DNS - Query/Response Scenarios
[Figure: a client (e.g., email agent, resolver) sends a QUERY to a name server, which returns a RESPONSE]
Example: Client asking a name server for the IP address of a host.
ID   operation   type   query name            answer
23   QUERY       A      gemini.tuc.noao.edu
23   RESPONSE    A      gemini.tuc.noao.edu   140.252.3.54
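A rough sketch of how such a query could be laid out in binary, following the message format of RFC 1035: a 12-byte header, then the name as length-prefixed labels, then the type and class. The ID (23) and the name come from the example; the flags value 0x0100 (recursion desired) and the function name are assumptions made for illustration. The sketch only builds the bytes and does not send them.

```python
import struct


def build_query(query_id: int, name: str, qtype: int = 1) -> bytes:
    """Build a DNS query for one name; qtype 1 = A, class 1 = IN."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is preceded by its length; a zero byte ends the name
    qname = b"".join(bytes([len(label)]) + label.encode("ascii")
                     for label in name.split(".")) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)
    return header + question


query = build_query(23, "gemini.tuc.noao.edu")
```

A resolver would send these bytes in a UDP datagram to port 53 of its configured name server and match the reply to the request by the ID field.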
4.18 Domain Names That Begin with www (can skip to XML)
• Many organizations
assign domain names that
reflect the service a
computer provides, e.g.,
– A computer that runs a
server for FTP might be
named:
ftp.foobar.com
– Similarly, a computer that
runs a web server might be
named:
www.foobar.com
• Such names are
mnemonic, but are not
required
• The use of www to name
computers that run a web
server is merely a
convention
– an arbitrary computer can run
a web server, even if the
computer's domain name
does not contain www
– a computer that has a domain
name beginning with www is
not required to run a web
server
4.19 The DNS Hierarchy & Server Model
• Each organization is free to choose the details of its
servers, e.g.,
– a small organization that only has a few computers can contract
with an ISP to run a DNS server
• An organization that runs its own server can choose to
place all names for the organization in a single physical
server, or it can choose to divide its names among
multiple servers, e.g.,
– Fig. 4.17 illustrates how the hypothetical Foobar Corporation might
choose to structure servers if the corporation had a candy division
& a soap division
4.19 The DNS Hierarchy & Server Model
• DNS is designed to allow each organization to assign names to computers, or to change those names, without informing a central authority
– To achieve autonomy, each
organization is permitted to
operate DNS servers for its
part of the hierarchy
• Purdue University operates a
server for names ending in
purdue.edu
• IBM Corporation operates a
server for names ending in
ibm.com
• Each DNS server contains
information that links the
server to other domain
name servers up & down
the hierarchy
– a given server can be
replicated, e.g.,
• multiple physical copies of
the server exist
• Replication is useful for
heavily used servers, such
as root servers that
provide information about
top-level domains
– administrators must
guarantee that all copies are
coordinated
• so they provide exactly the
same information
4.20 Name Resolution
• The translation of a domain name into an address is called name resolution
– i.e., the name is said to be resolved to an address
– Software to perform the
translation is known as a
name resolver (or simply
resolver)
• In the socket API, e.g.,
– the resolver is invoked by
calling function
gethostbyname
• The resolver becomes a
client by contacting a DNS
server
– DNS server returns an
answer to the caller
• Each resolver is
configured with the
address of one or more
local domain name
servers
• The resolver forms a DNS
request message
– sends the message to the
local server
– waits for the server to send a
DNS reply message for the
answer
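Python's standard `socket` module exposes the same resolver call under the same name, so invoking the resolver can be sketched as below. The wrapper name `lookup_ipv4` is an illustrative choice. What the call returns for a real name depends on the local resolver configuration and the network, so no sample result is shown.

```python
import socket


def lookup_ipv4(name: str):
    """Ask the system resolver for an IPv4 address; return None on failure."""
    try:
        # The resolver forms the request, sends it to a configured
        # local server, and waits for the reply.
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None
```

An application would call, e.g., `lookup_ipv4("www.foobar.com")` before opening a connection. A name that is already a dotted IPv4 address is returned unchanged without contacting a server.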
4.20 Name Resolution
• A resolver can choose to
use either the stream or
message paradigm when
communicating with a
DNS server
– most resolvers are configured
to use a message paradigm
because it imposes less
overhead for a small request
• As an example (Fig. 4.17a), assume a computer in the soap division generates a request for name
chocolate.candy.foobar.com
• The resolver will be configured to send the request to the local DNS server (i.e., the server for foobar.com)
– Although it cannot answer the request, the server knows to contact the server for candy.foobar.com, which can generate an answer
4.21 Caching in DNS Servers
• The locality of reference principle that forms the basis
for caching applies to the Domain Name System in two
ways:
– Spatial: A user tends to look up the names of local computers
more often than the names of remote computers
• For this, a name resolver contacts a local server first
– Temporal: A user tends to look up the same set of domain names
repeatedly
• For this, a DNS server caches all lookups
• Algorithm 4.4 summarizes the process
4.21 Caching in DNS Servers
• From the algorithm, when a request arrives for a name outside the set for which the server is an authority, further client-server interaction results
• The server temporarily becomes a client of another name server
• When the other server returns an answer
– the original server caches the answer & sends a copy of the answer back to the resolver from which the request arrived
• In addition to knowing the address of all servers down the hierarchy
– each DNS server must know the address of a root server
• How long should items be cached?
– If an item is cached too long, the item will become stale
– DNS specifies a cache timeout for each item (the Time_to_live field of a resource record); a cached item is discarded when its timeout expires
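A cache with per-item timeouts can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm from the textbook: the class name, the injectable clock (used so the behavior can be shown without waiting), and the sample address are assumptions.

```python
import time


class DnsCache:
    """Cache answers until their timeout (in seconds) expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl):
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None              # never cached: ask another server
        address, expires = entry
        if self._clock() >= expires:
            del self._entries[name]  # stale: discard and look up again
            return None
        return address
```

A server would call `get` first and contact another name server only when it returns `None`, storing the reply with `put` and the timeout supplied in the answer.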
4.22 Types of DNS Entries
• Each entry in a DNS database consists of three items:
– a domain name
– a record type, which specifies how the value is to be interpreted, e.g.,
• Type A for an IPv4 address
• Type CNAME for a domain name (alias)
• Type MX for a mail exchanger
• Type NS for a name server
– a value
• A query sent to a DNS server specifies both a domain name & a type
– the server only returns a binding that matches the type of the query
• The principal type maps a domain name to an IP address
– DNS classifies such bindings as type A
• type A lookup is used by applications such as FTP, ping, or a browser
– DNS supports several other types, including type MX
• that specifies a Mail eXchanger
• when it looks up the name in an email address, SMTP uses type MX
4.22 Types of DNS Entries
• Each entry in a DNS
server has a type
• When a resolver looks up
a name, the
– resolver specifies the type
that is desired
– DNS server returns only
entries that match the
specified type
• The DNS type system can produce unexpected results
– because the address returned can depend on the type, e.g.,
• a corporation may decide to use the name corporation.com for both web & email services
• It is possible for the corporation to divide the workload between separate computers by
– mapping type A lookups to one computer &
– mapping type MX lookups to another
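The effect of the type on the answer can be shown with a toy record table in Python, keyed by (name, type). The address, the priority, and the mail host name are invented for the example; only the name corporation.com comes from the slide.

```python
# Hypothetical records: the same name maps to different values per type.
RECORDS = {
    ("corporation.com", "A"): "192.0.2.80",                  # web server
    ("corporation.com", "MX"): (10, "mail.corporation.com"),  # priority, host
}


def lookup(name: str, record_type: str):
    """Return only the binding that matches the requested type, else None."""
    return RECORDS.get((name, record_type))
```

A browser asking for type A receives the web server's address, while mail software asking for type MX is directed to a different computer.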
4.23 Aliases & CNAME Resource Records
• The DNS offers a CNAME (Canonical Name) entry type
– it is analogous to a symbolic link in a file system
– the entry provides an alias for another DNS entry
• Aliases can be useful, e.g.,
– Suppose Foobar Corporation has a computer named hobbes.foobar.com that runs its web server, & wants the server to be reachable under the name www.foobar.com
• Organization foobar can
create a CNAME entry for
www.foobar.com that
points to hobbes
• Whenever a resolver
sends a request for
www.foobar.com, the
server returns the address
of computer hobbes
4.23 Aliases & CNAME Resource Records
• The use of aliases is especially convenient
– it permits an organization to change the computer used for a particular service without changing the names or addresses of the computers
• E.g., Foobar Corporation can move its web service from hobbes to calvin by changing the CNAME record in the DNS server
• the two computers retain their original names & IP addresses
• The use of aliases also
allows an organization to
associate multiple aliases
with a single computer
– Thus, Foobar Corporation
can run an FTP server & a
web server on the same
computer, & can create
CNAME records:
www.foobar.com
ftp.foobar.com
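Alias handling can be sketched with a toy record table in Python. The names hobbes, calvin, www, and ftp come from the example; the addresses, the function name, and the hop limit (a guard against alias loops) are assumptions for illustration.

```python
RECORDS = {
    ("www.foobar.com", "CNAME"): "hobbes.foobar.com",
    ("ftp.foobar.com", "CNAME"): "hobbes.foobar.com",
    ("hobbes.foobar.com", "A"): "192.0.2.1",   # made-up addresses
    ("calvin.foobar.com", "A"): "192.0.2.2",
}


def resolve_a(name, records=RECORDS, max_hops=8):
    """Follow CNAME aliases until an A record is found."""
    for _ in range(max_hops):
        if (name, "A") in records:
            return records[(name, "A")]
        name = records.get((name, "CNAME"))  # alias points at another entry
        if name is None:
            return None
    return None
```

Moving the web service to calvin takes one change, `records[("www.foobar.com", "CNAME")] = "calvin.foobar.com"`; the A records of hobbes & calvin stay as they are.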
4.24 Abbreviations & the DNS
• DNS does not incorporate
abbreviations
– a server only responds to a
full name
• Most resolvers can be
configured with a set of
suffixes that allow a user
to abbreviate names, e.g.,
– each resolver at Foobar
Corporation might be
programmed to look up a
name twice:
• once with no change & once
with the suffix foobar.com
appended
• If a user enters a full
domain name
– the local server will return the
address, & processing will
proceed
• If a user enters an abbreviated name
– the resolver will first try to resolve the name as given, &
– will receive an error because no such name exists
– then it will try appending a suffix & looking up the resulting name
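The two-step lookup can be sketched as follows. The dictionary stands in for the DNS server, and its single entry, the address, and the function name are illustrative assumptions.

```python
KNOWN = {"www.foobar.com": "192.0.2.80"}  # stands in for the DNS server


def resolve_with_suffixes(name, suffixes=("foobar.com",), known=KNOWN):
    """Try the name unchanged, then with each configured suffix appended."""
    candidates = [name] + [f"{name}.{suffix}" for suffix in suffixes]
    for candidate in candidates:
        if candidate in known:
            return candidate, known[candidate]
    return None  # every attempt failed
```

Entering `www` fails on the first attempt & succeeds on the second as `www.foobar.com`; entering the full name succeeds on the first attempt.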
4.25 Internationalized Domain Names
• DNS uses the ASCII
character set
• Languages such as Russian,
Greek, Chinese, & Japanese
each contain characters for
which no ASCII
representation exists
– Many European languages
use diacritical marks that
cannot be represented in
ASCII
• IETF debated modifications
& extensions of the DNS to
accommodate international
domain names
– After considering many
proposals, IETF chose an
approach known as
Internationalizing Domain
Names in Applications
(IDNA)
• IDNA uses ASCII to store all
names
• If a domain name contains a
non-ASCII character
– IDNA translates the name
into a sequence of ASCII
characters &
– stores the result in the DNS
4.25 Internationalized Domain Names
• IDNA relies on applications to translate between the
international character set & the internal ASCII form
used
• The rules for translating international domain names are complex
• The latest versions of widely used browsers, e.g., Firefox & Internet Explorer, can accept & display non-ASCII domain names because they each implement IDNA
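The translation an application performs can be tried with Python's built-in `idna` codec, which implements the original IDNA rules (newer revisions of the standard differ in some details). The function names and the sample name are illustrative choices.

```python
def to_ascii(name: str) -> str:
    """Translate a possibly non-ASCII domain name to its stored ASCII form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Translate the stored ASCII form back for display."""
    return name.encode("ascii").decode("idna")
```

For example, `to_ascii("b\u00fccher.example")` (the name bücher.example) gives `xn--bcher-kva.example`, and `to_unicode` reverses it. A name that is already ASCII passes through unchanged.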
4.26 Extensible Markup Language (XML)
• XML is a markup language that defines a set of rules for encoding documents
– in a format which is both human-readable & machine-readable
• Defined by the W3C's XML 1.0 specification; it is a free open standard
• Design goals of XML emphasize simplicity, generality & usability across the Internet
• It is a textual data format with strong support via Unicode for different human languages
• Although the design of XML focuses on documents,
– it is widely used for representation of arbitrary data structures such as those used in web services
https://en.wikipedia.org/wiki/XML
• XML describes the structure of
data &
– provides names for each field
• XML does not assign any
meaning to tags
– tag names can be created as
needed
– tag names can be selected to
make data easy to parse or
access
4.26 Extensible Markup Language (XML)
Example:
• Suppose two companies agree to exchange corporate telephone directories; they can
– define an XML format that has data items such as an employee's name, phone number, & office (Fig. 4.18), &
– choose to further divide a name into a last & a first name
Figure 4.18 XML example
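Since the figure is not reproduced here, the sketch below uses a hypothetical directory format with the fields described above & reads it with Python's standard `xml.etree.ElementTree` module. The tag names (`directory`, `employee`, `name`, `first`, `last`, `phone`, `office`) & the sample entry are assumptions and may differ from those in Fig. 4.18.

```python
import xml.etree.ElementTree as ET

# Hypothetical directory; tag names were agreed on by the two parties.
DIRECTORY = """\
<directory>
  <employee>
    <name><first>Jane</first><last>Doe</last></name>
    <phone>555-0100</phone>
    <office>B-12</office>
  </employee>
</directory>
"""


def employees(xml_text: str) -> list[dict]:
    """Extract one dictionary per employee element."""
    root = ET.fromstring(xml_text)
    return [
        {
            "first": e.findtext("name/first"),
            "last": e.findtext("name/last"),
            "phone": e.findtext("phone"),
            "office": e.findtext("office"),
        }
        for e in root.findall("employee")
    ]
```

Because XML assigns no meaning to the tags, the receiving program works only if both companies use the same names & nesting.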