Unit 5: Web applications Service protocols Q.1. Explain role of MUA, MTA and MAA in Electronic Mailing System. OR Q.4. With neat diagram explain architecture of E-mail System. Ans: To explain the architecture of e-mail, we give four scenarios in which we UA, MTA and MAA. First Scenario In the first scenario, the sender and the receiver of the e-mail are users (or application programs) on the same mail server; they are directly connected to a shared mail server. The administrator has created one mailbox for each user where the received messages are stored. A mailbox is part of a local hard drive, a special file with permission restrictions. Only the owner of the mailbox has access to it. When Alice needs to send a message to Bob, she runs a user agent (UA) program to prepare the message and store it in Bob’s mailbox. The message has the sender and recipient mailbox addresses (names of files). Bob can retrieve and read the contents of his mailbox at his convenience using a user agent. Figure 23.1 shows the concept. When the sender and the receiver of an e-mail are on the same mail server, we need only two user agents. Second Scenario In the second scenario, the sender and the receiver of the e-mail are users (or application programs) on two different mail servers. The message needs to be sent over the Internet. Here we need user agents (UAs) and message transfer agents (MTAs) as shown in Figure 23.2. Alice needs to use a user agent program to send her message to the mail server at her own site. The mail server at her site uses a queue (spool) to store messages waiting to be sent. Bob also needs a user agent program to retrieve messages stored in the mailbox of the system at his site. The message, however, needs to be sent through the Internet from Alice’s site to Bob’s site. Here two message transfer agents are needed: one client and one server. Like most client-server programs on the Internet, the server needs to run all of the time because it does not know when a client will ask for a connection. The client, on the other hand, can be triggered by the system when there is a message in the queue to be sent. When the sender and the receiver of an e-mail are on different mail servers, we need two UAs and a pair of MTAs (client and server). Third Scenario Figure 23.3 shows the third scenario. Bob, as in the second scenario, is directly connected to his mail server. Alice, however, is separated from her mail server. Alice is either connected to the mail server via a point-to-point WAN—such as a dial-up modem, a DSL, or a cable modem—or she is connected to a LAN in an organization that uses one mail server for handling e-mails; all users need to send their messages to this mail server. Alice still needs a user agent to prepare her message. She then needs to send the message through the LAN or WAN. This can be done through a pair of message transfer agents (client and server). Whenever Alice has a message to send, she calls the user agent which, in turn, calls the MTA client. The MTA client establishes a connection with the MTA server on the system, which is running all the time. The system at Alice’s site queues all messages received. It then uses an MTA client to send the messages to the system at Bob’s site; the system receives the message and stores it in Bob’s mailbox. At his convenience, Bob uses his user agent to retrieve the message and reads it. Note that we need two pairs of MTA client-server programs. When the sender is connected to the mail server via a LAN or a WAN, we need two UAs and two pairs of MTAs (client and server). Fourth Scenario In the fourth and most common scenario, Bob is also connected to his mail server by a WAN or a LAN. After the message has arrived at Bob’s mail server, Bob needs to retrieve it. Here, we need another set of client-server agents, which we call message access agents (MAAs). Bob uses an MAA client to retrieve his messages. The client sends a request to the MAA server, which is running all the time, and requests the transfer of the messages. The situation is shown in Figure 23.4. There are two important points we need to emphasize here. First, Bob cannot bypass the mail server and use the MTA server directly. To use the MTA server directly, Bob would need to run the MTA server all the time because he does not know when a message will arrive. This implies that Bob must keep his computer on all the time if he is connected to his system through a LAN. If he is connected through a WAN, he must keep the connection up all the time. Neither of these situations is feasible today. Second, note that Bob needs another pair of client-server programs: message access programs. This is because an MTA client-server program is a push program: the client pushes the message to the server. Bob needs a pull program. The client needs to pull the message from the server. Figure 23.5 shows the difference. When both sender and receiver are connected to the mail server via a LAN or a WAN, we need two UAs, two pairs of MTAs (client and server), and a pair of MAAs (client and server). This is the most common situation today. Q.2. Explain HTTP Query and Response Message in detail. Ans: The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web. HTTP functions like a combination of FTP and SMTP. It is similar to FTP because it transfers fifiles and uses the services of TCP. Request Message The format of the request is shown in Figure 22.11. A request message consists of a request line, a header, and sometimes a body. Request Line The first line in a request message is called a request line. There are three fields in this line separated by some character delimiter as shown in Figure 22.11. The fields are called methods, URL, and Version. These three should be separated by a space character. At the end two characters, a carriage return followed by a line feed, terminate the line. The method field defines the request type.The second field, URL, It defines the address and name of corresponding Web page. The third field, version, gives the version of the protocol; the most current version of HTTP is 1.1. Header Lines In Request Message After the request line, we can have zero or more request header lines. Each header line sends additional information from the client to the server. For example, the client can request that the document be sent in a special format. Each header line has a header name, a colon, a space, and a header value. Table 22.2 shows some header names commonly used in a request. The value field defines the values associated with each header name. The list of values can be found in the corresponding RFCs. Body In Request Message The body can be present in a request message. Usually, it contains the comment to be sent. Response Message The format of the response message is shown in Figure 22.12. A response message con sists of a status line, header lines, a blank line and sometimes a body Status Line The first line in a response message is called the status line. There are three fields in this line separated by spaces and terminated by a carriage return and line feed. The first field defines the version of HTTP protocol, currently 1.1. The status code field defines the status of the request. It consists of three digits. Whereas the codes in the 100 range are only informational, the codes in the 200 range indicate a successful request. The codes in the 300 range redirect the client to another URL, and the codes in the 400 range indicate an error at the client site. Finally, the codes in the 500 range indicate an error at the server site. The status phrase explains the status code in text form. Header Lines In Response Message After the status line, we can have zero or more response header lines. Each header line sends additional information from the server to the client. For example, the sender can send extra information about the document. Each header line has a header name, a colon, a space, and a header value. We will show some header lines in the examples at the end of this chapter. Table 22.4 shows some header names commonly used in a response message. Body The body contains the document to be sent from the server to the client. The body is present unless the response is an error message. Q.3. What are the types of web documents explain detail. Also explain how web server serves active document. Ans: Web DocumentsThe documents in the WWW can be grouped into three broad categories: static, dynamic, and active. The category is based on the time the contents of the document are determined. 1. Static Documents Static documents are fifixed-content documents that are created and stored in a server. The client can get a copy of the document only. In other words, the contents of the fifile are determined when the fifile is created, not when it is used. Of course, the contents in the server can be changed, but the user cannot change them. When a client accesses the document, a copy of the document is sent. The user can then use a browsing program to display the document (see Figure 22.5). Static documents are prepared using one of the several languages: Hypertext Markup Language (HTML), Extensible Markup Language (XML), Extensible Style Language (XSL), and Extended Hypertext Markup Language (XHTML). 2. Dynamic Documents A dynamic document is created by a Web server whenever a browser requests the document. When a request arrives, the Web server runs an application program or a script that creates the dynamic document. The server returns the output of the program or script as a response to the browser that requested the document. Because a fresh document is created for each request, the contents of a dynamic document may vary from one request to another. A very simple example of a dynamic document is the retrieval of the time and date from a server. Time and date are kinds of information that are dynamic in that they change from moment to moment. The client can ask the server to run a program such as the date program in UNIX and send the result of the program to the client A few technologies have been involved in creating dynamic documents using scripts. Among the most common are Hypertext Preprocessor (PHP), which uses the Perl language; Java Server Pages (JSP), which uses the Java language for scripting; Active Server Pages (ASP), a Microsoft product, which uses Visual Basic language for scripting; and ColdFusion, which embeds SQL database queries in the HTML document. Dynamic documents are sometimes referred to as server-site dynamic documents. 3. Active Documents For many applications, we need a program or a script to be run at the client site. These are called active documents. For example, suppose we want to run a program that creates animated graphics on the screen or a program that interacts with the user. The program defifinitely needs to be run at the client site where the animation or inter action takes place. When a browser requests an active document, the server sends a copy of the document or a script. The document is then run at the client (browser) site. Java Applets One way to create an active document is to use Java applets. Java is a combination of a high-level programming language, a run-time environment, and a class library that allows a programmer to write an active document (an applet) and a browser to run it. It can also be a stand-alone program that doesn’t use a browser. An applet is a program written in Java on the server. It is compiled and ready to be run. The document is in bytecode (binary) format. The client process (browser) creates an instance of this applet and runs it. A Java applet can be run by the browser in two ways. In the fifirst method, the browser can directly request the Java applet program in the URL and receive the applet in binary form. In the second method, the browser can retrieve and run an HTML fifile that has embedded the address of the applet as a tag. Figure 22.8 shows how Java applets are used in the fifirst method; the second is similar but needs two transactions. JavaScript The idea of scripts in dynamic documents can also be used for active documents. If the active part of the document is small, it can be written in a scripting language; then it can be interpreted and run by the client at the same time. The script is in source code (text) and not in binary form. The scripting technology used in this case is usually JavaScript. JavaScript, which bears a small resemblance to Java, is a very high level scripting language developed for this purpose. Figure 22.9 shows how JavaScript is used to cre ate an active document. Q.4.With neat diagram explain arctitecture of E-mail system Ans- See Ans of Q.2 Q.5. With neat and labeled diagram explain HTTP Architecure. AnsHTTP- The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web. HTTP functions like a combination of FTP and SMTP It is similar to FTP because it transfers files and uses the services of TCP. However, it is much simpler than FTP because it uses only one TCP connection. There is no separate control connection; only data are transferred between the client and the server. HTTP is like SMTP because the data transferred between the client and the server look like SMTP messages. In addition, the format of the messages is controlled by MIME-like headers. Unlike SMTP, the HTTP messages are not destined to be read by humans; they are read and interpreted by the HTTP server and HTTP client (browser). SMTP messages are stored and forwarded, but HTTP messages are delivered immedi ately. The commands from the client to the server are embedded in a request message. The contents of the requested file or other information are embedded in a response message. HTTP uses the services of TCP on well-known port 80. HTTP Transaction - Figure 22.10 illustrates the HTTP transaction between the client and server. Although HTTP uses the services of TCP, HTTP itself is a stateless protocol, which means that the server does not keep information about the client. The client initializes the transac tion by sending a request. The server replies by sending a response. The HTTP protocol is a request/response protocol based on the client/server based architecture where web browsers, robots and search engines, etc. act like HTTP clients, and the Web server acts as a server. Client- The HTTP client sends a request to the server in the form of a request method, URI, and protocol version, followed by a MIME- like message containing request modifiers, client information, and possible body content over a TCP/IP connection Server : The HTTP server responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing server information, entity meta information, and possible entity-body content Characteristics of HTTP: • HTTP is IP based communication protocol which is used to deliver data from server to client or vice-versa. • Server processes a request, which is raised by client and also server and client knows each other only during current request and response period. • Any type of content can be exchanged as long as server and client are compatible with it. • Once data is exchanged then servers and client are no more connected with each other. • It is a request and response protocol based on client and server requirements. • It is connection less protocol because after connection is closed, server does not remember anything about client and client does not remember anything about server. • It is stateless protocol because both client and server does not expecting anything from each other but they are still able to communicate. Q.6. Write note on MIME Ans: MIME Electronic mail has a simple structure. Its simplicity, however, comes with a price. It can send messages only in NVT 7-bit ASCII format. In other words, it has some limita tions. It cannot be used for languages other than English (such as French, German, Hebrew, Russian, Chinese, and Japanese). Also, it cannot be used to send binary fifiles or video or audio data. Multipurpose Internet Mail Extensions (MIME) is a supplementary protocol that allows non-ASCII data to be sent through e-mail. MIME transforms non-ASCII data at the sender site to NVT ASCII data and delivers it to the client MTA to be sent through the Internet. The message at the receiving site is transformed back to the original data. We can think of MIME as a set of software functions that transforms non-ASCII data to ASCII data and vice versa, as shown in Figure 23.15. MIME Headers MIME defifines fifive headers that can be added to the original e-mail header section to defifine the transformation parameters: 1. MIME-Version 2. Content-Type 3. Content-Transfer-Encoding 4. Content-Id 5. Content-Description Figure 23.16 shows the MIME headers. MIME-Version This header defifines the version of MIME used. The current version is 1.1. Content-Type This header defifines the type of data used in the body of the message. The content type and the content subtype are separated by a slash. Depending on the subtype, the header may contain other parameters. MIME allows seven different types of data. Text. The original message is in 7-bit ASCII format and no transformation by MIME is needed. There are two subtypes currently used, plain and HTML. Multipart. The body contains multiple, independent parts. The multipart header needs to define the boundary between each part. A parameter is used for this purpose. Four subtypes are defifined for this type: mixed, parallel, digest, and alternative. In the mixed subtype, the parts must be presented to the recipient in the exact order as in the message. Each part has a different type and is defined at the boundary. Message. In the message type, the body is itself an entire mail message, a part of a mail message, or a pointer to a message. Three subtypes are currently used: RFC822, partial, and external-body. The subtype RFC822 is used if the body is encapsulating another message Image. The original message is a stationary image, indicating that there is no animation. The two currently used subtypes are Joint Photographic Experts Group (JPEG), which uses image compression, and Graphics Interchange Format (GIF). Video. The original message is a time-varying image (animation). The only sub type is Moving Picture Experts Group (MPEG). If the animated image contains sounds, it must be sent separately using the audio content type. Audio. The original message is sound. The only subtype is basic, which uses 8-kHz standard audio data. Application. The original message is a type of data not previously defined. There are only two subtypes used currently: PostScript and octet-stream. PostScript is used when the data are in Adobe PostScript format. Octet-stream is used when the data must be interpreted as a sequence of 8-bit bytes (binary file) Content-Transfer-Encoding This header defines the method used to encode the messages into 0s and 1s for transport: Content-Transfer-Encoding: <type> The five types of encoding methods are listed in Table 23.4 Content-Id This header uniquely identififies the whole message in a multiple message environment. Content-Description This header defifines whether the body is image, audio, or video. Q.7. Explain POP3 and IMAP4 protocols in details. Ans- Currently two message access protocols are available: Post Offifice Protocol, version 3 (POP3) and Internet Mail Access Protocol, version 4 (IMAP4). Figure 23.13 shows the position of these two protocols in the most common situation. POP3 Post Offifice Protocol, version 3 (POP3) is simple and limited in functionality. The client POP3 software is installed on the recipient computer; the server POP3 software is installed on the mail server. Mail access starts with the client when the user needs to download its e-mail from the mailbox on the mail server. The client opens a connection to the server on TCP port 110. It then sends its user name and password to access the mailbox. The user can then list and retrieve the mail messages, one by one. Figure 23.14 shows an example of downloading using POP3 POP3 has two modes: the delete mode and the keep mode. In the delete mode, the mail is deleted from the mailbox after each retrieval. In the keep mode, the mail remains in the mailbox after retrieval. The delete mode is normally used when the user is working at her permanent computer and can save and organize the received mail after reading or replying. The keep mode is normally used when the user accesses her mail away from her primary computer (e.g., a laptop). The mail is read but kept in the sys tem for later retrieval and organizing. IMAP4 Another mail access protocol is Internet Mail Access Protocol, version 4 (IMAP4). IMAP4 is similar to POP3, but it has more features; IMAP4 is more powerful and more complex. POP3 is defificient in several ways. It does not allow the user to organize her mail on the server; the user cannot have different folders on the server. (Of course, the user can create folders on her own computer.) In addition, POP3 does not allow the user to par tially check the contents of the mail before downloading. IMAP4 provides the following extra functions: ❑ A user can check the e-mail header prior to downloading. ❑ A user can search the contents of the e-mail for a specifific string of characters prior to downloading. ❑ A user can partially download e-mail. This is especially useful if bandwidth is lim ited and the e-mail contains multimedia with high bandwidth requirements. ❑ A user can create, delete, or rename mailboxes on the mail server. ❑ A user can create a hierarchy of mailboxes in a folder for e-mail storage. Q.8. Draw and explain Browser Architecure. AnsThe WWW today is a distributed client-server service, in which a client using a browser can access a service using a server. However, the service provided is distrib uted over many locations called sites. Each site holds one or more documents, referred to as Web pages. Each Web page, however, can contain some links to other Web pages in the same or other sites. In other words, a Web page can be simple or composite. A simple Web page has no link to other Web pages; a composite Web page has one or more links to other Web pages. Each Web page is a file with a name and address. Hypertext and Hypermedia Hypertext means creating documents that refer to other documents. In a hypertext document, a part of text can be defined as a link to another document. When a hypertext is viewed with a browser, the link can be clicked to retrieve the other document. Hypermedia is a term applied to document that contains links to other textual document or documents containing graphics, video, or audio. Web Client (Browser) A variety of vendors offer commercial browsers that interpret and display a Web docu ment, and all of them use nearly the same architecture. Each browser usually consists of three parts: a controller, client protocol, and interpreters The controller receives input from the keyboard or the mouse and uses the client programs to access the document. After the document has been accessed, the controller uses one of the interpreters to display the document on the screen. The client protocol can be one of the protocols described previously such as FTP, or TELNET, or HTTP (as discussed later in the chapter). The interpreter can be HTML, Java, or JavaScript, depending on the type of document. We discuss the use of these interpreters based on the document type. Some commercial browsers include Internet Explorer, Netscape Navigator, and Firefox.