Chapter 27
WWW and HTTP

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

27-1 ARCHITECTURE

The WWW today is a distributed client/server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called sites.

Topics discussed in this section:
Client (Browser)
Server
Uniform Resource Locator
Cookies

Figure 27.1 Architecture of WWW

Figure 27.2 Browser

Figure 27.3 URL
Protocol examples: http://, https://, ftp://
Port: 80 by default for HTTP

Cookies

The WWW was originally designed as a stateless entity. Cookies are needed to extend the functionality of the Web, for example to remember a past client in order to show a customized web page.

Cookies: Creation and storage

When a server receives a request from a client, it stores information about the client in a file or a string.
The server includes the cookie in the response that it sends to the client.
When the client receives the response, the browser stores the cookie in the cookie directory.

27-2 WEB DOCUMENTS

The documents in the WWW can be grouped into three broad categories: static, dynamic, and active. The category is based on the time at which the contents of the document are determined.

Topics discussed in this section:
Static Documents
Dynamic Documents
Active Documents

Figure 27.4 Static document

Uses the Hypertext Markup Language (HTML)

Figure 27.5 Boldface tags

Figure 27.7 Beginning and ending tags

Example:
<a href="Chapter3-part3.ppt">Chapter3-part3</a>
<img src="../images/smallUCF.gif" width="200" border="0" height="76">

Figure 27.8 Dynamic document using CGI

Figure 27.9 Dynamic document using server-site script

Note: Dynamic documents are sometimes referred to as server-site dynamic documents.
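
A dynamic document is created by the server at the time of the request, which a few lines of code can make concrete. The sketch below is written in the style of a CGI program: it emits a header line, a blank line, and then an HTML body whose content depends on when it runs. The function name `render_dynamic_page` and the page content are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def render_dynamic_page(now: datetime) -> str:
    """Return a CGI-style response: one header, a blank line, then HTML.

    The body is built at call time, so two requests made at different
    moments receive different documents (unlike a static file).
    """
    body = (
        "<html><body>"
        f"<p>Server time: {now:%Y-%m-%d %H:%M:%S}</p>"
        "</body></html>"
    )
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # A CGI program writes its response to standard output;
    # the Web server relays it to the client.
    print(render_dynamic_page(datetime.now()), end="")
```

A static document, by contrast, would be a fixed file whose contents were determined when it was written, not when it was requested.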
Figure 27.10 Active document using Java applet

Figure 27.11 Active document using client-site script

Note: Active documents are sometimes referred to as client-site dynamic documents.

27-3 HTTP

The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web.

Topics discussed in this section:
HTTP Transaction
Persistent Versus Nonpersistent Connection

Figure 27.12 HTTP transaction

Note: HTTP uses the services of TCP on well-known port 80.

Figure 27.13 Request and response messages (all in plain text)

Table 27.1 Methods

Table 27.2 Status codes

Table 27.2 Status codes (continued)

Figure 27.15 Header format

Table 27.3 General headers

Table 27.4 Request headers

Table 27.5 Response headers

Table 27.6 Entity headers

Figure 27.16 Example 27.1

Figure 27.17 Example 27.2 (client sends data to server)

Example 27.3 (continued)

Trying out HTTP (client side) for yourself

1. Telnet to your favorite Web server:
   telnet www.cs.ucf.edu 80
   This opens a TCP connection to port 80 (the default HTTP server port) at www.cs.ucf.edu. Anything typed in is sent to port 80 at www.cs.ucf.edu.
2. Type in a GET HTTP request:
   GET /~czou/CNT3004/example.html HTTP/1.1
   Host: www.cs.ucf.edu
   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server.
3. Look at the response message sent by the HTTP server!

Web Browser's Operation

First, get the base static HTML file:
/~czou/CNT3004/example.html
Second, interpret the HTML to find all contained "objects": images, Java applets, Flash, ...
<img src="../images/smallUCF.gif">
<img src="http://upload.wikimedia.org/wikipedia/commons/6/63/Wikipedia-logo.png">
Third, get those objects via HTTP.

Let's look at HTTP in action

Telnet example
"GET" must be in capital letters!
The request must have a "Host" header, for the sake of web proxies: with it, a proxy knows where to forward the GET request.
What happens if you type in "HTTP/1.0" instead?
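
The telnet steps above can also be sketched in code. The helper names below (`build_get_request`, `parse_status_line`, `fetch_first_chunk`) are hypothetical; the sketch assumes only what the telnet example shows, namely that a minimal HTTP/1.1 request is a request line, a Host header, and a blank line, each terminated by a carriage return and line feed.

```python
import socket


def build_get_request(host: str, path: str, version: str = "HTTP/1.1") -> bytes:
    """Build the minimal GET request typed in the telnet example."""
    lines = [
        f"GET {path} {version}",  # the method name must be upper-case
        f"Host: {host}",          # tells a server or proxy which site is meant
        "",                       # the empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple[str, int, str]:
    """Split the first response line into (version, status code, phrase)."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    parts = first_line.split(" ", 2)
    phrase = parts[2] if len(parts) > 2 else ""
    return parts[0], int(parts[1]), phrase


def fetch_first_chunk(host: str, path: str, port: int = 80) -> bytes:
    """Send the request over TCP and return the first bytes of the reply.

    Only one recv() is issued, which is enough to see the status line;
    a real client would keep reading until the whole message arrived.
    """
    with socket.create_connection((host, port), timeout=5.0) as sock:
        sock.sendall(build_get_request(host, path))
        return sock.recv(4096)
```

Calling `fetch_first_chunk("www.cs.ucf.edu", "/~czou/CNT3004/example.html")` would repeat the telnet session, assuming that server is reachable; passing the result to `parse_status_line` extracts the status code from the reply.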
Wireshark example

Note: HTTP version 1.1 specifies a persistent connection by default.

Persistent vs Nonpersistent Connection

In a nonpersistent connection, one TCP connection is made for each request/response.
In a persistent connection, the server leaves the connection open for more requests after sending a response. The server can close the connection at the request of a client or if a time-out has been reached.

Wireshark example
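
The difference between the two connection types can be observed without a packet capture. The sketch below is an illustration under stated assumptions, not a measurement of any real site: it starts a small HTTP/1.1 server on the local machine using Python's standard library, then counts how many TCP connections the server accepts when a client sends three requests. The names `CountingHandler` and `count_connections` are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connection_count = 0

    def setup(self):
        # setup() runs once per accepted TCP connection, not per request.
        super().setup()
        CountingHandler.connection_count += 1

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def count_connections(persistent: bool, n_requests: int = 3) -> int:
    """Return how many TCP connections the server saw for n_requests GETs."""
    CountingHandler.connection_count = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if persistent:
            # One connection carries every request/response pair.
            conn = http.client.HTTPConnection(host, port)
            for _ in range(n_requests):
                conn.request("GET", "/")
                conn.getresponse().read()
            conn.close()
        else:
            # The client asks the server to close after each response.
            for _ in range(n_requests):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/", headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connection_count
```

With three requests, the persistent case should report a single connection and the nonpersistent case three, which mirrors the one-connection-per-request behavior described above.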