Web Server Design Week 7 Old Dominion University Department of Computer Science CS 495/595 Spring 2010 Martin Klein <mklein@cs.odu.edu> 2/24/10 Encodings • gzip – extension: .gz – (sometimes seen as x-gzip) • compress – extension: .Z – (sometimes seen as x-compress) • deflate – extension: .zip • identity – no encoding at all • chunked – breaks the body into a series of server-chosen “chunks” – optimization for dynamically produced content Content Encoding vs. Transfer Encoding GET /bottle1 HTTP/1.1 Client HTTP/1.1 200 OK Content-Encoding Transfer-Encoding images from: http://www.madeirawineguide.com/madeira_v300/ Origin Server Content-Encodings • “Content-Encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.” – section 14.11 Content-Encoding Example (Correct) AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80 Trying 128.82.4.2... Connected to xenon.cs.odu.edu. Escape character is '^]'. HEAD /~mln/pubs/bollenj_adaptive.ps.gz HTTP/1.1 Host: www.cs.odu.edu Connection: close HTTP/1.1 200 OK Date: Mon, 20 Feb 2006 04:30:25 GMT Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4 Last-Modified: Thu, 25 Jul 2002 16:58:58 GMT ETag: "1c16-139ea-3d402e52" Accept-Ranges: bytes Content-Length: 80362 Connection: close Content-Type: application/postscript Content-Encoding: x-gzip Connection closed by foreign host. Content-Encoding Example (Incorrect) AIHT:~/Desktop mln$ telnet www.cs.odu.edu 80 Trying 128.82.4.2... Connected to xenon.cs.odu.edu. Escape character is '^]'. HEAD /~mln/pubs/bollenj_adaptive.ps.gz HTTP/1.1 Host: www.cs.odu.edu Connection: close HTTP/1.1 200 OK Date: Mon, 26 Feb 2007 02:06:25 GMT Server: Apache/2.2.0 Last-Modified: Thu, 25 Jul 2002 16:58:58 GMT ETag: "1c16-139ea-92cab880" Accept-Ranges: bytes Content-Length: 80362 Connection: close Content-Type: application/x-gzip Wrong, Wrong, Wrong!!!!!!! Connection closed by foreign host. Transfer Encodings • “Transfer-Encoding is a property of the message, not of the entity, and thus MAY be added or removed by any application along the request/response chain.” – section 4.3 TE Request Header & Transfer-Encoding Response Header • Client specifies preferences for transfer encoding in the “TE:” header – section 14.39 • Server marks the encoding used with the “Transfer-Encoding” header – section 14.41 • Both headers use the same encoding values available with “Content-Encoding”, plus the special “chunked” encoding and the “Trailers” keyword TE & Transfer-Encoding dhcp65-74-196-93:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80 Trying 128.82.4.2... Connected to xenon.cs.odu.edu. Escape character is '^]'. GET / HTTP/1.1 TE: gzip;q=1.0, Trailers Host: www.cs.odu.edu HTTP/1.1 200 OK Date: Mon, 27 Feb 2006 15:52:33 GMT Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4 Transfer-Encoding: chunked Content-Type: text/html 5f6 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <!-- saved from url=(0036)http://www.cs.odu.edu/newcssite/new/ --> <!-- saved from url=(0019)http://sci.odu.edu/ --> [more html deleted] Time / Space Tradeoff • Hard to find examples of compression used in transfer encoding – – http://www.webreference.com/internet/software/servers/http/compression/2.html http://www-128.ibm.com/developerworks/web/library/wa-httpcomp/ – idea: for very heavy volume web servers, answering the request quickly is more important than preserving bandwidth • Complexity of management seems to be the limiting factor Chunked Encoding • “The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.” – sections 3.6.1, 19.4.6 Chunked Encoding Example AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80 | tee 1-6.out Trying 128.82.4.2... Connected to xenon.cs.odu.edu. Escape character is '^]'. GET /~mln HTTP/1.1 Connection: close Host: www.cs.odu.edu Connection closed by foreign host. HTTP/1.1 301 Moved Permanently Date: Mon, 09 Jan 2006 19:32:24 GMT Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4 Location: http://www.cs.odu.edu/~mln/ Connection: close Transfer-Encoding: chunked Content-Type: text/html; charset=iso-8859-1 12e <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> <HTML><HEAD> <TITLE>301 Moved Permanently</TITLE> </HEAD><BODY> <H1>Moved Permanently</H1> The document has moved <A HREF="http://www.cs.odu.edu/~mln/">here</A>.<P> <HR> <ADDRESS>Apache/1.3.26 Server at www.cs.odu.edu Port 80</ADDRESS> </BODY></HTML> 0 Chunked Encoding Example 2 AIHT:~/Desktop/cs595-s06 mln$ telnet www.cs.odu.edu 80 Trying 128.82.4.2... Connected to xenon.cs.odu.edu. Escape character is '^]'. GET / HTTP/1.1 Host: www.cs.odu.edu HTTP/1.1 200 OK Date: Tue, 21 Feb 2006 03:54:31 GMT Server: Apache/1.3.26 (Unix) ApacheJServ/1.1.2 PHP/4.3.4 Transfer-Encoding: chunked Content-Type: text/html 5f6 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <!-- saved from url=(0036)http://www.cs.odu.edu/newcssite/new/ --> <!-- saved from url=(0019)http://sci.odu.edu/ --> <HTML xmlns:st1 = "urn:schemas-microsoft-com:office:smarttags"><HEAD><TITLE>Department Of Computer Science</TITLE> <META http-equiv=Content-Type content="text/html; charset=windows-1252"> <META content="College of Sciences" name=Description> <META content="College of Sciences" name=Keywords><LINK title=style href="files/style.css" type=text/css rel=stylesheet> [lots of html deleted] [demo this example to see the various “chunks”] 0 Trailer Response Header • The “Trailer” response header lets the client know that additional headers will appear at the end of the chunked response – section 14.40 – headers can be reconstructed by downstream servers – headers that can never be trailers: • Transfer-Encoding • Content-Length • Trailer Trailer Example HTTP/1.1 200 OK Date: Mon, 22 Mar 2004 11:15:03 GMT Content-Type: text/html Content-Length: 129 Expires: Sat, 27 Mar 2004 21:12:00 GMT HTTP/1.1 200 OK Date: Mon, 22 Mar 2004 11:15:03 GMT Content-Type: text/html Transfer-Encoding: chunked Trailer: Expires <html><body><p>The file you requested is 3,400 bytes long and was last modified: Sat, 20 Mar 2004 21:12:00 GMT. </p></body></html> 29 <html><body><p>The file you requested is 5 3,400 23 bytes long and was last modified: 1d Sat, 20 Mar 2004 21:12:00 GMT 13 .</p></body></html> 0 Expires: Sat, 27 Mar 2004 21:12:00 GMT “Expires:” response header covered in section 14.21 Examples from: http://www.tcpipguide.com/free/t_HTTPDataLengthIssuesChunkedTransfersandMessageTrai-3.htm Two More Request Headers to Process User-Agent Request Header • User-Agent: – section 14.43 – “This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations. User agents SHOULD include this field with requests.” – examples: • User-Agent: CERN-LineMode/2.15 libwww/2.17b3 • User-Agent: CS 495/595 Spring 2006 Automated Checker (A3) Referer Request Header • Referer: – section 14.36 – “The Referer[sic] request-header field allows the client to specify, for the server's benefit, the address (URI) of the resource from which the Request-URI was obtained (the "referrer", although the header field is misspelled.) The Referer request-header allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance. The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard.” – examples: • Referer: http://www.w3.org/hypertext/DataSources/Overview.html • Referer: /opening-page.php