HTTP Hypertext Transfer Protocol 26-Jul-16

HTTP messages

HTTP is the language that web clients and web servers
use to talk to each other


HTTP is largely “under the hood,” but a basic understanding
can be helpful
Each message, whether a request or a response, has
three parts:
1. The request or the response line
2. A header section
3. The body of the message
What the client does, part I


The client sends a message to the server at a particular
port (80 is the default)
The first part of the message is the request line, containing:
    A method (HTTP command) such as GET or POST
    A document address, and
    An HTTP version number
Example:

GET /index.html HTTP/1.0
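
As a rough illustration of the three fields in a request line, here is a minimal sketch that splits the example above on whitespace. The class name RequestLineDemo and its parse method are hypothetical and are not part of the course code.

```java
// Illustrative sketch only; RequestLineDemo is a hypothetical name, not from the slides.
class RequestLineDemo {
    // Splits a request line into {method, document address, HTTP version}.
    static String[] parse(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected METHOD ADDRESS VERSION, got: " + requestLine);
        }
        return parts;
    }

    public static void main(String[] args) {
        String[] parts = parse("GET /index.html HTTP/1.0");
        System.out.println("Method:  " + parts[0]);
        System.out.println("Address: " + parts[1]);
        System.out.println("Version: " + parts[2]);
    }
}
```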
Other methods

Other methods besides GET and POST are:

HEAD: Like GET, but ask that only a header be returned
PUT: Request to store the entity-body at the URI
DELETE: Request removal of data at the URI
LINK: Request header information be associated with a
document on the server
UNLINK: Request to undo a LINK request
OPTIONS: Request information about communications
options on the server
TRACE: Request that the entity-body be returned as received
(used for debugging)
What the client does, part II

The second part of a request is optional header information, such as:
    What the client software is
    What formats it can accept
All information is in the form Name: Value
Example:
User-Agent: Mozilla/2.02Gold (WinNT; I)
Accept: image/gif, image/jpeg, */*

A blank line ends the header
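
To make the layout concrete, the following sketch assembles a request line, some Name: Value headers, and the terminating blank line into one string. It assumes lines end with CRLF ("\r\n"), which is how HTTP terminates lines on the wire; the class name RequestHeadDemo is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; RequestHeadDemo is a hypothetical name, not from the slides.
class RequestHeadDemo {
    // Builds the request line plus headers, ending with the blank line.
    static String build(String requestLine, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(requestLine).append("\r\n");
        for (Map.Entry<String, String> h : headers.entrySet()) {
            // Every header has the form Name: Value
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // a blank line ends the header
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the headers in insertion order
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("User-Agent", "Mozilla/2.02Gold (WinNT; I)");
        headers.put("Accept", "image/gif, image/jpeg, */*");
        System.out.print(build("GET /index.html HTTP/1.0", headers));
    }
}
```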
Client request headers

Accept: type/subtype, type/subtype, ...
    Specifies media types that the client prefers to accept
Accept-Language: en, fr, de
    Preferred language (for example: English, French, German)
User-Agent: string
    The browser or other client program sending the request
From: dave@acm.org
    Email address of user of client program
Cookie: name=value
    Information about a cookie for that URL
    Multiple cookies are separated by semicolons (name1=value1; name2=value2)
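
As a small illustration of the Accept header format, this sketch splits a value such as the one shown earlier into its individual media types. It is a simplification: it assumes a plain comma-separated list and ignores any parameters a client might attach to a type. The class name AcceptHeaderDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; AcceptHeaderDemo is a hypothetical name, not from the slides.
class AcceptHeaderDemo {
    // Splits an Accept header value into its type/subtype entries.
    static List<String> mediaTypes(String acceptValue) {
        List<String> types = new ArrayList<>();
        for (String part : acceptValue.split(",")) {
            String type = part.trim();
            if (!type.isEmpty()) {
                types.add(type);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        for (String type : mediaTypes("image/gif, image/jpeg, */*")) {
            System.out.println(type);
        }
    }
}
```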
What the client does, part III

The third part of a request (after the blank line) is the
entity-body, which contains optional data


The entity-body part is used mostly by POST requests
The entity-body part is normally empty for a GET request
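
To show where the entity-body sits, here is a sketch of a complete POST request built as a string. The path /submit and the form data are made up for illustration, and the Content-Type value (application/x-www-form-urlencoded, the usual encoding for HTML form submissions) is an assumption rather than something the slides specify. The class name PostRequestDemo is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; PostRequestDemo is a hypothetical name, not from the slides.
class PostRequestDemo {
    // Builds request line, headers, blank line, then the entity-body.
    static String build(String path, String body) {
        // Content-Length counts bytes, not characters
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.0\r\n"
             + "Content-Type: application/x-www-form-urlencoded\r\n"
             + "Content-Length: " + length + "\r\n"
             + "\r\n"   // blank line: everything after this is the entity-body
             + body;
    }

    public static void main(String[] args) {
        // "/submit" and the form fields are hypothetical example values
        System.out.println(build("/submit", "name=dave&lang=en"));
    }
}
```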
What the server does, part I


The server response is also in three parts
The first part is the status line, which tells:
    The HTTP version
    A status code
    A short description of what the status code means
Example: HTTP/1.1 404 Not Found
Status codes are in groups:
    100-199  Informational
    200-299  The request was successful
    300-399  The request was redirected
    400-499  The request failed
    500-599  A server error occurred
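
The grouping by hundreds can be expressed as a short lookup. This is only a sketch of the idea; the class name StatusGroupDemo and the returned labels are hypothetical choices for illustration.

```java
// Illustrative sketch only; StatusGroupDemo is a hypothetical name, not from the slides.
class StatusGroupDemo {
    // Maps a status code to the meaning of its group (its first digit).
    static String group(int code) {
        if (code < 100 || code > 599) {
            return "Unknown";
        }
        switch (code / 100) {
            case 1:  return "Informational";
            case 2:  return "Successful";
            case 3:  return "Redirected";
            case 4:  return "Request failed";
            default: return "Server error"; // only 5xx can reach this branch
        }
    }

    public static void main(String[] args) {
        int[] codes = { 200, 301, 404, 504 };
        for (int code : codes) {
            System.out.println(code + ": " + group(code));
        }
    }
}
```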
Common status codes

200 OK
    Everything worked, here’s the data
301 Moved Permanently
    URI was moved, but here’s the new address for your records
302 Moved temporarily
    URL temporarily out of service, keep the old one but use this one for now
400 Bad Request
    There is a syntax error in your request
403 Forbidden
    You can’t do this, and we won’t tell you why
404 Not Found
    No such document
408 Request Time-out, 504 Gateway Time-out
    Request took too long to fulfill for some reason
What the server does, part II


The second part of the response is header information,
ended by a blank line
Example:
Content-Length: 2532
Connection: Close
Server: GWS/2.0
Date: Sun, 01 Dec 2002 21:24:50 GMT
Content-Type: text/html
Cache-control: private
Set-Cookie: PREF=ID=05302a93093ec661:TM=1038777890:LM=1038777890:S=yNWNjraftUz299RH; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com
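
One way a client might read a header block like this is to take lines up to the first blank line and split each at its first colon. The sketch below assumes the headers are already available as a string and that a repeated header name may simply overwrite the earlier value; the class name HeaderParseDemo is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; HeaderParseDemo is a hypothetical name, not from the slides.
class HeaderParseDemo {
    // Reads "Name: Value" lines until the blank line that ends the header.
    static Map<String, String> parse(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : headerBlock.split("\r?\n")) {
            if (line.isEmpty()) {
                break; // blank line: the entity-body would follow
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a Name: Value line, skip it
            }
            // Split at the FIRST colon only; values such as dates contain colons too
            headers.put(line.substring(0, colon).trim(),
                        line.substring(colon + 1).trim());
        }
        return headers;
    }

    public static void main(String[] args) {
        String block = "Content-Length: 2532\r\n"
                     + "Date: Sun, 01 Dec 2002 21:24:50 GMT\r\n"
                     + "Content-Type: text/html\r\n"
                     + "\r\n";
        for (Map.Entry<String, String> h : parse(block).entrySet()) {
            System.out.println(h.getKey() + " -> " + h.getValue());
        }
    }
}
```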

Viewing the response




There is a header viewer at http://www.delorie.com/web/headers.html
(with nasty jittery advertisements)
Example 2.3 (GetResponses) in the Gittleman book does the same thing
Here’s an example (from GetResponses):
% java GetResponses http://www.cis.upenn.edu/~matuszek/cit5972003/index.html
Status line:
HTTP/1.1 200 OK
Response headers:
Date: Wed, 10 Sep 2003 00:26:53 GMT
Server: Apache/1.3.26 (Unix) PHP/4.2.2 mod_perl/1.27 mod_ssl/2.8.10 OpenSSL/0.9.6e
Last-Modified: Tue, 09 Sep 2003 19:24:50 GMT
ETag: "1c1ad5-1654-3f5e2902"
Accept-Ranges: bytes
Content-Length: 5716
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
The GetResponses program, I

Here’s just the skeleton of the program that provided the output on the last
slide:
import java.net.*;
import java.io.*;

public class GetResponses {
    public static void main(String[] args) {
        try {
            // ...interesting code goes here...
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The GetResponses program, II

Here’s the interesting part of the code:

URL url = new URL(args[0]);
URLConnection c = url.openConnection();
// For HTTP connections, header field 0 is typically the status line
System.out.println("Status line: ");
System.out.println('\t' + c.getHeaderField(0));
System.out.println("Response headers:");
String value = "";
int n = 1;
// Fields 1, 2, ... are the response headers; null means there are no more
while (true) {
    value = c.getHeaderField(n);
    if (value == null) break;
    System.out.println('\t' + c.getHeaderFieldKey(n++) +
                       ": " + value);
}
Server response headers

Server: NCSA/1.3
    Name and version of the server
Content-Type: type/subtype
    Should be of a type and subtype specified by the client’s Accept header
Set-Cookie: name=value; options
    Requests the client to store a cookie with the given name and value
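
To illustrate the name=value; options layout of Set-Cookie, the sketch below separates the cookie's name and value from the options that follow. The cookie in main is a made-up example (the domain .example.com is a placeholder), and the class name SetCookieDemo is hypothetical. A real client would also need to interpret the options, which this sketch does not attempt.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; SetCookieDemo is a hypothetical name, not from the slides.
class SetCookieDemo {
    // Returns {name, value} from the leading name=value pair.
    static String[] nameAndValue(String setCookie) {
        String pair = setCookie.split(";", 2)[0].trim();
        int eq = pair.indexOf('='); // the value itself may contain '=' characters
        if (eq < 0) {
            throw new IllegalArgumentException("No name=value pair: " + setCookie);
        }
        return new String[] { pair.substring(0, eq), pair.substring(eq + 1) };
    }

    // Returns the options after the first ';', keyed by lower-cased option name.
    static Map<String, String> options(String setCookie) {
        Map<String, String> opts = new LinkedHashMap<>();
        String[] parts = setCookie.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.isEmpty()) {
                continue;
            }
            int eq = part.indexOf('=');
            if (eq < 0) {
                opts.put(part.toLowerCase(Locale.ROOT), ""); // option without a value
            } else {
                opts.put(part.substring(0, eq).trim().toLowerCase(Locale.ROOT),
                         part.substring(eq + 1).trim());
            }
        }
        return opts;
    }

    public static void main(String[] args) {
        // Hypothetical cookie, shaped like the Set-Cookie format described above
        String header = "id=abc123; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.example.com";
        String[] nv = nameAndValue(header);
        System.out.println("name=" + nv[0] + ", value=" + nv[1]);
        System.out.println("options=" + options(header));
    }
}
```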
What the server does, part III


The third part of a server response is the entity body
This is often an HTML page

But it can also be a jpeg, a gif, plain text, etc.--anything the
browser (or other client) is prepared to accept
The <meta http-equiv> tag



The <meta http-equiv=string content=string> tag may occur in the
<head> of an HTML document
http-equiv and content typically have the same kinds of values as
in the HTTP header
This tag asks the client to pretend that the information actually
occurred in the header




The information is not really in the header
This tag is available because you have little direct control over what is in
the header (unless you write your own server)
As usual, not all browsers handle this information the same way
Example:
<meta http-equiv="Set-Cookie"
content="value=n;expires=date; path=url">
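
For a feel of how a client could pick up this information, the sketch below pulls the http-equiv and content values out of an HTML string with a regular expression. It is deliberately narrow: it assumes double-quoted attributes in exactly this order, which real pages do not guarantee, so an HTML parser would be the more robust choice. The class name MetaHttpEquivDemo is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; MetaHttpEquivDemo is a hypothetical name, not from the slides.
class MetaHttpEquivDemo {
    // Matches <meta http-equiv="..." content="...">, allowing whitespace or line breaks between attributes.
    private static final Pattern META = Pattern.compile(
        "<meta\\s+http-equiv\\s*=\\s*\"([^\"]*)\"\\s+content\\s*=\\s*\"([^\"]*)\"\\s*/?>",
        Pattern.CASE_INSENSITIVE);

    // Returns {http-equiv, content} for the first matching tag, or null if there is none.
    static String[] find(String html) {
        Matcher m = META.matcher(html);
        return m.find() ? new String[] { m.group(1), m.group(2) } : null;
    }

    public static void main(String[] args) {
        String html = "<head><meta http-equiv=\"Set-Cookie\"\n"
                    + "content=\"value=n;expires=date; path=url\"></head>";
        String[] found = find(html);
        if (found != null) {
            // Treat the pair as if it had arrived as an HTTP header
            System.out.println(found[0] + ": " + found[1]);
        }
    }
}
```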
Summary

HTTP is a fairly straightforward protocol with a lot of possible
kinds of predefined header information


More kinds can be added, so long as client and server agree
A request from the client consists of three parts:
1. A request line
2. A block of header information, ending with a blank line
3. The (optional) entity body, containing data



A response from the server consists of the same three parts
HTTP headers are “under the hood” information, not normally
displayed to the user
As with most of the things covered in CIT597,


We have covered only the fundamentals
Much more detail can be found on the Web
The End