lesson14

Introduction to HTTP
The Hypertext Transfer Protocol
is an ‘application-layer’ protocol
for the ‘client/server’ paradigm
‘Request’ and ‘Response’
[Diagram: along a timeline, the client sends an HTTP Request message to the server, and the server replies with an HTTP Response message]
Built on TCP/IP
• Application programmers will need to be
aware that HTTP relies on TCP’s reliable,
stream-oriented and connection-based
transport-layer facilities when specifying
the socket types, functions, and options
server: socket(), bind(), listen(), accept(), read(), write(), close()
client: socket(), connect(), write(), read(), close()
HTTP Request
Request line
Headers
Empty line
Body
(may be absent)
HTTP Response
Status line
Headers
Empty line
Body
(may be absent)
Sample Request line
“GET /home/web/cs336/syllabus.s09 HTTP/1.0\r\n”
• command (one word - all capitals): GET
• space
• resource pathname (UNIX filename syntax): /home/web/cs336/syllabus.s09
• space
• protocol and version-number: HTTP/1.0
• carriage-return and line-feed: \r\n
Sample Request header-lines
“Connection: close\r\n”
“User-agent: Mozilla 4.0\r\n”
“Accept-language: en\r\n”
• Each header-line ends with a carriage-return and line-feed (\r\n)
• The header-lines must be followed by an
‘empty’ line (carriage-return and line-feed)
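The request layout above (request line, header-lines, empty line) can be sketched as a single formatted string. This is a minimal illustration, not the code used in ‘grabfile.cpp’; the function name ‘build_request()’ and the choice of ‘snprintf()’ are assumptions made for the example, and the header values are the sample ones shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds an HTTP/1.0 GET request for 'path' in 'buf'.
   Returns the message length, or -1 if 'buf' is too small to hold it. */
int build_request( char *buf, size_t size, const char *path )
{
    int n;

    assert( buf != NULL && path != NULL );

    n = snprintf( buf, size,
                  "GET %s HTTP/1.0\r\n"          /* request line          */
                  "Connection: close\r\n"        /* header-lines          */
                  "User-agent: Mozilla 4.0\r\n"
                  "Accept-language: en\r\n"
                  "\r\n",                        /* empty line ends them  */
                  path );

    /* snprintf() reports the length it needed; reject a truncated message */
    if ( n < 0 || (size_t)n >= size ) return -1;
    return n;
}
```

Calling ‘build_request( buf, sizeof buf, "/index.html" )’ with a 256-byte buffer yields a message that begins with the request line and ends with the two consecutive CR-LF pairs that mark the end of the headers.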
Sample Response line
“HTTP/1.0 200 OK\r\n”
• protocol and version-number: HTTP/1.0
• space
• status code: 200
• space
• response phrase: OK
• carriage-return and line-feed: \r\n
Sample Response header-lines
“Connection: close\r\n”
“Date: Tue, 15 March 2009\r\n”
“Server: Apache/1.3 (Unix)\r\n”
“Content-Type: text/html\r\n”
• Each header-line ends with a carriage-return and line-feed (\r\n)
• The header-lines must be followed by an
‘empty’ line (carriage-return and line-feed)
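A client usually wants the numeric status code from the first line of the response. The sketch below shows one way to extract it, assuming the status line has the form shown above; the function name ‘parse_status_code()’ is hypothetical and is not part of the demo program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the status code from a status line such as
   "HTTP/1.0 200 OK\r\n".  Returns the code, or -1 if the line is malformed. */
int parse_status_code( const char *line )
{
    int major, minor, code;

    assert( line != NULL );

    /* the line must start with the protocol name and a slash */
    if ( strncmp( line, "HTTP/", 5 ) != 0 ) return -1;

    /* version-number, a space, then the three-digit status code */
    if ( sscanf( line + 5, "%d.%d %d", &major, &minor, &code ) != 3 ) return -1;
    if ( code < 100 || code > 599 ) return -1;
    return code;
}
```

For the sample line “HTTP/1.0 200 OK\r\n” this returns 200; a line that does not begin with “HTTP/” (a request line, for instance) returns -1.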
Demo: ‘grabfile.cpp’
• We shall construct a simple HTTP client
which will allow a user to obtain a named
internet object by typing its URL (Uniform
Resource Locator) on the command-line:
$ grabfile http://www.cs.usfca.edu/index.html
The URL concept
• URL means ‘Uniform Resource Locator’
• It’s a standard way of specifying any kind
of information available on the Internet
• Four elements of a URL specification:
– Method (i.e., the protocol for object retrieval)
– Host (i.e., location hostname or IP-address)
– Port (i.e., port-number for contacting server)
– Path (i.e., pathname of the resource’s file)
The URL Format
method://host:port/path
EXAMPLE: http://cs.usfca.edu:80/~cruse/cs336/syllabus.pdf
Note: The port-number is often omitted in cases where the ‘method’
is an internet protocol (like HTTP) which uses a ‘well-known port’
Application’s organization
Parse the URL entered on the command-line
to determine the server’s hostname and port-number
and the pathname to the desired file-object
Open a stream-oriented TCP internet socket
and establish a connection with the server
Form the HTTP Request message
and write it to the socket
Read from the socket to receive
the HTTP Response message
(and echo it to the display)
Close the socket to terminate the TCP connection
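The connect, write, read and close steps listed above might look roughly like the following sketch. It is not the code from ‘grabfile.cpp’: the function names ‘open_connection()’ and ‘exchange()’ are hypothetical, and the use of ‘getaddrinfo()’ for the hostname lookup is an assumption made for this example.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Opens a stream-oriented TCP socket and connects it to host:port.
   Returns the socket descriptor, or -1 on failure. */
int open_connection( const char *host, const char *port )
{
    struct addrinfo hints, *list, *ai;
    int sock = -1;

    assert( host != NULL && port != NULL );
    memset( &hints, 0, sizeof hints );
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6                */
    hints.ai_socktype = SOCK_STREAM;    /* TCP's stream-oriented type  */
    if ( getaddrinfo( host, port, &hints, &list ) != 0 ) return -1;

    /* try each address returned until one connection succeeds */
    for ( ai = list; ai != NULL; ai = ai->ai_next ) {
        sock = socket( ai->ai_family, ai->ai_socktype, ai->ai_protocol );
        if ( sock < 0 ) continue;
        if ( connect( sock, ai->ai_addr, ai->ai_addrlen ) == 0 ) break;
        close( sock );
        sock = -1;
    }
    freeaddrinfo( list );
    return sock;
}

/* Writes the whole request to the socket, then echoes the response to
   'out' until the server closes the connection.
   Returns the number of bytes received, or -1 on error. */
long exchange( int sock, const char *request, FILE *out )
{
    char buf[ 1024 ];
    size_t len, sent = 0;
    long total = 0;
    ssize_t n;

    assert( request != NULL && out != NULL );
    len = strlen( request );
    while ( sent < len ) {              /* write() may send only a part */
        n = write( sock, request + sent, len - sent );
        if ( n <= 0 ) return -1;
        sent += (size_t)n;
    }
    while ( ( n = read( sock, buf, sizeof buf ) ) > 0 ) {
        fwrite( buf, 1, (size_t)n, out );
        total += n;
    }
    return ( n < 0 ) ? -1 : total;
}
```

A caller would obtain a descriptor from ‘open_connection()’ using the host and port parsed from the URL, pass it to ‘exchange()’ together with the request message and ‘stdout’, and finally call ‘close()’ on the descriptor. Reading until ‘read()’ returns zero relies on the “Connection: close” header, which asks the server to close the connection after sending its response.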
Parsing the URL
• The most challenging part of this program
concerns the parsing of the command-line
argument, allowing for some ‘degenerate’
cases and some malformed specifications
• Several standard string-functions from the
UNIX runtime-library are put to good use,
including ‘strlen()’, ‘strncpy()’, ‘strtok()’ and
‘strtok_r()’, plus ‘strspn()’ and ‘strcspn()’
‘strlen()’
size_t strlen( const char *s );
• This function calculates the length of the
null-terminated string whose address is
supplied as the function-argument
#include <stdio.h>
#include <string.h>
char message[ ] = "Hello";
int main( void )
{
size_t len = strlen( message ); // length excludes the terminating null byte
printf( "'%s' has %zu characters\n", message, len );
}
OUTPUT: 'Hello' has 5 characters
‘strncpy()’
char *strncpy( char *dst, const char *src, size_t n );
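• This function copies at most n characters
from the ‘src’ string into the ‘dst’ string, so
provides a ‘safe’ way to copy from a string
that might be too long to fit the destination
• If ‘src’ holds n or more characters, no null
byte is written to ‘dst’, so the caller must
terminate the copy explicitly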
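#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main( int argc, char *argv[] )
{
char param[ 64 ];
if ( argc == 1 ) { fprintf( stderr, " param? \n" ); exit(1); }
strncpy( param, argv[ 1 ], 63 ); // source string has unknown length
param[ 63 ] = '\0'; // strncpy() adds no null byte if the source is too long
…
}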
‘strtok()’
char *strtok( char *s, const char *delim );
• This function extracts tokens from a string,
but after being called once, it remembers
where it stopped in case the caller wants
to extract more tokens from that string
char sentence[ ] = "Hello, world!\n";
char *word1 = strtok( sentence, " ,!\n" );
char *word2 = strtok( NULL, " ,!\n" );
char *word3 = strtok( NULL, " ,!\n" ); // no tokens remain, so this is NULL
// passing NULL for '%s' is undefined, so substitute a visible placeholder
printf( " '%s' '%s' '%s' \n", word1, word2, word3 ? word3 : "<nul>" );
OUTPUT: 'Hello' 'world' '<nul>'
‘strtok_r()’
char *strtok_r( char *s, const char *delim, char **saveptr );
• This function is a ‘reentrant’ version of the
‘strtok()’ function, placing in ‘*saveptr’ the
address of the character where a subsequent
search for another token to extract would begin
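char sentence[ ] = "Hello, world!\n";
char *word1, *word2, *word3, *saveptr;
word1 = strtok_r( sentence, " ,!\n", &saveptr );
word2 = strtok_r( NULL, " ,!\n", &saveptr ); // resumes from 'saveptr'
word3 = strtok_r( NULL, " ,!\n", &saveptr ); // no tokens remain: NULL
printf( " '%s' '%s' '%s' \n", word1, word2, word3 ? word3 : "<nul>" );
OUTPUT: 'Hello' 'world' '<nul>'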
‘strspn()’
size_t strspn( const char *s, const char *accept );
• This function searches a string for a set of
characters, and returns the length of the
initial segment which consists entirely of
characters that are in the ‘accept’ string
char vowels[ ] = "aeiou";
char word[ ] = "eating";
size_t len = strspn( word, vowels );
printf( "'%s' has %zu vowels before any consonant \n", word, len );
OUTPUT: 'eating' has 2 vowels before any consonant
‘strcspn()’
size_t strcspn( const char *s, const char *reject );
• This function searches a string for a set of
characters, and returns the length of the
initial segment which consists entirely of
characters that are not in the ‘reject’ string
char vowels[ ] = "aeiou";
char word[ ] = "shout";
size_t len = strcspn( word, vowels );
printf( "'%s' has %zu consonants before any vowel \n", word, len );
OUTPUT: 'shout' has 2 consonants before any vowel
Examples
• Here are a few examples of ‘malformed’
and ‘degenerate’ URL parameter-strings
http://:54321/index.html # no server hostname
http://yahoo.com:/index.html # missing port
http://usfca.edu:::54321/index.html # excess ‘:’s
www.sfmuni.com/index.html # no ‘method’
http://www.bart.gov/ # no pathname
www.sfsu.edu:80:57/index.html # extra chars
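One possible way to handle such strings with ‘strcspn()’ and ‘strspn()’ is sketched below. It is not the parser from ‘grabfile.cpp’: the structure ‘url_parts’, the function ‘parse_url()’, the buffer sizes, and the policy choices are all assumptions made for illustration. The sketch tolerates a missing method, a missing port (defaulting to 80) and a missing pathname (defaulting to “/”), but rejects a missing hostname and any excess characters after the port.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical container for the pieces of a URL (sizes are arbitrary). */
struct url_parts {
    char host[ 64 ];
    char port[ 8 ];
    char path[ 128 ];
};

/* Splits 'url' into host, port and path.  Returns 0 on success, or -1
   if the string is malformed or a piece does not fit its buffer. */
int parse_url( const char *url, struct url_parts *out )
{
    const char *p = url;
    size_t len;

    assert( url != NULL && out != NULL );

    /* Skip an optional "method://" prefix; with none, assume HTTP. */
    len = strcspn( p, ":/" );
    if ( strncmp( p + len, "://", 3 ) == 0 ) p += len + 3;

    /* Host: everything up to the next ':' or '/' (must not be empty). */
    len = strcspn( p, ":/" );
    if ( len == 0 || len >= sizeof out->host ) return -1;
    memcpy( out->host, p, len );
    out->host[ len ] = '\0';
    p += len;

    /* Port: optional digits after one ':'; default is well-known port 80. */
    strcpy( out->port, "80" );
    if ( *p == ':' ) {
        ++p;
        len = strspn( p, "0123456789" );
        if ( len >= sizeof out->port ) return -1;
        if ( len > 0 ) {
            memcpy( out->port, p, len );
            out->port[ len ] = '\0';
        }
        p += len;
    }

    /* Anything other than '/' or end-of-string here is excess text. */
    if ( *p != '/' && *p != '\0' ) return -1;

    /* Path: the rest of the string, or "/" when none was given. */
    len = strlen( p );
    if ( len >= sizeof out->path ) return -1;
    strcpy( out->path, len > 0 ? p : "/" );
    return 0;
}
```

Under these assumptions, the ‘missing port’, ‘no method’ and ‘no pathname’ examples are accepted with default values filled in, while the ‘no server hostname’, ‘excess :s’ and ‘extra chars’ examples return -1. A different program could reasonably choose other policies, for instance ignoring the extra colons instead of rejecting them.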
In-class exercise
• Download our ‘grabfile.cpp’ application
and see whether you are able to retrieve
any files by typing a URL as an argument
• HINT: You can use some of the same IP-addresses and hostnames that you tried
successfully while you were testing your
earlier ‘showpath.cpp’ project