116099808 -1-

Chapter 7

A simple HTTP Transaction
A simple CGI script
Using CGI.pm to Generate HTML
Sending Input to a CGI Script
Using HTML Forms to Send Input
Using CGI.pm to Create Forms and Read Input
Other Headers

The Common Gateway Interface

CGI describes a set of protocols through which applications (CGI scripts) interact with Web servers and, indirectly, with clients (e.g. Web browsers).

Perl is particularly effective with CGI. The Common Gateway Interface protocol is ideal for the WWW: it allows users of the Internet to connect to Web servers, which can then interact with other applications and pass their output to client applications in the form of dynamic content.

The advent of CGI transformed Perl from a systems administration tool into the most widely used server-side Internet programming language.

[Figure: the browser on the user's computer communicates over the Internet with the Web server using the HTTP protocol. The Web server stores HTML documents and communicates with the CGI script using the CGI protocol. The CGI script can access databases and other files.]

CGI is "common" in the sense that it is designed to be used with any programming language.

A simple HTTP Transaction

When a browser displays a Web page, it displays an HTML (HyperText Markup Language) document. Any HTML file available for viewing over the Web has a URL (Uniform Resource Locator) associated with it, for example:

    http://www.deitel.com/books/downloads.htm

http:// indicates that the resource is to be obtained using the Hypertext Transfer Protocol.

www.deitel.com is the hostname of the server where the document resides. The hostname translates to an IP address (207.60.134.230) that identifies the computer. The translation from hostname to IP address is performed by a domain name server (DNS server), a computer that maintains a database of hostnames and their corresponding IP addresses. This translation is called a DNS lookup.

The remainder of the URL, /books/downloads.htm, names the resource on that server.

When the browser is given a URL, it performs a simple HTTP transaction to fetch and display a Web page. The transaction is performed between a Web browser application on the client side and a Web server application on the server side.
The request could be as follows:

    GET /books/downloads.htm HTTP/1.0

The word GET is an HTTP method (function) indicating that the client wishes to retrieve a resource. The server responds with a status line:

    HTTP/1.0 200 OK           indicates success
    HTTP/1.0 404 Not Found    indicates failure

The server normally then sends one or more HTTP headers, which provide additional information about the data being sent. Example:

    Content-type: text/html

Each type of data sent from the server has a MIME (Multipurpose Internet Mail Extensions) type. Examples:

    text/plain    the data should be displayed without attempting to interpret any of the content as HTML markup
    image/gif     the content is a GIF image

The header or set of headers is followed by a blank line, which indicates to the client that the server has finished sending HTTP headers. Forgetting the blank line is a logic error.

The server then sends the text of the requested HTML document (downloads.htm). The connection is terminated when the transfer is complete. The client-side browser interprets the HTML it receives and displays (or renders) the results.

As long as the HTML file remains the same, the browser will render a static Web page. For dynamically created Web pages, a CGI script generates the data that is sent from the server to the browser.

By default, a Perl script outputs data to the screen (standard output) using the print statement. Standard output can, however, be redirected. When a script runs as a CGI script, its standard output is redirected (piped) to the Web server instead of the screen. The server sends the output to the client, which interprets the headers and tags as if they were part of a normal server response to an HTML document request, and renders the information.

If the server is on the local computer, you place the script in the localhost cgi-bin directory and access the script using the URL

    http://localhost/cgi-bin/fig07_02.pl

The Web server must be configured to recognize the resources. When the resource is a CGI script, the script must be executed on the server.
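
The response layout described above (headers, then a blank line, then the document) can be sketched with plain print statements. This is a minimal illustration rather than the chapter's own fig07_02.pl listing; the helper name build_response and the page text are assumptions made for the example.

```perl
#!/usr/bin/perl
# Minimal sketch of a CGI response built without CGI.pm.
# build_response is a hypothetical helper, not a library function.
use strict;
use warnings;

# Return one HTTP header, the mandatory blank line, then the HTML body.
sub build_response {
    my ( $title, $message ) = @_;
    my $headers = "Content-type: text/html\n";
    my $body    = "<html><head><title>$title</title></head>"
                . "<body><p>$message</p></body></html>\n";
    return $headers . "\n" . $body;    # the lone "\n" is the blank line
}

# When run by a Web server, standard output goes to the server,
# which forwards it to the client.
print build_response( "Hello", "Generated at " . localtime() );
```

Run from the command line, the script simply prints the header, the blank line and the HTML to the screen; placed in a cgi-bin directory, the same output would be sent to the browser. Because the page contains the current time, each request yields a different (dynamic) document.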
The server recognizes a CGI script either by a special filename extension such as .cgi or .pl, or by its location in a special directory such as cgi-bin. In addition, permissions must be set so that the script can be accessed and executed when a client requests it.

Using CGI.pm to Generate HTML

    #!/usr/bin/perl
    # Fig. 7.4: fig07_04.pl
    # Program to display CGI environment variables.

    use warnings;
    use strict;
    use CGI qw( :standard );   # directs Perl to import a standard set of functions

    # CGI library function header() returns a Content-type: text/html
    # header followed by a blank line.
    # CGI library function start_html() returns the standard HTML opening tags.
    print header(), start_html( "Environment Variables" );

    # Send the HTML table tag and its attributes.
    print '<table border = "0" cellspacing = "2">';

    # %ENV is a built-in hash that contains the names and values of the
    # environment variables; the names are sorted before printing.
    foreach my $variable ( sort( keys %ENV ) ) {

       # Tr() returns <tr> </tr> tags around its arguments; here it takes
       # two arguments, each a table cell produced by td().
       # b() generates bold tags around the name of the environment variable.
       # i() generates tags that italicize its value.
       #
       # For the same result, the print statement below could be replaced
       # by statements that print the HTML directly:
       #    print "<tr><td><b>$variable:</b></td>";
       #    print "<td><i>$ENV{ $variable }</i></td></tr>";
       print Tr( td( b( "$variable:" ) ),
                 td( i( $ENV{ $variable } ) ) );
    }

    # Print the end-table tag and call end_html() to generate the
    # </body> and </html> tags.
    print '</table>', end_html();

CGI scripts send output to the server through standard output, which is redirected from the screen to the server. The second way the server and CGI scripts interact is through environment variables. Environment variables provide information about the server's and the client's execution environment. For example, a script may send browser-specific information based on the HTTP_USER_AGENT variable, which identifies the browser the client is using. The script fig07_04.pl prints out the execution environment variables.
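
As a small illustration of reacting to HTTP_USER_AGENT, the sketch below chooses a message based on that variable. The helper name greeting_for_agent and the particular patterns it checks are assumptions made for this example, not part of the chapter's figures.

```perl
#!/usr/bin/perl
# Sketch: tailoring output to the client using HTTP_USER_AGENT.
use strict;
use warnings;

# Hypothetical helper: pick a greeting from the User-Agent string.
sub greeting_for_agent {
    my ($agent) = @_;
    return "Hello, unknown client" unless defined $agent && length $agent;
    return "Hello, Firefox user"      if $agent =~ /Firefox/i;
    return "Hello, command-line user" if $agent =~ /curl|Wget/i;
    return "Hello, browser user";
}

# The server sets HTTP_USER_AGENT in %ENV before running the script;
# outside a server it is normally undefined.
print "Content-type: text/plain\n\n";
print greeting_for_agent( $ENV{HTTP_USER_AGENT} ), "\n";
```

Keeping the decision in a function that takes the string as an argument, rather than reading %ENV directly, makes the logic easy to try from the command line.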

Sending Input to a CGI Script (3 methods)

First method: the environment variable QUERY_STRING

The environment variable QUERY_STRING provides a mechanism to send information to the CGI script. QUERY_STRING contains the information that is appended to the URL in a GET request:

    www.somesite.com/cgi-bin/script.pl?state=California

The ? is a delimiter and not part of the data; here QUERY_STRING contains state=California.

    #!/usr/bin/perl
    # Fig. 7.5: fig07_05.pl
    # An example of using QUERY_STRING.

    use warnings;
    use strict;
    use CGI qw( :standard );

    # Fall back to an empty string so the comparison below does not
    # warn when the variable is not set.
    my $query = $ENV{ "QUERY_STRING" } // "";

    print header(), start_html( "QUERY_STRING example" );
    print h2( "Name/Value Pairs" );

    if ( $query eq "" ) {
       print 'Please add some name-value pairs to the URL above. ';
       print 'Or try <a href = "fig07_05.pl?name=Joe&age=29">this</a>.';
    }
    else {
       print i( "The query string is '$query'." ), br();
       my @pairs = split( "&", $query );

       foreach my $pair ( @pairs ) {
          my ( $name, $value ) = split( "=", $pair );
          print "You set '$name' to value '$value'.", br();
       }
    }

    print end_html();

The above CGI script reads and reacts to data passed through the environment variable QUERY_STRING. The query string is split into name/value pairs (separated by &), and each pair is then split into its name and its value (separated by =).

[Screenshots: output when the query string is empty; output when a query string is added to the URL (or is part of a hyperlink).]

Using HTML Forms to Send Input

Second method: using GET

HTML provides the ability to include forms on Web pages. The <form> and </form> tags surround the HTML form. The <form> tag takes two attributes, action and method:

- action: specifies the script (form handler) that is requested when the user submits the form
- method: specifies the HTTP method, either GET or POST, that is used to send the form data

HTML forms can contain elements such as:

    Tag name     Type attribute   Description
    -----------  ---------------  ---------------------------------------------
    <input>      text             Provides a single-line text field for text
                                  input. This is the default input type.
                 password         Like text, but each character typed by the
                                  user appears as an asterisk (*) to hide the
                                  input for security.
                 checkbox         Displays a checkbox that can be checked
                                  (true) or unchecked (false).
                 radio            Radio buttons are like checkboxes, except
                                  that only one radio button in a group of
                                  radio buttons can be selected at a time.
                 button           A standard push button.
                 submit           A push button that submits form data
                                  according to the form's action.
                 image            The same as submit, but displays an image
                                  instead of a button.
                 reset            A button that resets form fields to their
                                  default values.
                 file             Displays a text field and a button that allow
                                  the user to specify a file to upload to a Web
                                  server. The button displays a dialog box in
                                  which the user selects the file.
                 hidden           Allows hidden form data that can be used by
                                  the form handler on the server.
    <select>                      Drop-down menu or selection box. Used with
                                  the <option> tag to specify the options to
                                  select.
    <textarea>                    A multiline area in which text can be input
                                  or displayed.

    #!/usr/bin/perl
    # Fig 7.7: fig07_07.pl
    # Demonstrates GET method with HTML form.

    use warnings;
    use strict;
    use CGI qw( :standard );

    # The query string is empty (or unset) the first time the script runs.
    our ( $name, $value ) = split( '=', $ENV{ QUERY_STRING } // '' );

    print header(), start_html( 'Using GET with forms' );
    print p( 'Enter one of your favorite words here: ' );
    print '<form method = "GET" action = "fig07_07.pl">';
    print '<input type = "text" name = "word">';
    print '<input type = "submit" value = "Submit word">';
    print '</form>';

    # $name is undefined until the form has been submitted.
    if ( defined $name && $name eq 'word' ) {
       print p( 'Your word is: ', b( $value ) );
    }

    print end_html();

- The first time the script is executed, there is no value in the query string.
- Once a word is entered and the Submit button is clicked, the script is called again.
- The name of the text field (word) and the value entered by the user (for example, technology) form the name/value pair word=technology, which is assigned to the query string and appended to the URL in the browser window.
- During the second execution of the script, when the query string is decoded, the variable $name is set to 'word' and the variable $value is set to 'technology'.
- Since $name equals 'word', the output of the final print statement goes to the server, which sends it to the browser to be displayed.

The GET method with an HTML form passes data to the CGI script through the environment variable QUERY_STRING.

Third method: using POST

    #!/usr/bin/perl
    # Fig 7.8: fig07_08.pl
    # Demonstrates POST method with HTML form.

    use warnings;
    use strict;
    use CGI qw( :standard );

    our ( $data, $name, $value );

    # CONTENT_LENGTH is not set the first time the script runs,
    # so read zero characters in that case.
    read( STDIN, $data, $ENV{ 'CONTENT_LENGTH' } || 0 );
    ( $name, $value ) = split( '=', $data );

    print header(), start_html( 'Using POST with forms' );
    print p( 'Enter one of your favorite words here: ' );
    print '<form method = "POST" action = "fig07_08.pl">';
    print '<input type = "text" name = "word">';
    print '<input type = "submit" value = "Submit word">';
    print '</form>';

    # $name is undefined until the form has been submitted.
    if ( defined $name && $name eq 'word' ) {
       print p( 'Your word is: ', b( $value ) );
    }

    print end_html();

Differences between POST and GET

With the POST method, the server sets the environment variable CONTENT_LENGTH to the number of characters posted and sends the name/value pairs to the script's standard input, instead of placing them in QUERY_STRING.

The read() function reads exactly the number of characters that were posted from STDIN and stores the data in the variable $data. The line-input operator <STDIN> is not used because, when assigned to a scalar variable, it reads up to and including a newline. Posted data does not necessarily end with a newline, so the script could wait for input that never arrives, and the server would eventually terminate the script.

Another difference concerns caching of Web pages. Responses to POST requests are normally not cached, because a subsequent request may carry different data and a cached reply would then be inaccurate.

With both POST and GET, we need to consider how the data is encoded. Form data is URL-encoded: spaces are sent as +, and certain other symbols are sent as % followed by their two-digit hexadecimal ASCII value.

Using the CGI.pm module takes care of this low-level parsing and decoding. No explicit read or split of the data is required; instead, the functions param(), start_form(), textfield(), submit() and end_form() are used.
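
To make the encoding concrete, here is a minimal sketch of the decoding that fig07_07.pl and fig07_08.pl leave out. The helper names url_decode and parse_form_data and the sample input are assumptions for illustration; this is not how CGI.pm is implemented, only the general idea of what such parsing has to do.

```perl
#!/usr/bin/perl
# Sketch: decoding URL-encoded form data by hand.
use strict;
use warnings;

# Decode one component: "+" becomes a space, and "%XX" becomes the
# character whose hexadecimal code is XX.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr( hex($1) )/ge;
    return $text;
}

# Split "name=value&name=value" into a hash of decoded pairs.
# If a name occurs more than once, the last value wins in this sketch.
sub parse_form_data {
    my ($data) = @_;
    my %form;
    foreach my $pair ( split /&/, $data ) {
        my ( $name, $value ) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $form{ url_decode($name) } = url_decode( defined $value ? $value : "" );
    }
    return %form;
}

my %form = parse_form_data("word=hello+world&note=50%25+off%21");
print "$_ => $form{$_}\n" for sort keys %form;
```

The sample string decodes to the pairs note => "50% off!" and word => "hello world". The same function works for data taken from QUERY_STRING (GET) or read from standard input (POST), since both use the same encoding.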

    #!/usr/bin/perl
    # Fig 7.9: fig07_09.pl
    # Demonstrates use of CGI.pm with HTML form.
    # Using the shortcut functions to generate HTML code makes
    # the script more concise and readable.

    use warnings;
    use strict;
    use CGI qw( :standard );

    my $word = param( "word" );

    print header(), start_html( 'Using CGI.pm with forms' );
    print p( 'Enter one of your favorite words here: ' );   # p: paragraph
    print start_form(), textfield( "word" );
    print submit( "Submit word" ), end_form();
    print p( 'Your word is: ', b( $word ) ) if $word;       # b: bold
    print end_html();

Function param() takes one argument, the name of an HTML form field, and returns the value associated with that field. No read, split or URL decoding is required. The function works with both the GET and the POST method.

Function start_form() generates the opening <form> tag. Although the default method of the HTML <form> tag is GET, the default used by start_form() is POST. By default the action is the script itself, so the script calls itself. The function start_form() can also be called with the method and action passed explicitly:

    start_form( -method => "POST", -action => "fig07_09.pl" );

Each attribute name is paired with its value using the => operator. The attribute names are prefixed by a hyphen, and when this named-argument syntax is used the order of the arguments is not important.

Function textfield() generates the HTML for a single-line <input> form element with the name word:

    <input type = "text" name = "word">

By default, CGI.pm fills a redisplayed field with the value that was previously submitted. Calling textfield() with the override attribute set to 1 turns this off for that field, so that previously entered data is not carried over:

    textfield( -name => "word", -override => 1 );

Function submit() creates the submit button with the label "Submit word".

Function end_form() generates the closing </form> tag.
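
The named-argument convention can be illustrated with a small stand-alone function. The helper make_input below is hypothetical and is not part of CGI.pm; it only sketches how hyphen-prefixed name/value pairs can be collected into a hash so that their order does not matter.

```perl
#!/usr/bin/perl
# Sketch: accepting hyphen-prefixed named arguments in any order.
use strict;
use warnings;

# Hypothetical helper (not a CGI.pm function): builds an <input> tag
# from named arguments, defaulting the type to "text".
sub make_input {
    my %args = ( -type => "text", @_ );    # later pairs override the default
    my $html = "<input";
    foreach my $key ( sort keys %args ) {
        ( my $attr = $key ) =~ s/^-//;     # strip the leading hyphen
        $html .= qq{ $attr="$args{$key}"};
    }
    return $html . ">";
}

# The two calls differ only in argument order and print the same tag.
print make_input( -name => "word", -type => "text" ), "\n";
print make_input( -type => "text", -name => "word" ), "\n";
```

Both calls print <input name="word" type="text">. Because the arguments are stored in a hash, the function sees the same data whichever order the caller uses; the attributes are sorted only to make the output predictable.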

Other Headers

Function header() outputs Content-type: text/html by default.

    header( "text/plain" )

produces Content-type: text/plain.

    header( -Refresh => "5; URL=http://www.deitel.com/newpage.html" );

redirects the client to a new location after 5 seconds. Without the URL part, the current page would be refreshed instead.

The CGI protocol indicates that certain types of headers output by a CGI script are to be handled by the server, rather than be passed directly to the client. Function redirect() can be used to output the Location header. Example:

    redirect( "/newpage.html" );    # relative pathname

For a script on www.deitel.com this corresponds to

    Location: http://www.deitel.com/newpage.html

The redirection is performed on the server side. No Content-type header is necessary, since the new location will have its own content type.

The status message can be changed using the function header() with the -status flag:

    header( -status => "204 No Response" );

Summary

The CGI protocol allows scripts to interact with servers in three basic ways:

1. through the output of headers and content to the server via standard output;
2. through the server's setting of environment variables, available via %ENV, including the URL-encoded QUERY_STRING (GET method);
3. through POSTed, URL-encoded data that the server sends to the script's standard input.
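
The three interactions in the summary can be seen together in one short sketch. It relies on the standard CGI environment variable REQUEST_METHOD, which these notes do not otherwise discuss, and the helper name raw_form_data is an assumption made for the example.

```perl
#!/usr/bin/perl
# Sketch: one script touching all three interaction paths.
use strict;
use warnings;

# Return the raw URL-encoded form data for a request.
# $env is a hash reference such as \%ENV; $in is a filehandle such as \*STDIN.
sub raw_form_data {
    my ( $env, $in ) = @_;
    my $method = uc( $env->{REQUEST_METHOD} || "GET" );

    if ( $method eq "POST" ) {
        # Way 3: POSTed data arrives on standard input.
        my $length = $env->{CONTENT_LENGTH} || 0;
        my $data   = "";
        read( $in, $data, $length ) if $length > 0;
        return $data;
    }

    # Way 2: GET data arrives in an environment variable.
    return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : "";
}

my $data = raw_form_data( \%ENV, \*STDIN );

# Way 1: headers and content go to the server via standard output.
print "Content-type: text/plain\n\n";
print "Raw form data: $data\n";
```

Passing the environment and the input handle as arguments is a design choice made for this sketch so the function can be exercised without a Web server, for example with a hash of test values and an in-memory filehandle.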