T4L2 CGI, Perl, & PHP

Introduction

As originally developed, the World Wide Web provided stateless connections to information on web servers. Every server query resulted in a new page download. Each query was independent of other queries, and no information was saved from one query to the next. The idea is similar to reading pages from a magazine: one can read the pages in sequence or out of sequence. No concept of the "state of the user" was employed. As a result, the Web was largely static, with interactivity limited to the user's selection of hyperlinks.

The problem facing Web developers was how to create dynamic output through the Web. One example is using the Web to access information in a database, such as a search engine or online catalog. Another is using the Web to run a program on a server and return results tailored to user-selected options. In other words, the goal was to make the Web truly interactive.

By the end of this lesson, you should be able to:

  - Describe the concept of dynamic web pages.
  - Describe how CGI, Perl and PHP are used for server-side programming.
  - Locate and utilize online Perl and PHP libraries and resources.

This lesson will cover:

  - Common Gateway Interface (CGI): a protocol that allows web pages to run programs on web servers.
  - Perl programming language: used to show examples of CGI.
  - PHP scripting language: also examined.

Additional Resources

  CGI scripts from the O'Reilly www.perl.com site
    http://www.perl.com/reference/query.cgi?cgi
  The CGI Collection from ITM
    http://www.itm.com/cgicollection/
  CGI Script Collections by Web Publishers
    http://www.webpub.com/tools/cgiscript.html
  CAC advanced Web FAQ
    http://www.personal.psu.edu/faq/web-weenie.shtml
  CAC Scripts Library FAQ
    http://www.personal.psu.edu/faq/cgi.shtml
  BigNoseBird Scripts Library
    http://bignosebird.com/cgi.shtml
  The CGI Resource Index
    http://cgi.resourceindex.com/
  Perl.com - the source for Perl
    http://www.perl.com/pub
  The PHP Home Page
    http://www.php.net/
  The PHP FAQ
    http://www.php.net/FAQ.php
  The PHP Code Exchange
    http://px.sklar.com/

Common Gateway Interface (CGI)

The Common Gateway Interface (CGI) is a standardized specification for interfacing external applications with web servers via the Hypertext Transfer Protocol (HTTP). The specification is sometimes called the 'standard CGI' because it uses the standard input device STDIN and the standard output device STDOUT to read and send data.

The diagram above shows the client-server interaction that takes place with CGI. The client browser sends a request to the server via HTTP. This request contains the name of the CGI program (called a CGI script) and any parameters. The server processes the CGI request by first gathering environment information about the request and the client that made it, then sending a data stream to the CGI script via STDIN. The CGI script processes the data stream and responds by sending an output data stream (most likely an HTML document) via STDOUT, which the server returns to the browser.

NOTE: STDIN and STDOUT are mnemonics for standard input and standard output, two predefined handles for file streams. Many operating systems allow a program's input and output to be specified (or redirected) at run time, so programs of this kind are written to read input from STDIN and write output to STDOUT. If nothing is specified at run time, STDIN is usually the keyboard and STDOUT is usually the display.
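
The data flow described above can be sketched in Perl, the language used for the examples later in this lesson. This is a minimal illustration under stated assumptions, not a production script: it assumes the server sets the standard CGI environment variables REQUEST_METHOD and CONTENT_LENGTH, and the sub name build_response is hypothetical, chosen only for this example. The interpreter path on the first line varies from server to server.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any library): builds the complete
# CGI response, header plus body, as a single string.
sub build_response {
    my ($method, $body) = @_;
    my $out = "Content-type: text/plain\n\n";    # header, then a blank line
    $out .= "Request method: $method\n";
    $out .= "Bytes received on STDIN: " . length($body) . "\n";
    return $out;
}

# The server describes the request to the script in environment variables.
my $method = $ENV{REQUEST_METHOD} || 'GET';
my $length = $ENV{CONTENT_LENGTH} || 0;

# Read exactly CONTENT_LENGTH bytes from STDIN (nothing if it is unset or zero).
my $body = '';
read(STDIN, $body, $length) if $length > 0;

# Whatever the script prints to STDOUT is what the server returns to the browser.
print build_response($method, $body);
```

Because the script only uses STDIN, STDOUT, and environment variables, it can also be tried from a terminal without a web server by setting the two variables by hand and redirecting a small file into it.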

CGI Scripts

CGI programs are often called 'scripts' because the earliest versions were written as Unix shell scripts. Today, CGI scripts may be written in C, Perl, Visual Basic and other high-level languages: any language that can run on the server and follow the rules of CGI.

The specific choice of CGI language may depend on your server. For example, scripts on a Unix-based server will likely be written in Perl, whereas scripts on a Microsoft Windows NT server will likely be written in Visual Basic.

Depending on your server, CGI scripts may be required to exist in a particular directory. For example, Apache servers on Unix are commonly configured to require CGI scripts to reside in a directory named cgi-bin. It is, however, possible to call a CGI script that resides on a different server.

Perl

The Practical Extraction and Report Language (Perl) was created by Larry Wall to facilitate the creation of reports from large data files, such as server logs. Perl was originally written for the Unix operating system, but was later ported to other operating systems. One factor in its popularity is that most implementations of Perl are free.

Perl overcomes some deficiencies of C and Unix shell scripts. For example, Perl can perform many of the same tasks as C, but with less programming effort. Perl was designed to permit operating system commands to be executed from within Perl, which contributed to its popularity as a scripting language. Perl is interpreted rather than compiled, which provides portability across systems, provided that the host system has a Perl interpreter.

Example of a CGI Script Written in Perl

The following example will walk you through the creation, installation and testing of a very simple CGI script written in Perl. This example assumes use of the Penn State Web server www.personal.psu.edu.

First, you will need the location of the Perl interpreter on your server. It is given on the first line of the script. For the www.personal.psu.edu server this line is:

    #!/usr/local/bin/perl

Next, you need your script. Following is a very simple Perl script.
All it does is display the message "Hello World!"

    print "Content-type: text/html\n\n";
    print "<HTML><HEAD>";
    print "<TITLE>Test Script </TITLE>";
    print "</HEAD><BODY>";
    print "Hello World!";
    print "</BODY></HTML>";

[Screenshot of this output]

Next, put the script in a file and give it the name hello.pl. Note that Perl source files conventionally have the file name extension .pl.

Next, use FTP to upload the file to your server space on ftp.personal.psu.edu. Be sure to use ASCII transfer mode.

Use chmod 755 to turn on execute status for the file. (This can be done using WS_FTP or by issuing the Unix command chmod 755 hello.pl in a terminal session.)

Now you may test the script by launching your web browser and typing in the script URL. In my case this would be:

    http://scripts.cac.psu.edu/staff/g/m/gms/cgi-bin/hello.pl

The result of running the script is that the browser displays a page with the message "Hello World!".

To review, here is the complete script:

    #!/usr/local/bin/perl
    #
    # This is a very simple Perl script
    #
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD>";
    print "<TITLE>Test Script </TITLE>";
    print "</HEAD><BODY>";
    print "Hello World!";
    print "</BODY></HTML>";

Here is what is sent to the browser after the CGI script runs (line breaks added for readability):

    <HTML>
    <HEAD>
    <TITLE>Test Script </TITLE>
    </HEAD>
    <BODY>Hello World!</BODY>
    </HTML>

Some Notes:

  - The # character precedes comments in the script.
  - The print command sends the string that follows it to STDOUT.
  - The string Content-type: text/html\n\n tells the browser that HTML code is coming:
      o Content-type: text/html specifies the MIME type of the Perl output.
      o \n\n terminates the line and sends a blank line, which marks the end of the HTTP headers.
  - The ; character ends each statement.

Calling the CGI Script From a Web Page

CGI scripts are typically called from a web page. The following page illustrates calling a CGI script via a hyperlink to the script.

    <html>
    <head>
    <title>test</title>
    </head>
    <body>
    <h1>Test of script</h1>
    <p>Click <a href="http://scripts.cac.psu.edu/gms/cgi-bin/hello.pl">here</a> to test script.</p>
    </body>
    </html>

Review of the Process

This is a good time to review the process.

  1. Create a Perl script.
  2. Use FTP to publish the Perl script in the cgi-bin directory of your server. Be sure you have included the location of the Perl interpreter as the first line of your script.
  3. Set file modes so the script can be executed on the server. Actual commands will vary depending on the type of server.
  4. Test the script.
  5. Write a web page that calls the script.

Calling a Perl CGI Script from a Web Form

A common use of CGI is to process the data returned from a Web form. The script reads the form data, does any other necessary processing, and constructs a response page.

As data, the Web browser sends name/value pairs from the form. Each pair contains the name of a form element and the value entered by the user. Optional parameters may also be sent from the form. These are also sent as name/value pairs and represent state information that may be useful to the CGI script.

HTTP provides two methods for passing the data from the form to the CGI script: the POST method and the GET method. With POST, the form data is sent in the body of the request and is directed to STDIN of the CGI script. With GET, the form data is appended to the URL and reaches the script in the environment variable QUERY_STRING. The POST method is more general and more powerful, since the data is not limited by URL length and does not appear in the URL.

The method is specified as an attribute of the FORM tag. Another important FORM tag attribute is the action, which specifies the URL of the CGI script. Following is an example FORM tag with method and action attributes. (See the lesson on Forms for more information.)

    <FORM METHOD="POST" ACTION="../cgi-bin/myscript.pl">

The CGI script finds the length of the input string in the environment variable CONTENT_LENGTH, which is provided by the calling server. This length tells the script how many bytes to read from STDIN before parsing the input data stream into name/value pairs.
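
The parsing step can be sketched as follows. This is a minimal illustration rather than a recommended implementation: it assumes the form was submitted with the POST method in the browser's default URL-encoded format (pairs joined by &, names and values joined by =, spaces sent as + and other special characters as %XX), and the sub names url_decode and parse_form are hypothetical. If a name appears more than once, this sketch keeps only the last value.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode one URL-encoded name or value.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;                                # "+" stands for a space
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # %XX is one byte in hex
    return $text;
}

# Hypothetical helper: split "name=value&name=value" into a hash.
sub parse_form {
    my ($data) = @_;
    my %form;
    foreach my $pair (split /&/, $data) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;   # skip malformed pairs
        $value = '' unless defined $value;           # "name" with no "=value"
        $form{ url_decode($name) } = url_decode($value);
    }
    return %form;
}

# With POST, CONTENT_LENGTH says how many bytes of form data wait on STDIN.
my $length = $ENV{CONTENT_LENGTH} || 0;
my $data   = '';
read(STDIN, $data, $length) if $length > 0;

my %form = parse_form($data);

# Echo the pairs back as plain text so no user input is interpreted as HTML.
print "Content-type: text/plain\n\n";
foreach my $name (sort keys %form) {
    print "$name = $form{$name}\n";
}
```

In practice, an existing library such as the CGI.pm module is usually preferred for this job, since it also covers cases ignored here, such as repeated field names and file uploads.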
The CGI script then proceeds to process this data, do any other processing (such as updating a database), create an output data stream, and send the output data stream to STDOUT. The browser receives the output data stream, then interprets and displays it for the user.

Perl CGI Script Collections

It may not be necessary to write your own Perl scripts from scratch. Many Perl script collections are available through the Web. Here are a few:

  CGI scripts from the O'Reilly www.perl.com site
    http://www.perl.com/reference/query.cgi?cgi
  The CGI Collection from ITM
    http://www.itm.com/cgicollection/
  CGI Script Collections by Web Publishers
    http://www.webpub.com/tools/cgiscript.html

Important tips for installing Perl CGI scripts:

  - Be sure you know what you are doing! (Otherwise you may break something.) At the very least, know basic operating system and file system management commands.
  - Be sure to make backup copies of any files you change, just in case.
  - Beware of Trojans! It is possible to write programs that contain illicit or dangerous subroutines that are unknown to the user. These programs are called "Trojans" because they follow the strategy of the "Trojan Horse," with the dangerous program code hidden within a (typically) useful program. For this reason it is important that you only use CGI scripts obtained from reputable sites.
  - Know the location of your system's Perl interpreter, and of other important services such as the mailer daemon (if needed).
  - Be sure to set script file modes (chmod) correctly. Not all servers require this. Check with your server administrator for details.
  - At Penn State, use the scripts.cac.psu.edu server to test your script. This is a test server that can be used until you have debugged your script. After debugging, you can send a request to webmaster@psu.edu to have the script installed on your www.personal.psu.edu webspace.

PHP

An emerging scripting language is PHP (PHP: Hypertext Preprocessor). PHP is a server-side, cross-platform, HTML-embedded scripting language that incorporates the better design features of Perl, Java and C. PHP was developed with the intent of providing server-side functionality from hypertext, as opposed to being modified for that purpose after the fact. Because of this, it is possible to create powerful applications and database interactions with few lines of PHP code.

Most importantly, PHP is developed under the Open Source (http://www.opensource.org/) software concept. This means that PHP processors are completely free. Versions are available for download from the PHP home page for Apache, Windows NT, Netscape and other web servers. Extension libraries and ODBC drivers can also be downloaded for connectivity to various database management systems.

Aside from being free, the ease of use of PHP and the continued growth of its popularity may result in its becoming one of the best web programming environments.

Running a PHP Script

PHP scripts are inserted directly into an HTML document, much like JavaScript. Unlike JavaScript, however, the code runs on the server before the page is sent to the browser. The script code is contained between <?php and ?> tags. The source file is given a filename extension that the Web server is configured to associate with PHP, such as .php3, to inform the server that the file contains PHP code and that the PHP processor should be invoked.

Following is the simple Hello World program written in PHP:

    <html>
    <head>
    <title>PHP Hello World</title>
    </head>
    <body>
    <?php
    echo "Hello World<P>";
    ?>
    </body>
    </html>

Comments may also be placed inside PHP scripts by enclosing them between the strings /* and */.

CGI, Perl, and PHP Summary

This lesson is designed to give you some basic information about CGI, Perl, and PHP. When you are finished with the lesson, you should be able to do the following:

  - Discuss the concept of dynamic web pages.
  - Describe how CGI, Perl and PHP are used for server-side programming.
  - Locate and utilize online Perl and PHP libraries and resources.

A short summary of these topics is listed below. If you do not understand these things, you should review the lesson at least once. If you are still having difficulty, you should consider other sources of information that complement this lesson, such as textbooks, tutors, and instructors.

Common Gateway Interface (CGI)

The Common Gateway Interface (CGI) is a standardized specification for interfacing external applications with web servers via the Hypertext Transfer Protocol (HTTP). This is one way of providing interactivity on the web.

Perl

The Practical Extraction and Report Language (Perl) is widely used to develop CGI applications.

PHP

An emerging scripting language is PHP (PHP: Hypertext Preprocessor). PHP is a server-side, cross-platform, HTML-embedded scripting language that incorporates the better design features of Perl, Java and C. PHP was developed with the intent of providing server-side functionality from hypertext, as opposed to being modified for that purpose after the fact. Because of this, it is possible to create powerful applications and database interactions with few lines of PHP code.