PHP: Hypertext Preprocessor
Brian Wright
October 18, 2004

Introduction:

PHP, a recursive acronym for PHP: Hypertext Preprocessor, is a web development language that is syntactically similar to C. The resemblance goes beyond syntax: PHP itself is implemented in C. The main objective of the language is to allow for the easy creation of dynamic web pages. PHP has evolved into a technology that is used by approximately thirty-three percent of the web domains on the Internet (Lerdorf). Given this prominence, the goal of this paper is to provide the historical and technical background of PHP, along with the advantages and disadvantages of using it.

Historical Background – Prior to PHP:

HTTP is a stateless protocol, meaning that it does not retain any data from one page request to the next. Web pages that include forms, however, usually need to pass data along. During the early to mid 1990s, form submissions were interpreted by parsing the data out of the URL, and the technology most commonly used for this was C (Hudson). For instance, take a hypothetical web page that includes two text-input boxes, firstName and lastName. Suppose that a user types “John” and “Smith” into the boxes and then clicks submit. By default, the generated URL would end with: firstName=John&lastName=Smith. Programmers would use C to break this string apart into the necessary variables and values, a process known as parsing. Doing so in C was, according to Paul Hudson, the author of Practical PHP Programming, “very clunky to program – a simple parsing program could easily take up fifty lines or more [of code].”

The problems involved in parsing form data in C led web programmers to adopt Perl (Practical Extraction and Report Language). Perl allowed even novice programmers to successfully parse HTML form data.
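
For contrast with the fifty-line C parsers described above, the following is a minimal sketch of the same parsing step in present-day PHP. It uses the built-in parse_str() function; the variable names $queryString and $fields are chosen here purely for illustration and do not come from the paper's sources.

```php
<?php
// The query string from the hypothetical form submission described above.
$queryString = 'firstName=John&lastName=Smith';

// parse_str() splits the string on "&" and "=", URL-decodes each piece,
// and stores the name/value pairs in the array given as the second argument.
parse_str($queryString, $fields);

echo $fields['firstName'], ' ', $fields['lastName'], "\n"; // prints "John Smith"
```

When a page is served through a web server, PHP performs this step automatically and exposes the result in the $_GET array, so most scripts never need to call parse_str() themselves.
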
One of the largest drawbacks to using Perl was that, in order to output HTML, the HTML had to be embedded in the Perl program. In other words, Perl is Perl-centric.

Historical Background – PHP:

The first version of PHP was created by Rasmus Lerdorf in 1995 and was called Personal Home Page/Forms Interpreter (PHP/FI). According to Lerdorf’s article Do You PHP?, the reason he created PHP/FI was “purely a case of needing a tool to solve real-world Web-related problems.” PHP is an HTML-centric language: PHP code is embedded inside the HTML (Hudson). PHP/FI was “mostly a C library of common C functions that [Lerdorf] had written” (Lerdorf, Do You PHP?). An early example of PHP/FI code is this:

    <!--getenv HTTP_USER_AGENT-->
    <!--ifsubstr $exec_result Mozilla-->
    Hey, you are using Netscape!<p>
    <!--endif-->

This code checks the web browser’s user agent and prints a message if the agent string contains “Mozilla”.

Lerdorf was not satisfied with the original parser and decided to write a new one. Instead of enclosing commands in <!-- cmd -->, he changed the syntax to <? cmd >, the forerunner of the syntax still used today. After combining the original home page tools with the new syntax, he released PHP/FI Version 2 in mid 1997.

Later in 1997, two Israeli students, Zeev Suraski and Andi Gutmans, contacted Lerdorf and asked whether he would be interested in working with a new code engine that they had created. Lerdorf decided to pursue this option. Together with a few others who were submitting PHP/FI patches at the time, Lerdorf, Suraski, and Gutmans released PHP Version 3 in June 1998. Lerdorf wrote, “[t]his was probably the most crucial moment during the development of PHP. The project would have died at that point if it had remained a one-man effort” (Lerdorf, Do You PHP?).

PHP Version 3 included several improvements over PHP/FI Version 2.
Version 3 gained limited object-oriented support. The language was also much stricter syntactically, though because it was still relatively new, few users were affected by this. The parser became considerably faster. Version 3 also introduced language extensibility, which allowed programmers to write their own modules for the language, and it supported a wide range of databases such as Oracle, Sybase, Solid, MySQL, and mSQL, as well as ODBC data sources (Hudson).

After PHP Version 3, Suraski and Gutmans went to work on a new version of PHP. Their objective was “abstracting the layer between the language and the web server, adding a thread-safety mechanism, and adding a more advanced, two-stage parse/execute tag-parsing system” (Lerdorf, Programming PHP). Combining parts of their first names, Zeev and Andi, they named this creation the Zend Engine. After the Zend Engine was completed, the work of many other developers was folded in, and the combined effort resulted in the release of PHP Version 4 on May 22, 2000.

PHP Version 5 was released in July 2004. This version substantially expanded the object-oriented features of the language. “Developers are now able to declare how their objects may be used, which makes it easier for one developer to work with another's code” (Hudson). The new version also brought improved error handling in the form of Java-style try and catch blocks. With the increased usage of XML in the programming world, SimpleXML was added to PHP Version 5’s extensions, which allows for easy interaction with XML documents.

Technical Background:

PHP has evolved from a web form parser into a very functional web development tool. PHP began as only a server-side scripting language that allowed for dynamic web content. Now PHP can also be used for command-line scripting, much like Perl, awk, or the UNIX shell. Using PHP-GTK, it is also possible to create cross-platform GUI applications in PHP, much like Java applications.
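
As a rough illustration of command-line use, the sketch below can be saved to a file and run with the php command rather than through a web server. It also exercises two of the PHP Version 5 additions described earlier: declaring how an object may be used (the private and public keywords) and try/catch error handling. The Visitor class and the sample names are hypothetical and exist only for this example; the short array syntax assumes a current PHP release.

```php
<?php
// Hypothetical class for illustration; it is not taken from the paper's sources.
class Visitor
{
    // "private" means only Visitor's own methods may read or change the name.
    private $name;

    public function __construct($name)
    {
        if ($name === '') {
            // Signal the problem to the caller instead of printing a warning.
            throw new InvalidArgumentException('name must not be empty');
        }
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}

$greetings = [];
foreach (['John', ''] as $name) {
    try {
        $visitor = new Visitor($name);
        $greetings[] = 'Hello, ' . $visitor->getName();
    } catch (InvalidArgumentException $e) {
        // Control jumps here when the constructor throws.
        $greetings[] = 'Skipped: ' . $e->getMessage();
    }
}

echo implode("\n", $greetings), "\n";
```

Run from a shell, the script prints one greeting for the valid name and one "Skipped" line for the empty one, with no web server or browser involved.
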
Although PHP is capable of all of the aforementioned tasks, it is still best suited for server-side scripting, and this paper deals only with that aspect of PHP.

PHP can be used on all major operating systems, including Windows, Linux, UNIX, FreeBSD, Solaris, and Mac OS X. To install PHP, one simply needs to download the archive corresponding to one’s operating system from www.php.net. After following the installation instructions, the user needs to configure the web server to load the PHP libraries when dealing with PHP code. Once this task is complete, all that remains is to write PHP web pages.

Since PHP is a server-side scripting language, the code is handled completely on the server. When the server receives a request for a PHP file, it lets the PHP libraries handle the request and generate the output. The server receives this output, and the output is what is sent to the client. Apart from the page’s file extension, the client has no indication that the page was produced by PHP. Because all processing happens on the server, clients do not need to change anything about their browsers. From the client’s perspective, PHP pages are handled just as HTML pages are: no plug-ins are required and no downloads are necessary to view them.

PHP uses a combination of compiling and interpreting. In his online book, Paul Hudson writes:

    Behind the scenes, PHP compiles your script down to a series of instructions […] whenever it is accessed. These instructions are then executed one by one until the script terminates. This is different to conventional compiled languages such as C++ where the code is compiled down to native executable code then that executable is run from then on - PHP re-compiles your script each time it is requested (Practical PHP Programming, 2004).

This provides quick error feedback: because the script is compiled on every request, a script containing a syntax error fails to compile, and the error is reported as soon as the page is requested.
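
The server-side processing described above can be sketched as follows. The helper render_page() is a hypothetical name introduced only for this illustration; it uses PHP's output buffering functions to capture the text that a web server would otherwise send to the browser, so that the result can be inspected from the command line.

```php
<?php
// Hypothetical helper: runs the embedded PHP and returns the text that
// would be sent to the client.
function render_page($visitor)
{
    ob_start(); // capture output instead of sending it immediately
    ?>
<html>
<body>
<p>Hello, <?php echo htmlspecialchars($visitor); ?>!</p>
</body>
</html>
<?php
    return ob_get_clean(); // stop capturing and hand back the generated HTML
}

echo render_page('John');
```

The returned text is ordinary HTML containing the line <p>Hello, John!</p>; none of the PHP tags survive, which is why the browser needs no plug-in and cannot see the script itself.
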

Main Body – Why PHP:

With the plethora of web development tools available today, it is easy to ask: “why use PHP?” There are many reasons. One is that PHP is a server-side scripting language, so clients do not need to install any plug-ins or reconfigure their web browsers. This also keeps PHP source code secure: the client/server relationship is maintained, and the client sees only what the server chooses to send.

PHP is not only free but also open-source. Its source code is freely available, which allows many side development projects to improve the language and deliver changes at a rapid rate. According to Leon Atkinson, the author of Core PHP Programming: Using PHP to Build Dynamic Web Sites, “PHP is better [than other web development tools]. It is faster to code and faster to execute. The same PHP code runs unaltered on different Web servers and different operating systems.”

PHP is supported commercially. The creators of the Zend Engine, Suraski and Gutmans, sell supporting products as well as technical support for the language. The language is also supported by its community: many popular web communities are available to help with programming problems. Examples include www.phpbuilder.com, www.phphub.com, and www.phpindex.com.

Because of PHP’s popularity, a great deal of ready-made functionality exists for web-based tasks. “There are thousands of pre-written functions to perform a wide variety of helpful tasks - handling databases of all sorts (MySQL, Oracle, MS SQL, PostgreSQL, and many others), file uploads, FTP, email, graphical interfaces, generating Flash movies, and more” (Hudson).

With the quick feedback mechanism described in the prior section, it is argued that very little time is spent debugging PHP code. And, with the language constantly expanding, performance and production efficiency are steadily improving.

Main Body – PHP Alternatives:

There are numerous alternatives to PHP. The most notable, and the oldest, is Perl. One of the major advantages of Perl is its flexibility. Many modules have already been written for it and are easy to obtain; however, Perl has a signature “messy” code appearance. The only obvious reason to consider Perl instead of PHP is to make use of these existing modules, and as PHP grows more popular, more and more PHP modules are becoming available.

Another alternative to PHP is Microsoft’s Active Server Pages (ASP). ASP is included with Microsoft’s web servers by default. It is essentially a single-platform technology: it was designed for Microsoft’s servers, although it has been implemented on other platforms. On Internet Information Services (IIS) it runs very fast, but on other servers, if it runs at all, it is notably slow.

A third alternative to PHP is JavaServer Pages (JSP). A drawback of JSP is its slow speed; nevertheless, it does have some advantages. One is that JSP is heavily promoted by Sun, a prominent company, and therefore has strong backing in large enterprises. Since JSP uses Java, it is a smart choice for companies that already run back-end processes in Java and want to keep development in a single language. It is commonly argued that JSP scales better than PHP, but according to Paul Hudson’s Practical PHP Programming, PHP “scales perfectly” as long as the developer uses the same design patterns that would be used for JSP.

Main Body – When Not to Use PHP:

According to Paul Hudson, PHP is not always the recommended choice for web applications. One case is when speed is absolutely critical. Even though PHP’s speed improves considerably with each new version, C or C++ is still recommended when superior performance is needed. One way around this is to write the performance-critical code in C as an extension to PHP (Hudson).

Another situation in which PHP may not be the right choice is that of an already completed system. If the system is already rather large and written in a different language, it is often cumbersome to move to PHP. In this situation it is recommended that the system remain in its original language.

Main Body – Writing PHP Code:

Writing PHP code is not a difficult task. One way to approach it is to rename existing web pages to have the extension “.php” and add code to perform the desired tasks. An example would be to insert this line into the body of an existing HTML document:

    <?php echo 'hello, world' ?>

This prints “hello, world” in the body. As stated in the technical section, since PHP is a server-side scripting language, the client’s web browser never sees this code. Instead, the page source from the client’s perspective simply contains: hello, world.

PHP statements end with a semicolon, as in C, C++, Java, and several other languages. The statement above is an exception to the rule because the closing ?> tag implies a semicolon after the last statement in a block.

Variables in PHP all have the prefix “$”, as in “$name”. When creating a variable, declaring the data type is not needed. Variables can be local or static, and constants are also available. To create an array, the programmer uses the array() construct. An example is:

    $junk = array(232, 121231, 123211);

This code creates an array named $junk that contains three elements: 232, 121231, and 123211. There are also predefined environment variables that are easily referenced, such as HTTP_USER_AGENT, which stores the browser agent of the client browsing the page.

Connecting to a database can be done in only two lines!
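
Before turning to the database example, the variable, array, and environment features described above can be sketched in one short script. This is only an illustration: the function next_visit_number(), the constant GREETING, and the fallback value 'unknown' are hypothetical names and values, not part of the paper's sources. The sketch assumes a current PHP release, in which the browser agent is read through the $_SERVER array.

```php
<?php
$name = 'John';                      // no type declaration is needed
$junk = array(232, 121231, 123211);  // the array from the text

define('GREETING', 'hello');         // a constant: no "$" prefix, cannot be reassigned

// Hypothetical helper showing a static variable, which keeps its value
// between calls to the function.
function next_visit_number()
{
    static $count = 0;
    $count++;
    return $count;
}

next_visit_number();
$visits = next_visit_number();       // second call, so $visits is 2

// Under a web server the browser agent is available as
// $_SERVER['HTTP_USER_AGENT']; on the command line the key is absent,
// so a fallback value is used here.
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'unknown';

echo GREETING, ', ', $name, "\n";
echo 'Elements in $junk: ', count($junk), "\n";
echo 'Second element: ', $junk[1], "\n";
echo "Visit number: $visits\n";
echo "Browser agent: $agent\n";
```

Array indexes start at zero, so $junk[1] is the second element, 121231.
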
The following example was taken from Practical PHP Programming:

    <?php
    mysql_connect("localhost", "phpuser", "alm65z");
    mysql_select_db("phpdb");
    $result = mysql_query("SELECT * FROM usertable");
    $numrows = mysql_num_rows($result);
    print "There are $numrows people in usertable\n";
    ?>

It connects to a MySQL database named “phpdb” and selects all the records from the table named “usertable.” Using the mysql_num_rows function, it obtains the number of records in the result and then prints this number to the screen. Notice that there are no include or import statements at the top of this code; everything it uses is included in the PHP package.

Main Body – PHP IDEs:

There are several PHP integrated development environments (IDEs). It is often stated that most PHP developers use humble text editors such as Notepad. There are downsides to doing so: a plain editor offers no assistance, which limits developers to the coding techniques they already know. For a PHP developer to reach their full potential, it is recommended to use one of the many powerful editors, such as Zend Studio. These IDEs include many features that reduce the manual work the programmer has to do.

Main Body – How One Can Contribute to the PHP Development:

Since PHP is an open-source project, a user can download its source code, modify it, and submit the changes back to the developers for everyone to use. Even those who cannot program are encouraged to contribute by reporting any bugs that they run into. Programmers are encouraged to contact the PHP developers, because there is plenty of room for contributions (Hudson). The only way for PHP, or any open-source project for that matter, to improve is for people to contribute to it, regardless of how small or large the contributions are.

Summary:

PHP’s history may span only ten years, but its evolution has been dramatic.
Its usage has increased enormously since its creation. A graph taken from Lerdorf’s article on Oracle’s website shows the history of PHP usage on the web:

[Figure: history of PHP usage on the web, from Lerdorf, Do You PHP?]

PHP has evolved from a simple scripting language into a highly versatile and capable web development tool. PHP is an easy-to-learn and easy-to-use language that meets the needs of the ever-changing Internet. It allows the basic home user to build a highly complex website, for free, in an understandable programming language. It also allows corporations to expand their web presence because of PHP’s enormous list of capabilities. The body of C code behind PHP keeps growing thanks to its open-source availability on the Internet, and because of this, PHP will most likely always be a competitor in the web development language race. Its pros outweigh its cons. PHP is therefore the ultimate web development tool of the day.

Works Cited:

Atkinson, Leon. Core PHP Programming: Using PHP to Build Dynamic Web Sites. Prentice Hall, 2000.
Hudson, Paul. Practical PHP Programming. 2004 <http://www.hudzilla.org/php>.
Lerdorf, Rasmus. Do You PHP? 2004 <http://www.oracle.com/technology/pub/articles/php_experts/rasmus_php.html>.
Lerdorf, Rasmus and Kevin Tatroe. Programming PHP. Sebastopol: O’Reilly & Associates, Inc, 2002.