Perl
Kurtis Hage
CSC 415: Programming Languages

History of Perl

Perl 1.0 was released in December 1987. It was developed by Larry Wall as an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information, as stated in the original Perl manual. Perl is implemented in C and borrows from awk, sh, and sed, all languages that preceded it. (Lapworth)

Perl has a few nicknames: it is called “the Swiss Army chainsaw of programming languages” (Sheppard) and “the duct tape that holds the Internet together.” (Leonard) It gained these nicknames because of its high degree of adaptability and its rise in popularity as the web was being developed. Most of what was done on the early web involved text, and because Perl was designed, at least in part, to handle text processing, it was better suited to the work than the alternative languages available at the time. (Sheppard)

The Perl motto is 'There's More Than One Way To Do It', which emphasizes both the flexibility of Perl and the fact that Perl is about getting the job done. Its strengths are its writability, which stems from its English-like vocabulary and the design decision to be easy for humans to write rather than easy for computers to understand (Cozens), and its free, open-source distribution model, which allows for a high degree of portability across computers and platforms. (Cozens)

Perl's most common use in today's environment is CGI programming (Common Gateway Interface, not to be confused with computer-generated imagery), meaning that Perl is used to create web pages dynamically. Perl is the powerhouse behind popular sites such as Slashdot and Amazon.

Overview of the Language

Names and Scopes

Perl scripts start with an implied declaration of package main, where “main” is the name of the namespace to which the block, subroutine, eval, or file belongs.
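As a small illustration of the implied package main, the following sketch prints the current namespace before and after an explicit package declaration, using Perl's built-in __PACKAGE__ token. The package name Farm and the variable names are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# No package statement has been seen yet, so this code is in the default namespace.
my $default_ns = __PACKAGE__;
print "current package: $default_ns\n";

# "Farm" is a hypothetical package name used only for illustration.
package Farm;
my $farm_ns = __PACKAGE__;
print "current package: $farm_ns\n";
```

The first line printed names main and the second names Farm, even though main was never declared explicitly.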
Package variables can be declared in two ways. The first is to use the fully package-qualified name in the code, which is allowed regardless of the namespace in which the code currently resides:

    # can use variable without declaring it with 'my'
    $some_package::answer = 42;
    warn "The value is '$some_package::answer'\n";

The second is the “our” declaration, which creates the variable in the current namespace. (London)

    package Hogs;
    our $speak = 'oink';
    warn "Hogs::speak is '$Hogs::speak'";
    > Hogs::speak is 'oink' ...

Package declarations can be made inside code blocks, but upon leaving the block, the package namespace reverts to the previous, enclosing namespace. After the revert, variables declared under the inner package can still be accessed, but only by their fully package-qualified names. (London) This is because the package declaration itself is lexically scoped. Lexical scope refers to anything that is visible or has an effect only within a certain boundary of the source text or code. (London) Lexically scoped variables have three main features:

1) Lexical variables do not belong to any package namespace.
2) Lexical variables are only directly accessible from the point where they are declared to the end of the nearest enclosing block, subroutine, eval, or file.

    no warnings;
    no strict;
    {
        my $speak = 'moo';
    }
    warn "speak is '$speak'\n";
    > speak is ''

   In the above example, the lexical variable $speak exists only inside the braces.
3) Lexical variables are subject to garbage collection at the end of the scope. (London)

Perl creates lexical variables with the “my” keyword and, as noted above, they go out of scope at the end of the code block. Package variables, by contrast, are permanent and never go out of scope.

Data Types

Perl uses three basic storage types for its variables. (London) These variables require special characters at the beginning of their names to specify their type.
(Sebesta) These special characters are $, which represents scalars, @, which represents arrays, and %, which represents hashes. (London)

Scalars

Scalar ($) variables can store strings, numbers, references, and filehandles. (London) No declaration of the intended type is necessary; Perl automatically converts the scalar to whatever type an operation requires. Because Perl was designed in part to handle text processing, it has a variety of very useful functions that operate on strings, which are a subset of scalars. These functions are a large part of what makes the language so powerful; a few of them are shown here.

    my $fullname = 'mud' . "bath";   # concatenation
    my $line = '-' x 80;             # repetition; $line is eighty hyphens
    my $len = length($line);         # length; $len is 80
    # qw() is covered next

Perl string literals must be placed in single or double quotes, though a list of string literals can be created using the qw() operator. (London)

    my ($first, $last) = qw( John Doe );
    print "first is '$first'\n";   # prints: first is 'John'
    print "last is '$last'\n";     # prints: last is 'Doe'

Perl scalars allow different formats for numeric literals through special prefixes: numbers preceded by “0b” are binary, by “0x” hexadecimal, and by “0” octal (London). Otherwise, numeric literals are read as decimal numbers; floating-point and scientific notation are also allowed.

    my $base_address = 01234567;   # octal
    my $high_address = 0xfa94;     # hexadecimal
    my $low_address = 0b100101;    # binary

As with its string functions, Perl has a number of powerful built-in numeric functions; it doesn't need to load a module for (some) trigonometric functions, which take their argument in radians, or for exponentiation, square roots, or natural logarithms.

    my $radian = 45 * ( 3.14 / 180 );   # 45 degrees in radians
    my $sine_radians = sin($radian);    # trig sine, returns approximately 0.707
    my $seven_squared = 7 ** 2;           # exponentiation, returns 49
    my $square_root_of_123 = sqrt(123);   # sqrt, returns 11.0905

Arrays

Arrays are preceded with the “@” symbol and store scalars that are accessed via an integer index. (London) Array indexes start at 0 (as opposed to 1). Negative indexes are also allowed; they count backwards from the end of the array. Perl arrays are one-dimensional only. (London) The notation differs from most other languages: while the array is declared with the @ symbol, accessing a single element must use the scalar $ symbol, with the index placed inside square brackets ( [] ).

    my @numbers = qw ( zero one two three );
    my $string = $numbers[2];
    warn $string;   # prints two, the third element (indexed at 2)

The length of a Perl array is not pre-defined; instead, Perl allocates whatever space is needed. The number of elements in an array can be obtained with the scalar() function.

    my @phonetic = qw ( alpha bravo charlie delta );
    my $quantity = scalar(@phonetic);
    warn $quantity;   # prints 4, the number of elements in the array

This can also be done by assigning the entire array to a scalar variable.

    my @phonetic = qw ( alpha bravo charlie );
    my $quant = @phonetic;
    warn $quant;   # prints 3

Arrays can be treated like the “Stack” data structure: push() and pop() operate on the end (highest index) of an array. Also included are shift() and unshift(), which operate at the beginning of the array (index 0). Other operations include sort(), which by default sorts the elements in string order, and reverse(), which returns the elements in reverse order. (London)

Hashes

Hashes are preceded with the % symbol and store scalars that are accessed via a string index called a key. Like arrays, hashes are one-dimensional only. Any even number of scalars can be assigned to a hash; Perl extracts them in pairs.
(London) The odd-numbered items (first, third, and so on) are treated as keys, and the even-numbered items as the corresponding values. Below is a typical hash lookup:

    my %info = qw ( name John age 42 );
    my $data = $info{name};
    warn $data;   # prints John

Keys do not have to be pre-declared (as they were in the above example); if a key does not exist at the time of an assignment, the key is created and given the assigned value.

    my %inventory;
    $inventory{apples} = 42;

Hashes also have many built-in functions, though they are not covered here.

Expressions and Assignment Statements

Perl integrates regular expressions into the syntax of the core language itself. Its regular expressions allow you to search for and transform text in innumerable ways with ease and speed. (Cozens) With them, a program can search strings for particular patterns, find out what matched the patterns, and substitute new strings for the matched text. This is accomplished through three separate operators: match (m//), substitute (s///), and transliterate (tr///). Perl allows almost any delimiter in these operators. (London) There are two ways to “bind” these operators to a string expression: 1) =~, which is true when the pattern matches the string expression, and 2) !~, which is true when the pattern does NOT match the string expression. (London) In the match and substitute operators, the delimiters behave like double-quote marks, so variables inside the pattern are interpolated.

Statement-Level Control Structures

Ordinary statements are executed in sequential order in Perl, but control flow statements allow you to alter the order of execution. Many of these control flow structures are shared with other languages, such as if..elsif..else blocks and while blocks. Perl also has an unless block, which is simply an if with the condition negated: its body runs when the condition is false. An unless block can be extended by elsif and else blocks as well. Iteration over a list is typically written as a “foreach” loop in Perl, although the C-style “for” loop is also available. (London) Perl also has labels, which are optional names for associated control structures.
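The control structures described above might be combined as in the following minimal sketch. The sample data and the label name READING are hypothetical; next, which is explained just below, skips to the next iteration of the labeled loop.

```perl
use strict;
use warnings;

my @readings = (3, -1, 8, 0, 12);   # hypothetical sample data
my @report;

# foreach visits each element of the list in turn; READING is the loop's label.
READING: foreach my $value (@readings) {
    # unless runs its block when the condition is false;
    # it can be extended with elsif and else, just like if.
    unless ($value > 0) {
        push @report, "$value: skipped";
        next READING;   # go on to the next iteration of the labeled loop
    }
    elsif ($value > 10) {
        push @report, "$value: large";
    }
    else {
        push @report, "$value: ok";
    }
}

print "$_\n" for @report;
```

Values that are zero or negative are reported as skipped, values above 10 as large, and everything else as ok.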
Labels are used together with the loop-control commands next, last, and redo. The last command exits the entire control structure, skipping any continue block if it exists. The next command skips the rest of the block but executes anything in a continue block if one is there. Whether or not a continue block exists, execution then begins at the next iteration of the control structure (assuming it is a loop). The redo command skips the rest of the block, again without executing anything in a continue block, and resumes at the start of the block without evaluating the conditional again. (London)

Subroutines

Perl allows named and anonymous subroutines. The symbol for a subroutine is &, though unlike the symbols for scalars, arrays, and hashes, the & is not mandatory. Subroutines follow a syntax of “sub NAME BLOCK”, where NAME is any valid identifier and BLOCK is a code block enclosed by curly braces. The name of a subroutine is placed in the current package namespace and can be accessed with just NAME if you are in the correct package, or with the fully package-qualified name if you are outside the package, all with or without the optional &.

    package MyArea;
    sub Ping { print "ping\n"; }
    Ping;
    &Ping;
    MyArea::Ping;
    &MyArea::Ping;
    # all of these print "ping" and are correct calls

The contents of each block (its lexical variables, for instance) are invisible to anything outside the code block. (London) Any values passed to a subroutine are put in the parentheses at the subroutine call, where all arguments are flattened into a single list of scalars; the subroutine will NOT know whether those scalars came from scalars, arrays, or hashes. (London) Inside the subroutine, the arguments are accessed via a special array called @_. If the arguments are fixed and known, they can be extracted by assigning @_ to a list of scalars with meaningful names.
    sub compare {
        my ($left, $right) = @_;
        return $left <=> $right;
    }

The @_ array is really a list of aliases for the original arguments that were passed in; one must be careful, because assigning a value to an element of @_ will change the value of the original variable that was passed in. One must also be careful when calling a subroutine with the & symbol and no parentheses; the current @_ array gets implicitly passed to the subroutine being called. (London) Subroutines can return single values or lists of values; the return value can be explicit, or it is implied to be the value of the last expression evaluated in the subroutine.

An interesting built-in function for use inside subroutines is caller(). This function returns a list of information about how and where the subroutine was called. This information includes the package namespace at the time of the call, the filename where the call was made, the line inside the file where the call was made, the name of the subroutine that was called, whether or not the subroutine had explicit arguments passed in, and some other information. (London)

Support for Object-Oriented Programming

Perl supports both procedural and object-oriented programming. Object orientation is accomplished by the creation of classes; a class in Perl is a package, usually stored in a module. The SUPER:: prefix allows a child class (a class that uses another class as its base) to call a method that belongs to its parent class. It has a built-in limitation, though: the search up the class inheritance hierarchy starts at the class in which the calling method was written, not at the class of the object. (London) Complicated trees of class extension, then, can cause SUPER to not work as expected.

Created objects are subject to garbage collection through something called object destruction. This occurs when all references to a specific object have gone away, for example by going out of lexical scope. When this happens, Perl calls the DESTROY method on the object if one exists; if not, it simply removes the data.
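The following minimal sketch shows a child class calling its parent's method through SUPER:: and a DESTROY method running when the last reference to an object goes out of scope. The class names Animal and Dog, the @log array, and the method names are all hypothetical. Inheritance is set up here by assigning to @ISA directly, which is a choice made only because both classes live in a single file.

```perl
use strict;
use warnings;

our @log;   # records each event so the order of calls is easy to inspect

package Animal;   # hypothetical base class
sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name} }, $class;
}
sub speak {
    my ($self) = @_;
    push @main::log, "$self->{name} makes a sound";
}

package Dog;      # hypothetical child class
our @ISA = ('Animal');   # Dog inherits from Animal
sub speak {
    my ($self) = @_;
    $self->SUPER::speak();   # run the parent's version first
    push @main::log, "$self->{name} barks";
}
sub DESTROY {
    my ($self) = @_;
    push @main::log, "$self->{name} destroyed";
}

package main;
{
    my $dog = Dog->new(name => 'Rex');   # new() is inherited from Animal
    $dog->speak;
}   # the only reference to the object is gone here, so DESTROY runs

print "$_\n" for @log;
```

The log shows the parent's speak running before the child's addition, followed by the destruction message once the enclosing block ends.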
Object destruction has a limitation similar to that of SUPER: Perl calls only the first DESTROY method it finds in the hierarchical tree, so DESTROY methods further up the tree are not called automatically. (London)

Modules are loaded with statements of the form “use MODULENAME;”, and “use lib 'directory';” adds a directory to the list of places Perl searches for modules. “use base” can also be used to have classes inherit from a base class that has common methods. Classes can override the methods of their base class by simply defining a method with the same name as the method being overridden. (London)

Concurrency

Perl supports concurrency...sort of. The language itself wasn't designed with concurrency in mind; instead, Perl handles concurrency through a threading module. Each thread runs as a separate virtual machine, and only data that is explicitly marked as shared can be shared. (Wegrzanowski) In this way, Perl can run concurrently; this model allows access to all Perl libraries and avoids some problematic errors by not allowing everything to be shared by default. This type of concurrency works very well for some problems but very poorly for large-scale concurrent programs, for which specialty languages may be more suitable. (Wegrzanowski)

Exception and Event Handling

Perl exception handling typically occurs in eval blocks paired with die, which are functionally similar to try..catch blocks. Exceptions typically carry three important pieces of information with them: 1) the type of exception, as determined by the class of the exception object, 2) where the exception occurred, and 3) context information, which includes the error message and other state information. (Shankar) Shankar states: “Object-oriented exception handling allows you to separate error-handling code from the normal code. As a result, the code is less complex, more readable and, at times, more efficient. The code is more efficient because the normal execution path doesn't have to check for errors.
As a result, valuable CPU cycles are saved.” There is also a module hosted on CPAN, the Comprehensive Perl Archive Network, called Error.pm, which attempts to mimic the exception handling of other object-oriented languages like Java and C++. This module provides a procedural interface for exception handling and a base class for other exception classes. (Shankar)

Other Issues

Perl does not have a boolean type. Instead, the interpreter treats scalar strings and numbers as true or false based on a set of rules:

1) The strings "" and "0" are FALSE; any other string or stringification is TRUE.
2) The number 0 is FALSE; any other number is TRUE.
3) All references are TRUE.
4) undef is FALSE.

Any value that is NOT a scalar is evaluated in scalar context and then treated as a string or number. The scalar context of an array is its size; of note is that an array holding a single undefined value still has a size of 1 and is therefore true. Subroutines return a scalar or a list depending on the context in which they are called, and in order to explicitly return false, an empty return statement is used. (London)

A potential issue that Perl has is its autovivification. Without “use strict” (and “use warnings”, in some cases), and without declaring a variable with “my”, variables are created on first use and initialized to be undefined (undef in Perl), which evaluates to false. (London)

A last potential issue is Perl's garbage collection, which was briefly discussed in the “Names and Scopes” section of this paper. When Perl frees up memory, the memory is not returned to the system, but is instead kept for new lexically scoped variables that may be declared later in the program (after the garbage has been collected). This means that running Perl programs will never get smaller; any memory that is allocated remains under Perl's jurisdiction.
(London)

Evaluation

Readability

Perl ranges from very simple to read to notoriously difficult, as is abundantly clear from the obfuscated Perl competitions that have taken place in the past. Much of what makes Perl so powerful also contributes to its (potential) obfuscation; the operators that accept arbitrary delimiters and the regular expressions, in particular, can quickly become a jumbled mess. Another feature of the language that can make Perl difficult to read is the flexibility of its subroutine calls; the Subroutines section shows four separate ways to call the same function, even though it may not be obvious that each line produces the same output. With good programming habits, though, Perl's typical block structure makes for relatively easy reading, barring perhaps the special characters assigned to variable types for those previously unfamiliar with the language.

Writability

For the average user, Perl's writability is almost on par with languages like C++ or Java, again with the possible exception of how variable types are declared. Much of what can potentially harm Perl's readability can also greatly enhance its writability, as seen in Examples 1 and 2 of the Code Appendix, which are incredibly powerful programs written in 2 and 7 lines of code, respectively.

Reliability

Perl code tends to have a long lifespan, as is evident from its continued use in today's programming environment. As noted in the introduction, Perl is the “duct tape that holds the Internet together.” (Leonard) With the continued open-source development of Perl 5 and its iterations, and the ongoing development of the not yet fully released Perl 6, Perl appears likely to remain maintained and reliable.

Cost

Perl's initial cost is zero; the full source code and documentation are free to copy, compile, print, and give away. Programs written in Perl incur no royalties or restrictions on distribution.
Perl is released under the terms of the Artistic License or the GNU General Public License; under the Artistic License, any modifications must be clearly flagged and the original modules distributed along with the modified versions. Training a new Perl user, though, may not be free. While its open-source nature, with plenty of free books and example code, can certainly allow a programmer to be self-taught, courses on Perl can range anywhere from $120 to upwards of $5000 for “boot camp” type courses. Perl is available for most operating systems, particularly Unix and its variants, meaning hardware costs are typically kept to a minimum.

Conclusion

Perl is an incredibly powerful language that can be as simple or as complex as you want it to be, and it is still widely in use today. It continues to be used, updated, and improved, and should remain relevant to programmers for some time to come.

Code Appendix

Example 1: Two-Line RSA Algorithm (Beck, just two lines of Perl)

    print pack"C*",split/\D+/,`echo "16iII*o\U@{$/=$z;[(pop,pop,unpack"H*",<>
    )]}\EsMsKsN0[lN*1lK[d2%Sa2/d0<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<J]dsJxp"|dc`

Example 2: Seven-Line DVD Copy Protection Descrambler (Winstein and Horowitz)

    $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map{$_%16or$t^=$c^=(
    $m=(11,10,116,100,11,122,20,100)[$_/16%8])&110;$t^=(72,@z=(64,72,$a^=12*($_%16
    -2?0:$m&17)),$b^=$_%64?12:0,@z)[$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h
    =5;$_=unxb24,join"",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$
    d=unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d>>12^$d>>4^
    $d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q*8^$q<<6))<<9,$_=$t[$_]^
    (($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]}print+x"C*",@a}';s/x/pack+/g;eval

Bibliography

Lapworth, Leo. "The Perl Programming Language." The Perl Programming Language. N.p., 2011. Web. 26 Sep. 2011. <http://www.perl.org/>.

Wiersdorf, Scott. Perl.com. Tom Christiansen, September 21st, 2011. Web. 26 Sep. 2011. <http://www.perl.com/>.

Shankar, Arun Udaya. "Object Oriented Exception Handling in Perl." Perl.com. Tom Christiansen, n.d. Web. 26 Oct. 2011.

Wolfgang, G. Perl Beginners' Site. N.p., July 22, 2011. Web. 26 Sep. 2011. <http://perl-begin.org/>.

Sebesta, Robert W. Concepts of Programming Languages. 9th. Addison-Wesley.

McCullagh, Declan. "Descramble that DVD in 7 Lines." Wired. 07 03 2001: n. page. Web. 11 Oct. 2011. <http://www.wired.com/culture/lifestyle/news/2001/03/42259>.

Sheppard, Doug. "Beginner's Introduction to Perl." Perl.com. Tom Christiansen, 16 10 2000. Web. 12 Oct. 2011.

Leonard, Andrew. "The joy of Perl." Salon. 08 01 2011: n. page. Web. 12 Oct. 2011.

Cozens, Simon. Beginning Perl. Wrox Press, 2000. Print. <http://www.perl.org/books/beginning-perl/>.

Schwartz, Randal, Tom Phoenix, and Brian Foy. Learning Perl. 5th Edition. O'Reilly, Print.

London, Gregg. Impatient Perl. Version 9. 2010. eBook.

Wegrzanowski, Tomasz. "Why Perl Is a Great Language for Concurrent Programming." TAW'S Blog. 04, October, 2009. Web. 26 Oct. 2011. <http://t-a-w.blogspot.com/2006/10/why-perl-is-great-language-for.html>.