AE6382

Perl
OBJECTIVES


What is Perl
Concepts





Variables
Control Structures
Modules
Objects
Windows

Perl

Practical Extraction and Report Language
Originally designed as a text-processing and "glue" language
Perl is a scripting language
  Each invocation of a Perl script compiles the code, then executes it
Uses a C-like syntax
Has object-oriented programming features
Highly portable between operating systems

Running Perl

On Unix
  Typically set line 1 to #!/usr/bin/perl (or wherever Perl is installed)
On Windows
  Set the file extension to
    .pl for standard Perl
    .pls for PerlScript (ActiveX scripting engine)
Run from the perl command line

Variables

Perl is not a strongly typed language; the contents of a variable are converted as necessary
The first character of a variable name indicates the type of the variable
    $name
The name part of a variable can also be enclosed in { }
    ${name}
    @{$reference_to_array}

Type        Prefix  Meaning
Scalar      $       Name of an individual value
Array       @       List of values, keyed by index
Hash        %       List of values, keyed by string
Subroutine  &       Callable Perl code
Typeglob    *       Everything

Variables - Scalar

A scalar represents a single value
  Integer
  Floating point
  String
  Reference
The data held by the variable is converted as necessary
Scalar names start with a $
    $name
As an lvalue
    $name = "george burdell";

Variables - Arrays

An array is an ordered list of scalars
Arrays are indexed by number, starting at 0
Negative indexes count backwards from the end of the array
The indexing operator is [ ]
An array name starts with @
To refer to the full array (or a slice)
    @names
    @names[1,3,5]       slice
    @names[2 .. 6]      slice
A single element of an array starts with $
    $names[4]
    $names[$value]

Variables - Arrays

As an lvalue
    $names[4] = 345;
    @names = (1,2,3,4,5);
    @names = 1 .. 5;
    $last_value = $names[-1];
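
The array forms above can be tried together in a short script. This is a minimal sketch; the contents of @names are made-up values chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical data, not from the slides.
my @names = qw(ann bob cy dee eve fay gus);

my $first  = $names[0];         # single element: note the $ prefix
my $last   = $names[-1];        # negative index counts from the end
my @odd    = @names[1, 3, 5];   # slice: note the @ prefix
my @middle = @names[2 .. 4];    # slice over a range of indexes

$names[4] = 'ed';               # an element used as an lvalue

print "first=$first last=$last\n";
print "odd=@odd middle=@middle\n";
```

The slices are copies, so @middle keeps its original contents even though $names[4] is changed afterwards.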

Variables - Hashes

A hash, or associative array, is an unordered collection of scalars
Hashes are indexed by strings
The indexing operator is { }
A hash name starts with %
To refer to the entire hash
    %months
A single element of a hash starts with $
    $months{'Mar'}
    $months{$some_string}
As an lvalue
    $months{'Mar'} = 'March';
    %months = ('Jan' => 'January', 'Feb' => 'February');

Variables - Namespaces

Two types of namespace
  Global
  Lexical
Global variables are kept in symbol tables that are named and accessible
  Are created in the context of a package (the default package is main)
  Can be referenced from another package using $package::variable
Lexical variables are created and exist only in the context of a Perl block (normally a region enclosed in { })

Literals – Numeric

Numeric literals can take several formats
    12345           integer
    12345.67        floating point
    1.23e06         scientific
    1_234_567       underscores for legibility
    0123            octal
    0xffff          hexadecimal
    0b101010        binary
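
A quick way to check how each literal format is read is to assign it and print the decimal value. The variable names below are illustrative only.

```perl
use strict;
use warnings;

my $octal   = 0123;        # leading 0 means octal
my $hex     = 0xffff;      # leading 0x means hexadecimal
my $binary  = 0b101010;    # leading 0b means binary
my $grouped = 1_234_567;   # underscores are ignored by the parser
my $sci     = 1.23e06;     # scientific notation

print "$octal $hex $binary $grouped $sci\n";
```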

Literals - String

There are several ways to quote a string
Substitution of variables in a string is known as interpolation
    print "The value is $value\n";
    print 'The value is ',$value,"\n";
Interpolation occurs for variables and backslash escapes

Usual     General    Meaning                 Interpolate
' '       q/ /       Literal string          No
" "       qq/ /      Literal string          Yes
` `       qx/ /      Command execution       Yes
( )       qw/ /      Word list               No
/ /       m/ /       Pattern match           Yes
s/ / /    s/ / /     Pattern substitution    Yes
y/ / /    tr/ / /    Character translation   No

Literals - String

Special additions to the character set
Backslash escape characters
    \n          newline
    \r          carriage return
    \t          tab
    \033        character represented by octal 033
    \cX         Control-X
    \x{263a}    Unicode character
    \\          backslash
Translation escapes
    \u          force next character to uppercase
    \l          force next character to lowercase
    \U          force all following characters to uppercase
    \L          force all following characters to lowercase
    \E          end \U or \L switch

Literals - String

There is flexibility in choosing quotes
    $string = qq[This method allows inclusion of ' and "];
    $string = qq{This method allows inclusion of ' and "};
    $string = qq/This method allows inclusion of ' and "/;
The following executes a command using the OS shell and returns its output as a string
    $result = qx(ls);
Word list form does not require tedious quoting
    @months = qw(January February March April);

Interpolation

Interpolation is the process of expanding a variable in a string literal written in the " " form
Scalars are resolved in place; numeric values are converted to characters
Arrays are interpolated by joining all the elements of the array, separated by the value of the special $" variable
    $" = '~';
    @months = qw(jan feb mar apr may jun);
    $string = "The months are: @months";
      The months are: jan~feb~mar~apr~may~jun
An entire hash is not interpolated ("%months" is left as is); individual hash elements such as $months{Mar} are
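
The interpolation rules can be checked with a few assignments. This is a small sketch with made-up data; local is used so that the change to $" does not leak out of the block.

```perl
use strict;
use warnings;

my $count  = 3;
my @months = qw(jan feb mar);
my %days   = (jan => 31, feb => 28);

my $scalar_text = "count is $count";    # scalar resolved in place

my $array_text;
{
    local $" = '~';                     # list separator, restored after the block
    $array_text = "months: @months";
}

my $element_text = "jan has $days{jan} days";   # a hash element interpolates
my $hash_text    = "hash: %days";               # a whole hash does not

print "$scalar_text\n$array_text\n$element_text\n$hash_text\n";
```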

List Values

A list consists of values enclosed in ( ) and separated by commas
    @array = (1,3,5,7,9,11);
    $value = (1,3,5,7,9,11);
In list context the first example above loads the array with the values
In scalar context each value is evaluated and the last value is returned, so $value == 11 above
There is an important difference between a list and an array: when an array is evaluated in scalar context it returns its length, $length == 6
    $length = @array;
    $length = scalar @array;
    $length = @array + 0;
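
The list versus array distinction is easy to confirm. In this sketch the 'void' warning category is switched off because the comma operator deliberately discards all values but the last; the variable $first is an extra illustration that is not on the slide.

```perl
use strict;
use warnings;
no warnings 'void';     # the comma operator throws away 1..9 on purpose here

my @array  = (1, 3, 5, 7, 9, 11);
my $value  = (1, 3, 5, 7, 9, 11);   # list in scalar context: last value
my $length = @array;                # array in scalar context: element count
my ($first) = @array;               # list assignment: first element

print "value=$value length=$length first=$first\n";
```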

List Values

List interpolation
    (@array1, @array2, 1)
  Each element above is evaluated and inserted into the list that is generated
  There are no lists of lists
Lists can be indexed using [ ]
    ($day,$month,$year) = (localtime())[3..5];
Lists may be used as lvalues (see above)

Context

Every operation in Perl is evaluated in one of two contexts: scalar or list
Assignment to a scalar lvalue causes the right side to be evaluated in scalar context
Assignment to an array, hash, or slice lvalue causes the right side to be evaluated in list context
Assignment to a list on the left causes the right side to be evaluated in list context
Use the scalar function to force evaluation in scalar context
Some operations return different values depending on the context in which they are evaluated
    $matched = m/([^,]+)/;      # scalar context: true if the pattern matched
    @fields = m/([^,]+)/g;      # list context with /g: every comma-separated field
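
The effect of context on a pattern match can be sketched as follows. The input string is hypothetical, and the "= () =" idiom for counting matches is an addition not shown on the slide.

```perl
use strict;
use warnings;

my $line = "alpha,beta,gamma";   # made-up input

my $matched = $line =~ m/([^,]+)/;          # scalar context: true or false
my @fields  = $line =~ m/([^,]+)/g;         # list context: all captured fields
my $count   = () = $line =~ m/([^,]+)/g;    # list assignment evaluated in scalar
                                            # context yields the number of matches

print "matched=$matched fields=@fields count=$count\n";
```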

Arrays and Context

An array referenced using @ operates in list context
An array element operates in scalar context
When a list is assigned to an array each value is inserted into the next element
Special forms of arrays
    $length = scalar @array;            (scalar not required here)
    $last_index = $#array;
    scalar @array == $#array + 1        (an identity)

Hashes and Context

A hash referenced in the % form operates in list context
A hash element operates in scalar context
When a list is assigned to a hash each pair of values in the list is taken as a key-value pair
    %colors = ('red',0xff0000,'green',0x00ff00,'blue',0x0000ff);
  There is a special syntax available for this
    %colors = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);
Use the keys function to generate a list of keys for a hash
To find the number of keys in a particular hash
    $number_of_keys = scalar keys %hash;
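
The two initialisation forms and the keys function can be combined in a short sketch. The extra 'white' entry is illustrative only.

```perl
use strict;
use warnings;

# Both forms build the same hash; => also quotes the bare word on its left.
my %colors = (red => 0xff0000, green => 0x00ff00, blue => 0x0000ff);
my %same   = ('red', 0xff0000, 'green', 0x00ff00, 'blue', 0x0000ff);

$colors{white} = 0xffffff;                  # a hash element as an lvalue

my $number_of_keys = scalar keys %colors;   # keys in scalar context: a count
my @sorted         = sort keys %colors;     # keys in list context: the keys

foreach my $name (@sorted) {
    printf "%-6s %06x\n", $name, $colors{$name};
}
```

Sorting the keys gives a repeatable order; the hash itself does not keep one.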

Filehandles and Input

A filehandle refers to a file
Filehandles are, by convention, all upper case
  STDIN, STDOUT, STDERR are predefined
Use the <> operator to read from a filehandle
    $line = <STDIN>;        read one line from STDIN
    @lines = <STDIN>;       read all lines from STDIN
Read and print entire STDIN
    while(<>) { print; }
  Reads each line into the special variable $_, which is used implicitly by both <> and print
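
The scalar and list behaviour of <> can be tried without typing input by opening a filehandle on an in-memory string. This is a sketch: the lexical handle $in and the sample text are assumptions made for the example, where the slides use STDIN.

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";    # stands in for real input

# Open a read handle on the string instead of a file on disk.
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my $first = <$in>;    # scalar context: one line
my @rest  = <$in>;    # list context: all remaining lines
close($in);

chomp($first, @rest);              # strip the trailing newlines
print "first=$first rest=@rest\n";
```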

Operators

Operator precedence (highest first)
Operators can be overloaded when using objects

    Terms and list operators (leftward)
    ->
    ++ --
    **
    ! ~ \ unary + unary -
    =~ !~
    * / % x
    + - .
    << >>
    Named unary operators
    < > <= >= lt gt le ge
    == != <=> eq ne cmp
    &
    | ^
    &&
    ||
    .. ...
    ? : (ternary)
    = += -= *= (etc)
    , =>
    List operators (rightward)
    not
    and
    or xor
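
A few rows of the precedence table matter in everyday code, in particular the difference between || and or. The following sketch uses made-up values to show where the parser puts the implicit parentheses.

```perl
use strict;
use warnings;

my $zero = 0;
my $five = 5;

# || binds tighter than =, so this is $tight = ($zero || $five)
my $tight = $zero || $five;

# or binds looser than =, so this is ($loose = $zero) or print ...
my $loose;
$loose = $zero or print "or was evaluated after the assignment\n";

my $power = -2 ** 2;      # ** binds tighter than unary minus: -(2 ** 2)
my $sum   = 2 + 3 * 4;    # * binds tighter than +

print "tight=$tight loose=$loose power=$power sum=$sum\n";
```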

Simple Statements

A simple statement is an expression that is evaluated
A simple statement is terminated with a ;
A simple statement may be followed by a modifier
    if expr
    unless expr
    while expr
    until expr
    foreach list
Examples
    print "Value is $i\n" if $i > 5;
    print "i=", $i--, "\n" while $i != 0;

Compound Statements

Expressions containing blocks
A block is normally contained in { }
if statement
    if (expr) block
    if (expr) block else block
    if (expr) block elsif (expr) block
    if (expr) block elsif (expr) block else block
unless statement is similar

    $i = $max;
    if ($i == $max) {
        print "The max is five\n";
        exit;
    } else {
        $i++;
    }

    $i = $max;
    unless ($i == $max) {
        $i++;
    } else {
        print "The max is five\n";
        exit;
    }

Compound Statements

while statement
    label while (expr) block
    label while (expr) block continue block
until statement
    label until (expr) block
    label until (expr) block continue block
The continue block is executed before starting the next iteration of the loop

    while (<STDIN>) {
        chomp;
        @fields = split(/:/);
        print "Field 1: $fields[0]\n";
    }
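
Compound Statements

for loop
    label for (expr1 ; expr2 ; expr3) block
  expr1   initialization, evaluated once before the loop starts
  expr2   condition tested before each iteration; the loop ends when it is false
  expr3   statement executed after each iteration

    for (my $i = 0; $i < 10; $i++) {
        print "i=$i\n";
    }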

Compound Statements

foreach statement
    label foreach (list) block
    label foreach var (list) block
    label foreach var (list) block continue block
  Loops over each entry in the list
  When var is omitted, $_ is used

    foreach my $key (sort keys %people) {
        print "Key: $key, Value=$people{$key}\n";
    }
    foreach my $entry (@items) {
        print "Item: $entry\n";
    }

Compound Statements

Labeled block
    label block
    label block continue block
  Equivalent to a single iteration loop
  Can be used with last, next, and redo

Loop Control

These statements can be used with blocks
The optional label further refines their effect
last label
  Exit the loop (block)
  The continue block is not executed
next label
  Skip the rest of this iteration and start the next iteration
  Execute the continue block before the next iteration begins
redo label
  Restart the loop with the current iteration parameters
  The continue block is not executed
The label parameter enables multi-level block control
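
Multi-level control with a label, and the interaction with a continue block, can be sketched as below. The data and the label name ROW are hypothetical.

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);   # made-up data
my @seen;
my $continues = 0;

ROW: foreach my $row (@rows) {
    foreach my $cell (@$row) {
        next ROW if $cell < 0;     # abandon this row, go on to the next one
        last ROW if $cell == 8;    # leave both loops at once
        push @seen, $cell;
    }
} continue {
    $continues++;   # runs after a normal pass and after next, not after last
}

print "seen=@seen continues=$continues\n";
```

Without the label, next and last would only affect the inner loop.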

Declarations

Subroutine declaration is a global declaration
A subroutine must be declared before it can be called without parentheses
    sub count;
Can define a subroutine at declaration
    sub count { … }
Pragmas are directives to the Perl compiler
    use strict;
    use integer;
    use warnings;
    use English;

Declarations

Variable declarations
  Lexically scoped declarations
      my $var;
      my ($var1, $var2);
      my $value = function();
  Lexically scoped global declarations
      our $var;
  Dynamically scoped global declarations
      local $var;
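
The difference between my, our, and local shows up when a called subroutine reads the variable. The subroutine names here are hypothetical and exist only for this sketch.

```perl
use strict;
use warnings;

our $level = 'global';              # package variable, also $main::level

sub show_level { return $level; }   # always reads the package variable

sub with_local {
    local $level = 'localized';     # temporary value, visible to callees
    return show_level();
}

sub with_my {
    my $level = 'lexical';          # a new variable that show_level cannot see
    return show_level();
}

my $via_local = with_local();
my $via_my    = with_my();
my $after     = show_level();       # the local value has been restored

print "$via_local $via_my $after\n";
```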

Pattern Matching

Regular Expressions
  Rule-based pattern matching mechanism
Simple pattern
    m/Class/
Complex pattern
    m/AE[0-9]+[A-Z]/

Regular Expressions

Meta-characters
    \ | ( ) [ { ^ $ * + ? .
  Have special meanings inside patterns
  \ is the escape character, used to match one of the meta-characters as itself in a pattern, e.g., \\ or \.
Quantifiers
    * + ? {3} {2,5}
  REs normally match maximal text
  Add ? to the end of a quantifier to match minimal text
Character classes
    [ ] or [^ ]
Grouping
    ( )

Regular Expressions

The pattern matching operators
    m//       match
    s///      substitute
    tr///     transliterate
Binding operators
    =~
    !~
  Bind a string to a pattern operator
Examples
    $string =~ m/AE[0-9]{4}[A-Z]/;
    $string =~ s/old/new/;
    $string =~ s(old)(new);
    $string =~ s'old'new';
  Can use arbitrary delimiters

Regular Expressions

Maximal and minimal matches
    "exasperate" =~ m/e(.*)e/
  Captures "xasperat" in $1
    "exasperate" =~ m/e(.*?)e/
  Captures "xasp" in $1
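
The maximal and minimal captures can be verified directly, since a match in list context returns its captured groups. The substitution lines are an extra illustration of s/// with and without the /g flag, using a made-up string.

```perl
use strict;
use warnings;

my $word = "exasperate";

my ($maximal) = $word =~ m/e(.*)e/;     # greedy: runs to the last e
my ($minimal) = $word =~ m/e(.*?)e/;    # non-greedy: stops at the next e

# Copy a string and edit the copy in one statement.
(my $once  = "old hat, old shoes") =~ s/old/new/;     # first match only
(my $every = "old hat, old shoes") =~ s/old/new/g;    # every match

print "$maximal $minimal\n$once\n$every\n";
```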

Functions

There are many built-in functions
  Can be used with or without parentheses around arguments
    With parentheses it will be parsed as a function
    Without parentheses it will be parsed as a prefix operator, preferred
    Use the -w switch on the #!/usr/bin/perl -w line to flag when it is being parsed as a function
    Example
      print 1+2*4;      # prints 9
      print (1+2)*4;    # prints 3
  For details see the perl documentation or the Camel book
Users may define functions
    sub name { code };
  User functions are called with parentheses around arguments

Functions - Arguments

Arguments are passed to functions in the built-in array @_
The elements of @_ can be accessed by any of several techniques

    sub func {
        my $arg1 = $_[0];
        my $arg2 = $_[1];
    }

    sub func {
        my $arg1 = shift;
        my $arg2 = shift;
    }

    sub func {
        my $arg1 = shift;
        my @rest = @_;
    }

    sub func {
        my $nargs = @_;
        my $arg1 = shift;
        my @rest = @_;
    }

shift is a built-in function that removes and returns the first element of an array, shifting the remaining elements down
shift operates in a manner similar to a stack pop, but at the front of the array
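
The argument-handling techniques can be combined in one runnable sketch. The subroutines summarize and add are hypothetical helpers written for this example; the list-assignment form in add is a common idiom that is not shown on the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: labels the sum of the remaining arguments.
sub summarize {
    my $nargs = @_;        # array in scalar context: number of arguments
    my $label = shift;     # inside a sub, shift works on @_ by default
    my @rest  = @_;        # whatever is left after the shift
    my $total = 0;
    $total += $_ foreach @rest;
    return "$label: $total from $nargs arguments";
}

# Hypothetical helper: unpack @_ with a list assignment.
sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

my $report = summarize('sum', 1, 2, 3);
my $five   = add(2, 3);
print "$report, add gives $five\n";
```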

eval Function

The eval function is normally used to trap runtime errors
The eval function has two forms
  eval block
    Will execute the code enclosed by the block
  eval expr
    Compiles and executes the code in expr
    The code in expr can be dynamically created
The special variable $@ contains the result of execution
  $@ is set to the error message if there is an error
  $@ is set to an empty string if there is no error
    eval { … }          # execute block of code
    if ($@) { … }       # handle error
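
Both forms of eval can be exercised in a few lines. The subroutine safe_divide is a hypothetical name used only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: trap a division error with the block form of eval.
sub safe_divide {
    my ($num, $den) = @_;
    my $result = eval { $num / $den };   # dies inside the block when $den is 0
    if ($@) {
        return "error: $@";              # $@ holds the error message
    }
    return $result;
}

my $ok  = safe_divide(10, 4);
my $bad = safe_divide(1, 0);

# String form: the code is compiled and run at this point in the program.
my $computed = eval "2 + 3 * 4";

print "ok=$ok computed=$computed\n$bad";
```

The block form is checked for syntax when the program is compiled, while the string form is compiled each time it runs, so the block form is usually the one to reach for when trapping errors.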

References

A reference in Perl is a scalar that contains a pointer to some data in memory
Perl has two types: symbolic and hard
  Symbolic: scalar contains the name of another variable
  Hard: scalar contains the address of the memory
Use the $ prefix to dereference a reference
  $ref is the scalar that contains the reference
    $$ref       # dereference
    ${$ref}     # dereference
Hard references are generally more common

References

The \ (backslash) operator is used to create a hard reference
$ref = \$sample
  In this example $$ref is an alias for $sample; they both refer to the same location in memory
  Use $$ref to refer to that memory location: $$ref == $sample and ${$ref} == $sample
$ref = \@array
  In this example @$ref is an alias for @array
  To access an array element: $$ref[1] or ${$ref}[1] or $ref->[1]
  To access the array: @$ref or @{$ref}
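
That a reference and the original variable share the same storage can be confirmed by writing through the reference. The variable names below are illustrative.

```perl
use strict;
use warnings;

my $sample = 10;
my @array  = (1, 2, 3);

my $sref = \$sample;     # hard reference to a scalar
my $aref = \@array;      # hard reference to an array

$$sref = 20;             # writes to $sample
$aref->[1] = 'two';      # writes to $array[1]
push @{$aref}, 4;        # the whole array through the reference

my $kind = ref($aref);   # the type of thing referred to

print "sample=$sample array=@array kind=$kind\n";
```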

Data Structures

References are useful in accessing anonymous data structures
Anonymous array
    [ element1, element2, … , elementN ]
    $ref = [0,1,2,3,4];
    $$ref[0] or ${$ref}[0] or $ref->[0]
Anonymous hash
    { key1=>element1, key2=>element2, … , keyN=>elementN }
    $ref = { Jan=>1, Feb=>2, Mar=>3, Apr=>4 };
    $$ref{Jan} or ${$ref}{Jan} or $ref->{Jan}
The -> operator is syntactic shorthand that removes the extra $ dereference

Data Structures

Creating arbitrarily complex data structures is relatively easy using references
Create any number of anonymous structures, placing their addresses into scalars (references)
Store the resulting scalars in other structures

Arrays of Arrays

An array of arrays is how to create a multi-dimensional array in Perl
In each cell of one array save a reference to another array
There is no requirement that each secondary array be the same length

    my @array;
    for (my $i=0;$i<4;$i++) {
        my $ref;
        for (my $j=$i;$j<$i+4;$j++) {
            push @{$ref},$j;
        }
        $array[$i] = $ref;
    }
    print $array[0]->[0],"\n";

    my $array_ref;
    for (my $i=0;$i<4;$i++) {
        my $ref;
        for (my $j=$i;$j<$i+4;$j++) {
            push @{$ref},$j;
        }
        $array_ref->[$i] = $ref;
    }
    print $array_ref->[0]->[0],"\n";
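
Because each row is its own array, rows of different lengths are allowed. A minimal sketch with a made-up triangular structure:

```perl
use strict;
use warnings;

my @triangle;
for my $i (0 .. 3) {
    $triangle[$i] = [ 0 .. $i ];    # row $i is an anonymous array of $i+1 cells
}

# Length of each row: dereference the row and count in scalar context.
my @lengths = map { scalar @$_ } @triangle;

# The arrow is optional between adjacent subscripts.
my $cell = $triangle[3][2];

print "lengths=@lengths cell=$cell\n";
```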

Hash of Arrays

In each cell of a hash table save a reference to an array

    my %months = ( Jan=>[1..31],
                   Feb=>[1..28]);
    foreach my $month (keys %months) {
        print "$month: ",join(', ',@{$months{$month}}),"\n";
    }

    Jan: 1, 2, 3, 4, … 27, 28, 29, 30, 31
    Feb: 1, 2, 3, 4, … 27, 28

The order in which keys returns the months is not guaranteed

Complex Structures

Data structures can be created to any level of complexity
Can mix all types to any depth
  Arrays of hashes of hashes of arrays
  Hashes containing references to user defined functions
      &{$func_list{$member}}(…arguments…)

    sub startup {
        print "Startup\n";
    }
    sub shutdown {
        my $code = shift;
        print "Shutdown: $code\n";
    }
    %func_list = (Startup=>\&startup,
                  Shutdown=>\&shutdown);
    &{$func_list{Shutdown}}(99);
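
A hash of code references is often used as a dispatch table. The sketch below uses anonymous subroutines and a hypothetical dispatch helper that guards against unknown keys; hash keys are case sensitive, so 'shutdown' does not find 'Shutdown'.

```perl
use strict;
use warnings;

my @log;    # collects messages so the calls can be inspected afterwards

my %func_list = (
    Startup  => sub { push @log, "Startup"; return 1; },
    Shutdown => sub { my $code = shift; push @log, "Shutdown: $code"; return $code; },
);

# Hypothetical helper: look up a handler by name and call it if it exists.
sub dispatch {
    my ($name, @args) = @_;
    my $handler = $func_list{$name};
    return unless ref($handler) eq 'CODE';    # unknown name: nothing to call
    return $handler->(@args);
}

my $started = dispatch('Startup');
my $code    = dispatch('Shutdown', 99);
my $unknown = dispatch('shutdown');           # wrong case, no such key

print join("\n", @log), "\n";
```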

Packages

A package is the way to isolate code in its own namespace
This is particularly useful for re-usable code (libraries)
As generally used, the scope of a package declaration is the file in which it appears
Usually package is the first line of a file that is processed by require or use
To refer to a variable in another package use $package::variable
The default package is main: $main::variable or $::variable

Modules

The module is the basic unit of re-usable Perl code
Module files end with the .pm file extension
Modules come in two forms
  Traditional: functions and variables
  Object-Oriented: methods and properties
Modules are accessed with the use keyword
    use Module;
A module file contains a package declaration with the same name as the file
A module may export a list of functions and variables to the namespace that contains the use statement (do not export OO methods)

Modules

Module names should begin with a capital letter, and module file names end with .pm
The last statement of a module must return a true value, conventionally 1;

File Sample.pm
    package Sample;
    sub func1 {
    }
    sub func2 {
    }
    1;

Calling script
    use Sample;
    my $result = Sample::func1;

Modules

Beyond the simple form there is additional support for modules
  The Exporter module can be used to place selected symbols into the Perl code that uses the module
  There is a version checking mechanism
  There is an autoload feature

File Sample.pm
    package Sample;
    require Exporter;
    our @ISA = qw(Exporter);
    our @EXPORT = qw(func1 func2);
    sub func1 {
    }
    sub func2 {
    }
    1;

Calling script
    use Sample;
    my $result = func1;
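
Exporting can be tried in a single file by defining the package inside a BEGIN block. This is a sketch under assumptions: the module is inlined instead of living in Sample.pm, the function bodies are made up, and @EXPORT_OK with "use Exporter 'import'" is used in place of the @ISA and @EXPORT form, so that only the names the caller asks for are imported.

```perl
use strict;
use warnings;

BEGIN {
    package Sample;                  # hypothetical module, inlined for the sketch
    use Exporter 'import';           # borrow Exporter's import method
    our @EXPORT_OK = qw(func1 func2);    # exported only on request

    sub func1 { return 'one' }
    sub func2 { return 'two' }

    $INC{'Sample.pm'} = __FILE__;    # tell use that the module is already loaded
}

use Sample qw(func1);                # import func1 only

my $result = func1();                # imported name
my $other  = Sample::func2();        # fully qualified name always works

print "$result $other\n";
```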

Objects

The module forms the basis of the object-oriented features of Perl
The package name is the class name (type)
The function definitions in the module are the methods
A class may inherit methods from parent classes
A class may be sub-classed
Perl classes inherit methods, not data
An object is a reference to an instance of a class
All Perl classes are sub-classes of the UNIVERSAL class

Objects – Method Invocation

Assume a class named Sample with an instance named $instance
Invoking a class method
    Sample->class_method(…arguments…);
Invoking an instance method
    $instance->instance_method(…arguments…);
The first argument of a method invocation is hidden and is either the class name (class method) or a reference to an object (instance method)
Methods can override super class methods

Objects – Method Invocation (2)

There is an alternate invocation method using indirect objects
Looks like
    method object (list)
    method object list
    method object
This method is less common as it suffers from some syntactic ambiguity
Frequently used in calling a constructor
    $q = new CGI;
    $q = CGI->new;

Objects - Constructors

A constructor method is an ordinary method, usually named new
Constructors for sub-classable classes need to be designed carefully (Camel Book 3rd ed, p 318)
The instance properties are usually kept in an anonymous hash that is saved in the instance variable
The bless function associates the reference variable with the class

    # Constructor for class named Sample
    sub new {
        my $obj = shift;
        my $class = ref($obj) || $obj;
        my $self = { @_ };
        bless($self,$class);
        return $self;
    }

    $object = Sample->new(alpha=>1,beta=>2);

Objects - Constructors

In the previous example the instance data are stored in an anonymous hash
The ref built-in function returns the class name of the object that is referred to
Any reference can be used; hashes are common and convenient
The use fields …; pragma is useful for creating object field storage; use this with the use base …; pragma

Objects – Properties

The instance data can be referenced as hash entries when the object is hash based
    my $prop1 = $object->{alpha};
    my $prop2 = $object->{beta};
Instance data should normally be accessed using accessor methods
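
The constructor, an accessor method, and method overriding can be put together in one file. This is a sketch: the alpha accessor, the describe method, and the Sample::Child subclass are hypothetical additions to the Sample class from the slides.

```perl
use strict;
use warnings;

{
    package Sample;

    sub new {
        my $obj   = shift;
        my $class = ref($obj) || $obj;
        my $self  = { @_ };                  # instance data in an anonymous hash
        return bless($self, $class);
    }

    # Accessor: returns alpha, and sets it first when given an argument.
    sub alpha {
        my $self = shift;
        $self->{alpha} = shift if @_;
        return $self->{alpha};
    }

    sub describe {
        my $self = shift;
        return ref($self) . " alpha=" . $self->alpha;
    }
}

{
    package Sample::Child;
    our @ISA = ('Sample');                   # inherit methods from Sample

    sub describe {                           # override, then call the parent
        my $self = shift;
        return "child of " . $self->SUPER::describe();
    }
}

my $object = Sample->new(alpha => 1, beta => 2);
$object->alpha(5);
my $child = Sample::Child->new(alpha => 7);

print $object->describe, "\n", $child->describe, "\n";
```

The subclass inherits new and alpha unchanged; only the methods are inherited, and each object carries its own hash of data.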

Objects - Overloading

Perl provides a mechanism to overload operators
use overload implements this
There is a handler (method/function) associated with each operator that has been overloaded; Perl takes care of the details
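
A minimal sketch of use overload follows. The Vec2 class and its handler names are hypothetical and were chosen only to show one handler per overloaded operator.

```perl
use strict;
use warnings;

{
    package Vec2;    # hypothetical two-component vector class

    use overload
        '+'  => \&add,          # handler for $a + $b
        '==' => \&equals,       # handler for $a == $b
        '""' => \&to_string;    # handler for string conversion

    sub new {
        my ($class, $x, $y) = @_;
        return bless { x => $x, y => $y }, $class;
    }

    sub add {
        my ($l, $r) = @_;
        return Vec2->new($l->{x} + $r->{x}, $l->{y} + $r->{y});
    }

    sub equals {
        my ($l, $r) = @_;
        return $l->{x} == $r->{x} && $l->{y} == $r->{y};
    }

    sub to_string {
        my $self = shift;
        return "($self->{x}, $self->{y})";
    }
}

my $sum  = Vec2->new(1, 2) + Vec2->new(3, 4);   # calls add
my $text = "$sum";                              # calls to_string

print "$text\n";
```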

Tied Variables

In Perl the tie function associates an object with a normal Perl variable (scalar, array, hash)
For example, a file can be accessed as if it were a simple array
The store and fetch accesses to the variable are provided by methods; Perl handles the details
There are numerous modules available that create tied variables to access more complex data sources
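
The store and fetch methods can be seen in a small tied scalar. The UpperCase class is a hypothetical example; TIESCALAR, FETCH, and STORE are the method names Perl calls for a tied scalar.

```perl
use strict;
use warnings;

{
    package UpperCase;    # hypothetical class: stores everything in upper case

    sub TIESCALAR {       # called by tie
        my ($class) = @_;
        my $value = '';
        return bless \$value, $class;
    }

    sub FETCH {           # called whenever the variable is read
        my $self = shift;
        return $$self;
    }

    sub STORE {           # called whenever the variable is assigned
        my ($self, $new) = @_;
        $$self = uc $new;
    }
}

tie my $shout, 'UpperCase';
$shout = 'hello';         # goes through STORE
my $seen = $shout;        # goes through FETCH
untie $shout;

print "$seen\n";
```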

Extending Perl

There are several ways to extend Perl
  Create modules (object oriented or traditional)
  Create native code, C code, that is appended to the Perl interpreter
Hundreds of modules are available at http://www.cpan.org/
Perl is available for almost every OS
  Generally pre-compiled for Linux
  Windows version from http://www.activestate.com/
The Perl interpreter can be embedded in native code programs