Variables: Scalars ($), Arrays(@), Hash(%)

1. Perl on Windows - ActivePerl
The ActivePerl distribution (developed by ActiveState Tool Corporation) is available at http://www.activestate.com/.
We need the Perl for Win32 binary, which contains the core Perl distribution. The ActivePerl binary comes as a self-extracting executable that uses the standard Win32 InstallShield setup wizard to guide you through the
installation process. By default, Perl is installed into the directory C:\Perl\version, where version is the
current version number (e.g., 5.005).
2. The Perl Manpages – online documentation
Perl comes with lots of online documentation. Run ‘man perl’ or ‘perldoc perl’ to read the top-level
page. That page in turn directs you to more specific pages. To make life easier, the manpages have
been divided into separate sections. If you know which section you want, you can go directly there by
using, for example, ‘man perlvar’ or ‘perldoc perlvar’. A partial list of sections:
Section     Description
perlre      Regular expressions
perlfunc    Built-in functions
perlvar     Predefined variables
3. Executing Perl
testpgm.pl
#!/usr/bin/perl
use strict;
# The body of the script:
print "Hello, world!\n";



The perl interpreter (i.e., the perl executable) is usually located at /usr/bin/perl. Accordingly, the first
line is: #!/usr/bin/perl. On Unix systems, this #! line tells the system to run the /usr/bin/perl
program and pass the file to that program for execution.
strict is a pragma for strict error checking. It generates: (1) a runtime error if you use a symbolic
reference; (2) a compile-time error if you use a bareword identifier that is not a predeclared
subroutine; (3) a compile-time error if you access a variable that was not declared via my and
was not imported.
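As a rough illustration of the third check, the sketch below compiles two small snippets with a string eval, so that the compile-time error can be caught and inspected instead of stopping the script. The variable names ($declared, $undeclared, $failed, $error) are made up for this example.

```perl
use strict;
use warnings;

# A variable declared with my compiles cleanly under strict.
my $ok = eval q{ my $declared = 41; $declared + 1 };

# An undeclared variable is rejected at compile time; eval traps the error in $@.
my $failed = eval q{ $undeclared = 1; 1 };
my $error  = $@;

print "declared:   $ok\n";
print "undeclared: ", (defined $failed ? "compiled" : "rejected"), "\n";
print "message:    $error";
```

The string eval inherits the strict setting of the surrounding file, which is why the second snippet fails even though it does not say use strict itself.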
Comments within a program are indicated by #. Everything following a pound sign to the end of the
line is interpreted as a comment.
To run Perl, pass Perl the name of your script as the first parameter:
> perl testpgm.pl
Alternatively, you may write your script with -e switches on the command line:
Unix:  > perl -e 'print "Hello, world\n"'
Win32: > perl -e "print \"Hello, world\n\""
4. Variables: Scalars ($), Arrays(@), Hash(%)
- Lexical scoping (as in C): declare variables with my.
A. Scalar
Default initialization: an uninitialized scalar is undef, which behaves as 0 in numeric context, "" in string context, and false in boolean context.
Examples:
my $answer = 42;                 # an integer
my $pi = 3.14159265;             # a "real" number
my $avocados = 6.02e23;          # scientific notation
my $pet = "Camel";               # string
my $sign = "I love my $pet";     # string with interpolation (variables and backslash escapes)
my $cost = 'It costs $100';      # string without interpolation
my $h = $w;                      # assignment
my $val = $x * $y;               # expression
my $camels = "123";
print $camels + 1, "\n";         # prints 124: numeric context!
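The last example depends on context: the same scalar acts as a number or as a string depending on the operator applied to it. A small sketch, with variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $camels = "123";

my $sum    = $camels + 1;        # numeric context: the string is converted to 123
my $joined = $camels . 1;        # string context: the number 1 becomes "1"
my $double = "$camels camels";   # double quotes interpolate $camels
my $single = '$camels camels';   # single quotes leave the text as written

print "$sum\n";      # 124
print "$joined\n";   # 1231
print "$double\n";   # 123 camels
print "$single\n";   # $camels camels
```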
Operators
+ - (addition, subtraction)
* / % ** (multiply, divide, modulus, exponentiation)
++ -- (autoincrement, autodecrement)
= += -= *= etc. (assignment operators)
. (string concatenation)
bits: << >> (left bit-shift, right bit-shift) & | ^ (bit-and, bit-or, bit-xor)
logical: or ||, and &&, not !
Comparison                  Numeric   String
Equal                       ==        eq
Not equal                   !=        ne
Less than                   <         lt
Greater than                >         gt
Less than or equal to       <=        le
Greater than or equal to    >=        ge
Examples:
$x = 12;
--$x;                # $x is now 11
$y = $x--;           # $y is 11, and $x is now 10
$str = $str . " ";   # append a space to $str
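Choosing between the numeric and the string comparison operators changes the result even for the same operands. The sketch below compares a few values both ways; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $num_equal = (10 == 10.0)     ? "true" : "false";   # the same number
my $str_equal = ("10" eq "10.0") ? "true" : "false";   # two different strings
my $num_less  = (9 < 10)         ? "true" : "false";   # 9 is the smaller number
my $str_less  = ("9" lt "10")    ? "true" : "false";   # "9" sorts after "1"

print "10 == 10.0      : $num_equal\n";   # true
print "\"10\" eq \"10.0\"  : $str_equal\n";   # false
print "9 < 10          : $num_less\n";    # true
print "\"9\" lt \"10\"     : $str_less\n";    # false
```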
B. Array
- Default Initialization: empty array.
- examples:
my @home = ("couch", "chair", "table", "stove");
my($potato, $lift, $tennis, $pipe) = @home;
($alpha,$omega) = ($omega,$alpha); #switch alpha and omega values
$home[0] = "couch";
$home[1] = "chair";
$home[2] = "table";
$home[3] = "stove";
print $#home;              # prints 3 (index of the last element)
$home[++$#home] = $bath;   # extends the array by one and stores $bath in $home[4]
$length = @home;           # an array in scalar context returns its length: 5
@my_home = @array;         # copy array
@a = $b;                   # same as @a = ($b)
push(@a, $val);            # pushes $val at the end of @a
$last_val = pop(@a);       # removes the last element of @a and returns it
unshift(@a, $val);         # pushes $val at the beginning of @a
$first_val = shift(@a);    # removes the first element of @a and returns it
@b = reverse(@a);          # reversed order
@b = sort(@a);             # sorts as strings (ASCII order); for numbers use sort { $a <=> $b } @a
chomp(@a);                 # removes a trailing \n from each element
chop(@a);                  # removes the last char from each element
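A short sketch of the four queue/stack functions, followed by the difference between the default sort and a numeric sort. The arrays @queue and @numbers are invented for this example.

```perl
use strict;
use warnings;

my @queue = (2, 3);
push(@queue, 10);               # @queue is (2, 3, 10)
unshift(@queue, 1);             # @queue is (1, 2, 3, 10)
my $last  = pop(@queue);        # 10; @queue is (1, 2, 3)
my $first = shift(@queue);      # 1;  @queue is (2, 3)

my @numbers   = (10, 9, 100, 1);
my @as_text   = sort @numbers;                 # default: compared as strings
my @as_number = sort { $a <=> $b } @numbers;   # compared as numbers

print "first=$first last=$last rest=@queue\n";   # first=1 last=10 rest=2 3
print "string sort:  @as_text\n";                # 1 10 100 9
print "numeric sort: @as_number\n";              # 1 9 10 100
```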
$str = "This is a pet";
@a = split " ", $str;        # @a becomes ("This","is","a","pet")
$newstr = join(" ", @a);     # $newstr equals $str
# split may take any regular expression /…/ as its separator; join takes a plain string
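For instance, a regular expression lets split absorb irregular spacing around a separator. A minimal sketch with a made-up input string:

```perl
use strict;
use warnings;

my $str = "couch , chair,table ,  stove";

# The separator is a comma with any amount of whitespace on either side.
my @items = split /\s*,\s*/, $str;

my $count = @items;               # array in scalar context: number of elements
my $line  = join("; ", @items);   # join always takes a plain string

print "$count items: $line\n";    # 4 items: couch; chair; table; stove
```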
C. Hash
%longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",
"Wed", "Wednesday", "Thu", "Thursday", "Fri",
"Friday", "Sat", "Saturday");
$longday{"Sun"} = "Sunday";          # adding key Sun and value Sunday
delete $longday{"Sun"};              # deleting a key and its value
if (exists($longday{"Sun"})) {…}     # check if the key Sun exists
%longday = ();                       # emptying the hash (also the default initialization)
%new_hash = %longday;                # copy
@a = keys %longday;      # @a becomes ("Sun","Mon","Tue","Wed","Thu","Fri","Sat"), in no particular order
@a = values %longday;    # @a becomes ("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"), in the same order as keys
foreach $key (sort keys %longday) { print $key.":".$longday{$key}."\n"; }    # sorted by keys
foreach $key (sort { $longday{$a} cmp $longday{$b} } keys %longday) { print $key.":".$longday{$key}."\n"; }    # sorted by values
%merged = (%a, %b);      # merge; for a key present in both, the value from %b wins
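The sketch below puts two of these operations together: ordering keys by their values, and merging two hashes. The hashes %price and %extra are hypothetical data for the example.

```perl
use strict;
use warnings;

my %price = (apple => 3, kiwi => 1, mango => 2);

# Sort the keys by comparing the values they map to (numerically here).
my @by_value = sort { $price{$a} <=> $price{$b} } keys %price;

my %extra  = (kiwi => 5, plum => 4);
my %merged = (%price, %extra);     # kiwi takes the value from %extra

foreach my $key (@by_value) {
    print "$key: $price{$key}\n";  # kiwi: 1, mango: 2, apple: 3
}
print "merged kiwi: $merged{kiwi}\n";   # merged kiwi: 5
```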
5. Statements
Block - A sequence of statements. A block is delimited by { }.
Conditionals
The if and unless statements execute blocks of code depending on whether a condition
is met. These statements take the following forms:
if (expression) {block} else {block}
unless (expression) {block} else {block}
if (expression1) {block}
elsif (expression2) {block}
...
elsif (lastexpression) {block}
else {block}
while loops - The while statement repeatedly executes a block as long as its
conditional expression is true.
$i = 1;
while ($i < 10) {
    ...
    $i++;
}
while (<INFILE>) {
    print OUTFILE $_;    # no comma after the filehandle; $_ still ends with its newline
}
The while statement has an optional extra block on the end called a continue block.
This block is executed before every successive iteration of the loop, even if the
main while block is exited early by the loop control command next. However, the
continue block is not executed if the main block is exited by a last statement. The
continue block is always executed before the conditional is evaluated again.
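A small sketch of how the continue block interacts with next and last. The counter $i and the array @seen exist only for this illustration.

```perl
use strict;
use warnings;

my $i = 0;
my @seen;
while ($i < 10) {
    next if $i % 2;     # odd value: skip the rest of the body
    last if $i > 6;     # leave the loop; the continue block is not run
    push @seen, $i;
}
continue {
    $i++;               # runs after the body and after every next
}

print "seen: @seen\n";       # seen: 0 2 4 6
print "stopped at: $i\n";    # stopped at: 8
```

Because the increment lives in the continue block, next cannot skip it, which avoids an endless loop on the odd values.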
for loops - The for loop has three semicolon-separated expressions: the initialization,
the condition, and the re-initialization of the loop.
for ($i = 1; $i < 10; $i++) {
    ...
}
foreach loops - The foreach loop iterates over a list value and sets the control
variable (VAR) to be each element of the list in turn:
foreach VAR (LIST) {
    ...
}
If VAR is omitted, $_ is used.
If LIST is an actual array, you can modify each element of the array by modifying
VAR inside the loop. That's because the foreach loop index variable is an alias for
each item in the list that you're looping over.
foreach $elem (@elements) {         # multiply each element by 2
    $elem *= 2;
}
foreach $key (sort keys %hash) {    # iterate over the keys in sorted order
    print "$key => $hash{$key}\n";
}
Loop control
The last command is like the break statement in C (as used in loops): exits the
loop. The next command is like the continue statement in C: skips the rest of the
current iteration and starts the next iteration of the loop. Any block can be given
a label (by convention, in uppercase) which identifies the loop. For example:
WID: foreach $this (@ary1) {
    JET: foreach $that (@ary2) {
        next WID if $this > $that;
        $this += $that;
    }
}
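The following sketch traces next and last with a label on two small arrays. The label OUTER and the arrays @small and @large are chosen only for this example.

```perl
use strict;
use warnings;

my @small = (1, 2, 3);
my @large = (2, 4, 6);
my @pairs;

OUTER: foreach my $x (@small) {
    foreach my $y (@large) {
        next OUTER if $y > 2 * $x;   # give up on this $x, continue with the next one
        last OUTER if $x == 3;       # leave both loops at once
        push @pairs, "$x-$y";
    }
}

print "@pairs\n";   # 1-2 2-2 2-4
```

Without the label, next and last would apply to the innermost loop only.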
Example 1: union and intersection of arrays
@a = (1, 3, 5, 6, 7, 8);
@b = (2, 3, 5, 7, 9);
@union_arr = @isect_arr = ();
%union_hash = %isect_hash = ();
foreach $e (@a) { $union_hash{$e} = 1 }
foreach $e (@b) {
if ($union_hash{$e} ) { $isect_hash{$e} = 1 }
$union_hash{$e} = 1;
}
@union_arr = keys %union_hash;
@isect_arr = keys %isect_hash;
Example 2: Find common keys in hashes
my @common = ();
foreach $c (keys %hash1) {
    if (exists($hash2{$c})) {
        push(@common, $c);
    }
}
6. Subroutines:
sub NAME BLOCK      # a declaration and a definition
NAME(LIST);         # calling the subroutine directly
Arguments are passed in @_, whose elements are aliases to the caller's values (pass by reference).
$bestday = max($mon,$tue,$wed,$thu,$fri);
sub max {
    my(@values) = @_;    # note: this copies the arguments; pass a reference to avoid the copy
    my($max) = shift(@values);
    foreach my $foo (@values) {
        if ($max < $foo) {
            $max = $foo;
        }
    }
    return $max;
}
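One possible way to follow the hint about references is sketched below: the subroutine receives a single array reference instead of a copy of every value. The names max_of and double_in_place are hypothetical, and the second subroutine is there only to show that the elements of @_ are aliases.

```perl
use strict;
use warnings;

# Receives a reference to an array, so the values themselves are not copied.
sub max_of {
    my ($values_ref) = @_;
    my $max = $values_ref->[0];
    foreach my $value (@$values_ref) {
        $max = $value if $value > $max;
    }
    return $max;
}

# Elements of @_ are aliases: assigning to $_[0] changes the caller's variable.
sub double_in_place {
    $_[0] *= 2;
    return;
}

my @week    = (3, 8, 5, 13, 2);
my $bestday = max_of(\@week);

my $count = 21;
double_in_place($count);

print "bestday=$bestday count=$count\n";   # bestday=13 count=42
```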
7. Command line arguments:
in @ARGV. For example:
#!/usr/bin/perl
my($param1,$param2,$param3)=@ARGV;
print "got $param2 \n";
> perl myscript.pl a b c
got b
>
8. Basic I/O
$a = <STDIN>; # read the next line
@a = <STDIN>; # all remaining lines as a list, until ctrl-D
while ($line=<STDIN>) {
chomp($line);
# other operations with $line here
}
open(FILEHANDLE,"somename"); #opens the filehandle for reading
open(OUT, ">outfile"); # opens the filehandle for writing
open(LOGFILE, ">>mylogfile"); # opens the filehandle for append
close(LOGFILE); #finished with a filehandle
example:
open (OUTFILE, ">./dir/out.txt");
open (INFILE, $ARGV[0]);
while ($line = <INFILE>) {
    chomp($line);
    print OUTFILE "saw $line here \n";
}
close(INFILE);
close(OUTFILE);
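The same read-and-write loop can also be written with the three-argument form of open, lexical filehandles, and a check on each open. This is a different style from the bareword handles used above, not a correction of them. The sketch works in a temporary directory from the core File::Temp module, and the file names in.txt and out.txt are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);   # scratch directory, removed at exit
my $in_name  = "$dir/in.txt";
my $out_name = "$dir/out.txt";

# Create a small input file to read back.
open(my $setup, '>', $in_name) or die "cannot write $in_name: $!";
print $setup "first\nsecond\n";
close($setup);

open(my $in,  '<', $in_name)  or die "cannot read $in_name: $!";
open(my $out, '>', $out_name) or die "cannot write $out_name: $!";
while (my $line = <$in>) {
    chomp($line);
    print $out "saw $line here\n";
}
close($in);
close($out);

# Read the output file back to show what was written.
open(my $check, '<', $out_name) or die "cannot read $out_name: $!";
my @result = <$check>;
close($check);
print @result;
```

Checking the return value of open matters because a failed open otherwise only shows up later as an empty read or a silent write to nowhere.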
9. Pattern Matching
while ($line = <FILE>) {
    if ($line =~ /http:/) {    # match operator //, pattern binding operator =~
        print $line;           # prints all lines from FILE that include the substring http:
    }
}
$italiano =~ s/butter/olive oil/;   # substitution operator s///
                                    # substitutes the first butter with olive oil in $italiano
$a =~ s/x//;     # delete the first x
$a =~ s/x//g;    # delete all x characters (g is the modifier for global substitution)
@arr = split /aaa/, "jaaalaaah"; # @arr=("j","l","h")
$ matches at the end of the string (or just before a final newline), ^ matches at the beginning of the string
/a$/ matches "abba" but not "abb"
/^http:/ matches "http:/../... ", but not "located in http:... "
quantifiers:
* 0 or more times
+ 1 or more times
? 0 or 1 times
{min,max} between min and max times. {min,} at least min times. {,max} at most max times (Perl 5.34 and later; use {0,max} in older versions)
{num} exactly num times
/abc*/ matches ab, abc, abcc, abccc
/abc+/ matches abc, abcc, abccc but not ab
/(abc){2,}/ matches abcabc and abcabcabcabc but not abccabc
/(abc){3}/ matches abcabcabc (and, being unanchored, any string that contains it, such as abcabcabcabc) but not abcabc
. matches any character except \n
/Frodo./ matches "Frodon" but not "Frodo\n"
/Frodo\./ matches only "Frodo." (\ removes the special meaning of a metacharacter)
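The quantifier examples above can be checked directly. The sketch below uses a small helper, verdict, which is a made-up name for this illustration, and adds an anchored variant of the {3} pattern for comparison.

```perl
use strict;
use warnings;

# Returns "match" or "no match" for a string and a compiled pattern.
sub verdict {
    my ($text, $pattern) = @_;
    return $text =~ $pattern ? "match" : "no match";
}

my $star   = verdict("ab",           qr/abc*/);         # c may occur zero times
my $plus   = verdict("ab",           qr/abc+/);         # needs at least one c
my $two_up = verdict("abccabc",      qr/(abc){2,}/);    # the two abc groups are not adjacent
my $three  = verdict("abcabcabcabc", qr/(abc){3}/);     # unanchored: finds abcabcabc inside
my $exact  = verdict("abcabcabcabc", qr/^(abc){3}$/);   # anchored: exactly three groups
my $dot    = verdict("Frodo\n",      qr/Frodo./);       # . does not match the newline

print "abc* on ab:                $star\n";     # match
print "abc+ on ab:                $plus\n";     # no match
print "(abc){2,} on abccabc:      $two_up\n";   # no match
print "(abc){3} on four groups:   $three\n";    # match
print "^(abc){3}\$ on four groups: $exact\n";    # no match
print "Frodo. on Frodo\\n:         $dot\n";      # no match
```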
metasymbols:
\t tab
\w word character [a-zA-Z_0-9]
\s whitespace [ \t\n\r\f]
\d digit [0-9]
/^\w+\s+\w+$/ matches exactly two words
/a\t/ matches a followed by a tab. /a\\t/ matches a followed by a backslash and t
alternations:
/I am (Fred|Wilma|Pebbles) Flintstone/ matches exactly one of the names
capture and clustering:
while ($line = <STDIN>) {
    if ($line =~ /^(.*):(.*)$/) {
        $hash{$1} = $2;
    }
}
s/^(\w+) (\w+)/$2 $1/ swaps the first two words
/\s(\w+)\s\1\s/ matches a word that is immediately repeated (\1 refers back to the first capture)
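A runnable sketch of the three capture patterns above, using fixed sample strings instead of STDIN. All variable names and sample data are illustrative assumptions.

```perl
use strict;
use warnings;

# Capture groups: $1 and $2 hold the text on either side of the colon.
my @lines = ("name:Camel", "home:desert", "no separator here");
my %hash;
foreach my $line (@lines) {
    if ($line =~ /^(.*):(.*)$/) {
        $hash{$1} = $2;
    }
}

# Captures in a substitution: swap the first two words of a copy.
my $words = "hello world again";
(my $swapped = $words) =~ s/^(\w+) (\w+)/$2 $1/;

# Backreference: \1 must repeat whatever the first group matched.
my $text = "this is is a test";
my ($repeated) = $text =~ /\s(\w+)\s\1\s/;

print "name=$hash{name} home=$hash{home}\n";   # name=Camel home=desert
print "$swapped\n";                            # world hello again
print "repeated word: $repeated\n";            # repeated word: is
```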
10. References and Data structures:
Creating reference:
$scalarref=\$scalar;
$arrref=\@arr;
$hashref=\%hash;
$arrref=["a","b","c","d"];        # anonymous data
$hashref={"red",1, "green",2};    # anonymous data
Dereferencing:
$val = $$scalarref;
@arr = @$arrref;
$val = $arrref->[2];
%hash = %$hashref;
$val = $hashref->{"red"};
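The sketch below shows the practical difference between working through a reference and dereferencing into a new variable. The sample data is made up for the example.

```perl
use strict;
use warnings;

my @arr  = ("a", "b", "c", "d");
my %hash = (red => 1, green => 2);

my $arrref  = \@arr;
my $hashref = \%hash;

$arrref->[2]     = "C";    # changes $arr[2]: the reference points at @arr itself
$hashref->{blue} = 3;      # adds a key to %hash

my @copy = @$arrref;       # dereferencing into a new array makes a copy
$copy[0] = "changed";      # @arr is not affected

print "@arr\n";                            # a b C d
print join(",", sort keys %hash), "\n";    # blue,green,red
print "$arr[0] $copy[0]\n";                # a changed
```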
Data structures – examples:
@arr = (1,2,{"red","flowers","blue","sky"},["a","b","c","d"]);   # creating ds
$val = $arr[2]->{"blue"};   # accessing ds: $val="sky". Same as $arr[2]{"blue"}
$arr[3] = [];               # $arr[3] now refers to a new, empty array in place of ["a","b","c","d"]
Array of arrays of all STDIN words:
my(@LoL);
while ($line = <STDIN>) {
my @tmp = split " ", $line;
push @LoL, [ @tmp ];
}
… or you may use only: push @LoL, [split " ", $line ];
foreach $i (0 .. $#LoL) {
    $row_ref = $LoL[$i];
    foreach $j (0 .. $#{$row_ref}) {
        print "element $i $j is $LoL[$i][$j] \n";
    }
}
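A self-contained variant of the same idea, reading from a fixed list of lines instead of STDIN so that it can be run as is. The sample lines and the @report array are assumptions made for this sketch.

```perl
use strict;
use warnings;

my @lines = ("This is a pet", "I love my Camel");

my @LoL;
foreach my $line (@lines) {
    push @LoL, [ split " ", $line ];    # one anonymous array of words per line
}

my @report;
foreach my $i (0 .. $#LoL) {
    foreach my $j (0 .. $#{ $LoL[$i] }) {
        push @report, "element $i $j is $LoL[$i][$j]";
    }
}

print "$_\n" for @report;
```

Each row is its own anonymous array, so rows may have different lengths; $#{ $LoL[$i] } gives the last index of row $i.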