1. Perl on Windows - ActivePerl

The ActivePerl distribution (developed by ActiveState Tool Corporation) is available at http://www.activestate.com/. We need the "Perl for Win32" binary of the core Perl distribution. The ActivePerl binary comes as a self-extracting executable that uses the standard Win32 InstallShield setup wizard to guide you through the installation process. By default, Perl is installed into the directory C:\Perl\version, where version is the current version number (e.g., 5.005).

2. The Perl Manpages - online documentation

Perl comes with extensive online documentation. Run 'man perl' or 'perldoc perl' to read the top-level page, which in turn directs you to more specific pages. To make life easier, the manpages are divided into separate sections. If you know which section you want, you can go directly to it with 'man perlvar' or 'perldoc perlvar'. A partial list of sections:

    Section     Description
    perlre      Regular expressions
    perlfunc    Builtin functions
    perlvar     Predefined variables

3. Executing Perl

testpgm.pl:

    #!/usr/bin/perl
    use strict;
    # The body of the script:
    print "Hello, world!\n";

The perl interpreter (i.e., the perl executable) is usually located in /usr/bin/perl. Accordingly, the first line is: #!/usr/bin/perl. On Unix systems, this #! line tells the shell to look for the /usr/bin/perl program and pass the rest of the file to that program for execution.

strict is a pragma for doing strict error checking:

    (1) Generates a runtime error if you use any symbolic references.
    (2) Generates a compile-time error if you use a bareword identifier that is not a predeclared subroutine.
    (3) Generates a compile-time error if you access a variable that was not declared via my or was not imported.

Comments within a program are indicated by #. Everything following a pound sign to the end of the line is interpreted as a comment.

To run Perl, pass Perl the name of your script as the first parameter:

    > perl testpgm.pl
Alternatively, you may write your script with -e switches on the command line:

    > perl -e 'print "Hello, world\n"'        # Unix, or
    > perl -e "print \"Hello, world\n\""      # Win32

4. Variables: Scalars ($), Arrays (@), Hashes (%)

- Lexical scoping (as in C): declare variables with my.

A. Scalar

- Default initialization: undef, which acts as 0 in numeric context, "" in string context, and false in logical context.
- Examples:

    my $answer = 42;                # an integer
    my $pi = 3.14159265;            # a "real" number
    my $avocados = 6.02e23;         # scientific notation
    my $pet = "Camel";              # string
    my $sign = "I love my $pet";    # string with interpolation (variables and backslash interpolation)
    my $cost = 'It costs $100';     # string without interpolation
    my $h = $w;                     # assignment
    my $val = $x * $y;              # expression
    my $camels = "123";
    print $camels + 1, "\n";        # prints 124, number context!

Operators:

    + -                  addition, subtraction
    * / % **             multiply, divide, modulus, exponentiation
    ++ --                autoincrement, autodecrement
    = += -= *= etc.      assignment operators
    .                    string concatenation
    << >>                bits: left bit-shift, right bit-shift
    & | ^                bits: bit-and, bit-or, bit-xor
    or ||, and &&, not ! logical: or, and, not

Comparison:

    Comparison                  Numeric   String
    Equal                       ==        eq
    Not equal                   !=        ne
    Less than                   <         lt
    Greater than                >         gt
    Less than or equal to       <=        le
    Greater than or equal to    >=        ge

Examples:

    $x = 12;
    --$x;                 # $x is now 11
    $y = $x--;            # $y is 11, and $x is now 10
    $str = $str . " ";    # append a space to $str

B. Array

- Default initialization: empty array.
- Examples:

    my @home = ("couch", "chair", "table", "stove");
    my ($potato, $lift, $tennis, $pipe) = @home;
    ($alpha, $omega) = ($omega, $alpha);    # swap the values of $alpha and $omega
    $home[0] = "couch";
    $home[1] = "chair";
    $home[2] = "table";
    $home[3] = "stove";
    print $#home;               # prints 3 (index of the last element)
    $home[++$#home] = $bath;    # extends the array and assigns to $home[4]
    $length = @home;            # array in scalar context returns its length: 5
    @my_home = @array;          # copy array
    @a = $b;                    # same as @a = ($b)

    push(@a, $val);             # pushes $val at the end of @a
    $last_val = pop(@a);        # removes the last element of @a and returns it
    unshift(@a, $val);          # pushes $val at the beginning of @a
    $first_val = shift(@a);     # removes the first element of @a and returns it
    @b = reverse(@a);           # reversed order
    @b = sort(@a);              # sorts as strings (ASCII order); for numbers use sort { $a <=> $b } @a
    chomp(@a);                  # removes a trailing \n from each element
    chop(@a);                   # removes the last char from each element

    $str = "This is a pet";
    @a = split " ", $str;       # @a becomes ("This","is","a","pet")
    $newstr = join(" ", @a);    # $newstr equals $str
    # split may use any regular expression /…/ as the separator; join takes a plain string

C. Hash

    %longday = ("Sun", "Sunday", "Mon", "Monday", "Tue", "Tuesday",
                "Wed", "Wednesday", "Thu", "Thursday", "Fri", "Friday",
                "Sat", "Saturday");
    $longday{"Sun"} = "Sunday";         # adding key Sun and value Sunday
    delete $longday{"Sun"};             # deleting a key and its value
    if (exists($longday{"Sun"})) {…}    # check if the key Sun exists
    %longday = ();                      # emptying the hash (also the default initialization)
    %new_hash = %longday;               # copy

    # For the full %longday above (the order of keys and values is not guaranteed):
    @a = keys %longday;      # @a holds "Sun","Mon","Tue","Wed","Thu","Fri","Sat"
    @a = values %longday;    # @a holds "Sunday","Monday","Tuesday","Wednesday",
                             #          "Thursday","Friday","Saturday"

    foreach $key (sort keys %longday) {
        print $key.":".$longday{$key}."\n";
    }    # sorted by keys

    foreach $key (sort { $longday{$a} cmp $longday{$b} } keys %longday) {
        print $key.":".$longday{$key}."\n";
    }    # sorted by values

    %merged = (%a, %b);    # for keys present in both, the value from %b wins

5. Statements

Block - A sequence of statements. A block is delimited by { }.
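
As a small illustration of the block definition above and of lexical scoping with my, the following sketch shows that a variable declared with my inside a block hides an outer variable of the same name only until the closing brace. The sub name scope_demo and the variable names are made up for this illustration.

```perl
use strict;
use warnings;

# scope_demo is a hypothetical helper: it records the value of $x
# as seen inside a bare block and then again after the block ends.
sub scope_demo {
    my $x = "outer";
    my @seen;
    {
        my $x = "inner";    # a new lexical $x, visible only inside this block
        push @seen, $x;
    }
    push @seen, $x;         # the outer $x was never changed
    return @seen;
}

print join(", ", scope_demo()), "\n";    # prints: inner, outer
```

The inner declaration does not modify the outer variable; once the block ends, the name $x refers to the outer variable again.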
Conditionals - The if and unless statements execute blocks of code depending on whether a condition is met. These statements take the following forms:

    if (expression) {block} else {block}
    unless (expression) {block} else {block}
    if (expression1) {block}
    elsif (expression2) {block}
    ...
    elsif (lastexpression) {block}
    else {block}

while loops - The while statement repeatedly executes a block as long as its conditional expression is true.

    $i = 1;
    while ($i < 10) {
        ...
        $i++;
    }

    while (<INFILE>) {
        print OUTFILE $_;    # no comma after the filehandle; $_ still ends with its newline
    }

The while statement has an optional extra block on the end called a continue block. This block is executed before every successive iteration of the loop, even if the main while block is exited early by the loop control command next. However, the continue block is not executed if the main block is exited by a last statement. The continue block is always executed before the conditional is evaluated again.

for loops - The for loop has three semicolon-separated expressions: the initialization, the condition, and the re-initialization of the loop.

    for ($i = 1; $i < 10; $i++) {
        ...
    }

foreach loops - The foreach loop iterates over a list value and sets the control variable (var) to each element of the list in turn:

    foreach var (list) {
        ...
    }

If var is omitted, $_ is used. If list is an actual array, you can modify each element of the array by modifying var inside the loop. That is because the foreach loop variable is an alias for each item in the list that you are looping over.

    foreach $elem (@elements) { $elem *= 2; }                            # multiply by 2
    foreach $key (sort keys %hash) { print "$key => $hash{$key}\n"; }    # sorting keys

Loop control - The last command is like the break statement in C (as used in loops): it exits the loop. The next command is like the continue statement in C: it skips the rest of the current iteration and starts the next iteration of the loop. Any block can be given a label (by convention, in uppercase) which identifies the loop, so that next and last can name the loop they apply to.
For example:

    WID: foreach $this (@ary1) {
        JET: foreach $that (@ary2) {
            next WID if $this > $that;
            $this += $that;
        }
    }

Example 1: union and intersection of arrays

    @a = (1, 3, 5, 6, 7, 8);
    @b = (2, 3, 5, 7, 9);
    @union_arr = @isect_arr = ();
    %union_hash = %isect_hash = ();
    foreach $e (@a) { $union_hash{$e} = 1 }
    foreach $e (@b) {
        if ($union_hash{$e}) { $isect_hash{$e} = 1 }
        $union_hash{$e} = 1;
    }
    @union_arr = keys %union_hash;
    @isect_arr = keys %isect_hash;

Example 2: find common keys in hashes

    my @common = ();
    foreach $c (keys %hash1) {
        if (exists($hash2{$c})) { push(@common, $c); }    # the braces are required
    }

6. Subroutines

    sub NAME BLOCK    # a declaration and a definition
    NAME(LIST);       # calling the subroutine directly

Arguments are passed in @_. Its elements are aliases of the caller's values, so modifying $_[0] changes the caller's variable.

    $bestday = max($mon, $tue, $wed, $thu, $fri);

    sub max {
        my (@values) = @_;    # warning: this is a copy, use references to save time
        my ($max) = shift(@values);
        foreach my $foo (@values) {
            if ($max < $foo) {
                $max = $foo;
            }
        }
        return $max;
    }

7. Command line arguments: in @ARGV. For example:

    #!/usr/bin/perl
    my ($param1, $param2, $param3) = @ARGV;
    print "got $param2 \n";

    > perl myscript.pl a b c
    got b
    >

8. Basic I/O

    $a = <STDIN>;    # read the next line
    @a = <STDIN>;    # all remaining lines as a list, until ctrl-D
    while ($line = <STDIN>) {
        chomp($line);
        # other operations with $line here
    }

    open(FILEHANDLE, "somename");     # opens the filehandle for reading
    open(OUT, ">outfile");            # opens the filehandle for writing
    open(LOGFILE, ">>mylogfile");     # opens the filehandle for append
    close(LOGFILE);                   # finished with a filehandle

Example:

    open(OUTFILE, ">./dir/out.txt") or die "cannot open output: $!";
    open(INFILE, $ARGV[0]) or die "cannot open input: $!";
    while ($line = <INFILE>) {
        chomp($line);
        print OUTFILE "saw $line here \n";
    }
    close(INFILE);
    close(OUTFILE);

9. Pattern Matching

    while ($line = <FILE>) {
        if ($line =~ /http:/)    # match operator //, pattern binding operator =~
        { print $line; }         # prints all lines from FILE that include the substring http:
    }

    $italiano =~ s/butter/olive oil/;    # substitution operator s///
                                         # replaces the first butter with olive oil in $italiano
    $a =~ s/x//;     # delete the first x
    $a =~ s/x//g;    # delete all x characters (g is the modifier for global substitution)
    @arr = split /aaa/, "jaaalaaah";     # @arr = ("j","l","h")

Anchors:

    $    match at the end of the string
    ^    match at the beginning of the string

    /a$/        matches "abba" but not "abb"
    /^http:/    matches "http:/../... ", but not "located in http:... "

Quantifiers:

    *            0 or more times
    +            1 or more times
    ?            0 or 1 times
    {min,max}    between min and max times
    {min,}       at least min times
    {,max}       at most max times (accepted since Perl 5.34; in older versions write {0,max})
    {num}        exactly num times

    /abc*/        matches ab, abc, abcc, abccc
    /abc+/        matches abc, abcc, abccc but not ab
    /(abc){2,}/   matches abcabc and abcabcabcabc but not abccabc
    /(abc){3}/    matches abcabcabc but not abcabc (unanchored, it also matches inside abcabcabcabc)

    .    matches any character except \n
    /Frodo./     matches "Frodon" but not "Frodo\n"
    /Frodo\./    matches only "Frodo."; \ is used to remove the special meaning of a metacharacter

Metasymbols:

    \t    tab
    \w    word character [a-zA-Z_0-9]
    \s    whitespace [ \t\n\r\f]
    \d    digit [0-9]

    /^\w+\s+\w+$/    matches exactly two words
    /a\t/            matches a followed by a tab
    /a\\t/           matches the three characters a, backslash, t

Alternations:

    /I am (Fred|Wilma|Pebbles) Flintstone/    matches exactly one of the names

Capture and clustering:

    while ($line = <STDIN>) {
        if ($line =~ /^(.*):(.*)$/) {
            $hash{$1} = $2;
        }
    }

    s/^(\w+) (\w+)/$2 $1/    swaps the first two words
    /\s(\w+)\s\1\s/          matches only two consecutive identical words

10. References and Data structures

Creating references:

    $scalarref = \$scalar;
    $arrref    = \@arr;
    $hashref   = \%hash;
    $arrref    = ["a", "b", "c", "d"];      # anonymous data
    $hashref   = {"red", 1, "green", 2};    # anonymous data

Dereferencing:

    $val  = $$scalarref;
    @arr  = @$arrref;
    $val  = $arrref->[2];
    %hash = %$hashref;
    $val  = $hashref->{"red"};

Data structures - examples:

    @arr = (1, 2, {"red", "flowers", "blue", "sky"}, ["a", "b", "c", "d"]);    # creating ds
    $val = $arr[2]->{"blue"};    # accessing ds: $val = "sky". Same as $arr[2]{"blue"}
    $arr[3] = [];                # $arr[3] now refers to a new empty array; ["a","b","c","d"] is discarded

Array of arrays of all STDIN words:

    my (@LoL);
    while ($line = <STDIN>) {
        my @tmp = split " ", $line;
        push @LoL, [ @tmp ];
    }

… or you may use only:

    push @LoL, [ split " ", $line ];

    foreach $i (0 .. $#LoL) {
        $row_ref = $LoL[$i];
        foreach $j (0 .. $#{$row_ref}) {
            print "element $i $j is $LoL[$i][$j] \n";
        }
    }
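
The array-of-arrays loop above can be sketched as a complete script. To keep it runnable without input, this version reads from a fixed list of lines rather than STDIN. The helper build_lol and the sample lines are assumptions made for illustration and are not part of the original notes.

```perl
use strict;
use warnings;

# build_lol is a hypothetical helper: it turns a list of text lines
# into an array of arrays, one inner array of words per line.
sub build_lol {
    my @lines = @_;                         # copy, so chomp cannot touch the caller's data
    my @LoL;
    foreach my $line (@lines) {
        chomp $line;
        push @LoL, [ split " ", $line ];    # anonymous array holding this line's words
    }
    return @LoL;
}

my @LoL = build_lol("a b c\n", "d e\n");    # sample input, stands in for <STDIN>

foreach my $i (0 .. $#LoL) {
    my $row_ref = $LoL[$i];                 # reference to row $i
    foreach my $j (0 .. $#{$row_ref}) {
        print "element $i $j is $LoL[$i][$j]\n";
    }
}
```

With the sample input, the script prints "element 0 0 is a" through "element 0 2 is c" for the first row, then "element 1 0 is d" and "element 1 1 is e" for the second. Rows may have different lengths, which is why the inner loop uses $#{$row_ref} rather than a fixed bound.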