Perl for biologists

PERL
More references, complex data
types and objects
A. Emerson, Perl for Biologists
References in Perl
• An alternative mechanism for working with variables such as arrays or hashes (associative arrays).
• It is useful to think of a reference as the address of the object in memory.
• Often very efficient and convenient, because a reference is always stored in a scalar variable, regardless of the size or complexity of the thing being referenced.
• Can be used to avoid the difficulty of passing multiple arrays into or out of subroutines (arrays passed directly are flattened into a single list).
• In Perl you can take a reference to just about anything, including subroutines.
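As a minimal sketch of the subroutine point, the example below passes two arrays by reference so that they arrive as two separate scalars instead of one flattened list. The body of add_seqs and the contents of @dna2 are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: joins two sequences passed as array references.
sub add_seqs {
    my ($first_ref, $second_ref) = @_;          # two scalars, one per array
    return [ @{$first_ref}, @{$second_ref} ];   # reference to a new array
}

my @dna1 = qw(G G T C T G);
my @dna2 = qw(A A C);                           # illustrative data

my $joined = add_seqs(\@dna1, \@dna2);
print join('', @{$joined}), "\n";               # prints GGTCTGAAC
```

Returning a reference also lets a subroutine hand back several arrays at once, for example as `return (\@a, \@b);`.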
Creating references and dereferencing
@dna1 = qw(G G T C T G);
$dna_ref = \@dna1;             # storing a reference in a scalar
add_seqs(\@dna1, \@dna2);      # passing references to a subroutine
$ref_ref = \$dna_ref;          # reference to a reference
# dereferencing
@array = @{$dna_ref};
$base = $$dna_ref[1];          # careful! dereferencing occurs before the
                               # array lookup: this is ${$dna_ref}[1],
                               # not an element of an array @dna_ref
$$dna_ref[0] = 'C';
$base = $dna_ref->[1];         # alternative notation (clearer)
It may help to consider a reference as an alias of the variable
being referenced.
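The alias idea can be checked with a short sketch: a change made through the reference is visible in the original array. The variable names follow the slide; the print statements are added for illustration only.

```perl
use strict;
use warnings;

my @dna1    = qw(G G T C T G);
my $dna_ref = \@dna1;

$dna_ref->[0] = 'C';               # modify the array through the reference
print "@dna1\n";                   # prints C G T C T G

print ref($dna_ref), "\n";         # prints ARRAY (the kind of thing referenced)

my $ref_ref = \$dna_ref;           # reference to a reference
print $$ref_ref->[1], "\n";        # prints G
```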
Two dimensional arrays
• References can be used to create data types not present in standard Perl, e.g. matrices (n-dimensional arrays).
@row1 = (1.0, 0.0, 1.0);
@row2 = (0.0, 1.0, 0.0);
@row3 = (1.0, 0.0, 1.0);
@matrix = (\@row1, \@row2, \@row3);
# Perl simulation of a 2-D (3 x 3) matrix
for (my $i = 0; $i < 3; $i++) {
    for (my $j = 0; $j < 3; $j++) {
        print "$matrix[$i]->[$j] ";
    }
    print "\n";
}
Strictly speaking this is an array of references (to other arrays), which is more flexible than a true matrix because the rows can be of different lengths.
The above solution, though, creates unnecessary named arrays (@row1, etc.); we would like to create the matrix directly.
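Because the rows may differ in length, a loop over such a structure is safer if it asks each row for its own size rather than assuming a fixed width. The following is a sketch with made-up row contents:

```perl
use strict;
use warnings;

# Illustrative rows of different lengths
my @row1   = (1, 0, 1);
my @row2   = (5);
my @row3   = (2, 3);
my @matrix = (\@row1, \@row2, \@row3);

my @lengths;
foreach my $row (@matrix) {
    my $n = scalar @{$row};        # number of elements in this row
    push @lengths, $n;
    print "$n element(s): @{$row}\n";
}
```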
Anonymous arrays and hashes
• Perl provides an alternative method for creating arrays and hashes:
# standard array definition
@dna1 = qw(G G T C T G);
# anonymous definition using a reference
$dna = [ qw(A A A A) ];        # NB [ ] brackets!
$dna->[0] = 'T';
• The new array does not have a name (it is "anonymous") but can be accessed through the reference.
Using anonymous arrays
# defining a 2D matrix in one statement
@matrix = ( [1.0, 0.0, 1.0],
            [0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0] );
print "$matrix[1]->[1]\n";
# or even
$matrix = [ [1.0, 0.0, 1.0],
            [0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0] ];
$matrix->[1]->[1] = 2.0;
Using anonymous arrays
• Between two subscripts the -> is optional:
# defining a 2D matrix in one statement
@matrix = ( [1.0, 0.0, 1.0],
            [0.0, 1.0, 0.0],
            [1.0, 0.0, 1.0] );
print "$matrix[1]->[1]\n";
# alternatively
print "$matrix[1][1]\n";
• Similarly for 3-D arrays, hashes, etc.
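To illustrate the same rule for deeper structures, here is a small sketch with a 3-D array and a hash of arrays. The data (including the codon lists) are chosen only as an example.

```perl
use strict;
use warnings;

# A 2 x 2 x 2 "cube" built from nested anonymous arrays
my @cube = (
    [ [1, 2], [3, 4] ],
    [ [5, 6], [7, 8] ],
);
print $cube[1][0][1], "\n";          # prints 6
print $cube[1]->[0]->[1], "\n";      # same element, arrows written out

# A hash whose values are anonymous arrays
my %codons = ( ser => ['AGU', 'AGC'], tyr => ['UAU', 'UAC'] );
print $codons{tyr}[1], "\n";         # prints UAC
```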
Using anonymous arrays
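• Assigning one array reference to another doesn't copy the array; both refer to the same data:
@array = (0, 0, 0, 0, 0);
$aref = \@array;
$bref = $aref;
$aref->[1] = 99;
print "@$aref \n @$bref\n";    # both lines show the 99
• To make an independent copy, you can use an anonymous array:
@array = (0, 0, 0, 0, 0);
$aref = \@array;
$bref = [@$aref];              # copies the elements into a new array
$aref->[1] = 99;
print "@$aref \n @$bref\n";    # only the first line shows the 99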
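Note that copying with an anonymous array duplicates only the top level. For nested structures such as the matrices shown earlier, the inner rows would still be shared. One possible way around this, sketched here, is the dclone function from the core Storable module. The matrix contents are illustrative.

```perl
use strict;
use warnings;
use Storable qw(dclone);

my $matrix  = [ [0, 0], [0, 0] ];
my $shallow = [ @{$matrix} ];      # new outer array, but the same inner rows
my $deep    = dclone($matrix);     # fully independent copy

$matrix->[0][0] = 99;
print $shallow->[0][0], "\n";      # prints 99 (row is shared)
print $deep->[0][0], "\n";         # prints 0  (row was copied)
```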
Using anonymous arrays and hashes
• Anonymous hashes can be created similarly using { }:
# Anonymous hashes (RNA codons)
$code = { UAA => 'stop', AGU => 'ser' };
$code->{UAC} = 'tyr';
foreach $key (keys %{$code}) {
    print $code->{$key}, "\n";
}
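As a sketch of how such a hash reference might be used, the example below translates a short RNA string codon by codon. The sequence and the four-entry codon table are illustrative only; a real table would hold all 64 codons.

```perl
use strict;
use warnings;

# Small illustrative subset of the genetic code
my $code = { AUG => 'met', UAC => 'tyr', AGU => 'ser', UAA => 'stop' };

my $rna = 'AUGUACAGUUAA';          # made-up sequence
my @peptide;
while ($rna =~ /(...)/g) {         # read three bases at a time
    my $aa = exists $code->{$1} ? $code->{$1} : '???';
    last if $aa eq 'stop';
    push @peptide, $aa;
}
print join('-', @peptide), "\n";   # prints met-tyr-ser
```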
Using anonymous arrays and hashes
You can mix anonymous arrays and hashes to create complicated structures:
my $planck = { home => 'csc', name => "PLANCK",
               accounts => ['csc07141', 'csc18709', 'csc18091', 'csc11014'],
               allocated => 500000 };
my $muheart = { home => 'epcc', name => "MuHeart",
                accounts => ['hpx001', 'hpx0002', 'hpx0003'],
                allocated => 100000 };
my @projects = ($planck, $muheart);
foreach my $project (@projects) {
    print $project->{name}, "\n";
    foreach my $acc (@{$project->{accounts}}) {
        print "$project->{home} $acc\n";
    }
}
References to functions and other things
• You can make references to anything, including scalars, functions, and other references.
# reference to a sub
$coderef = sub { print "Boink!\n" };
$coderef->();
• This looks a bit strange at first but becomes important in object-oriented programming.
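One common use of code references, sketched here with made-up operation names, is a table of named operations stored in a hash and called through the reference:

```perl
use strict;
use warnings;

# Hypothetical table of sequence operations, keyed by name
my %action = (
    length  => sub { return length($_[0]) },
    revcomp => sub {
        my $s = reverse $_[0];     # reverse the string
        $s =~ tr/ACGT/TGCA/;       # then complement each base
        return $s;
    },
);

my $seq = 'GGTCTG';
foreach my $name (sort keys %action) {
    print "$name: ", $action{$name}->($seq), "\n";
}
# prints:
# length: 6
# revcomp: CAGACC
```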
Objects
• Simple scalars, arrays, hashes or more complex structures are often given the generic term object.
• Objects can be created, destroyed, copied or passed between one part of the program and another. They can be combined to give other objects and you can access them via references.
• In languages such as C or Fortran (90+) new objects can be defined by creating new data types which extend the built-in set of int, double, real, character, etc.
/* C program */
typedef struct {
float i; /*real part*/
float j; /*imag part */
} complex;
complex a,b;
Objects
• In object-oriented programming languages (e.g. C++ or Java) objects don't just contain data but also program code which governs how the object interacts with the rest of the program.
• The code, in the form of method functions, can be used to create the object ("the constructor") or to operate on other objects.
• The program is no longer written as a set of sequential instructions but instead as a collection of interacting objects. This is often a very convenient and natural way of representing a programming problem.
Implementation of objects
STATE: usually held as local variables (also called properties), quantities sometimes not visible outside the object (data hiding).
BEHAVIOUR: controlled by method functions or subroutines which act on the local variables and interface with the outside.
C++ objects
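// class definition (the template for the objects)
#include <iostream>

class CRectangle
{
    // state data (private by default)
    int x, y;
public:
    // method functions
    void setvalues(int a, int b) { x = a; y = b; }
    int area() { return x * y; }
};  // the class definition ends with a semicolon

// main code
int main()
{
    CRectangle recta, rectb;
    recta.setvalues(3, 4);
    std::cout << recta.area() << std::endl;  // prints 12
    return 0;
}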
OOP – key concepts
• Classes: the templates used to define the objects. Note that defining the class does not actually create the object.
• Instantiation: the creation of an object (an object is an instance of a class).
• Properties: data about the object itself. Can be private (not directly accessible by other objects) or public (accessible).
• Method functions: program code used to define the behaviour or functionality of the object. Special functions called constructors create the object.
• Inheritance: deriving new classes from previously defined classes. Saves programming effort and can create object hierarchies.
Key feature of OOP - Inheritance
An important feature of any OOP language is the ability to derive one class of objects from a more general class of related objects: this is called inheritance.
General class: a standard eukaryotic cell
• properties: nucleus, cell membrane, cytoplasm, alive or dead
• behaviour: undergoes division, makes proteins from DNA
Derived classes: white blood cell, nerve cell, skin cell
OOP and Perl
• Perl was never designed as a true OOP language. The implementation is rather ad hoc, and the OOP syntax is non-standard.
• Writing your own Perl classes is rarely necessary, but it is important to know how to use objects because library packages (e.g. from CPAN) often provide them.
• Using and managing Perl objects requires heavy use of references.
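To show how references underpin Perl objects, here is a minimal sketch of a class and a derived class, loosely following the cell example used for inheritance. The package names, fields and methods are all invented for illustration. The object itself is simply a hash reference that bless associates with a package.

```perl
use strict;
use warnings;

package Cell;                       # hypothetical base class
sub new {                           # constructor
    my ($class, %args) = @_;
    my $self = { name => $args{name} || 'cell', alive => 1 };
    return bless $self, $class;     # the object is a blessed hash reference
}
sub describe {
    my ($self) = @_;
    return "$self->{name} (" . ($self->{alive} ? 'alive' : 'dead') . ")";
}

package NerveCell;                  # hypothetical derived class
our @ISA = ('Cell');                # inherit new() and describe() from Cell
sub fire {
    my ($self) = @_;
    return "$self->{name} fires";
}

package main;
my $neuron = NerveCell->new(name => 'neuron');
print $neuron->describe, "\n";      # prints neuron (alive)
print $neuron->fire, "\n";          # prints neuron fires
print ref($neuron), "\n";           # prints NerveCell
```

The `->` used to call a method is the same arrow used to dereference, which is why method calls on library objects look like reference syntax.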
PERL Objects - Example
# Perl database interface package
# (contains the definition of the DB object)
use DBI;
# data source name: driver, host and port
my $dsn = "DBI:mysql:host=mysql.cineca.it;port=2222";
my $user = "test01";
my $password = "testtest";
# create a DB object, referenced by $dbh
my $dbh = DBI->connect($dsn, $user, $password,
                       { RaiseError => 1, AutoCommit => 0 });
# ... do something
$dbh->commit;
$dbh->disconnect;
Summary
• References, together with anonymous arrays and hashes, can be used to create complex data structures which aren't present in standard Perl (e.g. 2D arrays or tables).
• An important use of references is in object-oriented programming. Most Perl programmers do not write object definitions in Perl, but libraries (e.g. BioPerl) often use them.