Perl_3

Programming and Perl for Bioinformatics

Part III

Basic Data Types



Perl has three basic data types:

scalar
array (list)
associative array (hash)

Associative Arrays/Hashes







List of scalar values (like an array)

Elements are referred to by key, not by index number

Elements are stored as a list of key-value pairs

%threeletter = ('A','ALA','V','VAL','L','LEU');
#               key value key value key value
print $threeletter{'A'};   # "ALA"
print $threeletter{'L'};   # ?

exists checks whether a specific hash key exists

if ($threeletter{'E'}) { print $threeletter{'E'}; }   # ?

print "Exists\n"  if exists  $array{$key};
print "Defined\n" if defined $array{$key};
print "True\n"    if $array{$key};
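The three tests above answer different questions, which a small experiment can make visible. The following is a minimal sketch; the keys 'X' and 'Z' and their values are made up for illustration and are not part of the slide's hash.

```perl
use strict;
use warnings;

# Hypothetical data: 'X' is stored with an undefined value, 'Z' with an empty string.
my %threeletter = ('A' => 'ALA', 'X' => undef, 'Z' => '');

my @report;
foreach my $key ('A', 'X', 'Z', 'E') {
    my $exists  = exists  $threeletter{$key} ? 1 : 0;   # is the key present at all?
    my $defined = defined $threeletter{$key} ? 1 : 0;   # is its value something other than undef?
    my $true    = $threeletter{$key}         ? 1 : 0;   # is its value true in boolean context?
    push @report, "$key: exists=$exists defined=$defined true=$true";
}
print "$_\n" foreach @report;
```

Only 'E' fails the exists test, because it was never stored. A key can exist while its value is undefined ('X') or defined but false ('Z'), so a plain truth test is not a safe way to ask whether a key is present.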

Getting all keys and values in a hash

%threeletter = ('A','ALA','V','VAL','L','LEU');

keys returns a list of all keys
values returns a list of all values
each returns one key-value pair each time it is called

($key, $val) = each %threeletter;

Unlike an array, a hash is not an ordered list (the order of the key-value pairs is determined by the Perl interpreter)

foreach $k ( keys %threeletter ) { print $k; }
# Might return, for instance, "A L V",
# not "A V L" (the keys need not come back sorted)

foreach $v ( values %threeletter ) { print $v; }   # ?
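When a predictable order is needed, the usual approach is to sort the keys before looping. A minimal sketch using the same hash:

```perl
use strict;
use warnings;

my %threeletter = ('A','ALA','V','VAL','L','LEU');

# sort puts the keys in alphabetical order before the loop sees them
my @lines;
foreach my $k (sort keys %threeletter) {
    push @lines, "$k => $threeletter{$k}";
}
print join("\n", @lines), "\n";
```

The hash itself stays unordered; only the list handed to foreach is sorted.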

Associative Arrays



Some common functions:

keys(%hash)           # returns a list of all the keys
values(%hash)         # returns a list of all the values
each(%hash)           # each time this is called, it will
                      # return a 2 element list
                      # consisting of the next
                      # key/value pair in the array
delete($hash{[key]})  # remove the pair associated
                      # with key
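As an illustration of each and delete working together, here is a short sketch. The variable names $count, $removed and $remaining are chosen for this example only.

```perl
use strict;
use warnings;

my %threeletter = ('A','ALA','V','VAL','L','LEU');

# Walk over all pairs; each returns an empty list once the hash is exhausted,
# which ends the while loop.
my $count = 0;
while (my ($key, $val) = each %threeletter) {
    print "$key => $val\n";
    $count++;
}

# delete removes the pair and returns the value it held
my $removed   = delete $threeletter{'V'};
my $remaining = join ',', sort keys %threeletter;
print "removed $removed, remaining keys: $remaining\n";
```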

More on Perl



Subroutines and Functions



A way to organize a program



Wrap up a block of code



Have a name



Provide a way to pass values to the block and report back the results



Regular expression

Basics about Subroutines



# define a subroutine
sub myblock {
    my ($arg1, $arg2, $arg3, …, $argN) = @_;
    # @_ is a special variable containing the args
    print "Please enter something: ";
}

# function call
myblock($arg1, $arg2, …, $argN);



Example

sub add8A {
    my ($rna) = @_;
    $rna .= "AAAAAAAA";
    return $rna;
}

# the original rna
$rna = "CGAAUCUAGGAU";
$longer_rna = add8A($rna);
print "I added 8 As to $rna to get $longer_rna.\n";

More example

sub denaturizing {
    my (@products) = @_;
    my @strands = ();
    foreach $pairs (@products) {
        ($A, $B) = split /\s/, $pairs;
        @strands = (@strands, $A, $B);
    }
    return @strands;
}

# templates are in the form "A B". Ex. "ACGT TGCA"
@Denatured = denaturizing(@PCRproducts);
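The slide does not show the contents of @PCRproducts, so the sketch below supplies two made-up products to show a complete call. It is a variant of the subroutine rather than the original: it declares its loop variables with my so that it runs under use strict, and it uses push instead of rebuilding the array.

```perl
use strict;
use warnings;

sub denaturizing {
    my (@products) = @_;
    my @strands;
    foreach my $pair (@products) {
        # each product holds two strands separated by whitespace
        my ($A, $B) = split /\s+/, $pair;
        push @strands, $A, $B;
    }
    return @strands;
}

# Hypothetical input; each element has the form "strand strand"
my @PCRproducts = ("ACGT TGCA", "GGCC CCGG");
my @Denatured   = denaturizing(@PCRproducts);
print "@Denatured\n";
```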

Variables Scope



A variable $a is used both in the subroutine and in the main part of the program.

Version 1: $a is a global variable
(use strict accepts the undeclared $a only because $a and $b are Perl's predeclared sort variables)

use strict;
$a = 0;
print "$a\n";
sub changeA {
    $a = 1;
}
print "$a\n";
changeA();
print "$a\n";

Version 2: $a is declared with my

use strict;
my $a = 0;
print "$a\n";
sub changeA {
    my $a = 1;
}
print "$a\n";
changeA();
print "$a\n";

In each version the value of $a is printed three times. Can you guess what values are printed?
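One way to check a guess is to run a reduced experiment. The sketch below uses the made-up names $global and $outer instead of $a, and collects the observed values in @seen rather than printing them one by one.

```perl
use strict;
use warnings;

our $global = 0;                        # package (global) variable, visible inside subs
sub change_global { $global = 1; }      # assigns to the global itself

my $outer = 0;                          # lexical variable in the main program
sub change_local { my $outer = 1; return $outer; }   # a new variable that hides the outer one

my @seen;
push @seen, $global;    # before the call
change_global();
push @seen, $global;    # after the call: the global was changed

push @seen, $outer;     # before the call
change_local();
push @seen, $outer;     # after the call: the outer variable is untouched
print "@seen\n";
```

Defining a subroutine does not run it, so a variable can only change once the subroutine is actually called.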

Ex: What would be the output?

#!/usr/bin/perl -w

$dna = 'AAAAA';
$result = A_to_T($dna);
print "I changed all the A's in $dna to T's and got $result\n\n";

#############################################
# Subroutines

sub A_to_T {
    my($input) = @_;
    $dna = $input;
    $dna =~ s/A/T/g;
    return $dna;
}

Output?
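In the exercise as written, the subroutine assigns to the global $dna, so the caller's sequence is overwritten as a side effect. A possible repair, sketched here under use strict, is to make the working copy a lexical variable with my:

```perl
use strict;
use warnings;

sub A_to_T {
    my ($input) = @_;
    my $dna = $input;       # lexical copy; the caller's $dna is untouched
    $dna =~ s/A/T/g;
    return $dna;
}

my $dna    = 'AAAAA';
my $result = A_to_T($dna);
print "I changed all the A's in $dna to T's and got $result\n";
```

With this change the original sequence and the result can be reported side by side.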

Regular Expressions







Regular expressions: a language for specifying text strings

Regular expressions are a mechanism for specifying character patterns

Useful for:

Finding files by name
Finding text in a file
Finding (or not finding) interesting text in a string
Text-based search and replace
Finding and extracting text

Pattern Finding

Problem: find an ORF in nucleotide sequence







Look for start (ATG) and stop codons (TAA, TAG, TGA)

Pattern search operator: m// or //

$string =~ /<pattern>/ returns true if the pattern matches somewhere in $string, false otherwise

Example:

$dna = "GATGCCATGACACTGTTCA";
if ($dna =~ /ATG/) {
    print "starting codon is there";
} else {
    print "no starting codon!\n";
}

Regular Expressions





Optional characters: ?, * and +

/colou?r/ matches color or colour            ? (0 or 1)
/oo*h!/ matches oh! or ooh! or ooooh!         * (0 or more)
/o+h!/ matches oh! or ooh! or ooooh!          + (1 or more)

Wild card: .

/beg.n/ matches begin or began or begun

The * and + operators are named after Stephen Cole Kleene (Kleene star, Kleene plus).
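The quantifiers can be tried out directly with grep. In this sketch the word lists are invented test inputs, and each pattern is anchored with ^ and $ so that the whole word has to match, not just a part of it.

```perl
use strict;
use warnings;

# ? : the u may appear zero or one time
my @colour = grep { /^colou?r$/ } ('color', 'colour', 'colouur');

# * : the second o may appear zero or more times
my @star   = grep { /^oo*h!$/ }   ('h!', 'oh!', 'ooh!', 'ooooh!');

# + : at least one o is required
my @plus   = grep { /^o+h!$/ }    ('h!', 'oh!', 'ooh!', 'ooooh!');

# . : exactly one character of any kind between beg and n
my @dot    = grep { /^beg.n$/ }   ('begin', 'began', 'begun', 'begn');

print "colou?r : @colour\n";
print "oo*h!   : @star\n";
print "o+h!    : @plus\n";
print "beg.n   : @dot\n";
```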

Common Regular Expressions

White-space characters: \t (tab), \n (newline), \r (return)

\s        : match a whitespace character
x         : character 'x'
.         : any character except newline
^r        : match r at beginning of line
r$        : match r at end of line
r|s       : match either r or s
(r)       : group characters (to be saved in $1, $2, etc)
[xyz]     : character class; in this case, matches either an 'x', a 'y', or a 'z'
[abj-oZ]  : character class with a range in it; matches 'a', 'b', any letter from 'j' through 'o', or 'Z'
r*        : zero or more r's, where r is any regular expression
r+        : one or more r's
r?        : zero or one r's (i.e., an optional r)
{name}    : expansion of the "name" definition (lex/flex syntax; not available in Perl regular expressions)
rs        : RE r followed by RE s (i.e., concatenation)
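Several of these building blocks are typically combined in one pattern. The sketch below pulls an identifier and a length out of a header line; the header text and its "length=" field are invented for this example and do not follow any particular file format.

```perl
use strict;
use warnings;

# Hypothetical header line
my $header = ">seq1 length=42";

my ($id, $len) = ('', 0);
# ^ and $ anchor the match, [a-z0-9]+ and [0-9]+ are character classes with +,
# \s+ is one or more whitespace characters, and the two (...) groups fill $1 and $2
if ($header =~ /^>([a-z0-9]+)\s+length=([0-9]+)$/) {
    ($id, $len) = ($1, $2);
    print "id=$id length=$len\n";
}

# r|s : alternation, here used to ask whether any stop codon occurs in the string
my $has_stop = ("CCTAAG" =~ /TAA|TAG|TGA/) ? 1 : 0;
print "stop codon present: $has_stop\n";
```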

Exercise

Ex1:

$dna = "AGGCTCGTACGACG";
if ( $dna =~ /CT[CGT]ACG/ ) {
    print "I found the motif!!\n";   # ?
}

Ex2: Find an ORF in nucleotide sequence (look for start (ATG) and stop codons (TAA, TAG, TGA))

$dna = "tatggagcctcctgaggctacagccacacctgagccactctaaga";

?
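One possible approach to Ex2 is sketched below; it is not the only valid answer. It assumes that an ORF means a start codon followed by whole codons up to the first in-frame stop codon. It uses three constructs beyond the table above: (?:...) groups without saving, {3} repeats exactly three times, and *? repeats as few times as possible. The /i modifier makes the match case-insensitive, since the sequence is in lower case.

```perl
use strict;
use warnings;

my $dna = "tatggagcctcctgaggctacagccacacctgagccactctaaga";

my $orf = '';
# atg, then as few complete codons as possible, then the first in-frame stop codon
if ($dna =~ /(atg(?:[acgt]{3})*?(?:taa|tag|tga))/i) {
    $orf = $1;
    print "ORF found: $orf\n";
    print "length: ", length($orf), " nucleotides\n";
} else {
    print "no ORF found\n";
}
```

Stepping in units of three matters here: a simpler pattern such as /atg.*?(taa|tag|tga)/ would stop at the first tga it meets, even though that tga does not lie in the reading frame set by the start codon.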
