8.1 Hashes (associative arrays)

8.2 Hash Motivation

Let's say we want to create a phone book...

    Enter a name that will be added to the phone book:
    Dudi
    Enter a phone number:
    6409245
    Enter a name that will be added to the phone book:
    Dudu
    Enter a phone number:
    6407693

8.3 Hash – an associative array

An associative array (or simply a hash) is an unordered set of key-value pairs. Each key is unique and is associated with exactly one value.

A hash variable name always starts with a "%":

    my %hash;

Initialization:

    %hash = ("a"=>5, "bob"=>"zzz", 50=>"John");

Accessing: you can access a value by its key (note the "$" and the curly braces):

    print $hash{50};

    John

Tip: you can reset the hash (to an empty one) by

    %hash = ();

Contents of %hash:

    "a"   => 5
    "bob" => "zzz"
    50    => "John"

8.4 Hash – an associative array

Modifying an existing value:

    $hash{bob} = "aaa";

Adding a new key-value pair:

    $hash{555} = "z";

You can ask whether a certain key exists in a hash:

    if (exists $hash{50} )...

You can delete a certain key-value pair in a hash:

    delete($hash{50});

Contents of %hash after the modification and the addition (before the delete):

    "a"   => 5
    "bob" => "aaa"    (was "zzz")
    50    => "John"
    555   => "z"

8.5 Variable types in Perl

    Type     Variable                               Single element
    Scalar   $number (-3.54), $string ("hi\n")
    Array    @array                                 $array[0]
    Hash     %hash                                  $hash{key}

8.6 Hash – an associative array

A hash for the phone book suggested in the first slide (we will see a more elaborate version later on):

Declaring:

    my %phoneBook;

Updating:

    $phoneBook{"Dudi"} = 9245;
    $phoneBook{"Dudu"} = 7693;

Fetching:

    print $phoneBook{"Dudi"};

Contents of %phoneBook:

    "Dudi" => 9245
    "Dudu" => 7693

8.7 Iterating over hash elements

It is possible to get a list of all the keys in %hash:

    my @hashKeys = keys(%hash);

Similarly, you can get a list of all the values in %hash:

    my @hashVals = values(%hash);

For the %hash defined earlier:

    %hash                @hashKeys    @hashVals
    "a"   => 5           "a"          5
    "bob" => "zzz"       "bob"        "zzz"
    50    => "John"      50           "John"

8.8 Iterating over hash elements

To iterate over all the values in %hash:

    my @hashVals = values(%hash);
    foreach my $value (@hashVals)...
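The hash operations introduced so far can be tried together in one short script. This is only a sketch: it reuses the %hash from the slides, and the sort calls are an addition made here so that the printed order is predictable (hash order itself is arbitrary).

```perl
use strict;
use warnings;

# Same example hash as in the slides
my %hash = ("a" => 5, "bob" => "zzz", 50 => "John");

$hash{bob} = "aaa";     # modify an existing value
$hash{555} = "z";       # add a new key-value pair

if (exists $hash{50}) {
    print "Key 50 exists, its value is $hash{50}\n";
}

delete($hash{50});      # remove the pair whose key is 50

# keys() and values() return lists; sort is used here only for stable output
my @hashKeys = sort(keys(%hash));
my @hashVals = sort(values(%hash));

foreach my $value (@hashVals) {
    print "A value: $value\n";
}
print "Number of keys left: ", scalar(@hashKeys), "\n";
```

After the delete, asking exists $hash{50} again would return false, while the other three keys are untouched.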
To iterate over the keys in %hash:

    my @hashKeys = keys(%hash);
    foreach my $key (@hashKeys)...

8.9 Iterating over hash elements

For example, iterating over the keys in %hash:

    my @hashKeys = keys(%hash);
    foreach my $key (@hashKeys) {
        print "The key is $key\n";
        print "The value is $hash{$key}\n";
    }

Output (the order of the pairs may differ):

    The key is a
    The value is 5
    The key is bob
    The value is zzz
    The key is 50
    The value is John

8.10 Iterating over hash elements

Note: the elements are returned in an arbitrary order, so if you want a certain order, use sort:

    my @hashKeys = keys(%hash);
    my @sortedHashKeys = sort(@hashKeys);
    foreach my $key (@sortedHashKeys) {
        print "The key is $key\n";
        print "The value is $hash{$key}\n";
    }

8.11 Example – phoneBook.pl #1

    ######################################
    # Purpose: Store names and phone numbers in a hash,
    #          and allow the user to ask for the number of a certain name.
    # Input:   Enter name-number pairs, enter "END" as a name to stop,
    #          then enter a name to get his/her number
    #
    use strict;
    my %phoneNumbers = ();
    my $number;

8.12 Example – phoneBook.pl #2

    # Ask user for names and numbers and store in a hash
    my $name = "";
    while ($name ne "END") {
        print "Enter a name that will be added to the phone book:\n";
        $name = <STDIN>;
        chomp $name;
        if ($name eq "END") {
            last;
        }
        print "Enter a phone number: \n";
        $number = <STDIN>;
        chomp $number;
        $phoneNumbers{$name} = $number;
    }

8.13 Example – phoneBook.pl #3

    # Ask for a name and print the corresponding number
    $name = "";
    while ($name ne "END") {
        print "Enter a name to search for in the phone book:\n";
        $name = <STDIN>;
        chomp $name;
        if (exists($phoneNumbers{$name})) {
            print "The phone number of $name is: $phoneNumbers{$name}\n";
        }
        elsif ($name eq "END") {
            last;
        }
        else {
            print "Name not found in the book\n";
        }
    }

8.14 Class exercise 8

1. Write a script that reads a file with a list of protein names and lengths (proteinLengths):

       AP_000081 181
       AP_000174 104
       AP_000138 145

   and stores the names of the sequences as hash keys, with the length of the sequence as the value. Print the keys of the hash.

2. Add to Q1: read another file, and print the names that appear in both files with the same length. Print a warning if the name is the same but the length is different.

3. Write a script that reads a GenPept file (you may use the preproinsulin record), finds all JOURNAL lines, and stores in a hash the journal name (as key) and the year of publication (as value):
   a. Store only one year value for each journal name.
   b*. Store all years for each journal name.
   Then print the names and years, sorted by journal name (no need to sort the years for the same journal in b*, unless you really want to do so...).
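One possible starting point for question 1 is sketched below. To stay self-contained it does not open a file: the three sample lines sit in an in-memory list, @lines, which is a stand-in introduced here for the contents of proteinLengths. It also assumes that the name and the length on each line are separated by whitespace. Reading the real file is left as part of the exercise.

```perl
use strict;
use warnings;

# Stand-in for the lines of the proteinLengths file (assumption: one
# "name length" pair per line, separated by whitespace)
my @lines = ("AP_000081 181\n", "AP_000174 104\n", "AP_000138 145\n");

my %proteinLengths;
foreach my $line (@lines) {
    chomp $line;
    my ($name, $length) = split(/\s+/, $line);
    $proteinLengths{$name} = $length;   # name is the key, length is the value
}

# Hash order is arbitrary, so sort the keys before printing them
foreach my $name (sort(keys(%proteinLengths))) {
    print "$name\n";
}
```

For question 2, the same loop could fill a second hash from the other file, with exists used to check whether a name from one hash also appears in the other.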