1.00 Tutorial 11
HashTables

Goals of this Tutorial
• Understand utility methods
  – equals(), toString(), hashCode()
• Understand HashTables
  – what a HashTable is
  – how to use a HashTable
• Understand hash codes
  – what is a hash code?
  – what makes a good hash code?

The Object Class
• Object class methods:
  public boolean equals(Object obj)
      Indicates whether another object is equal to this one.
  public String toString()
      Returns a string representation of the object.
  public int hashCode()
      Returns a hash code value for the object.

The equals() Method

public class Student {
    int id;

    public Student(int id) {
        this.id = id;
    }

    public static void main(String[] args) {
        Student s1 = new Student(12); // student id is 12
        Student s2 = new Student(12); // student id is also 12
        Student s3 = s1;              // references the same object as s1
        s1.equals(s2); // what will the default equals() method return?
        s1.equals(s3); // what will the default equals() method return?
    }
}

QUESTION: How do you make equals() return true if two Student objects have the same ID? (Hint: override the default equals() method.)

The equals() Method (cont’d)

public class Student {
    int id;

    public Student(int id) {
        this.id = id;
    }

    // we override the default equals() method to compare IDs
    public boolean equals(Object obj) {
        // instanceof is false for null, so no separate null check is needed
        if (obj instanceof Student)
            return id == ((Student) obj).id;
        else
            return false;
    }

    public static void main(String[] args) {
        Student s1 = new Student(12); // student id is 12
        Student s2 = new Student(12); // student id is also 12
        Student s3 = s1;
        s1.equals(s2); // what does it return now?
        s1.equals(s3); // what does it return now?
    }
}

The toString() Method

public class Student {
    int id;

    public Student(int id) {
        this.id = id;
    }

    // we override the default toString() method
    public String toString() {
        return "ID: " + id;
    }

    public static void main(String[] args) {
        Student s = new Student(12);
        System.out.println(s);              // ID: 12
        System.out.println("Student " + s); // Student ID: 12
        System.out.println(s.toString());   // ID: 12
    }
}

hashCode() method
• hashCode(): returns an integer computed from the object's data; across many objects the values should be reasonably uniformly distributed.
• equals(): returns true if the objects’ data are equal.

Note: if two objects a and b are equal (a.equals(b) returns true), then their hashCodes must have the same value. A class that overrides equals() should therefore also override hashCode().

Recollection of our collections

[Figure: three collections side by side: an array (elem[0], elem[1], elem[2], ...), a linked list (item 1, item 2, item 3), and a binary search tree.]

Pop quiz: what is the average search time for each of these? Worst case search time?

We want to improve the average search time.

HashTable

[Figure: an array of buckets, bucket[0], bucket[1], bucket[2], …, bucket[n]. bucket[0] holds Obj 1 and Obj 2, followed by null. Obj 2 is shown with its two parts, key and value.]

The bucket index is computed from the hash code of the key.

The contents of a bucket are a collection of objects whose keys hash to the same value (0 in this case). The collection can be a linked list, array, Vector, binary tree, etc.

In general, objects in the hash table have two components: a key and a value. The hash code is computed on the key. The value is the data that is stored. In some cases, the key and value will be the same.
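The bucket picture above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the course's own code: the class name ChainedTableSketch, the Entry helper, the choice of 7 buckets, and the stored string are all hypothetical, and Student here gets a hashCode() that simply returns the id so that it stays consistent with the overridden equals().

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ChainedTableSketch {

    // Student from the earlier slides, with equals() and hashCode() kept consistent.
    static class Student {
        final int id;

        Student(int id) { this.id = id; }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Student && id == ((Student) obj).id;
        }

        @Override
        public int hashCode() {
            return id; // equal ids always give equal hash codes
        }
    }

    // One key/value pair stored in a bucket (hypothetical helper).
    static class Entry {
        final Object key;
        Object value;

        Entry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    // Each bucket is a linked list of entries whose keys map to the same index.
    private final List<List<Entry>> buckets = new ArrayList<>();

    ChainedTableSketch(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // Map the key's hash code onto a valid index; floorMod never returns a negative value.
    int bucketIndex(Object key) {
        return Math.floorMod(key.hashCode(), buckets.size());
    }

    // Replace the value if an equal key is already in the bucket, otherwise append.
    void put(Object key, Object value) {
        List<Entry> bucket = buckets.get(bucketIndex(key));
        for (Entry e : bucket) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        bucket.add(new Entry(key, value));
    }

    // Search only the one bucket the key maps to.
    Object get(Object key) {
        for (Entry e : buckets.get(bucketIndex(key))) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ChainedTableSketch table = new ChainedTableSketch(7);
        table.put(new Student(12), "record for student 12");
        System.out.println(table.bucketIndex(new Student(12))); // 5, since 12 mod 7 = 5
        System.out.println(table.get(new Student(12)));         // record for student 12
        System.out.println(table.get(new Student(13)));         // null
    }
}
```

The lookup with a second Student(12) object succeeds only because equals() and hashCode() agree: hashCode() sends both objects to the same bucket, and equals() then matches them inside it.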
Super Simple Example
• We will design a hash table with the following restrictions
  – Key is the same as the value
  – Can only hold Strings
  – Hash function is the length of the String
  – Always has just 5 buckets

Let's go through the file SuperSimpleHashTable.java

hashCode()
• In reality, it is often very difficult to come up with a good hash code algorithm
• Hash codes should be evenly distributed across the range of integers
• If we are using the modulo operator (%) to find the right bucket, the hash table should use a prime number of buckets

hashCode (2)
• What was wrong with our hash code algorithm in the SuperSimpleHashTable?
• Why was it okay?
• Let's think about a hash code algorithm for the Person class used in PS#9 and PS#10

Person (PS9 & 10)

public class Person {
    String name;
    int[] myProfile;   // me
    int[] yourProfile; // desired

    public int hashCode() { /* ? */ }
    public boolean equals(Object o) { /* ? */ }
}

You don't necessarily need these methods for PS#10; we just use the Person class as an example.

Hash Table Questions
1. What would be the result if hashCode() returned the same value for all objects?
2. How many objects can a hash table hold?
3. What is the effect of adding objects to a hash table if they are sorted vs. in random order?
4. What can you say about a hash table that has n buckets and 10n objects?

Answers
1. All objects would get stored in the same bucket, making the hash table useless.
2. In theory, there is no limit. (In practice, limited by memory and hard disk space.)
3. No effect. Hash tables do not store objects in order.
4. It is very inefficient. Its load factor (# objects / # buckets) is 10, so each bucket holds 10 objects on average and there will necessarily be many hash collisions. The load factor should be between 0.5 and 1.
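Returning to the two hashCode questions above, here is one possible sketch, not the official solution to either. It assumes that two Person objects count as equal when their name and both profile arrays match; that choice, the constructor, the class name PersonSketch, and the sample names are all hypothetical. The lengthBucket helper mimics the "length of the String" hash with 5 buckets, to show how easily it collides.

```java
import java.util.Arrays;
import java.util.Objects;

public class PersonSketch {

    static class Person {
        String name;
        int[] myProfile;   // me
        int[] yourProfile; // desired

        // Constructor added for the example only (assumption).
        Person(String name, int[] myProfile, int[] yourProfile) {
            this.name = name;
            this.myProfile = myProfile;
            this.yourProfile = yourProfile;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person p = (Person) o;
            // Arrays.equals compares array contents; == would compare references only.
            return Objects.equals(name, p.name)
                    && Arrays.equals(myProfile, p.myProfile)
                    && Arrays.equals(yourProfile, p.yourProfile);
        }

        @Override
        public int hashCode() {
            // Combine exactly the fields that equals() compares.
            int h = Objects.hashCode(name);
            h = 31 * h + Arrays.hashCode(myProfile);
            h = 31 * h + Arrays.hashCode(yourProfile);
            return h;
        }
    }

    // Length-based hash in the spirit of the Super Simple Example.
    static int lengthBucket(String s, int bucketCount) {
        return s.length() % bucketCount;
    }

    public static void main(String[] args) {
        Person a = new Person("Pat", new int[] {1, 2}, new int[] {3, 4});
        Person b = new Person("Pat", new int[] {1, 2}, new int[] {3, 4});
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Every three-letter string lands in the same bucket.
        System.out.println(lengthBucket("cat", 5)); // 3
        System.out.println(lengthBucket("dog", 5)); // 3
    }
}
```

Building hashCode() from the same fields that equals() compares keeps the two methods consistent. The length-based hash is also consistent with equals(), but it ignores the characters of the string, so unrelated strings of the same length always collide.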