Homework on Hash Tables Key

1. What are the characteristics of a "good" hash function?
Computation is fast and easy, it minimizes collisions, and it should be uniform.

2. Define the following:

a. collision
A collision occurs when key1 ≠ key2 but h(key1) = h(key2).

b. perfect hash function
A perfect hash function has zero collisions.

c. uniform hash function
A uniform hash function is a hash function that is equally likely to select any index in the table. In other words, if k is an index in the table, the probability that k will be selected is 1/tablesize.

d. open addressing
A collision resolution method such that when one key collides with another, the collision is resolved by finding an address that is open (no key is stored there).

e. primary clustering
Primary clustering is a problem with linear probing such that the table contains groups of consecutively occupied locations after several collisions have occurred.

f. secondary clustering
Secondary clustering is a problem with quadratic probing such that when two items hash into the same location, the same probe sequence is used for each item when a collision occurs.

g. load factor
The load factor is defined to be (current number of table items) / table size. Load factor is a measure of how full the hash table is in open addressing.

3. If a table is 5/6 full, what is the approximate average number of comparisons that a search requires (for a successful search) for linear probing? For quadratic probing? For chaining?
With load factor 5/6:
For linear probing: ½[1 + 1 / (1 - 5/6)] = 3.5
For quadratic probing: -ln(1 - 5/6) / (5/6) ≈ 2.15
For chaining: 1 + (5/6)/2 ≈ 1.42

4. If h(x) = x mod 7 and chaining resolves collisions, what does the hash table look like after the following insertions occur: 9, 12, 16, 8, 2, 0?
0 -> 0
1 -> 8
2 -> 9 -> 16 -> 2
3
4
5 -> 12
6

5. Redo #4 if linear probing is used to resolve collisions.
Index  Key
0      0
1      8
2      9
3      16
4      2
5      12
6

6. Redo #4 if quadratic probing is used to resolve collisions.
Index  Key
0      0
1      8
2      9
3      16
4
5      12
6      2

7. What is the load factor of the table in #6?
Load factor = (current number of table items) / table size = 6/7 = 0.857

8. Suppose shift folding is used for the hash function and the table size is 100. Where would the following keys be placed in the table? Assume chaining is used to resolve collisions. 12345217, 54321289, 15161234, 42354321
12345217 -> 12 + 34 + 52 + 17 = 115 -> 115 % 100 = 15
54321289 -> 54 + 32 + 12 + 89 = 187 -> 187 % 100 = 87
15161234 -> 15 + 16 + 12 + 34 = 77 -> 77 % 100 = 77
42354321 -> 42 + 35 + 43 + 21 = 141 -> 141 % 100 = 41
No collisions needed to be resolved using shift folding.

9. Redo #8 if folding on the boundaries is used for the hash function and the table size is 100.
12345217 -> 12 + 43 + 52 + 71 = 178 -> 178 % 100 = 78
54321289 -> 54 + 23 + 12 + 98 = 187 -> 187 % 100 = 87
15161234 -> 15 + 61 + 12 + 43 = 131 -> 131 % 100 = 31
42354321 -> 42 + 53 + 43 + 12 = 150 -> 150 % 100 = 50
No collisions needed to be resolved using folding on the boundaries.

10. Write pseudocode for the table operation TableDelete when the implementation uses hashing and chaining is used to resolve collisions.
This is not pseudocode; it is code using STL lists.

    template<class T>
    bool HashTable<T>::TableDelete(int key)
    {
        // Hash the key to find the chain (an STL list) that could hold it.
        int index = h(key);

        // Walk the chain until the key is found or the chain ends.
        typename list<T>::iterator itr = items[index].begin();
        while (itr != items[index].end() && itr->getKey() != key)
            ++itr;

        // Key not found: nothing to delete.
        if (itr == items[index].end())
            return false;

        // Key found: remove the item from its chain.
        items[index].erase(itr);
        return true;
    }

11. What table size should be used if the division method is used as the hash function?
A prime number should be used as the table size if the division method is used as the hash function.
Why?
A prime number is used to reduce collisions when the division method is used as the hash function. If many keys share a factor with the table size, they hash only to indices that are multiples of that factor, so part of the table goes unused. A prime table size has no factors other than 1 and itself, which makes this pattern unlikely.

12. What table size should be used if quadratic probing is used for collision resolution?
When quadratic probing is used for collision resolution, a prime number of the form 4k + 3 for some number k is used as the table size.
Why?
This is done so that all indices in the table are considered during collision resolution.
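
The claim in #12 can be checked by counting how many distinct indices a quadratic probe sequence visits. The sketch below is not part of the original key: slotsReached and its parameters are hypothetical names, and it assumes the variant of quadratic probing that alternates the offsets +i² and -i² (the probing in #6 uses only +i²). It is an illustration under those assumptions, not a definitive implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helper (not from the homework): counts the distinct table
// indices visited by a quadratic probe sequence that starts at `home`.
// alternate == false: offsets are +1, +4, +9, ...
// alternate == true:  offsets are +1, -1, +4, -4, +9, -9, ... (assumed variant)
std::size_t slotsReached(int tableSize, int home, bool alternate)
{
    assert(tableSize > 0 && home >= 0 && home < tableSize);

    std::set<int> visited;
    visited.insert(home);  // probe 0 is the home index itself

    for (int i = 1; i < tableSize; ++i) {
        // i^2 reduced mod tableSize; long long avoids overflow for large i.
        int square = static_cast<int>((static_cast<long long>(i) * i) % tableSize);

        visited.insert((home + square) % tableSize);
        if (alternate) {
            // Add tableSize before reducing so the index is never negative.
            visited.insert((home - square + tableSize) % tableSize);
        }
    }
    return visited.size();
}
```

For the 7-slot table from #6 (7 = 4·1 + 3) with home index 2, slotsReached(7, 2, false) gives 4 and slotsReached(7, 2, true) gives 7, so under this sketch the whole table is covered only when the offsets alternate in sign. For a prime that is not of the form 4k + 3, such as 13, the alternating sequence reaches 7 of the 13 indices.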