Homework on Heaps and Hash Tables

Homework on Hash Tables
Key
1. What are the characteristics of a “good” hash function?
- Computation is fast and easy,
- it minimizes collisions, and
- it should be uniform.
2. Describe each of the following hash functions:
a. Division method
hash(x) = x modulo tableSize. The tableSize should be a prime number so that keys are
spread more uniformly over the table, which reduces collisions.
b. Extraction
Select digits from the key to form the hash. For example, hash(12345678) = 18 (extract
the first and last digits). The digits are not necessarily successive. You should select
digits that vary among the group of keys to reduce collisions.
c. Shift Folding
Divide the key into groups of digits, shift each group under the other, and then add
the groups together.
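The division and extraction methods described above can be sketched in C++ as follows. The function names `divisionHash` and `extractFirstLast` are hypothetical, and the sketch assumes non-negative integer keys.

```cpp
#include <cassert>
#include <string>

// Division method: the remainder of the key divided by the table size.
// Assumes tableSize > 0; a prime tableSize is the usual recommendation.
int divisionHash(long key, int tableSize)
{
    assert(tableSize > 0 && key >= 0);
    return static_cast<int>(key % tableSize);
}

// Extraction: build the hash from selected digits of the key.
// This variant keeps the first and last decimal digits,
// matching the example hash(12345678) = 18.
int extractFirstLast(long key)
{
    assert(key >= 0);
    std::string digits = std::to_string(key);
    int first = digits.front() - '0';
    int last = digits.back() - '0';
    return first * 10 + last;
}
```

Which digits to extract is a design choice; the first and last digits are used here only because the example above uses them.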
3. Define the following:
a. Collision
A collision occurs when key1 ≠ key2 but h(key1) = h(key2).
b. perfect hash function
A perfect hash function has zero collisions.
c. uniform hash function
A uniform hash function is a hash function that is equally likely to select any index in
the table. In other words, if k is an index in the table, the probability that k will be
selected is 1/tablesize.
d. open addressing
A collision resolution method such that when one key collides with another, the
collision is resolved by finding an address that is open (no key is stored there).
e. primary clustering
Primary clustering is a problem with linear-probing such that the table contains
groups of consecutively occupied locations after several collisions have occurred.
f. secondary clustering
Secondary clustering is a problem with quadratic-probing such that when two items
hash into the same location, the same probe sequence is used for each item when a
collision occurs.
g. α (load factor)
The load factor is defined to be (Current number of table items) / table size.
Load factor is a measure of how full the hash table is.
4. If a table is 4/5 full, what is the approximate average number of comparisons that a
search requires (for a successful search) for linear probing?
½[1 + 1 / (1- (4/5))] = 3
For quadratic probing?
-ln(1 - 4/5) / (4/5) ≈ 2.012
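Both estimates can be checked numerically. The sketch below evaluates the same two formulas used in this answer for a given load factor; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Approximate comparisons for a successful search with linear probing:
// (1/2) * [1 + 1 / (1 - alpha)], for 0 <= alpha < 1.
double linearProbingSuccess(double alpha)
{
    assert(alpha >= 0.0 && alpha < 1.0);
    return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
}

// Approximate comparisons for a successful search with quadratic probing:
// -ln(1 - alpha) / alpha, for 0 < alpha < 1.
double quadraticProbingSuccess(double alpha)
{
    assert(alpha > 0.0 && alpha < 1.0);
    return -std::log(1.0 - alpha) / alpha;
}
```

Calling both functions with alpha = 0.8 should reproduce the values 3 and roughly 2.012 computed by hand.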
5. If h(x) = x mod 7 and chaining resolves collisions, what does the hash table look like
after the following insertions occur: 8, 10, 24, 15, 32, 17?
0 empty
1 15 -> 8
2 empty
3 17 -> 24 -> 10
4 32
5 empty
6 empty
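The chained table for #5 can be reproduced with a short C++ sketch. It assumes each new key is inserted at the front of its chain, which is one common choice; appending at the back would give the same chains in the opposite order. The function name `buildChainedTable` is hypothetical.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Insert each key at the front of the chain for index key mod tableSize.
// Front insertion is an assumption; keys are assumed to be non-negative.
std::vector<std::list<int>> buildChainedTable(const std::vector<int>& keys, int tableSize)
{
    assert(tableSize > 0);
    std::vector<std::list<int>> table(tableSize);
    for (int key : keys)
    {
        table[key % tableSize].push_front(key);
    }
    return table;
}
```

For the keys 8, 10, 24, 15, 32, 17 and a table size of 7, only the chains at indices 1, 3, and 4 are non-empty.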
6. Redo #5 if linear probing is used to resolve collisions.
0 empty
1 8
2 15
3 10
4 24
5 32
6 17
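The linear-probing table for #6 can be traced with the sketch below. The function name and the use of -1 as the empty-slot marker are assumptions made for illustration; keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insert keys with h(x) = x mod tableSize. On a collision, step forward one
// slot at a time (wrapping around) until an empty slot is found.
std::vector<int> buildLinearProbingTable(const std::vector<int>& keys, int tableSize)
{
    const int EMPTY = -1;  // marker for an unused slot
    assert(tableSize > 0 && keys.size() <= static_cast<std::size_t>(tableSize));
    std::vector<int> table(tableSize, EMPTY);
    for (int key : keys)
    {
        int index = key % tableSize;
        while (table[index] != EMPTY)
        {
            index = (index + 1) % tableSize;
        }
        table[index] = key;
    }
    return table;
}
```

The precondition that the number of keys does not exceed the table size guarantees that the inner loop always finds an empty slot.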
7. Redo #5 if quadratic probing is used to resolve collisions.
0 17
1 8
2 15
3 10
4 24
5 32
6 empty
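The quadratic-probing table for #7 can be traced in the same way. This sketch assumes the probe sequence home, home + 1, home + 4, home + 9, ... (all modulo the table size), with -1 as a hypothetical empty-slot marker and non-negative keys.

```cpp
#include <cassert>
#include <vector>

// Insert keys with h(x) = x mod tableSize. On a collision, probe the slots
// at offsets 1, 4, 9, ... from the home slot until an empty one is found.
std::vector<int> buildQuadraticProbingTable(const std::vector<int>& keys, int tableSize)
{
    const int EMPTY = -1;  // marker for an unused slot
    assert(tableSize > 0);
    std::vector<int> table(tableSize, EMPTY);
    for (int key : keys)
    {
        int home = key % tableSize;
        for (int i = 0; i < tableSize; ++i)
        {
            int index = (home + i * i) % tableSize;
            if (table[index] == EMPTY)
            {
                table[index] = key;
                break;
            }
        }
        // If no probe finds an empty slot, the key is dropped in this sketch.
    }
    return table;
}
```

For key 17 the home slot 3 and the slot at offset 1 are both occupied, so the key lands at (3 + 4) mod 7 = 0.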
8. What is the load factor of the table in #6?
Load factor = (current number of table items) / table size = 6/7 = .857
9. Suppose shift folding is used for the hash function and the table size is 100. Where
would the following keys be placed in the table? Assume chaining is used to resolve
collisions. 41389217, 21634289, 15161718, 42356117
41389217 = 41|38|92|17 will go into index 41+38+92+17=188 so 188%100 = 88 of the array.
21634289 = 21|63|42|89 will go into index 21+63+42+89=215 so 215%100 = 15 of the array.
15161718 = 15|16|17|18 will go into index 15+16+17+18=66 of the array.
42356117 = 42|35|61|17 will go into index 42+35+61+17=155 so 155%100 = 55 of the array.
No collisions needed to be resolved using shift folding.
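The shift-folding arithmetic above can be sketched in C++. The function name `shiftFold` and its parameters are hypothetical. The sketch takes groups from the right-hand end of the key, which gives the same groups as the left-to-right split shown here whenever the number of digits is a multiple of the group size.

```cpp
#include <cassert>

// Shift folding: split the decimal key into groups of groupDigits digits,
// add the groups, and reduce the sum modulo tableSize.
int shiftFold(long key, int groupDigits, int tableSize)
{
    assert(key >= 0 && groupDigits > 0 && tableSize > 0);
    long divisor = 1;
    for (int i = 0; i < groupDigits; ++i)
    {
        divisor *= 10;
    }
    long sum = 0;
    while (key > 0)
    {
        sum += key % divisor;  // rightmost group
        key /= divisor;        // drop that group
    }
    return static_cast<int>(sum % tableSize);
}
```

With two-digit groups and a table size of 100, the four keys map to four different indices.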
10. Redo #9 if folding on the boundaries is used for the hash function and the table size is
100.
41389217 = 41|38|92|17 will go into index 41+83+92+71=287 so 287%100 = 87 of the array.
21634289 = 21|63|42|89 will go into index 21+36+42+98=197 so 197%100 = 97 of the array.
15161718 = 15|16|17|18 will go into index 15+61+17+81=174 so 174%100 = 74 of the array.
42356117 = 42|35|61|17 will go into index 42+53+61+71=227 so 227%100 = 27 of the array.
No collisions needed to be resolved using folding on the boundaries.
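Folding on the boundaries can be sketched the same way. Following the worked sums above, this sketch reverses the digits of every second group (the second, fourth, and so on, counting from the left) before adding. The function name `boundaryFold` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Boundary folding: split the key into groups of groupDigits digits from the
// left, reverse the digits of every second group, add, and reduce modulo tableSize.
int boundaryFold(long key, int groupDigits, int tableSize)
{
    assert(key >= 0 && groupDigits > 0 && tableSize > 0);
    std::string digits = std::to_string(key);
    long sum = 0;
    bool reverseGroup = false;  // the first group is kept as written
    for (std::size_t pos = 0; pos < digits.size(); pos += groupDigits)
    {
        std::string group = digits.substr(pos, groupDigits);
        if (reverseGroup)
        {
            std::reverse(group.begin(), group.end());
        }
        sum += std::stol(group);
        reverseGroup = !reverseGroup;
    }
    return static_cast<int>(sum % tableSize);
}
```

For 41389217 the groups become 41, 83, 92, 71, which sum to 287 and give index 87.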
11. Write pseudocode for the table operation TableDelete when the implementation uses
hashing and chaining is used to resolve collisions.
This is not pseudocode – it is code using STL lists.
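template<class T>
void HashTable<T>::tableDelete(int key, bool& found)
{
    // delete is a reserved word in C++, so the operation is named tableDelete.
    // found is passed by reference so that the caller sees the result.
    int index = h(key);
    // typename is required because list<T>::iterator is a dependent type.
    typename list<T>::iterator itr = items[index].begin();
    // Walk the chain at this index until the key is found or the chain ends.
    while (itr != items[index].end() && itr->getKey() != key) ++itr;
    if (itr != items[index].end())
    {
        found = true;
        items[index].erase(itr);
        return;
    }
    found = false;
}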
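To try the operation, it needs a surrounding class. The sketch below is one possible minimal layout, not the class from the course: the `Record` item type, the `tableInsert` member, the `items` vector of lists, and the hash `h(key) = key mod size` are all assumptions, and keys are assumed to be non-negative.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical item type: anything with a getKey() member works.
struct Record
{
    int key;
    int getKey() const { return key; }
};

template <class T>
class HashTable
{
public:
    // Assumes size > 0.
    explicit HashTable(int size) : items(size) { assert(size > 0); }

    // Adds the item at the front of its chain.
    void tableInsert(const T& item)
    {
        items[h(item.getKey())].push_front(item);
    }

    // Removes the item with the given key, if present; found reports the outcome.
    void tableDelete(int key, bool& found)
    {
        std::list<T>& chain = items[h(key)];
        for (typename std::list<T>::iterator itr = chain.begin(); itr != chain.end(); ++itr)
        {
            if (itr->getKey() == key)
            {
                chain.erase(itr);
                found = true;
                return;
            }
        }
        found = false;
    }

private:
    int h(int key) const { return key % static_cast<int>(items.size()); }
    std::vector<std::list<T>> items;
};
```

Deleting a key twice in a row sets found to true the first time and false the second time, because the first call removes the item from its chain.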
12. What table size should be used if the division method is used as the hash function?
A prime number should be used as the table size if the division method is used as the
hash function.
Why?
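With the division method, keys that share a common factor with the table size hash to
only a fraction of the indices. A prime table size avoids such patterns, so the keys are
spread more evenly and collisions are reduced.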
13. What table size should be used if quadratic probing is used for collision resolution?
When quadratic probing is used for collision resolution, a prime number of the form
4k + 3 for some number k is used as the table size.
Why?
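This is done so that all indices in the table are considered during collision resolution.
The guarantee applies when the probe offsets alternate in sign (+1, -1, +4, -4, ...); with
offsets +1, +4, +9, ... alone, a prime table size guarantees only that about half of the
indices are probed.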
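The coverage claim can be checked empirically for small table sizes. The sketch below assumes the variant of quadratic probing whose offsets alternate in sign (+1, -1, +4, -4, ...) and counts how many distinct indices the probe sequence reaches; the function name is hypothetical.

```cpp
#include <cassert>
#include <set>

// Counts the distinct indices reached from slot home by the probe sequence
// home, home + 1, home - 1, home + 4, home - 4, ... (offsets +i*i and -i*i).
int distinctAlternatingProbes(int home, int tableSize)
{
    assert(tableSize > 0 && home >= 0 && home < tableSize);
    std::set<int> visited;
    visited.insert(home);
    for (int i = 1; i <= tableSize / 2; ++i)
    {
        int offset = (i * i) % tableSize;
        visited.insert((home + offset) % tableSize);
        visited.insert((home - offset + tableSize) % tableSize);
    }
    return static_cast<int>(visited.size());
}
```

Under that assumption, table sizes 7 and 11 (both primes of the form 4k + 3) reach every index, while 13 (a prime of the form 4k + 1) reaches only 7 of its 13 indices.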