Hash Functions

Chapter 5 Hashing
Hash table operations
Dictionary implementations and complexity of operations
unsorted array:
sorted array:
linked list:
open hashing (also called separate chaining):
closed hashing:
Hashing and Hash Functions
Basic idea: take the key k of the item, apply a hash function h, and
place the record with key k into cell h(k).
Hash Functions
The characteristics of a good hash function are:
1.
2.
A commonly used hash function is obtained by choosing a prime
number p as the table size with function h(key) = key (mod p).
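
A minimal sketch of the division method just described, assuming integer keys and a hypothetical table size p = 11 (the keys are made up for illustration):

```python
def h(key: int, p: int = 11) -> int:
    """Division-method hash: map an integer key to a cell in 0..p-1."""
    return key % p

# Illustrative keys only; any integers would do.
keys = [12, 44, 13, 88, 23]
cells = {k: h(k) for k in keys}
print(cells)
```

Here 44 and 88 both land in cell 0, and 12 and 23 both land in cell 1, so even a reasonable hash function needs a collision resolution strategy.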
Consider the hash function h(x) = x^2 (mod 10).
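
One way to examine this function is to list which of the 10 cells it can ever produce. The sketch below assumes non-negative integer keys:

```python
def h(x: int) -> int:
    """Square the key and keep the last decimal digit."""
    return (x * x) % 10

# The last digit of x*x depends only on the last digit of x,
# so checking 0..99 is more than enough to see every reachable cell.
used = sorted({h(x) for x in range(100)})
unused = [c for c in range(10) if c not in used]
print(used)
print(unused)
```

Only cells 0, 1, 4, 5, 6 and 9 are ever used; cells 2, 3, 7 and 8 stay empty no matter which keys arrive, so the keys are not spread over the whole table.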
Two other hashing strategies: truncation and folding.
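
The notes do not spell these two strategies out, so the sketch below uses the usual textbook reading as an assumption: truncation keeps only part of the key (here the last few decimal digits), and folding splits the key into pieces and adds them. The function names, chunk width and table size are hypothetical.

```python
def truncation_hash(key: int, digits: int = 3) -> int:
    """Ignore all but the last `digits` decimal digits of a non-negative key."""
    return key % (10 ** digits)

def folding_hash(key: int, chunk: int = 3, table_size: int = 1000) -> int:
    """Split the decimal digits into chunks, add the chunks, reduce mod table size."""
    s = str(key)  # assumes key >= 0, so there is no sign character
    total = sum(int(s[i:i + chunk]) for i in range(0, len(s), chunk))
    return total % table_size

print(truncation_hash(123456789))
print(folding_hash(123456789))
```

For the key 123456789, truncation keeps 789, while folding adds 123 + 456 + 789 = 1368 and reduces it to 368. Folding lets every digit influence the result; truncation does not.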
Open Hashing (Separate Chaining)
[Figure: an array of m cells, indexed 0 to m-1; each cell points to a linked list (chain) of the records that hash to it.]
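
The structure pictured above can be sketched as follows. This is an illustration under stated assumptions, not the text's implementation: the class name ChainedHashTable is hypothetical, keys are integers hashed with key mod m, and Python lists stand in for the linked lists.

```python
class ChainedHashTable:
    """Open hashing: cell i holds a chain of the (key, value) pairs with h(key) == i."""

    def __init__(self, m: int = 11):
        self.m = m                            # table size
        self.cells = [[] for _ in range(m)]   # one (initially empty) chain per cell
        self.n = 0                            # number of items stored

    def _h(self, key: int) -> int:
        return key % self.m

    def insert(self, key, value):
        chain = self.cells[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.n += 1

    def find(self, key):
        for k, v in self.cells[self._h(key)]:
            if k == key:
                return v
        return None                           # not found

    def delete(self, key) -> bool:
        chain = self.cells[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.n -= 1
                return True
        return False


t = ChainedHashTable(11)
t.insert(12, "a")
t.insert(23, "b")   # collides with 12 in cell 1 and joins the same chain
print(t.find(23), t.cells[1])
```

Deletion is simple here because removing a record from a chain does not disturb any other record.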
The load factor of a hash table is the ratio of the number of items
in the table to the size of the table. Notation: λ = n/m.
A probe is an access into the data structure.
a. unsuccessful search:
b. successful search:
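
To get a feel for how probe counts relate to the load factor in open hashing, the sketch below counts probes directly from the chain lengths. It assumes one particular counting convention, which may differ from the one used in class: a probe is the examination of one record in a chain, an unsuccessful search scans a whole chain, and a successful search stops at the record it is looking for. The function name and the example chain lengths are hypothetical.

```python
def average_probes(lengths, n):
    """Return (unsuccessful, successful) average probe counts.

    lengths[i] is the number of records chained at cell i; n is their total.
    """
    m = len(lengths)
    # Unsuccessful: a random cell is chosen and its whole chain is scanned.
    unsuccessful = sum(lengths) / m
    # Successful: the j-th record in a chain costs j probes; average over all records.
    successful = sum(L * (L + 1) // 2 for L in lengths) / n
    return unsuccessful, successful

lengths = [2, 0, 1]          # 3 cells holding 3 records in total
n = sum(lengths)
load_factor = n / len(lengths)
print(load_factor, average_probes(lengths, n))
```

Under this convention the average cost of an unsuccessful search is exactly the load factor n/m, whatever the individual chain lengths are.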
Closed Hashing (Open Addressing)
Collision resolution strategies
linear probing
cell      0   1   2   3   4   5   6   7   8   9   10
contents  -   X   X   X   -   -   X   -   X   -   -
(X = occupied, - = empty)
Suppose the next item to be placed in the table hashes to 3.
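
A short sketch of what linear probing does with that item. It assumes the X marks in the table belong to cells 1, 2, 3, 6 and 8 of an 11-cell table; the function name is hypothetical.

```python
def linear_probe_insert(table, key, home):
    """Try cells home, home+1, home+2, ... (wrapping around) until one is empty."""
    m = len(table)
    for i in range(m):
        cell = (home + i) % m
        if table[cell] is None:
            table[cell] = key
            return cell
    raise RuntimeError("table is full")

table = [None] * 11
for occupied in (1, 2, 3, 6, 8):   # assumed positions of the X marks
    table[occupied] = "X"

cell = linear_probe_insert(table, "new", home=3)
print(cell)
```

Cell 3 is taken, so the item goes into cell 4. The block of occupied cells now runs from 1 through 4, and any later key that hashes anywhere in that block will extend it further.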
Claim (without proof) (p. 189 in the text): with linear probing, the
expected number of probes is ½ [1 + 1/(1 - λ)^2] for insertions and
unsuccessful searches and ½ [1 + 1/(1 - λ)] for successful searches,
where λ is the load factor.
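
To see how quickly these estimates grow, the two formulas can be evaluated at a few load factors. The sketch writes the load factor n/m as lam; the function names are hypothetical.

```python
def probes_unsuccessful(lam: float) -> float:
    """Expected probes for an insertion or unsuccessful search."""
    return 0.5 * (1 + 1 / (1 - lam) ** 2)

def probes_successful(lam: float) -> float:
    """Expected probes for a successful search."""
    return 0.5 * (1 + 1 / (1 - lam))

for lam in (0.25, 0.5, 0.75, 0.9):
    print(f"{lam:.2f}  {probes_unsuccessful(lam):6.2f}  {probes_successful(lam):5.2f}")
```

At a load factor of 0.5 the formulas give 2.5 and 1.5 probes; at 0.9 they give about 50.5 and 5.5, which is why a linear probing table should not be allowed to get nearly full.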
Deletions
Other Collision Resolution Strategies
Quadratic probing is a collision resolution strategy in which i^2 is
the increment, where i is the number of collisions encountered.
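
A minimal sketch of quadratic probing, assuming the increment is measured from the home cell, so the i-th attempt tries (home + i*i) mod m. The table contents and function name are made up for illustration.

```python
def quadratic_probe_insert(table, key, home):
    """Try cells home, home+1, home+4, home+9, ... (mod table size)."""
    m = len(table)
    for i in range(m):
        cell = (home + i * i) % m
        if table[cell] is None:
            table[cell] = key
            return cell
    # Unlike linear probing, the sequence need not visit every cell.
    raise RuntimeError("no empty cell found on the probe sequence")

table = [None] * 11
for occupied in (3, 4, 7):         # hypothetical occupied cells
    table[occupied] = "X"

print(quadratic_probe_insert(table, "new", home=3))
```

The probes visit cells 3, 4 and 7 (all occupied) and then (3 + 9) mod 11 = 1, where the item is placed. Because the sequence can miss cells, the usual advice is to keep the table size prime and the table less than half full.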
Double hashing uses a second hash function h2 to determine the
probe increment: after i collisions the cell tried is
(h(k) + i * h2(k)) mod m, where m is the table size.
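
A sketch of double hashing under stated assumptions: the first hash is key mod m, and the second is the commonly used h2(key) = R - (key mod R) with R a prime smaller than the table size. The values m = 11 and R = 7 and the keys are hypothetical.

```python
def h1(key: int, m: int) -> int:
    return key % m

def h2(key: int, r: int = 7) -> int:
    """Second hash: always in 1..r, so the step is never zero."""
    return r - (key % r)

def double_hash_insert(table, key):
    """The i-th attempt tries (h1(key) + i * h2(key)) mod m."""
    m = len(table)
    step = h2(key)
    for i in range(m):
        cell = (h1(key, m) + i * step) % m
        if table[cell] is None:
            table[cell] = key
            return cell
    raise RuntimeError("no empty cell found on the probe sequence")

table = [None] * 11
print([double_hash_insert(table, k) for k in (14, 25, 36)])
```

All three keys hash to cell 3, but their step sizes differ (25 steps by 3, 36 steps by 6), so they end up in cells 3, 6 and 9 rather than piling up next to each other.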
Rehashing is used to rebuild the hash table when it gets too full.
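
One common way to rehash, sketched here as an assumption rather than the text's exact procedure: allocate a new table whose size is the first prime at least twice the old size, then reinsert every key with the new hash function. Linear probing, the helper names and the example keys are all illustrative choices.

```python
def next_prime(n: int) -> int:
    """Smallest prime >= n (trial division is fine for table sizes)."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True
    while not is_prime(n):
        n += 1
    return n

def insert(table, key):
    """Linear probing insert; assumes the table has at least one empty cell."""
    m = len(table)
    cell = key % m
    while table[cell] is not None:
        cell = (cell + 1) % m
    table[cell] = key

def rehash(table):
    """Build a table about twice as large and reinsert every stored key."""
    new_table = [None] * next_prime(2 * len(table))
    for key in table:
        if key is not None:
            insert(new_table, key)
    return new_table

old = [None] * 7
for k in (13, 15, 6, 24, 23):
    insert(old, k)
new = rehash(old)
print(len(new), new)
```

The old table holds 5 keys in 7 cells; after rehashing the same keys sit in 17 cells, so the load factor drops from about 0.71 to about 0.29. Every key has to be reinserted because its cell depends on the table size.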
Comparison and Efficiency of Different Methods
Advantages and disadvantages of open hashing
operation   BST                    Hash table   Sorted array
insert      O(lg n) average case   O(1)         O(n)
delete      O(lg n) average case   O(1)         O(n)
find        O(lg n) average case   O(1)         O(lg n)
find min    O(lg n)                O(n)         O(1)
find max
Extendible Hashing
Basic idea: extendible hashing allows the table to grow and
shrink while keeping access times bounded.
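
A compact sketch of the idea, under several assumptions that are not taken from the text: keys are bit strings of equal length, the directory is indexed by the leading bits of the key, each bucket holds at most bucket_size keys, and only insertion and lookup are shown (no shrinking). The class names and the example keys other than 10000 are hypothetical.

```python
class Bucket:
    def __init__(self, depth: int):
        self.depth = depth   # number of leading bits shared by all keys in this bucket
        self.keys = []

class ExtendibleHash:
    def __init__(self, bucket_size: int = 2):
        self.bucket_size = bucket_size
        self.global_depth = 0
        self.directory = [Bucket(0)]          # always 2**global_depth entries

    def _index(self, key: str) -> int:
        """Directory slot = the first global_depth bits of the key."""
        return int(key[:self.global_depth], 2) if self.global_depth else 0

    def find(self, key: str) -> bool:
        return key in self.directory[self._index(key)].keys

    def insert(self, key: str) -> None:
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.keys:
                return                        # duplicates are ignored
            if len(bucket.keys) < self.bucket_size:
                bucket.keys.append(key)
                return
            self._split(bucket)               # make room, then retry

    def _split(self, bucket: Bucket) -> None:
        if bucket.depth == self.global_depth:
            # Double the directory: old slot i becomes slots 2i and 2i+1.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.global_depth += 1
        d = bucket.depth + 1
        halves = (Bucket(d), Bucket(d))
        for k in bucket.keys:                 # redistribute on bit number d
            halves[int(k[d - 1])].keys.append(k)
        shift = self.global_depth - d
        for i, b in enumerate(self.directory):
            if b is bucket:
                self.directory[i] = halves[(i >> shift) & 1]

t = ExtendibleHash(bucket_size=2)
for key in ("00100", "01010", "10000", "00011"):
    t.insert(key)
print(t.global_depth, [b.keys for b in t.directory])
```

In this run the directory doubles twice and ends with four slots; slots 2 and 3 share the single bucket that holds 10000. A lookup reads one directory entry and one bucket, which is what keeps access time bounded as the table grows.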
How it works:
Now, suppose that we try to insert the string 10000.
The resulting table
Deletions: