Slides for Lab10

Hash tables
• Definition: A data structure that uses a hash function to map each key to the index of an array element.
[Figure: keys k1 to k5 mapped to elements of an array]
Some properties of a hash table
• Size of the hash table (an example will be shown.)
• Hash function: maps each key to the index of an array element. (To be continued…)
• Multiplication Hash
• Division Hash
• Input to build a hash table: an array of keys to store in the hash table
• int[] input = {1, 2, 3, 4, 5, 6, 7, 8};
[Figure: the keys 1 to 8 stored in linked lists of two keys each ("/" marks the end of a list)]
1 -> 2 -> /
3 -> 4 -> /
5 -> 6 -> /
7 -> 8 -> /
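Example
• Hash table size is 10
[Figure: a hash table of size 10 storing the keys 20, 110, 103, 13, 10, 69, and 53 in linked lists; "/" marks the end of a list]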
Division Hash
• h1(k) = k mod m
• Returns the index of the array element
• k is the key
• m is the size of the hash table.
• Good values of m: the prime number that is smaller than and closest to the size of the input. See Table 1.
• The Java syntax for mod is %.

Table 1. Good values of m
input size    m value
500           499
1000          997
2000          1999
4000          3989
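The division hash can be sketched in Java as follows. This is a minimal sketch, not the lab's reference solution: the class name DivisionHash and the choice of Math.floorMod are assumptions made for illustration. For non-negative keys, k % m gives the same result; Math.floorMod only differs for negative keys, where % can return a negative index.

```java
// Minimal sketch of the division hash h1(k) = k mod m.
// The class and method names are hypothetical, chosen for this example.
class DivisionHash {
    // Returns the array index for key k in a table of size m.
    // Math.floorMod keeps the result in [0, m) even when k is negative.
    static int h1(int k, int m) {
        return Math.floorMod(k, m);
    }

    public static void main(String[] args) {
        int m = 10; // table size used in the example slide
        int[] keys = {20, 110, 103, 13, 10, 69, 53};
        for (int k : keys) {
            System.out.println(k + " -> index " + h1(k, m));
        }
    }
}
```

With m = 10, keys that share a last digit, such as 103, 13, and 53, all map to the same index, which is one reason a prime m is preferred.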
Multiplication hash
• h2(k) = floor(m (kA mod 1))
• "kA mod 1" is the fractional part of kA
• m is the size of the hash table
• Good values of m: the prime number that is smaller than and closest to the size of the input. See Table 1.
• k is the key
• A = 0.61803 (from (sqrt(5) - 1)/2)

Hint: Use the decimal value of A in your program; it may reduce bugs.
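The multiplication hash can be sketched in Java in the same way. This is an illustrative sketch that assumes non-negative keys; the class name MultiplicationHash is hypothetical. It uses the decimal constant 0.61803 from the hint rather than computing (sqrt(5) - 1)/2 at run time.

```java
// Minimal sketch of the multiplication hash h2(k) = floor(m * (k*A mod 1)).
// Assumes k >= 0; names are hypothetical, chosen for this example.
class MultiplicationHash {
    static final double A = 0.61803; // decimal constant given on the slide

    // Returns the array index for key k in a table of size m.
    static int h2(int k, int m) {
        double fraction = (k * A) % 1.0;       // fractional part of k*A
        return (int) Math.floor(m * fraction); // scale into [0, m)
    }

    public static void main(String[] args) {
        int m = 10; // small table size, for illustration only
        for (int k = 1; k <= 8; k++) {
            System.out.println(k + " -> index " + h2(k, m));
        }
    }
}
```

For example, with k = 1 and m = 10 the fractional part is 0.61803, so the index is floor(6.1803) = 6.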
Collisions
• When hashing a key, if a collision happens, the new key is stored in the linked list at that location
• Number of collisions of a location = Number of elements in that location - 1
20 -> 110 -> /          # of collisions = 2 - 1 = 1
103 -> 13 -> 53 -> /    # of collisions = 3 - 1 = 2
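Storing colliding keys in a linked list can be sketched as below. The class ChainedTable and its methods are hypothetical names, and the division hash is used only as an example hash function. Treating an empty location as having 0 collisions (rather than -1) is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch of a hash table that resolves collisions with linked lists.
class ChainedTable {
    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedTable(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>()); // one (initially empty) list per location
        }
    }

    // Hashes the key with the division hash and appends it to that location's list.
    void insert(int key) {
        int index = Math.floorMod(key, slots.size());
        slots.get(index).add(key);
    }

    // Number of collisions = number of elements - 1 (assumed 0 for an empty location).
    int collisionsAt(int index) {
        return Math.max(0, slots.get(index).size() - 1);
    }
}
```

Inserting 20, 110, 103, 13, and 53 into a table of size 10 gives 1 collision at index 0 and 2 collisions at index 3, matching the two lists shown above.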
"the 3 metrics"
• maxCollisions: Maximum number of collisions over all locations in a hash table
• minCollisions: Minimum number of collisions over all locations in a hash table
• totalCollisions: Total number of collisions over all locations in a hash table
• Example:
20 -> 110 -> /          # of collisions = 1
103 -> 13 -> 53 -> /    # of collisions = 2
105 -> 15 -> /          # of collisions = 1
• maxCollisions = 2
• minCollisions = 1
• (** Note that minCollisions will be at least 1 if any location has a collision, even if other locations have 0 collisions. If there are no collisions at all, return 0.)
• totalCollisions = 4
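One way to compute the three metrics from the list lengths is sketched below. The class CollisionMetrics, the method name, and the choice to pass list lengths as an int array are assumptions for illustration. The minimum is taken only over locations that actually have a collision, following the note above.

```java
// Sketch: compute {maxCollisions, minCollisions, totalCollisions}
// from the number of keys stored at each location.
class CollisionMetrics {
    static int[] metrics(int[] listLengths) {
        int max = 0;
        int min = 0; // stays 0 only if no location has a collision
        int total = 0;
        for (int length : listLengths) {
            int collisions = Math.max(0, length - 1); // empty location counts as 0
            total += collisions;
            if (collisions > max) {
                max = collisions;
            }
            // Ignore locations with 0 collisions when looking for the minimum.
            if (collisions > 0 && (min == 0 || collisions < min)) {
                min = collisions;
            }
        }
        return new int[] {max, min, total};
    }
}
```

For the example above, a table of size 10 with list lengths {2, 0, 0, 3, 0, 2, 0, 0, 0, 0} gives maxCollisions = 2, minCollisions = 1, and totalCollisions = 4.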
Discussion
• Why metrics?
• They tell us which hash is better, judged by its collisions
• Why 3 metrics? Why not just measure totalCollisions?
• Let’s see an example.
Which hash table is better?
[Figure: Hash table 1 (totalCollisions = 4) and Hash table 2 (totalCollisions = 4); the collisions are spread evenly over the locations in hash table 1 but not in hash table 2]
• We want not only fewer collisions, but also collisions that are distributed evenly across the hash table. That is why hash table 1 is better than hash table 2, even though totalCollisions is the same.
• In this lab, you implement two hash functions, division and multiplication, and use the collision metrics to demonstrate which hash is better.
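A comparison of the two hash functions on the same input might be set up as in the following sketch. Everything here is illustrative rather than the lab's required design: the class HashComparison, the use of IntBinaryOperator to pass a hash function, and the generated input keys are all assumptions, and non-negative keys are assumed throughout.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

// Sketch: hash the same keys with both functions and compare the metrics.
class HashComparison {
    static final double A = 0.61803; // decimal constant from the slides

    static int division(int k, int m) {
        return k % m; // assumes k >= 0
    }

    static int multiplication(int k, int m) {
        return (int) Math.floor(m * ((k * A) % 1.0)); // assumes k >= 0
    }

    // Returns {maxCollisions, minCollisions, totalCollisions} for the given hash.
    static int[] metrics(int[] keys, int m, IntBinaryOperator hash) {
        int[] count = new int[m]; // count[i] = number of keys at location i
        for (int k : keys) {
            count[hash.applyAsInt(k, m)]++;
        }
        int max = 0, min = 0, total = 0;
        for (int c : count) {
            int collisions = Math.max(0, c - 1);
            total += collisions;
            max = Math.max(max, collisions);
            if (collisions > 0 && (min == 0 || collisions < min)) {
                min = collisions;
            }
        }
        return new int[] {max, min, total};
    }

    public static void main(String[] args) {
        int[] keys = new int[500]; // hypothetical input: multiples of 7
        for (int i = 0; i < keys.length; i++) {
            keys[i] = 7 * i;
        }
        int m = 499; // value from Table 1 for an input size of 500
        System.out.println("division:       "
                + Arrays.toString(metrics(keys, m, HashComparison::division)));
        System.out.println("multiplication: "
                + Arrays.toString(metrics(keys, m, HashComparison::multiplication)));
    }
}
```

Printing all three numbers side by side makes it possible to see a case where totalCollisions is equal but maxCollisions differs, which is the situation the discussion above describes.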