Previous Lecture Revision: Searching and Hashing

• Searching :
– The main purpose of a computer is to store and retrieve data
– Locating a record is the most time-consuming operation
• Methods:
– Linear Search (compare the target with each element in turn)
» Time complexity is O(n)
– Binary Search (split the sorted group into two and search in one half)
» Time complexity is O(log₂ n)
– Hashing (direct access), used in DBMS / file systems
» Time complexity is O(1) on average
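The first two methods can be sketched in C++ as follows. This is a minimal illustration, not code from the lecture: the function names linearSearch and binarySearch are assumptions, and binary search assumes the data is already sorted in ascending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Linear search: compare the target with each element in turn, O(n).
// Returns the position of target, or -1 if it is absent.
int linearSearch(const std::vector<int>& data, int target) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == target) return static_cast<int>(i);
    }
    return -1;
}

// Binary search: halve the search range at every step, O(log2 n).
// Precondition (assumption): data is sorted in ascending order.
int binarySearch(const std::vector<int>& data, int target) {
    assert(std::is_sorted(data.begin(), data.end()));
    int low = 0;
    int high = static_cast<int>(data.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (data[mid] == target) return mid;
        if (data[mid] < target) low = mid + 1;   // continue in the upper half
        else high = mid - 1;                     // continue in the lower half
    }
    return -1;
}
```

For the sorted values 10, 20, 30, 40, 50, both functions return position 3 for the target 40 and -1 for the target 35.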
Hashing Techniques
• What is Hashing? : The index where a key should be stored is derived from the key (INPUT) itself.
» Advantage : When reading back, the record can be READ IMMEDIATELY, without searching
• Techniques:
– Identity - The key itself becomes the index
» Disadvantage : Memory would have to be LIMITLESS
– Truncation - The key is truncated to its last digit, which is used as the index
» EG : Key 123456, so the index is 6
– Folding - The key is split into parts, and addition/multiplication/division is done on the parts to obtain the index
» EG : 123456
12 + 34 + 56 = 102, so the index is 2
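The three techniques above could be written roughly as below. This is a sketch under assumptions: the function names are hypothetical, keys are taken to be non-negative, and folding is shown only in its additive form with two-digit groups, reduced to a table of tableSize slots (10 in the slide's example).

```cpp
#include <cassert>

// Identity: the key itself is the index (needs a table as large as the key range).
int identityHash(int key) {
    return key;
}

// Truncation: keep only the last decimal digit of the key.
int truncationHash(int key) {
    assert(key >= 0);
    return key % 10;
}

// Folding (additive): split the key into two-digit groups, add them,
// then reduce the sum to the table size.
int foldingHash(int key, int tableSize) {
    assert(key >= 0 && tableSize > 0);
    int sum = 0;
    while (key > 0) {
        sum += key % 100;   // take the lowest two digits
        key /= 100;         // drop them
    }
    return sum % tableSize;
}
```

With the key 123456, truncationHash gives 6, and foldingHash with a table size of 10 gives 56 + 34 + 12 = 102, reduced to 2.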
Hashing Techniques
• Modular Arithmetic : Use some arithmetic calculation on the key to obtain the index
– KEY % size EG : 123 % 10 = 3
– CAN WE USE 123 / 10 ?
• What is a Collision ?
• If the hashing function gives the SAME INDEX for TWO different keys, it is called a collision
» EG : 123 % 10 index will be 3
» 223 % 10 index will again be 3
• We cannot store two values in the same location
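A small sketch of the modular hash and of a collision check follows. The names modHash and collides are assumptions for illustration, and keys are assumed non-negative. One observation on the division question, offered as a remark and not as the lecture's own answer: integer division 123 / 10 gives 12, every key from 120 to 129 gives the same 12, and the result is not bounded by the table size, whereas KEY % size always lies between 0 and size - 1.

```cpp
#include <cassert>

// Modular hash: for a non-negative key the result is always in 0 .. size-1.
int modHash(int key, int size) {
    assert(key >= 0 && size > 0);
    return key % size;
}

// Two different keys collide when they hash to the same index.
bool collides(int keyA, int keyB, int size) {
    return keyA != keyB && modHash(keyA, size) == modHash(keyB, size);
}
```

Here modHash(123, 10) and modHash(223, 10) both give 3, so collides(123, 223, 10) is true.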
RESOLVING COLLISIONS
How to resolve collisions
• Three methods are available:
1. Resolving collisions by REPLACEMENT
2. Resolving collisions by OPEN ADDRESSING
» Linear probing
» Quadratic probing
3. Resolving collisions by CHAINING
Resolving collisions by replacement
• Working : We simply replace the old KEY with the new KEY when there is a collision
» The old key is simply lost
» Or
» It is combined with the new key.
» EG : 121 122 221 124 125 126 224 index = key % 10
Index :    0    1    2    3    4    5    6    7    8    9
Key   :    -  221  122    -  224  125  126    -    -    -
(121 is replaced by 221, and 124 is replaced by 224)
• When is it used ?
– Very rarely used
– Used only when the data are sets. Using the UNION operation, the old data is combined with the new data.
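The "old key is simply lost" variant can be sketched as below. The names kSize, kEmpty and buildByReplacement are assumptions, the table size of 10 follows the example's key % 10, and -1 is used as an "empty" marker on the assumption that keys are non-negative. The UNION variant for set data is not shown.

```cpp
#include <array>
#include <cassert>
#include <vector>

constexpr int kSize = 10;    // table size used in the example (key % 10)
constexpr int kEmpty = -1;   // marks an empty slot (assumes keys are non-negative)

// Stores each key at key % kSize; a later key overwrites an earlier one.
std::array<int, kSize> buildByReplacement(const std::vector<int>& keys) {
    std::array<int, kSize> table;
    table.fill(kEmpty);
    for (int key : keys) {
        assert(key >= 0);
        table[key % kSize] = key;   // on collision the old key is lost
    }
    return table;
}
```

Feeding in 121, 122, 221, 124, 125, 126, 224 leaves 221 at index 1 and 224 at index 4; the keys 121 and 124 are no longer in the table.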
Resolving collisions by Open addressing
• We resolve the collision by putting the new key in some other empty location in the table.
– Two methods are used to locate the empty location
1. Linear Probing
2. Quadratic Probing
• Linear Probing :
– Start at the point where the collision occurred and do a sequential search through the table for an empty location.
• Improvement :
– Circular Probing : After reaching the end of the table, continue probing from the first location
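Example - Linear Probing
Keys = 6, Table Size = 7, Function = key mod 7
Key   :  12  15  21  36  84  96
Index :   5   1   0   1   0   5
SOLUTION
Index :   0   1   2   3   4   5   6
Key   :  21  15  36  84   -  12  96
(36 collides at 1 and goes to 2; 84 collides at 0, 1 and 2 and goes to 3; 96 collides at 5 and goes to 6)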
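Linear probing with the circular improvement might be coded as follows. This is a sketch, not the lecture's implementation: insertLinear, buildLinear and the marker kFree are hypothetical names, and -1 marks an empty slot on the assumption that keys are non-negative.

```cpp
#include <cassert>
#include <vector>

constexpr int kFree = -1;   // marks an empty slot (assumes keys are non-negative)

// Inserts key at key % size, probing forward one slot at a time and
// wrapping around at the end. Returns the slot used, or -1 if the table is full.
int insertLinear(std::vector<int>& table, int key) {
    assert(key >= 0 && !table.empty());
    const int size = static_cast<int>(table.size());
    const int home = key % size;
    for (int step = 0; step < size; ++step) {
        const int slot = (home + step) % size;   // circular probing
        if (table[slot] == kFree) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;
}

// Builds a table of the given size by inserting the keys in order.
std::vector<int> buildLinear(const std::vector<int>& keys, int size) {
    std::vector<int> table(size, kFree);
    for (int key : keys) insertLinear(table, key);
    return table;
}
```

Calling buildLinear with the keys 12, 15, 21, 36, 84, 96 and size 7 reproduces the solution table of the example, with index 4 left empty.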
Resolving collisions by Open addressing
• Quadratic Probing:
If there is a collision at the address 'h', this method probes the table at locations
(h + I²) % hashsize for I = 1, 2, …
Disadvantage : It does not probe all locations in the table.
Example - Quadratic Probing
Keys = 6, Table Size = 7, Function = key mod 7
index = (h + I²) % 7, I = 1, 2, 3, …
Key   :  12  15  21  36  84  96
Index :   5   1   0   1   0   5
SOLUTION
Index :   0   1   2   3   4   5   6
Key   :  21  15  36   -  84  12  96
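A possible C++ version of quadratic probing is sketched below. The names insertQuadratic, buildQuadratic and kVacant are assumptions, -1 marks an empty slot (keys assumed non-negative), and the number of probes is capped at the table size, which is a choice made here and not something the slide specifies.

```cpp
#include <cassert>
#include <vector>

constexpr int kVacant = -1;   // marks an empty slot (assumes keys are non-negative)

// Probes (h + i*i) % size for i = 0, 1, 2, ... where h = key % size.
// Returns the slot used, or -1 if none of the probed slots is free.
int insertQuadratic(std::vector<int>& table, int key) {
    assert(key >= 0 && !table.empty());
    const int size = static_cast<int>(table.size());
    const int home = key % size;
    for (int i = 0; i < size; ++i) {
        const int slot = (home + i * i) % size;
        if (table[slot] == kVacant) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;   // free slots may still exist, but they were never probed
}

// Builds a table of the given size by inserting the keys in order.
std::vector<int> buildQuadratic(const std::vector<int>& keys, int size) {
    std::vector<int> table(size, kVacant);
    for (int key : keys) insertQuadratic(table, key);
    return table;
}
```

With the example keys and size 7, 84 lands at index 4 and index 3 stays empty. The disadvantage is also visible: for size 7 the offsets i*i % 7 only take the values 0, 1, 2 and 4, so a key whose home slot is 0 can never reach slots 3, 5 or 6, and the insert fails if slots 0, 1, 2 and 4 are occupied.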
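Example
#include <iostream>
using namespace std;

#define MAX 20

// Maps the keys 10, 20, ..., 10*MAX to the indices 0 .. MAX-1.
// Returns -1 when the key falls outside the table.
int HASH(int key)
{ int index;
  index = key/10-1;
  if (index>=0 && index<MAX) return index;
  else return -1;
}

int main( )
{ int table[MAX],index,I,target;
  for(I=1;I<=MAX;I++)
    table[I-1]=10*I;          // table holds 10, 20, ..., 200
  cin>>target;
  index = HASH(target);       // go directly to the slot, no searching
  if (index!=-1)
  { if (table[index] == target)
      cout<<"Found at "<<index;
    else cout<<"Target Not found";}
  else cout<<"Target Not found";
  return 0;
}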
Resolving collisions by Chaining
Key   :  12  15  21  36  84  96
Index :   5   1   0   1   0   5
Index   Chain
0       21 -> 84
1       15 -> 36
2
3
4
5       12 -> 96
6
• Implemented using a Linked List
• Whenever a collision occurs, a new node is created, the new value is stored in it, and it is linked to the old value.
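Chaining could be sketched with one linked list per table index, as below. This uses std::list from the standard library in place of a hand-written linked list; the alias Buckets and the functions insertChained and containsChained are hypothetical names, and keys are assumed non-negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// One linked list (chain) per table index.
using Buckets = std::vector<std::list<int>>;

// Appends the key to the chain at key % size, so colliding keys share a chain.
void insertChained(Buckets& buckets, int key) {
    assert(key >= 0 && !buckets.empty());
    const std::size_t index = static_cast<std::size_t>(key) % buckets.size();
    buckets[index].push_back(key);   // new node linked after the old values
}

// Looks only in the chain the key hashes to.
bool containsChained(const Buckets& buckets, int key) {
    assert(key >= 0 && !buckets.empty());
    const std::size_t index = static_cast<std::size_t>(key) % buckets.size();
    const std::list<int>& chain = buckets[index];
    return std::find(chain.begin(), chain.end(), key) != chain.end();
}
```

Creating Buckets with 7 entries and inserting 12, 15, 21, 36, 84, 96 gives the chains 21 -> 84 at index 0, 15 -> 36 at index 1 and 12 -> 96 at index 5. No key is lost and no other slot is taken, at the cost of walking a chain during lookup.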
Exercise
Using the modulo-division method and linear probing, store the keys below in an array of 19 elements:
224562, 137456, 214562, 140145, 214576, 162145, 144467, 199645, 234534
Index is :
224562 % 19 = 1, 137456 % 19 = 10, 214562 % 19 = 14, 140145 % 19 = 1 (collision, goes to 2),
214576 % 19 = 9, 162145 % 19 = 18, 144467 % 19 = 10 (collision, goes to 11), 199645 % 19 = 12,
234534 % 19 = 17
Index :       1       2       9      10      11      12      14      17      18
Key   :  224562  140145  214576  137456  144467  199645  214562  234534  162145
(all other locations remain empty)
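The exercise can be checked mechanically with a short routine like the one below. It is only a sketch: assignSlots is a hypothetical helper that reports the slot each key ends up in, using key % size and linear probing with wrap-around.

```cpp
#include <cassert>
#include <vector>

// For each key, in order, returns the slot it is stored in
// (or -1 if the table is already full).
std::vector<int> assignSlots(const std::vector<long>& keys, int size) {
    assert(size > 0);
    std::vector<bool> used(size, false);
    std::vector<int> slots;
    for (long key : keys) {
        assert(key >= 0);
        int slot = static_cast<int>(key % size);
        int probes = 0;
        while (probes < size && used[slot]) {
            slot = (slot + 1) % size;   // linear probing, wrapping around
            ++probes;
        }
        if (probes == size) {           // every slot was occupied
            slots.push_back(-1);
            continue;
        }
        used[slot] = true;
        slots.push_back(slot);
    }
    return slots;
}
```

For the nine keys of the exercise and size 19, the returned slots are 1, 10, 14, 2, 9, 18, 11, 12, 17, in the order the keys are listed.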