Previous Lecture Revision

Searching: Hashing
• The main purpose of a computer is to store and retrieve data.
• Locating a record is the most time-consuming action.
• Methods:
  – Linear search (compare the target with each element in turn)
    » Time complexity is O(n)
  – Binary search (split the sorted group in two and search in one half)
    » Time complexity is O(log₂ n)
  – Hashing (direct access), used in DBMS and file systems
    » Time complexity is O(1) on average

Hashing Techniques
• What is hashing? The index at which a key should be stored is derived from the key (the input) itself.
  » Advantage: when reading back, the record can be read immediately.
• Techniques:
  – Identity: the key itself becomes the index
    » Disadvantage: memory would have to be limitless
  – Truncation: part of the key is discarded and the remaining digits are used as the index
    » E.g. key 123456, keeping only the last digit, gives index 6
  – Folding: the key is split into parts, and addition, multiplication or division is applied to them to obtain the index
    » E.g. 123456 → 12 + 34 + 56 = 102, so the index is 2

Hashing Techniques (continued)
• Modular arithmetic: use an arithmetic calculation on the key to obtain the index
  – key % size, e.g. 123 % 10 = 3
  – Can we use 123 / 10?
• What is a collision?
  – If the hash function gives the same index for two keys, it is called a collision.
    » E.g. 123 % 10 gives index 3
    » 223 % 10 gives index 3 again
  – We cannot store two values in the same location.

RESOLVING COLLISIONS

How to resolve collisions
• Three methods are available:
  1. Resolving collisions by replacement
  2. Resolving collisions by open addressing
     » Linear probing
     » Quadratic probing
  3. Resolving collisions by chaining

Resolving collisions by replacement
• Working: when there is a collision, we simply replace the old key with the new key.
  » The old key is simply lost,
  » or it is combined with the new key.
  » E.g. keys 121, 122, 221, 124, 125, 126, 224 with index = key % 10:
      Index 1: 221 (replaces 121)
      Index 2: 122
      Index 4: 224 (replaces 124)
      Index 5: 125
      Index 6: 126
• When is it used?
  – Very rarely.
  – Only when the data are sets: the old data is combined with the new data using a union operation.
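The hashing techniques and the replacement strategy described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the function names (hashTruncate, hashFold, hashMod, insertReplace) and the EMPTY marker are hypothetical and do not come from the slides, keys are assumed to be non-negative integers, and folding is assumed to use two-digit groups taken from the right, with the last digit of the sum as the index.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // hypothetical marker for an unused slot

// Truncation (as in the slide example): keep only the last decimal digit.
int hashTruncate(int key) {
    return key % 10;
}

// Folding: add the two-digit groups of the key and keep the last digit of
// the sum, e.g. 123456 -> 12 + 34 + 56 = 102 -> index 2.
int hashFold(int key) {
    int sum = 0;
    while (key > 0) {
        sum += key % 100;  // take the lowest two digits
        key /= 100;        // and drop them from the key
    }
    return sum % 10;
}

// Modular arithmetic: index = key % size.
int hashMod(int key, int size) {
    assert(size > 0 && key >= 0);
    return key % size;
}

// Collision resolution by replacement: the new key overwrites whatever is
// stored at its index. Returns the displaced key, or EMPTY if the slot was free.
int insertReplace(std::vector<int>& table, int key) {
    int index = hashMod(key, static_cast<int>(table.size()));
    int old = table[index];
    table[index] = key;
    return old;
}
```

Inserting 121 and then 221 into a 10-slot table with insertReplace leaves 221 at index 1 and returns 121 as the displaced key, which the caller can discard or merge, depending on the application.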
Resolving collisions by open addressing
• We resolve the collision by putting the new key in some other empty location in the table.
  – Two methods are used to locate the empty location:
    1. Linear probing
    2. Quadratic probing
• Linear probing:
  – Start at the point where the collision occurred and do a sequential search through the table for an empty location.
• Improvement:
  – Circular probing: after reaching the end of the table, continue probing from the first location.

Example: linear probing
• Keys = 6, table size = 7, function = key mod 7

    Key:    12  15  21  36  84  96
    Index:   5   1   0   1   0   5

• Solution (36, 84 and 96 collide and move to the next free location):

    Index:   0   1   2   3   4        5   6
    Key:    21  15  36  84  (empty)  12  96

Resolving collisions by open addressing
• Quadratic probing: if there is a collision at address h, this method probes the table at locations (h + i²) % hashsize for i = 1, 2, …
  » Disadvantage: it does not probe all locations in the table.

Example: quadratic probing
• Keys = 6, table size = 7, function = key mod 7, index = (h + i²) % 7 for i = 1, 2, 3, …

    Key:    12  15  21  36  84  96
    Index:   5   1   0   1   0   5

• Solution (36 moves to 1 + 1 = 2, 84 moves to 0 + 4 = 4, 96 moves to 5 + 1 = 6):

    Index:   0   1   2   3        4   5   6
    Key:    21  15  36  (empty)  84  12  96

Example: searching a table through a hash function

    #include <iostream>
    using namespace std;

    #define MAX 20

    // Maps the keys 10, 20, ..., 200 to the indexes 0..19.
    // Returns -1 when the computed index falls outside the table.
    int HASH(int key) {
        int index = key / 10 - 1;
        if (index >= 0 && index < MAX) return index;
        return -1;
    }

    int main() {
        int table[MAX], index, target;
        // Fill the table with 10, 20, ..., 200.
        for (int i = 1; i <= MAX; i++) table[i - 1] = 10 * i;
        cin >> target;
        index = HASH(target);
        // The index must be valid and the slot must hold the target itself.
        if (index != -1 && table[index] == target)
            cout << "Found at " << index;
        else
            cout << "Target not found";
        return 0;
    }

Resolving collisions by chaining
• Keys and their indexes (function = key mod 7):

    Key:    12  15  21  36  84  96
    Index:   5   1   0   1   0   5

• Resulting table:

    Index 0: 21 → 84
    Index 1: 15 → 36
    Index 2: (empty)
    Index 3: (empty)
    Index 4: (empty)
    Index 5: 12 → 96
    Index 6: (empty)

• Implemented using a linked list.
• Whenever a collision occurs, a new node is created, the new value is stored in it, and it is linked to the old value.
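The three worked examples above (linear probing, quadratic probing and chaining) can be reproduced with the sketch below. It is not the lecturer's code: the names Probe, insertOpen, insertChain and EMPTY are hypothetical, keys are assumed to be non-negative, and the probe loop gives up after as many attempts as there are slots.

```cpp
#include <cassert>
#include <list>
#include <vector>

const int EMPTY = -1;  // hypothetical marker for an unused slot

enum class Probe { Linear, Quadratic };

// Open addressing with h = key % size.
// Linear probing tries h, h+1, h+2, ... and wraps around (circular probing).
// Quadratic probing tries h, h+1, h+4, h+9, ... modulo the table size.
// Returns the slot used, or -1 if no free slot was found in `size` attempts.
int insertOpen(std::vector<int>& table, int key, Probe mode) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key >= 0);
    const int h = key % size;
    for (int i = 0; i < size; ++i) {
        const int offset = (mode == Probe::Linear) ? i : i * i;
        const int index = (h + offset) % size;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}

// Chaining: every slot holds a linked list of all keys that hash to it,
// so a collision simply appends a new node to that list.
void insertChain(std::vector<std::list<int>>& table, int key) {
    assert(!table.empty() && key >= 0);
    const int index = key % static_cast<int>(table.size());
    table[index].push_back(key);
}
```

With a 7-slot table and the keys 12, 15, 21, 36, 84, 96 inserted in that order, insertOpen places 84 at index 3 under linear probing and at index 4 under quadratic probing, matching the solutions above. Note that quadratic probing can return -1 even when free slots remain, which is the disadvantage mentioned in the slides.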
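Exercise
• Using the modulo-division method and linear probing, store the keys below in an array of 19 elements:
  224562, 137456, 214562, 140145, 214576, 162145, 144467, 199645, 234534
• The index of each key is:
  – 224562 % 19 = 1
  – 137456 % 19 = 10
  – 214562 % 19 = 14
  – 140145 % 19 = 1 (collision, stored at 2)
  – 214576 % 19 = 9
  – 162145 % 19 = 18
  – 144467 % 19 = 10 (collision, stored at 11)
  – 199645 % 19 = 12
  – 234534 % 19 = 17
• Resulting table (all other locations are empty):

    Index  1: 224562
    Index  2: 140145
    Index  9: 214576
    Index 10: 137456
    Index 11: 144467
    Index 12: 199645
    Index 14: 214562
    Index 17: 234534
    Index 18: 162145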
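The exercise can be checked with a short program. The sketch below is one possible way to do it, not a reference solution: insertLinear, findLinear, buildExerciseTable and EMPTY are hypothetical names, and the lookup assumes that keys are never deleted, so an empty slot ends the search.

```cpp
#include <cassert>
#include <vector>

const int EMPTY = -1;  // hypothetical marker for an unused slot

// Store the key at key % size, stepping forward circularly past occupied
// slots. Returns the slot used, or -1 if the table is full.
int insertLinear(std::vector<int>& table, int key) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key >= 0);
    for (int i = 0; i < size; ++i) {
        const int index = (key % size + i) % size;
        if (table[index] == EMPTY) {
            table[index] = key;
            return index;
        }
    }
    return -1;
}

// Look a key up by following the same probe sequence used on insertion.
// Returns its slot, or -1 once an empty slot is reached or every slot
// has been examined.
int findLinear(const std::vector<int>& table, int key) {
    const int size = static_cast<int>(table.size());
    assert(size > 0 && key >= 0);
    for (int i = 0; i < size; ++i) {
        const int index = (key % size + i) % size;
        if (table[index] == key) return index;
        if (table[index] == EMPTY) return -1;
    }
    return -1;
}

// Build the 19-slot table for the exercise keys, in the order given.
std::vector<int> buildExerciseTable() {
    const std::vector<int> keys = {224562, 137456, 214562, 140145, 214576,
                                   162145, 144467, 199645, 234534};
    std::vector<int> table(19, EMPTY);
    for (int key : keys) insertLinear(table, key);
    return table;
}
```

On the table returned by buildExerciseTable, findLinear(table, 144467) returns 11: the key hashes to 10, finds 137456 there, and is found one slot further on.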