COURSE : DATA STRUCTURE (CS 306)
UNIT : 8 SEARCHING

<I> SEQUENTIAL SEARCH

    /* N is the number of elements in the array (defined elsewhere). */
    Boolean SequentialSearch(int array[], int key)
    {
        for (int i = 0; i < N; i++)
            if (key == array[i])
                return TRUE;
        return FALSE;
    }

Searching Time Efficiency :
On average, it takes (n+1)/2 comparisons to complete a successful search, i.e. O(n).
Assuming the key is equally likely to be at any of the n positions:

    (1 + 2 + ... + (n-1) + n) / n
    = (((1 + n) * n) / 2) / n
    = (n^2 + n) / (2n)
    = (n + 1) / 2

<II> BINARY SEARCH

The array must already be sorted in ascending order.

[Figure: binary search example, key = 18]

    Boolean BinSearch(int array[], int key)
    {
        int low = 0, high = N - 1, mid;
        while (low <= high)
        {
            /* same value as (low+high)/2, but cannot overflow */
            mid = low + (high - low) / 2;
            if (key == array[mid])
                return TRUE;
            else if (key < array[mid])
                high = mid - 1;    /* continue in the lower half */
            else
                low = mid + 1;     /* continue in the upper half */
        }
        return FALSE;
    }

Searching Time Efficiency :
On average, it takes about (log2 n) comparisons to complete a search, i.e. O(log n).

<III> TREE SEARCHING
- Binary tree ( 2-way searching )
- B-Tree ( multiway searching )

<IV> HASHING

Hashing
It is an efficient searching, insertion and deletion technique that reduces the number of comparisons to O(1) on average.
It uses a hash table to hold keys and data.
It uses a hash function h(key) to compute an index into the hash table array.

Hash Function
It is a function that transforms a key into a table index.

Criteria in selecting a Hash Function
<1> Easy and quick to compute
    A complicated function wastes time, especially when rehashing.
<2> Even distribution of keys across the range of indices
    1. Avoid a sparse table
       (only a small fraction of the positions is occupied).
    2. Minimize collisions of keys
       (several possible keys may be mapped to the same index/location).
    3. Avoid clustering
       (keys are concentrated in a few parts of the array).

Building a Hash Function
<1> Using Modular Division ( h(key) = (key computation) % M -> result in proper range )
    i.   Convert the key to an integer
    ii.  Divide by the size of the index range, M
    iii. Take the remainder
<2> Choice of Modulus
    Bad choice
    i.  Even number (M = even number)
        An even modulus makes the hash table biased:
        it maps even keys to even locations and odd keys to odd locations.
    ii. Power of 2
        Eg: M = 2^3 = 8

        Number    Binary form
        64        1 000 000
        72        1 001 000
        80        1 010 000
        88        1 011 000

        All go to address 0.
        The hash address is determined only by the last 3 bits of the key.
        If the last 3 bits are the same (here 000), the keys go to the same address.
        In practice, keys very often share a common pattern
        (same pattern -> same address -> collisions increase).
    Good choice
    i.  Prime number
        Eg: M = 7

        Number    Binary form
        49        0 110 001
        56        0 111 000
        63        0 111 111
        70        1 000 110

        All go to address 0, but the hash address is determined by all the bits of the key,
        and these keys have different bit patterns.
        In practice, keys are more likely to share a pattern than to differ in this way.
        With a prime modulus, keys that share a pattern are spread over different addresses,
        and only keys with unrelated patterns collide
        (different pattern -> same address -> collisions decrease).
        This achieves an even spreading of keys.

Collision Resolution
<1> Linear Probing (testing)
    Probe at location (h + i) % HashSize, for i = 0, 1, 2, ...
    i:  Start at the hash address h (the location where the collision occurs).
    ii: Do a sequential search through the table for an empty location.
    Drawbacks
    When the table becomes about half full, the keys tend to cluster.

    [Figure: hash table with keys clustered in adjacent locations]

<2> Quadratic Probing
    Probe at location (h + i^2) % HashSize, for i = 0, 1, 2, ...
    If there is a first (1st) collision at hash address h, go to location h+1.
    If there is a second (2nd) collision at location h+1, go to location h+4.
    This reduces clustering.

<3> Chaining
    Use a linked list to keep all the elements that hash to the same address.
    Use an array of linked lists.

    [Figure: array of list heads [0], [1], [2], [3], ..., each pointing to a linked list]

    Advantages
    1. Avoids overflow
    2. Resolves collisions
    3. Fast to insert and delete
    Drawbacks
    All the links require space; this overhead is negligible only when the records are large.
    Alternative Ways of Chaining
    1. Use a BST
    2. Use another hash table

Limitation
1. Does not support ordering operations such as finding max or min
2. Does not support sorting

<VI> Implementation Of HASHING Using CHAINING

    #include <iostream>     /* <iostream.h> is pre-standard and missing from modern compilers */
    #include <stdlib.h>
    #include <string.h>

    #define HASHSIZE 79     /* table size: a prime number */
    #define KEYSIZE 50
    #define TRUE 1
    #define FALSE -1        /* nonzero: compare results against TRUE explicitly */

    struct node
    {
        char key[30];
        int data;
        struct node *next;  /* next element in the same chain */
    };
    typedef struct node Node;
    typedef int Boolean;

    Boolean InsertHashTable(Node *[], Node *node, int loc);
    int HashFun(char*);

EXERCISE : 5
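The notes give only the declarations for the chaining implementation, so the following is a minimal sketch of how they might be completed, not the course's actual code. The body of HashFun (a multiply-by-31 character fold followed by % HASHSIZE) is an assumption, SearchHashTable is a hypothetical helper that does not appear in the notes, and HashFun takes const char* instead of char* so that string literals can be passed.

```cpp
#include <cassert>
#include <cstring>

#define HASHSIZE 79   /* prime table size, as in the notes */
#define TRUE 1
#define FALSE -1      /* as in the notes: nonzero, so test "== TRUE" explicitly */

struct node {
    char key[30];
    int data;
    struct node *next;
};
typedef struct node Node;
typedef int Boolean;

/* Assumed hash function: fold the characters of the key into an integer,
   then take the remainder modulo the table size (modular division). */
int HashFun(const char *key)
{
    unsigned long h = 0;
    for (; *key != '\0'; key++)
        h = h * 31 + (unsigned char)*key;   /* unsigned arithmetic wraps safely */
    return (int)(h % HASHSIZE);
}

/* Hypothetical helper: walk the chain at the key's hash address.
   Returns the matching node, or NULL if the key is absent. */
Node *SearchHashTable(Node *table[], const char *key)
{
    for (Node *p = table[HashFun(key)]; p != NULL; p = p->next)
        if (strcmp(p->key, key) == 0)
            return p;
    return NULL;
}

/* Insert node at the head of chain loc, where loc is expected to be
   HashFun(node->key). Returns FALSE if the key is already present. */
Boolean InsertHashTable(Node *table[], Node *node, int loc)
{
    assert(loc >= 0 && loc < HASHSIZE);
    if (SearchHashTable(table, node->key) != NULL)
        return FALSE;
    node->next = table[loc];   /* new node points at the old chain head */
    table[loc] = node;
    return TRUE;
}
```

A caller would start from an empty table (Node *table[HASHSIZE] = {0};), compute loc = HashFun(node->key) and pass it to InsertHashTable. Inserting at the head of the chain costs O(1) once the duplicate check has walked the chain, and a collision simply lengthens one chain instead of probing other table locations.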