csc120ch8 - Chu Hai College

COURSE : DATA STRUCTURE (CS 306)
UNIT : 8
SEARCHING
<I>
SEQUENTIAL SEARCH
Boolean SequentialSearch(int array[], int key)
{
    /* N is the number of elements stored in array */
    for (int i = 0; i < N; i++)
        if (key == array[i])
            return TRUE;
    return FALSE;
}
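Searching Time Efficiency :
On average, a successful search takes (n+1)/2 comparisons, so the search is O(n).
If the key is equally likely to be at any of the n positions, the average is
(1+2+...+(n-1)+n)/n
= (((1+n)*n)/2)/n
= (n^2+n)/(2n)
= (n+1)/2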
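The average above can be checked by counting comparisons directly. This is a minimal sketch, assuming the array holds distinct keys and each key is searched for exactly once. The names CountComparisons and AverageComparisons are illustrative and not part of the original notes.

```cpp
// Number of key comparisons a sequential search makes before it finds
// `key`; an unsuccessful search compares against all n elements.
int CountComparisons(const int array[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (key == array[i])
            return i + 1;   // found at position i after i+1 comparisons
    return n;               // key is absent
}

// Average number of comparisons over one successful search
// for every element of the array (assumes distinct elements, n > 0).
double AverageComparisons(const int array[], int n)
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += CountComparisons(array, n, array[i]);
    return (double)total / n;
}
```

For an array of 5 distinct keys the total is 1+2+3+4+5 = 15 comparisons, giving an average of 3 = (5+1)/2.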
<II>
BINARY SEARCH
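Boolean BinSearch(int array[], int key)    /* Eg : key=18 */
{
    /* array must be sorted in ascending order; N is its number of elements */
    int low = 0, high = N - 1, mid;
    while (low <= high)
    {
        mid = low + (high - low) / 2;    /* same as (low+high)/2, but cannot overflow */
        if (key == array[mid])
            return TRUE;
        else if (key < array[mid])
            high = mid - 1;              /* continue in the lower half */
        else
            low = mid + 1;               /* continue in the upper half */
    }
    return FALSE;
}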
Searching Time Efficiency :
Each comparison halves the range still to be searched, so a search takes about log2 n comparisons on average and never more than floor(log2 n)+1, which is O(log n).
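The logarithmic bound can be observed by counting how many elements the search examines. This is a sketch only: BinSearchCount and its probes parameter are hypothetical names added for illustration, and the function returns an index rather than the Boolean used in the notes.

```cpp
// Binary search on an array sorted in ascending order.
// Returns the index of key, or -1 if it is absent, and stores in
// *probes the number of elements that were examined.
int BinSearchCount(const int array[], int n, int key, int *probes)
{
    int low = 0, high = n - 1;
    *probes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        (*probes)++;                 // one element examined per iteration
        if (key == array[mid])
            return mid;
        else if (key < array[mid])
            high = mid - 1;          // discard the upper half
        else
            low = mid + 1;           // discard the lower half
    }
    return -1;
}
```

With n = 15 sorted elements, no search examines more than floor(log2 15)+1 = 4 elements, whether or not the key is present.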
<III>
TREE SEARCHING
- Binary tree (two-way searching)
- B-Tree (multiway searching)
<IV>
HASHING
Hashing is an efficient technique for searching, insertion and deletion that reduces the
number of comparisons to O(1) on average.
It uses a hash table to hold keys and data.
It uses a hash function h(key) to compute an index into the hash table array.
Hash Function
It is a function that transforms a key into a table index.
Criteria in selecting a Hash Function
<1> Easy and quick to compute
A complicated function wastes time, especially during rehashing, when every key must be hashed again.
<2> Even distribution of keys across the range of indices
1. Avoid a sparse table
Only a small fraction of the positions is occupied.
2. Minimize collisions of keys
A collision occurs when several keys are mapped to the same index (location).
3. Avoid clustering
Keys are concentrated in a few parts of the array.
Building a Hash Function
<1> Using Modular Division ( h(key) = (key computation) % M, so the result is in the proper range 0 to M-1 )
i. Convert the key to an integer
ii. Divide by M, the size of the index range
iii. Take the remainder
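The three steps above can be sketched for a string key as follows. The notes do not say how a key is converted to an integer, so the base-31 accumulation here is an assumption, and HashString is a hypothetical name.

```cpp
// Hash a string key into the range 0 .. m-1 by modular division.
unsigned int HashString(const char *key, unsigned int m)
{
    unsigned int value = 0;
    // Step i: convert the key to an integer.
    // Assumed scheme: treat the characters as digits of a base-31 number
    // (unsigned arithmetic simply wraps around on overflow).
    for (const char *p = key; *p != '\0'; p++)
        value = value * 31u + (unsigned char)*p;
    // Steps ii and iii: divide by the table size and keep the remainder.
    return value % m;
}
```

For example, with m = 79 the key "ab" becomes 97*31 + 98 = 3105, and 3105 % 79 = 24.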
<2> Choice of Modulus
Bad choice
i. Even number (M = even number) makes the hash table biased
It maps even keys to even locations and odd keys to odd locations.
ii. Power of 2 Eg: M=2^3=8
Number    Binary form
64        1 000 000
72        1 001 000
80        1 010 000
88        1 011 000
All go to address 0.
The hash address is determined only by the last 3 bits of the key's integer value.
If the last 3 bits are the same, the keys go to the same address.
=> Keys with the same low-order pattern (here 000) all go to the same address.
=> In practice, keys often share the same pattern
(same pattern -> same address -> collisions increase)
Good Choice
i. Prime number Eg: M=7
Number    Binary form
49        0 110 001
56        0 111 000
63        0 111 111
70        1 000 110
All go to address 0.
But the hash address is determined by all the bits of the key's integer value,
and the keys that collide have different bit patterns.
=> In practice, keys often share the same pattern. With a power-of-2 modulus such keys always collide
(same pattern -> same address)
With a prime modulus, the keys that collide have unrelated patterns, which is less common
(different pattern -> same address -> collisions decrease)
=> Achieves an even spread of keys
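The effect of the modulus can be seen by hashing keys that share a pattern, such as multiples of 8, and counting how many different addresses they reach. This is an illustrative sketch; DistinctAddresses is a hypothetical helper, not part of the original notes.

```cpp
#include <vector>

// Hash the keys 0, step, 2*step, ... (count keys in total) with
// h(key) = key % m and return how many distinct addresses are used.
int DistinctAddresses(int step, int count, int m)
{
    std::vector<bool> seen(m, false);   // seen[a] is true once address a is used
    int used = 0;
    for (int i = 0; i < count; i++) {
        int h = (i * step) % m;
        if (!seen[h]) {
            seen[h] = true;
            used++;
        }
    }
    return used;
}
```

With 100 keys that are multiples of 8, a modulus of 8 sends every key to one address, while the prime modulus 7 spreads them over all 7 addresses.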
Collision Resolution
<1> Linear Probing (testing)
Probe at location (h+i)%HashSize, for i = 0, 1, 2, ...
i: Start at the hash address h (the location where the collision occurs).
ii: Do a sequential search through the table, wrapping around at the end, for an empty location.
Drawbacks
Once the table becomes about half full, occupied locations tend to cluster, which lengthens later probe sequences.
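Insertion with linear probing can be sketched as below. The table size of 7, the EMPTY marker and the name LinearInsert are assumptions made for this example, which also assumes non-negative integer keys.

```cpp
const int HashSize = 7;    // assumed: a small prime table size
const int EMPTY = -1;      // assumed marker for a free location

// Insert key using linear probing.
// Returns the location used, or -1 if the table is full.
int LinearInsert(int table[], int key)
{
    int h = key % HashSize;                 // hash address
    for (int i = 0; i < HashSize; i++) {
        int loc = (h + i) % HashSize;       // probe h, h+1, h+2, ... with wrap-around
        if (table[loc] == EMPTY) {
            table[loc] = key;
            return loc;
        }
    }
    return -1;                              // every location was probed and occupied
}
```

Inserting 7, 14 and 21 into an empty table fills locations 0, 1 and 2, because all three hash to 0. A later key 1, whose own hash address is 1, is then pushed to location 3, which shows how a cluster grows.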
<2> Quadratic Probing
Probe at location (h+i^2)%HashSize, for i = 0, 1, 2, ...
If the first (1st) probe collides at the hash address h, go to location h+1.
If the second (2nd) probe also collides at h+1, go to location h+4.
Reduces clustering
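The quadratic probe sequence can be sketched in the same way. The function name and its size and empty parameters are hypothetical, and non-negative integer keys are assumed.

```cpp
// Insert key with quadratic probing into a table of `size` locations,
// where the value `empty` marks a free location.
// Returns the location used, or -1 if none of the probed locations is free.
int QuadraticInsert(int table[], int size, int empty, int key)
{
    int h = key % size;                     // hash address
    for (int i = 0; i < size; i++) {
        int loc = (h + i * i) % size;       // probe h, h+1, h+4, h+9, ...
        if (table[loc] == empty) {
            table[loc] = key;
            return loc;
        }
    }
    return -1;
}
```

In a table of size 7, the keys 7, 14, 21 and 28 all hash to 0 and end up at locations 0, 1, 4 and 2 rather than in one contiguous run. Note that this probe sequence does not visit every location, so an insertion can fail even when the table still has free space.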
<3> Chaining
Use a linked list to keep all elements that hash to the same address.
Use an array of linked lists: each entry [0], [1], [2], [3], ... points to the list of elements that hash to that address.
Advantages
1. Avoid overflow
2. Resolve collision
3. Fast to insert and delete
Drawbacks
All the links require space; only with large records is this overhead negligible.
Alternative Ways of Chaining
1. Use a binary search tree (BST) for each chain
2. Use another hash table for each chain
Limitation
1. Does not support ordering operations such as finding max or min
2. Does not support sorting
<VI>
Implementation Of HASHING Using CHAINING
#include <iostream>
#include <stdlib.h>
#include <string.h>
#define HASHSIZE 79    /* number of chains in the table; a prime */
#define KEYSIZE 50
#define TRUE 1
#define FALSE 0        /* 0, so a Boolean result can be tested directly in a condition */
struct node
{
    char key[30];
    int data;
    struct node *next;    /* next element in the same chain */
};
typedef struct node Node;
typedef int Boolean;
Boolean InsertHashTable(Node *[], Node *node, int loc);
int HashFun(char*);
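One possible way to fill in the two declared functions is sketched below. The notes give only the prototypes, so the bodies are assumptions: HashFun uses a base-31 accumulation reduced modulo HASHSIZE, InsertHashTable rejects duplicate keys, SearchHashTable is a hypothetical extra helper, FALSE is taken as 0, and the key parameters are declared const char * so that string literals can be passed.

```cpp
#include <cstring>

#define HASHSIZE 79
#define TRUE 1
#define FALSE 0    /* assumed: 0 so the result can be used in a condition */

struct node
{
    char key[30];
    int data;
    struct node *next;
};
typedef struct node Node;
typedef int Boolean;

/* Convert the key to an integer (assumed base-31 scheme) and reduce it
   to the range 0 .. HASHSIZE-1 by modular division. */
int HashFun(const char *key)
{
    unsigned int sum = 0;
    for (; *key != '\0'; key++)
        sum = sum * 31u + (unsigned char)*key;
    return (int)(sum % HASHSIZE);
}

/* Link node at the front of chain loc.
   Returns FALSE if the chain already holds the same key. */
Boolean InsertHashTable(Node *table[], Node *node, int loc)
{
    for (Node *p = table[loc]; p != NULL; p = p->next)
        if (strcmp(p->key, node->key) == 0)
            return FALSE;
    node->next = table[loc];
    table[loc] = node;
    return TRUE;
}

/* Hypothetical helper: return the node holding key, or NULL if absent. */
Node *SearchHashTable(Node *table[], const char *key)
{
    for (Node *p = table[HashFun(key)]; p != NULL; p = p->next)
        if (strcmp(p->key, key) == 0)
            return p;
    return NULL;
}
```

A caller would start with every entry of the table set to NULL, fill in a node's key and data, and then call InsertHashTable(table, &node, HashFun(node.key)). Inserting at the front of the chain keeps insertion fast once the duplicate check is done.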
EXERCISE :