(part 2), Searching, Hashing

Lecture 9&10
[Figure: doubly linked list. HEAD points to the first node; both ends of the list terminate in NULL.]
Insert Right
Q = newnode;
Q → Node.Info = X;
R = Cursor → Node.Right;
R → Node.Left = Q;
Q → Node.Right = R;
Q → Node.Left = Cursor;
Cursor → Node.Right = Q;
Insert Left
Q = newnode;
Q → Node.Info = X;
L = Cursor → Node.Left;
L → Node.Right = Q;
Q → Node.Left = L;
Q → Node.Right = Cursor;
Cursor → Node.Left = Q;
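The two insertions can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the class `Node`, the helper `to_list`, and the `None` checks (which guard the case where the cursor has no neighbour on that side) are assumptions added here.

```python
class Node:
    """One node of a doubly linked list; field names follow the slide."""
    def __init__(self, info):
        self.info = info
        self.left = None    # predecessor (None plays the role of NULL)
        self.right = None   # successor


def insert_right(cursor, x):
    """Insert a new node holding x immediately to the right of cursor."""
    q = Node(x)
    r = cursor.right
    if r is not None:       # assumption: guard the NULL case
        r.left = q
    q.right = r
    q.left = cursor
    cursor.right = q
    return q


def insert_left(cursor, x):
    """Insert a new node holding x immediately to the left of cursor."""
    q = Node(x)
    l = cursor.left
    if l is not None:       # assumption: guard the NULL case
        l.right = q
    q.left = l
    q.right = cursor
    cursor.left = q
    return q


def to_list(head):
    """Collect the info fields from head rightwards (helper for checking)."""
    out = []
    while head is not None:
        out.append(head.info)
        head = head.right
    return out
```

For example, starting from a single node `b = Node("B")`, calling `insert_right(b, "C")` and then `a = insert_left(b, "A")` leaves a list that reads A, B, C from `a`.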
Notation Used:
p → Pointer to a node.
node (p) → Node pointed to by p.
info (p) → Data value in node (p).
next (p) → Pointer to the next node in the list.
info (next (p)) → Value of the node after p.
Operations needed:
GetNode – new (or pop from the stack-based heap)
FreeNode – delete (or push onto the stack-based heap)
InsertAfter
DeleteAfter
Place → Adds a node to a sorted list.
DataNode Node[500];
Avail = 1;
for i = 1 to 499
    Node[i].Next = i + 1;
Node[500].Next = 0;
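[Figure: available list for a four-node example. Node 1 → Next 2, node 2 → Next 3, node 3 → Next 4, node 4 → Next 0 (NULL).]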
GetNode () return (Ptr)
{
    if (Avail = 0) then
        return (NULL);
    temp = Avail;
    Avail = Node [Avail].Next;
    return (temp);
}
FreeNode (Ptr P)
{
    Node [P].Next = Avail;
    Avail = P;
    return;
}
InsertAfter (Ptr P, Data X, Flag Err)
{
    if (P = NULL) then
        Err = true;
    else
    {
        Q = GetNode ();
        if (Q = NULL) then
            Err = true;
        else
        {
            Node [Q].Info = X;
            Node [Q].Next = Node [P].Next;
            Node [P].Next = Q;
        }
    }
}
DelAfter (Ptr P, Data X, Flag Err)
{
    if (P = NULL) then
        Err = true;
    else if (Node [P].Next = NULL) then
        Err = true;
    else
    {
        Q = Node [P].Next;
        X = Node [Q].Info;
        Node [P].Next = Node [Q].Next;
        FreeNode (Q);
    }
}
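The array-based routines above can be tried out with a small Python sketch. It assumes, as the slides do, that index 0 stands for NULL and that free nodes are chained through their Next fields. The class name `NodePool`, the split into `info` and `next` arrays, and the return-value error signalling (instead of an Err flag) are illustrative choices, not part of the lecture.

```python
NULL = 0  # index 0 is reserved as the null pointer, as on the slide


class NodePool:
    """Linked-list nodes stored in arrays, with a free list headed by avail."""

    def __init__(self, size):
        self.info = [None] * (size + 1)
        # Chain every node to the next one: 1 -> 2 -> ... -> size -> NULL.
        self.next = [i + 1 for i in range(size + 1)]
        self.next[size] = NULL
        self.next[0] = NULL              # slot 0 is never handed out
        self.avail = 1 if size > 0 else NULL

    def get_node(self):
        """Pop a node off the free list; NULL means the pool is exhausted."""
        if self.avail == NULL:
            return NULL
        p = self.avail
        self.avail = self.next[p]
        self.next[p] = NULL
        return p

    def free_node(self, p):
        """Push node p back onto the free list."""
        self.next[p] = self.avail
        self.avail = p

    def insert_after(self, p, x):
        """Insert x after node p; return True on success, False on error."""
        if p == NULL:
            return False
        q = self.get_node()
        if q == NULL:
            return False
        self.info[q] = x
        self.next[q] = self.next[p]
        self.next[p] = q
        return True

    def delete_after(self, p):
        """Unlink and free the node after p; return its value, None on error."""
        if p == NULL or self.next[p] == NULL:
            return None
        q = self.next[p]
        x = self.info[q]
        self.next[p] = self.next[q]
        self.free_node(q)
        return x
```

Returning `None` for errors keeps the sketch short, at the cost of not distinguishing an error from a stored `None` value; a production version would more likely raise an exception.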
Searching:
Searches operate on files or arrays.
Key: either a subscript or embedded in the record.
Keys might live in a separate table or index.
Unique key for each record → primary key.
Other keys are secondary keys.
Internal searches happen completely in the computer's memory (RAM).
External searches happen on secondary storage devices.
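Sequential Search:
Doesn't make assumptions about the data.
The Big-O complexity is O(n).
Expected number of comparisons, where p(i) is the probability that the search key sits at position i:
$1 \cdot p(1) + 2 \cdot p(2) + \dots + n \cdot p(n)$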
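A short Python sketch of sequential search and of the weighted sum above. The function names are illustrative, and the probabilities are assumed to be given in list order for positions 1 to n.

```python
def sequential_search(items, key):
    """Scan items from the front; return the index of key, or -1 if absent."""
    for i, item in enumerate(items):
        if item == key:
            return i
    return -1


def expected_comparisons(probabilities):
    """Compute 1*p(1) + 2*p(2) + ... + n*p(n) for a successful search."""
    return sum(i * p for i, p in enumerate(probabilities, start=1))
```

With four equally likely positions, `expected_comparisons([0.25] * 4)` gives 2.5, which matches the familiar (n + 1) / 2. Placing the likeliest keys first lowers the sum: `expected_comparisons([0.5, 0.25, 0.125, 0.125])` gives 1.875.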
Probability Distribution:
1. Move to front
After every successful search, item retrieved goes to head of the list.
2. Transposition
After every successful search, the retrieved item trades positions with its predecessor.
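Both heuristics can be sketched on a plain Python list. The function names are illustrative, and each function reorders the list in place only when the search succeeds.

```python
def search_move_to_front(items, key):
    """Sequential search; on success move the found item to the head."""
    for i, item in enumerate(items):
        if item == key:
            items.insert(0, items.pop(i))
            return True
    return False


def search_transpose(items, key):
    """Sequential search; on success swap the found item with its predecessor."""
    for i, item in enumerate(items):
        if item == key:
            if i > 0:
                items[i - 1], items[i] = items[i], items[i - 1]
            return True
    return False
```

Searching `[1, 2, 3, 4]` for 3 leaves `[3, 1, 2, 4]` under move to front and `[1, 3, 2, 4]` under transposition, so transposition adapts more gradually to the access pattern.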
Hashing Functions:
Key → Hashing → Address
Uniform Distribution

Prime Number division remainder.
key = 35
key mod 17 = 1
(Best if the divisor is a prime number)
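A minimal Python sketch of the division-remainder method. The function name and the sample keys (multiples of 4) are assumptions chosen to show why a prime divisor tends to spread patterned keys better.

```python
def division_hash(key, table_size=17):
    """Division-remainder hashing: the address is key mod table_size."""
    return key % table_size


# Hypothetical keys with a regular pattern: every multiple of 4 below 64.
patterned_keys = list(range(0, 64, 4))

# Count how many distinct addresses each divisor produces for those keys.
addresses_mod_16 = {division_hash(k, 16) for k in patterned_keys}
addresses_mod_17 = {division_hash(k, 17) for k in patterned_keys}
```

Here `division_hash(35)` is 1, as on the slide. The 16 patterned keys land on only 4 distinct addresses with divisor 16, but on 16 distinct addresses with the prime divisor 17.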

Digit Extraction

Folding
25936715 = 2593 + 6715 = 9308
259 + 36 + 715 = 1010
2961 + 5375 = 8336
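Folding can be sketched as splitting the key's decimal digits into groups and summing the groups. The function name `fold` and the `widths` parameter (the number of digits in each group) are illustrative assumptions.

```python
def fold(key, widths):
    """Split the decimal digits of key into groups of the given widths and add them."""
    digits = str(key)
    if sum(widths) != len(digits):
        raise ValueError("widths must cover every digit of the key exactly once")
    total = 0
    pos = 0
    for w in widths:
        total += int(digits[pos:pos + w])
        pos += w
    return total
```

For the key 25936715, `fold(25936715, [4, 4])` adds 2593 and 6715 to give 9308, and `fold(25936715, [3, 2, 3])` adds 259, 36 and 715 to give 1010. In practice the sum is usually reduced further (for example modulo the table size) to fit the address range.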

Radix Conversion (converts base 10 to base 3)

Mid-Square
keys: 2 9 6 1 5 8 3 4
(158)² = 2 4 9 6 4
key 123 = (158)2 = 5 1 2 4
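A hedged Python sketch of the mid-square idea: square the key and keep digits from the middle of the result. Which digits to keep, and how many, is a design choice; the `width` parameter and the centring rule below are assumptions, not taken from the slide.

```python
def mid_square(key, width):
    """Square the key and return `width` digits taken from the middle of the square."""
    squared = str(key * key)
    # Centre the window; for an odd leftover the window sits one digit to the left.
    start = (len(squared) - width) // 2
    return int(squared[start:start + width])
```

For instance, 158 squared is 24964, so `mid_square(158, 3)` keeps 496; 123 squared is 15129, so `mid_square(123, 3)` keeps 512.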

Random Number function.
Table Address
Perfect hashing functions are functions that produce no collisions.
1. Quotient Reduction
hash(n) = (n+s)/n
hash(n) = 0
Note: Keys have to be uniformly distributed. O(1) is guaranteed with perfect hashing.
2. Remainder Reduction
Note: Keys do not have to be uniformly distributed.
Collision Handling:
1. Re-Hashing
2. Chaining
α = Load factor (number of records stored / table size)
Approach # 1: Linear Probing
Avg. number of probes for successful searches:
$\frac{1}{2}\left(1 + \frac{1}{1 - \alpha}\right)$; for $\alpha = 0.75$: $\frac{1}{2}\left(1 + \frac{1}{1 - 0.75}\right) = 2.5$
Avg. number of probes for unsuccessful searches:
$\frac{1}{2}\left(1 + \frac{1}{(1 - \alpha)^2}\right)$; for $\alpha = 0.75$: $\frac{1}{2}\left(1 + \frac{1}{(1 - 0.75)^2}\right) = 8.5$
Approach # 2: Quadratic Probing
(reduces clustering)
i = 1, 2, 3, 4
table = 4 * j + 1
Avg. number of probes for successful searches:
$-\frac{1}{\alpha}\ln(1 - \alpha)$
Avg. number of probes for unsuccessful searches:
$\frac{1}{1 - \alpha}$
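The linear-probing estimates can be checked numerically, and the probing itself is short enough to sketch. In the Python below, the function names, the use of `key % len(table)` as the home address, and `None` as the empty-cell marker are assumptions made for illustration.

```python
def linear_probing_successful(alpha):
    """Average probes for a successful search: 1/2 * (1 + 1/(1 - alpha))."""
    return 0.5 * (1 + 1 / (1 - alpha))


def linear_probing_unsuccessful(alpha):
    """Average probes for an unsuccessful search: 1/2 * (1 + 1/(1 - alpha)^2)."""
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)


def linear_probe_insert(table, key):
    """Store key in the first free cell at or after its home address.

    Returns the number of probes used; the search wraps around the table.
    """
    m = len(table)
    for probes in range(1, m + 1):
        slot = (key + probes - 1) % m
        if table[slot] is None:
            table[slot] = key
            return probes
    raise OverflowError("table is full")
```

At a load factor of 0.75 the two formulas give 2.5 and 8.5 probes. Inserting 10, 17 and 24 into an empty 7-cell table takes 1, 2 and 3 probes respectively, because all three keys share home address 3; this growing run of occupied cells is the clustering that the other approaches try to reduce.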
Approach # 3: Two-Pass Hashing
First pass: place each record whose home cell is empty (collect the synonyms).
Second pass: hash the synonyms (deal with collisions).
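The two passes can be sketched in Python as below. The slides do not say how synonyms are placed in the second pass, so linear probing is used here purely as an assumption; the function name and the `key % table_size` home address are also illustrative.

```python
def two_pass_load(keys, table_size):
    """Load keys in two passes; return (table, synonyms)."""
    table = [None] * table_size
    synonyms = []

    # Pass 1: a key is stored only if its home cell is still empty.
    for key in keys:
        home = key % table_size
        if table[home] is None:
            table[home] = key
        else:
            synonyms.append(key)

    # Pass 2: place the collected synonyms (assumption: linear probing).
    for key in synonyms:
        slot = key % table_size
        for _ in range(table_size):
            if table[slot] is None:
                table[slot] = key
                break
            slot = (slot + 1) % table_size
        else:
            raise OverflowError("table is full")

    return table, synonyms
```

Loading 10, 17, 4, 11 into 7 cells puts 10 and 4 in their home cells 3 and 4 during the first pass; the synonyms 17 and 11 then go to cells 5 and 6. A single pass with linear probing would have put 17 into cell 4 and pushed 4 away from its home address.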
Approach # 4: Overflow Table
Avg. Number of probes
for successful searches:
Avg. Number of probes
for unsuccessful searches:
$\alpha - \frac{1}{2}\ln(1 - \alpha)$
$\frac{1}{1 - \alpha}$
Chaining
[Figure: chained hash table with slots 0 to 5]
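A minimal Python sketch of chaining, in which each table slot holds a list of the records that hash to it. The class name, the default of six slots (matching slots 0 to 5), and the `key % size` hash are assumptions for illustration.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by keeping a chain per slot."""

    def __init__(self, size=6):
        self.slots = [[] for _ in range(size)]

    def _chain(self, key):
        """Return the chain for the slot that key hashes to."""
        return self.slots[key % len(self.slots)]

    def insert(self, key, value):
        """Add key/value, replacing the value if key is already present."""
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        """Return the value stored under key, or None if it is absent."""
        for k, v in self._chain(key):
            if k == key:
                return v
        return None
```

With six slots, keys 7 and 13 both hash to slot 1 and simply share that slot's chain, so a search for either one scans only that chain.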
Buckets
Spill Addressing
Linear Search