Lecture Notes

advertisement
Lecture Notes
Slide 1
Summarize the four method of search that we will discuss.
 Serial Search
 Binary Search
 Open Address Hashing
 Chaining
Slide 4
 Airline reservation systems – search for free seats
on all flights for a given day of the week.
 Given N credit card transactions for a given day and
M known stolen credit cards. Identify which of these
N transactions involve a stolen credit card. N is
typically much larger than M. For example, N =
10,000,000 and M = 100,000.
Slide 6-7
 Serial Search
Examines the elements of an array a[0:n-1], from left
to right, to see if one of them equals x. If an
element equal to x is found, return the position of
the first occurrence of x. If the array has no element
equal to x, return –1.
Slide 8
 Best Case
If x is stored in a[0], only one array access is
necessary.
 Worst Case
If x is in position a[n-1] or x is not in the array at
all, n array accesses are necessary.

Average Case (successful search)
Assume x is randomly placed in the array (i.e., x has
an equal probability of being stored in any position
of the array). We analyze what happens if x is stored
in each position of the array.
If x is in a[0] then 1
If x is in a[1] then 2
If x is in a[1] then 3
...
If x is in a[n-1] then
comparison is necessary to detect x.
comparisons are necessary to detect x.
comparisons are necessary to detect x.
n comparisons are necessary to detect x.
Average number of comparisons = (1 + 2 + … + n)/n = (n+1)/2

Average Case (unsuccessful search)
Average number of comparisons = n
Slides 14-15
 Binary Search
Examines the elements of a sorted array a[0:n-1] to
see if one of them equals x. The search begins by
comparing x to the middle of the array, a[middle]. If
x equals this element, the search terminates. If x is
smaller than this element, then we need to search the
left half of the array for x. If x is greater than
this element, then we search the right half of the
array for x. The processes continues until we find the
element that equals x, in which case, we return its
position. If the search fails, we return –1.
Slide 17
Search for x = 51
0
1
2
12
17
30
3
45
4
47
5
6
7
8
52
59
77
81
9
82
10
85
start
size
middle
0
11
5
0
5
2
3
2
4
5
1
5
5
0
(Seems to work but suspicious)
Analysis of splitting array
Start with an array of odd size = 2m+1, a[0:2m] then first = 0, size
= 2m+1, size/2 = m, middle = m.
So,
search(a,first,size/2,x) = search(a, 0, m, x)=> a[0:m-1]
search(a,middle+1,size/2,target) = search(a, m+1, m, x)=> a[m+1,2m]
Splitting looks ok.
Start with an array of even size = 2m, a[0:2m-1] then first = 0, size = 2m,
size/2 = m, middle = m.
So,
search(a,first,size/2,x) = search(a, 0, m, x)=> a[0:m-1]
search(a,middle+1,size/2,target) = search(a, m+1, m, x)=> a[m+1,2m]
Splitting looks bogus.
Slide 21
Can a link list be used to implement binary search?
We would have to build an ordered list.
Problem: finding the middle would require O(n) operations vs. O(1) for an array. The
maximum number of function calls is (log n). Therefore, with an array the running
time is (log n), while with a linked list the running time is (n log n).
Slide 22
Suppose 2k-1  N  2k – 1 nodes. N = 2k-1 + t where 0  t  2k-1 – 1
Nodes
1
3
7
…
2k-1 – 1
2k-1
N
2k – 1
Max Levels (comparisons)
1
2
3
…
k-1
k
k
k
Example N = 15 = 23 + 7 => k-1 = 3 or k = 4 is the max number of comparisons need
with binary search.
Example (Search for H)
0
A
1
A
Start
0
8
8
8
2
A
3
C
4
E
5
E
6
F
size
15
7
3
1
7
G
8
H
9 10
I L
middle
7
11
9
8
(Search for P)
Start
0
8
12
size
15
7
3
middle
7
11
13
Recurrence relation:
T(1) = 1
T(n) = T(n/2) + 1, n > 1
Suppose n = 2k
T(n) = T(n/2) + 1
= T(n/4) + 1 + 1
= T(n/8) + 1 + 1 + 1
=
…
= T(n/2k) + 1 + 1 + … + 1
= k + 1
= lg n + 1
11
M
12
N
13
P
14
R
Slide 25
Interpolation search
Generalizes the idea of computing the
middle = first + size/2
with
first + size*(x – a[first])/(a[first+size-1]-a[first])
Assumptions must be made though – the keys are numerical
and they are uniformly distributed. Performs poorly when
keys are not uniformly distributed.
Lg lg n + 1 for large random files
For small n, lg n is close to lg lg n so no improvement
over binary search.
Example n = 1,000,000,000
lg lg 1,000,000,000 < 5
Slide 32
Toy Example:
H(key) = key % 11
Hash the integer keys: 50, 31, 78, 20, 22, 65
0
1
2
22
78
65
3
4
5
6
7
8
50
findIndex(31) returns i= 9
findIndex(65) returns i= 10, 0, 1, 2
findIndex(53) returns i = 9, 10, 0, 1, 2, 3, -1
9
31
10
20
Slide 33
put(28,”Happy Birthday”)
0
1
2
22
78
65
3
4
5
6
7
8
50
Clustering
9
31
10
20
Replace key 50
with 28 and its data
with “Happy
Birthday”, return
50’s data to caller
Slide 34
Put(104,”Bon jour”)
Insert key 104 at index 5 in keys, insert “Bon jour” in
data[5], hasBeenUsed[5] is set to true, numItems is
increased by 1, return null to caller.
How is an element removed from a hash table?
Slide 35-36
Double hashing
H1(key) = key % 11
H2(key) = key % 9
Hash the integer keys: 50, 31, 78, 20, 22, 65, 39
0
1
20
78
2
3
4
22
H1(20) = 20 % 11 = 9 collision
H2(20) = 20 % 9 = 2
H1(22) = 22 % 11 = 0 collision
H2(22) = 22 % 9 = 4
H1(65) = 65 % 11 = 10
5
6
7
50
39
8
9
31
10
65
H1(39)
H2(39)
Trying
How to
= 39 % 11 = 6 collision
= 39 % 9 = 3
to place 39: Collision at 9, 1, then 4 finally place at 7.
choose the second hash function?

Make sure the second function can never evaluate to 0 since if it
does the probe sequence would lead to an infinite loop.

(Knuth) Chose twin primes M and M-2. If h1(key) = key mod M then
h2(key) = key mod (M-2) + 1
This discussion applies to an assignment made during the summer of
2002.
Discuss HW3 Algorithm
Input M (number of soldiers) and N (number used to eliminate soldiers).
Input the names of the soldiers in clockwise order.
Construct the almost complete binary tree T.
(see figure 1)
Let p point to the root of T.
Let the number of leaves that remain to be counted = N % M.
while there is more than one leaf in T
{
//Locate the next soldier to be eliminated
while there is more than one leaf in the subtree pointed to by p
(see figure 2, 5, 8)
{
Let p be set to point to its left child.
if number of leaves that remain to be counted > number of leaves in the subtree pointed to by p
{
Decrement number of leaves that remain to be counted by number of leaves in the subtree
pointed to by p
Let p point to its right brother
}
}
// p should be pointing to the next soldier to be eliminated from the circle
Display the soldier pointed to by p.
Let q point to the same node that p points to.
while q is not null
(see figure 3, 6, 9)
{
//Reduce the count of each ancestor of q
Decrement number of leaves in the subtree pointed to by q by 1.
if number of leaves in the subtree pointed to by q equals 1 then
{
if number of leaves in the subtree pointed to by the left child of q equals 1 then
replace the soldier's name in q with the soldier's name in q's left child
else
replace the soldier's name in q with the soldier's name in q's right child
}
replace q with q's parent
}
Let the number of leaves that remain to be counted be set to N.
if p points to a left child then
let p point to it right brother
while number of leaves
pointed to by
{
Decrement number of
pointed to by p
while p points to a
(see figure 4, 7, 10)
that remain to be counted > number of leaves in the subtree
p and p not equal to T
leaves that remain to be counted by the number of leaves in the subtree
right child
set p to point to its father
if p does not point to T then
let p point to its right brother
}
if p points to T then
set number of leaves to be counted - 1 to itself modulo the number of leaves in the subtree
pointed to by T + 1
}
// At this point only, one leaf remains containing the lucky soldier who goes for reinforcements
Figure 1
p
Remain = 4
T
9
5
3
Ajax
1
2
Cato
1
2
4
Drusus
1
2
Flavius
1
Gaius
1
2
Hector
1
Maximus
1
Otho
1
Brutus
1
Figure 2
T
Remain = 4
9
p
5
4
p
p
3
2
2
Remain = 1
2
p
Cato
1
2
Ajax
1
Brutus
1
Drusu
s
1
Flavius
1
Gaius
1
Hector
1
Maximus
1
Otho
1
Figure 3
q
T
9
8
q
5
4
4
Flavius
2
q
3
2
2
1
P
Cato
1
2
q
Drusus
1 0
Flavius
1
Gaius
1
Hector
1
Maximus
1
Otho
1
Remain = 13
Ajax
1
Brutus
1
Figure 4
T
P
8
Remain = 8
P
P
4
Remain = 8
4
P
Flavius
1
3
2
Remain = 12
2
P
Cato
1
2
Ajax
1
Brutus
1
Drusus
0
Flavius
1
Gaius
1
Hector
1
Maximus
1
Otho
1
Figure 5
T
P
Remain = 8
8
P
P
Remain = 4
4
4
P
Flavius
1
3
P
2
2
P
Cato
1
2
Ajax
1
Gaius
1
Remain = 2
Hector
1
p
Maximus
1
Remain = 1
Otho
1
Brutus
1
Figure 6
T
q
8
7
q
4
3
4
q
Flavius
1
3
2
2
1
Maximus
remain = 13
p q
Cato
1
2
Ajax
1
Brutus
1
Gaius
1
Hector
1
Maximus
1
Otho
0
Figure 7
T
p
7
p
4
3
p
Flavius
1
3
1
Maximus
2
remain = 13
p
Cato
1
2
Ajax
1
Gaius
1
Hector
1
Maximus
1
Brutus
1
Figure 8
T
p
7
remain = 6
p
p Remain = 2
4
3
p
Flavius
1
3
2
p
Cato
1
2
Ajax
1
Brutus
1
Maximus
1
Gaius
1
p
Remain = 1
Hector
1
Otho
0
Figure 9
T
q
7
6
q
4
32
q
Flavius
1
3
Maximus
1
21
Gaius
p q
Cato
1
2
Ajax
1
Gaius
1
Remain = 1
Hector
10
Brutus
1
Figure 10
T
P
Remain=6
6
p
4
2
P
p
Flavius
1
3
Gaius
Cato
1
2
Ajax
1
Brutus
1
Maximus
1
1
p
Gaius
1
Remain = 12
Remain = 13
Hector
0
Figure 11
T
p
6
Remain=6
p
p
4
3
Remain=2
p
Gaius
1
Remain=1
Maximus
1
Cato
1
2
Ajax
1
p
Flavius
1
2
Brutus
1
Figure 12
T
Q
Remain=3
5
q
1
4
Gaius
Flavius
1
3
Cato
1
2
Ajax
1
Brutus
1
Gaius
1
p
q
Remain=13
Maximus
0
Figure 13
T
p
Remain=3
5
p
p
p
Ajax
1
Flavius
1
3
p
Remain=1
Cato
1
2
Brutus
1
Gaius
1
4
Trie
//*** Leaf
//***
//***
//***
//***
//***
-------------|is Leaf| true |
|-------|------|
|suffix |
|
--------------
//*** NonLeaf
//***
//***
//***
//***
//***
//***
//***
//***
//***
--------------|isLeaf
|false|
|---------|-----|
|endOfWord|
|
|---------|-----|
|letters | o--|----->[ ][ ][ ][ ][ ] (array of letters)
|---------|-----|
|ptrs[]
| o--|----->[o][o][o][o][o](ptrs to leaves/nonleaves)
--------------| | | | |
v v v v v
class TrieNode
{
public boolean isLeaf;
}
class TrieNonLeaf extends TrieNode
{
public boolean endOfWord;
public String letters;
public TrieNode[] ptrs = new TrieNode[1];
TrieNonLeaf()
{
isLeaf = false;
}
TrieNonLeaf(char ch)
{
letters = new String();
letters += ch;
isLeaf = false;
}
}
class TrieLeaf extends TrieNode
{
public String suffix;
TrieLeaf()
{
isLeaf = true;
}
TrieLeaf(String s)
{
suffix = new String(s);
isLeaf = true;
}
}
Words:
the
there
their
was
when
wasp
#
t
w
#
h
#
a
h
#
#
#
the
#
s
#
a
e
i
r
r
#
#
e
p
#
was
wasp
there
their
t
what
Download