Searching

The truth is out there ...
searching
1
Serial Search
• Brute force algorithm: examine each array
item sequentially until either:
– the item is found
– all items have been examined
• Algorithm is easy to code and works OK for
small data sets
Code example for serial search
// precondition: none
// postcondition: searches an array of N items for target value:
// returns true if target found, false if not
template <class item>
bool SerialSearch (item array[], size_t N, item target)
{
    bool found = false;
    for (size_t x=0; (x < N) && (!found); x++)
        if (array[x] == target)
            found = true;
    return found;
}
Time analysis of serial search
• Worst case: serial search is O(N) -- if the item
is not found, the whole array must be examined
before this can be verified
• Best case: O(1) -- target value found at
array[0]
• Average case (target present): about (N+1)/2
comparisons -- still O(N), but about 1/2 the
time required for the worst case
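To make the three cases concrete, the sketch below counts how many elements a serial search compares against the target. The names CountingSerialSearch and AverageComparisons, and the comparisons out-parameter, are illustrative additions and not part of the original slides.

```cpp
#include <cstddef>

// Hypothetical helper (not from the slides): serial search that also
// reports how many array elements were compared against target.
template <class item>
bool CountingSerialSearch(const item array[], std::size_t N,
                          const item& target, std::size_t& comparisons)
{
    comparisons = 0;
    for (std::size_t x = 0; x < N; x++)
    {
        ++comparisons;              // one comparison per element examined
        if (array[x] == target)
            return true;            // stop as soon as the target is found
    }
    return false;                   // all N elements were examined
}

// Average comparisons over N successful searches: one search for each
// element of the array (assumes the values are distinct).
template <class item>
double AverageComparisons(const item array[], std::size_t N)
{
    std::size_t total = 0, c = 0;
    for (std::size_t i = 0; i < N; i++)
    {
        CountingSerialSearch(array, N, array[i], c);
        total += c;
    }
    return N == 0 ? 0.0 : static_cast<double>(total) / N;
}
```

For a 7-element array of distinct values, a search for the first element uses 1 comparison, a search for a missing value uses 7, and the average over all successful searches is (7+1)/2 = 4.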
Binary search
• Much faster than serial search
• Works only if data are sorted
• Uses divide & conquer approach with
recursive calls:
– check value at midpoint; if not target then
– if target is greater than the midpoint value,
make recursive call to search “upper” half of
structure
– if target is less than the midpoint value,
recursively search “lower” half
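The sorted-data requirement also applies to the C++ standard library's own binary search. The sketch below is only an illustration of that requirement; ContainsSorted is a hypothetical helper name, and sorting a copy on every call is done here purely to keep the example self-contained.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns true if target occurs in values.
// The vector is taken by value and sorted first, because
// std::binary_search requires its input range to be sorted.
bool ContainsSorted(std::vector<int> values, int target)
{
    std::sort(values.begin(), values.end());
    return std::binary_search(values.begin(), values.end(), target);
}
```

In practice the data would be sorted once and then searched many times; sorting before every search would cost more than a serial search.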
Implementation of binary search
// precondition: array[first] .. array[first + size - 1] is sorted
//    in ascending order
// postcondition: searches size items of array, starting at index first,
//    for target value: found is set to true if target is present
//    (and location to its index), false if not
template <class item>
void BinarySearch(item array[], size_t first, size_t size,
                  item target, bool& found, size_t& location)
// parameters: array is the array to be searched,
//             first is the first index to be considered,
//             size is the number of items in search group,
//             target is the value being sought,
//             found is the success/failure flag,
//             location is the index of the entry containing the
//             target value, if found
Binary search code continued
{ // start of function
    size_t middle; // index of midpoint of current search area
    if (size == 0)
        found = false; // base case
    else
    {
        middle = first + size / 2;
        if (target == array[middle])
        {
            location = middle;
            found = true;
        }
Binary search code continued
        // target not found at current midpoint -- search appropriate half
        else if (target < array[middle])
            BinarySearch (array, first, size/2, target, found, location);
            // searches from first to the index before midpoint
        else
            BinarySearch (array, middle+1, (size-1)/2, target, found, location);
            // searches from index after midpoint to end of current segment
    } // ends outer else
} // ends function
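The same halving strategy can also be written as a loop. The sketch below is an alternative formulation, not the code from the slides: the name IterativeBinarySearch is hypothetical, and it assumes the array is sorted in ascending order and that the search starts at index 0.

```cpp
#include <cstddef>

// Hypothetical iterative variant of the recursive search above.
// Each pass narrows the segment [first, first + size) instead of
// making a recursive call.
template <class item>
void IterativeBinarySearch(const item array[], std::size_t size,
                           const item& target, bool& found,
                           std::size_t& location)
{
    std::size_t first = 0;
    found = false;
    while (size > 0 && !found)
    {
        std::size_t middle = first + size / 2;
        if (target == array[middle])
        {
            location = middle;
            found = true;
        }
        else if (target < array[middle])
            size = size / 2;            // keep first .. middle-1
        else
        {
            size = (size - 1) / 2;      // keep middle+1 .. end of segment
            first = middle + 1;
        }
    }
}
```

The segment arithmetic (size/2 for the lower half, (size-1)/2 for the upper half) is the same as in the recursive version. The standard library offers std::binary_search and std::lower_bound in <algorithm> for production use.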
Binary search in action
Suppose you have a 13-member array of sorted numbers:
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12]
 5    14   23   47   59   71   82   99  108  113  130  151  172
Searching for value: 113
Initial function call: first = 0, size = 13, middle = 6
Since 113 > 82 (the value at array[6]), make recursive call on the upper half:
BinarySearch (array, middle+1, (size-1)/2, target, found, location);
Recursive call (1): first = 7, size = 6, middle = 10
Since 113 < 130 (the value at array[10]), make recursive call on the lower half:
BinarySearch(array, first, size/2, target, found, location);
Recursive call (2): first = 7, size = 3, middle = 8
Since 113 > 108 (the value at array[8]), make recursive call on the upper half:
BinarySearch(array, middle+1, (size-1)/2, target, found, location);
Recursive call (3): first = 9, size = 1, middle = 9
Since 113 == 113, target is found; found = true, location = 9
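The walkthrough can be reproduced by recording first, size and middle at every call. The sketch below assumes the 13-element array from the example; CallRecord and TracedBinarySearch are hypothetical names introduced for illustration, and the trace parameter is not part of the original function.

```cpp
#include <cstddef>
#include <vector>

// One record per call on a non-empty segment: the segment searched
// and the midpoint examined.
struct CallRecord
{
    std::size_t first, size, middle;
};

// Hypothetical tracing variant of the recursive binary search.
template <class item>
void TracedBinarySearch(const item array[], std::size_t first,
                        std::size_t size, const item& target,
                        bool& found, std::size_t& location,
                        std::vector<CallRecord>& trace)
{
    if (size == 0)
    {
        found = false;                          // base case: empty segment
        return;
    }
    std::size_t middle = first + size / 2;
    trace.push_back({first, size, middle});     // log this call
    if (target == array[middle])
    {
        location = middle;
        found = true;
    }
    else if (target < array[middle])
        TracedBinarySearch(array, first, size / 2,
                           target, found, location, trace);
    else
        TracedBinarySearch(array, middle + 1, (size - 1) / 2,
                           target, found, location, trace);
}
```

Searching the example array for 113 with first = 0 and size = 13 records four calls: (0, 13, 6), (7, 6, 10), (7, 3, 8) and (9, 1, 9), ending with found = true and location = 9.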
Binary Search Analysis
• Worst-case scenario: item is not in the array
– algorithm keeps searching smaller subarrays
– eventually, array size will be 0, and the search
will stop
• Analysis requires computing time needed
for operations in function as well as amount
of time for recursive calls
• We will analyze the algorithm’s
performance in the worst case
Step 1: count operations
• Test base case: if (size==0)
1 operation
• Compute midpoint:
middle = first + size/2;
3 operations
• Test for target at midpoint:
if (target == array[middle])
2 operations
• Test for which recursive call to make:
if (target < array[middle])
2 operations
• Recursive call - requires some arithmetic and
argument passing - estimate 10 operations
Step 2: analyze cost of recursion
• Each recursive call is preceded by 18 (or
fewer) operations
• Multiply this number by the depth of
recursive calls and add the number of
operations performed in the stopping case to
determine worst-case running time (T(n))
• T(n) = 18 * depth of recursion + 3
Step 3: estimate depth of recursion
• Calculate upper bound approximation for
depth of recursion; may slightly
overestimate, but will not underestimate
actual value
– Each recursive call is made on an array segment
that contains, at most, N/2 elements
– Subsequent calls are always made on size/2
– Thus, depth of recursion is, at most, the number
of times N can be divided by 2 before the result
drops below 1
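One way to check this bound is to count the halvings directly and compare the count with the number of recursive calls an unsuccessful search actually makes. The sketch below assumes integer division, so "halving until the result drops below 1" becomes "halving until the size reaches 0"; HalvingCount and CountRecursiveCalls are hypothetical names.

```cpp
#include <cstddef>

// Number of times n can be halved (integer division) before reaching 0.
std::size_t HalvingCount(std::size_t n)
{
    std::size_t count = 0;
    while (n > 0)
    {
        n /= 2;
        ++count;
    }
    return count;
}

// Hypothetical instrumented search: same segment logic as a recursive
// binary search, but it only counts the recursive calls that are made.
template <class item>
std::size_t CountRecursiveCalls(const item array[], std::size_t first,
                                std::size_t size, const item& target)
{
    if (size == 0)
        return 0;                   // base case makes no further calls
    std::size_t middle = first + size / 2;
    if (target == array[middle])
        return 0;                   // found: no further calls
    if (target < array[middle])
        return 1 + CountRecursiveCalls(array, first, size / 2, target);
    return 1 + CountRecursiveCalls(array, middle + 1, (size - 1) / 2, target);
}
```

For N = 13 the halving count is 4: a search for a value smaller than every element examines segments of size 13, 6, 3 and 1 before reaching the empty segment. A target larger than every element takes the (size-1)/2 branch each time and never needs more calls than that.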
Estimating depth of recursion
• Referring to “the number of times N can be
divided by 2 before the result drops below 1” as
H(n), or the halving function, the time
expression becomes:
T(n) = 18 * H(n) + 3
• H(n) turns out to be almost exactly equal to
log₂ n: H(n) = ⌊log₂ n⌋ + 1, where ⌊ ⌋ means that
fractional results are rounded down to the nearest
whole number (e.g. ⌊3.7⌋ = 3) -- this notation is
called the floor function
• The + 1 counts the final one-element segment,
which must also be examined
Worst-case time for binary search
• Substituting the floor function of the
logarithm for H(n), the time expression
becomes:
T(n) = 18 * (⌊log₂ n⌋ + 1) + 3
• Throwing out the constants, the worst-case
running time (big O) function is: O(log n)
Significance of logarithms (again)
• Logarithmic algorithms are very fast
because log n is much smaller than n
• The larger the data set, the more dramatic
the difference becomes:
– log₂ 8 = 3
– log₂ 64 = 6
– log₂ 1000 < 10
– log₂ 1,000,000 < 20
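These values can be checked with a few lines of code. The sketch below computes the floor of the base-2 logarithm by repeated halving; FloorLog2 is a hypothetical helper name, and it assumes n is at least 1.

```cpp
#include <cstddef>

// Floor of log base 2 of n, computed by repeated halving.
// Assumes n >= 1; returns 0 for n == 0 as well.
std::size_t FloorLog2(std::size_t n)
{
    std::size_t result = 0;
    while (n > 1)
    {
        n /= 2;
        ++result;
    }
    return result;
}
```

FloorLog2 gives 3 for 8, 6 for 64, 9 for 1000 and 19 for 1,000,000, consistent with the bounds listed above.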
For binary search algorithm...
• Searching a 1000-element array requires
no more than 183 operations in the worst
case
• Searching a 1,000,000-element array
requires no more than 363 operations in the
worst case
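These figures follow from the operation counts estimated earlier. The sketch below assumes the slides' constants (at most 18 operations for each call on a non-empty segment, 3 for the stopping case) and a worst case in which every call keeps the larger half; WorstCaseOperations is a hypothetical name.

```cpp
#include <cstddef>

// Worst-case operation estimate for binary search on n items, using
// the slides' estimated constants: 18 operations per call on a
// non-empty segment, plus 3 for the final call on an empty segment.
std::size_t WorstCaseOperations(std::size_t n)
{
    std::size_t depth = 0;      // calls made on non-empty segments
    while (n > 0)
    {
        n /= 2;                 // worst case keeps the larger half
        ++depth;
    }
    return 18 * depth + 3;
}
```

For n = 1000 the depth is 10, giving 18 * 10 + 3 = 183; for n = 1,000,000 the depth is 20, giving 18 * 20 + 3 = 363.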