Searching
The truth is out there ...

Serial Search
• Brute-force algorithm: examine each array item sequentially until either:
  – the item is found
  – all items have been examined
• Algorithm is easy to code and works well enough for small data sets

Code example for serial search
    // precondition: none
    // postcondition: searches an array of N items for target value;
    // returns true if target found, false if not
    template <class item>
    bool SerialSearch (item array[], size_t N, item target)
    {
        bool found = false;
        for (size_t x = 0; (x < N) && (!found); x++)
            if (array[x] == target)
                found = true;
        return found;
    }

Time analysis of serial search
• Worst case: O(N) -- if the item is not in the array, every element must be examined before that can be confirmed
• Best case: O(1) -- target value found at array[0]
• Average case (target present): about (N+1)/2 comparisons -- still O(N), but roughly half the work of the worst case

Binary search
• Much faster than serial search
• Works only if data are sorted
• Uses divide & conquer approach with recursive calls:
  – check value at midpoint; if it is not the target then
  – if target is less than the midpoint value, recursively search the “lower” half of the structure
  – if target is greater than the midpoint value, recursively search the “upper” half

Implementation of binary search
    // precondition: array[first] through array[first+size-1] are sorted
    // in ascending order
    // postcondition: searches size items, starting at array[first], for
    // target value; sets found to true if target found, false if not
    // parameters: array is the array to be searched,
    //             first is the first index to be considered,
    //             size is the number of items in search group,
    //             target is the value being sought,
    //             found is the success/failure flag,
    //             location is the index of the entry containing the
    //             target value, if found
    template <class item>
    void BinarySearch(item array[], size_t first, size_t size, item target,
                      bool& found, size_t& location)
    { // start of function
        size_t middle; // index of midpoint of current search area
        if (size == 0)
            found = false; // base case
        else
        {
            middle = first + size / 2;
            if (target == array[middle])
            {
                location = middle;
                found = true;
            }
            // target not found at current midpoint -- search appropriate half
            else if (target < array[middle])
                // searches from start of current segment to index before midpoint
                BinarySearch (array, first, size/2, target, found, location);
            else
                // searches from index after midpoint to end of current segment
                BinarySearch (array, middle+1, (size-1)/2, target, found, location);
        } // ends outer else
    } // ends function

Binary search in action
Suppose you have a 13-member array of sorted numbers:

    index:  [0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9] [10] [11] [12]
    value:    5   14   23   47   59   71   82   99  108  113  130  151  172

Searching for value: 113
• Initial function call: first = 0, size = 13, middle = 6
  Since 113 != 82 (and 113 > 82), make recursive call on the upper half:
  BinarySearch (array, middle+1, (size-1)/2, target, found, location);
• Recursive call(1): first = 7, size = 6, middle = 10
  Since 113 != 130 (and 113 < 130), make recursive call on the lower half:
  BinarySearch (array, first, size/2, target, found, location);
• Recursive call(2): first = 7, size = 3, middle = 8
  Since 113 != 108 (and 113 > 108), make recursive call on the upper half:
  BinarySearch (array, middle+1, (size-1)/2, target, found, location);
• Recursive call(3): first = 9, size = 1, middle = 9
  Since 113 == 113, target is found; found = true, location = 9

Binary Search Analysis
• Worst-case scenario: item is not in the array
  – algorithm keeps searching smaller subarrays
  – eventually, array size will be 0, and the search will stop
• Analysis requires computing time needed for operations in function as well as amount of time for recursive calls
• We will analyze the algorithm’s performance in the worst case

Step 1: count operations
• Test base case: if (size==0) -- 1 operation
• Compute midpoint: middle = first + size/2; -- 3 operations
• Test for target at midpoint: if (target == array[middle]) -- 2 operations
• Test for which recursive call to make: if (target < array[middle]) -- 2 operations
• Recursive call -- requires some arithmetic and argument passing; estimate 10 operations

Step 2: analyze cost of recursion
• Each recursive call is preceded by 18 (or fewer) operations (1 + 3 + 2 + 2 + 10 = 18)
• Multiply this number by the depth of recursive calls and add the number of operations performed in the stopping case to determine worst-case running time (T(n))
• T(n) = 18 * depth of recursion + 3

Step 3: estimate depth of recursion
• Calculate upper bound approximation for depth of recursion; may slightly overestimate, but will not underestimate actual value
  – Each recursive call is made on an array segment that contains, at
most, N/2 elements
  – Subsequent calls are always made on at most size/2 elements
  – Thus, depth of recursion is, at most, the number of times N can be divided by 2 with a result > 1

Estimating depth of recursion
• Referring to “the number of times N can be divided by 2 with a result > 1” as H(n), or the halving function, the time expression becomes:
  $T(n) = 18 \cdot H(n) + 3$
• H(n) turns out to be almost exactly equal to $\log_2 n$:
  $H(n) = \lfloor \log_2 n \rfloor$
  The brackets mean that fractional results are rounded down to the nearest whole number (e.g. $\lfloor 3.7 \rfloor = 3$) -- this notation is called the floor function

Worst-case time for binary search
• Substituting the floor function of the logarithm for H(n), the time expression becomes:
  $T(n) = 18 \cdot \lfloor \log_2 n \rfloor + 3$
• Throwing out the constants, the worst-case running time (big O) function is: O(log n)

Significance of logarithms (again)
• Logarithmic algorithms are very fast because log n is much smaller than n
• The larger the data set, the more dramatic the difference becomes:
  – $\log_2 8 = 3$
  – $\log_2 64 = 6$
  – $\log_2 1000 < 10$
  – $\log_2 1{,}000{,}000 < 20$

For binary search algorithm...
• Searching a 1000-element array requires no more than 183 operations in the worst case (18 * 10 + 3, using $\log_2 1000 < 10$)
• Searching a 1,000,000-element array requires fewer than 400 operations in the worst case (18 * 20 + 3 = 363, using $\log_2 1{,}000{,}000 < 20$)
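
The logarithmic growth claimed above can be checked by counting calls directly. The following is a minimal sketch, not part of the original slides: it repeats the recursive BinarySearch logic and adds a `calls` counter. The names `BinarySearchCounted`, `FloorLog2`, and `WorstCaseCalls` are hypothetical helpers introduced only for this illustration, and the test data (the integers 0..n-1, searched for -1) is an assumption chosen so that every step takes the larger "lower" half.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same logic as the slides' BinarySearch, plus a `calls` counter
// (an addition for this sketch) that records how many times the
// function is entered, including the initial call.
template <class Item>
void BinarySearchCounted(const Item array[], std::size_t first, std::size_t size,
                         const Item& target, bool& found, std::size_t& location,
                         std::size_t& calls)
{
    ++calls;
    if (size == 0)
    {
        found = false; // base case: empty segment
        return;
    }
    std::size_t middle = first + size / 2;
    if (target == array[middle])
    {
        location = middle;
        found = true;
    }
    else if (target < array[middle])
        BinarySearchCounted(array, first, size / 2, target, found, location, calls);
    else
        BinarySearchCounted(array, middle + 1, (size - 1) / 2, target, found, location, calls);
}

// floor(log2(n)) for n >= 1, computed by repeated halving.
std::size_t FloorLog2(std::size_t n)
{
    assert(n >= 1);
    std::size_t h = 0;
    while (n > 1)
    {
        n /= 2;
        ++h;
    }
    return h;
}

// Counts the calls made when the target is smaller than every element
// of a sorted n-element array, so each step recurses on size/2 items.
std::size_t WorstCaseCalls(std::size_t n)
{
    std::vector<int> data(n);
    for (std::size_t i = 0; i < n; ++i)
        data[i] = static_cast<int>(i); // sorted: 0, 1, ..., n-1
    bool found = false;
    std::size_t location = 0;
    std::size_t calls = 0;
    BinarySearchCounted(data.data(), 0, n, -1, found, location, calls);
    return calls;
}
```

In this sketch, WorstCaseCalls(1000) returns 11 and WorstCaseCalls(1000000) returns 21. Both equal FloorLog2(n) + 2, because the counter includes the initial call and the final call on an empty segment. That differs from the H(n) estimate only by a small constant, so the O(log n) conclusion is unaffected.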