SEARCHING part II

Searching is the process of finding a designated target element within a group of items, or determining that the target does not exist within the group. The group of items to be searched is sometimes called the search pool. The more items there are in the search pool, the more comparisons it may take to find the target. The size of the problem is defined by the number of items in the search pool.

To be able to search for an object, we must be able to compare one object to another. These algorithms search an array of Comparable objects. The elements involved must implement the Comparable interface and be comparable to each other.

   public class SortingandSearching<T extends Comparable>

Recall that the Comparable interface contains one method, compareTo, which is designed to return an integer that is less than, equal to, or greater than zero, according to whether the object is less than, equal to, or greater than the object passed as the argument. Any class that implements the Comparable interface defines the relative order of any two objects of that class.

Linear Search

   /********************************************************************
     Searches the specified array of objects using a linear search
     algorithm.
   ********************************************************************/
   public boolean linearSearch (T[] data, int min, int max, T target)
   {
      int index = min;
      boolean found = false;

      // stop as soon as the target is found or the range is exhausted
      while (!found && index <= max)
      {
         if (data[index].compareTo(target) == 0)
            found = true;
         index++;
      }

      return found;
   }

The while loop steps through the elements of the array, terminating when either the element is found or the end of the array is reached. The boolean variable found is initialized to false and is changed to true only if the target element is located.

Variations of this implementation could return the element found in the array if it is found and return a null reference if it is not found. Alternatively, an exception could be thrown if the target element is not found.
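As an illustration of the first variation mentioned above, the following is a minimal sketch rather than a definitive implementation. The class name LinearSearchSketch and the method name linearSearchElement are hypothetical, and the method is written as a static generic method bounded by Comparable<T> (an assumption made here so the example stands alone; the class in these notes uses the raw Comparable type).

```java
// Hypothetical class and method names, used for illustration only.
class LinearSearchSketch {

    // Returns the first element in data[min..max] that compares equal to
    // target, or null if no such element is found.
    static <T extends Comparable<T>> T linearSearchElement(T[] data, int min, int max, T target) {
        for (int index = min; index <= max; index++) {
            if (data[index].compareTo(target) == 0) {
                return data[index];   // found: hand back the stored element
            }
        }
        return null;                  // range exhausted without a match
    }

    public static void main(String[] args) {
        Integer[] values = {7, 3, 9, 4};

        Integer hit = linearSearchElement(values, 0, values.length - 1, 9);
        Integer miss = linearSearchElement(values, 0, values.length - 1, 5);

        System.out.println(hit);    // prints 9
        System.out.println(miss);   // prints null
    }
}
```

Returning the stored element rather than a boolean is useful when the target is only a lookup key and the array holds the complete object. The caller must then check for null before using the result.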
Binary Search

A binary search algorithm eliminates large parts of the search pool with each comparison by capitalizing on the fact that the search pool is in sorted order. Instead of starting the search at one end or the other, a binary search begins in the middle of the sorted list. If the target is not the middle element but is in the list, it will be on one side or the other, depending on whether the target is less than or greater than the middle element. Half of the search pool is eliminated with one carefully chosen comparison. The remaining half of the search pool represents the viable candidates in which the target element may yet be found.

With each comparison, a binary search eliminates approximately half of the remaining data to be searched. It also eliminates the middle element.

The following method, binarySearch, is implemented recursively. If the target element is not found, and there is more data to search, the method calls itself, passing parameters that shrink the range of viable candidates within the array. The min and max indexes are used to determine if there is still more data to search.

   /********************************************************************
     Searches the specified array of objects using a binary search
     algorithm.
   ********************************************************************/
   public boolean binarySearch (T[] data, int min, int max, T target)
   {
      boolean found = false;
      int midpoint = (min + max) / 2;  // determine the midpoint

      if (data[midpoint].compareTo(target) == 0)
         found = true;
      else if (data[midpoint].compareTo(target) > 0)
      {
         // target can only be in the lower half
         if (min <= midpoint - 1)
            found = binarySearch(data, min, midpoint - 1, target);
      }
      else if (midpoint + 1 <= max)
         // target can only be in the upper half
         found = binarySearch(data, midpoint + 1, max, target);

      return found;
   }

If the reduced search area does not contain at least one element, the method does not call itself and a value of false is returned.
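To make the halving visible, the sketch below records which index is examined at each step. It is an illustration only: it uses a loop in place of the recursion shown above, and the names BinarySearchTrace, search, and probes are hypothetical. The sample array is also an assumption chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class and method names, used for illustration only.
class BinarySearchTrace {

    // Iterative binary search over data[min..max], which must be sorted in
    // ascending order. Each midpoint index examined is appended to probes.
    static <T extends Comparable<T>> boolean search(T[] data, int min, int max,
                                                    T target, List<Integer> probes) {
        while (min <= max) {
            int midpoint = (min + max) / 2;   // same midpoint rule as the text
            probes.add(midpoint);

            int comparison = data[midpoint].compareTo(target);
            if (comparison == 0) {
                return true;                  // target found at midpoint
            } else if (comparison > 0) {
                max = midpoint - 1;           // keep only the lower half
            } else {
                min = midpoint + 1;           // keep only the upper half
            }
        }
        return false;                         // no candidates remain
    }

    public static void main(String[] args) {
        Integer[] sorted = {2, 5, 9, 14, 20, 27, 33, 41, 50};

        List<Integer> probes = new ArrayList<>();
        boolean found = search(sorted, 0, sorted.length - 1, 27, probes);
        System.out.println(found + " " + probes);   // prints true [4, 6, 5]

        probes.clear();
        found = search(sorted, 0, sorted.length - 1, 10, probes);
        System.out.println(found + " " + probes);   // prints false [4, 1, 2, 3]
    }
}
```

In both runs, each probe discards the midpoint together with every element on one side of it, so nine elements are resolved in at most four probes.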
In this implementation of the binary search, the calculation that determines the midpoint index uses integer division, which discards any fractional part. When the range contains an even number of elements, it therefore picks the first of the two middle values.

Comparing Search Algorithms

For a linear search, the best case occurs when the target element happens to be the first item we examine in the group. The worst case occurs when the target is not in the group, and we have to examine every element before we determine it isn’t present. The expected case is that we would have to search half of the list before we find the element. If there are n elements in the search pool, on average we would have to examine n/2 elements before finding the one for which we were searching.

Because constant factors are ignored when stating the order of an algorithm, the linear search algorithm has a linear time complexity of O(n). The elements are searched one at a time in turn, so the number of comparisons grows in direct proportion to the number of elements to be searched.

A binary search is generally much faster. Because we can eliminate half of the remaining data with each comparison, we can find elements much more quickly. The best case is that we find the target in one comparison. That is, the target element happens to be at the midpoint of the array. The worst case occurs if the element is not present in the list, in which case we have to make approximately log₂ n comparisons before we eliminate all of the data.

Therefore, a binary search is a logarithmic algorithm and has a time complexity of O(log₂ n). Compared to a linear search, a binary search is much faster for large values of n.

This raises a question: if a logarithmic search is more efficient than a linear search, why would we ever use a linear search? First, a linear search is generally simpler than a binary search, and thus easier to program and debug. Second, a linear search does not require the additional overhead of sorting the search list. For small problems, there is little practical difference between the two types of algorithms.
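To put rough numbers on the comparison above, the following sketch counts worst-case probes for both algorithms. It is a back-of-the-envelope illustration, not a measurement: the names SearchCostSketch, linearWorstCase, and binaryWorstCase are hypothetical, and a "probe" here means one examination of an array element.

```java
// Hypothetical class and method names, used for illustration only.
class SearchCostSketch {

    // Worst case for a linear search: every one of the n elements is examined.
    static int linearWorstCase(int n) {
        return n;
    }

    // Worst case for a binary search over n sorted elements. Each probe
    // leaves at most n / 2 candidates (integer division), so the count
    // works out to floor(log2 n) + 1 for n >= 1.
    static int binaryWorstCase(int n) {
        int probes = 0;
        while (n > 0) {
            probes++;
            n /= 2;
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] sizes = {8, 1_000, 1_000_000};
        for (int n : sizes) {
            System.out.println("n = " + n
                    + ", linear: " + linearWorstCase(n)
                    + ", binary: " + binaryWorstCase(n));
        }
    }
}
```

For n = 8 the two counts are 8 and 4, which is a negligible difference in practice. For n = 1,000,000 they are 1,000,000 and 20.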
As n gets larger, however, the binary search becomes increasingly attractive. Suppose a given set of data contains one million elements. In a linear search, we’d have to examine each of the one million elements to determine that a particular target is not in the group. In a binary search, we could reach that conclusion in roughly 20 comparisons, since 2 raised to the power 20 is just over one million.

Questions

1. When would a linear search be preferable to a logarithmic search?
2. Which searching method requires that the list be sorted?

Exercises:

1. Compare and contrast the linearSearch and binarySearch algorithms by searching for the numbers 45 and 54 in the following list: 3, 8, 12, 34, 54, 84, 91, 110
2. Change the binarySearch implementation to return the element found in the array if it is found, and to return a null reference if it is not found.