BINARY SEARCH
CS16: Introduction to Data Structures & Algorithms
Thursday February 12, 2015

Outline
1) Binary Search
2) Binary Search Pseudocode
3) Analysis of Binary Search
4) In-Place Binary Search
5) Iterative Binary Search

The Problem
• Determine whether an item, x, is in a sorted array
    1 1 3 4 7 8 10 10 12 18 19 21 23 23 24
  e.g. is 5 in this array?
• Obvious solution: iterate through the entire array and check each element to see if it’s the one we’re searching for
• This solution is O(n). Can we do better?
• Let’s use the fact that the array is sorted!
• If we’re looking for the item x, we can stop searching as soon as we find an item y > x, because we know x can’t come after y in the array
• But what if we’re looking for 25 in the example above?
• Worst case still O(n). Boooo.

Binary Search
    1 1 3 4 7 8 10 10 12 18 19 21 23 23 24
  (mid is the middle element: the second 10)
• What if we compared x to the middle element of the array, mid?
• If mid == x, then we found x!
• If mid < x, then we know x must be in the second half of the array, if it’s there at all
• If mid > x, then we know x must be in the first half of the array, if it’s there at all
• Important observation:
  • No matter what, we can eliminate half of the array
  • We then end up with the same problem, but half the size!
WOAH WE SHOULD DO IT AGAIN

Binary Search Simulation
Goal: Find 5
    1 1 3 4 7 8 10 10 12 18 19 21 23 23 24    (mid = 10; keep the first half because 5 < 10)
    1 1 3 4 7 8 10                            (mid = 4; keep the second half because 5 > 4)
    7 8 10                                    (mid = 8; keep the first half because 5 < 8)
    7
Since 7 ≠ 5, we can conclude that 5 is NOT in the array. And it only took us four comparisons!

Binary Search: First Analysis
• For an array of size n, how many comparisons do we need to make to determine if x is in the array? (worst case)
• After each comparison, the array size is cut in half
• So how many times must we divide n by 2 before we get an array of size 1? $\log_2 n$!
• So binary search should be O(log n), right???
• Let’s try out some pseudocode and see…

Binary Search Pseudocode
    function binarySearch(A, x):
        // Input: A, a sorted array
        //        x, the item to find
        // Output: true if x is in A, else false
        if A.size == 0:
            return false
        if A.size == 1:
            return A[0] == x
        mid = A.size / 2
        if A[mid] == x:
            return true
        if A[mid] < x:
            return binarySearch(A[mid + 1...end], x)
        if A[mid] > x:
            return binarySearch(A[0...mid - 1], x)

Binary Search Analyzed
• Since each recursive call cuts the problem size in half, the recurrence relation for binary search looks like:
    T(1) = c
    T(n) = T(n/2) + f(n)
• Where f(n) is the amount of work done at each level of recursion

Binary Search Analyzed (2)
    function binarySearch(A, x):
        if A.size == 0:                                 // O(1) (base case 1)
            return false                                // O(1)
        if A.size == 1:                                 // O(1) (base case 2)
            return A[0] == x                            // O(1)
        mid = A.size / 2                                // O(1)
        if A[mid] == x:                                 // O(1)
            return true                                 // O(1)
        if A[mid] < x:                                  // O(1)
            return binarySearch(A[mid + 1...end], x)    // copies half the array: O(n)
        if A[mid] > x:                                  // O(1)
            return binarySearch(A[0...mid - 1], x)      // copies half the array: O(n)
Haaang on. What is this sketchy business in the recursive calls?? In order to pass a smaller array to the recursive call, we’re making a new array and copying over half the contents! This step’s O(n), kid…

Binary Search Analyzed (3)
• Now that we know f(n) is O(n), we can solve our recurrence relation using plug ‘n’ chug. Here f(n) = c_1 n + c_2, a linear function of n:

$$
\begin{aligned}
T(n) &= T(n/2) + c_1 n + c_2 \\
T(1) &= c_0 \\
T(2) &= T(1) + 2c_1 + c_2 = c_0 + 2c_1 + c_2 \\
T(4) &= T(2) + 4c_1 + c_2 = c_0 + (4 + 2)c_1 + 2c_2 \\
T(8) &= T(4) + 8c_1 + c_2 = c_0 + (8 + 4 + 2)c_1 + 3c_2 \\
T(n) &= c_0 + \left(n + \tfrac{n}{2} + \tfrac{n}{4} + \dots + 4 + 2\right)c_1 + (\log_2 n)\,c_2
\end{aligned}
$$

• The sum n + n/2 + n/4 + ... + 4 + 2 equals 2n - 2, so it is always less than 2n
• Therefore, T(n) is O(n + log n), which is O(n).
• That’s just as bad as iterating through the whole array!

What went wrong?
• In our initial simulation of binary search, we found that it took only O(log n) comparisons to solve the problem
• But when it came to implementing the algorithm, copying half the array ended up costing us too much at each step! The runtime went back up to O(n)
• This is a very common pitfall when trying to implement efficient algorithms. Sometimes taking the most straightforward approach is not enough to achieve the fast runtime you hope for
• In the case of binary search, this means we need to implement the algorithm in-place. In other words, we can only use the array that was given to us as input. No copying allowed!

In-Place Binary Search
    function binarySearch(A, lo, hi, x):
        // Input: A, a sorted array
        //        lo, hi, two valid indices of the array
        //        x, the item to find
        // Output: true if x is in the array between lo and hi, inclusive
        if lo >= hi:
            return A[lo] == x
        mid = (lo + hi) / 2
        if A[mid] == x:
            return true
        if A[mid] < x:
            return binarySearch(A, mid + 1, hi, x)
        if A[mid] > x:
            return binarySearch(A, lo, mid - 1, x)

In-Place Binary Search (2)
• Now it’s clear that our binary search only performs a constant number of operations at each level of recursion
• The recurrence relation becomes: T(n) = T(n/2) + c_1
• Plugging in, we get:

$$
\begin{aligned}
T(1) &= c_0 \\
T(2) &= T(1) + c_1 = c_0 + c_1 \\
T(4) &= T(2) + c_1 = c_0 + 2c_1 \\
T(8) &= T(4) + c_1 = c_0 + 3c_1 \\
T(2^k) &= c_0 + k c_1
\end{aligned}
$$

• If we let $n = 2^k$, then: $T(n) = c_0 + (\log_2 n)\,c_1$
• So our in-place algorithm is O(log n)! Yay!

In-Place Binary Search: Iterative
    function binarySearch(A, x):
        // Input: A, a sorted array
        //        x, the item to find
        // Output: true if x is in the array
        if A.size == 0:        // guard: A[lo] below needs at least one element
            return false
        lo = 0
        hi = A.size - 1
        while lo < hi:
            mid = (lo + hi) / 2
            if A[mid] == x:
                return true
            if A[mid] < x:
                lo = mid + 1
            if A[mid] > x:
                hi = mid - 1
        return A[lo] == x
Remember: Any recursive algorithm can be implemented iteratively!
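
The in-place pseudocode can be tried out directly. Below is a minimal Python sketch of both in-place versions, under a few assumptions: the function names `binary_search_recursive` and `binary_search_iterative` are made up for this example, the default values for `lo` and `hi` are a convenience, and the empty-list guard is an addition so that `a[lo]` is never read from an empty list. Neither function slices the list, so each step does only constant work.

```python
def binary_search_recursive(a, x, lo=0, hi=None):
    """Return True if x is in the sorted list a between lo and hi, inclusive.

    Hypothetical name and defaults; only indices are passed down,
    so no part of the list is ever copied.
    """
    if hi is None:
        hi = len(a) - 1
    if not a:  # assumption: an empty list simply contains nothing
        return False
    if lo >= hi:  # at most one candidate left
        return a[lo] == x
    mid = (lo + hi) // 2  # integer division, like A.size / 2 in the pseudocode
    if a[mid] == x:
        return True
    if a[mid] < x:
        return binary_search_recursive(a, x, mid + 1, hi)
    return binary_search_recursive(a, x, lo, mid - 1)


def binary_search_iterative(a, x):
    """Return True if x is in the sorted list a, using a loop instead of recursion."""
    if not a:  # assumption: guard so a[lo] below is always a valid read
        return False
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] == x:
            return True
        if a[mid] < x:
            lo = mid + 1  # x can only be in the second half
        else:
            hi = mid - 1  # x can only be in the first half
    return a[lo] == x
```

On the example array 1 1 3 4 7 8 10 10 12 18 19 21 23 23 24, both functions report that 5 and 25 are absent and that 7 is present.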