Introduction to Algorithms: Verification, Complexity, and Searching (2)
Andy Wang
Data Structures, Algorithms, and Generic Programming

Lecture Overview
  Review: binary search algorithms
  More on computational complexity

Binary Search
  Goal: find a value in a collection of values
  Idea: divide and conquer

Binary Search (2)
  Requirements
    Collection must be "array"-like: an index can be used to jump to any element
    Collection must be sorted
  Efficiency
    Very fast: the number of comparisons grows logarithmically with the collection size (derived later in this lecture)
    No extra space required

Binary Search: the idea
  Index:   0   1   2   3   4   5   6   7
  Value:  11  23  35  47  53  60  72  82
  Search target: 35
  Step 1: search range 0 to 7; the middle element is 47, and 35 < 47
  Step 2: search range 0 to 3; the middle element is 23, and 23 < 35
  Step 3: search range 2 to 3; the middle element is 35, and 35 == 35 (found)

Binary Search Algorithm
  Three versions
    binary_search
    lower_bound
    upper_bound
  Assumptions
    Collection L of elements of type T, with size sz
    L is sorted
    Element t of type T

Binary Search Algorithm (2)
  Outcomes
    binary_search: true if t is in L; false otherwise
    lower_bound: smallest j where t <= L[j]
    upper_bound: smallest j where t < L[j]
  Example, for t = 35
    Index:   0   1   2   3   4   5   6   7
    Value:  11  23  35  47  53  60  72  82
    Lower bound: index 2 (value 35)
    Upper bound: index 3 (value 47)

If there are duplicate entries
  Index:   0   1   2   3   4   5   6   7
  Value:  11  23  35  35  53  60  72  82
  Search range: 0 to 7
  Search target: 35
  Lower bound: smallest j where t <= L[j], here index 2 (the first 35)
  Upper bound: smallest j where t < L[j], here index 4 (value 53)

If t is not in L
  Index:   0   1   2   3   4   5   6   7
  Value:  11  23  24  25  53  60  72  82
  Search range: 0 to 7
  Search target: 35
  Lower bound and upper bound coincide: both are index 4 (value 53)

Correctness and Loop Invariants
  Correctness
    Loop termination
    State when entering the loop
    State when exiting the loop
  Loop invariants
    Conditions that remain true for each iteration
    Established by mathematical induction

Invariants: Binary Search
    template <typename T>
    unsigned int lower_bound(T* L, unsigned max, T t) {
        unsigned int low = 0, mid, high = max;
        while (low < high) {
            // (1) low < high
            // (2) L[low - 1] < t <= L[high] (if index is valid)
            mid = (low + high) / 2;
            if (L[mid] < t) {
                low = mid + 1;
            } else {
                high = mid;
            }
            // (3) low <= high
            // (4) high - low has decreased
            // (5) L[low - 1] < t <= L[high] (if index is valid)
        }
        return low;
    }
  Note: t does not have to be in L.

Invariants: Binary Search, proof of (3) when low = mid + 1
  On entry, (1) gives old_low < high.
    low =  mid + 1 = (old_low + high)/2 + 1     (integer division)
    low <= (old_low + high)/2 + 1               (exact division)
    low <  (high + high)/2 + 1                  (since old_low < high)
    low <  high + 1
    low <= high

Invariants: Binary Search, proof of (3) when high = mid
  On entry, (1) gives low < old_high.
    high =  mid = (low + old_high)/2            (integer division)
    high >  (low + old_high)/2 - 1              (exact division)
    high >  (low + low)/2 - 1                   (since low < old_high)
    high >  low - 1
    high >= low

Invariants: Binary Search, termination
  (3) shows that the loop can terminate: low never passes high
  (4) shows progress: high - low strictly decreases on every iteration

Invariants: Binary Search, return value
  By (5), the returned low is the smallest j with t <= L[j]:
    L[j] < t for every j < low, so no smaller index qualifies
    t <= L[high], and low == high when the loop exits

Computational Complexity
  Compares the growth of two functions
  Independent of constant multipliers and lower-order effects
  Metrics
    "Big O" notation
    "Big Omega" notation
    "Big Theta" notation

Big "O" Notation
  $f(n)$ is $O(g(n))$ iff $\exists\, c, n_0 > 0$ such that $0 < f(n) < c\,g(n)$ for all $n \ge n_0$
  [Figure: $f(n) = n$ stays below $c\,g(n) = c\,n^2$ to the right of $n_0$]

Examples
  $F(n)$ is $O(1)$
    $F(n) = 1$
    $F(n) = 2$
    $F(n) = c$ (constant)
  $F(n)$ is $O(\log n)$
    $F(n) = 1$
    $F(n) = 2\log n$
    $F(n) = 3\log_2(4n^5) + 1$
    $F(n) = c_1 \log_{c_2}(c_3 n^{c_4}) + O(\log n) + O(1)$

Examples
  $F(n)$ is $O(n)$
    $F(n) = 2\log n$
    $F(n) = n$
    $F(n) = 3n + 1$
    $F(n) = c_1 n + O(n) + O(\log n)$
  $F(n)$ is $O(n \log n)$
    $F(n) = 3n + 2$
    $F(n) = n\log n$
    $F(n) = 3n\log_4(5n^7) + 2n$
    $F(n) = c_1 n \log_{c_2}(c_3 n^{c_4}) + O(n\log n) + O(n)$

Examples
  $F(n)$ is $O(n^2)$
    $F(n) = 3n\log n + 2n$
    $F(n) = n^2$
    $F(n) = 3n^2 + 2n + 1$
    $F(n) = c_1 n^2 + O(n^2) + O(n\log n)$

Big "Theta" Notation
  $f(n)$ is $\Theta(g(n))$ iff $\exists\, c_1, c_2, n_0 > 0$ such that $0 < c_1 g(n) < f(n) < c_2 g(n)$ for all $n \ge n_0$
  [Figure: $f(n) = n$ lies between $\tfrac{1}{2} g(n) = \tfrac{1}{2} n$ and $2 g(n) = 2n$ to the right of $n_0$]

Examples
  $F(n)$ is $\Theta(1)$
    $F(n) = 1$
    $F(n) = 2$
    $F(n) = c$ (constant)
  $F(n)$ is $\Theta(\log n)$
    $F(n) = 1$ (counterexample: $O(\log n)$ but not $\Theta(\log n)$)
    $F(n) = 2\log n$
    $F(n) = 3\log_2(4n^5) + 1$
    $F(n) = c_1 \log_{c_2}(c_3 n^{c_4}) + O(\log n) + O(1)$

Examples
  $F(n)$ is $\Theta(n)$
    $F(n) = 2\log n$ (counterexample: $O(n)$ but not $\Theta(n)$)
    $F(n) = n$
    $F(n) = 3n + 1$
    $F(n) = c_1 n + O(n) + O(\log n)$
  $F(n)$ is $\Theta(n \log n)$
    $F(n) = 3n + 2$ (counterexample: $O(n\log n)$ but not $\Theta(n\log n)$)
    $F(n) = n\log n$
    $F(n) = 3n\log_4(5n^7) + 2n$
    $F(n) = c_1 n \log_{c_2}(c_3 n^{c_4}) + O(n\log n) + O(n)$

Examples
  $F(n)$ is $\Theta(n^2)$
    $F(n) = 3n\log n + 2n$ (counterexample: $O(n^2)$ but not $\Theta(n^2)$)
    $F(n) = n^2$
    $F(n) = 3n^2 + 2n + 1$
    $F(n) = c_1 n^2 + O(n^2) + O(n\log n)$

Big "Omega" Notation
  $f(n)$ is $\Omega(g(n))$ iff $\exists\, c, n_0 > 0$ such that $0 < c\,g(n) < f(n)$ for all $n \ge n_0$
  [Figure: $f(n) = n$ stays above $c\,g(n) = c$ to the right of $n_0$]

Examples
  $F(n)$ is $\Omega(1)$
    $F(n) = 1$
    $F(n) = 2n$
    $F(n) = n^2$
  $F(n)$ is $\Omega(\log n)$
    $F(n) = 1$ (counterexample: not $\Omega(\log n)$)
    $F(n) = 2\log n$
    $F(n) = n\log n + n + 1$
    $F(n) = n^3$

Examples
  $F(n)$ is $\Omega(n)$
    $F(n) = 2\log n$ (counterexample: not $\Omega(n)$)
    $F(n) = n$
    $F(n) = 3n^2 + 1$
    $F(n) = n\log n + 3n^2 + 1$
  $F(n)$ is $\Omega(n \log n)$
    $F(n) = 3n + 2$ (counterexample: not $\Omega(n\log n)$)
    $F(n) = n\log n$
    $F(n) = 3n^2$
    $F(n) = n^3 + 2n^2$

Examples
  $F(n)$ is $\Omega(n^2)$
    $F(n) = 3n\log n + 2n$ (counterexample: not $\Omega(n^2)$)
    $F(n) = n^2$
    $F(n) = 3n^3 + 2n + 1$

Order the following functions
  $n^{1/50}$, $\log(n^2)$, $(\log n)^2$, $5n^2$, $100^{\log n}$, $1.1^n$
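One way to build intuition for this ordering exercise is to compare the six functions numerically. The sketch below is only an illustration, not part of the lecture: it assumes the flattened exponents read as $n^{1/50}$, $100^{\log n}$ and $1.1^n$, assumes $\log$ is base 10, and uses the hypothetical helpers `log10_at` and `rank_at`. To avoid overflow it works with $x = \log_{10} n$ and compares the base-10 logarithm of each function value.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: log10 of each function at n = 10^x,
// listed in the same order as on the slide. Assumes log means log10.
std::array<double, 6> log10_at(double x) {
    assert(x > 0.0);  // need n > 1 so that log(n) is positive
    std::array<double, 6> v = {{
        x / 50.0,                            // n^(1/50)
        std::log10(2.0 * x),                 // log(n^2) = 2 log(n) = 2x
        2.0 * std::log10(x),                 // (log n)^2 = x^2
        std::log10(5.0) + 2.0 * x,           // 5 n^2
        2.0 * x,                             // 100^(log n) = n^2
        std::pow(10.0, x) * std::log10(1.1)  // 1.1^n
    }};
    return v;
}

// Hypothetical helper: slide positions (0..5) of the six functions,
// sorted from smallest to largest value at n = 10^x.
std::array<std::size_t, 6> rank_at(double x) {
    const std::array<double, 6> v = log10_at(x);
    std::array<std::size_t, 6> idx = {{0, 1, 2, 3, 4, 5}};
    std::sort(idx.begin(), idx.end(),
              [&v](std::size_t a, std::size_t b) { return v[a] < v[b]; });
    return idx;
}
```

Calling `rank_at(10.0)` and `rank_at(300.0)` gives different rankings: at $n = 10^{10}$ the function $n^{1/50}$ is still the smallest of the six, while at $n = 10^{300}$ it has overtaken both logarithmic functions. Asymptotic order only describes what happens for sufficiently large $n$, which is the role of $n_0$ in the definitions.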
Order the following functions
  $n^{1/50}$, $\log(n^2)$, $(\log n)^2$, $5n^2$, $100^{\log n}$, $1.1^n$
  Simplify first
    $\log(n^2) = 2\log n$
    $(\log n)^2 = \log n \cdot \log n$
    $100^{\log n} = n^{\log 100} = n^2$ (taking $\log$ as base 10)
  Result
    $\log(n^2) < (\log n)^2 < n^{1/50} < 100^{\log n} < 5n^2 < 1.1^n$
  Note: $100^{\log n} = n^2$ and $5n^2$ differ only by a constant factor, so both are $\Theta(n^2)$.

Complexity Analysis
  Steps
    Find n = size of input
    Find an atomic activity to count
    Find f(n) = the number of atomic activities done for an input of size n
    Complexity of an algorithm = complexity of f(n)

Algorithm Complexity: Loops
    for (j = 0; j < n; ++j) {
        // 3 atomics
    }
  Complexity = $\Theta(3n) = \Theta(n)$

Loops with Break
    for (j = 0; j < n; ++j) {
        // 3 atomics
        if (condition) break;
    }
  Upper bound (the loop never breaks) = $\Theta(3n) = \Theta(n)$
  Lower bound (the loop breaks on the first iteration) = $\Theta(3) = \Theta(1)$
  Complexity = $O(n)$

Loops in Sequence
    for (j = 0; j < n; ++j) {
        // 3 atomics
    }
    for (j = 0; j < n; ++j) {
        // 5 atomics
    }
  Complexity = $\Theta(3n + 5n) = \Theta(n)$

Nested Loops
    for (j = 0; j < n; ++j) {
        // 2 atomics
        for (k = 0; k < n; ++k) {
            // 3 atomics
        }
    }
  Complexity = $\Theta((2 + 3n)n) = \Theta(n^2)$

Sequential Search
    // Pseudocode: begin, end, and next stand for whatever traversal L supports
    for (T item = begin(L); item != end(L); item = next(L)) {
        if (t == item) return true;    // the atomic computation being counted
    }
    return false;
  Input size: n
  Atomic computation: comparison
  Complexity = $O(n)$

Binary Search
    template <typename T>
    unsigned int lower_bound(T* L, unsigned max, T t) {
        unsigned int low = 0, mid, high = max;
        while (low < high) {
            mid = (low + high) / 2;
            if (L[mid] < t) {
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        return low;
    }
  Input size: n
  Atomic computation: comparison
  Complexity = k iterations x 1 comparison per iteration

Iteration Count for Binary Search
  Each pass through the lower_bound loop halves the search space:
    Iteration   Search space
    1           n
    2           n/2
    3           n/4
    k           n/2^(k-1)
  The last iteration k is the one whose search space is a single element:
    $n / 2^{k-1} = 1$
    $n = 2^{k-1}$
    $\log_2 n = k - 1$
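The halving argument can be checked empirically by counting loop iterations. The following is a minimal sketch, not the lecture's code: `lower_bound_counted` and `floor_log2` are hypothetical names, the container is a `std::vector` rather than a raw pointer, and the midpoint is computed as `low + (high - low) / 2` instead of the slide's `(low + high) / 2` to sidestep overflow. Invariants (3) and (4) from the correctness discussion are included as assertions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the slides): floor(log2(n)) for n >= 1.
unsigned floor_log2(std::size_t n) {
    unsigned k = 0;
    while (n > 1) {
        n /= 2;
        ++k;
    }
    return k;
}

// Hypothetical instrumented variant of the slide's lower_bound.
// Returns the smallest j with t <= L[j] (or L.size() if there is none)
// and stores the number of loop iterations in `iterations`.
template <typename T>
std::size_t lower_bound_counted(const std::vector<T>& L, const T& t,
                                unsigned& iterations) {
    std::size_t low = 0, high = L.size();
    iterations = 0;
    while (low < high) {
        const std::size_t width = high - low;            // search space before this step
        const std::size_t mid = low + (high - low) / 2;  // overflow-safe midpoint
        ++iterations;                                    // one comparison per iteration
        if (L[mid] < t) {
            low = mid + 1;
        } else {
            high = mid;
        }
        assert(low <= high);         // invariant (3)
        assert(high - low < width);  // invariant (4): progress
    }
    return low;
}
```

For the eight-element array used earlier in the lecture (11, 23, 35, 47, 53, 60, 72, 82), searching for 11 takes 4 iterations, which equals `floor_log2(8) + 1`, while searching for 35 takes 3. The count never exceeds `floor_log2(n) + 1` for a non-empty input, because each step leaves at most half of the previous search space.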
Iteration Count for Binary Search (continued)
  Solving for the number of iterations k:
    $n / 2^{k-1} = 1$
    $n = 2^{k-1}$
    $\log_2 n = k - 1$
    $\log_2 n + 1 = k$
  Complexity function
    $f(n) = (\log_2 n + 1)$ iterations x 1 comparison per iteration $= \Theta(\log n)$

Announcement
  Exam 1 (9/29)
    String class
    BitVector and bit operations
    Hash functions and hash tables
    Templates
    Vectors
    Algorithm verification, complexity analysis, and searching