Algorithm Design and Analysis (ADA)
242-535, Semester 1 2014-2015

4. Divide and Conquer

• Objective
  o look at several divide and conquer examples (merge sort, binary search), and 3 approaches for calculating their running time

242-535 ADA: 4. Divide/Conquer

Overview
1. Divide and Conquer
2. A Faster Sort: Merge Sort
3. The Iteration Method
4. Recursion Trees
5. Merge Sort vs Insertion Sort
6. Binary Search
7. Recursion Tree Examples
8. Iteration Method Examples
9. The Master Method

1. Divide and Conquer
1. Divide the problem into subproblems.
2. Conquer the subproblems by solving them recursively.
3. Combine the subproblem solutions.

2. A Faster Sort: Merge Sort
Initial call: MERGESORT(A, 1, n)

MERGESORT(A, left, right)
1. if left < right,                  // if left ≥ right, do nothing
2.     mid := floor((left+right)/2)
3.     MERGESORT(A, left, mid)
4.     MERGESORT(A, mid+1, right)
5.     MERGE(A, left, mid, right)
6. return

A faster sort: MergeSort
[Figure: the input A[1 .. n] is split into A[1 .. mid] and A[mid+1 .. n]; MERGESORT is applied to each half, giving sorted A[1 .. mid] and sorted A[mid+1 .. n]; MERGE combines them into the output.]

Tracing MergeSort()
[Figure: trace of the MergeSort() calls, followed by the merge steps.]

Merging two sorted arrays
[Figure: step-by-step merge of two sorted arrays (20, 13, 7, 2 and 12, 11, 9, 1); the smaller front element is output at each step: 1, 2, 7, 9, 11, 12, ...]
Time = one pass through each array = O(n) to merge a total of n elements (linear time).

Analysis of Merge Sort

    Statement                             Effort
    MergeSort(A, left, right) {           T(n)
      if (left < right) {                 O(1)
        mid = floor((left+right)/2);      O(1)
        MergeSort(A, left, mid);          T(n/2)
        MergeSort(A, mid+1, right);       T(n/2)
        Merge(A, left, mid, right);       O(n)   (as shown on the previous slides)
      }
    }
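The MERGESORT pseudocode above can be sketched as runnable Java. This is a minimal sketch, not the course's own code: the class name MergeSortSketch is hypothetical, indices are 0-based rather than the 1-based indices of the pseudocode, and a compact merge helper is included only so the example runs on its own.

```java
import java.util.Arrays;

class MergeSortSketch {

    // Sorts A[left..right] (inclusive, 0-based) following the MERGESORT steps.
    static void mergeSort(int[] A, int left, int right) {
        if (left < right) {                       // one element or none: nothing to do
            int mid = left + (right - left) / 2;  // floor((left+right)/2), avoiding overflow
            mergeSort(A, left, mid);              // T(n/2)
            mergeSort(A, mid + 1, right);         // T(n/2)
            merge(A, left, mid, right);           // O(n)
        }
    }

    // Merges the sorted ranges A[left..mid] and A[mid+1..right] in one pass.
    static void merge(int[] A, int left, int mid, int right) {
        int[] temp = new int[right - left + 1];
        int a = left, b = mid + 1;
        for (int i = 0; i < temp.length; i++) {
            // Take from the first range if the second is used up,
            // or if both have elements and the first one's is not larger.
            if (b > right || (a <= mid && A[a] <= A[b])) temp[i] = A[a++];
            else temp[i] = A[b++];
        }
        System.arraycopy(temp, 0, A, left, temp.length);
    }

    public static void main(String[] args) {
        int[] data = {20, 13, 7, 2, 12, 11, 9, 1};
        mergeSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));  // prints [1, 2, 7, 9, 11, 12, 13, 20]
    }
}
```

Using `<=` in the merge comparison keeps equal elements in their original order, so the sort is stable.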
merge() Code
• merge(A, left, mid, right)
• Merges two adjacent subranges of an array A
  o left == the index of the first element of the first range
  o mid == the index of the last element of the first range
  o right == the index of the last element of the second range

    void merge(int[] A, int left, int mid, int right)
    {
      int[] temp = new int[right - left + 1];
      int aIdx = left;      // next unused element of the 1st range
      int bIdx = mid+1;     // next unused element of the 2nd range
      for (int i=0; i < temp.length; i++) {
        if (aIdx > mid)
          temp[i] = A[bIdx++];   // 1st range used up: copy 2nd range
        else if (bIdx > right)
          temp[i] = A[aIdx++];   // 2nd range used up: copy 1st range
        else if (A[aIdx] <= A[bIdx])
          temp[i] = A[aIdx++];
        else
          temp[i] = A[bIdx++];
      }
      // copy back into A
      for (int j = 0; j < temp.length; j++)
        A[left+j] = temp[j];
    }

3. The Iteration Method
• Up to now, we have been solving recurrences using the iteration method:
  o Write T() as a recursive equation using big-Oh
  o Convert the T() equation into algebra (replace the O()'s)
  o Expand the recurrence
  o Rewrite the recursion as a summation
  o Convert the algebra back to O()

MergeSort Running Time
• Recursive T() equation:
  o T(1) = O(1)
  o T(n) = 2T(n/2) + O(n), for n > 1
• Convert to algebra
  o T(1) = a
  o T(n) = 2T(n/2) + cn

Recurrence for Merge Sort
• The expression:
$$T(n) = \begin{cases} a & n = 1 \\ 2T(n/2) + cn & n > 1 \end{cases}$$
  is called a recurrence.
• A recurrence is an equation that describes a function in terms of its value on smaller inputs.

• Expanding the recurrence:
$$\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 2\bigl(2T(n/2/2) + cn/2\bigr) + cn \\
     &= 2^2 T(n/2^2) + cn \cdot 2/2 + cn \\
     &= 2^2 T(n/2^2) + cn(2/2 + 1) \\
     &= 2^2\bigl(2T(n/2^2/2) + cn/2^2\bigr) + cn(2/2 + 1) \\
     &= 2^3 T(n/2^3) + cn(2^2/2^2) + cn(2/2 + 1) \\
     &= 2^3 T(n/2^3) + cn(2^2/2^2 + 2/2 + 1) \\
     &\;\;\vdots \\
     &= 2^k T(n/2^k) + cn(2^{k-1}/2^{k-1} + 2^{k-2}/2^{k-2} + \dots + 2^2/2^2 + 2/2 + 1)
\end{aligned}$$

• So we have
  o $T(n) = 2^k T(n/2^k) + cn(2^{k-1}/2^{k-1} + \dots + 2^2/2^2 + 2/2 + 1)$   // k-1 fractions, each equal to 1
• For $k = \log_2 n$
  o $n = 2^k$, so the T() argument becomes 1
  o $T(n) = 2^k T(1) + cn(k-1+1) = na + cn \log_2 n = O(n) + O(n \log_2 n) = O(n \log_2 n)$

4. Recursion Trees
• A graphical technique for finding a big-oh solution to a recurrence
  o Draw a tree of recursive function calls
  o Each tree node is assigned the big-oh work done during its call to the function.
  o The big-oh equation is the sum of the work at all the nodes in the tree.

MergeSort Recursion Tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
• We usually omit stating the base case because our algorithms always run in time O(1) when n is a small constant.
[Figure: recursion tree. The root costs cn, its two children cn/2 each, the four grandchildren cn/4 each, and so on down to O(1) leaves. Every level sums to cn; height h = log n; #leaves = n, costing O(n) in total.]
Total: O(n log n)

Height and no. of Leaves
• Node value: n → n/2 → n/4 → ... → 1   // h steps
• $n(1/2)^h = 1$
• $n = 2^h$
• $\log_2 n = h$   // take logs of both sides; height = h
• No. of nodes: 1 → 2 → $2^2$ → ... → no. of leaves
• no. of leaves = $2^h = 2^{\log_2 n} = n^{\log_2 2} = n^1 = n$   // why? see Logarithm Equalities

Logarithm Equalities
[Table of logarithm equalities not recovered. The step above uses $a^{\log_b n} = n^{\log_b a}$.]

5. Merge Sort vs Insertion Sort
• n lg n grows more slowly than $n^2$.
• In other words, merge sort is asymptotically faster than insertion sort in the worst case.
• In practice, merge sort beats insertion sort for n > 30 or so.

Timing Comparisons
• Running time estimates:
  o Laptop executes $10^8$ compares/second.
  o Supercomputer executes $10^{12}$ compares/second.
[Table of running time estimates not recovered.]
Lesson 1. Good algorithms are better than supercomputers.
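The lesson above can be checked with a little arithmetic. The sketch below uses the two compare rates from the slide; the cost models (about n^2 compares for insertion sort, about n log2 n for merge sort) and the class and method names are assumptions made for illustration, not figures from the course.

```java
class TimingSketch {
    static final double LAPTOP = 1e8;          // compares/second, from the slide
    static final double SUPERCOMPUTER = 1e12;  // compares/second, from the slide

    // Assumed cost model: insertion sort does about n^2 compares.
    static double insertionSortSeconds(double n, double comparesPerSecond) {
        return n * n / comparesPerSecond;
    }

    // Assumed cost model: merge sort does about n * log2(n) compares.
    static double mergeSortSeconds(double n, double comparesPerSecond) {
        return n * (Math.log(n) / Math.log(2)) / comparesPerSecond;
    }

    public static void main(String[] args) {
        for (double n : new double[] {1e3, 1e6, 1e9}) {
            System.out.printf(
                "n = %.0e: insertion sort on supercomputer %.3g s, merge sort on laptop %.3g s%n",
                n, insertionSortSeconds(n, SUPERCOMPUTER), mergeSortSeconds(n, LAPTOP));
        }
    }
}
```

Under these assumptions the supercomputer wins for n = 1000, but at a billion elements insertion sort on the supercomputer needs about a million seconds while merge sort on the laptop needs about 300.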
6. Binary Search
• Binary Search from part 3 is a divide and conquer algorithm.
• Find an element in a sorted array:
  1. Divide: Check the middle element.
  2. Conquer: Recursively search 1 subarray.
  3. Combine: Trivial; return the index.

Example: Find 9 in
    3  5  7  8  9  12  15
[Figure: successive steps of the search, halving the range at each step until 9 is found.]

Binary Search Code (again)

    int binSrch(char A[], int i, int j, char key)
    {
      int k;
      if (i > j)           /* key not found */
        return -1;
      k = (i+j)/2;
      if (key == A[k])     /* key found */
        return k;
      if (key < A[k])
        j = k-1;           /* search left half */
      else
        i = k+1;           /* search right half */
      return binSrch(A, i, j, key);
    }

Running Time (again)
n == the size of the range of the array being looked at
• Using big-oh.
  o Basis: T(1) = O(1)
  o Induction: T(n) = O(1) + T(n/2), for n > 1   // the remaining subrange has at most n/2 elements
• As algebra
  o Basis: T(1) = a
  o Induction: T(n) = c + T(n/2), for n > 1
• Running time for binary search is $O(\log_2 n)$.
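To see the logarithmic bound in action, here is a minimal Java sketch of the same recursive search that also counts how many middle elements it examines. It is an illustration under assumptions: the class name, the `probes` counter and the `search` wrapper are hypothetical additions, the array holds int rather than char, and the middle index is computed as i + (j - i)/2 to avoid integer overflow.

```java
class BinarySearchSketch {
    static int probes;  // number of middle elements examined by the last search

    // Returns the index of key in sorted A[i..j], or -1 if it is absent.
    static int binSrch(int[] A, int i, int j, int key) {
        if (i > j) return -1;                                // empty range: key not found
        probes++;
        int k = i + (j - i) / 2;                             // middle index
        if (key == A[k]) return k;                           // key found
        if (key < A[k]) return binSrch(A, i, k - 1, key);    // search left half
        return binSrch(A, k + 1, j, key);                    // search right half
    }

    // Convenience wrapper: resets the counter and searches the whole array.
    static int search(int[] A, int key) {
        probes = 0;
        return binSrch(A, 0, A.length - 1, key);
    }

    public static void main(String[] args) {
        int[] A = {3, 5, 7, 8, 9, 12, 15};
        int idx = search(A, 9);
        System.out.println(idx + " found after " + probes + " probes");  // prints "4 found after 3 probes"
    }
}
```

For the 7-element example the search never needs more than 3 probes, which matches floor(log2 7) + 1 = 3.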
Recurrence for Binary Search
    T(n) = 1 T(n/2) + O(1)
  o 1 == # subproblems
  o n/2 == subproblem size
  o O(1) == work dividing and combining

BS Recursion tree
Solve T(n) = T(n/2) + c, where c > 0 is constant.
• We usually don't bother with the base case because our algorithms always run in time O(1) when n is a small constant.
[Figure: recursion tree. A single chain of nodes, each costing c, of height h = $\log_2 n$, ending in one O(1) leaf of cost a.]
Total = $c \log_2 n + a = O(\log_2 n)$

Two recurrences so far
• Merge Sort: T(n) = 2T(n/2) + O(n) = O(n log n)
• Binary Search: T(n) = T(n/2) + O(1) = O(log n)
• The big-oh running times were calculated in two ways: the iteration method and using recursion trees.
• Let's do some more examples of both.

7. Recursion Tree Examples

Example 1
Solve $T(n) = T(n/4) + T(n/2) + n^2$:
[Figure: recursion tree. The root costs $n^2$ and has children $(n/4)^2$ and $(n/2)^2$; their children cost $(n/16)^2$, $(n/8)^2$, $(n/8)^2$, $(n/4)^2$; the leaves are O(1). Level sums: $n^2$, $\frac{5}{16} n^2$, $\frac{25}{256} n^2$, ...]
$$\text{Total} \le n^2\left(1 + \frac{5}{16} + \left(\frac{5}{16}\right)^2 + \left(\frac{5}{16}\right)^3 + \cdots\right) = \frac{16}{11}\, n^2 = O(n^2) \qquad \text{(geometric series)}$$

Geometric Series Reminder
$$1 + x + x^2 + \cdots = \frac{1}{1 - x} \quad \text{for } |x| < 1$$
$$1 + x + x^2 + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x} \quad \text{for } x \ne 1$$

Recursion Tree 2
• $T(n) = 3T(n/4) + cn^2$
[Figure: recursion tree for $T(n) = 3T(n/4) + cn^2$, built level by level; its height h and no. of leaves are worked out on the next slide.]

Height and no. of Leaves
• Node value: n → n/4 → n/16 → ... → 1   // h steps
• $n(1/4)^h = 1$
• $n = 4^h$
• $\log_4 n = h$   // take logs; height = h
• No. of nodes: 1 → 3 → $3^2$ → ... → no. of leaves
• no. of leaves = $3^h = 3^{\log_4 n} = n^{\log_4 3}$   // why? see Logarithm Equalities

Cost of the Tree
• Level i has $3^i$ nodes, each costing $c(n/4^i)^2$, so it costs $(3/16)^i cn^2$.
• Add the cost of all the levels:
$$T(n) = cn^2 + \frac{3}{16} cn^2 + \left(\frac{3}{16}\right)^2 cn^2 + \cdots + \underbrace{\left(\frac{3}{16}\right)^{\log_4 n - 1} cn^2}_{\text{next to bottom level}} + \underbrace{\Theta(n^{\log_4 3})}_{\text{leaves level}} < \frac{16}{13}\, cn^2 + \Theta(n^{\log_4 3}) = O(n^2)$$

Recursion Tree 3
• T(n) = T(n/3) + T(2n/3) + cn
[Figure: recursion tree for T(n) = T(n/3) + T(2n/3) + cn; its height is worked out on the next slide.]

Height and no. of Leaves
• Node value: n → (2/3)n → $(2/3)^2 n$ → ... → 1   // h steps for the longest path
• $n(2/3)^h = 1$
• $n = (3/2)^h$
• $\log_{3/2} n = h$   // take logs; height = h
• If the tree was a complete binary tree (which it isn't) then no. of branches: 1 → 2 → $2^2$ → ... → no. of leaves
• no. of leaves = $2^h = 2^{\log_{3/2} n} = n^{\log_{3/2} 2}$   // why? see Logarithm Equalities

Cost of the Tree
• Since the tree is smaller than a complete binary tree, each level costs at most cn, so the cost of all the levels will be:
• $T(n) \le cn \cdot \log_{3/2} n$
• T(n) is $O(n \log_{3/2} n)$, which is O(n log n)
• Since $\log_{3/2} n = \log_2 n / \log_2 (3/2) = c' \log_2 n$ for the constant $c' = 1/\log_2(3/2)$   // see Logarithm Equalities

8. Iteration Method Examples
$$\text{1.}\quad T(n) = \begin{cases} 0 & n = 0 \\ c + T(n-1) & n > 0 \end{cases}$$
$$\text{2.}\quad T(n) = \begin{cases} 0 & n = 0 \\ n + T(n-1) & n > 0 \end{cases}$$
$$\text{3.}\quad T(n) = \begin{cases} d & n = 1 \\ aT(n/b) + cn & n > 1 \end{cases}$$

Example 1
• T(n) = c + T(n-1)
       = c + c + T(n-2)
       = 2c + T(n-2)
       = 2c + c + T(n-3)
       = 3c + T(n-3)
       …
       = kc + T(n-k)
       = ck + T(n-k)
• When k == n
  o T(n) = cn + T(0) = cn
• The conversion back to big-oh:
  o T(n) is O(n)

Example 2
• T(n) = n + T(n-1)
       = n + n-1 + T(n-2)
       = n + n-1 + n-2 + T(n-3)
       = n + n-1 + n-2 + n-3 + T(n-4)
       …
       = n + n-1 + n-2 + n-3 + … + n-(k-1) + T(n-k)
• Rewriting the expansion as a summation:
$$T(n) = \sum_{i=n-k+1}^{n} i + T(n-k)$$
• When k = n,
$$T(n) = \sum_{i=1}^{n} i + T(0) = \sum_{i=1}^{n} i + 0 = \frac{n(n+1)}{2}$$
• In general, $T(n) = \frac{n(n+1)}{2}$, so T(n) is $O(n^2)$

Example 3
$$T(n) = \begin{cases} d & n = 1 \\ aT(n/b) + cn & n > 1 \end{cases}$$
• Expanding the recurrence:
$$\begin{aligned}
T(n) &= aT(n/b) + cn \\
     &= a\bigl(aT(n/b/b) + cn/b\bigr) + cn \\
     &= a^2 T(n/b^2) + cn \cdot a/b + cn \\
     &= a^2 T(n/b^2) + cn(a/b + 1) \\
     &= a^2\bigl(aT(n/b^2/b) + cn/b^2\bigr) + cn(a/b + 1) \\
     &= a^3 T(n/b^3) + cn(a^2/b^2) + cn(a/b + 1) \\
     &= a^3 T(n/b^3) + cn(a^2/b^2 + a/b + 1) \\
     &\;\;\vdots \\
     &= a^k T(n/b^k) + cn(a^{k-1}/b^{k-1} + a^{k-2}/b^{k-2} + \dots + a^2/b^2 + a/b + 1)
\end{aligned}$$

• So we have
  o $T(n) = a^k T(n/b^k) + cn(a^{k-1}/b^{k-1} + \dots + a^2/b^2 + a/b + 1)$
• For $k = \log_b n$
  o $n = b^k$
  o The recurrence becomes:
$$\begin{aligned}
T(n) &= a^k T(1) + cn(a^{k-1}/b^{k-1} + \dots + a^2/b^2 + a/b + 1) \\
     &= a^k d + cn(a^{k-1}/b^{k-1} + \dots + a^2/b^2 + a/b + 1) \\
     &\approx c a^k + cn(a^{k-1}/b^{k-1} + \dots + a^2/b^2 + a/b + 1) && \text{// treat the constant } d \text{ as } c \\
     &= cn\, a^k/b^k + cn(a^{k-1}/b^{k-1} + \dots + a^2/b^2 + a/b + 1) && \text{// since } n = b^k \\
     &= cn(a^k/b^k + \dots + a^2/b^2 + a/b + 1)
\end{aligned}$$

• With $k = \log_b n$
  o $T(n) = cn(a^k/b^k + \dots + a^2/b^2 + a/b + 1)$
• There are three cases at this stage, depending on whether a == b, a < b, or a > b

Case 1
• If a == b
  o every fraction equals 1, so $T(n) = cn(k + 1) = cn(\log_b n + 1) = O(n \log_b n)$

Case 2
• If a < b
  o Recall that $x^k + x^{k-1} + \dots + x + 1 = (x^{k+1} - 1)/(x - 1)$
  o So:
$$\left(\frac{a}{b}\right)^k + \left(\frac{a}{b}\right)^{k-1} + \cdots + \frac{a}{b} + 1 = \frac{(a/b)^{k+1} - 1}{(a/b) - 1} = \frac{1 - (a/b)^{k+1}}{1 - (a/b)} < \frac{1}{1 - (a/b)} \qquad \text{// see Geometric Series Reminder}$$
  o $T(n) = cn \cdot O(1) = O(n)$

Case 3
• If a > b?
$$\left(\frac{a}{b}\right)^k + \left(\frac{a}{b}\right)^{k-1} + \cdots + \frac{a}{b} + 1 = \frac{(a/b)^{k+1} - 1}{(a/b) - 1} = O\!\left(\left(\frac{a}{b}\right)^k\right)$$
  o why? for x = a/b > 1, $(x^{k+1} - 1)/(x - 1) < x^k \cdot x/(x - 1)$, a constant times $x^k$
  o $T(n) = cn \cdot O(a^k/b^k) = cn \cdot O(a^{\log_b n}/b^{\log_b n}) = cn \cdot O(a^{\log_b n}/n) = cn \cdot O(n^{\log_b a}/n) = O(n \cdot n^{\log_b a}/n) = O(n^{\log_b a})$

• So…
$$T(n) = \begin{cases} d & n = 1 \\ aT(n/b) + cn & n > 1 \end{cases} \qquad\Longrightarrow\qquad T(n) = \begin{cases} O(n) & a < b \\ O(n \log_b n) & a = b \\ O(n^{\log_b a}) & a > b \end{cases}$$
  e.g. merge sort (a = b = 2 and c = 1)

9. The Master Method
The master method only applies to divide and conquer recurrences of the form:
    T(n) = a T(n/b) + f(n)
where a ≥ 1, b > 1, and f(n) > 0 for all n > n0.
This is a more general version of the last example.

The Master method gives us a cookbook solution for an algorithm's running time
• plug in the numbers, get the equation
• $n^{\log_b a}$ == no. of leaves in the recursion tree (see next slides)

Three cases
• When T(n) = aT(n/b) + f(n) then
$$T(n) \text{ is } \begin{cases}
\Theta(n^{\log_b a}) & \text{Case 1: } f(n) = O(n^{\log_b a - \epsilon}),\ \epsilon > 0 \\
\Theta(n^{\log_b a} \log n) & \text{Case 2: } f(n) = \Theta(n^{\log_b a}) \\
\Theta(f(n)) & \text{Case 3: } f(n) = \Omega(n^{\log_b a + \epsilon}),\ \epsilon > 0, \text{ and } a \cdot f(n/b) \le c \cdot f(n) \text{ for some } c < 1 \text{ and large } n
\end{cases}$$
  note: $n^\epsilon$ is a polynomial

Example 1
EX. T(n) = 4T(n/2) + n
a = 4, b = 2 so $n^{\log_b a} = n^2$; f(n) = n.
CASE 1 since f(n) < $n^{\log_b a}$ (n < $n^2$)
T(n) is $\Theta(n^2)$

Example 2
EX. $T(n) = 4T(n/2) + n^2$
a = 4, b = 2 so $n^{\log_b a} = n^2$; f(n) = $n^2$.
CASE 2 since f(n) is the same as $n^{\log_b a}$ ($n^2 = n^2$)
T(n) is $\Theta(n^2 \log n)$

Example 3
EX. $T(n) = 4T(n/2) + n^3$
a = 4, b = 2 so $n^{\log_b a} = n^2$; f(n) = $n^3$.
CASE 3 since f(n) > $n^{\log_b a}$ ($n^3 > n^2$) and $4(n/2)^3 \le cn^3$ (reg. cond.) for c = 1/2
T(n) is $\Theta(n^3)$

Example 4 (fail)
EX. $T(n) = 4T(n/2) + n^2/\log n$
a = 4, b = 2 so $n^{\log_b a} = n^2$; f(n) = $n^2/\log n$.
The master method does not apply because $n^2/\log n$ is not $O(n^{2-\epsilon})$ for any $\epsilon > 0$, and it is not $\Theta(n^2)$ either.
f(n) must be polynomially smaller or larger than $n^{\log_b a}$, or of the same order, for the master method to be applicable.

Example 5
• T(n) = 9T(n/3) + n
  o a = 9, b = 3, f(n) = n
  o $n^{\log_b a} = n^{\log_3 9} = n^2$
  o Case 1 since f(n) < $n^{\log_b a}$ (n < $n^2$)
  o T(n) = $\Theta(n^2)$

Common Examples
[Table of common recurrences not recovered; one entry is annotated:] Cannot use Master method since f(n) is not a polynomial

Recursion Tree for Master T()
T(n) = aT(n/b) + f(n)
[Figure: recursion tree. The root costs f(n) and has a children costing f(n/b) each; each of those has a children costing $f(n/b^2)$ each, and so on down to T(1) leaves. Level sums: f(n), a f(n/b), $a^2 f(n/b^2)$, …; height h = $\log_b n$; #leaves = $n^{\log_b a}$, costing $n^{\log_b a}\, T(1)$.]

Height and no. of Leaves
• Node value: n → n/b → $n/b^2$ → ... → 1   // h steps
• $n(1/b)^h = 1$
• $n = b^h$
• $\log_b n = h$   // take logs; height = h
• No. of nodes: 1 → a → $a^2$ → ... → no. of leaves
• no. of leaves = $a^h = a^{\log_b n} = n^{\log_b a}$   // why? see Logarithm Equalities

Case 1 Explained
• $f(n) = O(n^{\log_b a - \epsilon})$
• The $-\epsilon$ means that f(n) is smaller than the leaf sum.
• The level sums f(n), a f(n/b), $a^2 f(n/b^2)$, … increase geometrically from the root to the leaves.
• The leaves hold the biggest part of the total sum: $n^{\log_b a}\, T(1)$.
• T(n) is $\Theta(n^{\log_b a})$

Case 2 Explained
• $f(n) = \Theta(n^{\log_b a})$
• No $\epsilon$ means that f(n) is roughly equal to the leaf sum.
• The sums are approximately the same on each of the levels, so the total is (no. of levels) × (sum per level).
• T(n) is $\Theta(n^{\log_b a} \cdot \log n)$

Case 3 Explained
• $f(n) = \Omega(n^{\log_b a + \epsilon})$
• The sums decrease geometrically from the root to the leaves: a f(n/b) is getting smaller at lower levels (see the definition of Case 3).
• The root holds the biggest part of the total sum.
• T(n) is $\Theta(f(n))$
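The three cases can be sketched in Java for the common situation where f(n) is a plain power of n. This is a minimal sketch under stated assumptions: it only handles f(n) = n^d for an integer d ≥ 0 (so it cannot classify something like n^2/log n), and the class and method names are hypothetical. It compares a with b^d, which is equivalent to comparing log_b a with d.

```java
import java.util.Locale;

class MasterMethodSketch {

    // Returns 1, 2 or 3: the master-method case for T(n) = a T(n/b) + n^d.
    static int masterCase(int a, int b, int d) {
        if (a < 1 || b <= 1 || d < 0) {
            throw new IllegalArgumentException("need a >= 1, b > 1, d >= 0");
        }
        double bToD = Math.pow(b, d);
        if (a > bToD) return 1;    // f(n) is polynomially smaller than n^(log_b a)
        if (a == bToD) return 2;   // f(n) has the same order as n^(log_b a)
        return 3;                  // f(n) is polynomially larger; a*f(n/b) = (a/b^d)*f(n) with a/b^d < 1
    }

    // Describes the solution as text; the exponent log_b a is printed to 3 decimals.
    static String solution(int a, int b, int d) {
        double e = Math.log(a) / Math.log(b);   // log_b a
        switch (masterCase(a, b, d)) {
            case 1:  return String.format(Locale.ROOT, "Theta(n^%.3f)", e);
            case 2:  return String.format(Locale.ROOT, "Theta(n^%d log n)", d);
            default: return String.format(Locale.ROOT, "Theta(n^%d)", d);
        }
    }

    public static void main(String[] args) {
        System.out.println(solution(4, 2, 1));   // Example 1: case 1
        System.out.println(solution(4, 2, 2));   // Example 2: case 2
        System.out.println(solution(4, 2, 3));   // Example 3: case 3
        System.out.println(solution(2, 2, 1));   // merge sort: case 2
    }
}
```

For a polynomial f(n) the regularity condition of Case 3 holds automatically, which is why the sketch does not test it separately.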