COSC 3100
Transform and Conquer
Instructor: Tanvir

What is Transform and Conquer?
• The 4th algorithm design technique we are going to study
• Three major variations
  – Instance Simplification: Transform to a simpler or more convenient instance of the same problem
  – Representation Change: Transform to a different representation of the same instance
  – Problem Reduction: Transform to an instance of a different problem for which you know an efficient algorithm
• Two-step process:
  – Step 1: Modify the problem instance into something easier to solve
  – Step 2: Conquer!

What is Transform and Conquer?
Problem's instance → simpler instance, or another representation, or another problem's instance → Solution
• Simpler instance
  1. Presorting
• Another representation
  1. Heapsort
  2. Horner's Rule
  3. Binary Exponentiation
• Another problem's instance
  1. Computing Least Common Multiple
  2. Counting Paths in a Graph
  3. Reduction to Optimization Problems
  4. Reduction to Graph Problems

Trns. & Conq.: Presorting
• Many questions about a list are easier to answer if the list is sorted
• We shall see three examples
• Example 1: Checking element uniqueness in an array
  E.g., 4 1 8 9 3 7 10 2 3 1
What is the brute-force idea? Take the first element and check whether it is in the rest of the array, then take the second element and check whether it is in the rest of the array, and so on…
$T(n) = (n-1) + (n-2) + \dots + 1 = \frac{(n-1)n}{2} \in \Theta(n^2)$

Trns. & Conq.: Presorting
• Let us apply sorting first

ALGORITHM PresortElementUniqueness(A[0..n-1])
    sort the array A
    for i <- 0 to n-2 do
        if A[i] = A[i+1]
            return false
    return true

E.g., 4 1 8 9 3 7 10 2 3 1 --SORT--> 1 1 2 3 3 4 7 8 9 10
$T(n) = T_{sort}(n) + T_{scan}(n) \in \Theta(n \lg n) + \Theta(n) = \Theta(n \lg n)$

Trns. & Conq.: Presorting
• Example 2: Computing a mode
• A mode is a value that occurs most often in a list
• E.g., in 5 1 5 7 6 5 7, the mode is? 5
Brute-force: scan the list and compute the frequencies of all distinct values, then find the value with the highest frequency
How to implement brute-force?
Store the values already encountered, along with their frequencies, in a separate list.
Value:     5  1  7  6
Frequency: 3  1  2  1
What is the worst-case input? An array with no equal elements: the i-th element is then compared with all i-1 values already stored.
$C(n) = 0 + 1 + \dots + (n-1) = \frac{(n-1)n}{2} \in \Theta(n^2)$

Trns. & Conq.: Presorting
• Example 2: Computing a mode (contd.)
• If we sort the array first, all equal values will be adjacent to each other.
• To compute the mode, all we need is to find the longest run of adjacent equal values in the sorted array

Trns. & Conq.: Presorting
Example 2: Computing a mode (contd.)

ALGORITHM PresortMode(A[0..n-1])
    sort the array A
    i <- 0
    modefrequency <- 0
    while i ≤ n-1 do
        runlength <- 1; runvalue <- A[i]
        while i+runlength ≤ n-1 and A[i+runlength] = runvalue
            runlength <- runlength+1
        if runlength > modefrequency
            modefrequency <- runlength; modevalue <- runvalue
        i <- i+runlength
    return modevalue

E.g., sorted array: 1 1 2 3 3 3 7 8 9 10
The while loop takes linear time, so the overall runtime is dominated by the time for sorting, Θ(n lg n)

Trns. & Conq.: Heapsort
• Another O(n lg n) sorting algorithm
• Uses a clever data structure called "Heap"
• It transforms an array into a Heap, and then sorting becomes very easy
• Heap has other important uses, like implementing a "priority queue"
• Recall, a priority queue is a multiset of items with an orderable characteristic called "priority", with the following operations:
  – Finding an item with the highest priority
  – Deleting an item with the highest priority
  – Adding a new item to the multiset
Let us study Heap first…

Trns.
& Conq.: Heap
• A "heap" can be described as a binary tree, with one key per node, provided the following two conditions are met:
  – Shape property: The binary tree is essentially complete: all levels are full except possibly the last level, where only some rightmost leaves may be missing
  – Parental dominance (or heap property): The key in each node is greater than or equal to the keys in its children (considered automatically satisfied for the leaves)
This is actually the "max heap"; there is a corresponding "min heap"…

Trns. & Conq.: Heap
Shape and Heap properties…
[Figure: three example binary trees] Is it a Heap?

Trns. & Conq.: Heap
Note: The sequence of values on a path from the root to a leaf is nonincreasing
There is no left-to-right order in key values, though
index:  0   1   2   3   4   5   6   7   8   9  10
value:     10   8   7   5   2   1   6   3   5   1
Parents are at indices 1..5, leaves at indices 6..10
ALGORITHM Parent(i) return ⌊i/2⌋
ALGORITHM Left(i) return 2i
ALGORITHM Right(i) return 2i+1

Trns. & Conq.: Heap
Important properties of heaps:
1. There exists exactly one essentially complete binary tree with n nodes; its height is ⌊lg n⌋
2. The root of a heap contains the largest element
3. A node with all its descendants is also a heap
4. A heap can be implemented as an array by recording its elements in top-down, left-to-right fashion. H[0] is unused or contains a number bigger than all the rest.
   a. Parental node keys will be in the first ⌊n/2⌋ positions, leaves will be in the last ⌈n/2⌉ positions
   b. Children of a key at index i (1 ≤ i ≤ ⌊n/2⌋) will be in positions 2i and 2i+1. The parent of a key at index i (2 ≤ i ≤ n) will be in position ⌊i/2⌋.
index:  0   1   2   3   4   5   6   7   8   9  10
H:          9   8   7   5   2   1   6   3   5   1
Which gives: H[i] ≥ max{ H[2i], H[2i+1] } for i = 1, …, ⌊n/2⌋ (ignoring H[2i+1] when 2i+1 > n)

Trns. & Conq.: Heap
• How can we transform an array into a heap? Two ways. The first:
E.g., the array 2 9 7 6 5 8, stored at indices 1..6 (index 0 unused)
1. Start at the last parent, at index ⌊n/2⌋
2. Check if parental dominance holds for this node's key
3.
If it does not, exchange the node's key K with the larger key of its children
4. Check if parental dominance holds for K in its new position, and so on… Then repeat for the preceding parent; stop after doing this for the root.
[Figure: bottom-up heap construction for the list 2 9 7 6 5 8, heapifying one parent at a time]
This process is called "Heapifying"
This method is called "bottom-up heap construction"

Trns. & Conq.: Heap

ALGORITHM HeapBottomUp(H[1..n])
//Input: An array H[1..n] of orderable items
//Output: A heap H[1..n]
for i <- ⌊n/2⌋ downto 1 do
    k <- i; v <- H[k]
    heap <- false
    while not heap and 2*k ≤ n do
        j <- 2*k
        if j < n            // there are 2 children (if j = n, only a left child)
            if H[j] < H[j+1]
                j <- j+1
        if v ≥ H[j]
            heap <- true
        else
            H[k] <- H[j]
            k <- j
    H[k] <- v

E.g.,
index: 0  1  2  3  4  5  6
H:        2  9  7  6  5  8

Trns. & Conq.: Heap
• Let us analyze HeapBottomUp(H[1..n])'s worst case
Let's take a heap whose binary tree is full. Why? It has the largest possible # of nodes at each level…
[Figure: full binary trees with n = 2^2 - 1 and n = 2^3 - 1 nodes]
So, number of nodes, n = 2^m - 1
So what is the height? h = ⌊lg n⌋ = ⌈lg(n+1)⌉ - 1 = m - 1
Let's look at a worst case…
A key on level i must travel to the leaf level h; level i has (h-i) lower levels.
Going down by 1 level requires 2 comparisons: one to determine the larger child, the other to determine if an exchange is needed.
Using $\sum_{i=1}^{n} i 2^i = (n-1) 2^{n+1} + 2$:
$C_{worst}(n) = \sum_{i=0}^{h-1} \sum_{j=1}^{2^i} 2(h-i) = \sum_{i=0}^{h-1} 2(h-i) 2^i = 2h \sum_{i=0}^{h-1} 2^i - 2 \sum_{i=0}^{h-1} i 2^i$
$= 2h(2^h - 1) - 2\left((h-2) 2^h + 2\right) = h 2^{h+1} - 2h - h 2^{h+1} + 2 \cdot 2^{h+1} - 4 = 2(n - \lg(n+1))$
So, fewer than 2n comparisons are required

Trns. & Conq.: Heap
• Building a heap, Method 2
• It's called "top-down heap construction"
• Idea: Successively insert a new key into a previously constructed heap
• Attach a new node with key K after the last leaf of the existing heap
• Sift K up until parental dominance is established
E.g., inserting 10 into the heap 9 6 8 2 5 7 (array order):
9 6 8 2 5 7 10 → 9 6 10 2 5 7 8 → 10 6 9 2 5 7 8
An insertion compares K with at most one parent per level, so it needs no more comparisons than the height of the heap's tree and is in O(lg n)

Trns.
& Conq.: Heap
• Maximum key deletion from a heap
  – Exchange the root's key with the last key K in the heap
  – Decrease the heap's size by 1
  – "Heapify" the smaller tree by sifting K down, establishing "parental dominance" as long as it is needed
We want to delete the root
[Figure: deleting the root key from an example heap]
Efficiency is O(lg n)

Trns. & Conq.: Heapsort
• An interesting algorithm discovered by J. W. J. Williams in 1964
• It has 2 stages
  – Stage 1: Construct a heap from a given array
  – Stage 2: Apply the root-deletion operation n-1 times to the remaining heap

Trns. & Conq.: Heapsort
Stage 1: Heap construction
index: 0  1  2  3  4  5  6
          2  9  7  6  5  8
          2  9  8  6  5  7
          2  9  8  6  5  7
          9  2  8  6  5  7
          9  6  8  2  5  7

Trns. & Conq.: Heapsort
Stage 2: Maximum deletions
index: 0  1  2  3  4  5  6
          9  6  8  2  5  7
          7  6  8  2  5  9
          8  6  7  2  5  9
          5  6  7  2  8  9
          7  6  5  2  8  9
          2  6  5  7  8  9
          6  2  5  7  8  9
          5  2  6  7  8  9
          2  5  6  7  8  9

Trns. & Conq.: Heapsort
• What is Heapsort's worst-case complexity?
Stage 1: Heap construction is in? O(n)
Stage 2: Maximum deletions is in? Let's see…
Let C(n) be the # of comparisons needed in Stage 2
Recall, the height of the heap's tree is ⌊lg n⌋
$C(n) \le 2\lfloor \lg(n-1) \rfloor + 2\lfloor \lg(n-2) \rfloor + \dots + 2\lfloor \lg 1 \rfloor \le 2 \sum_{i=1}^{n-1} \lg i$
$C(n) \le 2 \sum_{i=1}^{n-1} \lg(n-1) = 2(n-1)\lg(n-1) \le 2n \lg n$
So, C(n) ∈ O(n lg n)
So, for two stages, O(n) + O(n lg n) = O(n lg n)

Trns. & Conq.: Heapsort
• More careful analysis shows that Heapsort's worst and best cases are in Θ(n lg n), like Mergesort
• Additionally, Heapsort is in-place
• Timing experiments on random files show that Heapsort is slower than Quicksort, but can be competitive with Mergesort
• What sorting algorithm is used by Collections.sort(List<T> list) of Java?

Trns. & Conq.: Problem Reduction
Problem 1 (to be solved) --reduction--> Problem 2 --Alg. A--> Solution to Problem 2
Problem 2 is solvable by Alg.
A

• Compute the least common multiple of two positive integers, lcm(m, n)
• Say m = 24 and n = 60
• 24 = 2*2*2*3 and 60 = 2*2*3*5
• Take the product of all common prime factors of m and n, all prime factors of m not in n, and all prime factors of n not in m
• lcm(24, 60) = (2*2*3)*2*5 = 120
Is it a good algorithm?
Notice, 24*60 = (2*2*2*3) * (2*2*3*5) = (2*2*3)^2 * 2*5, where 2*2*3 = gcd(24, 60)
So, m*n = lcm(m, n) * gcd(m, n), which gives
$\mathrm{lcm}(m, n) = \frac{m \cdot n}{\gcd(m, n)}$ !
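
The identity m*n = lcm(m, n) * gcd(m, n) reduces computing the lcm to computing the gcd, for which Euclid's algorithm is efficient. Below is a minimal Python sketch of that reduction, assuming both inputs are positive integers. The function names `gcd_euclid` and `lcm_by_reduction` are illustrative and do not come from the slides.

```python
def gcd_euclid(m: int, n: int) -> int:
    """Greatest common divisor by Euclid's algorithm."""
    while n != 0:
        m, n = n, m % n
    return m


def lcm_by_reduction(m: int, n: int) -> int:
    """lcm(m, n) obtained by reducing the problem to gcd(m, n).

    Assumption: m and n are positive integers.
    """
    if m <= 0 or n <= 0:
        raise ValueError("m and n must be positive integers")
    # Divide before multiplying so the intermediate value stays small.
    return m // gcd_euclid(m, n) * n


print(lcm_by_reduction(24, 60))  # 120, matching the slide's example
```

No prime factorization is needed here: the only real work is the gcd computation.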