Lecture 4: Divide-and-Conquer

Steps in Divide and Conquer

As its name implies, divide-and-conquer involves dividing a problem into smaller problems that can be more easily solved. While the specifics vary from one application to another, divide-and-conquer always includes the following three steps in some form:

Divide - Typically this step involves splitting one problem into two problems of approximately 1/2 the size of the original problem.
Conquer - The divide step is repeated (usually recursively) until individual problem sizes are small enough to be solved (conquered) directly.
Recombine - The solution to the original problem is obtained by combining all the solutions to the sub-problems.

Divide and Conquer is not applicable to every problem class. Even when D&C works, it may not provide an efficient solution.

Binary Search - Iterative Version

    procedure bin1(n: integer; x: keytype; S: list; loc: out index) is
      lo,hi,mid : index;
    begin
      lo:=1; hi:=n; loc:=0;            -- loc=0 means "not found"
      while lo<=hi and loc=0 loop
        mid:=(lo+hi)/2;
        if x=S(mid) then
          loc:=mid;
        elsif x<S(mid) then
          hi:=mid-1;
        else
          lo:=mid+1;
        end if;
      end loop;
    end bin1;

Example: searching for x=42 in the list S

    position:   0   1   2   3   4   5   6   7   8   9
    S:         10  12  15  19  19  24  39  45  53  77

The trace uses the 0-based positions shown in the table, so it starts from lo=0, hi=9. The procedure as written initializes lo:=1, hi:=n for a 1-based list and reserves 0 for "not found".

    lo=0 hi=9    mid=4  S(4)=19
    lo=5 hi=9    mid=7  S(7)=45
    lo=5 hi=6    mid=5  S(5)=24
    lo=6 hi=6    mid=6  S(6)=39
    lo=7 hi=6    loop ends, x not found

T(n) = C + D log2(n) -> O(log2 n)

Binary Search - Recursive Version

    -- x and S are visible from the enclosing scope
    function location(lo,hi : index) return index is
      mid : index;
    begin
      if lo>hi then
        return 0;
      else
        mid:=(lo+hi)/2;
        if x=S(mid) then
          return mid;
        elsif x<S(mid) then
          return location(lo,mid-1);
        else
          return location(mid+1,hi);
        end if;
      end if;
    end location;

Example: x=42, same list S

    position:   0   1   2   3   4   5   6   7   8   9
    S:         10  12  15  19  19  24  39  45  53  77

    lo=0 hi=9    mid=4  S(4)=19
    lo=5 hi=9    mid=7  S(7)=45
    lo=5 hi=6    mid=5  S(5)=24
    lo=6 hi=6    mid=6  S(6)=39
    lo=7 hi=6    return 0

Analysis of Function location( )

    T(n) = C               if n <= 1
    T(n) = T(n/2) + D      otherwise

We need a closed-form expression for T(n) that does not contain a term involving T. We assume that n>1 and note that we need to replace T(n/2) with an explicit function of n.
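To make the recurrence concrete before solving it, here is a minimal Python sketch of location( ) that also counts how many times it is invoked. It is an illustration under stated assumptions, not the lecture's code: it uses 0-based indexing, returns -1 rather than 0 for "not found", and the `calls` counter is a hypothetical addition used only for measurement.

```python
def location(S, x, lo, hi, calls=None):
    """Recursive binary search on a sorted list S.

    Returns the index of x in S[lo..hi], or -1 if x is absent.
    calls is an optional one-element list used as an invocation counter
    (an illustrative addition, not part of the lecture's pseudocode).
    """
    if calls is not None:
        calls[0] += 1
    if lo > hi:                      # empty range: x is not present
        return -1
    mid = (lo + hi) // 2
    if x == S[mid]:
        return mid
    elif x < S[mid]:
        return location(S, x, lo, mid - 1, calls)
    else:
        return location(S, x, mid + 1, hi, calls)


S = [10, 12, 15, 19, 19, 24, 39, 45, 53, 77]
calls = [0]
result = location(S, 42, 0, len(S) - 1, calls)
print(result, calls[0])  # -1 5
```

The five invocations correspond to the five (lo, hi) pairs in the trace for x=42. Each call does a constant amount of work and passes on a range of at most half the size, which is the behaviour the recurrence T(n) = T(n/2) + D describes.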
We can use the recurrence relation to express the term T(n/2) in another form that can, in turn, be substituted back into our expression for T(n).

    T(n/2) = T(n/4) + D
    T(n)   = (T(n/4) + D) + D
    T(n)   = T(n/4) + 2D
    T(n)   = T(n/8) + 3D
       :

Eventually we can write a general expression for the kth substitution.

    T(n)   = T(n/2^k) + kD

Then we can determine a value for k (in terms of n) that sets the parameter of T equal to one: n/2^k = 1 gives k = log2(n). This lets us substitute T(1)=C in our recurrence and replace any other occurrences of k with terms involving n.

    T(n)   = T(1) + (log2 n)D
    T(n)   = C + (log2 n)D

We can then determine the order of complexity.

    T(n) -> O(log2 n)

Mergesort

    procedure merge(h,m: integer; U,V: list; S: in out list) is
      i,j,k : integer;
    begin
      i:=1; j:=1; k:=1;
      while i<=h and j<=m loop
        if U(i)<V(j) then
          S(k):=U(i);
          i:=i+1;
        else
          S(k):=V(j);
          j:=j+1;
        end if;
        k:=k+1;
      end loop;
      if i>h then
        S(k..h+m):=V(j..m);            -- U is exhausted, copy the rest of V
      else
        S(k..h+m):=U(i..h);            -- V is exhausted, copy the rest of U
      end if;
    end merge;

    procedure msort(n: integer; S: in out list) is
      h: constant :=n/2;
      m: constant :=n-h;
      U,V : list;
    begin
      if n>1 then
        U(1..h):=S(1..h);
        V(1..m):=S(h+1..n);
        msort(h,U);
        msort(m,V);
        merge(h,m,U,V,S);
      end if;
    end msort;

Write the recurrence relation for the complexity of msort and then solve it.

Quicksort

    procedure partition(lo,hi: integer; pivot: out integer) is
      j,item : integer;
    begin
      item := S(lo);                   -- pivot value is the first element
      j := lo;
      for i in lo+1..hi loop
        if S(i)<item then
          j:=j+1;
          swap(S(i),S(j));
        end if;
      end loop;
      pivot:=j;
      swap(S(lo),S(pivot));
    end partition;

    procedure quicksort(lo,hi: integer) is
      pivot : integer;
    begin
      if hi>lo then
        partition(lo,hi,pivot);
        quicksort(lo,pivot-1);
        quicksort(pivot+1,hi);
      end if;
    end quicksort;

Where is the recombine step in the quicksort divide-and-conquer algorithm?

Quicksort Example

Each row shows the list after a swap inside partition( ), with the values of i and j at that point. The first row is the initial list, and the last row is the list after the final swap of S(lo) and S(pivot).

     i   j |  1  2  3  4  5  6  7  8  9 10 11 12 13 14
     2   1 |  6  9  8  5  7  4  6  3  5  4  7  6  0  5
     4   2 |  6  5  8  9  7  4  6  3  5  4  7  6  0  5
     6   3 |  6  5  4  9  7  8  6  3  5  4  7  6  0  5
     8   4 |  6  5  4  3  7  8  6  9  5  4  7  6  0  5
     9   5 |  6  5  4  3  5  8  6  9  7  4  7  6  0  5
    10   6 |  6  5  4  3  5  4  6  9  7  8  7  6  0  5
    13   7 |  6  5  4  3  5  4  0  9  7  8  7  6  6  5
    14   8 |  6  5  4  3  5  4  0  5  7  8  7  6  6  9
         8 |  5  5  4  3  5  4  0  6  7  8  7  6  6  9

Figure legend: pivot value; items being swapped; new sublists for next pass of quicksort (positions 1..7 and 9..14).

Quicksort: Worst-Case Performance

What does it mean?

An analysis shows that the worst-case performance for quicksort( ) is O(n^2). However, quicksort performs much better than this in practice. Does this mean that algorithm analysis is a waste of effort?

Not really. In addition to showing worst-case performance, algorithm analysis provides us with a better understanding of the conditions under which an algorithm performs well. We can see that the quicksort algorithm performs worst on a sorted list (with the first element used as the pivot, as in partition above) and performs best on a randomly distributed list (at least one in which the pivot value is always near the median value of the sublist being partitioned).

Algorithm analysis helps us determine which algorithm is best for a particular application, as well as suggesting ways to modify an algorithm to improve its performance.

Can you think of a simple way to improve quicksort( )?

Other Applications for D&C

Searching and sorting problems are obvious candidates for D&C, but the divide-and-conquer problem-solving method gets used in other ways you might not expect.

Game Strategy
Computing numeric roots
Fast Exponentiation

Modular Exponentiation

Suppose that we need to compute the value of a^n for some reasonably large n. Such problems occur in primality testing for cryptography. The simplest algorithm performs n − 1 multiplications, by computing a × a × . . . × a. However, we can do better by observing that n = ⌊n/2⌋ + ⌈n/2⌉. If n is even, then a^n = (a^(n/2))^2. If n is odd, then a^n = a(a^⌊n/2⌋)^2.
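As a worked illustration of these two identities (the exponent 13 is chosen only for this example), repeated halving unrolls as:

```latex
\begin{align*}
a^{13} &= a \cdot (a^{6})^{2} && (13 \text{ odd},\ \lfloor 13/2 \rfloor = 6)\\
a^{6}  &= (a^{3})^{2}         && (6 \text{ even})\\
a^{3}  &= a \cdot (a^{1})^{2} && (3 \text{ odd},\ \lfloor 3/2 \rfloor = 1)\\
a^{1}  &= a \cdot (a^{0})^{2} && (1 \text{ odd},\ \lfloor 1/2 \rfloor = 0)\\
a^{0}  &= 1
\end{align*}
```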
In either case, we have halved the size of our exponent at the cost of at most two multiplications, so O(log2 n) multiplications suffice to compute the final value.

    function power(a, n)
        if (n = 0) return(1)
        x = power(a, ⌊n/2⌋)
        if (n is even) return(x^2)
        else return(a * x^2)

Introduction to Algorithms - Steven Skiena

Square Root (and other roots) of a Number

The square root of n is the number r such that r^2 = n. Square root computations are performed inside every pocket calculator, but how?

Observe that the square root of n ≥ 1 must be at least 1 and at most n. Let l = 1 and r = n. Consider the midpoint of this interval, m = (l + r)/2. How does m^2 compare to n? If n ≥ m^2, then the square root must be at least m, so the algorithm repeats with l = m. If n < m^2, then the square root must be less than m, so the algorithm repeats with r = m. Either way, we have halved the interval with only one comparison. Therefore, after only log2(n) rounds we will have identified the square root to within ±1.

Twenty Questions

Player One - Picks a word.
Player Two - Asks a series of YES or NO questions, attempting to guess the word in fewer than 20 tries.

(Figure: alphabetical version of Twenty Questions)

Computing the Median

Computing the median of n numbers is easy: just sort them. The drawback is that this takes O(n log2 n) time, whereas we would ideally like something linear. We have reason to be hopeful, because sorting is doing far more work than we really need. We just want the middle element and don't care about the relative ordering of the rest of them.

When looking for a recursive solution, it is paradoxically often easier to work with a more general version of the problem, for the simple reason that this gives a more powerful step to recurse upon. In our case, the generalization we will consider is selection.

SELECTION
Input: A list of numbers S; an integer k
Output: The kth smallest element of S

For instance, if k = 1, the minimum of S is sought, whereas if k = ⌊|S|/2⌋, it is the median.
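To pin down what the SELECTION specification asks for, here is a minimal Python sketch of the sort-based baseline. It is the O(n log n) approach the text wants to improve on, not a linear-time method; the function name `select_by_sorting` and the sample list are assumptions made for illustration, and k counts from 1 as in the problem statement.

```python
def select_by_sorting(S, k):
    """Return the kth smallest element of S (k counts from 1) by sorting.

    This is the O(n log n) baseline, not a linear-time method.
    """
    if not 1 <= k <= len(S):
        raise ValueError("k must be between 1 and len(S)")
    return sorted(S)[k - 1]


# Sample data chosen for illustration.
S = [6, 7, 8, 4, 0, 2, 4, 3, 5, 8, 1, 4, 8, 9, 7, 3]
print(select_by_sorting(S, 1))            # 0, the minimum
print(select_by_sorting(S, len(S) // 2))  # 4, the median under k = floor(|S|/2)
```

A baseline like this is also handy as a reference answer when checking a faster selection routine.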
Divide-and-Conquer Algorithms - Vazirani

The K-Median Algorithm (Selection)

Chapter 2, Divide-and-conquer algorithms - see readings: http://www.cs.berkeley.edu/~vazirani/

Finding the Median

(Figure: successive splits of a list into the sublists SL, SM, and SR. The rows of values appear in the figure in this order.)

    6 7 8 4 0 2 4 3 5 8 1 4 8 9 7 3
    2 0 1 3 3 6 7 8 4 4 5 8 4 8 9 7
    6 7 8 4 4 5 8 4 8 9 7
    4 4 4 5 6 7 8 8 8 9 7

Summary

Divide-and-Conquer (actually three steps)
    1. divide
    2. conquer
    3. recombine
Most naturally implemented using recursion (although iteration is possible)
Analyze performance by solving recurrence relations
Worst-case performance may not be typical (or even possible) performance
Divide-and-Conquer is embedded in arithmetic processors/calculators
Cryptography relies on D&C methods
Game strategy can be based on D&C
Using the D&C selection algorithm to find the median
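The split into SL, SM, and SR shown under Finding the Median can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the lecture's own code: it picks the pivot value v at random, builds the three sublists as new lists, and counts k from 1. The function name `selection` is chosen here for illustration, and the pivot rule used in the lecture is not stated in this section.

```python
import random


def selection(S, k):
    """Return the kth smallest element of list S (k counts from 1)."""
    if not 1 <= k <= len(S):
        raise ValueError("k must be between 1 and len(S)")
    v = random.choice(S)                 # pivot value (assumed: chosen at random)
    SL = [x for x in S if x < v]         # elements smaller than v
    SM = [x for x in S if x == v]        # elements equal to v
    SR = [x for x in S if x > v]         # elements larger than v
    if k <= len(SL):
        return selection(SL, k)          # answer lies among the smaller elements
    if k <= len(SL) + len(SM):
        return v                         # answer is the pivot value itself
    return selection(SR, k - len(SL) - len(SM))


data = [6, 7, 8, 4, 0, 2, 4, 3, 5, 8, 1, 4, 8, 9, 7, 3]
median = selection(data, len(data) // 2)
print(median)  # 4
```

Only one of the three sublists is ever searched further, so there is no recombine work at all. Because SM is never empty, each recursive call works on a strictly shorter list, which guarantees termination even with an unlucky pivot.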