Lecture 12: Lower bounds

By "lower bounds" here we mean a lower bound on the complexity of a problem, not of an algorithm. We need to prove that no algorithm, no matter how complicated or clever, can do better than our bound. This is often hard!

That is: we say that f(n) is a lower bound on the complexity of a problem P if, for every algorithm A that solves P and every n, there exists some input I of size n such that A uses at least f(n) steps on input I.

Why do we care about lower bounds in an algorithms course? Because we want our algorithms to be as fast as possible. If we can prove that no algorithm can be faster than one we have devised, then we can stop there and make no more efforts to improve it.

Lower bound methods

- Counting argument: everything is eventually a "counting argument". Usually tedious.
- Incompressibility argument: for the average case.
- Adversary argument: for the worst case.

1. Incompressibility Argument

THEOREM. Any comparison-based sorting algorithm must use Ω(n log n) comparisons on average (averaged over all inputs, where each arrangement is considered equally likely).

Proof (by incompressibility argument): Fix a "Kolmogorov random" permutation π of 1..n, that is, one whose shortest description has length C(π) ≥ log n! - O(1). Suppose A sorts π in T(n) comparisons. We can describe π by

- a description of this encoding scheme and of A, in O(1) bits;
- the binary outcomes of the T(n) comparisons, in T(n) bits.

Replaying A on these outcomes reproduces the rearrangement that A performs, and hence π itself. Thus T(n) + O(1) ≥ C(π) ≥ log n! - O(1), and since log n! = Θ(n log n), we get T(n) = Ω(n log n). Since π is "typical" (most permutations are incompressible in this sense), the average case is also Ω(n log n). QED.

Incompressibility method summary

Step 1. Choose a "typical" (incompressible) input I, that is, one whose incompressibility is shared by most other inputs.
Step 2. Consider the running time of the algorithm under discussion on this input only.
Step 3. Show that if the algorithm ran in less time, you could give input I a shorter description.
Step 4. Whatever time bound you obtain for I also holds for the average running time of the algorithm.

2. Adversary arguments

Generally, we can think of a lower bound proof as a game between the algorithm and an "adversary". The adversary should be thought of as a very powerful, clever being that is trying to make your algorithm run as slowly as possible. The adversary cannot "read the algorithm's mind", but it can try to be prepared for anything the algorithm does. Finally, the adversary is not allowed to "cheat"; that is, the adversary cannot answer questions inconsistently.

The algorithm, of course, is trying to run as quickly as possible. The adversary is trying (through its cleverness) to force the algorithm to run slowly.

What is an adversary?

- It may be considered as a second algorithm which intercepts access to the data structures.
- It constructs the input data only as needed.
- It attempts to make the original algorithm work as hard as possible.
- We analyze the adversary to obtain a lower bound.

Important restriction

Although data is created dynamically, the adversary must return consistent results. The adversary does not lie. If it replies that x[1] < x[2], then the input it ends up with cannot have x[2] < x[1].

Example 1.

Thm 1. Every comparison-based algorithm for determining the minimum of a set of n elements must use at least n/2 comparisons.

Proof. Every element must participate in at least one comparison; if not, the uncompared element can be chosen (by an adversary) to be the minimum. Each comparison involves 2 elements. Hence at least n/2 comparisons must be made.

Example 2.

Thm 2. Every comparison-based algorithm for determining the minimum of a set of n elements must use at least n-1 comparisons.

Proof. To say that a given element x is the minimum implies that every other element has won at least one comparison with another element (not necessarily x). (By "x wins a comparison with y" we mean x > y.) This is because otherwise the adversary can (consistently) declare a never-won element to be smaller than x. Each comparison produces at most one winner. Hence at least n-1 comparisons must be used.
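The counting in the proof of Thm 2 can be checked on a concrete run. The following is only an illustrative sketch: the function names find_min_with_log and never_won are made up for this example, and the input values are assumed to be distinct. It records every comparison of a plain linear scan as a (winner, loser) pair and then lists the elements that never won.

```python
def find_min_with_log(values):
    """Linear scan for the minimum.

    Returns (index of the minimum, log), where log is a list of
    (winner, loser) index pairs and "winner" means the larger element.
    Assumes distinct values (an assumption of this sketch).
    """
    best = 0
    log = []
    for i in range(1, len(values)):
        if values[i] < values[best]:
            log.append((best, i))   # the old candidate is larger: it wins
            best = i
        else:
            log.append((i, best))   # the new element is larger: it wins
    return best, log


def never_won(n, log):
    """Indices in 0..n-1 that do not appear as a winner in the log."""
    winners = {w for w, _ in log}
    return [i for i in range(n) if i not in winners]
```

On the list [5, 3, 8, 1, 9] the scan makes 4 = n-1 comparisons and the only element that never won is index 3, the minimum. If the last comparison is dropped from the log, indices 3 and 4 both remain without a win, which is exactly the situation in which an adversary can still choose either of them to be the minimum.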
Finding both maximum and minimum

It is possible to compute both the maximum and the minimum of a list of n numbers using (3/2)n - 2 comparisons if n is even, and (3/2)n - 3/2 comparisons if n is odd. We will prove that this is optimal, that is, no comparison-based method can correctly determine both the maximum and the minimum using fewer comparisons in the worst case.

Lower bound on finding max & min

The idea is as follows: in order for the algorithm to correctly decide that x is the minimum and y is the maximum, it must know that (a) every element other than x has won at least one comparison and (b) every element other than y has lost at least one comparison. (Here x winning a comparison with y means x > y.) Calling a win W and a loss L, the algorithm must, in effect, assign n-1 W's and n-1 L's. That is, the algorithm must determine 2n-2 "units of information" to always give the correct answer.

Example 3. Adversary argument for max-min

We now construct an adversary strategy that will force the algorithm to learn its 2n-2 "units of information" as slowly as possible. Here it is: the adversary labels each element of the input as N, W, L, or WL. These labels may change over time.

"N" signifies that the element has never been compared to any other by the algorithm.
"W" signifies that the element has won at least one comparison and has never lost.
"L" signifies that the element has lost at least one comparison and has never won.
"WL" signifies that the element has won at least one and lost at least one comparison.
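The labels and the 2n-2 count can be made concrete with a small sketch. It is illustrative only: the helper names labels_from_log, units_of_information and answer_is_forced are hypothetical, and a comparison log is assumed to be a list of (winner, loser) index pairs over elements 0..n-1.

```python
def labels_from_log(n, log):
    """Label each element N, W, L or WL from (winner, loser) index pairs."""
    won = {w for w, _ in log}
    lost = {l for _, l in log}
    return [("W" if i in won else "") + ("L" if i in lost else "") or "N"
            for i in range(n)]


def units_of_information(labels):
    """One unit per W and one per L, so a WL element contributes two."""
    return sum(len(lab) for lab in labels if lab != "N")


def answer_is_forced(labels):
    """True when exactly one element never won and exactly one never lost,
    i.e. conditions (a) and (b) hold for some x and y."""
    never_won = sum("W" not in lab for lab in labels)
    never_lost = sum("L" not in lab for lab in labels)
    return never_won == 1 and never_lost == 1
```

For n = 4 and the log [(0, 1), (2, 3), (0, 2), (1, 3)] the labels come out as W, WL, WL, L: that is 6 = 2n-2 units, and the answer is forced (element 0 is the maximum, element 3 the minimum). With only the first three comparisons there are 5 units and two elements that never won, so the minimum is not yet determined.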
Finding Both Max and Min

Codes:
N: never used
W: won at least once but never lost
L: lost at least once but never won
WL: won and lost at least once

Key values will be arranged to make the answers come out right.

When comparing x and y:

Status of x, y          Response      New status   Info gain
N, N                    x > y         W, L         2
W, N                    x > y         W, L         1
WL, N                   x > y         WL, L        1
L, N                    x < y         L, W         1
W, W                    x > y         W, WL        1
L, L                    x > y         WL, L        1
W, L; WL, L; W, WL      x > y         N/C          0
L, W; L, WL; WL, W      x < y         N/C          0
WL, WL                  consistent    N/C          0

(N/C means no change; "consistent" means the answer is dictated by the key values already assigned.)

Accumulating information

2n-2 units of information are required to solve the problem: all keys except one must lose at least once, and all keys except one must win at least once. Only a comparison of an N, N pair yields 2 units. There can be at most n/2 such comparisons, which yield at most n units. Each of the remaining n-2 units costs at least one further comparison. Hence at least n/2 + (n-2) = 3n/2 - 2 comparisons are needed (for n even).

The upper bound is given by the following algorithm:

- Compare the elements pairwise; put the losers in one pile and the winners in another pile.
- Find the max of the winners and the min of the losers.

For n even this uses n/2 + (n/2 - 1) + (n/2 - 1) = 3n/2 - 2 comparisons. Since this matches the lower bound, the algorithm is optimal.
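Both halves of this argument can be exercised in a short sketch: an adversary that answers according to the response table, and the pairing algorithm run against it. Everything named here (Adversary, max_min, SCORE, AFTER_WIN, AFTER_LOSS) is made up for illustration. The way key values are assigned is one possible choice and an assumption of this sketch, since the notes only say that values "will be arranged": an element that has never lost may be moved above every other key, and an element that has never won may be moved below every other key, without contradicting earlier answers. The sketch also assumes n >= 1 and that an element is never compared with itself.

```python
from itertools import count

# Encoding of the response table: the element with the higher score is
# declared larger; on equal labels (other than WL, WL) x is declared larger.
SCORE = {"W": 3, "WL": 2, "N": 1, "L": 0}
AFTER_WIN = {"N": "W", "W": "W", "L": "WL", "WL": "WL"}
AFTER_LOSS = {"N": "L", "L": "L", "W": "WL", "WL": "WL"}


class Adversary:
    """Answers "is x > y?" while giving away as little information as it can."""

    def __init__(self, n):
        self.label = ["N"] * n
        self.value = [0] * n          # key values, fixed only when needed
        self.high = count(1)          # fresh keys above all keys used so far
        self.low = count(-1, -1)      # fresh keys below all keys used so far
        self.log = []                 # (winner, loser) for every comparison
        self.units = 0                # units of information given away

    def greater(self, x, y):
        lx, ly = self.label[x], self.label[y]
        if lx == ly == "WL":
            x_wins = self.value[x] > self.value[y]   # keys are already fixed
        else:
            x_wins = SCORE[lx] >= SCORE[ly]
        winner, loser = (x, y) if x_wins else (y, x)
        # Arrange the keys so that this answer and all earlier ones hold.
        if self.label[winner] in ("N", "W"):         # never lost: may move up
            self.value[winner] = next(self.high)
        if self.label[loser] in ("N", "L"):          # never won: may move down
            self.value[loser] = next(self.low)
        for i, step in ((winner, AFTER_WIN), (loser, AFTER_LOSS)):
            new = step[self.label[i]]
            self.units += new != self.label[i]       # one unit per label change
            self.label[i] = new
        self.log.append((winner, loser))
        return x_wins


def max_min(n, greater):
    """Pairing algorithm on indices 0..n-1; returns (max index, min index)."""
    winners, losers = [], []
    for i in range(0, n - 1, 2):
        if greater(i, i + 1):
            winners.append(i)
            losers.append(i + 1)
        else:
            winners.append(i + 1)
            losers.append(i)
    if n % 2:                         # odd n: the unpaired element joins both piles
        winners.append(n - 1)
        losers.append(n - 1)
    hi = winners[0]
    for i in winners[1:]:
        if greater(i, hi):
            hi = i
    lo = losers[0]
    for i in losers[1:]:
        if greater(lo, i):
            lo = i
    return hi, lo
```

Calling max_min(8, Adversary(8).greater) makes 10 = 3n/2 - 2 comparisons and the adversary ends up having given away 14 = 2n-2 units; for n = 7 the count is 9 = (3/2)n - 3/2. The log kept by the adversary makes it possible to confirm afterwards that the final key values agree with every answer it gave, which is the "no cheating" requirement.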