Methodologies for Analyzing Algorithms

Essentially, today's lecture is going to look very similar to the first lecture in CS2. One of the major goals of CS3 is to be able to analyze the time efficiency of algorithms. If you have forgotten what we talked about in that class, skim over my first lecture for CS2 this semester. It discusses most of what I want you to remember. In this lecture I will add some tidbits and highlight the parts of that lecture that I feel are important.

Since analyzing algorithms is going to be a central focus in this course, we will have weekly algorithms to analyze, starting next week. Thus, you will get to practice your algorithm analysis skills throughout the semester. In CS2 you were shown quite a bit of the analysis of algorithms, but in this class you will get to practice producing these analyses on your own.

There are two main ways we will analyze algorithms:

1) Experimentally
2) Analytically

In order to do #1, we must run our algorithm many times with different sized inputs, tracking the running time for each input size. (Typically we want to repeat each of these runs many times to get reasonable average values for run-times.) Once we have a graph of run-time versus input size, we would like to find a function (within a constant factor) that best fits the given data. This can be done by analyzing the fraction t(n)/f(n), where t(n) is the experimental run time and f(n) is a function you are "testing to see if it's the actual run time."

In order to do #2, you must look at the steps of the algorithm and determine how many "simple instructions" the algorithm will execute in terms of the input size, n. Then we will try to show that the number of steps is O(f(n)), for some function f(n). (This would prove something about the worst-case running time of a function.)

As always, there are three different types of algorithmic running times we are concerned with:

1) Average running time
2) Worst case running time
3) Best case running time

The reason I have these listed in this particular order is because of the importance of each. The most common goal is to minimize the average run time of an algorithm; thus, this seems to be the most important value to find. Next, it would be nice to be able to make some guarantee that an algorithm will "do no worse than ..." Finally, the last is usually not of great importance, but something that may be computed for curiosity's sake.

With respect to determining each of these, the textbook goes into some detail about pseudocode and the RAM Model. I will not strictly stick to either of these for the course. When I describe algorithms, sometimes I will show you code to implement them and other times I will write my own "pseudocode" to describe them. Also, I won't make reference to the RAM Model all that often. Instead, I will simply assume that all SIMPLE statements/expressions in programming languages take a constant amount of time to execute. These include variable assignments, method CALLS (just the call itself), arithmetic, comparisons, indexing an array, and return statements.

Example of Counting Primitive Operations

I will use the same example from my CS2 notes, a binary search of a sorted array of size n:

    low = 0;
    high = n-1;
    while (low <= high) {
        mid = (low+high)/2;
        if (val == A[mid]) found...
        else if (val > A[mid]) low = mid+1;
        else high = mid-1;
    }

First, we run each of the assignment statements before the while loop: 2 simple statements. We make the comparison in the while loop at most log2 n + 2 times. Can you tell me why?
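Before we finish the tally, here is a quick way to check that claim empirically. The following is a minimal sketch of my own (the array contents, the absent search value, and the class name BinarySearchCount are choices made purely for illustration, not part of the lecture's code): it fills a sorted array with 0..n-1, searches for val = n, which is guaranteed to be absent and therefore forces the longest possible search, and counts how many times the while-loop condition is evaluated.

    // Sketch: empirically check that the (low <= high) comparison is evaluated
    // at most floor(log2 n) + 2 times in a worst-case binary search.
    // Assumptions (mine): A holds 0..n-1 and val = n, so the search always fails.
    public class BinarySearchCount {
        public static void main(String[] args) {
            for (int n = 1; n <= 1000000; n *= 10) {
                int[] A = new int[n];
                for (int i = 0; i < n; i++) A[i] = i;   // sorted array 0..n-1
                int val = n;                            // not in A: worst case
                int low = 0, high = n - 1, iterations = 0;
                while (low <= high) {
                    iterations++;
                    int mid = (low + high) / 2;
                    if (val == A[mid]) break;            // never taken here
                    else if (val > A[mid]) low = mid + 1;
                    else high = mid - 1;
                }
                // The loop condition is tested once per iteration, plus one
                // final failing test, so it is evaluated iterations + 1 times.
                int comparisons = iterations + 1;
                int bound = (int) (Math.log(n) / Math.log(2)) + 2;  // floor(log2 n) + 2
                System.out.println("n = " + n + ": " + comparisons
                        + " loop-condition tests, claimed bound = " + bound);
            }
        }
    }

Running this for n = 1, 10, 100, ..., 1000000 should show the count never exceeding the claimed bound, which is a useful sanity check even though it is not a proof.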
Continuing the count: the next statement (mid = (low+high)/2) involves 3 simple operations: an addition, a division, and an assignment. This gets run at most log2 n + 1 times. Each of the two comparisons in the if statement executes at most log2 n + 1 times. Finally, for each iteration we also have an arithmetic operation and an assignment statement (e.g. high = mid-1). Adding, we have:

    2 + (log2 n + 2) + 3(log2 n + 1) + 4(log2 n + 1) = 8 log2 n + 11 = Θ(lg n)

What I have done in the above example is the WORST CASE analysis. For each statement, I have determined the maximum number of times it could possibly execute. Based on this, I am guaranteed that no execution of this algorithm will result in more than a constant times log2 n simple statements.

One very important note is that just because we've determined an UPPER bound for the running time of an algorithm does NOT mean that we've come up with a TIGHT UPPER BOUND for the worst case running time. Consider a different analysis of this algorithm: since low or high changes by at least one during each loop iteration, each loop statement runs at most n times. There are 11 statements in the code total, each of which is run at most n times; thus the worst case running time is at most 11n simple instructions, or O(n).

There is NOTHING wrong with this analysis. It is correct, but it is also misleading (like the statement, "Kobe Bryant is a better basketball player than some high school basketball players."). Although the algorithm above is O(n), that does not mean that the ACTUAL running time of the algorithm is necessarily a constant times n. The fewer simplifications we make in our analysis, the more accurate the Big-Oh bounds we determine will be. In order to improve the analysis above, we must make the observation that the difference between high and low decreases by a factor of 2 each iteration, not just that it decreases.

In our analyses, we will ALSO try to determine and prove average case running times. (Note: average case running times are ALWAYS in between worst and best case times.)

Big-Oh Notation

I will state the definition of Big-Oh here and go through a couple of proofs showing Big-Oh bounds of functions.

Definition of f(n) = O(g(n)): there exist positive constants n0 and c such that for all n ≥ n0, f(n) ≤ c·g(n).

Using this definition, we'll prove the following: 6n^2 + 200n log n + 2 = O(n^2).

Pick n0 = 2 and c = 208. We will prove that 6n^2 + 200n log n + 2 ≤ 208n^2, for all n ≥ 2:

    6n^2 + 200n log n + 2 ≤ 6n^2 + 200n(n) + 2,     since log n < n
                          ≤ 6n^2 + 200n^2 + 2n^2,   since n > 1
                          = 208n^2.

In general, hopefully you can see from the technique that I used here that any function that is a SUM of terms is going to be Big-Oh of the "largest order" term of the SUM.

Now, let's prove the following: 5n log2 n = O(n^1.5).

Pick n0 = 64 and c = 5. We will prove that 5n log2 n ≤ 5n^1.5, for all n ≥ 64:

    5n log2 n ≤ 5n(√n) = 5n^1.5, for all n ≥ 64.

This is because log2 64 < 64^0.5, AND comparing derivatives of the two functions shows that the square root function grows more quickly than the log function for all n ≥ 64. (A numerical spot-check of both of these bounds appears right after the rules below.)

Now, let's look at some rules that we can use to simplify functions into their Big-Oh representation:

1. If d(n) is O(f(n)), then a·d(n) is O(f(n)), for any constant a > 0.
2. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n) + e(n) is O(f(n) + g(n)).
3. If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)·e(n) is O(f(n)·g(n)).
4. If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n) is O(g(n)).
5. If f(n) is a polynomial of degree k, then f(n) is O(n^k).
6. n^c is O(a^n), for any constants c > 0 and a > 1.
7. log n^c is O(log n), for any constant c > 0.
8. log^c n is O(n^d), for any positive constants c and d.
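Here is the numerical spot-check mentioned above. It is a sketch, not a proof (the sample points and the class name BigOhCheck are my own choices, and I am reading the unspecified "log" in the first bound as log base 2): for each claim it prints the ratio f(n)/(c·g(n)), which should stay at or below 1 once n ≥ n0.

    // Spot-check the two bounds proved above by printing f(n) / (c * g(n)).
    // Assumption (mine): "log n" in the first claim means log base 2.
    public class BigOhCheck {
        static double log2(double x) { return Math.log(x) / Math.log(2); }

        public static void main(String[] args) {
            // Claim 1: 6n^2 + 200 n log2 n + 2 <= 208 n^2 for all n >= 2.
            System.out.println("n\t(6n^2 + 200n log2 n + 2) / (208 n^2)");
            for (int n = 2; n <= 2048; n *= 2) {
                double f = 6.0 * n * n + 200.0 * n * log2(n) + 2;
                System.out.println(n + "\t" + f / (208.0 * n * n));  // should be <= 1
            }

            // Claim 2: 5 n log2 n <= 5 n^1.5 for all n >= 64.
            System.out.println("n\t(5n log2 n) / (5 n^1.5)");
            for (int n = 64; n <= 65536; n *= 2) {
                double f = 5.0 * n * log2(n);
                double g = 5.0 * Math.pow(n, 1.5);
                System.out.println(n + "\t" + f / g);                // should be <= 1
            }
        }
    }

Notice that both ratios stay at or below 1 for the sampled n and shrink as n grows; the constants c = 208 and c = 5 are far from tight for large n, which is fine, since Big-Oh only requires SOME constant that works beyond SOME n0.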
Since it's in poor taste to say that a function is O(8n/log n + 2n), although perfectly correct, we can use the rules above to simplify this expression into something nicer:

    8n/log n + 2n  is O(n/log n + n),         by rules 1 and 2
                   is O(n)·O(1/log n) + O(n), by rule 3
                   is O(n)·O(1) + O(n),       by rule 4
                   is O(n) + O(n),            by rule 2
                   is O(n),                   by rule 5

Final Notes about Big-Oh

Using the rules above, we always want to simplify the running time of an algorithm to be big-Oh and/or big-Omega of a simple function without constants. In doing so, knowing the "ordering" of most of the major simple functions is important. Rule 8 says that logs to any power are smaller than any polynomial term, and rule 6 says that any polynomial term is smaller than any exponential term. These are probably the most important rules. For example, they can be used to show that 3^n·n^10 = O(4^n):

    3^n·n^10 = O(3^n)·O(n^10),     by rule 3
             = O(3^n)·O((4/3)^n),  by rule 6
             = O((3^n)·(4/3)^n),   by rule 3
             = O(4^n)

Also, the limit definitions (included in the text) may be easier to apply than the definition I have included here. Finally, big-Oh notation does hide constants, so just because one algorithm has a better big-Oh bound than another doesn't necessarily mean it will outperform the other algorithm in practice.

Let's end class with an analysis of the following algorithm, which determines whether the number n is prime. Determine its worst and best case running times, in terms of the input n:

    int test = 2;
    while (n % test != 0)
        test++;
    if (n == test)
        System.out.println("Prime.");
    else
        System.out.println("Not Prime.");

Can you think of a simple tweak to this algorithm that will improve its worst case running time substantially?
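To get a feel for the best and worst cases, one option is the experimental approach (#1) from the start of the lecture: run the loop on several inputs, record how much work it does, and then compare that against a candidate f(n) using the t(n)/f(n) idea. Below is a minimal sketch of mine; it counts loop iterations instead of wall-clock time (a more stable measure for small inputs), the sample values and the class name TrialDivisionCount are arbitrary choices, and the candidate f(n) is deliberately left for you to supply.

    // Sketch: count how many times the while-loop body above runs for a few
    // sample inputs. Try your own values, then pick a candidate f(n) and look
    // at the ratio count/f(n), as described at the start of the lecture.
    public class TrialDivisionCount {
        static long loopCount(int n) {
            int test = 2;
            long count = 0;
            while (n % test != 0) {
                test++;
                count++;                  // one pass through the loop body
            }
            return count;
        }

        public static void main(String[] args) {
            int[] samples = {100, 101, 1000, 1009, 10000, 10007, 100000, 100003};
            for (int n : samples) {
                System.out.println("n = " + n + ": " + loopCount(n) + " iterations");
            }
        }
    }

Look at which inputs finish almost immediately and which ones make the loop run much longer; that contrast is exactly the gap between the best and worst cases you are asked to characterize.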