Foundations of Algorithms, Fourth Edition
Richard Neapolitan, Kumarss Naimipour
Updated by Richard P. Simpson
Chapter 1: Algorithms: Efficiency, Analysis, and Order

What is a problem?
• A problem is a question to which we seek an answer.
• Examples
  - We want to rearrange a list of numbers in numerical order (sorting).
  - Determine whether the number x is in a list S of n numbers.
  - What is the 25th Fibonacci number?

What is an instance of a problem?
• An instance of a problem is a specific assignment of the parameters that define the problem.
• For example, in the case of sorting n numbers we must be given the n numbers themselves, in a specific order, together with n, the number of values to sort. This creates the specific case we are interested in.

What is an Algorithm?
• In mathematics and computer science, an algorithm (from Algoritmi, the Latin form of Al-Khwārizmī) is an effective method expressed as a finite list of well-defined instructions for calculating a function.
• I.e., a step-by-step solution to the problem.
• In computer systems, an algorithm is basically an instance of logic written in software by software developers to be effective for the intended "target" computer(s), so that the target machines produce output from the given input (perhaps null).

Sequential Search
• Problem: Is the key x in the array S of n keys?
• Inputs (parameters): integer n, array S of keys indexed from 1 to n (0 to n-1?)
• Outputs: location, the index of x in S, or 0 if x is not in S.

  void seqsearch(int n, const int S[], int x, index& location)
  {
      location = 1;
      while (location <= n && S[location] != x)
          location++;
      if (location > n)
          location = 0;
  }

Matrix Multiplication

  void matrixmult(int n, const int A[][], const int B[][], int C[][])
  {
      index i, j, k;
      for (i = 1; i <= n; i++)
          for (j = 1; j <= n; j++) {
              C[i][j] = 0;
              for (k = 1; k <= n; k++)
                  C[i][j] = C[i][j] + A[i][k] * B[k][j];
          }
  }

Searching Arrays
• Sequential Search
• Binary Search
  - Recursive (be able to write this! a sketch appears after the average-case analysis below)
  - Non-recursive (in book)
• A problem can be solved using many different algorithms. These may vary in efficiency and/or complexity (we will discuss this later). See Table 1.1.

Recursive Fibonacci
• See the Wolfram MathWorld discussion.
• 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, . . .
• f(0) = 0, f(1) = 1, f(n) = f(n-1) + f(n-2) for n ≥ 2

Recursive Solution

  int fib(int n)
  {
      if (n <= 1)
          return n;
      else
          return fib(n-1) + fib(n-2);
  }

• The recursive algorithm visits every node of its recursion tree, recomputing the same Fibonacci values over and over. Is this efficient??

Iterative Version

  int fib2(int n)
  {
      index i;
      int f[0..n];          // array indexed from 0 to n
      f[0] = 0;
      if (n > 0) {
          f[1] = 1;
          for (i = 2; i <= n; i++)
              f[i] = f[i-1] + f[i-2];
      }
      return f[n];
  }

• Fills the array f[0..n] from left to right (0, 1, 1, 2, 3, 5, 8, ... at indices 0, 1, 2, ..., n). Very efficient!
• See Table 1.2.

Analysis of Algorithms
• Complexity Analysis
  - This is a measure of the amount of work done by an algorithm as a function of its input data size. I.e., it is a function of n.
• Efficiency
  - I use this term in a very specific way. If two different algorithms having the same complexity are run on the same data set, the execution times will probably differ. The faster algorithm is more efficient than the other one.

Types of complexity
• Worst case (our main focus)
• Best case
• Average case
• Every case (i.e., best = worst)

Average case complexity
• Sequential Search
  - Suppose we have an array of n items and would like to do a sequential search for the value x. Also assume that the value x can be in any location with equal probability (i.e., 1/n).

  A(n) = Σ_{k=1..n} (k × 1/n) = (1/n) × Σ_{k=1..n} k = (1/n) × n(n+1)/2 = (n+1)/2

• See the analysis for the possibility that x is not in the array (p. 22).
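Recursive Binary Search (sketch)
The Searching Arrays slide above asks you to be able to write the recursive binary search. The following is a minimal sketch in the same C++-like style as seqsearch, not the textbook's exact listing; the name binsearch and the parameters low and high are illustrative choices, and the code assumes S is sorted in nondecreasing order and indexed from 1 to n.

  // Recursive binary search (a sketch): return the index of x in the
  // sorted array S[low..high], or 0 if x is not present.  Uses the same
  // 1-to-n indexing convention as seqsearch above.
  int binsearch(const int S[], int x, int low, int high)
  {
      if (low > high)                              // empty range: x not found
          return 0;
      int mid = (low + high) / 2;                  // midpoint of current range
      if (x == S[mid])
          return mid;                              // found it
      else if (x < S[mid])
          return binsearch(S, x, low, mid - 1);    // search the left half
      else
          return binsearch(S, x, mid + 1, high);   // search the right half
  }

The initial call for an n-key array would be binsearch(S, x, 1, n). Each recursive call halves the range, which is why binary search needs far fewer comparisons than sequential search on large arrays, the comparison the slides point to in Table 1.1.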
Complexity Classes (recall that n is the data set size!)
• Constant (1, 3, 9, 232, etc.)
• Linear (n, 2n, 3n-2, 21n+100, etc.)
• Quadratic (n², 2n²-3, 4n²-3n+23, etc.)
• Cubic (n³, 4n³+3n²-2n+7, etc.)
• Etc.
NOTE: The leading term of the polynomial is the most important term from a growth perspective.

Complexity Classes
• The complexity class of cubic polynomials is represented by the notation Θ(n³).
• Note that Θ(n³) is a set of functions.
• Common complexity sets include
  - Θ(lg n)
  - Θ(n)
  - Θ(n lg n)
  - Θ(n²)
  - Θ(2ⁿ)
  - Θ(n!)
  - etc.

Figure 1.3: Growth rates of some common complexity functions.

Doubling the Data Size
• If an algorithm is Θ(n²) and the data set is doubled, what happens to the execution time? Specifically, assume that we have 2n items. Hence (2n)² = 4n², so the work grows by a factor of four: four times as long!
• What about cubics? (A small demonstration program appears after the homework list.)

Big O (memorize! This is not a suggestion.)
Definition: For a given complexity function f(n), O(f(n)) is the set of functions g(n) for which there exists some positive real constant c and some nonnegative integer N such that for all n ≥ N,
  g(n) ≤ c × f(n).

Showing that 5n² + 3n - 6 ∈ Θ(n²)
We need to find a c and an N that make the following inequality true:
  5n² + 3n - 6 ≤ cn²
What would a good choice for c be? Try c = 6:
  5n² + 3n - 6 ≤ 6n²
  3n - 6 ≤ n²
  0 ≤ n² - 3n + 6
We can solve this or just guess. The last inequality holds for every n ≥ 0, so c = 6 with N = 1 works.

Big O, Big Ω, Big Θ
Figure 1.4: Illustrating "big O", Ω, and Θ. Ω(n²) contains the quadratics and everything that grows faster (n³, 2ⁿ, ...); O(n²) contains the quadratics and everything that grows more slowly (n lg n, n, lg n, ...); Θ(n²) is their intersection: exactly the quadratics.

Figure 1.5: The function n² + 10n eventually stays beneath the function 2n².

Another way of looking at it
Figure 1.6: The sets O(n²), Ω(n²), and Θ(n²). Some exemplary members are shown.

Logarithm Rules
The logarithm to the base b of x, denoted log_b x, is defined to be that number y such that b^y = x.
• log_b(x₁·x₂) = log_b x₁ + log_b x₂
• log_b(x₁/x₂) = log_b x₁ - log_b x₂
• log_b x^c = c·log_b x
• log_b x > 0 if x > 1
• log_b x = 0 if x = 1
• log_b x < 0 if 0 < x < 1

Additional Rules
For all real a > 0, b > 0, c > 0 and n:
• log_b a = log_c a / log_c b
• log_b(1/a) = -log_b a
• log_b a = 1 / log_a b
• a^(log_b n) = n^(log_b a)

Theorem: log(n!) ∈ Θ(n log n)

Case 1: n log n ∈ O(log(n!)).
  log(n!) = log(n·(n-1)·(n-2)···3·2·1)
          = log(n·(n-1)···(n/2)·(n/2 - 1)···2·1)
          ≥ log((n/2)·(n/2)···(n/2)·1·1···1)        [the first n/2 factors are each ≥ n/2]
          = log((n/2)^(n/2))
          = (n/2)·log(n/2), which is Θ(n log n)

Case 2: log(n!) ∈ O(n log n).
  log(n!) = log n + log(n-1) + log(n-2) + ... + log 2 + log 1
          < log n + log n + log n + ... + log n
          = n log n

The Little o
Theorem: If log(f) ∈ o(log(g)) and g(n) → ∞ as n → ∞, then f ∈ o(g).
Note: the above theorem does not apply to big O, for log(n²) ∈ O(log n) but n² ∉ O(n).

Application: Show that 2ⁿ ∈ o(nⁿ).
Taking the log of both functions, we have log₂(2ⁿ) = n·log₂ 2 and log₂(nⁿ) = n·log₂ n. Hence
  lim_{n→∞} (n·log₂ 2)/(n·log₂ n) = lim_{n→∞} 1/(log₂ n) = 0,
so log(2ⁿ) ∈ o(log(nⁿ)), which implies 2ⁿ ∈ o(nⁿ).

Theorem: lg n ∈ o(√n)
  lim_{n→∞} (lg n)/√n = lim_{n→∞} (ln n)/(ln 2 · √n)
                      = (1/ln 2) · lim_{n→∞} (ln n)/√n
                      = (1/ln 2) · lim_{n→∞} (1/n)/(1/(2√n))     [L'Hôpital's rule]
                      = (1/ln 2) · lim_{n→∞} 2/√n
                      = 0

Homework
• 1.1 problem 7
• 1.3 problem 14
• 1.4 problems 15, 19
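Supplement: Doubling the Data Size, Empirically
As a supplement to the Doubling the Data Size slide, the sketch below answers its closing question ("What about cubics?") by counting the multiplications the matrixmult algorithm performs, n³ of them, for inputs of size n and 2n. This program, the countMults helper, and the size n = 100 are illustrative assumptions, not material from the textbook.

  #include <iostream>

  // Count the multiplications matrixmult would perform on an n x n input:
  // three nested loops of n iterations each, so n * n * n multiplications.
  long long countMults(int n)
  {
      long long count = 0;
      for (int i = 1; i <= n; i++)
          for (int j = 1; j <= n; j++)
              for (int k = 1; k <= n; k++)
                  count++;                      // one multiplication A[i][k] * B[k][j]
      return count;
  }

  int main()
  {
      int n = 100;                              // illustrative size
      long long work  = countMults(n);          // n^3    = 1,000,000
      long long work2 = countMults(2 * n);      // (2n)^3 = 8,000,000
      std::cout << "work(n)  = " << work  << "\n"
                << "work(2n) = " << work2 << "\n"
                << "ratio    = " << (double)work2 / work << "\n";   // 8 for a cubic algorithm
      return 0;
  }

Doubling the input multiplies the operation count by 4 for a Θ(n²) algorithm and, as the ratio printed here shows, by 8 for a Θ(n³) algorithm such as matrixmult.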