CS 502 Analysis Of Algorithms, Summer 2006, University of Bridgeport

Analysis of Algorithms

The field of computer science that studies the efficiency of algorithms is known as analysis of algorithms.

Example 1: Bubblesort requires $n^2$ statements to sort $n$ elements in a given time interval. If computing speed is increased by a factor of 1,000,000, then $1{,}000{,}000\,n^2$ statements can be executed in the same time interval. But $1{,}000{,}000\,n^2 = (1000n)^2$, i.e. $1000n$ elements can be sorted, which is an increase by a factor of only 1000 (we lost 3 of the 6 orders of magnitude).

Example 2: Swap down in heapsort requires $\lg(n)$ statements for $n$ elements in a given time interval. If computing speed is increased by a factor of 16, then $16\lg(n)$ statements can be executed in the same time interval. But $16\lg n = \lg(n^{16})$ [since $\lg(m^n) = n\lg m$], i.e. $n^{16}$ elements can be processed: the solvable problem size grows exponentially with the speedup.

Algorithms for which the problem size that can be solved in the same time interval increases linearly with computing speed are called linear complexity algorithms.

This information is summarized in the table below.

Table 1: Effect of algorithm efficiency on ability to benefit from increases in speed

    Algorithm complexity        Increase in computing speed   Equivalent problem size solvable in the same time   Benefit from increase in speed
    Linear, i.e. $n$            1,000,000 ($10^6$)            $1{,}000{,}000\,n$ ($10^6 n$)                       Linear
    Quadratic, i.e. $n^2$       1,000,000 ($10^6$)            $1000\,n$ ($10^3 n$)                                Square root ($\sqrt{10^6}$)
    Logarithmic, i.e. $\lg(n)$  1,000,000 ($10^6$)            $n^{10^6}$ ($n$ to the power of $10^6$)             Exponential

Complexity of Algorithm Depends On Algorithm Implementation

Example 1: Bubblesort

Code:

    for i := 1 to n-1                 ----- line 1
        for j := 1 to n-1             ----- line 2
            if X[j] > X[j+1]          ----- line 3
                temp := X[j]          ----- line 4
                X[j] := X[j+1]        ----- line 5
                X[j+1] := temp        ----- line 6

    Statement #    # of executions
    1              $n$
    2              $(n-1)n$
    3              $(n-1)(n-1)$
    4, 5, 6        $\frac{n(n-1)}{2}$ each

$$\begin{aligned}
T(n) &= n + (n^2 - n) + (n^2 - 2n + 1) + \frac{3n^2 - 3n}{2} \\
     &= 2n^2 - 2n + 1 + \frac{3n^2 - 3n}{2} \\
     &= \frac{4n^2 - 4n}{2} + \frac{3n^2 - 3n}{2} + 1 \\
     &= \frac{7}{2}n^2 - \frac{7}{2}n + 1 = \frac{7n^2 - 7n + 2}{2}
\end{aligned}$$

Modify line 2 of the code to:

    for j := 1 to n - i               {this eliminates unnecessary comparisons}

    Statement #    # of executions
    1              $n$
    2              $\sum_{i=2}^{n} i = \frac{n(n+1)}{2} - 1$
    3, 4, 5, 6     $\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2}$ each

$$\begin{aligned}
T(n) &= n + \frac{n^2 + n}{2} - 1 + 2(n^2 - n) \\
     &= \frac{n^2 + n}{2} + (2n^2 + n - 2n - 1) \\
     &= \frac{n^2 + n}{2} + \frac{4n^2 - 2n - 2}{2} \\
     &= \frac{5n^2 - n - 2}{2} = \frac{5}{2}n^2 - \frac{1}{2}n - 1
\end{aligned}$$

Thus we see that as the algorithm implementation varies, T(n) varies.

Some Definitions

Characteristic Operation: A statement performed repeatedly; used for finding the time complexity T(n).

Time Complexity: The number of characteristic operations performed.

A characteristic operation and its corresponding time complexity function are called realistic if the execution time with respect to some implementation is bounded by a linear function of T(n).

Example: Choosing the characteristic operation of Bubblesort

    Statement # chosen as characteristic operation    Time complexity
    1                                                 $n$
    2                                                 $(n-1)n$
    3                                                 $(n-1)(n-1)$
    4, 5, 6                                           0 to $(n-1)(n-1)$, depending on the condition

We've seen that for bubblesort, $T(n) \in O(n^2)$. From the table above, the execution time is bounded by a linear function of the time complexity only if the characteristic operation is chosen to be either statement 2 or statement 3. But statement 2 (for j := 1 to n-1) is a looping construct that does not have any inherent connection to the meaning of the algorithm.
Statement 3 (if X[j] > X[j+1]), on the other hand, is the core of the algorithm and hence a good choice of characteristic operation for this algorithm implementation.

Another variation of the bubblesort code:

Code:

    pass := 1                             ----- line 1
    swap := true                          ----- line 2
    while swap and (pass <= n-1)          ----- line 3
        swap := false                     ----- line 4
        for j := 1 to (n - pass)          ----- line 5
            if X[j] > X[j+1]              ----- line 6
                temp := X[j]              ----- line 7
                X[j] := X[j+1]            ----- line 8
                X[j+1] := temp            ----- line 9
                swap := true              ----- line 10
        pass := pass + 1                  ----- line 11 {missing in the original listing; needed for the pass count below}

Here, the number of passes varies from 1 to (n - 1), depending on the contents of the array.

Best, Worst and Average Time Complexities

Suppose algorithm P accepts k different instances of size n. Let $T_i(n)$ be the time complexity of P given the $i$th instance ($1 \le i \le k$), and let $p_i$ be the probability that this instance occurs. Then,

Best-case complexity: $B(n) = \min_{1 \le i \le k} T_i(n)$

Worst-case complexity: $W(n) = \max_{1 \le i \le k} T_i(n)$

Average-case complexity: $A(n) = \sum_{i=1}^{k} p_i T_i(n)$

Example: Linear Search

Code:

    LinearSearch (var R: RAtype; a, b, x: integer): boolean
    var i: integer; found: boolean;
    i := a; found := false
    while (i ≤ b) and not found
        found := (R[i] = x)               {characteristic operation}
        i := i + 1
    LinearSearch := found

The algorithm has infinitely many instances, but all of them fall into (n + 1) classes: x = R[1], x = R[2], x = R[3], ..., x = R[n], and x ∉ R. Here p is the probability that x is in the array.

    i      Instance        p_i      T_i(n)
    1      x = R[1]        p/n      1
    2      x = R[2]        p/n      2
    i      x = R[i]        p/n      i
    n      x = R[n]        p/n      n
    n+1    x ∉ R[1..n]     1 - p    n

Here, B(n) = 1 and W(n) = n (the worst case occurs in two classes: x = R[n] and x ∉ R).

$$\begin{aligned}
A(n) &= \sum_{i=1}^{n+1} p_i T_i(n) \\
     &= \sum_{i=1}^{n} p_i T_i(n) + p_{n+1} T_{n+1}(n) \\
     &= \sum_{i=1}^{n} \frac{p}{n}\, i + (1 - p)n \\
     &= \frac{p}{n} \sum_{i=1}^{n} i + (1 - p)n \\
     &= \frac{p}{n} \cdot \frac{n(n+1)}{2} + (1 - p)n \\
     &= \frac{p(n+1)}{2} + (1 - p)n
\end{aligned}$$

Variation: Linear Search with Ordered Array (early stop possible)

Here a failed search can stop as soon as it reaches an element larger than x. The first n classes are successes and the last n classes are failures:

    i       Instance        p_i          T_i(n)
    1       x = A[1]        p/n          1          (success)
    2       x = A[2]        p/n          2
    i       x = A[i]        p/n          i
    n       x = A[n]        p/n          n
    n+1     x < A[1]        (1 - p)/n    1          (failure)
    n+2     x < A[2]        (1 - p)/n    2
    j       x < A[j-n]      (1 - p)/n    j - n
    2n-1    x < A[n-1]      (1 - p)/n    n - 1
    2n      x < A[n]        (1 - p)/n    n

For linear search with an ordered array,

$$\begin{aligned}
A(n) &= \sum_{i=1}^{2n} p_i T_i(n) \\
     &= \sum_{i=1}^{n} p_i T_i(n) + \sum_{i=n+1}^{2n} p_i T_i(n) \\
     &= \frac{p}{n} \sum_{i=1}^{n} i + \frac{1-p}{n} \sum_{i=n+1}^{2n} (i - n) \\
     &= \frac{p}{n} \cdot \frac{n(n+1)}{2} + \frac{1-p}{n} \sum_{i=1}^{n} i \\
     &= \frac{p(n+1)}{2} + \frac{(1-p)(n+1)}{2} \\
     &= \frac{n+1}{2}
\end{aligned}$$

Q. When is early stop (ordered array) better than no early stop (unordered array)?

This occurs when

$$\begin{aligned}
A_{\text{UNORDERED}}(n) &> A_{\text{EARLY STOP}}(n) \\
\text{or,}\quad \frac{p(n+1)}{2} + (1-p)n &> \frac{n+1}{2} \\
\text{or,}\quad \frac{p(n+1) + 2n(1-p)}{2} &> \frac{n+1}{2} \\
\text{or,}\quad pn + p + 2n - 2pn &> n + 1 \\
\text{or,}\quad p + 2n - pn &> n + 1 \\
\text{or,}\quad p - pn &> 1 - n \\
\text{or,}\quad p(1 - n) &> 1 - n \\
\text{or,}\quad p &< 1
\end{aligned}$$

(We assume n > 1, so 1 - n is negative and the inequality sign changes on division.)

This shows that, under our assumptions, early stop is better whenever Pr(element in array) < 1.0.

Analysis of Recursive Algorithms

Let T(n) ≡ time complexity of a (recursive) algorithm. The time complexity can itself be defined recursively.

Example 1: Factorial Function

Code:

    int factorial (int n)
    {
        if (n == 0)
            return 1;
        else
            return n * factorial (n - 1);
    }

Recurrence Equation:

    T(n) = T(n - 1) + 1;   n > 0   {all n > 0}
    T(0) = 1;              n = 0   {base case}

Thus,

    T(n - 1) = T(n - 2) + 1
    T(n - 2) = T(n - 3) + 1
    T(n - 3) = T(n - 4) + 1
    ...

There are a variety of techniques for solving a recurrence equation, that is, for expressing T(n) without T(n - 1) on the right side of the equation, to get a closed form.
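Before solving it, the factorial recurrence above can be evaluated directly for small n. The following is a minimal Python sketch, not part of the original notes: it assumes the characteristic operation is the test `n == 0` (the notes do not name it), and the function names `factorial_with_count` and `T` are illustrative only.

```python
def factorial_with_count(n):
    """Return (n!, number of 'n == 0' tests executed)."""
    if n == 0:
        return 1, 1                      # base case: one test, so T(0) = 1
    value, tests = factorial_with_count(n - 1)
    return n * value, tests + 1          # one more test: T(n) = T(n - 1) + 1


def T(n):
    """Evaluate the recurrence directly, without a closed form."""
    return 1 if n == 0 else T(n - 1) + 1


# The instrumented function and the recurrence agree for small n.
for n in range(6):
    assert factorial_with_count(n)[1] == T(n)

print([T(n) for n in range(6)])          # prints [1, 2, 3, 4, 5, 6]
```

Tabulating values like this suggests the pattern that a closed form has to capture; it does not replace the derivation.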
We'll start with the simplest technique: Repeated Substitution.

Solving the recurrence equation of the factorial function to get the closed form,

$$\begin{aligned}
T(n) &= T(n-1) + 1 \\
     &= [T(n-2) + 1] + 1 \\
     &= [[T(n-3) + 1] + 1] + 1 \\
     &= \dots \\
     &= T(n-i) + i \qquad \{i\text{th level of substitution}\}
\end{aligned}$$

If i = n, T(n) = T(0) + n, or T(n) = n + 1.

Example 2: Towers of Hanoi

Code:

    procedure Hanoi (n: integer; from, to, aux: char)
    if n > 0
        Hanoi (n - 1, from, aux, to)
        write ('from ', from, ' to ', to)
        Hanoi (n - 1, aux, to, from)

Here,

    T(0) = 0
    T(n) = T(n - 1) + 1 + T(n - 1)
         = 2T(n - 1) + 1;   n > 0

Solving the recurrence equation to get the closed form using repeated substitution, we have

    T(n)     = 2T(n - 1) + 1
    T(n - 1) = 2T(n - 2) + 1
    T(n - 2) = 2T(n - 3) + 1
    T(n - 3) = 2T(n - 4) + 1

so that

$$\begin{aligned}
T(n) &= 2T(n-1) + 1 \\
     &= 2[2T(n-2) + 1] + 1 \\
     &= 2[2[2T(n-3) + 1] + 1] + 1 \\
     &= 2[2^2 T(n-3) + 2^1 + 2^0] + 2^0 \\
     &= 2^3 T(n-3) + 2^2 + 2^1 + 2^0
\end{aligned}$$

Repeating i times,

$$T(n) = 2^i T(n-i) + 2^{i-1} + 2^{i-2} + \dots + 2^0$$

If i = n,

$$\begin{aligned}
T(n) &= 2^n T(0) + 2^{n-1} + 2^{n-2} + \dots + 2^0 \\
     &= 2^{n-1} + 2^{n-2} + \dots + 2^0 \\
     &= \sum_{i=0}^{n-1} 2^i
\end{aligned}$$

This is a geometric series whose sum is given by

$$S_n = \frac{a - ar^n}{1 - r}$$

Here, a = 1, r = 2 and the number of terms is n. Therefore,

$$S_n = \frac{1 - 2^n}{1 - 2} = 2^n - 1$$

i.e. $T(n) = 2^n - 1$.

Example 3: Binary Search

Code:

    function BinSearch (var R: Ratype; a, b: indextype; x: keytype): boolean
    var mid: indextype
    if a > b
        BinSearch := false
    else
        mid := (a + b) div 2
        if x = R[mid]                     {characteristic operation}
            BinSearch := true
        else if x < R[mid]
            BinSearch := BinSearch(R, a, mid - 1, x)
        else
            BinSearch := BinSearch(R, mid + 1, b, x)

Complexity of BinSearch(R, 1, n, x):

Here, T(0) = 0 and

$$T(n) = \begin{cases}
1 & \text{if } x = R[mid] \\
1 + T\left(\lfloor (n+1)/2 \rfloor - 1\right) & \text{if } x < R[mid] \\
1 + T\left(n - \lfloor (n+1)/2 \rfloor\right) & \text{if } x > R[mid]
\end{cases}$$

We can eliminate the floor function by limiting n to $2^k - 1$, $k \in \mathbb{Z}^+$. An array of $2^k - 1$ elements then splits around mid into two halves of $2^{k-1} - 1$ elements each.

Now, T(0) = 0 and

$$T(2^k - 1) = \begin{cases}
1 & \text{if } x = R[mid] \\
1 + T(2^{k-1} - 1) & \text{if } x \ne R[mid]
\end{cases}$$

Best case occurs if x = R[mid] is true the very first time.
Thus, B(n) = 1.

Worst case occurs if x = R[mid] is always false. Now,

    W(0) = 0
    W(2^k - 1) = 1 + W(2^(k-1) - 1)

Solving the recurrence equation to get the closed form using repeated substitution,

$$\begin{aligned}
W(2^k - 1) &= 1 + W(2^{k-1} - 1) \\
           &= 1 + [1 + W(2^{k-2} - 1)] \\
           &= 1 + [1 + [1 + W(2^{k-3} - 1)]]
\end{aligned}$$

Repeating i times,

$$W(2^k - 1) = i + W(2^{k-i} - 1)$$

If i = k,

$$W(2^k - 1) = k + W(0) = k + 0 = k$$

We want W(n), so

    2^k - 1 = n
    or, lg 2^k = lg (n + 1)
    or, k = lg (n + 1)          {k in terms of n}

Thus, W(n) = lg (n + 1).

Average Case Complexity

First we need to set up a model. As a first step we'll determine the average case complexity given that the element is present in the array. Here we take into account the probability of not finding the element at a particular level (in the worst case we took this probability to be 1, i.e. not finding the element at a given level was treated as a certainty) and weight the work done at the next level by it, recursively. Writing $A_{EP}$ for $A_{\text{ElementPresent}}$,

$$\begin{aligned}
A_{EP}(1) &= 1 \\
A_{EP}(2^k - 1) &= 1 + \left(1 - \frac{1}{2^k - 1}\right) A_{EP}(2^{k-1} - 1) \\
  &= 1 + \left(1 - \frac{1}{2^k - 1}\right)\left[1 + \left(1 - \frac{1}{2^{k-1} - 1}\right) A_{EP}(2^{k-2} - 1)\right] \\
  &\approx 1 + \left(1 - \frac{1}{2^k - 1}\right) + \left(1 - \frac{1}{2^{k-1} - 1}\right) + \left(1 - \frac{1}{2^{k-2} - 1}\right) A_{EP}(2^{k-3} - 1)
\end{aligned}$$

where, in the last step, each product of factors $\left(1 - \frac{1}{2^j - 1}\right)$ is approximated by its last factor.

Repeating i times,

$$A_{EP}(2^k - 1) \approx [i] - \sum_{j=0}^{i-2} \frac{1}{2^{k-j} - 1} + \left(1 - \frac{1}{2^{k-(i-1)} - 1}\right) A_{EP}(2^{k-i} - 1)$$

Let i = k - 1, so that $A_{EP}(2^{k-i} - 1) = A_{EP}(1) = 1$:

$$A_{EP}(2^k - 1) \approx [k - 1] + 1 - \sum_{i=0}^{k-2} \frac{1}{2^{k-i} - 1} = k - \sum_{i=0}^{k-2} \frac{1}{2^{k-i} - 1}$$

However, the terms of $\sum_{i=0}^{k-2} \frac{1}{2^{k-i} - 1} \approx \sum_{i=0}^{k-2} \frac{1}{2^{k-i}}$ are strictly increasing in i, therefore an (approximate) upper bound on the sum is given by the integral

$$\int_0^{k-1} \frac{1}{2^{k-x}}\,dx = \int_0^{k-1} 2^{x-k}\,dx = \left[\frac{2^{x-k}}{\ln 2}\right]_0^{k-1} = \frac{2^{-1}}{\ln 2} - \frac{2^{-k}}{\ln 2}$$

Substituting k = lg (n + 1), we have

$$\frac{2^{-1}}{\ln 2} - \frac{2^{-\lg(n+1)}}{\ln 2} = \frac{1}{2\ln 2} - \frac{1}{\ln 2\,(n+1)}$$

Now combining both terms, we have

$$A_{EP}(n) = \lg(n+1) - \frac{1}{2\ln 2} + \frac{1}{\ln 2\,(n+1)} \qquad \left[\frac{1}{2\ln 2} \approx 0.7213\right]$$

Given that p ≡ Pr(element exists in the array),

$$\begin{aligned}
A(n) &= p\,(\text{average cost when element is present}) + (1 - p)\,(\text{worst cost}) \\
     &= p\left[\lg(n+1) - \frac{1}{2\ln 2} + \frac{1}{\ln 2\,(n+1)}\right] + (1 - p)\,[\lg(n+1)]
\end{aligned}$$

For example, if n = 1023,

$$\begin{aligned}
A(n) &= p\left[\lg(1024) - 0.7213 + \frac{1}{0.6931\,(1024)}\right] + (1 - p)\,[\lg(1024)] \\
     &\approx p\,(9.3) + (1 - p)(10)
\end{aligned}$$

Solving Homogeneous Linear Recurrence Relations Using Method of Characteristic Roots
(Another Method To Solve Recurrence Equations)

Homogeneous Linear Recurrence Relation: The term linear means that each term of the sequence is defined as a linear function of the preceding terms. The coefficients and the constants may depend on n, even nonlinearly. Homogeneous means that the constant term of the relation is zero.

General form:

$$a_n + c_1 a_{n-1} + c_2 a_{n-2} + \dots + c_k a_{n-k} = 0$$

with initial conditions $a_{k-1} = \dots$, ..., $a_0 = \dots$

Example 1:

Recurrence Relation:

$$a_n = 5a_{n-1} - 6a_{n-2}, \qquad a_1 = 7, \quad a_0 = 1$$

Let $a_n = r^n$ for all n. Substituting this in the original recurrence relation, we have

$$r^n = 5r^{n-1} - 6r^{n-2} \quad\text{or,}\quad r^n - 5r^{n-1} + 6r^{n-2} = 0$$

Dividing by $r^{n-2}$ gives us

$$r^2 - 5r + 6 = 0 \quad\text{or,}\quad (r - 2)(r - 3) = 0$$

=> $r_1 = 2$, $r_2 = 3$
=> $a_n = 2^n$ or $a_n = 3^n$; both work.

Therefore, the general solution is:

$$a_n = \alpha 2^n + \beta 3^n$$

Now, consider the initial conditions:

    a_1 = 7:   α·2^1 + β·3^1 = 7   or,   2α + 3β = 7     ----- (1)
    a_0 = 1:   α·2^0 + β·3^0 = 1   or,   α + β = 1       ----- (2)

Substituting the value of α from (2) in (1), we get

    2(1 - β) + 3β = 7
    2 - 2β + 3β = 7
    2 + β = 7
    β = 5

Therefore, α = 1 - β = 1 - 5, or α = -4.

Thus, the specific solution is:

$$a_n = 5 \cdot 3^n - 4 \cdot 2^n$$

Cross-checking some values,

    a_0 = 5·3^0 - 4·2^0 = 1
    a_1 = 5·3^1 - 4·2^1 = 7
    a_2 = 5·3^2 - 4·2^2 = 29

These are the same as the values given by $a_n = 5a_{n-1} - 6a_{n-2}$:

    a_0 = 1; a_1 = 7; and a_2 = 5a_1 - 6a_0, or a_2 = 5·7 - 6·1 = 29

Example 2: Fibonacci Sequence

Recurrence Relation:

$$a_n = a_{n-1} + a_{n-2}, \qquad a_1 = 1, \quad a_0 = 0$$

Let $a_n = r^n$ for all n. Substituting this in the original recurrence relation, we have

$$r^n = r^{n-1} + r^{n-2} \quad\text{or,}\quad r^n - r^{n-1} - r^{n-2} = 0$$

Dividing by $r^{n-2}$ gives us

$$r^2 - r - 1 = 0$$

=> $r_1 = \frac{1 + \sqrt{5}}{2}$, $r_2 = \frac{1 - \sqrt{5}}{2}$
=> $a_n = \left(\frac{1 + \sqrt{5}}{2}\right)^n$ or $a_n = \left(\frac{1 - \sqrt{5}}{2}\right)^n$; both work.

Therefore, the general solution is:

$$a_n = \alpha\left(\frac{1 + \sqrt{5}}{2}\right)^n + \beta\left(\frac{1 - \sqrt{5}}{2}\right)^n$$

Now, consider the initial conditions:

$$a_1 = 1: \quad \alpha\left(\frac{1 + \sqrt{5}}{2}\right)^1 + \beta\left(\frac{1 - \sqrt{5}}{2}\right)^1 = 1 \qquad (1)$$

$$a_0 = 0: \quad \alpha + \beta = 0 \quad\text{or,}\quad \alpha = -\beta \qquad (2)$$

Substituting the value of α from (2) in (1), we get

$$\begin{aligned}
-\beta\left(\frac{1 + \sqrt{5}}{2}\right) + \beta\left(\frac{1 - \sqrt{5}}{2}\right) &= 1 \\
-\beta(1 + \sqrt{5}) + \beta(1 - \sqrt{5}) &= 2 \\
-\beta\sqrt{5} - \beta\sqrt{5} &= 2 \\
-2\sqrt{5}\,\beta &= 2 \\
\beta &= -\frac{1}{\sqrt{5}}, \qquad \alpha = \frac{1}{\sqrt{5}}
\end{aligned}$$

Thus, the specific solution is:

$$a_n = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^n$$

Cross-checking values,

$$a_0 = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^0 - \frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^0 = \frac{1}{\sqrt{5}} - \frac{1}{\sqrt{5}} = 0$$

$$a_1 = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5}}{2}\right)^1 - \frac{1}{\sqrt{5}}\left(\frac{1 - \sqrt{5}}{2}\right)^1 = \frac{1}{\sqrt{5}}\left(\frac{1 + \sqrt{5} - 1 + \sqrt{5}}{2}\right) = \frac{1}{\sqrt{5}} \cdot \frac{2\sqrt{5}}{2} = 1$$

Example 3:

Recurrence Relation:

$$a_n = a_{n-1} + 6a_{n-2}, \qquad a_1 = 8, \quad a_0 = 1$$

Let $a_n = r^n$ for all n. Substituting this in the original recurrence relation, we have

$$r^n = r^{n-1} + 6r^{n-2} \quad\text{or,}\quad r^n - r^{n-1} - 6r^{n-2} = 0$$

Dividing by $r^{n-2}$ gives us

$$r^2 - r - 6 = 0$$

=> $r_1 = -2$, $r_2 = 3$
=> $a_n = (-2)^n$ or $a_n = 3^n$; both work.

Therefore, the general solution is:

$$a_n = \alpha(-2)^n + \beta 3^n$$

Now, consider the initial conditions:

    a_1 = 8:   -2α + 3β = 8     ----- (1)
    a_0 = 1:   α + β = 1        ----- (2)

Substituting the value of α from (2) in (1), we get

    -2(1 - β) + 3β = 8
    -2 + 5β = 8
    5β = 10
    β = 2
    α = 1 - 2 = -1

Thus, the specific solution is:

$$a_n = -1(-2)^n + 2 \cdot 3^n$$

Cross-checking values,

    a_0 = -1(-2)^0 + 2(3)^0 = -1 + 2 = 1
    a_1 = -1(-2)^1 + 2(3)^1 = 2 + 6 = 8
    a_2 = -1(-2)^2 + 2(3)^2 = -4 + 18 = 14
    a_3 = -1(-2)^3 + 2(3)^3 = 8 + 54 = 62

Iterative Algorithms

Just as recursive algorithms lead naturally to recurrence equations, so iterative algorithms lead naturally to formulas involving summations. The following examples use an iterative approach to solve the given problem.

Example 1: Average Depth of BST Node

First we need to set up a model. Let $n = 2^k - 1 \approx 2^k$. For example, take k = 4, so that n = 15 ≈ 16.

Approach 1

One way to look at it is that half the total number of nodes have depth 3, one-fourth of the nodes have depth 2, and so on. Thus, the average depth of a node can be put in the form:

$$\frac{1}{2}\cdot 3 + \frac{1}{4}\cdot 2 + \frac{1}{8}\cdot 1 + \frac{1}{16}\cdot 0$$

or,

$$\sum_{i=1}^{k} \frac{1}{2^i}(k - i) = \sum_{i=1}^{k} \frac{k}{2^i} - \sum_{i=1}^{k} \frac{i}{2^i} = k\sum_{i=1}^{k} 2^{-i} - \sum_{i=1}^{k} i\,2^{-i}$$

This gives us a model from which a formula can be found. However, this approach uses approximate values. The second approach uses exact values and hence gives a better model.

Approach 2

Another way to look at it is that 1 node has depth 0, 2 nodes have depth 1, and so on.
Here, the total depth of all the nodes is given by the form:

$$2^0 \cdot 0 + 2^1 \cdot 1 + 2^2 \cdot 2 + 2^3 \cdot 3$$

or,

$$\sum_{i=0}^{k-1} i\,2^i = \sum_{i=1}^{k-1} i\,2^i$$

But we know

$$\sum_{i=1}^{k} i\,2^i = (k - 1)2^{k+1} + 2$$

(proof in handout: 'Mathematical Tools', Page 4).

Substituting this value in the equation above, we have

$$\sum_{i=1}^{k-1} i\,2^i = (k - 2)2^k + 2 = (\log_2(n+1) - 2)(n+1) + 2 \qquad [\text{since } n = 2^k - 1]$$

Thus,

$$\text{Average depth of full BST node} = \frac{(\log_2(n+1) - 2)(n+1) + 2}{n}$$

The following table shows some actual values of the average depth of BST nodes for given values of k:

    k     n        average depth
    4     15       2.27
    5     31       3.16
    6     63       4.10
    7     127      5.06
    8     255      6.03
    9     511      7.02
    10    1023     8.01
    11    2047     9.01
    12    4095     10.00
    13    8191     11.00
    14    16383    12.00
    15    32767    13.00
    16    65535    14.00

Example 2: Binary Search

Using the iterative approach, the following model can be set up for the average number of comparisons made by binary search.

Assume the total number of elements in the array is $n = 2^k - 1$. Then,

$$\begin{aligned}
\Pr(\text{comparisons} = 1) &= \frac{1}{2^k - 1} = \frac{2^0}{2^k - 1} \\
\Pr(\text{comparisons} = 2) &= \frac{2}{2^k - 1} = \frac{2^1}{2^k - 1} \\
&\;\;\vdots \\
\Pr(\text{comparisons} = k) &= \frac{2^{k-1}}{2^k - 1}
\end{aligned}$$

Therefore, assuming the element is in the array,

$$\text{Average number of comparisons} = \sum_{\text{comp}} (\text{\# of comparisons}) \cdot \Pr(\text{\# of comparisons}) = \sum_{i=1}^{k} i\,\frac{2^{i-1}}{2^k - 1} = \frac{\sum_{i=1}^{k} i\,2^{i-1}}{2^k - 1}$$

Solving the numerator, we have

$$\begin{aligned}
\sum_{i=1}^{k} i\,2^{i-1} &= \sum_{i=1}^{k} i\,[2^i - 2^{i-1}] \\
  &= \sum_{i=1}^{k} i\,2^i - \sum_{i=1}^{k} i\,2^{i-1} \\
  &= \sum_{i=1}^{k} i\,2^i - \sum_{i=0}^{k-1} (i + 1)\,2^i \\
  &= \sum_{i=1}^{k} i\,2^i - \sum_{i=0}^{k-1} i\,2^i - \sum_{i=0}^{k-1} 2^i \\
  &= k\,2^k - \sum_{i=0}^{k-1} 2^i
\end{aligned}$$

The remaining sum is a geometric progression with A = 1, R = 2 and N = k, so its sum is $S_N = 2^k - 1$. Therefore,

$$\sum_{i=1}^{k} i\,2^{i-1} = k\,2^k - (2^k - 1) = (k - 1)2^k + 1$$

Putting the entire expression together, we have

$$\begin{aligned}
\text{Average number of comparisons (assuming element is in array)} &= \frac{(k - 1)2^k + 1}{2^k - 1} \\
  &= \frac{(\lg(n+1) - 1)(n+1) + 1}{n} \\
  &= \frac{(n+1)\lg(n+1) - (n+1) + 1}{n} \\
  &= \frac{(n+1)\lg(n+1) - n}{n} \\
  &= \frac{(n+1)\lg(n+1)}{n} - 1
\end{aligned}$$

As $n \to \infty$, the complexity approaches $\lg(n+1) - 1$. [This is the large-N solution.]

For example, if n = 15 ($2^4 - 1$), i.e. k = 4, the numeric solution is given by:

$$\sum_{i=1}^{4} i\,\frac{2^{i-1}}{2^4 - 1} = \frac{1 + 4 + 12 + 32}{15} = \frac{49}{15} \approx 3.267$$

or,

$$\frac{(4 - 1)2^4 + 1}{2^4 - 1} = \frac{49}{15} \approx 3.267$$

Application of the analytical solution gives us:

$$\frac{(15 + 1)\lg(15 + 1)}{15} - 1 = \frac{(16)(4)}{15} - 1 = \frac{49}{15} \approx 3.267$$
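The closed form $\frac{(k-1)2^k + 1}{2^k - 1}$ for the average number of comparisons can also be checked by brute force. The sketch below is an illustration in Python rather than the Pascal-style code of the notes; it assumes an iterative binary search that counts one comparison per probe of the middle element, and the names `probes`, `average_probes` and `closed_form` are hypothetical.

```python
from fractions import Fraction


def probes(sorted_keys, x):
    """Count the 'x = R[mid]' tests made while searching for x."""
    lo, hi, count = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        count += 1                       # one characteristic operation
        if sorted_keys[mid] == x:
            return count
        if x < sorted_keys[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return count


def average_probes(k):
    """Exact average over all successful searches in an array of 2^k - 1 keys."""
    n = 2**k - 1
    keys = list(range(n))
    return Fraction(sum(probes(keys, x) for x in keys), n)


def closed_form(k):
    """((k - 1) * 2^k + 1) / (2^k - 1), as derived in the notes."""
    return Fraction((k - 1) * 2**k + 1, 2**k - 1)


for k in range(1, 11):
    assert average_probes(k) == closed_form(k)

print(average_probes(4))                 # prints 49/15
```

Exact rational arithmetic is used so that the comparison with the closed form is not affected by floating-point rounding.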