TDDB57 DALG – Lecture 2: Analysis of algorithms
J. Maluszynski, IDA, Linköpings Universitet, 2004
Reading: [Lewis/Denenberg 2.1], [Lewis/Denenberg 2.2], [Lewis/Denenberg 1.3 (except pp. 26-32)]

Overview
– correctness, termination, efficiency
– time efficiency: worst case, expected case, amortized
– mathematical background: comparing functions, growth rate
– analysis techniques for iterative algorithms
– analysis techniques for recursive algorithms

Correctness
"An algorithm must not give a wrong answer." [Lewis/Denenberg]
A function fact for computing the factorial must not return 6 for the call fact(2).
Which answers are wrong? Either the user knows that, or a specification of legal inputs and corresponding correct answers is needed.
An algorithm is correct iff for any legal input
– the computation terminates, and
– the answer is as specified.

What to analyze
Different algorithms may solve the same problem. How to compare them?
– correctness (TDDB28)
– termination
– efficiency

Termination
An algorithm should terminate: produce an answer in a finite number of steps for any legal input.
Termination is a difficult problem.
Example: an algorithm for squaring an integer, using n² = (n−1)² + 2n − 1:

function Square(integer n): integer
  if n = 0 then return 0
  if n ≠ 0 then return Square(n−1) + 2n − 1

It does not terminate for n < 0.

Efficiency
Resources used by an algorithm:
– memory
– time (focus of this course)
Analysis of time efficiency should be:
– machine-independent
– valid for all legal data
We compare:
– time growth-rate for growing size of (input) data (scalability)
– mostly for worst-case data

Analysis of running time: example

function TableSearch(table key T[0..n−1], key K): integer
(1) for i from 0 to n−1 do
(2)   if T[i] = K then return i
(3)   if T[i] > K then return −1
(4) return −1

What is the worst-case problem instance?
Worst case time: n·(t1 + t2 + t3) + t4, where n characterizes the size of the input data.

Principles of Algorithm Analysis
– An algorithm should work for (input) data of any size. (Example TableSearch: input size is the size of the table.)
– Estimate the used resource (time/memory) by a function of the size of the input.
– Estimation: comparison(?) with well-known functions.
– Focus on the worst-case performance (also average performance).
– Ignore constant factors: the analysis should be machine-independent; more powerful computers introduce a speed-up by constant factors.
– Study scalability / asymptotic behaviour for large problem sizes: ignore lower-order terms, focus on dominating terms.

Growth of well-known functions

n           2    16         64
log₂ n      1    4          6
n log₂ n    2    64         384
n²          4    256        4096
2ⁿ          4    6.5·10⁴    1.84·10¹⁹

(1.84·10¹⁹ µsec = 2.14·10⁸ days = 5845 centuries)
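As a concrete companion to the TableSearch example above, here is a small Python sketch (not part of the original slides): the search on a sorted table, instrumented with a loop counter so the worst case — a key larger than every table element — visibly costs all n iterations. The counter and the tuple return value are additions for illustration only.

# Sketch: sequential search in a sorted table (0-based indexing), with an iteration counter.
def table_search(table, key):
    iterations = 0
    for i in range(len(table)):
        iterations += 1
        if table[i] == key:
            return i, iterations
        if table[i] > key:        # table is sorted, so key cannot occur further right
            return -1, iterations
    return -1, iterations

if __name__ == "__main__":
    t = list(range(0, 100, 2))    # sorted table, n = 50
    print(table_search(t, 40))    # found early: few iterations
    print(table_search(t, 999))   # worst case: key larger than every element -> n iterations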
Comparing Efficiency of Algorithms
Functions describing execution time:
– size: natural numbers
– time: nonnegative real numbers
– usually growing/non-decreasing
How quickly does the execution time of your algorithm grow?
A more efficient algorithm: the function grows more slowly.
But constant factors should be ignored: we can install a faster computer.
We need a formal basis for the comparison.

Motivation for Order Notation
Motivation:
+ comparing growth rates of functions → classes of functions
+ estimating efficiency of algorithms by reference to simple functions
+ abstracting from constant factors

Order Notation: Formal Definition
f, g: functions from natural numbers to positive real numbers.
f is (in) O(g) iff there exist c > 0, n₀ ≥ 0 such that for all n ≥ n₀: f(n) ≤ c·g(n).
Intuition: Apart from constant factors, f grows at most as quickly as g.
[Figure: plots of f(n) and c·g(n); from n₀ on, f(n) stays below c·g(n).]
f is (in) Ω(g) iff there exist c > 0, n₀ ≥ 0 such that for all n ≥ n₀: f(n) ≥ c·g(n).
Intuition: Apart from constant factors, f grows at least as quickly as g.
Ω is the converse of O, i.e. f is in Ω(g) iff g is in O(f).
f is (in) Θ(g) iff f(n) ∈ O(g(n)) and g(n) ∈ O(f(n)).
Intuition: Apart from constant factors, f grows exactly as quickly as g.

Dominance relation
f dominates (grows more quickly than) g, written g ∈ o(f), iff f(n)/g(n) → ∞ for n → ∞,
that is, for any constant c > 0 there is some n₀ such that f(n) > c·g(n) for all n ≥ n₀.
(Ex.: f(n) = n² dominates g(n) = 7n.)
[Figure: f(n) eventually exceeds 2·g(n), 3·g(n), 4·g(n), ... from some n₀ on.]
If g ∈ o(f) then g ∈ O(f), but not vice versa.

Exercises
Prove or disprove, using the definitions:
– n + 1 ∈ O(2n)?
– n + 1 ∈ O(n²)?
Relate f and g using O, Ω and Θ for pairs of functions such as n², n³, 3n and n⁵.

Comparing functions, more ...
To check f ∈ O(g), Ω(g), Θ(g) one can use the definitions of O, Ω, Θ — which may be a bit complicated — or:
If l = lim_{n→∞} f(n)/g(n) exists (0 ≤ l ≤ ∞), then
– f ∈ O(g) iff l < ∞
– f ∈ Ω(g) iff l > 0
– f ∈ Θ(g) iff 0 < l < ∞

Some properties of O
Some simple facts:
– If f ∈ O(g) and g ∈ O(h) then f ∈ O(h) (transitivity).
– If f ∈ O(g) then also f + g ∈ O(g): the growth rate depends only on the fastest growing components.
– If f ∈ O(g) and g ∈ O(f) then f ∈ Θ(g) and g ∈ Θ(f).
– If there are d > 0, n₀ ≥ 0 such that f(n) ≥ d for all n ≥ n₀, then k·f + c ∈ O(f) for any constants k, c ≥ 0: the growth rate does not depend on constant coefficients.

Comparing Growth Rates of Well-known Functions
– nᵅ ∈ O(nᵝ) iff α ≤ β, and nᵅ ∈ o(nᵝ) iff α < β (for α, β ≥ 0): the growth rate of a power function is determined by the value of the power.
– log_b n ∈ o(nᵅ) for any b, α > 0: power functions grow more quickly than logarithms.
– nᵅ ∈ o(cⁿ) for any α > 0, c > 1: exponential functions grow more quickly than power functions.
– cⁿ ∈ O(dⁿ) iff c ≤ d, and cⁿ ∈ o(dⁿ) iff c < d.
– log_a n ∈ O(log_b n) for any a and b: the growth rate of logarithms of various bases is equal.
– Any constant function f(n) = c is in O(1): there is no difference in growth rate between constant functions.
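To make the limit criterion above concrete, here is a hedged numerical sketch (my own illustration, not from the slides): it evaluates f(n)/g(n) for increasing n to suggest which of O, Ω, Θ might hold. It is only a heuristic for forming a hypothesis; a proof still needs the definitions or a genuine limit argument.

# Sketch: evaluate f(n)/g(n) for growing n to guess the relation between f and g.
# A ratio tending to 0 suggests f in o(g) (hence O(g)); a ratio tending to infinity
# suggests g in o(f) (hence f in Omega(g)); a bounded positive ratio suggests Theta.
import math

def ratio_table(f, g, ns=(10, 100, 1000, 10_000, 100_000)):
    return [(n, f(n) / g(n)) for n in ns]

if __name__ == "__main__":
    # log2(n) vs n: the ratio tends to 0, so log2(n) is in o(n), hence in O(n).
    print(ratio_table(lambda n: math.log2(n), lambda n: n))
    # 3n^2 + 7n vs n^2: the ratio tends to 3, so 3n^2 + 7n is in Theta(n^2).
    print(ratio_table(lambda n: 3 * n * n + 7 * n, lambda n: n * n))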
Estimating execution time for iterative programs
– Elementary operation: takes, or can be bounded by, a constant time.
– Sequence of operations: takes the sum of the times of its components.
– Loop (for ... and while ...): the time of the body multiplied by the number of repetitions (in the worst case).
– Conditional statement (if ... then ... else ...): the time for evaluating and checking the condition plus the maximum of the times for the then and else parts.

Example: Independent Nested Loops
Matrix-vector product (here, for a quadratic matrix):
given: vector x and matrix A with n rows and n columns; compute: vector y = A·x, that is,
yᵢ = Σ_{j=1}^{n} aᵢⱼ·xⱼ for i = 1, ..., n.

procedure matvec(table real x[1..n], A[1..n, 1..n], y[1..n])
(1) for i from 1 to n do
(2)   y[i] ← 0.0
(3)   for j from 1 to n do
(4)     y[i] ← y[i] + A[i, j]·x[j]
return y

Time: n·(t1 + t2) + n²·(t3 + t4), that is, Θ(n²).

Example: Dependent Nested Loops
Prefix-sums:
given: vector x; compute: "prefix-sums" vector y with yᵢ = Σ_{j=1}^{i} xⱼ.
A straightforward algorithm follows directly from the definition:

procedure prefixsum(table integer x[1..n], y[1..n])
(1) for i from 1 to n do
(2)   y[i] ← 0
(3)   for j from 1 to i do
(4)     y[i] ← y[i] + x[j]
return y

Total time: t(n) = n·(t1 + t2) + ½·n·(n+1)·(t3 + t4), that is, Θ(n²).
Remark: There exists a better, linear-time algorithm! (See the sketch further below.)

Example: Complexity Analysis of Binary Search

function BinSearch(table key T[0..n−1], key K): integer
(0) if n = 0 then return −1
(1) l ← 0; u ← n−1
(2) while l < u do
(3)   mid ← ⌊(l + u)/2⌋
(4)   if K = T[mid] then return mid
(5)   if K < T[mid] then u ← mid − 1 else l ← mid + 1
(6) if K = T[l] then return l else return −1

Worst case time: t0 + t1 + maxit·(t2 + t3 + t4 + t5) + t6,
where maxit = the maximal number of iterations of the while loop.

Example continued
How to compute maxit for n?

n         1  2  3  4  5  6  ...
maxit(n)  0  1  1  2  2  2  ...

maxit(1) = 0 and maxit(n) = maxit(⌊n/2⌋) + 1, hence maxit(n) = ⌊log₂ n⌋.

Example: comparing time complexity of search algorithms
tTableSearch(n) ≤ k1·n + c1, hence:
– tTableSearch(n) ∈ O(n) (Why?)
– tTableSearch(n) ∈ Ω(n) (Why?)
– tTableSearch(n) ∈ Θ(n)
tBinSearch(n) ≤ k2·log₂ n + c2, hence:
– tBinSearch(n) ∈ O(log n) (Why?)
– tBinSearch(n) ∈ O(n) (Why?)
– tBinSearch(n) ∈ Θ(log n)
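The remark above claims a linear-time prefix-sum algorithm exists. Here is a hedged Python sketch (not taken from the slides, and using 0-based indexing) of both versions: the straightforward Θ(n²) algorithm from the definition, and a Θ(n) variant that extends the previous prefix sum by a single addition.

# Sketch: two ways to compute prefix sums y[i] = x[0] + ... + x[i].
def prefix_sums_quadratic(x):
    """Straightforward algorithm from the definition: Theta(n^2) additions."""
    y = []
    for i in range(len(x)):
        s = 0
        for j in range(i + 1):
            s += x[j]
        y.append(s)
    return y

def prefix_sums_linear(x):
    """Each prefix sum extends the previous one by one addition: Theta(n)."""
    y = []
    running = 0
    for value in x:
        running += value
        y.append(running)
    return y

if __name__ == "__main__":
    data = [3, 1, 4, 1, 5, 9]
    assert prefix_sums_quadratic(data) == prefix_sums_linear(data) == [3, 4, 8, 9, 14, 23]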
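As a second sketch (again my own translation, not the original code), here is the binary search from the slide in Python with a counter for the while-loop iterations. For a key larger than every table element, the printed counts can be compared with the maxit(n) = ⌊log₂ n⌋ value derived above; the tuple return value is an addition for illustration.

import math

# Sketch: binary search in a sorted list (0-based), counting while-loop iterations.
def bin_search(table, key):
    if not table:
        return -1, 0
    lo, hi = 0, len(table) - 1
    iterations = 0
    while lo < hi:
        iterations += 1
        mid = (lo + hi) // 2
        if key == table[mid]:
            return mid, iterations
        if key < table[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return (lo if table[lo] == key else -1), iterations

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 8, 100, 1000):
        t = list(range(n))
        _, iters = bin_search(t, n + 1)              # key larger than all elements: worst case
        print(n, iters, math.floor(math.log2(n)))    # compare the count with floor(log2 n)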
Analysis of Recursive Programs (1)

function fact(integer n): integer
  if n = 0 then return 1
  else return n · fact(n−1)

Execution time:
– time for the comparison: tc
– time for the multiplication: tm
– time for call and return: neglected
The total execution time T is defined by a recurrence relation:
  T(0) = tc
  T(n) = T(n−1) + tc + tm, if n ≥ 1
Hence, for n ≥ 0:
  T(n) = (tc + tm) + (tc + tm) + ... + (tc + tm) + tc    (n times tc + tm)
       = n·(tc + tm) + tc ∈ Θ(n)

Analysis of Recursive Programs (2)
– Characterize the execution time by a recurrence relation.
– Find a solution (closed form, non-recursive) of the recurrence relation.
If it is not listed in a textbook, you may:
1. Unroll the recurrence relation a few times to get a hypothesis for a possible solution T(n).
2. Prove the hypothesis for T(n) by induction. If that fails, modify the hypothesis and try again ...

Towers of Hanoi

procedure Hanoi(integer n, char X, Y, Z)
  – move the n topmost slices from tower X to tower Z, using Y as temporary
  if n = 1 then output("move X to Z")
  else
    Hanoi(n−1, X, Z, Y)
    output("move X to Z")
    Hanoi(n−1, Y, X, Z)
  return

T(1) = to
T(n) = 2·T(n−1) + to
     = 4·T(n−2) + 3·to
     = 8·T(n−3) + 7·to
     = ...
     = (2ⁿ − 1)·to ∈ Θ(2ⁿ)

Average Case Analysis (1)
Reconsider TableSearch(): sequential search through a table.

function TableSearch(table key T[0..n−1], key K): integer
  for i from 0 to n−1 do
    if T[i] = K then return i
  return −1

Input argument: one of the table elements; assume it is chosen with equal probability for all elements.
The total time for linear search of all elements is (1 + 2 + 3 + ... + n)·tc = ½·n·(n+1)·tc.
Expected search time: (1/n)·(1 + 2 + 3 + ... + n)·tc = ½·(n+1)·tc.

Average Case Analysis (2)
– We have to know the probability distribution for the input data.
– Gives no information about the worst cases.
– Often difficult to analyze.

Amortized Analysis
– Done for sequences of operations and input data.
– Guaranteed!

Lab 1: Experimental evaluation of time complexity
A code may be too complex for complexity analysis, or unknown (e.g. a commercial procedure).
Lab 1 example: Given: a sorted table T[0..n−1]. Input: a permutation e1, ..., en of all elements of T.
– generate randomly several input data of a fixed size
– run the program on each of them, measure the time, take the average
– repeat on data of various sizes
– plot the results
– compare with plots of known functions
(A sketch of this measurement procedure follows below.)

Lab 1: A few questions
– What kind of time complexity is estimated?
– What about estimation of time complexity of another kind?
– Is it possible to estimate complexity?
– Is it necessary to generate many inputs of the same size?
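The Lab 1 procedure can be sketched in a few lines of Python. This is a hedged illustration, not the actual lab skeleton: the measured routine (a quadratic prefix-sum placeholder here) and the chosen sizes are assumptions standing in for the lab's own task on the sorted table and its permuted input.

import random
import time

# Sketch of the Lab 1 measurement loop: random inputs of a fixed size, average the
# running time, repeat for several sizes, then compare against known growth functions.

def workload(data):
    """Placeholder for the procedure under test (here: all prefix sums, quadratic)."""
    return [sum(data[:i + 1]) for i in range(len(data))]

def measure(n, repetitions=5):
    total = 0.0
    for _ in range(repetitions):
        data = random.sample(range(10 * n), n)    # random input of size n
        start = time.perf_counter()
        workload(data)
        total += time.perf_counter() - start
    return total / repetitions                    # average time for size n

if __name__ == "__main__":
    for n in (200, 400, 800, 1600):
        print(n, measure(n))   # plot these points and compare with n, n*log n, n^2, ...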
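Finally, returning to the "unroll, hypothesize, prove" recipe and the Towers of Hanoi recurrence above: a small sketch (my own check, not from the slides) can test the hypothesis T(n) = (2ⁿ − 1)·to numerically before one attempts the induction proof.

# Sketch: compare the Hanoi recurrence T(1) = t0, T(n) = 2*T(n-1) + t0
# with the hypothesised closed form T(n) = (2**n - 1) * t0.

def t_recurrence(n, t0=1):
    return t0 if n == 1 else 2 * t_recurrence(n - 1, t0) + t0

def t_closed_form(n, t0=1):
    return (2 ** n - 1) * t0

if __name__ == "__main__":
    assert all(t_recurrence(n) == t_closed_form(n) for n in range(1, 20))
    print("hypothesis holds for n = 1..19")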