Discrete Maths 242-213, Semester 2, 2015-2016

10. Running Time of Programs

• Objective
  to describe the Big-Oh notation for estimating the running time of programs

Overview
1. Running Time
2. Big-Oh and Approximate Running Time
3. Big-Oh for Programs
4. Analyzing Function Calls
5. Analyzing Recursive Functions
6. Further Information

1. Running Time

• What is the running time of this program?

    void main()
    {  int i, n;
       scanf("%d", &n);
       for(i=0; i<n; i++)
           printf("%d\n", i);
    }

Counting Instructions

Assume 1 instruction takes 1 ms.

    n value      no. of loops   Time (ms)
    1            1              3
    1,000        1,000          3,000
    1,000,000    1,000,000      3,000,000

• There is no single answer!
  the running time depends on the size of the n value
• Instead of a time answer in seconds, we want a time answer which is related to the size of the input.

• For example:
      programTime(n) = constant * n
  this means that as n gets bigger, so does the program time:
  the running time is linearly related to the input size.

  [Graph: running time (y-axis) against size of n (x-axis); the line constant * n rises linearly.]

A simple way of writing the running time in the table above is:

    T(n) = 3n

Running Time Theory

• A program/algorithm has a running time T(n)
  n is some measure of the input size
• T(n) is the largest amount of time the program takes on any input of size n
• Time units are left unspecified.

1.1. Kinds of Running Time

Worst-case: (this is what we usually use)
• T(n) = maximum time of the algorithm on any input of size n.
  one possible value

Average-case: (we sometimes use this)
• T(n) = expected time of the algorithm over all inputs of size n.
  this approach needs info about the statistical distribution (probability) of the inputs
  e.g. a uniform spread of data (i.e. all data is equally likely)

Best-case: (don't use this, it's misleading)
• e.g. write a slow algorithm that works fast on specially selected input.

1.2. T(n) Example

Loop fragment for finding the product of all the positive numbers in the A[] array of size n:

    (2)  int prod = 1;
    (3)  for(j = 0; j < n; j++)
    (4)      if (A[j] > 0)
    (5)          prod = prod * A[j];

Count each assignment and test as 1 "time unit". What about counting the loop?

Convert 'for' to 'while'

The while-loop is easier to count (and equivalent to the for-loop):

    int prod = 1;             // 1
    int j = 0;                // 1
    while (j < n) {           // 1 for the test
        if (A[j] > 0)         // 1
            prod = prod*A[j]; // 1 + 1
        j++;                  // 1
    }

We assume that 1 instruction takes 1 "time unit".

Calculation

• The for loop executes n times
  each loop carries out (in the worst case) 5 ops
  • test of j < n, if test, multiply, assign, j increment
  total loop time = 5n
• plus 3 ops at start and end
  • small assign (line 2), init of j (line 3), final j < n test
• Total time T(n) = 5n + 3
  the running time is linear with the size of the array

1.3. Comparing Different T()'s

[Graph: T(n) value (0-20000) against input size n (0-100), plotting Ta(n) = 100n and Tb(n) = 2n²; the curves cross at n = 50.]

• If the input size is < 50, program B is faster.
• But for large n's, which are more common in real code, program B gets worse and worse.

1.4. Common Growth Formulae & Names

    Formula (n = input size)   Name
    n                          linear
    n²                         quadratic
    n³                         cubic
    nᵐ                         polynomial, e.g. n¹⁰
    mⁿ (m >= 2)                exponential, e.g. 5ⁿ
    n!                         factorial
    1                          constant
    log n                      logarithmic
    n log n                    n log n ("linearithmic")
    log log n                  doubly logarithmic
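Before looking at actual times in section 1.5, a short program can tabulate a few of these formulae so the differences in growth are easy to see. This is an illustrative sketch of my own, not part of the original slides; the names and chosen sizes are arbitrary.

    #include <stdio.h>
    #include <math.h>

    /* Tabulate some common growth formulae for a few input sizes n.
       log2() and pow() come from <math.h>; link with -lm if needed. */
    int main(void)
    {
        int sizes[] = {3, 9, 50, 100, 1000};
        int num = sizeof(sizes) / sizeof(sizes[0]);

        printf("%8s %10s %12s %14s %16s\n", "n", "log2 n", "n^2", "n^3", "2^n");
        for (int i = 0; i < num; i++) {
            double n = sizes[i];
            printf("%8.0f %10.1f %12.0f %14.0f %16.3e\n",
                   n, log2(n), n * n, n * n * n, pow(2.0, n));
        }
        return 0;
    }

Even at n = 1000, the 2ⁿ column has already reached about 10³⁰¹, which is the point of the next section.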
1.5. Execution Times

Assume 1 instruction takes 1 microsec (10⁻⁶ secs) to execute. How long will n instructions take?

    growth     n = 3    n = 9     n = 50    n = 100      n = 1000      n = 10⁶
    formula
    n          3 µs     9 µs      50 µs     100 µs       1 ms          1 sec
    n²         9 µs     81 µs     2.5 ms    10 ms        1 sec         12 days
    n³         27 µs    729 µs    125 ms    1 sec        16.7 min      31,710 yr
    2ⁿ         8 µs     512 µs    36 yr     4*10¹⁶ yr    3*10²⁸⁷ yr    3*10³⁰¹⁰¹⁶ yr
    log n      2 µs     3 µs      6 µs      7 µs         10 µs         20 µs

If n is 50, a 2ⁿ algorithm will make you wait 36 years for an answer!

Notes

• Logarithmic running times are best.
• Polynomial running times are acceptable, if the power isn't too big
  e.g. n² is ok, n¹⁰⁰ is terrible
• Exponential times mean sloooooooow code.
  some size problems may take longer to finish than the lifetime of the universe!

1.6. Why use T(n)?

• T() can guide our choice of which algorithm to implement, or program to use
  e.g. selection sort or merge sort?
• T() helps us look for better algorithms in our own code, without expensive implementation, testing, and measurement.

T() is too Machine Dependent

We want T() to be the same for an algorithm independent of the machine where it is running. This is not true, since different machines (and OSes) execute instructions at different speeds.

Consider the loop example (section 1.2) on machine A:
  every instruction takes 1 "time unit"
  the result is TA(n) = 5n + 3

On machine B, every instruction takes 1 "time unit" except for multiplication, which takes 5 "time units":
• The for loop executes n times
  each loop carries out (in the worst case) 5 ops, costing 9 time units
  • test of j < n, if test, multiply (5 units), assign, j increment
  total loop time = 9n
• plus 3 ops at start and end
  • small assign (line 2), init of j (line 3), final j < n test
• Total time TB(n) = 9n + 3
  the running time is linear with the size of the array

TA() = 5n + 3 and TB() = 9n + 3. These are both linear equations (which is good), but the constants are different (which is bad). We want a T() notation that is independent of machines.

2. Big-Oh and Running Time

• Big-Oh notation for T(n) ignores constant factors which depend on compiler/machine behaviour
  that's good
• Big-Oh simplifies the process of estimating the running time of programs
  we don't need to count every code line
  that's also good

• The Big-Oh value specifies running time independent of:
  machine architecture
  • e.g. don't consider the running speed of individual machine operations
  machine load (usage)
  • e.g. time delays due to other users
  compiler design effects
  • e.g. gcc versus Visual C

Example

When we counted instructions for the loop example, we found:
    TA() = 5n + 3
    TB() = 9n + 3

The Big-Oh equation, O(), is based on the T(n) equation but ignores constants (which vary from machine to machine). This means that for both machine A and B:
    T(n) is O(n)
we say "T(n) is order n" or "T(n) is about n"

More Examples

    T(n) value           Big-Oh value: O()
    10n² + 50n + 100     O(n²)
    (n+1)²               O(n²)
    n¹⁰                  O(2ⁿ)    (hard to understand)
    5n³ + 1              O(n³)

• These simplifications have a mathematical reason, which is explained in section 2.2.

2.1. Is Big-Oh Useful?

• O() ignores constant factors, which means it is a more general measure of running time for algorithms across different platforms/compilers.
• It can be compared with Big-Oh values for other algorithms.
  i.e. linear is better than polynomial and exponential, but worse than logarithmic:
      O(log n) < O(n) < O(n²) < O(2ⁿ)

2.2. Definition of Big-Oh

T(n) is O( g(n) ) means:
    g(n) is the most important thing in T() when n is large

Example 1: T(n) = 5n + 3
    write as: T(n) is O(n)    // the g() function is n
Example 2: T(n) = 9n + 3
    write as: T(n) is O(n)    // the g() function is n

More Formally

We write T(n) is O(g(n)) if there exist constants c > 0, n0 > 0 such that
    0 <= T(n) <= c*g(n)  for all n >= n0.
n0 and c are called witnesses to "T(n) is O(g(n))".
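As a quick sanity check of this definition (my own illustration, not from the slides): for machine A's T(n) = 5n + 3, the witnesses c = 8 and n0 = 1 work, since 5n + 3 <= 8n whenever n >= 1. A small program can test the inequality over a range of n:

    #include <stdio.h>

    /* Check the witness inequality T(n) <= c*g(n) for n0 <= n <= limit.
       Here T(n) = 5n + 3 and g(n) = n, with witnesses c = 8 and n0 = 1
       -- my own choice of witnesses, not from the slides. */
    int main(void)
    {
        int c = 8, n0 = 1, limit = 1000000;
        int ok = 1;

        for (long n = n0; n <= limit; n++)
            if (5*n + 3 > c*n) {   /* inequality fails: not witnesses */
                ok = 0;
                break;
            }
        printf("witnesses c=8, n0=1 %s up to n=%d\n",
               ok ? "hold" : "fail", limit);
        return 0;
    }

Of course a loop can never prove "for all n"; the algebra in the examples below does that. The program just makes the definition concrete.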
O-notation as a Graph

O-notation gives an upper bound for a function, to within a constant factor. We write T(n) is O(g(n)) if there are positive constants n0 and c such that, at and to the right of n0, the value of T(n) always lies on or below c*g(n).

[Graph: c*g(n) lies above T(n) everywhere to the right of n0.]

Example 1

• T(n) = 10n² + 50n + 100
  the n² part is the most important thing in the T() function
  so T(n) is O(n²)
• Why? Witnesses: n0 = 1, c = 160
  then T(n) <= c*g(n), n >= 1
  so 10n² + 50n + 100 <= 160n²
  since 10n² + 50n + 100 <= 10n² + 50n² + 100n² <= 160n²

T() and O() Graphed

[Graph (plotted with http://dlippman.imathas.com/graphcalc/graphcalc.html): c*g(n) == 160n² lies above T(n) = 10n² + 50n + 100 for all n >= n0 == 1.]

Example 2

• T(n) = (n+1)²
  so T(n) is O(n²)
• Why? Witnesses: n0 = 1, c = 4
  then T(n) <= c*g(n), n >= 1
  so (n+1)² <= 4n²
  since n² + 2n + 1 <= n² + 2n² + n² <= 4n²

T() and O() Graphed

[Graph: c*g(n) == 4n² lies above T(n) = (n+1)² for all n >= n0 == 1.]

Example 3

• T(n) = n¹⁰
  so T(n) is O(2ⁿ)
• Why? Witnesses: n0 = 64, c = 1
  then T(n) <= c*g(n), n >= 64
  so n¹⁰ <= 2ⁿ
  since 10*log₂ n <= n (by taking log₂s)
  which is true when n >= 64 (10*log₂ 64 == 10*6; 60 <= 64)

n¹⁰ and 2ⁿ Graphed

[Graph: c*g(n) == 2ⁿ lies above T(n) = n¹⁰; the curves cross at about (58.770, 4.915E17).]

3. Big-Oh for Programs

• First decide on a size measure for the data in the program. This will become the n.

    Data Type    Possible Size Measure
    integer      its value
    string       its length
    array        its length

3.1. Building a Big-Oh Result

• The Big-Oh value for a program is built up inductively by:
  1) Calculate the Big-Oh's for all the simple statements in the program
     • e.g. assignment, arithmetic
  2) Then use those values to obtain the Big-Oh's for the complex statements
     • e.g. blocks, for loops, if-statements

Simple Statements (in C)

• We assume that simple statements always take a constant amount of time to execute
  written as O(1)
  this is not a time unit (not 1 ms, not 1 microsec)
  O(1) means a running time independent of the input size n
• Kinds of simple statements:
  assignment, break, continue, return, all library functions (e.g. putchar(), scanf()), arithmetic, boolean tests, array indexing

Complex Statements

• The Big-Oh value for a complex statement is a combination of the Big-Oh values of its component simple statements.
• Kinds of complex statements:
  blocks { ... }
  conditionals: if-then-else, switch
  loops: for, while, do-while

3.2. Structure Trees

• The easiest way to see how complex statement timings are based on simple statements (and other complex statements) is by drawing a structure tree for the program.

Example: binary conversion

         void main()
         {  int i;
    (1)     scanf("%d", &i);
    (2)     while (i > 0) {
    (3)        putchar('0' + i%2);
    (4)        i = i/2;
            }
    (5)     putchar('\n');
         }

Structure Tree for the Example

    block 1-5
      1
      while 2-4
        block 3-4    (the time for this is the time for (3) + (4))
          3
          4
      5

3.3. Details for Complex Statements

• Blocks: Running time bound = summation of the bounds of its parts.
  "summation" means 'add'
• The summation rule means that only the largest Big-Oh value is considered.

Block Calculation Graphically

[Diagram: a block of statements costing O(f1(n)), O(f2(n)), ..., O(fk(n)); by the summation rule the block is O( f1(n) + f2(n) + ... + fk(n) ), in other words O( largest fi(n) ).]

The cost of a block is the cost of the biggest statement.

Block Summation Rule Example

• First block's time T1(n) = O(n²)
• Second block's time T2(n) = O(n)
• Total running time = O(n² + n) = O(n²)
  the largest part
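As a concrete illustration of the summation rule (a sketch of my own, not from the slides), here is a function with two blocks in sequence: the first is O(n²), the second O(n), so the whole function is O(n² + n) = O(n²).

    #define N 100

    /* Two blocks in sequence: the first is O(n^2), the second O(n),
       so by the summation rule the whole function is O(n^2). */
    void initTables(int A[N][N], int B[N], int n)
    {
        int i, j;

        for (i = 0; i < n; i++)       /* first block: O(n^2) */
            for (j = 0; j < n; j++)
                A[i][j] = 0;

        for (i = 0; i < n; i++)       /* second block: O(n) */
            B[i] = 1;
    }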
Conditionals

e.g. if statements, switches

• Conditionals: Running time bound = the cost of the if-test + the larger of the bounds for the if- and else- parts
• When the if-test is a simple statement (a boolean test), it is O(1).

Conditional Graphically

[Diagram: an O(1) test selecting between an if-part costing O(f1(n)) and an else-part costing O(f2(n)); altogether this is O( max(f1(n), f2(n)) + 1 ), which is the same as O( max(f1(n), f2(n)) ).]

The cost of an if-statement is the cost of the biggest branch.

If Example

• Code fragment:

    if (x < y)      // O(1)
        foo(x);     // O(n)
    else
        bar(y);     // O(n²)

• Total running time = O( max(n, n²) + 1 )
                     = O(n² + 1)
                     = O(n²)

Loops

• Loops: Running time bound is usually
  = the max. number of times round the loop * the time to execute the loop body once
• But we must include O(1) for the increment and test each time around the loop.
• Must also include the initialization and final test costs (both O(1)).

While Graphically

[Diagram: an O(1) test followed by a body costing O(f(n)), executed at most g(n) times. Altogether this is O( g(n)*(f(n)+1) + 1 ), which can be simplified to O( g(n)*f(n) ).]

The cost of a loop is the cost of the body * the number of loops.

While Loop Example

• Code fragment:

    x = 0;
    while (x < n) {    // O(1) for test
        foo(x, n);     // O(n²)
        x++;           // O(1)
    }

• Total running time of the loop, with g(n) = n loops and body cost f(n) = 1 + n² + 1:
    = O( n*(1 + n² + 1) + 1 )
    = O( n³ + 2n + 1 )
    = O(n³)

For-loop Graphically

[Diagram: an O(1) initialization, then an O(1) test, a body costing O(f(n)), and an O(1) increment, executed at most g(n) times. Altogether this is O( g(n)*(f(n)+1+1) + 1 ), which can be simplified to O( g(n)*f(n) ).]

The cost of a loop is the cost of the body * the number of loops.

For Loop Example

• Code fragment:

    for (i=0; i < n; i++)
        foo(i, n);       // O(n²)

• It helps to rewrite this as a while loop:

    i = 0;               // O(1)
    while (i < n) {      // O(1) for test
        foo(i, n);       // O(n²)
        i++;             // O(1)
    }

• Running time for the for loop, with g(n) = n loops and body cost f(n) = 1 + n² + 1:
    = O( 1 + n*(1 + n² + 1) + 1 )
    = O( 2 + n³ + 2n )
    = O(n³)

3.4.1. Example: nested loops

    (1)  for(i=0; i < n; i++)
    (2)      for (j = 0; j < n; j++)
    (3)          A[i][j] = 0;

• line (3) is a simple op: it takes O(1)
• line (2) is a loop carried out n times
  it takes O(n * 1) = O(n)
• line (1) is a loop carried out n times
  it takes O(n * n) = O(n²)

3.4.2. Example: if statement

    (1)  if (A[0][0] == 0) {
    (2)      for(i=0; i < n; i++)
    (3)          for (j = 0; j < n; j++)
    (4)              A[i][j] = 0;
         }
    (5)  else {
    (6)      for (i=0; i < n; i++)
    (7)          A[i][i] = 1;
         }

• The if-test takes O(1); the if block takes O(n²); the else block takes O(n).
• Total running time:
    = O(1) + O( max(n², n) )
    = O(1) + O(n²)
    = O(n²)     // using the summation rule

3.4.3. Time for a Binary Conversion

         void main()
         {  int i;
    (1)     scanf("%d", &i);
    (2)     while (i > 0) {
    (3)        putchar('0' + i%2);
    (4)        i = i/2;
            }
    (5)     putchar('\n');
         }

• Lines 1, 2, 3, 4, 5: each O(1)
• Block of 3-4 is O(1) + O(1) = O(1)    why?
• While of 2-4 loops at most (log₂ i)+1 times
  total running time = O( 1 * ((log₂ i)+1) ) = O(log₂ i)
• Block of 1-5:
    = O(1) + O(log₂ i) + O(1)
    = O(log₂ i)

Why (log₂ i)+1 ?

• Assume i = 2ᵏ
• Start of the 1st iteration:    i = 2ᵏ
  Start of the 2nd iteration:    i = 2ᵏ⁻¹
  Start of the 3rd iteration:    i = 2ᵏ⁻²
  Start of the kth iteration:    i = 2ᵏ⁻⁽ᵏ⁻¹⁾ = 2¹ = 2
  Start of the (k+1)th iteration: i = 2ᵏ⁻ᵏ = 2⁰ = 1
  the while will terminate after this iteration
• Since 2ᵏ = i, k = log₂ i
• So k+1, the no. of iterations, = (log₂ i)+1

Using a Structure Tree

    block 1-5             O(log₂ i)
      1                   O(1)
      while 2-4           O(log₂ i)
        block 3-4         O(1)
          3               O(1)
          4               O(1)
      5                   O(1)
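The (log₂ i)+1 iteration count can also be checked empirically. The sketch below (my own, not from the slides) runs the same division loop for a few values of i and compares the count with floor(log₂ i) + 1, which covers values that are not exact powers of 2 as well:

    #include <stdio.h>
    #include <math.h>

    /* Count iterations of the binary-conversion loop and compare with
       floor(log2(i)) + 1.  Illustrative sketch only. */
    int main(void)
    {
        int values[] = {1, 2, 7, 8, 1000, 1000000};

        for (int v = 0; v < 6; v++) {
            int i = values[v], count = 0;
            while (i > 0) {    /* same loop shape as the slide's example */
                i = i / 2;
                count++;
            }
            printf("i = %7d: %2d iterations, (log2 i)+1 = %2d\n",
                   values[v], count, (int)floor(log2(values[v])) + 1);
        }
        return 0;
    }

For example, i = 1000 gives 10 iterations, and floor(log₂ 1000) + 1 = 9 + 1 = 10.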
3.4.4. Time for a Selection Sort

    void selectionSort(int A[], int n)
    {  int i, j, small, temp;
    (1)  for (i=0; i < n-1; i++) {       // outer loop
    (2)      small = i;
    (3)      for( j= i+1; j < n; j++)    // inner loop
    (4)          if (A[j] < A[small])
    (5)              small = j;
    (6)      temp = A[small];            // exchange
    (7)      A[small] = A[i];
    (8)      A[i] = temp;
         }
    }

Selection Sort Structure Tree

    for 1-8
      block 2-8
        2
        for 3-5
          if 4-5
            5        (the if part; there is no else part)
        6
        7
        8

• Lines 2, 5, 6, 7, 8: each is O(1)
• If of 4-5 is O( max(1,0) + 1 ) = O(1)
• For of 3-5 is O( (n-(i+1)) * 1 ) = O(n-i-1) = O(n), simplified
• Block of 2-8 = O(1) + O(n) + O(1) + O(1) + O(1) = O(n)
• For of 1-8 is:
    = O( (n-1) * n )
    = O(n² - n)
    = O(n²), simplified

4. Analyzing Function Calls

• In this section, we assume that the functions are not recursive
  we add recursion in section 5
• Size measures for all the functions must be similar, so they can be combined to give the program's Big-Oh value.

Example Program

    #include <stdio.h>
    int bar(int x, int n);
    int foo(int x, int n);

         void main()
         {  int a, n;
    (1)     scanf("%d", &n);
    (2)     a = foo(0, n);
    (3)     printf("%d\n", bar(a,n));
         }

    int bar(int x, int n)
    {  int i;
    (4)  for(i = 1; i <= n; i++)
    (5)      x += i;
    (6)  return x;
    }

    int foo(int x, int n)
    {  int i;
    (7)  for(i = 1; i <= n; i++)
    (8)      x += bar(i, n);
    (9)  return x;
    }

Calling Graph

    main --> foo
    main --> bar
    foo  --> bar

Calculating Times with a Calling Graph

• 1. Calculate times for Group 0 functions
     those that call no other user functions
• 2. Calculate times for Group 1 functions
     those that call Group 0 functions only
• 3. Calculate times for Group 2 functions
     those that call Group 0 and Group 1 functions only
• 4. Continue until the time for main() is obtained.

Example Program Analysis

• Group 0: bar() is O(n)
• Group 1: foo() is O(n * n) = O(n²)
  // the loop body calls bar(), which is O(n), n times
• Group 2: main() is
    = O(1) + O(n²) + O(1) + O(n)
    = O(n²)

5. Analyzing Recursive Functions

• Recursive functions call themselves with a "smaller size" argument, and terminate by calling a base case.

    int factorial(int n)
    {
        if (n <= 1)
            return 1;
        else
            return n * factorial(n-1);
    }

Running Time for a Recursive Function

• 1. Develop basis and inductive statements for the running time.
• 2. Solve the corresponding recurrence relation.
     this usually requires the Big-Oh notation to be rewritten as constants and multiples of n
     e.g. O(1) becomes a, O(n) becomes b*n, O(n²) becomes c*n², etc.
• 3. Translate the solved relation back into Big-Oh notation
     rewrite the remaining constants back into Big-Oh form
     e.g. a becomes O(1), b*n becomes O(n)

5.1. Factorial Running Time

• Step 1.
    Basis:      T(1) = O(1)
    Induction:  T(n) = O(1) + T(n-1), for n > 1
• Step 2. Simplify the relation by replacing the O() notation with constants.
    Basis:      T(1) = a
    Induction:  T(n) = b + T(n-1), for n > 1

• The simplest way to solve T(n) is to calculate it for some values of n, and then guess the general expression:
    T(1) = a
    T(2) = b + T(1) = b + a
    T(3) = b + T(2) = 2b + a
    T(4) = b + T(3) = 3b + a
• "Obviously", the general form is:
    T(n) = ((n-1)*b) + a = bn + (a-b)

• Step 3. Translate back: T(n) = bn + (a-b)
• Replace the constants by Big-Oh notation:
    T(n) = O(n) + O(1) = O(n)
• The running time for recursive factorial is O(n). That is fast.

5.2. Recursive Selection Sort

    void rSSort(int A[], int n)
    {  int imax, i;

       if (n == 1)
           return;
       else {
           imax = 0;    /* assume A[0] is the biggest so far */
           for (i=1; i < n; i++)
               if (A[i] > A[imax])
                   imax = i;
           swap(A, n-1, imax);
           rSSort(A, n-1);
       }
    }

Running Time

• n == the size of the array
• Assume swap() is O(1), so it can be ignored; the costs that matter are the loop and the recursive call to rSSort().
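swap() is not defined on the slides; a minimal constant-time version (my assumption about its intended behaviour: exchange the elements at the two given indices) would be:

    /* Exchange A[i] and A[j]: a constant number of assignments, so O(1).
       (The slides do not show swap(); this is a minimal sketch of what
       it presumably does.) */
    void swap(int A[], int i, int j)
    {
        int temp = A[i];
        A[i] = A[j];
        A[j] = temp;
    }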
• Step 1.
    Basis:      T(1) = O(1)
    Induction:  T(n) = O(n-1) + T(n-1), for n > 1
                // O(n-1) means a multiple of n-1 (the loop)
• Step 2.
    Basis:      T(1) = a
    Induction:  T(n) = b(n-1) + T(n-1), for n > 1

• Solve the relation:
    T(1) = a
    T(2) = b + T(1) = b + a
    T(3) = 2b + T(2) = 2b + b + a
    T(4) = 3b + T(3) = 3b + 2b + b + a
• General form:
    T(n) = (n-1)b + ... + b + a
         = a + b * (0 + 1 + ... + (n-1))
         = a + b(n-1)n/2

• Step 3. Translate back: T(n) = a + b(n-1)n/2 = a + b(n² - n)/2
• Replace the constants by Big-Oh notation:
    T(n) = O(1) + O(n²) + O(n) = O(n²)
• The running time for recursive selection sort is O(n²). That is slow for large arrays.

6. Further Information

• Discrete Mathematics and its Applications
  Kenneth H. Rosen
  McGraw Hill, 2007, 7th edition
  chapter 3, sections 3.2 - 3.3