Chapter 3: Efficiency of Algorithms

Quality attributes for algorithms

Correctness:
  It should do things right
  No flaws in the design of the algorithm

Maintainability:
  If we discover it is incorrect, the effort for modifying and/or extending it must be minimized
  Central aspect in real-life programs
  Understandability
  Good structure
  Name things (variables, conditions etc.) based on their roles in the algorithm/program

Elegance
  Search for intuitive and intelligent solutions
  Example: Adding the numbers from 1 to n
    1. Get value of n
    2. Set Sum to 0 and x to 1
    3. While x <= n do
    4.   Add x to Sum
    5.   Add 1 to x
       End of the loop
    6. Print value of Sum
    7. Stop
  But more elegant is the Gauss method (treated earlier):
    1. Get the number n
    2. Set m to n+1
    3. Set n to n multiplied by m
    4. Divide n by 2
    5. Print the result n

Efficiency
  How does the algorithm use resources?
  Resources are: time and memory
  Major computer science concern
  Memory efficiency: How much memory is used in processing compared with the size of the input data
  Time efficiency: Amount of processing work as a function of the size of the input data
  Inherent time (in)efficiency cannot be significantly influenced by the use of more or less efficient machines

The choice of algorithms
  For the same problem, different algorithmic solutions may be possible.
  The algorithm attributes, above all the efficiency criterion, can be used to choose the “best” solution for the respective problem.
  Example of a problem with 3 different solutions:
  A survey of the ages of persons in a city is undertaken in order to get the average age of the population in that city. Persons who don’t wish to disclose their ages write 0 as a fictitious age. Given such a survey, print the average age in the city.

Formally:
Input: A list of ages including 0’s
Idea of the solution, e.g. for the list
[0, 24, 16, 0, 36, 42, 23, 21, 0, 27]:
  Successively eliminate any 0 in the list.
  Get the number r of “real” ages (not fictitious ones, i.e. no 0’s).
  Sum up the real ages and divide by r in order to get the average.

Idea of Solution A:
  Set r to the size of the list (here 10).
  Look at the list from left to right, pointing with two fingers (the right finger one cell to the right of the left finger).
  Whenever a 0 is encountered at the position of the left finger, copy ALL items from the right finger onwards one cell to the left and decrement r.
  If the left finger points to a real age, both fingers are moved to the right.

Example:
  Initially:
    r = 10  [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
  After copying the 9 items (from the right finger onwards):
    r = 9   [24, 16, 0, 36, 42, 23, 21, 0, 27, 27]
  Processing a “real” age pointed to by the left finger (the list is unchanged, both fingers move to the right):
    r = 9   [24, 16, 0, 36, 42, 23, 21, 0, 27, 27]
  Copy operations are needed whenever a 0 is processed.
  Here: 9 + 7 + 3 = 19 copy operations

Algorithm A:
  Get n, the size of the list
  Set r to n, left to 1, right to 2
  While left <= r do
    If left points to a non-zero item then
      Add 1 to left and right
    Else
      Decrease r by 1
      While right <= n do
        Copy item at position right one cell to the left
        Add 1 to right
      End loop
      Set right to left+1
  End loop

Efficiency:
  Copying operations -> time efficiency is rather low
  Only memory for the input is required (+ tiny quantities for variables like n, left, and right) -> memory-efficient

Idea of solution B:
  Provide a new (output) list.
  Read the input from left to right and copy any non-zero item to the output list

Algorithm B:
  Get n, the size of the list
  Set left to 1 and pos to 1
  While left <= n do
    If left points to a non-zero item then
      Copy item to position pos
      Increment left and pos
    Else
      Increase left by 1
  End loop

Example:
  Input list:  [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
  Output list: [24, 16, 36, 42, 23, 21, 27]

Efficiency of algorithm B:
  Every location is examined and non-zero ones are copied
    -> each non-zero element is copied
    -> fewer copies than solution A (more time-efficient than A)
  Extra memory for the output list
    -> less memory-efficient than A

Idea of Solution C
  r is set to the size of the list
  The left finger slides from left to right (beginning at the first position)
  The right finger slides from right to left (beginning at the last position)
  Whenever a 0 is encountered by the left finger, the item at the right finger is copied to that place, and both r and right are decremented
  The process is repeated until “left” is no longer lower than “right”

Example:
  Initially:
    r = 10  [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
  After copying the 27 to the first position and decreasing right and r:
    r = 9   [27, 24, 16, 0, 36, 42, 23, 21, 0, 27]
  Now left increases until it reaches a 0:
    r = 9   [27, 24, 16, 0, 36, 42, 23, 21, 0, 27]
  In turn a copy and a decrease of both r and right (the copied item is itself a 0, so the list looks unchanged and the left finger stays on a 0):
    r = 8   [27, 24, 16, 0, 36, 42, 23, 21, 0, 27]

Algorithm C
  Get n, the size of the list
  Set r to n, left to 1, right to n
  While left < right do
    If item of left is not 0 then
      Increase left by 1
    Else
      Decrement r
      Copy item of right to position of left
      Decrement right
  End loop
  If item of left is 0 then decrement r

Efficiency:
  Fewer copies than B: the number of copies is at most the number of zeros
  No extra memory requirement, unlike B

Comparison in our example:
  A: Too many copies, no extra memory requirement
  B: Fewer copies than A, more memory requirement
  C: Fewer copies than B (few 0’s!),
     no extra memory requirement
  C “seems” to be the best choice!

Measuring Efficiency
  What is really a “good” algorithm for a given problem?
  Algorithms A, B, and C have the same result but do things differently.
  How to judge algorithms objectively?
  Algorithm analysis: time and memory complexity

Sequential search (telephone book)
  1. Get values of N, n, N1, …, Nn, T1, …, Tn
  2. Set I to 1 and mark N as not yet found
  3. Repeat 4 through 7 until either N is found or I > n
  4.   If N = Ni then
  5.     Print phone number Ti
  6.     Mark N as found
       Else
  7.     Increment I
  8. If N is not marked as found then
       Print ‘Sorry, could not find the name’

A good efficiency criterion in this case is counting the number of comparisons.
  Best case: N is in the first place
    -> only 1 comparison
  Worst case: N is in the last place or it does not exist at all
    -> n comparisons
  Average case: N is somewhere in the middle of the list
    -> about n/2 comparisons
  Example:
    NY city’s population is 20 million (n); a computer can do, say, 50000 comparisons/second
    On average: 3.33 minutes to find the phone number (n/2 * 1/50000 seconds)
    Worst case: When the name is not in the book, almost 7 minutes are needed!
  Memory:
    Space for the list is required (n)
    Additional cells for variables like I and n
    -> memory-efficient

Order of Magnitude
  Algorithm efficiency is measured using the size of the input (n) as parameter
  E.g. sequential search:
    worst case: n comparisons
    c units of peripheral work per comparison (e.g.
    incrementing a variable)
    -> worst case is c*n (in this case rather 2*n)
  Difference between n and 2*n:
    n    2*n
    1    2
    2    4
    3    6
    and so on
  [Figure: graph of 2*n as a function of n]
  [Figure: graphs of c*n for c = 1, c = 1/2, and c = 2 in comparison]
  All of them follow the basic straight-line shape

Order of magnitude f(n) (written O(f(n))):
  Any function g(n) that follows the basic shape of f(n)
  Example: Sequential search needs at most n comparisons -> O(n) in both average and worst case

Another problem:
  A city has 4 districts
  Phone calls between districts are kept in a table

         1     2     3     4
    1  333    44   567    35
    2   33    12    47    68
    3   45    89    11    99
    4   56   222   123    59

  Process the information in the table (e.g. print out the content)
  Algorithm:
    For each row 1 to 4 do
      For each column 1 to 4
        Print the entry in this row and column
  Number of print operations: 4 * 4 = 16
  Similarly: Given n districts, the number of print operations would be n * n = n^2
  The work does not grow at the same rate as the input (as it does in sequential search)
  It grows at a rate equal to the square of the input size

An algorithm having c*n^2 amount of work to do is said to be of order of magnitude n^2, or an O(n^2) algorithm.
  In our example c = 1
  [Figure: graphs of c*n^2 for different values of c]
  c*n^2 is different in shape from c*n
  We can ignore the actual value of c
  Example from real life:
    Driving is faster than walking, running, and biking
    Driving has another order of magnitude!
  [Figure: graph of n and n^2 in comparison]
  Computers often solve problems with large problem sizes (n is very high)
  Humans are often able to solve problems of small size; they don’t really need a computer for that.
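
The row-by-row table processing described earlier in this section can be sketched in Python as follows. This is only an illustrative sketch: the function name `print_table` and the variable `calls` are assumptions made for this example and do not come from the slides. The function prints every entry and returns the number of print operations, so the n * n growth can be observed directly.

```python
def print_table(table):
    """Print every entry of a table and return the number of print operations."""
    operations = 0
    for row in table:          # for each row
        for entry in row:      # for each column in this row
            print(entry, end=" ")
            operations += 1    # one print operation per entry
        print()                # line break after each row (not counted)
    return operations


# The 4 x 4 table of phone calls between districts from the example
calls = [
    [333, 44, 567, 35],
    [33, 12, 47, 68],
    [45, 89, 11, 99],
    [56, 222, 123, 59],
]

count = print_table(calls)
print("print operations:", count)
```

For the 4 districts of the example the function returns 16. Doubling the number of districts to 8 would give 64 operations, four times as many, which is the quadratic growth described above.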

Suppose you have 2 solutions for the same problem:
  A: O(n)
  B: O(n^2)
  For large n, the O(n) algorithm will always perform better, even if its constant factor c is high
  Compare: 0.0001*n^2 and 100*n
  In the long run (for n above 1,000,000), 100*n will perform better than 0.0001*n^2
  -> constants do not affect the inherent (in)efficiency of algorithms
  -> constants may play a role for small n only

Important rule when designing algorithms:
  Make no assumptions about the size of the problem!
  If any are made, they should be documented, because any assumption may affect the efficiency tremendously

Machine speed does not help avoid the inefficiency of algorithms!
  Example:
    Mac PowerPC 601, price $2000, speed = 15 million operations/sec
    Cray T3D, price $31 million, speed = 300 billion operations/sec
  Let us give the Mac an O(n) and the Cray an O(n^2) algorithm, respectively
  Result:
    n              Mac, O(n)    Cray, O(n^2)
    15,000         0.001 sec    0.00075 sec
    150,000        0.01 sec     0.075 sec
    15,000,000     1 sec        12.5 min
    150,000,000    10 sec       20.83 hours
  -> The difference in order of magnitude has slowed down the very powerful Cray machine (for large n)
  If the two algorithms were of the same order of magnitude, the Cray would perform better

Sequential search: O(n)
  What if the list is alphabetically ordered?
  Example: [Adam, Catherine, David, John, Nancy, Peter, Thomas]
  Searching for Peter?
    Sequential search: 6 operations needed
  Why not begin in the middle of the list (like searching in a telephone book) and then branch to the right or the left depending on the name to be looked for?
    Step 1: Compare Peter with John -> failure
    Step 2: Branch to the right, and compare Peter with Peter -> success!
  Only two comparison operations are needed (sequential search needed 6)
  In general: O(log(n)), because the length of the list is successively divided by 2.
  Logarithmic search (binary search) is more efficient than sequential search.
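
The "begin in the middle and branch left or right" idea can be written down as a short Python sketch. The slides give no code for it, so the function name `binary_search` and its return format (position plus number of comparison steps) are assumptions chosen for illustration. One look at the middle name is counted as one comparison step, as in the example with Peter.

```python
def binary_search(names, target):
    """Search a sorted list; return (index, steps), with index None if absent."""
    low, high = 0, len(names) - 1
    steps = 0
    while low <= high:
        middle = (low + high) // 2
        steps += 1                      # one look at the middle name
        if names[middle] == target:
            return middle, steps        # found
        if names[middle] < target:
            low = middle + 1            # branch to the right half
        else:
            high = middle - 1           # branch to the left half
    return None, steps                  # list exhausted, name not present


names = ["Adam", "Catherine", "David", "John", "Nancy", "Peter", "Thomas"]
index, steps = binary_search(names, "Peter")
print(index, steps)  # prints: 5 2
```

The search for Peter first looks at John (the middle of the seven names), then at Peter, which matches the two steps counted above. Because each step halves the remaining range, a name that is not in this list is rejected after at most three steps.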

Sorting:
  Direct sort: O(n^2)
  More efficient sort: O(n*log(n))

Difficult Problems: O(2^n)
  Examples:
    Given a network of cities connected by direct paths, find the minimum path that visits every city (traveling salesman problem; the minimum path between just two cities can be found efficiently).
    Search for Hamiltonian cycles in a network of cities.
    Find the minimum number of bins needed to pack n objects.
  These problems are “realistically” not solvable exactly: the known algorithms need many thousands of years even for relatively small problem sizes.
  Remedy: Heuristics (sub-optimal solutions)
    E.g. first fit for the bin-packing problem
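
As an illustration of the first-fit heuristic mentioned for bin packing, here is a minimal Python sketch. The slides do not define the details, so this assumes the usual formulation: every bin has the same capacity, and each object goes into the first bin that still has room, with a new bin opened when none fits. The names `first_fit`, `sizes`, and `capacity` are hypothetical.

```python
def first_fit(sizes, capacity):
    """Pack objects with the first-fit heuristic; return the list of bins."""
    bins = []  # each bin is a list of the object sizes placed in it
    for size in sizes:
        if size > capacity:
            raise ValueError("object is larger than the bin capacity")
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)   # first bin with enough room
                break
        else:
            bins.append([size])         # no bin had room: open a new one
    return bins


packing = first_fit([4, 8, 1, 4, 2, 1], capacity=10)
print(len(packing), packing)  # prints: 2 [[4, 1, 4, 1], [8, 2]]
```

The heuristic is fast but not always optimal. For the sizes 4, 4, 4, 6, 6, 6 with capacity 10 it uses 4 bins, although 3 bins would suffice (each holding one 4 and one 6). This is the sense in which heuristics deliver sub-optimal solutions.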