Analysis of Algorithms

The method of solving a problem is known as an algorithm. An algorithm is a sequence of instructions that acts on some input data to produce some output in a finite number of steps.

An algorithm must have the following properties:
a) Input: it must receive some input data supplied externally.
b) Output: it must produce at least one output as the result.
c) Finiteness: no matter what the input is, the algorithm must terminate after a finite number of steps. For example, a procedure that goes on performing a series of steps forever is not an algorithm.
d) Definiteness: the steps to be performed in the algorithm must be clear and unambiguous.
e) Effectiveness: one must be able to perform the steps in the algorithm without applying any intelligence.

Algorithm categories:
a) Iterative (repetitive) algorithms.
b) Recursive (divide and conquer) algorithms: these break a large problem down into small pieces and then apply the algorithm to each of the small pieces. This can make a recursive algorithm small, straightforward and simple to understand.

Why analyze algorithms:
An algorithm must not only solve the problem, it must also do so in an efficient manner. There may be different ways to reach the solution of a given problem, and the characteristics of each algorithm determine how efficiently it operates. Determining which algorithm is more efficient than another involves the analysis of algorithms.

While analyzing an algorithm, the time required to execute it is determined. This time is not measured in seconds or any other time unit; it is the number of operations carried out while executing the algorithm. Our main concern is the relative efficiency of the different algorithms.

Note: An algorithm cannot be termed better because it takes fewer time units to execute, or worse because it takes more. A worse algorithm may take fewer time units if we move it to a faster computer or use a more efficient language. Hence, while comparing two algorithms, it is assumed that all other things, such as the speed of the computer and the language used, are the same for both.
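As a small illustration of measuring work in operations rather than seconds, the sketch below solves one problem (adding the integers 1 to n) in two different ways and reports how many arithmetic operations each performs. The function names `sum_by_loop` and `sum_by_formula` and the `ops` counter are assumptions made for this example; they do not come from the notes.

```c
#include <assert.h>

/* Adds 1 + 2 + ... + n one term at a time.
   *ops receives the number of additions into total (n of them). */
long sum_by_loop(long n, long *ops)
{
    assert(n >= 0);
    long total = 0;
    *ops = 0;
    for (long i = 1; i <= n; i++) {
        total += i;
        (*ops)++;
    }
    return total;
}

/* Uses the closed form n(n + 1)/2.
   *ops is always 3: one addition, one multiplication, one division. */
long sum_by_formula(long n, long *ops)
{
    assert(n >= 0);
    *ops = 3;
    return n * (n + 1) / 2;
}
```

For n = 100 both functions return 5050, but the loop reports 100 operations and the formula reports 3. The loop's count grows with n while the formula's stays fixed, and this holds regardless of the computer or language used.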
We cannot state one fixed number of operations for an algorithm, because the number of operations varies with the input size for the same algorithm (for example, sorting). The count is therefore expressed as an equation in terms of the input size. Once the equations for two algorithms are formed, we can compare the two algorithms by comparing the rate at which their equations grow. This growth rate is critical, since there are situations where one algorithm needs fewer operations than the other when the input size is small, but many more when the input size gets large.

Why analyze algorithms: see the book Data Structure Through C, pp. 2 to 6.

How to analyze an algorithm:
Eg:
    count = 0
    While there are more characters in the file do
        Increment count by 1
        Get the next character
    End while
    Print count

If there are 500 characters in the file, we need to initialize count once, check the condition 500 + 1 times (the +1 is for the last check, when the file is empty), and increment the counter 500 times. The total number of operations is:
Initialization: 1
Increments: 500
Conditional checks: 500 + 1
Printing: 1

The number of increments and conditional checks is far greater than the number of initialization and printing operations. The number of initialization and printing operations remains the same for any input size. While analyzing this algorithm, the cost of the initialization becomes meaningless as the number of input values increases.

What matters when analyzing an algorithm is to identify the significant operations, such as comparisons or arithmetic.
Comparison operations: equal, not equal, less than, greater than, less than or equal, and greater than or equal.
Arithmetic operations fall into two groups: additive (addition, subtraction, increment, decrement) and multiplicative (multiplication, division, modulus).
Note: multiplicative operations generally take longer to execute than additive ones.

Complexities
a) Best case input
- The input set that allows an algorithm to perform most quickly.
- It takes the shortest time to execute (it causes the algorithm to do the least amount of work).
For example, in a search algorithm the best case occurs when the value we are searching for is found in the first location. As a result, the algorithm needs only one comparison. No matter how large the input is, searching in the best case takes a constant time of 1. Since the best case is usually very small and frequently a constant value, a best-case analysis is often not done.

b) Worst case input
- The input set that causes an algorithm to perform most slowly. Worst-case analysis is important because it gives us an idea of the most time an algorithm will ever take.
- Worst-case analysis requires that we identify the input values that cause an algorithm to do the most work.
For example, in a search algorithm the worst case occurs when the value we search for is in the last place or is not in the list. This involves N comparisons (for a list of N elements).

c) Average case input
- The input set that allows an algorithm to deliver an average performance.
4 steps:
1) Determine the number of different groups into which all possible input sets can be divided.
2) Determine the probability that the input will come from each of these groups.
3) Determine how long the algorithm will run for each of these groups. All of the inputs in a group should take the same amount of time; if they do not, the group must be split into separate groups.
4) Calculate the average case time using the formula:
A(n) = \sum_{i=1}^{m} p_i \cdot t_i
n = size of the input
m = number of groups
p_i = probability that the input will be from group i
t_i = time that the algorithm takes for input from group i

Efficiency of an algorithm:
- Denotes the rate at which an algorithm solves a problem of size n.
- Is measured by the amount of resources it uses: time and space.
Time refers to the number of steps the algorithm executes.
Space refers to the number of units of memory storage it requires.
An algorithm's complexity is measured by calculating the time taken and the space required to perform the algorithm.
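The best, worst and average cases described above for searching can be checked with a minimal C sketch that counts comparisons. The names `linear_search` and `average_comparisons` are assumptions for illustration. The average is computed with the formula A(n) = sum of p_i * t_i under one particular assumption: the value is always present and is equally likely to be at each of the n positions, so there are m = n groups with p_i = 1/n. The array values are also assumed to be distinct.

```c
#include <assert.h>

/* Linear search that also counts comparisons.
   Returns the index of key in a[0..n-1], or -1 if key is absent.
   *comparisons receives the number of element comparisons made. */
int linear_search(const int *a, int n, int key, int *comparisons)
{
    assert(n >= 0);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;
    }
    return -1;
}

/* Average case A(n) = sum over groups of p_i * t_i.
   Assumption: the key is present and equally likely to be at any
   position, so group i is "key at position i" with p_i = 1/n and
   t_i = comparisons needed to find a[i]. Values must be distinct. */
double average_comparisons(const int *a, int n)
{
    assert(n > 0);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        int t_i;
        linear_search(a, n, a[i], &t_i);
        total += (1.0 / n) * t_i;
    }
    return total;
}
```

Called on the five distinct values {7, 3, 9, 4, 1}, a search for 7 takes 1 comparison (best case), a search for 1 or for an absent value such as 42 takes 5 comparisons (worst case), and the average over keys that are present is (5 + 1)/2 = 3 comparisons.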
Time complexity:
The time complexity of an algorithm is the amount of time needed by a program to complete its task, i.e. to execute a particular algorithm. The number of steps required by an algorithm varies with the size of the problem it is solving (the input). The time taken by an algorithm is made up of two parts:
a) compilation time
b) run time

Compilation time:
- Is the time taken to compile an algorithm (the compiler checks for syntax and semantic errors in the program and links it with the standard libraries).
- Is not in the control of the programmer, as the compiler (which depends on the programming language) takes its own time to compile a particular program.
- Always depends on the compiler, since different compilers can take different times to compile the same program.
- So the compile time is ignored when computing the efficiency of the algorithm, and only the run time is considered.

Run time:
- Is the time to execute the compiled program.
- Depends on the number of instructions present in the algorithm (one unit for executing one instruction).
Note: Run time is calculated only for executable statements and not for declaration statements (Data Structure Using C, P. Sudharsan, p. 1.21).

Worst case: denoted by a function f(n) such as f(n) = n^2 or f(n) = n log n. We write T(n) = O(f(n)) for the worst-case time complexity. This means the algorithm will take no more than on the order of f(n) operations.
Average case: depends on the probability distribution of instances of the problem.
Best case: denoted by a function f(n) such as f(n) = n^2 or f(n) = n log n. We write T(n) = Ω(f(n)) for the best case.
When the worst-case and best-case performance of an algorithm are the same, we can write T(n) = Θ(f(n)).

Space complexity:
- Is the amount of memory consumed by the algorithm (apart from the input and output, if required by the specification) until it completes its execution.
- The amount of storage space required by an algorithm varies with the size of the problem to be solved.
The memory used by an algorithm has two parts:
a) A fixed amount of memory, occupied by the program code and by the variables used in the program.
b) A variable amount of memory, occupied by component variables whose size depends on the problem being solved. This space increases or decreases depending on whether the program uses iterative or recursive procedures.

Memory space taken by the variables is in the control of the programmer. The more variables used, the more memory space they take.

The space used by a program is classified as:
a) instruction space
b) data space
c) environment space

Instruction space:
- Is the space in memory occupied by the compiled version of the program. We consider this space constant for any value of n. The instruction space is independent of the size of the problem.

Data space:
- Is the space in memory used to hold the variables, data structures, allocated memory and other data elements. The data space is related to the size of the problem.

Environment space:
- Is the space in memory used at run time for each function call.
- Holds the return address of the previous function.

Eg: iterative factorial
    long fact(long n)
    {
        long i, x = 1;              /* x accumulates the product */
        for (i = 1; i <= n; i++)
            x = i * x;
        return x;
    }
Space occupied:
Data space: i, n, x
Environment space: almost nothing, because the function is called only once.
The space complexity remains the same for any n, since the same variables are used and the function is called only once.

Eg: recursive factorial
    long fact(long x)
    {
        if (x <= 1)
            return 1;
        else
            return x * fact(x - 1);
    }
Data space: x
Environment space: fact() is called recursively, so the amount of space this program uses depends on the size of the problem.

Performance analysis
The performance-analysis function probes for bottlenecks in server-hardware performance, diagnoses the problem, and suggests ways to improve the performance.
The Performance Analysis icon is displayed as one of six images, each of which represents a different meaning:
- The icon blinks to indicate that performance analysis is in progress.
- The performance analysis is complete. The Report Viewer freezes while the results are loaded for viewing.
- The performance-analysis report is ready and has no bottleneck recommendations, but the Details section of the report may discuss some bottlenecks or latent bottlenecks.
- The performance-analysis report is ready, and you have system bottlenecks.
- The performance-analysis report could not be created. One or more critical monitors might be missing.
- Performance analysis is disabled. To re-enable performance analysis, click Edit > Enable Performance Analysis.

Tips:
By default, Capacity Manager activates all the required performance-analysis monitors that are present on the systems.
Performance analysis is available only on the following operating systems that have all the required performance-analysis monitors:
- Linux®
- Windows® 2000
- Windows 2003
- Windows Server 2008
- Windows Vista

Performance-analysis algorithm
The performance-analysis algorithm is based on the practices of experts. The algorithm can find most but not all system problems.
The algorithm uses these monitors for Windows®:
- Memory Usage
- % Disk Time
- CPU Utilization
- Depending on the operating system you use, any of the following monitors to reflect LAN adapter performance:
  Packets/sec
  Total Bytes/sec
  % Network Utilization

The algorithm uses these monitors for Linux®:
- Used Non-Cached (MBytes)
- I/O operations/second
- CPU Utilization
- Any of the following monitors to reflect LAN adapter performance:
  Bytes/second
  Packets/second

Asymptotic Notations:
Analyzing an algorithm allows us to estimate the time it will take to complete its task, find and eliminate wasteful code, or alert us when the algorithm we chose is very slow. This complexity analysis characterizes the relationship between the size of the data and the resource usage with a simple formula. The notations are methods used to estimate the efficiency of an algorithm. Pg