ECOE 456/556: Algorithms and Computational Complexity
Lecture 1
Serdar Taşıran

This Week’s Outline
  - Introduction and basics
      - Fundamental concepts, formal definitions, pseudocode
      - Motivation for studying algorithms
  - Analyzing algorithms
  - Growth of functions
      - Asymptotic notation
  - Recurrence equations
      - Solving recurrences to compute asymptotic complexity

Algorithms
  - Al-gorithm: named after the 9th-century Persian mathematician al-Khwarizmi (al Harezmi), not Al Gore
  - An algorithm: a tool for solving a well-specified computational problem
  - Problem statement:
      - Inputs, outputs
      - The desired input/output relationship
  - An algorithm describes a specific computational procedure for producing the required output
  - Example: Sorting
      - Input: a sequence of n numbers <a_1, a_2, …, a_n>
      - Output: a reordering <a’_1, a’_2, …, a’_n> of the sequence such that a’_1 ≤ a’_2 ≤ … ≤ a’_n
      - Given the input <6, 3, 1, 7>, the algorithm should produce <1, 3, 6, 7>
      - An input such as <6, 3, 1, 7> is called an instance of the problem

When do we need algorithmic solutions?
  - Almost every engineering application
  - The Human Genome Project
      - Bioinformatics in general
  - The Internet
      - Routing algorithms
      - Searching, indexing
  - E-commerce
      - Cryptography
      - Authentication
  - Scheduling, optimization of industrial processes
  - Numerical algorithms
      - e.g. matrix multiplication, finite-element simulation

When do we need non-trivial data structures?
  - Almost any industrial-strength software tool needs them
  - Need to represent
      - Sets
      - Relations
      - Discrete functions
      - Sequences
      - Queues
      - …
  - Have varied requirements for what kind of operations to perform on them
      - Insert, delete, lookup
      - Next, previous
      - Minimum, maximum
      - …

Why worry about efficiency?
  - Hardware and memory are fast and cheap.
  - Computing is getting cheaper and cheaper.
  - Intelligent manpower is expensive.
  - Why worry about efficient algorithms and data structures?

Why worry about algorithms and data structures?
  - Algorithms are as important a technology as other advanced technologies, such as
      - Hardware architecture
      - Graphical user interfaces
      - Programming technologies
      - Networks
  - A stupid approach uses up computing power faster than you might think.
  - Example: sorting a million numbers

                          O(n^2) algorithm     O(n lg n) algorithm
      Instruction count   2n^2                 50 n lg n
      Machine speed       10^9 inst/second     10^7 inst/second
      Running time        2000 sec.            100 sec.

  - Interactive graphics: algorithms must terminate in 1/30 of a sec.

Movie: Sorting Algorithms
  http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/sortcontest/sortcontest.htm

Computationally hard problems
  - Polynomial complexity: requires resources O(P(n)) for some polynomial P
  - Exponential complexity
  - Why divide this way?
      - Most polynomial problems have low-degree polynomial complexity
      - Any exponential is asymptotically bigger than any polynomial
  - Grey area in between
      - No known polynomial algorithm
      - No proof that one doesn’t exist
  - Interesting class of problems: NP-complete problems
      - Come up very often in practical applications
      - All computationally equivalent
      - If an efficient algorithm exists for one, all NP-complete problems are polynomially solvable

Analyzing algorithms
  - Is the algorithm correct?
      - Does it terminate on all inputs?
      - Does it produce the required output?
  - What amount of resources does the algorithm use up?
      - Memory
      - Communication bandwidth
      - Logic gates (if implemented in hardware), speed of logic circuit
      - Running time
  - Machine model: single-processor, random-access machine
      - One instruction at a time, no concurrent processing
      - Memory access also counts as one instruction
      - Idealized, but adequate for characterizing the general behavior of algorithms

Example: Insertion Sort
  - Takes an array A[1..n] containing a sequence of length n to be sorted
  - Sorts the array in place
      - Numbers are rearranged inside A, with at most a constant number of them stored outside
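
A minimal Python sketch of insertion sort as described above, assuming the standard textbook formulation of the procedure. The function name insertion_sort and the 0-based indexing are choices made for this sketch and are not taken from the slides, which use A[1..n].

```python
def insertion_sort(a):
    """Sort the list `a` in place into increasing order and return it."""
    # Python lists are 0-indexed, so j runs over 1..n-1 where A[1..n] would use 2..n.
    for j in range(1, len(a)):
        key = a[j]          # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift every prefix element greater than key one slot to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key      # place key into the gap that opened up
    return a


if __name__ == "__main__":
    print(insertion_sort([6, 3, 1, 7]))  # [1, 3, 6, 7]
```

Only the single variable key lives outside the array at any time, which is what "in place" means here.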

Correctness of INSERTION-SORT
  - Loop invariant: at the start of each iteration, the subarray A[1..j-1] contains the elements originally in A[1..j-1], but in sorted (increasing) order
  - Prove initialization, maintenance, termination
  - The invariant must imply an interesting property of the algorithm

Analyzing running time
  - Depends on input size: how to quantify it?
      - Number of items in the input
      - Number of bits required to encode the input
  - Running time = number of primitive steps executed in the computation
  - Each line of pseudocode takes constant time

Running time of INSERTION-SORT
  - Depends on the input instance
      - Best case (input already sorted): linear in n
      - Worst case (input in reverse sorted order): quadratic in n
  - We usually care about the worst and average case
  - Why worst case?
      - Upper bound
      - May occur often, e.g. search, database look-up
      - Average case is often as bad as the worst case
          - Many exceptions
      - Don’t need to assume a particular input probability distribution
  - Simplifying assumption: we care about the rate of growth of the running time only. Called “asymptotic complexity”.

Growth of functions: Asymptotic notation
  - Θ-notation
  [Figure not reproduced. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.]
  - Examples:
      - (1/6)n^2 - 7n = Θ(n^2)
      - n^3 = Θ(n^2)?

How to design algorithms?
  - Many styles
      - We’ll see examples throughout the semester
  - Insertion sort was an example of an “incremental algorithm”
  - Another common paradigm: divide and conquer
  - Steps
      - Divide into simpler/smaller subproblems
      - Solve subproblems recursively
      - Combine results of subproblems
  - Example: Merge-sort

Merge Sort
  - Divide into two subsequences of half size
  - Call yourself recursively on the halves
  - Combine results
  - How do you prove MergeSort correct?
  [Figure not reproduced. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.]
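
The three divide-and-conquer steps of merge sort can be sketched in Python as follows. This is an illustrative version, not necessarily the pseudocode shown in the lecture: the names merge and merge_sort are assumptions of this sketch, and it returns a new list rather than sorting in place.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in O(len(left) + len(right)) time."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Taking from `left` on ties keeps equal elements in their original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(a):
    """Return a new sorted list; the input list is left unchanged."""
    if len(a) <= 1:                 # base case: nothing to sort
        return list(a)
    mid = len(a) // 2               # divide into two halves
    left = merge_sort(a[:mid])      # solve each half recursively
    right = merge_sort(a[mid:])
    return merge(left, right)       # combine the two sorted halves


if __name__ == "__main__":
    print(merge_sort([6, 3, 1, 7]))  # [1, 3, 6, 7]
```

The linear-time merge is the "combine" step; correctness of merge_sort then follows by induction on the length of the input, given that merge is correct on sorted inputs.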

Recurrence equation for MERGE-SORT
  - Express T(n) in terms of the subproblems and the cost of dividing into subproblems and combining their results:
      T(n) = 2T(n/2) + O(n)
      - 2T(n/2): two recursive calls, each on half of the input
      - O(n): dividing and merging
  - T(n) = Θ(n lg n)
  - Insertion sort was Θ(n^2)

Solving recurrences
  - The substitution method
  - The recursion tree method
      - A graphical method for coming up with a good guess
      - The guess needs to be verified using the substitution method
  - The master method
  - Technicalities
      - Integer arguments: floors, ceilings
      - Boundary conditions: T(n) is constant for small enough n
      - Powers of k

Substitution method
  - Guess the form of the solution
  - Use mathematical induction to verify correctness
  - Example: T(n) = 2T(n/2) + n
      - Guess T(n) = O(n lg n)
      - Prove that T(n) ≤ c n lg n for an appropriate choice of c using strong induction
      - How about T(n) = O(n)?

The recursion tree method: MERGE-SORT

The recursion tree method

T(n) = 3T(n/4) + cn^2
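
For T(n) = 3T(n/4) + cn^2, depth i of the recursion tree has 3^i nodes, each costing c(n/4^i)^2, so the level costs form a geometric series with ratio 3/16 and the total stays below (16/13)cn^2. That suggests the guess T(n) = O(n^2), which still has to be verified by substitution. The Python sketch below checks the arithmetic numerically. It assumes n is a power of 4 and T(1) = c; the names level_costs and T are hypothetical helpers introduced here.

```python
from fractions import Fraction


def level_costs(n, c=1):
    """Per-level costs of the recursion tree for T(n) = 3T(n/4) + c*n^2.

    Assumes n is a power of 4 and T(1) = c (assumptions of this sketch).
    Entry i is the total cost of depth i; the last entry is the cost of the leaves.
    """
    costs = []
    nodes, size = 1, n
    while size > 1:
        costs.append(nodes * c * size * size)  # 3^i nodes, each costing c*(n/4^i)^2
        nodes *= 3
        size //= 4
    costs.append(nodes * c)                    # 3^(log_4 n) leaves, each costing T(1) = c
    return costs


def T(n, c=1):
    """Evaluate the recurrence directly, under the same assumptions."""
    return c if n == 1 else 3 * T(n // 4, c) + c * n * n


if __name__ == "__main__":
    for k in range(1, 7):
        n = 4 ** k
        total = sum(level_costs(n))
        assert total == T(n)                              # tree sum matches the recurrence
        assert Fraction(total, n * n) < Fraction(16, 13)  # geometric-series bound with c = 1
        print(n, total, float(Fraction(total, n * n)))
```

Exact rational arithmetic via Fraction is used so the comparison against 16/13 is not affected by floating-point rounding.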