TDDB57 DALG – Lecture 1: Basics. Main concepts (1) Page 1 Algorithmic problem + describes input data and corresponding output J. Maluszynski, IDA, Linköpings Universitet, 2004. Abstract data type (ADT) + machine-independent, high-level description of data and operations on them + e.g.: Dictionary, Stack, Queue, PriorityQueue, Set, ... Data structure + logical organization of computer memory for storing data Algorithm + high-level description of concrete operations on data structures ADT implemented by suitable data structures and algorithms J. Maluszynski, IDA, Linköpings Universitet, 2004. TDDB57 DALG – Lecture 1: Basics. Main concepts (2) Algorithm Page 2 J. Maluszynski, IDA, Linköpings Universitet, 2004. + Concrete but machine- and language-independent specification of all necessary steps (simple operations on data) to compute a solution (output) from a given problem instance (input) + Correctness + Complexity analysis time and space consumption, scalability, efficiency worst case, best case, average case, amortized analysis Page 4 J. Maluszynski, IDA, Linköpings Universitet, 2004. + Algorithmic paradigms commonly used problem solving strategies e.g., divide&conquer, dynamic programming, greedy strategy, ... TDDB57 DALG – Lecture 1: Basics. Example problem (2): Describing Data Page 3 Example algorithmic problem (1) TDDB57 DALG – Lecture 1: Basics. Checking the exam result of a student: First step: Choose a suitable Abstract Data Type: objects and operations Here, ADT Set would be also appropriate. In the example: ADT Dictionary Domains: K: linearly ordered set of keys (here: names) I: information (here: just information passed yes / not passed no) Operations on a dictionary D: LookUp D k K returns “yes” iff k D ... Details of the operations are hidden from the user. input data: exam result list, name ADT) data structure) desired output: yes or no What are the data and necessary operations? ( How to represent the data in the computer? ( How to realize the operations with that data structure? Next step: Find suitable representation of the ADT as data structure logical organization of computer memory. Page 5 1028 1032 .... 222 224 226 228 230 232 ... 412 N N A J. Maluszynski, IDA, Linköpings Universitet, 2004. A 1032 D B Info 224 000 232 Next D J. Maluszynski, IDA, Linköpings Universitet, 2004. A 222 C C ... 228 P B TDDB57 DALG – Lecture 1: Basics. Page 6 Memory (2): RAM model static: program memory current instruction Page 8 clock load CPU register 1 store dynamic: amount of data can be changed e.g., singly linked lists amount of data cannot be changed e.g. tables Static vs. Dynamic Data Structures TDDB57 DALG – Lecture 1: Basics. J. Maluszynski, IDA, Linköpings Universitet, 2004. data memory M[3] ..... M[2] M[1] M[0] J. Maluszynski, IDA, Linköpings Universitet, 2004. RAM model assumes: All primitive operations take 1 unit of time. arithmetic operations (add, sub, mul, ...) on word-sized operands load/store of word-sized operands jumps (conditional, non-conditional) Primitive operations in the RAM model: PC register 2 .... ALU An abstract model of a simple but general computer architecture with such a flat, random-access memory model is called a Random Access Machine (RAM). TDDB57 DALG – Lecture 1: Basics. Memory (1) basic addressable units (bytes) cell: a number of units with an address to store a simple data type random access: the same amount of time is taken for retrieving / storing at any address abstracts from memory hierarchy Page 7 pointer: a cell containing an address; Λ null pointer TDDB57 DALG – Lecture 1: Basics. Memory (3): Basic Memory Structures record: cell(s) containing a structured object; the components are called fields table: memory cells of equal size in a contiguous block linked structure: built with pointers A J. Maluszynski, IDA, Linköpings Universitet, 2004. TDDB57 DALG – Lecture 1: Basics. Page 10 Dictionary lookup algorithms: linear vs. binary search Page 9 Efficiency of Algorithms Search for the name (key) k in an ordered table T . TDDB57 DALG – Lecture 1: Basics. How many steps (primitive RAM operations) are needed to obtain the result? Linear search: Binary search: How many comparisons are needed? Compare k with the keys in consecutive cells of T . This depends on: - the size of input express / approximate the number of steps as a function of input size - the kind of input: worst-case / average case analysis Compare k with the key m in the middle cell of T k m: name found, stop k m: search in the first half of T k m: search in the second half of T Page 12 J. Maluszynski, IDA, Linköpings Universitet, 2004. J. Maluszynski, IDA, Linköpings Universitet, 2004. to abstract from too low-level (primitive RAM) operations to discuss efficiency of the algorithms to present commonly used data structures and algorithms Objectives: Straightforward to translate into source code for high-level programming languages such as Ada, Pascal, C, Java Easy description of typical memory operations Reflecting memory organization As simple as possible Pseudocode-Notation for Data Structures and Algorithms TDDB57 DALG – Lecture 1: Basics. How many comparisons are needed? - the data structure used J. Maluszynski, IDA, Linköpings Universitet, 2004. (more ambitious: search phone number in a telephone directory with n entries) Page 11 Example: Dictionary search (LookUp operation) search a student’s name in an exam-passed list with n entries + Linear search + Binary search TDDB57 DALG – Lecture 1: Basics. Suming-up the Example The same data can be stored in a different way table vs. records linked with pointers The same problem can be solved with different algorithms linear search vs. binary search The efficiency of the algorithms depends on the size of data and on the kind of data e.g. worst case, best case can be characterized/approximated by a function on the size TDDB57 DALG – Lecture 1: Basics. Page 14 Y Y Z X Y X J. Maluszynski, IDA, Linköpings Universitet, 2004. X Y Z X Y X TableSearch D V procedure Link pointer P, Q ... J. Maluszynski, IDA, Linköpings Universitet, 2004. J. Maluszynski, IDA, Linköpings Universitet, 2004. Page 13 Pseudocode Notation: Assignment Y means: Page 16 TDDB57 DALG – Lecture 1: Basics. Pseudocode Notation: Data Types X Simple assignment atomic: integer, boolean, ... where X is a variable, a table element (e.g. T i ), a field (e.g. Next L ) and Y is an expression representing a value X Simultaneous assignment: Abbreviation: TDDB57 DALG – Lecture 1: Basics. pointer values are addresses, null pointer Λ J. Maluszynski, IDA, Linköpings Universitet, 2004. key a generic name for totally ordered data table: declaration: table boolean T a b table of booleans with integer indices ranging from a to b the i-th element, for a i b access: T i Page 15 node with fields occupies a single block of memory (cell) fields accessed by names (dot or macro notation) anode.name, name anode created by NewNode(nodetype), which returns a pointer to the node TDDB57 DALG – Lecture 1: Basics. Pseudocode Notation: Subroutines Pseudocode Notation: Control Functions: Conditional statements: if ... then ... / if ... then ... else ... function TableSearch table key T 1 n , key K : integer a local variable integer x; ... stops execution and returns the value x ... return x; ... Call: this is a comment Loops: for ... from ... to ... do while ... do scope of do shown by indentation Comments Procedures (functions not returning values) Recursion is allowed. Short-cut evaluation of Boolean expressions: evaluated from left to right execution stops as soon as the final value is determined e.g., X and Y and ... evaluation stops if false obtained Page 17 J. Maluszynski, IDA, Linköpings Universitet, 2004. TDDB57 DALG – Lecture 1: Basics. Page 18 J. Maluszynski, IDA, Linköpings Universitet, 2004. Example: Dictionary Search, Variant 2 TDDB57 DALG – Lecture 1: Basics. Example: Dictionary Search, Variant 1 Dictionary ADT represented as an ordered, singly linked list L. The LookUp operation is realized by ListSearch: D J. Maluszynski, IDA, Linköpings Universitet, 2004. Dictionary ADT represented as an ordered table T , where T i gives the contents of the i-th cell. The LookUp operation is realized by TableSearch: C function ListSearch (pointer List, key k ) : boolean pointer P List while P Λ and Key P k do if Key P k then return true P Next P return false B Page 20 function TableSearch ( table key T 1 n , key k ): boolean for i from 1 to n do if T i k then return true if T i k then return false return false P A Alternatively, the binary search algorithm may be applied. List TDDB57 DALG – Lecture 1: Basics. Using Locatives J. Maluszynski, IDA, Linköpings Universitet, 2004. Page 19 600 procedure LLInsert ( key k, locative P ) insert a cell containing key k in list P if none exists already while P Λ and Key P k do P Next P if P Λ and Key P k then return P NewNode k P TDDB57 DALG – Lecture 1: Basics. 400 Locative: an extended pointer Locative P 600 The course book Lewis/Denenberg uses so-called locatives to simplify algorithm descriptions: 400 P <- Next(P) Locative P 600 + locative = extended pointer that memoizes the “source” of a link automatically 500 + eliminates some special cases 400 P <= NewNode(500, P) “syntactic sugar” Locative P + eliminates an auxiliary pointer TDDB57 DALG – Lecture 1: Basics. Summary Page 21 ADT tells what to do: set of operations on data. J. Maluszynski, IDA, Linköpings Universitet, 2004. To describe how to do that, we must choose a data structure (representation in memory) find algorithms that realize the ADT operations and describe them as functions on that data structure The same ADT can be implemented with different data structures. The data structure chosen may restrict the choice of the algorithms to be used for the ADT operations and thus the efficiency that can be achieved. We use pseudocode notation for precise and concise description of algorithms.