TDDB57 DALG – Lecture 5: ADT Dictionary and Its Implementations
J. Maluszynski, IDA, Linköpings Universitet, 2004.

Lecture 5 – Overview

Implementation of Set and Dictionary using tables
  + Analysis of binary search [Lewis/Denenberg p.181-184]
Implementation of Set and Dictionary using linked lists
Trees: Basic terminology [Lewis/Denenberg 4.1]

Implementations of ADTs Set and Dictionary

Table (Lecture 3)
  – Unordered table
  – Ordered table
Linked lists (Lecture 4)
  – Unordered lists
  – Ordered lists
Hashing; (Binary) search trees; AVL-trees, Splay trees (Lectures 3, 5, 6, 7)
[Lewis/Denenberg 6.2] [Lewis/Denenberg 6.3] [Lewis/Denenberg 6.4]

Table representations for ADTs Set and Dictionary

For unordered keys:
  LookUp by linear search
  – unsuccessful lookup: $n$ comparisons, $O(n)$ time
  – successful lookup, worst case: $n$ comparisons, $O(n)$ time
  – successful lookup, average case with uniform distribution of requests:
    $\frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2}$ comparisons, $O(n)$ time
  Insert in $O(1)$

For ordered keys:
  LookUp by binary search
  – worst case and average case: $O(\log n)$ time
  Insert requires reconstruction of the table: $O(n)$ worst case time

Analysis of binary search in an ordered table (1)

Given: table T with n > 0 elements, key k to search for

  function BinSearchLookUp(table T[0..n−1], key k): pointer
    l ← 0; r ← n − 1
    while true do                  { search in interval l..r for k }
      if r < l then return Λ
      mid ← ⌊(l + r) / 2⌋
      if k = key(T[mid]) then return T[mid]
      else if k < key(T[mid]) then r ← mid − 1
      else l ← mid + 1
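The binary search pseudocode above translates almost line by line into Python. The following is a minimal sketch, not part of the original slides: it returns an index instead of a pointer, uses `None` for Λ, and the names `bin_search_lookup`, `linear_lookup` and the `probes` counter are illustrative assumptions. Each probed table element is counted as one comparison, which matches the counting used in the analysis of this lecture.

```python
from typing import Optional, Sequence, Tuple


def bin_search_lookup(table: Sequence[int], k: int) -> Tuple[Optional[int], int]:
    """Binary search in an ordered table.

    Returns (index of k, probes) on success and (None, probes) on failure,
    where probes is the number of table elements compared with k.
    """
    l, r = 0, len(table) - 1
    probes = 0
    while True:
        if r < l:                      # empty interval: unsuccessful search
            return None, probes
        mid = (l + r) // 2             # floor of the midpoint
        probes += 1
        if k == table[mid]:
            return mid, probes
        elif k < table[mid]:
            r = mid - 1                # continue in the left half
        else:
            l = mid + 1                # continue in the right half


def linear_lookup(table: Sequence[int], k: int) -> Tuple[Optional[int], int]:
    """Linear search in an unordered table, with the same return convention."""
    for i, key in enumerate(table):
        if key == k:
            return i, i + 1
    return None, len(table)


table = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]   # n = 10, ordered keys
print(bin_search_lookup(table, 29))
print(linear_lookup(table, 29))
```

For this table of 10 keys, looking up the last key takes 4 probes with binary search and 10 with linear search; averaging the linear probes over all 10 keys gives (n + 1) / 2 = 5.5.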
Analysis of binary search in an ordered table (2)

Possible executions represented by a binary tree (a decision tree):

[Figure: decision tree of BinSearchLookUp for n = 10. Each internal node shows l, r and m = mid; edges are labelled k<T[m] and k>T[m]. Root (depth 0): l=0, r=9, m=4. Depth 1: m=1 and m=7. Depth 2: m=0, m=2, m=5, m=8. Depth 3: m=3, m=6, m=9. External nodes (r < l) lie at depths 3 and 4.]

– internal nodes correspond to comparisons and successful returns
– external (leaf) nodes represent unsuccessful termination points
– edges correspond to a backward branch in the loop

Analysis of binary search in an ordered table (3): Worst case

Depth $d_v$ of a node v of the tree: number of nodes (comparisons) on the path from the root to v, excluding v
Length of a path π: number of edges on the path from the root to v
There is exactly one path $\pi_v$ from the root to each tree node v.

Maximum depth of an internal node is $\lfloor \log_2 n \rfloor$
Maximum length of a path is $\lfloor \log_2 n \rfloor$ (to an internal node) or $\lfloor \log_2 n \rfloor + 1$ (to an external node)

– successful search: at least 1, at most $\lfloor \log_2 n \rfloor + 1$ comparisons
– unsuccessful search: $\lfloor \log_2 n \rfloor$ or $\lfloor \log_2 n \rfloor + 1$ comparisons

Analysis of binary search in an ordered table (4): Average case

Table with n items. Successful search. Uniform distribution.
Denote $I = \sum_{v \in Int} d_v$ and $E = \sum_{v \in Ext} d_v$, where Int and Ext are the sets of internal and external nodes of the decision tree.

Search succeeds at internal node v with depth $d_v$. It then needs $d_v + 1$ comparisons.
Expected number of comparisons:
  $C = \frac{1}{n} \sum_{v \in Int} (d_v + 1) = 1 + \frac{I}{n}$
The tree has n internal nodes (thus n + 1 external nodes).
Fact: $E = I + 2n$ (prove by induction)

Analysis of binary search in an ordered table (5): Average case cont.

Recall $I = \sum_{v \in Int} d_v$, $E = \sum_{v \in Ext} d_v$, $E = I + 2n$ and $C = 1 + \frac{I}{n}$.

Depth of an external node: $d_v \ge \lfloor \log_2 n \rfloor$, hence $E \ge (n+1) \lfloor \log_2 n \rfloor$.
  $C = 1 + \frac{E - 2n}{n} = \frac{E}{n} - 1 \ge \frac{n+1}{n} \lfloor \log_2 n \rfloor - 1 \ge \lfloor \log_2 n \rfloor - 1$
  $C \le \lfloor \log_2 n \rfloor + 1$ (not worse than unsuccessful search)
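The quantities I, E and C from the average-case analysis can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the lecture: the helper name `depth_sums` is hypothetical, and the decision tree is generated by recursing on the same intervals l..r that the binary search loop visits.

```python
from math import floor, log2
from typing import Tuple


def depth_sums(n: int) -> Tuple[int, int]:
    """Return (I, E) for the decision tree of binary search on n elements.

    I is the sum of depths of internal nodes (successful returns),
    E is the sum of depths of external nodes (unsuccessful terminations).
    """
    def walk(l: int, r: int, depth: int) -> Tuple[int, int]:
        if r < l:                       # external node
            return 0, depth
        mid = (l + r) // 2              # internal node probing T[mid]
        i_left, e_left = walk(l, mid - 1, depth + 1)
        i_right, e_right = walk(mid + 1, r, depth + 1)
        return depth + i_left + i_right, e_left + e_right

    return walk(0, n - 1, 0)


for n in (1, 2, 3, 10, 100, 1000):
    I, E = depth_sums(n)
    C = 1 + I / n                       # expected comparisons, successful search
    assert E == I + 2 * n               # the Fact from the slide
    assert floor(log2(n)) - 1 <= C <= floor(log2(n)) + 1
    print(n, I, E, C)
```

For n = 10 this yields I = 19 and E = 39, so C = 2.9, which lies between ⌊log2 10⌋ − 1 = 2 and ⌊log2 10⌋ + 1 = 4.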
Analysis of binary search in an ordered table (6): Average case cont.

By the analysis we obtained (uniform distribution; $I = \sum_{v \in Int} d_v$, $E = \sum_{v \in Ext} d_v$):

Binary Search Expected Time Theorem:
Given a table of n elements, assuming uniform distribution of keys, the expected number C of comparisons in a successful search is
  $\lfloor \log_2 n \rfloor - 1 \le C \le \lfloor \log_2 n \rfloor + 1$

Unordered list representation of ADTs Set and Dictionary

Insert in constant time
LookUp and Delete require list traversal
  – unsuccessful and worst case successful: needs n = length(L) comparisons
  – average case with uniform distribution of requests:
    $\frac{1}{n} \sum_{i=1}^{n} i = \frac{n+1}{2} \approx \frac{n}{2}$ comparisons
In practice some keys are accessed more often than others:
  the move-to-front heuristic can improve average time.

Ordered list representation for ADTs Ordered Set, Dictionary

The list items are ordered wrt the ordering on the keys.
Unsuccessful LookUp needs on average n/2 comparisons
Successful LookUp needs on average n/2 comparisons
No direct access to the middle element: binary search impossible.
Idea: binary search trees combine the flexibility of linked lists (insertion, deletion) with fast access to the middle elements.

Trees: examples of use

+ expression trees
+ structured documents (book: chapters, sections, subsections, paragraphs, ...)
+ hierarchical organization diagrams (company: departments, divisions, groups, employees)
+ hierarchical classification systems in science and engineering
+ genealogic trees (successors of a person)
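Returning to the unordered list representation, the move-to-front heuristic mentioned there can be sketched as follows. This is a minimal illustration, not the lecture's own code: the class names `Node` and `MoveToFrontList` and the choice to report the number of comparisons from `lookup` are assumptions made for the example, and `insert` does not check for duplicate keys.

```python
from typing import List, Optional


class Node:
    def __init__(self, key: int, nxt: "Optional[Node]" = None) -> None:
        self.key = key
        self.next = nxt


class MoveToFrontList:
    """Unordered singly linked list with the move-to-front heuristic."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None

    def insert(self, key: int) -> None:
        """Insert at the front in constant time."""
        self.head = Node(key, self.head)

    def lookup(self, key: int) -> Optional[int]:
        """Return the number of comparisons if key is found, else None.

        A found node is unlinked and moved to the front of the list.
        """
        prev, cur, comparisons = None, self.head, 0
        while cur is not None:
            comparisons += 1
            if cur.key == key:
                if prev is not None:            # not already at the front
                    prev.next = cur.next
                    cur.next = self.head
                    self.head = cur
                return comparisons
            prev, cur = cur, cur.next
        return None

    def keys(self) -> List[int]:
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.key)
            cur = cur.next
        return out


lst = MoveToFrontList()
for k in (1, 2, 3, 4):
    lst.insert(k)                # list is now 4, 3, 2, 1
first = lst.lookup(1)            # traverses the whole list
second = lst.lookup(1)           # key 1 is now at the front
print(first, second, lst.keys())
```

The first lookup of key 1 needs 4 comparisons and the repeated lookup needs 1, which is the effect the heuristic relies on when some keys are requested far more often than others.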
Trees: Basic terminology (1)

A Tree is a graph $T = (V, E)$.
Nodes $v \in V$ (store data items)
Edges $E \subseteq V \times V$; E is referred to as the parent-child relation
There is exactly one node that has no parent: the root of T.
Each non-root node has exactly one parent; it may have siblings.
The degree of a node $v \in V$ is the number of its children: $|\{ w \in V : (v, w) \in E \}|$
A node that has no children is called a leaf node.

Trees: Basic terminology (2)

path $\pi = (v_1, v_2, \ldots, v_l)$ in $T = (V, E)$ from $v_1$ to $v_l$ with length $l - 1$,
  if $v_i \in V$ for $1 \le i \le l$, and $(v_i, v_{i+1}) \in E$ for $1 \le i < l$
ancestors of a node $v \in V$: $\{ u \in V : \exists$ path from u to v in T $\}$
successors of a node $v \in V$: $\{ w \in V : \exists$ path from v to w in T $\}$
depth $d(v)$ of a node $v \in V$: length of the (unique) path from the root to v
height $h(v)$ of a node $v \in V$: length of the longest path from v to a successor of v
height $h(T)$ of tree T = height of the root of T
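The terminology above can be made concrete with a small example. The sketch below assumes a tree given as a node set V and a parent-child edge set E; the function names `build`, `degree`, `depth` and `height` and the sample tree are illustrative choices, not taken from the slides.

```python
from typing import Dict, Iterable, List, Tuple


def build(V: Iterable[str], E: Iterable[Tuple[str, str]]):
    """Return (root, children, parent) for a tree T = (V, E)."""
    children: Dict[str, List[str]] = {v: [] for v in V}
    parent: Dict[str, str] = {}
    for p, c in E:
        children[p].append(c)
        parent[c] = p
    roots = [v for v in children if v not in parent]
    assert len(roots) == 1, "a tree has exactly one node without a parent"
    return roots[0], children, parent


def degree(children: Dict[str, List[str]], v: str) -> int:
    """Number of children of v."""
    return len(children[v])


def depth(parent: Dict[str, str], v: str) -> int:
    """Length (number of edges) of the path from the root to v."""
    d = 0
    while v in parent:
        v = parent[v]
        d += 1
    return d


def height(children: Dict[str, List[str]], v: str) -> int:
    """Length of the longest path from v down to a leaf below it."""
    if not children[v]:
        return 0
    return 1 + max(height(children, w) for w in children[v])


V = ["a", "b", "c", "d", "e"]
E = [("a", "b"), ("a", "c"), ("b", "d"), ("b", "e")]
root, children, parent = build(V, E)
leaves = [v for v in V if degree(children, v) == 0]
print(root, leaves, depth(parent, "d"), height(children, root))
```

In this sample tree, a is the root with degree 2, the leaves are c, d and e, node d has depth 2, and the height of the tree is 2.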
Special kinds of trees

Ordered tree: linear order among the children of each node
Binary tree: ordered tree with degree $\le 2$ for each node; left child, right child
Empty binary tree (Λ): binary tree with no nodes
Full binary tree: nonempty; degree is either 0 or 2 for each node
  Fact: number of leaves = 1 + number of interior nodes (proof by induction)
Perfect binary tree: full, all leaves have the same depth
  Fact: number of nodes = $2^{h+1} - 1$ ($2^h$ leaves) for a perfect tree of height h (proof by induction on h)
Complete binary tree: approximation to perfect, see next slide

Complete binary trees

An ordered binary tree T is complete iff:
  height(T) = 0, or
  height(T) = 1 and T has a left child (may have both), or
  height(T) = n > 1 and either
    – the left subtree of T is a perfect tree of height n − 1 and
      the right subtree of T is a complete tree of height n − 1, or
    – the left subtree of T is a complete tree of height n − 1 and
      the right subtree of T is a perfect tree of height n − 2
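The recursive definition of a complete binary tree can be turned into a direct check. The following is a sketch under stated assumptions, not the lecture's code: the class `BT` and the functions `height`, `size`, `is_perfect`, `is_complete` and `perfect` are hypothetical names, the empty tree is represented by `None` with height −1, and the empty tree is treated as neither perfect nor complete.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BT:
    """A binary tree node; None stands for the empty tree."""
    left: Optional["BT"] = None
    right: Optional["BT"] = None


def height(t: Optional[BT]) -> int:
    # Assumption: the empty tree has height -1, a single node has height 0.
    return -1 if t is None else 1 + max(height(t.left), height(t.right))


def size(t: Optional[BT]) -> int:
    return 0 if t is None else 1 + size(t.left) + size(t.right)


def is_perfect(t: Optional[BT]) -> bool:
    """Full, and all leaves have the same depth."""
    if t is None:
        return False
    if t.left is None and t.right is None:
        return True
    if t.left is None or t.right is None:
        return False
    return (is_perfect(t.left) and is_perfect(t.right)
            and height(t.left) == height(t.right))


def is_complete(t: Optional[BT]) -> bool:
    """Check the three cases of the recursive definition."""
    if t is None:
        return False
    n = height(t)
    if n == 0:
        return True
    if n == 1:
        return t.left is not None
    l, r = t.left, t.right
    case_a = (is_perfect(l) and height(l) == n - 1
              and is_complete(r) and height(r) == n - 1)
    case_b = (is_complete(l) and height(l) == n - 1
              and is_perfect(r) and height(r) == n - 2)
    return case_a or case_b


def perfect(h: int) -> BT:
    """Build a perfect binary tree of height h."""
    return BT() if h == 0 else BT(perfect(h - 1), perfect(h - 1))


heap_shape = BT(BT(BT(), BT()), BT(BT(), None))     # 6 nodes, last level filled from the left
gap = BT(BT(BT(), None), BT(BT(), BT()))            # hole in the left part of the last level
print(size(perfect(3)), is_complete(heap_shape), is_complete(gap))
```

The perfect tree of height 3 has 2^4 − 1 = 15 nodes, the 6-node tree whose last level is filled from the left is complete, and the tree with a gap on the left of its last level is not.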