TTIT33 Algorithms and Optimization – DALG Lecture 6
Balanced Search Trees, Heaps
Jan Maluszynski - HT 2006

Content:
• Balanced Search Trees
– AVL Trees [GT 10.2, LD 7.1]
– Multi-way search trees [GT 10.4, LD 7.2, CL 18.1-2]
– Information on B-trees [GT 14.3, LD 7.2, CL 18.1-2]
• Priority Queues
– Heaps [GT 8.1, 8.3, LD 9.1, CL 6.1-5]

AVL Trees
• A BST
• Every node v has a balance b = LeftHeight(v) - RightHeight(v), which must stay within [-1..1]
(Example: insert 44, 17, 78, 32, 50, 88, 62)

Worst AVL Tree
• A worst AVL tree is a maximally unbalanced AVL tree…
…or a tree of a given height with a minimum number of internal nodes

AVL-Tree Insertion… (Goodrich/Tamassia)
How to insert an item with key 50?
• Perform a LookUp(50)
• While searching for 50, keep track of the last passed node with balance ≠ 0 → critical node
• If not found, insert at the leaf where the search ended
• Recompute balance on the way back
• Check critical node! If balance ∉ [-1..1], rebalance!
[Figure: the tree after inserting 50, as key (balance): 30 (-1), 20 (1), 10 (0), 60 (1), 40 (-1), 70 (0), 50 (0)]

AVL-Tree Insertion & Re-balance…
Now try an item with key 15...
• Same steps as for 50: LookUp(15), keep track of the critical node, insert at the leaf where the search ended, recompute balance on the way back, check the critical node
[Figure: after inserting 15 below 10: 10 (-1), 15 (0), 20 (2), 30 (?); node 20 is the critical node and is out of balance]

Time to re-balance…
• Label the critical node and its 2 descendants on the path to "15" as a, b, c, such that a < b < c in an in-order traversal
• Re-structure the nodes a, b and c so that b has a and c as children!
• Update balance!
[Figure: a = 10, b = 15, c = 20; after re-structuring, 15 has the children 10 and 20, each with balance 0]

Removal of a node...
Same thing, but in reverse!
1. Perform a LookUp and Remove as in an ordinary binary tree
2. Update the balance on the way back to the root
3. If too unbalanced: Re-structure! ...but:
• Label the critical node, the child on the deepest side, and its descendants on the deepest side as a, b, c, such that a < b < c in an in-order traversal
• Re-structure as previous
• Go to #2 and continue to update and check towards the root (we may have to re-balance more than once!)

Tri-node restructuring = rotations....
Other authors use left and right rotations:
Single left rotation:
• left part of the subtree (a and j) is lowered
• We have "rotated (up) b over a"...
[Figure: nodes a, b, c with subtrees j, k, l, m, before and after the rotation]

Double rotations...
Two rotations are needed when the nodes to re-balance are placed in a zig-zag pattern...
1. Rotate up b over a
2. Rotate up b over c
Note: Labeling of a, b and c same as before!
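The re-structuring steps above can be sketched in Python. This is a minimal illustration, not the lecture's own code: the names Node, rotate_left, rotate_right, rebalance and insert are assumptions made for the example, heights are stored in the nodes instead of balances, and the balance is computed as LeftHeight(v) - RightHeight(v) as on the slides. It rebuilds the example tree and then inserts 15, which triggers the double (zig-zag) rotation at node 20.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height of this subtree, counted in nodes


def height(v):
    return v.height if v else 0


def balance(v):
    # Slide convention: LeftHeight(v) - RightHeight(v)
    return height(v.left) - height(v.right)


def update(v):
    v.height = 1 + max(height(v.left), height(v.right))


def rotate_right(c):
    # "Rotate up b over c", where b is the left child of c
    b = c.left
    c.left, b.right = b.right, c
    update(c)
    update(b)
    return b


def rotate_left(a):
    # "Rotate up b over a", where b is the right child of a
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b


def rebalance(v):
    update(v)
    if balance(v) > 1:               # left side too deep
        if balance(v.left) < 0:      # zig-zag pattern: double rotation
            v.left = rotate_left(v.left)
        return rotate_right(v)
    if balance(v) < -1:              # right side too deep
        if balance(v.right) > 0:     # zig-zag pattern: double rotation
            v.right = rotate_right(v.right)
        return rotate_left(v)
    return v


def insert(v, key):
    # Ordinary BST insertion; balances are rechecked on the way back up
    if v is None:
        return Node(key)
    if key < v.key:
        v.left = insert(v.left, key)
    elif key > v.key:
        v.right = insert(v.right, key)
    return rebalance(v)


def inorder(v):
    return inorder(v.left) + [v.key] + inorder(v.right) if v else []


root = None
for k in [30, 20, 60, 10, 40, 70, 50, 15]:
    root = insert(root, k)
# After inserting 15, node 15 has replaced 20 as the left child of 30,
# with 10 and 20 as its children.
```

Rechecking the balance in every node on the way back is a simplification; the slides track only the critical node during insertion, which leads to the same tree.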
New approach: relax some condition...
• AVL-Tree: binary tree, accepts a small imbalance
• Recall:
– Full binary tree: nonempty; degree is either 0 or 2 for each node
– Perfect binary tree: full, all leaves have the same depth
• Can we build and maintain a perfect tree (if we skip "binary")?
→ we would always know the worst search time exactly!

(2,3) trees [or (2,4) trees, or (a,b)-trees…]
Previously:
• A single "pivot element"
• If larger we search to the right
• If smaller we search to the left
Now:
• Multiple pivot elements
• No. of children = no. of pivot elements + 1

(2,3) trees, (2,4) trees, or (a,b) trees...
• Each node is either a leaf, or has c children where a ≤ c ≤ b
• LookUp works approximately as before
• Insert must check that a node does not overflow (then we split the node)
• Delete must check that a node does not become empty (then we transfer or merge nodes)

Insert in (a,b)-tree where a=2 and b=3
• As long as there is room in the child we find, add element to that child...
• If full, split and push the selected pivot element upwards...
...may happen repeatedly
[Figure: starting from the single key 5, the tree after Insert(10), Insert(15), Insert(18) and Insert(17)]

Delete in a (2,3)-tree
Three cases:
1. No constraints are violated by removal
2. A leaf is removed (becomes empty)
→ transfer some other key to that leaf (…ok if we have a sibling with 2+ elements), or
→ merge (fuse) it with a neighbor
3. An internal node becomes empty
– Root: replace with in-order pred. or succ.
→ repair inconsistencies with suitable merge and transfer operations…
[Figure: examples Delete(25), Delete(18) ("No pivot?? Too many children!", "Transfer of 30 and 35") and Delete(20) ("Internal underflow...", "...merge nodes", "Replace...", "...merge leaves")]

Properties of a (2,3) tree
• Always a perfect tree
• A minimal tree of height h will have $n = 2^{h+1} - 1$ nodes (it's a full binary tree with a full set of nodes at all levels)
• A maximal (2,3)-tree will have a branching factor of 3, thus $n = \sum_{i=0}^{h} 3^i = (3^{h+1} - 1)/2$ nodes
…2 keys in every node → no. of keys in a tree: $k = 3^{h+1} - 1$
• Thus the height $h = \log_3(k+1) - 1 \approx \log_3 k$

B-Tree
• Used to keep an index over external data (e.g. content of a disc)
• It's only an (a,b)-tree where a = b/2
• We may now choose b so that b-1 keys and b references to children (other disc blocks) fit into a single disc block
• By defining a = b/2 we will always fill up a disc block when two blocks are merged!

ADT Priority Queue:
– Linearly ordered set K of keys
– We store pairs <k,I> (as in Dictionary); multiple pairs with the same key are allowed. The key denotes priority
– Items are retrieved by priority, i.e. by the minimal key.

Implementing Priority Queues
[Figure: binary tree of keys with the "last" leaf marked]
• This is a complete binary tree!
• This is also called a HEAP

Updates on a Heap Structure
• DeleteMin = deletion of the root
– Replace root by last leaf
– Restore partial order by swapping nodes downwards: "down-heap bubbling"
• Insert
– Insert new node after last leaf
– Restore partial ordering by "up-heap bubbling"

HEAP Properties:
• size(), minElement(), minKey(): O(1)
• insertItem(), removeMin(): O(log n)
>> HeapSort: O(n log n) sorting algorithm
A complete binary tree...
• Compact vector representation
• Bubble-up and bubble-down have fast implementations
Recall vector representation of BST!

HEAP example – bubble-up after insert(4)
We store pairs of <key, data> in the tree.
Recall tree representation in a table:
• getLeftChild(i) = 2*i+1
• getRightChild(i) = 2*i+2
...where i is the table index, not the key!
...in [L/D] the root starts at 0...
In [G/T] the root starts at index 1 and the representation is:
– getLeft(i) = 2*i
– getRight(i) = 2*i+1

Heap Sort
Kinds of heaps:
• minKey in the root => (Min)Heap
• maxKey in the root => MaxHeap
Heap Sort: input D[0..n]
• Heapify D => Max Heap (worst case O(n log n))
• For i from 0 to n-1: DeleteMax => put in D[n-i] (worst case O(log n) for restoring the heap D[0..n-i-1])
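The heap updates and Heap Sort described above can be sketched in Python on a plain list. This is a minimal illustration under a few assumptions: the root sits at index 0 (so the children of i are 2*i+1 and 2*i+2), only keys are stored instead of <key, data> pairs, and the function names up_heap, down_heap, insert_item, remove_min and heap_sort are chosen for the example. The comparison is passed in as `before`, so the same bubbling code serves a min-heap (operator.lt) and a max-heap (operator.gt).

```python
import operator


def up_heap(d, i, before):
    # Bubble d[i] upwards while it belongs above its parent
    while i > 0:
        parent = (i - 1) // 2
        if not before(d[i], d[parent]):
            return
        d[i], d[parent] = d[parent], d[i]
        i = parent


def down_heap(d, i, size, before):
    # Bubble d[i] downwards inside the heap d[0:size]
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        top = i
        if left < size and before(d[left], d[top]):
            top = left
        if right < size and before(d[right], d[top]):
            top = right
        if top == i:
            return
        d[i], d[top] = d[top], d[i]
        i = top


def insert_item(heap, key):
    heap.append(key)                      # new node after the last leaf
    up_heap(heap, len(heap) - 1, operator.lt)


def remove_min(heap):
    smallest = heap[0]
    last = heap.pop()                     # take away the last leaf
    if heap:
        heap[0] = last                    # replace root by last leaf
        down_heap(heap, 0, len(heap), operator.lt)
    return smallest


def heap_sort(d):
    n = len(d)
    # Heapify into a max-heap by inserting one element at a time
    for i in range(1, n):
        up_heap(d, i, operator.gt)
    # Repeated DeleteMax: move the maximum behind the shrinking heap
    for end in range(n - 1, 0, -1):
        d[0], d[end] = d[end], d[0]
        down_heap(d, 0, end, operator.gt)
    return d
```

For example, calling insert_item on an empty list with the keys 5, 9, 4, 7, 2 and then calling remove_min five times returns the keys in increasing order, and heap_sort sorts a list in place. Building the heap by repeated insertion matches the O(n log n) worst case stated for Heapify.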