Part-F2 AVL Trees

[Figure: a small binary search tree with keys 6, 3, 8, 4 and nodes labelled v and z]

AVL Tree Definition (§ 9.2)
AVL trees are balanced.
An AVL tree is a binary search tree T such that for every internal node v of T, the heights of the children of v differ by at most 1.
An example of an AVL tree where the heights are shown next to the nodes:
[Figure: AVL tree with keys 44, 17, 78, 32, 50, 88, 48, 62; the height of each node is shown next to it]

Balanced nodes
An internal node is balanced if the heights of its two children differ by at most 1.
Otherwise, the internal node is unbalanced.

Height of an AVL Tree
Fact: The height of an AVL tree storing n keys is O(log n).
Proof: Let us bound $n(h)$, the minimum number of internal nodes of an AVL tree of height h.
We easily see that $n(1) = 1$ and $n(2) = 2$.
For $h > 2$, an AVL tree of height h with the minimum number of nodes contains the root node, one AVL subtree of height $h-1$ and another of height $h-2$.
That is, $n(h) = 1 + n(h-1) + n(h-2)$.
Knowing $n(h-1) > n(h-2)$, we get $n(h) > 2n(h-2)$. So
$n(h) > 2n(h-2)$, $n(h) > 4n(h-4)$, $n(h) > 8n(h-6)$, … and, by induction, $n(h) > 2^i n(h-2i)$.
Solving the base case (choosing i so that $h-2i$ is 1 or 2) we get: $n(h) > 2^{h/2-1} \cdot n(1) = 2^{h/2-1}$.
Taking logarithms (base 2): $h < 2\log n(h) + 2$.
Thus the height of an AVL tree is O(log n).
[Figure: an AVL tree whose root has subtrees of heights h-1 and h-2; small subtrees labelled n(1) and n(2)]

Insertion in an AVL Tree
Insertion is as in a binary search tree.
Always done by expanding an external node.
Example:
[Figure: the tree with keys 44, 17, 78, 32, 50, 88, 48, 62 before insertion, and after insertion of 54 (node w). After insertion the tree is no longer balanced; the nodes are labelled c=z (78), a=y (50), b=x (62).]

Names of important nodes
w: the newly inserted node. (The insertion process follows the binary search tree method.)
  The heights of some nodes in T might be increased after inserting a node.
  Those nodes must be on the path from w to the root.
  Other nodes are not affected.
z: the first node we encounter in going up from w toward the root such that z is unbalanced.
y: the child of z with the greater height.
  y must be an ancestor of w. (Why? Because z is unbalanced after inserting w.)
x: the child of y with the greater height.
  The height of the sibling of x is smaller than that of x. (Otherwise, the height of y could not have increased.)
  x must be an ancestor of w (or w itself).
See the figure on the previous slide.

Algorithm restructure(x):
Input: A node x of a binary search tree T that has both a parent y and a grandparent z.
Output: Tree T after a trinode restructuring.
1. Let (a, b, c) be the listing, in increasing (inorder) order, of the nodes x, y, and z. Let T0, T1, T2, T3 be a left-to-right (inorder) listing of the four subtrees of x, y, and z not rooted at x, y, or z.
2. Replace the subtree rooted at z with a new subtree rooted at b.
3. Let a be the left child of b and let T0 and T1 be the left and right subtrees of a, respectively.
4. Let c be the right child of b and let T2 and T3 be the left and right subtrees of c, respectively.

Restructuring (as Single Rotations)
Single rotations:
[Figure: single rotation, case a=z, b=y, c=x. Before: y is the right child of z and x is the right child of y. After: b=y is the root of the subtree with children a=z and c=x; the subtrees T0, T1, T2, T3 keep their left-to-right order.]
[Figure: single rotation, case c=z, b=y, a=x (the mirror image). Before: y is the left child of z and x is the left child of y. After: b=y is the root of the subtree with children a=x and c=z.]

Restructuring (as Double Rotations)
Double rotations:
[Figure: double rotation, case a=z, c=y, b=x. Before: y is the right child of z and x is the left child of y. After: b=x is the root of the subtree with children a=z and c=y; the subtrees T0, T1, T2, T3 keep their left-to-right order.]
[Figure: double rotation, case c=z, a=y, b=x (the mirror image). Before: y is the left child of z and x is the right child of y. After: b=x is the root of the subtree with children a=y and c=z.]

Insertion Example, continued
[Figure: the unbalanced tree after inserting 54, with z = 78, y = 50, x = 62 and the subtrees T0 to T3 marked; and the balanced tree after restructure(x), in which x = 62 is the root of the restructured subtree with children y = 50 and z = 78]

Theorem: After an insertion, one restructure operation is enough to ensure that the whole tree is balanced.
Proof: Left to the reader.

Removal in an AVL Tree
Removal begins as in a binary search tree, which means the node removed will become an empty external node.
Its parent, w, may cause an imbalance.
Example:
[Figure: the tree with keys 44, 17, 62, 32, 50, 78, 48, 54, 88 before the deletion of 32, and the same tree after the deletion]

Rebalancing after a Removal
Let z be the first unbalanced node encountered while travelling up the tree from w.
Also, let y be the child of z with the larger height, and let x be the child of y defined as follows:
  If one of the children of y is taller than the other, choose x as the taller child of y.
  If both children of y have the same height, select x to be the child of y on the same side as y (i.e., if y is the left child of z, then x is the left child of y; and if y is the right child of z, then x is the right child of y).

Rebalancing after a Removal
We perform restructure(x) to restore balance at z.
As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached.
[Figure: before restructuring, a=z is 44 (with w = 17 as its left child), b=y is 62 and c=x is 78; after restructure(x), 62 is the root with children 44 and 78]

Unbalanced after restructuring
[Figure: example with nodes a=z, b=y, c=x and w, subtree heights marked h=3, h=4 and h=5, and the labels "Unbalanced" and "balanced"]

Running Times for AVL Trees
a single restructure is O(1)
  using a linked-structure binary tree
find is O(log n)
  height of tree is O(log n), no restructures needed
insert is O(log n)
  initial find is O(log n)
  restructuring up the tree, maintaining heights, is O(log n)
remove is O(log n)
  initial find is O(log n)
  restructuring up the tree, maintaining heights, is O(log n)

Part-G1 Merge Sort
[Figure: merge-sort tree for 7 2 9 4 → 2 4 7 9, with children 7 2 → 2 7 and 9 4 → 4 9 and leaves 7, 2, 9, 4]

Divide-and-Conquer (§ 10.1.1)
Divide-and-conquer is a general algorithm design paradigm:
  Divide: divide the input data S into two disjoint subsets S1 and S2
  Recur: solve the subproblems associated with S1 and S2
  Conquer: combine the solutions for S1 and S2 into a solution for S
The base cases for the recursion are subproblems of size 0 or 1.
Merge-sort is a sorting
algorithm based on the divide-and-conquer paradigm
Like heap-sort
  It uses a comparator
  It has O(n log n) running time
Unlike heap-sort
  It does not use an auxiliary priority queue
  It accesses data in a sequential manner (suitable to sort data on a disk)

Merge-Sort (§ 10.1)
Merge-sort on an input sequence S with n elements consists of three steps:
  Divide: partition S into two sequences S1 and S2 of about n/2 elements each
  Recur: recursively sort S1 and S2
  Conquer: merge S1 and S2 into a single sorted sequence

Algorithm mergeSort(S, C)
  Input sequence S with n elements, comparator C
  Output sequence S sorted according to C
  if S.size() > 1
    (S1, S2) ← partition(S, n/2)
    mergeSort(S1, C)
    mergeSort(S2, C)
    S ← merge(S1, S2)

Merging Two Sorted Sequences
The conquer step of merge-sort consists of merging two sorted sequences A and B into a sorted sequence S containing the union of the elements of A and B.
Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time.

Algorithm merge(A, B)
  Input sequences A and B with n/2 elements each
  Output sorted sequence of A ∪ B
  S ← empty sequence
  while ¬A.isEmpty() ∧ ¬B.isEmpty()
    if A.first().element() < B.first().element()
      S.insertLast(A.remove(A.first()))
    else
      S.insertLast(B.remove(B.first()))
  while ¬A.isEmpty()
    S.insertLast(A.remove(A.first()))
  while ¬B.isEmpty()
    S.insertLast(B.remove(B.first()))
  return S

Merge-Sort Tree
An execution of merge-sort is depicted by a binary tree:
  each node represents a recursive call of merge-sort and stores
    the unsorted sequence before the execution and its partition
    the sorted sequence at the end of the execution
  the root is the initial call
  the leaves are calls on subsequences of size 0 or 1
[Figure: merge-sort tree for 7 2 9 4 → 2 4 7 9, with children 7 2 → 2 7 and 9 4 → 4 9 and leaves 7, 2, 9, 4]

Execution Example
Partition
[Figure: merge-sort tree for the input 7 2 9 4 3 8 6 1, whose sorted output is 1 2 3 4 6 7 8 9; at the root the input is partitioned into 7 2 9 4 and 3 8 6 1]

Execution Example (cont.)
Recursive call, partition: the call on 7 2 9 4 partitions it into 7 2 and 9 4
[Figure on this and the following slides: the merge-sort tree for 7 2 9 4 3 8 6 1 with the current step highlighted]

Execution Example (cont.)
Recursive call, partition: the call on 7 2 partitions it into 7 and 2

Execution Example (cont.)
Recursive call, base case: the call on 7

Execution Example (cont.)
Recursive call, base case: the call on 2

Execution Example (cont.)
Merge: 7 and 2 are merged into 2 7

Execution Example (cont.)
Recursive call, …, base case, merge: the call on 9 4 returns 4 9

Execution Example (cont.)
Merge: 2 7 and 4 9 are merged into 2 4 7 9

Execution Example (cont.)
Recursive call, …, merge, merge: the call on 3 8 6 1 returns 1 3 6 8

Execution Example (cont.)
Merge: 2 4 7 9 and 1 3 6 8 are merged into 1 2 3 4 6 7 8 9
[Figure: the completed merge-sort tree for the input 7 2 9 4 3 8 6 1]

Analysis of Merge-Sort
The height h of the merge-sort tree is O(log n)
  at each recursive call we divide the sequence in half
The overall amount of work done at the nodes of depth i is O(n)
  we partition and merge $2^i$ sequences of size $n/2^i$
  we make $2^{i+1}$ recursive calls
Thus, the total running time of merge-sort is O(n log n)

depth   #seqs   size
0       1       $n$
1       2       $n/2$
i       $2^i$   $n/2^i$
…       …       …

Summary of Sorting Algorithms

Algorithm        Time          Notes
selection-sort   $O(n^2)$      slow; in-place; for small data sets (< 1K)
insertion-sort   $O(n^2)$      slow; in-place; for small data sets (< 1K)
heap-sort        O(n log n)    fast; in-place; for large data sets (1K to 1M)
merge-sort       O(n log n)    fast; sequential data access; for huge data sets (> 1M)
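
The mergeSort and merge pseudocode from the Merge-Sort slides can be sketched in Python roughly as follows. This is a minimal sketch under a few assumptions that are not in the slides: the function names merge_sort and merge are illustrative, collections.deque stands in for the doubly linked list, Python's built-in < replaces the comparator C, and a new sorted list is returned instead of sorting S in place.

```python
from collections import deque


def merge(a, b):
    """Merge two sorted deques a and b into one sorted list, emptying both."""
    s = []
    while a and b:                      # both sequences still have elements
        if a[0] < b[0]:
            s.append(a.popleft())       # the front of a is the smaller element
        else:
            s.append(b.popleft())       # otherwise take the front of b
    while a:                            # copy whatever is left in a
        s.append(a.popleft())
    while b:                            # copy whatever is left in b
        s.append(b.popleft())
    return s


def merge_sort(seq):
    """Return a new sorted list with the elements of seq (seq is not modified)."""
    items = list(seq)
    if len(items) <= 1:                 # base case: subproblem of size 0 or 1
        return items
    mid = len(items) // 2               # divide: about n/2 elements each
    s1 = merge_sort(items[:mid])        # recur on the first half
    s2 = merge_sort(items[mid:])        # recur on the second half
    return merge(deque(s1), deque(s2))  # conquer: merge the sorted halves


# The input used in the execution example of the slides.
print(merge_sort([7, 2, 9, 4, 3, 8, 6, 1]))  # [1, 2, 3, 4, 6, 7, 8, 9]
```

Each element is moved once per level of the merge-sort tree, which matches the O(n) work per depth and the O(n log n) total in the analysis.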