Algorithms R OBERT S EDGEWICK | K EVIN W AYNE 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms F O U R T H E D I T I O N R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion Binary search trees Definition. A BST is a binary tree in symmetric order. root a left link A binary tree is either: a subtree ・Empty. ・Two disjoint binary trees (left and right). right child of root null links Anatomy of a binary tree Symmetric order. Each node has a key, and every node’s key is: ・ ・Smaller than all keys in its right subtree. Larger than all keys in its left subtree. parent of A and R left link of E S E X A R C keys smaller than E key H 9 value associated with R keys larger than E Anatomy of a binary search tree 3 BST representation in Java Java definition. A BST is a reference to a root Node. A Node is comprised of four fields: ・A Key and a Value. ・A reference to the left and right subtree. smaller keys larger keys private class Node { private Key key; private Value val; private Node left, right; public Node(Key key, Value val) { this.key = key; this.val = val; } } BST Node key left BST with smaller keys val right BST with larger keys Binary search tree Key and Value are generic types; Key is Comparable 4 BST implementation (skeleton) public class BST<Key extends Comparable<Key>, Value> { private Node root; private class Node { /* see previous slide */ root of BST } public void put(Key key, Value val) { /* see next slides */ } public Value get(Key key) { /* see next slides */ } public void delete(Key key) { /* see next slides */ } public Iterable<Key> iterator() { /* see next slides */ } } 5 Binary search tree demo Search. If less, go left; if greater, go right; if equal, search hit. successful search for H S E X A R C H M 6 Binary search tree demo Insert. If less, go left; if greater, go right; if null, insert. insert G S E X A R C H G M 7 BST search: Java implementation Get. Return value corresponding to given key, or null if no such key. public Value get(Key key) { Node x = root; while (x != null) { int cmp = key.compareTo(x.key); if (cmp < 0) x = x.left; else if (cmp > 0) x = x.right; else if (cmp == 0) return x.val; } return null; } Cost. Number of compares is equal to 1 + depth of node. 8 BST insert Put. Associate value with key. inserting L S E X A ・ ・Key not in tree ⇒ add new node. H C Search for key, then two cases: Key in tree ⇒ reset value. R M P search for L ends at this null link S E X A R H C M create new node P L S E X A R C H M reset links on the way up L P Insertion into a BST 9 BST insert: Java implementation Put. Associate value with key. public void put(Key key, Value val) { root = put(root, key, val); } concise, but tricky, recursive code; read carefully! private Node put(Node x, Key key, Value val) { if (x == null) return new Node(key, val); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; return x; } Cost. Number of compares is equal to 1 + depth of node. 10 best case H S C Tree shape R E A X ・Many BSTs correspond to same set of keys. typical case S best case Number of compares for search/insert is equal to 1 + depth of node. ・ H E X S C best case R X R X A R C E R S A worst case C BST possibilities E H H C worst case X S E H C H H C typical case X A R A S E S C E X typical case H A R E A A R S X A Tree shape worst casedepends on order of insertion. Remark. C E BST possibilities H 11 R BST insertion: random order visualization Ex. Insert keys in random order. 12 Correspondence between BSTs and quicksort partitioning P H T D O E A I S U Y C M L Remark. Correspondence is 1-1 if array has no duplicate keys. 13 BSTs: mathematical analysis Proposition. If N distinct keys are inserted into a BST in random order, the expected number of compares for a search/insert is ~ 2 ln N. Pf. 1-1 correspondence with quicksort partitioning. Proposition. [Reed, 2003] If N distinct keys are inserted in random order, expected height of tree is ~ 4.311 ln N. How Tall is a Tree? How Tall is a Tree? Bruce Reed Bruce Reed CNRS, Paris, France reed@moka.ccr.jussieu.fr CNRS, Paris, France reed@moka.ccr.jussieu.fr ABSTRACT purpose of this note to have: Let H~ be the height of a random binaryofsearch tree toonprove n purpose this note that for /3 -- ½ + ~ 3, nodes. We show that there exists constants a = 4.31107... have: Let H~ be the height of a random binary search tree on n THEOREM 1. E(H~) n d / 3 = 1.95... such that E(H~) = c~logn - / 3 1 o g l o g n + nodes. We show that there existsa constants a = 4.31107... . O(1), We also show that Var(H~) =THEOREM O(1). 1. E(H~) = ~ l o g nVar(Hn) - / 3 1 o g l=o gO(1) n + O(1) a a n d / 3 = 1.95... such that E(H~) = c~logn - / 3 1 o g l o g n + Var(Hn) = O(1) . O(1), We also show that Var(H~) = O(1). R e m a r k By the defini Categories and Subject Descriptors nition given more sug R e m a r k By the definition of a, /3 = 7"g~" 3~ is The first d we will see. Categories and Subject Descriptors E.2 [ D a t a S t r u c t u r e s ] : Trees nition given is more suggestive ofaswhy this value is corre For more information o as we will see. E.2 [ D a t a S t r u c t u r e s ] : Trees may consult [6],[7], [1],o For more information on random binary search trees, 1. THE RESULTS R e m a r k After I announ [6],[7],of[1], [2], [9], [4], and [8]. A binary search tree is a binary may tree consult to each node which 1. THE RESULTS an alternativ R e m aaxe r k drawn After I from announced these developed results, Drmota(unpubl we have associated a key; these keys some O(1) using completely A binary search tree is a binary tree to each node of which developed proof of the fact that Var(Hn) totally the key at v cannotanbealternative larger than proofs illuminate we have associated a key; these keys axeordered drawn set fromand some 14differ O(1) using completely different techniques. As our t the key at its right child nor smaller than the key at its left decided to submit theha j totally ordered set and the key at v cannot be larger than proofs illuminate different aspects of the problem, we child. a binary search tree T and a new key k, we and asked thatsame theyjour be the key at its right child nor smaller thanGiven the key at its left decided to submit the journal versions to the k into T by the tree starting at the root child. Given a binary search treeinsert T and a new keytraversing k, we ABSTRACT But… Worst-case height is N. (exponentially small chance when keys are inserted in random order) ST implementations: summary guarantee average case implementation ordered ops? operations on keys search insert search hit insert sequential search (unordered list) N N N/2 N no equals() binary search (ordered array) lg N N lg N N/2 yes compareTo() BST N N 1.39 lg N 1.39 lg N next compareTo() 15 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion Minimum and maximum Minimum. Smallest key in table. Maximum. Largest key in table. max()max S min() E min X A R C H M Examples of BST order queries Q. How to find the min / max? 18 Floor and ceiling Floor. Largest key ≤ a given key. Ceiling. Smallest key ≥ a given key. floor(G) max() S min() E X A R C H ceiling(Q) M floor(D) Examples of BST order queries Q. How to find the floor / ceiling? 19 Computing the floor Case 1. [k equals the key at root] finding floor(G) The floor of k is k. S E X A R H C Case 2. [k is less than the key at root] M G is less than S so floor(G) must be on the left The floor of k is in the left subtree. S E Case 3. [k is greater than the key at root] The floor of k is in the right subtree (if there is any key ≤ k in right subtree); X A R H C G is greater than E so floor(G) could be M on the right S E otherwise it is the key in the root. X A R H C M floor(G)in left subtree is null S E A C result X R H M 20 Computing the floor function Computing the floor finding floor(G) public Key floor(Key key) { Node x = floor(root, key); if (x == null) return null; return x.key; } private Node floor(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp == 0) return x; if (cmp < 0) return floor(x.left, key); Node t = floor(x.right, key); if (t != null) return t; else return x; S E X A R H C M G is less than S so floor(G) must be on the left S E X A R H C G is greater than E so floor(G) could be M on the right S E X A R H C M floor(G)in left subtree is null S } E A C result X R H M 21 Computing the floor function Subtree counts In each node, we store the number of nodes in the subtree rooted at that node; to implement size(), return the count at the root. node count 8 6 S E 2 A 3 1 2 C H 1 X R 1 M Remark. This facilitates efficient implementation of rank() and select(). 22 BST implementation: subtree counts private class Node { private Key key; private Value val; private Node left; private Node right; private int count; } public int size() { return size(root); } private int size(Node x) { if (x == null) return 0; ok to call return x.count; when x is null } number of nodes in subtree private Node put(Node x, Key key, Value val) { if (x == null) return new Node(key, val, 1); int cmp = key.compareTo(x.key); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); else if (cmp == 0) x.val = val; x.count = 1 + size(x.left) + size(x.right); return x; } 23 Rank Rank. How many keys < k ? Easy recursive algorithm (3 cases!) node count 8 6 S E 2 A 3 1 2 C H 1 X R 1 M public int rank(Key key) { return rank(key, root); } private int rank(Key key, Node x) { if (x == null) return 0; int cmp = key.compareTo(x.key); if (cmp < 0) return rank(key, x.left); else if (cmp > 0) return 1 + size(x.left) + rank(key, x.right); else if (cmp == 0) return size(x.left); } 24 Inorder traversal ・Traverse left subtree. ・Enqueue key. ・Traverse right subtree. public Iterable<Key> keys() { Queue<Key> q = new Queue<Key>(); inorder(root, q); return q; } private void inorder(Node x, Queue<Key> q) { if (x == null) return; inorder(x.left, q); q.enqueue(x.key); inorder(x.right, q); } BST key left val right BST with larger keys BST with smaller keys smaller keys, in order key larger keys, in order all keys, in order Property. Inorder traversal of a BST yields keys in ascending order. 25 BST: ordered symbol table operations summary sequential search binary search BST search N lg N h insert N N h min / max N 1 h floor / ceiling N lg N h rank N lg N h select N 1 h ordered iteration N log N N N h = height of BST (proportional to log N if keys inserted in random order) order of growth of running time of ordered symbol table operations 26 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion ST implementations: summary guarantee average case implementation ordered iteration? operations on keys search insert delete search hit insert delete sequential search (linked list) N N N N/2 N N/2 no equals() binary search (ordered array) lg N N N lg N N/2 N/2 yes compareTo() BST N N N 1.39 lg N 1.39 lg N ??? yes compareTo() Next. Deletion in BSTs. 29 BST deletion: lazy approach To remove a node with a given key: ・Set its value to null. ・Leave key in tree to guide search (but don't consider it equal in search). E E A S A S delete I C H ☠ C I R N H tombstone R N Cost. ~ 2 ln N' per insert, search, and delete (if keys in random order), where N' is the number of key-value pairs ever inserted in the BST. Unsatisfactory solution. Tombstone (memory) overload. 30 Deleting the minimum To delete the minimum key: ・Go left until finding a node with a null left link. ・Replace that node by its right link. ・Update subtree counts. go left until reaching null left link A S X E R H C M return that node’s right link S X E R A public void deleteMin() { root = deleteMin(root); H C } private Node deleteMin(Node x) { if (x.left == null) return x.right; x.left = deleteMin(x.left); x.count = 1 + size(x.left) + size(x.right); return x; } M available for garbage collection update links and node counts after recursive calls 7 S 5 X E R C H M Deleting the minimum in a BST 31 Hibbard deletion To delete a node with key k: search for node t containing key k. Case 0. [0 children] Delete t by setting parent link to null. deleting C update counts after recursive calls S S E X A R C E R M node to delete replace with null link 1 E X A R H H H S 5 X A 7 M M available for garbage collection C 32 Hibbard deletion To delete a node with key k: search for node t containing key k. Case 1. [1 child] Delete t by replacing parent link. deleting R update counts after recursive calls S S E X A R C H M node to delete E C X E M X H A replace with child link S 5 H A 7 C M available for garbage collection R 33 Hibbard deletion deleting E node to delete S To delete a node with key k: search for node tEcontainingX key k. A R H C M Case 2. [2 children] ・Find successor x of t. ・Delete the minimum in t's right subtree. A C ・Put x in t's spot. search for key E t S E node to delete S E X A H M x search for key E go right, then go left until reaching null left link t.left successor M min(t.right) H X deleteMin(t.right) M 7 X deleteMin(t.right) S 5 H A X R C S E E R R H C min(t.right) S H X x M C S E x t.left still a BST successor H A t A R go right, then go left until reaching null left link R C X don't garbage collect x but x deleting E x has no left child M update links and node counts after recursive calls Deletion in a BST 34 Hibbard deletion: Java implementation public void delete(Key key) { root = delete(root, key); } private Node delete(Node x, Key key) { if (x == null) return null; int cmp = key.compareTo(x.key); if (cmp < 0) x.left = delete(x.left, key); else if (cmp > 0) x.right = delete(x.right, key); else { if (x.right == null) return x.left; if (x.left == null) return x.right; Node t = x; x = min(t.right); x.right = deleteMin(t.right); x.left = t.left; } x.count = size(x.left) + size(x.right) + 1; return x; search for key no right child no left child replace with successor update subtree counts } 35 Hibbard deletion: analysis Unsatisfactory solution. Not symmetric. Surprising consequence. Trees not random (!) ⇒ sqrt (N) per op. Longstanding open problem. Simple and efficient delete for BSTs. 36 ST implementations: summary guarantee average case implementation ordered iteration? operations on keys search insert delete search hit insert delete sequential search (linked list) N N N N/2 N N/2 no equals() binary search (ordered array) lg N N N lg N N/2 N/2 yes compareTo() BST N N N 1.39 lg N 1.39 lg N √N yes compareTo() other operations also become √N if deletions allowed Next lecture. Guarantee logarithmic performance for all operations. 37 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion Algorithms R OBERT S EDGEWICK | K EVIN W AYNE 3.2 B INARY S EARCH T REES ‣ BSTs ‣ ordered operations Algorithms F O U R T H E D I T I O N R OBERT S EDGEWICK | K EVIN W AYNE http://algs4.cs.princeton.edu ‣ deletion