A Splay Tree Implementation
by: Thomas Grindinger & Benjamin Hoipkemier

Introduction

Splay trees are a type of binary search tree developed by Daniel Sleator and Robert Tarjan. We have based our implementation on their work as discussed in their paper Self-Adjusting Binary Trees (Sleator & Tarjan, 1983). Splay trees fulfill the canonical binary tree property: every node has at most 2 children (referred to as left and right). They also fulfill the binary search tree property: every key in a node's left subtree is less than or equal to the node's key, and every key in its right subtree is greater than or equal to it. This permits a node with a given key to be located in O(h) time, where h is the height of the tree. The basic operations of a splay tree (insert, look-up, and remove) are performed in O(log(n)) amortized time over a sufficiently long sequence of operations.

Splay Operation

The splay operation, from which the splay tree gets its name, is performed every time one of the basic splay tree operations occurs. This self-restructuring is why splay trees are also known as self-adjusting binary trees. The splay operation takes a given node and, through a series of “splay steps”, moves this node to the root. A splay step normally consists of two single rotations and is one of the following: zig-zag, zag-zig, zig-zig, or zag-zag. A zig is also known as a right rotation, while a zag is also known as a left rotation. Figure 1 depicts the zig and the zag operations. Splay steps are described as pairs of rotations because the order in which the two rotations are executed matters when a node is inline with its grandparent. When the splay operation is called on a node, splay steps are applied repeatedly until the node reaches the root of the splay tree.

Note: There may be an odd number of rotations that need to occur in order to splay the node to the root. In that case the final splay step is a single zig or zag, performed when the node's parent is the root.

Figure 1: demonstration of single rotation operations.
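The zig and zag rotations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Node class, its parent field, and the names rotate_right, rotate_left, and inorder are hypothetical and introduced here only for the example.

```python
class Node:
    """Hypothetical node type used only for this sketch."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def _replace_child(grandparent, old, new):
    """Hang `new` where `old` used to hang under `grandparent` (may be None)."""
    new.parent = grandparent
    if grandparent is not None:
        if grandparent.left is old:
            grandparent.left = new
        else:
            grandparent.right = new


def rotate_right(n):
    """Zig: lift n, which must be a left child, above its parent."""
    p = n.parent
    g = p.parent
    p.left = n.right              # n's right subtree moves under p
    if n.right is not None:
        n.right.parent = p
    n.right = p                   # p becomes n's right child
    p.parent = n
    _replace_child(g, p, n)
    return n


def rotate_left(n):
    """Zag: lift n, which must be a right child, above its parent."""
    p = n.parent
    g = p.parent
    p.right = n.left              # n's left subtree moves under p
    if n.left is not None:
        n.left.parent = p
    n.left = p                    # p becomes n's left child
    p.parent = n
    _replace_child(g, p, n)
    return n


def inorder(node):
    """Keys in sorted order; a rotation must leave this sequence unchanged."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# Example: 2 is the left child of 4, so a zig lifts 2 to the top.
top = Node(4, Node(2, Node(1), Node(3)), Node(5))
before = inorder(top)
top = rotate_right(top.left)
after_zig = inorder(top)
```

Both rotations change only a constant number of links and preserve the in-order key sequence, which is why they can be applied freely without violating the binary search tree property.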
When the node that is being splayed is inline with its grandparent (a left child of a left child, or a right child of a right child), the parent is rotated first, which lifts it above the grandparent. The second rotation of the pair lifts the node itself above the parent. As with two rotations applied directly to the node, the node ends up higher in the tree than both its parent and its grandparent. We rotate in this order because it leaves the tree in a more balanced state. Figure 2 illustrates the pair of rotations performed on an inline node. When the node is an inline right child of its grandparent, we perform the mirror image of the process displayed in figure 2.

Figure 2: demonstration of double inline rotation operations.

This modified rotation procedure improves the balance of the tree when we encounter trees that have the characteristics of a path. Consider the diagram of a splay tree in figure 3. The result of performing a find on the node with key 1 is illustrated in figure 4. The tree is more compressed than if we had splayed the node to the top using only single rotations, which is illustrated in figure 5. Notice that after performing single rotations a path still exists, so the running time of an arbitrary find operation has not improved at all. The transformation used in figure 4 and the transformation used in figure 5 required the same number of units of work to perform. Since the average distance of each node from the root in figure 4 is considerably less than the average distance from the root in figure 5, the double rotations yield better results when splaying a node that is on a path.

Figure 3: Path configuration

Figure 4: find operation on 1 using double rotations

Figure 5: find operation on 1 using single rotations

The purpose of the splay operation is to make repeated operations on the same node faster on subsequent calls. Consider the case where we call find(x) twice in a row, where x is an arbitrary value.
The first time that the find operation is called, x will be splayed to the root of the tree. The next call to find(x) will then run in O(1) time, because the node x is now at the root of the splay tree. The splay operation is what allows the splay tree to perform its basic operations in O(log(n)) amortized time. Figure 6 contains the implementation of the splay function in pseudo code.

    function splay(node n)
        if n == null || n.parent == null then
            return                              // n is missing or already the root
        end if
        node parent = n.parent
        node grandparent = parent.parent
        if grandparent == null then             // parent is the root: single rotation
            if parent.leftchild == n then
                rotateRight(n)
            else
                rotateLeft(n)
            end if
        else if grandparent.leftchild == parent && parent.leftchild == n then
            rotateRight(parent)                 // zig-zig
            rotateRight(n)
        else if grandparent.rightchild == parent && parent.rightchild == n then
            rotateLeft(parent)                  // zag-zag
            rotateLeft(n)
        else if grandparent.rightchild == parent && parent.leftchild == n then
            rotateRight(n)                      // zig-zag
            rotateLeft(n)
        else if grandparent.leftchild == parent && parent.rightchild == n then
            rotateLeft(n)                       // zag-zig
            rotateRight(n)
        end if
        if root != n then
            splay(n)
        end if
    end function

    // Lifts n, a right child, above its parent.
    function rotateLeft(node n)
        node parent = n.parent
        node grandparent = parent.parent
        if grandparent == null then
            root = n                            // parent was the root
        else if grandparent.leftchild == parent then
            grandparent.leftchild = n
        else
            grandparent.rightchild = n
        end if
        n.parent = grandparent
        parent.rightchild = n.leftchild         // n's left subtree moves under parent
        if n.leftchild != null then
            n.leftchild.parent = parent
        end if
        n.leftchild = parent
        parent.parent = n
    end function

    // Lifts n, a left child, above its parent.
    function rotateRight(node n)
        node parent = n.parent
        node grandparent = parent.parent
        if grandparent == null then
            root = n                            // parent was the root
        else if grandparent.leftchild == parent then
            grandparent.leftchild = n
        else
            grandparent.rightchild = n
        end if
        n.parent = grandparent
        parent.leftchild = n.rightchild         // n's right subtree moves under parent
        if n.rightchild != null then
            n.rightchild.parent = parent
        end if
        n.rightchild = parent
        parent.parent = n
    end function

Figure 6: pseudo code implementation of splay operation.

Find Operation

The first fundamental binary search tree operation we consider is the find operation. A call to find is passed the key of the node that is to be found. The find operation is a recursive operation that compares the lookup key to the key of the current node. If the lookup key is less than the key of the current node then we traverse the left child of the current node.
If the lookup key is greater than the key of the current node then we traverse the right child of the current node, repeating the process. Once we encounter a node whose key matches our lookup key, we splay this node to the root. The find operation returns a pointer to the root, which is now the node with the key that matches our lookup key. If the find operation is called with a lookup key that does not exist in the tree, we return null. The find operation has an O(h) running time, where h is the height of the tree. Figure 7 contains the implementation of the find operation in pseudo code.

    function node find(int lookupKey)
        node toReturn = find(lookupKey, root)
        if toReturn != null then
            splay(toReturn)                     // move the matching node to the root
        end if
        return toReturn
    end function

    function node find(int lookupKey, node currentNode)
        if currentNode == null then
            return null                         // key is not in the tree
        else if lookupKey == currentNode.key then
            return currentNode
        else if lookupKey < currentNode.key then
            return find(lookupKey, currentNode.leftchild)
        else
            return find(lookupKey, currentNode.rightchild)
        end if
    end function

Figure 7: pseudo code implementation of find operation.

Delete Operation

The next operation that we will consider is the delete operation. The delete operation takes the lookup key of the node that is to be deleted and passes it to the find operation, which moves that node to the top of the splay tree. The operation then deletes the root node from the splay tree. This action results in two splay trees, one rooted at the deleted node's left child and the other rooted at the deleted node's right child. We then take the left tree and splay its maximum element to its root. The maximum element does not have a right child, because no element in the left tree has a key that is greater than it. This means that the root of the right tree can be joined as the right child of the left tree's root.
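The split-and-join step described above can be sketched in Python. This is only an illustration under stated assumptions: the Node class and the names rotate_up, splay, and delete_root are hypothetical, the sketch starts from a node that has already been splayed to the root, and it returns the new root instead of updating a shared root variable.

```python
class Node:
    """Hypothetical node type used only for this sketch."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def rotate_up(n):
    """Single rotation (zig or zag) that lifts n above its parent."""
    p, g = n.parent, n.parent.parent
    if p.left is n:
        p.left, n.right = n.right, p      # zig: n's right subtree moves under p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, n.left = n.left, p       # zag: n's left subtree moves under p
        if p.right is not None:
            p.right.parent = p
    p.parent, n.parent = n, g
    if g is not None:
        if g.left is p:
            g.left = n
        else:
            g.right = n


def splay(n):
    """Move n to the root of the tree it currently belongs to."""
    while n.parent is not None:
        p, g = n.parent, n.parent.parent
        if g is not None:
            inline = (g.left is p) == (p.left is n)
            rotate_up(p if inline else n)  # inline: rotate the parent first
        rotate_up(n)
    return n


def delete_root(root):
    """Remove `root` and join its subtrees; returns the new root or None."""
    left, right = root.left, root.right
    for sub in (left, right):
        if sub is not None:
            sub.parent = None              # detach so splay stops at the subtree root
    if left is None:
        return right
    m = left
    while m.right is not None:             # walk to the maximum of the left tree
        m = m.right
    splay(m)                               # m is now the left tree's root, no right child
    m.right = right
    if right is not None:
        right.parent = m
    return m


# Example: delete key 4 from a seven-node tree whose root is 4.
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
new_root = delete_root(root)
```

Detaching the left subtree before splaying is the design choice that matters here: it makes the splay stop at the top of the left tree rather than climbing back through the deleted node.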
The operation then performs this join, attaching the right tree as the right child of the root of the left tree. If the left tree is empty, the right tree simply becomes the new tree. The delete operation does not return a value. The delete operation consists of 2 find-style traversals (the find itself and the search for the maximum), the actual deletion of a node, and the reconnection of the tree. Therefore the delete operation has an overall running time of O(h), where h is the height of the splay tree. Figure 8 contains the pseudo code implementation of the delete operation.

    function delete(int lookupKey)
        node toDelete = find(lookupKey)         // splays the node to the root
        if toDelete == null then
            return                              // key is not in the tree
        end if
        node leftTreeRoot = toDelete.leftchild
        node rightTreeRoot = toDelete.rightchild
        if rightTreeRoot != null then rightTreeRoot.parent = null end if
        if leftTreeRoot == null then
            root = rightTreeRoot                // no left tree: the right tree is the result
        else
            leftTreeRoot.parent = null          // detach so the splay stops at the left tree's root
            leftTreeRoot = findMax(leftTreeRoot)
            leftTreeRoot.rightchild = rightTreeRoot
            if rightTreeRoot != null then rightTreeRoot.parent = leftTreeRoot end if
            root = leftTreeRoot
        end if
    end function

    function node findMax(node currentNode)
        while currentNode.rightchild != null
            currentNode = currentNode.rightchild
        end while
        splay(currentNode)
        return currentNode
    end function

Figure 8: pseudo code implementation of delete operation.

Insert Operation

The insert operation takes as a parameter the key that is to be inserted into the tree. When the insert operation completes, the inserted key is at the root of the tree. The operation starts by finding a position for the new element in the tree. This is accomplished by starting at the root and traversing the tree until we reach a null node. Once we have reached a null node, we insert a new node there with the key value that was passed to the insert function. We then splay the new node up to the root of the tree. Although some implementations of splay trees allow for multiple nodes with the same key value, we have not added this as part of our implementation. Because the new node is splayed to the root each time a node is added, we expect the splay tree to be relatively balanced if the elements are inserted in random order. Figure 9 contains the pseudo code for the insert operation.
    function insert(int key)
        if root == null then
            root = new node(key)                // empty tree: the new node is the root
            return
        end if
        node inserted = findPosAndInsert(root, key)
        splay(inserted)
    end function

    function node findPosAndInsert(node curNode, int key)
        if curNode.key < key then
            if curNode.rightchild == null then
                curNode.rightchild = new node(key)      // link the new node into the tree
                curNode.rightchild.parent = curNode
                return curNode.rightchild
            else
                return findPosAndInsert(curNode.rightchild, key)
            end if
        else if curNode.key > key then
            if curNode.leftchild == null then
                curNode.leftchild = new node(key)       // link the new node into the tree
                curNode.leftchild.parent = curNode
                return curNode.leftchild
            else
                return findPosAndInsert(curNode.leftchild, key)
            end if
        else
            return curNode                      // key already present: no duplicate is created
        end if
    end function

Figure 9: pseudo code implementation of the insert operation.

Advantages

Splay trees are one of many binary search tree implementations. Other binary search tree implementations include AVL trees, Red-Black trees, and Treaps. There are several advantages to using a splay tree over these other implementations. One advantage is that after a node is accessed it will be at the root, so repeated accesses to a given node within a small query window are more efficient. This makes splay trees extremely useful for implementing caches (Wikipedia, 2002). Another advantage of splay trees is that they are relatively easy to implement. Since a splay tree does not need to enforce a balancing condition, the programmer does not have to implement a significant part of the code that other binary search tree implementations require. Splay trees also require less memory to store each node, since there is no need for additional per-node information concerning tree balancing.

Application Explained

The program that we have written to demonstrate how splay trees work is a graphical Windows application. It allows the user to see the inner workings of the splay tree, including the reordering of the nodes and edges after rotations have occurred. The visualization also displays informational messages describing the current task being completed and a marker that distinguishes which node is being evaluated at each step.
Having these attributes, this program allows the user to experience all of the fundamental and advanced operations of splay trees and better understand their mechanics. This is a screenshot of the program:

The program is divided into three areas: the message box, the button area, and the display area. The message box contains a description of the current operation being performed on the splay tree. The button area allows the user to select an operation to perform on the splay tree. The display area displays the structure and contents of the splay tree.

The first area that we will discuss is the message box. The message box displays a description of each individual task of an operation being performed. For instance, when a node is being splayed to the top, the message box will describe which keys are being compared and rotated at each step. Here is an example for the "Element-of-Rank" operation:

The next area that we will discuss is the button area. The buttons in this area correspond to all the fundamental and many advanced operations that may be performed on splay trees. Most of the operations are self-explanatory, except for the "Open" and "Save" operations. Here is an example of the button area:

Note that all of the buttons are enabled except for the "Animate" button. The "Animate" button is only enabled when an operation has been selected and executed. It allows the user to step through all tasks that must be performed to complete the operation, such as rotating an edge or traversing down from a parent node to a child node. Here is an example of the button area during an operation:

Note that all of the buttons are disabled except for the "Animate" button. The button area will remain in this state until all tasks for a specific operation are completed. Furthermore, while an animation is in progress, the "Animate" button will also be disabled.
Note that the animations cannot be skipped and must all be completed before another operation may be selected.

The "Save" and "Open" operations save splay trees to files and open them from files. The "Save" operation opens a dialog box that asks the user for a filename under which to save the splay tree. This is an example of the save file dialog:

The "Open" operation behaves similarly to the "Save" operation. It opens a dialog box that asks the user for the filename of the splay tree to open. Here is an example of the open file dialog:

It is not important for the user to understand how the files are saved, but the curious user may appreciate knowing that the files are saved in XML format and may be viewed with any XML editor, including Microsoft Internet Explorer.

The third division of the program is the display area. This area contains an image that represents the structure of the splay tree, along with the key of each node, credit information that affects the size of the nodes, and a marker that distinguishes which node is being evaluated in the current operation. Here is an example:

If a "Find" operation is performed on this tree for the element 13, up to the point where the element has been found and is about to be splayed to the top, the red marker moves down the tree as well, as illustrated in this example:

This area is animated whenever pressing the "Animate" button moves the marker or restructures the tree.

References

Sleator, D. D., & Tarjan, R. E. (1983). Self-adjusting binary trees. ACM Symposium on Theory of Computing, 15, 235-245.

"Splay tree." Online posting. 25 Feb 2002. Wikipedia. 14 Dec 2005. <http://en.wikipedia.org/wiki/Splay_tree>.