SUPERIOR UNIVERSITY LAHORE
Faculty of Computer Science & IT

Design and Analysis of Algorithm
Report: AVL Tree

Group Members

Student Name          Student ID      Class   Section
Muhammad Nawaz        BCSM-F18-022    BSCS    5B
Abubakar Mehboob      BCSM-F18-041    BSCS    5B
Ahsan Naseer          BCSM-F18-031    BSCS    5B
H. Abdullah Ansari    BCSM-F18-452    BSCS    5B

Teacher: [Dr. Imran Khan (PhD. Scholar)]

Contents
• History
• Abstract
• Introduction
• Variations of AVL Tree
   o RAVL
   o WAVL
• How AVL Tree Work
   o Balancing the Tree
   o Rotations
      ▪ Left Rotation
      ▪ Right Rotation
      ▪ Left-Right Rotation
      ▪ Right-Left Rotation
   o Operations
      ▪ Searching
      ▪ Insertion
      ▪ Deletion
• Why we use AVL Tree
   o Searching
   o Storing
• Advantages
• Current Application
• Algorithm
• Conclusion
• References

History of AVL Tree:
The AVL tree was introduced in 1962. It is named after its two Soviet inventors, Georgy Adelson-Velsky and Evgenii Landis, who published it in their 1962 paper "An algorithm for the organization of information". The AVL tree is height-balanced but not weight-balanced; that is, sibling nodes can have hugely differing numbers of descendants.

Abstract:
The AVL tree was the first dynamically balancing binary search tree to be invented.
It is used in many real-world areas that we interact with today. This paper discusses the variations and methodologies of the AVL tree; its basic operations, namely insertion, deletion and searching; its purpose, including the problems it can be used to solve; its advantages and disadvantages; and its current and future applications.

Introduction:
AVL tree is a “height balancing binary search tree and is also known as self-balancing tree”. It was the first dynamically balanced tree to be proposed. In 1962, Adelson-Velsky and Landis, the two creators of the AVL tree, published the concept in the paper “An algorithm for the organization of information”. Hence, it was given the name AVL. That paper described only the algorithm for rebalancing the tree after an insertion and updating the height of the tree. An algorithm for rebalancing after deletion was lacking and appeared only later, in 1965, when another author submitted a technical report on it.

In an AVL tree, the heights of the two child subtrees of any node can differ by at most 1; this is known as the AVL property. Every node maintains an extra piece of information known as the “Balance Factor”. A Binary Search Tree (BST) is sometimes skewed because it cannot control the order in which data arrives for insertion. An AVL tree cannot control the order in which data arrives either, but it can re-arrange the data in the tree.

All major operations on a Binary Search Tree, including insertion, deletion and searching, depend linearly on its height. Because an AVL tree keeps its height logarithmic, the time complexity of insert, search and delete operations is O(log n) in both the average and the worst case, where n is the number of nodes in the tree. The insert and delete operations are slower and more tedious than in a plain BST because rotations are required to re-balance the tree. Therefore, AVL trees are not preferred for applications that involve frequent insert and delete operations.
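The O(log n) bound follows from how few nodes an AVL tree of a given height can contain. The short sketch below illustrates this; it is not part of the original report. The function names min_nodes and max_height are hypothetical, and the code assumes the convention used later in this report that a node with no children has height 0.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_nodes(h: int) -> int:
    """Fewest nodes an AVL tree of height h can contain (leaf height = 0)."""
    if h < 0:
        return 0  # empty tree
    if h == 0:
        return 1  # a single leaf
    # Sparsest legal tree: a root, one subtree of height h-1 and,
    # because heights may differ by at most 1, one subtree of height h-2.
    return 1 + min_nodes(h - 1) + min_nodes(h - 2)


def max_height(n: int) -> int:
    """Largest height that an AVL tree holding n >= 1 nodes can reach."""
    h = 0
    while min_nodes(h + 1) <= n:
        h += 1
    return h


if __name__ == "__main__":
    for n in (1, 10, 100, 1000, 1_000_000):
        print(n, max_height(n))
```

Because min_nodes obeys a Fibonacci-like recurrence, it grows exponentially in h, so the tallest possible AVL tree on n nodes has a height that grows only logarithmically in n.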
In 1973, an experiment was conducted by Scoggs to investigate the performance of the AVL tree and to compare the actual costs and timings involved in performing the basic operations. It was found that for large trees, searching is the most expensive operation, followed by deletion and then insertion (Scoggs, 1973).

The AVL tree is often compared with the red-black (RB) tree because the time complexity is O(log n) for both. RB trees are also balanced, but less strictly than AVL trees. For this reason, the AVL tree tends to be faster than the red-black tree in search operations. However, because the AVL tree is more strictly balanced than the RB tree, it performs more rotations during insertion and deletion; the RB tree does not need to be rebalanced as frequently as the AVL tree.

Variations of AVL tree
• Relaxed AVL tree (RAVL)
• Weak AVL tree (WAVL)

Relaxed AVL tree (RAVL):
The first variation of the AVL tree is the RAVL tree (relaxed AVL tree), which is re-balanced only after insertion, not after deletion. Thus, deletion can create trees that are not AVL trees. The time complexity of insert, search and delete operations in a RAVL tree is O(h + 1), where h is the height of the tree.

Weak AVL tree (WAVL):
The second variation of the AVL tree is the WAVL tree (weak AVL tree). This kind of balanced binary tree was proposed by Haeupler, Sen and Tarjan in their paper “Rank-Balanced Tree” in 2015. The WAVL tree is designed to combine the properties of the AVL tree and the red-black (RB) tree. If no deletion occurs, a WAVL tree is identical to an AVL tree, and therefore also to a RAVL tree. In a WAVL tree, an insert or delete operation needs at most 2 rotations, so the restructuring itself takes O(1) time. Most of the re-balancing in a WAVL tree takes place at the bottom of the tree.

How AVL Tree Work?

Balance Factor
For every insert, delete and search operation, the depth of a balanced tree is O(log N), so each operation takes O(log N) time. To guarantee this, the left and right subtrees of every node must have heights that differ by at most 1.
By default, nodes with no children have a height of 0. The AVL tree checks the heights of the left and right subtrees of each node and ensures that the difference between them is not greater than 1. “The difference between the height of left and right subtree of the node is known as the balance factor”. The balance factor of every node is -1, 0 or 1. The balance factor is defined as follows.

Balance Factor = heightOfLeftSubtree - heightOfRightSubtree

The three diagrams above illustrate balanced and unbalanced trees. In Figure 1, the tree is balanced because the balance factor is 0. In Figures 2 and 3, the tree is unbalanced because the balance factor has magnitude 2 (that is, 2 or -2), as illustrated above. In an AVL tree, if the magnitude of the balance factor of any node is greater than 1, the tree needs to be re-balanced using one of the four rotation techniques.

Rotations
There are four rotations:
• Left Rotation
• Right Rotation
• Left-Right Rotation
• Right-Left Rotation

Left Rotation
When a node is inserted into the right subtree of the right subtree and the tree becomes unbalanced, a single left rotation is performed. A simple diagram illustrates this process. Figure 4 represents the right-unbalanced tree. Under the definition above, the balance factor of its root is -2, so the tree needs to be rotated. In Figure 5, the left rotation is performed by making the unbalanced root the left subtree of its right child, which becomes the new root. Figure 6 represents the balanced tree after the left rotation.

Right Rotation
A single right rotation is required when a node is inserted into the left subtree of the left subtree and the tree becomes unbalanced. A simple diagram illustrates this process. Figure 7 represents the left-unbalanced tree. The balance factor of its root is 2, so the tree needs to be rotated. In Figure 8, the right rotation is performed by making the unbalanced root the right subtree of its left child, which becomes the new root. Figure 9 represents the balanced tree after the right rotation.

Left-Right Rotation
The first type of double rotation is a left-right rotation.
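A double rotation is built from the two single rotations described above, so it may help to see those in code first. The following is a minimal sketch rather than the report's own implementation: the Node class and the names height, update_height, balance_factor, rotate_left and rotate_right are hypothetical, and the code assumes that a leaf has height 0 (as stated above) and that an empty subtree therefore has height -1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 0  # a node with no children has height 0


def height(node: Optional[Node]) -> int:
    """Height of a subtree; an empty subtree counts as -1."""
    return node.height if node is not None else -1


def update_height(node: Node) -> None:
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node: Node) -> int:
    """heightOfLeftSubtree - heightOfRightSubtree."""
    return height(node.left) - height(node.right)


def rotate_left(root: Node) -> Node:
    """Right child becomes the new root; old root becomes its left subtree."""
    new_root = root.right
    assert new_root is not None, "left rotation needs a right child"
    root.right = new_root.left  # middle subtree changes parent
    new_root.left = root
    update_height(root)         # old root is now lower, so update it first
    update_height(new_root)
    return new_root


def rotate_right(root: Node) -> Node:
    """Left child becomes the new root; old root becomes its right subtree."""
    new_root = root.left
    assert new_root is not None, "right rotation needs a left child"
    root.left = new_root.right
    new_root.right = root
    update_height(root)
    update_height(new_root)
    return new_root


if __name__ == "__main__":
    # Right-unbalanced chain 1 -> 2 -> 3
    chain = Node(1, right=Node(2, right=Node(3), height=1), height=2)
    print(balance_factor(chain))
    balanced = rotate_left(chain)
    print(balanced.key, balanced.left.key, balanced.right.key)
```

Each rotation changes only a constant number of links and preserves the left-to-right order of the keys, which is why it can be used to restore balance without breaking the search-tree ordering.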
“Left-Right rotation is a combination of left rotation followed by right rotation”. A simple diagram illustrates this process. In Figure-10, Z is the root node, node X is the left child of Z and node Y is the right child of X. The tree is not balanced because the balance factor of Z is 2. For this scenario, a double rotation is required. First, a left rotation is performed at X, making Y the left subtree of Z and X the left subtree of Y, as shown in Figure-11. Then, a right rotation is performed at Z, which makes Y the root with X as its left subtree and Z as its right subtree. Figure-12 shows the balanced tree after the left-right rotation.

Right Left Rotation
The second type of double rotation is a right-left rotation. “Right-Left rotation is a combination of right rotation followed by left rotation”. A simple diagram illustrates this process. In Figure-11, Z is the root node, node X is the right child of Z and node Y is the left child of X. The tree is not balanced because the balance factor of Z is -2. For this scenario, a double rotation is required. First, a right rotation is performed at X, making Y the right subtree of Z and X the right subtree of Y, as shown in Figure-13. Then, a left rotation is performed at Z, which makes Y the root with Z as its left subtree and X as its right subtree. Figure-14 shows the balanced tree after the right-left rotation.

Operations

Searching
The time complexity of the search operation in an AVL tree is O(log n). Searching in an AVL tree works the same way as searching in a binary search tree. It relies on the ordering property: all descendants to the right of a node are greater than the node, and all descendants to the left of a node are less than the node. The steps of the search operation are described below. First, the element to search for, provided by the user, is read. Then, this element is compared with the value of the root node. If they match, the value is returned to the user. If the values do not match, the search element is compared with the node value to decide which subtree to follow.
If the search element is greater than the node value, the search continues in the right subtree; if not, it continues in the left subtree. The process is repeated until the search element is found or the search ends at a leaf node without a match (Sathya, 2017).

Insertion
The time complexity of the insert operation in an AVL tree is O(log n). Inserting a new number may cause the balance factor of a node to become 2 or -2, in which case rotations are required to re-adjust the nodes and re-balance the tree. In an AVL tree, a new node is inserted as a leaf, and the insertion is done recursively. The procedure is described below.

The element is inserted into the tree according to the rules of the Binary Search Tree (BST). The difference from a BST is that the balance factor of every node on the insertion path is checked after every insertion to ensure that the tree is balanced. The next element is added only once every balance factor is -1, 0 or 1.

Generally, there are 4 cases for insertion (Poonia, 2014):
• Outside Cases (only a single rotation is required)
   o Insertion into the left subtree of the left child
   o Insertion into the right subtree of the right child
• Inside Cases (a double rotation is required)
   o Insertion into the right subtree of the left child
   o Insertion into the left subtree of the right child

Example
A simple example is illustrated below with the construction of an AVL tree by inserting the numbers 1, 2, 3, 4, 5, 6 in order. In A, the tree becomes unbalanced after inserting 3. As 3 is inserted into the right subtree of the right child, a single left rotation is performed to re-balance the tree. In C, the tree becomes unbalanced after inserting 5. As 5 is also inserted into the right subtree of the right child, a single left rotation is performed to re-balance the tree. F represents the balanced tree after inserting the numbers 1 to 6 in order.

Deletion
The delete operation is more complex than the insert operation. The element is deleted from the tree according to the rules of the Binary Search Tree (BST).
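Before going further into deletion, the search steps and the four insertion cases described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's own algorithm: the Node class and the helper names (height, update, balance_factor, rotate_left, rotate_right, rebalance, insert, search) are hypothetical, duplicate keys are ignored, and an empty subtree is given height -1 so that a leaf has height 0.

```python
from typing import Optional


class Node:
    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None
        self.height = 0  # a new node is a leaf, height 0


def height(node: Optional[Node]) -> int:
    return node.height if node is not None else -1


def update(node: Node) -> None:
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node: Node) -> int:
    return height(node.left) - height(node.right)


def rotate_left(x: Node) -> Node:
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rotate_right(x: Node) -> Node:
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y


def rebalance(node: Node) -> Node:
    """Restore the AVL property at node and return the new subtree root."""
    update(node)
    bf = balance_factor(node)
    if bf > 1:                                  # left-heavy
        if balance_factor(node.left) < 0:       # inside case: left-right
            node.left = rotate_left(node.left)
        return rotate_right(node)               # outside case: left-left
    if bf < -1:                                 # right-heavy
        if balance_factor(node.right) > 0:      # inside case: right-left
            node.right = rotate_right(node.right)
        return rotate_left(node)                # outside case: right-right
    return node


def insert(node: Optional[Node], key: int) -> Node:
    """BST insertion as a leaf, then rebalancing on the way back up."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node                             # duplicate key: ignored
    return rebalance(node)


def search(node: Optional[Node], key: int) -> Optional[Node]:
    """Follow right for larger keys, left for smaller; None if absent."""
    while node is not None and node.key != key:
        node = node.right if key > node.key else node.left
    return node


if __name__ == "__main__":
    root = None
    for k in (1, 2, 3, 4, 5, 6):
        root = insert(root, k)
    print(root.key, root.height)
```

The recursion checks balance factors only along the path from the new leaf back to the root, which is what keeps insertion within O(log n).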
As with insertion, the difference from a plain BST is that the balance factor of every affected node is checked after every delete operation to ensure that the tree remains balanced. There are 3 cases to consider when performing the delete operation. Let “Y” be the element to delete.
• Case 1 → if Y is a leaf, delete Y. (Figure-1)
• Case 2 → if Y has one child, use the child to replace Y. (Figure-2)
• Case 3 → if Y has two children, replace Y with its in-order predecessor (the right-most node of its left subtree) and delete that predecessor recursively. (Figure-3)

Why we use AVL tree?

Searching
The AVL tree can be used for many purposes. Its first and most common use is in search-intensive applications, because searching takes O(log n) time when the tree is balanced. The AVL tree is used to create an index of search keywords for a collection of documents (see Current Application). For example, it is used in designing databases where insertions and deletions are infrequent but lookups of stored items are frequent. The AVL tree is also used in building search engines to provide faster search results, because a search engine must respond to queries as fast as possible. For example, search engines for fingerprint databases have used the AVL tree because fingerprints discovered at crime scenes need to be compared and searched for quickly and reliably (Elmadani, 2016).

Storing
Another use of the AVL tree is storing information in an efficient and sorted manner. For example, an AVL tree can be combined with a hash table for storing and retrieving data. An AVL tree can also be used to store data read from an input file. Another important use of the AVL tree is in the classification function of data mining.

Advantages
• The first advantage of the AVL tree is that it provides faster search operations than counterparts such as the red-black tree. This allows users to complete their tasks faster than with other search structures.
• This is vital in coding to ensure that projects are completed on schedule. In addition, it is essential when the user needs faster and more reliable searches.
• The AVL tree performs searching, insertion and deletion more efficiently than an unbalanced binary search tree.
• The second advantage of the AVL tree is its self-balancing capability. One of the concerns for professionals is to ensure that trees are balanced, because if a tree is not balanced, operations such as searching, inserting and deleting take longer (Nyln, 2016).

Current Application
The following are current applications of the AVL tree. The AVL tree is mostly used in databases for indexing large sets of records to improve searching. Another application is in the memory management subsystem of the Linux kernel, where it has been used to search memory. The next application is in dictionary search engines: data structures that may be built once without further reconstruction, such as language dictionaries or program dictionaries, use the AVL tree. The AVL tree is commonly used for searching because a search takes only O(log n) time.

Another important use of the AVL tree is in data mining. Classification is one of the main functions of data mining, and the AVL tree can be used to address issues that arise in it. The concern with data mining in large databases is scalability and efficiency. Decision trees are widely used in classification methods. Because the construction of a classification model is performed on huge amounts of data, the algorithms may produce very bushy or meaningless results.

Biometric authentication such as fingerprinting can be used to verify a person's identity and in crime scene investigation. Sets of fingerprints are searched for in large databases and compared in real time. To reduce the time consumed, this requires a fast engine for searching and retrieving from large fingerprint databases.
Experiments have been conducted to find the best method among queue, SQL, graph search, hash, binary and AVL tree for searching and retrieving from large databases. According to the results, the AVL tree algorithm was found to be the best method for searching (Elmadani, 2016).

Algorithm:

Conclusion
Although the features offered by the AVL tree, such as fast searching and O(log N) time complexity for insert and delete operations, may appear appealing, it is not widely used today. Because its insertions and deletions are costly, requiring frequent rotations, and because it is harder to code, debug and implement, the AVL tree has largely been replaced by the red-black tree.

References
Algorithm & Data Structures Note 5. (n.d.). Retrieved from https://www.inf.ed.ac.uk/teaching/courses/inf2b/algnotes/note05.pdf
Shafiq, H. (2015). C# Corner: How a search engine works. Retrieved from http://www.csharpcorner.com/UploadFile/e68d54/search-engine-using-2-3-trees-and-hash-table/
Elmadani, A. (2016). AVL tree an efficient retrieval engine in classified fingerprint database. International Educational Applied Scientific Research Journal, 1(1), 30-35. Retrieved from http://ieasrj.com/journal/index.php/ieasrj/article/view/11/9
Grossman, D. (2010). Lecture 8: AVL Delete; Memory Hierarchy [PowerPoint slides]. Retrieved from https://courses.cs.washington.edu/courses/cse332/10sp/lectures/lecture8.pdf
Bhukya, D., & Ramachandram, S. (2010). Decision Tree Induction: An Approach for Data Classification Using AVL-Tree. International Journal of Computer and Electrical Engineering, 2(4), 660-665. Retrieved from http://www.ijcee.org/papers/208-E269.pdf
Haeupler, B., Sen, S., & Tarjan, R. (2013). Rank Balanced Tree: Remarks. ACM Transactions on Algorithms, 1-21. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.7295&rep=rep1&type=pdf