Data Structure

A data structure can be defined as a group of data elements which provides an efficient way of storing and organising data in the computer so that it can be used efficiently. Some examples of data structures are arrays, linked lists, stacks, queues, etc. Data structures are widely used in almost every area of computer science: operating systems, compiler design, artificial intelligence, graphics and many more. Data structures are the main part of many computer science algorithms, as they enable programmers to handle data in an efficient way. They play a vital role in enhancing the performance of a program, because the main function of most software is to store and retrieve the user's data as fast as possible. A data structure is not a programming language like C, C++, Java, etc.; it is a set of techniques and algorithms that we can use in any programming language to structure data in memory.

A data structure mainly specifies the following: i. organization of data, ii. accessing methods, iii. degree of associativity, iv. processing alternatives for information. Data structures are the building blocks of a program; they affect the design of both the structural and functional aspects of a program:

Algorithm + Data Structure = Program

To develop a program for an algorithm, we need to select an appropriate data structure for that algorithm. Therefore an algorithm and its associated data structures form a program.

Basic Terminology
Data structures are the building blocks of any program or software, and choosing the appropriate data structure for a program is one of the most difficult tasks for a programmer. The following terminology is used as far as data structures are concerned.

Data: The term data means a value or a set of values. Data can be defined as an elementary value or a collection of values; for example, a student's name and id are data about the student. The following are some examples of data: 34, 12/01/1965, ISBN 81-203-0000-0. In all the above examples, values are represented in different ways; each value, or a collection of such values, is termed data.

Group items: Data items which have subordinate data items are called group items; for example, the name of a student can have a first name and a last name.

Record: A record can be defined as a collection of related data items; for example, for the student entity, the name, address, course and marks can be grouped together to form the record for the student.

File: A file is a collection of records of one type of entity; for example, if there are 60 employees in an organisation, then the related file will contain 60 records, where each record contains the data about one employee.

Entity: An entity is something that has certain attributes, which may be assigned values. For example, an employee in an organisation is an entity. The possible attributes and their corresponding values for an entity in this example are:
Entity: EMPLOYEE
Attributes: NAME, DOB, SEX
Values: RAVI ANAND, 30/12/61, M

Attribute and entity: An entity represents a class of certain objects. It contains various attributes, and each attribute represents a particular property of that entity.

Field: A field is a single elementary unit of information representing one attribute of an entity.

Information: The term information is used for data together with its attribute(s). In other words, information can be defined as meaningful data, or processed data. For example, the set of data presented earlier becomes information related to a person if we impose the meanings mentioned below.
Data: 34 — Meaning: age of the person
Data: 12/01/1965 — Meaning: date of birth of the person
Data: ISBN 81-203-0000-0 — Meaning: number of a book recently published by the person

Data type
A data type is a term which refers to the kind of data that may appear in a computation. The following are a few well-known data types.
Data: 40 — Data type: numeric (integer)
Data: ISBN 81-203-0000-0 — Data type: alphanumeric
Real, boolean, character, complex, etc. are some more frequently used data types.

Built-in data type
In every programming language there is a set of data types called built-in data types. For example, in C, FORTRAN and Pascal the built-in data types are:
C: int, float, char, double, enum, etc.
FORTRAN: INTEGER, REAL, LOGICAL, COMPLEX, DOUBLE PRECISION, CHARACTER, etc.
Pascal: Integer, Real, Character, Boolean, etc.

Abstract data type
The data objects which comprise a data structure, together with their fundamental operations, are known as an abstract data type. When an application requires a special kind of data which is not available as a built-in data type, it is the programmer's responsibility to implement his own kind of data: the programmer has to specify how to store a value of that data, which operations can meaningfully manipulate variables of that kind of data, and how much memory is required to store a value. The programmer has to decide all these things and implement them accordingly. A programmer's own data type is termed an abstract data type; it is also alternatively termed a user-defined data type. For example, suppose we want to process dates of the form dd/mm/yy. No built-in data type for this exists in C and C++. If a programmer wants to process dates, then an abstract data type, say Date, has to be devised, and various operations, such as adding a number of days to a date to obtain another date, finding the number of days between two dates, obtaining the day of the week for a given date, etc., have to be defined accordingly. Besides these, the programmer should decide how to store the data and how much memory is needed to represent a value of Date. An abstract data type can, in fact, be built with the help of built-in data types and other abstract data type(s) already built by the programmer. Some programming languages provide a facility to build abstract data types easily; for example, using struct/class in C/C++, and using record in Pascal, programmers can define their own data types. A sketch of such a Date type is given below.
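As an illustration of the abstract data type idea described above, here is a minimal C++ sketch of a Date type with one of the operations mentioned (adding days to a date). The names Date, isLeap, daysInMonth and addDays are assumptions made for this sketch, not a prescribed implementation.

#include <iostream>

// A minimal, illustrative Date abstract data type.
struct Date {
    int dd, mm, yyyy;   // day, month, four-digit year
};

bool isLeap(int y) {
    return (y % 4 == 0 && y % 100 != 0) || (y % 400 == 0);
}

int daysInMonth(int m, int y) {
    static const int d[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    return (m == 2 && isLeap(y)) ? 29 : d[m - 1];
}

// One of the operations mentioned in the text: add n days to a date.
Date addDays(Date x, int n) {
    while (n-- > 0) {
        if (++x.dd > daysInMonth(x.mm, x.yyyy)) {
            x.dd = 1;
            if (++x.mm > 12) { x.mm = 1; ++x.yyyy; }
        }
    }
    return x;
}

int main() {
    Date d{28, 2, 2024};
    Date e = addDays(d, 3);
    std::cout << e.dd << "/" << e.mm << "/" << e.yyyy << "\n";  // prints 2/3/2024
    return 0;
}

The storage decision (three ints) and the operation set are exactly the kind of choices the text says the programmer of an abstract data type must make.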
Need of Data Structures
As applications are getting more complex and the amount of data is increasing day by day, the following problems may arise:

Processor speed: To handle a very large amount of data, high-speed processing is required, but as the data grows day by day, to billions of records per entity, the processor may fail to deal with that much data.

Data search: Consider an inventory of 10^6 (one million) items in a store. If our application needs to search for a particular item, it has to traverse all 10^6 items every time, which slows down the search process.

Multiple requests: If thousands of users are searching the data simultaneously on a web server, there is a chance that even a very large server fails under the load.

To solve the above problems, data structures are used. Data is organized to form a data structure in such a way that not all items need to be examined, and the required data can be found almost instantly.
Advantages of Data Structures Efficiency: Efficiency of a program depends upon the choice of data structures. For example: suppose, we have some data and we need to perform the search for a particular record. In that case, if we organize our data in an array, we will have to search sequentially element by element. Hence, using array may not be very efficient here. There are better data structures which can make the search process efficient like ordered array, binary search tree or hash tables. Reusability: Data structures are reusable, i.e. once we have implemented a particular data structure, we can use it at any other place. Implementation of data structures can be compiled into libraries which can be used by different clients. Abstraction: Data structure is specified by the ADT which provides a level of abstraction. The client program uses the data structure through interface only, without getting into the implementation details. Data structures can also be classified as: Static data structure: It is a type of data structure where the size is allocated at the compile time. Therefore, the maximum size is fixed. Dynamic data structure: It is a type of data structure where the size is allocated at the run time. Therefore, the maximum size is flexible. Major Operations The major or the common operations that can be performed on the data structures are: Searching: We can search for any element in a data structure. Sorting: We can sort the elements of a data structure either in an ascending or descending order. Insertion: We can also insert the new element in a data structure. Updation: We can also update the element, i.e., we can replace the element with another element. Deletion: We can also perform the delete operation to remove the element from the data structure Data Structure Classification Types of Data Structures There are two types of data structures: Primitive data structure Non-primitive data structure Primitive Data structure The primitive data structures are primitive data types. The int, char, float, double, and pointer are the primitive data structures that can hold a single value. Non-Primitive Data structure The non-primitive data structure is divided into two types: Linear data structure Non-linear data structure Linear data structure A data structure is called linear if all of its elements are arranged in the linear order. In linear data structures, the elements are stored in nonhierarchical way where each element has the successors and predecessors except the first and last element. Types of Linear Data Structures are given below: Arrays: An array is a collection of similar type of data items and each data item is called an element of the array. The data type of the element may be any valid data type like char, int, float or double. The elements of array share the same variable name but each one carries a different index number known as subscript. The array can be one dimensional, two dimensional or multidimensional. The individual elements of the array age are: age[0], age[1], age[2], age[3],......... age[98], age[99]. Linked List: Linked list is a linear data structure which is used to maintain a list in the memory. It can be seen as the collection of nodes stored at non-contiguous memory locations. Each node of the list contains a pointer to its adjacent node. Stack: Stack is a linear list in which insertion and deletions are allowed only at one end, called top. A stack is an abstract data type (ADT), can be implemented in most of the programming languages. 
It is named as stack because it behaves like a real-world stack, for example: - piles of plates or deck of cards etc. The Stack can be implemented in two ways: a. Using arrays. b. Using pointers. Queue: Queue is a linear list in which elements can be inserted only at one end called rear and deleted only at the other end called front. It is an abstract data structure, similar to stack. Queue is opened at both end therefore it follows First-In-First-Out (FIFO) methodology for storing the data items. Non Linear Data Structures: This data structure does not form a sequence i.e. each item or element is connected with two or more other items in a non-linear arrangement. The data elements are not arranged in sequential structure. Types of Non Linear Data Structures are given below: Trees: Trees are multilevel data structures with a hierarchical relationship among its elements known as nodes. The bottommost nodes in the hierarchy are called leaf node while the topmost node is called root node. Each node contains pointers to point adjacent nodes. Tree data structure is based on the parent-child relationship among the nodes. Each node in the tree can have more than one children except the leaf nodes whereas each node can have at most one parent except the root node. Trees can be classified into many categories Graphs: Graphs can be defined as the pictorial representation of the set of elements (represented by vertices) connected by the links known as edges. A graph is different from tree in the sense that a graph can have cycle while the tree cannot have the one. Operations on data structure 1) Traversing: Every data structure contains the set of data elements. Traversing the data structure means visiting each element of the data structure in order to perform some specific operation like searching or sorting. 2) Insertion: Insertion can be defined as the process of adding the elements to the data structure at any location. If the size of data structure is n then we can only insert n-1 data elements into it. 3) Deletion: The process of removing an element from the data structure is called Deletion. We can delete an element from the data structure at any random location. If we try to delete an element from an empty data structure then underflow occurs. 4) Searching: The process of finding the location of an element within the data structure is called Searching. There are two algorithms to perform searching, Linear Search and Binary Search. We will discuss each one of them later in this tutorial. 5) Sorting: The process of arranging the data structure in a specific order is known as Sorting. There are many algorithms that can be used to perform sorting, for example, insertion sort, selection sort, bubble sort, etc. 6) Merging: When two lists List A and List B of size M and N respectively, of similar type of elements, clubbed or joined to produce the third list, List C of size (M+N), then this process is called merging Array: - Operations; Number of elements Arrays An array is a collection of similar type of data items (homogeneous) and each data item is called an element of the array. The data type of the element may be any valid data type like char, int, float or double. The elements of array share the same variable name but each one carries a different index number known as subscript. The array can be one dimensional, two dimensional or multidimensional. 
One-Dimensional Arrays
If only one subscript/index is required to reference all the elements in an array, the array is termed a one-dimensional array, or simply an array. A one-dimensional array can be defined as follows:

data_type var_name[expression];

Here data_type is the type of element to be stored in the array, var_name specifies the name of the array (it may be given any name, like other simple variables), and the expression, or subscript, specifies the number of values to be stored in the array. Arrays are also called subscripted variables. The subscript must be an integer value, and the subscript of an array starts from 0 onwards. An array can be declared as follows:

int num[10];

This array stores ten integer values, for example (figure):
num:        22  1  30  9  3  55  0  7  10  90
subscript:   0  1   2  3  4   5  6  7   8   9

To access a particular element in the array, use the following statement:
var_name[subscript];
For example, to access the fourth element of array num we write num[3].

The size of an array can be determined as follows:
Size = (UpperBound - LowerBound) + 1
For the array num described above, the size is: Size = (9 - 0) + 1 = 10.

The total size of the array in bytes can be computed as:
Size in bytes = size of array x sizeof(base type)
For example, the total memory in bytes that the array num occupies is:
Size in bytes = 10 x sizeof(int) = 10 x 2 = 20 (assuming a 2-byte int, as on older 16-bit compilers; with a 4-byte int it would be 40).

Initializing a One-Dimensional Array
Whether it is a one-dimensional or a multidimensional array, it can be initialized with an initializer list only at the place of declaration.
Example: int value[5] = {10, 40, 92, 32, 60};

Accessing One-Dimensional Array Elements
Individual elements of the array can be accessed using the following syntax:
array_name[subscript];
Example: value[2] = 92;

Implementation of a One-Dimensional Array in Memory
The address of a particular element in a one-dimensional array is given by the relation

Address of element a[k] = B + W * k

where B is the base address of the array, W is the size of each element of the array, and k is the subscript of the required element.
Example: Let the base address of the array be 2000 and let each element of the array occupy 4 bytes in memory; then the address of element a[5] of the array a[10] (the sixth element, since subscripts start at 0) is:
Address of a[5] = 2000 + 4*5 = 2020

Memory Allocation for an Array
The memory representation of an array is very simple. Suppose an array A[100] is to be stored in memory as in Figure 2.2, and let the memory location where the first element is stored be M. If each element requires one word and the index starts at 1, then the location of any element A[i] in the array can be obtained as

Address(A[i]) = M + (i - 1)

Figure 2.2: Physical representation of a one-dimensional array.

More generally, an array can be written as A[L...U], where L and U denote the lower and upper bounds of the index. If the array is stored starting from memory location M, and each element requires w words, then the address of A[i] is

Address(A[i]) = M + (i - L) * w

The above formula is known as the indexing formula; it is used to map the logical presentation of an array onto its physical presentation. Knowing the starting address M of the array, the location of the i-th element can be calculated directly instead of moving towards position i from M. See Figure 2.3.

Figure 2.3: Address mapping between the logical and physical views of an array.
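To make the address relationship concrete, here is a small C++ sketch that declares and initializes a one-dimensional array, accesses an element by subscript, and checks that the address computed by B + W*k agrees with the address of num[k]. The array name num and its values are sample data chosen for illustration.

#include <iostream>

int main() {
    // Declaration with initialization at the point of declaration.
    int num[10] = {22, 1, 30, 9, 3, 55, 0, 7, 10, 90};

    // Accessing an element by its subscript (indexing starts at 0).
    std::cout << "num[3] = " << num[3] << "\n";

    // Indexing formula Address(a[k]) = B + W * k:
    // &num[0] is the base address B, sizeof(int) is the element width W.
    const char* base = reinterpret_cast<const char*>(&num[0]);
    for (int k = 0; k < 10; ++k) {
        const char* computed = base + sizeof(int) * k;          // B + W * k
        const char* actual   = reinterpret_cast<const char*>(&num[k]);
        std::cout << "k=" << k << "  formula matches &num[k]: "
                  << std::boolalpha << (computed == actual) << "\n";
    }
    return 0;
}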
Traversing
This operation is used to visit all the elements in an array.

Algorithm TraverseArray
Input: An array A with elements.
Output: According to Process( ).
Data structures: An array A[L...U]. // L and U are the lower and upper bounds of the array index
Steps:
1. i = L
2. While i <= U do
3.     Process(A[i])
4.     i = i + 1
5. EndWhile
6. Stop

Note: Here Process( ) is a procedure which, when called for an element, performs some action on it; for example, display the element on the screen, or determine whether A[i] is empty. Process( ) can also be used to carry out special operations such as counting elements of interest (for example, negative numbers in an integer array), updating each element of the array (say, increasing each element by 10%), and so on.

Insertion
This operation is used to insert an element into an array, provided that the array is not full. Insertion of a new element into an array can be done in two ways: i. insertion at the end of the array, ii. insertion at a required position (see Figure 2.5). Let us look at the algorithmic description of the insertion operation.
Figure 2.5: Insertion of an element into an array.

Algorithm InsertArray
Input: KEY is the item to insert; LOCATION is the index at which it is to be inserted.
Output: The array enriched with KEY.
Data structures: An array A[L...U]. // L and U are the lower and upper bounds of the array index
Steps:
1. Start
2. If A[U] != NULL then        // NULL in the last cell indicates that room is available for a new entrant
3.     Print "Array is full: no insertion possible"
4.     Exit
5. Else
6.     i = U                   // start shifting from the bottom
7.     While i > LOCATION do
8.         A[i] = A[i-1]
9.         i = i - 1
10.    EndWhile
11.    A[LOCATION] = KEY       // put the element in the desired location
12. EndIf
13. Stop

Deletion
This operation is used to delete a particular element from an array. The element is deleted by overwriting it with its successor, that successor by its own successor, and so on; in other words, the tail of the array (the part after the element being deleted) is pushed one position up. Deletion of an element in an array can be done in two ways: i. deletion at the end of the array, ii. deletion at a required position. Consider the following algorithm to delete an element from an array (see Figure 2.6).
Figure 2.6: Deletion of an element from an array.

Algorithm DeleteArray
Input: KEY, the element to be deleted.
Output: The slimmed array without KEY.
Data structures: An array A[L...U].
Steps:
1. Start
2. i = SearchArray(A, KEY)     // perform the search operation on A and return the location
3. If (i = 0) then
4.     Print "KEY is not found: no deletion"
5.     Exit
6. Else
7.     While i < U do
8.         A[i] = A[i+1]       // replace the element by its successor
9.         i = i + 1
10.    EndWhile
11. EndIf
12. A[U] = NULL                // the bottom-most element is made empty
13. U = U - 1                  // update the upper bound
14. Stop

Note: It is general practice that no intermediate location is left empty; that is, an array should remain packed, with any empty locations at its tail.

Sorting
This operation, when performed on an array, sorts it in a specified order (ascending/descending). The following algorithm arranges the elements of an integer array in the order decided by Order( ), for example ascending order.

Algorithm SortArray
Input: An array with integer data.
Output: An array with its elements sorted according to Order( ).
Data structures: An integer array A[L...U]. // L and U are the lower and upper bounds of the array index
Steps:
1. Start
2. i = U
3. While i > L do
4.     j = L                                  // start comparing from the first element
5.     While j < i do
6.         If Order(A[j], A[j+1]) = FALSE then // if A[j] and A[j+1] are not in order
7.             Swap(A[j], A[j+1])              // interchange the elements
8.         EndIf
9.         j = j + 1
10.    EndWhile
11.    i = i - 1
12. EndWhile
13. Stop

Here, Order(...) is a procedure to test whether two elements are in order, and Swap(...) is a procedure to interchange the elements in two consecutive locations.
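The insertion and deletion operations above can be sketched in C++ as follows, assuming the number of stored elements n is tracked separately from the array's physical capacity; the function names insertAt and deleteKey are choices made for this sketch.

#include <iostream>

// Insert key at position pos (0-based); returns false if the array is full.
bool insertAt(int a[], int& n, int capacity, int pos, int key) {
    if (n >= capacity || pos < 0 || pos > n) return false;
    for (int i = n; i > pos; --i)      // push the tail one position down
        a[i] = a[i - 1];
    a[pos] = key;
    ++n;
    return true;
}

// Delete the first occurrence of key; returns false if key is not present.
bool deleteKey(int a[], int& n, int key) {
    int pos = -1;
    for (int i = 0; i < n; ++i)
        if (a[i] == key) { pos = i; break; }
    if (pos == -1) return false;
    for (int i = pos; i < n - 1; ++i)  // pull the tail one position up
        a[i] = a[i + 1];
    --n;
    return true;
}

int main() {
    int a[10] = {10, 20, 40, 50};
    int n = 4;
    insertAt(a, n, 10, 2, 30);         // a becomes 10 20 30 40 50
    deleteKey(a, n, 10);               // a becomes 20 30 40 50
    for (int i = 0; i < n; ++i) std::cout << a[i] << ' ';
    std::cout << '\n';
    return 0;
}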
Searching
This operation is used to search for an element of interest in an array. A simplified version of the algorithm is as follows:

Algorithm SearchArray
Input: KEY is the element to be searched for.
Output: The index of KEY in A, or a message on failure.
Data structures: An array A[L...U]. // L and U are the lower and upper bounds of the array index
Steps:
1. Start
2. i = L, found = 0, location = 0          // found = 0 indicates the search is not yet finished and so far unsuccessful
3. While (i <= U) and (found = 0) do       // continue while both conditions hold
4.     If Compare(A[i], KEY) = TRUE then   // the key is found
5.         found = 1                       // search is finished and successful
6.         location = i
7.     Else
8.         i = i + 1                       // move to the next element
9.     EndIf
10. EndWhile
11. If found = 0 then
12.     Print "Search is unsuccessful: KEY is not in the array"
13. Else
14.     Print "Search is successful: KEY is in the array at location", location
15. EndIf
16. Return(location)
17. Stop

Merging
Merging is an important operation when we need to combine the elements of two different arrays into a single array (see the figure: merging of A1 and A2 into A). A C++ sketch of this operation is given after the algorithm.

Algorithm Merge
Input: Two arrays A1[L1...U1] and A2[L2...U2].
Output: The resultant array A[L...U], where L = L1 and U = U1 + (U2 - L2 + 1).
Data structures: Array structure.
Steps:
1. i1 = L1, i2 = L2                   // initialization of the control variables
2. L = L1, U = U1 + U2 - L2 + 1       // L and U are the bounds of the resultant array
3. i = L                              // initialization of the index of A
4. AllocateMemory(Size(U - L + 1))    // allocate memory for array A
5. While i1 <= U1 do                  // copy A1 into the first part of A
6.     A[i] = A1[i1]
7.     i = i + 1, i1 = i1 + 1
8. EndWhile
9. While i2 <= U2 do                  // copy A2 into the last part of A
10.    A[i] = A2[i2]
11.    i = i + 1, i2 = i2 + 1
12. EndWhile
13. Stop
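Matching the Merge algorithm above, which simply places A2 after A1 in a freshly allocated array, here is a small C++ sketch; the function name mergeArrays and the sample values are assumptions of this sketch. If both inputs were individually sorted and a sorted result were required, the copy loops would instead compare the front elements of A1 and A2, as merge sort does in Unit II.

#include <iostream>

// Concatenate a1 (length n1) and a2 (length n2) into a new array of length n1 + n2.
int* mergeArrays(const int a1[], int n1, const int a2[], int n2) {
    int* a = new int[n1 + n2];                          // AllocateMemory(Size(U - L + 1))
    int i = 0;
    for (int i1 = 0; i1 < n1; ++i1) a[i++] = a1[i1];    // copy A1 into the first part of A
    for (int i2 = 0; i2 < n2; ++i2) a[i++] = a2[i2];    // copy A2 into the last part of A
    return a;
}

int main() {
    int a1[] = {5, 9, 2};
    int a2[] = {7, 4};
    int* a = mergeArrays(a1, 3, a2, 2);
    for (int i = 0; i < 5; ++i) std::cout << a[i] << ' ';   // 5 9 2 7 4
    std::cout << '\n';
    delete[] a;
    return 0;
}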
Multi-dimensional Arrays
A multi-dimensional array is an array of arrays. Two-dimensional arrays are the most commonly used; they store data in a tabular manner. For an array of size M x N, the rows are numbered from 0 to M-1 and the columns from 0 to N-1.

2D array declaration: To declare a 2D array, you must specify the following:
Row-size: defines the number of rows.
Column-size: defines the number of columns.
Type of array: defines the type of the elements to be stored in the array, i.e. a number, character, or other such data type.
A sample form of declaration is: type arrayname[row_size][column_size];
A sample array is declared as: int arr[3][5];

2D array initialization: An array can be initialized either during or after declaration. The format of initializing an array during declaration is as follows:
int arr[3][3] = {{5, 12, 17}, {13, 4, 8}, {9, 6, 3}};
or
int arr[3][3] = {5, 12, 17, 13, 4, 8, 9, 6, 3};

Accessing two-dimensional array elements: Elements of a two-dimensional array can be accessed as follows:
int arr[3][5];
arr[0][0] = 5;
arr[1][2] = 8;

Implementation of a two-dimensional array in memory
Let A be a two-dimensional m x n array. Although A is pictured as a rectangular array of elements with m rows and n columns, the array is represented in memory by a block of m x n sequential memory locations. Specifically, the programming language stores the array A either
1. column by column, called column-major order, or
2. row by row, called row-major order.
The figure shows these two layouts for a two-dimensional 3 x 4 array A: row-major order (RMO) and column-major order (CMO).

Address of elements in a row-major implementation
Address of element a[i][j] = B + W * (n*(i - L1) + (j - L2))
where B is the base address of the array, W is the size of each array element, n is the number of columns, L1 is the lower bound of the row index, and L2 is the lower bound of the column index.
Example: Suppose we want to calculate the address of element A[1, 2] in an array A of 2 rows and 4 columns. Here, B = 2000, W = 2, m = 2, n = 4, i = 1, j = 2, L1 = 0, L2 = 0.
LOC(A[i, j]) = B + W * (n*(i - L1) + (j - L2))
LOC(A[1, 2]) = 2000 + 2 * (4*(1 - 0) + (2 - 0))
             = 2000 + 2 * (4 + 2)
             = 2000 + 2 * 6
             = 2000 + 12 = 2012

Address of elements in a column-major implementation
Address of element a[i][j] = B + W * ((i - L1) + m*(j - L2))
where B is the base address of the array, W is the size of each array element, m is the number of rows, L1 is the lower bound of the row index, and L2 is the lower bound of the column index.
Example: Suppose we want to calculate the address of element A[1, 2] in an array A of 2 rows and 4 columns. Here, B = 2000, W = 2, m = 2, n = 4, i = 1, j = 2, L1 = 0, L2 = 0.
LOC(A[i, j]) = B + W * (m*(j - L2) + (i - L1))
LOC(A[1, 2]) = 2000 + 2 * (2*(2 - 0) + (1 - 0))
             = 2000 + 2 * (4 + 1)
             = 2000 + 2 * 5
             = 2000 + 10 = 2010

Application of Arrays
1. Sparse matrix representation.
2. Polynomial representation.
3. Arrays can be used for sorting elements.
4. Arrays can be used to perform matrix operations.
5. Arrays can be used in CPU scheduling. CPU scheduling is generally managed by a queue, and a queue can be managed and implemented using an array.
6. Arrays can be used in recursive functions. When a function calls another function, or the same function again, the current values are stored onto a stack, and those values are retrieved when control comes back; this is an operation similar to a stack.

Polynomial representation with arrays
A polynomial is an expression made up of one or more terms, where each term consists of a coefficient and an exponent. An example of a polynomial is P(x) = 4x^3 + 6x^2 + 7x + 9. A polynomial may be represented using arrays or linked lists. The array representation assumes that the exponents of the given expression are arranged from 0 to the highest value (the degree), each exponent being represented by a subscript of the array beginning with 0; the coefficient of each exponent is placed at the corresponding index. The array representation for the above polynomial is therefore {9, 7, 6, 4}, where the value at index i is the coefficient of x^i.

Polynomial Addition
Given two polynomials represented by two arrays, write a function that adds the two polynomials.
Example:
Input: A[ ] = {5, 0, 10, 6}, B[ ] = {1, 2, 4}
Output: sum[ ] = {6, 2, 14, 6}
The first input array represents 5 + 0x^1 + 10x^2 + 6x^3, the second represents 1 + 2x^1 + 4x^2, and the output represents 6 + 2x^1 + 14x^2 + 6x^3.
Algorithm add(A[0..m-1], B[0..n-1])
1) Create a sum array sum[ ] of size equal to the maximum of m and n.
2) Copy A[ ] to sum[ ].
3) Traverse array B[ ] and, for every element B[i], do sum[i] = sum[i] + B[i].
4) Return sum[ ].
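A minimal C++ sketch of the polynomial addition algorithm above; the function name addPoly and the use of std::vector are choices made for this sketch.

#include <iostream>
#include <vector>
#include <algorithm>

// Coefficients are stored so that a[i] is the coefficient of x^i.
std::vector<int> addPoly(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> sum(std::max(a.size(), b.size()), 0);
    for (std::size_t i = 0; i < a.size(); ++i) sum[i] = a[i];    // copy A into sum
    for (std::size_t i = 0; i < b.size(); ++i) sum[i] += b[i];   // add B term by term
    return sum;
}

int main() {
    std::vector<int> A = {5, 0, 10, 6};   // 5 + 0x + 10x^2 + 6x^3
    std::vector<int> B = {1, 2, 4};       // 1 + 2x + 4x^2
    std::vector<int> S = addPoly(A, B);   // expected: 6 2 14 6
    for (int c : S) std::cout << c << ' ';
    std::cout << '\n';
    return 0;
}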
Sparse Matrix
Sparse matrix and its representations: A matrix is a two-dimensional data object made of m rows and n columns, having a total of m x n values. If most of the elements of the matrix have the value 0, it is called a sparse matrix.
Uses:
Storage: There are fewer non-zero elements than zeros, so less memory is needed if we store only those elements.
Computing time: Computing time can be saved by logically designing a data structure that traverses only the non-zero elements.
Example (a 4 x 5 sparse matrix):
0 0 3 0 4
0 0 5 7 0
0 0 0 0 0
0 2 6 0 0
Representing a sparse matrix by an ordinary 2D array wastes a lot of memory, because the zeros in the matrix are of no use in most cases. So, instead of storing the zeros along with the non-zero elements, we store only the non-zero elements, each as a triple (Row, Column, Value). The sparse matrix representation can be done using an array as follows: a 2D array with three columns is used, named
Row: index of the row where a non-zero element is located,
Column: index of the column where a non-zero element is located,
Value: value of the non-zero element located at index (row, column).

Operations on Sparse Matrices
A sparse matrix used anywhere in the program is kept sorted according to its row values; two elements with the same row value are further sorted according to their column values. To add two matrices, we simply traverse both matrices element by element and insert the smaller element (the one with the smaller row and column values) into the resultant matrix. If we come across elements with the same row and column values, we add their values and insert the sum into the resultant matrix. A C++ sketch of this triplet-based addition is given at the end of this sub-section.

Input:
Matrix 1 (4x4):
Row Column Value
1   2      10
1   4      12
3   3      5
4   1      15
4   2      12
Matrix 2 (4x4):
Row Column Value
1   3      8
2   4      23
3   3      9
4   1      20
4   2      25
Output, result of addition (4x4):
Row Column Value
1   2      10
1   3      8
1   4      12
2   4      23
3   3      14
4   1      35
4   2      37

To transpose a matrix, we simply interchange every row value with the corresponding column value and vice versa, and then re-sort the triples by row.

To multiply two matrices, we start with a result (answer) matrix full of zeros. We search through the first matrix and, every time we find a non-zero entry, we add the appropriate amounts to the corresponding locations of the product matrix; that is, we multiply that one entry by every matching entry of the second matrix. For example, consider two matrices given as triples:
Matrix A:          Matrix B:
Row Col Val        Row Col Val
1   2   10         1   1   2
1   3   12         1   2   5
2   1   1          2   2   1
2   3   2          3   1   8
The resulting matrix after multiplication is obtained as follows:
result[1][2] = A[1][2]*B[2][2] = 10*1 = 10
result[1][1] = A[1][3]*B[3][1] = 12*8 = 96
result[2][1] = A[2][1]*B[1][1] = 1*2  = 2
result[2][2] = A[2][1]*B[1][2] = 1*5  = 5
result[2][1] = A[2][3]*B[3][1] = 2*8  = 16
Taking the sum of the two result[2][1] contributions (2 + 16 = 18) and sorting all the triples, the final resultant matrix is:
Row Col Val
1   1   96
1   2   10
2   1   18
2   2   5
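A C++ sketch of the triplet-based sparse matrix addition described above, assuming both inputs are sorted by (row, column); the names Triple and addSparse are illustrative choices. Run on the two input matrices above, it reproduces the addition result shown.

#include <iostream>
#include <vector>

// One non-zero element of a sparse matrix.
struct Triple { int row, col, val; };

// Merge-style addition of two triple lists sorted by (row, col).
std::vector<Triple> addSparse(const std::vector<Triple>& a, const std::vector<Triple>& b) {
    std::vector<Triple> result;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].row < b[j].row || (a[i].row == b[j].row && a[i].col < b[j].col)) {
            result.push_back(a[i++]);                     // a's element comes first
        } else if (b[j].row < a[i].row || (b[j].row == a[i].row && b[j].col < a[i].col)) {
            result.push_back(b[j++]);                     // b's element comes first
        } else {                                          // same position: add the values
            result.push_back({a[i].row, a[i].col, a[i].val + b[j].val});
            ++i; ++j;
        }
    }
    while (i < a.size()) result.push_back(a[i++]);        // copy any leftover triples
    while (j < b.size()) result.push_back(b[j++]);
    return result;
}

int main() {
    std::vector<Triple> m1 = {{1,2,10},{1,4,12},{3,3,5},{4,1,15},{4,2,12}};
    std::vector<Triple> m2 = {{1,3,8},{2,4,23},{3,3,9},{4,1,20},{4,2,25}};
    for (const Triple& t : addSparse(m1, m2))
        std::cout << t.row << ' ' << t.col << ' ' << t.val << '\n';
    return 0;
}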
The concept of recursion: types, example: Factorial
Some computer programming languages allow a module or function to call itself. This technique is known as recursion.
Recursive procedures: If P is a procedure containing a call statement to itself, or to another procedure that results in a call back to itself, then the procedure P is said to be recursive.
Types of Recursion
1. Direct recursion
2. Indirect recursion
3. Tail recursion
4. Non-tail recursion

Direct recursion
A function is called directly recursive if it calls the same function again. Structure of direct recursion:
fun( )
{
    // some code
    fun( );
    // some code
}

Indirect recursion
A function (say fun) is called indirectly recursive if it calls another function (say fun2) and fun2 then calls fun, directly or indirectly. As an illustration, write a program to print numbers from 1 to 10 in such a way that when the number is odd, 1 is added to it, and when the number is even, 1 is subtracted from it.
Output: 2 1 4 3 6 5 8 7 10 9

#include <stdio.h>

void odd();
void even();
int n = 1;

void even()
{
    if (n <= 10)
    {
        printf("%d ", n - 1);
        n++;
        odd();
    }
}

void odd()
{
    if (n <= 10)
    {
        printf("%d ", n + 1);
        n++;
        even();
    }
}

int main()
{
    odd();
    return 0;
}

Example: Recursive procedure to compute the factorial of a number
The factorial of a non-negative integer n is the product of all positive integers less than or equal to n. For example, the factorial of 6 is 6*5*4*3*2*1 = 720. We can define the problem recursively:
n! = n * (n - 1)!   if n > 0
n! = 1              if n = 0

Algorithm
Step 1: Start
Step 2: Read number n
Step 3: Call factorial(n)
Step 4: Print the factorial f
Step 5: Stop

factorial(n)
Step 1: If n == 0 then return 1
Step 2: Else f = n * factorial(n - 1)
Step 3: Return f

Let's see the factorial program in C++ using recursion:

#include <iostream>
using namespace std;

int factorial(int);

int main()
{
    int fact, value;
    cout << "Enter any number: ";
    cin >> value;
    fact = factorial(value);
    cout << "Factorial of the number is: " << fact << endl;
    return 0;
}

int factorial(int n)
{
    if (n < 0)
        return -1;          /* wrong value */
    if (n == 0)
        return 1;           /* terminating condition */
    else
        return n * factorial(n - 1);
}

Output:
Enter any number: 6
Factorial of the number is: 720

Tail recursion
A recursive function is said to be tail recursive if the recursive call is the last thing done by the function; there is no need to keep a record of the previous state.

#include <stdio.h>

void fun(int n)
{
    if (n == 0)
        return;
    printf("%d ", n);
    fun(n - 1);             /* the recursive call is the last statement */
}

int main()
{
    fun(3);                 /* prints 3 2 1 */
    return 0;
}

Non-tail recursion
A recursive function is said to be non-tail recursive if the recursive call is not the last thing done by the function; after returning, there is still something left to evaluate.

#include <stdio.h>

void fun(int n)
{
    if (n == 0)
        return;
    fun(n - 1);
    printf("%d ", n);       /* work remains after the recursive call */
}

int main()
{
    fun(3);                 /* prints 1 2 3 */
    return 0;
}

Tower of Hanoi problem
The Tower of Hanoi is a mathematical problem which consists of three rods and multiple disks. Initially, all the disks are placed on one rod, one over the other in ascending order of size, like a cone-shaped tower. The objective of the problem is to move the stack of disks from the initial rod to another rod, following these rules:
1. Only one disk can be moved at a time.
2. Only the top disk of a rod can be removed.
3. No disk may be placed on top of a smaller disk.
There are three pillars: source (S), intermediate (I) and destination (D). S contains a set of disks resembling a tower, with the largest disk at the bottom and the smallest at the top. The goal is to transfer the entire tower of disks from S to D maintaining the same order of the disks; only one disk can be moved at a time, and a larger disk can never be placed on a smaller disk during the transfer. The pillar I is for intermediate use during the transfer.
Example: A simple solution of the problem for n = 3 disks is given by the following transfer of disks:
1. Transfer disk 1 from source (S) to destination (D).
2. Transfer disk 2 from S to intermediate (I).
3. Transfer disk 1 from D to I.
4. Transfer disk 3 from S to D.
5. Transfer disk 1 from I to S.
6. Transfer disk 2 from I to D.
7. Transfer disk 1 from S to D.
The goal is to move all the disks from the leftmost rod to the rightmost rod. To move N disks from one rod to another, 2^N - 1 steps are required, so to move 3 disks from the starting rod to the ending rod a total of 7 steps are required. The solution of this puzzle is an application of recursive functions. The recursive procedure for the solution of this problem for N disks is as follows:
1. Move the top N-1 disks from Source to Intermediate (using Destination as auxiliary).
2. Move the Nth (largest) disk from Source to Destination.
3. Move the N-1 disks from Intermediate to Destination (using Source as auxiliary).

A recursive algorithm for the Tower of Hanoi, transferring N disks from S to D, can thus be written as:
START
Procedure Hanoi(N, S, D, I)
    IF N == 1 THEN
        move disk N from S to D
    ELSE
        Hanoi(N - 1, S, I, D)     // Step 1: move N-1 disks from S to I, using D
        move disk N from S to D   // Step 2
        Hanoi(N - 1, I, D, S)     // Step 3: move N-1 disks from I to D, using S
    END IF
END Procedure
STOP
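A short C++ sketch of the recursive Hanoi procedure above; the function name hanoi and the pillar labels are illustrative. Called with n = 3 it prints the seven moves (2^3 - 1) listed in the example.

#include <iostream>

// 'from', 'to' and 'via' are the source, destination and intermediate pillars.
void hanoi(int n, char from, char to, char via) {
    if (n == 1) {
        std::cout << "Move disk 1 from " << from << " to " << to << '\n';
        return;
    }
    hanoi(n - 1, from, via, to);                                              // Step 1
    std::cout << "Move disk " << n << " from " << from << " to " << to << '\n'; // Step 2
    hanoi(n - 1, via, to, from);                                              // Step 3
}

int main() {
    hanoi(3, 'S', 'D', 'I');
    return 0;
}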
Lab Program
1. Add two polynomials.

#include <iostream>
using namespace std;
#define MAX 10

int main()
{
    int poly1[MAX] = {0}, poly2[MAX] = {0}, poly3[MAX] = {0};
    int i, deg1, deg2, deg3;
    cout << "\nEnter the degree of polynomial 1:";
    cin >> deg1;
    cout << "\nEnter the degree of polynomial 2:";
    cin >> deg2;
    cout << "\nFor polynomial 1";
    for (i = 0; i <= deg1; i++)
    {
        cout << "\nEnter the coefficient of exponent " << i << ":";
        cin >> poly1[i];
    }
    cout << "\nFor polynomial 2";
    for (i = 0; i <= deg2; i++)
    {
        cout << "\nEnter the coefficient of exponent " << i << ":";
        cin >> poly2[i];
    }
    deg3 = (deg1 > deg2) ? deg1 : deg2;
    for (i = 0; i <= deg3; i++)
    {
        poly3[i] = poly1[i] + poly2[i];
    }
    cout << "\nPolynomial 1:";
    for (i = deg1; i > 0; i--)
    {
        cout << poly1[i] << "x^" << i << "+";
    }
    cout << poly1[0];
    cout << "\nPolynomial 2:";
    for (i = deg2; i > 0; i--)
    {
        cout << poly2[i] << "x^" << i << "+";
    }
    cout << poly2[0];
    cout << "\n RESULT \n";
    for (i = deg3; i > 0; i--)
    {
        cout << poly3[i] << "x^" << i << "+";
    }
    cout << poly3[0];
    return 0;
}

OUTPUT:
Enter the degree of polynomial 1: 2
Enter the degree of polynomial 2: 2
For polynomial 1
Enter the coefficient of exponent 0: 2
Enter the coefficient of exponent 1: 3
Enter the coefficient of exponent 2: 4
For polynomial 2
Enter the coefficient of exponent 0: 3
Enter the coefficient of exponent 1: 4
Enter the coefficient of exponent 2: 2
Polynomial 1: 4x^2+3x^1+2
Polynomial 2: 2x^2+4x^1+3
RESULT
6x^2+7x^1+5

Unit II
Sorting algorithms: insertion, bubble, selection, quick and merge sort; comparison of sorting algorithms. Searching techniques: linear and binary search.

Sorting Algorithms
Sorting refers to arranging data in a particular format; a sorting algorithm specifies the way to arrange data in a particular order. Sorting methods can be divided into two categories.
Internal sorting: Here all the data elements to be sorted can be processed at a time in main memory.
External sorting: External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted does not fit into the main memory of the computing device (usually RAM) and must instead reside in slower external memory (usually a hard drive).
Stable and unstable sorting: If a sorting algorithm, after sorting the contents, does not change the relative order of equal elements, it is called a stable sort. If a sorting algorithm changes the relative order of equal elements, it is called an unstable sort.
In-place and not-in-place sorting: Sorting algorithms may require some extra space for comparisons and for temporary storage of a few data elements. Algorithms which do not require any extra space, and sort, for example, within the array itself, are said to sort in place; this is called in-place sorting, and bubble sort is an example. However, in some sorting algorithms the program requires extra space which is greater than or equal to the number of elements being sorted; sorting which uses this much space is called not-in-place sorting, and merge sort is an example.
Insertion Sort
Insertion sort is an efficient algorithm for sorting a small number of elements. It is similar to sorting a hand of playing cards: an element which is to be inserted into the sorted sub-list has to find its appropriate place and is then inserted there, hence the name insertion sort. Insertion sort is a comparison-based sorting algorithm which sorts the array by shifting elements one by one from the unsorted sub-array into the sorted sub-array. In each iteration, an element is picked from the input and inserted into the sorted list at the correct location.

Step-by-step process: The insertion sort algorithm is performed using the following steps.
Step 1: Assume that the first element in the list is in the sorted portion and all the remaining elements are in the unsorted portion.
Step 2: Take the first element from the unsorted portion and insert it into the sorted portion at the position that keeps the sorted portion in the specified order.
Step 3: Repeat the above process until all the elements from the unsorted portion have been moved into the sorted portion.

Algorithm Insertion_Sort(A, n)
Here A is the array to be sorted, n is the number of elements, and temp is a variable used to hold the value being inserted.
1. Repeat steps (2) to (4) for i = 1 to (n-1)
2.     Set temp = A[i]
       Set j = i - 1
3.     Repeat while j >= 0 and A[j] > temp:
           Set A[j+1] = A[j]
           Set j = j - 1
4.     Set A[j+1] = temp
5. Exit
(Also write a suitable example of the algorithm here.) A C++ sketch of this algorithm is given below.

Insertion Sort Complexity
Time complexity: Best case O(n); Average case O(n^2); Worst case O(n^2).
Space complexity: O(1).
Stability: Yes (stable).
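A minimal C++ sketch of the Insertion_Sort algorithm above; the sample data is chosen only for illustration.

#include <iostream>

// Insertion sort: elements from the unsorted part are shifted into their
// correct place in the sorted part (ascending order).
void insertionSort(int a[], int n) {
    for (int i = 1; i < n; ++i) {
        int temp = a[i];            // element to be inserted
        int j = i - 1;
        while (j >= 0 && a[j] > temp) {
            a[j + 1] = a[j];        // shift larger elements one position right
            --j;
        }
        a[j + 1] = temp;            // insert at the correct location
    }
}

int main() {
    int a[] = {12, 31, 25, 8, 32, 17};
    int n = sizeof(a) / sizeof(a[0]);
    insertionSort(a, n);
    for (int i = 0; i < n; ++i) std::cout << a[i] << ' ';  // 8 12 17 25 31 32
    std::cout << '\n';
    return 0;
}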
Bubble Sort
The working procedure of bubble sort is the simplest. Bubble sort works by repeatedly swapping adjacent elements until they are in the intended order. It is called bubble sort because the movement of array elements resembles the movement of air bubbles in water: bubbles in water rise to the surface, and similarly the array elements in bubble sort move towards the end in each iteration. Although it is simple to use, it is primarily an educational tool, because the performance of bubble sort is poor in the real world; it is not suitable for large data sets. The average and worst-case complexity of bubble sort is O(n^2), where n is the number of items. Bubble sort is mainly used where complexity does not matter and simple, short code is preferred.

Algorithm
In the algorithm given below, suppose arr is an array of n elements. The assumed swap function in the algorithm swaps the values of the given array elements.
1. begin BubbleSort(arr)
2.   for all array elements
3.     if arr[i] > arr[i+1]
4.       swap(arr[i], arr[i+1])
5.     end if
6.   end for
7.   return arr
8. end BubbleSort

Working of the Bubble Sort Algorithm
To understand the working of the bubble sort algorithm, let's take an unsorted array. We take a short and simple array, as we know the complexity of bubble sort is O(n^2). Let the elements of the array be: 13, 32, 26, 35, 10.

First Pass: Sorting starts with the first two elements; we compare them to check which is greater. Here, 32 is greater than 13 (32 > 13), so this pair is already in order. Now compare 32 with 26. Here, 26 is smaller than 32, so swapping is required; after swapping, the array becomes 13, 26, 32, 35, 10. Now compare 32 and 35. Here, 35 is greater than 32, so no swapping is required, as they are already in order. Now the comparison is between 35 and 10. Here, 10 is smaller than 35, so they are not in order and swapping is required. We have now reached the end of the array; after the first pass the array is 13, 26, 32, 10, 35. Now move on to the second iteration.

Second Pass: The same process is followed for the second iteration. Here, 10 is smaller than 32, so swapping is required; after swapping, the array becomes 13, 26, 10, 32, 35. Now move on to the third iteration.

Third Pass: The same process is followed for the third iteration. Here, 10 is smaller than 26, so swapping is required; after swapping, the array becomes 13, 10, 26, 32, 35. Now move on to the fourth iteration.

Fourth Pass: Similarly, after the fourth iteration the array becomes 10, 13, 26, 32, 35. Hence no further swapping is required, and the array is completely sorted.

Bubble Sort Complexity
Now let's see the time complexity of bubble sort in the best case, average case, and worst case. We will also see the space complexity of bubble sort.
1. Time Complexity
Case — Time Complexity
Best Case — O(n)
Average Case — O(n^2)
Worst Case — O(n^2)
Best case complexity: It occurs when no sorting is required, i.e. the array is already sorted. The best-case time complexity of bubble sort is O(n).
Average case complexity: It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of bubble sort is O(n^2).
Worst case complexity: It occurs when the array elements have to be sorted in reverse order; that is, you have to sort the elements in ascending order but they are given in descending order. The worst-case time complexity of bubble sort is O(n^2).
2. Space Complexity
Space Complexity — O(1)
Stable — Yes
The space complexity of bubble sort is O(1), because only one extra variable is required for swapping. The optimized bubble sort discussed below uses two extra variables (the swap temporary and the swapped flag), which is still a constant amount of space, i.e. O(1).

Now let's discuss the optimized bubble sort algorithm.

Optimized Bubble Sort Algorithm
In the basic bubble sort algorithm, comparisons are made even when the array is already sorted, and because of that the execution time increases. To solve this, we can use an extra variable swapped. It is set to true if swapping is required; otherwise it is set to false. This is helpful because, if after an iteration no swapping was required, the value of the variable swapped remains false, meaning that the elements are already sorted and no further iterations are required. This method reduces the execution time and optimizes bubble sort. A C++ sketch of this optimized version is given after the algorithm.
Algorithm for optimized bubble sort
1. bubbleSort(array)
2.   n = length(array)
3.   repeat
4.     swapped = false
5.     for i = 1 to n - 1
6.       if array[i - 1] > array[i], then
7.         swap(array[i - 1], array[i])
8.         swapped = true
9.       end if
10.    end for
11.    n = n - 1
12.  until not swapped
13. end bubbleSort
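A minimal C++ sketch of the optimized bubble sort with the swapped flag; the sample data is the array used in the walkthrough above.

#include <iostream>
#include <utility>   // std::swap

// The algorithm stops as soon as a full pass makes no swap.
void bubbleSort(int a[], int n) {
    bool swapped = true;
    while (swapped) {
        swapped = false;
        for (int i = 1; i < n; ++i) {
            if (a[i - 1] > a[i]) {
                std::swap(a[i - 1], a[i]);
                swapped = true;
            }
        }
        --n;   // after each pass the largest remaining element is in its final place
    }
}

int main() {
    int a[] = {13, 32, 26, 35, 10};
    int n = sizeof(a) / sizeof(a[0]);
    bubbleSort(a, n);
    for (int i = 0; i < n; ++i) std::cout << a[i] << ' ';  // 10 13 26 32 35
    std::cout << '\n';
    return 0;
}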
Selection Sort
The selection sort algorithm sorts an array by repeatedly finding the minimum element (considering ascending order) in the unsorted part and putting it at the beginning. The idea of selection sort is: first find the smallest element in the array and exchange it with the element in the first position; then find the second smallest element and exchange it with the element in the second position; this process continues until the complete array is sorted.

Algorithm SELECTION-SORT(A, n)
Here A is a one-dimensional array to be sorted and n is the size of the array.
1. Repeat steps (2) to (4) for j = 1 to n-1
2.     [initialize smallest with j] Set smallest = j
3.     Repeat for i = j+1 to n:
           if A[i] < A[smallest] then Set smallest = i
4.     Exchange A[j] and A[smallest]
5. Exit
(Also write a suitable example of the algorithm here.) A C++ sketch of this algorithm is given at the end of this section.

Working of the Selection Sort Algorithm
To understand the working of the selection sort algorithm, let's take an unsorted array; it is easier to understand selection sort via an example. Let the elements of the array be as shown in the figure, with 12 at the first position.

Now, for the first position in the sorted array, the entire array is scanned sequentially. At present, 12 is stored at the first position; after searching the entire array, it is found that 8 is the smallest value. So swap 12 with 8. After the first iteration, 8 appears at the first position of the sorted array.

For the second position, where 29 is currently stored, we again sequentially scan the rest of the items of the unsorted part. After scanning, we find that 12 is the second lowest element in the array and should appear at the second position, so we swap 29 with 12. After the second iteration, 12 appears at the second position in the sorted array. So, after two iterations, the two smallest values are placed at the beginning in sorted order. The same process is applied to the rest of the array elements (a pictorial representation of the entire sorting process would show each pass). Now the array is completely sorted.

Selection Sort Complexity
Time complexity: Best case O(n^2); Average case O(n^2); Worst case O(n^2). Selection sort always scans the whole unsorted part, so even an already sorted array takes O(n^2) comparisons.
Space complexity: O(1).
Stability: No; the usual array implementation, which swaps the minimum into place, can change the relative order of equal elements.
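A minimal C++ sketch of the SELECTION-SORT algorithm above, written with 0-based indices; the sample values are illustrative.

#include <iostream>
#include <utility>   // std::swap

// On each pass the smallest element of the unsorted part is selected
// and exchanged into its final position.
void selectionSort(int a[], int n) {
    for (int j = 0; j < n - 1; ++j) {
        int smallest = j;
        for (int i = j + 1; i < n; ++i)
            if (a[i] < a[smallest])
                smallest = i;            // remember the index of the minimum
        std::swap(a[j], a[smallest]);    // exchange A[j] and A[smallest]
    }
}

int main() {
    int a[] = {12, 29, 25, 8, 32, 17, 40};
    int n = sizeof(a) / sizeof(a[0]);
    selectionSort(a, n);
    for (int i = 0; i < n; ++i) std::cout << a[i] << ' ';  // 8 12 17 25 29 32 40
    std::cout << '\n';
    return 0;
}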
Quick Sort
Quick sort is one of the most popular sorting techniques. The algorithm works by partitioning the array to be sorted, and each partition is in turn sorted recursively. It is a fast and highly efficient sorting algorithm that follows the divide and conquer approach. Divide and conquer is a technique of breaking an algorithm down into sub-problems, solving the sub-problems, and combining the results back together to solve the original problem.
Divide: First pick a pivot element. Then partition (rearrange) the array into two sub-arrays such that each element in the left sub-array is less than or equal to the pivot element and each element in the right sub-array is larger than the pivot element. In partitioning, one of the elements is chosen as the key value and the rest of the array elements are grouped into two partitions: one partition contains elements smaller than the key value, the other contains elements larger than the key value.
Conquer: Recursively sort the two sub-arrays with quicksort.
Combine: Combine the already sorted sub-arrays.
Quicksort picks an element as the pivot and then partitions the given array around the picked pivot element. In quick sort, a large array is divided into two arrays, one of which holds values smaller than the specified value (the pivot) while the other holds values greater than the pivot. After that, the left and right sub-arrays are partitioned using the same approach, and this continues until only a single element remains in a sub-array.

Algorithm Quick_Sort(a, l, h)
Here a is the array to be sorted, l is the position of the first element in the list, and h is the position of the last element in the list.
1. [Initialization] low = l, high = h, key = a[(l + h)/2]
2. Repeat through step 7 while (low <= high)
3.     Repeat step 4 while (a[low] < key)
4.         low = low + 1
5.     Repeat step 6 while (a[high] > key)
6.         high = high - 1
7.     if (low <= high)
       a) temp = a[low]
       b) a[low] = a[high]
       c) a[high] = temp
       d) low = low + 1
       e) high = high - 1
8. if (l < high) Quick_Sort(a, l, high)
9. if (low < h) Quick_Sort(a, low, h)
10. Exit
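Before walking through an example, here is a C++ sketch of the Quick_Sort(a, l, h) scheme above, in which the key is taken from the middle position and low/high scan towards each other; the sample data is illustrative.

#include <iostream>
#include <utility>   // std::swap

void quickSort(int a[], int l, int h) {
    int low = l, high = h;
    int key = a[(l + h) / 2];
    while (low <= high) {
        while (a[low] < key)  ++low;      // find an element >= key on the left side
        while (a[high] > key) --high;     // find an element <= key on the right side
        if (low <= high) {
            std::swap(a[low], a[high]);   // exchange the out-of-place pair
            ++low;
            --high;
        }
    }
    if (l < high) quickSort(a, l, high);  // sort the left partition
    if (low < h)  quickSort(a, low, h);   // sort the right partition
}

int main() {
    int a[] = {24, 9, 29, 14, 19, 27};
    int n = sizeof(a) / sizeof(a[0]);
    quickSort(a, 0, n - 1);
    for (int i = 0; i < n; ++i) std::cout << a[i] << ' ';  // 9 14 19 24 27 29
    std::cout << '\n';
    return 0;
}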
Working of the Quick Sort Algorithm
To understand the working of quick sort, let's take an unsorted array; it will make the concept clearer. Let the elements of the array be: 24, 9, 29, 14, 19, 27.

In the given array, we consider the leftmost element as the pivot. So, in this case, a[left] = 24, a[right] = 27 and a[pivot] = 24. Since the pivot is at the left, the algorithm starts from the right and moves towards the left. Now, a[pivot] < a[right], so the algorithm moves one position towards the left; now a[left] = 24, a[right] = 19, and a[pivot] = 24. Because a[pivot] > a[right], the algorithm swaps a[pivot] with a[right], and the pivot moves to the right. Now a[left] = 19, a[right] = 24, and a[pivot] = 24. Since the pivot is at the right, the algorithm starts from the left and moves to the right. As a[pivot] > a[left], the algorithm moves one position to the right; now a[left] = 9, a[right] = 24, and a[pivot] = 24. As a[pivot] > a[left], the algorithm again moves one position to the right; now a[left] = 29, a[right] = 24, and a[pivot] = 24. As a[pivot] < a[left], we swap a[pivot] and a[left]; the pivot is now at the left. Since the pivot is at the left, the algorithm starts from the right and moves to the left. Now a[left] = 24, a[right] = 29, and a[pivot] = 24. As a[pivot] < a[right], the algorithm moves one position to the left; now a[pivot] = 24, a[left] = 24, and a[right] = 14. As a[pivot] > a[right], we swap a[pivot] and a[right]; the pivot is now at the right, i.e. a[pivot] = 24, a[left] = 14, and a[right] = 24. The pivot is at the right, so the algorithm starts from the left and moves to the right. Now a[pivot] = 24, a[left] = 24, and a[right] = 24, so pivot, left and right all point to the same element; this marks the termination of the procedure. Element 24, the pivot element, is now placed at its exact position: the elements to its right are greater than it, and the elements to its left are smaller than it. Now, in a similar manner, the quick sort algorithm is applied separately to the left and right sub-arrays. After sorting is done, the array is fully sorted.

Quicksort Complexity
Now let's see the time complexity of quicksort in the best case, average case, and worst case. We will also see the space complexity of quicksort.
1. Time Complexity
Case — Time Complexity
Best Case — O(n*logn)
Average Case — O(n*logn)
Worst Case — O(n^2)
Best case complexity: In quicksort, the best case occurs when the pivot element is the middle element or near the middle element. The best-case time complexity of quicksort is O(n*logn).
Average case complexity: It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of quicksort is O(n*logn).
Worst case complexity: In quick sort, the worst case occurs when the pivot element is either the greatest or the smallest element. For example, if the pivot element is always chosen as the last element of the array, the worst case occurs when the given array is already sorted in ascending or descending order. The worst-case time complexity of quicksort is O(n^2). Though the worst-case complexity of quicksort is higher than that of other sorting algorithms such as merge sort and heap sort, it is still faster in practice. The worst case rarely occurs, because by changing the choice of pivot quicksort can be implemented in different ways; the worst case can be avoided by choosing a good pivot element.
2. Space Complexity
Space Complexity — O(log n) on average (recursion stack); O(n) in the worst case
Stable — No
Quicksort needs no extra array, but the recursion stack uses O(log n) space on average and O(n) space in the worst case.

Merge Sort
Merge sort is a sorting technique based on the divide and conquer technique. With a worst-case time complexity of O(n*logn), it is one of the most respected algorithms, and one of the most popular and efficient sorting algorithms. It divides the given list into two equal halves, calls itself for the two halves, and then merges the two sorted halves; we have to define a merge() function to perform the merging. The sub-lists are divided again and again into halves until lists of single elements are reached. Then we combine pairs of one-element lists into two-element lists, sorting them in the process. The sorted two-element pairs are merged into four-element lists, and so on, until we obtain the fully sorted list.

Working of the Merge Sort Algorithm
To understand merge sort, we take an unsorted array such as 14, 33, 27, 10, 35, 19, 42, 44. We know that merge sort first divides the whole array iteratively into equal halves until atomic values are reached. Here we see that an array of 8 items is divided into two arrays of size 4; this does not change the order of appearance of the items in the original. Now we divide these two arrays into halves, and we keep dividing until we reach atomic values which can no longer be divided. Then we combine them in exactly the same manner as they were broken down. We first compare the elements of each pair of one-element lists and combine them into another list in sorted order: 14 and 33 are already in sorted positions; we compare 27 and 10 and, in the target list of two values, we put 10 first, followed by 27; we change the order of 19 and 35, whereas 42 and 44 are placed sequentially. In the next iteration of the combining phase, we compare lists of two data values and merge them into lists of four data values, placing all elements in sorted order. After the final merge, the list is fully sorted. Now we should learn some programming aspects of merge sort.

Algorithm
Merge sort keeps dividing the list into equal halves until it can no longer be divided. By definition, a list of one element is sorted. Then merge sort combines the smaller sorted lists, keeping the resulting list sorted too.

Merge Sort Complexity
Now let's see the time complexity of merge sort in the best case, average case, and worst case. We will also see the space complexity of merge sort.
1. Time Complexity
Case — Time Complexity
Best Case — O(n*logn)
Average Case — O(n*logn)
Worst Case — O(n*logn)
Best case complexity: It occurs when no sorting is required, i.e. the array is already sorted. The best-case time complexity of merge sort is O(n*logn).
Average case complexity: It occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of merge sort is O(n*logn).
Worst case complexity: It occurs when the array elements have to be sorted in reverse order. The worst-case time complexity of merge sort is O(n*logn).
2. Space Complexity
Space Complexity — O(n)
Stable — Yes
The space complexity of merge sort is O(n), because merging the two sorted halves requires an extra temporary array whose size is proportional to the number of elements being merged.

Algorithm
In the following algorithm, arr is the given array, beg is the index of the starting element, and end is the index of the last element of the array.
MERGE_SORT(arr, beg, end) if beg < end set mid = (beg + end)/2 MERGE_SORT(arr, beg, mid) MERGE_SORT(arr, mid + 1, end) MERGE (arr, beg, mid, end) end of if END MERGE_SORT // C++ program for Merge Sort #include <iostream> using namespace std; // Merges two subarrays of array[]. // First subarray is arr[begin..mid] // Second subarray is arr[mid+1..end] void merge(int array[], int const left, int const mid, int const right) { auto const subArrayOne = mid - left + 1; auto const subArrayTwo = right - mid; // Create temp arrays auto *leftArray = new int[subArrayOne], *rightArray = new int[subArrayTwo]; // Copy data to temp arrays leftArray[] and rightArray[] for (auto i = 0; i < subArrayOne; i++) leftArray[i] = array[left + i]; for (auto j = 0; j < subArrayTwo; j++) rightArray[j] = array[mid + 1 + j]; auto indexOfSubArrayOne = 0, // Initial index of first sub-array indexOfSubArrayTwo = 0; // Initial index of second sub-array int indexOfMergedArray = left; // Initial index of merged array // Merge the temp arrays back into array[left..right] while (indexOfSubArrayOne < subArrayOne && indexOfSubArrayTwo < subArrayTwo) { if (leftArray[indexOfSubArrayOne] <= rightArray[indexOfSubArrayTwo]) { array[indexOfMergedArray] = leftArray[indexOfSubArrayOne]; indexOfSubArrayOne++; } else { array[indexOfMergedArray] = rightArray[indexOfSubArrayTwo]; indexOfSubArrayTwo++; } indexOfMergedArray++; } // Copy the remaining elements of // left[], if there are any while (indexOfSubArrayOne < subArrayOne) { array[indexOfMergedArray] = leftArray[indexOfSubArrayOne]; indexOfSubArrayOne++; indexOfMergedArray++; } // Copy the remaining elements of // right[], if there are any while (indexOfSubArrayTwo < subArrayTwo) { array[indexOfMergedArray] = rightArray[indexOfSubArrayTwo]; indexOfSubArrayTwo++; indexOfMergedArray++; } } // begin is for left index and end is // right index of the sub-array // of arr to be sorted */ void mergeSort(int array[], int const begin, int const end) { if (begin >= end) return; // Returns recursively auto mid = begin + (end - begin) / 2; mergeSort(array, begin, mid); mergeSort(array, mid + 1, end); merge(array, begin, mid, end); } // UTILITY FUNCTIONS // Function to print an array void printArray(int A[], int size) { for (auto i = 0; i < size; i++) cout << A[i] << " "; } // Driver code int main() { int arr[] = { 12, 11, 13, 5, 6, 7 }; auto arr_size = sizeof(arr) / sizeof(arr[0]); cout << "Given array is \n"; printArray(arr, arr_size); mergeSort(arr, 0, arr_size - 1); cout << "\nSorted array is \n"; printArray(arr, arr_size); return 0; } Searching of Array Element Searching is used to find the location where element is available or not. There are two types of searching techniques as given below: 1. Linear or sequential searching 2. Binary searching Linear or sequential searching In linear search, search each element of an array one by one sequentially and see whether it is desired element or not. A search will be unsuccessful if all the elements are accessed and the desired element is not found. Algorithm Linear_Search(a, n, item) Here a is a linear array with n elements, and item is a given item of information. This algorithm finds the location loc of item in a, or set loc=0 if the search is unsuccessful. Linear_Search(data, n, item) 1. [insert the item at the end of the array data] Set data[n+1]=item 2. [initialize counter] Set loc=1 3. Repeat while data[loc]!=item Set loc=loc+1 4. [unsuccessful] If loc=n+1, then set loc=0 5. 
Working of Linear search
Now, let's see the working of the linear search algorithm. To understand the working of the linear search algorithm, let's take an unsorted array; it will be easy to understand the working of linear search with an example. Let the elements of the array be as shown in the figure, and let the element to be searched be K = 41. Now, start from the first element and compare K with each element of the array. The value of K, i.e. 41, does not match the first element of the array, so we move to the next element, and we follow the same process until the required element is found. Once the element is found, the algorithm returns the index of the matched element.

Linear Search complexity
Now, let's see the time complexity of linear search in the best case, average case, and worst case. We will also see the space complexity of linear search.

1. Time Complexity
Case            Time Complexity
Best Case       O(1)
Average Case    O(n)
Worst Case      O(n)

Best Case Complexity - In linear search, the best case occurs when the element we are finding is at the first position of the array. The best-case time complexity of linear search is O(1).
Average Case Complexity - The average-case time complexity of linear search is O(n).
Worst Case Complexity - In linear search, the worst case occurs when the element we are looking for is present at the end of the array, or when the target element is not present in the array at all and we have to traverse the entire array. The worst-case time complexity of linear search is O(n).
The time complexity of linear search is O(n) because every element in the array is compared at most once.

2. Space Complexity
Space Complexity   O(1)
The space complexity of linear search is O(1).

Binary Searching
The binary search technique searches for the given item in the minimum possible number of comparisons. To do a binary search, the array elements must first be sorted. The logic behind this technique is given below:
1. First find the middle element of the array.
2. Compare the middle element with the item.
3. There are three cases:
   a. If it is the desired element, then the search is successful.
   b. If the desired item is less than the middle element, then search only the first half of the array.
   c. If the desired item is greater than the middle element, then search only the second half of the array.

Algorithm Binary_Search(data, LB, UB, item)
Here data is a sorted array with lower bound LB and upper bound UB, and item is a given item of information. The variables beg, end and mid denote, respectively, the beginning, end and middle locations of a segment of elements of data. This algorithm finds the location of item in loc, or sets loc = NULL.

Binary_Search(data, LB, UB, item)
1. [Initialize segment variables] Set beg = LB, end = UB, mid = int((beg + end)/2)
2. Repeat steps 3 and 4 while beg <= end and data[mid] != item
3.     If item < data[mid], then
           Set end = mid - 1
       Else
           Set beg = mid + 1
4.     Set mid = int((beg + end)/2)
5. If data[mid] = item, then
       Set loc = mid
   Else
       Set loc = NULL
6. Exit

Working of Binary search
Now, let's see the working of the binary search algorithm. To understand the working of the binary search algorithm, let's take a sorted array; it will be easy to understand the working of binary search with an example. There are two methods to implement the binary search algorithm:
   Iterative method
   Recursive method
The recursive method of binary search follows the divide and conquer approach.
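Before the worked example, here is a minimal iterative C++ sketch of the algorithm above, using 0-based indexing and returning -1 instead of NULL when the item is absent; the names binarySearch, arr and key, and the sample values, are illustrative. A recursive variant simply calls itself on the chosen half instead of adjusting beg and end in a loop.

#include <iostream>
using namespace std;

// Iterative binary search on a sorted array arr[0..n-1].
// Returns the index of key, or -1 if key is not present.
int binarySearch(const int arr[], int n, int key)
{
    int beg = 0, end = n - 1;
    while (beg <= end) {
        int mid = (beg + end) / 2;     // middle of the current segment
        if (arr[mid] == key)
            return mid;                // desired element found
        else if (key < arr[mid])
            end = mid - 1;             // search the first half
        else
            beg = mid + 1;             // search the second half
    }
    return -1;                         // unsuccessful search
}

int main()
{
    int a[] = { 10, 12, 24, 29, 39, 40, 51, 56, 69 };   // must be sorted
    cout << binarySearch(a, 9, 56) << endl;             // prints 7
    return 0;
}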
Let the elements of the array be as shown in the figure, and let the element to be searched be K = 56. We use the following formula to calculate the mid of the array:
1. mid = (beg + end)/2
So, in the given array, beg = 0 and end = 8, hence mid = (0 + 8)/2 = 4; that is, 4 is the mid of the array. We compare K with the element at the mid position; if they are not equal, the search continues in the left or the right half (recomputing mid each time) until the element to be searched is found. The algorithm then returns the index of the matched element.

Binary Search complexity
Now, let's see the time complexity of binary search in the best case, average case, and worst case. We will also see the space complexity of binary search.

1. Time Complexity
Case            Time Complexity
Best Case       O(1)
Average Case    O(logn)
Worst Case      O(logn)

Best Case Complexity - In binary search, the best case occurs when the element to be searched is found in the first comparison, i.e. when the first middle element itself is the element to be searched. The best-case time complexity of binary search is O(1).
Average Case Complexity - The average-case time complexity of binary search is O(logn).
Worst Case Complexity - In binary search, the worst case occurs when we have to keep reducing the search space until it has only one element. The worst-case time complexity of binary search is O(logn).

2. Space Complexity
Space Complexity   O(1)
The space complexity of binary search is O(1).

Unit III: Stack: Operations on stack; array representation. Application of stack- i. Postfix expression evaluation. ii. Conversion of infix to postfix expression. Queues: Operation on queue. Circular queue; Dequeue, and priority queue. Application of queue: Job scheduling.

Stack: Operations on stack
A stack is an ordered collection of homogeneous data elements where the insertion and deletion operations take place at one end only. A stack is a linear data structure that follows a particular order in which the operations are performed; the order may be LIFO (Last In First Out) or, equivalently, FILO (First In Last Out). A stack is a linear list in which insertions and deletions are allowed only at one end, called the top. A stack is an abstract data type (ADT) and can be implemented in most programming languages. It is named a stack because it behaves like a real-world stack, for example: shunting of trains in a railway yard, shipment in a cargo, plates on a tray, etc.

Representation of a stack
The stack can be implemented in two ways:
a. Using a one-dimensional array.
b. Using a (pointer-based) single linked list.

Array Representation of Stacks
First we have to allocate a memory block of sufficient size to accommodate the full capacity of the stack. Then, starting from the first location of the memory block, the items of the stack can be stored in a sequential fashion. In Figure 4.3(a), ITEMi denotes the ith item in the stack; l and u denote the index range of the array in use, and usually the values of these indices are 1 and SIZE respectively. TOP is a pointer to the position in the array up to which it is filled with the items of the stack. With this representation, the following two states of the stack can be stated:
EMPTY: TOP < 1
FULL:  TOP >= SIZE

Operations on stack
Push: Adds an item to the stack. If the stack is full, this is said to be an Overflow condition.
Pop: Removes an item from the stack. The items are popped in the reverse order in which they were pushed. If the stack is empty, this is said to be an Underflow condition.
Status: To know the present state of the stack.
Peek or Top: Returns the top element of the stack.
isEmpty: Returns true if the stack is empty, else false.
Let us define all these operations of a stack.
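As a quick preview of the array representation, here is a compact C++ sketch of push, pop and peek; the Stack name, the fixed SIZE and the 0-based "top = -1 means empty" convention are illustrative choices, not from the text. The formal algorithms follow.

#include <iostream>
using namespace std;

const int SIZE = 100;              // capacity of the stack

struct Stack {
    int a[SIZE];
    int top = -1;                  // -1 means the stack is empty

    bool isEmpty() const { return top < 0; }

    void push(int item) {
        if (top >= SIZE - 1) { cout << "Stack is full\n"; return; }   // overflow
        a[++top] = item;
    }

    int pop() {
        if (isEmpty()) { cout << "Stack is empty\n"; return -1; }     // underflow
        return a[top--];
    }

    int peek() const { return a[top]; }   // top element (stack assumed non-empty)
};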
First, we will consider the abovementioned operations for a stack represented by an array.

Algorithm Push_Array
Input: The new item ITEM to be pushed onto the stack.
Output: A stack with the newly pushed ITEM at the TOP position.
Data structure: An array A with TOP as the pointer.
Steps:
1. If TOP >= SIZE then
2.     Print "Stack is full"
3. Else
4.     TOP = TOP + 1
5.     A[TOP] = ITEM
6. EndIf
7. Stop

Here, we have assumed that the array index varies from 1 to SIZE and TOP points to the location of the current top-most item in the stack. The following algorithm Pop_Array defines the POP of an item from a stack which is represented using an array A.

Algorithm Pop_Array
Input: A stack with elements.
Output: Removes an ITEM from the top of the stack if it is not empty.
Data structure: An array A with TOP as the pointer.
Steps:
1. If TOP < 1 then
2.     Print "Stack is empty"
3. Else
4.     ITEM = A[TOP]
5.     TOP = TOP - 1
6. EndIf
7. Stop

In the following algorithm Status_Array, we test the various states of a stack, such as whether it is full or empty, how much free space is left, and read the current element at the top without removing it.

Algorithm Status_Array
Input: A stack with elements.
Output: States whether it is empty or full, the available free space and the item at TOP.
Data structure: An array A with TOP as the pointer.
Steps:
1. If TOP < 1 then
2.     Print "Stack is empty"
3. Else
4.     If (TOP >= SIZE) then
5.         Print "Stack is full"
6.     Else
7.         Print "The element at TOP is", A[TOP]
8.         free = (SIZE - TOP)/SIZE * 100
9.         Print "Percentage of free stack is", free
10.    EndIf
11. EndIf
12. Stop

Application of stack
In a stack, only a limited set of operations is performed because it is a restricted data structure. The elements are deleted from the stack in the reverse order. Following are the applications of stack:
1. Expression Evaluation
2. Expression Conversion
   i. Infix to Postfix
   ii. Infix to Prefix
   iii. Postfix to Infix
   iv. Prefix to Infix
3. Backtracking
4. Memory Management
5. Delimiter Checking

Expression Representation
There are three popular methods used for representation of an expression:
Infix    A + B    Operator between operands.
Prefix   + A B    Operator before operands.
Postfix  A B +    Operator after operands.

Evaluation of Arithmetic Expressions
To convert any infix expression into a postfix or prefix expression we can use the following procedure:
   Find all the operators in the given infix expression.
   Find the order in which the operators are evaluated according to their operator precedence.
   Convert each operator into the required type of expression (postfix or prefix) in the same order.

Example
Consider the following infix expression to be converted into a postfix expression.
D = A + B * C
1) The operators in the given infix expression: =, +, *
2) The order of the operators according to their precedence: *, +, =
3) Now, convert the first operator *:   D = A + BC*
4) Convert the next operator +:         D = ABC*+
5) Convert the next operator =:         DABC*+=
Finally, the given infix expression is converted into the postfix expression DABC*+=.

Postfix Evaluation
A postfix expression is a collection of operators and operands in which the operator is placed after the operands. That means, in a postfix expression the operator follows the operands. A postfix expression has the following general structure:
Operand1 Operand2 Operator

Algorithm
A postfix expression can be evaluated using the stack data structure. To evaluate a postfix expression using a stack we can use the following steps.
Step 1. Create an empty stack.
Step 2. Read the postfix expression from left to right.
Step 3. If an operand is encountered, push it onto the stack.
Step 4. If an operator is encountered:
   a) Pop two elements from the stack, where A is the top element and B is the next-to-top element.
   b) Evaluate B operator A.
   c) Push the result of B operator A back onto the stack.
Step 5. Set result = top element of the stack.
Step 6. Exit.

For example:
Infix notation:    (2+4) * (4+6)
Postfix notation:  2 4 + 4 6 + *

Algorithm Evaluate_Postfix
Input: E, an expression in postfix notation, with the values of the operands appearing in the expression.
Output: Value of the expression.
Data structure: Array representation of a stack with TOP as the pointer to the top-most element.
Steps:
1. Append a special delimiter '#' at the end of the expression E
2. item = E.ReadSymbol()            // Read the first symbol from E
3. While (item != '#') do
4.     If (item = operand) then
5.         PUSH(item)               // Operands are pushed into the stack
6.     Else
7.         op = item                // The item is an operator
8.         y = POP()                // The right-most operand of the current operator
9.         x = POP()                // The left-most operand of the current operator
10.        t = x op y               // Perform the operation with operator 'op' and operands x, y
11.        PUSH(t)                  // Push the result into the stack
12.    EndIf
13.    item = E.ReadSymbol()        // Read the next item from E
14. EndWhile
15. value = POP()                   // Get the value of the expression
16. Return(value)
17. Stop

Evaluation of Postfix Expression
The postfix notation is used to represent algebraic expressions. Expressions written in postfix form are evaluated faster than infix notation, as parentheses are not required in postfix. Following is an algorithm for evaluating postfix expressions:
1) Create a stack to store operands (or values).
2) Scan the given expression and do the following for every scanned element: if the element is a number, push it into the stack; if the element is an operator, pop the operands for the operator from the stack, evaluate the operator and push the result back onto the stack.
3) When the expression has ended, the number left in the stack is the final answer.

Example: Consider the postfix expression 2 4 + 4 6 + * obtained above, and scan its elements one by one. 2 and 4 are pushed; on reading +, they are popped and their sum 6 is pushed; 4 and 6 are then pushed; the next + pops them and pushes 10; finally * pops 6 and 10 and pushes 60. There are no more elements to scan, so we return the top element of the stack (the only element left), which is the value of the expression.
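The evaluation procedure above can be sketched in a few lines of C++. This is only an illustrative implementation for single-digit operands and the four basic operators; the function name evaluatePostfix and the use of std::stack are my own choices, not from the text.

#include <iostream>
#include <stack>
#include <string>
#include <cctype>
using namespace std;

// Evaluate a postfix expression containing single-digit operands
// and the operators + - * /.
int evaluatePostfix(const string& expr)
{
    stack<int> st;
    for (char c : expr) {
        if (isdigit(c)) {
            st.push(c - '0');              // operand: push its value
        } else if (c == '+' || c == '-' || c == '*' || c == '/') {
            int a = st.top(); st.pop();    // right operand (top of stack)
            int b = st.top(); st.pop();    // left operand (next to top)
            switch (c) {                   // evaluate b operator a
                case '+': st.push(b + a); break;
                case '-': st.push(b - a); break;
                case '*': st.push(b * a); break;
                case '/': st.push(b / a); break;
            }
        }
        // any other character (e.g. a space) is ignored
    }
    return st.top();                       // result is the only value left
}

int main()
{
    cout << evaluatePostfix("24+46+*") << endl;   // prints 60
    return 0;
}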
Conversion of Infix to Postfix
Algorithm for Infix to Postfix
Step 1: Consider the next element in the input.
Step 2: If it is an operand, display it.
Step 3: If it is an opening parenthesis, insert it on the stack.
Step 4: If it is an operator, then
   If the stack is empty, insert the operator on the stack.
   If the top of the stack is an opening parenthesis, insert the operator on the stack.
   If it has higher priority than the top of the stack, insert the operator on the stack.
   Else, delete the operator from the stack and display it, and repeat Step 4.
Step 5: If it is a closing parenthesis, delete the operators from the stack and display them until an opening parenthesis is encountered. Delete and discard the opening parenthesis.
Step 6: If there is more input, go to Step 1.
Step 7: If there is no more input, delete the remaining operators from the stack to the output.

Example: Suppose we are converting the expression 3*3/(4-1)+6*2 into postfix form. The following table shows the evaluation of infix to postfix:

Symbol   Stack    Output
3        Empty    3
*        *        3
3        *        33
/        /        33*
(        /(       33*
4        /(       33*4
-        /(-      33*4
1        /(-      33*41
)        /        33*41-
+        +        33*41-/
6        +        33*41-/6
*        +*       33*41-/6
2        +*       33*41-/62
         Empty    33*41-/62*+

So, the postfix expression is 33*41-/62*+.

Algorithm Infix_to_Postfix
Input: E, a simple arithmetic expression in infix notation, delimited at the end by a right parenthesis; incoming priority (ICP) and in-stack priority (ISP) values for all possible symbols in an arithmetic expression.
Output: An arithmetic expression in postfix notation.
Data structure: Array representation of a stack with TOP as the pointer to the top-most element.
Steps:
1. TOP = 0, PUSH('(')                        // Initialize the stack
2. While (TOP > 0) do
3.     item = E.ReadSymbol()                 // Scan the next symbol in the infix expression
4.     x = POP()
5.     Case: item = operand                  // The symbol is an operand
6.         PUSH(x)                           // The stack remains the same
7.         Output(item)                      // Add the symbol to the output expression
8.     Case: item = ')'                      // Scan reaches a right parenthesis
9.         While (x != '(') do               // Till the left match is found
10.            Output(x)
11.            x = POP()
12.        EndWhile
13.    Case: ISP(x) >= ICP(item)
14.        While (ISP(x) >= ICP(item)) do
15.            Output(x)
16.            x = POP()
17.        EndWhile
18.        PUSH(x)
19.        PUSH(item)
20.    Case: ISP(x) < ICP(item)
21.        PUSH(x)
22.        PUSH(item)
23.    Otherwise:
24.        Print "Invalid expression"
25. EndWhile
26. Stop

Note: This is a procedure, irrespective of the type of memory representation of the stack, to convert an infix expression to its postfix form using the basic operations of the stack, namely PUSH and POP. These operations can be replaced with their respective versions and hence the procedure is implementable for a stack with any type of memory representation.

2. Infix to Prefix
Algorithm for Infix to Prefix Conversion:
Step 1:
Step 2: Scan A from right to left and repeat Steps 3 to 6 for each element of A until the stack is empty.
Step 3: If an operand is encountered, add it to B.
Step 4: If a right parenthesis is encountered, insert it onto the stack.
Step 5: If an operator is encountered then,
   a. Delete from the stack and add to B each operator on the top of the stack which has the same or higher precedence than the operator.
   b. Add the operator to the stack.
Step 6: If a left parenthesis is encountered then,
   a. Delete from the stack and add to B each operator on the top of the stack until a left parenthesis is encountered.
   b. Remove the left parenthesis.
Step 7: Exit

Queues:
A queue is a linear data structure in which the insertion and deletion operations are performed at two different ends. It is a collection of similar data items in which insertion and deletion operations are performed based on the FIFO (First In First Out) principle. In a queue data structure, adding and removing of elements are performed at two different positions: the insertion is performed at one end and the deletion is performed at the other end. The insertion operation is performed at a position known as REAR and the deletion operation is performed at a position known as FRONT.

Operation on queue
The following operations are performed on a queue data structure:
Enqueue - To insert an element into the queue
Dequeue - To delete an element from the queue
Display - To display the elements of the queue

Algorithm Enqueue
Input: An element ITEM that has to be inserted.
Output: The ITEM is at the REAR of the queue.
Data structure: Q is the array representation of a queue structure; the two pointers FRONT and REAR of the queue Q are known.
Steps:
1. If (REAR = N) then                        // Queue is full
2.     Print "Queue is full"
3.     Exit
4. Else
5.     If (REAR = 0) and (FRONT = 0) then    // Queue is empty
6.         FRONT = 1
7.     EndIf
8.     REAR = REAR + 1
9.     Q[REAR] = ITEM                        // Insert the item into the queue at REAR
10. EndIf
11. Stop

The deletion operation DEQUEUE can be defined as given below:

Algorithm Dequeue
Input: A queue with elements. FRONT and REAR are the two pointers of the queue Q.
Output: The deleted element is stored in ITEM.
Data structures: Q is the array representation of a queue structure.
Steps:
1. If (FRONT = 0) then
2.     Print "Queue is empty"
3.     Exit
4. Else
5.     ITEM = Q[FRONT]                       // Get the element
6.     If (FRONT = REAR) then                // When the queue contains a single element
7.         REAR = 0                          // Queue becomes empty
8.         FRONT = 0
9.     Else
10.        FRONT = FRONT + 1
11.    EndIf
12. EndIf
13. Stop

Representation of Queues
A queue data structure can be implemented in two ways:
1) Using an array
2) Using a linked list
When a queue is implemented using an array, that queue can organize only a limited number of elements. When a queue is implemented using a linked list, that queue can organize an unlimited number of elements.

Representation of Queue Using Array
A queue data structure can be implemented using a one-dimensional array. But a queue implemented using an array can store only a fixed number of data values. The implementation of a queue using an array is very simple: just define a one-dimensional array of a specific size and insert or delete values by using the FIFO (First In First Out) principle with the help of the variables 'front' and 'rear'. Initially both 'front' and 'rear' are set to -1. Whenever we want to insert a new value into the queue, we increment the 'rear' value by one and then insert at that position. Whenever we want to delete a value from the queue, we increment the 'front' value by one and then display the value at the 'front' position as the deleted element.

The three states of a queue with this representation are given below:
Queue is empty:               FRONT = 0, REAR = 0
Queue is full:                REAR = N, FRONT = 1 (when full by compaction)
Queue contains >= 1 elements: FRONT <= REAR, number of elements = REAR - FRONT + 1

Limitations of Linear Queue
In a normal queue data structure, we can insert elements until the queue becomes full. But once the queue becomes full, we cannot insert the next element until all the elements are deleted from the queue. For example, consider the queue in the figure after inserting all the elements into the queue.

Various Queue Structures
So far we have discussed two different queue structures, that is, either using an array or using a linked list (and a variation of a queue structure using an array as an assignment). Other than these, there are some more well-known queue structures.

Circular Queue
In a queue represented using an array, when the REAR pointer reaches the end, insertion will be denied even if room is available at the front. One way to avoid this is to use a circular array. Physically, a circular array is the same as an ordinary array, say A[1 ... N], but logically it implies that A[1] comes after A[N], or, after A[N], A[1] appears. The figure shows the logical and physical views of a circular array. The principle underlying the representation of a circular array is as stated below:
Both pointers will move in a clockwise direction. This is controlled by the MOD operation; for example, if the current pointer is at i, then the shift to the next location will be (i MOD LENGTH) + 1, 1 <= i <= LENGTH (where LENGTH is the queue length).
Thus, if i = LENGTH (that is, at the end), then the next position for the pointer is 1. With this principle, the two boundary states of the queue, i.e. empty or full, are decided as follows:
Circular queue is empty:  FRONT = 0 and REAR = 0
Circular queue is full:   FRONT = (REAR MOD LENGTH) + 1
The following two algorithms describe the insertion and deletion operations on a circular queue.

Operations on Circular Queue
enQueue(value) - Inserting a value into the circular queue.
In a circular queue, enQueue() is a function which is used to insert an element into the circular queue. In a circular queue, the new element is always inserted at the REAR position. The enQueue() function takes one integer value as a parameter and inserts that value into the circular queue.

Algorithm Enqueue_CQ
Input: An element ITEM to be inserted into the circular queue.
Output: A circular queue with the newly inserted ITEM, if the queue is not full.
Data structures: CQ is the array representation of the circular queue. The two pointers FRONT and REAR are known.
Steps:
1. If (FRONT = 0) then                       // When the queue is empty
2.     FRONT = 1
3.     REAR = 1
4.     CQ[FRONT] = ITEM
5. Else                                      // Queue is not empty
6.     next = (REAR MOD LENGTH) + 1
7.     If (next != FRONT) then               // If the queue is not full
8.         REAR = next
9.         CQ[REAR] = ITEM
10.    Else
11.        Print "Queue is full"
12.    EndIf
13. EndIf
14. Stop

deQueue() - Deleting a value from the circular queue.
In a circular queue, deQueue() is a function used to delete an element from the circular queue. In a circular queue, the element is always deleted from the FRONT position. The deQueue() function does not take any value as a parameter.

Algorithm Dequeue_CQ
Input: A queue CQ with elements. The two pointers FRONT and REAR are known.
Output: The deleted element is ITEM, if the queue is not empty.
Data structures: CQ is the array representation of the circular queue.
Steps:
1. If (FRONT = 0) then
2.     Print "Queue is empty"
3.     Exit
4. Else
5.     ITEM = CQ[FRONT]
6.     If (FRONT = REAR) then                // If the queue contains a single element
7.         FRONT = 0
8.         REAR = 0
9.     Else
10.        FRONT = (FRONT MOD LENGTH) + 1
11.    EndIf
12. EndIf
13. Stop
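The two algorithms above can be sketched as a small C++ structure. This is only an illustrative version using 0-based indexing, where front = -1 plays the role of FRONT = 0 in the text; the names CircularQueue, enQueue and deQueue follow the discussion, while LENGTH is an arbitrary fixed capacity.

#include <iostream>
using namespace std;

const int LENGTH = 5;                        // illustrative fixed capacity

struct CircularQueue {
    int cq[LENGTH];
    int front = -1, rear = -1;               // -1 corresponds to FRONT = 0 (empty) in the text

    void enQueue(int item) {
        int next = (rear + 1) % LENGTH;      // clockwise move of REAR
        if (front == -1) {                   // queue is empty
            front = rear = 0;
            cq[rear] = item;
        } else if (next == front) {          // next position would collide with FRONT
            cout << "Queue is full\n";
        } else {
            rear = next;
            cq[rear] = item;
        }
    }

    int deQueue() {
        if (front == -1) { cout << "Queue is empty\n"; return -1; }
        int item = cq[front];
        if (front == rear)                   // queue had a single element
            front = rear = -1;
        else
            front = (front + 1) % LENGTH;    // clockwise move of FRONT
        return item;
    }
};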
Deque
Another variation of the queue is known as the deque (pronounced 'deck'). Unlike a queue, in a deque both insertion and deletion operations can be made at either end of the structure. Actually, the term deque originates from double-ended queue. Such a structure is shown in the figure. It is clear from the deque structure that it is a general representation of both the stack and the queue; in other words, a deque can be used as a stack as well as a queue. There are various ways of representing a deque on the computer. One simple way to represent it is by using a double linked list. Another popular representation uses a circular array (as used in a circular queue).

The following four operations are possible on a deque which consists of a list of items:
1. Push_DQ(ITEM): To insert ITEM at the FRONT end of the deque.
2. Pop_DQ(): To remove the FRONT item from the deque.
3. Inject(ITEM): To insert ITEM at the REAR end of the deque.
4. Eject(): To remove the REAR item from the deque.
These operations are described for a deque based on a circular array of length LENGTH. Let the array be DQ[1 ... LENGTH].

Algorithm Push_DQ
Input: ITEM to be inserted at the FRONT.
Output: A deque with the newly inserted element ITEM, if it is not already full.
Data structures: DQ is the circular array representation of the deque.
Steps:
1. If (FRONT = 1) then                             // FRONT is at extreme left
2.     ahead = LENGTH
3. Else
4.     If (FRONT = LENGTH) or (FRONT = 0) then     // FRONT is at extreme right or the deque is empty
5.         ahead = 1
6.     Else
7.         ahead = FRONT - 1                       // FRONT is at an intermediate position
8.     EndIf
9. EndIf
10. If (ahead = REAR) then
11.    Print "Deque is full"
12.    Exit
13. Else
14.    FRONT = ahead
15.    DQ[FRONT] = ITEM                            // Push the ITEM
16. EndIf
17. Stop

Algorithm Pop_DQ
/* This algorithm is the same as the algorithm Dequeue_CQ */

Algorithm Inject
/* This algorithm is the same as the algorithm Enqueue_CQ */

Algorithm Eject_DQ
Input: A deque with elements in it.
Output: The item deleted from the REAR end.
Data structures: DQ is the circular array representation of the deque.
Steps:
1. If (FRONT = 0) then
2.     Print "Deque is empty"
3.     Exit
4. Else
5.     If (FRONT = REAR) then          // The deque contains a single element
6.         ITEM = DQ[REAR]
7.         FRONT = REAR = 0            // Deque becomes empty
8.     Else
9.         If (REAR = 1) then          // REAR is at extreme left
10.            ITEM = DQ[REAR]
11.            REAR = LENGTH
12.        Else                        // REAR is at extreme right or at an intermediate position
13.            ITEM = DQ[REAR]
14.            REAR = REAR - 1
15.        EndIf
16.    EndIf
17. EndIf
18. Stop

There are, however, two known variations of the deque:
1. Input-restricted deque
2. Output-restricted deque
These two variations are actually intermediate between a queue and a deque. Specifically, an input-restricted deque is a deque which allows insertions at one end (say the REAR end) only, but allows deletions at both ends. Similarly, an output-restricted deque is a deque where deletions take place at one end only (say the FRONT end), but insertions are allowed at both ends.

Priority queue
A priority queue is another variation of the queue structure. Here, each element has been assigned a value, called the priority of the element, and an element can be inserted or deleted not only at the ends but at any position in the queue. With this structure, an element X of priority p may be deleted before an element which is at FRONT. Similarly, insertion of an element is based on its priority, that is, instead of adding it after the REAR it may be inserted at an intermediate position dictated by its priority value. Note that the name priority queue is a misnomer in the sense that the data structure is not a queue as per the definition; a priority queue does not strictly follow the first-in first-out (FIFO) principle which is the basic principle of a queue. Nevertheless, the name is now firmly associated with this particular data type.

However, there are various models of priority queue known in different applications. Let us consider a particular model of priority queue:
1. An element of higher priority is processed before any element of lower priority.
2. Two elements with the same priority are processed according to the order in which they were added to the queue.
Here, "process" means the two basic operations, namely insertion and deletion.

There are various ways of implementing the structure of a priority queue:
(i) Using a simple/circular array
(ii) Multi-queue implementation
(iii) Using a double linked list
(iv) Using a heap tree
We will now see what each of these implementations is. (The heap tree implementation of a priority queue will be discussed in Chapter 7 of this text.)

Priority queue using an array
With this representation, an array can be maintained to hold the items and their priority values. Elements are inserted at the REAR end as usual. The deletion operation is then performed in either of the two following ways:
a) Starting from the FRONT pointer, traverse the array for the element of the highest priority.
Delete this element from the queue. If this is not the front-most element, shift all the trailing elements after the deleted element one stroke each to fill up the vacant position (see the figure). This implementation, however, is very inefficient as it involves searching the queue for the highest-priority element and shifting the trailing elements after the deletion. A better implementation is as follows:
b) Add the elements at the REAR end as earlier. Using a stable sorting algorithm, sort the elements of the queue so that the highest-priority element is at the FRONT end. When a deletion is required, delete it from the FRONT end only (see the figure).

There are two types of priority queues:
1) Max Priority Queue
2) Min Priority Queue

1. Max Priority Queue
In a max priority queue, elements are inserted in the order in which they arrive at the queue, and the maximum value is always removed first from the queue. For example, assume that we insert the elements in the order 8, 3, 2, 5; then they are removed in the order 8, 5, 3, 2.

2. Min Priority Queue
A min priority queue is similar to a max priority queue except that, instead of removing the maximum element first, we remove the minimum element first.
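As an aside, the C++ standard library already provides both flavours through std::priority_queue (a heap-based implementation rather than the array schemes just described); the short sketch below simply reproduces the 8, 3, 2, 5 example above.

#include <iostream>
#include <queue>
#include <vector>
#include <functional>
using namespace std;

int main()
{
    // Max priority queue: the largest value is removed first.
    priority_queue<int> maxPQ;
    for (int x : {8, 3, 2, 5}) maxPQ.push(x);
    while (!maxPQ.empty()) { cout << maxPQ.top() << " "; maxPQ.pop(); }   // 8 5 3 2
    cout << endl;

    // Min priority queue: the smallest value is removed first.
    priority_queue<int, vector<int>, greater<int>> minPQ;
    for (int x : {8, 3, 2, 5}) minPQ.push(x);
    while (!minPQ.empty()) { cout << minPQ.top() << " "; minPQ.pop(); }   // 2 3 5 8
    cout << endl;
    return 0;
}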
Application of queue: Job scheduling
CPU Scheduling in a Multiprogramming Environment
In a multiprogramming environment, a single CPU has to serve more than one program simultaneously; the figure shows the various programs in such an environment. Let us consider a multiprogramming environment where the possible jobs for the CPU are categorized into three groups:
1. Interrupts to be serviced. A variety of devices and terminals are connected to the CPU and they may interrupt the CPU at any moment to get a particular service from it.
2. Interactive users to be serviced. These are mainly users' programs under execution at various terminals.
3. Batch jobs to be serviced. These are long-term jobs, mainly from non-interactive users, where all the inputs are fed when the jobs are submitted; simulation programs and jobs to print documents are of this kind.
Here the problem is to schedule all sorts of jobs so that the required level of performance of the environment is attained. One way to implement complex scheduling is to classify the workload according to its characteristics and to maintain separate process queues. So far as this environment is concerned, we can maintain three queues, as depicted in the figure. This approach is often called multi-level queue scheduling. Processes are assigned to their respective queues, and the CPU then services the processes as per the priority of the queues. In the case of a simple strategy, absolute priority, the processes from the highest-priority queue (for example, system processes) are serviced until that queue becomes empty. Then the CPU switches to the queue of interactive processes, which has medium priority, and so on. A lower-priority process may, of course, be pre-empted by a higher-priority arrival in one of the upper-level queues.
The multi-level queue strategy is a general discipline but has some drawbacks. The main drawback is that when the number of processes arriving in the higher-priority queues is very high, the processes in a lower-priority queue may starve for a long time. One way to solve this problem is to time slice between the queues: each queue gets a certain portion of the CPU time. Another possibility is known as the multi-level feedback queue strategy.
Normally, in the multi-level queue strategy, processes are permanently assigned to a queue upon entry to the system and processes do not move between queues. The multi-level feedback queue strategy, on the contrary, allows a process to move between queues. The idea is to separate out processes with different CPU burst characteristics. If a process uses too much CPU time (that is, a long-running process), it is moved to a lower-priority queue. Similarly, a process which has been waiting too long in a lower-priority queue may be moved to a higher-priority queue. For example, consider a multi-level feedback queue strategy with three queues Q1, Q2 and Q3 (see the figure). A process entering the system is placed in queue Q1 and is given a time quantum of 10 ms, say. If it does not finish within this time, it is moved to the tail of queue Q2, where it is given a time quantum of 20 ms, say. If it does not complete within this time quantum, it is preempted and put into queue Q3. Processes in queue Q3 are serviced only when queues Q1 and Q2 are empty, and a process arriving for queue Q1 will pre-empt a process from queue Q2 or Q3. It can be observed that this strategy gives the highest priority to any process with a CPU burst of 10 ms or less. Processes which need more than 10 ms but not more than 20 ms are also served quickly, that is, they get the next highest priority after the shorter processes. Longer processes automatically sink to queue Q3; from Q3, processes are served on a first-come first-served (FCFS) basis, and a process that has waited too long in Q3 (as decided by the scheduler) may be moved to a higher-priority queue.

Applications of Queue Data Structure
A queue is used when things do not have to be processed immediately but have to be processed in First In First Out order, as in Breadth First Search. This property of a queue makes it useful in the following kinds of scenarios:
1) When a resource is shared among multiple consumers. Examples include CPU scheduling and disk scheduling.
2) When data is transferred asynchronously (data not necessarily received at the same rate as it is sent) between two processes. Examples include IO buffers, pipes, file IO, etc.
3) In operating systems:
   a) Semaphores
   b) FCFS (first come first served) scheduling, for example a FIFO queue
   c) Spooling in printers
   d) Buffers for devices like the keyboard
4) In networks:
   a) Queues in routers/switches
   b) Mail queues
5) Variations: deque, priority queue, doubly ended priority queue

Unit IV: Linked list - Comparison with arrays; representation of linked list in memory. Singly linked List- structure and implementation; Operations - traversing; Add new node; Delete node; Reverse a list; Search and merge two singly linked lists. Stack with singly linked list. Circular Linked list - advantage. Queue as Circular linked list. Doubly linked list - structure; Operations - Add/delete nodes; Advantages.

Linked list - Comparison with arrays
An array is a data structure where elements are stored in consecutive memory locations. In order to occupy the adjacent space, a block of memory that is required for the array has to be allocated beforehand. Once the memory is allocated, it cannot be extended any more; this is why the array is known as a static data structure. In contrast to this, the linked list is called a dynamic data structure, where the amount of memory required can be varied during its use. In a linked list, the adjacency between the elements is maintained by means of links or pointers. A link or pointer actually is the address (memory location) of the subsequent element. Thus, in a linked list, data (the actual content) and link (to point to the next data) both are required to be maintained.
An element in a linked list is specially termed a node, which can be viewed as shown in the figure. A node consists of two fields: DATA (to store the actual information) and LINK (to point to the next node).
One disadvantage of using arrays to store data is that arrays are static structures and therefore cannot easily be extended or reduced to fit the data set. Arrays are also expensive to maintain when there are new insertions and deletions. The linked list is a data structure that addresses some of these limitations of arrays.

Concept of dynamic data structures
In a dynamic data structure the size of the structure is not fixed and can be modified during the operations performed on it. Dynamic data structures are designed to facilitate change of data structures at run time.

Linked List
A linked list is an ordered collection of finite, homogeneous data elements called nodes, where the linear order is maintained by means of links or pointers. Depending on how the pointers are maintained, linked lists can be classified into three major groups: single linked list, circular linked list and double linked list.
A linked list is a linear data structure that contains a sequence of elements such that each element links to its next element in the sequence. Each element in a linked list is called a node. Each element (we will call it a node) of a list comprises two items - the data and a reference to the next node. The last node has a reference to null. The entry point into a linked list is called the head of the list. It should be noted that the head is not a separate node, but the reference to the first node. If the list is empty then the head is a null reference.
A linked list is a dynamic data structure. The number of nodes in a list is not fixed and can grow and shrink on demand. Any application which has to deal with an unknown number of objects will need to use a linked list.

Advantages of Linked Lists
   They are dynamic in nature, allocating memory only when required.
   Insertion and deletion operations can be easily implemented.
   Stacks and queues can be easily implemented on top of them.
   Linked lists reduce the time needed for insertions and deletions, since no shifting of elements is required.

Disadvantages of Linked Lists
   One disadvantage of a linked list compared with an array is that it does not allow direct access to the individual elements. If you want to access a particular item, you have to start at the head and follow the references until you get to that item.
   Another disadvantage is that a linked list uses more memory compared with an array.

Representation of a Linked List in Memory
There are two ways to represent a linked list in memory:
1. Static representation using an array
2. Dynamic representation using a free pool of storage

Static representation
In the static representation of a single linked list, two arrays are maintained: one array for the data and the other for the links. Two parallel arrays of equal size are allocated, which should be sufficient to store the entire linked list. Admittedly, this contradicts the idea of the linked list (that is, non-contiguous locations of elements). But in some programming languages, for example ALGOL, FORTRAN, BASIC, etc., such a representation is the only way to manage a linked list. The static representation of a linked list is shown in the figure.

Dynamic representation
The efficient way of representing a linked list is by using a free pool of storage. In this method, there is a memory bank (which is nothing but a collection of free memory spaces) and a memory manager (a program, in fact).
During the creation of a linked list, whenever a node is required the request is placed with the memory manager; the memory manager then searches the memory bank for the requested block and, if it is found, grants the desired block to the caller. There is also another program, called the garbage collector, which comes into play whenever a node is no longer in use; it returns the unused node to the memory bank. It may be noted that the memory bank is basically a list of memory spaces which is available to the programmer. Such memory management is known as dynamic memory management. The dynamic representation of a linked list uses this dynamic memory management policy.
The mechanism of the dynamic representation of a single linked list is illustrated in Figures 3.4(a) and 3.4(b). There is a list of available memory spaces whose pointer is stored in AVAIL. For a request for a node, the list AVAIL is searched for a block of the right size. If AVAIL is null or if a block of the desired size is not found, the memory manager returns a message accordingly. Suppose the block is found and let it be XY. Then the memory manager returns the pointer of XY to the caller in a temporary buffer, say NEW. The newly availed node XY can then be inserted at any position in the linked list by changing the pointers of the concerned nodes. In Figure 3.4(a), the node XY is inserted at the end and the change of pointers is shown by the dotted arrows. Figure 3.4(b) explains the mechanism of how a node can be returned from a linked list to the memory bank. The pointers which are required to be manipulated while returning a node are shown with dotted arrows. Note that such allocations or deallocations are carried out by changing the pointers only.

Types of Linked List
There are 3 different implementations of linked list available; they are:
1. Singly Linked List
2. Doubly Linked List
3. Circular Linked List

Singly linked List
A single linked list is a sequence of elements in which every element has a link to its next element in the sequence. Here, N1, N2, ..., N6 are the constituent nodes in the list. HEADER is an empty node (having data content NULL) used only to store a pointer to the first node N1. Thus, if one knows the address of the HEADER node, then from the link field of this node the next node can be traced, and so on. This means that, starting from the first node, one can reach the last node, whose link field does not contain any address but has a null value. Note that in a single linked list one can move from left to right only; this is why a single linked list is also called a one-way list.

Operations on a Single Linked List
The operations possible on a single linked list are listed below:
   Traversing the list
   Inserting a node into the list
   Deleting a node from the list
   Copying the list to make a duplicate of it
   Merging the linked list with another one to make a larger list
   Searching for an element in the list
We will assume the following convention in our subsequent discussions: suppose X is a pointer to a node; the values in the DATA field and LINK field will be denoted by X->DATA and X->LINK, respectively. We will write NULL to imply a nil value in the DATA and LINK fields.

Traversing a single linked list
In traversing a single linked list, we visit every node in the list, starting from the first node up to the last node. The following algorithm Traverse_SL does this.

Algorithm Traverse_SL
Input: HEADER is the pointer to the header node.
Output: According to the Process().
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. ptr = HEADER->LINK                  // ptr stores the pointer to the current node
2. While (ptr != NULL) do              // Continue till the last node
3.     Process(ptr)                    // Perform Process() on the current node
4.     ptr = ptr->LINK                 // Move to the next node
5. EndWhile
6. Stop

Note: A simple operation, namely Process(), may be devised to print the data content of a node, or to count the total number of nodes, etc.

Inserting a node into a single linked list
There are various positions where a node can be inserted:
(i) Inserting at the front (as the first element)
(ii) Inserting at the end (as the last element)
(iii) Inserting at any other position
Let us assume a procedure GetNode(NODE) to get a pointer to a memory block which suits the type NODE. The procedure may be defined as follows:

Procedure GetNode
Input: NODE is the type of data for which memory has to be allocated.
Output: Returns a message if the allocation fails, else the pointer to the memory block allocated.
Steps:
1. If (AVAIL = NULL) then                     // AVAIL is the pointer to the pool of free storage
2.     Print "Insufficient memory: Unable to allocate memory"
3.     Return(NULL)
4. Else                                       // Sufficient memory is available
5.     ptr = AVAIL                            // Start from the location where AVAIL points
6.     While (SizeOf(ptr) != SizeOf(NODE)) and (ptr->LINK != NULL) do
                                              // Till the desired block is found or the search reaches the end of the pool
7.         ptr1 = ptr                         // To keep track of the previous block
8.         ptr = ptr->LINK                    // Move to the next block
9.     EndWhile
10.    If (SizeOf(ptr) = SizeOf(NODE)) then   // Memory block of the right size is found
11.        ptr1->LINK = ptr->LINK             // Update the AVAIL list
12.        Return(ptr)
13.    Else
14.        Print "The memory block is too large to fit"
15.        Return(NULL)
16.    EndIf
17. EndIf
18. Stop

Note: The GetNode() procedure as defined above is just to explain how a node can be allocated from the available storage space. In C, C++ and Java, there are library routines for doing the same, such as alloc() and malloc() (in C and C++) and new (in C++ and Java).

Inserting a node at the front of a single linked list
The algorithm InsertFront_SL is used to insert a node at the front of a single linked list.

Algorithm InsertFront_SL
Input: HEADER is the pointer to the header node and X is the data of the node to be inserted.
Output: A single linked list with a newly inserted node at the front of the list.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. new = GetNode(NODE)                 // Get a memory block of type NODE and store its pointer in new
2. If (new = NULL) then                // The memory manager returns NULL on searching the memory bank
3.     Print "Memory underflow: No insertion"
4.     Exit                            // Quit the program
5. Else                                // Memory is available and a node is obtained from the memory bank
6.     new->LINK = HEADER->LINK        // Change of pointer 1 as shown in Figure 3.5(a)
7.     new->DATA = X                   // Copy the data X into the newly availed node
8.     HEADER->LINK = new              // Change of pointer 2 as shown in Figure 3.5(a)
9. EndIf
10. Stop

Figure 3.5(a) Inserting a node at the front of a single linked list.
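In C++, the node and the front insertion just described might look like the following sketch. The struct name Node, the function name insertFront and the use of a plain head pointer instead of the book's dummy HEADER node are illustrative choices, not from the text.

#include <iostream>
using namespace std;

struct Node {
    int  data;      // DATA field
    Node *link;     // LINK field: pointer to the next node
};

// Insert a new node with value x at the front of the list.
// head is passed by reference so the caller's pointer is updated.
void insertFront(Node*& head, int x)
{
    Node *n = new Node;   // corresponds to GetNode(NODE)
    n->data = x;
    n->link = head;       // the new node points to the old first node
    head = n;             // head now points to the new node
}

int main()
{
    Node *head = nullptr;                             // empty list
    insertFront(head, 30);
    insertFront(head, 20);
    insertFront(head, 10);                            // list is now 10 -> 20 -> 30
    for (Node *p = head; p != nullptr; p = p->link)   // traversal, as in Traverse_SL
        cout << p->data << " ";
    cout << endl;
    return 0;
}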
Inserting a node at the end of a single linked list
The algorithm InsertEnd_SL is used to insert a node at the end of a single linked list.

Algorithm InsertEnd_SL
Input: HEADER is the pointer to the header node and X is the data of the node to be inserted.
Output: A single linked list with a newly inserted node having data X at the end of the list.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. new = GetNode(NODE)                 // Get a memory block of type NODE and return its pointer as new
2. If (new = NULL) then                // Unable to allocate memory for a node
3.     Print "Memory is insufficient: Insertion is not possible"
4.     Exit                            // Quit the program
5. Else                                // Move to the end of the given list and then insert
6.     ptr = HEADER                    // Start from the HEADER node
7.     While (ptr->LINK != NULL) do    // Move to the end
8.         ptr = ptr->LINK             // Change the pointer to the next node
9.     EndWhile
10.    ptr->LINK = new                 // Change the link field of the last node: pointer 1 as in Figure 3.5(b)
11.    new->DATA = X                   // Copy the content X into the new node
12.    new->LINK = NULL                // The new node is now the last node
13. EndIf
14. Stop

Figure 3.5(b) Inserting a node at the end of a single linked list.

Inserting a node into a single linked list at any position in the list
The algorithm InsertAny_SL is used to insert a node into a single linked list at any position in the list.

Algorithm InsertAny_SL
Input: HEADER is the pointer to the header node, X is the data of the node to be inserted, and KEY is the data of the key node after which the node has to be inserted.
Output: A single linked list enriched with a newly inserted node having data X after the node with data KEY.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. new = GetNode(NODE)                 // Get a memory block of type NODE and return its pointer as new
2. If (new = NULL) then                // Unable to allocate memory for a node
3.     Print "Memory is insufficient: Insertion is not possible"
4.     Exit                            // Quit the program
5. Else
6.     ptr = HEADER                    // Start from the HEADER node
7.     While (ptr->DATA != KEY) and (ptr->LINK != NULL) do
                                       // Move to the node having data KEY, or to the end if KEY is not in the list
8.         ptr = ptr->LINK
9.     EndWhile
10.    If (ptr->DATA != KEY) then      // The search failed to find the KEY
11.        Print "KEY is not available in the list"
12.        Exit
13.    Else
14.        new->LINK = ptr->LINK       // Change pointer 1 as shown in Figure 3.5(c)
15.        new->DATA = X               // Copy the content into the new node
16.        ptr->LINK = new             // Change pointer 2 as shown in Figure 3.5(c)
17.    EndIf
18. EndIf
19. Stop

Figure 3.5(c) Inserting a node at any position in a single linked list.

Deleting a node from a single linked list
Like insertions, there are also three cases of deletion:
(i) Deleting from the front of the list
(ii) Deleting from the end of the list
(iii) Deleting from any position in the list
Let us consider the procedures for the various cases of deletion. We will assume a procedure, namely ReturnNode(ptr), which returns a node having pointer ptr to the free pool of storage. The procedure ReturnNode(ptr) is defined as follows:

Procedure ReturnNode
Input: PTR is the pointer of a node to be returned to the list pointed to by the pointer AVAIL.
Output: The node is inserted at the end of the list AVAIL.
Steps:
1. ptr1 = AVAIL                        // Start from the beginning of the free pool
2. While (ptr1->LINK != NULL) do       // Move to the end of the pool
3.     ptr1 = ptr1->LINK
4. EndWhile
5. ptr1->LINK = PTR                    // Insert the node at the end
6. PTR->LINK = NULL                    // The node inserted is the last node
7. Stop

Note: The procedure ReturnNode() inserts the free node at the end of the pool of free storage whose header address is AVAIL. Alternatively, we can insert the free node at the front or at any position of the AVAIL list. In C and C++ this procedure is realized by the library routine free().

Deleting the node at the front of a single linked list
The algorithm DeleteFront_SL is used to delete the node at the front of a single linked list.
Such a deletion operation is explained in Figure 3.6(a).

Algorithm DeleteFront_SL
Input: HEADER is the pointer to the header node.
Output: A single linked list after eliminating the node at the front of the list.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. ptr = HEADER->LINK                  // Pointer to the first node
2. If (ptr = NULL) then                // If the list is empty
3.     Print "The list is empty: No deletion"
4.     Exit                            // Quit the program
5. Else                                // The list is not empty
6.     ptr1 = ptr->LINK                // ptr1 is the pointer to the second node, if any
7.     HEADER->LINK = ptr1             // The next node becomes the first node, as in Figure 3.6(a)
8.     ReturnNode(ptr)                 // The deleted node is freed to the memory bank for future use
9. EndIf
10. Stop

Figure 3.6(a) Deleting the node at the front of a single linked list.

Deleting the node at the end of a single linked list
The algorithm DeleteEnd_SL is used to delete the node at the end of a single linked list. This is shown in Figure 3.6(b).

Algorithm DeleteEnd_SL
Input: HEADER is the pointer to the header node.
Output: A single linked list after eliminating the node at the end of the list.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. ptr = HEADER                        // Move from the header node
2. If (ptr->LINK = NULL) then
3.     Print "The list is empty: No deletion possible"
4.     Exit                            // Quit the program
5. Else
6.     While (ptr->LINK != NULL) do    // Go to the last node
7.         ptr1 = ptr                  // To store the previous pointer
8.         ptr = ptr->LINK             // Move to the next node
9.     EndWhile
10.    ptr1->LINK = NULL               // The last-but-one node becomes the last node, as in Figure 3.6(b)
11.    ReturnNode(ptr)                 // The deleted node is returned to the memory bank for future use
12. EndIf
13. Stop

Figure 3.6(b) Deleting the node at the end of a single linked list.

Deleting the node from any position of a single linked list
The algorithm DeleteAny_SL is used to delete a node from any position in a single linked list. This is illustrated in Figure 3.6(c).

Algorithm DeleteAny_SL
Input: HEADER is the pointer to the header node, KEY is the data content of the node to be deleted.
Output: A single linked list without the node whose data content is KEY.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. ptr1 = HEADER                       // Start from the header node
2. ptr = ptr1->LINK                    // This points to the first node, if any
3. While (ptr != NULL) do
4.     If (ptr->DATA != KEY) then      // If the key is not found
5.         ptr1 = ptr                  // Keep track of the pointer of the previous node
6.         ptr = ptr->LINK             // Move to the next node
7.     Else
8.         ptr1->LINK = ptr->LINK      // The link field of the predecessor points to the successor
                                       // of the node under deletion, see Figure 3.6(c)
9.         ReturnNode(ptr)             // Return the deleted node to the memory bank
10.        Exit                        // Exit the program
11.    EndIf
12. EndWhile
13. If (ptr = NULL) then               // When the desired node is not available in the list
14.    Print "Node with KEY does not exist: No deletion"
15. EndIf
16. Stop

Figure 3.6(c) Deleting a node from any position of a single linked list.
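A C++ counterpart of DeleteAny_SL is sketched below, reusing the illustrative Node struct from the earlier sketch; the name deleteByKey and the use of a plain head pointer instead of a dummy header node are my own choices.

struct Node { int data; Node *link; };            // as in the earlier sketch

// Delete the first node whose data equals key.
// Returns true if a node was deleted, false if key was not found.
bool deleteByKey(Node*& head, int key)
{
    Node *prev = nullptr;
    Node *cur  = head;
    while (cur != nullptr && cur->data != key) {   // search for key
        prev = cur;                                // remember the predecessor
        cur  = cur->link;
    }
    if (cur == nullptr)            // key not present in the list
        return false;
    if (prev == nullptr)           // deleting the first node
        head = cur->link;
    else                           // bypass the node under deletion
        prev->link = cur->link;
    delete cur;                    // corresponds to ReturnNode(ptr)
    return true;
}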
Copying a single linked list
For a given list we can copy it into another list by duplicating the content of each node into a newly allocated node. The following is an algorithm to copy an entire single linked list.

Algorithm Copy_SL
Input: HEADER is the pointer to the header node of the list.
Output: HEADER1 is the pointer to the duplicate list.
Data structures: Single linked list structure.
Steps:
1. ptr = HEADER->LINK                  // Current position in the master list (the first node, if any)
2. HEADER1 = GetNode(NODE)             // Get a node for the header of the duplicate list
3. ptr1 = HEADER1                      // ptr1 is the current position in the duplicate list
4. ptr1->DATA = NULL                   // The header node does not contain any data
5. While (ptr != NULL) do              // Till the traversal of the master list is finished
6.     new = GetNode(NODE)             // Get a new node from the memory bank
7.     new->DATA = ptr->DATA           // Copy the content
8.     ptr1->LINK = new                // Insert the node at the end of the duplicate list
9.     new->LINK = NULL
10.    ptr1 = new
11.    ptr = ptr->LINK                 // Move to the next node in the master list
12. EndWhile
13. Stop

Merging two single linked lists into one list
Suppose two single linked lists, namely L1 and L2, are available and we want to merge the list L2 after L1. Also assume that HEADER1 and HEADER2 are the header nodes of the lists L1 and L2, respectively. Merging can be done by setting the link field of the last node in the list L1 to the pointer of the first node in L2. The header node of the list L2 should then be returned to the pool of free storage. Merging two single linked lists into one list is illustrated in the figure.

Following is the algorithm Merge_SL to merge two single linked lists into one single linked list:

Algorithm Merge_SL
Input: HEADER1 and HEADER2 are the pointers to the header nodes of the lists (L1 and L2, respectively) to be merged.
Output: HEADER is the pointer to the resultant list.
Data structures: Single linked list structure.
Steps:
1. ptr = HEADER1
2. While (ptr->LINK != NULL) do        // Move to the last node in the list L1
3.     ptr = ptr->LINK
4. EndWhile
5. ptr->LINK = HEADER2->LINK           // The last node in L1 points to the first node in L2
6. ReturnNode(HEADER2)                 // Return the header node of L2 to the memory bank
7. HEADER = HEADER1                    // HEADER becomes the header node of the merged list
8. Stop

Searching for an element in a single linked list
The algorithm Search_SL() given below searches for an item in a single linked list.

Algorithm Search_SL
Input: KEY, the item to be searched.
Output: Location, the pointer to the node where KEY belongs, or an error message.
Data structures: A single linked list whose starting-node address is known from HEADER.
Steps:
1. ptr = HEADER->LINK                  // Start from the first node
2. flag = 0, Location = NULL
3. While (ptr != NULL) and (flag = 0) do
4.     If (ptr->DATA = KEY) then
5.         flag = 1
6.         Location = ptr
7.         Print "Search is successful"
8.         Return(Location)
9.     Else
10.        ptr = ptr->LINK             // Move to the next node
11.    EndIf
12. EndWhile
13. If (ptr = NULL) then               // Search is finished
14.    Print "Search is unsuccessful"
15. EndIf
16. Stop

Linked list implementation of stack
Instead of using an array, we can also use a linked list to implement a stack. A linked list allocates the memory dynamically. However, the time complexity in both scenarios is the same for all the operations, i.e. push, pop and peek. In the linked list implementation of a stack, the nodes are maintained non-contiguously in the memory. Each node contains a pointer to its immediate successor node in the stack. The stack is said to overflow if the space left in the memory heap is not enough to create a node.
Thus we implement a stack data structure using a linked list. The linked list implementation of the stack must support the basic stack operations like push, pop, peek and isEmpty.

Advantages of Implementing a Stack as a Linked List
   Dynamic size of the stack: we can increase or decrease the size of the stack at runtime.
   Unlike the array implementation of a stack, there is no limit on the maximum number of elements in the stack.
Algorithm Push_LL
Input: ITEM is the item to be inserted.
Output: A single linked list with a newly inserted node whose data content is ITEM.
Data structure: A single linked list structure whose pointer to the header is known from STACK_HEAD, and TOP is the pointer to the first node.
Steps:
1. new = GetNode(NODE)                 // Insert at the front
2. new->DATA = ITEM
3. new->LINK = TOP
4. TOP = new
5. STACK_HEAD->LINK = TOP
6. Stop

Algorithm Pop_LL
Input: A stack with elements.
Output: The removed item is stored in ITEM.
Data structure: A single linked list structure whose pointer to the header is known from STACK_HEAD, and TOP is the pointer to the first node.
Steps:
1. If (TOP = NULL) then
2.     Print "Stack is empty"
3.     Exit
4. Else
5.     ptr = TOP->LINK
6.     ITEM = TOP->DATA
7.     STACK_HEAD->LINK = ptr
8.     TOP = ptr
9. EndIf
10. Stop

The operation Status can now be defined as follows:

Algorithm Status_LL
Input: A stack with elements.
Output: Status information such as its state (empty or not), the number of items and the item at the TOP.
Data structure: A single linked list structure whose pointer to the header is known from STACK_HEAD, and TOP is the pointer to the first node.
Steps:
1. ptr = STACK_HEAD->LINK
2. If (ptr = NULL) then
3.     Print "Stack is empty"
4. Else
5.     nodeCount = 0
6.     While (ptr != NULL) do
7.         nodeCount = nodeCount + 1
8.         ptr = ptr->LINK
9.     EndWhile
10.    Print "The item at TOP is", TOP->DATA, "and the stack contains", nodeCount, "items"
11. EndIf
12. Stop

Stack Using Linked List
The major problem with a stack implemented using an array is that it works only for a fixed number of data values; the amount of data must be specified at the beginning of the implementation itself. A stack implemented using an array is not suitable when we don't know the size of the data which we are going to use. A stack data structure can instead be implemented using a linked list. The stack implemented using a linked list can work for an unlimited number of values; that means it works for a variable size of data, so there is no need to fix the size at the beginning of the implementation. The stack implemented using a linked list can organize as many data values as we want.
In the linked list implementation of a stack, every new element is inserted as the 'top' element; that means every newly inserted element is pointed to by 'top'. Whenever we want to remove an element from the stack, we simply remove the node which is pointed to by 'top', by moving 'top' to its previous node in the list. The next field of the first inserted element must always be NULL.

Example
In the example figure, the last inserted node is 99 and the first inserted node is 25. The order of elements inserted is 25, 32, 50 and 99.

push(value) - Inserting an element into the stack
We can use the following steps to insert a new node into the stack:
Step 1 - Create a newNode with the given value.
Step 2 - Check whether the stack is empty (top == NULL).
Step 3 - If it is empty, then set newNode->next = NULL.
Step 4 - If it is not empty, then set newNode->next = top.
Step 5 - Finally, set top = newNode.

pop() - Deleting an element from a stack
We can use the following steps to delete a node from the stack:
Step 1 - Check whether the stack is empty (top == NULL).
Step 2 - If it is empty, then display "Stack is Empty!!! Deletion is not possible!!!" and terminate the function.
Step 3 - If it is not empty, then define a Node pointer 'temp' and set it to 'top'.
Step 4 - Then set top = top->next.
Step 5 - Finally, delete 'temp' (free(temp)).
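These push and pop steps can be sketched directly in C++, again using the illustrative Node struct from the earlier sketch; the LinkedStack name is my own choice, and the node's pointer field is called link rather than next.

#include <iostream>
using namespace std;

struct Node { int data; Node *link; };     // as in the earlier sketch

struct LinkedStack {
    Node *top = nullptr;                   // pointer to the most recently pushed node

    bool isEmpty() const { return top == nullptr; }

    void push(int value) {
        Node *n = new Node;                // Step 1: create newNode
        n->data = value;
        n->link = top;                     // Steps 3-4: newNode->next = top (NULL when the stack is empty)
        top = n;                           // Step 5: top = newNode
    }

    int pop() {
        if (isEmpty()) {                   // Step 2: underflow check
            cout << "Stack is Empty!!! Deletion is not possible!!!\n";
            return -1;
        }
        Node *temp = top;                  // Step 3
        int value = temp->data;
        top = top->link;                   // Step 4: top = top->next
        delete temp;                       // Step 5: free(temp)
        return value;
    }
};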
Reverse a linked list
Iterative Approach
We will use three pointers: prevNode, head, and nextNode. We initialise prevNode to NULL; nextNode can stay empty. We then loop while the current head pointer is not NULL, which means we have not yet reached the end of the linked list. Inside the loop, we first update nextNode so that it stores head->next, then reverse the current link by setting head->next to prevNode, then advance prevNode to head, and finally move head to the value we stored in nextNode. We continue the loop as long as we can; after the loop, prevNode points to the new first node, so we return prevNode.
Reverse a Linked List in C++
ListNode* reverseList(ListNode* head) {
    ListNode *prev = NULL, *cur = head, *tmp;
    while (cur) {
        tmp = cur->next;      // remember the rest of the list
        cur->next = prev;     // reverse the current link
        prev = cur;           // advance prev
        cur = tmp;            // advance cur
    }
    return prev;              // prev is the new head
}
Circular Linked List
A circular linked list is very similar to a singly linked list; the only difference is that in a singly linked list the last node points to NULL, whereas in a circular linked list the last node points to the address of the head. A linked list where the last node points to the header node is called a circular linked list. The figure shows a pictorial representation of a circular linked list.
Circular Linked List - Advantages
Circular linked lists have certain advantages over ordinary linked lists. Some of these advantages are discussed below.
Accessibility of a member node in the list
In an ordinary list, a member node is accessible only from a particular node, that is, from the header node. But in a circular linked list, every member node is accessible from any node by merely chaining through the list.
Example: Suppose we are manipulating some information which is stored in a list, and for a given data item we want to find its earlier occurrence(s) as well as its post occurrence(s). Post occurrence(s) can be traced out by chaining through the list from the current node, irrespective of whether the list is maintained as a circular linked list or an ordinary linked list. To find all the earlier occurrences in an ordinary linked list, however, we have to start traversing from the header node, at the cost of maintaining a pointer to the header in addition to the pointer for the current node and another for chaining. In a circular linked list, one can trace out the same without maintaining the header information, maintaining only two pointers. Note that in an ordinary linked list one can chain through from left to right only, whereas in a circular linked list traversal is virtually possible in both directions.
Null link problem
The NULL value in the link field may create problems during the execution of programs if proper care is not taken. This is illustrated below with two ways of performing a search, one on an ordinary linked list and one on a circular linked list.
Algorithm Search_SL
Input: KEY, the item to be searched.
Output: Location, the pointer to a node where KEY belongs, or an error message.
Data structures: A single linked list whose address of the starting node is known from HEADER.
Steps:
1. ptr = HEADER->LINK
2. While (ptr != NULL) do
3. If (ptr->DATA != KEY) then
4. ptr = ptr->LINK
5. Else
6. Print "Search is successful"
7. Return(ptr)
8. EndIf
9. EndWhile
10. If (ptr = NULL) then
11. Print "The entire list has been traversed but KEY is not found"
12. EndIf
13. Stop
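For reference, a small C rendering of Search_SL on an ordinary (NULL-terminated) list with a header node; the Node layout and the function name search_sl are illustrative.

#include <stdio.h>

/* Node layout assumed by Search_SL: DATA plus a single LINK field. */
struct Node {
    int DATA;
    struct Node *LINK;
};

/* Return a pointer to the first node whose DATA equals key, or NULL. */
struct Node *search_sl(struct Node *header, int key)
{
    struct Node *ptr = header->LINK;        /* start from the first node */
    while (ptr != NULL) {
        if (ptr->DATA == key) {
            printf("Search is successful\n");
            return ptr;
        }
        ptr = ptr->LINK;                    /* move to the next node */
    }
    printf("The entire list has been traversed but KEY is not found\n");
    return NULL;
}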
Note that in the algorithm Search_SL, the two tests which control the search loop cannot be combined into a single condition such as
While (ptr != NULL) and (ptr->DATA != KEY) do
because in that case there would be an execution error for ptr->DATA, which is not defined when ptr = NULL. But with a circular linked list a very simple implementation is possible without any special care for the NULL pointer: since the link field of the last node points back to the header rather than to NULL, ptr->DATA is always defined, and the search loop can simply run While (ptr->DATA != KEY) and (ptr != HEADER) do, starting from the first node.
Insertion
Inserting at the beginning of the list
We can use the following steps to insert a new node at the beginning of a circular linked list.
Step 1: Create a new node, NEWNODE, with the given value.
Step 2: [Check whether the list is empty] If (START = NULL), set START = NEWNODE and NEWNODE->next = NEWNODE, and Exit.
Step 3: Define a node pointer TEMP and set TEMP = START.
Step 4: Keep moving TEMP to its next node until it reaches the last node (that is, until TEMP->next = START).
Step 5: Set NEWNODE->next = START.
Step 6: Set START = NEWNODE and TEMP->next = START.
Step 7: Exit.
Inserting at the end of the list
We can use the following steps to insert a new node at the end of a circular linked list.
Step 1: Create a new node, NEWNODE, with the given value.
Step 2: [Check whether the list is empty] If (START = NULL), set START = NEWNODE and NEWNODE->next = NEWNODE, and Exit.
Step 3: Define a node pointer TEMP and set TEMP = START.
Step 4: Keep moving TEMP to its next node until it reaches the last node (that is, until TEMP->next = START).
Step 5: Set TEMP->next = NEWNODE.
Step 6: Set NEWNODE->next = START.
Step 7: Exit.
Deletion
Deleting from the beginning of the list
We can use the following steps to delete a node from the beginning of a circular linked list.
Step 1: [Check whether the list is empty] If (START = NULL), display 'List is Empty. Deletion is not possible' and Exit.
Step 2: Define two node pointers TEMP and FIRST, and set TEMP = START and FIRST = START.
Step 3: [Check whether the list has only one node] If (TEMP->next = START), set START = NULL, delete TEMP and Exit.
Step 4: Keep moving TEMP to its next node until it reaches the last node (until TEMP->next = START).
Step 5: Set START = START->next and TEMP->next = START.
Step 6: Delete FIRST and Exit.
Deleting from the end of the list
We can use the following steps to delete a node from the end of a circular linked list.
Step 1: [Check whether the list is empty] If (START = NULL), display 'List is Empty. Deletion is not possible' and Exit.
Step 2: Define two node pointers TEMP and PRE, and set TEMP = START.
Step 3: [Check whether the list has only one node] If (TEMP->next = START), set START = NULL, delete TEMP and Exit.
Step 4: Keep moving TEMP to its next node until it reaches the last node, setting PRE = TEMP before each move, so that PRE always points to the node before TEMP.
Step 5: Set PRE->next = START.
Step 6: Delete TEMP.
Step 7: Exit.
Queue as a circular linked list
A queue is also an abstract data type or a linear data structure, in which a new element is inserted at one end, called REAR, and the deletion of an existing element takes place at the other end, called FRONT.
enQueue(value)
This function is used to insert an element into the circular queue. In a circular queue, the new element is always inserted at the REAR position.
Steps:
1. Create a new node dynamically and insert the value into it.
2. Check if front == NULL; if it is true, then set front = rear = (newly created node).
3. If it is false, then insert the new node after rear and make it the new rear; the rear node always contains the address of the front node, so keep rear->next pointing to front.
deQueue()
This function is used to delete an element from the circular queue. In a queue, the element is always deleted from the FRONT position.
Steps:
1. Check whether the queue is empty, that is, whether front == NULL.
2. If it is empty, then display "Queue is empty". If the queue is not empty, go to step 3.
3. Check if (front == rear). If it is true, then set front = rear = NULL; otherwise move front forward in the queue, update the address of front in the rear node, and return the element.
A C sketch of these two operations is given below.
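A minimal C sketch of the circular queue built on a linked list, assuming an int payload; the names Node, front, rear, enQueue and deQueue follow the description above, but the code itself is illustrative.

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

static struct Node *front = NULL;   /* deletion end */
static struct Node *rear  = NULL;   /* insertion end; rear->next always points to front */

void enQueue(int value)
{
    struct Node *newNode = malloc(sizeof *newNode);
    newNode->data = value;
    if (front == NULL) {            /* first element: front and rear are the same node */
        front = rear = newNode;
    } else {
        rear->next = newNode;       /* insert after the current rear */
        rear = newNode;
    }
    rear->next = front;             /* keep the list circular */
}

int deQueue(void)
{
    if (front == NULL) {
        printf("Queue is empty\n");
        return -1;
    }
    struct Node *temp = front;
    int value = temp->data;
    if (front == rear) {            /* last remaining element */
        front = rear = NULL;
    } else {
        front = front->next;        /* move front forward */
        rear->next = front;         /* rear keeps the address of the new front */
    }
    free(temp);
    return value;
}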
Doubly linked list - structure
In a single linked list one can move from the header node to any node in one direction only (from left to right); this is why a single linked list is also termed a one-way list. On the other hand, a double linked list is a two-way list, because one can move in either direction: from left to right or from right to left. This is accomplished by maintaining two link fields instead of one, as in a single linked list. The structure of a node for a double linked list is shown in the figure. From the figure it can be noticed that the two links, RLINK and LLINK, point to the nodes on the right side and the left side of the node, respectively. Thus every node, except the header node and the last node, points to its immediate predecessor and immediate successor.
Operations - add/delete nodes; advantages
Operations on a Double Linked List
All the operations mentioned for a single linked list can be implemented more efficiently using a double linked list.
Inserting a node into a double linked list
Figure 3.11 shows a schematic representation of the various cases of inserting a node into a double linked list. Let us consider the algorithms for the various cases of insertion.
Inserting a node at the front
The algorithm InsertFront_DL defines the insertion operation at the front of a double linked list.
Algorithm InsertFront_DL
Input: X is the data content of the node to be inserted.
Output: A double linked list enriched with a node in the front containing data X.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER->RLINK // Points to the first node
2. new = GetNode(NODE) // Avail a new node from the memory bank
3. If (new != NULL) then // If a new node is available
4. new->LLINK = HEADER // Newly inserted node points to the header, shown as 1 in the figure
5. HEADER->RLINK = new // Header now points to the new node, shown as 2 in the figure
6. new->RLINK = ptr
7. ptr->LLINK = new // See the change in pointer shown as 4 in the figure
8. new->DATA = X // Copy the data into the newly inserted node
9. Else
10. Print "Unable to allocate memory: Insertion is not possible"
11. EndIf
12. Stop
Inserting a node at the end
The algorithm InsertEnd_DL inserts a node at the end of a double linked list.
Algorithm InsertEnd_DL
Input: X is the data content of the node to be inserted.
Output: A double linked list enriched with a node containing data X at the end of the list.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER
2. While (ptr->RLINK != NULL) do // Move to the last node
3. ptr = ptr->RLINK
4. EndWhile
5. new = GetNode(NODE) // Avail a new node
6. If (new != NULL) then // If the node is available
7. new->LLINK = ptr // Change the pointer shown as 1 in the figure
8. ptr->RLINK = new // Change the pointer shown as 2 in the figure
9. new->RLINK = NULL // Make the new node the last node
10. new->DATA = X // Copy the data into the new node
11. Else
12. Print "Unable to allocate memory: Insertion is not possible"
13. EndIf
14. Stop
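As a rough C sketch of InsertFront_DL, assuming a node with LLINK, DATA and RLINK fields and a header node; the names DNode and insert_front_dl are illustrative, and malloc stands in for GetNode.

#include <stdio.h>
#include <stdlib.h>

/* Node layout assumed by the double-linked-list algorithms above. */
struct DNode {
    struct DNode *LLINK;
    int DATA;
    struct DNode *RLINK;
};

/* Insert a node carrying x immediately after the header node. */
void insert_front_dl(struct DNode *header, int x)
{
    struct DNode *ptr = header->RLINK;        /* current first node (may be NULL) */
    struct DNode *node = malloc(sizeof *node);
    if (node == NULL) {
        printf("Unable to allocate memory: Insertion is not possible\n");
        return;
    }
    node->LLINK = header;                     /* new node points back to the header */
    header->RLINK = node;                     /* header now points to the new node */
    node->RLINK = ptr;                        /* new node points to the old first node */
    if (ptr != NULL)                          /* old first node, if any, points back */
        ptr->LLINK = node;
    node->DATA = x;
}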
Inserting a node at any position in the list
The algorithm InsertAny_DL inserts a node at any position in a double linked list.
Algorithm InsertAny_DL
Input: X is the data content of the node to be inserted, and KEY is the data content of the node after which the new node is to be inserted.
Output: A double linked list enriched with a node containing data X after the node with data KEY; if KEY is not present in the list, the node is inserted at the end.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER
2. While (ptr->DATA != KEY) and (ptr->RLINK != NULL) do // Move to the KEY node, or to the end of the list if KEY is not found
3. ptr = ptr->RLINK
4. EndWhile
5. new = GetNode(NODE) // Get a new node from the pool of free storage
6. If (new = NULL) then // When memory is not available
7. Print "Memory is not available"
8. Exit // Quit the program
9. EndIf
10. If (ptr->RLINK = NULL) then // If the KEY is not found in the list
11. new->LLINK = ptr
12. ptr->RLINK = new // Insert at the end
13. new->RLINK = NULL
14. new->DATA = X // Copy the information to the newly inserted node
15. Else // The KEY is available
16. ptr1 = ptr->RLINK // Next node after the key node
17. new->LLINK = ptr // Change the pointer shown as 2 in the figure
18. new->RLINK = ptr1 // Change the pointer shown as 4 in the figure
19. ptr->RLINK = new // Change the pointer shown as 1 in the figure
20. ptr1->LLINK = new // Change the pointer shown as 3 in the figure
21. ptr = new // This becomes the current node
22. new->DATA = X // Copy the content to the newly inserted node
23. EndIf
24. Stop
Note: Observe that the algorithm InsertAny_DL will insert a node even when the KEY node does not exist; in that case, the node is inserted at the end of the list. Also note that the algorithm works even if the list is empty.
Deleting a node from a double linked list
Deleting a node from a double linked list may take place from any position in the list, as shown in Figure 3.12. Let us consider each of these cases separately.
Deleting a node from the front of a double linked list
The algorithm DeleteFront_DL, defined below, deletes a node from the front of a double linked list.
Algorithm DeleteFront_DL
Input: A double linked list with data.
Output: A reduced double linked list.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER->RLINK // Pointer to the first node
2. If (ptr = NULL) then // If the list is empty
3. Print "List is empty: No deletion is made"
4. Exit
5. Else
6. ptr1 = ptr->RLINK // Pointer to the second node
7. HEADER->RLINK = ptr1 // Change the pointer shown as 1 in the figure
8. If (ptr1 != NULL) then // If the list contains a node after the deleted first node
9. ptr1->LLINK = HEADER // Change the pointer shown as 2 in the figure
10. EndIf
11. ReturnNode(ptr) // Return the deleted node to the memory bank
12. EndIf
13. Stop
Note that the algorithm DeleteFront_DL works even if the list is empty.
Deleting a node at the end of a double linked list
Algorithm DeleteEnd_DL
Input: A double linked list with data.
Output: A reduced double linked list.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER
2. While (ptr->RLINK != NULL) do // Move to the last node
3. ptr = ptr->RLINK
4. EndWhile
5. If (ptr = HEADER) then // If the list is empty
6. Print "List is empty: No deletion is made"
7. Exit // Quit the program
8. Else
9. ptr1 = ptr->LLINK // Pointer to the last but one node
10. ptr1->RLINK = NULL // Change the pointer shown as 1 in the figure
11. ReturnNode(ptr) // Return the deleted node to the memory bank
12. EndIf
13. Stop
Deleting a node from any position in a double linked list
Algorithm DeleteAny_DL
Input: A double linked list with data, and KEY, the data content of the key node to be deleted.
Output: A double linked list without a node having data content KEY, if any.
Data structure: Double linked list structure whose pointer to the header node is HEADER.
Steps:
1. ptr = HEADER->RLINK // Move to the first node
2. If (ptr = NULL) then
3. Print "List is empty: No deletion is made"
4. Exit // Quit the program
5. EndIf
6. While (ptr->DATA != KEY) and (ptr->RLINK != NULL) do // Move to the desired node
7. ptr = ptr->RLINK
8. EndWhile
9. If (ptr->DATA = KEY) then // If the node is found
10. ptr1 = ptr->LLINK // Track the predecessor node
11. ptr2 = ptr->RLINK // Track the successor node
12. ptr1->RLINK = ptr2 // Change the pointer shown as 1 in the figure
13. If (ptr2 != NULL) then // Unless the deleted node is the last node
14. ptr2->LLINK = ptr1 // Change the pointer shown as 2 in the figure
15. EndIf
16. ReturnNode(ptr) // Return the free node to the memory bank
17. Else
18. Print "The node does not exist in the given list"
19. EndIf
20. Stop
Advantages of a Doubly Linked List
It allows us to iterate in both directions.
We can delete a node easily, as we have access to its previous node.
Reversing the list is easy.
It can grow or shrink in size dynamically.
It is useful in implementing various other data structures.
Unit V: Tree and Binary tree: Basic terminologies and properties; Linked representation of Binary tree; Complete and full binary trees; Binary tree representation with array. Tree traversal: in order, pre order and post order traversals. Binary Search Tree. Application of binary tree: Huffman Code.
Tree
A tree is a non-linear data structure which organizes data in a hierarchical structure, and this is a recursive definition. The elements of a tree appear in a non-linear fashion, which requires a two-dimensional representation. In a tree data structure, every individual element is called a node. A node stores the actual data of that particular element and the links to the next elements in the hierarchical structure. In a tree data structure, if we have N nodes then we can have a maximum of N - 1 links.
As another example, let us consider the algebraic expression
X = (A B) / ((C * D) + E)
where the different operators have their own precedence values. A tree structure representing this expression is shown in the figure. Here, the operations having the highest precedence are at the lowest level, and it is stated explicitly which operands belong to which operator. Moreover, associativity can easily be imposed if we place the operators on the left or right side.
Basic terminologies and properties
Node: This is the main component of any tree structure. The concept of a node is the same as that used in a linked list. A node stores the actual data of the element and the links to the other nodes; the figure represents the structure of a node.
Parent: The parent of a node is its immediate predecessor. Here, X is the parent of Y and Z (see the figure).
Child: All immediate successors of a node are known as its children. For example, in the figure, Y and Z are the two children of X. The child on the left side is called the left child and the child on the right side is called the right child.
Link: This is a pointer to a node in a tree. For example, as shown in the figure, LC and RC are two links of a node. Note that a node may have more than two links.
Root: This is a specially designated node which has no parent. In the figure, A is the root node.
Leaf: A node which is at the end and does not have any child is called a leaf node. In the figure, H, I, K, L and M are the leaf nodes. A leaf node is also termed a terminal node.
Level: Level is the rank in the hierarchy. The root node has level 0. If a node is at level l, then its children are at level l + 1 and its parent is at level l - 1. This is true for all nodes except the root node, which is at level 0. In the figure, the levels of the various nodes are depicted.
Height: The maximum number of nodes possible on a path from the root node to a leaf node is called the height of the tree. For example, in the figure, the longest path is A-C-F-J-M and hence the height of this tree is 5. It can easily be seen that h = lmax + 1, where h is the height and lmax is the maximum level of the tree.
Degree: The maximum number of children possible for a node is known as the degree of the node. For example, the degree of each node of the tree shown in the figure is 2.
Sibling: Nodes which have the same parent are called siblings. For example, in the figure, J and K are siblings.
DEFINITION AND CONCEPTS
Let us define a tree. A tree is a finite set of one or more nodes such that:
(i) there is a specially designated node called the root;
(ii) the remaining nodes are partitioned into n (n >= 0) disjoint sets T1, T2, ..., Tn, where each Ti (i = 1, 2, ..., n) is a tree; T1, T2, ..., Tn are called the subtrees of the root.
To illustrate the above definition, let us consider the sample tree shown in the figure. In the sample tree T, there is a set of 12 nodes. Here, A is a special node, being the root of the tree. The remaining nodes are partitioned into three sets T1, T2 and T3; they are the subtrees of the root node A. By definition, each subtree is also a tree. Observe that a tree is defined recursively. The same tree can also be expressed in a parenthesized string notation.
Binary Trees
A tree whose elements have at most two children is called a binary tree. Since each element in a binary tree can have only two children, we typically name them the left and the right child.
Binary Tree Representation in C: A tree is represented by a pointer to the topmost node of the tree. If the tree is empty, the value of root is NULL. A tree node contains the following parts:
1. Data
2. Pointer to the left child
3. Pointer to the right child
A binary tree is a special form of a tree. Compared with a general tree, a binary tree is more important and is frequently used in various applications of computer science. Like a general tree, a binary tree T can be defined as a finite set of nodes such that:
(i) T is empty (called the empty binary tree), or
(ii) T contains a specially designated node called the root of T, and the remaining nodes of T form two disjoint binary trees T1 and T2, which are called the left sub-tree and the right sub-tree, respectively.
One can easily notice the main difference between the definitions of a tree and a binary tree: a tree can never be empty, but a binary tree may be empty. Another difference is that in a binary tree a node may have at most two children (that is, a tree of degree 2), whereas in a general tree a node may have any number of children.
Two special kinds of binary tree are possible: the full binary tree and the complete binary tree.
Full Binary Tree / Strictly Binary Tree
A binary tree is a full binary tree if it contains the maximum possible number of nodes at all levels. In a binary tree, every node can have a maximum of two children, but in a strictly binary tree every node must have exactly two children or none; that is, every internal node must have exactly two children. A binary tree in which every node has either two or zero children is called a strictly binary tree. A strictly binary tree is also called a full binary tree, a proper binary tree or a 2-tree. The strictly binary tree data structure is used to represent mathematical expressions.
Complete Binary Tree
A binary tree is said to be a complete binary tree if all its levels, except possibly the last level, have the maximum possible number of nodes, and all the nodes in the last level appear as far left as possible. In a binary tree every node can have a maximum of two children; in a strictly binary tree every node has exactly two children or none; and in a complete binary tree, in addition, every level of the tree (except possibly the last) must contain 2^level nodes (for example, 2^3 = 8 nodes at level 3). A binary tree in which every internal node has exactly two children and all leaf nodes are at the same level is called a complete binary tree; such a tree is also called a perfect binary tree.
Extended Binary Trees
A binary tree can be converted into a full binary tree by adding dummy nodes to existing nodes wherever required. The full binary tree obtained by adding dummy nodes to a binary tree is called an extended binary tree.
Properties of a Binary Tree
A binary tree possesses a number of properties. These properties are very useful and are listed below.
1. In any binary tree, the maximum number of nodes on level l is 2^l, where l >= 0.
2. The maximum number of nodes possible in a binary tree of height h is 2^h - 1.
3. The minimum number of nodes possible in a binary tree of height h is h.
4. For any non-empty binary tree, if n is the number of nodes and e is the number of edges, then n = e + 1.
5. For any non-empty binary tree T, if n0 is the number of leaf nodes (degree = 0) and n2 is the number of internal nodes of degree 2, then n0 = n2 + 1.
6. The height of a complete binary tree with n nodes is ceil(log2(n + 1)).
7. The total number of distinct binary trees possible with n nodes is 2nCn / (n + 1), the nth Catalan number.
Representations of a Binary Tree
A (binary) tree must represent a hierarchical relationship between a parent node and (at most two) child nodes. There are two common methods for representing this conceptual structure. One is the implicit approach, called linear (or sequential) representation, where an array is used and we do not require the overhead of maintaining pointers (links). The other is the explicit approach, known as linked representation, which uses pointers. Whatever the representation, the main objective is that one should have direct access to the root node of the tree and, for any given node, direct access to its children.
1. Linear Representation of a Binary Tree / Array Representation
This type of representation is static in the sense that a block of memory for an array is allocated before the actual tree is stored in it, and once the memory is allocated, the size of the tree is restricted to what the memory permits. In this representation, the nodes are stored level by level, starting from level zero, where only the root node is present. The root node is stored in the first memory location (as the first element of the array). The following rules can be used to decide the location of any node of a tree in the array (assuming that the array index starts from 1):
1. The root node is at location 1.
2. For any node with index i, 1 < i <= n (for some n):
(a) PARENT(i) = floor(i / 2). For the node with i = 1, there is no parent.
(b) LCHILD(i) = 2 * i. If 2 * i > n, then node i has no left child.
(c) RCHILD(i) = 2 * i + 1. If 2 * i + 1 > n, then node i has no right child.
For example, let us consider the representation of the following expression in the form of a tree:
(A - B) + C + (D / E)
The binary tree will appear as in the figure, and the representation of the same tree using an array is shown in the figure.
The size of the array can be fixed easily if the binary tree is a full binary tree: the maximum number of nodes possible in a binary tree of height h is 2^h - 1, so the size of the array needed to fit such a binary tree is 2^h - 1.
Advantages of the sequential representation of a binary tree
1. Any node can be accessed from any other node by calculating the index, and this is efficient from the execution point of view.
2. Only the data are stored, without any pointers to successors or predecessors, which are maintained implicitly.
3. In programming languages where dynamic memory allocation is not possible (such as BASIC and FORTRAN), array representation is the only means to store a tree.
Disadvantages of the sequential representation of a binary tree
Along with these potential advantages, there are disadvantages as well:
1. For trees other than full binary trees, the majority of the array entries may be empty.
2. It allows only a static representation. It is in no way possible to extend the tree structure beyond the limited array size.
3. Inserting a new node into the tree or deleting a node from it is inefficient with this representation, because these operations require considerable data movement up and down the array, which demands an excessive amount of processing time.
2. Linked List Representation of a Binary Tree
Here every node consists of three fields, much like a node of a double linked list: the first field stores the address of the left child, the second stores the actual data, and the third stores the address of the right child. In this linked representation, a node has the structure shown in the figure, and the example binary tree represented in this way is shown in the figure as well.
In spite of its simplicity and ease of implementation, the linear representation of binary trees has a number of overheads. In the linked representation of binary trees, all these overheads are taken care of. The linked representation assumes the structure of a node to be as shown in the figure. Here, LC and RC are the two link fields used to store the addresses of the left child and right child of a node, and DATA is the information content of the node. With this representation, if one knows the address of the root node, then any other node can be accessed from it. The two forms, that is, the 'tree' structure and the 'linked' structure, look almost similar. This similarity implies that the linked representation of a binary tree very closely resembles the logical structure of the data involved; this is why all the algorithms for the various operations with this representation can be defined easily. Using the linked representation, the tree with 9 nodes given in Figure 7.16(a) will appear as shown in the figure.
Another advantage of this representation is that it allows dynamic memory allocation; hence the size of the tree can be changed as and when the need arises, without any limitation other than the availability of total memory. However, as far as the memory requirement for a tree is concerned, the linked representation uses more memory than the linear representation: it requires extra memory to maintain the pointers, and even pointers with NULL values need memory to store them. In the linked representation of a binary tree with n nodes, the number of null links is n + 1.
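A minimal C sketch of this linked representation, assuming an integer DATA field; the names TreeNode, LC, RC, makeNode and buildSmallTree mirror the description above and are otherwise illustrative.

#include <stdlib.h>

/* Node of the linked representation: left child, data, right child. */
struct TreeNode {
    struct TreeNode *LC;   /* address of the left child, or NULL  */
    int DATA;              /* information content of the node     */
    struct TreeNode *RC;   /* address of the right child, or NULL */
};

/* Allocate a node holding the given data, with no children yet. */
struct TreeNode *makeNode(int data)
{
    struct TreeNode *n = malloc(sizeof *n);
    n->LC = NULL;
    n->DATA = data;
    n->RC = NULL;
    return n;
}

/* Example: a three-node tree with root 1, left child 2 and right child 3. */
struct TreeNode *buildSmallTree(void)
{
    struct TreeNode *root = makeNode(1);
    root->LC = makeNode(2);
    root->RC = makeNode(3);
    return root;
}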
Traversals
The traversal operation is a frequently used operation on a binary tree. This operation is used to visit each node in the tree exactly once, and a full traversal of a binary tree gives a linear ordering of the data in the tree. The order in which the nodes of a binary tree are displayed (or visited) is called a binary tree traversal. There are three types of binary tree traversal:
1. In-order traversal
2. Pre-order traversal
3. Post-order traversal
The in-order traversal for the example binary tree above is I - D - J - B - F - A - G - K - C - H.
The pre-order traversal for the example binary tree is A - B - D - I - J - F - C - G - K - H.
The post-order traversal for the example binary tree is I - J - D - F - B - K - G - H - C - A.
In-order traversal of a binary tree (leftChild - root - rightChild)
In-order traversal of a binary tree follows three ordered steps:
Traverse the left sub-tree of the root node R in inorder.
Visit the root node R.
Traverse the right sub-tree of the root node R in inorder.
Of these steps, steps 1 and 3 are defined recursively. The following algorithm, Inorder, implements the above definition.
Algorithm Inorder
Input: ROOT is the pointer to the root node of the binary tree.
Output: Visiting all the nodes in inorder fashion.
Data structure: Linked structure of a binary tree.
Steps:
1. ptr = ROOT // Start from ROOT
2. If (ptr != NULL) then // If it is not an empty node
3. Inorder(ptr->LC) // Traverse the left sub-tree of the node in inorder
4. Visit(ptr) // Visit the node
5. Inorder(ptr->RC) // Traverse the right sub-tree of the node in inorder
6. EndIf
7. Stop
Pre-order traversal of a binary tree (root - leftChild - rightChild)
In this traversal, the root is visited first, then the left sub-tree in preorder fashion, and then the right sub-tree in preorder fashion. Such a traversal can be defined as follows:
Visit the root node R.
Traverse the left sub-tree of R in preorder.
Traverse the right sub-tree of R in preorder.
Algorithm Preorder
Input: ROOT is the pointer to the root node of the binary tree.
Output: Visiting all the nodes in preorder fashion.
Data structure: Linked structure of a binary tree.
Steps:
1. ptr = ROOT // Start from the ROOT
2. If (ptr != NULL) then // If it is not an empty node
3. Visit(ptr) // Visit the node
4. Preorder(ptr->LC) // Traverse the left sub-tree of the node in preorder
5. Preorder(ptr->RC) // Traverse the right sub-tree of the node in preorder
6. EndIf
7. Stop
Post-order traversal of a binary tree (leftChild - rightChild - root)
Here, the root node is visited at the end; that is, we first visit the left sub-tree, then the right sub-tree, and lastly the root. A definition of this type of tree traversal is stated below:
Traverse the left sub-tree of the root R in postorder.
Traverse the right sub-tree of the root R in postorder.
Visit the root node R.
Algorithm Postorder
Input: ROOT is the pointer to the root node of the binary tree.
Output: Visiting all the nodes in postorder fashion.
Data structure: Linked structure of a binary tree.
Steps:
1. ptr = ROOT // Start from the root
2. If (ptr != NULL) then // If it is not an empty node
3. Postorder(ptr->LC) // Traverse the left sub-tree of the node in postorder
4. Postorder(ptr->RC) // Traverse the right sub-tree of the node in postorder
5. Visit(ptr) // Visit the node
6. EndIf
7. Stop
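A compact C sketch of the three traversals, reusing the TreeNode layout sketched earlier; Visit is taken here to be a simple printf, and the function names are illustrative.

#include <stdio.h>

struct TreeNode {
    struct TreeNode *LC;
    int DATA;
    struct TreeNode *RC;
};

/* In-order: left sub-tree, node, right sub-tree. */
void inorder(struct TreeNode *ptr)
{
    if (ptr != NULL) {
        inorder(ptr->LC);
        printf("%d ", ptr->DATA);   /* visit the node */
        inorder(ptr->RC);
    }
}

/* Pre-order: node, left sub-tree, right sub-tree. */
void preorder(struct TreeNode *ptr)
{
    if (ptr != NULL) {
        printf("%d ", ptr->DATA);
        preorder(ptr->LC);
        preorder(ptr->RC);
    }
}

/* Post-order: left sub-tree, right sub-tree, node. */
void postorder(struct TreeNode *ptr)
{
    if (ptr != NULL) {
        postorder(ptr->LC);
        postorder(ptr->RC);
        printf("%d ", ptr->DATA);
    }
}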
Binary Search Trees
In a binary tree, every node can have a maximum of two children, but there is no ordering of the nodes based on their values; the elements are arranged as they arrive, from top to bottom and left to right. To enhance the performance of searching, we use a special type of binary tree known as a binary search tree, which mainly focuses on the search operation. A binary search tree can be defined as follows.
A Binary Search Tree is a binary tree in which every node contains only smaller values in its left subtree and only larger values in its right subtree. The properties that separate a binary search tree from a regular binary tree are:
1. All nodes of the left subtree are less than the root node.
2. All nodes of the right subtree are greater than the root node.
3. Both subtrees of each node are also BSTs, i.e. they have the above two properties.
In the tree shown in the figure, the value of the root node is 40, which is greater than its left child 30 but smaller than the right child of 30, i.e. 55. So that tree does not satisfy the property of a binary search tree and is therefore not a binary search tree.
Advantages of a binary search tree
Searching for an element in a binary search tree is easy, as we always have a hint as to which subtree contains the desired element. Compared with arrays and linked lists, insertion and deletion operations are faster in a BST.
Example of creating a binary search tree
Now, let's see the creation of a binary search tree using an example. Suppose the data elements are 45, 15, 79, 90, 10, 55, 12, 20, 50.
First, we insert 45 into the tree as the root. Then we read the next element: if it is smaller than the root node, it goes into the left subtree, and if it is larger, it goes into the right subtree, applying the same rule recursively. The process of creating the BST from the given data elements is shown below.
Step 1 - Insert 45.
Step 2 - Insert 15. As 15 is smaller than 45, insert it as the root node of the left subtree.
Step 3 - Insert 79. As 79 is greater than 45, insert it as the root node of the right subtree.
Step 4 - Insert 90. 90 is greater than 45 and 79, so it is inserted as the right child of 79.
Step 5 - Insert 10. 10 is smaller than 45 and 15, so it is inserted as the left child of 15.
Step 6 - Insert 55. 55 is larger than 45 and smaller than 79, so it is inserted as the left child of 79.
Step 7 - Insert 12. 12 is smaller than 45 and 15 but greater than 10, so it is inserted as the right child of 10.
Step 8 - Insert 20. 20 is smaller than 45 but greater than 15, so it is inserted as the right child of 15.
Step 9 - Insert 50. 50 is greater than 45 but smaller than 79 and 55, so it is inserted as the left child of 55.
Now the creation of the binary search tree is completed. Next, let's move towards the operations that can be performed on a binary search tree.
Searching in a binary search tree
Searching means finding or locating a specific element or node in a data structure. In a binary search tree, searching for a node is easy because the elements are stored in a specific order. The steps for searching a node in a binary search tree are listed as follows (a C sketch follows the list):
1. First, compare the element to be searched with the root element of the tree.
2. If the root matches the target element, return the node's location.
3. If it does not match, check whether the item is less than the root element; if it is smaller, move to the left subtree.
4. If it is larger than the root element, move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or is not present in the tree, return NULL.
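A minimal recursive C sketch of this search, assuming node fields named left, data and right; the names BSTNode and search are illustrative.

#include <stddef.h>

struct BSTNode {
    struct BSTNode *left;
    int data;
    struct BSTNode *right;
};

/* Return the node containing item, or NULL if it is not in the tree. */
struct BSTNode *search(struct BSTNode *root, int item)
{
    if (root == NULL || root->data == item)
        return root;                       /* found, or reached an empty subtree */
    if (item < root->data)
        return search(root->left, item);   /* smaller keys live in the left subtree */
    return search(root->right, item);      /* larger keys live in the right subtree */
}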
Now, let's understand searching in a binary search tree using an example, taking the binary search tree formed above. Suppose we have to find node 20 in that tree.
Step 1: Compare 20 with the root, 45; since 20 is smaller, move to the left subtree.
Step 2: Compare 20 with 15; since 20 is larger, move to the right subtree.
Step 3: Compare 20 with 20; the match is found, so return the node's location.
Now, let's see the algorithm to search for an element in the binary search tree.
Algorithm to search an element in a binary search tree
Search(root, item)
Step 1 - if (item = root->data) or (root = NULL)
    return root
else if (item < root->data)
    return Search(root->left, item)
else
    return Search(root->right, item)
END if
Step 2 - END
Deletion in a binary search tree
In a binary search tree, we must delete a node from the tree while keeping in mind that the property of the BST is not violated. To delete a node from a BST, three situations can occur:
1. The node to be deleted is a leaf node,
2. The node to be deleted has only one child, or
3. The node to be deleted has two children.
We will discuss the situations listed above in detail.
When the node to be deleted is a leaf node
This is the simplest case of deleting a node from a BST. Here, we simply replace the leaf node with NULL and free the allocated space. We can see the process of deleting a leaf node from a BST in the figure: suppose we have to delete node 90; as the node to be deleted is a leaf node, it is replaced with NULL and the allocated space is freed.
When the node to be deleted has only one child
In this case, we have to replace the target node with its child and then delete the child node. After replacing the target node with its child node, the child node will contain the value to be deleted, so we simply replace that child node with NULL and free the allocated space. We can see the process of deleting a node with one child from a BST in the figure: suppose we have to delete node 79; as the node to be deleted has only one child, it is replaced with its child 55. The replaced node 79 then becomes a leaf node that can easily be deleted.
When the node to be deleted has two children
This case is a bit more complex than the other two cases. In such a case, the steps to be followed are:
1. First, find the inorder successor of the node to be deleted.
2. Then repeatedly swap the node with its inorder successor until the target node is placed at a leaf of the tree.
3. Finally, replace the node with NULL and free the allocated space.
The inorder successor is required when the right child of the node is not empty; we can obtain the inorder successor by finding the minimum element in the right child of the node. We can see the process of deleting a node with two children from a BST in the figure: suppose we have to delete node 45, the root node; as it has two children, it is replaced with its inorder successor. Node 45 then sits at a leaf of the tree, so it can be deleted easily.
Now let's understand how insertion is performed on a binary search tree.
Insertion in a binary search tree
A new key in a BST is always inserted at a leaf. To insert an element in a BST, we start searching from the root node: if the key to be inserted is less than the root node, we search for an empty location in the left subtree; otherwise, we search for an empty location in the right subtree and insert the data there. Insertion in a BST is similar to searching, as we always have to maintain the rule that the left subtree is smaller than the root and the right subtree is larger than the root. A sketch of the insertion logic is given below.
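A minimal recursive C sketch of BST insertion, reusing the BSTNode layout assumed earlier; the function name insert is illustrative.

#include <stdlib.h>

struct BSTNode {
    struct BSTNode *left;
    int data;
    struct BSTNode *right;
};

/* Insert item into the subtree rooted at root and return the (possibly new) root. */
struct BSTNode *insert(struct BSTNode *root, int item)
{
    if (root == NULL) {                           /* empty spot found: place the new leaf here */
        struct BSTNode *n = malloc(sizeof *n);
        n->left = n->right = NULL;
        n->data = item;
        return n;
    }
    if (item < root->data)
        root->left = insert(root->left, item);    /* smaller keys go to the left subtree */
    else
        root->right = insert(root->right, item);  /* larger (or equal) keys go to the right */
    return root;
}

/* Example: building the tree used above.
   int keys[] = { 45, 15, 79, 90, 10, 55, 12, 20, 50 };
   struct BSTNode *root = NULL;
   for (int i = 0; i < 9; i++)
       root = insert(root, keys[i]);
*/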
Binary Search Tree Complexities
Time complexity (here, n is the number of nodes in the tree):
Operation    Best Case    Average Case    Worst Case
Search       O(log n)     O(log n)        O(n)
Insertion    O(log n)     O(log n)        O(n)
Deletion     O(log n)     O(log n)        O(n)
Space complexity: the space complexity for all the operations is O(n).
Applications of trees
Unlike arrays and linked lists, which are linear data structures, a tree is a hierarchical (non-linear) data structure.
1. One reason to use trees might be that you want to store information that naturally forms a hierarchy, for example the file system on a computer:
file system
-----------
      /      <-- root
   /     \
 ...     home
        /      \
    ugrad     course
      /      /    |    \
    ...   cs101 cs112 cs113
2. If we organize keys in the form of a tree (with some ordering, e.g. a BST), we can search for a given key in moderate time (quicker than in a linked list, but slower than in an array). Self-balancing search trees like AVL and red-black trees guarantee an upper bound of O(log n) for search.
3. We can insert and delete keys in moderate time (quicker than in arrays, but slower than in unordered linked lists). Self-balancing search trees like AVL and red-black trees guarantee an upper bound of O(log n) for insertion and deletion.
4. Like linked lists, and unlike arrays, the pointer implementation of trees places no fixed limit on the number of nodes, since nodes are linked using pointers.
Other applications:
1. Storing hierarchical data, like folder structures, organisation structures, and XML/HTML data.
2. Binary Search Tree: a tree that allows fast search, insert and delete on sorted data; it also allows finding the closest item.
3. Heap: a tree data structure which is implemented using arrays and used to implement priority queues.
4. B-Tree and B+ Tree: used to implement indexing in databases.
5. Syntax Tree: scanning, parsing, code generation and evaluation of arithmetic expressions in compiler design.
6. K-D Tree: a space-partitioning tree used to organise points in K-dimensional space.
7. Trie: used to implement dictionaries with prefix lookup.
8. Suffix Tree: for quick pattern searching in a fixed text.
9. Spanning trees and shortest-path trees, used in routers and bridges respectively in computer networks.
10. As a workflow for compositing digital images for visual effects.
11. Decision trees.
12. Organisation chart of a large organisation.
Applications of Binary Trees
A binary tree is a tree data structure comprising nodes with at most two children, i.e. a right and a left child. In computing, binary trees are mainly used for searching and sorting, as they provide a means to store data hierarchically. A Binary Search Tree is a tree that allows fast search, insert and delete on sorted data, and also allows finding the closest item. Some common operations that can be conducted on binary trees include insertion, deletion and traversal.
Routing Tables
A routing table is used to link routers in a network. It is usually implemented with a trie data structure, which is a variation of a binary tree. The tree data structure stores the locations of routers based on their IP addresses; routers with similar addresses are grouped under a single subtree.
Decision Trees
Binary trees can also be used for classification purposes. A decision tree is a supervised machine learning algorithm; the binary tree data structure is used here to emulate the decision-making process.
Expression Evaluation
An expression tree is evaluated by applying the operator in each internal node to the operands in its leaves (or, more generally, to the values of its sub-trees). A sketch of this evaluation is given below.
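A minimal recursive C sketch of expression-tree evaluation, assuming leaves hold numeric values and internal nodes hold one of the operators +, -, * or /; the ExprNode layout and the function name eval are illustrative.

#include <stdio.h>

/* A leaf stores a value; an internal node stores an operator and two sub-trees. */
struct ExprNode {
    char op;                   /* '+', '-', '*', '/' for internal nodes, 0 for a leaf */
    double value;              /* used only when op == 0 */
    struct ExprNode *left, *right;
};

double eval(const struct ExprNode *t)
{
    if (t->op == 0)                        /* leaf: just return the operand */
        return t->value;
    double l = eval(t->left);              /* evaluate the left sub-tree  */
    double r = eval(t->right);             /* evaluate the right sub-tree */
    switch (t->op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return l / r;                /* '/' */
    }
}

int main(void)
{
    /* Tree for (4 - 1) + 2 */
    struct ExprNode four = { 0, 4.0, NULL, NULL };
    struct ExprNode one  = { 0, 1.0, NULL, NULL };
    struct ExprNode two  = { 0, 2.0, NULL, NULL };
    struct ExprNode sub  = { '-', 0.0, &four, &one };
    struct ExprNode add  = { '+', 0.0, &sub, &two };
    printf("%g\n", eval(&add));            /* prints 5 */
    return 0;
}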
Sorting
Binary search trees, a variant of binary trees, are used in the implementation of sorting algorithms to order items. A binary search tree is simply an ordered or sorted binary tree in which the value in the left child is less than the value in the parent node, while the values in the right subtree are greater than the value in the parent node.
Indices for Databases
In database indexing, B-trees are used to sort data for simplified searching, insertion and deletion. It is important to note that a B-tree is not a binary tree, but it can become one when it takes on the properties of a binary tree.
Data Compression
In data compression, Huffman coding is used to create a binary tree capable of compressing data. Data compression is the process of encoding data to use fewer bits. Given a text to compress, Huffman coding builds a binary tree and inserts the encodings of characters in the nodes based on their frequency in the text. The encoding for a character is obtained by traversing the tree from its root to the corresponding node. Frequently occurring characters will have a shorter path than less frequently occurring characters; this reduces the number of bits for frequent characters and ensures maximum data compression.
Huffman Coding
Huffman coding is a technique for compressing data to reduce its size without losing any of the details. It was first developed by David Huffman. Huffman coding is generally useful for compressing data in which there are frequently occurring characters. The idea is to assign variable-length codes to input characters, where the lengths of the assigned codes are based on the frequencies of the corresponding characters: the most frequent character gets the smallest code and the least frequent character gets the largest code. There are two major parts in Huffman coding:
1. Build a Huffman tree from the input characters.
2. Traverse the Huffman tree and assign codes to the characters.
How Huffman coding works
Suppose the string shown in the figure is to be sent over a network. Each character occupies 8 bits, and there are a total of 15 characters in the string; thus, a total of 8 * 15 = 120 bits are required to send this string. Using the Huffman coding technique, we can compress the string to a smaller size. Huffman coding first creates a tree using the frequencies of the characters and then generates a code for each character. Once the data is encoded, it has to be decoded; decoding is done using the same tree. Huffman coding prevents any ambiguity in the decoding process by using the concept of a prefix code, i.e. the code associated with one character is never a prefix of the code of any other character. The tree created below helps in maintaining this property.
Huffman coding is done with the help of the following steps.
1. Calculate the frequency of each character in the string.
2. Sort the characters in increasing order of frequency. These are stored in a priority queue Q.
3. Make each unique character a leaf node.
4. Create an empty node z. Assign the minimum frequency to the left child of z and the second minimum frequency to the right child of z. Set the value of z to the sum of these two minimum frequencies.
5. Remove these two minimum frequencies from Q and add the sum to the list of frequencies (internal nodes are marked with * in the figure).
6. Insert node z into the tree.
7. Repeat steps 3 to 5 for all the characters.
8. For each non-leaf node, assign 0 to the left edge and 1 to the right edge.
For sending the above string over a network, we have to send the tree as well as the compressed code. The total size is given by the table below.
Character    Frequency    Code    Size
A            5            11      5 * 2 = 10
B            1            100     1 * 3 = 3
C            6            0       6 * 1 = 6
D            3            101     3 * 3 = 9
Totals       4 * 8 = 32 bits      15 bits      28 bits
Without encoding, the total size of the string was 120 bits. After encoding, the size is reduced to 32 + 15 + 28 = 75 bits.
Decoding the code
To decode the code, we take the encoded bits and traverse the tree from the root to find each character. For example, let 101 be decoded: we traverse from the root as in the figure, taking the right edge for 1, the left edge for 0 and the right edge for 1, which leads to the leaf whose code is 101 in the table above. A compact C sketch of building the Huffman tree and reading off the codes is given below.
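The sketch below builds the tree for the frequencies in the table (A:5, B:1, C:6, D:3) by repeatedly merging the two least-frequent subtrees, then walks the tree assigning 0 to left edges and 1 to right edges. A linear scan stands in for the priority queue Q, the exact bit patterns printed may differ from the table depending on how ties are broken, and all names here are illustrative.

#include <stdio.h>
#include <stdlib.h>

/* One node of the Huffman tree: a character for leaves, '\0' for internal nodes. */
struct HuffNode {
    char ch;
    int freq;                              /* frequency of the character / subtree */
    struct HuffNode *left, *right;
};

static struct HuffNode *newNode(char ch, int freq,
                                struct HuffNode *l, struct HuffNode *r)
{
    struct HuffNode *n = malloc(sizeof *n);
    n->ch = ch; n->freq = freq; n->left = l; n->right = r;
    return n;
}

/* Remove and return the node with the smallest frequency from nodes[0..*n-1]. */
static struct HuffNode *extractMin(struct HuffNode *nodes[], int *n)
{
    int min = 0;
    for (int i = 1; i < *n; i++)
        if (nodes[i]->freq < nodes[min]->freq) min = i;
    struct HuffNode *m = nodes[min];
    nodes[min] = nodes[--(*n)];            /* fill the gap with the last entry */
    return m;
}

/* Walk the finished tree: 0 for a left edge, 1 for a right edge. */
static void printCodes(struct HuffNode *t, char *code, int depth)
{
    if (!t->left && !t->right) {           /* leaf: print its accumulated code */
        code[depth] = '\0';
        printf("%c : %s\n", t->ch, code);
        return;
    }
    code[depth] = '0'; printCodes(t->left, code, depth + 1);
    code[depth] = '1'; printCodes(t->right, code, depth + 1);
}

int main(void)
{
    /* Frequencies from the table above. */
    char chars[] = { 'A', 'B', 'C', 'D' };
    int freqs[]  = { 5, 1, 6, 3 };
    int n = 4;

    struct HuffNode *nodes[4];
    for (int i = 0; i < n; i++)
        nodes[i] = newNode(chars[i], freqs[i], NULL, NULL);

    /* Repeatedly merge the two least-frequent subtrees (steps 4-7 above). */
    while (n > 1) {
        struct HuffNode *a = extractMin(nodes, &n);
        struct HuffNode *b = extractMin(nodes, &n);
        nodes[n++] = newNode('\0', a->freq + b->freq, a, b);
    }

    char code[16];
    printCodes(nodes[0], code, 0);         /* nodes[0] is now the root of the tree */
    return 0;
}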