DATA STRUCTURES & ANALYSIS OF ALGORITHMS What is “data structures”? A data structure is a way of organizing data that considers not only the items stored, but also their relationship to each other. Advance knowledge about the relationship between data items allows designing of efficient algorithms for the manipulation of data. List out the areas in which data structures are applied extensively? • Compiler design • Operating system • Database management system Define a linear and non linear data structure. • Linear data structure: a linear data structure traverses the data elements sequentially, in which only one data element can directly be reached. Ex: arrays, stack, queue • Non-linear data structure: every data item is attached to several other data items in a way that is specific for reflecting relationships. The data items are not arranged in a sequential structure. Ex: trees, graphs State the difference between arrays and linked lists Arrays Size of an array is fixed It is necessary to specify the number of elements during declaration. Linked Lists Size of a list varies It is not necessary to specify the number of elements during declaration Insertions and deletions make the Insertions and deletions elements to change the positions carried out easily It occupies less memory than a It occupies more memory linked list for the same number of elements are Are linked lists considered linear or non-linear data structure? • It actually depends on where you intend to apply linked lists. If you based it on storage, a linked list is considered non-linear. On the other hand, if you based it on access strategies, then a linked list is considered linear. What is the data structure used to perform recursion? • Stack, because of its LIFO (last in first out) property it remembers its 'caller' so knows whom to return when the function has to return. Recursion makes use of system stack for storing the return addresses of the function calls. • Every recursive function has its equivalent iterative (non-recursive) function. Even when such equivalent iterative procedures are written, explicit stack is to be used. Describe stack operation. • Push – pushes (inserts) the element in the stack. The location is specified by the pointer. • Pop – pulls (removes) the element out of the stack. The location is specified by the pointer • Peek: - returns the top element on the stack but does not remove it from the stack The stack pointer points to the recent element inserted. What are the different applications of stack? • Function calls. • Reversal of a string. • Checking validity of an expression containing nested parenthesis. • Conversion of infix expression to postfix. • Evaluation of postfix expression. Describe queue operation. Queue is a data structure that follows first in first out strategy. • Enqueue – inserts the element in the queue at the end. • Dequeue – removes the element out of the queue from the front • Size – returns the size of the queue • Front – returns the first element of the queue. • Isempty – to find if the queue is empty. Discuss how to implement queue using stack. • Have two stacks S1 and S2 • For enqueue, perform push on S1 • For dequeue, if S2 is not empty, pop top element from S2. If S2 is empty, then pop all elements currently in S1 and push them to S2, the last element popped from S1 is the element to be dequeued. What are priority queues? A priority queue is essentially a list of items in which each item has associated with it a priority. Items are inserted into a priority queue in any, arbitrary order. However, items are withdrawn from a priority queue in order of their priorities starting with the highest priority item first. Applications of queue • Serving requests of a single shared resource (printer, disk, cpu), transferring data asynchronously (data not necessarily received at same rate as sent) between two processes (io buffers), Pipes, file io, sockets. • Call center phone systems will use a queue to hold people in line until a service representative is free. • Buffers on mp3 players and portable cd players, ipod playlist. Playlist for jukebox - add songs to the end, play from the front of the list. • When programming a real-time system that can be interrupted (eg., By a mouse click or wireless connection), it is necessary to attend to the interrupts immediately, before proceeding with the current activity. If the interrupts should be handled in the same order they arrive, then a FIFO queue is the appropriate data structure. Define a deque • Deque (double-ended queue) is another form of a queue in which insertions and deletions are made at both the front and rear ends of the queue. There are two variations of a deque, namely, input restricted deque and output restricted deque. • The input restricted deque allows insertion at one end (it can be either front or rear) only. • The output restricted deque allows deletion at one end (it can be either front or rear) only. Describe tree data structures • A recursive data structure, which consists of nodes, connected with edges. The following statements are true for trees: - Each node can have 0 or more direct descendants (children). - Each node has at most one parent. There is only one special node without parent – the root (if the tree is not empty). - All nodes are reachable from the root – there is a path from the root to each node in the tree. • A node is a tree and this node can have zero or more children, which are also trees. Describe binary tree A binary tree consists of • A node (called the root node) and • Left and right sub-trees. Both the sub-trees are themselves binary trees. Define a binary search tree A binary search tree is a special binary tree, which is either empty or it should satisfy the following characteristics: • Every node has a value and no two nodes should have the same value i.E) the values in the binary search tree are distinct • The values in any left sub-tree is less than the value of its parent node • The values in any right sub-tree is greater than the value of its parent node • The left and right sub-trees of each node are again binary search trees What are the different binary tree traversal techniques? • Preorder traversal • Inorder traversal • Postorder traversal What is the order of traversing nodes in in-order traversal? • Traverse the left sub-tree • Process the root node • Traverse the right sub-tree What is the order of traversing nodes in preorder traversal? • Process the root node • Traverse the left sub-tree • Traverse the right sub-tree What is the order of traversing nodes in postorder traversal? • Traverse the left sub-tree • Traverse the right sub-tree • Process the root node Give the various traversals of the following tree •Pre-Order: A-B-D-C-E-F •In-Order: B-D-A-E-C-F •Post-Order: D-B-E-F-C-A What is an AVL tree? An avl tree is a binary tree in which the difference between the height of the right and left subtrees (or the root node) is never more than one. Binary heap A binary heap is a complete binary tree which satisfies the heap ordering property. The ordering can be one of two types: • The min-heap property: the value of each node is greater than or equal to the value of its parent, with the minimumvalue element at the root. • The max-heap property: the value of each node is less than or equal to the value of its parent, with the maximumvalue element at the root. What is a graph? • A graph G = (V,E) is composed of: V: set of vertices E: set of edges connecting the vertices in V • An edge e = (u,v) is a pair of vertices a b • Example: V= {a,b,c,d,e} c E= {(a,b),(a,c),(a,d),(b,e),(c,d),(c,e), (d,e)} d e What are the representations of a graph? Adjacency matrix Adjacency lists How can you represent a graph using adjacency matrix? Let g=(v,e) be a graph with n vertices. The adjacency matrix of g is a two-dimensional n by n array, say adj_mat If the edge (vi, vj) is in e(g), adj_mat[i][j]=1 If there is no such edge in e(g), adj_mat[i][j]=0 The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a digraph need not be symmetric Adjacency list representation of a graph • An array of linked lists is used. Size of the array is equal to number of vertices. Let the array be array[]. An entry array[i] represents the linked list of vertices adjacent to the i th vertex. This representation can also be used to represent a weighted graph. The weights of edges can be stored in nodes of linked lists. What are the ways to traverse all the nodes in the graph? • Depth first search • Once a possible path is found, continue the search until the end of the path • Breadth first search • Start several paths at a time, and advance in each one step at a time Give an example of dfs for graph Give an example of bfs for graph Topological sorting • An ordering of the vertices in a directed acyclic graph, such that: if there is path from u to v, then u appears before v in the ordering. One possible sorting: V1, V2, V4, V3, V5 Another sorting: V1, V2, V4, V5, V3 Applications of graph • Computer networks • Electronic circuits What is a spanning tree? • A spanning tree of a graph g(v,e) is a free tree (i.e., A tree with no root) with | v | - 1 edges that connects all the vertices of the graph. What is a minimum spanning tree A minimum spanning tree for g is a spanning tree of g that has the least total cost. Example: the graph Has 16 spanning trees. Some are: The graph has two minimum-cost spanning trees, each with a cost of 6: Give some applications of minimum-cost spanning trees • Building cable networks that join n locations with minimum cost. • Building a road network that joins n cities with minimum cost. • Obtaining an independent set of circuit equations for an electrical network. What are the algorithms available to find mst • Prim’s algorithm • Kruskal’s algorithm What is the idea of prim’s algorithm to find MST? • Prim’s algorithm finds a minimum cost spanning tree by selecting edges from the graph one-by-one as follows: • It starts with a tree, t, consisting of the starting vertex, x. • Then, it adds the shortest edge emanating from x that connects t to the rest of the graph. • It repeats the process and stops when the tree covers all vertices What is the idea of kruskal’s algorithm to find MST? • A forest is a graph whose connected components are trees. Starting from a spanning forest with no edges, repeatedly add edges of minimum weight (never creating a cycle) until the forest becomes a tree. Sequential search • A sequential search of a list/array begins at the beginning of the list/array and continues until the item is found or the entire list/array has been searched • Sequential search (array implementation) uses n+1 comparisons for an unsuccessful search (always). • Time complexity will be in the order of n in average Binary Search • Binary search algorithm assumes that the items in the array being searched are sorted • The algorithm begins at the middle of the array in a binary search • If the item for which we are searching is less than the item in the middle, we know that the item won’t be in the second half of the array • Once again we examine the “middle” element • The process continues with each comparison cutting in half the portion of the array where the item might be • Time complexity will be in the order of log N in average. Hash function • determines position of key in the array. • assume table (array) size is n • function f(x) maps any key x to an int between 0 and n−1 • for example, assume that n=15, that key x is a non-negative integer between 0 and max_int, and hash function f(x) = x % 15. How can collision be handled in hashing? • Open hashing (closed addressing) - separate chaining • Closed hashing(open addressing) Separate chaining: disadvantages • Parts of the array might never be used. • As chains get longer, search time increases to o(n) in the worst case. • Constructing new chain nodes is relatively expensive (still constant time, but the constant is high). ALGORITHM TECHNIQUES Brute force • A straightforward approach, usually based directly on the problem’s statement and definitions of the concepts involved • Examples: • Computing an (a > 0, n a nonnegative integer) • Computing n! • Multiplying two matrices • Searching for a key of a given value in a list Divide-and-conquer technique a problem of size n (instance) subproblem 1 of size n/2 subproblem 2 of size n/2 a solution to subproblem 1 a solution to subproblem 2 a solution to the original problem Examples • sorting: mergesort and quicksort • binary tree traversals Decrease and conquer 1. Reduce problem instance to smaller instance of the same problem 2. Solve smaller instance 3. Extend solution of smaller instance to obtain solution to original instance • Decrease by a constant: insertion sort • Decrease by a constant factor: binary search • Decreaase by variable: euclid’s algorithm gcd(m,n) = gcd(n, m mod n) Transform and conquer This group of techniques solves a problem by a transformation. • Instance simplification: • Presorting • Representation change: • Heap sort • Problem reduction: • Mathematical results Greedy technique Constructs a solution to an optimization problem piece by piece through a sequence of choices that are: Feasible, i.E. Satisfying the constraints • Locally optimal (with respect to some neighborhood definition) • Greedy (in terms of some measure), and irrevocable • Example: minimum spanning tree, single source shortest path, huffman coding, simple scheduling problems THANK YOU ALL THE BEST