CSE674 Advanced Data Structures and Algorithms
Course Overview (Week 10)

Analysis of Algorithms
• We can have different algorithms to solve a given problem. Which one is the most efficient? Between two given algorithms, which one is more efficient?
• Analysis of Algorithms is the area of computer science that provides tools to analyze the efficiency of different methods of solution.
• The efficiency of an algorithm means:
– how much time it requires,
– how much memory space it requires,
– how much disk space and other resources it requires.
• We will concentrate on the time requirement.
• We try to find the efficiency of the algorithms themselves, not of their implementations.
• An analysis should focus on gross differences in the efficiency of algorithms that are likely to dominate the overall cost of a solution.

O-Notation (Upper Bounds)
Formal definition of O-notation: we write f(n) = O(g(n)) if there exist constants c > 0, n0 > 0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0.
Ex.
• 2n^2 = O(n^3): 2n^2 ≤ c·n^3 ⇒ 2 ≤ c·n, so c = 1 & n0 = 2, or c = 2 & n0 = 1.
• 2n^3 = O(n^3): 2n^3 ≤ c·n^3 ⇒ 2 ≤ c, so c = 2 & n0 = 1.

Ω-Notation (Lower Bounds)
Formal definition of Ω-notation: f(n) = Ω(g(n)) if there exist constants c > 0, n0 > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0.
Set definition of Ω-notation: Ω(g(n)) = { f(n) : there exist constants c > 0, n0 > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0 }.

Θ-Notation (Tight Bounds)
Formal definition of Θ-notation: f(n) = Θ(g(n)) if there exist constants c1 > 0, c2 > 0, n0 > 0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
Set definition of Θ-notation: Θ(g(n)) = { f(n) : there exist constants c1 > 0, c2 > 0, n0 > 0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0 }.

Linked Lists
• Linked lists and arrays are similar in that both store collections of data.
– The array's features all follow from its strategy of allocating the memory for all its elements in one block of memory.
– Linked lists use an entirely different strategy: they allocate memory for each element separately, and only when necessary.
• Linked lists are used to store a collection of information (like arrays).
• A linked list is made of nodes that point to each other.
• We only know the address of the first node (the head).
• Other nodes are reached by following the "next" pointers.
• The last node points to NULL.

Stack
• A stack stores arbitrary objects.
• Insertions and deletions follow the last-in, first-out (LIFO) scheme.
– The last item placed on the stack will be the first item removed.
Major stack operations:
• Create an empty stack
• Determine whether a stack is empty or not
• Get the top of the stack
• Add a new item – push
• Remove the item that was added most recently – pop

Queue Data Type
• A queue is a list from which items are deleted at one end (the front) and into which items are inserted at the other end (the rear, or back).
– It is like a line of people waiting to purchase tickets.
• A queue is referred to as a first-in, first-out (FIFO) data structure.
– The first item inserted into a queue is the first item to leave.
• Queues have many applications in computer systems:
– Any application where a group of items is waiting to use a shared resource will use a queue, e.g.
• jobs in a single-processor computer
• print spooling
• information packets in computer networks
– A simulation: a study of how to reduce the wait involved in an application.

vector and list in the STL
• The C++ language includes an implementation of common data structures. This part of the language is popularly known as the Standard Template Library (STL).
• The List ADT is one of the data structures implemented in the STL.
• There are two popular implementations of the List ADT: vector and list.
– The vector provides a growable array implementation of the List ADT.
– The advantage of using the vector is that it is indexable in constant time.
– The disadvantage is that insertion of new items and removal of existing items is expensive, unless the changes are made at the end of the vector.
– The list provides a doubly linked list implementation of the List ADT.
– The advantage of using the list is that insertion of new items and removal of existing items is cheap, provided that the position of the changes is known.
– The disadvantage is that the list is not easily indexable.
– Both vector and list are inefficient for searches.

Amortized Complexity (Amortized Analysis) of vector insertion
• In a vector data structure, adding an element to the back costs Ω(1) in the best case and O(vectorSize) in the worst case (when the vector must grow).
• We want to have a complexity that is more tightly bounded.
• Amortized analysis attempts to do this.
• Suppose the starting capacity is 4 and we double the capacity when full. Then the total cost satisfies
T(n) = n + (n/2 – 1) + T(n/2), with T(4) = 4 + 3.

Amortized Analysis
• Amortized running-time analysis is a way to get a tighter bound on a sequence of operations.
• The amortized cost of a sequence of n operations is the average cost per operation for the worst-case sequence.
• In an amortized analysis, we average the time required to perform a sequence of data-structure operations over all the operations performed.
• With amortized analysis, we can show that the average cost of an operation is small if we average over a sequence of operations, even though a single operation within the sequence might be expensive.
• Amortized analysis differs from average-case analysis in that probability is not involved; an amortized analysis guarantees the average performance of each operation in the worst case.

Priority Queues
• A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key.
• A max-priority queue supports the following operations:
– MAXIMUM(S) returns the element of S with the largest key.
– EXTRACT-MAX(S) removes and returns the element of S with the largest key.
– INCREASE-KEY(S,x,k) increases the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value.
– INSERT(S,k) inserts a new element into the set S with key k.

Heaps
Definition: A heap is a complete binary tree such that
– it is empty, or
– its root contains a search key greater than or equal to the search key in each of its children, and each of its children is also a heap.
• In this definition, since the root contains the item with the largest search key, a heap in this sense is also known as a maxheap.
• On the other hand, a heap that places the smallest search key in its root is known as a minheap.
• We will use "heap" to mean maxheap in the rest of our discussions.

Sorting Algorithms
Comparison-based sorting algorithms:
– Insertion Sort
– Selection Sort
– Bubble Sort
– Merge Sort
– Quick Sort
– Heapsort
• The first three sorting algorithms are not so efficient, but the last three are efficient sorting algorithms.
Non-comparison-based sorting algorithms:
– Counting Sort
– Radix Sort
– Bucket Sort

Binary Tree
A binary tree T is a set of nodes with the following properties:
• The set can be empty.
• Otherwise, the set is partitioned into three disjoint subsets:
– a distinguished node r, called the root, and
– two possibly empty sets that are binary trees, called the left and right subtrees of r.
• T is a binary tree if either
– T has no nodes, or
– T is of the form where r is a node and TL and TR are binary trees.

Height of Binary Tree
• The height of a binary tree T can be defined recursively as:
– If T is empty, its height is 0.
– If T is a non-empty tree, then the height of T is 1 greater than the height of its root's taller subtree; i.e., height(T) = 1 + max{ height(TL), height(TR) }.

Full Binary Tree
• In a full binary tree of height h, all nodes that are at a level less than h have two children each.
• Each node in a full binary tree has left and right subtrees of the same height.
• Among binary trees of height h, a full binary tree has as many leaves as possible, and they are all at level h.
• A full binary tree has no missing nodes.
• Recursive definition of a full binary tree:
– If T is empty, T is a full binary tree of height 0.
– If T is not empty and has height h > 0, T is a full binary tree if its root's subtrees are both full binary trees of height h–1.
A full binary tree of height 3

Complete Binary Tree
• A complete binary tree of height h is a binary tree that is full down to level h–1, with level h filled in from left to right.
• A binary tree T of height h is complete if
1. all nodes at level h–2 and above have two children each, and
2. when a node at level h–1 has children, all nodes to its left at the same level have two children each, and
3. when a node at level h–1 has one child, it is a left child.
– A full binary tree is a complete binary tree.

Binary Tree Traversals
Preorder Traversal
• The node is visited before its left and right subtrees.
Postorder Traversal
• The node is visited after both subtrees.
Inorder Traversal
• The node is visited between the subtrees:
• visit the left subtree, visit the node, then visit the right subtree.
• We can apply preorder and postorder traversals to general trees, but inorder traversal does not make any sense for general trees.

Binary Search Tree
• An important application of binary trees is their use in searching.
• A binary search tree is a binary tree in which every node X contains a data value that satisfies the following:
a) all data values in its left subtree are smaller than the data value in X,
b) the data value in X is smaller than all the values in its right subtree, and
c) the left and right subtrees are also binary search trees.
• The definition above assumes that values in the tree are unique. If we want to store non-unique values, we have to change (a) or (b) in that definition (not both).

Minimum Height
• Complete trees and full trees have minimum height.
• The height of an n-node binary search tree ranges from ⌈log2(n+1)⌉ to n.
• Insertion in search-key order produces a maximum-height binary search tree.
• Insertion in random order produces a near-minimum-height binary search tree.
• That is, the height of an n-node binary search tree is:
– Best case: ⌈log2(n+1)⌉, i.e., O(log2n)
– Worst case: n, i.e., O(n)
– Average case: close to log2(n+1), i.e., O(log2n); in fact, about 1.39·log2n

Order of Operations on BSTs

Balanced Search Trees
• The height of a binary search tree is sensitive to the order of insertions and deletions.
– The height of a binary search tree is between ⌈log2(N+1)⌉ and N.
– So, the worst-case behavior of some BST operations is O(N).
• There are various search trees that can retain their balance despite insertions and deletions:
– AVL Trees
– 2-3-4 Trees (Multiway Trees)
– Splay Trees
– Red-Black Trees
– 2-3 Trees, Scapegoat Trees, …
• In these height-balanced search trees, the worst-case run-time complexity of the insertion, deletion and retrieval operations is O(log2N).

AVL Trees
• An AVL tree is a binary search tree with a balance condition.
• AVL is named for its inventors: Adel'son-Vel'skii and Landis.
• An AVL tree approximates the ideal tree (a completely balanced tree).
• An AVL tree maintains a height close to the minimum.
Definition: An AVL tree is a binary search tree such that for any node in the tree, the heights of the left and right subtrees can differ by at most 1.
A Minimum Tree of height H

Balance Operations
• Balance is restored by tree rotations.
• There are 4 cases that we might have to fix:
1. Single Right Rotation
2. Single Left Rotation
3. Double Right-Left Rotation
4. Double Left-Right Rotation

AVL Tree Insertion
• Insert 3, 6, 9, 2, 1, 4 into an empty AVL tree in the given order. Show the tree after each insertion.
AVL Tree Deletion
• Delete 1 from the following AVL tree.
AVL Tree Deletion Example

Splay Trees
• A splay tree guarantees that any M consecutive tree operations starting from an empty tree take at most O(M·logN) time (amortized cost).
– This guarantee does not preclude the possibility that a single operation might take O(N) time.
• A splay tree has an O(logN) amortized cost per operation.
• SPLAYING: The basic idea of the splay tree is that after a node is accessed, it is pushed to the root by a series of AVL-tree rotations.
– Let X be a (non-root) node on the access path at which we are rotating.
– If the parent of X is the root of the tree, we merely rotate X and the root. Otherwise, X has both a parent (P) and a grandparent (G), and we need double rotations.

Splay Tree Insertion Example
• Insert 1, 2, 3, 6, 5, 4 into an empty tree.
Splay Tree Search
• Search for 1 in the following splay tree.
Splay Tree Deletion
• Delete 6 from the following splay tree.

Skip Lists
• Simple sorted linked list
• Linked list with links to two cells ahead
• Linked list with links to four cells ahead
• Linked list with links to 2^i cells ahead
Skip List Insertion

B-Trees
• A B-tree is a multiway search tree that is designed for search trees stored on hard disks.
• A B-tree of order m is a multiway search tree with the following properties:
1. The root has at least two subtrees unless it is a leaf.
2. Each non-root, non-leaf node holds k–1 keys and k pointers to subtrees, where ⌈m/2⌉ ≤ k ≤ m.
3. Each leaf node holds k–1 keys, where ⌈m/2⌉ ≤ k ≤ m.
4. All leaves are on the same level.
A B-Tree of order 5
B-Tree of Order 5: Example
• Insert 6, 1, 5, 4, 7, 2, 8, 3, 9, 10 into an empty tree.

Hashing
• Using balanced trees (2-3-4, red-black, and AVL trees) we can implement table operations (retrieval, insertion and deletion) efficiently.
– These operations run in O(logN) time.
• Can we find a data structure with which we can perform these table operations better than balanced search trees, in O(1) time? YES: HASH TABLES.
• In hash tables, we have an array (indexed 0..n–1) and an address calculator (hash function) which maps a search key into an array index between 0 and n–1.
• A hash function tells us where to place an item in an array called a hash table. This method is known as hashing.
• Collisions occur when the hash function maps more than one item into the same array location.

Hashing: Collision Resolution
Open Addressing
– Each entry holds one item.
• During an attempt to insert a new item into the table, if the hash function indicates a location in the hash table that is already occupied, we probe for some other empty (or open) location in which to place the item.
• The sequence of locations that we examine is called the probe sequence.
• There are different open-addressing schemes:
– Linear Probing
– Quadratic Probing
– Double Hashing
Chaining
– Each entry can hold more than one item.
– Buckets: each entry holds a certain number of items.

Graph – Definitions
• A graph G = (V, E) consists of
– a set of vertices, V, and
– a set of edges, E, where each edge is a pair (v,w) such that v,w ∈ V.
• Vertices are sometimes called nodes; edges are sometimes called arcs.
• If the edge pairs are ordered, then the graph is called a directed graph (also called a digraph).
• We call a graph that is not directed an undirected graph.
• If we label the edges of a graph with numeric values, the graph is called a weighted graph.

Graph Implementations
• Adjacency Matrix: a two-dimensional array.
• Adjacency List: for each vertex, we keep a list of adjacent vertices.

Graph Traversals
• Breadth-First Traversal
• Depth-First Traversal
• BFS tree from source node a. DFS tree from source node a.
Single-Source Shortest Paths Problem: Dijkstra's Algorithm
• Single-source shortest-paths problem: given a graph G=(V,E), we want to find a shortest path from a given source vertex s to each vertex v in V.
• Dijkstra's algorithm solves the single-source shortest-paths problem on a weighted directed graph G=(V,E) for the case in which all edge weights are nonnegative.

Dynamic Programming
• Divide-and-conquer algorithms partition the problem into disjoint subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem.
• Dynamic programming applies when the subproblems overlap, that is, when subproblems share subsubproblems.
• Dynamic programming is typically applied to optimization problems.
– There are many possible (feasible) solutions.
– Each solution has a value (a cost).
– We want to find an optimal solution to the problem.
• When developing a dynamic-programming algorithm, we follow a sequence of four steps:
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution in a bottom-up fashion.
4. Construct an optimal solution from the computed information in step 3.

Dynamic Programming: Matrix-chain Multiplication Problem
Input: a sequence (chain) of n matrices <A1,A2,…,An>
Output: an optimal fully parenthesized version (a fully parenthesized chain with the minimum cost) of the matrix-chain multiplication A1·A2·…·An
• A product of matrices is fully parenthesized if
– it is either a single matrix, or
– the product of fully parenthesized matrix products, surrounded by a pair of parentheses.
• There are five distinct ways of fully parenthesizing the matrix-chain multiplication A1·A2·A3·A4.

Cost of Matrix-chain Multiplication
• Let a chain <A1, A2, A3> have A1: 10×100, A2: 100×5, A3: 5×50.
• There are two distinct ways of fully parenthesizing the matrix-chain multiplication A1·A2·A3:
– (A1·A2)·A3 costs 10·100·5 + 10·5·50 = 5000 + 2500 = 7500 scalar multiplications.
– A1·(A2·A3) costs 100·5·50 + 10·100·50 = 25000 + 50000 = 75000 scalar multiplications.
• The first parenthesization yields a 10 times faster computation.

Example: A1: 20×10, A2: 10×5, A3: 5×30, A4: 30×40, A5: 40×10, so p = <20, 10, 5, 30, 40, 10>.
• Compute minimum costs for chains of length 1: m[1,1], m[2,2], …, m[5,5].
– m[i,i] = 0 for i = 1, 2, …, n.
• Compute minimum costs for chains of length 2, using m[i][j] = min over i ≤ k < j of m[i][k] + m[k+1][j] + p[i-1]·p[k]·p[j]:
– 1..2, k=1: 0 + 0 + 20·10·5 = 1000, so m[1,2] = 1000, s[1,2] = 1
– 2..3, k=2: 0 + 0 + 10·5·30 = 1500, so m[2,3] = 1500, s[2,3] = 2
– 3..4, k=3: 0 + 0 + 5·30·40 = 6000, so m[3,4] = 6000, s[3,4] = 3
– 4..5, k=4: 0 + 0 + 30·40·10 = 12000, so m[4,5] = 12000, s[4,5] = 4
• Compute minimum costs for chains of length 3: m[1,3], m[2,4], m[3,5].
– 1..3, k=1: 0 + 1500 + 20·10·30 = 7500
– 1..3, k=2: 1000 + 0 + 20·5·30 = 4000, so m[1,3] = 4000, s[1,3] = 2
– 2..4, k=2: 0 + 6000 + 10·5·40 = 8000, so m[2,4] = 8000, s[2,4] = 2
– 2..4, k=3: 1500 + 0 + 10·30·40 = 13500
– 3..5, k=3: 0 + 12000 + 5·30·10 = 13500
– 3..5, k=4: 6000 + 0 + 5·40·10 = 8000, so m[3,5] = 8000, s[3,5] = 4
• Compute minimum costs for chains of length 4: m[1,4], m[2,5].
– 1..4, k=1: 0 + 8000 + 20·10·40 = 16000
– 1..4, k=2: 1000 + 6000 + 20·5·40 = 11000, so m[1,4] = 11000, s[1,4] = 2
– 1..4, k=3: 4000 + 0 + 20·30·40 = 28000
– 2..5, k=2: 0 + 8000 + 10·5·10 = 8500, so m[2,5] = 8500, s[2,5] = 2
– 2..5, k=3: 1500 + 12000 + 10·30·10 = 16500
– 2..5, k=4: 8000 + 0 + 10·40·10 = 12000
• Compute the minimum cost for the chain of length 5: m[1,5].
– 1..5, k=1: 0 + 8500 + 20·10·10 = 10500
– 1..5, k=2: 1000 + 8000 + 20·5·10 = 10000, so m[1,5] = 10000, s[1,5] = 2
– 1..5, k=3: 4000 + 12000 + 20·30·10 = 22000
– 1..5, k=4: 11000 + 0 + 20·40·10 = 19000
• The final tables:
Array m:
      j=1   j=2   j=3    j=4    j=5
i=1     0  1000  4000  11000  10000
i=2           0  1500   8000   8500
i=3                  0   6000   8000
i=4                         0  12000
i=5                                0
Array s:
      j=2  j=3  j=4  j=5
i=1     1    2    2    2
i=2          2    2    2
i=3               3    4
i=4                    4

Elements of Dynamic Programming
• When should we look for a DP solution to an optimization problem?
• Two key ingredients for the problem:
– Optimal substructure
– Overlapping subproblems

Optimal Substructure
• A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems.
• In dynamic programming, we build an optimal solution to the problem from optimal solutions to subproblems.

Overlapping Subproblems
• When a recursive algorithm revisits the same problem repeatedly, we say that the optimization problem has overlapping subproblems.
• Dynamic-programming algorithms typically take advantage of overlapping subproblems by solving each subproblem once and then storing the solution in a table where it can be looked up when needed, using constant time per lookup.
Memoized Recursive Algorithm
• Offers the efficiency of the usual DP approach while maintaining a top-down strategy.
• Maintains an entry in a table for the solution to each subproblem.
• Each table entry initially contains a special value to indicate that the entry has yet to be filled in.
• When the subproblem is first encountered, its solution is computed and then stored in the table.
• Each subsequent time that the subproblem is encountered, the value stored in the table is simply looked up and returned.
• The approach assumes that
– the set of all possible subproblem parameters is known, and
– the relation between table positions and subproblems is established.

Data Compression
Huffman Coding: A Greedy Algorithm
• Huffman codes compress data very effectively.
• Huffman's greedy algorithm uses a table giving how often each character occurs (i.e., its frequency) to build up an optimal way of representing each character as a binary string.
• We consider only codes in which no codeword is also a prefix of some other codeword. Such codes are called prefix codes.
• Encoding is always simple for any binary character code; we just concatenate the codewords representing each character of the file.
• Prefix codes are desirable because they simplify decoding.
– Since no codeword is a prefix of any other, the codeword that begins an encoded file is unambiguous.
– We can simply identify the initial codeword, translate it back to the original character, and repeat the decoding process on the remainder of the encoded file.
Constructing a Huffman code

Greedy Choice Property
• Greedy choice property: a sequence of locally optimal (greedy) choices yields a globally optimal solution.
• Some problems have the greedy choice property; we can use a greedy algorithm for those problems.
• If a problem does not have the greedy choice property, we may not be able to use a greedy algorithm for that problem.
• How can you judge whether a greedy algorithm will solve a particular optimization problem?
• Two key ingredients:
– Greedy choice property
– Optimal substructure property

Memory Management: The Sequential-Fit Methods
• A simple organization of memory could require a linked list of all memory blocks, which is updated after a block is either requested or returned.
• In the sequential-fit methods, all available memory blocks are linked, and the list is searched to find a block whose size is larger than or the same as the requested size.
– The first-fit algorithm allocates the first block of memory large enough to meet the request.
– The best-fit algorithm allocates the block that is closest in size to the request.
– The worst-fit method finds the largest block on the list so that, after returning its portion equal to the requested size, the remaining part is large enough to be used in later requests.
– The next-fit method allocates the next available block that is sufficiently large.

Memory Management: The Nonsequential-Fit Methods
• The sequential-fit methods may become inefficient for large memory. In the case of large memory, nonsequential-fit methods may be desirable.

Buddy Systems
• Nonsequential memory management methods known as buddy systems do not just assign memory in sequential slices; they divide it into two buddies that are merged whenever possible.
• In a buddy system, two buddies are never both free: a block can have either a buddy used by the program or no buddy at all.
• The classic buddy system is the binary buddy system.
– The binary buddy system assumes that storage consists of 2^m locations for some integer m, with addresses 0, …, 2^m–1, and that these locations can be organized into blocks whose lengths can only be powers of 2.
– There is also an array avail[] such that, for each i = 0, …, m, avail[i] is the head of a doubly linked list of blocks of the same size, 2^i.