9/16/23
CPSC 535: Advanced Algorithm, Fall 2023
Review of Fundamental Data Structures
Instructor: Dr. Sampson Akwafuo

Elementary Data Structures

Data Structures
• A scheme for organizing related pieces of information
• Act like containers
  – They hold other data
• Organized to ensure efficient processing
  – Include lists, arrays, stacks, queues
  – Present a specialized format for organizing and storing data

Abstract Data Types (ADT)
• An abstract data type (ADT) is a logical description of data.
  – It includes the views and the allowed operations (without implementation details)
• An ADT is defined by its behavior (its operations).
  – The implementation may vary.
  – E.g., a priority queue may be implemented with an array or a heap
• Some common ADTs: List, Set, Graph, Stack, Queue, Priority queue.
• A set of objects that are related to each other, together with a set of operations:
  – Description of the data type object
  – Description of the relationships between the individual objects
  – Description of the operations to be performed on the objects
• An abstract data type defines a data representation for objects of the type and the set of operations that can be performed on these objects.

ADT versus Its Implementation
• An ADT is language-independent and must be implemented in an appropriate programming language
• An ADT may be implemented in several ways using the same programming language
• E.g., a stack ADT is a structure that supports operations such as push and pop.
• A stack can be implemented in a number of ways, e.g., using an array or using a linked list.

Array ADT and Matrices
• An array is a sequence of fixed-size data records, indexed by a system of integer coordinates
• An index is used to access elements and to permit alteration of individual elements.
• The elements can be accessed directly in O(1) time
• For two integers m and n, m:n denotes all integers in the range m..n

Array ADT & Matrices
• A matrix is a two-dimensional array, but the idea extends to an arbitrary number of dimensions
• Arrays and matrices are homogeneous data structures.
  – A matrix is a single vector arranged into the specified dimensions.
• In a k-dimensional array A, the index i_j for dimension j ranges over m_j:n_j:
  – A[i_1, i_2, …, i_k] is the element of array A with index (i_1, i_2, …, i_k)

Array Operations
[Figure: array operations]

Vector
• An array whose length is not fixed.
  – Also known as an array list, dynamic array, or resizable array
• Supports the same operations as an array: retrieving by index, assigning by index, and iteration.
• Additionally supports adding and removing elements dynamically.
• A vector is an extensible array.
  – It grows automatically

Operations
[Figure: operations]

Dynamic Sets
• Sets that can change over time are called dynamic
• A dictionary is a dynamic set that supports inserting elements into, deleting elements from, and testing membership in a set.
• Other operations can also be supported
• Each element in a set is represented by an object whose fields can be examined and manipulated
• Set elements appear as key:value pairs

Operations on Dynamic Sets
• The operations can be grouped into two types:
  – Queries: return information about the set, an element in the set, or a group of elements in the set
  – Modifying operations: change the set
• Typical operations for a set S:
  – Search(S, k): given S and a key value k, return a pointer x to an element in S whose key is k (key[x] = k), or NIL if S does not contain such an element (query)
  – Insert(S, x): augment S with the element pointed to by x, assuming that all the fields in x have been initialized (modifying op.)
  – Delete(S, x): given a pointer x to an element of S, remove x from S. Note: x is a pointer and not a key value. (modifying op.)
  – Minimum(S): given a totally ordered set S, return the element of S with the smallest key value (query)

Operations on Dynamic Sets
  – Maximum(S): given a totally ordered set S, return the element of S with the largest key value (query)
  – Successor(S, x): given an element x whose key is from a totally ordered set S, return the next larger element in S, or NIL if x is the maximum element (query)
  – Predecessor(S, x): given an element x whose key is from a totally ordered set S, return the next smaller element in S, or NIL if x is the minimum element (query)

Linked List
• A linked list is a dynamic data structure where each element (node) is made up of two items:
  – the data and a reference (or pointer), which points to the next node.
• A collection of nodes where each node is connected to the next node through a pointer
• Variations:
  – Doubly linked list
  – Singly linked list

Linked Lists
• A singly linked list is a list in which each element points to its successor;
• a list item stores an element and a pointer to its successor
[Figure: FIRST → Item 1 → Item 2 → Item 3 → Item 4]
• In a doubly linked list, each element points to its successor and to its predecessor;
• a list item stores an element and two pointers, one to its successor and one to its predecessor

Array Vs Linked List
• Linked lists are more complex to code and manage than arrays.
  – Linked lists offer some distinct advantages, though.
• Dynamic: a linked list can easily grow and shrink in size.
  – No need to know how many nodes will be in the list. They are created in memory as needed.
  – In contrast, the size of a C++ array is fixed at compilation time.
• Easy and fast insertions and deletions
  – To insert or delete an element in an array, we need to copy (shift) elements to make room for the new element or to close the gap left by the deleted element.
  – With a linked list, there is no need to move other nodes. Only some pointers need to be reset.
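The pointer-reset insertion and deletion described in the Array Vs Linked List comparison can be sketched in Python as follows. This is a minimal illustration, not code from the course; the names Node, SinglyLinkedList, push_front, insert_after, delete_after and to_list are hypothetical.

```python
class Node:
    """One list item: the data plus a pointer to the successor."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class SinglyLinkedList:
    """Hypothetical singly linked list; `first` plays the role of FIRST."""

    def __init__(self):
        self.first = None

    def push_front(self, data):
        # New node points at the old first node; no other node moves.
        self.first = Node(data, self.first)
        return self.first

    def insert_after(self, node, data):
        # O(1): only two pointers are set, nothing is shifted.
        node.next = Node(data, node.next)
        return node.next

    def delete_after(self, node):
        # O(1): bypass the successor by resetting one pointer.
        if node.next is not None:
            node.next = node.next.next

    def to_list(self):
        # Walk the chain from FIRST to collect the data in order.
        out, cur = [], self.first
        while cur is not None:
            out.append(cur.data)
            cur = cur.next
        return out


lst = SinglyLinkedList()
item1 = lst.push_front("Item 1")
lst.insert_after(item1, "Item 3")
lst.insert_after(item1, "Item 2")   # lands between Item 1 and Item 3
print(lst.to_list())
```

Inserting into the middle here touches only the neighbouring pointers, whereas the same operation on an array would shift every later element.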
Linked List Operations
[Figure: linked list operations]

Stack
• Stores a set of elements in Last-in First-out (LIFO) order
• The last element inserted is the first one to be removed

Time Complexities of Stack Operations
[Figure: time complexities of stack operations]

Queue
• A queue maintains elements in First-in First-out (FIFO) order.
• Elements are added to the "back of the line"
  – and elements are removed from the front.
• Mimics how people wait fairly for service, in a line (AmE) or queue (BrE)
• Adding an element is called enqueuing and removing the front element is called dequeuing.

Queue Operations
[Figure: queue operations]

Priority Queue
• A priority queue is an ADT that maintains a set S of elements, each with an associated value called a key
• In a priority queue, entries are ordered according to a numeric priority value that is stored as the key of each entry.
• By convention, the least (lowest) priority corresponds to the back of a priority queue, while the greatest (highest) priority corresponds to the front.

Queues Vs Priority Queues
[Figure: queues vs. priority queues. Source: www.berkeley.edu]

Operations in a Priority Queue
• A max-priority queue supports the following operations:
  – Init(S): initialize the priority queue to be empty.
  – Is_empty(S): test whether the priority queue is empty.
  – Insert(S, x): inserts the element x into the set S. It can be written S = S ∪ {x}
  – Maximum(S): returns the element of S with the largest key
  – Extract-max(S): removes and returns the element of S with the largest key
  – Increase-key(S, x, k): increases the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value

Example: unfulfilled obligations
• The Bill Payer Problem: You get bills in the mail from time to time. But you don't pay bills the moment you receive them: instead, you place them on your desk, where they are unfulfilled obligations.
• When you decide to pay a bill, you take one from your desk and pay it.
• The stack strategy: When you get a bill, you place it on the top of your pile of bills. When you decide to pay a bill, you pay the one on top, i.e., the one you got most recently.
• The queue strategy: The bill you pay is the one that's been on your desk the longest. One way to do this is to insert every new bill at the bottom (rear) of the pile, and take from the top (front) of the pile.
• The heap strategy: Every bill has a due date. When you decide to pay, you pay the bill that has the earliest due date.

Priority Queue Operations
• Stacks, queues, and heaps (min heap and max heap) are examples of a priority queue:
  – In a stack, the priority item is the most recently inserted item (called the top).
  – In a queue, the priority item is the least recently inserted item (insert at rear; delete from front)
  – In a heap, the priority item is the item with the minimum (or maximum) key.
Source: https://www.tutorialspoint.com/data_structures_algorithms/priority_queue.htm

Priority Queue by Heap
• A priority queue can be implemented using a binary heap.
• A binary heap allows us to enqueue or dequeue items in O(log n) time.
• Two common variations:
  – min heap, in which the smallest key is always at the front
  – max heap, in which the largest key is always at the front.
• The binary tree (heap) is always balanced
• A balanced binary tree has roughly the same number of nodes in the left and right subtrees of the root.
  – Insertion changes this
• A rebalancing is therefore performed after any 'insert' operation
• With 1-based indexing, a heap with N elements has its leaves at indices ⌊N/2⌋+1, ⌊N/2⌋+2, ⌊N/2⌋+3, …, up to N.
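As a rough sketch of the heap strategy for the Bill Payer Problem and of the leaf-index rule, the following uses Python's standard heapq module, which maintains a min heap on a plain list. The function names pay_order and leaf_indices and the sample bills are assumptions made for illustration only.

```python
import heapq


def pay_order(bills):
    """Return bill names in the order the heap strategy would pay them.

    `bills` is a list of (due_date, name) pairs; the due date is the key,
    so the bill with the earliest due date is always at the front.
    """
    heap = []
    for due_date, name in bills:
        heapq.heappush(heap, (due_date, name))   # enqueue: O(log n)
    order = []
    while heap:
        due_date, name = heapq.heappop(heap)     # dequeue the minimum: O(log n)
        order.append(name)
    return order


def leaf_indices(n):
    """1-based indices of the leaves of a binary heap with n elements."""
    return list(range(n // 2 + 1, n + 1))


# Hypothetical bills: due dates are just day numbers.
print(pay_order([(20, "rent"), (5, "phone"), (12, "power")]))  # ['phone', 'power', 'rent']
print(leaf_indices(7))                                         # [4, 5, 6, 7]
```

A max heap can be emulated with the same module by pushing negated keys; that choice is a common workaround, not something the slides prescribe.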
Priority Queue by Heap Example
• Root has the highest priority (max-heap)
  – left_child = Arr[2*i], right_child = Arr[2*i+1]
Convert the following array into a heap (index 0 is left unused):
Arr = [ , 6, 4, 5, 3, 2, 0, 1]
Source: https://www.hackerearth.com/

Performing Max-Heapify
maxHeapify(A, i, n)
    l = left(i), r = right(i)
    largest = i
    if (l ≤ n && A[l] > A[largest]) then largest = l
    if (r ≤ n && A[r] > A[largest]) then largest = r
    if (largest != i) then
        exchange A[i] with A[largest]
        maxHeapify(A, largest, n)
Time complexity = O(log n)

Deleting an Element from a Binary Heap
• The standard deletion operation on a heap deletes the element at the root node.
• Deleting an element at an intermediate position can be costly.
  – Hence, the root is replaced by the last element and the last element is deleted.
• Steps
  – Replace the root (or the element to be deleted) with the last element.
  – Delete the last element from the heap.
  – The last element (now at the position of the root node) may violate the heap property.
  – Therefore, a heapify is needed.
An example will be provided during the discussion on heaps.

Inserting an Element into a Binary Heap
Steps
• Provide a placeholder for the new element by increasing the heap size by 1
• Insert the new element at the end of the heap.
• The newly inserted element may violate the heap property.
  – Hence, a heapify process is needed.

Min-heap operation                                              Worst case
Create an empty heap                                            O(1)
Convert an unsorted list into a heap in-place                   O(n)
Push an element into the heap                                   O(log n)
Find the minimum element in the heap                            O(1)
Pop the minimum element out of the heap & restructure heap      O(log n)
Return the top of the heap (aka the minimum element)            O(1)
Pop, return the top of the heap & restructure heap              O(log n)
Test whether the heap is empty                                  O(1)

Graphs
• A graph is a mathematical object consisting of a set of vertices and a set of edges
  – Each edge represents a connection between two vertices.
  – The two vertices connected by an edge are called the ends of that edge.
  G = (V, E) where E = {s−t | s, t ∈ V}
• A graph may be directed or undirected
• The degree of a vertex v is the number of edges incident on v
• Graphs can be represented using an adjacency matrix or adjacency lists

Undirected Graph
• Undirected graph
  – An undirected graph is a pair G = (V, E) where V is a set whose elements are called vertices, and E is a set of unordered pairs of distinct elements of V.
• Vertices are often also called nodes.
• Elements of E are called edges, or undirected edges.
• Each edge may be considered as a subset of V containing two elements
• {v, w} denotes an undirected edge
• In diagrams, this edge is the line v---w.
• In text, we simply write vw, or wv
• vw is said to be incident upon the vertices v and w

Directed Graph
• Directed graph
  – A directed graph (or digraph) is a pair G = (V, E) where V is a set whose elements are called vertices, and E is a set of ordered pairs of elements of V.
• Vertices are often also called nodes.
• Elements of E are called directed edges, or arcs.
• For a directed edge (v, w) in E, v is its tail and w its head;
• (v, w) is represented in diagrams as the arrow v -> w.
• In text, we simply write vw.

Weighted Graph
• A weighted graph is a triple (V, E, W) where (V, E) is a graph (directed or undirected) and W assigns a numerical value (weight) to each edge

Complete Graph
• A complete graph is a graph in which each pair of vertices is connected by an edge
• With N vertices, there is an edge between every two vertices.
• The total number of edges in a complete graph is N(N−1)/2

Adjacency Matrix of an Unweighted Graph
• A 2D representation of a graph
• adj[u][v] = 1 indicates that there is an edge from vertex u to vertex v.
• The adjacency matrix of an undirected graph is always symmetric.
• For a weighted graph, if adj[u][v] = w, then there is an edge from vertex u to vertex v with weight w.
• Adding a vertex takes O(V^2) time. Removing an edge takes O(1) time.

(a) An undirected graph
[Figure: undirected graph on vertices 1 to 7]

(b) Its adjacency matrix (rows and columns are vertices 1 to 7)
0 1 1 0 0 0 0
1 0 1 1 0 0 0
1 1 0 1 0 1 0
0 1 1 0 0 1 0
0 0 0 0 0 1 0
0 0 1 1 1 0 1
0 0 0 0 0 1 0

Adjacency Matrix of a Weighted Directed Graph
(a) A weighted directed graph
[Figure: weighted directed graph on vertices 1 to 7]

(b) Its weight matrix (rows and columns are vertices 1 to 7)
0  25 ∞  ∞  ∞  ∞  ∞
∞  0  10 14 ∞  ∞  ∞
5  ∞  0  ∞  ∞  16 ∞
∞  6  18 0  ∞  ∞  ∞
∞  ∞  ∞  ∞  0  ∞  ∞
∞  ∞  ∞  32 42 0  14
∞  ∞  ∞  ∞  ∞  11 0

Adjacency List
• Uses a linked list of vertex objects
• Each element in the list has two values.
  – The first one is the destination node
  – The second one is the weight of the edge between the two nodes
• Adding a vertex/edge takes O(1); removing a vertex takes O(deg(v))

(a) A weighted directed graph
[Figure: weighted directed graph on vertices 1 to 7]

(b) Its adjacency lists, as (destination, weight) pairs
1: (2, 25)
2: (3, 10), (4, 14)
3: (1, 5), (6, 16)
4: (2, 6), (3, 18)
5: (empty)
6: (4, 32), (5, 42), (7, 14)
7: (6, 11)

Sparse, Dense and Planar Graphs
• Sparse graph: a graph in which the number of edges is close to the minimal number of edges (0)
• Dense graph: a graph in which the number of edges is close to the maximal number of edges, n(n−1)/2, where n is the number of nodes
• Planar graph: a graph that can be embedded in the plane such that no two edges cross each other

Example:
[Figure: four example graphs]
• Sparse graph (planar): |E| = 1 < 6 (6 = maximal number of edges)
• Sparse graph (planar): |E| = 5 < 15 (15 = maximal number of edges)
• Dense graph (planar): |E| = 9 (close to 10 = maximal number of edges)
• Dense graph (not planar): |E| = 10 (equal to 10 = maximal number of edges)

Is this Sparse? Dense? Planar?
[Figure: example graph]
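The two graph representations and the sparse/dense distinction from the graph slides can be tried out with a short Python sketch. It is only an illustration under stated assumptions: the edge list EDGES is read off the 7-vertex undirected adjacency matrix in the slides, and the helper names adjacency_matrix, adjacency_lists and density are hypothetical.

```python
# Undirected edges read off the 7-vertex adjacency matrix (vertices 1..7).
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 6), (4, 6), (5, 6), (6, 7)]


def adjacency_matrix(n, edges):
    """Build an n x n 0/1 matrix; adj[u-1][v-1] = 1 means edge u-v exists."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u - 1][v - 1] = 1
        adj[v - 1][u - 1] = 1   # undirected, so the matrix is symmetric
    return adj


def adjacency_lists(n, edges):
    """Map each vertex to the list of its neighbours."""
    nbrs = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    return nbrs


def density(n, edges):
    """Fraction of the maximal n(n-1)/2 undirected edges that are present."""
    return len(edges) / (n * (n - 1) / 2)


adj = adjacency_matrix(7, EDGES)
nbrs = adjacency_lists(7, EDGES)
print(adj[5])                 # row for vertex 6
print(len(nbrs[6]))           # degree of vertex 6
print(density(7, EDGES))      # 9 of 21 possible edges
```

The matrix always needs space for all n × n entries, while the lists store only the edges that exist, which is why adjacency lists tend to suit sparse graphs better.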