Uploaded by Anthony Hernandez

5 Fundamental Data Structures

9/16/23
CPSC 535: Advanced Algorithm
Fall 2023
Review of Fundamental Data Structures
Instructor: Dr. Sampson Akwafuo
Elementary Data Structures
Data Structures
• A scheme for organizing related pieces of information
• Act like containers
– They hold other data
• Organized to ensure efficient processing
– Include lists, arrays, stacks, and queues
– Present a specialized format for organizing and storing data
Abstract Data Types (ADT)
• An abstract data type (ADT) is a logical description of data.
– It includes the views and the allowed operations (without implementation details)
• An ADT is defined by its behavior (its operations).
– The implementation may vary.
– E.g., a priority queue may be implemented with an array or a heap
• Some common ADTs: List, Set, Graph, Stack, Queue, Priority queue.
• A set of objects that are related to each other, together with a set of operations:
– Description of the data type object
– Description of the relationships between the individual objects
– Description of the operations to be performed on the objects
• An abstract data type defines a logical representation for objects of the type and the set of operations that can be performed on these objects.
ADT versus Its Implementation
• An ADT is language-independent and must be implemented in an appropriate computer language
• An ADT may be implemented in several ways using the same programming language
• E.g., a stack ADT is a structure that supports operations such as push and pop.
• A stack can be implemented in a number of ways, e.g., using an array or using a linked list.
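As a minimal sketch of the point above, the same stack ADT (push, pop) can sit on top of an array or a linked list. The class names ArrayStack and LinkedStack, the is_empty method, and the drain helper are illustrative assumptions, not part of the slides.

```python
class ArrayStack:
    """Stack backed by a Python list (a dynamic array)."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)          # add at the end of the array

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()           # remove from the end

    def is_empty(self):
        return not self._items


class _Node:
    """Linked-list node: a value plus a pointer to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedStack:
    """Stack backed by a singly linked list; the head is the top."""

    def __init__(self):
        self._head = None

    def push(self, value):
        self._head = _Node(value, self._head)   # new node becomes the head

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        node = self._head
        self._head = node.next
        return node.value

    def is_empty(self):
        return self._head is None


def drain(stack, values):
    """Push all values, then pop until empty; works with either class."""
    for v in values:
        stack.push(v)
    out = []
    while not stack.is_empty():
        out.append(stack.pop())
    return out
```

Code that only calls push, pop, and is_empty cannot tell the two implementations apart, which is the sense in which the ADT is defined by its behavior.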
Array ADT And Matrices
• An array is a sequence of fixed-size data records, indexed by a system of integer coordinates
• An index is used to access elements and to permit alteration of individual elements.
• The elements can be accessed directly in O(1) time
• For two integers m and n, m:n denotes all integers in the range m..n
Array ADT & Matrices
• A matrix is a two-dimensional array, but the idea can be extended to an arbitrary number of dimensions
• Array and matrix are homogeneous data structures.
– A matrix is a single vector arranged into the specified dimensions.
• In a k-dimensional array A, the index i_j of dimension j ranges over m_j:n_j:
– A[i_1, i_2, …, i_k] is the element of array A with index (i_1, i_2, …, i_k)
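One way to read "a single vector arranged into the specified dimensions" is sketched below. It assumes a row-major layout and 0-based indices, neither of which the slides specify; flat_index and get are hypothetical helper names.

```python
def flat_index(i, j, n_cols):
    """Row-major position of element (i, j) in a matrix with n_cols columns."""
    return i * n_cols + j


ROWS, COLS = 2, 3
FLAT = [10, 11, 12,      # row 0
        20, 21, 22]      # row 1, stored right after row 0 in the same vector


def get(i, j):
    """Read element (i, j) with plain index arithmetic, so access stays O(1)."""
    return FLAT[flat_index(i, j, COLS)]
```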
Array Operations
Vector
• An array whose length is not fixed.
– Also known as an array list, dynamic array, or resizable array
• Supports the same operations as an array: retrieving by index, assigning by index, and iteration.
• Additionally supports adding and removing elements dynamically.
• A vector is an extensible array.
– It grows automatically
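A minimal sketch of automatic growth follows. The doubling policy, the starting capacity of 1, and the class name Vector are assumptions chosen for illustration; real libraries use their own growth factors.

```python
class Vector:
    """Extensible array: a fixed-size backing array that is replaced when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._data = [None] * self._capacity

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._data[index]

    def __setitem__(self, index, value):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        self._data[index] = value

    def append(self, value):
        if self._size == self._capacity:
            self._grow()                       # no free slot left: grow first
        self._data[self._size] = value
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty vector")
        self._size -= 1
        value = self._data[self._size]
        self._data[self._size] = None          # clear the vacated slot
        return value

    def _grow(self):
        self._capacity *= 2                    # assumed growth policy: doubling
        new_data = [None] * self._capacity
        new_data[:self._size] = self._data[:self._size]   # copy old elements
        self._data = new_data
```

With doubling, an occasional append pays for a full copy, but the cost averaged over many appends stays constant.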
Operations
Dynamic Sets
• Sets that can change over time are called dynamic
• A dictionary is a dynamic set that supports the following operations:
– insert elements into a set, delete elements from a set, and test membership in a set.
• Other operations can also be supported
• Each element in a set is represented by an object whose fields can be examined and manipulated
• Set elements appear as key:value pairs
Operations on Dynamic Sets
• The operations can be grouped into two types:
– Queries: return information about the set, an element in the set, or a group of elements in the set
– Modifying operations: change the set
• Typical operations for a set S:
– Search(S, k): given S and a key value k, return a pointer x to an element in S whose key is k (key[x] = k), or NIL if S does not contain such an element (query)
– Insert(S, x): augment S with the element pointed to by x, assuming that all the fields in x have been initialized (modifying op.)
– Delete(S, x): given a pointer x to an element of S, remove x from S. Note: x is a pointer and not a key value. (modifying op.)
– Minimum(S): given a totally ordered set S, return the element of S with the smallest key value (query)
– Maximum(S): given a totally ordered set S, return the element of S with the largest key value (query)
– Successor(S, x): given an element x whose key is from a totally ordered set S, return the next larger element in S, or NIL if x is the maximum element (query)
– Predecessor(S, x): given an element x whose key is from a totally ordered set S, return the next smaller element in S, or NIL if x is the minimum element (query)
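The operations listed above can be sketched on top of a sorted Python list. This is a simplification under stated assumptions: elements are bare keys rather than objects reached through pointers, None plays the role of NIL, and the class name SortedSet is hypothetical.

```python
import bisect


class SortedSet:
    """Dynamic set of distinct, totally ordered keys kept in a sorted list."""

    def __init__(self):
        self._keys = []

    def search(self, k):
        i = bisect.bisect_left(self._keys, k)
        found = i < len(self._keys) and self._keys[i] == k
        return k if found else None                 # None stands in for NIL

    def insert(self, k):
        i = bisect.bisect_left(self._keys, k)
        if i == len(self._keys) or self._keys[i] != k:
            self._keys.insert(i, k)                 # keep the list sorted

    def delete(self, k):
        i = bisect.bisect_left(self._keys, k)
        if i < len(self._keys) and self._keys[i] == k:
            del self._keys[i]

    def minimum(self):
        return self._keys[0] if self._keys else None

    def maximum(self):
        return self._keys[-1] if self._keys else None

    def successor(self, k):
        i = bisect.bisect_right(self._keys, k)      # first key greater than k
        return self._keys[i] if i < len(self._keys) else None

    def predecessor(self, k):
        i = bisect.bisect_left(self._keys, k)       # keys before i are smaller than k
        return self._keys[i - 1] if i > 0 else None
```

Queries here use binary search, while insert and delete still shift list elements, so this sketch favors lookups over updates.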
Linked List
• A linked list is a dynamic data structure in which each element (node) is made up of two items:
– the data and a reference (or pointer), which points to the next node.
• A collection of nodes where each node is connected to the next node through a pointer
• Variations:
– Singly linked list
– Doubly linked list
Linked Lists
• A singly linked list is a list in which each element points to its successor
• A list item stores an element and a pointer to its successor
[Figure: FIRST → Item 1 → Item 2 → Item 3 → Item 4]
• In a doubly linked list, each element points to its successor and to its predecessor
• A list item stores an element and two pointers, one to its successor and one to its predecessor
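The two kinds of list item described above might look as follows. The class and function names (Node, DNode, build_singly, build_doubly, forward, backward) are illustrative assumptions.

```python
class Node:
    """Singly linked list item: an element and a pointer to its successor."""

    def __init__(self, element):
        self.element = element
        self.next = None


class DNode:
    """Doubly linked list item: an element and pointers in both directions."""

    def __init__(self, element):
        self.element = element
        self.next = None
        self.prev = None


def build_singly(elements):
    """Return the FIRST node of a singly linked list holding elements in order."""
    first = None
    for element in reversed(elements):
        node = Node(element)
        node.next = first
        first = node
    return first


def build_doubly(elements):
    """Return (first, last) of a doubly linked list holding elements in order."""
    first = last = None
    for element in elements:
        node = DNode(element)
        if last is None:
            first = node
        else:
            last.next = node
            node.prev = last
        last = node
    return first, last


def forward(first):
    """Collect elements by following successor pointers."""
    out = []
    while first is not None:
        out.append(first.element)
        first = first.next
    return out


def backward(last):
    """Collect elements by following predecessor pointers (doubly linked only)."""
    out = []
    while last is not None:
        out.append(last.element)
        last = last.prev
    return out
```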
Array Vs Linked List
• Linked lists are more complex to code and manage than arrays.
– Linked lists offer some distinct advantages, though.
• Dynamic: a linked list can easily grow and shrink in size.
– No need to know how many nodes will be in the list. They are created in memory as needed.
– In contrast, the size of a C++ array is fixed at compilation time.
• Easy and fast insertions and deletions
– To insert or delete an element in an array, we need to shift the other elements to make room for the new element or to close the gap left by the deleted element.
– With a linked list, there is no need to move other nodes; only some pointers need to be reset.
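To make the contrast concrete, here is a small sketch of both kinds of insertion. It uses a Python list to stand in for an array, and the helper names (insert_after, delete_after, array_insert, to_list) are assumptions for illustration.

```python
class Node:
    def __init__(self, element, next=None):
        self.element = element
        self.next = next


def insert_after(node, element):
    """Insert after `node` by resetting pointers; no other node moves."""
    node.next = Node(element, node.next)
    return node.next


def delete_after(node):
    """Unlink the node following `node`, if any, and return its element."""
    removed = node.next
    if removed is None:
        return None
    node.next = removed.next
    return removed.element


def array_insert(array, index, element):
    """Insert into an array by shifting every later element one slot right."""
    array.append(None)                           # make room at the end
    for i in range(len(array) - 1, index, -1):
        array[i] = array[i - 1]
    array[index] = element


def to_list(first):
    out = []
    while first is not None:
        out.append(first.element)
        first = first.next
    return out
```

The linked-list operations touch a constant number of pointers once the position is known, while array_insert does work proportional to the number of elements after the insertion point.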
Linked List Operations
Stack
• Stores a set of elements in Last-in First-out (LIFO) order
• The last element inserted is the first one to be removed
Time Complexities of Stack Operations
Queue
• A queue maintains elements in First-in First-out (FIFO) order.
• Elements are added to the "back of the line"
– and elements are removed from the front.
• Mimics how people wait fairly for service, in a line (AmE) or queue (BrE)
• Adding an element is called enqueuing and removing the front element is called dequeuing.
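A minimal FIFO sketch using Python's collections.deque, which supports constant-time operations at both ends. The wrapper class Queue and its method names are assumptions that mirror the terms on the slide.

```python
from collections import deque


class Queue:
    """FIFO queue: enqueue at the back, dequeue from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, value):
        self._items.append(value)          # join the back of the line

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from empty queue")
        return self._items.popleft()       # leave from the front

    def is_empty(self):
        return not self._items
```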
Queue Operations
Priority Queue
• A priority queue is an ADT that maintains a set S of elements, each with an associated value called a key
• In a priority queue, entries are ordered according to a numeric priority value that is stored as the key of each entry.
• By convention, the least (lowest) priority corresponds to the back of a priority queue, while the greatest (highest) priority corresponds to the front.
Queues Vs Priority Queues
www.berkeley.edu
Operations in a Priority Queue
• A max-priority queue supports the following operations:
– Init(S): initialize the priority queue to be empty.
– Is_empty(S): test whether the priority queue is empty.
– Insert(S, x): insert the element x into the set S. It can be written S = S ∪ {x}
– Maximum(S): return the element of S with the largest key
– Extract-max(S): remove and return the element of S with the largest key
– Increase-key(S, x, k): increase the value of element x's key to the new value k, which is assumed to be at least as large as x's current key value
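Most of this interface can be sketched with Python's heapq module, which provides a min-heap; storing negated keys turns it into a max-priority queue. Two assumptions to note: elements are plain numeric keys, and Increase-key is left out because heapq offers no direct support for it. The class name MaxPriorityQueue is hypothetical.

```python
import heapq


class MaxPriorityQueue:
    """Max-priority queue over numeric keys, built on heapq with negated keys."""

    def __init__(self):                    # Init(S)
        self._heap = []

    def is_empty(self):                    # Is_empty(S)
        return not self._heap

    def insert(self, key):                 # Insert(S, x)
        heapq.heappush(self._heap, -key)

    def maximum(self):                     # Maximum(S)
        if not self._heap:
            raise IndexError("priority queue is empty")
        return -self._heap[0]

    def extract_max(self):                 # Extract-max(S)
        if not self._heap:
            raise IndexError("priority queue is empty")
        return -heapq.heappop(self._heap)
```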
Example: unfulfilled obligations
• The Bill Payer Problem: You get bills in the mail from time to time. But you don't pay bills the moment you receive them: instead, you place them on your desk, where they are unfulfilled obligations.
• When you decide to pay a bill, you take one from your desk and pay it.
• The stack strategy: When you get a bill, you place it on the top of your pile of bills. When you decide to pay a bill, you pay the one on top, i.e., the one you got most recently.
• The queue strategy: The bill you pay is the one that's been on your desk the longest. One way to do this is to insert every new bill at the bottom (rear) of the pile, and take from the top (front) of the pile.
• The heap strategy: Every bill has a due date. When you decide to pay, you pay the bill that has the earliest due date.
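The three strategies can be compared on the same pile of bills. The bills below are made-up data, and the function names are assumptions; each function returns the order in which the bills would be paid.

```python
import heapq
from collections import deque

# Hypothetical bills as (due_day, name), listed in the order they arrive.
BILLS = [(20, "rent"), (5, "phone"), (12, "power")]


def stack_order(bills):
    pile = list(bills)                     # newest bill sits at the end (top)
    paid = []
    while pile:
        paid.append(pile.pop()[1])         # pay the most recently received bill
    return paid


def queue_order(bills):
    pile = deque(bills)
    paid = []
    while pile:
        paid.append(pile.popleft()[1])     # pay the bill that has waited longest
    return paid


def heap_order(bills):
    pile = list(bills)
    heapq.heapify(pile)                    # min-heap keyed on due_day
    paid = []
    while pile:
        paid.append(heapq.heappop(pile)[1])   # pay the earliest due date
    return paid
```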
Priority Queue Operations
• Stacks, queues, and heaps (min-heap and max-heap) can all be viewed as priority queues:
– In a stack, the priority item is the most recently inserted item (called the top).
– In a queue, the priority item is the least recently inserted item (insert at the rear; delete from the front).
– In a heap, the priority item is the item with the minimum (or maximum) key.
https://www.tutorialspoint.com/data_structures_algorithms/priority_queue.htm
Priority Queue by Heap
• A priority queue can be implemented using a binary heap.
• A binary heap allows us to enqueue or dequeue items in O(log n) time.
• Two common variations:
– min-heap, in which the smallest key is always at the front
– max-heap, in which the largest key is always at the front
• The binary tree (heap) is always balanced
– A balanced binary tree has roughly the same number of nodes in the left and right subtrees of the root.
– Insertion changes this
• A rebalancing is therefore performed after any 'insert' operation
• A heap with N elements has leaves indexed by ⌊N/2⌋+1, ⌊N/2⌋+2, ⌊N/2⌋+3, …, up to N.
Priority Queue by Heap Example
• Root has the highest priority (max-heap)
– Left_child = Arr[2*i], right_child = Arr[2*i+1]
Convert the following array into a heap:
Arr = [ , 6, 4, 5, 3, 2, 0, 1]
https://www.hackerearth.com/
Performing Max-Heapify
maxHeapify(A, i, n)
    l = left(i), r = right(i)
    largest = i
    if l ≤ n and A[l] > A[largest] then largest = l
    if r ≤ n and A[r] > A[largest] then largest = r
    if largest ≠ i then
        exchange A[i] with A[largest]
        maxHeapify(A, largest, n)
Time complexity = O(log n)
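A direct Python rendering of the pseudocode above, keeping the 1-based indexing of the slides (index 0 is an unused placeholder). The helpers build_max_heap and is_max_heap are additions for illustration: the first applies max_heapify bottom-up to every internal node, the second checks the heap property.

```python
def max_heapify(a, i, n):
    """Sift a[i] down within the 1-based heap a[1..n]; a[0] is unused."""
    left, right = 2 * i, 2 * i + 1
    largest = i
    if left <= n and a[left] > a[largest]:
        largest = left
    if right <= n and a[right] > a[largest]:
        largest = right
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)


def build_max_heap(a):
    """Turn a[1..n] into a max-heap by heapifying internal nodes bottom-up."""
    n = len(a) - 1
    for i in range(n // 2, 0, -1):
        max_heapify(a, i, n)


def is_max_heap(a):
    """True if every node in a[1..n] is no larger than its parent."""
    n = len(a) - 1
    return all(a[i // 2] >= a[i] for i in range(2, n + 1))
```

Each call follows one root-to-leaf path at most, which is where the O(log n) bound comes from.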
Deleting an Element from a Binary Heap
• The standard deletion operation on a heap is to delete the element at the root node of the heap.
• Deleting an element at an intermediate position can be costly.
– Hence, the root is replaced by the last element and the last element is deleted.
• Steps
– Replace the root (the element to be deleted) with the last element.
– Delete the last element from the heap.
– The element now at the root may violate the heap property.
– Therefore, a heapify from the root is needed.
Example will be provided during discussions on heap
Inserting an Element into a binary Heap
Steps
• Provide a placeholder for the new element by increasing the heap size by 1
• Insert the new element at the end of the heap.
• The newly inserted element may violate the heap property.
– Hence, a heapify process is needed: the new element moves up toward the root until the heap property holds again.
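The insertion steps above, together with the root deletion described on the previous slide, can be sketched for a max-heap stored in a 1-based list (index 0 unused). The function names heap_insert and heap_delete_root are assumptions.

```python
def heap_insert(a, key):
    """Place key at the end of the max-heap a, then sift it up toward the root."""
    a.append(key)                          # heap size grows by 1
    i = len(a) - 1
    while i > 1 and a[i // 2] < a[i]:
        a[i // 2], a[i] = a[i], a[i // 2]  # swap with the parent
        i //= 2


def heap_delete_root(a):
    """Remove and return the root: move the last element up, then sift down."""
    n = len(a) - 1
    if n < 1:
        raise IndexError("heap is empty")
    root = a[1]
    last = a.pop()                         # delete the last element
    n -= 1
    if n >= 1:
        a[1] = last                        # last element replaces the root
        i = 1
        while True:
            left, right, largest = 2 * i, 2 * i + 1, i
            if left <= n and a[left] > a[largest]:
                largest = left
            if right <= n and a[right] > a[largest]:
                largest = right
            if largest == i:
                break
            a[i], a[largest] = a[largest], a[i]
            i = largest
    return root
```

Repeatedly deleting the root returns the keys in decreasing order, which is a quick sanity check on both routines.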
Min-heap operation                                          Worst case
Create an empty heap                                        O(1)
Convert an unsorted list into a heap in-place               O(n)
Push an element into the heap                               O(log n)
Find the minimum element in the heap                        O(1)
Pop the minimum element out of the heap & restructure heap  O(log n)
Return the top of the heap (aka the minimum element)        O(1)
Pop, return the top of the heap & restructure heap          O(log n)
Test whether the heap is empty                              O(1)
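For reference, these min-heap operations map onto Python's standard heapq module roughly as follows. The variable names and sample data are illustrative.

```python
import heapq

heap = []                          # create an empty heap: O(1)

data = [7, 2, 9, 4]
heapq.heapify(data)                # convert an unsorted list in place: O(n)
heapq.heappush(data, 1)            # push an element: O(log n)
smallest = data[0]                 # find the minimum (top of the heap): O(1)
popped = heapq.heappop(data)       # pop the minimum and restructure: O(log n)
is_empty = not data                # test whether the heap is empty: O(1)
```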
Graphs
• A graph is a mathematical object consisting of a set of vertices and a set of edges
– Each edge represents a connection between two vertices.
– The two vertices connected by an edge are called the ends of that edge.
– G = (V, E), where each edge in E has the form s − t with s, t ∈ V
• A graph may be directed or undirected
• The degree of a vertex v is the number of edges incident on v
• Graphs can be represented using an adjacency matrix or adjacency lists
Undirected Graph
• An undirected graph is a pair G = (V, E), where V is a set whose elements are called vertices, and E is a set of unordered pairs of distinct elements of V.
• Vertices are often also called nodes.
• Elements of E are called edges, or undirected edges.
• Each edge may be considered as a subset of V containing two elements
• {v, w} denotes an undirected edge
• In diagrams, this edge is the line v---w.
• In text, we simply write vw or wv
• vw is said to be incident upon the vertices v and w
Directed Graph
• A directed graph (or digraph) is a pair G = (V, E), where V is a set whose elements are called vertices, and E is a set of ordered pairs of elements of V.
• Vertices are often also called nodes.
• Elements of E are called directed edges, or arcs.
• For a directed edge (v, w) in E, v is its tail and w its head
• (v, w) is represented in diagrams as the arrow v -> w.
• In text, we simply write vw.
Weighted Graph
• A weighted graph is a triple (V, E, W), where (V, E) is a graph (directed or undirected) and W is a function that assigns a numerical value (weight) to each edge
Complete Graph
• A complete graph is a graph in which each pair of vertices is connected by an edge
• With N vertices, there is an edge between every two vertices.
• The total number of edges in a complete graph is N(N−1)/2
Adjacency Matrix of an Unweighted Graph
• A 2D representation of a graph
• adj[u][v] = 1 indicates that there is an edge from vertex u to vertex v.
• The adjacency matrix of an undirected graph is always symmetric.
• For a weighted graph, if adj[u][v] = w, then there is an edge from vertex u to vertex v with weight w.
• Adding a vertex takes O(V^2) time. Removing an edge takes O(1) time.
(a) An undirected graph
[Figure: undirected graph on vertices 1 to 7]
(b) Its adjacency matrix
     1  2  3  4  5  6  7
1    0  1  1  0  0  0  0
2    1  0  1  1  0  0  0
3    1  1  0  1  0  1  0
4    0  1  1  0  0  1  0
5    0  0  0  0  0  1  0
6    0  0  1  1  1  0  1
7    0  0  0  0  0  1  0
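A small sketch of the adjacency-matrix representation. The edge list is read off the slide's matrix under the assumption that rows and columns are ordered by vertex label 1 to 7; the helper names are illustrative.

```python
EDGES = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (3, 6), (4, 6), (5, 6), (6, 7)]
N = 7

# adj[u][v] uses the 1-based vertex labels; row 0 and column 0 are unused padding.
adj = [[0] * (N + 1) for _ in range(N + 1)]
for u, v in EDGES:
    adj[u][v] = 1
    adj[v][u] = 1                  # undirected: mirror every edge


def has_edge(u, v):
    return adj[u][v] == 1          # O(1) lookup


def remove_edge(u, v):
    adj[u][v] = adj[v][u] = 0      # O(1) removal


def degree(v):
    return sum(adj[v])             # count the 1s in row v


def is_symmetric():
    return all(adj[u][v] == adj[v][u]
               for u in range(N + 1) for v in range(N + 1))
```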
Adjacency matrix of a weighted directed Graph
(a) A weighted directed graph
[Figure: weighted directed graph on vertices 1 to 7]
(b) Its weight matrix
     1   2   3   4   5   6   7
1    0  25   ∞   ∞   ∞   ∞   ∞
2    ∞   0  10  14   ∞   ∞   ∞
3    5   ∞   0   ∞   ∞  16   ∞
4    ∞   6  18   0   ∞   ∞   ∞
5    ∞   ∞   ∞   ∞   0   ∞   ∞
6    ∞   ∞   ∞  32  42   0  14
7    ∞   ∞   ∞   ∞   ∞  11   0
Adjacency List
• Uses a linked list of vertex objects
• Each element in the list has two values.
– The first one is the destination node
– The second one is the weight between these two nodes
• Adding a vertex or an edge takes O(1) time; removing a vertex takes O(deg(v)) time
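(a) A weighted directed graph
[Figure: weighted directed graph on vertices 1 to 7]
(b) Its adjacency lists, with each entry written as (destination, weight)
1: (2, 25)
2: (3, 10), (4, 14)
3: (1, 5), (6, 16)
4: (2, 6), (3, 18)
5: (empty)
6: (4, 32), (5, 42), (7, 14)
7: (6, 11)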
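The adjacency lists of the weighted directed graph can be sketched with a dictionary that maps each vertex to its (destination, weight) pairs. Python lists stand in for the linked lists mentioned on the slide, and the helper names are assumptions.

```python
# vertex -> list of (destination, weight) pairs, as read from the slide
adj = {
    1: [(2, 25)],
    2: [(3, 10), (4, 14)],
    3: [(1, 5), (6, 16)],
    4: [(2, 6), (3, 18)],
    5: [],
    6: [(4, 32), (5, 42), (7, 14)],
    7: [(6, 11)],
}


def add_edge(u, v, w):
    """Append a directed edge u -> v with weight w."""
    adj.setdefault(u, []).append((v, w))
    adj.setdefault(v, [])              # make sure the destination vertex exists


def weight(u, v):
    """Weight of edge u -> v, or None if there is no such edge."""
    for dest, w in adj.get(u, []):
        if dest == v:
            return w
    return None


def out_degree(v):
    return len(adj[v])
```

Unlike the matrix, this representation stores only the edges that exist, so its size grows with |V| + |E| rather than |V|^2.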
Sparse, Dense and Planar Graphs
Sparse graph: a graph in which the number of edges is close to the minimal number of edges (0)
Dense graph: a graph in which the number of edges is close to the maximal number of edges, n(n−1)/2, where n is the number of nodes
Planar graph: a graph that can be embedded in a plane such that no two edges cross each other
Example:
[Figure: four example graphs]
• Sparse graph (planar): |E| = 1 < 6 (6 = maximal number of edges)
• Sparse graph (planar): |E| = 5 < 15 (15 = maximal number of edges)
• Dense graph (planar): |E| = 9 (close to 10 = maximal number of edges)
• Dense graph (not planar): |E| = 10 (close to 10 = maximal number of edges)
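The edge counts in these examples can be checked against the n(n−1)/2 bound. In the sketch below the vertex counts (4, 6, 5, 5) are inferred from the stated maxima (6, 15, 10, 10), and density is simply |E| divided by the maximum; the slides give no numeric threshold separating sparse from dense, so none is assumed.

```python
def max_edges(n):
    """Maximal number of edges in a simple undirected graph on n vertices."""
    return n * (n - 1) // 2


def density(n, e):
    """Fraction of the possible edges that are present (0 = empty, 1 = complete)."""
    return e / max_edges(n)


# (label, inferred number of vertices, number of edges from the captions)
EXAMPLES = [
    ("sparse, planar", 4, 1),
    ("sparse, planar", 6, 5),
    ("dense, planar", 5, 9),
    ("dense, not planar", 5, 10),
]

DENSITIES = [(label, density(n, e)) for label, n, e in EXAMPLES]
```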
Is this graph sparse? Dense? Planar?