Introduction to the module
COMP5180 Algorithms, Correctness and Efficiency
Week 9-1
Anna Jordanous
a.k.jordanous@kent.ac.uk
Today’s lecture
• Introduction to the module
• Summary
• Assessments
• Reading
• Teaching staff
• Schedule
• Drop-in sessions
• Motivation for the module
• What you did last year in relevant modules
• Algorithms, maths*, java
• * Next lecture: Maths recap in a little more detail
Introduction to the module
In this module, we shall explore various algorithms and their underlying
data structures, their correctness and their efficiency.
We shall take forward our understanding from previous algorithm
modules (COMP3830) by exploring:
• data structures like lists, balanced trees and graphs
• recursive algorithms and the recursion tree method for the analysis
of recursive algorithms, plus improvements upon recursive
algorithms, like backtracking and dynamic programming
• tools for analysis of the computational complexity of algorithms,
like the O() notation and the functions used therein
Finally, we will briefly look at classes of algorithms categorised based
on run time
• In particular, complexity classes P, NP, NP-hard and NP-complete,
and the possibility of proving P = NP or not.
Assessments
The module will be assessed via:
• 50% coursework:
• Two programming assessments A1 and A2, each with 25% of total
weight of the module
• Dates TBC soon once approved
• A1 – covering content from Anna’s topics
• A2 – covering content from Sergey’s topics
• Further details will be provided in the Moodle Assessment section
when assessments are set.
• 50% written examination at the end of the year
Reading
• Algorithms are fundamental to computing theory and
implementation.
• There are many excellent books on the topic, some of which are
included in the reading list.
• The reading or reference material for each lecture or topic will be
specified by the respective lecturer.
• The same topic may be explored in several books with different levels
of exposition; the more exposure you get, the better for learning.
• See the Reading list link from Moodle
Teaching staff
The lectures will be delivered by
Anna Jordanous (Convenor) and Sergey Ovchinnik.
Contact Anna or Sergey with issues regarding the module.
Drop-ins will be run by Joseph Kearney.
Classes will be supervised by: Ben Alison, Vanessa Bonthuys, Matthew
Hibbin, Joseph Kearney, Antonin Rapini
Teaching Schedule
• Lectures (weeks 9-14 and 16-20): There will be two 1 hour-long
lectures every week
• Classes (weeks 10-14 and 16-20): Every week, you will attend one
terminal session or class scheduled for the small group of students
you belong to. See your timetable.
• (No classes this week!)
Schedule

Week#  Week (date)  Topic for the week                      Lecturer
9      26/9/22      Intro to module, recap                  Anna
10     3/10/22      Algorithms basics and examples          Anna
11     10/10/22     O notation                              Anna
12     17/10/22     Balanced Binary Search Trees            Anna
13     24/10/22     Graphs                                  Anna
14     31/10/22     Heaps (Anna) / Recursions (Sergey)      Anna/Sergey
15     7/11/22      Project week - no lectures or classes   —
16     14/11/22     Recursions, Recurrences                 Sergey
17     21/11/22     Solving Recurrences, Sorting            Sergey
18     28/11/22     Backtracking                            Sergey
19     5/12/22      Dynamic Programming                     Sergey
20     12/12/22     Complexity classes                      Sergey
[What happened to week 1 ??]
• Short answer: KentVision
• Who’d like to know the long answer?
Drop-in sessions
• Extended learning / Drop-in consolidation optional sessions
• In addition to your compulsory lectures and classes, you might like to
make use of two optional sessions each week, if you feel that you need
extra support.
• Drop-ins are held during weeks 9-14 and 16-20 on Mondays and
Fridays
• Except not this morning
• Run by Joseph Kearney
Motivation for the module
• Getting our software to work as well as possible
• “Every program depends on algorithms and data structures, but few programs
depend on the invention of brand new ones.” -- Kernighan & Pike
• Understanding how things work
• Advancing what you learned last year
• People in industry care that you know about this
• E.g. ‘Hello from Google’ …
• (email conversation 2014)
“…tips for a successful Google interview: The
interview will include topics such as coding, data
structures, algorithms, computer science theory,
and systems design.”
Example: quicksort
From ‘Hello from Google’ email for interview prep:
“…Sorting: Know how to sort. Don’t do bubblesort. You should know
the details of at least one n*log(n) sorting algorithm, preferably
two (say, quick sort and merge sort)…”

Quicksort: empirical analysis
Running time estimates:
・Home PC executes 10^8 compares/second.
・Supercomputer executes 10^12 compares/second.

computer  insertion sort (N^2)              mergesort (N log N)            quicksort (N log N)
          thousand  million    billion      thousand  million   billion    thousand  million  billion
home      instant   2.8 hours  317 years    instant   1 second  18 min     instant   0.6 sec  12 min
super     instant   1 second   1 week       instant   instant   instant    instant   instant  instant

Lesson 1. Good algorithms are better than supercomputers.
Lesson 2. Great algorithms are better than good ones.
Learning outcomes (comp-specific)
• On successfully completing the module students will be able to:
• 8.1 specify, test, and verify program properties;
• 8.2 analyse the time and space behaviour of simple algorithms;
• 8.3 use known algorithms to solve programming problems;
• 8.4 make informed decisions about the most appropriate data
structures and algorithms to use when designing software.
Learning outcomes (generic)
On successfully completing the module students will be able to:
• 9.1 demonstrate an understanding of trade-offs when making design
decisions;
• 9.2 make effective use of existing techniques to solve problems;
• 9.3 demonstrate an understanding of how programs (can fail to)
match a specification;
• 9.4 analyse and compare solutions to technical problems.
Pre-requisites
• COMP5200: Further Object-Oriented Programming
• (and COMP3200 Introduction to Object-Oriented Programming)
• COMP3250: Foundations of Computing II
• (and COMP3220 Foundations of Computing I)
• COMP3830: Problem Solving with Algorithms
Pre-requisites: Java
(up to COMP5200)
object-oriented program design and implementation,
a range of fundamental data structures and algorithms,
advanced features of object-orientation, such as:
• interface inheritance,
• abstract classes,
• nested classes,
• functional abstractions,
• exceptions
Pre-requisites: Algorithms (COMP3830) –
what you learned
introductory algorithms,
algorithm correctness,
algorithm runtime,
big-O notation,
essential data structures such as arrays, lists and trees,
algorithmic programming skills such as searching and sorting, recursion, and divide
and conquer.
Pre-requisites: Maths
(COMP3250/3220)
matrices, logic, functions, vectors, differential calculus, probability,
algebra, reasoning and proof, set theory, statistics, computer arithmetic
We’ll recap some relevant maths in the next lecture
Today’s lecture
• Introduction to the module
• Summary
• Assessments
• Reading
• Teaching staff
• Schedule
• Drop-in sessions
• Motivation for the module
• What you did last year in relevant modules
• Algorithms, maths*, java
• * Next lecture: Maths recap in a little more detail
Which maths do we need?
COMP5180 Algorithms, Correctness and Efficiency
Week 9-2
Anna Jordanous
a.k.jordanous@kent.ac.uk
Today’s lecture
• Introduction: What maths and why
• Polynomials
• Exponential functions
• Factorial
• Logarithm
• Ceiling/floor function
• Modular arithmetic
Meta-Comment
• Most of this is not new; it is a recap of COMP3220/COMP3250, but put
into context
• The content of this lecture serves mainly as a reference for later, i.e. if
you come across some befuddling maths later on, chances are it is
explained here
• Some textbooks on the subject have a similar structure: they start
with the maths they need before commencing with the algorithms
What do we use maths for (in COMP5180)?
• We want to describe the performance of algorithms, mostly w.r.t. runtime, sometimes
w.r.t. memory consumption
• For that description we deploy some standard mathematical functions, some of which
we will recap here
• Generally (there is an exception) we do not use these functions in our programs, they
just give us a performance model
• exception: spreadsheets, measuring performance
• We also need a form of abstraction over these functions that allows the description of
parameterised programs and is hardware-independent (O-notation and friends)
• Not all of the maths is directly tied to performance description though
What functions and why?
• We will be looking at which functions are commonly used here, but also…
• …why they are commonly used
• In addition, this is a refresher in notation, and a clarification of jargon
• When describing performance etc. we typically want to do this not for one
instance of the problem, but for all of them
• we characterise a class of problem by a number of “measurements” and describe the
performance as a function of those
• very often just one measurement “n”, typically the “size”
Polynomials
• Any function you can build by using addition* and multiplication** to combine
constants, variables and exponents
e.g. x^2 + 4x + 7
* subtraction is possible via negative constants,
** division is possible via fractional constants
the exponentiation here is just used for notation; the same function can be written
7 + x(4 + x); NB you might also see 7 + x · (4 + x) and x^2 + 4 · x + 7
• Special cases: linear, quadratic, cubic (polynomials of degree 1, 2, 3, e.g. x, x^2, x^3)
• These examples are not polynomial: 2^x, x^x, 1/x, √x
• Sometimes we describe a function as polynomial if it is bounded by a polynomial, e.g.
x·log(x) is not a polynomial itself but it is bounded by one (e.g. by x^2)
Why polynomials?
• addition: if you have a sequence of statements S1; S2; where the “time” needed to
run statement Si is t_i, then the time for the sequence is t_1 + t_2
• multiplication: if you have a for-loop repeated N times, e.g.
for(int i=0; i<N; i++)
   S1; S2; …
and the cost for a single loop iteration is t,
then the overall cost for the loop is N × t
• Arbitrary polynomials then arise through sequences of nested loops
• (We see these being useful for Big O notation)
Example: bubble sort pseudocode
for i = 0 to N - 2
  for j = 0 to N - 2
    if (A(j) > A(j + 1))
      temp = A(j)
      A(j) = A(j + 1)
      A(j + 1) = temp
    end-if
  end-for
end-for

addition: for the body of the if, T = 1 + 1 + 1 = 3
(if we assume a time t = 1 for each line here)
multiplication: each for-loop multiplies the cost of its body by N
Polynomial emerges: N × N × 3 = 3N^2
(in practice, we usually ignore the constant ‘3’, and refer to bubble sort as O(N^2) - why?)
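To make this concrete, here is a runnable Java version of the pseudocode above (a sketch of our own; the class and method names are not from the slides):

class BubbleSortDemo {
    // Two nested loops over an array of length N: roughly N * N * (constant
    // work in the body), hence the 3N^2 polynomial, i.e. O(N^2).
    static void bubbleSort(int[] a) {
        int n = a.length;
        for (int i = 0; i <= n - 2; i++) {
            for (int j = 0; j <= n - 2; j++) {
                if (a[j] > a[j + 1]) {      // out of order: swap
                    int temp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = temp;
                }
            }
        }
    }
    public static void main(String[] args) {
        int[] a = {4, 76, 1, -8, 987, 3, 2};
        bubbleSort(a);
        System.out.println(java.util.Arrays.toString(a)); // [-8, 1, 2, 3, 4, 76, 987]
    }
}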
Objection
• What if the cost of the loop body is not constant, but depends on the loop variable?
• Often it suffices for our purposes to overapproximate the cost by a common value
• But if that is too crude then:
if t_i is the time for the i-th iteration and there are N iterations
then the overall time is ∑_{i=0}^{N−1} t_i
• This does not appear tremendously helpful, however certain patterns are common,
e.g. ∑_{i=0}^{N−1} i = N(N−1)/2
• More generally, if t_i is a (degree k) polynomial over variable i, then the sum can be
expressed as a (degree k+1) polynomial over N.
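For instance (a sketch of our own, not from the slides), a loop whose body cost depends on the loop variable:

// The inner loop runs i times on the i-th outer iteration, so the total
// work is 0 + 1 + ... + (N-1) = N(N-1)/2: a degree-2 polynomial in N.
static long triangularWork(int n) {
    long steps = 0;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < i; j++) {
            steps++; // one unit of work per inner iteration
        }
    }
    return steps; // equals (long) n * (n - 1) / 2
}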
Exponential functions
• Functions of the form c^x where c is a constant such that c > 1
• These grow eventually faster than any polynomial
• Jargon: “exponential” is not synonymous with “very bad”; we merely have
exponential functions as performance bounds
• Algorithms with exponential performance are usually bad news, because they do
not scale at all, however there are “bad algorithms” and “bad problems”, i.e.
there is a big distinction between
• badly coded problem solutions, and
• problems for which there is no efficient solution
Example of exponential growth
• The spread of COVID-19 (pre-vaccine)
https://twitter.com/GaryWarshaw/status/12403026537
64059136/photo/1
Image credit @garywarshaw @SignerLab
Example 2: bad maximum function
int max(int[] arrayA) {
return max(arrayA, arrayA.length-1);
}
int max(int arrayA[],int to) {
if (to==0)
return arrayA[0];
if (arrayA[to] > max(arrayA, to-1))
return arrayA[to];
else return max(arrayA, to-1);
}
Explanation
• Because the recursive call max(arrayA, to-1) is not stored in a variable,
it could be called 2 times,
• and max(arrayA, to-2) could be called 2 times for each of those 2 calls (so 4
times overall) [2^2 = 4] …
• and max(arrayA, to-10) called 1024 times overall [2^10]
• and max(arrayA, to-100) would be called too many times for the lifetime of
the computer [2^100]
• This will specifically happen if the numbers in the array are in reverse
order e.g.
int arrayA []=new int[100];
for(int i=0;i<100;i++)
arrayA[i]=100-i;
100, 99, … , 2, 1
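The fix the slide hints at is to store the recursive result in a variable, so it is computed only once per level; a sketch:

// Linear-time version: one recursive call per level, so n calls in total.
int max(int[] arrayA, int to) {
    if (to == 0)
        return arrayA[0];
    int restMax = max(arrayA, to - 1);   // computed once, remembered
    return arrayA[to] > restMax ? arrayA[to] : restMax;
}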
Example 3: create truth table
boolean vals[]=new boolean[VARS];
void truthtable(int v) {
  if (v==VARS) { produceRow(); return; } // row complete: emit it, stop recursing
  vals[v]=true;  truthtable(v+1);
  vals[v]=false; truthtable(v+1);
}
Explanation
• The method truthtable produces (a portion of) a truth-table
• Parameter v tells us for how many variables we already have a value
• If we have one for all (v==VARS) we can add a row to the table
• Otherwise, add a value and recurse
• This is also exponential (2^VARS), but here we cannot do better than
that, because a truth table is exponential in the number of variables
(Non-recursive version of code for creating
truth table)
boolean nextrow() {
for (int i=VARS-1; i>=0; i--) {
if (!vals[i]) { vals[i]=true; return true; }
else { vals[i]=false; }
}
return false;
}
void truthtable() {
do { produceRow(); } while(nextrow());
}
Factorial
• "! = 1 × 2 × 3 × ⋯ × " − 1 × "
• Number of permutations of n distinct elements
• Grows faster than exponential functions (but slower than "0 )
• We do not normally consider algorithms with such a bad time
complexity, but they can arise
• e.g. generate-and-test algorithms can in bad scenarios act like that
Generate-and-test?
• We do not always have a constructive way to solve problem X
• but we may have a way of testing whether a candidate solution is an actual
solution
• so we can repeatedly produce candidate solutions and test them, until we find
the real deal; but if there are a lot of candidates…
• Example: a naïve sorting algorithm (‘bogo sort’) would be to permute the elements
of an array randomly; if the result is in order we are done; otherwise repeat
• on average this will require about n! many iterations
• Generate-and-test is also a popular naïve approach to computational creativity /
creative AI (see COMP6590, Stage 3)
2" = 8
log ! 8 = 3
Logarithm
• log_c x, the logarithm of x to base c, is defined as the inverse to exponentiation to
base c (constraint: c > 0, x > 0)
• If c^x = y then log_c y = x
• log_c c^x = x
• Common cases for c: 2, 10, e (natural logarithm), φ
• (why these common cases? See next slides)
• Algebraic laws:
• log_c ab = log_c a + log_c b
• log_c (a/b) = log_c a − log_c b
• c^(log_c x) = x
• a^b = c^(b · log_c a), hence log_c a^b = b · log_c a
• NB in the world of algorithms we often just use log without a base, because logs
of different bases differ only by a constant factor
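A quick worked example of the product law: log_2(8 × 4) = log_2 8 + log_2 4 = 3 + 2 = 5, and indeed 2^5 = 32 = 8 × 4.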
How do logarithms arise in algorithms? Here are some ways
• Remember the generic cost of a for-loop, ∑_{i=1}^{N} t_i? If t_i = 1/i then this sums up
approximately to log_e N.
• The height of a randomly-built binary search tree (with n elements) is on average
≈ 2 log_e n ≈ 1.39 log_2 n
• Various kinds of trees that are “balanced” are so, because the height is bounded
by log_c n for some c:
• c = φ for AVL trees and also weight-balanced union/find structures
• The natural log (log_e or ln) is useful for exponential growth and decay models
such as algorithms for economics models
• If we are trying to find the index position of a value v in a sorted array, we can do
that in log_2 n iterations, where n is the length of the array…:
Binary search
int find(int v, int arrayB[]) {
int low=0; int high=arrayB.length-1;
while(low<=high) {
int mid=(low+high)/2;
if(v==arrayB[mid])
return mid;
else if(v<arrayB[mid])
high=mid-1;
else low=mid+1;
}
return -1; //not found
}
Binary search is O(log n) – i.e. an algorithm with logarithmic complexity,
meaning run time grows proportionally to the log of the size of the input.
More on Big O notation later in this module.
Fibonacci and φ
Image credit: arbyreed (Flickr)
• the golden ratio is the number φ = (1 + √5)/2 ≈ 1.618
• Fibonacci numbers: F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n
• 0, 1, 1, 2, 3, 5, 8, 13, 21, …
• for large n we have F_{n+1}/F_n ≈ φ, e.g. F_9/F_8 = 34/21 ≈ 1.619
• approximation (from below): F_n ≈ φ^n/√5
https://www.invisionapp.com/insidedesign/golden-ratio-designers/
• log_φ makes it into the description of some algorithms, because it is (nearly) the
inverse of the Fibonacci function,
• i.e. given N, to find the n such that F_n = N we might as well compute log_φ N
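A small Java sketch (ours, not from the slides) that checks both claims numerically:

// Compute F_n iteratively (n >= 1), then compare the consecutive ratio
// with phi, and F_n with the approximation phi^n / sqrt(5).
static void fibDemo(int n) {
    double phi = (1 + Math.sqrt(5)) / 2;   // ≈ 1.618
    long prev = 0, cur = 1;                // F_0 and F_1
    for (int i = 1; i < n; i++) {          // advance to F_n
        long next = prev + cur;
        prev = cur;
        cur = next;
    }
    System.out.println("F_n          = " + cur);
    System.out.println("F_n / F_n-1  = " + (double) cur / prev);              // → phi
    System.out.println("phi^n/sqrt 5 = " + Math.pow(phi, n) / Math.sqrt(5));  // ≈ F_n
}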
Interested in Fibonacci? Here’s an episode of the BBC podcast ‘In Our Time’ on “The Fibonacci Sequence”
Example: AVL trees
AVL trees are binary search trees such that the heights of the left and right child can differ by at most
1, and every child is itself an AVL tree.
• How many nodes does the most sparsely populated AVL tree of height n have? Let’s call that
sequence of numbers s_n
• Clearly: s_0 = 0, s_1 = 1; there is not even a choice at those heights
• What about height n+2? One child must have height n+1, for the other we only need height n.
• Make both most sparsely populated and we get: s_{n+2} = s_{n+1} + s_n + 1
• These are the Leonardo numbers (closely related to Fibonacci numbers):
• 1, 1, 3, 5, 9, 15, 25, 41, 67, 109, …
• L_n = 2F_{n+1} − 1
• Maximum height of an AVL tree with m elements: ≈ log_φ(m + 1)
[We’ll explore this more in a few weeks]
Aside: ceiling/floor function
• When converting from real numbers to integers, we have a choice between
rounding, rounding up (ceiling), and rounding down (floor)
• the ceiling function ⌈x⌉ gives rounding up,
• e.g. ⌈1.08⌉ = 2, ⌈3.72⌉ = 4, ⌈5⌉ = 5
• the floor function ⌊x⌋ rounds the other way,
• e.g. ⌊1.08⌋ = 1, ⌊3⌋ = 3
• Generally, if x is a real number then ⌈x⌉ and ⌊x⌋ are integers such that:
x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1
• On the previous slide, we wanted to know the maximal height of a tree, but
heights of trees are integers, so…
Modular arithmetic
• Sometimes we compute arithmetic operations in Computing modulo a number p,
• i.e. instead of computing a + b and ab we are really computing (a + b) % p and
(ab) % p
• This is even true for built-in integers (type int) in Java: p = 2^32
• Modular equivalence: we write a ≡_p b iff a % p = b % p
• e.g. 5 ≡_4 9 because 5 % 4 = 9 % 4
• Generally, if we have a ≡_p b and c ≡_p d,
• then we also have a + c ≡_p b + d and ac ≡_p bd
• Generally (for any a) there is an additive inverse of a, which is p − a,
• because a + (p − a) = p ≡_p 0 [a multiplicative inverse also exists, when p is prime]
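A quick check of those rules: 5 ≡_4 9 and 3 ≡_4 7, and indeed 5 + 3 = 8 ≡_4 16 = 9 + 7 (both leave remainder 0), and 5 × 3 = 15 ≡_4 63 = 9 × 7 (both leave remainder 3).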
Example Application
• In cryptography, one typically encrypts a message m (a large integer) as
m^e modulo some integer p, where e is also a fairly large integer
• m^e is monstrously huge, would take ages to compute – but we do not
need it as an intermediate result
• We can exploit:
• m^0 ≡_p 1
• m^(2k) ≡_p m^k × m^k
• m^(2k+1) ≡_p m^k × m^k × m
Want to read more about how modular arithmetic and exponents are used in public key cryptography? See Chapter 4, “9 algorithms
that changed the future” (John MacCormick)
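A minimal Java sketch of that square-and-multiply idea (our own illustration; in practice java.math.BigInteger.modPow does this for arbitrarily large numbers):

// Computes (base^exp) % p without ever materialising base^exp.
// Assumes p is small enough that (p-1)*(p-1) fits in a long.
static long modPow(long base, long exp, long p) {
    base %= p;
    if (exp == 0) return 1 % p;                    // m^0 ≡ 1 (mod p)
    long half = modPow(base, exp / 2, p);          // m^k (mod p)
    long sq = (half * half) % p;                   // m^(2k) ≡ m^k × m^k
    return (exp % 2 == 0) ? sq : (sq * base) % p;  // m^(2k+1): one extra factor of m
}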
Further additional reading
• COMP3250 / COMP3220 notes plus recommended reading from COMP3250 /
COMP3220
• COMP3830 notes plus recommended reading from COMP3830
• Algorithms in Java (Sedgewick) Ch 2
• Data structures & problem solving using Java (Weiss) Ch 5,19
• Introduction to algorithms (Cormen et al): relevant parts of Ch 1, 2, 3, 13.3
Glossary
• Polynomials: any function you can build by using addition and multiplication to combine constants, variables and
exponents
• Exponential functions: functions of the form c^x where c is a constant such that c > 1
• Factorial: n! = 1 × 2 × 3 × ⋯ × (n − 1) × n – the number of permutations of n distinct elements
• Generate-and-test: repeatedly produce candidate solutions and test them, until we find an actual working solution
• Logarithm: log_c x, the logarithm of x to base c, is defined as the inverse to exponentiation to base c (constraint: c > 0, x > 0)
• golden ratio: the number φ = (1 + √5)/2 ≈ 1.618
• Fibonacci numbers: F_0 = 0, F_1 = 1, F_{n+2} = F_{n+1} + F_n. Sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, …
• Ceiling function: rounding up a number
• Floor function: rounding down a number
• Modular arithmetic: compute arithmetic operations modulo a number p
Today’s lecture
• Introduction: What maths and why
• Polynomials
• Exponential functions
• Factorial
• Logarithm
• Ceiling/floor function
• Modular arithmetic
Idea of a Data Structure
COMP5180 Algorithms, Correctness and Efficiency
Anna Jordanous
a.k.jordanous@kent.ac.uk
Today’s lecture
• Goals of a data structure
• Data structures – flexible vs fixed
• Examples
• Pros and cons
• Hybrid approaches for the examples
• Data structures for algorithms
• Case study: Dynamic arrays (Array Lists)
• Description and representation
• Performance analysis
What are the goals of a Data Structure?
• We want to store data in it
• We want to access and manipulate the data efficiently, via dedicated
methods
• We want to use memory effectively
How is the data organised?
This varies, but there are two substantially different approaches.
1. Flexible sized memory, flexible structure:
• The data structure is made up of similarly behaving “tiles”, linked together by
object references (pointers)
• Adding data just adds another tile; sometimes tiles are rewired for long-term
performance gain
2. Fixed size memory, rigid structure:
• A fixed amount of memory is associated with the structure
• When accessing parts of the data a small amount of processing info needs to
be maintained
• When adding data we may need to clone & destroy the old structure
Example (type 1), Top-to-Bottom tree
class Tree { int data; Tree left, right; }
Tree x,y,z;
x= new Tree(); y=new Tree(); z=new Tree();
x.data=4; y.data=3; z.data=7;
x.left=y; x.right=z;
(diagram: root node 4 with left child 3 and right child 7)
Example (type 2), Tree with fixed positioning
int SIZE=3;
int data[] = new int[SIZE];
data[0]=4; data[1]=3; data[2]=7;
(diagram: the same tree stored in an array – 4 at index 0, 3 at index 1, 7 at index 2)
Note: tree is the same as before. We use the convention that left child
of node 0 is in position 1, right child in position 2. General convention:
left child of node n is at 2n+1, right child at 2n+2.
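In code, the index arithmetic is simply (a small sketch; the parent function is our addition):

// Fixed (array) representation: navigation is pure index arithmetic.
static int left(int n)   { return 2 * n + 1; }
static int right(int n)  { return 2 * n + 2; }
static int parent(int n) { return (n - 1) / 2; }  // inverse of both, for n > 0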
Pros and Cons
• flexible approach:
• pros: substructures can be moved as a whole, shape is flexible, incremental
structure growth is easy
• cons: substantial proportion of memory dedicated to object references
(here: 75%)
• fixed approach
• pros: links implicit, and go both ways, efficient use of memory when full
• cons: no substructures, empty substructures need to be encoded, inefficient
use of memory when nearly empty, growth may run into stop-the-world
(a necessary pause to free up/reorganise memory)
Which is better?
• This depends on the application:
• Binary search trees [old+new]: flexible approach is better, because
1. we can do balancing [new] efficiently as we can move entire subtrees
2. even if we do not balance, the worst-case memory footprint of the rigid structure is
exponential in tree size, but linear for the flexible structure
• Binary heaps [new]: rigid approach is better, because
1. trees can/should be kept rigidly balanced all the time anyway
2. we need to traverse tree up as well as down
3. between 50% and 100% of memory is payload (contains the actual data) (instead of
25%)
Hybrid approaches between those
• Hash tables typically use a rigid structure at the top, with a flexible
structure underneath – kind of an orchard of trees
• no stop-the-world when (rigid part of) table becomes full
• performance deterioration when full is minimal
• Trees/Forests over a fixed number of data points that have a unique (if
any) parent are often represented as int arrays or finite maps
• links remain explicit; what to do if data points aren’t fixed?
• used in Dijkstra’s algorithm (finding shortest path), union/find
What are we investigating when looking at a
Data Structure and its Algorithms?
• we typically characterize performance when n pieces of elementary data are
represented, i.e. the following is relative to n or O(n)
• how much computer memory do we need for all that data?
• this is a question we often ignore, e.g. when rivalling implementations barely differ in
that respect (typical for rivalling flexible approaches)
• how much time does it take to perform operations X and Y
• …in the worst case scenario
• …in a random scenario (on average)
• …on average over the lifetime of a data structure (amortized time complexity)
Case Study: Array Lists
• arrays are a primitive (in most PLs) data structure that is rigid at least
at creation time (at compilation time in some PLs)
• dynamic arrays (= array lists) are a way to modify arrays to
incorporate incremental growth somehow
• main difference in usage (other than syntax):
• only op to store data in arrays is “set”: array[index]=data;
arraylist.set(index,data) is disabled unless previous indexes are assigned
• main op to store data in arraylists is “add”, arraylist.add(data); corresponding
op for arrays does not exist, but can be faked with additional fields
How do dynamic arrays work?
(diagram: a dynamic array object pointing to its internal array)
Representation of Dynamic Array
• a dynamic array has two/three fields:
1. an actual array that contains all the data
2. an integer index that points to the next free cell in the array
3. an integer field giving the length of the array (the “capacity”); in
Java this comes for free with the array, but not in all programming
languages (PLs)
• the idea on the previous slide was: the dynamic array contained 4
elements, so position 4 was the first free position in the array
• if the array is full the index is equal to the length
How does add work? It may fit…
(diagram: the new element goes into the next free cell of the internal array)
How does add work? It may not fit…
(diagram: the internal array is full; a bigger array must be allocated)
So if the new element added does not fit…
• a fresh bigger array is created (bigger by a factor of about 1.5)
• this becomes the new internal array of the dynamic array
• all the data is copied across from old array to new array
• now there is room for the extra element
• the old internal array will be picked up by the garbage collector at
some point
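Putting the representation and the add operation together, a minimal sketch in Java (our own names; java.util.ArrayList does the same with a growth factor of about 1.5):

// Minimal dynamic array of ints: an internal array plus an index
// to the next free cell. add() grows the array by a factor when full.
class DynamicIntArray {
    private int[] data = new int[4];  // internal array; data.length is the capacity
    private int size = 0;             // index of the next free cell

    void add(int value) {
        if (size == data.length) {                       // full: grow by a factor
            int[] bigger = new int[data.length + data.length / 2 + 1];
            System.arraycopy(data, 0, bigger, 0, size);  // copy everything across
            data = bigger;            // old array is left for the garbage collector
        }
        data[size++] = value;
    }

    int get(int index) { return data[index]; }
}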
Performance?
• In the best case O(1) – we check whether there is room, there is, we
put the data in
• In the worst case O(n) – we check whether there is room, there is not,
we create a fresh array, copy n elements across, put the data in
• On average, over the lifetime of the data structure? Hm. This is the
interesting bit. To get a good performance it is important that the
array is grown by a factor, not just a fixed amount!
Average performance of add
• To compute the “amortized” time complexity we have to add up all the
work that happens over the lifetime (n add operations) of the dynamic
array and divide it by n. How?
• We have the cost of storing all those n elements themselves, over the
lifetime that amounts to O(n)
• We have the cost of copying all those cells each time the array is replaced.
How many are copied over all?
• For simplicity, suppose we grow the array by factor of 2. In the worst case,
all cells would have been copied once, half of those a second time, half
those a third time, in the limit: 2n copies, average at 2.
• Cost for n add ops is O(n), so O(1) for a single op.
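A quick sanity check with concrete numbers (ours): with doubling and initial capacity 1, a sequence of n = 8 adds replaces the array when it holds 1, 2 and 4 elements, copying 1 + 2 + 4 = 7 < 2n elements in total.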
Not
examinable
ArrayLists themselves grow by k=1.5
• On average cells are copied 3 times (worst case). Why?
• in general, when the growth factor is k, cells will be copied 1/(1 − 1/k)
times. This derives from the formula ∑_{i=0}^{∞} q^i = 1/(1 − q) for q < 1,
applied here with q = 1/k.
• thus: k = 1.5, 1/k = 2/3, 1 − 1/k = 1/3, 1/(1 − 1/k) = 3
Other ops on dynamic arrays
• what about other operations on dynamic arrays?
• we could look at the implementation, but representation itself tells us
what is possible at best
• for example, contains (checking whether an object is in the structure)
is surely O(n):
• if it is there, we will need n/2 comparisons on average to find it
• if not, we need n comparisons to make sure it is absent
• either case is O(n)
An intriguing case is removing an element
• there are two issues of concern here
1. how is the element to be removed identified? By itself, or by an index
position of the dynamic array?
2. what promises, if any, do we make about index positions of other elements
in the array, before and after?
• the actual Java API for ArrayList “remove” has two versions, for either
kind of removal
• both of these are necessarily O(n), but with a weaker promise on
index positions remove(int i) could be realised as O(1).
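A sketch of that weaker-promise removal (ours, continuing the DynamicIntArray sketch from earlier; this is not the java.util.ArrayList behaviour): overwrite position i with the last element, so every element except the moved one keeps its index:

// O(1) remove-by-index when we do not promise stable positions:
void removeUnstable(int i) {
    data[i] = data[size - 1];  // move the last element into the gap
    size--;                    // shrink; order of elements is not preserved
}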
Further additional reading
• Algorithms in Java (Sedgewick 2003) – Ch 3
• Cracking the coding interview: 150 programming questions and solutions
(McDowell 2013) – Ch 1,2
• Data structures & problem solving using Java (Weiss 2010) – Ch 2, 15
• Introduction to algorithms (Cormen et al 2009) - Ch 3, also intro to Section III
Glossary
• stop-the-world – a pause needed to perform garbage collection e.g. when adding to a fixed data
structure
• garbage collection (freeing up/reorganise memory)
• payload - the part of transmitted data that is the actual intended message
• substructure – a structure within a bigger structure
• amortized time complexity - average time complexity, based on total cost of operations over the
lifetime of a data structure
• primitive data structure – an inbuilt data structure in a programming language
• dynamic arrays - modified arrays that allow incremental growth during runtime (= array lists)
• performance – run time measurements of an algorithm, or of operations on a data structure
• operation - a function that gets performed, e.g. on a data structure
• PL = programming language
Today’s lecture
• Goals of a data structure
• Data structures – flexible vs fixed
• Examples
• Pros and cons
• Hybrid approaches for the examples
• Data structures for algorithms
• Case study: Dynamic arrays (Array Lists)
• Description and representation
• Performance analysis
A complete binary tree in nature
Balanced
Binary Search Trees
COMP5180 Algorithms, Correctness and Efficiency
Anna Jordanous
a.k.jordanous@kent.ac.uk
Today’s lecture
• Using Trees for searching data and keeping it sorted
• Binary tree and Binary Search tree recap
• Tree height and number of nodes
• Balanced trees and complete trees
• Constructing balanced trees
• AVL trees
• Red-Black trees
• [2-3 trees *]
• [B-Trees *]
* = Non
examinable
content
* We won’t cover details of 2-3 trees or B-Trees. Slides on 2-3 trees and B-Trees are kept in
this powerpoint for reference only, if you are interested. Details of 2-3 trees and B-Trees
are not examinable – you are expected to know that these are two other ways of
constructing balanced trees, but not how they work.
Trees
(diagram: an example tree with nodes 2, 9, 2, 1, 3, 5)
Let’s first recap Binary Trees
A binary tree is a linked data structure where each node has links to two
other nodes (left and right).
The entry node is called the root node, and left leads to a subtree of
values that come before the root node.
Similarly, right refers to a subtree of nodes that come after the root node.
A node which has empty left and right links (i.e. a node which doesn’t
point to any other nodes) is called a leaf node.
We use family hierarchy terms such as parent, child, sibling, ancestor,
descendant, to describe relationships between nodes.
(diagram: a binary tree with root node 7; nodes whose left and right links
are both null are leaves)
Searching through the tree to find items
If we are looking for an item in a binary tree, we have two ways to
search through the tree:
1. Depth first search
2. Breadth first search
(diagram: the same binary tree with root node 7)
A binary search tree example (using integer keys)
In a binary search tree
• a left child node’s value is always < its parent node’s value
• a right child node’s value is always > its parent node’s value
(diagram: a BST with root node 7; left subtree holds 4, 2, 6, 5 and right
subtree holds 10, 8, 9)
And if we have two nodes with the same value?
• (either don’t allow duplicate keys, or make a decision to always put them
left or right, or use hash functions to generate unique keys)
In a binary search tree we don’t need to choose between depth first or
breadth first search; we are guided by the values:
Searching a BST
• If the tree is empty then the key cannot be found.
• Otherwise compare the key we are looking for (k) with the key in
the root node (root_k).
• If equal, then the search is over
• If k < root_k then ignore the right subtree; return the result of
searching the left subtree.
• If k > root_k then return the result of searching the right subtree.
(diagram: the same BST with root node 7)
http://algs4.cs.princeton.edu/home/
What a Tree class looks like
• The key field, used for comparing
nodes is an integer, to keep things
simple. It could just as easily be a
String.
• We have called the node class Tree
since it can be the root of a whole
tree.
class Tree {
private int key;
private Tree left;
private Tree right;
// any other fields
};
// functions for
// manipulating trees
Issues with (not) using null as empty tree
• the empty tree issue, there is a choice:
• we can use null for the empty tree (most common, not very OO)
• makes recursive code a bit awkward: use static methods instead, or lots of null-checks
• we can use a dedicated EmptyTree class; issues:
• makes loops awkward
• if the empty tree is not shared we waste half the memory
• we can use a dedicated object of the Tree class as "I am empty"
• half-OO = compromise between the two alternatives above
pure OO version
abstract class Tree {
abstract Tree search(int key);
abstract Tree insert(int key);
}
class EmptyTree extends Tree {
Tree search (int k) { return this; }
Tree insert(int k) {
return new NETree(this,k,this); }
}
class NETree extends Tree { private int key;
private Tree left,right; // constructor...
Tree search (int k) {
if (k==key) return this;
if (k<key) return left.search(k);
else return right.search(k); }
Tree insert(int k) { ... }
}
Half-OO, faking Empty tree class
class Tree {
private Tree left,right;
private int key; // constructor...
private final static Tree empty = new Tree(null,0,null);
Tree search(int k) { Tree cur=this;
while (cur!=empty) {
if (k==cur.key) return cur;
cur=k<cur.key?cur.left:cur.right; }
return null; }
Tree insert(int k) {
if (this==empty) return new Tree(this,k,this);
if (key==k) return this;
if (k<key) left=left.insert(k);
else right=right.insert(k);
return this; }
}
Tree traversal
• inorder tree traversal
• visit left node (recursive),
• then visit root node,
• then visit right node (recursive)
• 2, 4, 6, 7, 8, 10 – a SORTED LIST if we traverse a BST inorder
• preorder tree traversal
• visit root node,
• then visit left node (recursive),
• then visit right node (recursive)
• 7, 4, 2, 6, 10, 8
• postorder tree traversal
• visit left node (recursive),
• then visit right node (recursive),
• then visit root node
• 2, 6, 4, 8, 10, 7
(diagram: BST with root node 7, left subtree 4 (children 2, 6), right subtree 10 (child 8))
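As a sketch in Java (assuming the Tree class from earlier, with accessible key/left/right fields and null as the empty tree):

// Inorder traversal: left subtree, then the node itself, then right subtree.
// On a binary search tree this visits the keys in ascending order.
static void inorder(Tree t) {
    if (t == null) return;        // empty tree: nothing to visit
    inorder(t.left);              // 1. left subtree (recursive)
    System.out.println(t.key);    // 2. root
    inorder(t.right);             // 3. right subtree (recursive)
}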
How efficient are binary search trees?
• The algorithms to search, traverse and insert in a tree move one step
down the tree at each iteration (or each recursive method call).
• Means that the time taken is limited by the height of the tree.
• A tree's height is the maximum distance from the root to a leaf node (that is,
a node with no subtrees).
• The amount of data stored in a tree (its size) is the number of nodes
e.g. BSTs vs Hash Tables
hash table example:
(diagram: a hash table with buckets 0–11 holding entries for Fred (8822),
Alan (43333), Mary (4227) and John (3929); unused buckets are null)
• Binary search trees and hash tables provide alternative ways to implement a
lookup-table abstract data type.
• Hash tables are potentially faster
• Search time is the same however many records are stored
• Hash tables are the usual way to implement a lookup table where speed is very
important
• Generally, there is a trade-off between table size and speed
• BSTs are more flexible and often easier
• can create off the top of your head if you understand the principles
• can search for more than just an exact match… (see next slide)
Flexible Tree Searching Example
• Suppose we want to write a program to keep track of lengths of timber
in a store.
• If someone needs a 7.3 meter length then we want to find the piece
that is nearest to 7.3 meters but is not smaller.
• we can trim it to size (with some wastage) but cannot make it longer.
• We’d do this by amending the search...
• How would you go about this?
Today’s lecture
• Using Trees for searching data and keeping it sorted
• Binary tree and Binary Search tree recap
• Tree height and number of nodes
• Balanced trees and complete trees
• Constructing balanced trees
• AVL trees
• Red-Black trees
• [2-3 trees *]
• [B-Trees *]
What is the relationship between the height
of a tree and its size?
• It depends on the shape of the tree.
• A tree in which each left subtree is null will
effectively be a linked list and its height h will
be equal to its size (number of nodes) n.
• h=n
• Same if each right subtree is null
• But this doesn’t look like a very efficient tree…
Can we do better?
(diagram: a degenerate tree in which every left link is null – effectively a linked list)
(We’ll need to know about balanced trees)
• A tree is said to be balanced if the height of the two subtrees of every
node never differ by more than 1
• NB the height of an empty tree is undefined… but for the purposes of
balancing trees, assume the height of an empty tree is -1
What is the best we can do?
• The best (most efficient) tree of a given size is the one with the least height.
• It is easier to turn the problem round and ask:
• How many nodes can we fit into a tree with a given height?
• a tree of height h=0 has up to n=1 nodes (1 root);
this is the same as saying: n = 2^0
• a tree of height h=1 can have up to n=3 nodes (1 root + 2 children);
this is the same as saying: n = 2^0 + 2^1
• a tree of height h=2 can have up to n=7 nodes (1 root + 2 children + 4 grandchildren);
this is the same as saying: n = 2^0 + 2^1 + 2^2
• etc…
Tree height
• A tree of height h can contain up to 1 + 2 + 4 + ... + 2^h nodes

  h:  0  1  2
  n:  1  3  7

• This is a geometric progression and so n (# of nodes) = 2^(h+1) − 1
• A tree is balanced if the height of the two subtrees of every node never differ
by more than 1
• If we have a balanced tree with n nodes and height h:
• 2^h − 1 < n ≤ 2^(h+1) − 1
• So 2^h < n + 1 ≤ 2^(h+1)
• If we take logs we get: h < log_2(n+1) ≤ h + 1
• So we know now: h < log_2(n+1) and h + 1 ≥ log_2(n+1)
• So: h < log_2(n+1) and h ≥ log_2(n+1) − 1
• Let’s simplify this a bit: h is roughly log_2(n+1)
• Once n gets big, there’s not much point distinguishing between n and n+1.
So let’s ditch the +1s
• h is about log_2 n – now we can calculate the best possible height
What is the height of a tree with a million nodes? [n = 1,000,000]
Remember:
• 2^h < n + 1 ≤ 2^(h+1)
• h is approximately log_2 n
• log_2 1,000,000 = 19.931568569
• So h is approximately 20 (to nearest whole number)
• The best (most efficient) tree of a given size is the one with the least height.
• So a tree with 1,000,000 nodes can have any height between 20
(minimum, balanced) and 1,000,000 (maximum)
• We ideally want our tree to be much closer to height 20 than
1,000,000. A tree of height 20 will be much faster to search than one
of height 1,000,000.
• Unfortunately, if we start with 1,000,000 keys that have been sorted into
order, and then build a tree by adding them one at a time, we will end up
with a tree that has height 1,000,000. This is a problem when we come to
binary search trees
Complete trees
• A complete binary tree is a binary tree with all levels except the last
level completely filled, and with all the leaves in the last level to the
left side.
• Perfectly balanced, except for the bottom level.
(diagram: a complete tree with N = 16 nodes (height = 4))
Complete trees
• A complete binary tree is a binary tree with all levels except the last
level completely filled, and with all the leaves in the last level to the
left side.
• Every complete binary tree is balanced… but this is not necessarily
true the other way around.
• Below, which of the balanced trees are also complete? Which are not
complete?
• Can you understand why the unbalanced trees cannot be complete trees?
• (Think about it for a bit…)
Tree-height questions
• Suppose we have nodes in random order and we build a tree by
inserting them one at a time, how high will the tree be on average?
• answer is about 1.38 log2n for n nodes
• (proof – see Weiss’s “Data Structures ..”, Theorem 19.1, p. 704)
• means a random tree with 1,000,000 nodes will have height about 27 (which
is good).
• Suppose that you want to build a binary search tree out of nodes that
are (or might be) already in order, can you do it in such a way that the
tree is guaranteed not to be high?
Today’s lecture
• Using Trees for searching data and keeping it sorted
• Binary tree and Binary Search tree recap
• Tree height and number of nodes
• Balanced trees and complete trees
• Constructing balanced trees
• AVL trees
• Red-Black trees
• [2-3 trees *]
• [B-Trees *]
Are these trees balanced or not? Are these BSTs? Are these AVL trees?
Reminder:
A tree is said to be balanced if the height of the two subtrees of every
node never differ by more than 1
NB the height of an empty tree is undefined… but for the purposes of
balancing trees, assume the height of an empty tree is -1
Example adapted from Tomas Petricek’s material
Constructing balanced BSTs
• Suppose that you want to build a binary search tree out of nodes that are
(or might be) already in order, can you do it in such a way that the tree is
guaranteed not to be high?
• (in other words, keep it as balanced as possible?)
• There are ways to build a binary search tree so that it stays pretty*
balanced whatever order the nodes are added.
• 2-3 trees
• AVL trees
• Red-black trees
• B-trees
Example: AVL Trees
• There are ways to build a binary search tree so that it stays pretty*
balanced whatever order the nodes are added.
• One way is to build an AVL tree (named after the inventors Adelson-Velskii and Landis).
• An AVL tree has the following properties
• the root's left and right subtrees differ in height by at most 1
• the root's left and right subtrees are both AVL trees
• An AVL tree can be built by adding nodes and then rebalancing
when it loses the AVL property.
AVL Trees
• in AVL trees we distinguish between nodes that are balanced, or have
a slight left tilt, or a slight right tilt
• a tilt is introduced by height changes
• slight tilt: height differs by 1
• tilt that leaves the tree unbalanced: height differs by >1
• fix tilt problems by "tree rotation" (of subtrees)
Rotations
https://www.cs.odu.edu/~zeil/cs361/latest/Public/avl/index.html
Rotations
https://en.m.wikipedia.org/wiki/Tree_rotation
Example: Red-Black trees
There are ways to build a binary search tree so that it stays pretty*
balanced whatever order the nodes are added.
One way is to build an Red-Black tree. A Red-Black tree has the
following properties
• A node can be red or black
• normal (black) nodes and overflow (red) nodes
• Root node is always black
• Null leaves of the tree are black
• If a node is red, then its children are black
Red-Black Trees (continued)
• in red-black trees we need to store data to distinguish between the
normal (black) nodes and overflow (red) nodes
• additional data: one bit (red/black)
• red nodes are only allowed to occur underneath black nodes
• invariant: any path from the root to an empty tree passes through
exactly the same number of black nodes: black height
• fix problems by rotations (of subtrees)
CLR p.309: “We call the number of black nodes
on any simple path from, but not including, a
node x down to a leaf the black-height of the
node, denoted bh(x).
… We define the black-height of a red-black
tree to be the black-height of its root. “
A red black tree example
• normal=black, overflow=red
(diagram: a red-black tree containing the keys 2, 4, 5, 6, 7, 8, 9, 11, 14,
with black root node 7)
RULES for red-black trees:
• A node can be red or black
• Root node is always black
• Null leaves of the tree are black
• If a node is red, then its children are black
• red nodes are only allowed to occur underneath black nodes
• invariant: any path from the root to an empty tree passes through
exactly the same number of black nodes
• (black height)
Insertion into a red-black tree
• new data is entered into the tree as an overflow node (red)
• so black height not affected (yet)
• if red node appears underneath red node (violation of invariant) this
part of the tree is rotated, pushing a red node up
• several cases to be considered, we will look through two of those in the slides
• https://www.youtube.com/watch?v=5IBxA-bZZH8 summary: *
• if red node reaches top, it is turned black
Insertion into a red-black tree
• 0: Z = root -> colour Z black
• 1: Z.uncle = red -> flip colour of parent, uncle and grandparent
• 2: Z.uncle = black (triangle – Z is a Left child and parent is a Right child, or
vv) -> rotate so Z becomes the parent
• 3: Z.uncle = black (line - Z is a Left child and parent is Left child, or both
Right children) -> rotate so Z’s parent becomes the grandparent, then
recolour as needed
• NB Michael Sambol gives the pseudocode from CLR
CONFUSED? See p. 316 of CLR for an alternative representation of the cases
1-3, and the Rob Edwards video at the end for another alternative
representation
Insertion into a red-black tree
• new data is entered into the tree as an overflow node (red)
• so black height not affected (yet)
• if red node appears underneath red node (violation of invariant) this
part of the tree is rotated, pushing a red node up
• several cases to be considered, we will look through two of those in the slides
• https://www.youtube.com/watch?v=5IBxA-bZZH8 summary: *
• if red node reaches top, it is turned black
A red black tree example
• normal=black, overflow=red
(diagram: the same red-black tree – keys 2, 4, 5, 6, 7, 8, 9, 11, 14, black root 7)
(rules sidebar as on the previous example slide)
Example: Inserting 1
(diagram: 1 is inserted as a red leaf under 2; the black height is unaffected)
(rules sidebar as before)
Example: Inserting 10, a red node under 9
(diagram: 10 is inserted as a red node under the red node 9 – a red-red
violation that must now be repaired)
(rules sidebar as before)
Step 1: repair tree (that was) rooted at 8
rotate so Z’s parent becomes the grandparent, then recolour as needed
(diagram: after the rotation, 9 is the parent of 8 and 10)
(rules sidebar as before)
Step 2: repair tree rooted at 7
flip colour of parent, uncle and grandparent
(diagram: the colours of the parent, uncle and grandparent are flipped,
pushing a red node up the tree)
(rules sidebar as before)
Final step: blacken root node
Colour Z black
(diagram: the root is recoloured black; the tree now satisfies all the
red-black rules)
(rules sidebar as before)
(CLR pp316-7)
Red-black trees
Fixes, for red-red violations
Case 1: z’s uncle y is red -> flip colours above z
Case 2: z’s uncle y is black and z is a right child
-> left rotate at z (NB this leads into Case 3)
Case 3: z’s uncle y is black and z is a left child
-> right rotate (at parent of z)
Finish by recolouring root black as needed
Observations
• If the black height of a red-black tree is h, then...
• ...its ordinary height is O(h)
• minimum number of nodes in a tree of black height h: when there are
no overflow nodes: 2^h − 1
• maximum number of nodes in a tree: when all black nodes have 2 red
children: 4^h − 1
• thus the ordinary height of a red-black tree with k elements is
between log_2 k and 2 · log_2 k, thus it is O(log k).
Comparison: AVL vs Red-Black
• AVL trees are more balanced than red-black trees
• But AVL trees are slightly more costly to administer
• Minimum number of nodes in a tree of height 20 (and 40):
• AVL: 17,710 (267,914,295)
• Red-Black: 4,090 (4,194,298)
• Which one to use?
• If it is more important to keep the tree balanced, use AVL
• Red-black trees are a nice compromise if you want to keep the tree fairly balanced without
spending lots of extra time doing rotations
• In other words, do you want to incur extra performance cost while building the tree (AVL trees) or
while using the tree (red-black trees)?
* = Non
examinable
Example: 2-3 trees
There are ways to build a binary search tree so that it stays pretty*
balanced whatever order the nodes are added.
One way is to build a 2-3 tree.
* = Non
examinable
2-3 tree
Allow 1 or 2 keys per node.
・2-node: one key, two children.
・3-node: two keys, three children.
Symmetric order. Inorder traversal yields keys in ascending order.
Perfect balance. Every path from root to null link has same length. (how to maintain?)
(diagram: a 2-3 tree with 2-node root M; its left child is the 3-node E J with
subtrees A C (smaller than E), H (between E and J) and L (larger than J);
its right child is R with subtrees P and S X)
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
Insertion into a 2-3 tree
Insertion into a 2-node at bottom.
・Add new key to 2-node to create a 3-node.
(diagram: insert G – the leaf H becomes the 3-node G H; the rest of the
tree is unchanged)
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
Insertion into a 2-3 tree
Insertion into a 3-node at bottom.
・Add new key to 3-node to create temporary 4-node.
・Move middle key in 4-node into parent.
・Repeat up the tree, as necessary.
・If you reach the root and it's a 4-node, split it into three 2-nodes.
(diagram: insert Z – the leaf S X becomes a temporary 4-node S X Z; the
middle key X moves up into the parent R, giving R X with children S and Z)
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
https://www.youtube.com/watch?v=bhKixY-cZHE
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
2-3 tree: implementation?
Direct implementation is complicated, because:
・Maintaining multiple node types is cumbersome.
・Need multiple compares to move down tree.
・Need to move back up the tree to split 4-nodes.
・Large number of cases for splitting.

fantasy code:
public void put(Key key, Value val)
{
   Node x = root;
   while (x.getTheCorrectChild(key) != null)
   {
      x = x.getTheCorrectChild(key);
      if (x.is4Node()) x.split();
   }
   if      (x.is2Node()) x.make3Node(key, val);
   else if (x.is3Node()) x.make4Node(key, val);
}

“Beautiful algorithms are not always the most useful.” — Donald Knuth

Bottom line. Could do it, but there's a better way.
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
How to implement 2-3 trees with binary trees?
Challenge. How to represent a 3-node?
(diagram: the 3-node E R)
Approach 1: regular BST.
・No way to tell a 3-node from a 2-node.
・Cannot map from BST back to 2-3 tree.
Approach 2: regular BST with "glue" nodes.
・Wastes space, wasted link.
・Code probably messy.
Approach 3: regular BST with red "glue" links.
・Widely used in practice.
・Arbitrary restriction: red links lean left.
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
Example: B-trees
There are ways to build a binary search tree so that it stays pretty* balanced
whatever order the nodes are added.
One way is to build a B-tree. B-trees are:
• A multi-way tree structure suitable for external (disk file) lookup.
• Uses large nodes which can be a disk block
• Each node has alternating pointers and data items
• All nodes but the root node are always at least half full.
• leaf nodes are all at the same level
* = Non
examinable
Inserting in a B-tree
• New items are always inserted in leaf nodes
• When a leaf node fills up it is split in two and a new entry is made in
the layer above.
• When the root node fills up it is split and the tree grows in height.
• e.g. https://www.youtube.com/watch?v=coRJrcIYbF4
* = Non
examinable
B-tree example
(diagram: root node 16 | 42, with leaf nodes 3 7 12, 18 24 27 35, and 52 63)
* = Non
examinable
B-tree after adding 30
(diagram: the full leaf 18 24 27 35 is split and 27 moves up, giving root
16 | 27 | 42 with leaves 3 7 12, 18 24, 30 35, and 52 63)
* = Non
examinable
Searching in a B-tree
・Start at root.
・Find interval for search key and take corresponding link.
・Search terminates in external node.
searching for E
* K
follow this link because
E is between * and K
* D H
K Q U
follow this link because
E is between D and H
* B C
D E F
H I J
K M N O P
Q R T
U W X
search for E in
this external node
Searching in a B-tree set (M = 6)
Taken from https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
* = Non
examinable
Strengths of B-trees
• Provide fast expandable external lookup for strings.
• Cannot become seriously unbalanced.
• The large nodes mean that entries can be found with few disk
accesses.
• The root node, and possibly its immediate children, can easily be kept
in memory
Further additional reading
• Cracking the coding interview: 150 programming questions and solutions (McDowell 2013) –4.1, also Approach V on
p. 34
• Data structures & problem solving using Java (Weiss 2010) – Ch 19, 18, 20.6
• Introduction to algorithms (Cormen et al 2009) – Ch 12, 13, 18
• https://www.youtube.com/watch?v=cv_KDQzZpHs [Useful summary of binary search trees]
• https://algs4.cs.princeton.edu/lectures/keynote/33BalancedSearchTrees.pdf
• https://www.cs.odu.edu/~zeil/cs361/latest/Public/avl/index.html [AVL trees]
• https://www.youtube.com/watch?v=qvZGUFHWChY [Red-Black trees]
• And follow on videos: https://www.youtube.com/watch?v=5IBxA-bZZH8 etc
• https://www.youtube.com/watch?v=v6eDztNiJwo [Full worked e.g. of Red-Black trees, Rob Edwards]
• https://en.m.wikipedia.org/wiki/Tree_rotation
Glossary
• binary tree - linked data structure where each node links to 2 other nodes (left & right, could be empty)
• leaf - a tree node with empty child nodes (no links to left or right nodes)
• root – the first node in a binary tree
• binary search tree - binary tree where each left child node’s value is < its parent node’s value, and each right child
node’s value is > its parent node’s value
• traversal – visiting every node of a tree in turn
• height – number of levels in a binary tree (or length of longest path from root to a leaf)
• size – number of nodes in a tree
• balanced - a tree for which height of the two subtrees of every node never differ by more than 1
• complete - a binary tree with all levels except the last level completely filled, and with all the leaves in the last level to
the left side.
• AVL trees / Red-black trees / 2-3 trees / B-trees – self-balancing binary search tree (structures optimised to stay as
balanced as possible)
• Black-height (red-black trees) - the number of black nodes on any simple path from (but not including) the root down
to a leaf
• Rotation – operation conducted on a BST to rebalance it
Today’s lecture
• Using Trees for searching data and keeping it sorted
• Binary tree and Binary Search tree recap
• Tree height and number of nodes
• Balanced trees and complete trees
• Constructing balanced trees
• AVL trees
• Red-Black trees
• [2-3 trees *]
• [B-Trees *]
* = Non-examinable content
* We won’t cover details of 2-3 trees or B-Trees. Slides on 2-3 trees and B-Trees are kept in
this powerpoint for reference only, if you are interested. Details of 2-3 trees and B-Trees
are not examinable – you are expected to know that these are two other ways of
constructing balanced trees, but not how they work.
O-notation
COMP5180 Algorithms, Correctness and Efficiency
Anna Jordanous
a.k.jordanous@kent.ac.uk
Today’s lecture
• Motivation for why O-notation is important
• O-notation for classifying inputs
• Objections…
• Formalising
• Examples
• Dealing with earlier objections
• Manipulating and Computing with O-notation
• Alternatives to (Big) O-notation
O-notation: why it’s important
• you have seen O-notation before, e.g. in COMP3830
• in itself, O-notation has nothing to do with algorithms
• what it does: classify (and order) mathematical functions
• sometimes we encounter mathematical functions of which we
have limited knowledge
• E.g. exact input values
• that knowledge may be enough to classify the function, and the
classification allows us to predict the function’s values, with some
degree of accuracy
• the runtime of a program, given certain characteristics of its input, is an example of such a limited-knowledge function
Classification Goals, General Ideas
• we want to have a relatively simple (but ultimately
mathematically precise) way to classify and
compare functions, and permit for an element of
uncertainty
• tools to cope with uncertainty for classification
1. provide a way of ignoring outliers, e.g. by only considering the behaviour of the function for all but finitely many inputs
2. allow function values to vary a little bit, e.g. by a
constant factor
What functions do we use O-notation for?
• (sub-) programs that process data
• have access to that data, are likely not to ignore the
data, may often need to read all of it
• they take some time to compute: either computing an output or performing some responsive behaviour (or both)
• this gives a function from data to time
• sub-programs: procedures, methods, code-blocks
• we are not talking about programs that generally
run forever, like web browsers, operating systems,
read-eval-print loops
• though even these will have subtasks of this ilk
Comparing parametric programs
• we can view the performance of a parametric program* as a mathematical function: mapping program input to the performance data of the program when run on that particular input
  * a “parametric program” is a program that takes parameters
for example, instead of asking:
what is the result of sorting [4,76,1,-8,987,3,2]?
we ask:
how long does it take to sort [4,76,1,-8,987,3,2]?
Classifying inputs
so, our runtime function maps
input data…
…to the time it takes to run the
program on that data
• Calculating exact runtimes per input for a mathematical
function is not tremendously useful when talking about
the program’s behaviour, because…
• we need to supply the full input (ok-ish for length 7, but…)
• so, to know the runtime of bubblesort on a certain array of
100,000 elements we need to supply all that data, all those
100,000 elements
Classifying inputs
so, our runtime function maps
input data…
…to the time it takes to run the
program on that data
• Calculating exact runtimes per input for a mathematical
function is not tremendously useful…
• instead, we generalise – we will throw inputs that exhibit
sufficiently similar behaviour into one category, and then
look at the runtime of that category
• e.g., instead of supplying the 100,000 elements we just supply
that number: 100,000
• assuming/hoping/claiming that bubblesort shows similar
behaviour for all arrays of that length.
Objections
So we should treat runtime behaviour as a function...?
Some objections one can make to the previous points:
1. run time depends on your hardware, i.e. the computer
you run the program on
2. running the same program on the same input does not
give you the exact same runtime every time
3. programs can be non-deterministic, e.g. when using
concurrency or random numbers
• For now, we ignore these objections…
• Later we’ll see whether or to what extent our
computational model helps with these issues
Today’s lecture
• Motivation for why O-notation is important
• O-notation for classifying inputs
• Objections…
• Formalising
• Examples
• Dealing with earlier objections
• Manipulating and Computing with O-notation
• Alternatives to (Big) O-notation
Format of O-notation
• generally: O(f(n1, …, nk)) where the ni are "measurements" of our inputs and f is the growth function;
• most of the time:
  • just one measurement (k = 1)
  • often call it "n", most commonly: size of input
• the growth function f
  • describes how the program runtime changes for different n values
  • is put together using various arithmetic functions, e.g. addition, multiplication, exponentiation, logarithm, factorial, constants
Formally
O(f(n)) is a set of growth functions:
O(f(n)) = { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
NB remember: the · dot notation represents multiplication
in words:
O(f(n)) is the set of all functions g,
such that for all sufficiently large inputs n,
g(n) is at most c · f(n), for some fixed constant c
Let’s unwind this:
• O-notation classifies functions, so each class O(blah) is a set of functions that match that “blah” description
  • E.g. if O(blah) is O(log(n))
• blah itself is a function, and always belongs to that class; generally: f ∈ O(f)
• the variable n in the description ranges over all values bigger than some fixed number n0 – this is dealing with outliers
• a function g is in the set O(f) if and only if g(n) is bounded by c·f(n), where c is a fixed constant – this is dealing with uncertainty
Common Examples (from fast to slow)
O(1) [no growth - having an upper bound]
O(log(n)) [logarithmic growth]
O(n) [linear growth]
O(n*log(n)) [loglinear growth]
O(n²) [quadratic growth]
O(n³) [cubic growth]
O(2ⁿ) [exponential growth]
O(3ⁿ) [also exponential growth, but genuinely worse]
O(n!) [factorial growth]
O(f(n)) = { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
Example (i)
• take the set O(1)
• This is the set of functions that take ‘1’ operation
  • no matter how big n is
• plugging in the definition for O(f(n)) gives us in this case:
  • { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c }
• so these are all "growth functions" that have a fixed
upper bound
• for describing runtime performance people often read
this as constant time, but it really means: bounded time
• examples of O(1) time: assignments, print statements (of
fixed strings), finite sequences of such statements,...
O(f(n)) = { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
Example (ii)
• take the set O(n²)
• This is the set of functions that grow at a rate of (roughly) n²
  • besides n² itself the class contains quadratic growth functions that are bounded by c·n², e.g. n², or c·n², or 6n² + 8·log₂(n)
    • if c = some constant (c > 0)
  • also functions that "occasionally" show quadratic growth, e.g. g(n) = if (even(n)) then n² else n
  • ...and functions that grow strictly more slowly than quadratic, e.g. g(n) = 3n
  • technically, O(n) ⊆ O(n²) etc.
O(f(n)) = { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
Follow on example, positive
• let’s show that (n ↦ 2n² + 4) ∈ O(n²)
• (NB we use ↦ [mapsto] to emphasise that we’re working with a function)
• we need to show 2n² + 4 ≤ c·n² for sufficiently large n and some constant c
• choosing c = 4 turns the inequation into 2n² + 4 ≤ 4n², which we can simplify to 2 ≤ n²
• this is not always true, just almost always – it is true if n ≥ 2, so we can set the cutoff constant n0 that deals with outliers to n0 = 2
O(f(n)) = { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
Follow on example, negative
• n³ ∉ O(n²)
• otherwise, n³ ≤ c·n² for all sufficiently large n, and a fixed c
• in particular, we need this to hold for n = c + 1
• this would give us the goal (c+1)³ ≤ c·(c+1)²
  • (c+1)·(c+1)² ≤ c·(c+1)²
  • c + 1 ≤ c
  • (Remember in our formal definition, c > 0)
• we can simplify this to the equivalent 1 ≤ 0, which is just false
Earlier objections…
• we can discharge some issues we had raised earlier
w.r.t. this performance model
1. hardware differences* will normally be within a
constant factor
*let’s just ignore things like quantum computing for now…
2. variations between different runs are small, and within
a small constant factor
• some remaining issues are not as easily dismissed
A. Non-determinism? What non-determinism?
B. When classifying a group of inputs together (e.g. by
input size) they may not be within a constant factor in
behaviour (e.g. bubble sort, quicksort)
Order?
• In a convoluted way, this also gives us a notion of order between those functions:
  • g ≤ f ↔ g ∈ O(f)
• this order is reflexive and transitive
• but it is not guaranteed to be antisymmetric:
  • different functions f and g could exhibit both f ≤ g and g ≤ f, in which case…
  • those functions are equivalent in this notation, they grow at approximately the same rate
Hard-to-compare example
• Imagine two functions for the same task:
  • randomFunction1(n, m) runs in O(n·m²); randomFunction2(n, m) runs in O(n²·m)
• here we have two measurements of our inputs, n and m...
• and the first growth function is linear in n and quadratic in m
• and for the second it's the other way around
• which should we use…?
(Note: if the measurement is cheap to make we could
create a hybrid algorithm that first makes the
measurement and based on that then decides which
algorithm to deploy)
Today’s lecture
• Motivation for why O-notation is important
• O-notation for classifying inputs
• Objections…
• Formalising
• Examples
• Dealing with earlier objections
• Manipulating and Computing with O-notation
• Alternatives to (Big) O-notation
Manipulating O-notation
• to describe the O-notation characteristic of a
growth function f we often want the simplest
growth function g, such that O(f)=O(g)
• this would give us a clear (possibly unique)
identification of the "blob" generated by f, its
"name" so to speak
• this involves:
  • ordinary algebraic manipulation
  • eliminating constant factors
  • eliminating slower-growing summands
Eliminating constant factors, examples
• O(3x²) = O(x²)
  • NB remember: the · dot notation represents multiplication
  • justification: if g is bounded by c · 3x² then it is also bounded by (3·c) · x²
• O(log_a(n)) = O(log_b(n)) – the two differ only by a constant factor
  • note: this is why we typically just say log(n) inside O-notation
• similarly, O(3^(x+1)) = O(3^x), because [algebraic rule] 3^(x+1) = 3 · 3^x, and 3 is a constant factor
• however, O(x^(x+1)) ≠ O(x^x)
  • we can do the same algebraic manipulation (x^(x+1) = x · x^x) but x is not a constant factor
Eliminating slower-growing summands
• if f ∈ O(g) then O(f + g) = O(g)
• why does this work?
  • g ≤ f + g ≤ c·g + g = (c+1)·g
  • hence: O(g) ⊆ O(f + g) ⊆ O((c+1)·g)
  • now c is a constant, hence so is c+1, and we already know that we can eliminate constant factors, therefore O(g) = O((c+1)·g)
• example: O(5n² + 8n) = O(n²)
• generally, polynomials can be reduced to the term with largest degree
Where do all of these operations
come from?
• program analysis!
• sequences of statements: cost is the sum of the costs of
all the statements
• loops: running a statement k times is k times as
expensive as running it once; nested loops -> polynomials
• divide-and-conquer searches give rise to logarithmic costs: you can split a search space of size n only log_k(n) times into k sections of equal size and continue the search within one section only
• exhaustive trial-and-error searches often give exponential performance (worst case)
Computing with O-notation
• this arises when we analyse code, e.g. if runtime for
statementA is O(a) and the runtime for statementB
is O(b) then the runtime for the sequence
statementA
statementB
is: O(a) + O(b) which is O(a+b).
• note that most of the time we will be able to
simplify O(a+b)
• multiplications occur when we look at loops
Cost of if-then-else
We can assign a O(...) cost to an if statement
if (cond) stA; else stB;
...if we know the cost of its parts:
If cond can be performed in O(f(n)), stA in O(g(n)) and
stB in O(h(n)) then the whole thing is
O(f(n)+max(g(n),h(n))).
Note: very often, checking the condition is O(1) and
then we can simplify the expression. Even more
often, we can also simplify O(max(g(n),h(n))).
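For instance, a minimal Java sketch of this rule (the list-scanning method here is made up for illustration, not module reference code):

// Hypothetical example: cond is O(1), branch A is O(1), branch B is O(n).
static int countOrZero(java.util.List<Integer> list, int target) {
    if (list.isEmpty()) {            // condition: O(1)
        return 0;                    // stA: O(1)
    } else {                         // stB: a linear scan, O(n)
        int count = 0;
        for (int x : list)
            if (x == target) count++;
        return count;
    }
}
// Cost: O(f(n) + max(g(n), h(n))) = O(1 + max(1, n)) = O(n)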
Loops
cost of:
for(int i=0; i<n; i++) statement
We run statement n times. If the cost for (a single
execution of) statement is O(f) then the overall cost
is n×O(f) which we can express as O(n×f).
Note: the cost for the loop infrastructure (increasing
and checking the i) has been ignored here. This is
safe (for n>0), since all statements are at least O(1).
• this is not automatic though for for-each loops, because
their infrastructure is user-programmable
Method Calls and Recursion
• cost of a method call is the cost of its method body
("plus 1", to tax the passing of parameters and results)
• for recursive methods this is more subtle:
• define a performance function T for the method
• ...and define it recursively at the places where the method is
recursive
• ideally, try to find a non-recursively-defined version that
satisfies the recurrence equations of T; can mix guessing with
equation-solving
• example: T(2n) = 2·T(n) + c·n
• solution: T(n) = O(n · log(n))
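As a quick sketch of why that solution works (a hedged derivation, assuming n is a power of two; note T(2n) = 2T(n) + cn is equivalent to T(n) = 2T(n/2) + (c/2)n):

T(n) = 2·T(n/2) + (c/2)·n
     = 4·T(n/4) + 2 · (c/2)·n
     = … = 2^k · T(n/2^k) + k · (c/2)·n

Setting k = log₂(n) gives T(n) = n·T(1) + (c/2)·n·log₂(n), which is in O(n · log(n)).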
Example, bubblesort loop
for (int i = 0; i < K; i++) {
    for (int j = 1; j < M; j++) {
        if (a[j] < a[j-1]) {
            int aux = a[j];
            a[j] = a[j-1];
            a[j-1] = aux;
        }
    }
}
inside-out analysis, inner part
if (a[j] < a[j-1]) {
    int aux = a[j];
    a[j] = a[j-1];
    a[j-1] = aux;
}
This part is O(1): 3 assignments, each has O(1) cost.
O(1) + O(1) + O(1) = O(3) = O(1). (3 is a constant factor)
The condition is O(1) [not a method call], so the cost of the if-statement is O(1 + max(1, 0)) = O(1 + 1) = O(2) = O(1)
Middle part
for (int j = 1; j < M; j++) {
    if (a[j] < a[j-1]) {
        int aux = a[j];
        a[j] = a[j-1];
        a[j-1] = aux;
    }
}
The loop body is run M−1 times; the body has (as we have seen) O(1) cost, so the overall cost is O(M−1) = O(M).
Example, bubblesort loop
for (int i = 0; i < K; i++) {
    inner part
}
We know the cost of the inner part is O(M). We run
the green loop exactly K times. Overall cost: O(K×M).
For normal bubblesort, we have K=M, where K is the
length of the array, giving us a runtime quadratic in
the length of the array.
Today’s lecture
• Motivation for why O-notation is important
• O-notation for classifying inputs
• Objections…
• Formalising
• Examples
• Dealing with earlier objections
• Manipulating and Computing with O-notation
• Alternatives to (Big) O-notation
Variations on big-O
• O(f) ["big O"] gives a class of growth functions for which c · f is an upper bound
  • { g | ∃c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) ≤ c · f(n) }
  • (we already saw this)
• there are various other notations around, e.g.
  • Ω(f) is the dual (c · f is a lower bound), or g ∈ Ω(f) ⟷ f ∈ O(g);
  • Θ(f) has f as both upper and lower bound, i.e. Θ(f) = O(f) ∩ Ω(f)
• while O(f) provides an upper bound, there is also:
  • o(f) ["little o"] for providing a strict upper bound:
    • { g | ∀c. ∃n0. c > 0 ∧ n0 > 0 ∧ ∀n. n ≥ n0 → 0 ≤ g(n) < c · f(n) }
  • e.g. Big-O vs little-o: 2n² ∈ O(n²), but 2n² ∉ o(n²), while 2n ∈ o(n²)
How problems increase
N      log(N)   N·log(N)   N²        2^N            N!
1      0.00     0.00       1         2              1
2      0.69     1.39       4         4              2
3      1.10     3.30       9         8              6
4      1.39     5.55       16        16             24
5      1.61     8.05       25        32             120
10     2.30     23.03      100       1024           3628800
50     3.91     195.60     2500      1.1259E+15     3.04141E+64
100    4.61     460.52     10000     1.26765E+30    9.3326E+157
200    5.30     1059.66    40000     1.60694E+60    (overflow)
1000   6.91     6907.76    1000000   1.0715E+301    (overflow)
• Lifetime of the universe about 4E+10 years (40 billion years) = approx 1E+18 seconds. At 5 Peta (5E+15) FLOPS we get 5E+33 instructions per universe lifetime
• With a graph of 200 nodes, an algorithm taking exactly exponential (2^N) time means we need about 3E+26 universe lifetimes to solve the problem.
Further additional reading
• Introduction to algorithms (Cormen et al 2009) – Ch 2, 3
• Data structures & problem solving using Java (Weiss 2010) – Ch 5
• Algorithms in Java (Sedgewick 2003) – Ch 2
• (Algorithms (Sedgewick and Wayne, 2011) – p. 206-207)
• (Cracking the coding interview (McDowell 2013) – questions
relating to finding optimally efficient solutions)
Glossary
O-notation – a notation to classify and order performance of (sub)programs using
mathematical functions to specify an upper bound for growth of a (sub)program
upper bound – maximum limit
non-deterministic - where the program does not necessarily run the same every time e.g.
when using concurrency or random numbers
parametric program - a program that takes parameters
growth function – a function which describes how a (sub)program runtime changes for
different input sizes (different ‘n’)
input size - number of items given as input to a (sub)program
sub-programs - procedures, methods, code-blocks, etc
reflexive – a property/relation which relates an input to itself, i.e. x relates to x,
e.g. equality: 2 = 2
transitive – a property/relation where, if x relates to y and y relates to z, then x relates to z,
e.g. less than or equal to: if 2 <= 3 and 3 <= 4 then 2 <= 4
symmetric – a property/relation where, if x relates to y then y relates to x,
e.g. addition: 2 + 4 links 2 and 4 in the same way as 4 + 2
asymmetric / anti-symmetric – opposite of symmetric,
e.g. subtraction: 2 – 4 does not link 2 and 4 in the same way as 4 - 2
Today’s lecture
• Motivation for why O-notation is important
• O-notation for classifying inputs
• Objections…
• Formalising
• Examples
• Dealing with earlier objections
• Manipulating and Computing with O-notation
• Alternatives to (Big) O-notation
Graphs
COMP5180 Algorithms, Correctness and Efficiency
Anna Jordanous
a.k.jordanous@kent.ac.uk
(Not the chart type of graph… the nodes-and-edges type of graph)
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Graphs – a recap
• A graph is a data structure that consists of a set of nodes and a set of edges that connect the nodes together
• What you’ve seen before:
• Graph: Definitions
• Representing Graphs
• Graph Traversal
• Breadth-First Search
• Depth-First Search
• Weighted Graph Algorithms
• Prim’s Algorithm
• Kruskal’s Algorithm
Recap
• Graph G = (V, E):
  • set of nodes V (alt: points, vertices)
  • set of edges E (alt: arcs, connections, links)
• All edges have a node at each end: e_{i,j} = (v_i, v_j)
Some possible variations of graphs
• labelled/unlabelled; weighted edges
• directed/undirected
• loops/cycles; multi-edges
Nodes have degrees
• (indegree/outdegree in a directed graph)
Recap - Representing Graphs
• Representations for internal computer data
structures and for common file formats
• Also required for mathematical reasoning
• The method we use will depend on
• Type/characteristics of graph
• Speed of access required
• Memory taken up
• Maintenance (frequency of insertions/deletions)
• Traversal needs
• Programming language restrictions
• Adjacency lists vs adjacency matrices
• (more later)
Adjacency list
List of reachable neighbours:
A: C, F
B: D, E
C: D, F
D: E
E: C
Edges: A–C, A–F, B–D, B–E, C–D, C–F, D–E, E–C
Adjacency matrix
• Matrix indicates nodes that connect
• In this case we have a binary matrix
  • (no multi-edges: 1 or 0 edges from node x to node y)
    1 2 3 4
  1 0 1 1 0
  2 1 0 0 1
  3 0 0 0 1
  4 1 0 0 0
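As a hedged sketch (illustrative names, not the module's reference code), an adjacency-list representation in Java might look like:

import java.util.*;

// A minimal sketch of an adjacency-list graph (directed, unweighted).
class Graph {
    private final Map<Integer, List<Integer>> adj = new HashMap<>();

    void addEdge(int from, int to) {
        adj.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        adj.computeIfAbsent(to, k -> new ArrayList<>()); // ensure node exists
    }

    List<Integer> neighbours(int v) {
        return adj.getOrDefault(v, List.of());
    }
}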
Graph traversal
• Start at one vertex of a graph (the start vertex)
• Process the data contained at that vertex
• Move along an edge to process a neighbour
• To avoid entering a repetitive cycle, we need to
mark each vertex as it is processed.
• (If edges are represented separately, we also mark
the edge as traversed)
• When the traversal finishes, all the vertices we
can reach* from the start vertex are processed.
* NB Some of the vertices may not be processed
because there is no path from the start vertex
What guides our choice of which edge to explore next?
What guides our choice of which edge to explore next? A1
Depth first Search (same principle as for trees)
• Put unvisited vertices on a stack.
• DFS (to visit a vertex s)
  • Mark s as visited.
  • For all unmarked vertices v adjacent to s:
    • DFS(v) (recursively use DFS to visit all vertices v)
What guides our choice of which edge to explore next? A2
Breadth first Search (same principle as for trees)
• Put unvisited vertices on a queue.
• Shortest path: find the path from s to t that uses the fewest edges:
• BFS (from source vertex s)
  • Put s onto a FIFO queue. Repeat until the queue is empty:
    • remove the least recently added vertex v
    • add each of v's unvisited neighbours to the queue, and mark them as visited.
• Property. BFS examines vertices in increasing distance from s.
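A minimal Java sketch of BFS over an adjacency list (reusing the illustrative Graph sketched earlier):

import java.util.*;

// Visits every vertex reachable from start, in breadth-first order.
static Set<Integer> bfs(Graph g, int start) {
    Set<Integer> visited = new HashSet<>();
    Queue<Integer> queue = new ArrayDeque<>();
    visited.add(start);
    queue.add(start);
    while (!queue.isEmpty()) {
        int v = queue.remove();          // least recently added vertex
        for (int w : g.neighbours(v)) {
            if (visited.add(w))          // add() returns false if already visited
                queue.add(w);
        }
    }
    return visited;                      // all vertices reachable from start
}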
Breadth First or Depth First for
graph traversal?
• Breadth first ensures that all the nearest possibilities have been explored
• Depth first keeps going as far as it can,
then goes back to look at other options
• Some applications can use either (e.g.
connectivity test), most applications are best
with one or the other
• In general, travelling around nodes of a
graph is a key operation in many cases.
We’ll explore this more.
Listing all elements – (minimum)
spanning tree
• Can we draw a tree over the
graph that includes all the
nodes in the graph (so we can
list all elements)?
• Yes, if the graph is connected (weights are not needed for this, only for the minimum spanning tree below)
  • (see later re connected)
• Such a tree (that contains all the
vertices of a graph) is called a
spanning tree
• Spanning trees always have N-1
edges, where N is the number of
nodes in the graph
Listing all elements – (minimum)
spanning tree
• A minimum spanning tree
(MST) is a spanning tree
whose weight (the sum of
the weights of its edges) is
no larger than the weight
of any other spanning tree
• You covered algorithms to
find minimum spanning tree
• Prim’s Algorithm
• Kruskal’s Algorithm
Prim’s Algorithm to find MST
(minimum spanning tree)
• (Although named after Prim in 1957 it is
now credited to Jarník in 1930)
• Start from an arbitrary node
• and choose the edge with least weight to
jump to the next node
• O(N2) time complexity
Example of Prim
Start from an arbitrary node (here A) and repeatedly choose the edge with least weight that jumps to a new node:
MST so far        can ‘see’ (cheapest edge to each node not yet in the tree)
{A}               B at 2, E at 3, D at 4
AB                D at 1, E at 3, C at 13
AB, BD            E at 3, C at 7
AB, BD, AE        C at 7
AB, BD, AE, DC    … all nodes reached
Found MST = AB, BD, AE, DC
COST = 2 + 1 + 3 + 7 = 13
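A minimal Java sketch of Prim's algorithm in the simple O(N²) form quoted above, assuming a connected graph stored as an adjacency matrix where 0 means "no edge" (illustrative names):

import java.util.Arrays;

static int primMstCost(int[][] w) {
    int n = w.length;
    boolean[] inTree = new boolean[n];
    int[] best = new int[n];                 // cheapest known edge into the tree
    Arrays.fill(best, Integer.MAX_VALUE);
    best[0] = 0;                             // start from node 0 (arbitrary)
    int total = 0;
    for (int step = 0; step < n; step++) {
        int u = -1;
        for (int v = 0; v < n; v++)          // pick the cheapest node not yet in the tree
            if (!inTree[v] && (u == -1 || best[v] < best[u])) u = v;
        inTree[u] = true;
        total += best[u];
        for (int v = 0; v < n; v++)          // update cheapest edges out of u
            if (!inTree[v] && w[u][v] != 0 && w[u][v] < best[v])
                best[v] = w[u][v];
    }
    return total;
}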
Kruskal’s Algorithm to find MST
(minimum spanning tree)
• Given a weighted undirected graph,
sort the edges according to their weights
and keep selecting edges with smallest
weights that do not form a cycle, until
N-1 edges are selected for the MST.
• Requirements
• We need to take the edges in order
• (a sort needed)
• Sorting is the most expensive operation, so
time complexity is O(E log E) where E is the
number of edges in the graph
• We need to detect cycles
Kruskal Example
Given a weighted undirected graph, sort the edges according to their weights and keep selecting edges with smallest weights that do not form a cycle, until N−1 edges are selected for the MST.
Edges of the example graph, in weight order:
BE 1   Add
AB 2   Add
AE 3   Don’t Add (A–B–E would form a cycle)
CE 5   Add
AC 8   Don’t Add (A–B–E–C would form a cycle)
BD 9   Add (and Finish – that’s N−1 = 4 edges)
(DE 11 never considered)
MST = BE, AB, CE, BD
COST = 1 + 2 + 5 + 9 = 17
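A hedged Java sketch of Kruskal's algorithm, using a simple union-find structure for the cycle check (illustrative names; edges given as {u, v, weight} triples):

import java.util.*;

static int kruskalMstCost(int n, int[][] edges) {
    Arrays.sort(edges, Comparator.comparingInt(e -> e[2])); // sort by weight
    int[] parent = new int[n];                // union-find for cycle detection
    for (int i = 0; i < n; i++) parent[i] = i;
    int total = 0, used = 0;
    for (int[] e : edges) {
        int ru = find(parent, e[0]), rv = find(parent, e[1]);
        if (ru != rv) {                       // no cycle: keep this edge
            parent[ru] = rv;
            total += e[2];
            if (++used == n - 1) break;       // MST complete
        }
    }
    return total;
}

static int find(int[] parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];        // path halving
        x = parent[x];
    }
    return x;
}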
Prim or Kruskal algorithm to find
Minimal Spanning tree?
• Kruskal is greedy
• test the best edges first
• better for sparse graphs, because the time complexity is based on the number of edges
• Prim grows a solution
• build up the solution gradually based on
current best partial solution
• better for dense graphs, because the time
complexity is based on the number of nodes
(more on sparse/dense graphs later)
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Motivations: why do we care
about graphs?
• so far we have seen data structures that have a
particular implementation and a particular purpose
• graphs can be like that too...
• ...but often we use them as an abstraction tool, to
answer questions about "interconnected data”
How/Why are graphs useful to us in applications?
Here are three examples (there are hundreds!)
1. Route finding
2. Network analysis/visualisation
3. Understanding meaningful links between things
Route finding
Application 2: Network
analysis/visualisation
• Social networks are made up of links
between people
• e.g. messages, friends, follow,
connections, groups etc
• We can represent these links as graphs,
with people as the nodes/vertices and the
links between them as edges
Network analysis
e.g. musical social networks
Graph of music genres:
• nodes = genres
• nodes connected if a
musician makes a track in
one genre, then a track
in another genre
• (Results – EDM / Urban /
‘other’)
• Also – who interacts with
who?
https://www.youtube.com/watch?v=BQz2IQ_uHZY
From the Valuing Electronic Music research project
http://valuingelectronicmusic.org
Application 3: meaningful links
If Things have meaningful links connecting them, we
can make graphs of Things (nodes) and links (edges)
Knowledge graphs/
Semantic Web/Linked Data
https://lod-cloud.net/
Internet of Things
http://www.informationage.com/graph-databases-makingmeaning-internet-things-123458606/
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Dense vs. sparse graphs
Dense – a graph with
many connections
between nodes
Sparse – a graph with
only a few connections
between nodes
Rough guide: in a sparse graph we have E = O(N); in a dense (simple) graph we have E = O(N²)
[Figure: a graph from the Valuing Electronic Music research project, http://valuingelectronicmusic.org – still not sparse…?]
Dense vs. sparse graphs
Dense – a graph with
many connections
between nodes
Sparse – a graph with
only a few connections
between nodes
Rough guide: in a sparse graph we have E = O(N); in a dense (simple) graph we have E = O(N²)
When does a graph turn from sparse to dense?
• No objectively correct answer, for a single graph
• We can study the ratio (number of edges) / (number of nodes) of a graph over time
  • if this ratio grows proportional to the number of nodes, the graphs are dense,
  • if it does not go up, we have sparse graphs
  • if in between: grey area
Dense vs. sparse graphs – remember…
Adjacency matrix: matrix operations possible, allowing quick algorithms and easy parallelism (dense graphs)
    1 2 3 4
  1 0 1 1 0
  2 0 0 0 1
  3 0 0 0 1
  4 1 0 0 0
Adjacency list: the adjacency matrix is inefficient for sparse graphs; an adjacency list is better
  1: 2, 3
  2: 4
  3: 4
  4: 1
Alternative graph representations
Many different ways to represent a graph!
Remember our choice of representations is based
on things such as our graph characteristics (e.g.
density/sparseness) and how we want to
implement and analyse our graphs (e.g. speed of
access, memory taken up, maintenance, traversal)
Let’s see a few more options for how to represent a graph:
• Object model
• Object model with redundancy
• Set based
Object model
Nodes and edges objects.
Like adjacency list, but with actual edge objects
storing links (instead of being implicit in the nodes list)
NODES: n1:A, n2:B, n3:C, n4:D
EDGES: e1: n1–n2 label X;  e2: n2–n3 label Y;  e3: n3–n4 label Z;  e4: n4–n1 label Q
• Can hold arbitrary node and edge information
• E.g. the labels could be representative of weights
• Modification of the graph is fairly easy
• But… complex and not intuitive
• Finding all neighbouring nodes of a node is hard
Object model with redundancy
• Here, the edges hold information about which
nodes they connect to
• and the nodes have information about which
edges go in or out
NODES:
  n1:1  in: e4      out: e1
  n2:7  in: e1, e5  out: e2
  n3:2  in: e2      out: e3
  n4:2  in: e3      out: e4, e5
EDGES:
  e1: n1→n2;  e2: n2→n3;  e3: n3→n4;  e4: n4→n1;  e5: n4→n2
As object model except
• advantage of quick neighbour finding
• disadvantage of maintaining the redundant structures
Set Based
G = (N, E)
N = {1, 2, 3, 4}
E = {(1,2), (1,3), (2,3), (3,4), (4,1)}
(Only) if we need labels or weights, then we add functions between node/edge sets and a domain of labels/weights:
w: E → ℕ
w(1,2) = 3,  w(1,3) = 4,  w(2,3) = 1,  w(3,4) = 4,  w(4,1) = 1
• Good for mathematical reasoning
• Can be specific about what a graph is
• Does not specify how a graph is stored, only the structure of a graph
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Paths and cycles
• Path = sequence of vertices V0, V1,
… Vn,
• such that each adjacent pair of vertices
Vi, Vi+1, (i.e. vertices next to each other
in a path) are connected by an edge
• NB route or walk = path in a directed graph
• A cycle in a graph G is a closed
path:
• all the edges are different and all the
intermediate vertices are different.
Special type of directed graph:
Directed Acyclic Graphs
• We know what a directed graph is
• Directed graph is also known as a digraph
• A directed, acyclic graph (DAG) is a graph that
contains no directed cycles.
• (in other words: after leaving any vertex v you
can never get back to v by following edges along
the arrows.)
DAGs can be very useful e.g. in modelling project dependencies:
• Task i has to be done before tasks j and k, which have to be done before m.
Complete graphs
• In a complete graph, every vertex is adjacent
to every other vertex.
• If there are N vertices, there will be N * (N - 1)
edges in a complete directed graph and N * (N
- 1) / 2 edges in a complete undirected graph.
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Connectivity
• A graph G is connected if there is a
path between each pair of vertices,
and is disconnected otherwise.
• Note that every complete graph is
necessarily connected
• (as the path between each pair of vertices is
just the edge between those vertices),
• But… connected graphs are not
necessarily complete
• (for instance, every tree is a connected graph,
but a tree is not a complete graph)
Connectivity
Every disconnected graph can be split up
into a number of connected subgraphs,
called components.
NB We often distinguish between digraphs
and undirected graphs:
• Use Connected to talk about
undirected graphs where there is
a path between every two nodes.
• Use Strongly connected to talk about
directed graphs where there is a route
(directed path) between every two
nodes.
Example: find strongly connected
components of a directed graph
• explanation: in a directed graph paths are directed
too, following arrows in the same direction only
• thus we can have a directed path from A to B
without having one that goes from B to A
• a strongly connected component is a (full) subgraph
in which there are directed paths between any two
vertices, each way
• Adaptable to many problems
• E.g. motif finding in biological networks (when combined
with isomorphism – see later)
• (identifying over-repeated patterns in a network of data)
“Network motifs are defined as over-represented small connected
subgraphs in networks”
Kim, W., Li, M., Wang, J. et al. Biological network motif detection and
evaluation. BMC Syst Biol 5, S5 (2011)
Example / Example with added edge
[Figures: a directed graph on nodes A–G, and the same graph with one added edge – find the strongly connected components of each]
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Bipartite graph
• In a bipartite graph: two different
types of vertex
• A bipartite graph G=(V,E) has a set V
split into V1 and V2
• (e.g. top right: Jobs and People)
• What makes a bipartite graph special is that every edge (i,j) that exists in E has (i in V1) and (j in V2).
• i.e. you can never have an edge between
two nodes of the same type
• Edges must go from one type of node to
the other
Good for tasks such as allocating resources to people,
or pairing up two different types of thing
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Similarity of two graphs
Remember, a graph G = (V, E). This doesn’t tell us how to draw it…
• So the same graph can
be drawn in different
ways.
• (are you happy that
these are the same
graph?)
• Also, two graphs may look similar but represent different graphs.
• (can you see why these
are different graphs?)
Similarity of two graphs:
isomorphism
• AB is an edge of the 2nd graph, but not of the 1st one.
• Although the graphs have essentially the same
information they are not the same.
• However by relabelling the second graph, we can
reproduce the first graph.
• We say that two graphs G and H are isomorphic to
each other, if H can be obtained by relabelling the
vertices of G.
Similarity of two graphs:
isomorphism
• We say that two graphs G and H are isomorphic to each
other if H can be obtained by relabelling vertices of G.
Actually - we haven’t found a really good algorithm to
test for isomorphism. But some basic checks help us see
if two graphs are isomorphic:
• Two isomorphic graphs must have
• the same number of nodes and edges,
• the same degree sequence.
• Two graphs cannot be isomorphic if one of them
contains a subgraph that the other does not.
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Shortest path
• The path between two vertices with lowest cost is called the
shortest path
• For an unweighted graph, cost of a path = number of edges in the path
• For a weighted graph, cost of a path = sum of the weights of each edge
in the path
• Knowing the shortest path between two vertices helps with
applications such as route finding
Example – unweighted graph
[Figure: example unweighted graph on nodes 1–8]
• Find the shortest path between nodes 1 and 8
• Keep a queue of nodes and a note of which nodes
we have visited, and how we got to them
• Breadth first search
Shortest Path Outline
• This algorithm passes along each edge at most once, so
it has time complexity O(|E|)
Repeat until target found or queue empty
Add the unvisited neighbours of the head of
the queue to the back of the queue
Remove the head of the queue
Remember - BFS (from vertex s)
Put s onto a FIFO queue. Repeat until
the queue is empty:
remove the least recently added
vertex v
add each of v's unvisited neighbors
to the queue, and mark them as
visited.
Shortest unweighted path algorithm
G = (V,E)
p[a] holds the predecessor of a, initially NIL
current is a queue

PROCEDURE SHORTEST n m
  p[n] = n
  add n to current
  while current is not empty
    remove the head of current, x
    for each neighbour y of x
      if p[y] == NIL
        p[y] = x
        if y == m then return SUCCESS
        add y to current
  return FAIL

In the SUCCESS case, the path can be found in p[], going backwards from m.
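A hedged Java rendering of the SHORTEST procedure above (illustrative names; the graph is an adjacency map). Unlike the pseudocode, it also walks p[] backwards to return the actual path:

import java.util.*;

// Returns the shortest path from n to m as a list of nodes, or null if none.
static List<Integer> shortest(Map<Integer, List<Integer>> adj, int n, int m) {
    Map<Integer, Integer> p = new HashMap<>(); // predecessor map; NIL = absent
    Queue<Integer> current = new ArrayDeque<>();
    p.put(n, n);
    current.add(n);
    while (!current.isEmpty()) {
        int x = current.remove();
        for (int y : adj.getOrDefault(x, List.of())) {
            if (!p.containsKey(y)) {
                p.put(y, x);
                if (y == m) {                  // SUCCESS: walk p[] backwards
                    LinkedList<Integer> path = new LinkedList<>();
                    for (int v = m; v != n; v = p.get(v)) path.addFirst(v);
                    path.addFirst(n);
                    return path;
                }
                current.add(y);
            }
        }
    }
    return null;                               // FAIL
}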
Shortest unweighted path algorithm – worked example
Find the shortest path from node 1 to node 8 (graph from the previous slide).
Successive queue contents as the algorithm runs: 1 | 2 4 | 4 5 | 5 | 3 6 7 | 6 7 | 7 8 | 8
Path is 1 2 5 6 8
Shortest Path
• To find the shortest path between two nodes in a
weighted graph
• Dijkstra's algorithm
• To implement, we will need the data structure
priority queue (PQ)
Priority queue
• A queue where each element also has a priority assigned to it
  • e.g. data C, G, A with priorities 5, 3, 2 (and a new item T with priority 4 arriving)
• The priority determines the order in which items are held in the queue
  • Higher priority items can ‘jump the queue’
  • For items of the same priority, normal queue ordering applies
• The priority can change as nodes are added to the PQ
• Dijkstra: the element with greatest priority is the one closest to the start node
Dijkstra outline
• Finding shortest path from node x to (all)
other nodes
• The data structures needed
• a PQ of nodes (priority is cost of path to start
node)
• for each node, maintain the following information:
• the predecessor node - all predecessor info together
give us a tree of shortest paths found so far; NIL initially
• the (so far found) cost of reaching that node; initially 0
• a boolean flag whether the cheapest path has been
found
• Dijkstra has time complexity O(E + N log N)
with a standard implementation
• (PQ as heap – see next week)
Algorithm
p[m] holds the predecessor of m, initially NIL
cheapest[m] marks if m forms part of the cheapest path (initially false for all nodes)

Add the start node, x, to the PQ; cost[x] = 0
Repeat until PQ is empty:
• Remove the node with greatest priority from the PQ; call it n
• set cheapest[n] to true
• for every neighbour m of n, with the m–n edge costing cm, if cheapest[m] == false:
    if p[m] != NIL and cost[m] <= cost[n] + cm, continue the inner loop
    else set p[m] = n, cost[m] = cost[n] + cm; add/update m in the PQ
For a known target node y, stop after the second bullet point if n == y with SUCCESS;
report FAILURE after the loop
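A minimal Java sketch of this algorithm using java.util.PriorityQueue (a min-heap, so the "greatest priority" node is the one with the smallest cost). It re-inserts nodes instead of updating priorities in place, and returns only the costs, not the predecessor tree (illustrative names):

import java.util.*;

// adj maps each node to a list of {neighbour, edgeCost} pairs.
static Map<Integer, Integer> dijkstra(Map<Integer, List<int[]>> adj, int start) {
    Map<Integer, Integer> cost = new HashMap<>();      // best cost found so far
    Set<Integer> cheapest = new HashSet<>();           // finalised nodes
    PriorityQueue<int[]> pq =                          // entries: {node, cost}
        new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[1]));
    cost.put(start, 0);
    pq.add(new int[]{start, 0});
    while (!pq.isEmpty()) {
        int n = pq.remove()[0];
        if (!cheapest.add(n)) continue;                // skip stale queue entries
        for (int[] edge : adj.getOrDefault(n, List.of())) {
            int m = edge[0], cm = edge[1];
            int candidate = cost.get(n) + cm;
            if (!cheapest.contains(m)
                    && candidate < cost.getOrDefault(m, Integer.MAX_VALUE)) {
                cost.put(m, candidate);
                pq.add(new int[]{m, candidate});       // "add/update" via re-insert
            }
        }
    }
    return cost;
}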
Example of Dijkstra
• Find the shortest path from A to H
[Figure: example weighted graph on nodes A–H]
Algorithm recap:
Add the start node, x, to the PQ; cost[x] = 0
Repeat until PQ is empty:
• Remove the node with greatest priority from the PQ; call it n
• set cheapest[n] to true
• for every neighbour m of n, with the m–n edge costing cm, if cheapest[m] == false:
    if p[m] != NIL and cost[m] <= cost[n] + cm, continue the inner loop
    else set p[m] = n, cost[m] = cost[n] + cm; add/update m in the PQ
For a known target node y, stop after the second bullet point if n == y with SUCCESS; report FAILURE after the loop
Algorithm in action?
• there are good animated versions of Dijkstra out
there on the WWW
• click like a monkey (Japan)
• table-based animation (US)
Also this demo
• Computerphile (UK):
https://www.youtube.com/watch?t=180&v=GazC3A4OQTE&feature=youtu.be
Today’s lecture
• Graphs – a recap
• Definitions and representations
• Graph traversal (BFS/DFS)
• Weighted Graph Algorithms (Prim’s/Kruskal’s algorithms)
• Motivations
• Additional graph types, terms, concepts
• Density/sparsity and alternative representations
• Paths, cycles and DAGs
• Connectivity
• Bipartite graphs
• Isomorphism
• Additional graph algorithms
• Shortest path
Further additional reading
• Data structures & problem solving using Java (Weiss 2010) – Ch 14
• Introduction to algorithms (Cormen et al 2009) – Ch 22, 24, B4
• Algorithms (Sedgewick and Wayne, 2011) – Ch 4
• Algorithms in Java (Sedgewick 2003) – Ch 1.2, Ch 3.7, Ch 5, Pt 5
• Introduction to algorithms (Cormen et al 2009) – Part VI Ch22-23,
Appendix B.4
• Cracking the coding interview: 150 programming questions and
solutions (McDowell 2013) – Ch 4
NB For terms used in the ‘Graphs – a recap’
section, please see COMP3830 lectures
Glossary of new terms
Dense – a graph with many connections between nodes
Sparse – a graph with only a few connections between nodes
Path – a sequence of vertices/nodes in a graph that are connected
Route / walk – a path in a directed graph
Cycle – a closed walk (all edges and all vertices are visited at most once except
for the start vertex, which is also the end vertex)
Digraph – directed graph
DAG / directed acyclic graph – a directed graph that contains no cycles
Complete graph – where every vertex is connected (adjacent) to every other
vertex
Connected – a graph with a path between every two nodes
Strongly connected – a directed graph with a route between every two nodes
Component – a connected set of nodes within a graph (subgraph)
Bipartite – a graph which connects nodes of one type to nodes of a second type
Isomorphic –two graphs are isomorphic if they share the same underlying
structure/degrees of nodes, such that all you have to do to obtain one graph is
relabel vertices (if necessary)
Glossary
• complete graph - where every vertex is adjacent to every other vertex
• adjacent – connected by an edge
• connected graph - if there is a path between each pair of vertices
(disconnected otherwise)
• connected usually used for undirected graphs with a path between
every two nodes.
• strongly connected for directed graphs with a route (directed
path) between every two nodes.
• strongly connected component - a (full) subgraph in which there are
directed paths between any two vertices, each way
• bipartite graph – has nodes split into V1 and V2 such that every edge
(i,j) that exists in E has (i in V1) and (j in V2)
• isomorphism – same underlying structure: two graphs are isomorphic to
each other, if one can be obtained by relabelling the nodes of the other
• (degree sequence – all nodes’ indegrees and outdegrees data)
• shortest path – the path between two vertices with lowest cost
• Priority queue – a queue where each element also has a priority
assigned to it
Heaps
COMP5180 Algorithms, Correctness and Efficiency
Anna Jordanous
a.k.jordanous@kent.ac.uk
Some slides in this lecture taken from Algorithms
(Sedgewick and Wayne) material
https://algs4.cs.princeton.edu/home/
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
Heap Data Structure
[Figure: a max-heap drawn as a binary tree, positions numbered 1–12 – root X; X’s children T and O; T’s children G and S; O’s children M and N; G’s children A and E; S’s children R and A; M’s child I]
Heap Data Structure
• Another tree data structure that is useful is the heap
• A binary heap is a special type of binary tree (organised differently to
a binary search tree) that is easier to store as an array
Heap Data Structure
• Binary tree such that each parent element is larger than its two children. Thus the largest element is at the root.
• Represent as an array (NB indices start at 1):
k:    1  2  3  4  5  6  7  8  9  10 11 12
A[k]: X  T  O  G  S  M  N  A  E  R  A  I
Heap Data Structure
• Binary tree such that each parent element is larger than its two children. Thus the largest element is at the root.
• Represent as an array (NB indices start at 1) – see above.
Heap condition / heap property
For all nodes i [exc. root node]: Parent(i).value > i.value
Heap Data Structure
• Easy to get from vertex to child/parent
  • Parent of vertex k is in position k / 2 (integer division)
  • Children of vertex k are in positions 2k and 2k + 1
• Each generation has at most double the nodes of the previous one
  • At most log₂N generations
• No explicit pointers – implicit in an array representation.
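In Java, these index calculations are one-liners (a sketch for a 1-based array heap):

// Index helpers for a heap stored in an array with indices starting at 1.
static int parent(int k) { return k / 2; }       // integer division
static int left(int k)   { return 2 * k; }
static int right(int k)  { return 2 * k + 1; }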
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
Heap Algorithms
• All algorithms on heaps operate along some path from the root to the bottom of the heap.
• For N items there are ≤ log₂N nodes in every path through the heap.
• heap algorithms all:
  • Change the heap so that the heap condition is violated
  • Travel through the heap modifying it so as to restore the heap condition
  • (This second step is sometimes referred to as a ‘heapify’ operation)
COMPLEXITY: O(log N) per operation (LOGARITHMIC) – the basic heap operations insert and remove all require fewer than 2·log₂N comparisons when performed on a heap of N elements
Insert (for example, add P to the heap)
• Add the new node to the end of the list
• Keep exchanging it with its parent while the parent is smaller
• This restores the heap condition – the HEAPIFY algorithm
• Here: exchange P with M, then with O
(Sometimes referred to as the “bottom-up method”, because you work from the bottom and bring elements up to restore the heap condition)
[Figure: the example heap after inserting P at position 13 and bubbling it up past M and O]
Remove (for example, let’s remove X)
• Remove the element by…
  • overwriting the element to be removed with the last item in the heap (here I)…
[Figure: the heap with X at the root overwritten by I, the last item]
Remove – continued
• … then move down the heap restoring the heap condition (heapify): exchange I with T, then with S, then with R
NB as you move the element down, swop with the largest of the children – why do this..?
[Figure: I sinking down the heap, swapping with T, S and R in turn]
Heapify – the most important part of heap algorithms
If inserting an item into a heap, we then heapify up.

Heapify-Up (Array A, position i)
  // trying to work out which is the larger of A(i) and A(parent)
  parent = Parent(i)
  if i > 1 and A(i) > A(parent) then
    swop_values(A(i), A(parent))   // restore the heap condition, then
    Heapify-Up(A, parent)          // look again at the heap, from where that parent value was
  end if

Parent(position i) = floor(i / 2)  // parent of node i is in position i / 2
Heapify
If removing an item from a heap, we then heapify down.

Heapify-Down (Array A, position i)
  // trying to work out which is the largest of A(i), A(left) and A(right)
  left = Left(i), right = Right(i)
  if left <= heap-size(A) and A(left) > A(i) then
    largest = left
  else
    largest = i
  end if
  if right <= heap-size(A) and A(right) > A(largest) then
    largest = right
  end if
  // get largest into the correct (highest) position in the heap, then look
  // again at the heap, from the position where that largest value was
  if largest != i then
    swop_values(A(i), A(largest))
    Heapify-Down(A, largest)
  end if

Left(position i) = 2i; Right(position i) = 2i + 1   // children of node i are in positions 2i and 2i + 1
Pseudocode for inserting or removing an item in a heap

Insert (A, key)          // insert new item ‘key’ into the heap held in array A
  n = size(A)            // find position of the final item in A
  A[n+1] = key           // add new item after the final item in A
  Heapify-Up(A, n+1)     // heapify up from the added element

Remove (A, pos)          // remove item at position ‘pos’ from the heap in array A
  n = size(A)            // find position of the final item in A
  A[pos] = A[n]          // overwrite ‘pos’ with the final item in A
  A[n] = null
  Heapify-Down(A, pos)   // heapify down from pos
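Putting the pieces together, a minimal (hedged, illustrative) Java max-heap of ints following this pseudocode, with 1-based indexing:

class MaxHeap {
    private final int[] a;
    private int size = 0;

    MaxHeap(int capacity) { a = new int[capacity + 1]; } // slot 0 unused

    void insert(int key) {
        a[++size] = key;                     // add after the final item
        for (int i = size; i > 1 && a[i] > a[i / 2]; i /= 2)
            swap(i, i / 2);                  // heapify up
    }

    int removeMax() {
        int max = a[1];
        a[1] = a[size--];                    // overwrite root with final item
        heapifyDown(1);
        return max;
    }

    private void heapifyDown(int i) {
        while (2 * i <= size) {
            int largest = 2 * i;             // left child
            if (largest + 1 <= size && a[largest + 1] > a[largest]) largest++;
            if (a[i] >= a[largest]) break;   // heap condition restored
            swap(i, largest);
            i = largest;
        }
    }

    private void swap(int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
}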
Demo of 4 operations on
the heap, done one after
another:
1. Insert element S
2. Remove the maximum
element from the heap
3. Remove the (new)
maximum element from
the heap
4. Insert element S
Time complexity
• The basic heap operations of insert and remove all require fewer than 2·log₂N comparisons when performed on a heap of N elements
• (Do you understand why? Each operation follows a single root-to-leaf path, of length at most log₂N, with at most two comparisons per level.)
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
Sorting using heaps: Heapsort
Basic plan for an in-place sort:
・View the input array as a complete binary tree.
・Heap construction: build a max-heap with all N keys.
・Sortdown: repeatedly remove the maximum key.
[Figure: Sedgewick & Wayne’s heapsort trace on the keys S O R T E X A M P L E, showing heap construction (left) and sortdown (right) as successive sink() and exch() operations, ending in the sorted result A E E L M O P R S T X]
Sorting using heaps
First build the heap by inserting items one at a time; then, one-by-one, remove the maximum element in the heap and heapify.
Watch this twice – the second time, watch what happens in the array representation.
This is an in-place sort (because it happens inside the array).
NB Alternative demos:
https://www.youtube.com/watch?v=D_B3HN4gcUA
https://youtu.be/mAO8LpQ6uGQ
or for a faster view with less
talking…:
https://www.youtube.com/watch?v=MtQL_ll5KhQ
Heapsort pseudocode

Heapsort(Array A)
  BuildHeap(A)
  for i = length(A) downto 2
    exchange(A[1], A[i])
    heap-size(A) = heap-size(A) - 1   // as A[i] is now ignored
    Heapify-Down(A, 1)
  end for

BuildHeap(Array A)
  heap-size(A) = length(A)
  for i = floor(length(A) / 2) downto 1
    Heapify-Down(A, i)
  end for
Heapsort: Java implementation

public class Heap
{
    public static void sort(Comparable[] a)
    {
        int N = a.length;
        for (int k = N/2; k >= 1; k--)
            sink(a, k, N);
        while (N > 1)
        {
            exch(a, 1, N);
            sink(a, 1, --N);
        }
    }

    private static void sink(Comparable[] a, int k, int N)
    { /* as before, but make static (and pass arguments) */ }
    private static boolean less(Comparable[] a, int i, int j)
    { /* as before, but convert from 1-based indexing to 0-based indexing */ }
    private static void exch(Object[] a, int i, int j)
    { /* as before */ }
}
Comparing heapsort to other sorting algorithms

Heapsort: mathematical analysis
Proposition. Heap construction uses ≤ 2N compares and ≤ N exchanges.
Proposition. Heapsort uses ≤ 2N·lg N compares and exchanges.
  (with advanced tricks, the algorithm can be improved to ~ 1N·lg N)
Significance. An in-place sorting algorithm with N log N worst-case:
・Mergesort: no – linear extra space. (In-place merge is possible, but not practical.)
・Quicksort: no – quadratic time in worst case. (N log N worst-case quicksort is possible, but not practical.)
・Heapsort: yes!
Bottom line. Heapsort is optimal for both time and space, but:
・Inner loop longer than quicksort’s.
・Makes poor use of cache.
・Not stable.
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
Priority queue
• A queue where each element also has a priority assigned to it
  • e.g. data C, G, A with priorities 5, 3, 2 (and a new item T with priority 4 arriving)
• The priority determines the order in which items are held in the queue
  • Higher priority items can ‘jump the queue’
  • For items of the same priority, normal queue ordering applies
• The priority can change as nodes are added to the PQ
Priority Queues using heaps
• A priority queue is a queue data structure with additional information
on each node’s priority, such that the priority of a node decides what
position it takes in the queue (and FIFO for nodes of equal priority)
• Heaps are a useful way of representing priority queues
  • The higher the priority, the higher an item goes up the heap
  • Ordering based on priority
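For comparison, Java's built-in java.util.PriorityQueue is heap-based, with two differences from the slide's definition worth noting: it is a min-heap by default (smallest element first), and it does not guarantee FIFO order for items of equal priority. A small usage sketch:

import java.util.Comparator;
import java.util.PriorityQueue;

public class PQDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> costs = new PriorityQueue<>(); // min-heap by default
        costs.add(5); costs.add(3); costs.add(2); costs.add(4);
        System.out.println(costs.remove());  // 2 – the smallest, good for Dijkstra

        // For max-heap behaviour, pass a reversed comparator:
        PriorityQueue<Integer> maxPq = new PriorityQueue<>(Comparator.reverseOrder());
        maxPq.add(5); maxPq.add(3); maxPq.add(2); maxPq.add(4);
        System.out.println(maxPq.remove());  // 5 – the largest
    }
}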
e.g. Dijkstra algorithm
Per-node information: p (predecessor), cost, cheapest?
• Finding shortest path from node x to (all) other nodes in a
weighted graph
• The data structures needed
• a PQ of nodes (priority is cost of path to start node)
• Dijkstra: the element with greatest priority is the one closest to the
start node
• for each node, maintain the following extra info:
• p - the predecessor node - all predecessor info together give us a tree of shortest
paths found so far; NIL initially
• cost - the (so far found) cost of reaching that node; initially 0
• cheapest - a boolean flag whether the cheapest path has been found; false initially
Example of Dijkstra:
Find the shortest path from A to H
PRIORITY QUEUE – per-node information: p (predecessor), cost, cheapest?
This Computerphile video is a nice demo of how Dijkstra alg uses a PQ
https://www.youtube.com/watch?t=180&v=GazC3A4OQTE&feature=youtu.be
[Figure: example weighted graph on nodes A–H]
Add the start node, x, to the PQ
cost(x) = 0
Repeat until PQ is empty
- Remove the node with greatest priority
from the PQ, call it n
- set cheapest[n] to true
- for every neighbour m of n, with m-n
edge costing cm, if cheapest[m]==false:
if p[m]!=NIL and cost[m]<=cost[n]+cm continue
inner loop
else set: p[m]=n, cost[m]=cost[n]+cm;
add/update m in PQ
For known target node y, stop after second bullet
point if n==y with SUCCESS; report after loop
FAILURE
Example of Dijkstra:
Find the shortest path from A to H (using a heap as the PQ)
• When we add a node to the PQ, we order it based on the priority
  • Highest priority for Dijkstra is the node closest to the start node (lowest cost)
  • The higher the priority, the higher it goes up the heap
• When we change a node’s priority, we may have to heapify
[Figure: example weighted graph on nodes A–H]
Add the start node, x, to the PQ
cost(x) = 0
Repeat until PQ is empty
- Remove the node with greatest priority
from the PQ, call it n
- set cheapest[n] to true
- for every neighbour m of n, with m-n
edge costing cm, if cheapest[m]==false:
if p[m]!=NIL and cost[m]<=cost[n]+cm continue
inner loop
else set: p[m]=n, cost[m]=cost[n]+cm;
add/update m in PQ
For known target node y, stop after second bullet
point if n==y with SUCCESS; report after loop
FAILURE
Example of Dijkstra:
Find the shortest path from A to H
HEAP
[Figure: example weighted graph on nodes A–H]
Add the start node, x, to the PQ
cost(x) = 0
Repeat until PQ is empty
- Remove the node with greatest priority
from the PQ, call it n
- set cheapest[n] to true
- for every neighbour m of n, with m-n
edge costing cm, if cheapest[m]==false:
if p[m]!=NIL and cost[m]<=cost[n]+cm continue
inner loop
else set: p[m]=n, cost[m]=cost[n]+cm;
add/update m in PQ
For known target node y, stop after second bullet
point if n==y with SUCCESS; report after loop
FAILURE
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
From the Dijkstra example of priority queues
• When we add a node to the PQ, we order it based on the priority
  • Highest priority for Dijkstra is the node closest to the start node (lowest cost)
  • The higher the priority, the higher it goes up the heap
• When we change a node’s priority, we may have to heapify
Here, we want the nodes with smallest costs to be at the top of the heap.
This is an example of a min heap…
MAX HEAP and MIN HEAP
• We had been looking at max heaps
  • Heap condition: a node’s value >= its children nodes’ values
  • The root node has the largest (max) value
• We could also consider min heaps, which are the reverse of max heaps
  • Heap condition: a node’s value <= its children nodes’ values
  • The root node has the smallest (min) value
  • (e.g. on insert, swap a node with its parent if the node’s value is less than its parent’s value)
• Min heaps are exactly the same as max heaps except you do things in reverse
• DEMO: https://youtu.be/hfA6q1pf4sk
[Figure: the example max-heap alongside the corresponding min-heap]
Today’s lecture
Heap – data structure and algorithms including sort
• What is a heap? (tree and array format)
• Insert, remove and heapify operations
• Heapsort
• Priority queues using heaps
• Max heap and min heap
Further additional reading
• Algorithms in Java (Sedgewick 2003) – Ch 9
• Data structures & problem solving using Java (Weiss 2010) – Ch 21 (23)
• Introduction to algorithms (Cormen et al 2009) – Ch 6
• Algorithms (Sedgewick and Wayne 2011) – Ch 2.4
• Cracking the coding interview: 150 programming questions and solutions (McDowell
2013) – p. 36, Ch 20
• https://www.youtube.com/watch?v=c1TpLRyQJ4w&list=PLTxllHdfUq4fMXqS6gCDWuWhiaRDVGsgu [A good set of videos explaining heaps]
Glossary
• (binary) (max) heap - binary tree such that each parent element is larger than its two children
• Heaps are usually max heaps
• min heaps - the reverse of max heaps, such that each parent element is smaller than its two
children
• Heap condition/heap property - For all nodes i [exc. root node]: Parent(i).value > i.value
• heapify – stepping through the heap modifying it level by level so as to restore the heap
condition
• Heapsort/heap sort – a sorting algorithm based on building a heap and then outputting the
heap’s nodes in order
• Sort down – the step of repeatedly outputting the root in heapsort
• Priority Queue - A queue data structure where each element also has a priority assigned to it,
and higher priority items can ‘jump the queue’