# 08. graphIntro

```
Algorithm Design and Analysis (ADA)
242-535, Semester 1 2014-2015
8. Introduction to Graphs
• Objective
o introduce the main kinds of graphs, discuss two implementation approaches, and remind you about trees
Overview
1. Graphs
2. Graph Terminology
3. Implementing Graphs
4. Trees and Forests
5. Tree Terminology
1. Graphs
• A graph has two parts (V, E), where:
o V are the nodes, called vertices
o E are the links between vertices, called edges
• Example:
o airports and distance between them
[Figure: graph whose vertices are the airports SFO, PVD, ORD, LGA, HNL, LAX, DFW and MIA, with edges labelled by distance]
1.1. Graph Types
• Directed graph
o the edges are directed
o e.g., bus cost network
• Undirected graph
o the edges are undirected
1.2. Examples
• Electronic circuits
o Printed circuit board
o Integrated circuit
• Transportation networks
o Highway network
o Flight network
• Computer networks
o Local area network
o Internet
o Web
• Databases
o Entity-relationship diagram
[Figure: example network with nodes cslab1a, cslab1b, math.brown.edu, cs.brown.edu, brown.edu, qwest.net, att.net, cox.net, John, Paul and David]
Graphs are everywhere
Example                                 | Nodes                                    | Edges
Transportation network: airline routes  | airports                                 | nonstop flights
Communication networks                  | computers, hubs, routers                 | physical wires
Information network: web                | pages                                    |
Information network: scientific papers  | articles                                 | references
Social networks                         | people                                   | “u is v’s friend”, “u sends email to v”
Computer programs                       | functions (or modules), statement blocks | “u calls v”
A Calling Graph
• A calling graph for a program:
[Figure: calling graph with nodes main, makeList, printList, mergeSort, split and merge; it contains 4 examples of recursion]
Sheet Metal Hole Drilling
• Problem: minimise the moving time of the drill over a metal sheet.
A Weighted Graph Version
• Add edge numbers (weights) for the movement time between any two holes.
[Figure: weighted graph on the holes a, b, c, d and e, with each edge labelled by its movement time]
2. Graph Terminology
• End vertices (or endpoints) of an edge
o U and V are the endpoints of a
• Edges incident on a vertex
o a, d, and b are incident on V
• Adjacent vertices
o U and V are adjacent
• Degree of a vertex
o X has degree 5
• Parallel edges
o h and i are parallel edges
• Self-loop
o j is a self-loop
[Figure: graph with vertices U, V, W, X, Y, Z and edges a, b, c, d, e, f, g, h, i, j]
• Path
o sequence of alternating vertices and edges
o begins with a vertex
o ends with a vertex
o each edge is preceded and followed by its endpoints
• Simple path
o path such that all its vertices and edges are distinct
• Examples
o P1 = (V,b,X,h,Z) is a simple path
o P2 = (U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple
[Figure: the graph with the paths P1 and P2 marked]
• Cycle
o circular sequence of alternating vertices and edges
o each edge is preceded and followed by its endpoints
• Simple cycle
o cycle such that all its vertices and edges are distinct
• Examples
o C1 = (V,b,X,g,Y,f,W,c,U,a) is a simple cycle
o C2 = (U,c,W,e,X,g,Y,f,W,d,V,a) is a cycle that is not simple
[Figure: the graph with the cycles C1 and C2 marked]
Connectivity
• A graph is connected if there is a path between every pair of vertices
[Figure: a connected graph, and a non-connected graph with two connected components]
Some Properties
• Notation
o V: set of vertices
o E: set of edges
o |...|: the set size
o degree(): degree of a vertex
• Property: Σ_v degree(v) = 2*|E|
o Proof: each undirected edge is counted twice (called the handshaking lemma)
• Property: in an undirected graph with no self-loops and no multiple edges, |E| ≤ |V|(|V| - 1)/2
o Proof: each vertex has degree at most (|V| - 1)
• Example (graph on the vertices a, b, c, d)
o |V| = 4
o |E| = 6
o degree(a) = 3
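/* A minimal sketch (an illustration, not part of the original slides) that
   checks the handshaking lemma on the 4-vertex example above. It assumes the
   example is the complete graph on a, b, c, d (numbered 0..3) stored as a
   0/1 matrix; the names N, k4, degree(), edge_count() and degree_sum() are
   hypothetical. */
#include <assert.h>

#define N 4

/* every pair of distinct vertices is linked, so |E| = 4*3/2 = 6 */
static const int k4[N][N] = {
  {0, 1, 1, 1},
  {1, 0, 1, 1},
  {1, 1, 0, 1},
  {1, 1, 1, 0}
};

/* degree of v = number of 1s in row v (valid for a simple graph) */
int degree(const int g[N][N], int v)
{
  int d = 0;
  assert(v >= 0 && v < N);
  for (int j = 0; j < N; j++)
    d += g[v][j];
  return d;
}

/* |E| = number of 1s above the main diagonal */
int edge_count(const int g[N][N])
{
  int e = 0;
  for (int i = 0; i < N; i++)
    for (int j = i + 1; j < N; j++)
      e += g[i][j];
  return e;
}

/* sum of the degrees of all vertices; the lemma says this is 2*|E| */
int degree_sum(const int g[N][N])
{
  int s = 0;
  for (int v = 0; v < N; v++)
    s += degree(g, v);
  return s;
}

/* Usage: degree(k4, 0) is 3, edge_count(k4) is 6 and degree_sum(k4) is 12. */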
3. Implementing Graphs
• We will typically express running times in terms of |E| and |V| (often dropping the |’s)
o If |E| ≈ |V|^2 the graph is dense
• can also write this as |E| is O(|V|^2)
o If |E| ≈ |V| the graph is sparse
• or |E| is O(|V|)
• Dense and sparse graphs are best implemented using two different data structures:
o Adjacency matrices: for dense graphs
o Adjacency lists: for sparse graphs
Dense Big-Oh
• In the most dense graph, a graph of |V| vertices will have |V|(|V|-1)/2 edges.
• In that case, for large |V|, |E| is O(|V|^2)
[Figure: the most dense graph with |V| = 5, so |E| = (5*4)/2 = 10]
• Proof that the most dense graph of n nodes has n(n-1)/2 edges. Write as S(n) = n(n-1)/2
• Basis. S(2) = 1. True.
• Inductive Case.
o assume S(n) = n(n-1)/2                 (1)
o try to show S(n+1) = (n+1)n/2          (2)
o we know: S(n+1) = S(n) + n, since the new node gets an edge to each of the n old nodes
o which is S(n+1) = n(n-1)/2 + n, using (1)
o which is S(n+1) = (n^2 - n + 2n)/2 = (n+1)n/2
o which is (2)
[Figure: undirected graph with vertices a, b, c, d, e]
Graph and its adjacency matrix:
    a b c d e
a   0 1 0 0 1
b   1 0 1 0 1
c   0 1 1 0 1
d   0 0 0 0 1
e   1 1 1 1 0
Properties
• An adjacency matrix represents the graph as a |V| × |V| matrix A:
o A[i, j] = 1 if edge (i, j) ∈ E
o A[i, j] = 0 if edge (i, j) ∉ E
• The degree of a vertex v (of a simple graph) = sum of row v or sum of column v
o e.g. vertex a has degree 2 since it is connected to b and e
• An adjacency matrix can represent loops
o e.g. vertex c on the previous slide
• An adjacency matrix can represent parallel edges if non-negative integers are allowed as matrix entries
o ijth entry = no. of edges between vertex i and j
• For an undirected graph, the matrix duplicates information around the main diagonal
o the size can be easily reduced with some coding tricks
• Properties of graphs can be obtained using matrix operations
o e.g. the no. of paths of a given length, and vertex degree
The No. of Paths of Length n
• If an adjacency matrix A is multiplied by itself repeatedly:
o A, A^2, A^3, ..., A^n
• Then the ijth entry in matrix A^n is equal to the number of paths from i to j of length n.
Example
[Figure: undirected graph with vertices a, b, c, d, e]
A =
    a b c d e
a   0 1 0 1 0
b   1 0 1 0 1
c   0 1 0 1 1
d   1 0 1 0 0
e   0 1 1 0 0
A^2 = A * A =
    a b c d e
a   2 0 2 0 1
b   0 3 1 2 1
c   2 1 3 0 1
d   0 2 0 2 1
e   1 1 1 1 2
Why it Works...
• Consider row a, column c in A^2:
o (row a of A) * (column c of A)
  = (0 1 0 1 0) * (0 1 0 1 1)
  = 0*0 + 1*1 + 0*0 + 1*1 + 0*1
  = 2
o the two non-zero terms come from the paths a-b-c and a-d-c
• Each non-zero product means there is a vertex connecting vertices a and c.
• The sum is 2 because of:
o (a, b, c) and (a, d, c)
o 2 paths of length two
The Degree of Vertices
• The entries on the main diagonal of A^2 give the degrees of the vertices (when A represents a simple graph).
• Consider vertex c:
o degree of c == 3 since it is incident to the edges (c,b), (c,d), and (c,e).
• In A^2 these become paths of length 2:
o (c,b,c), (c,d,c), and (c,e,c)
• So the number of paths of length 2 from c back to c = the degree of c
o this is true for all vertices
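/* A minimal sketch (an illustration, not part of the original slides) of
   squaring the example matrix A, with a..e numbered 0..4. It assumes plain
   int arrays; the names N, A, square() and row_sum() are hypothetical. */
#include <assert.h>

#define N 5

static const int A[N][N] = {
  {0, 1, 0, 1, 0},   /* a */
  {1, 0, 1, 0, 1},   /* b */
  {0, 1, 0, 1, 1},   /* c */
  {1, 0, 1, 0, 0},   /* d */
  {0, 1, 1, 0, 0}    /* e */
};

/* out = in * in; out[i][j] counts the paths of length 2 from i to j */
void square(const int in[N][N], int out[N][N])
{
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++) {
      out[i][j] = 0;
      for (int k = 0; k < N; k++)
        out[i][j] += in[i][k] * in[k][j];
    }
}

/* sum of row v of a 0/1 matrix = degree of v in a simple graph */
int row_sum(const int m[N][N], int v)
{
  int s = 0;
  assert(v >= 0 && v < N);
  for (int j = 0; j < N; j++)
    s += m[v][j];
  return s;
}

/* Usage: after "int A2[N][N]; square(A, A2);" the entry A2[0][2]
   (row a, column c) is 2, and A2[v][v] equals row_sum(A, v) for every v. */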
• #define NUMNODES 5             /* n = no. of nodes; 5 is only an example value */
  int arcs[NUMNODES][NUMNODES];  /* the adjacency matrix */
• arcs[u][v] == 1 if there is an edge (u,v); 0 otherwise
• Storage used: O(|V|^2)
• The implementation may also need a way to map node names (strings) to array indices.
• If n is large then the array will be very large; for an undirected graph the matrix is symmetric, so almost half of it is unnecessary.
• If the nodes are lightly connected then most of the array will contain 0’s, which is a further waste of memory.
Representing Directed Graphs
• A directed graph:
[Figure: directed graph on the nodes 0, 1, 2, 3, 4]
• Its adjacency matrix (rows = start node, columns = finish node):
             finish
           0 1 2 3 4
start  0   1 1 1 0 0
       1   0 0 0 1 0
       2   1 1 0 0 1
       3   0 0 1 0 1
       4   0 1 0 0 0
• Not symmetric; all the array may be necessary.
• Still a waste of space if nodes are lightly connected.
When to use an Adjacency Matrix
• The adjacency matrix is an efficient way to store dense graphs.
• But most large interesting graphs are sparse
o e.g., planar graphs, in which no edges cross, have |E| = O(|V|) by Euler’s formula
o For this reason the adjacency list is often a better representation
Euler’s Formula (Euler Characteristic)
• Euler (1752) proved that for any connected planar graph, where:
  F = no. of faces
  E = no. of edges
  V = no. of vertices/nodes
  then the formula holds: F = E - V + 2
[Figure: example planar graph with F = 5; E = 9; V = 6]
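• Adjacency list: for each vertex v ∈ V, store a list of the vertices adjacent to v
• Example:
o adj[0] = {0, 1, 2}
o adj[1] = {3}
o adj[2] = {0, 1, 4}
o adj[3] = {2, 4}
o adj[4] = {1}
• Can be used for directed and undirected graphs.
[Figure: directed graph on the nodes 0, 1, 2, 3, 4]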
• An implementation diagram:
  adj[0] -> 0 -> 1 -> 2 -> NULL
  adj[1] -> 3 -> NULL
  adj[2] -> 0 -> 1 -> 4 -> NULL
  adj[3] -> 2 -> 4 -> NULL
  adj[4] -> 1 -> NULL
o size of array = no. of vertices (|V|)
o no. of cells == no. of edges (|E|)
Data Structures
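• typedef int Node;            /* assumed: a node is named by its array index */
  struct cell {                /* for a linked list */
    Node nodeName;
    struct cell *next;
  };
  struct cell *adj[NUMNODES];  /* one list head per node */
• adj[u] points to a linked list of cells which give the names of the nodes connected to u.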
Storage Needs
• How much storage is required?
o The degree of a vertex v == number of incident edges
• directed graphs have in-degree, out-degree values
• For directed graphs, the number of items in an adjacency list is
  Σ out-degree(v) = |E|
• This uses Θ(V + E) storage
• For undirected graphs, the number of items in the adjacency list is
  Σ degree(v) = 2*|E|   (the handshaking lemma)
o Why? If we mark every edge connected to every vertex, then by the end, every edge will be marked twice
• This also uses Θ(V + E) storage
• In summary, adjacency lists use Θ(V + E) storage
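/* A minimal sketch (an illustration, not part of the original slides) of
   building adjacency lists and counting their cells. It assumes nodes are
   numbered 0..NUMNODES-1 and that nodeName is stored as an int; NUMNODES = 5
   and the helpers add_edge(), add_undirected() and count_cells() are
   hypothetical. */
#include <assert.h>
#include <stdlib.h>

#define NUMNODES 5

struct cell {
  int nodeName;
  struct cell *next;
};

/* file-scope arrays are zero-initialised, so every list starts as NULL */
struct cell *adj[NUMNODES];

/* push v onto the front of u's list: records the directed edge u->v */
void add_edge(int u, int v)
{
  struct cell *c = malloc(sizeof *c);
  assert(c != NULL);
  assert(u >= 0 && u < NUMNODES && v >= 0 && v < NUMNODES);
  c->nodeName = v;
  c->next = adj[u];
  adj[u] = c;
}

/* an undirected edge is stored twice, once in each endpoint's list */
void add_undirected(int u, int v)
{
  add_edge(u, v);
  add_edge(v, u);
}

/* total number of cells over all lists */
int count_cells(void)
{
  int n = 0;
  for (int u = 0; u < NUMNODES; u++)
    for (struct cell *c = adj[u]; c != NULL; c = c->next)
      n++;
  return n;
}

/* Usage: after add_undirected(0, 1), add_undirected(1, 2) and
   add_undirected(2, 3) there are 3 edges and count_cells() returns 6. */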
3.3. Running Time: Matrix or List?
• Which representation is better for graphs?
• dense graph – use a matrix
• sparse graph – use an adjacency list
• But a more accurate answer depends on the
operations that will be applied to the graph.
• We will consider three operations:
o is there an edge between u and v?
o find the successors of u (in a directed graph)
o find the predecessors of u (in a directed graph)
Is there an edge (u,v)?
• Adjacency matrix: O(1), since arcs[u][v] is a single array access
• Adjacency list: O(1 + E/V)   // forget the |...|
o O(1) to get to adj[u]
o length of linked list is on average E/V
o if a sparse graph (E ≈ V): O(1 + E/V) => O(1)
o if a dense graph (E ≈ V^2): O(1 + E/V) => O(V)
Find u’s successors (u->v)
• Adjacency matrix: O(V) since must examine the entire row for vertex u
• Adjacency list: O(1 + (E/V)) since must look at entire list pointed to by adj[u]
o if a sparse graph (E ≈ V): O(1 + E/V) => O(1)
o if a dense graph (E ≈ V^2): O(1 + E/V) => O(V)
Find u’s predecessors (t->u)
• Adjacency matrix: O(V) since must examine the entire column for vertex u
o a 1 in the row for ‘t’ means that ‘t’ is a predecessor
• Adjacency list: O(E) since must examine every list
o if a sparse graph (E ≈ V): O(E) is fast
o if a dense graph (E ≈ V^2): O(E) is slow
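/* A minimal sketch (an illustration, not part of the original slides) of the
   three operations. It assumes a small directed graph kept in both
   representations at once; N = 4 and every function name are hypothetical. */
#include <assert.h>
#include <stdlib.h>

#define N 4

struct cell { int nodeName; struct cell *next; };

int arcs[N][N];        /* adjacency matrix, all 0 at start */
struct cell *adj[N];   /* adjacency lists, all NULL at start */

/* record the directed edge u->v in both representations */
void add_edge(int u, int v)
{
  struct cell *c = malloc(sizeof *c);
  assert(c != NULL);
  assert(u >= 0 && u < N && v >= 0 && v < N);
  arcs[u][v] = 1;
  c->nodeName = v;
  c->next = adj[u];
  adj[u] = c;
}

/* matrix: one array access */
int has_edge_matrix(int u, int v)
{
  return arcs[u][v];
}

/* list: walk u's list until v is found or the list ends */
int has_edge_list(int u, int v)
{
  for (struct cell *c = adj[u]; c != NULL; c = c->next)
    if (c->nodeName == v)
      return 1;
  return 0;
}

/* successors of u from the list: the length of adj[u] */
int count_succ_list(int u)
{
  int n = 0;
  for (struct cell *c = adj[u]; c != NULL; c = c->next)
    n++;
  return n;
}

/* predecessors of u from the matrix: scan column u */
int count_pred_matrix(int u)
{
  int n = 0;
  for (int t = 0; t < N; t++)
    n += arcs[t][u];
  return n;
}

/* predecessors of u from the lists: every list has to be searched */
int count_pred_list(int u)
{
  int n = 0;
  for (int t = 0; t < N; t++)
    n += has_edge_list(t, u);
  return n;
}

/* Usage: after add_edge(0, 1), add_edge(0, 2), add_edge(2, 1) and
   add_edge(3, 1), node 1 has 3 predecessors and node 0 has 2 successors. */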
Summary: which is faster?
Operation    | Dense Graph  | Sparse Graph
Find edge    | Adj. matrix  | Either
Find succ.   | Either       | Adj. list
Find pred.   | Adj. matrix  | Either
• As a graph gets denser, an adjacency matrix has better execution time than an adjacency list.
3.4. Storage Space: Matrix or List?
• The size of an adjacency matrix for a graph of V nodes is:
o V^2 bits (assuming 0 and 1 are stored as bits)
• An adjacency list cell uses:
o 32 bits for the integer, 32 bits for the pointer
o so, cell size = 64 bits
• Total no. of cells = total no. of edges, E
o so, total size of lists = 64*E bits
• The array of list heads, successors[], has V entries (for V vertices)
o so, array size is 32*V bits
• Total size of an adjacency list data struct: 64*E + 32*V bits
Size Comparison
• An adjacency list will use less storage than an adjacency matrix when:
  64*E + 32*V < V^2
  which is: E < V^2/64 - V/2
• When V is large, ignore the V/2 term:
  E < V^2/64
• V^2 is (roughly) the maximum number of edges.
• So if the actual number of edges in a graph is less than about 1/64 of the maximum number of edges, then an adjacency list will use less storage than a matrix coding
o but the graph must be quite sparse
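/* A minimal sketch (an illustration, not part of the original slides) of the
   size comparison. It assumes the figures used above: 1 bit per matrix entry,
   64 bits per list cell and 32 bits per array entry; the function names are
   hypothetical. */
#include <assert.h>

/* bits used by a V x V adjacency matrix */
unsigned long long matrix_bits(unsigned long long v)
{
  return v * v;
}

/* bits used by adjacency lists with e cells and a v-entry array */
unsigned long long list_bits(unsigned long long v, unsigned long long e)
{
  return 64ULL * e + 32ULL * v;
}

/* 1 if the lists need fewer bits than the matrix, 0 otherwise */
int list_is_smaller(unsigned long long v, unsigned long long e)
{
  return list_bits(v, e) < matrix_bits(v);
}

/* Usage: for V = 1000 the break-even point is E = 1000*1000/64 - 1000/2
   = 15125, so list_is_smaller(1000, 15000) is 1 and
   list_is_smaller(1000, 15200) is 0. */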
4. Trees and Forests
• A (free) tree is an undirected graph T such that
o T is connected
o T has no cycles
o This definition of tree is different from the one of a rooted tree
• A forest is an undirected graph without cycles
• The connected components of a forest are trees
[Figure: a tree, and a forest]
Uses of Trees
[Figure: organisation chart drawn as a tree, with President at the root, Vice-Presidents below, then posts such as Dean of Engineering, Planning Officer and Purchases Officer]
Saturated Hydrocarbons
[Figure: the molecules butane and isobutane drawn as graphs of C and H atoms]
• Non-rooted (free) trees
o a free tree is a connected graph with no cycles
A Computer File System
[Figure: directory tree rooted at /; the directory and file names shown are usr, bin, ed, vi, spool, exs, opr, ls, mail, tmp, who, junk, uucp and printer]
5. (Rooted) Tree Terminology
• e.g. Part of the ancient Greek god family:
[Figure: rooted tree with Uranus at level 0 and levels 0 to 3 marked; the vertices shown include Aphrodite, Eros, Zeus, Apollo, Athena, Kronos, Poseidon, Hermes, Atlas, Prometheus, Ares and Heracles]
Some Definitions
• Let T be a tree with root v_0.
• Suppose that x, y, z are vertices in T.
• (v_0, v_1, ..., v_n) is a simple path in T (no loops).
• a) v_(n-1) is the parent of v_n.
• b) v_0, ..., v_(n-1) are ancestors of v_n
• c) v_n is a child of v_(n-1)
• d) If x is an ancestor of y, then y is a descendant of x.
• e) If x and y are children of z, then x and y are siblings.
• f) If x has no children, then x is a terminal vertex (or a leaf).
• g) If x is not a terminal vertex, then x is an internal (or
branch) vertex.
• h) The subtree of T rooted at x is the graph with vertex set V and edge set E
o V contains x and all the descendants of x
o E = {e | e is an edge on a simple path from x to some vertex in V}
• i) The length of a path is the number of edges it uses, not vertices.
• j) The level of a vertex x is the length of the simple path from
the root to x.
• k) The height of a vertex x is the length of the simple path
from x to the farthest leaf
o the height of a tree is the height of its root
• l) A tree where every internal vertex has exactly m children is
called a full m-ary tree.
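/* A minimal sketch (an illustration, not part of the original slides) of
   definitions j) and k). It assumes a rooted tree stored as a parent array;
   the 6-vertex tree and the names N, parent, level(), depth_below() and
   height() are hypothetical. */
#include <assert.h>

#define N 6

/* parent[x] is the parent of x, and the root has parent -1.
   Vertex 0 is the root; 1 and 2 are children of 0; 3 and 4 are children
   of 1; 5 is the child of 3. */
static const int parent[N] = {-1, 0, 0, 1, 1, 3};

/* level of x = length of the simple path from the root to x */
int level(int x)
{
  int len = 0;
  assert(x >= 0 && x < N);
  while (parent[x] != -1) {
    x = parent[x];
    len++;
  }
  return len;
}

/* no. of edges from x down to y, or -1 if y is neither x nor a descendant of x */
int depth_below(int x, int y)
{
  int len = 0;
  while (y != x) {
    if (parent[y] == -1)
      return -1;
    y = parent[y];
    len++;
  }
  return len;
}

/* height of x = length of the simple path from x to its farthest leaf */
int height(int x)
{
  int best = 0;
  for (int y = 0; y < N; y++) {
    int d = depth_below(x, y);
    if (d > best)
      best = d;
  }
  return best;
}

/* Usage: level(5) is 3, height(0) is 3 (the height of the tree),
   height(1) is 2 and height(2) is 0 because vertex 2 is a leaf. */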
Applied to the Example
• The root is Uranus.
• A simple path is (Uranus, Aphrodite, Eros)
• The parent of Eros is Aphrodite.
• The ancestors of Hermes are Zeus, Kronos, and
Uranus.
• The children of Zeus are Apollo, Athena, Hermes,
and Heracles.
• The descendants of Kronos are Zeus, Poseidon, Hades, Ares, Apollo, Athena, Hermes, and Heracles.
• The leaves (terminal vertices) are Eros, Apollo, Athena, Hermes, Heracles, Poseidon, Hades, Ares, Atlas, and Prometheus.
• The branches (internal vertices) are Uranus, Aphrodite, Kronos, and Zeus.
• The subtree rooted at Kronos:
[Figure: subtree with Kronos at the root; the vertices shown include Zeus, Apollo, Athena, Poseidon and Hermes]