08. graphIntro

Algorithm Design and Analysis (ADA)
242-535, Semester 1 2014-2015
8. Introduction
to Graphs
• Objective
o introduce the main kinds of graphs, discuss two
implementation approaches, and remind you
about trees
242-535 ADA: 8. Intro. Graphs
Overview
1. Graphs
2. Graph Terminology
3. Implementing Graphs
o adjacency matrix
o adjacency list
4. Trees and Forests
5. Tree Terminology
1. Graphs
• A graph has two parts (V, E), where:
o V are the nodes, called vertices
o E are the links between vertices, called edges
• Example:
o airports and distance between them
[Figure: graph whose vertices are the airports SFO, PVD, ORD, LGA, HNL, LAX, DFW and MIA, with edges labelled by distance]
1.1.Graph Types
• Directed graph
o the edges are directed
o e.g., bus cost network
• Undirected graph
o the edges are undirected
o e.g., road network
1.2. Examples
• Electronic circuits
o Printed circuit board
o Integrated circuit
• Transportation networks
o Highway network
o Flight network
• Computer networks
o Local area network
o Internet
o Web
• Databases
o Entity-relationship diagram
[Figure: example graphs: a computer network linking cslab1a, cslab1b, math.brown.edu, cs.brown.edu, brown.edu, qwest.net, att.net and cox.net, and a graph linking John, Paul and David]
Graphs are everywhere
Example                                  Nodes                      Edges
Transportation network: airline routes   airports                   nonstop flights
Communication networks                   computers, hubs, routers   physical wires
Information network: web                 pages                      hyperlinks
Information network: scientific papers   articles                   references
Social networks                          people                     “u is v’s friend”, “u sends email to v”, “u’s Facebook links to v”
Computer programs                        functions (or modules)     “u calls v”
Computer programs                        statement blocks           “v can follow u”
A Calling Graph
• A calling graph for a program:
[Figure: calling graph with nodes main, makeList, printList, mergeSort, split and merge; it contains 4 examples of recursion]
Sheet Metal Hole Drilling
• Problem: minimise the
moving time of the drill
over a metal sheet.
A Weighted Graph Version
• Add edge numbers (weights) for the movement
time between any two holes.
[Figure: weighted graph on the holes a, b, c, d and e; each edge is labelled with a movement time between 2 and 12]
2. Graph Terminology
• End vertices (or endpoints) of an edge
o U and V are the endpoints of edge a
• Edges incident on a vertex
o a, d, and b are incident on V
• Adjacent vertices
o U and V are adjacent
• Degree of a vertex
o X has degree 5
• Parallel edges
o h and i are parallel edges
• Self-loop
o j is a self-loop
[Figure: graph with vertices U, V, W, X, Y, Z and edges a to j]
• Path
o sequence of alternating vertices and edges
o begins with a vertex
o ends with a vertex
o each edge is preceded and followed by its endpoints
• Simple path
o path such that all its vertices and edges are distinct
• Examples
o P1=(V,b,X,h,Z) is a simple path
o P2=(U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple
[Figure: the graph with vertices U, V, W, X, Y, Z, with paths P1 and P2 marked]
• Cycle
o circular sequence of alternating vertices and edges
o each edge is preceded and followed by its endpoints
• Simple cycle
o cycle such that all its vertices and edges are distinct
• Examples
o C1=(V,b,X,g,Y,f,W,c,U,a) is a simple cycle
o C2=(U,c,W,e,X,g,Y,f,W,d,V,a) is a cycle that is not simple
[Figure: the graph with vertices U, V, W, X, Y, Z, with cycles C1 and C2 marked]
Connectivity
• A graph is connected if there is a path between every pair of vertices
[Figures: a connected graph; a non-connected graph with two connected components]
Some Properties
Notation
V          set of vertices
E          set of edges
|. . .|    the set size
degree()   degree of a vertex
Property 1
Σ_v degree(v) = 2*|E|
Proof: each undirected edge is counted twice (this is called the handshaking lemma)
Property 2
In an undirected graph with no self-loops and no multiple edges:
|E| ≤ |V|(|V| - 1)/2
Proof: each vertex has degree at most (|V| - 1)
Example (graph with vertices a, b, c, d)
o |V| = 4
o |E| = 6
o degree(a) = 3
3. Implementing Graphs
• We will typically express running times in terms of |E| and |V| (often dropping the |’s)
o If |E| ≈ |V|^2 the graph is dense
• can also write this as |E| is O(|V|^2)
o If |E| ≈ |V| the graph is sparse
• or |E| is O(|V|)
• Dense and sparse graphs are best implemented using two different data structures:
o Adjacency matrices: for dense graphs
o Adjacency lists: for sparse graphs
Dense Big-Oh
• In the densest simple undirected graph, a graph of |V| vertices will have |V|(|V|-1)/2 edges.
• In that case, for large |V|, |E| is O(|V|^2)
• Example: |V| = 5, so |E| = (5*4)/2 = 10
• Proof (by induction) that the densest graph of n nodes has n(n-1)/2 edges. Write this as S(n) = n(n-1)/2
• Basis. S(2) = 1. True.
• Inductive Case.
o assume S(n) = n(n-1)/2   (1)
o try to show S(n+1) = (n+1)n/2   (2)
o we know: S(n+1) = S(n) + n, since the new node gets one edge to each of the n existing nodes
o which is S(n+1) = n(n-1)/2 + n, using (1)
o which is S(n+1) = (n+1)n/2, which is (2)
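The algebra in the last step of the inductive case can be written out in full. This is only a worked expansion of that step, using the same S(n) as the proof:

```latex
\begin{align*}
S(n+1) &= S(n) + n \\
       &= \frac{n(n-1)}{2} + n \\
       &= \frac{n(n-1) + 2n}{2} \\
       &= \frac{n^2 + n}{2} \;=\; \frac{(n+1)n}{2}
\end{align*}
```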
3.1. Adjacency Matrix
[Figure: graph with vertices a, b, c, d, e]
Adjacency Matrix
    a b c d e
a   0 1 0 0 1
b   1 0 1 0 1
c   0 1 1 0 1
d   0 0 0 0 1
e   1 1 1 1 0
Properties
• An adjacency matrix represents the graph as a |V| x |V| matrix A:
o A[i, j] = 1 if edge (i, j) ∈ E
o A[i, j] = 0 if edge (i, j) ∉ E
• The degree of a vertex v (of a simple graph) = sum of row v or sum of column v
o e.g. vertex a has degree 2 since it is connected to b and e
• An adjacency matrix can represent loops
o e.g. vertex c on the previous slide, where A[c, c] = 1
• An adjacency matrix can represent parallel edges if non-negative integers are allowed as matrix entries
o ijth entry = no. of edges between vertex i and j
• For an undirected graph, the matrix duplicates information around the main diagonal
o the size can be easily reduced with some coding tricks
• Properties of graphs can be obtained using matrix operations
o e.g. the no. of paths of a given length, and vertex degree
The No. of Paths of Length n
• If an adjacency matrix A is multiplied by itself repeatedly:
o A, A^2, A^3, ..., A^n
then the ijth entry in matrix A^n is equal to the number of paths from i to j of length n (these paths may revisit vertices and edges).
Example
[Figure: graph with vertices a, b, c, d, e]
A =
    a b c d e
a   0 1 0 1 0
b   1 0 1 0 1
c   0 1 0 1 1
d   1 0 1 0 0
e   0 1 1 0 0
A^2 = A * A =
    a b c d e
a   2 0 2 0 1
b   0 3 1 2 1
c   2 1 3 0 1
d   0 2 0 2 1
e   1 1 1 1 2
Why it Works...
• Consider row a, column c in A^2:
o row a of A = ( 0 1 0 1 0 )
o column c of A = ( 0 1 0 1 1 )
o A^2[a, c] = 0*0 + 1*1 + 0*0 + 1*1 + 0*1 = 2
• A non-zero product means there is at least one vertex connecting vertices a and c.
• The sum is 2 because of:
o (a, b, c) and (a, d, c)
o 2 paths of length two
The Degree of Vertices
• The entries on the main diagonal of A^2 give the degrees of the vertices (when A represents a simple graph).
• Consider vertex c:
o degree of c == 3 since it is an endpoint of the edges (c,b), (c,d), and (c,e).
• In A^2 these become paths of length 2:
o (c,b,c), (c,d,c), and (c,e,c)
• So the number of paths of length 2 from c back to c = the degree of c
o this is true for all vertices
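The two uses of A^2 described above (counting paths of length 2, and reading vertex degrees off the main diagonal) can be sketched in C. This is a minimal sketch, not the course's own code: it hard-codes the five-vertex example matrix, indexes the vertices a..e as 0..4, and the function names pathsOfLength2() and degree() are made up for this illustration.

```c
#include <assert.h>

#define NUMNODES 5   /* vertices a..e of the example, indexed 0..4 */

/* Adjacency matrix A of the example graph (rows and columns in the order a, b, c, d, e). */
static const int A[NUMNODES][NUMNODES] = {
    {0, 1, 0, 1, 0},
    {1, 0, 1, 0, 1},
    {0, 1, 0, 1, 1},
    {1, 0, 1, 0, 0},
    {0, 1, 1, 0, 0}
};

/* Entry (i, j) of A*A: the number of paths of length 2 from i to j. */
int pathsOfLength2(int i, int j)
{
    assert(i >= 0 && i < NUMNODES && j >= 0 && j < NUMNODES);
    int sum = 0;
    for (int k = 0; k < NUMNODES; k++)
        sum += A[i][k] * A[k][j];   /* 1 only when i-k and k-j are both edges */
    return sum;
}

/* Degree of v, read off the main diagonal of A*A (valid for simple graphs only). */
int degree(int v)
{
    return pathsOfLength2(v, v);
}
```

With this indexing, pathsOfLength2(0, 2) is the row a, column c entry worked through above, and degree(2) is the degree of vertex c.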
Coding Adjacency Matrices
• #define NUMNODES n   /* n = the number of vertices, |V| */
int arcs[NUMNODES][NUMNODES];
• arcs[u][v] == 1 if there is an edge (u,v); 0 otherwise
• Storage used: O(|V|^2)
• The implementation may also need a way to map node names (strings) to array indices.
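One possible way to combine the arcs[][] array with a mapping from node names to array indices is sketched below. It is an illustration under assumptions: the graph is undirected, NUMNODES is fixed at 5, the names table and the functions nodeIndex(), addEdge() and hasEdge() are hypothetical, and the name lookup is a plain linear search.

```c
#include <assert.h>
#include <string.h>

#define NUMNODES 5

/* Hypothetical node names; the position in this table is the array index. */
static const char *names[NUMNODES] = {"a", "b", "c", "d", "e"};

/* Adjacency matrix; static storage is zero-initialised, so there are no edges yet. */
static int arcs[NUMNODES][NUMNODES];

/* Map a node name to its array index, or -1 if the name is unknown. */
int nodeIndex(const char *name)
{
    for (int i = 0; i < NUMNODES; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    return -1;
}

/* Record an undirected edge between two named nodes (both halves of the matrix). */
void addEdge(const char *from, const char *to)
{
    int u = nodeIndex(from);
    int v = nodeIndex(to);
    assert(u >= 0 && v >= 0);   /* both names must be known */
    arcs[u][v] = 1;
    arcs[v][u] = 1;
}

/* 1 if there is an edge between the two named nodes, 0 otherwise. */
int hasEdge(const char *from, const char *to)
{
    int u = nodeIndex(from);
    int v = nodeIndex(to);
    assert(u >= 0 && v >= 0);
    return arcs[u][v];
}
```

For a directed graph, addEdge() would set only arcs[u][v]. A hash table would make the name lookup faster than this linear scan for large graphs.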
• If n is large then the array will be very large, and for an undirected graph almost half of it is unnecessary (the matrix is symmetric).
• If the nodes are lightly connected then most of the array will contain 0’s, which is a further waste of memory.
Representing Directed Graphs
• A directed graph:
[Figure: directed graph with vertices 0, 1, 2, 3, 4]
Its Adjacency Matrix
(rows = start vertex, columns = finish vertex)
     0 1 2 3 4
0    1 1 1 0 0
1    0 0 0 1 0
2    1 1 0 0 1
3    0 0 1 0 1
4    0 1 0 0 0
• Not symmetric; all the array may be necessary.
• Still a waste of space if nodes are lightly connected.
When to use an Adjacency Matrix
• The adjacency matrix is an efficient way to store dense graphs.
• But most large interesting graphs are sparse
o e.g., planar graphs, in which no edges cross, have |E| = O(|V|) by Euler’s formula
o For this reason the adjacency list is often a better representation than the adjacency matrix
Euler’s Formula (Euler Characteristic)
• Euler (1752) proved that for any connected planar graph, where:
F = no. of faces
E = no. of edges
V = no. of vertices/nodes
then the formula holds:
F = E – V + 2
• Example: F = 5; E = 9; V = 6, and 9 – 6 + 2 = 5
3.2. Adjacency List
• Adjacency list: for each vertex v ∈ V, store a list of vertices adjacent to v
• Example:
o adj[0] = {0, 1, 2}
o adj[1] = {3}
o adj[2] = {0, 1, 4}
o adj[3] = {2, 4}
o adj[4] = {1}
[Figure: directed graph with vertices 0, 1, 2, 3, 4]
• Can be used for directed and undirected graphs.
• An implementation diagram:
adj[0] -> 0 -> 1 -> 2 -> NULL
adj[1] -> 3 -> NULL
adj[2] -> 0 -> 1 -> 4 -> NULL
adj[3] -> 2 -> 4 -> NULL
adj[4] -> 1 -> NULL
o size of array = no. of vertices (|V|)
o no. of cells == no. of edges (|E|)
Data Structures
• struct cell {            /* for a linked list */
      Node nodeName;       /* Node: the type used for node names */
      struct cell *next;
  };
  struct cell *adj[NUMNODES];
• adj[u] points to a linked list of cells which give the names of the nodes connected to u.
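The cell and adj[] declarations can be turned into a small working sketch. The assumptions here are not from the slides: Node is taken to be an int index, NUMNODES is fixed at 5, and addEdge(), outDegree() and totalCells() are hypothetical helper names. New cells are pushed on the front of a list, so each list holds its nodes in reverse insertion order.

```c
#include <assert.h>
#include <stdlib.h>

#define NUMNODES 5
typedef int Node;                 /* assumption: a node is named by its array index */

struct cell {                     /* for a linked list */
    Node nodeName;
    struct cell *next;
};

static struct cell *adj[NUMNODES];   /* static storage: every list starts as NULL */

/* Add the directed edge u -> v by pushing v on the front of u's list.
   Returns 0 on success, -1 if memory runs out. */
int addEdge(Node u, Node v)
{
    assert(u >= 0 && u < NUMNODES && v >= 0 && v < NUMNODES);
    struct cell *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->nodeName = v;
    c->next = adj[u];
    adj[u] = c;
    return 0;
}

/* Number of cells in u's list, i.e. the out-degree of u. */
int outDegree(Node u)
{
    assert(u >= 0 && u < NUMNODES);
    int count = 0;
    for (const struct cell *c = adj[u]; c != NULL; c = c->next)
        count++;
    return count;
}

/* Total number of cells over all lists; for a directed graph this is |E|. */
int totalCells(void)
{
    int total = 0;
    for (Node u = 0; u < NUMNODES; u++)
        total += outDegree(u);
    return total;
}
```

Loading the ten edges of the example (adj[0] = {0, 1, 2}, adj[1] = {3}, and so on) gives one cell per edge, matching the "no. of cells == no. of edges" note on the implementation diagram.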
Storage Needs
• How much storage is required?
o The degree of a vertex v == number of incident edges
• directed graphs have in-degree, out-degree values
• For directed graphs, the number of items in the adjacency lists is
Σ_v out-degree(v) = |E|
• This uses Θ(V + E) storage
• For undirected graphs, the number of items in the adjacency lists is
Σ_v degree(v) = 2*|E|
(the handshaking lemma)
o Why? If we mark every edge connected to every vertex, then by the end, every edge will be marked twice
• This also uses Θ(V + E) storage
• In summary, adjacency lists use Θ(V + E) storage
3.3. Running Time: Matrix or List?
• Which representation is better for graphs?
• The simple answer:
• dense graph – use a matrix
• sparse graph – use an adjacency list
• But a more accurate answer depends on the
operations that will be applied to the graph.
• We will consider three operations:
o is there an edge between u and v?
o find the successors of u (in a directed graph)
o find the predecessors of u (in a directed graph)
Is there an edge (u,v)?
• Adjacency matrix: O(1) to read arcs[u][v]
• Adjacency list: O(1 + E/V)    // forget the |...|
o O(1) to get to adj[u]
o length of linked list is on average E/V
o if a sparse graph (E ≈ V): O(1 + E/V) => O(1)
o if a dense graph (E ≈ V^2): O(1 + E/V) => O(V)
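The difference between the two edge tests can be seen side by side in code. The following is a sketch under assumptions: Node is an int index, NUMNODES is 5, both representations are kept as globals, and the function names matrixHasEdge() and listHasEdge() are invented for this comparison.

```c
#include <assert.h>
#include <stddef.h>

#define NUMNODES 5
typedef int Node;                 /* assumption: a node is named by its array index */

struct cell {                     /* one linked-list cell per edge */
    Node nodeName;
    struct cell *next;
};

int arcs[NUMNODES][NUMNODES];     /* adjacency matrix */
struct cell *adj[NUMNODES];       /* adjacency lists  */

/* Matrix lookup: a single array read, O(1). */
int matrixHasEdge(Node u, Node v)
{
    assert(u >= 0 && u < NUMNODES && v >= 0 && v < NUMNODES);
    return arcs[u][v] == 1;
}

/* List lookup: O(1) to reach adj[u], then a walk along u's list,
   whose length is E/V on average. */
int listHasEdge(Node u, Node v)
{
    assert(u >= 0 && u < NUMNODES && v >= 0 && v < NUMNODES);
    for (const struct cell *c = adj[u]; c != NULL; c = c->next)
        if (c->nodeName == v)
            return 1;
    return 0;
}
```

The list version stops as soon as v is found, so the E/V figure is an average over the lists, not a bound for any single lookup.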
Find u’s successors (u->v)
• Adjacency matrix: O(V) since must examine the entire row for vertex u
• Adjacency list: O(1 + (E/V)) on average since must look at the entire list pointed to by adj[u]
o if a sparse graph (E ≈ V): O(1 + E/V) => O(1)
o if a dense graph (E ≈ V^2): O(1 + E/V) => O(V)
Find u’s predecessors (t->u)
• Adjacency matrix: O(V) since must examine the entire column for vertex u
o a 1 in the row for ‘t’ means that ‘t’ is a predecessor
• Adjacency list: O(V + E) since must examine every list pointed to by adj[]; this is O(E) once E is at least V
o if a sparse graph (E ≈ V): O(E) is fast
o if a dense graph (E ≈ V^2): O(E) is slow
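Finding predecessors in an adjacency list means scanning every list, which the sketch below makes explicit. It rests on assumptions not in the slides: Node is an int index, cells come from a fixed pool of MAXEDGES entries instead of malloc, and addEdge() and predecessors() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NUMNODES 5
#define MAXEDGES 32               /* assumption: a small fixed pool of cells */
typedef int Node;                 /* assumption: a node is named by its array index */

struct cell {
    Node nodeName;
    struct cell *next;
};

static struct cell pool[MAXEDGES];   /* storage for the cells */
static int used;                     /* number of pool cells handed out so far */
static struct cell *adj[NUMNODES];   /* every list starts as NULL */

/* Add the directed edge u -> v at the front of u's list. */
void addEdge(Node u, Node v)
{
    assert(u >= 0 && u < NUMNODES && v >= 0 && v < NUMNODES);
    assert(used < MAXEDGES);
    struct cell *c = &pool[used++];
    c->nodeName = v;
    c->next = adj[u];
    adj[u] = c;
}

/* Store every t with an edge t -> u in preds[], in increasing order of t,
   and return how many were found. Every list is scanned: O(V + E). */
int predecessors(Node u, Node preds[NUMNODES])
{
    assert(u >= 0 && u < NUMNODES);
    int count = 0;
    for (Node t = 0; t < NUMNODES; t++) {
        for (const struct cell *c = adj[t]; c != NULL; c = c->next) {
            if (c->nodeName == u) {
                preds[count++] = t;
                break;            /* t is recorded once, move to the next list */
            }
        }
    }
    return count;
}
```

With the example lists adj[0] = {0, 1, 2}, adj[1] = {3}, adj[2] = {0, 1, 4}, adj[3] = {2, 4} and adj[4] = {1}, the predecessors of vertex 1 are 0, 2 and 4. If predecessor queries are frequent, a second set of lists holding the reversed edges avoids the full scan, at the cost of doubling the cell storage.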
Summary: which is faster?
Operation     Dense Graph    Sparse Graph
Find edge     Adj. Matrix    Either
Find succ.    Either         Adj. list
Find pred.    Adj. Matrix    Either
• As a graph gets denser, an adjacency matrix has better execution time than an adjacency list.
3.4. Storage Space: Matrix or List?
• The size of an adjacency matrix for a graph of V nodes is:
o V^2 bits (assuming 0 and 1 are stored as bits)
• An adjacency list cell uses:
o 32 bits for the integer, 32 bits for the pointer
o so, cell size = 64 bits
• Total no. of cells = total no. of edges, E
o so, total size of lists = 64*E bits
• The array of list heads, successors[] (the adj[] array of section 3.2), has V entries (for V vertices)
o so, array size is 32*V bits
• Total size of an adjacency list data struct: 64*E + 32*V bits
Size Comparison
• An adjacency list will use less storage than an adjacency matrix when:
64*E + 32*V < V^2
which is: E < V^2/64 – V/2
When V is large, ignore the V/2 term:
E < V^2/64
• V^2 is (roughly) the maximum number of edges.
• So if the actual number of edges in a graph is less than 1/64 of the maximum number of edges, then an adj. list representation will be smaller than an adj. matrix coding
o but the graph must be quite sparse
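The size comparison can be checked numerically with a few lines of C. The sketch keeps the slides' assumptions (1 bit per matrix entry, 32-bit integers and 32-bit pointers); the function names matrixBits(), listBits() and listIsSmaller() are made up here.

```c
#include <assert.h>

/* Bits used by a V x V adjacency matrix with one bit per entry. */
unsigned long long matrixBits(unsigned long long v)
{
    assert(v <= 4000000000ULL);   /* keeps v*v within 64 bits */
    return v * v;
}

/* Bits used by adjacency lists: 64 bits per cell (one cell per edge)
   plus 32 bits per entry in the array of list heads. */
unsigned long long listBits(unsigned long long v, unsigned long long e)
{
    return 64ULL * e + 32ULL * v;
}

/* 1 if the adjacency list representation is strictly smaller than the matrix. */
int listIsSmaller(unsigned long long v, unsigned long long e)
{
    return listBits(v, e) < matrixBits(v);
}
```

For V = 1000 the matrix needs 1,000,000 bits, and the break-even point is E = 1000^2/64 – 1000/2 = 15125: listIsSmaller(1000, 15124) is 1, while listIsSmaller(1000, 15125) is 0 because both representations then take exactly 1,000,000 bits.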
4. Trees and Forests
• A (free) tree is an undirected graph T such that
o T is connected
o T has no cycles
This definition of tree is different from the one of a rooted tree
• A forest is an undirected graph without cycles
• The connected components of a forest are trees
[Figures: a tree; a forest]
Uses of Trees
[Figure: organisation chart drawn as a tree, with President at the root; the other nodes are Vice-President for Academics, Vice-President for Admin., Dean of Engineering, Dean of Business, Head of CoE, Head of EE, Head of AC., Planning Officer, Purchases Officer, ....]
Saturated Hydrocarbons
[Figure: the carbon (C) and hydrogen (H) atoms of Butane and Isobutane drawn as trees]
• Non-rooted (free) trees
o a free tree is a connected graph with no cycles
A Computer File System
[Figure: directory tree rooted at /, with entries usr, bin, ed, ad, vi, bin, spool, exs, opr, ls, mail, tmp, who, junk, uucp, printer]
5. (Rooted) Tree Terminology
• e.g. Part of the ancient Greek god family:
[Figure: rooted tree with Uranus at level 0 and levels numbered 0 to 3; the other vertices are Aphrodite, Kronos, Atlas, Prometheus, Eros, Zeus, Poseidon, Hades, Ares, Apollo, Athena, Hermes and Heracles]
Some Definitions
• Let T be a tree with root v_0.
• Suppose that x, y, z are vertices in T.
• (v_0, v_1,..., v_n) is a simple path in T (no loops).
• a) v_{n-1} is the parent of v_n.
• b) v_0, ..., v_{n-1} are ancestors of v_n
• c) v_n is a child of v_{n-1}
• d) If x is an ancestor of y, then y is a descendant of x.
• e) If x and y are children of z, then x and y are siblings.
• f) If x has no children, then x is a terminal vertex (or a leaf).
• g) If x is not a terminal vertex, then x is an internal (or
branch) vertex.
• h) The subtree of T rooted at x is the graph with vertex set V and edge set E
o V contains x and all the descendants of x
o E = {e | e is an edge on a simple path from x to some vertex in V}
• i) The length of a path is the number of edges it uses, not vertices.
• j) The level of a vertex x is the length of the simple path from
the root to x.
• k) The height of a vertex x is the length of the simple path
from x to the farthest leaf
o the height of a tree is the height of its root
• l) A tree where every internal vertex has exactly m children is
called a full m-ary tree.
Applied to the Example
• The root is Uranus.
• A simple path is {Uranus, Aphrodite, Eros}
• The parent of Eros is Aphrodite.
• The ancestors of Hermes are Zeus, Kronos, and
Uranus.
• The children of Zeus are Apollo, Athena, Hermes,
and Heracles.
• The descendants of Kronos are Zeus, Poseidon, Hades, Ares, Apollo, Athena, Hermes, and Heracles.
• The leaves (terminal vertices) are Eros, Apollo, Athena, Hermes, Heracles, Poseidon, Hades, Ares, Atlas, and Prometheus.
• The branches (internal vertices) are Uranus, Aphrodite, Kronos, and Zeus.
• The subtree rooted at Kronos:
Kronos
   Zeus
      Apollo
      Athena
      Hermes
      Heracles
   Poseidon
   Hades
   Ares
• The length of the path {Uranus, Aphrodite, Eros} is 2
(not 3).
• The level of Ares is 2.
• The height of the tree is 3.
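The level and height definitions translate directly into code if a rooted tree is stored as a parent array. This is a sketch under assumptions: vertices are numbered 0..n-1, parent[x] holds the parent of x, the root is marked with the made-up constant ROOT, and level() and treeHeight() are hypothetical names.

```c
#include <assert.h>

#define ROOT (-1)   /* assumption: the root's entry in the parent array */

/* Level of x: the number of edges on the simple path from the root to x. */
int level(const int parent[], int x)
{
    assert(x >= 0);
    int edges = 0;
    while (parent[x] != ROOT) {   /* climb one edge at a time towards the root */
        x = parent[x];
        edges++;
    }
    return edges;
}

/* Height of the tree: the height of its root, which equals
   the largest level of any vertex. */
int treeHeight(const int parent[], int n)
{
    int best = 0;
    for (int x = 0; x < n; x++) {
        int l = level(parent, x);
        if (l > best)
            best = l;
    }
    return best;
}
```

As a usage example, number part of the god family as 0 Uranus, 1 Aphrodite, 2 Kronos, 3 Eros, 4 Zeus, 5 Ares, 6 Apollo, with parent array {ROOT, 0, 0, 1, 2, 2, 4}. Then level(parent, 5) gives the level of Ares and treeHeight(parent, 7) gives the height of the tree, agreeing with the values 2 and 3 above.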