Sedgewick-Directed-Graphs-1983

32. Directed Graphs
Directed graphs are graphs in which edges connecting nodes are one-way; this added structure makes it more difficult to determine various
properties. Processing such graphs is akin to traveling around in a city with
many one-way streets or to traveling around in a country where airlines rarely
run round-trip routes: getting from one point to another in such situations
can be a challenge indeed.
Often the edge direction reflects some type of precedence relationship in
the application being modeled. For example, a directed graph might be used
to model a manufacturing line, with nodes corresponding to jobs to be done
and with an edge from node x to node y if the job corresponding to node x
must be done before the job corresponding to node y. How do we decide when
to perform each of the jobs so that none of these precedence relationships are
violated?
In this chapter, we’ll look at depth-first search for directed graphs, as well
as algorithms for computing the transitive closure (which summarizes connectivity information) and for topological sorting and for computing strongly
connected components (which have to do with precedence relationships).
As mentioned in Chapter 29, representations for directed graphs are
simple extensions of representations for undirected graphs. In the adjacency
list representation, each edge appears only once: the edge from x to y is
represented as a list node containing y in the linked list corresponding to x.
In the adjacency matrix representation, we need to maintain a full V-by-V
matrix, with a 1 bit in row x and column y (but not necessarily in row y and
column x) if there is an edge from x to y.
A directed graph similar to the undirected graph that we’ve been considering is drawn below. This graph consists of the edges AG AB CA LM JM
JL JK ED DF HI FE AF GE GC HG GJ LG IH ML.
The order in which the edges appear is now significant: the notation AG
describes an edge which points from A to G, but not from G to A. But it is
possible to have two edges between two nodes, one in either direction (we have
both HI and IH and both LM and ML in the above graph).
Note that, in these representations, no difference could be perceived
between an undirected graph and a directed graph with two opposite directed
edges for each edge in the undirected graph. Thus, some of the algorithms in this
chapter can be considered generalizations of algorithms in previous chapters.
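The two representations described above can be sketched in Python as follows. This is only an illustration: the edge list is assumed to be the one given for the sample graph, the names EDGES, NAMES, index, adj and a are hypothetical, and inserting each new node at the front of its list is an assumption meant to imitate a linked list built in input order.

```python
# Sample graph of this chapter (assumed edge list, each edge written "xy"
# for an edge pointing from x to y).
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
V = len(NAMES)
index = {name: i for i, name in enumerate(NAMES)}

# Adjacency lists: the edge from x to y appears once, in the list for x only.
adj = {name: [] for name in NAMES}
for x, y in EDGES:
    adj[x].insert(0, y)      # new list nodes go at the front (assumption)

# Adjacency matrix: a[x][y] is True if there is an edge from x to y.
a = [[False] * V for _ in range(V)]
for x, y in EDGES:
    a[index[x]][index[y]] = True

print(adj["A"])                       # list for A
print(a[index["A"]][index["G"]])      # edge from A to G
print(a[index["G"]][index["A"]])      # but no edge from G to A
```

Unlike the undirected case, setting a[x][y] does not imply setting a[y][x]: both entries are set only when both directions are given explicitly, as for H and I here.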
Depth-First Search
The depth-first search algorithm of Chapter 29 works properly for directed
graphs exactly as given. In fact, its operation is a little more straightforward
than for undirected graphs because we don’t have to be concerned with double
edges between nodes unless they’re explicitly included in the graph. However,
the search trees have a somewhat more complicated structure. For example,
the following depth-first search structure describes the operation of the recursive algorithm of Chapter 29 on our sample graph.
As before, this is a redrawn version of the graph, with solid edges corresponding to those edges that were actually used to visit vertices via recursive calls
and dotted edges corresponding to those edges pointing to vertices that had
already been visited at the time the edge was considered. The nodes are
visited in the order A F E D B G J K L M C H I.
Note that the directions on the edges make this depth-first search forest
quite different from the depth-first search forests that we saw for undirected
graphs. For example, even though the original graph was connected, the
depth-first search structure defined by the solid edges is not connected: it is
a forest, not a tree.
For undirected graphs, we had only one kind of dotted edge, one that
connected a vertex with some ancestor in the tree. For directed graphs, there
are three kinds of dotted edges: up edges, which point from a vertex to some
ancestor in the tree, down edges, which point from a vertex to some descendant
in the tree, and cross edges, which point from a vertex to some vertex which
is neither a descendant nor an ancestor in the tree.
As with undirected graphs, we’re interested in connectivity properties of
directed graphs. We would like to be able to answer questions like “Is there
a directed path from vertex x to vertex y (a path which only follows edges in
the indicated direction)?” and “Which vertices can we get to from vertex x
with a directed path?” and “Is there a directed path from vertex x to vertex
y and a directed path from y to x?” Just as with undirected graphs, we’ll be
able to answer such questions by appropriately modifying the basic depth-first
search algorithm, though the various different types of dotted edges make the
modifications somewhat more complicated.
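As a rough illustration of the three kinds of dotted edges, the following Python sketch runs depth-first search on the sample graph and labels every edge. The classification rule used here (an edge to a vertex whose visit has not yet finished is an up edge; an edge to a finished vertex is a down edge if that vertex was visited later, and a cross edge otherwise) is one common way to tell them apart and is not taken from the text. The names done, order and kinds are hypothetical, and the adjacency lists are assumed to be built by front insertion.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
adj = {v: [] for v in NAMES}
for x, y in EDGES:
    adj[x].insert(0, y)              # front insertion (assumption)

val = {v: 0 for v in NAMES}          # order of first visit; 0 means unvisited
done = set()                         # vertices whose visit has finished
order = []                           # vertices in the order visited
kinds = {"tree": [], "up": [], "down": [], "cross": []}
now = 0

def visit(k):
    global now
    now += 1
    val[k] = now
    order.append(k)
    for t in adj[k]:
        if val[t] == 0:
            kinds["tree"].append(k + t)     # solid edge: a recursive call
            visit(t)
        elif t not in done:
            kinds["up"].append(k + t)       # t is an ancestor of k
        elif val[t] > val[k]:
            kinds["down"].append(k + t)     # t is a descendant of k
        else:
            kinds["cross"].append(k + t)    # neither ancestor nor descendant
    done.add(k)

for k in NAMES:                      # restart from each unvisited vertex
    if val[k] == 0:
        visit(k)

print(" ".join(order))
for kind, edges in kinds.items():
    print(kind, " ".join(edges))
```

Under these assumptions the visiting order printed agrees with the order A F E D B G J K L M C H I quoted above.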
Transitive Closure
In undirected graphs, simple connectivity gives the vertices that can be reached
from a given vertex by traversing edges from the graph: they are all those in
the same connected component. Similarly, for directed graphs, we’re often
interested in the set of vertices which can be reached from a given vertex by
traversing edges from the graph in the indicated direction.
It is easy to prove that the recursive visit procedure from the depth-first
search method in Chapter 29 visits all the nodes that can be reached from the
start node. Thus, if we modify that procedure to print out the nodes that it is
visiting (say, by inserting write(name(k)) just upon entering), we are printing
out all the nodes that can be reached from the start node. But note carefully
that it is not necessarily true that each tree in the depth-first search forest
contains all the nodes that can be reached from the root of that tree (in our
example, all the nodes in the graph can be reached from H, not just I). To
get all the nodes that can be visited from each node, we simply call visit V
times, once for each node:
    for k:=1 to V do
      begin
      now:=0;
      for j:=1 to V do val[j]:=0;
      visit(k);
      writeln
      end;
This program produces the following output for our sample graph:
A F E D B G J K L M C
B
C A F E D B G J K L M
D F E
E D F
F E D
G J K L M C A F E D B
H G J K L M C A F E D B I
I H G J K L M C A F E D B
J K L G C A F E D B M
K
L G J K M C A F E D B
M L G J K C A F E D B
For undirected graphs, this computation would produce a table with the
property that each line corresponding to the nodes in a connected component
lists all the nodes in that component. The table above has a similar property:
certain of the lines list identical sets of nodes. Below we shall examine the
generalization of connectedness that explains this property.
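The computation of the table can be sketched in Python as follows. This is a minimal sketch rather than the book's program: reachable is a hypothetical helper that plays the role of resetting val and calling visit, and the adjacency lists are assumed to be built by inserting each edge at the front of its list.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
adj = {v: [] for v in NAMES}
for x, y in EDGES:
    adj[x].insert(0, y)              # front insertion (assumption)

def reachable(adj, start):
    """Return the vertices reachable from start, in depth-first visit order."""
    seen, order = set(), []
    def visit(k):
        seen.add(k)
        order.append(k)              # "write out" k upon entering
        for t in adj[k]:
            if t not in seen:
                visit(t)
    visit(start)
    return order

table = {k: reachable(adj, k) for k in NAMES}   # a fresh search for each k
for k in NAMES:
    print(" ".join(table[k]))
```

Comparing set(table["A"]) with set(table["G"]) shows the property noted above: the two lines list the same set of nodes, though in different orders.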
As usual, we could add code to do extra processing rather than just
writing out the table. One operation we might want to perform is to add an
edge directly from x to y if there is some way to get from x to y. The graph
which results from adding all edges of this form to a directed graph is called
the transitive closure of the graph. Normally, a large number of edges will be
added and the transitive closure is likely to be dense, so an adjacency matrix
representation is called for. This is an analogue to connected components in
an undirected graph; once we’ve performed this computation once, then we
can quickly answer questions like “Is there a way to get from x to y?”
Using depth-first search to compute the transitive closure requires V³
steps in the worst case, since we may have to examine every bit of the
adjacency matrix for the depth-first search from each vertex. There is a
remarkably simple nonrecursive program for computing the transitive closure
of a graph represented with an adjacency matrix:
    for y:=1 to V do
      for x:=1 to V do
        if a[x, y] then
          for j:=1 to V do
            if a[y, j] then a[x, j]:=true;
S. Warshall invented this method in 1962, using the simple observation that
“if there’s a way to get from node x to node y and a way to get from node y to
node j then there’s a way to get from node x to node j.” The trick is to make
this observation a little stronger, so that the computation can be done in only
one pass through the matrix, to wit: “if there’s a way to get from node x to
node y using only nodes with indices less than y and a way to get from node
y to node j using only such nodes, then there’s a way to get from node x to
node j using only nodes with indices less than y+1.” The above program is a
direct implementation of this: y is the index of the outer loop, so each pass
admits one more node as an intermediate stop.
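For readers who want to experiment, here is a minimal Python sketch of the same triple loop, run on the sample graph. The function name warshall and the variables index, a and closure are hypothetical, the sketch works on a copy rather than in place, and the diagonal is set to true on the assumption that every node is considered reachable from itself.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
V = len(NAMES)
index = {name: i for i, name in enumerate(NAMES)}

# Adjacency matrix with 1s on the diagonal (assumption: x reaches x).
a = [[x == y for y in range(V)] for x in range(V)]
for x, y in EDGES:
    a[index[x]][index[y]] = True

def warshall(a):
    """Return the transitive closure of the boolean matrix a as a new matrix."""
    n = len(a)
    c = [row[:] for row in a]        # work on a copy; a is left untouched
    for y in range(n):               # y: newest node allowed as a stop
        for x in range(n):
            if c[x][y]:              # x can get to y ...
                for j in range(n):
                    if c[y][j]:      # ... and y can get to j,
                        c[x][j] = True   # so x can get to j
    return c

closure = warshall(a)
for name, row in zip(NAMES, closure):
    print(name, " ".join("1" if bit else "0" for bit in row))
```

The order of the loops matters: y must be the outermost index so that, by the time node y is used as an intermediate stop, all paths through lower-numbered nodes have already been recorded.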
Warshall’s method converts the adjacency matrix for our sample graph,
given at left in the table below, into the adjacency matrix for its transitive
closure, given at the right:
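       A B C D E F G H I J K L M          A B C D E F G H I J K L M
    A  1 1 0 0 0 1 1 0 0 0 0 0 0       A  1 1 1 1 1 1 1 0 0 1 1 1 1
    B  0 1 0 0 0 0 0 0 0 0 0 0 0       B  0 1 0 0 0 0 0 0 0 0 0 0 0
    C  1 0 1 0 0 0 0 0 0 0 0 0 0       C  1 1 1 1 1 1 1 0 0 1 1 1 1
    D  0 0 0 1 0 1 0 0 0 0 0 0 0       D  0 0 0 1 1 1 0 0 0 0 0 0 0
    E  0 0 0 1 1 0 0 0 0 0 0 0 0       E  0 0 0 1 1 1 0 0 0 0 0 0 0
    F  0 0 0 0 1 1 0 0 0 0 0 0 0       F  0 0 0 1 1 1 0 0 0 0 0 0 0
    G  0 0 1 0 1 0 1 0 0 1 0 0 0       G  1 1 1 1 1 1 1 0 0 1 1 1 1
    H  0 0 0 0 0 0 1 1 1 0 0 0 0       H  1 1 1 1 1 1 1 1 1 1 1 1 1
    I  0 0 0 0 0 0 0 1 1 0 0 0 0       I  1 1 1 1 1 1 1 1 1 1 1 1 1
    J  0 0 0 0 0 0 0 0 0 1 1 1 1       J  1 1 1 1 1 1 1 0 0 1 1 1 1
    K  0 0 0 0 0 0 0 0 0 0 1 0 0       K  0 0 0 0 0 0 0 0 0 0 1 0 0
    L  0 0 0 0 0 0 1 0 0 0 0 1 1       L  1 1 1 1 1 1 1 0 0 1 1 1 1
    M  0 0 0 0 0 0 0 0 0 0 0 1 1       M  1 1 1 1 1 1 1 0 0 1 1 1 1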
For very large graphs, this computation can be organized so that the
operations on bits can be done a computer word at a time, which will lead to
significant savings in many environments. (As we’ve seen, it is not intended
that such optimizations be tried with Pascal.)
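The word-at-a-time idea can be illustrated in Python, where an integer can stand in for a row of bits. This is a sketch under stated assumptions, not a tuned implementation: each row of the matrix is packed into one Python integer (bit j of rows[x] is set if there is an edge from x to j), the names warshall_bits and members are hypothetical, and Python integers are arbitrary-precision rather than true machine words.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
V = len(NAMES)
index = {name: i for i, name in enumerate(NAMES)}

# One integer per row; bit i of rows[i] is set (diagonal, by assumption).
rows = [1 << i for i in range(V)]
for x, y in EDGES:
    rows[index[x]] |= 1 << index[y]

def warshall_bits(rows):
    """Transitive closure where the inner loop over j is one OR of whole rows."""
    rows = list(rows)                    # leave the caller's list untouched
    for y in range(len(rows)):
        for x in range(len(rows)):
            if (rows[x] >> y) & 1:       # if a[x, y] then ...
                rows[x] |= rows[y]       # ... a[x, j] := true for every a[y, j]
    return rows

def members(row):
    """Names of the vertices whose bits are set in row."""
    return "".join(name for name in NAMES if (row >> index[name]) & 1)

closure = warshall_bits(rows)
for name in NAMES:
    print(name, members(closure[index[name]]))
```

The innermost loop of the bit-by-bit version has become a single "or" of two rows, which is where the savings come from when a row fits in a few machine words.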
Topological Sorting
For many applications involving directed graphs, cyclic graphs do arise. If,
however, the graph above modeled a manufacturing line, then it would imply,
say, that job A must be done before job G, which must be done before job
C, which must be done before job A. But such a situation is inconsistent:
for this and many other applications, directed graphs with no directed cycles
(cycles with all edges pointing the same way) are called for. Such graphs are
called directed acyclic graphs, or just dags for short. Dags may have many
cycles if the directions on the edges are not taken into account; their defining
property is simply that one should never get in a cycle by following edges in
the indicated direction. A dag similar to the directed graph above, with a
few edges removed or directions switched in order to remove cycles, is given
below.
The edge list for this graph is the same as for the connected graph of Chapter
30, but here, again, the order in which the vertices are given when the edge
is specified makes a difference.
Dags really are quite different objects from general directed graphs: in
a sense, they are part tree, part graph. We can certainly take advantage of
their special structure when processing them. Viewed from any vertex, a dag
looks like a tree; put another way, the depth-first search forest for a dag has
no up edges. For example, the following depth-first search forest describes
the operation of dfs on the example dag above.
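The remark that the depth-first search forest for a dag has no up edges suggests a simple test, sketched below in Python. This is an illustration only: is_dag and the three-valued state table are hypothetical names, and the two small graphs are made-up examples rather than graphs from the text.

```python
def is_dag(adj):
    """Return True if the directed graph adj (vertex -> list of successors)
    has no directed cycle, that is, if depth-first search finds no up edge."""
    state = {v: 0 for v in adj}      # 0 unvisited, 1 being visited, 2 finished

    def visit(k):
        state[k] = 1
        for t in adj[k]:
            if state[t] == 1:        # t is an ancestor of k: an up edge
                return False
            if state[t] == 0 and not visit(t):
                return False
        state[k] = 2
        return True

    # Start a search from every vertex not reached by an earlier search.
    return all(state[v] != 0 or visit(v) for v in adj)

# Hypothetical examples: the second adds an edge from D back to A.
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cyclic = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
print(is_dag(dag), is_dag(cyclic))
```

Note that the edge from C to D in the first example is a cross edge, not an up edge: D is already finished when it is examined, so no cycle is reported.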
A fundamental operation on dags is to process the vertices of the graph
in such an order that no vertex is processed before any vertex that points
to it. For example, the nodes in the above graph could be processed in the
following order:
    J K L M A G H I F E D B C
If edges were to be drawn with the vertices in these positions, all the edges
would go from left to right. As mentioned above, this has obvious application,
for example, to graphs which represent manufacturing processes, for it gives a
specific way to proceed within the constraints represented by the graph. This
operation is called topological sorting, because it involves ordering the vertices
of the graph.
In general, the vertex order produced by a topological sort is not unique.
For example, the order
    A J G F K L E M B H C I D
is a legal topological ordering for our example (and there are many others).
In the manufacturing application mentioned, this situation occurs when one
job has no direct or indirect dependence on another and thus they can be
performed in either order.
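Whether a proposed order is a legal topological ordering is easy to check mechanically: every edge must go from an earlier position to a later one. The Python sketch below does this for a small made-up dag; the function name is_topological and the four-job example are hypothetical and are not the dag drawn in the text.

```python
def is_topological(edges, order):
    """Return True if every edge "xy" has x appearing before y in order."""
    position = {v: i for i, v in enumerate(order)}
    return all(position[x] < position[y] for x, y in edges)

# Hypothetical dag: job A precedes B and C, and both precede D.
edges = ["AB", "AC", "BD", "CD"]

print(is_topological(edges, "ABCD"))   # B before C
print(is_topological(edges, "ACBD"))   # C before B: also legal
print(is_topological(edges, "BACD"))   # B before A: violates edge AB
```

Jobs B and C have no direct or indirect dependence on each other, which is why both of the first two orders are acceptable.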
It is occasionally useful to interpret the edges in a graph the other way
around: to say that an edge directed from x to y means that vertex x
“depends” on vertex y. For example, the vertices might represent terms to be
defined in a programming language manual (or a book on algorithms!) with
an edge from x to y if the definition of x uses y. In this case, it would be
useful to find an ordering with the property that every term is defined before
it is used in another definition. This corresponds to positioning the vertices
in a line so that edges would all go from right to left. A reverse topological
order for our sample graph is:
    D E F C B I H G A K M L J
The distinction here is not crucial: performing a reverse topological sort on a
graph is equivalent to performing a topological sort on the graph obtained by
reversing all the edges.
But we’ve already seen an algorithm for reverse topological sorting, the
standard recursive depth-first search procedure of Chapter 29! Simply changing visit to print out the vertex visited just before exiting, for example by
inserting write(name(k)) right at the end, causes dfs to print out the vertices
in reverse topological order, when the input graph is a dag. A simple induction
argument proves that this works: we print out the name of each vertex after
we’ve printed out the names of all the vertices that it points to. When visit
is changed in this way and run on our example, it prints out the vertices in
the reverse topological order given above. Printing out the vertex name on
exit from this recursive procedure is exactly equivalent to putting the vertex
name on a stack on entry, then popping it and printing it on exit. It would
be ridiculous to use an explicit stack in this case, since the mechanism for
recursion provides it automatically; we mention this because we do need a
stack for the more difficult problem to be considered next.
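The change to visit described above can be sketched in Python as follows. This is a minimal sketch, assuming the input is a dag given as a dictionary of successor lists; reverse_topological is a hypothetical name, appending to a list stands in for printing, and the four-vertex dag is a made-up example rather than the one from the text.

```python
def reverse_topological(adj):
    """Return the vertices of the dag adj in reverse topological order."""
    visited, out = set(), []

    def visit(k):
        visited.add(k)
        for t in adj[k]:
            if t not in visited:
                visit(t)
        out.append(k)        # record k just before exiting visit

    for k in adj:            # cover every tree of the search forest
        if k not in visited:
            visit(k)
    return out

# Hypothetical dag: A points to B and C, and both point to D.
dag = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
rev = reverse_topological(dag)
print(" ".join(rev))              # every vertex follows all it points to
print(" ".join(reversed(rev)))    # reading backwards gives a topological order
```

Each vertex is appended only after all the vertices it points to have been appended, which is exactly the induction argument given above.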
Strongly Connected Components
If a graph contains a directed cycle (if we can get from a node back to itself
by following edges in the indicated direction), then it is not a dag and it
can’t be topologically sorted: whichever vertex on the cycle is printed out first
will have another vertex which points to it which hasn’t yet been printed out.
The nodes on the cycle are mutually accessible in the sense that there is a
way to get from every node on the cycle to another node on the cycle and
back. On the other hand, even though a graph may be connected, it is not
likely to be true that any node can be reached from any other via a directed
path. In fact, the nodes divide themselves into sets called strongly connected
components with the property that all nodes within a component are mutually
accessible, but there is no way to get from a node in one component to a node
in another component and back. The strongly connected components of the
directed graph at the beginning of this chapter are two single nodes B and K,
one pair of nodes H I, one triple of nodes D E F, and one large component with
six nodes A C G J L M. For example, vertex A is in a different component
from vertex F because though there is a path from A to F, there is no way to
get from F to A.
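The definition can be checked directly, if inefficiently, by computing what each node can reach and grouping nodes that are mutually accessible. The Python sketch below does this for the sample graph; it is a brute-force illustration of the definition and not the linear-time method, and the names reach, reachable and components are hypothetical.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
adj = {v: [] for v in NAMES}
for x, y in EDGES:
    adj[x].append(y)

def reach(adj, start):
    """Return the set of vertices reachable from start (including start)."""
    seen, stack = {start}, [start]
    while stack:
        k = stack.pop()
        for t in adj[k]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

reachable = {v: reach(adj, v) for v in NAMES}

components, assigned = [], set()
for x in NAMES:
    if x not in assigned:
        # y shares a component with x if each can be reached from the other
        comp = [y for y in NAMES if y in reachable[x] and x in reachable[y]]
        components.append("".join(comp))
        assigned.update(comp)

print(components)
```

For example, F is in reachable["A"] but A is not in reachable["F"], so the two vertices end up in different components.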
The strongly connected components of a directed graph can be found
using a variant of depth-first search, as the reader may have learned to expect.
The method that we’ll examine was discovered by R. E. Tarjan in 1972. Since
it is based on depth-first search, it runs in time proportional to V + E, but it is
actually quite an ingenious method. It requires only a few simple modifications
to our basic visit procedure, but before Tarjan presented the method, no linear
time algorithm was known for this problem, even though many people had
worked on it.
The modified version of depth-first search that we use to find the strongly
connected components of a graph is quite similar to the program that we
studied in Chapter 30 for finding biconnected components. The recursive
visit function given below uses the same min computation to find the highest
vertex reachable (via an up link) from any descendant of vertex k, but uses
the value of min in a slightly different way to write out the strongly connected
components:
    function visit(k: integer): integer;
      var t: link;
          m, min: integer;
      begin
      now:=now+1; val[k]:=now; min:=now;
      stack[p]:=k; p:=p+1;          { push k on entry }
      t:=adj[k];
      while t<>z do
        begin
        if val[t^.v]=0
          then m:=visit(t^.v)       { unvisited: search below it }
          else m:=val[t^.v];        { visited: use its val directly }
        if m<min then min:=m;
        t:=t^.next
        end;
      if min=val[k] then            { k is the first-visited vertex of its component }
        begin
        repeat
          p:=p-1; write(name(stack[p]));
          val[stack[p]]:=V+1        { high val: later links to it are ignored }
        until stack[p]=k;
        writeln
        end;
      visit:=min;
      end;
This program pushes the vertex names onto a stack on entry to visit, then
pops them and prints them on exit from visiting the last member of each
strongly connected component. The point of the computation is the test
whether min=val[k] at the end: if so, all vertices encountered since entry
(except those already printed out) belong to the same strongly connected
component as k. As usual, this program could easily be modified to do more
sophisticated processing than simply writing out the components.
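As one example of such a modification, here is a Python sketch that follows the same steps as the visit function but collects the components in a list instead of writing them out. It is a transcription under stated assumptions, not a definitive implementation: the names strong_components and low are hypothetical (low plays the role of min), the adjacency lists are assumed to be built by front insertion, and recursion depth limits in Python would matter for large graphs.

```python
EDGES = "AG AB CA LM JM JL JK ED DF HI FE AF GE GC HG GJ LG IH ML".split()
NAMES = "ABCDEFGHIJKLM"
adj = {v: [] for v in NAMES}
for x, y in EDGES:
    adj[x].insert(0, y)              # front insertion (assumption)

def strong_components(adj, names):
    V = len(names)
    val = {v: 0 for v in names}      # 0 means not yet visited
    stack, components = [], []
    now = 0

    def visit(k):
        nonlocal now
        now += 1
        val[k] = now
        low = now                    # corresponds to min in the Pascal
        stack.append(k)              # push k on entry
        for t in adj[k]:
            m = visit(t) if val[t] == 0 else val[t]
            if m < low:
                low = m
        if low == val[k]:            # k was the first vertex visited in its component
            comp = []
            while True:
                v = stack.pop()
                comp.append(v)
                val[v] = V + 1       # high val: later links to v are ignored
                if v == k:
                    break
            components.append(comp)
        return low

    for k in names:
        if val[k] == 0:
            visit(k)
    return components

for comp in strong_components(adj, NAMES):
    print(" ".join(comp))
```

Each printed line is one strongly connected component, with its vertices listed in the order in which they are popped from the stack.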
The method is based on two observations that we’ve actually already
made in other contexts. First, once we reach the end of a call to visit for
a vertex, then we won’t encounter any more vertices in the same strongly
connected component (because all the vertices which can be reached from that
vertex have been processed, as we noted above for topological sorting). Second,
the “up” links in the tree provide a second path from one vertex to another and
bind together the strong components. As with the algorithm in Chapter 30 for
finding articulation points, we keep track of the highest ancestor reachable
via one “up” link from all descendants of each node. Now, if a vertex x
has no descendants or “up” links in the depth-first search tree, or if it has a
descendant in the depth-first search tree with an “up” link that points to x,
and no descendants with “up” links that point higher up in the tree, then it
and all its descendants (except those vertices satisfying the same property and
their descendants) comprise a strongly connected component. In the depth-first search tree at the beginning of the chapter, nodes B and K satisfy the
first condition (so they represent strongly connected components themselves)
and nodes F (representing F E D), H (representing H I), and A (representing
A G J L M C) satisfy the second condition. The members of the component
represented by A are found by deleting B K F and their descendants (they
appear in previously discovered components). Every descendant y of x that
does not satisfy this same property has some descendant that has an “up”
link that points higher than y in the tree. There is a path from x to y down
through the tree; and a path from y to x can be found by going down from
y to the vertex with the “up” link that reaches past y, then continuing the
same process until x is reached. A crucial extra twist is that once we’re done
with a vertex, we give it a high val, so that “cross” links to that vertex will
be ignored.
This program provides a deceptively simple solution to a relatively difficult
problem. It is certainly testimony to the subtleties involved in searching
directed graphs, subtleties which can be handled (in this case) by a carefully
crafted recursive program.
Exercises
1. Give the adjacency matrix for the transitive closure of the example dag
given in this chapter.
2. What would be the result of running the transitive closure algorithms on
an undirected graph which is represented with an adjacency matrix?
3. Write a program to determine the number of edges in the transitive closure
of a given directed graph, using the adjacency list representation.
4. Discuss how Warshall’s algorithm compares with the transitive closure
algorithm derived from using the depth-first search technique described
in the text, but using the adjacency matrix form of visit and removing
the recursion.
5. Give the topological ordering produced for the example dag given in
the text when the suggested method is used with an adjacency matrix
representation, but dfs scans the vertices in reverse order (from V down
to 1) when looking for unvisited vertices.
6. Does the shortest path algorithm from Chapter 31 work for directed
graphs? Explain why or give an example for which it fails.
7. Write a program to determine whether or not a given directed graph is a
dag.
8. How many strongly connected components are there in a dag? In a graph
with a directed cycle of size V?
9. Use your programs from Chapters 29 and 30 to produce large random
directed graphs with V vertices. How many strongly connected components
do such graphs tend to have?
10. Write a program that is functionally analogous to find from Chapter
30, but maintains strongly connected components of the directed graph
described by the input edges. (This is not an easy problem: you certainly
won’t be able to get as efficient a program as find.)