Directed Graph
In a directed graph, each edge (u, v) has a direction u → v. Thus (u, v) ≠ (v, u). Directed graphs are
useful for modeling many practical problems (such as one-way roads in a traffic network, and asymmetric
relations in a social network). However, the direction makes major differences in many algorithms.
We examine some of them in this chapter.
Notations and Definitions
Directed edge: a directed edge (u, v) has a direction u → v. u is called the tail, and v is called the head.
Arc: a directed edge (u, v) is sometimes called an arc to emphasize that it is directed.
Degrees: indeg(v) is the number of incoming edges to v. outdeg(v) is the number of outgoing edges
from v.
We say v is reachable from u if there is a directed path from u to v: u → v1 → v2 → ⋯ → v.
Strongly connected: A directed graph is strongly connected if every two vertices are reachable from
each other.
Clearly, mutual reachability (write a ~ b if a and b are reachable from each other) is an “equivalence relation”:
a ~ a. (Reflexivity)
if a ~ b then b ~ a. (Symmetry)
if a ~ b and b ~ c then a ~ c. (Transitivity)
Strongly connected component: A strongly connected component is a maximal strongly connected
subgraph.
Directed acyclic graph (DAG): A directed graph without any directed cycles.
We are interested in the following questions:
Is a directed graph strongly connected?
Is a directed graph acyclic?
Find all strongly connected components.
BFS and DFS
A quick review: The BFS and DFS algorithms for undirected graphs still work for directed graphs. The only
difference is that we follow only the out-edges in a directed graph. More specifically, when u is taken
out of the queue or stack, we add every unvisited vertex v such that (u, v) ∈ E to the queue or stack.
BFS or DFS on a directed graph still runs in O(m + n) time (n = |V| vertices, m = |E| edges) to visit all
vertices reachable from the starting vertex s.
The edges responsible for adding new vertices to the queue or stack form the BFS and DFS trees. The BFS
tree still has the shortest-distance property. The DFS tree still has the parenthesis property for the start
and finish times. However, the properties of the non-tree edges in undirected graphs no longer hold
for directed graphs.
Reachability:
Problem: Given s, find all vertices v such that there is a directed path from s to v.
Theorem: BFS (or DFS) on a directed graph ensures that
1. All vertices reachable from the source s are visited.
2. The time complexity is O(m + n).
Proof: The same as the undirected case. Omitted.
Theorem: For a directed graph, the path from s to v found by the BFS algorithm is a shortest path
(fewest edges) from s to v.
Proof: The same as the undirected case. Omitted.
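The two theorems can be illustrated with a minimal Python sketch. The adjacency-list dictionary adj, the function name bfs_distances, and the small example graph are assumptions made for illustration; they are not part of the original notes.

```python
from collections import deque

def bfs_distances(adj, s):
    """Return {v: number of edges on a shortest directed path from s to v}.

    adj is assumed to map each vertex to the list of heads of its out-edges.
    Vertices that are not reachable from s do not appear in the result.
    """
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):      # follow out-edges (u, v) only
            if v not in dist:         # v is unvisited
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical example: a -> b -> c, a -> c, d -> a
adj = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["a"]}
dist = bfs_distances(adj, "a")
```

Here d has an edge into a but is not reachable from a, so it is absent from dist. Every vertex and every edge is handled at most once, which is where the O(m + n) bound comes from.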
Strongly Connected Graph
Now we consider the problem of determining whether a graph is strongly connected. Example:
Determine whether a traffic network consisting only of one-way roads is sufficient, that is, whether
every location can be reached from every other location.
In an undirected graph, all vertices reachable from s form the connected component containing s. This is
because if there is a path from s to u, and a path from s to v, then there is a path from u to v. That
makes the problem solvable by either the BFS or DFS algorithm. However, this no longer applies
to directed graphs.
Trivial Algorithm 1: For each v ∈ V, use BFS to check whether it can reach every other vertex. Time complexity:
O(n ⋅ (m + n)) = O(mn) (when m ≥ n).
Think about the undirected graph again: we connect the paths u → s → v to build a path from u to v. It is
just that in an undirected graph the path s → u is the same as the path u → s. In a directed graph, we cannot
reverse a path. But nothing forbids us from finding another path from u to s.
Observation: Pick s ∈ V. If there is a path from s to every v, and a path from every v to s, then G is
strongly connected. (Proof: for any u and v, concatenate the path from u to s with the path from s to v.)
On the other hand, if G is strongly connected, then the condition of the observation also holds. Thus,
we have the lemma.
Lemma 1: For any s ∈ V, G is strongly connected ⇔ There is a path from s to every v, and a path from
every v to s.
We know how to check “there is a path from s to every v” in O(m + n) time. How do we check whether there is a
path from every v to s? Just reverse the edges, and the problem reduces to checking whether there is a path
from s to every v.
Algorithm (Strong Connectivity)
1. Arbitrarily pick a vertex s ∈ V.
2. Use DFS to check if every vertex is reachable from s.
3. Reverse the direction of all edges.
4. Use DFS to check if every vertex is reachable from s.
5. If both yes, then say “yes”. Otherwise, say “no”.
The correctness follows from Lemma 1. Time complexity: O(m + n).
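The five steps can be sketched in Python as follows. This is only an illustration under assumptions: the graph is a dictionary adj in which every vertex appears as a key, and the helper names reachable, reverse_graph and is_strongly_connected are hypothetical.

```python
def reachable(adj, s):
    """Iterative DFS: return the set of vertices reachable from s."""
    seen = {s}
    stack = [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def reverse_graph(adj):
    """Return the graph with every arc (u, v) replaced by (v, u)."""
    rev = {u: [] for u in adj}
    for u, heads in adj.items():
        for v in heads:
            rev[v].append(u)
    return rev

def is_strongly_connected(adj):
    if not adj:
        return True                   # convention chosen for this sketch
    s = next(iter(adj))               # step 1: arbitrary vertex
    n = len(adj)
    forward_ok = len(reachable(adj, s)) == n                   # step 2
    backward_ok = len(reachable(reverse_graph(adj), s)) == n   # steps 3-4
    return forward_ok and backward_ok                          # step 5
```

For example, the directed cycle {0: [1], 1: [2], 2: [0]} passes both checks, while the path {0: [1], 1: [2], 2: []} fails the second one because no vertex can reach 0 after the first step.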
Directed Acyclic Graphs
Problem: Determine if a directed graph is acyclic.
The above three graphs are isomorphic. However, in the rightmost version it is easiest to determine whether
the graph has a cycle: We only need to check if there is an arc that points backwards.
Topological ordering: An ordering of the vertices, obtained by relabeling them v1, v2, …, vn, such that
there is no arc (vi, vj) with i > j.
Theorem 2: A directed graph G is acyclic ⟺ it has a topological ordering.
Proof: ⇐) Trivial.
⇒) We prove by providing an algorithm to construct the topological ordering.
First, there must be a vertex with in-degree 0. Otherwise, one can build an infinitely long backward path by
repeatedly following an incoming edge of the vertex at the start of the path. Since the graph is finite, we
eventually visit a vertex twice, which gives a directed cycle.
Suppose we find a vertex v with in-degree 0; then it can serve as the first vertex in the topological
ordering. Now we consider G′ = G − v. Then G′ is a smaller acyclic graph. We topologically order G′ by
recursion, and then put v at the left-most position of the ordering. Details omitted.
Exercise: Find a way to implement the above algorithm in O(m + n) time.
Next let us examine a different algorithm for topological ordering. This is a useful preparation for the
later algorithms for strongly connected components.
Algorithm (Sketch):
time ← 1
While not all vertices are visited
    Arbitrarily pick an unvisited vertex v
    DFS(v)
We require that the start and finish times of each vertex v are recorded during DFS. Note that we may
have a DFS forest. By using a global time, the start and finish time intervals of two nodes in two different
trees are disjoint.
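A small Python sketch of this sketch algorithm, with a single global clock shared by all DFS trees, might look as follows. The function name dfs_times, the dictionary representation adj, and the example graph are assumptions for illustration only.

```python
def dfs_times(adj):
    """Run DFS over the whole graph; return (start, finish) time dictionaries.

    adj is assumed to map every vertex to the list of heads of its out-edges.
    """
    start, finish = {}, {}
    time = 1                          # global clock shared by all DFS trees

    def dfs(u):
        nonlocal time
        start[u] = time
        time += 1
        for v in adj[u]:
            if v not in start:        # v is unvisited
                dfs(v)
        finish[u] = time
        time += 1

    for v in adj:                     # "arbitrarily pick an unvisited vertex"
        if v not in start:
            dfs(v)
    return start, finish

# Hypothetical example: a -> b and c -> b give a forest with two trees.
adj = {"a": ["b"], "b": [], "c": ["b"]}
start, finish = dfs_times(adj)
```

In this example the tree rooted at a occupies the interval [1, 4] and the tree rooted at c occupies [5, 6], so the intervals of the two trees are disjoint. The recursion is kept for readability; for very deep graphs an explicit stack would be needed in Python.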
Claim 3: If G is acyclic, then for every arc (u, v), finish[v] < finish[u].
Proof:
Case 1. start[v] < start[u].
Because G is acyclic, there is no path from v to u. So v is not an ancestor of u on a DFS tree, and u, which
starts later, is not an ancestor of v. Thus, the two intervals [start[v], finish[v]] and [start[u], finish[u]]
are disjoint, and start[v] < start[u] implies that finish[v] < finish[u].
Case 2. start[u] < start[v].
Because of the arc (u, v), the algorithm ensures that v is a descendant of u. By the parenthesis
property, finish[v] < finish[u].
With the claim in place, we have the following algorithm to find a topological ordering.
Algorithm:
1. Run DFS on the whole graph, and record the vertices in order of decreasing finish time during DFS.
2. Check if the ordering is a topological ordering. If yes, output the ordering. Otherwise output
“cyclic”.
If G is acyclic, then the algorithm outputs a topological ordering. Otherwise, it will detect that G is
cyclic in step 2.
Correctness follows from the claim. Time complexity: O(m + n). (Remark: Do not run a comparison-based
sort on the finish times after DFS; that would cost O(n log n). Record each vertex as it finishes instead.)
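The two steps can be sketched in Python as below. The function name topological_order and the dictionary representation adj (every vertex is a key) are assumptions for illustration. Vertices are appended to a list as they finish, so no sorting is needed.

```python
def topological_order(adj):
    """Return a topological ordering of adj, or None if the graph is cyclic."""
    visited = set()
    order = []                        # vertices in increasing finish time

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs(v)
        order.append(u)               # u finishes here

    # Step 1: DFS on the whole graph, then take decreasing finish time.
    for v in adj:
        if v not in visited:
            dfs(v)
    order.reverse()

    # Step 2: verify that no arc points backwards in the ordering.
    position = {v: i for i, v in enumerate(order)}
    for u in adj:
        for v in adj[u]:
            if position[u] >= position[v]:   # also rejects self-loops
                return None
    return order
```

For instance, on {1: [2, 3], 2: [4], 3: [4], 4: []} the sketch returns the ordering 1, 3, 2, 4, and on the two-vertex cycle {1: [2], 2: [1]} it returns None, which corresponds to the output “cyclic”.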
Strongly Connected Component
The figure shows an example of the strongly connected components of a directed graph. Note that two
strongly connected components do not share vertices with each other, because a shared vertex v could be
used as a bridge to connect the two components together. The right figure regards each strongly
connected component as a super node. Two super nodes have an edge if there is an edge connecting
the two components.
A straightforward algorithm is to modify the strongly connected graph algorithm. Find all vertices that
are reachable from v, and all vertices from which v can be reached. Then take the intersection. But this
way, finding each component takes O(m + n) time. If there are k components, it takes O(k(m + n))
time. We want to learn a linear-time algorithm.
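As a baseline, the straightforward intersection algorithm could be sketched in Python like this. The names reachable and scc_by_intersection and the dictionary representation adj are hypothetical, chosen only for this illustration.

```python
def reachable(adj, s):
    """Iterative DFS: return the set of vertices reachable from s."""
    seen = {s}
    stack = [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def scc_by_intersection(adj):
    """Return the strongly connected components as a list of sets."""
    rev = {u: [] for u in adj}        # reversed graph
    for u, heads in adj.items():
        for v in heads:
            rev[v].append(u)

    assigned = set()
    components = []
    for v in adj:
        if v in assigned:
            continue
        # vertices reachable from v, intersected with vertices that reach v
        comp = reachable(adj, v) & reachable(rev, v)
        components.append(comp)
        assigned |= comp
    return components
```

Each component costs two full traversals, which matches the O(k(m + n)) bound for k components.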
Lemma 4. If each strongly connected component is regarded as a super node, the resulting super graph is acyclic.
Proof: (As an exercise.)
Idea 1. The acyclic super graph has a sink node, that is, a node with out-degree 0. If our DFS starts with a
vertex in this sink component, then it will find exactly that component.
This suggests the following strategy: Repeatedly find the sink component and remove it from the graph.
How to find a sink component? More precisely, how to find a vertex in a sink component?
Idea 2. Recall that if the graph is acyclic, the finish times in the DFS provide a reversed topological
ordering. Now the super graph is acyclic. If we do DFS on the original graph, will the finish times provide
any useful information?
The situation is illustrated in the following figure: C has edges entering C′, but there are no paths from C′
to C. We want to compare the finish times of vertices in the two components.
This depends on which component is visited first during DFS.
Case 1. The first vertex visited in C ∪ C′ is v ∈ C′. Before v finishes, the DFS algorithm will not enter C at
all, while it does finish all of C′. So every vertex of C finishes later than every vertex of C′.
Case 2. The first vertex visited is v ∈ C instead. Before v finishes, the DFS is guaranteed to finish all
reachable unvisited vertices (and therefore all of C′). So v finishes later than every vertex of C′.
Thus, in either case, the latest finish time in C is later than the latest finish time in C′.
Lemma 5: Let C and C′ be two strongly connected components such that there are edges from C to C′. Then
the latest finish time in C is later than the latest finish time in C′.
Proof: There is no path from C′ to C (otherwise there would be a cycle in the super graph).
Thus the lemma follows from the earlier discussion. QED.
Thus, the latest finish time of a component can be used to topologically order the super graph. BUT, we
do not know the components yet!!!
Don’t give up! Let’s step backward a bit and see if we’ve lost everything. We can find the vertex with the
latest finish time among all vertices; its finish time must also be the latest among all components. That
vertex belongs to a source component!
BUT, we do not want a vertex in a source component. We need a vertex in a sink component!!!!! Only
when v is in a sink component can we use DFS(v) to find the component. Other components do not have
this property.
Again, don’t give up. We’ve got something. We can find the source component. And we need to find the
sink component. Can we make use of what we’ve got?
Idea 3. The answer is simple: if you reverse all edges, the source and sink components swap roles. Then the
latest finish time of a DFS on the reversed graph identifies a vertex in a sink component of the original graph. Aha!
A straightforward summary:
While G is not empty
    Reverse edge direction of G to get G′.
    DFS on G′.
    Let v be the vertex with the latest finishing time.
    Do DFS(v) on G to find a strongly connected component C; Output C.
    G ← G − C.
Notice that the straightforward implementation repeats a lot of computation. Most of it can be removed
easily. Special attention is required for the DFS on G′ in each iteration.
The purpose of the DFS on G′ is to find v in C, which is a source component of G′ and a sink component of G.
Removing C from G and G′ together results in a pair of reversed graphs again. Now the latest remaining
finishing time of the previous DFS belongs to a new source component of the new G′. So, we do not
need to do DFS again.
Thus, the new algorithm:
Reverse edge direction of G to get G′.
DFS on G′ to obtain a finish time for each vertex.
While G is not empty
    Let v be the remaining vertex with the latest finishing time.
    Do DFS(v) on G to find a strongly connected component C; Output C.
    G ← G − C.
Time complexity: O(m + n)
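A minimal Python sketch of this final algorithm is given below. The function name strongly_connected_components, the dictionary representation adj (every vertex is a key), and the use of a removed set to stand for "G ← G − C" are assumptions of this illustration, not details from the notes.

```python
def strongly_connected_components(adj):
    """Return the components as lists of vertices, sink components first."""
    # Reverse edge direction of G to get G'.
    rev = {u: [] for u in adj}
    for u, heads in adj.items():
        for v in heads:
            rev[v].append(u)

    # DFS on G': record the vertices in increasing finish time.
    visited = set()
    finish_order = []

    def dfs_finish(u):
        visited.add(u)
        for v in rev[u]:
            if v not in visited:
                dfs_finish(v)
        finish_order.append(u)

    for v in adj:
        if v not in visited:
            dfs_finish(v)

    # Scan by decreasing finish time. A DFS on G restricted to the vertices
    # that have not been removed yet collects exactly one component.
    removed = set()
    components = []

    def dfs_collect(u, comp):
        removed.add(u)                # marking u plays the role of G <- G - C
        comp.append(u)
        for v in adj[u]:
            if v not in removed:
                dfs_collect(v, comp)

    for v in reversed(finish_order):
        if v not in removed:
            comp = []
            dfs_collect(v, comp)
            components.append(comp)
    return components
```

On the hypothetical graph {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}, the sketch first outputs the sink component {4, 5} and then {1, 2, 3}. Each of the two passes touches every vertex and edge a constant number of times, in line with the O(m + n) bound.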
Exercise: Go through the proof of correctness and technical details to get the time complexity.
Acknowledgement: Prepared based upon Lap Chi’s notes. Many figures copied from his notes and
textbooks.