Graphs
Chapter 6
[Figure: example graph on airports SFO, LAX, ORD, DFW]

Graph
• A graph is a pair (V, E), where
– V is a set of nodes, called vertices
– E is a collection (duplicates allowed) of pairs of vertices, called edges
– Vertices and edges are data structures and store elements
• Example:
– A vertex represents an airport and stores the three-letter airport code
– An edge represents a flight route between two airports and stores the mileage of the route
[Figure: flight-route graph on airports PVD, ORD, SFO, LGA, HNL, LAX, DFW, MIA]

Edge Types
• Directed edge
– ordered pair of vertices (u,v)
– first vertex u is the source or origin
– second vertex v is the destination
– e.g., a flight
[Figure: directed edge between ORD and PVD labeled "flight AA 1206"]
• Undirected edge
– unordered pair of vertices (u,v)
– e.g., a flight route
[Figure: undirected edge between ORD and PVD labeled "849 miles"]
• Directed graph (digraph)
– all the edges are directed
– e.g., flight network
• Undirected graph
– all the edges are undirected
– e.g., route network

Applications
• Electronic circuits
– Printed circuit board
– Integrated circuit
• Transportation networks
– Highway network
– Flight network
• Computer networks
– Local area network
– Internet
– Web
• Databases
– Entity-relationship diagram
[Figure: network graph with nodes cslab1a, cslab1b, math.brown.edu, cs.brown.edu, brown.edu, qwest.net, att.net, cox.net, John, Paul, David]

Terminology
[Figure: example graph with vertices U, V, W, X, Y, Z and edges a through j]
• End vertices (or endpoints) of an edge
– U and V are the endpoints of a
• Edges incident on a vertex
– a, d, and b are incident on V
• Adjacent vertices
– share an edge
– U and V are adjacent
• Degree of a vertex
– X has degree 5
• Parallel edges (go between the same nodes)
– h and i are parallel edges
• Self-loop (the same node is both origin and destination)
– j is a self-loop

Terminology (cont.)
• Simple graphs have no parallel edges (multiple edges between the same vertices) and no self-loops
• Path
– sequence of alternating vertices and edges in which each edge joins the vertices before and after it
• Cycle
– path with the same first and last vertex
• Simple path
– path such that all its vertices and edges are distinct
• Examples
– P1=(V,b,X,h,Z) is a simple path
– P2=(U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple
[Figure: paths P1 and P2 on the example graph with vertices U, V, W, X, Y, Z]

Terminology (cont.)
• Simple cycle
– cycle such that all its vertices and edges are distinct
• Connected graph, connected component
• Subgraph; spanning subgraph (contains all vertices)
• Forest (acyclic graph), free tree (connected forest, no root), spanning tree
• Examples
– C1=(V,b,X,g,Y,f,W,c,U,a,V) is a simple cycle
– C2=(U,c,W,e,X,g,Y,f,W,d,V,a,U) is a cycle that is not simple
[Figure: cycles C1 and C2 on the example graph with vertices U, V, W, X, Y, Z]

Properties
Notation
– n: number of vertices
– m: number of edges
– deg(v): degree of vertex v
Property 1
$\sum_v \deg(v) = 2m$
Proof: each edge is counted twice
Property 2
In an undirected graph with no self-loops and no multiple edges
$m \le n(n-1)/2$
Proof: each vertex has degree at most (n - 1)
Example: n = 4, m = 6, deg(v) = 3
What is the bound for a directed graph?

Main Methods of the Graph ADT
• Vertices and edges
– are positions
– store elements
• Accessor methods
– aVertex()
– incidentEdges(v)
– endVertices(e)
– isDirected(e)
– origin(e)
– destination(e)
– opposite(v, e)
– areAdjacent(v, w)
• Update methods
– insertVertex(o)
– insertEdge(v, w, o)
– insertDirectedEdge(v, w, o)
– removeVertex(v)
– removeEdge(e)
• Generic methods
– numVertices()
– numEdges()
– vertices()
– edges()

Edge List Structure
• We could just store a list of nodes and a list of edges.
• Vertex info: name, id, degree, …
• Edge info: directed?, endpoints, weight
• This is not convenient for a traversal, but operations that look at every edge or every node are simple.
• Note: edges point to vertices (not vice versa)

Edge List Structure
• Vertex object
– element
– reference to position in vertex sequence
• Edge object
– element
– origin vertex object
– destination vertex object
– reference to position in edge sequence
• Vertex sequence
– sequence of vertex objects
• Edge sequence
– sequence of edge objects
[Figure: graph with vertices u, v, w, z and edges a, b, c, d, with its vertex sequence and edge sequence]

Adjacency List Structure
• Edge list structure
• Incidence sequence for each vertex
– sequence of references to edge objects of incident edges
• Augmented edge objects
– references to end vertices

Adjacency Matrix Structure
• Edge list structure
• Augmented vertex objects
– Integer key (index) associated with vertex
• 2D adjacency array
– Reference to edge object for adjacent vertices
– Null for nonadjacent vertices
• The “old fashioned” version just has 0 for no edge and 1 for edge
[Figure: vertices u, v, w with keys 0, 1, 2, edges a and b, and the 3 × 3 adjacency array]

Asymptotic Performance
n vertices, m edges, no parallel edges, no self-loops. Bounds are “big-Oh”.

                       Edge List   Adjacency List         Adjacency Matrix
Space                  n + m       n + m                  n^2
incidentEdges(v)       m           deg(v)                 n
areAdjacent(v, w)      m           min(deg(v), deg(w))    1
insertVertex(o)        1           1                      n^2
insertEdge(v, w, o)    1           1                      1
removeVertex(v)        m           deg(v)                 n^2
removeEdge(e)          1           1                      1

Traversals
• Visit each node
– e.g., web crawler
• Have to restart the traversal in each connected component, but this allows us to identify components
• Reachability in a digraph is an important issue
– the transitive closure graph
• Book permits counter-direction motion in general traversals

What is meant by depth first search?
• Go deeper rather than broader
• Requires recursion or a stack
• Used as a general means of traversal

Code for adjacency list implementation
    class Node {
        int ID;
        String nodeLabel;
        EdgeList adj;  // successors (head of a linked list of edges)
        public String toString() { return ID + nodeLabel; }
    }
    class Edge {
        int from;  // index of start node
        int to;    // index of end node
    }
    class EdgeList {  // one cell of a singly linked list of edges
        EdgeList next;
        Edge e;
    }
    public class Graph {
        final static int SIZE = 20;
        Node[] nodes = new Node[SIZE];
    }

At your seats
Write the code to visit every node and print its name in the order it is visited. Note that with graphs, you may find yourself going in circles if you don’t mark the nodes as “visited” in some way.

Looking for a Spanning Tree
• Spanning tree: all of the nodes and some of the edges.
• Like a calling tree: make sure all are informed
• Many algorithms depend on some ordering of the nodes of a graph
• Labeling the edges in some way is helpful in organizing thinking about a graph

Depth-First Search (undirected graph)
• Simple recursive backtracking algorithm: call the recursive function at the starting point
– For each edge incident to a vertex
• If the opposite (other) vertex is unvisited
– Label edge as “discovery”
– Recur on the opposite vertex
• Else label edge as “back”
• Discovery edges form a spanning tree of the component; back edges go to an ancestor
• With m edges, O(m) using an adjacency list, but not using an adjacency matrix (which costs O(n^2))

Depth First Search
• Notice in the next diagram that multiple steps are shown between each set of pictures.
• Discovery edges (tree edges) are drawn with solid black lines. Back edges are drawn with dashed lines.
• The current node is shown in solid black

Depth-First Traversal
[Figure: step-by-step depth-first traversal]

Given this code, find dfs numbers
    class Node {
        int ID;
        int dfsNum;    // order in which DFS first visits this node
        String nodeLabel;
        EdgeList adj;  // successors
        public String toString() { return ID + nodeLabel; }
    }
    enum EdgeType { BACK, TREE, CROSS, FORWARD }
    class Edge {
        int from;           // index of start node
        int to;             // index of end node
        EdgeType edgeType;  // classification assigned during DFS
    }
    class EdgeList {
        EdgeList next;
        Edge e;
    }
    public class Graph {
        final static int SIZE = 20;
        Node[] nodes = new Node[SIZE];
    }

DFS Algorithm
• The algorithm uses a mechanism for setting and getting “labels” of vertices and edges

    Algorithm DFS(G)
      Input graph G
      Output labeling of the edges of G as discovery edges and back edges
      for all u ∈ G.vertices()
        setLabel(u, UNEXPLORED)
      for all e ∈ G.edges()
        setLabel(e, UNEXPLORED)
      for all v ∈ G.vertices()
        if getLabel(v) = UNEXPLORED
          DFS(G, v)

    Algorithm DFS(G, v)
      Input graph G and a start vertex v of G
      Output labeling of the edges of G in the connected component of v as discovery edges and back edges
      setLabel(v, VISITED)
      for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
          w ← G.opposite(v, e)
          if getLabel(w) = UNEXPLORED
            setLabel(e, DISCOVERY)
            DFS(G, w)
          else
            setLabel(e, BACK)

Example
[Figure: first steps of DFS on a graph with vertices A, B, C, D, E. Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, back edge]

Example (cont.)
[Figure: remaining steps of DFS on the graph with vertices A, B, C, D, E]

DFS and Maze Traversal
• The DFS algorithm is similar to a classic strategy for exploring a maze
– We mark each intersection, corner and dead end (vertex) visited
– We mark each corridor (edge) traversed
– We keep track of the path back to the entrance (start vertex) by means of a rope (recursion stack)

Properties of DFS
Property 1
DFS(G, v) visits all the vertices and edges in the connected component of v
Property 2
The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v
[Figure: DFS spanning tree on vertices A, B, C, D, E]

Complexity of DFS
• DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure

Biconnectivity
(This material replaces that in section 6.3.2)
We worry about points of failure in the graph.
[Figure: network on airports SEA, PVD, ORD, SNA, FCO, MIA]

Separation Edges and Vertices
• Definitions
– Let G be a connected graph
– A separation edge of G is an edge whose removal disconnects G
– A separation vertex of G is a vertex whose removal disconnects G
– A graph is biconnected if for any two vertices of the graph there are two disjoint paths between them (equivalently, a simple cycle contains them both)
• Applications
– Separation edges and vertices represent single points of failure in a network and are critical to the operation of the network
• Example
– DFW, LGA and LAX are separation vertices
– (DFW,LAX) is a separation edge
[Figure: network on airports ORD, PVD, SFO, LGA, HNL, LAX, DFW, MIA]

Finding Articulation Points
1. Do a depth first search, numbering the nodes as they are visited in preorder. Call the numbers Num(v). If a node has already been visited, we consider the edge to it a “back edge”. The undirected edges become “directed” by the order in which they are visited.
2. The root is an articulation point if it has two or more children. Its removal would separate their subtrees.
3.
For each node, compute the lowest-numbered vertex that can be reached by zero or more tree edges followed by possibly one back edge. Low(v) is the minimum of
   1. Num(v) [taking no edges]
   2. the lowest Num(w) among all back edges (v,w) [no tree edges]
   3. the lowest Low(w) among all tree edges (v,w) [some tree edges and a back edge]

To compute Low, we need a postorder traversal, since Low(v) depends on the Low values of v's children. Given Low, we find articulation points as:
1. The root is an articulation point if it has more than one child
2. Any other vertex v is an articulation point if and only if v has some child w in the tree such that Low(w) >= Num(v)

In the example below, notice
• C is an articulation point as C has a child G and Low(G) >= Num(C).
• D is an articulation point as Low(E) >= Num(D).

So how do we compute Low?

What is the complexity?
• Assign DFS numbers
• Assign Low
• Examine nodes for articulation points
If we don’t know which is greater, n (the number of nodes) or m (the number of edges), we show it as O(n+m)

Breadth-First Search
• By levels, typically using queues

BFS Facts
• There are discovery (tree) and cross edges (why no back edges?)
– Once marked, don’t follow again.
• Tree edges form a spanning tree
• Tree-edge paths from the start vertex are minimal in length
• The endpoints of a cross edge differ by at most one in level
• Try writing the code to do a BFS

Thm 6.19: Algorithms based on BFS
• Test for connectivity
• Compute a spanning forest
• Compute connected components
• Find the shortest path between two points (in number of links)
• Compute a cycle in the graph, or report none (a cycle exists when there are cross edges)
• Good for shortest path information, while DFS is better for complex connectivity questions

Digraphs
• A digraph is a graph whose edges are all directed
– Short for “directed graph”
• Applications
– one-way streets
– flights
– task scheduling
• The fundamental issue is reachability
[Figure: digraph on vertices A, B, C, D, E]

Complexity
• For each node in a digraph, how would you see which other nodes are reachable from that node?
• What would the complexity be?
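One way to approach the reachability question above is to run a separate traversal from every node. The sketch below is only an illustration under stated assumptions: the graph is stored as an `int[][]` successor array rather than the `Node`/`EdgeList` classes from the earlier slides, and the names `ReachabilitySketch`, `reachableFrom`, and `allReachable` are hypothetical. Each start node is counted as reachable from itself.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ReachabilitySketch {
    // adj[u] holds the successors of u (assumed representation).
    // Returns visited[w] == true for every w reachable from start.
    static boolean[] reachableFrom(int[][] adj, int start) {
        boolean[] visited = new boolean[adj.length];
        Deque<Integer> stack = new ArrayDeque<>();  // explicit stack instead of recursion
        visited[start] = true;
        stack.push(start);
        while (!stack.isEmpty()) {
            int u = stack.pop();
            for (int w : adj[u]) {
                if (!visited[w]) {      // mark before pushing so no node is pushed twice
                    visited[w] = true;
                    stack.push(w);
                }
            }
        }
        return visited;
    }

    // One traversal per start node: n traversals of O(n + m) each.
    static boolean[][] allReachable(int[][] adj) {
        boolean[][] reach = new boolean[adj.length][];
        for (int v = 0; v < adj.length; v++) {
            reach[v] = reachableFrom(adj, v);
        }
        return reach;
    }

    public static void main(String[] args) {
        int[][] adj = { {1}, {2}, {}, {0} };  // edges 0->1, 1->2, 3->0
        System.out.println(Arrays.toString(allReachable(adj)[0]));
    }
}
```

With an adjacency-list style representation, each traversal costs O(n + m), so answering the question for all n nodes costs O(n(n + m)).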
Digraph Properties
[Figure: digraph on vertices A, B, C, D, E]
• A graph G=(V,E) such that
– Each edge goes in one direction:
• Edge (a,b) goes from a to b, but not b to a.
• If G is simple, $m \le n(n-1)$: at most one edge from each node to every other node.
• If we keep in-edges and out-edges in separate adjacency lists, we can list the in-edges or the out-edges of a vertex in time proportional to their number.

Digraph Application
• Scheduling: edge (a,b) means task a must be completed before b can be started
[Figure: course-prerequisite digraph on ics21, ics22, ics23, ics51, ics52, ics53, ics121, ics131, ics141, ics151, ics161, ics171, ending in "The good life"]

Digraph Facts
• Directed DFS gives directed paths from the root to each reachable vertex
• Used for O(n(n+m)) algorithms [DFS is O(n+m); these algorithms use n DFS searches]
– Find all induced subgraphs (from each vertex v, find the subgraph reachable from v)
– Test for strong connectivity
– Compute the transitive closure
• Directed BFS has discovery, back, and cross edges

Directed DFS
• We can specialize the traversal algorithms (DFS and BFS) to digraphs by traversing edges only along their direction
• In the directed DFS algorithm, we have four types of edges
– discovery edges
– back edges (to ancestor)
– forward edges (to descendant)
– cross edges (to other)
• A directed DFS starting at a vertex s determines the vertices reachable from s
[Figure: directed DFS on vertices A, B, C, D, E]

Reachability
• DFS tree rooted at v: vertices reachable from v via directed paths
[Figure: DFS trees rooted at different vertices of a digraph on A, B, C, D, E, F]

Strong Connectivity
• Each vertex can reach all other vertices
[Figure: strongly connected digraph on vertices a, b, c, d, e, f, g]

Strong Connectivity Algorithm
[Figure: digraph G and its reversal G’ on vertices a, b, c, d, e, f, g]
• Pick a vertex v in G.
• Perform a DFS from v in G.
– If there’s a w not visited, return not strongly connected
• Let G’ be G with edges reversed.
• Perform a DFS from v in G’.
– If there’s a w not visited, return not strongly connected
– Else, return strongly connected
• Running time: O(n+m).
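The two-pass check above can be sketched in Java as follows. This is a minimal illustration, not the course's reference code: it assumes the digraph is given as an `int[][]` successor array, picks vertex 0 as v, and uses an explicit stack in place of recursive DFS. The names `StrongConnectivitySketch`, `reachesAll`, and `reverse` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StrongConnectivitySketch {
    // True if every vertex is reachable from start by directed paths.
    static boolean reachesAll(int[][] adj, int start) {
        boolean[] visited = new boolean[adj.length];
        Deque<Integer> stack = new ArrayDeque<>();
        visited[start] = true;
        stack.push(start);
        int count = 1;  // number of vertices marked so far
        while (!stack.isEmpty()) {
            int u = stack.pop();
            for (int w : adj[u]) {
                if (!visited[w]) {
                    visited[w] = true;
                    count++;
                    stack.push(w);
                }
            }
        }
        return count == adj.length;
    }

    // Builds G' by reversing every edge of G.
    static int[][] reverse(int[][] adj) {
        int n = adj.length;
        int[] inDegree = new int[n];
        for (int[] successors : adj) {
            for (int w : successors) inDegree[w]++;
        }
        int[][] rev = new int[n][];
        for (int v = 0; v < n; v++) rev[v] = new int[inDegree[v]];
        int[] next = new int[n];  // next free slot in each rev[w]
        for (int u = 0; u < n; u++) {
            for (int w : adj[u]) rev[w][next[w]++] = u;
        }
        return rev;
    }

    // v = 0 must reach everyone in G, and everyone must reach v (checked in G').
    static boolean isStronglyConnected(int[][] adj) {
        if (adj.length == 0) return true;
        return reachesAll(adj, 0) && reachesAll(reverse(adj), 0);
    }

    public static void main(String[] args) {
        int[][] cycle = { {1}, {2}, {0} };  // 0->1->2->0
        int[][] chain = { {1}, {2}, {} };   // 0->1->2
        System.out.println(isStronglyConnected(cycle));
        System.out.println(isStronglyConnected(chain));
    }
}
```

Reversing the graph and both traversals each touch every vertex and edge a constant number of times, which is consistent with the O(n+m) bound stated on the slide.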
Strongly Connected Components
• Maximal subgraphs such that each vertex can reach all other vertices in the subgraph
• Can also be done in O(n+m) time using DFS, but is more complicated (similar to biconnectivity).
[Figure: digraph on vertices a, b, c, d, e, f, g with strongly connected components {a,c,g} and {f,d,e,b}]

Transitive Closure
• Given a digraph G, the transitive closure of G is the digraph G* such that
– G* has the same vertices as G
– if G has a directed path from u to v (u ≠ v), G* has a directed edge from u to v
• The transitive closure provides reachability information about a digraph
[Figure: digraph G and its transitive closure G* on vertices A, B, C, D, E]

Computing the Transitive Closure
• We can perform DFS starting at each vertex
– O(n(n+m))
• If there's a way to get from A to B and from B to C, then there's a way to get from A to C.
• Alternatively, use dynamic programming: the Floyd-Warshall Algorithm

Floyd-Warshall Transitive Closure
• Idea #1: Number the vertices 1, 2, …, n.
• Idea #2: Consider paths that use only vertices numbered 1, 2, …, k, as intermediate vertices:
[Figure: a path from i to j that uses only vertices numbered 1,…,k is made of a path from i to k and a path from k to j, each using only vertices numbered 1,…,k-1; add the edge (i, j) if it’s not already in]

Floyd-Warshall’s Algorithm
• Floyd-Warshall’s algorithm numbers the vertices of G as v_1, …, v_n and computes a series of digraphs G_0, …, G_n
– G_0 = G
– G_k has directed edge (v_i, v_j) if G has a directed path from v_i to v_j with intermediate vertices in the set {v_1, …, v_k}
• We have that G_n = G*
• In phase k, digraph G_k is computed from G_{k-1}
• Running time: $O(n^3)$, assuming areAdjacent is O(1) (e.g., adjacency matrix)

    Algorithm FloydWarshall(G)
      Input digraph G (vertices numbered)
      Output transitive closure G* of G
      G_0 ← G
      for k ← 1 to n do
        G_k ← G_{k-1}
        for i ← 1 to n (i ≠ k) do
          for j ← 1 to n (j ≠ i, k) do
            if G_{k-1}.areAdjacent(v_i, v_k) ∧ G_{k-1}.areAdjacent(v_k, v_j)
              if ¬G_k.areAdjacent(v_i, v_j)
                G_k.insertDirectedEdge(v_i, v_j)
      return G_n

Floyd-Warshall Example
[Figure: digraph on airports BOS, ORD, JFK, SFO, DFW, LAX, MIA, numbered v1 to v7]
[Table: 7 × 7 adjacency matrix for the example, rows and columns 1 to 7, with y marking an edge]
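The Floyd-Warshall pseudocode above can be sketched on a boolean adjacency matrix, where `areAdjacent` becomes an array lookup. This is an illustrative sketch under assumptions: vertices are numbered 0 to n-1 instead of 1 to n, the class and method names are hypothetical, and a single matrix is updated in place rather than materializing each G_k (phase k never changes row k or column k, so reading the current matrix gives the same answers as reading G_{k-1}).

```java
class TransitiveClosureSketch {
    // adjacent[i][j] == true means G has the directed edge (v_i, v_j).
    // Returns a new matrix for G*; the input is left untouched.
    static boolean[][] transitiveClosure(boolean[][] adjacent) {
        int n = adjacent.length;
        boolean[][] reach = new boolean[n][];
        for (int i = 0; i < n; i++) {
            reach[i] = adjacent[i].clone();  // G_0 = G
        }
        for (int k = 0; k < n; k++) {        // phase k: allow v_k as an intermediate vertex
            for (int i = 0; i < n; i++) {
                if (i == k || !reach[i][k]) continue;   // need an edge v_i -> v_k
                for (int j = 0; j < n; j++) {
                    if (j == i || j == k) continue;     // as in the pseudocode: j != i, k
                    if (reach[k][j]) {
                        reach[i][j] = true;             // insert (v_i, v_j) if missing
                    }
                }
            }
        }
        return reach;
    }

    public static void main(String[] args) {
        boolean[][] g = new boolean[3][3];
        g[0][1] = true;  // v_0 -> v_1
        g[1][2] = true;  // v_1 -> v_2
        boolean[][] closure = transitiveClosure(g);
        System.out.println(closure[0][2]);  // edge added because of the path v_0 -> v_1 -> v_2
    }
}
```

The three nested loops over n vertices give the $O(n^3)$ running time quoted on the slide.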
Floyd-Warshall, Conclusion
[Figure: resulting transitive closure on airports BOS, ORD, JFK, SFO, DFW, LAX, MIA, numbered v1 to v7]

Recursion
• What if you didn’t want to use recursion to do a DFS?
• You could turn the parent edge around to tell you how to get back.
• This is termed doing a DFS “in-place”
• The edge has to be flipped back again after the call

DAGs and Topological Ordering
• A directed acyclic graph (DAG) is a digraph that has no directed cycles
• A topological ordering of a digraph is a numbering v_1, …, v_n of the vertices such that for every edge (v_i, v_j), we have i < j
• Example: in a task scheduling digraph, a topological ordering is a task sequence that satisfies the precedence constraints
Theorem
A digraph admits a topological ordering if and only if it is a DAG
[Figure: DAG G on vertices A, B, C, D, E, and a topological ordering of G numbering them v1 to v5]

Topological Sorting
• Number vertices, so that (u,v) in E implies u < v
[Figure: "A typical student day" as a DAG numbered in topological order: 1 wake up, 2 study computer sci., 3 eat, 4 nap, 5 more c.s., 6 work out, 7 play, 8 write c.s. program, 9 make cookies for professors, 10 sleep, 11 dream about graphs]

Algorithm for Topological Sorting
    For each vertex compute its indegree
    Keep a set S of all vertices with indegree 0
    num = 0;
    while S is not empty {
        u = S.pop()
        u.number = ++num;
        for each of u’s successors w {
            w.indegree--
            if (w.indegree == 0) S.add(w)
        }
    }
    if (num < nodeCt) graph has a directed cycle
(text differs)

Topological Sorting DFS Algorithm
• Simulate the algorithm by using depth-first search

    Algorithm topologicalDFS(G)
      Input dag G
      Output topological ordering of G
      n ← G.numVertices()
      for all u ∈ G.vertices()
        setLabel(u, !VISITED)
      for all v ∈ G.vertices() which have no incoming edges
        if getLabel(v) != VISITED
          topologicalDFS(G, v)

• O(n+m) time.
    Algorithm topologicalDFS(G, v)
      Input graph G and a start vertex v of G (having no input arcs from unvisited nodes)
      Output labeling of the vertices of G in the connected component of v
      setLabel(v, VISITED)
      for all edges (v, w)
        if getLabel(w) = !VISITED
          topologicalDFS(G, w)
      Label v with topological number n
      n ← n - 1

Topological Sorting Example
[Figure: DAG on vertices a through i, shown before numbering and then as DFS assigns the topological numbers 1 to 9]
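The indegree-based algorithm for topological sorting from the earlier slide can be sketched in Java as below. This is a minimal sketch under assumptions: the digraph is an `int[][]` successor array, the set S is kept as a stack, and the names `TopologicalSortSketch` and `topologicalNumbers` are hypothetical. Returning `null` for a cyclic graph is a choice made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class TopologicalSortSketch {
    // Returns number[v] in 1..n such that every edge (u, w) has number[u] < number[w],
    // or null if the graph has a directed cycle.
    static int[] topologicalNumbers(int[][] adj) {
        int n = adj.length;
        int[] indegree = new int[n];
        for (int[] successors : adj) {
            for (int w : successors) indegree[w]++;
        }
        Deque<Integer> ready = new ArrayDeque<>();  // the set S of vertices with indegree 0
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) ready.push(v);
        }
        int[] number = new int[n];
        int num = 0;
        while (!ready.isEmpty()) {
            int u = ready.pop();
            number[u] = ++num;
            for (int w : adj[u]) {
                if (--indegree[w] == 0) ready.push(w);  // all predecessors of w are numbered
            }
        }
        return num < n ? null : number;  // fewer than n numbered means a directed cycle
    }

    public static void main(String[] args) {
        int[][] dag = { {1, 2}, {3}, {3}, {} };  // 0->1, 0->2, 1->3, 2->3
        System.out.println(Arrays.toString(topologicalNumbers(dag)));
    }
}
```

Each vertex is pushed and popped once and each edge is examined once, so the sketch runs in O(n+m) time, the same bound the slides give for the DFS version.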