7. Graphs

A graph is the same as a relation, only with different terminology. Graph terminology (instead of relation terminology) tends to be used in applications to real-world situations, which are the emphasis of this chapter. Let's momentarily set aside our discussion of relations and define what is meant by a graph without reference to relations. There are several types of graphs, depending on the application involved. Let's look at some of them.

Definition 1. A graph (or undirected graph) G = (V, E) consists of the following.
1. a set V of vertices (or nodes)
2. a set E of edges. Each edge joins a pair of vertices.

Example 1. Consider the graph with four vertices A, B, C and D and four edges e1, e2, e3 and e4 where
i. e1 joins A and B
ii. e2 joins A and D
iii. e3 joins B and C
iv. e4 joins B and D
Thus V = {A, B, C, D} and E = {e1, e2, e3, e4}. We often represent an edge by the set of the two vertices that it joins, so e1 = {A, B}, e2 = {A, D}, e3 = {B, C} and e4 = {B, D} and E = {{A, B}, {A, D}, {B, C}, {B, D}}. A "picture" of this graph is shown at the right.

[Figure: the graph of Example 1, with vertices A, B, C, D and edges e1, e2, e3, e4.]

Definition 2. A directed graph G = (V, E) consists of the following.
1. a set V of vertices (or nodes)
2. a set E of edges. Each edge goes from one vertex to another.

Example 2. Consider the graph with four vertices A, B, C and D and four edges e1, e2, e3 and e4 where
i. e1 goes from A to B
ii. e2 goes from D to A
iii. e3 goes from C to B
iv. e4 goes from B to D
Here V = {A, B, C, D} and E = {e1, e2, e3, e4}. For a directed graph we often represent an edge from a vertex v to a vertex w by the ordered pair (v, w), so e1 = (A, B), e2 = (D, A), e3 = (C, B) and e4 = (B, D) and E = {(A, B), (D, A), (C, B), (B, D)}. A picture is shown at the right.

[Figure: the directed graph of Example 2, with vertices A, B, C, D and directed edges e1, e2, e3, e4.]

Definition 3. A weighted graph (either undirected or directed) is a graph where each edge has a weight associated with it.

Example 3.
Consider the weighted (undirected) graph with four vertices A, B, C and D and four edges e1, e2, e3 and e4 where
i. e1 joins A and B and has weight 7
ii. e2 joins A and D and has weight 9
iii. e3 joins B and C and has weight 6
iv. e4 joins B and D and has weight 10
A picture is shown at the right.

[Figure: the weighted graph of Example 3, with each edge labeled by its weight.]

7.1 Graph Models

In this section we look at several types of real-world situations where graphs are used. Here is a typical example.

Example 4. A company has offices in the eight cities that are the vertices in the graph below. There are direct data links between some of the offices. These are represented by the edges in the graph below. It is desired to send a message from one office to another. For example, one might want to send a message from Denver = 1 to Columbus = 8. In order to do this one needs to find a sequence of direct links starting at Denver and ending at Columbus.

[Figure: graph of the eight offices, numbered Denver = 1, Milwaukee = 2, Chicago = 3, St. Louis = 4, Detroit = 5, Cincinnati = 6, Louisville = 7, Columbus = 8.]

This is an example of finding a path from one vertex to another in a graph. In the above example it is easy, e.g. Denver to Chicago to Cincinnati to Columbus. However, if a graph had many vertices and edges and they were stored in some fashion in a computer program, then we would want an algorithm to find a path. We look at such an algorithm after giving some terminology regarding paths.

Definition 4. A path from a vertex v to a vertex w is a sequence of vertices v_0, v_1, v_2, …, v_m such that v = v_0 and v_m = w and there is an edge from each v_j to the next v_{j+1} for j = 0, 1, …, m-1. The path is a cycle (or circuit) if v = w. A path is simple if no edge is repeated.

An Algorithm to Find a Path in a Graph.

Now let's look at an algorithm for finding a path from one vertex to another in a computer program.
We shall use a common way to represent the vertices of a graph inside a computer program, namely by the numbers 1, 2, …, n where n is the number of vertices in the graph. We shall also use a common way of representing the edges of a graph in a computer program, namely by a two-dimensional array, let's call it edge, with n rows and n columns such that edge[i, j] = true if there is an edge from vertex i to vertex j and edge[i, j] = false if not. The array edge for the graph in Example 4 is shown below, with true and false abbreviated by T and F. Rows and columns are numbered 1 to 8.

edge =
         1  2  3  4  5  6  7  8
     1   F  T  T  T  F  F  F  F
     2   T  F  T  F  T  T  F  F
     3   T  T  F  T  F  T  T  F
     4   T  F  T  F  F  F  T  F
     5   F  T  F  F  F  T  F  T
     6   F  T  T  F  T  F  T  T
     7   F  F  T  T  F  T  F  T
     8   F  F  F  F  T  T  T  F

We shall also use a common way to represent a path v_0, v_1, v_2, …, v_m in a computer program, namely by a one-dimensional array, let's call it path, such that

    path[0] = v_0 = the starting vertex in the path
    path[1] = v_1 = the next vertex in the path
    path[2] = v_2 = the next vertex in the path
    etc.
    path[m] = v_m = the ending vertex in the path

Here m is the number of edges in the path. For the graph in Example 4 and the path Denver to Chicago to Cincinnati to Columbus, the array path is

    path = 1 3 6 8 -

where "-" marks an unused entry. Here m = 3.

The algorithm for finding a path from one vertex to another works like this. We start at the starting vertex and proceed along edges to other vertices. At each step we only go to a vertex we haven't been to before. We keep track of the path from the starting vertex to where we are now in the array path. If we reach a dead end, we move back a step in the path and resume where we left off with that vertex. We continue until we either reach the vertex we want to go to or exhaust all the possibilities. We keep track of which vertices we have already visited by an array called visited, where visited[i] = true if we have already visited vertex i and visited[i] = false if not.

The algorithm uses the following variables in addition to the arrays edge, path and visited just described.
    n       = the number of vertices
    m       = the number of edges in the current path
    start   = the starting vertex
    finish  = the ending vertex (we want to find a path from vertex start to vertex finish)
    current = the vertex we are currently at in our search for the desired path
    next    = the vertex which is the candidate for the next vertex in the path

We assume n, edge, start and finish have been set before the algorithm begins. The first step in the algorithm is to initialize the path to hold just the starting vertex and m = 0. We also initialize the array visited to be all false except for visited[start] = true. Finally, we initialize current to start and next to 1. Here are the statements to do this.

    m = 0
    path[0] = start
    For i = 1 to n
        visited[i] = false
    visited[start] = true
    current = start
    next = 1

The rest of the program is a loop. Each time through the loop we look for another vertex that is not visited to add to the path. If we find one, good. If not, we back up a step and continue where we left off with that vertex. We continue until we reach vertex finish or we run out of possibilities, which is signaled by path becoming empty (m = -1). Here are the statements to do this.

    While (current ≠ finish) and (m ≥ 0)
        While (next ≤ n) and (visited[next] or (not edge[current, next]))
            next = next + 1
        If next ≤ n then
            visited[next] = true
            m = m + 1
            path[m] = next
            current = next
            next = 1
        else
            m = m - 1
            If m ≥ 0 then
                next = current + 1
                current = path[m]

If the algorithm ends with m ≥ 0 then we have found the desired path, which is stored in path[0], …, path[m] (m = 0 occurs only when start = finish). If the algorithm ends with m = -1 then there is no path from vertex start to vertex finish.

Models Involving Cycles.

There are a number of problems in the real world that involve determining whether a graph has a cycle. Here are two that are computer related.

Example 5. A UNIX system has five computer programs running simultaneously. Let's denote these five programs by P1, P2, P3, P4, P5. Each program uses certain files at certain times.
If one program, say P1, is using a certain file and another program, say P2, needs to use it, then the second program has to wait until the first is done with the file before it can continue. Let's say P2 is blocked by P1 in this case. There is a problem if we have a situation where the following holds.

    P_{j_1} is blocked by P_{j_2}
    P_{j_2} is blocked by P_{j_3}
    etc.
    P_{j_{n-1}} is blocked by P_{j_n}
    P_{j_n} is blocked by P_{j_1}

In this case all of the programs P_{j_1}, P_{j_2}, P_{j_3}, …, P_{j_n} are permanently stuck. This is called a deadlock. A computer's operating system needs to have a way of detecting deadlocks and taking steps to remedy the situation when they occur.

There is a nice graph model of this situation. The vertices are the computer programs. We make an edge from P_i to P_j if P_i is blocked by P_j. Note that this is a directed graph. With this graph model a deadlock corresponds to a cycle in the graph.

Consider the following situation.

    P5 is blocked by P1
    P3 is blocked by P2
    P1 is blocked by P4
    P4 is blocked by P5

The graph is pictured at the right. There is a deadlock involving P1, P4 and P5.

[Figure: the directed graph of Example 5, with vertices P1, P2, P3, P4, P5.]

Example 6. A certain C program has five functions. Let's denote these five functions by F1, F2, F3, F4, F5. A function F_{j_1} is recursive if either
i. it calls itself, or
ii. it calls another function F_{j_2} that calls it, or
iii. it calls another function F_{j_2} that calls another function F_{j_3} that calls F_{j_1}, or in general
iv. F_{j_1} calls F_{j_2}, F_{j_2} calls F_{j_3}, …, F_{j_{n-1}} calls F_{j_n}, and F_{j_n} calls F_{j_1},
i.e. F_{j_1} can get back to itself by some sequence of calls. Whether or not a function is recursive is of interest because a compiler has to do certain things for a recursive function that it does not have to do for a nonrecursive function. These days the normal thing for a compiler to do is to assume that all functions are recursive. However, in some situations it might be useful for the compiler to treat recursive and nonrecursive functions differently.

There is a nice graph model of this situation.
The vertices are the functions. We make an edge from F_i to F_j if F_i calls F_j. Note that this is a directed graph. With this graph model a group of recursive functions corresponds to a cycle in the graph.

Consider the following situation.

    F1 calls F2 and F4
    F2 calls F3 and F4
    F3 calls F5
    F5 calls F2

The graph is pictured at the right. F2, F3 and F5 are recursive.

[Figure: the directed graph of Example 6, with vertices F1, F2, F3, F4, F5.]
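
The two ideas of this section can be sketched in Python. What follows is only an illustration under stated assumptions, not a definitive implementation. The function find_path is a direct translation of the backtracking path algorithm given earlier in this section; the extra row 0 and column 0 in EDGE are padding added here so that vertices can keep the numbers 1 to n. The function find_cycle is not taken from the text, which does not give a cycle-finding algorithm: it is a standard depth-first search that reports a cycle when it meets a vertex that is still on the current search path. The names find_path, find_cycle, EDGE, BLOCKED_BY and CALLS, and the dictionary-of-lists representation of a directed graph, are choices made for this sketch.

```python
from typing import Dict, List, Optional

# Array edge for Example 4, one string per row (T = true, F = false).
ROWS = ["FTTTFFFF", "TFTFTTFF", "TTFTFTTF", "TFTFFFTF",
        "FTFFFTFT", "FTTFTFTT", "FFTTFTFT", "FFFFTTTF"]
# Pad with an unused row 0 and column 0 so vertices are numbered 1..n.
EDGE = [[False] * (len(ROWS) + 1)] + [
    [False] + [c == "T" for c in row] for row in ROWS
]


def find_path(edge: List[List[bool]], start: int, finish: int) -> Optional[List[int]]:
    """Backtracking search for a path; returns None if no path exists."""
    n = len(edge) - 1
    visited = [False] * (n + 1)
    visited[start] = True
    path = [start]                 # m in the text equals len(path) - 1
    current, nxt = start, 1
    while path and current != finish:
        # Skip candidates that are already visited or not adjacent.
        while nxt <= n and (visited[nxt] or not edge[current][nxt]):
            nxt += 1
        if nxt <= n:
            visited[nxt] = True
            path.append(nxt)
            current, nxt = nxt, 1
        else:
            # Dead end: drop the last vertex and resume just after it.
            path.pop()
            if path:
                nxt = current + 1
                current = path[-1]
    return path if path else None


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the vertices of one directed cycle, or None if there is none.

    Every vertex must appear as a key, even if it has no outgoing edges.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {v: WHITE for v in graph}
    stack: List[str] = []

    def visit(v: str) -> Optional[List[str]]:
        color[v] = GRAY
        stack.append(v)
        for w in graph[v]:
            if color[w] == GRAY:               # edge back into the current path
                return stack[stack.index(w):]
            if color[w] == WHITE:
                found = visit(w)
                if found:
                    return found
        stack.pop()
        color[v] = BLACK
        return None

    for v in graph:
        if color[v] == WHITE:
            found = visit(v)
            if found:
                return found
    return None


# Example 5: an edge from Pi to Pj means Pi is blocked by Pj.
BLOCKED_BY = {"P1": ["P4"], "P2": [], "P3": ["P2"], "P4": ["P5"], "P5": ["P1"]}
# Example 6: an edge from Fi to Fj means Fi calls Fj.
CALLS = {"F1": ["F2", "F4"], "F2": ["F3", "F4"], "F3": ["F5"], "F4": [], "F5": ["F2"]}

print(find_path(EDGE, 1, 8))    # a path from Denver to Columbus
print(find_cycle(BLOCKED_BY))   # ['P1', 'P4', 'P5']
print(find_cycle(CALLS))        # ['F2', 'F3', 'F5']
```

Note that find_path returns the first path the search happens to reach, which need not be the shortest one such as Denver to Chicago to Cincinnati to Columbus. Likewise, find_cycle reports a single cycle; deciding exactly which functions are recursive in a larger program would require checking every vertex, which this sketch does not attempt.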