Graph Theory

Graph theory is the study of mathematical objects known as graphs, which consist of vertices (or nodes) connected by edges. Any scenario in which one wishes to examine the structure of a network of connected objects is potentially a problem for graph theory. Examples of graph theory frequently arise not only in mathematics but also in physics and computer science. Below are some examples of graphs:

Graph Basics

A GRAPH consists of a set of dots, called VERTICES, and a set of EDGES connecting pairs of vertices. Any given vertex need not be connected by an edge.

The DEGREE of a vertex is the number of edges connected to that vertex.

A LOOP is a special type of edge that connects a vertex to itself.

A PATH is a sequence of vertices in which each vertex is joined to the next by an edge. Usually, we are interested in a path between two vertices. For example, a path from vertex A to vertex M is shown below. It is one of many possible paths in this graph.

A CIRCUIT is a path that begins and ends at the same vertex. A circuit starting and ending at vertex A is shown below.

A graph is CONNECTED if there is a path from any vertex to any other vertex. Every graph drawn so far except the second one has been connected. The graph below is disconnected; there is no way to get from the vertices on the left to the vertices on the right.

Depending upon the problem being solved, sometimes WEIGHTS are assigned to the edges. The weights could represent the distance between two locations, the travel time, or the travel cost. It is important to note that the length of an edge as drawn in a graph does not necessarily correspond to the weight of that edge.

Example
• The graph below shows 7 cities. The weights on the edges represent the driving time in minutes between the cities.
a. How many vertices and edges does the graph have?
There are 7 vertices labeled T, A, E, P, NB, MR and Y. There are 9 edges. You can count the 6 around the border of the graph and the 3 inside.
b. Is the graph connected?
This graph is connected. You can travel to any of the vertices using a combination of edges.
c. What is the degree of the vertex labeled MR?
MR has degree 3, since there are three edges connected to MR.
d. If you drive from T to A to MR, is that a path or a circuit?
This is a path. A circuit is a round trip, and we don’t get back to T.
e. If you drive from MR to P to Y to MR, is that a path or a circuit?
This is a circuit. Starting and ending at the same vertex makes a circuit.
f. How many minutes at minimum does it take to drive from E to P?
Since this graph is weighted, we can look at the weight of the edge between E and P and see that it will take 96 minutes to make this drive.

You Try
• The graph below shows 5 cities. The weights on the edges represent the airfare for a one-way flight between the cities.
a. How many vertices and edges does the graph have?
b. Is the graph connected?
c. What is the degree of the vertex representing LA?
d. If you fly from Seattle to Dallas to Atlanta, is that a path or a circuit?
e. If you fly from LA to Chicago to Dallas to LA, is that a path or a circuit?
f. What is the minimum cost to fly from Seattle to Atlanta?

Shortest Path (Dijkstra’s Algorithm)

The shortest path is the path on a weighted graph that has the smallest sum of edge weights. It might represent the cheapest cost, the least time, or the shortest distance, depending on the application. Given a graph, there are potentially thousands of different paths that can be taken from one vertex to another, so it is important to find a technique that is accurate (optimal) AND fast (efficient). Sometimes this is not possible, and we accept a technique that is fast and gives a good approximation (heuristic). For the shortest path, we do have an optimal and efficient algorithm, called Dijkstra’s Algorithm.

Dijkstra’s Algorithm:
1. Mark the ending vertex with a distance of zero. Designate this vertex as current.
2. Find all vertices leading to the current vertex. Calculate their distances to the end. Since we already know the distance the current vertex is from the end, this will just require adding the most recent edge. Don’t record this distance if it is longer than a previously recorded distance.
3. Mark the current vertex as visited. We will never look at this vertex again.
4. Mark the unvisited vertex with the smallest distance as current, and repeat from step 2.

Example
• Find the shortest path from vertex A to vertex E using Dijkstra’s Algorithm.

Step 1: The final vertex E is the active vertex. We label it with the following notation: [0,E]. This means that it takes 0 distance to get here from E.

Step 2, 3: From E we can get to D and F. D is a distance 9, so we mark D with [9,E]. This indicates that it takes 9 distance to get to D from E. F is a distance 10, so it gets [10,E]. Lastly, we cross off E.

Step 4: The vertex with the lowest distance that isn’t crossed off is D, which is now the active vertex. Repeat:

Step 2, 3: From D we can get to C and F. C is a distance 7 from D, so a total of 16 from E, so we mark C with [16,D]. This indicates that it takes 16 distance to get from E to C through D. F is a further distance 14, so it would get [23,D]. This is longer than the current [10,E], so we don’t make a change. Notice that we don’t visit E, as it is already crossed off. Lastly, we cross off D.

Step 4: The lowest current distance is F, so F is the active vertex. Repeat:

Step 2, 3: From F we can get to C and G. C is a further distance 4, so we can mark C with [14,F]. This is shorter than the current [16,D], so we replace C’s mark. G is a further distance 2, so it gets [12,F]. Lastly, we cross off F.

Step 4: The vertex with the lowest distance that isn’t crossed off is G, which is now the active vertex. Repeat:

Step 2, 3: From G we can get to H and I. H is a distance 1 from G, so a total of 13 from E, so we mark H with [13,G]. This indicates that it takes 13 distance to get from E to H through G. I is a further distance 6, so it gets [18,G]. Lastly, we cross off G.

Step 4: The vertex with the lowest distance that isn’t crossed off is H, which is now the active vertex. Repeat:

Step 2, 3: From H we can get to A, B and I. A is a distance 8 from H, so a total of 21 from E, so we mark A with [21,H]. B is a distance 11 from H, so a total of 24 from E, so we mark B with [24,H]. I is a distance 7, so it would get [20,H]. However, this is longer than the current [18,G], so we make no change. Lastly, we cross off H.

Step 4: The vertex with the lowest distance that isn’t crossed off is C, which is now the active vertex. Repeat:

Step 2, 3: From C we can get to B and I. B is a distance 8 from C, so a total of 22 from E. This is better than the current mark of 24, so we mark B with [22,C]. I is a distance 2 from C, so a total of 16 from E. This is better than the current mark of 18, so we mark I with [16,C]. Lastly, we cross off C.

Step 4: The vertex with the lowest distance that isn’t crossed off is I, which is now the active vertex. Repeat:

This is where the algorithm essentially terminates. We notice that we can’t get to any uncrossed vertices from I, so we cross it off and move to the next active vertex, A. It can get to B, but not with a better distance, so it is crossed off. This leaves the final vertex B, which doesn’t have any new vertices to go to. Thus, our final graph is as follows:

We can see the shortest distance from A to E is 21. From A, we travel to H, then to G, then to F, then to E. We can also see the shortest distances from all other vertices to E, and which path to take.

You Try
• Find the shortest path from vertex b to vertex h using Dijkstra’s Algorithm.
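The walkthrough above can be sketched in Python. This is a minimal sketch, not the course’s own code: the function names (build_graph, dijkstra, path_to_end) are illustrative, and the edge weights are the ones read off the walkthrough. The figure also has an A-B edge whose weight the text does not state, so it is left out here; the walkthrough notes that it does not improve any distance.

```python
import heapq

# Edge weights taken from the worked example. The A-B edge is omitted
# because its weight is not given in the text (assumption: it changes nothing).
EDGES = [
    ("E", "D", 9), ("E", "F", 10), ("D", "C", 7), ("D", "F", 14),
    ("F", "C", 4), ("F", "G", 2), ("G", "H", 1), ("G", "I", 6),
    ("H", "A", 8), ("H", "B", 11), ("H", "I", 7), ("C", "B", 8),
    ("C", "I", 2),
]

def build_graph(edges):
    """Turn a list of (u, v, weight) into an undirected adjacency list."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph

def dijkstra(graph, end):
    """Return (dist, via): distance to `end` and the next vertex toward it."""
    dist = {end: 0}          # step 1: the ending vertex has distance zero
    via = {end: end}         # the second half of a label such as [9,E]
    visited = set()          # vertices that have been crossed off
    heap = [(0, end)]
    while heap:
        d, current = heapq.heappop(heap)   # step 4: smallest unvisited distance
        if current in visited:
            continue                       # stale entry for a crossed-off vertex
        visited.add(current)               # step 3: cross off the current vertex
        for neighbor, weight in graph[current]:   # step 2: update neighbors
            if neighbor in visited:
                continue
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d     # record only if it is an improvement
                via[neighbor] = current
                heapq.heappush(heap, (new_d, neighbor))
    return dist, via

def path_to_end(via, start):
    """Follow the labels from `start` until the ending vertex is reached."""
    path = [start]
    while via[path[-1]] != path[-1]:
        path.append(via[path[-1]])
    return path

graph = build_graph(EDGES)
dist, via = dijkstra(graph, "E")
print(dist["A"], path_to_end(via, "A"))
```

The priority queue plays the role of scanning for the lowest uncrossed label in step 4; the dist and via dictionaries together hold the [distance, vertex] labels.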
Honestly, if you lived in A, you would want to run Dijkstra’s Algorithm starting from vertex A instead, so you would end up with the shortest distances from home to all the rest of the cities you visit. Although this algorithm is tedious to do by hand, it is easy to implement with computer code, and algorithms of this kind are essentially how GPS maps give you the shortest path to travel. Computers are so fast that they can update almost instantaneously with the best current path, depending on the current timings of routes.

Euler Vs Hamiltonian

When it comes to paths and circuits, sometimes it is important to focus on the vertices and sometimes it is important to focus on the edges. For example, in a neighborhood where the houses are vertices and the streets are edges, the person delivering the mail cares about the vertices (houses) and making sure not to miss one, whereas the street sweeper or snowplow driver cares about the edges (streets) and making sure not to miss one.

Euler’s Edges and Hamiltonian Vertices (Euler is pronounced “oy-ler”)

Euler circuits and paths are all about the edges. Note the alliteration with Euler’s Edges. This can help you remember. Hamiltonian circuits and paths are all about the vertices.

An Euler circuit is a circuit that uses every edge in a graph with no repeats. Being a circuit, it must start and end at the same vertex.

An Euler path also uses every edge once with no repeats but does not have to start and end at the same vertex.

Does a graph have an Euler path or circuit? Luckily there are simple theorems that tell us about this:

Euler’s Path And Circuit Theorems
A connected graph will contain an Euler path if it contains at most two vertices of odd degree.
A connected graph will contain an Euler circuit if all vertices have even degree.

Although there are algorithms for finding Euler paths and circuits, for this course it will be more of a trial-and-error process to find these.
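The degree test in the theorems above is easy to check mechanically. The following is a minimal sketch, assuming the graph is connected (the function does not verify this) and is given as a list of edges; the function name euler_status and the three small sample graphs are made up for illustration and do not come from the course figures.

```python
from collections import Counter

def euler_status(edges):
    """Classify a CONNECTED graph given as a list of (u, v) edges.

    Connectivity is assumed, not checked.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1      # a loop (u == v) adds 2 to that vertex's degree
    odd = [vertex for vertex, d in degree.items() if d % 2 == 1]
    if len(odd) == 0:
        return "Euler circuit"   # every vertex has even degree
    if len(odd) == 2:
        return "Euler path"      # must start at one odd vertex, end at the other
    return "neither"

# Hypothetical sample graphs, for illustration only.
triangle = [("A", "B"), ("B", "C"), ("C", "A")]   # all degrees are 2
chain = [("A", "B"), ("B", "C")]                  # A and C have degree 1
star = [("A", "B"), ("A", "C"), ("A", "D")]       # four odd-degree vertices

for name, g in [("triangle", triangle), ("chain", chain), ("star", star)]:
    print(name, euler_status(g))
```

A graph can never have exactly one vertex of odd degree, because every edge adds 2 to the total of all degrees, so "at most two" in practice means zero or two.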
We will instead focus on Hamiltonian paths, and especially Hamiltonian circuits, for the remainder of this module.

A Hamiltonian circuit is a circuit that visits every vertex once with no repeats. Being a circuit, it must start and end at the same vertex.

A Hamiltonian path also visits every vertex once with no repeats but does not have to start and end at the same vertex.

Tables and Graphs

You will sometimes encounter tables of weights rather than weighted graphs. The information contained, however, is equivalent. Being able to convert from a table to a graph is a valuable skill. It gives you the option to work the problem with a graph rather than interpreting the table. It is also possible to work just with the table, but it takes extra practice.

Example
Convert the following table of weights into an equivalent graph.

You can arrange your vertices any way you want, but I usually place them in a nice pattern like the ones to the right of the table. Next, add edges. Finally, add weights so you can clearly see which edge they belong to.

You Try
Convert the following table of weights into an equivalent graph.

Traveling Salesman Problem

Suppose a salesman needs to give sales pitches in four cities. He looks up the airfares between each city and puts the costs in a graph. In what order should he travel to visit each city once, then return home, with the lowest cost?

This problem considers visiting each vertex exactly once, in a circuit, with the minimum cost. It is, therefore, a search for the minimum cost Hamiltonian circuit. Unfortunately, no algorithm is known that is both efficient and optimal.

There is an optimal algorithm. It is called the Brute Force Algorithm, and it has us consider EVERY possible Hamiltonian circuit and then pick the cheapest one. The problem is that if there are 15 cities, there are over 43 BILLION different unique circuits. That would be an exhaustive search.
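To see where a number like 43 billion comes from, here is the standard count as a worked calculation. It assumes every pair of cities is joined by an edge and that a circuit and its reverse count as the same circuit; the symbol n for the number of cities is introduced here for illustration and is not used elsewhere in these notes.

```latex
\[
\text{number of distinct Hamiltonian circuits} = \frac{(n-1)!}{2},
\qquad
n = 15:\quad \frac{14!}{2} = \frac{87{,}178{,}291{,}200}{2} = 43{,}589{,}145{,}600 .
\]
```

Fixing the starting city leaves (n-1)! orderings of the remaining cities, and dividing by 2 removes each circuit's reverse.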
So, it is optimal because we will find the cheapest circuit, but it is not efficient, as it will take us far too long to perform. Therefore, we settle for a good approximation algorithm. This is what is referred to as a heuristic algorithm. It is efficient and usually close to optimal. In other words, we are not guaranteed to get the best solution for any of the problems covered, but each algorithm produces a definite answer that should be close to, or even equal to, the best solution.

Nearest Neighbor Algorithm (NNA)
1. Select a starting point. This vertex is given in the problem.
2. Move to the nearest unvisited vertex (the edge with smallest weight).
3. Repeat until every vertex has been visited, then return to the starting vertex to complete the circuit.

This algorithm is very efficient (fast and easy to carry out), but it is a GREEDY algorithm. It only looks at the immediate cost without considering the consequences in the future. Therefore, it is not guaranteed to produce the optimal solution.

Example
• Starting at Seattle, find the circuit that results from the application of the Nearest Neighbor algorithm.

Starting at Seattle, the cheapest flight would be to LA, at a cost of $70. Now we are in LA. The cheapest flight from LA (not visiting any past vertices/cities) is LA to Chicago: $100. Then from there:
Chicago to Atlanta: $75
Atlanta to Dallas: $85
Dallas to Seattle: $120
Total Cost: $70+$100+$75+$85+$120 = $450
Circuit: Seattle, LA, Chicago, Atlanta, Dallas, Seattle.

You Try
• Starting at vertex A, find the circuit that results from the application of the Nearest Neighbor algorithm.
• Starting at vertex D, find the circuit that results from the application of the Nearest Neighbor algorithm.
As you work, make sure you don’t go back to the initial vertex until the final step, and don’t miss any vertex.

Some comments about the questions above: note that in the Seattle example, the circuit could be flown in the reverse order: Seattle, Dallas, Atlanta, Chicago, LA, Seattle.
This plan would visit the same cities and cost the same low price. This demonstrates that you can list your circuit in the opposite order and still have a correct solution. Both answers should be accepted on the homework and on exams.

Also, notice from the two You Try problems that you can get different results from the Nearest Neighbor algorithm depending on what vertex you start with. This is a heuristic algorithm, so it gives an easy and good approximation, not necessarily the best answer.

So, how can we adapt the NNA to possibly find the best solution? What if we were to apply the NNA starting at each separate vertex and take the best answer? The result is still not guaranteed to be the true best, but it is at least as good as any single run!

Repeated Nearest Neighbor Algorithm (RNNA)
1. Do the Nearest Neighbor Algorithm starting at each vertex.
2. Choose the circuit produced with the minimal total weight.

Example
• Determine the resulting circuit and its weight by applying the Repeated Nearest Neighbor Algorithm on the following graph. If multiple unique circuits result, list all of them.

You Try
• Determine the resulting circuit and its weight by applying the Repeated Nearest Neighbor Algorithm on the following graph. If multiple unique circuits result, list all of them. Give your circuit as a list of vertices, starting and ending at vertex A.

For the Example, to collect the different answers we get from this algorithm, we can use the following table:

Start   Circuit        Weight
A       A→C→B→E→D→A    137
B       B→A→C→D→E→B    123
C       C→A→B→E→D→C    123
D       D→E→B→A→C→D    123
E       E→D→C→A→B→E    123

We can see that the RNNA gives a smallest weight of 123. There appear to be 4 different possible circuits. Let’s take a closer look:

First, note that although BACDEB is the answer we got from starting at vertex B, it can be written in reverse order, and can in fact be written in all the following ways: ACDEBA, ABEDCA, BACDEB, BEDCAB, CDEBAC, CABEDC, DEBACD, DCABED, EBACDE, and EDCABE.
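The list of equivalent ways to write a circuit can be generated rather than worked out by hand. Here is a minimal sketch, assuming each vertex has a one-letter name and the circuit is written as a string that repeats its first vertex at the end; the function names writings and same_circuit are illustrative only.

```python
def writings(circuit):
    """Return every way to write the same circuit.

    Assumes one-letter vertex names and a string such as "BACDEB"
    whose last letter repeats the first.
    """
    cycle = circuit[:-1]                      # drop the repeated final vertex
    results = set()
    for order in (cycle, cycle[::-1]):        # forward and reverse direction
        for i in range(len(order)):           # every possible starting vertex
            rotated = order[i:] + order[:i]
            results.add(rotated + rotated[0]) # close the circuit again
    return results

def same_circuit(first, second):
    """True if the two strings describe the same circuit."""
    return second in writings(first)

print(sorted(writings("BACDEB")))
print(same_circuit("BACDEB", "CABEDC"))   # the circuit found starting at C
print(same_circuit("BACDEB", "ACBEDA"))   # the circuit found starting at A
```

A circuit on 5 vertices has 5 possible starting points and 2 directions, which accounts for the 10 writings listed above.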
With this list, we can see that our answer starting with C is equivalent, as are those starting with D and E. So, what is the correct answer you will be expected to give? To make the answer more consistent, it is customary to ask for the answer starting and ending with A. So, listing ACDEBA or ABEDCA would be accepted.

Sorted Edges Algorithm

Rather than taking a short-sighted approach like NNA, Sorted Edges attempts to ensure the lowest weights are included in the final circuit. This is done by starting with the cheapest edge.

1. Select the cheapest unused edge in the graph.
2. Repeat step 1, adding the cheapest unused edge to the circuit, unless:
a. adding the edge would create a circuit that doesn’t contain all vertices, or
b. adding the edge would give a vertex degree 3.
3. Repeat until a circuit containing all vertices is formed.

Example
• Determine the resulting circuit and its weight by applying the Sorted Edges Algorithm on the following table. Write your answer as a circuit starting and ending with vertex A.

I prefer to use a graph to collect the circuit as I create it, so I would start with a blank graph like the one on the right. Next, I search for the smallest weight and see the BD edge weighted 3. That is the first one I place on the graph. I then cross it off from the table (both entries to be safe).

Now things start to get more difficult. The next smallest weight in the table is 22. This would connect C and B, but that would create a premature circuit with the top three vertices. This is not allowed (see 2a above). So cross that set of 22s off. Next is the 25. Again, this would create a premature circuit AS WELL as make vertex D degree 3 (disallowed via 2b). Cross them off! 32 is next, and it doesn’t work either (premature circuit, because A is still not included). Next is 33, and this connects A and C. Yay! It is allowed! If you take a moment, you can also realize this is the ONLY possible move left.

There is only one thing to do now.
Finish the circuit with the edge connecting A and E.
Final circuit: ACDBFEA
Final weight: 124

You Try
• Determine the resulting circuit and its weight by applying the Sorted Edges Algorithm on the following table. Write your answer as a circuit starting and ending with vertex A.

This completes the techniques we will cover for finding low-cost Hamiltonian circuits. For our last concept, we need to review what it means for a graph to be connected. As an exercise that won’t be on homework, but will help with understanding a subtle shift in focus needed, try the following:

The following three graphs are disconnected. For each graph, draw ONLY ONE edge that connects the graph. There are many possible solutions.

The following three graphs are connected, and each has an extra edge that is not needed for the graph to be connected. Cross off ONLY ONE edge that leaves the graph still connected. There may be multiple solutions.

Minimum Cost Spanning Trees and Kruskal’s Algorithm

A spanning tree is a connected graph using all vertices in which there are no circuits. In other words, there is a path from any vertex to any other vertex, but no circuits. The minimum cost spanning tree is the spanning tree with the smallest total edge weight.

For example, the first graph below has all the possible spanning trees on its right:

In order to find the spanning tree that has the smallest total edge weight, we can use the following:

Kruskal’s Algorithm:
1. Select the cheapest unused edge in the graph.
2. Repeat step 1, adding the cheapest unused edge, unless:
a. adding the edge would create a circuit.
3. Repeat until a spanning tree is formed (graph is connected).

Example
• Find a minimum cost spanning tree on the graph below using Kruskal’s algorithm.

We start with a blank graph:
The cheapest edge is CE with weight 3: Does adding it make a circuit? NO. Is the graph connected? NO.
Next is DE with weight 11: Does adding it make a circuit? NO. Is the graph connected? NO.
Next is AB with weight 12: Does adding it make a circuit? NO. Is the graph connected? NO.
Next is CD with weight 18: Does adding it make a circuit? YES, so DON’T INCLUDE.
Next is AE with weight 21: Does adding it make a circuit? NO. Is the graph connected? YES.
And we are DONE! Final weight = 3 + 11 + 12 + 21 = 47

You Try
• Find a minimum cost spanning tree on the graph below using Kruskal’s algorithm. Your answer should be a graph. Also, list the weight of your answer.
• Find a minimum cost spanning tree for the table below using Kruskal’s algorithm. Your answer should be a graph. Also, list the weight of your answer.
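The Kruskal walkthrough above can be sketched in Python. This is a minimal sketch under stated assumptions: only the five edges named in the walkthrough are included (any remaining edges of the figure would be heavier than 21 and are never reached), and the function name kruskal and the union-find helper are illustrative choices, not part of the course material.

```python
def kruskal(vertices, edges):
    """Return the tree edges picked by Kruskal's algorithm.

    `edges` is a list of (weight, u, v). A union-find structure answers
    the question "would adding this edge create a circuit?".
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Walk up to the representative of v's connected piece.
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # shorten the chain as we go
            v = parent[v]
        return v

    tree = []
    for weight, u, v in sorted(edges):      # cheapest unused edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # different pieces: no circuit
            parent[root_u] = root_v
            tree.append((u, v, weight))
        if len(tree) == len(vertices) - 1:  # spanning tree is complete
            break
    return tree

# Only the edges mentioned in the walkthrough (assumption: all others are heavier).
VERTICES = "ABCDE"
EDGES = [(3, "C", "E"), (11, "D", "E"), (12, "A", "B"),
         (18, "C", "D"), (21, "A", "E")]

tree = kruskal(VERTICES, EDGES)
total = sum(weight for _, _, weight in tree)
print(tree, total)
```

The stopping test uses the fact that a spanning tree on 5 vertices has exactly 4 edges, which is the same moment the walkthrough answers "Is the graph connected?" with YES.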