Maze Traversal Using Graph Searching Algorithms

Juan Echeverry, Jeffrey Holland
UNCW

Abstract

Our initial plan was to analyze the difference between maze-solving algorithms that can be used by humans and those that can only be used by machines. Upon further research into the problem, we recognized the connection between mazes and graphs, as well as the connection between two of the algorithms we had chosen. We had chosen a Shortest Path Finder, which turned out to be an example of a breadth-first algorithm, and Tremaux’s algorithm, which we discovered to be the first instance of depth-first search.

Key Words: maze, graph, depth-first search, breadth-first search

1. Introduction:

We explored and compared two algorithms for solving mazes. In particular, we examined the runtimes of Tremaux’s algorithm and Shortest Path Finder at varying “perfect” maze sizes. We wanted to analyze the difference between maze-solving algorithms that can be carried out by humans (from inside the maze) and those that can only be carried out by a computer.

2. Background

A maze is a puzzle in the form of complex branching passages. The goal is to find a route from one point in the maze to another. Mazes that contain no loops are known as “standard” or “perfect” mazes and are equivalent to a tree in graph theory.

A graph is an ordered pair G = (V, E) composed of a finite set of vertices V and a set of edges E. In an undirected graph, such as the one formed by a maze, an edge is a two-element subset of V, that is, an unordered pair of vertices. There are two main data structures for representing graphs: an adjacency list and an adjacency matrix. In an adjacency list, vertices are stored as objects and every vertex stores a list of its adjacent vertices. An adjacency matrix is a two-dimensional matrix in which the rows represent source vertices and the columns represent destination vertices.

For the human-solvable approach we chose Tremaux’s algorithm, and for the computer approach we went with a Shortest Path Finder.
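The two graph representations described in the Background can be illustrated with a minimal Python sketch. The helper names `adjacency_list` and `adjacency_matrix`, the integer vertex labels, and the four-cell example maze are assumptions made for illustration; they are not taken from our implementation.

```python
def adjacency_list(n, edges):
    """Map each vertex 0..n-1 to the list of its neighbors."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)   # passages can be walked in both directions,
        adj[v].append(u)   # so each edge is stored at both endpoints
    return adj


def adjacency_matrix(n, edges):
    """Build an n x n matrix; entry [u][v] is 1 when u and v are adjacent."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


# Hypothetical 4-cell perfect maze: cell 1 is a junction joining cells 0, 2, and 3.
edges = [(0, 1), (1, 2), (1, 3)]
adj = adjacency_list(4, edges)
matrix = adjacency_matrix(4, edges)
print(adj)
print(matrix)
```

Because a perfect maze is a tree, it has exactly |V| - 1 passages, so the adjacency list stays small while the matrix always needs |V|² entries.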
Tremaux’s algorithm and the Shortest Path Finder are both connected to graph theory in the form of depth-first and breadth-first search, respectively.

Depth-first search was the first algorithm implemented. Depth-first search is an algorithm for traversing tree and graph data structures. It can be used to check for certain properties, such as whether a graph has a cycle or is connected, and was first discovered for use in solving mazes by Charles Pierre Tremaux in the 19th century.

How it works:
1. Pick a starting vertex (current node).
2. While you have not found the objective, go to an unvisited vertex adjacent to the current vertex.
3. This process continues until a vertex with no unvisited adjacent vertices is reached.
4. Afterwards the algorithm backs up one edge to the vertex it came from and repeats from step 2 until the objective is reached or it backs up to the starting vertex and the starting vertex has no adjacent, unvisited vertices.
5. If there are still unvisited vertices, the algorithm starts over at an arbitrary vertex of another connected component of the graph.

Breadth-first search is an exhaustive strategy for searching in a graph. It was discovered in the 1950’s by E. F. Moore for finding the shortest path out of a maze. It uses a queue data structure and is initialized with a starting vertex.

How it works:
1. Pick a starting vertex (current node).
2. Visit every neighbor of that vertex, then every unvisited vertex two edges away from the starting vertex, and so on outward.
3. This is continued until all vertices in the same connected component as the starting vertex are visited.
4. If there are still unvisited vertices, the algorithm starts over at an arbitrary vertex of another connected component of the graph.

In the worst case, the complexity of both algorithms is the same: O(|V| + |E|) when using an adjacency list, where V and E are the vertices and edges of the graph, and O(|V|²) when using an adjacency matrix.
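The two sets of steps above can be sketched in Python as follows. This is a minimal illustration over an adjacency list stored as a dictionary, not the code we timed; the function names `dfs_path` and `bfs_distances` and the example graph are assumptions. Both functions only search the connected component of the starting vertex, so the "start over in another component" step is left out.

```python
from collections import deque


def dfs_path(adj, start, goal):
    """Depth-first search; returns the route found from start to goal, or None."""
    visited = {start}
    path = [start]                    # current route; its last entry is the current vertex
    pending = [iter(adj[start])]      # untried neighbors for each vertex on the route
    while path:
        if path[-1] == goal:
            return list(path)
        for nxt in pending[-1]:
            if nxt not in visited:    # step 2: move to an unvisited neighbor
                visited.add(nxt)
                path.append(nxt)
                pending.append(iter(adj[nxt]))
                break
        else:                         # steps 3-4: nothing unvisited here, back up one edge
            path.pop()
            pending.pop()
    return None


def bfs_distances(adj, start):
    """Breadth-first search; returns the edge count from start to each reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:       # first time this vertex is reached
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Hypothetical graph with one loop (0-1-3-2-0) and an exit at vertex 4.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(dfs_path(adj, 0, 4))
print(bfs_distances(adj, 0))
```

In both functions each vertex is marked once and each adjacency list is scanned at most once, which is where the O(|V| + |E|) bound for the adjacency-list representation comes from.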
The worst case occurs when the starting vertex is at the furthest connected point in the graph from the end vertex, so that every other vertex and edge has to be traversed before the end vertex is reached.

3. Methods

We generated our mazes using a randomized implementation of Prim’s algorithm, which produced “perfect” mazes for our tests. Prim’s algorithm is a greedy algorithm for finding minimum spanning trees of connected, weighted, undirected graphs. The complexity of Prim’s algorithm is:
Adjacency list with a binary heap: O(|E| log |V|)
Adjacency matrix: O(|V|²)

The choice of maze generator did not affect our analysis beyond ensuring a level playing field. Both of the chosen algorithms can also be used on mazes that are not perfect, but this was not necessary for our purposes. Both algorithms are guaranteed to find a solution if one exists.

Tremaux’s algorithm, our human approach, differs from depth-first search in a few ways, and the algorithm itself gives some context for how they differ. The following is the exact language used for the implementation of our version of Tremaux’s algorithm:

As you walk down a passage, draw a line behind you to mark your path. When you hit a dead end turn around and go back the way you came. When you encounter a junction you haven't visited before, pick a new passage at random. If you're walking down a new passage and encounter a junction you have visited before, treat it like a dead end and go back the way you came. If walking down a passage you have visited before (i.e. marked once) and you encounter a junction, take any new passage if one is available, otherwise, take an old passage (i.e. one you've marked once). All passages will either be empty, meaning you haven't visited it yet, marked once, meaning you've gone down it exactly once, or marked twice, meaning you've gone down it and were forced to backtrack in the opposite direction.
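The marking rules above can be sketched in Python as follows. This is an illustrative reading of the rules, not our actual implementation: the function name `tremaux`, the dictionary-based adjacency list, and the step counter are assumptions added for the example.

```python
import random
from collections import defaultdict


def tremaux(adj, start, goal, rng=None):
    """Walk from start to goal by marking passages; returns (route, steps)."""
    rng = rng or random.Random()
    marks = defaultdict(int)          # passage {u, v} -> times walked (0, 1, or 2)
    seen = {start}                    # junctions already visited
    cur, prev, steps = start, None, 0

    def key(u, v):
        return frozenset((u, v))      # a passage is the same in both directions

    while cur != goal:
        arrived_by_new = prev is not None and marks[key(prev, cur)] == 1
        if arrived_by_new and cur in seen:
            nxt = prev                # new passage led to an old junction: turn around
        else:
            fresh = [n for n in adj[cur] if marks[key(cur, n)] == 0]
            once = [n for n in adj[cur] if marks[key(cur, n)] == 1]
            if fresh:
                nxt = rng.choice(fresh)   # take a new passage at random
            elif once:
                nxt = once[0]             # dead end or exhausted junction: backtrack
            else:
                return None, steps        # back at the start, every passage marked twice
        seen.add(cur)
        marks[key(cur, nxt)] += 1
        prev, cur = cur, nxt
        steps += 1

    # Passages marked exactly once trace the route between start and goal.
    route, back, node = [start], None, start
    while node != goal:
        nxt = next(n for n in adj[node] if n != back and marks[key(node, n)] == 1)
        back, node = node, nxt
        route.append(node)
    return route, steps


# Hypothetical 4-cell perfect maze with a dead end at cell 2 and the exit at cell 3.
maze = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(tremaux(maze, 0, 3))
```

The returned step count depends on the random choices made at junctions: in the example maze the walker takes either 2 steps or 4, depending on whether it explores the dead end first, while the recovered route is the same either way.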
When you finally reach the solution, paths marked exactly once will indicate a direct way back to the start. If the maze has no solution, you'll find yourself back at the start with all passages marked twice.

Notice that Tremaux’s algorithm does not contain an explicit stack. Rather, it achieves the same functionality by marking passageways as they are visited. In the worst case, you would have to visit every cell (vertex) twice, except for those on the path to the exit.

A Shortest Path Finder algorithm was used for the computer approach. It differs from breadth-first search in that each vertex keeps track of how it was reached: it records which vertex was used to get to it, and this record is used to trace the route back from the end vertex. In the worst case, you would have to visit every cell.

We first used the maze generator to produce mazes of three sizes: 10 x 10, 80 x 80, and 200 x 200. At each size, time stamps were taken before and after each run of our search algorithms, and the differences were used to create graphs of the results. At each size, we did 100 runs to record an average time but cut off the first 10 runs of each data set because those results were skewed to be far longer than they should have been. We were unable to determine what caused this, other than the time taken to allot processing power to the program. We graphed the individual runtimes at each size in order to observe whether there were any runs where the human approach was faster than the computer approach.

After observing this, we switched from individual runtimes to average runtimes. We started by generating mazes of size 10 x 10 and increased the size by 10 until we reached mazes of size 800 x 800. To save time, we did 30 runs rather than 100 at each size. This time we graphed the average runtime at each size to see whether we could observe the near-linear increase we were expecting.

4. Results & Analysis

In terms of average runtime, the Shortest Path Finder algorithm took less time to complete than Tremaux’s algorithm. While there were individual runs in which Tremaux’s had a significantly faster time, this was mostly due to random chance. At smaller maze sizes, our implementation of Shortest Path Finder is close to twice as fast as Tremaux’s algorithm. We attribute this to the fact that Shortest Path Finder, although it visits more cells (vertices) on average, visits each cell only once. Tremaux’s, on the other hand, visits a significant portion of the cells twice due to backtracking.

As the size of the mazes increases, the 2:1 time ratio of Tremaux’s to Shortest Path Finder seen at smaller sizes begins to decrease. We believe this is where the amount of memory that the Shortest Path Finder uses begins to outweigh the advantage of visiting each cell only once.

5. Conclusion

The Shortest Path Finder algorithm worked best within the scope of this project, where memory was not an issue at the maze sizes we went up to. Had memory been an issue, we believe Tremaux’s would have been a better choice.

The deeper we dove into the world of mazes, the more we realized how closely it is connected to graph theory. Because a perfect maze is nothing more than a tree, which is a type of graph, all of the rules that govern graph theory apply to mazes. This maze project pointed us towards key realizations about the interconnectedness of seemingly disparate problem fields.

Future work we would like to pursue includes optimization, in the form of parallelization and heuristics, and exploring the complexity of adding dimensions.

6. References

[1] Levitin, Anany. Introduction to the Design & Analysis of Algorithms. 3rd ed. Boston: Pearson Addison-Wesley, 2007. Print.
[2] Pullen, Walter D. "Think Labyrinth: Maze Algorithms." Think Labyrinth: Maze Algorithms. Magitech, 01 Nov. 2014. Web. 03 Dec. 2014.

7. Images and Graphs

Figure 1a) Maze with vertices labeled
Figure 1b) Graph representation of the maze
Figure 2: Runtimes in µs at 10 x 10 mazes (N = 10; y-axis: Time (µs); series: Tremaux, Shortest)
Figure 3: Runtimes in µs at 80 x 80 mazes (N = 80; y-axis: Time (µs); series: Tremaux, Shortest)
Figure 4: Runtimes in µs at 200 x 200 mazes (N = 200; y-axis: Time (µs); series: Tremaux, Shortest)
Figure 5: Avg. Runtimes across N x N mazes from N = 10 to N = 800, in increments of 10