Finding an Optimal Solution to the Traveling Salesman Problem using Heuristics and Dynamic Programming

Joseph Sewell & Eric Salmon

Abstract

This paper examines the advantages of using heuristic methods and dynamic programming to find the optimal solution to the Traveling Salesman Problem. Using techniques of dynamic programming and branch and bound, and exploiting properties of the convex hull and the minimum spanning tree, this method reduces the complexity of conducting an exhaustive search of the solution space while still obtaining an exact solution for the given set.

Key Words: Traveling Salesman Problem, Dynamic Programming, Branch and Bound, Convex Hull, Minimum Spanning Tree

1. Introduction

The traveling salesman problem has been a fundamental problem in the computer science community for over 50 years. The premise is fairly simple: a salesman must depart from a starting city, A, visit each city, C, exactly once, and return to A, all at the lowest total cost. The cost can be measured as time, distance, or another quantity. This paper examines the cost as distance, D, in 2-dimensional Euclidean space. The length of a tour can be expressed as

$$\sum_{i=1}^{n} d_{i\pi(i)}$$

A tour is represented by a cyclic permutation π, where π(i) is the city that follows city i in the tour and d_{iπ(i)} is the distance between the two. The traveling salesman problem is then the optimization problem of finding a permutation π that minimizes the length of the tour.[1]

While the statement of the problem is simple, obtaining the exact solution is laborious and has a high complexity. Examining every tour to find the optimal one requires generating all permutations of the set, N. A naïve implementation requires n! operations. The purpose of this algorithmic technique is to reduce the complexity of solving the problem for large sets.

2. Approach of Algorithm

The algorithm begins by calculating the distance between each pair of cities.
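The distance calculation and the tour-length sum from the introduction can be sketched as follows. This is a minimal illustration, assuming cities are given as (x, y) tuples; the names `distance_matrix` and `tour_length` are hypothetical and do not come from the paper's implementation.

```python
import math

def distance_matrix(cities):
    """Return the N x N matrix of Euclidean distances; the diagonal is zero."""
    return [[math.dist(a, b) for b in cities] for a in cities]

def tour_length(dist, tour):
    """Sum the distance from each city to its successor, including the
    closing edge from the last city back to the first."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))

# Three cities forming a 3-4-5 right triangle (illustrative data only).
cities = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
dist = distance_matrix(cities)
print(tour_length(dist, [0, 1, 2]))  # 3 + 4 + 5 = 12.0
```

Computing the matrix once means every later tour evaluation is a sequence of table lookups rather than repeated square roots.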
These distances are stored in an N×N matrix whose diagonal elements are zero.[1] The matrix is referenced throughout the generation of the optimal tour, which avoids recalculating the distance between any pair of cities.

Once this matrix has been generated, the algorithm proceeds to generate a minimum spanning tree (MST) of the entire set, N. Kruskal's algorithm is used because its low complexity allows the MST to be generated without excessive overhead. The algorithm runs in O(E log N) time, where E is the total number of edges within the set.[2] It should also be noted that E is at most N(N − 1)/2, since every pair of cities is joined by an edge.

The MST allows us to establish both a lower and an upper bound for the cost of the optimal tour by the following logic. The lower bound holds because removing any one edge from a tour leaves a spanning tree, so no tour can visit every city at a lower cost than the total cost of the edges in the MST. The upper bound is established by imagining a tour which retraces every edge of the MST, giving a total cost of 2 × the cost of the edges in the MST; in Euclidean space, skipping cities that have already been visited can only shorten this walk. If the set were to contain just two cities, city X and city Y, the MST would be X → Y, a proper lower bound. Conversely, multiplying the total distance of the MST by two gives a logical upper bound, X → Y → X. This upper bound is stored for later use and will be crucial in the further examination of applicable tours. The lower bound is useful for giving a quick estimate of the cost of the optimal tour, but will not be used in the remainder of this paper.

The algorithm next proceeds to produce the convex hull of the entire set, N, using Graham's scan. Graham's scan is used due to its low complexity and overhead; it runs in O(n log n) time.[3] The Graham scan is a method of finding the convex hull of a set of points, P.
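The MST bounds described above can be sketched with Kruskal's algorithm and a small union-find structure. This is a minimal illustration under the assumption that cities are (x, y) tuples and the graph is complete; the function name `mst_cost` is hypothetical.

```python
import math

def mst_cost(cities):
    """Total edge weight of a minimum spanning tree, via Kruskal's algorithm."""
    n = len(cities)
    parent = list(range(n))  # union-find forest, one tree per city initially

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every pair of cities is an edge; sort all edges by length.
    edges = sorted(
        (math.dist(cities[i], cities[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    total, used = 0.0, 0
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # edge joins two components, so keep it
            parent[ri] = rj
            total += weight
            used += 1
            if used == n - 1:  # a spanning tree has n - 1 edges
                break
    return total

# Unit square (illustrative data only): the MST uses three unit-length sides.
cities = [(0, 0), (1, 0), (1, 1), (0, 1)]
lower = mst_cost(cities)
upper = 2 * lower
print(lower, upper)  # 3.0 6.0
```

For the unit square the optimal tour has length 4, which falls between the lower bound of 3 and the upper bound of 6.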
We use the Graham scan to find the set of points CH that define the vertices of the convex hull of P. The scan computes the convex hull in the following manner:

1.) Search N for the point s with the lowest y-value. If more than one point has the lowest y-value, choose the one with the lowest x-value. Add s to CH.
2.) Sort the remaining points by the angle they make with the x-axis and s. Add the point with the lowest angle to CH.
3.) Check from the second-to-last point added to CH, to the last point added to CH, to the next point a in the sorted list. If the points are collinear, add a to CH. If the points create a left-hand turn, add a. If the points create a right-hand turn, remove the last point added to CH and repeat the check for a against the new last two points of CH. Repeat until all points have been considered.[4]

Once the convex hull has been generated, the next step in the algorithm can take place. It has been proven that if the weight of each edge represents distance and CH is the convex hull of the set of cities in 2-dimensional space, then the order in which the cities on the line segments defining CH appear in the optimal tour matches exactly the order in which they appear in CH.[5] This reveals a partial ordering within the convex hull in relation to the optimal tour of the entire set N. This partial ordering is maintained while generating possible permutations for the optimal tour.

We utilize two techniques to reduce complexity further. First, the algorithm recursively chooses either the next city in the partial order defined by CH, or any city that is not part of CH, from the pool of remaining cities. Second, we check whether moving to the chosen city and back to the starting city would cause the cost to exceed our upper bound. If so, we backtrack and try a different route. If not, the chosen city is removed from the pool and the process repeats. Whenever we complete a new tour, we adjust our upper bound to the cost of the new tour and continue.

3. Analysis

The algorithm is assessed based on its complexity and running time as compared with an exhaustive search of the entire solution space, which searches

$$n!$$

candidate tours. We choose to use a fixed starting point to avoid duplicate tours from different starting points along the same route. This lowers the complexity from n! to (n − 1)!. A count of method calls is kept to determine how many candidate solutions are being searched by the devised algorithm. The algorithm searches

$$\frac{(n-1)!}{(h-1)!}$$

or, alternatively,

$$\prod_{i=h}^{n-1} i$$

candidate tours, where n is the number of cities and h is the number of cities in CH. In the worst case, h = 3, and the number of tours to be searched is halved relative to (n − 1)!. In the best case, h = n and the optimal tour is found merely by generating the convex hull, with a complexity of O(n log n).

4. Results

First, we performed 4 tests to show the stepwise improvement of various parts of the algorithm:

Baseline: an exhaustive search with no heuristics.
Test 2: an exhaustive search using a fixed starting point.
Test 3: an exhaustive search using a fixed starting point and utilizing the partial ordering of the convex hull.
Test 4: an exhaustive search utilizing branch and bound techniques based on the cost of the tour versus an upper bound, in addition to the former techniques.

[Figure: Complexity versus N for the baseline, test 2, test 3, and test 4.]

Unfortunately, our tests were limited to a size of 15 cities, beyond which the exhaustive search without heuristics was too slow to utilize.

5. Conclusion

Large sets in which numerous cities lie within the convex hull (i.e., do not lie on the line segments defining the hull) do not see a significant reduction in complexity, as our algorithm only removes the [...]

The partial ordering is one of the more powerful tools implemented in this algorithm. In sets where every city lies on the line segments defining the convex hull, the exact solution is generated with a computational complexity of O(n log n). This is the best-case scenario of the algorithm. The algorithm is able to generate the exact solution for a set of 14 cities in less than 3 seconds. As the set of cities grows, so does the computational time, by a large magnitude. In its current implementation, the algorithm does not reduce the computational complexity for sets > 20 enough to be able to compute the solution in a reasonable amount of time on current hardware.

6. References

[1] Hahsler & Hornik. Journal of Statistical Software, Vol. 23, Issue 22, 2007.
[2] Kruskal, J. B. "On the shortest spanning subtree of a graph and the traveling salesman problem". Proceedings of the American Mathematical Society 7: pp. 48-50, 1956.
[3] Graham, R.L. An Efficient Algorithm for Determining the Convex Hull of a Finite Planar Set. Information Processing Letters 1, 132-133, 1972.
[4] Wikipedia contributors. (2014, Dec 2). Graham Scan [Online]. Available: http://en.wikipedia.org/wiki/Graham_scan
[5] Eilon, S., Watson-Gandy, C., Christofides, N. 1971, Distribution Management, Griffin, London.
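The hull-ordered branch-and-bound search described in Section 2 can be sketched as follows. This is an illustrative sketch, not the authors' implementation. It makes several assumptions: cities are (x, y) tuples and there is at least one city; the hull is computed with Andrew's monotone chain rather than the Graham scan, keeping only strict hull vertices; and the initial `upper_bound` defaults to infinity, where the paper seeds it with twice the MST cost. The names `hull_indices` and `solve` are hypothetical.

```python
import math

def cross(o, a, b):
    """Cross product of (a - o) and (b - o); positive for a left-hand turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_indices(pts):
    """Indices of the convex hull vertices in counter-clockwise order."""
    order = sorted(range(len(pts)), key=lambda i: pts[i])
    if len(order) < 3:
        return order

    def half(seq):
        chain = []
        for i in seq:
            # Pop while the last two points and i fail to make a left turn.
            while len(chain) >= 2 and cross(pts[chain[-2]], pts[chain[-1]], pts[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain[:-1]  # the last point starts the other half

    return half(order) + half(order[::-1])

def solve(pts, upper_bound=math.inf):
    """Return (cost, tour) of a shortest tour, searching only hull-ordered tours."""
    dist = [[math.dist(p, q) for q in pts] for p in pts]
    hull = hull_indices(pts)
    on_hull = set(hull)
    start = hull[0]  # fixed starting city, chosen on the hull
    best = {"cost": upper_bound, "tour": None}

    def extend(path, cost, next_hull, remaining):
        if not remaining:
            total = cost + dist[path[-1]][start]
            if best["tour"] is None or total < best["cost"]:
                best["cost"], best["tour"] = total, path[:]  # tighten the bound
            return
        for city in sorted(remaining):
            # A hull city may only be chosen if it is next in the hull order.
            if city in on_hull and city != hull[next_hull]:
                continue
            step = cost + dist[path[-1]][city]
            # Prune if going to this city and straight home exceeds the bound.
            if step + dist[city][start] > best["cost"]:
                continue
            remaining.remove(city)
            path.append(city)
            extend(path, step, next_hull + (1 if city in on_hull else 0), remaining)
            path.pop()
            remaining.add(city)

    extend([start], 0.0, 1, set(range(len(pts))) - {start})
    return best["cost"], best["tour"]

# A square with one interior city (illustrative data only).
print(solve([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
```

The pruning test uses a strict comparison so that a tour whose cost equals the initial bound, such as the two-city tour X → Y → X, is still accepted.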