Evolutionary Computing and the Traveling Salesman

By Sam Mulder
11/2/2002
CS 447 – Advanced Topics in AI
Abstract
The Traveling Salesman Problem (TSP) is a very hard optimization problem in the field
of operations research. It has been shown to be NP-complete, and is an often-used
benchmark for new optimization techniques. One of the main challenges with this
problem is that standard, non-AI heuristic approaches are currently very effective and in
wide use for the common fully connected, Euclidean variant that is considered here. In
this paper we examine current evolutionary approaches to the TSP, and introduce a new
approach that takes advantage of the modern state-of-the-art as exemplified in the
Chained Lin-Kernighan algorithm to provide learning. We also discuss the limitations of
evolutionary approaches in general as they relate to this problem, as well as potential areas
where they may prove useful and not just academic. In the results section, we compare
the new algorithm with traditional Chained Lin-Kernighan as well as a modern neural
clustering approach.
Keywords – Traveling Salesman Problem (TSP), Evolutionary Computing, Evolutionary
Algorithms, Divide-and-Conquer, Chained Lin-Kernighan, Adaptive Resonance Theory
(ART), Self-organizing Maps (SOM), Clustering
Table of Contents
I. Introduction
   A. Traveling Salesman Problem
   B. Heuristic Approaches
   C. Neural Clustering Approach
   D. Evolutionary Algorithms
II. Current Evolutionary Techniques
III. Proposed Algorithm
IV. Experimental Setup
V. Results
VI. Conclusions
VII. Bibliography
Introduction
Evolutionary Algorithms (EAs) are an exciting and rapidly expanding field of research
in the artificial intelligence community. They attempt to model natural processes in order
to evolve better solutions to problems using the basic operators of crossover and
mutation. EAs have many advantages in trying to solve hard problems, since they evolve
solutions from random starting points and require little domain knowledge about the error
surface to be explored.
The Traveling Salesman Problem (TSP) is one of the most studied problems in computer
science literature. It is an example of the important class of problems known as NP-complete
problems. An NP-complete problem is one that is solvable in polynomial time by a
non-deterministic algorithm, but not necessarily by a deterministic one. Another way of
considering this class of problems is to say that a correct solution may be checked in
polynomial time, but no known algorithm can solve the problem in that time. These problems
are interesting because any NP-complete problem can be reduced to any other, so finding an
algorithm to solve one in polynomial time gives you an algorithm to solve all of them. They
are also interesting because it has never been proven that NP-complete problems cannot be
solved in polynomial time, so the question remains open.
To show where the class of NP-complete problems falls more clearly, consider two
example problems and how they compare to the TSP. The first problem is to sort a list of
integers. This problem takes on the order of O(n lg(n)) time, using “Big-Oh” notation to
represent an asymptotic upper bound, where n is the number of elements in the list. The
sorting problem is therefore solvable in polynomial time, since n lg(n) = O(n²). Many
common algorithms are known to solve the problem in this bound, such as Merge Sort
and Heap Sort. The second example problem is to generate every permutation of the
integers in a list. This clearly takes O(n!) time to solve, since every permutation must be
generated. This problem is actually harder than the traveling salesman, though, because
even if we had a non-deterministic algorithm that always made the correct choices it
would take O(n!) time. Using the other description of NP-complete, it would take O(n!)
time to even check if we had a correct answer. The TSP falls between these two example
problems in difficulty. The most general form of the TSP is to find a Hamiltonian cycle
given an arbitrary graph. All known algorithms to solve this problem take greater than
polynomial time, but if we have a solution we can check it in O(n) time. The more
specific case of the TSP considered in this paper is also NP-complete, but its solutions are
more complicated to check.
Now that we see why the TSP is interesting, what hope do EAs have of solving this
problem? The first thing to consider is that for any practical instance of the TSP, an exact
solution is intractable. The best exact algorithms become intractable even with the
number of cities less than 100. Instead, heuristic algorithms are used to find
approximate solutions. In this paper we consider how an EA may be used to augment
standard heuristic approaches and create possibilities for greater parallelism in finding
solutions.
Traveling Salesman Problem
Given a complete undirected graph G =(V,E), with non-negative integer costs c(u,v) for
each edge in E, the most general form of the traveling salesman problem is equivalent to
finding any Hamiltonian cycle over G, where such a cycle is known as a tour. The more
common form of the problem is the optimization problem of trying to find the shortest
Hamiltonian cycle. Both of these problems have been proven to be NP-complete in [1].
These problems are very useful to consider because they map so easily to many real-world
applications in a wide range of fields from network routing to cryptography [2].
The variation of the TSP that we consider is even more limited. We look only at graphs
that can be mapped to a two-dimensional Euclidean coordinate system where the edge
weight between two nodes is the distance between them. This system implies that the
triangle inequality holds such that c(u,w) ≤ c(u,v) + c(v,w). We also require that the XY
coordinates of each node be given. This variation can be shown to still be NP-complete
and therefore to map onto the more general problem [3]. The Euclidean form of the TSP
has been widely studied [4-7].
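
As a concrete illustration of this setting (a minimal sketch; the data structures and names
are ours, not those of the test-bed described later), a tour can be represented as a
permutation of city indices, with edge weights computed on the fly from the XY coordinates:

    import math

    def tour_length(cities, tour):
        """Length of a closed tour. `cities` is a list of (x, y) coordinates
        and `tour` is a permutation of the city indices."""
        total = 0.0
        n = len(tour)
        for i in range(n):
            x1, y1 = cities[tour[i]]
            x2, y2 = cities[tour[(i + 1) % n]]   # wrap around to close the cycle
            total += math.hypot(x2 - x1, y2 - y1)
        return total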
Heuristic Approaches
The current best results on large-scale TSP instances come from variations on an
algorithm proposed by Lin and Kernighan in 1973 [8]. This algorithm takes a randomly
selected tour and optimizes it by stages until a local minimum is found. This result is
saved and a new random tour selected to begin the process again. Given enough time, the
optimal tour can be found for any instance of the problem, although in practice this is
limited to small instances for an absolute result due to time constraints. Even with
extremely large instances, however, reasonably short tours can be found [8].
The optimizations performed by the Lin-Kernighan (LK) algorithm use the concept of
λ-optimality. If n is the number of cities in a TSP, then we know that given any tour we
can reach an optimal tour by swapping at most n edges. Using λ-optimality, we consider
the optimal tour that can be reached by swapping only λ edges. As λ grows, the chance
that the λ-optimal tour is actually optimal increases. Unfortunately, the processing time
required to find λ-optimal tours is intractable for large λ. The largest commonly used
values for λ are 2 and 3, although some attempts have been made with λ as 4 or 5 [8].
The algorithm proposed by Lin and Kernighan examines variable-λ optimization. In brief, an
initial 2-edge swap is chosen. The algorithm then examines whether it would be
profitable to make a 3-edge swap, and continues increasing the number of edges as long
as the total swap remains an improvement over the initial tour. This process is repeated
with each node being considered in turn as a starting point for the initial 2-edge swap
until no further improvements can be found. The original algorithm then started with a
different random tour and worked from the beginning, always saving the best tour seen.
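
The full Lin-Kernighan move is considerably more involved, but the 2-opt exchange on which
it builds (the λ = 2 case) can be sketched as follows. This is an illustrative fragment, not
the implementation used in this paper: it repeatedly reverses the tour segment between two
cut points whenever doing so replaces two edges with two shorter ones.

    import math

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def two_opt(cities, tour):
        """Apply improving 2-edge exchanges until the tour is 2-optimal."""
        n = len(tour)
        improved = True
        while improved:
            improved = False
            for i in range(n - 1):
                for j in range(i + 2, n):
                    if i == 0 and j == n - 1:
                        continue  # these two tour edges share a city; skip
                    a, b = cities[tour[i]], cities[tour[i + 1]]
                    c, d = cities[tour[j]], cities[tour[(j + 1) % n]]
                    # replace edges (a,b) and (c,d) with (a,c) and (b,d) if shorter
                    if dist(a, c) + dist(b, d) < dist(a, b) + dist(c, d):
                        tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                        improved = True
        return tour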
A popular variant, and the most effective known algorithm for large TSP problems, is
Chained LK. With Chained variants of LK, instead of beginning with a random tour at
each new iteration, the algorithm perturbs the previous tour by some amount and
optimizes from there. This perturbation generally consists of a process known as a
double bridge [8, 9]. This consists of a 4-edge exchange that is never found by the basic
LK algorithm and is illustrated below.
Figure - Double Bridge
This exchange has the effect of changing the global shape of the tour and starts the basic
LK algorithm in a position that should find a different minimal tour, increasing the
chance that the actual minimum tour is found [17].
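
A minimal sketch of this perturbation, with cut points chosen uniformly at random (an
assumption for illustration; a production implementation would typically bias the segment
sizes):

    import random

    def double_bridge(tour):
        """Cut the tour into four segments A|B|C|D and reconnect them as A|C|B|D,
        the 4-edge exchange that basic LK moves do not produce. Assumes len(tour) >= 4."""
        n = len(tour)
        i, j, k = sorted(random.sample(range(1, n), 3))
        return tour[:i] + tour[j:k] + tour[i:j] + tour[k:]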
Neural Clustering Approach
Previous research in this area focused on developing an algorithm based on neural
clustering. This algorithm used Adaptive Resonance Theory and Self Organizing Maps
to first cluster the data, find solutions for each cluster, and then merge the resulting subtours into a single tour. This algorithm will be briefly explained below.
Adaptive Resonance Theory (ART) was first introduced by Carpenter and Grossberg
[17]. The unsupervised variants provide a simple, effective neural clustering algorithm.
The original algorithm, designated ART1, used binary input sequences. The variation
developed for this solution uses two-input integer sequences where the integers are the
XY coordinates of a node. In this form, ART is functionally similar to k-means
clustering except that k is increased dynamically as new patterns are introduced.
In the basic ART algorithm, patterns are introduced to a simple two-layer network. Each
neuron in the first layer represents an input. Each neuron in the second layer represents a
pattern recognized by the network. As new patterns are considered, the network
calculates the distance between the new pattern and each of the known patterns. If the
distance is less than the vigilance factor for the network, then that pattern resonates with
the input. The nearest resonating pattern is chosen as the winner, and it is adjusted
towards the input pattern by some learning rate. If no patterns resonate, a new neuron is
created in the second layer corresponding to the new pattern. The size of the vigilance
factor chosen indirectly determines the number of patterns created for a given dataset.
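
A rough sketch of this style of clustering follows. It assumes Euclidean distance as the
resonance test; the parameter names (`vigilance`, `rate`) are illustrative rather than taken
from the implementation described here.

    import math

    def art_cluster(points, vigilance, rate=0.1):
        """ART-like clustering: a point joins the nearest existing prototype if it
        resonates (lies within the vigilance radius), otherwise it seeds a new one."""
        prototypes = []   # cluster centres, grown dynamically
        labels = []       # cluster index assigned to each point
        for x, y in points:
            best, best_d = None, float("inf")
            for idx, (px, py) in enumerate(prototypes):
                d = math.hypot(x - px, y - py)
                if d < best_d:
                    best, best_d = idx, d
            if best is not None and best_d < vigilance:
                px, py = prototypes[best]
                # move the winning prototype towards the input by the learning rate
                prototypes[best] = (px + rate * (x - px), py + rate * (y - py))
                labels.append(best)
            else:
                prototypes.append((x, y))   # no resonance: create a new category
                labels.append(len(prototypes) - 1)
        return prototypes, labels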
Self-Organizing Maps were first introduced by Kohonen [18]. In their original form,
they provide a neural technique for clustering data points scattered on a Euclidean
surface. Neurons are laid out in a grid on the surface and then shifted towards the
clusters. As each input point on the surface is considered, the closest neuron fires and its
weights are adjusted to move it towards the point based on some learning rate. In this
way, the neurons are drawn towards the nearest clusters of points. SOM differs from
ART in that it begins with a fixed number of neurons that are already associated with
points on the surface.
The main advantage of SOM for the TSP is the fact that the initial neurons can be laid out
in an arbitrary fashion. By choosing a number of neurons equal to the number of nodes
in a graph and setting them in a specific order, we can maintain a path between them that
gives us a final route through the nodes after the network is trained. We use a
modification of SOM that includes regeneration in order to adapt more effectively to the
distribution of the nodes in the graph [16].
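
The core update of a ring-structured SOM for the TSP can be sketched as below. This is a
simplified version without the inhibition and regeneration steps described in the next
paragraph, and the learning-rate and neighbourhood parameters are illustrative.

    import math
    import random

    def som_ring_epoch(cities, neurons, rate=0.5, radius=2):
        """One epoch: for each city, the nearest neuron on the ring wins and it,
        along with its ring neighbours, is pulled towards that city."""
        m = len(neurons)
        for cx, cy in random.sample(cities, len(cities)):   # present cities in random order
            winner = min(range(m),
                         key=lambda i: math.hypot(cx - neurons[i][0], cy - neurons[i][1]))
            for offset in range(-radius, radius + 1):
                idx = (winner + offset) % m                  # the neighbourhood wraps around the ring
                influence = rate * math.exp(-abs(offset))
                nx, ny = neurons[idx]
                neurons[idx] = (nx + influence * (cx - nx), ny + influence * (cy - ny))
        return neurons                                       # the ring order of the neurons induces the tour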
Our variation of SOM begins with the goal of matching one neuron to each node while
maintaining an efficient path through the graph. To prevent some neurons from
oscillating between nodes and others from remaining dormant, we inhibit a neuron after it
has fired once in a given epoch. If the neuron would have fired a second time, we instead
generate a new neuron at its location and shift it towards the new node. We insert the
new neuron in the existing path based on its relative location to the previous neuron. If a
neuron goes through an entire epoch without firing, it is removed. This process increases
the density of neurons in areas of high node density while maintaining a constant number
of neurons after each epoch. Since the new neurons are inserted into the path in the
correct position, path crossovers are generally prevented. This allows us to skip the
time-consuming crossover detection of previous variants in our algorithm.
Our main neural algorithm considered in this paper first uses the ART technique
described above to form clusters of cities. Within each cluster, the SOM algorithm
described above is used to form an initial tour. After initial tour formation, an iteration of
the original Lin-Kernighan algorithm is run to drive the tour to a local minimum. Tours
are then merged together using a simple closest match ring merge algorithm.
Evolutionary Algorithms
Evolutionary Algorithms attempt to use what we know about how genetics works in real
organisms in order to evolve solutions to problems using computers. The basic idea
of an EA is to start with a population of random solutions and then use certain operations
to change that population through generations, hopefully converging towards a solution.
Two primary operators are used in EAs: crossover and mutation. Both of these operators
contribute to either creating a new individual or changing an existing individual. At each
iteration of the algorithm, individuals are created and/or changed and a competition takes
place to see which individuals get to be part of the population for the next iteration.
Crossover is a method for creating a new individual. In general two or more parents are
chosen from the existing population and combined in some way to create a child. A
typical crossover implementation takes a certain number of the genes from one parent
and a certain number from the other. This form of recombination allows existing
solutions to contribute to the formation of new solutions.
Mutation is a method for changing an existing individual. In general, each member of an
existing population will have some small chance to be mutated. If mutation occurs, a
small change will be randomly applied to that individual. This technique allows an EA to
explore the problem space in directions not included in the initial population. The
combination of crossover and mutation provides a partially directed search that still has
the capability to eventually find any point in the problem space.
Previous State of the Art
EAs have been applied to the TSP before, but have been generally unsuccessful as
compared to state of the art heuristic algorithms. A good overview of the current state of
the art in TSP optimization is given in [21]. EAs receive barely a passing mention,
because they have not been competitive with this type of problem. [13] and [22] present
a good overview of straightforward implementations of genetic algorithms for the TSP.
While they do give decent results on small problem instances (less than 100 cities), the
times required to solve such instances are very large compared to other heuristic
approaches.
The most promising related research we considered, although it was discovered only after
our proposed algorithm had been implemented, is [19]. While not directly considering the TSP, Land
looked at using a combination of EAs and local search heuristics for a different type of
optimization problem. Since there is a known local search heuristic for the TSP that has
given good results for large problem instances, the Lin-Kernighan heuristic described
above, it seems likely that this combination will prove beneficial to the TSP as well.
A parallel implementation of a basic EA using Lin-Kernighan was also considered in
[23]. This algorithm did not use the LK optimization during the evolutionary process or
take advantage of the nature of the search space as described below. Still, it suggests that
parallel approaches to the TSP hold great promise.
Proposed Algorithm
Two algorithms are proposed in this paper, one that implements a straightforward EA and
one that may combine the EA approach with ART clustering to provide faster results.
The problem representation used is a simple permutation of the cities. This allows us to
use some basic operators to perform crossover and mutation.
Crossover is controlled by a crossover rate. When creating children, a random number is
chosen between 0 and 1. If the number is less than the crossover rate, two parents are
chosen, otherwise only one is chosen. Parents are chosen according to a uniform
distribution, with every member of the population having an equal probability of being selected. The crossover
operator behaves as follows. One parent is chosen to be the primary parent, and one the
secondary. A randomly selected segment of the primary parent is chosen to be fixed and
retained. The cities chosen are removed from the second parent, and the remaining cities
placed after the cities from the first parent in their relative order. This process allows a
sub-tour from the first parent to be maintained intact, and the relative ordering of the
cities in the second parent to be maintained.
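
A sketch of this operator on permutation-encoded tours (the names and the choice of cut
points are illustrative):

    import random

    def segment_crossover(primary, secondary):
        """Keep a randomly chosen segment of the primary parent intact, then append
        the remaining cities in the order they appear in the secondary parent."""
        n = len(primary)
        i, j = sorted(random.sample(range(n + 1), 2))   # segment boundaries
        kept = primary[i:j]
        kept_set = set(kept)
        rest = [city for city in secondary if city not in kept_set]
        return kept + rest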
Mutation is applied based on a mutation rate. When creating children, a random number
is chosen between 0 and 1. If the number is less than the mutation rate, then the mutation
operator is applied to that child. The mutation operator randomly selects 8 cities from the
tour. These 8 cities are then involved in a double bridge swap as discussed in the section
on heuristic algorithms. This modification helps ensure that the local search does not
repeatedly settle on the same tour.
The number of children is decided by a parameter set in the parameter file. A fixed
number of children are generated for each iteration of the EA. A competition is then held
to select the parents for the next generation. In this competition, the best n individuals
from the total pool of parents and children are chosen, where n is the original number of
parents.
Taking into account the extreme difficulty of the problem in general, some form of
Lamarckian learning is desirable [20]. Before the competition to select new parents, a
round of the original Lin-Kernighan optimization algorithm is run on each of the children
created this round. This brings them to a local optimum. In general, it keeps the
algorithm exploring a space of good solutions instead of flailing about randomly among
the nearly n! bad ones. After learning has taken place, the competition is applied.
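
Putting these pieces together, one generation of the proposed algorithm might look roughly
like the following sketch. Here `crossover` and `mutate` stand in for the operators described
above, `local_search` for the Lin-Kernighan pass, and `fitness` for tour length; all of these
names are ours, and the default rates correspond to the mutation-only configuration discussed
below.

    import random

    def next_generation(population, fitness, crossover, mutate, local_search,
                        n_children, crossover_rate=0.0, mutation_rate=1.0):
        """One EA iteration: create children, apply Lamarckian local search to each,
        then keep the best len(population) individuals from parents plus children."""
        children = []
        for _ in range(n_children):
            if random.random() < crossover_rate:
                p1, p2 = random.sample(population, 2)   # parents chosen uniformly
                child = crossover(p1, p2)
            else:
                child = list(random.choice(population))
            if random.random() < mutation_rate:
                child = mutate(child)                   # e.g. a double-bridge perturbation
            children.append(local_search(child))        # drive the child to a local optimum
        pool = population + children
        pool.sort(key=fitness)                          # shorter tour = better fitness
        return pool[:len(population)]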
As described in [19] for other types of optimization problems, the search space for TSP
may be viewed as a surface with many basins. Using the Lin-Kernighan algorithm for
local optimization allows us to drive a solution to the local optimum of a particular basin.
In order for our search to be efficient, mutation must be chosen such that a given
mutation does not just move the solution around within the current basin, but instead
jumps it to another basin. With the given optimization technique, basins may be defined
in terms of the double bridge formation as discussed above. By using a double bridge as
our mutation operator, we guarantee that each mutation moves the search to a different
basin. Viewed in this light, a variant of this algorithm with the mutation rate set to 1 and
the crossover rate set to 0 may be seen as a parallel implementation of the popular
Chained Lin-Kernighan variant described in [9], with the difference that instead of always
applying the mutations to the best-known tour, a population of tours is kept. Whether or
not this is advantageous has yet to be determined.
Experimental Setup
The experimental test-bed uses the routines developed previously for testing neural
network techniques. The problems are taken from a collection of well-tested, randomly
distributed cities described by x,y coordinate pairs. The distances are not stored, but
calculated on the fly. The tours are stored as permutations of the city list.
This code base allows the easy inclusion of Lin-Kernighan optimization and neural
techniques for comparison. A strict object-oriented approach is not followed, as the
test-bed is currently more of a quick-and-dirty solution intended to generate proof-of-concept
results than a final product. Further work may be done in this area to clean up
and optimize the code. It should be noted that the included implementation of the
Lin-Kernighan algorithm was coded from scratch using only the original paper on the subject.
Other optimizations have been published, and publicly available software with much faster
implementations can be found. A final version of this software would take advantage
of the latest advances in this area.
Results
Chained Lin-Kernighan (Concorde package) on a Sun Ultra 10

Problem Size    Tour Length     Time
100             7.43111E+06     0.24
1000            2.34148E+07     3.78
2000            3.26303E+07     8.36
8000            6.43398E+07     64.74
10000           7.20532E+07     100.54
20000           1.01468E+08     257.49
250000          3.58274E+08     5778.28

Clustered Divide and Conquer with SOM+ART+Lin-Kernighan on a PIII 933 MHz

Problem Size    Tour Length     Time    % off
100             8.20608E+06     0       10.43%
1000            2.77407E+07     3       18.47%
2000            3.86702E+07     5       18.51%
8000            7.66305E+07     63      19.10%
10000           8.61756E+07     68      19.60%
20000           1.20066E+08     189     18.33%
250000          4.28589E+08     2926    19.63%

EA + Lin-Kernighan on a PIII 933 MHz

Problem Size    Tour Length     Time    % off
100             7.40308E+06     78      -0.38%
1000            2.42913E+07     4235    3.74%
The results displayed here show a comparison between the three techniques discussed in this
paper: standard Chained Lin-Kernighan as represented by the Concorde implementation in [24],
the neuro-clustered divide-and-conquer approach developed in previous research in [25], and
the EA approach presented in this paper. Obviously the
EA approach did not scale as well as the other two. This is due to the fact that it was
running the LK algorithm on a population of size 100. In a parallel implementation, this
disadvantage may be minimized, because the Concorde package would not be able to
take direct advantage of this parallelism. It should be noted that, in contrast to the neural
approach, the EA actually found a better tour than the standard Concorde configuration
did for the 100 city problem. The Concorde program could be modified to run the
Chained Lin-Kernighan algorithm for a longer period of time in order to find this better
result, but standard comparisons are made versus the default configuration. On the 1000
city problem, the EA was a bit off, although still much closer than the divide and conquer
technique. This represents the inherent limitations that any divide and conquer approach
will have on tour quality.
The numbers shown here are exact for the first two algorithms, which are deterministic,
and are averages over 10 runs for the stochastic EA algorithm. The EA was run with a
population of 100, and 100 children for 100 generations. The mutation rate was set to 1
and the crossover rate set to 0. Raising the crossover rate or lowering the mutation rate only
led to worse results, so it was determined that mutation was the only operator with a positive effect.
Conclusions
EAs will probably never provide the fastest possible solutions to the TSP. We have
demonstrated, however, that in combination with local search techniques they are capable
of finding very good solutions. The primary advantage of the EA in a combinatorial
optimization environment may well prove to be as a model for parallel implementation.
The parallel implementation described in [23] may be seen as an example of the power of
using LK in conjunction with the inherently parallel nature of EAs.
In conclusion, we have demonstrated that a combination of EAs and local search
techniques can provide good solutions to the TSP. In fact, for the 100 city problem, the
solution found by our algorithm was better than that found by the chained LK toolbox at
[24] using default settings. The traditional chained LK algorithm always searches from
the best-known solution. It is possible that an EA approach that saves a population of
good solutions and makes improvement based on those may find the optimal solution
faster. A parallel version will need to be implemented to make a better comparison.
The proposed algorithm for using the EA within the context of ART clustering has not
yet been implemented. This will be an area for future research. Since the divide and
conquer concept of the ART implementation reduces the complexity of the problem,
while placing some limitation on the quality of the results, we expect to see these same
characteristics if it is implemented using the EA.
Bibliography
[1] T. Cormen, C. Leiserson, R. Rivest. Introduction to Algorithms, MIT Press, 1996,
954-960.
[2] S. Lucks. “How Traveling Salespersons Prove Their Identity”. 5th IMA conference
on cryptography and coding, Springer LNCS 142-149, obtainable through:
http://th.informatik.uni-mannheim.de/People/Lucks/papers.html, no date given.
[3] C.H. Papadimitriou. “The Euclidean Traveling Salesman Problem is NP-Complete”,
Theoretical Computer Science 4, 1977, 237-244.
[4] M.L. Braun, J.M. Buhmann. “The Noisy Euclidean Traveling Salesman Problem and
Learning”, Advances in Neural Information Processing Systems 14. MIT Press,
Cambridge, MA, 2002.
[5] E.M. Cochrane, J.M. Cochrane, “Exploring competition and co-operation for solving
the Euclidean Traveling Salesman Problem by using the Self–Organizing Map.”
Artificial Neural Networks, 7-10 September 1999. 180 – 185, Volume:1.
[6] S. Arora. "Polynomial time approximation schemes for Euclidean traveling
salesman and other geometric problems", Journal of the ACM V.45 N.5, 1998, 753-782.
[7] J. Gudmundsson and C. Levcopoulos. "Hardness result for TSP with neighborhoods",
Technical Report LU-CS-TR:2000-216, Department of Computer Science, Lund
University, Sweden, 2000.
[8] S. Lin, B. W. Kernighan, “An Effective Heuristic Algorithm for the Traveling
Salesman Problem”, Operations Research 21, 1973, 498-516.
[9] D. Applegate, W. Cook, and A. Rohe, “Chained Lin-Kernighan for large traveling
salesman problems” Technical report, 2000. http://www.keck.caam.rice.edu/reports/
chained lk.ps
[10] P. Moscato. “TSPBIB Home Page”
http://www.densis.fee.unicamp.br/~moscato/TSPBIB_home.html, 2002.
[11] N. Vishwanathan, D. Wunsch. “A Hybrid Approach to the TSP”, IEEE International
Joint Conference on Neural Networks, Washington, DC, July 2001.
[12] P. Prinetto, M. Rebaudengo, M. Sonza Reorda. “Hybrid Genetic Algorithms for the
Traveling Salesman Problem”. Artificial Neural Nets and Genetic Algorithms
Proceedings, Innsbruck, Austria, 1993.
[13] K. Bryant. “Genetic Algorithms and the Traveling Salesman Problem”. December
2000. Thesis, Harvey Mudd College, Dept. of Mathematics, 2000.
[14] M. Dorigo, L.M. Gambardella. “Ant Colonies for the Traveling Salesman Problem”.
BioSystems, 1997.
[15] O. Mayumi, O. Takashi. “A New Heuristic Method of the Traveling Salesman
Problem Based on Clustering and Stepwise Tour Refinements”. IPSJ SIGNotes –
ALgorithms No. 45. 1995.
[16] A. Plebe. “Self-Organizing Map Approaches to the Traveling Salesman Problem”.
LFTNC 2001.
[17] G. A. Carpenter and S. Grossberg. “The art of adaptive pattern recognition by a self-organizing neural network”. IEEE Computer, pages 47-88, March 1988.
[18] T. Kohonen. “The Self - Organizing Map”. Proceedings of the IEEE 78. pp 1464 –
1480. 1990.
[19] M. Land. “Evolutionary Algorithms with Local Search for Combinatorial
Optimization”. PhD Thesis, Computer Science & Engineering Dept. Univ. Calif. San
Diego, 1998.
[20] D. Ackley and M. Littman. “A Case for Lamarckian Evolution”. Artificial Life III.
Addison-Wesley, 1994, pp 487-509.
[21] G. Gutin and A. Punnen, eds. The Traveling Salesman Problem and Its Variations.
Kluwer Academic Publishers, Netherlands, 2002.
[22] P. Jog, J. Suh, D. Gucht. “Parallel Genetic Algorithms Applied to the Traveling
Salesman Problem”. SIAM Journal on Optimization, 1(4):515-529, 1991.
[23] R. Baraglia, J. Hidalgo, R. Perego. “A Parallel Hybrid Heuristic for the TSP”. EvoWorkshop 2001. LNCS 2037, pp 193-202, 2001.
[24] Concorde software. http://www.math.princeton.edu/tsp/concorde.html.
[25] S. Mulder and D. Wunsch. “Large-Scale Traveling Salesman Problem via Neural
Network Divide and Conquer”. ICONIP, 2002.