Genetic Operators for TSP
Chapter 8 in Michalewicz and Fogel, How to Solve It: Modern Heuristics, Springer, 2000

Size of TSP applications
Some examples:
  Circuit board drilling: 17,000 cities
  X-ray crystallography: 14,000 cities
  VLSI fabrication (assembly): 1.3 million cities
  Genome sequencing: several million cities
Instances with several hundred cities can be solved optimally using general-purpose optimization software.

Size of TSP applications (cont.)
Applegate et al. (2003) solved a one-million-city instance using the cutting plane method due to Dantzig et al. in 2 days (with a 0.3% gap) and in 308 days (with a 0.04% gap).
However, the algorithms they used are very difficult to understand and implement, and their computing facilities are highly sophisticated.
A simple-minded GA can solve instances with several thousand cities in under an hour (depending on the operators used and the solution quality obtained).

Evolutionary algorithms
The schema theorem and the building block hypothesis are valid when a GA uses:
  Binary coding as the representation scheme
  Crossover and mutation operators that are suitable for binary coding
Algorithms that are based on GA notions but use other forms of coding and operators are referred to as “evolutionary algorithms” (EAs).
Most applications for TSP (and other ordering problems) are EAs.

Representation schemes and genetic operators for TSP
The most natural representation of a TSP tour is a permutation of cities.
Binary coding of a TSP tour is possible but not suitable because of ordering dependencies.
Traditional crossover and mutation will result in illegal tours (duplicates and omissions) that require extensive repair.
So, we should develop coding schemes and genetic operators that are appropriate for the problem at hand.

Adjacency representation (AR)
AR encodes the tour as a list of n cities.
City j is in position i if and only if the tour leads from city i to city j.
For example, the tour 1-5-2-6-3-4-1 is represented by the list (5 6 4 1 2 3).
Each tour has one AR, but some lists
can represent illegal tours, e.g. the list (5 4 6 1 2 3) yields the cycle 1-5-2-4-1.
Advantage: Each gene represents an edge.

Crossover operators for AR: Alternating-edges crossover (AEC)
AEC starts with a random edge from one parent and proceeds by selecting edges from the two parents in an alternating manner, e.g.

               list             tour
Parent 1:      (5 6 4 1 2 3)    1-5-2-6-3-4-1
Parent 2:      (4 5 6 2 3 1)    3-6-1-4-2-5-3
Offspring 1:   (5 6 4 2 3 1)    1-5-3-4-2-6-1
Offspring 2:   (4 1 ?)

If a new edge from a parent causes a cycle, one of the remaining edges is selected at random.

Crossover operators for AR: Subtour-chunks crossover (SCC)
SCC starts by choosing a random-length subtour (or subpath) from one parent and proceeds by choosing subtours from the two parents in an alternating manner.
It extends the tour by choosing edges from alternating parents (such an edge defines the starting point of the next subtour).
Again, if an edge from a parent causes a cycle, one of the remaining edges is selected at random.
Note that SCC is a generalization of AEC.

Crossover operators for AR: Heuristic crossover (HC)
HC first chooses a random starting city.
It then compares the two edges emanating from this city in the two parents and selects the shorter edge.
The city on the other end of the selected edge becomes the starting city and the process is repeated.
Again, a random edge is introduced to avoid a cycle.

Crossover operators for AR: Heuristic crossover (cont.)
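The basic HC loop described above might look as follows in Python. This is only a sketch under stated assumptions: the function names, the 1-based city labels, the `dist` matrix built from made-up coordinates, and the choice to return the offspring in visiting order are all illustrative and are not taken from the slides.

```python
import math
import random

def tour_to_adjacency(tour):
    """Adjacency list: entry i-1 holds the city visited right after city i."""
    adjacency = [0] * len(tour)
    for pos, city in enumerate(tour):
        adjacency[city - 1] = tour[(pos + 1) % len(tour)]
    return adjacency

def heuristic_crossover(adj1, adj2, dist, rng):
    """Basic HC: follow the shorter parental edge, fall back to a random city."""
    n = len(adj1)
    current = rng.randint(1, n)          # random starting city
    child = [current]
    visited = {current}
    while len(child) < n:
        a, b = adj1[current - 1], adj2[current - 1]
        # Pick the shorter of the two edges leaving the current city.
        nxt = a if dist[current - 1][a - 1] <= dist[current - 1][b - 1] else b
        if nxt in visited:
            # The edge would close a cycle too early: choose a random unvisited city.
            nxt = rng.choice([c for c in range(1, n + 1) if c not in visited])
        child.append(nxt)
        visited.add(nxt)
        current = nxt
    return child                          # offspring in visiting order

# Hypothetical coordinates for six cities (assumption, for illustration only).
points = [(0, 0), (2, 0), (3, 1), (2, 2), (0, 2), (1, 1)]
dist = [[math.dist(p, q) for q in points] for p in points]

parent1 = tour_to_adjacency([1, 5, 2, 6, 3, 4])
parent2 = tour_to_adjacency([3, 6, 1, 4, 2, 5])
child = heuristic_crossover(parent1, parent2, dist, random.Random(0))
```

Passing a seeded `random.Random` instance keeps runs reproducible; the offspring is always a legal tour because every city is appended exactly once.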
A modification of HC uses two rules:
  If the shorter of the two edges causes a cycle, then try to take the other (longer) edge.
  If the longer edge also causes a cycle, then choose the shortest of q randomly selected edges.
This heuristic tries to combine short subpaths from the parents.
It may leave undesirable crossings of edges.

Mutation operator for AR: A heuristic based on 2-opt (HM)
HM can be used for local improvement (as a form of mutation).
It randomly selects two edges, {i, j} and {k, m}.
If dist(i, j) + dist(k, m) > dist(i, m) + dist(k, j), then the edges {i, j} and {k, m} are replaced by the edges {i, m} and {k, j}.

Performance of AR
Genes in AR represent edges that count for fitness.
It allows us to look for patterns in good solutions, e.g. (# # 4 # 2 #) denotes all tours that have the edges {3, 4} and {5, 2}.
However, results obtained with AR and its operators are not very good.
AEC disrupts good patterns.
SCC does better than AEC, possibly because its rate of disruption is lower.
Even with HC, the percent deviation from the optimum is 16-27% for problems with 50-200 cities.

Ordinal representation (OR)
OR encodes the tour as a list of n cities.
Given an ordered list C of cities as a reference point, the i-th element of a chromosome is in the range from 1 to n-i+1.
For example, given C=(1 2 3 4 5 6), the tour 1-2-5-3-6-4-1 is represented by the list (1 1 3 1 2 1).
Advantage: Traditional 1- and 2-point crossover operators generate legal tours.

Ordinal representation (cont.)
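As a rough illustration of the ordinal encoding, the following Python sketch converts between a tour and its ordinal list and applies a one-point crossover. The function names and the use of Python lists are assumptions made for this example.

```python
def tour_to_ordinal(tour, reference):
    """Encode a tour: each gene is the 1-based position of the city in the
    shrinking reference list."""
    remaining = list(reference)
    ordinal = []
    for city in tour:
        idx = remaining.index(city)
        ordinal.append(idx + 1)
        remaining.pop(idx)
    return ordinal

def ordinal_to_tour(ordinal, reference):
    """Decode an ordinal list by repeatedly removing the k-th remaining city."""
    remaining = list(reference)
    tour = []
    for k in ordinal:
        tour.append(remaining.pop(k - 1))
    return tour

def one_point_crossover(a, b, cut):
    """Traditional one-point crossover on two ordinal lists."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

C = [1, 2, 3, 4, 5, 6]
encoded = tour_to_ordinal([1, 2, 5, 3, 6, 4], C)   # the list (1 1 3 1 2 1)
child1, child2 = one_point_crossover([1, 4, 1, 3, 1, 1], [3, 5, 1, 2, 1, 1], 3)
tour1 = ordinal_to_tour(child1, C)
tour2 = ordinal_to_tour(child2, C)
```

Crossover cannot produce an illegal tour here because position i of either offspring still holds a value between 1 and n-i+1, which is all the decoder requires.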
For example, given C=(1 2 3 4 5 6):

               list                 tour
Parent 1:      (1 4 1 | 3 1 1)      1-5-2-6-3-4-1
Parent 2:      (3 5 1 | 2 1 1)      3-6-1-4-2-5-3
Offspring 1:   (1 4 1 | 2 1 1)      1-5-2-4-3-6-1
Offspring 2:   (3 5 1 | 3 1 1)      3-6-1-5-2-4-3

Note that the “head” subpaths are preserved, but the “tail” subpaths are disrupted essentially at random, resulting in poor performance.

Path representation (PR)
PR is the most natural representation of a tour.
For example, the tour 1-5-2-6-3-4-1 is represented by the list (1 5 2 6 3 4).
Traditional crossover operators generate illegal tours.
Edge information is carried not by a single gene but by a pair of genes, hence epistasis is strong.

Crossover operators for PR: Partially mapped crossover (PMX)
PMX takes a subpath between two random cut points from one parent and completes the tour by preserving the order and position of as many cities as possible from the other parent.
Illegal tours are avoided by using a mapping, e.g.

Parent 1:      (1 | 5 2 6 | 3 4)
Parent 2:      (3 | 6 1 4 | 2 5)
mapping:       5 <-> 6, 2 <-> 1, 6 <-> 4
Offspring 1:   (3 | 5 2 6 | 1 4)
Offspring 2:   (2 | 6 1 4 | 3 5)

Crossover operators for PR: Partially mapped crossover (cont.)
PMX preserves the “mid portion” (one subpath) from the first parent.
With the help of the mapping, it also tries to preserve the order and position of cities in the “side portions” (the other subpath) from the second parent.
Is it meaningful to preserve the absolute position of a city in the list, when in fact the list represents a tour? (Is the tour 1-5-2-6-3-4-1 not the same as the tour 6-3-4-1-5-2-6?)

Crossover operators for PR: Order crossover (OX)
OX also takes a subpath between two random cut points from one parent and completes the tour by preserving the relative order of as many cities as possible from the other parent.
Illegal tours are avoided by omission, e.g.

Parent 1:      (1 | 5 2 6 | 3 4)
Parent 2:      (3 | 6 1 4 | 2 5)
Offspring 1:   (4 | 5 2 6 | 3 1)    using 3-1-4
Offspring 2:   (2 | 6 1 4 | 3 5)    using 3-5-2

Crossover operators for PR: Variants of OX
Order-based crossover: OBC selects several random positions, and the order of cities in the selected positions in one parent is imposed on the corresponding cities in the other parent.
Position-based crossover: Instead of selecting one subpath to be copied, PBC randomly selects several cities for this purpose.
OBC and PBC are essentially equivalent.

Crossover operators for PR: Cycle crossover (CX)
Starting with city i from the first parent, CX takes the next city j from the second parent as the one in the same position as city i, and places j in the same position it has in the first parent.
CX also preserves absolute positions of cities.
When a cycle occurs, the remaining cities are filled in from the second parent, e.g.

Parent 1:      (1 5 2 6 3 4)
Parent 2:      (3 6 1 4 2 5)
Offspring 1:   (1 6 2 4 3 5)
Offspring 2:   (3 5 1 6 2 4)

Mutation operators for PR: Inversion
Inversion reverses the subpath between two randomly selected cut points.
For example, the list (1 | 5 2 6 | 3 4) becomes (1 | 6 2 5 | 3 4) after inversion.
Inversion guarantees a legal tour.
Experiments show that it outperforms a “cross and correct” operator.
Increasing the number of cut points decreases the performance.

Mutation operators for PR: Insertion, displacement and reciprocal exchange
Insertion selects a city and inserts it at a random position.
Displacement selects a subpath and inserts it at a random position.
Reciprocal exchange swaps two cities.

Edge preservation in TSP
The crossover operators discussed so far try to preserve the order and/or position of cities.
The basic building blocks of TSP that determine the fitness value are not the cities but the edges.
A good operator should extract as much edge information from the parents as possible and pass it to the offspring.
Experimental results show that OX, which preserves relative order of cities (if not the
edges), does 11% and 15% better than PMX and CX, respectively.

Edge preserving heuristic crossovers
These operators transfer about 60% of edges from parents.

Edge recombination (ER) operator
ER builds an offspring from the edges present in both parents.
It uses an edge list for this purpose, e.g. for parents (1 2 3 4 5 6 7 8 9) and (4 1 2 8 7 6 9 3 5) the edge list (derived from the two parents) is

city   edges to other cities
1      9 2 4
2      1 3 8
3      2 4 9 5
4      3 5 1
5      4 6 3
6      5 7 9
7      6 8
8      7 9 2
9      8 1 6 3

ER operator (cont.)
ER selects an initial city from one of the parents, preferably the one having the lowest degree in the edge list, which is city 7 in our example.
According to the edge list, city 7 can be connected to 6 or 8, both having a degree of three.
Breaking the tie randomly, suppose ER chooses 6.
City 6 can be connected to 5 with degree three or 9 with degree four, so ER chooses 5.
The offspring obtained at the end is (7 6 5 4 1 2 8 9 3), with only one non-parental edge, {3, 7}.

ER operator (cont.)
The reason behind giving priority to a low-degree city is to increase the chance of completing the tour by using the edges present in the parents.
ER resorts to a random choice of the next city only when an “edge failure” occurs, i.e. when the current city does not have a continuing edge since the cities it can be connected to are already visited.
This way, ER can transfer more than 95% of edges from parents to a single offspring.

Enhanced ER
In choosing the next city from the edge list, give priority to edges common to both parents, e.g.
  City 4: edges to other cities: 3 -5 1
  -5 means that edge {4, 5} is in both parents.
This is valid for cities having a degree of three, because degree four means no edge is common, and degree two means both edges are common.
In addition, the random choice of the next city in case of edge failure can be replaced by a better choice.

Matrix representation (MR)
A binary n x n matrix M represents ordering information, where mij=1 if and only if city i precedes city j, e.g.
the tour 1-5-2-6-3-4-1 is represented as

     1 2 3 4 5 6
1    0 1 1 1 1 1
2    0 0 1 1 0 1
3    0 0 0 1 0 0
4    0 0 0 0 0 0
5    0 1 1 1 0 1
6    0 0 1 1 0 0

Not very practical due to memory requirements.

Other possibilities
EAs based on divide and conquer strategies and bisection methods
Operators based on conventional heuristics
Edge exchange crossover (EEX)
Subtour exchange crossover (SXX)
Edge assembly crossover (EAX): to be discussed in detail
Inver-over operator
Incorporating local search (local improvement)

Crossover with conventional heuristics
Nearest neighbor crossover (NNX) and Greedy crossover (GX)
Applied to the union graph of parental edges to generate one offspring
Deterministic choice of edges from the union graph
2-opt based improvement of offspring
Combined use of NNX (95%) and GX (5%)
0.3% deviation from optimal for n<250

Inver-over operator
Combination of inversion (mutation) and crossover
Each individual competes only with its offspring.
Parallel hillclimbing where each hillclimber performs a variable number of edge swaps in the spirit of the Lin-Kernighan algorithm
High selection pressure
Adaptive: (1) the number of inversions applied to a single individual and (2) the segment to be inverted are determined by another randomly selected individual (second parent).

Inver-over algorithm
[Figure: listing of the inver-over algorithm]

Inver-over example
Given S’=(2 3 9 4 1 5 8 6 7) and c=3:
If rand()<p, then c’=8 is selected from S’ and S’=(2 3 8 5 1 4 9 6 7) after random inversion.
If rand()>p, then a second individual, say (1 6 4 3 5 7 9 2 8), is selected, c’=5 is the city next to c=3 in it, and S’=(2 3 5 1 4 9 8 6 7) after guided inversion, where the edge {3, 5} is introduced from the second parent.
This is repeated until the selected c’ is next to c in the first individual, then S’ is evaluated.

Inver-over performance
Percent deviation from the Held-Karp lower bound is < 0.63% for a 442-city instance and < 3% for a 10,000-city instance (better than most EAs).
Solves 105, 442 and 2392-city instances in 3-4 seconds, < 3 minutes and < 90 minutes, respectively (faster than most
EAs).
Has only three parameters to set: population size, probability p of generating a random inversion (as opposed to a guided inversion), and number of iterations until termination.

Incorporating local search
It is possible to improve individuals produced in the initial population and by recombination.
Conventional deterministic heuristics such as 2-opt, 3-opt, and Lin-Kernighan can be used for this purpose.
Applying these to offspring produced by crossover constitutes a form of mutation (Lamarckian evolution).
Some EAs with local search are found to perform better than pure local search with multiple starts.
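
To make the local-improvement step concrete, here is a minimal 2-opt sketch in Python that could be applied to an individual in path representation. It is an illustration under stated assumptions rather than the procedure used in any of the cited experiments: the function names, the 0-based city labels, the made-up coordinates, and the first-improvement sweep order are all choices made for this example. With cities taken in visiting order (i followed by j, k followed by m), the reconnection that keeps a single tour is {i, k} and {j, m}, which is carried out by reversing the subpath from j to k.

```python
import math

def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[p]][tour[(p + 1) % n]] for p in range(n))

def two_opt(tour, dist):
    """Repeat improving 2-opt moves until no pair of edges can be improved."""
    tour = list(tour)
    n = len(tour)
    found = True
    while found:
        found = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue              # these two edges share a city
                i, j = tour[a], tour[a + 1]
                k, m = tour[b], tour[(b + 1) % n]
                # Gain from replacing {i, j}, {k, m} by {i, k}, {j, m}.
                delta = dist[i][k] + dist[j][m] - dist[i][j] - dist[k][m]
                if delta < -1e-12:
                    # Reversing the subpath j..k performs the edge exchange.
                    tour[a + 1:b + 1] = tour[a + 1:b + 1][::-1]
                    found = True
    return tour

# Hypothetical instance: corners of a unit square visited in a crossing order.
points = [(0, 0), (1, 1), (1, 0), (0, 1)]
dist = [[math.dist(p, q) for q in points] for p in points]
start = [0, 1, 2, 3]
better = two_opt(start, dist)
```

In a Lamarckian scheme the improved tour returned by `two_opt` would replace the offspring in the population, so the locally improved genes are inherited.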