John E. Nawn
MAT 5900: Monte Carlo Methods
Drs. J. Frey and K. Volpert
22 February 2011

Genetic Algorithms: Survey and Applications

Few ideas have dominated science in the last two centuries more than Darwinian evolution. Charles Darwin promoted a theory wherein species accumulate gradual changes: beneficial changes allow species to better adapt to their environment. When enough changes accrue throughout successive generations, new species arise and populate new environments. Darwin lacked knowledge of genes, the biochemical entities that encode the information expressed by an organism (Dawkins 43). Gregor Mendel discovered genes after Darwin's On the Origin of Species, and his research profoundly influenced evolutionary thought in the twentieth century. Properly understood, modern science adheres to neo-Darwinism, a marriage of Mendelian genetics and Darwinian evolution. In this theory, individuals inherit genes from their progenitors; this recombination, together with the potential for mutation, allows for greater diversity among the offspring and for novel responses to environmental selection. While many applications of evolution spur research in economics, politics, psychology and various other fields, mathematics has appropriated evolution to develop a system for optimizing solutions to a variety of problems, termed genetic algorithms (Reeves and Rowe 1).

While much research has centered on genetic algorithms in the past three decades, this paper relies on the text Genetic Algorithms: Principles and Perspectives: A Guide to GA Theory. This work surveys the general approaches underlying genetic algorithms and illuminates a broad variety of the issues surrounding them. While other texts, listed in the bibliography, serve as references, this paper will quote Genetic Algorithms almost exclusively. Readers interested in other materials related to GA theory should consult the bibliography in Genetic Algorithms, especially Bethke's Genetic algorithms as functional optimizers [20], Culberson's "On the futility of a blind search: an algorithmic view of 'No Free Lunch'" [42] and Holland's pioneering Adaptations in Natural and Artificial Systems [124]. This paper will discuss broadly the approaches and techniques of the model genetic algorithm (GA), possible approaches to optimization and a specific application of the genetic algorithm to the Traveling Salesman Problem (TSP), complete with appendices providing full trial results. Understanding the components of a genetic algorithm in relation to a particular problem yields a better sense of the power and scope of the method.

As interest in genetics and evolution surged in the latter portion of the twentieth century and culminated in neo-Darwinism, many realized the impact these fields would have across the disciplines. In 1975, John Holland introduced genetics into heuristic methodology when he published his Adaptations in Natural and Artificial Systems. He argued that applying the concepts of recombination and natural selection would allow for adaptive approaches to optimization that outperform simple stochastic searches (107). Holland, conversant in the language of genetics, used its terminology and ideology to describe the construction and execution of what he termed a genetic algorithm. Holland first described an initial population composed of individual solutions or approaches, which he called "genes" or "chromosomes" (25 – 8). These genes serve to seed the genetic algorithm; they represent known or randomly selected approaches to the given problem.
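To make the seeding step concrete, the short R sketch below mirrors the initialization used in Appendix Eight: each gene is a random permutation of the cities, and its fitness is the length of the closed trip it describes. The helper function triplength and the use of lower.tri to symmetrize the map are conveniences of this sketch, not the appendix's exact code.

# Sketch: seed a GA population for the TSP and evaluate fitness (trip length)
cities <- 10                                   # number of cities
n      <- 16                                   # genes (trips) per generation
maxdis <- 1000                                 # longest possible intercity distance

# Random symmetric distance matrix with zero diagonal (the "map")
distance <- matrix(sample(1:maxdis, cities * cities, replace = TRUE), cities, cities)
distance[lower.tri(distance)] <- t(distance)[lower.tri(distance)]
diag(distance) <- 0

# Fitness: total length of the closed trip, including the return leg
triplength <- function(trip, distance) {
  legs <- cbind(trip, c(trip[-1], trip[1]))    # consecutive (from, to) city pairs
  sum(distance[legs])
}

trips    <- t(replicate(n, sample(cities)))    # n random trips, one per row
triplens <- apply(trips, 1, triplength, distance = distance)
trips[which.min(triplens), ]                   # fittest gene in the seed population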
The genetic algorithm then applies these potential solutions to the problem and attempts to determine which ones provide better solutions. Genetic algorithms evaluate potential solutions to a problem in terms of their "fitness," their ability to provide a better approach than other contenders do. The genetic algorithm thrives, Holland notes, because the fitness function swiftly eliminates inefficient or unwieldy solutions in favor of adaptive solutions that optimize the given constraint, whether cost, distance, time or another variable of interest (31). However, the solutions themselves change throughout successive "generations," described as repeated applications of the solutions to the problem. The genes undergo random mutation to yield novel solutions that the fitness function then tests; this variety grants the algorithm its power over stochastic searches. Not only does the algorithm determine the best current solution, it also breeds new approaches that may yield better solutions.

These genetic changes occur in two distinct ways, termed "mutations" and "cross-overs." Mutations transpire within the gene itself; through random selection, the algorithm applies a screen to a current solution in such a way as to produce a new, non-lethal solution. The algorithm considers lethal any mutation or cross-over that, when applied to the problem, yields an incomplete or nonsensical outcome. Many sorts of mutation screens exist: most often they involve transposing two elements within a gene (solution) such that a new solution arises (44 – 5). In this way, a mutation results that preserves functionality. Unfortunately, the algorithmic method for approaching mutations differs from the biological phenomenon. In an organism, a variety of mutations may occur within a gene that need not preserve the total information of the original gene. For instance, the mistranslation of a single base pair that creates a novel protein need not be accompanied by a similar mistranslation elsewhere in the gene. However, because such symmetry maintains the mathematical model, all mutations in the genetic algorithm must preserve the total solution. Other mutations exist, such as restructuring portions of the gene, inverting the gene or permuting the elements (43, 45, 105); this paper, however, assumes point mutations as the norm.
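A point mutation of this kind is simple to sketch in R. The version below follows the mutation screen used throughout the appendix code (a binomial draw decides which trips mutate, and a mutated trip has two randomly chosen positions exchanged), wrapped in a hypothetical helper mutatetrips for readability.

# Sketch: swap (point) mutation that always yields a valid trip
mutatetrips <- function(trips, ch = 0.5) {
  mut <- rbinom(nrow(trips), 1, ch)          # 1 = mutate this trip, 0 = leave it alone
  for (k in which(mut == 1)) {
    swap <- sample(ncol(trips), 2)           # two positions to transpose
    trips[k, swap] <- trips[k, rev(swap)]    # exchange the two cities
  }
  trips
}

muttrips <- mutatetrips(trips)               # 'trips' as seeded in the earlier sketch

Because a swap merely exchanges two cities that were already present, every mutant remains a complete trip and no repair step is needed.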
The second type of genetic change involves cross-overs between multiple current solutions. A cross-over entails the rearrangement of the information contained in two genes such that some information passes between them. This mechanism accounts for most of the variety observed in nature and allows successful partial solutions to gather into one gene, as this paper will describe later. Biological cross-overs, much like biological mutations, suffer fewer limitations than the mathematical cross-overs the genetic algorithm employs (38 – 9). Again, the cross-overs the algorithm generates must preserve the information encoded in the gene (solution); hence, most cross-overs use a "mask" through which they trade information. These masks trade a certain portion of information and then assess whether a viable solution has emerged. If the mask generates no valid solutions, the algorithm randomly mutates the gene using the mutations described previously, uses another round of masking to create different solutions, or rejects the new solutions in favor of the old ones. This process continues until the algorithm obtains new, viable solutions that possess all the necessary information from the parent solutions (43). This paper surveys an algorithm for the TSP that employs a cross-over mask that selects two points on the progenitor solutions and swaps all information between these two points; this process repeats until viable solutions remain. The algorithm also uses a cross-over-and-mutation approach, meaning that each generation involves mutations within the genes and cross-overs between genes. Reeves and Rowe debate the general worth of this approach, suggesting that it might create the most successful diversity among generations (44). However, certain optimal solutions may arise from a mutation and disappear due to a cross-over within one generation, so they advise caution in using this approach. For the purposes of this paper, the computational power far surpasses the number of possible solutions and so conceivably captures all possible solutions eventually. In a larger analysis, the algorithm might employ cross-over-or-mutation depending on the success of past rounds of mutation and cross-over respectively. This approach bears some resemblance to selective breeding, where cross-over determines genetic diversity; fitter animals breed with fitter animals while aberrant mutants are removed.

This paper focuses exclusively on the application of the genetic algorithm to the traveling salesman problem (TSP) and on the creation of an optimal or shortest trip among the given cities. The TSP arose in the twentieth century and served as one of the first applications of Monte Carlo methodology in predicting best outcomes. The genetic algorithm, while only one of the many algorithms used to approach the TSP, has shown itself to be an excellent means of addressing the complexity and problems of the TSP. The TSP essentially takes a "map" and attempts to predict the shortest or least costly "trip" through the t cities on the map. While the best possible trip becomes more difficult to determine as the number of cities increases, the number of trips remains predictable. Richard Brualdi shows this number to be the number of invertible, circular permutations of t objects (Brualdi 39 – 41; Table 1.1); namely, (t − 1)!/2. Thus, the trip through ten cities having the order 3 9 8 2 1 6 5 4 7 10 is identical in the TSP to the trip having the order 1 2 8 9 3 10 7 4 5 6 (cf. Table 5.2). Each trip generates a specific distance or length that serves to indicate, in the genetic algorithm, the "fitness" of that trip: the shorter the length, the fitter the trip. While trips among few cities are easy to calculate and help ascertain the accuracy of the genetic algorithm (see Table 2.3), longer trips become more difficult to calculate or even categorize, hence the introduction of the algorithm.
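Both the counting formula and the equivalence of rotated or reversed trips can be checked directly in R; the helpers tripclasses and canonical below are illustrative names for this sketch and do not appear in the appendix code.

# Sketch: number of distinct trip classes and equivalence of two orderings
tripclasses <- function(t) factorial(t - 1) / 2
tripclasses(10)                              # 181440 distinct ten-city trips

# Canonical form of a trip: rotate so city 1 comes first, then take the
# lexicographically smaller of the forward and reversed orderings
canonical <- function(trip) {
  rot <- function(x) { i <- which(x == 1); c(x[i:length(x)], x[seq_len(i - 1)]) }
  a <- rot(trip); b <- rot(rev(trip))
  if (paste(a, collapse = " ") <= paste(b, collapse = " ")) a else b
}

identical(canonical(c(3, 9, 8, 2, 1, 6, 5, 4, 7, 10)),
          canonical(c(1, 2, 8, 9, 3, 10, 7, 4, 5, 6)))   # TRUE: the same trip class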
The first algorithm used in this paper, available in Appendix Eight, generates a random matrix of distances among the cities (stored in the code as "distance"), assumed to be the simplest mapping between them. The code then generates n genes, or trips, that seed the initial population; here, n = 16. As Reeves and Rowe demonstrate, the value necessary for n grows as the gene length grows (here, the length is the number of cities, which ranges from four to ten; cf. Appendices Two – Five); for the populations produced in this paper, n = 16 suffices to create enough initial diversity (Reeves and Rowe 28). These trips, stored in the matrix "trips," then undergo mutation in the original generation to provide a new generation of mature solutions. The mutation is a simple screen that assigns a one (1) or zero (0) to each of the trips. If the trip receives a one, it undergoes a point mutation at a random position along its gene; a zero means the trip passes unaltered. The algorithm stores the best solution between the mutant trips and original trips (within the "muttrips" and "trips" matrices, respectively) in the "genshortesttrip" matrix while also storing the corresponding length in "genshortestlength." Here one easily observes the genetic algorithm in action: from the randomly created trips and mutant trips, the fitness function, the length of each trip, selects the best trip among the current candidates.

The generational code then repeats this process by introducing cross-overs to the newly mutated adult trips, now stored as the "trips" of the next generation. This code randomly pairs the trips until all are paired (hence the need for an even number of trips per generation) and then swaps a portion of the genes between the members of each pair. It rejects any incomplete or repetitive solutions and repeats this process until complete new trips result, each visiting every city exactly once: the code "while(length(unique(sample[t,])) < cities)" rejects the invalid solutions. Originally, the code kept the incomplete solutions and tried to continue swapping portions of the trips until a complete solution arose. However, occasionally no possible cross-overs exist, and so the code stalled while attempting to generate appropriate solutions. The new code rejects incomplete solutions and tries again: "sample = cosample" restores the old "trips" generated at the conclusion of the last generation and forces the cross-over process to begin again. These trips then become the "trips" of that generation and undergo the process described above. Each generation entails cross-overs, mutations and comparisons.
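The cross-over step can be condensed into the following R sketch; the hypothetical helper crossovertrips keeps the essential behavior of the Appendix Eight code (random pairing, a two-point segment swap, and rejection with retry whenever an offspring repeats a city) without reproducing it line for line.

# Sketch: two-point segment-swap cross-over with rejection of invalid offspring
crossovertrips <- function(trips) {
  n <- nrow(trips); cities <- ncol(trips)
  pairs <- matrix(sample(n), ncol = 2)           # random pairing of the trips
  newtrips <- trips
  for (p in seq_len(nrow(pairs))) {
    a <- pairs[p, 1]; b <- pairs[p, 2]
    repeat {
      cut <- sort(sample(cities, 2))             # two cross-over points
      seg <- cut[1]:cut[2]
      childa <- trips[a, ]; childb <- trips[b, ]
      childa[seg] <- trips[b, seg]               # swap the segment between parents
      childb[seg] <- trips[a, seg]
      # accept only offspring that are still complete trips (no repeated cities)
      if (length(unique(childa)) == cities && length(unique(childb)) == cities) break
    }
    newtrips[a, ] <- childa
    newtrips[b, ] <- childb
  }
  newtrips
}

trips <- crossovertrips(muttrips)                # the next generation's trips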
Finally, the algorithm compares all of the best trips in each generation, stored in the matrix "genshortesttrip," and selects the "trueshortesttrip" and "trueshortestlength" among the generations, as described in Appendix Eight. However, this approach allows the algorithm to restore previous solutions that may have mutated away in later generations. If a solution in generation four provided the "trueshortesttrip," the algorithm should either be powerless to restore it or should eliminate all less fit mutations of that solution. Hence, new code was created to select the true best trip, which involved running multiple trials. This code uses an approach called "threshold" selection; it asserts that, although a single best solution exists, the algorithm need only select genes that surpass some limit or threshold. The code continues to cycle so long as no trip has yet appeared that satisfies the threshold condition. Appendices Nine and Ten use the loop "while (mindis > threshold)", which provides a best solution regardless of the number of generations required to produce it. The number of generations required to produce a "mindis" trip length that surpasses the "threshold" is stored as "trialcount" (Appendix Ten).

The average count is low for higher thresholds (Tables 3.2, 4.2, 5.2) because numerous solutions suffice to optimize the TSP. Figure 5.1 exemplifies the range of potential solutions for a higher threshold selection; the exponential shape of the distribution agrees with the assumption that the algorithm selects the first solution that passes the threshold, not the absolute minimum. As the threshold decreases, the number of fit solutions likewise decreases, and so the average number of generations necessary to achieve a fit solution increases in inverse proportion to the number of fit solutions remaining; that is, inversely exponentially, as Appendix Six demonstrates (Figure 6.1). When the number of cities visited is small, the increase in the number of generations necessary to achieve a fit solution is barely noticeable (cf. Figure 6.1, "Graph of Threshold vs. Average Count Four City TSP"). As the number of cities visited increases, the inversely exponential relationship between the threshold and the average count, and thus between the average count and the number of fit solutions, becomes more pronounced (cf. Figure 6.3, "Graph of Threshold vs. Average Count Eight City TSP").

Finally, the rate of mutation plays a large role in the number of generations required to produce an optimal solution. In nature, mutation happens occasionally among single-celled organisms and rarely among higher multicellular organisms. However, this algorithm assumed a mutation rate, "ch," of one-half (0.5); this means that mutation was as likely as not to occur for any given gene. This number was chosen to create the most diversity and to produce fit solutions quickly. Appendix Seven demonstrates the effect the mutation rate has on the number of generations required to achieve a fit solution. The average number of generations as a function of the mutation rate exhibits a symmetrical, parabolic character: this arises from the symmetry of the binomial distribution used to create the mutation screen "mut = rbinom(n,1,ch)." The mutation screen 1000111011011101, which mutates trips one, five, six, seven, nine, ten, twelve, thirteen, fourteen and sixteen, generates as much potential diversity as 0111000100100010, its opposite screen. This is because, at a high mutation rate, the few unaltered trips play the role that the few mutated trips play when the rate is low. A low rate of mutation prevents new solutions from emerging and so slows the algorithm's search; a high rate of mutation erases potentially beneficial solutions and so outpaces the algorithm's ability to determine whether a generation produced a fit solution. Figure 7.1 demonstrates this parabolic relationship between the average number of generations and the mutation rate.

Some possible improvements to the algorithms provided in the appendices include the introduction of penalties (52), multiple objective programming (53) and a more rigorous building-block approach (71 – 3). Penalties occur most often in real-world applications, where certain conceptual solutions yield impossible results. For example, a real application of the TSP in which one city lies across a river would require a partial solution that crosses a bridge between two cities. Penalizing any solution that crosses the river anywhere but at that bridge should not involve setting artificially high costs in the initial conditions. Doing so prevents the solution from evolving properly by prohibiting potentially better solutions from evolving out of this aberrant solution (52). Instead, the algorithm "flags" the solution and determines whether successive mutations remove the aberrant partial solution while retaining the rest of the solution. If not, the fitness function, modified to include a penalty, selects other, better solutions (52).
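One way to express such a penalty is as a modified fitness function. The sketch below is purely illustrative: the matrix forbidden, which marks legs that cross the river away from the bridge, and the size of the penalty are assumptions of this sketch and appear nowhere in the appendix code; the objects distance and trips come from the earlier sketches.

# Sketch: penalized fitness that discourages, rather than forbids, bad river crossings
penalizedlength <- function(trip, distance, forbidden, penalty = 500) {
  legs <- cbind(trip, c(trip[-1], trip[1]))        # consecutive (from, to) city pairs
  sum(distance[legs]) + penalty * sum(forbidden[legs])
}

# Hypothetical example: pretend the direct legs 2-7 and 3-7 cross the river off the bridge
forbidden <- matrix(0, nrow(distance), ncol(distance))
forbidden[2, 7] <- forbidden[7, 2] <- 1
forbidden[3, 7] <- forbidden[7, 3] <- 1
penalizedlength(trips[1, ], distance, forbidden)   # flagged trips score worse but can still evolve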
Multiple objective programming involves optimizing multiple constraints within a single algorithm; for instance, time and cost. The original generational TSP R code contained the potential for multiple objectives: a budget function was written such that the cost of the best trip was produced as well. An improved algorithm would allow a best solution to minimize either distance travelled or cost incurred while applying the fitness function to one, the other or both initial conditions (53); a brief sketch of such a combined fitness function appears just before the appendices. This approach causes multiple best solutions to emerge depending on the selection criteria, thus allowing individuals to use the algorithm in a broader context (100).

The reader can arrive at a simple understanding of the building-block approach by observing that in Table 5.2, all solutions involve the subtrip 9 – 10 or 10 – 9 (identical, as described above). A quick glance at Table 5.1 reveals why the algorithm preserved this partial solution: the distance between cities 9 and 10 is a mere three (3) miles. The building-block hypothesis suggests that extremely fit partial solutions will arise more frequently because they contribute to generally fitter solutions (71 – 2). Thus, an addition to the algorithm that explicitly ranked and conserved the best partial solutions would have directed the algorithm to become increasingly selective as the generations passed. At low levels of complexity, the algorithm often conserves these partial solutions (Table 5.2); at high levels of complexity, however, competing partial solutions often develop and give rise to local optima (123) that fail to direct the search towards the absolute optimum. The building-block approach would therefore determine the fitness of partial solutions as well, adding another layer of selection pressure to the fitness function. Other improvements include producing crisper, quicker code and allowing multiple mutations within a gene. In addition, the code treated equivalent circular permutations as distinct trips (e.g. Tables 4.2 and 5.2); treating classes of trips, as described above, rather than individual trips might have produced quicker code.

The traveling salesman problem barely begins to exhaust the range and force of the genetic algorithm; hopefully, the reader now has a greater appreciation for the immense potential of genetic algorithms. Many disciplines use the genetic algorithm to rate performance, predict likely outcomes, maximize profits and generally solve a variety of problems. Their ability to actively select the best solutions, to breed new solutions and to provide optimal solutions while exploiting randomness makes them ideal for determining solutions to complex and challenging problems in computer science, economics, nanotechnology and genetics itself. Reeves and Rowe, with good reason, conclude their text with a summary of future research directions. They predict increased effort to refine the theory behind genetic algorithms (274 – 8) and greater application of these algorithms across the disciplines. This algorithm, born of Monte Carlo methodology, provides a powerful means of optimizing solutions to complex problems.
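As a closing illustration of the multiple-objective extension discussed above, the sketch below combines distance with a second, hypothetical cost objective through a simple weighted sum; the cost matrix and the weight w are assumptions for illustration only and do not correspond to the budget function in the original code. The objects cities, distance and trips come from the earlier sketches.

# Sketch: weighted two-objective fitness (distance and cost) over the same trips
cost <- matrix(sample(1:100, cities * cities, replace = TRUE), cities, cities)
cost[lower.tri(cost)] <- t(cost)[lower.tri(cost)]  # symmetric cost "map"
diag(cost) <- 0

combinedfitness <- function(trip, distance, cost, w = 0.5) {
  legs <- cbind(trip, c(trip[-1], trip[1]))        # consecutive (from, to) city pairs
  w * sum(distance[legs]) + (1 - w) * sum(cost[legs])
}

# w = 1 recovers the pure-distance fitness; w = 0 optimizes cost alone
combinedfitness(trips[1, ], distance, cost, w = 0.5)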
Appendix One – Initial Conditions for all Traveling Salesman Problems

Table 1.1 – Initial Conditions
Maximum Distance              1000
Genes Per Generation          16
Mutation Probability          0.5
Trials                        1000
Trip Classes for t Cities     (t − 1)!/2

Appendix Two – Four City Traveling Salesman Problem (TSP)

Table 2.1 – Initial Distances
Cities      1      2      3      4
  1         0    537    589    164
  2       537      0    481    241
  3       589    481      0    242
  4       164    241    242      0

Table 2.2 – Observed Count and Trips
Threshold        1500    1600    1700
Average Count       1       1       1
Trip Length      1424    1424    1424
Best Trip        1432    4123    2341

Table 2.3 – Possible Classes of Trips
Possible Class      1       2       3
Sample Trip      1234    1324    1342
Length           1424    1475    1609

Appendix Three – Six City TSP

Table 3.1 – Initial Distances
Cities      1      2      3      4      5      6
  1         0    545    857     79    251    594
  2       545      0     71    857    948    837
  3       857     71      0    235    588     23
  4        79    857    235      0    633    630
  5       251    948    588    633      0    697
  6       594    837     23    630    697      0

Table 3.2 – Observed Count and Trips
Threshold          2000     2100     2200     2300     2400
Average Count     6.667    1.905    1.268    1.267    1.119
Shortest Length    1978     1978     1978     1978     1978
Best Trip        241563   241563   651423   514236   423651

Threshold          2500     2600     2700     2800
Average Count     1.017    1.007    1.002        1
Shortest Length    1978     1978     1978     1978
Best Trip        415632   514236   514236   365142

Appendix Four – Eight City TSP

Table 4.1 – Initial Distances
Cities      1      2      3      4      5      6      7      8
  1         0    482    173    328    813     66    711    153
  2       482      0    737    517    941    865    244    638
  3       173    737      0    858    358     93    701    944
  4       328    517    858      0    988    352    647    126
  5       813    941    358    988      0     49    402    698
  6        66    865     93    352     49      0    454    879
  7       711    244    701    647    402    454      0    924
  8       153    638    944    126    698    879    924      0

Table 4.2 – Observed Counts and Trips
Threshold             1800       2000       2200       2400
Average Count      305.386     174.77     79.598     30.143
Shortest Length       1757       1757       1757       1757
Best Trip         42756318   31842756   24813657   63184275

Threshold             2600       2800       3000       3200
Average Count       14.609      5.647      2.567      1.516
Shortest Length       1757       1757       1757       1757
Best Trip         24813657   31842756   18427563   48136572

Appendix Five – Ten City TSP

Table 5.1 – Initial Distances
Cities      1      2      3      4      5      6      7      8      9     10
  1         0     38    556    573    292    291    529    771    496    640
  2        38      0     61    149    326    788    742    540    306    700
  3       556     61      0    973    259    157    964    431    191    386
  4       573    149    973      0    255    230      7    407    680    985
  5       292    326    259    255      0    251    279    934     88    619
  6       291    788    157    230    251      0     53    311    389    376
  7       529    742    964      7    279     53      0    708    735    882
  8       771    540    431    407    934    311    708      0    537    348
  9       496    306    191    680     88    389    735    537      0      3
 10       640    700    386    985    619    376    882    348      3      0

Table 5.2 – Observed Counts and Trips
Threshold               2000          2250          2500          2750          3000
Average Count        245.342        95.974        30.617        11.729         4.355
Shortest Length         1555          1555          1555          1651          1555
Best Trip        16748109532   53216748109   53216748109   93215674810   10953216748

Threshold               3250          3500          3750          4000
Average Count           2.02         1.184         1.036         1.001
Shortest Length         1741          1713          1741          1454
Best Trip        81095321476   74238109516   23591086741   47632159108

Figure 5.1 – Sample Distribution of Minimum Solutions for Ten Cities and 3000 Threshold (1000 Trials)

Appendix Six – Comparison of Threshold Counts

Figure 6.1 – Threshold Comparisons

Appendix Seven – Comparison of Mutation Counts

Table 7.1 – Mutation Rate vs.
Average Count for Ten Cities, Threshold of 3000 Mutation rate, ch Average Count 0.10 60.342 0.25 24.752 0.40 7.893 0.50 4.355 0.60 8.014 0.75 25.067 0.90 63.465 Figure 7.1 – Mutation Count Comparison Nawn 18 Appendix Eight – R Code for Original Generations TSP, Single Run ############################################################# # Final TSP Solution with Mutation and Crossing Over # ############################################################# # Initital Parameters cities = 10 max = 1000 n = 16 gen = 20 ch = 0.5 # Number of Cities # Longest Possible Distance between Cities # Number of Trips to Attempt # Number of Generations # Chance of Mutation Occurring within a Trip # Establishing the Map distance = matrix(sample(1:max,(cities*cities),replace='T'),cities,cities) for(i in 1:cities){ distance[,i] = distance[i,] distance[i,i] = 0 } # Generational Housekeeping genshortestlength = double(gen) genshortesttrip = matrix(double(gen*cities),gen,cities) trueshortestlength = 0 trueshortesttrip = double(cities) length = double(n) mutlength = double(n) trips = matrix(double(n*cities),n,cities) muttrips = matrix(double(n*cities),n,cities) absshortesttrip = double(cities) for (k in 1:n){ # This code seeds the initial trips. order = sample(1:cities,cities,replace='F') dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[order[j],order[j+1]] dis[cities] = distance[order[cities],order[1]]} length[k] = sum(dis) trips[k,] = order} shortestlength = min(length) shortesttrip = double(cities) test = double(n) for (k in 1:n){ Nawn 19 if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) # This code introduces mutations within each trip. for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ # This code compares the mutant and original trips. if (mutshortesttripmatrix[k,1] > 0){ mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip genshortestlength[1] = absshortestlength genshortesttrip[1,] = absshortesttrip for (a in 2:gen){ # This code introduces generational comparisons. trips = muttrips # This code introduces crossing-over among trips. 
covector = sample(1:k,k,replace='F') sample = matrix(double(n*cities),n,cities) Nawn 20 cosample = matrix(double(n*cities),n,cities) for (k in 1:n){ sample[k,] = trips[covector[k],] cosample[k,] = trips[covector[k],]} for (t in 1:(n/2)){ copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]} while(length(unique(sample[t,])) < cities){ sample = cosample copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]}}} trips = sample for (k in 1:n){ dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[trips[k,j],trips[k,j+1]] dis[cities] = distance[trips[k,cities],trips[k,1]]} length[k] = sum(dis)} shortestlength = min(length) shortesttrip = double(cities) test = double(n) for (k in 1:n){ if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ Nawn 21 mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ if (mutshortesttripmatrix[k,1] > 0){ mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip genshortestlength[a] = absshortestlength genshortesttrip[a,] = absshortesttrip} trueshortestlength = min(genshortestlength) truetest = double(gen) # This code compares all generational solutions. 
for(a in 1:gen){ if (genshortestlength[a] > trueshortestlength){ truetest[a] = 0 }else truetest[a] = 1} trueshortesttripmatrix = truetest*genshortesttrip trueshortesttrip = double(cities) for (a in 1:gen){ if (trueshortesttripmatrix[a,1] > 0){ trueshortesttrip = trueshortesttripmatrix[a,]}} distance genshortestlength genshortesttrip trueshortestlength trueshortesttrip # Initial intracity distances # Shortest distance traveled in each generation # Corresponding trips that produced shortest distances # Shortest possible distance travelled # Corresponding trip Nawn 22 Appendix Nine – R Code for Threshold Algorithm for TSP, Single Run ############################################################# # Final TSP Solution with Mutation and Crossing Over # # Minimum Distance Solutions # ############################################################# # Initial Parameters cities = 10 max = 1000 n = 16 ch = 0.5 threshold = 3600 mindis = cities*max count = 1 # Number of Cities # Longest Possible Distance between Cities # Number of Trips to Attempt # Chance of Mutation Occurring within a Trip # Minimum distance desired for a Trip # Longest Possible Trip (This will be modified) # Running Total Number of Best Solutions (This will be modified) # Establishing the Map distance = matrix(sample(1:max,(cities*cities),replace='T'),cities,cities) for(i in 1:cities){ distance[,i] = distance[i,] distance[i,i] = 0 } # Threshold Housekeeping genshortestlength = double(count) genshortesttrip = matrix(double(count*cities),count,cities) length = double(n) mutlength = double(n) trips = matrix(double(n*cities),n,cities) muttrips = matrix(double(n*cities),n,cities) absshortesttrip = double(cities) for (k in 1:n){ # This code seeds the initial trips. order = sample(1:cities,cities,replace='F') dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[order[j],order[j+1]] dis[cities] = distance[order[cities],order[1]]} length[k] = sum(dis) trips[k,] = order} shortestlength = min(length) shortesttrip = double(cities) test = double(n) Nawn 23 for (k in 1:n){ if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) # This code introduces mutations within each trip. for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ # This code compares the mutant and original trips. if (mutshortesttripmatrix[k,1] > 0){ mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) mindis = absshortestlength if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip genshortestlength[count] = mindis genshortesttrip[count,] = absshortesttrip while (mindis > threshold){ tcount = count+1 # This code introduces the threshold comparison. 
Nawn 24 newgenshortestlength = double(tcount) newgenshortesttrip = matrix(double(tcount*cities),(count+1),cities) for (h in 1:count){ # Restores size of genshortestlength and genshortesttrip newgenshortestlength[h] = genshortestlength[h] newgenshortestlength[tcount]= 0 newgenshortesttrip[h,] = genshortesttrip[h,] newgenshortesttrip[tcount,] = double(cities)} genshortestlength = newgenshortestlength genshortesttrip = newgenshortesttrip count = tcount # Every "unsuccessful" generation increases the count. trips = muttrips # This code introduces crossing-over among trips. covector = sample(1:k,k,replace='F') sample = matrix(double(n*cities),n,cities) cosample = matrix(double(n*cities),n,cities) for (k in 1:n){ sample[k,] = trips[covector[k],] cosample[k,] = trips[covector[k],]} for (t in 1:(n/2)){ copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]} while(length(unique(sample[t,])) < cities){ sample = cosample copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]}}} trips = sample for (k in 1:n){ dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[trips[k,j],trips[k,j+1]] dis[cities] = distance[trips[k,cities],trips[k,1]]} length[k] = sum(dis)} shortestlength = min(length) shortesttrip = double(cities) test = double(n) for (k in 1:n){ if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) Nawn 25 for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ if (mutshortesttripmatrix[k,1] > 0){ mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) mindis = absshortestlength if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip genshortestlength[count] = mindis genshortesttrip[count,] = absshortesttrip} distance genshortestlength genshortesttrip count mindis absshortesttrip # Initial intracity distances # Shortest distance traveled in each generation # Corresponding trips that produced shortest distances # Total number of generations # First minimum solution # Corresponding trip Nawn 26 Appendix Ten – R Code for Threshold Algorithm for TSP, 1000 Trials ############################################################# # Final TSP Solution with Mutation and Crossing Over # # Minimum Distance Solutions - Average # ############################################################# # Initital Parameters cities = 10 max = 1000 n = 16 ch = 0.5 threshold = 3000 # Number of Cities # Longest Possible Distance between Cities 
# Number of Trips to Attempt # Chance of Mutation Occuring within a Trip # Minimum Distance Desired for a Trip # Establishing the Map distance = matrix(sample(1:max,(cities*cities),replace='T'),cities,cities) for(i in 1:cities){ distance[,i] = distance[i,] distance[i,i] = 0 } nruns = 1000 # Number of Trials to Establish trialmindis = double(nruns) # Records Minimum Distance trialcount = double(nruns) # Records Generations Required trialtrips = matrix(double(nruns*cities),nruns,cities) #Records Trip Required mindisminder = double(threshold) # Count of Each Possible Trip Length Below Threshold for (u in 1:nruns){ # Threshold Housekeeping mindis = cities*max # Longest Possible Trip (This will be modified) count = 1 # Running Total Number of Best Solutions (This will be modified) genshortestlength = double(count) genshortesttrip = matrix(double(count*cities),count,cities) length = double(n) mutlength = double(n) trips = matrix(double(n*cities),n,cities) muttrips = matrix(double(n*cities),n,cities) absshortesttrip = double(cities) # Trials Nawn 27 for (k in 1:n){ # This code seeds the initial trips. order = sample(1:cities,cities,replace='F') dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[order[j],order[j+1]] dis[cities] = distance[order[cities],order[1]]} length[k] = sum(dis) trips[k,] = order} shortestlength = min(length) shortesttrip = double(cities) test = double(n) for (k in 1:n){ if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) # This code introduces mutations within each trip. for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ # This code compares the mutant and original trips. if (mutshortesttripmatrix[k,1] > 0){ Nawn 28 mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) mindis = absshortestlength if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip genshortestlength[count] = mindis genshortesttrip[count,] = absshortesttrip while (mindis > threshold){ # This code introduces the threshold comparison. tcount = count+1 newgenshortestlength = double(tcount) newgenshortesttrip = matrix(double(tcount*cities),(count+1),cities) for (h in 1:count){ # Restores size of genshortestlength and genshortesttrip newgenshortestlength[h] = genshortestlength[h] newgenshortestlength[tcount]= 0 newgenshortesttrip[h,] = genshortesttrip[h,] newgenshortesttrip[tcount,] = double(cities)} genshortestlength = newgenshortestlength genshortesttrip = newgenshortesttrip count = tcount # Every "unsuccessful" generation increases the count. trips = muttrips # This code introduces crossing-over among trips. 
covector = sample(1:k,k,replace='F') sample = matrix(double(n*cities),n,cities) cosample = matrix(double(n*cities),n,cities) for (k in 1:n){ sample[k,] = trips[covector[k],] cosample[k,] = trips[covector[k],]} for (t in 1:(n/2)){ copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]} while(length(unique(sample[t,])) < cities){ sample = cosample copoint1 = sample(1:(cities-1),1,replace='F') copoint2 = sample(copoint1:cities,1,replace='F') for (f in copoint1:copoint2){ sample[t,f] = cosample[(t+n/2),f] sample[(t+n/2),f] = cosample[t,f]}}} trips = sample for (k in 1:n){ dis = double(cities) for (j in 1:(cities-1)){ dis[j] = distance[trips[k,j],trips[k,j+1]] Nawn 29 dis[cities] = distance[trips[k,cities],trips[k,1]]} length[k] = sum(dis)} shortestlength = min(length) shortesttrip = double(cities) test = double(n) for (k in 1:n){ if (length[k] > shortestlength){ test[k] = 0 }else test[k] = 1} shortesttripmatrix = test*trips shortesttrip = double(cities) for (k in 1:n){ if (shortesttripmatrix[k,1] > 0){ shortesttrip = shortesttripmatrix[k,]}} muttrips = trips mut = rbinom(n,1,ch) for (k in 1:n){ if (mut[k] > 0){ screen = sample(1:cities,2,replace ='F') point1 = muttrips[k,screen[1]] point2 = muttrips[k,screen[2]] muttrips[k,screen[2]] = point1 muttrips[k,screen[1]] = point2}} for (k in 1:n){ mutdis = double(cities) for (j in 1:(cities-1)){ mutdis[j] = distance[muttrips[k,j],muttrips[k,j+1]] mutdis[cities] = distance[muttrips[k,cities],muttrips[k,1]]} mutlength[k] = sum(mutdis)} mutshortestlength = min(mutlength) mutshortesttrip = double(cities) muttest = double(n) for (k in 1:n){ if (mutlength[k] > mutshortestlength){ muttest[k] = 0 }else muttest[k] = 1} mutshortesttripmatrix = muttest*muttrips mutshortesttrip = double(cities) for (k in 1:n){ if (mutshortesttripmatrix[k,1] > 0){ mutshortesttrip = mutshortesttripmatrix[k,]}} absshortestlength = min(shortestlength,mutshortestlength) mindis = absshortestlength if (shortestlength > mutshortestlength){ absshortesttrip = mutshortesttrip }else absshortesttrip = shortesttrip Nawn 30 genshortestlength[count] = mindis genshortesttrip[count,] = absshortesttrip} mindisminder[mindis] = mindisminder[mindis] + 1 trialmindis[u] = mindis trialcount[u] = count trialtrips[u,] = absshortesttrip} avgcount = mean(trialcount) bestlength = min(trialmindis) # Records Shortest Length Possible besttest = double(nruns) for (u in 1:nruns){ if (trialmindis[u] > bestlength){ besttest[u] = 0 } else besttest[u] = 1} besttripmatrix = besttest*trialtrips besttrip = double(cities) for (u in 1:nruns){ if (besttripmatrix[u,] > 0){ besttrip = besttripmatrix[u,]}} # Records Corresponding Trip distance trialmindis trialtrips trialcount avgcount bestlength besttrip hist(trialmindis) # Initial Intracity Distances # Minimum Solution in Each Trial # Corresponding Trip in Each Trial # Number of Generations Required to Acheive the MinDis # Average Number of Generations Required # Shortest Suggested Trip Length # Corresponding Trip Nawn 31 Selected Bibliography Buckles, Bill P. and Frederick E. Petry. Genetic Algorithms. Los Alamitos, CA: IEEE Computer Society Press, 1992. Print. Brualdi, Richard A. Introductory Combinatorics. 5th ed. 1977. New York: Pearson Prentice Hall, 2010. Dawkins, Richard. The Selfish Gene. 3rd ed. 2006. New York: Oxford University Press, 2009. Print. Krzanowski, Roman and Jonathan Raper. Spatial Evolutionary Modeling. 
New York: Oxford University Press, 2001. Print.

Moody, Glyn. Digital Code of Life: How Bioinformatics is Revolutionizing Science, Medicine, and Business. Hoboken, NJ: John Wiley and Sons, Inc., 2004. Print.

Reeves, Colin R. and Jonathan E. Rowe. Genetic Algorithms: Principles and Perspectives: A Guide to GA Theory. Boston: Kluwer Academic Publishers, 2003. Print.