CS420 Project V: Experimentation with Genetic Algorithms
Alexander Saites
11/4/2012

Introduction

In this project, I implemented a genetic algorithm in Matlab which probabilistically chose individuals from a population (with a higher probability of choosing more fit individuals) and copied them to serve as children in the next population. With some probability, called the crossover probability, a crossover point was selected and the two parents' genes beyond that point were exchanged to produce the offspring. Each bit in an offspring was then flipped, independently, with some probability known as the mutation probability. The following fitness function was used:

    f(s) = x / (2^ℓ - 1),

where x is the integer interpretation of an individual's binary bitstring s and ℓ is the number of genes (bits) in the bitstring.

I then wrote a driver program which allowed me to quickly specify the number of genes, the population size (which is static), the mutation probability, the crossover probability, the number of generations (before completion of the experiment), and a seed for the random number generator. The driver program also allowed me to specify the number of experiments to run, so that several experiments with a given set of parameters could be run and graphed on the same figure. In the following graphs and analysis, 10 experiments were performed with each set of parameters and are graphed on top of one another. Finally, the program displays the fitness of each individual in the population for every generation of an experiment. Fitness is represented as intensity, with greater intensity corresponding to higher fitness; each row is a generation, with earlier generations appearing at the top of the image.
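The project's Matlab source is not reproduced in this report, but the generation update described above is small enough to sketch. The listing below is a minimal, illustrative version of that loop rather than the actual project code: the names run_ga and fitness are mine, and it assumes fitness-proportional (roulette-wheel) selection, single-point crossover, per-bit mutation, and the normalized fitness given above.

    % Minimal illustrative sketch of the GA described above (not the project code).
    % pop is an N-by-L logical matrix: one individual per row, one gene per column.
    function [bestHist, avgHist, pop] = run_ga(N, L, pMut, pCross, nGen, seed)
        rng(seed);
        pop = rand(N, L) > 0.5;                      % random initial population
        bestHist = zeros(nGen, 1);
        avgHist  = zeros(nGen, 1);
        for g = 1:nGen
            f = fitness(pop);
            bestHist(g) = max(f);
            avgHist(g)  = mean(f);
            cdf = cumsum((f + eps) / sum(f + eps));  % roulette-wheel selection
            cdf(end) = 1;                            % guard against round-off
            newPop = false(N, L);
            for i = 1:2:N
                p1 = pop(find(cdf >= rand, 1), :);   % fitter individuals are
                p2 = pop(find(cdf >= rand, 1), :);   % more likely to be chosen
                if rand < pCross                     % single-point crossover
                    c = randi(L - 1);
                    child1 = [p1(1:c), p2(c+1:end)];
                    child2 = [p2(1:c), p1(c+1:end)];
                else
                    child1 = p1;
                    child2 = p2;
                end
                newPop(i, :) = child1;
                newPop(min(i+1, N), :) = child2;     % extra child overwritten if N is odd
            end
            pop = xor(newPop, rand(N, L) < pMut);    % flip each gene with probability pMut
        end
    end

    function f = fitness(pop)
        % Normalized integer value of each row (leftmost gene is the high-order bit).
        L = size(pop, 2);
        f = (double(pop) * (2 .^ (L-1:-1:0))') / (2^L - 1);
    end

A driver along the lines described above would then call this several times with different seeds, for example [b, a] = run_ga(30, 20, 0.033, 0.6, 10, k) for k = 1:10, and overlay the resulting best and average fitness curves on one figure.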
Graphs and Analysis

I started with the values suggested in the project description: 20 genes, a population size of 30, a mutation probability of 0.033, a crossover probability of 0.6, and 10 generations. I performed this experiment 10 times. The results are shown below.

Figure 1: Average fitness converged to around 75%.
Figure 2: After 10 generations, the most fit individual usually has between 10 and 17 correct bits.
Figure 3: This population shows a gradual transition from a strong lack of fitness to a fairly fit population.

As Figure 1 shows, the average fitness for the populations tended to increase gradually to about 75% before leveling off. Despite this lower average fitness, the best fitness often reached 1. These populations also show that, occasionally, the best individual would die out, quickly dropping the average. After these generations, the best individual usually has about 15 correct bits (out of twenty). Inspecting a typical best individual's genetic code reveals the following:

1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 0 0 1 0

Many of this individual's left-most (higher-order) bits are ones. Since setting higher-order bits contributes far more to fitness than setting lower-order bits, this result is no surprise. Observing the genetic code of other "best individuals" after 10 generations shows similar results.

Since 10 generations is not very many, I decided to run the same experiment over 50 generations. Here are the results:

Figure 4: Same as Figure 1; over 50 generations we see an average fitness of about 75%.
Figure 5: After 50 generations, more correct bits are set.
Figure 6: This population shows a lot of variation in fitness, even in later generations.

As seen in the above figures, the best and average fitness results are about the same as in the 10-generation experiment: the population quickly converges to an average fitness of about 75%. More interestingly, in Figure 5 we see that more bits are set correctly (an average closer to 16). Again, a typical "best individual" has more of its higher-order bits set:

1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0 0 0 1

In the extra time provided by the other 40 generations, the algorithm set more of the lower-order bits as well. Despite this improvement, we still see that many individuals in the population are not very fit (Figure 6). This is likely due to the somewhat high mutation probability. In the next experiment, I set the mutation probability to 0.0033.

Figure 7: Dropping the mutation probability results in slower, but better, convergence.
Figure 8: A lower mutation rate also allows a higher number of correct bits.
Figure 9: With a lower mutation probability, the population is more stable in later generations.

These results show that the mutation probability did indeed have a significant effect on the quality of the population. Figure 7 shows that the population is able to converge to a higher average fitness (around 90%). This makes sense intuitively: once an individual has more bits set than unset (i.e., more 1s than 0s), mutations are more likely to lower the individual's fitness than to increase it. Thus, a lower mutation rate is more desirable, at least after the population has adapted to the environment (i.e., in later generations). Figure 8 shows that the best individuals were able to hold a higher number of set bits, which again makes sense with a lower mutation rate. Finally, we can see in Figure 9 that the overall population is much fitter than in the other experiments, for the same reasons.

Having settled on a good mutation probability, I decided to see what effect changing the crossover probability had on the population. In the following experiments, I dropped the crossover probability to 0.1.

Figure 10: The population converges more slowly, but to a greater average, after lowering the crossover probability.
Figure 11: The number of correct bits in later populations does not change significantly.
Figure 12: The population takes longer to get fitter, but stays fit more reliably.

The results are not significantly different from the previous set of experiments, but there does seem to be a trend toward a more fit population that takes longer to converge. This makes sense, as the crossover rate is not likely to have a strong effect, given the nature of the problem. To show this, in the following two experiments I increased the crossover rate to 1 (every child performs crossover), then dropped it to zero (no crossover). The results are not significantly different from the above.

Figure 13: Best and average fitness with 100% crossover.
Figure 14: Total number of correct bits with 100% crossover.
Figure 15: Sample population with 100% crossover.
Figure 16: Best and average fitness with 0% crossover.
Figure 17: Correct bits with 0% crossover.
Figure 18: Population with 0% crossover.

Although 100% crossover results in more correct bits in later populations, the results overall are not significantly different. Again, this mainly has to do with the nature of the problem: the fitness function is not very sensitive to single-point crossover.
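To make this claim about crossover concrete, the short calculation below (an illustrative check, not output from the project code) compares how much the normalized fitness from the introduction can change when a single high-order bit flips versus when a low-order tail is exchanged by single-point crossover in a 20-gene individual.

    % Illustrative check: the fitness is dominated by the high-order genes.
    L = 20;
    maxVal = 2^L - 1;
    dTop    = 2^(L-1) / maxVal;   % flipping the most significant bit changes fitness by ~0.5
    dBottom = 1 / maxVal;         % flipping the least significant bit changes it by ~9.5e-7
    fprintf('top bit: %.3f  bottom bit: %.2e  ratio: %.0f\n', dTop, dBottom, dTop / dBottom);

    % A crossover point c genes from the right swaps a c-bit tail between the
    % parents, which can change each child's fitness by at most (2^c - 1)/maxVal.
    c = 5;
    fprintf('max fitness change from a %d-bit crossover tail: %.2e\n', c, (2^c - 1) / maxVal);

Unless the crossover point happens to fall among the first few (high-order) genes, the exchanged material is worth almost nothing in fitness terms, which is consistent with the nearly identical results for crossover probabilities of 0, 0.1, 0.6, and 1.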
For fun, I pumped the crossover probability back up to 0.6 and dropped the mutation probability to 0. Now the only way for the population to improve is via crossover:

Figure 19: 0 mutation.
Figure 20: 0 mutation.
Figure 21: 0 mutation.

In these results, we see the effect crossover really can have. Without mutation, the populations stay approximately where they started; however, since more fit individuals are more likely to reproduce, later generations do still get fitter, which allows the average fitness to rise to the best individual's fitness. Although crossover does sometimes increase an individual's fitness, the problem's lack of sensitivity to crossover still shows in these results.

I did run a set of experiments in which both the crossover probability and the mutation probability were zero, but I did not show the results here because they are really boring: the population ends up only as good as the initial best individual. Breeding just makes the rest of the population genetic copies of that individual. On occasion, the fittest individual dies before it is able to spread its genes, stagnating the population at some less fit individual's fitness.

For a more interesting experiment, I decreased the population size to 10 while using a mutation probability of 0.0033 and a crossover probability of 0.6. Here are the results:

Figure 22: With a smaller population, average fitness is more sensitive to small changes.
Figure 23: The number of correct bits in the best individuals still sits around the same number.
Figure 24: A representative population shows the same sensitivity.

With this smaller population, the individuals are more sensitive to small changes. With fewer individuals to carry strong genes, the population takes longer to develop better fitness, and as a result the death of a fit individual can devastate the population's average. We should expect to see the opposite in a large population: with more individuals, there is a greater probability of some individuals being very fit, so the average fitness should increase quickly. Indeed, it does:

Figure 25: With a large population, there are many good individuals.
Figure 26: In this large population, it is more likely for the best individual to have a greater number of correct bits.
Figure 27: This huge population could probably benefit from a lower mutation rate.

Finally, I changed the population size back to 30 and started playing with the number of genes. First, I pumped it up to 200 (Matlab can handle this natively – isn't that great?).

Figure 28: With more genes, the population fitness is noisier.

This population shows a lot more noise, which implies that the population size is not as large as it should be, so I increased it to 50 individuals:

Figure 29: With 50 individuals, the population is able to adjust to such a high number of genes.

However, the number of correct genes tells an interesting story:

Figure 30: Only about 55% of the total genes are correct.

Again, this just shows that higher-order bits are much more important than lower-order bits.
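The 55% figure is easy to understand with a quick illustrative calculation (again using the normalized fitness from the introduction, not output from the project code): a 200-gene individual whose top ten genes are set and whose remaining genes are random, so only a little over half ones overall, already sits within about a thousandth of the maximum fitness, so selection exerts essentially no pressure on the low-order genes.

    % Illustrative: low-order genes barely affect the fitness of a 200-gene individual.
    L = 200;
    s = [true(1, 10), rand(1, L - 10) > 0.5];   % top 10 genes set, the rest random
    f = (double(s) * (2 .^ (L-1:-1:0))') / (2^L - 1);
    fprintf('fraction of ones: %.2f   fitness: %.6f\n', mean(s), f);

Driving the last hundred or so genes to 1 buys essentially nothing under this fitness function, which is consistent with Figure 30 hovering around 55% correct genes.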
Conclusions

Overall, these results show that the fitness function is much more sensitive to higher-order bits than to lower-order bits, which results in a low sensitivity to the crossover probability. Furthermore, the results show that a higher number of genes necessitates a larger population, as too small a population does not allow much genetic information to spread before individuals die.