Genetic Algorithms
CSCI-2300 Introduction to Algorithms
David Goldschmidt, Ph.D.
Rensselaer Polytechnic Institute
April 28, 2014

Evolutionary Computing
Evolutionary computing produces high-quality partial solutions to problems through natural selection and survival of the fittest
– Compare to natural biological systems that adapt and learn over time

Genetic Algorithm Example
Find the maximum value of the function f(x) = –x^2 + 15x
– Represent the problem using chromosomes built from four genes:

Integer      1     2     3     4     5
Binary code  0001  0010  0011  0100  0101

Integer      6     7     8     9     10
Binary code  0110  0111  1000  1001  1010

Integer      11    12    13    14    15
Binary code  1011  1100  1101  1110  1111

Initial random population of size N = 6:

Chromosome label  Chromosome string  Decoded integer
X1                1100               12
X2                0100               4
X3                0001               1
X4                1110               14
X5                0111               7
X6                1001               9

Genetic Algorithm Example
[Figure: plot of f(x) against x for x from 0 to 15. (a) Chromosome initial locations. (b) Chromosome final locations.]

Genetic Algorithm Example
Determine chromosome fitness for each chromosome:

Chromosome label  Chromosome string  Decoded integer  Chromosome fitness  Fitness ratio, %
X1                1100               12               36                  16.5
X2                0100               4                44                  20.2
X3                0001               1                14                  6.4
X4                1110               14               14                  6.4
X5                0111               7                56                  25.7
X6                1001               9                54                  24.8
Total                                                 218                 100.0

– The fitness function here is simply the original function f(x) = –x^2 + 15x
– The fitness ratio is each chromosome's fitness as a percentage of the total population fitness (218)

Genetic Algorithm Example
Use fitness ratios to determine which chromosomes are selected for crossover and mutation operations:
[Figure: roulette wheel with slice boundaries at 0, 16.5, 36.7, 43.1, 49.5, 75.2, and 100]
– X1: 16.5%
– X2: 20.2%
– X3: 6.4%
– X4: 6.4%
– X5: 25.7%
– X6: 24.8%

Genetic Algorithm Example
Converge on a near-optimal solution:
[Figure: plot of f(x) against x. (b) Chromosome final locations.]
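The fitness values, fitness ratios, and roulette wheel boundaries in the example above can be reproduced with a short Python sketch. The function names (`decode`, `fitness`, `fitness_table`, `wheel_boundaries`) are illustrative choices, not names from the slides; the chromosome strings are the six from the initial population.

```python
# Fitness and fitness-ratio table for the six-chromosome example.
CHROMOSOMES = {
    "X1": "1100", "X2": "0100", "X3": "0001",
    "X4": "1110", "X5": "0111", "X6": "1001",
}


def decode(bits: str) -> int:
    """Decode a 4-bit chromosome string into its integer value."""
    return int(bits, 2)


def fitness(x: int) -> int:
    """The fitness function is the objective itself: f(x) = -x^2 + 15x."""
    return -x * x + 15 * x


def fitness_table(chromosomes):
    """Return (rows, total); each row is (label, bits, x, fitness, ratio %)."""
    scores = {label: fitness(decode(bits)) for label, bits in chromosomes.items()}
    total = sum(scores.values())
    rows = [
        (label, bits, decode(bits), scores[label],
         round(100 * scores[label] / total, 1))
        for label, bits in chromosomes.items()
    ]
    return rows, total


def wheel_boundaries(chromosomes):
    """Cumulative fitness ratios (%): where each roulette wheel slice ends."""
    scores = [fitness(decode(bits)) for bits in chromosomes.values()]
    total = sum(scores)
    boundaries, running = [], 0
    for score in scores:
        running += score  # integer running sum avoids float drift
        boundaries.append(round(100 * running / total, 1))
    return boundaries


if __name__ == "__main__":
    rows, total = fitness_table(CHROMOSOMES)
    for row in rows:
        print(*row)
    print("total fitness:", total)
    print("wheel boundaries:", wheel_boundaries(CHROMOSOMES))
```

The cumulative boundaries are what a roulette wheel spin is compared against: a random number between 0 and 100 selects the chromosome whose slice contains it.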
Convergence Example

Genetic Algorithms – Step 1
Represent the problem domain as a chromosome of fixed length
– Use a fixed number of genes to represent a solution
– Use individual bits or characters for efficient memory use and speed
  1 0 1 1 0 1 0 0 0 0 0 1 0 1 0 1
– e.g. Traveling Salesman Problem (TSP) http://www.lalena.com/AI/Tsp/

Genetic Algorithms – Step 2
Define a fitness function f(x) to measure the quality of individual chromosomes
The fitness function determines
– which chromosomes carry over to the next generation
– which chromosomes are crossed over with one another
– which chromosomes are individually mutated

Genetic Algorithms – Step 3
Establish our genetic algorithm parameters:
– Choose the size of the population, N
– Set the crossover probability, pc
– Set the mutation probability, pm
Randomly generate an initial population of chromosomes:
– x1, x2, ..., xN
[Figure: randomly generated bit-string chromosomes]
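Steps 1 and 3 can be sketched in Python as follows. The parameter values (N = 6, 16 genes per chromosome, pc = 0.7, pm = 0.001) and the seed are assumptions chosen only for illustration; the slides do not prescribe them, and the helper names are hypothetical.

```python
import random

# Hypothetical parameter values, chosen for illustration only.
N = 6                    # population size
CHROMOSOME_LENGTH = 16   # fixed number of genes (bits) per chromosome
PC = 0.7                 # crossover probability (used in later steps)
PM = 0.001               # mutation probability (used in later steps)


def random_chromosome(length: int, rng: random.Random) -> str:
    """Build one fixed-length chromosome of random bits."""
    return "".join(rng.choice("01") for _ in range(length))


def initial_population(n: int, length: int, rng: random.Random) -> list:
    """Randomly generate the initial population x1, x2, ..., xN."""
    return [random_chromosome(length, rng) for _ in range(n)]


rng = random.Random(2014)  # fixed seed so a run can be repeated
population = initial_population(N, CHROMOSOME_LENGTH, rng)

if __name__ == "__main__":
    for label, chromosome in enumerate(population, start=1):
        print(f"x{label}: {chromosome}")
```

Storing each chromosome as a string keeps the sketch readable; a packed integer or byte array would use less memory, which is the efficiency point made in Step 1.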
Genetic Algorithms – Step 4
Calculate the fitness of each individual chromosome using f(x):
– f(x1), f(x2), ..., f(xN)
Order the population based on fitness values

Genetic Algorithms – Step 5
Using pc, select pairs of chromosomes for crossover
Using pm, select chromosomes for mutation
Chromosomes are selected based on their fitness values using a roulette wheel approach:
[Figure: roulette wheel with slice boundaries at 0, 16.5, 36.7, 43.1, 49.5, 75.2, and 100]
– X1: 16.5%
– X2: 20.2%
– X3: 6.4%
– X4: 6.4%
– X5: 25.7%
– X6: 24.8%

Genetic Algorithms – Step 6
Create a pair of offspring chromosomes by applying a crossover operation:
[Figure: crossover of the pairs X6i and X2i, X1i and X5i, X2i and X5i]

Genetic Algorithms – Step 6
Mutate an offspring chromosome by applying a mutation operation:
[Figure: mutation of offspring chromosomes, producing X1"i and X2"i]

Genetic Algorithms – Steps 7 & 8
Step 7:
– Place all generated offspring chromosomes in a new population
Step 8:
– Go back to Step 5 until the size of the new population is equal to the size of the initial population, N

Genetic Algorithms – Steps 9 & 10
Step 9:
– Replace the initial population with the new population
Step 10:
– Go back to Step 4 and repeat the process until termination criteria are satisfied
– Typically repeat this process for 50-5000+ generations

Generation i
X1i    1100   f = 36
X2i    0100   f = 44
X3i    0001   f = 14
X4i    1110   f = 14
X5i    0111   f = 56
X6i    1001   f = 54

Generation (i + 1)
X1i+1  1000   f = 56
X2i+1  0101   f = 50
X3i+1  1011   f = 44
X4i+1  0100   f = 44
X5i+1  0110   f = 54
X6i+1  0111   f = 56

Iteration
[Figure: generation i evolving into generation (i + 1) through crossover and mutation]
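Steps 4 through 10 can be put together in one loop. The following Python sketch does so for the four-gene example with f(x) = –x^2 + 15x. It is one possible reading of the steps, not the course's reference implementation: the function names are hypothetical, pc and pm are left to the caller, termination is a fixed generation count, the crossover point is chosen at random, and the ordering of the population in Step 4 is skipped because roulette wheel selection does not need it.

```python
import random

GENES = 4  # bits per chromosome, so x ranges over 0..15


def fitness(bits: str) -> int:
    """Step 4: f(x) = -x^2 + 15x for the integer encoded by the chromosome."""
    x = int(bits, 2)
    return -x * x + 15 * x


def select(population, rng):
    """Step 5: roulette wheel pick, probability proportional to fitness."""
    weights = [fitness(c) for c in population]
    if sum(weights) == 0:  # every slice is empty: fall back to a uniform pick
        return rng.choice(population)
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a: str, b: str, point: int):
    """Step 6: single-point crossover, swapping tails after `point` genes."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits: str, pm: float, rng) -> str:
    """Step 6: flip each gene independently with probability pm."""
    return "".join(
        ("1" if gene == "0" else "0") if rng.random() < pm else gene
        for gene in bits
    )


def evolve(population, pc, pm, generations, rng):
    """Steps 4-10 with a fixed number of generations as the stopping rule."""
    n = len(population)
    for _ in range(generations):                 # Step 10: repeat
        offspring = []
        while len(offspring) < n:                # Step 8: fill new population
            a, b = select(population, rng), select(population, rng)
            if rng.random() < pc:                # crossover with probability pc
                a, b = crossover(a, b, rng.randrange(1, GENES))
            offspring.append(mutate(a, pm, rng))  # Step 7: collect offspring
            offspring.append(mutate(b, pm, rng))
        population = offspring[:n]               # Step 9: replace population
    return population


if __name__ == "__main__":
    start = ["1100", "0100", "0001", "1110", "0111", "1001"]
    final = evolve(start, pc=0.7, pm=0.001, generations=50, rng=random.Random(1))
    best = max(final, key=fitness)
    print(best, int(best, 2), fitness(best))
```

With this `crossover`, cutting X6i = 1001 and X2i = 0100 after the third gene gives 1000 and 0101, which are the first two chromosomes of generation (i + 1) in the tables above.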
Crossword Puzzle Construction
Given:
– Dictionary of valid words and phrases
– Empty crossword grid
Problem:
– Fill the crossword grid such that all words both across and down are valid (assign clues later)

Crossword Puzzle Construction
Genetic Algorithm (GA)
– Evolve a solution by crossovers and mutations through many generations
– Initial population of crossword grids:
  – Random letters?
  – Random letters based on Scrabble® frequencies?
  – Random words from dictionary?
– Fitness of each grid is the number of valid words

Termination Criteria
When do we stop?
– Pause a genetic algorithm after a given number of generations, then check the fittest chromosomes
– If the fittest chromosomes are fit beyond a given threshold, terminate the genetic algorithm
– Also consider stopping when the highest fitness value does not change for a large number of generations

Computational Complexity
How long does it take for an algorithm to produce a solution?
– Depends on the size of the input and the complexity of the algorithm
– The size of the input is n
– The complexity of the algorithm is classified based on its expected run time

Computational Complexity
Big-O notation bounds how the run time of an algorithm grows with the input size n (i.e.
its computational complexity):
– Constant time: O(1)
– Logarithmic time: O(log n)
– Linear time: O(n)
– Linearithmic time: O(n log n)
– Quadratic time: O(n^2)
– Exponential time: O(c^n)
– Factorial time: O(n!)

Genetic Algorithms
Genetic algorithms are often well-suited to producing reasonable solutions to intractable problems
– Intractable problems are problems with excessive computational complexity
– e.g. NP-hard problems, for which no polynomial-time algorithm is known
A reasonable solution is a partial or inexact solution that adequately solves the problem in polynomial time

Genetic Algorithms Example
Consider the Traveling Salesman Problem (TSP), in which a salesman aims to visit each of n cities exactly once while covering the least distance
http://mathworld.wolfram.com/TravelingSalesmanProblem.html
http://www.tsp.gatech.edu/games/index.html
– Starting at any given node, choose from n–1 remaining nodes, then choose from n–2 remaining nodes, etc.
– Testing every possible route takes (n–1)! steps
see http://bio.math.berkeley.edu/classes/195/2000/lec14/index.html

Genetic Algorithms Example
Use a genetic algorithm to evolve a near-optimal solution to the TSP
– Label cities A, B, C, D, E, F, etc.
– Example circuits: ABCDEF, BDAFCE, FBECAD
– How do we perform crossover operations?
– Basic crossovers might result in invalid members of the population
– e.g. combining ABCDEF and BDAFCE may result in ABCFCE, which visits C twice and never visits D

Genetic Algorithms Example
Key challenge of developing a genetic algorithm is often the representation of the problem
– For TSP, consider a standard ordering ABCDEF, assigning the code 123456
– All other sequences are encoded based on the removal of letters
– Basic crossover works...

Genetic Algorithms Example
All other sequences are encoded based on the removal of letters from the standard ordering
– Sequence BDAFCE has code 231311
  B is 2 in ABCDEF
  D is 3 in ACDEF
  A is 1 in ACEF
  F is 3 in CEF
  C is 1 in CE
  E is 1 in E

Genetic Algorithms Example
Crossing ACEDB with ABCED...
[Figure: Crossover Operation]
another approach: http://www.dna-evolutions.com/dnaappletsample.html

Genetic Algorithms Example
Combining ACEDB with ABCED...
...yields ACBED
from A.K. Dewdney's The (New) Turing Omnibus, Computer Science Press, New York, 1993

Genetic Algorithms
Advantages of genetic algorithms:
– Often outperform "brute force" approaches by randomly jumping around the search space
– Ideal for problem domains in which near-optimal (as opposed to exact) solutions are adequate
Disadvantages of genetic algorithms:
– Might not find any satisfactory partial solutions
– Tuning can be a challenge
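The removal-based TSP encoding from the example can be sketched in Python as follows. The function names are hypothetical, and the crossover point in the usage note (after the second gene) is an assumption: the original crossover figure is not reproduced here, and that cut is simply one that yields the stated result ACBED.

```python
def encode(tour: str, standard: str) -> list:
    """Encode a tour as 1-based positions in the shrinking standard ordering."""
    remaining = list(standard)
    code = []
    for city in tour:
        index = remaining.index(city)
        code.append(index + 1)   # position of the city among those still unvisited
        remaining.pop(index)     # remove the city from the ordering
    return code


def decode(code: list, standard: str) -> str:
    """Rebuild the tour by removing the indicated city at each step."""
    remaining = list(standard)
    return "".join(remaining.pop(position - 1) for position in code)


def crossover(a: list, b: list, point: int):
    """Basic single-point crossover on two codes of equal length."""
    return a[:point] + b[point:], b[:point] + a[point:]


if __name__ == "__main__":
    print(encode("BDAFCE", "ABCDEF"))
    parent1 = encode("ACEDB", "ABCDE")
    parent2 = encode("ABCED", "ABCDE")
    child1, child2 = crossover(parent1, parent2, 2)
    print(decode(child1, "ABCDE"), decode(child2, "ABCDE"))
```

Basic crossover works on these codes because gene k only has to lie between 1 and the number of cities still unvisited at step k. Swapping tails at the same position keeps every gene within its range, so each child decodes to a valid tour. Here ACEDB encodes to 12321 and ABCED to 11121; cutting after the second gene gives 12121, which decodes to ACBED.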