The Building Block Hypothesis

Building Blocks
CS 5764 Evolutionary Computation
Hod Lipson

Unifying ideas
• Knowledge is represented as a population of solutions containing building blocks
• Progress is driven by two key processes:
  – Incremental progress, e.g. mutation (traditional optimization): refinement
  – Recombination of solutions, e.g. crossover: discovering new areas (possibly initially inferior)

Terminology
• “Chromosome” and “Gene”, illustrated on the bit string 01010100111001010101010010110
• Allele: one of two or more forms of a gene or a genetic locus

A GA Schema
• A “template”: a string of symbols taken from the alphabet {0,1,*}
  – 010*1, *110*, *****, 10101
• The character “*” means “don’t care”
  – *10*1 represents 01001, 01011, 11001, and 11011

Geometric Interpretation
• A schema is a hyperplane in the larger search space manifold

Order of a schema
• Number of specified (non-*) alleles in a gene

  Schema   Represented strings
  ?        000 001 010 011
  ?        010 011 110 111
  ?        010 110
  ?        101

Order of a schema
• How many different strings of length N does a schema of order “O” represent?
  – A schema of order O represents $2^{N-O}$ different strings of length N

  Schema   Order   Represented strings
  ***      0       000 001 010 011 100 101 110 111
  *1*      1       010 011 110 111
  *10      2       010 110
  101      3       101

Destructive Dynamics
• Probability of surviving mutation: $S_m(H) = (1 - p_m)^{O(H)}$, where $p_m$ is the per-bit mutation probability and $O(H)$ is the order of schema H

Defining Length
• “D” = the distance between the furthest two non-* symbols

  Schema   D
  ****     0
  *1**     0
  *10*     1
  10**     1
  1*1*     2
  1*11     3
  0**1     3
  1001     3

Why is the length important?

Destructive Dynamics
• Probability of surviving single point crossover: $S_c(H) \ge 1 - p_c \, \frac{D(H)}{N-1}$, where $p_c$ is the crossover probability and N is the string length. This is a lower bound, since a cut inside the defining length does not always destroy the schema.

Strings containing schemata
• A bit string represented by a schema is said to “contain” the schema

  Bit string   Contained schemata
  1            1 *
  00           00 0* *0 **
  110          110 11* 1*0 1** *10 *1* **0 ***
  1011         1011 101* 10*1 10** 1*11 1*1* 1**1 1*** *011 *01* *0*1 *0** **11 **1* ***1 ****

How many schemata does a string of length N include?

How many schemata in a population?
• There are $3^N$ different schemata (potential genes) of length N
• A population of P bit strings, each of length N, contains between $2^N$ and $\min(P \cdot 2^N, 3^N)$ schemata

[Figure: all possible schemata for N = 3]

  N     P     Number of schemata
  3     100   ? - ?
  6     20    64 - 729
  20    50    1048576 - 52428800
  40    100   -
  100   300

Estimating fitness associated with a gene

  Population   f
  101          5
  100          1
  010          2
  110          3

  Schema   Estimated f
  ***      (5+1+2+3) / 4 = 2.75
  **0      (1+2+3) / 3 = 2
  **1      5 / 1 = 5
  *0*      (5+1) / 2 = 3
  *00      1 / 1 = 1
  *01      5 / 1 = 5
  *1*      (2+3) / 2 = 2.5

Estimation uncertainty: standard error

Observations
• If only fitness-proportionate selection is applied (no crossover or mutation), schemata with above (below) average fitness are sampled, generation after generation, by an increasing (decreasing) number of chromosomes.
• Schemata with a long defining length have a higher probability of being disrupted by crossover
• Schemata with high order have a higher probability of being disrupted by mutation
• Schemata with a low order and a short defining length are called building blocks
• Building blocks are processed with minimum disruption by GAs; therefore GAs use building blocks of relatively high fitness to build entire solutions
• GAs will be successful insofar as the problem has been encoded in a way that can be solved with compact building blocks (low order, short defining length)
• What is the easiest problem you can think of?
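
The schema counting and fitness estimates above can be checked with a short Python sketch. It is only an illustration: the helper names (`matches`, `contained_schemata`, `schema_fitness`) are hypothetical, and the population and fitness values are the four-string example from the slides.

```python
from itertools import product

def matches(schema, s):
    """True if bit string s is an instance of schema ('*' means don't care)."""
    return len(schema) == len(s) and all(c in ("*", b) for c, b in zip(schema, s))

def contained_schemata(s):
    """All 2^N schemata contained in s: each position keeps its bit or becomes '*'."""
    return {"".join(t) for t in product(*[(b, "*") for b in s])}

def schema_fitness(schema, population, fitness):
    """Mean fitness of the population members matching schema (None if no match)."""
    vals = [f for s, f in zip(population, fitness) if matches(schema, s)]
    return sum(vals) / len(vals) if vals else None

# Example population and fitness values from the slides.
population = ["101", "100", "010", "110"]
fitness = [5, 1, 2, 3]

# Distinct schemata present in the population (union over all members).
in_population = set().union(*(contained_schemata(s) for s in population))
print(len(in_population), "distinct schemata in the population")

for h in ["***", "**0", "**1", "*0*", "*00", "*01", "*1*"]:
    print(h, schema_fitness(h, population, fitness))
```

For this population the union holds 20 distinct schemata, which falls between the bounds $2^3 = 8$ and $\min(4 \cdot 2^3, 3^3) = 27$.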

Dynamics
• H is a schema present in the population at time t
• m(H,t) is the number of instances of H at time t
• u(H,t) is the observed average fitness of H
• The expected number of offspring of x is f(x)/f_avg(t)
• Summing over every x that is an instance of H, under selection alone: $E[m(H,t+1)] = \sum_{x \in H} \frac{f(x)}{f_{avg}(t)} = \frac{u(H,t)}{f_{avg}(t)} \, m(H,t)$

Destructive Dynamics
• Probability of surviving single point crossover: $S_c(H) \ge 1 - p_c \, \frac{D(H)}{N-1}$
• Probability of surviving mutation: $S_m(H) = (1 - p_m)^{O(H)}$
  – Here $p_c$ is the crossover probability, $p_m$ the per-bit mutation probability, N the string length, D(H) the defining length, and O(H) the order of H

Combining Effects
• $E[m(H,t+1)] \ge \frac{u(H,t)}{f_{avg}(t)} \, m(H,t) \left(1 - p_c \, \frac{D(H)}{N-1}\right) (1 - p_m)^{O(H)}$

[Figure: best fitness vs. evaluations (generation x100) for GA (Diversity, Tight Linkage), GA (Diversity, Poor Linkage), Parallel Simulated Annealing, Parallel Hillclimber, GA (Roulette, Tight Linkage), and Random Search]
Large defining length and small order = poor linkage

The Building Block Hypothesis
• GAs perform adaptation by identifying and recombining "building blocks", i.e. low order, low defining-length schemata with above average fitness.
• GAs perform adaptation by implicitly and efficiently implementing this heuristic.

Caveats
• Model assumes a particular form of representation:
  – Bit strings, single point crossover, mutation
• Assumes fitness-proportionate selection
• Assumes a fixed fitness criterion
• Assumes a fixed population size

Many variations have been published
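
The survival probabilities named on the Destructive Dynamics slides can be sketched in Python. This sketch assumes the standard schema-theorem forms: (1 - p_m)^order for mutation and the lower bound 1 - p_c * D / (N - 1) for single point crossover. The function names and the values chosen for p_c and p_m are illustrative assumptions, not taken from the course material.

```python
def order(schema):
    """Number of specified (non-'*') positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """Distance between the two outermost non-'*' positions (0 if fewer than two)."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def survive_mutation(schema, p_m):
    """Probability that no specified bit is flipped; p_m is the per-bit mutation rate."""
    return (1.0 - p_m) ** order(schema)

def survive_crossover_bound(schema, p_c):
    """Lower bound on surviving single point crossover applied with probability p_c."""
    n = len(schema)
    if n < 2:
        return 1.0  # no interior cut point exists
    return 1.0 - p_c * defining_length(schema) / (n - 1)

def expected_instances_bound(schema, m, u, f_avg, p_c, p_m):
    """Lower bound on the expected number of instances in the next generation."""
    growth = (u / f_avg) * m  # effect of fitness-proportionate selection alone
    return growth * survive_crossover_bound(schema, p_c) * survive_mutation(schema, p_m)

# Two schemata of the same order: tight linkage vs. poor linkage.
# p_c = 0.7 and p_m = 0.01 are assumed, illustrative rates.
for h in ["11******", "1******1"]:
    print(h, order(h), defining_length(h),
          survive_crossover_bound(h, p_c=0.7),
          survive_mutation(h, p_m=0.01))
```

Both example schemata have order 2, so mutation treats them alike; only the defining length (1 versus 7) changes the crossover bound, which is the tight-versus-poor linkage contrast from the slides.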
