What Do the Following 3 Things Have in Common?

Genetic Algorithms (GAs)
• GAs design jet engines.
• GAs draw criminals.
• GAs program computers.

A Potpourri of Applications
1. General Electric's Engineous (generalized engineering optimization).
2. Face space (criminology).
3. Genetic programming (machine learning).

Gas Turbine Design
Jet engine design at General Electric (Powell, Tong, & Skolnick, 1989)
• Coarse optimization: 100 design variables.
• Hybrid GA + numerical optimization + expert system.
• Found 2% increase in efficiency.
• Spending $250K to test in laboratory.
• Boeing 777 design based on these results.

Engineous
A new software system called Engineous combines artificial intelligence and numerical methods for the design and optimization of complex aerospace systems. Engineous combines the advanced computational techniques of genetic algorithms, expert systems, and object-oriented programming with the conventional methods of numerical optimization and simulated annealing to create a design optimization environment that can be applied to computational models in various disciplines. Engineous has produced designs with higher predicted performance gains than current manual design processes, with on average a 10-to-1 reduction of turnaround time, and has yielded new insights into product design. It has been applied to the aerodynamic preliminary design of an aircraft engine turbine, concurrent aerodynamic and mechanical preliminary design of an aircraft engine turbine blade and disk, a space superconductor generator, a satellite power converter, and a nuclear-powered satellite reactor and shield.
Source: Tong, S.S.; Powell, D.; Goel, S. Integration of artificial intelligence and numerical optimization techniques for the design of complex aerospace systems.

Engineous
Engineous Software, Inc. provides process integration and design optimization software solutions and services.
The company's iSIGHT software integrates key steps in the product design process, then automates and executes those steps through design exploration tools like optimization, DOE, and DFSS techniques. The FIPER infrastructure links these processes together in a unified environment. Through FIPER, models, applications and "best" processes are easily shared, accessed, and executed with other engineers, groups, and partners. Engineous operates numerous sales offices in the U.S., as well as wholly owned subsidiaries in Asia and Europe. Customers include leading Global 500 companies such as Canon, General Electric, General Motors, Pratt & Whitney, Honeywell, Lockheed Martin, Toshiba, MHI, Ford Motor Company, Chrysler, Toyota, Nissan, Renault, Hitachi, Peugeot, MTU Aero Engines, TRW, BMW, Rolls Royce, and Johnson Controls, Inc. Additional information may be found at www.Engineous.com and http://components.Engineous.com.

Recent news on Engineous
Dassault Systèmes to Acquire Engineous Software (06/17/2008)
Paris, France, and Providence, R.I., USA, June 17, 2008: Dassault Systèmes (DS) (Nasdaq: DASTY; Euronext Paris: #13065, DSY.PA), a world leader in 3D and Product Lifecycle Management (PLM) solutions, and Engineous Software, a market leader in process automation, integration and optimization, today announced an agreement in which DS would acquire Engineous Software. This acquisition will extend SIMULIA's leadership in providing Simulation Lifecycle Management solutions on the V6 IP collaboration platform. The proposed acquisition, for an estimated price of 40 million USD, should be completed before the end of July subject to specific closing conditions.
Source: http://www.3ds.com/company/news-media/press-releases-detail/release//single/1789/?no_cache=1

Criminal-likeness Reconstruction
• No closed-form fitness function (Caldwell & Johnston, 1991).
• Human witness chooses faces that match best.
• GA creates new faces from which to choose.

What are GAs?
• GAs are a biologically inspired class of algorithms that can be applied to, among other things, the optimization of nonlinear multimodal functions.
• They solve problems in the same way that nature solves the problem of adapting living organisms to the harsh realities of life in a hostile world: evolution.
Let's watch a video...

What is a Genetic Algorithm (GA)?
A GA is an adaptation procedure based on the mechanics of natural selection and genetics. GAs have 2 essential components:
1. Survival of the fittest (selection)
2. Variation

Nature as Problem Solver
• Beauty-of-nature argument
• How Life Learned to Live (Tributsch, 1982, MIT Press)
• Example: Nature as structural engineer
[Figure: Owl Butterfly]

Evolutionary is Revolutionary!
• The street distinction between evolutionary and revolutionary is a false dichotomy.
• 3.5 billion years of evolution can't be wrong.
• Complexity is achieved in a short time in nature.
• Can we solve complex problems as quickly and reliably on a computer?

Why Bother?
Lots of ways to solve problems:
• Calculus
• Hill-climbing
• Enumeration
• Operations research: linear, quadratic, nonlinear programming
Why bother with biology?

Combinatorial Problem
• A combinatorial problem deals with finite or countable discrete structures.
• It involves deciding when certain criteria can be met, and constructing and analyzing objects meeting the criteria.
e.g. Satisfiability: Is there a truth assignment to the boolean variables such that every clause is satisfied?
Source: http://www.cs.sunysb.edu/~algorith/files/satisfiability.shtml

Non-linear Programming
Nonlinear programming (NLP) is the process of solving a system of equalities and inequalities, collectively termed constraints, over a set of unknown real variables, along with an objective function to be maximized or minimized, where some of the constraints or the objective function are nonlinear.
Consider the following nonlinear program:
minimise $x \sin(3.14159x)$ subject to $0 \le x \le 6$
[Figure: graph of $x \sin(3.14159x)$ for $0 \le x \le 6$]
In the diagram there are many local minima; that is, points at the bottom of some portion of the graph.
Source: http://people.brunel.ac.uk/~mastjjb/jeb/or/nlp.html

Why bother with biology?
• Gradient search technique
• Robustness = Breadth + Efficiency.
• A hypothetical problem spectrum: [Figure]

GAs Not New
• John Holland at the University of Michigan pioneered GAs in the 50s. (Professor of Psychology and Electrical Engineering & Computer Science; Ph.D. University of Michigan; area: Cognition and Cognitive Neuroscience.)
• Other evolutionaries: Fogel, Rechenberg, Schwefel.
• Goes back to the cybernetics movement and early computers.
• Reborn in the 70s.
Source: http://www.lsa.umich.edu/psych/people/directory/profiles/faculty/?uniquename=jholland

Genetic algorithms
Variant of local beam search with sexual recombination.

How are GAs different from traditional methods?
1. GAs work with a coding of the parameter set, not the parameters themselves.
2. GAs search from a population of points, not a single point.
3. GAs use payoff (objective function) information, not derivatives or other auxiliary knowledge.
4. GAs use probabilistic transition rules, not deterministic rules.

Traditional Approach
Problem: Maximise the function
$f(S_1, S_2, S_3, S_4, S_5) = \sin(S_1)^2 \cdot \sin(S_2)^2 + S_3 - \log_e(S_3) \cdot S_4 - S_5$
Traditional approach: twiddle with the switch parameters.
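To illustrate why single-point, neighbour-to-neighbour search struggles on multimodal functions, the following sketch runs a plain hill-climber on the nonlinear program quoted earlier in this section (minimise x sin(πx) on 0 <= x <= 6). The step size, the starting points, and the function names are illustrative assumptions, not part of the original slides.

```python
import math


def f(x):
    """Objective of the example program: x * sin(pi * x), to be minimised on [0, 6]."""
    return x * math.sin(math.pi * x)


def hill_climb(start, step=0.001, lo=0.0, hi=6.0, max_iter=100_000):
    """Greedy descent: keep moving to the better neighbour until neither improves."""
    x = start
    for _ in range(max_iter):
        neighbours = [min(hi, x + step), max(lo, x - step)]
        best = min(neighbours, key=f)
        if f(best) >= f(x):  # no neighbour is strictly better: local minimum reached
            break
        x = best
    return x


# Three assumed starting points; each run settles in a different valley.
results = {start: hill_climb(start) for start in (1.0, 3.0, 5.0)}
for start, x in results.items():
    print(f"start={start:.1f} -> x={x:.3f}, f(x)={f(x):.3f}")
```

Each start ends in a different local minimum, and only the start nearest the right-hand end of the interval finds the deepest valley. A population-based search samples several of these valleys at once, which is the motivation developed in the slides that follow.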
[Figure: black box with the setting of five switches S1, S2, S3, S4, S5 as input and output y = f(S1, S2, S3, S4, S5)]

Genetic Algorithm
Problem: Maximise the function
$f(S_1, S_2, S_3, S_4, S_5) = \sin(S_1)^2 \cdot \sin(S_2)^2 + S_3 - \log_e(S_3) \cdot S_4 - S_5$
• The natural parameter set of the optimisation problem is represented as a finite-length string.
• The GA doesn't need to know the workings of the black box.
[Figure: the same black box with switch settings as input and output y = f(S1, S2, S3, S4, S5); plot of f(x) for 0 <= x <= 31]

Main Attractions of Genetic Algorithm
GA:
• Simplicity of operation and power of effect
• Unconstrained
• Works from a rich database of points simultaneously, climbing many peaks in parallel
• Population of strings = points
• Population of well-adapted diversity
Traditional optimization approaches:
• Limitations: continuity, derivative existence, unimodality
• Move gingerly from a single point in the decision space to the next using some transition rule

Example problem for a GA:
maximise $2x_1 + x_2 - 5\log_e(x_1)\sin(x_2)$
subject to $x_1 x_2 \le 10$, $|x_1 - x_2| \le 2$, $0.1 \le x_1 \le 5$, $0.1 \le x_2 \le 3$
Source: Nonlinear Programming, by J E Beasley, http://people.brunel.ac.uk/~mastjjb/jeb/or/nlp.html

Genetic Algorithm
• Initial step: random start using successive coin flips. Example population: 01101, 11000, 01000, 10011.
• The GA uses a coding.
• Blind to auxiliary information: GAs are blind; only payoff values associated with individual strings are required.
• Searches from a population.
• Uses probabilistic transition rules to guide the search towards regions of the search space with likely improvement.

Genetic Algorithm
REPRODUCTION
• Selection according to fitness
• Replication
[Figure: population (01101, 11000, 01000, 10011) -> weighted roulette wheel -> mating pool (tentative population)]
CROSSOVER
• Crossover: randomized information exchange
• Crossover point k is chosen in [1, l-1]
• The GA builds solutions from the past partial solutions of the previous trials.

Genetic Algorithm
MUTATION
• Reproduction and crossover may become overzealous and lose some potentially useful genetic material.
• Mutation protects against irrecoverable loss; it serves as an insurance policy against premature loss of important notions.
• Mutation rates: on the order of 1 mutation per thousand bit position transfers.

Genetic Algorithm: Example
Problem: Maximise the function $f(x) = x^2$ on the integer interval [0, 31].
1. Coding of decision variables as some finite-length string: x as a binary unsigned integer of length 5, so [0, 31] = [00000, 11111].
2. Constant settings: Pmutation = 0.0333, Pcross = 0.6, Population Size = 30. De Jong (1975) suggests a high crossover probability, a low mutation probability (inversely proportional to the population size), and a moderate population size.
3. Select initial population at random (use an even-numbered population size).

    String   Initial      x       f(x) = x^2   pselect       Expected count   Actual count
    number   population   value                f_i / sum f   f_i / f_ave      (roulette wheel)
    1        01101        13      169          0.14          0.58             1
    2        11000        24      576          0.49          1.97             2
    3        01000         8       64          0.06          0.22             0
    4        10011        19      361          0.31          1.23             1
    Sum                           1170
    Ave.                           293
    Max.                           576

4. Reproduction: select the mating pool by spinning the roulette wheel 4 times.
[Figure: weighted roulette wheel with slices 14.4% (string 1, 01101), 49.2% (string 2, 11000), 5.5% (string 3, 01000), 30.9% (string 4, 10011)]
The best get more copies. The average stay even. The worst die off.

Choosing offspring for the next generation (pseudocode; Random returns a value in [0, 1] and Pop is indexed from 1):

    int Select(int Popsize, double Sumfitness, Population Pop) {
        partSum = 0
        rand = Random * Sumfitness      /* point on the wheel */
        j = 0
        Repeat
            j++
            partSum = partSum + Pop[j].fitness
        Until (partSum >= rand) or (j = Popsize)
        Return j
    }

5. Crossover: strings are mated randomly, using coin tosses to pair the couples; mated string couples cross over, using coin tosses to select the crossing site.

    String   Mating pool          Mate         Crossover   New          x       f(x) = x^2
    number   after reproduction   (random)     site        population   value
    1        0110|1               2            4           01100        12      144
    2        1100|0               1            4           11001        25      625
    3        11|000               4            2           11011        27      729
    4        10|011               3            2           10000        16      256
    Sum                                                                         1754
    Ave.                                                                         439
    Max.                                                                         729

The Genetic Algorithm
1. Initialize the algorithm. Randomly initialize each individual chromosome in the population of size N (N must be even), and compute each individual's fitness.
2. Select N/2 pairs of individuals for crossover. The probability that an individual will be selected for crossover is proportional to its fitness.
3. Perform the crossover operation on the N/2 pairs selected in Step 2. Randomly mutate bits with a small probability during this operation.
4. Compute the fitness of all individuals in the new population.
5. (Optional Optimization) Select the N fittest individuals from the combined population of size 2N consisting of the old and new populations pooled together.
6. (Optional Optimization) Rescale the fitness of the population.
7. Determine the maximum fitness of individuals in the population. If |max fitness - optimum fitness| < tolerance, then stop; else go to Step 2.

A Simple GA Example
Let's see a demonstration of a GA that maximizes the function
$f(x) = (x/c)^n$, with $n = 10$ and $c = 2^{30} - 1 = 1{,}073{,}741{,}823$.

Simple GA Example
Function to evaluate (the fitness function or objective function): $f(x) = (x/\mathrm{coeff})^{10}$
coeff is chosen to normalize the x parameter when a bit string of length lchrom = 30 is chosen: $\mathrm{coeff} = 2^{30} - 1$.
When the x value is normalized, the maximum value of the function is $f(x) = 1.0$. This happens when $x = 2^{30} - 1$ for the case lchrom = 30.

Test Problem Characteristics
With a string length of 30, the search space is much larger, and random walk or enumeration should not be so profitable. There are $2^{30} \approx 1.07 \times 10^9$ points. With over 1.07 billion points in the space, one-at-a-time methods are unlikely to do very much very quickly. Also, only 1.05 percent of the points have a value greater than 0.9.

Comparison of the functions on the unit interval
[Figure: f(x) = x^2 for 0 <= x <= 31 and f(x) = x^10 for 0 <= x <= 1,073,741,823; actual plot of x^2 and x^10 against normalized x from 0 to 1, with f(x) from 0 to 1]

Decoding a String
For every problem, we must create a procedure that decodes a string to create a parameter (or set of parameters) appropriate for that problem.
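The hand-worked f(x) = x^2 example earlier in this section can be checked with a few lines of Python. The sketch below recomputes the selection probabilities and expected counts for the four initial strings, then applies single-point crossover to the mating pool using the mates and crossover sites given in the tables. The function names are illustrative assumptions, and the pairings and sites are fixed to the table's values rather than drawn at random.

```python
def decode(bits):
    """Interpret a bit string such as '01101' as an unsigned binary integer."""
    return int(bits, 2)


def fitness(bits):
    """Objective of the sample problem: f(x) = x^2."""
    return decode(bits) ** 2


def selection_table(population):
    """Return (pselect, expected count) for each string under roulette-wheel selection."""
    fits = [fitness(s) for s in population]
    total = sum(fits)
    average = total / len(fits)
    return [(fi / total, fi / average) for fi in fits]


def crossover(parent1, parent2, site):
    """Single-point crossover: exchange the tails after position `site` (1-based)."""
    return parent1[:site] + parent2[site:], parent2[:site] + parent1[site:]


initial = ["01101", "11000", "01000", "10011"]
table = selection_table(initial)
for s, (p, e) in zip(initial, table):
    print(s, decode(s), fitness(s), f"pselect={p:.3f}", f"expected={e:.2f}")

# Mating pool and crossover sites as listed in the crossover table.
mating_pool = ["01101", "11000", "11000", "10011"]
children = [*crossover(mating_pool[0], mating_pool[1], 4),
            *crossover(mating_pool[2], mating_pool[3], 2)]
new_fits = [fitness(c) for c in children]
print(children, "sum:", sum(new_fits), "max:", max(new_fits))
```

The computed probability for the third string is about 0.055, which matches the 5.5% slice on the roulette wheel; the table lists it as 0.06.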
[Figure: chromosome (bit string beginning at the first bit, e.g. 11010101...) -> DECODE -> parameter (e.g. 1073741823.0) -> objective function -> fitness or figure of merit]

GA Parameters
A series of parametric studies [De Jong, 1975] across a five-function suite of optimization problems suggested that good GA performance requires the choice of:
• High crossover probability
• Low mutation probability (inversely proportional to the population size)
• Moderate population size
(e.g. pmutation = 0.0333, pcross = 0.6, popsize = 30)

Limits of GAs
• GAs are characterized by a voracious appetite for processing power and storage capacity.
• GAs have no convergence guarantees in arbitrary problems.
• GAs sort out interesting areas of a space quickly, but they are a weak method, without the guarantees of more convergent procedures.
• This does not reduce their utility, however. More convergent methods sacrifice globality and flexibility for their convergence, and are limited to a narrow class of problems.
• GAs can be used where more convergent methods dare not tread.

Advantages of GAs
• Well suited to a wide class of problems
• Do not rely on the analytical properties of the function to be optimized (such as the existence of a derivative)

Advanced GA Architectures
GA + any locally convergent method: start the search using the GA to sort out the interesting hills in your problem. Once the GA ferrets out the best regions, apply a locally convergent scheme to climb the local peaks.

Other Applications
Optimization of a choice of fuzzy logic parameters

Simple GA Implementation
[Flowchart: initial population of chromosomes -> calculate fitness value -> solution found? Yes: stop. No: evolutionary operations -> offspring population -> calculate fitness value again]

Based on SGA-C, A C-language Implementation of a Simple Genetic Algorithm
Let's see the documentation (pdf file)

Phase 1: General Initialisation
• Initialise Parameters()
• Randomize(), Warmup_Random(), Advance_Random()
• Initialise Population(), Decode(), Objective Function()

Phase 2: Generation of Chromosomes
• Generation()
• SelectIndividual()
• CrossOver()
• Mutation()

Running the GA System

    Gen = 0
    Initialize( OldPop )
    Do
        Gen = Gen + 1
        Generation( OldPop, NewPop )
        For ii = 1 To PopSize
            OldPop(ii) = NewPop(ii)   'advance the generation
        Next ii
    Loop Until ( (Gen > MaxGen) Or (MaxFitness > DesiredFitness) )

Initialisation: Initialise Parameters

    PopSize = 30        'population size
    lchrom = 30         'chromosome length
    MaxGen = 10
    PCross = 0.6
    PMutation = 0.0333
    ReDim GenStat(1 To (MaxGen + 1))
    'Initialize random number generator
    Randomize
    'Initialize counters
    NMutation = 0
    NCross = 0

Randomization: Randomize

    Sub Randomize()
        Randomize Timer
        Warmup_Random (Rnd * 1)   'seed value in [0, 1]
    End Sub

Randomization: Warmup_Random()

    Sub Warmup_Random(RandomSeed As Single)   'RandomSeed is in [0, 1]
        Dim j1 As Integer
        Dim ii As Integer
        Dim NewRandom As Single
        Dim PrevRandom As Single
        OldRand(55) = RandomSeed
        NewRandom = 0.000000001
        PrevRandom = RandomSeed
        For j1 = 1 To 54
            ii = (21 * j1) Mod 55   'multiply first, before modulus
            OldRand(ii) = NewRandom
            NewRandom = PrevRandom - NewRandom
            If (NewRandom < 0) Then NewRandom = NewRandom + 1
            PrevRandom = OldRand(ii)
        Next j1
        Advance_Random
        Advance_Random
        Advance_Random
        jrand = 0
    End Sub

Randomization: Advance_Random()

    Sub Advance_Random()
        Dim j1 As Integer
        Dim New_Random As Single
        For j1 = 1 To 24
            New_Random = OldRand(j1) - OldRand(j1 + 31)   'largest index used: 24 + 31 = 55
            If (New_Random < 0) Then New_Random = New_Random + 1
            OldRand(j1) = New_Random
        Next j1
        For j1 = 25 To 55
            New_Random = OldRand(j1) - OldRand(j1 - 24)   'largest index used: 55 - 24 = 31
            If (New_Random < 0) Then New_Random = New_Random + 1
            OldRand(j1) = New_Random
        Next j1
    End Sub

Random

    'Fetch a single random number between 0.0 and 1.0 - Subtractive Method
    'See Knuth, D. (1969), v. 2 for details
    Function Random() As Single
        jrand = jrand + 1
        If jrand > 55 Then
            jrand = 1
            Advance_Random
        End If
        Random = OldRand(jrand)
    End Function

Reference: Knuth, D.E. (1969). The Art of Computer Programming, vol. 2. Addison-Wesley, Reading, MA.

Initialise Population: InitPop()

    Sub InitPop()
        Dim j As Integer
        Dim j1 As Integer
        For j = 1 To PopSize
            With OldPop(j)
                For j1 = 1 To lchrom
                    .Chromosome(j1) = Flip(0.5)
                Next j1
                .x = Decode(.Chromosome, lchrom)   'decode the string
                .Fitness = ObjFunc(.x)             'evaluate initial fitness
                .Parent1 = 0
                .Parent2 = 0
                .XSite = 0
            End With
        Next j
    End Sub

Initialise Population: Decode()

    'Decodes the string to create a parameter or set of parameters
    'appropriate for that problem
    'Decode string as unsigned binary integer: true=1, false=0
    '(binary to decimal conversion)
    Function Decode(Chrom() As Boolean, lbits As Integer) As Single
        Dim j As Integer
        Dim Accum As Single
        Dim PowerOf2 As Single
        Accum = 0
        PowerOf2 = 1
        For j = 1 To lbits
            If Chrom(j) Then Accum = Accum + PowerOf2
            PowerOf2 = PowerOf2 * 2
        Next j
        Decode = Accum
    End Function

Initialise Population: Objective Function()

    'Fitness function = f(x) = (x/c) ^ n
    Function ObjFunc(x As Single) As Single
        'coef = (2 ^ 30)-1 = 1073741823
        'coef is chosen to normalize the x parameter when a bit string
        'of length lchrom=30 is chosen.
        'Since the x value has been normalized, the maximum value of the
        'function will be f(x)=1, when x=(2^30)-1, for the case when lchrom=30
        Const coef As Single = 1073741823   'coefficient to normalize domain
        Const n As Single = 10              'power of x
        ObjFunc = (x / coef) ^ n
    End Function

Generation of Chromosomes: SelectIndividual()

    'Select a single individual or offspring for the next generation
    'via roulette wheel selection
    Function SelectIndividual(PopSize As Integer, SumFitness As Single, Pop() As IndividualType) As Integer
        Dim RandPoint As Single
        Dim PartSum As Single
        Dim j As Integer
        PartSum = 0
        j = 0
        RandPoint = Random * SumFitness
        Do   'find wheel slot
            j = j + 1
            PartSum = PartSum + Pop(j).Fitness
        Loop Until ((PartSum >= RandPoint) Or (j = PopSize))
        SelectIndividual = j
    End Function

Generation of Chromosomes: CrossOver()

    Function CrossOver(Parent1() As Boolean, Parent2() As Boolean, _
                       Child1() As Boolean, Child2() As Boolean, _
                       lchrom As Integer, NCross As Integer, NMutation As Integer, _
                       jcross As Integer, PCross As Single, PMutation As Single)
        Dim j As Integer
        If (Flip(PCross)) Then
            jcross = Rndx(1, lchrom - 1)   'cross-over site is selected between 1 and the last cross site
            NCross = NCross + 1
        Else
            'use the full-length string l, so a bit-by-bit mutation
            'will take place despite the absence of a cross
            jcross = lchrom
        End If
        For j = 1 To jcross   '1st exchange, 1 to 1 and 2 to 2
            Child1(j) = Mutation(Parent1(j), PMutation, NMutation)
            Child2(j) = Mutation(Parent2(j), PMutation, NMutation)
        Next j
        If jcross <> lchrom Then   '2nd exchange, 1 to 2 and 2 to 1
            For j = jcross + 1 To lchrom
                Child1(j) = Mutation(Parent2(j), PMutation, NMutation)
                Child2(j) = Mutation(Parent1(j), PMutation, NMutation)
            Next j
        End If
    End Function

Generation of Chromosomes: Mutation()

    'Mutate an allele with PMutation, count number of mutations
    Function Mutation(Alleleval As Boolean, PMutation As Single, NMutation As Integer) As Boolean
        Dim Mutate As Boolean
        Mutate = Flip(PMutation)
        If Mutate Then
            NMutation = NMutation + 1
            Mutation = Not Alleleval
        Else
            Mutation = Alleleval
        End If
    End Function

    Function Flip(Probability As Single) As Boolean
        If Probability = 1 Then
            Flip = True
        Else
            Flip = (Rnd <= Probability)
        End If
    End Function

Generation of Chromosomes: Generation()

    Sub Generation()
        Dim j As Integer
        Dim Mate1 As Integer
        Dim Mate2 As Integer
        Dim jcross As Integer
        j = 1
        Do
            'Pick a pair of mates
            Mate1 = SelectIndividual(PopSize, SumFitness, OldPop)
            Mate2 = SelectIndividual(PopSize, SumFitness, OldPop)
            'Crossover and mutation - mutation embedded within crossover
            CrossOver OldPop(Mate1).Chromosome, OldPop(Mate2).Chromosome, _
                      NewPop(j).Chromosome, NewPop(j + 1).Chromosome, _
                      lchrom, NCross, NMutation, jcross, PCross, PMutation
            'Decode string, evaluate fitness & record parentage data on both children
            With NewPop(j)
                .x = Decode(.Chromosome, lchrom)
                .Fitness = ObjFunc(.x)
                .Parent1 = Mate1
                .Parent2 = Mate2
                .XSite = jcross
            End With
            With NewPop(j + 1)
                .x = Decode(.Chromosome, lchrom)
                .Fitness = ObjFunc(.x)
                .Parent1 = Mate1
                .Parent2 = Mate2
                .XSite = jcross
            End With
            j = j + 2   'increment population index
        Loop Until (j > PopSize)
    End Sub

What sort of traits do fit chromosomes carry to survive generation after generation of mating and mutation?

Schema
A schema is a similarity template describing a subset of strings with similarities at certain string positions. We can think of it as a pattern-matching device: a schema matches a particular string if at every location in the schema a 1 matches a 1 in the string, or a 0 matches a 0, or a * matches either.

Notation: Schema
For a binary alphabet {0, 1}, we motivate a schema by appending a special symbol *, the don't care symbol, producing a ternary alphabet V = {0, 1, *}. The asterisk matches either a 0 or a 1 at a particular position. This allows us to build schemata, e.g.
H = *11*0**
String A = 0111000

Schema Properties (understanding the building blocks of future solutions)
Schemata and their properties serve as notational devices for rigorously discussing and classifying string similarities. They provide the basic means for analyzing the net effect of reproduction and genetic operators on the building blocks contained within the population.
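The pattern-matching view of a schema described above can be made concrete with a short Python sketch. It checks the slide's example (H = *11*0** against A = 0111000) and enumerates every string a schema represents. The function names are illustrative assumptions, not taken from the SGA code.

```python
def matches(schema, string):
    """True if `string` is an instance of `schema` over the alphabet {0, 1, *}."""
    if len(schema) != len(string):
        return False
    # A fixed position must agree with the string; '*' matches either bit.
    return all(h == "*" or h == a for h, a in zip(schema, string))


def instances(schema):
    """Enumerate every bit string matched by `schema`."""
    results = [""]
    for h in schema:
        options = "01" if h == "*" else h
        results = [prefix + bit for prefix in results for bit in options]
    return results


H = "*11*0**"
A = "0111000"
print(matches(H, A))        # A agrees with H at fixed positions 2, 3 and 5
print(len(instances(H)))    # four don't-care positions give 2^4 strings
```

A schema with r don't-care positions describes 2^r strings, so each string in a population is simultaneously a sample of many schemata.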
Schema Matching
A bit string matches a particular schema if that bit string can be constructed from the schema by replacing each * symbol with the appropriate bit value.
e.g. H = *11*0**
String A = 0111000
String A is an example of the schema H because the string alleles $a_i$ match schema positions $h_i$ at the fixed positions 2, 3 and 5.

Schema Properties (understanding the building blocks of future solutions)
• Defining length of a schema, $\delta(H)$: the distance between the first and last specific string positions.
• Order of a schema, $o(H)$: the number of fixed positions present in the template.
Examples:
• H = 011*1**: defining length $\delta(H) = 5 - 1 = 4$; order $o(H) = 4$.
• H = 0******: defining length $\delta(H) = 0$, because there is only one fixed position; order $o(H) = 1$.

Effects of Crossover on a Schema
What happens to a particular schema when crossover is introduced? Crossover leaves a schema unscathed if it does not cut the schema; otherwise it disrupts it.
e.g. H1 = 1***0, H2 = **11*
H1 is likely to be disrupted by crossover. H2 is relatively unlikely to be destroyed.

Effects of Mutation on a Schema
What happens to a particular schema when mutation is introduced? Mutation at normal, low rates does not disrupt a particular schema very frequently.
e.g. H1 = 110*0, H2 = **11*
H1 is relatively more likely to be disrupted by mutation (it has many alleles that can be altered). H2 is relatively unlikely to be mutated because it has a lower order than H1.

Building blocks
Who shall live and who shall die? Highly fit, low-order, short-defining-length schemata (building blocks) are propagated from generation to generation by giving exponentially increasing samples to the observed best. All this goes on in parallel, with no special bookkeeping or special memory other than our population of n strings.

How do we choose a good coding?
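Before turning to coding choices, the schema properties defined earlier in this section (order, defining length, and exposure to crossover) can be computed directly. The sketch below is a minimal illustration. The helper `cutting_sites` is an assumed name, and it only lists the crossover sites that fall inside a schema; a cut there can disrupt the schema but need not, since the mate may carry the same bits.

```python
def order(schema):
    """o(H): number of fixed (non-*) positions."""
    return sum(1 for h in schema if h != "*")


def defining_length(schema):
    """delta(H): distance between the first and last fixed positions."""
    fixed = [i for i, h in enumerate(schema) if h != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def cutting_sites(schema):
    """Crossover sites k in [1, l-1] lying between the first and last fixed positions.

    Site k means the string is cut between positions k and k+1 (1-based).
    """
    fixed = [i + 1 for i, h in enumerate(schema) if h != "*"]
    if not fixed:
        return []
    return [k for k in range(1, len(schema)) if fixed[0] <= k < fixed[-1]]


for H in ("011*1**", "0******", "1***0", "**11*"):
    print(H, "o(H) =", order(H), "delta(H) =", defining_length(H),
          "cut sites:", cutting_sites(H), "of", len(H) - 1)
```

For H1 = 1***0 every one of the four possible crossover sites falls inside the schema, while for H2 = **11* only one does, which is the sense in which short schemata survive crossover more easily.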
Principles for Choosing a GA Coding
1. Principle of Meaningful Blocks
The user should select a coding so that short, low-order schemata are relevant to the underlying problem and relatively unrelated to schemata over other fixed positions. When we design a coding, we should check the distances between related bit positions.
2. Principle of Minimal Alphabets
The user should select the smallest alphabet that permits a natural expression of the problem. In our previous examples, we were using binary coding. Was that a good choice?

Principles for Choosing a GA Coding: Example
Refer to our previous problem: maximize $f(x) = x^2$, where x is permitted to vary between 0 and 31.

    Binary string   Value x   Non-binary string   Fitness
    01101           13        N                   169
    11000           24        Y                   576
    01000            8        I                    64
    10011           19        T                   361

We want to represent 32 possible values: [0, 31].
• Non-binary coding: 26-letter alphabet {A-Z} plus six digits {1-6}
• Binary coding: {0, 1}

Number of Schemata for a GA Coding
For equality of the number of points in each space, we require
$2^l = k^m$
where l = binary code length, m = non-binary code length, and k = number of symbols in the non-binary alphabet.
The number of schemata may then be calculated as:
• Binary case: $3^l$ (an alphabet of 2 + 1 symbols, the extra one being the asterisk)
• Non-binary case: $(k+1)^m$ (an alphabet of k + 1 symbols, the extra one being the asterisk)
For the problem above, l = 5, k = 32 and m = 1, so the binary coding has $3^5 = 243$ schemata while the non-binary coding has only $33$.
The binary alphabet offers the maximum number of schemata per bit of information of any coding. Since these similarities are the essence of our search, when we design a code we should maximise the number of them available for the GA to exploit.

Genetic algorithms: Exercise
Think of a GA approach to solving the 8-Queens problem.

Genetic algorithms
Variant of local beam search with sexual recombination.
Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 = 28).
Selection probabilities for four boards with fitness 24, 23, 20 and 11: 24/(24+23+20+11) = 31%, 23/(24+23+20+11) = 29%, etc.

Why use Fitness Scaling?
At the start of a GA run, it is common to have a few extraordinary individuals in a population of mediocre colleagues. If left to the selection rule
$pselect_i = f_i / \sum f$
the extraordinary individuals would take over a significant proportion of the finite population in a single generation. This is undesirable, as it leads to premature convergence.

Why use Fitness Scaling?
Late in the run, there may still be significant diversity within the population. However, the population's average fitness may be close to the population's best fitness. If this is left alone,
• average members get nearly the same number of copies in future generations as the best members, and
• the survival of the fittest necessary for improvement becomes a random walk among the mediocre.
In both cases, at the beginning of the run and as the run matures, fitness scaling can help.

Why use Fitness Scaling? Linear Scaling
The scaled fitness $f'$ is a linear function of the raw fitness $f$:
$f' = a f + b$
In all cases, we want $f'_{ave} = f_{ave}$, because subsequent use of the selection procedure will then ensure that each average population member contributes one expected offspring to the next generation.

Why use Fitness Scaling?
To control the number of offspring given to the population member with the maximum raw fitness, we choose the other scaling relationship to obtain a scaled maximum fitness:
$f'_{max} = C_{mult} \cdot f_{ave}$
where $C_{mult}$ is the number of expected copies desired for the best population member. For a typical population size of n = 50 to 100, $C_{mult}$ between 1.2 and 2 has been used successfully.

Fitness Scaling
[Figure: linear scaling under normal conditions; scaled fitness plotted against raw fitness, with $f_{min}$, $f_{ave}$, $f_{max}$ mapped to $f'_{min}$, $f'_{ave}$, $f'_{max}$]

Problem with Linear Scaling
Toward the end of a run, the choice of $C_{mult}$ stretches the raw fitness values significantly. This may in turn cause difficulty in applying the linear scaling rule.
Linear scaling works well early in the run of the GA:
• the few extraordinary individuals get scaled down, and
• the lowly members of the population get scaled up.
The problem: as the run matures, points with low fitness can be scaled to negative values! The stretching required on the relatively close average and maximum raw fitness values causes the low fitness values to go negative after scaling. See for yourself in TestGA2.xls.

Why use Fitness Scaling?
[Figure: difficult situation for linear scaling in a mature run; $f'_{min}$ falls below 0 on the scaled-fitness axis]
Negative fitness violates the non-negativity requirement!

Fitness Scaling
If it is possible to scale to the desired multiple $C_{mult}$, then perform linear scaling. Otherwise, scaling is performed by pivoting about the average value and stretching the fitness until the minimum value maps to zero.
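The scaling rule just described can be sketched in Python. The coefficients follow from the two stated conditions ($f'_{ave} = f_{ave}$, and either $f'_{max} = C_{mult} \cdot f_{ave}$ or $f'_{min} = 0$). The function names, the two sample populations, and the guard for a population whose members all have the same fitness are illustrative assumptions.

```python
def prescale(fmax, favg, fmin, cmult=2.0):
    """Return slope a and intercept b of the linear scaling f' = a*f + b."""
    if fmax == favg:
        # Assumption: every member has the same fitness, so leave it unscaled.
        return 1.0, 0.0
    if fmin > (cmult * favg - fmax) / (cmult - 1.0):
        # Normal scaling: the minimum stays non-negative at the desired multiple.
        delta = fmax - favg
        a = (cmult - 1.0) * favg / delta
        b = favg * (fmax - cmult * favg) / delta
    else:
        # Pivot about the average and stretch until the minimum maps to zero.
        delta = favg - fmin
        a = favg / delta
        b = -fmin * favg / delta
    return a, b


def scale_population(raw, cmult=2.0):
    """Scale a list of raw fitness values; the average is preserved."""
    favg = sum(raw) / len(raw)
    a, b = prescale(max(raw), favg, min(raw), cmult)
    return [a * f + b for f in raw]


early = [169.0, 576.0, 64.0, 361.0]   # x^2 example: one strong individual
late = [0.90, 0.95, 0.96, 1.00]       # assumed mature run: values close together
print(scale_population(early))
print(scale_population(late))
```

In the first population the best member is scaled to exactly twice the average. In the second, that multiple would push the weakest member below zero, so the fallback branch maps it to zero instead and the best member receives less than twice the average.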
Scaling
Non-negative test:
If( min > (fmultiple*avg - max) / (fmultiple - 1.0) ) { Perform Normal Scaling }

Eliminating negative fitness values
Solution: When we cannot scale to the desired Cmult, we still maintain equality of the raw and scaled fitness averages, and we map the minimum raw fitness fmin to a scaled fitness f'min = 0.
Description of Routines:
Prescale: takes the average, maximum, and minimum raw fitness values and calculates the linear scaling coefficients a and b based on the logic described previously. It takes into account whether the desired Cmult can be reached or not.
Scalepop: calls Prescale to obtain a and b, then scales all the individual raw fitness values using the function Scale.

Fitness Scaling
    function scale(u, a, b:real):real;
    { Scale an objective function value }
    begin scale := a * u + b end;

    procedure scalepop(popsize:integer; var max, avg, min, sumfitness:real;
                       var pop:population);
    { Scale entire population }
    var j:integer;
        a, b:real; { slope & intercept for linear equation }
    begin
      prescale(max, avg, min, a, b); { Get slope and intercept for function }
      sumfitness := 0.0;
      for j := 1 to popsize do with pop[j] do begin
        fitness := scale(objective, a, b);
        sumfitness := sumfitness + fitness;
      end;
    end;

Fitness Scaling
    { scale.sga: contains prescale, scale, scalepop for scaling fitnesses }
    procedure prescale(umax, uavg, umin:real; var a, b:real);
    { Calculate scaling coefficients for linear scaling }
    const fmultiple = 2.0; { Fitness multiple is 2 }
    var delta:real; { Divisor }
    begin
      if umin > (fmultiple*uavg - umax) / (fmultiple - 1.0) { Non-negative test }
        then begin { Normal scaling: linear scaling }
          delta := umax - uavg;
          a := (fmultiple - 1.0) * uavg / delta;
          b := uavg * (umax - fmultiple*uavg) / delta;
        end else begin { Scale as much as possible: stretch until minimum maps to zero }
          delta := uavg - umin; { Note: zero if all fitness values are equal }
          a := uavg / delta;
          b := -umin * uavg / delta;
        end;
    end;

Let's try to solve an example using a stored GA run.

Why Scaling?
Simple scaling helps prevent the early domination of extraordinary individuals, while later on it encourages a healthy competition among near equals.

Stretch as much as possible
[Figure: Raw Fitness vs. Scaled Fitness (Multiple Scaling). See GA-Scaling-Goldberg.xls]

Linear Scaling
[Figure: Raw Fitness vs. Scaled Fitness (Normal Scaling); the scaled-fitness axis extends below zero. See GA-Scaling-Goldberg.xls]
This sample shows a violation of the non-negativity requirement. This would never have happened had we followed the multiple scaling scheme described previously.

Multiparameter, Mapped, Fixed-Point Coding
Single U1 parameter (l1 = 4)
    0 0 0 0 -> Umin
    1 1 1 1 -> Umax
    Others map linearly in between.
Multiparameter coding (10 parameters)
    0 0 0 0 | 0 0 0 0 | ....... | 0 0 0 0 | 0 0 0 0 |
       U1        U2       ...        U9        U10
To construct a multiparameter coding, we can simply concatenate as many single-parameter codings as we require. Each coding may have its own sublength and its own Umax and Umin values.
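A concatenated coding of this kind can be decoded in a few lines. The Python sketch below is an illustration, not the deck's implementation: the parameter specifications, the sample chromosome, and the choice to treat the first position of each substring as the least significant bit are assumptions of this example.

```python
def decode(bits):
    """Decode bits as an unsigned integer; bits[0] is the least significant bit."""
    return sum(2 ** j for j, bit in enumerate(bits) if bit)


def map_parm(x, u_max, u_min, full_scale):
    """Map an unsigned integer x in [0, full_scale] linearly onto [u_min, u_max]."""
    return u_min + (u_max - u_min) * x / full_scale


def decode_parms(chrom, specs):
    """Decode a concatenated string.

    specs is a list of (sublength, u_min, u_max) tuples, one per parameter,
    given in the order in which the substrings appear in chrom.
    """
    values = []
    pos = 0
    for length, u_min, u_max in specs:
        if length > 0:
            sub = chrom[pos:pos + length]
            pos += length
            values.append(map_parm(decode(sub), u_max, u_min, 2 ** length - 1))
        else:
            values.append(0.0)
    return values


# Three hypothetical parameters with sublengths 4, 4 and 3.
specs = [(4, -1.0, 1.0), (4, 0.0, 30.0), (3, 0.0, 7.0)]
chrom = [0, 0, 0, 0,  1, 1, 1, 1,  1, 0, 1]
params = decode_parms(chrom, specs)
print(params)
```

The all-zero substring lands on its Umin (-1.0), the all-one substring on its Umax (30.0), and the pattern 1 0 1 decodes to the integer 5, which maps to 5.0 on the range [0, 7].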
Multiparameter, Mapped, Fixed-Point Coding
Extract a substring from a full string
    { The types chromosome and parmspecs and the function power used in these }
    { listings are assumed to be declared elsewhere in the program. }
    procedure extract_parm(var chromfrom, chromto:chromosome;
                           var jposition, lchrom, lparm:integer);
    { Extract a substring from a full string }
    var j, jtarget:integer;
    begin
      j := 1;
      jtarget := jposition + lparm - 1;
      if jtarget > lchrom then jtarget := lchrom; { Clamp if excessive }
      while (jposition <= jtarget) do begin
        chromto[j] := chromfrom[jposition];
        jposition := jposition + 1;
        j := j + 1;
      end;
    end;

Multiparameter, Mapped, Fixed-Point Coding
Map an unsigned binary integer to range [minparm, maxparm]
    function map_parm(x, maxparm, minparm, fullscale:real):real;
    begin
      map_parm := minparm + (maxparm - minparm) / fullscale * x
    end;

Multiparameter, Mapped, Fixed-Point Coding
Coordinates the decoding of all nparms parameters.
    procedure decode_parms(var nparms, lchrom:integer;
                           var chrom:chromosome;
                           var parms:parmspecs);
    var j, jposition:integer;
        chromtemp:chromosome; { Temporary string buffer }
    begin
      j := 1; { Parameter counter }
      jposition := 1; { String position counter }
      repeat
        with parms[j] do if lparm > 0 then begin
          extract_parm(chrom, chromtemp, jposition, lchrom, lparm);
          parameter := map_parm(decode(chromtemp, lparm),
                                maxparm, minparm, power(2.0, lparm) - 1.0);
        end else parameter := 0;
        j := j + 1;
      until j > nparms;
    end;

Multiparameter, Mapped, Fixed-Point Coding
Decode a single parameter
    function decode(chrom:chromosome; lbits:integer):real;
    { Decode string as unsigned binary integer - true=1, false=0 }
    { Position 1 is the least significant bit }
    var j:integer;
        accum, powerof2:real;
    begin
      accum := 0.0; powerof2 := 1;
      for j := 1 to lbits do begin
        if chrom[j] then accum := accum + powerof2;
        powerof2 := powerof2 * 2;
      end;
      decode := accum;
    end;

References
• Genetic Algorithms: Darwin-in-a-box. Presentation by Prof. David E. Goldberg, Department of General Engineering, University of Illinois at Urbana-Champaign, deg@uiuc.edu
• Neural Networks and Fuzzy Logic Algorithms, by Stephen Welstead
• Soft Computing and Intelligent Systems Design, by Fakhreddine Karray and Clarence de Silva
• http://lancet.mit.edu/~mbwall/presentations/IntroToGAs/P001.html
• A Genetic Algorithm Tutorial, Darrell Whitley, Computer Science Department