Today’s Topics
• Read
  – For exam: Chapter 13 of textbook
  – Not on exam: Sections 14.1 - 14.3 & 14.4.1
• Genetic Algorithms (GAs)
  – Mutation
  – Crossover
  – Fitness-proportional Reproduction
  – Premature Convergence
  – Building-block Hypothesis
• End of Coverage of SEARCH

10/6/15 CS 540 - Fall 2015 (Shavlik©), Lecture 13, Week 5

Genetic Algorithms (GAs)
• Use ideas of
  – Survival of fittest (death)
  – Combination of ‘genetic material’ (sex)
  – (‘Taxes’ play a role in some algo’s)
  – Mutation (randomness)
• Mixing of genes from parents more important than mutation (contrary to popular press)
  – About 25,000 human genes
  – For simplicity, assume two variants of each
  – So 2^25,000 possible combo’s to explore!

Basic FRAMEWORK for GAs (many possible ALGORITHMS)
1. Create initial population of entities
2. Evaluate each entity using a fitness function
3. Discard worst N% of entities
4. K times, stochastically grab ‘best’ parents (fitness-proportional reproduction)
   i. Combine them (crossover) to create new entities
   ii. Make some random changes (mutation)
5. Go to 2

Representing Entities as Bit Strings
• Assume we represent our problem as a bit string (but any data structure ok for GAs)
• Cross Over (example on next slide)
  – Pick two entities, A and B
  – Choose a cross-over location
  – Child C: copy first part of A and last part of B
  – Child D: copy first part of B and last part of A
• Mutation
  – Randomly flip 1 or more bits

Crossover Example
(‘|’ marks the randomly chosen ‘cross over’ point)
Entity A   1 0 1 1 0 0 | 1 0 1 1 0 0 1 1 1 0
Entity B   0 0 1 0 0 1 | 1 1 0 1 0 1 0 1 0 0
Child C    1 0 1 1 0 0 | 1 1 0 1 0 1 0 1 0 0
Child D    0 0 1 0 0 1 | 1 0 1 1 0 0 1 1 1 0

Aside: My Family Phones
(#’s changed for anonymity)
My cell phone               406-0917
My wife’s cell phone        328-3729
Our daughter’s cell phone   328-0917

Typical Design
• Discard Worst HALF of Population
• Generate Children to Refill Population
• Keep Parents and Generated Children
• ‘Flip’ a Small Fraction of Bits (eg, 0.1%)
  – Flip bits in all members of population

Fitness-Proportional Reproduction
• Let F_i be the fitness of entity i
• Assume F_i are non-negative (if not, use e^(F_i) as the fitness for the GA)
• Let F_total = ∑ F_i   // Sum the fitness of all the entities
• Prob(entity i chosen) = F_i / F_total

Roulette-Wheel View
• Spin arrow and see where it stops (pie-wedge size proportional to fitness)
[Figure: roulette wheel with one wedge per entity (A, B, C, D, E), wedge size proportional to fitness]

A GA Approach to Supervised ML
• Assume we want to learn a model of the form (and all of our N features are numeric)
    if [ ∑ weight_i × feature_i ] > threshold then return POS else return NEG
• Representation of Entities?
  – See next slide
• Fitness?
  – Accuracy on TRAIN set, plus maybe some points for being different from rest of population
• Role of Tuning Set?
  – Could choose best member of population when done
  – If we use ALL of population (an ‘ensemble’), could weight each member’s predictions

Possible Representation of Entities
[ 16 bits for weight_1 ] … [ 16 bits for weight_N ] [ 16 bits for threshold ]
Notes
1) we might only use 16 bits so weights are small (Occam’s Razor)
2) first bit could be SIGN (or use “2’s complement”)

Design Tip
• Design your space of entities so that most are viable (ie, get a non-zero fitness)
• Otherwise will waste a lot of cpu cycles generating useless entities

Premature Convergence (‘Inbreeding’)
• If not careful, entire population can become minor variations of a small number of ‘bit vectors’
• Eg, consider crossing over A and child_of_A
  – Result will be ¾ a copy of A
• Solutions
  – Don’t crossover with ‘recent’ descendants
  – Mutate more (but might destroy good traits)

GAs as Searching a Space
• Consider the space defined by single-bit mutations
[Figure: graph whose nodes are bit strings (101…01, 101…00, 001…01, 011…10, 001…00, 101…10, etc), with arcs for single-bit mutations]
• What is a CROSSOVER?
  – Grab any two nodes (might not be adjacent)
  – ‘Hyper jump’ to a possibly distant 3rd node

Building-Block Hypothesis
• GAs work well when overall task has subtasks
• Fitness function gives credit for being able to solve subtasks
• Crossover ‘mixes and matches’ solutions to subtasks
• Eg, consider building cars
  – Need engine, wheels, windows, brakes, etc

Which Fitness Function Better for GAs?
[Figure: two plots, each showing Fitness as a function of position in State Space]

Genetic Programming
• Entities need not be bit strings
• Often ‘genetic programming’ used for richer rep’s of entities
  – Decision trees
  – Neural networks
  – Code snippets
  – Etc

In-Class HW
• Design Genetic Programming Approach for Creating Good Decision Trees
• Think for 2-3 Mins before Raising Hand

GA Wrapup
• Can come up with quite creative solutions since many possibilities considered
• Might be too undirected?
• Designing good fitness functions can be a challenge
• Make more sense as computing power increases

End of Search
• We’re done with search in discrete spaces
• SEARCH is a powerful, general-purpose way to look at problem solving
• Next: probabilistic reasoning (but we’ll return to viewing AI tasks from the perspective of search periodically)
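
Aside: A GA Sketch in Code
The ‘Basic FRAMEWORK for GAs’, ‘Typical Design’, and ‘Fitness-Proportional Reproduction’ slides can be tied together in a short Python sketch. This is one possible algorithm within the framework, not the course's reference implementation. The function names, the default parameter values, and the use of sum (count of 1 bits) as a toy fitness function are all assumptions made for illustration.

```python
import random


def crossover(a, b, point):
    """Single-point crossover: C = head of a + tail of b, D = head of b + tail of a."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in bits]


def select_parent(population, fitnesses, rng):
    """Roulette wheel: Prob(entity i chosen) = F_i / F_total (F_i must be >= 0)."""
    total = sum(fitnesses)
    if total == 0:
        # Every wedge has zero size, so fall back to a uniform choice.
        return rng.choice(population)
    spin = rng.random() * total
    running = 0.0
    for entity, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return entity
    return population[-1]  # guard against floating-point round-off


def run_ga(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.001, seed=0):
    """Hypothetical driver: all default values here are illustrative assumptions."""
    rng = random.Random(seed)
    # Step 1: create initial population of random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate each entity; Step 3: discard the worst half.
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[:pop_size // 2]
        scores = [fitness(entity) for entity in survivors]
        # Step 4: refill the population with children of stochastically chosen parents.
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a = select_parent(survivors, scores, rng)
            parent_b = select_parent(survivors, scores, rng)
            point = rng.randint(1, n_bits - 1)
            children.extend(crossover(parent_a, parent_b, point))
        population = (survivors + children)[:pop_size]
        # Flip a small fraction of bits in all members of the population.
        population = [mutate(entity, mutation_rate, rng) for entity in population]
        # Step 5: go to step 2 (next loop iteration).
    return max(population, key=fitness)


if __name__ == "__main__":
    best = run_ga(sum)  # toy fitness: number of 1 bits
    print(best, sum(best))
```

Passing a fixed seed makes runs repeatable, which helps when comparing mutation rates or population sizes. Note that nothing here prevents a parent from being crossed with itself or with a recent descendant, so this sketch is exposed to the premature-convergence problem discussed in the slides.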
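
Aside: Decoding a Bit-String Entity
The ‘Possible Representation of Entities’ slide packs N weights and a threshold into 16-bit fields. A minimal sketch of decoding such an entity and scoring it by TRAIN-set accuracy might look like the following. It assumes the two's-complement option from the slide's notes and treats each field as a plain signed integer with no scaling. The helper names and the (features, label) example format are hypothetical.

```python
BITS_PER_VALUE = 16  # field width from the slide


def bits_to_signed(bits):
    """Read a list of 0/1 values as a two's-complement integer (first bit is the sign)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    if bits[0] == 1:
        value -= 1 << len(bits)
    return value


def decode(entity, n_features):
    """Split the entity into 16-bit fields: N weights followed by the threshold."""
    fields = [entity[i:i + BITS_PER_VALUE]
              for i in range(0, len(entity), BITS_PER_VALUE)]
    values = [bits_to_signed(field) for field in fields]
    return values[:n_features], values[n_features]


def predict(entity, features):
    """if [ sum of weight_i * feature_i ] > threshold then POS else NEG."""
    weights, threshold = decode(entity, len(features))
    total = sum(w * f for w, f in zip(weights, features))
    return "POS" if total > threshold else "NEG"


def train_accuracy(entity, examples):
    """Fitness: fraction of (features, label) TRAIN examples classified correctly."""
    correct = sum(1 for features, label in examples
                  if predict(entity, features) == label)
    return correct / len(examples)
```

Because accuracy is never negative, it can be used directly as F_i in fitness-proportional reproduction. The bonus ‘for being different from rest of population’ mentioned in the slides is left out of this sketch.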