Today’s Topics
• Read
– For exam: Chapter 13 of textbook
– Not on exam: Sections 14.1 - 14.3 & 14.4.1
• Genetic Algorithms (GAs)
– Mutation
– Crossover
– Fitness-proportional Reproduction
– Premature Convergence
– Building-block Hypothesis
• End of Coverage of SEARCH
10/6/15
CS 540 - Fall 2015 (Shavlik©), Lecture 13, Week 5
Genetic Algorithms (GAs)
• Use ideas of
– Survival of fittest (death)
– Combination of ‘genetic material’ (sex)
– (‘Taxes’ play a role in some algo’s)
– Mutation (randomness)
• Mixing of genes from parents more important than mutation (contrary to popular press)
– About 25,000 human genes
– For simplicity, assume two variants of each
– So 2^25,000 possible combo’s to explore!
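To get a feel for how large 2^25,000 is, here is a quick check. It is only a sketch; the variable names are made up for illustration and are not from the lecture.

```python
import math

NUM_GENES = 25_000       # approximate human gene count quoted on the slide
VARIANTS_PER_GENE = 2    # simplifying assumption from the slide

# Number of distinct gene combinations, held exactly as a Python big integer.
combos = VARIANTS_PER_GENE ** NUM_GENES

# Count the decimal digits with logarithms rather than printing the number.
num_digits = math.floor(NUM_GENES * math.log10(VARIANTS_PER_GENE)) + 1

print(f"2^{NUM_GENES} has {num_digits} decimal digits")
```

Exhaustively trying every combination is hopeless at this scale, which is why the search has to be guided by fitness.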
Basic FRAMEWORK for GAs
(many possible ALGORITHMS)
1. Create initial population of entities
2. Evaluate each entity using a fitness function
3. Discard worst N% of entities
4. K times, stochastically grab ‘best’ parents (fitness proportional reproduction)
   i. Combine them (crossover) to create new entities
   ii. Make some random changes (mutation)
5. Goto 2
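The framework above can be sketched as one generic loop. This is a minimal illustration, not the lecture's algorithm: `run_ga` and its parameters are hypothetical names, K is assumed to be however many children are needed to refill the population, survivors are assumed to be kept unchanged, and the toy fitness (number of 1 bits) is chosen only to make the example runnable.

```python
import random

def run_ga(init_population, fitness, crossover, mutate,
           discard_fraction=0.5, generations=50, rng=None):
    """Generic GA loop following the five framework steps (hypothetical helper)."""
    rng = rng or random.Random()
    population = list(init_population)                 # step 1
    for _ in range(generations):                       # step 5: goto 2
        # Step 2: evaluate each entity (sort best-first by fitness).
        scored = sorted(population, key=fitness, reverse=True)
        # Step 3: discard the worst N% (always keep at least two parents).
        num_keep = max(2, int(len(scored) * (1 - discard_fraction)))
        survivors = scored[:num_keep]
        weights = [fitness(e) for e in survivors]
        if sum(weights) <= 0:
            weights = [1.0] * len(survivors)           # fall back to uniform choice
        # Step 4: pick parents in proportion to fitness, then crossover + mutate.
        children = []
        while len(survivors) + len(children) < len(population):
            mom, dad = rng.choices(survivors, weights=weights, k=2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = survivors + children
    return max(population, key=fitness)

def one_point_crossover(a, b, rng):
    point = rng.randrange(1, len(a))                   # cut somewhere inside the string
    return a[:point] + b[point:]

def flip_one_bit(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

# Toy problem (an assumption for illustration): maximize the number of 1 bits.
rng = random.Random(0)
start = [[rng.randint(0, 1) for _ in range(20)] for _ in range(30)]
best = run_ga(start, fitness=sum, crossover=one_point_crossover,
              mutate=flip_one_bit, rng=rng)
print(sum(best), "ones out of 20")
```

Because survivors are carried over unmutated in this sketch, the best fitness in the population never decreases from one generation to the next.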
Representing Entities as Bit Strings
• Assume we represent our problem as a bit string (but any data structure ok for GAs)
• Cross Over (example on next slide)
– Pick two entities, A and B
– Choose a cross-over location
– Copy first part of A and last part of B
– Copy first part of B and last part of A
• Mutation
– Randomly flip 1 or more bits
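The two operators above might look like this for entities stored as Python strings of '0' and '1'. This is a sketch under assumptions: the function names are illustrative, and mutation here flips each bit independently with a small probability, which is one common reading of "randomly flip 1 or more bits" (it can occasionally flip none).

```python
import random

def crossover(a, b, point):
    """One-point crossover: return both children of bit strings a and b."""
    child_c = a[:point] + b[point:]   # first part of A, last part of B
    child_d = b[:point] + a[point:]   # first part of B, last part of A
    return child_c, child_d

def mutate(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < flip_prob else bit
        for bit in bits
    )

rng = random.Random(42)
entity_a = "11111111"
entity_b = "00000000"
point = rng.randrange(1, len(entity_a))   # random cut, never at either end
child_c, child_d = crossover(entity_a, entity_b, point)
print(point, child_c, child_d)
print(mutate(child_c, 0.1, rng))
```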
Crossover Example
Entity A   1011001 | 011001110
Entity B   0010011 | 101010100
( | = randomly chosen ‘cross over’ point)
Child C    1011001 | 101010100
Child D    0010011 | 011001110
Aside: My Family Phones
My cell phone (#’s changed for anonymity): 406-0917
My wife’s cell phone: 328-3729
Our daughter’s cell phone: 328-0917
Typical Design
• Discard Worst HALF of Population
• Generate Children to Refill Population
• Keep Parents and Generated Children
• ‘Flip’ a Small Fraction of Bits (eg, 0.1%)
– Flip bits in all members of population
Fitness-Proportional Reproduction
• Let F_i be the fitness of entity i
• Assume the F_i are non-negative (if not, use e^(F_i) as the fitness for the GA)
• Let F_total = ∑ F_i   // Sum the fitness of all the entities
Prob(entity i chosen) = F_i / F_total
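The selection rule above can be sketched as a spin of a wheel whose wedges are sized by fitness. The function names and the example fitness values are assumptions for illustration, and the sketch assumes at least one entity has positive fitness so that F_total > 0.

```python
import math
import random
from itertools import accumulate

def selection_probs(fitnesses):
    """Prob(entity i chosen) = F_i / F_total; use e^(F_i) if any F_i is negative."""
    if min(fitnesses) < 0:
        fitnesses = [math.exp(f) for f in fitnesses]
    total = sum(fitnesses)            # assumed to be > 0
    return [f / total for f in fitnesses]

def spin_wheel(entities, fitnesses, rng):
    """Pick one entity with probability proportional to its fitness."""
    cumulative = list(accumulate(selection_probs(fitnesses)))
    spin = rng.random()               # where the arrow stops, in [0, 1)
    for entity, edge in zip(entities, cumulative):
        if spin < edge:
            return entity
    return entities[-1]               # guard against floating-point round-off

rng = random.Random(7)
entities = ["A", "B", "C", "D", "E"]
fitnesses = [5, 3, 1, 1, 0]           # hypothetical fitness values
picks = [spin_wheel(entities, fitnesses, rng) for _ in range(1000)]
print({e: picks.count(e) for e in entities})
```

Python's standard library offers the same behavior through `random.choices(entities, weights=fitnesses)`; the explicit loop is shown only to make the mechanism visible.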
Roulette-Wheel View
- spin arrow and see where it stops
(pie-wedge size proportional to fitness)
[Figure: pie chart of Fitness with one wedge for each of the entities A, B, C, D, and E]
A GA Approach to Supervised ML
• Assume we want to learn a model of the form (and all of our N features are numeric)
if [ ∑ weight_i × feature_i ] > threshold
then return POS else return NEG
• Representation of Entities?
– See next slide
• Fitness?
– Accuracy on TRAIN set, plus maybe some points for being different from rest of population
• Role of Tuning Set?
– Could choose best member of population when done
– If we use ALL of population (an ‘ensemble’), could weight each member’s predictions
Possible Representation of Entities
| 16 bits for weight_1 | … | 16 bits for weight_N | 16 bits for threshold |
Notes
1) we might only use 16 bits so weights are small (Occam’s Razor)
2) first bit could be SIGN (or use “2’s complement”)
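One way this representation could be decoded and scored is sketched below. It assumes the two's-complement option from note 2, integer-valued weights with no scaling, and fitness equal to plain TRAIN-set accuracy without any diversity bonus. All function names, including the `encode_field` helper used to build the example, are hypothetical.

```python
FIELD_BITS = 16  # bits per weight and for the threshold, as on the slide

def decode_field(bits):
    """Interpret a 16-bit string as a two's-complement integer."""
    value = int(bits, 2)
    return value - (1 << FIELD_BITS) if bits[0] == "1" else value

def encode_field(value):
    """Inverse of decode_field (hypothetical helper used only to build examples)."""
    return format(value & 0xFFFF, "016b")

def decode_entity(bits, num_features):
    """Split the bit string into N weights followed by one threshold."""
    fields = [decode_field(bits[i:i + FIELD_BITS])
              for i in range(0, FIELD_BITS * (num_features + 1), FIELD_BITS)]
    return fields[:-1], fields[-1]

def predict(bits, features):
    weights, threshold = decode_entity(bits, len(features))
    total = sum(w * x for w, x in zip(weights, features))
    return "POS" if total > threshold else "NEG"

def train_accuracy(bits, train_set):
    """Fitness: fraction of (features, label) TRAIN examples classified correctly."""
    correct = sum(predict(bits, x) == label for x, label in train_set)
    return correct / len(train_set)

# Entity with weight_1 = 2, weight_2 = -1, threshold = 3 (made-up values).
entity = encode_field(2) + encode_field(-1) + encode_field(3)
train = [((4, 1), "POS"), ((1, 2), "NEG")]   # made-up training examples
print(decode_entity(entity, 2), train_accuracy(entity, train))
```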
Design Tip
• Design your space of entities so that most are viable (ie, get a non-zero fitness)
• Otherwise the GA will waste a lot of CPU cycles generating useless entities
Premature Convergence (‘Inbreeding’)
• If not careful, entire population can become minor variations of a small number of ‘bit vectors’
• Eg, consider crossing over A and child_of_A
– child_of_A is already about ½ a copy of A, so the result will be, on average, about ¾ a copy of A
• Solutions
– Don’t crossover with ‘recent’ descendant
– Mutate more (but might destroy good traits)
GAs as Searching a Space
Consider the space defined by single-bit mutations
[Figure: graph whose nodes are bit strings such as 101…01, 101…00, 001…01, 011…10, 001…00, 101…10, etc]
What is a CROSSOVER?
– Grab any two nodes (might not be adjacent)
– ‘Hyper jump’ to a possibly distant 3rd node
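The difference between the two moves can be made concrete with Hamming distance (the number of differing bits). In this small sketch, with made-up parent strings and illustrative function names, every single-bit mutation lands on an adjacent node, while one crossover lands several steps away from both parents.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def single_bit_neighbors(bits):
    """All nodes reachable from `bits` by one single-bit mutation."""
    flip = {"0": "1", "1": "0"}
    return [bits[:i] + flip[bits[i]] + bits[i + 1:] for i in range(len(bits))]

parent_a = "11110000"
parent_b = "00001111"
child = parent_a[:4] + parent_b[4:]   # crossover at the midpoint

# Every mutation neighbor is exactly one step away.
print(all(hamming(parent_a, n) == 1 for n in single_bit_neighbors(parent_a)))
# The crossover child is four steps away from each parent.
print(child, hamming(parent_a, child), hamming(parent_b, child))
```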
Building-Block Hypothesis
• GAs work well when overall task has subtasks
• Fitness function gives credit for being able to solve subtasks
• Crossover ‘mixes and matches’ solutions to subtasks
• Eg, consider building cars
– Need engine, wheels, windows, brakes, etc
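A toy illustration of the hypothesis follows. It assumes each subtask is encoded in its own block of 4 bits and that fitness awards one point per solved block. The block size, target string, and parents are all made up for this sketch.

```python
BLOCK_SIZE = 4  # hypothetical: each subtask occupies 4 consecutive bits

def block_fitness(bits, target):
    """Give one point of credit per subtask (block) that matches the target."""
    starts = range(0, len(target), BLOCK_SIZE)
    return sum(bits[i:i + BLOCK_SIZE] == target[i:i + BLOCK_SIZE] for i in starts)

target   = "1111" "0000" "1010"
parent_a = "1111" "0000" "0110"   # solves subtasks 1 and 2
parent_b = "0101" "0110" "1010"   # solves subtask 3

# One-point crossover at a block boundary mixes and matches the solved subtasks.
child = parent_a[:8] + parent_b[8:]

print(block_fitness(parent_a, target),
      block_fitness(parent_b, target),
      block_fitness(child, target))
```

If the fitness function instead gave credit only for a fully correct string, both parents would score zero and selection would have nothing to favor them by.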
Which Fitness Function Better for GAs?
[Figure: two plots, each with axes labeled Fitness and State Space]
Genetic Programming
• Entities need not be bit strings
• Often ‘genetic programming’ used
for richer rep’s of entities
– Decision trees
– Neural networks
– Code snippets
– Etc
In-Class HW
• Design Genetic Programming Approach
for Creating Good Decision Trees
• Think for 2-3 Mins before Raising Hand
GA Wrapup
• Can come up with quite
creative solutions since
many possibilities considered
• Might be too undirected?
• Designing good fitness functions can
be a challenge
• Make more sense as computing power increases
End of Search
• We’re done with search in
discrete spaces
• SEARCH is a powerful, general-purpose
way to look at problem solving
• Next: probabilistic reasoning (but we’ll
return to viewing AI tasks from the
perspective of search periodically)