Genetic Algorithms

Problem: find an optimal solution.
Solution, so far: search through all possible solutions to find the best one.
Solution with genetic algorithms: start from a set of solutions and keep on evolving the
solutions until the “best fit” to the real solution is found.
Each solution is a bit string (for example, concatenated values of features). This bit string
is called a chromosome.
Bits that comprise one feature are called a gene.
Bits within a gene are called alleles.
The position of a bit is called a locus.
For example, let us assume that we have samples 1 and 2, and the task is to find the
optimal value of F2 for sample 3 such that the class of that sample is either A or B.
Sample   Feature F1   Feature F2   Class
S1       1            2            A
S2       4            5            B
S3       2            x=?
Traditional mathematics can use various distance and error measures, for example Euclidean
distance and minimum squared error. We would start from a solution and somehow keep on
evolving it, for example by slightly moving the sample value. For example, we would
start with x=2, and calculate the distance from point S3(2,2) to the points in cluster A.
Then we would pick another point, for example x=2.2, and see if the error improves. If
not, we would try x=1.8. And so on, we would try all points at distances i*δ away from
the original point, where δ is a predetermined step size, and i is a counter. If the error did
not improve, we could repeat the process for another starting value of x.
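The step-wise search described above can be sketched as follows. This is only an illustration under assumptions that are not in the notes: the error is taken to be the Euclidean distance from S3 = (2, x) to S1 = (1, 2), the only class-A sample in the table, and the names distance_to_a and step_search, the starting value 3.0 and the limit of 50 steps are hypothetical.

```python
import math

def distance_to_a(x):
    """Assumed error: Euclidean distance from S3 = (2, x) to S1 = (1, 2)."""
    return math.hypot(2 - 1, x - 2)

def step_search(error, x0, delta=0.2, max_steps=50):
    """Try x0 + i*delta and x0 - i*delta for i = 1..max_steps and keep the best x."""
    best_x, best_err = x0, error(x0)
    for i in range(1, max_steps + 1):
        for candidate in (x0 + i * delta, x0 - i * delta):
            err = error(candidate)
            if err < best_err:
                best_x, best_err = candidate, err
    return best_x, best_err

# Hypothetical starting value x0 = 3.0; the notes start from x = 2.
best_x, best_err = step_search(distance_to_a, x0=3.0)
print(best_x, best_err)
```

Every candidate on the grid is evaluated, which is the exhaustive flavour of search that genetic algorithms try to avoid.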
Genetic algorithms work by putting the solution into the form of a binary string.
The solution is based on a schema, a template in which each * is a don’t-care position that
matches either 0 or 1:
**00**11*
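As a small illustration of how a schema is used, the sketch below checks whether a chromosome fits a schema. The function name matches and the two test chromosomes are made up for this example.

```python
def matches(schema, chromosome):
    """Return True if the chromosome agrees with the schema at every fixed position."""
    if len(schema) != len(chromosome):
        return False
    # A '*' in the schema is a don't-care and accepts either bit.
    return all(s == "*" or s == c for s, c in zip(schema, chromosome))

schema = "**00**11*"
print(matches(schema, "100010110"))  # fixed positions agree
print(matches(schema, "101010110"))  # third bit is 1, schema requires 0
```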
Genetic algorithm:
1. Generate a random population of n chromosomes (i.e. suitable solutions for the
problem).
2. Evaluate the fitness of each chromosome x in the population.
3. While the fitness is below the desired level, create a new population:
   1. Select:
      Select two parent chromosomes from the population according to their
      fitness (the better the fitness, the bigger the chance of being selected; a
      chromosome can be selected several times).
   2. Crossover:
      Using a prespecified crossover probability, cross over the parents to form
      new offspring (children). If no crossover is performed, the offspring are
      exact copies of the parents.
   3. Mutate:
      Using a prespecified mutation probability, mutate the new offspring at each
      locus.
   4. Place the new offspring in the new population.
   5. Evaluate:
      Evaluate the fitness of each chromosome x in the new population.
4. Return the best solution in the current population.
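The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation. It assumes fitness-proportional selection, single point crossover, a non-negative fitness function, and an extra cap on the number of generations. The name run_ga, all parameter names and defaults, and the toy fitness (the number of 1 bits) are assumptions made for the example.

```python
import random

def run_ga(fitness, n_bits, target, pop_size=20, p_cross=0.7, p_mut=0.01,
           max_generations=200, seed=0):
    """Evolve bit-list chromosomes until the best fitness reaches target."""
    rng = random.Random(seed)
    # 1. Generate a random population of chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    # 2. Evaluate the fitness of each chromosome.
    scores = [fitness(c) for c in population]
    generation = 0
    # 3. Create new populations while the fitness is below the desired level.
    while max(scores) < target and generation < max_generations:
        new_population = []
        # Assumed: fitness is non-negative; epsilon avoids all-zero weights.
        weights = [s + 1e-9 for s in scores]
        while len(new_population) < pop_size:
            # 3.1 Select two parents with probability proportional to fitness.
            p1, p2 = rng.choices(population, weights=weights, k=2)
            # 3.2 Crossover with probability p_cross, otherwise copy the parents.
            if rng.random() < p_cross:
                point = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:point] + p2[point:], p2[:point] + p1[point:]
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):
                # 3.3 Mutate each locus with probability p_mut.
                for locus in range(n_bits):
                    if rng.random() < p_mut:
                        child[locus] = 1 - child[locus]
                # 3.4 Place the offspring in the new population.
                new_population.append(child)
        population = new_population[:pop_size]
        # 3.5 Evaluate the new population.
        scores = [fitness(c) for c in population]
        generation += 1
    # 4. Return the best solution in the current population.
    best = max(range(pop_size), key=lambda i: scores[i])
    return population[best], scores[best], generation

# Toy problem (an assumption): maximise the number of 1 bits in 16 bits.
best, score, generations = run_ga(fitness=sum, n_bits=16, target=16)
print(best, score, generations)
```

The generation cap is there so that the sketch always terminates, even when the desired fitness level is never reached.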
Variables to be selected are:
1. type and size of chromosomes
   1. define alleles based on specified minimum precision
2. number and values of initial chromosomes
3. crossover probability and strategy
4. mutation probability
5. evaluation, i.e. assigning fitness to chromosomes and their offspring
6. selecting new population
7. stopping criteria
Type and size of chromosomes
1. Binary encoding
The most common way is to have each feature represented as a binary string. The number of
bits is determined from this formula:
(b - a) / (2^m - 1) ≤ required_precision
where [a, b] is the range of values for the feature, m is the number of bits, and
required_precision is assigned by the user.
If we call C = (b - a) / (2^m - 1) for short, then a bit string whose unsigned integer
value is decimal represents the feature value
value = a + decimal * C
2. Gray Encoding
3. Value Encoding
4. Permutation encoding
Swap all instances of values to be swapped.
E.g.: in order to swap EB for CD, swap E and C in chromosome 1, and swap B and D in
chromosome 2.
ADE BC
ADC BE
ABC DE
AEC DB
ACE DB
ACE BD
5. etc.
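For the binary encoding (item 1 above), the number of bits and the decoding step can be worked out as in the sketch below. The function names bits_needed and decode, and the example range [0, 10] with precision 0.01, are assumptions chosen only for illustration.

```python
def bits_needed(a, b, required_precision):
    """Smallest m such that (b - a) / (2**m - 1) <= required_precision."""
    if required_precision <= 0:
        raise ValueError("required_precision must be positive")
    m = 1
    while (b - a) / (2 ** m - 1) > required_precision:
        m += 1
    return m

def decode(bits, a, b):
    """Map a bit string to a value in [a, b] using value = a + decimal * C."""
    m = len(bits)
    c = (b - a) / (2 ** m - 1)   # step size C between neighbouring codes
    decimal = int(bits, 2)       # unsigned integer value of the bit string
    return a + decimal * c

m = bits_needed(0, 10, 0.01)     # 10 bits: 10 / 1023 is below 0.01, 10 / 511 is not
print(m, decode("0" * m, 0, 10), decode("1" * m, 0, 10))
```

The all-zero string decodes to a and the all-one string decodes to b, so the codes cover the whole range in steps of C.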
Crossover
1. Single point crossover
Pick a point in the chromosomes, and “swap” corresponding tail portions of two parent
chromosomes to make up an offspring.
Parents:   11111 111
           00000 000
Offspring: 11111000
2. Two point crossover
Same as above except that the split is not at one point, but at two, and we swap the
middle.
Parents:   1 1111 111
           0 0000 000
Offspring: 10000111
3. Arithmetic crossover
4. Uniform crossover
5. etc.
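The single point and two point examples above can be reproduced with string slicing. The function names and the convention of returning both offspring are choices made for this sketch.

```python
def single_point(parent1, parent2, point):
    """Swap the tails of two parents after position point."""
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])

def two_point(parent1, parent2, first, second):
    """Swap the middle section between positions first and second."""
    return (parent1[:first] + parent2[first:second] + parent1[second:],
            parent2[:first] + parent1[first:second] + parent2[second:])

# Same parents and split points as in the examples above.
print(single_point("11111111", "00000000", 5))  # ('11111000', '00000111')
print(two_point("11111111", "00000000", 1, 5))  # ('10000111', '01111000')
```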
Mutation
Each bit can be mutated (i.e. inverted) or not, according to mutation probability.
Example: problem 10.8
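A bit-flip mutation of this kind might look like the sketch below. The function name mutate and the optional rng argument are assumptions for illustration.

```python
import random

def mutate(chromosome, p_mut, rng=random):
    """Invert each bit of a bit string independently with probability p_mut."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < p_mut else bit
                   for bit in chromosome)

print(mutate("10110", 0.0))  # probability 0: nothing changes
print(mutate("10110", 1.0))  # probability 1: every bit is inverted
```

In practice the mutation probability is usually small, so that most offspring pass through unchanged.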
Fitness
Assign some fitness function.
Selection
How do we select the fittest chromosomes which will be used for crossover?
1. Roulette wheel selection
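The probability of selecting a chromosome is proportional to its fitness.
2. Ranking
Chromosomes are ranked based on fitness, and the probability of selection depends on the
rank rather than on the raw fitness value.
3. Tournament (Selecting the fittest)
A small group of chromosomes is drawn at random, and only the fittest of the group is
selected for crossover.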
4. Elitism
At each iteration, save the best chromosome and copy it into the new population.
5. etc.
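Three of the strategies above can be sketched as follows. The function names are hypothetical, and the tournament uses the common formulation in which k contenders are drawn at random and the fittest of them wins, which is an assumption beyond what these notes spell out. Roulette wheel selection as written assumes non-negative fitness values.

```python
import random

def roulette_wheel(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return chromosome
    return population[-1]  # guards against rounding at the upper end

def tournament(population, fitnesses, k, rng):
    """Draw k distinct contenders at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitnesses[i])]

def elite(population, fitnesses):
    """Return the best chromosome, to be copied unchanged into the new population."""
    return population[max(range(len(population)), key=lambda i: fitnesses[i])]

rng = random.Random(0)
population = ["0001", "0111", "1111"]
fitnesses = [1, 3, 4]  # toy fitness: number of 1 bits
print(roulette_wheel(population, fitnesses, rng))
print(tournament(population, fitnesses, k=2, rng=rng))
print(elite(population, fitnesses))
```

A larger tournament size k puts more pressure towards the fittest chromosomes; with k equal to the population size the tournament always returns the best one.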
Termination
1. When there is not much difference between iterations
2. When we completed a given number of iterations
3. etc.
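The two stopping criteria above could be combined as in this sketch. The function name should_stop, the tolerance, and the patience window (how many recent iterations are compared) are assumptions for illustration.

```python
def should_stop(best_history, max_iterations, tolerance=1e-6, patience=10):
    """best_history holds the best fitness found in each iteration so far."""
    # Criterion 2: the given number of iterations has been completed.
    if len(best_history) >= max_iterations:
        return True
    # Criterion 1: the best fitness barely changed over the last iterations.
    if len(best_history) > patience:
        recent = best_history[-(patience + 1):]
        return max(recent) - min(recent) < tolerance
    return False

print(should_stop([1, 2, 3], max_iterations=3))       # iteration limit reached
print(should_stop([5.0] * 11, max_iterations=100))    # no change for 10 iterations
print(should_stop([1, 2, 3], max_iterations=100))     # keep going
```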