Lecture 36




Log into Windows/ACENET. Download and
extract TravelingSalesman.zip. Open the
extracted project folders until you reach the
solution file, then double-click it to start
MS Visual Studio.
Final programming project vs. final
programming exam?
Questions?
Monday, April 11
CS 205 Programming for the Sciences - Lecture 36
1
Outline


Evolutionary computation aka "genetic
algorithms"

Terminology

Basic algorithm
Traveling salesman problem
Evolutionary Computation



Evolutionary computation is an area of
computing devoted to computing techniques
that mimic biological evolution.
This type of computation was first conceived in
the 1950s and 1960s, but these techniques
require a large amount of computing resources,
so until the 1990s it was mostly a research
topic.
Now that CPUs are very fast and memory is
very cheap, anyone can explore this area.
Evolutionary Computation

Some terminology




Fitness – a measure of the quality of an individual
item in a population compared to an ideal. Only the
fittest individuals survive to the next generation.
Crossover – similar to biological reproduction.
Characteristics of one or more parent individuals
are combined by some algorithm to form a new
individual.
Mutation – a random change in some portion of an
individual.
Growth – a small change in an individual directed
toward a better fit to a goal.
Evolutionary Computation


Evolutionary algorithm (EA) is a generic term
that encompasses all programming methods
that rely on evolution to achieve a stated goal.
There are four recognized subareas of
evolutionary computing:


Evolutionary programming – EA that relies
primarily on mutation
Evolutionary strategy – EA that combines
mutation and crossover
Evolutionary Computation


Genetic algorithm – often used interchangeably
with EA, but also is used to mean EA that relies
primarily on crossover
Genetic programming – evolutionary techniques
applied to creating a program to solve a problem.
Typically the output is a LISP program.
Basic Evolutionary Algorithm



An original population is selected. Typically this
is done in a random fashion so that each
member of the population represents a solution
to the problem. This is the first generation.
The individuals are assessed to determine their
fitness. Typically, they are sorted with the
fittest on top.
After sorting, parents are selected from the
population. They are the individuals that are
the most fit. E.g., the top 30%.
Basic Evolutionary Algorithm


Crossover is performed using the parents to
generate children. E.g., two randomly selected
parents may be chosen with parts of each
going together to form a child. This process
continues until the full population is produced.
E.g. if 30% of the population are parents, then
the other 70% may be replaced by children.
Mutation is performed, making small random
changes to certain elements of an individual
without regard to whether they are beneficial.
Usually only a small percentage of the
population is mutated.
Basic Evolutionary Algorithm


Growth operations may be performed. Growth
allows some small group of the population to
change in positive fashion. It is similar to
mutation except individuals are assessed
immediately and only positive changes are
retained.
The population of survivors (typically parents)
and the new children become the next
generation.
Basic Evolutionary Algorithm


This process is repeated many times until some
time limit expires or until some member of the
population achieves a desired assessed value.
Research in this area primarily focuses on
determining appropriate fitness functions,
methods of crossover, and methods of mutation
as applied to application areas.
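
The loop described on the preceding slides can be sketched on a toy problem. This is only an illustration, not code from the project: individuals are bit strings, fitness is the number of bits that are set, the top 30% survive as parents, and about 5% of children are mutated. The class name BasicEvolution, the one-point crossover, and the 5% rate are all assumptions made for this example. Growth is left out, and the population size is assumed to be at least 2.

```csharp
using System;
using System.Linq;

// Toy illustration (hypothetical names): evolve bit strings toward all ones.
public static class BasicEvolution
{
    // Fitness: how many genes are set. The ideal individual has every gene set.
    public static int Fitness(bool[] individual)
    {
        return individual.Count(gene => gene);
    }

    // Runs the evolutionary loop and returns the fittest individual found.
    public static bool[] Evolve(int populationSize, int genomeLength,
                                int maxGenerations, Random rng)
    {
        // Generation 0: random individuals.
        bool[][] population = new bool[populationSize][];
        for (int i = 0; i < populationSize; i++)
        {
            population[i] = new bool[genomeLength];
            for (int g = 0; g < genomeLength; g++)
                population[i][g] = rng.Next(2) == 1;
        }

        int numParents = Math.Max(2, populationSize * 30 / 100);   // top 30%

        for (int generation = 0; generation < maxGenerations; generation++)
        {
            // Assess and sort, fittest on top.
            population = population.OrderByDescending(ind => Fitness(ind)).ToArray();
            if (Fitness(population[0]) == genomeLength)
                break;   // some member achieved the desired value

            // Parents survive; everyone else is replaced by a child.
            for (int i = numParents; i < populationSize; i++)
            {
                bool[] mother = population[rng.Next(numParents)];
                bool[] father = population[rng.Next(numParents)];
                int cut = rng.Next(genomeLength + 1);   // one-point crossover
                bool[] child = new bool[genomeLength];
                for (int g = 0; g < genomeLength; g++)
                    child[g] = g < cut ? mother[g] : father[g];

                // Mutation: flip one random gene in about 5% of children.
                if (rng.NextDouble() < 0.05)
                {
                    int flip = rng.Next(genomeLength);
                    child[flip] = !child[flip];
                }
                population[i] = child;
            }
        }
        return population.OrderByDescending(ind => Fitness(ind)).First();
    }
}
```

Because the parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next in this sketch.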
Traveling Salesman Problem

One such application area is the Traveling
Salesman Problem (TSP). The problem is
stated as follows:

A salesman is assigned a region of n cities as his
sales territory. Periodically, he is required to tour
his region. That is, he is to visit each city exactly
once before returning to his home city. Assuming
that the cost of traveling between two cities is
proportional to the distance between the two cities,
what is the least-cost route the salesman can
take?
Traveling Salesman Problem


The TSP has been posed since at least the
1800s, with much work done by Harvard
mathematicians in the 1930s on a general form
of the problem.
A brute force solution would be to determine
the total distance of all possible routes.
Unfortunately, for n cities, there are (n-1)!
possible routes, making this solution infeasible
for all but the smallest problems, even with the
fastest computers. E.g., 99! ≈ 9.332 × 10^155
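
To get a feel for how fast (n-1)! grows, the route count can be computed exactly with System.Numerics.BigInteger. This is a small sketch only. The names RouteCount and CountRoutes are made up for this example.

```csharp
using System;
using System.Numerics;

public static class RouteCount
{
    // Number of possible routes for numCities cities with a fixed
    // starting city: (numCities - 1)!
    public static BigInteger CountRoutes(int numCities)
    {
        BigInteger routes = BigInteger.One;
        for (int k = 2; k <= numCities - 1; k++)
            routes *= k;
        return routes;
    }
}
```

For example, CountRoutes(5) is 4! = 24, while CountRoutes(100) is 99!, a number with 156 digits.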
Traveling Salesman Problem

We can arrive at a reasonable approximation to
the optimal solution using evolutionary
programming techniques as follows:


Generate a large number, N, of random paths and
find the shortest one. This is generation 0. For
example, for n = 100 cities, computing the total
distance of one path requires computing 99
distances. For N = 1000 paths, we do this 1000 times.
The fitness function is the length of each path, so
for n cities, we sum the (n-1) distances between
consecutive cities in the path. The shortest path is
the most fit.
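
A minimal sketch of this fitness function follows. It assumes straight-line (Euclidean) distance and stores the city coordinates in two plain arrays rather than in the project's own classes. The names PathFitness, x, and y are hypothetical. As on the slide, only the (n-1) distances along the path are summed, so the leg back to the home city is not included.

```csharp
using System;

public static class PathFitness
{
    // Sums the distances between consecutive cities in the path.
    // path holds indexes into the coordinate arrays x and y.
    public static double PathLength(int[] path, double[] x, double[] y)
    {
        double total = 0.0;
        for (int i = 1; i < path.Length; i++)
        {
            double dx = x[path[i]] - x[path[i - 1]];
            double dy = y[path[i]] - y[path[i - 1]];
            total += Math.Sqrt(dx * dx + dy * dy);
        }
        return total;
    }
}
```

With cities at (0,0), (3,0), and (3,4), the path {0,1,2} has length 3 + 4 = 7 and the path {0,2,1} has length 5 + 4 = 9, so {0,1,2} is the fitter of the two.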
Traveling Salesman Problem


To create the next generation, we can do crossover
by using the shortest path to generate (N-1) new
children paths by taking a random subsection of the
shortest path and adding the remaining cities at
random. E.g. if n = 5, a shortest path may be
{3,0,2,4,1}. A random subsection might be {2,4}
and randomly selecting the remaining cities could
result in a new child path of {2,4,0,3,1}.
After all the children paths are constructed,
mutation may be applied. E.g. exchanging the
positions of two random cities. The generation is
sorted, and then this process is repeated for k
generations.
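
The crossover step might be sketched as follows. This is an illustration under assumptions, not the project's Crossover method: the names TspCrossover and MakeChild are invented, the subsection is placed at the front of the child as in the {2,4,0,3,1} example, and the remaining cities are shuffled with a Fisher-Yates shuffle. The parent is assumed to be a non-empty permutation of 0 to n-1.

```csharp
using System;
using System.Collections.Generic;

public static class TspCrossover
{
    // Builds a child path: a random contiguous subsection of the parent comes
    // first, then the cities not in that subsection follow in random order.
    public static int[] MakeChild(int[] parent, Random rng)
    {
        int n = parent.Length;
        int length = rng.Next(1, n);                 // subsection length, at least 1
        int start = rng.Next(0, n - length + 1);     // subsection fits inside parent

        List<int> child = new List<int>(n);
        bool[] used = new bool[n];
        for (int i = start; i < start + length; i++)
        {
            child.Add(parent[i]);
            used[parent[i]] = true;
        }

        // Collect the cities not yet used and shuffle them.
        List<int> rest = new List<int>();
        for (int city = 0; city < n; city++)
            if (!used[city]) rest.Add(city);
        for (int i = rest.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            int tmp = rest[i];
            rest[i] = rest[j];
            rest[j] = tmp;
        }

        child.AddRange(rest);
        return child.ToArray();
    }
}
```

Every child produced this way still visits each city exactly once, which is why the remaining cities are filled in from the unused set instead of being copied from a second parent.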
Traveling Salesman Project


The Traveling Salesman project is an
application that implements this evolutionary
algorithm and uses graphics to display the
shortest path.
A typical run is shown on the next slide. The
user can enter the following options:




number of cities
number of generations
number of paths per generation
mutation percentage
Traveling Salesman Project

genes passed on – can
be a percentage or
random. If it is random,
then a path of random
length is chosen from
the parent path and
copied to each child
path. If it is not random,
then the user can choose
0% to 99% of the parent
path to be passed to
each child.
Traveling Salesman Project

The program has two classes.


City class – represents the location of a city as an
(x,y) screen coordinate and stores a boolean flag
visited. It supports properties to get and set these
values.
Path class – contains a path through the cities
represented as an array of integers that are indexes
into the application array of cities and the length of
the path. It supports properties to get and set these
values.
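
A rough sketch of what these two classes might look like is given below. Only the names City, Path, X, Y, and PathLength appear elsewhere in these slides. The property names Visited and CityIndexes, the constructors, and the choice of double for the length are assumptions, and the project's actual code may differ.

```csharp
public class City
{
    public int X { get; set; }          // screen x coordinate
    public int Y { get; set; }          // screen y coordinate
    public bool Visited { get; set; }   // assumed name for the visited flag

    public City(int x, int y)
    {
        X = x;
        Y = y;
        Visited = false;
    }
}

public class Path
{
    // Assumed name: indexes into the application's array of cities.
    public int[] CityIndexes { get; set; }

    // Length of the path (type assumed to be double).
    public double PathLength { get; set; }

    public Path(int numCities)
    {
        CityIndexes = new int[numCities];
        PathLength = 0.0;
    }
}
```

Storing indexes rather than City objects keeps each Path small, since every path in a generation refers to the same shared array of cities.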
Traveling Salesman Project

The main application data structures are



cities: an array of City objects that are created
when the program starts up and are recreated
whenever the user changes the number of cities.
(See the constructor and the
nudNumCities_ValueChanged handler. Both call
the CreateCities method.)
paths: an array of Path objects that are the
individuals of a generation.
shortestPath: a Path object that contains the
shortest length Path generated so far
In-class Exercise

The first part of the exercise is to complete the
FindShortestPath method. This method does
the following:



Determine the shortest path of the current
generation
Determine whether this shortest path is the shortest
path seen overall. This part has been provided.
The basic idea is to keep track of the shortest
path as each individual is assessed. Start by
saying the first path is the shortest so far.
In-class Exercise

The algorithm for this is:
1. Initialize shortestIndex to 0 and shortestLength to
   paths[0].PathLength
2. For each index i from 1 to numPaths-1
   2.1 If paths[i].PathLength is shorter than
       shortestLength, then i becomes the new
       shortestIndex and paths[i].PathLength
       becomes the new shortestLength

Write the code for this. Run and test the
program using small numbers.
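
For comparison after attempting the exercise, the same linear search is sketched here on a plain array of lengths. This is a stand-alone illustration, not the project's FindShortestPath method: pathLengths[i] stands in for paths[i].PathLength, and the names ShortestPathSearch and FindShortestIndex are invented. The array is assumed to be non-empty.

```csharp
public static class ShortestPathSearch
{
    // Returns the index of the smallest entry in pathLengths.
    // On ties, the earliest index wins because the comparison is strict.
    public static int FindShortestIndex(double[] pathLengths)
    {
        int shortestIndex = 0;
        double shortestLength = pathLengths[0];
        for (int i = 1; i < pathLengths.Length; i++)
        {
            if (pathLengths[i] < shortestLength)
            {
                shortestIndex = i;
                shortestLength = pathLengths[i];
            }
        }
        return shortestIndex;
    }
}
```

For the lengths {5.0, 3.0, 9.0, 3.0} this returns index 1.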
In-class Exercise

We can improve the evolution of the shortest
path in two ways:


In the Mutation method, instead of exchanging a
city with the next city in the path, we can exchange
a city with a random other city. Be sure that the
second city chosen is not the same as the first.
In the Crossover method, instead of always copying
the first x cities from the shortest path, we can copy
a random subset of x cities from the shortest path.
This requires choosing a random starting index and
"incrementing" it as the cities are copied.
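
The two improvements to Mutation and Crossover could look roughly like the sketch below. It is not the project's code: the names ImprovedOperators, SwapTwoRandomCities, and CopyRandomRun are hypothetical, and "incrementing" the index is interpreted here as stepping forward with wrap-around to index 0 after the last city.

```csharp
using System;

public static class ImprovedOperators
{
    // Mutation: swap the city at a random position with the city at a
    // different random position.
    public static void SwapTwoRandomCities(int[] path, Random rng)
    {
        if (path.Length < 2) return;          // nothing to swap
        int first = rng.Next(path.Length);
        int second;
        do
        {
            second = rng.Next(path.Length);
        } while (second == first);            // re-draw until the positions differ

        int tmp = path[first];
        path[first] = path[second];
        path[second] = tmp;
    }

    // Crossover helper: copy x consecutive cities from the parent into the
    // front of the child, starting at a random index in the parent.
    public static void CopyRandomRun(int[] parent, int[] child, int x, Random rng)
    {
        int index = rng.Next(parent.Length);
        for (int i = 0; i < x; i++)
        {
            child[i] = parent[index];
            index = (index + 1) % parent.Length;   // wrap around past the end
        }
    }
}
```

The do-while loop guarantees the two positions differ, so every call to SwapTwoRandomCities on a path of two or more cities changes the path.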
In-Class Exercise

We can improve the user interface by having
the starting city rendered as a solid green circle
to distinguish it from the other cities in the path.
This can be done using the FillEllipse method
that receives a Brush (rather than a Pen).
Since the filled portion is inside the drawn
outline, a filled ellipse needs to be slightly larger
than a drawn one to look the same size.
Brush greenBrush = new SolidBrush(Color.Green);
g.FillEllipse(greenBrush, cities[0].X - 4,
              cities[0].Y - 4, 9, 9);
GUI Notes


The user input is done using NumericUpDown
objects (prefix nud). This GUI element is like a
textbox, but also has the up-down arrows to click on.
The amount that is added/subtracted is the Increment
property. It also prevents non-numeric characters
from being typed in.
The progress bar is a ProgressBar object (prefix
pgb). To use it, the Maximum property is set. The
Value property starts at 0 and as it is set, the GUI
element displays the Value/Maximum ratio.
GUI Notes


The Random button has a CheckedChanged event
handler that is called whenever the button is clicked.
When it is checked, the handler sets the randomGenes
flag to true and disables the Percent box. When it is
unchecked, the handler sets the flag to false and
enables the Percent box again.
The Number of Cities box has a ValueChanged event
handler that is called whenever the user changes the
value in that box (either by clicking an arrow or by
typing in the box). This handler recreates the city list
to have the appropriate number of cities. The new
state is not displayed until the next Start button click.