of Lil-GP by Darren Lewis

Problem Set 4
Genetic Programming and LIL-GP
Where we’ve been

Problem Sets 1 and 2: Theory




Combinatorics
Simple Genetic Algorithms (a.k.a.
How to run an effective hamburger
chain)
Fundamental Theorem of Genetic
Algorithms
Hyperplane madness
Where we’ve been

Problem Set 3: Genetic Algorithms
and the GENESIS system


Optimization of the function x^10
Minimization of the De Jong function
Where we’re going

Problem Set 4:
Genetic Programming and LIL-GP


Evolve LISP-like programs without
writing one line of LISP! (sorry, John)
Solve two simple problems: symbolic
regression of a quadratic function,
and symbolic regression of a boolean
even-3-parity function.
Getting the software
1. Change to your cs426 directory.
   cd cs426
2. Copy the software to your AFS space.
   cp -r /afs/ir/class/cs426/lilgp .
   This will create a directory called lilgp under your cs426 directory.
Uncompressing the code
1. Change to your new lilgp directory.
   cd lilgp
   You should have one file: lilgp.tar
2. Uncompress with the following command:
   tar -xvf lilgp.tar
Uncompressing the code
You should now have a directory
called lilgp1.1. Under this
directory you only need to deal
with the following subdirectories:

1.1: All the code.

htmlMan: All the documentation.
Don’t worry about the others.
Where to start


Browse through the documentation.
It is indexed and easy to follow. If
you want to download it yourself, you
can get it at:
http://garage.cps.msu.edu/software/software-index.html#lilgp
Start with: lil-gp.contents.htm
Where to start

Try running one of the samples.
They’re located in the app directory.
After compiling and building (just
type make), they’re ready to run. Try
the regression sample. To run, type:
gp -f input.file
If the program does not start
running, there’s something wrong
with your installation.
Warnings:
1. Don’t change anything in the kernel directory.
2. Don’t change anything in the kernel directory.
3. Don’t modify the makefiles. Even though they’re called GNUmakefile, typing make still works.
Implementing a problem
1. Fill in a tableau, deciding on terminal sets, function sets, fitness determination, etc.
2. Write the code.
3. Create a parameter file for the run.
4. Run the code and examine the output files.
Tableau



Terminal set: The leaves of the
expression tree.
Function set: The internal nodes.
ex: symbolic regression
terminal set = {x}
function set = {+, -, *, %}
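
In Koza's book the % symbol stands for protected division, which returns 1 when the divisor is zero so that every evolved expression can still be evaluated. A minimal sketch in C, assuming a double return type; the name protected_div is illustrative and is not taken from the LIL-GP sources:

```c
/* Protected division: behaves like ordinary division, except that a
   zero divisor yields 1.0 instead of infinity or a NaN.
   The function name is hypothetical. */
double protected_div(double numerator, double denominator)
{
    if (denominator == 0.0)
        return 1.0;
    return numerator / denominator;
}
```

For example, protected_div(6.0, 3.0) gives 2.0, while protected_div(5.0, 0.0) gives 1.0.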
Tableau

Fitness



Raw fitness – sum of error over
fitness cases, number of hits, etc.
Standardized fitness – all positive; 0
is best
Adjusted fitness – in between 0 and
1, with 1 being the most fit individual
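
The three fitness measures above are related by simple conversions. The sketch below follows the definitions in Koza (1992): standardized fitness makes 0 the best value, and adjusted fitness is 1 / (1 + standardized). The function names are illustrative only; LIL-GP's own callbacks may organize this differently.

```c
/* Raw fitness as a sum of errors: smaller is already better, so the
   standardized fitness is the same number. */
double standardized_from_error(double raw_error)
{
    return raw_error;
}

/* Raw fitness as a hit count: bigger is better, so subtract from the
   maximum possible count to make 0 the best value. */
double standardized_from_hits(int hits, int max_hits)
{
    return (double)(max_hits - hits);
}

/* Adjusted fitness lies in (0, 1]; a perfect individual scores 1. */
double adjusted_fitness(double standardized)
{
    return 1.0 / (1.0 + standardized);
}
```

A perfect individual (standardized fitness 0) gets adjusted fitness 1; an individual with standardized fitness 1 gets 0.5.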
Tableau

LIL-GP follows the design described in
Genetic Programming (Koza, 1992), so all
of its terminology is explained in the
book. See chapters 6 and 7 for full
explanations of all terms used in
the program.
Writing the code

Files you need to provide:
1. app.h
2. app.c
3. appdef.h
4. function.h
5. function.c
Writing the code

Files you need to provide:
1. app.h – Defines the global data structure used to pass information back and forth between files.
2. app.c – Defines all the user callbacks – initialization, fitness evaluation, etc.
Writing the code

Files you need to provide:
3. appdef.h – Two #defines: MAX_ARGS and DATATYPE.
   MAX_ARGS: Maximum number of arguments your functions will take.
   DATATYPE: Type that all of your functions will return (ex: double, int, etc.).
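
As a rough illustration, an appdef.h for a problem whose functions are all binary and return real numbers might look like the sketch below. The two values are assumptions chosen for a function set such as {+, -, *, %}; the include guard name is also illustrative, so compare against the appdef.h in the sample app you copy.

```c
/* appdef.h (sketch): values chosen for illustration only. */
#ifndef APPDEF_H
#define APPDEF_H

/* No function in this example takes more than two arguments. */
#define MAX_ARGS 2

/* Every function and terminal in this example returns a double. */
#define DATATYPE double

#endif
```

For a Boolean problem, DATATYPE would more naturally be int.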
Writing the code

Files you need to provide:
4. function.h – Prototypes for all of the functions in your function set.
5. function.c – Implementations of the functions listed in function.h.
Writing the code

In general, all of the tricky code has
been written for you. In fact, two of
the sample applications are:
1. Symbolic regression of a function
2. Boolean-11 multiplexer
Both are explained in GP. You do not
need to rewrite the functions in
app.c, just modify them.
Creating a parameter file

The parameter files included with
the samples are called input.file.
Just modify these for the
homework problems. If you want
to learn more about the extensive
parameters that can be set, look in
the documentation.
Creating a parameter file

The only confusion about
parameter files will come in
specifying crossover rates for the
first problem. To avoid this
confusion, here is what you need
to put in your parameter file to
meet the problem criteria:
Creating a parameter file

Parameters for proper breeding:
breed_phases = 4
breed[1].operator = crossover, select=fitness, internal=0.0
breed[1].rate = 0.1
breed[2].operator = crossover, select=fitness, internal=1.0
breed[2].rate = 0.8
breed[3].operator = reproduction, select=fitness
breed[3].rate = 0.1
breed[4].operator = mutation, select=fitness
breed[4].rate = 0.00
Creating a parameter file

Aside from this, all you need to
change is pop_size,
max_generations, random_seed,
and perhaps output.basename.
The parameter output.basename
sets the file prefix for all of the
output files.
Creating a parameter file

For problems 2 and 3, the default
crossover parameters work fine.
No need to change them unless
you’re curious.
Running the code

Very simple to run the code:
1. Compile and build with the command make. If there are compile errors, it will tell you now – fix them before continuing.
2. Run the code with the command:
   gp -f input.file
Output files

LIL-GP generates lots of good stuff:






.sys – general info about the run
.gen – stats on tree size and depth
.prg – stats on fitness and hits
.bst – info about current best individual
.his – history of the .bst file
.stt – unreadable version of all stats
You will want the .bst and .his files.
Problem 1

This problem deals with symbolic
regression of the function x^2/2 + 2x + 2.
There is a sample app called
“regression” provided. All you need to
do is modify the input.file as
described, delete a bunch of lines in
app.c and function.h/c, change a few
numbers in app.c, and it will run.
Guaranteed.
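
To make the fitness measure for this problem concrete, here is a small standalone sketch in C. It is not the sample app's code. It assumes 20 evenly spaced fitness cases on the interval [-1, 1] (both numbers are assumptions, so use whatever the problem statement specifies), and all function names are illustrative.

```c
#define NUM_CASES 20   /* assumed number of fitness cases */

/* Target function from Problem 1: x^2/2 + 2x + 2. */
double target(double x)
{
    return x * x / 2.0 + 2.0 * x + 2.0;
}

/* Raw fitness of a candidate: sum of absolute errors over evenly
   spaced points in [-1, 1] (the interval is an assumption). */
double sum_abs_error(double (*candidate)(double))
{
    double total = 0.0;
    for (int i = 0; i < NUM_CASES; i++) {
        double x = -1.0 + 2.0 * i / (NUM_CASES - 1);
        double diff = candidate(x) - target(x);
        total += (diff < 0.0) ? -diff : diff;
    }
    return total;
}

/* A candidate that is algebraically equal to the target:
   (x + 2)^2 / 2 = x^2/2 + 2x + 2. */
double example_candidate(double x)
{
    return (x + 2.0) * (x + 2.0) / 2.0;
}

/* A poor candidate for comparison: just x. */
double poor_candidate(double x)
{
    return x;
}
```

Calling sum_abs_error(example_candidate) gives an error of essentially zero (up to floating-point rounding), while sum_abs_error(poor_candidate) is clearly positive. An evolved tree that matches the target may look quite different from the textbook form, as the (x + 2)^2 / 2 example suggests.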
Problem 1

Hint: With the provided random
seed, it will not find a close
solution even after 151
generations. Fiddle with the
random seed and you will quickly
find a perfect solution in far
fewer than 151 generations.
Problem 2

This problem deals with symbolic
regression of a boolean function
that performs even-3-parity. Even-3-parity returns true given three
inputs if an even number of inputs
are true (i.e., 0 true, or 2 true).
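
The definition above can be written directly in C. This is only a sketch of the target function itself, with an illustrative name; how it is wired into the fitness callback depends on the sample app's code.

```c
/* Returns 1 when an even number (0 or 2) of the three inputs are true,
   and 0 otherwise. Any non-zero input counts as true. */
int even_3_parity(int a, int b, int c)
{
    int true_count = (a != 0) + (b != 0) + (c != 0);
    return true_count % 2 == 0;
}
```

So even_3_parity(0, 0, 0) and even_3_parity(1, 1, 0) return 1, while even_3_parity(1, 0, 0) and even_3_parity(1, 1, 1) return 0.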
Problem 2

There is a sample app called
“multiplexer” that performs symbolic
regression for a boolean-11 multiplexer.
Just modify their files to do this
problem. You will need to delete a
bunch of functions and add code for
nand and nor. You might want to add an
even-3-parity test function in app.c for
fitness determination….
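
The logic of the two new functions is short. The sketch below shows them as plain C functions on ints; the names and the two-int signature are illustrative, and in your function.c they must use whatever argument convention the multiplexer sample's existing functions use.

```c
/* NAND: false only when both inputs are true. */
int nand_gate(int a, int b)
{
    return !(a && b);
}

/* NOR: true only when both inputs are false. */
int nor_gate(int a, int b)
{
    return !(a || b);
}
```

Both return 0 or 1, so their results can be fed straight back in as inputs to other Boolean functions.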
Problem 2


You will also need to change the
specification in app.h. Your global
should either have three ints (1 for each
input line) or 1 int (if you like bit logic).
Hint: With a specified population size of
1000, you will very likely have a perfect
individual at generation 0, since the
solution is very simple. This is fine. You
should be happy about it, not stressed.
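
If you choose the single-int layout, the three input lines can be packed into the low three bits, and the eight fitness cases are then just the integers 0 through 7. The sketch below is standalone C, not LIL-GP code: the struct name, its field, and the helper functions are all hypothetical.

```c
/* Hypothetical global: all three input lines packed into one int.
   Bit 0, bit 1, and bit 2 hold the three inputs. */
typedef struct {
    int inputs;
} parity_globals;

/* Extract one input line (0, 1, or 2) as 0 or 1. */
int input_bit(const parity_globals *g, int line)
{
    return (g->inputs >> line) & 1;
}

/* Count how many of the 8 fitness cases a candidate answers correctly. */
int count_hits(int (*candidate)(const parity_globals *))
{
    int hits = 0;
    for (int fitness_case = 0; fitness_case < 8; fitness_case++) {
        parity_globals g = { fitness_case };
        int ones = input_bit(&g, 0) + input_bit(&g, 1) + input_bit(&g, 2);
        int expected = (ones % 2 == 0);
        if (candidate(&g) == expected)
            hits++;
    }
    return hits;
}

/* Candidate that always answers true: right only on the even cases. */
int always_true(const parity_globals *g)
{
    (void)g;
    return 1;
}

/* Candidate built from XOR: right on every case. */
int xor_candidate(const parity_globals *g)
{
    return !(input_bit(g, 0) ^ input_bit(g, 1) ^ input_bit(g, 2));
}
```

Here count_hits(always_true) is 4 and count_hits(xor_candidate) is 8. Half of the cases are satisfied by a constant answer, which is one reason a large random population can contain a perfect individual at generation 0.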
Problem 3

This is the same as problem 2,
except now with an automatically
defined function. Do not attempt
this problem before you have
problem 2 working completely. You
will be able to quickly modify your
problem 2 code (just app.c) to
make problem 3 work.
Problem 3

In specifying the function sets for this
problem, follow the model given in the
“lawnmower” sample app. This shows
precisely how to build the function sets
for a tree with automatically defined
functions. There will be two different
function sets: one for the
result-producing branch, and one for
the ADF branch.
General Hints


Once again, there is not much code to
be written. All you are doing is
modifying existing code. Don’t feel bad,
this is done all the time.
For the function sets, make sure to read
the docs to find out what each of the
arguments should be for the different
types of arguments, terminals, and
functions.
General Hints


Don’t modify the sample apps
directly. Copy the files to a new
directory, so you’ll always have the
original to look back at.
If you get stuck, read the
documentation. It’s actually very
clear and well-indexed.
What to turn in



Answers to the written problems.
app.c and function.c for each
problem.
The .bst file for each problem. The
default of one individual in this file
is fine.
Final thoughts

This is not a test of your
programming ability. It is just an
introduction to one piece of easy-to-use software for genetic
programming. Don’t make this
harder than it needs to be by
rewriting everything from scratch.
Use what works.
Final thoughts

Other GP software freely available:

GPCPP – Has basic functionality,
relatively easy to add new problem
classes. (C++) Download at:
http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/genetic/gp/systems/gpcpp/0.html

GPQUICK – Simpler than GPCPP but
less friendly interface. (C++)
Download at:
http://www-cgi.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/genetic/gp/systems/gpquick/0.html

ECJ8 – Most versatile, but very large
and somewhat complicated. (Java)
Download at:
http://www.cs.umd.edu/projects/plus/ec/ecj/
Final thoughts
Good luck!