euler

Automated discovery in math
• Machine learning techniques (GP, ILP, etc.)
have been successfully applied in science
• How about mathematics? Can they be used
to discover interesting relationships in
mathematical “data”?
• This is an exploration of using GP for that
purpose
• Specifically, using GP to automatically
discover Euler’s identity (V – E + F = 2) from
a fairly limited amount of data
Cubes
V=8
E = 12
F=6
V – E + F = 8 – 12 + 6 = 2
Tetrahedra
V=4
E=6
F=4
V – E + F = 4 – 6 + 4 = 2
Octahedra
V=6
E = 12
F=8
V – E + F = 6 – 12 + 8 = 2
Data for Euler’s identity
#   Polyhedron            V    E    F
1   Cube                  8   12    6
2   Triangular prism      6    9    5
3   Pentagonal prism     10   15    7
4   Square pyramid        5    8    5
5   Triangular pyramid    4    6    4
6   Pentagonal pyramid    6   10    6
7   Octahedron            6   12    8
8   Tower                 9   16    9
9   Truncated cube       10   15    7
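As a quick sanity check on the data above, the following sketch recomputes V – E + F for each row. The names POLYHEDRA and euler_characteristic are illustrative and do not come from the slides.

```python
# (name, V, E, F) rows copied from the data above.
POLYHEDRA = [
    ("Cube", 8, 12, 6),
    ("Triangular prism", 6, 9, 5),
    ("Pentagonal prism", 10, 15, 7),
    ("Square pyramid", 5, 8, 5),
    ("Triangular pyramid", 4, 6, 4),
    ("Pentagonal pyramid", 6, 10, 6),
    ("Octahedron", 6, 12, 8),
    ("Tower", 9, 16, 9),
    ("Truncated cube", 10, 15, 7),
]

def euler_characteristic(v, e, f):
    """Return V - E + F for one polyhedron."""
    return v - e + f

for name, v, e, f in POLYHEDRA:
    print(f"{name}: V - E + F = {euler_characteristic(v, e, f)}")
```

Every row should give 2, which is the relationship the GP run is expected to find.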
At a glance
• 50 generations
• Population: 4000 ASTs
• Generation #: 3600 (90% of population)
• Maximum AST depth: 13
• Ramped half-and-half initialization
• 3 non-terminals: +, -, *
• 12 terminals: V, E, F, 1, 2, …, 9
• Crossover, no mutation
Genetic algorithms (GA)
• Search a space of solution attempts
(“individuals”)
• Use natural selection to guide the
search
• Must have a fitness function that can
evaluate any given individual
• Individuals procreate by exchanging
(recombining) “genetic material”
Example: SAT solving
• Problem: Given a CNF formula P over n
variables x1,…,xn, find a satisfying
assignment
• Search space: all n-bit strings
• Fitness measure for a given individual
b1 … bn: # of satisfied clauses in P
• Genetic operations: crossover and
mutation
Crossover:
a1 … aj-1 | aj … an  +  b1 … bj-1 | bj … bn
→ a1 … aj-1 | bj … bn
→ b1 … bj-1 | aj … an
Mutation:
01101001 → 01100001
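The SAT example can be sketched in a few lines. This is a minimal illustration, assuming individuals are lists of 0/1 values and clauses use a DIMACS-style encoding (k for x_k, -k for its negation); the encoding and the function names are assumptions, not part of the slides.

```python
# A clause is a list of non-zero ints: k means x_k, -k means NOT x_k.
def fitness(bits, clauses):
    """Number of clauses satisfied by the assignment bits[0..n-1]."""
    def holds(lit):
        value = bits[abs(lit) - 1] == 1
        return value if lit > 0 else not value
    return sum(any(holds(lit) for lit in clause) for clause in clauses)

def crossover(a, b, j):
    """One-point crossover: swap the tails of a and b starting at index j."""
    return a[:j] + b[j:], b[:j] + a[j:]

def mutate(bits, i):
    """Return a copy of bits with the bit at index i flipped."""
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]

# The mutation shown above flips the fifth bit of 01101001.
mutated = mutate([0, 1, 1, 0, 1, 0, 0, 1], 4)
```

In an actual run the cut point j and the mutated position i would be drawn at random.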
Generic GA algorithm
Parameterized over: N (maximum number of generations), P (population size), G (new individuals per generation)
1. Construct a random initial population
2. Set i := 1
3. If i > N then halt
4. Compute the fitness of each individual; if the fittest solves the problem, halt.
5. Create a new population:
   1. Pick P – G individuals and copy them
   2. Create G new individuals by repeated applications of genetic operations
6. Set i := i + 1 and go to step 3
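The loop above might be written roughly as follows. This is a sketch under stated assumptions: the callables random_individual, fitness, solves, select and breed are hypothetical hooks supplied by the caller, and the toy problem at the end (maximising the number of 1-bits) is only there to make the example runnable.

```python
import random

def run_ga(N, P, G, random_individual, fitness, solves, select, breed):
    """Generic GA loop: N generations, population P, G new individuals each."""
    population = [random_individual() for _ in range(P)]        # step 1
    for _ in range(N):                                          # steps 2, 3, 6
        best = max(population, key=fitness)                     # step 4
        if solves(best):
            return best
        survivors = [select(population) for _ in range(P - G)]  # step 5.1
        offspring = []
        while len(offspring) < G:                               # step 5.2
            offspring.extend(breed(select(population), select(population)))
        population = survivors + offspring[:G]
    return max(population, key=fitness)

# Toy usage (hypothetical): maximise the number of 1-bits in an 8-bit string.
rng = random.Random(0)
n = 8

def random_bits():
    return [rng.randint(0, 1) for _ in range(n)]

def tournament(pop):
    return max(rng.sample(pop, 3), key=sum)

def one_point(a, b):
    j = rng.randrange(1, n)
    return [a[:j] + b[j:], b[:j] + a[j:]]

best = run_ga(N=20, P=30, G=20, random_individual=random_bits, fitness=sum,
              solves=lambda ind: sum(ind) == n, select=tournament, breed=one_point)
```

Returning the fittest individual when the generation limit is reached is a design choice of this sketch; the slides only say that the algorithm halts.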
Selection
• How is an individual “picked” for
reproduction or copying?
• Main idea: the probability that an individual
is selected should be proportional to the
individual’s fitness
• Many ways to ensure that. One method is
tournament selection:
– Pick 0 < k <= P individuals randomly
– Select the fittest of the k
• When k = 1: No selection pressure
• When k = P: Too much selection pressure
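Tournament selection as described above could look like this. It is a minimal sketch; the function name and the rng parameter are illustrative.

```python
import random

def tournament_select(population, fitness, k, rng=random):
    """Pick k individuals uniformly at random and return the fittest of them."""
    if not 0 < k <= len(population):
        raise ValueError("k must satisfy 0 < k <= P")
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)

population = [3, 1, 4, 1, 5, 9, 2, 6]
# With k = P the overall fittest individual always wins.
winner = tournament_select(population, fitness=lambda x: x, k=len(population))
```

With k = 1 the call reduces to a uniform random pick, which is why there is no selection pressure in that case.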
Genetic Programming (GP)
• An instance of the generic GA scheme
• Individuals are now programs, i.e., syntactic
objects
• Search space is kept finite by bounding
program size
• Programs are represented as ASTs
(abstract syntax trees)
Programs as ASTs
if x > 0 then
y := x * x
else
y := z + 1
Parsing yields the AST:
if
  >
    x
    0
  :=
    y
    *
      x
      x
  :=
    y
    +
      z
      1
Program structure in GP
• Programs are usually simple Herbrand terms, i.e., functional expressions
• AST leaves are called terminals
• Internal nodes are non-terminals
• Non-terminals are function symbols (e.g. +)
• Terminals are constants and variables
• Terminals + non-terminals must be sufficient for expressing solutions
Viewing a functional AST as a
“program”
[Figure: AST with root +, an internal * node, and leaves x, y, and 2]
The program has two “inputs”, x and y. Given
specific values for these, it produces a unique
result as output
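To make the "AST as program" view concrete, here is a small evaluator over nested tuples. The tuple representation is an assumption of this sketch, and so is the shape of the example tree: it reads the figure as (x * y) + 2, which is one plausible reading.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate a nested-tuple AST; leaves are ints or variable names."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]      # variable: look up its input value
    return tree               # constant

# Assumed reading of the figure: (x * y) + 2
program = ("+", ("*", "x", "y"), 2)
result = evaluate(program, {"x": 3, "y": 4})
```

For the inputs x = 3 and y = 4 this reading of the tree produces 14.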
AST Crossover
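[Figure: two parent ASTs (roots + and –, with subterms *, +, T1 … T6); crossover point 1 is marked in the first parent and crossover point 2 in the second. The two children are formed by swapping the subtrees rooted at the crossover points.]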
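Subtree crossover can be sketched on nested-tuple ASTs as below. The parent shapes used in the example are an assumed reading of the figure, and all function names are illustrative.

```python
def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def get_at(tree, path):
    """Return the subtree reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(p1, path1, p2, path2):
    """Swap the subtree of p1 at path1 with the subtree of p2 at path2."""
    s1, s2 = get_at(p1, path1), get_at(p2, path2)
    return replace_at(p1, path1, s2), replace_at(p2, path2, s1)

# Assumed parent shapes: (T1 * T2) + T3 and (T5 + T6) - T4
parent1 = ("+", ("*", "T1", "T2"), "T3")
parent2 = ("-", ("+", "T5", "T6"), "T4")
child1, child2 = crossover(parent1, (1,), parent2, (1,))
```

In a GP run the two crossover points would be chosen at random, for example by sampling from list(subtrees(parent)).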
Initial population
• Built randomly
• Two methods for building a random AST:
– Full method: All branches are equally long
– Grow method: Different subtrees can have
different sizes (but less than the maximum)
• More usual: ramped half-and-half
initialization: half of the trees are built
with one method, the other half with the
other method
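The two construction methods and their combination might be sketched as follows. The early-stop probability p_terminal and the way the depth limit is cycled over 1..max_depth (the "ramped" part) are assumptions of this sketch; the slides only state that half of the trees come from each method.

```python
import random

FUNCTIONS = ["+", "-", "*"]
TERMINALS = ["V", "E", "F"] + list(range(1, 10))

def full(depth, rng):
    """Full method: every branch reaches exactly the given depth."""
    if depth == 0:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS), full(depth - 1, rng), full(depth - 1, rng))

def grow(depth, rng, p_terminal=0.3):
    """Grow method: branches may stop early; none exceeds the given depth."""
    if depth == 0 or rng.random() < p_terminal:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            grow(depth - 1, rng, p_terminal),
            grow(depth - 1, rng, p_terminal))

def ramped_half_and_half(size, max_depth, rng):
    """Alternate full and grow, cycling the depth limit over 1..max_depth."""
    population = []
    for i in range(size):
        depth = 1 + (i // 2) % max_depth
        method = full if i % 2 == 0 else grow
        population.append(method(depth, rng))
    return population

def depth_of(tree):
    """Depth of a nested-tuple AST; a lone terminal has depth 0."""
    return 1 + max(depth_of(c) for c in tree[1:]) if isinstance(tree, tuple) else 0

rng = random.Random(0)
population = ramped_half_and_half(20, 4, rng)
```

The function and terminal sets here match the ones listed for the Euler experiment (+, -, * and V, E, F, 1, …, 9).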
Problem formulation
• Can cast it as a standard symbolic regression problem
• View F as a function of E and V, and search the space of all rational functions of two variables (up to a max depth)
• Error function: difference between the actual # of faces and the result produced by the program
• Optimization: minimize the error
• Quick convergence
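The regression view can be illustrated with a simple error function. Summing absolute differences over the nine data rows is an assumption of this sketch (the slides only say "difference"), and the two candidate programs are made up for the example.

```python
# (V, E, F) rows from the data for Euler's identity.
DATA = [
    (8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
    (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7),
]

def total_error(program, data):
    """Sum of |actual F - predicted F| over all (V, E, F) rows."""
    return sum(abs(f - program(v, e)) for v, e, f in data)

def exact(v, e):
    """F = E - V + 2, i.e. Euler's identity solved for F."""
    return e - v + 2

def poor(v, e):
    """A deliberately bad candidate."""
    return v + e

exact_error = total_error(exact, DATA)
poor_error = total_error(poor, DATA)
```

A GP run would minimise this error over candidate ASTs; the exact program reaches an error of 0.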
Another approach
• Search space of all identities
• Generated as follows:
I → T1 = T2
T → L | T1 + T2 | T1 – T2 | T1 * T2
L → V | E | F | 1 | 2 | … | 9
• Any other integer can be built from 1,…,
9 and the given non-terminals
• The identity symbol (=) is not a non-terminal; it can only
appear at the root of an AST
Details
• Generate P identities randomly (using
ramped half-and-half initialization)
• Crossover on two identities S1 = S2 and
T1 = T2:
– Mate two random subterms Si and Tj from each
identity, producing two new subterms Si’ and Tj’
– If either new term is deeper than the max
depth, then use one of the original parents
– Replace Si and Tj in the identities by Si’ and Tj’
• No mutation
Fitness
• An identity is evaluated on a given triple of
values for V, E, and F
• Computing the fitness of an identity
S = T:
– For each of the k data triples p: if S = T holds for p, then give the identity a point
• Higher score, greater fitness
• Maximum fitness: 9, minimum: 0
Problem
• Trivially true identities can get perfect
scores, e.g.:
V=V
1 + 2 = 5 – 3
E – E + E = E

• Solution: negative triples, e.g.:
V = 0, E = 0, F = 1
• Trivial identities will hold for such
negative triples, but plausible identities
will not
Fitness computation
• To evaluate an identity S = T:
• For each of the k data triples p:
– Allocate a point if S = T holds for p
– Allocate a second point if S = T does not hold
for the negative triple
• Maximum score: 18, minimum: 0
• Also impose a penalty of ⌊n/20⌋ points
for an identity of length n (to discourage
excessively long expressions)
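The scoring rule above can be sketched as follows. Several details are assumptions, since the slides do not fix them: an identity is a pair (lhs, rhs) of nested-tuple terms, its length n is taken to be the total node count of both sides, and every data triple is paired with the same negative triple (0, 0, 1) from the earlier example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(term, env):
    """Evaluate a nested-tuple term; leaves are ints or the names V, E, F."""
    if isinstance(term, tuple):
        op, a, b = term
        return OPS[op](evaluate(a, env), evaluate(b, env))
    return env[term] if isinstance(term, str) else term

def size(term):
    """Node count, used here as the 'length' of a term (an assumption)."""
    return 1 + size(term[1]) + size(term[2]) if isinstance(term, tuple) else 1

def holds(identity, triple):
    """True if both sides of the identity agree on the (V, E, F) triple."""
    lhs, rhs = identity
    env = dict(zip("VEF", triple))
    return evaluate(lhs, env) == evaluate(rhs, env)

def fitness(identity, pairs):
    """pairs: list of (data_triple, negative_triple)."""
    score = 0
    for positive, negative in pairs:
        score += holds(identity, positive)       # point for fitting the data
        score += not holds(identity, negative)   # point for rejecting the negative
    n = size(identity[0]) + size(identity[1])
    return score - n // 20                       # length penalty: floor(n / 20)

DATA = [(8, 12, 6), (6, 9, 5), (10, 15, 7), (5, 8, 5), (4, 6, 4),
        (6, 10, 6), (6, 12, 8), (9, 16, 9), (10, 15, 7)]
PAIRS = [(triple, (0, 0, 1)) for triple in DATA]   # assumed negative triple

euler = (("+", ("-", "V", "E"), "F"), 2)   # V - E + F = 2
trivial = ("V", "V")                        # V = V
euler_score = fitness(euler, PAIRS)
trivial_score = fitness(trivial, PAIRS)
```

Under these assumptions Euler's identity reaches the maximum score of 18, while the trivial identity V = V scores only 9 because it also holds on every negative triple.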