CAGE: A Tool for Parallel Genetic Programming Applications
Gianluigi Folino

Outline
• Introduction
  – What is GP
  – How GP works
  – Interesting results
• Parallel GP
  – Parallel Model for Evolutionary Algorithms
  – Implementation of CAGE (cellular model)
  – Convergence analysis
  – Scalability Analysis
• Future Works (some ideas)
  – Data Mining and Classification
  – Grid Computing
  – Information Retrieval

What is GP · Why Parallel GP · CAGE · Convergence · Scalability · Future Works

Problem Solving and GP
• Genetic Programming is a general approach to solving problems.
• It is a heuristic for finding a global optimum in a search space.
• It is a weak method: it includes little knowledge about the problem to solve.
• It mimics the process of natural evolution for the emergence of complex structures (solutions).
• In a population of computer programs (candidate solutions), only the fittest programs survive evolution.

Individual representation and search space
• The user chooses the functions and the terminals necessary to solve a problem.
• The search space is composed of all the possible programs generated recursively from the chosen functions and terminals.
• A computer program (individual) is represented as a parse tree.
  if (time > 10) ris = 1 + 2 + 3; else ris = 1 + 2 + 4;

How GP works
Genetic programming uses four steps to solve problems:
I. Generate an initial population of random compositions of the functions and terminals of the problem (computer programs).
II. Execute each program in the population and assign it a fitness value according to how well it solves the problem.
III. Create a new population of computer programs by applying genetic operators (mutation, crossover, etc.) to selected trees (fitter trees are more likely to be selected).
IV. The best computer program that appeared in any generation, the best-so-far solution, is designated as the result of genetic programming.
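The four steps above can be sketched as a small sequential GP loop. This is a minimal illustration, not CAGE's implementation: the primitive sets FUNCTIONS and TERMINALS, the nested-tuple tree encoding, tournament selection (used here in place of fitness-proportionate selection for brevity), the operator probabilities, and the depth limit of 8 are all assumptions made for the example.

```python
import operator
import random

# Hypothetical primitive sets, chosen only for illustration.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0]

def random_tree(depth, rng):
    """Grow a random parse tree encoded as nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    return (op, random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def evaluate(tree, x):
    """Execute a program (parse tree) for one input value."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def fitness(tree, cases):
    """Sum of absolute errors over the fitness cases (lower is better)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)

def depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[1]), depth(tree[2]))

def paths(tree, prefix=()):
    """List the path (sequence of child indices) to every node."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i in (1, 2):
            found += paths(tree[i], prefix + (i,))
    return found

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], subtree)
    return tuple(parts)

def crossover(a, b, rng):
    """Replace a random subtree of a with a random subtree of b."""
    return replace(a, rng.choice(paths(a)), get(b, rng.choice(paths(b))))

def mutate(tree, rng):
    """Replace a random subtree with a freshly generated one."""
    return replace(tree, rng.choice(paths(tree)), random_tree(2, rng))

def tournament(population, scores, rng, k=3):
    """Pick the best of k random individuals, so fitter trees win more often."""
    picks = rng.sample(range(len(population)), k)
    return population[min(picks, key=lambda i: scores[i])]

def run_gp(cases, pop_size=60, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_tree(3, rng) for _ in range(pop_size)]  # step I
    best, best_score = None, float("inf")
    for _ in range(generations):
        scores = [fitness(t, cases) for t in population]  # step II
        for tree, score in zip(population, scores):  # step IV: best-so-far
            if score < best_score:
                best, best_score = tree, score
        new_population = []
        while len(new_population) < pop_size:  # step III
            parent = tournament(population, scores, rng)
            r = rng.random()
            if r < 0.8:
                child = crossover(parent, tournament(population, scores, rng), rng)
            elif r < 0.9:
                child = mutate(parent, rng)
            else:
                child = parent  # reproduction
            if depth(child) > 8:  # assumed depth limit to contain tree growth
                child = parent
            new_population.append(child)
        population = new_population
    return best, best_score
```

For example, run_gp([(i / 10, (i / 10) ** 2 + i / 10) for i in range(-10, 11)]) searches for a tree that approximates x*x + x on 21 sample points and returns the best-so-far tree with its error.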

Flow Chart of GP
[Figure: flow chart of GP]

Crossover example
[Figure: parse trees for X + Y + 3, X / 2 + 3, Y + 1 + X / 2 and Y + 1 + X + Y]

Mutation example
[Figure: parse trees for X + Y + 3 and X + Y + Y * (X / 2)]

Preparatory steps
• Determine the representation scheme:
  – set of terminals (ex: {x, y, z})
  – set of functions (ex: {=, +, -, *, /})
• Determine the fitness measure.
• Determine the parameters:
  – Population size, number of generations
  – Number of atoms in program
  – Probability of crossover, mutation, reproduction
• Determine the criterion for terminating a run (maximum number of generations or exact solution).

GP Best Results
• Creation of four different algorithms for the transmembrane segment identification problem for proteins.
• Automatic decomposition of the problem of synthesizing a crossover filter.
• Synthesis of 60 and 96 decibel amplifiers.
• Creation of a soccer-playing program that ranked in the middle of the field of 34 human-written programs in the Robo Cup 1998 competition.

GP Best Results and atypical fitness evaluation
• Art and GP
• Board Games and GP

Why Parallel GP
• The search for solutions is implicitly parallel.
• Hard problems require large populations.
• Time and memory requirements of GP (scalability).
• Locality in the selection operator could help maintain diversity (convergence).

Models of Parallel GP (Global)
[Figure: one master connected to four slaves]
• No distribution of the population
• Master: select, cross, mutate
• Slaves: evaluate fitness
• Convergence is the same as that of sequential GP

Models of Parallel GP (Island and Cellular)
[Figures: Island Model; Cellular Model]

CAGE
• CAGE (CellulAr GEnetic programming tool) is a parallel tool for the development of genetic programs (GP).
• It is implemented using the cellular model on a general-purpose distributed-memory parallel computer.
• CAGE is written in C, using the MPI libraries for communication between the processors.
• It can also run on a PC with the Linux operating system.

CAGE Implementation
[Figure: population grid split among Processor 0, Processor 1 and Processor 2; each cell is a single individual]
• The population is arranged in a two-dimensional grid, where each point represents a program tree.
• CAGE uses a one-dimensional domain decomposition along the x direction.

CAGE Implementation
For each element in the grid:
• Mutation and unary operators are applied to the current tree.
• Crossover chooses as second parent the best tree among the neighbours (Moore neighbourhood).
• A replacement policy is applied.
• The chosen individual is put in the new population in the same position as the old one.
We have three replacement policies (applied to the result of crossover):
• Greedy
• Direct
• Probabilistic (SA)

Convergence analysis
• CAGE was tested on some standard problems: Symbolic Regression, Discovery of Trigonometric Identities, Symbolic Integration, Even 5-Parity, Artificial Ant and Royal Tree.
• We averaged the tests over 20 runs and used a population of 3200 individuals (1600 for Symbolic Regression and Integration).

  Parameter                              Value
  Maximum number of generations          100
  Probability of crossover               0.8
  Probability of mutation                0.1
  Probability of reproduction            0.1
  Generative method for initial pop.     Ramped
  Max depth for a new tree               6
  Max depth for a tree after crossover   8
  Max depth of a tree for mutation       4
  Parsimony factor                       0.0

Parameters used in the experiments (the selection method was Greedy for CAGE and fitness proportionate for canonical GP).

Symbolic Regression
• Symbolic regression consists in searching for a non-trivial mathematical expression that, given a set of values xi of the independent variable, always assumes the corresponding values yi of the dependent variable.
• The target function for our experiments is: X^4 + X^3 + X^2 + X
• A sample of 20 points with Xi in the range [-1, 1] was chosen to compute the fitness.
  Terminal symbols: X
  Functions: +, -, *, %, sin, cos, exp, rlog

Symbolic Regression
[Figures: CAGE vs Canonical; Different population sizes]

Symbolic Integration
• Symbolic Integration consists in searching for a symbolic mathematical expression that is the integral of a given curve.
• The target function for our experiments was: cos x + 2x + 1
• A sample of 50 points with Xi in the range [0, 2] was chosen to compute the fitness.
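A fitness evaluation of this kind could be sketched as follows. This is only an illustration under assumptions not stated on the slide: the reference integral is computed numerically with the trapezoidal rule, the error measure is a sum of absolute differences, the constant of integration is ignored by aligning the candidate at the first sample point, and the range [0, 2] is used as written. The function names are hypothetical.

```python
import math

def target_curve(x):
    """Curve from the slide whose integral is sought."""
    return math.cos(x) + 2 * x + 1

def sample_points(lo=0.0, hi=2.0, n=50):
    """n equally spaced points in [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def numeric_integral(xs):
    """Running trapezoidal integral of target_curve from xs[0] to each xs[i]."""
    totals = [0.0]
    for a, b in zip(xs, xs[1:]):
        area = 0.5 * (b - a) * (target_curve(a) + target_curve(b))
        totals.append(totals[-1] + area)
    return totals

def integration_fitness(candidate, xs=None):
    """Sum of absolute errors between the candidate and the numeric integral.

    Lower is better; an exact integral scores close to zero.
    """
    xs = sample_points() if xs is None else xs
    reference = numeric_integral(xs)
    offset = candidate(xs[0])  # assumed: ignore the constant of integration
    return sum(abs(candidate(x) - offset - r) for x, r in zip(xs, reference))
```

With this measure, the candidate sin x + x^2 + x scores close to zero (only the trapezoidal error remains), while a poor candidate such as x scores much higher.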
  Terminal symbols: X
  Functions: +, -, *, %, sin, cos, exp, rlog

Symbolic Integration
[Figures: CAGE vs Canonical; Different population sizes]

Even-4 and Even-5 Parity
• In the Even-4 and Even-5 Parity problems we want to obtain a boolean function that receives 4 (5) boolean variables and gives true only if an even number of variables is true.
• The fitness cases are the 2^4 (2^5) combinations of the variables.
• The fitness is the sum of the Hamming distances between the goal function and the solution found.
  Terminal symbols: d0, d1, d2, d3, (d4)
  Functions: AND, OR, NAND, NOR

Even-4 Parity
[Figures: CAGE vs Canonical; Different population sizes]

Even-5 Parity
[Figures: CAGE vs Canonical; Different population sizes]

Ant (Santa Fe Trail)
• The ant problem consists in finding the best strategy for an ant that wants to eat all the food contained in a 32x32 matrix.
• We used the Santa Fe trail, containing 89 pieces of food.
• The fitness is the number of pieces not eaten in a fixed number of moves.
  Terminal symbols: Forward, Left, Right
  Functions: IfFoodAhead, Prog2, Prog3

Ant (Santa Fe Trail)
[Figures: CAGE vs Canonical; Different population sizes]

Royal Tree
• The Royal Tree problem is composed of a series of functions a, b, c, … with increasing arity.
  Terminal symbols: X
  Functions: A, B, C, D, E
• The fitness is the score of the root.
• Each function computes its score by summing the weighted scores of its children.
• If a child is not a perfect tree, the score is multiplied by a penalty factor.
• The problem has a unique solution; we stopped at the level-e tree (326 nodes and a score of 122880).

Royal Tree
[Figures: CAGE vs Canonical; Different population sizes]

Related Work
• No approaches using the grid model and only a few using the island model can be found in the literature.
• 1,000-Pentium Beowulf-Style Cluster Computer for Genetic Programming
• Niwa and Iba describe a parallel island model realised on a MIMD supercomputer and show experimental results for three different topologies: ring, one-way torus and two-way torus (the best).
• Punch discusses the conflicting results obtained using multiple populations for the Ant and the Royal Tree problems.
• We ran CAGE with the same parameters as these two island models in order to compare the convergence.

Niwa and Iba (cos 2x)
• We obtained a fitness value of 0.1 at the 20th generation, instead of the 62nd for Niwa (ring topology).
[Figure: CAGE vs Niwa and Iba]

Niwa and Iba (Even-4 Parity)
• At the 100th generation Niwa has a fitness of 1.1, while our approach is very close to 0.
[Figure: CAGE vs Niwa and Iba]

Convergence Analysis (fitness diffusion)

Scalability (Isoefficiency metric)
• Experimental method
  – External criterion: evaluating how much a priori known similarities are recognized
  – Classes of structurally homogeneous documents: all documents in each class conform to the same DTD; different classes correspond to different DTDs
• Test results
  – Picture of the similarity matrix, where the pixel grey levels are proportional to the corresponding values in the matrix (darker pixels correspond to higher similarity values)
  – Quantitative measures: average intra-class similarity, for each class; average inter-class similarity, for each pair of classes

Scalability results

Classification (Preliminary results)
• Genetic Programming is suitable for Data Classification.
• Good capacity to generalise.
• The size of the solutions is smaller than with See5.
• Needs large populations for real datasets.
• Bagging and boosting to partition datasets.

Decision Trees and GP
• Nodes (attributes) -> functions
• Arcs (attribute values) -> arity of the functions
• Leaves (classes) -> terminals

Grid and Parallel Asynchronous GP
• Using the grid for supercomputing (idle, etc.)
• Computational grids need applications
• Problems: different computational power, …
• Drawbacks of classical parallel algorithms: they need large bandwidth, synchronism, etc.
• Parallel Asynchronous Cellular GP

Information Retrieval
• Query Expansion and Specific Domain Search Engine
• Problem: How do we combine the words?
• Answer: Use GP to add keywords with operators (AND, NOT, OR, NEAR).
• Alternative: Specify the query using natural language and refine it with operators.