Improvement of Multi-population Genetic Algorithms Convergence Time

Maria Angelova, Tania Pencheva
maria.angelova@clbme.bas.bg, tania.pencheva@clbme.bas.bg

Fermentation processes

Fermentation processes (FP) are widely used in different branches of industry: in the production of pharmaceuticals, chemicals and enzymes, yeast, foods and beverages.

Fermentation processes are:
- characterized as complex, dynamic systems with interdependent and time-varying process variables;
- described by non-linear models with a very complex structure.

An important step in the adequate modeling of FP is the choice of an optimization procedure for model parameter identification.

Aims of the investigation

To investigate the influence of five of the main genetic algorithm parameters on the convergence time of six modifications of multi-population genetic algorithms (MpGA):
- generation gap (GGAP)
- crossover rate (XOVR)
- mutation rate (MUTR)
- insertion rate (INSR)
- migration rate (MIGR)

To demonstrate MpGA performance for parameter identification of S. cerevisiae fed-batch cultivation.

Genetic algorithms

Genetic algorithms (GA):
- are a direct random search technique for finding a global optimal solution in a complex multidimensional search space;
- are based on the mechanics of natural selection and natural genetics;
- have advantages such as the ability to solve hard problems, noise tolerance, and ease of interfacing and hybridization;
- have proved to be very suitable for the optimization of highly non-linear problems;
- are applied in the area of biotechnology, especially for parameter identification of fermentation process models.

Multi-population genetic algorithms

A simple genetic algorithm (SGA) works with a population of coded parameter sets called “chromosomes”. Each of these artificial chromosomes is composed of binary strings (genes) of a certain length (number of binary digits). Each gene contains the information for the corresponding parameter.
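To illustrate how a binary chromosome can carry one gene per model parameter, here is a minimal Python sketch. It assumes plain binary coding with linear scaling of each gene onto a parameter interval. The poster does not state the coding scheme, gene length or parameter bounds, so `decode_gene`, `decode_chromosome`, `gene_length` and `bounds` are hypothetical names and inputs used only for illustration.

```python
from typing import List, Sequence, Tuple


def decode_gene(bits: Sequence[int], lower: float, upper: float) -> float:
    """Map a binary string (most significant bit first) linearly onto [lower, upper]."""
    as_int = 0
    for b in bits:
        as_int = (as_int << 1) | b
    max_int = (1 << len(bits)) - 1  # largest integer representable with len(bits) bits
    return lower + (upper - lower) * as_int / max_int


def decode_chromosome(chromosome: Sequence[int], gene_length: int,
                      bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """Split a chromosome into equal-length genes and decode one parameter per gene.

    Assumption: all genes share the same length and bounds[i] is the (lower, upper)
    interval of the i-th parameter.
    """
    if len(chromosome) != gene_length * len(bounds):
        raise ValueError("chromosome length must equal gene_length * number of parameters")
    params = []
    for i, (lower, upper) in enumerate(bounds):
        gene = chromosome[i * gene_length:(i + 1) * gene_length]
        params.append(decode_gene(gene, lower, upper))
    return params
```

With this coding, a longer gene gives a finer resolution of the corresponding parameter within its interval.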
A multi-population genetic algorithm (MpGA) is an extension of the single-population genetic algorithm in which many populations, called subpopulations, evolve independently of each other for a certain number of generations. After this number of generations (the isolation time), a number of individuals are exchanged between the subpopulations.

MpGA modifications

Six kinds of MpGA are investigated with the aim of improving the convergence time of the algorithms. The MpGA differ from each other in the order in which the main genetic operators (selection, crossover and mutation) are executed:
- MpGA-SCM (from the sequence selection, crossover, mutation);
- MpGA-CMS (crossover, mutation, selection);
- MpGA-SMC (selection, mutation, crossover);
- MpGA-MCS (mutation, crossover, selection);
- MpGA-SC (selection, crossover);
- MpGA-CS (crossover, selection), newly developed here and motivated by the promising results obtained in SGA when selection is performed after crossover.

MpGA-CS

The main idea of this modification is that individuals are reproduced using crossover only, without mutation. At the beginning, MpGA-CS generates a random population of n chromosomes, i.e. candidate solutions of the problem. To prevent good solutions already reached from being lost through crossover, selection is performed after crossover. During crossover, the parents' genes are combined to form a whole new chromosome. After reproduction, MpGA-CS calculates the objective function for the offspring, and the fittest offspring are selected to replace the parents according to their objective values. MpGA-CS terminates when a given number of generations has been reached.

Range of investigated genetic algorithm parameters

A very large generation gap does not improve the performance of the GA, especially with regard to how fast the solution is found. Mutation is applied randomly with a low probability, typically between 0.01 and 0.1.
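To make the operator sequences and the rates XOVR and MUTR concrete, the following minimal Python sketch applies the genetic operators to one subpopulation in the order spelled by the algorithm's suffix, for example "SCM" or "CS". Binary tournament selection, single-point crossover and bit-flip mutation are assumptions chosen for brevity, since the poster does not say which operator variants were used. All function names are hypothetical, and generation gap, reinsertion and migration are deliberately left out.

```python
import random
from typing import Callable, List

Chromosome = List[int]
Population = List[Chromosome]


def tournament_select(pop: Population, objective: Callable[[Chromosome], float],
                      rng: random.Random) -> Population:
    """Binary tournament (an assumed scheme): keep a copy of the lower-objective individual."""
    selected = []
    for _ in range(len(pop)):
        a, b = rng.choice(pop), rng.choice(pop)
        winner = a if objective(a) <= objective(b) else b
        selected.append(winner[:])
    return selected


def crossover(pop: Population, xovr: float, rng: random.Random) -> Population:
    """Single-point crossover on consecutive pairs, each pair recombined with probability xovr."""
    out = [c[:] for c in pop]
    for i in range(0, len(out) - 1, 2):
        if rng.random() < xovr:
            point = rng.randrange(1, len(out[i]))  # cut strictly inside the string
            out[i][point:], out[i + 1][point:] = out[i + 1][point:], out[i][point:]
    return out


def mutate(pop: Population, mutr: float, rng: random.Random) -> Population:
    """Bit-flip mutation: every bit is inverted independently with probability mutr."""
    return [[1 - g if rng.random() < mutr else g for g in c] for c in pop]


def next_generation(pop: Population, objective: Callable[[Chromosome], float],
                    sequence: str, xovr: float, mutr: float,
                    rng: random.Random) -> Population:
    """Apply the operators in the order given by `sequence`, e.g. "SCM", "CMS" or "CS"."""
    for op in sequence:
        if op == "S":
            pop = tournament_select(pop, objective, rng)
        elif op == "C":
            pop = crossover(pop, xovr, rng)
        elif op == "M":
            pop = mutate(pop, mutr, rng)
        else:
            raise ValueError(f"unknown operator {op!r}")
    return pop
```

Calling `next_generation(pop, objective, "CS", 0.85, 0.02, random.Random(0))` runs crossover followed by selection. The mutation rate is simply unused in that case, which mirrors the two-operator variants.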
A higher crossover rate introduces new strings into the population more quickly. A low crossover rate may cause stagnation because of the lower exploration rate. The insertion rate is a general measure of how many of the individuals produced are inserted into the new generation. The migration rate characterizes the number of individuals exchanged between subpopulations.

Parameter   Investigated values
GGAP        0.5    0.67   0.8    0.9    -
XOVR        0.65   0.75   0.85   0.95   -
MUTR        0.02   0.04   0.06   0.08   0.1
INSR        0.5    0.6    0.8    0.9    1
MIGR        0.2    0.4    0.6    0.8    0.1

Mathematical model of S. cerevisiae fed-batch cultivation

$$\frac{dX}{dt} = \mu X - \frac{F}{V} X$$

$$\frac{dS}{dt} = -q_S X + \frac{F}{V}\left(S_{in} - S\right)$$

$$\frac{dE}{dt} = q_E X - \frac{F}{V} E$$

$$\frac{dO_2}{dt} = -q_{O_2} X + k_L^{O_2}a \left(O_2^* - O_2\right)$$

$$\frac{dV}{dt} = F$$

where $X$, $S$ and $E$ are the concentrations of biomass, substrate (glucose) and ethanol, [g·l⁻¹]; $O_2$ and $O_2^*$ are the concentrations of oxygen and dissolved oxygen saturation, [%]; $F$ is the feeding rate, [l·h⁻¹]; $V$ is the volume of the bioreactor, [l]; $k_L^{O_2}a$ is the volumetric oxygen transfer coefficient, [h⁻¹]; $S_{in}$ is the glucose concentration in the feeding solution, [g·l⁻¹]; $\mu$, $q_S$, $q_E$ and $q_{O_2}$ are respectively the specific rates of growth, substrate utilization, ethanol production and dissolved oxygen consumption, [h⁻¹].

Specific rates

$$\mu = \mu_{2S}\frac{S}{S + k_S} + \mu_{2E}\frac{E}{E + k_E}$$

$$q_S = \frac{\mu_{2S}}{Y_{SX}}\,\frac{S}{S + k_S}$$

$$q_E = -\frac{\mu_{2E}}{Y_{EX}}\,\frac{E}{E + k_E}$$

$$q_{O_2} = q_E Y_{OE} + q_S Y_{OS}$$

where $\mu_{2S}$, $\mu_{2E}$ are the maximum growth rates on substrate and ethanol, [h⁻¹]; $k_S$, $k_E$ are the saturation constants of substrate and ethanol, [g·l⁻¹]; $Y_{ij}$ are yield coefficients, [g·g⁻¹].

Optimization criterion:

$$J_Y = \sum \left(Y - Y^*\right)^2 \to \min$$

where $Y$ is the experimental data, $Y^*$ is the model predicted data, and $Y = [X, S, E, O_2]$.

Influence of GGAP in MpGA with three genetic operators

The influence of GGAP on model accuracy and convergence time has been investigated.
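The fed-batch model and the optimization criterion defined earlier can be sketched in Python as follows. The sketch assumes that growth is the sum of a substrate term and an ethanol term, that q_E is negative (ethanol is consumed for growth), and that q_O2 is the sum q_E·Y_OE + q_S·Y_OS. The names `Params`, `specific_rates`, `rhs` and `criterion` are hypothetical, and the criterion is written for a single measured variable.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Params:
    mu2s: float  # maximum growth rate on substrate, [1/h]
    mu2e: float  # maximum growth rate on ethanol, [1/h]
    ks: float    # saturation constant of substrate, [g/l]
    ke: float    # saturation constant of ethanol, [g/l]
    y_sx: float  # yield coefficients, [g/g]
    y_ex: float
    y_os: float
    y_oe: float
    kla: float   # volumetric oxygen transfer coefficient, [1/h]


def specific_rates(S: float, E: float, p: Params) -> Tuple[float, float, float, float]:
    """Return (mu, q_S, q_E, q_O2) for given substrate and ethanol concentrations."""
    mu = p.mu2s * S / (S + p.ks) + p.mu2e * E / (E + p.ke)
    q_s = p.mu2s / p.y_sx * S / (S + p.ks)
    q_e = -p.mu2e / p.y_ex * E / (E + p.ke)  # assumed sign: ethanol is consumed
    q_o2 = q_e * p.y_oe + q_s * p.y_os
    return mu, q_s, q_e, q_o2


def rhs(state: Sequence[float], F: float, S_in: float, O2_sat: float,
        p: Params) -> Tuple[float, float, float, float, float]:
    """Right-hand side of the model for state = (X, S, E, O2, V)."""
    X, S, E, O2, V = state
    mu, q_s, q_e, q_o2 = specific_rates(S, E, p)
    dilution = F / V
    dX = mu * X - dilution * X
    dS = -q_s * X + dilution * (S_in - S)
    dE = q_e * X - dilution * E
    dO2 = -q_o2 * X + p.kla * (O2_sat - O2)
    dV = F
    return dX, dS, dE, dO2, dV


def criterion(measured: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared deviations between experimental and model-predicted data."""
    return sum((y - y_star) ** 2 for y, y_star in zip(measured, predicted))
```

In a parameter identification run, a genetic algorithm would propose a `Params` instance, the model would be integrated with any ODE solver, and `criterion` would be evaluated for each of X, S, E and O2.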
          MpGA-SCM           MpGA-SMC           MpGA-CMS           MpGA-MCS
GGAP      J       t, s       J       t, s       J       t, s       J       t, s
0.5       0.0220  100.8910   0.0220  111.7810   0.0221  273.9060   0.0220  307.8440
0.67      0.0221  112.1720   0.0220  141.0940   0.0221  325.5780   0.0220  332.0620
0.8       0.0221  155.4680   0.0220  178.9680   0.0221  321.0160   0.0221  373.1560
0.9       0.0220  170.2660   0.0220  340.6720   0.0221  343.6870   0.0221  349.7500

Influence of GGAP in MpGA with two genetic operators

The influence of GGAP on model accuracy and convergence time has again been investigated.

          MpGA-CS            MpGA-SC
GGAP      J       t, s       J       t, s
0.5       0.0223  267.9220   0.0222  111.5310
0.67      0.0222  331.9690   0.0224  119.7340
0.8       0.0223  333.6250   0.0221  153.3900
0.9       0.0221  357.0160   0.0220  168.2190

Comparison of MpGA results

The optimization criterion values obtained with the six kinds of MpGA are very similar, so there is no loss of accuracy.

The results can be grouped: MpGA-SCM with MpGA-SMC, and MpGA-CMS with MpGA-MCS. The convergence time of the second group is much longer than that of the first group.

The two algorithms without mutation, MpGA-SC and MpGA-CS, can be grouped together too.

When the algorithms are implemented with only two operators, the calculation time is much less, but at the expense of model accuracy.

Performing selection before crossover and mutation (regardless of their order) requires much less computational time for the investigated values of GGAP, XOVR, MUTR, MIGR and INSR.

Results concerning considered GA parameters

GGAP is the most sensitive of the five investigated parameters with respect to convergence time. Up to 40% of the calculation time (in the case of MpGA-SCM) can be saved by using GGAP = 0.5 instead of 0.9 without loss of accuracy.

Varying the crossover rate yields no comparable time saving, but a value of 0.85 for XOVR can be considered more appropriate.

For MUTR, a value of 0.02 can be considered more appropriate.

For INSR and MIGR, no clear tendency can be identified.

Optimal GA parameter values: GGAP = 0.5, XOVR = 0.85, MUTR = 0.02, INSR = 0.9 and MIGR = 0.1.
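The time saving quoted for MpGA-SCM can be checked directly from the tabulated convergence times at GGAP = 0.9 (170.2660 s) and GGAP = 0.5 (100.8910 s). The worked calculation below uses only those two values:

```latex
\[
\frac{t_{0.9} - t_{0.5}}{t_{0.9}}
  = \frac{170.2660 - 100.8910}{170.2660}
  = \frac{69.3750}{170.2660}
  \approx 0.407
\]
```

That is a reduction of roughly 40% in calculation time, while J stays at 0.0220 for both settings.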
Because of the similarity of the results obtained with all six kinds of algorithms, only the results obtained with the MpGA-CS developed here are presented. As a result of parameter identification, the values of the model parameters are respectively: $\mu_{2S}$ = 0.98 [h⁻¹], $\mu_{2E}$ = 0.13 [h⁻¹], $k_S$ = 0.13 [g·l⁻¹], $k_E$ = 0.84 [g·l⁻¹], $Y_{SX}$ = 0.42 [g·g⁻¹], $Y_{EX}$ = 1.67 [g·g⁻¹], $k_L^{O_2}a$ = 96.2329 [h⁻¹], $Y_{OS}$ = 766.7862 [g·g⁻¹], $Y_{OE}$ = 125.5165 [g·g⁻¹], while the CPU time was 288.6720 s and J = 0.0221.

The presented results of applying MpGA-CS to parameter identification of S. cerevisiae fed-batch cultivation show the effectiveness of GA for solving complex nonlinear problems.

Experimental and model data for biomass and substrate concentration

[Figure: two panels titled "Fed-batch cultivation of S. cerevisiae", showing data and model curves for biomass concentration, [g/l], and substrate concentration, [g/l], versus time, [h].]

Experimental and model data for ethanol and dissolved oxygen concentration

[Figure: two panels titled "Fed-batch cultivation of S. cerevisiae", showing data and model curves for ethanol concentration, [g/l], and dissolved oxygen concentration, [%], versus time, [h].]

Analysis and conclusions

Altogether six kinds of multi-population genetic algorithms have been examined:
- four of them differ in the execution order of the selection, crossover and mutation operators;
- two modifications do not perform the mutation operator.

The influence of the genetic algorithm parameters GGAP, XOVR, MUTR, INSR and MIGR has been examined for all six kinds of genetic algorithms, and GGAP has been identified as the most sensitive with respect to convergence time.

Among the algorithms considered here, MpGA-SCM has been identified as the fastest one.
Up to almost 40% of the calculation time can be saved with MpGA-SCM by using GGAP = 0.5 instead of 0.9 without loss of model accuracy.

All modifications of MpGA show the effectiveness of genetic algorithms for solving complex nonlinear problems.

IMACS’11

ACKNOWLEDGEMENTS

This work is partially supported by the European Social Fund and Bulgarian Ministry of Education, Youth and Science under Operative Program “Human Resources Development”, grant BG051PO001-3.3.04/40 and National Science Fund of Bulgaria, grant DID 02-29 “Modeling Processes with Fixed Development Rules”.

Thank you for your attention!