A.V. Spirov, A.B. Kazansky
The Sechenov Institute of Evolutionary Physiology and Biochemistry, Thorez ave. 44, St. Petersburg, 194223, Russia.

Evolutionary Biology and Evolutionary Computations: Parasitic Mobile Genetic Elements in Artificial Evolution

Abstract. Mobile Genetic Elements (MGE), or transposons, are akin to computer viruses: they are autonomous genetic programs that are transmitted horizontally and vertically, interact with the genetic programs of the host, and are aimed chiefly at multiplying the number of their own copies. We have used these characteristic features of transposons to develop new evolutionary search algorithms based on the MGE technique. We applied the MGE technique in computer experiments on the spontaneous complication of self-organizing networks of genes that control development, in the course of their co-evolution with artificial transposons. Prospects for developing a new branch of Genetic Algorithms on the basis of the proposed approach are discussed.

If we can't engineer a computer that will be proud of us, we may have to evolve it.
Kevin Kelly, Out of Control.

An accident, a random change in any delicate mechanism can hardly be expected to improve it. Poking a stick into the machinery of one's watch or one's radio set will seldom make it work better.
Theodosius Dobzhansky, Heredity and the Nature of Man.

1. INTRODUCTION

Progress in the theory and modelling of biological evolution always has an important applied side effect: it stimulates the development of new Evolutionary Computation techniques and of new Artificial Life algorithms in general. In particular, the biological concept of competitive co-evolution has attracted great interest among A-Lifers and specialists in evolutionary computations (Menshutkin, Kazansky, 1986; Kazansky, 1988; Cliff and Miller, 1995; Maley, 1995). The well-known artificial evolving biosystem "Tierra" (Ray, 1991) and its descendants, such as "Bags" and "Evita", are based on mechanisms of co-evolution of species competing for resources.
In the context of the global optimization problem, co-evolutionary systems of competing agents are appealing because the continuous change of the fitness landscape caused by the competitive struggle of each species can prevent the evolutionary search from stagnating in the vicinity of local maxima (Maley, 1995). Hillis (1990) reported a significant rise in the efficiency of the evolution of sorting programs after the introduction of co-evolving parasites (programs that determine the test conditions for the sorting programs). Nowadays, specialists in evolutionary computations and Genetic Algorithms are pinning their hopes on the development of co-evolutionary genetic systems. Recently we contributed to this field by proposing the Mobile Genetic Elements (MGE) technique (Spirov, 1996a,b; Spirov, Samsonova, 1997; Spirov, Kazansky, Kadyrov, 1998). It is based on the co-evolution of self-organizing regulatory genetic networks, which control the process of individual development, and of artificial genomic parasites, the transposons. This new approach to modelling evolutionary processes was formulated in our recent work on the early development of the Drosophila embryo. It has also been tested successfully on the artificial ant problem. Formalization and generalization of the approach made it possible to open a new, promising line of development in evolutionary computations. In this work we describe the biological background of the new approach and set forth the MGE technique. We also present some results of applying the technique to computer modelling of the evolution of Drosophila regulatory genetic networks. Some properties of the proposed artificial co-evolving system that are promising for evolutionary computations, such as its great evolvability, are discussed as well.

2. BIOLOGICAL BACKGROUND

Substantial progress in the study of the molecular mechanisms of individual development has led to the discovery of hundreds of genes whose sole function is the control and regulation of the activity of other genes (Jackle et al., 1992).
These genes make up so-called self-organizing genetic networks (Kauffman, 1993, 1995). The network of regulatory genes serves to orchestrate genome activity during embryo development. It is widely recognized that the evolutionary self-organization of genetic ensembles occurs by means of gene duplications (Altenberg, 1994; Li and Noll, 1994; Wagner, 1996). The typical story of the origin of a gene is as follows: (1) gene duplication; (2) its fixation in the population through selection or drift; (3) maintenance of gene function by selection; (4) gene evolution under mutation and selection. In the 1940s Barbara McClintock, later a Nobel prize winner, predicted the existence of pieces of DNA that could jump in and out of chromosomes, the 'jumping genes'. Mobile genetic elements, or transposons, have this property of self-replication within the genome. It has been estimated that 80% of spontaneous mutations are caused by transposons. Many repeated sequences in genomes may have originated as potential transposons, favoured by selection at the genetic level. Recently obtained data indicate that gene transpositions are involved in a process of co-evolution with the host genetic network. Some transposons may have co-evolved with the genome of their host as a result of selection at the organismic or population level. This type of selection favours transposons that introduce useful variation through gene rearrangement. Indeed, it can be interpreted as an evolving macromutational mechanism. A set of possible strategies of interrelation between Transposable Elements (TE) and the Drosophila genome is discussed in the biological literature. First of all, the destabilization of the host's genome by transposons looks very promising in the context of modelling evolution. McClintock characterised these genetic phenomena as "genomic shock" (McClintock, 1984).
In particular, it is worth noting the phenomenon of hybrid dysgenesis (Lozovskaya et al., 1995), in which multiple unrelated TEs are mobilised simultaneously through destabilisation of the host genome. As a rule, TEs remain silent in the Drosophila genome until some stress factor (temperature, irradiation, DNA damage, the introduction of foreign chromatin, viruses, etc.) activates them. The insertion of activated TEs into a number of loci alters the pattern of gene expression. It is this burst of transpositions that we implemented in our computer experiments. In other words, we simulate an unbalanced TE - Drosophila genome system. The estimated typical rate of transposition in natural populations of Drosophila (number of transpositions per element per generation) is of the order of 10^-4 (Charlesworth et al., 1992). The rate of transposition in our computer experiments is two or three orders of magnitude higher, i.e. of the order of 10^-2 to 10^-1.

3. MODELLING OF GENETIC NETWORKS EVOLUTION

Gene ensembles controlling segmentation in insects (Patel, 1994) were chosen as a very convenient object for modelling. A characteristic feature of the organization of genetic networks in these organisms is that gene activity at early and late stages of embryogenesis is controlled by different regulatory elements; another is the tissue- and stage-specific activation of transposons (Ding and Lipshitz, 1994; Smith and Corces, 1995). Therefore, besides genes acting at early stages of embryogenesis, i.e. at the stage of segmentation, we incorporate into our model several genes that function only at subsequent stages of development. We suppose that these genes could be activated at the stage of segmentation as a result of the insertion of transposable elements. Our approach to simulating the evolution of a genetic network is based on the scheme of population dynamics, viz. repeating cycles of mutation and/or crossover, selection and reproduction.
This is the common scheme generally used in GA and GP techniques. We proposed a new Transposable Elements technique (Spirov, Kazansky, Kadyrov, 1998), which extends the classical scheme. Namely, we use the following non-classical elements and operators: (1) variable-length genomes and the corresponding variable-length operators of duplication, elimination and random addition; (2) parasitic mobile genetic elements in evolving genomes; (3) operators of competition and co-operation between parasitic genetic elements; (4) operators of transposition.

VARIABLE-LENGTH GENOMES. We use a variable-length genome (chromosome), which is non-traditional for the classical GA approach but inherent in the GP technique. As in the GP technique, the growth of the genome in our algorithm is limited. Specifically, every genome can have up to nine separate variable-length strings (chromosomes). Each string can include up to six four-letter words separated by spacers. We use a three-letter alphabet (A, T, C), which gives 3^4 = 81 different four-letter words in total. Each gene recognises nine target sites in other genes. Each word corresponds to a number that is used to evaluate the parameters of Hill's law. In our concrete model every chromosome contains only one gene. The variable-length operators of duplication, elimination and random addition can be applied independently to every chromosome (gene). The list of operators acting on chromosomes includes the above-mentioned variable-length operators, the classical operators of point mutation and recombination, and the operators of transposition defined below.

VARIABLE-LENGTH OPERATORS. The action of these operators on a genome results in the duplication, the addition of a fragment, or the deletion of a randomly chosen chromosome (i.e. gene).

SIMULATIONS OF RECRUITMENT OF NEW GENES INTO A GENE NETWORK

We call the process of incorporation of a new gene into a network gene recruitment. In the initial state the genotypes of individuals include one active gene (gene A).
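The genome representation and the variable-length operators described under VARIABLE-LENGTH GENOMES can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the numbering of words by their position in the enumeration, and the rule that a genome always keeps at least one gene are assumptions introduced here for illustration only. Spacers, target-site recognition and the Hill's law evaluation are left out.

```python
import itertools
import random

ALPHABET = "ATC"        # three-letter alphabet, as in the text
WORD_LEN = 4            # four-letter words
MAX_WORDS = 6           # at most six words per chromosome (one gene each)
MAX_CHROMOSOMES = 9     # at most nine chromosomes per genome

# All 3**4 = 81 words; numbering them 0..80 is an assumption of this sketch.
WORDS = ["".join(p) for p in itertools.product(ALPHABET, repeat=WORD_LEN)]
WORD_TO_NUMBER = {word: index for index, word in enumerate(WORDS)}


def random_gene(rng):
    """Return a gene as a variable-length list of 1..MAX_WORDS words."""
    return [rng.choice(WORDS) for _ in range(rng.randint(1, MAX_WORDS))]


def duplicate(genome, rng):
    """Duplication: copy a randomly chosen gene if the genome has room."""
    if genome and len(genome) < MAX_CHROMOSOMES:
        return genome + [list(rng.choice(genome))]
    return genome


def eliminate(genome, rng):
    """Elimination: delete a randomly chosen gene, keeping at least one."""
    if len(genome) > 1:
        i = rng.randrange(len(genome))
        return genome[:i] + genome[i + 1:]
    return genome


def add_random(genome, rng):
    """Random addition: append a freshly generated gene if there is room."""
    if len(genome) < MAX_CHROMOSOMES:
        return genome + [random_gene(rng)]
    return genome


rng = random.Random(0)
genome = [random_gene(rng)]          # start from a single gene
for operator in (duplicate, add_random, eliminate):
    genome = operator(genome, rng)
print(len(WORDS), len(genome))       # prints: 81 2
```

Returning a new list from each operator, rather than modifying the genome in place, keeps parent and offspring genomes independent, which is convenient when the same parent contributes several offspring.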
The gradient of the M factor (morphogen), which controls A-gene expression in a concentration-dependent manner, is predetermined in the initial population: a low concentration of the M factor activates gene A, while a high concentration represses it. The target sites (4-letter words) for M in gene A become the first target for the attack of transposable elements. The pool of free genes includes genes that are unused at the initial stage. By definition, a gene can regulate only those genes that contain target sites for its product. On the other hand, from generation to generation the population is infected with site-specific mobile genetic elements coming from an external "source". Infection is possible only if gene A contains one of the target sites for the morphogen M. So, from the very beginning the whole population can be infected with this "virus". In the model under discussion only stabilizing selection of the A-gene pattern is used. However, the specially introduced MGE is able to insert only into the regulatory region of the initial, "wild" type of the A-gene. After insertion of this element (the deleter), an infected individual can live no longer than 9 cycles of reproduction, leaving only scanty progeny. The transposon can be transmitted both horizontally (to another host) and vertically (to the next generation). In this situation mutants with a transformed network of regulatory relationships could have an indirect (conditional) selective advantage. We mean individuals with a newly recruited gene and with an entirely substituted regulatory region of the A-gene. In this computer model the indirect selective pressure causes the recruitment of a gene into the network and the closing of a "reserve" cascade of regulation of gene A by the neophyte gene B and, later on, by gene D as well. It is clear that the lack of a site for the deleter's insertion guarantees a selective advantage to the mutants that have this property.
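The site-specific infection rule just described can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the model used in the experiments: the dictionary layout of a host, the function names and the example words are hypothetical, every host is exposed to the external source in every cycle, and reproduction (including the scanty progeny of infected hosts) is not modelled. Only two facts are taken from the text: insertion requires the target site in the regulatory region of gene A, and an infected individual lives no longer than 9 cycles.

```python
MAX_CYCLES_AFTER_INSERTION = 9   # from the text: at most 9 cycles after insertion


def can_insert(regulatory_sites, target_site):
    """Site-specific element: insertion needs the target word to be present."""
    return target_site in regulatory_sites


def expose(host, target_site):
    """Infect an uninfected host whose gene A carries the target site."""
    if host["cycles_left"] is None and can_insert(host["gene_A_sites"], target_site):
        host["cycles_left"] = MAX_CYCLES_AFTER_INSERTION


def step(population, target_site):
    """One reproductive cycle: expose hosts, age the infected, drop the dead."""
    survivors = []
    for host in population:
        expose(host, target_site)
        if host["cycles_left"] is not None:
            if host["cycles_left"] == 0:
                continue                 # infected host has used up its cycles
            host["cycles_left"] -= 1
        survivors.append(host)
    return survivors


# Hypothetical hosts: the wild type carries the target word, the mutant does not.
wild = {"gene_A_sites": ["ATCA", "TTAC"], "cycles_left": None}
mutant = {"gene_A_sites": ["CCAT", "TTAC"], "cycles_left": None}

population = [wild, mutant]
for _ in range(9):
    population = step(population, "ATCA")
after_nine = len(population)             # both hosts are still present
population = step(population, "ATCA")    # the infected wild type is removed
print(after_nine, len(population))       # prints: 2 1
```

In this sketch the mutant survives simply because the element finds no insertion site in it, which is the source of the indirect selective advantage discussed in the text.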
The mutants emerge very soon in the course of computer evolution but, in view of the indirect character of selection, they begin to prevail in the population only after the lapse of many hundreds of cycles of reproduction. Moreover, the indirect selection pressure results in high heterogeneity of the mutants, because in the model they are selected not only by score but by resistance to the MGE as well. Furthermore, variability of the deleter's site-specificity is allowed in the model of genetic network evolution, although this variability is realised with very low frequency. In this case, insertion of the MGE is possible in the regulatory regions of gene A or gene B (provided that a suitable sequence for insertion is present). As a result, if suitable new lines of this genetic parasite emerge late enough in the course of computer evolution, then prospects appear for the selection of mutants with the newly recruited gene D and with the closure of a new "reserve" cascade of regulation by genes B and D. Genes A, B and, later on, gene D are activators of their own targets, working in accordance with the concentration-dependent mechanism discussed earlier.

RESULTS AND DISCUSSION

Analysing the results of a set of computer experiments with the model of genomic network evolution, we came to the following conclusions:
1. The probability of spontaneous emergence of structurally redundant genomic networks with several neophyte genes is very high at the initial stages of artificial evolution. (This effect is one possible candidate for explaining the so-called Cambrian explosion, the well-known outburst of biodiversity in the history of the biosphere that started about 500 million years ago);
2. Populations of genomic networks with one or two and, afterwards, with three or four newly recruited genes appear within a very short period of "big explosion";
3. The emergence of mutants of genomic parasites always accompanies the explosion of new forms of the hosts;
4.
After several tens of reproductive cycles of computer evolution, the numbers of practically all redundant genomic networks and of their genetic parasites are abruptly reduced. As a rule, only two forms of genomic networks (the initial network and one of the redundant networks) dominate for the very long subsequent period, lasting for thousands of reproductive cycles;
5. Sporadic fluctuations in the abundances of host networks and their parasites are characteristic of this long period of gradual development;
6. There is a high probability that the explosion of diversity of redundant forms and their parasites will be repeated thousands of reproductive cycles after the last explosion;
7. In general, artificial evolution has the features of a quasi-periodic process in which short periods of diversity explosion alternate with long periods of relatively gradual development.

The successive complication of the functional organization of the genome in the course of computer evolution is evident. It is essential that this complication is the result of indirect selection pressure. We did not prescribe any explicit criteria of selective advantage for the genomes that formed these self-organizing genomic regulatory networks. The succession of events in the computer evolution of genomic networks looks very realistic from a biological point of view. It corresponds to our knowledge of the inter-relationships between the host's genome and MGE and looks promising for application in the field of evolutionary computations. Recruitment of genes and functional complication of networks in our model are possible only because of the activity of mobile parasitic elements. The co-evolution of host programs and parasite subroutines provides a new insight into the evolutionary process in general. The co-evolutionary cycle of the host-parasite system has new features that are not reducible to the sum of the evolutions of each species taken separately.
These simulations also contribute to the old evolutionary debate between the adherents of gradualism and those of punctuated equilibrium. Gradualism maintains that evolutionary changes are small but constant between generations of organisms; punctuated equilibrium proposes that evolutionary changes come suddenly, spurred on by large environmental disturbances. Our computer experiments demonstrate the typically non-gradual character of computer evolution, in which long quasi-equilibrium dynamics is suddenly, from time to time, interrupted by bursts of many new complicated forms (evolutionary outbursts).

REFERENCES

Altenberg, L. The evolution of evolvability in genetic programming. Pp. 47-74 in K. E. Kinnear, ed. Advances in Genetic Programming. MIT Press, Cambridge, (1994).
Bronner G; Taubert H; Jackle H. Mesoderm-specific B104 expression in the Drosophila embryo is mediated by internal cis-acting elements of the transposon. Chromosoma 103: 669-75 (1995).
Bucheton A. The relationship between the flamenco gene and gypsy in Drosophila: how to tame a retrovirus. Trends Genet 11: 349-353 (1995).
Charlesworth B, Lapid A, Canada D. The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. I. Element frequencies and distribution. Genet Res 60: 103-114 (1992).
Cliff, D. and G.F. Miller. Tracking the Red Queen: Measurements of Adaptive Progress in Co-Evolutionary Simulations. In: Advances in Artificial Life. Proceedings of the Third European Conference on Artificial Life, pp. 200-218, Springer-Verlag, Berlin Heidelberg, (1995).
Dellaert, F. and R.D. Beer. Toward an evolvable model of development for autonomous agent synthesis. In: Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems. R. Brooks and P. Maes (Eds.). MIT Press, (1994).
Ding D. and Lipshitz HD. Spatially regulated expression of retrovirus-like transposons during Drosophila melanogaster embryogenesis. Genet Res 64: 167-81 (1994).
Hillis W.D. Co-evolving parasites improve simulated evolution as an optimization procedure. Physica D 42: 228-234 (1990).
Jackle H; Hoch M; Pankratz MJ; Gerwin N; Sauer F; Bronner G. Transcriptional control by Drosophila gap genes. J Cell Sci Suppl 16: 39-51 (1992).
Kazansky A.B. Simulation of phenotypic variation in fish populations by the method of evolutionary modeling. In: Mathematical modeling of complex biological systems. Materials of the X All-Union School, Moscow, "Nauka", 1988, pp. 80-88 (in Russian).
Kauffman S.A. The Origins of Order: Self-Organization and Selection in Evolution. New York: Oxford University Press, (1993).
Kauffman S.A. At Home in the Universe: The Search for Laws of Self-Organization and Complexity. New York: Oxford University Press, (1995).
Li, X. and Noll M. Evolution of distinct developmental functions of three Drosophila genes by acquisition of different cis-regulatory regions. Nature 367: 83-87 (1994).
Lozovskaya ER; Hartl DL; Petrov DA. Genomic regulation of transposable elements in Drosophila. Curr Opin Genet Dev 5: 768-73 (1995).
McClintock B. The significance of responses of the genome to challenge. Science 226: 792-801 (1984).
Maley C. The Coevolution of Mutation Rates. In: Advances in Artificial Life. Proceedings of the Third European Conference on Artificial Life, pp. 219-233, Springer-Verlag, Berlin Heidelberg, (1995).
Menshutkin V.V. and Kazansky A.B. Investigation of the processes of adaptation and succession in fish communities by evolution modeling technique. In: Problems of Ecological Monitoring and Ecosystem Modelling, Leningrad, Gidrometeoizdat, 1986, pp. 277-293 (in Russian).
Patel NH. Developmental evolution: Insights from studies of insect segmentation. Science 266: 581-590 (1994).
Ray, T. S. 1992. An approach to the synthesis of life.
In Langton, C., Farmer, J., Rasmussen, S., and Taylor, C., editors, Artificial Life II: Proceedings Volume of Santa Fe Conference, volume XI. Addison Wesley: series of the Santa Fe Institute Studies in the Sciences of Complexity, Redwood City, CA.
Smith P.A. and Corces V.G. The suppressor of Hairy-wing protein regulates the tissue-specific expression of the Drosophila gypsy retrotransposon. Genetics 139: 215-228 (1995).
Spirov A.V. Self-Assemblage of gene networks in evolution via recruiting of new netters. Lecture Notes in Computer Science 1141: 91-100 (1996a).
Spirov A.V. Self-organisation of gene networks in evolution via recruiting of new netters. Pp. 399-405 in: Proceedings of the First International Conference on Evolutionary Computations and Its Applications, Moscow, Russia, (1996b).
Spirov A.V., A.B. Kazansky, A.S. Kadyrov. Utilizing of "Parasitic" Mobile genetic elements in genetic algorithms. In: Proceedings of the International Conference on Soft Computing and Measurements (SCM'98), St. Petersburg, 22-26 June, 1998, pp. 266-269.
Spirov A.V. and Samsonova M.G. Strategy of Co-evolution of Transposons and Host Genome: Application to Evolutionary Computations. Proceedings of the Third Nordic Workshop on Genetic Algorithms and their Applications (3NWGA), 20-22 August 1997, Helsinki, Finland, Ed. Jarmo T. Alander, Finnish Artificial Intelligence Society, pp. 71-82, (1997).
Wagner A. Genetic redundancy caused by gene duplications and its evolution in networks of transcriptional regulators. Biol Cybern 74: 557-567 (1996).