The Use of Computer Simulation in Studying Biological Evolution:
Pure Possible Processes, Selective Regimes and Open-ended Evolution
Philippe Huneman
IHPST (CNRS / Université Paris I Sorbonne)
• The interplay between evolutionary biology and
computer science
• Bioinformatics, Biocomputation
• Genetic algorithms (lexicon: « crossing over », genes, etc.)
• The radical AL claim: digital organisms are
themselves living entities (rather than
simulations of them) (Adami 2002). Idea of non-carbon-based forms of life as the essence of life
• Version # 2: Darwinian evolution is an
« algorithm » (Dennett, see Maynard Smith 2000)
• What are simulations doing / teaching ?
• What is the role of natural selection in them ?
• Investigates the relations between biological
evolution and computer simulations of
evolving entities through natural selection.
I. TYPOLOGY OF COMPUTER
SIMULATIONS IN EVOLUTIONARY
THEORY
• A. Kinds of role for selection
A1. Formal selection context
Todd and Miller (1995) on sexual selection
(sexual selection is more of an incentive for exploration than natural selection, since females, through mate choice, internalize the constraints of natural selection)
Maley on biodiversity
(the emergence of new species is conditioned more by geographical barriers than by adaptive potential)
Mikel Maron (2004): moths and industrial melanism
A2. No selection context
Boids (Reynolds)
Chu and Adami (2000): simulation of
phylogenies whose parameter is the mean
number of same-order daughter families of a
family
McShea (1996, 2005): increase of complexity
with no selection
B. Use: weak and strong
• B1. Weak : model is used to test one
hypothesis on one process; it simulates the
behavior of the entities (boids; Maley’s
biodiversity; etc.). Way to test hypotheses
about the world
• B2 Strong : entities of the model don’t
correspond to real entities; the simulation is
meant to explore the kinds of behaviors of digital
entities themselves (Ray’s Tierra, Holland’s Echo
etc.). Hypotheses are made about the model
itself
“digital organisms” as “defined by the sequence of
instructions that constitute their genome, are not
simulated: they are physically present in the
computer and live there” (Adami, 2002)
• Echo: unrealistic assumptions concerning
reproduction, and the absence of reproductive
isolation for species, make Echo a poor
model of evolutionary biology (Crubelier
1997)
Langton-Sayama Loop
II. NATURAL SELECTION AND PURE
POSSIBLE PROCESSES
• Both weak and strong simulations contain causal
processes (i.e. counterfactual dependencies
between classes of sets of cells at one step and the
global state at the next step).
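[Diagram: a cellular automaton at two successive steps]
Cells at step n: $a_1, a_2, \dots, a_{1+P}, a_{2+P}, \dots, a_k, \dots$
Sets of cell states at step n: $A_1^{n,j}, A_2^{n,j}, \dots, A_k^{n,j}, \dots$
Cells at step n+1: $b_1^{n+1}, b_2^{n+1}, \dots, b_k^{n+1}, \dots$
Property $P_n$ at step n: $P_n = \bigwedge_i \bigvee_j A_i^{n,j}$ (conjunction on i of the disjunctions on j)
Property $P_{n+1}$ at step n+1: all the $b_i^{n+1}$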
“If P n had not been the case, Pn+1 would not have been the case.”
Causation as counterfactual dependence between steps in CAs
(Huneman, Minds and Machines, 2008)
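The counterfactual reading of causation between steps can be illustrated with a minimal sketch. It assumes a one-dimensional elementary cellular automaton with periodic boundaries; the rule number, the example state and the helper names `step` and `depends_counterfactually` are illustrative choices, not taken from the cited paper. The check asks: had cell i been different at step n, would the global state at step n+1 have been different?

```python
from typing import List


def step(cells: List[int], rule: int = 110) -> List[int]:
    """One synchronous update of an elementary CA with periodic boundaries."""
    n = len(cells)
    out = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit number
        out.append((rule >> index) & 1)              # look up the rule table bit
    return out


def depends_counterfactually(cells: List[int], i: int, rule: int = 110) -> bool:
    """True if flipping cell i at step n changes the global state at step n+1."""
    flipped = list(cells)
    flipped[i] ^= 1
    return step(cells, rule) != step(flipped, rule)


# Hypothetical example state at step n.
state_n = [0, 0, 0, 1, 0, 0, 0, 0]
state_n1 = step(state_n)
print(state_n1, depends_counterfactually(state_n, 3))
```

Under rule 0 every state maps to the all-zero state, so no such dependence exists; this gives a simple contrast case for the same test.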
-> In simulations with a « formal selection » context,
those causal processes are actual « selective
processes »
• Yet the entities in the simulations cannot
exactly match biological entities:
In Echo, species do not arise easily; in Tierra,
there are no lineages; etc.
If one system is designed to study some level of
biological reality, the other levels are not ipso
facto given (whereas if you have, e.g.,
organisms you have genes and species etc.)
-> In actual biology : all levels of the hierarchies
are acting together
• So computer simulations display « pure
possible processes » concerning the modelled
entities, which are located at a target level of the
hierarchy
(no implicit entangling between levels)
In the case of formal selection simulations,
« pure » selective processes occur
• Ex. of « natural selection » sensu Channon :
Echo or Hillis’s coevolution between sorting
problems: « natural selection » simulations.
Yet in Echo, for ex., the class of possible
actions is limited
III. THE VALIDATION PROBLEM FOR
COMPUTER SIMULATIONS
What do such simulations tell us?
• They correlate pure possible processes with
patterns of evolution
• They cannot prove that some process caused
some evolutionary result, but they provide
candidate causal explanations: « if pattern X is
met, then process x is likely to have produced it »
• Other causal processes may also have been at
work, but they were not as significant for
that outcome (noise?)
• Even if we have no idea of the ecological context,
hence of the actual selective pressures
• Adami, Pennock, Ofria and Lenski (2003) show
that evolution is likely to have favoured
complexity: their point is that, if complexity
increases in their sense, then
deleterious mutations might have been
selected; a decrease in fitness might thus
have been involved in the stabilisation of
more functionally complex genomes.
• Chu and Adami (2000) investigate the
patterns of abundance of taxa: if the
distribution of taxa resembles a certain power-law scheme X, it is likely that the parameter m
(mean number of same-order daughter
families of a family) has been, in nature, close
to the value of m involved in X (i.e. m = 1).
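The inference from pattern to parameter can be sketched with a simple branching process. This is an illustrative stand-in, not Chu and Adami's actual model: it assumes each family founds a Poisson-distributed number of daughter families with mean m, and the names `poisson` and `clade_size`, the cap and the sample size are all hypothetical. For a critical branching process (m = 1) the distribution of total clade sizes is known to be heavy-tailed, whereas for m < 1 large clades are exponentially rare.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Sample a Poisson variate (Knuth's multiplication method, fine for small means)."""
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def clade_size(rng: random.Random, m: float, cap: int = 10_000) -> int:
    """Total number of families descended from one founder (founder included).

    The cap stops the rare, very large clades that arise near m = 1.
    """
    total, current = 1, 1
    while current > 0 and total < cap:
        offspring = sum(poisson(rng, m) for _ in range(current))
        total += offspring
        current = offspring
    return total


rng = random.Random(42)
sizes_critical = [clade_size(rng, 1.0) for _ in range(2000)]     # m = 1
sizes_subcritical = [clade_size(rng, 0.5) for _ in range(2000)]  # m = 0.5
print(max(sizes_critical), max(sizes_subcritical))
```

Comparing the two samples with an observed abundance distribution is what would license the hedged conclusion that m was close to 1; the simulation alone does not establish it.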
The validation problem
• Epstein (1999) : the case of Anasazi
settlements
• That does not prove that the rules ascribed to
individuals are the accurate ones
see also Reynolds’s flocking boids: the simulation excludes a
centrally controlled social organisation (but
we need other assumptions to make this
plausible)
• Even more so: the case of Arakawa’s simulations
in meteorology
• Analysis by Küppers & Lenhard 2001, Lenhard
2007: drop « realism » in order to achieve
efficiency
How are simulations to be validated in biology ?
• McShea on complexity.
Challenges Bonner’s (1988) explanation of the increase of complexity
through a selected increase of size in various lineages.
McShea (2005) suggests that complexity increase can be produced
with no natural selection, only variation (complexity defined by
diversity);
models also produce patterns of complexity increase
under various constraints (driven trends vs passive trends,
with no selection).
The pattern found in the fossil record may be produced by such a
process, but we need to have an idea about the processes likely to
have actually occurred.
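A passive trend of the kind mentioned above can be sketched as an unbiased random walk with a lower bound. This is a toy illustration under stated assumptions, not McShea's own model: each lineage's complexity moves up or down by one unit with equal probability, there is no selection, and complexity cannot fall below a minimum value. The function name `passive_trend` and all parameter values are hypothetical.

```python
import random
from typing import List


def passive_trend(n_lineages: int = 500, n_steps: int = 200,
                  floor: float = 1.0, seed: int = 0) -> List[float]:
    """Unbiased random walk in complexity with a lower bound and no selection."""
    rng = random.Random(seed)
    complexity = [floor] * n_lineages
    for _ in range(n_steps):
        for i in range(n_lineages):
            change = rng.choice((-1.0, 1.0))                  # no bias in either direction
            complexity[i] = max(floor, complexity[i] + change)  # cannot drop below the floor
    return complexity


final = passive_trend()
print(min(final), sum(final) / len(final), max(final))
```

The mean and the maximum rise over time while the minimum stays near the floor, which is why such a model can serve as a null hypothesis against a selection-driven trend.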
• A minimal characterisation of computer
simulations in evolutionary biology : they
provide candidate explanations (pure possible
processes) and null hypotheses for
evolutionary patterns
• For the same reason (they leave out the
impure processes, which are the ones really
occurring) they can’t prove anything by
themselves
• An example worth investigating: Hubbell’s
ecological neutral theory (2001)
• It skips the level of individual selection, yet
generates the same outcome as what we see
about succession, stability and persistence in
communities
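The core of a neutral community model can be sketched as zero-sum drift: every death is immediately replaced, and all individuals have the same chance of supplying the replacement regardless of species. The sketch below is a simplified illustration, not Hubbell's full theory: it assumes a single local community with a small per-replacement speciation probability, and the names and parameter values are hypothetical.

```python
import random
from collections import Counter


def neutral_community(size: int = 200, steps: int = 20_000,
                      speciation: float = 0.01, seed: int = 1) -> Counter:
    """Zero-sum neutral drift: species identity never affects birth or death."""
    rng = random.Random(seed)
    community = [0] * size          # start from a single species labelled 0
    next_label = 1
    for _ in range(steps):
        dead = rng.randrange(size)  # one individual dies at random
        if rng.random() < speciation:
            community[dead] = next_label   # replaced by a brand-new species
            next_label += 1
        else:
            # Replaced by the offspring of a randomly chosen individual
            # (for simplicity the parent may be the dying individual itself).
            community[dead] = community[rng.randrange(size)]
    return Counter(community)


abundances = neutral_community()
print(sorted(abundances.values(), reverse=True))
```

The resulting rank-abundance list can be compared with field data; a good fit makes neutrality a candidate explanation, not a demonstrated one.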
IV. APPLICATION: DISCONTINUITIES
IN EVOLUTION
4.1. The longstanding problem with innovations
Darwinism is gradualist (small mutations
selected etc.)
Cumulative selection accounts for adaptations
• Novelties,
• Innovations (qualitative, eg morphological,
differences);
• Key innovations : trigger adaptive radiation,
and new phylogenetic patterns (avian wing,
fish gills, language…) – id est, phylogenetic
and ecological causal patterns
• Pattern and processes : the role of
« punctuated equilibria theory » (Eldredge
and Gould 1976)
An issue with discontinuity
• Problem : the fitness value of half a novelty ? (half a
wing !)
-> Solutions :
- Find a benefit for each stage in various species (Darwin
on the eye)
- Conceive of it as an exaptation (ex. feathers) (Gould
and Vrba 1981)
- Developmental processes (Gould, 1977; Muller and
Newman, 2005, etc.): variation is not « minor », it’s a
rearrangement of structures through shuffling of
developmental modules/time (as such, the punctuated
equilibria pattern doesn’t require a specific process)
4.2. Exploring discontinuity:
Compositional evolution (Watson 2005):
“evolutionary processes involving the
combination of systems and subsystems of semi-independently preadapted genetic material”
(p. 3).
• consideration of building blocks obeying some
new rules that are inspired by the biological
phenomena of sex and of symbiosis proves that
in those processes non gradual emergence of
novelties is possible.
• 1. A system with weak interdependencies
between parts can undergo linear evolution:
increases in complexity are linear functions of
the values of the variables describing the
system. Algorithms looking for optimal
solutions in this way are called “hill-climbers”;
they are paradigmatically gradual. They easily
evolve systems more complex in a quantitative
way, but they can’t reach systems that would
display innovations.
• 2. If you have arbitrarily strong interdependencies
between the parts, then evolving a new complex
system will take exponential time (time increases
as an exponential function of the number of
variables). Here, the best algorithm to find
optimal solutions is random search.
• 3. But if you have modular interdependencies
(encapsulated parts, etc.) between parts, then
evolving new complex systems is a polynomial
function of the variables. (Watson 2005, 68-70)
• Algorithms of the “divide-and-conquer” class
divide the optimisation problem into subparts, and in turn
divide each subpart into further subparts: the initial
exponential complexity of the optimisation problem
approached through random search is thereby divided
each time the general system is divided, so that
in the end the problem has polynomial complexity.
• Those algorithms illustrate how to evolve systems that
are not gradual or linear improvements of extant
systems; but as polynomial functions of the variables,
they are feasible in finite time, unlike random search
processes.
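The contrast between the algorithmic classes can be made concrete with a toy fitness function. The sketch below is a simplified stand-in, not Watson's own test problem: it assumes fully separable modules (no hierarchical interdependencies), each scored by a deceptive "trap" in which only the all-ones block is optimal while single-bit improvements lead towards all zeros. The constants `K` and `MODULES` and all function names are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple

K = 4        # bits per module (hypothetical value)
MODULES = 5  # number of modules (hypothetical value)


def module_score(block: Sequence[int]) -> int:
    """Deceptive trap: all ones is best, otherwise fewer ones score higher."""
    ones = sum(block)
    return K if ones == K else K - 1 - ones


def fitness(genome: Sequence[int]) -> int:
    return sum(module_score(genome[i:i + K]) for i in range(0, len(genome), K))


def hill_climb(genome: Sequence[int]) -> List[int]:
    """Gradual search: accept any single-bit flip that improves fitness."""
    genome = list(genome)
    improved = True
    while improved:
        improved = False
        for i in range(len(genome)):
            candidate = genome[:]
            candidate[i] ^= 1
            if fitness(candidate) > fitness(genome):
                genome, improved = candidate, True
    return genome


def divide_and_conquer() -> Tuple[List[int], int]:
    """Solve each module exhaustively, then compose the partial solutions."""
    genome: List[int] = []
    evaluations = 0
    for _ in range(MODULES):
        candidates = list(product((0, 1), repeat=K))
        evaluations += len(candidates)
        genome.extend(max(candidates, key=module_score))
    return genome, evaluations


start = [1, 0, 0, 0] * MODULES
climbed = hill_climb(start)
composed, cost = divide_and_conquer()
print(fitness(climbed), fitness(composed), cost, 2 ** (K * MODULES))
```

Here the hill-climber settles on the local optimum of all zeros, while composition of module solutions reaches the global optimum in 80 module evaluations, against 2^20 candidate genomes for exhaustive search over the whole system.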
• “Compositional evolution” concerns pure
processes that embody those classes of
algorithms with polynomial rates of
complexification, and have genuine biological
correspondents: sex; symbiosis. “mechanisms
that encapsulate a group of simple entities into a
complex entity” (Watson 2005, 3), and thus
proceed exactly in the way algorithmically proper
to polynomial-time complexity-increasing
algorithms like “divide and conquer”.
• Watson refined the usual crossover clause in
GA, integrating various algorithmic devices
(for ex. “messy GA”, according to Goldberg,
Korb and Deb 1989) in order to account for
selection on blocks that take into account
correlation between distant blocks, hence
creation of new blocks (Watson 2005, 77).
• This proves that processes formally
structured like those encapsulated processes,
such as symbiosis, endosymbiosis, and maybe
lateral gene transfer, are likely to have
provided evolvability towards the most complex
innovations, the ones not reachable through
gradual evolution
• The bulk of the demonstration is the identity between algorithmic
classes (hill-climbing, divide-and-conquer, random search) and
evolutionary processes (gradual evolution, compositional
evolution).
• So the solution of the gradualism issue is neither a quest for a non-Darwinian explanation (“order for free”, etc.), nor a reassertion of
the power of cumulative selection that needs to be more deeply
investigated (Mayr *), but the formal design of new modes of
selective processes, of the reasons for their differences, and of the
differences between their evolutionary potentials. In this sense,
discontinuities in evolution appear as the explananda of a variety of
selective processes whose proper features and typical evolutionary
patterns are demonstrated by computer science
Open-ended evolution
• Potential for discontinuities and novelties is
constant or increasing
• New adaptive radiations (wings for insects
and birds, etc.) open possibilities for
other novelties
• Not predictable, but retrodictable
Modelling open-ended evolution
• Question : what is specific to evolution in the
biosphere ?
• Limits in modeling open ended evolution in
Alife (Bedau and Packard 1998)
• Classify possible evolutionary patterns, with
criteria that will take into account the degree
of likeliness of discontinuities and
emergences.
• Those patterns will include classes of the pure
possible processes that are directly
implemented within the computational
devices, and appear to be objects of
investigation in computer sciences.
• Bedau and Packard (1998): three kinds of emergence.
Class I is no emergence, as in an Echo simulation with no
selection (what they call the “Echo neutral shadow”);
class II is “bounded emergence”, as in Holland’s (1995) GA
Echo; class III is unbounded emergence, manifest in the
Phanerozoic fossil record, i.e. the history of Life.
• “Bounded” for Bedau and Packard means that the
range of adaptations exhibited is somehow finite,
which is not the case in class III
• Intuition : no new environment to be colonized in
digital evolution
Channon’s classification (2002)
1. Artificial selection in the SAGA simulation; 2.
natural selection of program codes in Ray’s
Tierra, which now appears to be a limited form of evolution;
3. less limited evolution through Channon’s
“natural selection” in the Geb simulation
Is this class 3 the same as Bedau and Packard’s class III
(Phanerozoic record)?
Typology in terms of driving processes
• No selection. Phase transitions, etc.
• Gradual evolution. Smooth landscapes,
cumulative selection, problem of shifting balance
theory,
• Compositional / discontinuous evolution. Moving
landscapes (not smooth); problems of facilitators
of evolution (Wagner and Altenberg 1996:
evolvability as constraints on the genotype-phenotype map). No fixed optima, hence some
open-ended evolution.
• Local patterns of evolution can be simulated,
hence providing candidate processes
• The general pattern of open-ended evolution in the
Phanerozoic record is still unmatched (see
Taylor 2004 for a state of the art)
• No a priori reason for this
• But it might be that no pure possible process
is likely to generate this
• The possibilities provided by those models
settle the ground for empirically deciding
about the specificity of life as a this-worldly
feature (as opposed to « life » by AL theorists)
Conclusion
• Computational models are not a very general
domain of which biology would exemplify some
cases. (Against strong AL claim)
• On the contrary they mostly provide pure
possible processes that might causally contribute
to origin of traits or evolutionary patterns.
• The class of possible processes being larger than
the real processes, obviously not all processes
simulated are likely to be met in actual biology
• The main difference between algorithms and
biology might not be the chemical
implementation of earthly life (replicators are
DNA etc), but the fact that processes at work
in biology are never pure in the sense that
they involve all the levels of the hierarchy
• Algorithmic devices only allow one or a few
entities to be singled out within that hierarchy. In this sense
they generate only the pure processes
involving solely those entities.
• This constrains the form of the validation
problem for computer simulations in
evolutionary biology