The Evolutionary Review
Neural Edelmanism and Neural Darwinism
Chrisantha Fernando
For many years Gerald Edelman's theory of Neural Darwinism confused me [1].
Having won a Nobel Prize for discovering the structure of antibodies, Edelman
was better placed than almost anyone to appreciate that the adaptive immune
system works by a kind of natural selection called somatic selection. In somatic
selection, B-cells producing antibodies that bind a foreign antigen tightly can
outcompete (replicate faster than) B-cells producing antibodies that bind it
poorly. Edelman went on to propose that a similar
process takes place in the brain for neuronal groups (neurons connected to each
other by synapses). He argued that like antibodies, neuronal groups also
compete with each other, binding not antigens as antibodies do, but binding
stimuli. Neuronal groups that bind stimuli best can obtain more reward, with reentrant (reciprocal) connections between groups allowing the winning groups to
remould the losing groups.
But is this natural selection? Is Edelman’s theory entitled to its title of “Neural
Darwinism”? It depends which great evolutionary biologist you ask. According
to one definition given by John Maynard Smith (JMS) [2], Edelman’s neuronal
groups would only be units of evolution if re-entrant connections between
groups really allowed the replication of information between groups so that the
losing group came to resemble the winning group. I have not been able to find
evidence of such a mechanism in Edelman’s theory. Edelman has not shown how
a neuronal group could really transmit information to another neuronal group.
But according to a broader definition of natural selection given by the theoretical
biologist George Price [3], one does not require explicit multiplication of
neuronal groups, just a redistribution of resources (competition) between
groups for a process to be called natural selection. Michod has pointed out that
Edelman's neuronal groups are Darwinian entities in this somewhat broader
Priceian sense [4].
This paper describes how Eörs Szathmáry and I solved the problem of how
neuronal groups could really be units of evolution in the JMS sense and not just
in the weaker Priceian sense. The principles involved point to a fundamental link
between the origin of life and the origin of human cognition [5]. Our debugging
of Edelman’s theory may permit the unification of two fields of human
endeavour, evolutionary biology and neuroscience. Both these fields aim to
explain open-ended adaptation, but up to now have done so in relative isolation.
Along with Gerald Edelman and William Calvin [6] we propose that an
underlying process of natural selection takes place in the brain. But we differ in
claiming that populations of neuronal groups (replicators) can undergo natural
selection as defined by JMS, i.e. with true replication of information, and not just
in the broader sense described by Price. In our formulation, neuronal replicators
are patterns of connectivity or patterns of activity in the brain that can make
copies of themselves to nearby brain regions with generation times of seconds to
minutes [7,8]. We believe neuronal units of evolution can undergo natural
selection in the brain itself, to contribute to adaptive thought and action [9,10];
populations of good ideas evolve overnight. Their fitness is determined by the
same Dopamine-based rewards that have been proposed in other neural theories
of reinforcement learning, including Edelman’s. Unlike Edelman and Calvin
however, we propose several viable mechanisms for replication of neuronal
units of evolution. But why is replication so important for natural selection? And
what makes JMS’s formulation of natural selection more powerful than Price’s?
Let's examine both definitions of natural selection more closely. Definitions are
never right or wrong, only helpful or unhelpful. JMS defined a unit of evolution as
any entity that has the following properties [2]. The first property is
multiplication; the entity produces copies of itself that can make further copies
of itself, one entity produces two, two entities produce four, four entities
produce eight, in a process known as autocatalytic growth. Most living things are
capable of autocatalytic growth, but there are some exceptions; for example,
sterile worker ants and mules do not multiply and so whilst being alive, they are
not units of evolution. The second requirement is variation, i.e. there must be
multiple possible kinds of entity. Some things are capable of autocatalytic growth
and yet do not vary; for example, fire can grow exponentially, being the
macroscopic phenomenon arising from an autocatalytic reaction, yet fire does not
accumulate adaptations by natural selection. The third requirement is that there
must be heredity, i.e. like begets like, so that offspring resemble their parents.
Doron Lancet proposed that prior to nucleotides and gene based heredity,
clumps of lipid molecules called composomes could be capable of undergoing
natural selection [11]. But recently we have shown that whilst composomes can
multiply and possess variation, they do not have stable heredity, i.e. like
occasionally produces very much unlike (the mutation bias being too strong in
certain directions), and so unfortunately such systems cannot after all evolve by
natural selection [12]. Later we will see that Edelman’s neuronal groups may fall
into this final category. If units of evolution of different types have different
probabilities of producing offspring, i.e. if they have differential fitness, and if
these probabilities are independent of the frequencies of other entities, the
average fitness of the population will be maximised, and there will be survival of
the fittest.
George Price gave a more general and more inclusive definition of natural
selection [3]. He said that a trait (any measurable value) would increase in
frequency to the extent that the probability of that trait being present in the next
generation was positively correlated with the trait itself, counterbalanced by that
trait’s variability (the tendency of that trait to change between generations for
any reason, e.g. due to mutation). Notice that JMS’s definition is algorithmic, it
tells you roughly how to make the natural selection cake. Price’s definition is
statistical, it tells you whether something is a natural selection cake or not, i.e.
whether it is the kind of cake that can undergo the accumulation of adaptation,
or survival of the fittest. It is important to note that both these definitions were
intended for use in the debate that raged over group selection [13] because there
it was essential to formally define what a legitimate evolvable substrate was.
Here we use them to understand neuronal group selection.
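Price's statistical condition can be stated compactly as the standard Price equation, included here for reference. Writing z_i for the trait value of entity i, w_i for its fitness, and bars for population averages, the change in the average trait across one generation is:

```latex
\Delta \bar{z} \;=\; \underbrace{\frac{\mathrm{Cov}(w_i, z_i)}{\bar{w}}}_{\text{selection}}
\;+\; \underbrace{\frac{\mathrm{E}\!\left(w_i\, \Delta z_i\right)}{\bar{w}}}_{\text{transmission bias}}
```

The first term is the positive correlation between the trait and its probability of being present in the next generation described above; the second term is the trait's tendency to change between generations for any other reason, e.g. mutation.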
Let's take some search algorithms and see if they satisfy these two very different
definitions of natural selection. From a computer science perspective, natural
selection is a search algorithm that generates and selects entities for solving a
desired problem: the quality of an entity's solution is correlated with the
probability of transmission of that entity, i.e. with its fitness. Genetic
algorithms work in this way. They are computer programs that implement
natural selection as defined by JMS and Price [14]. Many other algorithms exist
such as hill-climbing, simulated annealing, temporal difference learning, and
random search, for finding adaptations. We ask, do these satisfy either of the
definitions of natural selection?
A classification of search algorithms shows that natural selection as defined by
JMS really does have some special properties that are often overlooked because
we take its implementation in the biosphere for granted and because we have
erroneously come to equate the models of natural selection that evolutionary
biologists use, with natural selection itself. The table below shows my
classification. Systems undergoing natural selection appear on the right.
Solitary Search   | Parallel Search   | Parallel Search with      | Parallel Search with
                  |                   | Competition (Price)       | Competition and Information
                  |                   |                           | Transmission (JMS)
------------------|-------------------|---------------------------|------------------------------
(Stochastic)      | Independent hill  | 1. Competitive Learning   | 1. Genetic Natural Selection
hill climbing     | climbers          | 2. Reinforcement Learning | 2. Adaptive Immune System
                  |                   | 3. Synaptic Selectionism  | 3. Genetic Algorithms
                  |                   | 4. Neural Edelmanism      | 4. Didactic Receptive Fields
                  |                   |                           | 5. Neuronal Replicators

Table 1. A classification of search (generate-and-test) algorithms.
On the left hand column of Table 1 is shown the simplest class of search
algorithm, solitary search. In solitary search at most two candidate units are
maintained at one time. An algorithm known as Hill-climbing is an example of a
solitary search algorithm in which a variant of the unit (candidate solution) is
produced and tested at each ‘generation’. If the offspring solution’s quality
exceeds that of its parent, then the offspring replaces the parent. If it does not,
then the offspring is destroyed and the parent produces another correlated
offspring. Such an algorithm can get stuck on local optima. Figure 1 shows this
algorithm implemented by a robot on an actual hilly landscape. The robot carries
a windmill, and its aim is to get to the highest peak. Let's assume wind speed
increases with altitude for now. It moves randomly to a point on a radius a few
meters away, measures the wind speed and stays there if this wind speed is
higher than the previous wind speed it measured. If it is not higher, it goes back
to its previous location. To do this, it must have some memory of the previous
location.
Figure 1. Imagine a robot on a mountainous landscape whose task it is to reach
the highest peak. One can imagine for example that it holds a windmill which it
wishes to rotate at the highest speed possible, and the higher up it is the faster
its windmill will rotate. If it behaves according to hill-climbing it starts from a
random position (1) moves to a nearby location (2) and tests whether that
location is higher than its original location by measuring the speed of its
windmill. If the wind speed is faster, it remains there, but if the wind speed is not
faster (shown in the unnumbered circles) the robot moves back to the previous
location. The robot may get stuck on a peak that is not the highest peak (a local
optimum). A robot (not shown) behaving according to stochastic hill-climbing
does the same, except that it accepts the new position with a certain probability
even if it is slightly lower than the original position. By this method stochastic
hill-climbing can sometimes avoid getting stuck on the local optimum, but it can
also occasionally lose the peak it is on because memory is only kept of the
immediately preceding position.
Stochastic hill-climbing and simulated annealing are examples of solitary search
where there is a certain probability of accepting a worse quality offspring. This
balances exploration and exploitation and can reduce the chances of getting
stuck on local optima, however, the cost is potentially losing the currently
optimal peak.
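As a concrete sketch, stochastic hill-climbing fits in a few lines. The one-dimensional landscape, step size, and acceptance probability below are illustrative choices of mine, not values from any particular robot experiment:

```python
import random

def stochastic_hill_climb(fitness, start, steps=1000, step_size=0.5, p_accept_worse=0.05):
    """Solitary search: at most one current solution is kept in memory.

    A variant ('offspring') is proposed each generation; it replaces the
    current solution if it is better, or, with a small probability, even
    if it is worse -- which is what lets the search leave local optima.
    """
    current = start
    for _ in range(steps):
        candidate = current + random.uniform(-step_size, step_size)
        if fitness(candidate) > fitness(current) or random.random() < p_accept_worse:
            current = candidate  # move; the previous position is forgotten
    return current

# Toy one-dimensional 'hillside' with a single peak at x = 3.
random.seed(0)
best = stochastic_hill_climb(lambda x: -(x - 3.0) ** 2, start=0.0)
```

Setting `p_accept_worse` to zero recovers plain hill-climbing, which can never leave the first peak it finds.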
But one can ask, isn’t solitary search actually a kind of natural selection
according to Price and according to JMS? Is it not natural selection with a
population size of two in which one individual replaces the other based on which
is the fitter? Or can the fact that it can be implemented without explicit
multiplication by use of pointers and memory, or by a robot moving on a hillside
mean that it is not an example of natural selection according to JMS? See Figure 2
which shows two other implementations of hill-climbing, this time not on a
hillside but in a system of physical discrete registers that can be in binary states.
Figure 2. Two implementations of hill-climbers that are both trying to maximize
the number of 0’s in the string. The hill-climber on the left stores a solution, here
represented as a binary string, it modifies the solution at some position in the
string, and stores this modification. After assessing the new solution and
comparing its quality with the original solution it either keeps the modification,
or erases the modification. The hill-climber on the right replicates the entire
solution to a separate location in memory. Only the hill-climber on the right has
true multiplication as defined by JMS.
The search dynamics shown by both machines in Figure 2 are identical and
would be capable of accumulating adaptations according to Price's
formulation of natural selection, because there is covariance between a trait
and fitness. According to JMS's definition, the implementation on the right involving
the explicit multiplication (replication) of a unit would constitute natural
selection but the implementation on the left using pointers and memory would
not. However, notice that this distinction is between implementations of the
same algorithm both of which are indistinguishable in terms of search
performance (although the system on the left uses fewer resources). What about
the robot on the hillside? Here the brain of the robot may store merely the path
back to the previous position, and so the implementation of hillclimbing in that
spatially embodied case may require no replication of an explicitly stored entity
(i.e. a position representation) at all. Therefore, we can say that the phenomenon
of hill-climbing can be implemented either with or without explicit replication,
and therefore may or may not involve natural selection as defined by JMS.
However, in all cases, the phenomenon of hill-climbing must accord with the
principle of natural selection according to Price in order to accumulate
adaptations.
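The two machines of Figure 2 can be sketched as follows; given the same random choices they visit exactly the same sequence of solutions, which is the sense in which they are one algorithm with two implementations (the bit-string task and step counts here are my illustrative choices):

```python
import random

def mutate(bits):
    """Flip one randomly chosen bit of a tuple of 0s and 1s."""
    i = random.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]

def climb_in_place(bits, steps, seed):
    """Left machine: one slot; a modification is kept or erased in place."""
    random.seed(seed)
    for _ in range(steps):
        variant = mutate(bits)
        if variant.count(0) >= bits.count(0):
            bits = variant            # keep the modification in the same slot
    return bits

def climb_with_copy(bits, steps, seed):
    """Right machine: the whole solution is first replicated to a second
    slot, i.e. explicit multiplication in the JMS sense."""
    random.seed(seed)
    for _ in range(steps):
        offspring = tuple(bits)       # copy the parent into a separate slot
        offspring = mutate(offspring)
        if offspring.count(0) >= bits.count(0):
            bits = offspring          # the offspring replaces its parent
    return bits

start = (1, 0, 1, 1, 0, 1, 1, 0)      # goal: maximize the number of 0s
```

Running both with the same seed returns identical bit strings: search performance cannot distinguish the pointer-and-memory implementation from the replicating one.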
Notice that in Figure 2 (right) only two memory slots at most are available that
can contain a maximum of two candidate solutions at any one time. A slot is
simply a material organization or substance that can be reconfigured into the
form of a unit or candidate solution, for example a piece of memory in a
computer or the organic molecules constituting an organism. What happens if
many more slots are available? How should one best use them? In terms of
Figure 1, this is equivalent to a hillside now inhabited by many robots, rather than
just one. Now our aim is that at least one robot finds the highest peak. Notice that
a slightly different aim would have been to maximize the total wind collected by
the windmills of all the robots.
The simplest algorithm for these robots to follow would be that each one
behaves completely independently of the others and does not communicate with
the others at all. Each of them behaves exactly like the robot in Figure 1. In terms
of the implementations shown in Figure 2, this multiple robot version of search
(simple parallel search) could be achieved by simply having multiple instances of
the hill-climbing machinery, either of the replicating kind, or the pointer and
memory kind, it doesn’t matter.
So, if we have many slots or robots available, it is possible just to let many of the
solitary searches run at the same time, i.e. in parallel. However, can you see that
this would be wasteful whatever the implementation? If one pair of slots (or a robot)
became stuck on a local optimum then there would be no way of reusing this pair
of slots (or the robot). Whereas, if being stuck on a local optimum could be
detected, then random reinitialization of the stuck slot pair would be a
possibility (or in the robot example, moving the stuck robot randomly to a new
position). Even so, one could expect only a linear speed up in the time taken to
find a global optimum (the highest peak). It is difficult to imagine why anyone
would want to do something like this given all those slots, and all those robots.
This is the parallel search described in the second column of Table 1, and it is not
surprising that not many algorithms fall into this class.
A cleverer way to use the extra slots would be to allow competition between
slots for search resources, and by resources I mean the generate-and-test step of
producing a variant and assessing its quality. In the case of robots a step is
moving a robot to a new position and reading the wind-speed there. Such an
assessment step is often the constraining factor in time and processing costs. If
such steps were biased so that the currently higher quality solutions (robots) did
proportionally more of the search, then search effort would concentrate around the
currently best solutions. This is known as competitive learning because candidate
solutions compete with each other for reward and exploration opportunities. If
the robots are programmed such that the amount of exploration they do is
greater as the altitude increases, then those at higher altitudes do more
exploration and this may allow a faster discovery of the global optimum. No
robot communicates with any other robot. If robots utilize a common power
supply then the robots are competing with each other for exploration resources.
This is an example of parallel search with resource competition, shown in
column 3 of Table 1. It requires no natural selection as defined by JMS, i.e. it
requires no explicit multiplication of information.
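A minimal sketch of this third column, under illustrative assumptions of mine (a one-dimensional hillside, fitness-proportional allocation of generate-and-test steps, and no communication whatsoever between robots):

```python
import math
import random

def hillside(x):
    """Toy landscape: a local peak near x = -2, a higher global peak near x = 3."""
    return math.exp(-(x + 2) ** 2) + 2 * math.exp(-(x - 3) ** 2)

def competitive_search(fitness, positions, rounds=300, step=0.5, seed=0):
    """Parallel search with resource competition but no replication.

    Each round, a fixed budget of generate-and-test steps is handed out in
    proportion to current fitness, so better robots explore more. No robot
    ever copies another robot's position."""
    random.seed(seed)
    positions = list(positions)
    for _ in range(rounds):
        weights = [fitness(p) + 1e-9 for p in positions]  # hillside() >= 0 here
        for _ in range(len(positions)):                   # fixed step budget
            i = random.choices(range(len(positions)), weights=weights)[0]
            candidate = positions[i] + random.uniform(-step, step)
            if fitness(candidate) > fitness(positions[i]):
                positions[i] = candidate                  # local improvement only
    return positions

final = competitive_search(hillside, [-4.0, -1.0, 0.0, 1.0, 4.0])
```

The robot that starts nearest the global peak soaks up most of the search budget and finds it; the robots stuck near the local peak stay stuck, because nothing can recruit them.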
Several algorithms fall into the above category. Reinforcement learning
algorithms are examples of parallel search with competition [15]. Such
algorithms have been proposed as an explanation for learning in the brain, and
work in the following way. If the response produced by the firing of a synapse is
positively correlated with reward and if this reward strengthens this synapse,
which increases its subsequent probability of firing, then the conditions of
Price’s definition of natural selection have been fulfilled. This is because there is
a positive correlation between the trait (i.e. the response produced by firing the
synapse) and the subsequent probability of that response occurring again.
Similarly, a negative correlation between synaptic firing and reward reduces the
subsequent probability of firing. Sebastian Seung calls these hedonistic synapses
[16]. A single hedonistic synapse is equivalent to a single allele in genetic terms.
If there is an array of such synapses emanating from the same neuron and there
is competition for chemical resources from the cell body of the neuron, then
these synapses are equivalent to multiple genetic alleles competing for resources
from the cell body, and the situation is almost mathematically equivalent to the
Nobel Prize winner Manfred Eigen's replicator equations [17] in which the total
population size of replicators is kept fixed and there is a well defined number of
possible distinct variants [9]. Eigen’s equations are a popular model used by
evolutionary biologists to model evolution.
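A sketch of what is meant here: a discrete-time integration of replicator dynamics with a fixed total population. The fitness values are my arbitrary illustrations; note that nothing in the program ever multiplies, frequencies are merely redistributed:

```python
def replicator_step(freqs, fitnesses, dt=0.01):
    """One Euler step of dx_i/dt = x_i * (f_i - phi), with phi = sum_j f_j x_j.

    phi is the mean fitness; frequencies always sum to 1, so the total
    'population size' stays fixed, as in Eigen's formulation."""
    phi = sum(f * x for f, x in zip(fitnesses, freqs))
    return [x + dt * x * (f - phi) for x, f in zip(freqs, fitnesses)]

freqs = [0.25, 0.25, 0.25, 0.25]   # four variants, initially equally common
fitnesses = [1.0, 1.2, 0.8, 1.5]   # illustrative fitness values
for _ in range(5000):
    freqs = replicator_step(freqs, fitnesses)
# The fittest variant (index 3) approaches fixation: survival of the
# fittest without any replicator ever being explicitly copied.
```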
Notice that there is a subtle difference between the competition between
synapses described above and the robot example given for parallel search with
competition. To use the robot analogy, each synapse is like a robot stuck in a
particular location on the hillside, unable to move. Those with higher wind
speeds are allowed to build larger windmills. In this simple kind of synaptic
selectionism the system only exploits the variation that exists at the beginning.
It is not even as powerful as the case we first considered as parallel search with
competition, in which the robots with the greatest wind speeds are able to do
more exploration: here the variation is limited to the variation produced at the
very beginning, and robots cannot move up hills, only increase (or decrease)
the size of their windmills.
So do such systems of parallel search with competition between synaptic slots
really exhibit natural selection? Not according to the definition of JMS because
there is no replicator; there is no copying of solutions from one slot to another
slot, there is no information that is transmitted between synapses. Resources are
simply redistributed between synapses (i.e. synapses are strengthened or
weakened in the same way that the stationary robots increase or decrease the
size of their windmills). Traits (responses) are not copied between slots. Instead,
adaptation arises by the mechanism proposed by Price because there is
covariance between traits (here reward obtained by a particular synaptic
response, or the amount of wind collected by a particular robot windmill) and
their probability of subsequent activation determined by changing the synaptic
weight, or changing the size of a robot windmill. According to Price if this
covariance is maintained there is survival of the fittest synapse or windmill. Such
a process of synaptic selectionism has been proposed by the neuroscientist Jean-Pierre Changeux [18].
A surprising consequence is that Eigen’s replicator equations can be run without
any system having to undergo natural selection as defined by JMS. That is, they
can model JMS-type natural selection without any real replicators being involved
in implementing them. But they do always exhibit natural selection as described by
Price, and could of course serve as models of systems undergoing natural
selection as defined by JMS. Nothing needs to multiply when Eigen's equations
are run, but they emulate the consequences of multiplication that would occur in
a system undergoing natural selection according to JMS.
In short, synaptic selection algorithms can best be understood as competitive
learning between synapses (slots) that satisfy Price’s criteria for adaptation by
natural selection but use a different recipe to achieve it from the one proposed by
JMS. Instead of explicit multiplication of replicators, i.e. a process where matter
at one site reconfigures matter at another site (i.e. where traits are explicitly
copied), both Hebbian learning and Eigen’s replicator equations model the
effects of multiplication. The recipe in the case of synapses emanating from a
single neuron involves encoding the information (trait) by the location of the
synapse, and allowing matter to be redistributed (fitness) between synapses.
This is a very different recipe compared to how JMS pictured natural selection
working at the genetic, organismal and group levels in the biosphere. Synapses
compete for growth resources, but it is their connections that encode
information. Thus the synaptic selectionism of Changeux [18] is a sound form of
Darwinian dynamics as defined by Price and Eigen, but is not the same class of
implementation of natural selection as defined by JMS. In fact, no modification of
the responses encoded by each individual synapse is possible. Each synapse is a
slot that signifies one fixed solution and the relative probability of a slot being
active is modified by competition. Notice, there is no transmission of information
between slots, in fact no communication between slots at all, in other words, the
response arising from activating synapse A does not become the response arising
from activating synapse B. In terms of the robot analogy, a robot that is doing
well does not call other robots to join it. So, synaptic selectionism is an example of parallel
search with competition. It is natural selection in the Price sense, but not in the
JMS sense.
It is at this stage that I think Edelman took Changeux's (and later Seung's) ideas of
natural selection acting at the level of the synapse a step too far. The third Nobel
Prize winner in our story, Francis Crick, who worked down the corridor from Gerald
Edelman at the Salk Institute, disliked Edelman's Neural Darwinism so much that
he called it Neural Edelmanism [19]. The reason was that Edelman had identified
no replicators in the brain, and so there was no unit of evolution as required by
JMS. However, Edelman had satisfied the definitions of natural selection defined
by Price and Eigen. Edelman proposed competition between neuronal groups (a
neuronal group is Edelman’s implementation of a slot) for synaptic resources,
but he failed to explain how the particular pattern of synaptic weights that
constitute the function of one group could be copied from one group to another.
This leaves no mechanism by which a synaptic-pattern-dependent trait could be
inherited between neuronal groups.
In the best paper formalizing Edelman's theory of neuronal group selection,
Izhikevich shows that there is no mechanism by which functional variations in
synaptic connectivity patterns can be inherited (transmitted) between neuronal
groups [20]. Edelman does satisfy Price if a neuronal group is doing no more
than a single synapse in Changeux’s theory, i.e. encoding a particular response.
However, it does not satisfy Price if Edelman wishes to claim that the trait in
question is a transmissible pattern of synaptic strengths within a neuronal group
because Edelman cannot show there is covariance between such a trait and the
number of groups in which such a trait is found across generations. There is no
communication of solutions between group-based slots, no information transfer
as there is no information transfer between synapses. Edelman’s mechanism
appears to be a mechanism of competitive learning between neuronal group
slots which only has a Darwinian interpretation according to Price if a neuronal
group is nothing more than a synapse in Changeux’s model, i.e. with a fixed
response function, without copying of response functions between groups.
Therefore, Francis Crick was right in a sense. Neural Edelmanism falls into the
third column of my classification of search algorithms, as competitive learning,
which, if interpreted as the same kind of theory as Changeux's and Seung's, can
satisfy Price's phenomenological definition of natural selection but never JMS's. This is
because natural selection as defined by JMS requires information transmission
between slots, i.e. multiplication (replication), and Price’s definition requires
only covariance between a response encoded by a neuronal group and the
probability of a change in frequency of that response.
This leads us to the final column in Table 1. Here is a radically different way of
utilizing multiple slots that extends the algorithmic capacity of the competitive
learning algorithms above. In this case I allow not only the competition of slots
for generate and test cycles, but I allow slots to pass information
(traits/responses) between each other, see Figure 3.
Figure 3 shows the robots on the hillside again but this time, those robots in the
higher altitudes can recruit robots in lower altitudes to come and join them. This
is equivalent to replication of robot locations. The currently best location can be
copied to other slots. There is transmission of information between slots. Note,
replication is always of information (patterns), i.e. reconfiguration by matter of
other matter. This is one of the reasons the cyberneticist and anthropologist
Gregory Bateson called evolution a mental process [21,22].
This means that the currently higher quality slots have not only a greater chance
of being varied and tested, but that they can copy their traits to other slots that
do not have such good quality traits. This permits the redistribution of
information between material slots. Notice that the synaptic slot system did not
have this capability. If one synapse location produced response A, then it was not
possible for other synaptic locations to come to produce response A, even if
response A was associated with higher reward. We will see shortly a real case in
the brain where such copying of response properties is possible between slots,
and is therefore clear evidence for natural selection in the brain not just of the
Prician type but of the JMS type.
Crucially, such a system of parallel search, competition and information
transmission between slots does satisfy JMS’ definition of natural selection. The
configuration of a unit of evolution (slot) can reconfigure other material slots. It
also satisfies Price’s definition. Some eponymists might wish to say that this is a
full Darwinian population. But it is better to show that there are some
algorithmic advantages compared to a competitive learning system without
information transmission that satisfies only Price’s formulation of natural
selection.
The critical advantage of JMS’s definition over Price’s definition is that multiple
search points can be recruited to the region of the search space that is currently
the best. In terms of evolutionary theory, a solution can reach fixation and then
utilize all the search resources available for further exploration. This allows the
entire population (of robots) to acquire the response characteristics (locations)
of the currently best unit (robot), and therefore, allows the accumulation of
adaptations. Once one peak has been reached by all the robots, they can then all
be in a position to do further exploration to find even higher peaks. Adaptations
can accumulate. In many real world problems there is never a global optimum,
rather further mountain ranges remain to be explored after a plateau has been
reached. For example, there is no end to science. Not every system that satisfies
Price’s definition of natural selection can have these special properties.
It may come as a surprise that there are already well-recognised processes in the
brain that are known to a limited extent to implement natural selection as
defined by JMS (and Price), and that have the same algorithmic characteristics as
Figure 3. Figure 4 shows a recent experiment by Young et al. showing that
receptive fields in the primary visual cortex of cats can replicate to adjacent
neurons, a process they called "didactic transfer".
Figure 4. Adapted from Young, the orientation selectivity of simple cells in the
visual cortex can be copied between cells. Simple cells in the visual cortex have
orientation selectivity which means they respond optimally to bars presented to
the visual field of a particular orientation. The arrows in each cell show the
direction of a bar that maximally stimulates that cell. The orientation selectivity
can be copied between cells. In fact, the fitness of an orientation selective
response is the extent to which stimulation at the retina activates the cell with
such a response. If the retina supplying the cells in the inner circle is cut out,
those cells receive no inputs, and they increase their sensitivity to activation
by horizontal connections from adjacent cells. This process can copy the
orientation selectivity of adjacent cells onto the central cells.
If a region of the retina is removed then the cortical neurons that normally
receive input from that region are less active and so they become more sensitive
to being activated by their adjacent active neighbours. With a special type of
plasticity called spike-time-dependent plasticity (STDP), nearly all the neurons
in the silenced region take on the orientation selectivity of a neuron adjacent to
that region. In this case the trait is the orientation selectivity, and the fitness is
the change in the proportion of cells with that orientation selectivity. The unit of
evolution is the receptive field, which multiplies, and has hereditary variation.
However, an important term in Price's formulation is the bias due to
transmission, e.g. mutation (the rate of which is too large in Lancet's GARD
model discussed earlier), but this includes any other factor that alters the trait
or the fitness of the trait. It seems that the capacity of STDP in horizontal
connections to continue making copy after copy with sufficient fidelity may be
low in this case (although this has yet to be tested); if so, covariance between
fitness and orientation selectivity cannot be maintained across many neurons.
Another limitation of Young's system of copied receptive fields is that it is
capable of only limited heredity, meaning that all possible orientations could
be exhaustively encoded. The situation is analogous to pre-genetic inheritance in
the origin of life. Prior to the origin of nucleotides and template replication, e.g.
DNA and RNA replication, natural selection may have utilized attractor-based
heredity in the form of autocatalytic chemical reaction networks [23]. These
were capable of only limited information transmission. However, the origin of
symbolic information in the form of strings of nucleotides permitted unlimited
heredity. To see this, imagine how long it would take to generate all possible
strings of DNA of only 100 nucleotides in length. There are 4 nucleotides, A, C, G,
and T, so this gives 4^100 possibilities. If each string could be made in one
second it would still take about 5 x 10^52 years to make them all. The universe
is only 433 x 10^15 seconds old according to Wikipedia. The capacity to encode
symbolic (digital)
information allows a far greater number of states and therefore strategies and
responses to be encoded.
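The arithmetic behind these figures is a one-liner to check:

```python
# Back-of-envelope check of the numbers above: 4^100 possible DNA strings of
# length 100, generated at one string per second, expressed in years.

SECONDS_PER_YEAR = 3.156e7      # roughly 365.25 days

n_strings = 4 ** 100            # all length-100 strings over {A, C, G, T}
years = n_strings / SECONDS_PER_YEAR

print(f"{n_strings:.2e} strings, {years:.1e} years")
```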
This brings us back then to what Edelman failed to explain in his neuronal group
selection. Is there a way in which a neuronal pattern could transmit an unlimited
amount of information to another neuronal pattern? The neuronal replicator
hypothesis that Eörs Szathmáry and I proposed in 2008 claims that human
language, and the unlimited open-ended thought and problem solving of humans,
arose through a major transition in evolution that resembles both the origin of
nucleotides at the origin of life and the origin of the adaptive immune system.
We propose that the capacity of the brain to evolve unlimited-heredity neuronal
replicators by neuronal natural selection allows truly open-ended creative
thought. Penn and Povinelli have given a politically
incorrect but convincing argument that human cognition is indeed qualitatively
distinct from all other animals in that we can reason about unobserved hidden
causes, and the abstract relations between such entities, whereas no other
animal can [24]. We propose that this cognitive sophistication involved the
evolution at the genetic level of the neuronal capacity not only for competitive
learning and Priceian evolution, as described by Changeux, Edelman, and Seung,
but for information transmission between higher-order units of neuronal
evolution, and thus natural selection as described by JMS.
We proposed a plausible neuronal basis for the replication of higher order units
of neuronal evolution above the synaptic level (neuronal groups). The method
allows a pattern of synaptic connections to be copied from one such unit to
another as shown in Figure 5 [7].
Figure 5. Our proposed mechanism for copying patterns of synaptic connections
between neuronal groups. The pattern of connectivity from the lower layer is
copied to the upper layer. See text.
In the brain there are many topographic maps. These are pathways of parallel
connections that preserve adjacency relationships and they can act to establish a
one-to-one (or at least a few-to-few) transformation between neurons in distinct
regions of the brain. In addition there is a kind of synaptic plasticity called
spike-time-dependent plasticity (STDP), the same kind of plasticity that Young
used to explain the copying of receptive fields. It works rather like Hebbian
learning. Donald Hebb said that neurons that fire together wire together, which
means that the synapse connecting neuron A to neuron B gets stronger if A and B
fire at the same time [25]. However, it has more recently been discovered that
there is an asymmetric form of Hebbian learning (STDP): if the pre-synaptic
neuron A fires before the post-synaptic neuron B, the synapse is strengthened,
but if pre-synaptic neuron A fires after post-synaptic neuron B then the synapse
is weakened. Thus STDP, in an unsupervised manner, i.e. without an explicit
external teacher, reinforces potential causal relationships: it is able to guess
which synapses were causally implicated in a pattern of activation.
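The asymmetric window can be sketched in a few lines. The exponential shape is the standard textbook form of STDP; the particular learning rates and time constant below are illustrative choices, not values from the papers cited here.

```python
import math

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Weight change for a single pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:       # pre fires before post: potentiate (likely causal)
        return a_plus * math.exp(-dt / tau)
    elif dt < 0:     # pre fires after post: depress (likely non-causal)
        return -a_minus * math.exp(dt / tau)
    return 0.0       # simultaneous spikes: no change in this sketch
```

For example, `stdp_dw(10.0, 15.0)` is positive (pre led post by 5 ms) while `stdp_dw(15.0, 10.0)` is negative, and the magnitude decays as the spikes move further apart in time.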
If a neuronal circuit exists in layer A in Figure 5, and is externally
stimulated at random to make its neurons spike, then thanks to the topographic
map from layer A to layer B, neurons in layer B will experience spike pattern
statistics similar to those in layer A. If there is STDP in layer B between
weakly connected neurons then this layer becomes a kind of causal inference
machine that observes the spike input from layer A and tries to produce a circuit
with the same connectivity, or at least that is capable of generating the same
pattern of correlations. One problem with this mechanism is that there are many
possible patterns of connectivity that generate the same spike statistics when a
circuit is randomly externally stimulated to spike. As the circuit size gets larger,
due to the many possible paths that activity can take through a circuit within a
layer, the number of possible equivalent circuits grows. This can be prevented
by limiting the amount of horizontal spread of activity permissible within a
layer. When this was done, and some simple error-correcting neurons were added,
we found it was possible to evolve a fairly large network towards a particular
desired pattern of connectivity. The network whose connectivity was closest to
the desired connectivity was allowed to replicate itself to other circuits.
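The core of the scheme can be reduced to a toy simulation. The sketch below is a drastic simplification of the mechanism described above, under stated assumptions: a perfect one-to-one topographic map, activity spreading exactly one step per trial, and STDP collapsed to "pre fired one step before post, so strengthen". The connectivity matrix and all constants are hypothetical.

```python
import random

random.seed(0)
N = 6
# Hypothetical connectivity of layer A: a_conn[i][j] = 1 means neuron i -> j.
a_conn = [[0] * N for _ in range(N)]
a_conn[0][1] = a_conn[1][2] = a_conn[3][4] = 1

b_w = [[0.0] * N for _ in range(N)]   # layer-B weights, learned from spikes alone

for _ in range(200):
    src = random.randrange(N)                              # stimulate one A neuron
    fired_t1 = {j for j in range(N) if a_conn[src][j]}     # one step of spread in A
    # Topographic map: layer-B neurons see the same spike times as their
    # layer-A counterparts, so B can apply STDP to infer A's wiring.
    for j in range(N):
        if j in fired_t1:
            b_w[src][j] += 0.1    # pre fired one step before post: potentiate
        elif j != src:
            b_w[src][j] -= 0.01   # no causal pairing observed: slight depression

# Threshold the learned weights to read out the inferred connectivity.
recovered = [[1 if b_w[i][j] > 0.5 else 0 for j in range(N)] for i in range(N)]
```

After enough random stimulations, thresholding the layer-B weights recovers layer A's connectivity matrix; the ambiguity problem discussed above is avoided here only because activity is restricted to a single step of spread.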
At the moment, neurophysiologists would struggle to observe the connectivity
patterns in microcircuits of this size and to undertake a similar experiment in
slices or neuronal cultures; however, I think the day is not far off when it
will become possible to identify the mechanisms we propose. These are not the only kinds of
neuronal replication that are possible. But this is the closest to turning Edelman’s
theory into something that is truly Darwinian as defined by John Maynard Smith.
Acknowledgements: Thanks to Eors Szathmary, Phil Husbands, Simon
McGregor, and Yasha Hartberg for discussions about the manuscript.
1. Edelman GM (1987) Neural Darwinism. The Theory of Neuronal Group
Selection. New York: Basic Books.
2. Maynard Smith J (1986) The Problems of Biology. Oxford, UK: Oxford
University Press.
3. Price GR (1970) Selection and covariance. Nature 227: 520-521.
4. Michod RE (1988) Darwinian Selection in the Brain. Evolution 43: 694-696.
5. Maynard Smith J, Szathmáry E (1995) The Major Transitions in Evolution.
Oxford: Oxford University Press.
6. Calvin WH (1996) The cerebral code. Cambridge, MA.: MIT Press.
7. Fernando C, Karishma KK, Szathmáry E (2008) Copying and Evolution of
Neuronal Topology. PLoS ONE 3: e3775.
8. Fernando C, Goldstein R, Szathmáry E (2010) The Neuronal Replicator
Hypothesis. Neural Computation 22: 2809–2857.
9. Fernando C, Szathmáry E (2009) Chemical, neuronal and linguistic replicators.
In: Pigliucci M, Müller G, editors. Towards an Extended Evolutionary
Synthesis Cambridge, Ma.: MIT Press. pp. 209-249.
10. Fernando C, Szathmáry E (2009) Natural selection in the brain. In: Glatzeder
B, Goel V, von Müller A, editors. Toward a Theory of Thinking. Berlin.:
Springer. pp. 291-340.
11. Segrè D, Lancet D, Kedem O, Pilpel Y (1998) Graded Autocatalysis Replication
Domain (GARD): kinetic analysis of self-replication in mutually catalytic
sets. Origins Life Evol Biosphere 28: 501-514.
12. Vasas V, Szathmáry E, Santos M (2010) Lack of evolvability in
self-sustaining autocatalytic networks constrains metabolism-first
scenarios for the origin of life. Proc Natl Acad Sci U S A.
13. Okasha S (2006) Evolution and the levels of selection. Oxford: Oxford
University Press.
14. Holland JH (1975) Adaptation in Natural and Artificial Systems. Ann Arbor:
University of Michigan Press.
15. Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction.
Cambridge, MA: MIT Press.
16. Seung HS (2003) Learning in Spiking Neural Networks by Reinforcement of
Stochastic Synaptic Transmission. Neuron 40: 1063-1073.
17. Eigen M (1971) Selforganization of matter and the evolution of biological
macromolecules. Naturwissenschaften 58: 465-523.
18. Changeux JP (1985) Neuronal Man: The Biology of Mind: Princeton
University Press.
19. Crick FHC (1989) Neural Edelmanism. Trends Neurosci 12: 240-248.
20. Izhikevich EM, Gally JA, Edelman GM (2004) Spike-timing dynamics of
neuronal groups. Cereb Cortex 14: 933-944.
21. Bateson G (1979) Mind and Nature: A Necessary Unity: Bantam Books.
22. Bateson G (1972) Steps to an Ecology of Mind: Collected Essays in
Anthropology, Psychiatry, Evolution, and Epistemology: University Of
Chicago Press.
23. Szathmáry E (2006) The origin of replicators and reproducers. Philos Trans R
Soc London B Biol Sci 361: 1761-1776.
24. Penn DC, Holyoak KJ, Povinelli DJ (2008) Darwin's Mistake: Explaining the
Discontinuity Between Human and Nonhuman Minds. Behavioral and
Brain Sciences 31: 109-130.
25. Hebb DO (1949) The Organization of Behaviour: John Wiley & Sons.