Observing Trends in Novel Metrics of Neural Networks
Created Through Artificial Evolution and Natural Selection
Jordan Perr
10/2/09
Version 3.x (Siemens Submission)
Table of Contents
1. Abstract
2. Introduction
2.1. A Brief History of Cognitive Science
2.2. Research
2.3. Relevance
3. Polyworld
4. Metrics and Results
4.1 Causal Density
4.2 Integrated Information
4.3 Synergy
5. Conclusion and Future Plans
6. Bibliography
1. Abstract
Researchers of artificial intelligence recognize that understanding consciousness is a
major component of understanding intelligence. As early as the 1960s, computer
scientists such as Lawrence J. Fogel began applying concepts such as evolution and
natural selection to the pursuit of more intelligent, and possibly conscious, artificial
machines [8]. Today, researchers such as Virgil Griffith, Anil K. Seth, and Larry Yaeger
continue that work by reviving and revamping techniques such as simulated evolution
on modern personal computers. This paper extends the work of those researchers and
others by applying metrics such as Integrated Information, Causal Density, and my own
metric, Synergy, to networks generated by Polyworld, and attempts to correlate each of
these metrics with observed intelligence and evolutionary fitness. It offers an in-depth
look at current research on such measurements of artificial neural networks, a critical
look at the history of cognitive science, and the introduction of a new type of metric to
the scientific community.
2. Introduction
2.1. A Brief History of Cognitive Science
What are life and consciousness? Until recently, such questions belonged to the
domain of existential philosophers and white-haired mad scientists. Traditional "bottom-up"
scientific inquiry of the 19th century saw little room for theoretical investigations of
consciousness and mind, favoring more quantitative studies such as anatomy and
physics [11]. In fact, the earliest pioneering thoughts on evolutionary artificial
intelligence can be traced back to the likes of Charles Darwin and Fyodor Dostoevsky. In
their time, the power of natural, deterministic forces over biological creatures was
minimized by the scientific community (not to mention the Church and the general public).
Early psychologists such as Sigmund Freud began to apply the scientific method to
observed behavior in human subjects, combining the rigor of scientific inquiry with
formerly philosophical ideas concerning behavior.
In the middle of the 20th century, the tools available to the scientific community
became more robust and accurate. Scientists such as James Watson and Francis Crick
uncovered DNA, an underlying mechanism behind the seemingly magical properties
of biological entities. Such a discovery only furthered the notion that intelligence
and consciousness might well be explainable by scientific inquiry: thus the field of
cybernetics was born [6]. For the first time, scientists felt that they had a clear path to
studying and understanding the underlying physical mechanisms that are manifested as
consciousness and life.
Toward the latter part of the 20th century, better microscopes and the invention of the
modern computer changed cybernetics so drastically that the field came to the verge of
collapse. Scientists were looking to combine the disciplines of neuroscience,
anatomy, psychology, computer science, and even philosophy into one new discipline.
Cognitive science was born.
Currently, leading researchers are exploring how neural networks evolve in
simulation. Many different calculable metrics about these neural networks are being
explored. This paper attempts to explain the state of this area of research, introduce the
reader to some of these metrics, and showcase new data concerning them.
2.2. Research
My research started with Lawrence J. Fogel's book on his early experiments in
simulated evolution [8]. These experiments were conducted to prove that iterative
evolution alone could produce mechanisms that perform a given task better than
randomly constructed machines. His work showed the world that evolutionary algorithms
could, in fact, produce very accurate solutions to very complex problems.
I set out to create a finite state machine (FSM) evolver using the modern Java Virtual
Machine. With it, I was able to replicate his astounding results. The speed at which my
Java FSM Evolver was able to evolve fit machines led me to believe that applying the
concept of iterative evolution would be a fruitful approach to more complex problems
such as generating artificial intelligence.
Results from my Java finite state machine evolver; results similar to those of Fogel et al., 1966.
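The kind of experiment described above can be sketched as a simple mutate-and-select loop. The sketch below is illustrative only: it is not the Java evolver itself, and the FSM representation, target sequence, and parameters are all assumptions. An FSM is a table mapping (state, input symbol) to (next state, predicted symbol); a single parent is repeatedly mutated, and the child is kept whenever it predicts the next symbols of the sequence at least as well.

```python
import random

random.seed(0)

STATES, SYMBOLS = 4, (0, 1)
TARGET = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # repeating pattern to predict

def random_fsm():
    # Transition table: (state, input symbol) -> (next state, predicted symbol).
    return {(s, x): (random.randrange(STATES), random.choice(SYMBOLS))
            for s in range(STATES) for x in SYMBOLS}

def fitness(fsm):
    # Fitness = how many next symbols the machine predicts correctly.
    state, score = 0, 0
    for i in range(len(TARGET) - 1):
        state, prediction = fsm[(state, TARGET[i])]
        score += (prediction == TARGET[i + 1])
    return score

def mutate(fsm):
    # Copy the parent and rewrite one randomly chosen table entry.
    child = dict(fsm)
    key = random.choice(list(child))
    child[key] = (random.randrange(STATES), random.choice(SYMBOLS))
    return child

best = random_fsm()
for generation in range(2000):
    child = mutate(best)
    if fitness(child) >= fitness(best):  # keep ties so the search can drift
        best = child

print("best fitness:", fitness(best), "of", len(TARGET) - 1)
```

Even this minimal hill-climber illustrates Fogel's point: selection on a randomly mutated table is enough to produce machines that predict far better than chance.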
After creating the finite state machine evolution environment described above, I
wanted to create a program in which a virtual “creature” would evolve both physically
and behaviorally in an attempt to create an agent that resembled early carbon based life.
My research led me to a Google Tech Talk given by Virgil Griffith at the AlifeX
conference in 2007 [9]. In this talk, Griffith demonstrated how a program called
Polyworld was being used to observe behavioral trends in virtual agents evolving within
a virtual environment. The similarities between my idea and Polyworld were striking. I
was able to contribute modifications to the source code of Polyworld that are being
integrated into the main distribution and will be used by a number of scientists around the
world. I felt comfortable working within Polyworld and decided to use the program in my
future research.
2.3. Relevance
From a scientific perspective, this research speaks to the implications and usefulness
of these metrics in determining behavioral
properties of a neural network. Integrated Information and Causal Density are not merely
mathematical measurements of a network's topology, but are theoretically indicative of
consciousness and intelligence. Previous efforts have been made to correlate metrics such
as network complexity to evolutionary time spent in Polyworld. The results of one such
experiment (concerning complexity) are shown below:
Complexity over Evolutionary Time, from Yaeger et al. [4]. Notice the steep increase and plateau at step 5000.
The scientific community wants to observe such trends so that we may understand
exactly which parts of a neural network must be strengthened to encourage intelligence. If
one can determine a set of metrics plausibly linked to intelligence, one can use heuristics
built upon those metrics to guide the evolution of new agents in simulation. My work
extends the efforts of those before me and adds to the world's
knowledge of how other such metrics change as neural networks evolve. I also hope to
introduce a new metric, Synergy, which can be used for many of the same analytical
reasons as Integrated Information, Complexity, and Causal Density.
Evolutionary computation has real-world, everyday uses. Experiments have been done
to show how evolutionary algorithms can be used to train robots to navigate a physical
maze, among other things [12]. Many flight control systems used by major airlines and
the Air Force were aided in their creation by evolutionary computation. Another example
is Automated Mathematician, a project in the late 1970s that aimed to create a machine
capable of “discovering” mathematical formulas using evolutionary algorithms. The
Automated Mathematician was able to rediscover Goldbach’s Conjecture and the Unique
Prime Factorization Theorem without human intervention [8].
3. Polyworld
Polyworld is a conceptual descendant of Lawrence J. Fogel's FSM evolution environment.
Utilizing the power of modern personal computers, Polyworld simulates
the evolution and learning processes of multiple complex haploid agents in a virtual (yet
realistic) environment. Polyworld's agents (Polyworldians) are under continuous control
of an Artificial Neural Network (ANN) that is encoded in each agent’s genome. The
ANN of a Polyworldian takes visual input from the rendered virtual environment (pixels),
and causes the creature to physically interact with that environment. In this way,
Polyworldians are completely immersed in this virtual playing field and end up evolving
to meet the needs of that environment. This system allows scientists to observe how
ANNs evolve through interaction with each other and their environment. Experiments
have already shown that behaviors such as cannibalism, tribalism, and mating rituals are
exhibited by Polyworldians without the intervention of a human [9].
A simulation running in Polyworld. Each rectangular block on the field is a representation of one Polyworldian.
Full anatomical and functional matrices of each Polyworldian’s ANN and
genealogy logs are automatically recorded along with a whole host of other data during a
Polyworld simulation. This recorded data is critical for the analysis of trends in
Polyworldian neural networks. To analyze trends in different mathematical metrics, I
wrote scripts that parse the data recorded by Polyworld and feed it to different
algorithms for metric calculation.
4. Metrics and Results
4.1 Causal Density
Causal Density is a measurement of how centralized a given neural network is.
More precisely, causal density reflects the number of interactions among nodes in a
neural network that are causally significant [13]. This metric is calculated by taking the
number of links that are deemed to be causally significant and dividing that number by a
value directly proportional to the size of the neural network.
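As a toy illustration of that formula (not Seth's actual procedure, in which significance is normally established by Granger-causality tests on recorded activity), the sketch below assumes the significance of each directed link is already given as a boolean matrix, and uses n(n-1), the number of possible directed links, as the size-proportional divisor:

```python
def causal_density(significant):
    """significant[i][j] is True when node i causally influences node j.

    Returns the fraction of possible directed links (n * (n - 1) of them)
    that are causally significant.
    """
    n = len(significant)
    links = sum(significant[i][j]
                for i in range(n) for j in range(n) if i != j)
    return links / (n * (n - 1))

# Example: a 3-node network with 2 significant links out of 6 possible.
net = [[False, True,  False],
       [False, False, True ],
       [False, False, False]]
print(causal_density(net))  # 2 of 6 possible links
```

The resulting value lies between 0 (no causally significant interactions) and 1 (every node significantly influences every other node).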
4.2 Integrated Information
Integrated Information (Φ) is the measure of how much information is contained
in a neural network and, at the same time, how synchronized the network is. The actual
algorithm by which scientists calculate Φ is beyond the scope of this paper (see [16]).
Information, for the sake of this measurement, is defined as the reduction in possible
states a given network experiences by the choosing of one particular future state. This
sounds like a mouthful, but is pretty intuitive in reality. Picture an unabridged dictionary
with millions of definitions to choose from. By picking a particular definition to study,
one has eliminated the millions of other definitions from their interest. With a smaller and
more concise dictionary, the number of possible definitions excluded from consideration
is much smaller when one definition in particular is chosen.
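The dictionary analogy can be put in numbers with the standard Shannon measure: choosing one outcome from N equally likely possibilities conveys log2(N) bits, so the larger dictionary corresponds to a larger reduction in uncertainty. (This toy calculation covers only the information half of the story; the full Φ computation is far more involved.)

```python
import math

def information_bits(num_possible_states):
    # Bits of information gained by settling on one of N equally
    # likely states: log2(N).
    return math.log2(num_possible_states)

big_dictionary   = 1_000_000  # definitions to choose from
small_dictionary = 1_000

print(information_bits(big_dictionary))   # roughly 20 bits
print(information_bits(small_dictionary)) # roughly 10 bits
```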
The second component of Φ is how synergized the network is. To understand
synergy, consider the following thought experiment: A human and an array of
photosensitive diodes (a camera) are placed in front of a projector screen. The human is
told to push a button whenever the screen is white, and to release the button when the
screen turns black. The camera is connected to a similar sort of device. When the screen
turns from bright to dark (and vice versa), both subjects will indicate the change properly.
Most would agree that the human subject consciously chose to press their button while
the camera did it without being conscious. Why is the human conscious and the camera
not? What makes the human unique?
One answer to those questions is that the human mind contains more integrated
information than the camera’s sensor. When the human subject sees the screen change
from light to dark, their mind explores a virtually limitless number of new possible states.
They might wonder whether the screen will flash green, whether the experiment will be
over soon, or whether they parked in an illegal parking space. The human mind
has a vast amount of information and is incredibly skilled at connecting and organizing it.
In other words, the human mind contains a relatively large amount of synergized
information, or Φ.
The camera, on the other hand, is simply allowing photons emitted by the
projector to free electrons locked within its photodiodes. Each photodiode
can occupy a continuum of states (off, on, or anywhere in between), so we
can assume that each photodiode contains some amount of information. The camera's Φ
remains low, however, due to the fact that each photodiode is on its own circuit and one
photodiode cannot really affect the state of another. This isolation of nodes drives the
camera’s integrated information very low and shows how synchronicity conceptually
affects Φ.
What, then, is the trend in Φ for a neural network being evolved in Polyworld? To
calculate this trend, I used the open source Consciousness project developed by Virgil
Griffith. This piece of software uses the fastest known, though still incredibly inefficient,
algorithm for calculating Φ; its running time is worse than exponential in the number of
nodes. Calculating Φ for a network of 11 nodes takes about 5 seconds on a modern
laptop, 12 nodes takes about 30 seconds, and 13 nodes takes longer than 10 minutes.
To observe trends in Polyworld's neural networks
(which generally have over 150 nodes), I had to manually prune unneeded nodes to
reduce their size. Due to the error introduced by this pruning process and
the sheer amount of time it consumed, I was unable to observe any noticeable
trend in Φ over evolutionary time. Researchers are currently working to reduce the run
time of Integrated Information calculation, though a better algorithm may still be years
away.
4.3 Synergy
During this study of Integrated Information, I became curious whether the
"synchronicity" factor alone could be an interesting metric for analysis. I therefore
developed an algorithm to quantify how closely connected a given neural network is. The
algorithm works recursively, summing the strength of every connection leaving a given
neuron and, proportionately scaled, the strength of every connection reachable thereafter.
It does this for each node in the network and returns the mean value over all
nodes. Refer to the pseudocode below for clarification of this novel algorithm.
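A sketch of this recursive calculation, written here in Python under stated assumptions: the network is represented as a weight matrix w where w[i][j] is the strength of the connection from neuron i to neuron j, the recursion is cut off at a fixed depth (a depth of 3 is used later in this paper), and each deeper connection is scaled by the product of the weights along the path reaching it. The exact scaling used in the original scripts may differ.

```python
def reach(w, node, depth, scale=1.0):
    """Scaled sum of connection strengths reachable from `node`."""
    if depth == 0:
        return 0.0
    total = 0.0
    for j, weight in enumerate(w[node]):
        if weight > 0.0:
            total += scale * weight                       # direct connection
            total += reach(w, j, depth - 1, scale * weight)  # scaled deeper paths
    return total

def synergy(w, depth=3):
    """Mean reach over every node in the network."""
    n = len(w)
    return sum(reach(w, i, depth) for i in range(n)) / n

# Example: a small 3-neuron chain, 0 -> 1 -> 2.
w = [[0.0, 0.5, 0.0],
     [0.0, 0.0, 0.4],
     [0.0, 0.0, 0.0]]
print(synergy(w))
```

The depth cutoff both bounds the running time and prevents infinite recursion around cycles in the network.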
In effect, Synergy tells us how easily an action potential emitted by the average
neuron can affect the entire network. This measurement can be useful and interesting in a
number of ways: it might be used to discover how vital a single employee is to a large
corporation, how well connected a given person is to the world as a whole, or even how
well a computer system is functioning on the Internet. The potential uses for this type of
standardized metric are broad.
I was curious to see the trend in Synergy as a neural network is evolved through a
system like Polyworld. If the theories about relationships between Integrated Information
and consciousness prove to be true, Synergy might also have a valid claim to such
importance. I present to you the trend of Synergy over evolutionary time:
Graph of synergy value for every hundredth death.
This trend in particular was calculated by observing every hundredth time step in
a single Polyworld simulation of approximately 30,000 steps. The resulting data is, as
you can see, quite noisy. The trend, however, is still clearly visible and, strangely enough,
seems to echo the trend in Complexity observed by Yaeger et al. [4]. The synergy value
of the Polyworldian neural networks increases noisily but steadily for the first
5,000 time steps. After that point, it seems to plateau, or even decline, for the
remainder of the simulation.
One possible source of error in the trend graph above is the small number of
trials, due in part to the amount of time the calculation took: computing synergy
to a depth of 3 for every 100 steps of a 30,000-step Polyworld run (as above)
required about 36 hours. Future efforts should focus on optimizing the synergy
calculation and running it on more raw data.
5. Conclusion and Future Plans
In conclusion, evolutionary artificial intelligence seems to be a promising field for
future advancement. More effort must be made to optimize the calculation of metrics
such as Integrated Information and Synergy so that more detailed trends may be
analyzed. Using these types of metrics to aid in the guided evolution of artificial neural
networks will undoubtedly yield new and exciting systems. I plan to continue studying
and modifying Polyworld in the hope of surpassing the current expectations of, and limits
on, artificial machines. As computers become more powerful, as quantum
computation becomes reality, and as the tools mentioned in this paper are given more
time to mature and grow, evolutionary computation is sure to be a fruitful and beneficial
field.
6. Bibliography
1. Giulio Tononi et al., Measuring Information Integration, BMC Neuroscience (2003).
2. Joseph T. Lizier et al., Functional and Structural Topologies in Evolved Neural
Networks (2009).
3. Larry Yaeger et al., Passive and Driven Trends in the Evolution of Complexity,
Journal of Artificial Life (2008).
4. Yaeger, L. S., Griffith, V., and Sporns, O. (2008). Passive and Driven Trends in the
Evolution of Complexity. In Bullock, S., Noble, J., Watson, R., and Bedau, M. A.,
editors, Artificial Life XI: Proceedings of the Eleventh International Conference on the
Simulation and Synthesis of Living Systems, p. 725-732. MIT Press, Cambridge, MA.
5. Gyorgy Buzsaki, Rhythms of the Brain, Oxford University Press, 2006.
6. Jean-Pierre Dupuy (translated by M. B. DeBevoise), On the Origins of Cognitive
Science: The Mechanization of the Mind, The MIT Press, 2009.
7. Peter Dayan and L. F. Abbott, Theoretical Neuroscience: Computational and
Mathematical Modeling of Neural Systems, The MIT Press, 2005.
8. Lawrence J. Fogel, Intelligence Through Simulated Evolution: Forty Years of
Evolutionary Programming (Wiley Series on Intelligent Systems), Wiley-Interscience, 1999.
9. Google Tech Talks, Polyworld: Using Evolution to Design Artificial Intelligence,
November 2007.
10. Christian Jacob, Illustrating Evolutionary Computation with Mathematica,
Morgan Kaufmann Publishers, 2001.
11. Eric R. Kandel, In Search of Memory: The Emergence of a New Science of Mind,
Norton Paperback, 2006.
12. Stefano Nolfi and Dario Floreano, Evolutionary Robotics: The Biology, Intelligence,
and Technology of Self-organizing Machines (Intelligent Robotics and Autonomous
Agents), The MIT Press, 2000.
13. Anil K. Seth, Causal Connectivity of Evolved Neural Networks During Behavior,
Johns Hopkins University (2009).
14. Murray Shanahan, On the Dynamical Complexity of Small-world Networks of
Spiking Neurons, to appear in Physical Review E (2009).
15. Gordon M. Shepherd (ed.), The Synaptic Organization of the Brain,
Oxford University Press, 2003.
16. Virgil Griffith et al., An Information-Based Measure of Synergistic Complexity
Based on Phi, unpublished, 2009.