Observing Trends in Novel Metrics of Neural Networks Created Through Artificial Evolution and Natural Selection
Jordan Perr
10/2/09
Version 3.x (Siemens Submission)

Table of Contents
1. Abstract
2. Introduction
   2.1. A Brief History of Cognitive Science
   2.2. Research
   2.3. Relevance
3. Polyworld
4. Metrics and Results
   4.1. Causal Density
   4.2. Integrated Information
   4.3. Synergy
5. Conclusion and Future Plans
6. Bibliography

1. Abstract

Researchers of artificial intelligence recognize that understanding consciousness is a major component of understanding intelligence. Beginning in the early sixties, computer scientists such as Lawrence J. Fogel applied concepts such as evolution and natural selection to the pursuit of more intelligent, and possibly conscious, artificial machines [8]. Today, researchers such as Virgil Griffith, Anil K. Seth, and Larry Yaeger continue that work by reviving and revamping techniques such as simulated evolution on modern personal computers. This paper extends their efforts by applying metrics such as Integrated Information, Causal Density, and my own metric, Synergy, to networks generated by Polyworld, and attempts to correlate each of these metrics with observed intelligence and evolutionary fitness. It offers an in-depth look at current research on such measurements of artificial neural networks, a critical look at the history of cognitive science, and the introduction of a new metric to the scientific community.

2. Introduction

2.1. A Brief History of Cognitive Science

What are life and consciousness? Until recently, such questions belonged to the domain of existential philosophers and white-haired mad scientists. Traditional "bottom-up" scientific inquiry of the 19th century saw little room for theoretical investigations of consciousness and mind, favoring more quantitative studies such as anatomy and physics [11]. In fact, some of the earliest thoughts pointing toward evolutionary artificial intelligence can be traced back to the likes of Charles Darwin and Fyodor Dostoevsky. In their time, the power of natural, deterministic forces over biological creatures was minimized by the scientific community (not to mention the Church and the general public). Early psychologists such as Sigmund Freud began to apply the scientific method to observed behavior in human subjects, combining the rigor of scientific inquiry with formerly philosophical ideas about behavior.

In the middle of the 20th century, the tools available to the scientific community became more robust and accurate. Scientists such as James Watson and Francis Crick uncovered an underlying mechanism, DNA, behind seemingly magical properties of biological entities. That discovery furthered the notion that intelligence and consciousness might well be explainable by scientific inquiry, and the field of cybernetics was born [6]. For the first time, scientists felt they had a clear path to studying and understanding the underlying physical mechanisms that manifest as consciousness and life. Toward the latter part of the 20th century, better microscopes and the invention of the modern computer changed cybernetics so drastically that the field came to the verge of collapse. Scientists began to combine the disciplines of neuroscience, anatomy, psychology, computer science, and even philosophy into one new discipline: cognitive science was born.
Currently, leading researchers are exploring how neural networks evolve in simulation, along with many different calculable metrics of those networks. This paper attempts to explain the state of this area of research, introduce the reader to several of these metrics, and present new data concerning them.

2.2. Research

My research started with Lawrence J. Fogel's book on his early experiments in simulated evolution [8]. Those experiments were conducted to show that iterative evolution alone could produce mechanisms that perform a given task better than randomly constructed machines, and they demonstrated that evolutionary algorithms can produce accurate solutions to complex problems. I set out to create a finite state machine (FSM) evolver on the modern Java Virtual Machine, and with it I was able to replicate his results. The speed at which my Java FSM evolver produced fit machines led me to believe that iterative evolution would be a fruitful approach to more complex problems, such as generating artificial intelligence.

[Figure: Results from my Java finite state machine evolver, similar to those of Fogel et al. (1966).]

After creating the finite state machine evolution environment described above, I wanted to create a program in which a virtual "creature" would evolve both physically and behaviorally, in an attempt to produce an agent resembling early carbon-based life. My research led me to a Google Tech Talk given by Virgil Griffith in 2007 [9]. In this talk, Griffith demonstrated how a program called Polyworld was being used to observe behavioral trends in virtual agents evolving within a virtual environment. The similarities between my idea and Polyworld were striking. I contributed modifications to Polyworld's source code that are being integrated into the main distribution and will be used by a number of scientists around the world. I felt comfortable working within Polyworld and decided to use it in my subsequent research.

2.3. Relevance

From a scientific perspective, this research speaks to the usefulness of these metrics in determining behavioral properties of a neural network. Integrated Information and Causal Density are not merely mathematical measurements of a network's topology; they are theoretically indicative of consciousness and intelligence. Previous efforts have been made to correlate metrics such as network complexity with evolutionary time spent in Polyworld. The results of one such experiment (concerning complexity) are shown below:

[Figure: Complexity over evolutionary time, from Yaeger et al. [4]. Note the steep increase and plateau around time step 5,000.]

The scientific community wants to observe such trends so that we may understand exactly which parts of a neural network must be strengthened to encourage intelligence. If a set of metrics can be plausibly linked to intelligence, heuristics built upon those metrics can be used to guide the evolution of new agents in simulation. My work extends the efforts of those before me and adds to our knowledge of how such metrics change as neural networks evolve. I also hope to introduce a new metric, Synergy, which can be used for many of the same analytical purposes as Integrated Information, Complexity, and Causal Density.
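To make the idea of metric-guided evolution concrete, the following minimal sketch (written in Python for readability rather than the Java used for my FSM evolver) shows an evolutionary loop that selects networks by a metric score. The toy network representation, the mutation operator, and the placeholder metric are illustrative assumptions only; they are not Polyworld's mechanism, nor the Synergy metric defined in Section 4.3.

    import random

    # Minimal sketch of metric-guided evolution (illustrative only).
    # random_network, mutate, and the placeholder metric stand in for
    # whatever representation and metric a real experiment would use.

    def random_network(n=8):
        # A toy "network": an n-by-n weight matrix with an empty diagonal.
        return [[random.random() if i != j else 0.0 for j in range(n)]
                for i in range(n)]

    def mutate(net):
        # Copy the parent and perturb one randomly chosen connection.
        child = [row[:] for row in net]
        i, j = random.sample(range(len(net)), 2)
        child[i][j] = max(0.0, child[i][j] + random.gauss(0, 0.1))
        return child

    def metric(net):
        # Placeholder heuristic: mean connection strength.
        n = len(net)
        return sum(map(sum, net)) / (n * (n - 1))

    population = [random_network() for _ in range(20)]
    for generation in range(100):
        # Keep the half of the population scoring highest on the metric,
        # then refill the population with mutated copies of survivors.
        population.sort(key=metric, reverse=True)
        survivors = population[:10]
        population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]

    population.sort(key=metric, reverse=True)
    print("best metric score:", metric(population[0]))

In a real experiment, the placeholder metric would be replaced by a measure such as Complexity, Causal Density, or Synergy, and selection would also account for the agent's performance in its environment.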
Evolutionary computation has real-world, everyday uses. Experiments have shown how evolutionary algorithms can be used to train robots to navigate a physical maze, among other tasks [12]. Many flight control systems used by major airlines and the Air Force were aided in their creation by evolutionary computation. Another example is the Automated Mathematician, a project from the late 1970s that aimed to create a machine capable of "discovering" mathematical formulas using evolutionary algorithms; it rediscovered Goldbach's Conjecture and the unique prime factorization theorem without human intervention [8].

3. Polyworld

Polyworld is a conceptual descendant of Lawrence J. Fogel's FSM evolution environment. Using the power of modern personal computers, Polyworld simulates the evolution and learning of many complex haploid agents in a virtual (yet realistic) environment. Polyworld's agents (Polyworldians) are under the continuous control of an artificial neural network (ANN) encoded in each agent's genome. A Polyworldian's ANN takes visual input from the rendered virtual environment (pixels) and causes the creature to physically interact with that environment. In this way, Polyworldians are completely immersed in the virtual playing field and evolve to meet the needs of that environment. The system allows scientists to observe how ANNs evolve through interaction with each other and with their environment. Experiments have already shown that behaviors such as cannibalism, tribalism, and mating rituals are exhibited by Polyworldians without human intervention [9].

[Figure: A simulation running in Polyworld. Each rectangular block on the field represents one Polyworldian.]

Full anatomical and functional matrices of each Polyworldian's ANN, along with genealogy logs and a host of other data, are automatically recorded during a Polyworld simulation. This recorded data is critical for analyzing trends in Polyworldian neural networks. To analyze trends in the different mathematical metrics, I wrote scripts that parse the data recorded by Polyworld and feed it to the various metric-calculation algorithms.

4. Metrics and Results

4.1. Causal Density

Causal Density is a measurement of how centralized a given neural network is. More precisely, causal density reflects the number of interactions among nodes in a neural network that are causally significant [13]. The metric is calculated by taking the number of links deemed causally significant and dividing that number by a value directly proportional to the size of the network.

4.2. Integrated Information

Integrated Information (Φ) measures how much information is contained by a neural network and, at the same time, how synchronized the network is. The actual algorithm by which Φ is calculated is beyond the scope of this paper (see [16]). Information, for the sake of this measurement, is defined as the reduction in possible states a network experiences when one particular future state is chosen. This sounds like a mouthful, but it is fairly intuitive. Picture an unabridged dictionary with millions of definitions to choose from. By picking one particular definition to study, one has eliminated the millions of other definitions from consideration. With a smaller, more concise dictionary, the number of possible definitions excluded from consideration when a single definition is chosen is much smaller.
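To attach rough numbers to the dictionary analogy, one can use the standard information-theoretic convention that choosing one of N equally likely alternatives eliminates log2(N) bits of uncertainty. The paper does not specify this formula, so the following is an illustrative reading with made-up dictionary sizes:

    import math

    # Uncertainty eliminated by choosing one of N equally likely alternatives,
    # measured in bits: log2(N).  The dictionary sizes are illustrative.
    unabridged = 1_000_000   # definitions in an unabridged dictionary
    concise = 10_000         # definitions in a concise dictionary

    print(f"unabridged: {math.log2(unabridged):.1f} bits ruled out")  # ~19.9
    print(f"concise:    {math.log2(concise):.1f} bits ruled out")     # ~13.3

On this reading, selecting a definition from the unabridged dictionary eliminates roughly twice as much uncertainty as selecting one from the concise dictionary, which is the sense in which the larger system "contains more information."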
The second component of Φ is how synergized the network is. To understand synergy, consider the following thought experiment. A human and an array of photosensitive diodes (a camera) are placed in front of a projector screen. The human is told to push a button whenever the screen is white and to release the button when the screen turns black; the camera is connected to a similar device. When the screen turns from bright to dark (and vice versa), both subjects indicate the change properly. Most would agree that the human subject consciously chose to press the button while the camera did so without being conscious. Why is the human conscious and the camera not? What makes the human unique?

One answer is that the human mind contains more integrated information than the camera's sensor. When the human subject sees the screen change from light to dark, their mind explores a virtually limitless number of new possible states. They might wonder whether the screen will flash green, whether the experiment will be over soon, or whether they parked in an illegal parking space. The human mind holds a vast amount of information and is remarkably skilled at connecting and organizing it. In other words, the human mind contains a relatively large amount of synergized information, or Φ. The camera, on the other hand, simply allows photons emitted by the projector to free electrons locked within its photodiodes. Each photodiode can take on a wide range of states (off, on, or anywhere in between), so we can assume that each photodiode contains some amount of information. The camera's Φ remains low, however, because each photodiode sits on its own circuit and cannot affect the state of another. This isolation of nodes drives the camera's integrated information very low and shows how synchronicity conceptually affects Φ.

What, then, is the trend in Φ for a neural network being evolved in Polyworld? To calculate this trend, I used the open-source Consciousness project developed by Virgil Griffith. This software uses the fastest known, though still extremely inefficient, algorithm for calculating Φ; its running time grows worse than exponentially with network size. Calculating Φ for a network of 11 nodes takes about 5 seconds on a modern laptop, 12 nodes takes about 30 seconds, and 13 nodes takes longer than 10 minutes. To observe trends in Polyworld's neural networks (which generally have over 150 nodes), I had to manually prune unneeded nodes to reduce the size of those networks. Because of the error introduced by this pruning and the sheer amount of time the process consumed, I was unable to observe any noticeable trend in Φ over evolutionary time. Researchers are currently working to reduce the run time of the Integrated Information calculation, though a better algorithm may still be years away.

4.3. Synergy

During this study of Integrated Information, I became curious whether the synchronicity factor alone could be an interesting metric for analysis. I therefore developed an algorithm to quantify how densely interconnected a given neural network is. The algorithm works recursively: starting from a given neuron, it sums the strength of every outgoing connection and, scaled proportionately, the strengths of the connections reachable beyond them. It does this for each node in the network and returns the mean value over all nodes. The pseudocode below clarifies this algorithm.
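The following is a minimal Python rendering of that pseudocode, reconstructed from the description above. The weighted-adjacency representation, the proportional scaling rule (multiplying each deeper level by the strength of the link that was followed), and the default depth of three are assumptions drawn from the surrounding text rather than the original implementation.

    # Sketch of the Synergy calculation described above.  The adjacency
    # representation, scaling rule, and depth limit are reconstructed from
    # the prose and should be read as illustrative rather than exact.

    def node_synergy(weights, node, depth, scale=1.0):
        # Recursively sum outgoing connection strengths from `node`,
        # scaling each deeper level by the strength of the link followed.
        if depth == 0:
            return 0.0
        total = 0.0
        for target, strength in enumerate(weights[node]):
            if strength == 0.0:
                continue
            total += scale * strength
            total += node_synergy(weights, target, depth - 1, scale * strength)
        return total

    def synergy(weights, depth=3):
        # Mean per-node synergy over the whole network.
        n = len(weights)
        return sum(node_synergy(weights, i, depth) for i in range(n)) / n

    # Example: a tiny 3-node network (weights[i][j] = strength of i -> j).
    weights = [
        [0.0, 0.8, 0.2],
        [0.1, 0.0, 0.5],
        [0.0, 0.3, 0.0],
    ]
    print(synergy(weights, depth=3))

Because the recursion multiplies strengths along each path, connections far from the starting node contribute only weakly, which matches the intuition that the metric captures how easily activity at one node can spread through the rest of the network.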
In effect, Synergy tells us how easily an action potential emitted by the average neuron can affect the entire network. Such a measurement could be useful and interesting in a number of ways: it might be used to discover how vital a single employee is to a large corporation, how well connected a given person is to the world as a whole, or even how well a computer system is functioning on the internet. The potential uses for this type of standardized metric are broad.

I was curious to see the trend in Synergy as a neural network is evolved through a system like Polyworld. If the theorized relationship between Integrated Information and consciousness proves true, Synergy might also have a valid claim to such importance. The trend of Synergy over evolutionary time is shown below:

[Figure: Graph of the Synergy value at every hundredth death.]

This trend was calculated by observing every hundredth time step of a single Polyworld simulation of approximately 30,000 steps. The resulting data is quite noisy. The trend, however, is still clearly visible and, strangely enough, seems to echo the trend in Complexity observed by Yaeger et al. [4]. The Synergy value of the Polyworldian neural networks increases noisily but steadily for the first 5,000 time steps; after this point, it plateaus, or even declines, for the remainder of the simulation. One possible source of error in this trend is the small number of trials, due in part to the amount of time the calculation took: computing Synergy to a depth of 3 for every 100 steps of a 30,000-step Polyworld run (as above) required about 36 hours. Future efforts should focus on optimizing the Synergy calculation and running it on more raw data.

5. Conclusion and Future Plans

In conclusion, evolutionary artificial intelligence appears to be a promising field for future advancement. More effort must be made to optimize the calculation of metrics such as Integrated Information and Synergy so that more detailed trends can be analyzed. Using these kinds of metrics to guide the evolution of artificial neural networks will likely yield new and exciting systems. I plan to continue studying and modifying Polyworld with the hope of surpassing the current expectations and limits placed on artificial machines. As computers become more powerful, as quantum computation becomes reality, and as the tools mentioned in this paper are given more time to mature, evolutionary computation is sure to be a fruitful and beneficial field.

6. Bibliography

1. Giulio Tononi et al., "Measuring Information Integration," BMC Neuroscience, 2003.
2. Joseph T. Lizier et al., "Functional and Structural Topologies in Evolved Neural Networks," 2009.
3. Larry Yaeger et al., "Passive and Driven Trends in the Evolution of Complexity," Journal of Artificial Life, 2008.
4. Yaeger, L. S., Griffith, V., and Sporns, O., "Passive and Driven Trends in the Evolution of Complexity," in Bullock, S., Noble, J., Watson, R., and Bedau, M. A. (eds.), Artificial Life XI: Proceedings of the Eleventh International Conference on the Simulation and Synthesis of Living Systems, pp. 725-732, MIT Press, Cambridge, MA, 2008.
5. Gyorgy Buzsaki, Rhythms of the Brain, Oxford University Press, 2006.
6. Jean-Pierre Dupuy (translated by M. B. DeBevoise), On the Origins of Cognitive Science: The Mechanization of the Mind, MIT Press, 2009.
7. Peter Dayan and L. F. Abbott, Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems, MIT Press, 2005.
8. Lawrence J. Fogel, Intelligence Through Simulated Evolution: Forty Years of Evolutionary Programming (Wiley Series on Intelligent Systems), Wiley-Interscience, 1999.
9. Google Tech Talks, "Polyworld: Using Evolution to Design Artificial Intelligence," November 2007.
10. Christian Jacob, Illustrating Evolutionary Computation with Mathematica, Morgan Kaufmann Publishers, 2001.
11. Eric R. Kandel, In Search of Memory: The Emergence of a New Science of Mind, W. W. Norton, 2006.
12. Stefano Nolfi and Dario Floreano, Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines (Intelligent Robotics and Autonomous Agents), MIT Press, 2000.
13. Anil K. Seth, "Causal Connectivity of Evolved Neural Networks During Behavior," Johns Hopkins University, 2009.
14. Murray Shanahan, "On the Dynamical Complexity of Small-World Networks of Spiking Neurons," to appear in Physical Review E, 2009.
15. Gordon M. Shepherd (ed.), The Synaptic Organization of the Brain, Oxford University Press, 2003.
16. Virgil Griffith et al., "An Information-Based Measure of Synergistic Complexity Based on Phi," unpublished, 2009.