CAP6938 Neuroevolution and Artificial Embryogeny
Competitive Coevolution
Dr. Kenneth Stanley
February 20, 2006

Example: I Want to Evolve a Go Player
• Go is one of the hardest games for computers
• I am terrible at it
• There are no good Go programs either (hypothetically)
• I have no idea how to measure the fitness of a Go player
• How can I make evolution solve this problem?

Generally: Fitness May Be Difficult to Formalize
• The optimal policy in competitive domains is unknown
• Only the winner and loser of a match can be easily determined
• What can be done?

Competitive Coevolution
• Coevolution: no absolute fitness function
• Fitness depends on direct comparisons with other evolving agents
• The hope is to discover solutions beyond the ability of any fixed fitness function to describe
• Competition should lead to an escalating arms race

The Arms Race

The Arms Race Is an AI Dream
• The computer plays itself and becomes champion
• No need for human knowledge whatsoever
• In practice, progress eventually stagnates (Darwen 1996; Floreano and Nolfi 1997; Rosin and Belew 1997)

So Who Plays Against Whom?
• If evaluation is expensive, everyone can't play everyone
• Even if they could, many candidates might be very poor
• If not everyone, who then is chosen as competition for each candidate?
• Need some kind of intelligent sampling

Challenges with Choosing the Right Opponents
• Red Queen Effect: running in circles
  – A dominates B
  – B dominates C
  – C dominates A
  – Relative fitness cycles without absolute progress
• Overspecialization
  – Optimizing a single skill to the neglect of all others
  – Likely to happen without diverse opponents in the sample
• Several other failure dynamics

Heuristic in NEAT: Utilize Species Champions
Each individual plays all the species champions and keeps a score

Hall of Fame (HOF) (Rosin and Belew 1997)
• Keep a list of past champions
• Add them to the mix of opponents
• If the HOF gets too big, sample from it

More Recently: Pareto Coevolution
• Separate learners and tests
• Tests are rewarded for distinguishing learners from each other
• Learners are ranked in Pareto layers
  – Each test is an objective
  – If X wins against a strict superset of the tests that Y wins against, then X Pareto-dominates Y
  – The first layer is the nondominated front
  – Think of tests as objectives in a multiobjective optimization problem
• Potentially costly: all learners play all tests
• De Jong, E.D. and J.B. Pollack (2004). Ideal Evaluation from Coevolution. Evolutionary Computation, Vol. 12, Issue 2, pp. 159-192. The MIT Press.

Choosing Opponents Isn't Everything
• How can new solutions be continually created that maintain existing capabilities?
• Mutations that lead to innovations could simultaneously lead to losses
• What kind of process ensures elaboration over alteration?

Alteration vs. Elaboration

Answer: Complexification
• Fixed-length genomes limit progress
• Dominant strategies that utilize the entire genome must alter existing structure, thereby sacrificing prior functionality
• If new genes can be added, dominant strategies can be elaborated, maintaining existing capabilities

Test Domain: Robot Duel
• The robot with higher energy wins by colliding with its opponent
• Moving costs energy
• Collecting food replenishes energy
• Complex task: when to forage/save energy, avoid/pursue?
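The Hall of Fame heuristic above can be sketched in a few lines. This is a minimal illustration, not the API of any particular neuroevolution package: `play`, `HallOfFame`, and the sample-size cap `HOF_MAX_SAMPLE` are all assumed names.

```python
import random

# Sketch of Hall of Fame opponent selection (Rosin & Belew 1997).
# All names here are illustrative assumptions, not an established API.

HOF_MAX_SAMPLE = 8  # assumed cap on opponents drawn from the hall

class HallOfFame:
    def __init__(self):
        self.champions = []

    def add(self, champion):
        """Record the champion of the latest generation."""
        self.champions.append(champion)

    def opponents(self):
        """All past champions, or a random sample once the hall grows large."""
        if len(self.champions) <= HOF_MAX_SAMPLE:
            return list(self.champions)
        return random.sample(self.champions, HOF_MAX_SAMPLE)

def evaluate(candidate, species_champions, hof, play):
    """Score a candidate by wins against species champions plus HOF opponents.

    `play(a, b)` is an assumed function returning the winner of one match.
    """
    opponents = list(species_champions) + hof.opponents()
    return sum(1 for opp in opponents if play(candidate, opp) == candidate)
```

Mixing current species champions with sampled past champions is what guards against the Red Queen effect: a candidate cannot score well by beating only the current generation while losing to strategies that were already solved.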
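The Pareto-layer ranking described under Pareto coevolution can be sketched as follows. This is a minimal sketch, assuming the full learner-vs-test outcome matrix has already been computed (the potentially costly step noted above); `wins`, `dominates`, and `pareto_layers` are illustrative names.

```python
# Sketch of Pareto-layering learners by the tests they beat.
# Assumes a boolean matrix: wins[i][j] is True iff learner i beats test j.

def beaten_tests(wins, x):
    """The set of tests that learner x wins against."""
    return {j for j, won in enumerate(wins[x]) if won}

def dominates(wins, x, y):
    """X Pareto-dominates Y iff X beats a strict superset of Y's tests."""
    return beaten_tests(wins, x) > beaten_tests(wins, y)

def pareto_layers(wins):
    """Repeatedly peel off the nondominated front: layer 1, layer 2, ..."""
    remaining = set(range(len(wins)))
    layers = []
    while remaining:
        front = {x for x in remaining
                 if not any(dominates(wins, y, x) for y in remaining if y != x)}
        layers.append(sorted(front))
        remaining -= front
    return layers

# Example: 4 learners, 3 tests
wins = [
    [True, True, True],   # learner 0 beats every test
    [True, True, False],
    [True, False, False],
    [True, True, False],  # same test set as learner 1: neither dominates
]
print(pareto_layers(wins))  # learner 0 alone on the first front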
Robot Neural Networks

Experimental Setup
• 13 complexifying runs, 15 fixed-topology runs
• 500 generations per run
• 2-population coevolution with hall of fame (Rosin and Belew 1997)

Performance Is Difficult to Evaluate in Coevolution
• How can you tell if things are improving when everything is relative?
  – The number of wins is relative to each generation
• No absolute measure is available
• No benchmark is comprehensive

Expensive Method: Master Tournament (Cliff and Miller 1995; Floreano and Nolfi 1997)
• Compare all generation champions to each other
• Requires n^2 evaluations
  – An accurate evaluation may involve e.g. 288 games
• Defeating more champions does not establish superiority

Strict and Efficient Performance Measure: Dominance Tournament (Stanley and Miikkulainen 2002)

Result: Evolution of Complexity
• As dominance increases, so does complexity on average
• Networks with strictly superior strategies are more complex

Comparing Performance

Summary of Performance Comparisons

The Superchamp

Cooperative Coevolution
• Groups attempt to work with each other instead of against each other
• But sometimes it is not clear what counts as cooperation and what counts as competition
• Maybe competitive/cooperative is not the best distinction?
  – Newer idea: compositional vs. test-based

Summary
• Picking the best opponents
• Maintaining and elaborating on strategies
• Measuring performance
• Different types of coevolution
• Advanced papers on coevolution:
  – De Jong, E.D. and J.B. Pollack (2004). Ideal Evaluation from Coevolution.
  – Ficici, Sevan G. (2005). Monotonic Solution Concepts in Coevolution.

Next Topic: Real-time NEAT (rtNEAT)
• Simultaneous and asynchronous evaluation
• Non-generational
• Useful in video games and simulations
• NERO: video game with rtNEAT
  – Shorter symposium paper: Evolving Neural Network Agents in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
  – Optional journal (longer, more detailed) paper: Real-time Neuroevolution in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
  – http://Nerogame.org
  – Extra coevolution papers

Homework due 2/27/06
• Working genotype-to-phenotype mapping
• Genetic representation completed
• Saving and loading of genome file I/O functions completed
• Turn in a summary, code, and examples demonstrating that it works

Project Milestones (25% of grade)
• 2/6: Initial proposal and project description
• 2/15: Domain and phenotype code and examples
• 2/27: Genes and genotype-to-phenotype mapping
• 3/8: Genetic operators all working
• 3/27: Population level and main loop working
• 4/10: Final project and presentation due (75% of grade)
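As a closing illustration, the dominance tournament mentioned under performance measurement can be sketched as below. This is a hedged sketch of the idea from Stanley and Miikkulainen (2002), under the assumption that a generation champion is admitted as a new dominant strategy only if it defeats every previous dominant strategy; `play` is an assumed match function.

```python
# Sketch of the dominance tournament (Stanley & Miikkulainen 2002):
# a champion joins the dominant sequence only by beating all previous
# dominant strategies, so each dominant strategy is strictly superior
# to the ones before it. `play(a, b)` is an assumed function returning
# the winner of one match (or of an averaged series of matches).

def dominance_tournament(generation_champions, play):
    dominant = []  # strictly improving sequence of dominant strategies
    for champ in generation_champions:
        if all(play(champ, prev) == champ for prev in dominant):
            dominant.append(champ)
    return dominant
```

Unlike the master tournament's n^2 comparisons, each champion here plays only the current dominant strategies, and the length of the resulting sequence gives an absolute notion of progress within a run.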