Lecture 12 - UCF Computer Science


CAP6938
Neuroevolution and
Artificial Embryogeny
Competitive Coevolution
Dr. Kenneth Stanley
February 20, 2006
Example:
I Want to Evolve a Go Player
• Go is one of the hardest games for computers
• I am terrible at it
• There are no good Go programs either
(hypothetically)
• I have no idea how to measure the fitness of a
Go player
• How can I make evolution solve this problem?
Generally: Fitness May Be Difficult
to Formalize
• Optimal policy in competitive domains unknown
• Only winner and loser can be easily determined
• What can be done?
Competitive Coevolution
• Coevolution: No absolute fitness function
• Fitness depends on direct comparisons
with other evolving agents
• Hope to discover solutions beyond the
ability of fitness to describe
• Competition should lead to an escalating
arms race
The Arms Race
The Arms Race is an AI Dream
• Computer plays itself and becomes
champion
• No need for human knowledge
whatsoever
• In practice, progress eventually stagnates
(Darwen 1996; Floreano and Nolfi 1997;
Rosin and Belew 1997)
So Who Plays Against Whom?
• If evaluation is expensive, everyone can’t
play everyone
• Even if they could, a lot of candidates
might be very poor
• If not everyone, who then is chosen as
competition for each candidate?
• Need some kind of intelligent sampling
Challenges with Choosing the
Right Opponents
• Red Queen Effect: Running in Circles
– A dominates B
– B dominates C
– C dominates A
• Overspecialization
– Optimizing a single skill to the neglect of all others
– Likely to happen without diverse opponents in sample
• Several other failure dynamics
Heuristic in NEAT:
Utilize Species Champions
Each individual plays all the species champions and keeps a score
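A minimal Python sketch of this heuristic, assuming a hypothetical play_match(a, b) helper that returns 1 if a defeats b and 0 otherwise:

def evaluate_against_champions(individual, species_champions, play_match):
    """Score an individual by playing the champion of every species.

    Only the champions are used as opponents, so each individual needs
    one match per species rather than one per population member.
    """
    return sum(play_match(individual, champ) for champ in species_champions)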
Hall of Fame (HOF)
(Rosin and Belew 1997)
• Keep around a list of past champions
• Add them to the mix of opponents
• If the HOF gets too big, sample from it
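A minimal Python sketch of this bookkeeping; the class name and the max_sample threshold are hypothetical, not from Rosin and Belew's implementation:

import random

class HallOfFame:
    """Keep past champions and mix some of them into the opponent pool."""

    def __init__(self, max_sample=10):
        self.champions = []
        self.max_sample = max_sample

    def add(self, champion):
        self.champions.append(champion)

    def opponents(self, current_opponents):
        # If the hall has grown too large, play a random sample of it
        # instead of every stored champion.
        if len(self.champions) > self.max_sample:
            sample = random.sample(self.champions, self.max_sample)
        else:
            sample = list(self.champions)
        return current_opponents + sample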
More Recently:
Pareto Coevolution
• Separate learners and tests
• The tests are rewarded for distinguishing
learners from each other
• The learners are ranked in Pareto layers
– Each test is an objective
– If X wins against a superset of the tests that Y wins against,
then X Pareto-dominates Y (sketched below)
– The first layer is a nondominated front
– Think of tests as objectives in a multiobjective
optimization problem
• Potentially costly: All learners play all tests
De Jong, E.D. and Pollack, J.B. (2004). Ideal Evaluation from Coevolution. Evolutionary Computation, 12(2):159-192. MIT Press.
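A minimal Python sketch of the ranking idea, assuming a hypothetical outcomes mapping in which outcomes[learner][test] is True when that learner beats that test:

def tests_won(outcomes, learner):
    """The set of tests this learner wins."""
    return {t for t, won in outcomes[learner].items() if won}

def pareto_dominates(outcomes, x, y):
    # X dominates Y if X wins a strict superset of the tests Y wins.
    return tests_won(outcomes, x) > tests_won(outcomes, y)

def nondominated_front(outcomes):
    """Learners that no other learner Pareto-dominates (the first layer)."""
    learners = list(outcomes)
    return [x for x in learners
            if not any(pareto_dominates(outcomes, y, x) for y in learners)]

Later layers can be obtained by removing the front and repeating, which is where the all-learners-play-all-tests cost comes from.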
Choosing Opponents Isn’t
Everything
• How can new solutions be continually
created that maintain existing capabilities?
• Mutations that lead to innovations could
simultaneously lead to losses
• What kind of process ensures elaboration
over alteration?
Alteration vs. Elaboration
Answer: Complexification
• Fixed-length genomes limit progress
• Dominant strategies that utilize the entire
genome must alter and thereby sacrifice
prior functionality
• If new genes can be added, dominant
strategies can be elaborated, maintaining
existing capabilities
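One concrete mechanism is NEAT's add-node mutation, which splits an existing connection: the new node's incoming weight is 1.0 and its outgoing weight copies the old connection's weight, so the network's behavior is initially near-unchanged while new structure becomes available for elaboration. A minimal sketch, with hypothetical Genome/Connection containers:

def add_node_mutation(genome, connection, next_innovation):
    """Split an existing connection with a new hidden node.

    The old connection is disabled rather than deleted.  Incoming
    weight 1.0 and outgoing weight equal to the old weight mean the
    new structure barely perturbs the existing strategy, elaborating
    it instead of altering it.
    """
    connection.enabled = False
    new_node = genome.new_hidden_node()
    genome.add_connection(connection.source, new_node,
                          weight=1.0, innovation=next_innovation())
    genome.add_connection(new_node, connection.target,
                          weight=connection.weight,
                          innovation=next_innovation())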
Test Domain: Robot Duel
• Robot with higher energy wins by colliding with opponent
• Moving costs energy
• Collecting food replenishes energy
• Complex task: when to forage/save energy, avoid/pursue?
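A minimal sketch of the energy accounting these rules imply; the constants and the Robot container are hypothetical, not the values from the actual domain:

MOVE_COST = 0.1     # hypothetical per-step cost of moving
FOOD_ENERGY = 0.5   # hypothetical energy gained per food item

def step(robot, moved, ate_food):
    # Moving drains energy; eating food replenishes it.
    if moved:
        robot.energy -= MOVE_COST
    if ate_food:
        robot.energy += FOOD_ENERGY

def resolve_collision(a, b):
    # On contact, the robot with higher energy wins the duel.
    if a.energy == b.energy:
        return None
    return a if a.energy > b.energy else b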
Robot Neural Networks
Experimental Setup
• 13 complexifying runs, 15 fixed-topology
runs
• 500 generations per run
• 2-population coevolution with hall of fame
(Rosin & Belew 1997)
Performance is Difficult to Evaluate
in Coevolution
• How can you tell if things are improving
when everything is relative?
– Number of wins is relative to each generation
• No absolute measure is available
• No benchmark is comprehensive
Expensive Method:
Master Tournament
(Cliff and Miller 1995; Floreano and Nolfi 1997)
• Compare all generation champions to each
other
• Requires n^2 evaluations
– An accurate evaluation of each pairing may involve many games (e.g., 288)
• Defeating more champions does not establish
superiority
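A minimal Python sketch of the bookkeeping, assuming a hypothetical play_match(a, b) that returns the winner (or None for a draw):

from itertools import combinations

def master_tournament(champions, play_match, games_per_pairing=1):
    """Play every generation champion against every other one.

    With n champions this is O(n^2) pairings, which is why the
    method is expensive.
    """
    wins = {c: 0 for c in champions}
    for a, b in combinations(champions, 2):
        for _ in range(games_per_pairing):
            winner = play_match(a, b)
            if winner is not None:
                wins[winner] += 1
    return wins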
Strict and Efficient Performance Measure:
Dominance Tournament
(Stanley & Miikkulainen 2002)
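In the dominance tournament, a generation champion enters the dominance hierarchy only if it defeats every previous dominant strategy, so each new entry is strictly superior to all earlier ones and far fewer games are needed than in the master tournament. A minimal sketch, assuming a hypothetical beats(a, b) that decides a full match:

def dominance_tournament(generation_champions, beats):
    """Build the hierarchy of dominant strategies in order.

    A champion becomes the next dominant strategy only if it beats
    every earlier dominant strategy.
    """
    dominant = []
    for champ in generation_champions:
        if all(beats(champ, d) for d in dominant):
            dominant.append(champ)
    return dominant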
Result: Evolution of Complexity
• As dominance increases, so does complexity on average
• Networks with strictly superior strategies are more
complex
Comparing Performance
Summary of Performance
Comparisons
The Superchamp
Cooperative Coevolution
• Groups attempt to work with each other
instead of against each other
• But sometimes it’s not clear what’s
cooperation and what’s competition
• Maybe competitive/cooperative is not the
best distinction?
– Newer idea: Compositional vs. test-based
Summary
• Picking best opponents
• Maintaining and elaborating on strategies
• Measuring performance
• Different types of coevolution
• Advanced papers on coevolution:
– Ideal Evaluation from Coevolution by De Jong, E.D. and Pollack, J.B. (2004)
– Monotonic Solution Concepts in Coevolution by Ficici, Sevan G. (2005)
Next Topic:
Real-time NEAT (rtNEAT)
• Simultaneous and asynchronous
evaluation
• Non-generational
• Useful in video games and simulations
• NERO: Video game with rtNEAT
- Shorter symposium paper: Evolving Neural Network Agents in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
- Optional journal (longer, more detailed) paper: Real-time Neuroevolution in the NERO Video Game by Kenneth O. Stanley and Risto Miikkulainen (2005)
- http://Nerogame.org
- Extra coevolution papers
Homework due 2/27/06: Working genotype-to-phenotype mapping. Genetic representation completed. Saving and loading of genome file (I/O functions) completed. Turn in a summary, code, and examples demonstrating that it works.
Project Milestones (25% of grade)
• 2/6: Initial proposal and project description
• 2/15: Domain and phenotype code and examples
• 2/27: Genes and genotype-to-phenotype mapping
• 3/8: Genetic operators all working
• 3/27: Population level and main loop working
• 4/10: Final project and presentation due (75% of grade)