Evolving the goal priorities of autonomous agents
Adam Campbell*
Advisor: Dr. Annie S. Wu*
Collaborator: Dr. Randall Shumaker**
School of Electrical Engineering and Computer Science*
Institute for Simulation and Training**
Goal
Develop a controller for a team of collaborating autonomous vehicles
- Simple implementation
- Allows new goals (behaviors) to be added easily
- Social interactions between the agents could be added in the future
Evolve the parameters of this controller to determine how the goal weights correlate with different environments
- These simple tests give us a better idea of how the goals interact with one another
- Evolving the goal priorities will make it easier to hand-code the parameters for future experiments
Motivation
Prioritizing conflicting, parallel goals in a robot controller is a difficult, open problem in artificial intelligence
This research examines an evolutionary approach to the action selection problem
Action selection
Imagine an insect with two goals:
- Get food
- Avoid predator
When should the "get food" goal have higher priority than the "avoid predator" goal?
Two general methods:
- Take one action
- Combine actions (the method used in this research)
Genetic algorithm
Survival of the fittest among candidate problem solutions
General algorithm (a minimal sketch in Python follows this list):
1) Initialize random population
2) Evaluate population
3) Select individuals
4) Recombine/mutate selected individuals
5) If stopping condition not satisfied
6) GOTO 2
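To make the loop concrete, here is a minimal, hypothetical Python sketch of the algorithm above, applied to the bit-string example on the next slide (fitness = number of black squares, i.e. 1-bits). Tournament selection and all parameter values here are illustrative assumptions, not the implementation used in this work.

```python
import random

# Minimal GA sketch; an illustrative assumption, not the actual controller code.
# Fitness counts "black squares" (1-bits), matching the example slide.
GENOME_LEN = 9
POP_SIZE = 6
CROSSOVER_RATE = 0.9
MUTATION_RATE = 0.005  # per-bit mutation probability


def fitness(ind):
    return sum(ind)  # number of black squares found


def tournament_select(pop, k=2):
    # Pick the fitter of k randomly chosen individuals.
    return max(random.sample(pop, k), key=fitness)


def crossover(a, b):
    # One-point crossover at a random crossover point.
    point = random.randrange(1, GENOME_LEN)
    return a[:point] + b[point:]


def mutate(ind):
    # Flip each bit independently with probability MUTATION_RATE.
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in ind]


def run_ga(generations=50):
    # 1) Initialize random population
    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
    for _ in range(generations):  # 5)-6) loop back until the stopping condition holds
        offspring = []
        for _ in range(POP_SIZE):
            # 2)-3) Evaluate and select individuals (evaluation happens inside selection)
            a, b = tournament_select(pop), tournament_select(pop)
            # 4) Recombine/mutate the selected individuals
            child = crossover(a, b) if random.random() < CROSSOVER_RATE else a[:]
            offspring.append(mutate(child))
        pop = offspring
    return max(pop, key=fitness)


print(run_ga())  # tends toward all 1s, e.g. [1, 1, 1, 1, 1, 1, 1, 1, 1]
```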
Genetic algorithm example
Problem: find all black squares
[Figure: a random population of six candidate solutions, with fitness values (number of black squares found) of 4, 3, 3, 4, 5, and 2]
GA example continued
[Figure: the selected population after crossover and mutation, with fitness values of 5, 4, 2, 6, 4, and 5; the legend marks the crossover point]
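For a concrete view of the operators in the figure, here is a small hypothetical trace of one-point crossover followed by per-bit mutation on two bit-string parents; the crossover point is drawn at random, as the legend's marker indicates.

```python
import random

random.seed(1)  # fixed seed for a reproducible trace

def one_point_crossover(a, b):
    # Split both parents at one random crossover point and splice the halves.
    point = random.randrange(1, len(a))
    return a[:point] + b[point:], point

def mutate(ind, rate=0.005):
    # Flip each bit independently with a small probability.
    return [bit ^ 1 if random.random() < rate else bit for bit in ind]

parent_a = [1, 0, 1, 1, 0, 1, 0, 0, 1]  # fitness 5 (five 1-bits)
parent_b = [0, 1, 1, 0, 1, 0, 0, 1, 0]  # fitness 4
child, point = one_point_crossover(parent_a, parent_b)
print("crossover point:", point)
print("child after mutation:", mutate(child))
```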
How is the GA used?
Immediate goal functions
- Produce a vector indicating where the agent should move in order to best satisfy the goal
- Each immediate goal has a weight associated with it
Five immediate goal functions (a combination sketch follows this list):
- Avoid agent
- Avoid obstacle
- Momentum
- Go to area of interest (AOI)
- Follow obstacle
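The slides do not spell out exactly how the weighted goal vectors are merged; a natural reading, sketched below under that assumption, is a weighted sum of the vectors returned by the immediate goal functions, with the weights being the values the GA evolves. The goal names and numbers here are hypothetical.

```python
import math

# Hypothetical sketch: combine immediate goal vectors by weighted sum.
# The five goal names mirror the slide; the weights are the values
# evolved by the GA, each in [0.0, 1.0].

def combine_goal_vectors(goal_vectors, weights):
    """goal_vectors: dict mapping goal name -> (x, y) vector.
    weights: dict mapping goal name -> evolved weight in [0.0, 1.0].
    Returns the normalized movement direction for the agent."""
    x = sum(weights[g] * vx for g, (vx, vy) in goal_vectors.items())
    y = sum(weights[g] * vy for g, (vx, vy) in goal_vectors.items())
    norm = math.hypot(x, y)
    return (x / norm, y / norm) if norm > 0 else (0.0, 0.0)

# Example: an agent pulled toward an AOI while avoiding a nearby agent.
vectors = {
    "avoid_agent":     (-1.0, 0.0),
    "avoid_obstacle":  (0.0, 0.0),  # no obstacle in range
    "momentum":        (0.6, 0.8),  # keep current heading
    "goto_aoi":        (1.0, 0.0),
    "follow_obstacle": (0.0, 0.0),
}
weights = {"avoid_agent": 0.9, "avoid_obstacle": 0.7,
           "momentum": 0.2, "goto_aoi": 0.5, "follow_obstacle": 0.3}
print(combine_goal_vectors(vectors, weights))  # roughly (-0.87, 0.50)
```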
Additional parameters
- Randomness
- Comfort
[Figure: agent behavior at comfort values 0.00, 0.01, and 0.04, illustrating how comfort allows obstacle following to occur]
Parameters
Simulation parameters:
- Test cases: 2
- Simulation ticks: 10000
- Agents: 25
- Runs per test case: 30
Genetic algorithm parameters:
- Population size: 50
- Generations: 50
- Crossover rate: 0.9
- Mutation rate (per weight): 0.005
- Weight range: [0.0, 1.0]
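For concreteness, the parameter table maps directly onto a configuration object; a minimal sketch with invented field names:

```python
from dataclasses import dataclass

# Hypothetical configuration mirroring the parameter table above;
# the field names are invented for illustration.
@dataclass(frozen=True)
class ExperimentConfig:
    # Simulation parameters
    test_cases: int = 2
    simulation_ticks: int = 10_000
    agents: int = 25
    runs_per_test_case: int = 30
    # Genetic algorithm parameters
    population_size: int = 50
    generations: int = 50
    crossover_rate: float = 0.9
    mutation_rate_per_weight: float = 0.005
    weight_range: tuple = (0.0, 1.0)

config = ExperimentConfig()
print(config.population_size * config.generations)  # 2500 evaluations per run
```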
Two scenarios
[Figures: Environment 1 and Environment 2]
Average fitness
Agents must survive and see as many AOIs as possible
Not much difference in fitness between the two scenarios
Evolved parameters
Evolved agents in action
Summary and conclusion
Discussed the action selection problem in artificial intelligence and showed an evolutionary approach to solving it
Tested the approach on simple problem scenarios
- The method combines the actions of the goals
- Performed well on both scenarios
New behaviors (goals) can easily be added to the system
The parameters evolved are specific to the environment they were learned in
Future work
Social interactions between agents
- Allow agents to have more than one set of goal weights: depending on the agent's state (hungry, low on fuel, in danger, etc.), use a different set of goal weights (see the sketch after this list)
- Allow communication of data between agents: new immediate goal functions needed
Other ways to combine vectors from immediate goal functions
- Non-linear combination of vectors
- Genetic programming (currently being worked on at George Mason University)
Better test scenarios
- Evolve parameters that generalize well to unseen environments
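As one concrete reading of the state-dependent weights idea above, a minimal hypothetical sketch: the agent keeps one evolved weight set per state and switches between them. All state names and weight values here are invented for illustration; this is future work, not the current system.

```python
# Hypothetical sketch of state-dependent goal weights: the agent keeps
# one evolved weight set per state and switches based on its situation.
GOALS = ["avoid_agent", "avoid_obstacle", "momentum", "goto_aoi", "follow_obstacle"]

WEIGHT_SETS = {
    # Exploring normally: favor reaching AOIs.
    "normal":    {"avoid_agent": 0.4, "avoid_obstacle": 0.6, "momentum": 0.3,
                  "goto_aoi": 0.9, "follow_obstacle": 0.2},
    # In danger: prioritize avoidance over everything else.
    "in_danger": {"avoid_agent": 1.0, "avoid_obstacle": 1.0, "momentum": 0.1,
                  "goto_aoi": 0.1, "follow_obstacle": 0.0},
}

def current_state(agent):
    # Invented state test; a real agent might also check fuel, hunger, etc.
    return "in_danger" if agent["threat_nearby"] else "normal"

def goal_weights(agent):
    return WEIGHT_SETS[current_state(agent)]

print(goal_weights({"threat_nearby": True})["avoid_agent"])  # 1.0
```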
Related work
Action selection
M. Humphrys. Action selection in a hypothetical house robot: Using those RL numbers. In Proceedings of the First International ICSC Symposia on Intelligent Industrial Automation (IIA-96) and Soft Computing (SOCO-96), 1996.
M. Humphrys. Action selection methods using reinforcement learning. In From Animals to Animats 4:
Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior, Cambridge, MA,
pages 135-144. MIT Press, Bradford Books, 1996.
Robot control
R. C. Arkin. Motor schema based navigation for a mobile robot. In Proceedings of the IEEE International
Conference on Robotics and Automation (ICRA), Raleigh, NC, pages 264-271, May 1987.
O. Buffet, A. Dutech, and F. Charpillet. Automatic generation of an agent's basic behaviors. In Proceedings of
the 2nd International Joint Conference on Autonomous Agents and MultiAgent Systems (AAMAS'03), 2003.
J. Casper, M. Micire, and R. R. Murphy. Issues in intelligent robots for search and rescue. In Proceedings SPIE
Volume 4024, Unmanned Ground Vehicle Technology II, pages 292-302, July 2000.
S. Koenig and M. Likhachev. Improved fast replanning for robot navigation in unknown terrain. In Proceedings
of the IEEE International Conference on Robotics and Automation (ICRA), pages 968-975, 2002.
J. Rosenblatt. DAMN: A distributed architecture for mobile navigation. In Proceedings of the 1995 AAAI Spring
Symposium on Lessons Learned from Implemented Software Architectures for Physical Agents. AAAI Press,
March 1995.
S. P. Singh, T. Jaakkola, and M. I. Jordan. Learning without state-estimation in partially observable Markovian
decision processes. In International Conference on Machine Learning, pages 284-292, 1994.