AMAM Conference 2005

Adaptive Motion in Animals and Machines
Outline of the talk



Short AMAM conference overview
Introduction to Embodied Artificial
Intelligence (keynotes, R. Pfeifer)
More detailed look at:


Sensory Motor Coordination
Value-Systems
AMAM: Conference Overview

Motivation of studying Biology

Source of inspiration for robotics



Model features of rather simple animals
(insects…)
Robots and animals have to solve the same
physical problems
Robots are useful tools for computational
neuroscience

Testing Neural Models within a complete
sensing-acting loop
Biorobotics

Bio-inspired technologies







New sensors: Whiskers and Antennas
Muscle-Like (flexible) actuators
Flexible robotic arms and hands
Biped and humanoid robots
Numerical Models of animal and human
locomotion
Central Pattern Generator based and other
control methods
Some robots for illustration:
AMAM: robots

Scorpion
[Kirchner05]


8 legged robot
BigDog [Buehler,
Boston Dynamics]
AMAM: Robots

Fish Robot


Iida
Stumpy

„Special“ robot to
investigate cheap
design locomotion
(Iida)
AMAM Conference: Robots

ZAR 4 [boblan05]


Bionic robot arm driven
by artificial muscles
And many more:

Insects :




Cockroaches [ritzmann05]
Worm [menciassi05]
Amoebic Robots
[ishiguro05]
Bisam Rat [albiez05]
Embodied Artificial Intelligence

Not interested in the control aspects of robots alone, but rather in
designing entire systems








Morphology, Materials + Control
Synthetic Methodology: Understanding intelligent behavior by building
Concentrate on complete autonomous robots


[Pfeifer99, Iida03]
Self-Sufficient: Sustains itself over an extended period of time
Situatedness: acquires all information about the environment from its
own sensory system
„Lives“ in a specified ecological niche: no need for universal robots
Embodiment: real physical agents
Adaptivity
„Why do plants have no brain? They do not move.“ [Brooks]
Often aspects of only simple animals are modeled by robots
(locomotion of insects…)


It took evolution 3 billion years to evolve insects/legged locomotion, but
only 500 million more years to develop humans
=> locomotion must be a hard problem
Embodied AI: Principles

Emergence:






Emergent Behaviours: „emerge“ by the
interaction of the robot with the environment
Not preprogrammed
Agent is the result of its history
Exploit the dynamics of the system
More adaptive : developmental mechanisms
Diversity Compliance:


Exploiting ecological niche / behavioral diversity
Exploration/Exploitation trade off
Embodied AI: Principles

Parallel, loosely coupled processes


Intelligence emerges from a large number of parallel processes
Processes are connected to the agent's sensory-motor apparatus




Coupling through embodiment or coordination
No functional decomposition/hierarchical control as in traditional
robotics
Subsumption architecture [brooks86]
Sensory-Motor Coordination


Structuring sensory input
Generation of good sensory-motor patterns:




Correlated
Stationarity
Can simplify learning
Dimensionality Reduction of sensory-motor space [lungarella05,
broekhorst03]
Embodied AI: Principles

„Morphological“ Computation

Parts of the control can be „computed“ by the morphology



Springs and flexible material
Exploit system dynamics for control





Facets in flies, motion parallax
E.g. Exploit gravity and flexible actuators
Can simplify control considerably
Increase learning speed by morphology
„Extreme“ Example: Passive dynamic walker
Cheap Design:


Exploit physics and constraints of ecological niche
Use the most simple architecture for a given task
Embodied AI: Principles

Redundancy:

Overlap of functionality in the subsystems



Required for diversity and adaptivity
Ecological Balance:



Sensory system, Motor system
Complexity of the sensory, motor and neural system has to
match for a given task
Balance between morphology, materials and control [Ishiguro03]
Value Principle




Motivation of the robot to do something (should be more general
than RL)
Essential for every complete autonomous agent
No generally accepted solution exists
2 approaches will be discussed in more detail
Traditional Robotics / AI

In contrast, traditional robotics uses:


Limited numbers of degrees of freedom (e.g. wheels)
Stiff structure and joints (servo motors)




Limited natural dynamics
Centralized rule-based control



Easy to control
All Computation has to be done by the control system
Functional decomposition
„Sense-think-act“ cycle
Problems:


Frame problem
Symbol grounding problem
Sensory-Motor coordination (SMC)
[Pfeifer99, Lungarella05]

Used for categorization

Traditional approach: Sensory-input to
category mapping


Prototype or example matching
Difficulties: Often this mapping is not
learnable


Noise and Inaccuracies in Sensors
Ambiguous sensory input (Type 2 problems)
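The traditional prototype-matching approach and its weakness with ambiguous input can be sketched as follows. This is only an illustrative sketch: the two-sensor readings, the prototype values and the function name `nearest_prototype` are invented for this example and do not come from the cited experiments.

```python
import math

def nearest_prototype(reading, prototypes):
    """Return the label of the prototype closest (Euclidean) to the reading."""
    return min(prototypes, key=lambda label: math.dist(reading, prototypes[label]))

# Hypothetical two-sensor IR readings, invented for illustration only.
prototypes = {"wall": (0.8, 0.8), "cylinder": (0.8, 0.2)}

clear_reading = (0.9, 0.1)       # lies close to the cylinder prototype
ambiguous_reading = (0.8, 0.5)   # lies halfway between both prototypes

label_clear = nearest_prototype(clear_reading, prototypes)

# For the ambiguous reading both distances are (numerically almost) equal,
# so a static sensor-to-category mapping has no basis for a decision.
d_wall = math.dist(ambiguous_reading, prototypes["wall"])
d_cyl = math.dist(ambiguous_reading, prototypes["cylinder"])
```

For the ambiguous reading the chosen label depends on nothing but rounding, which is the situation where structuring the input through the agent's own movement becomes attractive.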
Categorization: Example [Nolfi97]

Learn 2 categories
(Wall, Cylinder) with IR
sensors

Data for:

180 orientations, 20 distances

Learn with neural
network

Just linear output units
4 or 8 hidden
neurons
Very bad results: 35 %
correct categorization
Black dots: correct categorization
SMC: Categorization




Approach the problem through interacting
with the environment
Object related actions to structure the input
Simplifies the problem of categorization
No real internal category representation



Just different behaviors for different categories
Empirical studies about Dimensionality
Reduction [lungarella05]
Example in infants: Look at object from
different directions in the same distance
SMC: Example

Learning optimal categorization strategy
through a genetic algorithm

Nolfi's experiment:


Fitness: Time the robot is near the cylinder
Evolved Behavior:

Robot never stops in front of target:

Moves back and forth and to the left and right-hand side
SMC: Example

Learning to distinguish circles and
diamonds [Beer96]



Catching circles, avoiding diamonds
Agent can only move horizontally
Again evolved controller
SMC: Example

Results:

Not merely centering and
static pattern matching




Dynamic strategy, with active
scanning
Both policies evolve sensory-motor coordination strategies
The examples illustrate the
idea of sensory-motor
coordination quite well
Other examples:


Catching Circle
Darwin II [Reeke89]
Garbage Collector [Pfeifer97,
Schleier96]
Avoiding Diamond
SMC: Conclusion

Nice new ideas for categorization tasks and
robotics in general

Simple examples that illustrate the use of SMC for
categorization


Examples are „well-suited“ for SMC
No complex categorization problem (e.g for visual object
recognition) found in the literature

Only numerical results that demonstrate dimensionality reduction


How to use them?
Criticism: Humans are also able to categorize
very well without sensory-motor interaction

The authors somewhat overstate the importance of SMC
Value Systems & Developmental
Learning [oudeyer04/05, steels03]

Intrinsic Motivation of the Agent:



Learn more about the environment
Ideal case: open-ended learning
Many different behaviors may emerge
Very adaptive

Two approaches to this problem are discussed in more detail:

Intelligent Adaptive Curiosity (IAC) [oudeyer04]
Autotelic Principle [steels03]
Still at an early stage, only toy examples so far
Other approaches coming from RL:


Intrinsically motivated RL [singh04]
Self Motivated Development [schmidhuber05]
IAC: Motivation

Push agent towards situations in which it
maximizes learning progress


Balance between the „unknown“ and the
„predictable“
Goal: Improve prediction machine
$P(A(t), SM(t)) \rightarrow S(t+1)$



A(t) … action
SM(t)… sensory-motor context
S(t+1)… prediction
IAC: framework

Prediction error
$E(t) = \| S(t+1) - S_a(t+1) \|$


=> Decrease E(t)
First naive approach

Learning Progress
$LP(t) = -(E_m(t) - E_m(t - \mathrm{DELAY}))$



Em(t)… mean Error at time t
Do not reward high error values, reward high LP
Meta Learning Machine (predicts error)
$MP(A(t), SM(t)) \rightarrow E_p(t+1)$


Choose action which maximizes Learning Progress
Problem ?
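The learning-progress signal of the naive approach can be sketched in a few lines. This is a minimal sketch, not the implementation from [oudeyer04]: the smoothing `window` used for the mean error, the function names and the error traces are assumptions made for illustration, and `t - delay` is assumed to be a valid index.

```python
def mean_error(errors, t, window):
    """Mean of the last `window` errors up to and including time t."""
    start = max(0, t - window + 1)
    recent = errors[start:t + 1]
    return sum(recent) / len(recent)

def learning_progress(errors, t, delay, window):
    """LP(t) = -(Em(t) - Em(t - DELAY)); positive when the mean error drops."""
    return -(mean_error(errors, t, window) - mean_error(errors, t - delay, window))

# Hypothetical error traces, invented for illustration.
improving = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]   # predictor is getting better
constant_high = [1.0] * 8                              # high error, nothing is learned

lp_improving = learning_progress(improving, t=7, delay=4, window=2)
lp_constant = learning_progress(constant_high, t=7, delay=4, window=2)
```

A constantly high error yields zero learning progress, so it is not rewarded; only a falling mean error is.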
IAC:

Problem of naive approach:


Transition from complex, not predictable
situations to simple situations is
considered as learning progress
Solution:

Instead of computing the LP over situations that are
successive in time, compute it over situations that are
close in state space
IAC: algorithm

Prediction machine P


Consists of a set of local experts.
Each expert consists of training examples



Simple NN algorithm is used for prediction
Build kd-tree incrementally : experts in the
leaves
Each expert stores prediction errors and the
mean

Calculate local learning progress



LPi(t) = -(Empi(t) – Empi(t – DELAY))
Used for action selection
Very simple algorithms used

More sophisticated algorithms have a good chance to
improve performance
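The idea of local experts with their own learning progress can be sketched as follows. This is a simplified sketch under stated assumptions: a plain dictionary of named experts stands in for the leaves of the incrementally built kd-tree, contexts and outcomes are single numbers, and the class name `Expert`, the smoothing `window` and the data are invented for illustration.

```python
class Expert:
    """One local expert: stored examples, 1-nearest-neighbour prediction, error history."""

    def __init__(self):
        self.examples = []   # (context, outcome) pairs seen so far
        self.errors = []     # prediction errors, in order of arrival

    def predict(self, context):
        if not self.examples:
            return 0.0  # arbitrary default before any data exists (assumption)
        nearest = min(self.examples, key=lambda ex: abs(ex[0] - context))
        return nearest[1]

    def update(self, context, outcome):
        # Record the error of the current prediction, then store the example.
        self.errors.append(abs(outcome - self.predict(context)))
        self.examples.append((context, outcome))

    def learning_progress(self, delay, window):
        """Local LP: minus the change of this expert's mean error over `delay` steps."""
        if len(self.errors) < delay + window:
            return 0.0
        now = self.errors[-window:]
        before = self.errors[-window - delay:-delay]
        return -(sum(now) / window - sum(before) / window)

# Two hypothetical regions of the sensory-motor space.
contexts = [0.0, 1.0, 0.5, 0.25, 0.75, 0.125, 0.375, 0.625, 0.875]
experts = {"learnable": Expert(), "already_mastered": Expert()}
for c in contexts:
    experts["learnable"].update(c, c)            # outcome depends on the context
    experts["already_mastered"].update(c, 1.0)   # outcome is always the same

progress = {name: e.learning_progress(delay=4, window=2) for name, e in experts.items()}
chosen = max(progress, key=progress.get)   # greedy action selection on local LP
```

Because each expert compares only its own past and present errors, moving from a hard region to an easy one no longer counts as learning progress.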
IAC: experiments

Toy example:


2 wheeled robot, can produce sound
Toy: position depends on sound
frequency interval




f1 : moves randomly
f2 : stops moving
f3 : toy jumps to robot
Predictor: predict relative position of the
toy
IAC: experiments

Results:



Basically 3 experts
First explores interval f3, then interval f2
f1 is not explored : not predictable
IAC: experiments

Playground experiment


AIBO robot on a baby play mat
Various toys: can be bitten, bashed or
simply detected
IAC: Playground Experiment

Motor Control:




Turning head (2 DoF, pan + tilt)
Bashing (2 DoF, strength + angle)
Crouch + Bite (1 DoF, crouches given distance in direction it is
looking at)
Perception:

3 High level sensors (just binary values)





Visual object detection
Biting Sensor
Infra-red distance sensor
Bashing + Biting only produce visible results if applied in front of
an appropriate object
Agent knows nothing about sensorimotor affordances
IAC: Results

Different stages evolve



Stage 1: random exploration +
body babbling
Stage 2: Most of the time looking
around (no biting + bashing)
Stage 3: biting and bashing

Sometimes produces something,
robot still not oriented to objects

Stage 4: Starts to look at objects

Learns precise location of the
object
Stage 5: Tries to bite biteable
objects and to bash bashable
objects
The Autotelic principle [steels03]

Autotelic activities: no real reward



Motivational driving signal comes from the
individual itself
Balance between high challenge and
required skill



Climbing, painting…
too high: withdrawal
too low: boredom
Operational description given in [steels03],
no real experiments found
Autotelic Principle: Operational Description

Agent:

Organised in a number of sub-agencies
(components)


Establish input/output mapping based on knowledge
Each component must be parameterized to self adjust
challenge levels




Precision of movement, weights of objects…
Parameter vector pi for each component
Goal: not to reach a stable state, keep exploring
parameter landscape
Each component has also an associated skill vector
Autotelic Principle: Operational Description

Self Regulation:


Operation phase: Clamp challenge
parameters, acquire skills through learning
Shake-Up phase:


Increase challenge: skill level already too
high
Decrease challenge: performance could not
be reached
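The self-regulation loop can be illustrated with a small sketch. It is only one possible reading of the operational description in [steels03]: the thresholds `low` and `high`, the integer challenge level, the `step` size and the performance values are assumptions chosen for illustration.

```python
def shake_up(challenge, performance, low=0.3, high=0.8, step=1):
    """Return the new challenge level after one shake-up phase.

    `low`, `high` and `step` are illustrative assumptions, not values from the source.
    """
    if performance >= high:      # skill level already too high for this challenge
        return challenge + step
    if performance <= low:       # required performance could not be reached
        return max(0, challenge - step)
    return challenge             # challenge and skill are balanced

level = 2
history = []
# Hypothetical performance measured at the end of four operation phases.
for performance in [0.9, 0.85, 0.5, 0.1]:
    level = shake_up(level, performance)
    history.append(level)
# history is [3, 4, 4, 3]
```

The challenge rises while the agent performs well, stays put when challenge and skill match, and drops again when the agent is overwhelmed.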
Conclusion: Value Systems

Both approaches try to create an open-ended learner


Interesting ideas
Only very simple algorithms used, or not even
implemented


Can help to structure learning progress in complex
environments


Open for improvement
Complete autonomous agents will need some sort of
developmental value system
No complex real-world experiments found

Scalable?
Conclusion: Embodied Intelligence




Provides new ways of thinking about robotics /
intelligence in general
Provides a better understanding of intelligent
behavior by modelling the behavior.
Good principles to design an agent
Claims to solve many problems of traditional AI



Good and promising ideas
Somehow the algorithmic solutions for more complex
systems are missing
Actually: same problems as for traditional AI


Works for small problems
Hard to scale up
The End

Thank you!
Literature










[pfeifer99] R. Pfeifer and C. Schleier, Understanding Intelligence, MIT Press
[iida03] F. Iida and R. Pfeifer, Embodied Artificial Intelligence
[kirchner05] D. Spenneberg, F. Kirchner, Embodied Categorization of spatial environments on the Basis of Proprioceptive Data, AMAM 2005
[ritzmann05] R. Ritzmann, R. Quinn, Convergent Evolution and locomotion through complex terrain by insects, vertebrates and robots, AMAM 2005
[menciassi05] A. Menciassi, S. Spina, Bioinspired robotic worms for locomotion in unstructured environments, AMAM 2005
[ishiguro05] A. Ishiguro, M. Shimizu, Slimebot: A Modular robot that exhibits amoebic locomotion, AMAM 2005
[albiez05] J. Albiez, T. Hinkel, Reactive Foot-control for quadruped walking, AMAM 2005
[boblan05] I. Boblan, R. Bannasch, A Humanlike Robot Arm and Hand with fluidic muscles: The human muscle and the control of technical realization, AMAM 2005
[lungarella05] M. Lungarella, O. Sporns, Information Self-Structuring: Key Principle for Learning and Development
[broekhorst03] R. Broekhorst, M. Lungarella, Dimensionality Reduction through sensory-motor coordination
[ishiguro03] A. Ishiguro, T. Kawakatsu, How should control and body systems be coupled? A robotic case study, Embodied Artificial Intelligence 2003
[nolfi97] S. Nolfi, Evolving non-trivial behavior on autonomous robots: Adaptation is more powerful than decomposition and integration
[beer96] R. Beer, Toward the Evolution of Dynamical Neural Networks for Minimally Cognitive Behavior
[reeke89] G. Reeke, O. Sporns, Synthetic neural modeling: A multilevel approach to analysis of brain complexity
[pfeifer97] R. Pfeifer, C. Schleier, Sensory-motor coordination: The metaphor and beyond: Practice and future of autonomous robots
[schleier96] C. Schleier, D. Lambrinos, Categorization in a real world agent using haptic exploration and active perception
[oudeyer04] P. Oudeyer, F. Kaplan, Intelligent Adaptive Curiosity: a source of Self-Development
[oudeyer05] P. Oudeyer, F. Kaplan, The Playground Experiment: Task independent development of a curious robot
[steels03] L. Steels, The Autotelic Principle
[singh04] S. Singh, A. Barto, Intrinsically Motivated Learning of Hierarchical Collections of Skills
[schmidhuber05] J. Schmidhuber, Self-Motivated Development Through Rewards for Predictor Errors/Improvements
Measure influence of SMC
[lungarella05, broekhorst03]



New experiments with SMC
Measure the effect of SMC with information processing
quantities
Experiments of Broekhorst:

Robot:

Wheeled
CCD camera (compressed to 10 x 10 pixels)
IR sensors (12)
Measure angular velocity

5 different Experiments:

Control setup: Move forward
Moving object
Wiggling: Move forward in oscillatory movement
Tracking 1: Move forward + track object
Tracking 2: Move forward + track moving object
Preprogrammed control
Measure Influence of SMC
[broekhorst03]

Quantify dimension of the sensory
information


Measure Correlation on most significant
principal components from the different
modalities (R*)
3 different information quantities:

Shannon entropy: $H = -\sum_{i=1}^{N} p(i) \log p(i)$

$\lambda_i$ … eigenvalue of R*
Dominance of the highest eigenvector
Number of PCs that explain 95% of variance
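The three quantities can be computed from an eigenvalue spectrum roughly as follows. This is a sketch under assumptions the slides do not state: p(i) is taken to be the i-th eigenvalue divided by the sum of all eigenvalues, dominance is taken to be the share of the largest eigenvalue, and the two spectra are invented for illustration.

```python
import math

def spectrum_measures(eigenvalues, variance_target=0.95):
    """Return (entropy, dominance, number of PCs) for a list of eigenvalues."""
    total = sum(eigenvalues)
    # Assumption: p(i) is the normalised eigenvalue, sorted in descending order.
    p = sorted((ev / total for ev in eigenvalues), reverse=True)
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0)
    dominance = p[0]   # share of variance carried by the largest eigenvalue
    cumulative, n_components = 0.0, 0
    for pi in p:
        cumulative += pi
        n_components += 1
        if cumulative >= variance_target:
            break
    return entropy, dominance, n_components

# Hypothetical spectra: uncorrelated input versus strongly correlated input.
flat = spectrum_measures([1.0, 1.0, 1.0, 1.0])
peaked = spectrum_measures([97.0, 1.0, 1.0, 1.0])
```

A flat spectrum gives maximal entropy and needs all components, while a spectrum dominated by one eigenvalue gives low entropy and needs a single component, which is the signature of reduced dimensionality.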

Results:

Difference:

Variance in the experiments



SMC experiments have higher variance
SMC experiments and non SMC experiments can be
distinguished
No further straightforward results
Measure Influence of SMC
[lungarella05]

Experimental Setup:


Active Vision: (compressed 55 x 75 pixels) looking at screen
2 behaviors:



Foveation: „follow red area“
Random: Same motion structure, not coordinated
2 scenarios


Artificial Scene: Random Data with moving red block
Natural Images
Measure Influence of SMC
[lungarella05]

Quantify sensory information




Entropy
Joint-Entropy
Mutual Information
Integration: Multivariate Mutual Information, $I(X) = \sum_i H(x_i) - H(X)$
Complexity: $C(X) = H(X) - \sum_i H(x_i \mid X - x_i)$
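Integration and complexity can be evaluated in closed form if the variables are assumed to be jointly Gaussian. The sketch below makes exactly that assumption, which the slides do not state, and uses H(x_i | X - x_i) = H(X) - H(X - x_i); the helper names and the covariance matrices are invented for illustration.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def gaussian_entropy(cov):
    """Differential entropy of a Gaussian with covariance matrix `cov`."""
    n = len(cov)
    return 0.5 * math.log((2 * math.pi * math.e) ** n * det(cov))

def submatrix(cov, keep):
    return [[cov[i][j] for j in keep] for i in keep]

def integration(cov):
    """I(X) = sum_i H(x_i) - H(X)."""
    singles = sum(gaussian_entropy([[cov[i][i]]]) for i in range(len(cov)))
    return singles - gaussian_entropy(cov)

def complexity(cov):
    """C(X) = H(X) - sum_i H(x_i | X - x_i)."""
    n = len(cov)
    h_all = gaussian_entropy(cov)
    conditional = 0.0
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        conditional += h_all - gaussian_entropy(submatrix(cov, rest))
    return h_all - conditional

# Hypothetical covariance matrices for two sensor channels.
independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.6], [0.6, 1.0]]
i_independent = integration(independent)
i_correlated = integration(correlated)
c_correlated = complexity(correlated)
```

For independent channels the integration is zero, and it grows with the correlation between channels. With only two variables complexity and integration coincide; they differ once more variables are involved.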

Quantify Dimensionality Reduction


PCA
Isomap ([tenenbaum01], also recognizes non-linear
dimensions)
Results for foveation behavior


Entropy in
central
regions
decreased
Mutual
information
increased
Results for foveation behavior

Integration and
Complexity were
much larger in the
center
Results for foveation behavior


Reduced
dimensionality
(isomap)
Mutual
information
between center
and motor
actions also
increased