Biologically-inspired robot spatial cognition
based on rat neurophysiological studies
Alejandra Barrera and Alfredo Weitzenfeld
Auton Robot 2008.
Rakesh Gosangi
PRISM lab
Department of Computer Science and Engineering
Texas A&M University
Outline
• Introduction
• Related work
• Biologically inspired spatial cognition
• Experimental results
• Conclusion and Discussion
Introduction
• SLAM – the problem of a mobile robot acquiring a map of its
environment while localizing itself in the map.
• Challenges in SLAM
– Data association – if two features observed at different times
correspond to the same object
– Perceptual ambiguity – distinguish between places that provide similar
or equivalent visual patterns
Spatial cognition in rats
• Data association or place recognition in rats is based on cognitive
maps generated in hippocampus
• Cognitive maps are created from visual and kinesthetic feedback
information
• Rats can learn and unlearn reward locations in goal-oriented
tasks
Contribution of the paper
• Neural-network-based spatial cognition model for a mobile robot,
inspired by the rat’s brain structure
– Build a holistic topological map of the environment
– Recognize places previously visited
– Learn and unlearn reward locations
– Perform goal-directed navigation
– Use kinesthetic and visual cues from the environment
Comparison with Milford (2006) - RatSlam
• The two models coincide in mapping and map adaptation but
differ in goal-directed navigation
• Milford et al. use a topological map of experiences where each
experience codifies location and orientation
• Transitions are associated with locomotion
• In this paper, the nodes correspond to visual information patterns
and path integration signals
• Transitions correspond to orientation and locomotion of the rat
Experimental basis
• Morris’ experiment (1981)
• Two types of rats
– Normal rats
– Rats with hippocampal lesions
• Two experimental situations
– Visible platform
– Submerged platform with visual cues around the arena
• Normal rats relate their position with respect to visual cues and
recognize target location
Image borrowed from - Morris, R. G. M. (1981). Spatial localization does not require the presence
of local cues. Learning and Motivation, 12, 239–260.
Experimental basis
• O’Keefe’s experiment (1983)
• A reversal task on a T-maze
• Rats with hippocampal lesions
– Learned to turn to right arm in a T-maze
– Gradually changed their orientation from the left arm to the right arm
in an 8-arm maze
– Their behavior was based on goal-location relative to body
• Normal rats
– Learned to turn to right arm in T-maze
– The shifting from left to right was not gradual in an 8-arm maze
– Their behavior was based on a spatial map constructed in hippocampus
Biologically inspired spatial cognition
• Biological background
• Affordances processing
• Rat’s motivation
• Path integration
• Landmark processing
• Place representation and recognition
• Learning
• Action Selection
Affordance processing
• Affordances are coded as a linear array of cells called affordance
perceptual schema
• An affordance corresponds to a 45° turn relative to the rat’s head
• Each affordance is represented as a Gaussian distribution; the
activation of neuron i is given by
$AF_i = h \, e^{-\frac{(i-a)^2}{2d^2}}$
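A minimal sketch of the Gaussian affordance coding described above, assuming the activation has the form AF_i = h·exp(−(i − a)² / (2d²)). The array length `n_cells` and the values chosen for `h` (peak height), `a` (peak position) and `d` (width) are illustrative assumptions, not values taken from the slides.

```python
import math

def affordance_activation(a, n_cells=80, h=1.0, d=3.0):
    """Gaussian activation over a linear array of cells, peaked at cell a.

    n_cells, h and d are hypothetical defaults chosen for illustration.
    """
    return [h * math.exp(-((i - a) ** 2) / (2 * d ** 2)) for i in range(n_cells)]

# One affordance centred on cell 40 of the array.
aps = affordance_activation(a=40)
peak = max(range(len(aps)), key=aps.__getitem__)
print(peak, round(aps[peak], 3))
```

Several available affordances could be represented by summing one such Gaussian per affordance over the same array.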
Motivation
• The rat’s motivation is related to its hunger drive
$D(t+1) = D(t) + \alpha_d \, |d_{max} - D(t)| - a \, D(t) + b \, |d_{max} - D(t)|$
• The rat obtains a reward r(t) by the presence of food
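A small sketch of the hunger-drive update, assuming it has the form D(t+1) = D(t) + α_d·|d_max − D(t)| − a·D(t) + b·|d_max − D(t)|. The interpretation of `a` as a reduction applied when food is eaten and `b` as an incentive term, and all numeric values, are assumptions made for illustration.

```python
def update_drive(D, d_max, alpha_d, a, b):
    """One step of the drive update under the assumed form of the equation."""
    gap = abs(d_max - D)
    return D + alpha_d * gap - a * D + b * gap

# Hypothetical values: the drive grows toward d_max while no food is found.
D = 0.5
for _ in range(3):
    D = update_drive(D, d_max=1.0, alpha_d=0.1, a=0.0, b=0.0)
print(round(D, 4))
```

With `a = 0` and `b = 0` the drive approaches `d_max` geometrically; a non-zero `a` pulls it back down.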
Path Integration
• Process of updating the position of the point of departure each time
the animal performs a motion
• Path integration helps an animal return home
• Path integration uses kinesthetic information
– Magnitude of rotation
– Magnitude of translation
• Path integration module is composed of two neural network layers
– Dynamic Remapping Layer (DRL)
– Path Integration Feature Detector Layer (PIFDL)
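The idea of path integration from rotation and translation magnitudes can be sketched without any neural layers. This is plain dead reckoning, not the DRL/PIFDL network of the paper; the function name and the `(rotation_deg, translation)` input format are assumptions.

```python
import math

def integrate_path(moves):
    """Track where the point of departure lies relative to the animal.

    moves: list of (rotation_deg, translation) pairs (hypothetical format).
    Returns the (x, y) offset of the departure point in world-aligned axes.
    """
    heading = 0.0
    x = y = 0.0
    for rotation, distance in moves:
        heading = (heading + rotation) % 360
        # The departure point shifts by the same magnitude, opposite direction.
        x -= distance * math.cos(math.radians(heading))
        y -= distance * math.sin(math.radians(heading))
    return x, y

print(integrate_path([(0, 1.0), (90, 1.0)]))
```

The returned vector points from the animal back to its starting location, which is what a homing behavior would follow.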
Dynamic Remapping Layer
• 2-D array of neurons
• The activation of a neuron (i, j) is computed as
– (x, y) codify the anchor relative to initial coordinates in the plane
• The anchor position displaces each time the rat moves by the same
magnitude but in the opposite direction
• The anchor position is updated by applying convolution between DR
layer and a mask M
• The DR Layer is updated according to C by centering the Gaussian at
(r, c) – maximum value of C
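The remapping steps above (convolve with a mask, find the maximum of C, re-centre the Gaussian there) can be sketched as follows. The activation formula and the mask M are not given on the slide, so the isotropic Gaussian, the 3×3 one-hot shift mask, and the grid size are all assumptions.

```python
import math

def gaussian_layer(rows, cols, r0, c0, sigma=1.0):
    """2-D array of activations with a Gaussian bump centred at (r0, c0)."""
    return [[math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2 * sigma ** 2))
             for c in range(cols)] for r in range(rows)]

def shift_mask(dr, dc, size=3):
    """Hypothetical mask: a single 1 offset from the centre by (dr, dc)."""
    m = [[0.0] * size for _ in range(size)]
    m[size // 2 + dr][size // 2 + dc] = 1.0
    return m

def convolve(layer, mask):
    """Zero-padded 2-D convolution of the layer with the mask."""
    rows, cols, k = len(layer), len(layer[0]), len(mask) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for i in range(-k, k + 1):
                for j in range(-k, k + 1):
                    rr, cc = r - i, c - j
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[r][c] += mask[i + k][j + k] * layer[rr][cc]
    return out

def remap(layer, move_dr, move_dc, sigma=1.0):
    """Shift the anchor opposite to a one-cell move and re-centre the bump."""
    C = convolve(layer, shift_mask(-move_dr, -move_dc))
    cells = [(r, c) for r in range(len(C)) for c in range(len(C[0]))]
    r, c = max(cells, key=lambda rc: C[rc[0]][rc[1]])
    return gaussian_layer(len(C), len(C[0]), r, c, sigma), (r, c)

layer = gaussian_layer(11, 11, 5, 5)
layer, anchor = remap(layer, move_dr=1, move_dc=0)
print(anchor)
```

Re-centring a fresh Gaussian at the maximum of C keeps the bump from being eroded at the borders by repeated convolution.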
Path Integration Feature Detector Layer
• PIFDL is also a 2-Dimensional array of neurons
• Every neuron in the DRL is randomly connected to 50% of the neurons in
the PIFDL
• The weights between the two layers are learned through Hebbian
learning
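A minimal sketch of random 50% connectivity combined with a Hebbian weight update. The plain rule Δw = η · pre · post (without normalization), the learning rate, and the layer sizes are assumptions; the slides do not state the exact rule used.

```python
import random

def random_connectivity(n_pre, n_post, fraction=0.5, seed=0):
    """For each presynaptic neuron, pick a random subset of postsynaptic targets."""
    rng = random.Random(seed)
    k = int(n_post * fraction)
    return [set(rng.sample(range(n_post), k)) for _ in range(n_pre)]

def hebbian_step(weights, conn, pre, post, lr=0.1):
    """Strengthen existing connections in proportion to pre * post activity."""
    for i, targets in enumerate(conn):
        for j in targets:
            weights[i][j] += lr * pre[i] * post[j]
    return weights

n_pre, n_post = 4, 10
conn = random_connectivity(n_pre, n_post)
weights = [[0.0] * n_post for _ in range(n_pre)]
weights = hebbian_step(weights, conn, pre=[1, 0, 0, 0], post=[1] * n_post)
print(sum(weights[0]), sum(weights[1]))
```

Only connections that exist in `conn` are ever updated, so half of each weight row stays at zero.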
Landmark Processing
• The distance and orientation of each landmark are represented as a linear
array of cells (LPS)
• Each LPS is connected to a 2-Dimensional array of neurons called
Landmark Feature Detector Layer (LFDL)
• The connecting weights are learned through Hebbian learning
• All the LFDLs are combined into a single Landmark Layer (LL)
• Visual information pattern is stored in an array called LP
Place representation and recognition
• Place Cell Layer (PCL) is a 2-Dimensional layer of neurons
• Every neuron in PIFDL is randomly connected to 50% of neurons in
the PCL
• Every neuron in the LL (Landmark Layer) is connected to 50% of
neurons in the PCL
• The synaptic efficacy between the two layers is learned through
Hebbian learning
• PC encodes kinesthetic and visual information sensed by the rat at a
given location and a given orientation
World Graph Layer
• The nodes in the map represent different places
• Arcs between the nodes represent
– The direction of the rat’s head
– Number of steps taken by the rat to move from one node to the other
• Every node can be connected to eight actor units, one for each
direction
• Place recognition
– SD is the similarity degree, N is the number of cells
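The map structure described above can be sketched as a small graph type: nodes are places, and each arc stores a head direction and a step count. The class and field names are hypothetical, the eight directions are assumed to be multiples of 45°, and the similarity-degree computation is left out because its formula is not given here.

```python
from dataclasses import dataclass, field

DIRECTIONS = tuple(range(0, 360, 45))  # eight headings, assumed 45 degrees apart

@dataclass
class Node:
    place_id: int
    # direction (degrees) -> (target place id, number of steps)
    arcs: dict = field(default_factory=dict)

class WorldGraph:
    def __init__(self):
        self.nodes = {}

    def add_node(self, place_id):
        """Return the node for place_id, creating it if needed."""
        return self.nodes.setdefault(place_id, Node(place_id))

    def connect(self, src, dst, direction, steps):
        """Add an arc from src to dst labelled with heading and step count."""
        if direction not in DIRECTIONS:
            raise ValueError("direction must be a multiple of 45 degrees")
        self.add_node(src).arcs[direction] = (dst, steps)
        self.add_node(dst)

graph = WorldGraph()
graph.connect(0, 1, direction=90, steps=3)
print(graph.nodes[0].arcs)
```

Keying the arcs by direction caps each node at eight outgoing arcs, mirroring the one-unit-per-direction layout.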
Learning
• Learn and unlearn reward locations by reinforcement learning
through Actor-Critic Architecture
• Adaptive Critic (AC) unit contains a Predictive Unit (PU) which
estimates future rewards for every place
– Every neuron in the PCL (Place Cell Layer) is connected to the PU and every
connection has
• A weight w
• Eligibility trace e
– P(t) is expected reward at time t
– r’(t) is effective reinforcement signal
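A sketch of the critic with weights and eligibility traces, written as a textbook temporal-difference update. It assumes P(t) = Σ w_i · PC_i(t) and r'(t) = r(t) + γ·P(t) − P(t−1); these forms, the trace decay, and all parameter values are standard choices assumed here, not equations taken from the slides.

```python
def critic_step(w, e, pc_prev, pc_now, r, gamma=0.9, lr=0.1, decay=0.8):
    """One TD update of the predictive unit.

    w, e: weights and eligibility traces, one per place cell.
    pc_prev, pc_now: place-cell activity at the previous and current step.
    """
    P_prev = sum(wi * x for wi, x in zip(w, pc_prev))
    P_now = sum(wi * x for wi, x in zip(w, pc_now))
    r_eff = r + gamma * P_now - P_prev          # effective reinforcement
    e = [decay * ei + x for ei, x in zip(e, pc_prev)]
    w = [wi + lr * r_eff * ei for wi, ei in zip(w, e)]
    return w, e, r_eff

# Two place cells: the animal moves from place 0 to place 1 and finds food.
w, e, r_eff = critic_step([0.0, 0.0], [0.0, 0.0], [1, 0], [0, 1], r=1.0)
print(w, r_eff)
```

When a reward stops appearing, `r_eff` turns negative and the same rule lowers the weights, which is one way to obtain unlearning.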
Action Selection
• Action selection is based on four signals
– Available affordances at time t (AF)
– Random rotations between available affordances (RPS)
– Unexplored rotations from current location (CPS)
– Global Expectation of Maximum Reward (EMR)
• Representation
– Each affordance in AF is represented as a Gaussian
– RPS is one Gaussian centered at a random array position
– CPS captures the animal’s curiosity.
• As many Gaussians as unexecuted rotations at that location
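One way to combine the four signals is to add them cell by cell over a shared array and pick the cell with the highest total. The plain unweighted sum, the array length, and the Gaussian width are assumptions; the slides do not say how the signals are combined or how EMR is represented.

```python
import math

def gaussian(center, n, h=1.0, d=1.0):
    """Gaussian bump over a linear array of n cells."""
    return [h * math.exp(-((i - center) ** 2) / (2 * d ** 2)) for i in range(n)]

def add(*signals):
    """Cell-wise sum of equally long arrays."""
    return [sum(vals) for vals in zip(*signals)]

def select_action(af, rps, cps, emr):
    """Return the index of the cell with the largest summed activation."""
    total = add(af, rps, cps, emr)
    return max(range(len(total)), key=total.__getitem__)

n = 9
af = add(gaussian(2, n), gaussian(6, n))   # two available rotations
rps = gaussian(6, n)                       # random preference
cps = [0.0] * n                            # nothing left unexplored
emr = [0.0] * n                            # no reward expectation yet
print(select_action(af, rps, cps, emr))
```

With no reward expectation the random signal breaks the tie between affordances; once EMR grows at a cell it dominates the sum.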
Experiments
• Hardware
– Sony AIBO ERS-210 four-legged robot
– 1.8 GHz P4 processor
– A local camera with 50° horizontal view and 40° vertical view
• At a given time step the robot takes three non-overlapping snapshots
(0°, +90°, -90°)
• Visual processing analyzes the number of colored pixels in the
images
• Kinesthetic information is obtained from the external motor control;
there is no odometer
• Four experimental conditions
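The colored-pixel analysis mentioned above can be sketched as a simple per-color count. The image format (nested lists of RGB tuples), the tolerance-based color match, and the function name are assumptions; the slides do not describe the actual vision pipeline.

```python
def count_colored_pixels(image, target, tol=40):
    """Count pixels whose RGB channels are all within tol of the target color."""
    return sum(
        1
        for row in image
        for px in row
        if all(abs(p - t) <= tol for p, t in zip(px, target))
    )

# Tiny hypothetical snapshot: two reddish pixels, one blue, one green.
snapshot = [
    [(255, 0, 0), (250, 10, 5)],
    [(0, 0, 255), (0, 255, 0)],
]
print(count_colored_pixels(snapshot, target=(255, 0, 0)))
```

Running the count once per landmark color on each of the three snapshots would give a coarse measure of how much of each landmark is in view.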
Experiment 1 – T-maze
• Departure point is the base of the maze
• During training phase the goal is set at the end of the left arm
• During the testing phase the goal is shifted to the right arm
• Results
– The robot takes 16 trials to completely unlearn the previously correct
hypothesis
– When the expectation of reward exceeds noise the robot starts visiting
the right arm
– In O’Keefe’s experiments (1983), the rats chose the right arm 90% of the
time by the 24th trial
Experiment 2 – 8-arm radial maze
• The goal is set at -90° arm during training phase
• During the testing phase the goal is set at +90° arm
• Results
– When the expectation of reward for -90° arm is smaller than noise the
robot visits other arms randomly
– By the 12th trial the robot starts choosing the +90° arm
– In O’Keefe’s experiments (1983), the rats chose the correct arm by the 20th
trial
Experiment 3 – Multiple T-maze
• The robot departs at the base of vertical T-maze
• During training phase the goal is placed at right arm (90°) of the left
horizontal T-maze
• During testing phase the goal is placed at right arm (270°) of the
right horizontal T-maze
• Results
– If the robot reaches the goal at the end of a path then it is positively
reinforced
– If a path does not lead the robot to a goal it is negatively reinforced thus
unlearning the path
– The robot completely unlearns the previous goal by the 20th trial
Experiment 4 – Maze with landmarks
• Three colored cylinders were placed outside the maze as landmarks
• During testing the robot was placed at different starting locations
• Results
– The robots use place recognition to find goals
– All the robots found the goal successfully from all starting positions
Discussion and conclusions
• The proposed model captures some behavioral aspects of rats
• Abilities
– Build a holistic topological map in real time
– Learn and unlearn goal locations
– Exploit the cognitive map to recognize visited places
• Very simplistic perceptual system
– The current model cannot deal with real environments
• Affordance space and landmark space are discrete
– Computationally expensive to process continuous spaces
Questions / Comments