6.Lecture-664 - iLab! - University of Southern California

Michael Arbib: CS664 – Neural Models for Visually guided behaviour
University of Southern California, Fall 2001
Lecture 6.
The Mirror Neuron System Model (MNS) 1
Michael Arbib CS564 - Brain Theory and Artificial Intelligence, USC, Fall 2001. Lecture 10. MNS Model 1
Visual Control of Grasping in Macaque Monkey
A key theme of visuomotor coordination: parietal affordances (AIP) drive frontal motor schemas (F5).
F5 - grasp commands in premotor cortex (Giacomo Rizzolatti)
AIP - grasp affordances in parietal cortex (Hideo Sakata)
F5 Motor Neurons
F5 Motor Neurons include all F5 neurons whose firing is related to motor activity.
We focus on grasp-related behavior. Other F5 motor neurons are related to orofacial movements.

F5 Mirror Neurons form the subset of grasp-related F5 motor neurons which discharge when the monkey observes meaningful hand movements.
F5 Canonical Neurons form the subset of grasp-related F5 motor neurons which fire when the monkey sees an object with related affordances.
Mirror Neurons
Rizzolatti, Fadiga, Gallese, and Fogassi, 1995: Premotor cortex and the recognition of motor actions
Mirror neurons form the subset of grasp-related premotor neurons of F5 which discharge when the monkey observes meaningful hand movements made by the experimenter or another monkey.
F5 is endowed with an observation/execution matching system.
What is the mirror system (for grasping) for?
Mirror neurons: The cells that selectively discharge when the monkey executes particular actions as well as when the monkey observes another individual executing the same action.
Mirror neuron system (MNS): The mirror neurons and the brain regions involved in eliciting mirror behavior.
Interpretations:
• Action recognition
• Understanding (assigning meaning to others’ actions)
• Associative memory for actions
Computing the Mirror System Response
The FARS Model:
Recognize object affordances and determine appropriate grasp.
The Mirror Neuron System (MNS) Model:
We must add recognition of
• trajectory and
• hand preshape
to
• recognition of object affordances
and ensure that all three are congruent.
There are parietal systems other than AIP adapted to this task.
Further Brain Regions Involved
cIPS (caudal intraparietal sulcus): Axis and surface orientation
STS (Superior Temporal Sulcus): Detection of biologically meaningful stimuli (e.g. hand actions); motion related activity (MT/MST part)
7a (PG, caudal part of the posterior parietal lobule): Spatial coding for objects, analysis of motion during interaction of objects and self-motion
7b (PF, rostral part of the posterior parietal lobule): Mainly somatosensory; mirror-like responses
cIPS cell response
[Figure: Surface orientation selectivity of a cIPS cell (Sakata et al. 1997)]
Key Criteria for Mirror Neuron Activation When Observing a Grasp
a) Does the preshape of the hand correspond to the grasp encoded by the mirror neuron?
b) Does this preshape match an affordance of the target object?
c) Do samples of the hand state indicate a trajectory that will bring the hand to grasp the object?
Modeling Challenges:
i) To have mirror neurons self-organize to learn to recognize grasps in the monkey’s motor repertoire
ii) To learn to activate mirror neurons from smaller and smaller samples of a trajectory.
Mirror Neuron Development Hypothesis
The development of the (grasp) mirror neuron system in a healthy infant is driven by the visual stimuli generated by the actions (grasps) performed by the infant himself.
The infant (with maturation of visual acuity) gains the ability to map other individuals’ actions into his internal motor representation.
[In the MNS model, the hand state provides the key representation for this transfer.]
Then the infant acquires the ability to create (internal) representations for novel actions observed.
Parallel to these achievements, the infant develops an action prediction capability (the recognition of an action given the prefix of the action and the target object).
The Mirror Neuron System (MNS) Model
[Figure: MNS model schematic. Region labels: Visual Cortex, cIPS, AIP, STS, 7a, 7b, F5canonical, F5mirror, F4, M1, MIP/LIP/VIP. Function labels: Object features; Object affordance extraction; Hand motion detection; Hand shape recognition; Object affordance-hand state association; Hand-Object spatial relation analysis; Integrate temporal association; Action recognition (Mirror Neurons); Mirror Feedback; Motor program (Grasp); Motor program (Reach); Motor execution; Object location.]
Implementing the Basic Schemas of the Mirror Neuron System (MNS) Model using Artificial Neural Networks
(Work of Erhan Oztop)
1. Hand State & Core Mirror Circuit
2. Visual Processing
3. Reach and Grasp generation
MNS: Core Mirror Circuit and Hand State
[Figure: MNS model schematic, the same diagram as on the slide "The Mirror Neuron System (MNS) Model".]
Opposition Spaces and Virtual Fingers
The goal of a successful preshape, reach and grasp is to match the opposition axis defined by the virtual fingers of the hand with the opposition axis defined by an affordance of the object (Iberall and Arbib 1990).
Hand State
Our current representation of hand state defines a 7-dimensional trajectory F(t) with the following components:
F(t) = (d(t), v(t), a(t), o1(t), o2(t), o3(t), o4(t))
d(t): distance to target at time t
v(t): tangential velocity of the wrist
a(t): aperture of the virtual fingers involved in grasping at time t
o1(t): angle between the object axis and the (index finger tip – thumb tip) vector [relevant for pad and palm oppositions]
o2(t): angle between the object axis and the (index finger knuckle – thumb tip) vector [relevant for side oppositions]
o3(t), o4(t): the two angles defining how close the thumb is to the hand as measured relative to the side of the hand and to the inner surface of the palm.
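As an illustration of the first five hand state components, here is a minimal sketch. It assumes that 3D positions of the wrist, index finger tip, index finger knuckle and thumb tip are already available for each frame, that d(t) is measured from the wrist, that v(t) is approximated by a finite difference between consecutive frames, and that the virtual fingers are the index finger and thumb. The function name `hand_state` and the dictionary keys are hypothetical, and o3(t), o4(t) are left out because they require a hand-centered reference frame that the slides do not specify.

```python
import math

def sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return tuple(a - b for a, b in zip(p, q))

def norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in v))

def angle_between(u, v):
    """Angle in radians between two nonzero 3D vectors."""
    cos_t = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding

def hand_state(frame, prev_wrist, dt, target, object_axis):
    """Return (d, v, a, o1, o2) for one frame.

    frame: dict with hypothetical keys 'wrist', 'index_tip',
    'index_knuckle', 'thumb_tip', each a 3D point.
    """
    d = norm(sub(target, frame["wrist"]))              # distance to target
    v = norm(sub(frame["wrist"], prev_wrist)) / dt     # wrist speed
    grip = sub(frame["index_tip"], frame["thumb_tip"])
    a = norm(grip)                                     # aperture
    o1 = angle_between(object_axis, grip)
    o2 = angle_between(object_axis,
                       sub(frame["index_knuckle"], frame["thumb_tip"]))
    return d, v, a, o1, o2
```

Calling `hand_state` once per frame yields a sampled trajectory of the first five components of F(t).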
Hand State components
For most components we need to know the (3D) configuration of the hand.
Assuming that we can compute the hand state trajectory, how can we recognize it as a grasp action?
The general problem: associate N-dimensional space curves with object affordances.
A special case: the recognition of two (or three) dimensional trajectories in physical space.
Simplest solution: Map temporal information into the spatial domain. Then apply known pattern recognition techniques.
Problem with the simplest solution: the speed of the moving point can be a problem! The spatial representation may change drastically with the speed.
Scaling can overcome the problem. However, the scaling must be such that it preserves the generalization ability of the pattern recognition engine.
Solution: Fit a cubic spline to the sampled values. Then normalize and resample from the spline curve.
Result: Very good generalization. Better performance than using the Fourier coefficients to recognize curves.
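The spline-and-resample step can be sketched as follows. This is one possible reading, not the implementation used in the model: it assumes the samples are equally spaced in time, treats "normalize" as rescaling time to the interval [0, 1], and uses a natural cubic spline (zero second derivative at both ends). The function names `natural_cubic_spline` and `resample` are hypothetical.

```python
def natural_cubic_spline(ys):
    """Return s(t), t in [0, 1], interpolating ys at equally spaced knots."""
    n = len(ys)
    if n < 3:
        raise ValueError("need at least 3 samples")
    h = 1.0 / (n - 1)
    # Unknowns are the second derivatives m[i]; natural ends: m[0] = m[n-1] = 0.
    # Interior equations: m[i-1] + 4*m[i] + m[i+1] = rhs.
    rhs = [6.0 * (ys[i + 1] - 2.0 * ys[i] + ys[i - 1]) / (h * h)
           for i in range(1, n - 1)]
    k = len(rhs)
    c = [0.0] * k
    d = [0.0] * k
    c[0] = 0.25
    d[0] = rhs[0] / 4.0
    for i in range(1, k):              # forward sweep of the Thomas algorithm
        denom = 4.0 - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - d[i - 1]) / denom
    m = [0.0] * n
    for i in range(k - 1, -1, -1):     # back substitution
        m[i + 1] = d[i] - c[i] * m[i + 2]

    def s(t):
        t = min(max(t, 0.0), 1.0)
        i = min(int(t / h), n - 2)     # index of the segment containing t
        a = (i + 1) - t / h            # (t_{i+1} - t) / h
        b = t / h - i                  # (t - t_i) / h
        return (a * ys[i] + b * ys[i + 1]
                + ((a ** 3 - a) * m[i] + (b ** 3 - b) * m[i + 1]) * h * h / 6.0)
    return s

def resample(ys, m_points):
    """Resample a 1D sample sequence to m_points values over normalized time."""
    s = natural_cubic_spline(ys)
    return [s(j / (m_points - 1)) for j in range(m_points)]
```

Applied to each component of a sampled trajectory, `resample` produces a fixed-length vector regardless of how many samples the original movement contained, which is what a fixed-size network input requires.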
A simple example of curve recognition
Curve recognition system demonstrated for hand-drawn numeral recognition (successful recognition examples for 2, 8 and 3).
Spatial resolution: 30
Network input size: 30
Hidden layer size: 15
Output size: 5
Training: Back-propagation with momentum and adaptive learning rate
[Figure legend: sampled points; points used for spline interpolation; fitted spline]
Core Mirror Circuit as Neural Network
With the assumptions:
• Visual information about the hand and the object can be extracted
• The information about the hand and the object is represented with the Hand State
we can apply the curve recognition idea to the core mirror circuit learning. Thus:
• We associate a 2-layer feed-forward neural network with the core mirror circuit
• Then the learning task is: given the 7-dimensional hand state trajectory, predict the grasp action observed.
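A network of this kind can be sketched in a few lines. The sketch below assumes that "2-layer" means one hidden layer plus an output layer, that all units are sigmoidal, and that training minimizes squared error by back-propagation with momentum; the adaptive learning rate mentioned for the curve recognition example is omitted. The class name `TwoLayerNet`, the weight initialization range and the default rates are all illustrative choices, not values from the model.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TwoLayerNet:
    """One hidden layer, sigmoid units, squared-error loss, momentum updates."""

    def __init__(self, n_in, n_hid, n_out, seed=0):
        rng = random.Random(seed)
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]
        self.w1 = init(n_hid, n_in + 1)    # extra column holds the bias
        self.w2 = init(n_out, n_hid + 1)
        self.v1 = [[0.0] * (n_in + 1) for _ in range(n_hid)]   # momentum terms
        self.v2 = [[0.0] * (n_hid + 1) for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return xb, hb, y

    def predict(self, x):
        return self.forward(x)[2]

    def train_step(self, x, target, lr=0.1, momentum=0.9):
        """One back-propagation update; returns the loss before the update."""
        xb, hb, y = self.forward(x)
        d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
        # Hidden deltas use the pre-update output weights (bias column excluded).
        d_hid = [hb[j] * (1.0 - hb[j])
                 * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j in range(len(hb) - 1)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(hb):
                self.v2[k][j] = momentum * self.v2[k][j] - lr * dk * hj
                self.w2[k][j] += self.v2[k][j]
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(xb):
                self.v1[j][i] = momentum * self.v1[j][i] - lr * dj * xi
                self.w1[j][i] += self.v1[j][i]
        return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, target))
```

In use, the input would be the resampled hand state trajectory flattened into one vector, and each output unit would stand for one grasp type (for example power, precision and side grasp).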
MNS: Visual Processing
[Figure: MNS model schematic, the same diagram as on the slide "The Mirror Neuron System (MNS) Model".]
Visual Processing for the MNS model
How much should we attempt to solve?
• Even though computers are getting more powerful, the vision problem in its general form is an unsolved problem in engineering.
• There exist gesture recognition systems for human-computer interaction and sign language interpretation.
• Our vision system must at least recognize
1) The Hand and its Configuration
2) Object features
We attempt (1).

Simplifying the problem
• We simplify the problem of recognizing the Hand and its Configuration by using colored patches on the articulation points of the hand.
• If we can extract the patch positions reliably then we can try to extract some of the features that make up the hand state by trying to estimate the 3D pose of the hand from the 2D pose.
• Thus we have 2 steps:
1. Extract the color marker positions
2. Estimate 3D pose
The Color-Coded Hand
• The vision task is simplified using colored tapes on the joints and articulation points.
• The first step of hand configuration analysis is to locate the color patches unambiguously (not easy!).
Use color segmentation. But we have to compensate for lighting, reflection, shading and wrinkling problems: robust color detection.
Robust Detection of the Colors – RGB space
• A color image in a computer is composed of a matrix of pixel triplets (Red, Green, Blue) that define the color of each pixel.
• We want to label a given pixel color as belonging to one of the color patches we used to mark the hand, or as not belonging to any class.
• A straightforward way to detect whether a given target color (R’,G’,B’) matches the pixel color (R,G,B) is to look at the squared distance
$(R-R')^2 + (G-G')^2 + (B-B')^2$
with a threshold to do the classification.
This does not work well, because shading and different lighting conditions affect the R,G,B values a lot and our simple nearest neighbor method fails. For example an orange patch under shadow is very close to red in RGB space.
• But we can do better: Train a neural network that can do the labeling for us.
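The squared-distance test, and the way it breaks down under shadow, can be illustrated with a short sketch. The function name `classify_pixel`, the target colors, the threshold and the "shadowed orange" pixel value are all made-up illustrative numbers, not measurements from the system described here.

```python
def classify_pixel(rgb, targets, threshold):
    """Label a pixel with the nearest target color, or None if none is close.

    targets: dict mapping a label to an (R, G, B) reference color.
    threshold: largest squared RGB distance still accepted as a match.
    """
    r, g, b = rgb
    best_label, best_dist = None, None
    for label, (tr, tg, tb) in targets.items():
        dist = (r - tr) ** 2 + (g - tg) ** 2 + (b - tb) ** 2
        if dist <= threshold and (best_dist is None or dist < best_dist):
            best_label, best_dist = label, dist
    return best_label

# Illustrative reference colors for two marker patches.
targets = {"orange": (255, 140, 0), "red": (200, 30, 30)}

well_lit_orange = (250, 135, 5)
shadowed_orange = (140, 70, 10)   # same patch, darker under shadow

print(classify_pixel(well_lit_orange, targets, 10000))
print(classify_pixel(shadowed_orange, targets, 10000))
```

With these numbers the well-lit pixel is labeled orange, while the shadowed pixel lies closer to the red reference and is mislabeled, which is the failure mode that motivates a trained classifier.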
Robust Detection of Colors – the Color Expert
Create a training set using a test image by manually picking colors from the image and specifying their labels.
Create a NN (in our case a one-hidden-layer feed-forward network) that will accept the R,G,B values as input and put out the marker label, or 0 for a non-marker color.
Make sure that the network is not too “powerful” so that it does not memorize the training set (as distinct from generalization).
Train it, then use it: When given a pixel to classify, apply the RGB values of the pixel to the trained network and use the output as the marker that the pixel belongs to.
One then needs a segmentation system to aggregate the pixels into a patch with a single color label.
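The aggregation step can be sketched with a standard connected-component search. This is a generic technique swapped in for illustration, since the slides do not say how the segmentation system works. It assumes the color expert has already produced a 2D grid of integer labels (0 for non-marker pixels) and that 4-connectivity is adequate; the names `segment_patches` and `centroid` are hypothetical.

```python
from collections import deque

def segment_patches(labels):
    """Group 4-connected pixels that share a nonzero label into patches.

    labels: 2D list of integer marker labels (0 = non-marker).
    Returns a list of (label, [(row, col), ...]) in scan order.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] == 0 or seen[r][c]:
                continue
            lab = labels[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:                      # breadth-first flood fill
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == lab):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            patches.append((lab, pixels))
    return patches

def centroid(pixels):
    """Mean (row, col) of a patch, usable as the marker position."""
    n = len(pixels)
    return (sum(p[0] for p in pixels) / n, sum(p[1] for p in pixels) / n)
```

The centroid of each patch gives one 2D marker position; a real system would probably also discard very small patches as noise, which is left out here.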
Color Expert: Summary
[Figure labels: Preprocessing; Color Expert (network weights)]
Training phase: A color expert is generated by training a feed-forward network to approximate human perception of color.
Color Segmentation and Feature Extraction
[Figure labels: NN augmented segmentation system; Features]
Actual processing: The hand image is fed to an augmented segmentation system. The color decision during segmentation is made by consulting the color expert.
Hand Configuration Extraction
[Figure labels: Color Coded Hand; Feature Extraction; Model Matching; Hand Configuration]
Step 1 of hand shape recognition: the system processes the color-coded hand image and generates a set of features to be used by the second step.
Step 2: The feature vector generated by the first step is used to fit a 3D-kinematics model of the hand by the model matching module. The resulting hand configuration is sent to the classification module.
MNS: Reach and Grasp generation
[Figure: MNS model schematic, the same diagram as on the slide "The Mirror Neuron System (MNS) Model".]
Virtual Hand/Arm and Reach/Grasp Simulator
[Figure: simulator views of a precision pinch, a power grasp, and a side grasp]
Kinematics model of arm and hand
19 degrees of freedom (DOF): Shoulder(3), Elbow(1), Wrist(3), Fingers(4*2), Thumb(3)
Implementation Requirements
• Rendering: Given the 3D positions of the links’ start and end points, generate a 2D representation of the arm/hand (easy)
• Forward Kinematics: Given the 19 joint angles, compute the position of each link (easy)
• Reach & Grasp execution: Harder than simple inverse kinematics since there are more constraints to be satisfied (e.g. multiple target positions to be achieved at the same time)
• Inverse Kinematics: Given a desired position in space for a particular link, what are the joint angles that achieve the desired position? (semi-hard)
A 2D, 3DOF arm example
[Figure: planar arm with link lengths a, b, c, joint angles A, B, C, and end effector at P(x,y)]
Forward kinematics: given joint angles A, B, C compute the end effector position P:
x = a*cos(A) + b*cos(B) + c*cos(C)
y = a*sin(A) + b*sin(B) + c*sin(C)
Inverse kinematics: given position P(x,y) there are infinitely many joint angle triplets that achieve it.
[Figure: two circles, of radius a and radius c, joined by segments of length b] The radii of the circles are a and c, and the segments connecting the circles all have equal length b.
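The forward kinematics formula for the planar arm translates directly into code. The sketch below assumes, as the formula implies, that A, B and C are the orientations of the three links measured from the x axis (not angles relative to the previous link) and that the arm base is at the origin. The function name `forward_kinematics` is hypothetical.

```python
import math

def forward_kinematics(lengths, angles):
    """End effector position of a planar 3-link arm.

    lengths: (a, b, c) link lengths.
    angles: (A, B, C) link orientations in radians, measured from the x axis.
    """
    a, b, c = lengths
    A, B, C = angles
    x = a * math.cos(A) + b * math.cos(B) + c * math.cos(C)
    y = a * math.sin(A) + b * math.sin(B) + c * math.sin(C)
    return x, y

# Two different angle triplets that place the end effector at the same point,
# illustrating why the inverse problem has no unique answer.
p1 = forward_kinematics((1.0, 1.0, 1.0), (0.0, math.pi / 2, math.pi))
p2 = forward_kinematics((1.0, 1.0, 1.0), (math.pi, math.pi / 2, 0.0))
```

Both `p1` and `p2` come out at (0, 1) up to rounding, although the arm postures are mirror images of each other.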
A Simple Inverse Kinematics Solution
Consider just the arm.
The forward kinematics of the arm can be represented as a vector function that maps the joint angles of the arm to the wrist position:
(x,y,z) = F(s1,s2,s3,e), where s1,s2,s3 are the shoulder angles and e is the elbow angle.
We can formulate the inverse kinematics problem as an optimization problem: Given the desired P’ = (x’,y’,z’) to be achieved, we can introduce the error function
J = || P’ - F(s1,s2,s3,e) ||
Then we can compute the gradient with respect to s1,s2,s3,e and follow the negative gradient to reach the minimum of J.
This method is called the Jacobian Transpose method, as the partial derivatives of F encountered in the above process can be arranged into the transpose of a special derivative matrix called the Jacobian (of F).
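The idea can be tried out on a small example. For brevity the sketch below uses a planar 3-link arm with link orientations measured from the x axis instead of the 4-angle 3D arm, and it descends the gradient of half the squared error, which gives the usual update (angles plus step times Jacobian transpose times error). The step size, iteration limit, tolerance and the names `fk` and `ik_jacobian_transpose` are illustrative assumptions.

```python
import math

def fk(lengths, angles):
    """Planar forward kinematics with link orientations measured from the x axis."""
    x = sum(l * math.cos(t) for l, t in zip(lengths, angles))
    y = sum(l * math.sin(t) for l, t in zip(lengths, angles))
    return x, y

def ik_jacobian_transpose(lengths, angles, target, step=0.1, iters=2000, tol=1e-6):
    """Iteratively adjust the angles so that fk(lengths, angles) approaches target."""
    angles = list(angles)
    for _ in range(iters):
        x, y = fk(lengths, angles)
        ex, ey = target[0] - x, target[1] - y      # position error
        if math.hypot(ex, ey) < tol:
            break
        # Column i of the Jacobian is (-l_i*sin(t_i), l_i*cos(t_i)), so the
        # i-th entry of J^T e is l_i * (-sin(t_i)*ex + cos(t_i)*ey).
        angles = [t + step * l * (-math.sin(t) * ex + math.cos(t) * ey)
                  for l, t in zip(lengths, angles)]
    return angles

lengths = (1.0, 1.0, 1.0)
solution = ik_jacobian_transpose(lengths, (0.3, 0.8, 1.3), (1.5, 1.0))
reached = fk(lengths, solution)
```

The method needs no matrix inversion, which keeps it simple, at the cost of slower convergence than pseudo-inverse based schemes; which solution it finds among the infinitely many depends on the starting posture.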
Power grasp time series data
[Figure: plotted traces are aperture (+), angle 1 (*), angle 2 (x), 1-axisdisp1, 1-axisdisp2, speed, and distance]
A single grasp trajectory viewed from three different angles
The wrist trajectory during the grasp is shown by square traces, with the distance between any two consecutive trace marks traveled in equal time intervals.
How the network classifies the action as a power grasp. Empty squares: power grasp output; filled squares: precision grasp; crosses: side grasp output.
Power and precision grasp resolution
[Figure: panels (a) and (b); traces labeled Precision Pinch Mirror Neuron and Power Grasp Mirror Neuron]
Note that the modeling yields novel predictions for the time course of activity across a population of mirror neurons.
“Spatial Perturbation” Experiment with trained core mirror circuit
Figure A. A regular precision grasp (the hand spatially coincides with the target).
Figure B. The response of the network as precision grasp.
Figure C. The target object is displaced to create a ‘fake’ grasp.
Figure D. The response of the network to the action in Figure C. The activity of the precision mirror neuron is reduced.
In the graphs the x axes represent the normalized time (0 for start of grasp, 1 for the contact with object) and the y axes represent the cell firing rate.
“Kinematics Alteration” Experiment with the trained core mirror circuit
Figure A. A regular precision grasp (the wrist has a bell shaped velocity profile).
Figure B. The velocity profile is (almost) linear.
Figure C. Classification of the action in Figure A as precision grasp.
Figure D. The activity vanished during the observation of action
[Figure: panels A to D; axis labels: normalized speed, firing rate, normalized time]
Note that the scales of the graphs C and D are different.
Research Plan
Development of the Mirror System
• Development of Grasp Specificity in F5 Motor and Canonical Neurons
• Visual Feedback for Grasping: A Possible Precursor of the Mirror Property
Recognition of Novel and Compound Actions and their Context
• The Pliers Experiment: Extending the Visual Vocabulary
• Recognition of Compounds of Known Movements
• From Action Recognition to Understanding: Context and Expectation
Modeling Challenges
How can MNS be plugged into a learning-by-imitation system that is faithful to biological constraints (BG, Cerebellum, SMA, PFC, etc.)?
How does the brain handle temporal data? Transform the learning network into one which can work directly on temporal data. Eliminate the preprocessing required before the input can be applied to the MNS core circuit.
Extend the actions to be recognized beyond simple grasps.
Model the complementary circuit, learning to grasp by trial and error.
And a lot more!
Experimental Challenges
What are “poor” mirror neurons coding?
- temporal recognition codes
- transient response to actions which are not exactly the preferred stimuli
How can we relate different cells’ responses to each other?
- Fix the condition and record from as many cells as possible under exactly the same condition.
Is it possible to record from mirror cells in different age groups of monkeys (i.e. infant to adult)?