Psych209_13_01_28_HowBrnLrnsx

How the Brain Learns:
Rules and Outcomes
Psychology 209
January 28, 2013
The brain is highly plastic and changes in
response to experience
• Alteration of experience leads to alterations of neural
representations in the brain.
• What neurons represent, and how precisely they
represent it, are strongly affected by experience.
• We allocate more of our brain to things we have the
most experience with.
The Sensory Homunculus
Monkey Somatosensory Cortex
Merzenich’s Joined Finger Experiment
Receptive fields after fingers were sewn together
Control receptive fields
Merzenich’s Rotating Disk Experiment
Merzenich’s Rotating Disk Experiment:
Redistribution and Shrinkage of Fields
Merzenich’s Rotating Disk Experiment:
Expansion of Sensory Representation
Temporal Sharpening
Synaptic Transmission and Learning
• Learning may occur by changing
the strengths of connections.
• Addition and deletion of
synapses, as well as larger
changes in dendritic and axonal
arbors, also occur in response to
experience.
• Recent evidence suggests that
neurons may be added or
deleted in some cases as well.
(This occurs in a specialized subregion
of the hippocampus, and perhaps
elsewhere as well, e.g. in neocortex.)
[Figure labels: Pre, Post]
Hebb’s Postulate
“When an axon of cell A is near enough to excite a cell B and
repeatedly or persistently takes part in firing it, some growth
process or metabolic change takes place in one or both cells
such that A’s efficiency, as one of the cells firing B, is
increased.”
D. O. Hebb, Organization of Behavior, 1949
In other words:
“Cells that fire together wire together.”
Unknown
Mathematically, this is often taken as:
$\Delta w_{ba} = \epsilon\, a_b a_a$
(Generally you have to subtract something to achieve stability)
[Figure labels: a1, a2, b]
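A minimal sketch of the rule above for a single receiving unit b, assuming a linear activation for b and illustrative numbers (both are assumptions, not part of the slide). It shows why something usually has to be subtracted: with positive activity and nothing subtracted, the weights can only grow.

```python
def hebb_update(weights, inputs, epsilon):
    """One step of the plain Hebb rule for a single linear receiving unit b.

    weights[i] is the weight from input unit a_i to b.
    Returns (new_weights, a_b).
    """
    # Linear activation of the receiving unit (an assumption of this sketch).
    a_b = sum(w * a for w, a in zip(weights, inputs))
    # Delta w_ba = epsilon * a_b * a_a for every input unit a.
    new_weights = [w + epsilon * a_b * a for w, a in zip(weights, inputs)]
    return new_weights, a_b


def run_hebb(weights, inputs, epsilon, steps):
    """Present the same input pattern repeatedly and record a_b at each step."""
    history = []
    for _ in range(steps):
        weights, a_b = hebb_update(weights, inputs, epsilon)
        history.append(a_b)
    return weights, history


if __name__ == "__main__":
    final_w, hist = run_hebb([0.1, 0.1], [1.0, 1.0], epsilon=0.5, steps=10)
    print(hist)     # the activation of b increases on every presentation
    print(final_w)  # the weights have grown without bound
```

With these numbers the activation of b doubles on each presentation, which is the instability that a decay or normalization term is meant to control.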
Glutamate ejected from
the pre-synaptic terminal
activates AMPA receptors,
exciting the post-synaptic
neuron.
Glutamate also binds to the
NMDA receptor, but it only
opens when the level of
depolarization on the post-synaptic
side exceeds a threshold.
When the NMDA receptor opens,
Ca++ flows in, triggering a
biochemical cascade that results
in an increase in AMPA receptors.
The increase in AMPA receptors
means that the same amount of
transmitter release at a later time
will cause a stronger postsynaptic effect (LTP).
The Molecular Basis of Hebbian
Learning (Short Course!)
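As a rough computational caricature of the steps above (not a biophysical model), the sketch below lets an AMPA receptor count stand in for synaptic strength and increases it only when transmitter is released while postsynaptic depolarization exceeds the NMDA threshold. The function name, `nmda_threshold`, `ltp_gain`, and all numbers are assumptions chosen for illustration.

```python
def synapse_step(ampa, glutamate, other_input, nmda_threshold=1.0, ltp_gain=0.2):
    """One transmission event at a single synapse (toy model).

    ampa        -- measure of AMPA receptors, i.e. synaptic strength
    glutamate   -- amount of transmitter released by the pre-synaptic terminal
    other_input -- depolarization contributed by other synapses on the cell
    Returns (new_ampa, depolarization).
    """
    # AMPA receptors turn released glutamate into post-synaptic excitation.
    depolarization = ampa * glutamate + other_input
    # The NMDA receptor opens only if glutamate is bound AND the post-synaptic
    # side is depolarized beyond threshold.
    nmda_open = glutamate > 0 and depolarization > nmda_threshold
    if nmda_open:
        # Ca++ entry triggers the cascade that adds AMPA receptors (LTP).
        ampa += ltp_gain * glutamate
    return ampa, depolarization


if __name__ == "__main__":
    alone, _ = synapse_step(0.5, 1.0, other_input=0.0)   # weak input by itself
    paired, _ = synapse_step(0.5, 1.0, other_input=1.0)  # same input plus other drive
    print(alone, paired)
```

In this toy version a weak synapse is strengthened only when its activity coincides with enough other depolarization, which is the Hebbian pre-and-post conjunction.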
Linsker’s papers
(PNAS, 1986a,b,c)
• Showed how a simple Hebbian learning rule can
give rise to:
– Center-surround receptive fields at initial stages
of visual processing
– Oriented edge detectors at later stages
• Extended the idea to consider why topographic
maps may form.
Emergence of Center-Surround Organization
• Assume a first layer (A) of independent, randomly spiking
neurons projects to a second layer (B) of neurons that receive
inputs according to a Gaussian function of distance from the
corresponding point in the layer below.
• Then neighboring neurons in layer B will have a tendency to
have slightly correlated activations, due to the fact that they
tend to receive input from some of the same neurons.
• Now feed activation forward to layer C from layer B. Activation
follows a linear activation rule, and learning occurs according to
a Hebbian rule:
$a_r = \sum_s w_{rs} a_s + b_r$
$\Delta w_{rs} = \epsilon (a_r - Q_r)(a_s - Q_s) - d$
• Additional assumptions: Weights ($w_{rs}$) are bounded in [-1,+1].
Alternatively there could be two types of weights, some in [0,1]
and others in [-1,0]. So, when a weight reaches its min or max
value, it cannot change further in that direction.
• $d$ represents a decay term tending to cause an overall negative
drift.
• The result of running this for a long time with small $\epsilon$ is
center-surround organization as shown (small dots = excitatory
connections, large dots = inhibitory connections).
• This comes from the fact that inputs near the center of the
receptive field tend to be more correlated with the other inputs
to the cell than the ones near the edges.
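The setup above can be sketched in a few lines, under several simplifying assumptions that are not from the slide: the layers are one-dimensional, the layer C unit receives from every layer B unit, Q_s is set to the mean activity of each layer B unit, and Q_r to the mean of a_r under the current weights. All parameter values are illustrative, and whether a center-surround profile actually emerges depends on choices the slide does not specify.

```python
import math
import random


def gaussian_kernel(center, n, width):
    """Gaussian function of distance from `center` over n positions."""
    return [math.exp(-((i - center) ** 2) / (2.0 * width ** 2)) for i in range(n)]


def train_layer_c_unit(n=21, width=2.0, p_spike=0.5, epsilon=0.001,
                       d=0.0002, b_r=0.0, steps=1000, seed=0):
    """Train the incoming weights of one layer C unit with
    a_r = sum_s w_rs a_s + b_r  and  dw_rs = eps (a_r - Q_r)(a_s - Q_s) - d."""
    rng = random.Random(seed)
    kernels = [gaussian_kernel(j, n, width) for j in range(n)]
    # Q_s: mean activity of each layer B unit under random layer A spiking.
    q_s = [p_spike * sum(k) for k in kernels]
    w = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    for _ in range(steps):
        # Layer A: independent random spikes (0 or 1).
        a_layer = [1.0 if rng.random() < p_spike else 0.0 for _ in range(n)]
        # Layer B: Gaussian-weighted sums of layer A, so neighbors are correlated.
        b_layer = [sum(g * a for g, a in zip(k, a_layer)) for k in kernels]
        # Layer C unit: linear activation rule.
        a_r = sum(ws * a for ws, a in zip(w, b_layer)) + b_r
        # Q_r: mean of a_r given the current weights (assumption of this sketch).
        q_r = sum(ws * q for ws, q in zip(w, q_s)) + b_r
        for s in range(n):
            dw = epsilon * (a_r - q_r) * (b_layer[s] - q_s[s]) - d
            # Weights are bounded in [-1, +1].
            w[s] = max(-1.0, min(1.0, w[s] + dw))
    return w


if __name__ == "__main__":
    for position, weight in enumerate(train_layer_c_unit()):
        print(position, round(weight, 3))
```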
How Hebbian Learning Plus Weight Decay
Strengthens Correlated Inputs and Weakens Isolated
Inputs to a Receiving Neuron
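Unit r receives input from input units 1, 2, and 3.
Input patterns: units 1 & 2 active; unit 3 active alone.
Activation rule: $a_r = \sum_s a_s w_{rs}$
Initial weights all = .1
Learning rule: $\Delta w_{rs} = \epsilon a_r a_s - d$
$\epsilon$ = 1.0
$d$ = .075 (2x)
Final weights (units 1, 2, 3): .15 .15 .05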
This works because inputs correlated with other
inputs are associated with stronger activation of
the receiving unit than inputs that occur on their own.
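The arithmetic of this example can be checked with a short script. It assumes that the changes for both patterns are computed from the initial weights and then summed, and that the decay d is subtracted from every weight on each of the two pattern presentations. Those are assumptions of this sketch, chosen because they reproduce the final weights given above.

```python
def batch_hebb_with_decay(weights, patterns, epsilon, d):
    """Apply dw_rs = epsilon * a_r * a_s - d once per pattern.

    All changes are computed from the starting weights and then summed
    (an assumption of this sketch).
    """
    total_change = [0.0] * len(weights)
    for pattern in patterns:
        # Activation rule: a_r = sum_s a_s * w_rs
        a_r = sum(a_s * w for a_s, w in zip(pattern, weights))
        for s, a_s in enumerate(pattern):
            # The decay d is subtracted for every weight on every pattern.
            total_change[s] += epsilon * a_r * a_s - d
    return [w + dw for w, dw in zip(weights, total_change)]


if __name__ == "__main__":
    patterns = [
        [1, 1, 0],  # units 1 & 2 active
        [0, 0, 1],  # unit 3 active alone
    ]
    final = batch_hebb_with_decay([0.1, 0.1, 0.1], patterns, epsilon=1.0, d=0.075)
    print([round(w, 3) for w in final])  # [0.15, 0.15, 0.05]
```

When units 1 and 2 fire together the receiving unit reaches an activation of .2, so each of their weights gains .125 and loses .075 twice. Unit 3 on its own drives the receiving unit only to .1, gains .025, and loses the same .15 in decay.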
Linsker’s rule maximizes a
‘correlatedness of inputs’ measure
What Linsker calls ‘Hebb Optimal’ weights are those maximizing
the sum of the pair-wise correlations among a unit’s inputs subject
to constraints on the sum of the weights:
$\sum_i \sum_j Q_{ij} w_{ri} w_{rj}$
where
$Q_{ij} = \frac{\sum_t (a_i(t) - \bar{a}_i)(a_j(t) - \bar{a}_j)}{\sqrt{\sum_t (a_i(t) - \bar{a}_i)^2 \, \sum_t (a_j(t) - \bar{a}_j)^2}}$
When $\bar{a}_i = \bar{a}_j = Q_s$ and $\bar{a}_r = Q_r$, the learning rule shown previously
tends to maximize the sum of the $Q_{ij}$’s subject to a constraint against
large weights (enforced by having a non-zero value for $d$).
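To make the measure concrete, the sketch below computes Q_ij from activity time series and evaluates the weighted sum for two weight vectors with the same total weight. It reads Q_ij as the Pearson correlation of inputs i and j, which is an interpretation of the formula above. The function names and the toy activity data are assumptions for illustration.

```python
import math


def correlation(x, y):
    """Q_ij: Pearson correlation of two activity time series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlatedness(weights, activity):
    """Sum_i Sum_j Q_ij * w_ri * w_rj for one receiving unit r.

    activity[i] is the time series a_i(t) of input i.
    """
    n = len(weights)
    return sum(
        correlation(activity[i], activity[j]) * weights[i] * weights[j]
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    activity = [
        [1, 0, 1, 0],  # input 0
        [1, 0, 1, 0],  # input 1: always fires with input 0
        [1, 1, 0, 0],  # input 2: uncorrelated with inputs 0 and 1
    ]
    print(correlatedness([1, 1, 0], activity))  # weight on the correlated pair
    print(correlatedness([1, 0, 1], activity))  # weight on an uncorrelated pair
```

For the same sum of weights, putting the weight on the two correlated inputs gives the larger value of the measure.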
When applied to inputs with ‘Mexican
Hat’ receptive fields, Linsker’s rule
tends to produce edge detectors!
• Two representative simulated
receptive fields for units at a
layer receiving inputs from a
layer below consisting of
center-surround units are
shown.
• The Mexican Hat receptive field
shape implies a Mexican Hat Q
function, and this favors each
point being like its near neighbors
and unlike its midrange neighbors.
• “Each point cannot be the
center of a ‘like island’ in an
‘unlike sea’”.
• Whether you get blobs or edge
detectors depends on many
factors…
Miller, Keller, and Stryker (Science, 1989)
model ocular dominance column development
using Hebbian learning
Architecture:
• L and R LGN layers and a cortical
layer containing 25x25 simple
neuron-like units.
• Each neuron in each LGN has an
initial weak projection that forms a
Gaussian hump (illustrated with
disc) at the corresponding location
in the Cortex, but with some noise
around it.
• In the cortex, there are short-range
excitatory connections and longer-range
inhibitory connections, with
the net effect as shown in B (scaled
version shown next to cortex to
indicate approximate scale).
Simulation of Ocular Dominance
Column Development based on Hebbian Learning
Experience and Training:
• Before ‘birth’, random activity occurs in
each retina. Due to overlapping
projections to LGN, neighboring LGN
neurons in the same eye tend to be
correlated. No (or less) between-eye
correlation is assumed.
• Learning of weights to cortex from LGN
occurs through a Hebbian learning rule:
$\Delta w_{cl} = \epsilon a_c a_l - \text{decay}$
(Note that w’s are not allowed to go
below 0).
• Results indicate that ocular dominance
columns slowly develop over time. Each
panel shows the cortical sheet at a
different point in time during
development. (Strength of R vs L eye
input to a given cortical neuron is
indicated with light to dark gray as
shown.)
• Some studies indicate that ocular
dominance columns are present before
there is neural activity, but even in that
case mechanisms like those considered
here may play a role in maintaining and
refining the columnar organization.
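The weight update in the learning bullet above can be sketched for a single cortical unit as follows. The cortical activation is taken here to be a plain weighted sum of LGN activity, leaving out the intracortical excitatory and inhibitory connections of the full model. That simplification, the function names, and the numbers are all assumptions for illustration.

```python
def cortical_activation(w, a_lgn):
    """a_c as a weighted sum of LGN activity (simplifying assumption)."""
    return sum(w_l * a_l for w_l, a_l in zip(w, a_lgn))


def update_lgn_to_cortex(w, a_c, a_lgn, epsilon, decay):
    """One step of dw_cl = epsilon * a_c * a_l - decay, with weights kept >= 0.

    w[l] is the weight from LGN unit l to a single cortical unit c.
    """
    return [max(0.0, w_l + epsilon * a_c * a_l - decay)
            for w_l, a_l in zip(w, a_lgn)]


if __name__ == "__main__":
    # Inputs ordered as [L1, L2, R1, R2]; only the left-eye units are active.
    w = [0.3, 0.3, 0.02, 0.02]
    a_lgn = [1.0, 1.0, 0.0, 0.0]
    a_c = cortical_activation(w, a_lgn)
    print(update_lgn_to_cortex(w, a_c, a_lgn, epsilon=0.1, decay=0.04))
```

In this single step the active left-eye weights grow, while the weak right-eye weights are pushed down by the decay and stop at the floor of 0.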
Competitive Learning and
the Self-Organizing Map (Hbk, Ch. 6)
• Competitive learning:
– Units organized into mutually inhibitory
clusters
– Unit with largest net input is chosen as
‘winner’
– Winner’s weights are updated to align
with the input.
– Sum or sum of squares of weights
coming to the unit are held constant.
• Self-organizing map:
– Very similar but units are arranged in a
line, sheet or higher-d space, and:
– Weights are updated for the winner and
for other units near the winner
– Extent of update falls off with distance.
– Units then spread out over the input
data, and are assigned in proportion to
density of data (as in Merzenich
experiments).
– In simulation result below, units are
arranged along a one-d line. Inputs are
points in the 2-d space as shown.
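A minimal sketch of the self-organizing map described above, with units on a one-d line and inputs in a 2-d space. It makes several assumptions that the slide does not state: the winner is the unit whose weights are closest to the input (a common stand-in for "largest net input"), the neighborhood is a fixed-width Gaussian over position on the line, and no weight normalization is applied. Parameter names and values are illustrative.

```python
import math
import random


def train_som(data, n_units=10, epochs=20, lr=0.2, sigma=1.5, seed=0):
    """Train a 1-d chain of units on 2-d points; returns the unit weight vectors."""
    rng = random.Random(seed)
    units = [[rng.random(), rng.random()] for _ in range(n_units)]
    for _ in range(epochs):
        for x in data:
            # Winner: the unit whose weight vector is closest to the input.
            winner = min(
                range(n_units),
                key=lambda u: (units[u][0] - x[0]) ** 2 + (units[u][1] - x[1]) ** 2,
            )
            for u in range(n_units):
                # Extent of update falls off with distance along the line.
                h = math.exp(-((u - winner) ** 2) / (2.0 * sigma ** 2))
                # Move the unit's weights toward the input.
                units[u][0] += lr * h * (x[0] - units[u][0])
                units[u][1] += lr * h * (x[1] - units[u][1])
    return units


if __name__ == "__main__":
    rng = random.Random(1)
    points = [(rng.random(), rng.random()) for _ in range(200)]
    for position, (wx, wy) in enumerate(train_som(points)):
        print(position, round(wx, 2), round(wy, 2))
```

Restricting the update to the winner alone (no neighbors) would give a simple form of competitive learning, though without the constraint on the sum of the weights listed above.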
Plusses and Minuses of Hebbian Learning
• Hebbian learning tends to reinforce whatever
response occurs to a particular input.
– If a neuron becomes activated, the inputs that
activate it are strengthened, so the neuron will be
even more likely to be activated next time.
• This may contribute to failures of learning and even
to stamping in of bad habits when the response we
make to an input is not the best one to make.
• Possible examples (see articles by me in today’s
readings):
- Failure of adults to learn new speech sounds
- Phobias, racism
- Entrenchment of “Habits of Mind”
Learning Depends on More than Mere
Stimulation
• Exposure to task-irrelevant stimuli does not lead to
change in studies from the Merzenich lab and other
labs.
• Outcome feedback can lead to enhanced learning.
E.g., feedback helps Japanese adults improve their
discrimination of /r/ vs /l/ sounds.
How might outcomes shape learning?
• Learning can be influenced by neuro-modulators such as ACH or dopamine.
– One Merzenich expt. showed that an entire cortical area can come to respond to a
particular tone played repeatedly, if a sub-cortical nucleus that releases the modulator
ACH is continually stimulated while the tone is being presented. ACH may enhance
connection weight changes, and level of ACH release may depend on the animal’s
attentional state.
• Reinforcement learning:
– Release of the modulator dopamine may be triggered by occurrence of rewards, and
dopamine too may modulate the size of connection weight changes.
$\Delta w_{rs} = \epsilon R a_r a_s - d$, where R = reinforcement
– This provides a powerful way of shaping Hebb-like learning to strengthen activations that
lead to rewards, and not other activations.
– Maybe R signal corresponds to level of dopamine (above or below a baseline level).
– R may correspond to the extent to which the reward is greater or less than expected (we
will return to this idea later).
• Error-correcting learning:
– Learning rules driven by the difference between what a network produces and a ‘teaching
signal’ are the main engines of learning in neural networks – we will turn to these next
time.
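The reinforcement-modulated rule from the reinforcement learning bullet above can be sketched as a single update function. Here R is treated as a signed number, positive when the outcome is better than a baseline and negative when worse. That sign convention, the function name, and the numbers are assumptions for illustration.

```python
def reward_modulated_update(weights, inputs, a_r, reward, epsilon, d):
    """dw_rs = epsilon * R * a_r * a_s - d for one receiving unit r.

    weights[s] is the weight from input unit s to r; reward is R.
    """
    return [w + epsilon * reward * a_r * a_s - d
            for w, a_s in zip(weights, inputs)]


if __name__ == "__main__":
    weights = [0.5, 0.5]
    inputs = [1.0, 0.0]  # only the first input unit is active
    rewarded = reward_modulated_update(weights, inputs, a_r=1.0, reward=1.0,
                                       epsilon=0.1, d=0.01)
    punished = reward_modulated_update(weights, inputs, a_r=1.0, reward=-1.0,
                                       epsilon=0.1, d=0.01)
    print(rewarded)  # the active input's weight goes up
    print(punished)  # the same pairing with negative R pushes it down
```

The same pre- and post-synaptic activity strengthens the connection when R is positive and weakens it when R is negative, so only activations followed by reward are reinforced.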