The Physics of the Brain


Unsupervised learning
• The Hebb rule – Neurons that fire together wire together.
• PCA
• Receptive field (RF) development with PCA
Classical Conditioning and Hebb’s rule
[Figure: classical-conditioning schematic showing cells A and B with sensory inputs labeled Ear, Nose, and Tongue.]
“When an axon of cell A is near enough to excite cell B and
repeatedly or persistently takes part in firing it, some
growth process or metabolic change takes place in one
or both cells such that A’s efficacy in firing B is increased”
D. O. Hebb (1949)
The generalized Hebb rule:
$$\frac{dw_i}{dt} = \eta\, x_i\, y$$
where $x_i$ are the inputs and the output $y$ is assumed linear:
$$y = \sum_j w_j x_j$$
Results in 2D
Example of Hebb in 2D
[Figure: scatter of 2D inputs x1, x2 (roughly -2 to 2) tilted at an angle θ = π/3, with the evolving weight vector w shown.]
(Note: here the inputs have a mean of zero)
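For illustration, a minimal Python/NumPy sketch of this 2D Hebb example (not part of the original slides; the angle, variances, learning rate and number of patterns are arbitrary choices):

import numpy as np

rng = np.random.default_rng(0)

theta = np.pi / 3                          # tilt angle of the input cloud (illustrative)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# zero-mean 2D inputs with different variances along the two principal axes
X = (rng.normal(size=(500, 2)) * np.array([1.0, 0.3])) @ R.T

eta = 0.01
w = rng.normal(scale=0.1, size=2)          # small random initial weights
for x in X:
    y = w @ x                              # linear neuron: y = sum_j w_j x_j
    w = w + eta * x * y                    # plain Hebb update: dw_i ~ x_i * y

print("final w:", w, " |w| =", np.linalg.norm(w))
# |w| keeps growing; only the direction of w aligns with the principal
# eigenvector of the input correlation matrix Q.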
On the board:
• Solve a simple linear first-order ODE
• Fixed points and their stability for nonlinear ODEs
In the simplest case, the change in synaptic weight $w$ is:
$$\Delta w_i = \eta\, x_i\, y$$
where $x$ are the input vectors and $y$ is the neural response.
Assume for simplicity a linear neuron: $y = \sum_j w_j x_j$.
So we get:
$$\Delta w_i = \eta\, x_i \sum_j x_j w_j$$
Now take an average with respect to the distribution of inputs to get:
$$E[\Delta w_i] = \eta \sum_j E[x_i x_j]\, w_j = \eta \sum_j Q_{ij}\, w_j$$
If a small change $\Delta w$ occurs over a short time $\Delta t$, then (in matrix notation):
$$\frac{\Delta w}{\Delta t} \;\rightarrow\; \frac{dw}{dt} = \eta\, Q\, w$$
If $\langle x\rangle = 0$, $Q$ is the covariance matrix.
What, then, is the solution of this simple first-order linear ODE?
(Show on board)
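Sketch of the solution (assuming $Q$ is symmetric, with orthonormal eigenvectors $u_a$ and eigenvalues $\lambda_a$): expanding $w(0) = \sum_a c_a u_a$,
$$w(t) = \sum_a c_a\, e^{\eta \lambda_a t}\, u_a$$
so for long times the component along the eigenvector with the largest eigenvalue dominates, and $|w|$ grows without bound; this is the instability discussed below.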
Mathematics of the generalized Hebb rule
The change in synaptic weight $w$ is:
$$\Delta w_i = \eta\,(x_i - x_0)(y - y_0)$$
where $x$ are the input vectors and $y$ is the neural response.
Assume for simplicity a linear neuron: $y = \sum_j w_j x_j$.
So we get:
$$\Delta w_i = \eta\left(\sum_j x_i x_j w_j \;-\; y_0 x_i \;-\; x_0 \sum_j x_j w_j \;+\; y_0 x_0\right)$$
Taking an average over the distribution of inputs:
$$E[\Delta w_i] = \eta\left(\sum_j E[x_i x_j]\, w_j \;-\; y_0 E[x_i] \;-\; x_0 \sum_j E[x_j]\, w_j \;+\; y_0 x_0\right)$$
and using $E[x_i x_j] = Q_{ij}$ and $E[x_i] = \mu$, we obtain:
$$E[\Delta w_i] = \eta\left(\sum_j Q_{ij}\, w_j \;-\; y_0 \mu \;-\; x_0 \mu \sum_j w_j \;+\; y_0 x_0\right)$$
In matrix form:
$$E[\Delta w] = \eta\big[(Q - x_0\mu J)\,w - y_0(\mu - x_0)\,\hat{e}\big] = \eta\big[(Q - k_2 J)\,w - k_1\hat{e}\big]$$
where $J$ is a matrix of ones, $\hat{e}$ is a unit vector in the direction $(1,1,\ldots,1)$, $k_1 = y_0(\mu - x_0)$ and $k_2 = x_0\mu$,
or
$$E[\Delta w] = \eta\big[Q' w - k_1\hat{e}\big]$$
where $Q' = Q - k_2 J$.
The equation therefore has the form:
$$\frac{dw}{dt} = \eta\big[Q' w - k_1\hat{e}\big]$$
If $k_1$ is not zero, this has a fixed point; however, it is usually not stable.
If $k_1 = 0$, then we have:
$$\frac{dw}{dt} = \eta\, Q'\, w$$
The Hebb rule is unstable – how can it be
stabilized while preserving its properties?
The stabilized Hebb (Oja) rule.
Normalize:
$$w_i'(t+1) = \frac{w_i'(t) + \Delta w_i}{\sqrt{\sum_j \big(w_j'(t) + \Delta w_j\big)^2}}$$
where $|w(t)| = 1$.
Approximate to first order in $\Delta w$ (show on board):
$$w_i'(t+1) \approx w_i'(t) + \Delta w_i - w_i'(t) \sum_j \Delta w_j\, w_j'(t)$$
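Sketch of the expansion (using $|w(t)| = 1$):
$$\frac{w_i + \Delta w_i}{|w + \Delta w|} = \frac{w_i + \Delta w_i}{\sqrt{1 + 2\sum_j w_j \Delta w_j + O(\Delta w^2)}} \approx (w_i + \Delta w_i)\Big(1 - \sum_j w_j \Delta w_j\Big) \approx w_i + \Delta w_i - w_i \sum_j \Delta w_j\, w_j$$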
Now insert $\Delta w_i = \eta\, x_i y$ and get:
$$w_i'(t+1) = w_i'(t) + \eta x_i y - w_i'(t) \sum_j \eta x_j y\, w_j'(t) = w_i'(t) + \eta x_i y - \eta w_i'(t)\, y \underbrace{\sum_j x_j w_j'}_{=\,y}$$
Therefore:
$$\Delta w_i' = w_i'(t+1) - w_i'(t) = \eta\,\big(x_i y - w_i'(t)\, y^2\big)$$
The Oja rule therefore has the form:
$$\frac{dw_i}{dt} = \eta\,\big(x_i y - w_i y^2\big)$$
Expanding $y = \sum_k w_k x_k$:
$$\frac{dw_i}{dt} = \eta\left(x_i \sum_k x_k w_k \;-\; w_i \sum_{j,k} x_j x_k w_j w_k\right)$$
Taking the average over the input distribution:
$$\frac{dw_i}{dt} = \eta\left(\sum_k \langle x_k x_i\rangle\, w_k \;-\; w_i \sum_{j,k} \langle x_k x_j\rangle\, w_k w_j\right)$$
In matrix form:
$$\frac{dw}{dt} = \eta\big[Q w - (w^T Q w)\, w\big]$$
Using this rule the weight vector converges to the eigenvector of $Q$ with the largest eigenvalue. It is often called a principal-component or PCA rule.
• The exact dynamics of the Oja rule have been solved by Wyatt and Elfadel (1995).
• Variants of networks that extract several principal components have been proposed (e.g. Sanger 1989).
Therefore a stabilized Hebb neuron (Oja neuron) carries out eigenvector extraction, or principal component analysis (PCA).
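For illustration, a minimal Python/NumPy sketch of the Oja rule (not part of the original slides; the correlation matrix, learning rate and sample count are arbitrary choices):

import numpy as np

rng = np.random.default_rng(1)
C = np.array([[3.0, 1.0],
              [1.0, 1.0]])                 # input correlation matrix Q (illustrative)
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=C, size=2000)

eta = 0.005
w = rng.normal(scale=0.1, size=2)
for x in X:
    y = w @ x
    w = w + eta * (x * y - w * y**2)       # Oja rule: dw_i ~ x_i*y - w_i*y^2

evals, evecs = np.linalg.eigh(C)
print("learned w (|w| close to 1):", w)
print("principal eigenvector (up to sign):", evecs[:, np.argmax(evals)])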
[Figure: Oja-rule simulation with 2D inputs x1, x2 (roughly -1 to 1) tilted at θ = π/3. The weight vector converges to the eigenvector of Q with the largest eigenvalue, i.e. the principal component.]
Another way to look at this:
$$\frac{dw}{dt} = \eta\,\big(Qw - \beta(y)\, w\big)$$
where for the Oja rule $\beta(y) = y^2$.
At the fixed point, $\frac{dw}{dt} = 0$:
$$Q w = \lambda w, \qquad \lambda = \langle y^2 \rangle$$
So the fixed point is an eigenvector of $Q$.
The condition $\lambda = \langle y^2\rangle$ means that $w$ is normalized.
Why? Could there be other choices for $\beta$?
Show that the Oja rule converges to the state $|w|^2 = 1$.
The Oja rule in matrix form:
$$\frac{dw}{dt} = \eta\,\big(x\, y - w\, y^2\big)$$
What is $\frac{d|w|^2}{dt}$?
$$\frac{d|w|^2}{dt} = \frac{d(w^T w)}{dt} = 2\, w^T \frac{dw}{dt} = 2\eta\, y^2\,\big(1 - |w|^2\big)$$
Bonus question for H.W.: why does the relation above prove convergence to normalization?
Show that the fixed points of the Oja rule are such that the eigenvector with the largest eigenvalue (the principal component) is stable while the others are not (from HKP, p. 202).
Start with:
$$\frac{dw}{dt} = \eta\big[Qw - (w^T Q w)\,w\big]$$
Assume $w = u_a + \varepsilon u_b$, where $u_a$ and $u_b$ are eigenvectors with eigenvalues $\lambda_a$, $\lambda_b$:
$$\frac{dw}{dt} = \frac{du_a}{dt} + \frac{d(\varepsilon u_b)}{dt} = \eta\Big[Q(u_a + \varepsilon u_b) - \big((u_a + \varepsilon u_b)^T Q (u_a + \varepsilon u_b)\big)(u_a + \varepsilon u_b)\Big]$$
Get (show on board):
$$\frac{d(u_a + \varepsilon u_b)}{dt} = \eta\big[\lambda_a u_a + \varepsilon\lambda_b u_b - \lambda_a u_a - \varepsilon\lambda_a u_b\big] + O(\varepsilon^2)$$
Therefore:
$$\frac{d\varepsilon}{dt} = \eta\,(\lambda_b - \lambda_a)\,\varepsilon$$
That is stable only when $\lambda_a > \lambda_b$ for every $b$.
Finding multiple principal components – the Sanger algorithm.
$$\frac{dw_{ij}}{dt} = \eta\, y_i\left(x_j - \sum_{k=1}^{i} y_k\, w_{kj}\right)$$
or, equivalently, splitting off the $k = i$ term:
$$\frac{dw_{ij}}{dt} = \eta\, y_i\left(\Big(x_j - \sum_{k=1}^{i-1} y_k\, w_{kj}\Big) - w_{ij}\, y_i\right)$$
The sum up to $i-1$ subtracts the projection onto the already-accounted-for subspace (Gram-Schmidt); the $-w_{ij}\, y_i$ term is the standard Oja normalization.
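For illustration, a minimal Python/NumPy sketch of Sanger's rule (not part of the original slides; dimensions, learning rate and data are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
d, n_pc = 5, 2                             # input dimension, number of PCs to extract
A = rng.normal(size=(d, d))
X = rng.normal(size=(5000, d)) @ A.T       # correlated, zero-mean inputs

eta = 1e-3
W = rng.normal(scale=0.1, size=(n_pc, d))  # row i holds the weights of output i
for x in X:
    y = W @ x
    # lower-triangular term: for output i, subtract sum_{k<=i} y_k * w_k
    proj = np.tril(np.outer(y, y)) @ W
    W = W + eta * (np.outer(y, x) - proj)

# compare with the eigenvectors of the input correlation matrix
Q = X.T @ X / len(X)
evals, evecs = np.linalg.eigh(Q)
print("learned rows (each should match a top eigenvector up to sign):")
print(W / np.linalg.norm(W, axis=1, keepdims=True))
print("top eigenvectors of Q:")
print(evecs[:, np.argsort(evals)[::-1][:n_pc]].T)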
Homework 1 (due 1/28):
1a) Implement a simple Hebb neuron with random 2D input, tilted at an angle θ = 30°, with variances 1 and 3 and mean 0. Show the synaptic weight evolution (at least 200 patterns).
1b) Calculate the correlation matrix of the input data. Find the eigenvalues and eigenvectors of this matrix. Compare to 1a.
1c) Repeat 1a for an Oja neuron; compare to 1b.
+ bonus question above (another 25 pt)
What did we learn up to here?
Visual Pathway
[Figure: Retina (light → electrical signals) → LGN → Visual Cortex (Area 17). LGN receptive fields are monocular and radially symmetric; cortical (Area 17) receptive fields are binocular and orientation selective. Orientation tuning curves shown for the left and right eyes over 0–360°.]
Orientation Selectivity
[Figure: orientation tuning curves at eye-opening and in the adult, under normal rearing and under binocular deprivation.]
Monocular Deprivation
[Figure: ocular dominance histograms (% of cells per ocular-dominance group 1–7) for normal and monocularly deprived animals; Rittenhouse et al.]
First use Hebb/PCA with toy examples, then with more realistic examples.
Aim: get selective neurons using a Hebb/PCA rule.
Simple example:
$$Q(r - r') = 1 + q\cos(r - r')$$
Why?
The eigenvalue equation has the form:
$$\frac{1}{\pi}\int_{-\pi}^{\pi} dr'\; Q(r - r')\, w(r') = \lambda\, w(r)$$
$Q$ can be rewritten in the equivalent form:
$$Q(r - r') = 1 + q\big(\cos(r)\cos(r') + \sin(r)\sin(r')\big)$$
and a possible solution can be written as the sum:
$$w(r) = a_0 + \sum_{l=1}^{\infty}\big(a_l \cos(l r) + b_l \sin(l r)\big)$$
Inserting, and using orthogonality, we get:
$$2 a_0 + q\, a_1 \cos(r) + q\, b_1 \sin(r) = \lambda\left(a_0 + \sum_{l=1}^{\infty}\big(a_l \cos(l r) + b_l \sin(l r)\big)\right)$$
So for $l = 0$, $\lambda = 2$; for $l = 1$, $\lambda = q$; for $l > 1$ there is no solution.
So either $w(r) = \text{const}$ with $\lambda = 2$, or $w(r) = a_1\cos(r) + b_1\sin(r)$ with $\lambda = q$.
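For illustration, a quick numerical check of this result (not part of the original slides; Python/NumPy assumed, grid size and q are arbitrary):

import numpy as np

q = 0.5
n = 200
r = np.linspace(-np.pi, np.pi, n, endpoint=False)
Q = 1.0 + q * np.cos(r[:, None] - r[None, :])
K = Q * (2 * np.pi / n) / np.pi            # discretized (1/pi) * integral operator

evals = np.linalg.eigvalsh(K)[::-1]        # eigenvalues in descending order
print("largest eigenvalues:", np.round(evals[:4], 3))
# expect ~2 (constant mode), then q, q (cos r and sin r modes), then ~0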
Orientation selectivity from a natural environment:
The Images: Natural Images, Noise, and Learning
[Figure: image → retinal activity → LGN → Cortex. Present patches and update weights; patches are taken either from the retinal-activity image or from noise.]
Raw images: (fig 5.8)
Preprocessed images: (fig 5.9)
Monocular Deprivation
[Figure repeated: ocular dominance histograms (% of cells per ocular-dominance group 1–7), normal vs. monocular deprivation; Rittenhouse et al.]
Binocularity – simple examples.
Q is a 2-eye correlation matrix. What is the solution of the eigenvalue equation $Q\,m = \lambda m$?
$$Q = \begin{pmatrix} 1 & \eta \\ \eta & 1 \end{pmatrix}$$
$$\lambda_1 = 1 + \eta, \quad m_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}; \qquad \lambda_2 = 1 - \eta, \quad m_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
In a higher-dimensional case:
$$Q^2 = \begin{pmatrix} Q^{ll} & Q^{lr} \\ Q^{rl} & Q^{rr} \end{pmatrix}$$
$Q^{ll}$, $Q^{lr}$, etc. are now matrices, and $Q^{lr} = Q^{rl}$.
The eigenvectors now have the form:
$$m_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} m \\ m \end{pmatrix}, \qquad m_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} m \\ -m \end{pmatrix}$$
In a simpler case:
$$Q^2 = \begin{pmatrix} Q & \eta Q \\ \eta Q & Q \end{pmatrix}$$
This implies $Q^{ll} = Q^{rr}$, that is, the eyes are equivalent, and the cross-eye correlation is a scaled version of the one-eye correlation.
If $Q\,m = \lambda m$, then:
$$\lambda_{1,2} = (1 \pm \eta)\,\lambda, \qquad m_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} m \\ m \end{pmatrix}, \quad m_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} m \\ -m \end{pmatrix}$$
[Figure: simulation results for positive correlations (η = 0.2) and for negative correlations (η = -0.2), using a Hebb rule with a lower saturation at 0.]
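For illustration, a quick numerical check of the two-eye eigenvalue result above (not part of the original slides; Python/NumPy assumed, Q and η are arbitrary choices):

import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
Q = A @ A.T                                # some symmetric one-eye correlation matrix
eta = 0.2
Q2 = np.block([[Q, eta * Q],
               [eta * Q, Q]])              # two-eye block correlation matrix

lam, m = np.linalg.eigh(Q)
lam2, m2 = np.linalg.eigh(Q2)

print("largest eigenvalue of Q2:   ", lam2[-1])
print("(1 + eta) * lambda_max of Q:", (1 + eta) * lam[-1])
print("top eigenvector of Q2 (up to sign):", np.round(m2[:, -1], 3))
print("(m_max; m_max)/sqrt(2):            ",
      np.round(np.concatenate([m[:, -1], m[:, -1]]) / np.sqrt(2), 3))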
Let's now assume that $Q$ is as above for the 1D selectivity example:
$$Q^2 = \begin{pmatrix} Q & \eta Q \\ \eta Q & Q \end{pmatrix}$$
With 2D space included.
2 partially overlapping eyes using natural images
Orientation selectivity and Ocular Dominance
[Figure: PCA-derived receptive fields for the left and right eyes (left and right synapses), with histograms of the number of cells per ocular-dominance bin (1–5).]
What did we learn today?
The Hebb rule is unstable – how can it be stabilized while preserving its properties?
The stabilized Hebb (Oja) rule:
$$\frac{dw}{dt} = \eta\big[Qw - (w^T Q w)\,w\big]$$
This has fixed points where $Qw = \lambda w$ with $|w| = 1$; the only stable fixed point is the eigenvector with $\lambda_{\max}$.