NEURAL NETWORKS AND FUZZY SYSTEMS


NEURONAL DYNAMICS 2:
ACTIVATION MODELS
2002.10.8
Chapter 3. Neuronal Dynamics 2: Activation Models
3.1 Neuronal dynamical system
Neuronal activations change with time. How they change depends on the dynamical equations:

\dot{x} = g(F_X, F_Y, \ldots)   (3-1)

\dot{y} = h(F_X, F_Y, \ldots)   (3-2)
3.1 ADDITIVE NEURONAL DYNAMICS
first-order passive decay model
In the absence of external or neuronal stimuli,
the simplest activation dynamics model is:

\dot{x}_i = -x_i   (3-3)

\dot{y}_j = -y_j   (3-4)
3.1 ADDITIVE NEURONAL DYNAMICS
since for any finite initial condition

x_i(t) = x_i(0) e^{-t}

the membrane potential decays exponentially quickly to its zero potential.
Passive Membrane Decay
The passive-decay rate A_i > 0 scales the rate of decay to the membrane's resting potential:

\dot{x}_i = -A_i x_i

solution:

x_i(t) = x_i(0) e^{-A_i t}

The passive-decay rate measures the cell membrane's resistance or "friction" to current flow.
Property of A_i
The larger the passive-decay rate, the faster the decay, and the less the resistance to current flow.
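As a quick numerical illustration (not part of the original slides), the following Python/NumPy sketch compares a forward-Euler simulation of the passive-decay model with its closed-form solution; the decay rate, initial activation, and step size are illustrative values chosen here.

import numpy as np

# Passive-decay model: dx_i/dt = -A_i * x_i, with closed form x_i(t) = x_i(0) * exp(-A_i * t).
# A_i, x0, dt, and T are illustrative values, not taken from the lecture.
A_i = 2.0          # passive-decay rate (larger -> faster decay, less "resistance")
x0 = 1.5           # finite initial activation x_i(0)
dt, T = 0.01, 3.0  # Euler step size and total simulated time

x = x0
for _ in range(int(T / dt)):
    x += dt * (-A_i * x)          # forward-Euler update of dx/dt = -A_i x

closed_form = x0 * np.exp(-A_i * T)
print(f"Euler: {x:.6f}  closed form: {closed_form:.6f}")
# Both values approach 0: the membrane potential decays exponentially to its zero resting potential.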
Membrane Time Constants
The membrane time constant C_i scales the time variable of the activation dynamical system.
The multiplicative constant model:

C_i \dot{x}_i = -A_i x_i   (3-8)
Solution and property
solution:

x_i(t) = x_i(0) e^{-(A_i / C_i) t}

property:
The smaller the capacitance, the faster things change. As the membrane capacitance increases toward positive infinity, membrane fluctuation slows to a stop.
Membrane Resting Potentials
Definition
Define the resting potential P_i as the activation value to which the membrane potential equilibrates in the absence of external or neuronal inputs:

C_i \dot{x}_i = -A_i x_i + P_i   (3-11)

Solution:

x_i(t) = x_i(0) e^{-(A_i / C_i) t} + \frac{P_i}{A_i} (1 - e^{-(A_i / C_i) t})   (3-12)
Note
The capacitance appears in the exponent of the solution, so it is called the time-scaling capacitance. It does not affect the asymptotic or steady-state solution P_i / A_i, which also does not depend on the finite initial condition.
Additive External Input
Add input
Apply a relatively constant external input to a neuron:

\dot{x}_i = -x_i + I_i   (3-13)

solution:

x_i(t) = x_i(0) e^{-t} + I_i (1 - e^{-t})   (3-14)

Meaning of the input
The input can represent the magnitude of directly experienced sensory information or directly applied control information.
The input changes slowly, so it can be assumed to have a constant value.
3.2 ADDITIVE NEURONAL FEEDBACK
Neurons do not compute alone. Neurons modify their activations with external input and with feedback from one another.
This feedback takes the form of path-weighted signals from synaptically connected neurons.
Synaptic Connection Matrices
Consider n neurons in field F_X and p neurons in field F_Y. The axon of the ith neuron in F_X connects to the jth neuron in F_Y at a synapse m_ij.
The synaptic value m_ij is constant and can be positive, negative, or zero.
Meaning of connection matrix
The synaptic matrix or connection matrix M is an n-by-p matrix of real numbers whose entries are the synaptic efficacies. The ijth synapse is excitatory if m_ij > 0 and inhibitory if m_ij < 0.
The matrix M describes the forward projections from neuron field F_X to neuron field F_Y.
The matrix N describes the backward projections from neuron field F_Y to neuron field F_X.
Bidirectional and Unidirectional Connection Topologies
Bidirectional networks
M and N have the same or approximately the same structure: N = M^T and M = N^T.
Unidirectional network
A neuron field synaptically intraconnects to itself; M is n-by-n.
BAM
If M is symmetric, M = M^T, the unidirectional network is a BAM.
Augmented field and augmented matrix
Augmented field
The augmented field F_Z = F_X \cup F_Y.
If M connects F_X to F_Y and N connects F_Y to F_X, then the augmented field F_Z intraconnects to itself through the square block matrix B:

B = [ 0  M ]
    [ N  0 ]
Augmented field and augmented matrix
In the BAM case, when N = M^T, then B = B^T; hence a BAM symmetrizes an arbitrary rectangular matrix M.
In the general case,

C = [ P  M ]
    [ N  Q ]

where P is an n-by-n matrix and Q is a p-by-p matrix.
The neurons in F_Z are symmetrically intraconnected, C = C^T, if and only if N = M^T, P = P^T, and Q = Q^T.
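The block structure of B is easy to check numerically. A minimal sketch, assuming NumPy and an arbitrary 2-by-3 matrix M chosen only for illustration, builds B with N = M^T and verifies that B is symmetric.

import numpy as np

# Illustrative forward matrix M (n-by-p); values are arbitrary, not from the slides.
M = np.array([[ 1., -2.,  0.],
              [ 3.,  1., -1.]])          # n = 2, p = 3
N = M.T                                   # BAM case: backward matrix N = M^T

n, p = M.shape
B = np.block([[np.zeros((n, n)), M],
              [N, np.zeros((p, p))]])     # augmented square matrix over F_Z = F_X union F_Y

print(np.allclose(B, B.T))                # True: a BAM symmetrizes the rectangular matrix M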
3.3 ADDITIVE ACTIVATION MODELS
Define additive activation model
A system of n + p coupled first-order differential equations defines the additive activation model:

\dot{x}_i = -A_i x_i + \sum_{j=1}^{p} S_j(y_j) n_{ji} + I_i   (3-15)

\dot{y}_j = -A_j y_j + \sum_{i=1}^{n} S_i(x_i) m_{ij} + J_j   (3-16)
Additive autoassociative model
The additive autoassociative model corresponds to a system of n coupled first-order differential equations:

\dot{x}_i = -A_i x_i + \sum_{j=1}^{n} S_j(x_j) m_{ji} + I_i   (3-17)
A special case of the additive autoassociative model
A special case of the additive autoassociative model is

C_i \dot{x}_i = -\frac{x_i}{R_i} + \sum_j \frac{x_j - x_i}{r_{ij}} + I_i   (3-18)

C_i \dot{x}_i = -\frac{x_i}{R'_i} + \sum_j S_j(x_j) m_{ij} + I_i   (3-19)

where R'_i is given by

\frac{1}{R'_i} = \frac{1}{R_i} + \sum_{j}^{n} \frac{1}{r_{ij}}   (3-20)

and r_{ij} measures the cytoplasmic resistance between neurons i and j.
Hopfield circuit and continuous additive bidirectional
associative memories
The Hopfield circuit arises if each neuron has a strictly increasing signal function and if the synaptic connection matrix is symmetric:

C_i \dot{x}_i = -\frac{x_i}{R'_i} + \sum_j S_j(x_j) m_{ij} + I_i   (3-21)

Continuous additive bidirectional associative memories:

\dot{x}_i = -A_i x_i + \sum_{j=1}^{p} S_j(y_j) m_{ij} + I_i   (3-22)

\dot{y}_j = -A_j y_j + \sum_{i=1}^{n} S_i(x_i) m_{ij} + I_j   (3-23)
3.4 ADDITIVE BIVALENT FEEDBACK
Discrete additive activation models correspond to neurons with threshold signal functions.
The neurons can assume only two values: ON and OFF. ON represents the signal value +1; OFF represents 0 or -1.
Bivalent models can represent asynchronous and stochastic behavior.
Bivalent Additive BAM
BAM-bidirectional associative memory
Define a discrete additive BAM with threshold signal functions, arbitrary thresholds and inputs, an arbitrary but constant synaptic connection matrix M, and discrete time steps k:

x_i^{k+1} = \sum_{j=1}^{p} S_j(y_j^k) m_{ij} + I_i   (3-24)

y_j^{k+1} = \sum_{i=1}^{n} S_i(x_i^k) m_{ij} + I_j   (3-25)
Bivalent Additive BAM
Threshold binary signal functions:

S_i(x_i^k) =  1                if x_i^k > U_i
              S_i(x_i^{k-1})   if x_i^k = U_i      (3-26)
              0                if x_i^k < U_i

S_j(y_j^k) =  1                if y_j^k > V_j
              S_j(y_j^{k-1})   if y_j^k = V_j      (3-27)
              0                if y_j^k < V_j

for arbitrary real-valued thresholds U = (U_1, \ldots, U_n) for the F_X neurons and V = (V_1, \ldots, V_p) for the F_Y neurons.
An example of the BAM model
Example
A 4-by-3 matrix M represents the forward synaptic projections from F_X to F_Y.
A 3-by-4 matrix M^T represents the backward synaptic projections from F_Y to F_X.
M =
[ -3   0   2 ]
[  1  -2   0 ]
[  0   3   2 ]
[ -2   1  -1 ]

M^T =
[ -3   1   0  -2 ]
[  0  -2   3   1 ]
[  2   0   2  -1 ]
An example of the BAM model
Suppose at initial time k all the neurons in F_Y are ON, so the signal state vector S(Y_k) at time k is S(Y_k) = (1 1 1).
Input: X_k = (x_1^k, x_2^k, x_3^k, x_4^k) = (5  -2  3  1).
Suppose all thresholds are zero: U_i = V_j = 0.
An example of the BAM model
First: synchronous thresholding of X_k gives the signal state vector
S(X_k) = (1 0 1 1)
Next: at time k+1, these F_X signals pass "forward" through the filter M to affect the activations of the F_Y neurons.
The three F_Y neurons compute three dot products, or correlations: the signal state vector S(X_k) multiplies each of the three columns of M.
An example of the BAM model
The result is:

S(X_k) M = ( \sum_{i=1}^{4} S_i(x_i^k) m_{i1},  \sum_{i=1}^{4} S_i(x_i^k) m_{i2},  \sum_{i=1}^{4} S_i(x_i^k) m_{i3} )
         = ( -5  4  3 )
         = ( y_1^{k+1}  y_2^{k+1}  y_3^{k+1} )
         = Y_{k+1}

Synchronously thresholding gives the new signal state vector S(Y_{k+1}):
S(Y_{k+1}) = (0 1 1)
An example of the BAM model
The signal vector S(Y_{k+1}) passes "backward" through the synaptic filter M^T at time k+2:

S(Y_{k+1}) M^T = ( 2  -2  5  0 )
               = ( x_1^{k+2}  x_2^{k+2}  x_3^{k+2}  x_4^{k+2} )
               = X_{k+2}

Synchronously thresholding gives the new signal state vector:
S(X_{k+2}) = (1 0 1 1) = S(X_k)
An example of the BAM model
Since S(X_{k+2}) = S(X_k), it follows that S(Y_{k+3}) = S(Y_{k+1}).
Conclusion
These same two signal state vectors will pass back and forth in bidirectional equilibrium forever, or until new inputs perturb the system out of equilibrium.
An example of the BAM model
Asynchronous state changes may lead to a different bidirectional equilibrium.
Keep the first F_Y neuron ON and only update the second and third F_Y neurons. At time k, all F_Y neurons are ON.

Y_{k+1} = S(X_k) M = ( -5  4  3 )

The new signal state vector at time k+1 equals:
S(Y_{k+1}) = (1 1 1)
An example of the BAM model
The new F_X activation state vector equals:

X_{k+2} = S(Y_{k+1}) M^T = ( -1  -1  5  -2 )

Synchronous thresholding gives:
S(X_{k+2}) = (0 0 1 0)

Passing this vector forward to F_Y gives:

Y_{k+3} = S(X_{k+2}) M = ( 0  3  2 )
S(Y_{k+3}) = (1 1 1) = S(Y_{k+1})
An example of the BAM model
Similarly, S(X_{k+4}) = S(X_{k+2}) = (0 0 1 0) for any asynchronous state-change policy we apply to the F_X neurons.
The system has reached a new equilibrium: the binary pair ((0 0 1 0), (1 1 1)) represents a fixed point of the system.
Conclusion
Different subset asynchronous state-change policies applied to the same data need not produce the same fixed-point equilibrium, though they tend to produce the same equilibria.
All BAM state changes lead to fixed-point stability.
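A short NumPy sketch (not part of the original lecture) can reproduce the synchronous portion of this example. The matrix M below is the 4-by-3 matrix reconstructed above, and the threshold rule keeps the previous signal when an activation equals the zero threshold, as in (3-26) and (3-27).

import numpy as np

M = np.array([[-3,  0,  2],
              [ 1, -2,  0],
              [ 0,  3,  2],
              [-2,  1, -1]])

def threshold(act, prev):
    # Binary threshold with zero thresholds: keep the previous signal on a tie, per (3-26)/(3-27).
    new = prev.copy()
    new[act > 0] = 1
    new[act < 0] = 0
    return new

X_k = np.array([5, -2, 3, 1])
S_X = threshold(X_k, np.ones(4, dtype=int))     # S(X_k) = (1 0 1 1)
S_Y = np.ones(3, dtype=int)                     # all F_Y neurons start ON

for _ in range(4):                              # ping-pong until the pair stops changing
    S_Y = threshold(S_X @ M, S_Y)               # forward pass through M
    S_X = threshold(S_Y @ M.T, S_X)             # backward pass through M^T

print(S_X, S_Y)   # settles at the bidirectional fixed point (1 0 1 1), (0 1 1)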
Bidirectional Stability
Definition
A BAM system (F_X, F_Y, M) is bidirectionally stable if all inputs converge to fixed-point equilibria.
A denotes a binary n-vector in {0, 1}^n; B denotes a binary p-vector in {0, 1}^p.
Bidirectional Stability
A BAM system equilibrates to a bidirectional fixed point (A_f, B_f) as follows:

A    --M-->    B
A'   <--M^T--  B
A'   --M-->    B'
A''  <--M^T--  B'
     ...
A_f  --M-->    B_f
A_f  <--M^T--  B_f
Lyapunov Functions
A Lyapunov function L maps system state variables to real numbers and decreases with time. In the BAM case, L maps the bivalent product space to real numbers.
Suppose L is sufficiently differentiable to apply the chain rule:

\dot{L} = \sum_{i}^{n} \frac{\partial L}{\partial x_i} \frac{dx_i}{dt} = \sum_i \frac{\partial L}{\partial x_i} \dot{x}_i   (3-28)
Lyapunov Functions
The quadratic choice of L:

L = \frac{1}{2} x I x^T = \frac{1}{2} \sum_i x_i^2   (3-29)

Suppose the dynamical system describes the passive decay system:

\dot{x}_i = -x_i   (3-30)

with solution

x_i(t) = x_i(0) e^{-t}   (3-31)
Lyapunov Functions
The partial derivative of the quadratic L:

\frac{\partial L}{\partial x_i} = x_i   (3-32)

so along trajectories

\dot{L} = \sum_i x_i \dot{x}_i   (3-33)
        = -\sum_i \dot{x}_i^2   (3-34)   or   \dot{L} = -\sum_i x_i^2   (3-35)

In either case \dot{L} \le 0.   (3-36)

At equilibrium \dot{L} = 0. This occurs if and only if all velocities equal zero: \dot{x}_i = 0.
Conclusion
A dynamical system is stable if some Lyapunov function L decreases along trajectories: \dot{L} \le 0.
A dynamical system is asymptotically stable if L strictly decreases along trajectories: \dot{L} < 0.
Monotonicity of a Lyapunov function provides a sufficient, not necessary, condition for stability and asymptotic stability.
Linear system stability
For a symmetric matrix A and a square matrix B, the quadratic form L = x A x^T behaves as a strictly decreasing Lyapunov function for the linear dynamical system \dot{x} = x B if and only if the matrix A B^T + B A is negative definite:

\dot{L} = \dot{x} A x^T + x A \dot{x}^T
        = x B A x^T + x A B^T x^T
        = x [A B^T + B A] x^T
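One can test this condition numerically. The sketch below, with an arbitrary stable matrix B and the simple choice A = I (both illustrative, not from the text), checks whether A B^T + B A has only negative eigenvalues, i.e. is negative definite.

import numpy as np

# Quadratic form L = x A x^T with symmetric A; linear system dx/dt = x B.
A = np.eye(2)                                    # simple symmetric choice (illustrative)
B = np.array([[-1.0,  2.0],
              [ 0.0, -3.0]])                     # illustrative system matrix

Q = A @ B.T + B @ A                              # L-dot = x [A B^T + B A] x^T
eigenvalues = np.linalg.eigvalsh((Q + Q.T) / 2)  # symmetrize before taking eigenvalues
print(eigenvalues, bool(np.all(eigenvalues < 0)))  # all negative -> L strictly decreases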
The relation between convergence rate and eigenvalue sign
A general theorem in dynamical systems theory relates convergence rate and eigenvalue sign:
A nonlinear dynamical system converges exponentially quickly if its system Jacobian matrix has eigenvalues with negative real parts. Locally, such a nonlinear system behaves linearly.
A Lyapunov function summarizes total system behavior: \dot{L} \le 0.
A Lyapunov function often measures the energy of a physical system, so \dot{L} \le 0 represents the decrease of system energy along the trajectories of the dynamical system.
Potential energy function represented by quadratic form
Consider a system of n variables and its potential-energy function E. Suppose the coordinate x_i measures the displacement from equilibrium of the ith unit. The energy depends only on the coordinates x_i, so E = E(x_1, \ldots, x_n).
Since E is a physical quantity, we assume it is sufficiently smooth to permit a multivariable Taylor-series expansion about the origin:

E = E(0, \ldots, 0) + \sum_i \frac{\partial E}{\partial x_i} x_i + \frac{1}{2} \sum_i \sum_j \frac{\partial^2 E}{\partial x_i \partial x_j} x_i x_j + \frac{1}{3!} \sum_i \sum_j \sum_k \frac{\partial^3 E}{\partial x_i \partial x_j \partial x_k} x_i x_j x_k + \cdots

  \approx \frac{1}{2} x A x^T   (3-42)

where A is symmetric, since

a_{ij} = \frac{\partial^2 E}{\partial x_i \partial x_j} = \frac{\partial^2 E}{\partial x_j \partial x_i} = a_{ji}
The reason for (3-42) follows.
First, we defined the origin as an equilibrium of zero potential energy, so E(0, \ldots, 0) = 0.
Second, the origin is an equilibrium only if all first partial derivatives equal zero.
Third, we can neglect higher-order terms for small displacements, since we assume the higher-order products are smaller than the quadratic products.

Conclusion:
Bounded decreasing Lyapunov functions provide an intuitive way to describe global "computations" in neural networks and other dynamical systems.
Bivalent BAM theorem
The average signal energy L of the forward pass of the F_X signal state vector S(X) through M, and the backward pass of the F_Y signal state vector S(Y) through M^T, is:

L = -\frac{S(X) M S(Y)^T + S(Y) M^T S(X)^T}{2}

Since S(Y) M^T S(X)^T = [S(Y) M^T S(X)^T]^T = S(X) M S(Y)^T,

L = -S(X) M S(Y)^T = -\sum_i^n \sum_j^p S_i(x_i) S_j(y_j) m_{ij}
Lower bound of Lyapunov function
The signal-energy Lyapunov function is clearly bounded below. For binary or bipolar signals, the matrix coefficients define the attainable lower bound:

L \ge -\sum_i \sum_j |m_{ij}|

The attainable upper bound is the negative of this expression.
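The bound is easy to verify exhaustively for a small BAM. A minimal sketch, assuming NumPy and the 4-by-3 example matrix M used earlier, enumerates all binary signal states and checks that the energy L = -S(X) M S(Y)^T never falls below -sum |m_ij|.

import numpy as np
from itertools import product

M = np.array([[-3,  0,  2],
              [ 1, -2,  0],
              [ 0,  3,  2],
              [-2,  1, -1]])

bound = -np.abs(M).sum()                                   # attainable lower bound -sum |m_ij|
energies = [-np.array(sx) @ M @ np.array(sy)               # L = -S(X) M S(Y)^T
            for sx in product((0, 1), repeat=4)
            for sy in product((0, 1), repeat=3)]

print(min(energies), bound)            # the minimum energy stays above the bound
print(min(energies) >= bound)          # True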
Lyapunov function for the general BAM system
The signal-energy Lyapunov function for the general BAM system takes the form

L = -S(X) M S(Y)^T - S(X)[I - U]^T - S(Y)[J - V]^T

for inputs I = [I_1, \ldots, I_n] and J = [J_1, \ldots, J_p] and constant vectors of thresholds U = [U_1, \ldots, U_n] and V = [V_1, \ldots, V_p].
The attainable bound of this function is

L \ge -\sum_i \sum_j |m_{ij}| - \sum_i |I_i - U_i| - \sum_j |J_j - V_j|
Bivalent BAM theorem
Bivalent BAM theorem. Every matrix is bidirectionally stable for synchronous or asynchronous state changes.
Proof. Consider the signal state changes that occur from time k to time k+1. Define the vectors of signal state changes as:

\Delta S(X) = S(X_{k+1}) - S(X_k) = (\Delta S_1(x_1), \ldots, \Delta S_n(x_n))
\Delta S(Y) = S(Y_{k+1}) - S(Y_k) = (\Delta S_1(y_1), \ldots, \Delta S_p(y_p))
Bivalent BAM theorem
Define the individual state changes as:

\Delta S_i(x_i) = S_i(x_i^{k+1}) - S_i(x_i^k)
\Delta S_j(y_j) = S_j(y_j^{k+1}) - S_j(y_j^k)

We assume at least one neuron changes state from time k to time k+1. Any subset of neurons in a field can change state, but in only one field at a time.
For binary threshold signal functions, a nonzero state change satisfies:
Bivalent BAM theorem

\Delta S_i(x_i) = 1 - 0 = 1   or   \Delta S_i(x_i) = 0 - 1 = -1

For bipolar threshold signal functions,

\Delta S_i(x_i) = 2   or   \Delta S_i(x_i) = -2

The "energy" change

\Delta L = L_{k+1} - L_k

differs from zero because of changes in field F_X or in field F_Y.
Bivalent BAM theorem
For a state change in F_X,

\Delta L = -\Delta S(X) M S(Y_k)^T - \Delta S(X)[I - U]^T
         = -\Delta S(X)[S(Y_k) M^T + I - U]^T
         = -\sum_i \sum_j \Delta S_i(x_i) S_j(y_j^k) m_{ij} - \sum_i \Delta S_i(x_i) I_i + \sum_i \Delta S_i(x_i) U_i
         = -\sum_i \Delta S_i(x_i) [ \sum_j S_j(y_j^k) m_{ij} + I_i - U_i ]
         = -\sum_i \Delta S_i(x_i) [ x_i^{k+1} - U_i ]
         < 0
Bivalent BAM theorem
Suppose S i ( xi )  0
Then Si ( xi )  Si ( xi
k 1
)  Si ( xi k )
 1 0
k 1
This implies xi  U i so the product is positive:
Si ( xi )[ xik 1  U i ]  0
Another case suppose S i ( xi )  0
Si ( xi )  Si ( xi
k 1
 01
)  Si ( xi )
k
Bivalent BAM theorem
k 1
This implies xi
 Ui
so the product is positive:
Si ( xi )[ xik 1  U i ]  0
So L_{k+1} - L_k < 0 for every state change.
Since L is bounded, L behaves as a Lyapunov function for the additive BAM dynamical system defined above.
Since the matrix M was arbitrary, every matrix is bidirectionally stable. This proves the bivalent BAM theorem.
Property of globally stable dynamical system
Two insights about the rate of convergence:
First, the individual energies decrease nontrivially. The BAM system does not creep arbitrarily slowly down toward the nearest local minimum; the system takes definite hops into the basin of attraction of the fixed point.
Second, a synchronous BAM tends to converge faster than an asynchronous BAM. In other words, asynchronous updating should take more iterations to converge.
Review
1.Neuronal Dynamical Systems
We describe neuronal dynamical systems by first-order differential or difference equations that govern the time evolution of the neuronal activations or membrane potentials:

\dot{x} = g(F_X, F_Y, \ldots)
\dot{y} = h(F_X, F_Y, \ldots)
Review
4.Additive activation models
\dot{x}_i = -A_i x_i + \sum_{j=1}^{p} S_j(y_j) n_{ji} + I_i

\dot{y}_j = -A_j y_j + \sum_{i=1}^{n} S_i(x_i) m_{ij} + J_j

Hopfield circuit:
1. Additive autoassociative model;
2. Strictly increasing bounded signal function (S' > 0);
3. Synaptic connection matrix is symmetric (M = M^T).

C_i \dot{x}_i = -\frac{x_i}{R_i} + \sum_j S_j(x_j) m_{ji} + I_i
Review
5.Additive bivalent models
x_i^{k+1} = \sum_{j}^{p} S_j(y_j^k) m_{ij} + I_i

y_j^{k+1} = \sum_{i}^{n} S_i(x_i^k) m_{ij} + I_j

Lyapunov functions:
If we cannot find a Lyapunov function, nothing follows;
if we can find a Lyapunov function, stability holds.
Review
A dynamical system is
stable if \dot{L} \le 0;
asymptotically stable if \dot{L} < 0.
Monotonicity of a Lyapunov function is a sufficient, not necessary, condition for stability and asymptotic stability.
Review
Bivalent BAM theorem.
Every matrix is bidirectionally stable for synchronous or
asynchronous state changes.
• Synchronous: update an entire field of neurons at a time.
• Simple asynchronous: only one neuron makes a state-change decision at a time.
• Subset asynchronous: one subset of neurons per field makes state-change decisions at a time.
Chapter 3. Neural Dynamics II:Activation Models
The most popular method for constructing M is the bipolar Hebbian or outer-product learning method.
Binary vector associations: (A_i, B_i); bipolar vector associations: (X_i, Y_i), for i = 1, 2, \ldots, m, with

A_i = \frac{1}{2}[X_i + 1]
X_i = 2 A_i - 1
Chapter 3. Neural Dynamics II:Activation Models
The binary outer-product law:

M = \sum_{k}^{m} A_k^T B_k

The bipolar outer-product law:

M = \sum_{k}^{m} X_k^T Y_k

The Boolean outer-product law combines the outer products A_k^T B_k with a componentwise maximum:

m_{ij} = \max(a_i^1 b_j^1, \ldots, a_i^m b_j^m)
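A minimal sketch of the bipolar outer-product law, assuming NumPy; A and B below are small illustrative binary patterns (not the ones from the later example), and X, Y are their bipolar versions X = 2A - 1.

import numpy as np

# Illustrative binary associations: each row of A pairs with the corresponding row of B.
A = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0]])        # m = 2 patterns, n = 4
B = np.array([[1, 0],
              [0, 1]])              # p = 2

X = 2 * A - 1                       # bipolar versions: X_k = 2 A_k - 1
Y = 2 * B - 1

# Bipolar outer-product (Hebbian) law: M = sum_k X_k^T Y_k, written as one matrix product.
M = X.T @ Y
print(M)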
Chapter 3. Neural Dynamics II:Activation Models
The weighted outer-product law:

M = \sum_{k}^{m} w_k X_k^T Y_k

where \sum_{k}^{m} w_k = 1 holds.

In matrix notation:

M = X^T W Y

where

X^T = [X_1^T | \cdots | X_m^T]
Y^T = [Y_1^T | \cdots | Y_m^T]
W = \mathrm{Diagonal}[w_1, \ldots, w_m]
Chapter 3. Neural Dynamics II:Activation Models
※3.6.1 Optimal Linear Associative Memory Matrices
Optimal linear associative memory (OLAM) matrices:

M = X^* Y

The pseudo-inverse matrix X^* of X satisfies:

X X^* X = X
X^* X X^* = X^*
X^* X = (X^* X)^T
X X^* = (X X^*)^T
Chapter 3. Neural Dynamics II:Activation Models
※3.6.1 Optimal Linear Associative Memory Matrices
The pseudo-inverse matrix X^* of X in special cases:
If x is a nonzero scalar: x^* = 1/x.
If x is a nonzero vector: x^* = \frac{x^T}{x x^T}.
If x is a zero scalar or zero vector: x^* = 0.
For a rectangular matrix X, if (X X^T)^{-1} exists:

X^* = X^T (X X^T)^{-1}
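In practice the pseudo-inverse is computed numerically. A short sketch, assuming NumPy and small illustrative data matrices X and Y (each row is one stored pattern):

import numpy as np

# Stacked signal vectors: each row of X is one X_k, each row of Y its associated Y_k (illustrative data).
X = np.array([[1., -1.,  1., -1.],
              [1.,  1., -1., -1.]])          # m = 2 patterns of dimension n = 4
Y = np.array([[1., -1.],
              [1.,  1.]])

X_pinv = np.linalg.pinv(X)                   # Moore-Penrose pseudo-inverse X*
M = X_pinv @ Y                               # OLAM matrix M = X* Y

print(np.allclose(X @ M, Y))                 # forward recall X M reproduces Y exactly here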
Chapter 3. Neural Dynamics II:Activation Models
※3.6.1 Optimal Linear Associative Memory Matrices
Define the matrix Euclidean norm of M as

\|M\| = \sqrt{\mathrm{Trace}(M M^T)}

To minimize the mean-squared error of forward recall, we seek the matrix \hat{M} that satisfies the relation

\|Y - X \hat{M}\| \le \|Y - X M\|   for all M
Chapter 3. Neural Dynamics II:Activation Models
※3.6.1 Optimal Linear Associative Memory Matrices
1
X
Suppose further that the inverse matrix
exists.
Then
0 0
 Y Y
 Y - XX -1Y
ˆ  X 1Y
So the OLAM matrix Mˆ correspond to M
2002.10.8
Chapter 3. Neural Dynamics II:Activation Models
If the set of vectors \{X_1, \ldots, X_m\} is orthonormal,

X_i X_j^T = 1 if i = j,   0 if i \ne j

then the OLAM matrix reduces to the classical linear associative memory (LAM):

\hat{M} = X^T Y

because for orthonormal X, the inverse of X is X^T.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
Autoassociative OLAM systems behave as linear filters. In the autoassociative case the OLAM matrix encodes only the known signal vectors x_i. Then the OLAM matrix equation (3-78) reduces to

M = X^* X

M linearly "filters" the input measurement x to the output vector x' by vector-matrix multiplication: x M = x'.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
The OLAM matrix X^* X behaves as a projection operator [Sorenson, 1980]. Algebraically, this means the matrix M is idempotent: M^2 = M.
Since matrix multiplication is associative, the pseudo-inverse property (3-80) implies idempotency of the autoassociative OLAM matrix M:

M^2 = M M = X^* X X^* X = (X^* X X^*) X = X^* X = M
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
Then (3-80) also implies that the additive dual matrix I - X^* X behaves as a projection operator:

(I - X^* X)^2 = (I - X^* X)(I - X^* X)
             = I^2 - X^* X - X^* X + X^* X X^* X
             = I - 2 X^* X + (X^* X X^*) X
             = I - 2 X^* X + X^* X
             = I - X^* X

We can represent a projection matrix M as the mapping M : R^n \to L.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
The Pythagorean theorem underlies projection operators.
The known signal vectors X_1, \ldots, X_m span some unique linear subspace L(X_1, \ldots, X_m) of R^n.
L equals \{\sum_i^m c_i X_i : c_i \in R\}, the set of all linear combinations of the m known signal vectors.
L^\perp denotes the orthogonal complement space \{x \in R^n : x y^T = 0 for all y \in L\}, the set of all real n-vectors x orthogonal to every n-vector y in L.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
1. The operator X^* X projects R^n onto L.
2. The dual operator I - X^* X projects R^n onto L^\perp.
The projection operators X^* X and I - X^* X uniquely decompose every vector x in R^n into a summed signal vector \hat{x} and a noise or novelty vector \tilde{x}:

x = x X^* X + x (I - X^* X) = \hat{x} + \tilde{x}

[Figure: decomposition of x into the projection \hat{x} in L and the novelty vector \tilde{x} in L^\perp]
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
The unique additive decomposition x = \hat{x} + \tilde{x} obeys a generalized Pythagorean theorem:

\|x\|^2 = \|\hat{x}\|^2 + \|\tilde{x}\|^2

where \|x\|^2 = x_1^2 + \cdots + x_n^2 defines the squared Euclidean or l^2 norm.
Kohonen [1988] calls I - X^* X the novelty filter on R^n.
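A numerical sketch of the projection/novelty decomposition, using NumPy's pseudo-inverse; the stored vectors and the measured input below are illustrative values.

import numpy as np

# Two stored signal vectors (rows of X) spanning a subspace L of R^4 (illustrative values).
X = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.]])
P = np.linalg.pinv(X) @ X            # projection operator X* X onto L
N = np.eye(4) - P                    # novelty filter I - X* X onto the orthogonal complement

x = np.array([2., -1., 0., 3.])      # measured input vector
x_hat, x_tilde = x @ P, x @ N        # signal part and novelty part

print(np.allclose(x, x_hat + x_tilde))                          # x = x_hat + x_tilde
print(np.allclose(x @ x, x_hat @ x_hat + x_tilde @ x_tilde))    # generalized Pythagorean theorem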
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
The projection \hat{x} measures what we know about the input x relative to the stored signal vectors X_1, \ldots, X_m:

\hat{x} = \sum_i^m c_i X_i

for some constant vector (c_1, \ldots, c_m).
The novelty vector \tilde{x} measures what is maximally unknown or novel in the measured input signal x.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
Suppose we model a random measurement vector x as a random signal vector x_s corrupted by an additive, independent random-noise vector x_N:

x = x_s + x_N

We can estimate the unknown signal x_s as the OLAM-filtered output \hat{x} = x X^* X.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
Kohonen [1988] has shown that if the multivariable noise distribution is radially symmetric, such as a multivariable Gaussian distribution, then the OLAM capacity m and pattern dimension n scale the variance of the random-variable estimator-error norm \|\hat{x} - x_s\|:

V[\|\hat{x} - x_s\|] = \frac{m}{n} \|x - x_s\|^2 = \frac{m}{n} \|x_N\|^2
Chapter 3. Neural Dynamics II:Activation Models
※3.6.2 Autoassociative OLAM Filtering
1. The autoassociative OLAM filter suppresses noise if m < n, when the memory capacity does not exceed the signal dimension.
2. The OLAM filter amplifies noise if m > n, when the capacity exceeds the dimension.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
The above data-dependent encoding schemes add
outer-product correlation matrices.
The following example illustrates a complete nonlinear
feedback neural network in action,with data deliberately
encoded into the system dynamics.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Suppose the data consists of two unweighted (w_1 = w_2 = 1) binary associations (A_1, B_1) and (A_2, B_2) defined by the nonorthogonal binary signal vectors:

A_1 = (1 0 1 0 1 0)    B_1 = (1 1 0 0)
A_2 = (1 1 1 0 0 0)    B_2 = (1 0 1 0)
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
These binary associations correspond to the two bipolar associations (X_1, Y_1) and (X_2, Y_2) defined by the bipolar signal vectors:

X_1 = (1 -1 1 -1 1 -1)    Y_1 = (1 1 -1 -1)
X_2 = (1 1 1 -1 -1 -1)    Y_2 = (1 -1 1 -1)
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
We compute the BAM memory matrix M by adding the bipolar correlation matrices X_1^T Y_1 and X_2^T Y_2 pointwise. The first correlation matrix X_1^T Y_1 equals

X_1^T Y_1 = (1 -1 1 -1 1 -1)^T (1 1 -1 -1) =
[  1   1  -1  -1 ]
[ -1  -1   1   1 ]
[  1   1  -1  -1 ]
[ -1  -1   1   1 ]
[  1   1  -1  -1 ]
[ -1  -1   1   1 ]
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Observe that the ith row of the correlation matrix X_1^T Y_1 equals the bipolar vector Y_1 multiplied by the ith element of X_1; the jth column behaves similarly. So X_2^T Y_2 equals

X_2^T Y_2 =
[  1  -1   1  -1 ]
[  1  -1   1  -1 ]
[  1  -1   1  -1 ]
[ -1   1  -1   1 ]
[ -1   1  -1   1 ]
[ -1   1  -1   1 ]
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Adding these matrices pointwise gives M:

M = X_1^T Y_1 + X_2^T Y_2 =
[  2   0   0  -2 ]
[  0  -2   2   0 ]
[  2   0   0  -2 ]
[ -2   0   0   2 ]
[  0   2  -2   0 ]
[ -2   0   0   2 ]
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Suppose, first, that we use binary state vectors and all update policies are synchronous. Suppose we present the binary vector A_1 as input to the system, as the current signal state vector at F_X. Then applying the threshold law (3-26) synchronously gives

A_1 M = (4  2  -2  -4)  →  (1 1 0 0) = B_1
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Passing B_1 through the backward filter M^T and applying the threshold law synchronously gives back A_1:

B_1 M^T = (2  -2  2  -2  2  -2)  →  (1 0 1 0 1 0) = A_1

So (A_1, B_1) is a fixed point of the BAM dynamical system. It has Lyapunov "energy" L(A_1, B_1) = -A_1 M B_1^T = -6, which equals the backward value -B_1 M^T A_1^T = -6.
(A_2, B_2) gives the similar result: a fixed point with energy -A_2 M B_2^T = -6.
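The whole encoding-and-recall example can be checked in a few lines. The sketch below (assuming NumPy) rebuilds M from the bipolar pairs and verifies the recall of B_1 from A_1 and of A_1 from B_1 with zero thresholds.

import numpy as np

X = np.array([[1, -1, 1, -1, 1, -1],      # X_1
              [1,  1, 1, -1, -1, -1]])    # X_2
Y = np.array([[1,  1, -1, -1],            # Y_1
              [1, -1,  1, -1]])           # Y_2

M = X.T @ Y                               # M = X_1^T Y_1 + X_2^T Y_2

A1 = np.array([1, 0, 1, 0, 1, 0])
B1 = (A1 @ M > 0).astype(int)             # forward recall with zero thresholds (no ties occur here)
A1_back = (B1 @ M.T > 0).astype(int)      # backward recall through M^T

print(B1, A1_back)                        # (1 1 0 0) and (1 0 1 0 1 0): the pair (A1, B1) is a fixed point
print(-A1 @ M @ B1)                       # Lyapunov "energy" -A1 M B1^T = -6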
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
So the two deliberately encoded fixed points reside in
equally “deep” attractors.
Hamming distance H equals the l^1 distance. H(A_i, A_j) counts the number of slots in which the binary vectors A_i and A_j differ:

H(A_i, A_j) = \sum_k^n |a_k^i - a_k^j|
Chapter 3. Neural Dynamics II:Activation Models
※3.6.3 BAM Correlation Encoding Example
Consider for example the input A = (0 1 1 0 0 0), which differs from A_2 by 1 bit, so H(A, A_2) = 1. Recall that A_2 = (1 1 1 0 0 0). Then

A M = (2  -2  2  -2)  →  (1 0 1 0) = B_2

Fig. 3.2 shows that the BAM returns the original stored association despite the noise.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.4 Memory Capacity:Dimensionality Limits Capacity
Synaptic connection matrices encode limited
information.
As we sum more correlation matrices, |m_ij| >> 1 holds more frequently. After a point, adding additional associations (A_k, B_k) does not significantly change the connection matrix, and the system "forgets" some patterns.
This limits the memory capacity.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.4 Memory Capacity:Dimensionality Limits Capacity
Grossberg’s sparse coding theorem [1976] says , for
deterministic encoding ,that pattern dimensionality must
exceed pattern number to prevent learning some patterns
at the expense of forgetting others.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.5 The Hopfield Model
The Hopfield model illustrates an autoassociative additive
bivalent BAM operated serially with simple asynchronous
state changes.
Autoassociativity means the network topology reduces to only one field, F_X, of neurons: F_X = F_Y. The synaptic connection matrix M symmetrically intraconnects the n neurons in field F_X: M = M^T, or m_ij = m_ji.
Chapter 3. Neural Dynamics II:Activation Models
※3.6.5 The Hopfield Model
The autoassociative version of Equation (3-24) describes the additive neuronal activation dynamics:

x_i^{k+1} = \sum_j S_j(x_j^k) m_{ji} + I_i   (3-87)

for constant input I_i, with threshold signal function

S_i(x_i^{k+1}) =  1              if x_i^{k+1} > U_i
                  S_i(x_i^k)     if x_i^{k+1} = U_i      (3-88)
                  0              if x_i^{k+1} < U_i
Chapter 3. Neural Dynamics II:Activation Models
※3.6.5 The Hopfield Model
We precompute the Hebbian synaptic connection matrix M by summing bipolar outer-product (autocorrelation) matrices and zeroing the main diagonal:

M = \sum_{k=1}^{m} X_k^T X_k - m I   (3-89)

where I denotes the n-by-n identity matrix.
Zeroing the main diagonal tends to improve recall accuracy by helping the system transfer function S(XM) behave less like the identity operator.
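A minimal sketch of the Hopfield encoding rule (3-89) and one simple-asynchronous update sweep, assuming NumPy, bipolar stored patterns, and zero thresholds; the patterns and the noisy probe vector are illustrative.

import numpy as np

# Illustrative bipolar patterns to store (rows), n = 6.
Xs = np.array([[ 1, -1,  1, -1,  1, -1],
               [ 1,  1, -1, -1,  1,  1]])
m, n = Xs.shape

M = Xs.T @ Xs - m * np.eye(n)          # sum of autocorrelation matrices with zeroed diagonal (3-89)

def async_sweep(state, M, U=0.0):
    # One simple-asynchronous sweep: update one bipolar neuron at a time with threshold U.
    s = state.copy()
    for i in range(len(s)):
        act = s @ M[:, i]
        if act > U:
            s[i] = 1
        elif act < U:
            s[i] = -1                  # on a tie (act == U) the neuron keeps its previous state
    return s

probe = np.array([ 1, -1,  1, -1,  1,  1])   # noisy version of the first stored pattern
print(async_sweep(probe, M))                 # recovers the first stored pattern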
Chapter 3. Neural Dynamics II:Activation Models
※3.7 Additive dynamics and the noise-saturation dilemma
Grossberg’s Saturation Theorem
Grossberg’s Saturation theorem states that additive
activation models saturate for large inputs, but
multiplicative models do not .
Chapter 3. Neural Dynamics II:Activation Models
The stationary "reflectance pattern" P = (p_1, \ldots, p_n) confronts the system amid the background illumination I(t), where p_i \ge 0 and p_1 + \cdots + p_n = 1.
The ith neuron receives input I_i. The convex coefficient p_i defines the "reflectance" of I_i:

I_i = p_i I

A: the passive-decay rate.
[0, B]: the activation bound.
Chapter 3. Neural Dynamics II:Activation Models
Additive Grossberg model:

\dot{x}_i = -A x_i + (B - x_i) I_i = -(A + I_i) x_i + B I_i

We can solve this linear differential equation to yield

x_i(t) = x_i(0) e^{-(A + I_i) t} + \frac{B I_i}{A + I_i} [1 - e^{-(A + I_i) t}]

For initial condition x_i(0) = 0, as time increases the activation converges to its steady-state value:

x_i = \frac{B I_i}{A + I_i} \to B   as I \to \infty
Chapter 3. Neural Dynamics II:Activation Models
So the additive model saturates.
Multiplicative activation model:

\dot{x}_i = -A x_i + (B - x_i) I_i - x_i \sum_{j \ne i} I_j
          = -(A + I_i + \sum_{j \ne i} I_j) x_i + B I_i
          = -(A + I) x_i + B I_i
Chapter 3. Neural Dynamics II:Activation Models
For initial condition x_i(0) = 0, the solution to this differential equation becomes

x_i(t) = p_i B \frac{I}{A + I} (1 - e^{-(A + I) t})

As time increases, the neuron reaches steady state exponentially fast:

x_i = p_i B \frac{I}{A + I} \to p_i B   as I \to \infty   (3-96)
Chapter 3. Neural Dynamics II:Activation Models
This proves the Grossberg saturation theorem:
Additive models saturate ,multiplicative
models do not.
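A quick numerical comparison of the two steady states as the background intensity I grows; the decay rate A, bound B, and reflectance p_i below are illustrative constants, not values from the lecture.

import numpy as np

A, B, p_i = 1.0, 1.0, 0.2                 # illustrative decay rate, bound, and reflectance
I = np.array([1.0, 10.0, 100.0, 1000.0])  # growing background illumination
I_i = p_i * I

additive = B * I_i / (A + I_i)            # additive steady state -> B (saturates, loses the pattern p_i)
shunting = p_i * B * I / (A + I)          # multiplicative steady state -> p_i * B (keeps the pattern)

print(additive)                           # approaches 1.0 = B regardless of p_i
print(shunting)                           # approaches 0.2 = p_i * B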
Chapter 3. Neural Dynamics II:Activation Models
In general the activation variable x_i can assume negative values. Then the operating range equals [-C_i, B_i] for C_i > 0. In the neurobiological literature the lower bound -C_i is usually smaller in magnitude than the upper bound B_i: C_i < B_i.
This leads to the slightly more general shunting activation model:

\dot{x}_i = -A x_i + (B - x_i) I_i - (C + x_i) \sum_{j \ne i} I_j
Chapter 3. Neural Dynamics II:Activation Models
Setting the right-hand side of the above equation to zero, we get the equilibrium activation value:

x_i = [ p_i - \frac{C}{B + C} ] \frac{(B + C) I}{A + I}

which reduces to (3-96) if C = 0.
If p_i \ge \frac{C}{B + C}, the neuron generates nonnegative activations.
Chapter 3. Neural Dynamics II:Activation Models
※3.8 General Neuronal Activations:Cohen-Grossberg and
multiplicative models
Consider the symmetric unidirectional or autoassociative case when F_X = F_Y, M = M^T, and M is constant. Then a neural network possesses Cohen-Grossberg [1983] activation dynamics if its activation equations have the form

\dot{x}_i = -a_i(x_i) [ b_i(x_i) - \sum_{j=1}^{n} S_j(x_j) m_{ij} ]   (3-102)

The nonnegative function a_i(x_i) \ge 0 represents an abstract amplification function.
Chapter 3. Neural Dynamics II:Activation Models
Grossberg[1988]has also shown that (3-102) reduces to the
additive brain-state-in-a-box model of Anderson[1977,1983]
and the shunting masking-field model [Cohen,1987] upon
appropriate change of variables.
Chapter 3. Neural Dynamics II:Activation Models
If a_i = 1/C_i, b_i = (x_i / R_i) - I_i, S_i(x_i) = g_i(x_i) = V_i, and constant m_{ij} = m_{ji} = T_{ij} = T_{ji}, where C_i and R_i are positive constants, and the input I_i is constant or varies slowly relative to fluctuations in x_i, then (3-102) reduces to the Hopfield circuit [1984]:

C_i \dot{x}_i = -\frac{x_i}{R_i} + \sum_j V_j T_{ij} + I_i

An autoassociative network has shunting or multiplicative activation dynamics when the amplification function a_i is linear and b_i is nonlinear.
Chapter 3. Neural Dynamics II:Activation Models
For instance, if a_i = -x_i, m_ii = 1 (self-excitation in lateral inhibition), and

b_i = \frac{1}{x_i} [ -A_i x_i + B_i(S_i + I_i) - x_i(S_i + I_i) - C_i( \sum_{j \ne i} S_j m_{ij} + I_i ) ]

then (3-104) describes the distance-dependent (m_ij = m_ji) unidirectional shunting network:

\dot{x}_i = -A_i x_i + (B_i - x_i)[S_i(x_i) + I_i] - (C_i + x_i)[ \sum_{j \ne i} S_j(x_j) m_{ij} + I_i ]
Chapter 3. Neural Dynamics II:Activation Models
Hodgkin-Huxley membrane equation:

c \frac{\partial V_i}{\partial t} = (V^p - V_i) g_i^p + (V^+ - V_i) g_i^+ + (V^- - V_i) g_i^-

V^p, V^+, and V^- denote respectively the passive (chloride Cl^-), excitatory (sodium Na^+), and inhibitory (potassium K^+) saturation upper bounds.
Chapter 3. Neural Dynamics II:Activation Models
At equilibrium, when the current equals zero, the Hodgkin-Huxley model has the resting potential V_rest:

V_rest = \frac{g^p V^p + g^+ V^+ + g^- V^-}{g^p + g^+ + g^-}

Neglecting the chloride-based passive terms gives the resting potential of the shunting model as

V_rest = \frac{g^+ V^+ + g^- V^-}{g^+ + g^-}
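The resting potential is just a conductance-weighted average of the saturation bounds. A one-line check with illustrative conductances and bounds (not physiological data from the lecture):

import numpy as np

# Illustrative conductances and saturation bounds.
g = np.array([0.1, 1.0, 0.5])        # passive, excitatory, inhibitory conductances g^p, g^+, g^-
V = np.array([-70.0, 55.0, -90.0])   # corresponding saturation bounds V^p, V^+, V^- (mV)

V_rest = (g @ V) / g.sum()                       # conductance-weighted average
V_rest_shunt = (g[1:] @ V[1:]) / g[1:].sum()     # neglecting the chloride-based passive term

print(V_rest, V_rest_shunt)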
Chapter 3. Neural Dynamics II:Activation Models
BAM activations also possess Cohen-Grossberg dynamics, and their extensions:

\dot{x}_i = -a_i(x_i) [ b_i(x_i) - \sum_j^p S_j(y_j) m_{ij} ]

\dot{y}_j = -a_j(y_j) [ b_j(y_j) - \sum_i^n S_i(x_i) m_{ij} ]

with corresponding Lyapunov function L, as we show in Chapter 6:

L = -\sum_i \sum_j S_i S_j m_{ij} + \sum_i \int_0^{x_i} S_i'(\theta_i) b_i(\theta_i) d\theta_i + \sum_j \int_0^{y_j} S_j'(\theta_j) b_j(\theta_j) d\theta_j
Chapter 3. Neural Dynamics II:Activation Models
Thank you, everyone!