Correlation Matrix Memory
CS/CMPE 333 – Neural Networks
Memory – Biological and Physical


- What are the characteristics of biological and physical memory systems?
- In the neurobiological context, memory refers to the relatively enduring neural alterations induced by the interactions of an organism with its environment
  - Store and recall capabilities
- Our memory is physically distributed and operates by association (rather than by addressing)
  - Content-addressable memory
- What about memory in a computer?
Associative Memory (1)

- Associative memory – content-addressable memory: memory that operates by association, mapping an input (stimulus) to an output (response)
  - Example: face recognition
- Characteristics of associative memory:
  - Distributed
  - Both key (stimulus) and output (stored pattern) are data vectors (as opposed to an address and data)
  - The storage pattern differs from the key and is spatially distributed
  - The stimulus serves as both the address and the stored pattern (possibly with imperfections)
  - Tolerant to noise and damage
  - The same elements share in the storage of different patterns
Associative Memory (2)

- Auto-associative memory – a key vector is associated with itself in memory
  - Dimensions of input and output are the same
- Hetero-associative memory – a key vector is associated (paired) with another memorized vector
  - Dimensions of input and output may or may not be the same
- Linear associative memory – linear mapping from input (stimulus) to output:
  b = Ma
- Nonlinear associative memory – nonlinear mapping from input to output:
  b = g(M, a)
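
To make the linear case concrete, here is a minimal numpy sketch of a linear associative memory; the matrix and vector values are illustrative, not from the lecture:

    import numpy as np

    # Linear associative memory: recall is a matrix-vector product, b = M a.
    # M (p x p) holds the stored associations; a is the key, b the response.
    p = 4
    rng = np.random.default_rng(0)
    M = rng.standard_normal((p, p))   # illustrative memory matrix
    a = rng.standard_normal(p)        # key (stimulus) vector
    b = M @ a                         # recalled response vector
    print(b.shape)                    # (p,)
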
Distributed Memory Mapping (1)


- Distributed memory – simultaneous or near-simultaneous activities of many different neurons that contain information about the external stimuli (keys)
- Distributed memory mapping – a transformation of an activity pattern in the input space to an activity pattern in the output space
Distributed Memory Mapping (2)
[Figure illustrating the distributed memory mapping]
Distributed Memory Mapping (3)
ak = [ak1 … akp]T = input activity pattern
bk = [bk1 … bkp]T = output activity pattern
- How do we learn from the activity patterns ak and bk?
- Association mapping:
  ak → bk,  k = 1…q
  - ak is thus a key pattern and bk the memorized pattern
- The number of patterns stored by the network is q, which in practice is less than p
Distributed Memory Mapping (4)

- For linear memories:
  bk = W(k)ak,  k = 1…q
Distributed Memory Mapping (5)
- Considering a single output neuron i and pattern k:
  bki = Σj=1..p wij(k) akj,  i = 1…p
  Or, in vector notation:
  bki = wi(k)T ak
- Considering the entire stored pattern bk (i.e., bki for i = 1…p):
  bk = W(k)ak
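
A quick numpy check of this equivalence (with illustrative values): the component-wise sum for neuron i is the dot product of row i of W(k) with ak, and stacking all rows gives the matrix-vector form.

    import numpy as np

    # b_ki = sum_j w_ij(k) a_kj  ==  w_i(k)^T a_k,  and  b_k = W(k) a_k
    p = 3
    rng = np.random.default_rng(1)
    W_k = rng.standard_normal((p, p))   # weight matrix for pattern k
    a_k = rng.standard_normal(p)        # key pattern

    i = 0
    b_ki = sum(W_k[i, j] * a_k[j] for j in range(p))   # component form
    assert np.isclose(b_ki, W_k[i] @ a_k)              # row dot product
    assert np.isclose((W_k @ a_k)[i], b_ki)            # full-pattern form
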
Distributed Memory Mapping (6)

- W(k) is the memory matrix for a single pattern k. The memory matrix for the entire set of q patterns is defined as the summation
  M = Σk=1..q W(k)
- M contains a piece of every input-output pattern presented to the memory. Recursive updating:
  Mk = Mk-1 + W(k),  k = 1…q
Correlation Matrix Memory (1)

- Define M' as an estimate of the memory matrix M:
  M' = Σk=1..q bk akT
  - The outer product bk akT is an estimate of the weight matrix W(k)
- Key points of this equation:
  - Outer product rule
  - Generalized Hebbian learning at the local level
  - M' is the correlation memory matrix
  - This associative memory is called a correlation matrix memory
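
A minimal numpy sketch of building M' by accumulating outer products (the pattern pairs below are illustrative, not from the lecture):

    import numpy as np

    # Correlation matrix memory: M' = sum_k b_k a_k^T
    p, q = 4, 2
    rng = np.random.default_rng(2)
    keys = [rng.standard_normal(p) for _ in range(q)]     # key patterns a_k
    stored = [rng.standard_normal(p) for _ in range(q)]   # memorized patterns b_k

    M_hat = np.zeros((p, p))
    for a_k, b_k in zip(keys, stored):
        M_hat += np.outer(b_k, a_k)   # Hebbian outer-product update

Each update uses only the locally available pair (ak, bk), which is what makes the rule Hebbian.
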
Correlation Matrix Memory (2)

- The correlation matrix memory M' can be written as
  M' = BAT
  where
  A = [a1, a2, …, aq] = key matrix
  B = [b1, b2, …, bq] = memorized matrix
- And, in recursive form:
  M'k = M'k-1 + bk akT,  k = 1…q
  - bk akT is an estimate of the weight matrix W(k)
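
A short numpy check (illustrative data) that the batch form BAT reproduces the outer-product sum:

    import numpy as np

    # With keys as columns of A and memorized patterns as columns of B,
    # M' = B A^T equals the sum of outer products b_k a_k^T.
    p, q = 4, 2
    rng = np.random.default_rng(2)
    A = rng.standard_normal((p, q))   # key matrix [a_1 ... a_q]
    B = rng.standard_normal((p, q))   # memorized matrix [b_1 ... b_q]

    M_sum = sum(np.outer(B[:, k], A[:, k]) for k in range(q))
    assert np.allclose(M_sum, B @ A.T)
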
Recall (1)


- Talking about memory: how is it addressed, how is specific information recalled, and what is its capacity?
- Consider the correlation memory matrix M'. Presenting a key aj as the stimulus, the response is
  b = M'aj = Σk=1..q bk akT aj = Σk=1..q bk (akT aj)
  Rewriting:
  b = Σk=1..q; k≠j bk (akT aj) + bj (ajT aj)
Recall (2)

- Assuming akTak = 1 for all k (i.e., the key vectors are normalized to 'energy' 1), then
  b = Σk=1..q; k≠j bk (akT aj) + bj = vj + bj
  - bj = desired response vector; vj = noise vector
  - The noise vector results from the crosstalk of aj with all the other key vectors
  - This crosstalk limits the capacity of the memory to reliably store and recall information
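
A minimal sketch of recall with normalized but non-orthogonal keys (illustrative data), splitting the response into the desired pattern plus crosstalk noise:

    import numpy as np

    # Store q pairs, recall with key a_j, and measure the noise v_j.
    p, q = 8, 3
    rng = np.random.default_rng(3)
    A = rng.standard_normal((p, q))
    A /= np.linalg.norm(A, axis=0)   # normalize keys: a_k^T a_k = 1
    B = rng.standard_normal((p, q))
    M_hat = B @ A.T                  # correlation memory matrix

    j = 0
    b = M_hat @ A[:, j]              # recalled response
    v_j = b - B[:, j]                # crosstalk noise from the other keys
    print(np.linalg.norm(v_j))      # nonzero because the keys overlap
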
Recall (3)

- Assume the key vectors ak are orthonormal, i.e.,
  ajTak = 1 if j = k, and ajTak = 0 if j ≠ k
- Then, what is the storage capacity?
  - It is equal to p
- What if the key vectors are not orthonormal?
  - In general, it is equal to the rank of the memory matrix M'
  - Since M' has dimensions p x p, its rank cannot be greater than p
- Thus, a correlation matrix memory can reliably store a maximum of p patterns, where p is the input dimension
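
A minimal sketch of the orthonormal case (the standard basis is used as a trivially orthonormal key set; the memorized patterns are illustrative):

    import numpy as np

    # With orthonormal keys the crosstalk term vanishes, so each of
    # the p stored patterns is recalled exactly.
    p = 5
    rng = np.random.default_rng(4)
    A = np.eye(p)                     # standard basis: orthonormal keys
    B = rng.standard_normal((p, p))   # p memorized patterns
    M_hat = B @ A.T

    for j in range(p):
        assert np.allclose(M_hat @ A[:, j], B[:, j])   # exact recall
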
Correlation Matrix Memory in Practice




- Real-world patterns are not orthonormal; hence, the storage capacity is less than p (the key vector dimension)
- If the key vectors (patterns) are linearly independent, they can be preprocessed to form an orthonormal set (using the Gram-Schmidt procedure)
- Preprocessing to enhance key vector 'separation' (i.e., feature enhancement) helps improve storage capacity and recall performance
- A correlation matrix memory may recall patterns that it has never seen before (i.e., it can make errors) if the key patterns are not orthonormal and/or number more than p
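
A minimal sketch of the orthonormalization step, assuming linearly independent illustrative keys; numpy's QR factorization carries out a Gram-Schmidt-style orthonormalization:

    import numpy as np

    # Preprocess linearly independent keys into an orthonormal set.
    p, q = 6, 3
    rng = np.random.default_rng(5)
    A = rng.standard_normal((p, q))   # linearly independent (almost surely)
    Q, _ = np.linalg.qr(A)            # orthonormal columns spanning the keys

    assert np.allclose(Q.T @ Q, np.eye(q))   # Q^T Q = I

Storing against the columns of Q instead of the raw keys removes the crosstalk term during recall.
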
Error-Correction in CMM (1)

- A correlation matrix memory has no mechanism for feedback, since it is based on Hebbian learning
  - The consequence is that there are errors in recall
- How do we continually update the memory to reduce the error?
- Impose error-correction learning and updating of the memory
- Remember, the memory matrix M' is made up of the weights of the neural network
Error-Correction in CMM (2)



- At iteration n, when presented with pattern k, the memory error is
  ek(n) = bk – M'(n)ak
- Correction at iteration n (delta rule):
  ΔM'(n) = η ek(n) akT
- Updated memory matrix:
  M'(n+1) = M'(n) + ΔM'(n)
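
A minimal sketch of the error-correction loop; the learning rate η = 0.5, the pattern pairs, and the epoch count are illustrative choices, not from the lecture:

    import numpy as np

    # Start from the Hebbian estimate M' and repeatedly apply the
    # delta rule: M'(n+1) = M'(n) + eta * e_k(n) a_k^T.
    p, q, eta, epochs = 8, 3, 0.5, 50
    rng = np.random.default_rng(6)
    A = rng.standard_normal((p, q))
    A /= np.linalg.norm(A, axis=0)    # normalized keys
    B = rng.standard_normal((p, q))
    M_hat = B @ A.T                   # initial correlation memory

    for _ in range(epochs):
        for k in range(q):
            e_k = B[:, k] - M_hat @ A[:, k]        # recall error, pattern k
            M_hat += eta * np.outer(e_k, A[:, k])  # delta-rule update

    print(np.linalg.norm(B - M_hat @ A))   # residual recall error shrinks
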