Transcript PINN-NOW99
Designing High-Capacity Neural Networks
for Storing, Retrieving and Forgetting Patterns
in Real-Time
Dmitry O. Gorodnichy
IMMS, Cybernetics Center of the Ukrainian Academy of Sciences, Kiev, Ukraine
Dept. of Computing Science, University of Alberta, Edmonton, Canada
http://www.cv.iit.nrc.ca/~dmitry/pinn
As presented at IJCNN’99, Washington DC, July 12-17, 1999
Outline
• What is memory? What is good memory?
• Pseudo-Inverse Neural Network is a tool to build it!
• Setting the paradigm for designing non-iterative high-capacity real-time adaptive systems:
• Increasing the Attraction Radius of the network
• Desaturating the network
• Solution for cycles and for fast retrieval
• Just some results
• Processing a stream of data - Dynamic Desaturation
• Conclusions. Some food for thought
What is Memory?
The one that stores data and retrieves it.
What do we want?
• To store
• as much as possible (for given amount of space)
• as fast as possible
• To retrieve
• from as much noise as possible
• as fast as possible
• To continuously update the contents of the memory as new data arrive
We are interested in theoretically grounded solutions
Neural Network - a tool to do that
• A fully connected network of N neurons Yi, which evolves in time according to the update rule
Yi(t+1) = sign( Σj Cij Yj(t) )
until it reaches a stable state (attractor).
• Patterns can be stored as attractors
-> Non-iterative learning - fast learning
-> Synchronous dynamics - fast retrieval (see the sketch below)
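As a minimal sketch of this synchronous retrieval dynamics (the function name and iteration cap are illustrative, and the weight matrix C is assumed to be already trained):

    import numpy as np

    def retrieve(C, y0, max_iters=100):
        """Synchronous retrieval: Y(t+1) = sign(C Y(t)), repeated until
        a stable state (attractor) is reached.
        C  - N x N weight matrix (assumed already trained)
        y0 - initial state, a vector of +/-1 of length N."""
        y = np.sign(y0).astype(float)
        y[y == 0] = 1.0                        # sign(0) -> +1 convention
        for _ in range(max_iters):
            y_next = np.sign(C @ y)
            y_next[y_next == 0] = 1.0
            if np.array_equal(y_next, y):      # fixed point: a stable state
                return y_next
            y = y_next
        return y                               # no fixed point within max_iters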
How to find the best weight matrix C ?
Pseudo-Inverse Learning Rule
• Obtained from the stability condition CV = eV [Personnaz’85, Reznik’93]
• C = VV+, where V+ is the Moore-Penrose pseudo-inverse of the N x M matrix V whose columns are the stored patterns
(the Widrow-Hoff rule is its approximation; a minimal sketch follows after this list)
• Optical and hardware implementations exist
• Its dynamics can be studied theoretically:
-> It can retrieve up to 0.3 N patterns
(Hebbian rule retrieves only 0.13N patterns)
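A minimal sketch of the rule in NumPy, assuming the M patterns are stored as +/-1 columns of an N x M matrix V (names are illustrative):

    import numpy as np

    def pseudo_inverse_weights(V):
        """Pseudo-inverse learning rule: C = V V+.
        V - N x M matrix whose columns are the patterns (entries +/-1).
        The resulting C is the orthogonal projector onto the subspace
        spanned by the patterns, so every stored pattern v satisfies C v = v."""
        return V @ np.linalg.pinv(V)

With this C, each stored pattern v is a fixed point of the retrieval sketch above, since C v = v implies sign(C v) = v.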
Attraction Radius
Attraction radius (AR) tells us how good the retrieval is
• Direct AR can be calculated theoretically [Gorodnichy’95]
-> the weights C determine the AR ...
-> … and the weights themselves satisfy the stability condition CV = eV
• Indirect AR can be estimated by Monte-Carlo simulations
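A minimal Monte-Carlo sketch of the indirect AR, reusing the retrieve() and pseudo_inverse_weights() helpers sketched above (trial count and search range are illustrative):

    import numpy as np

    def indirect_attraction_radius(C, V, trials=100, rng=None):
        """Estimate the indirect AR: the largest number of flipped bits
        from which the stored patterns are still retrieved in all trials."""
        rng = rng or np.random.default_rng(0)
        N, M = V.shape
        radius = 0
        for k in range(1, N // 2):                  # increasing noise levels
            for _ in range(trials):
                m = rng.integers(M)                 # pick a stored pattern
                y = V[:, m].astype(float).copy()
                flip = rng.choice(N, size=k, replace=False)
                y[flip] *= -1                       # distort k bits
                if not np.array_equal(retrieve(C, y), V[:, m].astype(float)):
                    return radius                   # retrieval failed at noise k
            radius = k                              # all trials recovered at noise k
        return radius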
Desaturation of the network
When there are too many patterns in memory, the network gets saturated:
• There are too many spurious local attractors [Reznik’93].
• Global attractors are never reached.
Solution [Gorodnichy 95&96]:
desaturate the network by partially reducing self-connections:
Cii = Cii*D, 0 < D < 1
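As a sketch, the desaturation step applied to the pseudo-inverse weights sketched above (the default D = 0.15 is the value used in the conclusions and is illustrative here):

    import numpy as np

    def desaturate(C, D=0.15):
        """Partially reduce the self-connections: Cii <- D * Cii, 0 < D < 1.
        Returns a desaturated copy of the weight matrix."""
        C = C.copy()
        np.fill_diagonal(C, D * np.diag(C))
        return C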
• Desaturation:
-> preserves main attractors
-> decreases the number of static spurious attractors
-> makes the network more flexible (increases the number of iterations)
-> drastically increases the attraction radius [Gorodnichy&Reznik’97]
But what about cycles (dynamic spurious attractors)?
Increase of AR with Desaturation
[Plots: Indirect AR and Direct AR under desaturation]
Dynamics of the network
The behaviour of the network is governed by the energy functions
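As a reference point (an assumption, not necessarily the talk's exact expressions): for a symmetric weight matrix C, the standard energy functions are E1(Y) = -1/2 YᵀCY for asynchronous updates and E2(t) = -Y(t)ᵀCY(t-1) for synchronous updates; E2 never increases along the synchronous dynamics, which is why the only attractors are fixed points and cycles of length two.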
• Cycles are possible when D < 1
• However:
-> They are few when D > 0.1 [Gorodnichy&Reznik’97]
-> They are detected automatically
Update flow neuro-processing
[Gorodnichy&Reznik’94]:
“Process only those neurons which change during the evolution”, i.e.
instead of recomputing the full sum Si = Σj Cij Yj (N multiplications per neuron),
update only the terms of the neurons that changed: Si ← Si + Cik·ΔYk (see the sketch below)
-> is very fast (as only a few neurons actually change in one iteration)
-> detects cycles automatically
-> suitable for parallel implementation
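A minimal sketch of the update-flow idea in NumPy, reusing the names from the earlier sketches (the loop structure and cycle test are illustrative, not the exact neurocomputer implementation):

    import numpy as np

    def retrieve_update_flow(C, y0, max_iters=1000):
        """Keep the postsynaptic sums S = C y and, after each step,
        correct them only for the neurons that flipped.
        Returns (final state, cycle_detected)."""
        y = np.sign(y0).astype(float)
        y[y == 0] = 1.0
        S = C @ y                                   # full sums computed once
        seen = {tuple(y)}                           # states visited so far
        for _ in range(max_iters):
            y_next = np.where(S >= 0, 1.0, -1.0)
            changed = np.flatnonzero(y_next != y)   # usually only a few neurons
            if changed.size == 0:
                return y, False                     # stable state reached
            # correct the sums only over the columns of the changed neurons
            S += C[:, changed] @ (y_next[changed] - y[changed])
            y = y_next
            state = tuple(y)
            if state in seen:                       # dynamic spurious attractor
                return y, True                      # cycle detected
            seen.add(state)
        return y, True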
Dealing with a stream of data
Dynamic desaturation:
-> maintains the capacity of 0.2N (with complete retrieval)
-> allows data to be stored in real time (no need for iterative learning methods!); see the sketch after this list
-> provides means for forgetting obsolete data
-> is the basis for the design of adaptive filters
-> gives new insights into how the brain works
-> provides grounds for revising traditional learning theory
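The talk does not spell out the dynamic-desaturation update itself; as a sketch of the non-iterative storage it builds on (an assumption based on the standard incremental pseudo-inverse formula), one pattern can be added to the projection weights without retraining over the old ones:

    import numpy as np

    def store_pattern(C, v, eps=1e-10):
        """Add one +/-1 pattern v to the projection weights C without
        retraining: C <- C + q q^T / (q^T q), where q = v - C v.
        If v already lies in the stored subspace (q ~ 0), C is unchanged."""
        v = v.astype(float)
        q = v - C @ v
        nq = q @ q
        if nq < eps:                 # pattern already representable
            return C
        return C + np.outer(q, q) / nq

One way to combine this with desaturation, for example, is to keep accumulating patterns in the undesaturated projector C and apply desaturate() to a copy of C used for retrieval.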
This is the basis of the Neurocomputer
designed at the IMMS, Cybernetics Center of the Ukrainian NAS
Conclusions
• The best performance of Hopfield-like networks is achieved with
the Desaturated Pseudo-Inverse Learning Rule:
C = VV+, Cii = D*Cii, D = 0.15
E.g. complete retrieval of M = 0.5N patterns from 8% noise
and of M = 0.7N patterns from 2% noise (a combined sketch follows)
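Putting the earlier sketches together (the pattern count and noise level here are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    N, M = 100, 30                                   # 0.3N patterns
    V = rng.choice([-1.0, 1.0], size=(N, M))         # random +/-1 patterns
    C = desaturate(pseudo_inverse_weights(V), D=0.15)

    probe = V[:, 0].copy()
    flip = rng.choice(N, size=5, replace=False)
    probe[flip] *= -1                                # distort 5% of the bits
    recovered, is_cycle = retrieve_update_flow(C, probe)
    print(np.array_equal(recovered, V[:, 0]), is_cycle)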
• The basis for non-iterative learning (to replace traditional iterative
learning methods) is set. This basis is Dynamic Desaturation,
which allows one to build real-time Adaptive Systems.
• Update Flow neuro-processing technique makes retrieval very
fast. It also resolves the issue of spurious dynamic attractors.
• Free code for the Pseudo-Inverse Memory is available!
(see our web site).