
Dynamic Link Matching
Hamid Reza Vaezi
Mohammad Hossein Rohban
Neural Networks
Spring 2007
Outline
• Introduction
– Topography based Object Recognition
• Basic Dynamic Link Matching
– Ideas
– Formalization
• Improved Dynamic Link Matching
– Principles
– Differential Equations Implementation
• Experiments and Results
Introduction
• Visual image in a conventional neural net
– Image is represented as a vector
– Spatial relations are ignored
• Solution: preprocessing, e.g. the Neocognitron
• Which pattern?
Labeled Graph
• Data structure to overcome the aforementioned problem
• Object representation
• First used in neural nets by Dynamic Link Matching
• Structure:
– Set of nodes: containing local features.
– Set of edges: connecting nodes.
Labeled Graph
• Feature space: the set of all local features.
– Image: absolute information extracted from a small patch of the image,
such as color, texture, or the dimension of edges.
– Acoustic signal: onset, offset, or energy in a particular frequency
channel.
• Sensory space: the space from which relational features are
extracted.
– Image: the image plane (spatial relations).
– Acoustic signal: frequency or time.
Sample Labeled Graph
• Dashed line: proximity in sensory space.
• Solid line: proximity in feature space.
Labeled Graph Matching
• Object Recognition
• Finding partial identity
• Detecting Symmetry
Object Recognition
• Object recognition problem
– Given a test image of an object and a gallery of object images, find
the matching images in the gallery.
• Topography-based solutions
– Use the ordering and local intensity of the images
– Find a one-to-one mapping between regions of the two images.
DLM Principles
• Dynamic Link Matching
– Konen & Von Der Malsburg (1992–1993)
– Konen & Vorbrüggen (1993)
• It is built on four principles:
• Correlation Encodes Neighborhood
– Two neighboring nodes have correlated output in both layers.
• Layer Dynamics Synchronize
– The two blobs should align and synchronize across the two layers, in the
later iterations, if the model and the image represent the same object.
• Synchrony is Robust against Noise
• Synchrony Structures Connectivity
– Weight plasticity is used to improve the region mapping.
DLM
• Idea
– Consider a two-layer neural network
• First layer represents input image (Image Layer)
• Second layer represents gallery images (Model Layer)
– The weight from the ith neuron in the first layer to the jth neuron in the second layer
represents the degree of matching between the corresponding ith and jth
regions.
– Each neuron stores a local wavelet response ("jet") taken at the corresponding pixel
of the image.
– The output of each neuron represents the scanning of the image (a rough sketch of this
setup follows below).
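To make the setup concrete, a minimal sketch of the two layers, their jets, and the link matrix could look as follows; the sizes, the jet dimension, and the similarity-based initialization are illustrative assumptions, not notation from the slides.
```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the experiments later in the slides use a 16x17 image
# grid and a 10x10 model grid.
IMAGE_NODES = 16 * 17   # neurons of the image layer, one per grid node
MODEL_NODES = 10 * 10   # neurons of the model layer

# Each node stores a local feature vector ("jet"), e.g. wavelet responses
# taken at the corresponding position of the image (random here).
JET_DIM = 40
image_jets = rng.random((IMAGE_NODES, JET_DIM))
model_jets = rng.random((MODEL_NODES, JET_DIM))

def jet_similarity(a, b):
    """Normalized dot product between two jets: one common choice for S."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# W[j, i] links image neuron i to model neuron j and encodes how well
# image region i matches model region j; initializing it from the jet
# similarity is one natural starting point.
W = np.array([[jet_similarity(mj, ij) for ij in image_jets] for mj in model_jets])
print(W.shape)  # (100, 272)
```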
DLM (cont.)
• Idea (cont.)
– Create a blob in the 1st layer (Image Layer)
• a set of neighboring regions with high output
– The 1st layer sends its output to the 2nd layer (Model Layer)
• a sigmoid applied to the sum of weighted inputs
– Neighboring neurons in the 2nd layer with high activities (if any) amplify
each other's activities (topography!)
– If two nodes in the two layers fire simultaneously, strengthen their
connection.
– Repeat the above process.
– After a while, if there is high blob activity in the 2nd layer, it is
concluded that the two images represent the same object (a toy sketch of this loop follows below).
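The loop above can be written out as a toy sketch in Python; the 1-D layers, the constants, and the randomly placed blob are illustrative simplifications, not the equations of the original papers.
```python
import numpy as np

rng = np.random.default_rng(1)
N_IMG, N_MOD = 30, 30                    # toy 1-D layer sizes
W = rng.random((N_MOD, N_IMG)) * 0.1     # initially weak, unspecific links

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(200):
    # 1) create a blob in the image layer: a small group of neighbouring
    #    nodes with high output (placed at random in this toy version)
    h_img = np.zeros(N_IMG)
    c = rng.integers(N_IMG)
    h_img[max(0, c - 2):c + 3] = 1.0

    # 2) the image layer drives the model layer through the links
    h_mod = sigmoid(W @ h_img - 1.0)

    # 3) neighbouring active model neurons amplify each other (topography)
    h_mod += 0.2 * (np.roll(h_mod, 1) + np.roll(h_mod, -1)) * h_mod

    # 4) nodes that fire together strengthen their link (Hebbian step),
    #    and the weights converging on each model neuron are normalized
    W += 0.05 * np.outer(h_mod, h_img)
    W /= W.sum(axis=1, keepdims=True)

# 5) high blob activity in the model layer is read as "same object"
print("final model-layer blob activity:", round(float(h_mod.max()), 3))
```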
DLM (cont.)
• Notations
– h0i = ith neuron of the 1st layer
– h1j = jth neuron of the 2nd layer
– Ii(t) = i.i.d. random noise, Ji = jet attached to the ith node
– σ(.) = sigmoid activation function, S = similarity measure
– Wij = weight of the connection from the jth to the ith neuron
DLM (cont.)
• Local excitation
• Lack of excitation leads to a decay in h(t) (a minimal illustration follows below)
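The decay can be illustrated with a minimal leaky-integrator step; this is only a cartoon of the term structure, not the slide's full equation.
```python
def euler_step(h, excitation, dt=0.05, tau=1.0):
    """One Euler step of a leaky integrator: with no excitation, h decays."""
    return h + dt * (-h / tau + excitation)

h = 1.0
for _ in range(100):
    h = euler_step(h, excitation=0.0)   # no input: activity decays towards 0
print(round(h, 4))
```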
DLM (cont.)
• If two nodes in the two layers are correlated, their
connection strength is increased.
• The weights converging on a 2nd-layer neuron are normalized (see the sketch below).
• Having changed the connections, run the differential equations
again.
• Repeat for some predefined number of iterations.
• If the activity in the 2nd layer is high, the two images are considered
equivalent.
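A schematic version of this correlation-driven update with per-neuron normalization; the learning rate and the exact normalization are assumptions in the spirit of the bullets above.
```python
import numpy as np

def update_links(W, h_model, h_image, lr=0.1):
    """Strengthen links between simultaneously active neurons, then
    normalize the weights converging on each 2nd-layer (model) neuron."""
    W = W + lr * np.outer(h_model, h_image)   # correlation-driven growth
    return W / W.sum(axis=1, keepdims=True)   # per-model-neuron normalization

W = np.full((4, 6), 0.1)                      # 4 model neurons, 6 image neurons
W = update_links(W, h_model=np.array([0.0, 1.0, 0.0, 0.0]),
                 h_image=np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0]))
print(W.sum(axis=1))                          # each row still sums to 1
```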
Drawbacks
• Needs an accurate schedule for the layer dynamics, rather than
being autonomous.
• Information about the correspondence of blobs is lost in
the next iteration, after the weights are altered.
• Slow process: many iterations, each requiring the iterative solution of two
differential equations.
• In practice it cannot handle a gallery with more than 3
images.
Solution
• L. Wiskott (1995) changed this architecture.
• Ideas:
– Two differential equations are considered, each modeling a blob in one layer.
– The equations are solved only once.
– Blobs move almost continuously, thus preserving information
from the previous iteration.
– The attention blob concept is introduced
• Do not scan all points in the main image, only regions with high activity.
– Connections are bidirectional, for blob alignment and attention blob
formation.
– Much faster and more accurate, on galleries of 20, 50, and 111 models.
Blob Formation
• Local Excitation
• Global Inhibition
• Nodes are indexed by two-dimensional grid positions: i = (0,0), (0,1), (0,2), …
Blob Formation (cont.)
• The formation equation combines the local excitation and global inhibition terms above; see Wiskott (1995) for the exact form (a rough numerical sketch follows below).
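Since the equation itself is given in the thesis, the following is only a numerical sketch built from the same ingredients, with a 1-D layer, an illustrative Gaussian kernel, and made-up constants.
```python
import numpy as np

rng = np.random.default_rng(2)
N = 50                                   # 1-D layer of neurons, for simplicity
h = np.zeros(N)

def sigma(x):
    # placeholder activation: zero for negative inputs (see the next slide)
    return np.sqrt(np.clip(x, 0.0, 1.0))

# local excitatory kernel: each neuron pools activity from its near neighbours
d = np.arange(N)[:, None] - np.arange(N)[None, :]
g = np.exp(-0.5 * (d / 2.0) ** 2)
g /= g.sum(axis=1, keepdims=True)

beta = 0.05                              # strength of the global inhibition
dt = 0.1
for _ in range(500):
    local_excitation = g @ sigma(h)      # neighbourhood support
    global_inhibition = beta * sigma(h).sum()
    noise = 0.02 * rng.random(N)
    h += dt * (-h + local_excitation - global_inhibition + noise)

# local excitation favours contiguous active regions, while global
# inhibition limits how much of the layer can stay active at once
print("most active node:", int(np.argmax(h)), "| nodes above half-max:",
      int((h > 0.5 * h.max()).sum()))
```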
Blob Formation (cont.)
• A blob can arise only if βh < 1.
• A lower βh leads to larger blobs.
• Using this form of activation function:
– It vanishes for negative values, so there is no oscillation.
– Its higher slope for smaller values eases blob formation from
small noise values (see the sketch below).
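One concrete activation with both of the listed properties is a capped square root; the exact function used in the thesis may differ, so treat this as an illustration.
```python
import numpy as np

def sigma(h):
    """Zero for negative inputs; rises steeply for small positive inputs
    (slope ~ 1 / (2*sqrt(h))), then saturates at 1."""
    return np.sqrt(np.clip(h, 0.0, 1.0))

h = np.array([-0.5, 0.0, 0.01, 0.04, 1.0, 2.0])
print(sigma(h))   # [0.  0.  0.1 0.2 1.  1. ]
# the steep initial slope lets small noise values seed a blob,
# while sigma(h) = 0 for h < 0 prevents oscillation
```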
Blob Formation (cont.)
• Creating a blob in this way makes neighboring neurons
highly correlated in the temporal domain (1st principle).
– Neighboring neurons are excited in almost the same way.
• In order to test the 2nd principle (synchronization) we
need moving blobs.
• We can store the path of the blob, as a memory term, and make the blob move away from it.
Blob Mobilization
• We modify the equations accordingly:
• si(t) acts as a memory and is called the self-inhibition.
• λ is a varying decay constant.
• Rewriting the formula for s:
Blob Mobilization (cont.)
• λ takes two values and thus plays two roles:
– When h > s, it is a high positive value.
– When h < s, it is a low positive value.
• Functions:
– When h > s, the blob has recently arrived; increasing s
makes the blob move away.
– When h < s, the blob has recently moved away; slowly
decreasing s keeps the blob from returning to its recent place (see the sketch below).
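A minimal sketch of the self-inhibition variable with a two-valued decay constant; the λ values, the time step, and the toy blob are illustrative assumptions.
```python
import numpy as np

def self_inhibition_step(h, s, dt=0.05, lam_high=2.0, lam_low=0.1):
    """s follows h quickly where h > s (charging up under the current blob)
    and decays slowly where h < s (so the blob does not return at once).
    lam_high and lam_low are illustrative values."""
    lam = np.where(h > s, lam_high, lam_low)
    return s + dt * lam * (h - s)

h = np.array([0.0, 1.0, 1.0, 0.0])   # a stationary toy blob on nodes 1-2
s = np.zeros(4)
for _ in range(50):
    s = self_inhibition_step(h, s)
print(np.round(s, 2))                # s has charged up under the blob
h[:] = 0.0                           # the blob has moved away
for _ in range(50):
    s = self_inhibition_step(h, s)
print(np.round(s, 2))                # s decays only slowly
# in the full dynamics, s is subtracted from the input of h,
# which is what pushes the blob off its current position
```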
Blob Mobilization (cont.)
Why does the blob sometimes jump?
Layer Interaction
• Neurons of the two layers are also excited according to the
activity of the "known corresponding neurons" in the
other layer (see the sketch below):
• Wijpq codes the synchrony (mapping) of node j in layer q
to node i in layer p.
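A sketch of how one layer could drive the other through the link matrix; taking the maximum over incoming links anticipates the "Max vs. Summation" slide, and all names and numbers are illustrative.
```python
import numpy as np

def interlayer_input(W, h_other):
    """Excitation that neuron i of one layer receives from the other layer:
    each incoming link is weighted by its strength W[i, j] and the sender's
    (rectified) activity, and the strongest contribution is used rather
    than the sum (see the "Max vs. Summation" slide)."""
    contributions = W * np.maximum(h_other, 0.0)[None, :]   # shape (n_i, n_j)
    return contributions.max(axis=1)

W = np.array([[0.9, 0.1, 0.1],    # W[i, j]: link from node j (other layer) to node i
              [0.1, 0.8, 0.1]])
h_other = np.array([0.0, 1.0, 0.2])
print(interlayer_input(W, h_other))   # node 1, whose strong partner is active, is driven most
```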
Layer Interaction (cont.)
• Left: the early, non-synchronized case.
• Right: the final, synchronized case.
– There is a blob at the location of maximal input in the output
layer.
Link Dynamics
• Having computed the neurons' activities using the "known mapping
matrix", we want to approximate a new mapping
matrix.
• S measures similarity, J is the jet attached to each
neuron, and θ(.) is a Heaviside step function.
Link Dynamics (cont.)
• The synaptic weights grow exponentially, controlled
by the correlation between neuron activities.
• If one of the links converging on a node i (in the
output layer) grows beyond its initial value, all of these
connections are reduced.
• The best link is preserved in this case (a sketch of one reading of this rule follows below).
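One plausible reading of this rule in code; the multiplicative growth term and the normalization criterion are assumptions consistent with the bullets above, not the thesis equations.
```python
import numpy as np

def link_step(W, W_init, h_out, h_in, dt=0.05, lam=1.0):
    """Each link grows in proportion to its own strength and to the
    correlation of the two neurons it connects (exponential growth).
    If the strongest link converging on an output neuron exceeds its
    initial level, all links converging on that neuron are scaled down,
    so only the relatively best link keeps its advantage."""
    corr = np.outer(np.maximum(h_out, 0.0), np.maximum(h_in, 0.0))
    W = W * (1.0 + dt * lam * corr)                # multiplicative growth
    over = W.max(axis=1) > W_init.max(axis=1)      # check per output neuron
    scale = np.where(over, W_init.max(axis=1) / W.max(axis=1), 1.0)
    return W * scale[:, None]

W0 = np.full((2, 3), 0.2)
W = link_step(W0.copy(), W0, h_out=np.array([1.0, 0.0]),
              h_in=np.array([1.0, 0.2, 0.0]))
print(np.round(W, 3))   # on the active output neuron, the best link stays ahead
```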
Attention Dynamics
• The image layer is usually larger than the model layer.
• We need to restrict the area in which the blob moves.
Attention Dynamics (cont.)
• Neurons whose corresponding activity value is beyond ac
are strengthened.
• The activity of the attention blob should change slowly.
• The attention blob is excited by the corresponding running
blob: it moves toward active regions (see the sketch below).
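A minimal sketch of such slow, thresholded attention dynamics; the threshold, decay, gain, and time step are illustrative assumptions.
```python
import numpy as np

def attention_step(a, h_running, a_c=0.3, dt=0.005, gain=1.0):
    """One slow update of the attention layer (the small dt keeps it slow):
    positions already above the threshold a_c reinforce themselves, and the
    running blob pulls attention towards the regions it keeps visiting."""
    reinforcement = np.where(a > a_c, a - a_c, 0.0)
    return a + dt * (-0.5 * a + reinforcement + gain * h_running)

a = np.zeros(6)
running_blob = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])  # blob keeps visiting nodes 2-3
for _ in range(200):
    a = attention_step(a, running_blob)
print(np.round(a, 2))   # attention has built up slowly where the blob runs
```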
Recognition Dynamics
• The most similar model cooperates most successfully
and becomes the most active one (one possible reading is sketched below).
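One possible reading of this competition in code; the specific rule, comparing each model's total activity with its best competitor, is an illustrative assumption, not the thesis equation.
```python
import numpy as np

def recognition_step(r, model_activity, dt=0.05):
    """Each model m keeps a recognition variable r[m]; it grows only while
    model m's total layer activity exceeds that of its best competitor,
    so the model that cooperates best ends up as the single winner."""
    best_other = np.array([np.delete(model_activity, m).max()
                           for m in range(len(model_activity))])
    return np.clip(r + dt * (model_activity - best_other), 0.0, 1.0)

model_activity = np.array([0.4, 0.9, 0.5])   # total blob activity per model layer
r = np.zeros(3)
for _ in range(100):
    r = recognition_step(r, model_activity)
print(r)   # only the most active model's recognition variable has grown
```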
Parameters
Bidirectional Connections
• With unidirectional connections, one blob would run
behind the other.
• Connections
– Model → Image: moves the attention blob appropriately.
– Image → Model: provides a discrimination cue as to which model
best fits the image.
Max vs. Summation
• Why did we use the maximum over j instead of summing over the j
variable?
– Many connections converge on a neuron, but only one of them is the
correct connection. Using the sum decreases the neuron's SNR (see the sketch below).
– The dynamic range of the inputs does not change much after the reorganization of the weights.
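A tiny numerical illustration of the SNR argument, with made-up numbers.
```python
import numpy as np

rng = np.random.default_rng(3)

# Many links converge on a model neuron; only one of them is "correct".
correct = 1.0
spurious = rng.random(99) * 0.3          # many weak, wrong links
inputs = np.append(spurious, correct)

print("sum:", round(float(inputs.sum()), 2))   # the correct link is drowned out
print("max:", round(float(inputs.max()), 2))   # the correct link dominates
```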
Experiments
• Gallery database of 111 persons.
– One neutral image of the frontal view.
– One frontal view with a different facial expression.
– Two images rotated in depth, by 15 and 30 degrees.
– The neutral image acts as the model image.
– The other images act as test images.
• The model grid is 10×10 and the image grid is 16×17.
• The grids are moved to place nodes in areas such as the eyes,
mouth and nose.
Experiments (cont.)
• DLM is slightly modified:
– For the first 1000 time steps, no weight correction is done, to
stabilize the attention blob.
• It takes 10–15 minutes to recognize a face on a Sun SPARC
station with a 50 MHz processor.
• This seems far from real-time operation.
Results
Drawbacks
• The path of the running blob is not random, but depends
on the initial random state of the neurons and on the activity of the
other layer.
• Thus certain paths may dominate and the topology is
encoded inhomogeneously: strongly along typical
paths and weakly elsewhere.
• Solution:
– Other ways of encoding topology, e.g. plane waves.
– These cause the process to run slowly.
Conclusions
• DLM works based on topology coding.
• Topology is coded by blobs.
• Two layer architecture tries to find the mapping
between two topologies.
• Topologies are mapped using correlation of neurons.
• Models with the highest activity are chosen.
• The proposed method needs no training data to perform
intelligently.
References
• L. Wiskott, “Labeled Graphs and Dynamic Link Matching for Face
Recognition and Scene Analysis,” PhD Thesis, Ruhr University,
Bochum, 1995.
• W. Konen, C. Von Der Malsburg, “Learning to Generalize from Single
Examples in the Dynamic Link Architecture”, Neural Computation,
1993.
Thanks for your attention!
Any questions?