
Mehran University of Engineering and Technology, Jamshoro
Institute of Information Technology
Third Term ME CSN & IT
Neural Networks
Kohonen Network/Model/Map
The principal goal of the Kohonen model is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this operation adaptively in a topologically ordered fashion. The figure below shows the schematic diagram of a two-dimensional lattice of neurons commonly used as the discrete map. Each neuron in the lattice is fully connected to all the source nodes in the input layer. This network represents a feedforward structure with a single computational layer consisting of neurons arranged in rows and columns. A one-dimensional lattice is a special case of the configuration depicted in the figure; in this special case the computational layer consists simply of a single column or row of neurons.
[Figure: a two-dimensional lattice of neurons used as the discrete map, with each neuron fully connected to the source nodes of the input layer.]
The algorithm proceeds first by initializing the synaptic weights in the network. This can be done by assigning them small values picked from a random number generator; in so doing, no prior order is imposed on the feature map. Once the network has been initialized, there are three essential processes involved in the formation of the self-organizing map:
Competition: We have already discussed in
detail how neurons compete among themselves
and how a neuron wins the competition, so this
issue will not be discussed in this lecture.
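As a brief reminder, the initialization and competition steps can be sketched as follows. This is only an illustration in Python with NumPy, using the standard minimum-Euclidean-distance winner criterion; the lattice size, input dimension, and array names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 10x10 lattice of neurons, 3-dimensional inputs.
rows, cols, input_dim = 10, 10, 3

# Initialization: small random synaptic weights, so that no prior order
# is imposed on the feature map.
weights = rng.uniform(-0.1, 0.1, size=(rows, cols, input_dim))

def best_matching_unit(x, weights):
    """Competition: return the (row, col) index of the neuron whose weight
    vector is closest (in Euclidean distance) to the input pattern x."""
    distances = np.linalg.norm(weights - x, axis=-1)   # shape (rows, cols)
    return np.unravel_index(np.argmin(distances), distances.shape)

x = rng.uniform(0.0, 1.0, size=input_dim)   # an example input pattern
winner = best_matching_unit(x, weights)     # index of the winning neuron, e.g. (4, 7)
```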
Cooperation: The winning neuron locates the center of a topological neighborhood of cooperating neurons.
Question: How do we define a topological neighborhood that is neurobiologically correct?
Answer: There is neurobiological evidence for lateral interaction among a set of excited neurons. In particular, a neuron that is firing tends to excite the neurons in its immediate neighborhood more than those farther away from it. This observation leads us to make the topological neighborhood around the winning neuron i decay smoothly with lateral distance.

Let h_{j,i} denote the topological neighborhood centered on winning neuron i and encompassing a set of excited (cooperating) neurons, a typical one of which is denoted by j. Let d_{j,i} denote the lateral distance between winning neuron i and excited neuron j. Then we assume that the topological neighborhood h_{j,i} is a unimodal function of the lateral distance d_{j,i}, such that it satisfies two distinct requirements:
1. The topological neighborhood h_{j,i} is symmetric about the maximum point defined by d_{j,i} = 0; in other words, it attains its maximum value at the winning neuron i, for which the distance d_{j,i} is zero.
2. The amplitude of the topological neighborhood h_{j,i} decreases monotonically with increasing lateral distance d_{j,i}, decaying to zero as d_{j,i} → ∞; this is a necessary condition for convergence.
A typical choice of h_{j,i} that satisfies these requirements is the Gaussian function

h_{j,i(x)} = exp(-d_{j,i}^2 / (2σ^2))        (1)

The parameter σ is the effective width of the topological neighborhood.
The lateral distance d_{j,i} may be computed as

d_{j,i} = |r_j - r_i|            (one-dimensional lattice)
d_{j,i}^2 = ||r_j - r_i||^2      (two-dimensional lattice)

where the discrete vector r_j defines the position of excited neuron j and r_i defines the position of winning neuron i in the lattice.
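For illustration, the lateral distance and the Gaussian neighborhood of equation (1) might be computed as in the sketch below, continuing the hypothetical NumPy arrays introduced earlier; the lattice positions r_j are simply taken to be the (row, column) indices of the neurons.

```python
# Lattice positions r_j for every neuron j: the (row, col) index of the neuron.
positions = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                 indexing="ij"), axis=-1)        # (rows, cols, 2)

def neighborhood(winner, sigma):
    """Cooperation: Gaussian topological neighborhood h_{j,i(x)} of equation (1).
    winner is the position r_i of the winning neuron; sigma is the effective width."""
    d_sq = np.sum((positions - np.array(winner)) ** 2, axis=-1)  # ||r_j - r_i||^2
    return np.exp(-d_sq / (2.0 * sigma ** 2))                    # (rows, cols)

h = neighborhood(winner, sigma=3.0)   # h equals 1 at the winner and decays with distance
```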
Another unique feature of the SOM algorithm is that the size of the topological neighborhood shrinks with time. This requirement is satisfied by making the width σ of h_{j,i} decrease with time. A popular choice is the exponential decay

σ(n) = σ_0 exp(-n/τ_1),   n = 0, 1, 2, …        (2)

where σ_0 is the initial value of σ and τ_1 is a time constant.
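As a small numerical illustration of equation (2), with arbitrary assumed values σ_0 = 5 and τ_1 = 1000:

```python
sigma_0, tau_1 = 5.0, 1000.0      # illustrative initial width and time constant

def sigma(n):
    """Equation (2): the neighborhood width shrinks exponentially with time n."""
    return sigma_0 * np.exp(-n / tau_1)

# The neighborhood contracts as training proceeds, e.g.
# sigma(0) = 5.0, sigma(500) ≈ 3.03, sigma(1000) ≈ 1.84, sigma(3000) ≈ 0.25
```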
Correspondingly, the topological neighborhood assumes a time-varying form given by

h_{j,i(x)}(n) = exp(-d_{j,i}^2 / (2σ^2(n))),   n = 0, 1, 2, …        (3)

where σ(n) is defined by equation (2).
Adaptive Process:
The weights can be updated by using the Kohonen rule, given by

Δw_j = η y_j (x - w_j)

where η is the learning rate. Putting y_j = h_{j,i(x)}, we have

Δw_j = η h_{j,i(x)} (x - w_j)        (4)
Therefore, the updated weight vector w_j(n+1) at time n+1 is defined by

w_j(n+1) = w_j(n) + η(n) h_{j,i(x)}(n) (x - w_j(n))        (5)

which is applied to all the neurons in the lattice that lie inside the topological neighborhood of winning neuron i.
The learning-rate parameter η(n) is also time-varying. It should start at an initial value η_0 and then decrease gradually with increasing time n. This requirement can be satisfied by the choice

η(n) = η_0 exp(-n/τ_2),   n = 0, 1, 2, …        (6)

where τ_2 is another time constant of the SOM algorithm.
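Putting equations (2), (3), (5), and (6) together, one iteration of the adaptive process could be sketched as follows, continuing the hypothetical functions and arrays from the earlier sketches; the values of η_0 and τ_2 here are illustrative.

```python
eta_0, tau_2 = 0.1, 1000.0        # illustrative learning-rate schedule parameters

def eta(n):
    """Equation (6): time-varying learning-rate parameter."""
    return eta_0 * np.exp(-n / tau_2)

def som_step(x, weights, n):
    """One iteration of the SOM algorithm at time n:
    competition, cooperation, and adaptation (equation (5))."""
    winner = best_matching_unit(x, weights)      # competition
    h = neighborhood(winner, sigma(n))           # cooperation, equations (2)-(3)
    # Adaptation: w_j(n+1) = w_j(n) + eta(n) * h_{j,i(x)}(n) * (x - w_j(n))
    weights += eta(n) * h[..., np.newaxis] * (x - weights)
    return weights
```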
Two phases of the adaptive process:
(a) Self-organizing or ordering phase:
It is during this first phase of the adaptive process that the
topological ordering of the weight vectors takes place.
The ordering phase may take as many as 1000
iterations of the SOM algorithm, and possibly more.
Careful consideration must be given to the choice of the learning-rate parameter and the neighborhood function:

• It is recommended that η(n) should begin with a value close to 0.1 and then decrease gradually, but remain above 0.01. These desirable values can be satisfied by the following choices in the formula of equation (6): η_0 = 0.1 and τ_2 = 1000.
• The function h_{j,i} should initially include almost all neurons in the network centered on the winning neuron i, and then shrink slowly with time. Specifically, during the ordering phase, which may occupy 1000 iterations or more, h_{j,i}(n) is permitted to reduce to a small value of only a couple of neighboring neurons around the winning neuron, or to the winning neuron itself. For a two-dimensional lattice of neurons, τ_1 in equation (2) may accordingly be set as

τ_1 = 1000 / log(σ_0)        (7)
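To see the effect of this choice, note that if the logarithm in equation (7) is taken to be the natural logarithm (an assumption made here for illustration), then at the end of a 1000-iteration ordering phase

σ(1000) = σ_0 exp(-1000/τ_1) = σ_0 exp(-log σ_0) = 1

so the neighborhood has contracted to roughly one lattice spacing, i.e., to the winning neuron and its immediate neighbors.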
(b) Converging phase: This second phase of the adaptive process is needed to fine-tune the feature map and therefore provide an accurate statistical quantification of the input space. As a general rule, the number of iterations constituting the convergence phase must be at least 500 times the number of neurons in the network. Thus, the converging phase may have to go on for thousands, and possibly tens of thousands, of iterations:
• For good accuracy, the learning-rate parameter η(n) should be maintained during the convergence phase at a small value, on the order of 0.01. In any event, it must not be allowed to decrease to zero.

• The neighborhood function should contain only the nearest neighbors of a winning neuron.
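Combining the two phases, a complete training loop might be sketched as below. This is only an illustration that continues the hypothetical functions defined earlier and uses the parameter choices recommended above; `data` is an assumed array of input patterns.

```python
def train_som(data, weights, sigma_0=5.0):
    """Illustrative two-phase SOM training loop.

    Ordering phase: ~1000 iterations, eta starts near 0.1 and stays above 0.01,
    and the neighborhood shrinks according to the time constant of equation (7).
    Convergence phase: at least 500 iterations per neuron, small eta (~0.01),
    neighborhood restricted to the nearest neighbors of the winner.
    """
    rng = np.random.default_rng(1)
    n_neurons = weights.shape[0] * weights.shape[1]

    # Ordering phase: eta_0 = 0.1, tau_2 = 1000, tau_1 = 1000 / ln(sigma_0).
    tau_1 = 1000.0 / np.log(sigma_0)
    for n in range(1000):
        x = data[rng.integers(len(data))]
        winner = best_matching_unit(x, weights)
        h = neighborhood(winner, sigma_0 * np.exp(-n / tau_1))
        weights += 0.1 * np.exp(-n / 1000.0) * h[..., np.newaxis] * (x - weights)

    # Convergence phase: fixed small learning rate, narrow neighborhood.
    for n in range(500 * n_neurons):
        x = data[rng.integers(len(data))]
        winner = best_matching_unit(x, weights)
        h = neighborhood(winner, sigma=1.0)     # roughly the winner's nearest neighbors
        weights += 0.01 * h[..., np.newaxis] * (x - weights)

    return weights

# Hypothetical usage: 1000 three-dimensional input patterns drawn uniformly.
data = np.random.default_rng(2).uniform(0.0, 1.0, size=(1000, input_dim))
weights = train_som(data, weights)
```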
Properties of the feature map:

• The feature map provides a good approximation to the input space.
• The feature map computed by the SOM algorithm is topologically ordered, in the sense that the spatial location of a neuron in the lattice corresponds to a particular domain or feature of the input patterns.
• The feature map reflects variations in the statistics of the input distribution.
• Given data from an input space with a nonlinear distribution, the self-organizing map is able to select a set of best features for approximating the underlying distribution.