Unsupervised learning

Unsupervised learning
Summary from last week
• We explained what local minima are, and
described ways of escaping them.
• We investigated how the backpropagation
algorithm can be improved by changing various
parameters and re-training.
Unsupervised learning
• Supervised learning = 'teacher' presents input
patterns and desired target result.
• Unsupervised learning = input patterns but no
'teaching signal'.
• Self organisation = showing patterns to be
classified, network produces own output
representation.
Three properties required
• Value of output used as measure of similarity
between input pattern and pattern stored in neuron.
• Competitive learning strategy selects neuron with
largest response.
• Method of reinforcing largest response.
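A minimal sketch of these three properties in Python/NumPy (the array sizes, learning rate and function name are illustrative, not from the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    weights = rng.random((9, 2))          # 9 units, each storing a 2-D pattern

    def competitive_step(x, weights, lr=0.1):
        # 1. Output of each unit is used as a measure of similarity between
        #    the input pattern and the pattern stored in the unit
        #    (negative distance, so a larger value means more similar).
        outputs = -np.linalg.norm(weights - x, axis=1)
        # 2. Competitive learning strategy: select the unit with the largest response.
        winner = np.argmax(outputs)
        # 3. Reinforce the largest response: move the winner's pattern toward x.
        weights[winner] += lr * (x - weights[winner])
        return winner

    winner = competitive_step(np.array([0.5, 0.5]), weights)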
Self-organising maps (SOMs)
• Inspiration from Biology: In auditory pathway
nerve cells arranged in relation to frequency
response (tonotopic organisation).
• Kohonen took inspiration from this to produce self-organising maps (SOMs).
• In SOM units located physically next to one
another will respond to input vectors that are
‘similar’.
SOMs
• Useful, as it is difficult for humans to visualise
data with more than 3 dimensions.
• Large-dimensional input vectors 'projected down'
onto a 2-D map in a way that maintains the natural
ordering/similarity.
• SOM is 2-D array of neurons, all inputs arriving at
all neurons (See Fig.).
[Fig.: 2-D grid of SOM units (a 3×3 example); inputs x1 and x2 arrive at every unit.]
SOMs
• Initially each neuron has own set of (random)
weights.
• When input arrives neuron with pattern of weights
most similar to input gives largest response.
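As a sketch of this in Python/NumPy (a 10×10 grid and 3-D inputs are assumed purely for illustration):

    import numpy as np

    rng = np.random.default_rng(1)
    grid_h, grid_w, n_inputs = 10, 10, 3
    # Each unit in the 2-D map starts with its own random weight vector.
    som = rng.random((grid_h, grid_w, n_inputs))

    def best_matching_unit(x, som):
        # The unit whose weights are most similar to x (smallest Euclidean
        # distance) gives the largest response and wins.
        dist = np.linalg.norm(som - x, axis=2)
        return np.unravel_index(np.argmin(dist), dist.shape)

    bmu = best_matching_unit(np.array([0.2, 0.7, 0.1]), som)   # (row, col) of the winner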
SOMs
• Positive excitatory feedback between SOM unit
and nearest neighbours.
• Causes all the units in ‘neighbourhood’ of winner
unit to learn.
• As the distance from the winning unit increases, the
degree of excitation falls until it becomes inhibition.
• Bubble of activity (neighbourhood) around unit
with largest net input (Mexican-Hat function, See
Fig.).
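One common way to sketch a Mexican-Hat interaction is as a difference of two Gaussians, narrow excitation minus wider inhibition (the widths and amplitudes below are illustrative only):

    import numpy as np

    def mexican_hat(d, sigma_exc=1.0, sigma_inh=3.0, a_exc=1.0, a_inh=0.5):
        # Positive (excitatory) close to the winning unit, negative
        # (inhibitory) further away, as the narrow Gaussian dies off first.
        return (a_exc * np.exp(-d**2 / (2 * sigma_exc**2))
                - a_inh * np.exp(-d**2 / (2 * sigma_inh**2)))

    distances = np.arange(0, 8)       # grid distance from the winning unit
    print(mexican_hat(distances))     # excitation falls and turns into inhibition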
SOMs
• Initially each weight set to random number.
• Euclidean distance D used to find difference
between the input vector and the weights of each SOM
unit (D = square root of the sum of the squared
differences):
  D = √( Σ i=1..n (xi − wij)² )
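The same distance written as code (a sketch; x is the input vector and w_j the weight vector of one unit j):

    import numpy as np

    def euclidean_distance(x, w_j):
        # D = square root of the sum over i of (xi - wij)^2
        return np.sqrt(np.sum((x - w_j) ** 2))

    # e.g. distance between input (1, 0) and a unit with weights (0.5, 0.5)
    print(euclidean_distance(np.array([1.0, 0.0]), np.array([0.5, 0.5])))   # ≈ 0.707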
SOMs
• For a 2-dimensional problem, the distance
calculated in each neuron is:
  Σ i=1..2 (xi − wij)² = (x1 − w1j)² + (x2 − w2j)²
• Input vector simultaneously compared to all
elements in network, one with lowest D is winner.
• Update weights of all units in the neighbourhood
around the winning unit.
• As learning proceeds the size of the neighbourhood
is diminished until it contains only a single unit.
• If winner is ‘c’, neighbourhood defined as being
Mexican Hat function around ‘c’ (see Fig.).
[Fig.: winning unit c and its neighbourhood Nc.]
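A sketch of finding the winner c and its neighbourhood Nc on the grid (a simple square neighbourhood of a given radius is assumed here in place of the full Mexican-Hat profile):

    import numpy as np

    def winner_and_neighbourhood(x, som, radius):
        # Winner c: the unit with the lowest distance D to the input.
        dist = np.linalg.norm(som - x, axis=2)
        c = np.unravel_index(np.argmin(dist), dist.shape)
        # Neighbourhood Nc: all units within `radius` grid steps of c.
        rows, cols = np.indices(dist.shape)
        in_nc = (np.abs(rows - c[0]) <= radius) & (np.abs(cols - c[1]) <= radius)
        return c, in_nc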
SOMs
• Weights of units are adjusted using:
  Δwij = k(xi − wij)Yj
where Yj comes from the Mexican Hat function (controlled
by Nc).
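The update rule as a code sketch (som, x and y are assumed to be NumPy arrays; y holds the Mexican-Hat value Yj for every unit and k is the time-dependent learning rate):

    def update_weights(som, x, y, k):
        # delta wij = k * (xi - wij) * Yj, applied to every unit at once;
        # y has one value per unit and is broadcast across the weight dimension.
        som += k * (x - som) * y[..., None]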
SOMs
• k is a value which changes over time (high at start
of training, low later on).
• If unit lies within the neighbourhood of winning
unit its weight changed by difference between its
weight vector and vector x multiplied by time
factor k and function Yj.
• Each weight vector being updated rotates slightly
toward input vector x.
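• For example (illustrative numbers): with k = 0.5 and Yj = 1, a unit with weights (0.2, 0.8) shown input x = (1.0, 0.0) changes by 0.5 × ((1.0, 0.0) − (0.2, 0.8)) = (0.4, −0.4), moving its weights to (0.6, 0.4), i.e. closer to x.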
Two distinct phases in training
• Initial ordering phase: units find correct
topological order (might take ~1000 iterations,
during which k decreases from 0.9 to 0.01 and Nc
decreases from ½ the diameter of the network to 1 unit).
• Final convergence phase: accuracy of weights
improves (k may decrease from 0.01 to 0 while
Nc stays at 1 unit; this phase could be 10 to 100 times
longer, depending on desired accuracy).
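Putting the two phases together as a sketch (Python/NumPy; the map size, data, schedules and square neighbourhood are illustrative stand-ins for the real Mexican-Hat set-up):

    import numpy as np

    rng = np.random.default_rng(2)
    som = rng.random((10, 10, 2))                  # 10x10 map, 2-D inputs
    data = rng.random((500, 2))                    # training vectors

    def train_phase(som, data, steps, k_start, k_end, nc_start, nc_end):
        for t in range(steps):
            frac = t / max(steps - 1, 1)
            k = k_start + frac * (k_end - k_start)                   # k decays over time
            nc = int(round(nc_start + frac * (nc_end - nc_start)))   # Nc shrinks over time
            x = data[rng.integers(len(data))]
            dist = np.linalg.norm(som - x, axis=2)
            c = np.unravel_index(np.argmin(dist), dist.shape)
            rows, cols = np.indices(dist.shape)
            y = ((np.abs(rows - c[0]) <= nc) & (np.abs(cols - c[1]) <= nc)).astype(float)
            som += k * (x - som) * y[..., None]

    # Ordering phase: ~1000 iterations, k 0.9 -> 0.01, Nc from half the map to 1 unit.
    train_phase(som, data, 1000, 0.9, 0.01, 5, 1)
    # Convergence phase: longer, k 0.01 -> 0, Nc stays at 1 unit.
    train_phase(som, data, 10000, 0.01, 0.0, 1, 1)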
Examples
• In notes: a 2-D array of elements arranged in a square
maps a rectangular 2-D coordinate space onto the
array; the units learn to recognise their relative
positions in two-dimensional space.
• Mapping world poverty (shown on video).
• Credit card fraud detection.
SOMs
• Possible to identify which regions belong to which
class by showing the network known patterns and seeing
which areas are active.
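A sketch of this calibration step: show labelled patterns to a trained map and record which unit each one activates (the data format is assumed for illustration):

    import numpy as np

    def calibrate(som, labelled_patterns):
        # labelled_patterns: list of (input_vector, class_label) pairs.
        labels = {}
        for x, label in labelled_patterns:
            dist = np.linalg.norm(som - x, axis=2)
            c = np.unravel_index(np.argmin(dist), dist.shape)
            labels[c] = label      # the region around c responds to this class
        return labels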
Feature map classifier
• Has an additional layer (or layers) of units forming an
output layer, which can be trained by several methods
(including backpropagation) to produce a particular
output given a particular pattern of activation on the
SOM (see Fig.).
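One way to sketch such an output layer: feed the SOM's activation pattern into a small trainable layer. A single-layer delta-rule perceptron is used here purely for illustration; the notes only say that several methods, including backpropagation, are possible:

    import numpy as np

    def som_activation(som, x):
        # Activation pattern on the map: one value per unit, larger = closer.
        dist = np.linalg.norm(som - x, axis=2)
        return np.exp(-dist).ravel()               # flatten the 2-D map

    def train_output_layer(som, data, targets, epochs=50, lr=0.05):
        # targets: one-hot class vectors, one row per training pattern.
        n_units = som.shape[0] * som.shape[1]
        w_out = np.zeros((n_units, targets.shape[1]))
        for _ in range(epochs):
            for x, t in zip(data, targets):
                a = som_activation(som, x)
                y = a @ w_out                       # output layer response
                w_out += lr * np.outer(a, t - y)    # delta rule update
        return w_out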
Neural phonetic typewriter
(1986)
• Can transcribe speech into written text from
unlimited (Finnish) vocabulary in real time.
Accuracy 92-97%.
• 2-D array of units trained using 15-D inputs from
pre-processed speech.
• Units in the 2-dimensional array are allowed to
organise themselves in response to the input
vectors.
• After training, the SOM is calibrated using spectra of
phonemes as inputs.
Neural phonetic typewriter
(1986)
• Path across the network results in a phonetic
transcription of the word. This is used as input to a
rule-based system to be compared with known
words.
Summary
• Defined unsupervised learning, where no external
‘teacher’ is present.
• Discussed a self-organising neural network called
a Self-Organising Map (SOM).
• SOM uses unsupervised learning to physically
arrange its neurons so that similar stored patterns
end up close to each other and dissimilar patterns
end up far apart.