Perceptrons

CSM10: Introduction to neural
networks and historical
background
• Tony Browne: [email protected]
Motivation
• Conventional (rule-based) systems perform badly
at some tasks (e.g. face recognition - may fail to
recognise the same face if it is smiling
(brittleness)).
• Many problems where we don’t know the
solution, would like a system to work out the
solution for us (i.e. learn a solution from the
available data).
• How does the human brain do this?
• Can we copy what the brain does?
Human brain made up of 10^11 neurons
Brain cells connected
• Each brain cell has lots of connections with
other brain cells by means of nerve fibres
(the wiring connecting brain cells together).
There are about 4 million miles of nerve
fibres in each brain. Some fibres may have
up to 10,000 branches in them.
Neurons
• Each brain cell has lots of connections with
other cells, possibly over 25,000. The
junctions at the end of the neurones are
called synapses.
• Axon - the nerve fibre that carries signals away
from the cell body. A neurone typically has a single
axon, which may branch many times.
• Vesicles - these contain the transmitter
substances.
Neurons
• Transmitters - these are small chemicals
used by brain cells as messengers. They are
stored in the vesicles in the nerve ending
ready to be released
• Receptors - these are structures on the
surface of the receiving cell which have a
space designed just for the transmitter (if
the transmitter is a key, receptors are the
lock into which they fit)
Neurons
• Enzymes - these surround the synapse and
break down any spare transmitter that might
leak out to other synapses nearby.
• Electrical signal - This is the way in which
one brain cell sends a message to another.
The signal travels down the nerve fibre
rather like an electrical "Mexican Wave".
Signal transmission
• 1. A brain cell decides to send a message to
another cell in order to make something happen
e.g. tighten a muscle, release a hormone, think
about something, pass on a message etc.
• 2. An electrical impulse is sent from the brain cell
down one of the nerve fibres/neurones towards the
end. It travels at about 120 miles per hour.
Signal transmission
• 3. This message or impulse arrives at the end of
the nerve fibre. When it arrives, a chemical
transmitter is released from the nerve end.
• 4. The transmitter then travels across the gap
between the first nerve fibre and the
next/receiving one.
• 5. The transmitter hits a receptor on the other side.
It fits into it just like a key fitting into a lock.
Signal transmission
• 6. When the transmitter hits the receptor, the
receptor changes shape. This causes changes
inside the nerve ending which sets off an electrical
message in that nerve fibre on to the next
brain/nerve cell. This sequence then carries on
until the effect occurs e.g. the muscle moves etc.
Signal transmission
• 7. The transmitter is either broken down by
enzymes (10%) and removed or taken back up
again into the nerve ending (i.e. recycled) - a
process known as re-uptake.
• 8. The nerve fibre and synapse are then ready for
the next message
Important points
• The passage of messages works in one direction
only
• There is only one type of transmitter per synapse
• The transmitter allows an electrical message to be
turned into a chemical message and back into an
electrical message.
Transmitter substances
• Over 80 different transmitter substances are
known in the brain; each nerve ending
only has one type
• In many mental health problems, it is
known that some of these transmitters get
out of balance e.g. you have too much or
too little of a particular transmitter.
Serotonin (5HT)
• In the body, 5-HT is involved with blood pressure
and gut control.
• In the brain, it controls mood, emotions,
sleep/wake, feeding, temperature regulation, etc.
• Too much serotonin and you feel sick, less hungry,
get headaches or migraines
• Too little and you feel depressed, drowsy etc.
• Antidepressants (e.g. Prozac) boost levels of serotonin
Dopamine
• Three main pathways of dopamine neurones in the
brain. One controls muscle tension and another
controls emotions, perceptions, sorting out what is
real/important/imaginary etc.
• Not enough dopamine in the first group and your
muscles tighten up and shake (Parkinson's
disease).
• Too much dopamine in the second group gives too
much perception e.g. you may see, hear or
imagine things that are not real (schizophrenia)
Noradrenaline (NA)
(sometimes called "norepinephrine" or NE)
• In the body, it controls the heart and blood
pressure, in the brain, it controls sleep,
wakefulness, arousal, mood, emotion and drive
• Too much noradrenaline and you may feel
anxious, jittery etc.
• Too little and you may feel depressed, sedated,
dizzy, have low blood pressure etc.
• Some antidepressants affect NA
Acetylcholine (Ach)
• In the body, acetylcholine passes the messages
which make muscles contract.
• In the brain, it controls arousal, the ability to use
memory, learning tasks etc.
• Too much in your body and your muscles tighten
up.
• Too little can produce dry mouth, blurred vision
and constipation, as well as becoming confused,
drowsy, slow at learning etc.
Glutamate
• Acts as an ‘accelerator’ in the brain
• Too much and you become anxious, excited and
some parts of your brain may become overactive.
• Too little and you may become drowsy or sedated
• Some people may be sensitive to glutamate in
food (monosodium glutamate)
GABA
• Acts as a ‘brake’ in the brain
• Too much and you become drowsy or sedated.
• Too little and you may become anxious and
excited
• Valium and other sedatives act on GABA
Computational modelling
• We can model what happens in neurons/synapses
in software and use these models to answer
interesting questions
• How does the mind work? (how can we model
cognitive processes?)
• Can we use such machine learning systems to
solve problems we do not know the solutions to?
• Such problems include bioinformatics (why does a
particular genetic sequence do what it does?), drug
design, machine vision, robotics.
Two main types of learning
• Supervised learning – a ‘teaching’ target is
provided to indicate to the network what the
correct output should be
• Unsupervised learning – no ‘teaching’ input –
network self-organises into a solution
Perceptrons
• Perceptron (1950's). See Fig. Had three areas of
units, a sensory area, an association area and a
response area.
• Impulses generated by the sensory points are
transmitted to the units in the association layer,
each association layer unit is connected to a
random set of units in the sensory layer.
Perceptrons
• Connections may be excitatory (+1), inhibitory
(–1) or absent (0).
• When a pattern appears on the sensory layer, an
association layer unit becomes active if the sum of
its inputs exceeds a threshold value. It then
produces an output which is sent to the next layer
of units, the response units.
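A minimal sketch of one association layer unit, as described above. The function name, the six-point sensory pattern and the choice of 0 as the inactive output are assumptions made for illustration; the slides do not specify them.

```python
import random

def association_unit_output(sensory_pattern, connections, threshold):
    """Return 1 if the connection-weighted sum of inputs exceeds the threshold, else 0.

    Assumption: an inactive unit outputs 0 (the slides only say an active
    unit 'produces an output').
    """
    total = sum(c * s for c, s in zip(connections, sensory_pattern))
    return 1 if total > threshold else 0

# Hypothetical 6-point sensory area. Each connection is excitatory (+1),
# absent (0) or inhibitory (-1), chosen at random.
rng = random.Random(0)
connections = [rng.choice([-1, 0, 1]) for _ in range(6)]
pattern = [1, 0, 1, 1, 0, 0]
print(connections, association_unit_output(pattern, connections, threshold=0))
```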
Perceptrons
• Response units connected randomly to the
association layer units with inhibitory
feedback.
Perceptrons
• The response layer units respond in a similar way
to the association layer units: if the sum of their
inputs exceeds a threshold they give an output
value of +1, otherwise their output is -1.
• The Perceptron is a learning device, in its initial
configuration it is incapable of distinguishing
patterns, but can learn this capability through a
training process.
Perceptrons
• During training a pattern is applied to the sensory
area, and the stimulus is propagated through the
layers until a response layer unit is activated. If the
correct response layer unit is activated, the output
of the corresponding association layer units is
increased; if the incorrect response layer unit is
active, the output of the corresponding association
layer units is decreased.
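A rough sketch of training by error correction. Note that this uses the standard textbook perceptron rule (adjust weights only when the output is wrong), which is a simplification and not the exact reinforcement scheme described above. The function names, the bias term, the learning rate and the AND data set are all assumptions for illustration.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """samples: list of (inputs, target) pairs with target in {+1, -1}."""
    n = len(samples[0][0])
    weights = [0.0] * n
    bias = 0.0  # assumption: the threshold is learned as a bias term
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            net = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if net > 0 else -1
            if output != target:
                # Move the weights towards the target for this pattern.
                weights = [w + lr * target * xi for w, xi in zip(weights, x)]
                bias += lr * target
                errors += 1
        if errors == 0:  # every pattern classified correctly
            break
    return weights, bias

def predict(weights, bias, x):
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net > 0 else -1

# Logical AND with +1/-1 targets: a linearly separable problem.
AND_SAMPLES = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])
```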
Perceptrons
• Perceptron convergence theorem: states that if a
pattern can be learned by the Perceptron then it
will be learned in a finite number of training
cycles.
• ADALINEs (1960’s) were similar to Perceptrons but
used a least mean squares (LMS) error based learning
mechanism
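A minimal sketch of least mean squares learning in the style of an ADALINE, assuming the usual delta rule in which the error is measured on the linear net input before thresholding. The function names, learning rate, epoch count and AND data set are illustrative assumptions, not taken from the slides.

```python
def train_adaline(samples, epochs=500, lr=0.05):
    """samples: list of (inputs, target) pairs with target in {+1, -1}."""
    n = len(samples[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            net = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = target - net  # error on the linear output, not the thresholded one
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def classify(weights, bias, x):
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if net >= 0 else -1

# Logical AND with +1/-1 targets (an assumed example problem).
AND_DATA = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
adaline_weights, adaline_bias = train_adaline(AND_DATA)
print([classify(adaline_weights, adaline_bias, x) for x, _ in AND_DATA])
```

Unlike the perceptron rule, the weights are adjusted on every pattern, in proportion to how far the net input is from the target.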
Perceptrons
• Problems with Perceptrons: (Minsky and Papert,
1969) - the end of neural networks research?
• One of the main points was that Perceptrons can
differentiate patterns only if they are linearly
separable (places a severe restriction on the
applicability of the Perceptron).
Linear inseparability
e.g. XOR problem:
Inputs    Target
x1  x2
0   0     0
0   1     1
1   0     1
1   1     0
Linear inseparability
• A simplified Perceptron, with x1 and x2
representing inputs from the sensory area, two
units in the association layer and one in the
response layer is shown in Fig.
• The output of the output unit is 1 if its net
input is greater than or equal to the threshold T, and
0 if it is less than this threshold. This type of node is
called a linear threshold unit.
f(net) = 1 if net ≥ T
f(net) = 0 if net < T
The net input to the output node is:
net = w1 x1 + w2 x2
Linear inseparability
• The problem is to select values of the weights
such that each pair of input values results in a
proper output value, if we refer to next Fig. we see
that this cannot be done.
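The claim above can be illustrated (not proved) with a small search. The sketch below tries every combination of w1, w2 and T on a coarse grid for a single linear threshold unit. The grid, the function names and the AND comparison are assumptions for illustration.

```python
from itertools import product

def threshold_unit(x1, x2, w1, w2, T):
    """Linear threshold unit: 1 if w1*x1 + w2*x2 >= T, else 0."""
    return 1 if w1 * x1 + w2 * x2 >= T else 0

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

def find_weights(truth_table, grid):
    """Return the first (w1, w2, T) on the grid that reproduces the table, or None."""
    for w1, w2, T in product(grid, repeat=3):
        if all(threshold_unit(x1, x2, w1, w2, T) == target
               for (x1, x2), target in truth_table.items()):
            return (w1, w2, T)
    return None

grid = [i / 2 for i in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5
print("AND:", find_weights(AND, grid))  # some solution exists
print("XOR:", find_weights(XOR, grid))  # None: no setting works
```

A finer or wider grid gives the same result for XOR, since the four points are not linearly separable.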
Linear inseparability
• There is no way to arrange the position of the line
so that the correct two points for each class both
lie in the same region.
• Hyperplanes: Could partition the space correctly
if we had three regions, one region would belong
to one output class, and the other two would
belong to another output class (there is no reason
why disjoint regions cannot belong to the same
output class) as in next Fig.
Linear inseparability
• Can achieve this by adding an extra layer of
weights to the network. This expands the
dimensionality of the space in which the XOR
problem is represented and allows us to produce
a hyperplane which correctly partitions the space.
An example with an appropriate set of connection
weights and thresholds is shown in following Fig.
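A minimal sketch of a network with an extra layer that solves XOR. The weights and thresholds here are hand-picked assumptions for illustration and are not necessarily those in the figure: one middle-layer unit computes OR, the other computes AND, and the output unit fires when OR is on but AND is off.

```python
def step(net, T):
    """Linear threshold unit: 1 if net >= T, else 0."""
    return 1 if net >= T else 0

def xor_network(x1, x2):
    # Middle layer (assumed weights and thresholds).
    h_or = step(1 * x1 + 1 * x2, T=0.5)   # on if at least one input is on
    h_and = step(1 * x1 + 1 * x2, T=1.5)  # on only if both inputs are on
    # Output layer: excited by the OR unit, inhibited by the AND unit.
    return step(1 * h_or - 1 * h_and, T=0.5)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_network(x1, x2))
```

In the space of the two middle-layer outputs, the four input patterns become linearly separable, so a single threshold unit can finish the job.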
Multi-Layered Perceptrons
• However, no learning algorithm was known that
could modify the extra layer of weights
• So research in Perceptrons almost died out until
the mid 1980’s
• Then an algorithm was developed by two
American psychologists that could modify these
other weights
• Opened the way for learning in Multi-Layered
Perceptrons (MLPs)