Artificial Intelligence
Neural Networks
(Chapter 9)
Outline of this Chapter
• Biological Neurons
• Neural Networks History
• Artificial Neural Network
• Perceptrons
• Multilayer Neural Network
• Applications of Neural Networks
Definition
Neural Network
A broad class of models that mimic the functioning of the human brain.
There are various classes of NN models. They differ from each other depending on:
• Problem type
• Structure of the model
• Model-building algorithm
For this discussion we are going to focus on the
Feed-forward Back-propagation Neural Network
(used for Prediction and Classification problems)
Biological Neurons
• The brain is made up of neurons (nerve cells), which have
– dendrites (inputs)
– a cell body (soma)
– an axon (outputs)
– synapses (connections between cells)
• Synapses can be excitatory (potential-increasing) or
inhibitory (potential-decreasing), and may change over time
• The synapse releases a chemical transmitter – the sum of which
can cause a threshold to be reached – causing the neuron to fire
(an electrical impulse is sent down the axon)
Biology of a neuron
A bit of biology . . .
The most important functional unit in the human brain is a class of cells called NEURONS.
[Figure: schematic of a neuron showing the Dendrites, Cell Body, Axon, and Synapse; photo of hippocampal neurons. Source: heart.cbl.utoronto.ca/~berj/projects.html]
• Dendrites – receive information
• Cell Body – processes information
• Axon – carries processed information to other neurons
• Synapse – junction between an axon end and the dendrites of other neurons
An Artificial Neuron
[Figure: an artificial neuron – inputs X1, X2, …, Xp arrive on connections with weights w1, w2, …, wp (the "dendrites"), are summed in the cell body, and the result V flows out along the "axon"; information flows from inputs to output.]
I = w1X1 + w2X2 + … + wpXp
V = f(I)
• Receives inputs X1, X2, …, Xp from other neurons or the environment
• Inputs are fed in through connections with 'weights'
• Total input = weighted sum of inputs from all sources
• A transfer function (activation function) converts the total input to an output
• The output goes to other neurons or the environment (see the sketch below)
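A minimal Python sketch of this computation (the function name is our own, and the sigmoid used here is just one possible choice for the transfer function f):

```python
import math

def neuron_output(inputs, weights):
    """Compute V = f(I), where I = w1*X1 + w2*X2 + ... + wp*Xp."""
    I = sum(w * x for w, x in zip(weights, inputs))  # total (weighted) input
    return 1.0 / (1.0 + math.exp(-I))                # sigmoid as transfer function f

# Example: a neuron with 3 inputs and 3 weights
print(neuron_output([1.0, 0.5, -1.0], [0.2, 0.8, 0.4]))
```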
Biological Neurons (cont.)
• When the inputs reach some threshold, an action potential
(electrical pulse) is sent along the axon to the outputs.
• The pulse spreads out along the axon, reaching synapses &
releasing transmitters into the bodies of other cells.
• Learning occurs as a result of the synapses' plasticity: they
exhibit long-term changes in connection strength.
• There are about 10^11 neurons and about 10^14 synapses in the
human brain(!)
• A neuron may connect to as many as 100,000 other neurons
Brain structure
• We still do not know exactly how the brain works.
e.g., we are born with about 100 billion neurons. Many die as we
progress through life, & are not replaced, but we continue to learn.
But we do know certain things about it.
• Different areas of the brain have different functions
– Some areas seem to have the same function in all humans (e.g., Broca's
region – speech & language); the overall layout is generally consistent
– Some areas vary in their function; also, the lower-level structure and function
vary greatly
[Figure: brain diagram labelling functional areas – senses; emotions, reasoning, planning, movement, & parts of speech; hearing, memory, meaning, and language; vision & ability to recognize objects.]
Brain structure (cont.)
• We don't know how different functions are "assigned" or acquired
– Partly the result of the physical layout / connections to inputs (sensors)
and outputs (effectors)
– Partly the result of experience (learning)
• We really don't understand how this neural structure / collection
of simple cells leads to action, consciousness, and thought
• Artificial neural networks are not nearly as complex as the actual
brain structure
Comparing brains with computers
                      Computer                         Human Brain
Computational units   1 CPU, 10^5 gates                10^11 neurons
Storage units         10^9 bits RAM, 10^11 bits disk   10^11 neurons, 10^14 synapses
Cycle time            10^-8 sec                        10^-3 sec
Bandwidth             10^9 bits/sec                    10^14 bits/sec
Neuron updates/sec    10^5                             10^14
There are more neurons in the human brain than there are bits in a computer's memory.
The human brain is evolving very slowly – computer memories are growing rapidly.
There are many more neurons than we can reasonably model in modern digital computers,
and they all fire in parallel.
A NN running on a serial computer requires hundreds of cycles to decide whether a single
neuron will fire – in a real brain, all neurons do this in a single step.
e.g., the brain recognizes a face in less than a second – a task that takes billions of cycles on a computer.
Neural networks are designed to be massively parallel.
The brain is effectively a billion times faster at what it does.
Neural networks History
• McCulloch & Pitts (1943) are generally recognised as the designers of the
first neural network.
• Many of their ideas are still used today (e.g., that a neuron has a threshold level
and once that level is reached the neuron fires is still the fundamental way
in which artificial neural networks operate).
• Hebb (1949) developed the first learning rule (on the premise that if two
neurons are active at the same time, the strength of the connection between them
should be increased).
• During the 50's and 60's many researchers, such as Minsky & Papert,
worked on the perceptron (an NN model).
• 1969 saw the death of neural network research for about 15 years.
• Only in the mid-80's (Parker and LeCun) was NN research revived.
Artificial Neural Network
• (Artificial) Neural networks are made up of nodes/units connected by
links, which have
– input edges, each with a numeric weight
– output edges (with weights)
– an activation level (a function of the inputs)
The computation is split into 2 components:
1. A linear component, called the input function (ini) – computes the weighted
sum of the unit's input values.
2. A non-linear component, called the activation function (g) – transforms the
weighted sum into the final value that serves as the unit's activation value:
ai = g(ini) = g( Σj wj,i aj )
• Some nodes are inputs (perception), some are outputs (action)
Modeling a Neuron
ini = Σj wj,i aj
Each unit does a local computation based on inputs from its neighbours &
computes a new activation level, which it sends along each of its output links.
aj: activation value of unit j
wj,i: weight on the link from unit j to unit i
ini: weighted sum of inputs to unit i
ai: activation value of unit i
g: activation function
Activation Functions
• Different models are obtained by using different mathematical functions for g.
• 3 common choices are the threshold (step) function, the sign function, and the
sigmoid (logistic) function:
Step(x) = 1 if x >= 0, else 0
Sign(x) = +1 if x >= 0, else -1
Sigmoid(x) = 1/(1+e^-x)
(the sigmoid is used when we train the network by adjusting the weights to minimize
error; here e is the base of the natural logarithm, not an error term)
1 represents the firing of a pulse down the axon, & 0 represents no firing.
t (threshold) represents the minimum total weighted input needed to cause the neuron
to fire. A sketch of these functions in code follows.
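These three choices translate directly into code; a minimal sketch:

```python
import math

def step(x):
    """Step(x) = 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

def sign(x):
    """Sign(x) = +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

# A threshold t can be folded in by shifting the input, e.g. step(total_input - t).
```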
ANN (Artificial Neural Network) – Feed-forward Network
A collection of neurons forms a 'Layer'.
[Figure: feed-forward network – inputs X1, X2, X3, X4 enter the Input Layer, pass through a Hidden Layer, and produce outputs y1, y2 at the Output Layer; information flows from input to output.]
• Input Layer – each neuron gets ONLY one input, directly from outside
• Hidden Layer – connects the Input and Output layers
• Output Layer – the output of each neuron goes directly to the outside
ANN (Artificial Neural Network) – Feed-forward Network
The number of hidden layers can be none, one, or more (see the forward-pass sketch below).
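A minimal sketch of such a layered forward pass (the sigmoid activation and all the weights below are arbitrary illustrative choices, not values from the lecture):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weight_matrix):
    """One layer: each row holds the incoming weights of one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)))
            for row in weight_matrix]

def feed_forward(x, layers):
    """Propagate x through each layer in turn (input -> hidden -> output)."""
    for weight_matrix in layers:
        x = layer_forward(x, weight_matrix)
    return x

# 4 inputs -> 3 hidden units -> 2 outputs (y1, y2)
hidden_w = [[0.1, 0.2, -0.1, 0.4], [0.3, -0.2, 0.5, 0.1], [-0.4, 0.2, 0.1, 0.3]]
output_w = [[0.2, -0.5, 0.3], [0.4, 0.1, -0.2]]
print(feed_forward([1.0, 0.0, 1.0, 0.5], [hidden_w, output_w]))
```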
Implementing logical functions
• McCulloch and Pitts: the Boolean functions AND, OR, & NOT can each be
represented by units with suitable weights & thresholds.
• We can use these units to build a network to compute any Boolean function.
(t = threshold, or the value of the bias weight that determines the threshold needed
to cause the neuron to fire; see the sketch below)
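A small sketch of such units in code (the weight/threshold values below are one common choice, not the only one):

```python
def unit(inputs, weights, t):
    """McCulloch-Pitts style unit: fire (1) iff the weighted sum reaches threshold t."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= t else 0

# Suitable weights & thresholds for each Boolean function:
AND = lambda a, b: unit([a, b], [1, 1], t=1.5)
OR  = lambda a, b: unit([a, b], [1, 1], t=0.5)
NOT = lambda a:    unit([a],    [-1],   t=-0.5)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 0) == 0 and OR(1, 0) == 1
assert NOT(0) == 1 and NOT(1) == 0
```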
Network Structure
There are 2 main categories of NN structure:
• Feed-forward/acyclic networks:
– Allow signals to travel one way only, from input to output. There is no
feedback (no loops).
– Tend to be straightforward networks that associate inputs with outputs (e.g.,
pattern recognition).
– Usually arranged in layers – each unit receives input only from units in the
preceding layer; no links between units in the same layer.
– single-layer perceptrons
– multi-layer perceptrons
• Recurrent/cyclic networks:
– Feed their outputs back into their own inputs.
– Recurrent neural nets have directed cycles with delays.
– The links can form arbitrary topologies.
• The brain is a recurrent network – activation is fed back to the units that caused it.
Feed-forward example
[Figure: a simple NN with 2 inputs (units 1, 2), 2 hidden units (3, 4) & 1 output unit (5); the hidden units have no direct connection to the outside world.]
a5 = g(W3,5 · a3 + W4,5 · a4)
   = g(W3,5 · g(W1,3 · a1 + W2,3 · a2) + W4,5 · g(W1,4 · a1 + W2,4 · a2))
By adjusting the weights, we change the function that the network
represents: this is how learning occurs in a NN! (A sketch follows.)
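The nested expression can be evaluated directly in code (the weight values here are arbitrary; g is again the sigmoid):

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))   # activation function

def a5(a1, a2, W13, W23, W14, W24, W35, W45):
    a3 = g(W13 * a1 + W23 * a2)          # hidden unit 3
    a4 = g(W14 * a1 + W24 * a2)          # hidden unit 4
    return g(W35 * a3 + W45 * a4)        # output unit 5

print(a5(1.0, 0.0, W13=0.5, W23=-0.4, W14=0.9, W24=1.0, W35=-1.2, W45=1.1))
```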
Perceptron
• A perceptron is a network with all inputs connected
directly to the output.
• This is called a single-layer NN (Neural
Network) or a Perceptron Network.
• It is a simple form of NN that is used for
classification of linearly separable
patterns (i.e., if there are 2 classes, we can
separate them with a line, with each class
on a different side of the line).
Perceptron or a Single-layer NN
• A feed-forward NN with no hidden units.
• Output units all operate separately – no shared weights.
• First studied in the 50's. Other networks were known
about, but the perceptron was the only one capable of
learning, and thus all research was concentrated in this area.
• A single weight only affects one output, so we can limit
our study to a model with a single output unit.
• Notation can then be simpler, i.e.:
O = Step0( Σj Wj Ij )
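The slides note that the perceptron was the only early network capable of learning. The classic perceptron learning rule (standard material, though not spelled out on the slide) can be sketched as:

```python
def predict(weights, x):
    """O = Step0(sum_j Wj * Ij); the threshold is fixed at 0 via a bias input."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= 0 else 0

def train_perceptron(samples, n_inputs, alpha=0.1, epochs=50):
    """Classic perceptron rule: w_j += alpha * (target - output) * x_j."""
    weights = [0.0] * (n_inputs + 1)          # +1 for the bias weight
    for _ in range(epochs):
        for x, target in samples:
            x = [1.0] + list(x)               # bias input fixed at 1
            o = predict(weights, x)
            weights = [w + alpha * (target - o) * xi
                       for w, xi in zip(weights, x)]
    return weights

# Learn OR, which is linearly separable:
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(data, n_inputs=2)
print([predict(w, [1.0, *x]) for x, _ in data])   # -> [0, 1, 1, 1]
```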
Multilayer NN
• A network with one or more layers of hidden units.
• Layers are usually fully connected.
• The number of hidden units is typically chosen by hand.
A back-propagation training sketch follows.
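Since the chapter's focus is the feed-forward back-propagation network, here is a minimal back-propagation sketch for a 2-2-1 network learning XOR (a textbook-style illustration, not the lecture's own code; the learning rate, epoch count, and random initialisation are arbitrary choices):

```python
import math, random

def g(x):  return 1.0 / (1.0 + math.exp(-x))   # sigmoid activation
def dg(y): return y * (1.0 - y)                # its derivative, in terms of the output

def forward(w_h, w_o, x1, x2):
    xs = (x1, x2, 1.0)                                           # inputs + bias
    h = [g(sum(w * x for w, x in zip(row, xs))) for row in w_h]  # hidden layer
    o = g(sum(w * a for w, a in zip(w_o, h + [1.0])))            # output unit
    return h, o

random.seed(1)
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR

alpha = 0.5
for _ in range(10000):
    for (x1, x2), target in data:
        h, o = forward(w_h, w_o, x1, x2)
        delta_o = (target - o) * dg(o)                       # output error term
        delta_h = [dg(h[i]) * w_o[i] * delta_o for i in range(2)]
        hb, xs = h + [1.0], (x1, x2, 1.0)
        w_o = [w + alpha * delta_o * a for w, a in zip(w_o, hb)]
        w_h = [[w + alpha * delta_h[i] * x for w, x in zip(w_h[i], xs)]
               for i in range(2)]

# Outputs should approach [0, 1, 1, 0]; convergence depends on the initialisation.
print([round(forward(w_h, w_o, *x)[1], 2) for x, _ in data])
```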
Summary
• Most brains have lots of neurons; each neuron ≈ a linear threshold unit (?)
• Perceptrons (one-layer networks) are insufficiently expressive
• Multi-layer networks are sufficiently expressive; they can be trained by gradient
descent, i.e., error back-propagation
• Many applications: speech, driving, handwriting, fraud detection, etc.
• The engineering, cognitive modelling, and neural system modelling
subfields have largely diverged
End of Chapter 9