
Financial Informatics – XIV:
Basic Principles
Khurshid Ahmad,
Professor of Computer Science,
Department of Computer Science
Trinity College,
Dublin-2, IRELAND
November 19th, 2008.
https://www.cs.tcd.ie/Khurshid.Ahmad/Teaching.html
1
Neural Networks
Artificial Neural Networks
The basic premise of the
course, Neural Networks, is
to introduce our students to
an alternative paradigm for
building information
systems.
2
Artificial Neural Networks
An ANN system can be characterised by
• its ability to learn;
• its dynamic capability; and
• its interconnectivity.
3
Artificial Neural Networks:
An Operational View
[Figure: model of neuron k. Input signals x1, x2, x3, x4 arrive over links with weights wk1, wk2, wk3, wk4; a summing junction (Σ) with bias bk feeds an activation function, which emits the output signal yk.]

Net input or weighted sum:
net = w1*x1 + w2*x2 + w3*x3 + w4*x4

Neuronal output
identity function: y1 = net
non-negative identity function:
y1 = 0 if net < THRESHOLD (θ)
y1 = net if net ≥ THRESHOLD (θ)
4
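A minimal Python sketch of the neuron above; the two activation rules come from the slide, while the weight and input values are invented for illustration.

```python
# Weighted sum followed by the two activations on this slide.
# The weights and inputs below are illustrative values only.

def net_input(weights, inputs):
    """net = w1*x1 + w2*x2 + ... : the summing junction."""
    return sum(w * x for w, x in zip(weights, inputs))

def identity(net):
    """Identity activation: y1 = net."""
    return net

def non_negative_identity(net, threshold=0.0):
    """y1 = 0 below the threshold, y1 = net at or above it."""
    return net if net >= threshold else 0.0

net = net_input([0.5, -0.3, 0.8, 0.1], [1.0, 2.0, 0.5, -1.0])
print(net, identity(net), non_negative_identity(net))  # ~0.2 for all three
```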
Artificial Neural Networks:
An Operational View
A neuron is an information processing unit forming the key ingredient of a
neural network: the diagram above is a model of a biological neuron.
There are three key ingredients of this neuron, labelled xk, which is
connected to the (rest of the) neurons in the network, labelled
x1, x2, x3, … xj.
A set of links, the biological equivalent of synapses, which the kth neuron
has with the (rest of the) neurons in the network. Note that each link
has a WEIGHT, denoted
wk1, wk2, … wkj,
where the first subscript (k in this case) denotes the recipient neuron and
the second subscript (1, 2, 3, … j) denotes the neuron transmitting to
the recipient. The synaptic weight wkj may lie in a range that
includes negative (inhibitory) as well as positive (excitatory) values.
(From Haykin 1999:10-12)
5
Artificial Neural Networks:
An Operational View
The kth neuron adds up the inputs of all the
transmitting neurons at the summing junction or
adder, denoted by Σ. The adder acts as a linear
combiner and generates a weighted sum,
usually denoted by uk:
uk = wk1*x1 + wk2*x2 + wk3*x3 + … + wkj*xj;
the bias (bk) has the effect of increasing or
decreasing the net input to the activation function,
depending on the value of the bias.
(From Haykin 1999:10-12)
6
ANN’s: an Operational View
Finally, the linear combination, denoted as vk = uk + bk, is passed
through the activation function, which engenders the non-linear
behaviour seen in biological neurons: the inputs to
and outputs from a given neuron show a complex, often non-linear,
relationship.
For example, if the output from the adder is positive or zero, then the
neuron will emit a signal,
yk = 1 if vk ≥ 0;
however, if the output from the adder is negative, there will be no
output,
yk = 0 if vk < 0.
There are other models of the activation function, as we will see later.
(From Haykin 1999:10-12)
7
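The step rule above can be written directly in Python; this is a hedged sketch of Haykin's description, with illustrative weights, inputs and bias.

```python
# Linear combiner uk, bias bk, then a step activation, as described above.

def neuron_output(weights, inputs, bias):
    u = sum(w * x for w, x in zip(weights, inputs))  # adder output uk
    v = u + bias                                     # vk = uk + bk
    return 1 if v >= 0 else 0                        # yk = 1 if vk >= 0

# Illustrative values: u = -0.1, v = 0.1, so the neuron emits 1.
print(neuron_output([0.5, -0.2], [1.0, 3.0], bias=0.2))
```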
ANN’s: an Operational View
[Figure: the same model neuron; inputs x1 … x4, weights wk1 … wk4, summing junction (Σ) with bias bk, activation function, output signal yk.]

Net input or weighted sum:
net = w1*x1 + w2*x2 + w3*x3 + w4*x4

Neuronal output
constant function:
y1 = 0 if net < THRESHOLD (θ)
y1 = C if net ≥ THRESHOLD (θ)
8
ANN’s: an Operational View
[Figure: the same model neuron; inputs x1 … x4, weights wk1 … wk4, summing junction (Σ) with bias bk, activation function, output signal yk.]

Net input or weighted sum:
net = w1*x1 + w2*x2 + w3*x3 + w4*x4

Neuronal output
saturated output:
y1 = 0 if net < THRESHOLD (θ)
y1 = 1 if net ≥ THRESHOLD (θ)
9
ANN’s: an Operational View
Discontinuous Output
[Figure: the model neuron with its step activation plotted as f(net) against net: no output below the threshold (θ), a (normalised) output (e.g. 1) at or above it.]

Net input or weighted sum:
net = w1*x1 + w2*x2 + w3*x3 + w4*x4

Neuronal output
saturated output:
y1 = 0 if net < THRESHOLD (θ)
y1 = 1 if net ≥ THRESHOLD (θ)
10
ANN’s: an Operational View
[Figure: the model neuron schematic, as before.]
The notion of a discontinuous function simulates the
fundamental observation that biological neurons usually fire only if
there is ‘enough’ stimulus available in the environment.
But a discontinuous output is biologically implausible, so there
must be some degree of continuity in the output if
an artificial neuron is to retain a degree of biological
plausibility.
11
ANN’s: an Operational View
Pseudo-Continuous Output
[Figure: the model neuron with a pseudo-continuous activation plotted as f(net) against net: output α below the threshold (θ), output β above the saturation threshold (θ′), and a ramp between the two.]

Net input or weighted sum:
net = w1*x1 + w2*x2 + w3*x3 + w4*x4

Neuronal output
y1 = α if net < THRESHOLD (θ)
y1 = β if net ≥ Saturation THRESHOLD (θ′)
y1 = f(θ, θ′, α, β) otherwise
12
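The pseudo-continuous activation can be sketched as a piecewise-linear function. The slide only writes y1 = f(θ, θ′, α, β) for the middle region, so the linear ramp between the two thresholds is an assumption.

```python
# Piecewise-linear ("pseudo-continuous") activation: output alpha below
# the threshold, beta above the saturation threshold, and an assumed
# linear ramp in between.

def piecewise_linear(net, theta, theta_prime, alpha=0.0, beta=1.0):
    if net < theta:
        return alpha                     # no-output region
    if net >= theta_prime:
        return beta                      # saturated-output region
    # ramp from alpha at theta up to beta at theta_prime
    return alpha + (beta - alpha) * (net - theta) / (theta_prime - theta)

for net in (-1.0, 0.5, 2.0):
    print(net, piecewise_linear(net, theta=0.0, theta_prime=1.0))
```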
ANN’s: an Operational View
[Figure: a schematic for an 'electronic' neuron; input signals x1 … x4 with weights wk1 … wk4, a summing junction (Σ) with bias bk, an activation function, and output signal yk.]
13
ANN’s: an Operational View
Neural Nets as directed graphs
A directed graph is a geometrical object
consisting of a set of points (called
nodes) along with a set of directed line
segments (called links) between them.
A neural network is a parallel
distributed information processing
structure in the form of a directed graph.
14
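The directed-graph view can be made concrete with a small sketch: nodes plus weighted, directed links held in an adjacency structure. The two-input, one-output topology below is invented for illustration.

```python
# A neural network as a directed graph: an adjacency dict mapping each
# node to its outgoing (destination, weight) links.

network = {
    "x1": [("y", 0.8)],    # link x1 -> y with weight 0.8
    "x2": [("y", -0.4)],   # link x2 -> y with weight -0.4
    "y":  [],              # output node: no outgoing links
}

def propagate(network, activations):
    """Send each node's activation one step along its directed links."""
    incoming = {node: 0.0 for node in network}
    for src, links in network.items():
        for dst, weight in links:
            incoming[dst] += weight * activations.get(src, 0.0)
    return incoming

print(propagate(network, {"x1": 1.0, "x2": 1.0}))  # y receives 0.4
```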
ANN’s: an Operational View
[Figure: a processing unit with input connections, a single output connection, and the fan-out of that output to other units.]
15
ANN’s: an Operational View
A neural network comprises:
• a set of processing units;
• a state of activation;
• an output function for each unit;
• a pattern of connectivity among units;
• a propagation rule for propagating patterns of activities through the network;
• an activation rule for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit;
• a learning rule whereby patterns of connectivity are modified by experience;
• an environment within which the system must operate.
16
The McCulloch-Pitts Network
McCulloch and Pitts demonstrated that any logical
function can be duplicated by some network of all-or-none
neurons, referred to as an artificial neural network
(ANN).
Thus, an artificial neuron can be embedded into a
network in such a manner as to fire selectively in
response to any given spatio-temporal array of firings
of other neurons in the ANN.
17
The McCulloch-Pitts Network
Demonstrates that any logical function can be implemented by
some network of neurons.
• There are rules governing the excitatory and inhibitory
pathways.
• All computations are carried out in discrete time intervals.
• Each neuron obeys a simple form of a linear threshold law: the
neuron fires whenever at least a given (threshold) number
of excitatory pathways, and no inhibitory pathways,
impinging on it are active from the previous time period.
• If a neuron receives a single inhibitory signal from an active
neuron, it does not fire.
• The connections do not change as a function of experience.
Thus the network deals with performance but not learning.
18
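The threshold law above translates almost line for line into code; the sketch below is a plain restatement of the rules, not McCulloch and Pitts' own notation.

```python
# A McCulloch-Pitts cell fires at time t+1 iff at least `threshold`
# excitatory pathways, and no inhibitory pathways, were active at time t.

def mp_cell_fires(excitatory_active, inhibitory_active, threshold):
    """Counts of active excitatory/inhibitory pathways from the previous step."""
    if inhibitory_active > 0:
        return False          # a single inhibitory signal vetoes firing
    return excitatory_active >= threshold

print(mp_cell_fires(2, 0, threshold=2))  # True: enough excitation
print(mp_cell_fires(3, 1, threshold=2))  # False: inhibition is absolute
```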
The McCulloch-Pitts Network
Computations in a McCulloch-Pitts Network
‘Each cell is a finite-state machine and accordingly operates in
discrete time instants, which are assumed synchronous among all
cells. At each moment, a cell is either firing or quiet, the two
possible states of the cell’ – firing state produces a pulse and quiet
state has no pulse. (Bose and Liang 1996:21)
‘Each neural network built from McCulloch-Pitts cells is a finite-state
machine; conversely, every finite-state machine is equivalent to, and can
be simulated by, some neural network.’ (ibid 1996:23)
‘The importance of the McCulloch-Pitts model is its applicability in
the construction of sequential machines to perform logical
operations of any degree of complexity. The model focused on
logical and macroscopic cognitive operations, not detailed
physiological modelling of the electrical activity of the nervous
system. In fact, this deterministic model with its discretization of
time and summation rules does not reveal the manner in which
biological neurons integrate their inputs.’ (ibid 1996:25)
19
The McCulloch-Pitts Network
Consider a McCulloch-Pitts network which
can act as a minimal model of the sensation of
heat felt when a cold object is held to the skin and
then removed, or left on permanently.
Each cell has a threshold of TWO, and hence fires
whenever it receives two excitatory (+) and no
inhibitory (-) signals from other cells at the
previous time step.
20
The McCulloch-Pitts Network
Heat Sensing Network
[Figure: heat-sensing network. A heat receptor (cell 1) and a cold receptor (cell 2) feed the hidden cells A and B, which in turn feed the ‘Hot’ output (cell 3) and the ‘Cold’ output (cell 4); ‘+’ marks excitatory links (inhibitory links are marked ‘-’).]
21
The McCulloch-Pitts Network
Heat Sensing Network
Truth tables of the firing neurons when the
cold object contacts the skin and is then
removed:

Time  Cell 1   Cell 2   Cell A    Cell B    Cell 3    Cell 4
      (INPUT)  (INPUT)  (HIDDEN)  (HIDDEN)  (OUTPUT)  (OUTPUT)
1     No       Yes      No        No        No        No
2     No       No       Yes       No        No        No
3     No       No       No        Yes       No        No
4     No       No       No        No        Yes       No

[Figure: the heat-sensing network, as on the previous slide.]
22
The McCulloch-Pitts Network
Heat Sensing Network
The ‘feel hot’/‘feel cold’ neurons show how to create an
OUTPUT UNIT RESPONSE to given INPUTS that
depends ONLY on the previous values. This is
known as TEMPORAL CONTRAST
ENHANCEMENT.
The absence or presence of a stimulus in the
PREVIOUS time cycle plays a major role here.
The McCulloch-Pitts Network demonstrates how
this ENHANCEMENT can be simulated using an
ALL-OR-NONE network.
23
The McCulloch-Pitts Network
Heat Sensing Network
Truth tables of the firing neurons for the case
when the cold object is left in contact with
the skin – a simulation of temporal contrast
enhancement
Time  Cell 1   Cell 2   Cell A    Cell B    Cell 3    Cell 4
      (INPUT)  (INPUT)  (HIDDEN)  (HIDDEN)  (OUTPUT)  (OUTPUT)
1
2
3

(The entries are left blank on this slide; the completed table is on the next slide.)
24
The McCulloch-Pitts Network
Heat Sensing Network
Truth tables of the firing neurons for the case
when the cold object is left in contact with
the skin – a simulation of temporal contrast
enhancement:

Time  Cell 1   Cell 2   Cell A    Cell B    Cell 3    Cell 4
      (INPUT)  (INPUT)  (HIDDEN)  (HIDDEN)  (OUTPUT)  (OUTPUT)
1     No       Yes      No        No        No        No
2     No       Yes      Yes       No        No        No
3     No       Yes      Yes       No        No        Yes

[Figure: the heat-sensing network, as before.]
25
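The truth tables above can be reproduced with a short simulation. The wiring below is a hedged reconstruction from the figure and the two tables (it is not spelled out in the slides): cell 1 doubly excites cell 3; cell 2 doubly excites A, singly excites 4 and inhibits B; A doubly excites B and singly excites 4; B doubly excites 3.

```python
# Hedged reconstruction of the heat-sensing network. Every cell has
# threshold 2; the wiring is inferred from the figure and the truth
# tables, not stated explicitly in the slides.
# exc[c]: list of (source, number of excitatory links) into cell c.
# inh[c]: list of sources with an inhibitory link into cell c.

THRESHOLD = 2
exc = {
    "A": [("2", 2)],             # cold receptor doubly excites A
    "B": [("A", 2)],             # A doubly excites B
    "3": [("1", 2), ("B", 2)],   # 'hot' fires on heat, or on B
    "4": [("2", 1), ("A", 1)],   # 'cold' needs 2 AND A active together
}
inh = {"A": [], "B": ["2"], "3": [], "4": []}

def step(state, heat_on, cold_on):
    """One synchronous time step; receptor cells track the stimuli."""
    new = {"1": heat_on, "2": cold_on}
    for cell in ("A", "B", "3", "4"):
        if any(state[src] for src in inh[cell]):
            new[cell] = False    # a single inhibitory signal vetoes firing
        else:
            drive = sum(n for src, n in exc[cell] if state[src])
            new[cell] = drive >= THRESHOLD
    return new

# Cold applied at t=1 and then removed: cell 3 ('hot') fires at t=4,
# reproducing the first truth table. With cold_on=True at every step,
# cell 4 ('cold') fires from t=3 instead, as in the second table.
state = dict.fromkeys(("1", "2", "A", "B", "3", "4"), False)
for t in range(1, 5):
    state = step(state, heat_on=False, cold_on=(t == 1))
    print(t, state)
```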
The McCulloch-Pitts Network
Memory Models
[Figure: two memory circuits built from McCulloch-Pitts cells. Left: the three stimulus model, in which cell 1 feeds the hidden cells A and B, which feed cell 2. Right: the permanent memory model, in which cell 1 feeds cell 2 and cell 2 feeds back on itself. All connections shown are excitatory.]

Three stimulus model

Permanent Memory model
26
The McCulloch-Pitts Network
Memory Models
In the permanent
memory model, the
output neuron has
threshold ‘1’; neuron 2
fires if the light has ever
been on anytime in the
past.
Levine, D. S. (1991:16)
[Figure: the permanent memory model; cell 1 feeds cell 2, which feeds back on itself.]
27
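The permanent memory model is small enough to simulate in a few lines. The sketch assumes the wiring suggested by the figure: cell 1 (the light receptor) excites cell 2, and cell 2 excites itself, so with threshold 1 it latches on.

```python
# Permanent memory model: once the light has been on, cell 2 fires at
# every subsequent time step (assumed wiring: 1 -> 2 plus a 2 -> 2 loop).

def run(light_sequence):
    cell1 = cell2 = False
    history = []
    for light in light_sequence:
        # synchronous update: new states depend only on previous states
        cell1, cell2 = light, (cell1 or cell2)   # threshold 1
        history.append(cell2)
    return history

print(run([True, False, False, False]))  # [False, True, True, True]
```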
The McCulloch-Pitts Network
Memory Models
Consider the three-stimulus all-or-none neural network. In this network, neuron 1 responds
to a light being on. Each of the neurons has threshold ‘3’.
In the three-stimulus model, neuron 2 fires after the light has been on for three time units in a
row.

Time  Cell 1  Cell A  Cell B  Cell 2
1     Yes     No      No      No
2     Yes     No      Yes     No
3     Yes     Yes     Yes     No
4     No      Yes     Yes     Yes

[Figure: the three-stimulus model, cells 1, A, B and 2; all connections are unit positive.]
28
The McCulloch-Pitts Network
Why is a McCulloch-Pitts a FSM?
A finite state machine (FSM) is an
AUTOMATON. An input string is read from
left to right; the machine looks at each
symbol in turn. At any time the FSM is in
one of finitely many internal states.
The state changes after each input symbol is
read.
The NEW STATE depends (only) on the
symbol just read and on the current state.
29
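A toy example of the definition above; the parity machine itself is invented for illustration.

```python
# A finite state machine: the new state depends only on the current
# state and the symbol just read.

TRANSITIONS = {                # (state, symbol) -> new state
    ("even", "1"): "odd",
    ("odd",  "1"): "even",
    ("even", "0"): "even",
    ("odd",  "0"): "odd",
}

state = "even"
for symbol in "10110":         # the input string, read left to right
    state = TRANSITIONS[(state, symbol)]
print(state)                   # 'odd': the string contains three 1s
```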
The McCulloch-Pitts Network
‘The McCulloch-Pitts model, though it uses an
oversimplified formulation of neural activity patterns,
presages some issues that are still important in current
cognitive models. [..] [Some] modern connectionist
networks contain three types of units or nodes – input
units, output units, and hidden units. The input units react
to particular data features from the environment […]. The
output units generate particular organismic responses […].
The hidden units are neither input nor output units
themselves but, via network connections, influence output
units to respond to prescribed patterns of input unit firings
or activities. [..] [This] input-output-hidden trilogy can be
seen as analogous to the distinction between sensory
neurons, motor neurons, and all other (interneurons) in
the brain’
Levine, Daniel S. (1991: 14-15)
30
The McCulloch-Pitts Network
Linear Neuron: the output is the weighted sum of all
the inputs;
McCulloch-Pitts Neuron: the output is the
thresholded value of the weighted sum.

Input Vector? X = (1, -20, 4, -2);
Weight vector? wj = (wj1, wj2, wj3, wj4) = [0.8, 0.2, -1, -0.9]

[Figure: neuron j with inputs x1 … x4, weights wj1 … wj4, a 0th (bias) input, a summing junction (Σ) and output yj.]
31
The McCulloch-Pitts Network
vj = Σi wji*xi;  y = f(vj);  y = 0 if vj ≤ 0, y = 1 if vj > 0

Input Vector? X = (1, -20, 4, -2);
Weight vector? wj = (wj1, wj2, wj3, wj4) = [0.8, 0.2, -1, -0.9]
wj0 = 0, x0 = 0

[Figure: the same neuron j, with the 0th (bias) input shown explicitly.]
32
The McCulloch-Pitts Network
Input Vector? X = (1, -20, 4, -2);
Weight vector? w = (wj1, wj2, wj3, wj4) = [0.8, 0.2, -1, -0.9]
wj0 = 0, x0 = 0
vj = Σi wji*xi;  y = f(vj), where f is the activation function
Linear Neuron: y = v
McCulloch-Pitts: y = 0 if v ≤ 0, y = 1 if v > 0
Sigmoid activation function: f(v) = 1/(1 + exp(-v))
33
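The numbers on these slides can be checked directly; the sketch below evaluates all three activation functions on the given input and weight vectors.

```python
# Worked check: X = (1, -20, 4, -2), w = (0.8, 0.2, -1, -0.9), wj0 = 0.

import math

x = [1, -20, 4, -2]
w = [0.8, 0.2, -1, -0.9]
v = sum(wi * xi for wi, xi in zip(w, x))     # vj = sum_i wji*xi, about -5.4

linear          = v                          # linear neuron: y = v
mcculloch_pitts = 0 if v <= 0 else 1         # step at zero
sigmoid         = 1 / (1 + math.exp(-v))     # f(v) = 1/(1+exp(-v))

print(linear, mcculloch_pitts, round(sigmoid, 4))  # ~-5.4, 0, ~0.0045
```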
The McCulloch-Pitts Network
Under what circumstances will a neuron with a
sigmoidal activation function act like a
McCulloch-Pitts neuron?
Large synaptic weights.
Under what circumstances will a neuron with a
sigmoidal activation function act like a linear
neuron?
Small synaptic weights.
34
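The two answers can be illustrated numerically: scaling the net input by a large gain pushes the sigmoid towards the McCulloch-Pitts step, while a small gain keeps it in its near-linear region around zero. The gains below are illustrative.

```python
# Sigmoid with large vs small synaptic weights (modelled as a gain on v).

import math

def sigmoid(v):
    return 1 / (1 + math.exp(-v))

for v in (-1.0, -0.1, 0.1, 1.0):
    big, small = sigmoid(100 * v), sigmoid(0.01 * v)
    # big is ~0 or ~1 (step-like); small is ~0.5 + 0.0025*v (linear-like)
    print(v, round(big, 3), round(small, 5))
```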
The McCulloch-Pitts Network
The key outcome of early research in
artificial neural networks was a clear
demonstration of the theoretical importance
(brain-like behaviour and logical basis)
and extensive utility (regime-switching
modes) of threshold behaviour. This
behaviour was emulated through the use of
squashing functions and is the basis of
many a simulation.
35