Complex Systems Engineering
SwE 488
Artificial Complex Systems
Prof. Dr. Mohamed Batouche
Department of Software Engineering
CCIS – King Saud University
Riyadh, Kingdom of Saudi Arabia
[email protected]
Artificial Neural Networks
2
Artificial Neural Networks (ANN)
What is an Artificial Neural Network?
• Artificial Neural Networks are crude attempts to model the massively parallel and distributed processing we believe takes place in the brain.
3
Developing Intelligent Program Systems
Neural Nets
Two main areas of activity:
• Biological: try to model biological neural systems.
• Computational: develop powerful applications.
4
Biological Motivation: Brain
• Networks of processing units (neurons) with connections (synapses) between them
• Large number of neurons: 10^11
• Large connectivity: each neuron is connected to, on average, 10^4 others
• Parallel processing
• Distributed computation/memory
• Processing is done by the neurons, and the memory is in the synapses
• Robust to noise and failures
⇒ ANNs attempt to capture this mode of computation
5
The Brain as a Complex System
• The brain uses the outside world to shape itself. (Self-organization)
• It goes through crucial periods in which brain cells must have certain kinds of stimulation to develop such powers as vision, language, smell, muscle control, and reasoning. (Learning, evolution, emergent properties)
6
Main Features of the Brain
• Robust – fault tolerant and degrades gracefully
• Flexible – can learn without being explicitly programmed
• Can deal with fuzzy, probabilistic information
• Is highly parallel
7
Characteristics of Biological Computation
• Massive Parallelism
• Locality of Computation → Scalability
• Adaptive (Self Organizing)
• Representation is Distributed
8
Artificial Neural Networks
History of ANNs
History of Artificial Neural Networks
• 1943: McCulloch and Pitts proposed a model of a neuron → the Perceptron.
• 1960s: Widrow and Hoff explored Perceptron networks (which they called "Adalines") and the delta rule.
• 1962: Rosenblatt proved the convergence of the perceptron training rule.
• 1969: Minsky and Papert showed that the Perceptron cannot deal with nonlinearly-separable data sets, even those that represent simple functions such as XOR.
• 1970–1985: very little research on Neural Nets.
• 1986: invention of Backpropagation [Rumelhart and McClelland, but also Parker and, earlier, Werbos], which can learn from nonlinearly-separable data sets.
• Since 1985: a lot of research in Neural Nets → Complex Systems.
10
Developing Intelligent Program Systems
Neural Nets Applications
Neural nets can be used to answer questions such as:
• Pattern recognition: does that image contain a face?
• Classification problems: is this cell defective?
• Prediction: given these symptoms, does the patient have disease X?
• Forecasting: predicting the behavior of the stock market
• Handwriting recognition: which character is this?
• Optimization: find the shortest tour for the TSP (traveling salesman problem).
11
Artificial Neural Networks
Biological Neuron
Typical Biological Neuron
13
The Neuron
• The neuron receives nerve impulses through its dendrites. It then sends nerve impulses through its axon to the terminal buttons, where neurotransmitters are released to stimulate other neurons.
14
The neuron
• The main components are:
• the cell body, or soma, which contains the nucleus
• the dendrites
• the axon
• the synapses
15
The neuron - dendrites
• The dendrites are short fibers (surrounding the cell body) that receive messages.
• The dendrites are very receptive to connections from other neurons.
• The dendrites carry signals from the synapses to the soma.
16
The neuron - axon
• The axon is a long extension from the soma that transmits messages.
• Each neuron has only one axon.
• The axon carries action potentials from the soma to the synapses.
17
The neuron - synapses
• The synapses are the connections made by an axon to another neuron. They are tiny gaps between axons and dendrites (with chemical bridges) that transmit messages.
• A synapse is called excitatory if it raises the local membrane potential of the postsynaptic cell, and inhibitory if it lowers it.
18
Artificial Neural Networks
Artificial Neurons
Typical Artificial Neuron
[Figure: a typical artificial neuron – inputs arrive over weighted connections, are combined against a threshold, and produce an output.]
20
Typical Artificial Neuron
[Figure: the same neuron viewed as a linear combination of the inputs (the net input, or local field) followed by an activation function.]
21
Equations
Net input: $h_i = \sum_{j=1}^{n} w_{ij} s_j - \theta_i$, or in matrix form $h = Ws - \theta$
Neuron output: $s_i = \sigma(h_i)$, or in vector form $s = \sigma(h)$
22
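As a concrete illustration, the two equations above can be evaluated in a few lines of plain Matlab, using the sigmoid as the activation function σ. The weight matrix, state vector, and thresholds below are made-up example values, not taken from the slides:

% Example: 2 neurons receiving 3 inputs (illustrative values)
W = [0.5 -0.2 0.1; 0.3 0.8 -0.5];   % weight matrix W (2x3)
s = [1; 0; 1];                      % input/state vector s
theta = [0.2; -0.1];                % thresholds theta
sigma = @(x) 1 ./ (1 + exp(-x));    % sigmoid activation
h = W*s - theta;                    % net input: h = W*s - theta
out = sigma(h);                     % neuron outputs: s = sigma(h)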
Artificial Neuron
• Incoming signals to a unit are combined by summing their weighted values.
• Output function: activation functions include the Step function, Linear function, Sigmoid function, …
[Figure: inputs x1 … xp with weights w1 … wp are summed into Σ xi·wi, and the unit's output is f(Σ).]
23
Activation functions
• Step function: step(x) = 1 if x ≥ threshold, 0 if x < threshold
• Sign function: sign(x) = +1 if x ≥ 0, −1 if x < 0
• Linear function: pl(x) = x
• Sigmoid (logistic) function: sigmoid(x) = 1/(1 + e^{-x})
(In the picture above, threshold = 0.)
Adding an extra input with activation a0 = −1 and weight W0,j = t (called the bias weight) is equivalent to having a threshold at t. This way we can always assume a threshold of 0.
24
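As a sketch, these activation functions (and the bias trick) can be written as Matlab anonymous functions; the input values in the last two lines are arbitrary examples:

% Common activation functions (threshold fixed at 0, as in the picture above)
stepf   = @(x) double(x >= 0);        % step function
signf   = @(x) 2*double(x >= 0) - 1;  % sign function: +1 or -1
sigmoid = @(x) 1 ./ (1 + exp(-x));    % sigmoid (logistic) function
linf    = @(x) x;                     % linear function

% Bias trick: an extra input a0 = -1 with weight t is the same as a threshold at t
x = [0.7; 0.3]; w = [1; 1]; t = 0.5;
o = stepf([w; t]' * [x; -1]);         % identical to stepf(w'*x - t)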
Real vs. Artificial Neurons
[Figure: a biological neuron (dendrites, cell body, axon, synapses) alongside an artificial threshold unit with inputs x0 … xn and weights w0 … wn.]
Threshold unit: $o = 1$ if $\sum_{i=0}^{n} w_i x_i \ge 0$, and $o = 0$ otherwise.
25
Neurons as a Universal Computing Machine
• In 1943, McCulloch and Pitts showed that a synchronous assembly of such neurons is a universal computing machine. That is, any Boolean function can be implemented with threshold (step function) units.
26
Implementing AND
[Figure: a threshold unit with inputs x1 and x2 (weights 1) and a bias input −1 with weight W = 1.5.]
$o(x_1, x_2) = 1$ if $-1.5 + x_1 + x_2 \ge 0$, and $0$ otherwise.
27
Implementing OR
[Figure: a threshold unit with inputs x1 and x2 (weights 1) and a bias input −1 with weight W = 0.5.]
$o(x_1, x_2) = 1$ if $-0.5 + x_1 + x_2 > 0$, and $0$ otherwise.
28
Implementing NOT
[Figure: a threshold unit with input x1 (weight −1) and a bias input −1 with weight W = −0.5.]
$o(x_1) = 1$ if $0.5 - x_1 \ge 0$, and $0$ otherwise.
29
Implementing more complex Boolean functions
[Figure: a first threshold unit combines x1 and x2 (weights 1, bias weight 0.5) to compute x1 or x2; its output and x3 (weights 1, bias weight 1.5) feed a second unit that computes (x1 or x2) and x3.]
30
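A minimal Matlab sketch of these threshold-unit gates, using the same weights shown on the slides:

% Threshold units for Boolean functions (weights from the slides)
stepf = @(x) double(x >= 0);
AND   = @(x1,x2) stepf(-1.5 + x1 + x2);   % fires only when both inputs are 1
OR    = @(x1,x2) stepf(-0.5 + x1 + x2);   % fires when at least one input is 1
NOT   = @(x1)    stepf( 0.5 - x1);        % inverts its input

% Composing units: (x1 or x2) and x3
f = @(x1,x2,x3) AND(OR(x1,x2), x3);
f(1,0,1)   % returns 1
f(1,0,0)   % returns 0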
Using Artificial Neural Networks
• When using an ANN, we have to define:
• the Artificial Neuron Model
• the ANN Architecture
• the Learning mode
31
Artificial Neural Networks
ANN Architecture
ANN Architecture
• Feedforward: links are unidirectional, and there are no cycles, i.e., the network is a directed acyclic graph (DAG). Units are arranged in layers, and each unit is linked only to units in the next layer. There is no internal state other than the weights.
• Recurrent: links can form arbitrary topologies, which can implement memory. Behavior can become unstable, oscillatory, or chaotic.
33
Artificial Neural Network
Feedforward Network
[Figure: a layered feedforward network – input layer, hidden layers, and output layer; layers may be fully connected or sparsely connected.]
34
Artificial Neural Network
FeedForward Architecture
• Information flow is unidirectional
• Multi-Layer Perceptron (MLP)
• Radial Basis Function (RBF)
• Kohonen Self-Organising Map (SOM)
35
Artificial Neural Network
Recurrent Architecture
• Feedback connections
• Hopfield Neural Networks: associative memory
• Adaptive Resonance Theory (ART)
36
Artificial Neural Network
Learning paradigms
• Supervised learning:
• a teacher presents the ANN with input-output pairs
• the ANN weights are adjusted according to the error (see the sketch below)
• applications: classification, control, function approximation, associative memory
• Unsupervised learning:
• no teacher
• clustering
37
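To make "weights adjusted according to the error" concrete, here is a minimal perceptron-style sketch in plain Matlab; the learning rate, epoch count, and OR training data are illustrative assumptions, not taken from the slides:

% Supervised learning of OR with the perceptron rule
X = [0 0 1 1; 0 1 0 1];            % inputs, one column per example
T = [0 1 1 1];                     % targets (OR)
w = zeros(2,1); b = 0; eta = 0.1;  % weights, bias, learning rate
stepf = @(x) double(x >= 0);
for epoch = 1:20
    for k = 1:4
        o = stepf(w'*X(:,k) + b);         % current output
        w = w + eta*(T(k) - o)*X(:,k);    % adjust weights by the error
        b = b + eta*(T(k) - o);           % adjust bias by the error
    end
end
stepf(w'*X + b)                    % prints 0 1 1 1 after training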
ANN capabilities
• Learning
• Approximate reasoning
• Generalisation capability
• Noise filtering
• Parallel processing
• Distributed knowledge base
• Fault tolerance
38
Main Problems with ANNs
• Contrary to expert systems, with ANNs the knowledge base is not transparent (black box)
• Learning is sometimes difficult/slow
• Limited storage capability
39
Some applications of ANNs
• Pronunciation: the NETtalk program (Sejnowski & Rosenberg 1987) is a neural network that learns to pronounce written text: it maps character strings into phonemes (basic sound elements) to learn speech from text
• Speech recognition
• Handwritten character recognition: a network designed to read zip codes on hand-addressed envelopes
• ALVINN (Pomerleau) is a neural network used to control a vehicle's steering direction so as to follow the road by staying in the middle of its lane
• Face recognition
• Backgammon learning program
• Forecasting, e.g., predicting the behavior of the stock market
40
Application of ANNs
The general scheme when using ANNs is as follows:
[Figure: a stimulus is encoded into an input bit pattern, passed through the network to produce an output bit pattern, which is then decoded into a response.]
41
Application: Digit Recognition
42
Matlab Demo
• Learning XOR function
• Function approximation
• Digit Recognition
43
Learning XOR Operation: Matlab Code
P = [ 0 0 1 1; ...
      0 1 0 1];                 % inputs, one column per pattern
T = [ 0 1 1 0];                 % XOR targets
% feedforward net: inputs in [0,1], 6 hidden units, 1 output unit
net = newff([0 1; 0 1],[6 1],{'tansig' 'tansig'});
net.trainParam.epochs = 4850;
net = train(net,P,T);
X = [0; 1];                     % a single test pattern (column vector)
Y = sim(net,X);
display(Y);
44
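To check the trained network, one can also present all four training patterns at once (a usage sketch; since tansig outputs are continuous, the results only approximate 0 and 1):

Y = sim(net,P);   % should be close to [0 1 1 0]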
Function Approximation: Learning the Sine Function
P = 0:0.1:10;                  % sample points
T = sin(P)*10.0;               % target: scaled sine
net = newff([0.0 10.0],[8 1],{'tansig' 'purelin'});
plot(P,T); pause;
Y = sim(net,P); plot(P,T,P,Y,'o'); pause;   % output before training
net.trainParam.epochs = 4850;
net = train(net,P,T);
Y = sim(net,P); plot(P,T,P,Y,'o');          % output after training
45
Digit Recognition:
% Each column of P is one digit image (15 binary pixels), one column per digit class
P = [ 1 0 1 1 1 1 1 1 1 1;
      1 1 1 1 0 1 1 1 1 1;
      1 0 1 1 1 1 1 1 1 1;
      1 0 0 0 1 1 1 0 1 1;
      0 1 0 0 0 0 0 0 0 0;
      1 0 1 1 1 0 0 1 1 1;
      1 0 1 1 1 1 1 0 1 1;
      0 1 1 1 1 1 1 0 1 1;
      1 0 1 1 1 1 1 1 1 1;
      1 0 1 0 0 0 1 0 1 0;
      0 1 0 0 0 0 0 0 0 0;
      1 0 0 1 1 1 1 1 1 1;
      1 0 1 1 0 1 1 0 1 1;
      1 1 1 1 0 1 1 0 1 1;
      1 0 1 1 1 1 1 1 1 1 ];

% Each column of T is the one-hot target for the corresponding digit
T = [ 1 0 0 0 0 0 0 0 0 0;
      0 1 0 0 0 0 0 0 0 0;
      0 0 1 0 0 0 0 0 0 0;
      0 0 0 1 0 0 0 0 0 0;
      0 0 0 0 1 0 0 0 0 0;
      0 0 0 0 0 1 0 0 0 0;
      0 0 0 0 0 0 1 0 0 0;
      0 0 0 0 0 0 0 1 0 0;
      0 0 0 0 0 0 0 0 1 0;
      0 0 0 0 0 0 0 0 0 1 ];
46
Digit Recognition:
net = newff([0 1;0 1;0 1;0 1;0 1;0 1;0 1; 0 1;0 1;0 1;0 1;0 1;0 1;0 1;0
1], [20 10],{'tansig' 'tansig'});
net.trainParam.epochs = 4850;
net = train(net,P,T);
47
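A usage sketch for the trained digit recognizer: present the patterns and take the output unit with the largest response as the recognized class (the class-to-column mapping is an assumption, following the ordering of the columns of P and T):

Y = sim(net,P);          % 10 output values per input pattern
[~, digit] = max(Y);     % index of the winning output unit per pattern
disp(digit)              % ideally 1:10, one class per column of P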
When to use ANNs?
• Input is high-dimensional, discrete or real-valued (e.g., raw sensor input)
• Inputs can be highly correlated or independent
• Output is discrete or real-valued
• Output is a vector of values
• Possibly noisy data; data may contain errors
• The form of the target function is unknown
• Long training times are acceptable
• Fast evaluation of the target function is required
• Human readability of the learned target function is unimportant
⇒ An ANN is much like a black box
48
Conclusions
49
Conclusions
• This topic is very hot and has widespread implications
• Biology
• Chemistry
• Computer science
• Complexity
• We’ve seen the basic concepts …
• But we’ve only scratched the surface!
⇒ From now on, think Biology, Emergence, Complex Systems …
50