Artificial Intelligence CSC 361

Artificial Intelligence
CSC 361
Dr. Yousef Al-Ohali
Computer Science Depart.
CCIS – King Saud University
Saudi Arabia
[email protected]
http://faculty.ksu.edu.sa/YAlohali
Intelligent Systems
Part II: Neural Nets
Developing Intelligent
Program Systems
Machine Learning : Neural Nets


Artificial Neural Networks: Artificial neural networks are crude
attempts to model the massively parallel, distributed
processing that we believe takes place in the brain.
Two main areas of activity:

Biological: try to model biological neural systems.

Computational: develop powerful applications.
Neural nets can be used to answer questions
such as the following:

Pattern recognition: Does that image contain a face?

Classification problems: Is this cell defective?

Prediction: Given these symptoms, does the patient have disease X?

Forecasting: predicting the behavior of the stock market

Handwriting: Is the character recognized?

Optimization: Find the shortest path for the TSP.
Strengths and Weaknesses of ANNs

Examples may be described by a large number of attributes (e.g.,
pixels in an image).

Data may contain errors.

The time for training may be extremely long.

Evaluating the network for a new example is relatively fast.

Interpretability of the final hypothesis is not relevant (the NN is
treated as a black box).
Artificial Neural Networks
Biological Neuron
The Neuron

The neuron receives nerve impulses through
its dendrites. It then sends the nerve impulses
through its axon to the terminal buttons,
where neurotransmitters are released to
stimulate other neurons.
The neuron

The unique
components are:

Cell body or soma, which contains the nucleus
The dendrites
The axon
The synapses
The neuron - dendrites



The dendrites are short
fibers (surrounding the
cell body) that receive
messages
The dendrites are very
receptive to connections
from other neurons.
The dendrites carry
signals from the
synapses to the soma.
The neuron - axon



The axon is a long
extension from the
soma that transmits
messages
Each neuron has only
one axon.
The axon carries action
potentials from the
soma to the synapses.
The neuron - synapses



The synapses are the connections
made by an axon to another
neuron. They are tiny gaps
between axons and dendrites
(with chemical bridges) that
transmit messages.
A synapse is called excitatory if it
raises the local membrane
potential of the postsynaptic cell,
and inhibitory if it lowers that
potential.
Artificial Neural Networks
History of ANNs
History of Artificial Neural
Networks

1943: McCulloch and Pitts proposed a model of a neuron, which led to the
Perceptron.

1960s: Widrow and Hoff explored Perceptron networks (which they
called “Adalines”) and the delta rule.

1962: Rosenblatt proved the convergence of the perceptron training
rule.

1969: Minsky and Papert showed that the Perceptron cannot deal with
nonlinearly separable data sets, even those that represent simple
functions such as XOR.

1970-1985: Very little research on neural nets.

1986: Invention of backpropagation [Rumelhart and McClelland, but
also Parker and, earlier on, Werbos], which can learn from nonlinearly
separable data sets.

Since 1985: A lot of research on neural nets.
Artificial Neural Networks
Artificial Neurons
Artificial Neuron
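• Incoming signals to a unit are combined by summing
their weighted values.
• Output function: activation functions include the step
function, linear function, sigmoid function, …

[Figure: an artificial neuron. Inputs x1 … xp carry weights w1 … wp, and a constant input 1 carries weight w0. The unit sums the weighted inputs and applies the activation function f.]

$\text{Output} = f\left(\sum_i x_i w_i\right)$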
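The weighted-sum rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`neuron_output`, `identity`) and the numeric values are assumptions made for the example and do not come from the slides.

```python
def neuron_output(inputs, weights, bias_weight, activation):
    """Sum the weighted inputs (plus w0 on a constant input of 1), then apply f."""
    total = bias_weight * 1.0  # contribution of the constant input 1 with weight w0
    for x, w in zip(inputs, weights):
        total += x * w
    return activation(total)


def identity(z):
    """Linear activation: passes the weighted sum through unchanged."""
    return z


# Hypothetical values, chosen only for illustration.
inputs = [1.0, 2.0]
weights = [0.5, 0.25]
bias_weight = -0.5

print(neuron_output(inputs, weights, bias_weight, identity))
```

Passing the activation as a parameter keeps the summation step separate from the choice of output function, which mirrors the two bullets above.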
Activation functions
Step function: $\text{step}(x) = 1$ if $x \ge \text{threshold}$; $0$ if $x < \text{threshold}$ (in the original picture, threshold = 0)
Sign function: $\text{sign}(x) = +1$ if $x \ge 0$; $-1$ if $x < 0$
Linear function: $\text{pl}(x) = x$
Sigmoid (logistic) function: $\text{sigmoid}(x) = 1/(1+e^{-x})$
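As a rough illustration, the four activation functions listed here can be written directly in Python. The function names follow the slide where possible (`linear` stands in for pl); the default threshold of 0 matches the note about the picture.

```python
import math


def step(x, threshold=0.0):
    """1 if x reaches the threshold, otherwise 0."""
    return 1 if x >= threshold else 0


def sign(x):
    """+1 for non-negative x, otherwise -1."""
    return 1 if x >= 0 else -1


def linear(x):
    """Linear (identity) activation, pl(x) = x."""
    return x


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x); output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Print each function at a few sample points for comparison.
for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), linear(x), round(sigmoid(x), 3))
```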
Adding an extra input with activation a0 = -1 and weight
W0,j = t (called the bias weight) is equivalent to having a
threshold at t. This way we can always assume a 0 threshold.
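The equivalence can be checked in one line of algebra. The sketch below assumes unit j receives activations a_1 … a_n through weights W_{1,j} … W_{n,j} (this indexing is an assumption that extends the slide's W0,j notation) and fires when the weighted sum reaches the threshold t:

```latex
\begin{align*}
\sum_{i=1}^{n} W_{i,j}\, a_i \ge t
  &\iff \sum_{i=1}^{n} W_{i,j}\, a_i + t \cdot (-1) \ge 0 \\
  &\iff \sum_{i=0}^{n} W_{i,j}\, a_i \ge 0
  \qquad \text{with } a_0 = -1,\; W_{0,j} = t .
\end{align*}
```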
Real vs. Artificial Neurons
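[Figure: a biological neuron (dendrites, cell body, axon, synapse) shown next to an artificial unit with inputs x0 … xn and weights w0 … wn, whose weighted inputs are summed.]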
Threshold units
$o = 1$ if $\sum_{i=0}^{n} w_i x_i > 0$, and $0$ otherwise
Neurons as a universal
computing machine

In 1943, McCulloch and Pitts showed
that a synchronous assembly of such
neurons is a universal computing
machine. That is, any Boolean function
can be implemented with a network of
threshold (step function) units.
Implementing AND
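[Figure: a threshold unit with inputs x1 and x2, each with weight 1, and a bias input of -1 with weight W = 1.5; the output is o(x1,x2).]
$o(x_1, x_2) = 1$ if $-1.5 + x_1 + x_2 > 0$, and $0$ otherwise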
Implementing OR
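[Figure: a threshold unit with inputs x1 and x2, each with weight 1, and a bias input of -1 with weight W = 0.5; the output is o(x1,x2).]
$o(x_1, x_2) = 1$ if $-0.5 + x_1 + x_2 > 0$, and $0$ otherwise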
Implementing NOT
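[Figure: a threshold unit with input x1 with weight -1, and a bias input of -1 with weight W = -0.5; the output is o(x1).]
$o(x_1) = 1$ if $0.5 - x_1 > 0$, and $0$ otherwise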
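The AND, OR, and NOT units above can be checked with a short Python sketch. The helper name `threshold_unit` and the gate function names are assumptions made for this example; the weights and the bias input of -1 are taken from the slides.

```python
def threshold_unit(weights, inputs):
    """Return 1 if the weighted sum is strictly positive, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def and_gate(x1, x2):
    # Bias input -1 with weight 1.5; x1 and x2 with weight 1 each.
    return threshold_unit([1.5, 1, 1], [-1, x1, x2])


def or_gate(x1, x2):
    # Bias input -1 with weight 0.5; x1 and x2 with weight 1 each.
    return threshold_unit([0.5, 1, 1], [-1, x1, x2])


def not_gate(x1):
    # Bias input -1 with weight -0.5; x1 with weight -1.
    return threshold_unit([-0.5, -1], [-1, x1])


# Print the truth tables.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", and_gate(x1, x2), "OR:", or_gate(x1, x2))
for x1 in (0, 1):
    print(x1, "NOT:", not_gate(x1))
```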
Implementing more complex
Boolean functions
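[Figure: a two-unit network. The first unit takes x1 and x2 (weight 1 each) and a bias input of -1 with weight 0.5, and computes x1 or x2. The second unit takes that output and x3 (weight 1 each) and a bias input of -1 with weight 1.5, and computes (x1 or x2) and x3.]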
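Chaining two threshold units gives the function (x1 or x2) and x3. The sketch below is one way to write that composition in Python; the helper name `threshold_unit` and the function name `or_then_and` are assumptions for the example, and the weights are those of the OR and AND units (0.5 and 1.5 on a bias input of -1).

```python
from itertools import product


def threshold_unit(weights, inputs):
    """Return 1 if the weighted sum is strictly positive, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def or_then_and(x1, x2, x3):
    # First unit: x1 or x2 (bias input -1 with weight 0.5).
    hidden = threshold_unit([0.5, 1, 1], [-1, x1, x2])
    # Second unit: hidden and x3 (bias input -1 with weight 1.5).
    return threshold_unit([1.5, 1, 1], [-1, hidden, x3])


# Compare the network with the Boolean expression on all eight inputs.
for x1, x2, x3 in product((0, 1), repeat=3):
    expected = int((x1 or x2) and x3)
    print(x1, x2, x3, or_then_and(x1, x2, x3), expected)
```

The output of the first unit is simply fed in as an input of the second, which is how larger Boolean functions are built from single threshold units.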
Artificial Neural Networks

When using ANN, we have to define:

Artificial Neuron Model

ANN Architecture

Learning mode
Artificial Neural Networks
ANN Architecture


Feedforward: Links are
unidirectional, and there are
no cycles, i.e., the network is
a directed acyclic graph
(DAG). Units are arranged
in layers, and each unit is
linked only to units in the
next layer. There is no
internal state other than the
weights.
Recurrent: Links can form
arbitrary topologies, which
can implement memory.
Behavior can become
unstable, oscillatory, or
chaotic.
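To make the feedforward case concrete, here is a minimal sketch of a layer-by-layer forward pass in plain Python. The network shape (2 inputs, 2 hidden units, 1 output), the weight values, the sigmoid activation, and the function names are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases):
    """Compute one layer; each row of `weights` holds one unit's incoming weights."""
    return [
        sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
        for row, b in zip(weights, biases)
    ]


def feedforward(inputs, layers):
    """Pass activations through the layers in order; nothing flows backwards."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical 2-2-1 network: (weights, biases) for each layer.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer, 2 units
    ([[1.0, -1.0]], [0.2]),                    # output layer, 1 unit
]

print(feedforward([1.0, 0.0], layers))
```

Because the only stored quantities are the weights, calling `feedforward` twice with the same input gives the same output; a recurrent network would also carry state from one call to the next.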
Artificial Neural Network
Feedforward Network
[Figure: a feedforward network with an input layer, hidden layers, and an output layer; some layers are fully connected and others sparsely connected.]
Artificial Neural Network
FeedForward Architecture

Information flow
unidirectional

Multi-Layer Perceptron
(MLP)

Radial Basis Function
(RBF)

Kohonen Self-Organising Map (SOM)
Artificial Neural Network
Recurrent Architecture



Feedback connections
Hopfield Neural
Networks: Associative
memory
Adaptive Resonance
Theory (ART)
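As an illustration of associative memory, the sketch below stores one pattern in a small Hopfield-style network and recalls it from a corrupted copy. The slides do not give a training rule; this example assumes the standard Hebbian outer-product rule with +1/-1 units and asynchronous updates, and all names and values are chosen for the example.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix with a zero diagonal (Hebbian rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, sweeps=5):
    """Update units one at a time, feeding each new value back into the network."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            total = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if total >= 0 else -1
    return state


stored = [1, -1, 1, -1, 1, -1]
w = train_hopfield([stored])

noisy = [1, -1, 1, -1, 1, 1]  # last unit flipped
print(recall(w, noisy))
```

The feedback connections are what let the network settle back onto the stored pattern from a partly wrong starting state.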
Artificial Neural Network
Learning paradigms

Supervised learning:

Teacher presents ANN input-output pairs;
ANN weights are adjusted according to the error

Classification
Control
Function approximation
Associative memory

Unsupervised learning:

No teacher

Clustering
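The supervised case ("weights adjusted according to error") can be sketched with the perceptron training rule on a single threshold unit. This is a minimal sketch, not the course's own code: the learning rate, the number of epochs, the OR training set, and the function names are assumptions made for the example.

```python
def predict(weights, x):
    """Threshold unit: 1 if the weighted sum is strictly positive, otherwise 0."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train(examples, n_inputs, rate=0.1, epochs=20):
    """Perceptron rule: w <- w + rate * (target - output) * input."""
    weights = [0.0] * (n_inputs + 1)  # weights[0] belongs to the bias input
    for _ in range(epochs):
        for inputs, target in examples:
            x = [-1.0] + list(inputs)  # constant bias input of -1
            error = target - predict(weights, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


# The teacher's input-output pairs for the OR function.
or_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train(or_examples, n_inputs=2)

for inputs, target in or_examples:
    print(inputs, predict(weights, [-1.0] + list(inputs)), target)
```

Weights change only when the unit's output differs from the target, so training stops changing anything once every example is classified correctly.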
ANN capabilities







Learning
Approximate reasoning
Generalisation capability
Noise filtering
Parallel processing
Distributed knowledge base
Fault tolerance
Main Problems with ANN

Contrary to expert systems, with ANNs
the knowledge base is not transparent
(black box)

Learning is sometimes difficult or slow

Limited storage capability
Some applications of ANNs

Pronunciation: the NETtalk program (Sejnowski & Rosenberg 1987) is a
neural network that learns to pronounce written text: it maps
character strings into phonemes (basic sound elements), learning
speech from text

Speech recognition

Handwritten character recognition: a network designed to read zip
codes on hand-addressed envelopes

ALVINN (Pomerleau) is a neural network used to control a vehicle's
steering direction so that it follows the road by staying in the middle of its
lane

Face recognition

Backgammon learning program

Forecasting, e.g., predicting the behavior of the stock market
When to use ANNs?

Input is high-dimensional discrete or real-valued (e.g. raw sensor input).

Inputs can be highly correlated or independent.

Output is discrete or real valued

Output is a vector of values

Possibly noisy data. Data may contain errors

Form of target function is unknown

Long training times are acceptable

Fast evaluation of target function is required

Human readability of learned target function is unimportant
⇒ An ANN is much like a black box