Artificial Neural Networks
ARTIFICIAL NEURAL NETWORKS
Outline
Introduction
Computation in the brain
Artificial Neuron Models
Types of Neural Networks
Discussion
What tasks are machines good at doing that humans are
not?
What tasks are humans good at doing that machines are
not?
What does it mean to learn?
How is learning related to intelligence?
What does it mean to be intelligent? Do you believe a
machine will ever be built that exhibits intelligence?
If a computer were intelligent, how would you know?
What does it mean to be conscious?
Can one be intelligent and not conscious or vice versa?
Types of Applications
Machine learning
Cognitive science
Neurobiology
Mathematics
Philosophy
Usage
Signal processing
Speech recognition
Control
Vision
Robotics
Financial applications
Pattern recognition
Data compression
Speech production
Game playing
Computation in the Brain
The Brain is an Information Processing System
About 10 billion nerve cells (neurons)
Each neuron connects to others through about 10 000 synapses
Massive parallel information processing
Capabilities of the Brain
Its performance tends to degrade gracefully under
partial damage
It can learn (reorganize itself) from experience.
In contrast with most programs / engineered systems
Partial recovery from damage is possible if healthy units
can learn to take over the functions previously carried out
by the damaged areas
It performs massively parallel computations
extremely efficiently
Complex visual perception
Comparison with Computer
Artificial Neural Networks attempt to bring computers a
little closer to the brain's capabilities by imitating certain
aspects of information processing in the brain, in a highly
simplified way.
Artificial Neural Networks
A branch of Artificial Intelligence
A loosely modeled system of artificial neurons
based on the human brain / neurons
A network of many very simple processors
(units)
each possibly having a (small amount of) local
memory
Artificial Neural Networks (Ctd.)
A neural network can be
a processing device,
an algorithm, or
actual hardware
Training rule
Weights of connections are adjusted based on the
presented patterns
Properties
Unidirectional communication channels
(connections)
Carry numeric (as opposed to symbolic) data
Units operate
On local data
On inputs received via connections
Benefits
Learn from examples / experience
to improve their performance
to adapt to changes in the environment
to deal with incomplete information or noisy data
Capability of generalization
Great potential for parallelism
Computations of components: independent of each
other
Structure
•A Neural network can be considered as a black box
that is able to predict an output pattern when it
recognizes a given input pattern.
•Once trained, the neural network is able to
recognize similarities when presented with a new
input pattern, resulting in a predicted output pattern.
Network Structure
Layered circuit of neurons
Neighboring layers completely connected; no other
connections (feedforward network)
Arbitrary number of hidden layers allowed, but usually 0 or 1
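A minimal Python sketch of the layered structure described above. The sigmoid transfer function, the helper names and the hand-picked weights are assumptions made for illustration; the slides do not prescribe them. Each layer is completely connected to the next one and there are no other connections.

```python
import math

def sigmoid(z):
    """Logistic transfer function (an assumed choice for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute the outputs of one fully connected layer.

    weights[i][j] is the weight from input j to unit i of this layer.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        net = bias + sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(sigmoid(net))
    return outputs

def feedforward(inputs, layers):
    """Pass the inputs through each layer in order (no feedback connections)."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Hypothetical 2-2-1 network: two inputs, one hidden layer of two units, one output.
example_layers = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]
print(feedforward([1.0, 0.0], example_layers))
```

A network with 0 hidden layers is simply a `layers` list with one entry.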
A Simple Artificial Neuron
•The basic computational element (model neuron) is often called a node or unit. It
receives input from some other units, or perhaps from an external source. Each
input has an associated weight w, which can be modified so as to model synaptic
learning. The unit computes some function f of the weighted sum of its inputs
•Its output, in turn, can serve as input to other units.
•The weighted sum is called the net input to unit i,
often written $net_i$.
•Note that $w_{ij}$ refers to the weight from unit j to unit i
(not the other way around).
•The function f is the unit's activation function. In
the simplest case, f is the identity function, and the
unit's output is just its net input. This is called a
linear unit.
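The description above can be sketched in a few lines of Python. The function names and the numeric values are illustrative assumptions; `weights[j]` plays the role of the weight from unit j to unit i.

```python
def net_input(weights, inputs):
    """Weighted sum of the inputs: net_i = sum over j of w_ij * x_j."""
    return sum(w * x for w, x in zip(weights, inputs))

def unit_output(weights, inputs, f=lambda net: net):
    """Apply the activation function f to the net input.

    With the default identity function this is a linear unit.
    """
    return f(net_input(weights, inputs))

# Illustrative values only.
w_i = [0.5, -1.0, 2.0]  # w_i[j]: weight from unit j to unit i
x = [1.0, 2.0, 3.0]     # outputs of the sending units
print(unit_output(w_i, x))  # 0.5 - 2.0 + 6.0 = 4.5
```

Passing a different `f` (for example a step or sigmoid function) turns the same unit into a non-linear one.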
Processing Information
[Figure: neuron j receives inputs x1, x2, ..., xi through weights w1j, w2j, ..., wij, forms the summation Σ wij xi, and applies a transfer function to produce the output Yj.]
Operation of a Single Neuron
Activation Functions
Step function
Changing the bias weight $w_0$ moves the threshold
location.
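A small Python sketch of a step-function unit shows how the bias weight shifts the threshold. The convention that the unit fires when the net input is at least zero, and the name `step_unit`, are assumptions for this example.

```python
def step_unit(inputs, weights, w0):
    """Threshold unit: outputs 1 when w0 + sum(w * x) >= 0, else 0."""
    net = w0 + sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= 0 else 0

# With a single input and weight 1.0, the unit switches on at x = -w0.
for w0 in (-0.5, -1.5):
    outputs = [step_unit([x], [1.0], w0) for x in (0.0, 1.0, 2.0)]
    print(w0, outputs)
```

With `w0 = -0.5` the unit already fires at x = 1.0; with `w0 = -1.5` it only fires at x = 2.0.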
Neural Network Fundamentals
Components and Structure
Processing Elements
Network
Structure of the Network
Processing Information by the Network
Inputs
Outputs
Weights
Summation Function
Learning Process
The learning process of a Neural Network can
be viewed as reshaping a sheet of metal,
which represents the output (range) of the
function being mapped.
The training set (domain) acts as energy
required to bend the sheet of metal such that it
passes through predefined points. However,
the metal, by its nature, will resist such
reshaping.
So the network will attempt to find a low energy
configuration (i.e. a flat/non-wrinkled shape)
that satisfies the constraints (training data).
Learning Process
Learning can be supervised or
unsupervised.
In supervised training, both the inputs and the
outputs are provided.
The network processes the inputs and compares
its resulting outputs against the desired outputs.
Errors are then calculated, causing the system to
adjust the weights which control the network.
This process occurs over and over as the weights are
continually tweaked.
Learning: Three Tasks
1. Compute Outputs
2. Compare Outputs with Desired Targets
3. Adjust Weights and Repeat the Process
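The three tasks can be sketched as a short training loop. This example uses the perceptron learning rule on a single threshold unit, which is one possible choice and not necessarily the one the slides have in mind; the function names, the learning rate of 0.5 and the logical AND data are assumptions.

```python
def predict(weights, bias, x):
    """Output of a single threshold unit."""
    net = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if net >= 0 else 0

def train(samples, learning_rate=0.5, max_epochs=100):
    """Repeat the three tasks until every training sample is classified correctly."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            output = predict(weights, bias, x)  # 1. compute output
            error = target - output             # 2. compare with desired target
            if error != 0:                      # 3. adjust weights
                mistakes += 1
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if mistakes == 0:
            break
    return weights, bias

# Logical AND as a tiny, linearly separable training set.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train(and_samples)
print([predict(weights, bias, x) for x, _ in and_samples])
```

The loop stops early once a full pass over the training set produces no errors; for data that is not linearly separable it would simply run until `max_epochs`.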
Neural Network
Application Development
Preliminary steps of system development are done
ANN Application Development Process
1. Collect Data
2. Separate into Training and Test Sets
3. Define a Network Structure
4. Select a Learning Algorithm
5. Set Parameters, Values, Initialize Weights
6. Transform Data to Network Inputs
7. Start Training, and Determine and Revise Weights
8. Stop and Test
9. Implementation: Use the Network with New Cases
Data Collection and Preparation
Collect data and separate into a training set and a
test set
Use training cases to adjust the weights
Use test cases for network validation
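A minimal sketch of separating collected cases into a training set and a test set. The 20% test fraction, the fixed seed and the name `split_data` are assumptions for illustration.

```python
import random

def split_data(cases, test_fraction=0.2, seed=0):
    """Shuffle the cases and split them into (training set, test set)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

training_set, test_set = split_data(range(10))
print(len(training_set), len(test_set))
```

Shuffling before splitting avoids a test set that only contains the last cases collected.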
Neural Network Architecture
Representative Architectures
Associative Memory Systems
Associative memory - ability to recall complete situations
from partial information
Systems correlate input data with stored information
Hidden Layer
Three, Sometimes Four or Five Layers
Recurrent Structure
Recurrent network (double layer) - each activity
goes through the network more than once before
the output response is produced
Uses a feedforward and feedbackward approach to
adjust parameters to establish arbitrary numbers of
categories
Example: Hopfield
Neural Network Preparation
(Non-numerical Input Data (text, pictures): preparation may involve
simplification or decomposition)
Choose the learning algorithm
Determine several parameters
Learning rate (high or low)
Threshold value for the form of the output
Initial weight values
Other parameters
Choose the network's structure (nodes and layers)
Select initial conditions
Transform training and test data to the required format
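Two common ways of transforming data into network inputs are scaling numeric values to a fixed range and encoding categories as 0/1 vectors. The sketch below assumes min-max scaling and one-hot encoding; neither is named in the slides, and the function names are hypothetical.

```python
def min_max_scale(values):
    """Scale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(label, categories):
    """Encode a non-numerical label as a vector of 0s and 1s."""
    return [1.0 if label == c else 0.0 for c in categories]

print(min_max_scale([10, 20, 30]))
print(one_hot("b", ["a", "b", "c"]))
```

In practice the minimum and maximum would usually be taken from the training set and reused unchanged for the test set.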
Training the Network
Present the training data set to the network
Adjust weights to produce the desired output for
each of the inputs
Several iterations over the complete training set are usually needed to get a
consistent set of weights that works for all the training data
Learning Algorithms
Two Major Categories Based On Input Format
Binary-valued (0s and 1s)
Continuous-valued
Two Basic Learning Categories
Supervised Learning
Inputs
Outputs
Unsupervised Learning
Inputs
No desired outputs
System decides how to group the input data
Backpropagation
"Backwards propagation of errors".
Supervised learning method
Implements Delta rule (gradient descent algorithm)
It requires a teacher that knows, or can
calculate, the desired output for any given input.
It is most useful for feed-forward networks
(networks that have no feedback, or simply, that
have no connections that loop).
The activation function should be differentiable.
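For reference, the delta rule for a unit i with a differentiable activation function f is commonly written as below. The symbols are assumptions introduced here: η is the learning rate, t_i the desired output, y_i = f(net_i) the actual output, and x_j the input arriving over the connection with weight w_ij.

```latex
E = \tfrac{1}{2} \sum_i (t_i - y_i)^2, \qquad
\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}
             = \eta \, (t_i - y_i) \, f'(net_i) \, x_j
```

The factor f'(net_i) is the reason the activation function has to be differentiable.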
Backpropagation Steps
1. Present a training sample to the neural network.
2. Compare the network's output to the desired output from that
sample. Calculate the error in each output neuron.
3. For each neuron, calculate what the output should have been, and a
scaling factor, how much lower or higher the output must be
adjusted to match the desired output. This is the local error.
4. Adjust the weights of each neuron to lower the local error.
5. Assign "blame" for the local error to neurons at the previous level,
giving greater responsibility to neurons connected by stronger
weights.
6. Repeat from step 3 on the neurons at the previous level, using each
one's "blame" as its error.
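The steps above can be sketched for a small network with one hidden layer and a single output unit. This is an illustrative sketch, not the presenter's implementation: the sigmoid activation, the squared-error measure, the learning rate, the random initial weights and the logical OR training data are all assumptions.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Each unit stores its input weights followed by one bias weight."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(network, x):
    hidden, output = network
    h = [sigmoid(sum(w * v for w, v in zip(unit[:-1], x)) + unit[-1]) for unit in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output[:-1], h)) + output[-1])
    return h, y

def train_step(network, x, target, lr=0.5):
    hidden, output = network
    h, y = forward(network, x)                # present the sample
    # Local error of the output neuron: raw error scaled by the sigmoid derivative.
    delta_out = (target - y) * y * (1.0 - y)
    # "Blame" for each hidden neuron, weighted by the strength of its connection
    # to the output (computed before the output weights are changed).
    delta_hidden = [delta_out * output[k] * h[k] * (1.0 - h[k]) for k in range(len(hidden))]
    # Adjust the output neuron's weights to lower its local error.
    for k in range(len(hidden)):
        output[k] += lr * delta_out * h[k]
    output[-1] += lr * delta_out
    # Repeat the adjustment one level back, using the blame as the error.
    for k, unit in enumerate(hidden):
        for j in range(len(x)):
            unit[j] += lr * delta_hidden[k] * x[j]
        unit[-1] += lr * delta_hidden[k]

def total_error(network, samples):
    return sum(0.5 * (t - forward(network, x)[1]) ** 2 for x, t in samples)

# Logical OR as a tiny training set (an assumed example).
or_samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
net = init_network(2, 2, seed=1)
print("error before training:", total_error(net, or_samples))
for _ in range(2000):
    for x, t in or_samples:
        train_step(net, x, t)
print("error after training:", total_error(net, or_samples))
```

Weights are updated after every sample here (online training); accumulating the changes over the whole training set before applying them is an equally common variant.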
Training Patterns
[Figures: network output for a training pattern after 10, 1000, 1021, 4000, 8000 and 12000 training steps.]
Early in training the shown pattern has not been learned yet; the global error is still high.
The global error decreases as training continues. Deformed and noisy patterns are also used in training.
Recognition
[Figure: recognition examples, including one input labeled "Too much noise".]
Backpropagation
Backpropagation (back-error propagation)
Most widely used learning algorithm
Relatively easy to implement
Requires training data for conditioning the
network before using it for processing other
data
Network includes one or more hidden layers
The network itself is feedforward; only the
error is propagated backwards
Backpropagation
Externally provided correct patterns are compared with
the neural network output during training (supervised
training)
Feedback adjusts the weights until all training patterns
are correctly categorized
Error is backpropagated through network layers
Some error is attributed to each layer
Weights are adjusted
A large network can take a very long time to train
May not converge
Testing
Test the network after training
Examine network performance: measure the
network’s classification ability
Black box testing
Do the inputs produce the appropriate outputs?
Not necessarily 100% accurate
But may be better than human decision makers
Test plan should include
Routine cases
Potentially problematic situations
May have to retrain
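A minimal sketch of black-box testing: the trained network is treated as a function and its classification ability is measured on the test cases. The `accuracy` helper and the stand-in classifier are hypothetical.

```python
def accuracy(predict, test_cases):
    """Fraction of test cases for which the network output matches the expected output."""
    correct = sum(1 for x, expected in test_cases if predict(x) == expected)
    return correct / len(test_cases)

# Stand-in for a trained network: any callable that maps an input to a class.
def dummy_classifier(x):
    return 1 if x >= 0.5 else 0

# Mix routine cases with potentially problematic ones (values near the boundary).
cases = [(0.1, 0), (0.9, 1), (0.6, 0), (0.4, 0)]
print(accuracy(dummy_classifier, cases))  # 3 of 4 correct -> 0.75
```

A result below the level required for the application is a signal that the network may have to be retrained.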
Unsupervised Learning
Only input stimuli shown to the network
Network is self-organizing
Number of categories into which the network
classifies the inputs can be controlled by varying
certain parameters
Examples
Adaptive Resonance Theory (ART)
Kohonen Self-organizing Feature Maps
The Self-Organizing Map:
An Alternative NN Architecture
Kohonen Self-Organizing Map (SOM)
Unsupervised learning
Weights self-adjust to input pattern
Topology
Unsupervised Neural Networks –
Kohonen Learning
Also known as the Self-Organizing Map
Learn a categorization of input space
Neurons are connected into a 1-D or 2-D lattice.
Each neuron represents a point in N-dimensional
pattern space, defined by N weights
During training, the neurons move around to try and
fit to the data
Changing the position of one neuron in data space
influences the positions of its neighbors via the
lattice connections
Self Organizing Map – Network
Structure
All inputs are connected by
weights to each neuron
size of neighbourhood
changes as net learns
Aim is to map similar inputs
(sets of values) to similar
neuron positions.
Data is clustered because
it is mapped to the same
node or group of nodes
SOM-Algorithm
1. Initialization: Weights are set to unique
random values
2. Sampling: Draw an input sample x and
present it to the network
3. Similarity Matching: The winning neuron i is
the neuron with the weight vector that best
matches the input vector
$i = \arg\min_j \lVert x - w_j \rVert$
SOM - Algorithm
4. Updating : Adjust the weights of the winning neuron
so that they better match the input.
Also adjust the weights of the neighbouring neurons.
$\Delta w_j = \eta \, h_{ij} \, (x - w_j)$
neighbourhood function: $h_{ij}$
Over time the neighbourhood function gets smaller
Result: The neurons provide a good approximation of
the input space and correspond
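The four steps of the SOM algorithm can be sketched for a 1-D lattice as follows. The Gaussian neighbourhood function, the linear decay of the learning rate and neighbourhood width, and all parameter names and default values are assumptions made for this example.

```python
import math
import random

def best_matching_unit(weights, x):
    """Similarity matching: index of the neuron whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((xi - wi) ** 2 for xi, wi in zip(x, w))
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j]))

def train_som(data, n_neurons=5, n_steps=1000, eta0=0.5, sigma0=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # 1. Initialization: random weights for every neuron on the lattice.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for t in range(n_steps):
        frac = 1.0 - t / n_steps
        eta = eta0 * frac                 # learning rate decays (assumed linear)
        sigma = max(sigma0 * frac, 0.01)  # neighbourhood gets smaller over time
        x = rng.choice(data)              # 2. Sampling
        i = best_matching_unit(weights, x)  # 3. Similarity matching
        for j, w in enumerate(weights):     # 4. Updating winner and neighbours
            h_ij = math.exp(-((j - i) ** 2) / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += eta * h_ij * (x[k] - w[k])
    return weights

# Two small clusters of 2-D points (illustrative data).
points = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
for w in train_som(points):
    print([round(v, 2) for v in w])
```

Because neighbours on the lattice are pulled along with the winner, neurons that are close on the lattice end up close in data space.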
Applications
Clustering: explores the similarity between patterns and places
similar patterns in a cluster.
data compression
data mining.
Classification/Pattern recognition: assigns an input pattern
(like handwritten symbol) to one of many classes.
associative memory.
Function approximation: finds an estimate of the unknown
function f() subject to noise.
Various engineering and scientific disciplines
Prediction/Dynamical Systems: forecasts future
values of time-sequenced data. Prediction differs from function
approximation by taking the time factor into account. Here the system is
dynamic and may produce different results for the same input
data based on system state (time).
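One common way to expose the time factor to a network is to feed it a sliding window of past values and ask for the next one. This is an assumption for illustration and not something the slides prescribe; the name `make_windows` is hypothetical.

```python
def make_windows(series, window):
    """Turn a time series into (inputs, target) pairs for one-step-ahead prediction."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

for inputs, target in make_windows([1, 2, 3, 4, 5], window=2):
    print(inputs, "->", target)
```

Each pair can then be used as a supervised training case, with the window as input pattern and the following value as desired output.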
Types of Neural Networks
Neural network types can be classified based on the following attributes:
• Applications
-Classification
-Clustering
-Function approximation
-Prediction
• Connection Type
- Static (feedforward)
- Dynamic (feedback)
• Topology
- Single layer
- Multilayer
- Recurrent
- Self-organized
• Learning Methods
- Supervised
- Unsupervised
Neural Computing Paradigms
Decisions the builder must make
Size of training and test data
Learning algorithms
Topology: number of processing elements and their
configurations
Transformation (transfer) function
Learning rate for each layer
Diagnostic and validation tools
Results in the Network's Paradigm
Neural Network Software
Program in:
Programming language
Neural network package or NN programming tool
Both
Tools (shells) incorporate:
Training algorithms
Transfer and summation functions
May still need to:
Program the layout of the database
Partition the data (test data, training data)
Transfer the data to files suitable for input to an ANN tool
NN Development Tools
MATLAB NNTOOL
Braincel (Excel Add-in)
NeuralWorks
Brainmaker
PathFinder
Trajan Neural Network Simulator
NeuroShell Easy
SPSS Neural Connector
NeuroWare
Limitations of
Neural Networks
Do not do well at tasks that are not done well by
people
Lack explanation capabilities
Limitations and expense of hardware technology
restrict most applications to software simulations
Training time can be excessive and tedious
Usually requires large amounts of training and test
data
Neural Networks
For Decision Support
Inductive means for gathering, storing, and using
experiential knowledge
Forecasting
ANN in decision support: Easy sensitivity analysis
and partial analysis of input factors
The relationship between a combined expert system,
ANN and a DSS
ANN can expand the boundaries of DSS
Neural Networks for
Regression
Discrete Inputs
Classification
Pattern Recognition
Clustering
Case study:
Assessment of Implication of Competitiveness on Human
Development of Countries via Data Envelopment Analysis and
Cluster Analysis
WEF Scores
- Basic requirements
- Efficiency enhancers
- Innovation and sophistication
factors
HDI Scores
- Life expectancy at birth
- Combined gross enrollment ratio for primary,
secondary and tertiary schools
- GDP / capita
OUTPUT-ORIENTED SUPER-EFFICIENCY DEA
Calculation of countries’ efficiency
scores considering WEF scores as
input and HDI scores as output
CLUSTER ANALYSIS by SOM
Classification of the countries based
on WEF and HDI scores
Analyzing the evolution of countries in
competitiveness and human development
perspectives
Iterative clustering: [2 X 4], [2 X 3], [5 X 1], [2 X 2]
Results
Changes over years
Modeling construction
problems using ANN
Homework
Find three articles from your work domain utilizing ANNs
to solve a problem, to help decision making, etc.
Scholar.google.com
Sciencedirect.com
www.library.itu.edu.tr
Select one of them to criticize in one or two paragraphs.
How are they using ANNs? What are the inputs and outputs?
Do you think it is appropriate to use ANNs for that domain? What
other technique can be used?
Submit the full paper of the selected article along with your
criticism. Submit abstracts for the other two.