ICT619 Intelligent Systems
ICT619 Intelligent
Systems
Topic 4: Artificial Neural
Networks
Artificial Neural Networks
PART A
Introduction
An overview of the biological neuron
The synthetic neuron
Structure and operation of an ANN
Problem solving by an ANN
Learning in ANNs
ANN models
Applications
PART B
Developing neural network applications
Design of the network
Training issues
A comparison of ANN and ES
Hybrid ANN systems
Case Studies
Developing neural network
applications
Neural Network Implementations
Three possible practical implementations of ANNs are:
1. A software simulation program running on a digital
computer
2. A hardware emulator, called a neurocomputer, connected to a host computer
3. True electronic circuits
Software Simulations of ANN
Currently the cheapest and simplest implementation
method for ANNs - at least for general purpose use.
Simulates parallel processing on a conventional
sequential digital computer
Replicates temporal behaviour of the network by
updating the activation level and output of each node
for successive time steps
These steps are represented by iterations or loops
Within each loop, the updates for all nodes in a layer
are performed.
Software simulations of ANN
(cont’d)
In multilayer ANNs, processing for a layer is
completed and its output used to calculate states of
the nodes in the following layer
Typical additional features of ANN simulators
1. Configuring the net according to a chosen architecture and
node operational characteristic
2. Implementation of training phase using a chosen training
algorithm
3. Tools for visualising and analysing behaviour of nets
ANN simulators are written in high-level languages such
as C, C++ and Java.
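The layer-by-layer update described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular simulator: the sigmoid transfer function, the function names and the small 2-2-1 network with arbitrary weights are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Logistic transfer function mapping any activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def update_layer(inputs, weights, biases):
    """Update every node in one layer: weighted sum, then transfer function."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        activation = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(sigmoid(activation))
    return outputs

def simulate(network, input_vector):
    """Process the layers in sequence; each layer's output feeds the next."""
    signal = input_vector
    for weights, biases in network:
        signal = update_layer(signal, weights, biases)
    return signal

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output node.
# The weight and bias values are arbitrary, chosen only for illustration.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # hidden layer
    ([[1.2, -0.7]], [0.05]),                   # output layer
]
output = simulate(network, [1.0, 0.0])
print(output)
```

The sequential loop over nodes stands in for what a real network would do in parallel, which is why simulation time grows with network size.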
Advantages and possible problems
with software simulators
Main attraction of ANN simulators is the relatively low
cost and wide availability of ready-made commercial
packages
They are also compact, flexible and highly portable.
Writing your own simulator requires programming skills
and would be time consuming (except that you don't
have to now!)
Training of ANNs using software simulators can be
slow for larger networks (more than a few hundred nodes)
Commercially available neural net
packages
Prewritten shells with convenient user interfaces
Cost a few hundred to tens of thousands of dollars
Allow users to specify the ANN design and training
parameters
Usually provide graphic interfaces to enable monitoring
of the net’s training and operation
Likely to provide interfacing with other software
systems such as spreadsheets and databases.
Neurocomputers
Dedicated special-purpose digital
computer (aka accelerator boards)
Optimised to perform operations
common in neural network simulation
Acts as a coprocessor to a host
computer and is controlled by a
program running on the host.
Can be tens to thousands of times
faster than simulators
Systems are available with approx.
1000 million IPS (interconnection updates
per second) for networks with 8,192
neurons, e.g., the ACC Neural Network
Processor
Neurocomputers
Genobyte's CAM-Brain Machine was developed between 1997 and 2000
True Networks in Hardware
Closer to biological neural networks than simulations
Consist of synthetic neurons actually fabricated on
silicon chips
Commercially available hardwired ANNs are limited to
a few thousand neurons per chip [1].
Chips connected in parallel to achieve larger networks.
Problems: interconnection and interference, fixed-valued weights - work progressing on modifiable
synapses.
[1] Figures more than five years old.
Neural Network Development
Methodology
Aims to add structure and organisation to ANN
applications development for reducing cost, increasing
accuracy, consistency, user confidence and
friendliness
Split development into the following phases:
The Concept Phase
The Design Phase
The Implementation Phase
The Maintenance Phase
Neural Network Development
Methodology - the Concept Phase
Involves
Validating the proposed application
Selecting an appropriate neural paradigm.
Application validation
Problem characteristics suitable for neural network
application are:
Data intensive
Multiple interacting parameters
Incomplete, erroneous, noisy data
Solution function unknown or expensive
Requires flexibility, generalisation, fault-tolerance, speed
ANN Development Methodology - the
Concept Phase (cont’d)
Common examples of applications with above
attributes are
pattern recognition (eg, printed or handwritten character,
consumer behaviour, risk patterns),
forecasting (eg, stock market), signal (audio, video, ultrasound)
processing
Problems not suitable for ANN-based solutions include:
A mathematically accurate and precise solution is available
A solution involving deduction and step-wise logic is appropriate
Applications involving explanation or reporting
One application area that is unsuitable for ANNs is
resource management eg, inventory, accounts, sales
data analysis
Selecting an ANN paradigm
Decision based on comparison of application requirements
to capabilities of different paradigms
eg, the multilayer perceptron is well known for its pattern
recognition capabilities,
Kohonen net more suited for applications involving data
clustering
Choice of paradigm also influenced by the training method
that can be employed
eg, supervised training requires an adequate number of
(input, correct output) pairs, and training may take a
relatively long time
Technical and economic feasibility assessments should be
carried out to complete the concept phase
The Design Phase
The design phase specifies initial values and
conditions at the node, network and training levels
Decisions to be made at the node level include:
Types of input – binary (0,1), bipolar (-1,+1), trivalent (-1, 0, +1), discrete, continuous-valued
Transfer function - step or threshold, hyperbolic tangent,
sigmoid; consider possible use of lookup tables for
speeding up calculations
Decisions to be made at the network architecture
level
The number and size of layers and their connectivity
(fully interconnected, or sparsely interconnected, feedforward
or recurrent, other?)
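The node-level choices above can be made concrete with a short Python sketch. It shows a step function, a sigmoid, and a lookup-table version of the sigmoid of the kind the slide suggests for speeding up calculations (the hyperbolic tangent is available directly as `math.tanh`). The table range of [-8, 8] and the resolution of 1600 steps are arbitrary assumptions for illustration.

```python
import math

def step(x, threshold=0.0):
    """Step (threshold) transfer function: output 1 at or above the threshold."""
    return 1.0 if x >= threshold else 0.0

def sigmoid(x):
    """Sigmoid transfer function with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Precompute sigmoid values once; the range and resolution are assumed values.
LOW, HIGH, STEPS = -8.0, 8.0, 1600
TABLE = [sigmoid(LOW + i * (HIGH - LOW) / STEPS) for i in range(STEPS + 1)]

def sigmoid_lookup(x):
    """Approximate the sigmoid by reading the nearest precomputed table entry."""
    if x <= LOW:
        return TABLE[0]
    if x >= HIGH:
        return TABLE[-1]
    index = round((x - LOW) * STEPS / (HIGH - LOW))
    return TABLE[index]

print(step(0.2), math.tanh(0.5), sigmoid(0.37), sigmoid_lookup(0.37))
```

The lookup version trades a small approximation error for avoiding an exponential on every node update.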
The Design Phase (cont’d)
'Size' of a layer is the number of nodes in the layer
For the input layer, size is determined by number of data
sources (input vector components) and possibly the
mathematical transformations done
The number of nodes in the output layer is determined
by the number of classes or decision values to be output
Finding optimal size of the hidden layer needs some
experimentation
Too few nodes will produce inadequate mapping, while
too many may result in inadequate generalisation
The Design Phase (cont’d)
Connectivity
Connectivity determines the flow of signals between
neurons in the same or different layers
Some ANN models, such as the multilayer perceptron,
have only interlayer connections - there is no intralayer
connection
The Hopfield net is an example of a model with
intralayer connections
The Design Phase (cont’d)
Feedback
There may be no feedback of output values, eg, the
multilayer perceptron
or
There may be feedback as in a recurrent network eg,
the Hopfield net
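To illustrate feedback, the sketch below shows a very small Hopfield-style net in Python in which each node's output is fed back to every other node until the state stops changing. The Hebbian storage rule, the bipolar (-1, +1) coding and the six-node pattern are assumptions chosen for the example; the slides do not specify them.

```python
def train_hopfield(patterns):
    """Build a symmetric weight matrix with a Hebbian rule (no self-connections)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights

def recall(weights, state, max_sweeps=10):
    """Feed node outputs back as inputs until no node changes state."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if total >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

stored = [1, -1, 1, -1, 1, -1]       # hypothetical bipolar pattern
weights = train_hopfield([stored])
noisy = [1, -1, 1, -1, 1, 1]         # same pattern with the last value flipped
recovered = recall(weights, noisy)
print(recovered)
```

A multilayer perceptron, by contrast, would compute its output in a single pass with no such iteration.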
Other design questions include
Setting of parameters for the learning phase – eg,
stopping criterion, learning rate.
Possible addition of noise to speed up training.
The Implementation phase
Typical steps:
Gathering the training set
Selecting the development environment
Implementing the neural network
Testing and debugging the network
Gathering the training set
Aims to get right type of data in adequate amount
and in the right format
Gathering training data (cont’d)
How much data to gather?
Increasing data amount increases training time but may
help earlier convergence
Quality more important than quantity
Collection of data
Potential sources - historical records, instrument
readings, simulation results
Preparation of data
Involves preprocessing including scaling, normalisation,
binarisation, mapping to logarithmic scale, etc.
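The preprocessing operations listed above can be sketched as small Python helpers. The function names and the sample readings are hypothetical, and "normalisation" is taken here to mean zero mean and unit standard deviation, which is one common reading of the term.

```python
import math

def min_max_scale(values, low=0.0, high=1.0):
    """Scale values linearly into the range [low, high]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [low for _ in values]
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]

def normalise(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def binarise(values, threshold):
    """Map each value to 1 if it reaches the threshold, otherwise 0."""
    return [1 if v >= threshold else 0 for v in values]

def log_scale(values):
    """Map strictly positive values to a base-10 logarithmic scale."""
    return [math.log10(v) for v in values]

readings = [10.0, 100.0, 1000.0]     # hypothetical instrument readings
print(min_max_scale(readings))
print(normalise(readings))
print(binarise(readings, 100.0))
print(log_scale(readings))
```

Whatever transformation is applied to the training data must also be applied, with the same parameters, to data presented during operation.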
Gathering training data (cont’d)
Type of data to collect should be representative of
given problem including routine, unusual and
boundary-condition cases
Mix of good as well as imperfect data but not
ambiguous or too erroneous.
Selecting the development
environment
Hardware and software aspects
Hardware requirements based on
speed of operation
memory and storage capacity
software availability
cost
compatibility
The most popular platforms are workstations and high-end PCs (with accelerator board option)
Selecting the development
environment
Two options in choosing software
1. Custom-coded simulators – which require more
expertise on the part of the user but provide maximum
flexibility
2. Commercial development packages – which are
usually easy to use because of a more
sophisticated interface
Selecting the development
environment (cont’d)
Selection of hardware and software
environment usually based on following
considerations:
ANN paradigm to be implemented
Speed in training and recall
Transportability
Vendor support
Extensibility
Price
Implementing the neural network
Common steps involved are:
Selection of appropriate neural paradigm
Setting network size
Deciding on the learning algorithm
Creation of screen displays
Determining the halting criteria
Collecting data for training and testing
Data preparation including preprocessing
Organising data into training and test sets
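The last step, organising data into training and test sets, might look like the following Python sketch. The 80/20 split, the fixed seed and the toy examples are assumptions for illustration only.

```python
import random

def split_data(examples, test_fraction=0.2, seed=0):
    """Shuffle the examples, then hold back a fraction of them for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (input vector, target) pairs.
examples = [([x / 10.0], x % 2) for x in range(10)]
training_set, test_set = split_data(examples)
print(len(training_set), len(test_set))
```

Shuffling before splitting helps both sets remain representative of the problem.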
Implementation - Training
Training the net, which consists of
Loading the training set
Initialisation of network weights – usually to
small random values
Starting the training process
Monitoring the training process until training
is completed
Saving of weight values in a file for use
during operation mode
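The training steps above can be followed end to end in a small Python sketch. It trains a single threshold node with the perceptron learning rule, which is a stand-in chosen for brevity rather than the algorithm a real application would necessarily use. The AND training set, the learning rate, the halting criterion (an error-free pass) and the file name `ann_weights.json` are all assumptions.

```python
import json
import os
import random
import tempfile

def predict(weights, inputs):
    """Threshold node: output 1 when the weighted sum plus bias is non-negative."""
    total = sum(w * x for w, x in zip(weights, inputs)) + weights[-1]
    return 1 if total >= 0 else 0

def train(training_set, learning_rate=0.1, max_epochs=200, seed=0):
    """Train one node; return the weights and the error count per epoch."""
    rng = random.Random(seed)
    n_inputs = len(training_set[0][0])
    # Initialise weights (the last entry is the bias) to small random values.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs + 1)]
    history = []
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in training_set:
            delta = target - predict(weights, inputs)
            if delta != 0:
                errors += 1
                for i, x in enumerate(inputs):
                    weights[i] += learning_rate * delta * x
                weights[-1] += learning_rate * delta
        history.append(errors)       # monitoring: misclassifications this epoch
        if errors == 0:              # halting criterion: an error-free pass
            break
    return weights, history

# Load the training set (here, the logical AND function).
training_set = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, history = train(training_set)

# Save the weight values in a file for use during operation mode.
path = os.path.join(tempfile.gettempdir(), "ann_weights.json")
with open(path, "w") as handle:
    json.dump(weights, handle)
print(history, weights)
```

In operation mode the saved weights would be reloaded with `json.load` and passed to `predict` without any further training.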
Implementation – Training
(cont’d)
Possible problems arising during training
Failure to converge to a set of optimal weight values
Further weight adjustments fail to reduce output error,
stuck in a local minimum
Remedied by resetting the learning parameters and
reinitialising the weights
Overtraining
Net fails to generalise, i.e., fails to classify less than
perfect patterns
Mix of good and imperfect patterns for training helps
Implementation – Training
(cont’d)
Training results may be affected by the method
of presenting data set to the network.
Adjustments may be made by varying the layer
sizes and fine-tuning the learning parameters.
To ensure optimal results, several variations of
a neural network may be trained and each
tested for accuracy
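One way to act on these points is to train several variants and keep the one that tests best, as in the Python sketch below. The variants here differ only in learning rate and random seed (which also changes the order in which patterns are presented); the single-node model, the OR data and the noisy test cases are hypothetical stand-ins for a real network and data set.

```python
import random

def predict(weights, inputs):
    """Threshold node with the bias stored as the last weight."""
    total = sum(w * x for w, x in zip(weights, inputs)) + weights[-1]
    return 1 if total >= 0 else 0

def train_node(training_set, learning_rate, seed, epochs=50):
    """Train one node, presenting the patterns in a new random order each epoch."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(training_set[0][0]) + 1)]
    for _ in range(epochs):
        order = list(training_set)
        rng.shuffle(order)
        for inputs, target in order:
            delta = target - predict(weights, inputs)
            for i, x in enumerate(inputs):
                weights[i] += learning_rate * delta * x
            weights[-1] += learning_rate * delta
    return weights

def accuracy(weights, data):
    """Fraction of cases classified correctly."""
    return sum(predict(weights, x) == t for x, t in data) / len(data)

training_set = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
test_set = [([0.1, 0.0], 0), ([0.0, 0.9], 1), ([0.9, 0.1], 1), ([1.0, 0.9], 1)]

results = []
for learning_rate in (0.01, 0.1, 0.5):
    for seed in (0, 1):
        trained = train_node(training_set, learning_rate, seed)
        results.append((accuracy(trained, test_set), learning_rate, seed))

best = max(results, key=lambda result: result[0])
print(best)
```

The same loop extends naturally to varying layer sizes once the model has hidden layers.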
Implementation - Testing and
Debugging
Testing can be done by:
1. Observing operational behaviour of the net.
2. Analysing actual weights
3. Study of network behaviour under specific conditions
Observing operational behaviour
Network treated as a black box and its response to a series
of test cases is evaluated
Test data
Should contain training cases as well as new cases
Routine, unusual as well as boundary condition cases
should be tried
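Black-box testing of this kind can be sketched in Python as follows. The network is treated as an opaque callable, and results are tallied separately for routine, unusual and boundary cases. The `net` function below is a hypothetical stand-in for a trained network, and the test cases are invented for illustration.

```python
def evaluate_black_box(net, test_cases):
    """Tally (passed, total) per category without inspecting the net's internals."""
    summary = {}
    for category, inputs, expected in test_cases:
        passed, total = summary.get(category, (0, 0))
        summary[category] = (passed + int(net(inputs) == expected), total + 1)
    return summary

def net(inputs):
    """Hypothetical stand-in for a trained network: 1 if the mean input exceeds 0.5."""
    return 1 if sum(inputs) / len(inputs) > 0.5 else 0

test_cases = [
    ("routine", [0.9, 0.8], 1),
    ("routine", [0.1, 0.2], 0),
    ("unusual", [1.0, 0.0], 0),
    ("boundary", [0.5, 0.5], 0),
    ("boundary", [0.51, 0.5], 1),
]
summary = evaluate_black_box(net, test_cases)
print(summary)
```

Reporting by category shows whether failures cluster in the unusual or boundary cases rather than hiding them in one overall figure.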
Implementation - Testing and
Debugging (cont’d)
Testing by weight analysis
Weights entering and exiting nodes analysed for
relatively small and large values
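A simple form of weight analysis is sketched below in Python: it scans a weight matrix and reports connections whose magnitude is unusually small or unusually large. The thresholds (0.05 and 5.0) and the weight values are arbitrary assumptions; suitable cut-offs depend on the network and its input scaling.

```python
def analyse_weights(weights, small=0.05, large=5.0):
    """Return (negligible, dominant) lists of (from_node, to_node, weight).

    weights[i][j] is the weight on the connection from node i in one layer
    to node j in the next layer.
    """
    negligible, dominant = [], []
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            if abs(w) < small:
                negligible.append((i, j, w))
            elif abs(w) > large:
                dominant.append((i, j, w))
    return negligible, dominant

# Hypothetical weights from a 3-node layer to a 2-node layer.
weights = [[0.8, -0.01], [6.2, 0.4], [0.02, -7.5]]
negligible, dominant = analyse_weights(weights)
print(negligible)
print(dominant)
```

Very small weights suggest a connection or input contributes little, while very large ones may indicate a node dominating the result.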
In case of significant errors detected in testing,
debugging would involve examining
the training cases for representativeness, accuracy and
adequacy of number
learning algorithm parameters such as the rate at which
weights are adjusted
neural network architecture, node characteristics, and
connectivity
training set-network interface, user-network interface
The Maintenance Phase
Consists of
placing the neural network in an operational
environment with possible integration
periodic performance evaluation, and maintenance
Although often designed as stand-alone systems,
some neural network systems are integrated with other
information systems using:
Loose-coupling – preprocessor, postprocessor,
distributed component
Tight-coupling or full integration as embedded
component
The Maintenance Phase
Possible ANN operational environments (figure not reproduced)
System evaluation
Continual evaluation is necessary to
ensure satisfactory performance in solving dynamic
problems
check for damaged or retrained networks.
Evaluation can be carried out by reusing
original test procedures with current data.
ANN Maintenance
Involves modification necessitated by
Decreasing accuracy
Enhancements
System modification falls into two categories
involving either data or software.
Data modification steps:
Training data is modified or replaced
Network retrained and re-evaluated.
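Periodic evaluation and the decision to retrain can be sketched as a short Python check: the original test procedure is rerun on current data, and retraining is flagged when accuracy has dropped noticeably. The baseline accuracy, the tolerance of 0.05, the stand-in `net` and the current cases are all hypothetical values chosen for the example.

```python
def accuracy(net, cases):
    """Fraction of (inputs, expected) cases the net classifies correctly."""
    return sum(net(inputs) == expected for inputs, expected in cases) / len(cases)

def needs_retraining(net, current_cases, baseline_accuracy, tolerance=0.05):
    """Flag retraining when accuracy on current data falls below the baseline."""
    return accuracy(net, current_cases) < baseline_accuracy - tolerance

def net(inputs):
    """Hypothetical stand-in for the deployed network."""
    return 1 if inputs[0] > 0.5 else 0

# Hypothetical current data in which the underlying problem has drifted.
current_cases = [([0.9], 1), ([0.6], 0), ([0.2], 0), ([0.7], 0)]
current_accuracy = accuracy(net, current_cases)
retrain = needs_retraining(net, current_cases, baseline_accuracy=0.95)
print(current_accuracy, retrain)
```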
ANN Maintenance (cont’d)
Software changes include changes in
Interfaces
cooperating programs
the structure of the network.
If the network is changed, part of the design and most
of the implementation phase may have to be repeated.
Backup copies should be used for maintenance and
research.
A comparison of ANN and ES
Similarities between ES and ANN
Both aim to create intelligent computer systems by
mimicking human intelligence, although at different
levels
Design process of neither ES nor ANN is automatic
Knowledge extraction in ES is a time and labour
intensive process
ANNs are capable of learning but selection and
preprocessing of data have to be done carefully.
A comparison of ANN and ES
(cont’d)
Differences between ANN and ES
Differ in aspects of design, operation and use
Logic vs. brain
ES simulate the human reasoning process based on
formal logic
ANNs are based on modelling the brain, both in structure
and operation
Sequential vs. parallel
The nature of processing in ES is sequential
ANNs are inherently parallel
A comparison of ANN and ES
(cont’d)
External and static vs. internal and dynamic
Learning is performed external to the ES
ANN itself is responsible for its knowledge acquisition
during the training phase.
Learning is always off-line in ES - knowledge remains
static during operation
Learning in ANNs, although mostly off-line, can be online
Deductive vs. inductive inferencing
Knowledge in an ES always used in a deductive
reasoning process
An ANN constructs its knowledge base inductively from
examples, and uses it to produce decision through
generalisation
A comparison of ANN and ES
(cont’d)
Knowledge representation: explicit vs. implicit
ES store knowledge in explicit form - possible to inspect
and modify individual rules
ANN knowledge is stored implicitly in the interconnection
weight values
Design issues: simple vs. complex
Technical side of ES development relatively simple
without difficult design choices.
ANN design process often one of trial and error
A comparison of ANN and ES
(cont’d)
User interface: white box vs. black box
ES have explanation capability
Difficulty in interpreting an ANN's knowledge-base
effectively makes it a black box to the user
State of maturity and recognition: well-established vs. early
ES already well established as a methodology in
commercial applications
ANN recognition and development tools at a
relatively early stage.
Hybrid systems
Neuro-symbolic computing utilises the complementary
nature of computing in neural networks (numerical) and
expert systems (symbolic).
Neuro-fuzzy systems combine neural networks with
fuzzy logic
ANNs can also be combined with genetic algorithm
methodology
Hybrid ES-ANN systems
The strengths of the ES can be utilised to overcome
the weaknesses of an ANN based system and vice
versa.
For example, the ANN’s extraction of knowledge from data
complements the ES’s explanation capability
Hybrid ES-ANN systems
Rule extraction by inference justification in an ANN
MACIE, an ANN based decision support system
described in (Gallant 1993)
Extracts a single rule that justifies an inference in an
ANN
Inference in an ANN is represented by output of a
single node
This output is based upon incomplete input values fed
from a number of nodes as shown in the diagram
below.
Hybrid ES-ANN systems (cont’d)
A node $u_i$ is defined to be a contributing node to node
$u_j$ if $w_{ij} u_i \geq 0$.
Hybrid ES-ANN systems (cont’d)
In this example, the
contributing variables are
{u2, u3, u5, u6 }.
The rule produced in this
example is:
IF u6 = Unknown
AND u2 = TRUE
AND u3 = FALSE
AND u5 = TRUE
THEN conclude u7 = TRUE.
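The selection of contributing nodes can be sketched in Python as below. This is not MACIE's full algorithm, only an illustration of the idea for an inference that comes out TRUE. It assumes the contributing condition is that the product of weight and node value is non-negative, and that values are coded TRUE = +1, FALSE = -1, Unknown = 0. The weights and the values of u1 and u4 are hypothetical, chosen so that the result matches the example above.

```python
VALUE_NAMES = {1: "TRUE", -1: "FALSE", 0: "Unknown"}

def contributing_nodes(weights, values):
    """Return input nodes whose weight times value is non-negative (assumed rule)."""
    return [node for node in weights if weights[node] * values[node] >= 0]

def justification_rule(weights, values, target):
    """Build a single IF-THEN rule from the contributing nodes."""
    total = sum(weights[node] * values[node] for node in weights)
    conclusion = "TRUE" if total > 0 else "FALSE"
    conditions = " AND ".join(
        f"{node} = {VALUE_NAMES[values[node]]}"
        for node in contributing_nodes(weights, values)
    )
    return f"IF {conditions} THEN conclude {target} = {conclusion}"

# Hypothetical weights on the connections into u7, and current node values.
weights = {"u1": -1.0, "u2": 3.0, "u3": -2.0, "u4": 1.0, "u5": 2.0, "u6": 1.0}
values = {"u1": 1, "u2": 1, "u3": -1, "u4": -1, "u5": 1, "u6": 0}

contributing = contributing_nodes(weights, values)
rule = justification_rule(weights, values, "u7")
print(contributing)
print(rule)
```

With these assumed numbers, u1 and u4 work against the inference and are left out of the rule, while the unknown u6 is retained because it does not oppose it.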
Hybrid ES-ANN systems (cont’d)
One approach to hybrid systems divides a problem into
tasks suitable for either ES or ANN
These tasks are then performed by the appropriate
methodology
One example of such a system (Caudill 1991) is an
intelligent system for delivering packages
ES performs the task of producing the best loading
strategy for packages into trucks
ANN works out best route for delivering the packages
efficiently.
Hybrid ES-ANN systems (cont’d)
Hybrid ES-ANN systems with ANNs embedded
within expert systems
ANN used to determine which rule to fire, given
the current state of facts.
Another approach to hybrid ES-ANN uses an
ANN as a preprocessor
One or more ANNs produce classifications.
Numerical outputs produced by ANN are
interpreted symbolically by an ES as facts
ES applies the facts for deductive reasoning
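The preprocessor arrangement can be sketched in Python as follows: numerical ANN outputs are thresholded into symbolic facts, and a small forward-chaining loop then applies rules to them. The machine-monitoring labels, the 0.5 threshold and the two rules are hypothetical; a real expert system shell would provide its own rule engine.

```python
def outputs_to_facts(outputs, threshold=0.5):
    """Interpret numeric ANN outputs symbolically: high scores become facts."""
    return {label for label, score in outputs.items() if score >= threshold}

def forward_chain(facts, rules):
    """Fire rules whose conditions all hold until no new fact can be added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Hypothetical classification scores produced by one or more ANNs.
ann_outputs = {"high_vibration": 0.91, "high_temperature": 0.78, "low_pressure": 0.12}

# Hypothetical ES rules: (set of required facts, concluded fact).
rules = [
    ({"high_vibration", "high_temperature"}, "bearing_wear"),
    ({"bearing_wear"}, "schedule_maintenance"),
]

facts = outputs_to_facts(ann_outputs)
conclusions = forward_chain(facts, rules)
print(sorted(conclusions))
```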
Case Study
Case: Application of ANNs in bankruptcy prediction
(Coleman et al, AI Review, Summer 1991, in Zahedi
1993)
Predicts banks that are certain to fail within a year
The predicted certainty is given to bank examiners dealing with the
bank in question.
ANN has 11 inputs, each of which is a ratio developed by
Peat Marwick.
Developed by NeuralWare’s Application Development
Services and Support Group (ADSS)
Software used - the NeuralWorks Professional neural
network development system.
Uses the standard backpropagation (multilayer perceptron)
network.
Case Study (cont’d)
Inputs connected to a single hidden layer, which in turn is
connected to a single node in the output layer.
Network outputs a single value denoting whether the bank
would or would not fail within that calendar year
Employed the hyperbolic-tangent transfer function and a
proprietary error function created by the ADSS staff.
Trained on a set of 1,000 examples, 900 of which were
viable banks and 100 of which were banks that had actually
gone bankrupt
Training consisted of about 50,000 iterations of the training
set.
Predicted 50% of banks that are viable, and 99% of banks
that actually failed.
REFERENCES
AI Expert (special issue on ANN), June 1990.
BYTE (special issue on ANN), Aug. 1989.
Caudill, M., "The View from Now", AI Expert, June 1992,
pp. 27-31.
Dhar, V., & Stein, R., Seven Methods for Transforming
Corporate Data into Business Intelligence, Prentice Hall,
1997.
Kirrmann, H., "Neural Computing: The new gold rush in
informatics", IEEE Micro, June 1989, pp. 7-9.
Lippman, R.P., "An Introduction to Computing with Neural
Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.
Lisboa, P., (Ed.) Neural Networks Current Applications,
Chapman & Hall, 1992.
Negnevitsky, M., Artificial Intelligence A Guide to Intelligent
Systems, Addison-Wesley, 2005.
Bailey, D., & Thompson, D., "How to Develop Neural Network
Applications", AI Expert, June 1990, pp. 38-47.
Caudill & Butler, Naturally Intelligent Systems, MIT
Press, 1989, pp. 227-240.
Caudill, M., "Expert networks", BYTE, October 1991,
pp. 109-116.
Gallant, S., Neural Network Learning and Expert Systems,
MIT Press, 1993.
Medsker, L., Hybrid Intelligent Systems, Kluwer Academic
Press, Boston, 1995.
Zahedi, F., Intelligent Systems for Business, Wadsworth
Publishing, Belmont, California, 1993.