Chapter 10 Neural Networks

Chapter Objectives
• Understand how feed-forward networks are used to solve estimation problems.
• Know how input and output data conversions are performed for neural networks.
• Understand how feed-forward neural networks learn through backpropagation.
• Know how genetic learning is applied to train feed-forward neural networks.
• Know how self-organizing neural networks perform unsupervised clustering.
• List the strengths and weaknesses of neural networks.
Feed-Forward Neural Network
[Figure slides: feed-forward neural network architecture]
Neural Network Training: A Conceptual View
[Figure slides: conceptual view of neural network training]
Neural Network Explanation
Sensitivity analysis is a technique that has been successfully applied to gain insight into the effect individual attributes have on neural network output.
The general process consists of the following steps:
1. Divide the data into a training set and a test set.
2. Train the network with the training data.
3. Use the test set data to create a new instance I. Each attribute value for I is the average of all attribute values within the test data.
4. For each attribute:
a. Vary the attribute value within instance I and present the modified I to the network for classification.
b. Determine the effect the variations have on the output of the neural network.
c. The relative importance of each attribute is measured by the effect of attribute variations on network output. (A minimal code sketch of this procedure follows.)
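A minimal sketch of this procedure, assuming the trained network is available as a predict callable that maps a list of numeric attribute values in [0, 1] to a single output (the function name, the delta step, and the data layout are illustrative assumptions, not part of the original text):

# Sensitivity analysis sketch: measure how much each attribute moves the
# network output when it is varied around the averaged instance I.
def sensitivity_analysis(predict, test_data, delta=0.1):
    # test_data: list of instances, each a list of numeric attribute values in [0, 1]
    n_attrs = len(test_data[0])
    # Step 3: build instance I from the average of each attribute over the test data.
    avg_instance = [sum(row[i] for row in test_data) / len(test_data)
                    for i in range(n_attrs)]
    base_output = predict(avg_instance)
    importance = []
    for i in range(n_attrs):
        # Step 4a: vary attribute i while holding the others at their averages.
        modified = list(avg_instance)
        modified[i] = min(1.0, avg_instance[i] + delta)
        # Steps 4b/4c: the change in output measures the attribute's relative importance.
        importance.append(abs(predict(modified) - base_output))
    return importance

Attributes can then be rank ordered by the returned importance scores, as described in step 4c.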
General Considerations
The following is a partial list of choices that affect the
performance of a neural network model:
• What input attributes will be used to build the network?
• How will the network output be represented?
• How many hidden layers should the network contain?
• How many nodes should there be in each hidden layer?
• What condition will terminate network training?
Neural Network Training: A Detailed View
[Figure slide: detailed view of neural network training]
Neural Networks
• Advantages
– prediction accuracy is generally high
– robust, works when training examples contain errors
– output may be discrete, real-valued, or a vector of
several discrete or real-valued attributes
– fast evaluation of the learned target function
• Criticism
– long training time
– difficult to understand the learned function (weights)
– not easy to incorporate domain knowledge
A Neuron
[Figure: a single neuron. The inputs x0, x1, ..., xn of input vector x are multiplied by the weights w0, w1, ..., wn of weight vector w, the weighted sum is offset by the bias term -μk, and the result is passed through the activation function f to produce the output y.]
• The n-dimensional input vector x is mapped into the variable y by means of the scalar product and a nonlinear function mapping: y = f(w · x - μk)
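A minimal sketch of this mapping, with the sigmoid chosen as the nonlinear function f and the bias written as mu_k (the variable names and sample values are illustrative):

import math

def neuron_output(x, w, mu_k):
    # y = f(w . x - mu_k), with f the sigmoid activation function
    weighted_sum = sum(xi * wi for xi, wi in zip(x, w)) - mu_k
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Example: three inputs, three weights, bias 0.5
print(neuron_output([0.2, 0.7, 0.4], [0.3, -0.1, 0.8], 0.5))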
Network Training
• The ultimate objective of training
– obtain a set of weights that classifies almost all of the tuples in the training data correctly
• Steps (a minimal code sketch follows the list)
– Initialize the weights with random values
– Feed the input tuples into the network one by one
– For each unit
• Compute the net input to the unit as a linear combination of all the inputs to the unit
• Compute the output value using the activation function
• Compute the error
• Update the weights and the bias
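A minimal sketch of these steps for a single sigmoid output unit trained with a delta-rule style update (the learning rate, epoch count, and initialization range are illustrative assumptions; full backpropagation carries the same kind of update backward through the hidden layers):

import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_unit(tuples, targets, epochs=100, lr=0.5):
    # tuples: list of input vectors scaled to [0, 1]; targets: desired outputs in [0, 1]
    n = len(tuples[0])
    weights = [random.uniform(-0.5, 0.5) for _ in range(n)]  # initialize weights with random values
    bias = random.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for x, t in zip(tuples, targets):                    # feed the input tuples one by one
            net = sum(wi * xi for wi, xi in zip(weights, x)) + bias  # net input: linear combination
            out = sigmoid(net)                               # output via the activation function
            err = (t - out) * out * (1.0 - out)              # error term for a sigmoid unit
            weights = [wi + lr * err * xi for wi, xi in zip(weights, x)]  # update the weights
            bias += lr * err                                 # update the bias
    return weights, bias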
Multi-Layer Perceptron
[Figure: a multi-layer perceptron. The input vector xi feeds the input nodes; weighted connections wij lead through the hidden nodes to the output nodes, which produce the output vector.]
• Each unit j computes its net input Ij as a weighted sum of the outputs Oi of the previous layer plus a bias θj, then applies the sigmoid activation:
Ij = Σi wij Oi + θj
Oj = 1 / (1 + e^(-Ij))
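A minimal sketch of one forward pass using these two equations (the layer sizes, weights, and biases below are illustrative):

import math

def layer_forward(prev_outputs, weights, biases):
    # Oj = 1 / (1 + e^(-Ij)) with Ij = sum_i wij * Oi + theta_j;
    # weights[j][i] connects node i in the previous layer to node j in this layer
    outputs = []
    for w_j, theta_j in zip(weights, biases):
        i_j = sum(w * o for w, o in zip(w_j, prev_outputs)) + theta_j
        outputs.append(1.0 / (1.0 + math.exp(-i_j)))
    return outputs

# Example: 3 input nodes -> 2 hidden nodes -> 1 output node
x = [0.1, 0.9, 0.4]
hidden = layer_forward(x, [[0.2, -0.3, 0.4], [-0.5, 0.1, 0.2]], [0.1, -0.2])
output = layer_forward(hidden, [[0.3, -0.2]], [0.05])
print(output)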
Chapter Summary
• A neural network is a parallel computing system of several interconnected processor nodes.
• The input to individual network nodes is restricted to numeric values falling in the closed interval range [0, 1].
• Because of this, categorical data must be transformed prior to network training (a small conversion sketch follows).
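A minimal sketch of the two usual conversions, min-max scaling for numeric attributes and one-hot encoding for categorical attributes (the sample values are illustrative):

def min_max_scale(values):
    # scale a numeric attribute into the closed interval [0, 1]
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def one_hot(values):
    # give each category its own [0, 1] input node
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

print(min_max_scale([10, 25, 40]))       # [0.0, 0.5, 1.0]
print(one_hot(["red", "blue", "red"]))   # blue and red each get one input node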
• Developing a neural network involves first training the network to carry out the desired computations and then applying the trained network to solve new problems.
• During the learning phase, training data is used to modify the connection weights between pairs of nodes so as to obtain the best result for the output node(s).
• The feed-forward neural network architecture is commonly used for supervised learning.
• Feed-forward neural networks contain a set of layered nodes and weighted connections between nodes in adjacent layers.
• Feed-forward neural networks are often trained using a backpropagation learning scheme.
• Backpropagation learning works by making modifications in weight values starting at the output layer and then moving backward through the hidden layers of the network (the standard update equations are summarized below).
• Genetic learning can also be applied to train feed-forward networks.
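One standard formulation of these weight modifications for sigmoid units, written here as a math block: Oj is the output of unit j, Tj its target value, l the learning rate, θj the bias, and k ranges over the units in the layer after unit j.

\[
\begin{aligned}
Err_j &= O_j (1 - O_j)(T_j - O_j) && \text{(output unit)} \\
Err_j &= O_j (1 - O_j) \sum_k Err_k \, w_{jk} && \text{(hidden unit)} \\
w_{ij} &\leftarrow w_{ij} + l \, Err_j \, O_i, \qquad \theta_j \leftarrow \theta_j + l \, Err_j
\end{aligned}
\]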
• The self-organizing Kohonen neural network architecture is a popular model for unsupervised clustering.
• A self-organizing neural network learns by having several output nodes compete for the training instances.
• For each instance, the output node whose weight vector most closely matches the attribute values of the input instance is the winning node.
• As a result, the winning node has its associated input weights modified to more closely match the current training instance (a minimal sketch of this competition follows).
• When unsupervised learning is complete, output nodes winning the most instances are saved.
• After this, test data is applied, and the clusters formed by the test set data are analyzed to help determine the meaning of what has been found.
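A minimal sketch of the competition and weight update for a single training instance (the two-output-node network, learning rate, and data below are illustrative):

def kohonen_step(instance, output_weights, lr=0.3):
    # competition: the winner is the output node whose weight vector is closest
    # to the instance (squared Euclidean distance)
    def dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, instance))
    winner = min(range(len(output_weights)), key=lambda j: dist(output_weights[j]))
    # update: move the winner's weight vector toward the current training instance
    output_weights[winner] = [wi + lr * (xi - wi)
                              for wi, xi in zip(output_weights[winner], instance)]
    return winner

# Example: two output nodes compete for one instance
weights = [[0.2, 0.8], [0.9, 0.1]]
print(kohonen_step([0.25, 0.7], weights), weights)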
• A central issue surrounding neural networks is their inability to explain what has been learned.
• Despite this, neural networks have been successfully applied to solve problems in both the business and scientific worlds.
• Although we have discussed the most popular neural network models, several other architectures and learning rules have been developed.
• Jain, Mao, and Mohiuddin (1996) provide a good starting point for learning more about neural networks.
Key Terms
Average member technique. An unsupervised clustering neural network explanation technique where the most typical member of each cluster is computed by finding the average value for each class attribute.
Backpropagation learning. A training method used with many feed-forward networks that works by making modifications in weight values starting at the output layer then moving backward through the hidden layers.
Delta rule. A neural network learning rule designed to minimize the sum of squared errors between computed and target network output.
Epoch. One complete pass of the training data through a neural network.
Feed-forward neural network. A neural network architecture where all weights at one layer are directed toward nodes at the next network layer. Weights do not cycle back as inputs to previous layers.
Fully connected. A neural network structure where all nodes at one layer of the network are connected to all nodes in the next layer.
Kohonen network. A two-layer neural network used for unsupervised clustering.
Neural network. A parallel computing system consisting of several interconnected processors.
Neurode. A neural network processor node. Several neurodes are connected to form a complete neural network structure.
Sensitivity analysis. A neural network explanation technique that allows us to determine a rank ordering for the relative importance of individual attributes.
Sigmoid function. One of several commonly used neural network evaluation functions. The sigmoid function is continuous and outputs a value between 0 and 1.
Linearly separable. Two classes, A and B, are said to be linearly separable if a straight line can be drawn to separate the instances of class A from the instances of class B.
Perceptron neural network. A simple feed-forward neural network architecture consisting of an input layer and a single output layer.
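A minimal sketch tying these two terms together: a single-layer perceptron trained with a simple threshold update can learn the logical AND function, whose two classes are linearly separable (the learning rate, epoch count, and threshold at zero are illustrative assumptions):

def train_perceptron(examples, epochs=20, lr=0.2):
    # examples: list of (input_vector, target) pairs with targets 0 or 1
    n = len(examples[0][0])
    w, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, t in examples:
            out = 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0
            err = t - out                                  # 0 when the prediction is already correct
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            bias += lr * err
    return w, bias

# Logical AND: a straight line separates the instance (1, 1) from the other three
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data)
print([1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x, _ in and_data])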
Reference
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques (Chapter 7 slides for the textbook), Intelligent Database Systems Research Lab, School of Computing Science, Simon Fraser University, Canada.