Multilayer feed-forward artificial neural networks for class-modeling
F. Marini, A. Magrì, R. Bucci
Dept. of Chemistry - University of Rome “La Sapienza”
The starting question….
[Bar chart: ANN papers published per period, 1982-2002, rising from 1 (1982) through 4, 28, 148, 367, 1615, 2509, 3577, 4643 and 4780 to 4916 (2002)]
Although the literature on NNs has grown significantly, no paper considers the possibility of performing class-modeling
class modeling: what….
[Diagram contrasting classification and class modeling]
• Class modeling considers one class at a time
• Any object can then either belong or not belong to that specific class model
• As a consequence, any object can be assigned to a single class, to more than one class, or to no class at all
…..and why
• Flexibility
• Additional information (see the sketch after this list):
– sensitivity: fraction of samples from category X accepted by the model of category X
– specificity: fraction of samples from category Y (or Z, W, …) refused by the model of category X
• No need to rebuild the existing models each time a new category is added
• A less equivocal answer to the question: "are the analytical data compatible with the product being X, as declared?"
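As an illustration, a minimal sketch of how these two figures of merit could be computed from boolean acceptance flags (Python; the function names and data layout are ours, not from the slides):

```python
import numpy as np

def sensitivity(accepted_own: np.ndarray) -> float:
    # Fraction of samples from category X accepted by the model of X.
    return float(np.mean(accepted_own))

def specificity(accepted_other: np.ndarray) -> float:
    # Fraction of samples from another category refused by the model of X.
    return float(np.mean(~accepted_other))

# Toy example: the model of X accepts 9 of its 10 own samples (sensitivity 0.9)
# and wrongly accepts 2 of 8 samples from another class (specificity 0.75).
print(sensitivity(np.array([True] * 9 + [False])))        # 0.9
print(specificity(np.array([True] * 2 + [False] * 6)))    # 0.75
```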
A first step forward
• A particular kind of NN, after suitable modifications, could be used for performing class-modeling (Anal. Chim. Acta 544 (2005) 306):
– Kohonen SOM
– Addition of dummy random vectors to the training set
– Computation of a suitable (non-parametric) probability distribution after mapping onto the 2D Kohonen layer
– Definition of the category space based on this distribution
In this communication…
…the possibility of using a different type of neural network (multilayer feed-forward) to perform class-modeling is studied
– How to?
– Examples
Just a few words about NN
"Many are the wonders, but none is more wondrous than man."
Sophocles (Antigone)
NN: a mathematical approach
• From a computational point of view, ANNs represent a
way to operate a non-linear functional mapping between
an input and an output space.
$\mathbf{y} = f(\mathbf{x})$
• This functional relation is expressed in an implicit way (via
a combination of suitably weighted non-linear functions,
in the case of MLF-NN)
• ANNs are usually represented as groups of elementary
computational units (neurons) performing simultaneously
the same operations.
• Types of NN differ in how neurons are grouped and
how they operate
Multilayer feed-forward NN
• Individual processing units are organized in three types
of layer: input, hidden and output
• All neurons within the same layer operate
simultaneously
[Diagram: an MLF network with input nodes x1-x5, a hidden layer, and output nodes y1-y4]
The artificial neuron
[Diagram: a hidden neuron k receiving inputs x1, x2, x3 through weights w1k, w2k, w3k and producing zk via the transfer function f]

$z_k = f\left(\sum_i w_{ik}\, x_i + w_{0k}\right)$
The artificial neuron
[Diagram: an output neuron j receiving hidden outputs z1, z2, z3 through weights w1j, w2j, w3j and producing yj via the transfer function f]

$y_j = f\left(\sum_k w_{kj}\, z_k + w_{0j}\right) = f\left(\sum_k w_{kj}\, f\left(\sum_i w_{ik}\, x_i + w_{0k}\right) + w_{0j}\right)$
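A minimal sketch of the forward pass defined by these two equations, assuming a sigmoid transfer function (the slides do not fix f; all names and shapes are illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W_ih, w0_h, W_ho, w0_o):
    # x: (n_in,); W_ih: (n_in, n_hid); w0_h: (n_hid,)
    # W_ho: (n_hid, n_out); w0_o: (n_out,)
    z = sigmoid(x @ W_ih + w0_h)   # z_k = f(sum_i w_ik x_i + w_0k)
    y = sigmoid(z @ W_ho + w0_o)   # y_j = f(sum_k w_kj z_k + w_0j)
    return y
```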
Training
• Iterative variation of connection weights, to minimize an
error criterion.
• Usually, the backpropagation algorithm is used, with the weight update

$\Delta w_{ij}(t) = -\eta\, \frac{\partial E}{\partial w_{ij}} + \mu\, \Delta w_{ij}(t-1)$
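In code, this update could look like the following sketch, where eta is the learning rate and mu the momentum coefficient (the default values are illustrative, and grad_E stands for the error gradient delivered by backpropagation):

```python
def update_weights(W, grad_E, delta_prev, eta=0.1, mu=0.9):
    # Delta_w(t) = -eta * dE/dw + mu * Delta_w(t-1)
    delta = -eta * grad_E + mu * delta_prev
    return W + delta, delta   # new weights and the stored Delta_w(t)
```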
MLF class-modeling: what to do?
• A model for each category has to be built using only training samples from that category
• A suitable definition of the category space is needed
Somewhere to start from
[Diagram: an input-hidden-input network with inputs x1 … xj … xm reproduced at the output; plot of the output value of hidden node 1]
When the targets are equal to the input values, the hidden nodes can be thought of as a sort of non-linear principal components
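A minimal sketch of this idea using scikit-learn's MLPRegressor as an autoassociative network (the library choice, data, and bottleneck size are ours, not the authors'):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # 100 training samples, 5 variables

# Ninp-Nhid-Ninp network: the input vector is also the target vector.
auto = MLPRegressor(hidden_layer_sizes=(2,), activation="logistic",
                    max_iter=5000, random_state=0)
auto.fit(X, X)
X_hat = auto.predict(X)                  # reconstructed inputs
recon_error = ((X - X_hat) ** 2).mean()  # mean reconstruction error
```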
… and a first ending point
• For each category, a neural network model is computed providing the input vector also as the desired target vector (an Ninp-Nhid-Ninp architecture)
• The number of hidden neurons is estimated by leave-one-out cross-validation (minimum reconstruction error in prediction)
• The optimized model is then used to predict unknown samples:
– The sample is presented to the network
– The vector of predicted responses (an estimate of the original input vector) is computed
– The prediction error is calculated and compared to the average prediction error for samples belonging to the category (as in SIMCA)
NN-CM in practice
• Separate category autoscaling
• Training on $\mathbf{X}^{C}_{\mathrm{train}}$ yields $N^{C}_{\mathrm{hid}}$, the weights $\mathbf{W}^{C}$ and the reference variance $s^{2}_{0,C}$
• $\hat{\mathbf{x}}^{C}_{\mathrm{test},i} = f(\mathbf{x}_{\mathrm{test},i};\ \mathbf{W}^{C})$
• $s^{2}_{i,C} = (\mathbf{x}_{\mathrm{test},i} - \hat{\mathbf{x}}^{C}_{\mathrm{test},i})^{T} (\mathbf{x}_{\mathrm{test},i} - \hat{\mathbf{x}}^{C}_{\mathrm{test},i}) / N_{V}$
• $F_{i,C} = s^{2}_{i,C} / s^{2}_{0,C}$
• If $p(F \geq F_{i,C})$ is lower than a predefined threshold, the sample is refused by the category model
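Putting the rule together, a sketch of the acceptance test for one sample (Python with scipy; the degrees of freedom of the F distribution are not stated on the slides, so df1 and df2 are left as assumed parameters):

```python
import numpy as np
from scipy import stats

def accepted(x_test, x_hat, s2_0C, df1, df2, alpha=0.05):
    resid = x_test - x_hat
    s2_iC = float(resid @ resid) / x_test.size   # s2_i,C = ||x - x_hat||^2 / N_V
    F_iC = s2_iC / s2_0C                         # F_i,C
    p = stats.f.sf(F_iC, df1, df2)               # p(F >= F_i,C)
    return p >= alpha                            # refused when p < threshold
```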
A couple of examples
The classical X-OR
• 200 training samples:
– 100 class 1
– 100 class 2
• 200 test samples:
– 100 class 1
– 100 class 2
• 3 hidden neurons for each category
Results
• Sensitivity:
– 100% class 1, 100% class 2
• Specificity:
– 75% class 1 vs. class 2
– 67% class 2 vs. class 1
• Prediction ability:
– 87% class 1
– 83% class 2
– 85% overall
• These results are significantly better than those obtained with SIMCA and UNEQ (specificities lower than 30%, classification only slightly higher than 60%)
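For reference, a sketch of how an XOR-type data set of this size could be generated (the exact geometry is our assumption: class 1 in the (+,+) and (-,-) quadrants, class 2 in the other two, 100 samples per class as on the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

def xor_class(n, quadrant_signs):
    # Half of the n points in each of the two given quadrants.
    blocks = [np.abs(rng.normal(size=(n // 2, 2))) * np.array(s)
              for s in quadrant_signs]
    return np.vstack(blocks)

X1_train = xor_class(100, [(+1, +1), (-1, -1)])   # class 1
X2_train = xor_class(100, [(+1, -1), (-1, +1)])   # class 2
```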
A very small data-set: honey
CM of honey samples
• 76 samples of honey from 6 different botanical origins
(honeydew, wildflower, sulla, heather, eucalyptus and
chestnut)
• 11-13 samples per class
• 2 input variables: specific rotation and total acidity
• Despite the small number of samples, a good NN
model was obtained (2 hidden neurons for each class)
• Possibility of drawing a Coomans’ plot
Further work and Conclusions
• A novel approach to class-modeling based on
multilayer feed-forward NN was presented
• Preliminary results seem to indicate its usefulness in
cases where traditional class modeling fails
• The effect of the training set size should be further investigated (our "small" data set was too good to be used for obtaining a definitive answer)
• We are analyzing other “exotic” data sets for
classification where traditional methods fail.
Acknowledgements
• Prof. Jure Zupan, Slovenia