
Soft computing
Lecture 7
Multi-Layer perceptrons
Why a hidden layer is needed
The XOR problem for a simple perceptron
[Figure: the four XOR points in the (X1, X2) plane — (0,1) and (1,0) belong to Class 1, (0,0) and (1,1) to Class 2.]
In this case it is not possible to draw a discriminant (separating) line.
Minimization of error
Main training algorithm
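The two titles above presumably refer to minimizing the squared output error by gradient descent; a standard formulation (an assumption, but consistent with the weight updates used later in this lecture; the symbols E, d_k, y_k, η, w are not in the original slides) is

$$E = \frac{1}{2}\sum_{k}(d_k - y_k)^2, \qquad \Delta w = -\eta\,\frac{\partial E}{\partial w},$$

where $d_k$ is the desired output and $y_k$ the actual output of the $k$-th output neuron, and $\eta$ is the learning step.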
Kinds of sigmoid used in perceptrons
Exponential
Rational
Hyperbolic tangent
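The slide gives only the names; the usual forms of these activation functions (the rational one, in a half-rectified variant, is the one actually used in the implementation later in this lecture) are

$$f(S) = \frac{1}{1 + e^{-aS}} \;\;\text{(exponential/logistic)}, \qquad f(S) = \frac{S}{|S| + a} \;\;\text{(rational)}, \qquad f(S) = \tanh(aS) \;\;\text{(hyperbolic tangent)},$$

where $S$ is the weighted sum of the neuron's inputs and $a$ is a steepness parameter (the field a of the TNN class below).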
Formulas for the error back-propagation algorithm

Modification of the weight of the synapse connecting the i-th neuron to the j-th one, where $x_i$ is the state (output) of the i-th neuron and $\eta$ is the learning step (Step in the code below):

$$\Delta w_{ij} = \eta\,\delta_j\,x_i \qquad (1)$$

For the output layer, with $d_j$ the desired and $y_j$ the actual output:

$$\delta_j = (d_j - y_j)\,f'(S_j) \qquad (2)$$

For hidden layers, where $k$ runs over the neurons of the next layer connected with the j-th neuron:

$$\delta_j = f'(S_j)\sum_{k}\delta_k\,w_{jk} \qquad (3)$$

The output-layer weights are updated using (2) together with (1), the hidden-layer weights using (3) together with (1); in the implementation below the derivative $f'(S_j)$ is taken as $y_j(1 - y_j)$.
Example of implementation
TNN = class(TObject)
public
  State: integer;
  N, NR, NOut, NH: integer;            // NR input, NH hidden, NOut output neurons
  a: real;                             // parameter of the sigmoid
  Step: real;                          // learning step
  NL: integer;                         // number of iterations during training
  S1: array[1..10000] of integer;      // states of the input (receptor) layer
  S2: array[1..200] of real;           // states of the hidden layer
  S3: array[1..5] of real;             // states of the output layer
  G3: array[1..5] of real;             // desired (target) outputs
  LX, LY: array[1..10000] of integer;  // screen coordinates of the input pixels
  W1: array[1..10000, 1..200] of real; // input-to-hidden weights
  W2: array[1..200, 1..5] of real;     // hidden-to-output weights
  W1n: array[1..10000, 1..200] of real; // updated input-to-hidden weights
  W2n: array[1..200, 1..5] of real;     // updated hidden-to-output weights
  SymOut: array[1..5] of string[32];   // labels of the output classes
  procedure FormStr;
  procedure Learn;
  procedure Work;
  procedure Neuron(i, j: integer);
end;
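A minimal usage sketch, not taken from the slides: the Neuron procedure below refers to a global instance Net (Net.a), so the class is presumably driven roughly as follows; all numeric values are illustrative assumptions.

var
  Net: TNN;
begin
  Net := TNN.Create;
  Net.NR := 10000;   // number of receptors (input pixels)
  Net.NH := 200;     // number of hidden neurons
  Net.NOut := 5;     // number of output neurons
  Net.a := 1.0;      // sigmoid parameter
  Net.Step := 0.1;   // learning step
  Net.NL := 1000;    // number of training iterations
  Net.Learn;         // train on the prepared examples
  Net.Work;          // run the trained network
end;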
Procedure simulating a single neuron
procedure TNN.Neuron(i, j: integer);
var
  k: integer;
  Sum: real;
begin
  case i of                 // i selects the layer, j the neuron within it
    1: begin                // input layer: read the j-th pixel of the image
         if Form1.PaintBox1.Canvas.Pixels[LX[j], LY[j]] = clRed then
           S1[j] := 1
         else
           S1[j] := 0;
       end;
    2: begin                // hidden layer: weighted sum of the inputs
         Sum := 0.0;
         for k := 1 to NR do
           Sum := Sum + S1[k] * W1[k, j];
         if Sum > 0 then
           S2[j] := Sum / (abs(Sum) + Net.a)   // rational sigmoid
         else
           S2[j] := 0;
       end;
    3: begin                // output layer: weighted sum of the hidden states
         Sum := 0.0;
         for k := 1 to NH do
           Sum := Sum + S2[k] * W2[k, j];
         if Sum > 0 then
           S3[j] := Sum / (abs(Sum) + Net.a)   // rational sigmoid
         else
           S3[j] := 0;
       end;
  end;
end;
Fragment of the learning procedure
for i := 1 to NR do
  for j := 1 to NH do
  begin
    S := 0;
    for k := 1 to NOut do
    begin
      // derivative of the output activation (set to 1 outside (0,1))
      if (S3[k] > 0) and (S3[k] < 1) then
        D := S3[k] * (1 - S3[k])
      else
        D := 1;
      // update of the hidden-to-output weight, formulas (2) and (1)
      W2n[j, k] := W2[j, k] + Step * S2[j] * (G3[k] - S3[k]) * D;
      // accumulate the error propagated back to hidden neuron j
      S := S + D * (G3[k] - S3[k]) * W2[j, k];
    end;
    // derivative of the hidden activation
    if (S2[j] > 0) and (S2[j] < 1) then
      D := S2[j] * (1 - S2[j])
    else
      D := 1;
    S := S * D;
    // update of the input-to-hidden weight, formulas (3) and (1)
    W1n[i, j] := W1[i, j] + Step * S * S1[i];
  end;
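A hypothetical sketch of how the full Learn procedure might wrap the fragment above: the loop structure and the copy-back of W1n/W2n into W1/W2 are assumptions, only the field and method names come from the class declaration.

procedure TNN.Learn;
var
  it, i, j, k: integer;
begin
  for it := 1 to NL do                   // NL training iterations
  begin
    for j := 1 to NR do Neuron(1, j);    // forward pass, input layer
    for j := 1 to NH do Neuron(2, j);    // forward pass, hidden layer
    for j := 1 to NOut do Neuron(3, j);  // forward pass, output layer
    { ... backward pass: the weight-update fragment shown above ... }
    for i := 1 to NR do                  // make the new weights current
      for j := 1 to NH do
        W1[i, j] := W1n[i, j];
    for j := 1 to NH do
      for k := 1 to NOut do
        W2[j, k] := W2n[j, k];
  end;
end;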
Generalization
Some of the test data are now misclassified. The problem is that the
network, with two hidden units, now has too much freedom and has fitted
a decision surface to the training data which follows its intricacies in
pattern space without extracting the underlying trends.
Overfitting
Local minima
Two tasks solved by MLP
• Classification (recognition)
– Usually binary outputs
• Regression (approximation)
– Analog outputs
Theorem of Kolmogorov
“Any continuous function from input to output can be implemented in a three-layer net, given a sufficient number of hidden units nH, proper nonlinearities, and weights.”
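For reference, the informal statement above is the neural-network reading of Kolmogorov's superposition theorem (1957), one common form of which is

$$f(x_1,\dots,x_n) = \sum_{j=1}^{2n+1} \Phi_j\!\left(\sum_{i=1}^{n} \psi_{ij}(x_i)\right)$$

for suitable continuous one-variable functions $\Phi_j$ and $\psi_{ij}$; the symbols here are standard notation, not from the slides.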
Advantages and disadvantages of MLP with back propagation
• Advantages:
– A guarantee that tasks of this kind can, in principle, be solved
• Disadvantages:
– Low speed of learning
– Possibility of overfitting
– Impossibility of relearning
– The structure needed for solving a concrete task is not known in advance
Increasing the speed of learning
• Preliminary processing of the features before they are fed to the perceptron inputs
• A dynamic learning step (large at the beginning, then gradually decreased; see the sketch after this list)
• Use of the second derivative in the weight-modification formulas
• Use of a hardware implementation
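The dynamic learning step can be realized, for example, by a simple decay inside the training loop. The fragment below is only a sketch written against the Step and NL fields of the TNN class; the initial value 0.5 and the decay factor 0.99 are illustrative assumptions, not from the lecture.

Step := 0.5;              // large learning step at the beginning of training
for it := 1 to NL do
begin
  { ... one training pass using the current Step ... }
  Step := Step * 0.99;    // then gradually decrease the step
end;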
Fighting overfitting
• Do not select too small a target error or too large a number of iterations for learning
Choice of structure
• Use of constructive learning algorithms
– Deleting nodes (neurons) and the links connected to them
– Adding new neurons when needed
• Use of genetic algorithms to select a suboptimal structure
Impossibility of relearning
• Use of constructive learning algorithms
– Deleting nodes (neurons) and the links connected to them
– Adding new neurons when needed