Neural Networks
Marcel Jiřina
Institute of Computer Science, Prague
Introduction

Neural networks and their use for classification and other tasks

ICS AS CR
- Theoretical computer science
- Neural networks, genetic algorithms and nonlinear methods
- Numerical algorithms (systems of ~1 million equations)
- Fuzzy sets, approximate reasoning, possibility theory
- Applications: nuclear science, ecology, meteorology, reliability in machinery, medical informatics, …
Structure of talk
- NN classification
- Some theory
- Interesting paradigms
- NN and statistics
- NN and optimization and genetic algorithms
- About application of NN
- Conclusions
NN classification

[Classification chart, flattened in the transcript; recoverable structure:]

Approximators (signals: continuous, real-valued)
- With teacher, general approximators and predictors: MLP-BP, RBF, GMDH, NNSU, Marks, Klán
- Without teacher: Kohonen, Carpenter-Grossberg (SOM)

Associative memories (signals: binary, multi-valued, possibly continuous)
- Autoassociative / heteroassociative: Hopfield

Classifiers
- With teacher: Perceptron(*), Hamming
- Without teacher: Kohonen (NE)

NE - not existing: the associated response can be arbitrary and then must be given by the teacher.
Further distinctions: feed-forward vs. recurrent; fixed structure vs. growing.
Some theory
- Kolmogorov theorem
- Kůrková theorem
- Sigmoid transfer function
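For reference, a standard form of the logistic sigmoid and of the single-hidden-layer approximator to which Kolmogorov/Kůrková-style results apply (assumed standard notation, not copied from the slide):

```latex
\sigma(x) = \frac{1}{1 + e^{-x}} \in (0,1),
\qquad
f(\mathbf{x}) \approx \sum_{i=1}^{h} v_i \,\sigma\!\left(\mathbf{w}_i^{\top}\mathbf{x} + b_i\right)
```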
MLP - BP
- Three layers = a single hidden layer
- MLP with 4 layers = 2 hidden layers
- Other paradigms have their own, different theory
Interesting paradigms
Paradigm: a general notion of the structure, functions and algorithms of a NN
- MLP-BP
- RBF
- GMDH
- NNSU
All of them are approximators.
Approximator + thresholding = classifier (see the sketch below)
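A minimal sketch of the last point, with illustrative names only: a real-valued approximator turned into a two-class classifier by thresholding its output.

```python
# Illustrative only: threshold the output of any real-valued approximator.
def classify(approximator, x, threshold=0.5):
    """Return class 1 if the approximated value reaches the threshold, else 0."""
    return 1 if approximator(x) >= threshold else 0

# Example with a trivial "approximator":
f = lambda x: 0.3 * x + 0.1
print(classify(f, 2.0))   # -> 1
print(classify(f, 0.5))   # -> 0
```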
MLP - BP
- MLP with error Back Propagation
- Learning-rate and momentum coefficients in (0, 1)
  - Levenberg-Marquardt
  - Optimization tools
- MLP with a jump (step) transfer function
  - Optimization
- Feed-forward (in recall)
- Matlab, NeuralWorks, …
- Good when the default is sufficient or when the network is well tuned: layers, neurons, learning rate, momentum (a weight-update sketch follows)
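A minimal sketch, assuming the usual BP update with learning rate eta and momentum alpha, both in (0, 1); the names and values are illustrative, not from the slide.

```python
# Illustrative BP weight update with learning rate (eta) and momentum (alpha).
import numpy as np

def bp_update(w, grad, prev_delta, eta=0.1, alpha=0.9):
    """One step: delta = -eta*grad + alpha*prev_delta; returns (new_w, delta)."""
    delta = -eta * grad + alpha * prev_delta
    return w + delta, delta

w = np.array([0.5, -0.2])
grad = np.array([0.1, 0.05])        # gradient of the error w.r.t. the weights
prev = np.zeros_like(w)
w, prev = bp_update(w, grad, prev)
print(w)                            # -> [ 0.49  -0.205]
```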
RBF
- Structure the same as in MLP
- Bell-shaped (Gaussian) transfer function (see the formula below)
- Number and positions of the centers: random, or from cluster analysis
- "Broadness" of the bell; size of the individual bells
- Learning methods
- Theory similar to MLP
- Matlab, NeuralWorks, …
- Good when the default is sufficient or when the network is well tuned: layers (mostly one hidden), number of neurons, transfer function, proper cluster analysis (fixed or variable number of clusters, near/far metric or criteria)
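The Gaussian transfer function of an RBF unit with center c and width sigma, and the resulting network output, in the standard form (assumed, not copied from the slide):

```latex
\varphi_j(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_j \rVert^{2}}{2\sigma_j^{2}}\right),
\qquad
y(\mathbf{x}) = \sum_{j=1}^{N} w_j \,\varphi_j(\mathbf{x})
```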
GMDH 1 (…5)
Group Method of Data Handling
- Group: initially only a pair of signals
- A "per partes", i.e. successive, polynomial approximator
- Growing network
- "Parameterless" (parameter-barren)
  - Only the number of new neurons in each layer (processing time)
  - (output limits, stopping-rule parameters)
- Against overtraining, the learning set is split into
  - an adjusting set
  - an evaluation set
GMDH 2-5: neuron, growing network, learning strategy, variants
GMDH 2 – neuron
- Only two inputs x1, x2
  - True inputs, or
  - Outputs from neurons of the preceding layer
- Full second-order polynomial
  y = a*x1^2 + b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f
  where y is the neuron's output
- n inputs => n(n-1)/2 neurons in the first layer
- The number of neurons grows exponentially
- The order of the resulting polynomial grows exponentially: 2, 4, 8, 16, 32, …
- Ivakhnenko polynomials: some elements are missing
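A tiny illustration of one GMDH neuron (the helper names are mine, not from the slide): evaluating the full second-order polynomial above for given coefficients.

```python
# Illustrative GMDH neuron: full second-order polynomial of two inputs.
def gmdh_features(x1, x2):
    """Feature row matching y = a*x1^2 + b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f."""
    return [x1 * x1, x1 * x2, x2 * x2, x1, x2, 1.0]

def gmdh_neuron(coeffs, x1, x2):
    """coeffs = (a, b, c, d, e, f); returns the neuron's output y."""
    return sum(c * t for c, t in zip(coeffs, gmdh_features(x1, x2)))

print(gmdh_neuron((1, 0, 1, 0, 0, 0), 3.0, 4.0))   # -> 25.0 (x1^2 + x2^2)
```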
GMDH 3 – learning a neuron
- Matrix of data: inputs and the desired value
  u1, u2, u3, …, un, y   … sample 1
  u1, u2, u3, …, un, y   … sample 2
  …
  u1, u2, u3, …, un, y   … sample m
- A pair of the u's are the neuron's inputs x1, x2
- m approximating equations, one for each sample:
  a*x1^2 + b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f = y
- In matrix form X·θ = Y, where θ = (a, b, c, d, e, f)^T and each row of X is
  (x1^2, x1*x2, x2^2, x1, x2, 1)
- LMS solution: θ = (X^T X)^-1 X^T Y
- If X^T X is singular, the neuron is omitted
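A minimal sketch of fitting one GMDH neuron by least squares; numpy's lstsq computes the same LMS solution θ = (X^T X)^-1 X^T Y in a numerically safer way. The names and the demo target are illustrative.

```python
# Illustrative least-squares fit of one GMDH neuron.
import numpy as np

def fit_gmdh_neuron(x1, x2, y):
    """x1, x2, y are 1-D arrays of m samples; returns theta = (a, b, c, d, e, f)."""
    X = np.column_stack([x1**2, x1 * x2, x2**2, x1, x2, np.ones_like(x1)])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)   # LMS solution
    return theta

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=50), rng.normal(size=50)
y = 2 * x1**2 - x1 * x2 + 0.5 * x2 + 1.0            # known target for the demo
print(np.round(fit_gmdh_neuron(x1, x2, y), 3))       # ~ [2, -1, 0, 0, 0.5, 1]
```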
GMDH 4 – growing network
[Diagram of the growing network; inputs x1, x2, …; y = desired output]
GMDH 5 – learning strategy
Problem: the number of neurons grows exponentially, N = n(n-1)/2 per layer
- Let the first layer of neurons grow unlimited
- In the next layers [the learning set is split into an adjusting set and an evaluating set] (sketched below):
  - Compute the parameters a, …, f using the adjusting set
  - Evaluate the error using the evaluating set and sort
  - Select some n best neurons and delete the others
  - Build the next layer, OR
  - Stop learning if the stopping condition is met
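A sketch of one layer-building step under the strategy above, reusing the hypothetical fit_gmdh_neuron / gmdh_neuron helpers sketched earlier; the ranking criterion (mean squared error) is an assumption.

```python
# Illustrative GMDH layer construction: fit candidates on the adjusting set,
# rank by error on the evaluating set, keep only the n_best neurons.
from itertools import combinations
import numpy as np

def build_layer(U_adj, y_adj, U_eval, y_eval, n_best=10):
    candidates = []
    for i, j in combinations(range(U_adj.shape[1]), 2):
        theta = fit_gmdh_neuron(U_adj[:, i], U_adj[:, j], y_adj)
        pred = np.array([gmdh_neuron(theta, a, b)
                         for a, b in zip(U_eval[:, i], U_eval[:, j])])
        err = np.mean((pred - y_eval) ** 2)          # error on the evaluating set
        candidates.append((err, i, j, theta))
    candidates.sort(key=lambda c: c[0])              # best (lowest error) first
    return candidates[:n_best]                       # delete the others
```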
GMDH 6 – learning strategy 2
"Select some n best neurons and delete the others" is the control parameter of the GMDH network.
[Plot: error vs. layer (layers 1-10)]
GMDH 7 – variants
- Basic: full quadratic polynomial (Ivakhnenko polynomial)
- Cubic, simplified fourth order, …
  - Reach a higher order in fewer layers and with fewer parameters
- Different stopping rules
- Different ratio of the sizes of the adjusting and evaluating sets
NNSU GA
Neural Network with Switching Units, learned with a Genetic Algorithm
- An approximator built from many local hyper-planes; today also from more general local hyper-surfaces
- Feed-forward network
- Originally derived from the MLP for optical implementation
- The structure looks like columns above the individual inputs
More … František
Learning and testing set
- Learning set
  - Adjusting (tuning) set
  - Evaluation set
- Testing set
- One data set: the splitting influences the results (see the sketch below)
- The fair-evaluation problem
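A minimal sketch of such a three-way split; the 50/25/25 ratios and the helper name are illustrative assumptions, not from the talk.

```python
# Illustrative split of one data set into adjusting, evaluation and testing parts.
import numpy as np

def split_data(X, y, adj=0.5, ev=0.25, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))                      # shuffle once
    n_adj, n_ev = int(adj * len(X)), int(ev * len(X))
    a, e, t = np.split(idx, [n_adj, n_adj + n_ev])     # remaining part = testing
    return (X[a], y[a]), (X[e], y[e]), (X[t], y[t])
```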
NN and statistics
- MLP-BP: mean squared error minimization
  - Sum of squared errors … the MSE criterion (see below)
  - Hamming distance for (pure) classifiers
- No other statistical criteria or tests are inside the NN:
  - The NN transforms data and generates a mapping
  - Statistical criteria or tests stay outside the NN (χ², K-S, Cramér-von Mises, …)
- Is a NN good for the K-S test? … is y = sin(x) good for the χ² test?
- Bayes classifiers, k-th nearest neighbor, kernel methods …
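For reference, the MSE criterion minimized by MLP-BP over m training samples, in the standard form:

```latex
\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^{2}
```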
NN and optimization and genetic algorithms
- Learning is an optimization procedure
  - Specific to the given NN
  - General optimization systems or methods
    - For the whole NN
    - For parts: GMDH and NNSU use linear regression
  - Genetic algorithm (a toy sketch follows)
    - Not only the parameters, but the structure too
    - May be faster than iterative learning
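A toy sketch of the genetic-algorithm idea applied to a parameter vector (selection plus Gaussian mutation); the fitness function, population size, and the fact that only parameters rather than structure are evolved here are illustrative simplifications.

```python
# Toy genetic algorithm: keep the fitter half, mutate it to refill the population.
import numpy as np

def ga_optimize(fitness, dim, pop_size=30, generations=50, sigma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))                    # random initial population
    for _ in range(generations):
        scores = np.array([fitness(p) for p in pop])
        parents = pop[np.argsort(scores)[-(pop_size // 2):]]  # select the best half
        children = parents + rng.normal(scale=sigma, size=parents.shape)  # mutation
        pop = np.vstack([parents, children])
    scores = np.array([fitness(p) for p in pop])
    return pop[int(np.argmax(scores))]

target = np.array([1.0, -2.0, 0.5])
best = ga_optimize(lambda p: -np.sum((p - target) ** 2), dim=3)
print(np.round(best, 2))                                      # close to [ 1. -2.  0.5]
```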
About application of NN
- Soft problems
  - Nonlinear
  - A lot of noise
  - Problematic variables
  - Mutual dependence of variables
- Application areas
  - Economy
  - Pattern recognition
  - Robotics
  - Particle physics
  - …
Strategy when using NN
- For "soft problems" only
- NOT for
  - Exact function generation
  - Periodic signals etc.
- First subtract all "systematics" (sketched below)
  - Nearly pure noise remains
  - Approximate this near-noise
  - Add back all the systematics
- Understand your paradigm
  - Tune it patiently, or
  - Use a "parameterless" paradigm
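A compact sketch of the subtract-approximate-add-back recipe; `systematic` and `nn_fit` are hypothetical placeholders for the known systematic model and for whatever NN training routine is used.

```python
# Illustrative recipe: model the residual ("near-noise") with a NN,
# keep the known systematic part outside the network.
def fit_residual_model(x, y, systematic, nn_fit):
    residual = y - systematic(x)            # subtract all "systematics"
    nn = nn_fit(x, residual)                # NN approximates the near-noise part
    return lambda x_new: systematic(x_new) + nn(x_new)   # add the systematics back
```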
Conclusions
- A powerful tool
  - Good when well used
  - Simple paradigm, complex behavior
- A special tool
  - Approximator
  - Classifier
- A universal tool
  - Very different problems
  - Soft problems