Research on Advanced Training Algorithms of
Neural Networks
Hao Yu
Ph.D. Defense
Aug 17th, 2011
Supervisor: Bogdan Wilamowski
Committee Members: Hulya Kirkici, Vishwani D. Agrawal, Vitaly Vodyanoy
University Reader: Weikuan Yu
Outline
• Why Neural Networks
• Network Architectures
• Training Algorithms
• How to Design Neural Networks
• Problems in Second Order Algorithms
• Proposed Second Order Computation
• Proposed Forward-Only Algorithm
• Neural Network Trainer
• Conclusion & Recent Research
What is a Neural Network
• Classification: separate the two groups (red circles and blue stars) of twisted points [1].
What is a Neural Network
• Interpolation: with the 25 given points (red), find the values of points A and B (black).
What is a Neural Network
• Human solutions vs. neural network solutions (figure).
What is a Neural Network
• Recognition: recover the noisy digit images (left) into the original images (right).
  (Figure: noisy images vs. original images)
What is a Neural Network
• "Learn to behave": learning process → "behave" (figure).
• Build any relationship between inputs and outputs [2].
Why Neural Networks
• What makes neural networks different
  (Figure: given patterns 5×5 = 25; testing patterns 41×41 = 1,681)
Different Approximators
• Test results of different approximators
  (Figure panels: Mamdani fuzzy, TSK fuzzy, neuro-fuzzy, SVM-RBF, SVM-Poly, nearest, linear, spline, cubic, neural network; reference surface from the MATLAB function interp2)
Comparison
• Neural networks potentially behave as the best approximators

Methods of Computational Intelligence            Sum Square Errors
Fuzzy inference system – Mamdani                 319.7334
Fuzzy inference system – TSK                     35.1627
Neuro-fuzzy system                               27.3356
Support vector machine – RBF kernel              28.9595
Support vector machine – polynomial kernel       176.1520
Interpolation – nearest                          197.7494
Interpolation – linear                           28.6683
Interpolation – spline                           11.0874
Interpolation – cubic                            3.2791
Neural network – 4 neurons in FCC network        2.3628
Neural network – 5 neurons in FCC network        0.4648
A Single Neuron
• Two basic computations:
  (1) $net = \sum_{i=1}^{n} x_i w_i + w_0$
  (2) $y = f(net)$
  (Figure: a neuron with inputs $x_1 \ldots x_7$, weights $w_1 \ldots w_7$, bias weight $w_0$ on a +1 input, and activation $f$)
• Example activation functions:
  $y = f(x) = \tanh(gain \cdot x)$
  $y = f(x) = gain \cdot x$, with $gain = 1$
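As a concrete illustration of these two computations, a minimal NumPy sketch of one neuron (the weights, gain, and input values here are made up):

import numpy as np

def neuron_output(x, w, w0, gain=1.0):
    """Single neuron: weighted sum plus bias, then tanh activation."""
    net = np.dot(w, x) + w0          # (1) net = sum(x_i * w_i) + w_0
    return np.tanh(gain * net)       # (2) y = f(net), here f = tanh with gain

# example: 7 inputs, arbitrary weights and bias
x = np.ones(7)
w = np.full(7, 0.1)
print(neuron_output(x, w, w0=-0.5))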
Network Architectures
• The multilayer perceptron (MLP) network is the most popular architecture
• Networks with connections across layers, such as bridged multilayer perceptron (BMLP) networks and fully connected cascade (FCC) networks, are much more powerful than MLP networks
  • Wilamowski, B. M., Hunter, D., Malinowski, A., "Solving parity-N problems with feedforward neural networks," Proc. 2003 IEEE IJCNN, pp. 2546-2551, IEEE Press, 2003.
  • M. E. Hohil, D. Liu, and S. H. Smith, "Solving the N-bit parity problem using neural networks," Neural Networks, vol. 12, pp. 1321-1323, 1999.
• Example: smallest networks for solving the parity-7 problem (analytical results)
  (Figure: MLP network, BMLP network, FCC network)
Error Back Propagation Algorithm
• The most popular algorithm for neural network training
• Update rule of the EBP algorithm [3]:
  $\Delta w_k = -\alpha\, g_k$, where $g = \left[\dfrac{\partial E}{\partial w_1}, \dfrac{\partial E}{\partial w_2}, \dfrac{\partial E}{\partial w_3}\right]^T$ and $w = [w_1, w_2, w_3]$
• Developed based on gradient optimization
• Advantages:
  – Easy
  – Stable
• Disadvantages:
  – Very limited power
  – Slow convergence
Improvement of EBP
• Improved gradient using momentum [4]:
  $\Delta w_k = (1 - \alpha)\, g_k + \alpha\, \Delta w_{k-1}$, with $0 \le \alpha \le 1$
  (Figure: the update $\Delta w_k$ combines the scaled current gradient with the previous update $\Delta w_{k-1}$)
• Adjusted learning constant [5-6]
  (Figure: weight trajectories A and B between $w_{k-1}$ and $w_k$)
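A small sketch of a momentum-style update consistent with the reconstruction above (the separate learning rate lr and the constants are illustrative assumptions, not values from the thesis):

import numpy as np

def ebp_momentum_step(w, grad, prev_dw, alpha=0.5, lr=0.1):
    """EBP update blended with the previous weight change (momentum)."""
    dw = (1.0 - alpha) * (-lr * grad) + alpha * prev_dw
    return w + dw, dw

w, dw = np.array([1.0, -2.0]), np.zeros(2)
for _ in range(200):
    w, dw = ebp_momentum_step(w, 2.0 * w, dw)   # again E(w) = ||w||^2
print(w)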
Newton Algorithm
• Newton algorithm: use the derivative of the gradient to evaluate the change of the gradient, then select a proper learning constant in each direction [7]:
  $\Delta w_k = -H_k^{-1} g_k$, where $g_i = \dfrac{\partial E}{\partial w_i}$ and $E = \dfrac{1}{2}\sum_{p=1}^{P}\sum_{m=1}^{M} e_{pm}^2$
  $H = \begin{bmatrix}
  \frac{\partial^2 E}{\partial w_1^2} & \frac{\partial^2 E}{\partial w_1 \partial w_2} & \cdots & \frac{\partial^2 E}{\partial w_1 \partial w_N} \\
  \frac{\partial^2 E}{\partial w_2 \partial w_1} & \frac{\partial^2 E}{\partial w_2^2} & \cdots & \frac{\partial^2 E}{\partial w_2 \partial w_N} \\
  \vdots & \vdots & \ddots & \vdots \\
  \frac{\partial^2 E}{\partial w_N \partial w_1} & \frac{\partial^2 E}{\partial w_N \partial w_2} & \cdots & \frac{\partial^2 E}{\partial w_N^2}
  \end{bmatrix}$
• Advantages:
  – Fast convergence
• Disadvantages:
  – Not stable
  – Requires computation of second order derivatives
Gauss-Newton Algorithm
• Gauss-Newton algorithm: eliminate the second order derivatives of the Newton method by introducing the Jacobian matrix
  $\Delta w_k = -\left(J_k^T J_k\right)^{-1} J_k^T e_k$, with $g = J^T e$ and $H \approx J^T J$
  $J = \begin{bmatrix}
  \frac{\partial e_{1,1}}{\partial w_1} & \frac{\partial e_{1,1}}{\partial w_2} & \cdots & \frac{\partial e_{1,1}}{\partial w_N} \\
  \frac{\partial e_{1,2}}{\partial w_1} & \frac{\partial e_{1,2}}{\partial w_2} & \cdots & \frac{\partial e_{1,2}}{\partial w_N} \\
  \vdots & \vdots & & \vdots \\
  \frac{\partial e_{1,M}}{\partial w_1} & \frac{\partial e_{1,M}}{\partial w_2} & \cdots & \frac{\partial e_{1,M}}{\partial w_N} \\
  \vdots & \vdots & & \vdots \\
  \frac{\partial e_{P,1}}{\partial w_1} & \frac{\partial e_{P,1}}{\partial w_2} & \cdots & \frac{\partial e_{P,1}}{\partial w_N} \\
  \frac{\partial e_{P,2}}{\partial w_1} & \frac{\partial e_{P,2}}{\partial w_2} & \cdots & \frac{\partial e_{P,2}}{\partial w_N} \\
  \vdots & \vdots & & \vdots \\
  \frac{\partial e_{P,M}}{\partial w_1} & \frac{\partial e_{P,M}}{\partial w_2} & \cdots & \frac{\partial e_{P,M}}{\partial w_N}
  \end{bmatrix}$,
  $e = \left[ e_{1,1},\, e_{1,2},\, \ldots,\, e_{1,M},\, \ldots,\, e_{P,1},\, e_{P,2},\, \ldots,\, e_{P,M} \right]^T$
• Advantages:
  – Fast convergence
• Disadvantages:
  – Not stable
Levenberg Marquardt Algorithm
• LM algorithm: blends the EBP algorithm and the Gauss-Newton algorithm [8-9]
  $\Delta w_k = -\left(J_k^T J_k + \mu_k I\right)^{-1} J_k^T e_k$
  – When the evaluation error increases, μ increases and the LM algorithm switches toward the EBP algorithm
  – When the evaluation error decreases, μ decreases and the LM algorithm switches toward the Gauss-Newton method
• Advantages:
  – Fast convergence
  – Stable training
• Compared with first order algorithms, the LM algorithm has much more powerful search ability, but it also requires more complex computation
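The μ adjustment described above can be sketched as one LM iteration (illustrative only: eval_fn is an assumed user-supplied routine returning the error vector e and the Jacobian J for the current weights, and the scaling factor of 10 is a common but arbitrary choice):

import numpy as np

def lm_iteration(w, mu, eval_fn, mu_scale=10.0):
    """One Levenberg-Marquardt step: delta_w = -(J^T J + mu I)^-1 J^T e."""
    e, J = eval_fn(w)                        # error vector (P*M,) and Jacobian (P*M, N)
    sse = float(e @ e)
    H, g = J.T @ J, J.T @ e                  # quasi-Hessian and gradient
    while True:
        dw = -np.linalg.solve(H + mu * np.eye(len(w)), g)
        e_new, _ = eval_fn(w + dw)
        if float(e_new @ e_new) < sse:       # improvement: accept, act more like Gauss-Newton
            return w + dw, mu / mu_scale
        mu *= mu_scale                       # no improvement: act more like EBP
        if mu > 1e10:                        # give up on this iteration
            return w, mu

# toy usage: fit w to minimize e(w) = A w - b (Jacobian of e is simply A)
A, b = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]), np.array([1.0, 2.0, 3.0])
f = lambda w: (A @ w - b, A)
w, mu = np.zeros(2), 0.01
for _ in range(20):
    w, mu = lm_iteration(w, mu, f)
print(w)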
Comparison of Different Algorithms
• Training XOR patterns using different algorithms

XOR problem – EBP
                        α = 0.1       α = 10
success rate            100%          18%
average iteration       17,845.44     179.00
average time (ms)       3,413.26      46.83

XOR problem – EBP with momentum
                        α = 0.1, m = 0.5    α = 10, m = 0.5
success rate            100%                100%
average iteration       18,415.84           187.76
average time (ms)       4,687.79            39.27

XOR problem – EBP with adjusted learning constant
success rate            100%
average iteration       170.23
average time (ms)       41.19

XOR problem – Gauss-Newton algorithm
success rate            6%
average iteration       1.29
average time (ms)       2.29

XOR problem – LM algorithm
success rate            100%
average iteration       5.49
average time (ms)       4.35
How to Design Neural Networks
• Traditional design:
  – Most popular training algorithm: EBP algorithm
  – Most popular network architecture: MLP network
• Results:
  – Large neural networks
  – Poor generalization ability
  – Many engineers move to other methods, such as fuzzy systems
How to Design Neural Networks
• B. M. Wilamowski, "Neural Network Architectures and Learning Algorithms: How Not to Be Frustrated with Neural Networks," IEEE Ind. Electron. Mag., vol. 3, no. 4, pp. 56-63, 2009.
  – Over-fitting problem
  – Mismatch between the size of the training pattern set and the network size
  (Figure: approximation results with 2 to 9 neurons)
• Recommended design policy: compact networks benefit generalization ability
  – Powerful training algorithm: LM algorithm
  – Efficient network architectures: BMLP network and FCC network
Problems in Second Order Algorithms
• Matrix inversion: $\left(J^T J + \mu I\right)^{-1}$
  – In the nature of second order algorithms
  – The size of the matrix is proportional to the size of the network
  – As the network size increases, second order algorithms may not be as efficient as first order algorithms
Problems in Second Order Algorithms
• Architecture limitation
  • M. T. Hagan and M. Menhaj, "Training feedforward networks with the Marquardt algorithm," IEEE Trans. on Neural Networks, vol. 5, no. 6, pp. 989-993, 1994. (2,474 citations)
  – Only developed for training MLP networks
  – Not suitable for designing compact networks
• Neuron-by-Neuron (NBN) Algorithm
  • B. M. Wilamowski, N. J. Cotton, O. Kaynak and G. Dundar, "Computing Gradient Vector and Jacobian Matrix in Arbitrarily Connected Neural Networks," IEEE Trans. on Industrial Electronics, vol. 55, no. 10, pp. 3784-3790, Oct. 2008.
  – SPICE computation routines
  – Capable of training arbitrarily connected neural networks
  – Compact neural network design: NBN algorithm + BMLP (FCC) networks
  – Very complex computation
Problems in Second Order Algorithms
• Memory limitation: $\left(J^T J + \mu I\right)^{-1}$
  – The size of the Jacobian matrix J is (P × M) × N
  – P is the number of training patterns
  – M is the number of outputs
  – N is the number of weights
• In practice, the number of training patterns is huge and is encouraged to be as large as possible
• MNIST handwritten digit database [10]: 60,000 training patterns, 784 inputs and 10 outputs. Using the simplest network architecture (1 neuron per output), the required memory would be nearly 35 GB
• This exceeds the memory limits of most Windows compilers
  $J = \begin{bmatrix}
  \frac{\partial e_{11}}{\partial w_1} & \frac{\partial e_{11}}{\partial w_2} & \cdots & \frac{\partial e_{11}}{\partial w_N} \\
  \vdots & \vdots & & \vdots \\
  \frac{\partial e_{PM}}{\partial w_1} & \frac{\partial e_{PM}}{\partial w_2} & \cdots & \frac{\partial e_{PM}}{\partial w_N}
  \end{bmatrix}$
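For scale, the 35 GB estimate follows directly from the Jacobian dimensions, assuming 8-byte double-precision elements (N counts ten output neurons with 784 inputs plus one bias weight each):

$(60{,}000 \times 10) \times \underbrace{10 \times (784 + 1)}_{N = 7{,}850} \times 8\ \text{bytes} \approx 3.8 \times 10^{10}\ \text{bytes} \approx 35\ \text{GB}$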
Problems in Second Order Algorithms
• Computational duplication
  – Forward computation: calculate errors
  – Backward computation: error backpropagation
  – Very complex
  – Inefficient for networks with multiple outputs
• In second order algorithms, for both the Hagan and Menhaj LM algorithm and the NBN algorithm, the error backpropagation process has to be repeated for each output
  (Figure: forward computation from inputs to outputs, followed by backward computation)
Proposed Second Order Computation – Basic Theory
• Matrix algebra [11]: $J^T J$ can be computed either row-column or column-row

Memory comparison
Multiplication method    Elements for storage
Row-column               (P × M) × N + N × N + N
Column-row               N × N + N
Difference               (P × M) × N

(Figure: row-column multiplication forms H = JᵀJ from the whole stored J; column-row multiplication accumulates N × N partial products q from one column of Jᵀ and one row of J at a time)

Computation comparison
Multiplication method    Additions              Multiplications
Row-column               (P × M) × N × N        (P × M) × N × N
Column-row               N × N × (P × M)        N × N × (P × M)

• In neural network training, consider that
  – Each pattern is related to one row of the Jacobian matrix
  – Patterns are independent of each other
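A small NumPy check of the column-row idea: accumulating outer products of individual Jacobian rows reproduces JᵀJ (and Jᵀe) without ever holding more than one row (the dimensions here are illustrative):

import numpy as np

rng = np.random.default_rng(0)
P_M, N = 12, 4                       # (patterns x outputs) rows, N weights
J = rng.normal(size=(P_M, N))
e = rng.normal(size=P_M)

Q = np.zeros((N, N))
g = np.zeros(N)
for row, err in zip(J, e):           # one row j_pm and one error e_pm at a time
    Q += np.outer(row, row)          # q_pm = j_pm^T j_pm
    g += row * err                   # eta_pm = j_pm^T e_pm

assert np.allclose(Q, J.T @ J) and np.allclose(g, J.T @ e)
print("column-row accumulation matches J^T J and J^T e")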
Proposed Second Order Computation – Derivation
• Hagan and Menhaj LM algorithm (or NBN algorithm):
  $\Delta w = -\left(J^T J + \mu I\right)^{-1} J^T e$
  where J is the (P × M) × N Jacobian matrix and $e = [e_{11}, e_{12}, \ldots, e_{1M}, \ldots, e_{P1}, e_{P2}, \ldots, e_{PM}]^T$
• Improved computation:
  $\Delta w = -\left(Q + \mu I\right)^{-1} g$
  with, for each pattern p and output m,
  $j_{pm} = \left[\dfrac{\partial e_{pm}}{\partial w_1},\, \dfrac{\partial e_{pm}}{\partial w_2},\, \ldots,\, \dfrac{\partial e_{pm}}{\partial w_N}\right]$
  $q_{pm} = j_{pm}^T\, j_{pm}, \qquad Q = \sum_{p=1}^{P} \sum_{m=1}^{M} q_{pm}$
  $\eta_{pm} = j_{pm}^T\, e_{pm}, \qquad g = \sum_{p=1}^{P} \sum_{m=1}^{M} \eta_{pm}$
Proposed Second Order Computation – Pseudo Code
• Properties:
  – No need for Jacobian matrix storage
  – Vector operations instead of matrix operations
• Main contributions:
  – Significant memory reduction
  – Memory reduction benefits computation speed
  – NO tradeoff!
• The memory limitation caused by Jacobian matrix storage in second order algorithms is solved
• Again, considering the MNIST problem, the memory cost for storing Jacobian elements is reduced from more than 35 gigabytes to nearly 30.7 kilobytes

Pseudo code:
% Initialization
Q = 0;
g = 0;
% Improved computation
for p = 1:P            % number of patterns
    % Forward computation
    …
    for m = 1:M        % number of outputs
        % Backward computation
        …
        calculate vector j_pm;
        calculate sub-matrix q_pm;
        calculate sub-vector eta_pm;
        Q = Q + q_pm;
        g = g + eta_pm;
    end;
end;
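The pseudo code maps directly onto a per-pattern accumulation routine. The sketch below is illustrative only: error(w, p, m) and jacobian_row(w, p, m) stand for the forward and backward routines elided (…) above and are assumed to be supplied by the network implementation:

import numpy as np

def improved_lm_matrices(w, P, M, error, jacobian_row):
    """Accumulate the quasi-Hessian Q and gradient g pattern by pattern,
    without ever storing the full (P*M) x N Jacobian."""
    N = len(w)
    Q, g = np.zeros((N, N)), np.zeros(N)
    for p in range(P):
        for m in range(M):
            j_pm = jacobian_row(w, p, m)    # one row of J, length N
            e_pm = error(w, p, m)           # scalar error for pattern p, output m
            Q += np.outer(j_pm, j_pm)       # q_pm = j_pm^T j_pm
            g += j_pm * e_pm                # eta_pm = j_pm^T e_pm
    return Q, g

# the LM update then uses: dw = -np.linalg.solve(Q + mu * np.eye(N), g)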
Proposed Second Order Computation – Experimental Results
• Memory comparison

Parity-N problems            N = 14          N = 16
Patterns                     16,384          65,536
Structures                   15 neurons      17 neurons
Jacobian matrix sizes        5,406,720       27,852,800
Weight vector sizes          330             425
Average iteration            99.2            166.4
Success rate                 13%             9%
Actual memory cost:
  Traditional LM             79.21 MB        385.22 MB
  Improved LM                3.41 MB         4.30 MB
• Time comparison

Parity-N problems            N = 9      N = 11     N = 13       N = 15
Patterns                     512        2,048      8,192        32,768
Neurons                      10         12         14           16
Weights                      145        210        287          376
Average iterations           38.51      59.02      68.08        126.08
Success rate                 58%        37%        24%          12%
Averaged training time (s):
  Traditional LM             0.78       68.01      1,508.46     43,417.06
  Improved LM                0.33       22.09      173.79       2,797.93
Traditional Computation – Forward Computation
• For each training pattern p:
  • Calculate net for neuron j:   $net_j = \sum_{i=1}^{n_i} w_{j,i}\, y_{j,i} + w_{j,0}$
  • Calculate the output of neuron j:   $y_j = f_j(net_j)$
  • Calculate the slope of neuron j:   $s_j = \dfrac{\partial y_j}{\partial net_j} = \dfrac{\partial f_j(net_j)}{\partial net_j}$
  • Calculate the output at output m:   $o_m = F_{m,j}(y_j)$
  • Calculate the error at output m:   $e_{pm} = o_{pm} - d_{pm}$
  (Figure: neuron j with inputs $y_{j,1} \ldots y_{j,n_i}$, weights $w_{j,1} \ldots w_{j,n_i}$ and bias weight $w_{j,0}$; $F_{m,j}$ denotes the nonlinear relationship between the output of neuron j and network output m)
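A sketch of this per-neuron forward pass for a network stored as a list of neurons (tanh activations and the data layout are assumptions for illustration, not the thesis implementation):

import numpy as np

def forward(pattern, neurons, output_idx, desired):
    """Forward computation: net, output y, and slope s for every neuron,
    then errors at the network outputs."""
    signals = list(pattern)                       # node signals: inputs first, then neuron outputs
    slopes = []
    for inputs, weights, bias in neurons:         # each neuron: (input node indices, weights, bias)
        net = np.dot(weights, [signals[i] for i in inputs]) + bias
        y = np.tanh(net)
        signals.append(y)
        slopes.append(1.0 - y * y)                # s_j = d tanh(net) / d net
    outputs = [signals[i] for i in output_idx]
    errors = [o - d for o, d in zip(outputs, desired)]   # e_pm = o_pm - d_pm
    return signals, slopes, errors

# example: 2 inputs, one hidden neuron (node 2), one output neuron (node 3)
neurons = [((0, 1), np.array([0.5, -0.3]), 0.1),
           ((0, 1, 2), np.array([0.2, 0.4, 0.7]), -0.2)]
print(forward([1.0, 0.0], neurons, output_idx=[3], desired=[1.0]))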
Traditional Computation – Backward Computation
• For first order algorithms:
  • Calculate delta [12]:   $\delta_j = s_j \sum_{m=1}^{n_o} F'_{m,j}\, e_m$
  • Calculate the gradient element:   $g_{j,i} = \dfrac{\partial E}{\partial w_{j,i}} = y_{j,i}\, \delta_j$
• For second order algorithms:
  • Calculate delta:   $\delta_{m,j} = s_j\, F'_{m,j}$
  • Calculate the Jacobian element:   $\dfrac{\partial e_{p,m}}{\partial w_{j,i}} = y_{j,i}\, \delta_{m,j}$
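For intuition, once the node signals and deltas are known, the Jacobian elements of one neuron j for one output m are simple products, as in this sketch (the array alignment with the weights $w_{j,i}$ is an assumption; the bias weight sees a +1 input):

import numpy as np

def jacobian_elements(node_inputs, delta_mj):
    """Jacobian elements for one neuron j and one output m:
    d e_pm / d w_ji = y_ji * delta_mj  (bias weight first, with input +1)."""
    y = np.append(1.0, node_inputs)        # [+1, y_j1, y_j2, ...]
    return y * delta_mj

print(jacobian_elements(np.array([0.3, -0.8]), delta_mj=0.25))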
Proposed Forward-Only Algorithm
• Extend the concept of the backpropagation factor δ
  – Original definition: backpropagated from output m to neuron j:   $\delta_{m,j} = s_j\, F'_{m,j}$
  – Our definition: backpropagated from neuron k to neuron j:   $\delta_{k,j} = s_j\, F'_{k,j}$
  (Figure: signals propagate from the network inputs through neurons j and k, with slopes $s_j$ and $s_k$, to the network outputs $o_1 \ldots o_m$)
Proposed Forward-Only Algorithm
• Regular table
  – Lower triangular elements (k ≥ j): the δ matrix has a triangular shape
  – Diagonal elements: $\delta_{k,k} = s_k$
  – Upper triangular elements: the weight connections $w_{j,k}$ between neurons

Neuron index   1           2           …   j           …   k           …   nn
1              δ_{1,1}     w_{1,2}     …   w_{1,j}     …   w_{1,k}     …   w_{1,nn}
2              δ_{2,1}     δ_{2,2}     …   w_{2,j}     …   w_{2,k}     …   w_{2,nn}
⋮
j              δ_{j,1}     δ_{j,2}     …   δ_{j,j}     …   w_{j,k}     …   w_{j,nn}
⋮
k              δ_{k,1}     δ_{k,2}     …   δ_{k,j}     …   δ_{k,k}     …   w_{k,nn}
⋮
nn             δ_{nn,1}    δ_{nn,2}    …   δ_{nn,j}    …   δ_{nn,k}    …   δ_{nn,nn}

• General formula:
  $\delta_{k,j} = \delta_{k,k} \sum_{i=j}^{k-1} w_{i,k}\, \delta_{i,j}$ for $k > j$;   $\delta_{k,k} = s_k$;   $\delta_{k,j} = 0$ for $k < j$
Proposed Forward-Only Algorithm
• Train arbitrarily connected neural networks
  (Figure: three example six-neuron topologies, an MLP-like network, an arbitrarily connected network, and a fully connected cascade, each with its δ/weight table; diagonal elements are the slopes $s_k$, upper triangular elements are the weights $w_{j,k}$, lower triangular elements are the $\delta_{k,j}$ values, and missing connections appear as zeros)
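A sketch of filling the δ table with the general formula above for a small fully connected cascade (the slopes and inter-neuron weights are made up; w[i][k] denotes the weight from neuron i to neuron k):

import numpy as np

def delta_table(slopes, w):
    """Fill the triangular delta table:
    delta[k][k] = s_k,  delta[k][j] = s_k * sum_{i=j..k-1} w[i][k] * delta[i][j]."""
    nn = len(slopes)
    delta = np.zeros((nn, nn))
    for k in range(nn):
        delta[k, k] = slopes[k]
        for j in range(k):                       # k > j
            acc = sum(w[i][k] * delta[i, j] for i in range(j, k))
            delta[k, j] = slopes[k] * acc
    return delta

slopes = [0.9, 0.8, 0.7]                          # s_k for 3 cascaded neurons (made up)
w = {0: {1: 0.5, 2: -0.2}, 1: {2: 0.4}}           # inter-neuron weights w[i][k] (made up)
print(delta_table(slopes, w))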
Proposed Forward-Only Algorithm
• Train networks with multiple outputs
  (Figure: networks with 1, 2, 3, and 4 outputs)
• The more outputs the network has, the more efficient the forward-only algorithm becomes
Proposed Forward-Only Algorithm
• Pseudo codes of the two different algorithms

Traditional forward-backward algorithm:
for all patterns
    % Forward computation
    for all neurons (nn)
        for all weights of the neuron (nx)
            calculate net;
        end;
        calculate neuron output;
        calculate neuron slope;
    end;
    for all outputs (no)
        calculate error;
        % Backward computation
        initial delta as slope;
        for all neurons starting from output neurons (nn)
            for the weights connected to other neurons (ny)
                multiply delta through weights;
                sum the backpropagated delta at proper nodes;
            end;
            multiply delta by slope (for hidden neurons);
        end;
    end;
end;

Forward-only algorithm:
for all patterns (np)
    % Forward computation
    for all neurons (nn)
        for all weights of the neuron (nx)
            calculate net;
        end;
        calculate neuron output;
        calculate neuron slope;
        set current slope as delta;
        for weights connected to previous neurons (ny)
            for previous neurons (nz)
                multiply delta through weights then sum;
            end;
            multiply the sum by the slope;
        end;
        related Jacobian elements computation;
    end;
    for all outputs (no)
        calculate error;
    end;
end;

• In the forward-only computation, the backward computation of the traditional algorithm is replaced by extra computation in the forward process
Proposed Forward-Only Algorithm
+/–
×/÷
Exp
+/–
×/÷
Exp
+/–
×/÷
exp
•
Computation cost estimation
Hagan and Menhaj Computation
Forward Part
Backward Part
nn×nx + 3nn + no
no×nn×ny
nn×nx + 4nn
no×nn×ny + no×(nn – no)
nn
0
Forward-only computation
Forward
Backward
nn×nx + 3nn + no + nn×ny×nz
0
nn×nx + 4nn + nn×ny + nn×ny×nz
0
nn
0
Subtraction forward-only from traditional
nn×ny×(no – 1)
nn×ny×(no – 1) + no×(nn – no) – nn×ny×nz
0
0.9
0.8
0.7
0.6
0.5
0.4
0
20
40
60
The number of hidden neurons
Simplified computation: organized in a regular table with general formula
Easy to be adapted for training arbitrarily connected neural networks
Improved computation efficiency for networks with multiple outputs
Tradeoff
–
Extra memory is required to store the extended δ array
80
MLP networks with one hidden layer; 20 inputs
Properties of the forward-only algorithm
–
–
–
•
Number of output=1 to 10
1
Ratio of time consumption
•
100
Proposed Forward-Only Algorithm
• Experiments: training compact neural networks with good generalization ability

Neurons    Success rate        Average iteration        Average time (s)
           EBP      FO         EBP         FO           EBP        FO
8          0%       5%         Failing     222.5        Failing    0.33
9          0%       25%        Failing     214.6        Failing    0.58
10         0%       61%        Failing     183.5        Failing    0.70
11         0%       76%        Failing     177.2        Failing    0.93
12         0%       90%        Failing     149.5        Failing    1.08
13         35%      96%        573,226     142.5        624.88     1.35
14         42%      99%        544,734     134.5        651.66     1.76
15         56%      100%       627,224     119.3        891.90     1.85

• Generalization results (figure):
  – 8 neurons, FO: SSE_train = 0.0044, SSE_verify = 0.0080
  – 8 neurons, EBP: SSE_train = 0.0764, SSE_verify = 0.1271 (under-fitting)
  – 12 neurons, EBP: SSE_train = 0.0018, SSE_verify = 0.4909 (over-fitting)
Proposed Forward-Only Algorithm
• Experiments: comparison of computation efficiency

Forward kinematics [13] (figure: two-link arm with lengths L1, L2, joint angles α, β, and end effector)
Computation method    Forward (ms/iteration)    Backward (ms/iteration)    Relative time
Traditional           0.307                     0.771                      100.0%
Forward-only          0.727                     0.00                       67.4%

ASCII to images
Computation method    Forward (ms/iteration)    Backward (ms/iteration)    Relative time
Traditional           8.24                      1,028.74                   100.0%
Forward-only          61.13                     0.00                       5.9%

Error correction (8-bit signal)
Computation method    Forward (ms/iteration)    Backward (ms/iteration)    Relative time
Traditional           40.59                     468.14                     100.0%
Forward-only          175.72                    0.00                       34.5%
Software
• The NBN Trainer tool is developed in Visual C++ and used for training neural networks
  – Pattern classification and recognition
  – Function approximation
• Available online (currently free): http://www.eng.auburn.edu/~wilambm/nnt/index.htm
Parity-2 Problem
• Parity-2 patterns (figure)
Conclusion
• Second order algorithms are more efficient and advanced for training neural networks
• The proposed second order computation removes Jacobian matrix storage and multiplication, solving the memory limitation
• The proposed forward-only algorithm simplifies the computation process in second order training: a regular table + a general formula
• The proposed forward-only algorithm can handle arbitrarily connected neural networks
• The proposed forward-only algorithm has a speed benefit for networks with multiple outputs
Recent Research
• RBF networks
  – ErrCor algorithm: hierarchical training algorithm
  – The network size increases based on the training information
  – No more trial-and-error design
• Applications of neural networks (future work)
  – Dynamic controller design
  – Smart grid distribution systems
  – Pattern recognition in EDA software design
References
[1] J. X. Peng, Kang Li, G.W. Irwin, "A New Jacobian Matrix for Optimal Learning of Single-Layer Neural Networks," IEEE Trans. on Neural
Networks, vol. 19, no. 1, pp. 119-129, Jan 2008
[2] K. Hornik, M. Stinchcombe and H. White, "Multilayer Feedforward Networks Are Universal Approximators," Neural Networks, vol. 2,
issue 5, pp. 359-366, 1989.
[3] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[4] V. V. Phansalkar, P.S. Sastry, "Analysis of the back-propagation algorithm with momentum," IEEE Trans. on Neural Networks, vol. 5, no.
3, pp. 505-506, March 1994.
[5] M. Riedmiller, H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm". Proc. International
Conference on Neural Networks, San Francisco, CA, 1993, pp. 586-591.
[6] Scott E. Fahlman. Faster-learning variations on back-propagation: An empirical study. In T. J. Sejnowski G. E. Hinton and D. S. Touretzky,
editors, 1988 Connectionist Models Summer School, San Mateo, CA, 1988. Morgan Kaufmann.
[7] M. R. Osborne, "Fisher’s method of scoring," Internat. Statist. Rev., 86 (1992), pp. 271-286.
[8] K. Levenberg, "A method for the solution of certain problems in least squares," Quarterly of Applied Mathematics, 5, pp. 164-168, 1944.
[9] D. Marquardt, "An algorithm for least-squares estimation of nonlinear parameters," SIAM J. Appl. Math., vol. 11, no. 2, pp. 431-441, Jun.
1963.
[10] L. J. Cao, S. S. Keerthi, Chong-Jin Ong, J. Q. Zhang, U. Periyathamby, Xiu Ju Fu, H. P. Lee, "Parallel sequential minimal optimization for
the training of support vector machines," IEEE Trans. on Neural Networks, vol. 17, no. 4, pp. 1039- 1049, April 2006.
[11] D. C. Lay, Linear Algebra and its Applications. Addison-Wesley Publishing Company, 3rd version, pp. 124, July, 2005.
[12] R. Hecht-Nielsen, "Theory of the Back Propagation Neural Network," Proc. 1989 IEEE IJCNN, pp. 1593-1605, IEEE Press, New York, 1989.
[13] N. J. Cotton and B. M. Wilamowski, "Compensation of Nonlinearities Using Neural Networks Implemented on Inexpensive
Microcontrollers" IEEE Trans. on Industrial Electronics, vol. 58, No 3, pp. 733-740, March 2011.
Prepared Publications – Journals
• H. Yu, T. T. Xie, Stanisław Paszczyñski and B. M. Wilamowski, "Advantages of Radial Basis Function Networks for Dynamic System Design," IEEE Trans. on Industrial Electronics (accepted; scheduled for publication in December 2011)
• H. Yu, T. T. Xie and B. M. Wilamowski, "Error Correction – A Robust Learning Algorithm for Designing Compact Radial Basis Function Networks," IEEE Trans. on Neural Networks (major revision)
• T. T. Xie, H. Yu, J. Hewlett, Pawel Rozycki and B. M. Wilamowski, "Fast and Efficient Second Order Method for Training Radial Basis Function Networks," IEEE Trans. on Neural Networks (major revision)
• A. Malinowski and H. Yu, "Comparison of Various Embedded System Technologies for Industrial Applications," IEEE Trans. on Industrial Informatics, vol. 7, issue 2, pp. 244-254, May 2011
• B. M. Wilamowski and H. Yu, "Improved Computation for Levenberg Marquardt Training," IEEE Trans. on Neural Networks, vol. 21, no. 6, pp. 930-937, June 2010 (14 citations)
• B. M. Wilamowski and H. Yu, "Neural Network Learning Without Backpropagation," IEEE Trans. on Neural Networks, vol. 21, no. 11, pp. 1793-1803, Nov. 2010 (5 citations)
• Pierluigi Siano, Janusz Kolbusz, H. Yu and Carlo Cecati, "Real Time Operation of a Smart Microgrid via FCN Networks and Optimal Power Flow," IEEE Trans. on Industrial Informatics (under review)
Prepared Publications – Conferences
• H. Yu and B. M. Wilamowski, "Efficient and Reliable Training of Neural Networks," IEEE Human System Interaction Conference, HSI 2009, Catania, Italy, May 21-23, 2009, pp. 109-115 (Best Paper Award in the Computational Intelligence section) (11 citations)
• H. Yu and B. M. Wilamowski, "C++ Implementation of Neural Networks Trainer," 13th IEEE Intelligent Engineering Systems Conference, INES 2009, Barbados, April 16-18, 2009, pp. 237-242 (8 citations)
• H. Yu and B. M. Wilamowski, "Fast and Efficient Training of Neural Networks," in Proc. 3rd IEEE Human System Interaction Conf., HSI 2010, Rzeszow, Poland, May 13-15, 2010, pp. 175-181 (2 citations)
• H. Yu and B. M. Wilamowski, "Neural Network Training with Second Order Algorithms," monograph by Springer on Human-Computer Systems Interaction: Background and Applications, 31st October, 2010 (accepted)
• H. Yu, T. T. Xie, M. Hamilton and B. M. Wilamowski, "Comparison of Different Neural Network Architectures for Digit Image Recognition," in Proc. IEEE Human System Interaction Conf., HSI 2011, Yokohama, Japan, pp. 98-103, May 19-21, 2011
• N. Pham, H. Yu and B. M. Wilamowski, "Neural Network Trainer through Computer Networks," 24th IEEE International Conference on Advanced Information Networking and Applications, AINA 2010, Perth, Australia, April 20-23, 2010, pp. 1203-1209 (1 citation)
• T. T. Xie, H. Yu and B. M. Wilamowski, "Replacing Fuzzy Systems with Neural Networks," in Proc. 3rd IEEE Human System Interaction Conf., HSI 2010, Rzeszow, Poland, May 13-15, 2010, pp. 189-193
• T. T. Xie, H. Yu and B. M. Wilamowski, "Comparison of Traditional Neural Networks and Radial Basis Function Networks," in Proc. 20th IEEE International Symposium on Industrial Electronics, ISIE 2011, Gdansk, Poland, 27-30 June 2011 (accepted)
Prepared Publications – Chapters for the IE Handbook (2nd Edition)
• H. Yu and B. M. Wilamowski, "Levenberg Marquardt Training," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 12, pp. 12-1 to 12-16, CRC Press.
• H. Yu and M. Carroll, "Interactive Website Design Using Python Script," Industrial Electronics Handbook, vol. 4 – INDUSTRIAL COMMUNICATION SYSTEMS, 2nd Edition, 2010, chapter 62, pp. 62-1 to 62-8, CRC Press.
• B. M. Wilamowski, H. Yu and N. Cotton, "Neuron by Neuron Algorithm," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 13, pp. 13-1 to 13-24, CRC Press.
• T. T. Xie, H. Yu and B. M. Wilamowski, "Neuro-fuzzy System," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 20, pp. 20-1 to 20-9, CRC Press.
• B. M. Wilamowski, H. Yu and K. T. Chung, "Parity-N Problems as a Vehicle to Compare Efficiency of Neural Network Architectures," Industrial Electronics Handbook, vol. 5 – INTELLIGENT SYSTEMS, 2nd Edition, 2010, chapter 10, pp. 10-1 to 10-8, CRC Press.
Thanks!