Learning in Neural Networks and Defect Tolerant Classifiers


Memristor in Learning Neural Networks
Shaodi Wang
Parts of slides from Elham Zamanidoost and Ligang Gao
Puneet Gupta ([email protected])
Characteristics

[Figure: memristive device structure with Ag/Pt electrodes under positive and negative bias]
Neural Network
Learning in Neural Networks
• Supervised Learning
  - Training set contains input and output
    * Feed-forward network
    * Recurrent network
• Unsupervised Learning
  - Training set contains input only
    * Self-organizing network
Multi-Layer Perceptron
• Hidden layer(s) perform classification of features
• Sigmoid activation function
Back-Propagation Learning: apply gradient descent over the entire network.
As before, the weight update is

$$w(i) \leftarrow w(i) + \Delta w(i), \qquad \Delta w(i) = -\eta \, \frac{\partial E}{\partial w(i)} = -\eta \, \frac{\partial E}{\partial u} \frac{\partial u}{\partial w(i)} = \eta \, \delta \, x(i)$$

For every output neuron:

$$\delta = -\frac{\partial E}{\partial u} = -\frac{\partial E}{\partial y_{out}} \frac{\partial y_{out}}{\partial u} = y_{out}(1 - y_{out})(y_{train} - y_{out}), \qquad \frac{\partial u}{\partial w(i)} = y_{hid}$$

For every hidden neuron:

$$\delta = -\frac{\partial E}{\partial u_{hid}} = -\frac{\partial y_{hid}}{\partial u_{hid}} \frac{\partial u_{out}}{\partial y_{hid}} \frac{\partial E}{\partial u_{out}} = y_{hid}\left(1 - y_{hid}\right) w_{hid,out} \, \delta_{out}, \qquad \frac{\partial u}{\partial w(i)} = x(i)$$
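To make these update rules concrete, here is a minimal NumPy sketch of one back-propagation step for a single-hidden-layer perceptron with sigmoid units. The names (backprop_step, W_hid, W_out, eta) are illustrative, not from the slides; when a hidden unit feeds several outputs, the matrix product W_out.T @ delta_out performs the sum over outputs.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def backprop_step(x, y_train, W_hid, W_out, eta=0.1):
    """One gradient-descent update for a one-hidden-layer MLP with sigmoid units."""
    # Forward pass
    y_hid = sigmoid(W_hid @ x)        # hidden-layer activations
    y_out = sigmoid(W_out @ y_hid)    # network outputs

    # Output deltas: y_out * (1 - y_out) * (y_train - y_out)
    delta_out = y_out * (1.0 - y_out) * (y_train - y_out)
    # Hidden deltas: back-propagate output deltas through W_out
    delta_hid = y_hid * (1.0 - y_hid) * (W_out.T @ delta_out)

    # Weight updates: dw(i) = eta * delta * input
    W_out += eta * np.outer(delta_out, y_hid)
    W_hid += eta * np.outer(delta_hid, x)
    return W_hid, W_out
```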
Gradient Descent
• Define the cost function as the sum of errors over the entire training set, with the error defined as $E = \frac{1}{2}(y_{train} - y_{out})^2$
• Now train the network so as to minimize the cost. This means we need to minimize the error, and hence we need a continuous activation function in order to calculate the derivative.
• Sigmoid activation function: $f(v) = \dfrac{1}{1 + e^{-v}}$
* Gradient Descent Learning:

$$\Delta w(i) = -\eta \, \frac{\partial E}{\partial w(i)} = -\eta \, \frac{\partial E}{\partial v} \frac{\partial v}{\partial w(i)} = \eta \, \delta \, x(i)$$

where

$$\delta = -\frac{\partial E}{\partial v} = -\frac{\partial E}{\partial y_{out}} \frac{\partial y_{out}}{\partial v} = (y_{train} - y_{out}) \, f'(v)$$

The sigmoid derivative needed above:

$$\frac{df(v)}{dv} = \frac{d}{dv}\left(\frac{1}{1+e^{-v}}\right) = \frac{e^{-v}}{(1+e^{-v})^2} = \frac{1}{1+e^{-v}} - \frac{1}{(1+e^{-v})^2} = \frac{1}{1+e^{-v}}\left(1 - \frac{1}{1+e^{-v}}\right) = f(v)\bigl(1 - f(v)\bigr)$$
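Putting these pieces together, here is a minimal NumPy sketch of delta-rule gradient descent for a single sigmoid neuron; the names train_neuron, eta, and epochs are illustrative assumptions, not from the slides.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_neuron(X, y_train, eta=0.5, epochs=100):
    """Delta-rule gradient descent for a single sigmoid neuron.

    X: (n_samples, n_inputs) inputs; y_train: (n_samples,) targets in (0, 1).
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, y_train):
            y_out = sigmoid(w @ x)
            # delta = (y_train - y_out) * f'(v), with f'(v) = f(v) * (1 - f(v))
            delta = (t - y_out) * y_out * (1.0 - y_out)
            w += eta * delta * x   # dw(i) = eta * delta * x(i)
    return w
```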
Recurrent Network
• Characteristics:
- Nodes connect back to other nodes or themselves
- Information flow is bidirectional
• Fully recurrent network: there is a pair of directed connections between every pair of neurons in the network (a one-step update is sketched below)
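As a toy illustration of the feedback structure (not from the slides), here is a one-step state update for a fully recurrent layer; the sigmoid choice and the names recurrent_step, W_in, and W_rec are assumptions.

```python
import numpy as np

def recurrent_step(x, state, W_in, W_rec):
    """One update of a fully recurrent layer: every neuron receives the
    previous outputs of all neurons, including itself, through W_rec."""
    v = W_in @ x + W_rec @ state   # external input plus recurrent feedback
    return 1.0 / (1.0 + np.exp(-v))
```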
Hopfield Network
• Characteristics:
  - An RNN in which all connections are symmetric
  - Binary threshold activation function (CAM)
  - No unit has a connection with itself, and W_{i,j} = W_{j,i} (symmetric)
  - Symmetric weights guarantee that the energy function decreases monotonically
  - Hebbian learning: increase the weight between two nodes if both have the same activity, otherwise decrease it
  - Synchronous update: the outputs of all nodes are calculated before being applied to the other nodes
  - Asynchronous update: randomly choose a node and calculate its output (Hebbian learning and asynchronous updating are sketched below)
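A minimal NumPy sketch of these two rules, assuming bipolar (+1/-1) states; the function names hebbian_weights and recall are illustrative.

```python
import numpy as np

def hebbian_weights(patterns):
    """Hebbian learning: W_ij grows when units i and j share the same activity.

    patterns: (n_patterns, n_units) array of +/-1 states.
    """
    n = patterns.shape[1]
    W = patterns.T @ patterns / n   # symmetric by construction
    np.fill_diagonal(W, 0.0)        # no unit connects to itself
    return W

def recall(W, state, steps=100, rng=np.random.default_rng(0)):
    """Asynchronous update: pick a random unit and recompute its output."""
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if W[i] @ state >= 0 else -1   # binary threshold
    return state
```

With symmetric zero-diagonal weights and asynchronous updates, each flip can only lower the network energy, which is why recall settles into a stored pattern.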
Self-Organizing Map
• The purpose of a SOM is to map a multidimensional input space onto a topology-preserving map of neurons
  - The topology is preserved so that neighboring neurons respond to "similar" input patterns
  - The topological structure is often a 2- or 3-dimensional space
• Each neuron is assigned a weight vector with the same dimensionality as the input space
• Input patterns are compared to each weight vector, and the closest one wins (Euclidean distance); one update step is sketched below
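A minimal NumPy sketch of one SOM update under these rules; the Gaussian neighborhood and the names som_step, eta, and sigma are assumptions, not from the slides.

```python
import numpy as np

def som_step(weights, x, grid, eta=0.1, sigma=1.0):
    """One SOM update: find the best-matching unit, then pull it and its
    grid neighbors toward the input.

    weights: (n_neurons, dim) weight vectors; grid: (n_neurons, 2) map positions.
    """
    # Winner = neuron whose weight vector is closest to x (Euclidean distance)
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
    # Gaussian neighborhood on the 2-D map, centered on the winner
    d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
    h = np.exp(-d2 / (2.0 * sigma ** 2))
    # Move every neuron's weights toward x, scaled by its neighborhood value
    weights += eta * h[:, None] * (x - weights)
    return weights
```

Shrinking eta and sigma over the course of training, as is common for SOMs, lets the map settle into a stable, topology-preserving ordering.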
Thanks