Presentation - Vision and Image Science Lab
Technion - Israel Institute of Technology
Department of Electrical Engineering
The Vision Research and Image Science Laboratory
Handwritten Character Recognition
Using
Artificial Neural Networks
Shimie Atkins & Daniel Marco
Supervisor: Johanan Erez
November 1998
Project Goals
Creating an optimal neural network capable of identifying characters.
Creating training files, in which the characters are reduced and presented to the net.
Creating an application that enables the user to define forms that are filled in and later identified.
Theoretical Background
What is a neuron?
A black box with inputs.
The inputs are summed according to their respective
weights, and this sum is the neuron's total input.
A mathematical (activation) function is then applied to the
total input; the result is the neuron's output.
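The neuron described above can be sketched in C++ (the project's implementation language). The sigmoid activation is an assumption; the slides do not name the function actually used.

```cpp
#include <cmath>
#include <vector>

// Sketch of a single neuron: weighted sum of the inputs, then an
// activation function. The sigmoid is an illustrative assumption.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

double neuronOutput(const std::vector<double>& inputs,
                    const std::vector<double>& weights) {
    double total = 0.0;                    // the neuron's total input
    for (std::size_t i = 0; i < inputs.size(); ++i)
        total += inputs[i] * weights[i];   // sum according to the weights
    return sigmoid(total);                 // apply the function
}
```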
What is a Layer?
A group of neurons that are parallel to each other, with no
connections amongst themselves.
Background (cont)
What is a Network?
A group of layers, where neurons’ outputs in one layer
are the inputs of the neurons in the next layer.
The input of the network is the first layer’s input.
The output of the network is the last layer’s output.
Note:
Our network is fully connected: every neuron in
one layer is connected to each neuron in the next layer.
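A minimal sketch of forward propagation through such a fully connected network, again assuming a sigmoid activation:

```cpp
#include <cmath>
#include <vector>

// Each layer's outputs are the inputs of every neuron in the next
// layer; the net's input feeds the first layer, and the last layer's
// output is the net's output. The sigmoid is an assumption.
using Neuron = std::vector<double>;   // one weight per input
using Layer  = std::vector<Neuron>;

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

std::vector<double> forward(const std::vector<Layer>& net,
                            std::vector<double> signal) {
    for (const Layer& layer : net) {
        std::vector<double> next;
        for (const Neuron& weights : layer) {  // fully connected:
            double total = 0.0;                // every neuron sees
            for (std::size_t i = 0; i < signal.size(); ++i)
                total += signal[i] * weights[i];  // every signal
            next.push_back(sigmoid(total));
        }
        signal = next;
    }
    return signal;   // the last layer's output
}
```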
Training Algorithm
We used the back-propagation algorithm, which
works as follows:
Net receives an input vector and the desired output vector.
Propagate the input vector through the net.
Compare the net's output vector with the desired vector; if they
match within the allowable error --> Done.
Update the weights, beginning with the last layer and advancing
backwards towards the first layer.
Go back to the second step.
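The loop above can be sketched as a single training step for a small two-layer sigmoid net. The layer sizes, learning rate, and activation function are illustrative assumptions, not the project's actual parameters.

```cpp
#include <cmath>
#include <vector>

// One back-propagation step: propagate forward, compare with the
// desired output, then update weights from the last layer backwards.
struct TinyNet {
    std::vector<std::vector<double>> w1, w2;  // two layers of weights
    double eta = 0.5;                         // learning rate (assumed)

    static double sig(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // Returns the squared error against the desired vector.
    double trainStep(const std::vector<double>& in,
                     const std::vector<double>& want) {
        // Forward: propagate the input vector through the net.
        std::vector<double> hid(w1.size()), out(w2.size());
        for (std::size_t j = 0; j < w1.size(); ++j) {
            double s = 0;
            for (std::size_t i = 0; i < in.size(); ++i) s += w1[j][i] * in[i];
            hid[j] = sig(s);
        }
        for (std::size_t k = 0; k < w2.size(); ++k) {
            double s = 0;
            for (std::size_t j = 0; j < hid.size(); ++j) s += w2[k][j] * hid[j];
            out[k] = sig(s);
        }
        // Compare the net's output with the desired vector.
        double err = 0;
        std::vector<double> dOut(out.size());
        for (std::size_t k = 0; k < out.size(); ++k) {
            double e = want[k] - out[k];
            err += e * e;
            dOut[k] = e * out[k] * (1 - out[k]);  // sigmoid derivative
        }
        // Backward: update the last layer first, then the hidden layer.
        std::vector<double> dHid(hid.size(), 0.0);
        for (std::size_t k = 0; k < w2.size(); ++k)
            for (std::size_t j = 0; j < hid.size(); ++j) {
                dHid[j] += dOut[k] * w2[k][j];      // read before update
                w2[k][j] += eta * dOut[k] * hid[j];
            }
        for (std::size_t j = 0; j < w1.size(); ++j) {
            double d = dHid[j] * hid[j] * (1 - hid[j]);
            for (std::size_t i = 0; i < in.size(); ++i)
                w1[j][i] += eta * d * in[i];
        }
        return err;
    }
};
```

In the full algorithm this step is repeated over every example in the training file until the error falls within the allowable limit.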
The Reduction Algorithm
Divide the height and width of the large picture by those of
the small picture.
Each resulting block of pixels in the large picture
represents one pixel in the small picture.
Count the number of "On" pixels in the block.
Determine, according to a predefined threshold,
whether the pixel in the small picture is "On" or "Off".
Note:
If the dimensions of a block are not integral, padding is
applied.
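A sketch of the reduction, assuming the large picture's dimensions divide evenly (the padding step for non-integral blocks is omitted) and an illustrative threshold parameter:

```cpp
#include <vector>

// Each block of pixels in the large picture maps to one pixel in the
// small picture; the small pixel is "On" when the fraction of "On"
// pixels in the block reaches the threshold (value assumed).
std::vector<std::vector<bool>> reduce(
        const std::vector<std::vector<bool>>& big,
        int smallH, int smallW, double threshold) {
    int blockH = static_cast<int>(big.size()) / smallH;
    int blockW = static_cast<int>(big[0].size()) / smallW;
    std::vector<std::vector<bool>> out(smallH, std::vector<bool>(smallW));
    for (int r = 0; r < smallH; ++r)
        for (int c = 0; c < smallW; ++c) {
            int on = 0;  // count the "On" pixels in the block
            for (int i = 0; i < blockH; ++i)
                for (int j = 0; j < blockW; ++j)
                    on += big[r * blockH + i][c * blockW + j] ? 1 : 0;
            out[r][c] = on >= threshold * blockH * blockW;
        }
    return out;
}
```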
Reduction Algorithm (cont)
Code Implementation
•We implemented the network in C++.
•We defined three classes: CNeuron, CLayer, CNet.
They have the following relationships:
(diagram showing the relationships between the Net, Layer, and Neuron classes)
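The three classes might be declared along these lines; the member names and the activation function are assumptions, since only the class names appear in the slides.

```cpp
#include <cmath>
#include <vector>

// Assumed structure: a CNet contains CLayers, which contain CNeurons.
class CNeuron {
public:
    std::vector<double> m_weights;   // one weight per input
    double Output(const std::vector<double>& inputs) const {
        double total = 0.0;
        for (std::size_t i = 0; i < inputs.size(); ++i)
            total += inputs[i] * m_weights[i];
        return 1.0 / (1.0 + std::exp(-total));  // assumed sigmoid
    }
};

class CLayer {
public:
    std::vector<CNeuron> m_neurons;  // parallel, unconnected neurons
};

class CNet {
public:
    std::vector<CLayer> m_layers;    // each layer feeds the next
};
```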
Results of Form Identification
The following form was created (using our
interface) and scanned:
Results (cont)
The first net identified the form as follows:
Configuration of Network:
37 nets (8x10 inputs, 3 layers)
Alphabetical - 40 neurons in 2nd layer
Numerical - 25 neurons in 2nd layer
Train File:
dan1to6.trn (40 examples of each character).
Identification Rate (for check file):
96%
Results (cont)
The second network identified the form as follows:
Configuration of Network:
2 nets (8x10 inputs, 3 layers)
Alphabetical - 30 neurons in 2nd layer
Numerical - 30 neurons in 2nd layer
Train File:
dan1to9.trn (60 examples of each character).
Identification Rate (for check file):
95.5%
Conclusions
The identification success rate increases with the
number of training examples given for each character.
More than one hidden layer does not necessarily
improve performance.
Most errors occur with letters of similar shape (such as I-J
or O-D); numerical digits are usually identified successfully.
The configuration and training parameters of the net
have minimal influence (once within certain optimal
limits).