Slides - Computer Science

Download Report

Transcript Slides - Computer Science

Markov Models for
Handwriting Recognition
An “agent” (i.e. computer program) can’t
determine that
is I, I is t, is m …
What information does the agent get?
• Each character is 28 x 28 pixel image.
• Represented by 28 x 28 array of 1s and 0s
(1 = pixel occupied, 0 = pixel unoccupied)
• Thousands of test examples: arrays labeled with
corresponding letters
D .. Z
D .. Z
D .. Z
Use these to compute probability that
is I, is is t,
is m …
Neural Networks
Ananya Das Christman
CS311
Fall 2016
Biological Neural Nets
• Pigeons as art experts (Watanabe et al. 1995)
• Experiment:
– Pigeon in box
– Present paintings of two different artists (e.g. Monet
/ Van Gogh)
– Reward for pecking when presented a particular
artist (e.g. Van Gogh)
Results
• Pigeons were able to discriminate between Van Gogh and
Chagall with 95% accuracy (when presented with pictures
they had been trained on)
• Discrimination still 85% successful for previously unseen
paintings of the artists
• Pigeons do not simply memorize the pictures
• They can extract and recognise patterns (the‘style’)
• They generalize from the already seen to make predictions
• This is what neural networks (biological and artificial) are
good at (unlike conventional computer programs)
Our Nervous System
Synapse
Neuron
Neuron
Synapse
Electrical
Signal
Neuron
Our Nervous System:
The Computer Science View
A NEURON is a brain cell
• collects, processes, and disseminates
electrical signals
• Neurons are connected via synapses
• They FIRE depending on the conditions of
neighboring neurons
Our nervous system
The human brain
• contains ~100 billion neurons
• each neuron is connected to
~10,000 other neurons
• neurons send signals of
different strengths to each
other (can “fire” as fast as 1
millisecond)
Express this in CS language?
Neural Network
(node) = neuron
A
(edge) = synapse
B
w = strength of signal sent from A to B
• If A fires and w is positive, then A stimulates B.
• If a node is stimulated enough, then it also fires.
• How much stimulation is required is determined by its threshold.
Neural Networks
Node (Neuron)
Edge (synapses)
McCulloch and Pitts (1943)
Our approximation of the brain
Perceptron: A Single Neuron
Input x1
Weight w1
Input x2
Weight w2
å
g(X)
Output y
g: threshold function
Input x3
Weight w3
X = å wi xi
i
Weight w4
Input x4
Possible threshold functions
Step function:
g(X):
g(X) >= threshold
if X (weighted sum of inputs) >= threshold
output 1
otherwise
output 0
Sigmoid:
1
g(X) =
1+ e-aX
g(X) < threshold
Possible threshold functions
Step function:
g(X):
g(X) >= threshold
if X (weighted sum of inputs) >= threshold
output 1
otherwise
output 0
Sigmoid function:
1
g(X) =
1+ e-aX
g(X) < threshold
Step Function
1
1
1
0
-1
?
1
Threshold = 1
0.5
1
Step Function
1
1
1
0
-1
0
1
Threshold = 1
0.5
1
Weighted sum = 0.5,
which is < threshold
Step Function
1
1
0
0
-1
?
1
Threshold = 1
0.5
1
Step Function
1
1
0
0
-1
1
1
Threshold = 1
0.5
1
Weighted sum = 1.5,
which is >= threshold
Kinds of Neural networks
inputs
inputs
output
output
How are these different?
Kinds of Neural networks
inputs
inputs
hidden units/layer
hidden units/layer
output
output
Kinds of Neural networks
inputs
inputs
output
output
How are these different?
Kinds of Neural networks
inputs
Recurrent network
Output is fed back to input
Can support memory!
Kinds of Neural networks
inputs
inputs
Feed forward networks (we’ll mostly deal with these)
Eventually…Classify
Red = True
x1 = 1
W1
…
…
Apple? = True
Round = True
x2 = 1
W2
Eventually…Classify
Red = False
x1 = 0
W1
…
…
Apple? = False
Round = True
x2 = 1
W2
Eventually…Classify
Red = True
x1 = 1
W1
…
…
Apple? = True
Round = True
x2 = 1
W2
We don’t know how the
features map to the
correct output
Eventually…Classify
Red = False
x1 = 0
W1 =?
T?
…
…
Apple? = False
Round = True
x2 = 1
W2 = ?
So we want to learn the
weights and threshold
that would produce this
mapping
How to learn the mapping?
Suppose 2 features:
• Red?
• Round?
In all test examples
labeled ‘Apple’:
Red = true
Round = true
How to learn the mapping?
Suppose 2 features:
• Red?
• Round?
In all training examples
labeled ‘Apple’:
Red = true
Round = true
Mapping is AND
AND
x1
x2
x1 and x2
0
0
0
0
1
0
1
0
0
1
1
1
x1 = Red
x2 = Round
Eventually…Classify
Red = False
x1 = 0
W1 =?
T?
…
…
Apple? = False
Round = True
x2 = 1
W2 = ?
So we want to learn the
weights and threshold
that would produce this
mapping
AND
Input x1
W1 = ?
T=?
Input x2
W2 = ?
Inputs are either 0 or 1
Output y
AND
Input x1
W1 = 1
T=2
Output y
Output is 1 only if
all inputs are 1
Input x2
W2 = 1
Inputs are either 0 or 1