Machine Learning and having it deep and structured
Introduction
Outline
What is Machine Learning?
Deep Learning
Structured Learning
Some tasks are very complex
• You know how to write programs.
• One day, you are asked to write a program for speech recognition.
[Figure: four waveforms of the same utterance "你好" ("Hello"); the task is to find the common patterns among them.]
You quickly get lost in the exceptions and special cases. It seems impossible to write a program for speech recognition.
Let the machine learn by itself
[Figure: from a large amount of audio data ("你好" "Hello", "大家好" "Hello everyone", "人帥真好" "It's nice to be handsome", …), the machine learns how to do speech recognition and can then respond: You said "你好".]
You only have to write the program for learning.
Learning ≈ Looking for a Function
• Speech Recognition: f(audio) = "你好"
• Handwriting Recognition: f(image) = "2"
• Weather Forecast: f(weather today) = "sunny tomorrow"
• Play Video Games: f(positions and number of enemies) = "jump"
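In code terms, each task above is just a function with a particular input and output type, and learning means finding a good such function. A minimal sketch; all of these signatures are hypothetical illustrations, not a real API:

```python
# Each task is "just" a function; learning means finding a good one.
# All signatures below are hypothetical illustrations.

def speech_recognition(audio: list[float]) -> str:
    """f(audio samples) -> text, e.g. '你好'."""
    ...

def handwriting_recognition(image: list[list[int]]) -> str:
    """f(pixel grid) -> digit label, e.g. '2'."""
    ...

def weather_forecast(weather_today: dict) -> str:
    """f(today's measurements) -> 'sunny tomorrow'."""
    ...

def play_video_game(screen_state: dict) -> str:
    """f(positions and number of enemies) -> action, e.g. 'jump'."""
    ...
```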
Types of Learning
• Supervised Learning
• Reinforcement Learning
• Unsupervised Learning
Supervised Learning
• x: function input; y: function output, e.g., x: (audio), y: "你好"
• Model (Hypothesis Function Set): {f1, f2, …}
• Training Data: {(x¹, ŷ¹), (x², ŷ²), …}, where ŷ is the label, e.g., ŷ = "大家好"
• Training: pick the "best" function f* using the training data
• Testing: y = f*(x), e.g., f*(image of a handwritten digit) = "2"
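A minimal sketch of "training = picking the best function": given labeled pairs (x, ŷ), choose the f in the hypothesis set with the lowest error. The data and the candidate functions below are made up for illustration:

```python
# Toy supervised learning: pick the best function from a hypothesis set
# using labeled training pairs (x, y_hat). Data and hypotheses are made up.

training_data = [(1, 2), (2, 4), (3, 6)]  # (x, y_hat) pairs; here y_hat = 2x

# Hypothesis function set {f1, f2, ...}
hypotheses = {
    "f1": lambda x: x + 1,
    "f2": lambda x: 2 * x,
    "f3": lambda x: x ** 2,
}

def loss(f):
    """How badly f disagrees with the labels (squared error)."""
    return sum((f(x) - y_hat) ** 2 for x, y_hat in training_data)

# Training: pick the "best" function f*
best_name, f_star = min(hypotheses.items(), key=lambda kv: loss(kv[1]))
print(best_name, loss(f_star))  # f2 0

# Testing: apply f* to new input
print(f_star(10))  # 20
```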
Reinforcement Learning
• Example: dialogue system
• Model (Hypothesis Function Set): {f1, f2, …}
• Training Data: {x¹, x², …} — no labels; we only know how good f(x) is
  e.g., x: "How are you?" → f1(x) = "Good Bye" (Bad!), f2(x) = "Fine." (Good!)
• Training: pick the "best" function f*
• Testing: y = f*(x), e.g., x' = "hello" → Machine: y' = "hi"
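A caricature of this idea in code: there are no labels ŷ, only a reward telling us how good each response was (standing in for the user's "Good!"/"Bad!" feedback). The reward values are made up, and real reinforcement learning involves sequences of decisions rather than a single lookup:

```python
# Toy reinforcement learning flavour: no labels, only a reward signal
# telling us how good f(x) is. Reward values are hypothetical.

x = "How are you?"

hypotheses = {
    "f1": lambda x: "Good Bye",  # user reacts: Bad!
    "f2": lambda x: "Fine.",     # user reacts: Good!
}

# Hypothetical reward: +1 if the user liked the response, -1 otherwise.
reward = {"Good Bye": -1, "Fine.": +1}

def goodness(f):
    """We never see the 'correct' answer, only how good f(x) was."""
    return reward[f(x)]

# Training: pick the function with the highest reward,
# not the lowest label error.
f_star = max(hypotheses.values(), key=goodness)
print(f_star("How are you?"))  # Fine.
```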
Unsupervised Learning
• Training Data: {x¹, x², …} — no labels
  e.g., lots of audio without text annotation
• What can I do with these data?
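One common answer is to look for structure in the unlabeled data, for example by clustering. A minimal sketch, assuming scikit-learn is installed; the "audio feature" vectors are made up for illustration:

```python
# With unlabeled data {x1, x2, ...} we can still look for structure.
# Minimal clustering sketch on made-up 2-D "audio feature" vectors.
import numpy as np
from sklearn.cluster import KMeans  # assumes scikit-learn is available

# Hypothetical features extracted from audio clips (no text labels anywhere).
X = np.array([[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.88, 0.85]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 1 1]: similar clips grouped without labels
```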
Outline
What is Machine Learning?
Deep Learning
Structured Learning
Inspired by the human brain
Human Brains are Deep
A Neuron for Machine
• Each neuron is a function:
  z = w1·x1 + w2·x2 + … + wN·xN + b   (w1 … wN: weights, b: bias)
  a = σ(z), where σ(z) = 1 / (1 + e^(−z)) is the sigmoid activation function
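The two equations above translate directly into code. A minimal sketch; the inputs, weights, and bias below are made-up example values:

```python
# A single neuron as a function: z = w·x + b, a = sigmoid(z).
import numpy as np

def sigmoid(z):
    """Activation function: sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """One neuron: weighted sum of inputs plus bias, then activation."""
    z = np.dot(w, x) + b
    return sigmoid(z)

# Example with made-up weights and bias.
x = np.array([1.0, -2.0, 0.5])   # inputs x1..xN
w = np.array([0.7, 0.1, -0.4])   # weights w1..wN
b = 0.2                          # bias
print(neuron(x, w, b))           # a value in (0, 1)
```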
Deep Learning
• Neural Network: cascading the neurons
• f : R^N → R^M
[Figure: inputs x1 … xN feed through hidden layers (Layer 1, Layer 2, …, Layer L) to outputs y1 … yM.]
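Cascading the neurons just means applying one layer after another. A minimal sketch of such an f : R^N → R^M; the layer sizes and weights below are made up:

```python
# A network f: R^N -> R^M as cascaded layers of neurons.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Apply each layer (W, b) in turn: a = sigmoid(W a + b)."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

# Made-up network: N=3 inputs, one hidden layer of 4 neurons, M=2 outputs.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((4, 3)), rng.standard_normal(4)),  # Layer 1
    (rng.standard_normal((2, 4)), rng.standard_normal(2)),  # output layer
]
x = np.array([0.5, -1.0, 2.0])
print(forward(x, layers))  # a point in R^2
```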
Deep Learning
• Universality Theorem: any continuous function f : R^N → R^M can be realized by a network with one hidden layer (given enough hidden neurons).
  Reference: http://neuralnetworksanddeeplearning.com/chap4.html
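The intuition behind the theorem, following the chapter linked above: two shifted steep sigmoids form a "bump", and a single hidden layer of such bump pairs can approximate any continuous function. A rough numeric sketch (the target f(x) = x² and all constants are illustrative):

```python
# Universality intuition: a pair of steep sigmoids makes a "bump";
# a sum of bumps (= one hidden layer) approximates a 1-D function.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bump(x, left, right, height, steep=100.0):
    """Approximate an indicator of [left, right] with two hidden neurons."""
    return height * (sigmoid(steep * (x - left)) - sigmoid(steep * (x - right)))

# Approximate f(x) = x^2 with 10 bumps (20 hidden sigmoid neurons).
# Evaluate away from the outer edges, where the half-bumps distort.
x = np.linspace(0.05, 0.95, 200)
edges = np.linspace(0, 1, 11)
approx = sum(bump(x, a, b, ((a + b) / 2) ** 2)
             for a, b in zip(edges[:-1], edges[1:]))
print(float(np.max(np.abs(approx - x ** 2))))  # small; shrinks with more bumps
```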
Popular & Powerful
• Speech Recognition (TIMIT):
  (Deep neural networks on TIMIT usually use 4 to 8 layers.)
Three misunderstandings about Deep Learning
1. Deep learning works because the model is more "complex".
Misconception: deep is simply more complex, and works better simply because it uses more parameters.
[Figure: a "shallow" network (inputs x1 … xN, one wide hidden layer) vs. a "deep" network (inputs x1 … xN, many narrow hidden layers).]
Fat + Short vs. Thin + Tall
• Which one is better?
[Figure: a shallow "fat + short" network vs. a deep "thin + tall" network, both with inputs x1 … xN.]
Deep Learning - Why?
• Toy example: f : R² → {0, 1}
• Sample 100,000 points as training data
[Figure: a network with input x ∈ R² and output y ∈ {0, 1}.]
Deep Learning - Why?
• Toy example, continued
[Figure: learned functions for networks with 1 hidden layer and with 3 hidden layers; neuron counts shown include 125, 500, and 2500.]
• Quiz: how many neurons are in each hidden layer? Options: 100~200, 25~50, 50~100, less than 25.
Deep Learning - Why?
• Experiments on handwritten digit classification
• Deeper: using fewer parameters to achieve the same performance
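A quick way to see how "thin + tall" can use fewer parameters than "fat + short": count the weights and biases of a fully connected network. The layer sizes below are made up for illustration:

```python
# Compare parameter counts of shallow vs. deep fully connected nets.
def num_params(layer_sizes):
    """Total weights + biases for sizes [N, h1, ..., M]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Made-up sizes: 100 inputs, 10 outputs.
shallow = [100, 2000, 10]        # fat + short: one wide hidden layer
deep = [100, 128, 128, 128, 10]  # thin + tall: several narrow layers

print(num_params(shallow))  # 222010
print(num_params(deep))     # 47242
```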
Three misunderstandings about Deep Learning
2. When you are using deep learning, you need more training data.
Size of Training Data
• Different numbers of training examples: 100,000 / 50,000 / 20,000
[Figure: performance of a 1-hidden-layer network vs. a 3-hidden-layer network at each training-set size.]
Size of Training Data
• Experiments on handwritten digit classification
• Deeper: using less training data to achieve the same performance
Three misunderstandings about Deep Learning
3. You can simply get the power of deep by cascading the neurons.
It is hard to get the power of deep … Can I get all the power of deep from this course? No: researchers still do not fully understand the mysteries of deep learning.
Outline
What is Machine Learning?
Deep Learning
Structured Learning
In the real world ……
f : X → Y
• X (input domain): sequence, graph structure, tree structure ……
• Y (output domain): sequence, graph structure, tree structure ……
Retrieval
f : X → Y
• X: "Machine learning" (keyword)
• Y: a list of web pages (search result)
Translation
f : X → Y
• X: "Machine learning and having it deep and structured" (one kind of sequence)
• Y: "機器學習及其深層與結構化" (another kind of sequence: the Chinese translation)
Speech Recognition
f : X → Y
• X: (audio; one kind of sequence)
• Y: "大家好,歡迎大家來修機器學習及其深層與結構化" ("Hello everyone, welcome to Machine Learning and having it deep and structured"; another kind of sequence)
Speech Summarization
f : X → Y
• X: recorded lectures
• Y: summary (select the most informative segments to form a compact version)
Object Detection
f : X → Y
• X: image
• Y: object positions
[Figure: detected characters labeled "Haruhi" and "Mikuru".]
Image Segmentation
f : X → Y
• X: image
• Y: foreground segmentation
Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends® in Computer Graphics and Vision 6.3–4 (2011): p. 57.
Remote Image Ground Survey
f : X → Y
Source of images: Nowozin, Sebastian, and Christoph H. Lampert. "Structured Learning and Prediction in Computer Vision." Foundations and Trends® in Computer Graphics and Vision 6.3–4 (2011): p. 146.
Pose Estimation
f : X → Y
• X: image
• Y: pose
Source of images: http://groups.inf.ed.ac.uk/calvin/Publications/eichner-techreport10.pdf
Structured Learning
• In the past, the tasks above were developed separately.
• Recently, people have realized that there is a unified framework behind these approaches.
• Three steps: Evaluation, Inference, Learning (see the sketch below).
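A minimal skeleton of the unified framework: an evaluation function F(x, y) scores how compatible a structured output y is with input x; inference searches for the best y; learning fits F from examples. The feature function and candidate outputs below are toy placeholders, not a real method:

```python
# Skeleton of the unified structured learning framework.
# The feature function and candidates below are hypothetical toys.

def features(x, y):
    """Joint features of input x and structured output y (toy version)."""
    return [len(y), float(y.startswith(x[0]))]

def evaluation(x, y, w):
    """Step 1 (Evaluation): F(x, y) = w · phi(x, y) scores compatibility."""
    return sum(wi * fi for wi, fi in zip(w, features(x, y)))

def inference(x, candidates, w):
    """Step 2 (Inference): y* = argmax_y F(x, y) over candidate outputs."""
    return max(candidates, key=lambda y: evaluation(x, y, w))

# Step 3 (Learning) would pick w from training pairs so that inference
# returns the labeled y_hat for each x (method-dependent; omitted here).
w = [0.1, 2.0]
print(inference("machine learning", ["ml", "music", "mars rover"], w))
```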
Concluding Remarks
What is Machine Learning?
Deep Learning
Structured Learning
Reference
• Deep Learning
  • Neural Networks and Deep Learning
    http://neuralnetworksanddeeplearning.com/
  • For more information: http://deeplearning.net/
• Structured Learning
  • Structured Learning and Prediction in Computer Vision
    http://www.nowozin.net/sebastian/papers/nowozin2011structured-tutorial.pdf
  • Linguistic Structure Prediction
    http://www.cs.cmu.edu/afs/cs/Web/People/nasmith/LSP/PUBLISHED-frontmatter.pdf
Thank you!
Powerful
• Inspired by the human brain (visual cortex)
[Figure: as in the visual cortex, the network transforms pixels (retina) through edges (Layer 1) and primitive shapes (Layer 2), up to Layer L.]
Powerful
• Image Recognition
[Figure: visualizations of features learned by the 1st, 2nd, and 3rd hidden layers.]
Reference: Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014 (pp. 818-833).