multipleLearners - Heather Dewey


Ensemble Methods

“No free lunch theorem”
Wolpert and Macready 1995
Solution search also involves searching for learners

Different algorithms
Different parameters
Different input representations/features
Different data

Base learner

Diversity over accuracy

Model combination




Voting
Bagging
Boosting
Cascading
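
Voting is the simplest of these combination schemes. A minimal sketch, assuming each base learner outputs a single class label and all learners count equally; the function name and the example labels are hypothetical:

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of base learners."""
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical outputs of three base learners for one input.
votes = ["spam", "ham", "spam"]
print(majority_vote(votes))  # spam
```

Weighted voting follows the same pattern, with each learner's vote scaled by a weight instead of counted once.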

Data set = [1,2,3,4,5,6,7,8,9,10]

Samples:
 Input to learner 1 = [10,2,5,10,3]
 Input to learner 2 = [4,5,2,7,6,3]
 Input to learner 3 = [8,8,4,9,1]
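
The samples above look like bootstrap samples: items drawn from the data set with replacement, which is why a value such as 10 or 8 can appear twice. A minimal sketch of how such inputs could be generated, assuming uniform sampling and a fixed sample size of 5 (both assumptions, as is the helper name):

```python
import random


def bootstrap_sample(data, size, rng):
    """Draw `size` items from `data` uniformly at random, with replacement."""
    return [rng.choice(data) for _ in range(size)]


data_set = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability

# One bootstrap sample per base learner.
samples = [bootstrap_sample(data_set, 5, rng) for _ in range(3)]
for i, sample in enumerate(samples, start=1):
    print(f"Input to learner {i} = {sample}")
```

Each learner is then trained on its own sample, and the trained learners are combined, for example by voting.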

Create complementary learners
Train successive learners on the mistakes of predecessors

Weak learners combine to a strong learner

AdaBoost – Adaptive Boosting
Allows for a smaller training set
Simple classifiers
Binary

Modify probability of drawing examples from a training set based on errors
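Step 3

$$\alpha_1 = \frac{1}{2} \ln\left(\frac{1 - \text{error}}{\text{error}}\right)$$

$$\text{error} = 0.33$$

$$\alpha_1 = \frac{1}{2} \ln\left(\frac{1 - 0.33}{0.33}\right)$$

$$\alpha_1 \approx 0.35$$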

Demo

Sequence classifiers by complexity
Use classifier j+1 if classifier j doesn’t meet a confidence threshold
Train cascading classifiers on instances the previous classifier is not confident about
Most examples classified quickly, harder ones passed to more expensive classifiers
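
The prediction side of a cascade can be sketched as a loop over classifiers ordered from cheapest to most expensive. This is an illustration under stated assumptions: each classifier is a function returning a (label, confidence) pair, and `cheap`, `expensive`, and the 0.8 threshold are hypothetical stand-ins:

```python
def cascade_predict(x, classifiers, threshold):
    """Try classifiers in order; stop at the first confident one.

    Returns (label, stage), where stage is the index of the classifier
    that produced the answer. If none is confident, the last one decides.
    """
    label, stage = None, -1
    for stage, classify in enumerate(classifiers):
        label, confidence = classify(x)
        if confidence >= threshold:
            break
    return label, stage


# Hypothetical stand-ins for a cheap and an expensive classifier.
def cheap(x):
    return ("positive" if x > 0 else "negative", min(abs(x) / 10, 1.0))


def expensive(x):
    return ("positive" if x > 0 else "negative", 1.0)


print(cascade_predict(9, [cheap, expensive], threshold=0.8))  # ('positive', 0)
print(cascade_predict(1, [cheap, expensive], threshold=0.8))  # ('positive', 1)
```

The easy input is settled by the first classifier, and only the low-confidence input reaches the second, more expensive one.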

Boosting and Cascading






Object detection/tracking
Collaborative filtering
Neural networks
Optical character recognition ++
Biometrics
Data mining

Ensemble methods are proven effective,
but why?