Transcript Document

Are we still talking about diversity
in classifier ensembles?
Ludmila I Kuncheva
School of Computer Science
Bangor University, UK
Search for “CLASSIFIER ENSEMBLE DIVERSITY” (10 Sep 2014):
580 publications, 4594 citations, 335 journals / conferences
MULTIPLE CLASSIFIER SYSTEMS 30
INT JOINT CONF ON NEURAL NETWORKS (IJCNN) 22
PATTERN RECOGNITION 17
NEUROCOMPUTING 14
EXPERT SYSTEMS WITH APPLICATIONS 13
INFORMATION SCIENCES 12
APPLIED SOFT COMPUTING 11
PATTERN RECOGNITION LETTERS 10
INFORMATION FUSION 9
IEEE INT JOINT CONF ON NEURAL NETWORKS 9
KNOWLEDGE-BASED SYSTEMS 7
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 7
INT J OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE 6
MACHINE LEARNING 5
IEEE TRANSACTIONS ON NEURAL NETWORKS 5
JOURNAL OF MACHINE LEARNING RESEARCH 5
APPLIED INTELLIGENCE 4
INTELLIGENT DATA ANALYSIS 4
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION 4
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING 4
NEURAL INFORMATION PROCESSING 4
Where in the world are we?
China 140
UK 68
USA 63
Spain 55
Brazil 41
Canada 32
Poland 28
Iran 23
Italy 19
...
France 11
Laurent HEUTTE
Professor of Computer
Science, University of Rouen,
France
Are we still talking about diversity in classifier ensembles?
Apparently yes...
That elusive diversity...
Classifier ensemble:
feature values (object description) → classifier 1, ..., classifier L → “combiner” → class label
That elusive diversity...
independent outputs ≠ independent errors
hence, use ORACLE outputs
                  Classifier 2
                  correct   wrong
Classifier 1
   correct           a         b
   wrong             c         d

b = number of instances labelled correctly by classifier 1 and mislabelled by classifier 2
That elusive diversity...
Pairwise diversity measures computed from the (a, b, c, d) table:
• Q statistic
• kappa
• correlation (rho)
• disagreement
• double fault
• ...
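As a rough sketch (my own illustration, not from the talk), these measures can be computed directly from the oracle counts a, b, c, d defined in the table above; the formulas are the standard ones for the Q statistic, correlation, disagreement, and double fault:

```python
# Pairwise diversity measures from the oracle contingency table:
# a = both correct, b = C1 correct / C2 wrong,
# c = C1 wrong / C2 correct, d = both wrong.
import math

def pairwise_diversity(a, b, c, d):
    m = a + b + c + d  # total number of instances
    q = (a * d - b * c) / (a * d + b * c)            # Q statistic
    rho = (a * d - b * c) / math.sqrt(
        (a + b) * (c + d) * (a + c) * (b + d))       # correlation
    dis = (b + c) / m                                # disagreement
    df = d / m                                       # double fault
    return {"Q": q, "rho": rho, "disagreement": dis, "double_fault": df}

# Two classifiers that agree on most instances:
print(pairwise_diversity(a=70, b=10, c=10, d=10))
```

Lower disagreement (and higher Q, rho, double fault) generally signals less diverse pairs.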
That elusive diversity...
SEVENTY-SIX pairwise diversity measures!!!
Do we need more “NEW” pairwise diversity measures?
Looks like we don’t...
Kappa-error diagrams
• proposed by Margineantu and Dietterich in 1997
• visualise individual accuracy and diversity in a 2-dimensional plot
• have been used to decide which ensemble members can be
pruned without much harm to the overall performance
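A minimal sketch (my own illustration, not from the talk) of how such a diagram's coordinates could be computed from oracle outputs, assuming the usual convention that each classifier pair (i, j) is plotted at x = kappa between the two outputs and y = the average of the two individual error rates:

```python
# Coordinates of a kappa-error diagram from oracle outputs
# (one row per classifier, 1 = correct on that object).
from itertools import combinations

def kappa2(o1, o2):
    # kappa between two oracle output vectors, via the 2x2 table
    m = len(o1)
    a = sum(x and y for x, y in zip(o1, o2))
    b = sum(x and not y for x, y in zip(o1, o2))
    c = sum(not x and y for x, y in zip(o1, o2))
    d = m - a - b - c
    observed = (a + d) / m
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / m ** 2
    return (observed - chance) / (1 - chance)

def kappa_error_points(oracle):
    # one (kappa, averaged error) point per pair of classifiers
    points = []
    for i, j in combinations(range(len(oracle)), 2):
        e_i = 1 - sum(oracle[i]) / len(oracle[i])
        e_j = 1 - sum(oracle[j]) / len(oracle[j])
        points.append((kappa2(oracle[i], oracle[j]), (e_i + e_j) / 2))
    return points
```

Points low and to the left (accurate, diverse pairs) are the desirable ones, which is why the diagram is useful for pruning.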
Example: kappa-error diagram (x-axis: kappa for a pair of classifiers, y-axis: pair-averaged error 𝑒𝑖𝑗)
sonar data (UCI): 260 instances, 60 features, 2 classes,
ensemble size L = 11 classifiers, base model – tree C4.5
Ensemble accuracies shown on the diagram:
Adaboost 75.0%, Bagging 77.0%, Random subspace 80.9%,
Random oracle 83.3%, Rotation Forest 84.7%

Kuncheva L.I., A bound on kappa-error diagrams for analysis of classifier ensembles, IEEE Transactions on
Knowledge and Data Engineering, 2013, 25 (3), 494-501 (DOI: 10.1109/TKDE.2011.234).
Kappa-error diagrams

                  C2
                  correct   wrong
C1   correct         a         b
     wrong           c         d

error: average of the two individual error rates
kappa = (observed agreement – chance agreement) / (1 – chance agreement)
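A small sketch of this kappa computation from the (a, b, c, d) counts, assuming the standard 2×2 chance-agreement formula; the function name `kappa` is mine:

```python
def kappa(a, b, c, d):
    m = a + b + c + d
    # observed agreement: fraction of instances where C1 and C2 agree
    observed = (a + d) / m
    # chance agreement: expected agreement if the outputs were independent
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / m ** 2
    return (observed - chance) / (1 - chance)

print(kappa(70, 10, 10, 10))   # mostly agreeing pair, kappa ≈ 0.375
print(kappa(50, 0, 0, 50))     # perfectly agreeing pair, kappa = 1.0
```

Kappa = 1 means identical outputs (no diversity); kappa near 0 means agreement at chance level; negative kappa means the pair disagrees more than chance would predict.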
Kappa-error diagrams: all (kappa, error) points lie above a tight lower bound in the plane.
Kappa-error diagrams – simulated ensembles, L = 3: the simulated pairs fill the whole region above the bound (kappa from -1 to 1, error from 0 to 1).
Kappa-error diagrams – real data, L = 11: the pairs occupy only part of the region above the bound (kappa from -1 to 0, error from 0 to 0.4).
Real data: 77,422,500 pairs of classifiers – the cloud of points leaves a visible gap above the bound: room for improvement.
Is there space for new classifier ensembles?
Looks like yes...
Good and Bad diversity
Diversity is not
MONOTONICALLY related to
ensemble accuracy
Good and Bad diversity

MAJORITY VOTE: 3 classifiers A, B, C; 15 objects; individual accuracy = 10/15 = 0.667
P = ensemble accuracy

• independent classifiers: P = 11/15 = 0.733
• identical classifiers: P = 10/15 = 0.667
• dependent classifiers 1: P = 7/15 = 0.467 (Bad diversity)
• dependent classifiers 2: P = 15/15 = 1.000 (Good diversity)
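The identical-classifiers and best-case dependent patterns above can be reproduced with a short sketch (my own illustration; the oracle outputs are hypothetical, chosen so that each classifier is correct on exactly 10 of 15 objects):

```python
# Majority vote over oracle outputs (1 = correct, 0 = wrong).
# Each row is a classifier, each column an object.

def majority_accuracy(votes):
    n_objects = len(votes[0])
    correct = sum(sum(col) > len(votes) / 2 for col in zip(*votes))
    return correct / n_objects

# Identical classifiers: same 10 objects right, same 5 wrong.
identical = [[1] * 10 + [0] * 5] * 3

# "Good" dependence: every object gets exactly 2 of 3 correct votes,
# so the majority vote is right on all 15 objects.
good = [
    [1] * 10 + [0] * 5,
    [0] * 5 + [1] * 10,
    [1] * 5 + [0] * 5 + [1] * 5,
]

print(majority_accuracy(identical))  # ≈ 0.667, no gain over individuals
print(majority_accuracy(good))       # 1.0, perfect ensemble
```

Same individual accuracies, very different ensemble accuracies: only the *pattern* of dependence changed.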
Good and Bad diversity

Are these outputs diverse? How about these? 3 vs 4... Can’t be more diverse, really...
[Figures: data set Z, one object 𝑧𝑖, an ensemble of L = 7 classifiers 𝐶1–𝐶7; on one slide 4 of the 7 classifiers label 𝑧𝑖 correctly, on the other only 3 do]

MAJORITY VOTE:
• 4 of 7 votes correct – the majority vote is correct: Good diversity
• 3 of 7 votes correct – the majority vote is wrong: Bad diversity
Good and Bad diversity

𝑙𝑖 – number of classifiers with correct output for 𝑧𝑖
𝐿 − 𝑙𝑖 – number of classifiers with wrong output for 𝑧𝑖
𝑝 – mean individual accuracy
𝑁 – number of data points

Brown G., L.I. Kuncheva, "Good" and "bad" diversity in majority vote ensembles, Proc. Multiple Classifier Systems (MCS'10), Cairo, Egypt, LNCS 5997, 2010, 124-133.
Decomposition of the Majority Vote Error

E_{maj} = (1 - p) \;-\; \frac{1}{NL} \sum_{z_i:\,\mathrm{maj\ correct}} (L - l_i) \;+\; \frac{1}{NL} \sum_{z_i:\,\mathrm{maj\ wrong}} l_i

individual error – GOOD diversity (subtracted) + BAD diversity (added)
Good and Bad diversity

• The object with 𝑙𝑖 = 4 correct votes (majority correct) contributes 𝐿 − 𝑙𝑖 = 7 − 4 = 3 to good diversity.
• The object with 𝑙𝑖 = 3 correct votes (majority wrong) contributes 𝑙𝑖 = 3 to bad diversity.
Note that the diversity quantity is 3 in both cases.
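A quick numerical sanity check of the decomposition (my own sketch; the data and the 0.6 individual accuracy are arbitrary choices):

```python
# Check that E_maj = (1 - p) - good_diversity + bad_diversity
# on randomly generated oracle outputs (True = correct vote).
import random

random.seed(0)
L, N = 7, 200  # ensemble size and number of data points
votes = [[random.random() < 0.6 for _ in range(N)] for _ in range(L)]

l = [sum(votes[j][i] for j in range(L)) for i in range(N)]  # l_i per object
maj = [li > L / 2 for li in l]          # is the majority vote correct?

E_maj = sum(not ok for ok in maj) / N   # majority vote error
p = sum(l) / (N * L)                    # mean individual accuracy
good = sum(L - l[i] for i in range(N) if maj[i]) / (N * L)
bad = sum(l[i] for i in range(N) if not maj[i]) / (N * L)

assert abs(E_maj - ((1 - p) - good + bad)) < 1e-12
print(E_maj, 1 - p, good, bad)
```

The identity holds for any oracle outputs (with odd L, so the majority vote is always decided), which is the point of the decomposition: the same "amount" of disagreement can either subtract from or add to the ensemble error.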
Ensemble Margin

The voting margin for object 𝑧𝑖 is the proportion of correct minus wrong votes:

m_i = \frac{l_i - (L - l_i)}{L}

For the object with 4 of 7 correct votes: m_i = (4 − (7 − 4))/7 = 1/7 – POSITIVE
For the object with 3 of 7 correct votes: m_i = (3 − (7 − 3))/7 = −1/7 – NEGATIVE
Ensemble Margin

Average margin:

m = \frac{1}{N} \sum_{i=1}^{N} m_i = \frac{1}{N} \sum_{i=1}^{N} \frac{l_i - (L - l_i)}{L}

Large m corresponds to BETTER ensembles... However, nearly all diversity measures are functions of the average absolute margin

|m| = \frac{1}{N} \sum_{i=1}^{N} |m_i|

or the average square margin

\overline{m^2} = \frac{1}{N} \sum_{i=1}^{N} m_i^2

Margin has no sign...
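A short sketch of the three margin averages, using hypothetical correct-vote counts 𝑙𝑖 (only the first two values, 4 and 3, come from the example above; the rest are mine):

```python
# Voting margins from per-object correct-vote counts l_i, L = 7.
L = 7
l = [4, 3, 7, 5, 2]   # hypothetical correct-vote counts for five objects

margins = [(li - (L - li)) / L for li in l]        # m_i, signed
mean_margin = sum(margins) / len(margins)          # keeps the sign
mean_abs_margin = sum(abs(m) for m in margins) / len(margins)
mean_sq_margin = sum(m * m for m in margins) / len(margins)

# The first two objects are those from the example: m = 1/7 and m = -1/7,
# yet |m| = 1/7 for both - the absolute margin loses the sign.
print(margins[0], margins[1])
```

This is exactly the problem: |m| and m² cannot tell a correct majority from a wrong one, so measures built on them cannot be monotonically related to accuracy.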
Ensemble Margin
Diversity is not MONOTONICALLY
related to ensemble accuracy
So, STOP LOOKING for a monotonic
relationship!!!
Conclusions
1. Beware! Overflow of diversity measures!
2. In theory, there is some room for better classifier ensembles.
3. Diversity is not monotonically related to ensemble accuracy, hence
larger diversity does not necessarily mean better accuracy.
Directly engineered or heuristic? – up to you