Dynamic Classifier Selection for Effective Mining from Noisy Data Streams
Xingquan Zhu, Xindong Wu, and Ying Yang
Proc. of KDD 2003
2005/3/25
Presenter: 董原賓
Problem
- Many existing data stream mining efforts are based on Classifier Combination techniques.
- Real data streams exhibit dramatic concept drift and a significant amount of noise, under which combining all base classifiers can be unreliable.

Solution
- For each test instance, choose the single most reliable classifier.
Multiple Classifier System (MCS)
- MCS assumption: each base classifier has a particular sub-domain in which it is the most reliable.
- Two categories of MCS integration techniques:
  - Classifier Combination (CC) techniques: all base classifiers are combined to work out the final decision, e.g., SAM (Select All Majority).
  - Classifier Selection (CS) techniques: the single best classifier is selected from the base classifiers to make the final decision.
Classifier Selection Techniques
Two types of CS techniques:
- Static Classifier Selection: the classifier is selected during the training phase, e.g., CVM (Cross Validation Majority).
- Dynamic Classifier Selection: the classifier is selected during the classification phase; it is called "dynamic" because the classifier used critically depends on the test instance itself, e.g., DCS_LA (Dynamic Classifier Selection by Local Accuracy).
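To make the contrast concrete, here is a minimal Python sketch (my own illustration, not code from the paper): SAM combines all base classifiers by majority vote, a CVM-style static selection fixes one classifier from held-out accuracy before testing, and a dynamic selection defers the choice until each test instance arrives. The function names and the accuracy callbacks are assumptions for illustration.

```python
from collections import Counter

def sam_combine(classifiers, x):
    # CC example (SAM): every base classifier votes; the majority label wins.
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]

def static_select_cvm(classifiers, eval_set, accuracy_of):
    # Static CS (CVM-style): pick one classifier once, by overall accuracy on a
    # held-out evaluation set, and use it for every test instance afterwards.
    return max(classifiers, key=lambda clf: accuracy_of(clf, eval_set))

def dynamic_select(classifiers, x, local_accuracy_of):
    # Dynamic CS: the choice depends on the test instance x itself, e.g. accuracy
    # in x's neighbourhood (DCS_LA) or in x's attribute-value subsets (AO-DCS).
    return max(classifiers, key=lambda clf: local_accuracy_of(clf, x))
```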
Definition
- Dataset D, training set X, test set Y and evaluation set Z.
- N_x, N_y and N_z denote the numbers of instances in X, Y and Z respectively.
- C_1, C_2, …, C_L are the L base classifiers learned from X.
- The selected best classifier C* is used to classify each instance I_x in Y.
Definition (cont.)
- The instances in D have M attributes A_1, A_2, …, A_M, and each attribute A_i contains n_i values V_1^{A_i}, …, V_{n_i}^{A_i}.
- For an attribute A_i, its values are used to partition Z into n_i subsets S_1^{A_i}, …, S_{n_i}^{A_i}, where S_1^{A_i} ∪ … ∪ S_{n_i}^{A_i} = Z.
- I_k^{A_i} denotes instance I_k's value on attribute A_i.
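As a concrete reading of this notation, the following sketch (my own, assuming instances are plain dicts keyed by attribute name) groups the evaluation set Z into one subset per value of a chosen attribute; continuous attributes such as Age or Height would first be discretized into ranges, as in the later example.

```python
from collections import defaultdict

def partition_by_attribute(Z, attribute):
    # Split Z into subsets S_1^{A_i}, ..., S_{n_i}^{A_i}: one subset per
    # distinct value that the instances take on the given attribute.
    subsets = defaultdict(list)
    for instance in Z:
        subsets[instance[attribute]].append(instance)
    return dict(subsets)
```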
Attribute-Oriented Dynamic Classifier Selection (AO-DCS)
Three steps of AO-DCS (see the sketch after this list):
- Partition the evaluation set into subsets by using the attribute values of the instances.
- Evaluate the classification accuracy of each base classifier on every subset.
- For a test instance, use its attribute values to select the corresponding subsets, and select the base classifier with the highest (average) classification accuracy on those subsets.
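The sketch below puts the three steps together under the same assumptions as before (instances as dicts, base classifiers as callables returning a label, a `true_label` accessor); all names are illustrative, not the paper's.

```python
from collections import defaultdict

def build_accuracy_table(classifiers, Z, attributes, true_label):
    # Steps 1-2: partition Z by each attribute's values and record every base
    # classifier's accuracy on every subset.
    acc = {}  # (classifier index, attribute, value) -> accuracy on that subset
    for attr in attributes:
        subsets = defaultdict(list)
        for inst in Z:
            subsets[inst[attr]].append(inst)
        for value, subset in subsets.items():
            for l, clf in enumerate(classifiers):
                correct = sum(clf(inst) == true_label(inst) for inst in subset)
                acc[(l, attr, value)] = correct / len(subset)
    return acc

def ao_dcs_classify(classifiers, acc, attributes, x):
    # Step 3: average each classifier's accuracy over the subsets matching x's
    # own attribute values, then let the best classifier label x.
    def avg_acc(l):
        scores = [acc[(l, a, x[a])] for a in attributes if (l, a, x[a]) in acc]
        return sum(scores) / len(scores) if scores else 0.0
    best = max(range(len(classifiers)), key=avg_acc)
    return classifiers[best](x)
```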
Partition By Attributes (example)

Evaluation set Z:

  Name     Gender   Age   Height
  Mary     Female   29    163
  Dave     Male     51    170
  Martha   Female   63    149
  Nancy    Female   35    157
  John     Male     18    182

Base classifiers: C1, C2, C3

Partition Z by attribute values:
- Gender: Male (S1G), Female (S2G)
- Age: < 30 (S1A), ≥ 30 (S2A)
- Height: ≤ 160 (S1H), 161~180 (S2H), ≥ 181 (S3H)

For example, partitioning by Gender gives S1G = {I_Dave, I_John} and S2G = {I_Mary, I_Martha, I_Nancy}; the instance I_Mary (Female, 29, 163) falls into S2G, S1A and S2H.
Evaluate the Classification Accuracy
For each attribute A_i, evaluate the L base classifiers on the subsets partitioned from Z, yielding a classification accuracy for every (base classifier, subset) pair.
Dynamic Classifier Selection (example)

Classification accuracy of each base classifier on each subset:

        S1G   S2G   S1A   S2A   S1H   S2H   S3H
  C1    0.8   0.5   0.6   0.4   0.2   0.4   0.6
  C2    0.4   0.7   0.6   0.3   0.5   0.9   0.8
  C3    0.6   0.9   0.3   0.5   0.7   0.8   0.4

Test instance Alex: Gender = Male, Age = 24, Height = 177, so its corresponding subsets are S1G, S1A and S2H.

  The accuracy of C1: AverageAcy[1] = (0.8 + 0.6 + 0.4) / 3 = 0.60
  The accuracy of C2: AverageAcy[2] = (0.4 + 0.6 + 0.9) / 3 ≈ 0.63
  The accuracy of C3: AverageAcy[3] = (0.6 + 0.3 + 0.8) / 3 ≈ 0.57

C2 has the highest average accuracy on Alex's subsets, so C2 is selected to classify Alex.
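As a quick check of the arithmetic, this snippet (the dictionary layout and variable names are mine) recomputes the averages over Alex's subsets and picks the winner.

```python
# Accuracy of C1-C3 on each subset (the table above).
acc = {
    "C1": {"S1G": 0.8, "S2G": 0.5, "S1A": 0.6, "S2A": 0.4, "S1H": 0.2, "S2H": 0.4, "S3H": 0.6},
    "C2": {"S1G": 0.4, "S2G": 0.7, "S1A": 0.6, "S2A": 0.3, "S1H": 0.5, "S2H": 0.9, "S3H": 0.8},
    "C3": {"S1G": 0.6, "S2G": 0.9, "S1A": 0.3, "S2A": 0.5, "S1H": 0.7, "S2H": 0.8, "S3H": 0.4},
}
alex_subsets = ["S1G", "S1A", "S2H"]  # Male, Age 24 (< 30), Height 177 (161~180)

averages = {c: sum(acc[c][s] for s in alex_subsets) / len(alex_subsets) for c in acc}
print(averages)                         # {'C1': 0.6, 'C2': 0.633..., 'C3': 0.566...}
print(max(averages, key=averages.get))  # 'C2' is selected to classify Alex
```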
Applying AO-DCS in Data Stream Mining
Steps:
- Partition the streaming data into a series of chunks S_1, S_2, …, S_i, …, each of which is small enough to be processed by the algorithm at one time.
- Learn a base classifier C_i from each chunk S_i.
Applying AO-DCS in Data Stream Mining (cont.)
- Evaluate all base classifiers (if the number of base classifiers grows too large, keep only the most recent K classifiers) and determine the "best" one for each test instance, as sketched below.
- Note: the evaluation set Z is constructed dynamically from the most recent instances, because they are the most likely to be consistent with the current test instances.
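A minimal wrapper for this streaming setting, under stated assumptions: the chunk iterator, `train`, and `select_and_classify` callables (e.g., an AO-DCS selector like the earlier sketch) are supplied by the caller, and `K` and `eval_size` are illustrative parameters rather than values from the paper.

```python
from collections import deque

def mine_stream(stream_chunks, train, select_and_classify, K=10, eval_size=1000):
    # Learn one base classifier per chunk, keep only the most recent K of them,
    # and keep the most recent instances as the evaluation set Z. Each incoming
    # instance is classified by the dynamically selected "best" classifier
    # before its chunk is used for training.
    classifiers = deque(maxlen=K)   # most recent K base classifiers
    Z = deque(maxlen=eval_size)     # evaluation set: most recent instances
    predictions = []
    for chunk in stream_chunks:
        if classifiers:
            predictions.extend(select_and_classify(list(classifiers), list(Z), x)
                               for x in chunk)
        classifiers.append(train(chunk))  # learn C_i from chunk S_i
        Z.extend(chunk)                   # refresh Z with the newest instances
    return predictions
```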
Experiment