1 ENTROPY-BASED CONCEPT SHIFT DETECTION PETER

Download Report

Transcript 1 ENTROPY-BASED CONCEPT SHIFT DETECTION PETER

ENTROPY-BASED
CONCEPT SHIFT
DETECTION
PETER VORBURGER,
ABRAHAM BERNSTEIN
IEEE ICDM 2006
1
Speaker: Li HueiJyun
Advisor: Koh JiaLing
Date:2007/11/6
OUTLINE
Introduction
 Entropy and Concept Shift Adaption

Calculating Entropy on Data Streams
 Algorithm Control Strategy using Entropy Measure

Experimental setup
 Experimental Results
 Discussion of the Experiments
 Application to a Real-World Problem: Context
Switches in Sensor Data
 Limitations, Future Work, and Conclusion

2
INTRODUCTION

Problem:
In many applications data is gathered over time,
which raises the problem that the concepts to be
learned may drift (i.e., change) over time.
 The increasing amount of data (e.g., multimedia
content, data warehouse ) and limitation of
computing power due to miniaturization (e.g.,
wearable computing) call for faster and more
resource friendly algorithms.


Motivation: the analysis of sensor data on
wearable devices.
3
INTRODUCTION

Context-awareness: A Scenario-based Approach
for Direct Interruptablity Prediction on Wearable
Devices
Classifiers predict peoples’ anticipated behavior
based on sensory input
 Contexts (or contextual situations) switch rather
than gradually change
 Contextual information could be reused, even for new,
not yet encountered situations


An ongoing monitoring of the sensor stream is
needed
4
INTRODUCTION

Problem:


online pattern matching mechanism comparing the
sensor stream to the entire library of already known
contexts is computational complex and not yet
suitable for today’s wearable devices.
Solution:

indicate possible candidates (or hot spots) for context
changes limiting the computationally intensive
context (re-)determination on those candidates.
5
ENTROPY AND CONCEPT SHIFT
ADAPTION

Assumptions:
As long as the distribution of older instances
(features and target values) is similar to the
distribution of new instances no concept drift
occurred
 A distribution difference between older and more
recent instances indicates a change in the target
concept


Measure the distribution inequality:
If two distributions are equal, the entropy measure
results in a value of 1
 If they are absolutely different the measure will
result in a value of 0

6
CALCULATING ENTROPY ON DATA
STREAMS

Sliding window technique: compares two
windows, one presenting older and the other
representing more recent instances in the stream
Compare the two windows by counting and
comparing all instances with respect to their class
and stream membership
 Discretize the range of instance values to a fixed
number of bins to take the approximate value
distribution into account

7
CALCULATING ENTROPY ON DATA
STREAMS

A data stream: a sequence consisting of
sequentially ordered tuples di in time ti
i  (1, 2, 3, …)
 di := ( si , li)



where si is the vector of all feature stream
instances sni at time ti
The domain of the label stream l is discrete and
contains all class values c  C
8
CALCULATING ENTROPY ON DATA
STREAMS
Hi: the resulting entropy at time ti and is defined
as the mean of all data stream entropies His at
time ti
C
B
1 S
 Hi 
 Hi s where Hi s    Hi scb

S
s 1
c 1 b 1
S: the number of feature-streams
 His is calculated from the entropies Hiscb

9
CALCULATING ENTROPY ON DATA
STREAMS

Hiscb: represent the entropy of each class (c C)
and bin (b B) given the stream s at time ti
Bins: discrete aggregation of the values of each
feature stream s

: the probability that an instance occurs in the
old window at time ti, belong to class c, with feature
domain of stream s in bin b
 wiscb: depend on i, s, c, b


10
ALGORITHM CONTROL STRATEGY
USING ENTROPY MEASURE

Instance selection style algorithm
11
EXPERIMENTAL SETUP
Real concept drifts: changes in the actual target
concepts
 Virtual concept drifts: changes in the distribution
 Generate synthetic data set:


H. Wang, W. Fan, P. S. Yu, and J. Han. Mining
concept-drifting data streams using ensemble
classifiers. In KDD ‘03: proceedings of the ninth
ACM SIGKDD international conference on Knowledge
discovery and data mining
Real drift data set
 Virtual drift data set
 Mixed data set

12
EXPERIMENTAL SETUP

Performance measure:
Accuracy
 The area under the ROC-curve


A representative set of benchmarks:
Perfect benchmark: assumes an oracle-given ideal
window size ξ for any point in time
 A selection of ensemble classifiers: the literature so
far showed to have the highest accuracy and
robustness against noise


P. Vorburger and A. Bernstein. Entropy-based
detection of real and virtual concept shifts.
Working Paper – University of Zurich,
Department of Informatics, 2006
13
EXPERIMENTAL RESULTS
14
EXPERIMENTAL RESULTS

The prediction quality against increasing noise
levels
15
EXPERIMENTAL RESULTS

Computational complexity


Compare ensemble classifiers and the entropy
measure based algorithm
Measure the elapsed time:
three committee classifiers: 2031.6±15s
 Entropy based algorithm: 148.6s
 Entropy calculation without Naïve Bayes model
building: 1.1±0.1s

16
DISCUSSION OF THE EXPERIMENT
Entropy measure outperforms the ensemble
benchmark algorithm on real concept shifts
 Exhibit a greater predictive power while
requiring less computational resources
 The entropy measure based algorithm showed
the nearly the same robustness towards noise as
the perfect benchmark and the committee
classifiers

17
APPLICATION TO A REAL-WORLD
PROBLEM: CONTEXT SWITCHES IN
SENSOR DATA

Data set:
Audio: decomposed into 10 features
 accelerometer data recorded over a time of 15381s:
merged in one single feature


The wearable data acquisition set up: a
microphone and three three-dimentional
accelerometers attached on the subject’s
shoulder, wrist, and leg
18
APPLICATION TO A REAL-WORLD
PROBLEM: CONTEXT SWITCHES IN
SENSOR DATA
(A)walking
 (D)lecture

(B)streetcar
(E)cafeteria
(C)office work
(F)meeting
19
LIMITATIONS, FUTURE WORK, AND
CONCLUSION
Gradual concept drifts
 Find boundary conditions
 Recognize recurring concepts and exploit this
information
 Generalizability
 The choice of the suitable parameters could be
optimized

20
LIMITATIONS, FUTURE WORK, AND
CONCLUSION
Find a measure for detecting and measuring
concept shifts as an analogue for context switches
 Formulation of entropy on data streams is
capable to detect and measure concept shifts
 Algorithm with an entropy based instance
selection strategy outperformed ensemble based
algorithms on real concept shift data sets
 Given algorithm robustness towards noise, its
sensitivity towards concept shifts, its
computational efficiency, and predictive power on
real concept shift data sets

21