Temporal Spatial Data Mining

Download Report

Transcript Temporal Spatial Data Mining

An Approach to Active Spatial Data Mining
Wei Wang
Data Mining Lab, UCLA
March 24, 1999
Outline
•
•
•
•
•
Introduction
Spatial Data Mining Triggers
Strategies
Performance
Conclusions
Introduction
• Huge amount of spatial data
are generated everyday.
Satellite
Satellite
Satellite
–
–
–
–
–
Earth Observing System
National Spatial Data Infrastructure
National Image Mapping Agency
one meter resolution data
digital earth
Users are usually interested in
the hidden information.
Satellite dish
Satellite dish
Satellite dish
– Aggregate information
– Clustering
– Patterns
Introduction
• Knowledge discovery processes are
computationally expensive.
• Today’s technology advances provide
necessary computing power to carry out
such complicated processes.
Spatial Data mining Triggers
• Since data evolves over time, interesting patterns are likely
to emerge or change.
• Goal: identify and find (most of) interesting patterns
• Problems:
– Knowledge discovery processes are expensive.
It is not feasible to re-process the entire data set for every change.
– Approach to periodically examine the data.
• Long delays
• Transient patterns might be missed
Natural solution: Usage of triggers.
Spatial Data mining Triggers
• Traditional database triggers can not be directly applied:
– Expressive power of traditional database triggers is limited,
especially in describing spatial relationships.
– Example: Trigger bandwidth reallocation when the size of a cluster
exceeds 20.
.. .... .. .
.. ...
..
.
.
.
.
.
.
Strategies
• STING+ was designed to introduce and support spatial
triggers efficiently.
• Observation (spatial locality): Only objects added to the
shaded area will contribute to the growth of cluster size at
this moment.
.. .... .. .
.. ...
..
.
.
.
.
.
.
Strategies
• STING+ Strategy: Monitor only the area occupied by
potential clusters and its neighborhood.
• Observation (cumulative effect): at least 4 more objects
are needed in order to make the cluster size be 20.
• STING+ Strategy: Space is organized in a hierarchy so
that updates can be suspended at some level in the
hierarchy until the cumulative effect might cause the
trigger to be fired.
Strategies
– Space is recursively divided into smaller rectangular
cells down to a specified granularity and is organized
via the inherit pyramid hierarchy.
.. .... .. .
.. ...
..
.
.. .... .. .
.. ...
.
.
. .
..
.
.
.
Level 1
. .
.
.
Level 2
Strategies
– STING+ decomposes a trigger into a set of sub-triggers
associated with individual cells in the hierarchical
structure to monitor the cumulative effect of data
changes within the cell.
..... . .
..... . .
. ... ...
. ... ...
..
..
Sub-trigger
.
.
on cell
Higher level
sub-trigger
on cell
.
.
.
.
.
Level 4
.
.
.
.
.
Level 3
Strategies
– Updates/insertions are suspended at some level in the
hierarchy until such time that the cumulative effect of
these insertions might cause the trigger condition to
become satisfied.
..... . .
..... . .
. ... ...
. ... ...
..
..
.
.
+ ++
+
+ ++
+
.
.
.
.
.
.
Level 0
.
.
.
.
Level 1
Strategies
........ .
.. ...
+ ++
+
........ .
.. ...
..
.
+ ++
+
.
.
.
.
.
.
.
Level 2
..
.
.
.
.
Level 3
No update of cluster !
Performance
• Comparison with periodic re-examination
– If the period is set to be less than 4000 updates,
STING+ consumes less CPU cycles.
– Significant delay and transient patterns misses
can occur for larger period.
• Not acceptable in many applications
– No delay and no transient patterns missed with
STING+.
Conclusions
• STING+ is an efficient approach to active
spatial data mining.
– Users can define triggers to monitor the change
of spatial data.
– Updates to the data are carefully monitored
– Evaluation of trigger is postponed until the
cumulative effect of data updates might cause
the trigger to be fired.