Spatio-temporal sequential pattern
Download
Report
Transcript Spatio-temporal sequential pattern
Mining Frequent Spatiotemporal Sequential Patterns
Authors: Huiping Cao, Nikos Mamoulis, David W. Cheung
Department of Computer Science, University of Hong Kong
Accepted to ICDM 2005
Abhinaya Sinha
Vijay Gandhi
CSCI 8715
Date: 4/3/06
Outline
Introduction
Problem Definition
Contribution
Approach
Experimental Evaluation
Assumptions
Rewrite today
2
Spatio-Temporal Pattern
Spatio-temporal sequence is
a list of time-sampled
locations.
(x1,y1,t1), (x2,y2,t2),…,(xn,yn,tn)
A pattern is a subset of the
sequence
Bus routes across 3 runs
A pattern is frequent if its
support >= threshold value
3
Motivation
Predict trajectory of an object
Detect frequent routes
followed by an object
Ecology
CHIMPS project at UofM
(http://www.discoverchimps.org)
Tracking chimpanzee
movements and analyzing
their behavior
Location based services
Where a user may go next
and thus, provide customized
services
4
Problem Definition
Input: Given spatio-temporal sequence S and
support threshold min_sup
Output: Frequent spatio-temporal patterns
Objective: Lower computation time and space
Constraints: Use of a device such as GPS to
sample locations
5
Contributions
Model for spatio-temporal sequential pattern mining
Algorithm for extracting frequent singular spatial
pattern elements a.k.a. spatial regions
Algorithm for combining single regions to patterns
of longer length
6
Methodology
Convert spatio-temporal sequence to set of line segments
i.e. abstract the trajectory
Detect frequent, singular regions
Combine singular patterns to longer sequences
7
Modeling the problem
Aim: abstract (approximate) the trajectory
Input: original spatio-temporal sequence
Output: sequence segments
Need
locations are not repeated exactly
Derived line segments are used for defining spatial regions
Compress original data and decrease effort required in
mining
Uses Douglas-Pecker (DP) algorithm for performing this
step
8
Concepts in Modeling
Spatio-temporal Segment
Representative line segment of a segment
Similarity between line segments
| l1.angle – l2.angle | <= θ
| l1.len – l2.len | <= length factor * max (
l1.len, l2.len)
Closeness between line segments
Distance (points belonging to line1, line2) <= ε
Mean line segment of a set of line segments
Spatial pattern element/ spatial region
Sides determined by mean line segment and
average orthogonal distance
Spatio-temporal sequential pattern
Ordered sequence of spatial pattern
elements
Support of a pattern
number of sequences supporting it
9
Discovering frequent singular
patterns
Input: All segments representing the spatiotemporal sequence, support threshold
Output: All segments have been assigned to
regions or been found as outliers.
This step constructs frequent single spatial regions
Uses similarity and closeness to assign segments
to regions
10
Deriving longer patterns
Goal is to derive longer patterns from the singular
frequent patterns obtained in previous step.
Input is series SR of spatial regions.
Output: Frequent patterns.
Two algorithms
Level-wise mining
Mining using substring tree
11
Example
Input: r1-r2-r1-r2-r3
minsup = 2
Output candidates
r1-r2 #2; r2-r1 #1; r2-r3 #1
r1
#2
r2 #2; r3 #1
Frequent Patterns: r1-r2; r1; r2
12
Level-wise pattern mining
Apriori Algorithm
e.g. input: r1-r2-r1-r2
r1(2) r2(2)
r1-r2(2) r2-r1(1)
Output: r1; r2; r1-r2
13
Contribution: Improvements
Connectivity constraint
Closeness property
Input: r1-r2-r1-r3-r1-r2
minsup: 2
r1(3)
r2(2)
r3(1)
r1-r2 r2-r1 r1-r3 r3-r1 r2r3 r3r2
14
Mining using substring tree
Tree structure with each node is pattern element
The substring tree is used to help in counting long
substrings
Depth first search gives substring
e.g. r1-r2-r1-r3-r1-r2
15
Validation Methodology
Computer simulations
Candidate Algorithms: Level-wise, Grid I, Grid II,
Substring tree
Evaluation done on
Real-world dataset
Synthetic dataset
16
Experimental Evaluation
Substring takes least time
17
Experimental Evaluation
Extracted patterns
Comparison with grid-based approach
18
Assumptions
Spatio-temporal sequence is a list of (x,y) values
Trajectory is abstracted by line segments
Closeness and similarity of line segments is defined
using 2-D data constructs
Hence is only applicable to 2-D data
19
Rewrite Today
Experimental evaluation on more real-life datasets
Account for noise in input data (inaccuracy of GPS
data)
Explore techniques that abstract the spatial
trajectory with higher accuracy
20
21