Spatio-temporal sequential pattern

Download Report

Transcript Spatio-temporal sequential pattern

Mining Frequent Spatiotemporal Sequential Patterns
Authors: Huiping Cao, Nikos Mamoulis, David W. Cheung
Department of Computer Science, University of Hong Kong
Accepted to ICDM 2005
Abhinaya Sinha
Vijay Gandhi
CSCI 8715
Date: 4/3/06
Outline







Introduction
Problem Definition
Contribution
Approach
Experimental Evaluation
Assumptions
Rewrite today
2
Spatio-Temporal Pattern

Spatio-temporal sequence is
a list of time-sampled
locations.


(x1,y1,t1), (x2,y2,t2),…,(xn,yn,tn)
A pattern is a subset of the
sequence
Bus routes across 3 runs

A pattern is frequent if its
support >= threshold value
3
Motivation


Predict trajectory of an object

Detect frequent routes
followed by an object
Ecology

CHIMPS project at UofM
(http://www.discoverchimps.org)
Tracking chimpanzee
movements and analyzing
their behavior
Location based services

Where a user may go next
and thus, provide customized
services


4
Problem Definition

Input: Given spatio-temporal sequence S and
support threshold min_sup

Output: Frequent spatio-temporal patterns

Objective: Lower computation time and space

Constraints: Use of a device such as GPS to
sample locations
5
Contributions

Model for spatio-temporal sequential pattern mining

Algorithm for extracting frequent singular spatial
pattern elements a.k.a. spatial regions

Algorithm for combining single regions to patterns
of longer length
6
Methodology
Convert spatio-temporal sequence to set of line segments
i.e. abstract the trajectory
Detect frequent, singular regions
Combine singular patterns to longer sequences
7
Modeling the problem

Aim: abstract (approximate) the trajectory

Input: original spatio-temporal sequence

Output: sequence segments

Need




locations are not repeated exactly
Derived line segments are used for defining spatial regions
Compress original data and decrease effort required in
mining
Uses Douglas-Pecker (DP) algorithm for performing this
step
8
Concepts in Modeling




Spatio-temporal Segment
Representative line segment of a segment
Similarity between line segments

| l1.angle – l2.angle | <= θ

| l1.len – l2.len | <= length factor * max (
l1.len, l2.len)
Closeness between line segments





Distance (points belonging to line1, line2) <= ε
Mean line segment of a set of line segments
Spatial pattern element/ spatial region

Sides determined by mean line segment and
average orthogonal distance
Spatio-temporal sequential pattern

Ordered sequence of spatial pattern
elements
Support of a pattern

number of sequences supporting it
9
Discovering frequent singular
patterns

Input: All segments representing the spatiotemporal sequence, support threshold

Output: All segments have been assigned to
regions or been found as outliers.

This step constructs frequent single spatial regions

Uses similarity and closeness to assign segments
to regions
10
Deriving longer patterns

Goal is to derive longer patterns from the singular
frequent patterns obtained in previous step.

Input is series SR of spatial regions.

Output: Frequent patterns.

Two algorithms
 Level-wise mining
 Mining using substring tree
11
Example




Input: r1-r2-r1-r2-r3
minsup = 2
Output candidates
r1-r2 #2; r2-r1 #1; r2-r3 #1
r1
#2
r2 #2; r3 #1
Frequent Patterns: r1-r2; r1; r2
12
Level-wise pattern mining
Apriori Algorithm
 e.g. input: r1-r2-r1-r2
r1(2) r2(2)
r1-r2(2) r2-r1(1)
 Output: r1; r2; r1-r2

13
Contribution: Improvements




Connectivity constraint
Closeness property
Input: r1-r2-r1-r3-r1-r2
minsup: 2
r1(3)
r2(2)
r3(1)
r1-r2 r2-r1 r1-r3 r3-r1 r2r3 r3r2
14
Mining using substring tree




Tree structure with each node is pattern element
The substring tree is used to help in counting long
substrings
Depth first search gives substring
e.g. r1-r2-r1-r3-r1-r2
15
Validation Methodology

Computer simulations

Candidate Algorithms: Level-wise, Grid I, Grid II,
Substring tree

Evaluation done on
 Real-world dataset
 Synthetic dataset
16
Experimental Evaluation
Substring takes least time
17
Experimental Evaluation
Extracted patterns
Comparison with grid-based approach
18
Assumptions

Spatio-temporal sequence is a list of (x,y) values

Trajectory is abstracted by line segments

Closeness and similarity of line segments is defined
using 2-D data constructs

Hence is only applicable to 2-D data
19
Rewrite Today

Experimental evaluation on more real-life datasets

Account for noise in input data (inaccuracy of GPS
data)

Explore techniques that abstract the spatial
trajectory with higher accuracy
20
21