Transcript Document

Shape-based Similarity Query for
Trajectory of Mobile Object
NTT Communication
Science Laboratories,
NTT Corporation, JAPAN.
Yutaka Yanagisawa
Jun-ichi Akahani
Tetsuji Satoh
“Rickshaw” (NARA/KYOTO)
2 / 31
Background
• The Recent technologies allow us to track moving
objects using highly accurate positioning devices.
• There are many applications using such location
information have been developed.
– Navigation Systems, Location-based Information Systems,
etc.
A Navigation System
Digital City Kyoto:
A Location-based Information System
3 / 31
Motion Pattern Analysis
• Motion pattern analysis is one of the most
interesting technologies of these applications.
– By analyzing their motion patterns, it is possible to extract
the behavioral characteristics of moving objects.
– The applications can predict the future behavior of the
moving objects using extracted characteristics.
• Single Motion Analysis
– focuses on the statistical characteristics of a moving object.
• Relative Motion Analysis
– focuses on the similarity between motion patterns.
We discuss the approach based on the similarity of trajectory
shapes because it is a simple and intuitive approach.
4 / 31
Similarity of Trajectory Shapes
• This approach is called shape-based approach.
An example of an information providing system.
B
Exhibition hall
The trajectories of visitors
are stored in a database.
C
It retrieves trajectories that
are similar to A …
A
Entrance
D
Show information
about B, C, and D.
Exit
It is possible to predict the
future route of the new
visitor.
5 / 31
Problem
• However, there are few database systems
which can search trajectories based on shapes.
– Many database systems retrieve moving objects
based on only “distance”.
Y
L
The minimum distance between L
and L2 is less than the distance
between L and L1.
L1
L2
D2
However, L2 is more similar in
shape to L than L1 intuitively.
D1
Not appropriate for shape-based approach
X
6 / 31
Our Approaches
• We propose a shape-based similarity query
for searching trajectories from moving object
databases.
• Moreover, we present an efficient indexing
method for retrieving moving objects based
on our proposed query.
Shape-based Similarity Query
8 / 31
Data Model for Trajectory
• In real world, a trajectory of a moving object can
be modeled as a continuous line in space.
– However, positioning devices can not track a moving
object continuously.
• In our work, a trajectory is stored as a sequence of
points (discrete line) in databases.
– This model is used as a popular data model.
In Real World
In Databases
9 / 31
A Similarity of Time Series Data
• The key idea have been proposed in the technique
for time series database.
• The similarity between two time series data is
defined as the Euclidean distance between the points
in n dimensional space.
W = <w1, w2, …, w9>
W’ = <w’1, w’2, …, w’9>
x
(n=9)
w9
D(W , W ' ) 
x
n
2
(
w

w
'
)
 i i
i 0
D(W, W’)
w1 w2
w’9
w’1
t1
t8
t
t
10 / 31
A Similarity of Time Series Data
• The key idea have been proposed in the techniques
for time series database.
• The similarity between two time series data is
defined as the Euclidean distance between the points
in n dimensional space.
Distance between W and W’ is
5
W = <2, 3, 4, 3>
W’ = <1, 1, 2, 3>
12  22  22  02
5
4
4
3
3
2
2
1
1
=3
0
2
1 2
11 / 31
A Similarity of Time Series Data
• The key idea have been proposed in the techniques
for time series database.
• The similarity between two time series data is
defined as the Euclidean distance between the points
in n dimensional space.
5
4
W = <2, 2, 2, 3>
W’ = <2, 2, 2, 3>
In this case, the distance is zero.
3
2
1
Smaller distance means higher similarity
12 / 31
A Similarity of Time Series Data
• The distance fits to intuitive similarity of
line shapes.
• There is an effective search algorithm to
calculate this distance.
• We will extend the similarity for trajectory
in 2 or more dimensional space.
Our Proposed Similarity
of Trajectories
• The similarity of trajectories can be defined as an
extension of the distance of time series data.
– The distance can be given as the following expression.
D( pi , p'i )  ( xi  x'i ) 2  ( yi  y'i ) 2
Y
p7
p’7
D (l , l ' ) 
l
l’
p1

p’1
n
2
D
(
p
,
p
'
)
 i i
i 1
X
13 / 31
Shape-based Similarity Query
for Trajectories
14 / 31
• We define a shape-based similarity query for
trajectories as a subsequence matching query.
– Because the length of trajectories are often difference.
SSQ(L, l, q)
L: A set of stored trajectories in database.
l: A trajectory to be compared.
q: The distance from l.
Answer La: A set of sub-trajectories
+
Stored trajectories
A given trajectory
Answer trajectories
Shape-based Similarity Query
for Trajectories
15 / 31
An Answer Sub-Trajectory
A Given Trajectory
- The database calculates the distance between
the given trajectory l and each sub-trajectory.
- If the distance is less than the given distance q,
the database adds the sub-trajectory to the
answer set of trajectories La.
Indexing
17 / 31
Approach
• The existing spatial structures are appropriate for
retrieving an object based on the distance.
– However, these structures have no method for searching
the data based on the similarity between trajectories.
We extend the spatial data structure for our proposed query.
An Efficient Calculation Process
for the Shape-based Similarity: 1
18 / 31
• The essential idea was presented as a PAA:
Piecewise Aggregate Approximation [Keogh01].
– PAA is an efficient method of approximating the time
series data for a similarity search.
Using the `average sequences’ of a sub-sequences.
x
x
(N=3)
W
W‘
w1
W
W'
w '1
t
w2
w3
w '2
w '3
t
An Efficient Calculation Process
for the Shape-based Similarity: 2
19 / 31
• The distance between the average sequences
is the lower bound of the distance between
the original two sequences.
D(W ,W ' )  D(W ,W ' )
x
D( W, W’)
D( W, W’)
W

W’
By comparing average sequences, we can know the lower
bound of the distance between
original sequences.
t
t
An Efficient Calculation Process
for the Shape-based Similarity: 3
20 / 31
• In the case of trajectories, the distance between
the center points of trajectories is the lower bound
of the distance between the original trajectories.
v1
Y
L
Y
The distance between these
center points is the lower
bound of the original distance.
v’1
L’
v7
v’7
Calculating
center points
X
X
Combination of PAA and
Spatial Data Structure: 1
21 / 31
• For making indexes, the database calculates
the center points of sub-trajectories.
– The length of each sub-trajectory must be fixed
to the system parameter N.
– In this example, N is four.
l
Y
Y
p8
l
p7
p3
p2
p6
p5
The center point of
from p5 to p8
p4
The center point of
from p1 to p4
p1
X
X
Combination of PAA and
Spatial Data Structure: 2
22 / 31
• Next, the database makes indexes to the points
using a traditional spatial data structure.
– Our implemented system makes an index to every
center point using R+-Tree.
The database can search objects based on the similarity of
trajectories using the spatial data structure.
Y
Normal R+-Tree X
Y
Y
X
X
Our Proposed Index Structure
23 / 31
Query Processing: 1
• When a SSQ(L,lQ,q) is given, the database
calculates the center point of lQ at first.
– Suppose that the length of stored center points
is fixed to 4 (N=4) in the following example.
If a query SSQ(L,lQ,q) is given..
Y
Y
pQ is the center point
of a given trajectory.
lQ
pQ
X
X
24 / 31
Query Processing: 2
• Next, the database searches stored points within
the distance q from the calculated point pQ using
the spatial data structure.
Y
An index tree (R+-Tree)
A
A
Candidate points
C
B
B
pQ
C
X
The region within the distance q from pQ
25 / 31
Query Processing: 3
• Finally, the database checks the distance between
a given trajectory lQ and each candidate trajectory.
• If the distance is less than a given threshold q, the
candidate trajectory is added to the answer set La.
If D(lQ , l1 )  q ,
l1 is added to La
Y
p2
Y
lQ
pQ
l1
p1
X
l1 is the original trajectory of p1.
Performance Study
27 / 31
Performance Study
• We conducted an experiment for evaluating
our proposed query and indexing method.
– Measuring the processing time for retrieving
trajectories required by a shape-based similarity
query.
• For this evaluation, two types of trajectories
are stored in a database.
– tracked by GPS and generated by a simulator.
We compared the processing time using both methods:
- Our indexing method,
- A spatial data structure (R+-Tree).
28 / 31
Trajectory Data: 1
• This is an example of trajectory data captured by
GPS receivers on rickshaws (in Nara city).
– Rickshaw is tour guide, they work in Nara / Kyoto.
A trajectory of a rickshaw in all day
A GPS Receiver (eTrex/GARMIN)
Rickshaw
29 / 31
Trajectory Data: 2
• This figure displays trajectories generated
by our implemented simulator.
The simulator can generate
trajectories
such that people walk on a plane
freely.
Velocity and direction of each
object are given as random
values. But the changes of these
values are slow and continuous.
30 / 31
The Result of the Experiment
The processing time to calculate 10 random queries is displayed:
Using R+-Tree
Using our index structure
Amount of Stored Points
20000
20000
18000
18000
16000
16000
14000
14000
12000
Time
10000
(msec)
8000
12000
Time
10000
(msec)
8000
6000
6000
4000
Points
2000
32000
0
4
8000
8
16
Length
32
64
128
256
4000
Points
2000
32000
0
4
8000
8
16
2000
Length
32
64
128
256
2000
Length of lQ (=N)
For retrieving longer trajectories from stored data,
our proposed method has high advantages to existing methods.
31 / 31
Conclusions
• We have proposed a shape-based similarity
query to find moving objects.
– Database users can find moving objects for
analyzing their motion patterns.
• Moreover, we have presented an effective
indexing method to search for the trajectories
required by our proposed queries.
– We demonstrated the advantage of our proposed
method to existing spatial data structures.
32 / 31
Future Work
• We will evaluate our proposed method
using these data.
[Human Tracking by using Laser Scanners]
- University of Tokyo
(Dr. Zhao and Prof. Shibasaki)
- Captured at Geoinformation Forum
Japan 2002 (32.096 people visited)
[Motion Capture Data]
- Tokyo University of Technology
(Creative Labo)
- 76 moving points on bodies (120fps)
- Playing football and judo