Transcript PPT
Extracting Mobility Statistics from
Indexed Spatio-Temporal Datasets
Yoshiharu Ishikawa
Yuichi Tsukamoto
Hiroyuki Kitagawa
University of Tsukuba
August 30, 2004
STDBM 2004 at Toronto
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectories
Proposed methods
naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Background
Moving object databases
Research issues
stores and manages information on a huge number of
moving objects
supports queries on moving trajectories and/or
moving status
spatio-temporal indexes
extraction of statistics (e.g., selectivities)
Statics in spatio-temporal databases
used for query optimization
also useful in mobility analysis
Our Approach
Objective: extracting mobility statistics from spatiotemporal databases
Target: trajectory data indexed using R-trees
Statistics to be extracted:Markov transition probability
target space is decomposed in cells
estimating transition probabilities between cells using the
indexed trajectory data
Features
search problem is formalized as constraint satisfaction problem
(CSP)
efficient processing using R-trees
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectories
Proposed methods
naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Markov Transition Probability (1)
Assumption: target space is decomposed in cells
Example 1: What is the estimated probability that an
object currently in cell c0 moves in cell c1 in a unit time
later?
c0
c1
A
t =τ
A
t =τ+1
First-order Markov transition probability Pr(c1|c0)
Markov Transition Probability (2)
Example 2: What is the probability that an object
which moves from c0 to cell c1 in a unit time
moves to cell c2 in the next unit time?
c0
c1
A
A
c2
A
t =τ
t =τ+1
t =τ+2
Second-order transition probability Pr(c2|c0, c1)
Extension to order-n Markov transition
probability Pr(cn|c0, …, cn-1) is easy
Markov Transition Probability
Conventional technique in traffic data analysis
Special kind of association rules
Upton & Fingleton, 1989 [13]
probability corresponds to the confidence factor
difference: existence of order
Usage
trajectory estimation
estimates where a moving object moves to in the next
period
simulation of movement status
given status of moving objects at t = , we can estimate the
change of the status at t = + 1, + 2, …
Assumptions
Movement patterns obeys stationary process
Cell decomposition
movement tendency does not change as time passes
each cell is a rectangle
cell size is arbitrary: non-uniform decomposition is
allowed
cell decomposition can be specified dynamically
Unit time length
unit time can be specified as arbitrary length (e.g.,
one minuite, 10 minuites, …)
but a unit time length should be a multiple of
sampling time length
Formalization of Probability (1)
Target data: trajectory data from t = 0 to t = T
Definition of first-order Markov transition probability
T 1
Pr( c1 | c0 )
| objs(c , t ) objs(c , t 1) |
t 0
0
T 1
| objs(c , t ) |
t 0
1
0
objs(ci, t): set of objects which were in cell ci at t
denominator: no. of objects which were in cell c0 at arbitrary t (0
≤ t ≤ T 1)
numerator: no. of objects each of which contained in
denominator and moved cell c1 at t + 1
Formalization of Probability (2)
Definition of order-n Markov Transition
Probability
T 1
Pr(cn | c0 ,, cn 1 )
n
|
i0 objs (ci , t i) |
t 0
T 1
n 1
|
i0 objs (ci , t i) |
t 0
denominator: no. of objects each of which was in cell
c0 at t (0 ≤ t ≤ T 1), in cell c1 at t + 1, …, and in cell cn
1 at t + n 1
numerator: no. of objects each of which is contained
in Dominator and moved cell cn at t + n
Generalized Transition Probability
Estimation Problem (1)
Given n + 1 cell sets
C0 {c0, 1 ,, c0, |C0 | },, Cn {cn , 1 ,, cn, |Cn| },
for each of arbitrary cell combinations
(c0 ,, cn ) C0 Cn ,
output Pr(cn|c0,…,cn-1)
Derives transition probability according to the
specified cell sets at once
Generalized Transition Probability
Estimation Problem (2)
Example: Given C0 = {c0, c1}, C1 = {c1, c2}, C2 =
{c1, c2, c3}, estimate second-order probabilities
c0
c1
c2
c3
Algorithm outputs 12 probabilities Pr(c1|c0, c1), Pr(c2|c0,
c1), …, Pr(c3|c1, c2)
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectories
Proposed methods
naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Indexing Methods for Trajectories
R-tree-based approach is assumed
Point-based representation: trajectories is
represented as a set of points
(d+1)-dimension R-tree is used (e.g., 3D R-tree)
incorporating temporal dimension
(d +1)-D R-tree-based Representation
x
x
root
b
1
5
6
3
B
a
4
c
2
0 1
A
0 1
2
3
4
5
6
2
3
4
5
6
7
root
7 8
(=T)
a
Sampling-based representation
1
2
b
3
c
4
5
6
8
(=T)
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectory data
Proposed methods
naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Naïve Algorithm (1)
Based on the definition of the Markov transition
probability
Example: Estimating Pr(c2|c0, c1)
Determine objs(c0, ) and objs(c1, + 1) using the R-tree
objs(ci, t): the set of objects which were in cell ci at time t
Take intersection of two sets; the cardinality of the intersection
is added to Scount
If the intersection is not empty objs(c2, + 2) is determined using
the R-tree
Take intersection of objs(c0, ), objs(c1, + 1) , objs(c2, + 2); the
cardinality of the result is added to Qcount
This process is repeated for each (0 ≤ ≤ T – n)
Calculate Pr(c2|c0, c1) based on Scount, Qcount
No. of search on R-tree is proportional to T
Naïve Algorithm (2)
Example: estimation of Pr c2 | c0 , c1
x
Qcount += 1
Output =
Qcount
Scount
cell c2
cell c1
cell c0
0
1
2
3
4
Scount += 1
5
6
7
Scount += 1
8
(=T)
No. of search
on R-tree
is proportional
to T
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectories
Proposed methods
naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Basic Idea (1)
Estimation of Pr(cn|c0, …, cn-1) based on three steps:
1.
2.
3.
Count the no. of objects which were in c0, …, cn-1 at each
unit time using an R-tree
Count the no. of objects which were in c0, …, cn at each
unit time using an R-tree
Compute Pr(cn|c0, …, cn-1) by [result of step 2] / [result of
step 1]
Benefits
step 1 & 2 can be processed using the same algorithm
algorithm for step 1 is given by setting n → n – 1
requires only two searches on R-tree
Basic Idea (2)
Example: estimation of Pr(c2|c0, c1)
x
Step 1: count
objects
which moved from
c0 to c1 within a
cell unit time
c2
Step 2: count objects
that moved as
cell c , c , c at each
c1 0 1 2
unit time
cell Step 3: compute
c0 probability
Qcount = 1
Pr(c2|c0, c1) = ―――――
Scount = 2
0
1
2
3
4
5
6
7
8 (= T )
Counting Using R-tree (1)
How can we compute no. of objects which were
in c0, …, cn at each unit time?
Idea: the problem is formalized as a constraint
satisfaction problem (CSP)
An object satisfying the constraint fulfills the
following constraints for some
it was in cell c0 at t =
it was in cell c1 at t = + 1
…
it was in cell cn at t = + n
Search objects that satisfy all n + 1 constraints
Counting Using R-tree (2)
Effective use of R-tree is necessary
We extend the CSP solution search method
using R-trees (Papadias et al, VLDB’98) [7]
considers spatial constraints
search CSP solutions from the root to leaves
Example: find all spatial objects x, y, z that satisfy
overlap(x, y) and north(y, z)
Use of pruning and backtracks
Reduce search space using constraints
enumerates all solutions with one R-tree access
Example of Counting (1)
x
root
For C0 = {c1}, C1 = {c1, c2},
C2={c2}, derive
probabilities for (C0, C1, C2)
b
1
5
6
3
c2 Derive two probabilities at
a
once
4
c
2
0
1
c1
2
3
4
5
6
7
(=T)
Pr(c2|c1, c1): the probability that
an object which have moved
as c1c1 next moves to c2
Pr(c2|c1, c2)
8
Example of Counting (2)
x
root
R-tree
b
1
root
5
6
3
c2
a
4
1
c1
2
3
4
b
c
c
2
0
a
5
6
7
(=T)
8
1
2
3
4
5
6
Pruning Method (1)
Pruning condition 1:
Movement between two R-tree
nodes which do not temporary
consecutive is impossible
x
b
c
a
Candidates can be deleted
0 1
2
3
4
5
6
7
8
(=T)
Example:
- movement such as a b and
b c are allowed
- movement a c is impossible
Pruning Method (2)
x
Pruning condition 2:
Trajectory is not contained
in the target cell
cell c1
0 1
2
3
4
5
6
7
8
(=T)
Example: When we are
counting for c1 c1, we
should consider only nodes
that overlaps with c1
Pruning Method (3)
x
Pruning condition 3:
If [max distance an object
can move] < [distance between
MBRs] then an object cannot
move from a node to next node
1
distance
between
MBRs
2
0 1
2
3
4
5
6
7
8
(=T)
Query Processing Example
x
tree
level
=2
root
a
cell c2
b
root
Targets:
c1 c1 c2
c1 c2 c2
root
c
cell c1
pruning
pruning
t
tree
level
=1
1
2
cell c2
cell c1
pruning
tree
level
=0
cell c2
cell c1
backtrack
AnThere
objectisthat
no
moved
as
objects that
c1
c1 as
c2
moved
is cfound
and
1 c1 c2
counted
c1 c2 c2
Outline
Background and objectives
Markov transition probability
Indexing method for moving trajectory data
Proposed methods
Naïve algorithm
CSP-based algorithm
Experimental results
Conclusions
Dataset (1)
Generated using the moving object simulator
made by Brinkoff [1]
Simulates car movement situation on actual city
road network
Oldenburg city, Germany (about 2.5km x 2.8km)
no. of initial moving objects: 5
5 objects are created in a minute
on average 100 objects are moving in the map at a
time
data is generated for T = 1000 minutes
120K points are stored in 3-D R-tree
Dataset (2)
Example for
estimating using 3 x 3 cells
c0
0
c3
0.183
c6
0.04
c1
0.081 c4
0.348 c7
0.10
c2
0.08
c5
0.01
c8
0.02
Experimental Result (1)
1
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0
T (minute)
00
0
T=
1
00
T=
9
00
T=
8
00
T=
7
00
T=
6
00
T=
5
00
T=
4
00
T=
3
T=
2
00
Naïve
CSP
00
T=
1
Map is decomposed into 30 x 30 cells
First-order Markov transition probabilities
Randomly 3 x 3 cells are selected
Ellapsed Time (second)
Experimental Result (2)
8
7 Naïve
6 CSP
5
4
3
2
1
T (minute)
00
0
T=
1
00
T=
9
00
T=
8
00
T=
7
00
T=
6
00
T=
5
00
T=
4
00
T=
3
00
T=
2
00
0
T=
1
Estimation of second-order transition probabilities
Other parameters are same to the former case
Ellapsed Time (second)
Experimental Result (3)
120
100
Naïve
CSP
80
60
40
20
T (minute)
T=
90
0
T=
10
00
T=
80
0
T=
70
0
T=
60
0
T=
50
0
T=
40
0
T=
30
0
T=
20
0
0
T=
10
0
Estimation of third-order transition probabilities
Other parameters are similar to the former case
Ellapsed Time (second)
Experimental Result (4)
The case when CSP-based approach is not effective
Target space is decomposed into 20 x 20 cells
Estimation of second-order transition probabilities
25
20
Since cell
decomposition
is coarse, the
pruning cannot
reduce
candidates
Naïve
CSP
15
10
5
T (minute)
T=
90
0
T=
10
00
T=
80
0
T=
70
0
T=
60
0
T=
50
0
T=
40
0
T=
30
0
T=
20
0
0
T=
10
0
Ellapsed Time (second)
Conclusions and Future Work
Conclusions
mobility statistics based on Markov transition
probability
proposals of two algorithms
naïve approach
CSP-based approach
CSP-based approach effectively utilizes R-tree
structure
Future Work
adaptive cell decompositions
extension to non-stationary Markov transitions