Transcript Week 2

Alla Petrakova
▪ Becoming familiar with Motion Pattern algorithms described in:
• Similarity Invariant Classification of Events by KL Divergence Minimization by Khokhar, Saleemi, Shah
• Scene Understanding by Statistical Modeling of Motion Patterns by Saleemi, Hartung, Shah
▪ Gathering a comprehensive list of state-of-the-art Trajectory Clustering methods used in Data Mining.
• 25 articles and counting
▪ Finding data sets used
▪ Finding code – if available
▪ Testing against motion pattern algorithm
▪ Clustering and data mining reading:
• Trajectory Clustering: A Partition-and-Group Framework by Lee, Han and Whang
TRACLUS and MoveMine
▪ Written by Lee, Han, Whang in 2007
▪ Serves as foundation for MoveMine set of works
▪ 357 citations
▪ Preciseness vs Conciseness
• Characteristic points – points where the behavior of a trajectory changes rapidly
• The optimal partitioning of a trajectory should possess two desirable properties: preciseness and conciseness. Preciseness means that the difference between a trajectory and a set of its trajectory partitions should be as small as possible. Conciseness means that the number of trajectory partitions should be as small as possible.
• MDL (Minimum Description Length) principle:
  ▪ L(H) – conciseness (cost of the hypothesis)
  ▪ L(D|H) – preciseness (cost of the data given the hypothesis)
▪ Distance formula:
dist(Li, Lj) = w⊥ · d⊥(Li, Lj) + w∥ · d∥(Li, Lj) + wθ · dθ(Li, Lj)
• d⊥, d∥ and dθ are the perpendicular, parallel and angle distances between line segments Li and Lj.
• Weights may differ depending on application. We will use w⊥ = w∥ = wθ = 1.
• Based on the Modified Line Hausdorff Distance from the paper “Noisy Logo Recognition Using Line Segment Hausdorff Distance”
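The three distance components can be sketched in Python. This is a minimal sketch, assuming the component definitions from the Lee, Han and Whang paper: perpendicular distance as a Lehmer mean of the two projection offsets, parallel distance as the smallest gap between a projection point and an end of the longer segment, and angle distance as ‖Lj‖·sin θ. The function and parameter names are illustrative, segments are 2-D point pairs, and the longer segment is assumed to have nonzero length.

```python
import math

def _project(p, s, e):
    """Project point p onto the infinite line through s and e."""
    dx, dy = e[0] - s[0], e[1] - s[1]
    t = ((p[0] - s[0]) * dx + (p[1] - s[1]) * dy) / (dx * dx + dy * dy)
    return (s[0] + t * dx, s[1] + t * dy)

def segment_distance(seg_a, seg_b, w_perp=1.0, w_par=1.0, w_theta=1.0):
    """Weighted sum of perpendicular, parallel and angle distance (2-D)."""
    # The longer segment plays the role of Li, the shorter one of Lj.
    li, lj = sorted((seg_a, seg_b), key=lambda s: math.dist(*s), reverse=True)
    (si, ei), (sj, ej) = li, lj
    ps, pe = _project(sj, si, ei), _project(ej, si, ei)

    # Perpendicular distance: Lehmer mean (order 2) of the two offsets.
    l1, l2 = math.dist(sj, ps), math.dist(ej, pe)
    d_perp = 0.0 if l1 + l2 == 0 else (l1 * l1 + l2 * l2) / (l1 + l2)

    # Parallel distance: smallest gap between a projection and an end of Li.
    d_par = min(math.dist(ps, si), math.dist(ps, ei),
                math.dist(pe, si), math.dist(pe, ei))

    # Angle distance: |Lj| * sin(theta), or |Lj| when theta is 90 degrees or more.
    len_i, len_j = math.dist(si, ei), math.dist(sj, ej)
    if len_j == 0:
        d_theta = 0.0
    else:
        dot = (ei[0] - si[0]) * (ej[0] - sj[0]) + (ei[1] - si[1]) * (ej[1] - sj[1])
        cos_t = max(-1.0, min(1.0, dot / (len_i * len_j)))
        d_theta = len_j if cos_t < 0 else len_j * math.sqrt(1.0 - cos_t * cos_t)

    return w_perp * d_perp + w_par * d_par + w_theta * d_theta
```

With all weights left at 1, as on the slide, two parallel segments one unit apart differ only in their perpendicular and parallel components.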
▪ MDL cost = L(H) + L(D|H)
• L(H) represents the sum of the (log-scaled) lengths of all trajectory partitions (conciseness)
• L(D|H) represents the sum of the differences between the trajectory and its trajectory partitions, measured by perpendicular and angle distance (preciseness)
• We need to find the optimal partitioning that minimizes L(H) + L(D|H). This is exactly the tradeoff between preciseness and conciseness.
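The MDL tradeoff can be illustrated with a greedy partitioning sketch in the spirit of the approximate algorithm in the Lee, Han and Whang paper: keep extending the current partition while replacing the points by one segment costs no more than keeping them all. Several details here are assumptions rather than the paper's exact method: code lengths are clamped with log2(max(x, 1)), distances are measured against the candidate partition segment, coordinates are assumed to be scaled so that typical segment lengths exceed 1, and the start and end of a candidate partition are assumed to be distinct points. All function names are illustrative.

```python
import math

def _perp_and_angle(start, end, a, b):
    """Perpendicular and angle distance of segment a-b from segment start-end."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    norm = math.hypot(dx, dy)  # assumed nonzero (start != end)
    # Offsets of a and b from the line through start and end.
    l1 = abs(dx * (a[1] - start[1]) - dy * (a[0] - start[0])) / norm
    l2 = abs(dx * (b[1] - start[1]) - dy * (b[0] - start[0])) / norm
    d_perp = 0.0 if l1 + l2 == 0 else (l1 * l1 + l2 * l2) / (l1 + l2)
    ux, uy = b[0] - a[0], b[1] - a[1]
    if dx * ux + dy * uy < 0:
        d_theta = math.hypot(ux, uy)               # angle of 90 degrees or more
    else:
        d_theta = abs(dx * uy - dy * ux) / norm    # |a-b| * sin(theta)
    return d_perp, d_theta

def _bits(x):
    # Assumption: clamp at 1 so that log2 is defined and non-negative.
    return math.log2(max(x, 1.0))

def mdl_par(pts, i, j):
    """L(H) + L(D|H) if pts[i..j] is replaced by the segment pts[i]-pts[j]."""
    cost = _bits(math.dist(pts[i], pts[j]))        # L(H)
    for k in range(i, j):                          # L(D|H)
        d_perp, d_theta = _perp_and_angle(pts[i], pts[j], pts[k], pts[k + 1])
        cost += _bits(d_perp) + _bits(d_theta)
    return cost

def mdl_nopar(pts, i, j):
    """L(H) if every original segment in pts[i..j] is kept; L(D|H) is zero."""
    return sum(_bits(math.dist(pts[k], pts[k + 1])) for k in range(i, j))

def characteristic_points(pts):
    """Greedy partitioning; returns the indices of the characteristic points."""
    cps, start, length = [0], 0, 1
    while start + length < len(pts):
        curr = start + length
        if mdl_par(pts, start, curr) > mdl_nopar(pts, start, curr):
            cps.append(curr - 1)       # cut at the previous point
            start, length = curr - 1, 1
        else:
            length += 1
    cps.append(len(pts) - 1)
    return cps
```

On an L-shaped trajectory the sketch keeps the two end points and the corner, which matches the idea of a characteristic point as a place where the behavior of the trajectory changes rapidly.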
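▪ Clustering:
• Based on DBSCAN
• Parameters common to TRACLUS and DBSCAN:
  ▪ ε – the maximum distance (neighborhood radius)
  ▪ MinLns – minimum number of line segments in a cluster (the analogue of DBSCAN's MinPts)
• Parameter unique to TRACLUS:
  ▪ Trajectory cardinality of a cluster, |PTR(Ci)|, where PTR(Ci) = {TR(Lj) | ∀Lj ∈ Ci} is the set of trajectories that contribute line segments to Ci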
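The grouping step can be sketched as a DBSCAN-style expansion over line segments, followed by the trajectory cardinality check. This is a minimal sketch under stated assumptions: `dist` is any segment distance supplied by the caller, neighborhoods are found by brute force, the cardinality threshold is taken to be MinLns, and the names `cluster_segments`, `traj_ids` and `min_lns` are illustrative.

```python
from collections import deque

NOISE = -1

def cluster_segments(segments, traj_ids, dist, eps, min_lns):
    """Return clusters as lists of segment indices (DBSCAN-style grouping)."""
    n = len(segments)
    labels = [None] * n                       # None means "unclassified"

    def neighborhood(i):
        # Brute-force eps-neighborhood; includes segment i itself.
        return [j for j in range(n) if dist(segments[i], segments[j]) <= eps]

    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighborhood(i)
        if len(seeds) < min_lns:
            labels[i] = NOISE                 # may later be absorbed by a cluster
            continue
        queue = deque()
        for j in seeds:
            if labels[j] is None and j != i:
                queue.append(j)
            if labels[j] is None or labels[j] == NOISE:
                labels[j] = cluster_id
        while queue:                          # expand through core segments
            m = queue.popleft()
            reachable = neighborhood(m)
            if len(reachable) < min_lns:
                continue
            for j in reachable:
                if labels[j] is None:
                    queue.append(j)
                if labels[j] is None or labels[j] == NOISE:
                    labels[j] = cluster_id
        cluster_id += 1

    clusters = {}
    for i, label in enumerate(labels):
        if label != NOISE:
            clusters.setdefault(label, []).append(i)
    # Drop clusters whose segments come from too few distinct trajectories.
    return [members for members in clusters.values()
            if len({traj_ids[i] for i in members}) >= min_lns]
```

The final filter is what separates this from plain DBSCAN: a dense group of segments that all come from one trajectory is discarded, because it does not describe behavior shared across trajectories.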
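▪ Parameter selection
• ε – chosen by simulated annealing (minimizing the entropy of the ε-neighborhood sizes)
• MinLns – average number of line segments in an ε-neighborhood at the optimal ε, plus 1 to 3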
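The parameter heuristic can be sketched as follows. The entropy is computed over the normalized ε-neighborhood sizes, H = −Σ p(x) log2 p(x) with p(x) = |Nε(x)| / Σ |Nε(y)|, and the ε with the lowest entropy is kept. This sketch swaps simulated annealing for a plain search over a caller-supplied list of candidate ε values, which is an assumption made for brevity; the function names are illustrative.

```python
import math

def neighborhood_stats(items, dist, eps):
    """Entropy of the eps-neighborhood size distribution and the mean size."""
    # Each neighborhood includes the item itself, so every size is at least 1.
    sizes = [sum(1 for b in items if dist(a, b) <= eps) for a in items]
    total = sum(sizes)
    entropy = -sum((s / total) * math.log2(s / total) for s in sizes)
    return entropy, total / len(items)

def pick_eps(items, dist, candidates):
    """Return (eps, mean neighborhood size) for the lowest-entropy candidate."""
    best = min(candidates, key=lambda eps: neighborhood_stats(items, dist, eps)[0])
    return best, neighborhood_stats(items, dist, best)[1]
```

A very small or very large ε makes all neighborhoods the same size, which gives a uniform distribution and maximal entropy; a useful ε produces skewed sizes. MinLns can then be set slightly above the returned mean neighborhood size.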
▪ Complexity
• O(n²), where n is the number of line segments
• Depending on organization and indexing of data (line segments), complexity can be reduced to O(n log n)
Testing against motion pattern algorithm
▪ Elk 1993:
• 33 trajectories
• 47,204 points
Used in the following papers:
Jae-Gil Lee, Jiawei Han, and Kyu-Young Whang. 2007. Trajectory clustering: a partition-and-group framework. In Proceedings of the ACM International Conference on Management of Data (SIGMOD), Beijing, China, 593-604. Cited by 357.

Elio Masciari. 2012. Finding homogeneous groups in trajectory streams. In Proceedings of the Third ACM SIGSPATIAL International Workshop on GeoStreaming (IWGS '12). ACM, New York, NY, USA, 11-18. DOI=10.1145/2442968.2442970 http://doi.acm.org/10.1145/2442968.2442970

Zhenhui Li, Jae-Gil Lee, Xiaolei Li, and Jiawei Han. 2010. Incremental clustering for trajectories. In Proceedings of the 15th International Conference on Database Systems for Advanced Applications, Part II (DASFAA '10), Hiroyuki Kitagawa, Yoshiharu Ishikawa, Qing Li, and Chiemi Watanabe (Eds.). Springer-Verlag, Berlin, Heidelberg, 32-46. DOI=10.1007/978-3-642-12098-5_3 http://dx.doi.org/10.1007/978-3-642-12098-5_3

Elio Masciari. 2009. A Complete Framework for Clustering Trajectories. In Proceedings of the 2009 21st IEEE International Conference on Tools with Artificial Intelligence (ICTAI '09). IEEE Computer Society, Washington, DC, USA, 9-16. DOI=10.1109/ICTAI.2009.31 http://dx.doi.org/10.1109/ICTAI.2009.31

Yu Zhang and Dechang Pi. 2009. A Trajectory Clustering Algorithm Based on Symmetric Neighborhood. In Proceedings of the 2009 WRI World Congress on Computer Science and Information Engineering, Vol. 3 (CSIE '09). IEEE Computer Society, Washington, DC, USA, 640-645. DOI=10.1109/CSIE.2009.366 http://dx.doi.org/10.1109/CSIE.2009.366

Jae-Gil Lee, Jiawei Han, Xiaolei Li, and Hector Gonzalez. 2008. TraClass: trajectory classification using hierarchical region-based and trajectory-based clustering. Proc. VLDB Endow. 1, 1 (August 2008), 1081-1094.

Jae-Gil Lee, Jiawei Han, and Xiaolei Li. 2008. Trajectory Outlier Detection: A Partition-and-Detect Framework. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering (ICDE '08). IEEE Computer Society, Washington, DC, USA, 140-149. DOI=10.1109/ICDE.2008.4497422 http://dx.doi.org/10.1109/ICDE.2008.4497422
[Figure: two panels labeled TRACLUS and UCF]
▪ Deer 1995:
• 32 trajectories
• 20,065 data points
Used in the same papers as the Elk 1993 data set (listed above).

[Figure: two panels labeled TRACLUS and UCF]

Swainson’s Hawks
• 43 trajectories
• 4,514 points
• Follows migration route
• Closest we have to ground truth
“Swainson's Hawks converged in eastern Mexico on the Gulf of Mexico coast. Southward, these hawks followed a narrow, well-defined path through Central America, across the Andes Mountains in Colombia, and east of the Andes to central Argentina where they all spent the austral summer. Swainson's Hawks northward migration largely retraced their southward route.”
Fuller, M.R., Seegar, W.S., Schueck, L.S., 1998. Routes and Travel Rates of Migrating Peregrine Falcons Falco peregrinus and Swainson's Hawks Buteo swainsoni in the Western Hemisphere. Journal of Avian Biology 29:433-440.
[Figure: two panels labeled TRACLUS and UCF]