Slides - Microsoft

Download Report

Transcript Slides - Microsoft

Trajectory Data Mining
Dr. Yu Zheng
Lead Researcher, Microsoft Research
Chair Professor at Shanghai Jiao Tong University
Editor-in-Chief of ACM Trans. Intelligent Systems and Technology
http://research.microsoft.com/en-us/people/yuzheng/
Paradigm of Trajectory Data Mining
Uncertainty
Privacy
Preserving
Reducing
Uncertainty
Traj. Pattern Mining
Moving
Freq. Seq.
Together
Patterns
Patterns
Periodic
Clustering
Patterns
Trajectory Indexing and Retrieval
Distance of
Query Historical
Trajectory
Trajectories
Trajectory
Classification
Trajectory
Outlier/Anomaly
Detection
Managing Recent
Trajectories
Trajectory Preprocessing Map-Matching
Stay Point Detection
Noise Filtering
Graph
Mining
Routing
Matrix
Analysis
TD
Compression
MF
Segmentation
CF
Matrix
Spatial
Trajectories
Spatial
Trajectories
Spatial
Trajectories
Tensor
Graph
Yu Zheng. Trajectory Data Mining: An Overview. ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.
Trajectory Classification
• Differentiate between trajectories (or its segments) of different status:
–
–
–
–
motions
transportation modes
human activities
……
• Applications
–
–
–
–
trip recommendation
life experiences sharing
context-aware computing
……
Trajectory Classification
• General Steps:
– Divide a trajectory into segments using segmentation methods.
Sometimes, each single point is regarded as a minimum inference unit
– Extract features from each segment (or point)
– Build a model to classify each segment (or point)
• Some models
– Dynamic Bayesian Network (DBN)
– HMM and Conditional Random Field (CRF)
Learning Transportation Modes Based on GPS
Trajectories
• Goal & Results: Inferring transportation modes from raw GPS data
– Differentiate driving, riding a bike, taking a bus and walking
– Achieve a 0.75 inference accuracy (independent of other sensor data)
GPS
log
Users
Infer model
Learning Transportation Modes Based on
GPS Trajectories
• Motivation
– For users:
• Reflect on past events and understand their own life pattern
• Obtain more reference knowledge from others’ experiences
– For service provider:
• Classify trajectories of different transportation modes
• Enable smart-route design and recommendation
• Difficulty
– Velocity-based method cannot handle this problem well (<0.5 accuracy)
– People usually transfer their transportation modes in a trip
– The observation of a mode is vulnerable to traffic condition and weather
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Learning Transportation Modes Based on
GPS Trajectories
• Contributions and insights
– A change point-based segmentation method
• Walk is a transition between different transportation modes
• Handle congestions to some extent
– A set of sophisticated features
• Robust to traffic condition
• Feed into a supervise learning-based inference model
– A graph-based post-processing
• Considering typical user behavior
• Employing location constrains of the real world
• WWW 2008 (first version)
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Architecture
Online Inference
Test Data
Segmentation
Extracting
Feature
Inference
Model
Offline Learning
Training Data
Segmentation
Extracting
Feature
Model Training
Graph Building
Knowledge
Extraction
Spatial
Knowledge
Post-Processing
Trans.
Modes
Change Point
Clustering
Spatially Indexed
Knowledge
Spatial Indexing
Walk-Based Segmentation
• Commonsense knowledge from the real world
– Typically, people need to walk before transferring transportation modes
– Typically, people need to stop and then go when transferring modes
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Walk-Based Segmentation
• Change point-based Segmentation Algorithm
–
–
–
–
Step 1: distinguish all possible Walk Points, non-Walk Points.
Step 2: merge short segment composed by consecutive Walk Points or non-Walk points
Step 3: merge consecutive Uncertain Segment to non-Walk Segment.
Step 4: end point of each Walk Segment are potential change points
Forward
Backward
Bus
Walk
Car
(a)
(b)
Certain Segment
Car
3 Uncertain Segments
Certain Segment
(c)
Denotes a non-walk Point:
P.V>Vt or P.a>at
Denotes a possible walk point: P.V<Vt and P.a<at
Feature Extraction (1)
• Features
Category Features
Basic
Features
Significance
Dist
Distance of a segment
MaxVi
The ith maximal velocity of a segment
MaxAi
The ith maximal acceleration of a segment
AV
Average velocity of a segment
EV
Expectation of velocity of GPS points in a segment
DV
HCR
Variance of velocity of GPS points in a segment
Heading Change Rate
Advanced
SR
Features
VCR
Stop Rate
Velocity Change Rate
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Feature Extraction (2)
• Our features are more discriminative than velocity
– Heading Change Rate (HCR)
– Stop Rate (SR)
– Velocity change rate (VCR)
– >65 accuracy
Velocity
Vs
a) Driving
Distance
Velocity
p1. head
p1
ad
. he
p2
H1
p1.V1
Vs
p2.
V2
p3
b) Bus
Distance
Velocity
p2
L1, T1
Vs
c) Walking
Distance
Graph-Based Post-Processing (1)
• Using location-constraints to improve the inference performance??
Crossroad
Bus Stop
Traffic Light
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Graph-Based Post-Processing (2)
• Transition probability between different transportation modes
– P(Bike|Walk) and P(Bike|Driving)
Transition P(Walk|Driving)
Ground Truth
Inference result
Transition P(Bike|Walk)
Segment[i-1]: Driving Segment[i]: Walk
P(Driving):
P(Bus):
P(Bike):
P(Walk):
75%
10%
8%
7%
P(Bike): 40%
P(Walk): 30%
P(Bus):
20%
P(Driving): 10%
Segment[i+1]: Bike
P(Bike): 62%
P(Walk): 24%
P(Bus):
8%
P(Driving): 6%
Segment[i].P(Bike) = Segment[i].P(Bike) * P(Bike|Car)
Segment[i].P(Walk) = Segment[i].P(Walk) * P(Walk|Car)
Graph-Based Post-Processing (3)
• Mine a implied road network from users’ GPS logs
– Use the location constraints and typical user behaviors as probabilistic cues
– Being independent of the map information
(1) Change points and
start/end points
(2) Building Graph
N1
N2
N7
N6
A start or end point
N5
A change point
(4) Probability calculation
N1
N8
N5
N4
P85(Mi)
P18(Mi)
P54(Mi)
P185(Mi|Mj)
N3
N8
P854(Mi|Mj)
(3) Spatial indexing
N1
N7
N5
P581(Mi|Mj)
P458(Mi|Mj)
N6
M={Driving, Walk, Bike, Bus},
E.g., P(M0) = P(Driving); P(M3|M1)= P(Bus | Walk);
N2
N8
N3
N4
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Graph-Based Post-Processing (4)
Segments of a GPS trajectory
Search spatial index
N
Found in graph ?
Y
Y
Features X
Labeled data
Inference model
Knowledge Mining
Is maxProb > T1?
N
Normal postprocessing
N
Is maxProb < T2?
Prior probability-based enhancement
Output the mode
with maxProb as
result
Transition
probability-based
enhancement
Posterior Probability
P(mi | X)
Posterior Probability
P(mi | Eij)
Prior Probablity
P(mi)
Final Results: P(mi | X, Eij)= P(mi | X) P(mi | Eij) / P(mi)
Output the mode with
maxProb as result
End
Yu Zheng, et al. Understanding Mobility Based on GPS Data. UbiComp 2008
Thanks!
Yu Zheng
[email protected]
Homepage
Yu Zheng. Trajectory Data Mining: An Overview.
ACM Transactions on Intelligent Systems and Technology. 2015, vol. 6, issue 3.