FastDTW_Presentation - fastdtw

Download Report

Transcript FastDTW_Presentation - fastdtw

FastDTW: Toward Accurate
Dynamic Time Warping in Linear
Time and Space
Department of Computer Sciences
Florida Institute of Technology
Stan Salvador and Philip Chan
Outline
•
•
•
•
•
•
•
Dynamic Time Warping (DTW)
Problem Statement
Related Work for Speeding up DTW
FastDTW Algorithm
Evaluation of FastDTW
Contributions
Limitations and Future Work
Dynamic Time Warping (DTW)
• Aligns two time series by warping the time
dimension
• Warping - expanding/contracting the time
dimension
Time
The Dynamic Time Warping
Algorithm
• A dynamic programming approach
• Solutions to slightly smaller problems used
to find larger solutions
The DTW Cost Matrix
Time
Time Series Y
|Y|
j
1
1
|X|
i
Time Series X
Time Series X
Time
Distance of Min-Cost Warp Path
Time
Time Series Y
|Y|
j
1
1
|X|
i
Time Series X
Time Series X
Time
Finding Min-Cost Warp Path
Time
Time Series Y
|Y|
j
1
1
|X|
i
Time Series X
Time Series X
Time
min[ D(i  1, j ), D(i, j  1), D(i  1, j  1)]
Advantages of DTW
• DTW is optimal
• An intuitive distance measurement
• Local variation in the time axis is common
– Handwriting
– Speech
– “Events” that start after varying delays
Disadvantages of DTW
•
•
•
•
O(N2) time and space complexity
Only practical for small data sets (<3,000)
Time series are often very long
Data mining requires a scalable DTW
algorithm
Problem Statement
• We desire an efficient Dynamic Time
Warping algorithm
– Linear time complexity
– Linear space complexity
– Warp path is needed in addition to warp
distance
– Warp path must be nearly optimal
Does DTW Need to be Faster?
“Myth 3: There is a need (and room) for
improvements in the speed of DTW for data
mining applications.”
(Keogh today-9:45am)
• Keogh: many time series
• FastDTW: Long time series
Existing Methods to Speed Up
DTW
1. Constraints – only fill in part of the cost
matrix
2. Abstraction – sample the data before time
warping
Constraints
Sakoe-Chiba Band Itakura Parallelogram
(Sakoe & Chiba 1978)
(Itakura 1975)
•
Still O(N2) if the window width is a function of input size
(linear if the width is constant)
•
Assumes a near-optimal warp path stays near the i=j axis
•
Accuracy depends on the domain
Abstraction
1/1
1/5
1/1
(Keogh & Pazzani 2000), (Chu et al. 2002)
• O(N) if N pts are sampled down to ≤ N
• Assumptions
– Sampling preserves time series structure
– Small deviations from the optimal path cause
little increase in warp-path distance
Our FastDTW Algorithm
•
•
A multi-resolution approach inspired by a multilevel graph bisection algorithm (Karypis 1997)
3 key operations
1. Coarsening – reduce
the resolution of a
time series
2. Projection – use a
low-res warp path as
an initial solution at a
higher resolution
3. Refinement – Refine
a projected warp path
locally adjusting the
path
Sample Run of FastDTW
1/8
1/4
1/2
1/1
FastDTW Algorithm
1. Set the resolution to be the coarsest
2. Find the initial path using regular DTW
3. Repeat
a. Double the resolution
b. Project the path onto the finer resolution
c. Find a path through the projected area (plus a
small radius around the projected area)
4. Until the original resolution is reached
Complexity
• O(N) time
• O(N) space
• Details in the paper
Evaluation Criteria
• Accuracy
The error of an approximate Time Warping algorithm:
% error = approxDist  optimalDis t 100
where:
optimalDis t
approxDist – the warp path distance of the approximate algorithm
optimalDist – the warp path distance of the DTW algorithm
• Efficiency
Runtime (measured in seconds)
Evaluation Procedure
(Accuracy)
•
Data Sets – UCR Time Series Data Mining
Archive (Keogh & Folias 2002), 3 groups used:
1. Random – 45 unrelated time series (earthquakes,
random walk, eeg, speech, etc.)
2. Trace – 200 time series simulating nuclear power
plant failure (4 classes)
3. Gun – 200 time series of a gun being drawn and
pointed (2 classes)
•
Procedure
1. Run FastDTW, Constraints (Sakoe-Chiba Band), and
Data Abstraction on all pairs within a data set group,
also vary the radius
2. Record the average error of all three methods for a
group of data and a radius
Average % Error
(Accuracy)
Radius
FastDTW
Abstraction
Band
0
1
10
20
30
19.2%
8.6%
1.5%
0.8%
0.6%
983.3%
547.9%
6.5%
2.8%
1.8%
794.1%
136.8%
9.3%
2749.2% 2385.7%
Error in Different Data Sets
Accuracy of FastDTW, Bands, and Data Abstraction
FastDTW-Random
Abstraction-Random
Band-Random
100%
90%
FastDTW-Trace
Abstraction-Trace
Band-Trace
FastDTW - Gun
Abstraction-Gun
Band-Gun
80%
70%
Error
60%
50%
40%
30%
20%
10%
0%
0
5
10
15
radius
20
25
30
Evaluation Procedure
(Execution-time)
•
Data Sets
–
–
•
Synthetic sine waves with Gaussian noise
10 to 180,000 data points
Procedure
1. Run FastDTW and DTW on each data set,
vary the radius for FastDTW
2. Compare the Execution times
Execution Time
Execution Time of FastDTW on Large Time Series
DTW
5000
FastDTW (radius=100)
4500
FastDTW (radius=20)
4000
FastDTW (radius=0)
Time (seconds)
3500
3000
2500
2000
1500
1000
500
0
0
20,000
40,000
60,000
80,000
100,000
Length of Time Series
120,000
140,000
160,000
180,000
Summary of Contributions
• FastDTW – an approximation of DTW
– O(N) time and space complexity
– Scales well to long time series
– Accurate, 8.6% error if radius=1, 0.8% error if
radius=20
Limitations and Future Work
• Limitations
– FastDTW does not always find an optimal solution
• Future Work
– Examine using different step sizes between resolutions
– Investigate search algorithms to help improve
refinement
– Examine # of cells evaluated vs. accuracy between the
FastDTW, Abstraction, and Band algorithms.
Questions?
Thanks to those who helped with this research:
Matt Mahoney (Florida Institute of Technology),
Brian Buckley, Walter Schiefele (Interface & Control Systems)
This research is partially supported by NASA
FastDTW Pseudocode
Input:
Output:
1|
2|
3|
4|
5|
6|
7|
8|
9|
10|
11|
12|
13|
14|
15|
16|
17|
18|
19|
20|
21|
X, Y, radius
1) A minimum distance warp path between X and Y
2) The warped path distance between X and Y
// The min size of the coarsest resolution.
Integer minTSsize = radius+2
IF (|X|≤ minTSsize OR |Y|≤ minTSsize)
{
// Base Case: for a very small time series run the full DTW algorithm
RETURN DTW(X, Y)
}
ELSE
{
// Recursive Case: Project the warp path from a coarser resolution onto the current current resolution.
//
Run DTW only along the projected path (and also radius cells from the projected path).
TimeSeries shrunkX = X.reduceByHalf() // Coarsening
TimeSeries shrunkY = Y.reduceByHalf() // Coarsening
WarpPath lowResPath = FastDTW(shrunkX, shrunkY, radius)
SearchWindow window = ExpandedResWindow(lowResPath, X, Y, radius)
RETURN DTW(X, Y, window)
}
// Refinement
// Projection