Adaptive Ordering of Pipelined Stream Filters
Download
Report
Transcript Adaptive Ordering of Pipelined Stream Filters
Adaptive Ordering of
Pipelined Stream Filters
S. Babu, R. Motwani, K. Munagala, I.
Nishizawa, and J. Widom
In Proc. of SIGMOD 2004, June 2004
Outline
Introduction
Ordering of filters
Adaptive ordering algorithm
Preliminaries
Algorithms
Experimental evaluation
Conclusion
Introduction
Why “Stream”?
Many modern applications deal with data that
Why “Adaptive”?
Is updated continuously
Needs to be processed in real-time
Arrival characteristics of streams may vary significantly
over time
Characteristic of filters
Direct
Commutative
Ordering of filters
Challenges
Selectivities across filters may be correlated.
An exhaustive algorithm becomes infeasible.
Greedy approach
Arrival characteristics of streams may vary
significantly over time
Adaptive approach
Adaptive ordering algorithm
Three-way tradeoff
Run-time overhead
Convergence properties
Monitor the changes of statistics
Determine when to change the current order
If data and filter characteristics stabilize, adaptive algorithm
should converge to a solution with desirable properties
Speed of adaptivity
The time the convergence takes when data and filter
characteristics change
Preliminaries
symbol
meaning
query
input stream
filters
stream tuple
ordering
conditional drop probability
Processing time per tuple
total cost
Algorithms
A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
(Static) Greedy algorithm
Greedy approach
A-GREEDY Profiler
A-GREEDY Reoptimizer
violation
A-GREEDY Algorithm (1/2)
A-GREEDY Algorithm (2/2)
A-GREEDY properties
Convergence
Good
Run-time overhead
Profile-tuple creation
Profile-window maintenance
Matrix-view update
Detection and correction of GI violations
Adaptivity
rapidly
Algorithms
A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The SWEEP Algorithm
Rotating
over
SWEEP properties
Convergence
Run-time overhead
Good
A-Greedy:
Sweep:
Adaptivity
Violations may remain undetected for a relatively long time
Up to
stages
Algorithms
A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The INDEPENDENT Algorithm
Assume the filters are independent
frequently in database literature
seldom true in practice
INDEPENDENT properties
Convergence
independent
dependent
can be
times worse than the GI ordering
Run-time overhead
converge to the optimal ordering
lower than A-Greedy
Adaptivity
rapidly
Algorithms
A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The LOCALSWAPS Algorithm
Detect
a swap between adjacent filters in
improve performance
would
LOCALSWAP properties
Convergence
Run-time overhead
dependent on the way characteristics change
in some case,
times higher cost than GI ordering
lower than A-Greedy
Adaptivity
may take more time to converge
may get stuck in a local optima
Experimental evaluation
Three parts
Convergence experiments
Run-time overhead experiments
Adaptivity experiments
Convergence experiments
Factors
Number of filters
Filter selectivities
Cost of filters
Correlation among filters
Comparison
Optimal
A-Greedy
Independent
Convergence experiments (number)
Factors
Number of filters
Convergence experiments (selectivity)
Factors
Filter selectivities
Convergence experiments (correlation)
Factors
Correlation among filters
Run-time overhead experiments
Factors
Number of filters
Filter selectivities
Cost of filters
Comparison
Optimal
A-Greedy
Sweep
LocalSwaps
Independent
Run-time overhead experiments (time)
Factors
Spending time
≥ 98%
Run-time overhead experiments (number)
Factors
Number of filters
Run-time overhead experiments (cost)
Factors
Cost of filters
Adaptivity experiments
Factors
Rate of change
Cost of filters
Comparison
A-Greedy
Sweep
LocalSwaps
Independent
Adaptivity experiments (speed)
Adaptivity experiments (rate)
Factors
Rate of change
Adaptivity experiments (cost)
Tradeoff
Run-time overhead ↔ adaptivity
Conclusion
Different points along the tradeoff spectrum
Fast Adaptivity
A-Greedy
Low run-time overhead
Slow adaptivity, good convergence
Independent filters
Sweep
Independent
Swap between adjacent filters, unpredicted convergence
LocalSwaps
Appendix
stable
correlation factor Γ
filters are divided into
Each contains Γ filters
groups
Filters in
different groups
independent
the same groups
80% the same result of input tuples
most correlated
completely independent
Example 1
Example 2
For
Total cost = 20
,
Input
1
2
1
4
7
2
5
4
total
Cost
4
3
4
2
1
3
1
2
20
Example 6
1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …
INDEPENDENT
permutation of
+
cost:
1
1
50
2
2
51
…
…
…
A-GREEDY
49
49
99
one of
100
cost:
+
+permutation of others
Example 7
1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …
Before,
After,
A-GREEDY
x
x
x
x
x
+ permutation of others
cost:
/100 ?
LOCALSWAP
x
x
cost:
/100 ?