Adaptive Ordering of Pipelined Stream Filters

Download Report

Transcript Adaptive Ordering of Pipelined Stream Filters

Adaptive Ordering of
Pipelined Stream Filters
S. Babu, R. Motwani, K. Munagala, I.
Nishizawa, and J. Widom
In Proc. of SIGMOD 2004, June 2004
Outline







Introduction
Ordering of filters
Adaptive ordering algorithm
Preliminaries
Algorithms
Experimental evaluation
Conclusion
Introduction

Why “Stream”?

Many modern applications deal with data that



Why “Adaptive”?


Is updated continuously
Needs to be processed in real-time
Arrival characteristics of streams may vary significantly
over time
Characteristic of filters


Direct
Commutative
Ordering of filters

Challenges

Selectivities across filters may be correlated.

An exhaustive algorithm becomes infeasible.


Greedy approach
Arrival characteristics of streams may vary
significantly over time

Adaptive approach
Adaptive ordering algorithm

Three-way tradeoff

Run-time overhead



Convergence properties


Monitor the changes of statistics
Determine when to change the current order
If data and filter characteristics stabilize, adaptive algorithm
should converge to a solution with desirable properties
Speed of adaptivity

The time the convergence takes when data and filter
characteristics change
Preliminaries
symbol
meaning
query
input stream
filters
stream tuple
ordering
conditional drop probability
Processing time per tuple
total cost
Algorithms




A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
(Static) Greedy algorithm


Greedy approach
A-GREEDY Profiler
A-GREEDY Reoptimizer

violation
A-GREEDY Algorithm (1/2)
A-GREEDY Algorithm (2/2)
A-GREEDY properties

Convergence
 Good


Run-time overhead
 Profile-tuple creation
 Profile-window maintenance
 Matrix-view update
 Detection and correction of GI violations

Adaptivity
 rapidly
Algorithms




A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The SWEEP Algorithm

Rotating
over
SWEEP properties

Convergence


Run-time overhead



Good
A-Greedy:
Sweep:
Adaptivity


Violations may remain undetected for a relatively long time
Up to
stages
Algorithms




A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The INDEPENDENT Algorithm

Assume the filters are independent




frequently in database literature
seldom true in practice
INDEPENDENT properties

Convergence

independent


dependent


can be
times worse than the GI ordering
Run-time overhead


converge to the optimal ordering
lower than A-Greedy
Adaptivity

rapidly
Algorithms




A-GREEDY algorithm
The SWEEP algorithm
The INDEPENDENT algorithm
The LOCALSWAP algorithm
The LOCALSWAPS Algorithm

Detect



a swap between adjacent filters in
improve performance
would
LOCALSWAP properties

Convergence



Run-time overhead


dependent on the way characteristics change
in some case,
times higher cost than GI ordering
lower than A-Greedy
Adaptivity


may take more time to converge
may get stuck in a local optima
Experimental evaluation

Three parts




Convergence experiments
Run-time overhead experiments
Adaptivity experiments
Convergence experiments

Factors





Number of filters
Filter selectivities
Cost of filters
Correlation among filters
Comparison



Optimal
A-Greedy
Independent
Convergence experiments (number)

Factors

Number of filters
Convergence experiments (selectivity)

Factors

Filter selectivities
Convergence experiments (correlation)

Factors

Correlation among filters
Run-time overhead experiments

Factors




Number of filters
Filter selectivities
Cost of filters
Comparison





Optimal
A-Greedy
Sweep
LocalSwaps
Independent
Run-time overhead experiments (time)

Factors

Spending time
≥ 98%
Run-time overhead experiments (number)

Factors

Number of filters
Run-time overhead experiments (cost)

Factors

Cost of filters
Adaptivity experiments

Factors



Rate of change
Cost of filters
Comparison




A-Greedy
Sweep
LocalSwaps
Independent
Adaptivity experiments (speed)
Adaptivity experiments (rate)

Factors

Rate of change
Adaptivity experiments (cost)

Tradeoff

Run-time overhead ↔ adaptivity
Conclusion

Different points along the tradeoff spectrum

Fast Adaptivity


A-Greedy
Low run-time overhead

Slow adaptivity, good convergence


Independent filters


Sweep
Independent
Swap between adjacent filters, unpredicted convergence

LocalSwaps
Appendix
stable
correlation factor Γ



filters are divided into
Each contains Γ filters
groups
Filters in
 different groups


independent
the same groups

80% the same result of input tuples


most correlated

completely independent

Example 1
Example 2


For
Total cost = 20
,
Input
1
2
1
4
7
2
5
4
total
Cost
4
3
4
2
1
3
1
2
20
Example 6
1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …
INDEPENDENT
permutation of
+
cost:
1
1
50
2
2
51
…
…
…
A-GREEDY
49
49
99
one of
100
cost:
+
+permutation of others
Example 7
1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …
Before,
After,
A-GREEDY
x
x
x
x
x
+ permutation of others
cost:
/100 ?
LOCALSWAP
x
x
cost:
/100 ?