Randomized Algorithms for optimizing large join queries

download report

Transcript Randomized Algorithms for optimizing large join queries

Randomized Algorithms for
optimizing large join queries
Algorithms
• Iterative improvement
– start at a random state
– move to random neighbor with less cost until at
local minimum
• Simulated annealing
– start at a random state
– start at initial “temperature”
– move to a random neighbor
• if neighbor is cheaper, move
State space
• all possible execution strategies of a query
form a state space
• a strategy is a join processing tree
• differences between strategies
– join order
– join method
• each point is connected to other points
• neighbors are a single change in strategy:
– join method
Evaluation
•
•
•
•
•
•
tree queries and star queries
5-100 joins
3 types of relation catalogs
II is okay
SA is bad at first, good in the long run
2PO is best of both worlds
Analysis:
• Query size:
– small queries (5-10) show no difference
• Variance in catalog parameters
– differences increase as size and variety of
relations increase
– relation size has more impact than selectivity
State Space analysis
• It’s a cup/ well