IBM blue-and-white template
Download
Report
Transcript IBM blue-and-white template
®
IBM Software Group
Recommending Materialized Views and Indexes
with the IBM DB2 Design Advisor
(Automating Physical Database Design)
Jarek Gryz
IBM Software Group | DB2 Information Management Software
Agenda
Motivation
Indexes in DB2
Materialized query tables in DB2
Problem definition
How does the DB2 Design Advisor tool work ?
Experiments
IBM Software Group | DB2 Information Management Software
Motivation
IBM Software Group | DB2 Information Management Software
Why have an index?
Performance, Performance, Performance
Provides order
for example : Joins, GROUP BY, ORDER BY, DISTINCT
Limits I/O and data retrieved due to filtering with predicates
Range of values (start/stop keys)
Join predicates
Provides index-only access
Enforces uniqueness or other constraints
Provides statistics useful to the optimizer for cardinality estimation
for example: statistics on number of keys
IBM Software Group | DB2 Information Management Software
Why have Materialized query tables ?
The MQT feature is a powerful feature in DB2 that allows you to precompute
and materialize a query result into a table
Full refresh or incremental refresh possible
Subsequently it allows similar queries to automatically use the precomputed
data from the MQT to improve performance
IBM Software Group | DB2 Information Management Software
Problem Definition
Given:
Workload information
System configuration
Database characteristics
Determine:
An Index and MQT set that will
•lead to good workload performance
•in a reasonable or specified maximum time
•considering disk space and maintenance constraints
•and be easy to use
IBM Software Group | DB2 Information Management Software
The DB2 Design Advisor
Automatically capture :
A representative query workload (potentially compressing it to reduce its size)
The existing database characteristics and environment
System information
Determines:
An Index and MQT set that will lead to good ESTIMATED workload response time
•
•
•
•
•
Using DB2's Query Rewriter/Optimizer to suggest candidates
Using DB2's Optimizer to provide cost / benefit information
Using a combinatorial algorithm to perform a cost-benefit analysis observing
constraints of (1) advisor execution time, (2) disk space and (3) anticipated DB2
costs of creating the entities plus overhead during INSERT / UPDATE / DELETE
activity.
Using sampled or estimated statistics of new entities
Providing both GUI and command line options for initiating
IBM Software Group | DB2 Information Management Software
Issues in automating physical DB design selection
When to initiate the design algorithm?
Health monitor with health indicators such as number of sort overflows
to initiate the advisor
What data to use to make the decision?
Automatically capture workload, DB, and system information
Allow work on real data or just statistics
How to make the decision?
Method to be described
How are the recommendations implemented?
Little user interaction to ask if or when to initiate to gain DBA trust
Online methods to reduce implementation cost
•
E.g., online index creation
IBM Software Group | DB2 Information Management Software
The various steps within the Advisor
IBM Software Group | DB2 Information Management Software
Index candidate generation
During optimization generate virtual candidates when:
Predicate exists but no index (e.g., R.A > 5 or R.A=S.B)
Ordering required
Uniqueness required
Winning candidates are the virtuals in the final optimized query plan
Provides candidates we know the optimizer will use
IBM Software Group | DB2 Information Management Software
MQT Candidate Generation
Candidates are generated from original queries, logical views and common
expressions which are formed by matching multiple queries.
Uses multiquery optimization (MQO)
Provides candidates we know the queries will use
Candidates can contain table references in a federated DB (tables on different servers)
IBM Software Group | DB2 Information Management Software
Combinatorial search algorithm
The search phase uses a knapsack algorithm and random swap method
to choose the recommended index and MQT set
Requires each candidate to have a cost-benefit ratio (cbratio)
Benefit based on estimated cost with and without MQT usage (updates have
negative benefit)
Cost based on disk space usage
REFRESH DEFERRED or IMMEDIATE MQTs recommended.
Assumption (DEFERRED):
•
•
estimated time for population = full refresh cost
one refresh cost included in the calculation
IMMEDIATE changes added in plans for insert/update/deletes
If indexes on candidate MQT are selected, then the MQT must be
selected as well
IBM Software Group | DB2 Information Management Software
Experiments
Detect what MQO candidates adds to performance improvement
OLAP DB and workload
Workload estimated execution time (WET)
Type of MQT Selection
WET without MQTs
WET with MQTs
% diff in WETs
Num. of MQTs
MQTs from queries
493.7 seconds
353.0 seconds
28.5%
7
493.7 seconds
352.0 seconds
28.4%
4
MQTs from MQO
IBM Software Group | DB2 Information Management Software
Autonomic capabilities in DB2 Stinger
• Configuration Advisor
• Design Advisor
• advises: Indexes, MDCs,
MQTs, Partitioning
Selfconfiguring
Self-healing
Self-optimizing
• Automated Table Maintenance
• Runstats
• Reorg
• Statistics profiling
• Health Monitor
• Recommendation Advisor
• Automatic page write integrity checking
• Automatic index reorganization
• Recovery Expert
• Fault Monitor
• Backup
• Self-tuning
• Automated
• HADR
• DB2/Websphere Integration
• log and trace analyzer
• Query compiler
• query rewrite
• cost based optimization
• Automatic query parallelism degree
• Self-configuring/optimizing utilities
• Adaptive utility throttling
• Runstats
• Performance Expert
• Query patroller workload manager
• Self-tuning load
IBM Software Group | DB2 Information Management Software