Transcript Document
Chapter 6: Database Evolution
• Title: AutoAdmin “What-if” Index Analysis Utility
• Authors: Surajit Chaudhuri, Vivek Narasayya
• ACM SIGMOD 1998
AutoAdmin “What-if” Index Analysis Utility
• Problem
– Problem Statement
– Why is this problem important?
– Why is this problem hard?
• Approaches
– Approach description, key concepts
– Contributions (novelty, improved)
– Assumptions
Problem Statement – Index Selection
• Given
– Database, i.e. tables
– A Database Management System, i.e. SQL Server
• Find
– Index Selection Tool to compare alternate indexes
• by estimated processing cost of workload queries
– Tools to define workload queries, alternative indices, …
• Objectives
– Assist database administrators, i.e. reduce manual effort
• Constraints
– Avoid physical alternation to Databases
– Avoid scan of large tables to create statistics for query optimizer
Why is this problem important?
• Total cost of ownership
– Database administration is a significant component
– Annual salaries compare with cost of software and hardware
• Thus, DBA time is valuable
– Tools to assist DBA produce significant savings
• DBA perform many services
– Performance tuning via Index selection
– Other, e.g. recovery, upgrade, …
Why is this problem Hard?
• Index selection is intrinsically a hard search problem.
– A large # of possible single and multi-column indexes
– Various types of indexes
– Variety of usage by query optimizer
• e.g., indexed-only access
• Trade-off between read and update queries
– Cost of update, insert, delete, bulk load may go up!
– Impact analysis is required
Novelty of Contribution
• Limitations of Related Work
– Lack of impact analysis
– Reliance of large physical changes to databases
• Ex. ‘views’ in simulating hypothetical database [Stonebraker]
• Very computation intensive
• Contributions
– Built ‘index selection’ & ‘index analysis’ utility.
(This paper more focuses on index analysis utility.)
– Hypothetical index structures
– Enables a large class of analysis at low cost.
• Query optimizer to evaluate indexes
• Sampling to collect statistics
Basic Concepts
• Ideas for modeling
– ‘Workload’ = a set of SQL statements
– ‘Configuration’ = a set of indexes
– Hypothetical Configuration Analysis (HCA)
• Summary analysis on simulation results
• Ideas for efficient implementation
– Simulate a hypothetical configuration
• estimate the cost of queries in the workload
– Sampling to estimate statistics needed by optimizer
Key Concepts - Simulating Configurations
• Simulating Hypothetical
Configurations to Estimate
– (a) the cost of queries in the
workload: is relative to the
total cost of the workload in
the current configuration.
– (b) usage of indexes:
represents the indexes that
are expected to be used by
the server to answer the
query if the hypothetical
configuration exists
Key Concepts - Sampling Strategy
• Used for creating
hypothetical indexes.
– Accuracy – Figure 6
– Cost – Figure 5
• Details - adaptive
page-level sampling
algorithm
– Psuedo-code (fig.. 3)
Key Concepts - Architecture
Validation Methodology
• Prototyped index analysis utility
– for Microsoft SQL Server 7.0
• Case Study
– Workload – TPC-D – Decision
Support benchmark
– Usage analysis
• Figure 10. Usage of each index in
configuration for the workload
• Figure 11. – Distribution of selection
conditions on a given table
– Comparison of two configuration by
cost
• Figure 13 – For 10 most expensive
queries
• Figure 14 – By SQL Statement types
Examples of Summary Analysis
• Distribution of workload by SQL type
• Distribution of conditions over tables
Summary
• Paper’s focus
– Index Analysis Utility
• Ideas
– Simulating hypothetical indexes can estimate the cost of queries
and usage of indexes.
• Contributions
– The first of its kind, index analysis utility with low cost
– Presented efficient mechanism for implementation
• Analytical Validation
– Case study with Microsoft SQL Server 7.0
Assumptions, Rewrite today
• Assumptions
– The simulation represents the real behavior of database system.
• Rewrite today
– Compare with newer methods
• DB2 Design Advisor
– Experimental evaluation
• To measure the efficiency of the index analysis utility