Cache Tables: Paving the way for an Adaptive Database Cache

Download Report

Transcript Cache Tables: Paving the way for an Adaptive Database Cache

Cache Tables: Paving
the way for an Adaptive
Database Cache
Mehmet Altınel, Christof Bornhövd, C. Mohan,
Hamid Pirahesh, Berthold Reinwald
(IBM Almaden Research Center)
Sailesh Krishnamurthy
(Computer Science Division,UC Berkeley)
Presented by:
Umar Farooq Minhas
October 04, 2006
Motivation

Wide-spread use of Transactional Web Applications (TWA) in enterprise
applications


Broad range of components e.g. network load balancers, HTTP servers,
application servers, … , databases etc.
Issues


Response time
Scalability

Solutions


Caching of static HTML pages
Multiple level caches
2
Motivation contd..

Static Caching, Drawbacks




Run business logic in remote application servers close to end users




TWAs tend to be more & more dynamic
High volumes of data
Highly personalized contents
Reduced response time
Reduced load on in-house systems
Benefits are limited by the frequency with which remote server needs
to access backend DB
Proposed Solution: DBCache

Allows DB caching at mid-tier nodes, remote data centers and edge
servers
3
DBCache: Overview

Built using full-fledged DBMS, DB2



Reduced development effort
Allows caching of related DB objects
 Triggers, constraints, indices , stored procedures, …
Makes use of existing distributed query execution

Provides cache transparency

Supports both full-table and partial-table caching

On-demand caching


Adapts to dynamically changing loads
Exploits typical characteristics of TWA queries
4
DBCache: Contributions

Database cache model


Introduces a new DB object ‘Cache Table’
Dynamic/static caching support

Novel query re-write scheme

Cache load and maintenance mechanisms
5
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
6
Cache Tables


A Cache Table is a database object by which an end user can specify
that a table (cache table) in a database (cache database) is a cache of
a table (backend table) in another database (backend database)
Cache
Table
Back
end
Table
Cache DB
Backend DB
Two types of cache tables supported:


Declarative/Static Cache tables
Dynamic Cache tables
7
Declarative/Static Cache Tables

When table contents static and known upfront




Use declarative cache tables
Similar to materialized views
Entire table cached in absence of predicate definition
Exploits existing materialized view support in DB2
8
Dynamic Cache Tables



Populated on-demand
Provides adaptability
Can choose to cache only “hot” items
9
DBCache Schema Setup




Cache schema exact mirror of backend DB schema
Each backend DB table represented by

Cache Table or

Nickname (caching disabled)
Requires no change in existing queries
Allows caching of other relevant logical and physical objects
10
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
11
Dynamic Cache Model

Key concepts

Cache Keys
 Defined on cache table column
 Can be non-unique
 Must be ‘domain-complete’


Unique/Primary key columns complete by definition
Guarantees correctness of equality predicates
12
Dynamic Cache Model

Key concepts contd..

Referential Cache Constraints (RCCs)
 Defined between any cols of two cache tables
 Creates a cache-parent/cache-child relationship
 Guarantees the correctness of equi-join predicates
 Somewhat similar to referential integrity constraints
13
Dynamic Cache Model

Key concepts contd..

Cache Groups
 Set of related cache tables whose content is (directly or transitively)
populated by the values of one or more cache keys of a single cache
table, called the root table.
 Tables reachable by RCC constraints from the root table are called
member tables
 Advantages
 Application context recognized more easily
 Helps avoiding conflicting cache constraints
14
Dynamic Cache Model

Key concepts contd..

Cache Groups contd..
 Represented by a directed graph called cache group graph, nodes
denote cache tables and edges denote RCCs
 Direction of an edge for RCC is from a cache-parent to a cache-child
 Bi-directional edges possible
 Two or more groups can be overlapping
 Captured in connectivity graphs
15
Dynamic Cache Model

Issues with Cache Constraints



Can cause unexpected cache loads resulting in a phenomena called
recursive cache load problem
A cache group is called safe if it avoids this problem
How to ensure group safety ?
16
Dynamic Cache Model

Rules for cache group safety

Rule-1: A cache group graph must not include any heterogeneous cycles.

Rule-2: A cache table must not have more than one non-unique domaincomplete column.

A new cache constraint is created only if it doesn’t violate Rule 1 and Rule 2.
17
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
18
Query Compilation

Declarative Cache Tables



Existing materialized view matching mechanism in DB2 is exploited
Name switching
Dynamic Cache Tables




Generate two plans local plan and remote plan
Choose at run-time through a switch operator which uses the probe query to
decide which leg to execute
Janus (two-headed) plan: derived from Roman Mythology
God of gates, doors, doorways, beginnings and endings. Month of January ?
19
http://en.wikipedia.org/wiki/Janus_%28mythology%29
Query Compilation

Constructing a Janus Plan:
1
Initial Query
Plan
Replace Cache Table
names with Nicknames
Remote Query
Plan
2 Generate a probe query by checking all equality predicates that
can potentially participate in probe query condition
if none found then ABORT ( remote query plan gets executed )
3
Cloned Input
Query Graph
Replace Nicknames with
eligible Cache Table names
from step - 2
Local Query
Plan
4 Insert switch operator on top of remote, local and probe query plans
20
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
21
Cache Table Population & Maintenance

Declarative Cache Tables


Relies on DPropR utility:
IBM’s asynchronous data
replication tool
Dynamic Cache Tables

On-demand loading



Cache key values failing
probe query are used to
extract data
Extracted data populated
asynchronously by a cache
daemon
Cache invalidation



Generate invalidation
messages and send to
cache daemon
Cache daemon generates
and executes deletes against
cacheDB
Updated rows get loaded
with new requests
22
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
23
Performance Evaluation


Focus: Evaluate overhead of Janus plans for dynamic tables
 Overhead of probe query and switch operator
 Overhead of on-demand loading
Experimental settings
24
Performance Evaluation

Cache Hit Case



Janus plan vs. pure local queries
Difference gives the overhead for probe query and the switch operator
Cache table loaded with all the data from backend table
25
Performance Evaluation

Cache Miss Case



Janus plan vs. pure remote queries
Difference gives the overhead
Cache table initially empty
26
Outline









Motivation
DBCache: Overview
Cache Tables
Dynamic Cache Model
Query Compilation
Cache Table Population and Maintenance
Performance Evaluation
Conclusions & Future Work
Discussion
27
Conclusions & Future Work

Significant contributions

Provides a new frame-work to implement DB caching for TWAs and tends to
provide:
 Seamless integration with current applications
 Supports static/dynamic cache tables
 Adapts to the changing workloads in TWAs
 Re-uses the functionality of a full-fledged DBMS i.e. DB2

What next ?
 Provide efficient, scalable, zero-admin DBCache
 Development of new tools to ease deployment
 Improve adaptability and maintenance
28
Comparison
vs. amco05:




Relies on asynchronous data propagation utility
Not completely transparent
May not work for heterogeneous DBMSs
Allows stale data
vs. gula04:








Cache constraints against C&C constraints
Doesn’t provide any guarantees of freshness/consistency
Relatively more transparent
Maintenance-centric vs. query-centric
Both deployed as mid-tier level caches
Both use a full-fledged DBMS
Both use Materialized views
Both use two-headed query plans
29
Discussion

Is it really that good ?





Using full-fledged DBMS at each middle-tier node, drawbacks ?
How is data freshness specified/guaranteed ?
Is it adaptable ? Weakly ? Strongly ?
When can cache constraints become bottleneck ?
Size of dynamic cache tables ?


Caching of other physical & logical DB Objects ?



Updates to those objects in backend DB?
Message traffic between Cache Daemon & Backend DB ?


Cache replacement policies/cleansing mechanisms?
Very frequent updates in backend DB
Local updates ?
Flaws in performance evaluation ?
30