Transcript ppt

A Survey of Techniques Used to Reduce
the Semantic Gap Between DBMS
Storage Manager and Storage
Subsystems
Biplob Kumar Debnath
Nagapramod Mandagere
1
Semantic Gap
 Standard interfaces to storage subsystem virtualizes storage as
a simple linear array of fixed-size logical blocks



(+) Compatibility
(+) Case of implementation
(-) Hide useful information
 DBMS storage managers do a lot to optimize storage access,
without sufficient information
 Storage subsystem front-ends do a lot to optimize under the
covers, without enough info about application
 Wouldn’t it be better if we can reduce the semantic gap?
2
Contributions
 Classification of Existing Approaches
 Comparison of Existing Approaches
 Research Directions
 Recommendations
3
Outline
 Background on Database Storage Manager
 Motivation
 Classification overview
 Comparison of Different approaches
 Future Research Directions
 Conclusion
4
Background:
Database Storage Manager
 Interface between DMBS
and disk storage is in terms
of fixed sized disk pages
 Same Layout on
disk/memory
 Multi-level storage hierarchy
 Different optimal access on
each level
Figure is taken from [1]
5
Data Layout in Disk
6
Slide is taken from [1]
NSM in Memory Hierarchy
 Optimized for full record access
 Slow for partial access


Wastes I/O Bandwidth
Low spatial locality at CPU cache
Figure is taken from [1]
7
Importance
Pitfalls in Database Storage Managers
DISK
DATABASE
Page layout is
Relational tables are
LINEAR
TWO-DIMENSIONAL
When a query requests a subset of attributes,
the database must read ALL attributes even
though only a fraction is needed
However, with tables of small number of attributes and online
transaction processing (OLTP) workload, such pitfall was not 8
considered as a major concern in commercial DBMSs.
Motivation - New Applications
 Peta-scale data size

Data administration becomes challenging
 High-dimensional tables (e.g., 500 attributes in a
single table is common)
 NSM data layout is completely inefficient
 A radical change in the query workload as many
queries ask only about very small part of the table
(e.g., decision support queries)
 Need a radical change in the data layout
9
Classification Overview




Server Side Modifications
Storage Side Modifications
Specialized Database Machine
New Storage Devices
10
Classification of Existing Approaches
 Server Side Modifications




DSM
PAX
Fractured Mirrors
Atropos LVM
 Storage Side Modifications


Semantic Disk
Object based Storage Device
 Specialized Hardware


Database Machine
Active Disk
 New Storage Devices


MEMS
Object based Storage Devices
11
Decomposition Storage Model (DSM)
 Partition original table into n 1-attribute sub-tables
 Each sub-table stored separately in NSM pages
12
Slide is taken from [1]
DSM in Memory Hierarchy
 Optimized for partial-access
 Slow for full-record access

Reconstructing full record incurs random I/O
Slide is taken from [1]
13
Comparison of NSM & DSM
Page
Layout
Cache Memory Performance
Disk Memory Performance
Full-Record
Access
Full-Record
Access
Partial-Record
Access
Partial-Record
Access
NSM
√
x
√
x
DSM
x
√
x
√
14
Fractured Mirrors
Two copies – one in DSM & one in NSM
 Requires Double Space
15
Figure is taken from [1]
Disk Basics




Access time = seek time + rotation delay + transfer time
Seek Time = 5-15 ms
Rotational delay = 4 ms on average for 7400 rpm drive
Transfer time = 1 ms for 8MB/sec
Track Aligned Access to save Rotational Time
16
Figure is taken from [11]
Motivation Behind LVM
Semi-sequential access
Track
1
2
Read 1
3
4
5
6
7
Process 1
Skew
Head-0
1
2
3
4
5
6
7
Head-1
12
13
14
8
9
10
11
Head-3
16
17
17
19
20
21
15
17
Atropos Logical Volume Manager
 LVM exposes low level storage characteristics to the
applications
 Applications uses this info for efficient data layout
18
Figure is taken from [3]
DB Aware Semantic Disk
 Find correlations among the blocks.
 Storage systems have a very high level knowledge of the DBMS
working behavior.
 Uses WAL entries to find correlations among blocks
 Uses block relations knowledge for better




Caching
Layout
Reliability
Security guarantees
 With some help from DBMS side it can improve performance even
better.
 access time of a particular block or table
 access counts, # of queries that access table within a fixed time
19
 access co-relations for tables and indices
Reference [4]
Database Machine
 Explored extensively in late 1970s
and late 1980s
 Classified in 4 categories:




Processor per head
Processor per track
Processor per disk
Multi-processor cache
 Reasons for failure:

Use special-purpose hardware.

Performance was impressive for
only scan, projections operations.

Same performance can be
achieved by smart indexing
techniques.

CPU was sitting idle.

Communication overhead

Database vendors did not agree
to rewrite their legacy code base.
20
Reference [6]
Active Disk
 Got attention from the
research community again
in 1990s.
 Leverage storage technology
advancements and parallel
algorithms improvement.
 Widespread use of disk
arrays that use a large
number of disks working in
parallel.
 Advancement in parallel
algorithms
 Serial communications
advancement
 Disks are embedded with a
processor, memory, cache, and
network connection.
 Disk can perform some computation
in addition to storing data.
 How to map all basic database
operations on Active Disk system
with proper low-level primitives.
21
Reference [5]
MEMS Storage
22
Figure is taken from [1]
2-D Database Access in MEMS
 MEMS is 2-D
 Full area is divided into grids
 Activate only select heads
Figure is taken from [1]
23
Object-based Storage Device (OSD)
Traditional
OSD
• Same hardware, new interface
Slide is taken from [10]
OSD Objects
 An object is a logical unit






File-like access methods
Attributes describes the characteristics
Security policies authorizes access
Variable size and can store any type of data
Application can decide what should go into object
Dynamically shrink and grow
OSD: Shared attributes
Traditional
Slide is taken from [10]
OSD
OSD for Databases
Figure is taken from [2]
27
Comparison
28
Possible Research Areas
 Finding efficient layout for indexes.
 Exploring new possibilities assuming large
cache is available in storage devices
 Making changes to support database aware
semantic disk approach.
 Explore OSD
29
OSD + DBMS Research Issues
 Deciding upon the object and attribute representations
 Defining the interface layer between DBMS and OSD
 Defining a geometry-aware read/write functions in the
OSD side
 Changing the storage manager at the DBMS to be light
weight to only communicate with OSD
 Modifying the query optimizer in the DBMS to take care
of a new set of possible query plans that utilize OSD
30
Conclusion
 It is possible to reduce the semantic gap
between DBMS and Storage subsystems.

For short term solution, Semantic disk or
Atropos like LVM looks promising.

For Long term solution, OSD looks very
promising.
31
Thank You !!!!
32
Comments / Questions ?
33
References











[1] A. Ailamkai, Database Architectures for New Hardware (invited tutorial), VLDB Toronto,
Canada, August 2004.
[2] Steve Schlosser, Sami Iren. Database Storage Management with Object-Based Storage
Devices, DAMON 2005
[3]Atropos: A Disk Array Volume Manager for Orchestrated Use of Disks, In 3rd USENIX
Conference on File and Storage Technologies FAST 04, CA. March 2004
[4]Muthian Sivathanu, Lakshmi N Bairavasundaram, et al, Database Aware Sematicaly
Smart Storage, FAST 2005.
[5]E. Riedel, C. Faloutsos, and D. Nagle. Active Disk Archittecture for Databases. Technical
Report CMU-CS-00-145, Carniegie Mellon University, April 2000.
[6]K. Keeton. Computer Architecture Support for Database Applications, PhD thesis,
University of California at Berkeley, 1999.
[7]R. Ramamurthy, D. J. Dewitt, and Q. Su. A Case for Fractured Mirrors, In Proc. VLDB,
2002
[8]A. Ailamaki, D. J. Dewitt, M. D. Hill, and M. Skounakis. Weaving Relations for Cache
Performance. In Proc. VLDB, 2001.
[9]S. Schlosser et al. , On Multidimensioanla data & Moderndisk, Fast 2005.
[10] E. Riedel, Object Bases Storage Device basics, SNIA Tutorial, April 2005.
[11] http://www.cs.duke.edu/courses/spring01/cps110/slides/disk-fs.pdf
34
Modern Disk & Multidimensal data
35
Figures are taken from [9]