Chapter 3 Data Storage and Access Methods

Title: Operating System Support for Database Management
Author: Michael Stonebraker
Pages: 217-223
Problem Definition

Apparent disconnect between DBMS performance
goals and operating system design and
implementation.

Services provided by OS are inadequate and suboptimal.

Paper evaluates the following services:
• Buffer pool management
• File system
• Interprocess communication
• Consistency control
• Paged virtual memory
Contributions


Demonstrates that OS services are too slow or inappropriate for DBMS tasks.

Attempts to make OS designers aware of and more sensitive to DBMS needs.
Key Concepts

Buffer Pool Management
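• The OS keeps a fixed-size buffer pool through which all file I/O passes.
• UNIX uses an LRU replacement strategy, which may not be ideal for a DBMS (for example, when a scan cycles repeatedly over more blocks than the pool holds).
• High overhead to fetch a block through the buffer pool (a system call plus a memory-to-memory copy): approx. 5,000 instructions for 512 bytes.
• No good prefetch strategy: UNIX prefetches only on sequential access, whereas the DBMS often knows exactly which block it will need next.
• UNIX does not implement a "selected force out" buffer manager, in which the DBMS can dictate the order in which blocks are written to disk at commit.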
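
To illustrate why LRU can be a poor fit, here is a minimal sketch of a fixed-size buffer pool simulated under two replacement policies. The function name `count_hits`, the workload, and the pool size are all hypothetical choices for illustration; nothing here is taken from UNIX or Ingres.

```python
from collections import OrderedDict

def count_hits(accesses, capacity, policy="lru"):
    """Simulate a fixed-size buffer pool and return the number of hits.

    policy "lru" evicts the least recently used block;
    policy "mru" evicts the most recently used block.
    """
    pool = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in accesses:
        if block in pool:
            hits += 1
            pool.move_to_end(block)  # mark as most recently used
            continue
        if len(pool) >= capacity:
            # last=True pops the most recent entry, last=False the oldest
            pool.popitem(last=(policy == "mru"))
        pool[block] = True
    return hits

# Assumed workload: scan blocks 0..4 three times with a 4-block pool.
scan = list(range(5)) * 3
lru_hits = count_hits(scan, capacity=4, policy="lru")
mru_hits = count_hits(scan, capacity=4, policy="mru")
print(lru_hits, mru_hits)
```

In this toy workload the scan is one block larger than the pool, so LRU always evicts the block that will be needed soonest and never scores a hit, while evicting the most recently used block keeps most of the scan resident. A DBMS that knows its own access pattern can pick the policy; a general-purpose OS pool cannot.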

The File System
• UNIX presents each file as an uninterpreted character array, forcing the DBMS to implement its own higher-level objects (such as records) on top.
• Tree-structured file systems
  • UNIX implements two services using trees:
    • Keeping track of the blocks in a given file
    • The hierarchical directory structure
  • The DBMS adds a third tree to support keyed access.
  • One tree holding all three kinds of information would be more efficient.
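
As a rough illustration of what "implementing higher-level objects" on a character array involves, the sketch below layers fixed-length records over a plain byte stream. The record layout (a 4-byte id plus a 12-byte name) and the helper names are assumptions made up for this example.

```python
import io
import struct

# Hypothetical record layout: little-endian 4-byte integer id + 12-byte name.
RECORD = struct.Struct("<i12s")

def write_record(f, slot, rec_id, name):
    """Store a record at a slot by computing its byte offset ourselves."""
    f.seek(slot * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode("ascii")))

def read_record(f, slot):
    """Read back the record stored at a slot."""
    f.seek(slot * RECORD.size)
    rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode("ascii")

f = io.BytesIO()  # stands in for a file that the OS sees only as bytes
write_record(f, 0, 7, "smith")
write_record(f, 2, 9, "jones")
print(read_record(f, 0), read_record(f, 2))
```

The OS knows nothing about slots or fields here; all record structure lives in the DBMS code, which is the duplication of effort the slide describes.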

Scheduling, Process Management, and Interprocess Communication
• Performance
  • Task switches are inevitable.
  • Processes carry a great deal of state information, making task switches expensive.
• Critical sections
  • The buffer pool is a shared data segment.
  • Problems arise if the OS deschedules a DBMS process while it holds a lock on the buffer pool: other processes then queue up behind the lock.
• Server model
  • The OS needs to provide a message facility that lets multiple processes send messages to a single process.
  • The server must do its own scheduling and multitasking.
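
The server model can be sketched roughly as follows, using threads and in-process queues as stand-ins for OS processes and a message facility. The names `server`, `client`, and `inbox`, and the "double the value" request, are hypothetical placeholders rather than anything from the paper.

```python
import queue
import threading

def server(inbox, n_clients):
    """Single server: drains one shared inbox until every client has signed off."""
    remaining = n_clients
    while remaining:
        reply_box, request = inbox.get()
        if request is None:          # sign-off marker from a client
            remaining -= 1
            continue
        reply_box.put(request * 2)   # stand-in for running a query

def client(inbox, value, results, idx):
    """Client: sends one request to the server and waits for the reply."""
    reply_box = queue.Queue()
    inbox.put((reply_box, value))
    results[idx] = reply_box.get()
    inbox.put((reply_box, None))

inbox = queue.Queue()                # many senders, one receiver
values = [1, 2, 3]
results = [None] * len(values)
clients = [threading.Thread(target=client, args=(inbox, v, results, i))
           for i, v in enumerate(values)]
srv = threading.Thread(target=server, args=(inbox, len(clients)))
srv.start()
for t in clients:
    t.start()
for t in clients:
    t.join()
srv.join()
print(results)
```

Note that the server here handles requests strictly one at a time. To overlap work for several clients it would need its own scheduler inside the process, which is the duplication of OS functionality that the last bullet points at.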

Consistency Control
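• Many operating systems can place locks only at the file level.
• A DBMS prefers finer granularity, such as locks on pages or records.
• When the DBMS implements its own buffer pool, crash recovery by the operating system alone becomes impossible, since the OS no longer controls when buffered blocks reach disk.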
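
The effect of lock granularity can be shown with a toy lock table. This is only a sketch under simplifying assumptions: locks are exclusive, never released, and the request list and function name `blocked_requests` are invented for the example.

```python
def blocked_requests(requests, granularity):
    """Count lock requests that would block under exclusive locks.

    requests: list of (transaction, file_name, record_no) tuples
    granularity: "file" locks whole files, "record" locks single records
    """
    held = {}  # lockable unit -> transaction that owns it
    blocked = 0
    for txn, file_name, record_no in requests:
        unit = file_name if granularity == "file" else (file_name, record_no)
        owner = held.setdefault(unit, txn)  # first requester becomes owner
        if owner != txn:
            blocked += 1
    return blocked

# Assumed workload: three transactions touching the same file "emp".
requests = [("T1", "emp", 1), ("T2", "emp", 2), ("T3", "emp", 1)]
file_blocked = blocked_requests(requests, "file")
record_blocked = blocked_requests(requests, "record")
print(file_blocked, record_blocked)
```

With file-level locks both T2 and T3 wait behind T1, even though T2 touches a different record. With record-level locks only T3, which genuinely conflicts with T1, has to wait.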

Paged Virtual Memory
• Large files may not fit in memory (or in the address space) all at once.
• Binding chunks of the file into user space one at a time may incur a performance loss.
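
A rough sketch of "binding a chunk of the file into user space", using Python's `mmap` module on a temporary file. The helper `read_byte` and the file contents are hypothetical; the point is only that each access outside the currently mapped chunk requires a new mapping.

```python
import mmap
import os
import tempfile

CHUNK = mmap.ALLOCATIONGRANULARITY  # mapping offsets must be multiples of this

def read_byte(path, position):
    """Read one byte by mapping only the chunk of the file that contains it."""
    chunk_start = (position // CHUNK) * CHUNK
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        length = min(CHUNK, size - chunk_start)
        with mmap.mmap(f.fileno(), length,
                       access=mmap.ACCESS_READ, offset=chunk_start) as m:
            return m[position - chunk_start]

# Build a small test file spanning three chunks (the last one partial).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a" * CHUNK + b"b" * CHUNK + b"c" * 10)
    path = tmp.name

first = read_byte(path, 0)
last = read_byte(path, 2 * CHUNK + 9)
os.remove(path)
print(chr(first), chr(last))
```

Every call here sets up and tears down a mapping, which is deliberately wasteful. It makes visible the per-chunk binding cost that the slide warns about when a file cannot be mapped in one piece.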
Validation

Content is mostly informational.

Based on previous papers and existing implementations of then-current systems.

Examples are cited primarily from the UNIX OS and the Ingres DBMS.

The issues raised could be biased toward that pairing and may not be common or applicable to all OS and DBMS combinations.
Assumptions



Presents the topic as one that applies across a range of DBMSs and operating systems.

Author constrains his examples to UNIX and Ingres.

Paper was written in 1981. Operating systems have advanced considerably since then, so his points may no longer be applicable.
Changes if Rewritten Today



Increase the diversity of operating systems and DBMSs examined.

Add an industry perspective. Are the issues Stonebraker presents really a problem for DBMS designers?

Quantify claims by providing statistical analysis of performance hits.