systemoverview2
Download
Report
Transcript systemoverview2
What Should a DBMS Do?
• Store large amounts of data
• Process queries efficiently
• Allow multiple users to access the database
concurrently and safely.
• Provide durability of the data.
• How will we do all this??
User/
Application
Query
update
Generic Architecture
Query compiler
Execution engine
Record, index
requests
Query execution
plan
Index/record mgr.
Page
commands
Buffer manager
Read/write
pages
Storage manager
storage
Query Optimization
Goal:
Declarative SQL query
Imperative query execution plan:
buyer
City=‘seattle’
SELECT S.sname
FROM Purchase P, Person Q
WHERE P.buyer=Q.name AND
Q.city=‘seattle’ AND
Q.phone > ‘5430000’
phone>’5430000’
(Simple Nested Loops)
Buyer=name
Purchase
Person
Plan: Tree of R.A. ops, with choice of alg for each op.
Ideally: Want to find best plan. Practically: Avoid worst plans!
Alternate Plans
Find names of people who bought telephony products
buyer
buyer
Category=“telephony”
(hash join)
Category=“telephony”
(hash join)
prod=pname
(hash join)
Buyer=name
Purchase
Person
Product
Buyer=name
(hash join)
prod=pname
Purchase
Product
But what if we’re only looking for Bob’s purchases?
Person
ACID Properties
Atomicity: all actions of a transaction happen, or none happen.
Consistency: if a transaction is consistent, and the database starts
from a consistent state, then it will end in a consistent
state.
Isolation: the execution of one transaction is isolated from other
transactions.
Durability: if a transaction commits, its effects persist in the
database.
Problems with Transaction
Processing
Airline reservation system:
Step 1: check if a seat is empty.
Step 2: reserve the seat.
Bad scenario: (but very common)
Customer 1 - finds a seat empty
Customer 2 - finds the same seat empty
Customer 1 - reserves the seat.
Customer 2 - reserves the seat.
Customer 1 will not be happy; spends night in airport hotel.
The Memory Hierarchy
Main Memory
Disk
Tape
• 5-10 MB/S
• 1.5 MB/S transfer rate
•Volatile
transmission rates • 280 GB typical
•limited address
• 2-10 GB storage
capacity
spaces
• average time to
• Only sequential access
• expensive
access a block:
• Not for operational
• average access
10-15 msecs.
data
time:
10-100 nanoseconds • Need to consider
seek, rotation,
transfer times.
Cache:
• Keep records “close”
access time 10 nano’s
to each other.
Disk Space Manager
Task: manage the location of pages on disk (page = block)
Provides commands for:
Disk head
• allocating and deallocating a page
on disk
• reading and writing pages.
Why not use the operating system
Arm movement
for this task?
• Portability
• Limited size of address space
• May need to span several Arm assembly
disk devices.
Spindle
Tracks
Sector
Platters
Buffer Management in a DBMS
Page Requests from Higher Levels
BUFFER POOL
disk page
free frame
MAIN MEMORY
DISK
DB
choice of frame dictated
by replacement policy
• Data must be in RAM for DBMS to operate on it!
• Table of <frame#, pageid> pairs is maintained.
• LRU is not always good.
Buffer Manager
Manages buffer pool: the pool provides space for a limited
number of pages from disk.
Needs to decide on page replacement policy.
Enables the higher levels of the DBMS to assume that the
needed data is in main memory.
Why not use the Operating System for the task??
- DBMS may be able to anticipate access patterns
- Hence, may also be able to perform prefetching
- DBMS needs the ability to force pages to disk.
Record Formats: Fixed Length
F1
F2
F3
F4
L1
L2
L3
L4
Base address (B)
Address = B+L1+L2
• Information about field types same for all
records in a file; stored in system catalogs.
• Finding i’th field requires scan of record.
• Note the importance of schema information!
Files of Records
• Page or block is OK when doing I/O, but
higher levels of DBMS operate on records,
and files of records.
• FILE: A collection of pages, each containing
a collection of records. Must support:
– insert/delete/modify record
– read a particular record (specified using record
id)
– scan all records (possibly with some conditions
on the records to be retrieved)
Cost Model for Our Analysis
As
a good approximation, we ignore CPU
costs:
–
–
–
–
B: The number of data pages
R: Number of records per page
D: (Average) time to read or write disk page
Measuring number of page I/O’s ignores gains of
pre-fetching blocks of pages; thus, even I/O cost
is only approximated.
Sorting
• Illustrates the difference in algorithm design
when your data is not in main memory:
– Problem: sort 1Gb of data with 1Mb of RAM.
• Arises in many places in database systems:
–
–
–
–
–
Data requested in sorted order (ORDER BY)
Needed for grouping operations
First step in sort-merge join algorithm
Duplicate removal
Bulk loading of B+-tree indexes.
2-Way Sort: Requires 3 Buffers
• Pass 1: Read a page, sort it, write it.
– only one buffer page is used
• Pass 2, 3, …, etc.:
– three buffer pages used.
INPUT 1
OUTPUT
INPUT 2
Disk
Main memory
buffers
Disk
Two-Way External Merge Sort
• Each pass we read +
write each page in file.
• N pages in the file => the
number of passes
log2 N 1
• So total cost is:
3,4
6,2
9,4
8,7
5,6
3,1
2
3,4
2,6
4,9
7,8
5,6
1,3
2
4,7
8,9
2,3
4,6
1,3
5,6
• Idea: Divide and
conquer: sort subfiles
and merge
2
2-page runs
PASS 2
2,3
4,4
6,7
8,9
2 N log 2 N 1
Input file
PASS 0
1-page runs
PASS 1
1,2
3,5
6
4-page runs
PASS 3
1,2
2,3
3,4
4,5
6,6
7,8
9
8-page runs
General External Merge Sort
More than 3 buffer pages. How can we utilize them?
• To sort a file with N pages using B buffer pages:
– Pass 0: use B buffer pages. Produce N / B sorted runs
of B pages each.
– Pass 2, …, etc.: merge B-1 runs.
INPUT 1
...
INPUT 2
...
OUTPUT
...
INPUT B-1
Disk
B Main memory buffers
Disk
Cost of External Merge Sort
• Number of passes: 1 log B1 N / B
• Cost = 2N * (# of passes)
• E.g., with 5 buffer pages, to sort 108 page
file:
– Pass 0: 108 / 5 = 22 sorted runs of 5 pages
each (last run is only 3 pages)
– Pass 1: 22 / 4 = 6 sorted runs of 20 pages
each (last run is only 8 pages)
– Pass 2: 2 sorted runs, 80 pages and 28 pages
– Pass 3: Sorted file of 108 pages
Number of Passes of External
Sort
B=3 B=5
4
7
100
5
10
1,000
7
13
10,000
9
17
100,000
10
20
1,000,000
12
23
10,000,000
14
26
100,000,000
15
1,000,000,000 30
N
B=9
3
4
5
6
7
8
9
10
B=17 B=129 B=257
1
1
2
2
2
3
2
2
4
3
3
5
3
3
5
3
4
6
4
4
7
4
5
8
B: number of frames in the buffer pool; N: number of pages in relation.