CS 245: Database System Principles

Download Report

Transcript CS 245: Database System Principles

Chapter 11: Hardware
(Slides by Hector Garcia-Molina,
http://www-db.stanford.edu/~hector/cs245/notes.htm)
Chapter 11
1
Outline
•
•
•
•
Hardware: Disks
Access Times
Optimizations
Other Topics:
– Storage costs
– Using secondary storage
– Disk failures
Chapter 11
2
The Big Picture:
DBMS
Data Storage
Chapter 11
3
P
...
Typical
Computer
M
Secondary
Storage
Chapter 11
4
Processor
Fast, slow, reduced instruction set,
with cache, pipelined…
Speed: 100  500  1000 MIPS
Memory
Fast, slow, non-volatile, read-only,…
Access time: 10-6  10-9 sec.
1 s  1 ns
Chapter 11
5
Secondary storage
Many flavors:
- Disk: Floppy (hard, soft)
Removable Packs
Winchester
Ram disks
Optical, CD-ROM…
Arrays
- Tape Reel, cartridge
Robots
Chapter 11
6
Focus on: “Typical Disk”
…
Terms: Platter, Head,
Cylinder, Track
Sector (physical),
Block (logical), Gap
Chapter 11
7
Top View
Chapter 11
8
“Typical” Numbers
Diameter:
1 inch  15 inches
Cylinders:
100  2000
Surfaces:
1 (CDs) 
(Tracks/cyl)
2 (floppies)  30
Sector Size: 512B  50K
Capacity:
360 KB (old floppy)
 30 GB (I use)
Chapter 11
9
Disk Access Time
block x
in memory
I want
block X
?
Chapter 11
10
Time = Seek Time +
Rotational Delay +
Transfer Time +
Other
Chapter 11
11
Rotational Delay
Head Here
Block I Want
Chapter 11
12
Average Rotational Delay
R = 1/2 revolution
“typical” R = 8.33 ms (3600 RPM)
Chapter 11
13
Transfer Rate: t
• “typical” t: 1  3 MB/second
• transfer time: block size
t
Chapter 11
14
Other Delays
• CPU time to issue I/O
• Contention for controller
• Contention for bus, memory
“Typical” Value: 0
Chapter 11
15
• So far: Random Block Access
• What about: Reading “Next” block?
Chapter 11
16
If we do things right
(e.g., Double Buffer,
Stagger Blocks…)
Time to get = Block Size + Negligible
block
t
- skip gap
- switch track
- once in a while,
next cylinder
Chapter 11
17
Rule of
Thumb
• Ex:
Random I/O: Expensive
Sequential I/O: Much less
1 KB Block
» Random I/O:  20 ms.
» Sequential I/O:  1 ms.
Chapter 11
18
Cost for Writing similar to Reading
…. unless we want to verify!
need to add (full) rotation + Block size
t
Chapter 11
19
• To Modify a Block?
To Modify Block:
(a) Read Block
(b) Modify in Memory
(c) Write Block
[(d) Verify?]
Chapter 11
20
Block Address:
•
•
•
•
Physical Device
Cylinder #
Surface #
Sector
Chapter 11
21
Outline
•
•
•
•
Hardware: Disks
Access Times
Optimizations
Other Topics
here
– Storage Costs
– Using Secondary Storage
– Disk Failures
Chapter 11
22
Optimizations
(in controller or O.S.)
• Double Buffering
• Disk Scheduling Algorithms: sec. 11.5.4
– e.g., elevator algorithm
Chapter 11
23
Double Buffering
Problem: Have a File
» Sequence of Blocks B1, B2
...
Have a Program
» Process B1
» Process B2
» Process B3
Chapter 11
24
Single Buffer Solution
(1)
(2)
(3)
(4)
Read B1  Buffer
Process Data in Buffer
Read B2  Buffer
Process Data in Buffer ...
Chapter 11
25
Say P = time to process/block
R = time to read in 1 block
n = # blocks
Single buffer time = n(P+R)
Chapter 11
26
Double Buffering
Memory:
Disk:
process
process
C
A
B
A B C D E F G
donedone
Chapter 11
27
Say P  R
P = Processing time/block
R = IO time/block
n = # blocks
What is processing time?
• Double buffering time = R + nP
• Single buffering time
Chapter 11
= n(R+P)
28
Block Size Selection?
• Big Block  Amortize I/O Cost
Unfortunately...
• Big Block  Read in more useless stuff!
and takes longer to read
Chapter 11
29
Trend
Trend
• As memory prices drop,
blocks get bigger ...
Chapter 11
30
Storage Cost
tape
typical capacity (bytes)
1015
optical
disks
1013
1011
109
electronic
secondary
magnetic
optical
disks
tape
electronic
main
107
105
from Gray & Reuter
cache
103
10-9
Chapter 11
10-6
10-3
10-0
103
access time (sec)
31
Storage Cost
104
cache
102
dollars/MB
from Gray & Reuter
electronic
main
electronic
secondary
100
tape
magnetic
optical
disks
optical
disks
10-2
tape
10-4
10-9
Chapter 11
10-6
10-3
10-0
103
access time (sec)
32
Using secondary storage effectively
• Example: Sorting data on disk
• Conclusion:
– I/O costs dominate
– Design algorithms to reduce I/O
• Also: How big should blocks be?
Chapter 11
33
Disk Failures
(Sec 11.6)
• Partial  Total
• Intermittent  Permanent
Chapter 11
34
Coping with Disk Failures
• Detection
– e.g. Checksum
• Correction
 Redundancy
Chapter 11
35
At what level do we cope?
• Single Disk
– e.g., Error Correcting Codes
• Disk Array
Logical
Chapter 11
Physical
36
Operating System
e.g., Stable Storage
Logical Block
Chapter 11
Copy A
Copy B
37
Database System
• e.g.,
Log
Current DB
Chapter 11
Last week’s DB
38
Summary
Summary
• Secondary storage, mainly disks
• I/O times
• I/Os should be avoided,
especially random ones…..
Chapter 11
39
Outline
•
•
•
•
Hardware: Disks
Access Times
Optimizations
Other Topics
– Storage Costs
– Using Secondary Storage
– Disk Failures
here
Chapter 11
40