12.6 Swap-Space Management
• Swapping (Section 8.2) moves the least active processes between main memory and disk.
- Systems actually combine swapping with virtual memory techniques and swap pages, not entire processes.
• Swap space — virtual memory uses disk space as an extension of main memory.
- Its goal is to provide the best throughput for the virtual memory system.
• The amount of swap space needed depends on the amount of physical memory, the amount of virtual memory, and the way virtual memory is used.
• It is safer to overestimate the amount of swap space required.
- Linux suggests setting swap space to double the amount of physical memory.
Operating System Principles
12.1
Silberschatz, Galvin and Gagne ©2005
• Swap space can be carved out of the normal file system as a very large file, but this is too inefficient.
- More commonly, it is placed in a separate raw disk partition. The goal is to optimize for speed rather than for storage efficiency.
• Swap-space management:
- In Solaris 1, swap space is used for pages of anonymous memory, which includes memory for the stack, heap, and uninitialized data of a process.
- Solaris 2 allocates swap space only when a page is forced out of physical memory, not when the virtual memory page is first created.
- In Linux, swap space is also used for pages of anonymous memory or for regions of memory shared by several processes.
- Each swap area consists of a series of 4-KB page slots. Associated with each swap area is a swap map: an array of integer counters indicating the number of mappings to the swapped page. The kernel uses swap maps to track swap-space use.
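The swap map described above can be sketched as a simple array of counters. This is a toy illustration only; the slot values and helper names are hypothetical, not the kernel's actual data structures:

```python
# Toy swap map: one integer counter per 4-KB page slot.
# A counter of 0 means the slot is free; a counter of n > 0 means the
# swapped page in that slot is mapped by n processes.
swap_map = [1, 0, 3, 0, 1]

def free_slots(swap_map):
    """Return the indices of page slots available for a new swapped page."""
    return [i for i, count in enumerate(swap_map) if count == 0]

def release(swap_map, slot):
    """One process drops its mapping; the slot is free once the count hits 0."""
    swap_map[slot] -= 1

print(free_slots(swap_map))  # → [1, 3]
```

A slot's counter only returns to 0 (available) after every process mapping that swapped page has released it.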
Data Structures for Swapping on Linux Systems
[Figure: a swap map over a swap area's page slots; a counter of 0 indicates that the page slot is available, while a counter of 3 indicates that the swapped page is mapped to three different processes.]
12.7 RAID Structure
• RAID — Redundant Array of Inexpensive (or Independent) Disks: multiple disk drives provide reliability via redundancy.
• The mean time to failure (loss of data) of a mirrored volume depends on the mean time to failure of the two individual disks and on the mean time to repair.
- If the mean time to failure of a disk is 100,000 hours and the mean time to repair is 10 hours, then the mean time to data loss of the mirrored disk system is 100,000² / (2 × 10) = 500 × 10⁶ hours.
• Handling power failure:
- Write one copy first, then the next, so that one of the two copies is always consistent.
- Add an NVRAM (nonvolatile RAM) cache to the RAID array.
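The mirrored-volume arithmetic above can be checked directly; this minimal sketch uses the same MTTF² / (2 × MTTR) formula as the example:

```python
def mirrored_mean_time_to_data_loss(mttf_hours, mttr_hours):
    # Data is lost only if the second disk fails while the first
    # is still being repaired: MTTF^2 / (2 * MTTR).
    return mttf_hours ** 2 / (2 * mttr_hours)

print(mirrored_mean_time_to_data_loss(100_000, 10))  # → 500000000.0 hours
```

Note the assumption of independent disk failures; correlated failures (e.g., a shared power supply) make the real figure far lower.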
Improvement in Performance via Parallelism
• With disk mirroring, the number of reads per unit time is doubled.
• The transfer rate can be improved by striping data across the disks.
- Bit-level data striping: splitting the bits of each byte across eight disks (or a multiple of 8, or a number that divides 8).
- Block-level data striping: blocks of a file are striped across multiple disks. With n disks, block i goes to disk (i mod n) + 1.
• Data striping has two main goals:
- Increase the throughput of multiple small accesses through load balancing.
- Reduce the response time of large accesses.
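The block-level placement rule above can be sketched in a few lines (a toy illustration; the function name is mine):

```python
def disk_for_block(block_i, n_disks):
    # Block-level striping: block i of a file goes to disk (i mod n) + 1,
    # with disks numbered 1..n.
    return (block_i % n_disks) + 1

# With 4 disks, consecutive blocks rotate across all drives:
print([disk_for_block(i, 4) for i in range(6)])  # → [1, 2, 3, 4, 1, 2]
```

Because consecutive blocks land on different disks, a large sequential read keeps all drives busy at once, which is exactly the parallelism striping is after.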
RAID Levels
• Disk striping uses a group of disks as one storage unit.
• RAID schemes improve performance and improve the reliability of the storage system by storing redundant "parity" data.
- Mirroring or shadowing keeps a duplicate of each disk.
- Block-interleaved parity uses much less redundancy.
- The parity is 0 if the number of bits set to 1 in the byte is even, and 1 if that number is odd.
• In the following figure, P indicates error-correcting bits, and C indicates a second copy of the data. In all cases, four disks' worth of data are stored, and the extra disks are used to store redundant information for failure recovery.
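The block-interleaved parity idea can be sketched with bytewise XOR (a minimal illustration, not a real RAID implementation): the parity block is the XOR of the data blocks, and any single lost block is recovered by XOR-ing the survivors with the parity.

```python
from functools import reduce

def parity_block(blocks):
    # Bytewise XOR across the data blocks: each result byte gives even
    # parity over the corresponding data bytes (0 if the count of 1-bits
    # is even, 1 if odd, generalized bit-by-bit to whole bytes).
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"\x0f\x33", b"\xf0\x55", b"\xaa\x0f"]
p = parity_block(data)

# Reconstruct a lost block from the surviving blocks plus the parity:
recovered = parity_block([data[1], data[2], p])
print(recovered == data[0])  # → True
```

XOR is its own inverse, which is why the same function serves both to compute parity and to rebuild a missing block.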
RAID Levels 1-3
• Besides parity, error-correcting codes store two or more extra bits to reconstruct the data if a single bit is damaged. RAID 2 is not used in practice.
• A single parity bit can be used for error correction as well as for detection. Normally RAID 3 includes dedicated parity hardware and an NVRAM cache.
RAID Levels 4-6
• Read-modify-write cycle: in RAID 4, a write requires four disk accesses: two to read and two to write.
(Skip the paragraph about WAFL.)
• RAID 5 spreads data and parity among all disks to avoid the overuse of a single parity disk in RAID 4. RAID 5 is the most common parity RAID system.
• RAID 6 stores extra redundant data to guard against multiple disk failures.
RAID 0 + 1 and RAID 1 + 0
• RAID 0 provides performance; RAID 1 provides reliability.
• RAID 0+1: a set of disks is striped, and then the stripe is mirrored to another stripe.
• RAID 1+0: disks are mirrored in pairs, and the resulting mirrored pairs are striped.
• If a single disk fails in RAID 0+1, its entire stripe is inaccessible, leaving only the other stripe available.
• With a single-disk failure in RAID 1+0, that disk is unavailable, but the rest of the disks are available.
SKIP p.457-459
12.8 Stable-Storage Implementation
• A write-ahead log scheme requires stable storage.
• To implement stable storage:
- Replicate information on more than one nonvolatile storage device with independent failure modes.
- Update information in a controlled manner to ensure that we can recover the stable data after any failure during data transfer or recovery.
• A disk write results in one of three possible outcomes:
- Successful completion.
- Partial failure: only some of the sectors were written with the new data, and the sector being written during the failure may be corrupted.
- Total failure: the failure occurred before the disk write started, so the previous data remain intact.
• To be able to recover from a disk-write failure, the system must maintain two physical blocks for each logical block. An output operation is executed as follows:
- Write the data onto the first physical block.
- When the first write completes successfully, write the same data onto the second physical block.
- Declare the operation complete only after the second write completes successfully.
• Recovery process:
- If both blocks are the same with no detected error, no action is needed.
- If one block contains a detected error, replace it with the value of the other block.
- If neither block contains a detectable error but their contents differ, replace the contents of the first block with those of the second.
• Performance can be improved by using an NVRAM cache.
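The two-block protocol and the recovery rules above can be sketched as a toy in-memory model. The class, the CRC-32 checksum used to detect a corrupted copy, and the method names are all hypothetical illustration, not an actual stable-storage implementation:

```python
import zlib

class StableBlock:
    """Toy model of one logical block backed by two physical copies."""

    def __init__(self):
        self.copies = [None, None]           # each entry: (data, checksum)

    def write(self, data):
        entry = (data, zlib.crc32(data))
        self.copies[0] = entry               # 1. write the first physical block
        self.copies[1] = entry               # 2. then the second, after the first succeeds
        # 3. only now is the operation declared complete

    def recover(self):
        # A copy is "good" when it exists and its checksum matches its data.
        good = [c is not None and zlib.crc32(c[0]) == c[1] for c in self.copies]
        if all(good) and self.copies[0] != self.copies[1]:
            self.copies[0] = self.copies[1]  # both intact but different: second wins
        elif good[0] and not good[1]:
            self.copies[1] = self.copies[0]  # replace the corrupted copy
        elif good[1] and not good[0]:
            self.copies[0] = self.copies[1]
        return self.copies[0][0]             # the recovered logical block
```

A partial failure leaves one copy with a bad checksum; recovery restores it from the intact copy, so the logical block always holds either the old or the new value, never a corrupted mix.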
12.9 Tertiary Storage Devices
• Low cost is the defining characteristic of tertiary storage.
• Generally, tertiary storage is built using removable media.
• Common examples of removable media are floppy disks and CD-ROMs; other types, such as MO (magneto-optic) disks and tapes, are also available.
Removable Disks
• Floppy disk — a thin, flexible disk coated with magnetic material and enclosed in a protective plastic case.
- Most floppies hold about 1 MB; similar technology is used for removable disks that hold more than 1 GB.
- Removable magnetic disks can be nearly as fast as hard disks, but they are at greater risk of damage from exposure.
• Optical disks do not use magnetism; they employ special materials that are altered by laser light to have relatively dark or bright spots.
- The phase-change disk drive uses laser light at three different powers:
- Low to read data.
- Medium to erase the disk by melting and refreezing the recording medium into a crystalline state.
- High to melt the medium into an amorphous state to write the disk.
Removable Disks
• A magneto-optic disk records data on a rigid platter coated with magnetic material.
- Laser heat is used to amplify a large, weak magnetic field to record a bit.
- Laser light is also used to read data (the Kerr effect): the polarization of the laser beam is rotated clockwise or counterclockwise depending on the orientation of the magnetic field.
- The magneto-optic head flies much farther from the disk surface than a magnetic disk head, and the magnetic material is covered with a protective layer of plastic or glass, making the disk resistant to head crashes.
Removable Disks
• Read-write disks: the data on these disks can be modified over and over.
• WORM ("Write Once, Read Many Times") disks can be written only once.
- A thin aluminum film is sandwiched between two glass or plastic platters.
- To write a bit, the drive uses laser light to burn a small hole through the aluminum; information can be destroyed but not altered.
- Very durable and reliable.
• Read-only disks, such as CD-ROM and DVD, come from the factory with the data prerecorded by pressing, instead of burning.
Tapes
• Compared to a disk, a tape is less expensive and holds more data, but random access is much slower.
• Tape is an economical medium for purposes that do not require fast random access, e.g., backup copies of disk data and holding huge volumes of data.
• Large tape installations typically use robotic tape changers that move tapes between tape drives and storage slots in a tape library.
- Stacker – a library that holds a few tapes.
- Silo – a library that holds thousands of tapes.
• A disk-resident file can be archived to tape for low-cost storage; the computer can stage it back into disk storage for active use. A robotic tape library is therefore called near-line storage.
SKIP 12.9.1.3
Operating System Issues
• Major OS jobs are to manage physical devices and to present a virtual-machine abstraction to applications.
• For hard disks, the OS provides two abstractions:
- Raw device – an array of data blocks.
- File system – the OS queues and schedules the interleaved requests from several applications.
• What about removable storage?
Application Interface
• Most OSs handle removable disks almost exactly like fixed disks — a new cartridge is formatted, and an empty file system is generated on the disk.
• Tapes are presented as a raw storage medium: an application does not open a file on the tape; it opens the whole tape drive as a raw device.
- Usually the tape drive is reserved for the exclusive use of that application until the application exits or closes the tape device.
- Since the OS does not provide file-system services, the application must decide how to use the array of blocks.
- Since every application makes up its own rules for how to organize a tape, a tape full of data can generally be used only by the program that created it.
Tape Drives
• The basic operations for a tape drive differ from those of a disk drive.
- locate positions the tape at a specific logical block, not an entire track (it corresponds to seek).
- The read position operation returns the logical block number where the tape head is.
- The space operation enables relative motion.
- Most tape drives have a variable block size, which is determined when the block is written. If an area of defective tape is encountered during writing, that area is skipped and the block is written again.
• Tape drives are "append-only" devices; updating a block in the middle of the tape also effectively erases everything beyond that block.
• An EOT (end-of-tape) mark is placed after a block that is written.
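The append-only behavior described above can be sketched with a toy model. The method names mirror the operations listed (locate, read position, space), but the model itself is a hypothetical illustration, not a real tape driver:

```python
class TapeDrive:
    """Toy append-only tape: overwriting a block erases everything beyond it."""

    def __init__(self):
        self.blocks = []     # blocks written so far; EOT sits after the last one
        self.head = 0        # current logical block number

    def locate(self, block_number):
        # Position the tape at a specific logical block (like seek on a disk).
        self.head = block_number

    def read_position(self):
        # Logical block number currently under the tape head.
        return self.head

    def space(self, count):
        # Relative motion: move forward (or backward) by `count` blocks.
        self.head += count

    def write(self, data):
        # Writing at the head truncates every block beyond it, then appends,
        # so the EOT mark conceptually follows the block just written.
        self.blocks = self.blocks[:self.head]
        self.blocks.append(data)
        self.head += 1
```

For example, after writing blocks "a", "b", "c", doing `locate(1)` and writing "X" leaves only ["a", "X"] on the tape: the update in the middle erased everything beyond it.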
File Naming
• The issue of naming files on removable media is especially difficult when we want to write data on a removable cartridge on one computer and then use the cartridge in another computer.
- Contemporary OSs generally leave the name-space problem unsolved for removable media and depend on applications and users to figure out how to access and interpret the data.
• Some kinds of removable media (e.g., CDs and DVDs) are so well standardized that all computers use them the same way.
Hierarchical Storage Management (HSM)
• A hierarchical storage system extends the storage hierarchy beyond primary memory and secondary storage to incorporate tertiary storage — usually implemented as a jukebox of tapes or removable disks.
• Tertiary storage is usually incorporated by extending the file system.
- Small and frequently used files remain on disk.
- Large, old, inactive files are archived to the jukebox.
• HSM is usually found in supercomputing centers and other large installations that have enormous volumes of data.
Performance Issues: Speed, Reliability, Cost

Speed
• Two aspects of speed in tertiary storage are bandwidth and latency.
• Bandwidth is measured in bytes per second.
- Sustained bandwidth – average data rate during a large transfer (number of bytes / transfer time); the data rate when the data stream is actually flowing.
- Effective bandwidth – average over the entire I/O time, including seek or locate and cartridge switching; the drive's overall data rate.
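The sustained-versus-effective distinction can be sketched numerically (the transfer size and timings below are made up for illustration):

```python
def sustained_bandwidth(bytes_moved, streaming_seconds):
    # Rate while the data stream is actually flowing.
    return bytes_moved / streaming_seconds

def effective_bandwidth(bytes_moved, total_io_seconds):
    # Average over the entire I/O, including locate and cartridge switching.
    return bytes_moved / total_io_seconds

data = 1_000_000_000                        # a 1-GB transfer
print(sustained_bandwidth(data, 100))       # → 10000000.0 bytes/s (10 MB/s)
print(effective_bandwidth(data, 100 + 60))  # → 6250000.0 bytes/s (6.25 MB/s)
```

Sixty seconds of locate and cartridge switching cuts the drive's overall data rate well below its streaming rate, which is why effective bandwidth is the more honest figure for tertiary storage.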
• Access latency – the amount of time needed to locate data.
- Access time for a disk – move the arm to the selected cylinder and wait for the rotational latency; less than 5 milliseconds.
- Access on tape requires winding the tape reels until the selected block reaches the tape head; tens or hundreds of seconds.
- Generally, random access within a tape cartridge is about a thousand times slower than random access on disk.
• If a jukebox is involved, the access latency is much higher.
• The low cost of tertiary storage is a result of having many cheap cartridges share a few expensive drives.
• A removable library is best devoted to the storage of infrequently used data, because the library can satisfy only a relatively small number of I/O requests per hour.
Reliability
• A fixed disk drive is likely to be more reliable than a removable disk or tape drive.
• An optical cartridge is likely to be more reliable than a magnetic disk or tape.
• A head crash in a fixed hard disk generally destroys the data, whereas the failure of a tape drive or optical disk drive often leaves the data cartridge unharmed.
Cost
• Main memory is much more expensive than disk storage.
• The cost per megabyte of hard disk storage is competitive with magnetic tape if only one tape is used per drive.
• The cheapest tape drives and the cheapest disk drives have had about the same storage capacity over the years.
• Tertiary storage gives a cost savings only when the number of cartridges is considerably larger than the number of drives.
Price per Megabyte of DRAM, 1981 to 2004
Price per Megabyte of Magnetic Hard Disk, 1981 to 2004
Price per Megabyte of a Tape Drive, From 1984-2000
The dramatic fall in disk prices (four orders of magnitude) has made the price per MB of a magnetic disk drive approach that of a tape cartridge without the tape drive. This has largely rendered tertiary storage obsolete.