15-440 Distributed Systems
Lecture 1 – Introduction to Distributed Systems
What Is A Distributed System?
“A collection of independent computers that appears to its
users as a single coherent system.”
•Features:
• No shared memory – message-based communication
• Each runs its own local OS
• Heterogeneity
• Expandability
•Ideal: to present a single-system image:
• The distributed system “looks like” a single computer
rather than a collection of separate computers.
Definition of a Distributed System
Figure 1-1. A distributed system organized as middleware. The middleware layer runs on all machines and offers a uniform interface to the system.
Distributed Systems: Goals
• Resource Availability: remote access to resources
• Distribution Transparency: single system image
• Access, Location, Migration, Replication, Failure,…
• Openness: services according to standards (RPC)
• Scalability: size, geographic, admin domains, …
• Example of a Distributed System?
• Web search on Google
• DNS: decentralized, scalable, robust to failures, ...
• ...
15-440 Distributed Systems
Lecture 2 & 3 – 15-441 in 2 Days
Packet Switching –
Statistical Multiplexing
• Switches arbitrate between inputs
• Can send from any input that’s ready
• Links never idle when traffic to send
• (Efficiency!)
Model of a communication channel
• Latency - how long does it take for the first bit to reach
destination
• Capacity - how many bits/sec can we push through?
(often termed “bandwidth”)
• Jitter - how much variation in latency?
• Loss / Reliability - can the channel drop packets?
• Reordering
Packet Switching
• Source sends information as self-contained packets that
have an address.
• Source may have to break up a single message into multiple packets
• Each packet travels independently to the destination host.
• Switches use the address in the packet to determine how to
forward the packets
• Store and forward
• Analogy: a letter in surface mail.
Internet
• An inter-net: a network of
networks.
• Networks are connected using
routers that support
communication in a hierarchical
fashion
• Often need other special devices
at the boundaries for security,
accounting, ..
Internet
• The Internet: the interconnected
set of networks of the Internet
Service Providers (ISPs)
• About 17,000 different networks
make up the Internet
Network Service Model
• What is the service model for inter-network?
• Defines what promises that the network gives for any
transmission
• Defines what type of failures to expect
• Ethernet/Internet: best-effort – packets can get
lost, etc.
Possible Failure models
• Fail-stop:
• When something goes wrong, the process stops / crashes /
etc.
• Fail-slow or fail-stutter:
• Performance may vary on failures as well
• Byzantine:
• Anything that can go wrong, will.
• Including malicious entities taking over your computers and
making them do whatever they want.
• These models are useful for proving things;
• The real world typically has a bit of everything.
• Deciding which model to use is important!
What is Layering?
• Modular approach to network functionality
• Example:
Application
Application-to-application channels
Host-to-host connectivity
Link hardware
IP Layering
• Relatively simple: five layers – Application, Transport, Network, Link, Physical
• Hosts implement all five layers; a bridge/switch implements only the link and physical layers; a router/gateway adds the network layer
Protocol Demultiplexing
• Multiple choices at each layer
• Each layer carries a demultiplexing key: the link-layer type field selects the network protocol (IP, IPX, NET1 … NETn), the IP protocol field selects the transport (TCP or UDP), and the TCP/UDP port number selects the application (FTP, HTTP, TFTP, NV, …)
Goals [Clark88]
0. Connect existing networks (initially ARPANET and the ARPA packet radio network)
1. Survivability: ensure communication service even in the presence of network and router failures
2. Support multiple types of services
3. Must accommodate a variety of networks
4. Allow distributed management
5. Allow host attachment with a low level of effort
6. Be cost effective
7. Allow resource accountability
Goal 1: Survivability
• If network is disrupted and reconfigured…
• Communicating entities should not care!
• No higher-level state reconfiguration
• How to achieve such reliability?
• Where can communication state be stored?
                    Network           Host
Failure handling:   Replication       "Fate sharing"
Net engineering:    Tough             Simple
Switches:           Maintain state    Stateless
Host trust:         Less              More
CIDR IP Address Allocation
• Provider is given 201.10.0.0/21
• The provider carves it into customer blocks: 201.10.0.0/22, 201.10.4.0/24, 201.10.5.0/24, and 201.10.6.0/23 (a quick Go check of this follows)
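As a quick sanity check of the sub-allocation above (my own illustration, not from the slides), the Go standard library can parse the prefixes and confirm that each customer block falls inside the provider's /21:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// Provider block from the slide.
	_, provider, _ := net.ParseCIDR("201.10.0.0/21")

	// Sub-allocations carved out of the /21.
	subs := []string{"201.10.0.0/22", "201.10.4.0/24", "201.10.5.0/24", "201.10.6.0/23"}
	for _, s := range subs {
		ip, _, err := net.ParseCIDR(s)
		if err != nil {
			panic(err)
		}
		// Contains reports whether the sub-block's base address lies inside the /21.
		fmt.Printf("%s inside %s: %v\n", s, provider, provider.Contains(ip))
	}
}
```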
Ethernet Frame Structure (cont.)
• Addresses:
• 6 bytes
• Each adapter is given a globally unique address
at manufacturing time
• Address space is allocated to manufacturers
• 24 bits identify manufacturer
• E.g., 0:0:15:* → 3Com adapter
• Frame is received by all adapters on a LAN and
dropped if address does not match
• Special addresses
• Broadcast – FF:FF:FF:FF:FF:FF is “everybody”
• Range of addresses allocated to multicast
• Adapter maintains list of multicast groups node is
interested in
End-to-End Argument
• Deals with where to place functionality
• Inside the network (in switching elements)
• At the edges
• Argument
• If you have to implement a function end-to-end anyway
(e.g., because it requires the knowledge and help of the
end-point host or application), don’t implement it
inside the communication system
• Unless there’s a compelling performance enhancement
• Key motivation for the split of functionality between TCP/UDP and IP
Further Reading: “End-to-End Arguments in System Design.” Saltzer, Reed,
and Clark.
User Datagram Protocol (UDP): An Analogy
UDP:
• Single socket to receive messages
• No guarantee of delivery
• Not necessarily in-order delivery
• Datagram – independent packets
• Must address each packet
Postal Mail:
• Single mailbox to receive letters
• Unreliable
• Not necessarily in-order delivery
• Letters sent independently
• Must address each reply
Example UDP applications: multimedia, voice over IP (a minimal Go sketch follows)
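To make the datagram model concrete, here is a minimal Go sketch (not course code; the loopback address and port 9999 are arbitrary choices) that sends one individually addressed datagram and receives it on a single socket:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// Listener: one socket receives datagrams from any sender.
	conn, err := net.ListenPacket("udp", "127.0.0.1:9999") // port chosen arbitrarily
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Sender: every datagram is addressed individually; delivery and ordering
	// are not guaranteed by UDP itself.
	dst, _ := net.ResolveUDPAddr("udp", "127.0.0.1:9999")
	sender, _ := net.DialUDP("udp", nil, dst)
	sender.Write([]byte("hello"))
	sender.Close()

	buf := make([]byte, 1500)
	n, from, _ := conn.ReadFrom(buf)
	fmt.Printf("got %q from %v\n", buf[:n], from)
}
```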
Transmission Control Protocol (TCP): An Analogy
TCP:
• Reliable – guaranteed delivery
• Byte stream – in-order delivery
• Connection-oriented – single socket per connection
• Setup connection followed by data transfer
Telephone Call:
• Guaranteed delivery
• In-order delivery
• Connection-oriented
• Setup connection followed by conversation
Example TCP applications: Web, Email, Telnet (a minimal Go sketch follows)
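And the connection-oriented counterpart: a minimal Go sketch (again my own illustration; the loopback address is an arbitrary choice) that sets up a connection and then transfers bytes reliably and in order over it:

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

func main() {
	// "Setup connection followed by data transfer": the listener accepts a
	// connection, then bytes arrive reliably and in order on that connection.
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the OS pick one
	if err != nil {
		panic(err)
	}
	done := make(chan struct{})
	go func() {
		defer close(done)
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		line, _ := bufio.NewReader(c).ReadString('\n')
		fmt.Print("server got: ", line)
	}()

	c, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	fmt.Fprintln(c, "hello over a byte stream") // single connection, in-order bytes
	c.Close()
	<-done
}
```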
15-440 Distributed Systems
Lecture 5 – Classical Synchronization
Classic synchronization primitives
• Basics of concurrency
• Correctness (achieves Mutex, no deadlock, no livelock)
• Efficiency, no spinlocks or wasted resources
• Fairness
• Synchronization mechanisms
• Semaphores (P() and V() operations)
• Mutex (binary semaphore)
• Condition Variables (allows a thread to sleep)
• Must be accompanied by a mutex
• Wait and Signal operations
• Work through examples again + Go primitives (a minimal Go sketch follows)
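As a refresher on how a mutex and a condition variable work together, here is a minimal Go sketch (illustrative only, not course code) of a producer/consumer queue in which consumers sleep on a condition variable instead of spinning:

```go
package main

import (
	"fmt"
	"sync"
)

// Unbounded producer/consumer queue: the condition variable must always be
// used together with its mutex, and waiters re-check the condition in a loop.
type queue struct {
	mu       sync.Mutex
	notEmpty *sync.Cond
	items    []int
}

func newQueue() *queue {
	q := &queue{}
	q.notEmpty = sync.NewCond(&q.mu)
	return q
}

func (q *queue) Put(v int) {
	q.mu.Lock()
	q.items = append(q.items, v)
	q.notEmpty.Signal() // wake one sleeping consumer
	q.mu.Unlock()
}

func (q *queue) Get() int {
	q.mu.Lock()
	for len(q.items) == 0 { // re-check the condition after every wakeup
		q.notEmpty.Wait() // releases the mutex while sleeping: no spinning
	}
	v := q.items[0]
	q.items = q.items[1:]
	q.mu.Unlock()
	return v
}

func main() {
	q := newQueue()
	go q.Put(42)
	fmt.Println(q.Get())
}
```

Note that Wait atomically releases the mutex while the goroutine sleeps and re-acquires it on wakeup, which is why the condition is re-checked in a loop.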
15-440 Distributed Systems
Lecture 6 – RPC
RPC Goals
• Ease of programming
• Hide complexity
• Automates task of implementing distributed
computation
• Familiar model for programmers (just make a
function call)
Historical note: Seems obvious in retrospect, but RPC was only invented in the ‘80s. See
Birrell & Nelson, “Implementing Remote Procedure Call” ... or
Bruce Nelson, Ph.D. Thesis, Carnegie Mellon University: Remote Procedure Call., 1981 :)
Passing Value Parameters (1)
• The steps involved in doing a remote computation through RPC.
Stubs: obtaining transparency
• From the API, the compiler generates stubs for the procedure on both the client and the server
• Client stub
  • Marshals arguments into machine-independent format
  • Sends request to server
  • Waits for response
  • Unmarshals result and returns to caller
• Server stub
• Unmarshals arguments and builds stack frame
• Calls procedure
• Server stub marshals results and sends the reply (a minimal Go sketch using net/rpc follows)
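For a concrete feel of the stub structure, here is a hedged sketch using Go's net/rpc package (the Calc/Multiply service and the addresses are made up for illustration); Register/Accept play the role of the server stub and Dial/Call the role of the client stub, with marshaling handled underneath:

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// Args and Multiply form the server-side procedure. Go does not generate
// stubs from an IDL, but Register/Call play the same roles.
type Args struct{ A, B int }

type Calc int

func (c *Calc) Multiply(args Args, reply *int) error {
	*reply = args.A * args.B
	return nil
}

func main() {
	// "Server stub": register the service and serve connections.
	srv := rpc.NewServer()
	srv.Register(new(Calc))
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go srv.Accept(ln)

	// "Client stub": marshal args, send the request, wait, unmarshal the result.
	client, err := rpc.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	var product int
	if err := client.Call("Calc.Multiply", Args{6, 7}, &product); err != nil {
		panic(err)
	}
	fmt.Println("6 * 7 =", product)
}
```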
Real solution: break transparency
• Possible semantics for RPC:
• Exactly-once
• Impossible in practice
• At least once:
• Only for idempotent operations
• At most once
• Zero, don’t know, or once
• Zero or once
• Transactional semantics
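To make the "at most once" case above concrete, here is a small illustrative Go sketch (the names are my own) in which the server caches replies keyed by client and sequence number, so a retransmitted request is answered from the cache rather than re-executed:

```go
package main

import "fmt"

// Reply cache for at-most-once semantics: a re-sent request (same client ID
// and sequence number) returns the cached reply instead of running again.
type reqKey struct {
	client string
	seq    int
}

type server struct {
	seen  map[reqKey]string // cached replies
	state int
}

func (s *server) handle(client string, seq int, amount int) string {
	k := reqKey{client, seq}
	if reply, ok := s.seen[k]; ok {
		return reply // duplicate: the operation is not executed again
	}
	s.state += amount // non-idempotent operation runs at most once
	reply := fmt.Sprintf("balance=%d", s.state)
	s.seen[k] = reply
	return reply
}

func main() {
	s := &server{seen: make(map[reqKey]string)}
	fmt.Println(s.handle("alice", 1, 100)) // balance=100
	fmt.Println(s.handle("alice", 1, 100)) // retransmission: still balance=100
}
```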
Asynchronous RPC (3)
• A client and server interacting through
two asynchronous RPCs.
Important Lessons
• Procedure calls
• Simple way to pass control and data
• Elegant transparent way to distribute application
• Not only way…
• Hard to provide true transparency
  • Failures
  • Performance
  • Memory access
  • Etc.
• How to deal with a hard problem → give up and let the programmer deal with it
• “Worse is better”
15-440 Distributed Systems
Lecture 7,8 – Distributed File Systems
Why DFSs?
• Why Distributed File Systems:
  • Data sharing among multiple users
  • User mobility
  • Location transparency
  • Backups and centralized management
  • Examples: NFS (v1 – v4), AFS, CODA, LBFS
• Idea: provide file system interfaces to remote FSs
  • Challenges: heterogeneity, scale, security, concurrency, …
  • Non-challenges: AFS meant for a campus community
  • Virtual File Systems: pluggable file systems
  • Use RPCs
NFS vs AFS Goals
• AFS: Global distributed file system
  • “One AFS”, like “one Internet”
  • LARGE numbers of clients and servers (1000s cache files)
  • Global namespace (organized as cells) => location transparency
  • Clients w/ disks => cache; write sharing rare; callbacks
  • Open-to-close consistency (session semantics)
• NFS: Very popular Network File System
  • NFSv4 meant for wide area
  • Naming: per-client view (/home/yuvraj/…)
  • Cache data in memory, not on disk; write-through cache
  • Consistency model: buffer data (eventual, ~30 seconds)
  • Requires significant resources as users scale
DFS Important bits (1)
• Distributed filesystems almost always involve a
tradeoff: consistency, performance, scalability.
• We’ve learned a lot since NFS and AFS (and can
implement faster, etc.), but the general lesson
holds. Especially in the wide-area.
• We’ll see a related tradeoff, also involving
consistency, in a while: the CAP tradeoff.
Consistency, Availability, Partition-resilience.
DFS Important Bits (2)
• Client-side caching is a fundamental technique to
improve scalability and performance
• But raises important questions of cache consistency
• Timeouts and callbacks are common methods for
providing (some forms of) consistency.
• AFS picked close-to-open consistency as a good
balance of usability (the model seems intuitive to
users), performance, etc.
• AFS authors argued that apps with highly concurrent,
shared access, like databases, needed a different
model
Coda Summary
• Distributed File System built for mobility
• Disconnected operation key idea
• Puts scalability and availability before
data consistency
• Unlike NFS
• Assumes that inconsistent updates are very
infrequent
• Introduced disconnected operation mode and file
hoarding and the idea of “reintegration”
Low Bandwidth File System: Key Ideas
• A network file system for slow or wide-area networks
• Exploits similarities between files
  • Avoids sending data that can be found in the server’s file system or the client’s cache
• Uses Rabin fingerprints on file content to define chunk boundaries
  • Can deal with byte offsets when part of a file changes (a simplified chunking sketch follows)
• Also uses conventional compression and caching
• Requires 90% less bandwidth than traditional network file systems
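The chunking sketch below is my own simplification: it uses a toy rolling sum over a small window rather than the Rabin fingerprints LBFS actually uses, and the window and mask values are arbitrary, but it shows the key property that chunk boundaries are determined by content, so an edit shifts only nearby chunks:

```go
package main

import "fmt"

// Content-defined chunking in the spirit of LBFS: boundaries depend on the
// data itself, so inserting bytes early in a file disturbs only the chunks
// around the edit.
const (
	window = 16   // rolling window size (toy value)
	mask   = 0x1F // boundary on average every ~32 bytes (toy value)
)

func chunks(data []byte) [][]byte {
	var out [][]byte
	var sum uint32
	start := 0
	for i, b := range data {
		sum += uint32(b)
		if i >= window {
			sum -= uint32(data[i-window]) // slide the window
		}
		// Declare a boundary when the low bits of the rolling value hit a
		// fixed pattern; identical content yields identical boundaries.
		if i-start+1 >= window && sum&mask == mask {
			out = append(out, data[start:i+1])
			start = i + 1
		}
	}
	if start < len(data) {
		out = append(out, data[start:])
	}
	return out
}

func main() {
	data := []byte("example file contents that would be split into content-defined chunks...")
	for _, c := range chunks(data) {
		fmt.Printf("chunk of %d bytes\n", len(c))
	}
}
```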
15-440 Distributed Systems
Lecture 9 – Time Synchronization
Impact of Clock Synchronization
• When each machine has its own clock, an event
that occurred after another event may nevertheless
be assigned an earlier time.
Clocks in a Distributed System
• Computer clocks are not generally in perfect agreement
  • Skew: the difference between the times on two clocks (at any instant)
• Computer clocks are subject to clock drift (they count time at different rates)
  • Clock drift rate: the difference per unit of time from some ideal reference clock
  • Ordinary quartz clocks drift by about 1 sec in 11-12 days (10^-6 secs/sec)
  • High-precision quartz clocks have a drift rate of about 10^-7 or 10^-8 secs/sec
Perfect networks
• Messages always arrive, with propagation delay
exactly d
• Sender sends time T in a message
• Receiver sets clock to T+d
• Synchronization is exact
Cristian’s Time Sync
• A time server S receives signals from a UTC source
• Process p requests the time in message mr and receives t in message mt from S
• p sets its clock to t + RTT/2
• Accuracy ± (RTT/2 - min):
  • because the earliest time S could have put t in message mt is min after p sent mr
  • and the latest time was min before mt arrived at p
  • so the time by S’s clock when mt arrives is in the range [t + min, t + RTT - min]
• Tround is the round trip time recorded by p; min is an estimated minimum round trip time
(a small Go sketch of the t + RTT/2 rule follows)
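A small illustrative Go sketch of the t + RTT/2 rule (the simulated server, its 2-second offset, and the delays are invented for the example):

```go
package main

import (
	"fmt"
	"time"
)

// Cristian-style adjustment: the client records the round trip, assumes the
// server's timestamp was taken halfway through it, and sets its clock to
// t + RTT/2. queryServer is a stand-in for the real request/reply exchange.
func queryServer(send func() time.Time) (serverTime time.Time, rtt time.Duration) {
	start := time.Now()
	serverTime = send() // request goes out, reply carries t by S's clock
	rtt = time.Since(start)
	return serverTime, rtt
}

func main() {
	// Pretend the server's clock runs 2s ahead and the network adds delay.
	fakeServer := func() time.Time {
		time.Sleep(30 * time.Millisecond) // simulated round-trip delay
		return time.Now().Add(2 * time.Second)
	}

	t, rtt := queryServer(fakeServer)
	estimate := t.Add(rtt / 2) // Cristian's rule: t + RTT/2
	fmt.Println("server said:", t)
	fmt.Println("rtt:", rtt)
	fmt.Println("client sets clock to:", estimate)
	// Accuracy bound from the slide: ±(RTT/2 - min) for some minimum delay min.
}
```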
Berkeley Algorithm
• Problems with Cristian’s algorithm:
  • a single time server might fail, so they suggest the use of a group of synchronized servers
  • it does not deal with faulty servers
• Berkeley algorithm (also 1989):
  • An algorithm for internal synchronization of a group of computers
  • A master polls to collect clock values from the others (slaves)
  • The master uses round trip times to estimate the slaves’ clock values
  • It takes an average (eliminating any with an above-average round trip time or a faulty clock)
  • It sends the required adjustment to the slaves (better than sending the time, which depends on the round trip time)
• Measurements
  • 15 computers, clock synchronization 20-25 millisecs, drift rate < 2x10^-5
• If the master fails, a new master can be elected to take over (not in bounded time)
NTP Protocol
• All modes use UDP
• Each message bears timestamps of recent events:
  • Local times of Send and Receive of the previous message
  • Local time of Send of the current message
• Recipient notes the time of receipt T3, so we have T0, T1, T2, T3 (client send, server receive, server send, client receive)
(a small Go sketch of the standard offset/delay estimate follows)
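The slide stops at collecting T0–T3; the standard NTP estimates computed from them (well known, though not spelled out on the slide) are offset ≈ ((T1-T0)+(T2-T3))/2 and delay = (T3-T0)-(T2-T1). A small Go sketch with invented timestamps:

```go
package main

import (
	"fmt"
	"time"
)

// Standard NTP-style estimates from the four timestamps on the slide:
//   T0 = client send, T1 = server receive, T2 = server send, T3 = client receive.
// offset ≈ ((T1-T0) + (T2-T3)) / 2, delay = (T3-T0) - (T2-T1).
func ntpEstimate(t0, t1, t2, t3 time.Time) (offset, delay time.Duration) {
	offset = (t1.Sub(t0) + t2.Sub(t3)) / 2
	delay = t3.Sub(t0) - t2.Sub(t1)
	return
}

func main() {
	base := time.Now()
	// Hypothetical exchange: server clock 100ms ahead, ~20ms each way.
	t0 := base
	t1 := base.Add(20*time.Millisecond + 100*time.Millisecond)
	t2 := t1.Add(1 * time.Millisecond)
	t3 := base.Add(41 * time.Millisecond)
	offset, delay := ntpEstimate(t0, t1, t2, t3)
	fmt.Println("estimated offset:", offset, "round-trip delay:", delay)
}
```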
Logical time and logical clocks (Lamport 1978)
[Figure: processes p1, p2, p3 with events a, b, c, d, e, f along physical time; message m1 goes from b to c, message m2 from d to f]
• Instead of synchronizing clocks, event ordering can be used:
1. If two events occurred at the same process pi (i = 1, 2, … N), then they occurred in the order observed by pi; this defines the order →i
2. When a message m is sent between two processes, send(m) happens before receive(m)
3. The happened-before relation is transitive
• The happened-before relation is the relation of causal ordering
Total-order Lamport clocks
• Many systems require a total-ordering of events,
not a partial-ordering
• Use Lamport’s algorithm, but break ties using the
process ID
• L(e) = M * Li(e) + i
• M = maximum number of processes
• i = process ID
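A minimal Go sketch of Lamport clocks with the total-order tiebreak L(e) = M*Li(e) + i from above (M = 100 and the event sequence are arbitrary choices for illustration):

```go
package main

import "fmt"

// Lamport clock sketch (illustrative, not course code). Each process keeps a
// counter; local events and sends increment it, and a receive takes
// max(local, received) + 1. Ties in the global order are broken by process ID.
const M = 100 // assumed maximum number of processes

type process struct {
	id    int
	clock int
}

func (p *process) localEvent() int { p.clock++; return p.clock }
func (p *process) send() int       { p.clock++; return p.clock } // timestamp carried on the message
func (p *process) receive(ts int) int {
	if ts > p.clock {
		p.clock = ts
	}
	p.clock++
	return p.clock
}

// total maps a (Lamport timestamp, process ID) pair to a totally ordered value.
func total(li, i int) int { return M*li + i }

func main() {
	p1 := &process{id: 1}
	p2 := &process{id: 2}

	t := p1.send()       // p1: send with timestamp 1
	r := p2.receive(t)   // p2: receive gets max(0,1)+1 = 2
	e := p1.localEvent() // p1: concurrent local event, also timestamp 2

	// Both events have Lamport timestamp 2; the process ID breaks the tie.
	fmt.Println("p1 event:  ", total(e, p1.id)) // 201
	fmt.Println("p2 receive:", total(r, p2.id)) // 202
}
```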
Vector Clocks
[Figure: p1 has events a=(1,0,0), b=(2,0,0); p2 has c=(2,1,0), d=(2,2,0); p3 has e=(0,0,1), f=(2,2,2); message m1 goes from b to c, m2 from d to f]
• Note that e → e’ implies V(e) < V(e’). The converse is also true.
• Can you see a pair of parallel events?
  • c || e (parallel), because neither V(c) <= V(e) nor V(e) <= V(c)
(a small Go sketch of the update and comparison rules follows)
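A minimal Go sketch (my own illustration) of the vector-clock update rules and the comparison used to detect the parallel pair c || e from the figure:

```go
package main

import "fmt"

// Vector clock sketch: V[i] counts events known to have happened at process i.
// e -> e' iff V(e) <= V(e') componentwise with at least one strict inequality;
// if neither vector dominates the other, the events are concurrent.
type vclock []int

func (v vclock) tick(i int) vclock { // local event or send at process i
	w := append(vclock(nil), v...)
	w[i]++
	return w
}

func (v vclock) merge(other vclock, i int) vclock { // receive at process i
	w := append(vclock(nil), v...)
	for k := range w {
		if other[k] > w[k] {
			w[k] = other[k]
		}
	}
	w[i]++
	return w
}

func leq(a, b vclock) bool {
	for k := range a {
		if a[k] > b[k] {
			return false
		}
	}
	return true
}

func concurrent(a, b vclock) bool { return !leq(a, b) && !leq(b, a) }

func main() {
	c := vclock{2, 1, 0} // event c from the figure
	e := vclock{0, 0, 1} // event e from the figure
	fmt.Println("c || e ?", concurrent(c, e)) // true: neither dominates
}
```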
15-440 Distributed Systems
Lecture 10 – Mutual Exclusion
Example: Totally-Ordered Multicasting
• Two replicas of an account holding $1,000: a San Francisco customer adds $100 while the New York bank adds 1% interest
• If the two updates are applied in different orders, San Francisco ends with $1,111 and New York with $1,110
• Updating a replicated database this way leaves it in an inconsistent state
• Can use Lamport clocks to totally order the updates
Mutual Exclusion: A Centralized Algorithm (1)

@ Server:
    while true:
        m = Receive()
        if m == (Request, i):
            if Available():
                Send(Grant) to i

@ Client → Acquire:
    Send(Request, i) to coordinator
    Wait for reply
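The pseudocode above translates naturally into a small Go sketch (illustrative only) in which a coordinator goroutine grants the critical section to one requester at a time:

```go
package main

import "fmt"

// Centralized mutual-exclusion sketch: clients send a request carrying a
// reply channel, wait for the grant, and send a release when done.
type request struct {
	id    int
	grant chan struct{}
}

func coordinator(requests chan request, release chan int) {
	for req := range requests {
		req.grant <- struct{}{} // grant the critical section to one client
		<-release               // wait until that client releases it
	}
}

func client(id int, requests chan request, release chan int, done chan struct{}) {
	g := make(chan struct{})
	requests <- request{id, g} // Acquire: send (Request, i) to the coordinator
	<-g                        // wait for the Grant
	fmt.Println("client", id, "in critical section")
	release <- id // Release
	done <- struct{}{}
}

func main() {
	requests := make(chan request)
	release := make(chan int)
	done := make(chan struct{})
	go coordinator(requests, release)
	for i := 1; i <= 3; i++ {
		go client(i, requests, release, done)
	}
	for i := 0; i < 3; i++ {
		<-done
	}
}
```

A real implementation would also need a queue of pending requests and some way to handle client or coordinator failure.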
Distributed Algorithm Strawman
• Assume that there are n coordinators
• Access requires a majority vote from m > n/2
coordinators.
• A coordinator always responds immediately to a
request with GRANT or DENY
• Node failures are still a problem
• Large numbers of nodes requesting access can
affect availability
A Distributed Algorithm 2
(Lamport Mutual Exclusion)
• Every process maintains a queue of pending requests for
entering critical section in order. The queues are ordered
by virtual time stamps derived from Lamport timestamps
• For any events e, e' such that e --> e' (causality ordering), T(e) <
T(e')
• For any distinct events e, e', T(e) != T(e')
• When node i wants to enter C.S., it sends time-stamped
request to all other nodes (including itself)
• Wait for replies from all other nodes.
• If own request is at the head of its queue and all replies have been
received, enter C.S.
• Upon exiting C.S., remove its request from the queue and send a
release message to every process.
A Distributed Algorithm 3
(Ricart & Agrawala)
• Also relies on Lamport totally ordered clocks.
• When node i wants to enter C.S., it sends timestamped request to all other nodes. These other
nodes reply (eventually). When i receives n-1
replies, then can enter C.S.
• Trick: Node j having earlier request doesn't reply
to i until after it has completed its C.S.
A Token Ring Algorithm
• Organize the processes involved into a logical ring
• One token at any time → passed from node to node along the ring
A Token Ring Algorithm
• Correctness:
• Clearly safe: Only one process can hold token
• Fairness:
• Will pass around ring at most once before getting
access.
• Performance:
  • Each cycle requires between 1 and ∞ messages
  • Latency of the protocol is between 0 and n-1 messages
• Issues
• Lost token
Summary
• Lamport algorithm demonstrates how distributed
processes can maintain consistent replicas of a
data structure (the priority queue).
• Ricart & Agrawala's algorithms demonstrate utility
of logical clocks.
• Centralized & ring-based algorithms have much lower message counts
• None of these algorithms can tolerate failed
processes or dropped messages.
15-440 Distributed Systems
Lecture 11 – Concurrency, Transactions
Distributed Concurrency Management
• Multiple objects, multiple servers (ignore failure for now)
  • Single server: transactions (reads/writes to global state)
  • ACID: Atomicity, Consistency, Isolation, Durability
  • Learn what these mean in the context of transactions
  • E.g. banking app => ACID is violated if not careful
  • Solutions: 2-phase locking (general, strict, strong strict)
• Dealing with deadlocks => build a “waits-for” graph
• Transactions: 2 phases (prepare, commit/abort)
  • Preparation: generate lock set “L”, updates “U”
  • COMMIT (update global state), ABORT (leave state as is)
  • Example using banking app
Distributed Transactions – 2PC
• Similar idea as before, but:
  • State spread across servers (maybe even a WAN)
  • Want to enable single transactions to read and update global state while maintaining ACID properties
• Overall idea:
  • Client initiates the transaction and makes use of a “coordinator”
  • All other relevant servers operate as “participants”
  • Coordinator assigns a unique transaction ID (TID)
• 2-Phase Commit
  • Prepare & Vote phase (participants determine their state, talk to the coordinator)
  • Commit/Abort phase (coordinator broadcasts the decision to the participants)
(a minimal Go sketch of the coordinator logic follows)
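A minimal Go sketch of the coordinator's two phases (illustrative only: no logging, timeouts, or failure handling, and the participant interface is invented):

```go
package main

import "fmt"

// Two-phase commit coordinator sketch. Participants vote in phase 1; the
// coordinator broadcasts COMMIT only if every vote is yes, otherwise ABORT.
type participant struct {
	name    string
	prepare func(tid int) bool // phase 1: "can you commit?" (vote)
	finish  func(tid int, commit bool)
}

func twoPhaseCommit(tid int, parts []participant) bool {
	// Phase 1: prepare & vote.
	allYes := true
	for _, p := range parts {
		if !p.prepare(tid) {
			allYes = false
		}
	}
	// Phase 2: broadcast the decision to every participant.
	for _, p := range parts {
		p.finish(tid, allYes)
	}
	return allYes
}

func main() {
	mk := func(name string, vote bool) participant {
		return participant{
			name:    name,
			prepare: func(tid int) bool { return vote },
			finish: func(tid int, commit bool) {
				fmt.Printf("%s: tid=%d commit=%v\n", name, tid, commit)
			},
		}
	}
	ok := twoPhaseCommit(42, []participant{mk("serverA", true), mk("serverB", true)})
	fmt.Println("transaction committed:", ok)
}
```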
15-440 Distributed Systems
Lecture 12 – Logging and Crash Recovery
Summary – Fault Tolerance
• Real Systems (are often unreliable)
• Introduced basic concepts for Fault Tolerant Systems
including redundancy, process resilience, RPC
• Fault Tolerance – Backward recovery using checkpointing, both independent and coordinated
• Fault Tolerance – Recovery using Write-Ahead Logging, which balances the overhead of checkpointing against the ability to recover to a consistent state
Dependability Concepts
• Availability – the system is ready to be used immediately.
• Reliability – the system runs continuously without failure.
• Safety – if a system fails, nothing catastrophic will happen (e.g. process control systems).
• Maintainability – when a system fails, it can be repaired easily and quickly (sometimes without its users noticing the failure). (Also called Recovery.)
• What’s a failure?: a system that cannot meet its goals => faults
• Faults can be: transient, intermittent, permanent
Masking Failures by Redundancy
• Strategy: hide the occurrence of failure from other processes using redundancy.
1. Information Redundancy – add extra bits to allow for error detection/recovery (e.g., Hamming codes and the like).
2. Time Redundancy – perform an operation and, if need be, perform it again. Think about how transactions work (BEGIN/END/COMMIT/ABORT).
3. Physical Redundancy – add extra (duplicate) hardware and/or software to the system.
Recovery Strategies
• When a failure occurs, we need to bring the system into an error-free state (recovery). This is fundamental to fault tolerance.
1. Backward Recovery: return the system to some previous correct state (using checkpoints), then continue executing.
   -- Can be expensive, but still used
2. Forward Recovery: bring the system into a correct new state from which it can then continue to execute.
   -- Need to know the potential errors up front!
Independent Checkpointing
The domino effect – cascaded rollback: P2 crashes and rolls back, but the two checkpoints are inconsistent (P2 shows m received, but P1 does not show m sent)
Shadow Paging Vs WAL
• Shadow Pages
• Provide Atomicity and Durability, “page” = unit of storage
• Idea: When writing a page, make a “shadow” copy
• No references from other pages, edit easily!
• ABORT: discard shadow page
• COMMIT: Make shadow page “real”. Update pointers to
data on this page from other pages (recursive). Can be
done atomically
Shadow Paging vs WAL
• Write-Ahead Logging
  • Provides Atomicity and Durability
  • Idea: create a log recording every update to the database
  • Updates considered reliable when stored on disk
  • Updated versions are kept in memory (page cache)
  • Logs typically store both REDO and UNDO operations
  • After a crash, recover by replaying log entries to reconstruct the correct state
  • 3 passes: analysis pass, recovery (redo) pass, undo pass
• WAL is more common: fewer disk operations, and transactions are considered committed once the log is written
(a minimal redo-only WAL sketch in Go follows)
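A redo-only, in-memory Go sketch of the write-ahead idea (illustrative; a real WAL also logs UNDO information and forces the log to disk before the page writes):

```go
package main

import "fmt"

// Every update is appended to the log before the in-memory page is changed;
// after a crash, replaying the log rebuilds the logged state.
type logRecord struct {
	tid   int
	key   string
	value int // REDO information
}

type db struct {
	log   []logRecord // stands in for the on-disk log
	pages map[string]int
}

func (d *db) update(tid int, key string, value int) {
	d.log = append(d.log, logRecord{tid, key, value}) // log first...
	d.pages[key] = value                              // ...then update the page cache
}

// recoverFromLog replays the log against an empty page table, as after a crash.
func recoverFromLog(log []logRecord) map[string]int {
	pages := make(map[string]int)
	for _, r := range log {
		pages[r.key] = r.value // redo each logged update in order
	}
	return pages
}

func main() {
	d := &db{pages: make(map[string]int)}
	d.update(1, "balance_A", 900)
	d.update(1, "balance_B", 1100)

	// Simulate a crash: the page cache is lost, the log survives.
	rebuilt := recoverFromLog(d.log)
	fmt.Println(rebuilt) // map[balance_A:900 balance_B:1100]
}
```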
15-440 Distributed Systems
Lecture 13 – Errors and Failures
Measuring Availability
• Mean time to failure (MTTF)
• Mean time to repair (MTTR)
• MTBF = MTTF + MTTR
• Availability = MTTF / (MTTF + MTTR)
• Suppose OS crashes once per month, takes 10min to
reboot.
• MTTF = 720 hours = 43,200 minutes
MTTR = 10 minutes
• Availability = 43,200 / 43,210 ≈ 0.99977 (~“3 nines”)
Disk failure conditional probability distribution – Bathtub curve
[Figure: failure rate over the expected operating lifetime – high “infant mortality” early, a flat middle region at roughly 1/(reported MTTF), then “burn out” wear-out at end of life]
Parity Checking
Single Bit Parity:
Detect single bit errors
Block Error Detection
• EDC= Error Detection and Correction bits (redundancy)
• D = Data protected by error checking, may include header fields
• Error detection not 100% reliable!
• Protocol may miss some errors, but rarely
• Larger EDC field yields better detection and correction
Error Detection – Cyclic Redundancy Check
(CRC)
• Polynomial code
  • Treat packet bits as coefficients of an n-bit polynomial
• Choose an (r+1)-bit generator polynomial (well known – chosen in advance)
• Add r bits to the packet such that the message is divisible by the generator polynomial
• Better loss detection properties than checksums
• Cyclic codes have favorable properties in that they are well suited for detecting burst errors
  • Therefore, used on networks/hard drives (a small Go example using hash/crc32 follows)
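In practice you rarely implement the polynomial division yourself; Go's hash/crc32 package computes a CRC-32 with the IEEE generator polynomial, as in this small example (the payload is invented):

```go
package main

import (
	"fmt"
	"hash/crc32"
)

func main() {
	// CRC in practice: the sender appends the checksum; the receiver
	// recomputes it over the received bits and compares.
	packet := []byte("some packet payload")
	sum := crc32.ChecksumIEEE(packet)
	fmt.Printf("CRC-32: 0x%08x\n", sum)

	// A single flipped bit changes the checksum, so the error is detected.
	corrupted := append([]byte(nil), packet...)
	corrupted[0] ^= 0x01
	fmt.Println("detected corruption:", crc32.ChecksumIEEE(corrupted) != sum)
}
```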
Error Recovery
• Two forms of error recovery
• Redundancy
• Error Correcting Codes (ECC)
• Replication/Voting
• Retry
• ECC
• Keep encoded redundant data to help repair losses
• Forward Error Correction (FEC) – send bits in advance
• Reduces latency of recovery at the cost of bandwidth
Summary
• Definition of MTTF/MTBF/MTTR: Understanding
availability in systems.
• Failure detection and fault masking techniques
• Engineering tradeoff: Cost of failures vs. cost of
failure masking.
• At what level of system to mask failures?
• Leading into replication as a general strategy for fault
tolerance
• Thought to leave you with:
• What if you have to survive the failure of entire
computers? Of a rack? Of a datacenter?
15-440 Distributed Systems
Lecture 14 – RAID
Thanks to Greg Ganger and Remzi Arpaci-Dusseau for slides
Just a bunch of disks (JBOD)
• Yes, it’s a goofy name
• industry really does sell “JBOD enclosures”
Disk Striping
• Interleave data across multiple disks
• Large file streaming can enjoy parallel transfers
• High throughput requests can enjoy thorough load
balancing
• If blocks of hot files equally likely on all disks (really?)
Redundancy via replicas
• Two (or more) copies
• mirroring, shadowing, duplexing, etc.
• Write both, read either
Simplest approach: Parity Disk
• Capacity: one
extra disk needed
per stripe
Updating and using the parity
[Figure: a stripe of data disks plus a parity disk (D D D P) in four cases – fault-free read, fault-free write (read old data and old parity, then write new data and new parity), degraded read (reconstruct from the surviving disks plus parity), and degraded write]
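All four cases above come down to XOR arithmetic over the stripe, as in this small Go sketch (block contents are invented): the parity is the XOR of the data blocks, a small write updates it as newP = oldP XOR oldD XOR newD, and a degraded read rebuilds the lost block from the survivors plus parity:

```go
package main

import "fmt"

// xorBlocks XORs equal-sized blocks byte by byte.
func xorBlocks(blocks ...[]byte) []byte {
	out := make([]byte, len(blocks[0]))
	for _, b := range blocks {
		for i := range out {
			out[i] ^= b[i]
		}
	}
	return out
}

func main() {
	d0 := []byte{0xDE, 0xAD}
	d1 := []byte{0xBE, 0xEF}
	d2 := []byte{0x12, 0x34}
	p := xorBlocks(d0, d1, d2) // parity for the stripe

	// Fault-free small write to d1: read old data + old parity, write both.
	newD1 := []byte{0x00, 0xFF}
	p = xorBlocks(p, d1, newD1) // newP = oldP XOR oldD XOR newD
	d1 = newD1

	// Degraded read: the disk holding d1 fails; reconstruct it from the rest.
	rebuilt := xorBlocks(d0, d2, p)
	fmt.Printf("rebuilt d1 = % x (want % x)\n", rebuilt, d1)
}
```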
Solution: Striping the Parity
• Removes parity disk bottleneck
RAID Taxonomy
• Redundant Array of Inexpensive (Independent) Disks
• Proposed by UC-Berkeley researchers in the late 80s (Garth Gibson, among others)
• RAID 0 – Coarse-grained Striping with no redundancy
• RAID 1 – Mirroring of independent disks
• RAID 2 – Fine-grained data striping plus Hamming code disks
• Uses Hamming codes to detect and correct multiple errors
• Originally implemented when drives didn’t always detect errors
• Not used in real systems
• RAID 3 – Fine-grained data striping plus parity disk
• RAID 4 – Coarse-grained data striping plus parity disk
• RAID 5 – Coarse-grained data striping plus striped parity
How often are failures?
• MTBF (Mean Time Between Failures)
  • MTBFdisk ~ 1,200,000 hours (~136 years, <1% per year)
• MTBF of a multi-disk system = mean time to first disk failure
  • which is MTBFdisk / (number of disks)
• For a striped array of 200 drives
  • MTBFarray = 136 years / 200 drives ≈ 0.68 years
Rebuild: restoring redundancy after failure
• After a drive failure
• data is still available for access
• but, a second failure is BAD
• So, should reconstruct the data onto a new drive
• on-line spares are common features of high-end disk arrays
• reduce time to start rebuild
• must balance rebuild rate with foreground performance impact
• a performance vs. reliability trade-off
• How data is reconstructed
• Mirroring: just read good copy
• Parity: read all remaining drives (including parity) and compute
Conclusions
• RAID turns multiple disks into a larger, faster, more
reliable disk
• RAID-0: Striping
Good when performance and capacity really matter,
but reliability doesn’t
• RAID-1: Mirroring
Good when reliability and write performance matter,
but capacity (cost) doesn’t
• RAID-5: Rotating Parity
Good when capacity and cost matter or workload is
read-mostly
• Good compromise choice