Slides by Jonathan


Abstract
• Increases in CPU and memory will be
wasted if not matched by similar
performance in I/O
• SLED (Single Large Expensive Disk) vs. RAID
• 5 levels of RAID and respective
cost/performance analysis
Background – CPU Performance
• Unprecedented CPU growth
– Gordon Bell: 40% per year between ’74 – ’84
– Bill Joy: MIPS (millions of instructions per second) = 2^(Year − 1984)
– Mainframes and supercomputers don’t share
same growth rate
• Multiprocessors used to cope with performance
Background – Memory
• Gene Amdahl: each CPU instruction per second
requires one byte of main memory
– If system costs are not dominated by the cost of
memory the memory chip capacity should grow at same
rate
• Gordon Moore: transistors/chip = 2^(Year − 1964)
– RAM has quadrupled every 2-3 years
• Ratio of MB RAM to MIPS has increased in
recent years due to dropping memory prices
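As a rough check on the Moore growth rule above, the sketch below reads the formula as transistors/chip = 2^(Year − 1964) and computes the growth factor over a two-year span. The function name `moore_transistors` is made up for illustration and is not from the slides.

```python
def moore_transistors(year: int) -> float:
    """Transistors per chip under the rule 2 ** (year - 1964) (assumed reading)."""
    return 2.0 ** (year - 1964)


# Growth factor over two years under this rule.
factor_2yr = moore_transistors(1986) / moore_transistors(1984)
print(f"growth over 2 years: {factor_2yr:.0f}x")
```

Under this reading the count doubles every year, so it quadruples every 2 years, which is the fast end of the "every 2-3 years" figure quoted for RAM.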
Background – I/O
• Primary measure of magnetic disk
technology is max number of bits that can
be stored per square inch
– MAD: maximal areal density
– MAD = 10^((Year − 1971)/10)
• Doubled in capacity & halved in price every
3 years
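A small sketch, assuming the MAD formula reads as 10^((Year − 1971)/10), shows why it matches the "doubled every 3 years" claim. The function name `mad` is hypothetical, and the slides do not state the units of MAD, so none are assumed here.

```python
def mad(year: float) -> float:
    """Maximal areal density under the rule 10 ** ((year - 1971) / 10)."""
    return 10.0 ** ((year - 1971) / 10.0)


# Ratio of densities three years apart; the start year does not matter.
growth_3yr = mad(1974) / mad(1971)
print(f"growth over 3 years: {growth_3yr:.2f}x")
```

A tenfold increase per decade works out to a factor of 10^0.3 every three years, which is just under 2.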
Background – I/O cont’d
• Capacity not the only measure
– Performance
• Main memory increases have kept pace with CPU
for 2 reasons
– Caches
– Speed increases
• SLED performance has improved only marginally
– Held back because disks are mechanical
– Seek and rotation delays
– Greater density solved only some of problem
Background – I/O cont’d
• Larger main memories and solid state disks
as buffers
– Problem: volatility
• High rate of random requests for small
chunks of data (transaction processing)
• Low rate of requests for large chunks of
data (large simulations on supercomputers)
…the Problem
• What is the impact of improving the performance
of some parts while leaving others the same?
• Amdahl’s Law:
s = 1 / ((1 − f) + f/k)
where: s = effective speedup
f = fraction of work in faster mode
k = speedup while in faster mode
…the Problem cont’d
• Example:
– Current applications spend 10% of time in I/O
– Using Amdahl’s Law
• When computers are 10x faster, total system speedup will be
only about 5x
• When computers are 100x faster, total system speedup will be
less than 10x
– About 90% of the potential speedup is wasted due to I/O
• While software & buffering will help in the near
term, a solution is needed to avoid crisis
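The example figures can be reproduced with a minimal sketch of Amdahl's Law, s = 1 / ((1 − f) + f/k). It assumes the 10% of time spent in I/O is not sped up at all, so f = 0.9 is the fraction that benefits from the faster computer. The function name `amdahl_speedup` is chosen here for illustration.

```python
def amdahl_speedup(f: float, k: float) -> float:
    """Effective speedup when a fraction f of the work runs k times faster."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / ((1.0 - f) + f / k)


# 10% of time is I/O (unchanged), so 90% of the work gets the speedup k.
for k in (10, 100):
    s = amdahl_speedup(0.9, k)
    print(f"computers {k}x faster -> overall speedup {s:.2f}x")
```

With f = 0.9 the result is roughly 5.3x for k = 10 and roughly 9.2x for k = 100, in line with the rounded 5x and 10x figures on the slide.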
…the Problem cont’d
• Questions:
– How can we increase performance of secondary
storage and/or disk I/O?
– Can a more cost-effective solution be found?
…the Solution
• Arrays of inexpensive disks
– PC disks: lower cost and performance
– Versus SLED
• I/Os per second of an inexpensive disk is within a
factor of 2 of a large disk
• Cheaper per MB, less power consumption
– Contains full track buffers and most functions
of traditional mainframe controller
– Small Computer System Interface (SCSI)
Reliability
• Unreliable nature of disks forces constant backup
• MTTF of disk array = (MTTF of a single disk) / (Number of disks in array)
• Example: MTTF for 100 CP-3100 disks is 30,000/100
or 300 hours which is less than 2 weeks!
• Compare to the IBM 3380 which has a MTTF >
30,000 hours
• Without fault tolerance large arrays of disks are too
unreliable to be a viable solution
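The arithmetic behind the 100-disk example can be sketched as follows. The simple division treats disk failures as independent, which is an assumption of the formula rather than something the slides spell out. The function name `array_mttf_hours` is hypothetical.

```python
def array_mttf_hours(disk_mttf_hours: float, num_disks: int) -> float:
    """MTTF of an array with no redundancy: single-disk MTTF / number of disks."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    return disk_mttf_hours / num_disks


# The slide's example: 100 disks, each with an MTTF of 30,000 hours.
mttf = array_mttf_hours(30_000, 100)
print(f"{mttf:.0f} hours = {mttf / 24:.1f} days")
```

For the slide's numbers this gives 300 hours, or 12.5 days, which is the "less than 2 weeks" figure.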
Better Solution
• Redundant Arrays of Independent Disks
(RAID)
• Things to consider
– Reliability
– Overhead cost
– Usable storage percentage
– Performance
Reliability
• Regarding reliability
– Check disks with redundant information
– Replacement of disk occurs in short time
(MTTR)
• Obsolescence
– Extremely high MTTF is ‘overkill’
– Are you still using a 20 year old disk?
Overhead Cost
• Number of extra check disks
Usable Storage Percentage
• Percentage of total disk space allocated to
actual data
• Another way of viewing costs of overhead
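The two cost measures can be related with a short sketch. It assumes overhead cost is expressed as check disks relative to data disks, and usable storage as data disks relative to all disks. The function names and the group sizes in the example are hypothetical, not taken from the slides.

```python
def overhead_cost_pct(data_disks: int, check_disks: int) -> float:
    """Extra check disks as a percentage of the data disks (assumed definition)."""
    return 100.0 * check_disks / data_disks


def usable_storage_pct(data_disks: int, check_disks: int) -> float:
    """Share of total disk space that holds actual data."""
    return 100.0 * data_disks / (data_disks + check_disks)


# Hypothetical groups: one check disk per data disk, and 2 check disks per 10 data disks.
for data, check in ((1, 1), (10, 2)):
    print(
        f"{data} data + {check} check: "
        f"overhead {overhead_cost_pct(data, check):.0f}%, "
        f"usable {usable_storage_pct(data, check):.1f}%"
    )
```

One check disk per data disk leaves 50% of the space usable, while spreading 2 check disks over 10 data disks leaves about 83%, so the two measures describe the same trade-off from opposite sides.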
Performance
• Various applications have various uses for
the I/O
– Supercomputers: number of reads/writes per
second on large chunks of data
– Transaction processing: number of individual
reads/writes per second
• Also use read-modify-write access
Advantages
• Full data redundancy (reliability)
• Much lower cost compared to SLEDs
• Data rate
• I/O rate
• Modular growth potential (scalability)
• Lower power consumption
• Availability (after failure)
Disadvantages
• Capacity
• Quantity of independent disks to achieve
size/performance needs
• Large number of connections/cables
My Opinion
• Viable solution to problem assuming other issues
are resolved
– Actual MTTF rates for arrays
– Connectivity to 100 or even 1000 disks
• Low cost allows entry into consumer market
• Similar solution to dual-core CPUs
– Can’t go any faster so just use 2 of them
– Growing trend in computer industry
Problem Solved?
• Increased performance over current
secondary storage solutions?
– YES
• More cost-effective than current standard
secondary storage solutions?
– YES
The End!!!
• Questions?
• Comments?
• Discussion…