Transcript CSE451

Hardware Trends
CSE451
Andrew Whitaker
Motivation
Hardware moves quickly
OS code tends to stick around for a while
“System building” extends way beyond the
operating system
Famous Quotes
“I think there is a world market for maybe
five computers”
 Thomas Watson, IBM, 1943
“There is no reason for any individual to
have a computer in their home”
Ken Olson, Digital, 1977
“640K ought to be enough for anybody”
Bill Gates, 1981
CPU Performance Trends
Processor performance stopped doubling
every 18 months in 2002
Source: David Patterson
Why Has CPU Performance Growth
Slowed?
 Moore’s law continues to provide more
transistors
 But:
Chip designs have gotten too complex
Communication times have gotten (relatively) longer
Circuit noise becomes a factor at small scale
 Bottom line: automatic processor performance
improvements are a thing of the past
Industry Response: Multi-Core
Architectures
Sidebar: Educational Ramifications
“Given this sea change, how much of the
curriculum and what fraction of the CS
faculty is oblivious to concurrency? How
many algorithms, data structures, languages,
compilers, debuggers, operating systems,
books, and lectures must change to match
the transformation in underlying technology
if they are to remain relevant in the 21st
century?”
David Patterson -- Computer Science Education in
the 21st Century
Grand Challenge of Parallel
Computing
How do we get a speedup of N on an Nway multi-processor?
Software must be parallelizable
Speedup can refer to:
Latency: the length of time to complete a single
task
Throughput: the rate at which tasks are
completed
The Human Baby Problem
Problem: produce a human baby
Q: Is this problem parallelizable?
Latency: No
9 women cannot make a baby in a month
Throughput: Yes
9 women can produce 1 baby/month
Speeding Up an Internet Service
Which is easier for Amazon.com?
Handling more customers
Making individual customer transactions go
faster?
Parallelization Theory
 Amdahl’s Law predicts speedup on a parallel
machine
1
speedup =
F + (1 - F) / N
 N: number of processors
 F: Fraction of computation that is sequential
Sequential == not parallelizable
 Embarrassingly parallel: F  0
Case Study: Merge Sort
Given N processors,
what speedup is
possible?
Parallelization in Practice
 Predicting performance is difficult because there
are many flavors of parallelism
Multiple processors
 Symmetric multiprocessors (SMPs)
Multi-core processors
Multi-threaded processors (“Hyperthreading”)
Clusters of machines
 Running the software is the only way to know for
sure…
Disk Storage Trends
 1975-1989
Doubled every 3+ years
25% improvement each year
Factor of 10 every decade
Still exponential, but far less rapid than processor
performance
 Since 1990
Doubling every 12 months
100% improvement each year
Factor of 1000 every decade
10x as fast as processor performance!
Memory Capacity
Continues to track Moore’s law
Ed Lazowska: “I remember pulling all
kinds of strings to get a special deal:
512K of VAX-11/780 memory for $30,000”
Today (2006): roughly $300 / GB
© 2004 Jim Gray, Microsoft Corporation
A Quick Intro to Threads
public class Yo extends Thread {
private static int x = 7;
public void run () {
x++;
}
public static void main (String [] args) {
Thread t1 = new Yo();
Thread t2 = new Yo();
t1.start();
t2.start();
What is wrong with
this program?
try {
t1.join();
t2.join();
}
catch (InterruptedException iex) { }
System.out.println("Value of x: " + x);
}
}
Breaking Down x++
 x++ in MIPS assembly:
lw $t, offset($s)
addi $t, $t, 1
sw $t, offset($s)
 Key point: x++ is not atomic
Requires multiple instructions to complete its work
 This is a race condition
Program’s output depends on thread scheduling