Transcript CSE451
Hardware Trends
CSE451
Andrew Whitaker
Motivation
Hardware moves quickly
OS code tends to stick around for a while
“System building” extends way beyond the
operating system
Famous Quotes
“I think there is a world market for maybe
five computers”
Thomas Watson, IBM, 1943
“There is no reason for any individual to
have a computer in their home”
Ken Olson, Digital, 1977
“640K ought to be enough for anybody”
Bill Gates, 1981
CPU Performance Trends
Processor performance stopped doubling
every 18 months in 2002
Source: David Patterson
Why Has CPU Performance Growth
Slowed?
Moore’s law continues to provide more
transistors
But:
Chip designs have gotten too complex
Communication times have gotten (relatively) longer
Circuit noise becomes a factor at small scale
Bottom line: automatic processor performance
improvements are a thing of the past
Industry Response: Multi-Core
Architectures
Sidebar: Educational Ramifications
“Given this sea change, how much of the
curriculum and what fraction of the CS
faculty is oblivious to concurrency? How
many algorithms, data structures, languages,
compilers, debuggers, operating systems,
books, and lectures must change to match
the transformation in underlying technology
if they are to remain relevant in the 21st
century?”
David Patterson -- Computer Science Education in
the 21st Century
Grand Challenge of Parallel
Computing
How do we get a speedup of N on an Nway multi-processor?
Software must be parallelizable
Speedup can refer to:
Latency: the length of time to complete a single
task
Throughput: the rate at which tasks are
completed
The Human Baby Problem
Problem: produce a human baby
Q: Is this problem parallelizable?
Latency: No
9 women cannot make a baby in a month
Throughput: Yes
9 women can produce 1 baby/month
Speeding Up an Internet Service
Which is easier for Amazon.com?
Handling more customers
Making individual customer transactions go
faster?
Parallelization Theory
Amdahl’s Law predicts speedup on a parallel
machine
1
speedup =
F + (1 - F) / N
N: number of processors
F: Fraction of computation that is sequential
Sequential == not parallelizable
Embarrassingly parallel: F 0
Case Study: Merge Sort
Given N processors,
what speedup is
possible?
Parallelization in Practice
Predicting performance is difficult because there
are many flavors of parallelism
Multiple processors
Symmetric multiprocessors (SMPs)
Multi-core processors
Multi-threaded processors (“Hyperthreading”)
Clusters of machines
Running the software is the only way to know for
sure…
Disk Storage Trends
1975-1989
Doubled every 3+ years
25% improvement each year
Factor of 10 every decade
Still exponential, but far less rapid than processor
performance
Since 1990
Doubling every 12 months
100% improvement each year
Factor of 1000 every decade
10x as fast as processor performance!
Memory Capacity
Continues to track Moore’s law
Ed Lazowska: “I remember pulling all
kinds of strings to get a special deal:
512K of VAX-11/780 memory for $30,000”
Today (2006): roughly $300 / GB
© 2004 Jim Gray, Microsoft Corporation
A Quick Intro to Threads
public class Yo extends Thread {
private static int x = 7;
public void run () {
x++;
}
public static void main (String [] args) {
Thread t1 = new Yo();
Thread t2 = new Yo();
t1.start();
t2.start();
What is wrong with
this program?
try {
t1.join();
t2.join();
}
catch (InterruptedException iex) { }
System.out.println("Value of x: " + x);
}
}
Breaking Down x++
x++ in MIPS assembly:
lw $t, offset($s)
addi $t, $t, 1
sw $t, offset($s)
Key point: x++ is not atomic
Requires multiple instructions to complete its work
This is a race condition
Program’s output depends on thread scheduling