Slides available - PHARM - University of Wisconsin
Multicore: Panic or Panacea?
Mikko H. Lipasti
Associate Professor
Electrical and Computer Engineering
University of Wisconsin – Madison
http://www.ece.wisc.edu/~pharm
Multicore Mania
First, servers: IBM Power4, 2001
Then desktops: AMD Athlon X2, 2005
Then laptops: Intel Core Duo, 2006
Soon, your cellphone: ARM MPCore, prototypes for a while now
Sep 18, 2007
Mikko Lipasti-University of Wisconsin
What is behind this trend?
Moore’s Law
Chip power consumption
Single-thread performance trend
[source: Intel]
Dynamic Power
$$P_{dyn} = k \sum_{i \in \text{units}} A_i C_i V_i^2 f_i$$
Static CMOS: current flows when active
Combinational logic evaluates new inputs
Flip-flop, latch captures new value (clock edge)
Terms
C: capacitance of circuit (wire length, number and size of transistors)
V: supply voltage
A: activity factor
f: frequency
Future: Fundamentally power-constrained
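The dynamic power relation on this slide (read here as a constant k times the sum, over all units, of A, C, V squared, and f) can be evaluated directly. The following is a minimal sketch; the `Unit` class, the function name, and all numeric values are hypothetical and chosen only for illustration. It shows why lowering supply voltage is so effective: V enters quadratically.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One switching unit of the chip (hypothetical example values)."""
    activity: float      # A: fraction of cycles in which the unit switches
    capacitance: float   # C: switched capacitance, in farads
    voltage: float       # V: supply voltage, in volts
    frequency: float     # f: clock frequency, in hertz


def dynamic_power(units, k=1.0):
    """Return k * sum(A_i * C_i * V_i**2 * f_i) over all units, in watts."""
    return k * sum(
        u.activity * u.capacitance * u.voltage ** 2 * u.frequency
        for u in units
    )


# Assumed numbers: one unit at full voltage and frequency, and the same
# unit with both voltage and frequency halved.
baseline = [Unit(activity=0.5, capacitance=1e-9, voltage=1.0, frequency=2e9)]
scaled = [Unit(activity=0.5, capacitance=1e-9, voltage=0.5, frequency=1e9)]

p_base = dynamic_power(baseline)
p_scaled = dynamic_power(scaled)
print(p_base, p_scaled, p_base / p_scaled)
```

Under these assumed values, halving both V and f cuts dynamic power by a factor of eight (four from the voltage term, two from frequency).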
Easy answer: Multicore
[Figure: a single-core chip, a dual-core chip, and a quad-core chip]
                   Single Core   Dual Core   Quad Core
Core area          A             ~A/2        ~A/4
Core power         W             ~W/2        ~W/4
Chip power         W + O         W + O’      W + O’’
Core performance   P             0.9P        0.8P
Chip performance   P             1.8P        3.2P
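The chip-performance figures for this slide appear to be the per-core performance multiplied by the number of cores. That reading assumes perfectly parallel work, which the slide does not state explicitly:

```latex
\begin{align*}
\text{Dual core:} \quad & 2 \times 0.9P = 1.8P \\
\text{Quad core:} \quad & 4 \times 0.8P = 3.2P
\end{align*}
```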
Amdahl’s Law
[Figure: # CPUs versus time. The serial fraction 1-f runs on 1 CPU; the parallel fraction f runs on n CPUs.]
f – fraction that can run in parallel
1-f – fraction that must run serially
Speedup
$$\text{Speedup} = \frac{1}{(1-f) + \frac{f}{n}}$$
$$\lim_{n \to \infty} \frac{1}{(1-f) + \frac{f}{n}} = \frac{1}{1-f}$$
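Amdahl's Law is easy to tabulate. The sketch below uses f for the parallel fraction and n for the CPU count, as on the slide; the function names and the sample value f = 0.9 are assumptions made for illustration.

```python
def amdahl_speedup(f, n):
    """Speedup on n CPUs when a fraction f of the work runs in parallel."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / ((1.0 - f) + f / n)


def amdahl_limit(f):
    """Upper bound on speedup as n grows without bound: 1 / (1 - f)."""
    return float("inf") if f == 1.0 else 1.0 / (1.0 - f)


# Tabulate speedup for a program that is 90% parallel (assumed value).
for n in (1, 2, 4, 8, 128):
    print(n, round(amdahl_speedup(0.9, n), 2))
print("limit", round(amdahl_limit(0.9), 2))
```

Even with 90% of the work parallel, the serial 10% caps the speedup at about 10, no matter how many CPUs are added.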
Fixed Chip Power Budget
[Figure: # CPUs versus time, with the serial fraction 1-f on 1 CPU and the parallel fraction f on n CPUs]
Amdahl’s Law
Ignores (power) cost of n cores
Revised Amdahl’s Law
More cores, so each core is slower
Parallel speedup < n
Serial portion (1-f) takes longer
Also, interconnect and scaling overhead
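The slides give no formula for the revised law, so the following is only a sketch under a stated assumption: when n cores share a fixed power budget, each core runs at relative speed n ** -alpha. The exponent alpha = 0.15 is hypothetical, picked because it roughly reproduces the 0.9P (dual core) and 0.8P (quad core) per-core figures from the "Easy answer: Multicore" slide. Interconnect and scaling overheads are ignored here.

```python
def core_speed(n, alpha=0.15):
    """Relative speed of one core when n cores share a fixed power budget.

    Assumed model (not from the slides): speed = n ** -alpha.
    """
    return n ** -alpha


def fixed_power_speedup(f, n, alpha=0.15):
    """Amdahl-style speedup when every core slows down as n grows."""
    s = core_speed(n, alpha)
    serial_time = (1.0 - f) / s        # serial phase runs on one slow core
    parallel_time = f / (n * s)        # parallel phase runs on n slow cores
    return 1.0 / (serial_time + parallel_time)


def best_core_count(f, max_cores=128, alpha=0.15):
    """Power-of-two core count (up to max_cores) with the highest speedup."""
    counts = [2 ** k for k in range(max_cores.bit_length())
              if 2 ** k <= max_cores]
    return max(counts, key=lambda n: fixed_power_speedup(f, n, alpha))


for f in (0.8, 0.9, 0.99, 0.999):
    print(f, best_core_count(f),
          round(fixed_power_speedup(f, best_core_count(f)), 2))
```

Setting alpha to 0 recovers the classic formula. With a positive alpha, the parallel speedup stays below n and the serial portion takes longer, so for less parallel code the best core count can fall well short of the maximum.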
Fixed Power Scaling
[Figure: chip performance versus # of cores/chip, both axes running from 1 to 128 in powers of two, with one curve each for 99.9%, 99%, 90%, and 80% parallel code]
Fixed power budget forces slow cores
Serial code quickly dominates
Predictions and Challenges
Parallel scaling limits many-core
>4 cores only for well-behaved programs
Optimistic about new applications
Interconnect overhead
Single-thread performance
Will degrade unless we innovate
Parallel programming
Express/extract parallelism in new ways
Retrain programming workforce
Research Agenda
Programming for parallelism
Sources of parallelism
New applications, tools, and approaches
Single-thread performance and power
Most attractive to programmer/user
Chip multiprocessor overheads
Interconnect, caches, coherence, fairness
Finding Parallelism
1. Functional parallelism
Car: {engine, brakes, entertain, nav, …}
Game: {physics, logic, UI, render, …}
2. Automatic extraction [UW Multiscalar]
Decompose serial programs
3. Data parallelism
Vector, matrix, db table, pixels, …
4. Request parallelism
Web, shared database, telephony, …
Balancing Work
Amdahl’s parallel phase f: all cores busy
If not perfectly balanced
(1-f) term grows (f not fully parallel)
Performance scaling suffers
Manageable for data & request parallel apps
Very difficult problem for other two:
Functional parallelism
Automatically extracted
Scale power to mismatch [Multiscalar]
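The cost of imbalance can be illustrated with a small model. This is a sketch under assumed numbers, not a measurement: the parallel phase is taken to end only when the most heavily loaded core finishes, and the work amounts are hypothetical units of time on one core.

```python
def parallel_phase_time(work_per_core):
    """The parallel phase ends when the most heavily loaded core finishes."""
    return max(work_per_core)


def speedup_with_imbalance(serial_work, work_per_core):
    """Speedup over one core that does the serial work plus all parallel work."""
    single_core_time = serial_work + sum(work_per_core)
    multicore_time = serial_work + parallel_phase_time(work_per_core)
    return single_core_time / multicore_time


# 10 units of serial work and 90 units of parallel work on four cores.
balanced = speedup_with_imbalance(10.0, [22.5, 22.5, 22.5, 22.5])
skewed = speedup_with_imbalance(10.0, [45.0, 15.0, 15.0, 15.0])
print(round(balanced, 2), round(skewed, 2))
```

In the skewed case three cores sit idle for much of the parallel phase, which behaves like a larger serial fraction and drags the speedup down.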
Coordinating Work
Synchronization
Some data somewhere is shared
Coordinate/order updates and reads
Otherwise chaos
Traditionally: locks and mutual exclusion
Hard to get right, even harder to tune for perf.
Research: Transactional Memory [UW Multifacet]
Programmer: Declare potential conflict
Hardware and/or software: speculate & check
Commit or roll back and retry
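The speculate, check, and commit-or-retry cycle can be mimicked in software. The sketch below is plain optimistic concurrency control with a version counter, not a real transactional memory system and not any UW group's design; `VersionedCell` and `atomically` are hypothetical names introduced for illustration.

```python
import threading


class VersionedCell:
    """A shared value with a version counter used to detect conflicts."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()  # guards only short critical steps

    def read(self):
        """Return a consistent (value, version) snapshot."""
        with self._commit_lock:
            return self.value, self.version

    def try_commit(self, seen_version, new_value):
        """Commit only if nobody else committed since seen_version."""
        with self._commit_lock:
            if self.version != seen_version:
                return False       # conflict: caller discards its work
            self.value = new_value
            self.version += 1
            return True


def atomically(cell, update):
    """Speculate on a snapshot, check at commit time, retry on conflict."""
    while True:
        snapshot, version = cell.read()
        result = update(snapshot)  # speculative work on private data
        if cell.try_commit(version, result):
            return result
        # Roll back is implicit: the speculative result is simply dropped.


counter = VersionedCell(0)


def worker():
    for _ in range(1000):
        atomically(counter, lambda v: v + 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

The caller only declares which update must appear atomic; conflict detection and retry are handled by the helper, which is the division of labor the slide describes.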
Single-thread Performance
Still most attractive source of performance
Speeds up parallel and serial phases
Can use it to buy back power
Must focus on power consumption
Performance benefit ≥ Power cost
Single-thread Performance
Hardware accelerators and circuits
Domain-specific [UW MESA]
Reconfigurable [UW Compton]
VLSI and design automation [UW WISCAD, Kursun]
Increasing frequency
Seems prohibitive: clock power
Clever clocking schemes can help [UW Pharm]
Increasing instruction-level parallelism [UW Multiscalar, UW Pharm, UW Smith]
Without blowing power budget
Alternatively, reduce power for same performance
Chip Multiprocessor Overheads
Core Interconnect [UW Pharm]
80% of chip power [Borkar, ISLPED ’07 panel]
Need fundamentally different approach
Revisit circuit switching
Cache coherence [UW Multifacet, Pharm]
Match workload behavior
Optimize for on-chip communication
Chip Multiprocessor Overheads
Shared caches [UW Multifacet, Multiscalar, Smith]
On-chip memory can be shared
Optimize replacement, replication
Fairness [UW Smith]
Maintain performance isolation
Share resources fairly (memory, caches)
Research Groups @ UW
Group        Faculty                URL
Compton      Kati Compton           www.ece.wisc.edu/~kati
Kursun       Volkan Kursun          www.cae.wisc.edu/~kursun
MESA         Mike Schulte           mesa.ece.wisc.edu
Multifacet   Mark Hill, David Wood  http://www.cs.wisc.edu/multifacet
Multiscalar  Guri Sohi              www.cs.wisc.edu/~mscalar
PHARM        Mikko Lipasti          www.ece.wisc.edu/~pharm
Smith        James Smith            www.engr.wisc.edu/ece/faculty/smith_james.html
Vertical     Karu Sankaralingam     www.cs.wisc.edu/vertical/wiki
WISCAD       Azadeh Davoodi         www.cae.wisc.edu/~adavoodi
Conclusion
Forecast
Limited multicore (≤4) is here to stay
Manycore (>4) will find its place
Hardware Challenges
Single-thread performance and power
Multicore overhead
Software Challenges
Finding application parallelism
Creating correct parallel programs
Creating scalable parallel programs
Questions?
http://www.ece.wisc.edu/~pharm