Transcript 7810-12

CS 7810
Lecture 12
Power-Aware Microarchitecture: Design and
Modeling Challenges for Next-Generation
Microprocessors
D. Brooks et al.
IEEE Micro, Nov/Dec 2000
Power/Energy Basics
• Energy = Power x time
• Dynamic Power = α C V^2 f
    α: switching activity factor
    C: capacitances being charged
    V: voltage swing
    f: processor frequency
• Current trends: f and C are rising, V is dropping,
overall dynamic power is increasing
• Leakage energy is also increasing
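The two formulas above can be sketched in a few lines of Python. The function names and the sample values (activity factor 0.5, 1 nF switched capacitance, 1.2 V, 2 GHz) are illustrative assumptions, not figures from the lecture.

```python
def dynamic_power(alpha, capacitance, voltage, frequency):
    """Dynamic power in watts: alpha * C * V^2 * f."""
    return alpha * capacitance * voltage ** 2 * frequency


def energy(power, time):
    """Energy in joules: power (W) multiplied by time (s)."""
    return power * time


# Hypothetical operating point, chosen only to exercise the formula.
p = dynamic_power(alpha=0.5, capacitance=1e-9, voltage=1.2, frequency=2e9)
e = energy(p, time=10.0)
print(f"dynamic power = {p:.2f} W, energy over 10 s = {e:.1f} J")
```

Because V enters squared, halving the voltage in this sketch cuts dynamic power to a quarter at a fixed frequency.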
Processor Breakdowns
Alpha 21264
    Caches                 16%
    O-o-o Issue Logic      19%
    Mem management unit     9%
    FP unit                11%
    Integer unit           11%
    Clock power            34%
Pentium Pro (breakdown chart not preserved in the transcript)
Metrics
• Performance ∝ f ∝ 1/D (D is delay or execution time)
• Delay of a circuit ∝ 1/(V – Vt); a lower frequency
tolerates longer delays and hence allows a lower voltage
• Power = α C V^2 f; since f is roughly proportional
to voltage, P ∝ V^3 ∝ f^3
• Since V and f are variable, remove them from the
expression: P ∝ f^3 and D ∝ 1/f, so
PD^3 = constant (regardless of V and f)
This is the best metric to compare processors;
any other metric (say, perf/watt) can be “fudged”
by changing voltage or frequency
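A small sketch of why PD^3 is insensitive to the operating point while perf/watt is not. It assumes the idealized relationship stated above (f proportional to V, so P scales with s^3 and D with 1/s for a scaling factor s). The function names and the 100 W baseline are illustrative assumptions.

```python
def scaled_point(base_power, base_delay, s):
    """Scale V and f together by s, assuming f is proportional to V.

    Power then scales as s^3 and delay as 1/s.
    """
    return base_power * s ** 3, base_delay / s


def pd3(power, delay):
    """The P * D^3 metric."""
    return power * delay ** 3


def perf_per_watt(power, delay):
    """Performance (1/delay) divided by power."""
    return (1.0 / delay) / power


base_power, base_delay = 100.0, 1.0  # hypothetical baseline
for s in (0.8, 1.0, 1.2):
    p, d = scaled_point(base_power, base_delay, s)
    print(f"s={s}: PD^3={pd3(p, d):.3f}  perf/W={perf_per_watt(p, d):.5f}")
```

In this model PD^3 stays the same at every scaling factor, while perf/watt improves simply by slowing the processor down.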
Metric Example
                          Proc-A        Proc-B
Power/f^3 =               100           80

V = 1.25; f = 1 GHz
    Perf                  1000 MIPS     800 MIPS
    Power                 100 W         80 W
    MIPS/W                10            10

V = 1.0; f = 0.8 GHz
    Perf                  800 MIPS      640 MIPS
    Power                 51.2 W        41 W
    MIPS/W                15.6          15.6

V = 1.5; f = 1.2 GHz
    Perf                  1200 MIPS     960 MIPS
    Power                 172.8 W       138.2 W
    MIPS/W                6.9           6.9
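The numbers in this example can be reproduced with a short script. It assumes, as the example does, that power is (Power/f^3) times f^3 and that performance scales linearly with frequency. The helper name and the "MIPS per GHz" parameter are illustrative, inferred from the 1 GHz operating point.

```python
def operating_point(power_per_f3, mips_per_ghz, f_ghz):
    """Return (perf in MIPS, power in W, MIPS/W) at frequency f_ghz.

    Assumes power = power_per_f3 * f^3 and perf = mips_per_ghz * f.
    """
    power = power_per_f3 * f_ghz ** 3
    perf = mips_per_ghz * f_ghz
    return perf, power, perf / power


# (Power/f^3, MIPS at 1 GHz) for each processor, taken from the example.
processors = {"Proc-A": (100.0, 1000.0), "Proc-B": (80.0, 800.0)}

for name, (k, mips_per_ghz) in processors.items():
    for f in (1.0, 0.8, 1.2):
        perf, power, eff = operating_point(k, mips_per_ghz, f)
        print(f"{name} f={f} GHz: {perf:.0f} MIPS, {power:.1f} W, {eff:.1f} MIPS/W")
```

Both processors report the same MIPS/W at every operating point, which is why MIPS/W alone cannot distinguish them, while Power/f^3 (100 versus 80) is fixed per design.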
Metrics
• PD^3 gives the ratio of power if two processors were
tuned* to yield the same performance
• (PD^3)^(1/3) gives the ratio of performance if two
processors were tuned* to yield the same power
*Tuning is done through voltage and frequency
scaling and it is assumed that a linear relationship
exists between V and f – note that in modern
processors, this is not true and PD^x is the right
metric, where x > 3 (x can be 1 or 2 in markets
where performance is not very critical)
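As a worked sketch, the two ratios can be computed for the Proc-A and Proc-B operating points from the metric example (100 W at 1000 MIPS and 80 W at 800 MIPS). This holds only under the linear V-f assumption noted above. Delay is taken as 1/MIPS, and the function name is illustrative.

```python
def pd3(power_w, perf_mips):
    """P * D^3, with delay D taken as 1 / performance."""
    delay = 1.0 / perf_mips
    return power_w * delay ** 3


a = pd3(100.0, 1000.0)  # Proc-A at 1 GHz
b = pd3(80.0, 800.0)    # Proc-B at 1 GHz

# Power of B relative to A if both were tuned to the same performance.
power_ratio = b / a
# Performance of A relative to B if both were tuned to the same power.
perf_ratio = power_ratio ** (1.0 / 3.0)

print(f"power ratio (B/A) at equal performance: {power_ratio:.4f}")
print(f"performance ratio (A/B) at equal power: {perf_ratio:.4f}")
```

Under these assumptions Proc-B would need about 1.56 times the power of Proc-A to reach 1000 MIPS (80 W x 1.25^3 = 156.25 W), and Proc-A would be about 16% faster at equal power.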
Commercial Examples
Global Power Saving Strategies
• Dynamic frequency scaling – trivially reduces
power, worsens performance, no effect on energy
• If off-chip components (memory) dominate, there
will be an energy reduction with DFS
• Leakage power is unaffected by DFS, so if leakage
dominates, overall energy increases
• Montecito: 20MHz changes in frequency can
happen in a single cycle
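A toy energy model illustrating the three DFS cases above. It is a sketch under stated assumptions: dynamic power scales linearly with frequency, leakage power is constant, and a fraction of the baseline run time (mem_fraction, a hypothetical parameter) is spent waiting on off-chip memory and does not stretch when the core slows down. All wattages are made-up values.

```python
def dfs_energy(s, dyn_power, leak_power, mem_fraction=0.0, base_time=1.0):
    """Energy for a fixed amount of work when frequency is scaled by s.

    dyn_power and leak_power are the baseline (s = 1) powers in watts.
    """
    compute_time = base_time * (1.0 - mem_fraction) / s  # stretches as f drops
    memory_time = base_time * mem_fraction               # unaffected by core f
    total_time = compute_time + memory_time
    return (dyn_power * s + leak_power) * total_time


# Compute-bound, no leakage: energy is unchanged by DFS.
print(dfs_energy(1.0, 100.0, 0.0), dfs_energy(0.5, 100.0, 0.0))
# Leakage present: running slower costs more energy.
print(dfs_energy(1.0, 50.0, 50.0), dfs_energy(0.5, 50.0, 50.0))
# Memory-bound: run time barely grows, so energy drops.
print(dfs_energy(1.0, 100.0, 0.0, mem_fraction=0.8),
      dfs_energy(0.5, 100.0, 0.0, mem_fraction=0.8))
```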
Global Power Saving Strategies
• Dynamic voltage scaling – when frequency is lowered,
each circuit has more slack, so voltage can be lowered
as well – has a more than quadratic effect on dynamic
power, a linear effect on leakage power, and a more
than linear effect on energy
• Intel XScale: roughly 50 ms to scale from 1.65 V to 0.75 V
• DVS opportunities are shrinking: voltage margins are
lower and error rates may increase
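A minimal sketch of the DVS scaling claims, assuming voltage and frequency are scaled together by the same factor s (the idealized linear V-f relationship). Dynamic power then scales as s^3, leakage power as s, and run time as 1/s. The baseline powers are made-up values and the function name is illustrative.

```python
def dvs_scaled(s, dyn_power=80.0, leak_power=20.0, base_time=1.0):
    """Return (dynamic power, leakage power, energy) after scaling V and f by s.

    Assumes dynamic power scales as V^2 * f (s^3), leakage power scales
    linearly with V (s), and run time scales as 1/f (1/s).
    """
    dyn = dyn_power * s ** 3
    leak = leak_power * s
    time = base_time / s
    return dyn, leak, (dyn + leak) * time


for s in (1.0, 0.8, 0.6):
    dyn, leak, e = dvs_scaled(s)
    print(f"s={s}: dynamic={dyn:.2f} W, leakage={leak:.2f} W, energy={e:.2f} J")
```

In this model the dynamic component of energy falls as s^2 while the leakage component stays flat, so the energy saving shrinks as the leakage share grows.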
Localized Power Saving Strategies
• When a processor structure is not used in a cycle,
gate off its clock for that cycle – gating can happen
in a single cycle; increase in complexity
• Leakage energy can be reduced by gating off
supply voltage V during periods of inactivity – takes
more time to effect
• Body biasing can also reduce leakage power
Localized Power Saving Strategies
Dynamically adjust frequency/voltage and size for each
domain, based on throughput rates
Leakage Power
Leakage is a linear function of supply voltage, a linear function of the
number of transistors, and an exponential function of threshold voltage
From Butts and Sohi, MICRO’00
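The stated dependencies can be illustrated with a simplified model. This is a sketch, not the Butts and Sohi parameterization: the per-transistor current i0 and the subthreshold swing (volts per decade of leakage current) are hypothetical constants chosen only to show the shape of the relationship.

```python
def leakage_power(vdd, n_transistors, vt, i0=1e-6, swing=0.1):
    """Illustrative leakage power in watts.

    Linear in supply voltage and transistor count, exponential in
    threshold voltage.
    i0: hypothetical per-transistor leakage current at vt = 0 (amps)
    swing: hypothetical subthreshold swing (volts per decade of current)
    """
    return vdd * n_transistors * i0 * 10 ** (-vt / swing)


base = leakage_power(vdd=1.0, n_transistors=1e6, vt=0.3)
print(f"baseline:           {base:.6f} W")
print(f"double Vdd:         {leakage_power(2.0, 1e6, 0.3):.6f} W")
print(f"double transistors: {leakage_power(1.0, 2e6, 0.3):.6f} W")
print(f"Vt lowered by 0.1V: {leakage_power(1.0, 1e6, 0.2):.6f} W")
```

With these constants, lowering the threshold voltage by 0.1 V raises leakage tenfold, whereas doubling the supply voltage or the transistor count only doubles it.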
Power-Performance Trade-Offs
Caches and branch predictors are doubled at each point below, while
the x-axis represents the sizes of issue queues, registers, ROB, etc.
The result argues against going to wider/larger superscalars
Other Observations
• Clustered architectures have better power scalability
(since the complexity of each cluster remains unchanged)
• CMP and SMT can employ complexity-effective designs –
power consumption is low (little wasted work) and
multi-threaded performance continues to be high
From ISPASS’06