A Static Power Model for Architects

Power Management
Advanced Computer Architecture
Κωστή Ελένη Μ 487
Ραπτοπούλου Κλειώ Μ 515
Ψαρρά Τζένη Μ 510
Contents
Introduction
Basic definitions
Dynamic power management (DPM)
Static power
Current Power Reduction Techniques
Conclusion
References
Introduction
Microprocessor performance has been improving every year:
Semiconductor technology scaling
– Larger numbers of smaller and faster transistors.
Innovations in computer architecture and accompanying software
– Microprocessor performance greater than what would have been possible by technology scaling alone.
CPU Power Problem
Power consumption for Intel CPUs (following
figure).
X-axis: technology generation
Y-axis: maximum power consumption.
As indicated by the dashed line in the main part of the curve,
power consumption has been increasing with each new CPU
generation.
The points to the side of the main curve indicate newer versions
of each processor family. These are implemented in newer
semiconductor processes with smaller geometries than
the lead processor in that family.
Smaller feature sizes in conjunction with lower supply
voltages lead to lower power consumption in the newer
versions. However, moving to a new CPU generation in the
same process is associated with an increase in the power
consumption.
Architecture Importance
Two main reasons why architecture is
instrumental in boosting performance beyond
technology scaling:
Technology scaling is often non-uniform
processors are optimized for speed
main memories are mostly optimized for density.
Technology scaling facilitates higher integration
by allowing us to pack more transistors on a chip
of the same size.
Architecture Importance
More transistors and higher frequencies deliver
higher performance, but successive processor
generations bring increasing power requirements
and power density.
Microarchitectural mechanisms consume the
power, so they must be (re)designed to address power
concerns: the focus is on power-aware microarchitectural techniques.
Power Consumption
Nascent problems
Battery lifetime
Heat removal problems
Operating cost
Types of power consumption
Dynamic power: dissipated whenever a transistor
or wire changes voltage (switching).
Static power: dissipated due to leakage current,
even in the absence of any switching activity.
Traditionally the responsibility of circuit designers.
Dynamic vs Static Power
Dynamic Power
$P_{dyn} \propto C \, V_{CC}^2 \, f$
With smaller technologies, dynamic power per
transistor decreases
– C and $V_{CC}$ decrease
– f increases
Static Power
$P_{static} = V_{CC} \cdot I_{leak}$
Due to the leakage current
Increasing with technology scaling
– $V_{CC}$ decreases linearly
– $I_{leak}$ increases exponentially
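The two trends can be made concrete with a small numerical sketch. All quantities are normalized to 1 at generation 0, and the per-generation scaling factors (0.7 for C and VCC, 1.4 for f, 5 for the leakage current) are illustrative assumptions, not values taken from the slides.

```python
def dynamic_power(c, vcc, f):
    """Per-transistor dynamic power, proportional to C * Vcc^2 * f."""
    return c * vcc ** 2 * f


def static_power(vcc, i_leak):
    """Per-transistor static power: Vcc * I_leak."""
    return vcc * i_leak


def scaling_trend(generations, c_scale=0.7, vcc_scale=0.7,
                  f_scale=1.4, leak_scale=5.0):
    """Return (generation, dynamic, static) tuples, normalized to generation 0.

    The default scaling factors are assumptions chosen for illustration only.
    """
    c = vcc = f = i_leak = 1.0
    rows = []
    for gen in range(generations + 1):
        rows.append((gen, dynamic_power(c, vcc, f), static_power(vcc, i_leak)))
        c *= c_scale
        vcc *= vcc_scale
        f *= f_scale
        i_leak *= leak_scale
    return rows


if __name__ == "__main__":
    for gen, p_dyn, p_static in scaling_trend(3):
        print(f"generation {gen}: dynamic {p_dyn:.3f}, static {p_static:.3f}")
```

Under these assumed factors the dynamic term shrinks each generation while the static term grows, which is the qualitative behavior the slide describes.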
Dynamic vs Static
Dynamic power management (DPM)
Dynamic Power Management (DPM) is a design
methodology that dynamically reconfigures an electronic
system to provide the requested services and
performance levels with a minimum number of active
components or a minimum load on such components.
DPM encompasses a set of techniques that achieve
energy-efficient computation by selectively turning off
(or reducing the performance of) system components
when they are idle (or partially unexploited).
Model of dynamic power consumption
Model Parameters
An effective capacitance, $C_{eff}$, can be defined
which combines the physical capacitance being
switched, C, as in the previous slide, and the activity
factor α:
$C_{eff} = \alpha \cdot C$
The effective capacitance can be found from
simulation and measurements as:
$C_{eff} = P_d / (f \cdot V_{CC}^2)$
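As a worked example of the relation above, the following sketch derives an effective capacitance from a measured dynamic power and then reuses it to estimate power at another operating point. The function names and the sample numbers are hypothetical.

```python
def effective_capacitance(p_dyn_watts, f_hz, vcc_volts):
    """C_eff = P_d / (f * Vcc^2), in farads."""
    return p_dyn_watts / (f_hz * vcc_volts ** 2)


def dynamic_power(c_eff_farads, f_hz, vcc_volts):
    """P_d = C_eff * f * Vcc^2, in watts."""
    return c_eff_farads * f_hz * vcc_volts ** 2


if __name__ == "__main__":
    # Hypothetical measurement: 10 W at 1 GHz and 2.0 V.
    c_eff = effective_capacitance(10.0, 1e9, 2.0)
    print(f"C_eff = {c_eff:.3e} F")
    # Estimated power for the same design at 800 MHz and 1.8 V.
    print(f"P_d   = {dynamic_power(c_eff, 0.8e9, 1.8):.2f} W")
```

Because the effective capacitance already folds in the activity factor, it only transfers between operating points if the workload (and therefore α) stays roughly the same.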
Online Algorithms for DPM
The Deterministic Algorithm
(the request interarrival time probability
distribution is not known beforehand)
The Probability-based Algorithm
(the length of the idle interval is generated
by a fixed, known distribution)
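The slide does not spell out the deterministic algorithm. One common deterministic strategy for a device with a single sleep state is the break-even timeout: stay idle until the energy already spent idling equals the cost of a sleep/wake transition, then shut down. The sketch below assumes that strategy, a sleep power of zero, and hypothetical parameter names; it is not necessarily the exact algorithm the slide refers to.

```python
def break_even_time(p_idle, e_transition):
    """Idle time after which sleeping would have been cheaper than idling."""
    return e_transition / p_idle


def online_energy(idle_len, p_idle, e_transition):
    """Energy spent by the timeout policy on one idle interval.

    The policy idles up to the break-even time, then pays the transition
    energy and sleeps (sleep power is assumed to be zero).
    """
    t_be = break_even_time(p_idle, e_transition)
    if idle_len <= t_be:
        return p_idle * idle_len
    return p_idle * t_be + e_transition


def offline_energy(idle_len, p_idle, e_transition):
    """Energy of an optimal policy that knows the idle length in advance."""
    return min(p_idle * idle_len, e_transition)


if __name__ == "__main__":
    for idle_len in (1.0, 5.0, 50.0):
        on = online_energy(idle_len, p_idle=2.0, e_transition=10.0)
        off = offline_energy(idle_len, p_idle=2.0, e_transition=10.0)
        print(f"idle {idle_len:5.1f}: online {on:5.1f}, offline {off:5.1f}")
```

With this timeout the online policy never spends more than twice the energy of the offline optimum on any idle interval, which is why no knowledge of the interarrival distribution is needed.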
Static Power
Even when devices do not change values, power is
dissipated because of the imperfect nature of
semiconductor-based transistors. This is the static
power. In existing designs, static power is
relatively small. However, as we move towards
smaller transistors and lower voltages, static
power increases rapidly.
Target
To enable architects to consider static power
consumption in their design decisions
Problem: most factors affecting static power are
decided in circuit level design
Proposed solution: a model whose abstraction
level is appropriate for its application
– Relative but NOT absolute accuracy
Simplified Formula
A simple four-parameter model useful at the
architectural level:
$P_{static} = V_{CC} \cdot N \cdot k_{design} \cdot \hat{I}_{leak}$

Parameter: $V_{CC}$
Description: Power supply voltage
Scaling behavior: Decreases by 30 % per process generation
Reducing:
• Multiple supply voltage domains
• Increase IPC to allow lower clock frequency (allowing $V_{CC}$ reduction) at same performance

Parameter: $N$
Description: Number of transistors in design
Scaling behavior: Increases by 100 % per process generation
Reducing:
• Reduce functionality (e.g., removing special purpose circuitry)
• Use circuit style requiring fewer transistors for same functionality

Parameter: $k_{design}$
Description: Empirically determined parameter representing the characteristics of an average device
Scaling behavior: Approximately constant
Reducing:
• Use efficient circuit style
• Reduce clock frequency to allow more complex (high fan-in) logic

Parameter: $\hat{I}_{leak}$
Description: Technology parameter describing the per device subthreshold leakage
Scaling behavior: Highly dependent on aggressiveness of $V_T$ (threshold voltage) scaling
Reducing:
• Partition design into frequency domains allowing use of less aggressive (lower leakage) devices in some domains
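The four-parameter model is simple enough to evaluate directly. The sketch below implements the product and projects relative static power across process generations using the scaling behavior listed above (VCC falls by 30 %, N doubles, kdesign stays constant). The per-generation growth of the leakage parameter is not given on the slide, so `leak_growth` is an explicit assumption supplied by the caller.

```python
def static_power(vcc, n, k_design, i_leak):
    """P_static = Vcc * N * k_design * I_leak (per-device leakage in amperes)."""
    return vcc * n * k_design * i_leak


def relative_static_power(generations, leak_growth,
                          vcc_scale=0.7, n_scale=2.0):
    """Static power relative to generation 0, with k_design held constant.

    leak_growth is the assumed per-generation multiplier on the per-device
    leakage; it depends on how aggressively the threshold voltage is scaled.
    """
    per_generation = vcc_scale * n_scale * leak_growth
    return [per_generation ** gen for gen in range(generations + 1)]


if __name__ == "__main__":
    # Hypothetical design point: 1.5 V, 50 million devices, k_design = 10,
    # 1 nA of leakage per device.
    print(f"{static_power(1.5, 50e6, 10.0, 1e-9):.3f} W")
    # Relative trend if leakage per device grew 5x per generation (assumption).
    print(relative_static_power(3, leak_growth=5.0))
```

In line with the stated target of relative rather than absolute accuracy, the ratio between two design alternatives is the more trustworthy output of a model like this.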
Different Ways to Lower Static Power
Reducing the Supply Voltage
Reducing the Number of Devices
Using More Efficient Circuits
Using Multiple Threshold Voltages
Power Reduction with Speculation
Reducing Static Power
Reduce VCC
Not an architecturally controllable parameter
If performance is less sensitive to latency, VCC can drop
Reduce supply voltage for entire chip without
partitioning
– The global clock frequency must be reduced
Reducing Static Power
Reduce VCC
Partition circuitry into several domains operating
at different supply voltage levels
– Both static and dynamic power savings are
possible
– Used for off-chip communication parts
– Extra delay on crossing domain boundaries
Reducing Static Power
Reduce the total number of devices
Very difficult without affecting the performance
or functionality
Cache size, number of functional units and
issue/retire bandwidth are first targets due to
varying degrees of difficulty and performance
impact
Reducing Static Power
Reduce the total number of devices
Turn off devices (when they are unused) rather
than eliminating them
– Power gating analogous to clock gating
– Additional circuitry is used for determination
of the unit’s necessity
However…
– The addition of a gating device reduces performance
and noise margins
– Latency for turning on a device
(two alternative latency cases)
– Possible partitions:
• Decode logic for a rare or privileged instruction
• Interrupt logic
• Logic to handle certain rare exceptions
Reducing Static Power
Use more static power efficient circuits
kdesign values can be used for static power reduction.
– Use choices with lower kdesign
– Wide multiplexors: higher cost
(growing with the number of inputs)
– A tri-state bus with multiple drivers has stacked
devices and accomplishes the same function with
lower total leakage
So…
– Instead of wide multiplexors, use a tri-state bus with
multiple drivers
– Associative arrays are approximately three times
leakier than random-access memories
Reducing Static Power
Use multiple threshold voltages
Different transistor speeds may be used in
different ways.
Employment of fast devices only along critical
timing paths
Determining which functional units require the
lowest latencies and allocating the budget of fast,
leaky devices to these units only.
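The allocation step can be sketched as a simple greedy pass: rank functional units by how latency-critical they are and hand out the budget of fast, leaky devices in that order. The unit names, criticality scores, device counts, and the greedy rule itself are assumptions made for illustration; the slides do not prescribe a specific procedure.

```python
def allocate_fast_devices(units, fast_budget):
    """Greedily assign fast (low threshold voltage) devices to critical units.

    units: list of (name, criticality, device_count) tuples, where a higher
    criticality means the unit needs lower latency.
    Returns the names of units built from fast devices and the unused budget.
    """
    remaining = fast_budget
    fast_units = []
    by_criticality = sorted(units, key=lambda unit: unit[1], reverse=True)
    for name, _criticality, device_count in by_criticality:
        if device_count <= remaining:
            fast_units.append(name)
            remaining -= device_count
    return fast_units, remaining


if __name__ == "__main__":
    # Hypothetical units: (name, criticality, device count in thousands).
    units = [("alu", 0.9, 40), ("l2_tags", 0.2, 70),
             ("issue", 0.8, 50), ("fpu", 0.5, 30)]
    print(allocate_fast_devices(units, fast_budget=100))
```

Units that do not receive fast devices are built from slower, lower-leakage devices, which is where the static power saving comes from.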
Reducing Static Power
Speculation
Speculate the result of a complicated power
hungry device using simple power efficient device
– Usage of static data reduction methods for selecting the appropriate devices for speculation
– Data speculation on L1 cache accesses (an example)
• The majority of accesses will be hits
• Retrieve data without checking the tag
• Use a slower, power-efficient circuit to check the tag
• Use the tag check to detect mis-speculation
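The L1 example can be modeled in a few lines: a fast path returns whatever data sits in the indexed line, and a separate (in hardware: slower, lower-leakage) tag check confirms or rejects it afterwards. This is a behavioral sketch of a direct-mapped cache only; the class and method names are hypothetical and no timing or power is modeled.

```python
class SpeculativeCache:
    """Direct-mapped cache with separate fast-data and slow-tag paths."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.tags = [None] * num_sets
        self.data = [None] * num_sets

    def fill(self, addr, value):
        index = addr % self.num_sets
        self.tags[index] = addr // self.num_sets
        self.data[index] = value

    def speculative_read(self, addr):
        """Fast path: return the indexed data without checking the tag."""
        return self.data[addr % self.num_sets]

    def tag_matches(self, addr):
        """Slow path: the tag check that validates the speculation."""
        return self.tags[addr % self.num_sets] == addr // self.num_sets


def load(cache, addr, memory):
    """Return (value, mis_speculated) for a load from addr."""
    value = cache.speculative_read(addr)   # consumers could start with this
    if cache.tag_matches(addr):
        return value, False                # speculation was correct
    value = memory[addr]                   # recover: fetch the real data
    cache.fill(addr, value)
    return value, True


if __name__ == "__main__":
    memory = {0: "a", 4: "b"}
    cache = SpeculativeCache(num_sets=4)
    for addr in (0, 0, 4, 4):
        print(addr, load(cache, addr, memory))
```

Since most accesses hit, the recovery path is taken rarely, which is what makes it acceptable to build the tag check from slow, power-efficient devices.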
Current Power Reduction Techniques
Many circuit techniques
Clock Gating
Input Vector Determination Technique
Some architecture techniques
Pipeline Gating
No published techniques for static power
Clock Gating
The clock is the largest contributor to CPU power.
Reducing the switched capacitance on the clock
will thus have the most impact on total power.
Consider first digital components that are clocked.
This class of components:
• is wide, and
• includes most processors, controllers and memories.
Power consumption in clocked digital components (in CMOS
technology) is roughly proportional to the clock frequency and to
the square of the supply voltage.
Power can be saved by reducing the clock frequency (and in the
limit by stopping the clock), or by reducing the supply voltage
(and in the limit by powering off a component).
Clock Gating
An effective way to do this is to:
Partition the clock network and
allow only those portions that are needed on each
cycle to toggle.
Namely, the clock of an idle component can be stopped
during the period of idleness. Power savings are achieved in
the registers (whose clock is halted) and in the combinational
logic gates where signals do not propagate due to the
freezing of data in registers.
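A rough way to see the saving is to count how many block-level clock deliveries happen with and without gating. The sketch below assumes each block presents the same clock load, so the count is only a proxy for switched clock capacitance; block names and the activity trace are hypothetical.

```python
def clocked_block_cycles(blocks, needed_per_cycle, gating=True):
    """Count (block, cycle) pairs in which a block receives a clock edge.

    blocks: names of all clocked blocks.
    needed_per_cycle: one set per cycle naming the blocks doing useful work.
    With gating, only needed blocks are clocked; without it, all blocks are.
    """
    total = 0
    for needed in needed_per_cycle:
        if gating:
            total += sum(1 for block in blocks if block in needed)
        else:
            total += len(blocks)
    return total


if __name__ == "__main__":
    blocks = ["fetch", "int_exec", "fp_exec"]
    trace = [{"fetch", "int_exec"}, {"fetch"}, {"fetch", "fp_exec"}]
    ungated = clocked_block_cycles(blocks, trace, gating=False)
    gated = clocked_block_cycles(blocks, trace, gating=True)
    print(f"ungated: {ungated}, gated: {gated}")
```

In this trace the floating-point block is idle for two of three cycles, so gating removes its clock activity for those cycles.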
Clock Gating
Issues of concern:
A disabled block may not power up in time,
or modified clocks may generate glitches.
The impact on current variations when
large blocks are switched on and off.
Why use clock gating?
Clock gating is widely used because:
It is conceptually simple
It has small overhead in terms of additional circuits
(the clock can be restarted by simply deasserting the
clock-freezing signal)
It often has zero performance overhead because the
component can transition from an idle to an active
state in one (or a few) cycles
Caching Strategies
The size and type of the cache have a big influence on
power consumption. A cache with a high hit ratio
significantly decreases off-chip memory communication. On
the other hand, a cache itself consumes quite a lot of power and
chip area (following figure).
At least two types of caches are
present in current microprocessors:
one for instructions
and one for data
Input Vector Determination Technique
Consideration:
A combinational circuit whose input nodes are state bits of an overall sequential circuit which will be put in standby mode.
Target:
We need to choose an input vector for the combinational circuit that causes it to dissipate very low leakage power. The search for the vector that gives the least leakage power is very difficult because of the potentially huge size of the search space. Furthermore, it is not absolutely necessary to find this minimizing vector.
Input Vector Determination Technique
Solution:
Development of an algorithm to find such a vector based
on a process of random sampling. Randomly chosen
vectors are applied to the circuit and the leakage due to
each is monitored, and the vector which gives the least
observed leakage value is reported. Clearly, the number
of vectors to be applied determines the quality of the
resulting solution.
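The random sampling procedure described above can be sketched as follows. The leakage model here is a toy stand-in (a chain of two-input gates with made-up per-state leakage values); in practice the leakage of each vector would come from circuit simulation or measurement. All names and numbers are hypothetical.

```python
import random

# Hypothetical leakage of one two-input gate for each input state
# (arbitrary units); real values would come from circuit simulation.
GATE_LEAKAGE = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.5, (1, 1): 4.0}


def circuit_leakage(vector):
    """Toy circuit: one gate on every pair of adjacent inputs."""
    return sum(GATE_LEAKAGE[(vector[i], vector[i + 1])]
               for i in range(len(vector) - 1))


def min_leakage_vector(leakage_fn, num_inputs, num_samples, seed=0):
    """Apply random vectors and keep the one with the least observed leakage."""
    rng = random.Random(seed)
    best_vector, best_leakage = None, float("inf")
    for _ in range(num_samples):
        vector = tuple(rng.randint(0, 1) for _ in range(num_inputs))
        leakage = leakage_fn(vector)
        if leakage < best_leakage:
            best_vector, best_leakage = vector, leakage
    return best_vector, best_leakage


if __name__ == "__main__":
    for samples in (10, 100, 1000):
        vector, leakage = min_leakage_vector(circuit_leakage, 16, samples)
        print(samples, leakage, vector)
```

The result is the best vector seen, not a guaranteed minimum; drawing more samples can only improve it, at the cost of more evaluations.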
Pipeline Gating
An innovative method for power reduction in
high-performance microprocessors without
impacting performance.
Control rampant speculation in the pipeline.
Inexpensive mechanisms for determining when a
branch is likely to mispredict, and for stopping wrong-path
instructions from entering the pipeline.
Figure: Power consumption for the Pentium Pro chip, broken down by individual processor components (an example).
Components shown: Reg Alias Table, Reservation Station, Ext Bus Logic, Rest, Clock, Fp Exec, Int Exec, Inst Fetch, Data Cache, Reorder Buf, Inst Dec
Goals and Contributions
Control speculation and reduce the amount of
unnecessary work in high-performance, wide-issue,
superscalar processors.
Contributions:
A method to reduce the number of speculatively issued
instructions
A comparison of the effectiveness and cost of this design using
various confidence estimation mechanisms
Results which show a significant reduction in
unnecessary work with a negligible performance loss.
Pipeline Gating
Pipeline with two fetch and decode cycles, showing the additional
hardware required for pipeline gating. The low-confidence branch
counter records the number of unresolved branches that were reported as
low confidence. The counter value is compared against a threshold value
(“N”). The processor ceases instruction fetch if there are more than N
unresolved low-confidence branches in the pipeline.
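The gating logic described in the caption amounts to one counter and one comparison. The sketch below models only that control decision; the class and method names are hypothetical, and the confidence flag is assumed to come from a separate estimator.

```python
class PipelineGate:
    """Counter of unresolved low-confidence branches with a fetch gate."""

    def __init__(self, threshold):
        self.threshold = threshold          # the "N" in the description
        self.low_confidence_unresolved = 0

    def on_branch_predicted(self, low_confidence):
        """Called when a predicted branch enters the pipeline."""
        if low_confidence:
            self.low_confidence_unresolved += 1

    def on_branch_resolved(self, was_low_confidence):
        """Called when a branch outcome becomes known."""
        if was_low_confidence and self.low_confidence_unresolved > 0:
            self.low_confidence_unresolved -= 1

    def fetch_enabled(self):
        """Fetch stops when more than N low-confidence branches are unresolved."""
        return self.low_confidence_unresolved <= self.threshold


if __name__ == "__main__":
    gate = PipelineGate(threshold=1)
    gate.on_branch_predicted(low_confidence=True)
    print(gate.fetch_enabled())   # one unresolved low-confidence branch
    gate.on_branch_predicted(low_confidence=True)
    print(gate.fetch_enabled())   # two unresolved: exceeds N = 1
    gate.on_branch_resolved(was_low_confidence=True)
    print(gate.fetch_enabled())   # back to one unresolved
```

A smaller threshold gates fetch more often, trading a higher chance of stalling correct-path instructions for less wrong-path work.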
Confidence Estimators
Confidence estimation is a diagnostic test that attempts
to classify each branch prediction as having “high
confidence”, meaning that the branch was likely predicted
correctly, or “low confidence”, meaning the branch was
likely mis-predicted.
Perfect confidence estimation
JRS Confidence estimator
Static confidence estimation
Saturating Counters
Distance
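As an illustration of how a counter-based estimator from this list might work, the sketch below keeps a table of saturating counters that count correct predictions and reset on a misprediction; a branch is classified as high confidence once its counter reaches a threshold. This is loosely in the spirit of counter-based schemes such as JRS, but the indexing by branch address alone, the table size, and the threshold are all assumptions made for the example.

```python
class ResettingCounterEstimator:
    """Confidence estimator built from a table of resetting counters."""

    def __init__(self, table_size=1024, max_count=15, threshold=15):
        self.table = [0] * table_size
        self.max_count = max_count
        self.threshold = threshold

    def _index(self, pc):
        # Assumption: index by branch address only; real designs may also
        # hash in branch history.
        return pc % len(self.table)

    def is_high_confidence(self, pc):
        return self.table[self._index(pc)] >= self.threshold

    def update(self, pc, prediction_correct):
        """Count correct predictions; reset to zero on a misprediction."""
        index = self._index(pc)
        if prediction_correct:
            self.table[index] = min(self.table[index] + 1, self.max_count)
        else:
            self.table[index] = 0


if __name__ == "__main__":
    estimator = ResettingCounterEstimator(table_size=16, max_count=3, threshold=2)
    pc = 0x40
    for outcome in (True, True, False, True):
        print(estimator.is_high_confidence(pc))
        estimator.update(pc, prediction_correct=outcome)
```

The low-confidence flag produced by an estimator like this is the input that drives the low-confidence branch counter used for pipeline gating.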
Conclusions
Power consumption is as important a design criterion as
performance, even if your application is plugged into
the wall
Low power design starts at the system level:
a top-down approach will yield the greatest results
Power management is a system issue that requires
circuit, microarchitecture and software interaction
Performance requirements are increasing and so
microarchitectural complexity must go up without
sacrificing power
Conclusions
Set your power budget at the start of the design and
measure it as you go
Understand where the power goes in your designs
today and use the data to improve future products
Low power design presents new challenges:
reducing mW is much more interesting than
increasing MHz!
Conclusion
Static power dissipation will become an important
component of overall power dissipation
– It may catch up with dynamic power in two to three
generations
Architects will need to address this problem at the
architectural design level
References
A. Moshovos and G. S. Sohi. Micro-Architectural Innovations: Boosting Microprocessor Performance Beyond Semiconductor Technology Scaling.
J. A. Butts and G. S. Sohi. A static power model for architects. In Proc. 33rd Annual International Symposium on Microarchitecture, pages 248–258, Dec. 2000.
J. P. Halter and F. N. Najm. A gate-level leakage power reduction method for ultra-low-power CMOS circuits. In Proc. IEEE Custom Integrated Circuits Conference, pages 475–478, May 1997.
S. Manne, A. Klauser, and D. Grunwald. Pipeline gating: speculation control for energy reduction. In Proc. 25th Annual International Symposium on Computer Architecture, pages 132–141, June-July 1998.
V. Tiwari, D. Singh, S. Rajgopal, G. Mehta, R. Patel, and F. Baez. Reducing power in high-performance microprocessors.
F. Gruian. Microprocessors: Low Power and Low Energy Solutions.
D. Monticelli. System Approaches to Power Management.
L. Benini, A. Bogliolo, and G. De Micheli. A Survey of Design Techniques for System-Level Dynamic Power Management.
S. Irani, S. K. Shukla, and R. K. Gupta. Competitive Analysis of Dynamic Power Management Strategies for Systems with Multiple Power Saving States.
T. Njølstad (NTNU). Scaling principles for low power.
A. Iyer and D. Marculescu. Power Aware Microarchitecture Resource Scaling.
S.-H. Yang, M. D. Powell, B. Falsafi, K. Roy, and T. N. Vijaykumar. An Integrated Circuit/Architecture Approach to Reducing Leakage in Deep-Submicron High-Performance I-Caches. In International Symposium on High-Performance Computer Architecture, Jan. 2001.