Lecture 7: High-level power reduction and management
High-level Power Reduction and
Management
Copyright Agarwal & Srivaths, 2007
Low-Power Design and Test, Lecture 7
Outline
General Observations
RTL Power Management Techniques
■ Gated Clock Architecture
■ Precomputation
■ Guarded Evaluation
Behavior-Level Power Reduction Techniques
■ Performance Speedup Techniques
● Algebraic Transformations
● Common Case Computation
■ Switched Capacitance Reduction
● Algebraic Transformations
Power Supply Gating
■ Basic Concept
■ Isolation Cells
■ Retention Flip-Flops
General Observations
Not all components need to be active all the time
Energy-efficient computation is achieved by selectively turning off
(or reducing the performance of) system components when they
are idle
Issues:
■ Controls to support power management
● Frequency control (clock gating)
● Voltage control (power shutdown)
■ Identify when circuits (or parts) can be idle
■ Location of controls
● Hardware
● Software (Hybrid)
Gated Clock Architecture
Block Fa is controlled by primary inputs, state, and primary outputs
[Figure: gated clock architecture. Labels: IN, OUT, STATE register, combinational logic, block Fa, latch L, AND gate (&), CLK, GCLK]
Latch L takes care of filtering glitches
■ L is transparent when clock is inactive
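The glitch-filtering role of latch L can be illustrated with a small behavioral model. This is a minimal Python sketch, not the circuit from the slide: it assumes an active-high clock, a latch that is transparent while CLK = 0, and an enable polarity where 1 lets the clock through. The function name `gated_clock` and the sample waveforms are hypothetical.

```python
def gated_clock(clk, en):
    """Return the GCLK waveform for sampled CLK and enable waveforms.

    Assumptions (not from the slides): the clock is active high, the latch
    is transparent while CLK = 0, and enable = 1 lets the clock through.
    """
    latched = 0
    gclk = []
    for c, e in zip(clk, en):
        if c == 0:
            latched = e  # latch transparent: follow the enable
        # While c == 1 the latch holds, so glitches on `en` are ignored.
        gclk.append(c & latched)
    return gclk


# Two samples per clock phase; the enable glitches during the high phases.
clk = [0, 0, 1, 1, 0, 0, 1, 1]
en = [1, 1, 0, 1, 0, 0, 1, 0]
filtered = gated_clock(clk, en)
unfiltered = [c & e for c, e in zip(clk, en)]  # plain AND gate, no latch
print(filtered)
print(unfiltered)
```

With the latch, GCLK either passes a full clock pulse or none; the plain AND gate lets the enable glitches chop the pulses.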
Gated Clock Architecture: Redundant Clocking Detection
Idea [Ohnishi97]:
■ Redundant clockings activate registers unnecessarily
■ Use application profiles to detect redundant clockings
● Difference in the numbers of incoming and outgoing data of a
register
■ Gated clock scheme designed using this information
Redundant behaviors of a register
■ Unused data latching: Data not transferred to a destination
■ Unchanged data latching: Register re-stores data already present
from source
■ Redundant data holding: Register re-stores data already present
from itself.
Redundant Clocking Detection
Identify the redundant behaviors of register X during the 10-clock-cycle
snapshot shown.
[Figure: 10-cycle snapshot of register X activity. Courtesy: [Ohnishi97]]
■ Number of unused data latchings of X: $A_{UU}(X) = 8 - 6 = 2$
■ Number of unchanged data latchings of X: $A_{UC}(X) = 8 - 5 = 3$
■ Number of redundant data holdings of X: $A_{HOLD}(X) = 10 - 8 = 2$
Algorithm
Algorithm for redundant clocking detection and gated clock
architecture definition
1. Register data transfer condition extraction
● Analyze RTL HDL of circuit to extract data transfer
conditions
● Conditions under which data transfers to/from a register
happen
2. Profiling
● Count the number of times these conditions become
true during RTL simulation
● Estimate the number of redundant behaviors of each
register from these counts
3. Register grouping algorithm applied and gated clock
introduced for each group
Register Data Transfer Conditions
Data Transfer Graph (DTG) captures the data transfer condition between
registers (denoted $C_{RT}(v_i, v_j)$)
[Figure: example DTG. Courtesy: [Ohnishi97]]
Register Data Transfer Conditions
Three types of data transfer conditions
■ $C_{LAT}(v_i)$: data transfer condition between register i and one or more source registers of i
$$C_{LAT}(v_i) = \bigvee_{r=1}^{m} C_{RT}(v_r, v_i)$$
■ $C_{USED}(v_i)$: data transfer condition between register i and one or more destination registers of i
$$C_{USED}(v_i) = \bigvee_{r=1}^{n} C_{RT}(v_i, v_r)$$
■ $C_{CHG}(v_i)$: data transfer condition to one or more source registers of i
$$C_{CHG}(v_i) = \bigvee_{r=1}^{k} C_{LAT}(v_r)$$
Courtesy: [Ohnishi97]
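The three conditions can be evaluated mechanically once the DTG edges and their conditions are known. The following Python sketch assumes each edge condition is a predicate over the control signals of one cycle; the register names, the signal names (`load`, `sel`, `go`), and the function `transfer_conditions` are hypothetical and only illustrate the definitions.

```python
def transfer_conditions(c_rt, signals):
    """Evaluate C_LAT, C_USED and C_CHG for every register in one cycle.

    c_rt maps a DTG edge (src, dst) to a predicate over `signals`.
    """
    active = {edge: bool(cond(signals)) for edge, cond in c_rt.items()}
    regs = sorted({r for edge in c_rt for r in edge})
    # C_LAT(v): OR over all incoming edges of v.
    c_lat = {v: any(on for (_, dst), on in active.items() if dst == v) for v in regs}
    # C_USED(v): OR over all outgoing edges of v.
    c_used = {v: any(on for (src, _), on in active.items() if src == v) for v in regs}
    # C_CHG(v): OR of C_LAT over all source registers of v.
    c_chg = {v: any(c_lat[src] for (src, dst) in c_rt if dst == v) for v in regs}
    return c_lat, c_used, c_chg


# Hypothetical DTG: IN -> A -> X -> B
c_rt = {
    ("IN", "A"): lambda s: s["load"],
    ("A", "X"): lambda s: s["sel"],
    ("X", "B"): lambda s: s["go"],
}
c_lat, c_used, c_chg = transfer_conditions(c_rt, {"load": 1, "sel": 0, "go": 1})
print(c_lat["X"], c_used["X"], c_chg["X"])
```

In this cycle X does not latch new data, its value is used by B, and its source A is being overwritten.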
Profiling
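Count the number of times $C_{LAT}(v_i)$, $C_{USED}(v_i)$, and $C_{CHG}(v_i)$
become true during RTL simulation
■ Call these numbers $A_{LAT}(v_i)$, $A_{USED}(v_i)$, and $A_{CHG}(v_i)$
We can now determine
$$A_{HOLD}(v_i) = A_{CK}(v_i) - A_{LAT}(v_i)$$
$$A_{UU}(v_i) = A_{LAT}(v_i) - A_{USED}(v_i)$$
$$A_{UC}(v_i) = A_{LAT}(v_i) - A_{CHG}(v_i)$$
where $A_{CK}(v_i)$ is the number of times register $v_i$ is clocked (10 in the initial example)
Recall our initial example!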
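The counts from the initial example can be reproduced with a few lines of Python. This is a sketch only: the per-cycle traces below are invented so that their totals match the example (10 clockings, 8 latchings, 6 uses, 5 source changes); the actual cycle-by-cycle pattern in the figure is not recoverable from the text.

```python
def redundancy_counts(lat, used, chg):
    """Derive redundant-behavior counts from per-cycle 0/1 traces of
    C_LAT, C_USED and C_CHG for one register."""
    a_ck = len(lat)  # the register is clocked once per simulated cycle
    a_lat, a_used, a_chg = sum(lat), sum(used), sum(chg)
    return {
        "A_HOLD": a_ck - a_lat,   # redundant data holding
        "A_UU": a_lat - a_used,   # unused data latching
        "A_UC": a_lat - a_chg,    # unchanged data latching
    }


# Hypothetical traces with A_LAT = 8, A_USED = 6, A_CHG = 5 over 10 cycles.
lat = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
used = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
chg = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
counts = redundancy_counts(lat, used, chg)
print(counts)
```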
Register Grouping Algorithm
1. Record the clock cycles in which each register behaves redundantly, as follows:
■ Calculate $A_{HOLD} + A_{UU} + A_{UC}$ in every cycle for each register
■ If $(A_{HOLD} + A_{UU} + A_{UC})_{cycle\ t} > (A_{HOLD} + A_{UU} + A_{UC})_{cycle\ t-1}$,
record t (redundant clocking detected in cycle t)
2. Greedy grouping of registers
foreach reg i that does not belong to any group
{
    Add i to new Group Gi;
    foreach reg j that does not belong to any group
    {
        #redundancy_similarity = #clock_cycles in which both i and j behave redundantly;
        if (#redundancy_similarity > threshold)
            Add j to Gi;
    }
}
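The greedy grouping step can be sketched in Python as follows. The sketch assumes that step 1 has already produced, for each register, the set of cycles in which redundant clocking was detected; the register names, the profile data, and the threshold value are hypothetical.

```python
def group_registers(redundant_cycles, threshold):
    """Greedily group registers whose redundant cycles overlap.

    redundant_cycles maps a register name to the set of clock cycles in
    which redundant clocking was detected for that register.
    """
    groups = []
    grouped = set()
    for i in redundant_cycles:
        if i in grouped:
            continue
        group = [i]
        grouped.add(i)
        for j in redundant_cycles:
            if j in grouped:
                continue
            # Number of cycles in which both i and j behave redundantly.
            similarity = len(redundant_cycles[i] & redundant_cycles[j])
            if similarity > threshold:
                group.append(j)
                grouped.add(j)
        groups.append(group)
    return groups


profile = {
    "R1": {1, 2, 3, 4, 7},
    "R2": {2, 3, 4, 7, 9},
    "R3": {5, 6},
    "R4": {5, 6, 8},
}
groups = group_registers(profile, threshold=1)
print(groups)
```

Each resulting group is a candidate for sharing one gated clock, subject to the power threshold applied in the following steps.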
Register Grouping Algorithm
3. Calculate the total redundant power for each group
4. Select groups whose total redundant powers are more than a
given threshold power
Pre-computation
Duplicate part of logic to precompute circuit output values one
cycle before they are required
Use these values to reduce the
total amount of switching in the
circuit in the next cycle
Circuit Embodiments
■ g1, g0 : Predictor functions
● $g_1 = 1 \Rightarrow f = 1$
● $g_0 = 1 \Rightarrow f = 0$
■ LE = 0 when either g1 or g0
evaluates to 1
[Figures: original circuit (n inputs, single output); circuit with pre-computation. Courtesy: [Macii98]]
Pre-computation
An Example [Devadas95]
■ N-bit comparator
■ Pre-computation circuit based
on the behavior of the
comparison operation
● If the MSBs of C and D are
not equal, C>D can be
evaluated just using the
MSBs
● Otherwise, the rest of the
bits (of C and D) are also
needed.
■ Therefore, LE is given by
$$LE = \overline{C(n-1) \oplus D(n-1)}$$
[Figures: comparator circuit; comparator circuit with pre-computation (XNOR)]
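The comparator example can be checked with a short behavioral model. This Python sketch assumes unsigned n-bit operands and models only the function, not the registers or timing; the name `compare_with_precomputation` and the random test vectors are hypothetical.

```python
import random


def compare_with_precomputation(c, d, n):
    """Return (C > D, LE) for n-bit unsigned operands."""
    msb_c = (c >> (n - 1)) & 1
    msb_d = (d >> (n - 1)) & 1
    le = 1 - (msb_c ^ msb_d)  # LE = XNOR of the MSBs
    if le == 0:
        # The MSBs differ, so they decide the result; the low-order
        # input registers are not loaded and the main logic stays quiet.
        return msb_c == 1, le
    low_mask = (1 << (n - 1)) - 1
    return (c & low_mask) > (d & low_mask), le


random.seed(0)
n = 8
trials = [(random.randrange(1 << n), random.randrange(1 << n)) for _ in range(10_000)]
results = [compare_with_precomputation(c, d, n) for c, d in trials]
loads = sum(le for _, le in results)
print(f"low-order registers loaded in {loads / len(trials):.0%} of cycles")
```

For uniformly random inputs the MSBs differ about half of the time, so the low-order registers are loaded in roughly half of the cycles; real input statistics may differ.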
Guarded Evaluation
Operand Isolation: Use transparent latches as a mechanism for shutting
down redundant switching
■ Latches enabled when useful computation needs to be done
Guarded Evaluation [Tiwari98]
■ Identifies where transparent latches must be placed
■ Identifies which signals control enable/disable of these latches
[Figures: original circuit; circuit with guard logic. Courtesy: [Macii98]]
Guarded Evaluation
An Example RTL Circuit: Dual-operation ALU
■ ctrl = 0: SHIFT operation performed; ctrl = 1: ADD operation performed
■ Clock gating will not work here!
[Figures: ALU (REG A and REG B feed SHIFTER and ADDER; ctrl selects the output, 0 = SHIFTER, 1 = ADDER); ALU with guard logic on the operand inputs, controlled by ctrl]
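The benefit of guarding the adder inputs can be estimated by counting input toggles. This is a rough Python sketch under simplifying assumptions: input bit toggles stand in for switching activity, the guard latches hold their previous value whenever ctrl = 0, and the operand trace is invented for illustration.

```python
def adder_input_toggles(trace, guarded):
    """Count bit toggles seen at the adder inputs over a trace.

    trace is a list of (a, b, ctrl) tuples, one per cycle; ctrl = 1 selects
    ADD and ctrl = 0 selects SHIFT.
    """
    seen_a = seen_b = 0  # values currently present at the adder inputs
    toggles = 0
    for a, b, ctrl in trace:
        if guarded and ctrl == 0:
            continue  # guard latches hold: the adder inputs do not change
        toggles += bin(seen_a ^ a).count("1") + bin(seen_b ^ b).count("1")
        seen_a, seen_b = a, b
    return toggles


trace = [
    (0b1010, 0b0101, 1),
    (0b0101, 0b1010, 0),
    (0b1111, 0b0000, 0),
    (0b1010, 0b0101, 1),
]
unguarded = adder_input_toggles(trace, guarded=False)
guarded = adder_input_toggles(trace, guarded=True)
print(unguarded, guarded)
```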
Background: Observability Don’t Cares
Well known concept in logic synthesis
ODC set of a Boolean variable x: Conditions on the Primary
Inputs such that x is not observable at the Primary Outputs.
Example: AND gate with inputs x,y and output z
■ x is not observable when y is 0
■ x is not observable when z is not observable
$$ODC(x) = \overline{y} + ODC(z)$$
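For small circuits the ODC set can be found by brute force: flip the variable and check whether the output changes. The Python sketch below does this for the AND gate example, assuming z is itself a primary output (so ODC(z) is empty); the helper `odc` is hypothetical and exponential in the number of inputs.

```python
from itertools import product


def odc(circuit, inputs, target):
    """Brute-force ODC of `target`: assignments to the other inputs for
    which flipping `target` leaves the circuit output unchanged."""
    others = [name for name in inputs if name != target]
    dont_care = []
    for values in product([0, 1], repeat=len(others)):
        env = dict(zip(others, values))
        if circuit(**env, **{target: 0}) == circuit(**env, **{target: 1}):
            dont_care.append(env)
    return dont_care


def and_gate(x, y):
    return x & y


odc_x = odc(and_gate, ["x", "y"], "x")
print(odc_x)
```

The only assignment returned is y = 0, which matches the first term of the expression above.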
Guarded Evaluation
Exploit observability don't care set $ODC_z$
■ Set of PI assignments to X so
that the value at z has no
effect at POs.
■ Then the guard logic control
signal s must satisfy the
logical condition
$$s \Rightarrow ODC_z$$
■ Further,
$$t_l(s) < t_e(Y)$$
where $t_l(s)$ is the latest settling time of s to 1 and $t_e(Y)$ is the earliest time an input to F can change
[Figure: circuit with guard logic (pure guarded evaluation)]
Guarded Evaluation
Extended Guarded Evaluation
■ Larger set of conditions under
which we can shut off logic
$$s \Rightarrow (x + ODC_z)$$
■ Shutdown conditions now additionally include
● PI assignments not in $ODC_z$
● But for which z = 1
[Figure: circuit with extended guard logic; signals z, w, s]
Behavior-level Power Reduction Techniques
Recall the equation for dynamic power consumption
$$P_{dyn} = \frac{1}{2} C V_{dd}^2 \cdot a \cdot f$$
Two key approaches for reducing power:
■ Use performance speed-up transformations, and trade-off
performance for power through voltage scaling
● How will this work?
■ Reduce the effective capacitance being switched
Trading off performance for power consumption benefits
Exploit voltage and frequency scaling to trade-off performance
gains for significant power consumption savings
When voltage and frequency
scaling is performed, we can
calculate the power consumption
benefits by determining the new
operating voltage
■ Let Topt be the shortened
execution time due to the use of
performance optimization
■ Assume that the voltage scaled
circuit takes the same time
(TORIG) to complete as the
original circuit
[Figure: supply voltage versus execution time. At Vdd the optimized circuit finishes in Topt; at the scaled voltage Vddnew it finishes in TORIG]
Trading off performance for power consumption benefits
We first have the following equations for $T_{opt}$ and $T_{orig}$
$$T_{opt} = N_{cyc} \cdot \frac{1}{f_{orig}}$$
$$T_{orig} = N_{cyc} \cdot \frac{1}{f_{new}}$$
$$\frac{T_{opt}}{T_{orig}} = \frac{f_{new}}{f_{orig}}$$
The dependency of frequency on circuit voltage is given below
$$f \propto \frac{(V_{dd} - V_t)^2}{V_{dd}}$$
We therefore have the following equation for calculating $V_{ddnew}$
$$\frac{T_{opt}}{T_{orig}} = \frac{(V_{ddnew} - V_t)^2}{(V_{dd} - V_t)^2} \cdot \frac{V_{dd}}{V_{ddnew}}$$
When $V_t$ is small compared to the supply voltages, this simplifies to
$$\frac{T_{opt}}{T_{orig}} \approx \frac{V_{ddnew}}{V_{dd}}$$
Use $V_{ddnew}$ to calculate final power consumption!
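The exact equation has no convenient closed form for everyday use, but it is easy to solve numerically. The Python sketch below uses bisection; the function name `scaled_vdd` and the threshold voltage of 0.7 V in the usage line are assumptions for illustration, not values from the lecture.

```python
def scaled_vdd(vdd, vt, ratio, tol=1e-9):
    """Solve f(v) / f(vdd) = ratio for v in (vt, vdd] by bisection,
    where f(v) = (v - vt)**2 / v and ratio = T_opt / T_orig <= 1."""
    def f(v):
        return (v - vt) ** 2 / v

    target = ratio * f(vdd)
    lo, hi = vt, vdd  # f increases monotonically on this interval
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# With Vt = 0 the exact equation reduces to the linear approximation.
v_linear = scaled_vdd(vdd=5.0, vt=0.0, ratio=0.75)
# Assumed Vt = 0.7 V; a 2x speedup traded for a lower supply voltage.
v_new = scaled_vdd(vdd=5.0, vt=0.7, ratio=0.5)
print(round(v_linear, 3), round(v_new, 3))
```

Because dynamic power scales with the square of the supply voltage, the new voltage can then be substituted into the dynamic power equation together with the new frequency.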
Performance Optimization Transformations on an Example Behavior [Chandrakasan95a]
Example Behavior of an IIR Filter
$$Y_N = X_N + A \cdot Y_{N-1}$$
[Figure: behavior data flow. Inputs X_N and A, one adder (+), one multiplier (*), one delay D, output Y_N]
Design Characteristics
• Vdd = 5V
• Critical path Length = 2
• Throughput = 2*N
• Capacitance = 1 unit
• Power = 25 units
Transformation (1): Loop Unrolling
We can unroll the recursive equation once and get the following
$$Y_{N-1} = X_{N-1} + A \cdot Y_{N-2}$$
$$Y_N = X_N + A \cdot Y_{N-1}$$
[Figure: behavior data flow. Inputs X_N, X_{N-1}, and A; adders, multipliers, delay 2D; outputs Y_N and Y_{N-1}]
Design Characteristics
• Vdd = 5V
• Critical path Length = 2
• Throughput = 2*N
• Capacitance = 1 unit
• Power = 25 units
No change in performance/power!
Transformation (2): Distributivity and Constant Propagation
We can apply the distributive law and constant propagation
$$Y_{N-1} = X_{N-1} + A \cdot Y_{N-2}$$
$$Y_N = X_N + A \cdot X_{N-1} + A^2 \cdot Y_{N-2}$$
[Figure: behavior data flow. Inputs X_N, X_{N-1}, A, and A^2; adders, multipliers, delay 2D; outputs Y_N and Y_{N-1}]
Design Characteristics
• Vdd = 5V
• Critical path Length = 3
• Throughput = 3*(N/2)
• Capacitance = 1.5 units
• Power = 25 units
Design Characteristics after Voltage Scaling
• Vdd = 3.75V (How?)
• Critical path Length = 3
• Throughput = 2*N
• Capacitance = 1.5 units
• Power = 20 units
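That the transformed recurrence computes the same outputs as the original can be checked numerically. The Python sketch below compares the direct form with the unrolled form after distributivity and constant propagation; it assumes an even number of samples and a zero initial state, and the function names are hypothetical.

```python
def iir_direct(x, a, y_init=0.0):
    """Y[n] = X[n] + A * Y[n-1], one sample per iteration."""
    y, prev = [], y_init
    for sample in x:
        prev = sample + a * prev
        y.append(prev)
    return y


def iir_unrolled(x, a, y_init=0.0):
    """Two samples per iteration; Y[n] depends on Y[n-2] and constant A*A."""
    a2 = a * a  # constant propagation: A^2 is computed once
    y, prev2 = [], y_init  # prev2 holds Y[n-2]
    for n in range(1, len(x), 2):
        y_nm1 = x[n - 1] + a * prev2
        y_n = x[n] + a * x[n - 1] + a2 * prev2
        y += [y_nm1, y_n]
        prev2 = y_n
    return y


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
direct = iir_direct(samples, a=0.5)
unrolled = iir_unrolled(samples, a=0.5)
print(direct)
print(unrolled)
```

In the unrolled form the two outputs of an iteration no longer depend on each other, which is what later allows them to be computed in parallel or pipelined.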
Transformation (3): Pipelining
Let us assume we will now process two samples in parallel at any given time
[Figure: timelines of operations op1 to op4 under non-pipelined and pipelined operation]
Transformation (3): Pipelining
Behavior Data Flow with Pipelining
■ Observe that the critical path length reduces to 2
[Figure: pipelined behavior data flow. Inputs X_N, X_{N-1}, A, and A^2; adders, multipliers, delays D and 2D; outputs Y_N and Y_{N-1}]
Design Characteristics
• Vdd = 2.9V (How?)
• Critical path Length = 2
• Throughput = 2*N
• Capacitance = 1.5 units
• Power = 12.5 units (2X reduction)
Transformation (3): Pipelining
[Figure omitted. Source: [Chandrakasan95a]]
Common Case Computation: A Power Optimization Technique [Lakshminarayana99]
Recall Amdahl's law!
Idea
■ Identify computations or sequences of computations in the behavior that occur most frequently
■ Design a separate circuit that implements the common-case behavior efficiently
[Figure: generic architecture. Original circuit alongside a common-case detection and execution circuit; activity of the energy-optimized circuit]
CCC: Example [Lakshminarayana99]
GCD Behavior
while (x != y) {
    if (x > y) {
        x := x - y;
    } else {
        y := y - x;
    }
}
[Figure: STG annotated with state and state transition probabilities from simulation profiles]
CCC: Example [Lakshminarayana99]
Identified common case behavior
if (x != y) {
    if (x > y) {
        x := x - y;
    }
}
if (x != y) {
    if (x > y) {
        x := x - y;
    }
}
if (x != y) {
    if (x > y) {
        x := x - y;
    }
}
if (x != y) {
    if (x > y) {
        x := x - y;
    }
}
Simplified common case behavior
Tempx := x - 4y;
if (Tempx > 0) {
    x := Tempx;
}
[Figure: common case detection and common case execution circuits with inputs x and y]
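The simplification is safe because x - 4y > 0 implies that all four unrolled iterations would have taken the x > y branch. A minimal Python sketch of the combined behavior follows; it assumes positive integer inputs and models the common-case circuit as a shortcut inside the original loop, which is a software analogy rather than the architecture from the paper.

```python
from math import gcd


def gcd_common_case(x, y):
    """Subtractive GCD of positive integers with a common-case shortcut."""
    while x != y:
        temp_x = x - 4 * y
        if temp_x > 0:
            x = temp_x  # common case: four subtractions in one step
        elif x > y:
            x = x - y   # original behavior
        else:
            y = y - x
    return x


print(gcd_common_case(1071, 462), gcd(1071, 462))
```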
CCC: Results
Performance improvement of more than 4X!
Can be traded-off for power savings
■ Average power consumption reduction: 59%
Average area overhead: 23%
Operation Reduction: Distributivity [Chandrakasan95a]
Reducing operations reduces the switched capacitance
2nd order polynomial example
$$X^2 + A \cdot X + B$$
can be rewritten as
$$X \cdot (X + A) + B$$
[Figure: data flow graphs of the original and the rewritten expression]
One fewer multiplication!
Same throughput
No change to the critical path
Operation Reduction: Distributivity [Chandrakasan95a]
Reducing operations reduces the switched capacitance
■ Can also increase the critical path (can mean higher voltage
to realize the same throughput)
3rd order polynomial example
$$X^3 + A \cdot X^2 + B \cdot X + C$$
can be rewritten as
$$X \cdot (X \cdot (X + A) + B) + C$$
[Figure: data flow graphs of the original and the rewritten expression]
■ Original: #Operations = 7, Critical path = 4
■ Rewritten: #Operations = 5, Critical path = 5
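The operation counts and critical path lengths can be verified on small expression trees. In this Python sketch an expression is a nested tuple (operator, left, right); shared subexpressions such as X*X are counted once, and every operation is assumed to have unit delay. The representation and helper names are hypothetical.

```python
def count_ops(expr):
    """Number of distinct operator nodes (shared subexpressions count once)."""
    seen = set()

    def visit(e):
        if isinstance(e, tuple) and e not in seen:
            seen.add(e)
            visit(e[1])
            visit(e[2])

    visit(expr)
    return len(seen)


def depth(expr):
    """Critical path length in operations, assuming unit delay per operation."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max(depth(expr[1]), depth(expr[2]))


def evaluate(expr, env):
    if not isinstance(expr, tuple):
        return env[expr]
    op, lhs, rhs = expr
    left, right = evaluate(lhs, env), evaluate(rhs, env)
    return left * right if op == "*" else left + right


x2 = ("*", "X", "X")
direct = ("+", ("+", ("*", x2, "X"), ("*", "A", x2)), ("+", ("*", "B", "X"), "C"))
horner = ("+", ("*", "X", ("+", ("*", "X", ("+", "X", "A")), "B")), "C")
env = {"X": 3, "A": 2, "B": 5, "C": 7}
print(count_ops(direct), depth(direct), count_ops(horner), depth(horner))
```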
Strength Reduction and Common Sub-Expression
Strength Reduction
■ Exploit dissimilarity in energy consumption between operations
■ E.g., conversion of multiplications with constants into shift-add operations
Common Sub-Expression
■ Identify common computations between two computational threads and re-use
to reduce the number of operations
Example: 4-tap FIR Filter [Mehendale95]
$$Y_n = \sum_{i=0}^{3} A_i \cdot X_{n-i}$$
[Figure: 4-tap FIR data flow. Inputs X_n, X_{n-1}, X_{n-2}, X_{n-3} multiplied by A0, A1, A2, A3 and summed to give Y_n]
Coefficients (2's complement fixed-point arithmetic)
Coefficient   Value
A0            (0.0111011)2
A1            (0.0101011)2
A2            (1.0110011)2
A3            (1.1001010)2
Strength Reduction and Common Sub-Expression
Step 1. Apply Strength Reduction
■ Replace each multiplication by equivalent shifts and adds derived from the binary
representation of the coefficient
$$Y_n = \sum_{i=0}^{3} A_i \cdot X_{n-i} = A_0 \cdot X_3 + A_1 \cdot X_2 + A_2 \cdot X_1 + A_3 \cdot X_0$$
$A_0 = (0.0111011)_2$
$$A_0 \cdot X_3 = 2^{-7} \cdot (X_3 + (X_3 \ll 1) + (X_3 \ll 3) + (X_3 \ll 4) + (X_3 \ll 5))$$
$A_2 = (1.0110011)_2$
$$A_2 \cdot X_1 = 2^{-7} \cdot (X_1 + (X_1 \ll 1) + (X_1 \ll 4) + (X_1 \ll 5) - (X_1 \ll 7))$$
#Adds   #Subs   #Shifts
15      2       15
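The operation counts for Step 1 follow directly from the bit patterns of the four coefficients. The Python sketch below derives the shift-add terms from the binary strings and counts the operations, assuming one sign bit followed by seven fractional bits and that all terms are summed into a single output; the helper names are hypothetical.

```python
COEFFS = {
    "A0": "0.0111011",
    "A1": "0.0101011",
    "A2": "1.0110011",
    "A3": "1.1001010",
}


def shift_add_terms(bits):
    """Turn a 2's complement string 's.fff...' into (sign, shift) terms.

    The terms represent coefficient * 2**width as a sum of +/-(x << shift),
    where width is the number of fractional bits.
    """
    sign_bit, frac = bits.split(".")
    width = len(frac)
    terms = [(+1, width - 1 - i) for i, b in enumerate(frac) if b == "1"]
    if sign_bit == "1":
        terms.append((-1, width))  # the sign bit carries negative weight
    return terms


def multiply(x, terms):
    """Compute (coefficient * 2**width) * x using only shifts and adds."""
    return sum(sign * (x << shift) for sign, shift in terms)


all_terms = [t for bits in COEFFS.values() for t in shift_add_terms(bits)]
subs = sum(1 for sign, _ in all_terms if sign < 0)
adds = len(all_terms) - 1 - subs  # one operation joins each pair of terms
shifts = sum(1 for _, shift in all_terms if shift > 0)
print(adds, subs, shifts)
```

The integer results of `multiply` still need the final scaling by 2^-7 to give the fixed-point product.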
Strength Reduction and Common Sub-Expression
Step 2. Identify common sub-expressions across coefficients
■ Look for pairs of coefficients that both have a 1 in more than one common bit location
$$A_0 \cdot X_3 = 2^{-7} \cdot (X_3 + (X_3 \ll 1) + (X_3 \ll 3) + (X_3 \ll 4) + (X_3 \ll 5))$$
$$A_2 \cdot X_1 = 2^{-7} \cdot (X_1 + (X_1 \ll 1) + (X_1 \ll 4) + (X_1 \ll 5) - (X_1 \ll 7))$$
■ Compute (X1 + X3) = X13 separately
■ Similarly, compute (X0 + X2) = X02 separately
#Adds   #Subs   #Shifts
11      2       10
■ Similarly, compute (X13 + (X13 << 1)) = X13_01 separately
#Adds   #Subs   #Shifts
10      2       9
Power Supply Gating
Basic Concept:
■ Switches placed on-chip to turn off the power supply when the circuit (or parts of it) is idle.
Benefits:
■ Leakage power reduction
Challenges
■ IR drop leads to timing closure issues
■ Simultaneous switching of gating cells
Two styles of power gating
■ Fine-grained power gating
● Power gating logic is part of the library cells
■ Coarse-grained power gating
● Power gating cells are part of the power grid network
[Figure courtesy: [Cadence-PowerMgmtDesignLine06]]
Power Supply Gating: An Example [OMAP-ISSCC05]
[Figures: 90nm OMAP2420 SoC; power switch used in OMAP]
5 power domains in the OMAP SoC are enabled by power gating
Power switches gate VDD and consist of
■ Weak PMOS: Sinks low current for power restore
■ Strong PMOS: Delivers current for normal operation
2-pass power turn-on mechanism to prevent current surges
■ Weak switches are turned on first to almost fully restore VDD(local), and then the strong
switches are turned on to support normal operation
Power Supply Gating: An Example [OMAP-ISSCC05]
Leakage currents compared between
■ All power domains ON
■ WkUp domain only ON
Nearly 40X reduction seen at room temperature
Isolation Cells
Special cells used at the interfaces between blocks which are
shut-down and blocks which are on.
They prevent the outputs of shut-down modules from floating
Types of Isolation Cells
■ Sets the output to a known value (0 or 1)
■ Sets the output to the last valid value
Cells and their enables need to be always ON.
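The two types of isolation cell can be captured in a small behavioral model. This Python sketch is an illustration only: the class name, the mode names (`clamp0`, `clamp1`, `hold`), and the use of `None` for a floating input are assumptions, and the model ignores timing and electrical behavior.

```python
class IsolationCell:
    """Behavioral model of an isolation cell; mode is 'clamp0', 'clamp1' or 'hold'."""

    def __init__(self, mode):
        self.mode = mode
        self.last_valid = 0  # assumed initial value for the 'hold' type

    def output(self, data, isolate):
        if not isolate:
            self.last_valid = data  # source block is powered: pass data through
            return data
        if self.mode == "clamp0":
            return 0
        if self.mode == "clamp1":
            return 1
        return self.last_valid  # 'hold': keep the last valid value


cell = IsolationCell("hold")
before = cell.output(1, isolate=False)    # source block on
during = cell.output(None, isolate=True)  # source block off, input floating
print(before, during)
```

The enable (`isolate`) must be asserted before the source block is powered down, which is why the cell and its enable stay in an always-on domain.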
Data Retention
Things to do before we power down
■ Save state of the module(s) being shut down
Options [Zyuban02]
■ For processors, OS can save relevant state to local memory
and read back
● Save/restore overheads (time, energy consumption)
■ Use scan to save complete state
■ Keep all latches on a separate power supply and just power
down logic
■ Provide each latch with a shadow latch called retention latch
(each retention latch is on a separate power supply)
Data Retention
[Figures: integrated scan retention; save and restore operations. Courtesy: [Zyuban02]]
References
Survey Papers
■ [Devadas95] S. Devadas, S. Malik: A Survey of Optimization Techniques Targeting Low Power VLSI Circuits. DAC 1995: 242-247
■ [Macii98] E. Macii, M. Pedram, F. Somenzi: High-level power modeling, estimation, and optimization. IEEE Trans. on CAD of Integrated Circuits and Systems 17(11): 1061-1079 (1998)
■ [Chandrakasan95a] A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. Brodersen, "Optimizing power using transformations," IEEE Trans. Computer-Aided Design, vol. 14, pp. 12-31, Jan. 1995.
RTL Power Management
■ [Ohnishi97] M. Ohnishi, A. Yamada, H. Noda, and T. Kambe, "A Method of Redundant Clocking Detection and Power Reduction at the RTL level," in Proc. Int. Symp. Low Power Electronics & Design (ISLPED), pp. 131-136, Aug. 1997.
■ [Tiwari98] V. Tiwari, S. Malik, P. Ashar: Guarded evaluation: pushing power management to logic synthesis/design. IEEE Trans. on CAD of Integrated Circuits and Systems (TCAD) 17(10): 1051-1060 (1998)
Behavioral Power Optimization
■ [Mehendale95] M. Mehendale, S. D. Sherlekar, G. Venkatesh, "Synthesis of multiplier-less FIR filters with minimum number of additions". ICCAD 1995: 668-671
■ [Lakshminarayana99] G. Lakshminarayana, A. Raghunathan, K. S. Khouri, N. K. Jha, S. Dey: Common-Case Computation: A High-Level Technique for Power and Performance Optimization. DAC 1999: 56-61
Power Supply Gating
■ [Cadence-PowerMgmtDesignLine06] Anand Iyer, "Demystify power gating and stop leakage cold", Power Management DesignLine, 03/03/06
■ [Zyuban02] V. Zyuban, S. V. Kosonocky: Low power integrated scan-retention mechanism. ISLPED 2002: 98-102
■ [OMAP-ISSCC05] P. Royannez, H. Mair, F. Dahan, M. Wagner et al.; "90nm Low Leakage SoC Design Techniques for Wireless Applications"; ISSCC'05, Feb 2005