Transcript Defense
A Hardware-Software Processor Architecture Using Pipeline Stalls for Leakage Power Management
Khushboo Sheth
Thesis Committee:
Dr. Vishwani Agrawal, Advisor
Dr. Victor Nelson
Dr. Adit Singh
Master’s Thesis Defense
December 3, 2008
Dec 3, 2008
Sheth: MS Thesis
1
Outline
Motivation
Background
NOP-cycle method for energy saving
Comparison of reference method with NOP-cycle method
Architecture Modification
Power Management Techniques
Sleep mode operation
Drowsy mode operation
Conclusion
Power components in CMOS circuit
• Dynamic power
• Short circuit power
• Leakage power
[Figure: CMOS circuit model with labels VDD, Ground, input vi(t), output vo(t), Ron, R = large, and load capacitance CL]
Motivation
Technology scaling
• Gate size
• Reduction in threshold voltage
• Per-transistor dynamic power decreases
• Per-transistor leakage power increases
• Number of transistors increases
• Leakage power density
• Contribution of leakage increases
Processor Power Trend
Processor power increases every generation
Objective of This Work
Explore power management for a processor at the architecture level.
Reduce power and minimize leakage energy.
Propose and evaluate a new hardware-software technique for power management.
Background
A simple technique to reduce power is to slow down the clock:
• Dynamic power is reduced in proportion to the clock rate.
• Leakage power remains unchanged.
A computing task takes longer in the power saving mode:
• Consumes the same dynamic energy
• Consumes more leakage energy
We use this as a reference method.
Clock-Slowdown (Reference) Method
Normal operation:
• Rated clock frequency, f
• Dynamic power, Pd
• Static power, Ps
• Total power, Pd + Ps
• Energy consumed by an N-cycle task = (Pd + Ps) N/f
Power saving mode:
• Clock frequency, f/n
• Dynamic power, Pd/n
• Static power, Ps
• Total power, P(n) = Pd/n + Ps
• Energy consumed by an N-cycle task, E(N,n) = (Pd + nPs) N/f
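The expressions for the reference method can be checked numerically. The following is a minimal sketch, not part of the thesis; the function name and the operating point (Pd = 2 W, Ps = 1 W, f = 1 GHz, N = 10^9 cycles) are hypothetical values chosen only for illustration.

```python
def clock_slowdown(pd, ps, f, n, cycles):
    """Return (total power, task time, task energy) when the clock runs at f/n."""
    power = pd / n + ps        # P(n) = Pd/n + Ps
    time = cycles * n / f      # an N-cycle task takes N*n/f seconds
    energy = power * time      # equals (Pd + n*Ps) * N / f
    return power, time, energy


# Hypothetical operating point, chosen only for illustration.
for n in (1, 2, 4):
    power, time, energy = clock_slowdown(2.0, 1.0, 1e9, n, 1e9)
    print(f"n={n}: P={power:.2f} W, t={time:.1f} s, E={energy:.1f} J")
```

For these assumed values the power drops from 3 W to 1.5 W as n goes from 1 to 4, while the task energy rises from 3 J to 6 J, because leakage is paid for over a longer run time.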
Power Saving Ratio
P-ratio = P(1)/P(n)
        = n(Pd + Ps)/(Pd + nPs)
        = n(k+1)/(k+n), where k = Pd/Ps
Low leakage technology, k >> 1:
  P-ratio ≈ n
High leakage technology, k ≤ 2:
  P-ratio = 3n/(n+2)   for k = 2
          = 2n/(n+1)   for k = 1
          = 3n/(2n+1)  for k = 0.5
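As a quick consistency check, the general expression n(k+1)/(k+n) can be compared against the three special cases listed above. This is an illustrative sketch; the function name is an assumption and does not come from the thesis.

```python
def p_ratio_clock(k, n):
    """Power saving ratio P(1)/P(n) = n(k+1)/(k+n), with k = Pd/Ps."""
    return n * (k + 1) / (k + n)


# Closed forms quoted on the slide for the high-leakage cases.
special_cases = {
    2.0: lambda n: 3 * n / (n + 2),
    1.0: lambda n: 2 * n / (n + 1),
    0.5: lambda n: 3 * n / (2 * n + 1),
}

for k, closed_form in special_cases.items():
    for n in range(1, 6):
        assert abs(p_ratio_clock(k, n) - closed_form(n)) < 1e-12

# For very large k the ratio approaches n (low-leakage limit).
print(p_ratio_clock(1e6, 4))
```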
Power Saving Ratio, P-ratio
[Plot: P-ratio versus clock slowdown factor, n; curves for k = 0.5, k = 1, k = 2, and low leakage (k >> 1)]
Energy Saving Ratio
E-ratio = E(N,1)/E(N,n)
        = (Pd + Ps)/(Pd + nPs) = (P-ratio)/n
        = (k+1)/(k+n), where k = Pd/Ps
Low leakage technology, k >> 1:
  E-ratio ≈ 1
High leakage technology, k ≤ 2:
  E-ratio = 3/(n+2)   for k = 2
          = 2/(n+1)   for k = 1
          = 3/(2n+1)  for k = 0.5
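The energy penalty of clock slowdown is easier to see as 1/E-ratio, the factor by which task energy grows. A small sketch that tabulates it from (k+1)/(k+n) follows; the function name and the value k = 100 used to stand in for "k >> 1" are assumptions.

```python
def e_ratio_clock(k, n):
    """Energy saving ratio E(N,1)/E(N,n) = (k+1)/(k+n), with k = Pd/Ps."""
    return (k + 1) / (k + n)


# k = 100 is an assumed stand-in for the low-leakage case (k >> 1).
for k in (0.5, 1.0, 2.0, 100.0):
    increase = [round(1 / e_ratio_clock(k, n), 2) for n in range(1, 6)]
    print(f"k={k}: 1/E-ratio for n=1..5 -> {increase}")
```

The ratio never exceeds 1 for n ≥ 1, so under this model slowing the clock never saves energy; it only trades power for run time.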
Energy Saving Ratio, E-ratio
[Plot: 1/E-ratio (energy increases upward) versus clock slowdown factor, n; curves for k = 0.5, k = 1, k = 2, and low leakage (k >> 1, no energy increase)]
Instruction Slowdown: New Energy Saving Method
Maintain rated clock frequency (f).
Instruction slowdown factor, m, where m ≥ 0; power management hardware inserts m NOPs per instruction.
Provide hardware sleep modes to reduce NOP power.
Power control signals generated by control logic:
• ALU powered down
• Register file clocks gated
• Memory sleep mode
• Pipeline register clocks gated
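The insertion of m NOPs per instruction can be pictured with a short software model. This is only a sketch: in the proposed design the insertion is done by power management hardware, and the function names, the placement of NOPs after each instruction, and the single combined sleep signal are all assumptions made for illustration.

```python
def insert_nops(instructions, m):
    """Return the instruction stream with m NOPs placed after each instruction."""
    if m < 0:
        raise ValueError("m must be non-negative")
    stream = []
    for instr in instructions:
        stream.append(instr)
        stream.extend(["nop"] * m)
    return stream


def sleep_signal(stream):
    """Hypothetical control signal: 1 during NOP cycles, 0 during instruction cycles."""
    return [1 if instr == "nop" else 0 for instr in stream]


stream = insert_nops(["add", "lw"], 2)
print(stream)
print(sleep_signal(stream))
```

With m = 2 the stream is three times as long, and the sleep signal is asserted in two of every three cycles.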
Power Consumed With NOPs
P    = power consumed by instruction cycles
P/f  = energy consumed per instruction cycle
βP/f = energy consumed per NOP cycle
β    = reduction factor (0 ≤ β ≤ 1) due to power down/sleep modes
In 1 second (f cycles):
• f/(m+1) instruction cycles, energy = P/(m+1)
• mf/(m+1) NOP cycles, energy = mβP/(m+1)
Power = P(1 + mβ)/(m + 1)
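The average power can be reproduced by counting cycles over one second, as the slide does. The sketch below is illustrative; the function name and the sample values (P = 3 W, f = 1 GHz, m = 2, β = 0.5) are assumptions.

```python
def average_power(p, f, m, beta):
    """Energy spent in one second (f cycles), which equals the average power in watts."""
    instr_cycles = f / (m + 1)        # cycles that execute real instructions
    nop_cycles = m * f / (m + 1)      # cycles that execute inserted NOPs
    energy_per_instr = p / f          # joules per instruction cycle
    energy_per_nop = beta * p / f     # joules per NOP cycle (reduced by sleep modes)
    return instr_cycles * energy_per_instr + nop_cycles * energy_per_nop


# Assumed sample values, compared with the closed form P(1 + m*beta)/(m + 1).
p, f, m, beta = 3.0, 1e9, 2, 0.5
print(average_power(p, f, m, beta), p * (1 + m * beta) / (m + 1))
```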
NOP-Cycles Method
Normal operation:
• Rated clock frequency, f, m = 0
• Dynamic power, Pd
• Static power, Ps
• Total power, Pd + Ps
• Energy consumed by an N-cycle task = (Pd + Ps) N/f
Power saving mode:
• Clock frequency, f
• Dynamic power, Pd (1 + mβ)/(m + 1)
• Static power, Ps (1 + mβ)/(m + 1)
• Total power, P(m) = (Pd + Ps)(1 + mβ)/(m + 1)
• Energy consumed by an N-cycle task,
  E(N,m) = (Pd + Ps) [(1 + mβ)/(m + 1)] N(m + 1)/f = (Pd + Ps)(1 + mβ) N/f
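A numeric sketch of the NOP-cycles method at task level follows. The function name and the operating point (Pd = 2 W, Ps = 1 W, f = 1 GHz, N = 10^9 cycles, β = 0.1) are hypothetical and serve only to exercise the formulas.

```python
def nop_cycles_method(pd, ps, f, m, beta, cycles):
    """Return (total power, task time, task energy) with m NOPs per instruction."""
    scale = (1 + m * beta) / (m + 1)   # common scaling of dynamic and static power
    power = (pd + ps) * scale          # P(m)
    time = cycles * (m + 1) / f        # the task now occupies N(m+1) clock cycles
    energy = power * time              # equals (Pd + Ps)(1 + m*beta) N / f
    return power, time, energy


# Hypothetical operating point, chosen only for illustration.
for m in (0, 1, 3):
    power, time, energy = nop_cycles_method(2.0, 1.0, 1e9, m, 0.1, 1e9)
    print(f"m={m}: P={power:.2f} W, t={time:.1f} s, E={energy:.2f} J")
```

In the ideal case β = 0 the task energy stays at its normal-operation value for any m, since the NOP cycles then cost nothing.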
Power and Energy Saving Ratio
P-ratio = P(0)/P(m) = (m + 1)/(1 + mβ)
E-ratio = E(N,0)/E(N,m) = 1/(1 + mβ)
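The two ratios can be tabulated for a few values of β to show the limiting behaviour. This is an illustrative sketch with assumed function names; the β values are the ones that label the curves in the plots.

```python
def p_ratio_nop(m, beta):
    """Power saving ratio P(0)/P(m) = (m + 1)/(1 + m*beta)."""
    return (m + 1) / (1 + m * beta)


def e_ratio_nop(m, beta):
    """Energy saving ratio E(N,0)/E(N,m) = 1/(1 + m*beta)."""
    return 1 / (1 + m * beta)


# Each entry is "P-ratio/E-ratio" for m = 0..4.
for beta in (0.0, 0.1, 0.33, 0.5, 1.0):
    row = [f"{p_ratio_nop(m, beta):.2f}/{e_ratio_nop(m, beta):.2f}" for m in range(5)]
    print(f"beta={beta}: " + "  ".join(row))
```

At β = 0 the power ratio grows as m + 1 with no energy penalty; at β = 1 (no sleep mode) the power ratio stays at 1 and the energy grows by a factor of m + 1.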
Power Saving Ratio, P-ratio
[Plot: P-ratio (power decreases upward) versus instruction slowdown factor, m; curves for β = 0 (ideal case), β = 0.1, β = 0.33, β = 0.5, and β = 1]
Energy Saving Ratio, E-ratio
[Plot: 1/E-ratio (energy increases upward) versus instruction slowdown factor, m; curves for β = 0, β = 0.1, β = 0.33, β = 0.5, and β = 1]
Comparing Two Cases
Energy(clock slowdown)/Energy(instruction slowdown) = (k + m + 1)/[(k + 1)(1 + mβ)]
where n = m + 1 and k = Pd/Ps
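A break-even condition follows directly from this ratio, although it is not stated on the slide: for m > 0, the ratio exceeds 1 exactly when k + m + 1 > (k + 1)(1 + mβ), which simplifies to β < 1/(k + 1). The sketch below evaluates the ratio and that threshold; the function names are assumptions.

```python
def energy_advantage(k, m, beta):
    """Energy(clock slowdown) / Energy(instruction slowdown), with n = m + 1."""
    return (k + m + 1) / ((k + 1) * (1 + m * beta))


def break_even_beta(k):
    """Largest beta at which instruction slowdown is no worse than clock slowdown."""
    return 1 / (k + 1)


for k in (0.5, 1.0, 2.0):
    print(f"k={k}: break-even beta = {break_even_beta(k):.3f}, "
          f"ratio at m=3, beta=0.1 -> {energy_advantage(k, 3, 0.1):.2f}")
```

For k = 1 the threshold is β = 0.5, so under this model instruction slowdown wins only when a NOP cycle costs less than half of an instruction cycle.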
Clock Slowdown Vs. Instruction Slowdown, β = 1 (No Sleep Mode)
[Plot: energy ratio (advantage increases upward) versus slowdown factor, m or n-1; curves for k = 0.5, k = 1, k = 2, and k >> 1]
Clock Slowdown Vs. Instruction Slowdown, β = 0.5 (Sleep Mode)
[Plot: energy ratio (advantage increases upward) versus slowdown factor, m or n-1; curves for k = 0.5, k = 1, k = 2, and k >> 1]
Clock Slowdown Vs. Instruction Slowdown, β = 0.1 (Sleep Mode)
[Plot: energy ratio (advantage increases upward) versus slowdown factor, m or n-1; curves for k = 0.5, k = 1, k = 2, and k >> 1]
32 Bit MIPS pipeline processor
Modified Architecture
Slow down signal
ALU, Data memory and Register File put to sleep mode
Power Management Techniques
Clock Gating:
• Clock signal halted in idle devices
• Switching activity reduced
• Leakage power unaffected
• A glitch can cause a temporary false clock turn-off or turn-on
Enabled Flip-Flops:
• Registers replaced by flip-flops with an enable signal
• When disabled, outputs do not change
• Reduces switching activity, but the clock remains active, which consumes a lot of power
• Less effective
Sleep Mode Operation
Activity of the entire system is monitored rather than that
of the individual modules.
If the system has been idle for some predetermined
time-out duration, then the entire system is shut down
and enters what is known as sleep mode.
System inputs are monitored for activity, which will then
trigger the system to wake up and resume processing.
Overhead in time and power associated with entering
and leaving sleep mode.
Trade-offs to be made in setting the length of the desired
time-out period.
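The time-out trade-off can be explored with a small trace-driven model. This is a sketch under stated assumptions, not the mechanism used in the thesis: the state names, the per-state power values, and the rule that a wake-up costs a fixed number of extra cycles are all hypothetical.

```python
# Hypothetical relative power per cycle in each state.
STATE_POWER = {"active": 1.0, "idle": 0.6, "sleep": 0.05, "wake": 1.2}


def simulate_timeout_sleep(activity, timeout, wake_cycles):
    """Map a per-cycle activity trace (True = work present) to system states."""
    states = []
    idle_run = 0
    asleep = False
    for busy in activity:
        if busy:
            if asleep:
                states.extend(["wake"] * wake_cycles)  # wake-up overhead
                asleep = False
            states.append("active")
            idle_run = 0
        elif asleep:
            states.append("sleep")
        else:
            idle_run += 1
            if idle_run > timeout:   # idle longer than the time-out: shut down
                asleep = True
                states.append("sleep")
            else:
                states.append("idle")
    return states


def trace_energy(states):
    """Sum the assumed per-cycle power over the trace (relative energy units)."""
    return sum(STATE_POWER[s] for s in states)


trace = [True, False, False, False, False, True]
states = simulate_timeout_sleep(trace, timeout=2, wake_cycles=1)
print(states, trace_energy(states))
```

A short time-out enters sleep sooner but pays the wake-up overhead more often; a long one wastes idle power. Sweeping `timeout` over a representative trace is one way to examine that balance.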
Implementing Sleep Mode
Power-gating technique
Suitably sized header or footer transistor for a circuit block
Sleep signal applied to the gate of the header or footer transistor to turn off the supply voltage of the circuit block
When the circuit block is requested for use, the sleep signal is de-asserted to restore the voltage at the virtual Vdd.
Drowsy mode for memories
Drowsy mode provides a better solution for retaining the information stored in the memory cells when they are switched to low-power mode
High-threshold (high-Vt) transistor used to separate virtual Vdd from the Vdd supply line
Supplies a very low voltage to the cell when it is switched to low-power mode
High-Vt device drastically reduces the leakage of the circuit because of the exponential dependence of leakage on Vt
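The exponential dependence on Vt can be illustrated with a first-order subthreshold model in which leakage falls by one decade for every S volts of added threshold voltage. The slides state only that the dependence is exponential; the model form, the function name, and the value S = 0.1 V per decade are assumptions used for illustration.

```python
def relative_leakage(vt_low, vt_high, swing=0.1):
    """Leakage at vt_high relative to vt_low, assuming I_leak ~ 10 ** (-Vt / swing).

    swing is the assumed subthreshold swing in volts per decade.
    """
    return 10 ** (-(vt_high - vt_low) / swing)


# Assumed example: raising Vt from 0.2 V to 0.4 V with a 0.1 V/decade swing.
print(relative_leakage(0.2, 0.4))
```

Under these assumptions a 0.2 V increase in Vt cuts subthreshold leakage by about two orders of magnitude, which is why a single high-Vt device in the supply path is so effective.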
Conclusion
For higher-leakage technologies, the hardware-software technique inserts pipeline stalls in the processor while maintaining its clock rate. The hardware units are designed to save leakage power while processing NOP instructions by putting the idle blocks into sleep mode.
This technique is more effective when a NOP cycle consumes less than 50% of the power of a regular instruction cycle.
Future work includes considering the power of the active cycles and applying voltage reduction when reducing the clock frequency, if the performance penalty can be met.
Thank You !!