Advanced VLSI Design - Washington State University
EE 587
SoC Design & Test
Partha Pande
School of EECS
Washington State University
Power & Low Power Design
Recent Trends
What about power in the future?
Power Projections Too High!
[Figure: Power (Watts), log scale from 0.1 to 10,000, versus year (’71 to ’08) for Intel processors 4004, 8008, 8080, 8085, 8086, 286, 386, 486, and Pentium® processors, with Hot Plate, Nuclear Reactor, Rocket Nozzle, and Sun’s Surface marked as reference levels. Source: Intel]
Power in CMOS Gate
• Capacitive Voltage Transitions
• Follow board notes
• Ref HJS Chapter 5 (5.8)
Logic Function
• Transition probabilities depend on the style of the logic gates
• Follow board notes
Logic Style
• Static and dynamic logic styles
• Power is consumed during the precharge operation for dynamic logic
• In the dynamic case, activity depends only on the signal probability
• The transition probability is higher in dynamic circuits
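To make the static versus dynamic comparison concrete, the sketch below computes the 0-to-1 output activity of a gate from its truth table. It relies on the usual textbook model, which is an assumption here and not something stated on the slide: inputs are independent and each is 1 with probability 0.5, a static CMOS gate makes a power-consuming transition with probability p0·p1, and a precharged n-type dynamic gate must recharge its output after every cycle in which it was discharged, so its activity is simply p0. All function names are hypothetical.

```python
from itertools import product

def output_one_probability(gate, n_inputs, p_in=0.5):
    """Probability that the gate output is 1, for independent inputs."""
    p1 = 0.0
    for bits in product((0, 1), repeat=n_inputs):
        weight = 1.0
        for b in bits:
            weight *= p_in if b else (1.0 - p_in)
        if gate(*bits):
            p1 += weight
    return p1

def static_activity(gate, n_inputs, p_in=0.5):
    """Static CMOS: output was 0 in one cycle and is 1 in the next (p0 * p1)."""
    p1 = output_one_probability(gate, n_inputs, p_in)
    return (1.0 - p1) * p1

def dynamic_activity(gate, n_inputs, p_in=0.5):
    """Precharged dynamic gate: recharge whenever the output was discharged (p0)."""
    return 1.0 - output_one_probability(gate, n_inputs, p_in)

def nor2(a, b):
    return int(not (a or b))

print(static_activity(nor2, 2))   # 0.1875
print(dynamic_activity(nor2, 2))  # 0.75
```

For the 2-input NOR, the dynamic gate's activity (3/4) is four times the static gate's (3/16) under these assumptions, which illustrates the last bullet.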
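Circuit Topology
[Figure: two implementations of F from inputs A, B, C, D with intermediate nodes O1 and O2. Chain: A and B drive O1, O1 and C drive O2, O2 and D drive F. Tree: A and B drive O1, C and D drive O2, O1 and O2 drive F.]
• The chain implementation might have glitches
• The tree implementation is balanced and glitch free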
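The glitching difference between the two topologies can be reproduced with a small unit-delay simulation. This is only a sketch under stated assumptions: every gate has the same delay of one time step, all four inputs switch at the same instant, and the gates are XORs (the slide does not name the gate type; XOR is chosen because its output reacts to every input change). The function names are hypothetical.

```python
from itertools import product
from operator import xor

def output_waveform(topology, old, new, gate=xor, steps=4):
    """Unit-delay waveform of F after the inputs switch from old to new."""
    a, b, c, d = old
    # Settle the circuit on the old input vector first.
    if topology == "chain":
        o1 = gate(a, b)
        o2 = gate(o1, c)
        f = gate(o2, d)
    else:  # "tree"
        o1 = gate(a, b)
        o2 = gate(c, d)
        f = gate(o1, o2)
    a, b, c, d = new
    wave = [f]
    for _ in range(steps):
        # The right-hand sides read the previous step's values: one gate delay.
        if topology == "chain":
            o1, o2, f = gate(a, b), gate(o1, c), gate(o2, d)
        else:
            o1, o2, f = gate(a, b), gate(c, d), gate(o1, o2)
        wave.append(f)
    return wave

def glitch_count(wave):
    """Transitions on F beyond the single one needed to reach the final value."""
    transitions = sum(x != y for x, y in zip(wave, wave[1:]))
    needed = int(wave[0] != wave[-1])
    return transitions - needed

def total_glitches(topology):
    vectors = list(product((0, 1), repeat=4))
    return sum(glitch_count(output_waveform(topology, old, new))
               for old in vectors for new in vectors)

print(output_waveform("chain", (0, 0, 0, 0), (1, 1, 1, 1)))  # [0, 1, 0, 0, 0]
print(total_glitches("tree"))  # 0
```

In the chain, D reaches the last gate before the new value of O2 does, so F briefly takes a wrong value. In the tree, O1 and O2 settle at the same time, so F changes at most once.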
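Short-Circuit Power
• Finite rise and fall times of the input waveforms result in a direct current path between Vdd and GND
• Refer to board notes
[Figure: input waveform Vin with the levels Vtn and Vdd-Vtp marked; the direct path conducts while Vin lies between Vtn and Vdd-Vtp, during the intervals Tscr and Tscf.]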
Static Power
• Subthreshold leakage
• pn junction leakage
• DC current in the output low state
Subthreshold Leakage
$$I_{sub} = I_s \, e^{\frac{q (V_{GS} - V_T - V_{offset})}{n k T}} \left(1 - e^{-\frac{q V_{DS}}{k T}}\right)$$
• As Vt is reduced the subthreshold current increases
• With decreasing temperature the subthreshold current decreases
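Both trends can be checked numerically by evaluating the subthreshold expression directly. The sketch below is illustrative only: the values of I_s, n, and V_offset are hypothetical defaults, and I_s is held constant over temperature as a simplification.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C

def subthreshold_current(v_gs, v_ds, v_t, temp_k,
                         i_s=1e-6, n=1.5, v_offset=0.0):
    """I_sub = I_s * exp(q(VGS - VT - Voffset)/(nkT)) * (1 - exp(-q*VDS/(kT))).

    i_s, n and v_offset are hypothetical illustration values.
    """
    thermal_voltage = K_BOLTZMANN * temp_k / Q_ELECTRON  # kT/q in volts
    gate_term = math.exp((v_gs - v_t - v_offset) / (n * thermal_voltage))
    drain_term = 1.0 - math.exp(-v_ds / thermal_voltage)
    return i_s * gate_term * drain_term

# Off-state leakage (VGS = 0, VDS = 1 V) for two thresholds and two temperatures.
for v_t in (0.4, 0.3):
    for temp_k in (300.0, 350.0):
        i_off = subthreshold_current(0.0, 1.0, v_t, temp_k)
        print(f"VT={v_t} V, T={temp_k} K: {i_off:.3e} A")
```

Lowering V_T makes the exponent less negative, and raising T increases kT/q in the denominator, so both changes increase the off-state current in this model.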
P-N junction leakage
• Reverse-bias of source and drain junctions
• Bottom of the junctions and the channel-facing sidewalls
$$I_{pn} = I_o \left(e^{\frac{q}{kT} V_{SB}} - 1\right)$$
• One final form of static power arises in pseudo-NMOS gates when the output is low
Power dissipation in Clocks and I/O
• Significant power dissipation can occur in clocks in high-performance designs
• Clock switches on every cycle so $P = C V^2 f$ (i.e., α=1)
• Must compute all gate and wire capacitance associated with the clock and then add them all up
• Include input capacitance of flops or gated-clock inputs
• Total clock capacitance can be in the ~nF range
• For I/O drivers, there is a typical power dissipation provided for inputs and outputs
• Need to assume something about switching probability of inputs such as clock, reset, address, data, control
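The clock-power bookkeeping described above amounts to summing every capacitance on the clock net and applying P = CV²f with α = 1. A minimal sketch follows; the capacitance breakdown, supply voltage, and frequency are hypothetical numbers chosen only to land in the nF range mentioned on the slide.

```python
def clock_power(capacitances_f, vdd, freq_hz, alpha=1.0):
    """P = alpha * C_total * Vdd^2 * f, with alpha = 1 for a free-running clock."""
    c_total = sum(capacitances_f)
    return alpha * c_total * vdd ** 2 * freq_hz

# Hypothetical breakdown of the clock network capacitance, in farads.
clock_caps = {
    "wire": 0.4e-9,
    "buffers": 0.2e-9,
    "flop_clock_pins": 0.4e-9,
}

power_w = clock_power(clock_caps.values(), vdd=1.0, freq_hz=1e9)
print(f"{power_w:.2f} W")  # 1.00 W
```

With 1 nF of total clock capacitance, a 1 V supply, and a 1 GHz clock, the clock network alone dissipates about 1 W in this example.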
Memory and Logic Power Equations
• Total chip power due to memory and logic is
$$P_{Total} = P_{Logic} + P_{Memory}$$
• $P_{Logic}$ consists of both dynamic and static terms
$$P_{Logic} = P_{Dynamic} + P_{Standby}$$
• For dynamic switching, we must determine the activity factors associated with the logic blocks using simulation or probabilistic methods
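As a sketch of the simulation route, the activity factor of a node can be taken as the fraction of cycles in which it makes a 0-to-1 transition and then fed into the power sums above. The trace, capacitance, supply, frequency, standby, and memory figures below are all hypothetical, and the dynamic term uses the P = αCV²f form that appears elsewhere in these slides.

```python
def activity_factor(output_trace):
    """Fraction of cycles in which the node makes a 0 -> 1 transition."""
    rising = sum(1 for prev, cur in zip(output_trace, output_trace[1:])
                 if prev == 0 and cur == 1)
    return rising / (len(output_trace) - 1)

def logic_power(alpha, c_switched_f, vdd, freq_hz, p_standby_w):
    """P_Logic = P_Dynamic + P_Standby, with P_Dynamic = alpha * C * Vdd^2 * f."""
    p_dynamic = alpha * c_switched_f * vdd ** 2 * freq_hz
    return p_dynamic + p_standby_w

def total_power(p_logic_w, p_memory_w):
    """P_Total = P_Logic + P_Memory."""
    return p_logic_w + p_memory_w

trace = [0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical simulated node values
alpha = activity_factor(trace)
print(alpha)  # 0.375

p_logic = logic_power(alpha, c_switched_f=2e-9, vdd=1.0, freq_hz=1e9,
                      p_standby_w=0.05)
p_total = total_power(p_logic, p_memory_w=0.2)
print(f"{p_logic:.2f} W logic, {p_total:.2f} W total")
```

A real flow would extract one activity factor per logic block from a much longer simulation; the short trace here only shows the counting rule.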
Intel Pentium-II Power Distribution
Methods of Reducing Power
• Architectural decisions – have the highest impact (parallelism, pipelining, low-activity designs, lower-frequency operation)
• Software – low-power instructions, algorithms
• CAD tools to implement low-power techniques
• Circuit techniques – gated clocks, low-glitch circuits, reduced capacitances, reduced activity
• Recent developments – Vdd scaling, VT adjustments
• Process technology – Silicon-on-Insulator (SOI)
Architectural Level Optimization
Delay vs. Vdd
• Reduce the supply voltage to reduce the dynamic power
• Speed penalty for Vdd reduction
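The trade-off can be illustrated with a first-order model. The slide gives no delay equation, so the sketch below assumes the common long-channel approximation that gate delay scales as Vdd / (Vdd - VT)², with a hypothetical threshold of 0.7 V; dynamic power is taken to scale as Vdd² at fixed capacitance and frequency.

```python
def relative_delay(vdd, vdd_ref, v_t):
    """Gate delay at vdd relative to vdd_ref, assuming delay ~ Vdd / (Vdd - VT)^2."""
    def delay(v):
        return v / (v - v_t) ** 2
    return delay(vdd) / delay(vdd_ref)

def relative_dynamic_power(vdd, vdd_ref):
    """Dynamic power at vdd relative to vdd_ref for fixed capacitance and frequency."""
    return (vdd / vdd_ref) ** 2

# Sweep the supply downward from a 5 V reference (VT = 0.7 V is an assumption).
for vdd in (5.0, 4.0, 3.0, 2.0):
    power = relative_dynamic_power(vdd, 5.0)
    delay = relative_delay(vdd, 5.0, v_t=0.7)
    print(f"Vdd={vdd} V: power x{power:.2f}, delay x{delay:.2f}")
```

Power falls quadratically while the delay penalty grows slowly at first and then sharply as Vdd approaches VT, which is why supply scaling is usually paired with an architectural change that recovers throughput.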
Arithmetic Computation
• Reference datapath
• Consists of an adder and comparator
• The power of the reference datapath is given as
$$P_{ref} = C_{ref} V_{ref}^2 f_{ref}$$
Parallel Implementation
• One way to maintain throughput while reducing the supply voltage is to utilize a parallel architecture
• Two identical adder-comparator datapaths are used, allowing each unit to work at half the original rate while maintaining the original throughput
Pipelined Implementation
Architecture Based Voltage Scaling
Architecture         Voltage   Area (Normalized)   Power (Normalized)
Simple               5 V       1                   1
Parallel             2.9 V     3.4                 0.36
Pipelined            2.9 V     1.3                 0.39
Pipelined-Parallel   2.0 V     3.7                 0.2

• A bigger improvement can be achieved by simultaneously exploiting parallelism and pipelining
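The table entries can be related to the reference power expression P = C·V²·f by backing out the switched-capacitance ratio each architecture implies. This is a sketch under stated assumptions: the parallel datapaths run at half the reference clock rate as described earlier, the pipelined datapath runs at the full rate, and the pipelined-parallel design is assumed to run at half rate as well (the slides do not state this). The function names are hypothetical.

```python
def normalized_power(cap_ratio, vdd, freq_ratio, vdd_ref=5.0):
    """P / P_ref = (C / C_ref) * (V / V_ref)^2 * (f / f_ref)."""
    return cap_ratio * (vdd / vdd_ref) ** 2 * freq_ratio

def implied_cap_ratio(power_norm, vdd, freq_ratio, vdd_ref=5.0):
    """Solve the expression above for C / C_ref given the tabulated power."""
    return power_norm / ((vdd / vdd_ref) ** 2 * freq_ratio)

# name: (supply voltage, assumed clock-rate ratio, normalized power from the table)
table = {
    "Simple": (5.0, 1.0, 1.0),
    "Parallel": (2.9, 0.5, 0.36),
    "Pipelined": (2.9, 1.0, 0.39),
    "Pipelined-Parallel": (2.0, 0.5, 0.2),
}

for name, (vdd, f_ratio, p_norm) in table.items():
    print(name, round(implied_cap_ratio(p_norm, vdd, f_ratio), 2))
```

Under these assumptions the implied capacitance ratios come out at roughly 2.14 for the parallel design, 1.16 for the pipelined design, and 2.5 for the combined design, so the power savings come from the lower supply voltage and are partly offset by the extra switched capacitance.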
NoC Architectures - System Level Parallelism
• NoC exhibits inherent parallelism
Traffic localization
• Traffic within a cluster – localized traffic
Bit energy vs. throughput
[Figure: bit energy versus throughput, with panels for Uniform Spatial Traffic and Localized Traffic.]
• Traffic localization reduces energy dissipation
• Modular architectures help in reducing power
• Communication-aware functional mapping