Low Power Design of Digital Circuits
Variable Input Delay CMOS Logic for
Low Power Design
Tezaswi Raja
Transmeta Corp., San Jose, CA, USA
Vishwani D. Agrawal
Dept. of ECE, Auburn University, AL, USA
http://www.eng.auburn.edu/~vagrawal
Michael L. Bushnell
Dept. of ECE, Rutgers University, NJ, USA
Research Funded by: National Science Foundation
Jan 2005
Raja et al.: Low Power Design
1
Talk Outline
Motivation
Background on Glitch Elimination Techniques
Problem Statement
New Variable Input Delay Logic
Transistor Level Design of Variable Input
Delay Gate
Results
Physical Level Implementation
Conclusion and Future Work
What Are Glitches?
[Figure: example circuit with unbalanced path delays (delay = 1 and delay = 2) producing a glitch]
Glitches occur due to differential (unbalanced) path delays.
Glitches are transients that are unnecessary for the correct
functioning of the circuit.
Glitches waste power in CMOS circuits.
Prior work
Delay Balancing for Glitch Elimination:
Balancing delays by adding buffers on select paths.
Ref: Chandrakasan and Brodersen and other books.
Hazard Filtering for Glitch Elimination:
Glitch suppression by increasing the inertial delay of gates.
Ref: Agrawal et al., VLSI Design '97, '99, '03, '04.
Gate Sizing for Glitch Elimination:
Every gate is modeled as an equivalent inverter.
Model is non-linear.
Ref: Berkelaar et al., IEEE Trans. on Circuits and Systems '96.
Transistor Sizing for Area-Speed Optimization:
Size the width and length of every transistor to get exact delay.
Model is non-linear.
Convergence problems due to large search space.
Ref: Fishburn et al., ICCAD '85.
Example: Why Were Buffers Necessary?
[Figure: three-gate example circuit, each gate with delay 1; critical path delay = 3]
Delay unit is the smallest delay possible for a
gate in a given technology.
Critical Path is the longest delay path in the
circuit and determines the speed of the circuit.
Example (cont.)
[Figure: example circuit with timing annotations for the first gate]
For glitch free operation of first gate:
Differential delay at inputs < inertial delay
OK
Example (cont.)
[Figure: example circuit with timing annotations for the second gate]
For glitch free operation of second gate:
Differential delay at inputs < inertial delay
OK (Assuming equality does not produce a glitch)
Example (cont.)
[Figure: example circuit with timing annotations for the third gate]
For glitch free operation of third gate:
Differential delay at inputs < inertial delay
Not true for gate 3
Example (cont.)
[Figure: example circuit with a delay buffer added, with timing annotations]
For glitch free operation with no IO delay increase:
Must add a delay buffer.
Buffer is necessary for conventional gate design – only
gate output delay is controllable.
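The glitch filtering condition used in this example can be checked with a few lines of code. This is a minimal sketch: the function names and the arrival times are hypothetical (they are not read from the original figure), and equality is treated as glitch free, as assumed for the second gate.

```python
def differential_delay(arrival_times):
    """Spread between the latest and earliest input arrival at a gate."""
    return max(arrival_times) - min(arrival_times)


def is_glitch_free(arrival_times, inertial_delay):
    """Condition from the slides: differential delay at the inputs must not
    exceed the gate's inertial delay (equality is assumed to be safe)."""
    return differential_delay(arrival_times) <= inertial_delay


# Hypothetical arrival times: the two inputs arrive 2 units apart
# at a gate with unit inertial delay.
unbalanced = [0, 2]
print(is_glitch_free(unbalanced, inertial_delay=1))  # False

# Conventional fix: a unit-delay buffer on the early path.
balanced = [0 + 1, 2]
print(is_glitch_free(balanced, inertial_delay=1))    # True
```

The buffer restores the condition without touching the late path, so the critical path delay is unchanged, at the cost of one extra gate.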
Controllable Input Delay Gates
[Figure: example circuit with timing annotations, using a gate with controllable input delays and no buffer]
Assume gate input delays to be controllable
Glitches can be suppressed without buffers
Problem Statement
Find a glitch reduction technique such that:
All glitches are eliminated in the circuit.
No delay buffers are inserted in the circuit.
Circuit operates at the highest possible speed
permitted by the device technology.
Technique should be scalable for large circuits.
Circuits are realizable at the physical level of design.
Note: The objective is to minimize switching power. Hence, no attempt is made to
reduce short-circuit and leakage power, which are an order of magnitude lower for
present CMOS technologies; those components of power may be addressed in
future research.
New Variable Input Delay Logic
I/O path delay through a gate = Input Delay + Output Delay
Output Delay: propagation delay through a gate from the inputs to the
outputs.
Input Delay: extra delay that can be added on a single I/O path through
the gate, and that can be controlled independently of the
other input delays.
Variable Input Delay Logic: logic level design of circuits using components
with variable input and output delays along different I/O paths through
the gate.
Delay Model for a New Gate
[Figure: gate 3 with inputs 1 and 2; the I/O path delays are d3,1 + d3 (from input 1) and d3,2 + d3 (from input 2)]
Separate the output (inertial) and input delay variables.
d3 - output delay of the gate.
d3,1 - input delay of the gate along path from 1 to 3.
Technology constraint:
0 ≤ d3,1, d3,2 ≤ ub
Input delay difference has an upper bound, which we define as
Gate Input Differential Delay Upper Bound ( ub ).
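As an illustration of this delay model, the sketch below stores one output delay per gate and one input delay per input, and checks the technology constraint. The class name, method names, and numeric values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VariableInputDelayGate:
    output_delay: float                                # d_i (inertial delay)
    input_delays: dict = field(default_factory=dict)   # d_{i,j}, keyed by input j

    def path_delay(self, j):
        """I/O path delay from input j: input delay + output delay."""
        return self.input_delays[j] + self.output_delay

    def is_feasible(self, ub):
        """Technology constraint: 0 <= d_{i,j} <= ub for every input j."""
        return all(0 <= d <= ub for d in self.input_delays.values())


# Gate 3 with inputs 1 and 2 (values are assumptions for illustration).
gate3 = VariableInputDelayGate(output_delay=1, input_delays={1: 2, 2: 0})
print(gate3.path_delay(1))        # 3  (d3,1 + d3)
print(gate3.path_delay(2))        # 1  (d3,2 + d3)
print(gate3.is_feasible(ub=10))   # True
print(gate3.is_feasible(ub=1))    # False
```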
Gate Input Differential Delay Upper
Bound (ub)
It is a measure of the maximum difference in delay of any
two I/O paths through the gate, that can be designed in a
given CMOS technology.
Arbitrary input delays cannot be realized in practice due to the
technology limitation at the transistor and layout levels.
The bound ub is the limit of flexibility allowed by the
technology to the designer at the transistor and layout levels.
The following feasibility condition must be imposed while
determining delays for glitch suppression:
0 ≤ di,j ≤ ub
New Linear Programs
We propose two new LPs for designing circuits
based on the specifications of the design.
Minimum dynamic power (MDP) LP
The circuit consumes the least possible power and
operates at the highest possible speed for that power.
Delay specification (DS) LP
The circuit meets a given delay requirement
while adding the smallest number of buffers.
New MDP LP Example
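[Figure: example circuit with primary inputs 1 to 4 and gates 5, 6, 7. Gate 5 has inputs 1 and 2, gate 6 has inputs 2 and 3, and gate 7 has inputs 5, 6 and 4. Each I/O path is labeled with its delay: d5,1 + d5, d5,2 + d5, d6,2 + d6, d6,3 + d6, d7,5 + d7, d7,6 + d7, d7,4 + d7.]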
Gate inertial delay variables d5 ..d7
Gate input delay variables di, j for every path through gate i
from input j
Corresponding window variables t5 ..t7 and T5 ..T7.
New MDP LP Example (cont.)
[Figure: example circuit with inputs 1 to 4 and gates 5, 6, 7, as on the first MDP LP example slide]
Inertial delay constraint for gate 5: d5 ≥ 1
Input delay (feasibility) constraints for gate 5:
0 ≤ d5,1 ≤ ub
0 ≤ d5,2 ≤ ub
New MDP LP Example (cont.)
[Figure: example circuit with inputs 1 to 4 and gates 5, 6, 7, as on the first MDP LP example slide]
Differential delay constraints for gate 5:
T5 > T1 + d5,1 + d5;
T5 > T2 + d5,2 + d5;
t5 < t1 + d5,1 + d5;
t5 < t2 + d5,2 + d5;
d5 > T5 - t5;
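For a fixed choice of delays, the window variables can be evaluated directly rather than solved for. The sketch below propagates the earliest (t) and latest (T) arrival times through the example circuit and tests the window-width condition for every gate. It assumes that primary inputs switch at time 0 and that equality is acceptable; the function names and the delay values are hypothetical.

```python
# Fan-in of each gate in the example circuit (gate -> list of inputs).
FANIN = {5: [1, 2], 6: [2, 3], 7: [5, 6, 4]}
PRIMARY_INPUTS = [1, 2, 3, 4]


def timing_windows(gate_delay, input_delay):
    """Return {node: (t, T)}, the earliest and latest arrival at each node.

    gate_delay[i] is d_i; input_delay[(i, j)] is d_{i,j} (0 if absent).
    Primary inputs are assumed to switch at time 0.
    """
    window = {pi: (0, 0) for pi in PRIMARY_INPUTS}
    for gate in sorted(FANIN):  # 5, 6, 7 is a topological order here
        early, late = [], []
        for j in FANIN[gate]:
            path = input_delay.get((gate, j), 0) + gate_delay[gate]
            early.append(window[j][0] + path)
            late.append(window[j][1] + path)
        window[gate] = (min(early), max(late))
    return window


def glitch_free_gates(gate_delay, input_delay):
    """Map each gate to True when d_i >= T_i - t_i (equality allowed)."""
    window = timing_windows(gate_delay, input_delay)
    return {g: gate_delay[g] >= window[g][1] - window[g][0] for g in FANIN}


# Hypothetical delays: gate 5 is slower than gates 6 and 7.
delays = {5: 2, 6: 1, 7: 1}
print(glitch_free_gates(delays, {}))           # {5: True, 6: True, 7: False}
print(glitch_free_gates(delays, {(7, 4): 1}))  # {5: True, 6: True, 7: True}
```

In the second call a single input delay on the path from input 4 to gate 7 narrows the window at gate 7 without adding a buffer and without lengthening the latest arrival time.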
New MDP LP Example (cont.)
[Figure: example circuit with inputs 1 to 4 and gates 5, 6, 7, as on the first MDP LP example slide]
IO delay constraint for each PO in the circuit:
T7 ≤ maxdelay;
maxdelay is the parameter which gives the delay of the critical path.
This determines the speed of operation of the circuit.
New MDP LP Example (cont.)
[Figure: example circuit with inputs 1 to 4 and gates 5, 6, 7, as on the first MDP LP example slide]
Objective Function:
minimize maxdelay;
This gives the fastest possible, minimum dynamic power
consuming circuit, given the feasibility condition for the
technology.
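The complete MDP LP for the example circuit can be assembled mechanically from the netlist. The sketch below emits it as AMPL-style text. It is an illustration only: the variable naming (d5_1 for d5,1), the constraint names, the use of non-strict inequalities, and the assumption that primary inputs switch at time 0 are choices made here, and the output has not been run through an AMPL installation.

```python
FANIN = {5: [1, 2], 6: [2, 3], 7: [5, 6, 4]}
PRIMARY_INPUTS = [1, 2, 3, 4]
PRIMARY_OUTPUTS = [7]


def mdp_lp_text(ub):
    """Build the MDP LP for the example circuit as AMPL-style text."""
    lines = ["var maxdelay >= 0;"]
    for n in PRIMARY_INPUTS + sorted(FANIN):
        lines.append(f"var t{n} >= 0;")   # earliest arrival (window start)
        lines.append(f"var T{n} >= 0;")   # latest arrival (window end)
    for g in sorted(FANIN):
        lines.append(f"var d{g} >= 1;")   # inertial delay constraint
        for j in FANIN[g]:
            # input delay (feasibility) constraint
            lines.append(f"var d{g}_{j} >= 0, <= {ub};")
    lines.append("minimize obj: maxdelay;")
    for pi in PRIMARY_INPUTS:
        lines.append(f"subject to pi_lo_{pi}: t{pi} = 0;")
        lines.append(f"subject to pi_hi_{pi}: T{pi} = 0;")
    for g in sorted(FANIN):
        for j in FANIN[g]:
            # differential delay constraints, one pair per gate input
            lines.append(
                f"subject to late_{g}_{j}: T{g} >= T{j} + d{g}_{j} + d{g};")
            lines.append(
                f"subject to early_{g}_{j}: t{g} <= t{j} + d{g}_{j} + d{g};")
        lines.append(f"subject to width_{g}: d{g} >= T{g} - t{g};")
    for po in PRIMARY_OUTPUTS:
        lines.append(f"subject to po_{po}: T{po} <= maxdelay;")  # IO delay
    return "\n".join(lines)


print(mdp_lp_text(ub=10))
```

The number of constraints grows linearly with the number of gate inputs, which is what makes the formulation practical for large circuits.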
Solution Curves
[Figure: solution curves, power versus maxdelay. Labels: previous solutions; new MDP LP solutions for ub = 0, 5, 10, 15 and ∞; power consumed by buffers; minimum dynamic power; fastest possible design in any technology.]
Delay Specification LP
Used when the design must meet a given delay
specification and the designer is willing to sacrifice
some dynamic power by inserting buffers.
Modifications to MDP LP
Insert buffer variables at every fanout stem and
branches and at PIs (similar to Linear constraint set
method by Raja et al.)
maxdelay is a given parameter, which is the maximum
delay of the critical path according to specification.
Delay Specification LP
Components of the LP
Gate constraints – unchanged
Input delay (feasibility) constraints – unchanged for
same ub
Differential delay constraints – unchanged
Maxdelay constraints – unchanged but maxdelay is
a given parameter.
Objective function:
Minimize Σ dj, where j ∈ buffers
Solution Curves
[Figure: solution curves, power versus maxdelay. Labels: previous solutions; new MDP LP solutions; new DS LP solutions; curves for ub = 0, 5, 10, 15 and ∞; power consumed by buffers; minimum dynamic power; fastest possible design in any technology.]
Transistor Level Implementation
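[Figure: RC model of a conventional CMOS gate with path delays d3,1 and d3,2, showing on-resistance Ron, routing capacitance Cr, input capacitance Cin, and capacitance Cp]
Conventional CMOS gate design:
Delay = Ron (Cr + Cin), where Cr is the routing capacitance and Cin the input capacitance
Energy = 0.5 (Cr + Cin) V^2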
Delay can be changed by changing the resistance or the capacitance.
Resistance does not affect energy per transition.
Transistor Level Implementation
Possible implementations of the variable input delay
gate:
Capacitance manipulation method where the input
capacitance offered by the respective transistor pair is varied.
Pass transistor added design where an extra transistor is
added to increase the resistance and thereby the input delay.
We propose the addition of:
Single nMOS transistor
CMOS pass transistor
We describe the single nMOS transistor added design in detail
here. The other two are documented in the thesis.
Single nMOSFET Added Design
[Figure: RC model of the gate with a series nMOS transistor of resistance Rs added on the path from input 1]
d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2
The input delay can be added by an nMOS transistor in series to
the path desired.
The addition of resistance does not increase the energy per
transition.
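A short numeric check of these expressions is sketched below. The function name and all component values are assumptions picked for illustration, not data from the talk.

```python
def gate_delays_and_energy(r_on, r_s, c_r, c_in, vdd):
    """First-order RC estimates for the single nMOS transistor added design."""
    d_plain = r_on * (c_r + c_in)            # d3,2: path without added transistor
    d_slow = d_plain + r_s * c_in            # d3,1: extra Rs * Cin input delay
    energy = 0.5 * (c_r + c_in) * vdd ** 2   # depends on capacitance and V only
    return d_slow, d_plain, energy


# Hypothetical values: 10 kOhm on-resistance, 20 kOhm series resistance,
# 5 fF routing and input capacitance, 2.5 V supply.
d31, d32, energy = gate_delays_and_energy(10e3, 20e3, 5e-15, 5e-15, 2.5)
print(f"d3,1 = {d31 * 1e12:.0f} ps, d3,2 = {d32 * 1e12:.0f} ps")
print(f"energy per transition = {energy * 1e15:.2f} fJ")

# Removing the series resistance changes the delay but not the energy.
_, _, energy_without_rs = gate_delays_and_energy(10e3, 0.0, 5e-15, 5e-15, 2.5)
print(energy == energy_without_rs)
```

Rs appears only in the delay expression, which is why the input delay can be tuned without changing the energy per transition in this first-order model.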
Effect of Input Slope
[Figure: signal slope degraded by the added series resistance Rs]
Too large ub cannot be realized in practice due to noise issues.
Increased resistance degrades the slope of a signal and we use the
CMOS gate following it to regenerate the slope.
The regenerative capability of a gate is limited and this determines
practical ub value.
The slope allowed in a design depends on the noise specifications
of the circuit.
Single nMOSFET Added Design
Advantages:
Almost completely independent control of input delays.
ub is very high compared to capacitance manipulation method.
Much less overhead than a conventional buffer.
Can be integrated into full-custom as well as standard cell place and
route design flows.
Design Issues:
nMOSFET degrades the signal when passing logic 1. Hence, it
increases the leakage of the transistors in the fanout stages.
However, this is for certain input combinations only.
Short circuit current is a function of the ratio of input/output
slopes. Since inserting resistance degrades the input slope, it
might increase short circuit power by a minor amount.
CMOS Pass Transistor Added Design
[Figure: RC model of the gate with a series CMOS pass transistor of resistance Rs added on the path from input 1]
d3,1 = Ron (Cr + Cin) + Rs Cin (output delay + input delay)
d3,2 = Ron (Cr + Cin)
Energy = 0.5 (Cr + Cin) V^2
The input delay can be added by the input CMOS pass transistor
in series to the path desired.
This does not degrade the signal as both transistors together
conduct both logic values well.
Technology Mapping
[Figure: technology mapping flow. Delay required → look-up table for sizes → transistor sizes → error acceptable? If yes, done. If no, compute the sensitivity of each transistor size to delay, increment that transistor dimension, and check the error again.]
Determine sizes of transistors in a gate for the given delay and
given load capacitance.
First guess is given by the look-up table.
Second stage is sensitivity driven.
Reduces the complexity of transistor search.
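The two-stage search can be sketched as follows. Everything concrete here is an assumption made for illustration: the look-up table entries, the linear toy delay model, the step size, and the tolerance. The sensitivity stage is approximated by trying one increment per dimension and keeping the one that reduces the delay error most; like the flow above, it only increments dimensions.

```python
# Hypothetical look-up table: target delay -> initial transistor dimensions.
LOOKUP = {
    1.0: {"length_n": 1.4, "length_p": 1.0},
    2.0: {"length_n": 2.8, "length_p": 2.0},
}


def toy_delay(sizes):
    """Stand-in for a circuit simulation: delay grows with channel length."""
    return 0.5 * sizes["length_n"] + 0.3 * sizes["length_p"]


def first_guess(target_delay):
    """Stage 1: start from the table entry closest to the target delay."""
    key = min(LOOKUP, key=lambda d: abs(d - target_delay))
    return dict(LOOKUP[key])


def map_gate(target_delay, sizes, delay_of, step=0.1, tol=0.05, max_iter=1000):
    """Stage 2: increment the dimension whose change helps the error most."""
    sizes = dict(sizes)
    for _ in range(max_iter):
        error = abs(target_delay - delay_of(sizes))
        if error <= tol:
            return sizes
        best_name, best_error = None, error
        for name in sizes:
            trial = dict(sizes)
            trial[name] += step
            trial_error = abs(target_delay - delay_of(trial))
            if trial_error < best_error:
                best_name, best_error = name, trial_error
        if best_name is None:
            break  # no single increment improves the error
        sizes[best_name] += step
    return sizes


sizes = map_gate(1.3, first_guess(1.3), toy_delay)
print(sizes, toy_delay(sizes))
```

Starting from the table entry keeps the number of delay evaluations small, since each iteration needs only one evaluation per transistor dimension.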
Results for Speed of Circuit Using MDP LP
Maxdelay is normalized to the length of the critical path when all gates are of unit delay.
Each curve is a different benchmark circuit.
As we increase ub the circuit becomes faster.
Flexibility required for fastest operation of circuit is proportional to the size of the circuit.
Power Opt. Using MDP LP (for ub=10)
Circuit   No. of    Maxdelay   Norm.    Original power     Optimized power
          vectors              delay    Avg.     Peak      Avg.     Peak
c432      56        71         4.17     1.0      1.0       0.65     0.55
c499      54        34         2.26     1.0      1.0       0.70     0.65
c880      78        45         1.50     1.0      1.0       0.48     0.45
c1355     87        67         2.05     1.0      1.0       0.47     0.36
c1908     144       173        4.32     1.0      1.0       0.54     0.44
c2670     82        35         1.09     1.0      1.0       0.68     0.56
c3540     200       347        7.38     1.0      1.0       0.53     0.43
c5315     157       542        11.06    1.0      1.0       0.53     0.44
c6288     141       124        1.87     1.0      1.0       0.22     0.18
c7552     158       50         1.16     1.0      1.0       0.28     0.26
Power Opt. Using DS LP (for ub=10)
Circuit   Norm.      Conventional gates                Variable input delay gates
          Maxdelay   (Raja et al., VLSI Design '03)
                     Avg.     Peak     Buffers         Avg.     Peak     Buffers
c432      1.0        0.72     0.67     95              0.69     0.66     61
c432      2.0        0.62     0.60     66              0.65     0.55     0
c499      1.0        0.91     0.87     48              0.86     0.84     0
c499      2.0        0.70     0.66     0               0.71     0.65     0
c880      1.0        0.68     0.54     62              0.58     0.45     1
c880      2.0        0.68     0.52     34              0.56     0.45     0
c1355     1.0        0.58     0.48     224             0.48     0.42     64
c1355     2.0        0.57     0.48     192             0.44     0.39     32
c1908     1.0        0.69     0.59     219             0.56     0.46     5
c1908     2.0        0.59     0.44     70              0.55     0.45     4
Power Opt. Using DS LP (for ub=10)
Circuit   Norm.      Power (conventional gates)        Power (variable input delay gates)
          Maxdelay   (Raja et al., VLSI Design '03)
                     Avg.     Peak     Buffers         Avg.     Peak     Buffers
c2670     1.0        0.79     0.65     157             0.70     0.56     2
c2670     2.0        0.71     0.58     35              0.69     0.57     0
c3540     1.0        0.64     0.44     239             0.57     0.46     3
c3540     2.0        0.58     0.46     140             0.54     0.43     1
c5315     1.0        0.63     0.52     280             0.57     0.48     26
c5315     2.0        0.60     0.45     171             0.55     0.46     4
c6288     1.0        0.40     0.36     294             0.91     0.87     584
c6288     2.0        0.36     0.34     120             0.21     0.16     0
c7552     1.0        0.38     0.34     366             0.28     0.24     1
c7552     2.0        0.36     0.32     111             0.27     0.24     0
Example Circuit
[Figure: the example circuit (inputs 1 to 4, gates 5, 6, 7) in three versions with gate delays annotated: unoptimized circuit, buffer optimized circuit, and nMOS optimized circuit]
Example Circuit – Spectre Results
[Figure: Spectre waveforms versus time for the unoptimized circuit, the buffer optimized circuit, and the nMOS optimized circuit]
Physical Level Verification
[Figure: physical level verification flow. AMPL → delays → technology mapping → transistor sizes → create cells using Prolific → standard cell library → standard cell place and route → layout → extract routing capacitance → routing load → routing acceptable? If no, the flow loops back. If yes: optimized layout → analog power simulations → energy consumption.]
Layouts of c7552 (0.25 µm CMOS)
c7552 un-optimized
Gate Count = 3827
Transistor Count ≈ 40,000
Critical Delay = 2.15 ns
Area = 710 x 710 µm^2
c7552 optimized (ub = 10)
Gate Count = 3828
Transistor Count ≈ 45,000
Critical Delay = 2.15 ns
Area = 760 x 760 µm^2 (1.14 relative to un-optimized)
Instantaneous Power Savings
Peak Power Savings = 68%
Patents and Dissertations
Patents
V. D. Agrawal, “Low Power Circuits Through Hazard Pulse
Suppression,” U.S. Patent 5,983,007, November 1999.
T. Raja, V. D. Agrawal and M. L. Bushnell, “Variable Input Delay
CMOS Logic and Its Application to Low Power Design,” to be
submitted to USPTO through Rutgers Univ., May 2004.
Dissertations
T. Raja, Minimum Dynamic Power Design of CMOS Circuits using a Reduced
Constraint Set Linear Program, MS Thesis, Dept. of ECE, Rutgers
University, May 2002.
T. Raja, Minimum Dynamic Power CMOS Design with Variable Input Delay
Logic, PhD Thesis, Dept. of ECE, Rutgers University, May 2004.
S. Uppalapati, Low Power Design of Standard Cell Digital VLSI Circuits,
MS Thesis, Dept. of ECE, Rutgers University, October 2004.
Papers
V. D. Agrawal, “Low-Power Design by Hazard Filtering,” Proc. 10th
Int. Conf. VLSI Design, Jan. 1997, pp. 193-197.
V. D. Agrawal, M. L. Bushnell, G. Parthasarathy, and R. Ramadoss,
“Digital Circuit Design for Minimum Transient Energy and a
Linear Programming Method,” Proc. 12th Int. Conf. VLSI Design,
Jan. 1999, pp. 434-439.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “Minimum Dynamic
Power CMOS Circuit Design by a Reduced Constraint Set Linear
Program,” Proc. 16th Int. Conf. VLSI Design, Jan. 2003, pp. 527-532.
T. Raja, V. D. Agrawal, and M. L. Bushnell, “CMOS Circuit Design
for Minimum Dynamic Power and Highest Speed,” Proc. 17th Int.
Conf. VLSI Design, Jan. 2004, pp. 1035-1040.
Conclusion
Main idea: Minimum dynamic power high speed circuits can be
designed if gates with variable input delays are used.
The new design suppresses all glitches without any delay buffers.
Decreases power without loss in speed and with very little increase in area.
Developed a linear program solution to demonstrate the idea.
Developed new gate design for transistor level implementation.
Results have been verified by physical layout design of large circuits.
Results show average power savings up to 58%.
Technique easily scalable for large circuits.
Leakage power remains a concern – ongoing research.