logical effort
Lecture 6: Logical Effort
Outline
Logical Effort
Delay in a Logic Gate
Multistage Logic Networks
Choosing the Best Number of Stages
Example
Summary
6: Logical Effort
CMOS VLSI Design 4th Ed.
Introduction
Chip designers face a bewildering array of choices
– What is the best circuit topology for a function?
– How many stages of logic give least delay?
– How wide should the transistors be?
Logical effort is a method to make these decisions
– Uses a simple model of delay
– Allows back-of-the-envelope calculations
– Helps make rapid comparisons between alternatives
– Emphasizes remarkable symmetries
Example
Ben Bitdiddle is the memory designer for the Motoroil 68W86,
an embedded automotive processor. Help Ben design the
decoder for a register file.
[Figure: 4:16 decoder driven by true and complementary address inputs A[3:0], selecting one of 16 words in a register file 32 bits wide]
Decoder specifications:
– 16 word register file
– Each word is 32 bits wide
– Each bit presents load of 3 unit-sized transistors
– True and complementary address inputs A[3:0]
– Each input may drive 10 unit-sized transistors
Ben needs to decide:
– How many stages to use?
– How large should each gate be?
– How fast can decoder operate?
Delay in a Logic Gate
Express delays in process-independent unit: $d = d_{abs} / \tau$, where $\tau = 3RC$
– $\tau \approx$ 3 ps in 65 nm process
– $\tau \approx$ 60 ps in 0.6 μm process
Delay has two components: $d = f + p$
f: effort delay = gh (a.k.a. stage effort)
– Again has two components
g: logical effort
– Measures relative ability of gate to deliver current
– $g \equiv 1$ for inverter
h: electrical effort = Cout / Cin
– Ratio of output to input capacitance
– Sometimes called fanout
p: parasitic delay
– Represents delay of gate driving no load
– Set by internal parasitic capacitance
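The delay model on this slide is small enough to write out directly. The sketch below is illustrative only: the function names are made up for this example, and the values of tau are the approximate process figures quoted on the slide.

```python
def stage_delay(g: float, h: float, p: float) -> float:
    """Normalized delay d = g*h + p of a single stage, in units of tau."""
    return g * h + p


def absolute_delay_ps(d: float, tau_ps: float) -> float:
    """Convert a normalized delay to picoseconds for a given tau."""
    return d * tau_ps


# A 2-input NAND (g = 4/3, p = 2) driving three times its input capacitance.
d_nand2 = stage_delay(g=4 / 3, h=3.0, p=2.0)
print(f"normalized delay: {d_nand2:.1f}")
print(f"at tau = 3 ps:    {absolute_delay_ps(d_nand2, 3.0):.0f} ps")
print(f"at tau = 60 ps:   {absolute_delay_ps(d_nand2, 60.0):.0f} ps")
```

Keeping the normalized delay separate from tau is what makes the result process-independent: only the final conversion depends on the technology.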
Delay Plots
$d = f + p = gh + p$
[Plot: normalized delay d versus electrical effort h = Cout / Cin, annotated with effort delay f and parasitic delay p]
– Inverter: g = 1, p = 1, d = h + 1
– 2-input NAND: g = 4/3, p = 2, d = (4/3)h + 2
What about NOR2?
Computing Logical Effort
DEF: Logical effort is the ratio of the input
capacitance of a gate to the input capacitance of an
inverter delivering the same output current.
Measure from delay vs. fanout plots
Or estimate by counting transistor widths
[Figure: transistor-level schematics of an inverter, a 2-input NAND, and a 2-input NOR, annotated with transistor widths]
– Inverter: Cin = 3, g = 3/3
– 2-input NAND: Cin = 4, g = 4/3
– 2-input NOR: Cin = 5, g = 5/3
Catalog of Gates
Logical effort of common gates
Gate type        Number of inputs
                 1     2      3         4             n
Inverter         1
NAND                   4/3    5/3       6/3           (n+2)/3
NOR                    5/3    7/3       9/3           (2n+1)/3
Tristate / mux   2     2      2         2             2
XOR, XNOR              4, 4   6, 12, 6  8, 16, 16, 8
Catalog of Gates
Parasitic delay of common gates
– In multiples of pinv (1)
Gate type        Number of inputs
                 1     2     3     4     n
Inverter         1
NAND                   2     3     4     n
NOR                    2     3     4     n
Tristate / mux   2     4     6     8     2n
XOR, XNOR              4     6     8
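The closed-form entries in the two catalog tables can be captured as small helper functions. This is a sketch for NAND and NOR gates only; the function names are hypothetical, and parasitic delay is expressed in multiples of pinv as in the table.

```python
from fractions import Fraction


def nand_logical_effort(n: int) -> Fraction:
    """Logical effort of an n-input NAND: (n + 2) / 3."""
    return Fraction(n + 2, 3)


def nor_logical_effort(n: int) -> Fraction:
    """Logical effort of an n-input NOR: (2n + 1) / 3."""
    return Fraction(2 * n + 1, 3)


def nand_nor_parasitic(n: int) -> int:
    """Parasitic delay of an n-input NAND or NOR, in multiples of pinv."""
    return n


for n in (2, 3, 4):
    print(n, nand_logical_effort(n), nor_logical_effort(n), nand_nor_parasitic(n))
```

Exact fractions are used so the printed values match the table entries, apart from Python reducing 6/3 and 9/3 to 2 and 3.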
Example: Ring Oscillator
Estimate the frequency of an N-stage ring oscillator
Logical Effort: g = 1
Electrical Effort: h = 1
Parasitic Delay: p = 1
Stage Delay: d = 2
Frequency: fosc = 1/(2*N*d) = 1/(4N)
31-stage ring oscillator in 0.6 μm process has frequency of ~200 MHz
Example: FO4 Inverter
Estimate the delay of a fanout-of-4 (FO4) inverter
Logical Effort: g = 1
Electrical Effort: h = 4
Parasitic Delay: p = 1
Stage Delay: d = 5
The FO4 delay is about
– 300 ps in 0.6 μm process
– 15 ps in a 65 nm process
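As a quick check of the FO4 figures, the normalized delay d = 5 can be scaled by tau for each process. This is a minimal sketch; the dictionary and function names are invented for illustration, and the tau values are the approximate ones given earlier in the lecture.

```python
# Approximate tau per process, in picoseconds (values quoted in the lecture).
TAU_PS = {"0.6 um": 60.0, "65 nm": 3.0}


def fo4_delay_ps(tau_ps: float, p_inv: float = 1.0) -> float:
    """Delay of an inverter (g = 1) driving four copies of itself (h = 4)."""
    g, h = 1.0, 4.0
    return (g * h + p_inv) * tau_ps


for process, tau in TAU_PS.items():
    print(f"{process}: FO4 delay is about {fo4_delay_ps(tau):.0f} ps")
```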
Limitations to Linear Delay Model
Input and output slope
Input arrival times
Velocity saturation
$\frac{I_{dsat\text{-}N\text{-}series}}{I_{dsat}} = \frac{V_{DD} - V_t + V_c}{V_{DD} - V_t + N V_c}$
Voltage dependence
$\tau = k \frac{C V_{DD}}{(V_{DD} - V_t)^{\alpha}}$
$\tau \approx k \frac{C V_{DD}}{V_{DD} - V_t^*} = \frac{kC}{1 - V_t^* / V_{DD}}$
$\tau = k \frac{C V_{DD}}{I_{off} 10^{V_{DD}/S}}$
Limitations to Linear Delay Model
Gate source capacitance (series transistor
capacitances)
Bootstrapping (Miller effect)
Multistage Logic Networks
Logical effort generalizes to multistage networks
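Path Logical Effort: $G = \prod g_i$
Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10, internal node capacitances x, y, z, and load capacitance 20]
– g1 = 1, h1 = x/10
– g2 = 5/3, h2 = y/x
– g3 = 4/3, h3 = z/y
– g4 = 1, h4 = 20/z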
Multistage Logic Networks
Logical effort generalizes to multistage networks
Path Logical Effort: $G = \prod g_i$
Path Electrical Effort: $H = C_{out\text{-}path} / C_{in\text{-}path}$
Path Effort: $F = \prod f_i = \prod g_i h_i$
Can we write F = GH?
Paths that Branch
No! Consider paths that branch:
[Figure: a gate with input capacitance 5 drives two gates with input capacitance 15 each; each of those drives a load of 90]
G = 1
H = 90 / 5 = 18
GH = 18
h1 = (15 + 15) / 5 = 6
h2 = 90 / 15 = 6
F = g1g2h1h2 = 36 = 2GH
Branching Effort
Introduce branching effort
– Accounts for branching between stages in path
$b = \frac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$
$B = \prod b_i$
Note: $\prod h_i = BH$
Now we compute the path effort
– F = GBH
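The branching example can be replayed numerically to confirm that F = GBH matches the product of the stage efforts. This is a small sketch using the capacitances from the branching figure (5, 15 and 15, 90); the helper name is hypothetical.

```python
def branching_effort(c_on_path: float, c_off_path: float) -> float:
    """b = (C_on-path + C_off-path) / C_on-path for one stage."""
    return (c_on_path + c_off_path) / c_on_path


# Two-stage path with unit logical effort: input capacitance 5,
# a fork into two gates of capacitance 15, and a load of 90 on the path.
g1, g2 = 1.0, 1.0
h1 = (15 + 15) / 5          # first stage drives both branches
h2 = 90 / 15                # second stage drives the load

G = g1 * g2
H = 90 / 5
B = branching_effort(15, 15)    # only the first stage branches

F_from_stages = g1 * h1 * g2 * h2
F_from_path = G * B * H
print(F_from_stages, F_from_path)
```

Both expressions evaluate to 36, whereas GH alone gives 18; the factor of 2 is the branching effort.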
Multistage Delays
Path Effort Delay: $D_F = \sum f_i$
Path Parasitic Delay: $P = \sum p_i$
Path Delay: $D = \sum d_i = D_F + P$
Designing Fast Circuits
$D = \sum d_i = D_F + P$
Delay is smallest when each stage bears same effort
$\hat{f} = g_i h_i = F^{1/N}$
Thus minimum delay of N stage path is
$D = N F^{1/N} + P$
This is a key result of logical effort
– Find fastest possible delay
– Doesn’t require calculating gate sizes
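The equal-effort result can be illustrated by comparing it with an unbalanced split of the same path effort. This is a sketch under assumed numbers (F = 125, N = 3, P = 7, which also appear in the 3-stage example of this lecture); the function name is made up.

```python
def min_path_delay(F: float, N: int, P: float) -> tuple[float, float]:
    """Return (best stage effort, minimum delay) for an N-stage path."""
    f_hat = F ** (1.0 / N)
    return f_hat, N * f_hat + P


F, N, P = 125.0, 3, 7.0
f_hat, d_min = min_path_delay(F, N, P)

# An unbalanced assignment with the same product of stage efforts (2.5 * 5 * 10 = 125).
unbalanced = [2.5, 5.0, 10.0]
d_unbalanced = sum(unbalanced) + P

print(f"balanced:   f_hat = {f_hat:.2f}, D = {d_min:.1f}")
print(f"unbalanced: D = {d_unbalanced:.1f}")
```

The minimum delay is obtained without choosing any gate size, which is the point the slide makes.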
Gate Sizes
How wide should the gates be for least delay?
$\hat{f} = gh = g \frac{C_{out}}{C_{in}}$
$\Rightarrow C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
Working backward, apply capacitance
transformation to find input capacitance of each gate
given load it drives.
Check work by verifying input cap spec is met.
Example: 3-stage path
Select gate sizes x and y for least delay from A to B
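[Figure: three-stage path from A to B. The gate at A has input capacitance 8 and drives three gates of size x; each of those drives two gates of size y; each final gate drives a load of 45]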
Example: 3-stage path
[Figure: three-stage path from A (input capacitance 8) to B (load 45), with gate sizes x and y]
Logical Effort: G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort: H = 45/8
Branching Effort: B = 3*2 = 6
Path Effort: F = GBH = 125
Best Stage Effort: $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay: P = 2 + 3 + 2 = 7
Delay: D = 3*5 + 7 = 22 = 4.4 FO4
Example: 3-stage path
Work backward for sizes
y = 45 * (5/3) / 5 = 15
x = (15*2) * (5/3) / 5 = 10
[Figure: the path from A to B annotated with the resulting transistor widths]
– First gate (input capacitance 8): P: 4, N: 4
– Second gate (x = 10): P: 4, N: 6
– Third gate (y = 15): P: 12, N: 3
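The backward sizing pass can be sketched as a short loop. The representation below is an assumption made for illustration: each stage is described by its logical effort and by the number of identical gates it drives (3, 2, and 1 for this path), and `size_path` is a hypothetical helper, not something from the lecture.

```python
from fractions import Fraction


def size_path(efforts, fanouts, c_load, f_hat):
    """Work backward from the load, applying C_in = g * C_out / f_hat.

    efforts[i] is the logical effort of stage i; fanouts[i] is the number
    of identical copies of the next stage (or of the load) that it drives.
    Returns the input capacitance of each stage, first stage first.
    """
    sizes = []
    c_next = Fraction(c_load)
    for g, copies in zip(reversed(efforts), reversed(fanouts)):
        c_in = g * (copies * c_next) / f_hat
        sizes.append(c_in)
        c_next = c_in
    return sizes[::-1]


efforts = [Fraction(4, 3), Fraction(5, 3), Fraction(5, 3)]
fanouts = [3, 2, 1]
c_a, x, y = size_path(efforts, fanouts, c_load=45, f_hat=5)
print(f"input at A = {c_a}, x = {x}, y = {y}")
```

The first value returned is the input capacitance at A, so comparing it with the given 8 serves as the check that the input capacitance spec is met.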
Best Number of Stages
How many stages should a path use?
– Minimizing number of stages is not always fastest
Example: drive 64-bit datapath with unit inverter
$D = N F^{1/N} + P = N (64)^{1/N} + N$
[Figure: four alternative inverter chains between a unit-sized initial driver and a datapath load of 64]
N    f     D      Inverter sizes
1    64    65     1
2    8     18     1, 8
3    4     15     1, 4, 16         (Fastest)
4    2.8   15.3   1, 2.8, 8, 23
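The table for the 64-bit datapath driver can be regenerated with a short sweep over N. This is a minimal sketch assuming a chain of inverters with pinv = 1; the function name is invented for this example.

```python
def inverter_chain_delay(F: float, N: int, p_inv: float = 1.0) -> float:
    """D = N * F**(1/N) + N * p_inv for a chain of N inverters."""
    return N * F ** (1.0 / N) + N * p_inv


F = 64.0
rows = [(N, F ** (1.0 / N), inverter_chain_delay(F, N)) for N in range(1, 5)]
for N, f, D in rows:
    print(f"N = {N}: f = {f:.1f}, D = {D:.1f}")

best_N = min(rows, key=lambda row: row[2])[0]
print("fastest N:", best_N)
```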
Derivation
Consider adding inverters to end of path
– How many give least delay?
Logic Block: $n_1$ stages, path effort F
$N - n_1$ extra inverters
$D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1) p_{inv}$
$\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
Define best stage effort $\rho = F^{1/N}$
$p_{inv} + \rho (1 - \ln \rho) = 0$
Best Stage Effort
$p_{inv} + \rho (1 - \ln \rho) = 0$ has no closed-form solution
Neglecting parasitics ($p_{inv} = 0$), we find $\rho = 2.718$ (e)
For $p_{inv} = 1$, solve numerically for $\rho = 3.59$
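One way to solve the best-stage-effort equation numerically is bisection, sketched below. The choice of bisection and the bracket [1, 20] are assumptions of this example; the left-hand side is positive at rho = 1 and decreases for rho > 1, so a sign change in that bracket locates the root for modest values of pinv.

```python
import math


def best_stage_effort(p_inv: float, lo: float = 1.0, hi: float = 20.0,
                      tol: float = 1e-9) -> float:
    """Solve p_inv + rho * (1 - ln(rho)) = 0 for rho by bisection."""
    def residual(rho: float) -> float:
        return p_inv + rho * (1.0 - math.log(rho))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid    # root lies to the right of mid
        else:
            hi = mid    # root lies at or to the left of mid
    return 0.5 * (lo + hi)


print(f"p_inv = 0: rho = {best_stage_effort(0.0):.3f}")
print(f"p_inv = 1: rho = {best_stage_effort(1.0):.2f}")
```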
Sensitivity Analysis
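How sensitive is delay to using exactly the best number of stages?
[Plot: $D(N) / D(\hat{N})$ versus $N / \hat{N}$ for $N / \hat{N}$ from 0.5 to 2.0. Marked delay ratios are 1.15, 1.26, and 1.51; the 1.15 points are labeled $\rho = 2.4$ and $\rho = 6$]
$2.4 < \rho < 6$ gives delay within 15% of optimal
– We can be sloppy!
– I like $\rho = 4$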
Example, Revisited
Ben Bitdiddle is the memory designer for the Motoroil 68W86,
an embedded automotive processor. Help Ben design the
decoder for a register file.
[Figure: 4:16 decoder driven by true and complementary address inputs A[3:0], selecting one of 16 words in a register file 32 bits wide]
Decoder specifications:
– 16 word register file
– Each word is 32 bits wide
– Each bit presents load of 3 unit-sized transistors
– True and complementary address inputs A[3:0]
– Each input may drive 10 unit-sized transistors
Ben needs to decide:
– How many stages to use?
– How large should each gate be?
– How fast can decoder operate?
Number of Stages
Decoder effort is mainly electrical and branching
Electrical Effort:
H = (32*3) / 10 = 9.6
Branching Effort:
B=8
If we neglect logical effort (assume G = 1)
Path Effort:
F = GBH = 76.8
Number of Stages:
N = $\log_4 F$ = 3.1
Try a 3-stage design
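The stage-count estimate is a few lines of arithmetic. The sketch below simply restates the slide's numbers; B = 8 is taken as given, and the variable names are chosen for this example.

```python
import math

bits_per_word = 32
load_per_bit = 3        # unit-sized transistors per bit
input_drive = 10        # unit-sized transistors each address input may drive

H = bits_per_word * load_per_bit / input_drive   # electrical effort
B = 8                                            # branching effort, as stated
G = 1                                            # logical effort neglected

F = G * B * H
N_estimate = math.log(F, 4)                      # stages at a stage effort of 4

print(f"H = {H}, F = {F:.1f}, N = {N_estimate:.1f}")
```

Rounding the estimate to the nearest integer gives the 3-stage starting point.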
Gate Sizes & Delay
Logical Effort: G = 1 * 6/3 * 1 = 2
Path Effort: F = GBH = 154
Stage Effort: $\hat{f} = F^{1/3} = 5.36$
Path Delay: $D = 3\hat{f} + 1 + 4 + 1 = 22.1$
Gate sizes: z = 96*1/5.36 = 18, y = 18*2/5.36 = 6.7
[Figure: 3-stage decoder. True and complementary address inputs A[3:0], each with input capacitance 10, feed gates of size y followed by gates of size z, which drive word[0] through word[15]; each wordline presents 96 units of capacitance]
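The 3-stage decoder numbers can be reproduced, including the final check against the input capacitance spec. This is a sketch assuming the INV-NAND4-INV structure implied by G = 1 * 6/3 * 1, with the first stage driving 8 gates of size y; it uses the unrounded stage effort, so z comes out slightly below the slide's rounded 18.

```python
G = 1 * (6 / 3) * 1      # inverter, NAND4, inverter
B = 8
H = (32 * 3) / 10
F = G * B * H

f_hat = F ** (1 / 3)
D = 3 * f_hat + (1 + 4 + 1)      # parasitic delays of INV, NAND4, INV

z = 96 * 1 / f_hat               # final inverter drives the 96-unit wordline
y = z * 2 / f_hat                # NAND4 (g = 2) drives the final inverter
c_in = 1 * (B * y) / f_hat       # first inverter drives 8 NAND4 gates

print(f"F = {F:.1f}, f_hat = {f_hat:.2f}, D = {D:.1f}")
print(f"z = {z:.1f}, y = {y:.1f}, input capacitance = {c_in:.1f}")
```

The computed input capacitance equals the 10 units each address input may drive, which confirms the sizing.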
Comparison
Compare many alternatives with a spreadsheet
$D = N (76.8\,G)^{1/N} + P$
Design                        N    G      P    D
NOR4                          1    3      4    234
NAND4-INV                     2    2      5    29.8
NAND2-NOR2                    2    20/9   4    30.1
INV-NAND4-INV                 3    2      6    22.1
NAND4-INV-INV-INV             4    2      7    21.1
NAND2-NOR2-INV-INV            4    20/9   6    20.5
NAND2-INV-NAND2-INV           4    16/9   6    19.7
INV-NAND2-INV-NAND2-INV       5    16/9   7    20.4
NAND2-INV-NAND2-INV-INV-INV   6    16/9   8    21.6
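The spreadsheet comparison translates directly into a short script. The sketch below evaluates the delay formula for each design in the table; the list layout and function name are choices made for this example.

```python
# (design, number of stages N, path logical effort G, path parasitic delay P)
DESIGNS = [
    ("NOR4", 1, 3, 4),
    ("NAND4-INV", 2, 2, 5),
    ("NAND2-NOR2", 2, 20 / 9, 4),
    ("INV-NAND4-INV", 3, 2, 6),
    ("NAND4-INV-INV-INV", 4, 2, 7),
    ("NAND2-NOR2-INV-INV", 4, 20 / 9, 6),
    ("NAND2-INV-NAND2-INV", 4, 16 / 9, 6),
    ("INV-NAND2-INV-NAND2-INV", 5, 16 / 9, 7),
    ("NAND2-INV-NAND2-INV-INV-INV", 6, 16 / 9, 8),
]


def decoder_delay(N: int, G: float, P: float, BH: float = 76.8) -> float:
    """D = N * (BH * G)**(1/N) + P, with BH = 76.8 for this decoder."""
    return N * (BH * G) ** (1.0 / N) + P


for name, N, G, P in DESIGNS:
    print(f"{name:<28} D = {decoder_delay(N, G, P):.1f}")

fastest = min(DESIGNS, key=lambda d: decoder_delay(d[1], d[2], d[3]))[0]
print("fastest:", fastest)
```

The 4-stage NAND2-INV-NAND2-INV design comes out fastest, and every design between 3 and 6 stages lands within a few units of it.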
Review of Definitions
Term               Stage                                                                        Path
number of stages   1                                                                            N
logical effort     g                                                                            $G = \prod g_i$
electrical effort  $h = C_{out} / C_{in}$                                                       $H = C_{out\text{-}path} / C_{in\text{-}path}$
branching effort   $b = (C_{on\text{-}path} + C_{off\text{-}path}) / C_{on\text{-}path}$        $B = \prod b_i$
effort             $f = gh$                                                                     $F = GBH$
effort delay       f                                                                            $D_F = \sum f_i$
parasitic delay    p                                                                            $P = \sum p_i$
delay              $d = f + p$                                                                  $D = \sum d_i = D_F + P$
Method of Logical Effort
1) Compute path effort: $F = GBH$
2) Estimate best number of stages: $N = \log_4 F$
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
Some Key Insights
Allows one to compare alternative topologies quickly
NAND structures are usually faster than NOR
Paths are faster when the effort delays of each
stage are about the same and their value is around
four.
Path delay is insensitive to modest deviations from
optimum. Stage efforts of 2.4 – 6 give only 15%
deviation.
Using stage efforts greater than four reduces area
and power at a slight cost in speed.
Fewer stages or larger gates generally do not help.
Some Key Insights
The delay of a well-designed path is about $\log_4 F$
FO4 inverter delays. Each quadrupling of the load
adds about one FO4 inverter delay to the path.
The logical effort of each input of a gate increases, through no
fault of its own, as the number of inputs grows. A practical
maximum is about 4 series transistors.
Inverters or 2 input NAND gates are best for driving
nodes with a large branching effort.
When a path forks and one leg is more critical than
the others, buffer the noncritical path.
Some Key Insights
In paths that branch, each fork should contribute
equal delay.
The stage efforts f can be made equal for each
stage, but load capacitance can change.
To minimize delay, upsize gates on nodes with large
fixed capacitances to reduce the effort borne by the
gate, while only slightly increasing the effort borne
by the predecessor.
A standard cell library offers a discrete set of gates.
Limits of Logical Effort
Chicken and egg problem
– Need path to compute G
– But don’t know number of stages without G
Simplistic delay model
– Neglects input rise time effects
Interconnect
– Iteration required in designs with wire
Maximum speed only
– Not minimum area/power for constrained delay
Delay Models
Slope-based linear model.
Nonlinear delay model (table look-up)
Current-source model
Pitfalls and Fallacies
Defining gate delay for an unloaded gate
Trying to increase speed by increasing the size of
transistors in a path.
– Especially if someone else is designing the
previous gate
Trying to increase speed by using as few stages as
possible.
Historical Perspective
Summary
Logical effort is useful for thinking of delay in circuits
– Numeric logical effort characterizes gates
– NANDs are faster than NORs in CMOS
– Paths are fastest when effort delays are ~4
– Path delay is weakly sensitive to stages, sizes
– But using fewer stages doesn’t mean faster paths
– Delay of path is about $\log_4 F$ FO4 inverter delays
– Inverters and NAND2 best for driving large caps
Provides language for discussing fast circuits
– But requires practice to master