CSE 477. VLSI Systems Design - Department of Electrical
Advanced Digital Integrated Circuits
Static CMOS Logic
CMOS Inverter:
A First Look
[Figure: CMOS inverter schematic with input Vin, output Vout, supply VDD, and load capacitance CL]
CMOS Inverter:
Steady State Response
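VOL = 0
VOH = VDD
VM = f(Rn, Rp)
[Figure: steady-state switch model. With Vin = 0 the output is tied to VDD through Rp (Vout = 1); with Vin = VDD the output is tied to GND through Rn (Vout = 0)]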
CMOS Properties
Full rail-to-rail swing, giving high noise margins
Logic levels do not depend on the relative device sizes, so transistors can be minimum size (ratioless)
Always a path to VDD or GND in steady state, giving low output impedance (output resistance in the kΩ range) and large fan-out (albeit with degraded performance)
Extremely high input resistance (the gate of a MOS transistor is a near-perfect insulator), so nearly zero steady-state input current
No direct path between power and ground in steady state, so no static power dissipation
Propagation delay is a function of load capacitance and resistance of the transistors
Review: Short Channel I-V Plot (NMOS)
[Plot: drain current (axis scaled by 10^-4, 0 to 2.5) versus VDS (V), 0 to 2.5, with curves for VGS = 1.0V, 1.5V, 2.0V, and 2.5V]
NMOS transistor, 0.25um, Ld = 0.25um, W/L = 1.5, VDD = 2.5V, VT = 0.4V
Review: Short Channel I-V Plot (PMOS)
All polarities of all voltages and currents are reversed
[Plot: drain current (axis scaled by 10^-4, 0 to -1) versus VDS (V), -2 to 0, with curves for VGS = -1.0V, -1.5V, -2.0V, and -2.5V]
PMOS transistor, 0.25um, Ld = 0.25um, W/L = 1.5, VDD = 2.5V, VT = -0.4V
Transforming PMOS I-V Lines
Want common coordinate set Vin, Vout, and IDn
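IDSp = -IDSn
VGSn = Vin ; VGSp = Vin - VDD
VDSn = Vout ; VDSp = Vout - VDD
Step 1: mirror the PMOS curves around the x-axis (IDn = -IDp) and relabel them with Vin = VDD + VGSp
Step 2: shift horizontally over VDD (Vout = VDD + VDSp)
[Figure: PMOS curves for VGSp = -1 and VGSp = -2.5, relabeled as Vin = 1.5 and Vin = 0, replotted as IDn versus Vout]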
CMOS Inverter Load Lines
[Plot: IDn (axis scaled by 10^-4, 0 to 2.5) versus Vout (V), 0 to 2.5; NMOS and PMOS load lines for Vin = 0V, 0.5V, 1.0V, 1.5V, 2.0V, and 2.5V]
0.25um, W/Ln = 1.5, W/Lp = 4.5, VDD = 2.5V, VTn = 0.4V, VTp = -0.4V
CMOS Inverter VTC
[Plot: Vout (V) versus Vin (V), both 0 to 2.5. Operating regions along the curve as Vin rises:]
NMOS off, PMOS res
NMOS sat, PMOS res
NMOS sat, PMOS sat
NMOS res, PMOS sat
NMOS res, PMOS off
CMOS Inverter:
Switch Model of Dynamic Behavior
[Figure: switch model. With Vin = 0, CL charges from VDD through Rp; with Vin = VDD, CL discharges to GND through Rn]
Gate response time is determined by the time to charge CL through Rp (or discharge CL through Rn)
Simulated Inverter VM
VM is relatively insensitive to variations in device ratio
  setting the ratio to 3, 2.5 and 2 gives VM's of 1.22V, 1.18V, and 1.13V
Increasing the width of the PMOS moves VM towards VDD
Increasing the width of the NMOS moves VM toward GND
[Plot: VM (V), 0.8 to 1.5, versus (W/L)p/(W/L)n, 0.1 to 10, with ~3.4 marked. Note: x-axis is semilog]
Noise Margins Determining VIH and VIL
By definition, VIH and VIL are where dVout/dVin = -1 (= gain)
NMH = VDD - VIH
NML = VIL - GND
Approximating (a piece-wise linear approximation of the VTC, with g the gain at VM):
VIH = VM - VM /g
VIL = VM + (VDD - VM )/g
So high gain in the transition region is very desirable
[Plot: piece-wise linear VTC, Vout versus Vin, marking VOH = VDD, VOL = GND, VM, VIL, and VIH]
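The piece-wise linear approximation above can be checked numerically. The sketch below is illustrative only: the function name `noise_margins` is hypothetical, GND is taken as 0 V, and the example call uses the VM and g values quoted on the simulation slide.

```python
def noise_margins(vdd, vm, gain):
    """Piece-wise linear estimate of VIH, VIL, NMH, and NML.

    gain is the (negative) slope g of the VTC at VM.
    """
    vih = vm - vm / gain            # VIH = VM - VM/g
    vil = vm + (vdd - vm) / gain    # VIL = VM + (VDD - VM)/g
    nmh = vdd - vih                 # NMH = VDD - VIH
    nml = vil - 0.0                 # NML = VIL - GND, with GND = 0 V
    return vih, vil, nmh, nml


vih, vil, nmh, nml = noise_margins(vdd=2.5, vm=1.25, gain=-27.5)
print(f"VIH={vih:.2f} V, VIL={vil:.2f} V, NMH={nmh:.2f} V, NML={nml:.2f} V")
```

Because g is negative, VIH lands above VM and VIL below it; a larger |g| pulls both toward VM and widens both noise margins.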
CMOS Inverter VTC from Simulation
0.25um, (W/L)p/(W/L)n = 3.4
(W/L)n = 1.5 (min size)
VDD = 2.5V
VM ≈ 1.25V, g = -27.5
VIL = 1.2V, VIH = 1.3V
NML = NMH = 1.2V
(actual values are VIL = 1.03V, VIH = 1.45V, NML = 1.03V & NMH = 1.05V)
Output resistance
  low-output = 2.4kΩ
  high-output = 3.3kΩ
[Plot: simulated Vout (V) versus Vin (V), both 0 to 2.5]
Combinatorial vs. Sequential
Combinatorial Circuit
Output depends only on the current inputs
No memory effect
Operational/Computational structures
Time does not matter
Sequential Circuit
Memory based
Time matters
Used to keep the state of the circuit
CMOS Circuit Styles
Static complementary CMOS - except during switching,
output connected to either VDD or GND via a low-resistance path
high noise margins
- full rail to rail swing
- VOH and VOL are at VDD and GND, respectively
low output impedance, high input impedance
no steady state path between VDD and GND (no static power
consumption)
delay a function of load capacitance and transistor resistance
comparable rise and fall times (under the appropriate transistor
sizing conditions)
Dynamic CMOS - relies on temporary storage of signal
values on the capacitance of high-impedance circuit
nodes
simpler, faster gates
increased sensitivity to noise
Static Complementary CMOS
Pull-up network (PUN) and pull-down network (PDN)
[Figure: inputs In1, In2, … InN drive both a PUN (PMOS transistors only) between VDD and the output F(In1,In2,…InN), and a PDN (NMOS transistors only) between the output and GND]
pull-up: make a connection from VDD to F when F(In1,In2,…InN) = 1
pull-down: make a connection from F to GND when F(In1,In2,…InN) = 0
PUN and PDN are dual logic networks
One and only one of the networks conducts in steady state (DC)
Threshold Drops
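[Figure: four single-transistor cases driving CL]
PUN, PMOS (source at VDD): output charges 0 -> VDD
PUN, NMOS (drain at VDD): output charges only 0 -> VDD - VTn
PDN, NMOS (source at GND): output discharges VDD -> 0
PDN, PMOS (drain at GND): output discharges only VDD -> |VTp|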
Construction of PDN
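NMOS devices in series implement a NAND function
  [Figure: NMOS transistors A and B in series; the path conducts when A•B]
NMOS devices in parallel implement a NOR function
  [Figure: NMOS transistors A and B in parallel; the path conducts when A+B]
An on NMOS creates a zero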
Dual PUN and PDN
PUN and PDN are dual networks
DeMorgan’s theorems
!(A + B) = !A • !B   [or !(A | B) = !A & !B]
!(A • B) = !A + !B   [or !(A & B) = !A | !B]
a parallel connection of transistors in the PUN corresponds to a
series connection of the PDN
Complementary gate is naturally inverting (NAND,
NOR, AOI, OAI)
Number of transistors for an N-input logic gate is 2N
CMOS NAND
A  B  F
0  0  1
0  1  1
1  0  1
1  1  0
[Figure: NAND2 schematic, F = !(A•B): PMOS A and B in parallel in the PUN, NMOS A and B in series in the PDN]
CMOS NOR
A  B  F
0  0  1
0  1  0
1  0  0
1  1  0
[Figure: NOR2 schematic, F = !(A+B): PMOS A and B in series in the PUN, NMOS A and B in parallel in the PDN]
Complex CMOS Gate
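OUT = !(D + A • (B + C))
[Figure: complex gate. PDN: D in parallel with the series connection of A and (B parallel C). PUN: D in series with the parallel connection of A and (B in series with C)]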
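The duality of the two networks for this gate can be checked exhaustively. The sketch below is a behavioral model, not a circuit simulation; the function names `pdn_conducts` and `pun_conducts` are hypothetical, and each transistor is treated as an ideal switch.

```python
from itertools import product


def pdn_conducts(a, b, c, d):
    # Pull-down network for OUT = !(D + A*(B + C)):
    # NMOS D in parallel with (A in series with (B parallel C)).
    return d or (a and (b or c))


def pun_conducts(a, b, c, d):
    # Dual pull-up network: PMOS devices turn on for a low input,
    # and series/parallel connections are swapped relative to the PDN.
    na, nb, nc, nd = (not a), (not b), (not c), (not d)
    return nd and (na or (nb and nc))


for a, b, c, d in product([False, True], repeat=4):
    down = pdn_conducts(a, b, c, d)
    up = pun_conducts(a, b, c, d)
    assert down != up  # exactly one network conducts in steady state
    assert up == (not (d or (a and (b or c))))  # OUT is high when the PUN conducts

print("PUN and PDN are complementary for all 16 input combinations")
```

The first assertion is the "one and only one network conducts" property; the second confirms that the gate is inverting.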
Standard Cell Layout Methodology
Routing
channel
VDD
signals
GND
What logic function is this?
Cell Design
Standard Cells
General purpose logic
Can be synthesized
Same height, varying width
Datapath Cells
For regular, structured designs (arithmetic)
Includes some wiring in the cell
Fixed height and width
Standard Cell Layout Methodology – 1980s
Routing
channel
VDD
signals
GND
Standard Cell Layout Methodology – 1990s
Mirrored Cell
No Routing
channels
VDD
VDD
M2
M3
GND
Mirrored Cell
GND
Standard Cells
N Well
VDD
Cell height 12 metal tracks
Metal track is approx. 3λ + 3λ
Pitch =
repetitive distance between objects
Cell height is “12 pitch”
2
Cell boundary
In
Out
GND
Rails ~10λ
Standard Cells
With minimal
diffusion
routing
VDD
VDD
VDD
M2
In
Out
In
Out
In
Out
M1
GND
GND
Standard Cells
Proprietary cells: Artisan, Virage, etc.
VDD
2-input NAND gate
VDD
B
A
B
Out
A
GND
Multi-Fingered Transistors
One finger
Two fingers (folded)
Less diffusion capacitance
XNOR/XOR Implementation
[Figure: static CMOS XNOR and XOR implementations with inputs A and B]
How many transistors in each?
Can you create the stick transistor layout for the lower left circuit?
VTC is Data-Dependent
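[Figure: NAND2, F = !(A • B), with 0.5/0.25 NMOS and 0.75/0.25 PMOS devices. PMOS M3 and M4 form the PUN; NMOS M1 (VGS1 = VB) sits below NMOS M2 (VGS2 = VA – VDS1), with the internal node capacitance Cint between them]
[Plot: VTC, both axes 0 to 3, for three input cases: A,B: 0 -> 1; B=1, A: 0 -> 1; A=1, B: 0 -> 1; annotation "weaker PUN"]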
The threshold voltage of M2 is higher than that of M1 due to the body effect (γ)
$V_{Tn1} = V_{Tn0}$
$V_{Tn2} = V_{Tn0} + \gamma\left(\sqrt{|2\phi_F| + V_{int}} - \sqrt{|2\phi_F|}\right)$
since VSB of M2 is not zero (when VB = 0) due to the presence of Cint
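To get a feel for the size of the body-effect shift, the expression for VTn2 can be evaluated directly. This is a minimal sketch: the function name `vt_with_body_effect` is hypothetical, and the numbers for gamma, |2φF|, and Vint are assumed for illustration rather than taken from the slides.

```python
import math


def vt_with_body_effect(vt0, gamma, two_phi_f, v_sb):
    """Threshold voltage with source-to-body bias v_sb (body effect)."""
    base = abs(two_phi_f)
    return vt0 + gamma * (math.sqrt(base + v_sb) - math.sqrt(base))


# Assumed, illustrative parameters (not from the slides except VTn0 = 0.4 V).
VT0 = 0.4        # V, zero-bias threshold
GAMMA = 0.4      # V^0.5, body-effect coefficient (assumption)
TWO_PHI_F = 0.6  # V, magnitude of 2*phi_F (assumption)

vt_m1 = vt_with_body_effect(VT0, GAMMA, TWO_PHI_F, 0.0)  # M1: source at GND
vt_m2 = vt_with_body_effect(VT0, GAMMA, TWO_PHI_F, 0.5)  # M2: Vint = 0.5 V (assumption)
print(f"VTn1 = {vt_m1:.3f} V, VTn2 = {vt_m2:.3f} V")
```

With a zero source-to-body voltage the function returns VTn0 unchanged, which is the M1 case; any positive Vint raises the threshold of M2.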
Static CMOS Full Adder Circuit
!Cout = !Cin & (!A | !B) | (!A & !B)
!Sum = Cout & (!A | !B | !Cin) | (!A & !B & !Cin)
[Figure: static CMOS full adder schematic with inputs A, B, and Cin; the first stage generates !Cout and the second stage generates !Sum]
Cout = Cin & (A | B) | (A & B)
Sum = !Cout & (A | B | Cin) | (A & B & Cin)
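The Cout and Sum expressions above can be verified against ordinary binary addition. The sketch below is a logic-level check only; the function name `full_adder` is hypothetical.

```python
from itertools import product


def full_adder(a, b, cin):
    """Evaluate the slide's Cout and Sum expressions on 0/1 inputs."""
    cout = (cin & (a | b)) | (a & b)
    not_cout = 1 - cout
    total = (not_cout & (a | b | cin)) | (a & b & cin)
    return total, cout


for a, b, cin in product((0, 1), repeat=3):
    s, cout = full_adder(a, b, cin)
    # The two output bits must encode the arithmetic sum of the inputs.
    assert 2 * cout + s == a + b + cin

print("Sum and Cout match a + b + cin for all 8 input combinations")
```

Note that Sum reuses !Cout, which is why the circuit generates the carry first and feeds it into the sum stage.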
Switch Delay Model
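[Figure: switch-level RC models of INV, NAND2, and NOR2. Each transistor is replaced by a switch with equivalent resistance Req (Rp or Rn); CL is the output load and Cint the internal node capacitance]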
Input Pattern Effects on Delay
Delay is dependent on the pattern of inputs
[Figure: NAND2 switch model with two parallel Rp, two series Rn, load CL, and internal node capacitance Cint]
Low to high transition
  both inputs go low
  - delay is 0.69 (Rp/2) CL
  one input goes low
  - delay is 0.69 Rp CL
High to low transition
  both inputs go high
  - delay is 0.69 (2Rn) CL
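The three first-order cases above reduce to one RC product each. The sketch below evaluates them for assumed values: Rp and Rn are borrowed from the inverter output resistances quoted earlier purely as placeholders, CL = 100 fF matches the simulation setup, and Cint is ignored. The names `rc_delay` and `delays` are hypothetical.

```python
def rc_delay(r_ohm, c_farad):
    """First-order propagation delay of an RC stage: 0.69 * R * C."""
    return 0.69 * r_ohm * c_farad


# Assumed, illustrative values.
RP = 3.3e3     # ohms, single PMOS equivalent resistance (assumption)
RN = 2.4e3     # ohms, single NMOS equivalent resistance (assumption)
CL = 100e-15   # farads

delays = {
    "tpLH, both inputs go low": rc_delay(RP / 2, CL),   # two PMOS in parallel
    "tpLH, one input goes low": rc_delay(RP, CL),       # one PMOS pulls up
    "tpHL, both inputs go high": rc_delay(2 * RN, CL),  # two NMOS in series
}
for case, t in delays.items():
    print(f"{case}: {t * 1e12:.1f} ps")
```

This model explains the ordering of the pull-up cases but not the difference between the two single-input cases in simulation, which comes from Cint and the body effect.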
Delay Dependence on Input Patterns
[Plot: output voltage (V), -0.5 to 3, versus time (ps), 0 to 400, for A=B=1->0; A=1->0, B=1; and A=1, B=1->0]
Input Data Pattern    Delay (psec)
A=B=0->1              67
A=1, B=0->1           64
A=0->1, B=1           61
A=B=1->0              45
A=1, B=1->0           80
A=1->0, B=1           81
NMOS = 0.5um/0.25um
PMOS = 0.75um/0.25um
CL = 100 fF
Transistor Sizing
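[Figure: NAND2 and NOR2 switch models annotated with relative device sizes]
NAND2: PMOS (parallel) sized 2 each, NMOS (series) sized 2 each
NOR2: PMOS (series) sized 4 each, NMOS (parallel) sized 1 each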
Transistor Sizing a Complex CMOS Gate
A
B
8 6
C
8 6
4 3
D
4 6
OUT = !(D + A • (B + C))
A
D
2
1
B
2C
2
Fan-In Considerations
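[Figure: 4-input NAND with inputs A, B, C, D; the series NMOS chain has internal node capacitances C1, C2, C3 and output load CL]
Distributed RC model (Elmore delay)
tpHL = 0.69 Reqn(C1+2C2+3C3+4CL)
Propagation delay deteriorates rapidly as a function of fan-in, quadratically in the worst case.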
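The quadratic growth follows from the weights 1, 2, 3, 4 in the tpHL expression: with N inputs they sum to N(N+1)/2. The sketch below generalizes the formula to an N-input NAND; the function name `nand_tphl` and all component values are assumptions made for illustration.

```python
def nand_tphl(req_n, internal_caps, c_load):
    """First-order tpHL of an N-input NAND pull-down chain (Elmore model).

    internal_caps lists C1, C2, ... starting at the node nearest GND;
    node k discharges through k series transistors of resistance req_n.
    """
    caps = list(internal_caps) + [c_load]
    return 0.69 * req_n * sum(k * c for k, c in enumerate(caps, start=1))


# Assumed values for illustration only.
REQ_N = 2.4e3    # ohms
C_INT = 2e-15    # farads per internal node
C_LOAD = 10e-15  # farads

for fan_in in (2, 4, 8):
    t = nand_tphl(REQ_N, [C_INT] * (fan_in - 1), C_LOAD)
    print(f"fan-in {fan_in}: tpHL = {t * 1e12:.1f} ps")
```

For fan-in 4 the weighted sum is C1 + 2C2 + 3C3 + 4CL, matching the expression on the slide.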
tp as a Function of Fan-In
[Plot: tp (psec), 0 to 1250, versus fan-in, 2 to 16; tpHL grows quadratically, tpLH grows linearly]
Gates with a fan-in greater than 4 should be avoided.
tp as a Function of Fan-Out
[Plot: tp (psec) versus eff. fan-out, 2 to 16, for tpNOR2, tpNAND2, and tpINV]
All gates have the same drive current.
Slope is a function of "driving strength"
tp as a Function of Fan-In and Fan-Out
Fan-in: quadratic due to increasing resistance and capacitance
Fan-out: each additional fan-out gate adds two gate capacitances to CL
tp = a1 FI + a2 FI^2 + a3 FO
RC Tree Definitions
RC tree characteristics
- A unique resistive path exists between the source node and any node of the network
- Single input (source) node, s
- All capacitors are between a node and GND
- No resistive loops
[Figure: RC tree with source node s, resistors r1 to r4 and ri, capacitors c1 to c4 and ci, and nodes 1 to 4 and i]
Path resistance (sum of the resistances on the path from the input node to node i)
$r_{ii} = \sum_{j=1}^{N} r_j, \quad r_j \in \mathrm{path}(s \to i)$
Shared path resistance (resistance shared along the paths from the input node to nodes i and k)
$r_{ik} = \sum_{j=1}^{N} r_j, \quad r_j \in \mathrm{path}(s \to i) \cap \mathrm{path}(s \to k)$
A typical wire is a chain network with (simplified) Elmore delay of
$D_N = \sum_{i=1}^{N} c_i r_{ii}$
Chain Network Elmore Delay
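[Figure: RC chain driven by Vin, with series resistors r1, r2, …, rN and node capacitances c1, c2, …, cN; VN is the voltage at node N]
D1 = c1r1
D2 = c1r1 + c2(r1+r2)
Di = c1r1 + c2(r1+r2) + … + ci(r1+r2+…+ri)
Elmore delay equation
$D_N = \sum_{i=1}^{N} c_i r_{ii} = \sum_{i=1}^{N} c_i \sum_{j=1}^{i} r_j$
Elmore delay in NAND4 (all resistances equal to req)
Di = c1req + 2c2req + 3c3req + … + icireq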
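The chain-network sum can be computed with a single running total of path resistance. This is a minimal sketch under the chain assumptions stated above (no branches, every capacitor to GND); the function name `elmore_chain` is hypothetical.

```python
def elmore_chain(resistances, capacitances):
    """Elmore delay at the last node of an RC chain.

    resistances[i] is the resistor feeding node i+1 and capacitances[i]
    is the capacitor at that node, both ordered from the source outward.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitor per resistor")
    delay = 0.0
    path_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        path_resistance += r          # r_ii = r1 + r2 + ... + ri
        delay += c * path_resistance  # add the c_i * r_ii term
    return delay


# Two-node check against D2 = c1*r1 + c2*(r1 + r2).
print(elmore_chain([1.0, 2.0], [3.0, 4.0]))
```

With equal resistances the result collapses to req (c1 + 2 c2 + 3 c3 + …), which is the form used for the series pull-down chain of a NAND gate.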
Fast Complex Gates: Design Technique 1
Transistor sizing
as long as fan-out capacitance dominates
Progressive sizing
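[Figure: series NMOS chain M1 (input In1, nearest GND) up to MN (input InN, nearest the output and CL), with internal node capacitances C1, C2, C3; a distributed RC line]
M1 > M2 > M3 > … > MN (the FET closest to the output is the smallest)
Can reduce delay by more than 20%; decreasing gains as technology shrinks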
Fast Complex Gates: Design Technique 2
Transistor ordering
Case 1: the critical-path input In1 (0 -> 1) drives M1, the transistor closest to GND; In2 = In3 = 1
  CL, C2, and C1 are all charged
  delay determined by time to discharge CL, C1 and C2
Case 2: the critical-path input In1 (0 -> 1) drives M3, the transistor closest to the output; In2 = In3 = 1
  CL is charged; C2 and C1 are already discharged
  delay determined by time to discharge CL
Sizing and Ordering Effects
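[Figure: 4-input NAND with CL = 100 fF and internal nodes C1, C2, C3; PMOS devices A to D sized 3; NMOS chain A, B, C, D labeled 4/4, 4/5, 4/6, 4/7 (uniform/progressive sizing)]
Progressive sizing in pull-down chain gives up to a 23% improvement.
Input ordering saves 5%
  critical path A: 23%
  critical path D: 17%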
Fast Complex Gates:
Design Technique 3
Alternative logic structures
F = ABCDEFGH
Fast Complex Gates: Design Technique 4
Isolating fan-in from fan-out using buffer insertion
CL
CL
Symmetric Gates
Inputs can be made perfectly symmetric
2
2
A
1
1
B
1
1
Y
More Combinatorial Circuits
There are two other important combinatorial circuit
families
Transmission Gates Logic
Dynamic Logic
Both of these families are useful in the design of high
performance circuits
NMOS Transistors in Series/Parallel
Primary inputs drive both gate and source/drain
terminals
NMOS switch closes when the gate input is high
A
B
X
Y
X = Y if A and B
A
X
B
X = Y if A or B
Y
Remember - NMOS transistors pass a strong 0 but a
weak 1
PMOS Transistors in Series/Parallel
Primary inputs drive both gate and source/drain
terminals
PMOS switch closes when the gate input is low
A
B
X
Y
X = Y if !A and !B = !(A + B)
A
X
B
X = Y if !A or !B = !(A • B)
Y
Remember - PMOS transistors pass a strong 1 but a
weak 0
Pass Transistor (PT) Logic
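[Figure: pass-transistor AND gate, F = A • B. Input A passes to F through an NMOS gated by B, and 0 passes to F through an NMOS gated by !B]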
Gate is static – a low-impedance path exists to both
supply rails under all circumstances
N transistors instead of 2N
No static power consumption
Ratioless
Bidirectional (versus unidirectional)
VTC of PT AND Gate
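[Figure: PT AND gate, F = A • B, built from 0.5/0.25 NMOS pass transistors, with a 1.5/0.25 and 0.5/0.25 inverter generating !B]
[Plot: Vout (V) versus input voltage, both 0 to 2, for B=VDD, A=0->VDD; A=VDD, B=0->VDD; and A=B=0->VDD]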
Pure PT logic is not regenerative - the signal
gradually degrades after passing through a number
of PTs (can fix with static CMOS inverter insertion)
Differential PT Logic (CPL)
[Figure: CPL structure. Inputs A, !A, B, !B feed a PT network producing F and an inverse PT network producing !F. Example gates: AND/NAND (F = A • B, !F = !(A • B)), OR/NOR (F = A + B, !F = !(A + B)), and XOR/XNOR]
CPL Properties
Differential so complementary data inputs and outputs
are always available (so don’t need extra inverters)
Still static, since the output defining nodes are always
tied to VDD or GND through a low resistance path
Design is modular; all gates use the same topology, only
the inputs are permuted.
Simple XOR makes it attractive for structures like adders
Fast (assuming number of transistors in series is small)
Additional routing overhead for complementary signals
Still have static power dissipation problems
Solution 3: Transmission Gates (TGs)
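Most widely used solution
[Figure: transmission gate and its symbol: an NMOS (gate C) in parallel with a PMOS (gate !C) between A and B. With C = VDD and !C = GND, both A = VDD and A = GND are passed to B]
Full swing bidirectional switch controlled by the gate signal C, A = B if C = 1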
Resistance of TG
[Plot: resistance (kΩ), 0 to 30, versus Vout (V), 0 to 2, with curves for Rn, Rp, and Req; W/Ln = 0.50/0.25, W/Lp = 0.50/0.25; NMOS gate at 2.5V, PMOS gate at 0V, input at 2.5V]
TG Multiplexer
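[Figure: TG multiplexer. Select signals S and !S control two transmission gates that pass In1 or In2 to an output inverter producing F; layout shown between VDD and GND]
F = !(In1 • S + In2 • !S)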
Dynamic CMOS
In static circuits at every point in time (except when
switching) the output is connected to either GND or VDD
via a low resistance path.
fan-in of N requires 2N devices
Dynamic circuits rely on the temporary storage of signal
values on the capacitance of high impedance nodes.
requires only N + 2 transistors
takes a sequence of precharge and conditional evaluation
phases to realize logic functions
Dynamic Gate
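[Figure: dynamic gate. CLK drives the PMOS precharge transistor Mp (to VDD) and the NMOS evaluation transistor Me (to GND); inputs In1, In2, In3 drive the PDN between Out and Me; CL loads Out. Example: a PDN with A in series with B, in parallel with C, gives Out = !((A&B)|C)]
Two phase operation
  Precharge (CLK = 0): Mp on, Me off, Out charged to 1
  Evaluate (CLK = 1): Mp off, Me on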
Conditions on Output
Once the output of a dynamic gate is discharged, it
cannot be charged again until the next precharge
operation.
Inputs to the gate can make at most one transition during
evaluation.
Output can be in the high impedance state during and
after evaluation (PDN off), state is stored on CL
Properties of Dynamic Gates
Logic function is implemented by the PDN only
number of transistors is N + 2 (versus 2N for static
complementary CMOS)
should be smaller in area than static complementary CMOS
Full swing outputs (VOL = GND and VOH = VDD)
Nonratioed - sizing of the devices is not important for
proper functioning (only for performance)
Faster switching speeds
reduced load capacitance due to lower number of transistors per
gate (Cint) so a reduced logical effort
reduced load capacitance due to smaller fan-out (Cext)
no Isc, so all the current provided by PDN goes into discharging CL
Ignoring the influence of precharge time on the switching speed of
the gate, tpLH = 0 but the presence of the evaluation transistor
slows down the tpHL
Properties of Dynamic Gates, con’t
Power dissipation should be better
But power dissipation can be significantly higher due to
higher transition probabilities
extra load on CLK
PDN starts to work as soon as the input signals exceed
VTn, so set VM, VIH and VIL all equal to VTn
consumes only dynamic power – no short circuit power
consumption since the pull-up path is not on when evaluating
lower CL - both Cint (since there are fewer transistors connected to
the drain output) and Cext (since the output load is one per
connected gate, not two)
by construction can have at most one transition per cycle – no
glitching
low noise margin (NML)
Needs a precharge clock
Dynamic Behavior
[Plot: CLK, In & CLK, and Out voltages (V), -0.5 to 2.5, versus time (ns), 0 to 1, for a dynamic gate with inputs In1 to In4, showing the precharge and evaluate phases]
#Trns  VOH   VOL  VM   NMH      NML  tpHL   tpLH  tp
6      2.5V  0V   VTn  2.5-VTn  VTn  110ps  0ns   83ps
Gate Parameters are Time Dependent
The amount by which the output voltage drops is a
strong function of the input voltage and the available
evaluation time.
Noise needed to corrupt the signal has to be larger if the
evaluation time is short, i.e., the switching threshold is truly
time dependent.
[Plot: CLK and Vout (V), -0.5 to 2.5, versus time (ns), 0 to 100, for input levels VG = 0.45, 0.5, and 0.55]
Issues in Dynamic Design 1: Charge Leakage
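[Figure: dynamic gate (Mp, Me, input A = 0, load CL) with four leakage sources numbered 1 to 4; waveforms of CLK and VOut over the precharge and evaluate phases]
Leakage sources
Minimum clock rate of a few kHz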
Impact of Charge Leakage
Output settles to an intermediate voltage determined by
a resistive divider of the pull-up and pull-down networks
Once the output drops below the switching threshold of the
fan-out logic gate, the output is interpreted as a low voltage.
[Plot: CLK and Out voltages (V), -0.5 to 2.5, versus time (ms), 0 to 40]
A Solution to Charge Leakage
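Keeper compensates for the charge lost due to the pull-down leakage paths.
[Figure: dynamic gate (Mp, Me, inputs A and B, load CL) followed by a static inverter producing !Out, with a PMOS keeper Mkp driven by the inverter output]
Same approach as level restorer for pass transistor logic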
Issues in Dynamic Design 2: Charge Sharing
[Figure: dynamic gate with Mp, Me, input A, input B = 0, output load CL, and internal node capacitances Ca and Cb]
Charge stored originally on CL is redistributed (shared) over CL and Ca, leading to static power consumption by downstream gates and possible circuit malfunction.
When ΔVout = -VDD (Ca / (Ca + CL)), the drop in Vout is large enough to be below the switching threshold of the gate it drives, causing a malfunction.
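The size of the charge-sharing drop follows directly from the capacitance ratio in the expression above. The sketch below evaluates it for assumed capacitor values; the function name `charge_sharing_drop` and the numbers for Ca and CL are illustrative, not taken from the slides.

```python
def charge_sharing_drop(vdd, c_a, c_load):
    """Change in Vout when CL (precharged to VDD) shares charge with Ca.

    Assumes Ca starts fully discharged and the two nodes equalize.
    """
    return -vdd * c_a / (c_a + c_load)


# Assumed, illustrative values.
VDD = 2.5      # V
CA = 15e-15    # farads, internal node capacitance (assumption)
CL = 50e-15    # farads, output load (assumption)

delta = charge_sharing_drop(VDD, CA, CL)
print(f"delta Vout = {delta:.3f} V, Vout after sharing = {VDD + delta:.3f} V")
```

Keeping Ca small relative to CL limits the drop, which is the same ratio a designer would check against the switching threshold of the next gate.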
Solution to Charge Redistribution
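[Figure: dynamic gate (Mp, Me, inputs A and B) with an additional clock-driven PMOS transistor Mkp that precharges the internal node]
Precharge internal nodes using a clock-driven transistor (at the cost of increased area and power)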
Issues in Dynamic Design 3: Backgate Coupling
Susceptible to crosstalk due to 1) high impedance of the
output node and 2) capacitive coupling
Out2 capacitively couples with Out1 through the gate-source and
gate-drain capacitances of M4
[Figure: dynamic NAND (Mp, M1 with A = 0, M2 with B = 0, Me, output Out1 = 1 on CL1) driving a static NAND (M3 to M6, second input In, output Out2 = 0 on CL2)]
Issues in Dynamic Design 4: Clock Feedthrough
A special case of capacitive coupling between the clock
input of the precharge transistor and the dynamic output
node
[Figure: dynamic gate with Mp, inputs A and B, Me, and load CL on Out]
Coupling between Out and the CLK input of the precharge device due to the gate-drain capacitance. So the voltage of Out can rise above VDD. The fast rising (and falling) edges of the clock couple to Out.
Clock Feedthrough
[Plot: CLK, In & CLK, and Out voltages (V), -0.5 to 2.5, versus time (ns), 0 to 1, for a dynamic gate with inputs In1 to In4; clock feedthrough is marked on Out at two points]
Cascading Dynamic Gates
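[Figure: two cascaded dynamic gates, each with Mp and Me clocked by CLK; In drives the first gate and Out1 drives the second. Waveforms of CLK, In, Out1, and Out2 versus t show Out2 dropping by ΔV while Out1 is still above VTn]
Only a single 0 -> 1 transition allowed at the inputs during the evaluation period!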
Domino Logic
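[Figure: domino logic. A dynamic stage (Mp, PDN with In1 to In3, Me) drives a static inverter producing Out1, which feeds a second dynamic stage (Mp with keeper Mkp, PDN with In4 and In5, Me) followed by an inverter producing Out2. Node transitions are annotated 1->1 or 1->0 on the dynamic nodes and 0->0 or 0->1 on the inverter outputs]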
Why Domino?
[Figure: chain of domino stages clocked by CLK, each a PDN with inputs Ini and Inj, with In1 starting the chain]
Like falling dominos!