Design of VLSI Circuits and Systems
Academic Year 2007-2008
Lecture 15
Review 1
Challenges in Digital Design
“Macroscopic Issues”
• Time-to-Market
• Millions of Gates
• High-Level Abstractions
• Reuse & IP: Portability
• Predictability
• etc.
“Microscopic Problems”
• Ultra-high speed design
• Interconnect
• Noise, Crosstalk
• Reliability, Manufacturability
• Power Dissipation
• Clock distribution
Everything Looks a Little Different
…and There’s a Lot of Them!
Productivity Trends
[Figure: complexity (logic transistors per chip, in millions) and productivity (thousands of transistors per staff-month) plotted on logarithmic axes against year, 1981 to 2009]
• Complexity growth rate: 58%/yr. compounded
• Productivity growth rate: 21%/yr. compounded
Source: Sematech
Complexity outpaces design productivity
Courtesy, ITRS Roadmap
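To make the gap between the two compound growth rates concrete, here is a minimal sketch. It assumes only the two rates quoted on the slide (58%/yr. for complexity, 21%/yr. for productivity); the function names and the chosen time spans are illustrative.

```python
# Growth rates quoted on the slide; everything else here is illustrative.
COMPLEXITY_RATE = 0.58    # logic transistors per chip, per year
PRODUCTIVITY_RATE = 0.21  # transistors per staff-month, per year


def compound_growth(rate_per_year: float, years: int) -> float:
    """Growth factor after `years` of compounding at `rate_per_year`."""
    return (1.0 + rate_per_year) ** years


def design_gap(years: int) -> float:
    """Ratio of complexity growth to productivity growth after `years`."""
    return (compound_growth(COMPLEXITY_RATE, years)
            / compound_growth(PRODUCTIVITY_RATE, years))


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years:2d} years the gap factor is {design_gap(years):.1f}")
```

Under these rates, the design effort needed per chip grows by roughly a factor of 14 over ten years unless productivity improves faster.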
Design Metrics
• How to evaluate performance of a digital circuit (gate, block, …)?
– Cost
– Reliability
– Scalability
– Speed (delay, operating frequency)
– Power dissipation
– Energy to perform a function
Cost of Integrated Circuits
• NRE (non-recurring engineering) costs
– design time and effort, mask generation
– one-time cost factor
• Recurring costs
– silicon processing, packaging, test
– proportional to volume
– proportional to chip area
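The split between one-time and recurring costs can be sketched as a per-chip cost calculation. This is a minimal illustration, assuming the common model in which the NRE cost is spread evenly over the production volume; the function name and the numbers in the usage example are hypothetical.

```python
def cost_per_ic(nre_cost: float, recurring_cost_per_ic: float, volume: int) -> float:
    """Per-chip cost: recurring cost plus the NRE cost amortized over the volume.

    Assumption: the one-time NRE cost is spread evenly over all chips sold.
    """
    if volume <= 0:
        raise ValueError("volume must be positive")
    return recurring_cost_per_ic + nre_cost / volume


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    for volume in (10_000, 100_000, 1_000_000):
        print(volume, cost_per_ic(1_000_000.0, 5.0, volume))
```

The NRE term dominates at low volume and becomes negligible at high volume, where the area-dependent recurring cost takes over.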
Mapping between analog and digital signals
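[Figure: voltage transfer characteristic, Vout versus Vin. Output levels at or above V_OH and input levels at or above V_IH map to logic “1”; output levels at or below V_OL and input levels at or below V_IL map to logic “0”. V_IL and V_IH are the points where the slope of the curve equals -1; the range between them is the undefined region.]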
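The mapping from an analog input voltage to a logic value can be sketched as a three-way classification. This is a minimal sketch; the function name and the threshold values in the usage example are assumptions, not values from the slides.

```python
def logic_level(v_in: float, v_il: float, v_ih: float) -> str:
    """Classify an input voltage as "0", "1", or "undefined".

    v_il and v_ih are the input-low and input-high thresholds; voltages
    strictly between them fall in the undefined region.
    """
    if v_il >= v_ih:
        raise ValueError("v_il must be below v_ih")
    if v_in <= v_il:
        return "0"
    if v_in >= v_ih:
        return "1"
    return "undefined"


if __name__ == "__main__":
    # Hypothetical thresholds for a 2.5 V supply.
    for v in (0.2, 1.2, 2.3):
        print(v, logic_level(v, v_il=0.8, v_ih=1.7))
```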
A Modern CMOS Process
[Figure: cross-section of the process. Labeled layers and regions: gate oxide, TiSi2, AlCu, SiO2, tungsten, poly, n+ and p+ regions, p-well, n-well, p-epi, p+]
Dual-Well Trench-Isolated CMOS Process
CMOS Process at a Glance
Define active areas
Etch and fill trenches
Implant well regions
Deposit and pattern polysilicon layer
Implant source and drain regions and substrate contacts
Create contact and via windows
Deposit and pattern metal layers
Design Rules
• Interface between designer and process engineer
• Guidelines for constructing process masks
• Unit dimension: Minimum line width
– scalable design rules: lambda parameter
– absolute dimensions (micron rules)
Lecture 3
Devices and Models
Diode Model
[Figure: diode model with series resistance RS, diode current ID, diode voltage VD, and capacitance CD]
Current-Voltage Relations
Long-Channel Device
A model for manual analysis
Dynamic Behavior of MOS Transistor
[Figure: MOS transistor with terminals G, D, S, B and capacitances CGS, CGD, CGB, CSB, CDB]
Gate Capacitance
[Figure: gate-to-channel capacitance CGC in three operating regions: cut-off, resistive, saturation]
Most important regions in digital design: saturation and cut-off
Impact of Interconnect Parasitics
• Interconnect parasitics
– reduce reliability
– affect performance and power consumption
• Classes of parasitics
– Capacitive
– Resistive
– Inductive
CMOS Inverter Load Characteristics
[Figure: load curves, drain current IDn versus Vout, for the NMOS and PMOS devices, with Vin stepped from 0 to 2.5 V in 0.5 V increments]
CMOS Inverter VTC
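[Figure: voltage transfer characteristic, Vout versus Vin, both axes from 0 to 2.5 V. Operating regions along the curve as Vin rises:]
• NMOS off, PMOS res
• NMOS sat, PMOS res
• NMOS sat, PMOS sat
• NMOS res, PMOS sat
• NMOS res, PMOS off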
CMOS Inverter Propagation Delay
Approach 1
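[Figure: inverter with Vin = VDD discharging the load CL at Vout with average current Iav]
$t_{pHL} = \dfrac{C_L \, V_{swing}/2}{I_{av}} \sim \dfrac{C_L}{k_n V_{DD}}$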
CMOS Inverter Propagation Delay
Approach 2
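[Figure: inverter with Vin = VDD modeled as a resistance Ron discharging CL; plot of Vout/VDD versus t, an exponential decay from 1 with the levels 0.5 and 0.36 marked and the time axis marked at ln(0.5) and RonCL]
$t_{pHL} = f(R_{on} C_L) = 0.69\, R_{on} C_L$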
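The factor 0.69 follows from solving the RC discharge for the 50% point: exp(-t/RonCL) = 0.5 gives t = ln(2) RonCL. A minimal sketch under the first-order RC assumption; the function names and component values are illustrative.

```python
import math


def vout_normalized(t: float, r_on: float, c_load: float) -> float:
    """Vout/VDD for a capacitor discharging through r_on, starting at VDD."""
    return math.exp(-t / (r_on * c_load))


def t_phl(r_on: float, c_load: float) -> float:
    """Time for the output to fall to 50% of VDD: ln(2) * Ron * CL."""
    return math.log(2.0) * r_on * c_load


if __name__ == "__main__":
    r_on, c_load = 10e3, 10e-15  # hypothetical values: 10 kOhm, 10 fF
    print("ln(2) =", round(math.log(2.0), 3))
    print("tpHL  =", t_phl(r_on, c_load), "s")
    print("Vout/VDD at t = RonCL:", round(vout_normalized(r_on * c_load, r_on, c_load), 3))
```

At t = RonCL the output has fallen to exp(-1), about 0.37 of VDD, which corresponds to the second level marked on the plot.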
Inverter Chain
[Figure: chain of inverters from In to Out, driving load CL]
If CL is given:
- How many stages are needed to minimize the delay?
- How to size the inverters?
May need some additional constraints.
Apply to Inverter Chain
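[Figure: chain of N inverters, stages 1, 2, …, N, from In to Out, driving load CL]
$t_p = t_{p1} + t_{p2} + \dots + t_{pN}$
$t_{pj} \sim R_{unit} C_{unit} \left(1 + \dfrac{C_{gin,j+1}}{\gamma\, C_{gin,j}}\right)$
$t_p = \sum_{j=1}^{N} t_{p,j} = t_{p0} \sum_{j=1}^{N} \left(1 + \dfrac{C_{gin,j+1}}{\gamma\, C_{gin,j}}\right), \quad C_{gin,N+1} = C_L$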
Optimal Tapering for Given N
Delay equation has N - 1 unknowns, Cgin,2 – Cgin,N
Minimize the delay, find N - 1 partial derivatives
Result: Cgin,j+1/Cgin,j = Cgin,j/Cgin,j-1
Size of each stage is the geometric mean of two neighbors
$C_{gin,j} = \sqrt{C_{gin,j-1}\, C_{gin,j+1}}$
- each stage has the same effective fanout (Cout/Cin)
- each stage has the same delay
Optimum Delay and Number of Stages
When each stage is sized by f and has the same effective fanout f:
$f^N = F = C_L / C_{gin,1}$
Effective fanout of each stage:
$f = \sqrt[N]{F}$
Minimum path delay:
$t_p = N\, t_{p0} \left(1 + \sqrt[N]{F} / \gamma\right)$
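The sizing rule can be sketched numerically: for a given number of stages N, every stage gets the same fanout f, the N-th root of F = CL/Cgin,1, and the path delay is N tp0 (1 + f/γ). In this sketch γ (the ratio of intrinsic output to input capacitance) defaults to 1 and tp0 to one time unit; both defaults and the function names are assumptions.

```python
def size_inverter_chain(c_gin1: float, c_load: float, n_stages: int):
    """Return (stage fanout f, list of input capacitances of stages 1..N)."""
    if n_stages < 1 or c_gin1 <= 0 or c_load <= 0:
        raise ValueError("need n_stages >= 1 and positive capacitances")
    overall_fanout = c_load / c_gin1                    # F = CL / Cgin,1
    stage_fanout = overall_fanout ** (1.0 / n_stages)   # f = F ** (1/N)
    input_caps = [c_gin1 * stage_fanout ** j for j in range(n_stages)]
    return stage_fanout, input_caps


def chain_delay(c_gin1: float, c_load: float, n_stages: int,
                gamma: float = 1.0, t_p0: float = 1.0) -> float:
    """Path delay N * tp0 * (1 + f / gamma) for equal-fanout sizing."""
    stage_fanout, _ = size_inverter_chain(c_gin1, c_load, n_stages)
    return n_stages * t_p0 * (1.0 + stage_fanout / gamma)


if __name__ == "__main__":
    # Hypothetical case: the load is 64 times the first inverter's input capacitance.
    for n in range(1, 7):
        print(n, round(chain_delay(1.0, 64.0, n), 2))
```

For this hypothetical F = 64 with γ = 1, three stages with f = 4 give the lowest delay of the stage counts tried.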
Optimum Effective Fanout f
Optimum f for given process defined by γ:
$f = \exp(1 + \gamma / f)$
$f_{opt} = 3.6$ for $\gamma = 1$
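The optimum fanout has no closed form, but the implicit equation f = exp(1 + γ/f) can be solved numerically. A minimal sketch using fixed-point iteration, which is one possible method and not one prescribed by the slides; γ denotes the process parameter whose symbol was lost in the transcript.

```python
import math


def optimum_fanout(gamma: float, tol: float = 1e-12, max_iter: int = 1000) -> float:
    """Solve f = exp(1 + gamma / f) by fixed-point iteration, starting at f = e."""
    f = math.e
    for _ in range(max_iter):
        f_next = math.exp(1.0 + gamma / f)
        if abs(f_next - f) < tol:
            return f_next
        f = f_next
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    for gamma in (0.0, 1.0):
        print(gamma, round(optimum_fanout(gamma), 2))
```

With γ = 0 the solution reduces to f = e; with γ = 1 it is about 3.6, matching the value quoted on the slide.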
Dynamic Power Dissipation
[Figure: CMOS inverter with supply Vdd, input Vin, output Vout, and load CL]
Energy/transition = $C_L \cdot V_{dd}^2$
Power = Energy/transition $\cdot f$ = $C_L \cdot V_{dd}^2 \cdot f$
Not a function of transistor sizes!
Need to reduce $C_L$, $V_{dd}$, and $f$ to reduce power.
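A short numerical sketch of the two expressions above shows why lowering the supply voltage is the most effective lever: power depends quadratically on Vdd but only linearly on CL and f. The function names and component values are illustrative.

```python
def energy_per_transition(c_load: float, v_dd: float) -> float:
    """Energy drawn from the supply per transition: CL * Vdd^2 (joules)."""
    return c_load * v_dd ** 2


def dynamic_power(c_load: float, v_dd: float, freq: float) -> float:
    """Dynamic power: CL * Vdd^2 * f (watts)."""
    return energy_per_transition(c_load, v_dd) * freq


if __name__ == "__main__":
    c_load, freq = 1e-12, 1e9  # hypothetical: 1 pF switched at 1 GHz
    for v_dd in (2.5, 1.25):
        print(v_dd, "V ->", dynamic_power(c_load, v_dd, freq), "W")
```

Halving Vdd cuts the power to one quarter, while halving CL or f only cuts it in half.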
Lecture 6
Combinational Logic
Combinational vs. Sequential Logic
[Figure: a combinational logic circuit mapping In to Out, and a sequential circuit in which the combinational logic circuit is also connected to a State block]
Combinational: Output = f(In)
Sequential: Output = f(In, Previous In)
Static CMOS Circuit
At every point in time (except during the switching transients) each gate output is connected to either VDD or VSS via a low-resistance path.
The outputs of the gates assume at all times the value of the Boolean function implemented by the circuit (ignoring, once again, the transient effects during switching periods).
This is in contrast to the dynamic circuit class, which relies on temporary storage of signal values on the capacitance of high-impedance circuit nodes.
Constructing a Complex Gate
[Figure in three panels, with inputs A, B, C, D, output F, supply VDD, and sub-nets SN1 to SN4:]
(a) pull-down network
(b) deriving the pull-up network hierarchically by identifying sub-nets
(c) complete gate
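The pull-up network is the dual of the pull-down network: series connections become parallel and vice versa, with inverted input polarity. The sketch below checks that property exhaustively. It assumes the gate implements F = NOT(D + A·(B + C)), the expression that appears on the transistor-sizing slide; the function names are illustrative.

```python
from itertools import product


def pull_down_conducts(a: bool, b: bool, c: bool, d: bool) -> bool:
    """NMOS network: D in parallel with (A in series with (B parallel C))."""
    return d or (a and (b or c))


def pull_up_conducts(a: bool, b: bool, c: bool, d: bool) -> bool:
    """PMOS dual: D in series with (A parallel (B in series with C)).

    PMOS devices conduct when their input is low, hence the negations.
    """
    return (not d) and ((not a) or ((not b) and (not c)))


def networks_are_complementary() -> bool:
    """True if exactly one network conducts for every input combination."""
    return all(
        pull_down_conducts(*inputs) != pull_up_conducts(*inputs)
        for inputs in product([False, True], repeat=4)
    )


if __name__ == "__main__":
    print("complementary:", networks_are_complementary())
```

Because exactly one network conducts for each input combination, the output always has a path to either VDD or ground and never to both.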
CMOS Properties
• Full rail-to-rail swing; high noise margins
• Logic levels not dependent upon the relative device sizes; ratioless
• Always a path to Vdd or Gnd in steady state; low output impedance
• Extremely high input resistance; nearly zero steady-state input current
• No direct path between power and ground in steady state; no static power dissipation
• Propagation delay is a function of load capacitance and resistance of transistors
Transistor Sizing
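[Figure: two-input gates with inputs A and B, annotated with the device resistances Rp and Rn, internal capacitance Cint, load CL, and relative transistor sizes 1, 2 and 4]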
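Transistor Sizing a Complex CMOS Gate
[Figure: complex gate with inputs A, B, C, D. Pull-up devices carry the size labels 8 6, 8 6, 4 3 and 4 6; pull-down devices carry the size labels 1 and 2]
$OUT = \overline{D + A \cdot (B + C)}$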
Fast Complex Gates: Design Technique 1
• Transistor sizing
– as long as fan-out capacitance dominates
• Progressive sizing
[Figure: series stack of transistors M1 … MN with inputs In1 … InN, internal node capacitances C1, C2, C3 and load CL, forming a distributed RC line]
M1 > M2 > M3 > … > MN (the MOS transistor closest to the output is the smallest)
Can reduce delay by more than 20%; decreasing gains as technology shrinks
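Fast Complex Gates: Design Technique 2
• Transistor ordering
[Figure, first ordering: series stack M1, M2, M3 with load CL. The critical-path input In1 (0→1) drives M1, while In2 = 1 drives M2 and In3 = 1 drives M3; CL, C1 and C2 are charged]
delay determined by time to discharge CL, C1 and C2
[Figure, second ordering: the critical-path input In1 (0→1) drives M3, while In2 = 1 drives M2 and In3 = 1 drives M1; CL is charged, C1 and C2 are discharged]
delay determined by time to discharge CL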
Logical Effort
$Delay = k\, R_{unit} C_{unit} \left(1 + \dfrac{C_L}{\gamma\, C_{in}}\right) = \tau\,(p + g \cdot f)$
p – intrinsic delay (3kRunitCunit) – gate parameter ≠ f(W)
g – logical effort (kRunitCunit) – gate parameter ≠ f(W)
f – effective fanout
Normalize everything to an inverter:
$g_{inv} = 1$, $p_{inv} = 1$
Divide everything by $\tau_{inv}$
(everything is measured in unit delays $\tau_{inv}$)
Assume $\gamma = 1$.
Delay in a Logic Gate
Gate delay: d = h + p (effort delay h, intrinsic delay p)
Effort delay: h = g f (logical effort g, effective fanout f = Cout/Cin)
Logical effort is a function of topology, independent of sizing
Effective fanout (electrical effort) is a function of load/gate size
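The gate delay model d = g f + p can be evaluated directly. A minimal sketch in normalized units (delays measured in units of the inverter delay, with g = 1 and p = 1 for the inverter as stated in these notes); the function names and the example capacitances are illustrative.

```python
def effective_fanout(c_out: float, c_in: float) -> float:
    """Electrical effort f = Cout / Cin."""
    if c_in <= 0:
        raise ValueError("c_in must be positive")
    return c_out / c_in


def gate_delay(g: float, f: float, p: float) -> float:
    """Normalized gate delay d = h + p with effort delay h = g * f."""
    return g * f + p


if __name__ == "__main__":
    # Inverter (g = 1, p = 1) driving four times its own input capacitance.
    f = effective_fanout(c_out=4.0, c_in=1.0)
    print("inverter, fanout of 4:", gate_delay(g=1.0, f=f, p=1.0))
```

Because g depends only on topology, resizing a gate changes its delay only through f.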
Logical Effort
• The inverter has the smallest logical effort and intrinsic delay of all static CMOS gates
• The logical effort of a gate represents the ratio of its input capacitance to the inverter capacitance when sized to deliver the same current
• Logical effort increases with gate complexity
Logical Effort
Logical effort is the ratio of the input capacitance of a gate to the input capacitance of an inverter with the same output current.
[Figure: transistor-level schematics of three gates, annotated with relative device sizes]
• Inverter: g = 1
• 2-input NAND: g = 4/3
• 2-input NOR: g = 5/3
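The three values of g can be reproduced from relative transistor widths. The sketch below assumes the usual sizing for equal drive strength (inverter: PMOS 2, NMOS 1; 2-input NAND: PMOS 2, NMOS 2; 2-input NOR: PMOS 4, NMOS 1), which matches the size labels that survive from the figure, and it assumes input capacitance is proportional to total gate width.

```python
from fractions import Fraction

# Assumed reference inverter: PMOS width 2 + NMOS width 1 on its single input.
INVERTER_INPUT_CAP = 2 + 1


def logical_effort(pmos_width: int, nmos_width: int) -> Fraction:
    """Input capacitance of one gate input relative to the reference inverter."""
    return Fraction(pmos_width + nmos_width, INVERTER_INPUT_CAP)


if __name__ == "__main__":
    print("inverter:", logical_effort(2, 1))
    print("2-input NAND:", logical_effort(2, 2))
    print("2-input NOR:", logical_effort(4, 1))
```

The NOR gate is worse than the NAND because its series PMOS devices must be widened, and PMOS devices are already the larger ones.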
Add Branching Effort
Branching effort:
$b = \dfrac{C_{on\text{-}path} + C_{off\text{-}path}}{C_{on\text{-}path}}$
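As a small worked example of the branching-effort definition, the sketch below computes b for a node whose load is split between the path being analyzed and side branches. The function name and capacitance values are illustrative.

```python
def branching_effort(c_on_path: float, c_off_path: float) -> float:
    """b = (C_on-path + C_off-path) / C_on-path."""
    if c_on_path <= 0:
        raise ValueError("c_on_path must be positive")
    return (c_on_path + c_off_path) / c_on_path


if __name__ == "__main__":
    # Hypothetical node: 2 units of load on the path, 6 units on side branches.
    print(branching_effort(2.0, 6.0))
```

With no off-path load, b equals 1 and the stage behaves as in an unbranched chain.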
Summary
[Table not recovered. Source: Sutherland, Sproull, Harris]
Ratioed Logic
[Figure in three panels. Each has a pull-down network (PDN) driven by In1, In2, In3 between the output F and VSS, and a load connected to VDD:]
(a) resistive load (RL)
(b) depletion load NMOS (VT < 0)
(c) pseudo-NMOS (PMOS load)
Goal: to reduce the number of devices over complementary CMOS
Active Loads
[Figure in two panels. Each has a PDN driven by In1, In2, In3 between the output F and VSS, and an active load connected to VDD:]
• depletion load NMOS (VT < 0)
• pseudo-NMOS (PMOS load)
Pass-Transistor Logic
[Figure: inputs driving a switch network that produces Out; example built from pass transistors with signals A and B]
• N transistors
• No static consumption
Dynamic CMOS
• In static circuits, at every point in time (except when switching), the output is connected to either GND or VDD via a low-resistance path.
– fan-in of n requires 2n (n N-type + n P-type) devices
• Dynamic circuits rely on the temporary storage of signal values on the capacitance of high-impedance nodes.
– requires only n + 2 (n + 1 N-type + 1 P-type) transistors
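The device counts quoted above can be tabulated for a few fan-in values. A minimal sketch; the function names are illustrative.

```python
def static_cmos_transistors(fan_in: int) -> int:
    """Complementary static gate: n NMOS plus n PMOS devices."""
    return 2 * fan_in


def dynamic_cmos_transistors(fan_in: int) -> int:
    """Dynamic gate: n NMOS in the PDN, one NMOS evaluate device, one PMOS precharge device."""
    return fan_in + 2


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, static_cmos_transistors(n), dynamic_cmos_transistors(n))
```

The two styles break even at a fan-in of 2; beyond that the dynamic gate needs fewer devices.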
Dynamic Gate
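[Figure: dynamic gate with precharge PMOS Mp and evaluate NMOS Me, both clocked by Clk, a pull-down network (PDN) with inputs In1, In2, In3, and an output Out loaded by CL; example PDN with inputs A, B, C]
Two-phase operation
• Precharge (Clk = 0): Mp on, Me off, Out = 1
• Evaluate (Clk = 1): Mp off, Me on, Out = $\overline{(A \cdot B) + C}$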
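The two-phase operation can be modeled behaviorally. This is a minimal sketch of the example gate with pull-down function (A·B) + C; it ignores leakage and charge sharing, and the class and method names are illustrative.

```python
class DynamicGate:
    """Behavioral model of a dynamic gate with PDN function (A AND B) OR C."""

    def __init__(self):
        self.out = None  # output node is unknown until the first precharge

    def step(self, clk: int, a: int, b: int, c: int):
        if clk == 0:
            # Precharge: Mp on, Me off, output node charged to 1.
            self.out = 1
        elif (a and b) or c:
            # Evaluate with a conducting PDN: output node discharged to 0.
            self.out = 0
        # Evaluate with a non-conducting PDN: the node keeps its stored charge.
        return self.out


if __name__ == "__main__":
    gate = DynamicGate()
    print(gate.step(0, 1, 1, 0))  # precharge
    print(gate.step(1, 1, 1, 0))  # evaluate, PDN conducts
    print(gate.step(1, 0, 0, 0))  # inputs fall, but the node stays discharged
```

The last step illustrates that, once discharged during evaluation, the output cannot return to 1 until the next precharge phase.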
Cascading Dynamic Gates
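[Figure: two cascaded dynamic gates, each with Mp, Me and Clk, where Out1 of the first gate drives the second gate; waveforms of Clk, In, Out1 and Out2 versus t, with the threshold VTn marked]
Only 0 → 1 transitions allowed at inputs!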
Differential (Dual Rail) Domino
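[Figure: differential domino gate with precharge devices Mp, keeper devices Mkp, evaluate device Me, clock Clk, and inputs A, B, !A, !B]
Outputs: Out = AB and its complement $\overline{AB}$
Solves the problem of non-inverting logic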
NORA Logic
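[Figure: a clocked PDN stage (inputs In1, In2, In3; devices Mp, Me) producing Out1, followed by a clocked PUN stage (inputs In4, In5; devices Mp, Me) producing Out2 (to PDN). Fan-out is labeled “to other PDN’s” and “to other PUN’s”; node transitions are annotated 1→1, 1→0 and 0→0, 0→1]
WARNING: Very sensitive to noise!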