Transcript 0 - EECS

Digital Design
Chapter 7:
Physical Implementation
Slides to accompany the textbook Digital Design, First Edition,
by Frank Vahid, John Wiley and Sons Publishers, 2007.
http://www.ddvahid.com
Some changes by Mark Brehob
Copyright by Frank Vahid
Instructors of courses requiring Vahid's Digital Design textbook (published by John Wiley and Sons) have permission to modify and use these slides for customary course-related activities, subject to keeping this copyright notice in place and unmodified. These slides may be posted as unanimated pdf versions on publicly-accessible course websites. PowerPoint source (or pdf with animations) may not be posted to publicly-accessible websites, but may be posted for students on internal protected sites or distributed directly to students by other electronic means. Instructors may make printouts of the slides available to students for a reasonable photocopying charge, without incurring royalties. Any other use requires explicit permission. Instructors may obtain PowerPoint source or obtain special use permissions from Wiley – see http://www.ddvahid.com for information.
Introduction
• A digital circuit design is just an idea, perhaps drawn on
paper
• We eventually need to implement the circuit on a physical
device
– How do we get from (a) to (b)?
[Figure: (a) Digital circuit design: BeltWarn circuit with inputs k, p, s and output w; (b) Physical implementation: IC]
Manufactured IC Technologies
• We can manufacture our own IC
– Months of time and millions of dollars
– (1) Full-custom or (2) semicustom
• (1) Full-custom IC
– We make a full custom layout
• Using CAD tools
• Layout describes the location and size of
every transistor and wire
– A fab (fabrication plant) builds IC for layout
– Hard!
• Fab setup costs ("non-recurring engineering",
or NRE, costs) high
• Error prone (several "respins")
• Fairly uncommon
– Reserved for special ICs that demand the very best performance or the very smallest size/power
[Figure: BeltWarn circuit (inputs k, p, s; output w) → custom layout → fab (months) → IC]
Full Custom inverter
Images from http://wiki.usgroup.eu/wiki/public/tutorials/fullcustomvirtuoso
and http://www.iis.ee.ethz.ch/~kgf/aries/8.html
Take away – full custom
• Very expensive and slow
– Requires folks who really know what they are doing and
requires them to take time
• EECS 312 final project is to build a fairly small device (e.g. 4-bit
adder) with certain space and power bounds. This is a multiweek thing!
– Fabrication will generally not have any short cuts.
• So might take months to get this spun.
• Best performance
– Can tune the heck out of things and make stuff go really fast
or use very little power.
• Where do you think this sees use?
Gate array
• From Wikipedia:
– A gate array circuit is a prefabricated silicon chip circuit with no
particular function in which transistors, standard NAND or NOR logic
gates, and other active devices are placed at regular predefined
positions and manufactured on a wafer, usually called a master
slice. Creation of a circuit with a specified function is accomplished
by adding a final surface layer or layers of metal interconnects to the
chips on the master slice late in the manufacturing process, joining
these elements to allow the function of the chip to be customized as
desired.
– Gate array master slices are usually prefabricated and stockpiled in
large quantities regardless of customer orders. The design and
fabrication according to the individual customer specifications may
be finished in a shorter time compared with standard cell or full
custom design.
Manufactured IC Technologies – Gate Array ASIC
• (2) Semi-custom IC
– "Application-specific IC" (ASIC)
– (a) Gate array or (b) standard
cell
• (2a) Gate array
– Series of gates already laid out on chip
– We just wire them together
• Using CAD tools
– Vs. full-custom
• Cheaper and quicker to design
• But worse performance, size, power
– Was quite popular
• Sees less use today AFAICT
[Figure: BeltWarn circuit (inputs k, p, s; output w) mapped onto a gate array; fab takes weeks (just wiring) to produce the IC]
Manufactured IC Technologies – Gate Array ASIC
• (2a) Gate array
– Example: Mapping a half-adder to a gate array
Half-adder equations:
s = a'b + ab'
co = ab
[Figure: gate array wired to compute ab, a'b, and ab' from inputs a and b, producing outputs co and s]
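The half-adder equations on this slide can be checked exhaustively. The following is a minimal Python sketch, not part of the original slides; the function name `half_adder` is chosen here for illustration.

```python
from itertools import product

def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (s, co) using the slide's equations: s = a'b + ab', co = ab."""
    not_a, not_b = 1 - a, 1 - b
    s = (not_a & b) | (a & not_b)
    co = a & b
    return s, co

for a, b in product((0, 1), repeat=2):
    s, co = half_adder(a, b)
    # The carry and sum bits together must encode the arithmetic sum a + b.
    assert 2 * co + s == a + b
    print(f"a={a} b={b} -> s={s} co={co}")
```

Each of the three product terms (ab, a'b, ab') corresponds to one AND gate that must be wired up on the gate array.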
Manufactured IC Technologies – Standard Cell ASIC
• (2) Semicustom IC
– "Application-specific IC" (ASIC)
– (a) Gate array or (b) standard
cell
• (2b) Standard cell
– Pre-laid-out "cells" exist in library, not on chip
– Designer instantiates cells into pre-defined rows, and connects
– Vs. gate array
• Better performance/power/size
• A bit harder to design
– Vs. full custom
• Not as good a circuit, but still far easier to design
[Figure: BeltWarn circuit (inputs k, p, s; output w); cell library; cells placed in cell rows and connected; fab takes 1-3 months (cells and wiring) to produce the IC]
Manufactured IC Technologies – Standard Cell ASIC
• (2b) Standard cell
– Example: Mapping a half-adder to standard cells
co = ab
s = a'b + ab'
[Figure: half-adder built from standard cells placed in cell rows, shown next to the gate array version]
Notice fewer gates and shorter wires for standard cells versus gate array, but at cost of more design and manufacturing effort
Question:
• What are the key differences between standard cells and
gate array design?
Implementing Circuits Using NAND Gates Only
(We’ve been doing this since GA1!)
• Gate array may have NAND gates only
– NAND is universal gate
• Any circuit can be mapped to NANDs only
• Convert AND/OR/NOT circuit to NAND-only circuit using mapping rules
– After converting, remove double inversions
[Figure: mapping rules]
– NOT: F = x' becomes a NAND with both inputs tied to x (x = 0 gives F = 1; x = 1 gives F = 0)
– AND: F = ab becomes a NAND producing (ab)', followed by a NAND-based inverter
– OR: F = a+b becomes NAND-based inverters on a and b feeding a NAND: F = (a'b')' = a''+b'' = a+b
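The mapping rules above can be verified with a short truth-table check. This is a minimal Python sketch; the helper names (`nand`, `not_from_nand`, and so on) are illustrative and do not come from the slides.

```python
from itertools import product

def nand(a: int, b: int) -> int:
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)

def not_from_nand(x: int) -> int:
    # NOT rule: tie both NAND inputs to x.
    return nand(x, x)

def and_from_nand(a: int, b: int) -> int:
    # AND rule: NAND followed by a NAND-based inverter.
    return not_from_nand(nand(a, b))

def or_from_nand(a: int, b: int) -> int:
    # OR rule: invert both inputs, then NAND: (a'b')' = a + b.
    return nand(not_from_nand(a), not_from_nand(b))

for a, b in product((0, 1), repeat=2):
    assert not_from_nand(a) == 1 - a
    assert and_from_nand(a, b) == (a & b)
    assert or_from_nand(a, b) == (a | b)
print("NAND-only versions match NOT, AND, and OR on every input row")
```

Applying the rules gate by gate often leaves two inverters back to back, which is why the slide's last step is to remove double inversions.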
Implementing Circuits Using NOR Gates Only
• Example: Half adder
[Figure: half-adder (inputs a, b; output s) converted in steps (a)-(c), with double inversions marked for removal]
Implementing Circuits Using NOR Gates Only
• Example: Seat belt warning light on a NOR-based gate array
– Note: if using 2-input NOR gates, first convert AND/OR gates to 2-inputs
[Figure: seat belt warning circuit (inputs k, p, s; output w) converted in steps (a)-(d) and placed on a NOR-based gate array, with gates numbered 1-5]
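One possible NOR-only version of the seat belt warning function w = kps' can be sketched in Python as follows. The decomposition below is an assumption made for illustration (2-input NORs only, with the 3-input AND first split into 2-input ANDs); it is not claimed to match the gate numbering in the slide's figure.

```python
from itertools import product

def nor(a: int, b: int) -> int:
    """2-input NOR on 0/1 values."""
    return 1 - (a | b)

def belt_warn_nor(k: int, p: int, s: int) -> int:
    """w = kps' built from 2-input NOR gates only (illustrative decomposition)."""
    not_k = nor(k, k)          # NOR with tied inputs acts as an inverter
    not_p = nor(p, p)
    kp = nor(not_k, not_p)     # AND via De Morgan: (k' + p')' = kp
    not_kp = nor(kp, kp)
    # (kp) AND s' = ((kp)' + s'')' ; the double inversion on s is removed,
    # so s feeds the final NOR directly.
    return nor(not_kp, s)

for k, p, s in product((0, 1), repeat=3):
    assert belt_warn_nor(k, p, s) == (k & p & (1 - s))
print("NOR-only circuit matches w = kps'")
```

Removing the double inversion on s saves two gates compared with a blind gate-by-gate conversion.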
7.3 Programmable IC Technology – FPGA
• Manufactured IC technologies require weeks to
months to fabricate
– And have large (hundred thousand to million dollar)
initial costs
• Programmable ICs are pre-manufactured
– Can implement circuit today
– Just download bits into device
– Slower/bigger/more-power than manufactured ICs
• But get it today, and no fabrication costs
• Popular programmable IC – FPGA
– "Field-programmable gate array"
• Developed late 1980s
• Though no "gate array" inside
– Named when gate arrays were very popular in the 1980s
• Programmable in seconds
FPGA Internals: Lookup Tables (LUTs)
• Basic idea: Memory can implement combinational logic
– e.g., 2-address memory can implement 2-input logic
– 1-bit wide memory – 1 function; 2-bits wide – 2 functions
• Such memory in FPGA known as Lookup Table (LUT)
(a) F = x'y' + xy
x y   F
0 0   1
0 1   0
1 0   0
1 1   1
(b) 4x1 Mem. with rd = 1, a1 = x, a0 = y: words 0-3 hold 1, 0, 0, 1; output D = F
(c) With x=0, y=0 the memory reads word 0, so F=1
(d) F = x'y' + xy, G = xy'
x y   F G
0 0   1 0
0 1   0 0
1 0   0 1
1 1   1 0
(e) 4x2 Mem. with rd = 1, a1 = x, a0 = y: words 0-3 hold 10, 00, 01, 10; outputs D1 = F, D0 = G
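The idea that a small memory can implement combinational logic can be sketched in Python. This is an illustrative model only; `build_lut` and `read_lut` are hypothetical helper names, and the address order (x as the high address bit, y as the low bit) is an assumption taken from the a1/a0 labels on the slide.

```python
from itertools import product

def build_lut(func, n_inputs: int) -> list:
    """Fill a 2**n_inputs-word memory by evaluating func on every truth-table row."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]

def read_lut(words: list, *address_bits: int):
    """Read one word; address bits are given most significant bit first."""
    index = 0
    for bit in address_bits:
        index = (index << 1) | bit
    return words[index]

def f_and_g(x: int, y: int) -> tuple[int, int]:
    f = ((1 - x) & (1 - y)) | (x & y)   # F = x'y' + xy
    g = x & (1 - y)                     # G = xy'
    return (f, g)

lut_4x2 = build_lut(f_and_g, 2)
print(lut_4x2)   # one (F, G) pair per memory word

for x, y in product((0, 1), repeat=2):
    # Looking up the word gives the same result as evaluating the logic.
    assert read_lut(lut_4x2, x, y) == f_and_g(x, y)
```

A 2-bit-wide memory stores two functions of the same inputs at no extra addressing cost, which is the point of the 4x2 example.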
FPGA Internals: Lookup Tables (LUTs)
• Example: Seat-belt warning
light (again)
[Figure: (a) BeltWarn circuit with inputs k, p, s and output w; (b) truth table; (c) 8x1 Mem. with address inputs k, p, s and output D = w]
k p s   w
0 0 0   0
0 0 1   0
0 1 0   0
0 1 1   0
1 0 0   0
1 0 1   0
1 1 0   1
1 1 1   0
8x1 Mem.: words 0-7 hold 0, 0, 0, 0, 0, 0, 1, 0
Programming takes seconds, versus 1-3 months for fab
FPGA Internals: Lookup Tables (LUTs)
• Lookup tables become inefficient for more inputs
– 3 inputs → only 8 words
– 8 inputs → 256 words; 16 inputs → 65,536 words!
• FPGAs thus have numerous small (3, 4, 5, or even 6-input) LUTs
– If circuit has more inputs, must partition circuit among LUTs
– Example: Extended seat-belt warning light system:
Sub-circuits have only 3-inputs each
[Figure: (a) Extended BeltWarn circuit with inputs k, p, s, t, d and output w: a 5-input circuit, but only 3-input LUTs available]
(b) Partition circuit into 3-input sub-circuits, each with 3 inputs and 1 output:
x = kps'
w = x+t+d
(c) Map to 3-input LUTs:
First 8x1 Mem. (address inputs k, p, s; output D = x): words 0-7 hold 0, 0, 0, 0, 0, 0, 1, 0
Second 8x1 Mem. (address inputs x, t, d; output D = w): words 0-7 hold 0, 1, 1, 1, 1, 1, 1, 1
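The partitioning on this slide can be checked by chaining two 8-word lookup tables and comparing against the 5-input function directly. This is a minimal Python sketch; the memory contents are the ones shown on the slide, while the helper name `lut3` and the address-bit order are assumptions.

```python
from itertools import product

LUT_X = [0, 0, 0, 0, 0, 0, 1, 0]   # x = kps'
LUT_W = [0, 1, 1, 1, 1, 1, 1, 1]   # w = x + t + d

def lut3(words: list[int], a2: int, a1: int, a0: int) -> int:
    """Read an 8x1 memory; a2 is the most significant address bit."""
    return words[(a2 << 2) | (a1 << 1) | a0]

def belt_warn_extended(k: int, p: int, s: int, t: int, d: int) -> int:
    x = lut3(LUT_X, k, p, s)       # first 3-input sub-circuit
    return lut3(LUT_W, x, t, d)    # second 3-input sub-circuit

for k, p, s, t, d in product((0, 1), repeat=5):
    assert belt_warn_extended(k, p, s, t, d) == ((k & p & (1 - s)) | t | d)

# Memory size grows as 2**n with the number of inputs n.
print([2 ** n for n in (3, 8, 16)])
```

Two 8-word LUTs hold 16 bits in total, whereas a single 5-input LUT would need 32 words, so partitioning trades a second lookup for less memory.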
LUTs
FPGA Internals: Lookup Tables (LUTs)
• LUT typically has 2 (or more) outputs, not just one
• Example: Partitioning a circuit among 3-input 2-output lookup tables
[Figure: (a) circuit with inputs a, b, c, d, e and output F; (b) circuit partitioned into 3-input sub-circuits, with intermediate signal t; (c) mapping to two 8x2 Mem. LUTs]
(Note: decomposed one 4-input AND into two smaller ANDs to enable partitioning into 3-input sub-circuits)
First 8x2 Mem. (inputs a, b, c; output t on D0): words 0-7 hold 00, 00, 00, 00, 00, 00, 00, 01
– First column unused; second column implements AND
Second 8x2 Mem. (inputs t, d, e; output F on D1): words 0-7 hold 00, 10, 00, 10, 00, 10, 10, 10
– Second column unused; first column implements AND/OR sub-circuit
FPGA Internals: Lookup Tables (LUTs)
• Example: Mapping a 2x4 decoder to 3-input 2-output LUTs
[Figure: (a) 2x4 decoder with inputs i1, i0 and outputs d0-d3; (b) two 8x2 Mem. LUTs, each with a2 = 0, a1 = i1, a0 = i0]
First 8x2 Mem. (D1 = d0, D0 = d1): words 0-7 hold 10, 01, 00, 00, 00, 00, 00, 00
Second 8x2 Mem. (D1 = d2, D0 = d3): words 0-7 hold 00, 00, 10, 01, 00, 00, 00, 00
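The decoder mapping can be replayed in Python using the memory contents from the slide. This is an illustrative sketch; the names `LUT_A`, `LUT_B`, and `decode` are made up here, and the pin assignment (a2 tied to 0, a1 = i1, a0 = i0, D1 listed before D0) is read off the figure residue and should be treated as an assumption.

```python
LUT_A = ["10", "01", "00", "00", "00", "00", "00", "00"]   # D1 = d0, D0 = d1
LUT_B = ["00", "00", "10", "01", "00", "00", "00", "00"]   # D1 = d2, D0 = d3

def decode(i1: int, i0: int) -> tuple[int, int, int, int]:
    """Return (d0, d1, d2, d3) by reading both 8x2 memories at the same address."""
    address = (0 << 2) | (i1 << 1) | i0    # a2 is tied to 0
    d0, d1 = (int(bit) for bit in LUT_A[address])
    d2, d3 = (int(bit) for bit in LUT_B[address])
    return d0, d1, d2, d3

for i1 in (0, 1):
    for i0 in (0, 1):
        outputs = decode(i1, i0)
        # Exactly one output is 1, and it is the one selected by the input number.
        assert sum(outputs) == 1 and outputs[2 * i1 + i0] == 1
        print(i1, i0, "->", outputs)
```

Because a2 is tied to 0, words 4-7 of each memory are never read, which is why the slide leaves them at 00.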
FPGA Internals: Switch Matrices
• Previous slides had hardwired connections between LUTs
• Instead, want to program the connections too
• Use switch matrices (also known as programmable interconnect)
– Simple mux-based version – each output can be set to any of the four inputs
just by programming its 2-bit configuration memory
[Figure: (a) FPGA (partial): two 8x2 Mem. LUTs connected through a switch matrix, with pins P0-P9; the switch matrix has inputs m0-m3 and outputs o0, o1. (b) Switch matrix internals: each output (o0, o1) is driven by a 4x1 mux whose data inputs i0-i3 connect to m0-m3 and whose select lines s1, s0 come from a 2-bit memory]
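The mux-based switch matrix can be modeled in a few lines of Python. This is a sketch under stated assumptions: the class name `SwitchMatrix` is hypothetical, mux input i0-i3 is taken to be wired to m0-m3 in order, and s1 is treated as the high select bit.

```python
def mux4(inputs: list[int], s1: int, s0: int) -> int:
    """4x1 mux: the 2-bit select value picks one of four inputs."""
    return inputs[(s1 << 1) | s0]

class SwitchMatrix:
    """Two outputs, each driven by a 4x1 mux with its own 2-bit configuration memory."""

    def __init__(self, config_o0: tuple[int, int], config_o1: tuple[int, int]):
        self.config_o0 = config_o0   # (s1, s0) for output o0
        self.config_o1 = config_o1   # (s1, s0) for output o1

    def route(self, m: list[int]) -> tuple[int, int]:
        """Given the values on m0..m3, return the values on (o0, o1)."""
        return mux4(m, *self.config_o0), mux4(m, *self.config_o1)

# Example configuration: o0 follows m2 (bits 10), o1 follows m3 (bits 11).
matrix = SwitchMatrix(config_o0=(1, 0), config_o1=(1, 1))
print(matrix.route([0, 0, 1, 0]))
```

Changing a connection only means rewriting two configuration bits, which is what makes the interconnect programmable rather than hardwired.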
FPGA Internals: Switch Matrices
• Mapping a 2x4 decoder onto an FPGA with a switch matrix
[Figure: (a) FPGA (partial) with two 8x2 Mem. LUTs and a switch matrix; external inputs 0, 0, i1, i0; outputs d0, d1 from the first LUT and d2, d3 from the second. (b) Switch matrix internals: two 4x1 muxes driving o0 and o1]
First 8x2 Mem.: words 0-7 hold 10, 01, 00, 00, 00, 00, 00, 00
Switch matrix 2-bit memories: 10 for o0, 11 for o1
– These bits establish the desired connections
Second 8x2 Mem.: words 0-7 hold 00, 00, 10, 01, 00, 00, 00, 00
F=A*B+C
G=A*B*C*!D*E
FPGA Internals: Configurable Logic Blocks (CLBs)
• LUTs can only implement combinational logic
• Need flip-flops to implement sequential logic
• Add flip-flop to each LUT output
– Configurable Logic Block (CLB)
• LUT + flip-flops
– Can program CLB outputs to come from flip-flops or from LUTs directly
[Figure: FPGA with two CLBs and a switch matrix, pins P0-P9. Each CLB holds an 8x2 Mem. LUT, a flip-flop on each LUT output, and a 2x1 mux per CLB output whose select comes from a 1-bit CLB output configuration memory]
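One CLB output path (LUT, flip-flop, and 2x1 output mux) can be sketched in Python as follows. The class name `CLBOutput` is hypothetical, and treating a set configuration bit as "take the flip-flop value" is an assumption of this sketch, not something the slide states.

```python
class CLBOutput:
    """One LUT output column followed by a flip-flop and a 2x1 output mux."""

    def __init__(self, lut_words: list[int], use_flip_flop: bool):
        self.lut_words = list(lut_words)      # 8 one-bit words
        self.use_flip_flop = use_flip_flop    # 1-bit output configuration memory
        self.q = 0                            # flip-flop state

    def lut(self, a2: int, a1: int, a0: int) -> int:
        return self.lut_words[(a2 << 2) | (a1 << 1) | a0]

    def clock(self, a2: int, a1: int, a0: int) -> None:
        """Rising clock edge: the flip-flop stores the current LUT output."""
        self.q = self.lut(a2, a1, a0)

    def output(self, a2: int, a1: int, a0: int) -> int:
        """2x1 mux: registered value or direct LUT value."""
        return self.q if self.use_flip_flop else self.lut(a2, a1, a0)

# LUT column computing a' with a2 = 0, a1 = a, a0 = b (words 4-7 unused).
registered = CLBOutput([1, 1, 0, 0, 0, 0, 0, 0], use_flip_flop=True)
registered.clock(0, 0, 0)            # store a' = 1 while a = 0
print(registered.output(0, 1, 0))    # still the stored value, although a is now 1
```

With the mux set to bypass the flip-flop, the same block behaves as a plain combinational LUT.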
FPGA Internals: Sequential Circuit Example using CLBs
[Figure: (a) sequential circuit with inputs a, b, c, d and registered outputs w, x, y, z; (b) left lookup table: a2 = 0, a1 = a, a0 = b, D1 = w = a', D0 = x = b', words below unused; (c) FPGA with two CLBs and a switch matrix implementing the circuit]
Left 8x2 Mem.: words 0-7 hold 11, 10, 01, 00, 00, 00, 00, 00
Right 8x2 Mem.: words 0-7 hold 00, 01, 10, 11, 00, 00, 00, 00
Switch matrix 2-bit memories: 10 for o0, 11 for o1
CLB output configuration bits: 1 for each 2x1 mux
FPGA Internals: Overall Architecture
• Consists of hundreds or thousands of CLBs and switch
matrices (SMs) arranged in regular pattern on a chip
[Figure: grid of CLBs and SMs joined by routing channels]
– Each line between blocks represents a channel with tens of wires
– Connections for just one CLB shown, but all CLBs are obviously connected to channels
FPGA Internals: Programming an FPGA
• All configuration memory bits are connected as one big shift register
– Known as scan chain
• Shift in "bit file" of desired circuit
[Figure: (a) FPGA with programming pins Pin and Pclk; (b) conceptual view of configuration bit scan chain is that of a 40-bit shift register; (c) FPGA with both CLBs and the switch matrix programmed]
Left CLB 8x2 Mem.: words 0-7 hold 11, 10, 01, 01, 00, 00, 00, 00
Right CLB 8x2 Mem.: words 0-7 hold 01, 00, 11, 10, 00, 00, 00, 00
Switch matrix 2-bit memories: 10 for o0, 11 for o1
(c) Bit file contents for desired circuit: 1101011000000000111101010011010000000011
This isn't wrong. Although the bits appear as "10" above, note that the scan chain passes through those bits from right to left – so "01" is correct here.
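The scan-chain idea can be sketched as a plain shift register in Python. This is an illustrative model only: the class name `ScanChain` is hypothetical, and the sketch makes no claim about which of the 40 positions belongs to which LUT, mux, or switch-matrix bit on the real device.

```python
BIT_FILE = "1101011000000000111101010011010000000011"   # from the slide

class ScanChain:
    """All configuration bits modeled as one long shift register."""

    def __init__(self, length: int):
        self.bits = [0] * length

    def clock(self, pin: int) -> None:
        """One Pclk pulse: every bit moves one place and Pin enters at position 0."""
        self.bits = [pin] + self.bits[:-1]

chain = ScanChain(len(BIT_FILE))
for character in BIT_FILE:
    chain.clock(int(character))

# The first bit shifted in has travelled to the far end of the chain,
# so reading the chain from the far end back recovers the bit file.
recovered = "".join(str(bit) for bit in reversed(chain.bits))
assert recovered == BIT_FILE
print(len(chain.bits), "configuration bits loaded")
```

Because bits arrive in the opposite order to their final position, a field can look reversed in the bit file compared with how it is drawn in the figure, which is the point of the slide's closing note.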
7.4 Other Technologies
• Off-the-shelf logic (SSI) IC
– Logic IC has a few gates, connected to IC's pins
• Known as Small Scale Integration (SSI)
– Popular logic IC series: 7400
• Originally developed 1960s
– Back then, each IC cost $1000
– Today, costs just tens of cents
[Figure: 14-pin logic IC with pins I1-I7 along the bottom (GND at I7) and pins I8-I14 along the top (VCC at I14)]
7400-Series Logic ICs
Using Logic ICs
• Example: Seat belt warning light using off-the-shelf 7400 ICs
– Option 1: Use one 74LS08 IC having 2-input AND gates, and one 74LS04 IC
having inverters
(a) Desired circuit: seat belt warning light with inputs k, p, s and output w
(b) Decompose into 2-input AND gates, with intermediate signal n
(c) Connect ICs to create desired circuit
[Figure: pin-level wiring of the 74LS08 IC and the 74LS04 IC (pins I1-I14 on each), carrying signals k, p, s, n, and w]
Other Technologies
• Simple Programmable Logic Devices (SPLDs)
– Developed 1970s (thus, pre-dates FPGAs)
– Prefabricated IC with large AND-OR structure
– Connections can be "programmed" to create custom circuit
• Circuit shown can implement any 3-input function of up to 3 terms
– e.g., F = abc + a'c'
[Figure: PLD IC with inputs I1, I2, I3, programmable nodes feeding AND gates, and an OR gate driving output O1]
Programmable Nodes in an SPLD
• Fuse based – "blown" fuse removes
connection
• Memory based – 1 creates connection
[Figure: (a) PLD IC with inputs I1, I2, I3, output O1, and programmable nodes; (b) a programmable node, either fuse based ("unblown" fuse versus "blown" fuse) or memory based (mem = 1 versus mem = 0)]
PLD Drawings and PLD Implementation Example
• Common way of drawing PLD connections:
– Uses one wire to represent all inputs of an AND (wired AND)
– Uses "x" to represent connection
• Crossing wires are not connected unless "x" is present
[Figure: PLD IC with inputs I1, I2, I3 and output O1; × marks program one AND term as I3*I2']
• Example: Seat belt warning light using SPLD
[Figure: BeltWarn circuit (inputs k, p, s; output w) mapped to a PLD IC; one AND term is programmed as kps' and the other two terms each generate 0]
– Two ways to generate a 0 term
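The programmable AND-OR structure can be sketched in Python by recording which "x" marks are present on each AND term. The data layout below is an assumption made for illustration ("t" for a connection to the true input, "c" for a connection to the complemented input), and only the connect-both-polarities way of making a 0 term is modeled.

```python
from itertools import product

def and_term(connections: dict[str, str], values: dict[str, int]) -> int:
    """AND together every literal that has an "x" on this term's wire."""
    result = 1
    for name, literals in connections.items():
        if "t" in literals:
            result &= values[name]
        if "c" in literals:
            result &= 1 - values[name]
    return result

def pld(terms: list[dict[str, str]], values: dict[str, int]) -> int:
    """OR of all programmed AND terms."""
    output = 0
    for term in terms:
        output |= and_term(term, values)
    return output

BELT_WARN = [
    {"k": "t", "p": "t", "s": "c"},       # kps'
    {"k": "tc", "p": "tc", "s": "tc"},    # every input and its complement: always 0
    {"k": "tc"},                          # k AND k' is also always 0
]

for k, p, s in product((0, 1), repeat=3):
    assert pld(BELT_WARN, {"k": k, "p": p, "s": s}) == (k & p & (1 - s))
print("PLD programming implements w = kps'")
```

Unused AND terms have to be forced to 0 so they do not disturb the OR, which is why the slide shows how a 0 term is generated.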
PLD Extensions
[Figure: (a) Two-output PLD with inputs I1, I2, I3 and outputs O1, O2; (b) PLD with programmable registered outputs: each output has a flip-flop (FF) clocked by clk and a 2x1 mux controlled by a programmable bit]
More on PLDs
• Originally (1970s) known as Programmable Logic Array – PLA
– Had programmable AND and OR arrays
• Monolithic Memories (later acquired by AMD) created "Programmable Array Logic" – "PAL" (trademark)
– Only AND array was programmable (fuse based)
• Lattice Semiconductor Corp. created "Generic Array Logic" – "GAL" (trademark)
– Memory based
• As IC capacities increased, companies put multiple PLD structures on one chip, interconnecting them
– Became known as Complex PLDs (CPLD), and older PLDs became known as Simple PLDs (SPLD)
• GENERALLY SPEAKING, difference of SPLDs vs. CPLDs vs. FPGAs:
– SPLD: tens to hundreds of gates, and usually non-volatile (saves bits without power)
– CPLD: thousands of gates, and usually non-volatile
– FPGA: tens of thousands of gates and more, and usually volatile (but no reason why couldn't be non-volatile)
7.5 Technology Comparisons
[Figure: spectrum of IC technologies, from PLD and FPGA (reprogrammable) through gate array (semicustom) and standard cell (semicustom) to full-custom]
– Toward PLD/FPGA: quicker availability, lower design cost, easier design
– Toward full-custom: faster performance, higher density, lower power, larger chip capacity, more optimized
Technology Comparisons
[Figure: processor varieties (custom processor, programmable processor) plotted against IC technologies (PLD, FPGA, gate array, standard cell, full-custom); each axis runs from easier design to more optimized]
(1): Custom processor in full-custom IC
– Highly optimized
(2): Custom processor in FPGA
– Parallelized circuit, slower IC technology but programmable
(3): Programmable processor in standard cell IC
– Program runs (mostly) sequentially on moderate-costing IC
(4): Programmable processor in FPGA
– Not only can processor be programmed, but FPGA can be programmed to implement multiple processors/coprocessors
Key Trend in Implementation Technologies
• Transistors per IC doubling every 18 months for past three decades
– Known as "Moore's Law"
– Tremendous implications – applications infeasible at one time due to outrageous processing requirements become feasible a few years later
– Can Moore's Law continue – No?
[Figure: Transistors per IC (millions), log scale from 10 to 100,000, plotted for 1997 through 2018 in 3-year steps]
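The arithmetic behind "doubling every 18 months" can be made concrete with a short Python sketch. The function name `growth_factor` is chosen here for illustration, and the calculation simply assumes perfectly regular doubling.

```python
def growth_factor(years: float, doubling_months: float = 18.0) -> float:
    """Multiplicative growth after the given number of years of steady doubling."""
    doublings = years * 12.0 / doubling_months
    return 2.0 ** doublings

# Three decades at one doubling per 18 months is 20 doublings, i.e. 2**20.
print(f"{growth_factor(30):,.0f}x more transistors per IC after 30 years")
print(f"{growth_factor(3):.0f}x after each 3-year step on the chart")
```

A factor of about a million over three decades is why workloads that look infeasible at one point can become routine a few years later.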