Evolution of implementation technologies
Evolution of Implementation Technologies
Discrete devices: relays, transistors (1940s-50s)
Discrete logic gates (1950s-60s)
Integrated circuits (1960s-70s)
Trend toward higher levels of integration
e.g. TTL packages: Data Book for 100’s of different parts
Map your circuit to the Data Book parts
Gate Arrays (IBM 1970s)
“Custom” integrated circuit chips
Design using a library (like TTL)
Transistors are already on the chip
Place and route software puts the chip together
automatically
+ Large circuits on a chip
+ Automatic design tools (no tedious custom layout)
- Only good if you want 1000’s of parts
Xilinx FPGAs - 1
Gate Array Technology (IBM - 1970s)
Simple logic gates
Use transistors to
implement combinational
and sequential logic
Interconnect
Wires to connect inputs and
outputs to logic blocks
I/O blocks
Special blocks at periphery
for external connections
Add wires to make connections
Done when chip is fabbed
“mask-programmable”
Construct any circuit
Programmable Logic
Disadvantages of the Data Book method
Constrained to parts in the Data Book
Parts are necessarily small and standard
Need to stock many different parts
Programmable logic
Use a single chip (or a small number of chips)
Program it for the circuit you want
No reason for the circuit to be small
Programmable Logic Technologies
Fuse and anti-fuse
Fuse makes or breaks link between two wires
Typical connections are 50-300 ohm
One-time programmable (testing before programming?)
Very high density
EPROM and EEPROM
High power consumption
Typical connections are 2K-4K ohm
Fairly high density
RAM-based
Memory bit controls a switch that connects/disconnects two
wires
Typical connections are .5K-1K ohm
Can be programmed and re-programmed in the circuit
Low density
Programmable Logic
Program a connection
Connect two wires
Set a bit to 0 or 1
Regular structures for two-level logic (1960s-70s)
All rely on two-level logic minimization
PROM connections - permanent
EPROM connections - erase with UV light
EEPROM connections - erase electrically
PROMs
Program connections in the _____________ plane
PLAs
Program the connections in the ____________ plane
PALs
Program the connections in the ____________ plane
Making Large Programmable Logic Circuits
Alternative 1 : “CPLD”
Put a lot of PLDs on a chip
Add wires between them whose connections can be
programmed
Use fuse/EEPROM technology
Alternative 2: “FPGA”
Emulate gate array technology
Hence Field Programmable Gate Array
You need:
A way to implement logic gates
A way to connect them together
Field-Programmable Gate Arrays
PALs, PLAs = 10 - 100 Gate Equivalents
Field Programmable Gate Arrays = FPGAs
Altera MAX Family
Actel Programmable Gate Array
Xilinx Logical Cell Array
100 - 1000(s) of Gate Equivalents!
Field-Programmable Gate Arrays
Logic blocks
To implement combinational
and sequential logic
Interconnect
Wires to connect inputs and
outputs to logic blocks
I/O blocks
Special logic blocks at
periphery of device for
external connections
Key questions:
How to make logic blocks programmable?
How to connect the wires?
After the chip has been fabbed
Tradeoffs in FPGAs
Logic block - how are functions implemented: fixed
functions (manipulate inputs) or programmable?
Support complex functions, need fewer blocks, but they are
bigger so fewer of them fit on chip
Support simple functions, need more blocks, but they are
smaller so more of them on chip
Interconnect
How are logic blocks arranged?
How many wires will be needed between them?
Are wires evenly distributed across chip?
Programmability slows wires down – are some wires
specialized to long distances?
How many inputs/outputs must be routed to/from each logic
block?
What utilization are we willing to accept? 50%? 20%? 90%?
Altera EPLD (Erasable Programmable Logic Devices)
Historical Perspective
PALs: same technology as program-once bipolar PROMs
EPLDs: CMOS erasable programmable ROM (EPROM), erased by UV light
Altera building block = MACROCELL
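[Figure: macrocell block diagram. An 8-product-term AND-OR array plus programmable MUXes feeds a sequential logic block clocked through a clock MUX; an output MUX and invert control (programmable polarity) drive the I/O pin pad, and a feedback (F/B) MUX provides programmable feedback into the AND array]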
Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells
[Figure: macrocell clocking options, personalized by EPROM bits. Synchronous mode: flipflop controlled by global clock signal; local signal computes output enable. Asynchronous mode: flipflop controlled by locally generated clock signal]
+ Seq Logic: could be D, T positive or negative edge triggered
+ product term to implement clear function
Altera Multiple Array Matrix (MAX)
AND-OR structures are relatively limited
Cannot share signals/product terms among macrocells
Logic Array Blocks (LABs, similar to macrocells)
Global Routing: Programmable Interconnect Array (PIA)
EPM5128:
8 Fixed Inputs
52 I/O Pins
8 LABs
16 Macrocells/LAB
32 Expanders/LAB
[Figure: LABs A through H arranged around the central PIA]
LAB Architecture
[Figure: LAB block diagram: macrocell array and expander product term array, with inputs from the PIA and an I/O block driving the I/O pads]
Expander Terms shared among all macrocells within the LAB
P22V10 PAL
[Figure: 22V10 logic diagram with fuse numbering (fuse number = first fuse number + increment): ten output logic macrocells, plus asynchronous reset and synchronous preset product terms routed to all registers]
Supports large number of product terms per output
Latches and muxes associated with output pins
Actel Programmable Gate Arrays
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse Technology: Program Once
Use Anti-fuses to build up long wiring runs from short segments
8 input, single output combinational logic blocks
FFs constructed from discrete cross coupled gates
[Figure: rows of logic modules alternating with wiring tracks, surrounded on all four sides by I/O buffers, programming and test logic]
Actel Logic Module
Basic Module is a Modified 4:1 Multiplexer
[Figure: three 2:1 MUXes with data inputs D0-D3, select inputs SOA, SOB, S0, S1 and output Y]
Example: Implementation of S-R Latch
[Figure: S-R latch built from the 2:1 MUXes, with inputs S and R, constants "0" and "1", and output Q]
Actel Interconnect
Interconnection Fabric
[Figure: logic modules connected by horizontal and vertical tracks that are joined by anti-fuses]
Actel Routing Example
[Figure: a route from one logic module output to the inputs of two other logic modules]
Jogs cross an anti-fuse
minimize the # of jogs for speed critical circuits
2 - 3 hops for most interconnections
Xilinx Programmable Gate Arrays
CLB - Configurable Logic Block
5-input, 1 output function
or 2 4-input, 1 output functions
optional register on outputs
Built-in fast carry logic
Can be used as memory
Three types of routing
direct
general-purpose
long lines of various lengths
RAM-programmable
can be reconfigured
[Figure: array of CLBs separated by wiring channels and ringed by IOBs]
[Figure: Xilinx 4000 architecture and CLB detail. Configurable Logic Blocks (CLBs) are joined by programmable interconnect and switch matrices, with I/O Blocks (IOBs) at the periphery (input and output buffers, slew rate control, passive pull-up/pull-down). Inside the CLB, function generators F and G (inputs F1-F4, G1-G4) feed function generator H; control inputs C1-C4 supply H1, DIN, S/R and EC; two D flipflops clocked by K sit alongside the combinational outputs X and Y]
The Xilinx 4000 CLB
Two 4-input functions, registered output
5-input function, combinational output
CLB Used as RAM
Fast Carry Logic
Xilinx 4000 Interconnect
Switch Matrix
Xilinx 4000 Interconnect Details
Global Signals - Clock, Reset, Control
Xilinx 4000 IOB
Xilinx FPGA Combinational Logic Examples
Key: General functions are limited to 5 inputs
(4 even better - 1/2 CLB)
No limitation on function complexity
Example
2-bit comparator:
A B = C D and A B > C D implemented with 1 CLB
(GT) F = A C' + A B D' + B C' D'
(EQ) G = A'B'C'D'+ A'B C'D + A B'C D'+ A B C D
Can implement some functions of > 5 input
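The two comparator equations can be checked exhaustively. The sketch below is only an illustration (the function names are made up here); it evaluates GT and EQ exactly as written on the slide and compares them with an integer comparison of the 2-bit numbers AB and CD.

```python
from itertools import product

def gt(a, b, c, d):
    """GT from the slide: F = A C' + A B D' + B C' D'."""
    return bool((a and not c) or (a and b and not d) or (b and not c and not d))

def eq(a, b, c, d):
    """EQ from the slide: G = A'B'C'D' + A'B C'D + A B'C D' + A B C D."""
    return bool((not a and not b and not c and not d) or
                (not a and b and not c and d) or
                (a and not b and c and not d) or
                (a and b and c and d))

def check_comparator():
    """Return True if both equations agree with integer comparison."""
    for a, b, c, d in product([0, 1], repeat=4):
        ab, cd = 2 * a + b, 2 * c + d
        if gt(a, b, c, d) != (ab > cd) or eq(a, b, c, d) != (ab == cd):
            return False
    return True

print(check_comparator())
```

Both outputs depend on the same four inputs, which is why they fit the two 4-input function generators of a single CLB.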
Xilinx FPGA Combinational Logic
Examples
N-input majority function: 1 whenever n/2 or more inputs are 1
N-input parity functions: 5 input/1 CLB; 2 levels yield 25 inputs!
[Figures: 5-input majority circuit, 7-input majority circuit, and 9-input parity logic, each mapped onto CLBs]
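The "2 levels yield 25 inputs" claim for parity can be illustrated with a small model. This is a sketch under the assumption that each CLB is used as one 5-input lookup table; the helper names are hypothetical.

```python
def make_parity_lut(k=5):
    """Truth table of a k-input XOR, indexed by the inputs read as an integer."""
    return [bin(i).count("1") & 1 for i in range(2 ** k)]

def lut_eval(lut, bits):
    """Look up a LUT output; bits[0] is the most significant address bit."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return lut[index]

def parity25(bits):
    """Five first-level 5-input LUTs feed one second-level 5-input LUT."""
    if len(bits) != 25:
        raise ValueError("expected 25 input bits")
    lut = make_parity_lut(5)
    level1 = [lut_eval(lut, bits[i:i + 5]) for i in range(0, 25, 5)]
    return lut_eval(lut, level1)

print(parity25([1] * 25))  # odd number of ones
```

In this model the 25-input parity function uses six LUTs and two LUT delays.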
Xilinx FPGA Adder Example
Example
2-bit binary adder - inputs: A1, A0, B1, B0, CIN
outputs: S0, S1, Cout
[Figure: a 4-bit adder (inputs A3-A0, B3-B0, Cin; outputs S3-S0, Cout) built two ways from CLBs]
Full Adder, 4 CLB delays to final carry out
2 x Two-bit Adders (3 CLBs each) yields 2 CLBs to final carry out
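A sketch of the two-bit-adder approach, assuming each of the three outputs (S0, S1, Cout) is stored as one 5-input lookup table, so that one 2-bit slice costs three CLBs. The table layout and names are illustrative only.

```python
from itertools import product

def two_bit_sum(a1, a0, b1, b0, cin):
    """Arithmetic value of the 2-bit addition, in the range 0..7."""
    return 2 * a1 + a0 + 2 * b1 + b0 + cin

def build_lut5(bit):
    """32-entry truth table for one output bit; a1 is the top address bit."""
    return [(two_bit_sum(*bits) >> bit) & 1 for bits in product([0, 1], repeat=5)]

S0_LUT, S1_LUT, COUT_LUT = build_lut5(0), build_lut5(1), build_lut5(2)

def lookup(lut, a1, a0, b1, b0, cin):
    return lut[(a1 << 4) | (a0 << 3) | (b1 << 2) | (b0 << 1) | cin]

def two_bit_adder(a1, a0, b1, b0, cin):
    """Return (s1, s0, cout) using only LUT lookups."""
    args = (a1, a0, b1, b0, cin)
    return lookup(S1_LUT, *args), lookup(S0_LUT, *args), lookup(COUT_LUT, *args)

def add4(a, b, cin=0):
    """Chain two 2-bit slices: the carry passes through two LUT levels."""
    s1, s0, c2 = two_bit_adder((a >> 1) & 1, a & 1, (b >> 1) & 1, b & 1, cin)
    s3, s2, cout = two_bit_adder((a >> 3) & 1, (a >> 2) & 1,
                                 (b >> 3) & 1, (b >> 2) & 1, c2)
    return (cout << 4) | (s3 << 3) | (s2 << 2) | (s1 << 1) | s0

print(add4(9, 7))
```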
Computer-Aided Design
Can't design FPGAs by hand
Way too much logic to manage, hard to make changes
Hardware description languages
Specify functionality of logic at a high level
Validation: high-level simulation to catch specification
errors
Verify pin-outs and connections to other system components
Low-level to verify mapping and check performance
Logic synthesis
Process of compiling HDL program into logic gates and flip-flops
Technology mapping
Map the logic onto elements available in the implementation
technology (LUTs for Xilinx FPGAs)
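As a rough illustration of what technology mapping targets, a lookup table can be modeled as a list of configuration bits addressed by the inputs. The sketch below "programs" a 4-input LUT with the GT function of the 2-bit comparator from the combinational logic examples; `program_lut` and `lut_read` are hypothetical names, not part of any Xilinx tool.

```python
def program_lut(func, n_inputs):
    """Return the 2**n configuration bits that make a LUT compute func."""
    bits = []
    for address in range(2 ** n_inputs):
        # The first argument of func is the most significant address bit.
        inputs = [(address >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        bits.append(int(bool(func(*inputs))))
    return bits

def lut_read(config_bits, *inputs):
    """Evaluate the LUT: the inputs simply form a memory address."""
    address = 0
    for value in inputs:
        address = (address << 1) | value
    return config_bits[address]

def gt(a, b, c, d):
    """F = A C' + A B D' + B C' D' (AB > CD)."""
    return (a and not c) or (a and b and not d) or (b and not c and not d)

GT_LUT = program_lut(gt, 4)
print(GT_LUT)
```

Any function of four inputs costs the same 16 bits, which is why LUT-based blocks place no limit on function complexity, only on the number of inputs.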
CAD Tool Path (cont’d)
Placement and routing
Assign logic blocks to functions
Make wiring connections
Timing analysis - verify paths
Determine delays as routed
Look at critical paths and ways to improve
Partitioning and constraining
If design does not fit or is unroutable as placed, split into
multiple chips
If design is too slow, prioritize critical paths, fix placement
of cells, etc.
Few tools to help with these tasks exist today
Generate programming files - bits to be loaded into
chip for configuration
Xilinx CAD Tools
Verilog (or VHDL) used to specify logic at a high level
Combine with schematics, library components
Synopsys
Compiles Verilog to logic
Maps logic to the FPGA cells
Optimizes logic
Xilinx APR - automatic place and route (simulated
annealing)
Provides controllability through constraints
Handles global signals
Xilinx Xdelay - measure delay properties of mapping and
aid in iteration
Xilinx XACT - design editor to view final mapping results
Applications of FPGAs
Implementation of random logic
Easier changes at system-level (one device is modified)
Can eliminate need for full-custom chips
Prototyping
Ensemble of gate arrays used to emulate a circuit to be
manufactured
Get more/better/faster debugging done than with simulation
Reconfigurable hardware
One hardware block used to implement more than one function
Functions must be mutually-exclusive in time
Can greatly reduce cost while enhancing flexibility
RAM-based only option
Special-purpose computation engines
Hardware dedicated to solving one problem (or class of problems)
Accelerators attached to general-purpose computers
Implementation Strategies
ROM-based Design
Example: BCD to Excess 3 Serial Converter
Conversion Process
Bits are presented in bit serial fashion
starting with the least significant bit
Single input X, single output Z
BCD     Excess 3 Code
0000    0011
0001    0100
0010    0101
0011    0110
0100    0111
0101    1000
0110    1001
0111    1010
1000    1011
1001    1100
Implementation Strategies
State Transition Table
Present State    Next State      Output
                 X=0    X=1      X=0    X=1
S0               S1     S2       1      0
S1               S3     S4       1      0
S2               S4     S4       0      1
S3               S5     S5       0      1
S4               S5     S6       1      0
S5               S0     S0       0      1
S6               S0     --       1      --
Derived State Diagram
[Figure: state diagram with states S0-S6; Reset enters S0; each arc is labeled X/Z]
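The state transition table can be exercised directly. The sketch below is a behavioral model only (the dictionary layout and the `convert` helper are illustrative); it feeds each BCD digit in least significant bit first and collects the serial output.

```python
# TABLE[state][x] = (next_state, z); S6 with X=1 cannot occur for valid BCD.
TABLE = {
    "S0": {0: ("S1", 1), 1: ("S2", 0)},
    "S1": {0: ("S3", 1), 1: ("S4", 0)},
    "S2": {0: ("S4", 0), 1: ("S4", 1)},
    "S3": {0: ("S5", 0), 1: ("S5", 1)},
    "S4": {0: ("S5", 1), 1: ("S6", 0)},
    "S5": {0: ("S0", 0), 1: ("S0", 1)},
    "S6": {0: ("S0", 1)},
}

def convert(bcd):
    """Run one 4-bit BCD digit through the FSM, LSB first."""
    state, result = "S0", 0
    for i in range(4):
        x = (bcd >> i) & 1
        state, z = TABLE[state][x]
        result |= z << i
    if state != "S0":
        raise RuntimeError("FSM did not return to S0 after four bits")
    return result

for digit in range(10):
    print(format(digit, "04b"), "->", format(convert(digit), "04b"))
```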
Implementation Strategies
ROM-based Implementation
ROM Address      ROM Outputs
X Q2 Q1 Q0       Z D2 D1 D0
0  0  0  0       1  0  0  1
0  0  0  1       1  0  1  1
0  0  1  0       0  1  0  0
0  0  1  1       0  1  0  1
0  1  0  0       1  1  0  1
0  1  0  1       0  0  0  0
0  1  1  0       1  0  0  0
0  1  1  1       X  X  X  X
1  0  0  0       0  0  1  0
1  0  0  1       0  1  0  0
1  0  1  0       1  1  0  0
1  0  1  1       1  1  0  1
1  1  0  0       0  1  1  0
1  1  0  1       1  0  0  0
1  1  1  0       X  X  X  X
1  1  1  1       X  X  X  X
[Figure: converter ROM with address inputs X, Q2, Q1, Q0 and outputs Z, D2, D1, D0; D2-D0 are registered in a 74175 clocked by CLK and cleared by \Reset, and fed back as Q2-Q0]
Circuit Level Realization
74175 = 4 x positive edge triggered D FFs
Truth Table/ROM I/Os
In ROM-based designs, no need to consider state assignment
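A minimal sketch of the ROM-based realization: the ROM is a 16-word list addressed by X Q2 Q1 Q0, each word holding Z D2 D1 D0, and the state register is a plain variable. States use the straightforward encoding S0 = 000 through S6 = 110, as in the truth table; unused words are marked `None` here as an arbitrary way of representing the don't-care rows.

```python
# Address = X Q2 Q1 Q0, word = Z D2 D1 D0.
ROM = [
    0b1001, 0b1011, 0b0100, 0b0101, 0b1101, 0b0000, 0b1000, None,  # X = 0
    0b0010, 0b0100, 0b1100, 0b1101, 0b0110, 0b1000, None, None,    # X = 1
]

def rom_convert(bcd):
    """Clock four input bits (LSB first) through the ROM plus state register."""
    state, result = 0, 0
    for i in range(4):
        x = (bcd >> i) & 1
        word = ROM[(x << 3) | state]
        if word is None:
            raise ValueError("unused ROM word addressed")
        result |= (word >> 3) << i   # Z is the top bit of the word
        state = word & 0b111         # D2 D1 D0 become the next Q2 Q1 Q0
    return result

print([rom_convert(d) for d in range(10)])
```

Because the ROM stores the whole truth table, changing the state encoding only permutes the words; it does not change the ROM size.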
Implementation Strategies
Timing Behavior for input strings 0 0 0 0 (0) and 1 1 1 0 (7)
[Figure: timing waveforms, LSB first: input 0000 produces output 1100; input 1110 produces output 0101]
Implementation Strategies
PLA-based Design
State Assignment with NOVA
NOVA input file
0 S0 S1 1
1 S0 S2 0
0 S1 S3 1
1 S1 S4 0
0 S2 S4 0
1 S2 S4 1
0 S3 S5 0
1 S3 S5 1
0 S4 S5 1
1 S4 S6 0
0 S5 S0 0
1 S5 S0 1
0 S6 S0 1
NOVA derived state assignment
S0 = 000
S1 = 001
S2 = 011
S3 = 110
S4 = 100
S5 = 111
S6 = 101
9 product term implementation
Implementation Strategies
Espresso Inputs
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 16
0 000 001 1
1 000 011 0
0 001 110 1
1 001 100 0
0 011 100 0
1 011 100 1
0 110 111 0
1 110 111 1
0 100 111 1
1 100 101 0
0 111 000 0
1 111 000 1
0 101 000 1
1 101 --- -
0 010 --- -
1 010 --- -
.e
Espresso Outputs
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 9
0001 0100
10-0 0100
01-0 0100
1-1- 0001
-0-1 1000
0-0- 0001
-1-0 1000
--10 0100
---0 0010
.e
Implementation Strategies
D2 = Q2' • Q0 + Q2 • Q0'
D1 = X' • Q2' • Q1' • Q0 + X • Q2' • Q0' + X' • Q2 • Q0' + Q1 • Q0'
D0 = Q0'
Z = X • Q1 + X' • Q1'
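The reduced equations can be checked by simulation. In the sketch below the placement of complemented literals is read off the nine-term Espresso cover (it also matches the ABEL listings), since overbars do not survive in plain text; the function names are illustrative.

```python
def pla_step(x, q2, q1, q0):
    """Evaluate the four PLA outputs (D2, D1, D0, Z) for one clock period."""
    n = lambda v: 1 - v  # complement of a single bit
    d2 = (n(q2) & q0) | (q2 & n(q0))
    d1 = ((n(x) & n(q2) & n(q1) & q0) | (x & n(q2) & n(q0)) |
          (n(x) & q2 & n(q0)) | (q1 & n(q0)))
    d0 = n(q0)
    z = (x & q1) | (n(x) & n(q1))
    return d2, d1, d0, z

def pla_convert(bcd):
    """Feed a BCD digit LSB first, starting from S0 = 000."""
    q2 = q1 = q0 = 0
    result = 0
    for i in range(4):
        x = (bcd >> i) & 1
        q2, q1, q0, z = pla_step(x, q2, q1, q0)
        result |= z << i
    return result, (q2, q1, q0)

for digit in range(10):
    print(digit, pla_convert(digit))
```

Every valid digit should return the machine to state 000 after four clocks, which is a quick sanity check on the NOVA state assignment.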
[Figure: converter PLA with inputs X, Q2, Q1, Q0 and outputs Z, D2, D1, D0; D2-D0 are registered in a 74175 clocked by CLK and cleared by \Reset]
Implementation Strategies
10H8 PAL: 10 inputs, 8 outputs, 2 product terms per OR gate
D1 = D11 + D12
D11 = X' • Q2' • Q1' • Q0 + X • Q2' • Q0'
D12 = X' • Q2 • Q0' + Q1 • Q0'
Product term assignment:
0. Q2' • Q0
1. Q2 • Q0'
8. X' • Q2' • Q1' • Q0
9. X • Q2' • Q0'
16. X' • Q2 • Q0'
17. Q1 • Q0'
24. D11
25. D12
32. Q0'
33. not used
40. X • Q1
41. X' • Q1'
[Figure: 10H8 AND array, columns 0-31, with inputs X, Q2, Q1, Q0, D11, D12 and OR-gate outputs D2 (terms 0, 1), D11 (8, 9), D12 (16, 17), D1 (24, 25), D0 (32, 33), Z (40, 41)]
Implementation Strategies
[Figure: the same 10H8 AND array with the connections programmed for this design]
Implementation Strategies
Buffered Input
or product term
Registered PAL Architecture
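[Figure: registered PAL output cell. A D flipflop clocked by CLK, with output enable OE, registers the sum of products for D2 to give Q2+; Q2 and Q0 are fed back into the AND array in true and complemented form (negative logic feedback)]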
D1 = X • Q2 • Q1 • Q0 + X • Q2 + X • Q0 + Q2 • Q0 + Q1 • Q0
D2 = Q2 • Q0 + Q2 • Q0
D0 = Q0
Z = X • Q1 + X • Q1
Implementation Strategies
Programmable Output Polarity/XOR PALs
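[Figure: PAL output cell with flipflop (CLK, OE) and an XOR gate for programmable output polarity]
Buried Registers: decouple FF from the output pin
Advantage of XOR PALs: Parity and Arithmetic Operations
[Figure: parity of A, B, C, D needs eight 4-literal product terms in a plain AND-OR array, but only the two-literal terms for A xor B and C xor D when an XOR gate combines them]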
Implementation Strategies
Example of XOR PAL
Example of Registered PAL
[Figures: fuse maps of an XOR PAL and a registered PAL, showing the first fuse number of each product term row and the increment of each input column]
NOTE: FUSE NUMBER = FIRST FUSE NUMBER + INCREMENT
Specifying PALs with ABEL
P10H8 PAL
module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p10h8';
"Input Pins
X,Q2,Q1,Q0,D11i,D12i pin 1,2,3,4,5,6;
"Output Pins
D2,D11o,D12o,D1,D0,Z pin 19,18,17,16,15,14;
INSTATE = [Q2, Q1, Q0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];
"Explicit equations for partitioned output functions
equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = D11i # D12i;
D11o = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0);
D12o = (!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;
Specifying PALs with ABEL
P12H6 PAL
module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p12h6';
"Input Pins
X, Q2, Q1, Q0 pin 1, 2, 3, 4;
"Output Pins
D2, D1, D0, Z pin 17, 18, 16, 15;
INSTATE = [Q2, Q1, Q0]; OUTSTATE = [D2, D1, D0];
S0in = [0, 0, 0]; S0out = [0, 0, 0];
S1in = [0, 0, 1]; S1out = [0, 0, 1];
S2in = [0, 1, 1]; S2out = [0, 1, 1];
S3in = [1, 1, 0]; S3out = [1, 1, 0];
S4in = [1, 0, 0]; S4out = [1, 0, 0];
S5in = [1, 1, 1]; S5out = [1, 1, 1];
S6in = [1, 0, 1]; S6out = [1, 0, 1];
"Simpler equations
equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0) #
(!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;
Specifying PALs with ABEL
P16R4 PAL
module bcd2excess3
title 'BCD to Excess 3 Code Converter'
u1 device 'p16r4';
"Input Pins
Clk, Reset, X, !OE pin 1, 2, 3, 11;
"Output Pins
D2, D1, D0, Z pin 14, 15, 16, 13;
SREG = [D2, D1, D0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];
state_diagram SREG
state S0: if Reset then S0
else if X then S2 with Z = 0
else S1 with Z = 1
state S1: if Reset then S0
else if X then S4 with Z = 0
else S3 with Z = 1
state S2: if Reset then S0
else if X then S4 with Z = 1
else S4 with Z = 0
state S3: if Reset then S0
else if X then S5 with Z = 1
else S5 with Z = 0
state S4: if Reset then S0
else if X then S6 with Z = 0
else S5 with Z = 1
state S5: if Reset then S0
else if X then S0 with Z = 1
else S0 with Z = 0
state S6: if Reset then S0
else if !X then S0 with Z = 1
end bcd2excess3;
FSM Design with Counters
Synchronous Counters: CLR, LD, CNT
Four kinds of transitions for each state:
(1) to State 0 (CLR)
(2) to next state in sequence (CNT)
(3) to arbitrary next state (LD)
(4) loop in current state (no signals asserted)
[Figure: state n with arcs CLR to state 0, CNT to state n+1, LD to state m, and a self-loop when no signals are asserted]
Careful state assignment is needed to reflect basic sequencing
of the counter
FSM Design with Counters
Excess 3 Converter Revisited
[Figure: the state diagram with states renumbered 0-6; Reset enters state 0; arcs labeled X/Z]
Note the sequential nature of the state assignments
FSM Design with Counters
Excess 3 Converter
Inputs/Current State   Next State     Outputs
X Q2 Q1 Q0             Q2+ Q1+ Q0+    Z CLR LD EN C B A
0  0  0  0              0   0   1     1  1  1  1  X X X
0  0  0  1              0   1   0     1  1  1  1  X X X
0  0  1  0              0   1   1     0  1  1  1  X X X
0  0  1  1              0   0   0     0  0  X  X  X X X
0  1  0  0              1   0   1     0  1  1  1  X X X
0  1  0  1              0   1   1     1  1  0  X  0 1 1
0  1  1  0              0   0   0     1  0  X  X  X X X
0  1  1  1              X   X   X     X  X  X  X  X X X
1  0  0  0              1   0   0     0  1  0  X  1 0 0
1  0  0  1              1   0   1     0  1  0  X  1 0 1
1  0  1  0              0   1   1     1  1  1  1  X X X
1  0  1  1              0   0   0     1  0  X  X  X X X
1  1  0  0              1   0   1     1  1  1  1  X X X
1  1  0  1              1   1   0     0  1  1  1  X X X
1  1  1  0              X   X   X     X  X  X  X  X X X
1  1  1  1              X   X   X     X  X  X  X  X X X
CLR signal dominates LD which dominates Count
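The priority rule can be made concrete with a small behavioral model of a synchronous counter with active-low clear and load. The control values below follow the Espresso input file for this design; don't-care entries are filled with inactive values, which is an arbitrary choice made only for this sketch.

```python
def counter_next(q, clr_n, ld_n, en, load_value):
    """One clock edge: clear dominates load, which dominates count."""
    if clr_n == 0:
        return 0
    if ld_n == 0:
        return load_value
    if en:
        return (q + 1) % 16
    return q

# (X, state) -> (Z, CLR_n, LD_n, EN, load value CBA)
CONTROL = {
    (0, 0): (1, 1, 1, 1, 0), (1, 0): (0, 1, 0, 0, 4),
    (0, 1): (1, 1, 1, 1, 0), (1, 1): (0, 1, 0, 0, 5),
    (0, 2): (0, 1, 1, 1, 0), (1, 2): (1, 1, 1, 1, 0),
    (0, 3): (0, 0, 1, 0, 0), (1, 3): (1, 0, 1, 0, 0),
    (0, 4): (0, 1, 1, 1, 0), (1, 4): (1, 1, 1, 1, 0),
    (0, 5): (1, 1, 0, 0, 3), (1, 5): (0, 1, 1, 1, 0),
    (0, 6): (1, 0, 1, 0, 0),
}

def counter_convert(bcd):
    """BCD to Excess 3, LSB first, with the counter as the state register."""
    q, result = 0, 0
    for i in range(4):
        x = (bcd >> i) & 1
        z, clr_n, ld_n, en, load_value = CONTROL[(x, q)]
        result |= z << i
        q = counter_next(q, clr_n, ld_n, en, load_value)
    return result, q

print([counter_convert(d)[0] for d in range(10)])
```

Only three transitions need a load (0 to 4, 1 to 5, 5 to 3); the rest are counts or clears, which is the payoff of the sequential state numbering.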
Implementing FSMs with Counters
Espresso Input File
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 17
1---- -0-----
00000 1111---
00001 1111---
00010 0111---
00011 00-----
00100 0111---
00101 110-011
00110 10-----
00111 -------
01000 010-100
01001 010-101
01010 1111---
01011 10-----
01100 1111---
01101 0111---
01110 -------
01111 -------
.e
Espresso Output File
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 10
0-001 0101101
-0-01 1000000
-11-0 1000000
0-0-0 0101100
-000- 1010000
-0--0 0010000
0-10- 0101011
--11- 1000000
-11-- 0010000
-1-1- 1010000
.e
FSM Implementation with Counters
Excess 3 Converter Schematic
[Figure: excess 3 PLA with inputs X, Reset, Q2, Q1, Q0 and outputs Z, \CLR, \LD, EN, C, B, A, driving a 74163 counter whose outputs feed back as the state]
Synchronous Output Register
[Figure: a D flipflop registers Z]
Implementation Strategies
Xilinx LCA Architecture
Implementing the BCD to Excess 3 FSM
Q2+ = Q2' • Q0 + Q2 • Q0'
Q1+ = X' • Q2' • Q1' • Q0 + X • Q2' • Q0' + X' • Q2 • Q0' + Q1 • Q0'
Q0+ = Q0'
Z = X • Q1 + X' • Q1'
No function more complex than 4 variables
4 FFs implies 2 CLBs
Synchronous Mealy Machine
Global Reset to be used
Place Q2+, Q0+ in one CLB
Q1+, Z in second CLB
maximize use of direct & general purpose interconnections
Implementing the BCD to Excess 3 FSM
[Figure: two-CLB mapping. CLB1 and CLB2 each use function generators (FG) with inputs drawn from X, Q2, Q1, Q0, flipflops clocked by Clk with clock enable (CE), and the global reset (RES); outputs Q2, Q0, Q1 and Z]
Design Case Study
Traffic Light Controller
Decomposition into primitive subsystems
• Controller FSM
next state/output functions
state register
• Short time/long time interval counter
• Car Sensor
• Output Decoders and Traffic Lights
Design Case Study
Traffic Light Controller
Block Diagram
[Figure: block diagram. The short time/long time counter (Reset, Clk, ST in; TS, TL out) and the car sensor (C async in, C sync out) feed the controller FSM (next state/output logic plus state register); encoded light signals H and F go to the light decoders]
Design Case Study
Subsystem Logic
[Figure: subsystem logic. Car detector: flipflops synchronize the Present signal to produce C. Interval timer: 74163 counter with signals ST, TS and TL. Light decoders: 74139 decoders turn H1, H0 into HG, HY, HR and F1, F0 into FG, FY, FR]
Design Case Study
State Assignment: HG = 00, HY = 10, FG = 01, FY = 11
P1 = C TL Q1' + TS' Q1 Q0' + C' Q1' Q0 + TS' Q1 Q0
P0 = TS Q1 Q0' + Q1' Q0 + TS' Q1 Q0
ST = C TL Q1' + C' Q1' Q0 + TS Q1 Q0' + TS Q1 Q0
HL[1] = TS Q1 Q0 + Q1 Q0 + TS Q1 Q0
HL[0] = TS Q1 Q0 + TS Q1 Q0
Next State Logic
FL[1] = Q0
FL[0] = TS Q1 Q0 + TS Q1 Q0
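Overbars do not survive in plain text, so the sketch below makes one consistent assignment of complemented literals for P1, P0 and ST explicit and checks it against the controller's transitions (HG leaves on TL•C, HY on TS, FG on TL + C', FY on TS, each asserting ST). The complement placement is a reconstruction from those transitions and the state assignment, not a quotation from the slide.

```python
from itertools import product

# State encoding (Q1, Q0) from the slide: HG = 00, HY = 10, FG = 01, FY = 11.
HG, HY, FG, FY = (0, 0), (1, 0), (0, 1), (1, 1)

def diagram_step(state, c, tl, ts):
    """Next state and ST taken directly from the state transitions."""
    if state == HG:
        go = tl and c
        return (HY if go else HG), int(go)
    if state == HY:
        return (FG if ts else HY), int(ts)
    if state == FG:
        go = tl or not c
        return (FY if go else FG), int(go)
    return (HG if ts else FY), int(ts)

def equation_step(state, c, tl, ts):
    """Next state and ST from the sum-of-products equations."""
    q1, q0 = state
    n = lambda v: 1 - v
    p1 = (c & tl & n(q1)) | (n(ts) & q1 & n(q0)) | (n(c) & n(q1) & q0) | (n(ts) & q1 & q0)
    p0 = (ts & q1 & n(q0)) | (n(q1) & q0) | (n(ts) & q1 & q0)
    st = (c & tl & n(q1)) | (n(c) & n(q1) & q0) | (ts & q1 & n(q0)) | (ts & q1 & q0)
    return (p1, p0), st

def equations_match_diagram():
    for state in (HG, HY, FG, FY):
        for c, tl, ts in product([0, 1], repeat=3):
            if diagram_step(state, c, tl, ts) != equation_step(state, c, tl, ts):
                return False
    return True

print(equations_match_diagram())
```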
PAL/PLA Implementation:
5 inputs, 7 outputs, 8 product terms
PAL 22V10 -- 11 inputs, 10 prog. IOs, 8 to 14 prod terms per OR
ROM Implementation:
32 word by 8-bit ROM (256 bits)
Reset may double ROM size
Design Case Study
Counter-based Implementation
HG: TL•C / ST
HY: TS / ST
FG: TL+C' / ST
FY: TS / ST
ST = Count
[Figure: 74153 (2 x 4:1 MUX) with select inputs from the state bits and data inputs from TS, TL, C and \C, driving ST into a 74163 counter cleared by \Reset]
TTL Implementation with MUX and Counter
Can we reduce package count by using an 8:1 MUX?
Design Case Study
Counter-based Implementation
Dispense with direct output functions for the traffic lights
Why not simply decode from the current state?
[Figure: 74139 decoder driven by Q1, Q0 producing HG, HY, HR and FG, FY, FR]
ST is a Synchronous Mealy Output
Light Controllers are Moore Outputs
Design Case Study
LCA-Based Implementation
Discrete Gate Method:
None of the functions exceed 5 variables
P1, ST are 5 variable (1 CLB each)
P0, HL1, HL0, FL0 are 3 variable (1/2 CLB each)
FL1 is 1 variable (1/2 CLB)
4 1/2 CLBs total!
Design Case Study
LCA-Based Implementation
Placement of functions selected to maximize the use of direct connections
[Figure: CLB placement for the functions Q1, Q0, ST, H1, H0, F1 and F0, with inputs TL, C and TS]
Design Case Study
LCA-Based Implementation
Counter/Multiplexer Method:
4:1 MUX, 2 Bit Upcounter
MUX: six variables (4 data, 2 control)
but this is the kind of 6 variable function that can be
implemented in 1 CLB!
2nd CLB to implement TL • C and TL + C'
But note that ST/Cnt is really a function of TL, C, TS, Q1, Q0
1 CLB to implement this function of 5 variables!
2 Bit Counter: 2 functions of 3 variables (2 bit state + count)
Also implemented in one CLB
Traffic light decoders: functions of 2 variables (Q1, Q0)
2 per CLB = 3 CLB for the six lights
Total count = 5 CLBs