Evolution of implementation technologies

Evolution of Implementation Technologies
• Discrete devices: relays, transistors (1940s-50s)
• Discrete logic gates (1950s-60s)
• Integrated circuits (1960s-70s)
    e.g. TTL packages: Data Book for 100s of different parts
    Map your circuit to the Data Book parts
• Gate Arrays (IBM 1970s)
    “Custom” integrated circuit chips
    Design using a library (like TTL)
    Transistors are already on the chip
    Place and route software puts the chip together automatically
    + Large circuits on a chip
    + Automatic design tools (no tedious custom layout)
    - Only good if you want 1000s of parts
Trend toward higher levels of integration
Xilinx FPGAs - 1
Gate Array Technology (IBM - 1970s)
• Simple logic gates
    Use transistors to implement combinational and sequential logic
• Interconnect
    Wires to connect inputs and outputs to logic blocks
• I/O blocks
    Special blocks at periphery for external connections
• Add wires to make connections
    Done when chip is fabbed
    “mask-programmable”
    Construct any circuit
Programmable Logic
• Disadvantages of the Data Book method
    Constrained to parts in the Data Book
    Parts are necessarily small and standard
    Need to stock many different parts
• Programmable logic
    Use a single chip (or a small number of chips)
    Program it for the circuit you want
    No reason for the circuit to be small
Programmable Logic Technologies
• Fuse and anti-fuse
    Fuse makes or breaks link between two wires
    Typical connections are 50-300 ohm
    One-time programmable (testing before programming?)
    Very high density
• EPROM and EEPROM
    High power consumption
    Typical connections are 2K-4K ohm
    Fairly high density
• RAM-based
    Memory bit controls a switch that connects/disconnects two wires
    Typical connections are 0.5K-1K ohm
    Can be programmed and re-programmed in the circuit
    Low density
Programmable Logic
• Program a connection
    Connect two wires
    Set a bit to 0 or 1
• Regular structures for two-level logic (1960s-70s)
    All rely on two-level logic minimization
    PROM connections - permanent
    EPROM connections - erase with UV light
    EEPROM connections - erase electrically
    PROMs
        Program connections in the _____________ plane
    PLAs
        Program the connections in the ____________ plane
    PALs
        Program the connections in the ____________ plane
Making Large Programmable Logic Circuits
• Alternative 1: “CPLD”
    Put a lot of PLDs on a chip
    Add wires between them whose connections can be programmed
    Use fuse/EEPROM technology
• Alternative 2: “FPGA”
    Emulate gate array technology
    Hence Field Programmable Gate Array
    You need:
        A way to implement logic gates
        A way to connect them together
Field-Programmable Gate Arrays
• PALs, PLAs = 10 - 100 Gate Equivalents
• Field Programmable Gate Arrays = FPGAs
    Altera MAX Family
    Actel Programmable Gate Array
    Xilinx Logic Cell Array
• 100 - 1000(s) of Gate Equivalents!
Field-Programmable Gate Arrays
• Logic blocks
    To implement combinational and sequential logic
• Interconnect
    Wires to connect inputs and outputs to logic blocks
• I/O blocks
    Special logic blocks at periphery of device for external connections
• Key questions:
    How to make logic blocks programmable?
    How to connect the wires?
    After the chip has been fabbed
Tradeoffs in FPGAs
• Logic block - how are functions implemented: fixed functions (manipulate inputs) or programmable?
    Support complex functions, need fewer blocks, but they are bigger so fewer of them on chip
    Support simple functions, need more blocks, but they are smaller so more of them on chip
• Interconnect
    How are logic blocks arranged?
    How many wires will be needed between them?
    Are wires evenly distributed across chip?
    Programmability slows wires down; are some wires specialized to long distances?
    How many inputs/outputs must be routed to/from each logic block?
    What utilization are we willing to accept? 50%? 20%? 90%?
Altera EPLD (Erasable Programmable Logic Devices)
• Historical Perspective
    PALs: same technology as programmed-once bipolar PROM
    EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light
• Altera building block = MACROCELL
    8 Product Term AND-OR Array + Programmable MUXes
[Figure: Altera macrocell. Labels: CLK, Clk MUX, AND Array, Seq. Logic Block, Output MUX, Invert Control (programmable polarity), F/B MUX (programmable feedback), I/O Pin pad.]
Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells
Personalized by EPROM bits:
    Synchronous Mode: flipflop controlled by global clock signal; local signal computes output enable
    Asynchronous Mode: flipflop controlled by locally generated clock signal
[Figure: the Clk MUX in each mode. Labels: Global CLK, OE/Local CLK, EPROM Cell, Q.]
+ Seq Logic: could be D, T positive or negative edge triggered
+ product term to implement clear function
Altera Multiple Array Matrix (MAX)
AND-OR structures are relatively limited
Cannot share signals/product terms among macrocells
[Figure: Logic Array Blocks (similar to macrocells), LAB A through LAB H, around a Programmable Interconnect Array (PIA) that provides global routing.]
EPM5128:
    8 Fixed Inputs
    52 I/O Pins
    8 LABs
    16 Macrocells/LAB
    32 Expanders/LAB
LAB Architecture
[Figure: LAB architecture. Labels: Inputs, PIA, Macrocell Array, Expander Product Term Array, Macrocell P-Terms, Expander P-Terms, I/O Block, I/O Pads.]
Expander Terms shared among all macrocells within the LAB
P22V10 PAL
[Figure: P22V10 PAL logic diagram with fuse map (first fuse numbers and increments), ten output logic macrocells on pins 14 to 23, asynchronous reset (to all registers) and synchronous preset (to all registers).]
Supports large number of product terms per output
Latches and muxes associated with output pins
Actel Programmable Gate Arrays
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse Technology: Program Once
Use Anti-fuses to build up long wiring runs from short segments
8 input, single output combinational logic blocks
FFs constructed from discrete cross coupled gates
[Figure: rows of Logic Modules separated by Wiring Tracks, with I/O Buffers, Programming and Test Logic on all four sides.]
Actel Logic Module
Basic Module is a Modified 4:1 Multiplexer
[Figure: three 2:1 MUXes with data inputs D0, D1, D2, D3, select inputs SOA, S0, S1, SOB, and output Y.]
Example: Implementation of S-R Latch
[Figure: the same three 2:1 MUXes with constant "0" and "1" inputs, latch inputs S and R, and output Q.]
Actel Interconnect
Interconnection Fabric
[Figure: Logic Modules, Horizontal Tracks, Vertical Tracks, and the Anti-fuses that join them.]
Actel Routing Example
[Figure: a route from one Logic Module Output to the Inputs of other Logic Modules.]
Jogs cross an anti-fuse
minimize the # of jogs for speed critical circuits
2 - 3 hops for most interconnections
Xilinx Programmable Gate Arrays
• CLB - Configurable Logic Block
    5-input, 1 output function
    or 2 4-input, 1 output functions
    optional register on outputs
• Built-in fast carry logic
• Can be used as memory
• Three types of routing
    direct
    general-purpose
    long lines of various lengths
• RAM-programmable
    can be reconfigured
[Figure: array of CLBs separated by Wiring Channels and ringed by IOBs.]
[Figure: Xilinx architecture overview. Labels: Configurable Logic Blocks (CLBs), I/O Blocks (IOBs) with Slew Rate Control, Passive Pull-Up/Pull-Down, Output Buffer, Input Buffer and Pad, Switch Matrix, Programmable Interconnect.]
The Xilinx 4000 CLB
[Figure: CLB internals. Labels: F, G and H function generators; inputs F1-F4, G1-G4, and C1-C4 (H1, DIN, S/R, EC); S/R Control; two D flip-flops with SD, RD and EC; clock K; outputs X, Y and the two Q outputs.]
Two 4-input functions, registered output
5-input function, combinational output
CLB Used as RAM
Fast Carry Logic
Xilinx 4000 Interconnect
Switch Matrix
Xilinx 4000 Interconnect Details
Global Signals - Clock, Reset, Control
Xilinx 4000 IOB
Xilinx FPGA Combinational Logic Examples
• Key: General functions are limited to 5 inputs (4 even better - 1/2 CLB)
• No limitation on function complexity
• Example
    2-bit comparator:
    A B = C D and A B > C D implemented with 1 CLB
    (GT) F = A C' + A B D' + B C' D'
    (EQ) G = A'B'C'D' + A'B C'D + A B'C D' + A B C D
• Can implement some functions of > 5 inputs
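The two comparator equations can be checked by brute force over all 16 input rows. The sketch below is only an illustration: it assumes A and C are the high-order bits of the two 2-bit operands, and the function names are made up for this example.

```python
from itertools import product

def gt(a, b, c, d):
    """(GT) F = A C' + A B D' + B C' D'"""
    return bool((a and not c) or (a and b and not d) or (b and not c and not d))

def eq(a, b, c, d):
    """(EQ) G = A'B'C'D' + A'B C'D + A B'C D' + A B C D"""
    return bool(
        (not a and not b and not c and not d)
        or (not a and b and not c and d)
        or (a and not b and c and not d)
        or (a and b and c and d)
    )

def comparator_matches_arithmetic():
    """Compare both equations with integer comparison on every input row."""
    for a, b, c, d in product((0, 1), repeat=4):
        ab, cd = 2 * a + b, 2 * c + d  # assumption: A and C are the MSBs
        if gt(a, b, c, d) != (ab > cd) or eq(a, b, c, d) != (ab == cd):
            return False
    return True
```

Both functions depend on the same four inputs, which is why the pair fits the two 4-input function generators of a single CLB.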
Xilinx FPGA Combinational Logic
• Examples
    N-input majority function: 1 whenever n/2 or more inputs are 1
    N-input parity functions: 5 input/1 CLB; 2 levels yield 25 inputs!
[Figures: 5-input Majority Circuit, 7-input Majority Circuit, and 9 Input Parity Logic, each drawn as a network of CLBs.]
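The "2 levels yield 25 inputs" claim follows from cascading 5-input blocks: five first-level blocks each reduce five inputs to one parity bit, and a sixth block combines those five bits. A minimal sketch, modeling each CLB as a hypothetical 5-input parity block:

```python
from functools import reduce
from operator import xor

BLOCK_INPUTS = 5  # one CLB can compute any function of up to 5 inputs

def parity_block(bits):
    """Model of one CLB configured as a parity function of up to 5 inputs."""
    assert len(bits) <= BLOCK_INPUTS
    return reduce(xor, bits, 0)

def parity_two_level(bits):
    """Parity of up to 25 inputs using two levels of 5-input blocks."""
    assert len(bits) <= BLOCK_INPUTS ** 2
    groups = [bits[i:i + BLOCK_INPUTS] for i in range(0, len(bits), BLOCK_INPUTS)]
    first_level = [parity_block(group) for group in groups]
    return parity_block(first_level)
```

The same structure covers the 9-input case: one first-level block handles five inputs and the remaining inputs share the second-level block in a hand-packed design, whereas this sketch simply groups inputs five at a time.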
Xilinx FPGA Adder Example
• Example
    2-bit binary adder - inputs: A1, A0, B1, B0, CIN
    outputs: S0, S1, Cout
[Figure: two 4-bit adders built from CLBs, with inputs A3-A0, B3-B0, Cin and outputs S3-S0, Cout.]
    Full Adder, 4 CLB delays to final carry out
    2 x Two-bit Adders (3 CLBs each) yields 2 CLBs to final carry out
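Each output of the 2-bit adder (S0, S1, Cout) is a function of the same five inputs, so each fits one 5-input function generator; that is where the "3 CLBs each" figure comes from. The sketch below models the three functions as lookup tables and chains two 2-bit adders into a 4-bit adder. The table representation and helper names are assumptions made for illustration, not the device's actual configuration format.

```python
from itertools import product

def build_two_bit_adder_luts():
    """Build 32-entry tables for S0, S1 and Cout of a 2-bit adder."""
    s0, s1, cout = {}, {}, {}
    for a1, a0, b1, b0, cin in product((0, 1), repeat=5):
        total = (2 * a1 + a0) + (2 * b1 + b0) + cin
        key = (a1, a0, b1, b0, cin)
        s0[key] = total & 1
        s1[key] = (total >> 1) & 1
        cout[key] = (total >> 2) & 1
    return s0, s1, cout

S0, S1, COUT = build_two_bit_adder_luts()

def add4(a, b, cin=0):
    """4-bit add from two 2-bit stages; the carry crosses two table lookups."""
    low = ((a >> 1) & 1, a & 1, (b >> 1) & 1, b & 1, cin)
    c2 = COUT[low]
    high = ((a >> 3) & 1, (a >> 2) & 1, (b >> 3) & 1, (b >> 2) & 1, c2)
    total = S0[low] | (S1[low] << 1) | (S0[high] << 2) | (S1[high] << 3)
    return total, COUT[high]
```

The final carry passes through only two lookups (one per 2-bit stage), matching the "2 CLBs to final carry out" note, at the cost of six blocks instead of four.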
Computer-Aided Design
• Can't design FPGAs by hand
    Way too much logic to manage, hard to make changes
• Hardware description languages
    Specify functionality of logic at a high level
• Validation: high-level simulation to catch specification errors
    Verify pin-outs and connections to other system components
    Low-level to verify mapping and check performance
• Logic synthesis
    Process of compiling HDL program into logic gates and flip-flops
• Technology mapping
    Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs)
CAD Tool Path (cont’d)
• Placement and routing
    Assign logic blocks to functions
    Make wiring connections
• Timing analysis - verify paths
    Determine delays as routed
    Look at critical paths and ways to improve
• Partitioning and constraining
    If design does not fit or is unroutable as placed, split into multiple chips
    If design is too slow, prioritize critical paths, fix placement of cells, etc.
    Few tools to help with these tasks exist today
• Generate programming files - bits to be loaded into chip for configuration
Xilinx CAD Tools
• Verilog (or VHDL) used to specify logic at a high level
    Combine with schematics, library components
• Synopsys
    Compiles Verilog to logic
    Maps logic to the FPGA cells
    Optimizes logic
• Xilinx APR - automatic place and route (simulated annealing)
    Provides controllability through constraints
    Handles global signals
• Xilinx Xdelay - measure delay properties of mapping and aid in iteration
• Xilinx XACT - design editor to view final mapping results
Applications of FPGAs
• Implementation of random logic
    Easier changes at system-level (one device is modified)
    Can eliminate need for full-custom chips
• Prototyping
    Ensemble of gate arrays used to emulate a circuit to be manufactured
    Get more/better/faster debugging done than with simulation
• Reconfigurable hardware
    One hardware block used to implement more than one function
    Functions must be mutually-exclusive in time
    Can greatly reduce cost while enhancing flexibility
    RAM-based only option
• Special-purpose computation engines
    Hardware dedicated to solving one problem (or class of problems)
    Accelerators attached to general-purpose computers
Implementation Strategies
ROM-based Design
Example: BCD to Excess 3 Serial Converter
Conversion Process
    Bits are presented in bit serial fashion starting with the least significant bit
    Single input X, single output Z

    BCD     Excess 3 Code
    0000    0011
    0001    0100
    0010    0101
    0011    0110
    0100    0111
    0101    1000
    0110    1001
    0111    1010
    1000    1011
    1001    1100
Implementation Strategies
State Transition Table

    Present    Next State     Output
    State      X=0    X=1     X=0    X=1
    S0         S1     S2      1      0
    S1         S3     S4      1      0
    S2         S4     S4      0      1
    S3         S5     S5      0      1
    S4         S5     S6      1      0
    S5         S0     S0      0      1
    S6         S0     --      1      --

Derived State Diagram
[Figure: state diagram for S0 through S6; Reset enters S0 and each edge is labeled input/output (X/Z).]
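The state transition table can be exercised directly in software. The sketch below encodes the table as a dictionary (a representation chosen for this example) and feeds each BCD digit least significant bit first, collecting Z in the same order.

```python
# (next state, Z) for each present state and input X, copied from the table.
TRANSITIONS = {
    "S0": {0: ("S1", 1), 1: ("S2", 0)},
    "S1": {0: ("S3", 1), 1: ("S4", 0)},
    "S2": {0: ("S4", 0), 1: ("S4", 1)},
    "S3": {0: ("S5", 0), 1: ("S5", 1)},
    "S4": {0: ("S5", 1), 1: ("S6", 0)},
    "S5": {0: ("S0", 0), 1: ("S0", 1)},
    "S6": {0: ("S0", 1)},  # X=1 is a don't-care: it never occurs for BCD input
}

def convert(digit):
    """Serially convert one BCD digit (0-9) to excess-3, LSB first."""
    state = "S0"
    result = 0
    for i in range(4):
        x = (digit >> i) & 1
        state, z = TRANSITIONS[state][x]
        result |= z << i
    assert state == "S0"  # the machine returns to S0 after four bits
    return result
```

For example, `convert(7)` walks S0, S2, S4, S6, S0 and yields 10, the excess-3 code 1010.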
Implementation Strategies
ROM-based Implementation
Truth Table/ROM I/Os

    ROM Address      ROM Outputs
    X Q2 Q1 Q0       Z D2 D1 D0
    0 0  0  0        1 0  0  1
    0 0  0  1        1 0  1  1
    0 0  1  0        0 1  0  0
    0 0  1  1        0 1  0  1
    0 1  0  0        1 1  0  1
    0 1  0  1        0 0  0  0
    0 1  1  0        1 0  0  0
    0 1  1  1        X X  X  X
    1 0  0  0        0 0  1  0
    1 0  0  1        0 1  0  0
    1 0  1  0        1 1  0  0
    1 0  1  1        1 1  0  1
    1 1  0  0        0 1  1  0
    1 1  0  1        1 0  0  0
    1 1  1  0        X X  X  X
    1 1  1  1        X X  X  X

Circuit Level Realization
[Figure: converter ROM with address inputs X, Q2, Q1, Q0 and outputs Z, D2, D1, D0; D2, D1, D0 are registered in a 74175 clocked by CLK and cleared by \Reset, and the register outputs return as Q2, Q1, Q0.]
74175 = 4 x positive edge triggered D FFs
In ROM-based designs, no need to consider state assignment
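A ROM-based state machine is just a table lookup followed by a register. The sketch below stores the sixteen ROM words as 4-bit values in the order Z D2 D1 D0, indexed by the address X Q2 Q1 Q0, with `None` standing in for the don't-care words. The list layout and function name are assumptions made for this illustration.

```python
# ROM word at address (X Q2 Q1 Q0), packed as Z D2 D1 D0; None = don't care.
ROM = [
    0b1001, 0b1011, 0b0100, 0b0101, 0b1101, 0b0000, 0b1000, None,   # X = 0
    0b0010, 0b0100, 0b1100, 0b1101, 0b0110, 0b1000, None,   None,   # X = 1
]

def rom_convert(digit):
    """Clock the ROM plus state register through four serial input bits."""
    q = 0        # state register (Q2 Q1 Q0), cleared by reset
    result = 0
    for i in range(4):
        x = (digit >> i) & 1
        word = ROM[(x << 3) | q]
        if word is None:
            raise ValueError("reached a don't-care ROM word")
        z, q = word >> 3, word & 0b111  # Z goes out, D2 D1 D0 becomes next Q
        result |= z << i
    return result
```

Because the ROM holds the full truth table, the states are simply numbered in binary, which is why no state assignment effort is needed in this style.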
Implementation Strategies
Timing Behavior for input strings 0 0 0 0 (0) and 1 1 1 0 (7)
[Figure: timing waveforms, LSB first: input 0000 produces output 1100, and input 1110 produces output 0101.]
Implementation Strategies
PLA-based Design
State Assignment with NOVA

NOVA input file
    0 S0 S1 1
    1 S0 S2 0
    0 S1 S3 1
    1 S1 S4 0
    0 S2 S4 0
    1 S2 S4 1
    0 S3 S5 0
    1 S3 S5 1
    0 S4 S5 1
    1 S4 S6 0
    0 S5 S0 0
    1 S5 S0 1
    0 S6 S0 1

NOVA derived state assignment
    S0 = 000
    S1 = 001
    S2 = 011
    S3 = 110
    S4 = 100
    S5 = 111
    S6 = 101

9 product term implementation
Implementation Strategies
Espresso Inputs
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 16
0 000 001 1
1 000 011 0
0 001 110 1
1 001 100 0
0 011 100 0
1 011 100 1
0 110 111 0
1 110 111 1
0 100 111 1
1 100 101 0
0 111 000 0
1 111 000 1
0 101 000 1
1 101 --- -
0 010 --- -
1 010 --- -
.e
Espresso Outputs
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 9
0001 0100
10-0 0100
01-0 0100
1-1- 0001
-0-1 1000
0-0- 0001
-1-0 1000
--10 0100
---0 0010
.e
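The nine product terms can be evaluated in software to confirm that they implement the converter. The sketch below treats each term as an input pattern plus an output mask in the column order d2 d1 d0 z, ORs together the masks of all matching terms as a PLA would, and starts from state 000 (S0 in the NOVA assignment). The evaluation helper is an illustration, not part of Espresso.

```python
# Product terms from the Espresso output: inputs x q2 q1 q0, outputs d2 d1 d0 z.
CUBES = """
0001 0100
10-0 0100
01-0 0100
1-1- 0001
-0-1 1000
0-0- 0001
-1-0 1000
--10 0100
---0 0010
""".split()

def pla(x, q2, q1, q0):
    """OR together the output masks of every product term that matches."""
    inputs = f"{x}{q2}{q1}{q0}"
    outputs = [0, 0, 0, 0]
    for pattern, mask in zip(CUBES[0::2], CUBES[1::2]):
        if all(p in ("-", bit) for p, bit in zip(pattern, inputs)):
            outputs = [o | int(m) for o, m in zip(outputs, mask)]
    return tuple(outputs)  # (d2, d1, d0, z)

def pla_convert(digit):
    """Run the PLA plus state register over four serial bits, LSB first."""
    q2 = q1 = q0 = 0  # S0 = 000
    result = 0
    for i in range(4):
        x = (digit >> i) & 1
        q2, q1, q0, z = pla(x, q2, q1, q0)
        result |= z << i
    return result
```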
Implementation Strategies
D2 = Q2' • Q0 + Q2 • Q0'
D1 = X' • Q2' • Q1' • Q0 + X • Q2' • Q0' + X' • Q2 • Q0' + Q1 • Q0'
D0 = Q0'
Z = X • Q1 + X' • Q1'
[Figure: converter PLA with inputs X, Q2, Q1, Q0 and outputs Z, D2, D1, D0; D2, D1, D0 are registered in a 74175 clocked by CLK and cleared by \Reset.]
Implementation Strategies
10H8 PAL: 10 inputs, 8 outputs, 2 product terms per OR gate
D1 = D11 + D12
D11 = X' • Q2' • Q1' • Q0 + X • Q2' • Q0'
D12 = X' • Q2 • Q0' + Q1 • Q0'
Product term rows:
    0. Q2' • Q0
    1. Q2 • Q0'
    8. X' • Q2' • Q1' • Q0
    9. X • Q2' • Q0'
    16. X' • Q2 • Q0'
    17. Q1 • Q0'
    24. D11
    25. D12
    32. Q0'
    33. not used
    40. X • Q1
    41. X' • Q1'
[Figure: 10H8 fuse map with inputs X, Q2, Q1, Q0, D11, D12; rows 0-1 drive D2, rows 8-9 D11, rows 16-17 D12, rows 24-25 D1, rows 32-33 D0, and rows 40-41 Z.]
Implementation Strategies
[Figure: 10H8 PAL fuse map for the converter, with inputs X, Q2, Q1, Q0, D11, D12 and output rows for D2, D11, D12, D1, D0, and Z.]
Implementation Strategies
Registered PAL Architecture
[Figure: registered PAL cell shown for D2. Labels: Buffered Input or product term, CLK, OE, D flip-flop, Q2+, inputs X, Q2, Q0, Negative Logic, Feedback.]
D1 = X • Q2 • Q1 • Q0 + X • Q2 + X • Q0 + Q2 • Q0 + Q1 • Q0
D2 = Q2 • Q0 + Q2 • Q0
D0 = Q0
Z = X • Q1 + X • Q1
Implementation Strategies
Programmable Output Polarity/XOR PALs
Buried Registers: decouple FF from the output pin
Advantage of XOR PALs: Parity and Arithmetic Operations
[Figure: A ⊕ B ⊕ C ⊕ D built two ways: as eight four-literal product terms in an AND-OR array, and as two two-term sums (one over A, B and one over C, D) combined by the output XOR gate.]
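The advantage for parity can be counted directly. Four-input parity has eight minterms and none of them can be merged, so a plain AND-OR array needs eight product terms; with an XOR gate at the output, two small sums suffice. A minimal sketch of both forms, with function names invented for this example:

```python
from itertools import product

# Minterms of A xor B xor C xor D: the rows with an odd number of 1s.
ODD_MINTERMS = [bits for bits in product((0, 1), repeat=4) if sum(bits) % 2 == 1]

def parity_and_or(a, b, c, d):
    """Plain two-level form: one product term per minterm."""
    return int((a, b, c, d) in ODD_MINTERMS)

def parity_xor_pal(a, b, c, d):
    """XOR PAL form: (A'B + AB') xor (C'D + CD'), four product terms in all."""
    left = (a and not b) or (not a and b)
    right = (c and not d) or (not c and d)
    return int(bool(left) != bool(right))
```

Adders benefit in the same way, since each sum bit is an XOR of the operand bits and a carry.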
Implementation Strategies
Example of XOR PAL
Example of Registered PAL
[Figures: logic diagrams with fuse maps (first fuse numbers and increments) for an XOR PAL and a registered PAL.]
NOTE: FUSE NUMBER = FIRST FUSE NUMBER + INCREMENT
Specifying PALs with ABEL
P10H8 PAL
Explicit equations for partitioned output functions
module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p10h8';
"Input Pins
X,Q2,Q1,Q0,D11i,D12i pin 1,2,3,4,5,6;
"Output Pins
D2,D11o,D12o,D1,D0,Z pin 19,18,17,16,15,14;
INSTATE = [Q2, Q1, Q0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];
equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = D11i # D12i;
D11o = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0);
D12o = (!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;
Specifying PALs with ABEL
P12H6 PAL
Simpler equations
module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p12h6';
"Input Pins
X, Q2, Q1, Q0 pin 1, 2, 3, 4;
"Output Pins
D2, D1, D0, Z pin 17, 18, 16, 15;
INSTATE = [Q2, Q1, Q0];
S0in = [0, 0, 0];
S1in = [0, 0, 1];
S2in = [0, 1, 1];
S3in = [1, 1, 0];
S4in = [1, 0, 0];
S5in = [1, 1, 1];
S6in = [1, 0, 1];
OUTSTATE = [D2, D1, D0];
S0out = [0, 0, 0];
S1out = [0, 0, 1];
S2out = [0, 1, 1];
S3out = [1, 1, 0];
S4out = [1, 0, 0];
S5out = [1, 1, 1];
S6out = [1, 0, 1];
equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0) #
     (!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;
Specifying PALs with ABEL
P16R4 PAL
module bcd2excess3
title 'BCD to Excess 3 Code Converter'
u1 device 'p16r4';
"Input Pins
Clk, Reset, X, !OE pin 1, 2, 3, 11;
"Output Pins
D2, D1, D0, Z pin 14, 15, 16, 13;
SREG = [D2, D1, D0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];
state_diagram SREG
state S0: if Reset then S0
          else if X then S2 with Z = 0
          else S1 with Z = 1
state S1: if Reset then S0
          else if X then S4 with Z = 0
          else S3 with Z = 1
state S2: if Reset then S0
          else if X then S4 with Z = 1
          else S4 with Z = 0
state S3: if Reset then S0
          else if X then S5 with Z = 1
          else S5 with Z = 0
state S4: if Reset then S0
          else if X then S6 with Z = 0
          else S5 with Z = 1
state S5: if Reset then S0
          else if X then S0 with Z = 1
          else S0 with Z = 0
state S6: if Reset then S0
          else if !X then S0 with Z = 1
end bcd2excess3;
FSM Design with Counters
Synchronous Counters: CLR, LD, CNT
Four kinds of transitions for each state:
    (1) to State 0 (CLR)
    (2) to next state in sequence (CNT)
    (3) to arbitrary next state (LD)
    (4) loop in current state
[Figure: state n with four outgoing edges: CLR to state 0, CNT to state n+1, LD to state m, and a self-loop when no signals are asserted.]
Careful state assignment is needed to reflect basic sequencing of the counter
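The four transition kinds map onto a single next-state rule for the counter. The sketch below assumes the priority this deck states for the counter-based converter (CLR dominates LD, which dominates count); the function name, keyword arguments, and 4-bit default width are choices made for this example.

```python
def counter_next(state, clr=False, ld=False, cnt=False, load_value=0, bits=4):
    """Next state of a synchronous counter with CLR > LD > CNT priority."""
    modulus = 1 << bits
    if clr:
        return 0                        # (1) go to state 0
    if ld:
        return load_value % modulus     # (3) jump to an arbitrary state
    if cnt:
        return (state + 1) % modulus    # (2) advance to the next state
    return state                        # (4) hold: no signals asserted
```

The state assignment matters because the "next state in sequence" transition is free, while every other forward jump costs a load value that the surrounding logic must generate.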
FSM Design with Counters
Excess 3 Converter Revisited
[Figure: state diagram of the converter with states renumbered 0 through 6; Reset enters state 0 and each edge is labeled input/output.]
Note the sequential nature of the state assignments
FSM Design with Counters
Excess 3 Converter

    Inputs/Current State | Next State  | Outputs
    X  Q2 Q1 Q0          | Q2+ Q1+ Q0+ | Z  CLR LD EN | C  B  A
    0  0  0  0           | 0   0   1   | 1  1   1  1  | X  X  X
    0  0  0  1           | 0   1   0   | 1  1   1  1  | X  X  X
    0  0  1  0           | 0   1   1   | 0  1   1  1  | X  X  X
    0  0  1  1           | 0   0   0   | 0  0   X  X  | X  X  X
    0  1  0  0           | 1   0   1   | 0  1   1  1  | X  X  X
    0  1  0  1           | 0   1   1   | 1  1   0  X  | 0  1  1
    0  1  1  0           | 0   0   0   | 1  0   X  X  | X  X  X
    0  1  1  1           | X   X   X   | X  X   X  X  | X  X  X
    1  0  0  0           | 1   0   0   | 0  1   0  X  | 1  0  0
    1  0  0  1           | 1   0   1   | 0  1   0  X  | 1  0  1
    1  0  1  0           | 0   1   1   | 1  1   1  1  | X  X  X
    1  0  1  1           | 0   0   0   | 1  0   X  X  | X  X  X
    1  1  0  0           | 1   0   1   | 1  1   1  1  | X  X  X
    1  1  0  1           | 1   1   0   | 0  1   1  1  | X  X  X
    1  1  1  0           | X   X   X   | X  X   X  X  | X  X  X
    1  1  1  1           | X   X   X   | X  X   X  X  | X  X  X

CLR and LD are active low (0 asserts the signal)
CLR signal dominates LD which dominates Count
Implementing FSMs with Counters
Excess 3 Converter

Espresso Input File
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 17
1---- -0-----
00000 1111---
00001 1111---
00010 0111---
00011 00-----
00100 0111---
00101 110-011
00110 10-----
00111 -------
01000 010-100
01001 010-101
01010 1111---
01011 10-----
01100 1111---
01101 0111---
01110 -------
01111 -------
.e

Espresso Output File
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 10
0-001 0101101
-0-01 1000000
-11-0 1000000
0-0-0 0101100
-000- 1010000
-0--0 0010000
0-10- 0101011
--11- 1000000
-11-- 0010000
-1-1- 1010000
.e
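The ten product terms of the Espresso output file can be checked by driving a small counter model with them. The sketch below is a reading of the listing under stated assumptions: the term list is transcribed with each input pattern five characters wide and each output mask seven wide (columns z clr ld en c b a), clr and ld are treated as active low, and the counter is 3 bits wide with c b a as its load inputs. All function names are invented for this example.

```python
# Product terms: inputs res x q2 q1 q0, outputs z clr ld en c b a.
CUBES = [
    ("0-001", "0101101"), ("-0-01", "1000000"), ("-11-0", "1000000"),
    ("0-0-0", "0101100"), ("-000-", "1010000"), ("-0--0", "0010000"),
    ("0-10-", "0101011"), ("--11-", "1000000"), ("-11--", "0010000"),
    ("-1-1-", "1010000"),
]

def pla(res, x, q):
    """OR together the output masks of every product term that matches."""
    inputs = f"{res}{x}{q:03b}"
    outputs = [0] * 7
    for pattern, mask in CUBES:
        if all(p in ("-", bit) for p, bit in zip(pattern, inputs)):
            outputs = [o | int(m) for o, m in zip(outputs, mask)]
    return outputs  # z, clr_n, ld_n, en, c, b, a

def counter_next(q, clr_n, ld_n, en, load):
    """Counter with active-low clear and load; clear beats load beats count."""
    if not clr_n:
        return 0
    if not ld_n:
        return load
    return (q + 1) & 0b111 if en else q

def counter_convert(digit):
    """Serially convert one BCD digit to excess-3, LSB first."""
    q, result = 0, 0
    for i in range(4):
        x = (digit >> i) & 1
        z, clr_n, ld_n, en, c, b, a = pla(0, x, q)
        result |= z << i
        q = counter_next(q, clr_n, ld_n, en, (c << 2) | (b << 1) | a)
    return result
```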
FSM Implementation with Counters
Excess 3 Converter Schematic
[Figure: excess 3 PLA with inputs X, Reset, Q2, Q1, Q0 and outputs Z, \CLR, \LD, EN, C, B, A driving a 163 counter clocked by CLK; Z passes through a D flip-flop.]
Synchronous Output Register
Implementation Strategies
Xilinx LCA Architecture
Implementing the BCD to Excess 3 FSM
Q2+ = Q2' • Q0 + Q2 • Q0'
Q1+ = X' • Q2' • Q1' • Q0 + X • Q2' • Q0' + X' • Q2 • Q0' + Q1 • Q0'
Q0+ = Q0'
Z = X • Q1 + X' • Q1'
No function more complex than 4 variables
4 FFs implies 2 CLBs
Synchronous Mealy Machine
Global Reset to be used
Place Q2+, Q0+ in one CLB
Q1+, Z in second CLB
maximize use of direct & general purpose interconnections
Implementing the BCD to Excess 3 FSM
[Figure: two CLBs, CLB1 and CLB2, with inputs X, Q2, Q1, Q0, clock Clk, clock enable CE and reset RES, producing Q2, Q1, Q0 and Z.]
Design Case Study
Traffic Light Controller
Decomposition into primitive subsystems
• Controller FSM
next state/output functions
state register
• Short time/long time interval counter
• Car Sensor
• Output Decoders and Traffic Lights
Design Case Study
Traffic Light Controller
Block Diagram
[Figure: block diagram. Labels: controller fsm (Next State/Output Logic and 2-bit State Register) with Reset and Clk; short time/long time counter with ST, TS, TL; Car Sensor taking C (async) and producing C (sync); 2-bit Encoded Light Signals feeding Light Decoders with 3-bit H and F outputs.]
Design Case Study
Subsystem Logic
[Figure: subsystem schematics. Car Detector: flip-flop producing Present and \Present. Light Decoders: 139a and 139b decoders producing HG, HY, HR from H1, H0 and FG, FY, FR from F1, F0. Interval Timer: 163 counter with CLK, Reset and ST, producing TS and TL.]
Design Case Study
State Assignment: HG = 00, HY = 10, FG = 01, FY = 11
P1 = C TL Q1 + TS Q1 Q0 + C Q1 Q0 + TS Q1 Q0
P0 = TS Q1 Q0 + Q1 Q0 + TS Q1 Q0
ST = C TL Q1 + C Q1 Q0 + TS Q1 Q0 + TS Q1 Q0
HL[1] = TS Q1 Q0 + Q1 Q0 + TS Q1 Q0
HL[0] = TS Q1 Q0 + TS Q1 Q0
Next State Logic
FL[1] = Q0
FL[0] = TS Q1 Q0 + TS Q1 Q0
PAL/PLA Implementation:
5 inputs, 7 outputs, 8 product terms
PAL 22V10 -- 11 inputs, 10 prog. IOs, 8 to 14 prod terms per OR
ROM Implementation:
32 word by 8-bit ROM (256 bits)
Reset may double ROM size
Design Case Study
Counter-based Implementation
    HG: TL•C / ST
    HY: TS / ST
    FG: TL+C' / ST
    FY: TS / ST
ST = Count
TTL Implementation with MUX and Counter
[Figure: a 153 (2 x 4:1 MUX) with data inputs drawn from TS, TL, C and \C and select inputs Q1, Q0 produces ST, which drives a 163 counter cleared by \Reset; the counter outputs supply Q1 and Q0.]
Can we reduce package count by using an 8:1 MUX?
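The counter-based controller reduces to one question per state: is the condition for leaving this state true? A 4:1 multiplexer selects that condition by state, and its output (ST) is the counter's count enable. The sketch below assumes the states are numbered in counting order (HG=0, HY=1, FG=2, FY=3), which is an assumption made for this example, and that the exit condition for FG is TL + C'.

```python
HG, HY, FG, FY = range(4)  # assumed counting order of the states

def start_timer(state, c, tl, ts):
    """ST: the 4:1 MUX picks the exit condition for the current state."""
    mux_inputs = [
        tl and c,        # HG: long interval elapsed and a car is waiting
        ts,              # HY: short interval elapsed
        tl or not c,     # FG: long interval elapsed or no car present
        ts,              # FY: short interval elapsed
    ]
    return bool(mux_inputs[state])

def next_state(state, c, tl, ts):
    """ST doubles as count enable, so the counter advances or holds."""
    return (state + 1) % 4 if start_timer(state, c, tl, ts) else state
```

For example, `next_state(HG, c=1, tl=1, ts=0)` returns `HY`, while `next_state(HG, c=0, tl=1, ts=0)` stays in `HG`.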
Design Case Study
Counter-based Implementation
Dispense with direct output functions for the traffic lights
Why not simply decode from the current state?
[Figure: a 139a decoder driven by Q1, Q0 produces the HG, HY, HR and FG, FY, FR light signals.]
ST is a Synchronous Mealy Output
Light Controllers are Moore Outputs
Design Case Study
LCA-Based Implementation
Discrete Gate Method:
None of the functions exceeds 5 variables
P1, ST are 5 variable (1 CLB each)
P0, HL1, HL0, FL0 are 3 variable (1/2 CLB each)
FL1 is 1 variable (1/2 CLB)
4 1/2 CLBs total!
Design Case Study
LCA-Based Implementation
Placement of functions selected to maximize the use of direct connections
[Figure: six CLBs (pins DI, CE, A, B, C, D, E, K, R, X, Y) wired with TL, C, TS, Q1 and Q0; labeled outputs include F0, F1, H0, H1, Q1 and ST.]
Design Case Study
LCA-Based Implementation
Counter/Multiplexer Method:
4:1 MUX, 2 Bit Upcounter
MUX: six variables (4 data, 2 control)
but this is the kind of 6 variable function that can be
implemented in 1 CLB!
2nd CLB to implement TL • C and TL + C'
But note that ST/Cnt is really a function of TL, C, TS, Q1, Q0
1 CLB to implement this function of 5 variables!
2 Bit Counter: 2 functions of 3 variables (2 bit state + count)
Also implemented in one CLB
Traffic light decoders: functions of 2 variables (Q1, Q0)
2 per CLB = 3 CLB for the six lights
Total count = 5 CLBs