ENGIN112 - lecture 2
ECE 697F
Reconfigurable Computing
Lecture 2
Field Programmable Gate Arrays
Lecture 2: Field Programmable Gate Arrays
September 13, 2004
Overview
• Three types of FPGAs
- EEPROM
- SRAM
- Antifuse
• SRAM FPGA architectural choices.
• FPGA logic blocks -> size versus performance.
• FPGA switch boxes
• State-of-the-art
- Research issues in architecture.
Configuration vs. programming
° FPGA configuration:
• Bits stay at the device they program.
• A configuration bit controls a switch or a logic bit.
° CPU programming:
• Instructions are fetched from a memory.
• Instructions select complex operations.
[Figure: the instruction "add r1, r2" is fetched from memory into the CPU instruction register (IR).]
Logic element questions
° How many inputs?
° How many functions?
• All functions of n inputs or eliminate some combinations?
• What inputs go to what pieces of the function?
° Any specialized logic?
• Adder, etc.
° What register features?
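The question of supporting all functions of n inputs can be made concrete by counting them. The following is a minimal sketch (the function names are illustrative and not from the lecture): a truth table over n binary inputs has 2^n rows, so there are 2^(2^n) distinct Boolean functions.

```python
def truth_table_rows(n_inputs: int) -> int:
    """Number of input combinations for n binary inputs."""
    return 2 ** n_inputs


def distinct_functions(n_inputs: int) -> int:
    """Number of distinct Boolean functions of n inputs.

    Each truth-table row can independently be 0 or 1.
    """
    return 2 ** truth_table_rows(n_inputs)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, truth_table_rows(n), distinct_functions(n))
```

The rapid growth in the function count is one reason a logic element designer might consider eliminating some combinations rather than supporting every function of n inputs.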
Anti-Fuse FPGA (Actel ACT family)
• Anti-fuses are one-time programmable.
- 16 Volt pulse eliminates dielectric
- Only need to program once.
• High performance -> direct connections between poly and N+
• Less appropriate for Reconfigurable Computing
- Good for bus transceivers
- High speed operation.
Antifuses
° Permanently programmed.
° Make a connection with electrical signal.
• More reliable than breaking a connection.
• Avoids shrapnel.
° Resistance of about 100 Ω.
Antifuse structure
[Figure: antifuse cross-section; layers labeled Metal 2, antifuse, via, Metal 1, substrate.]
Actel Programmable Gate Arrays
• Rows of programmable logic building blocks + rows of interconnect
• Anti-fuse technology: program once
• Use anti-fuses to build up long wiring runs from short segments
• 8 input, single output combinational logic blocks
• FFs constructed from discrete cross coupled gates
[Figure: rows of logic modules separated by wiring tracks, bordered on all four sides by I/O buffers, programming and test logic.]
Actel Logic Module
• Basic module is a modified 4:1 multiplexer
[Figure: three 2:1 MUXes with data inputs D0, D1, D2, D3, select inputs SOA, SOB, S0, S1, and output Y.]
• Example: implementation of S-R latch
[Figure: S-R latch built from 2:1 MUXes with constant inputs "0" and "1", inputs S and R, and output Q.]
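To see why a multiplexer-based module is a useful logic block, here is a minimal sketch that models the module as three 2:1 MUXes and realizes a few gates by tying inputs to constants or signals. The wiring is an assumption made for illustration (SOA and SOB select within the first-level MUXes, and S0 OR S1 selects the output MUX); the helper names `mux2`, `logic_module`, `and2`, `or2`, and `xor2` are hypothetical.

```python
def mux2(sel: int, a: int, b: int) -> int:
    """2:1 multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


def logic_module(d0, d1, d2, d3, soa, sob, s0, s1):
    """Modified 4:1 multiplexer built from three 2:1 MUXes.

    Assumed wiring: SOA picks between D0/D1, SOB picks between D2/D3,
    and (S0 OR S1) picks between the two first-level results.
    """
    upper = mux2(soa, d0, d1)
    lower = mux2(sob, d2, d3)
    return mux2(s0 | s1, upper, lower)


def and2(a, b):
    # Upper MUX passes b only when a is 1, otherwise constant 0.
    return logic_module(0, b, 0, 0, a, 0, 0, 0)


def or2(a, b):
    # Upper MUX outputs constant 1 when a is 1, otherwise b.
    return logic_module(b, 1, 0, 0, a, 0, 0, 0)


def xor2(a, b):
    # Upper MUX yields b, lower MUX yields NOT b; a chooses between them.
    return logic_module(0, 1, 1, 0, b, b, a, 0)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and2(a, b), or2(a, b), xor2(a, b))
```

The same fixed structure yields different gates purely through how its inputs are connected, which is the role the anti-fuse routing plays in the device.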
Actel Interconnect
[Figure: interconnection fabric; logic modules, horizontal tracks, vertical tracks, and anti-fuses.]
Actel Routing Example
[Figure: a logic module output routed to the inputs of two other logic modules.]
• Jogs cross an anti-fuse
• Minimize the # of jogs for speed critical circuits
• 2 - 3 hops for most interconnections
EEPROM Devices (PLDs)
• Frequently used technology for PALs, GALs, EPLDs
• User design frequently decomposed into sum-of-products (SOP) representation
• Appropriate for system glue logic.
• Single transistor interconnection point.
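As an illustration of the sum-of-products form mentioned above, the sketch below evaluates a function expressed as an OR of AND terms. The data layout (a list of dictionaries, one per product term) and the name `eval_sop` are assumptions chosen for readability, not a description of any vendor tool.

```python
def eval_sop(product_terms, inputs):
    """Evaluate a sum-of-products expression.

    product_terms: list of dicts mapping an input name to the value it
                   must have (1 for a true literal, 0 for a complemented one).
    inputs:        dict mapping input names to 0 or 1.
    Returns 1 if any product term is satisfied, else 0.
    """
    return int(any(
        all(inputs[name] == value for name, value in term.items())
        for term in product_terms
    ))


# XOR as a two-term SOP: a AND (NOT b)  OR  (NOT a) AND b
XOR_TERMS = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]

# Three-input majority: ab + bc + ac
MAJORITY_TERMS = [{"a": 1, "b": 1}, {"b": 1, "c": 1}, {"a": 1, "c": 1}]

if __name__ == "__main__":
    print(eval_sop(XOR_TERMS, {"a": 1, "b": 0}))
    print(eval_sop(MAJORITY_TERMS, {"a": 1, "b": 0, "c": 1}))
```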
Altera Max 7000 Macrocell
[Figure: MAX 7000 macrocell block diagram. Labels: 36 signals from PIA; 16 expander product terms; LAB local array; product-term select matrix; shared logic expanders; parallel logic expanders (from other macrocells); programmable register (D, Q, PRN, CLRN, ENA); clock/enable select; clear select; global clock; global clear; VCC; to I/O control block; to PIA. Legend: one symbol represents a multiplexer controlled by the configuration program.]
Max 7000 PLD Structure
[Figure: MAX 7000 device structure. Four LABs (LAB A: macrocells 1-16, LAB B: macrocells 17-32, LAB C: macrocells 33-48, LAB D: macrocells 49-64) connect through the PIA. Each LAB has an I/O control block serving 6-16 I/O pins. Dedicated inputs: Input/GCLK1, Input/OE2/GCLK2, Input/OE1, Input/GCLRn.]
SRAM-based FPGA
[Figure: SRAM programming bit cell (Data, Read or Write, Q) and a 2-input LUT with programming bits P1, P2, P3, P4, inputs I1 and I2, and output Out.]
• SRAM bits can be programmed many times
• Each programming bit takes up five transistors
• Larger device area reduces speed versus EPROM and antifuse.
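A 2-input LUT can be modeled as four stored bits indexed by the two inputs. The sketch below is a minimal illustration; the mapping of P1..P4 to input combinations is an assumption (the slide does not state it), and the class name `Lut2` is hypothetical.

```python
class Lut2:
    """2-input lookup table backed by four programming bits."""

    TRANSISTORS_PER_BIT = 5  # figure quoted in the lecture for an SRAM bit

    def __init__(self, p1, p2, p3, p4):
        # Assumption: P1..P4 hold the output for (I1, I2) = 00, 01, 10, 11.
        self.bits = [p1, p2, p3, p4]

    def out(self, i1, i2):
        """Read the stored bit selected by the two inputs."""
        return self.bits[(i1 << 1) | i2]

    def reprogram(self, p1, p2, p3, p4):
        """SRAM bits can be rewritten, so the same LUT takes a new function."""
        self.bits = [p1, p2, p3, p4]

    def config_transistors(self):
        """Transistors spent on configuration storage alone."""
        return self.TRANSISTORS_PER_BIT * len(self.bits)


if __name__ == "__main__":
    lut = Lut2(0, 0, 0, 1)          # configured as AND
    print([lut.out(a, b) for a in (0, 1) for b in (0, 1)])
    lut.reprogram(0, 1, 1, 0)       # reconfigured as XOR
    print([lut.out(a, b) for a in (0, 1) for b in (0, 1)])
    print(lut.config_transistors())
```

Note that the inputs only select which stored bit appears at the output; the function itself lives entirely in the configuration bits.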
Field Programmable Gate Array
Design Tradeoffs
[Figure: logic cluster, switchbox, and IO connections.]
• Some logic clusters are large (e.g. Altera/Xilinx clusters contain 8-10 LUT-FF pairs)
• Three important issues:
- Logic elements per cluster
- Cluster connectivity to interconnect wires (Fc): connection flexibility
- Switchbox flexibility (Fs)
Issue 1: The Logic Cluster
• Question: How many BLEs should there be per cluster?
Logic cluster utilization (Betz & Rose)
° Logic utilization vs. fraction of inputs accessible to LE in cluster.
° Utilization at 100% when only 50%-60% of inputs are accessible.
° Also found that connecting each track to only one LE output per cluster was sufficient.
© 1998 IEEE
Area efficiency vs. cluster size (Betz & Rose)
° Transistors per LE vs. cluster size.
• Includes overhead circuits.
° Clusters of size 1-8 were area-efficient.
© 1998 IEEE
Logic Cluster Size
• Interestingly, small clusters are more efficient (Betz – CICC’99)
• Includes area needed for routing.
• Small clusters (e.g. one BLE per cluster) are not “CAD friendly”.
• Most commercial devices have 4-10 BLEs per cluster
Number of Inputs per Cluster
• Lots of opportunities for input sharing in large clusters (Betz – CICC’99)
• Reducing inputs reduces the size of the device and makes it faster.
• Most FPGA devices have more inputs than actually needed to allow for routing flexibility
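The effect of input sharing on cluster pin count can be estimated with simple arithmetic. The sketch below is a rough illustration only: it assumes the 50%-60% accessibility figure quoted earlier in the lecture applies directly as a fraction of the total BLE input count, and the function name and parameters are hypothetical.

```python
import math


def cluster_inputs(lut_inputs: int, bles_per_cluster: int, fraction: float) -> int:
    """Estimate cluster input pins when only a fraction of BLE inputs
    must be reachable from outside the cluster.

    lut_inputs:       inputs per BLE (K)
    bles_per_cluster: BLEs per cluster (N)
    fraction:         share of the K*N BLE inputs that need external pins
    """
    return math.ceil(fraction * lut_inputs * bles_per_cluster)


if __name__ == "__main__":
    k, n = 4, 8
    print("all BLE inputs:", k * n)
    print("at 50%:", cluster_inputs(k, n, 0.5))
    print("at 60%:", cluster_inputs(k, n, 0.6))
```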
Connection Box Flexibility
[Figure: logic cluster IO pins connecting to tracks T0, T1, T2; example with Fc = 3.]
• Fc -> How many tracks does an input pin connect to?
• If logic cluster is small, Fc is large (Fc = W, where W is the number of tracks per channel)
• If logic cluster is large, Fc can be less.
- Approximately 0.2W for Xilinx XC4000EX, Virtex
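One way to picture Fc is to enumerate which tracks each pin can reach. The sketch below spreads each pin's Fc connections evenly across the W tracks and staggers the starting track from pin to pin; this staggering scheme is an assumption for illustration, not the pattern used by any particular device.

```python
def connection_block(num_tracks: int, fc: int, num_pins: int):
    """Return, for each pin, the sorted list of tracks it can connect to.

    num_tracks: channel width W
    fc:         number of tracks each pin connects to (1 <= fc <= W)
    num_pins:   number of cluster pins facing this channel
    """
    if not 1 <= fc <= num_tracks:
        raise ValueError("fc must be between 1 and num_tracks")
    pattern = []
    for pin in range(num_pins):
        # Spread fc taps evenly over the channel, offset by the pin index.
        tracks = sorted(
            (pin + (j * num_tracks) // fc) % num_tracks for j in range(fc)
        )
        pattern.append(tracks)
    return pattern


def switch_count(num_tracks: int, fc: int, num_pins: int) -> int:
    """Total programmable switches in the connection block."""
    return sum(len(t) for t in connection_block(num_tracks, fc, num_pins))


if __name__ == "__main__":
    print(connection_block(3, 3, 1))    # Fc = W: pin reaches every track
    print(connection_block(10, 2, 2))   # Fc = 0.2W with W = 10
    print(switch_count(10, 10, 8), switch_count(10, 2, 8))
```

Lowering Fc reduces the switch count in proportion, which is the area saving that larger clusters make possible.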
Switchbox Flexibility
[Figure: switch box with tracks 0 and 1 on each of its four sides.]
• Switch box provides optimized interconnection area.
• Flexibility found to be not as important as Fc
• Six transistors needed for Fs = 3
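The count of six transistors for Fs = 3 follows from pairing up the four sides of the switch box. The sketch below builds a switch box in which track i on one side connects only to track i on the other three sides (often called a disjoint or subset pattern; that label and all names in the code are assumptions for illustration).

```python
from itertools import combinations

SIDES = ("left", "top", "right", "bottom")


def disjoint_switch_box(num_tracks: int):
    """List every switch as a pair of (side, track) endpoints.

    Track i on any side connects to track i on each of the other sides.
    """
    switches = []
    for track in range(num_tracks):
        for side_a, side_b in combinations(SIDES, 2):
            switches.append(((side_a, track), (side_b, track)))
    return switches


def flexibility(switches, endpoint):
    """Fs for one incoming wire: how many other wires it can reach."""
    return sum(1 for a, b in switches if endpoint in (a, b))


if __name__ == "__main__":
    box = disjoint_switch_box(2)
    print("switches per track index:", len(box) // 2)
    print("Fs of left track 0:", flexibility(box, ("left", 0)))
```

Four sides give C(4, 2) = 6 side pairs, so each track index needs six pass transistors, and every wire end sees exactly three of them.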
Putting it all together
• Xilinx XC4000EX family
- Fs = 3
- Fc = 0.2W
- I = 8
• Altera Flex10K family
- Fs = 3
- Fc = 0.25W
- I = 22
• More contemporary FPGAs have larger cluster sizes and segmentation.
- More difficult to quantify exact Fc and Fs values.
Switchbox Issues
Switch Matrix
Xilinx 4000 Interconnect Details
Wilton Switchbox
[Figure: switch box with tracks 0, 1, 2 on each of its four sides, with rotated connections.]
• Rotate connections inside the switchbox while keeping Fs = 3
• Still has six transistors for base switch matrix.
• Eliminates domain issue
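The domain issue can be illustrated by counting how many separate groups of track indices a signal is confined to. In the sketch below, a wire keeps its track index between neighboring switch boxes, so two indices fall in the same domain only if some switch-box connection maps one to the other. The rotated permutations used here are an assumed Wilton-style pattern chosen for illustration; the exact pattern should be taken from Wilton's original description.

```python
def straight_maps(w):
    """Switch box where every connection keeps the track index."""
    pairs = [("left", "right"), ("top", "bottom"), ("left", "top"),
             ("top", "right"), ("right", "bottom"), ("bottom", "left")]
    return {pair: list(range(w)) for pair in pairs}


def rotated_maps(w):
    """Assumed Wilton-style pattern: straight connections keep the
    index, turning connections permute it."""
    return {
        ("left", "right"): list(range(w)),
        ("top", "bottom"): list(range(w)),
        ("left", "top"): [(w - i) % w for i in range(w)],
        ("top", "right"): [(i + 1) % w for i in range(w)],
        ("right", "bottom"): [(2 * w - 2 - i) % w for i in range(w)],
        ("bottom", "left"): [(i + 1) % w for i in range(w)],
    }


def count_domains(maps, w):
    """Number of groups of track indices that can reach one another."""
    parent = list(range(w))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for perm in maps.values():
        for i, j in enumerate(perm):
            parent[find(i)] = find(j)   # indices i and j share a domain
    return len({find(i) for i in range(w)})


if __name__ == "__main__":
    w = 3
    print("index-preserving box domains:", count_domains(straight_maps(w), w))
    print("rotated box domains:", count_domains(rotated_maps(w), w))
```

Both variants use six side pairs per track index, so the transistor count is unchanged; only the permutation applied on turns differs.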
Switchbox Issues
Buffering
• FPGAs need buffers to isolate large RC networks
• Architects must decide where to place buffers.
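A first-order Elmore delay estimate shows why large RC networks need isolating. The sketch below treats a route as a ladder of identical segments, each contributing resistance r and capacitance c, and compares an unbuffered route against one with a buffer after every few segments. The model ignores buffer input capacitance and drive resistance, and the unit values in the example are arbitrary assumptions.

```python
def elmore_ladder(n_segments: int, r: float, c: float) -> float:
    """Elmore delay of an RC ladder with n identical segments.

    The k-th capacitor sees k resistors upstream, so the delay is
    r * c * (1 + 2 + ... + n) = r * c * n * (n + 1) / 2.
    """
    return r * c * n_segments * (n_segments + 1) / 2


def buffered_delay(n_segments: int, stage_len: int,
                   r: float, c: float, t_buf: float) -> float:
    """Delay when a buffer (intrinsic delay t_buf) ends every stage
    of stage_len segments, isolating it from the next stage."""
    full_stages, rest = divmod(n_segments, stage_len)
    delay = full_stages * (elmore_ladder(stage_len, r, c) + t_buf)
    if rest:
        delay += elmore_ladder(rest, r, c) + t_buf
    return delay


if __name__ == "__main__":
    # Arbitrary unit values, chosen only to show the trend.
    print("unbuffered, 16 segments:", elmore_ladder(16, 1.0, 1.0))
    print("buffer every 4 segments:", buffered_delay(16, 4, 1.0, 1.0, 2.0))
```

The unbuffered delay grows quadratically with route length, while the buffered delay grows roughly linearly, at the cost of the buffers' own delay and area.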
Segmentation
[Figure: routing tracks with segments of length 4, length 2, and length 1; labels X and Y.]
• Segmentation distribution: how many of each length?
• Longer length
- Better performance?
- Reduced routability?
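The tradeoff behind the two questions above can be sketched with simple counting: longer segments cover a distance with fewer switch crossings, but a short connection placed on a long segment leaves wire unused. The code below is a simplified illustration that ignores segment alignment; the function names are hypothetical.

```python
import math


def segments_needed(distance: int, lengths=(4, 2, 1)):
    """Cover a distance using the fewest segments, longest first.

    Returns the list of segment lengths used. Raises ValueError if the
    available lengths cannot cover the distance exactly.
    """
    used = []
    remaining = distance
    for length in sorted(lengths, reverse=True):
        count, remaining = divmod(remaining, length)
        used.extend([length] * count)
    if remaining:
        raise ValueError("distance cannot be covered exactly")
    return used


def wasted_length(distance: int, segment_length: int) -> int:
    """Unused wire when only one segment length is available."""
    return math.ceil(distance / segment_length) * segment_length - distance


if __name__ == "__main__":
    print(segments_needed(7))               # mixed lengths
    print(len(segments_needed(7, (1,))))    # length-1 segments only
    print(wasted_length(1, 4))              # short hop on a long segment
```

Fewer segments mean fewer switches in series, which helps performance; the wasted length on short hops is one way longer segments can hurt routability.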
Translating a Design to an FPGA
[Figure: four FPGA sub-arrays.]
• Hierarchical FPGA likely to have a tree-like interconnect.
• Each “sub-array” contains about 100K gates
• Clever VLSI layout needed
Pipelined Interconnect
[Figure: two FPGA blocks.]
• Latest trend in FPGAs is to embed clocked flip-flops in the device to pipeline data.
• Helps create tolerance for delay
• Allows interconnect to be reused
• Large FPGA looks like a parallel processor.
FPGA Comparison
                    SRAM     Antifuse    Flash    EPROM
Speed               Worst    Best        Worst    Medium
Power               Varies   Near Best   Best     Worst
Density             Medium   Second      Best     Worst
Radiation           Worst    Best        Medium   Medium
Routing cell size   1        1/10        1/7      PLD
Reprogrammable      Yes      No          Yes      Yes
Summary
• Three basic types of FPGA devices
- Antifuse
- EEPROM
- SRAM
• Key issues for SRAM FPGA are logic cluster, connection box, and switch box.
• Latest advances examine performance and routability.
Next class: FPGA versus Processor