FPGA - University of Toronto
FULL CUSTOM DESIGN OF AN FPGA
Jongsok Choi
M.A.Sc Candidate, University of Toronto
Overview
TSMC 0.35 µm technology
Cadence tools
Die area under 2 mm x 2 mm
Design time: 1 month
Tile based approach
Each tile contains a Logic Block, two Connection Boxes (Right and Bottom), and a Switch Box
Pass transistor approach
References
Architecture and CAD for Deep-Submicron FPGAs
Presentation Outline
Schematics
Base Cells – Pass transistor, SRAM, Multiplexer
Logic Block – LUT, Set/Reset Logic, D-flipflop
Connection Box – Right, Bottom
Switch Box
Tile 2X2
Programming Circuitry – Row, Column
FPGA 4X4 – Programming a multiplier
FPGA 32X16 – full schematic
Layouts
Base Cells – SRAM, Multiplexer, Pull-up Buffer
Logic Block – LUT, Set/Reset Logic, D-flipflop
Connection Box – Right, Bottom
Programming Circuitry – Row, Column
Tile – Single tile, Tile 2X2
FPGA 4X4 – Post-layout simulation of programmed multiplier
FPGA 32X16 – floor plan, full layout
Clock tree – H-tree implemented
Complete layout with Padframe
DRC, LVS Results
Employed layout techniques and Conclusions
Schematics
Base Cells
Red boxes highlighted in the top right-hand corner of each slide indicate where the cell is used (e.g. the pass transistor is used in the logic element, connection boxes 1 and 2, and the switch block)
o Pass transistor
Schematic
Simulation
Base Cells
o SRAM cell : to program the FPGA with the required
functionality
Schematic
Simulation
Base Cells
o 2-to-1 Multiplexer
Schematic:
Simulation
Base Cells
o 4-to-1 Multiplexer: to choose between the four
SRAM bits in the LUT
Simulation
Schematic
Sel2/Sel1   out
11          IN_1
10          IN_2
01          IN_3
00          IN_4
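The select table for the 4-to-1 multiplexer can be written as a small behavioral model. This is only a sketch in Python, not a transistor-level description of the cell: the function name `mux4` and the argument order are illustrative assumptions, and only the Sel2/Sel1 to input mapping is taken from the slide.

```python
def mux4(sel2: int, sel1: int, in_1: int, in_2: int, in_3: int, in_4: int) -> int:
    """Behavioral 4-to-1 multiplexer following the slide's select table."""
    # Keys are (Sel2, Sel1); values are the input routed to the output.
    table = {
        (1, 1): in_1,
        (1, 0): in_2,
        (0, 1): in_3,
        (0, 0): in_4,
    }
    return table[(sel2, sel1)]
```

For example, `mux4(1, 0, 0, 1, 0, 0)` selects IN_2 and returns 1.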
Logic Block
Top-level Schematic
Logic Block - LUT
Schematic
Simulation
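Because the 4-to-1 multiplexer chooses between four SRAM bits, the LUT behaves as a two-input lookup table. The sketch below models that behavior in Python. The name `lut2`, the bit ordering, and the choice of which logic input drives Sel2 and which drives Sel1 are assumptions made for illustration; the slides do not specify them.

```python
def lut2(sram_bits, a: int, b: int) -> int:
    """Two-input LUT: the logic inputs select one of four stored SRAM bits.

    sram_bits is ordered (IN_1, IN_2, IN_3, IN_4), i.e. the bits selected
    by (a, b) = 11, 10, 01, 00 (assumed mapping of a to Sel2, b to Sel1).
    """
    index = {(1, 1): 0, (1, 0): 1, (0, 1): 2, (0, 0): 3}[(a, b)]
    return sram_bits[index]


# Example SRAM contents (hypothetical configurations):
AND_BITS = (1, 0, 0, 0)  # output is 1 only when a = b = 1
XOR_BITS = (0, 1, 1, 0)  # output is 1 when exactly one input is 1
```

Programming the four SRAM cells with a different bit pattern changes the logic function without changing the circuit.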
Logic Block – Set/Reset Logic
Schematic:
Simulation
When SRAM 1 and SRAM 2 are both set to ‘1’ => Set = 1
When SRAM 1 and SRAM 2 are both set to ‘0’ => Reset = 1
Logic Block – D-Flip Flop
Schematic
Simulation
Connection Box – Right
o Functionality: Connect vertical tracks to logic
element
Schematic
Simulation
Track2 selected when SRAM set to ‘0’
Track1 selected when SRAM set to ‘1’
Connection Box – Bottom
Top Level Schematic
Output from CB to Tracks
Input to CB from Tracks
Switch Box
Schematic
TILE 2x2
Schematic:
Each tile has different connections at the switch box
Segmented and staggered routing structure for FPGA
Segment Length of 2
(Figure labels: V1, V2, V3, V4; H1, H2, H3, H4)
Programming Circuitry – Programming Column
Schematic
Simulation
Programming Circuitry – Programming Row
Schematic
Simulation
FPGA 4x4
Schematic
FPGA 4x4
FPGA mapping and programming bits for a 2 by 2 multiplier
The table shows the manually created bitstream that programs the multiplier onto the 4x4 tiles with programming circuits
FPGA 4x4
Simulation
2 by 2 Multiplier correctly implemented
Shows correct output for all possible inputs
[Simulation waveform: output bits Bit[3] to Bit[0] for every combination of Input 1 and Input 2 (each 0 to 3); the numbers annotated on the waveform show the total output, from 0 to 9]
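A reference model makes it easy to check a simulated waveform against the expected products. The sketch below enumerates all 16 input combinations of a 2 by 2 multiplier and the four output bits. The function name and tuple layout are illustrative assumptions; the slides do not show how the comparison was actually done.

```python
def multiplier_truth_table():
    """Return (input_1, input_2, product, (bit3, bit2, bit1, bit0)) rows."""
    rows = []
    for in1 in range(4):          # 2-bit Input 1: 0..3
        for in2 in range(4):      # 2-bit Input 2: 0..3
            product = in1 * in2   # total output, at most 9
            bits = tuple((product >> k) & 1 for k in (3, 2, 1, 0))
            rows.append((in1, in2, product, bits))
    return rows
```

The largest product is 3 x 3 = 9, which is `1001` in binary, so four output bits are sufficient.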
FPGA 32x16 – Full Schematic
Layouts
Base Cells
o SRAM cell : to program the FPGA with the required
functionality
Schematic
Layout
Base Cells
o 4-to-1 Multiplexer: to choose between the four
SRAM bits in the LUT
Layout
Schematic
Sel2/Sel1   out
11          IN_1
10          IN_2
01          IN_3
00          IN_4
Base Cells
o Pull-up buffer: used to pull the degraded signal back
up to VDD
Layout
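The need for a pull-up buffer follows from how a pass transistor degrades a high level. Assuming NMOS pass transistors (the slides do not say which type is used), the output can rise only to about the gate voltage minus the threshold voltage. The sketch below is a first-order model in Python; the supply and threshold values are assumed for illustration, and the buffer is idealized.

```python
def nmos_pass_high(v_in: float, v_gate: float, v_tn: float) -> float:
    """First-order level at the output of an NMOS pass transistor.

    The transistor stops conducting once the output rises to
    v_gate - v_tn, so a full-rail high input comes out degraded.
    """
    return min(v_in, v_gate - v_tn)


def restoring_buffer(v_in: float, vdd: float) -> float:
    """Idealized buffer: any input above mid-rail is driven to vdd, else 0 V."""
    return vdd if v_in > vdd / 2 else 0.0


VDD = 3.3   # assumed supply voltage (not given in the slides)
V_TN = 0.7  # assumed NMOS threshold voltage (not given in the slides)

degraded = nmos_pass_high(VDD, VDD, V_TN)        # after one pass transistor
after_two = nmos_pass_high(degraded, VDD, V_TN)  # after a second one in series
restored = restoring_buffer(after_two, VDD)      # back at the full rail
```

In this simplified model the second pass transistor does not lower the level further, because the signal is already at the gate voltage minus the threshold. Body effect and leakage are ignored.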
Logic Block
Top-level Schematic
Logic Block - LUT
Schematic
Layout
Layout
Logic Block – Set/Reset Logic
Schematic
Layout
Logic Block – D-flipflop
Schematic
Layout
Logic Block
Layout
LUT
Set/Reset
Buffer_inverter for clock
D-flipflop
Pullup Buffer
Connection Box – Right
Schematic
Layout
Connection Box – Bottom
Top-level Schematic
Output from Connection box to Tracks
Programming Circuitry – Programming Column
Schematic
Layout
Programming Circuitry – Programming Column
Programming Circuitry – Programming Row
Schematic
Layout
Programming Circuitry – Programming Row
Tile
Schematic
Tile – Layout
Logic Element
Right Connection Box
Bottom Connection Box
Switch Box
TILE 2x2 - Layout
FPGA 4x4 - Layout
FPGA 4x4 - Post Layout Simulation
FPGA mapping and programming bits for a 2 by 2 multiplier
The table shows the manually created bitstream that programs the multiplier onto the 4x4 tiles with programming circuits
FPGA 4x4 – Post-Layout Simulation
Post-Layout Simulation
2 by 2 Multiplier correctly implemented
Shows correct output for all possible inputs
Matches schematic simulations
[Post-layout simulation waveform: output bits Bit[0] to Bit[3] for every combination of Input 1 and Input 2 (each 0 to 3); the numbers annotated on the waveform show the total output, from 0 to 9]
32x16 Tiles FPGA Floorplan
[Floorplan figure: 32 blocks labeled "4x4 Tile" (32x16 tiles in total), bordered by the Programming Column and the Programming Row; dimension labels in the figure: 1.525 mm and 1.25 mm]
FPGA 32x16 - Layout
Clock Tree
H-tree structure
Perfectly symmetrical in every direction to reduce clock skew
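The skew argument for an H-tree can be checked with a small geometric model: if every branch at a given level has the same length, every leaf sees the same wire length from the root. The sketch below builds an idealized H-tree in Python. The number of levels and the segment lengths are assumptions for illustration and are not the dimensions of the actual clock tree.

```python
def h_tree_leaves(x, y, half_len, levels, path_len=0.0):
    """Return (leaf_x, leaf_y, wire_length_from_root) for each H-tree leaf.

    Each level draws one "H": a horizontal bar reaching half_len to each
    side of (x, y), then a vertical bar reaching half_len up and down at
    both ends. The four tips become the centers of the next, smaller H.
    """
    if levels == 0:
        return [(x, y, path_len)]
    leaves = []
    for dx in (-half_len, half_len):
        for dy in (-half_len, half_len):
            # Wire from the center to a tip: half_len across, half_len up/down.
            leaves += h_tree_leaves(
                x + dx, y + dy, half_len / 2, levels - 1, path_len + 2 * half_len
            )
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half_len=4.0, levels=3)
wire_lengths = {length for _, _, length in leaves}
```

With three levels this gives 64 leaves and a single wire length in `wire_lengths`, which is why the symmetric structure reduces skew. Real skew also depends on load matching and buffer placement, which this model ignores.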
Complete layout with Padframe
DRC - Passed
LVS - Passed
Layout Techniques Employed
General Techniques
Cell pitch of 6 µm used; layouts optimized for area to match the pitch
Shared sources/drains where possible to minimize area
Shared VDD and ground rails between rows
Hierarchical Layout
Bigger cells composed of multiple smaller cells
Orthogonal metal routing using M3 and M4; local routing using M1 and M2
Blocks made to abut well
Wider tracks for power rails to provide enough power
Wider horizontal tracks, vertical tracks, and clock tree for increased drive strength
Conclusions
Designed a fully functional FPGA
Can implement up to 512 gates
Contains 8,704 SRAM cells
148,448 transistors without padframe
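The headline numbers can be related to the 32x16 tile array with a little arithmetic. The sketch below assumes the SRAM cells are spread evenly over the tiles, which the slides do not state. The transistor figure also covers programming circuitry and clock buffering, so the per-tile value is only an average.

```python
TILES_X, TILES_Y = 32, 16   # array size from the slides
BLOCK = 4                   # the floorplan is built from 4x4 tile blocks
SRAM_CELLS = 8_704          # from the slides
TRANSISTORS = 148_448       # from the slides, without padframe

tiles = TILES_X * TILES_Y                        # 512
blocks = (TILES_X // BLOCK) * (TILES_Y // BLOCK) # 32 blocks of 4x4 tiles
sram_per_tile = SRAM_CELLS / tiles               # 17.0 (assumes even spread)
transistors_per_tile = TRANSISTORS / tiles       # 289.9375 (average only)
```

The 512 tiles match the stated capacity of up to 512 gates, and the 32 blocks match the number of "4x4 Tile" labels in the floorplan.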
Questions