FPGA - University of Toronto


FULL CUSTOM DESIGN OF AN FPGA
Jongsok Choi
M.A.Sc Candidate, University of Toronto
Overview
 TSMC 0.35 µm technology
 Cadence tools
 Less than 2 mm × 2 mm die area
 Design time = 1 month
 Tile-based approach
 Each tile contains a Logic Block, 2 Connection Blocks and a Switch Box
 Pass-transistor approach
References
 Architecture and CAD for Deep-Submicron FPGAs
Presentation Outline

Schematics
 Base Cells – Pass transistor, SRAM, Multiplexer
 Logic Block – LUT, Set/Reset Logic, D-flipflop
 Connection Box – Right, Bottom
 Switch Box
 Tile 2x2
 Programming Circuitry – Row, Column
 FPGA 4x4 – Programming a multiplier
 FPGA 32x16 – Full schematic

Layouts
 Base Cells – SRAM, Multiplexer, Pull-up Buffer
 Logic Block – LUT, Set/Reset Logic, D-flipflop
 Connection Box – Right, Bottom
 Programming Circuitry – Row, Column
 Tile – Single tile, Tile 2x2
 FPGA 4x4 – Post-layout simulation of programmed multiplier
 FPGA 32x16 – Floor plan, full layout
 Clock tree – H-tree implemented
 Complete layout with Padframe
 DRC, LVS Results
 Employed layout techniques and Conclusions
Schematics
Base Cells
Red boxes highlighted in the top right-hand corner of a slide indicate where the cell is used (e.g. the pass transistor is used in the logic element, connection boxes 1 and 2, and the switch block).
o Pass transistor
 Schematic
 Simulation
Base Cells
o SRAM cell: programs the FPGA with the required functionality
 Schematic
 Simulation
Base Cells
o 2-to-1 Multiplexer
 Schematic
 Simulation
Base Cells
o 4-to-1 Multiplexer: chooses between the four SRAM bits in the LUT
 Schematic
 Simulation
 Select table:

Sel2/Sel1   out
   11       IN_1
   10       IN_2
   01       IN_3
   00       IN_4
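The select encoding listed on this slide, together with the multiplexer's stated role of choosing between the four SRAM bits in the LUT, can be sketched as a small behavioural model. This is an illustrative Python sketch, not the transistor-level design: the names mux4 and lut2 are hypothetical, and which LUT input drives Sel2 and which drives Sel1 is an assumption.

```python
def mux4(sel2, sel1, in_1, in_2, in_3, in_4):
    """4-to-1 multiplexer with the slide's select encoding:
    Sel2/Sel1 = 11 -> IN_1, 10 -> IN_2, 01 -> IN_3, 00 -> IN_4."""
    table = {(1, 1): in_1, (1, 0): in_2, (0, 1): in_3, (0, 0): in_4}
    return table[(sel2, sel1)]


def lut2(sram_bits, a, b):
    """2-input LUT: the four SRAM bits feed IN_1..IN_4 and the LUT inputs
    drive the selects. Mapping a -> Sel2 and b -> Sel1 is an assumption."""
    in_1, in_2, in_3, in_4 = sram_bits
    return mux4(a, b, in_1, in_2, in_3, in_4)


# AND: only the select pattern 11 (IN_1) returns 1.
AND_BITS = (1, 0, 0, 0)
# XOR: the select patterns 10 (IN_2) and 01 (IN_3) return 1.
XOR_BITS = (0, 1, 1, 0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, lut2(AND_BITS, a, b), lut2(XOR_BITS, a, b))
```

Changing the four stored bits changes the logic function without changing the circuit, which is why the LUT contents are held in SRAM.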
Logic Block
 Top-level Schematic
Logic Block – LUT
 Schematic
 Simulation
Logic Block – Set/Reset Logic
 Schematic
 Simulation
 When SRAM 1 and SRAM 2 are set to ‘1’ => Set = 1
 When SRAM 1 and SRAM 2 are set to ‘0’ => Reset = 1
Logic Block – D-flipflop
 Schematic
 Simulation
Connection Box – Right
o Functionality: connects vertical tracks to the logic element
 Schematic
 Simulation
 Track2 selected when SRAM set to ‘0’
 Track1 selected when SRAM set to ‘1’
Connection Box – Bottom
 Top-level Schematic
 Output from CB to Tracks
 Input to CB from Tracks
Switch Box
 Schematic
Tile 2x2
 Schematic (vertical tracks V1 to V4, horizontal tracks H1 to H4)
 Each tile has different connections at the switch box
 Segmented and staggered routing structure for the FPGA
 Segment length of 2
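One way to picture a segmented and staggered routing structure with segment length 2 is to mark, for each track, the tiles at which a segment ends and meets a switch. The sketch below is illustrative only: the offset rule (track index modulo segment length) is an assumed stagger pattern, not necessarily the one used in this design, and has_switch is a hypothetical helper.

```python
SEGMENT_LENGTH = 2  # from the slide
NUM_TRACKS = 4      # four tracks per channel, as labelled H1..H4 / V1..V4


def has_switch(track, position, seg_len=SEGMENT_LENGTH):
    """True if `track` ends a segment (and so meets a switch) at tile
    `position`. Offsetting the start by the track index is an assumption."""
    return (position - track) % seg_len == 0


# 'S' marks a tile where the track is switched, '-' where it passes through.
for track in range(NUM_TRACKS):
    row = "".join("S" if has_switch(track, x) else "-" for x in range(8))
    print(f"track {track + 1}: {row}")
```

Under this assumed pattern, neighbouring tiles switch different tracks, which is consistent with each tile having different connections at its switch box.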
Programming Circuitry – Programming Column
 Schematic
 Simulation
Programming Circuitry – Programming Row
 Schematic
 Simulation
FPGA 4x4
 Schematic
 FPGA mapping and programming bits for a 2 by 2 multiplier
 Table shows the manually created bitstream that programs the multiplier onto 4x4 tiles with programming circuits
FPGA 4x4
 Simulation
 2 by 2 Multiplier correctly implemented
 Shows correct output for all possible inputs
 Waveform signals: Input 1, Input 2, Bit[3], Bit[2], Bit[1], Bit[0]
 Numbers on the waveform show the total output:

            Input 2
Input 1     0   1   2   3
   0        0   0   0   0
   1        0   1   2   3
   2        0   2   4   6
   3        0   3   6   9
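As a rough illustration of how a 2 by 2 multiplier can be built from 2-input LUTs only, the sketch below decomposes the product into AND and XOR functions and checks every input combination. It is not the mapping or bitstream used in the presentation, which is not reproduced here; lut2 and multiply_2x2 are hypothetical names, and the SRAM bit order follows the select table of the 4-to-1 multiplexer (11, 10, 01, 00).

```python
def lut2(sram, a, b):
    """2-input LUT; `sram` lists the outputs for (a, b) = 11, 10, 01, 00."""
    return sram[3 - ((a << 1) | b)]


AND = (1, 0, 0, 0)
XOR = (0, 1, 1, 0)


def multiply_2x2(a, b):
    """Multiply two 2-bit numbers using eight 2-input LUTs."""
    a1, a0 = (a >> 1) & 1, a & 1
    b1, b0 = (b >> 1) & 1, b & 1
    p0 = lut2(AND, a0, b0)   # Bit[0]
    x = lut2(AND, a1, b0)    # partial product
    y = lut2(AND, a0, b1)    # partial product
    z = lut2(AND, a1, b1)    # partial product
    p1 = lut2(XOR, x, y)     # Bit[1]
    c = lut2(AND, x, y)      # carry into Bit[2]
    p2 = lut2(XOR, z, c)     # Bit[2]
    p3 = lut2(AND, z, c)     # Bit[3]
    return (p3 << 3) | (p2 << 2) | (p1 << 1) | p0


for a in range(4):
    print(a, [multiply_2x2(a, b) for b in range(4)])
```

This particular decomposition needs eight LUTs, so it would fit within the 16 logic blocks of a 4x4 array, assuming the routing can make the required connections.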
FPGA 32x16 – Full Schematic
Layouts
Base Cells
o SRAM cell: programs the FPGA with the required functionality
 Schematic
 Layout
Base Cells
o 4-to-1 Multiplexer: chooses between the four SRAM bits in the LUT
 Schematic
 Layout
 Select table:

Sel2/Sel1   out
   11       IN_1
   10       IN_2
   01       IN_3
   00       IN_4
Base Cells
o Pull-up buffer: pulls the degraded signal back up to VDD
 Layout
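The need for a pull-up buffer in a pass-transistor design can be illustrated with a first-order model. The sketch below assumes the degradation comes from NMOS pass transistors, which cannot pass a high level above their gate voltage minus the threshold voltage. The supply and threshold values are assumed typical figures rather than values from the presentation, body effect is ignored, and the buffer is idealised as a rail-to-rail restorer.

```python
VDD = 3.3  # assumed supply voltage for a 0.35 um process
VTN = 0.7  # assumed NMOS threshold voltage (body effect ignored)


def pass_transistor(v_in, v_gate=VDD):
    """NMOS pass transistor: the output cannot rise above v_gate - VTN."""
    return min(v_in, max(v_gate - VTN, 0.0))


def pull_up_buffer(v_in, switch_point=VDD / 2):
    """Idealised restoring buffer: snaps the input to a full rail."""
    return VDD if v_in > switch_point else 0.0


v = VDD
for _ in range(3):  # three pass transistors in series, all gates at VDD
    v = pass_transistor(v)

print(f"after pass transistors: {v:.2f} V")
print(f"after pull-up buffer:   {pull_up_buffer(v):.2f} V")
```

In this model a logic 0 passes through unchanged, while a logic 1 loses one threshold voltage before the buffer restores it to VDD.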
Logic Block
 Top-level Schematic
Logic Block – LUT
 Schematic
 Layout
Logic Block – Set/Reset Logic
 Schematic
 Layout
Logic Block – D-flipflop
 Schematic
 Layout
Logic Block
 Layout (labelled blocks: LUT, Set/Reset, Buffer_inverter for clock, D-flipflop, Pull-up Buffer)
Connection Box – Right
 Schematic
 Layout
Connection Box – Bottom
 Top-level Schematic
 Output from Connection Box to Tracks
Programming Circuitry – Programming Column
 Schematic
 Layout
Programming Circuitry – Programming Row
 Schematic
 Layout
Tile
 Schematic
Tile – Layout
 Labelled blocks: Logic Element, Right Connection Box, Bottom Connection Box, Switch Box
Tile 2x2 – Layout
FPGA 4x4 – Layout
FPGA 4x4 – Post-Layout Simulation
 FPGA mapping and programming bits for a 2 by 2 multiplier
 Table shows the manually created bitstream that programs the multiplier onto 4x4 tiles with programming circuits
FPGA 4x4 – Post-Layout Simulation
 Post-Layout Simulation
 2 by 2 Multiplier correctly implemented
 Shows correct output for all possible inputs
 Matches schematic simulations
 Waveform signals: Input 1, Input 2, Bit[0], Bit[1], Bit[2], Bit[3]
 Numbers on the waveform show the total output, i.e. the product Input 1 × Input 2 (0 to 9)
32x16 Tiles FPGA Floorplan
 Floorplan figure: 32 blocks of 4x4 tiles, a Programming Column and a Programming Row
 Dimension labels on the figure: 1.525 mm (three occurrences) and 1.25 mm
FPGA 32x16 - Layout
Clock Tree
 H-tree structure
 Perfectly symmetrical in every direction to reduce clock skew
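The skew argument for an H-tree can be checked with a short geometric sketch: each branch splits symmetrically, so every leaf lies at the same wire distance from the root. The code below is an illustration under assumptions, not the actual clock layout. It measures distance in tile units on a 32x16 array and assumes five levels of alternating horizontal and vertical splits, which places one leaf at the centre of each 4x4 block; h_tree is a hypothetical helper.

```python
def h_tree(x, y, half_w, half_h, levels, length=0.0, horizontal=True):
    """Yield (leaf_x, leaf_y, wire_length) for a symmetric, recursively
    bisected clock tree that alternates horizontal and vertical splits."""
    if levels == 0:
        yield (x, y, length)
        return
    if horizontal:
        for dx in (-half_w, half_w):
            yield from h_tree(x + dx, y, half_w / 2, half_h,
                              levels - 1, length + half_w, False)
    else:
        for dy in (-half_h, half_h):
            yield from h_tree(x, y + dy, half_w, half_h / 2,
                              levels - 1, length + half_h, True)


# Root at the centre of a 32x16 array; first branches reach 8 and 4 tiles.
leaves = list(h_tree(16, 8, 8, 4, levels=5))
lengths = {length for _, _, length in leaves}

print(len(leaves), "leaves")
print("distinct root-to-leaf wire lengths:", lengths)
```

A single distinct wire length means that, in this idealised model, the wire contribution to clock skew is zero; in a real layout, load and buffer mismatch still contribute.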
Complete Layout with Padframe
DRC – Passed
LVS – Passed
Layout Techniques Employed
 General Techniques
 Cell pitch of 6 µm used; layouts optimized for area to match the pitch
 Shared sources/drains where possible to minimize area
 Shared VDD and ground rails between rows
 Hierarchical Layout
 Bigger cells composed of multiple smaller cells
 Orthogonal metal routing using M3 and M4; local routing using M1 and M2
 Blocks made to abut well
 Wider tracks for power rails to provide enough power
 Wider horizontal tracks, vertical tracks, and clock tree for increased drive strength
Conclusions
 Designed a fully functional FPGA
 Can implement up to 512 gates
 Consists of 8,704 SRAM cells
 148,448 transistors without padframe
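The figures in the conclusions can be cross-checked against the 32x16 tile array with simple arithmetic. The sketch below assumes that the 512-gate figure corresponds to one logic block per tile and treats the per-tile numbers as averages, since the presentation does not break the totals down by block.

```python
ARRAY = (32, 16)            # tile array dimensions from the presentation
TILES = ARRAY[0] * ARRAY[1]
SRAM_CELLS = 8_704
TRANSISTORS = 148_448       # without padframe

print("tiles:", TILES)                          # one logic block per tile
print("4x4 blocks:", TILES // 16)
print("SRAM cells per tile:", SRAM_CELLS / TILES)
print("transistors per tile (average):", TRANSISTORS / TILES)
```

The SRAM count divides evenly into 17 cells per tile, while the transistor count does not divide evenly, which would be expected if some transistors belong to shared circuitry outside the tiles.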
Questions