Network-on-chip
Network-on-FPGA
Aleksander Ślusarczyk
Matthijs Visser
Henk Corporaal
Overview
• Hardware
– Network
– Router
– Topologies
– Network interface
– mMIPS processor
– Memory
• Software
– Communication library
– Software tools
– Two applications
HC Adv. Computer Architecture
Xilinx university board
Network-on-FPGA
• Network
– topologies
– routing
• Data processor
– mMIPS
– network interface
[Figure: block diagram with uP, NI, Mem and IF blocks]
Dally’s network
• Torus topology
• E-cube routing
• Unidirectional links
– deadlock-free (2 virtual channels per link)
Router
Sub-router
[Figure: sub-router; labels H, D and T, each 16b]
Dally’s network
Guaranteed delivery, deadlock-free
– no software required, reliable out-of-the-box
Fixed route
– therefore no congestion avoidance and no load balancing
– no timing and bandwidth guarantees
Topologies - Mesh
• Bidir links (double the connections)
• Asymmetric at edges
Topologies - Tree
• One route
• Bidir links
• Top-level nodes overloaded
Routing – Static or Dynamic
• Static routing: Header contains routing information
– E.g. streetsign routing: “goto x, turn left, goto y, turn right, …” (= source routing)
– Determined by user application or Network Interface (e.g. routing table)
• Dynamic routing: Intermediate router determines best route
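The streetsign scheme above can be sketched as a header that carries one turn per hop, which each router reads and strips before forwarding. This is a minimal sketch under assumptions: the 2-bit encoding, the port names and the function names route_pack() and route_next() are hypothetical and do not come from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical port encoding, 2 bits per hop (not taken from the slides). */
enum port { PORT_LOCAL = 0, PORT_STRAIGHT = 1, PORT_LEFT = 2, PORT_RIGHT = 3 };

/* Source side: pack a list of hops into a header.
 * The first hop ends up in the lowest two bits. */
uint32_t route_pack(const enum port *hops, int n)
{
    uint32_t header = 0;
    assert(n >= 0 && n <= 16);              /* 16 hops x 2 bits fit in 32 bits */
    for (int i = n - 1; i >= 0; i--)
        header = (header << 2) | (uint32_t)hops[i];
    return header;
}

/* Router side: read the next hop and shift it out of the header.
 * An exhausted header reads as PORT_LOCAL, i.e. deliver to this node. */
enum port route_next(uint32_t *header)
{
    enum port p = (enum port)(*header & 3u);
    *header >>= 2;
    return p;
}
```

Because the whole route is fixed by the sender, the intermediate routers need no routing table of their own in this sketch.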
E-cube Routing
Route dimensions in fixed order
• e.g. first X-dim, then Y-dim
Consequence:
• no routing freedom
• certain turns not used
[Figure: example route between nodes (0,0) and (2,2), with the x-dim labelled]
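The fixed dimension order can be sketched as a per-router decision: correct the X coordinate first, then Y, then deliver. The sketch below assumes a 2D k-by-k torus whose unidirectional links run in the increasing direction with wrap-around; the type and function names are hypothetical.

```c
#include <assert.h>

enum out_port { OUT_LOCAL, OUT_X, OUT_Y };

struct coord { int x, y; };

/* Dimension-order decision taken at every router:
 * resolve X first, then Y, then deliver locally. */
enum out_port ecube_route(struct coord here, struct coord dest)
{
    if (here.x != dest.x) return OUT_X;
    if (here.y != dest.y) return OUT_Y;
    return OUT_LOCAL;
}

/* Follow the route on a k-by-k torus with unidirectional links
 * (assumed to run in +x and +y with wrap-around) and count the hops. */
int ecube_hops(struct coord src, struct coord dest, int k)
{
    int hops = 0;
    assert(k > 0);
    assert(src.x >= 0 && src.x < k && src.y >= 0 && src.y < k);
    assert(dest.x >= 0 && dest.x < k && dest.y >= 0 && dest.y < k);
    for (;;) {
        enum out_port p = ecube_route(src, dest);
        if (p == OUT_LOCAL)
            return hops;
        if (p == OUT_X)
            src.x = (src.x + 1) % k;
        else
            src.y = (src.y + 1) % k;
        hops++;
    }
}
```

Since a packet never returns to the X dimension once it has started moving in Y, the Y-to-X turns are never taken, which is the "certain turns not used" consequence listed above.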
Interval routing
• Range of addresses assigned to output port
• Deadlock-free labellings for many topologies
[Figure: five-node network (nodes 1 to 5) whose output ports are labelled with address intervals such as [2,5], [1,2], [1,1], [3,5], [4,5] and [1,4]]
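The lookup a router performs under interval routing can be sketched as a scan over the intervals attached to its output ports. This is an illustration only: the struct and function names are hypothetical, the sketch uses plain (non-cyclic) intervals, and any interval values passed in are the caller's, not a reconstruction of the figure.

```c
#include <assert.h>
#include <stddef.h>

struct interval { int lo, hi; };   /* inclusive range of destination addresses */

/* Return the index of the output port whose interval contains dest,
 * or -1 if no port covers that address. */
int interval_route(const struct interval *ports, size_t nports, int dest)
{
    for (size_t i = 0; i < nports; i++) {
        assert(ports[i].lo <= ports[i].hi);
        if (dest >= ports[i].lo && dest <= ports[i].hi)
            return (int)i;
    }
    return -1;
}
```

A router with two ports labelled [1,2] and [3,5] would forward a packet for node 4 on the second port; the table size grows with the number of ports rather than the number of nodes.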
Using route tables
• Time slot allocation
• In a time slot one connection active
• Compile-time fixed
• Contention-free
• Guaranteed timing
– Scheduling required
[Table: slot table indexed by time slot (t1 to t3) and output (O1 to O3); entries are inputs (I1 to I3)]
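A compile-time slot table of this kind can be sketched as a constant array indexed by time slot and output port. The table contents below are made up for illustration and are not the values from the slide; the names slot_table, slot_input() and slot_table_is_contention_free() are hypothetical.

```c
#include <assert.h>

#define NSLOTS 3
#define NPORTS 3
#define IDLE   (-1)

/* slot_table[t][o]: input port connected to output o during slot t.
 * Illustrative values only, fixed at compile time. */
static const int slot_table[NSLOTS][NPORTS] = {
    { 0,    1, 2 },
    { 1,    2, 0 },
    { IDLE, 0, 1 },
};

/* Which input drives output `out` in clock cycle `cycle`?
 * The schedule repeats every NSLOTS cycles. */
int slot_input(int cycle, int out)
{
    assert(cycle >= 0 && out >= 0 && out < NPORTS);
    return slot_table[cycle % NSLOTS][out];
}

/* Indexing by output already rules out two inputs claiming one output.
 * This checks the remaining condition, assuming no multicast:
 * no input is scheduled onto two outputs in the same slot. */
int slot_table_is_contention_free(void)
{
    for (int t = 0; t < NSLOTS; t++)
        for (int a = 0; a < NPORTS; a++)
            for (int b = a + 1; b < NPORTS; b++)
                if (slot_table[t][a] != IDLE &&
                    slot_table[t][a] == slot_table[t][b])
                    return 0;
    return 1;
}
```

Because the router only reads the table, timing is known in advance; the cost is that a scheduler has to produce the table before the system runs.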
miniMIPS Data processor
• pipelined
• 28 instructions
• separate D/I memory
• synthesizable SystemC
Network interfacing
• Memory mapped network device
– Data: 0x8000000
– Ctl: 0x8000004
[Figure: mMIPS with IM, DM and NI; NI signals address, send, data_rdy, send_rdy]
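Access to a memory-mapped device like this can be sketched in C as a register block with a data word followed by a control word. The slide names the signals (address, send, data_rdy, send_rdy) but not their bit positions, so the bit layout and the function names below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block: data word first, control word 4 bytes later. */
struct ni_regs {
    volatile uint32_t data;
    volatile uint32_t ctl;
};

/* Hypothetical control-word layout (not given on the slide). */
#define NI_SEND_RDY   (1u << 0)   /* interface can accept a word    */
#define NI_DATA_RDY   (1u << 1)   /* a received word is waiting     */
#define NI_SEND       (1u << 2)   /* request transmission           */
#define NI_ADDR_SHIFT 8           /* destination address field      */

/* Try to send one word; returns 1 on success, 0 if the interface is busy. */
int ni_try_send(struct ni_regs *ni, uint32_t dest, uint32_t word)
{
    assert(ni != NULL);
    if (!(ni->ctl & NI_SEND_RDY))
        return 0;
    ni->data = word;
    ni->ctl = (dest << NI_ADDR_SHIFT) | NI_SEND;
    return 1;
}

/* Try to read one received word; returns 1 if a word was available. */
int ni_try_receive(struct ni_regs *ni, uint32_t *word)
{
    assert(ni != NULL && word != NULL);
    if (!(ni->ctl & NI_DATA_RDY))
        return 0;
    *word = ni->data;
    return 1;
}
```

On the target, the pointer would be obtained by casting the data address from the slide to `struct ni_regs *`; on a host, an ordinary variable of that type can stand in for the hardware.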
Memory
• Data and instruction cache
– Currently: local main memory
– Extension: network access to remote memory
[Figure: blocks RAM, MEMIF, I$, IM, D$, DM, NI+, NI and mMIPS]
Implementation
mMIPS  : 600 slices
Cache  : 2 x 300 slices
Router : 500 slices
N.I.   : 100 slices
Total  : 1800 slices
Virtex2 3000: 15,000 slices + 200 KB RAM @ 30-50 MHz
Software for the Network-on-FPGA
January 2004, version 1.0
C compiler (LCC)
• Advantages
+ Designed for retargetability
+ Ported by Jan Hoogerbrugge for mMips
+ Different memory layouts supported without recompilation
• Disadvantages
– ANSI/POSIX libraries not implemented
– No debugging information
mMips communication revisited
Memory mapped communication
Status_word
• Request transmission of Data_word
• Check whether Data_word is valid
• Set destination node address
Data_word
• Contains received data
• Location to write outgoing data to
[Figure: 32-bit wide memory map from 0x0000 up to the max. physical address, showing Status_word and Data_word]
C communications library
Possible communication scheme:
Message passing
• Blocking send and receive
• Non-blocking send (= try) and receive (= peek)
Possible implementation:
C Function*                           | Description
sc_send_word() and sc_receive_word()  | Send or receive exactly 4 bytes
sc_send() and sc_receive()            | Send / receive any number of bytes
* Retry count as optional parameter
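How a byte-oriented send could be layered on a 4-byte word send with a retry count is sketched below. The slides do not give the signatures of sc_send_word() or sc_send(), so the sketch uses its own hypothetical names (send_word_retry(), send_bytes()) and a host-side loopback structure in place of the network.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical transport hook: returns 1 if the word was accepted, 0 if busy. */
typedef int (*try_send_fn)(uint32_t word, void *ctx);

/* Send one word, retrying up to `retries` extra times.
 * Returns 1 on success, 0 if every attempt was rejected. */
int send_word_retry(try_send_fn try_send, void *ctx, uint32_t word, int retries)
{
    for (int attempt = 0; attempt <= retries; attempt++)
        if (try_send(word, ctx))
            return 1;
    return 0;
}

/* Send len bytes as 4-byte words; the last word is zero-padded.
 * Returns the number of payload bytes that were sent. */
size_t send_bytes(try_send_fn try_send, void *ctx,
                  const void *buf, size_t len, int retries)
{
    const unsigned char *p = buf;
    size_t sent = 0;
    assert(buf != NULL || len == 0);
    while (sent < len) {
        size_t chunk = (len - sent < 4) ? len - sent : 4;
        uint32_t word = 0;
        memcpy(&word, p + sent, chunk);
        if (!send_word_retry(try_send, ctx, word, retries))
            break;
        sent += chunk;
    }
    return sent;
}

/* Host-side stand-in for the network: stores accepted words and
 * rejects the first `busy` attempts. */
struct loopback { uint32_t words[8]; size_t count; int busy; };

int loopback_try_send(uint32_t word, void *ctx)
{
    struct loopback *lb = ctx;
    if (lb->busy > 0) { lb->busy--; return 0; }
    if (lb->count == 8) return 0;
    lb->words[lb->count++] = word;
    return 1;
}
```

With a retry count of 0 the call behaves like a non-blocking try; a large retry count approximates a blocking send.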
C communications library
Advantages of Message Passing
• Directly supported by hardware
– Small code base (meets memory constraints)
– Easy to implement (meets time constraints)
• Forms basis for more complex protocols
– Only two operations (meets constraints for simplicity)
– Uses message passing (= a standard, as required)
Simulator (SystemC)
System level design tool
– C++ Class Libraries for hardware constructs, such as adders
– SystemC model of the mMips network
– Standalone executable can be generated
Simulator (SystemC)
Important debugging tool
– VCD tracings
– Memory dumps (ROM & RAM)
– Spy module:
• Spy on instruction pointer (IP) & communication
• Watch read/writes on specific addresses
• Stop simulation when IP at specific address
• Additional options…
C library for debugging
Desirable because:
• LCC cannot generate debugging info
• No CRT/console, so no printf()
C library for debugging
Solution to debugging problem?
• Implements a printf()-variant
• Writes output to memory
Useful for both Simulator and FPGA implementation.
FPGA memory (top to bottom):
0x8000
Program data and Stack
- Reserved - (output of printf() is stored here)
0x4000
Instructions
0x0000
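A printf()-variant that writes into memory instead of a console can be sketched as follows. The sketch is host-side and rests on assumptions: it uses the host's vsnprintf() for formatting, which the slides say is not available with the LCC port, so a target version would need its own formatter. The buffer dbg_area stands in for the reserved region, and the names dbg_printf(), dbg_text() and dbg_reset() are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DBG_SIZE 256

static char dbg_area[DBG_SIZE];   /* stand-in for the reserved memory region */
static size_t dbg_used;           /* bytes of text written so far            */

/* Clear the debug area. */
void dbg_reset(void)
{
    memset(dbg_area, 0, DBG_SIZE);
    dbg_used = 0;
}

/* Append formatted text to the debug area.
 * Returns the number of characters stored (output is truncated when full). */
int dbg_printf(const char *fmt, ...)
{
    va_list ap;
    int n;
    size_t room = DBG_SIZE - dbg_used;   /* always at least 1, for the NUL */
    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(dbg_area + dbg_used, room, fmt, ap);
    va_end(ap);
    if (n < 0)
        return n;
    if ((size_t)n >= room)
        n = (int)(room - 1);             /* text did not fit completely */
    dbg_used += (size_t)n;
    return n;
}

/* Read back the collected text, e.g. from a memory dump. */
const char *dbg_text(void)
{
    return dbg_area;
}
```

The collected text can then be inspected in a simulator memory dump or read back from the FPGA memory after the run.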
NoC applications
Two online and tested applications
• Multi processor JPEG decoder
• “Gossip”: a small message circulates the network
JPEG decoder
Input:
JPEG image
2x2 mMips
Network
Output:
BITMAP image
"Gossip" application
Network layout
2-by-2 network (4 nodes)
Memory (per node)
16 Kbyte ROM, 16 Kbyte RAM
Send a short message over the network
Message (18 bytes): “I know something!”
Nodes: Node 0 (x0y0), Node 1 (x1y0), Node 2 (x0y1), Node 3 (x1y1)
“Gossip”: from idea to hardware
1. Create the C program
• All nodes are identical except for their node ID
• Node ID: pointer to address in user_data segment.
2. Compilation
• Compile one node (lcc)
• Separate code and data using a shell script
• Insert user_data
[Figure: Node 0 memory image with Program data and Stack, User data and Program code; a file with user data (e.g. Node ID) is inserted]
“Gossip”: from idea to hardware
3. Use the SystemC simulator to test & debug
4. Upload to and run in FPGA
[Figure: Node 0 memory image with Program data and Stack, User data and Program code]