Network on a Chip


Network on a Chip:
An Architecture for the Billion Transistor Era
A. Hemani, A. Jantsch, S. Kumar, A. Postula,
J. Öberg, M. Millberg, D. Lindqvist
Royal Institute of Technology, Stockholm
Jönköping University, Jönköping
University of Queensland, Brisbane
Ericsson Radio Systems, Stockholm
The Problem
And it is not just more gates . . .
Wire Delay
Power Management
Embedded Software
More design choices (HW, µP, DSP, FPGA, …)
Signal Integrity
Hybrid Chips
Methodologies & Platforms
• Behavioural synthesis
  - Solves an insignificant problem today.
  - Will eventually replace and/or subsume RTL synthesis.
• IP/VC based design method
  - 200-400 IP/VC blocks of 100k gates required in the 0.1 micron era.
  - Interface design too big a problem.
• Platform based design: a step in the right direction.
• Platforms
  - Bus based interconnect scheme will not scale.
  - FPGAs point in the right direction. Low granularity.
The Emerging Platforms & Architectures
Algorithm on a chip
Hardwired computation
Hardwired interconnectivity
Centralised storage
System on a chip
Programmable computation
Hardwired interconnectivity
Partially distributed storage
Network on a chip
Programmable computation
Programmable interconnectivity
Fully distributed storage
Network on a chip
• Generic
• Computational resources
  - Processor cores, FPGA blocks
• Storage
  - Distributed
• I/O
  - Programmable
• Interconnect
  - All resources have an address
  - Resources are interconnected by a network of switches
  - Resources communicate by sending addressed packets of data.
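The addressing idea above can be illustrated with a minimal sketch. It is not taken from the slides: the `Packet` and `Switch` classes, the static `routes` table, and the integer addresses are all hypothetical names chosen for illustration, and a real switch would add buffering, arbitration, and flow control.

```python
from dataclasses import dataclass, field

@dataclass
class Packet:
    dest: int        # address of the destination resource (hypothetical integer address)
    payload: bytes

@dataclass
class Switch:
    name: str
    local: dict = field(default_factory=dict)   # address -> inbox of an attached resource
    routes: dict = field(default_factory=dict)  # address -> next-hop Switch (assumed static table)

    def attach(self, address):
        """Attach a resource with the given address and return its inbox."""
        self.local[address] = []
        return self.local[address]

    def forward(self, packet, hops=0):
        """Deliver locally if possible, else pass to the next switch.

        Returns the number of switch-to-switch hops taken.
        """
        if packet.dest in self.local:
            self.local[packet.dest].append(packet.payload)
            return hops
        return self.routes[packet.dest].forward(packet, hops + 1)

# Two switches, one resource on each; resource 0 sends to resource 1.
s0, s1 = Switch("s0"), Switch("s1")
inbox0 = s0.attach(0)
inbox1 = s1.attach(1)
s0.routes[1] = s1
s1.routes[0] = s0
hops = s0.forward(Packet(dest=1, payload=b"hello"))
```

The sender only needs the destination address; which wires the data crosses is decided by the switches.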
Honeycomb Structure: a Possible NOC Topology
• Nodes of a honeycomb cell are
populated with resources
• A switch at centre interconnects
resources at nodes
• Switches are connected to their
immediate neighbours
• Each resource is directly
connected to three switches and
can reach 12 resources with a
single hop.
• Connectivity is further improved
by directly connecting switches
to their next nearest neighbour.
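The connectivity figures above (three switches per resource, 12 resources within a single hop) can be checked with a small sketch. The coordinate scheme is an assumption made for illustration, not something the slides specify: points of a triangular lattice in axial coordinates `(a, b)` are treated as switches when `(a - b) % 3 == 0` and as resources otherwise, which places the resources on the corners of hexagonal cells with a switch at each cell centre. The optional switch-to-switch links are not modelled.

```python
# Six neighbour offsets on a triangular lattice in axial coordinates.
NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def is_switch(p):
    """Cell centres (switches) are the lattice points with (a - b) divisible by 3."""
    a, b = p
    return (a - b) % 3 == 0

def adjacent(p):
    a, b = p
    return [(a + da, b + db) for da, db in NEIGHBOURS]

def switches_of(resource):
    """Switches directly connected to a resource."""
    return [p for p in adjacent(resource) if is_switch(p)]

def resources_of(switch):
    """Resources sitting on the corners of the cell around a switch."""
    return [p for p in adjacent(switch) if not is_switch(p)]

def single_hop_peers(resource):
    """Resources reachable through exactly one switch."""
    peers = set()
    for s in switches_of(resource):
        peers.update(resources_of(s))
    peers.discard(resource)
    return peers

r = (1, 0)  # an arbitrary resource position
n_switches = len(switches_of(r))
n_peers = len(single_hop_peers(r))
```

Each switch serves six resources, and the three cells meeting at a resource share corners pairwise, which is why the count comes to 12 rather than 15.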
The Performance Overhead
• Pipelined interconnect
  - Wire delays will soon require pipelining of wires (Berkeley).
  - In 0.1 micron, long-wire delay will be about 100x gate delay.
  - Signals will need 10s of clock cycles to cross the chip.
  - Switching in NOC provides natural pipelining.
• Latency is attenuated
  - Globally asynchronous & locally synchronous design.
  - Switching with low logic depth can be the high speed clock domain.
  - Computation with high logic depth can be the slow clock domain.
  - Latency is attenuated by the ratio of communication clock to computation clock.
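The attenuation argument can be made concrete with a little arithmetic. The function below is a sketch under stated assumptions: the parameter names, the fixed cost per hop, and the example frequencies are illustrative and do not come from the slides.

```python
def latency_in_compute_cycles(hops, cycles_per_hop, f_comm_hz, f_comp_hz):
    """Network latency as seen by a resource, in cycles of the computation clock.

    hops * cycles_per_hop is the latency in communication-clock cycles;
    dividing by the clock ratio f_comm / f_comp converts it to the slower
    computation clock.
    """
    comm_cycles = hops * cycles_per_hop
    return comm_cycles * f_comp_hz / f_comm_hz

# Illustrative numbers only: 10 hops at 1 cycle each,
# switches clocked at 1 GHz, resources at 250 MHz.
seen_by_resource = latency_in_compute_cycles(10, 1, 1e9, 250e6)
```

With a clock ratio of 4, ten switch cycles cost the computing resource 2.5 of its own cycles.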
The Area Overhead
• The area overhead study by Guerrier and Greiner shows that the area overhead will not be an issue.
• A more accurate and detailed answer will have to await further research.
Design Methodology
Design entry: set of communicating tasks
Task graph analysis
Scheduling policy
Binding of tasks to resources
Code generation for tasks
NOC Compiler
• How to map an application onto the NOC platform?
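The binding step in this flow can be sketched as a simple greedy heuristic. This is not the authors' compiler, which the slides leave as an open question. The `bind_tasks` function, the traffic table, and the `distance` callback are hypothetical, and the sketch assumes at least as many resources as tasks.

```python
def bind_tasks(traffic, resources, distance):
    """Greedily bind tasks to resources.

    traffic:   dict mapping (task_a, task_b) -> communication volume
    resources: list of resource addresses
    distance:  function (res_a, res_b) -> hop count (assumed given)
    """
    # Total communication volume per task, used to decide placement order.
    load = {}
    for (a, b), volume in traffic.items():
        load[a] = load.get(a, 0) + volume
        load[b] = load.get(b, 0) + volume

    binding = {}
    free = list(resources)
    # Place the most communication-heavy tasks first.
    for task in sorted(load, key=lambda t: (-load[t], t)):
        def cost(resource):
            total = 0
            for (a, b), volume in traffic.items():
                other = b if a == task else a if b == task else None
                if other in binding:
                    total += volume * distance(resource, binding[other])
            return total
        best = min(free, key=cost)
        free.remove(best)
        binding[task] = best
    return binding

# Three tasks on three resources in a row; distance is the index difference.
traffic = {("A", "B"): 10, ("B", "C"): 1}
binding = bind_tasks(traffic, [0, 1, 2], lambda x, y: abs(x - y))
```

The heavily communicating pair A and B ends up on adjacent resources. A greedy pass like this is not optimal in general; a real NOC compiler would also have to account for scheduling and link contention.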
• Future systems on chip will be networks
• Fixed platforms will facilitate design
• Main open questions:
  - Network
  - Network nodes?
  - NOC Compiler?