Dr. Amr Talaat ELECT1002 SoC Design

Communication Synthesis:
Buses and Network-on-Chip (NOC)
Dr. Eng. Amr T. Abdel-Hamid
Spring 2008
System-On-a-Chip Design
ELECT 1002
The SoC nightmare
[Figure: The “Board-on-a-Chip” Approach. Labels: System Bus, DMA, CPU, Mem Ctrl., MPEG, DSP, Bridge, Peripheral Bus.]
• The architecture is tightly coupled
Source: Prof Jan Rabaey CS-252-2000 UC Berkeley
Very long wires
[Figure: points A and B on the chip. Year 2005: 1 ns (1 GHz). Year 2010: 0.1 ns (10 GHz).]
Why NoC?
• Global wire delays
  - increase exponentially, or at best linearly when repeaters are inserted
  - may still exceed one clock cycle even after repeater insertion
• In ultra-deep submicron processes, 80% or more of the delay of critical paths will be due to interconnections
• Communication structures need to be designed first, followed by the functional blocks
Homogeneous SoC (MP-SoC)
[Figure: eight CPU/MEM pairs attached to an interconnection network (BUS, XBAR).]
Why not bus?
• The shared-medium arbitrated bus is the most frequently used on-chip interconnect architecture
• Pros
  - Simple, low area cost, and extensible
• Cons
  - The intrinsic parasitic resistance and capacitance can be quite high for a long bus line
  - Every additional IP block adds parasitic capacitance and so increases propagation delay
  - The number of IP blocks that can be connected to the bus is limited
On-Chip Communication
[Figure: bus-based architectures, irregular architectures, regular architectures]
• Bus-based interconnect
  - Low cost
  - Easier to implement
  - Flexible
• Networks on Chip
  - Layered approach
  - Buses replaced with networked architectures
  - Better electrical properties
  - Higher bandwidth
  - Energy efficiency
  - Scalable
Network on Chip
[Figure: separation of concerns. Two layer stacks, each labeled Software, Transport, Network, Wiring; additional label: Data Link Layer.]
• Communication-based design
  - Orthogonalizes function and communication
  - Builds on well-known models of computation and a correct-by-construction synthesis flow
  - Parallels the layered approach exploited by the communications community
NoC
What is Network-on-Chip (NoC)?
• Leverages existing computer networking principles to improve inter-component, intra-chip communication in an SoC
• Each on-chip component is connected by a switch to particular communication wire(s)
• An improvement over standard bus-based interconnections for SoC architectures in terms of throughput
SOC Current Trend
• Explicitly parallel SoC architectures
• Integrating huge amounts of memory in chip designs
• Distributed shared memory environments
• Should allow an interconnection-centric design flow and better predictability
• Physical design closure
• Wire delay dominates gate delay
Design goal of NoC
• High throughput
• Low latency
• Less energy consumption
• Small area requirements
• Network-on-Chip basics:
  - Architectures
  - Routing strategies
  - Evaluation
[Figure 1: NoC Architecture. Labels: IP Core, CNI, Router Logic, To/From Network.]
Routing: Circuit/Packet Switching
Circuit Switching
• A dedicated path, or circuit, is established over which data packets will travel
• Naturally lends itself to time-sensitive guaranteed service due to resource allocation
• Reservation of bandwidth decreases overall throughput and increases average delays
Packet Switching
• Intermediate routers are responsible for routing individual packets through the network, rather than all packets following a single path
• Provides so-called best-effort service
Routing: Wormhole/Virtual Cut Through
Wormhole Switching
• A message is divided into smaller, fixed-length flow units called flits
• Only the first flit contains routing information; subsequent flits follow it
• Buffer size is significantly reduced because only a limited number of flits need to be buffered at any given time
Virtual Cut-Through Switching
• Much like wormhole switching
• The header flit can travel ahead and undergo processing while the remaining flits are still navigating the network
• Higher acceptance rates and lower latencies than wormhole switching
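The flit decomposition used by wormhole switching can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `to_flits`, the HEAD/DATA/TAIL tags, the zero padding, and the 4-byte flit size are hypothetical choices, not taken from the slides.

```python
def to_flits(route, payload, flit_bytes=4):
    """Split a message into one header flit plus fixed-length payload flits.

    Hypothetical helper: tags, padding and flit size are illustrative only.
    """
    flits = [("HEAD", route)]  # only the first flit carries routing information
    chunks = [payload[i:i + flit_bytes] for i in range(0, len(payload), flit_bytes)]
    for index, chunk in enumerate(chunks):
        chunk = chunk.ljust(flit_bytes, b"\x00")  # pad the last chunk to the fixed length
        kind = "TAIL" if index == len(chunks) - 1 else "DATA"
        flits.append((kind, chunk))
    return flits


# A 9-byte message becomes one header flit and three 4-byte payload flits.
flits = to_flits(route=[2, 1, 3], payload=b"hello NoC")
for flit in flits:
    print(flit)
```

Because the payload flits carry no route, a router only has to remember which output the header flit took and forward the rest along the same path, which is why small buffers suffice.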
Wormhole Switching
[Figure: wormhole switching]
Routing: Contention
• Contention occurs when routers or IP blocks attempt to send data over the same link at the same time
• For circuit switching, contention is resolved when the connection is set up
• For packet switching, contention is resolved at a much finer granularity: the router buffers and schedules individual packets
• Packet-switched networks give better overall performance, at the cost of losing service guarantees
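To make per-packet contention resolution concrete, here is a small sketch of a router output port that buffers packets from several inputs and forwards one per cycle. The round-robin policy and the function name `arbitrate` are assumptions made for illustration; the slides do not say which scheduling policy a router uses.

```python
from collections import deque


def arbitrate(input_queues):
    """Return the order in which buffered packets leave over one shared link.

    Assumed policy: round-robin over the inputs, one packet per cycle.
    """
    queues = [deque(q) for q in input_queues]  # per-input packet buffers
    order = []
    pointer = 0  # input that has priority in the current cycle
    while any(queues):
        for offset in range(len(queues)):
            index = (pointer + offset) % len(queues)
            if queues[index]:
                order.append((index, queues[index].popleft()))
                pointer = (index + 1) % len(queues)  # the next input gets priority
                break
    return order


# Inputs 0 and 1 both want the same output link.
schedule = arbitrate([["a1", "a2"], ["b1"]])
print(schedule)  # [(0, 'a1'), (1, 'b1'), (0, 'a2')]
```

No input is starved, but no packet has a guaranteed delivery time either, which matches the best-effort character of packet switching.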
Architectures: SPIN
• SPIN: Scalable, Programmable, Integrated Network
• Every level has the same number of switches
• Network size grows as (N log N)/8
• Trades area overhead and decreased power efficiency for higher throughput
• Illustrative of the performance vs. power consumption trade-off
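The growth formula can be evaluated directly. The sketch below assumes that N is the number of IP blocks, that the result counts switches, and that the logarithm is base 2; none of these is stated explicitly on the slide, and `spin_switch_count` is a hypothetical helper name.

```python
import math


def spin_switch_count(num_blocks):
    """Evaluate (N log2 N) / 8 for a network with N IP blocks (assumed reading)."""
    return num_blocks * math.log2(num_blocks) / 8


for n in (16, 64, 256):
    print(n, spin_switch_count(n))
```

Under these assumptions a 16-block network needs 8 switches and a 256-block network needs 256, so the switch count grows faster than the number of blocks.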
Architectures: CLICHE
• CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements
• Two-dimensional mesh network layout for NoC design
• Every switch is connected to its four nearest neighboring switches and to its target resource block, except the switches on the edge of the layout
• Each connection consists of two unidirectional links
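The mesh connectivity rule can be written down as a short function. This is a sketch that assumes switches are addressed by (x, y) grid coordinates; the name `mesh_neighbors` and the 4x4 size are illustrative only.

```python
def mesh_neighbors(x, y, width, height):
    """Return the switches directly linked to switch (x, y) in a 2D mesh."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    # Edge and corner switches simply have fewer neighbors.
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]


corner = mesh_neighbors(0, 0, 4, 4)
interior = mesh_neighbors(1, 1, 4, 4)

# Each neighbor relation is one unidirectional link, so summing over all
# switches counts every connection twice (once per direction).
unidirectional_links = sum(len(mesh_neighbors(x, y, 4, 4))
                           for x in range(4) for y in range(4))
print(corner, len(interior), unidirectional_links)
```

For a 4x4 mesh this gives 48 unidirectional switch-to-switch links, that is, 24 bidirectional connections.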
Architectures: Torus
• Similar to mesh-based architectures
• Wires wrap around from the top component to the bottom and from the rightmost to the leftmost
• Smaller hop count
• Higher bandwidth
• Decreased contention
• Increased chip space usage
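The smaller hop count follows from the wrap-around links, as the comparison below illustrates. It assumes minimal routing and (x, y) switch coordinates; `mesh_hops` and `torus_hops` are hypothetical helper names.

```python
def mesh_hops(src, dst):
    """Minimal hop count between two switches in a 2D mesh."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def torus_hops(src, dst, width, height):
    """Minimal hop count in a 2D torus, where each dimension may wrap around."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return min(dx, width - dx) + min(dy, height - dy)


# Opposite corners of a 4x4 layout.
print(mesh_hops((0, 0), (3, 3)))         # 6
print(torus_hops((0, 0), (3, 3), 4, 4))  # 2
```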
Architectures: Folded Torus
• Similar to the torus
• In a torus, the long end-around connections can yield excessive delays
• This is avoided by folding the torus
Architectures: Octagon
• Standard model: 8 components, 12 interconnects
• Design complexity increases linearly with the number of nodes
• Largest packet travel distance is two hops
• High throughput
• Shortest-path routing is easy to implement
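The figures quoted for the octagon (12 interconnects, at most two hops) can be checked with a breadth-first search. The sketch assumes the usual octagon wiring, a ring of eight nodes plus a link from each node to the one directly opposite; that wiring is not spelled out on the slide.

```python
from collections import deque

NODES = 8

# Assumed wiring: ring links (n, n+1) plus cross links (n, n+4).
links = set()
for n in range(NODES):
    links.add(frozenset((n, (n + 1) % NODES)))
    links.add(frozenset((n, (n + 4) % NODES)))

adjacency = {n: set() for n in range(NODES)}
for link in links:
    a, b = link
    adjacency[a].add(b)
    adjacency[b].add(a)


def hops(src, dst):
    """Shortest hop count between two nodes, found by breadth-first search."""
    distance = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return distance[node]
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    raise ValueError("destination unreachable")


max_hops = max(hops(s, d) for s in range(NODES) for d in range(NODES))
print(len(links), max_hops)  # 12 2
```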
Architectures: BFT
• BFT: Butterfly Fat Tree
• Each node in the tree model has coordinates (level, position), where level is the depth and position counts from left to right
• Leaves are component blocks
• Interior nodes are switches
• Each switch has four child ports and two parent ports
• log n levels (base 4, since each switch has four children); the i-th level has n/2^(i+1) switches, where n is the number of leaves (blocks)
• Uses traffic aggregation to reduce congestion
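The switch count per level can be tabulated with a short script. It rests on two assumptions: the per-level formula is read as n/2^(i+1) with switch levels numbered from 1, and the number of levels is log base 4 of n because every switch has four children. The helper name is hypothetical.

```python
def bft_switches_per_level(num_leaves):
    """Return the switch count of each level, bottom (level 1) to top."""
    # Number of levels = log4(num_leaves), computed with integer arithmetic.
    levels = 0
    capacity = 1
    while capacity < num_leaves:
        capacity *= 4
        levels += 1
    return [num_leaves // 2 ** (i + 1) for i in range(1, levels + 1)]


for n in (16, 64, 256):
    per_level = bft_switches_per_level(n)
    print(n, per_level, sum(per_level))
```

Each level has half as many switches as the one below it, so under these assumptions a 256-block BFT uses 120 switches in total.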
Network interface
• Open Core Protocol (OCP)
  - An interface standard between IP cores and the interconnection fabric
Packet Format
Type: Head, Data, Tail and Complete
VCID: Virtual Channel Identifier
Route: N-bit route field; the last 2 bits specify the route to be taken at the next controller
  00 - Left
  01 - Right
  10 - Straight
  11 - Extract
Data: actual data field
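The route field encoding above can be decoded hop by hop. The sketch assumes that the "last 2 bits" are the least significant bits and that each controller shifts them out before forwarding the packet; both are interpretations, and `decode_route` is a hypothetical helper.

```python
DIRECTIONS = {0b00: "Left", 0b01: "Right", 0b10: "Straight", 0b11: "Extract"}


def decode_route(route, max_hops=16):
    """List the direction taken at each controller until the packet is extracted."""
    steps = []
    for _ in range(max_hops):  # guard against routes that never reach Extract
        direction = DIRECTIONS[route & 0b11]  # inspect the last 2 bits
        steps.append(direction)
        if direction == "Extract":
            break
        route >>= 2  # assumed: consumed bits are shifted out
    return steps


# Straight at the first controller, Right at the second, then Extract.
path = decode_route(0b110110)
print(path)  # ['Straight', 'Right', 'Extract']
```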
Routing Example
[Figure: routing example]
Simulation
A simulator is used to investigate various metrics:
• Each system consists of 256 functional IP blocks
• Wormhole routing is used
• The user can choose uniform or localized traffic
• Both Poisson and self-similar message injection distributions are supported
• A flit is one word (36 bits, of which 4 bits are for packet framing)
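As a rough idea of what the traffic generator of such a simulator might do, the sketch below produces Poisson injection times and uniformly distributed destinations for 256 blocks. It is not the simulator from the lecture: the function names, the injection rate, and the random seed are all assumptions.

```python
import random

NUM_BLOCKS = 256


def poisson_injection_times(rate, count, rng):
    """Injection times of a Poisson process: exponential inter-arrival gaps."""
    now = 0.0
    times = []
    for _ in range(count):
        now += rng.expovariate(rate)
        times.append(now)
    return times


def uniform_destination(src, rng, num_blocks=NUM_BLOCKS):
    """Pick any block other than the source with equal probability."""
    dst = rng.randrange(num_blocks - 1)
    return dst if dst < src else dst + 1  # skip over the source index


rng = random.Random(1)  # fixed seed so that runs are repeatable
times = poisson_injection_times(rate=0.1, count=5, rng=rng)
destinations = [uniform_destination(src=7, rng=rng) for _ in range(1000)]
print(times)
```

Localized traffic and self-similar injection would need different generators and are left out here.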
Area comparison
• SPIN and Octagon have a considerably higher silicon area overhead.
Projected performance
[Figure: projected performance]