Topology Optimization for Application-Specific Networks-on-Chip


Tapani Ahonen
Tampere University of Technology
Institute of Digital and Computer Systems
P.O.Box 553, FI-33101 Tampere
Phone +358 3 3115 4562
Fax +358 3 3115 3095
Email [email protected]
SLIP 14.2.2004
Tampere University of Technology
Institute of Digital and Computer Systems
Outline

Motivation

Platform Design

OIDIPUS – A Network-on-Chip Topology Design Tool

Case Study in Process Control and Monitoring

Evaluation of the Tool
Motivation for a Design Paradigm Shift

system integration to a single chip
– complex interconnections
– signal coupling
– timing closure problems
– ...

increasing mask costs
– ASICs are vanishing from low-volume markets and frequently updated products

long interconnections are very costly
– FPGAs fall behind

chip-level reuse: development platforms for application domains
– complicated design process
– programmability implies high data traffic
– block-level reuse
Platform Design Flow

Specify performance requirements at
– task level
– application level
– system level

statistical execution models for temporal requirements
– choose processing elements with known characteristics

map and schedule to minimize communication

optimize NoC layout
– partitioning
– connectivity
– block placement
OIDIPUS - Network-on-Chip Topology Design Tool

layout optimization (based on communication and IP specs)
– target: speed and/or power consumption
• asynchronous communication
– partitioning and block placement
– connectivity optimization of the seed topology under constraints on
• maximum number of node dimensions and width of a link
• reliability

execution at an early stage of design
– assumptions with abstract input information
• square block layout assumed w/o aspect ratio
• node location at center of a side w/o determination
• constant throughput
• default data activity
OIDIPUS – Design Space Exploration


exhaustive search is not feasible with >> 10 hosts
simulated annealing
– allows the search to escape from a local minimum
– simulation schedule derived from an effort parameter
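
As a rough illustration of the simulated annealing idea, the following Python sketch accepts a worse candidate with a probability that shrinks as the temperature falls, which is what lets the search leave a local minimum. It is not the OIDIPUS implementation: the geometric cooling schedule, the way `effort` scales the number of steps, the swap move and the toy line-placement cost are all assumptions made for this example.

```python
import math
import random


def anneal(initial, cost, neighbor, effort=1.0, seed=0):
    """Generic simulated annealing; returns (best_state, best_cost).

    `effort` is a hypothetical knob that scales the number of steps of a
    geometric cooling schedule. How OIDIPUS maps its effort parameter to
    a schedule is not stated in the slides.
    """
    rng = random.Random(seed)
    steps = max(1, int(2000 * effort))
    t_start, t_end = 1.0, 1e-3
    alpha = (t_end / t_start) ** (1.0 / steps)  # geometric cooling factor

    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    for _ in range(steps):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improvements are always accepted; worse moves are accepted with
        # probability exp(-delta / T), which allows leaving a local minimum.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= alpha
    return best, best_cost


def swap_two(order, rng):
    """Neighbor move: swap two positions in a placement order."""
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


# Toy problem (made-up numbers): four hosts on a line, traffic-weighted distance.
TRAFFIC = {(0, 3): 5.0, (1, 2): 1.0, (0, 1): 2.0}


def line_cost(order):
    pos = {host: i for i, host in enumerate(order)}
    return sum(w * abs(pos[a] - pos[b]) for (a, b), w in TRAFFIC.items())


if __name__ == "__main__":
    print(anneal((0, 1, 2, 3), line_cost, swap_two, effort=0.5))
```

Because the best state seen so far is tracked separately from the current state, the returned cost can never be worse than that of the initial placement.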
OIDIPUS – Partitioning the Network


goal: minimize (the cost of) global communication
main factors of power consumption
– communication distance (wire length)
– data activity

actual distance unknown at partitioning time
– use data activity for cost calculation
• (only throughputs contribute with default data activity)

actual latencies are also unknown
– use latency tolerance figures for cost calculation
case study: $F(c) = N_c \left(\theta W_p + (1/\lambda)\, W_l\right)(P_o - P_e/2)$
where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, $W$ stands for weight, $P$ stands for partition
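
The per-channel part of the partitioning cost can be sketched in Python as follows. This is a simplified reading, not the tool's code: the slides do not define the partition factor precisely enough to reproduce it, so the sketch replaces it with a 0/1 indicator of whether a channel crosses a partition boundary, and it sums over channels. The `Channel` type, the block names and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Channel:
    src: str
    dst: str
    throughput: float          # data volume per unit time
    latency_tolerance: float   # larger value means a less urgent channel


def partition_cost(channels, partition_of, w_power=1.0, w_latency=1.0):
    """Sum the weighted throughput and inverse latency tolerance of every
    channel whose endpoints lie in different partitions.

    Assumption: crossing a partition boundary is modelled as a 0/1
    indicator, which stands in for the partition factor of the slides.
    """
    total = 0.0
    for ch in channels:
        if partition_of[ch.src] != partition_of[ch.dst]:
            total += ch.throughput * w_power + w_latency / ch.latency_tolerance
    return total


# Hypothetical channels between blocks named after the case-study block types.
CHANNELS = [
    Channel("ADC", "DSP", throughput=8.0, latency_tolerance=2.0),
    Channel("DSP", "RISC", throughput=2.0, latency_tolerance=4.0),
    Channel("RISC", "MEM", throughput=1.0, latency_tolerance=10.0),
]

if __name__ == "__main__":
    split_a = {"ADC": 0, "DSP": 0, "RISC": 1, "MEM": 1}
    split_b = {"ADC": 0, "DSP": 1, "RISC": 1, "MEM": 1}
    print(partition_cost(CHANNELS, split_a))  # cuts only the DSP-RISC channel
    print(partition_cost(CHANNELS, split_b))  # cuts the busy ADC-DSP channel
```

Under these assumptions, a split that keeps the high-throughput, latency-critical channel inside one partition scores lower, which matches the stated goal of minimizing global communication.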
OIDIPUS – Block Placement


goal: minimize (the cost of) local communication
assumptions
– always through the shortest path w/o probability spec
– asynchronous communication
– proper repeater insertion => delay proportional to wire length


the longest link (length $L_{llp}$) on a path restricts speed
the number of hops from origin to target ($H_p$) determines latency
– $C_{lc} = (1/\lambda)\, H_p L_{llp}$

total path length ($L_p$) dominates power consumption
– $C_{pc} = \theta L_p$

case study: $F(c) = N_c \left(\theta L_p W_p + (1/\lambda)\, H_p L_{llp} W_l\right)$
where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, $W$ stands for weight
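
The block placement cost terms above can be evaluated per path as in the Python sketch below. It assumes a path is given as a list of link lengths and that the total cost is a sum over channels; the function names, the tuple layout and the numbers in the usage example are illustrative assumptions, not part of OIDIPUS.

```python
def path_cost(link_lengths, throughput, latency_tolerance,
              w_power=1.0, w_latency=1.0):
    """Cost of one channel routed over a path given as a list of link lengths."""
    l_p = sum(link_lengths)     # total path length, dominates power
    h_p = len(link_lengths)     # number of hops from origin to target
    l_llp = max(link_lengths)   # longest link on the path, restricts speed
    c_pc = throughput * l_p                   # power-related cost
    c_lc = h_p * l_llp / latency_tolerance    # latency-related cost
    return c_pc * w_power + c_lc * w_latency


def placement_cost(channels, **weights):
    """Sum path costs over channels given as
    (link_lengths, throughput, latency_tolerance) tuples.

    Assumption: the total is a plain sum over all channels.
    """
    return sum(path_cost(*ch, **weights) for ch in channels)


if __name__ == "__main__":
    # Two hops of length 1 and 2, throughput 3, latency tolerance 4:
    # power term 3 * 3 = 9, latency term 2 * 2 / 4 = 1.
    print(path_cost([1.0, 2.0], throughput=3.0, latency_tolerance=4.0))  # 10.0
```

Splitting a long link into two shorter hops lowers the longest-link length but raises the hop count, so the latency term captures the trade-off that the two bullets describe.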
Case Study: Industrial Process Monitoring and Automation

remote monitoring and control
– event sensing (ADCs)
• pattern recognition (DSP)
• voice recognition (DSP)
• probing (sensors)
– network connection / user interface (protocol processor, I/O devices)
• reprogrammability for different product runs

reactive appliances
– programmable system control (RISC)
– motor / process controlling signal generation (controllers & DACs)

maintenance facilitation
– event recording (memory)
Functional Blocks and the communication requirements
Evaluating the Tool with a Single Design Goal

case study specification was used as an input to OIDIPUS

simple benchmark algorithm with human-like behavior
– prioritizes channels and adds respective blocks to the topology trying
to minimize the communication distance
– block placement in bi-directional ring only
– four different results with 2-5% higher cost than OIDIPUS topologies

placement and partitioning by a human designer
– "intelligent" usage of a memory block as a passive network bridge
• eliminates one link, but adds control traffic
– reoptimization of the block placement with OIDIPUS resulted in a cost that was 3.2% lower in one partition and 0.3% lower in the other
– partitioning matched the human design
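
The slides describe the benchmark algorithm only briefly (channels are prioritized and their blocks are added to a bi-directional ring so as to keep the communication distance short). One plausible reading is the greedy Python sketch below. The priority rule (highest throughput first), the insertion strategy, the block names and the throughput values are all assumptions for illustration, not the benchmark actually used.

```python
def ring_distance(i, j, n):
    """Hop distance between positions i and j on a bi-directional ring of n nodes."""
    d = abs(i - j)
    return min(d, n - d)


def ring_cost(order, channels):
    """Throughput-weighted hop distance over channels whose blocks are both placed."""
    pos = {block: i for i, block in enumerate(order)}
    n = len(order)
    return sum(t * ring_distance(pos[a], pos[b], n)
               for a, b, t in channels if a in pos and b in pos)


def greedy_ring(channels):
    """Place blocks on a ring, taking channels in order of decreasing throughput.

    Each new block is inserted at the ring position that gives the lowest
    cost for the blocks placed so far (ties go to the earliest position).
    """
    ring = []
    for a, b, _ in sorted(channels, key=lambda c: -c[2]):
        for block in (a, b):
            if block in ring:
                continue
            candidates = [ring[:k] + [block] + ring[k:]
                          for k in range(len(ring) + 1)]
            ring = min(candidates, key=lambda order: ring_cost(order, channels))
    return ring


# Hypothetical channels: (source block, target block, throughput).
CHANNELS = [
    ("ADC", "DSP", 8.0),
    ("DSP", "RISC", 2.0),
    ("RISC", "MEM", 1.0),
    ("ADC", "MEM", 0.5),
]

if __name__ == "__main__":
    ring = greedy_ring(CHANNELS)
    print(ring, ring_cost(ring, CHANNELS))
```

A greedy heuristic like this never revisits an earlier placement decision, which is one reason a search-based optimizer can be expected to find somewhat cheaper topologies.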
Evaluating the Tool with a Single Design Goal

gradual change of the design goal demonstrated block placement
variation with the objective

design using OIDIPUS from the beginning
– one more link and one more network bridge
• more hops per path on average
– shorter overall interconnections (15%)
– shorter physical path length (11%)
– cost of partitioning ~32% lower, cost of block placement <2% higher



the human designer reduced the cost by ~25% relative to an average design
the OIDIPUS design had a cost approximately 33% lower than average
benefits grow with network complexity
Evaluating the Tool with a Single Design Goal

OIDIPUS was used without partitioning
– 45% longer average path compared to the partitioned design
– 65% more hops
– results
• 38% lower cost than an average implementation
• <27% higher cost than the partitioned design
Conclusion

NoC layouts can be significantly optimized by exploiting
application domain specific features
– critical wires are effectively shortened

human designers succeed through a divide-and-conquer strategy

the higher the level of freedom for an automation tool, the better the
optimization results
– with freedom in connectivity, small partitions should be avoided unless
they are isolated from the rest of the system

This was just the beginning of the journey