Topology Optimization for Application-Specific Networks-on-Chip
Topology Optimization for Application-Specific
Networks-on-Chip
Tapani Ahonen
Tampere University of Technology
Institute of Digital and Computer Systems
P.O.Box 553, FI-33101 Tampere
Phone +358 3 3115 4562
Fax +358 3 3115 3095
Email [email protected]
SLIP 14.2.2004
Outline
Motivation
Platform Design
OIDIPUS – A Network-on-Chip Topology Design Tool
Case Study in Process Control and Monitoring
Evaluation of the Tool
Motivation for a Design Paradigm Shift
system integration to a single chip
– complex interconnections
– signal coupling
– timing closure problems
– ...
increasing mask costs
– ASICs are vanishing from low-volume markets and frequently
updated products
long interconnections are very costly
– FPGAs fall behind
chip-level reuse: development platforms for application domains
– complicated design process
– programmability implies high data traffic
– block-level reuse
Platform Design Flow
Specify performance requirements in
– task level
– application level
– system level
statistical execution models for
temporal requirements
– choose processing elements with known
characteristics
map and schedule to minimize
communication
optimize NoC layout
– partitioning
– connectivity
– block placement
OIDIPUS – Network-on-Chip Topology Design Tool
layout optimization (based on communication and IP specs)
– target: speed and/or power consumption
• asynchronous communication
– partitioning and block placement
– connectivity optimization of the seed topology under constraints on
• maximum number of node dimensions and width of a link
• reliability
execution at an early stage of design
– assumptions with abstract input information
• square block layout assumed when no aspect ratio is given
• node located at the center of a side when its position is not determined
• constant throughput
• default data activity
OIDIPUS – Design Space Exploration
exhaustive search not feasible with >> 10 hosts
simulated annealing
– allows escaping from a local minimum
– simulation schedule derived from an effort parameter
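The search strategy above can be illustrated with a minimal simulated annealing sketch. It is not the OIDIPUS implementation: the mapping from `effort` to moves per temperature, the fixed cooling factor, the start and end temperatures, and the toy ring-placement problem (`TRAFFIC`, `ring_cost`, `swap_two`) are all assumptions made for illustration.

```python
import math
import random

# Hypothetical traffic weights between four blocks (illustrative data only).
TRAFFIC = {("A", "B"): 5, ("B", "C"): 3, ("C", "D"): 4, ("A", "D"): 1}


def ring_cost(order):
    """Traffic-weighted hop distance on a bidirectional ring."""
    pos = {block: i for i, block in enumerate(order)}
    n = len(order)
    total = 0
    for (a, b), weight in TRAFFIC.items():
        d = abs(pos[a] - pos[b])
        total += weight * min(d, n - d)  # shorter of the two ring directions
    return total


def swap_two(order, rng):
    """Neighbor move: exchange the ring positions of two blocks."""
    i, j = rng.sample(range(len(order)), 2)
    new = list(order)
    new[i], new[j] = new[j], new[i]
    return tuple(new)


def simulated_annealing(initial, cost, neighbor, effort=1.0,
                        t_start=10.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    # Assumed schedule: more effort means more moves per temperature step.
    moves_per_temp = max(1, int(50 * effort))
    cooling = 0.9  # assumed geometric cooling factor
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            delta = cost(candidate) - current_cost
            # Improvements are always accepted; a worse state is accepted
            # with probability exp(-delta / t), which is what allows the
            # search to leave a local minimum.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= cooling
    return best, best_cost


start = ("A", "C", "B", "D")
best, best_cost = simulated_annealing(start, ring_cost, swap_two)
print(start, ring_cost(start), "->", best, best_cost)
```

The best state seen so far is tracked separately from the current state, so accepting uphill moves never loses a good solution that was already found.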
OIDIPUS – Partitioning the Network
goal: minimize (the cost of) global communication
main factors of power consumption
– communication distance (wire length)
– data activity
actual distance unknown at partitioning time
– use data activity for cost calculation
• (only throughputs contribute with default data activity)
actual latencies are also unknown
– use latency tolerance figures for cost calculation
case study: $F(c) = N_c \left(\theta W_p + \frac{1}{\lambda} W_l\right)\left(P_o - \frac{P_e}{2}\right)$
where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, $W$ stands for weight, and $P$ stands for partition
OIDIPUS – Block Placement
goal: minimize (the cost of) local communication
assumptions
– always through the shortest path w/o probability spec
– asynchronous communication
– proper repeater insertion => delay proportional to wire length
the longest link (length $L_{llp}$) on a path restricts speed
the number of hops from origin to target ($H_p$) determines latency
– $C_{lc} = \frac{1}{\lambda} H_p L_{llp}$
total path length ($L_p$) dominates power consumption
– $C_{pc} = L_p$
case study: $F(c) = N_c \left(\theta L_p W_p + \frac{1}{\lambda} H_p L_{llp} W_l\right)$
where $N_c$ is the number of communication channels, $\theta$ is throughput, $\lambda$ is latency tolerance, and $W$ stands for weight
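As a worked example, the block placement cost above can be evaluated for a single path. This is a sketch under assumptions: the `Path` record, its field names, and the numeric values are hypothetical, the hop count is taken to equal the number of links on the path, and throughput is taken to multiply the power term as in the case-study formula.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    channels: int              # number of communication channels on the path
    throughput: float          # throughput of the communication
    latency_tolerance: float   # latency tolerance of the communication
    link_lengths: List[float]  # lengths of the links along the shortest path


def latency_cost(path: Path) -> float:
    """Hops times longest link length, divided by latency tolerance."""
    hops = len(path.link_lengths)   # assumption: one hop per link
    longest = max(path.link_lengths)
    return hops * longest / path.latency_tolerance


def power_cost(path: Path) -> float:
    """Total path length, which dominates power consumption."""
    return sum(path.link_lengths)


def placement_cost(path: Path, w_power: float = 1.0,
                   w_latency: float = 1.0) -> float:
    """Weighted sum of the power and latency terms, scaled by channel count."""
    power_term = path.throughput * power_cost(path) * w_power
    latency_term = latency_cost(path) * w_latency
    return path.channels * (power_term + latency_term)


# Hypothetical path: three links of lengths 1, 3 and 2.
example = Path(channels=2, throughput=4.0, latency_tolerance=2.0,
               link_lengths=[1.0, 3.0, 2.0])
print(latency_cost(example))    # 3 hops * longest link 3.0 / 2.0 = 4.5
print(power_cost(example))      # 1.0 + 3.0 + 2.0 = 6.0
print(placement_cost(example))  # 2 * (4.0 * 6.0 + 4.5) = 57.0
```

Raising `w_latency` relative to `w_power` shifts the objective toward fewer hops and shorter worst-case links; raising `w_power` favors a shorter total path.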
Case Study: Industrial Process Monitoring and Automation
remote monitoring and control
– event sensing (ADCs)
• pattern recognition (DSP)
• voice recognition (DSP)
• probing (sensors)
– network connection / user interface (protocol processor, I/O devices)
• reprogrammability for different product runs
reactive appliances
– programmable system control (RISC)
– motor / process controlling signal generation (controllers & DACs)
maintenance facilitation
– event recording (memory)
Functional Blocks and the communication requirements
Evaluating the Tool with a Single Design Goal
case study specification was used as an input to OIDIPUS
simple benchmark algorithm with human-like behavior
– prioritizes channels and adds respective blocks to the topology trying
to minimize the communication distance
– block placement in bi-directional ring only
– four different results with 2-5% higher cost than OIDIPUS topologies
placement and partitioning by a human designer
– ”intelligent” usage of a memory block as a passive network bridge
• eliminates one link, but adds control traffic
– reoptimization of the block placement with OIDIPUS resulted in a cost
that was 3.2% lower in one partition and 0.3% lower in the other
– partitioning matched the human design
Evaluating the Tool with a Single Design Goal
gradual change of the design goal demonstrated block placement
variation with the objective
design using OIDIPUS from the beginning
– one more link and one more network bridge
• more hops per path on average
– shorter overall interconnections (15%)
– shorter physical path length (11%)
– cost of partitioning ~32% lower, cost of block placement <2% higher
the human designer reduced the cost by ~25% against an average design
OIDIPUS design had a cost approximately 33% lower than average
benefits grow with network complexity
Evaluating the Tool with a Single Design Goal
OIDIPUS was used without partitioning
– 45% longer average path compared to the partitioned design
– 65% more hops
– results
• 38% lower cost than an average implementation
• <27% higher cost than the partitioned design
Conclusion
NoC layouts can be significantly optimized by exploiting
application domain specific features
– critical wires are effectively shortened
human designers succeed through a divide-and-conquer strategy
the higher the level of freedom for an automation tool, the better the
optimization results
– with freedom in connectivity, small partitions should be avoided unless
they are isolated from the rest of the system
This was just the beginning of the journey