High-end Routers & Modern Supercomputers
Bob Newhall & Dan Lenoski
Cisco Systems, Routing Technology Group
NORDUnet 2003, Reykjavik – August 2003
NORDUnet 2003
© 2003, Cisco Systems, Inc. All rights reserved.
Agenda
• Traditional Routers and Supercomputers
• Modern Routers and Supercomputers
• Comparison of Subsystems
• Conclusions
What’s a Router? Traditionally…
[Figure: block diagram of a traditional router. Labels recoverable from the figure: MIPS CPU, Secondary Cache (SRAM), System Controller, SDRAM (256 MB), NVRAM, Flash, ROM, Con/Aux, CPU Bus, I/O Bus, PCI Bus0, PCI Bus1, PCI Bus2, FE, PB (repeated), PA-1 to PA-6, PCMCIA-1, PCMCIA-2]
Architecturally, routers have been like normal
computers except:
- Mechanical form factors, especially for IO
- Embedded forwarding and routing SW
What’s a Supercomputer? Traditionally… Cray Y-MP
Cray Y-MP C90: 250 Gbyte/sec of interconnect bandwidth
Evolution of High-End Routers
• Increasing bandwidth of external connections:
- T1 -> DS3 -> OC3 -> OC12 -> OC48 -> OC192 -> OC768
- ~1.5 Mbit/sec -> 40 Gbit/sec
• Line speed increases require changes in router
architecture to remove the central memory
bottleneck and replace it with distributed memories
and a central interconnect fabric
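As a rough illustration of the progression above, the nominal line rates can be tabulated. This is a sketch only: the T1 and DS3 rates and the 51.84 Mbit/sec OC-1 base rate are standard values rather than figures from the slides, and the function name line_rate_mbps is hypothetical.

```python
# Nominal line rates in Mbit/sec.
# ASSUMPTION: standard telecom rates, not taken from the slides.
OC_BASE_MBPS = 51.84                      # OC-n runs at n times this base rate
FIXED_RATES_MBPS = {"T1": 1.544, "DS3": 44.736}

PROGRESSION = ["T1", "DS3", "OC3", "OC12", "OC48", "OC192", "OC768"]


def line_rate_mbps(name: str) -> float:
    """Return the nominal line rate for a T1, DS3, or OC-n interface."""
    if name in FIXED_RATES_MBPS:
        return FIXED_RATES_MBPS[name]
    if name.startswith("OC"):
        # Accept both "OC192" and "OC-192".
        return int(name[2:].lstrip("-")) * OC_BASE_MBPS
    raise ValueError(f"unknown interface type: {name}")


for name in PROGRESSION:
    print(f"{name:>6}: {line_rate_mbps(name):>10.3f} Mbit/sec")
```

The slide rounds the two endpoints to roughly 1.5 Mbit/sec and 40 Gbit/sec.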
Evolution of High-End Routers
• Increased computational power for routing,
forwarding and feature processing
• Larger systems (more line cards) desired by
end customers to exploit DWDM capabilities
and simplify operation of POPs
What’s a High-End Router today?
Linecards (8-16)
T1 to OC-192 Interfaces
Multi-Gigabit Switching Capacity
Switch Fabric
Route Processor(s)
Distributed Architecture with Crossbar Switch Fabric
The Next Generation of High-End Routers
Linecards (100s to 1000s)
T1 to OC-768 Interfaces
Multi-Terabit Switching Capacity
Switch Fabric
Route Processor(s)
Multi-Chassis, Distributed Architecture with Multi-Stage Switch Fabric
Evolution of Supercomputers
• Move from globally clocked, ECL vector processors
to distributed-memory, microprocessor-based multiprocessors
- 250 MHz C90 to 1-2 GHz Pentium 4, Alpha, Power3
• This architecture change was driven by:
- Complexity and economics of building the highest-performance
processors
- Commoditization of smaller-scale computers
- Not driven by programming desires of end-users
• Note that state-of-the-art processors can generate
less than 10 Gbit/sec of communication data
What’s a Supercomputer today?
ASCI White at LLNL
• 8K processors in 512 nodes, 12 TFLOPS
• Interconnect has connection BW of 1 TByte/sec
Diagram and photo from LLNL ASCI webpage
Major components of a Router
• Distributed Control Plane
- Used to run routing protocols (effectively a distributed computer)
• Distributed Data Plane
- Packet Processing: examine L2-L7 protocol information
(determine QoS, VPN ID, policy, etc.)
- Packet Forwarding: make appropriate routing, switching, and
queuing decisions
• System Interconnect
- Control Plane: can be combined with the data plane or
dedicated
- Data Interconnect: at least the sum of external BW is required
Major components of a Supercomputer
• Distributed Control / Computational nodes
- Small number of processor nodes (4-16) with local memory
• Distributed IO Subsystem
- Typically tied to a subset of nodes, but if fully distributed these
can be viewed as a sink/source of external bandwidth, similar
to router external connections
• System interconnect
- BW driven primarily by data sharing requirements and often
limited by the CPUs' ability to generate data
Router – Supercomputer Analogy
High-End Router      Supercomputer
Route Processors  ~  CPU Nodes
Line Cards        ~  I/O Nodes
Switch Fabric     ~  Interconnection Network
Route Processors ~ CPU Nodes
• Route Processors execute routing protocols and
maintain routing and forwarding information bases
- Large networks dictate gigabytes of memory to hold routing
and interface databases
- Also require high peak computation rates to reconverge
network topology and download table updates to line cards
- 1000 MIPS per eight 40 Gbit/sec interfaces for control plane
• CPU nodes in a supercomputer run applications and
source and sink processor communication traffic
- 1-2 Gflops and 1000 MIPS per processor
- 1-2 Gbytes of memory per processor
Router Line Card ~ SC I/O Node
• Packet forwarding, classification and feature
processing require complex look-ups and queuing
decisions to be made on a per-packet basis
- Even with HW assist (TCAMs, etc.), approximately 500
instructions per packet
- At 40 Gbps and minimum-size packets => 100 MPPS
- Total of 50,000 MIPS per 40 Gbps of line rate
• Queuing and TCP/IP congestion semantics imply
200 ms of buffering on ingress and egress
- 0.2 sec x 40 Gbps x 2 = 16 Gbits = 2 Gbytes per 40 Gbps of line rate
- Fragmentation typically requires 4x BW for queuing
- 40 Gbps => 160 Gbps per queue x 2 (ingress and egress) => 320 Gbps
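The per-linecard arithmetic above can be reproduced with a short sketch. The slide does not state the minimum packet size; the 50-byte value below is an assumption chosen only because it reproduces the quoted 100 MPPS figure. All function names are hypothetical.

```python
LINE_RATE_BPS = 40e9      # 40 Gbps line rate (from the slide)
INSTR_PER_PACKET = 500    # approximate instructions per packet (from the slide)
MIN_PACKET_BYTES = 50     # ASSUMPTION: not on the slide; picked to match 100 MPPS
BUFFER_SECONDS = 0.2      # 200 ms of buffering (from the slide)
QUEUE_BW_FACTOR = 4       # 4x queuing bandwidth for fragmentation (from the slide)
DIRECTIONS = 2            # ingress and egress


def packets_per_second(line_rate_bps, packet_bytes):
    """Worst-case packet rate when every packet is minimum size."""
    return line_rate_bps / (packet_bytes * 8)


def forwarding_mips(pps, instr_per_packet):
    """Instruction rate needed to process every packet, in MIPS."""
    return pps * instr_per_packet / 1e6


def buffer_gbytes(line_rate_bps, seconds, directions=DIRECTIONS):
    """Buffer memory needed to hold `seconds` of traffic per direction."""
    return line_rate_bps * seconds * directions / 8 / 1e9


def queue_bandwidth_gbps(line_rate_bps, factor, directions=DIRECTIONS):
    """Memory bandwidth needed for queuing across both directions."""
    return line_rate_bps * factor * directions / 1e9


pps = packets_per_second(LINE_RATE_BPS, MIN_PACKET_BYTES)
print(f"packet rate:      {pps / 1e6:.0f} MPPS")
print(f"forwarding load:  {forwarding_mips(pps, INSTR_PER_PACKET):,.0f} MIPS")
print(f"buffer memory:    {buffer_gbytes(LINE_RATE_BPS, BUFFER_SECONDS):.1f} Gbytes")
print(f"queue bandwidth:  {queue_bandwidth_gbps(LINE_RATE_BPS, QUEUE_BW_FACTOR):.0f} Gbps")
```

A smaller assumed packet size raises the packet rate and the MIPS requirement proportionally; the buffering and queuing figures depend only on line rate.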
Distributed Memory Router Line Card
[Figure: block diagram of a distributed-memory router line card. Labels recoverable from the figure: Optics, Framer, L2 Buffering, Receive Fwd Engine and Transmit Fwd Engine (each with Table SRAM and Fwd/Class TCAMs), Input Queuing and Output Queuing (each with RTT Buffer Mem (1GB) + pointer SRAM), To Fabric, From Fabric, Fabric Re-Assem., Linecard Control CPU, Control CPU Mem (512+MB DRAM)]
Supercomputer I/O Nodes
• Disk and network attachment dominate requirements
• Computational requirements on data typically limit
effective throughput
• 52 nodes of 512 on ASCI White, each with
approx. 1-2 Gbyte/sec of IO BW
• Data must be moved from IO to local node memory
and then sent via IPC to other computational nodes
- Limited by node-to-interconnect BW limits
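To put the I/O figures above in perspective, the aggregate can be computed and set beside the external bandwidth of a router. This is a sketch under assumptions: the 512-linecard, 40 Gbps-per-linecard router configuration is borrowed from the comparison later in the talk, and the function names are hypothetical.

```python
IO_NODES = 52                       # I/O nodes on ASCI White (from the slide)
IO_GBYTES_PER_NODE = (1.0, 2.0)     # approx. 1-2 Gbyte/sec per node (from the slide)
LINECARDS = 512                     # ASSUMPTION: router size used in the later comparison
LINE_RATE_GBPS = 40                 # ASSUMPTION: one 40 Gbps interface per linecard


def aggregate_io_gbytes(nodes, per_node_range):
    """Return (low, high) aggregate I/O bandwidth in Gbyte/sec."""
    low, high = per_node_range
    return nodes * low, nodes * high


def router_external_gbytes(linecards, line_rate_gbps):
    """Return aggregate external line bandwidth in Gbyte/sec (one direction)."""
    return linecards * line_rate_gbps / 8


low, high = aggregate_io_gbytes(IO_NODES, IO_GBYTES_PER_NODE)
router = router_external_gbytes(LINECARDS, LINE_RATE_GBPS)
print(f"supercomputer I/O: {low:.0f}-{high:.0f} Gbyte/sec")
print(f"router external:   {router:.0f} Gbyte/sec")
```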
Router Switch Fabric ~ SC Interconnect Network
• Critical design parameters are:
- Throughput
- Traffic Isolation
- Fault-Tolerance
• Router switch fabric must have over-speed of
fabric BW relative to line BW to provide traffic isolation and
deal with packet fragmentation
- Minimum 1.5x, with at least 2x line rate desirable
- 60-100 Gbps per 40 Gbps of line rate
• Depending on the size of the system, topology varies among:
- Crossbar
- Multistage Network (e.g., Benes, Clos)
- Must be symmetric: all-to-all (like an old-style supercomputer)
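The over-speed figures above follow from multiplying the line rate by the over-speed factor. A minimal sketch: the 1.5x and 2x factors come from the slide, while 2.5x is an inferred factor that matches the upper end of the quoted 60-100 Gbps range; the function name is hypothetical.

```python
LINE_RATE_GBPS = 40.0   # line rate per linecard (from the slide)

# 1.5x and 2.0x are from the slide.
# ASSUMPTION: 2.5x is inferred from the 100 Gbps upper bound, not stated.
OVERSPEED_FACTORS = (1.5, 2.0, 2.5)


def fabric_bandwidth_gbps(line_rate_gbps, overspeed):
    """Fabric bandwidth a linecard needs for a given over-speed factor."""
    return line_rate_gbps * overspeed


fabric_by_factor = {
    factor: fabric_bandwidth_gbps(LINE_RATE_GBPS, factor)
    for factor in OVERSPEED_FACTORS
}

for factor, gbps in fabric_by_factor.items():
    print(f"{factor:.1f}x over-speed -> {gbps:.0f} Gbps of fabric BW per linecard")
```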
Supercomputer Interconnect Network
• Critical parameters are:
- Throughput
- Latency (end-to-end)
• Actual supercomputer interconnects vary
substantially, but usually provide <1 Gbyte/sec per processor
• Topology varies, but generally exploits locality
- Hypercube
- Torus or Mesh
- Multi-stage networks
Overall Comparison
Feature                  Router                        Supercomputer
                         (512 linecards, 40 Gbps/LC)   (ASCI White, 512 nodes, 8K processors)
Control MIPS             64 GIPS                       8000 GIPS
Data MIPS                25600 GIPS                    N/A
Total Memory Storage     1024 Gbytes                   4096 Gbytes
Total Memory Bandwidth   20 Tbyte/sec                  8 Tbyte/sec
Interconnect Bandwidth   4 Tbyte/sec                   2 Tbyte/sec
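The router figures in this comparison appear to follow from the per-linecard numbers given on the earlier slides, scaled to 512 linecards. The sketch below reproduces them under stated assumptions: one 40 Gbps interface per linecard, decimal units, and 60 Gbps of fabric bandwidth per linecard (the 1.5x minimum). The slides do not say which fabric figure was used, and the function name is hypothetical.

```python
LINECARDS = 512
CONTROL_MIPS_PER_8_INTERFACES = 1000   # from the Route Processors slide
DATA_MIPS_PER_LINECARD = 50_000        # from the Line Card slide
BUFFER_GBYTES_PER_LINECARD = 2         # from the Line Card slide
QUEUE_BW_GBPS_PER_LINECARD = 320       # from the Line Card slide
FABRIC_GBPS_PER_LINECARD = 60          # ASSUMPTION: the 1.5x minimum over-speed


def router_totals(linecards):
    """Scale per-linecard figures to a whole system (decimal units)."""
    return {
        "control_gips": linecards / 8 * CONTROL_MIPS_PER_8_INTERFACES / 1000,
        "data_gips": linecards * DATA_MIPS_PER_LINECARD / 1000,
        "memory_gbytes": linecards * BUFFER_GBYTES_PER_LINECARD,
        "memory_bw_tbytes": linecards * QUEUE_BW_GBPS_PER_LINECARD / 8 / 1000,
        "fabric_bw_tbytes": linecards * FABRIC_GBPS_PER_LINECARD / 8 / 1000,
    }


totals = router_totals(LINECARDS)
for name, value in totals.items():
    print(f"{name:>18}: {value:g}")
```

Under these assumptions the memory and interconnect bandwidths come out at 20.48 and 3.84 Tbyte/sec, which the slide rounds to 20 and 4.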
Overall Technology Required
• Traditionally, networking equipment exploited off-the-shelf silicon, FPGA, and standard ASIC technology
• High-end routers with OC-192 support are approaching
supercomputers
- 0.25u and 0.18u ASICs shipped in early 2001
• High-end routers with OC-768 support require the
leading edge of technology
- ASICs using 0.13u technology and >1500-pin packages
- Latest memory technology: Rambus, FCRAM and RLDRAM, QDR SRAM
- Power per rack comparable to the 9.5 kW for IBM’s SP2
Conclusions
• Explosive data rates and optics capabilities have
pushed router technology tremendously in the last
decade
- From embedded single-board computers in the 1980s
- To distributed-memory computers with specialized forwarding,
queuing and feature processing capabilities
• In nearly every metric of system technology,
today’s high-end routers match or exceed the
capability of an equivalent supercomputer
• In addition, high-end routers have a critical
requirement of system fault-tolerance
• Going forward, advances in high-end routers and
supercomputers are technology-limited
Thank you!
Bob Newhall, [email protected]