Router Anatomy - Institute for Systems Research
Anatomy of an IP Router
Vahid Tabatabaee
Fall 2007
ENTS689L: Packet Processing and Switching
References
Panos C. Lekkas, “Network Processors: Architectures, Protocols, and Platforms”, McGraw-Hill.
James Aweya, “IP Router Architectures: An Overview”, Nortel Networks, Ottawa, Canada.
Florian Brodersen, Alexander Klinetschek, “Anatomy of a High Performance IP Router”, Communication Network Seminar 2003/04, Hasso-Plattner-Institute, University of Potsdam, Jan. 2004.
Steve Kohalmi, Tim Hale, “Anatomy of an IP Service Edge Switch”, Quarry Technologies, 2002.
Cisco Systems CRS-1 router documents.
Basic IP Router Components
Network Interfaces (line cards): packet forwarding and packet processing; may cache the routing table.
Processing Modules: path computation and routing table maintenance.
Buffering Modules
Interconnection Unit (switch fabric): transfers packets between ingress and egress interfaces.
The processing and buffering modules may be replicated, either fully or partially, on the network interfaces.
Basic Functions of a Router
Route Processing (slow path or control plane): routing protocols (OSPF, RIP, …)
Path Computation
Routing Table Maintenance
Reachability Propagation
Packet Forwarding (fast path or data plane)
Packet Forwarding
IP Packet Validation
Version Number
Header length field
Checksum
Dest. IP address parsing and table lookup.
Local delivery in the network.
Unicast delivery to an output port.
Multicast delivery to a set of output ports
Packet Lifetime Control
Adjust the time-to-live (TTL) field
A packet with positive TTL is delivered to a local address
Packet delivered to output ports has its TTL decremented and rechecked before
forwarding
Packet Fragmentation
Check whether the packet is larger than the MTU of the outgoing network.
If yes, fragment the packet.
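The forwarding checks above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names (`ipv4_checksum`, `make_header`, `forward_checks`) are hypothetical, IP options and the Don't Fragment flag are ignored, and the table lookup is left out.

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header, taken as 16-bit words."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def make_header(ttl: int, total_len: int = 40) -> bytes:
    """Build a 20-byte IPv4 header with a valid checksum (test helper)."""
    hdr = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, 0, 0,
                                ttl, 6, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])))
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    return bytes(hdr)

def forward_checks(packet: bytes, mtu: int):
    """Return (verdict, packet) with verdict 'drop', 'forward' or 'fragment'."""
    if len(packet) < 20:
        return "drop", packet
    version = packet[0] >> 4                # validation: version number
    ihl = (packet[0] & 0x0F) * 4            # validation: header length field
    if version != 4 or ihl < 20 or len(packet) < ihl:
        return "drop", packet
    if ipv4_checksum(packet[:ihl]) != 0:    # a valid header sums to zero
        return "drop", packet
    ttl = packet[8] - 1                     # lifetime control: decrement ...
    if ttl <= 0:                            # ... and recheck before forwarding
        return "drop", packet
    hdr = bytearray(packet[:ihl])
    hdr[8] = ttl
    hdr[10:12] = b"\x00\x00"                # checksum must be recomputed
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    out = bytes(hdr) + packet[ihl:]
    return ("fragment" if len(out) > mtu else "forward"), out

verdict, out = forward_checks(make_header(ttl=64) + bytes(20), mtu=1500)
print(verdict, out[8])
```

Even this stripped-down version touches the header several times per packet, which is why later slides move these steps into line card hardware.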
First Generation of Routers
Similar to a typical
computer layout.
All functionality is
implemented in
software.
Single CPU, single
Memory, Single
Bus!
Problems with first generation routers
Processing speed is limited by the single CPU.
The CPU must process both the packets destined to it and the packets passing through it.
Major packet processing tasks, such as table lookups, are memory-intensive operations and cannot be made faster by simple processor upgrades.
Software implementation is inefficient, since the work is a small set of operations repeated on every packet.
The slow path and fast path are implemented on the same CPU, so the slow path can interfere with the fast path.
The routing table has grown from 20,000 entries in 1994 to 260,000 entries today (2007).
Moving data from one interface to another is time-consuming and often takes longer than the packet processing itself.
Source http://bgp.potaroo.net/
Problems with first generation routers
The routing table lookup speed cannot be improved with traditional memories.
The conventional bus structure for the interconnection is very inefficient.
Every packet has to cross the bus at least twice.
The whole packet (not just the header) is transferred.
How fast should a router be?
An OC-48 link has a data rate of 2.488 Gbps.
The packet rate is more important than the data rate.
The bottleneck is set by the minimum packet size, which depends on the technology.
E.g. Packet-over-SONET (PoS): a 40-byte IP packet plus 6 bytes of PPP/HDLC overhead, carried at 2.405 Gbps after SONET overhead:
2.405 Gbps / (8 x 46) ≈ 6.53 MPPS
The aggregate packet rate for a 16-port system:
16 x 6.53 = 104.48 MPPS
That is one forwarding decision every 9.57 ns.
SDRAM access time is about 10 ns for sequential locations and, in practice, around 20-50 ns.
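The arithmetic on this slide can be reproduced with a short script. It is only a sketch: the function name `packets_per_second` is made up for illustration, and because it does not round intermediate values, the last digit can differ slightly from the figures above.

```python
def packets_per_second(payload_rate_bps: float, packet_bytes: int, overhead_bytes: int) -> float:
    """Worst-case packet rate when every packet has the minimum size."""
    return payload_rate_bps / (8 * (packet_bytes + overhead_bytes))

# OC-48 Packet-over-SONET: 40-byte IP packet + 6 bytes of PPP/HDLC overhead
per_port = packets_per_second(2.405e9, packet_bytes=40, overhead_bytes=6)
aggregate = 16 * per_port            # 16-port system
budget_ns = 1e9 / aggregate          # time available per forwarding decision

print(f"per port : {per_port / 1e6:.2f} MPPS")
print(f"aggregate: {aggregate / 1e6:.2f} MPPS")
print(f"budget   : {budget_ns:.2f} ns per forwarding decision")
```

The budget is comparable to a single sequential SDRAM access, so there is no room for a lookup that needs several dependent memory reads.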
What is the solution?
Take advantage of parallelism:
NICs became more intelligent and took over most of the packet forwarding.
ASICs in the NICs (line cards) handle packet classification and forwarding.
Most packets do not go to the CPU card (control card).
Switching interface:
Use a switching element to pass packets between line cards directly and simultaneously.
Modern Switch Based Architecture
Classification and forwarding
decisions are done in line
cards.
High speed interconnection
mechanism (switching)
between the line cards.
This provides a fast data path.
Standard CPU (RISC
processor) is used for the
control plane (slow path).
Hardware and/or software
implementation for
classification and forwarding
in the line card.
Functional Blocks in a Modern Switch
The PHY Interface
Responsible for transmitting and receiving information
Converts the bit stream from digital form to an analog signal and vice versa.
Switch Fabric
The router has a bus or a backplane
The switch fabric reads a packet from the input port and routes it to the output port.
Packet processing
Fast path (data path): Handles all operations that are executed in real time on packets (e.g. framing/parsing, classification, modification, compression/encryption, queueing)
Slow path (control path): Operations executed off the real-time packet flow (e.g. address resolution, route calculation, routing table updates, …)
Host processing
Network management, device configuration, diagnostics
Implemented in software on a CPU
Line cards in Modern Switches
Line card handles packet processing such as:
Classification
Forwarding
Traffic Policing and shaping
Monitoring and Statistics
Data Path Diagram
[Figure: data path through a line card and a switch card, in the ingress and egress directions. Line card blocks: Optics, CDR & Serdes, Framer/Mapper, Network Processor, Traffic Manager, Switch Interface. Switch card blocks: Switching Element, Scheduling Element. The figure also carries the label "Packet Processing Units". Source: Light Reading Report]
Data Path Functions
Ingress Line Card
Network Processor
- Parse
- Identify flow
- Determine egress port
- Mark QoS parameters
- Append TM or SF header
Ingress Traffic Manager
- Police
- Manage congestion (WRED)
- Queue packets in class-based VOQs
- Segment packets into switch cells
Switch Fabric
- Queue cells in class-based VOQs
- Flow control TM per class-based VOQ
- Schedule class-based VOQs to egress ports
Egress Line Card
Egress Traffic Manager
- Reassemble cells into packets
- Shape outgoing traffic
- Schedule egress traffic
[Figure: incoming packets pass through WRED (discard), segmentation + header and the TM scheduler on the ingress side; the SF arbiter and SF flow control sit in the switch fabric; reassembly, class-based queueing of outgoing packets and the egress scheduler & shaper follow on the egress side.]
Switch Card to Line Card Connection
This connection passes through the backplane.
A Serdes (Serializer-Deserializer) is used for this connection.
Each Serdes signal runs over two wires and two pins (differential signaling).
The line rate is usually around 3.125 Gbps.
The link uses a line code (8b/10b encoding), so the actual data rate is around 2.5 Gbps.
There are attempts to provide 10 Gbps Serdes.
How many Serdes do we need?
How fast should the connection between the switch card and the line card be?
The line speed is not enough:
Switch fabric throughput is less than 100% due to contention.
The network processor, traffic manager and switch fabric add their own headers.
There is also the cell tax: a packet that does not fill a whole number of fixed-size cells wastes the rest of its last cell.
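To get a feel for how headers and the cell tax inflate traffic, here is a rough sketch. The function names are hypothetical. The header and cell sizes (12-byte traffic manager header, 8-byte switch fabric header, 52-byte cell payload) are borrowed from the speedup example later in the deck, and the sketch assumes both headers are carried on every cell, as that example does.

```python
import math

def cells_needed(packet_bytes: int, cell_payload: int = 52) -> int:
    """Number of fixed-size cells a packet occupies (last cell is padded)."""
    return math.ceil(packet_bytes / cell_payload)

def wire_bytes(packet_bytes: int, cell_payload: int = 52,
               tm_header: int = 12, sf_header: int = 8) -> int:
    """Bytes actually sent toward the switch fabric for one packet."""
    return cells_needed(packet_bytes, cell_payload) * (cell_payload + tm_header + sf_header)

for size in (46, 52, 53, 1500):
    expansion = wire_bytes(size) / size
    print(f"{size:5d}-byte packet -> {cells_needed(size):2d} cell(s), expansion x{expansion:.2f}")
```

The worst case is a packet one byte larger than the cell payload: it needs two cells, and almost all of the second cell is padding.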
[Figure: a line card (line card elements, traffic manager, switch interface) connected to a switch card (switching element, scheduling element). The rates are labelled R_L (line), R_TM (traffic manager) and R_SF (switch fabric). The traffic manager header, the switch fabric header and fragmentation (cell tax) are marked as the overheads that the speedup must absorb.]
Effective speedup = R_SF / R_TM
In commercial systems, speedup usually refers to R_SF / R_L.
A higher speedup factor:
Increases system design complexity.
Increases power consumption.
Creates signal integrity issues.
The required speedup factor is around 2.
Redundancy
The system has spare switch cards and control cards.
The redundancy models:
Passive redundancy (N:1): one inactive switch card in the system takes over after a failure.
Passive redundancy (1:1, N:N): for each active switch card there is one inactive card.
Load-sharing redundancy (N-1): all cards are active, and when a failure happens performance degrades gracefully.
Active redundancy (1+1): two sets of fabrics carry the same traffic.
Source: www.idt.com/content/switchblock.jpg
Example
In a 16-port 10 Gbps switch with 2X speedup and N:N redundancy, how many 2.5 Gbps Serdes do we need?
Each line card needs 20 Gbps of active and 20 Gbps of redundant data rate.
This means 40 / 2.5 = 16 Serdes for each line card.
For 16 line cards we need 16 x 16 = 256 Serdes in this system.
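The same count can be written as a small calculation, which makes it easy to try other port counts or speedups. The function name and parameters are illustrative only. The 2.5 Gbps data rate is derived from the 3.125 Gbps line rate and 8b/10b coding mentioned earlier.

```python
import math

def serdes_per_line_card(line_rate_gbps: float, speedup: float,
                         serdes_data_rate_gbps: float, redundant_sets: int = 1) -> int:
    """Active Serdes for line_rate * speedup, plus one full copy per redundant set."""
    active = math.ceil(line_rate_gbps * speedup / serdes_data_rate_gbps)
    return active * (1 + redundant_sets)

data_rate = 3.125 * 8 / 10                       # 8b/10b: 8 data bits per 10 line bits
per_card = serdes_per_line_card(10, 2, data_rate)  # N:N redundancy -> one redundant set
total = 16 * per_card

print(data_rate, per_card, total)
```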
Example
What is the effective speedup of this system for 40-byte IP packets if the traffic manager header is 12 bytes, the switch fabric header is 8 bytes and the cell payload is 52 bytes?
Solution: The earlier OC-48 example showed that there can be 6.53 MPPS (40-byte packets) on an OC-48 line.
Similarly, on an OC-192 line there can be up to 9.622 Gbps / (8 x 46) = 26.15 MPPS.
Each packet is encapsulated in one cell, since 40 + 6 < 52.
With 8 active Serdes at 2.5 Gbps each, the maximum number of cells that a line card can send is
(2.5 x 8 Gbps) / ((52 + 8 + 12) x 8) = 34.722 million cells per second.
The effective speedup is
Speedup = 34.722 / 26.15 = 1.33
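As a quick check of this result, the calculation can be scripted as follows. The variable names are chosen for illustration; all numbers come from the example itself.

```python
# Offered load: minimum-size packets arriving on an OC-192 line
packet_bytes = 40 + 6                        # IP packet + PPP/HDLC overhead
packet_rate = 9.622e9 / (8 * packet_bytes)   # packets per second

# Capacity toward the fabric: 8 active Serdes at 2.5 Gbps each
cell_bytes = 52 + 8 + 12                     # payload + SF header + TM header
cell_rate = (8 * 2.5e9) / (8 * cell_bytes)   # cells per second

assert packet_bytes <= 52                    # one packet fits in one cell
effective_speedup = cell_rate / packet_rate

print(f"{packet_rate / 1e6:.2f} MPPS, {cell_rate / 1e6:.3f} Mcells/s, "
      f"speedup {effective_speedup:.2f}")
```

The nominal 2X speedup therefore shrinks to about 1.33 for minimum-size packets once headers and cell padding are counted.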
Traces Per Serdes
Typical LVDS speed is 1.25 Gbps.
For 2.5 Gbps we need 2 channels.
LVDS is differential, i.e. 2 traces per channel.
LVDS is unidirectional, i.e. 2 directions for full duplex.
Full-duplex 2.5 Gbps using LVDS therefore requires 2 x 2 x 2 = 8 traces.
In the previous example we will have 256 x 8 = 2048 traces on the backplane.
A sample Router (Cisco CRS-1)
The line card chassis
8 service cards and 8 physical layer interface module cards
16-slot Single-Shelf System
[Figure: 16-slot single-shelf chassis showing the physical layer interface modules, switching cards and routing processors (control plane).]
The distributed route processor (DRP) is an optional component that provides enhanced routing capabilities.
• The DRP contains two symmetric multiprocessors (SMPs), each of which performs routing functions.
• Processor-intensive tasks (such as BGP speakers and IS-IS) can be offloaded from the route processors (RPs) to the DRPs.
Multishelf Systems
2 to 72 line card shelves
1 to 8 fabric card shelves
How to handle packet processing?
Off-the-shelf CPU
This would usually be a RISC processor.
In low-end systems it could be a CISC processor.
ASIC
Specialized high-performance ASIC to handle packet processing.
Ideal approach for companies such as IBM and Intel, since they manufacture integrated circuits.
Off-the-shelf CPU Systems
Packet processing is implemented in software running on the CPU.
Modifications, upgrades and debugging are accomplished by simple software updates and downloads.
Update time is much shorter, which is good for both user and developer.
Not very efficient: many clock cycles are spent on tasks not related to packet processing.
The fastest off-the-shelf CPUs can handle about 1 gigabit per second.
The trend is toward deeper packet processing (more on this later).
Memory Bottleneck
The pipeline architecture of CPUs enables them to execute billions of instructions per second.
However, to keep the pipeline full they must continuously fetch data from memory and store it back.
This can be done with a sophisticated multi-level hierarchy of different memory technologies and interleaved memory banks.
That comes at prohibitive cost, design complexity and power consumption.
Hence a typical processor pipeline often ends up empty, which reduces system throughput.
Network traffic follows completely different statistical models from local traffic on a computer bus. It does not have the same spatial and temporal locality properties. Hence, typical processor cache systems are not effective.
Sub-optimal Instruction Set
Packet processing requires specific bit-level operations.
These operations must be performed at wire speed.
They are not available as standard instructions on off-the-shelf CPUs.
Hence, we have to combine multiple standard instructions to perform the intended functionality.
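As an illustration of the point, extracting an arbitrary bit field from a header word on a general-purpose CPU takes a shift, a mask construction and an AND, where a packet-oriented instruction set could do it in one step. The helper `extract_bits` and the sample header word below are assumptions made for this sketch.

```python
def extract_bits(word: int, msb_offset: int, width: int, word_bits: int = 32) -> int:
    """Extract `width` bits starting `msb_offset` bits from the most significant bit."""
    shift = word_bits - msb_offset - width   # step 1: compute the shift amount
    mask = (1 << width) - 1                  # step 2: build the mask
    return (word >> shift) & mask            # steps 3 and 4: shift, then AND

# First 32-bit word of an IPv4 header: version=4, IHL=5, TOS=0xB8, total length=40
first_word = 0x45B80028

version = extract_bits(first_word, 0, 4)
ihl = extract_bits(first_word, 4, 4)
dscp = extract_bits(first_word, 8, 6)
total_length = extract_bits(first_word, 16, 16)

print(version, ihl, dscp, total_length)
```

Each field costs several generic instructions, and a classifier may need a dozen such fields per packet.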
Packet processing with ASIC
An ASIC typically delivers higher performance.
An ASIC is not programmable:
Adding new functionality requires a new design.
Adding a new protocol requires a new design.
A new design is costly for both the vendor and the user.
ASIC design is very time-consuming:
The design cycle takes 12 to 18 months.
A modification may require recoding the whole design.
Many start-up failures are due to time delays.
ASIC development is costly
Expensive and time-consuming to change.
Testing an ASIC requires designing a system around it.
Expensive development tools (design and verification).
Requires ASIC designers (much more expensive).
Taping out a chip costs around a million dollars.
So is there a middle-ground solution?
Can we have a technology that:
Has the flexibility of programmable processors
Has the high speed of ASICs
The solution is called the Network Processor!
Network processors are programmable like CPUs, but their performance is close to that of ASICs.
Network Processors value proposition
Shorter time to market:
Instead of 18 months, it takes about 6 months to complete the development cycle of the packet processing part.
Longer time in market:
New features can be embedded into a deployed network-processor-based product.
Increased time in market reduces the cost of product ownership over the life of the product.
Just-in-time delivery of new features:
We can modify the design and add new features in the field without penalizing the customer.
Greater focus on other issues of business management:
Most functions are already coded in a standard way.
Developers can focus on differentiating features.
Packet Processing Stages
1. Remove Link Layer Headers and Decryption:
Ethernet
PPP (Point-to-Point Protocol) Frame
PPP over ATM
PPP over Ethernet over ATM
2. Identify Ingress Subscriber:
To extract information about the owner of the packet from the link layer protocol header.
3. Filtering:
To permit or deny specific traffic flows, based on various attributes of the IP and higher-layer headers.
4. Traffic Classification:
To allow different traffic management, QoS, security and routing policies to be applied to different types of flows.
5. Traffic Metering, Marking & Policing:
To control the Peak and Committed Information Rates.
To determine the PHB (per-hop behavior) in the DiffServ model (changing the priority).
Packet Processing Stages
6. Custom Routing Policies:
To direct some traffic through specific paths (Internet, VPN, specific destination).
A Virtual Private Routed Network allows users to network in privacy over their own routed network using their own private addresses.
Sending suspicious traffic to explicit locations for special processing.
7. NAT & NAPT (Network Address [Port] Translation):
Address translation at the source if the user is using a private address space.
Static one-to-one with NAT and dynamic many-to-one with NAPT.
8. Route Table Look-up:
Best matching prefix look-up on the destination IP address.
9. Enforcing the PHB / Per-Flow (Link Sharing):
Priority, WRR, WFQ scheduling, WRED (weighted random early detection).
10. Egress Side Processing (QoS, filtering, encryption, NAT, Egress Subscriber Identification, Traffic Classification, Link Sharing)
11. Statistics Collection
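Stage 8, the best matching prefix look-up, is the step that most often dominates the memory budget. A minimal sketch of one way to do it is a binary trie walked one address bit at a time. The class name `PrefixTrie`, the next-hop labels and the sample routes are invented for illustration; real line cards typically use compressed tries or TCAMs rather than this one-bit-per-node structure.

```python
import ipaddress

class PrefixTrie:
    """Binary trie for IPv4 longest-prefix (best matching prefix) lookup."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):          # one trie level per prefix bit
            node = node.setdefault((bits >> (31 - i)) & 1, {})
        node["hop"] = next_hop

    def lookup(self, address: str):
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.get("hop")                   # default route, if any
        for i in range(32):
            node = node.get((bits >> (31 - i)) & 1)
            if node is None:
                break
            best = node.get("hop", best)         # remember the longest match so far
        return best

table = PrefixTrie()
table.insert("0.0.0.0/0", "default")
table.insert("10.0.0.0/8", "port-A")
table.insert("10.1.0.0/16", "port-B")

print(table.lookup("10.1.2.3"), table.lookup("10.2.0.1"), table.lookup("192.168.1.1"))
```

In the worst case this walk makes 32 dependent memory reads per packet, which shows why plain software lookups cannot keep up with the per-packet time budgets computed earlier.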
Deep Packet Processing
In deep packet processing we need to look at the contents of the packet, not just the header.
Why do we need deep packet processing?
Deep packet inspection for firewalls and intrusion detection systems.
Shaping or discarding P2P traffic.
Server load balancing: distributing traffic among servers based on the web destination.
Network monitoring and analysis.
Packet Processing Implementation issues
We need multiple table look-ups for each packet.
Access to the whole packet, not just the IP header, is necessary.
There can be tens of thousands of simultaneously active subscribers comprising millions of application flows.
In a fully loaded Gigabit Ethernet connection, about 1.5 million packets per second must be processed.
Modern general-purpose processors are optimized for numeric computation rather than for processing packets.
Memory read and write speeds become bottlenecks.
Caching and high-speed memory burst capability do not help, since packet processing requires:
Large tables
Short entries
Random access queries
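The figure of about 1.5 million packets per second can be derived from standard Ethernet framing. The sketch below assumes minimum-size 64-byte frames plus the 8-byte preamble and the 12-byte inter-frame gap that each frame occupies on the wire; these constants are not stated on the slide.

```python
LINK_RATE_BPS = 1_000_000_000   # Gigabit Ethernet
MIN_FRAME_BYTES = 64            # minimum Ethernet frame, including FCS
PREAMBLE_BYTES = 8              # preamble + start-of-frame delimiter
IFG_BYTES = 12                  # inter-frame gap

bytes_on_wire = MIN_FRAME_BYTES + PREAMBLE_BYTES + IFG_BYTES
packets_per_second = LINK_RATE_BPS / (8 * bytes_on_wire)
budget_ns = 1e9 / packets_per_second

print(f"{packets_per_second / 1e6:.3f} MPPS, {budget_ns:.0f} ns per packet")
```

All table look-ups for a packet have to fit inside that per-packet budget, which is tight when each random memory access costs tens of nanoseconds.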
How do network processors do this?
Specialized circuitry and micro-engines to perform all generic
packet processing functions.
They also usually embed a major programmable module, usually a
tailor-made RISC CPU (and sometimes more than one).
Real time operating system
Handshake communication with other parts of the system
Network Processor Categories
Platform Network Processors objectives:
Handle most packet processing functions
Minimize the number of components and the hardware cost
Optimize the trade-off between performance and flexibility
Accelerate the software development cycle
Peripheral Network Processors
Designed to optimize a specific function
Compressor chips
IP security
The other side of the argument
Every single task can be done at wire speed.
What about multiple tasks at the same time?
What is a realistic scenario to consider?
Challenge of benchmarking
Programming complexity