Transcript 571Lecture7

Lecture Note on Switch Architectures
Function of Switch
Input 1
Input N
Output 1
Output N
Naive Way
Input 1
Input N
Output 1
Output N
Bus-Based Switch
Controller
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
• No buffering at input port
processor (IPP)
• Output port processor (OPP)
buffers cells
• Controller exchanges control
message with terminals and other
controller.
• Disadvantage:
– Bus bandwidth is equal to sum of
external link for non-blocking
– IPPs and OPPs must operate at
full bus bandwidth
– Bus width increases with number
of links
Centralized Bus Arbitration
• IPPs send requests to central arbiter
• Request may includes:
–
–
–
–
Priority
Waiting time
OPP destination(s)
Length of IPP queue
• Arbitration complexity is O(N^2)
• Distributed version is preferred, but may degrade throughput.
Bus Arbitration Using Rotating Daisy Chain
• Rotating token eliminates positional favoritism
0
1
K
N
Token
Ring Switch
IPP
RI
RI
OPP
• Same bandwidth and
complexity as bus switch
• Avoids capacitive loading
of bus, allowing higher
clock frequencies
• Control mechanisms
– Token passing
– Slotted ring with busy bit
IPP
RI
RI
OPP
Shared Buffer Switch
Controller
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
Shared Memory
• Individual queues are
rarely full.
• Shared memory needs two
times of external link
bandwidth
• Require less memory
• Better ability to handle
burst traffic
Crossbar Switch
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
Input Port
Processor
Output Port
Processor
Controller
Output Buffering
Input Port
Output Port
• Efficient, but needs N time speed up internally.
Input Buffering
Output Port
Input Port
•
•
•
•
Multiple packets simultaneously transmitted distinct outputs.
Require sophisticated arbitration
No speed up required
Head-of-line blocking
Bi-partite Matching
•
•
•
•
Require global information
Complexity is O(N^(5/2))
Not suitable for hardware implementation
May leads to starvation
0
0
0
0
1
1
1
1
2
2
2
2
3
3
3
3
Stravation for
traffic(0 to 0 and 1 to
0)
Desired Arbitration Algorithms
• High throughput
– Low backlog in each input queue
– Close to 100% for each input and output
• Starvation free
– No queue will be hold indefinitely
• Simple to implement
Options to Build High Performance Switches
• Bufferless crossbar
• Buffered crossbar
• Shared buffer
Bufferless Crossbar
• Centralized arbitrator is required
– Arbitration complexity is O(N*N)
– O(log2N) iterations of arbitration needed for high throughput
• Synchronization in all elements
• Single point failure: central arbitrator
• Complex line interface
Buffered Crossbar
• Simple scheduling algorithms
– Ingress: O(1)
– Egress: O(N)
• Inefficient use of memory
– Memory linearly increased with number of ports
Shared Memory
•
•
•
•
•
No central arbitrator needed
Reduced memory requirements
Distributed flow control
Less timing constrains
Simpler line card interface
Comparisons (1)
Bufferless
Crossbar
Buffered
Crossbar
Shared
Memory
Line Card
Buffering
Arbitor
Complexity
Number of
iterations
Available
Arbitration
Time Slots
Yes
O(NXN)
O(log N)
1/O(logN)
Yes
O(N)
1
1
No
O(N)
1
1
Comparison (2)
Assume 10G for each port and packet size is 64 bytes.
Time for arbitration (nano seconds)
10 G ports
Bufferless
Crossbar
Buffered
Crossbar
Shared
Memory
8
16
32
64
17.1
12.8
10.2
8.5
51.2
51.2
51.2
51.2
51.2
51.2
51.2
51.2
Scaling Number of Ports
• Single larger switch is less expensive, more reliable, easier to
maintain and offer better performance, but
– O(n2) complexity
– Board-level buses limited by capacitive loading
• Port multiplexing
• Buffered multistage routing
– Dynamic routing: Benes network
– Static routing: Clos network
• Bufferless multistage routing
– Deflection routing
Port Multiplexing
•
•
•
•
•
IPP
OPP
IPP
OPP
High speed core can handles high speed links as well as low speed
Sharing of common circuitry
Reduced complexity in interconnection network
Better queueing performance for bursty traffic
Less fragmentation of bandwidth and memory
Dynamic Routing – Benes Network
distribution
Rout and copy
0000
0001
1000
0010
0011
1001
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1101
1111
• Network expanded by adding stages on left and right
– 2k-1 stages with d port switch elements supports dk ports
•
•
•
•
Traffic distribution on first k-1 stages
Routing on last k stages
Internal load  external load
Traffic maybe out of order: need re-sequencing
Static Routing - Clos network
d
r
• All traffic follows same path
• r  2d-1 to be strict non-blocking.
Deflection Routing
3
3
4
1
2
3
3
4
1
2
2
1
2
3
2
4
Basic Architectural Components:
Forwarding Decision
1.
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
2.
Interconnect
3.
Output
Scheduling
ATM Switches
Direct Lookup
VCI
(Port, VCI)
Memory
Ethernet Switches
Hashing
Memory
#1 #2 #3 #4
Search
Data
48
Associated
Data
Hashing Function
16
#1 #2
CRC-16
Linked lists
#1 #2 #3
Advantages
• Simple
• Expected lookup time can be small
Disadvantages
• Non-deterministic lookup time
• Inefficient use of memory
{
Hit?
Address
log2N
Per-Packet Processing in IP Routers
1. Accept packet arriving on an incoming link.
2. Lookup packet destination address in the forwarding table,
to identify outgoing port(s).
3. Manipulate packet header: e.g., update header checksum.
4. Send packet to the outgoing port(s).
5. Classify and buffer packet in the queue.
6. Transmit packet onto outgoing link.
IP Router
Lookup
H
E
A
D
E
R
Forwarding Engine
Destination
Address
Incoming
Packet
Next Hop Computation
Forwarding Table
Destination Next Hop
-------------
----
----
Next Hop
Link
Lookup and Forwarding Engine
Packet
payload header
Router
Destination
Address
Lookup Data
Forwarding Engine
Outgoing
Port
Example Forwarding Table
Destination IP Prefix
65.0.0.0/ 8
128.9.0.0/16
Outgoing Port
3
Prefix length
142.12.0.0/19
IP prefix: 1-32 bits
1
7
Multiple Matching
Longest
matching prefix
128.9.176.0/24
128.9.16.0/21 128.9.172.0/21
65.0.0.0/8
0
128.9.0.0/16
128.9.16.14
142.12.0.0/19
232-1
Routing lookup: Find the longest matching prefix (or the most specific
route) among all prefixes that match the destination address.
Longest Prefix Matching Problem
2-dimensional search
• Prefix Length
• Prefix Value
Performance Metrics
• Lookup time
• Storage space
• Update time
• Preprocessing time
Required Lookup Rates
Year
Line
Line-rate
(Gbps)
40B packets
(Mpps)
1998-99
OC12c
0.622
1.94
1999-00
OC48c
2.5
7.81
2000-01
OC192c
10.0
31.25
2002-05
OC768c
40.0
125
31.25 Mpps  33 ns
DRAM: 50-80 ns, SRAM: 5-10 ns
Size of Forward Table
Number of Prefixes
100000
90000
80000
70000
60000
50000
10,000/year
40000
30000
20000
10000
0
95
96
97
Year
98
99
00
Renewed growth due to multi-homing of enterprise networks
Trees and Tries
Binary Search Tree
<
>
>
<
N entries
>
log2N
<
Binary Search Trie
0
0
1
1
010
0
1
111
Number
Typical Profile of Forward Table
Prefix Length
Most prefixes are 24-bits or shorter
Basic Architectural Components:
Interconnect
1.
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
2.
Interconnect
3.
Output
Scheduling
First-Generation Routers
Buffer
Memory
CPU
DMA
DMA
DMA
Line
Interface
Line
Interface
Line
Interface
MAC
MAC
MAC
Second-Generation Routers
Buffer
Memory
CPU
DMA
DMA
DMA
Line
Card
Local
Buffer
Memory
Line
Card
Local
Buffer
Memory
Line
Card
Local
Buffer
Memory
MAC
MAC
MAC
Third-Generation Routers
Line
Card
CPU
Card
Line
Card
Local
Buffer
Memory
Local
Buffer
Memory
MAC
MAC
Switching Goals
Circuit Switches
• A switch that can handle N calls has N logical inputs and N
logical outputs
– N up to 200,000
• Moves 8-bit samples from an input to an output port
– Samples have no headers
– Destination of sample depends on time at which it arrives at the switch
• In practice, input trunks are multiplexed
– Multiplexed trunks carry frames, i.e., set of samples
• Extract samples from frame, and depending on position in
frame, switch to output
– each incoming sample has to get to the right output line and the right
slot in the output frame
Blocking in Circuit Switches
• Can’t find a path from input to output
• Internal blocking
– slot in output frame exists, but no path
• Output blocking
– no slot in output frame is available
Time Division Switching
• Time division switching interchanges sample position within a
frame: time slot interchange (TSI)
Scaling Issues with TSI
Space Division Switching
• Each sample takes a different path through the switch,
depending on its destination
Time Space Switching
Time Space Time Switching
Packet Switches
• In a circuit switch, path of a sample is determined at
time of connection establishment. No need for header.
• In a packet switch, packets carry a destination field or
label. Need to look up destination port on-the-fly.
– Datagram switches: lookup based on destination address
– Label switches: lookup based on labels
Blocking in Packet Switches
• Can have both internal and output blocking
• Internal
– no path to output
• Output
– trunk unavailable
• Unlike a circuit switch, cannot predict if packets will block.
• If packet is blocked, must buffer or drop
Dealing with Blocking in Packet Switches
• Over-provisioning
– internal links much faster than inputs
• Buffers
– at input or output
• Backpressure
– if switch fabric doesn’t have buffers, prevent packet from entering until
path is available
• Parallel switch fabrics
– increases effective switching capacity
Basic Architectural Components:
Queuing, Classification
1.
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
Forwarding
Table
Forwarding
Decision
2.
Interconnect
3.
Output
Scheduling
Techniques in Queuing
Input Queueing
Output Queueing
Output Queueing
Individual Output Queues
Centralized Shared Memory
Memory b/w = 2N.R
1
2
N
1
2
Memory b/w = (N+1).R
N
Input Queuing
Delay
Input Queueing Performance
Load
58.6%
100%
Delay
Virtual Output Queues at Input
Load
100%
Classification
H
E
A
D
E
R
Incoming
Packet
Forwarding Engine
Packet Classification
Classifier (Policy Database)
Predicate
Action
----------------
----
Action
Multi-Field Packet Classification
Field 1
Field 2
…
Field k Action
Rule 1
152.163.190.69/21 152.163.80.11/32
…
UDP
A1
Rule 2
152.168.3.0/24
152.163.0.0/16
…
TCP
A2
…
…
…
…
…
…
Rule N
152.168.0.0/16
152.0.0.0/8
…
ANY
An
Given a classifier with N rules, find the action associated
with the highest priority rule matching an incoming packet.