two-level routing table - Department of Computer Science

Department of Computer Science
A Scalable, Commodity Data Center Network Architecture
Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat
SIGCOMM’08
Reporter: Fuchao Zhou
Problem
• How to design a data center network architecture with
-- Scalable interconnection bandwidth
-- Affordable cost, without tremendous expense
-- Compatibility with hosts running Ethernet and IP
Existing solutions
• Specialized hardware and communication protocols, such as InfiniBand and Myrinet
-- More expensive, since they rely on high-end switches
-- Not natively compatible with TCP/IP applications
• Commodity Ethernet switches and routers to interconnect cluster machines
-- Need an appropriate network topology
-- Bandwidth scales poorly with cluster size
-- Cost increases non-linearly with cluster size
Existing solutions
• Typical architectures today
-- Two-level trees of switches or routers (support 5K to 8K hosts)
-- Three-level trees of switches or routers
• Disadvantages
-- May support only 50% of the aggregate bandwidth available at the edge of the network
-- Incur tremendous cost ($37M to support 27,648 hosts)
Proposed solution
• k-ary fat-tree topology
-- k pods, each containing two layers of k/2 switches
-- (k/2)^2 k-port core switches
-- Supports k^3/4 hosts (a 48-ary fat-tree supports 27,648 hosts)
• Advantages
-- Non-blocking
-- All switching elements are identical ($8.64M to support 27,648 hosts)
-- Compatible with hosts running Ethernet and IP
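The counts on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `fat_tree_size` and the dictionary keys are hypothetical, and it assumes k is even and every switch has exactly k ports.

```python
def fat_tree_size(k: int) -> dict:
    """Element counts for a k-ary fat-tree built from identical k-port switches."""
    if k < 2 or k % 2:
        raise ValueError("k must be an even number >= 2")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k pods, k/2 lower-layer switches each
        "aggregation_switches": k * half,  # k pods, k/2 upper-layer switches each
        "core_switches": half ** 2,        # (k/2)^2
        "hosts": k * half * half,          # k/2 hosts per edge switch, i.e. k^3/4
    }


if __name__ == "__main__":
    for k in (4, 48):
        print(k, fat_tree_size(k))
```

For k = 48 the host count works out to 48^3/4 = 27,648, matching the figure quoted on the slide.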
Static Routing method
• Two-level routing table
-- Aims for the maximum bisection bandwidth of the network
• IP addresses
-- Core switches: 10.k.j.i
-- Pod switches: 10.pod.switch.1
-- Hosts: 10.pod.switch.ID
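The addressing scheme can be enumerated mechanically. The sketch below is a hedged illustration, not the authors' code: the function name is hypothetical, and the index ranges are assumptions not stated on the slide (pod and switch numbered from 0, with edge switches taking the low switch numbers; j and i in 1..k/2; host IDs in 2..k/2+1).

```python
def fat_tree_addresses(k: int) -> dict:
    """Enumerate addresses for a k-ary fat-tree (index ranges are assumptions)."""
    half = k // 2
    # Pod switches: 10.pod.switch.1, with k switches per pod.
    pod_switches = [f"10.{pod}.{sw}.1" for pod in range(k) for sw in range(k)]
    # Core switches: 10.k.j.i, with j and i running over the (k/2) x (k/2) core grid.
    core_switches = [
        f"10.{k}.{j}.{i}"
        for j in range(1, half + 1)
        for i in range(1, half + 1)
    ]
    # Hosts: 10.pod.switch.ID, attached to the k/2 edge switches of each pod.
    hosts = [
        f"10.{pod}.{sw}.{host_id}"
        for pod in range(k)
        for sw in range(half)
        for host_id in range(2, half + 2)
    ]
    return {"pod_switches": pod_switches, "core_switches": core_switches, "hosts": hosts}


if __name__ == "__main__":
    addrs = fat_tree_addresses(4)
    print(addrs["core_switches"])
    print(addrs["hosts"][:4])
```

With k = 4 this yields 16 pod switches, 4 core switches, and 16 hosts, including the hosts 10.0.1.2 and 10.2.0.3 used in the routing example.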
Static Routing example
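Packet from 10.0.1.2 to host 10.2.0.3
Packet from 10.0.1.3 to host 10.2.0.2

Routing table with /16 pod prefixes:
Prefix          Output port
10.0.0.0/16     0
10.1.0.0/16     1
10.2.0.0/16     2
10.3.0.0/16     3

Routing table with pod 2 subnet prefixes:
Prefix          Output port
10.2.0.0/24     0
10.2.1.0/24     1
0.0.0.0/0       (falls through to the suffix entries)

Routing table with pod 0 subnet prefixes:
Prefix          Output port
10.0.0.0/24     0
10.0.1.0/24     1
0.0.0.0/0       1

Suffix entries:
Suffix          Output port
0.0.0.2/8       2
0.0.0.3/8       3

[Figure: fat-tree showing the two example paths; labeled nodes 10.0.3.1, 10.2.3.1, 10.0.1.3, 10.2.1.3]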
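A two-level lookup can be sketched as follows, using the pod 2 prefix entries and the two suffix entries from this example. This is a minimal illustration under stated assumptions: the names `PREFIXES`, `SUFFIXES`, and `lookup` are hypothetical, a "/8" suffix is taken to mean a match on the last 8 bits of the destination address, and the default prefix 0.0.0.0/0 is modeled as falling through to the suffix entries.

```python
import ipaddress

# First level: prefixes for subnets inside the pod (from the example).
PREFIXES = [("10.2.0.0/24", 0), ("10.2.1.0/24", 1)]
# Second level: suffixes 0.0.0.2/8 and 0.0.0.3/8, stored as (last octet, port).
SUFFIXES = [(2, 2), (3, 3)]


def lookup(dst: str) -> int:
    """Return the output port for a destination address."""
    addr = ipaddress.ip_address(dst)
    # Intra-pod traffic terminates on a first-level prefix.
    for prefix, port in PREFIXES:
        if addr in ipaddress.ip_network(prefix):
            return port
    # Everything else hits the default prefix and is spread by host ID.
    host_id = int(addr) & 0xFF
    for suffix, port in SUFFIXES:
        if host_id == suffix:
            return port
    raise LookupError(f"no route for {dst}")


if __name__ == "__main__":
    for dst in ("10.2.1.2", "10.3.0.3", "10.0.1.2"):
        print(dst, "->", lookup(dst))
```

Matching on the host ID for inter-pod traffic means that packets to different hosts in the same remote subnet leave on different uplinks, which is how the static scheme spreads load.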
Dynamic Routing methods
• Flow classification
1. Recognize subsequent packets of the same flow and forward them on the same outgoing port to avoid packet reordering;
2. Periodically reassign output ports to ensure a fair distribution of flows across output ports in the face of dynamically changing flow sizes.
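The two steps above can be sketched as a small per-switch classifier. This is an illustrative sketch, not the authors' implementation: the class and method names are hypothetical, new flows are assumed to go to the least-loaded port, and the rebalancing rule (move the largest flow on the busiest port that is smaller than the load gap) is one plausible heuristic.

```python
from collections import defaultdict


class FlowClassifier:
    """Pins each flow to one uplink port and periodically rebalances."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.flow_port = {}                  # flow id -> assigned port
        self.flow_bytes = defaultdict(int)   # flow id -> bytes seen so far

    def port_load(self, port):
        return sum(b for f, b in self.flow_bytes.items() if self.flow_port[f] == port)

    def forward(self, flow, size):
        # Step 1: packets of a known flow keep their port (no reordering).
        if flow not in self.flow_port:
            self.flow_port[flow] = min(self.ports, key=self.port_load)
        self.flow_bytes[flow] += size
        return self.flow_port[flow]

    def rebalance(self):
        # Step 2: called periodically to even out the load across ports.
        p_max = max(self.ports, key=self.port_load)
        p_min = min(self.ports, key=self.port_load)
        gap = self.port_load(p_max) - self.port_load(p_min)
        movable = [
            f for f, p in self.flow_port.items()
            if p == p_max and self.flow_bytes[f] < gap
        ]
        if movable:
            largest = max(movable, key=lambda f: self.flow_bytes[f])
            self.flow_port[largest] = p_min


if __name__ == "__main__":
    fc = FlowClassifier(ports=[2, 3])
    for flow, size in [("a", 10), ("b", 10), ("c", 30), ("a", 50)]:
        print(flow, "->", fc.forward(flow, size))
    fc.rebalance()
    print(fc.flow_port)
```

Only flows smaller than the gap are moved, so a reassignment can never make the previously lighter port the more heavily loaded one by more than the original imbalance.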
Dynamic Routing methods
• Flow scheduling (with a central scheduler)
Method 1 (notification):
1. Edge switches detect any outgoing large flow;
2. They periodically send notifications to a central scheduler;
3. The central scheduler orders a reassignment.
Method 2 (monitor):
1. A central scheduler tracks all active large flows;
2. It assigns them non-conflicting paths if possible;
3. The scheduler maintains a Boolean state for every link, indicating whether it is available to carry a large flow.
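The scheduler's bookkeeping can be sketched with a set of reserved links standing in for the per-link Boolean state. Everything here is a hypothetical model for illustration: the class name, the link tuples, the (pod, edge switch) endpoint encoding, the rule that core switch c attaches to upper-layer switch c // (k/2) in every pod, and the linear search over core switches are assumptions, not details taken from the slides.

```python
class CentralScheduler:
    """Assigns large inter-pod flows to core paths whose links are all free."""

    def __init__(self, k):
        self.half = k // 2
        self.busy = set()   # reserved links; a link not in the set is free
        self.paths = {}     # flow id -> list of links it reserved

    def path_links(self, src, dst, core):
        (src_pod, src_edge), (dst_pod, dst_edge) = src, dst
        agg = core // self.half  # upper-layer switch that reaches this core
        return [
            ("edge-up", src_pod, src_edge, agg),
            ("core-up", src_pod, core),
            ("core-down", dst_pod, core),
            ("edge-down", dst_pod, dst_edge, agg),
        ]

    def place(self, flow, src, dst):
        # Linear search for the first core switch with a conflict-free path.
        for core in range(self.half ** 2):
            links = self.path_links(src, dst, core)
            if not any(link in self.busy for link in links):
                self.busy.update(links)
                self.paths[flow] = links
                return core
        return None  # no non-conflicting path available

    def release(self, flow):
        # Free the links of a flow that has finished.
        for link in self.paths.pop(flow, []):
            self.busy.discard(link)


if __name__ == "__main__":
    sched = CentralScheduler(k=4)
    print(sched.place("f1", (0, 0), (2, 0)))
    print(sched.place("f2", (0, 0), (2, 1)))
    print(sched.place("f3", (0, 0), (2, 0)))
```

In this model an edge switch has only k/2 uplinks, so a third large flow from the same edge switch in a 4-ary tree cannot be placed until an earlier one is released.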
Fault-Tolerance
• Simple failure broadcast protocol
-- Each switch maintains a Bidirectional Forwarding Detection (BFD) session with each of its neighbors (D. Katz, D. Ward, BFD for IPv4 and IPv6, 2008)
• Two classes of failures
Fault-Tolerance based on the flow classification (1)
• Outgoing inter- and intra-pod traffic originating from the edge switch
• Intra-pod traffic using the upper-layer switch as an intermediary
• Inter-pod traffic coming into the upper-layer switch
Fault-Tolerance based on the flow classification (2)
• Outgoing inter-pod traffic
• Incoming inter-pod traffic
Fault-Tolerance based on the flow scheduling
• Simpler than with flow classification
• The scheduler marks any link reported to be down as busy or unavailable
Limitations
• The performance evaluation uses only a small prototype of the architecture, consisting of 4 pods (16 hosts)
• The fat-tree topology incurs significant wiring overhead
-- 3k^3/4 cables for a k-ary fat-tree
-- e.g. k = 48, supporting 27,648 hosts: 3 * 48^3 / 4 = 82,944 cables
• The changes required to commodity switches should be considered
-- They don't support the two-level routing table
-- They don't support the dynamic routing techniques
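The cable count quoted on this slide follows from counting links layer by layer. The sketch below is a quick check under the assumption that every switch port is used and each link is one cable; the function name `cable_counts` and the dictionary keys are hypothetical.

```python
def cable_counts(k: int) -> dict:
    """Count cables per layer in a k-ary fat-tree (one cable per link)."""
    half = k // 2
    counts = {
        # k pods * k/2 edge switches * k/2 hosts each
        "host_to_edge": k * half * half,
        # k pods * k/2 edge switches * k/2 uplinks each
        "edge_to_upper": k * half * half,
        # k pods * k/2 upper-layer switches * k/2 uplinks each
        "upper_to_core": k * half * half,
    }
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    print(cable_counts(48))
```

Each of the three layers contributes k^3/4 cables, which gives the 3k^3/4 total.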
Limitations
• Dynamic routing techniques also have limitations
-- The flow classifier has only local knowledge available
-- A centralized scheduler with global knowledge may be infeasible for large arbitrary networks
• The two-level routing solution cannot avoid local congestion without a dynamic routing technique
Q&A
Extra slides