
Projects Related to Coronet
Jennifer Rexford
Princeton University
http://www.cs.princeton.edu/~jrex
Outline
• SEATTLE
– Scalable Ethernet architecture
• Router grafting (joint work with Kobus)
– Seamless re-homing of links to BGP neighbors
– Applications of grafting for traffic engineering
• Static multipath routing (Martin’s AT&T project)
– Joint traffic engineering and fault tolerance
SEATTLE
Scalable Ethernet Architecture for Large Enterprises
(joint work with Changhoon Kim and Matt Caesar)
http://www.cs.princeton.edu/~jrex/papers/seattle08.pdf
Goal: Network as One Big LAN
• Shortest-path routing on flat addresses
–Shortest paths: scalability and performance
–MAC addresses: self-configuration and mobility
[Figure: an enterprise network of switches (S) interconnecting hosts (H), operated as one big LAN]
• Scalability without hierarchical addressing
–Limit dissemination and storage of host info
–At the cost of sending some packets on slightly longer paths
SEATTLE Design Decisions
Objective → Approach → Solution
1. Avoiding flooding: never broadcast unicast traffic → network-layer one-hop DHT
2. Restraining broadcasting: bootstrap hosts via unicast → network-layer one-hop DHT
3. Reducing routing state: populate host info only when and where it is needed → traffic-driven resolution with caching
4. Shortest-path forwarding: allow switches to learn topology → L2 link-state routing maintaining only switch-level topology
* Meanwhile, avoid modifying end hosts
Network-Layer One-hop DHT
• Maintains <key, value> pairs with function F
– Consistent hash mapping a key to a switch
– F is defined over the set of live switches
• One-hop DHT
– Link-state routing ensures switches know each other
[Figure: consistent-hash ring spanning keys 0 through 2^128 - 1, partitioned among the switches]
• Benefits
– Fast and efficient reaction to changes
– Reliability and capacity naturally grow with the size of the network
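To make the mapping F concrete, here is a minimal sketch of a consistent hash over the live-switch set, assuming SHA-1 and a "first switch clockwise" rule; the switch names and MAC address are made up for illustration:

```python
import hashlib
from bisect import bisect_left

def ring_position(value: str) -> int:
    """Map an identifier (MAC address or switch ID) to a point on the hash ring."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

class OneHopDHT:
    """Consistent hash F over the set of live switches learned from link-state routing."""

    def __init__(self, live_switches):
        # Every switch computes the same ring from the same link-state database.
        self.ring = sorted((ring_position(s), s) for s in live_switches)

    def resolver_for(self, key: str) -> str:
        """F(key): the first live switch clockwise from the key's position on the ring."""
        points = [p for p, _ in self.ring]
        i = bisect_left(points, ring_position(key)) % len(self.ring)
        return self.ring[i][1]

# Example: any switch can resolve a MAC address to the same resolver switch,
# so a single hop reaches the switch responsible for that key.
dht = OneHopDHT(["switch-A", "switch-B", "switch-C", "switch-D"])
print(dht.resolver_for("00:1b:44:11:3a:b7"))
```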
Location Resolution
<key, val> = <MAC addr, location>
[Figure: location resolution steps. Host x attaches to switch A (host discovery); A computes F(MACx) = B and publishes <MACx, A> to resolver switch B, which stores the entry. When host y at switch D sends traffic to x, D computes F(MACx) = B and tunnels the frames to B; B tunnels them on to A and notifies D of <MACx, A>, so subsequent traffic is forwarded directly from D to A.]
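A hedged sketch of the publish/resolve behavior above, reusing the OneHopDHT sketch; the class, message tuples, and cache layout are illustrative assumptions, not the actual switch code:

```python
class SeattleSwitchSketch:
    """Illustrative access-switch behavior for location resolution (not the real implementation)."""

    def __init__(self, switch_id, dht):
        self.id = switch_id
        self.dht = dht
        self.resolver_store = {}   # <MAC, location> entries this switch stores as a resolver
        self.location_cache = {}   # entries learned reactively from notifications

    def on_host_discovery(self, mac):
        # Host x appears on a local port: publish <MACx, this switch> to resolver F(MACx).
        self.send(self.dht.resolver_for(mac), ("publish", mac, self.id))

    def on_publish(self, mac, location):
        # Resolver side: store the published mapping.
        self.resolver_store[mac] = location

    def on_data_frame(self, dst_mac, frame):
        # Ingress side: known destinations go straight to the cached location;
        # unknown ones are tunneled via the resolver, which relays them to the
        # owner's switch and notifies us of <MAC, location> for future frames.
        target = self.location_cache.get(dst_mac) or self.dht.resolver_for(dst_mac)
        self.tunnel(target, frame)

    def on_notify(self, mac, location):
        self.location_cache[mac] = location

    def send(self, switch, msg): ...
    def tunnel(self, switch, frame): ...
```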
Address Resolution
<key, val> = <IP addr, MAC addr>
[Figure: address resolution steps. Switch A publishes <IPx, MACx, A> to resolver switch B = F(IPx), which stores the entry. When host y broadcasts an ARP request for IPx, its switch D intercepts it, computes F(IPx) = B, and sends a unicast look-up to B; B returns a unicast reply <IPx, MACx, A>.]
Traffic following ARP takes a shortest path without separate location resolution
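The same pattern covers ARP. In this short sketch, unicast_lookup and send_arp_reply are hypothetical helpers; the key detail from the slide is that the reply carries the owner's location along with the MAC, so no separate location resolution is needed:

```python
def handle_arp_request(switch, requester_port, target_ip):
    """Illustrative: intercept a broadcast ARP request and answer it via the one-hop DHT."""
    resolver = switch.dht.resolver_for(target_ip)                # F(IPx) = B
    mac, location = switch.unicast_lookup(resolver, target_ip)   # reply carries <IPx, MACx, A>
    switch.location_cache[mac] = location                        # first data frame already knows where to go
    switch.send_arp_reply(requester_port, target_ip, mac)        # unicast reply to the requesting host
```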
Handling Network and Host Dynamics
• Network events
–Switch failure/recovery
→ Changes the <key, value> pairs assigned to the failed switch's DHT neighbor
→ Fortunately, switch failures are not common
–Link failure/recovery
→ Link-state routing finds new shortest paths
• Host events
–Host changes its location, MAC address, or IP address
–Must update stale host-information entries
Handling Host Information Changes
Dealing with host mobility
[Figure: host x moves from switch A (old location) to switch D (new location). The resolver and the switches of hosts talking with x replace their stale <x, A> entries with <x, D>.]
MAC- or IP-address change can be handled similarly
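A hedged sketch of how the stale entries might be refreshed; the assumption that the resolver knows which switches cached the old mapping (stale_holders) is mine, made only to keep the example self-contained:

```python
def on_host_move(new_switch, mac):
    """Illustrative: host `mac` re-attaches at a new access switch and is re-published."""
    resolver = new_switch.dht.resolver_for(mac)
    new_switch.send(resolver, ("publish", mac, new_switch.id))   # <x, D> replaces <x, A>

def on_republish(resolver_switch, mac, new_location, stale_holders):
    """Illustrative resolver behavior: update its own entry and correct stale caches.

    `stale_holders` (switches known to be talking with x) is an assumed input;
    how the resolver learns whom to notify is not spelled out here.
    """
    resolver_switch.resolver_store[mac] = new_location
    for switch in stale_holders:
        resolver_switch.send(switch, ("notify", mac, new_location))
```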
Packet-Level Simulations
• Large-scale packet-level simulation
–Event-driven simulation of control plane
–Synthetic traffic based on LBNL traces
–Campus, data center, and ISP topologies
• Main results
–Much less routing state than Ethernet
–Only slightly more stretch than IP routing
–Low overhead for handling host mobility
Prototype Implementation
[Figure: prototype structure. XORP's OSPF daemon exchanges link-state advertisements and maintains the network map, connected through an interface to a SeattleSwitch element in user/kernel Click; the SeattleSwitch holds the routing table plus a Ring Manager and a Host Info Manager that exchange host-info registration and notification messages, and it forwards data frames.]
Throughput: 800 Mbps for 512B packets, or 1400 Mbps for 896B packets
Conclusions on SEATTLE
• SEATTLE
–Self-configuring, scalable, efficient
• Enabling design decisions
–One-hop DHT with link-state routing
–Reactive location resolution and caching
–Shortest-path forwarding
• Relevance to Coronet
–Backbone as one big virtual LAN
–Using Ethernet addressing
Router Grafting
Joint work with Eric Keller, Kobus van der Merwe, and Michael Schapira
http://www.cs.princeton.edu/~jrex/papers/nsdi10.pdf
http://www.cs.princeton.edu/~jrex/papers/temigration.pdf
Today: Change is Disruptive
• Planned change
–Maintenance on a link, card, or router
–Re-homing customer to enable new features
–Traffic engineering by changing the traffic matrix
[Figure: a customer link being re-homed from one provider edge router to another]
• Several minutes of disruption
–Remove link and reconfigure old router
–Connect link to the new router
–Establish BGP session and exchange routes
Router Grafting: Seamless Migration
• IP: signal new path in underlying transport network
• TCP: transfer TCP state, and keep IP address
• BGP: copy BGP state, repeat decision process
[Figure: grafting in action: send the session state from the old router to the new one, then move the link]
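A minimal sketch of the migration sequence implied by the three bullets above; the orchestration API, function names, and step ordering are illustrative assumptions rather than the Quagga/SockMi implementation:

```python
def graft_bgp_session(old_router, new_router, neighbor, transport):
    """Illustrative orchestration of router grafting for one BGP neighbor."""
    # IP layer: signal a new path through the programmable transport network,
    # so the same link endpoint and IP address re-appear at the new router.
    transport.signal_path(link=neighbor.link, to_router=new_router)

    # TCP layer: export the live TCP connection state (sequence numbers, buffers)
    # from the old router and import it at the new one, keeping the IP address.
    tcp_state = old_router.export_tcp_state(neighbor)
    new_router.import_tcp_state(neighbor, tcp_state)

    # BGP layer: copy the routes learned from and advertised to this neighbor,
    # then let the new router repeat its decision process on the imported routes.
    bgp_state = old_router.export_bgp_state(neighbor)
    new_router.import_bgp_state(neighbor, bgp_state)
    new_router.rerun_decision_process(neighbor)

    # Finally move the link and retire the session on the old router.
    transport.move_link(neighbor.link, to_router=new_router)
    old_router.remove_session(neighbor)
```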
Prototype Implementation
• Added grafting into Quagga
– Import/export routes, new ‘inactive’ state
– Routing data and decision process well separated
• Graft daemon to control process
• SockMi for TCP migration
[Figure: testbed. The graftable router runs modified Quagga, a graft daemon (handler and communication modules), and SockMi.ko on Linux kernel 2.6.19.7; link migration is emulated with click.ko on a Linux kernel 2.6.19.7-click machine; the peer is an unmodified router running stock Quagga on Linux kernel 2.6.19.7.]
Grafting for Traffic Engineering
Rather than tweaking the routing protocols…
* Rehome customer to change traffic matrix
Traffic Engineering Evaluation
• Internet2 topology and traffic data
• Developed algorithms to determine which links to graft
[Chart: total link usage vs. demand multiple (1 to 2.6), comparing the original topology (optimal paths) with grafting; with grafting the network can handle more traffic at the same level of congestion.]
Conclusions
• Grafting for seamless change
–Make maintenance and upgrades seamless
–Enable new management applications (e.g., TE)
• Implementing grafting
–Modest modifications to the router
–Leveraging programmable transport networks
• Relevance to Coronet
–Flexible edge-router connectivity
–Without disrupting neighboring ISPs
Joint Failure Recovery
and Traffic Engineering
Joint work with Martin Suchara, Dahai Xu, Bob Doverspike, and David Johnson
http://www.cs.princeton.edu/~jrex/papers/stamult10.pdf
Simple Network Architecture
• Precomputed multipath routing
– Offline computation based on underlying topology
– Multiple paths between each pair of routers
• Path-level failure detection
– Edge router only learns which path(s) have failed
– E.g., using end-to-end probes, like BFD
– No need for network-wide flooding
• Local adaptation to path failures
– Ingress router rebalances load over remaining paths
– Based on pre-installed weights
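A minimal sketch of the local adaptation step, assuming the ingress simply renormalizes its pre-installed weights over the paths it currently believes to be up (names and the BFD-like failure input are illustrative):

```python
def split_traffic(paths, weights, failed):
    """Rebalance load over the remaining paths using pre-installed weights.

    paths:   list of path IDs to a given egress
    weights: pre-installed splitting weights, one per path
    failed:  set of path IDs the ingress has detected as down (e.g., via BFD-like probes)
    """
    alive = [(p, w) for p, w in zip(paths, weights) if p not in failed]
    total = sum(w for _, w in alive)
    if total == 0:
        return {}  # no usable weight mass left; operator policy decides what happens next
    return {p: w / total for p, w in alive}

# Example: three paths with weights 0.25, 0.25, 0.5; the third path fails.
print(split_traffic(["p1", "p2", "p3"], [0.25, 0.25, 0.5], failed={"p3"}))
# -> {'p1': 0.5, 'p2': 0.5}
```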
Architecture
• topology design
• list of shared risks
• traffic demands
• fixed paths
• splitting ratios
[Figure: three precomputed paths between routers s and t, with splitting ratios 0.25, 0.25, and 0.5]
Architecture
• fixed paths
• splitting ratios
[Figure: after a link cut disables one of the paths, the ingress rebalances to splitting ratios 0.5, 0.5, and 0]
State-Dependent Splitting
• Custom splitting ratios
–Weights for each combination of path failures
configuration (at most 2^#paths entries):
  Failure      Splitting ratios
  - (none)     0.4, 0.4, 0.2
  p2           0.6, 0, 0.4
  …            …
[Figure: three paths p1, p2, p3 carrying fractions 0.4, 0.4, and 0.2 of the traffic; after p2 fails, p1 and p3 carry 0.6 and 0.4]
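A hedged sketch of how an ingress might consult such a configuration table, keyed by the set of failed paths (hence at most 2^#paths entries); the data layout and fallback policy are assumptions for illustration:

```python
# Pre-installed configuration: splitting ratios for each combination of failed paths.
# Keys are frozensets of failed path IDs; values give the ratio for each path.
SPLITTING_TABLE = {
    frozenset():       {"p1": 0.4, "p2": 0.4, "p3": 0.2},   # no failures
    frozenset({"p2"}): {"p1": 0.6, "p2": 0.0, "p3": 0.4},   # p2 down
    # ... up to 2^#paths entries in the worst case
}

def ratios_for(observed_failures):
    """Pick the pre-computed splitting ratios for the observed failure combination.

    Falls back to renormalizing the no-failure ratios if the exact combination
    was not pre-installed (an assumption, not necessarily the deck's policy).
    """
    key = frozenset(observed_failures)
    if key in SPLITTING_TABLE:
        return SPLITTING_TABLE[key]
    base = SPLITTING_TABLE[frozenset()]
    alive = {p: w for p, w in base.items() if p not in key}
    total = sum(alive.values())
    return {p: w / total for p, w in alive.items()} if total else {}

print(ratios_for({"p2"}))  # -> {'p1': 0.6, 'p2': 0.0, 'p3': 0.4}
```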
Optimizing Paths and Weights
• Optimization algorithms
– Computing multiple paths per pair of routers
– Computing splitting ratios for each failure scenario
• Performance evaluation
– On AT&T topology, traffic, and shared-risk data
– Performance competitive with optimal solution
– Using around 4-8 paths per pair of routers
• Benefits
– Joint failure recovery and traffic engineering
– Very simple network elements (nearly zero code)
– Part of gradual move away from dynamic layer 3
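One way to picture the optimization is the objective sketched below, assuming failure states s are weighted by w_s and congestion on each link e (capacity c_e, load f_e^s) is penalized by a convex cost Φ; this follows common traffic-engineering formulations and is not quoted from the paper:

```latex
\min_{\text{paths},\,\alpha}\ \sum_{s \in S} w_s \sum_{e \in E} \Phi\!\left(\frac{f_e^{s}}{c_e}\right)
\quad \text{where } f_e^{s} = \sum_{u,v} d_{uv}
  \sum_{\substack{P \in \mathcal{P}_{uv}\\ e \in P,\ P \text{ alive in } s}} \alpha_{P}^{s},
\qquad \sum_{\substack{P \in \mathcal{P}_{uv}\\ P \text{ alive in } s}} \alpha_{P}^{s} = 1 .
```

Here d_{uv} are the traffic demands, \mathcal{P}_{uv} the fixed candidate paths between routers u and v, and \alpha_P^s the splitting ratio placed on path P in failure state s.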