Robust Traffic Engineering in a Dynamic Environment
Resilient Routing Reconfiguration
Ye Wang, Hao Wang*, Ajay Mahimkar+, Richard Alimi,
Yin Zhang+, Lili Qiu+, Yang Richard Yang
* Google
+ University of Texas at Austin
Yale University
ACM SIGCOMM 2010, New Delhi, India
September 2, 2010
Motivation
Failures are common in operational IP networks
Accidents, attacks, hardware failure, misconfig, maintenance
Multiple unexpected failures may overlap
e.g., concurrent fiber cuts in Sprint (2006)
Planned maintenance affects multiple network elements
May overlap with unexpected failures (e.g., due to inaccurate SRLG)
Increasingly stringent requirement on reliability
VoIP, video conferencing, gaming, mission-critical apps, etc.
SLAs have teeth: a violation directly affects ISP revenue
Need resiliency: network should recover quickly &
smoothly from one or multiple overlapping failures
Challenge: Topology Uncertainty
Number of failure scenarios quickly explodes
Difficult to optimize routing to avoid congestion
under all possible failure scenarios
500-link network, 3-link failures: > 20,000,000!
Brute-force failure enumeration is clearly infeasible
Existing methods handle only 100s of topologies
Difficult to install fast routing
Preconfigure 20,000,000 backup routes on routers?
Existing Approaches & Limitations
Focus exclusively on reachability
e.g., FRR, FCP (Failure Carrying Packets), Path Splicing
May suffer from congestion and unpredictable performance
congestion is mostly caused by rerouting under failures [Iyer et al.]
multiple network element failures have a domino effect on FRR rerouting, resulting in network instability [N. So & H. Huang]
Only consider a small subset of failures
e.g., single-link failures [D. Applegate et al.]
Insufficient for demanding SLAs
Online routing re-optimization after failures
Too slow: cannot support fast rerouting
R3: Resilient Routing Reconfiguration
A novel link-based routing protection scheme
requiring no enumeration of failure scenarios
provably congestion-free for all up-to-F link failures
efficient w.r.t. router processing/storage overhead
flexible in supporting diverse practical requirements
Problem Formulation
Goal: congestion-free rerouting under up-to-F link failures
Input: topology G(V,E), link capacity ce, traffic demand d
dab: traffic demand from router a to router b
Output: base routing r, protection routing p
rab(e): fraction of dab carried by link e
pl(e): (link-based) fast rerouting for link l
[Figure: example topology with routers a, b, c, d; link capacities as labeled (5 and 10); demand dab = 6; base routing rab(ab) = 1; protection routing for link ab: pab(ac) = 1, pab(cb) = 1]
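To make the notation concrete, the sketch below encodes the figure's example in plain Python dictionaries (illustrative only, not from the paper):

```python
# The figure's example encoded as dicts; links are (tail, head) tuples.
demand = {("a", "b"): 6}                                        # d_ab = 6
base_routing = {("a", "b"): {("a", "b"): 1.0}}                  # r_ab(ab) = 1: all of d_ab rides link ab
protection = {("a", "b"): {("a", "c"): 1.0, ("c", "b"): 1.0}}   # p for protected link ab: detour via c

# If link ab fails, the 6 units it carried are sent along a->c and c->b per the protection routing.
```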
From Topology Uncertainty
to Traffic Uncertainty
Instead of optimizing for original traffic demand on
all possible topologies under failures
R3 optimizes protection routing for a set of traffic
demands on the original topology
A rerouting virtual demand set captures the effect of failures on the amount of rerouted traffic
The protection routing on the original topology can be easily reconfigured for use after a failure occurs
Rerouting Virtual Demand Set
Failure scenario (f) → rerouted traffic (x)
[Figure: example topology with per-link load/capacity labels (4/5 on ac; 4/10, 2/10, 0/10 on the other links)]
Failure scenario: ac fails. Rerouted traffic: xac = 4. Upper bound: xac ≤ 5 (= cac)
Failure scenario: ab fails. Rerouted traffic: xab = 2. Upper bound: xab ≤ 10 (= cab)
rerouted traffic xl after link l fails = base load on l given r
(r is congestion-free → xl ≤ cl)
Rerouted traffic under all possible up-to-F-link failure scenarios (independent of r):
XF = { x | 0 ≤ xl ≤ cl, Σl (xl/cl) ≤ F } (a convex set)
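A minimal membership test for this set, as an illustrative Python sketch (not from the paper; values taken from the table above):

```python
# Check whether a rerouted-traffic vector x lies in the virtual demand set
#   X_F = { x | 0 <= x_l <= c_l for every link l,  sum_l x_l / c_l <= F }.
def in_XF(x, cap, F):
    """x, cap: dicts mapping link -> rerouted load / capacity; F: max number of link failures."""
    if any(x.get(l, 0.0) < 0 or x.get(l, 0.0) > cap[l] for l in cap):
        return False
    return sum(x.get(l, 0.0) / cap[l] for l in cap) <= F

# Slide example: xac = 4 with cac = 5, xab = 2 with cab = 10, F = 2
print(in_XF({"ac": 4, "ab": 2}, {"ac": 5, "ab": 10}, F=2))   # True: 4/5 + 2/10 = 1.0 <= 2
```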
R3 Overview
Offline precomputation
Plan (r,p) together for original demand d plus rerouting virtual
demand x on original topology G(V,E) to minimize congestion
p may “use” links that will later fail
Online reconfiguration
Convert and use p for fast rerouting after failures
9
Offline Precomputation
Compute (r,p) to minimize MLU (Max Link Utilization)
for original demand d + rerouting demand x ∈ XF
r carries d, p carries x∈XF
min(r,p) MLU
s.t. [1] r is a routing, p is a routing;
[2] ∀x∈XF, ∀e: [ ∑a,b∈V dab·rab(e) + ∑l∈E xl·pl(e) ] / ce ≤ MLU
(the first sum is the original traffic on e; the second is the rerouting traffic on e)
Challenge: [2] has an infinite number of constraints
Solution: apply LP duality → a polynomial number of constraints
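To make constraint [2] concrete: for a fixed (r, p), the worst case over x ∈ XF can be evaluated with one small LP per link. The sketch below is an illustrative check in Python with scipy (not the authors' code; R3 itself avoids this inner maximization by dualizing it, as shown in the backup slides):

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_utilization(cap, base_load, p, F):
    """cap[e]: capacity of link e; base_load[e] = sum_ab d_ab*r_ab(e); p[l][e]: protection routing."""
    E = len(cap)
    util = []
    for e in range(E):
        # linprog minimizes, so negate to maximize sum_l p_l(e) * x_l over x in X_F.
        c = -np.array([p[l][e] for l in range(E)])
        A_ub = np.array([[1.0 / cap[l] for l in range(E)]])   # sum_l x_l / c_l <= F
        b_ub = np.array([float(F)])
        bounds = [(0.0, cap[l]) for l in range(E)]            # 0 <= x_l <= c_l
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        worst_reroute = -res.fun
        util.append((base_load[e] + worst_reroute) / cap[e])
    return max(util)   # the MLU this fixed (r, p) can guarantee over all x in X_F
```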
Online Reconfiguration
Step 1: Fast rerouting after ac fails:
Precomputed p for ac:
pac(ac)=1/3, pac(ab)=1/3, pac(ad)=1/3
ac fails → fast reroute using pac without the pac(ac) share,
equivalent to a rescaled pac:
ξac(ac)=0, ξac(ab)=1/2, ξac(ad)=1/2
In general, link l fails → ξl(e) = pl(e) / (1 - pl(l))
[Figure: topology with routers a, b, c, d annotated with pac(e) and the rescaled ξac(e): pac = 1/3 on ac, ab, ad; ξac = 0 on ac and 1/2 on ab and ad]
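A minimal sketch of this rescaling rule in Python (illustrative only; the dict layout and link names are assumptions, not the implementation):

```python
# Rescale a precomputed protection routing p_l after its protected link l fails:
#   xi_l(e) = p_l(e) / (1 - p_l(l)),  xi_l(l) = 0.   (Assumes p_l(l) < 1.)
def rescale_protection(p_l, failed_link):
    """p_l: dict mapping link -> fraction of l's rerouted traffic carried on that link."""
    keep = 1.0 - p_l.get(failed_link, 0.0)   # share of p_l that did not ride the failed link
    xi = {e: frac / keep for e, frac in p_l.items() if e != failed_link}
    xi[failed_link] = 0.0
    return xi

# Slide example: p_ac = {ac: 1/3, ab: 1/3, ad: 1/3}  ->  xi_ac = {ac: 0, ab: 1/2, ad: 1/2}
print(rescale_protection({"ac": 1/3, "ab": 1/3, "ad": 1/3}, "ac"))
```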
Online Reconfiguration (Cont.)
Step 2: Reconfigure p after the failure of ac
Current p for ab:
pab(ac)=1/2, pab(ad)=1/2
ac fails → the 1/2 share on ac needs to be "detoured" using ξac:
pab(ac)=0, pab(ad)=3/4, pab(ab)=1/4
In general, link l fails → ∀l', pl'(e) ← pl'(e) + pl'(l)·ξl(e)
[Figure: topology annotated with pab(e) before and after reconfiguration: ac 1/2 → 0, ad 1/2 → 3/4, ab 0 → 1/4]
Apply the detour ξac to every protection routing (for other links) that was using ac
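A sketch of the combined online update in Python (illustrative only, not the implementation), applying the Step 1 rescaling and then the Step 2 detour rule:

```python
# After link l fails: rescale p_l into xi_l, then for every other protected link l'
# apply  p_l'(e) <- p_l'(e) + p_l'(l) * xi_l(e)  and drop l from p_l'.  (Assumes p_l(l) < 1.)
def reconfigure_after_failure(p, failed_link):
    """p: dict protected_link -> {link -> fraction}; reconfigured in place."""
    p_f = p[failed_link]
    keep = 1.0 - p_f.get(failed_link, 0.0)
    xi = {e: frac / keep for e, frac in p_f.items() if e != failed_link}   # Step 1 rescaling
    for protected, routing in p.items():
        if protected == failed_link:
            continue
        shifted = routing.pop(failed_link, 0.0)   # share that was using the failed link
        routing[failed_link] = 0.0
        for e, frac in xi.items():
            routing[e] = routing.get(e, 0.0) + shifted * frac
    return p

# Slide example: p_ab = {ac: 1/2, ad: 1/2} with xi_ac = {ab: 1/2, ad: 1/2}
# becomes p_ab = {ac: 0, ad: 3/4, ab: 1/4}.
```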
R3 Guarantees
Sufficient condition for congestion-free operation:
if ∃(r,p) s.t. MLU ≤ 1 under d + XF,
then there is no congestion under any failure involving up to F links
Necessary condition under single-link failures:
if some protection routing guarantees no congestion under every single-link failure scenario,
then ∃(r,p) s.t. MLU ≤ 1 under d + X1
So adding a superset of the rerouted traffic to the original demand is not as wasteful as it may seem
Open problem: is R3 optimal for >1 link failures?
R3 online reconfiguration is independent of the order in which multiple failures occur
R3 Extensions
Fixed base routing
r can be given (e.g., as an outcome of OSPF)
Trade-off between no-failure and failure protection
Add penalty envelope β(≥1) to bound no-failure performance
Prioritized traffic protection
Associate different protection levels to traffic with different priorities
Realistic failure scenarios
Shared Risk Link Group, Maintenance Link Group
Trade-off between MLU and end-to-end delay
Add envelope γ(≥1) to bound end-to-end path delay
Traffic variations
Optimize (r,p) for d ∈ D + x ∈ XF
Evaluation Methodology
Network Topology
Real: Abilene, US-ISP (PoP-level)
Rocketfuel: Level-3, SBC, and UUNet
Synthetic: GT-ITM
Traffic Demand
Real: Abilene, US-ISP
Synthetic: gravity model
Failure Scenarios
Abilene: all failure scenarios with up to 3 physical links
US-ISP: maintenance events (6 months) + all 1-link and 2-link failures + ~1100 sampled 3-link failures
Enumeration is only needed for evaluation, not for R3
Evaluation Methodology (cont.)
R3 vs other rerouting schemes
OSPF+R3: add R3 rerouting to OSPF
MPLS-ff+R3: “ideal” R3 (flow-based base routing)
OSPF+opt : benchmark, optimal rerouting (based on OSPF)
OSPF+CSPF-detour: commonly used
OSPF+recon: ideal OSPF reconvergence
FCP: Failure Carrying Packets [Lakshminarayanan et al.]
PathSplice: Path Splicing (k=10, a=0, b=3) [Motiwala et al.]
Performance metrics
MLU (Maximum Link Utilization) measures congestion
lower → better
performance ratio = MLU of the algorithm / MLU of optimal routing
for the changed topology (corresponding to the failure scenario)
always ≥ 1; closer to 1 → better
US-ISP Single Failure
R3 achieves near-optimal performance (R3 vs opt);
R3 outperforms other schemes significantly.
US-ISP Multiple Failures
all two-link failure scenarios
sampled three-link failure scenarios
R3 consistently outperforms other schemes by at least 50%
US-ISP No Failure: Penalty Envelope
R3 near optimal under no failures with 10% penalty envelope
Experiments using Implementation
R3 Linux software router implementation
Based on Linux kernel 2.6.25 + Linux MPLS
Implement flow-based fast rerouting and efficient R3 online
reconfiguration by extending Linux MPLS
Abilene topology emulated on Emulab
3 physical link failures (6 directed link failures)
R3 vs OSPF+recon: Link Utilization
R3 outperforms OSPF+recon by a factor of ~3
Precomputation Complexity
Profile computation time (in seconds)
Computer: 2.33 GHz CPU, 4 GB memory
Offline precomputation time < 36 minutes
(operational topologies, < 17 minutes)
Computation time remains stable as the number of protected failures (F) increases.
Router Storage Overhead
Estimate maximum router storage overhead for our implementation
ILM (MPLS incoming label mapping) and NHLFE (MPLS next-hop
label forwarding entry) are required to implement R3 protection routing
Modest router storage overhead:
FIB < 300 KB, RIB < 20 MB,
#ILM < 460, #NHLFE < 2,500
Conclusions
R3: Resilient Routing Reconfiguration
Provably congestion-free guarantee under multiple failures
Key idea: convert topology uncertainty to traffic uncertainty
offline precomputation + online reconfiguration
Flexible extensions to support practical requirements
Trace-driven simulation
R3 near optimal (>50% better than existing approaches)
Linux implementation
Feasibility and efficiency of R3 in real networks
Thank you!
Backup Slides
US-ISP Single Failure: Zoom In
For each hour (in one day), compare the worst case failure
scenario for each algorithm
R3 achieves near-optimal performance (R3 vs opt)
Level-3 Multiple Failures
Left: all two-failure scenarios; Right: sampled three-failure scenarios
R3 outperforms other schemes by >50%
SBC Multiple Failures
Left: all two-failure scenarios
Right: sampled three-failure scenarios
“Ideal” R3 outperforms OSPF+R3 in some cases
Robustness on Base Routing
OSPFInvCap: link weight is inversely proportional to bandwidth
OSPF: optimized link weights
Left: single failure; Right: two failures
A better base routing can lead to better routing protection
Flow RTT (Denver-Los Angeles)
R3 implementation achieves smooth routing protection
R3: OD pair throughput
Traffic demand is carried by R3
under multiple link failure scenarios
R3: Link Utilization
Bottleneck link utilization is kept under 0.37 using R3
Offline Precomputation Solution
[C2] contains an infinite number of constraints due to x
Consider the maximum "extra" load on link e caused by x: max over x∈XF of Σl∈E xl·pl(e)
By LP duality, Σl∈E xl·pl(e) ≤ UB for every x ∈ XF if there exist multipliers πe(l) ≥ 0 and λe ≥ 0 such that
F·λe + Σl∈E cl·πe(l) ≤ UB and πe(l) + λe/cl ≥ pl(e) for every link l
This converts [C2] into a polynomial number of constraints on (r,p), plus the constraints on πe(l) and λe above
Traffic Priority
Service priority is a practical requirement
of routing protection
Traffic Priority Example
TPRT (real-time IP) traffic
should be congestion-free under up-to-3 link
failures (to achieve 99.999% reliability SLA)
Private Transport (TPP) traffic
should survive up-to-2 link failures
General IP traffic
only needs to survive single-link failures
R3 with Traffic Priority
Attribute a protection level to each class of traffic
TPRT: protection level 3
TPP: protection level 2
IP: protection level 1
Traffic with protection level greater than or equal to i should survive under the failure scenarios covered by protection level i
R3 with Traffic Priority Algorithm
Precomputation with Traffic Priority
Consider protection for each protection level i
Guarantee each class of traffic has no
congestion under the failure scenarios
covered by its protection level
R3 with Traffic Priority Simulation
Routing Protection Simulation
Basic R3 vs R3 with Traffic Priority
Methodology
US-ISP: a large tier-1 operational network topology
hourly PoP-level TMs for a tier-1 ISP (1 week in 2007)
extract IPFR and PNT traffic from traffic traces, subtract IPFR
and PNT from total traffic and treat the remaining as general IP
Protection levels:
TPRT: up-to-4 link failures
TPP: up-to-2 link failures
IP: single link failures
Failure scenarios:
enumerated all single link failures
100 worst cases of 2 link failures
100 worst cases of 4 link failures
Traffic Protection Priority
IP: up-to-1 failure protection;
TPP: up-to-2 failure protection;
TPRT: up-to-4 failure protection
Left: single failure;
Right top: worst two failures;
Right bottom: worst four failures
R3 respects different traffic protection priorities
Linux Implementation
MPLS-ff software router
Based on Linux kernel 2.6.25 + Linux MPLS
Implement flow-based routing for efficient R3 online reconfiguration
Extend MPLS FWD (forward) data structure to enable per hop traffic splitting
for each flow
Failure detection and response
Detection using Ethernet monitoring in kernel
In operational networks, detection can be conservative (for SRLG, MLG)
Notification using ICMP 42 flooding
Requires reachability under failures
Rerouting traffic by MPLS-ff label stacking
Online reconfiguration of traffic splitting ratios (locally at each
router)
MPLS-ff
R3 uses flow-based traffic splitting (for each OD pair)
e.g., p_newy→wash(chic→hous) = 0.2 means link chic→hous will carry 20% of the traffic originally carried by newy→wash when newy→wash fails
Current MPLS routers only support path-based traffic splitting
Traffic load on equal-cost LSPs is proportional to the bandwidth requested by each LSP
Juniper J-, M-, T-series and Cisco 7200, 7500, 12000 series
e.g., multiple paths from newy to wash can be set up to protect link newy→wash
MPLS-ff
Option 1: convert flow-based to path-based routing
e.g., using a flow decomposition algorithm [Zheng et al.]
Expensive for R3 online reconfiguration: need to recompute and signal LSPs after each failure
Option 2: extend MPLS to support flow-based routing (MPLS-ff)
Enable next-hop traffic splitting ratios for each flow
flow: traffic originally carried by a protected link
MPLS-ff
MPLS FWD data structure
Extended to support multiple Next Hop Label Forwarding Entries (NHLFEs)
One NHLFE specifies one next-hop neighbor
One next-hop splitting ratio for each NHLFE
[Figure: router R1 with neighbors R2, R3, R4. FWD entry for label 200: NHLFE R2: 50%, NHLFE R3: 50%, NHLFE R4: 0%]
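A rough sketch of such an extended FWD entry (illustrative Python, not the kernel data structure; field names are assumptions):

```python
# A forwarding entry for one MPLS label, extended with per-NHLFE splitting ratios
# as described on this slide (one NHLFE per neighbor, one ratio per NHLFE).
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class FwdEntry:
    label: int
    nhlfe_ratio: Dict[str, float] = field(default_factory=dict)   # neighbor -> splitting ratio

    def set_ratios(self, ratios: Dict[str, float]) -> None:
        # R3 online reconfiguration only rewrites these local ratios.
        assert abs(sum(ratios.values()) - 1.0) < 1e-9
        self.nhlfe_ratio = dict(ratios)

# Slide example: label 200 at R1, split 50%/50%/0% toward R2, R3, R4.
fwd_200 = FwdEntry(label=200)
fwd_200.set_ratios({"R2": 0.5, "R3": 0.5, "R4": 0.0})
```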
MPLS-ff
Implement Next-hop Splitting Ratio
router i, next-hop j, protected link (a,b):
Packet hashing
Same hash value for packets in the same TCP flow
Independent hash values on different routers for
any particular TCP flow
[Figure: if every router uses Hash = f(src, dst, srcPort, dstPort) and router i forwards to j exactly the packets with 40 < Hash < 64, then all packets arriving at j from i satisfy 40 < Hash < 64, i.e., j sees a skewed hash-value distribution]
Solution: hash over the packet flow fields plus the router id
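A rough sketch of this hashing idea (illustrative Python, not the kernel implementation): mixing the router id into the hash keeps packets of a TCP flow on the same next hop at a given router while making the hash values seen by different routers independent.

```python
import hashlib

def pick_next_hop(router_id, flow, ratios):
    """flow: (src, dst, srcPort, dstPort); ratios: list of (neighbor, splitting_ratio)."""
    key = f"{router_id}|{flow}".encode()
    # Same flow -> same bucket at this router; different routers -> independent buckets.
    bucket = int.from_bytes(hashlib.sha1(key).digest()[:4], "big") / 2**32
    acc = 0.0
    for neighbor, ratio in ratios:
        acc += ratio
        if bucket < acc:
            return neighbor
    return ratios[-1][0]

# e.g., split 50%/50% between R2 and R3 at router R1:
print(pick_next_hop("R1", ("10.0.0.1", "10.0.0.2", 1234, 80), [("R2", 0.5), ("R3", 0.5)]))
```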
Failure Response
Failure Detection and Notification
detection: Ethernet monitoring in the Linux kernel
notification: ICMP 42 flood
Failure Response
MPLS-ff label stacking
Protection Routing Update
R3 online reconfiguration requires a local copy of p on each router
R3 Design: Routing Model
Network topology as graph G = (V,E)
V: set of routers
E: set of directed network links; link e=(i,j) has capacity ce
Traffic matrix (TM)
TM d is a set of demands: d = { dab | a,b ∈ V }
dab: traffic demand from a to b
Flow-based routing representation
r = { rab(e) | a,b ∈ V, e ∈ E }
rab(e): the fraction of traffic from a to b (dab) that is carried by link e
e.g., rab(e)=0.25 means link e carries 25% of the traffic from a to b
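With this representation, per-link load and utilization follow directly; a minimal illustrative sketch in Python (names are assumptions, not the paper's code):

```python
# Base load on link e is sum_{a,b} d_ab * r_ab(e); utilization divides by capacity c_e.
def link_utilization(demand, routing, capacity):
    """demand: {(a,b): d_ab}; routing: {(a,b): {link: fraction}}; capacity: {link: c_e}."""
    load = {e: 0.0 for e in capacity}
    for od, d in demand.items():
        for e, frac in routing.get(od, {}).items():
            load[e] += d * frac
    return {e: load[e] / capacity[e] for e in capacity}

# e.g., r_ab(e) = 0.25 and d_ab = 8 contribute a load of 2 on link e.
```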
Intuition behind R3
Plan rerouting on the single original topology
Avoid enumeration of topologies (failure scenarios)
Compute (r,p) to guarantee congestion-free routing for d + x∈XF on G
(x is not real traffic demand; G still contains links that will fail)
Puzzles
Add rerouted traffic before it appears
XF = { x | 0 ≤ xl ≤ cl, Σl(xl/cl) ≤ F } is a superset of the rerouted traffic under failures
Use network resources that will disappear
Protection routing p uses links that will fail
By doing two counter-intuitive things, R3 achieves:
• Congestion-free operation under multiple link failures w/o enumeration of failure scenarios
• Optimality for single link failures
• Fast rerouting
Online Reconfiguration
Step 1: Fast rerouting after ac fails:
Precomputed p for link ac:
pac(ac)=0.4, pac(ab)=0.3, pac(ad)=0.3
ac fails → the 0.4 share needs to be carried by ab and ad instead
rescaled pac: ac → 0, ab → 0.5, ad → 0.5
a locally rescales pac and activates fast rerouting
Efficiently implemented by extending MPLS label stacking
[Figure: fast-rerouting example with before/after load/capacity labels: link ac drops from 4/5 to 0/5 after the failure, and its 4 units are spread over ab, ad, and the downstream links toward c according to the rescaled pac (0.5 on ab, 0.5 on ad)]