MPLS Network Tuning:
How to Squeeze Most From Your Network
Swarup Acharya
Technical Manager, Multiservice Networking Research Group
Optical Networking Division, Bell Laboratories
Copyright 2004 Lucent Technologies Inc. All rights reserved.
Multi-Protocol Label Switching
MPLS has emerged as the foundation of next-generation data
networks
– Provides the underpinnings of the converged network vision
– “Connection-oriented” veneer on an inherently connectionless IP
network
IP/MPLS market continues to grow
– Year-over-year traffic growth: 119% (2002), 118% (2003), 84% (2004)
[Infonetics Research]
Drivers for data network convergence
– CapEx savings, OpEx savings, competition, convergence
This talk will focus on Network Management challenges in
delivering the lower CapEx, OpEx promise of MPLS
MPLS Traffic Engineering
The cornerstone of MPLS is Traffic Engineering (TE)
– Given a new demand, how best to route it in the network?
– No longer limited by IGP (OSPF, IS-IS) restrictions of:
• Destination-based forwarding
• Simple “additive” min-cost routing, ignorant of bandwidth
MPLS TE enables:
– Constraint-based routing
• Bandwidth, delay, etc.
– Explicit routing (aka source routing)
– Label Switched Paths (LSP)
• With appropriate resource reservations via RSVP
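Constraint-based routing is commonly implemented as "prune, then run shortest path": links that cannot carry the requested bandwidth are removed, and an ordinary min-cost search runs on what remains. The following is a minimal sketch under that assumption; the topology, costs, and the function name `cspf` are illustrative and not taken from the talk.

```python
import heapq

def cspf(links, src, dst, demand):
    """Return a min-cost path from src to dst that uses only links whose
    residual bandwidth is at least `demand`, or None if there is none.

    links maps (u, v) -> (cost, residual_bandwidth). Links are treated as
    bidirectional, which is a simplification made for this sketch.
    """
    # Step 1: prune links that cannot carry the demand.
    adj = {}
    for (u, v), (cost, residual) in links.items():
        if residual >= demand:
            adj.setdefault(u, []).append((v, cost))
            adj.setdefault(v, []).append((u, cost))

    # Step 2: plain Dijkstra on the pruned graph.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in adj.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (dist + cost, nxt, path + [nxt]))
    return None

# Hypothetical topology: the short path A-B-C has little bandwidth left.
links = {
    ("A", "B"): (1, 2),
    ("B", "C"): (1, 2),
    ("A", "D"): (1, 10),
    ("D", "E"): (1, 10),
    ("E", "C"): (1, 10),
}
small = cspf(links, "A", "C", demand=1)   # fits on the short path
large = cspf(links, "A", "C", demand=5)   # forced onto the longer path
```

A small demand follows the IGP-style shortest path, while a larger one is steered onto the longer path that has room, which is the behavior destination-based min-cost routing cannot provide.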
TE Benefits
[Figure: the same network (nodes A-E, L1-L3) drawn twice, showing the path from L1-L3 and the path from L2-L3]
IGP Shortest Path Routing: “Longer” paths under-utilized, “Shorter” paths bottlenecked!
Traffic Engineered Paths:
TE enables a more load-balanced network..
TE Necessary, But is it Sufficient?
General belief that Traffic Engineering enables efficient MPLS
networks
– “Traffic engineering reduces the overall cost of operations by more efficient use
of bandwidth resources…” [Cisco documentation]
Will TE alone provide all the efficiency you can get?
MPLS + TE moves the network from packet-switched to a “virtual”
circuit-switched IP network..
What about bandwidth inefficiencies inherent in circuit-switched
environments such as SONET, SDH, ATM…?
Will MPLS suffer the same fate?
SONET/SDH Bandwidth Fragmentation
Primary cause of poor SONET/SDH efficiency
Circuit churn leaves behind “stranded” capacity
New requests often denied even if sufficient capacity exists
– Often at only ~30-40% network utilization
SONET NMSs have had “constraint-based” routing for a while now..
– Bandwidth accounted in routing
– Arcane SONET/SDH constraints
Increasingly, defragmentation tools used to “recover” capacity
– Often, in conjunction with newer Optical Control Planes
SONET/SDH Link Fragmentation
[Figure: a 2.5 Gbps link on a fiber, shown as 622 Mbps time slots before and after rearrangement. Before rearrangement, circuits a, b, c, d are spread across slots and a new request for a 622 Mbps demand is rejected. After rearrangement (a, c, d, b in one slot), the request is accommodated.]
• By rearranging traffic carried on different time slots, capacity on
the link can be freed and reused
Legend/Glossary
Symbol       Bandwidth   SDH      SONET
Fiber/link   2.5 Gbps    STM-16   OC-48
e            622 Mbps    STM-4    OC-12
a            155 Mbps    STM-1    OC-3
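The link-level effect can be sketched as a bin-packing exercise. The example below assumes a 2.5 Gbps link modeled as four 622 Mbps slots, each holding up to four 155 Mbps tributaries; the starting layout, the treatment of e and f as existing 622 Mbps circuits, and the first-fit-decreasing repacking rule are all assumptions made for illustration, not the method used by any particular defragmentation tool.

```python
SLOT_CAPACITY = 4  # one 622 Mbps slot carries four 155 Mbps units

# Circuit sizes in 155 Mbps units (a-d are 155 Mbps, e-f are 622 Mbps).
SIZES = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 4, "f": 4}

def free_units(slots):
    """Total unused capacity on the link, in 155 Mbps units."""
    return sum(SLOT_CAPACITY - sum(SIZES[c] for c in slot) for slot in slots)

def can_admit_full_slot(slots):
    """A new 622 Mbps demand needs one completely empty slot."""
    return any(len(slot) == 0 for slot in slots)

def repack(slots):
    """First-fit decreasing: place the largest circuits first, each into
    the first slot that still has room for it."""
    circuits = sorted((c for slot in slots for c in slot),
                      key=lambda c: (-SIZES[c], c))
    packed = [[] for _ in slots]
    for c in circuits:
        for slot in packed:
            if sum(SIZES[x] for x in slot) + SIZES[c] <= SLOT_CAPACITY:
                slot.append(c)
                break
        else:
            raise ValueError(f"circuit {c} does not fit on the link")
    return packed

# Hypothetical fragmented layout: 622 Mbps of capacity is free in total,
# but it is split across two half-used slots.
before = [["a", "b"], ["c", "d"], ["e"], ["f"]]
after = repack(before)
```

Here `free_units` is the same before and after, yet only the repacked layout leaves a whole slot empty for the new 622 Mbps demand. The sketch ignores the service hit that moving a live circuit would cause.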
SONET/SDH Ring Fragmentation
[Figure: time-slot occupancy by node for a 7-node STS-48 BLSR ring, under two alternate routings of the same demands. Under one routing, a new STS-3 demand from Node 2 to Node 4 is rejected; under the other, it is accepted.]
Is there an MPLS analogue?
MPLS Network Example
[Figure: network of routers A-F in which all links have the same bandwidth. L1, L2, L3: existing LSPs, each using half of a link's bandwidth.]
New Service Request: Set up LSP L4 between A and C, with bandwidth equal to a full link
Router A rejects request: No available route meeting b/w requirements
No available bandwidth?
Or, Is the bandwidth fragmented?
Avoiding Bandwidth Fragmentation
[Figure: the same network (routers A-F) with L1-L3 on alternate routes and L4 provisioned]
Alternate routing for L1-L3 enables L4 to be met
TE alone does not guarantee high efficiency
– L1-L3 were “optimally” routed in both cases, yet fragmentation occurs
Key: Can the demands be routed without adding new hardware?
Lower Fragmentation → Higher Utilization → Lower CapEx
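The point that equally "optimal" routings can differ in fragmentation can be reproduced on a toy network. The sketch below uses a hypothetical five-router topology (not the one drawn on the slide) in which every link carries 2 units and each existing LSP uses 1 unit, i.e. half a link; the demand endpoints and all names are assumptions for illustration.

```python
from collections import deque

CAPACITY = 2  # every link carries 2 units; each existing LSP uses 1 unit
LINKS = [("A", "E"), ("A", "B"), ("E", "C"), ("B", "C"), ("E", "F"), ("B", "F")]

def key(u, v):
    return frozenset((u, v))

def residual(routes):
    """Per-link free bandwidth for a dict of name -> (path, bandwidth)."""
    res = {key(u, v): CAPACITY for u, v in LINKS}
    for path, bw in routes.values():
        for u, v in zip(path, path[1:]):
            res[key(u, v)] -= bw
    return res

def find_path(res, src, dst, bw):
    """Breadth-first search over links with at least `bw` free."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for u, v in LINKS:
            if res[key(u, v)] < bw:
                continue
            if path[-1] == u:
                nxt = v
            elif path[-1] == v:
                nxt = u
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Same three demands (A->F, F->C, A->B), min-hop paths in both cases.
fragmented = {"L1": (["A", "E", "F"], 1), "L2": (["F", "E", "C"], 1), "L3": (["A", "B"], 1)}
tuned      = {"L1": (["A", "B", "F"], 1), "L2": (["F", "B", "C"], 1), "L3": (["A", "B"], 1)}

l4_fragmented = find_path(residual(fragmented), "A", "C", bw=2)
l4_tuned = find_path(residual(tuned), "A", "C", bw=2)
```

Both layouts leave the same total free capacity, but only the second leaves a contiguous path from A to C with a full link's bandwidth, so only there can L4 be set up without new hardware.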
Network Tuning
Fragmentation is a problem for MPLS networks too
In general, Network Management systems need to provide Network
Engineering tools to address fragmentation
– “Traffic Engineering puts traffic where the bandwidth is, Network
Engineering creates bandwidth where the traffic will be..”
– Relatively little focus on engineering tools
– Network engineering requires “global” knowledge, TE is a per-LSP
optimization
However, network engineering operations cannot be service-disruptive
Network Tuning: Hitless, Disruption-free Network Engineering
– Network tuning is NOT network planning
– For live, operational networks, not greenfield designs
Re-cap
Did the Net-Heads check with the Bell-Heads as to what NM
quagmire they were getting into?
MPLS Traffic Engineering:
• Helps avoid congestion prevalent in native IP networks
• Limited ability to mitigate circuit-switched inefficiencies
• On its own, cannot extract the most juice from the network
Critical need for Network Tuning tools
No reason why MPLS cannot become equally inefficient down the road..
Key Tradeoff:
Grow infrastructure to meet traffic demand [CapEx Hit], OR,
Tune network for improved efficiency [OpEx Hit]?
Network Tuning Scenarios
The router rejected a new LSP setup request due to insufficient
bandwidth. Can I engineer the current LSP routes to “free” the
necessary bandwidth for the new one?
I need to bring down a router for an OS upgrade. Can I:
a) Re-route the LSPs on the router to avoid bringing them down?
b) Upgrade the router OS and then revert them back to their original
routes?
The traffic on a node/set of links has exceeded the recommended
load threshold. Can I move traffic from the “hot zone” to minimize
damage in case of failure?
Can I have automated, scalable Network Tuning tools?
Bell Labs Möbius Tool:
MPLS Provisioning, Tuning System
Support For:
- Cisco 72*/75*/120*
- JNPR M*, T*
- ERX
Link color indicates load (Red: high, Green: acceptable load)
Möbius LSP View
Network Tuning: Fail-Setup Optimization
Optimization Done!
Impacted Circuits (Old Routes)
Impacted Circuits (Old + New Routes)
1: Re-route
2: Re-route
3: Provision
Network Tuning: Load Balance
(“Hot Zone” Clearing)
Clear traffic in this “Hot Zone” to below a specific threshold
[Figure: Network View (Before) and Network View (After)]
Network Engineering Requirements
Step-by-step Migration Sequence
– Operating on live traffic -- providing a design for an “optimized” layout
does not help
How do I get from current LSP layout to the new layout?
– Original LSP QoS constraints have to be maintained on new route
[Figure: the network of routers A-F at three successive steps: 1. Re-route L2; 2. Re-route L3; 3. Provision L4]
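Why the order of steps matters can be shown with a brute-force search over move orderings. This is only a sketch: it assumes uniform link capacity, a hypothetical topology and LSP set, and make-before-break semantics in which a moved LSP must fit on the new links while its old path is still reserved (links common to both paths are not counted twice). It is not the algorithm referred to in the talk, and trying every permutation does not scale.

```python
from itertools import permutations

def links_of(path):
    return {frozenset(hop) for hop in zip(path, path[1:])}

def feasible(order, capacity, current, target, bw):
    """Check that every move in `order` fits when it is executed."""
    used = {}
    for name, path in current.items():
        if path:
            for link in links_of(path):
                used[link] = used.get(link, 0) + bw[name]
    for name in order:
        old = links_of(current[name]) if current.get(name) else set()
        new = links_of(target[name])
        # The new path is signaled while the old one still holds bandwidth.
        if any(used.get(link, 0) + bw[name] > capacity for link in new - old):
            return False
        for link in new - old:
            used[link] = used.get(link, 0) + bw[name]
        for link in old - new:
            used[link] -= bw[name]
    return True

def find_sequence(capacity, current, target, bw):
    """Return the first feasible ordering of the required moves, or None."""
    moves = [n for n in target if target[n] != current.get(n)]
    for order in permutations(moves):
        if feasible(order, capacity, current, target, bw):
            return list(order)
    return None

# Hypothetical layouts: L4 is not yet provisioned in `current`.
bw = {"L1": 1, "L2": 1, "L3": 1, "L4": 2}
current = {"L1": ["A", "E", "F"], "L2": ["F", "E", "C"], "L3": ["A", "B"], "L4": None}
target = {"L1": ["A", "B", "F"], "L2": ["F", "B", "C"], "L3": ["A", "B"], "L4": ["A", "E", "C"]}

sequence = find_sequence(2, current, target, bw)
```

In this example L4 can only be provisioned after both re-routes have completed; any ordering that places it earlier fails the capacity check, even though the final layout is identical.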
Algorithmic Challenge
Migration is a very challenging theoretical problem (“NP-hard”)
Requires innovative algorithms for large networks
How to scale to a network with 10s of routers, 100s of links and 1000s of LSPs?
– Problem of scale -- exponential search space
At Bell Labs, we have patent-pending algorithms to provide migration
sequence in “real-time”
– Milliseconds to seconds for reasonable sized networks
[Chart: Hot Zone Load Balancing. x-axis: Network Load (%), 30 to 90; y-axis: % Hot Zone Cleared, 0 to 100; series: HZ LSP Rerouted, Non HZ LSP Rerouted]
• Hot Zone size = 10% Network
• Goal: Clear all LSPs in Hot Zone
• Chart shows contribution of LSPs outside the hot zone in meeting the
goal as network loads increase
Requirements II: Hitless, Disruption-free Engineering
“Hitless” is not zero packet loss
• In reality, everything is only near-hitless
Requirement: It should be perceived as hitless from the application’s
perspective
• SONET/SDH has a 50ms grace during protection switching
• Even with a 100 ms hit, more than 500 re-routes are possible before a
4-9s (99.99%) reliability SLA is broken.
MPLS provides infrastructure for hitless re-routing
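The re-route budget implied by the 100 ms figure above is simple arithmetic: the downtime a four-nines target allows over a window, divided by the hit per re-route. The talk does not state the SLA measurement window, so the windows below are assumptions; the function name is illustrative.

```python
def reroute_budget(nines, hit_ms, window_days):
    """Number of hits of `hit_ms` that fit in the downtime allowed by an
    availability target of `nines` nines over `window_days`.
    Integer arithmetic avoids floating-point rounding."""
    window_ms = window_days * 24 * 60 * 60 * 1000
    allowed_downtime_ms = window_ms // 10**nines
    return allowed_downtime_ms // hit_ms

weekly = reroute_budget(nines=4, hit_ms=100, window_days=7)     # 604
monthly = reroute_budget(nines=4, hit_ms=100, window_days=30)   # 2592
yearly = reroute_budget(nines=4, hit_ms=100, window_days=365)   # 31536
```

Under these assumptions the ">500" figure holds for any window of a week or longer, and only if re-routes are the sole source of downtime charged against the SLA.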
MPLS make-before-break
Mechanism to achieve hitless LSP re-routing
– Signal new route, switch traffic and delete old route
Signaling protocols use intelligence when reserving
bandwidth on new route
– E.g., Shared Explicit (SE) style flag in RSVP to avoid double
bandwidth reservation on common links
Possibility of packets going out-of-order
– If new route is significantly shorter than original route
– Even if it occurs, very short duration and bounded
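The effect of Shared Explicit style on bandwidth accounting can be sketched as follows. While both the old and the new path are signaled, a link common to both is charged once with sharing and twice without it. The paths and the helper name `peak_reservation` are hypothetical; the sketch models only the accounting, not RSVP signaling itself.

```python
def links_of(path):
    return {frozenset(hop) for hop in zip(path, path[1:])}

def peak_reservation(old_path, new_path, bw, shared_explicit):
    """Bandwidth held per link by one LSP during make-before-break,
    i.e. while both the old and the new path exist."""
    old, new = links_of(old_path), links_of(new_path)
    peak = {}
    for link in old | new:
        on_both = link in old and link in new
        # Without sharing, a common link is reserved once per path.
        peak[link] = 2 * bw if (on_both and not shared_explicit) else bw
    return peak

old_path = ["A", "B", "C", "D"]
new_path = ["A", "E", "C", "D"]   # shares link C-D with the old path
common = frozenset(("C", "D"))

with_se = peak_reservation(old_path, new_path, bw=100, shared_explicit=True)
without_se = peak_reservation(old_path, new_path, bw=100, shared_explicit=False)
```

On a heavily loaded common link, the doubled reservation in the non-shared case could cause the new path to be refused even though the LSP's actual traffic never increases.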
Inducing make-before-break
Typically, make-before-break is an internal function
– Used by router to re-route LSPs (e.g., if ‘re-optimize’ flag on)
For network engineering, make-before-break needs to be triggered
from the outside. Also:
– New path is given (as opposed to router calculating it)
– If the new path is bad, traffic should not switch and bring the LSP
down!
Routers need to provide a mechanism to trigger make-before-break
– Backdoors available - varies by vendor OS
– Insert the new path with a higher priority (lower path option) and force
a re-optimization
– Replace the Explicit Route Object (ERO) with a new route
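The path-option approach can be modeled in a few lines. This is a toy model, not any vendor's CLI or API: it assumes a tunnel holds numbered path options, that a lower number is preferred, and that re-optimization switches only when a preferred option can actually be signaled. The `Tunnel` class and the `can_signal` callback are hypothetical names.

```python
class Tunnel:
    """Toy model of a TE tunnel with numbered explicit path options."""

    def __init__(self, active_option, path_options):
        self.path_options = dict(path_options)  # option number -> explicit path
        self.active_option = active_option

    def reoptimize(self, can_signal):
        """Try options preferred over the active one, lowest number first.
        Switch only if one signals successfully; otherwise stay put."""
        for number in sorted(self.path_options):
            if number >= self.active_option:
                break
            if can_signal(self.path_options[number]):
                self.active_option = number
                return True
        return False

# Externally supplied new path inserted as a lower-numbered option.
good = Tunnel(active_option=10, path_options={10: ["A", "E", "C"]})
good.path_options[5] = ["A", "B", "C"]
switched = good.reoptimize(can_signal=lambda path: True)

# If the new path cannot be set up, traffic stays on the original route.
bad = Tunnel(active_option=10, path_options={10: ["A", "E", "C"], 5: ["A", "X", "C"]})
stayed = not bad.reoptimize(can_signal=lambda path: False)
```

The second case reflects the requirement stated above: a bad new path must leave the LSP up on its existing route rather than bring it down.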
Requirements III: Preserve Network Stability
Should not bring down customer traffic
• Key: Traffic should switch only if new path is up!
LSP re-routes do not change the IP topology
• Only the path is changed, not the connectivity
Will cause OSPF updates
• Bandwidth on links will change
Should be attempted during “lean” traffic periods
CapEx Savings from Tuning
Simulation Model
40 Node, 100 Link network (to start)
LSP traffic randomly generated and routed on shortest available path
On failure to set up LSP due to lack of bandwidth:
Case I: No Engineering
– A new link added between source and destination
Case II: With Engineering
– Attempt to re-route LSPs to create “free” space for new one
– On failure, new link added between source and destination
[Chart: Network Growth with Increased Traffic. x-axis: Traffic Volume (MB), 100000 to 1100000; y-axis: #OC192 Ports, 150 to 450; series: Normal Operation (No Engineering), Mobius (With Engineering). Annotation: More traffic for same infrastructure]
CapEx Savings - II
Alternative Simulation Model
10 Node, 15 Link network (to start)
LSP traffic randomly generated and routed on shortest available path
On failure to set up LSP due to lack of bandwidth:
Case I: No Engineering
– A new link added on the hop that is out of capacity
Case II: With Engineering
– Attempt to re-route LSPs to create “free” space for new one
– On failure, new link added on the hop that is out of capacity
[Chart: x-axis: Traffic Volume (Mb), 100000 to 500000; y-axis: # OC-192 Ports, 20 to 140; series: No Engineering, With Engineering]
Network Management Systems
Traditional IP networking view is that router has all the smarts
Engineering “intelligence” requires network-wide view
– Has to reside in a single “entity”
– Too complicated to co-ordinate engineering operations across different
routers in a distributed fashion
Good choice: MPLS Network Management System (NMS)
NMS needs to provide support for:
– Algorithmic and graphical tools for what-if scenarios
– Seamless, point-n-click support to execute optimizations
– Support for proactive engineering
• No longer limited to “reactive” operational mindset
Requisite NMS tools can lower operations overhead from hours/days
to minutes!
Lucent’s Navis Provisioning Manager
Component-based, Multi-vendor Layer 2/3 NMS
[Diagram: Navis Provisioning Manager architecture. Labeled components: Order Gateway, Work Manager, Inventory Gateway; Service Modules (ATM, VPN, Routing & MPLS TE, ATMoMPLS, xDSL, Frame Relay, MPLS, Ethernet, IPSEC, Möbius); L2&3 Network Adaptors; Activation Flow; Configuration Flow; Core IP/MPLS EMS / NE; FR/ATM EMS / NE; Access EMS / NE]
Fast Network Adaptors Creation
Large Multi-vendor Testing Labs
Committed Corporate Partnership with Hardware Vendors
Component-Based, Multi-Vendor Activation
Rich set of Services over L2 & L3
Software Development Kits for Network Element Support
Conclusions
MPLS gaining momentum in service provider networks
Industry focus on Traffic Engineering
– Does not suffice to get the most from the network
– Need to also consider network engineering and hitless tuning
MPLS provides the necessary infrastructure for network tuning
– Necessary requirement to avoid inefficiencies in circuit-switched
environments
Effective network tuning improves network utilization, lowers CapEx!
Questions?