Dynamics of Hot-Potato Routing in IP Networks
Jennifer Rexford
AT&T Labs—Research
http://www.research.att.com/~jrex
Joint work with Renata Teixeira (UCSD), Aman Shaikh (AT&T), and Timothy Griffin (Intel)
Outline
• Internet routing
  - Interdomain and intradomain routing
  - Coupling due to hot-potato routing
• Measuring hot-potato routing
  - Measuring the two routing protocols
  - Correlating the two data streams
• Performance evaluation
  - Characterization on AT&T's network
• Implications on operational practices
• Conclusions and future directions
Autonomous Systems
[Figure: a client reaches a web server across seven autonomous systems; some AS pairs are connected by multiple links, and most ASes sit in the middle of the path. AS path: 6, 5, 4, 3, 2, 1]
Interdomain Routing (BGP)
• Border Gateway Protocol (BGP)
  - IP prefix: block of destination IP addresses
  - AS path: sequence of ASes along the path
• Policy configuration by the operator
  - Path selection: which of the paths to use?
  - Path export: which neighbors to tell?
[Figure: AS 1, containing 12.34.158.5, announces "I can reach 12.34.158.0/24" to AS 2; AS 2 announces "I can reach 12.34.158.0/24 via AS 1" to ASes 3 and 4]
Intradomain Routing (IGP)
• Interior Gateway Protocol (e.g., OSPF and IS-IS)
  - Shortest-path routing based on link weights
  - Routers flood link-state information to each other
  - Routers compute the "next hop" to reach other routers
• Link weights configured by the operator
  - Simple heuristics: link capacity or physical distance
  - Traffic engineering: tuning link weights to the traffic
[Figure: a weighted topology; the highlighted shortest path has cost 2+1+5]
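The shortest-path computation described above can be sketched with Dijkstra's algorithm over the configured link weights. The topology and node names below are made up for illustration, but the winning path reproduces the 2+1+5 = 8 cost shown in the figure:

```python
import heapq

def shortest_path_costs(graph, source):
    """Dijkstra over an adjacency dict {node: {neighbor: weight}}.
    Returns the IGP path cost from source to every reachable router."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical topology: the A-B-C-D path (2+1+5 = 8) beats the
# direct A-D link of weight 10.
graph = {
    "A": {"B": 2, "D": 10},
    "B": {"A": 2, "C": 1},
    "C": {"B": 1, "D": 5},
    "D": {"A": 10, "C": 5},
}
print(shortest_path_costs(graph, "A")["D"])  # 8
```

Each router runs this computation independently over the flooded link-state database, so all routers agree on path costs as long as their databases are consistent.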
Two-Level Internet Routing
• Autonomous system (AS) = network with a unified administrative routing policy (e.g., AT&T, Sprint, UCSD)
• Hierarchical routing
  - Intra-domain: metric based
  - Inter-domain: reachability and policy
• Design principles
  - Scalability
  - Isolation
  - Simplicity of reasoning
[Figure: ASes 1, 2, and 3 run intra-domain routing (IGP) internally and inter-domain routing (BGP) between each other]
Coupling: Hot-Potato Routing
Hot-potato routing = the ISP's policy of routing to the closest exit point when there is more than one route to the destination.
[Figure: router Z in an ISP network routes packets for thousands of destinations through exit Y; an intradomain event (failure, planned maintenance, or traffic engineering) changes the IGP costs, and Z switches its exit point toward dst from Y to X]
Consequences:
• Router CPU overload
• Transient forwarding instability
• Traffic shifts
• Inter-domain routing changes
BGP Decision Process
• Ignore route if its exit point is unreachable
• Highest local preference
• Lowest AS path length
• Lowest origin type
• Lowest MED (with same next-hop AS)
  (routes that tie through the steps above are "equally good")
• Lowest IGP cost to the next hop
• Lowest router ID of the BGP speaker
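The decision steps above can be sketched as an ordered tie-break. This is a simplified illustration, not router code: attribute names are made up, and the MED comparison is flattened (real routers compare MEDs only among routes with the same next-hop AS):

```python
def bgp_best_route(routes):
    """Pick the best route following the decision steps in order.
    Route attributes here are illustrative; MED handling is simplified."""
    reachable = [r for r in routes if r["exit_reachable"]]
    return min(
        reachable,
        key=lambda r: (
            -r["local_pref"],    # highest local preference wins
            len(r["as_path"]),   # then lowest AS path length
            r["origin"],         # then lowest origin type
            r["med"],            # then lowest MED (simplified)
            r["igp_cost"],       # then lowest IGP cost: the hot-potato step
            r["router_id"],      # final deterministic tie-break
        ),
    )

# Two routes that are "equally good" through the earlier steps:
# the IGP cost to the next hop decides, which is hot-potato routing.
via_x = dict(exit_reachable=True, local_pref=100, as_path=[2, 1],
             origin=0, med=0, igp_cost=10, router_id=1)
via_y = dict(exit_reachable=True, local_pref=100, as_path=[2, 1],
             origin=0, med=0, igp_cost=9, router_id=2)
print(bgp_best_route([via_x, via_y]) is via_y)  # True
```

Because the IGP-cost step sits so late in the order, any intradomain event that reorders costs among equally good routes silently changes the BGP decision.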
Hot-Potato Routing Model
[Figure: router Z with IGP costs to routers X, W, and Y; X and Y both have a route to dst]
• Cost vector for Z: cX = 10, cW = 8, and cY = 9
• Egress set for dst: {X, Y}
• Best route for Z: through Y, the closest egress in the set
Hot-potato change: a change in the cost vector causes a change in the best route.
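The model above is small enough to write down directly; the names and costs below mirror the slide:

```python
def best_egress(cost_vector, egress_set):
    """Hot-potato choice: the lowest-IGP-cost router among those
    that actually have a BGP route for the destination."""
    return min(egress_set, key=lambda egress: cost_vector[egress])

cost = {"X": 10, "W": 8, "Y": 9}   # Z's cost vector from the slide
egresses = {"X", "Y"}              # W is closest overall but has no route to dst
print(best_egress(cost, egresses))  # Y

# A hot-potato change: an intradomain event raises Y's cost past X's,
# and Z's best route flips to X with no BGP-level cause at all.
cost["Y"] = 11
print(best_egress(cost, egresses))  # X
```

Note that W never matters despite being the cheapest router: only the egress set for the destination participates in the tie-break.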
The Big Picture
[Figure: cost-vector changes and interdomain changes both produce BGP routing changes; their intersection is the hot-potato routing changes]
Hot-potato routing changes = interdomain changes caused by intradomain changes.
Why Care about Hot Potatoes?
• Understanding of Internet routing
  - Frequency of hot-potato routing changes
  - Influence on end-to-end performance
• Operational practices
  - Knowing when hot-potato changes happen
  - Avoiding unnecessary hot-potato changes
  - Analyzing externally-caused BGP updates
• Distributed root-cause analysis
  - Each AS can tell which BGP updates it caused
  - Someone should know why each change happened
Our Approach
• Measure both protocols
  - BGP and OSPF monitors
[Figure: monitor M in the AT&T backbone collects OSPF messages and BGP updates from routers X, Y, and Z]
• Correlate the two streams
  - Match BGP updates with OSPF events
• Analyze the interaction
Heuristic for Matching
• Transform the stream of OSPF messages into routing changes
  (refreshes are discarded; a link failure becomes a delete; a weight change becomes a cost change)
• Match BGP updates with OSPF events that happen close in time
• Classify BGP updates by their possible OSPF causes
Computing Cost Vectors
• Transform OSPF messages into path-cost changes from a router's perspective
[Figure: from monitor M's viewpoint, a weight-change LSA yields "CHG Y, 7", a delete LSA yields "DEL X", and a later weight-change LSA yields "ADD X, 5"]
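One way to sketch this transformation (an illustrative helper, not the paper's code): recompute the router's cost vector after each LSA, then diff consecutive vectors into ADD/DEL/CHG records like those on the slide:

```python
def diff_cost_vectors(old, new):
    """Diff two {router: path-cost} vectors into routing-change records."""
    changes = []
    for node in old.keys() - new.keys():
        changes.append(("DEL", node, None))        # router became unreachable
    for node in new.keys() - old.keys():
        changes.append(("ADD", node, new[node]))   # router became reachable
    for node in old.keys() & new.keys():
        if old[node] != new[node]:
            changes.append(("CHG", node, new[node]))  # path cost changed
    return sorted(changes)

# M's view: one LSA withdraws X, another changes the path cost to Y.
print(diff_cost_vectors({"X": 11, "Y": 4}, {"Y": 7}))
# [('CHG', 'Y', 7), ('DEL', 'X', None)]
```

Feeding these records into the hot-potato model then tells you which egress choices could have flipped.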
Classifying BGP Updates
• Cannot have been caused by a cost change
  - Destination just became (un)available in BGP
  - New BGP route through the same egress point
  - New route better/worse than the old one (e.g., AS path length)
• Can have been caused by a cost change
  - New route is equally good as the old route (perhaps X got closer, or Y got farther away)
[Figure: router M with candidate egress points X and Y toward dst]
The Role of Time
• IGP link-state advertisements
  - Multiple LSAs from a single physical event
  - Group LSAs close in time (10-second window) into a single cost-vector change
• BGP update messages
  - Multiple BGP updates during convergence
  - Group updates close in time (70-second window) into a single BGP routing change
• Matching IGP to BGP
  - Match related changes that are close in time (180-second window)
  - Avoid matching unrelated IGP and BGP changes
  - Characterize the measurement data to determine the right windows
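The grouping step above can be sketched as gap-based clustering of event timestamps; the 10-second window in the example is the LSA-grouping window, and the same function works for the 70-second BGP window:

```python
def group_events(timestamps, window):
    """Cluster event times: start a new group whenever the gap from
    the previous event exceeds the window (in seconds)."""
    groups = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][-1] <= window:
            groups[-1].append(t)   # close in time: same physical event
        else:
            groups.append([t])     # gap too large: new event
    return groups

# Five LSAs collapse into two cost-vector changes with a 10 s window:
print(group_events([0, 3, 8, 40, 45], 10))  # [[0, 3, 8], [40, 45]]
```

Matching then pairs each grouped IGP change with grouped BGP changes that start within the 180-second window after it.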
Summary Results (June 2003)
• High variability in the % of BGP updates caused by hot-potato changes

  location         min   max      days > 10%
  close to peers   0%    3.76%    0
  between peers    0%    25.87%   5

• One LSA can have a big impact

  location         no impact   prefixes impacted
  close to peers   97.53%      less than 1%
  between peers    97.17%      55%
Delay for BGP Routing Change
• Router vendor scan timer
  - BGP table scan every 60 seconds
  - OSPF changes arrive uniformly within the interval
• Internal BGP hierarchy (route reflectors)
  - Wait for another router to change its best route
  - Introduces a wait for a second BGP scan
• Transmitting many BGP messages
  - Latency for transferring the data
We have seen delays of up to 180 seconds!
Delay for First BGP Change
[Plot: fraction of hot-potato changes vs. (time of BGP update − time of LSA) in seconds, for routers in the backbone (June and September) and a router in the lab; annotations mark the Cisco BGP scan timer and the two-BGP-scan case]
Transferring Multiple Prefixes
[Plot: cumulative number of hot-potato changes vs. (time of BGP update − time of LSA) in seconds; one example shows an 81-second delay]
BGP Updates Over Prefixes
[Plot: cumulative % of BGP updates vs. % of prefixes, for all updates, OSPF-triggered updates, and non-OSPF-triggered updates. OSPF-triggered BGP updates affect ~50% of prefixes roughly uniformly; the remaining prefixes have only one exit point. Contrast with non-OSPF-triggered BGP updates, which concentrate on few prefixes.]
Operational Implications
• Forwarding-plane convergence
  - Accuracy of active measurements
• Router proximity to exit points
  - Likelihood of hot-potato routing changes
• Costing links in/out during maintenance
  - Avoid triggering BGP routing changes
Forwarding Convergence
R1's scan process can take up to 60 seconds to run. If R2 starts using a new path to reach dst before R1's scan runs, packets to dst may be caught in a forwarding loop for up to 60 seconds!
[Figure: routers R1 and R2 with link weights on the paths toward dst; after a cost change, R2 switches first and the R1-R2 link carries looping packets until R1's scan runs]
Measurement Accuracy
• Measurements of customer experience
  - Probe machines have just one exit point!
[Figure: probe machines W1 and W2 attached to routers R1 and R2; during convergence the probes see a loop on the path to dst]
Avoid Equal-Distance Exits
[Figure, left: router Z with near-equal IGP costs to exits X and Y toward dst, so small changes will make Z switch exit points. Right: Z with very unequal costs (e.g., 10 vs. 1000), which is more robust to intra-domain routing changes.]
Careful Cost In/Out of Links
[Figure: router Z with exits X and Y toward dst; a link's weight is raised in stages (e.g., 5, then 10, then 100) before maintenance, rather than all at once]
Benefits:
• Traffic is more predictable
• Faster convergence
• Less impact on neighbors
Ongoing Work
• Black-box testing of the routers
  - Scan timer and its effects (forwarding loops)
  - Vendor interactions (with Cisco)
• Impact of IGP-triggered BGP updates
  - Changes in the flow of traffic
  - Forwarding loops during convergence
  - Externally visible BGP routing changes
• Improving isolation (cooling those potatoes!)
  - Operational practices: preventing interactions
  - Protocol design: weakening the IGP/BGP coupling
  - Network design: internal topology/architecture
Thanks!
• Any questions?
iBGP Route Reflectors
[Figure: route reflector Z learns routes to dst via egresses X, Y, and W with differing total costs, but only propagates its single best route; an announcement arriving at X can trigger a cascade of BGP updates from Z]
Scalability trade-off: less BGP state vs. more BGP updates from Z and longer convergence delay.
Exporting Routing Instability
[Figure, top: an announcement causes Z to switch between egresses X and Y toward dst, exporting a BGP update. Bottom: when Z's best route is unaffected, nothing is exported: no change => no announcement.]