Avi Freedman Presentation

Optimal External Route Selection: Tips and Techniques for ISPs
Avi Freedman
Net Access
Overview
• Brief review of BGP routing concepts
• Safe routing
• Determining policy
• Using MEDs
• Setting MEDs on internal routes
• as-path padding to tune external traffic
• Using local-prefs to tune external traffic
• Setting MEDs to tune external traffic
BGP Concept Review
BGP Intro
• BGP4 is the protocol used on the Internet to
exchange routing information between
providers, and to propagate external routing
information through networks.
• Each autonomous network is called an
Autonomous System.
• ASs which inject routing information on
their own behalf have ASNs.
BGP Peering
• BGP-speaking routers peer with each other
over TCP sessions, and exchange routes
through the peering sessions.
• Providers typically try to peer at multiple
places. Either by peering with the same AS
multiple times, or because some ASs are
multi-homed, a typical network will have
many candidate paths to a given prefix.
The BGP Route
• The BGP route is, conceptually, a “promise”
to carry data to a section of IP space. The
route is a “bag” of attributes.
• The section of IP space is called the
“prefix” attribute of the route.
• As a BGP route travels from AS to AS, the
ASN of each AS is stamped on it when it
leaves that AS. This list of ASNs is called the
AS_PATH attribute, or “as-path” in Cisco-speak.
BGP Route Attributes
• In addition to the prefix, the as-path, and the
next-hop, the BGP route has other
attributes, affectionately known as
“knobs and twiddles”:
– weight, rarely used - “sledgehammer”
– local-pref, sometimes used - “hammer”
– origin code, rarely used
– MED (“metric”) - a gentle nudge
BGP Policy
• BGP was designed to allow ASs to express
a routing policy. This is done by filtering
certain routes, based on prefix, as-path, or
other attributes - or by adjusting some of the
attributes to influence the best-route
selection process.
BGP Best-Route Selection
• With all of the paths that a router may
accumulate to a given prefix, how does the
BGP router choose which is the “best”
path?
• Through an RFC-specified (mostly) route
selection algorithm.
BGP Best-Route Selection
• Do not consider IBGP path if not synchronized
• Do not consider path if no route to next hop
• Highest weight (local to router)
• Highest local preference (global within AS)
• Shortest AS path
• Lowest origin code (IGP < EGP < incomplete)
• Lowest MED
• Prefer EBGP path over IBGP path
• Path with shortest next-hop metric wins
• Lowest router-id
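The selection order above can be sketched in Python as a sort key. This is a simplified model, not router code: the `Path` class and its field names are hypothetical, and MEDs are compared across all paths here, which real routers only do for paths from the same neighboring AS unless told otherwise.

```python
from dataclasses import dataclass

# Lower rank wins: IGP < EGP < incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Path:
    """Hypothetical container for the attributes the selection steps look at."""
    name: str
    as_path: tuple
    weight: int = 0
    local_pref: int = 100
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    next_hop_metric: int = 0
    router_id: int = 0
    next_hop_reachable: bool = True
    synchronized: bool = True


def sort_key(p):
    # One tuple element per tie-breaking step, in the order listed above.
    return (
        -p.weight,              # highest weight
        -p.local_pref,          # highest local preference
        len(p.as_path),         # shortest AS path
        ORIGIN_RANK[p.origin],  # lowest origin code
        p.med,                  # lowest MED (simplified: compared across all paths)
        0 if p.ebgp else 1,     # EBGP over IBGP
        p.next_hop_metric,      # shortest next-hop metric
        p.router_id,            # lowest router-id
    )


def best_path(paths):
    # The first two steps are eligibility checks, not comparisons.
    usable = [
        p for p in paths
        if p.next_hop_reachable and (p.ebgp or p.synchronized)
    ]
    return min(usable, key=sort_key) if usable else None
```

Because the key is a tuple, a later attribute only matters when every earlier one ties, which is why local-pref beats as-path length and as-path length beats MED.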
BGP Selection, Summary
• So, local-pref is stronger than as-path is
stronger than MED.
• Setting local-pref without careful planning
can cause strange things (preferring other
paths to get to your own customers)…
Safe Routing
Safe Routing
• BGP routes are “promises” to carry traffic
to a certain destination. Still, not every
provider makes good promises {at all
times}.
• So, it is best to sanity-filter all eBGP
sessions.
Safe Routing
• Method 1:
– The Cisco “maximum-prefix” keyword
• neighbor <remote-ip> maximum-prefix <max-prefixes> [percent] [warning-only]
– Sets a maximum number of prefixes allowed
for a peer.
– Behavior 1 (default) - Shut down the session and log the
fact.
– Behavior 2 (warning-only) - Leave the session up; just log the
warning.
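The two behaviors can be modeled in a few lines of Python. This is a sketch of the decision only: the function name and return strings are hypothetical, and the 75 percent default warning threshold is an assumption based on the usual Cisco IOS default.

```python
def check_prefix_limit(received, maximum, threshold_pct=75, warning_only=False):
    """Return 'ok', 'warn', or 'shutdown' for a peer's current prefix count."""
    if received > maximum:
        # Behavior 2 keeps the session up; Behavior 1 tears it down.
        return "warn" if warning_only else "shutdown"
    if received * 100 > maximum * threshold_pct:
        # Approaching the limit: log a warning but keep the session.
        return "warn"
    return "ok"
```

For a peer that normally sends a few hundred routes, a limit of around twice that leaves headroom for growth while still catching a full-table leak.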
Safe Routing - Filtering
• Another method of sanity filtering is to
restrict your peers based on routes or as-paths.
• Usually, it is hard to filter based on routes
(except for our friends, the fanatics at ANS).
• So, from smaller providers it is a good idea
to prevent random route redistribution.
Safe Routing - Filtering
ip as-path access-list 40 deny _701_
ip as-path access-list 40 deny _1239_
ip as-path access-list 40 deny _3561_
ip as-path access-list 40 deny _1_
ip as-path access-list 40 deny _1673_
ip as-path access-list 40 deny _174_
ip as-path access-list 40 permit .*
• Apply this access-list inbound for sanity.
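To see which paths access-list 40 would drop, the filter can be approximated in Python. This is a sketch: the function name is hypothetical, and the Cisco `_` metacharacter is modeled as "start, end, space, comma, brace, or parenthesis".

```python
import re

# The ASNs denied by access-list 40 above.
DENIED_ASNS = (701, 1239, 3561, 1, 1673, 174)

# Approximation of the Cisco "_" delimiter.
_DELIM = r"(?:^|$|[ ,{}()])"

DENY_PATTERNS = [re.compile(_DELIM + str(asn) + _DELIM) for asn in DENIED_ASNS]


def permitted(as_path: str) -> bool:
    """True if the as-path (space-separated ASNs) passes the filter."""
    # Any deny match rejects the route; otherwise the final "permit .*" applies.
    return not any(p.search(as_path) for p in DENY_PATTERNS)
```

A small provider announcing `4969 13000` passes, while `4969 701 13000` is rejected because it claims to reach UUNET-transited space through that provider. The delimiters matter: `17010` does not match `_701_`.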
“I am Blackholio”
• In sufficiently strange circumstances, this
won’t help.
• If someone (AS 7007, perhaps) strips the
as-path information, as-path filters do no
good.
Determining Policy
Determining Policy
• What do you want to do?
• The tricky part.
• Configuring is easy…
• Do you want to prefer higher-quality
connections?
• Optimize for cost of the links?
Connection Quality
• We will assume that you want to optimize
for connection quality.
• This generally means, in the Platonic
zero-packet-loss Internet, minimizing latency and
avoiding small pipes.
• We’ll come back to small pipes and backup
paths when we talk about local-prefs.
• We’ll talk about minimizing latency when
we explore MEDs.
Connection Quality
• At all times, we must minimize packet loss.
• In general, this means avoiding public
exchanges in favor of private peering and/or
transit.
• Sometimes this might not be economically
desirable, but if you don’t tune this way,
stay vigilant about inter-connection quality.
• Best to measure it if you really care...
Measuring Packet Loss with MRTG
[MRTG graph; legend values:]
Max Max: 423.0 ms (352.5%)   Average Max: 32.0 ms (26.7%)   Current Max: 37.0 ms (30.8%)
Max Min: 9.0 ms (7.5%)   Average Min: 5.0 ms (4.2%)   Current Min: 6.0 ms (5.0%)
Peering Points
• You want to prefer paths that you hear over
uncongested pipes.
• Assuming you have non-full private
interconnects, PIs will be better than public
exchanges.
• Of course, that can depend on which
Gigaswitch you’re on; whether you’re at
PSK, PACBell, AADS, or the MAEs.
Hot-Potato
• In general, traffic is handed off as soon as
possible to external providers to minimize
backbone utilization and costs.
• This is not always the best plan if you want
to maximize connection quality (assuming
your inter-LATA and/or cross-country links
are not full).
• Solution - Listen to and use MEDs.
Asymmetry
• For this presentation, we are going to ignore
the return path - data coming back into your
network.
• Still, for best tuning you will want to
explore this and use as-path padding and
possibly controlled de-aggregation (to
willing partners)...
Review: Policy
• Somehow, you want to prefer better-quality
links.
• In the examples that follow, we’ll assume a
small but national network, peering at
MAE-West, MAE-East, and Pennsauken.
• Additionally, private interconnects with
IDT, PSI, Digex, above.net, and Exodus.
• Transit through above.net and UUNET.
Goals
• Our goals will be to prefer, in this order:
– Private interconnects
– Regionality of traffic
– Pennsauken over MAE-East
– Public Exchanges
– Transit pipes, above.net first
Using MEDs
Introduction to MEDs
• The MULTI_EXIT_DISCRIMINATOR, or
MED, is a BGP attribute used to:
– Describe internal network topology.
– Pass on this topology to external peers.
• A smaller knob than others, like local-pref
or as-path padding.
• Major problem - no inter-provider
consistency on MED semantics.
• Internally, also called “metrics”.
Setting MEDs for Internal Routes
Setting MEDs
• Use an internally consistent scheme.
• Usually, people’s MEDs are in the low
hundreds or less.
• Suggestion - use average delay in ms
between POPs.
• Set MEDs in one direction only.
• To be advanced, MEDs can be set on a
per-router basis in a POP, but usually are not.
Network Diagram
[Diagram: the four POPs - SF, CHI, PHL, and DC]
Setting MEDs
• For SF, CHI, PHL, DC:
SF-DC      +60
SF-CHI     +40
CHI-PHL    +30
CHI-DC     +25
PHL-DC     +10
PHL-PSK    +0
DC-MAE-E   +5
SF-MAE-W   +5
Network Diagram w/ MEDs
[Diagram: SF-CHI 40, SF-DC 60, CHI-PHL 30, CHI-DC 25, PHL-DC 10]
Route Maps in DC
route-map from-sf permit 10
 set metric +60
route-map from-chi permit 10
 set metric +40
route-map from-phl permit 10
 set metric +10
neighbor <sf-ip> route-map from-sf in
etc...
What this Does
• A route originating in PHL will have:
– metric 60 or 70 in SF (unless there are
multiple link failures)
– metric 10 or 60 in SF
– metric 10 or 35 in DC
• etc…
• Thus, a provider honoring MEDs (not doing
hot-potato) will hand off packets destined to
that route at PSK, into PHL.
Slight Improvement?
• Or, change things to weight PSK vs. DC
over PHL vs. DC.
PSK +0
MAE-E +20
• Thus, a provider honoring MEDs will send
a PHL-destined packet to PSK. This is
generally a good thing.
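A small Python sketch of what a MED-honoring peer sees for a PHL route. It assumes the per-link values from the table are simply summed along the path to each exchange, which the slides imply but do not spell out; the function names are hypothetical.

```python
# Per-link additions from the table (only the links used below).
LINK_MED = {("PHL", "PSK"): 0, ("PHL", "DC"): 10, ("DC", "MAE-E"): 5}


def med_at(path, link_med=LINK_MED):
    """Sum the per-link additions along a path of POP/exchange names."""
    return sum(link_med[(a, b)] for a, b in zip(path, path[1:]))


def chosen_exchange(announcements):
    """A peer honoring MEDs hands off where the announced MED is lowest."""
    return min(announcements, key=announcements.get)


# MED the peer hears for a PHL-originated route at each exchange.
announced = {
    "PSK": med_at(("PHL", "PSK")),
    "MAE-E": med_at(("PHL", "DC", "MAE-E")),
}

# Same route with the "slight improvement": MAE-E weighted +20 instead of +5.
tweaked_links = dict(LINK_MED)
tweaked_links[("DC", "MAE-E")] = 20
announced_tweaked = {
    "PSK": med_at(("PHL", "PSK"), tweaked_links),
    "MAE-E": med_at(("PHL", "DC", "MAE-E"), tweaked_links),
}
```

Under these assumptions the peer picks PSK in both cases; raising the MAE-E addition only widens the gap (0 vs. 15 becomes 0 vs. 30).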
Using as-path Padding
as-path padding
• Some think that modifying as-paths is a
nasty business.
• It is a good beginning way to do
preferences.
• If providers have already padded to
de-prefer, preserves that “de-preference”.
• Simple to do.
as-path padding
• First, policy?
– Private interconnects - pad no times
– Regionality of traffic - pad four times x-country
– Pennsauken over MAE-East - pad once; twice
– Public Exchanges - twice at MAE-West
– Transit pipes, above.net first - pad three
• Problem - can’t pad easily going cross-country.
• But we can do the rest.
– Problem - lots of route-maps and typing.
• Why? Can’t prepend our own AS inside network,
so must have separate route-map per session.
route-maps
• On everyone, at above.net:
route-map prepend-6461 permit 10
 set as-path prepend 6461 6461 6461
• On everyone, at UUNET:
route-map prepend-701 permit 10
 set as-path prepend 701 701 701
• On PSI, at MAE-East and MAE-West:
route-map prepend-174-twice permit 10
 set as-path prepend 174 174
• On PSI, at Pennsauken:
route-map prepend-174-once permit 10
 set as-path prepend 174
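The effect of inbound padding on path selection can be sketched in Python. The `prepend` helper is hypothetical, and 64512 is a private-range ASN standing in for some PSI customer.

```python
def prepend(as_path, asn, times):
    """Return the as-path with `asn` stamped on the front `times` more times."""
    return (asn,) * times + tuple(as_path)


# The same PSI (174) route, heard at two exchanges.
heard = (174, 64512)

# Policy from the slides: pad once at Pennsauken, twice at the MAEs.
via_psk = prepend(heard, 174, 1)
via_mae = prepend(heard, 174, 2)

# With equal local-pref, the shorter as-path wins.
winner = min({"PSK": via_psk, "MAE": via_mae}.items(), key=lambda kv: len(kv[1]))[0]
```

Since padding only adds to whatever length the route arrived with, a route the remote side already padded stays longer than its unpadded siblings after our padding is applied.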
Using local-prefs
Local-prefs
• Most common method of preferring
external routes.
• Local-pref is a number, by default 100, put
on routes and passed to all routers within a
network.
• Never passed to an eBGP peer.
Implementing Policy
– Customers - local-pref 200
– Private interconnects - local-pref 150
– Pennsauken over MAE-East - 120 for Pennsauken
– Public Exchanges - 100 at MAE-East and MAE-West
– Transit pipes, above.net first - 80 from transit pipes
– Regionality of traffic - defer to MEDs for equal
local-pref. May want to add a PACBell connection and make it 120.
route-maps
• At Pennsauken:
route-map psk permit 10
 set local-preference 120
 set community 4969:800
neighbor external-peer-psk route-map psk in
or
neighbor <remoteip> route-map psk in
Problem: Prefers Bad Paths
• The problem with this approach:
• Take AS 14000, who has a T1 to Sprintlink
and a backup-backup-backup 56k to another
local provider, say, 13000.
• Announces as:
– 1239 14000 and
– 701 13000 14000 14000 14000 14000 14000
• Local-prefs can screw with this.
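The AS 14000 case can be worked through in Python. This sketch compares only local-pref and as-path length, in that order; the dictionary layout and the choice of 150 for the session carrying the 701 path are hypothetical.

```python
def rank(route):
    # Higher local-pref first, then shorter as-path.
    return (-route["local_pref"], len(route["as_path"]))


def best(routes):
    return min(routes, key=rank)


primary = {"via": 1239, "as_path": (1239, 14000), "local_pref": 100}
backup = {
    "via": 701,
    "as_path": (701, 13000, 14000, 14000, 14000, 14000, 14000),
    "local_pref": 100,
}

# Equal local-pref: the padding works and the T1 path wins.
equal_choice = best([primary, backup])["via"]

# Suppose the 701 path arrives on a session we have marked local-pref 150.
preferred_backup = dict(backup, local_pref=150)
skewed_choice = best([primary, preferred_backup])["via"]
```

In the second case the 56k backup wins despite five extra hops of padding, because as-path length is never consulted once local-pref differs.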
Listening to MEDs
Listening to MEDs: Same Peer
• Nothing special is required to listen to
MEDs.
• Because MEDs mean different things to
different networks, one approach is to only
set MEDs inbound for your own routes.
• When listening to MEDs at multiple
locations from a peer, set to internal MEDs
if you want to hot-potato.
route-map on DC, v2
route-map from-sf permit 10
match community 1
set metric +60
MEDs from Diff. eBGP Peers
• “bgp always-compare-med” keyword
allows Ciscos to use MEDs among different
providers.
• Otherwise, will use them to compare iBGP
routes, or eBGP routes from the same AS.
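The rule can be sketched as a Python predicate. This is a simplified model with hypothetical names: each route carries the AS it was learned from, and MEDs are only compared when that AS matches or the always-compare switch is on.

```python
def med_comparable(a, b, always_compare_med=False):
    """MEDs are only meaningful between routes from the same neighboring AS,
    unless the operator has opted in to comparing them across providers."""
    return always_compare_med or a["neighbor_as"] == b["neighbor_as"]


def prefer_by_med(a, b, always_compare_med=False):
    """Return the route preferred on MED, or None if MED does not decide."""
    if not med_comparable(a, b, always_compare_med):
        return None
    if a["med"] == b["med"]:
        return None
    return a if a["med"] < b["med"] else b


psi_psk = {"name": "PSI at PSK", "neighbor_as": 174, "med": 10}
psi_mae = {"name": "PSI at MAE-E", "neighbor_as": 174, "med": 30}
uunet = {"name": "UUNET", "neighbor_as": 701, "med": 5}
```

Comparing across providers only makes sense after inbound MEDs have been normalized, since each network attaches its own meaning to the numbers.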
Setting MEDs on External Routes
Preferring External Routes w/ MEDs
• Can be done, sometimes while preserving
remote MED info, but usually remote MED
info is lost.
• Better in some cases than as-path padding
or local-prefs (as-path padding is
undesirable when you have to pass routes
on to customers; local-prefs might use
backup links…).
Preferring External Routes w/ MEDs
• Assuming not honoring remote MEDs:
– Set metric inbound to 0 and set internal-route
MEDs on routes, then:
– Private interconnects - no change
– Regionality of traffic - no change - add normal MEDs
– Pennsauken over MAE-East - add 20 for MAE-East
– Public Exchanges - no change, or add 20
– Transit pipes, above.net first - add 30 or 40
Active Route Override
Overriding BGP
• Some have started to override BGP when
evidence suggests better routing, on a
per-prefix basis.
• ASAP from above.net, ?fastpath?, ?others?
• Ideally, actively and autonomously
determine the best path to frequently-used
prefixes and inject fixer-routes.
• Soon, Cisco will have hooks for injection.