Overview: Routing

Download Report

Transcript Overview: Routing

Interdomain Routing
Jennifer Rexford
Advanced Computer Networks
http://www.cs.princeton.edu/courses/archive/fall08/cos561/
Tuesdays/Thursdays 1:30pm-2:50pm
Outline
• Interdomain routing
– Autonomous Systems (ASes)
• Path-vector routing
– Loop detection and convergence
– Flexible path selection
• Business relationships
– Customer-provider and peer-peer
– Hierarchy from tier-1 ASes to stubs
• Border Gateway Protocol (BGP)
– Announcements and withdrawals
– Import and export policies
• Discussion of Paxson96 and Labovitz97
Interdomain Routing: Between Networks
• AS-level topology
– Nodes are Autonomous Systems (ASes)
– Destinations are prefixes (e.g., 12.0.0.0/8)
– Edges are links and business relationships
4
3
5
2
1
Client
7
6
Web server
AS Numbers (ASNs)
ASNs are 16 bit values.
64512 through 65535 are “private”
Currently around 30,000 in use.
•
•
•
•
•
•
•
•
•
Level 3: 1
MIT: 3
Harvard: 11
Yale: 29
Princeton: 88
AT&T: 7018, 6341, 5074, …
UUNET: 701, 702, 284, 12199, …
Sprint: 1239, 1240, 6211, 6242, …
…
ASNs represent units of routing policy
Challenges for Interdomain Routing
• Scale
– Prefixes: 250,000, and growing
– ASes: 30,000, and growing
– Routers: at least in the millions…
• Privacy
– ASes don’t want to divulge internal topologies
– … or their business relationships with neighbors
• Policy
– No Internet-wide notion of a link cost metric
– Need control over where you send traffic
– … and who can send traffic through you
Policy-Based Path-Vector Routing
Path-Vector Routing
• Extension of distance-vector routing
– Support flexible routing policies
– Reduce convergence time (avoid count-to-infinity)
• Key idea: advertise the entire path
– Distance vector: send distance metric per dest d
– Path vector: send the entire path for each dest d
“d: path (2,1)”
3
“d: path (1)”
1
2
data traffic
data traffic
d
Faster Loop Detection
• Node can easily detect a loop
– Look for its own node identifier in the path
– E.g., node 1 sees itself in the path “3, 2, 1”
• Node can simply discard paths with loops
– E.g., node 1 simply discards the advertisement
“d: path (2,1)”
3
“d: path (1)”
2
“d: path (3,2,1)”
1
Flexible Policies
• Each node can apply local policies
– Path selection: Which path to use?
– Path export: Whether to advertise the path?
• Examples
– Node 2 may prefer the path “2, 3, 1” over “2, 1”
– Node 1 may not let node 3 hear the path “1, 2”
2
3
1
2
3
1
Business Relationships
Business Relationships
• Neighboring ASes have business contracts
– How much traffic to carry
– Which destinations to reach
– How much money to pay
• Common business relationships
– Customer-provider
• E.g., Princeton is a customer of USLEC
• E.g., MIT is a customer of Level3
– Peer-peer
• E.g., UUNET is a peer of Sprint
• E.g., Harvard is a peer of Harvard Business School
Customer-Provider Relationship
• Customer needs to be reachable from everyone
– Provider tells all neighbors how to reach the customer
• Customer does not want to provide transit service
– Customer does not let its providers route through it
Traffic to the customer
Traffic from the customer
d
provider
announcements
provider
traffic
customer
d
customer
Multi-Homing: Two or More Providers
• Motivations for multi-homing
– Extra reliability, survive single ISP failure
– Financial leverage through competition
– Better performance by selecting better path
– Gaming the 95th-percentile billing model
Provider 1
Provider 2
Princeton Example
• Internet: customer of USLEC and Patriot
• Research universities/labs: customer of Internet2
• Local non-profits: provider for several non-profits
Patriot
USLEC
Internet2
Peer-Peer Relationship
• Peers exchange traffic between customers
– AS exports only customer routes to a peer
– AS exports a peer’s routes only to its customers
– Often the relationship is settlement-free (i.e., no $$$)
Traffic to/from the peer and its customers
announcements
peer
d
traffic
peer
AS Structure: Tier-1 Providers
• Tier-1 provider
– Has no upstream provider of its own
– Typically has a national or international backbone
• Top of the Internet hierarchy of ~10 ASes
– AOL, AT&T, Global Crossing, Level3, UUNET, NTT, Qwest,
SAVVIS (formerly Cable & Wireless), and Sprint
– Full peer-peer connections between tier-1 providers
AS Structure: Other ASes
• Other providers
– Provide transit service to downstream customers
– … but, need at least one provider of their own
– Typically have national or regional scope
– Includes several thousand ASes
• Stub ASes
– Do not provide transit service to others
– Connect to one or more upstream providers
– Includes vast majority (e.g., 85-90%) of the ASes
Border Gateway Protocol
Border Gateway Protocol
• Prefix-based path-vector protocol
• Policy-based routing based on AS Paths
• Evolved during the past 18 years
• 1989 : BGP-1 [RFC 1105], replacement for EGP
• 1990 : BGP-2 [RFC 1163]
• 1991 : BGP-3 [RFC 1267]
• 1995 : BGP-4 [RFC 1771], support for CIDR
• 2006 : BGP-4 [RFC 4271], update
BGP Operations
Establish session on
TCP port 179
AS1
BGP session
Exchange all
active routes
AS2
Exchange incremental
updates
While connection
is ALIVE exchange
route UPDATE messages
Incremental Protocol
• A node learns multiple paths to destination
– Stores all of the routes in a routing table
– Applies policy to select a single active route
– … and may advertise the route to its neighbors
• Incremental updates
– Announcement
• Upon selecting a new active route, add node id to path
• … and (optionally) advertise to each neighbor
– Withdrawal
• If the active route is no longer available
• … send a withdrawal message to the neighbors
ASPATH Attribute
128.112.0.0/16
AS Path = 1755 1239 7018 88
128.112.0.0/16
AS Path = 1239 7018 88
AS 1239
Sprint
AS 1755
AS 88
Princeton
128.112.0.0/16
Prefix Originated
Global Access
128.112.0.0/16
AS Path = 1129 1755 1239 7018 88
Ebone
AS 12654
128.112.0.0/16
AS Path = 7018 88
AS7018
128.112.0.0/16
AS Path = 88
AS 1129
RIPE NCC
RIS project
128.112.0.0/16
AS Path = 3549 7018 88
AT&T
128.112.0.0/16
AS Path = 7018 88
AS 3549
Global Crossing
BGP Policy: Applying Policy to Routes
• Import policy
– Filter unwanted routes from neighbor
• E.g. prefix that your customer doesn’t own
– Manipulate attributes to influence path selection
• E.g., assign local preference to favored routes
• Export policy
– Filter routes you don’t want to tell your neighbor
• E.g., don’t tell a peer a route learned from other peer
– Manipulate attributes to control what they see
• E.g., make a path look artificially longer than it is
BGP Policy: Influencing Decisions
Open ended programming.
Constrained only by vendor configuration language
Receive Apply Policy =
filter routes &
BGP
Updates tweak attributes
Apply Import
Policies
Based on
Attribute
Values
Best
Routes
Best Route
Selection
Best Route
Table
Apply Policy =
filter routes &
tweak attributes
Apply Export
Policies
Install forwarding
Entries for best
Routes.
IP Forwarding Table
Transmit
BGP
Updates
Import Policy: Local Preference
• Favor one path over another
– Override the influence of AS path length
– Apply local policies to prefer a path
• Example: prefer customer over peer
Local-pref = 90
Sprint
AT&T
Local-pref = 100
Tier-2
Tier-3
Yale
Import Policy: Filtering
• Discard some route announcements
– Detect configuration mistakes and attacks
• Examples on session to a customer
– Discard route if prefix not owned by the customer
– Discard route that contains other large ISP in AS path
Patriot
Princeton
128.112.0.0/16
USLEC
Export Policy: Filtering
• Discard some route announcements
– Limit propagation of routing information
• Examples
– Don’t announce routes from one peer to another
UUNET
AT&T
Sprint
Export Policy: Filtering
• Discard some route announcements
– Limit propagation of routing information
• Examples
– Don’t announce routes for network-management
hosts or the underlying routers themselves
USLEC
network
operator
Princeton
Export Policy: Attribute Manipulation
• Modify attributes of the active route
– To influence the way other ASes behave
• Example: AS prepending
– Artificially inflate the AS path length seen by others
– To convince some ASes to send traffic another way
Sprint
USLEC
Patriot
88 88
Princeton
128.112.0.0/16
88
BGP Policy Configuration
• Routing policy languages are vendor-specific
– Not part of the BGP protocol specification
– Different languages for Cisco, Juniper, etc.
• Still, all languages have some key features
– Policy as a list of clauses
– Each clause matches on route attributes
– … and either discards or modifies matching routes
• Configuration often done by human operators
– Implementing the policies of their AS
– Biz relationships, traffic engineering, security, …
Measuring Internet Routing
Motivations for Measuring the Routing System
• Characterizing the Internet
– Internet path properties
– Demands on Internet routers
– Routing convergence
• Improving Internet health
– Protocol design problems
– Protocol implementation problems
– Configuration errors or attacks
• Operating a network
– Detecting and diagnosing routing problems
– Traffic shifts, routing attacks, flaky equipment, …
Techniques for Measuring Internet Routing
• Active probing
– Inject probes along path through the data plane
– E.g., using traceroute
• Passive route monitoring
– Capture control-plane messages between routers
– E.g., using tcpdump or a software router
– E.g., dumping the routing table on a router
• Injecting network events
– Cause failure/recovery at planned time and place
– E.g., BGP route beacon, or planned maintenance
Internet Routing is Hard to Measure
• Nobody knows the Internet topology
– No central registry of the AS-level graph
– Little public information about intra-AS topologies
• Deploying monitoring infrastructure is hard
– Forwarding: active probes of end-to-end paths
– Routing: passive monitoring of routing messages
• Many measurement challenges
– Network conditions vary by location
– Network conditions change over time
– One-way measurements are hard to collect
– Controlled experiments are hard to do
Two Papers for Today
• Both early measurement studies of routing
– Initially appeared at SIGCOMM’96 and ’97
– Both won the “best student paper” award 
• And recently won the SIGCOMM “test of time” award!
– Early glimpses into the health of Internet routing
– Early wave of papers on Internet measurement
• Differences in emphasis
– Paxson96: end-to-end active probing to measure
the characteristics of the data plane
– Labovitz97: passive monitoring of BGP update
messages from several ISPs to characterize
(in)stability of the interdomain routing system
Backup Slides Summarizing
Paxson96 and Labovitz97
Active Measurement: Traceroute
• Time-To-Live field in IP packet header
– Source sends a packet with a TTL of n
– Each router along the path decrements the TTL
– “TTL exceeded” sent when TTL reaches 0
• Traceroute tool exploits this TTL behavior
TTL=1
source
Time
exceeded
destination
TTL=2
Send packets with TTL=1, 2, 3, … and record source of “time exceeded” message
Paxson Study: Forwarding Loops
• Forwarding loop
– Packet returns to same router multiple times
• May cause traceroute to show a loop
– If loop lasted long enough
– So many packets traverse the loopy path
• Traceroute may reveal false loops
– Path change that leads to a longer path
– Causing later probe packets to hit same nodes
• Heuristic solution
– Require traceroute to return same path 3 times
Paxson Study: Causes of Loops
• Transient vs. persistent
– Transient: routing-protocol convergence
– Persistent: likely configuration problem
• Challenges
– Appropriate time boundary between the two?
– What about flaky equipment going up and down?
– Determining the cause of persistent loops?
• Causes of persistent loops
– E.g., misconfiguration
12.1.2.0/24
0.0.0.0/0
Paxson Study: Path Fluttering
• Rapid changes between paths
– Multiple paths between a pair of hosts
– Load balancing policies inside the network
• Packet-based load balancing
– Round-robin or random
– Multiple paths for packets in a single flow
• Flow-based load balancing
– Hash of some fields in the packet header
– E.g., IP addresses, port numbers, etc.
– To keep packets in a flow on one path
Paxson Study: Routing Stability
• Route prevalence
– Likelihood of observing a particular route
– Relatively easy to measure with sound sampling
– Poisson arrivals see time averages (PASTA)
– Most host pairs have a dominant route
• Route persistence
– How long a route endures before a change
– Much harder to measure through active probes
– Look for cases of multiple observations
– Typical host pair has path persistence of a week
Paxson Study: Route Asymmetry
• Hot-potato routing
• Other causes
– Asymmetric link weights
in intradomain routing
– Cold-potato routing,
where AS requests traffic
enter at particular place
Customer B
Provider B
multiple
peering
points
• Consequences
Early-exit
routing
Provider A
Customer A
– Lots of asymmetry
– One-way delay is not
necessarily half of the
round-trip time
Labovitz Study: Surprising Statistics
• BGP is an incremental protocol
– In theory, no update messages in steady state
• Two kinds of update messages
– Announcement: advertising a new route
– Withdrawal: withdrawing an old route
• Study saw an alarming number of updates
– At the time, Internet had around 45,000 prefixes
– Routers were exchanging 3-6 million updates/day
– Sometimes as high as 30 million in a day
• Placing a very high load on the routers
Labovitz Study: Classifying Update Messages
• Analyze update messages
– For each (prefix, peer) tuple
– Classify the kinds of routing changes
• Forwarding instability
– WADiff: explicit withdraw, replaced by alternate
– AADiff: implict withdraw, replaced by alternate
• Pathological
– WADup: explicit withdraw, and then reanounced
– AADup: duplicate announcement
– WWDup: duplicate withdrawal
Labovitz Study: Duplicate Withdrawals
• Time-space trade-off in router implementation
– Common system building technique
– Trade one resource for another
– Can have surprising side effects
• The gory details
– Ideally, you should not send a withdrawal if you
never sent a neighbor a corresponding
announcement
– Requires remembering what update message you
sent to each neighbor
– Easier to just send everyone a withdrawal when
your route goes away
Labovitz Study: Practical Impact
• “Stateless BGP” is compliant with the standard
– But, it forces other routers to handle more load
– So that you don’t have to maintain state
– Arguably very unfair, and bad for global Internet
• One router vendor was largely at fault
– Router vendor modified its implementation
– ISPs then deployed the updated software
Labovitz Study: Still Hard to Diagnose Problems
• Despite having very detailed view into BGP
– Some pathologies were very hard to diagnose
• Possible causes
– Flaky equipment
– Synchronization of BGP timers
– Interaction between BGP and intradomain routing
– Policy oscillation
• Topics addressed more in subsequent work