COMS W4995-1
Lecture 6
Dynamic routing protocols II
1. Dynamic Routing Protocols: Link State Routing
2. Intra- and Inter-Domain Routing Protocols: OSPF & BGP
Dynamic Routing Protocols
Link State Routing
The Gang of Four
        Link State      Vectoring
IGP     OSPF, IS-IS     RIP
EGP                     BGP
Link State Routing
Based on Dijkstra’s Shortest-Path-First algorithm.
Each router starts by knowing:
Prefixes of its attached networks.
Links to its neighbors.
Each router advertises to the entire network (flooding):
Prefixes of its directly connected networks.
Active links to its neighbors.
Each router learns:
A complete topology of the network (routers, links).
Each router computes shortest path to each destination.
In a stable situation, all routers have the same graph, and compute
the same paths.
Dijkstra’s Shortest Path Algorithm for a Graph
Input: Graph (N,E) with
  N the set of nodes and E the set of edges
  cvw  link cost (cvw = ∞ if (v,w) ∉ E, cvv = 0)
  s    source node.
Output: Dn  cost of the least-cost path from node s to node n
M = {s};
for each n ∉ M
    Dn = csn;
end for
while (M ≠ all nodes) do
    Find w ∉ M for which Dw = min{Dj ; j ∉ M};
    Add w to M;
    for each neighbor n of w and n ∉ M
        Dn = min[ Dn, Dw + cwn ];
        Update route;
    end for
end while
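The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the slide author's implementation: it uses a priority queue instead of a linear scan to find the closest unfinished node, and the four-node topology and its link costs are hypothetical.

```python
import heapq
import math


def dijkstra(cost, source):
    """Return {node: least cost from source}.

    `cost` maps each node to a dict {neighbor: link cost}.
    """
    dist = {node: math.inf for node in cost}
    dist[source] = 0
    done = set()                      # the set M of finished nodes
    heap = [(0, source)]
    while heap:
        d_w, w = heapq.heappop(heap)  # closest node not yet in M
        if w in done:
            continue                  # stale heap entry, skip it
        done.add(w)
        for n, c_wn in cost[w].items():
            if n not in done and d_w + c_wn < dist[n]:
                dist[n] = d_w + c_wn  # D_n = min(D_n, D_w + c_wn)
                heapq.heappush(heap, (dist[n], n))
    return dist


# Hypothetical topology: links a-b (3), a-c (6), b-c (1), c-d (2).
graph = {
    "a": {"b": 3, "c": 6},
    "b": {"a": 3, "c": 1},
    "c": {"a": 6, "b": 1, "d": 2},
    "d": {"c": 2},
}
distances = dijkstra(graph, "a")
```

In this example the direct link a-c (cost 6) loses to the two-hop path through b (cost 3 + 1).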
Link state routing: graphical illustration
[Figure: global view of a four-node network (a, b, c, d) with link costs 1, 2, 3, and 6, shown next to the local views of a, b, c, and d.]
Collecting all views yields a global and complete view of the network!
Operation of a Link State Routing protocol
Received LSAs -> Link State Database -> Dijkstra’s Algorithm -> IP Routing Table
Received LSAs are also flooded to other interfaces.
Link State Routing: Properties
Each node requires complete topology information
Link state information must be flooded to all nodes
Guaranteed to converge
Distance Vector vs. Link State Routing
With distance vector routing, each node has information only about
the next hop:
Node A: to reach F go to B
Node B: to reach F go to D
Node D: to reach F go to E
Node E: go directly to F
Distance vector routing makes
poor routing decisions if
directions are not completely
correct
(e.g., because a node is down).
[Figure: network of nodes A through F.]
If parts of the directions are incorrect, the routing may be incorrect until the
routing algorithm has re-converged.
Distance Vector vs. Link State Routing
In link state routing, each node has a complete map of
the topology
If a node fails, each node can calculate the new route.
Difficulty: All nodes need to have a consistent view of the network.
[Figure: nodes A through F, each holding its own copy of the full topology map.]
Distance Vector vs. Link State Routing
Link State
• Topology information is flooded within the routing domain
• Best end-to-end paths are computed locally at each router.
• Best end-to-end paths determine next-hops.
• Based on minimizing some notion of distance
• Works only if policy is shared and uniform
• Examples: OSPF, IS-IS
Vectoring
• Each router knows little about network topology
• Only best next-hops are chosen by each router for each destination network.
• Best end-to-end paths result from composition of all next-hop choices
• Does not require any notion of distance
• Does not require uniform policies at all routers
• Examples: RIP, BGP
Dynamic Routing Protocols
Open Shortest Path First
OSPF
OSPF = Open Shortest Path First
The OSPF routing protocol is the most important link
state routing protocol on the Internet (another link state
routing protocol is IS-IS, Intermediate System to
Intermediate System).
The complexity of OSPF is significant
RIP (RFC 2453 ~ 40 pages)
OSPF (RFC 2328 ~ 250 pages)
History:
1989: RFC 1131 OSPF Version 1
1991: RFC 1247 OSPF Version 2
1994: RFC 1583 OSPF Version 2 (revised)
1997: RFC 2178 OSPF Version 2 (revised)
1998: RFC 2328 OSPF Version 2 (current version)
Features of OSPF
Provides authentication of routing messages
Enables load balancing by allowing traffic to be split
evenly across routes with equal cost
Type-of-Service routing allows different routes to be set up
depending on the TOS field
Supports subnetting
Supports multicasting
Allows hierarchical routing
Hierarchical OSPF
Two-level hierarchy: local area, backbone.
Link-state advertisements only in area
each node has detailed area topology; it only knows the
direction (shortest path) to nets in other areas.
Area border routers: “summarize” distances to nets
in own area, advertise to other Area Border routers.
Backbone routers: run OSPF routing limited to
backbone.
Example Network
[Figure: example network of six routers with Router IDs 10.1.1.1, 10.1.1.2, 10.1.2.3, 10.1.4.4, 10.1.5.5, and 10.1.7.6, connected by networks 10.1.1.0/24, 10.1.3.0/24, 10.1.4.0/24, 10.1.5.0/24, 10.1.6.0/24, and 10.1.7.0/24; links are labeled with their metrics.]
• Link costs are called Metric
• Metric is in the range [0, 2^16]
• Metric can be asymmetric
Router IDs can be selected independent of interface addresses,
but are usually chosen to be the smallest interface address.
Link State Advertisement (LSA)
[Figure: the example network, with router 10.1.1.1 highlighted.]
The LSA of router 10.1.1.1 is as follows:
Link State ID:          10.1.1.1 = Router ID
Advertising Router:     10.1.1.1 = Router ID
Number of links:        3 = 2 links plus router itself
Description of Link 1:  Link ID = 10.1.1.2, Metric = 4
Description of Link 2:  Link ID = 10.1.2.2, Metric = 3
Description of Link 3:  Link ID = 10.1.1.1, Metric = 0
Network and Link State Database
[Figure: the example network.]
Each router has a database which contains the LSAs from all other routers.
LS Type     Link State ID  Adv. Router  Checksum  LS SeqNo    LS Age
Router-LSA  10.1.1.1       10.1.1.1     0x9b47    0x80000006  0
Router-LSA  10.1.1.2       10.1.1.2     0x219e    0x80000007  1618
Router-LSA  10.1.2.3       10.1.2.3     0x6b53    0x80000003  1712
Router-LSA  10.1.4.4       10.1.4.4     0xe39a    0x8000003a  20
Router-LSA  10.1.5.5       10.1.5.5     0xd2a6    0x80000038  18
Router-LSA  10.1.7.6       10.1.7.6     0x05c3    0x80000005  1680
Link State Database
The collection of all LSAs is called the link-state
database
Each router has an identical link-state database
Useful for debugging: Each router has a complete description of the
network
If neighboring routers discover each other for the first
time, they will exchange their link-state databases
The link-state databases are synchronized using reliable
flooding
OSPF Packet Format
IP packet = IP header + OSPF Message
OSPF Message = OSPF Message Header + Body of OSPF Message
Body of OSPF Message = Message Type Specific Data + LSA, LSA, ... , LSA
LSA = LSA Header + LSA Data
OSPF packets are not carried as UDP payload!
OSPF has its own IP protocol number: 89
TTL: set to 1 (in most cases)
Destination IP: neighbor’s IP address or 224.0.0.5
(AllSPFRouters) or 224.0.0.6 (AllDRouters)
OSPF Packet Format
OSPF Message = Header + Body of OSPF Message
Header fields (each row is 32 bits wide):
  version | type | message length
  source router IP address
  Area ID
  checksum | authentication type
  authentication
  authentication
version: 2 (current version is OSPF V2)
type: message types
  1: Hello (tests reachability)
  2: Database description
  3: Link state request
  4: Link state update
  5: Link state acknowledgement
Area ID: ID of the area from which the packet originated
checksum: standard IP checksum taken over entire packet
authentication type:
  0: no authentication
  1: cleartext password
  2: MD5 checksum (added to end of packet)
authentication field when authentication type = 1:
  64-bit cleartext password
authentication field when authentication type = 2:
  0x0000 (16 bits) | KeyID (8 bits) | Length of MD5 checksum (8 bits)
  Nondecreasing sequence number (32 bits); prevents replay attacks
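The header layout above can be decoded with a fixed-size binary unpack. The sketch below is only an illustration: the function name `parse_ospf_header`, the dictionary keys, and the sample packet are assumptions, and the checksum and authentication data are not verified.

```python
import socket
import struct

# version (1 byte), type (1), message length (2), source router (4),
# Area ID (4), checksum (2), authentication type (2), authentication (8)
OSPF_HEADER = struct.Struct("!BBH4s4sHH8s")

MESSAGE_TYPES = {
    1: "Hello",
    2: "Database Description",
    3: "Link State Request",
    4: "Link State Update",
    5: "Link State Acknowledgement",
}


def parse_ospf_header(packet: bytes) -> dict:
    """Decode the 24-byte OSPF message header at the start of `packet`."""
    (version, msg_type, length, router, area,
     checksum, auth_type, auth) = OSPF_HEADER.unpack_from(packet)
    return {
        "version": version,
        "type": MESSAGE_TYPES.get(msg_type, "unknown"),
        "length": length,
        "router_id": socket.inet_ntoa(router),
        "area_id": socket.inet_ntoa(area),
        "checksum": checksum,
        "auth_type": auth_type,
        "auth": auth,
    }


# Hypothetical Hello header from router 10.1.10.2 in area 0.0.0.0,
# with no authentication and a zeroed checksum.
sample = OSPF_HEADER.pack(
    2, 1, OSPF_HEADER.size,
    socket.inet_aton("10.1.10.2"), socket.inet_aton("0.0.0.0"),
    0, 0, bytes(8),
)
header = parse_ospf_header(sample)
```

The `!` prefix in the format string selects network byte order, which is how the fields appear on the wire.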
OSPF LSA Format
LSA = LSA Header + LSA Data
LSA Header fields:
  Link Age | Link Type
  Link State ID
  advertising router
  link sequence number
  checksum | length
LSA Data, repeated for each link (Link 1, Link 2, ...):
  Link ID
  Link Data
  Link Type | #TOS metrics | Metric
Discovery of Neighbors
Routers multicast OSPF Hello packets on all OSPF-enabled interfaces.
If two routers share a link, they can become neighbors, and
establish an adjacency
Scenario: Router 10.1.10.2 restarts
[Figure: message exchange between routers 10.1.10.1 and 10.1.10.2.]
  OSPF Hello
  OSPF Hello: I heard 10.1.10.2
After becoming a neighbor, routers exchange their link state
databases
Neighbor discovery and database synchronization
Scenario: Router 10.1.10.2 restarts
[Figure: message exchange between routers 10.1.10.1 and 10.1.10.2.]
Discovery of adjacency:
  OSPF Hello
  OSPF Hello: I heard 10.1.10.2
After neighbors are discovered the nodes exchange their databases:
  Database Description: Sequence = X
    (sends empty database description)
  Database Description: Sequence = X, 6 LSA headers =
    Router-LSA, 10.1.10.1, 0x80000006
    Router-LSA, 10.1.10.2, 0x80000007
    Router-LSA, 10.1.10.3, 0x80000003
    Router-LSA, 10.1.10.4, 0x8000003a
    Router-LSA, 10.1.10.5, 0x80000038
    Router-LSA, 10.1.10.6, 0x80000005
    (sends database description; the description only contains LSA headers)
  Database Description: Sequence = X+1, 1 LSA header =
    Router-LSA, 10.1.10.2, 0x80000005
    (database description of 10.1.10.2)
  Database Description: Sequence = X+1
    (acknowledges receipt of description)
Regular LSA exchanges
[Figure: message exchange between routers 10.1.10.1 and 10.1.10.2.]
10.1.10.2 explicitly requests each LSA from 10.1.10.1:
  Link State Request packets, LSAs =
    Router-LSA, 10.1.10.1,
    Router-LSA, 10.1.10.2,
    Router-LSA, 10.1.10.3,
    Router-LSA, 10.1.10.4,
    Router-LSA, 10.1.10.5,
    Router-LSA, 10.1.10.6,
10.1.10.1 sends requested LSAs:
  Link State Update Packet, LSAs =
    Router-LSA, 10.1.10.1, 0x80000006
    Router-LSA, 10.1.10.2, 0x80000007
    Router-LSA, 10.1.10.3, 0x80000003
    Router-LSA, 10.1.10.4, 0x8000003a
    Router-LSA, 10.1.10.5, 0x80000038
    Router-LSA, 10.1.10.6, 0x80000005
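One way to picture how a restarted router decides which LSAs to request is to compare the sequence numbers in the neighbor's database description with its own database. The sketch below is a simplification under stated assumptions: the function names are hypothetical, LSAs are keyed by advertising router only, and the comparison uses the sequence number alone (OSPF sequence numbers are signed 32-bit values, so 0x80000001 is the smallest); a real implementation would also consider checksum and age.

```python
def to_signed32(seq: int) -> int:
    """Interpret a 32-bit LS sequence number as a signed integer."""
    return seq - (1 << 32) if seq & 0x80000000 else seq


def lsas_to_request(local_db: dict, neighbor_headers: dict) -> list:
    """Return the keys of LSAs that are missing locally or older locally."""
    wanted = []
    for key, neighbor_seq in neighbor_headers.items():
        local_seq = local_db.get(key)
        if local_seq is None or to_signed32(neighbor_seq) > to_signed32(local_seq):
            wanted.append(key)
    return wanted


# Headers described by 10.1.10.1 (values taken from the exchange above).
neighbor_headers = {
    "10.1.10.1": 0x80000006,
    "10.1.10.2": 0x80000007,
    "10.1.10.3": 0x80000003,
    "10.1.10.4": 0x8000003A,
    "10.1.10.5": 0x80000038,
    "10.1.10.6": 0x80000005,
}
# The restarted router 10.1.10.2 only holds its own, older LSA.
local_db = {"10.1.10.2": 0x80000005}
requests = lsas_to_request(local_db, neighbor_headers)
```

Under these assumptions the restarted router requests all six LSAs, which matches the Link State Request shown above.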
Dissemination of LSA-Update
A router sends and refloods LSA-Updates, whenever the
topology or link cost changes. (If a received LSA does
not contain new information, the router will not flood the
packet)
Exception: Infrequently (every 30 minutes), a router will
flood LSAs even if there are no new changes.
Acknowledgements of LSA-updates:
explicit ACK, or
implicit via reception of an LSA-Update
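The flooding rule described above (install and reflood only when an LSA carries new information) can be sketched as follows. This is an illustrative simplification: the function and interface names are hypothetical, sequence numbers are treated as plain integers, and acknowledgements, aging, and the periodic refresh are left out.

```python
def receive_lsa(lsdb: dict, lsa_key: str, seq: int,
                arrived_on: str, interfaces: list) -> list:
    """Install a newer LSA and return the interfaces to reflood it on."""
    current = lsdb.get(lsa_key)
    if current is not None and seq <= current:
        return []                 # no new information: do not flood
    lsdb[lsa_key] = seq           # newer copy replaces the stored one
    # Flood on every interface except the one the LSA arrived on.
    return [i for i in interfaces if i != arrived_on]


lsdb = {}
interfaces = ["eth0", "eth1", "eth2"]   # hypothetical interface names
first = receive_lsa(lsdb, "10.1.1.1", 6, "eth0", interfaces)
duplicate = receive_lsa(lsdb, "10.1.1.1", 6, "eth1", interfaces)
```

The second call returns an empty list because the same LSA arriving again on another interface carries nothing new, which is what stops the flood from circulating forever.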
Question: If a new node comes up, it could build the
database from regular LSA-Updates (rather than
exchange of database description). What role do the
database description packets play?
Dynamic Routing Protocols (Inter-domain)
Border Gateway Protocol
BGP Quick View
BGP = Border Gateway Protocol. Currently in version 4,
specified in RFC 1771 (~ 60 pages).
Note: In the context of BGP, a gateway is nothing else
but an IP router that connects autonomous systems.
Interdomain routing protocol for routing between
autonomous systems
Uses TCP to establish a BGP session and to send
routing messages over the BGP session
BGP is a path vector protocol. Routing messages in BGP
contain complete routes.
Network administrators can specify routing policies
BGP Policy-based Routing
Each node is assigned an AS number (ASN)
BGP’s goal is to find any AS-path (not an optimal one).
Since the internals of the AS are never revealed, finding
an optimal path is not feasible.
Network administrator sets BGP’s policies to determine
the best path to reach a destination network.
How Many ASNs are there today?
20,570 ASNs, of which 14,588 are origin only (no transit)
Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005
ARDs versus ASes
Autonomous Routing Domains Don’t Always Need BGP or an ASN
[Figure: Yale University (130.132.0.0/16) connected to Qwest.]
Qwest: nail up routes for 130.132.0.0/16 pointing to Yale
Yale University: nail up default routes 0.0.0.0/0 pointing to Qwest
Static routing is the most common way of connecting an
autonomous routing domain to the Internet.
This helps explain why BGP is a mystery to many …
ASNs Can Be “Shared” (RFC 2270)
[Figure: AS 701 (UUNet) with three customers, Crestar Bank, NJIT, and Hood College, each using AS 7046; prefix 128.235.0.0/16 is shown.]
ASN 7046 is assigned to UUNet. It is used by
customers that are single-homed to UUNet but need
BGP for some reason (load balancing, etc.) [RFC 2270]
ARDs and ASes: Summary
Most ARDs have no ASN (statically routed at Internet edge)
Some unrelated ARDs share the same ASN (RFC 2270)
Some ARDs are implemented with multiple ASNs (example:
Worldcom)
ASes are just an implementation detail of Inter-domain routing
How many prefixes today?
221,002
IPv4 Address space covered
33.3%
23%
Thanks to Geoff Huston. http://bgp.potaroo.net on October 9, 2005
Policy-Based vs. Distance-Based Routing?
Minimizing
“hop count” can
violate commercial
relationships that
constrain interdomain routing.
[Figure: Host 1, Host 2, Cust1, Cust2, Cust3, ISP1, ISP2, and ISP3; one path between the hosts is marked YES and another NO.]
Thanks to Tim Griffin
http://www.cl.cam.ac.uk/users/tgg22
Customer versus Provider
[Figure: providers above customers, with IP traffic flowing between them over provider-customer links.]
Customer pays provider for access to the Internet
Why not minimize “AS hop Count”?
[Figure: National ISP1 and National ISP2 above Regional ISP1, Regional ISP2, and Regional ISP3, which serve Cust1, Cust2, and Cust3; one path is marked YES and another NO.]
Shortest path routing is not compatible with commercial relations
The “Peering” Relationship
Peers provide transit between their respective customers
Peers do not provide transit between peers
[Figure: peer-peer and provider-customer links, with arrows marking traffic allowed and traffic NOT allowed.]
Peers (often) do not exchange $$$
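The transit rules above are often written as an export filter. The sketch below is an assumption-laden illustration rather than anything the slides specify: the relationship labels and the function name are hypothetical, and the handling of provider-learned routes follows the common convention (treated like peer-learned routes), which the slide does not state explicitly.

```python
def should_export(learned_from: str, send_to: str) -> bool:
    """Decide whether a route learned from one neighbor type may be
    announced to another. Labels: 'customer', 'peer', 'provider'."""
    if learned_from == "customer":
        return True               # customer routes go to everyone
    # Routes learned from peers (or, by assumption, providers) are only
    # passed on to customers: no transit between peers.
    return send_to == "customer"


peer_to_peer = should_export("peer", "peer")
customer_to_peer = should_export("customer", "peer")
```

Here `customer_to_peer` is allowed because it lets the peer reach this AS's customers, while `peer_to_peer` is refused because it would give one peer free transit to another.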
Peering Provides Shortcuts
Peering also allows connectivity between
the customers of “Tier 1” providers.
[Figure: peering links between providers; the legend distinguishes peer-peer and provider-customer links.]
Peering Wars
Peer
• Reduces upstream transit costs
• Can increase end-to-end performance
• May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer
• You would rather have customers
• Peers are usually your competition
• Peering relationships may require periodic renegotiation
Peering struggles are by far the most
contentious issues in the ISP world!
Peering agreements are often confidential.
The Border Gateway Protocol (BGP)
BGP =
  RFC 1771
  + “optional” extensions:
    RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation) …
  + routing policy configuration languages (vendor-specific)
  + Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED.
It EVOLVED.
BGP Route Processing
Receive BGP Updates
  -> Apply Import Policies (apply policy = filter routes & tweak attributes)
  -> Best Route Selection (based on attribute values)
  -> Best Route Table (best routes)
  -> Apply Export Policies (apply policy = filter routes & tweak attributes)
  -> Transmit BGP Updates
Best routes: install forwarding entries for best routes in the IP Forwarding Table.
Policies: open-ended programming, constrained only by the vendor configuration language.
BGP Attributes
Value  Code                     Reference
-----  -----------------------  ---------
1      ORIGIN                   [RFC1771]
2      AS_PATH                  [RFC1771]
3      NEXT_HOP                 [RFC1771]
4      MULTI_EXIT_DISC          [RFC1771]
5      LOCAL_PREF               [RFC1771]
6      ATOMIC_AGGREGATE         [RFC1771]
7      AGGREGATOR               [RFC1771]
8      COMMUNITY                [RFC1997]
9      ORIGINATOR_ID            [RFC2796]
10     CLUSTER_LIST             [RFC2796]
11     DPA                      [Chen]
12     ADVERTISER               [RFC1863]
13     RCID_PATH / CLUSTER_ID   [RFC1863]
14     MP_REACH_NLRI            [RFC2283]
15     MP_UNREACH_NLRI          [RFC2283]
16     EXTENDED COMMUNITIES     [Rosen]
...
255    reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
[The slide marks a subset of these as the most important attributes.]
Not all attributes need to be present in every announcement.
ASPATH Attribute
[Figure: propagation of prefix 135.207.0.0/16, originated by AS 6341 (AT&T Research), toward AS 12654 (RIPE NCC RIS project).]
AS 6341 (AT&T Research):   135.207.0.0/16, AS Path = 6341 (prefix originated)
AS 7018 (AT&T):            135.207.0.0/16, AS Path = 7018 6341
AS 1239 (Sprint):          135.207.0.0/16, AS Path = 1239 7018 6341
AS 1755 (Ebone):           135.207.0.0/16, AS Path = 1755 1239 7018 6341
AS 1129 (Global Access):   135.207.0.0/16, AS Path = 1129 1755 1239 7018 6341
AS 3549 (Global Crossing): 135.207.0.0/16, AS Path = 3549 7018 6341
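The way the AS path grows as an announcement crosses each AS can be sketched as below. The function names are hypothetical; the loop check (rejecting a path that already contains the receiver's own ASN) is standard BGP behavior but is not shown on the slide, so it is included here as an addition.

```python
def announce(as_path: tuple, own_asn: int) -> tuple:
    """Prepend our ASN before passing the route to a neighboring AS."""
    return (own_asn,) + tuple(as_path)


def accept(as_path: tuple, own_asn: int) -> bool:
    """Reject routes whose path already contains our ASN (loop)."""
    return own_asn not in as_path


# Follow 135.207.0.0/16 along the upper chain of ASes in the figure.
path = ()
for asn in (6341, 7018, 1239, 1755, 1129):
    path = announce(path, asn)

seen_by_ripe = accept(path, 12654)   # AS 12654 is not on the path
seen_by_att = accept(path, 7018)     # AS 7018 would see itself
```

After the loop, `path` equals the longest AS path in the figure, read from the most recent AS back to the originator.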
Shorter Doesn’t Always Mean Shorter
[Figure: AS 1, AS 2, AS 3, and AS 4, with two paths to AS 1: 4 1 and 3 2 1.]
Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!
In fairness: could you do this “right” and still scale?
Exporting internal state would dramatically increase global
instability and amount of routing state.
Routing Example 1
Thanks to Han Zheng
Routing Example 2
Thanks to Han Zheng
Tweak Tweak Tweak (TE)
For inbound traffic (controlled through outbound routes)
• Filter outbound routes
• Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
For outbound traffic (controlled through inbound routes)
• Filter inbound routes
• Tweak attributes on inbound routes to influence best route selection
In general, an AS has more control over outbound traffic
Backup Links with Local Preference (Outbound Traffic)
[Figure: AS 65000 connected to AS 1 by a primary link and a backup link.]
Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 1
Forces outbound traffic to take primary link, unless link is down.
Multihomed Backups (Outbound Traffic)
[Figure: AS 2 connected to provider AS 1 by a primary link and to provider AS 3 by a backup link.]
Primary link: Set Local Pref = 100 for all routes from AS 1
Backup link: Set Local Pref = 50 for all routes from AS 3
Forces outbound traffic to take primary link, unless link is down.
Shedding Inbound Traffic with ASPATH Prepending
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary link and a backup link.]
Primary link announcement: 192.0.2.0/24, ASPATH = 2
Backup link announcement: 192.0.2.0/24, ASPATH = 2 2 2
Prepending will (usually) force inbound traffic from AS 1 to take the primary link.
Yes, this is a Glorious Hack …
… But Padding Does Not Always Work
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary link and to provider AS 3 by a backup link.]
Primary link announcement: 192.0.2.0/24, ASPATH = 2
Backup link announcement: 192.0.2.0/24, ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2
AS 3 will send traffic on the “backup” link because it prefers
customer routes, and local preference is considered
before ASPATH length!
Padding in this way is often used as a form of load balancing.
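Why padding fails here can be seen in a two-step comparison: highest local preference first, then shortest AS path. The sketch below covers only those two steps of the BGP decision process; the `Route` class, the field names, and the local preference values (100 for customer routes, 90 for peer routes) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int
    via: str


def best_route(routes):
    """Prefer higher local preference, then the shorter AS path."""
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


# Routes to 192.0.2.0/24 as seen by AS 3 (values are assumptions).
padded_customer = Route("192.0.2.0/24", (2,) * 13, 100, "backup link")
via_peer = Route("192.0.2.0/24", (1, 2), 90, "AS 1")
chosen = best_route([padded_customer, via_peer])
```

`chosen` is the padded customer route: its path is 13 hops long, but the path length is never consulted because the local preference comparison already decides the outcome.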
COMMUNITY Attribute to the Rescue!
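[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary link and to provider AS 3 by a backup link.]
Primary link announcement: 192.0.2.0/24, ASPATH = 2
Backup link announcement: 192.0.2.0/24, ASPATH = 2, COMMUNITY = 3:70
AS 3: normal customer local pref is 100, peer local pref is 90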
Customer import policy at AS 3:
If 3:90 in COMMUNITY then
set local preference to 90
If 3:80 in COMMUNITY then
set local preference to 80
If 3:70 in COMMUNITY then
set local preference to 70
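The import policy above amounts to a lookup from community value to local preference. A minimal sketch, assuming communities arrive as a set of strings and that rules are applied in the listed order so a later match overrides an earlier one; the names are hypothetical and this is not any vendor's configuration syntax.

```python
# (community, local preference) rules in the order listed above.
CUSTOMER_IMPORT_RULES = [("3:90", 90), ("3:80", 80), ("3:70", 70)]
DEFAULT_CUSTOMER_LOCAL_PREF = 100   # AS 3's normal customer value
PEER_LOCAL_PREF = 90                # AS 3's peer value


def customer_local_pref(communities: set) -> int:
    """Local preference AS 3 assigns to a route learned from a customer."""
    pref = DEFAULT_CUSTOMER_LOCAL_PREF
    for community, value in CUSTOMER_IMPORT_RULES:
        if community in communities:
            pref = value            # later rules override earlier ones
    return pref


backup_pref = customer_local_pref({"3:70"})
untagged_pref = customer_local_pref(set())
```

With the 3:70 tag the backup route gets local preference 70, below the peer value of 90, so AS 3 now prefers the route it learns through its peer over the customer's backup link.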
BGP Issues - What is a BGP Wedgie?
¾ wedgie
• BGP policies make sense locally
• Interaction of local policies allows multiple stable routings
• Some routings are consistent with intended policies, and some are not
• If an unintended routing is installed (BGP is “wedged”), then manual intervention is needed to change to an intended routing
Full wedgie
• When an unintended routing is installed, no single group of network operators has enough knowledge to debug the problem
Dynamic Routing Protocols: Summary
Dynamic routing protocols: RIP, OSPF, BGP
RIP uses a distance vector algorithm and converges slowly
(the count-to-infinity problem).
OSPF uses a link state algorithm and converges fast, but
it is more complicated than RIP.
Both RIP and OSPF find the lowest-cost path.
BGP uses a path vector algorithm; its path selection
algorithm is complicated and is influenced by policies.
BGP has its own problems; see BGP Wedgies by Tim Griffin.
More Readings (Optional)
BGP Wedgies: Bad Routing Policy Interactions that
Cannot be Debugged
JI’s Intro to interdomain routing.
"Interdomain Setting of PlanetLab Nodes."
PlanetLab Meeting, May 14, 2004.
Understanding the Border Gateway Protocol (BGP)
ICNP 2002 Tutorial Session