•BGP overview
•BGP operations
•BGP messages
•BGP decision algorithm
•BGP states
Xuan Zheng
(modified by M. Veeraraghavan)
BGP overview
• Currently in version 4.
• InterAS (or Interdomain) routing protocol for
exchanging network reachability information among
BGP routers.
• Uses TCP on port 179 to send routing messages.
• BGP is a path vector protocol: like a distance vector protocol such as RIP,
it advertises routes to its neighbors, but unlike in RIP, routing messages in
BGP contain complete routes.
• Network administrators can specify routing policies.
BGP overview (cont.)
• BGP routers are also called BGP speakers.
BGP operations
• Two BGP routers exchanging information on a connection are called
peers.
– Initially, BGP peers exchange the entire BGP routing table.
– A BGP router retains the current version of the entire BGP routing
tables of all of its peers for the duration of the connection.
– Subsequently, only incremental updates are sent as the routing
tables change.
– Keepalive messages are sent periodically to ensure that the
connection between the BGP peers is alive.
– Notification messages are sent in response to errors or special
conditions.
BGP operations (cont.)
• A route is defined as a unit of information that pairs a
destination with the attributes of a path to that destination.
• Routes are stored in the Routing Information Bases (RIBs).
• A RIB within a BGP router consists of three distinct parts:
– Adj-RIBs-In: contains unprocessed routing information that has
been advertised to the local BGP router by its peers;
– Loc-RIB: contains the routes that have been selected by the local
BGP router's Decision Process;
– Adj-RIBs-Out: organizes the routes for advertisement to specific
peers by means of the local speaker’s UPDATE messages.
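The three parts of the RIB can be sketched as plain dictionaries. This is a minimal illustration, not how any particular router stores routes: the names `Route`, `Rib`, `learn`, `withdraw`, `decide`, and `advertise_to` are hypothetical, the Decision Process is passed in as a function, and no export policy is applied.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    prefix: str      # destination, e.g. "10.1.2.0/24"
    as_path: tuple   # path attribute: sequence of AS numbers
    next_hop: str    # path attribute: next-hop IP address


@dataclass
class Rib:
    adj_ribs_in: dict = field(default_factory=dict)   # peer -> {prefix: Route}
    loc_rib: dict = field(default_factory=dict)       # prefix -> selected Route
    adj_ribs_out: dict = field(default_factory=dict)  # peer -> {prefix: Route}

    def learn(self, peer, route):
        """Store a route advertised by a peer, unprocessed, in Adj-RIBs-In."""
        self.adj_ribs_in.setdefault(peer, {})[route.prefix] = route

    def withdraw(self, peer, prefix):
        """Remove a route that a peer has withdrawn."""
        self.adj_ribs_in.get(peer, {}).pop(prefix, None)

    def decide(self, better):
        """Rebuild Loc-RIB; better(a, b) returns the preferred of two routes."""
        self.loc_rib = {}
        for routes in self.adj_ribs_in.values():
            for prefix, route in routes.items():
                current = self.loc_rib.get(prefix)
                self.loc_rib[prefix] = route if current is None else better(current, route)

    def advertise_to(self, peer):
        """Stage the selected routes for a peer (assumption: no export policy)."""
        self.adj_ribs_out[peer] = dict(self.loc_rib)
```

With a "shorter AS path wins" rule passed as `better`, two peers advertising the same prefix leave exactly one route for that prefix in `loc_rib`.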
eBGP and iBGP
• BGP can also be used within an AS. BGP connections inside an AS are
called internal BGP (iBGP), and BGP connections between different ASs
are called external BGP (eBGP).
• If an AS has multiple connections to other ASs, multiple BGP
speakers are needed.
• All BGP speakers representing the same AS must give a consistent
image of the AS to the outside; hence iBGP.
• The purpose of iBGP is to ensure that network reachability information is
consistent among multiple BGP routers in the same AS.
[Figure: routers R1, R2, R3, and R4 across AS1, AS2, and AS4, with one iBGP connection and two eBGP connections.]
BGP messages
• BGP header format
– Marker: authenticates incoming BGP messages or detects loss of
synchronization between a pair of BGP peers
– Length: indicates the total length of the message in octets, including the BGP
header
– Type: indicates the type of the message
• The BGP synchronization rule states that if an AS provides transit service to another AS, BGP
should not advertise a route until all of the routers within the AS have learned about the route
via an IGP.
• Header layout: Marker (16 octets), Length (2 octets), Type (1 octet)
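The header fields can be packed and unpacked with Python's `struct` module. The sketch below assumes the field sizes and type codes given in RFC 1771 (a 19-octet header, messages of at most 4096 octets, types 1 to 4) and an all-ones Marker, which is what is used when no authentication is in effect; the function names are hypothetical.

```python
import struct

HEADER_LEN = 19  # 16-octet Marker + 2-octet Length + 1-octet Type
MAX_MESSAGE_LEN = 4096
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def pack_message(msg_type, body=b"", marker=b"\xff" * 16):
    """Prefix a message body with a BGP header in network byte order."""
    return struct.pack("!16sHB", marker, HEADER_LEN + len(body), msg_type) + body


def parse_header(data):
    """Return (marker, length, type name) from the first 19 octets."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 octets for a BGP header")
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError("bad message length")
    return marker, length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")
```

Because Length counts the header itself, a message with an empty body has Length 19.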
OPEN message
• Message layout: BGP header (Marker, Length, Type=OPEN), then Version (1 octet), My autonomous system (2 octets), Hold time (2 octets), BGP identifier (4 octets), Optional parameter length (1 octet), Optional parameters (variable)
• Purpose: first message sent after the TCP connection is opened
• Version: the protocol version number of the message
• My autonomous system: the AS number of the sending router
• Hold time: the maximum number of seconds the sender proposes may elapse between
the receipt of successive KEEPALIVE or UPDATE messages
• BGP identifier: identifier of the sending BGP router (one of its interface IP addresses)
• Optional parameters: a list of optional parameters
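The fixed part of the OPEN body is 10 octets, which makes it easy to encode and decode. This sketch assumes the field sizes from RFC 1771 and leaves the optional parameters as raw bytes; `pack_open_body` and `parse_open_body` are hypothetical names, and the BGP header is not included.

```python
import ipaddress
import struct

OPEN_FIXED = "!BHH4sB"  # Version, My AS, Hold time, BGP identifier, Opt. param. length


def pack_open_body(my_as, hold_time, bgp_id, version=4, opt_params=b""):
    """Encode the body of an OPEN message (without the BGP header)."""
    ident = ipaddress.IPv4Address(bgp_id).packed
    fixed = struct.pack(OPEN_FIXED, version, my_as, hold_time, ident, len(opt_params))
    return fixed + opt_params


def parse_open_body(body):
    """Decode the body of an OPEN message into a dictionary."""
    version, my_as, hold_time, ident, opt_len = struct.unpack(OPEN_FIXED, body[:10])
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_identifier": str(ipaddress.IPv4Address(ident)),
        "optional_parameters": body[10:10 + opt_len],
    }
```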
KEEPALIVE message
• Message layout: BGP header only (Marker, Length, Type=KEEPALIVE)
• If the hold time is zero, then KEEPALIVE messages
will not be sent.
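The relation between the hold time and KEEPALIVE messages can be sketched as follows. Two details are assumptions taken from RFC 1771 rather than from the slide: the hold time in use is the smaller of the two values exchanged in the OPEN messages, and one third of the hold time is the suggested interval between KEEPALIVE messages. The function names are hypothetical.

```python
def negotiated_hold_time(local_hold_time, peer_hold_time):
    """Both peers use the smaller of the two proposed hold times."""
    return min(local_hold_time, peer_hold_time)


def keepalive_interval(hold_time):
    """Seconds between KEEPALIVE messages, or None if none are sent."""
    if hold_time == 0:
        return None          # hold time of zero: KEEPALIVE messages are not sent
    return hold_time / 3     # assumption: the RFC's suggested one-third ratio
```

For example, a local proposal of 90 seconds and a peer proposal of 180 seconds give a hold time of 90 seconds and a KEEPALIVE roughly every 30 seconds.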
NOTIFICATION message
• Message layout: BGP header (Marker, Length, Type=NOTIFICATION), then Error code (1 octet), Error subcode (1 octet), Data (variable)
• When a BGP speaker detects an error, it sends a NOTIFICATION and then
closes the TCP connection.
• Error code: the type of error condition
• Error subcode: specific information about the nature of the error
• Data: information used to diagnose the reason for the notification
• Examples: OPEN message error, UPDATE message error (bad attribute), hold
timer expired, etc.
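Decoding a NOTIFICATION body amounts to reading two octets and keeping the rest as data. The error-code table below is an assumption taken from RFC 1771, not from the slide, and `parse_notification_body` is a hypothetical name.

```python
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}


def parse_notification_body(body):
    """Return (error name, error subcode, data) from a NOTIFICATION body."""
    if len(body) < 2:
        raise ValueError("NOTIFICATION body needs an error code and a subcode")
    code, subcode = body[0], body[1]
    return ERROR_CODES.get(code, "Unknown"), subcode, body[2:]
```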
UPDATE message
• Message layout, following the BGP header:
– Unfeasible routes length (2 octets)
– Withdrawn routes (variable): a list of entries, each a Length (1 octet) and a Prefix (variable)
– Total path attribute length (2 octets)
– Path attributes (variable): a list of entries, each an Attribute type, Attribute length, and Attribute value
– Network layer reachability information (variable): a list of entries, each a Length (1 octet) and a Prefix (variable)
• Unfeasible routes length: the total length of the withdrawn routes field in octets.
• Withdrawn routes: a list of IP address prefixes for the routes that need to be
withdrawn from BGP routing tables.
• Total path attribute length: the total length of the Path Attributes field in octets.
• Path attributes: a variable length sequence of path attributes.
• NLRI (Network Layer Reachability Information): a list of IP prefixes.
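The three variable-length fields can be separated using the two length fields. In this sketch the one-octet Length of each prefix entry is taken to be the prefix length in bits, with the prefix padded to a whole number of octets, as RFC 1771 specifies; the path attributes are returned undecoded, only IPv4 prefixes are handled, and the function names are hypothetical.

```python
import ipaddress
import struct


def parse_prefixes(data):
    """Decode a run of (length in bits, prefix) entries into CIDR strings."""
    prefixes, i = [], 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8  # prefix occupies the minimum number of octets
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        i += 1 + nbytes
    return prefixes


def parse_update_body(body):
    """Split an UPDATE body into (withdrawn routes, raw path attributes, NLRI)."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    withdrawn = parse_prefixes(body[2:2 + withdrawn_len])
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    attrs_start = 4 + withdrawn_len
    path_attributes = body[attrs_start:attrs_start + attrs_len]
    nlri = parse_prefixes(body[attrs_start + attrs_len:])
    return withdrawn, path_attributes, nlri
```

The NLRI field has no length of its own; it is whatever remains of the message after the path attributes.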
Update message (cont.)
• Path attribute layout: Attribute type (attribute flags O, T, P, E followed by zero bits, then the attribute type code), Attribute length, Attribute value
• Attribute flags (1 octet):
– O bit: attribute is optional (O=1), or well-known (O=0).
– T bit: an optional attribute is transitive (T=1), or non-transitive (T=0). Well-known
attributes are always transitive.
– P bit: the information in the optional transitive attribute is partial (P=1), or complete (P=0).
– E bit: the attribute length is two octets (E=1), or one octet (E=0).
• Four types of attributes
– Well-known mandatory: recognized by all BGP speakers
– Well-known discretionary, optional transitive, optional non-transitive
– When a BGP speaker does not recognize an optional transitive attribute, it still passes
the attribute on with the path. Unrecognized optional non-transitive attributes
should be quietly ignored and not passed on.
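The flag octet can be decoded with bit masks. The sketch assumes the bit positions from RFC 1771: O is the high-order bit (0x80), followed by T (0x40), P (0x20), and E (0x10). The function names are hypothetical.

```python
def decode_attribute_flags(flags):
    """Split the attribute flags octet into its four named bits."""
    return {
        "optional": bool(flags & 0x80),         # O bit
        "transitive": bool(flags & 0x40),       # T bit
        "partial": bool(flags & 0x20),          # P bit
        "extended_length": bool(flags & 0x10),  # E bit: two-octet length field
    }


def attribute_category(flags):
    """Classify an attribute from its flags alone.

    Mandatory versus discretionary cannot be read from the flags; it depends
    on the attribute type code.
    """
    decoded = decode_attribute_flags(flags)
    if not decoded["optional"]:
        return "well-known"
    return "optional transitive" if decoded["transitive"] else "optional non-transitive"
```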
Types of attributes
• Attribute type code:
– ORIGIN (type code 1): well-known mandatory
• defines the origin of the NLRI
– 0: IGP – indicates that the NLRI is interior to the originating AS
– 1: EGP – indicates that the NLRI was learned through EGP (the Exterior Gateway Protocol)
– 2: incomplete – NLRI learned through some other means
– AS_PATH (type code 2): well-known mandatory
• lists the sequence of ASs that the route has traversed to reach the destination
– A BGP speaker propagating a route prepends its own AS to the AS_PATH list
– Used to detect loops
– NEXT_HOP (type code 3): well-known mandatory
• defines the IP address of the border router that should be used as the next hop to
reach the destinations listed in the NLRI
– MULTI_EXIT_DISC (MED, Multi-Exit Discriminator) (type code 4): optional non-transitive; an inter-AS metric (hop count)
• discriminates among multiple entry/exit points to a neighboring AS and gives a hint
to the neighboring AS about the preferred path.
– it makes no sense to compare a MED value set by one AS with a MED value set by another AS
because metrics vary from AS to AS.
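AS_PATH prepending and loop detection can be sketched in a few lines. This is a simplified illustration that treats the AS_PATH as a flat sequence of AS numbers; the function names are hypothetical.

```python
def prepend_own_as(as_path, local_as):
    """Return the AS_PATH to advertise: the local AS followed by the received path."""
    return (local_as,) + tuple(as_path)


def has_loop(as_path, local_as):
    """A received route whose AS_PATH already contains the local AS has looped."""
    return local_as in as_path
```

A speaker in AS 2 that receives a path (1, 2, 3) finds its own AS number in the path and rejects the route.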
Types of attributes (cont.)
• Attribute type code:
– LOCAL_PREF (type code 5): well-known discretionary
• informs other BGP routers within the same AS of its degree of preference
for an advertised route
– only part of iBGP; not included in eBGP exchanges
– ATOMIC_AGGREGATE (type code 6): well-known discretionary
• a BGP speaker, when presented with a set of overlapping routes from one
of its peers to reach a given NLRI, informs other BGP routers that it
selected a less specific route without selecting a more specific one that is
included in it. Ensures that certain aggregates are not deaggregated.
• a route describing a smaller set of destinations (a longer prefix) is said to
be more specific than a route describing a larger set of destinations (a
shorter prefix)
– AGGREGATOR (type code 7): optional transitive
• specifies the last AS number that formed the aggregate route, followed by
the IP address of the BGP router that formed the aggregate route.
• advertises which AS and which BGP speaker within that AS performed the
aggregation
Example
[Figure: AS1 and AS2 connected by eBGP, with iBGP inside AS1; routers R1 to R5. Labeled addresses: 10.10.1.1, 10.10.1.2, 10.10.1.3, 10.10.4.1, 10.10.4.2. Labeled networks: 10.10.3.0/24, 10.1.2.0/24, and 10.1.1.0/24, which is advertised once with MED 100 and once with MED 200.]
• Routing table at R2
– Reach 10.1.2.0/24 through 10.10.1.2
– Reach 10.10.3.0/24 through 10.10.4.1
• Routing table at R3
– Reach 10.1.2.0/24 through 10.10.1.2
– Reach 10.10.3.0/24 through 10.10.4.2
– Reach 10.1.1.0/24 via 10.10.1.2
• MED: R3 will assume AS2 wants it to use R4 to reach 10.1.1.0/24 because its MED
is lower.
• NEXT_HOP: R4 advertises 10.1.2.0/24 to R3 (eBGP) with a next hop of 10.10.1.2
(IP address of BGP peer).
• R3 should advertise 10.1.2.0/24 using iBGP with a next hop of 10.10.1.2 (as in
eBGP). The reason is that R3 is not an immediate neighbor of R1 or R2; R1 and R2
should update their routing table information for 10.1.2.0/24 with the next hop
to reach 10.10.1.2 based on their IGP information.
• 10.10.3.0/24 is CIDR (Classless Interdomain Routing) notation; 24 is the number
of network mask bits, so the network prefix here is 10.10.3 and the mask is
255.255.255.0.
• 205.100.0.0/22 means the mask is 255.255.252.0, so the prefix range runs from
205.100.0.0 to 205.100.3.0 (addresses 205.100.0.0 through 205.100.3.255).
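The CIDR arithmetic in the example can be checked with Python's standard `ipaddress` module. The helper names `describe` and `is_more_specific` are hypothetical; the second helper illustrates the "more specific route" idea (a longer prefix contained in a shorter one).

```python
import ipaddress


def describe(cidr):
    """Return (mask, first address, last address) for a CIDR prefix."""
    net = ipaddress.ip_network(cidr)
    return str(net.netmask), str(net.network_address), str(net.broadcast_address)


def is_more_specific(a, b):
    """True if prefix a is a longer prefix contained in prefix b."""
    net_a, net_b = ipaddress.ip_network(a), ipaddress.ip_network(b)
    return net_a != net_b and net_a.subnet_of(net_b)


print(describe("10.10.3.0/24"))    # ('255.255.255.0', '10.10.3.0', '10.10.3.255')
print(describe("205.100.0.0/22"))  # ('255.255.252.0', '205.100.0.0', '205.100.3.255')
```

`subnet_of` requires Python 3.7 or later.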
The BGP decision algorithm
• After a BGP router receives updates about different
destinations from its peers, it has to decide which
path to choose in order to reach a specific
destination.
• BGP will choose only a single path to reach a specific
destination.
• The decision process is based on different attributes,
such as next hop, local preference, the route origin,
and so on.
• BGP will always propagate the best path to its
neighbors.
How BGP selects a path to a destination
BGP selects one path as the best path to a destination, places it in its routing table, and
propagates the path information to its neighbors (Cisco web page).
1. If the path specifies a next hop that is inaccessible, drop the update.
2. Prefer the path with the largest Weight (Weight is a Cisco-specific concept, not part of
BGP: a locally assigned number; routes with higher weights are preferred).
3. If the weights are the same, prefer the path with the largest Local Preference.
4. If the Local Preference is the same, prefer the route that was originated by BGP
running on this router.
5. If no route was originated, prefer the shorter AS_PATH.
6. If all paths have the same AS_PATH length, prefer the lowest origin code
(IGP < EGP < INCOMPLETE).
7. If the origin codes are the same, prefer the path with the lowest MED.
8. If all paths have the same MED, prefer the external path over the internal path.
9. If all paths are still the same, prefer the path through the closest IGP
neighbor.
10. Prefer the route with the lowest IP address value as specified by the
BGP router ID.
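The ten steps above can be sketched as a filter followed by a lexicographic comparison. This is an illustration under stated assumptions, not a router implementation: the `Path` class and its field names are hypothetical, the default Local Preference of 100 is an assumption, and MED is compared across all paths even though it is only meaningful between paths from the same neighboring AS.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Path:
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100             # assumption: illustrative default
    locally_originated: bool = False
    as_path: tuple = ()
    origin: int = 0                   # 0 = IGP, 1 = EGP, 2 = INCOMPLETE
    med: int = 0
    external: bool = True             # True if learned over eBGP
    igp_cost: int = 0                 # distance to the next hop
    router_id: str = "0.0.0.0"


def preference_key(p):
    """Smaller tuples are preferred; entries follow steps 2 to 10 in order."""
    return (
        -p.weight,                    # 2. largest Weight
        -p.local_pref,                # 3. largest Local Preference
        not p.locally_originated,     # 4. originated on this router
        len(p.as_path),               # 5. shorter AS_PATH
        p.origin,                     # 6. lowest origin code
        p.med,                        # 7. lowest MED
        not p.external,               # 8. external over internal
        p.igp_cost,                   # 9. closest IGP neighbor
        int(ipaddress.IPv4Address(p.router_id)),  # 10. lowest router ID
    )


def best_path(paths):
    """Step 1 drops paths with an inaccessible next hop, then the key decides."""
    usable = [p for p in paths if p.next_hop_reachable]
    return min(usable, key=preference_key) if usable else None
```

Because the key is a tuple, a later step is consulted only when all earlier steps tie, which mirrors the "if the same, then" wording of the list.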
BGP finite state machine
• Idle state: In this state BGP refuses all incoming BGP
connections. No resources are allocated to the peer.
• Connect state: In this state BGP is waiting for the transport
protocol (TCP) connection to be completed.
• Active state: In this state BGP is trying to acquire a peer by
initiating a transport protocol connection. When done, it sends
an OPEN message.
• OpenSent state: In this state BGP waits for an OPEN message
from its peer.
• OpenConfirm state: In this state BGP waits for a
KEEPALIVE or NOTIFICATION message.
• Established state: In the Established state BGP can exchange
UPDATE, NOTIFICATION, and KEEPALIVE messages with
its peer.
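The six states can be sketched as a transition table. The event names are hypothetical, and the table is a simplification of the state machine in RFC 1771: timers are left out, and any event not listed (including an error or a received NOTIFICATION) returns the session to Idle.

```python
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_open"): "OpenSent",         # OPEN is sent on this transition
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_open"): "OpenSent",          # OPEN is sent on this transition
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state, event):
    """Look up the next state; unlisted (state, event) pairs fall back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state
```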
References
• Section 8.7.3 of Communication Networks by A. Leon-Garcia and I. Widjaja
• RFC 1771 (can be obtained from www.ietf.org)
• “Using BGP for inter-domain routing”
http://www.cisco.com/univercd/cc/td/doc/cisintwk/ics/icsbgp4.htm
• “BGP case studies”
http://www.cisco.com/warp/public/459/bgptoc.html