Investigating the Causes of Inter-Domain Routing Instability


Investigation of Global Network Routing Behavior
BJ Premore
Dartmouth College
Prof. David Nicol, Advisor
December 8, 2000
In collaboration with
Jim Cowie, Renesys Corporation
Tim Griffin, AT&T Labs-Research
Andy Ogielski, Renesys Corporation
… and several other colleagues
Overview
• Objectives
– better understand inter-domain routing dynamics
– explore impact of implementation tradeoffs
– explore extensions before deployment
– provide a useful tool for researchers
• Implementation
– simulation architecture
– BGP functionality
– validation
• Research Applications
– convergence (ongoing)
– security (ongoing)
– timing, policy interaction, proposed extensions, etc.
Simulation Architecture
DML = Domain Modeling Language
- model configuration
SSFNet = SSF Network Models
- compositional approach to large network design
- not independent
SSF = Scalable Simulation Framework
- a modern standard for discrete-event simulation of large, complex systems
- multiple implementations
- the “engine under the hood”
Simulation Layers
[Figure: layered architecture diagram, top to bottom]
DML Configurations (Model Instances)
– configure
SSFNet (Network Components as Java Classes)
– enhances
Simulator Implementations: DaSSF (C++), CSSF (C), Raceway (Java)
– each implements
SSF standard (Simulator API)
Why Another Simulator?
• Fully Integrated Network Environment
– control over more than just BGP
– e.g., TCP/IP, traffic, router & link hardware
• Scalability
– designed to handle large, complex simulations
– tens of thousands of multi-protocol nodes
• Design Trade-off Toggles
– e.g., tie-breaking in route selection
– e.g., apply MinAdver timer to withdrawals
• Explore Impact of New Functionality
– before it goes live!
– e.g., MPLS; protocol extensions
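The MinAdver toggle named on this slide can be illustrated with a small Python sketch. It is not SSF.OS.BGP4 code: the class `MinAdverLimiter`, its method `send_time` and the flag `apply_to_withdrawals` are hypothetical names chosen for illustration, and the model assumes one timer per peer with a default interval of 30 seconds.

```python
from dataclasses import dataclass, field


@dataclass
class MinAdverLimiter:
    """Per-peer rate limiter for routing updates (illustrative only)."""
    min_adver_time: float = 30.0         # minimum spacing between sends, in seconds
    apply_to_withdrawals: bool = False   # the design trade-off toggle
    _last_sent: float = field(default=float("-inf"))

    def send_time(self, now: float, is_withdrawal: bool) -> float:
        """Return the time at which an update generated at `now` may be sent."""
        if is_withdrawal and not self.apply_to_withdrawals:
            return now                   # withdrawals bypass the timer
        # Otherwise wait until the interval since the last send has elapsed.
        earliest = max(now, self._last_sent + self.min_adver_time)
        self._last_sent = earliest
        return earliest


limiter = MinAdverLimiter(min_adver_time=30.0, apply_to_withdrawals=False)
first = limiter.send_time(0.0, is_withdrawal=False)     # sent at once
second = limiter.send_time(5.0, is_withdrawal=False)    # held back by the timer
withdrawn = limiter.send_time(6.0, is_withdrawal=True)  # not held back
```

With the toggle off, a withdrawal leaves immediately even while an advertisement is being held back. With it on, withdrawals queue behind the same timer.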
Pros and Cons
• We can’t …
– expect to model real-world routers perfectly with every detail
• We can …
– capture the most important characteristics
– change and tweak the protocol
– explore consequences of fundamental design of BGP
– explore proposed and novel protocol extensions
– evaluate and analyze collective behavior at large scale
SSFNet Layer
[Figure: the layered architecture diagram from the Simulation Layers slide, repeated to introduce the SSFNet layer (Network Components as Java Classes)]
Example SSFNet Components
[Figure: component diagram]
physical entities: router, link, host
logical containers: Net, protocol graph
protocols: IP, Sockets, BGP, TCP, FTP, HTTP, OSPF
SSF.OS.BGP4
• Based on RFCs
• RFC 1771: BGP-4 and latest drafts
• RFC-compliant implementation
• Includes some RFC-specified extensions (Route Reflection)
• Has features similar to those used by vendors (policy-based filtering)
SSF.OS.BGP4 Functionality
– Finite state machine, timers, RIB
– TCP transport
– Peering: exterior and interior
– Route reflection
– Messages and path attributes
– Policy
» filter based on path attribute
» attribute modification
– Monitoring of protocol operation
» gather stats on practically any event of interest
Package SSF.OS.BGP4 Organization
[Figure: class organization diagram; recoverable labels listed below]
BGPSession
PeerEntry (shown twice)
RIBIn, LocRIB, RIBOut
Policy Rule (inbound), Policy Rule (outbound)
Timers: ConnRetry, KeepAlive, Hold, MinAdver
Validation Methodology
• No standard validation suite exists, so we created our own
• Basic behavior in simple topologies
– Peering session maintenance (Hold & KeepAlive timer operation)
– Route advertisement and withdrawal
– Route selection
– Reflection
– Internal BGP
• General behavior in complex topologies
– End-to-end data delivery
– Exercises basic behaviors as well
• Policy testing
– Converging and non-converging gadgets [Griffin 1999]
Example: Route Reflection Validation Test Topology
[Figure]
Another Test Topology
[Figure]
Large Network Example
[Figure]
Example With Monitoring Filters
[Figure]
DML Example
[Figure: host 1 (interface 1) linked to router 2 (interfaces 1 to 4)]
host [
  id 1
  interface [ id 1 ]
]
router [
  id 2
  interface [ idrange [ from 1 to 4 ] ]
]
link [
  attach 1(1)
  attach 2(1)
]
DML: Adding Protocols
[Figure: protocol graph containing BGP, OSPF, TCP and IP]
router [
  graph [
    ProtocolSession [
      name bgp
      use SSF.OS.BGP4.BGPSession
    ]
    ProtocolSession [
      name ospf
      use SSF.OS.OSPF.sOSPF
    ]
    ProtocolSession [
      name tcp
      use SSF.OS.TCP.tcpSessionMaster
    ]
    ProtocolSession [
      name ip
      use SSF.OS.IP
    ]
  ]
]
Interesting Possibilities
– Better value for MinAdver timer?
– Improved route flap dampening?
– Policy studies
» How do various configurations affect convergence?
» Test effects of policy changes before deployment
– EGP-IGP interaction studies
» Are there instability side-effects?
» Is it safe to convert between different cost metrics?
– MPLS
» Will it have any unexpected effects on routing?
– Security studies
A Security Study
• Black Holes
• How many networks can/will be included?
• Parameters
– severity of misconfiguration or maliciousness
– number of misbehaving routers
– location of misbehaving routers
• Other Questions
– What is the impact of S-BGP on routing efficiency?
– Can attacks and misconfigurations be detected?
– How can we speed up convergence after an attack?
A Convergence Study
• Goals
• build upon previous work
– Labovitz, Ahuja, Bose & Jahanian 2000
– what factors contribute to observed dynamic behaviors?
• isolate contributions of different parameters
– policy, topology, iBGP, timers, etc.
• make recommendations for implementations (eventually)
– what changes can alleviate impact of various factors?
A Convergence Study
• Model Parameters
– topology: N ASes each with just 1 router
» shape: line, loop, wheel, meshes, grid
» size: vary N from 2 to 100
– policy
» permit all or typical customer/provider/peer
– link delay
» all equal or random
• Advertise, Withdraw, Wait and Watch
1. Wait for system to reach stable state, then …
2. Designated AS advertises a bogus destination to everyone else
3. Wait for system to reach a stable state again, then …
4. Designated AS tells everyone that the bogus route is no longer reachable through it
5. Wait for system to reach a stable state again
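The five steps above can be sketched as a toy discrete-event simulation in Python. This is only an illustration of the experiment's shape, not the SSFNet model: the function `simulate` and its helpers are hypothetical, every AS is a single node, the policy is permit-all with shortest-AS-path selection, all links share one fixed delay, and no MinAdver timer is modeled.

```python
import heapq
from itertools import count


def simulate(adj, origin, delay=1.0):
    """Run the advertise/withdraw experiment on an adjacency dict.

    Returns ((t_up, msgs_up), (t_down, msgs_down)): the time of the last
    delivered update and the number of updates delivered in each phase.
    """
    rib_in = {n: {} for n in adj}   # per node: neighbor -> AS path heard from it
    best = {n: None for n in adj}   # per node: currently selected AS path
    queue, seq = [], count()        # events are (time, seq, src, dst, path)

    def announce(node, now):
        # Send the node's best path (or a withdrawal, None) to every neighbor.
        path = None if best[node] is None else (node,) + best[node]
        for nbr in adj[node]:
            heapq.heappush(queue, (now + delay, next(seq), node, nbr, path))

    def drain():
        # Deliver updates until the queue is empty, i.e. the system is stable.
        now, delivered = 0.0, 0
        while queue:
            now, _, src, dst, path = heapq.heappop(queue)
            delivered += 1
            if dst == origin:
                continue            # the origin never replaces its own route
            if path is None or dst in path:
                rib_in[dst].pop(src, None)   # withdrawal or loop: drop src's route
            else:
                rib_in[dst][src] = path
            # Permit-all policy: shortest AS path wins, ties broken by contents.
            new_best = min(rib_in[dst].values(),
                           key=lambda p: (len(p), p), default=None)
            if new_best != best[dst]:
                best[dst] = new_best
                announce(dst, now)
        return now, delivered

    best[origin] = ()               # step 2: advertise the bogus destination
    announce(origin, 0.0)
    up = drain()                    # step 3: wait for a stable state
    best[origin] = None             # step 4: withdraw it
    announce(origin, 0.0)
    down = drain()                  # step 5: wait for a stable state again
    return up, down


# Full mesh of four single-router ASes, with AS 0 as the designated AS.
mesh4 = {n: [m for m in range(4) if m != n] for n in range(4)}
up, down = simulate(mesh4, origin=0)
```

In this toy model the withdrawal phase on a full mesh delivers more updates than the advertisement phase, because each node works through stale alternate paths before giving up on the destination.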
Simple Topologies
[Figure: one example of each topology]
line, loop, wheel, imesh (IBGP full mesh), emesh (EBGP full mesh), grid
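For readers who want to reproduce the shapes named on this slide, the following Python sketch builds each one as an adjacency dictionary. The function names are hypothetical, and the wheel is assumed to be a hub joined to every node of a loop. The sketch only produces connectivity; it does not distinguish internal from external BGP sessions, so one full-mesh generator stands in for both mesh variants.

```python
def line(n):
    """n nodes in a chain: 0 - 1 - ... - (n-1)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def loop(n):
    """n nodes in a ring."""
    return {i: sorted({(i - 1) % n, (i + 1) % n} - {i}) for i in range(n)}


def wheel(n):
    """Assumed shape: node 0 is a hub; nodes 1..n-1 form a ring around it."""
    rim = n - 1
    adj = {0: list(range(1, n))}
    for i in range(1, n):
        r = i - 1                                   # position on the rim
        ring = {1 + (r - 1) % rim, 1 + (r + 1) % rim} - {i}
        adj[i] = sorted({0} | ring)
    return adj


def full_mesh(n):
    """Every node adjacent to every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def grid(width, height):
    """width x height lattice; node id is y * width + x."""
    adj = {}
    for y in range(height):
        for x in range(width):
            nbrs = []
            for dx, dy in ((0, -1), (-1, 0), (1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    nbrs.append(ny * width + nx)
            adj[y * width + x] = sorted(nbrs)
    return adj
```

Each generator returns a symmetric adjacency dictionary, so the result can be fed to any simulation that expects a mapping from node to neighbor list.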
Line Experiment
fixed or random link delays
Loop Experiment
fixed link delays
Wheel Experiment
fixed link delays
IBGP Full Mesh Experiment
fixed link delays
EBGP Full Mesh Experiment
fixed link delay
Grid Experiment
fixed link delay, width=10, no policy
Preliminary Observations
• Convergence time is related to the number of alternate paths a router sees
– policy helps reduce the number of alternate paths
• Agreement with previous results
– full mesh experiments in particular
• Full external mesh is still the most interesting
– how many alternate paths are actually “seen” depends a lot on timing
– using random link delays reduced convergence time
Coming Soon …
– Functionality
» aggregation
» route flap dampening
» communities
» confederations
» and more …
– Experiments
» look for better timer values
» how does policy affect convergence?
» can we improve route flap dampening?
» test extensions and other proposed modifications
» and more …
For Further Information
SSF/Raceway and SSFNet:
http://www.ssfnet.org/
SSF.OS.BGP4:
http://www.cs.dartmouth.edu/~beej/research/bgp/java/
(or follow link from www.ssfnet.org)
This sample DML code configures an AS with a single router running BGP.
It explicitly configures all BGP attributes. It is taken from the
‘goodgadget’ validation test.
Net [
  id 1
  AS_status boundary
  router [
    id 1
    graph [
      ProtocolSession [
        name bgp use SSF.OS.BGP4.BGPSession
        autoconfig false
        connretry_time 120 min_as_orig_time 15
        reflector false
        neighbor [
          as 0 address 1(1) use_return_address 1(1)
          hold_time 90 keep_alive_time 30 min_adver_time 30
          infilter [ # give low priority to routes learned from 0
            clause [
              precedence 1
              predicate []
              action [
                primary permit
                atom [ attribute local_pref type set value 80 ]
              ]
            ]
          ]
          outfilter [ _extends .filters.permit_all ]
        ]
        neighbor [
          as 2 address 1(2) use_return_address 1(2)
          hold_time 90 keep_alive_time 30 min_adver_time 30
          infilter [ # give high priority to routes learned from 2
            clause [
              precedence 1
              predicate []
              action [
                primary permit
                atom [ attribute local_pref type set value 100 ]
              ]
            ]
          ]
          outfilter [ _extends .filters.permit_all ]
        ]
        neighbor [
          as 3 address 1(2) use_return_address 1(3)
          hold_time 90 keep_alive_time 30 min_adver_time 30
          infilter [ # deny all routes learned from 3
            clause [ precedence 1 predicate [] action [ primary deny ] ]
          ]
          outfilter [ _extends .filters.permit_all ]
        ]
      ]
      ProtocolSession [ name socket use SSF.OS.Socket.socketMaster ]
      ProtocolSession [ name tcp use SSF.OS.TCP.tcpSessionMaster ]
      ProtocolSession [ name ip use SSF.OS.IP ]
    ]
    interface [ idrange [ from 0 to 3 ] ]
  ]
  host [ id 101 _extends .basic_host ]
  link [ attach 1(0) attach 101(0) delay 0.001 ]
]
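The effect of the three infilters in this configuration can be illustrated with a short Python sketch. It is not the SSF.OS.BGP4 filter engine: the `Route` and `Clause` types, `apply_filter` and `select_best` are hypothetical. The sketch assumes that clauses are tried in ascending `precedence`, that an empty predicate matches every route, and that a route matching no clause is denied. Preferring the highest local_pref and then the shortest AS path follows standard BGP route selection.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    local_pref: int = 100


@dataclass(frozen=True)
class Clause:
    precedence: int
    predicate: Callable[[Route], bool]
    permit: bool
    set_local_pref: Optional[int] = None   # stands in for the `atom` in the DML


def apply_filter(clauses, route):
    """Return the (possibly modified) route, or None if it is denied."""
    for clause in sorted(clauses, key=lambda c: c.precedence):
        if clause.predicate(route):
            if not clause.permit:
                return None
            if clause.set_local_pref is not None:
                route = replace(route, local_pref=clause.set_local_pref)
            return route
    return None   # assumption: a route matching no clause is denied


def match_all(route):
    """Counterpart of the empty `predicate []` in the DML."""
    return True


# One infilter per neighbor AS, mirroring the configuration above.
INFILTERS = {
    0: [Clause(1, match_all, permit=True, set_local_pref=80)],
    2: [Clause(1, match_all, permit=True, set_local_pref=100)],
    3: [Clause(1, match_all, permit=False)],
}


def select_best(offers):
    """offers maps neighbor AS -> Route. Returns (neighbor, route) or None."""
    accepted = {}
    for nbr, route in offers.items():
        filtered = apply_filter(INFILTERS[nbr], route)
        if filtered is not None:
            accepted[nbr] = filtered
    if not accepted:
        return None
    # Highest local_pref first, then shortest AS path, then lowest neighbor AS.
    return min(accepted.items(),
               key=lambda kv: (-kv[1].local_pref, len(kv[1].as_path), kv[0]))


# Hypothetical offers for one destination "d" reachable via each neighbor.
offers = {0: Route("d", (0,)), 2: Route("d", (2, 0)), 3: Route("d", (3, 0))}
chosen = select_best(offers)
```

With these offers the route learned from AS 2 is chosen even though the route from AS 0 has a shorter AS path, because local_pref is compared first. The route from AS 3 never enters the comparison.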