Transcript defended

Reliable Internet Routing
Martin Suchara
Thesis advisor
Prof. Jennifer Rexford
June 15, 2011
The Importance of Service Availability
 Network service availability more important
than before
 New critical network applications
 VoIP, teleconferencing, online banking
 Applications moving to the cloud
 Latency and disruptions affect performance
of enterprise applications
 Routing is critical for availability
 Provides connectivity/reachability
2
Is Best Effort Availability Enough?
 Traditional approach: build reliable system out
of unreliable components
 Networks with rich connectivity
 Routing protocols that find an alternate path if
the primary one fails
 Transmission protocols
retransmit data lost during
transient disruptions
link cut
3
Better than Best-Effort Availability
 Improper load balancing → service disruptions
 Choose alternate paths after a link failure that
allow good load balancing
 Some configurations prevent convergence
 Router configurations that allow routing
protocols to (quickly) agree on a path
 False announcement → choice of wrong path
 Prevent adversarial attacks on the routing
system
4
The Three Problems
 Routers in a single autonomous system
search for optimal paths (after a failure)
 Cooperative model
 Rational autonomous systems with conflicting
business policies that do not allow them to
agree on a route selection
 Rational model
 Attacks by other autonomous systems
 Adversarial model
5
In This Work
FAILURE RESILIENT
ROUTING
BGP SAFETY
ANALYSIS
SECURE ROUTING WITH
SMALL GROUPS
Ensure good load
balancing after a failure
Detect conflicting
routing policies
Prevent malicious
route hijacks
Cooperative
Rational
Adversarial
Intradomain (MPLS)
Interdomain (BGP)
Interdomain (BGP)
Architecture design,
simulation and
optimization
Analysis of a
distributed system
Protocol design and
measurement-driven
simulation
Best paper award in
SIGMETRICS 2011
In INFOCOM 2011
Technical report and
CoNext workshop
6
PART I
Failure Resilient Routing
Simple Failure Recovery with Load Balancing
Martin Suchara
in collaboration with:
D. Xu, R. Doverspike,
D. Johnson and J. Rexford
Failure Recovery and Traffic
Engineering in IP Networks
 Uninterrupted data delivery when equipment
fails
 Re-balance the network load after failure
 Existing solutions either treat failure recovery
and traffic engineering separately or require
congestion feedback
 This work: integrated failure recovery and traffic
engineering with pre-calculated load balancing
8
Architectural Goals
1. Simplify the network
 Allow use of minimalist cheap routers
 Simplify network management
2. Balance the load
 Before, during, and after each failure
3. Detect and respond to failures
9
The Architecture – Components
 Management system
 Knows topology, approximate traffic
demands, potential failures
 Sets up multiple paths and calculates load
splitting ratios
 Minimal functionality in routers
 Path-level failure notification
 Static configuration
 No coordination with other routers
10
The Architecture
• topology design
• list of shared risks
• traffic demands
• fixed paths
• splitting ratios
t
0.25
s
0.25
0.5
11
The Architecture
• fixed paths
• splitting ratios
t
0.5
s
0.5
0
link cut
12
The Architecture: Summary
1. Offline optimizations
2. Load balancing on end-to-end paths
3. Path-level failure detection
How to calculate the paths and
splitting ratios?
13
Goal I: Find Paths Resilient to Failures
 A working path needed for each allowed failure
state (shared risk link group)
R1
e1
e4
e2
e3
R2
e5
Example of failure states:
S = {e1}, { e2}, { e3}, { e4}, { e5}, {e1, e2}, {e1, e5}
14
Goal II: Minimize Link Loads
failure states indexed by s
links indexed by e
cost
Φ(ues)
aggregate congestion cost
weighted for all failures:
ues =1 minimize ∑s ws∑e Φ(ues)
while routing all traffic
link utilization ues
failure state weight
Cost function is a penalty for approaching capacity
15
Possible Solutions
congestion
Suboptimal
solution
Good performance
and practical?
Solution not
scalable
capabilities of routers
 Too simple solutions do not do well
 Diminishing returns when adding functionality
16
Computing the Optimal Paths
 Solve a classical multicommodity flow for each
combination of edge failures:
min load balancing objective
s.t. flow conservation
demand satisfaction
edge flow non-negativity
 Decompose flow into paths and splitting ratios
 Paths used by our heuristics (coming next)
 Solution also a performance upper bound
17
1. State-Dependent Splitting:
Per Observable Failure
 Custom splitting ratios for each observed
combination of failed paths
 NP-hard unless paths are fixed
configuration:
0.4
0.2
Failure
Splitting Ratios
-
0.4, 0.4, 0.2
p2
0.6, 0, 0.4
…
…
p1
0.4 p2
p3
at most 2#paths
entries
0.6
0.4
18
2. State-Independent Splitting:
Across All Failure Scenarios
 Fixed splitting ratios for all observable failures
 Non-convex optimization even with fixed paths
 Heuristic to compute splitting ratios
 Average of the optimal ratios
configuration:
p1, p2, p3:
0.4, 0.4, 0.2
0.4
0.2
p1
0.4 p2
p3
0.667
0.333
19
Our Solutions
1. State-dependent splitting
2. State-independent splitting
How do they compare to the
optimal solution?
Simulations with shared risks for AT&T topology
 954 failures, up to 20 links simultaneously
20
objective value
Congestion Cost – AT&T’s IP
Backbone with SRLG Failures
State-independent
splitting
How do
we compare to OSPF?
not optimal but
Usesimple
optimized OSPF link
weights [Fortz, Thorup ’02].
State-dependent splitting
indistinguishable from optimum
increasing load
network traffic
Additional router capabilities improve
performance up to a point
21
objective value
Congestion Cost – AT&T’s IP
Backbone with SRLG Failures
OSPF with optimized link
weights can be suboptimal
increasing load
network traffic
OSPF uses equal splitting on shortest paths.
This restriction makes the performance worse.
22
cdf
Number of Paths – Various Topologies
number
numberofofpaths
paths
More paths for larger and more
diverse topologies
23
Summary
 Simple mechanism combining path protection
and traffic engineering
 Favorable properties of state-dependent
splitting algorithm:
(i) Near optimal load balancing
(ii) Simplifies network management and design
(iii) Small number of paths
(iv) Delay comparable to current OSPF
 Path-level failure information is just as
good as complete failure information
24
PART II
BGP Safety Analysis
The Conditions of BGP Convergence
Martin Suchara
in collaboration with:
Alex Fabrikant and
Jennifer Rexford
The Internet is a Network of Networks
 Previous part focuses on a single autonomous
system (AS)
 ~35,000 independently administered ASes
cooperate to find routes
 Some route policies do not allow convergence
 Past work: “reasonable” policies that are
sufficient for convergence
 This work: necessary and sufficient
conditions of convergence
26
The Border Gateway Protocol (BGP)
 BGP calculates paths to each address prefix
5
2
3
4
1
Prefix d
 Each Autonomous System (AS) implements its
own custom policies
 Can prefer an arbitrary path
 Can export the path to a subset of neighbors 27
Business Driven Policies of ASes
 Customer-Provider Relationship
 Provider exports its customer’s routes to
everybody
 Customer exports provider’s routes only to
downstream customers
 Peer-Peer Relationship
 Export only customer routers to a peer
 Export peer routes only to customers
28
BGP Safety Challenges
 35,000 ASes and 300,000 address blocks
 Routing convergence usually takes minutes
 But the system does not always converge…
Prefer 120
to 10
Prefer 210
to 20
1
2
Use 120
10
20
Use 210
0
29
d
Results on BGP Safety
 Absence of a “dispute wheel” sufficient for
safety (Griffin, Shepherd, Wilfong, 2002)
 Necessary or sufficient conditions of safety
(Gao and Rexford, 2001), (Gao, Griffin and Rexford, 2001), (Griffin,
Jaggard and Ramachandran, 2003), (Feamster, Johari and
Balakrishnan, 2005), (Sobrinho, 2005), (Fabrikant and
Papadimitriou, 2008), (Cittadini, Battista, Rimondini and Vissicchio,
2009), …
 Verifying safety is computationally hard
(Fabrikant and Papadimitriou, 2008), (Cittadini, Chiesa, Battista and
Vissicchio, 2011)
30
Models of BGP
 Existing models (variants of SPVP)
 Widely used to analyze BGP properties
 Simple but do not capture spurious
behavior of BGP
 This work
 A new model of BGP with spurious updates
 Spurious updates have major consequences
 More detailed model makes proofs easier!
31
SPVP– Traditional Model of BGP
(Griffin and Wilfong, 2000)
The higher the more
preferred
Permitted paths
Always
includes the
empty path
120
10
210
20
ε
The topology
Selected path: 210
ε
1
2
0
The destination
 Activation models the processing of BGP
update messages sent by neighbors
 System is safe if all “fair” activation sequences
32
lead to a stable path assignment
What are Spurious Updates?
 A phenomenon: router announces a route
other than the highest ranked one
1230
10
Spurious BGP
update 230:
230
1
2
210
20
230
Selected path: 20
3
30
0
 Behavior not allowed in SPVP
33
What Causes Spurious Updates?
1. Limited visibility to improve scalability
 Internal structure of ASes
 Cluster-based router architectures
2. Timers and delays to prevent instabilities and
reduce overhead
 Route flap damping
 Minimal Route Advertisement Interval timer
 Grouping updates to priority classes
 Finite size message queues in routers
34
DPVP– A More General Model of BGP
 DPVP = Dynamic Path Vector Protocol
 Transient period τ after each route change
 Spurious updates with a less preferred
recently available route
 Only allows the “right” kind of spurious updates
 Every spurious update has a cause in BGP
 General enough and future-proof
35
DPVP– A More General Model of BGP
The permitted paths
and their ranking
Spurious update
Selected path: 210
120
10
ε
210
20
20
1
ε
2
0
Remember all
recently available
paths (e.g. 20, 210)
StableTime = τ after
last path change
 Spurious updates are allowed only if
current time < StableTime
 Spurious updates may include paths that were
36
recently available or the empty path
Consequences of Spurious Updates
 Spurious behavior is temporary, can it have
long-term consequences?
 Yes, it may trigger oscillations in otherwise safe
configurations!
 Which results do not hold in the new model?
37
Analogs of Previous Results in DPVP
 Most previous results in SPVP also hold for
DPVP
 Absence of a “dispute wheel” sufficient for
safety in SPVP (Griffin, Shepherd, Wilfong, 2002)
 Still sufficient in DPVP
 Some results cannot be extended
 Slightly different conditions of convergence
 Exponentially slower convergence possible
38
DPVP Makes Analysis Easier
 No need to prove that:
 Announced route is the highest ranked one
 Announced route is the last one learned from
the downstream neighbor
 We changed the problem
 PSPACE complete vs. NP complete
39
Necessary and Sufficient Conditions
 How can we prove a system may oscillate?
 Classify each node as “stable” or “coy”
 At least one “coy” node exists
 Prove that “stable” nodes must be stable
 Prove that “coy” nodes may oscillate
Easy in a model with
spurious announcements
40
Necessary and Sufficient Conditions
 Definition: CoyOTE is a triple
satisfying several conditions
1230
10
1
2
Coy nodes may make
spurious announcements
210
20
230
30
3
0
(C, S, Π)
Stable nodes have
a permanent path
One path assigned to each node
proves if the node is coy or stable
 Theorem: DPVP oscillates if and only if
it has a CoyOTE
41
Verifying the Convergence
Conditions = Finding a CoyOTE
 In general an NP-hard problem
 Can be checked in polynomial time for most
“reasonable” network configurations!
e.g.
(i) filter paths violating business relationships
(ii) prefer paths not containing certain AS numbers
(iii) prefer paths from certain groups of neighbors
(iv) prefer shorter paths over longer ones
(v) prefer paths from a lowest AS number neighbor
42
DeCoy – Safety Verification Algorithm
 Goal: verify safety in polynomial time
 Key observation: greedy algorithm works!
1. Let the origin be in the stable set S
2. Keep expanding the stable set S until stuck
 If all nodes become stable system is safe
 Otherwise system can oscillate
43
Summary
 DPVP: best of both worlds
 More accurate model of BGP
 Model simplifies theoretical analysis
 Key results
(i) Spurious announcements are real
(ii) Safe instances in SPVP may oscillate in DPVP
(iii) No dispute wheel → safety
(iv) Necessary and sufficient conditions of
convergence, can be found in polynomial time
44
PART III
How Small Groups
can Secure Routing
Martin Suchara
in collaboration with:
Ioannis Avramopoulos
and Jennifer Rexford
Vulnerabilities – Example 1

Invalid origin attack
 Nodes 1, 3 and 4 route to the adversary
 The true destination is blackholed
1
2
3
4
6
12.34.*
Attacker
5
Genuine
origin
7
46
12.34.*
Vulnerabilities – Example 2

Adversary spoofs a shorter path
 Node 4 routes through 1 instead of 2
 The traffic may be blackholed or intercepted
No attack
1
2
3
6
4
Thinks route
thru 2 shorter
5
Genuine
origin
7
47
12.34.*
Vulnerabilities – Example 2

Adversary spoofs a shorter path
 Node 4 routes through 1 instead of 2
 The traffic may be blackholed or intercepted
Announce
17
1
2
3
6
4
Thinks route
thru 1 shorter
5
Genuine
origin
7
48
12.34.*
State of the Art – S-BGP and soBGP

Mechanism: identify which routes are invalid
and filter them

S-BGP
 Certificates to verify origin AS
 Cryptographic attestations added to routing
announcements at each hop

soBGP
 Build a (partial) AS level topology database
49
How Our Solution Helps

Benefits of previous solutions only for large
deployments (10,000 ASes)
 No incentive for early adopters

Our goal: Provide incentives to early adopters!

The challenge: few participants relying on many
non-participants

Our Solution: raise the bar for the adversary
significantly
 10-20 cooperating nodes
50
Lessons Learned from Experimentation
Observation
Justification
Participation of large ISPs
is important
They learn many routes
some of which are valid
Perfect detection of bad
routes is desirable
Better (but not ideal)
performance
The non-participants are
worse off than the
participants
The participants reject
implicated routes while nonparticipants accept all
Need to increase path
diversity
Perfect detection not enough
51
Our Approach – Key Ideas

Circumvent the adversary
with secure overlay routing

Hijack the hijacker: all
participants announce the
protected prefix

Hire a few large ISPs to help

Detect invalid routes
accurately with data plane
detectors
52
Our Approach – Key Ideas

Circumvent the adversary
with secure overlay routing

Hijack the hijacker: all
participants announce the
protected prefix

Hire a few large ISPs to help

Detect invalid routes
accurately with data plane
detectors
53
Our Approach – Key Ideas

Circumvent the adversary
with secure overlay routing

Hijack the hijacker: all
participants announce the
protected prefix

Hire a few large ISPs to help

Detect invalid routes
accurately with data plane
detectors
54
Our Approach – Key Ideas

Circumvent the adversary
with secure overlay routing

Hijack the hijacker: all
participants announce the
protected prefix

Hire a few large ISPs to help

Detect invalid routes
accurately with data plane
detectors
55
Secure Overlay Routing (SBone)
Protects intra-group traffic
 Overlay of participants’
Usenetworks
peer
route
 Bad paths detected by
probing

Use provider
route
Use longer
route
1
2
Participant
5
Nonparticipant
4
3
Detected as
6
12.34.* ; 12.34.1.1
bad
7
56
12.34.* ; 12.34.1.1
Secure Overlay Routing (SBone)

Traffic may go through an intermediate node
Uses path through
intermediate node 3
Forwards
traffic for 1
?
1
2
?
?
5
?
12.8.1.1
4
3
12.34.1.1
12.34.* ; 12.8.1.1
6
7
57
12.34.* ; 12.34.1.1
Percentage of Secure Participants
SBone – 30 Random + Help of Some
Large ISPs
5 large ISPs
3 large ISPs
1 large ISP
0 large ISPs
Group Size (ASes)
58
SBone – Multiple Adversaries
Percentage of Secure Participants
Solution: enlist more large ISPs!
With 5 adversaries, the
performance degrades
5 large ISPs
3 large ISPs
1 large ISP
0 large ISPs
Group Size (ASes)
59
SBone – Properties
Observation
Justification
SBone offers good
availability even for very
small groups
It better exposes path
diversity
Non-participants are not
secure yet
They lack the ability to
tunnel around problems
60
Hijacking the Hijacker – Shout
Secure traffic from non-participants
Prefers
Use
shortest
short
path
customer’s
 All participants
announce
the protected prefix
path
1
4
leading
12.34.* to adversary
 Once the traffic enters the overlay, it is securely
forwarded to the true prefix owner

1
2
12.34.*
3
6
12.34.*
4
5
12.34.*
Node 4 shouts
7
61
12.34.*
Percentage of Secure ASes
Shout + SBone – 1 Adversary
With as few as 10 participants
+ 3 large ISPs, 95% of all
ASes can reach the victim!
5 large ISPs
3 large ISPs
1 large ISP
0 large ISPs
Group Size (ASes)
62
Percentage of Secure ASes
Shout + SBone – 5 Adversaries
More adversaries 
larger groups required!
5 large ISPs
3 large ISPs
1 large ISP
0 large ISPs
Group Size (ASes)
63
Shout – Properties
Observation
Justification
Can secure communication It suffices if non-participant
from non-participants
reaches any participant
Routing table sizes do not
increase
Increases < 5%
Shout does not inflate path
lengths significantly
Path lengths increase by
<15% with 3 large ISPs
64
Summary

SBone and Shout are novel mechanisms that
allow small groups to secure BGP

The proposed solution
(i) Secures address space of a small group of
participants
(ii) Allows both participants and non-participants
pick valid routes
(iii) Provides incentives to the adopters
65
Conclusion
Better than Best-Effort Availability
 Our three solutions:
(i) Allow routers in a single AS to cooperatively find
failure resilient paths and balance the load
(ii) Major step towards allowing rational ASes to
verify their configurations do not prevent route
convergence
(iii) Small number of participating ASes can protect
themselves against malicious BGP attacks
 Improved reliability of the Internet
67
Thank You!
68