IPMA & NID Talk - Illinois - Cornell

Scalability & Stability of the Internet
Infrastructure
Farnam Jahanian
Department of EECS
University of Michigan
<[email protected]>
Context
LIGHTHOUSE: Survivable Network Infrastructure
[Diagram components: Network Infrastructure; Active Response Capabilities; Anomalous Network Events; Analysis Engines; Coarse and Fine Grained Measurement Tools]
[Diagram items: Routers, Name Servers, Critical Services, Protocol Scrubbers, Replication schemes, Countermeasures, Network Attacks, Operational Faults, S/H Failures, Netflow Statistics, Event Aggregation, Data Mining, Windmill Probes]
Joint projects between U. Michigan & Merit Network
Motivation

Increasing reliance of financial and national utility infrastructures
on interconnected IP-based networks

Explosive growth in both size and topological complexity of the
underlying communication infrastructure

Reliance on off-the-shelf infrastructure & shrink-wrapped code

Network infrastructure is vulnerable:
– inherent instability and transient oscillations
– delayed convergence and long failover
– coordinated denial of service attacks on network resources
– hardware and software failures
– operational faults and misconfigurations
Imminent Collapse of the Internet
[Figure labels: "Now", "Collapse of the Internet?"]
Internet Growth
Explosive growth in both size and topological
complexity



Internet end-system growth
Traffic volume & characteristics
Infrastructure topological evolution
Infrastructure Topological Evolution
Between 1995-1999:

Decentralization: from a single backbone network to a
conglomeration of hundreds of backbones and thousands of ISPs.

Loss of hierarchy and abstraction: from a strictly hierarchical
network to an increasingly full-mesh interconnection.

Significant bandwidth increase: from a single T3 (45 Mbps) circuit
and T1 (1.5 Mbps) links to multiple OC-48 (2.4 Gbps) circuits and OC-12
(622 Mbps) lines between nodes.
Internet Evolution: NSFNet
[Diagram: NSFNet Backbone connected via Hello/EGP to Regional networks, which connect Campus networks]
Hierarchical network with a single central backbone
Internet Evolution: Today
[Diagram: autonomous systems AS1, AS2, AS3, AS4 and customers C1, C2, C3, C4]
Full-mesh interconnection of ISP backbones and customers
Impact of Instability & Failures
– Increased end-to-end Loss/Latency
– Increased delay in convergence & network reachability
– Backbone infrastructure CPU/Memory requirements
– Backbone “route flap storms”
– Network management complexity
Background: Internet Architecture
[Diagram: networks interconnected via BGP]
Background: Internet Routing

Two major categories
– Inter-domain (BGP between autonomous systems)
– Intra-domain (OSPF, IS-IS, IGRP inside an AS)

BGP
– Incremental: announcements and withdraws
– Updates include policy (e.g. MED, ASPath)
– Maintain multiple possible routes
Background: BGP Routing Protocol

BGP is an incremental protocol that sends update
information only upon changes in network topology or
routing policy.

Two forms of messages:
– announcements:
  • New network accessible
  • Prefer another route to network destination
– withdrawals:
  • Destination network is no longer accessible

Routing policies vs. shortest number of hops
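The incremental behavior described above can be sketched as a small routing table that changes only when an announcement or withdrawal arrives. This is an illustrative sketch, not a BGP implementation: the `Rib` class, its method names, the peer labels, and the shortest-AS-path selection are all assumptions made for the example. Real BGP applies routing policy before path length.

```python
from dataclasses import dataclass, field


@dataclass
class Rib:
    """Toy routing table keyed by prefix (hypothetical, for illustration)."""
    # prefix -> {peer: AS path announced by that peer}
    routes: dict = field(default_factory=dict)

    def announce(self, peer, prefix, as_path):
        """Record a new or replacement route from a peer."""
        self.routes.setdefault(prefix, {})[peer] = tuple(as_path)

    def withdraw(self, peer, prefix):
        """Remove a peer's route; return False if there was nothing to remove."""
        paths = self.routes.get(prefix, {})
        if peer not in paths:
            return False
        del paths[peer]
        if not paths:
            del self.routes[prefix]
        return True

    def best(self, prefix):
        """Pick the shortest AS path (assumed rule; policy is ignored here)."""
        paths = self.routes.get(prefix)
        if not paths:
            return None
        return min(paths.values(), key=len)


rib = Rib()
rib.announce("peerA", "192.0.2.0/24", [65001, 65010])
rib.announce("peerB", "192.0.2.0/24", [65002, 65020, 65010])
best_before = rib.best("192.0.2.0/24")   # route via peerA (shorter path)
rib.withdraw("peerA", "192.0.2.0/24")
best_after = rib.best("192.0.2.0/24")    # falls back to the route via peerB
```

Keeping every peer's route, rather than only the best one, is what lets the table fall back immediately after a withdrawal.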
Background: Internet Core

Networks aggregated into CIDR (Classless Inter-Domain
Routing) prefixes

Prefix represents a set of destination IP addresses

At the Internet “core”, all routers maintain paths to “default-free” routes

Originally 5 major Internet Exchange Points (IXPs)

In 1996, approximately 30,000 default-free routes
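As a small illustration of a prefix standing for a set of destination addresses, the sketch below uses Python's standard `ipaddress` module. The specific prefixes are documentation address ranges chosen for the example, not data from the study.

```python
import ipaddress

# A CIDR prefix represents a contiguous set of destination addresses.
prefix = ipaddress.ip_network("198.51.100.0/24")
addr_in = ipaddress.ip_address("198.51.100.42") in prefix
addr_out = ipaddress.ip_address("203.0.113.7") in prefix
size = prefix.num_addresses

# Two adjacent /25 networks aggregate into the single /24 above.
halves = [
    ipaddress.ip_network("198.51.100.0/25"),
    ipaddress.ip_network("198.51.100.128/25"),
]
aggregated = list(ipaddress.collapse_addresses(halves))
```

Aggregation of this kind is what keeps the number of routes carried in the core far smaller than the number of individual networks.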
Roadmap

Study of stability of routing in the Internet backbone
– Transient oscillations, pathological redundant updates
– congestion collapse and correlation to network usage
– SIGCOMM’97 and INFOCOM’99

Study of route availability and failover rates
– long-term availability of Internet backbone routes
– Case study of regional provider
– FTCS’99

Study of convergence behavior of routing protocols
– Injection of route changes into the Internet backbone
– Impact of convergence delay on end-to-end path
– 18-month study & ongoing
Internet Exchange Points
Deployed probe machines at five public exchange points
Collected all routing updates at IXPs over a four-year period
Internet Routing Instability Results

Number of BGP routing updates exchanged per day in the
Internet core is orders of magnitude larger than expected.

Most routing information is dominated by pathological, or
redundant updates, which do not directly reflect changes in
routing policy or topology.

Instability and redundant updates exhibit a specific
periodicity of 30 and 60 seconds.

Instability and redundant updates show a surprising
correlation to network usage and exhibit corresponding
daily and weekly cyclic trends.
Instability Results (Continued)

Instability is not dominated by a small set of autonomous
systems or routes.

Instability is not disproportionately dominated by prefixes
of specific lengths, i.e. independent of aggregation.

Discounting policy fluctuation and pathological behavior,
there remains a significant level of Internet forwarding
instability.

Details: SIGCOMM’97 & INFOCOM’99
Growth in Routing State
Linear growth in routing table
Initial Findings
(SIGCOMM’97)

Up to 60 million BGP updates/day for only 30,000 default-free
routes!
– On avg. 2-6 Million withdraws per day (mostly duplicates)
– e.g., ISP A had 259 routes but withdrew 2.4 million routes

All state changes well distributed across prefix lengths
and autonomous systems

Unexpected frequency components
– 30 second inter-arrival time between updates
– Daily/weekly components
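One simple way to look for a frequency component like the 30-second one is to histogram the gaps between consecutive updates. The sketch below is a minimal example under stated assumptions: the function name, the 5-second bin width, and the timestamps are made up for illustration and are not the analysis method or data used in the study.

```python
from collections import Counter


def interarrival_histogram(timestamps, bin_seconds=5):
    """Count gaps between consecutive updates, rounded to the nearest bin."""
    ordered = sorted(timestamps)
    gaps = (later - earlier for earlier, later in zip(ordered, ordered[1:]))
    return Counter(int(round(gap / bin_seconds)) * bin_seconds for gap in gaps)


# Made-up update timestamps (seconds): mostly 30 s apart, a few irregular gaps.
timestamps = [0, 30, 60, 91, 120, 127, 150, 180, 241]
histogram = interarrival_histogram(timestamps)
dominant_gap, count = histogram.most_common(1)[0]
```

A sharp peak in one bin, as opposed to a smooth spread, is the kind of signature that suggests a timer rather than real topology change.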
More Initial Observations

Most routing updates pathological (millions!)
– Some due to misconfiguration
 Private networks
 Host routes
 Multicast routes
– Majority duplicate updates
 Duplicate withdraws (WWDup > 99.99%)
 Duplicate announcements (AADup)
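The duplicate categories can be illustrated with a small classifier that compares each update with the previous one for the same peer and prefix. This is a sketch under assumptions: the tuple layout, the "other" label, and the choice to count a withdrawal for a never-announced prefix as WWDup are illustrative decisions, not the exact taxonomy from the papers.

```python
def classify_updates(updates):
    """Label each (peer, prefix, kind, attrs) update relative to the previous
    update seen for the same (peer, prefix) pair. kind is "A" or "W"."""
    last = {}
    labels = []
    for peer, prefix, kind, attrs in updates:
        key = (peer, prefix)
        prev = last.get(key)
        if kind == "W" and (prev is None or prev[0] == "W"):
            # Withdrawal for a prefix that is already unreachable via this peer
            # (assumption: "never announced" counts as unreachable).
            labels.append("WWDup")
        elif kind == "A" and prev == ("A", attrs):
            # Announcement identical to the one already in effect.
            labels.append("AADup")
        else:
            labels.append("other")
        last[key] = (kind, attrs)
    return labels


updates = [
    ("peerA", "192.0.2.0/24", "A", (65001,)),
    ("peerA", "192.0.2.0/24", "A", (65001,)),
    ("peerA", "192.0.2.0/24", "W", None),
    ("peerA", "192.0.2.0/24", "W", None),
    ("peerB", "192.0.2.0/24", "W", None),
]
labels = classify_updates(updates)
```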
BGP Updates
30 Second Frequency Components 1997
Origins of Pathological Updates
(INFOCOM’99)

Majority stem from two router software implementation
issues:
– stateless BGP withdraws
– non-transitive attribute filtering

Frequency due to non-jittered router timers
– lack of precise specification

Other sources of pathologies:
– BGP/IBGP misconfiguration
– DSU/CSU oscillation
– Distance-vector algorithm behavior
After Initial Publication of Results

One popular vendor validated our conjectures and released
updated software in 1997
– Software rapidly deployed by ISPs
– Stateful BGP reduced updates by orders of magnitude
– Addition of random intervals to timers diminished frequency
components
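The effect of adding random intervals to timers can be sketched as follows. The 30-second base interval matches the frequency component reported above, but the 25% jitter fraction, the function names, and the seeded generator are assumptions for illustration, not the vendor's actual change.

```python
import random


def next_fire_time(now, interval=30.0, jitter_fraction=0.25, rng=random):
    """Return the next timer expiry, shortened by a random amount of jitter."""
    return now + interval * (1.0 - jitter_fraction * rng.random())


def spread(fire_times):
    """Width of the window in which the timers fire."""
    return max(fire_times) - min(fire_times)


rng = random.Random(1)  # seeded only to make the example repeatable
# 100 routers whose timers all start at t=0.
fixed = [next_fire_time(0.0, jitter_fraction=0.0, rng=rng) for _ in range(100)]
jittered = [next_fire_time(0.0, rng=rng) for _ in range(100)]
```

Without jitter every timer fires at the same instant, so updates arrive in synchronized bursts. With jitter the expiries are spread over several seconds and the periodic component flattens out.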
BGP Announcements and Withdraws
[Figure annotations: NANOG presentation, ISP Geeks Release, Mainline Release]
Frequency Components
[Figure panels: 1997, 1998]
BGP Failures -- Congestion Collapse
(BGP Frequency)
A Short Story
SIGCOMM’97 findings were puzzling:
Bandwidth utilization correlated with instability
Hypothesis:

Congestion causes underlying TCP to back off

BGP-level timers expire, terminating the BGP session
Border Gateway Protocol (BGP)
[Diagram: BGP peering between MCI and Sprint]
Interdomain protocol between Autonomous Systems
Routing peers exchange reachability information incrementally
BGP uses TCP as the transport protocol between peer routers
BGP Congestion Collapse Hypothesis


Congestion causes underlying TCP to back off
BGP-level timers expire, terminating the BGP session
– Interaction between BGP and TCP leads to router congestion collapse
– High bandwidth utilization → BGP instability
– Validated using Windmill tool (SIGCOMM’98)
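The hypothesis can be illustrated with a toy model of a session timer: if congestion delays keepalive delivery long enough, the timer expires and the session is torn down. The 30-second keepalive interval, 90-second hold time, function names, and delay values below are assumptions chosen for the example, not measurements from the study.

```python
def session_survives(keepalive_arrivals, hold_time=90.0):
    """Return (True, None) if every gap between keepalive arrivals is within
    the hold time, else (False, time at which the hold timer expired)."""
    last_heard = 0.0
    for arrival in sorted(keepalive_arrivals):
        if arrival - last_heard > hold_time:
            return False, last_heard + hold_time
        last_heard = arrival
    return True, None


def delayed_arrivals(send_times, delays):
    """Arrival time of each keepalive after the given delivery delay."""
    return [sent + delay for sent, delay in zip(send_times, delays)]


send_times = [30.0 * i for i in range(1, 9)]  # keepalives sent every 30 s
light = delayed_arrivals(send_times, [0.1] * 8)
# Assumed congestion episode: TCP backoff stalls delivery, so the keepalives
# sent at 90 s through 180 s all arrive together at 200 s.
congested = delayed_arrivals(
    send_times, [0.1, 0.1, 110.0, 80.0, 50.0, 20.0, 0.1, 0.1]
)
ok_light, _ = session_survives(light)
ok_congested, expired_at = session_survives(congested)
```

Once the session drops, both peers must re-exchange their routes, which adds more load to an already congested link.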
What about Failures?

Some state changes due to policy changes & network failures

Cannot distinguish between policy, intra-domain and inter-domain failures

Methodology:
– Measure long-term rate of failure for Internet backbone routes
– Case study of regional provider
Internet Infrastructure Failures
(FTCS’99)

Internet significantly less reliable and available than the public
switched telephone network (PSTN).

After a network becomes unreachable, in most cases, it
takes longer than 5 mins before it is reachable again.

Even for transient oscillations, convergence of backbone
routing states may be on the order of minutes!

Route failover (re-routing of traffic to a given network)
occurs on average once every three days or more.

A small fraction of network paths contribute
disproportionately to number of long-term outages
Definitions

Route Failure: Prefix destination unavailable for 30 or
more minutes

Route Repair: A failed route becomes available

Route Failover: A route replaced with one associated with
a different path
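The route failure definition above can be applied to an availability timeline as in the following sketch. The event format and function name are hypothetical; only the 30-minute threshold comes from the definition.

```python
FAILURE_THRESHOLD_S = 30 * 60  # "30 or more minutes", from the definition


def route_failures(events):
    """events: time-ordered (timestamp_s, available) pairs for one prefix.
    Returns (start, end) pairs for outages lasting at least the threshold;
    each end time is the corresponding route repair."""
    failures = []
    down_since = None
    for timestamp, available in events:
        if not available and down_since is None:
            down_since = timestamp
        elif available and down_since is not None:
            if timestamp - down_since >= FAILURE_THRESHOLD_S:
                failures.append((down_since, timestamp))
            down_since = None
    return failures


# Made-up timeline: a 5-minute outage, then a 40-minute outage.
events = [(0, True), (600, False), (900, True), (4000, False), (6400, True)]
failures = route_failures(events)
```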
Route Failures: How long before a network is
unreachable?
Route Repairs: How long before a network is
reachable again?
Failover: How long before traffic is re-routed?
Conventional Wisdom on Convergence

Internet is highly redundant
– Just reroute around in a few milliseconds

Routing protocol convergence takes only a few ????

“Bad news travels fast”
– Fast withdraw propagation valid goal
Not True!
– Announcements slower because bundled

BGP has great convergence properties
– Path vector solved the convergence and count-to-infinity
(looping) problems

All my customers are multi-homed, triple-homed
– Convergence -- what, me worry?
18-Month Study of Convergence Behavior

Instrument the Internet
– Inject routes into geographically and topologically
diverse provider BGP peering sessions (Japan,
Michigan, US Exchange Points, Canada, UK)
– Periodically fail and change these routes (i.e. send
withdraws or new attributes)
– Time events using ICMP ping and NTP-synchronized
BGP “routeviews” monitoring machines
– Wait 18 months… (50,000 routing events)
Passive & Active Measurement
Infrastructure
[Diagram labels: Fault Injection Server, Stub AS (two), Upstream ISP1 through ISP6, BGP sessions, ICMP Echos, Internet, RouteViews Data Collection Probe]
Terminology

Tdown: A previously available route is withdrawn. This is a
route failure.

Tup: A previously unavailable route is announced as available.
This is a route repair.

Tshort: A route is replaced with another route having a shorter
path. This is a route failover.

Tlong: A route is replaced by another route with a longer path.
This is a route failover.
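The four event types can be expressed as a comparison of the route before and after a change. In this sketch the function name and the use of AS-path tuples (with `None` for "no route") are assumptions for illustration. Equal-length replacements fall outside the four categories and return `None`.

```python
def classify_event(old_path, new_path):
    """Classify a route change. Paths are AS-path tuples, or None when no
    route is available."""
    if old_path is None and new_path is None:
        return None
    if new_path is None:
        return "Tdown"   # previously available route withdrawn
    if old_path is None:
        return "Tup"     # previously unavailable route announced
    if len(new_path) < len(old_path):
        return "Tshort"  # replaced by a shorter path
    if len(new_path) > len(old_path):
        return "Tlong"   # replaced by a longer path
    return None          # same length: not one of the four categories


primary = (65001, 65010)
backup = (65002, 65020, 65010)
kinds = [
    classify_event(primary, None),
    classify_event(None, primary),
    classify_event(backup, primary),
    classify_event(primary, backup),
]
```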
Avg. number of messages generated by
each ISP following a routing update event
[Bar chart: y-axis from 1 to 3.5; series Japan, Verio, Michnet, CANet; categories Tdown, Tlong, Tup, Tshort]
• Tdown and Tlong generated more messages than Tup and Tshort
• Significant variation among ISPs within each category of message
Withdraw Convergence (Tdown)
After a BGP route is withdrawn, barring other failures,
how long does it take Internet routing tables to reach
steady-state?
Withdraw Convergence
[Figure: cumulative percentage of faults (0 to 100) vs. seconds until convergence (0 to 160); series per.japan, per.canet, per.michnet, per.verio]
Convergence delay after a Tdown
Withdraw Convergence

Different providers exhibit different behavior

70% of withdraws from most ISPs take more than a minute

For the ISP in Canada, 20% of withdraws took more than three minutes
to converge

Observed latencies of up to 10 mins for certain events

No correlation between convergence latency and geography or
topology (except for MichNet)
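Figures such as "70% take more than a minute" come from measuring, for each injected event, the time until the last related update is seen, and then taking the share of events above a threshold. The sketch below shows that arithmetic. The function names and event data are made up for illustration and are not results from the study.

```python
def convergence_delay(injected_at, update_times):
    """Seconds from the injected change to the last related update observed."""
    later = [t for t in update_times if t >= injected_at]
    return max(later) - injected_at if later else 0.0


def fraction_exceeding(delays, threshold):
    """Share of events whose convergence delay is above the threshold."""
    return sum(1 for d in delays if d > threshold) / len(delays)


# Made-up events: (injection time, times at which updates were observed).
events = [
    (0.0, [2.0, 31.0, 62.0, 95.0]),
    (1000.0, [1001.0, 1030.0]),
    (2000.0, [2003.0, 2061.0, 2120.0, 2185.0]),
    (3000.0, [3002.0, 3075.0]),
]
delays = [convergence_delay(t0, updates) for t0, updates in events]
share_over_minute = fraction_exceeding(delays, 60.0)
```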
Failovers and Repairs
What are the relative convergence latencies for failovers
and repairs?
Does bad news (withdraws) travel faster?
Failures, Failovers and Repairs
[Figure: cumulative percentage of events (0 to 100) vs. seconds until convergence (0 to 160); series Tup, Tshort, Tlong, Tdown; annotation "Bad News Does Not Travel Fast!"]
Failures, Failovers and Repairs



Bad news does not travel fast…
Repairs (Tup) exhibit similar convergence properties to
long → short path failover
Failures (Tdown) and short → long failovers also similar
– Slower than Tup (e.g. a repair)
– 60% take longer than two minutes
– Failover times degrade the greater the degree of multihoming!
End2End Connectivity
Impact of delayed convergence on E2E connectivity?
After a failover, how long before my site is reachable?
– Modified ICMP pings sent once a second
– Source IP address block of pseudo-AS
– 100 randomly chosen web sites from cache logs
Impact of Convergence Delay on
End-to-End Path
[Figure: percent packet loss (0 to 60) vs. one-minute bins before and after fault (-9 to 9); series Tlong, Tshort; marker "Fault"]
Avg. packet loss to 100 web sites (1 min bins in the ten mins
preceding and following a routing update)
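The binning behind this kind of plot can be sketched as below: once-a-second probes are grouped into one-minute bins relative to the fault, and the loss percentage is computed per bin. The function name, probe format, and synthetic data are assumptions for illustration, not the study's tooling or results.

```python
from collections import defaultdict


def loss_by_minute(probes, fault_time):
    """probes: (send_time_s, got_reply) pairs. Returns {minute_bin: percent
    lost}, where bin 0 is the minute starting at the fault."""
    sent = defaultdict(int)
    lost = defaultdict(int)
    for send_time, got_reply in probes:
        minute = int((send_time - fault_time) // 60)
        sent[minute] += 1
        if not got_reply:
            lost[minute] += 1
    return {m: 100.0 * lost[m] / sent[m] for m in sorted(sent)}


# Synthetic data: one probe per second for four minutes, fault at t=120 s,
# with replies lost during the 30 s after the fault.
fault_time = 120
probes = [(t, not (120 <= t < 150)) for t in range(0, 240)]
loss = loss_by_minute(probes, fault_time)
```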
What is Happening?

Non-deterministic ordering of BGP update messages leads to
– Transient oscillations
– Each change in FIB adds delay (CPU, BGP bundling timer)
– At extreme, convergence triggers BGP dampening
BGP Bad News
Given best current routing practices, inter-domain BGP
convergence times degrade exponentially with increase in
the degree of interconnectivity for a given route
… and the degree of inter-connectivity (multi-homing,
transit, etc) is increasing
Internet vs. Telephone Network

Packet-switched vs. circuit-switched

No explicit reservation on the Internet

Fault-tolerant switches in telephone networks

Significantly shorter development, testing and deployment
cycle in the Internet world

Reliability vs. time-to-market

Relative degree of operational experience

Small number of telecommunication companies vs. a
conglomeration of thousands of ISPs
Growing reliance on the Internet for commerce,
healthcare, education, ...
Challenges Facing Today’s Internet are
Bandwidth and Latency
The Next Challenge Jeopardizing the Explosive
Growth of the Web is AVAILABILITY.
Acknowledgements

Michigan Students & Merit Staff: Abha Ahuja, Mukesh Agrawal,
Paul Howell, Craig Labovitz, Rob Malan, Matt Smart, David
Watson

Sponsors: National Science Foundation, DARPA, Intel, IBM, HP
