Path Splicing with Network Slicing


Improving Internet Availability
with Path Splicing
Nick Feamster
Georgia Tech
Can the Internet be “Always On”?
“It is not difficult to create a list of desired characteristics for a
new Internet. Deciding how to design and deploy a network
that achieves these goals is much harder.
Over time, our list will evolve. It should be:
1. Robust and available. The network should be as robust,
fault-tolerant and available as the wire-line telephone
network is today.
2. …”
• Various studies (Paxson, etc.) show the
Internet is at about 2.5 “nines”
• More “critical” (or at least availability-centric) applications on the Internet
• At the same time, the Internet is getting
more difficult to debug
– Scale, complexity, disconnection, etc.
Availability of Other Services
• Carrier Airlines (2002 FAA Fact Book)
– 41 accidents, 6.7M departures
– 99.9993% availability
• 911 Phone service (1993 NRIC report +)
– 29 minutes per year per line
– 99.994% availability
• Std. Phone service (various sources)
– 53+ minutes per line per year
– 99.99+% availability
Threats to Availability
• Natural disasters
• Physical failures (node, link)
– Drunk network administrators
– Cisco bugs
• Misconfiguration
• Mis-coordination
• Denial-of-service (DoS) attacks
• Changes in traffic patterns (e.g., flash crowds)
• …
Availability: Two Aspects
• Reliability: Connectivity in the routing tables
should approach that of the underlying graph
– If two nodes s and t remain connected in the
underlying graph, there is some sequence of hops in
the routing tables that will result in traffic
• Recovery: In case of failure (i.e., link or node
removal), nodes should quickly be able to
discover a new path
Where Today’s Protocols Stand
• Reliability
– Approach: Compute backup paths
– Challenge: Many possible failure scenarios!
• Recovery
– Approach: Switch to a backup when a failure occurs
– Challenge: Must quickly discover a new working path
– Meanwhile, packets are dropped, reordered, etc.
Multipath: Promise and Problems
[Figure: two node-disjoint paths from s to t]
• Bad: If one link fails on each of the two paths, s is disconnected from t
• Want: End systems remain connected unless
the underlying graph has a cut
Path Splicing: Main Idea
Compute multiple forwarding trees per destination.
Allow packets to switch slices midstream.
[Figure: a packet's path spliced across multiple trees from s to t]
• Step 1 (Perturbations): Run multiple instances of the
routing protocol, each with slightly perturbed versions of
the configuration
• Step 2 (Slicing): Allow traffic to switch between instances at any node along the path
Outline
• Path Splicing
– Achieving Reliable Connectivity
• Generating slices
• Constructing paths
– Forwarding
– Recovery
• Properties
– High Reliability
– Bounded Stretch
– Fast recovery
• Ongoing Work
Generating Slices
• Goal: Each instance provides different paths
• Mechanism: Each edge is given a weight that is
a slightly perturbed version of the original weight
– Two schemes: Uniform and degree-based
[Figure: “base” graph vs. perturbed graph: the same s-t topology with slightly perturbed link weights, yielding a different shortest path]
How to Perturb the Link Weights?
• Uniform: Perturbation is a function of the initial
weight of the link
• Degree-based: Perturbation is a linear function
of the degrees of the incident nodes
– Intuition: Deflect traffic away from nodes where traffic
might tend to pass through by default
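The two schemes can be sketched as follows. The exact perturbation functions, the `alpha` parameter, and the scaling forms are illustrative assumptions, not the precise formulas from the paper:

```python
import random

def perturb_weights(edges, degrees, scheme="uniform", alpha=0.5, seed=None):
    """Return one slice's link weights, perturbed from the base weights.

    edges: dict mapping (u, v) -> base weight
    degrees: dict mapping node -> degree (used by the degree-based scheme)
    The form of each perturbation is an assumption for illustration.
    """
    rng = random.Random(seed)
    perturbed = {}
    for (u, v), w in edges.items():
        if scheme == "uniform":
            # Perturbation scales with the link's own base weight
            perturbed[(u, v)] = w * (1 + alpha * rng.random())
        else:
            # Degree-based: inflate links incident to high-degree nodes,
            # deflecting traffic away from default "hub" paths
            perturbed[(u, v)] = w + alpha * rng.random() * (degrees[u] + degrees[v])
    return perturbed
```

Running the routing protocol once per perturbed weight set yields one forwarding tree (slice) per run.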
Constructing Paths
• Goal: Allow multiple instances to co-exist
• Mechanism: Virtual forwarding tables
[Figure: node s's virtual forwarding tables for destination t]

          dst   next-hop
Slice 1    t       a
Slice 2    t       c
Forwarding Traffic
• Packet has shim header with forwarding bits
• Routers use lg(k) bits to index forwarding tables
– Shift bits after inspection
• To access different (or multiple) paths, end
systems simply change the forwarding bits
– Incremental deployment is trivial
– Persistent loops cannot occur
• Various optimizations are possible
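A single forwarding step under this scheme might look like the sketch below. The `tables` layout (node -> list of per-slice forwarding dictionaries) is an assumed representation for illustration:

```python
def forward(header_bits, bits_per_hop, tables, node, dst):
    """One forwarding step (sketch): read the low-order lg(k) bits to
    pick a slice, look up the next hop in that slice's virtual table,
    then shift the header so the next router sees fresh bits."""
    slice_id = header_bits & ((1 << bits_per_hop) - 1)  # lg(k) bits pick a slice
    remaining = header_bits >> bits_per_hop             # shift bits after inspection
    next_hop = tables[node][slice_id][dst]
    return next_hop, remaining
```

To try a different path, the end system only has to rewrite the shim header's bits; the routers' tables never change.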
Putting It Together
• End system sets forwarding bits in packet header
• Forwarding bits specify slice to be used at any hop
• Router: examines/shifts forwarding bits, and forwards
[Figure: a packet forwarded from s to t, switching slices mid-path]
Evaluation
• Defining reliability
• Does path splicing improve reliability?
– How close can splicing get to the best possible
reliability (i.e., that of the underlying graph)?
• Can path splicing enable fast recovery?
– Can end systems (or intermediate nodes) find
alternate paths fast enough?
Defining Reliability
• Reliability: the probability that, upon failing each
edge with probability p, the graph remains
connected
• Reliability curve: the fraction of source-destination pairs that remain connected for various link failure probabilities p
• The underlying graph has an intrinsic reliability (and reliability curve)
– Goal: Reliability of the routing system should approach that of the underlying graph.
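This definition lends itself to a Monte Carlo estimate. The sketch below (helper names are illustrative) fails each edge independently with probability p and counts the surviving pairs:

```python
import random

def components(nodes, edges):
    """Union-find: map each node to a representative of its component."""
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return {n: find(n) for n in nodes}

def reliability_curve(nodes, edges, pairs, failure_probs, trials=1000, seed=0):
    """For each p, fail every edge independently with probability p and
    estimate the fraction of (s, t) pairs that stay connected."""
    rng = random.Random(seed)
    curve = {}
    for p in failure_probs:
        connected = 0
        for _ in range(trials):
            alive = [e for e in edges if rng.random() >= p]
            comp = components(nodes, alive)
            connected += sum(comp[s] == comp[t] for s, t in pairs)
        curve[p] = connected / (trials * len(pairs))
    return curve
```

Running this on the underlying graph gives the optimal curve; running it on the edges actually reachable through the routing tables gives the routing system's curve.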
17
Reliability Curve: Illustration

[Figure: fraction of source-destination pairs disconnected vs. probability of link failure p; a lower curve means better reliability]

More edges available to end systems -> better reliability
Reliability Approaches Optimal
• Sprint (Rocketfuel) topology, degree-based perturbations
• 1,000 trials
• p indicates the probability that an edge was removed from the base graph

[Figure: reliability approaches optimal; average stretch is only 1.3]
Recovery: Two Mechanisms
• End-system recovery
– Switch slices at every hop with probability 0.5
• Network-based recovery
– Router switches to a random slice if next hop is
unreachable
– Continue for a fixed number of hops until the destination is reached
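The end-system strategy amounts to redrawing the packet's forwarding bits on each retry. A minimal sketch, assuming a fixed number of bits per hop (the function name and bit layout are illustrative):

```python
import random

def retry_bits(num_hops, bits_per_hop, rng=None):
    """Draw fresh forwarding bits for a retry: at each hop, with
    probability 0.5 switch to a uniformly random slice, otherwise
    stay in the default slice 0."""
    rng = rng or random.Random()
    bits = 0
    for hop in range(num_hops):
        if rng.random() < 0.5:  # switch slices at this hop w.p. 0.5
            bits |= rng.randrange(1 << bits_per_hop) << (hop * bits_per_hop)
    return bits
```

Network-based recovery is the same idea executed by a router: on an unreachable next hop, it overwrites the current hop's bits with a random slice and keeps forwarding.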
Simple Recovery Strategies Work Well
• Which paths can be recovered within 5 trials?
– Sequential trials: 5 round-trip times
– …but trials could also be made in parallel
[Figure: recovery approaches the maximum possible; adding a few more slices improves recovery beyond the best possible reliability with fewer slices]
What About Loops?
• Persistent loops are avoidable
– In the simple scheme, forwarding bits are eventually exhausted from the header
– The packet never switches back to the same slice
• Transient loops can still be a problem because
they increase end-to-end delay (“stretch”)
– Longer end-to-end paths
– Wasted capacity
Significant Novelty for Modest Stretch
• Novelty: how much a perturbed shortest path differs from the original shortest path
• Novelty = 1 – (fraction of edges on the original shortest path shared with the perturbed path)
• Example: if the perturbed s-d path shares 1 of the original path's 3 edges, Novelty = 1 – (1/3) = 2/3
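The metric is a one-liner over edge sets; this sketch assumes each path is given as a list of edges:

```python
def novelty(original_path_edges, perturbed_path_edges):
    """Novelty: 1 minus the fraction of the original shortest path's
    edges that the perturbed path reuses."""
    original = set(original_path_edges)
    shared = original & set(perturbed_path_edges)
    return 1 - len(shared) / len(original)
```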
Splicing Improves Availability
• Reliability: Connectivity in the routing tables
should approach that of the underlying graph
– Approach: Overlay trees generated using random
link-weight perturbations. Allow traffic to switch
between them.
– Result: Splicing ~ 10 trees achieves near-optimal
reliability
• Recovery: In case of failure, nodes should
quickly be able to discover a new path
– Approach: End nodes randomly select new bits.
– Result: Recovery within 5 trials approaches best
possible.
Extension: Interdomain Paths
• Observation: Many routers already learn multiple
alternate routes to each destination.
• Idea: Use the forwarding bits to index into these
alternate routes at an AS’s ingress and egress routers.
[Figure: splicing paths at an AS's ingress and egress routers, indexing into default and alternate routes toward destination d]
Required new functionality:
• Storing multiple entries per prefix
• Indexing into them based on packet headers
• Selecting the “best” k routes for each destination
Open Questions and Ongoing Work
• How does splicing interact with traffic
engineering? Sources controlling traffic?
• What are the best mechanisms for generating
slices and recovering paths?
• Can splicing eliminate dynamic routing?
Conclusion
• Simple: Forwarding bits provide access to
different paths through the network
• Scalable: Exponential increase in available
paths, linear increase in state
• Stable: Fast recovery does not require fast
routing protocols
http://www.cc.gatech.edu/~feamster/papers/splicing-hotnets.pdf
Network-Level Spam Filtering
Spam: More Than Just a Nuisance
• 75-90% of all email traffic
– PDF Spam: ~11% and growing
– Content filters cannot catch it!
• As of August 2007, one in every 87 emails
constituted a phishing attack
• Targeted attacks on the rise
– 20k-30k unique phishing attacks per month
– Spam targeted at CEOs, social networks on the rise
Problem #1: Content-Based Filtering is
Malleable
• Low cost of evasion: Spammers can easily alter the features of an email's content
• Customized emails are easy to generate: Content-based filters need fuzzy hashes over content, etc.
• High cost to filter maintainers: Filters must be
continually updated as content-changing techniques
become more sophisticated
Problem #2: IPs are Ephemeral
• Spammers use various techniques to change the
IP addresses from which they send traffic
• Humans must notice changing behavior
• Existing blacklists cannot stay up to date
One technique: BGP spectrum agility (~10 minutes)
• Hijack IP address space
• Send spam
• Withdraw the route
Our Approach: Network-Based
Filtering
• Filter email based on how it is sent, in addition
to simply what is sent.
• Network-level properties are more fixed
– Hosting or upstream ISP (AS number)
– Botnet membership
– Location in the network
– IP address block
• Challenge: Which properties are most useful for
distinguishing spam traffic from legitimate email?
SpamTracker: Main Idea and Intuition
• Idea: Blacklist sending behavior
(“Behavioral Blacklisting”)
– Identify sending patterns that are commonly used by
spammers
• Intuition: Much more difficult for a spammer to
change the technique by which mail is sent than
it is to change the content
SpamTracker Design
Approach
• For each sender,
construct a
behavioral
fingerprint
• Cluster senders
with similar
fingerprints
• Filter new senders
that map to
existing clusters
[Figure: SpamTracker pipeline: email arrivals form an IP x domain x time tensor, which is collapsed and clustered; new senders are looked up against the clusters to produce a spam score]
Building the Classifier: Clustering
• Feature: Distribution of email sending volumes
across recipient domains
• Clustering Approach
– Build initial seed list of bad IP addresses
– For each IP address, compute feature vector:
volume per domain per time interval
– Collapse into a single IP x domain matrix (summing over time intervals)
– Compute clusters
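The collapse step can be sketched as follows, assuming the log is a list of (IP, recipient domain) records; this input format is an illustrative assumption, not SpamTracker's actual one:

```python
from collections import defaultdict

def ip_domain_matrix(email_log, domains):
    """Collapse an email log into one feature vector per sending IP:
    message volume per recipient domain, with time intervals summed out."""
    index = {d: i for i, d in enumerate(domains)}
    rows = defaultdict(lambda: [0] * len(domains))
    for ip, domain in email_log:
        if domain in index:
            rows[ip][index[domain]] += 1
    return dict(rows)
```

Each row of the resulting matrix is one sender's behavioral feature vector, ready for clustering.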
Clustering: Output and Fingerprint
• For each cluster, compute a fingerprint vector
• New IPs will be compared to this “fingerprint”

[Figure: IP x IP matrix; intensity indicates pairwise similarity]
Classifying IP Addresses
• Given “new” IP address, build a feature vector
based on its sending pattern across domains
• Compute the similarity of this sending pattern to
that of each known spam cluster
– Normalized dot product of the two feature vectors
– Spam score is maximum similarity to any cluster
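The scoring step described above can be sketched as:

```python
import math

def spam_score(sending_pattern, cluster_fingerprints):
    """Score a new IP: the maximum normalized dot product (cosine
    similarity) between its per-domain sending pattern and each
    known spam cluster's fingerprint vector."""
    def normalized_dot(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    return max(normalized_dot(sending_pattern, f) for f in cluster_fingerprints)
```

A high score means the new sender's pattern closely matches some known spam cluster, even if its IP address has never been seen before.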
Summary
• Spam is on the rise and becoming more clever
– 12% of spam now PDF spam. Content filters are
falling behind. Also becoming more targeted
• IP-Based blacklists are evadable
– Up to 30% of spam not listed in common blacklists at
receipt. ~20% remains unlisted after a month
– Spammers commonly steal IP addresses
• New approach: Behavioral blacklisting
– Blacklist how the mail was sent, not what was sent
– SpamTracker being deployed and evaluated by a
large spam filtering company