spitznagel1-29

Download Report

Transcript spitznagel1-29

Packet Classification for Core Routers:
Is there an alternative to CAMs?
Paper by:
Florin Baboescu, Sumeet Singh, George Varghese
Presentation by:
Edward W. Spitznagel
Edward W. Spitznagel
9 April 20161
Applied Research Laboratory
Outline
• Introduction
• Packet Classification Problem
• Extended Grid-of-Tries (EGT)
–
–
–
–
Grid-of-Tries
Extending Grid-of-Tries into EGT
Path Compression
Results
• Summary
Edward W. Spitznagel
9 April 20162
Applied Research Laboratory
Packet Classification Problem
• Suppose you are a firewall, or QoS router, or network monitor ...
• You are given a list of rules (filters) to determine how to process
incoming packets, based on the packet header fields
Filter
Source
Address
Destination
Address
Source
Port
Destination
Port
Protocol
Cost
Action
a
11*
01*
2-4
0-15
TCP
2
fwd 7
b
01*
0010
3-15
3-15
UDP
10
fwd 2
c
0101
*
3
*
*
5
deny
d
1101
101*
*
*
ICMP
7
fwd 5
• Goal: when a packet arrives, find the least-cost rule that matches the
packet’s header fields
Edward W. Spitznagel
9 April 20163
Applied Research Laboratory
Packet Classification Problem
• Example: packet arrives with header (0101, 0010, 3, 5, UDP)
– classification result: filter c
– filter b also matches, but, c has lower cost
Filter
Source
Address
Destination
Address
Source
Port
Destination
Port
Protocol
Cost
Action
a
11*
01*
2-4
0-15
TCP
2
fwd 7
b
01*
0010
3-15
3-15
UDP
10
fwd 2
c
0101
*
3
*
*
5
deny
d
1101
101*
*
*
ICMP
7
fwd 5
• Easy when we have only a few rules; very hard with 100,000 rules
and packets arriving at 40 Gb/s
Edward W. Spitznagel
9 April 20164
Applied Research Laboratory
Packet Classification - Metrics
• Metrics for evaluating classification algorithms:
– Time complexity of classifying a packet
 often
expressed as the number of memory accesses required
– Storage requirements of data structures
– Number of fields that can be handled
Edward W. Spitznagel
9 April 20165
Applied Research Laboratory
Packet Classification in Core Routers
• Many core routers have “fairly large” (e.g. 2000
rule) databases
– Expected to grow; in fact, may be limited by current
technology
• Classification in core routers must be done quickly
– Emerging core routers operate at 40Gb/s. With 40-byte
packets, that means one packet every 8 nsec
• Thus the general belief that brute-force hardware
(TCAMs) will be necessary to support packet
classification in core routers
Edward W. Spitznagel
9 April 20166
Applied Research Laboratory
Packet Classification - TCAM disadvantages
• Ternary CAMs (TCAM) have disadvantages
– Density Scaling: 10-12 transistors per bit of TCAM
(vs. 4-6 transistors per bit of SRAM)
– Power Scaling: due to performing all comparisons in
parallel.
– Time Scaling: 5-10 nsec for a TCAM operation
– Extra Chips: requires TCAM chip(s) and bridge ASIC
– Rule Multiplication for ranges: arbitrary ranges are
represented by sets of prefixes; very inefficient.
• Thus, we consider an algorithmic solution...
Edward W. Spitznagel
9 April 20167
Applied Research Laboratory
Packet Classification trends
• Packet classification in 2D: several good methods
– Grid of Tries, Area-based QuadTrees, FIS-trees, Tuplespace search, range trees and fractional cascading
• Classification in k dimensions, where k>2, is hard
– O(logK-1 N) time and linear space, or O(log N) time and
O(NK) space, for N filters in K dimensions
• Modern algorithms: use heuristics to exploit the
structure and properties that real-world filter
databases tend to have.
– Example: RFC and HiCuts algorithms
Edward W. Spitznagel
9 April 20168
Applied Research Laboratory
Extended Grid of Tries (EGT)
• Observation: Core router tables studied have a low
maximum filter depth in the 2D space defined by
<Source IP Address, Destination IP Address>
0xFFFF
 in
 i.e.
no point in this 2D plot
of filters is covered by more
than 20 filters
c
b
Dest.Address
this case, “low” means
20 or less
d
a
0
0
Source Address
0xFFFF
Edward W. Spitznagel
9 April 20169
Applied Research Laboratory
Extended Grid of Tries (EGT)
• The Basic Idea:
– Use an existing 2D scheme to classify with respect to
Source IP and Dest. IP
– Then, do linear search over a
small list of possible matches
(at most 20, but typically
around 5)
• EGT: use Grid-of-Tries
as the 2D scheme
Edward W. Spitznagel
9 April 201610
Applied Research Laboratory
Grid of Tries - Intuition
• Imagine a search trie containing Dest. Address prefixes
• Now add a Source Address trie under each Dest. prefix
– Filters are stored in these tries, perhaps multiple times
Edward W. Spitznagel
9 April 201611
Applied Research Laboratory
Grid of Tries - Intuition
• Reduce storage by storing each filter only once
– But we now need to backtrack to ancestors’ source tries
during a search...
Edward W. Spitznagel
9 April 201612
Applied Research Laboratory
Grid of Tries
• Use switch pointers to improve search efficiency
– allows us to jump to the next source trie among
ancestors, instead of backtracking
Edward W. Spitznagel
9 April 201613
Applied Research Laboratory
Extended Grid of Tries
• EGT uses jump pointers instead of switch pointers
– EGT requires the 2D search to return all filters matching in
those dimensions
– Thus, some of the nodes skipped by a switch pointer cannot be
skipped in an EGT search
• So, search complexity is a bit higher than in ordinary
Grid-of-Tries
– worst case search takes W+(H+1)*W = (H+2)*W time, where
W=time to find best prefix in a single trie, and H=max trie
height (H=32 for IPv4)
– but, the authors expect typically it takes L*W with L being a
small value (reflecting the low maximum prefix containment
seen in most filter databases)
Edward W. Spitznagel
9 April 201614
Applied Research Laboratory
EGT with Path Compression (EGT-PC)
• EGT-PC adds Path Compression whereby single
branching paths are removed
– Improves search time and storage requirements, particularly for
small filter sets
Edward W. Spitznagel
9 April 201615
Applied Research Laboratory
EGT-PC: Results
• Storage requirements: impressively low (almost as low as TCAM!)
– since we store each filter only once
 Storage, in terms of number of 32-bit words
• Classification time is good, but not as impressive
– also a result of storing each filter once: we therefore may need to traverse
multiple Source tries
 Memory accesses, in terms of 32-bit word accesses
Edward W. Spitznagel
9 April 201616
Applied Research Laboratory
EGT-PC: Results
• Memory usage by component:
• Storage for list is proportional
to number of filters
• Storage for trie is roughly
proportional to number of filters
• Path compression reduces storage by a factor of 3, roughly
Edward W. Spitznagel
9 April 201617
Applied Research Laboratory
EGT-PC: Results with larger databases
• Larger databases are generated using smaller ones as a core
– randomly generated prefixes for Source Address and Destination Address,
using the prefix length distributions from the original databases
– Other fields are randomly derived from the distributions in the original
databases
• Memory Accesses: still not bad, even for large databases
• Storage Requirements: still appear to be linear
Edward W. Spitznagel
9 April 201618
Applied Research Laboratory
EGT-PC: Remarks
• May only work well with core routers
• Lookups:
– faster than HiCuts; not as fast or as deterministic as RFC.
– can easily be characterized by maximum 2D filter depth
• Storage requirements: quite good
– using Grid-of-Tries for the 2D scheme is a wise choice (storage efficiency)
• Very nice to have results comparing several different algorithms
(unlike nearly all previous papers)
• It is possible to apply the basic EGT idea, but with a different 2D
scheme
– Tuple Space, FIS-trees, RFC in 2D, and perhaps Area-based QuadTrees
– The trick is that the 2D scheme must be modified to return all filters matching
those 2 dimensions (rather than just the least-cost filter matching those 2
dimensions)
Edward W. Spitznagel
9 April 201619
Applied Research Laboratory
Comparison of different algorithms
Best
TCAM
RFC
Lookup Speed
EGT-PC
HiCuts-1
Linear
Search
HiCuts-4
Best
Linear
Search
TCAM
EGT
Worst
Storage Requirements
HiCuts-1
EGT
HiCuts-4
Worst
RFC
EGT-PC
Edward W. Spitznagel
9 April 201620
Applied Research Laboratory
Summary
• Packet Classification: Given packet P and list of filters
F, find least cost filter in F that matches P
– Important metrics: Lookup time, data structure size
• Extended Grid of Tries
– Core routers have a low maximum filter depth in the 2D
space defined by <Src. Addr, Dest. Addr>
– Thus, we can perform a 2D search via Grid of Tries, and
then
 and
we can add path compression to the trie
– Lookup time is fairly good; storage requirements are
very good.
Edward W. Spitznagel
9 April 201621
Applied Research Laboratory
Thanks -- Questions?
Edward W. Spitznagel
9 April 201622
Applied Research Laboratory
Backup slides to follow...
Edward W. Spitznagel
9 April 201623
Applied Research Laboratory
Geometric Representation
• Filters with K fields can
be represented
geometrically in K
dimensions
xxx
Source Port
2-3
b
010
0-7
c
xx1
7
Source Port
a
c
c
c
b
6
• Example:
Filter Source Address
c
4
a
2
0
0
2
4
6
Source Address
Edward W. Spitznagel
9 April 201624
Applied Research Laboratory
Ternary CAMs
• Most popular practical approach to high-performance
packet classification
• Hardware compares query word (packet header) to all
stored words (filters) in parallel
– each bit of a stored word can be 0, 1, or X (don’t care)
• Very fast, but not without drawbacks:
– High power consumption limits scalability
– inefficient representation of ranges
Edward W. Spitznagel
9 April 201625
Applied Research Laboratory
Ternary CAM - Example
Src. Addr. Dest. Addr.
Packet:
1110
0110
Query:
11100110
TCAM
Filter
a
Source
Address
11xx
Destination
Address
Address
Contents
0
11xxxxxx
11100110
0xxx01xx
11100110
xxxx0110
11100110
xxxx
1
b
0xxx
01xx
2
c
xxxx
0110
Match!
Doesn’t Match
Match!
(Now perform priority
resolution...)
Edward W. Spitznagel
9 April 201626
Applied Research Laboratory
Range Matching in TCAMs
Source Port
Destination Port
F
1-4
3-5
• Convert ranges into
sets of prefixes
– 1-4 becomes 001, 01*, and 100
– 3-5 becomes 011 and 10*
Destination Port
Filter
6
F
4
2
0
0
2
4
6
Source Port
Edward W. Spitznagel
9 April 201627
Applied Research Laboratory
Range Matching in TCAMs
Source Port
Destination Port
a
b
c
d
e
f
001
01*
100
001
01*
100
10*
10*
10*
011
011
011
• With two 16-bit range fields,
a single rule could require up
to 900 TCAM entries!
Destination Port
Filter
6
a
b
c
d
e
f
4
2
0
0
• Typical case: entire filter set
expands by a factor of 2 to 6
2
4
6
Source Port
Edward W. Spitznagel
9 April 201628
Applied Research Laboratory