Deep Packet Inspection - Computer Science and Engineering
Download
Report
Transcript Deep Packet Inspection - Computer Science and Engineering
Deep Packet Inspection with
Regular Expression Matching
Min Chen, Danny Guo
{michen, guo}@cs.ucr.edu
CSE Dept, UC Riverside
03/14/2007
Outline
Motivation
Background and challenges
Evaluation metrics
Algorithm comparison
Implementation details
– Regular expression
– DFA & NFA
– Detection engine
Result
Future work
Motivation
A class of packet processing applications
need to inspect packets deeper than the
protocol headers and analyze its payload
– Network Security
– HTTP load balancing
– XML processing
– Content-based billing and forwarding
Deep Packet Inspection
(DPI)
Shallow packet inspection
– Checks the header portion of a packet only
Deep packet inspection
– A form of computer network packet filtering
that examines the data part of a throughpassing packet, searching for non-protocol
compliance or predefined criteria to decide if
the packet can pass
Challenges for DPI
Operates
at wire speed
Large number of signatures (i.e. string
patterns)
Patterns highly complex and have
overlaps
Location of signatures is unknown
DPI Evaluation Metrics
Packet
processing rate
Memory requirement
– SRAM, DRAM, TCAM
Power
consumption
– TCAM
Scalability
– The time to process new signatures and
insert them into the system
DPI Algorithms
Fixed string matching
Regular expression matching
– Parallel Boyer-Moore (BM)
– Aho-Corasick Boyer-Moore (AC_BM)
– Setwise Boyer-Moore-Horspool
– Bloom Filter
– CAM Based
– Deterministic Finite Automation (DFA)
– Non-deterministic Finite Automation (NFA)
Regular Expression (RE)
Expressive power and flexibility for
describing useful patterns
– Linux Application Protocol Classifier
(L7-filter)
– the Snort intrusion detection system
(1131 out of 4867 rules using regular
expressions as of February 2006)
Example of RE
“^(ymsg|ypns|yhoo).?.?.?.?.?.?.?[lwt].*\xc0\x80”
DFA Vs. NFA
Performance comparison
– For 1 RE with length n
Construction
Time
Processing
Complexity
Space
Complexity
NFA
O(n)
O(n^2)
O(n)
DFA
O(2^n)
O(1)
O(2^n)
DFA
– Higher processing speed
– Acceptable construction time and memory
consumption with lazy-DFA (DFA+NFA)
– More efficient in software implementation
Project Architecture
Detection
Engine
Input buffer
RE1 DFA
RE2 DFA
RE3 DFA
RE8 DFA
Content Scanner 1
outgoing
Streams
Dispatcher
Incoming
Streams
Content Scanner 2
…
Content Scanner 16
Detection Engine Setup
# of Content Scanner (optimal)
– SRAM 128bits (input)
– Processing unit: 8bits/char
– Processing power: 128/8 = 16 chars/cycle
# of REs for each Content Scanner
–
–
–
–
SRAM 128bits (output)
Processing unit: 1bit (accept:1 else:0)
# of streams: 16 (best throughput)
Each stream could be processed with 128/16=8
REs concurrently
DFA Representation
States
Input char
0 1 …
1 4 2 5
2 1 1 2
… 3 4 5
M 12 42 2
255
7
23
3
1
Environment on Grep
application
Input stream: 70MB file
RE:
– For speed test: “[1-9]* [0-9]\.*[0-9]+”
– For area test: “U\.?S\.?(D\.?)?[\ ]*(\$[\ ]*)?([09]+,[0-9]+,[0-9]+|[0-9]+\.[0-9]+\.[0-9]+|[09]+(\.[0-9]+)?[\ ]*milli?on)”
Result
Optimal throughput
– 16 * 8bits * 200MHz = 25.6Gbps
Processing speedup
Real
User
sys
Grep_FPGA Grep_original
10.271
76.133
3.740
76.096
5.496
Logic consumption
– 9% Slice Flip-fllop
– 6% 4-input LUT
0.036
Speedup
7.412
20.346
-152.667
Future work
SNORT
– More powerful application
Input stream preprocessing
– TCP/IP packet
– Packet arrival interval latency
Special thanks to John and Betul for
the instruction on ISE and ROCCC