AFOSR_review_Jul10 - Northwestern University
Download
Report
Transcript AFOSR_review_Jul10 - Northwestern University
Intrusion Detection and
Forensics for Self-defending
Wireless Networks
Yan Chen
Lab for Internet and Security Technology
EECS Department
Northwestern University
Security Challenges in GIG
Wireless Networks
• In addition to sharing similar challenge of wired net
– High speed traffic (e.g., WiMAX)
– Zero-day threats
– Lack of quality info for situational-aware analysis: attack
target/strategy, attacker (botnet) size, etc.
• Wireless networks are more vulnerable
– Many emerging wireless network protocols: WiMAX, mobile
IP v4/6, EAP authentication protocols
Self-Defending Wireless Networks
• Proactive vulnerability analysis of wireless network
protocols (done in year 1)
– Found a class of exception triggered DoS attacks
• Net-based adaptive anomaly diagnosis, intrusion
detection and mitigation
– Polymorphic zero-day worm signature generation (done in
year 2)
– Automated analysis of large-scale botnet probing events for
situation aware info (done in year 3)
– Scalable signature matching w/ massive vulnerability
signatures (done in year 4)
• Projects done in pipeline, not serialized.
Accomplishments in AY 06-07
Five conference papers
• Detecting Stealthy Spreaders Using Online Outdegree
Histograms, in the Proc. of the 15th IEEE International
Workshop on Quality of Service (IWQoS), 2007 (26.6%).
• Hamsa: Fast Signature Generation for Zero-day Polymorphic
Worms with Provable Attack Resilience, to appear in IEEE
Symposium on Security and Privacy, 2006 (9%).
• Towards Scalable and Robust Distributed Intrusion Alert
Fusion with Good Load Balancing, in Proc. of ACM
SIGCOMM Workshop on Large-Scale Attack Defense
2006(33%).
• Automatic Vulnerability Checking of IEEE 802.16 WiMAX
Protocols through TLA+, in Proc. of the Second Workshop on
Secure Network Protocols (NPSec) (33%).
• A DoS Resilient Flow-level Intrusion Detection Approach for
High-speed Networks, in IEEE International Conference on
Distributed Computing Systems (ICDCS), 2006 (14%).
Accomplishments in AY 07-08
Three conference, one journal papers and
two book chapters, and one patent filed
• “Accurate and Efficient Traffic Monitoring Using Adaptive Non-linear
Sampling Method", in the Proc. of IEEE INFOCOM, 2008
• “A Survey of Existing Botnet Defenses “, in Proc. of IEEE IWSSE 2008.
• “Honeynet-based Botnet Scan Traffic Analysis", invited book chapter for
“Botnet Detection: Countering the Largest Security Threat”, Springer, 2007.
• “Integrated Fault and Security Management”,
invitedpublication
book chapterwith
for Dr.
• Collaborated
“Information Assurance: Dependability
and Security
Networked
Keesook
Han infrom
AFRL
Systems”, Morgan Kaufmann Publishers, 2007.
• “Reversible Sketches: Enabling Monitoring and Analysis over High-speed
Data Streams”, in ACM/IEEE Transaction on Networking, Volume 15,
Issue 5, Oct. 2007.
• “Network-based and Attack-resilient Length Signature Generation for
Zero-day Polymorphic Worms”, in the Proc. of the IEEE ICNP, 2007.
Accomplishments in AY 08-09
Five conference and one journal papers
• “Using Failure Information Analysis to Detect Enterprise Zombies", in the
Proc. of SecureComm 2009.
• “Exception Triggered DoS Attacks on Wireless Networks”, the 39th
IEEE/IFIP International Conference on Dependable Systems and Networks
(DSN), 2009.
• "BotGraph: Large Scale Spamming Botnet Detection", USENIX Symposium
on Networked Systems Design and Implementation (NSDI) 2009.
• "Towards Efficient Large-Scale VPN Monitoring and Diagnosis under
Operational Constraints", IEEE INFOCOM (main conference), 2009.
• “Automating Analysis of Large-Scale Botnet Probing Events”, ACM
Symposium on Information, Computer and Communications Security
(ASIACCS), 2009.
• “Pollution Attacks and Defenses for Internet Caching Systems”, in Journal of
Computer Networks, 2008.
Accomplishments in AY 09-10
Two conference and four journal papers and
one patent filed
• “NetShield: Massive Semantics-based Vulnerability Signature Matching for
High-speed Networks, in the Proc. of ACM SIGCOMM 2010
• HiFIND: A high-speed flow-level intrusion detection approach with DoS
resiliency, Journal of Computer Networks, Volume 54, Issue 8, June 2010..
• Measurement and Diagnosis of Address Misconfigured P2P Traffic, in the
Proc. of IEEE INFOCOM, 2010
• Thwarting Zero-Day Polymorphic Worms With Network-Level Length-Based
Signature Generation, in ACM/IEEE Transaction on Networking (ToN),
Volume 18, Issue 1, 2010.
• POPI: A User-level Tool for Inferring Router Packet Forwarding Priority",
ACM/IEEE Transaction on Networking (ToN), Volume 18, Issue 1, 2010.
• "Towards Unbiased End-to-End Network Diagnosis", in ACM/IEEE
Transaction on Networking (ToN), Volume 17, Number 6, Dec. 2009.
Overall Achievement
•
•
•
•
•
15 conference papers
6 journal papers
2 book chapters
2 patents filed
Several software released to the community
– E.g. www.nshield.org
• Invited talks to Cisco, Juniper, etc. for
potential tech transfer
8
NetShield: Matching a Large
Vulnerability Signature Ruleset
for High Performance Network
Defense
9
Outline
•
•
•
•
•
Motivation
High Speed Matching for Large Rulesets
High Speed Parsing
Evaluation
Research Contributions
10
NetShield Overview
NIDS/NIPS (Network Intrusion
Detection/Prevention System) operation
Signature
DB
Packets
NIDS/NIPS
`
`
`
Security • Accuracy
alerts
• Speed
• Attack Coverage
11
Network Level Defense
• Network gateways/routers are the vantage
points for detecting large scale attacks
• Only host based detection/prevention is not
enough
– Some users do not apply the host-based schemes
due to the reliability, overhead, and conflicts
– Many users do not update or patch their system on
time
– E.g., Conficker worm in the end of 2008 infected 9~15
millions of hosts
– Cannot only reply on end users for security protection
12
State Of The Art
Regular expression (regex) based approaches
Used by: Cisco IPS, Juniper IPS, open source Bro
Example: .*Abc.*\x90+de[^\r\n]{30}
Pros
• Can efficiently match multiple sigs simultaneously,
through DFA
• Can describe the syntactic context
13
Cons of Regex
Limited expressive power, cannot describe
semantic context, thus inaccurate
Theoretical prospective
Regex
Protocol Context
Context
Sensitive
grammar
Free
Practical prospective
• HTTP chunk encoding
• DNS label pointers
State Of The Art
Vulnerability Signature [Wang et al. 04]
Blaster Worm (WINRPC) Example:
Vulnerability: design flaws enable the bad
BIND:
inputs lead&&
therpc_vers_minor==1
program to a bad&&
state
rpc_vers==5
packed_drep==\x10\x00\x00\x00
Good
&& context[0].abstract_syntax.uuid=UUID_RemoteActivation state
BIND-ACK:
Bad input
rpc_vers==5
&& rpc_vers_minor==1
CALL:
rpc_vers==5 && rpc_vers_minors==1 && packed_drep==\x10\x00\x00\x00
Bad
Vulnerability
&& opnum==0x00 && stub.RemoteActivationBody.actual_length>=40
state
Signature
&& matchRE(stub.buffer, /^\x5c\x00\x5c\x00/)
Pros
• Directly describe
semantic context
• Very expressive, can
express the vulnerability
condition exactly
• Accurate
Cons
• Slow!
• Existing approaches all
use sequential matching
• Require protocol parsing
15
Speed
High
Motivation of NetShield
State of the
art regex Sig
IDSes
NetShield
Theoretical accuracy
limitation of regex
Low
Existing
Vulnerability
Sig IDS
Low
Accuracy
High
16
Motivation
• Desired Features for Signature-based
NIDS/NIPS
– Accuracy (especially for IPS)
– Speed
Cannot capture
vulnerability – Coverage: Large ruleset
condition well!
Regular
Expression
Vulnerability
Accuracy
Relative
Poor
Much Better
Speed
Good
??
Memory
OK
??
Coverage
Good
??
Shield
[sigcomm’04]
Focus of
this work
17
Research Challenges and Solutions
• Challenges
– Matching thousands of vulnerability
signatures simultaneously
• Sequential matching match multiple sigs.
simultaneously
– High speed protocol parsing
• Solutions
– An efficient algorithm which matches multiple
sigs simultaneously
– A tailored parsing design for high-speed
18
signature matching
Background
• Vulnerability signature basic
– Use protocol semantics to express vulnerabilities
– Defined on a sequence of PDUs & one predicate for
Blastereach
WormPDU
(WINRPC) Example:
BIND:
– Example: ver==1 && method==“put” && len(buf)>300
rpc_vers==5 && rpc_vers_minor==1 && packed_drep==\x10\x00\x00\x00
&&
context[0].abstract_syntax.uuid=UUID_RemoteActivation
• Data
representations
BIND-ACK:
– For &&
all the
vulnerability signatures we studied, we only
rpc_vers==5
rpc_vers_minor==1
CALL: need numbers and strings
rpc_vers==5
&& rpc_vers_minors==1
&&<,packed_drep==\x10\x00\x00\x00
– number
operators: ==, >,
>=, <=
&& opnum==0x00 && stub.RemoteActivationBody.actual_length>=40
– String operators:
==, match_re(.,.), len(.).
&& matchRE(stub.buffer,
/^\x5c\x00\x5c\x00/)
19
Outline
•
•
•
•
•
Motivation
High Speed Matching for Large Rulesets
High Speed Parsing
Evaluation
Research Contributions
20
Matching Problem Formulation
• Suppose we have n signatures, defined on k
matching dimensions (matchers)
– A matcher is a two-tuple (field, operation) or a fourtuple for the associative array elements
– Translate the n signatures to a n by k table
– This translation unlocks the potential of matching
multiple signatures simultaneously
Rule 4: URI.Filename=“fp40reg.dll” && len(Headers[“host”])>300
RuleID Method == Filename == Header == LEN
1
DELETE
*
*
2
POST
Header.php
*
3
*
awstats.pl
*
4
*
fp40reg.dll
name==“host”; len(value)>300
5
*
*
name==“User-Agent”; len(value)>544
21
Matching Problem Formulation
• Challenges for Single PDU matching
problem (SPM)
– Large number of signatures n
– Large number of matchers k
– Large number of “don’t cares”
– Cannot reorder matchers arbitrarily -buffering constraint
– Field dependency
• Arrays, associative arrays
• Mutually exclusive fields.
22
Difficulty of the SPM
• Bad News
– A well-known computational geometric problem
can be reduced to this problem.
– And that problem has bad worst case bound
O((log N)K-1) time or O(NK) space (worst case
ruleset)
• Good News
– Measurement study on Snort and Cisco ruleset
– The real-world rulesets are good: the
matchers are selective.
– With our design O(K)
23
Matching Algorithms
Candidate Selection Algorithm
1.Pre-computation decides the rule order and
• Integer range checking
matcher order
balanced binary search tree
• String exact
matching
Trie
2.Decomposition.
Match
each
matcher
• Regex DFA (XFA)
separately and iteratively combine the
results efficiently
24
Outline
•
•
•
•
•
Motivation
High Speed Matching for Large Rulesets.
High Speed Parsing
Evaluation
Research Contribution
25
High Speed Parsing
General V.S. Special Purpose
Keep the whole parse
Parsing and matching
V.S. on the fly
tree in memory
Parse all the nodes
in the tree
Only signature related
V.S. fields (leaf nodes)
• Design a parsing state machine
• Build an automated parsing state machine
generator
Outline
•
•
•
•
•
Motivation
High Speed Matching for Large Rulesets.
High Speed Parsing
Evaluation
Research Contributions
27
Evaluation Methodology
Fully implemented prototype
• 12,000 lines of C++ and
3,000 lines of Python
Release at:
www.nshield.org
Deployed at a university DC
with up to 106Mbps
• 26GB+ Traces from Tsinghua Univ. (TH), Northwestern (NU)
and DARPA
• Run on a P4 3.8Ghz single core PC w/ 4GB memory
• After TCP reassembly and preload the PDUs in memory
• For HTTP we have 794 vulnerability signatures which cover
973 Snort rules.
• For WINRPC we have 45 vulnerability signatures which cover
28
3,519 Snort rules
Parsing Results
Trace
TH
DNS
TH
NU
TH
WINRPC WINRPC HTTP
Avg flow len (B)
77
879
596
6.6K 55K 2.1K
Throughput
(Gbps)
Binpac
Our parser
0.31
3.43
1.41
16.2
1.11
12.9
2.10 14.2 1.69
7.46 44.4 6.67
11.2
Max. memory per 15
11.5
15
11.6
15
3.6
14
Speed up ratio
NU
HTTP
3.1
14
DARPA
HTTP
3.9
14
connection
(bytes)
29
Matching Results
8-core 11.0
Trace
TH
NU
TH
WINRPC WINRPC HTTP
NU
HTTP
DARPA
HTTP
Avg flow length (B)
879
596
6.6K
55K
2.1K
10.68
14.37
4
9.23
10.61
1.8
0.34
2.63
11.3
2.37 0.28
17.63 1.85
11.7 8.8
1.48
27
0.033 0.038 0.0023
20
20
20
Throughput (Gbps)
Sequential
CS Matching
Matching only time
speed up ratio
Avg # of Candidates 1.16
Max. memory per
connection (bytes)
27
30
Scalability and Accuracy Results
Rule scaling results
Throughput (Gbps)
0
1
2
3
4
Performance
decrease
gracefully
0
200
400
600
# of rules used
800
Accuracy
• Create two polymorphic
WINRPC exploits which
bypass the original Snort
rules but detect
accurately by our
scheme.
• For 10-minute “clean”
HTTP trace, Snort
reported 42 alerts,
NetShield reported 0
alerts. Manually verify
the 42 alerts are false
positives
31
Research Contribution
Make vulnerability signature a practical solution
for NIDS/NIPS
Regular Expression Exists Vul. IDS
NetShield
Accuracy
Poor
Good
Good
Speed
Good
Poor
Good
Memory
Good
??
Good
Coverage
Good
??
Good
• Multiple sig. matching candidate
selection algorithm
• Parsing parsing state machine
Build a better Snort alternative!
32
Q&A
Thanks!
33
Observations
• PDU parse tree
• Leaf nodes are
numbers or strings
PDU
array
General V.S. Special Purpose
Keep the whole parse
Parsing and matching
V.S. on the fly
tree in memory
Parse all the nodes
in the tree
Only signature related
V.S. fields (leaf nodes)
34
Efficient Parsing with State Machines
• Studied eight protocols: HTTP, FTP, SMTP,
eMule, BitTorrent, WINRPC, SNMP and DNS
as well as their vulnerability signatures
• Common relationships among leaf nodes
Automated parsing state
machine
Var
Var
generator: UltraPAC
derive
Var
Sequential
Branch
Loop
Derive
(a)
(b)
(c)
(d)
• Pre-construct parsing state machines based
on parse trees and vulnerability signatures 35
Example for WINRPC
• Rectangles are states
• Parsing variables: R0 .. R4
• 0.61 instruction/byte for BIND PDU
R1-16
8 merge2
1 ncontext
3 padding
Bind-ACK
1
rpc_vers
1 rpc_ver_minor
R0 1
ptype
Header 1
pfc_flags
R0
4 packed_drep
Bind
R1 2 frag_length
6
merge1
merge3
R4
20*R4
2
ID
1 n_tran_syn
1 padding
16 UUID
4 UUID_ver
tran_syn
Bind-ACK
R2 ‹- 0
R3 ‹- ncontext
Bind
R2++
R2£R3
36
Step 1: Pre-Computation
• Optimize the matcher order based on buffering
constraint & field arrival order
• Rule reorder:
1
Require
Matcher 1
Require
Matcher 1
Require
Matcher 2
Don’t care
Matcher 1
Don’t care
Matcher 1
&2
n
37
Step 2: Iterative Matching
PDU={Method=POST, Filename=fp40reg.dll,
Header: name=“host”, len(value)=450}
S1={2} Candidates after match Column 1 (method==)
S2=S1 A2+B2 ={2} {}+{4}={}+{4}={4}
S3=S2 A3+B3={4} {4}+{}={4}+{}={4}
Si Ai 1
Don’t care
RuleID Method == Filename
== Header == LEN
R1
R2
R3
1
2
DELETE
SiPOST
* matcher i+1 *
Header.php
*
*
3
*
awstats.pl
4
*
fp40reg.dll
5
*
*
Si Ai 1
require
In Ai+1 len(value)>300
name==“host”;
matcher i+1
name==“User-Agent”; len(value)>544
38
Complexity Analysis
Three HTTP traces:
avg(|Si|)<0.04
• Merging complexity
Two WINRPC
– Need k-1 merging iterations
traces: avg(|Si|)<1.5
– For each iteration
• Merge complexity O(n) the worst case, since Si can
have O(n) candidates in the worst case rulesets
• For real-world rulesets, # of candidates is a small
constant. Therefore, O(1)
– For real-world rulesets: O(k) which is the
optimal we can get
39
Refinement and Extension
• SPM improvement
– Allow negative conditions
– Handle array cases
– Handle associative array cases
– Handle mutual exclusive cases
• Extend to Multiple PDU Matching (MPM)
– Allow checkpoints.
40
Experiences
• Working in process
– In collaboration with MSR, apply the semantic
rich analysis for cloud Web service profiling.
To understand why slow and how to improve.
• Interdisciplinary research
• Student mentoring (three undergraduates,
six junior graduates)
41
Future Work
• Near term
– Web security (browser security, web server security)
– Data center security
– High speed network intrusion prevention system with
hardware support
• Long term research interests
– Combating professional profit-driven attackers will be
a continuous arm race
– Online applications (including Web 2.0 applications)
become more complex and vulnerable.
– Network speed keeps increasing, which demands
highly scalable approaches.
42
Research Contributions
• Demonstrate vulnerability signatures can
be applied to NIDS/NIPS, which can
significantly improve the accuracy of
current NIDS/NIPS
• Propose the candidate selection algorithm
for matching a large number of
vulnerability signatures efficiently
• Propose parsing state machine for fast
protocol parsing
43
• Implement the NetShield
Comparing With Regex
• Memory for 973 Snort rules: DFA
5.29GB (XFA 863 rules1.08MB),
NetShield 2.3MB
• Per flow memory: XFA 36 bytes,
NetShield 20 bytes.
• Throughput: XFA 756Mbps,
NetShield 1.9+Gbps
(*XFA [SIGCOMM08][Oakland08])
44
Measure Snort Rules
• Semi-manually classify the rules.
1. Group by CVE-ID
2. Manually look at each vulnerability
• Results
– 86.7% of rules can be improved by protocol semantic
vulnerability signatures.
– Most of remaining rules (9.9%) are web DHTML and
scripts related which are not suitable for signature
based approach.
– On average 4.5 Snort rules are reduced to one
vulnerability signature.
– For binary protocol the reduction ratio is much higher
than that of text based ones.
• For netbios.rules the ratio is 67.6.
45
Matcher order
Si 1 Si Ai 1 Bi 1
Reduce Si+1 Enlarge Si+1
Merging Overhead |Si| (use hash table to calculate
in Ai+1, O(1))
| Ai 1 Bi 1 | fixed, put the matcher later, reduce Bi+1
46
Matcher order optimization
• Worth buffering only if estmaxB(Mj)<=MaxB
• For Mi in AllMatchers
– Try to clear all the Mj in the buffer which
estmaxB(Mj)<=MaxB
– Buffer Mi if (estmaxB(Mi)>MaxB)
– When len(Buf)>Buflen, remove the Mj with
minimum estmaxB(Mj)
47