Slides - Zhichun Li

Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms
Zhichun Li (1), Lanjia Wang (2), Yan Chen (1) and Judy Fu (3)
(1) Lab for Internet and Security Technology (LIST), Northwestern Univ.
(2) Tsinghua University, China
(3) Motorola Labs, USA
The Spread of Sapphire/Slammer Worms
Limitations of Exploit Based Signature
[Figure: Internet traffic is filtered against the signature before it reaches our network. Signature: 10.*01. Flows 1010101 and 10111101 are blocked (X); flows 11111100 and 00010111 reach our network.]
Polymorphism!
Polymorphic worm might not have exact exploit based signature
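The filtering example in the figure can be reproduced with a few lines of Python. This is only an illustrative sketch: it assumes the flows are treated as text strings and that the signature 10.*01 is applied as an unanchored regular expression, and the name `is_blocked` is hypothetical.

```python
import re

# Exploit-based signature from the slide, interpreted as a regular expression.
# Assumption: the signature may match anywhere inside the flow content.
SIGNATURE = re.compile(r"10.*01")


def is_blocked(flow: str) -> bool:
    """Return True when the flow content contains the signature."""
    return SIGNATURE.search(flow) is not None


# The four example flows shown in the figure.
flows = ["1010101", "10111101", "11111100", "00010111"]

blocked = [f for f in flows if is_blocked(f)]
passed = [f for f in flows if not is_blocked(f)]

print("blocked:", blocked)
print("passed:", passed)
```

A polymorphic worm that rewrites its content into a form like the last two flows is not caught, which is the limitation this slide points out.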
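Vulnerability Signature
[Figure: Internet traffic passes through vulnerability signature traffic filtering before it reaches our network. Labels: Internet, vulnerability signature traffic filtering, our network, unknown vulnerability; four flows are marked X.]
• Works for polymorphic worms
• Works for all the worms which target the same vulnerability
Better!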
Benefits of Network Based Detection
[Figure: gateway routers sit between the Internet and our network; host based detection runs inside our network.]
• At the early stage of a worm outbreak, only limited worm samples are available.
• Host based sensors can only cover a limited IP space and might have scalability issues.
Early Detection!
Design Space and Related Work
• Network based, exploit based: [Polygraph-SSP05], [Hamsa-SSP06], [PADS-INFOCOM05], [CFG-RAID05], [Nemean-Security05]
• Network based, vulnerability based: LESG (this paper)
• Host based: [DOCODA-CCS05], [TaintCheck-NDSS05], [Vulsig-SSP06], [Vigilante-SOSP05], [COVERS-CCS05], [ShieldGen-SSP07]
• Most host based approaches depend on a lot of host information, such as the source/binary code of the vulnerable program, the vulnerability condition, execution traces, etc.
Outline
• Motivation and Related Work
• Design of LESG
• Problem Statement
• Three Stage Algorithm
• Attack Resilience Analysis
• Evaluation
• Conclusions
Basic Ideas
• At least 75% of vulnerabilities are due to buffer overflow
• Field length is intrinsic to the buffer overflow vulnerability and hard to evade
• However, there could be thousands of fields, so selecting the optimal field set is hard
[Figure: a field of the protocol message overflows the vulnerable buffer (Overflow!)]
Framework
[Figure: system framework. Components: network tap; known worm filter; protocol classifier (TCP 25, TCP 53, TCP 80, . . ., TCP 137, UDP 1434); worm flow classifier; suspicious traffic pool; normal traffic reservoir; normal traffic pool; LESG; signatures. Annotations: real time; policy driven; ICDCS06, INFOCOM06, TON.]
LESG Signature Generator
Field Hierarchies
[Figure: field hierarchy of a DNS PDU]
Length-based Signature Definition
• Fields of the example message: Name, Type, Class, TTL, RDlength, RDATA
• Length signature: a (field, length) pair, e.g. (Name,100)
• Signature set: {(Name,100), (Class,50), (RDATA,300)}, with an “OR” relationship among its signatures
• Ground truth signature: (RDATA,315), the length of the vulnerable buffer (Buffer length!)
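The definition above can be sketched in Python as follows. The sketch assumes that a flow has already been parsed into field lengths, and that a flow matches a signature when the field is strictly longer than the signature length; the slides do not state whether the comparison is strict, and the names `matches`, `matches_any` and the sample flows are hypothetical.

```python
from typing import Dict, List, Tuple

Signature = Tuple[str, int]  # (field name, length)
Flow = Dict[str, int]        # field name -> observed field length in bytes


def matches(sig: Signature, flow: Flow) -> bool:
    """Check one length signature against a parsed flow."""
    field, length = sig
    # Assumption: a match means the field is strictly longer than the length.
    return flow.get(field, 0) > length


def matches_any(sig_set: List[Signature], flow: Flow) -> bool:
    """A signature set has an "OR" relationship: one matching signature is enough."""
    return any(matches(sig, flow) for sig in sig_set)


# Signature set and ground truth signature from the slide.
SIGNATURE_SET = [("Name", 100), ("Class", 50), ("RDATA", 300)]
GROUND_TRUTH = ("RDATA", 315)

# Hypothetical parsed flows (field lengths are made up for illustration).
benign = {"Name": 24, "Type": 2, "Class": 2, "TTL": 4, "RDlength": 2, "RDATA": 16}
overflow = {"Name": 24, "Type": 2, "Class": 2, "TTL": 4, "RDlength": 2, "RDATA": 400}

print(matches_any(SIGNATURE_SET, benign))
print(matches_any(SIGNATURE_SET, overflow))
print(matches(GROUND_TRUTH, overflow))
```

Under this assumption, a flow with a 310-byte RDATA field would match (RDATA,300) but not the ground truth (RDATA,315), so a generated signature can be slightly looser than the real buffer length.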
Problem Formulation
• Input: suspicious pool and normal pool; output: LESG signature
• Constraint: the worms in the suspicious pool that are not covered by the signature are at most a given bound
• Objective: minimize the false positives in the normal pool
• With noise: NP-Hard!
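To make the optimization problem concrete, here is a small exhaustive search over candidate signature sets. It is a sketch for illustration only, not the LESG algorithm: the bound parameter `max_uncovered`, the strict length comparison, and the toy pools are all assumptions.

```python
from itertools import combinations


def match(sig_set, flow):
    # Assumption: a flow matches when any signed field exceeds its length.
    return any(flow.get(field, 0) > length for field, length in sig_set)


def rate(sig_set, pool):
    """Fraction of flows in the pool matched by the signature set."""
    return sum(match(sig_set, flow) for flow in pool) / len(pool)


def best_signature_set(candidates, suspicious, normal, max_uncovered):
    """Try every non-empty subset; exponential in the number of candidates."""
    best = None
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            uncovered = 1.0 - rate(subset, suspicious)
            if uncovered > max_uncovered:
                continue  # violates the coverage constraint
            fp = rate(subset, normal)
            if best is None or fp < best[0]:
                best = (fp, subset)  # lower false positive rate wins
    return best


# Toy pools: the last suspicious flow plays the role of noise.
suspicious = [{"RDATA": 400}, {"RDATA": 350}, {"Name": 150}, {"Name": 20, "RDATA": 10}]
normal = [{"Name": 20, "RDATA": 16}] * 9 + [{"Name": 120, "RDATA": 16}]
candidates = [("Name", 100), ("RDATA", 300)]

result = best_signature_set(candidates, suspicious, normal, max_uncovered=0.5)
print(result)
```

The search enumerates every subset, so its cost doubles with each additional candidate; that is tolerable for two candidates but not for thousands of fields, which is why a cheaper strategy is needed.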
Stages I and II
• Stage I: Field Filtering
– Keep fields with COV≥1% and FP≤0.1%
• Stage II: Length Optimization
– Trade off between specificity and sensitivity
– Score function Score(COV,FP)
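The two stages can be sketched for a single field as below. This is a rough illustration under stated assumptions: candidate lengths are taken from the suspicious pool, the Stage I thresholds are applied to each candidate length, and `score` is a placeholder because the slides name Score(COV,FP) without giving its form. The function names and the toy pools are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Flow = Dict[str, int]  # field name -> field length

COV0 = 0.01   # Stage I coverage threshold from the slide (1%)
FP0 = 0.001   # Stage I false positive threshold from the slide (0.1%)


def frac_longer(pool: List[Flow], field: str, length: int) -> float:
    """Fraction of flows whose field is strictly longer than the given length."""
    return sum(flow.get(field, 0) > length for flow in pool) / len(pool)


def score(cov: float, fp: float) -> float:
    # Placeholder only: rewards coverage and penalizes false positives.
    return cov - 100.0 * fp


def optimize_field(field: str, suspicious: List[Flow],
                   normal: List[Flow]) -> Optional[Tuple[int, float, float]]:
    """Return (length, COV, FP) for the best-scoring length, or None if filtered out."""
    best = None
    # Candidate lengths: one below each length observed in the suspicious pool.
    for observed in sorted({flow[field] for flow in suspicious if field in flow}):
        length = observed - 1
        cov = frac_longer(suspicious, field, length)
        fp = frac_longer(normal, field, length)
        if cov < COV0 or fp > FP0:
            continue  # Stage I: drop candidates that miss the thresholds
        if best is None or score(cov, fp) > score(best[1], best[2]):
            best = (length, cov, fp)  # Stage II: keep the best-scoring length
    return best


# Toy pools: three worm flows with long RDATA and one noise flow.
suspicious = [{"RDATA": 400, "TTL": 4}, {"RDATA": 350, "TTL": 4},
              {"RDATA": 320, "TTL": 4}, {"RDATA": 12, "TTL": 4}]
normal = [{"RDATA": n, "TTL": 4} for n in range(10, 210)]

print(optimize_field("RDATA", suspicious, normal))
print(optimize_field("TTL", suspicious, normal))
```

In this toy data the TTL field is filtered out because every normal flow is as long as the suspicious ones, while RDATA survives with a length just below the shortest worm sample.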
Stage III
• Find the optimal set of fields as the signature, with high coverage and low false positives
• Separate the fields into two sets, FP=0 and FP>0
– Opportunistic step (FP=0)
– Attack Resilience step (FP>0)
• A similar greedy algorithm is used for each step
Stage III (cont.)
• Fields: Name, Type, Class, TTL, Comments, RDATA
• Stage I thresholds: COV0≥1%, FP0≤0.1%
• Residual coverage threshold: ≥5%
• Candidates [suspicious, normal]:
– (RDATA,300) [50%,0.05%]
– (Name,100) [40%,0.03%]
– (Class,50) [35%,0.09%]
– (Comments,2000) [10%,0.1%]
• RDATA: 50% (suspicious), 0.05% (normal)
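Stage III (cont.)
• Fields: Name, Type, Class, TTL, Comments, RDATA
• Stage I thresholds: COV0≥1%, FP0≤0.1%
• Residual coverage threshold: ≥5%
• Selected: {(RDATA,300)}, 50% (suspicious), 0.05% (normal)
• Candidates, residual [suspicious, normal]:
– (Class,50) [25%,0.02%]
– (Name,100) [3%,0.08%]
– (Comments,2000) [1%,0.05%]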
Stage III (cont.)
• Fields: Name, Type, Class, TTL, Comments, RDATA
• Stage I thresholds: COV0≥1%, FP0≤0.1%
• Residual coverage threshold: γ≥5%
• Selected: {(RDATA,300),(Class,50)}, (50+25)% (suspicious), (0.05+0.02)% (normal)
• Candidates, residual [suspicious, normal]:
– (Class,50) [25%,0.02%]
– (Name,100) [3%,0.08%]
– (Comments,2000) [1%,0.05%]
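The three slides above walk through a greedy selection driven by residual coverage. A minimal sketch of that loop is given below, assuming each candidate is described by the set of suspicious flow IDs it matches. The flow IDs are made up so that the coverage figures agree with the slides, false positive handling (the FP=0 and FP>0 steps) is left out, and `greedy_select` is a hypothetical name.

```python
GAMMA = 0.05  # residual coverage threshold from the slide (5%)


def greedy_select(candidates, covered_by, pool_size, gamma=GAMMA):
    """Pick signatures by largest residual coverage until it drops below gamma."""
    selected = []
    covered = set()
    remaining = list(candidates)
    while remaining:
        # Residual coverage counts flows a candidate matches that are not yet covered.
        best = max(remaining, key=lambda sig: len(covered_by[sig] - covered))
        residual = len(covered_by[best] - covered) / pool_size
        if residual < gamma:
            break
        selected.append(best)
        covered |= covered_by[best]
        remaining.remove(best)
    return selected, len(covered) / pool_size


# Suspicious pool of 100 flows, identified by IDs 0..99 (illustrative data).
covered_by = {
    ("RDATA", 300): set(range(0, 50)),                          # 50%
    ("Name", 100): set(range(0, 37)) | set(range(75, 78)),      # 40%, 3% outside RDATA
    ("Class", 50): set(range(40, 50)) | set(range(50, 75)),     # 35%, 25% outside RDATA
    ("Comments", 2000): set(range(0, 9)) | {78},                # 10%, 1% outside RDATA
}

selected, coverage = greedy_select(list(covered_by), covered_by, pool_size=100)
print(selected, coverage)
```

With this data the loop picks (RDATA,300) first, then (Class,50) with 25% residual coverage, and stops because (Name,100) adds only 3%, which is below the 5% threshold.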
Attack Resilience Bounds
• Depending on whether deliberate noise injection (DNI) exists, we get different bounds
• With 50% noise in the suspicious pool, we get the worst case bounds FN<2% and FP<1%
• In practice, the DNI attack can only achieve FP<0.2%
• Resilient to most attacks proposed in other papers
Methodology
• Protocol parsing with Bro and BINPAC
(IMC2006)
• Worm workload
– Eight polymorphic worms created based on
real world vulnerabilities including CodeRed
II and Lion worms.
– DNS, SNMP, FTP, SMTP
• Normal traffic data
– 27GB from a university gateway and 123GB
email log
Results
• Single/Multiple worms with noise
– Noise ratio: 0~80%
– False negative: 0~1% (mostly 0)
– False positive: 0~0.01% (mostly 0)
• Pool size requirement
– 10 or 20 flows are enough even with 20% noise
• Speed results
– With 500 samples in the suspicious pool and 320K samples in the normal pool, for DNS: parsing 58 secs, LESG 18 secs
Conclusions
• A novel network-based automated worm
signature generation approach
– Works for zero day polymorphic worms with unknown vulnerabilities
– First work that is both vulnerability based and network based, using length signatures for buffer overflow vulnerabilities
– Provable attack resilience
– Shown to be fast and accurate in experiments