Transcript icnp07

Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms
Zhichun Li (1), Lanjia Wang (2), Yan Chen (1) and Judy Fu (3)
(1) Lab for Internet and Security Technology (LIST), Northwestern Univ.
(2) Tsinghua University, China
(3) Motorola Labs, USA
The Spread of Sapphire/Slammer Worms
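Limitations of Content Based Signature
[Figure: Internet traffic is filtered against the content signature 10.*01 before it reaches our network. Payloads 1010101 and 10111101 are blocked (X); payloads 11111100 and 00010111 are not. Label: Polymorphism!]
• A polymorphic worm might not have an exact content based signature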
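The limitation can be checked directly on the four bit strings from the slide. The sketch below is only an illustration: it assumes the signature `10.*01` is applied as an unanchored regular expression search, which the slide does not state.

```python
import re

# Content-based signature from the slide: "10", anything, then "01".
# Assumption: the filter searches for the pattern anywhere in the payload.
SIGNATURE = re.compile(r"10.*01")

def matches(payload: str) -> bool:
    """Return True if the content signature occurs in the payload."""
    return SIGNATURE.search(payload) is not None

# The four payloads shown on the slide.
payloads = ["1010101", "10111101", "11111100", "00010111"]
results = {p: matches(p) for p in payloads}

for payload, hit in results.items():
    print(payload, "blocked" if hit else "missed")
```

The first two payloads match and the last two do not, which is the gap a polymorphic worm can exploit by changing its byte content.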
Vulnerability Signature
[Figure: Internet traffic passes through vulnerability signature traffic filtering before it reaches the vulnerability inside our network; worm flows are blocked (X)]
• Works for polymorphic worms
• Works for all worms that target the same vulnerability
Network Based Detection
[Figure: Internet, gateway routers, our network; host based detection sits inside the network]
• At the early stage of a worm, only a limited number of worm samples are available.
• Host based sensors can cover only a limited IP space, which might cause scalability issues. Thus they might not be able to detect the worm in its early stage.
Design Space and Related Work
Network Based
– Exploit Based: [Polygraph-SSP05], [Hamsa-SSP06], [PADS-INFOCOM05], [CFG-RAID05], [Nemean-Security05]
– Vulnerability Based: LESG (this paper)
Host Based
– Exploit Based: [DOCODA-CCS05], [TaintCheck-NDSS05]
– Vulnerability Based: [Vulsig-SSP06], [Vigilante-SOSP05], [COVERS-CCS05], [ShieldGen-SSP07]
• Most host based approaches depend on a lot of host information, such as the source or binary code of the vulnerable program, the vulnerability condition, execution traces, etc.
Outline
• Motivation and Related Work
• Design of LESG
• Problem Statement
• Three Stage Algorithm
• Attack Resilience Analysis
• Evaluation
• Discussions and Conclusions
Key Ideas
• At least 75% of vulnerabilities are due to buffer overflow
• Some protocol fields might map to the vulnerable buffer and trigger the vulnerability
• To overflow the buffer, the length of such a protocol field has to be longer than the buffer length
• This property is intrinsic to buffer overflow vulnerabilities and hard to evade
• However, there could be thousands of fields, so selecting the optimal field set is hard
Framework
• Sniff network traffic from network
gateways
• Filter out known worms
• Existing flow classifiers
– Separate traffic into a suspicious traffic pool
and a normal traffic pool
– E.g. port scan detector, honeynets
• LESG Signature Generator
LESG Signature Generator
[Figure: LESG signature generator diagram]
Field Hierarchies
[Figure: field hierarchy of a DNS PDU]
Length-based Signature Definition
• Signature $S_j = (f_j, l_j)$, $f_j \in E$, where $l_j$ is the signature length for field $f_j$
• Matching: for flow $X = \{x_1, x_2, \ldots, x_k, \ldots, x_K\}$,
– if $x_{f_j} > l_j$, flow $X$ is labeled as a worm flow
• Signature set $S = \{S_1, S_2, \ldots, S_J\}$
– worm flows: match at least one signature
• Ground truth signature $B = (f_B, L_B)$, where $L_B$ is the vulnerable buffer length
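The matching rule above can be sketched in a few lines of Python. This is an illustration only: the field names and lengths are hypothetical, the comparison is assumed to be strictly greater than, and a field missing from a flow is assumed to have length 0.

```python
from typing import Dict, Iterable, NamedTuple

class LengthSignature(NamedTuple):
    field: str   # f_j: identifier of a protocol field
    length: int  # l_j: signature length for that field

# A flow is modelled as a map from field identifier to observed field length.
Flow = Dict[str, int]

def matches(sig: LengthSignature, flow: Flow) -> bool:
    """A flow matches when the named field is longer than the signature length."""
    # Assumption: a field absent from the flow counts as length 0.
    return flow.get(sig.field, 0) > sig.length

def is_worm_flow(signatures: Iterable[LengthSignature], flow: Flow) -> bool:
    """A flow is labeled as a worm flow if it matches at least one signature."""
    return any(matches(sig, flow) for sig in signatures)

# Hypothetical signature set and flows, for illustration only.
signature_set = [LengthSignature("dns.qname", 255), LengthSignature("dns.rdata", 512)]
normal_flow = {"dns.qname": 24, "dns.rdata": 16}
worm_flow = {"dns.qname": 300, "dns.rdata": 16}

print(is_worm_flow(signature_set, normal_flow))  # False
print(is_worm_flow(signature_set, worm_flow))    # True
```

Because the check looks only at field lengths, it does not depend on the byte content that a polymorphic worm can vary.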
2016/4/9
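Problem Formulation
[Figure: the suspicious pool and the normal pool are the inputs to LESG, which outputs a signature]
• Coverage bound: coverage in the suspicious pool is bounded by 1- (the parameter symbol that follows "1-" is missing from the transcript)
• Objective: minimize the false positives in the normal pool
• With noise, the problem is NP-hard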
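The two quantities in the formulation can be measured on the pools as match rates. The sketch below assumes that coverage is the fraction of suspicious pool flows that match the signature set and that false positive is the fraction of normal pool flows that match. It also assumes that the coverage bound is a lower bound. `min_coverage` is a hypothetical stand-in for that bound, and all flows are made up.

```python
from typing import Dict, List, Tuple

Flow = Dict[str, int]        # field identifier -> field length
Signature = Tuple[str, int]  # (field identifier, signature length)

def flagged(signatures: List[Signature], flow: Flow) -> bool:
    """True if the flow matches at least one length-based signature."""
    return any(flow.get(field, 0) > length for field, length in signatures)

def match_rate(signatures: List[Signature], pool: List[Flow]) -> float:
    """Fraction of flows in the pool flagged by the signature set."""
    if not pool:
        return 0.0
    return sum(flagged(signatures, flow) for flow in pool) / len(pool)

def satisfies_bound(signatures: List[Signature], suspicious: List[Flow],
                    min_coverage: float) -> bool:
    """Check the coverage constraint (min_coverage is a hypothetical parameter)."""
    return match_rate(signatures, suspicious) >= min_coverage

# Hypothetical pools: three worm flows and one noise flow in the suspicious pool.
suspicious_pool = [{"qname": 300}, {"qname": 310}, {"qname": 320}, {"qname": 20}]
normal_pool = [{"qname": 12}, {"qname": 25}, {"qname": 30}, {"qname": 40}]
signatures = [("qname", 255)]

cov = match_rate(signatures, suspicious_pool)  # coverage in the suspicious pool
fp = match_rate(signatures, normal_pool)       # false positives in the normal pool
print(cov, fp)
```

Among signature sets that satisfy the coverage constraint, the formulation asks for the one with the lowest false positive rate in the normal pool.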
Stage I and II
• Stage I: Field Filtering (COV=1%, FP=0.1%)
• Stage II: Length Optimization, using the trade-off score function Score(COV,FP)
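One possible reading of the two stages, sketched for a single field, is shown below. It is not taken from the paper. It assumes Stage I discards candidates whose coverage is below a threshold or whose false positive rate is above a threshold, and Stage II keeps the candidate length with the highest score. The candidate set (each observed length minus one), the placeholder score function, and the pools are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Flow = Dict[str, int]  # field identifier -> field length

def rate(pool: List[Flow], field: str, length: int) -> float:
    """Fraction of flows in the pool whose field is longer than `length`."""
    if not pool:
        return 0.0
    return sum(1 for flow in pool if flow.get(field, 0) > length) / len(pool)

def best_length(field: str, suspicious: List[Flow], normal: List[Flow],
                score: Callable[[float, float], float],
                cov0: float = 0.01, fp0: float = 0.001) -> Optional[int]:
    """Return the best signature length for `field`, or None if it is filtered out."""
    # Assumption: candidate lengths are the observed suspicious lengths minus one,
    # so that a flow of exactly that observed length still matches.
    candidates = sorted({flow[field] - 1 for flow in suspicious if flow.get(field, 0) > 0})
    best = None
    for length in candidates:
        cov = rate(suspicious, field, length)
        fp = rate(normal, field, length)
        if cov < cov0 or fp > fp0:   # Stage I style filtering
            continue
        value = score(cov, fp)       # Stage II style length optimization
        if best is None or value > best[1]:
            best = (length, value)
    return None if best is None else best[0]

# Placeholder score, hypothetical: reward coverage and penalize false positives.
simple_score = lambda cov, fp: cov - fp

suspicious_pool = [{"qname": 300, "id": 2}, {"qname": 310, "id": 2},
                   {"qname": 320, "id": 2}, {"qname": 20, "id": 2}]
normal_pool = [{"qname": 12, "id": 2}, {"qname": 20, "id": 2}, {"qname": 25, "id": 2},
               {"qname": 30, "id": 2}, {"qname": 40, "id": 2}]

print(best_length("qname", suspicious_pool, normal_pool, simple_score))  # 299
print(best_length("id", suspicious_pool, normal_pool, simple_score))     # None
```

In this toy data the fixed-length `id` field is filtered out because every normal flow would match it, while `qname` survives with the length that covers all three long flows at zero false positives.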
Stage III
• Approximately find the optimal set of fields to use as the signature
• Separate the fields into two sets, FP=0 and FP>0
– Opportunistic step (FP=0)
– Attack Resilience step (FP>0)
• A similar greedy algorithm is used in each step
– At every iteration, find the field with the maximum residual coverage, and accept it only if that coverage is no less than a threshold.
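The greedy selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm. Residual coverage is taken to be the fraction of the suspicious pool that a candidate newly covers, the FP=0 candidates are processed before the FP>0 candidates, and the thresholds, candidates, and flows are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Flow = Dict[str, int]        # field identifier -> field length
Signature = Tuple[str, int]  # (field identifier, signature length)

def greedy_step(candidates: List[Signature], pool: List[Flow],
                covered: Set[int], threshold: float) -> List[Signature]:
    """Repeatedly pick the candidate with maximum residual coverage."""
    chosen: List[Signature] = []
    remaining = list(candidates)

    def newly_covered(sig: Signature) -> Set[int]:
        field, length = sig
        return {i for i, flow in enumerate(pool)
                if i not in covered and flow.get(field, 0) > length}

    while remaining and pool:
        best = max(remaining, key=lambda sig: len(newly_covered(sig)))
        gain = newly_covered(best)
        if len(gain) / len(pool) < threshold:
            break  # residual coverage fell below the threshold
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen

def stage_three(candidates: List[Tuple[Signature, float]], pool: List[Flow],
                thr_zero_fp: float, thr_pos_fp: float) -> List[Signature]:
    """Run the greedy step on FP=0 candidates first, then on FP>0 candidates."""
    zero_fp = [sig for sig, fp in candidates if fp == 0]
    pos_fp = [sig for sig, fp in candidates if fp > 0]
    covered: Set[int] = set()
    return (greedy_step(zero_fp, pool, covered, thr_zero_fp)
            + greedy_step(pos_fp, pool, covered, thr_pos_fp))

# Hypothetical suspicious pool: 6 flows long in field a, 3 long in field b, 1 noise flow.
pool = ([{"a": 100, "b": 10, "c": 1}] * 6 + [{"a": 5, "b": 200, "c": 1}] * 3
        + [{"a": 5, "b": 10, "c": 5}])
candidates = [(("a", 50), 0.0), (("b", 150), 0.0), (("c", 2), 0.02)]

selected = stage_three(candidates, pool, thr_zero_fp=0.05, thr_pos_fp=0.2)
print(selected)  # [('a', 50), ('b', 150)]
```

The candidate on field `c` would add only the single noise flow, so its residual coverage of 0.1 stays below the threshold for the FP>0 step and it is left out.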
Attack Resilience Bounds
[Figure: accuracy scale from High to Low, listing the Ground Truth Signature, the signature when the vulnerable field is known, the multiple field optimal signature, and the LESG Signature, annotated with bounds b0 and b1]
• With different assumptions on b0 and on whether deliberate noise injection (DNI) exists, we get bound b1
– DNI: Theorems 2 and 3
– No DNI: Theorems 4 and 5
• With 90% noise in the suspicious pool, we can get FN<10% and FP<1.8%
• Resilient to most proposed attacks
Methodology
• Protocol parsing with Bro and BINPAC
• Worm workload
– Eight polymorphic worms created based on
real world vulnerabilities
– DNS, SNMP, FTP, SMTP
• Normal traffic data
– 27GB from a university gateway and
123GB email log.
• Experiment Settings
$\mathrm{Score}(COV, FP) = (1/\log FP + 1) \cdot COV$
$FP_0 = 0.1\%$, $COV_0 = 1\%$, and two further thresholds set to 1% and 5% (their symbols are missing from the transcript)
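A short numeric sketch shows how such a score trades coverage against false positives. It reads the score on the slide as (1/log FP + 1) * COV. The "+" sign, the base-10 logarithm, and the handling of FP = 0 and FP >= 1 are assumptions, since the operators did not survive in the transcript.

```python
import math

def score(cov: float, fp: float) -> float:
    """Trade-off score, read as (1 / log10(FP) + 1) * COV (an assumed form)."""
    if fp <= 0.0:
        # Assumption: use the limit as FP approaches 0, where 1/log10(FP) tends to 0.
        return cov
    if fp >= 1.0:
        # Assumption: log10(1) = 0 would divide by zero, so treat this as the worst score.
        return float("-inf")
    return (1.0 / math.log10(fp) + 1.0) * cov

# Two hypothetical candidates: slightly higher coverage versus much lower false positives.
a = score(0.99, 0.001)    # (1/(-3) + 1) * 0.99
b = score(0.98, 0.00001)  # (1/(-5) + 1) * 0.98
print(round(a, 3), round(b, 3))
```

Under these assumptions the second candidate scores higher, so a small loss in coverage is accepted in exchange for a false positive rate that is two orders of magnitude lower.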
Results
• Single/Multiple worms with noise
– Noise ratio: 0~80%
– False negative: 0~1% (mostly 0)
– False positive: 0~0.01% (mostly 0)
• Speed and memory consumption
– For DNS, parsing takes 58 seconds and LESG takes 18 seconds for (500, 320K)
• Pool size requirement
– A pool size of 10 or 20 is enough
Results – Attack Resilience
• The attacker spreads not only worm flows but also worst-case fake noise to mislead the signature generation
• DNS Lion worm, noise ratio: 8%~92%, suspicious pool size 200
Conclusions
• A novel network-based automated worm signature generation approach
– Works for zero-day polymorphic worms with unknown vulnerabilities
– Vulnerability based and network based
– Length-based signatures for buffer overflow worms
– Provable attack resilience
– Shown to be fast and accurate in experiments
Backup Slides
Discussions of Practical Issues
• Speed of signature matching
– Major overhead: protocol parsing
– Software (Bro with Binpac): 50~200Mbps
– Optimized Binpac: 600Mbps
– Hardware: 3Gbps
• Relationship between fields and buffers
– Mostly a direct mapping between fields and buffers
– Analyzed 19 vulnerabilities, 1 exception
LEngth-based Signature Generator (LESG)
• Thwarts zero-day polymorphic worms
• Network-based: only uses network level info
• Vulnerability-based: targets buffer overflow worms (75% of vulnerabilities are based on buffer overflow)
• Attack resilient
• Noise tolerant
• Can detect zero-day worms in real time
• Efficient signature matching