A Statistical Anomaly Detection Technique
based on
Three Different Network Features
Yuji Waizumi
Tohoku Univ.
Background
The Internet has entered the business world
Need to protect information and systems from hackers and attacks
Network security has become an important issue
Many intrusion/attack detection methods have been proposed
Intrusion Detection System
Two major detection principles:

Signature Detection
 Attempts to flag behavior that is close to some previously defined pattern (signature) of a known intrusion

Anomaly Detection
 Attempts to quantify the usual or acceptable behavior and flags other irregular behavior as potentially intrusive
Motivation
Anomaly detection system

 Pro: can detect unknown attacks
 Con: many false positives

Improve the performance of the anomaly detection system

 Analyze the characteristics of attacks
 Propose a method to construct features as numerical values from network traffic
 Construct a detection system using the features
Classification of Attacks
DARPA Intrusion Detection Evaluation

 DoS: Denial of Service
 Probe: Surveillance of Targets
 Remote to Local (R2L), User to Root (U2R): Unauthorized Access to a Host or Super User
Re-classification of Attacks
Classification by Traffic Characteristics

DoS, Probe
 Traffic Quantity
 Access Range

Probe
 Structure of Communication Flows

DoS, R2L, U2R
 Contents of Communications

To detect attacks with the above characteristics, it is necessary to construct features corresponding to those classes.
Network Traffic Feature
Numerical values (vectors) expressing the state of traffic
We propose three different network feature sets

 Based on the re-classification of attacks
 Analyzed independently
Time Slot Feature (34 dimensions)
Count various packets, flags, transmission and reception bytes, and port variety per unit time
Estimate the scale and range of attacks
Target

 Probe (Scan)
 DoS

Each slot is expressed as a vector
Ex) (TCP, ICMP, SYN, FIN, RST, UDP, DNS, …)
Examples (Time Slot Feature)
[Figure: time-slot feature vectors (element value vs. vector element); values are regularized to mean = 0, variance = 1.0. The rst-flag count on port 21 stands out during an ftp scan, and the rst-flag count on port 23 during a telnet scan, compared with normal traffic only.]
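As a rough illustration only (not the authors' code), the sketch below shows how per-slot count vectors and the mean-0 / variance-1.0 regularization could be implemented. The packet record layout and the reduced 9-element field set are assumptions; the actual feature has 34 dimensions.

```python
import numpy as np

# Hypothetical packet record: (timestamp, proto, tcp_flags, dst_port, length),
# with proto given as "tcp", "udp", or "icmp".
FIELDS = ["tcp", "udp", "icmp", "syn", "fin", "rst",
          "dns", "bytes", "dst_port_variety"]

def time_slot_features(packets, slot_seconds=60):
    """Aggregate packets into per-slot count vectors (a reduced 9-element
    sketch of the 34-dimensional time slot feature)."""
    slots = {}
    for ts, proto, flags, dst_port, length in packets:
        slot = int(ts // slot_seconds)
        s = slots.setdefault(slot, {f: 0 for f in FIELDS} | {"_ports": set()})
        s[proto] += 1                               # protocol counts
        for flag in ("syn", "fin", "rst"):          # TCP flag counts
            if flag in flags:
                s[flag] += 1
        if dst_port == 53:                          # DNS packets
            s["dns"] += 1
        s["bytes"] += length                        # transferred bytes
        s["_ports"].add(dst_port)                   # destination port variety
    vectors = []
    for slot in sorted(slots):
        s = slots[slot]
        s["dst_port_variety"] = len(s.pop("_ports"))
        vectors.append([s[f] for f in FIELDS])
    return np.asarray(vectors, dtype=float)

def regularize(train, test):
    """Regularize each element to mean 0, variance 1.0 using training statistics."""
    mean, std = train.mean(axis=0), train.std(axis=0) + 1e-9
    return (train - mean) / std, (test - mean) / std
```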
Flow Counting Feature
A flow is specified by (srcIP, dstIP, srcPort, dstPort, protocol)
Count packets, flags, transmission and reception bytes in a flow
Target

 Scans with illegal flags
 Ports used as backdoors

TCP: 19 dim., UDP: 7 dim.
Examples (Flow Counting Feature)
[Figure: flow-counting feature vectors (element value vs. vector element). Specific elements of attack flows are extremely high or low; a port sweep (scan) shows a decrease in SYN packets compared with normal traffic.]
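A hedged sketch of how flows keyed by the 5-tuple could be counted. The packet record layout and the reduced 5-count set are assumptions; the actual feature uses 19 dimensions for TCP and 7 for UDP.

```python
import numpy as np

def flow_counting_features(packets):
    """Build per-flow count vectors keyed by
    (srcIP, dstIP, srcPort, dstPort, protocol)."""
    flows = {}
    for src, dst, sport, dport, proto, flags, length in packets:
        key = (src, dst, sport, dport, proto)
        f = flows.setdefault(key, {"pkts": 0, "bytes": 0,
                                   "syn": 0, "fin": 0, "rst": 0})
        f["pkts"] += 1                      # packets in the flow
        f["bytes"] += length                # transmitted/received bytes
        for flag in ("syn", "fin", "rst"):  # flag counts (illegal-flag scans stand out)
            if flag in flags:
                f[flag] += 1
    order = ["pkts", "bytes", "syn", "fin", "rst"]
    keys = sorted(flows)
    return keys, np.asarray([[flows[k][c] for c in order] for k in keys], dtype=float)
```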
Flow Payload Feature
Represents the content of communication
Histogram of the character codes of a flow

 Counted in 8-bit units (256 classes)
 Transmission and reception are counted independently (512 classes in total)

Target

 Buffer overflow
 Malicious code
Examples (Flow Payload Feature)
[Figure: flow-payload histograms. Specific character codes of attack flows are extremely high or low; an imap attack clearly differs from normal traffic.]
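A minimal sketch of the 512-dimensional payload histogram (256 byte-value classes per direction), assuming the two directions of a flow have already been reassembled into byte strings; the sample payloads below are invented.

```python
import numpy as np

def flow_payload_feature(tx_payload: bytes, rx_payload: bytes) -> np.ndarray:
    """8-bit character-code histogram (256 classes) per direction,
    concatenated into a 512-dimensional vector; transmission and
    reception are counted independently."""
    tx_hist = np.bincount(np.frombuffer(tx_payload, dtype=np.uint8), minlength=256)
    rx_hist = np.bincount(np.frombuffer(rx_payload, dtype=np.uint8), minlength=256)
    return np.concatenate([tx_hist, rx_hist]).astype(float)

# A long run of a single byte value (e.g. padding in a buffer-overflow payload)
# produces a histogram spike that differs sharply from normal text-like traffic.
vec = flow_payload_feature(b"* LOGIN user pass\r\n", b"\x90" * 400)
```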
Modeling Normal Behavior
Each packet appears according to its protocol
Correlations exist between elements of the feature vectors
A profile based on these correlations can represent the normal behavior of network traffic
Principal Component Analysis: PCA
Extracts the correlation among samples as principal components
Principal components lie along the sample distribution
[Figure: sample distributions; panels labeled "Non-correlated data" and "Principal Component".]
Discriminant Function
Projection distance onto the principal components is used as the detection criterion
[Figure: correlated samples and their principal component; samples with a long projection distance (unordinary traffic, break-ins) are flagged as anomaly samples.]
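A minimal numpy sketch of the PCA profile and the projection-distance criterion described on these slides; the number of components k and the alert threshold are assumptions, not values from the paper.

```python
import numpy as np

def fit_profile(normal_vectors, k):
    """Learn the principal components of regularized normal-traffic vectors;
    the components capture the correlations between vector elements."""
    mean = normal_vectors.mean(axis=0)
    cov = np.cov(normal_vectors - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, top]                    # top-k principal components

def projection_distance(x, mean, components):
    """Distance between a sample and its projection onto the principal
    component subspace; a long distance marks an anomaly sample."""
    centered = x - mean
    projected = components @ (components.T @ centered)
    return np.linalg.norm(centered - projected)

# Usage: alert when the distance exceeds a threshold chosen on normal traffic.
# mean, pcs = fit_profile(train_vectors, k=5)
# alert = projection_distance(test_vector, mean, pcs) > threshold
```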
Detection Algorithm
Independent Detection

 The three features are used for PCA independently
 A "logical OR" operation combines the detection alerts from each feature
[Diagram: network traffic → three features (Time Slot, Flow Counting, Flow Payload) → PCA for each → per-feature alerts → OR → final alert]
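To illustrate the "logical OR" combination, here is a sketch assuming each feature already yields a per-sample anomaly score from its own PCA model; the score and threshold values in the example are invented.

```python
def independent_detection(scores, thresholds):
    """Each feature (Time Slot, Flow Counting, Flow Payload) is scored by its
    own PCA model; an alert from any single feature raises the final alert
    ("logical OR")."""
    alerts = {name: scores[name] > thresholds[name]
              for name in ("time_slot", "flow_counting", "flow_payload")}
    return any(alerts.values()), alerts

# Example (invented numbers): only the payload feature fires,
# but the OR rule still raises the final alert.
fired, detail = independent_detection(
    {"time_slot": 1.2, "flow_counting": 0.8, "flow_payload": 9.5},
    {"time_slot": 3.0, "flow_counting": 3.0, "flow_payload": 3.0},
)
```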
Performance Evaluation
Two evaluation scenarios

Scenario 1
 Learn on Weeks 1 and 3
 Test on Weeks 4 and 5

Scenario 2
 Learn on Weeks 4 and 5
 Test on Weeks 4 and 5
 A more practical situation

Real network traffic may include attack traffic

Criterion for Evaluation

Detection rate when the number of false positives per day is 10
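The evaluation criterion could be computed roughly as below, assuming per-sample anomaly scores and attack labels are available; this is a sketch of the "10 false positives per day" budget, not the official DARPA scoring procedure.

```python
import numpy as np

def detection_rate_at_fp_budget(scores, labels, days, fp_per_day=10):
    """Pick the threshold that allows at most fp_per_day false positives per
    day over the whole test period, then report the detection rate.
    labels: 1 for attack instances, 0 for normal traffic."""
    scores, labels = np.asarray(scores, float), np.asarray(labels)
    normal = np.sort(scores[labels == 0])[::-1]     # normal scores, descending
    budget = int(fp_per_day * days)                 # total false-positive budget
    threshold = normal[budget] if budget < len(normal) else -np.inf
    detected = (scores > threshold) & (labels == 1)
    return detected.sum() / max(int((labels == 1).sum()), 1)
```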
Data Set

 1999 DARPA off-line intrusion detection evaluation test set
 Contains 5 weeks of data (Monday to Friday)
 Weeks 1, 3: Normal traffic only
 Week 2: Includes attacks (for learning)
 Weeks 4, 5: Include attacks (for testing)
Scenario 1 Result

Method              # of detections   # of targets   Detection rate
Proposed Method     104               171            60.8%
NETAD (2003)        132               185            71.4%
Forensics (2000)    15                27             55.6%
Expert1             85                169            50.3%
Expert2             81                173            46.8%
Dmine               41                102            40.2%
Scenario 2 Result

Method              # of detections   # of targets   Detection rate
Proposed Method     100               171            58.5%
NETAD               70                185            37.8%

NETAD
 • Uses IP addresses as a white list
 • Overfits the learning data
Proposed Method
 • Independent of IP addresses
 • Evaluates only the anomaly of the traffic
Detection Results for Each Feature

Scenario 1
[Venn diagram: attacks detected by the Time Slot Feature (TS), Flow Counting Feature (FC), and Flow Payload Feature (FP); region counts 22, 9, 5, 13, 6, 44, 5, including the (TS) & (FC) & (FP) intersection.]

Scenario 2
[Venn diagram: attacks detected by TS, FC, and FP; region counts 37, 7, 8, 3, 2, 40; labeled regions give the number detected by both TS & FP (3), by all three features, and by FP only.]

The detection overlap is low: each feature detects attacks with different characteristics.
Conclusion
For network security

 Classification of attacks into three types
 Construction of three features corresponding to the three attack characteristics
 A detection method with PCA
  Learns the three features independently

 Higher detection accuracy
  Even when the learning samples include attacks