Data Mining in Cyber Threat Analysis - MINDS
Data Mining for Network Intrusion Detection
Vipin Kumar
Army High Performance Computing Research Center
Department of Computer Science
University of Minnesota
http://www.cs.umn.edu/~kumar
Project Participants:
V. Kumar, A. Lazarevic, J. Srivastava
P. Dokas, E. Eilertson, L. Ertoz, S. Iyer, S. Ketkar, P. Tan
Research supported by AHPCRC/ARL
Cyber Threat Analysis
Incidents Reported to Computer Emergency
Response Team/Coordination Center (CERT/CC)
As the cost of information
processing and Internet
accessibility falls,
organizations are becoming
increasingly vulnerable to
potential cyber threats
such as network intrusions
[Figure: incidents reported to CERT/CC per year; y-axis 0 to 60,000, x-axis years 1990 to 2001]
Intrusions are actions that attempt to bypass security
mechanisms of computer systems
Intrusions are caused by:
Attackers accessing the system from the
Internet
Insider attackers - authorized users
attempting to gain and misuse
non-authorized privileges
Intrusion Detection
Intrusion Detection System
combination of software
and hardware that attempts
to perform intrusion detection
raises the alarm when a possible
intrusion happens
Traditional intrusion detection system (IDS) tools (e.g.
SNORT, www.snort.org) are based on signatures of known attacks
Limitations
Signature database has to be manually revised
for each new type of discovered intrusion
They cannot detect emerging cyber threats
Substantial latency in deployment of newly created signatures
across the computer system
Data Mining for Intrusion
Detection
Increased interest in data mining based IDS for detection
Attacks for which it is difficult to build signatures
Unforeseen/Unknown attacks
Emerging Threats
Data mining approaches for intrusion detection
Misuse detection
Building predictive models from labeled data sets (instances
are labeled as “normal” or “intrusive”)
Can only detect known attacks and their variations
High accuracy in detecting many kinds of known attacks
Anomaly detection
Able to detect novel attacks as deviations from “normal” behavior
Potential high false alarm rate - previously unseen (yet legitimate)
system behaviors may also be recognized as anomalies
Misuse Detection
Classification of intrusions
RIPPER [Madam ID @ Columbia U], Bayesian classifier [ADAM @
George Mason U], fuzzy association rules [Bridges00], decision
trees [ARL U Texas, Sinclair99], neural networks [Lippmann00,
Ghosh99, Canady98], genetic algorithms [Bridges00, Sinclair99]
Association pattern analysis
Building normal profile [Barbara01, Manganaris99], frequent
episodes for constructing features [Madam ID @ Columbia U]
Cost sensitive modeling
AdaCost [Fan99], MetaCost [Domingos99], [Ting00], [Karakoulas95]
Learning from rare class
[Kubat97, Fawcett97, Ling98, Provost01, Japkowicz01, Chawla01,
Joshi01]
Anomaly Detection
Statistical approaches
Finite mixture model [Yamanishi00], χ² based [Ye01]
Various anomaly detection
Temporal sequence learning [Lane98], neural networks [Ryan98],
similarity tree [Kokkinaki97], generating artificial anomalies [Fan01],
Clustering [Madam ID, Eskin02], unsupervised SVM [Madam
ID, Eskin02],
Outlier detection schemes
Nearest neighbor approaches [Knorr98, Jin01, Ramaswamy00,
Aggarwal01], Density based [Breunig00], connectivity based
[Tang01], Clustering based [Yu99]
Key Technical Challenges
Large data size
Millions of network connections
are common for commercial network sites, …
High dimensionality
Hundreds of dimensions are possible
Temporal nature of the data
Data points close in time - highly correlated
Skewed class distribution
Interesting events are very rare: looking for the “needle in a haystack”
“Mining needle in a haystack.
So much hay and so little time”
Data Preprocessing
Converting network traffic into data
High Performance Computing (HPC) can be critical for online analysis and scalability to very large data sets
MINDS Project - Recent Accomplishments
MINDS – MINnesota INtrusion
Detection System
Learning from Rare Class – Building rare
class prediction models
Anomaly/outlier detection
Summarization of attacks using
association pattern analysis
MINDS - Learning from Rare Class
Problem: Building models for rare network attacks
(Mining needle in a haystack)
Standard data mining models are not suitable for rare classes
Models must be able to handle skewed class distributions
Learning from data streams - intrusions are sequences of events
Key results:
PNrule and related work [Joshi, Agarwal, Kumar, SIAM 2001,
SIGMOD 2001, ICDM 2001, KDD 2002]
SMOTEBoost algorithm [Lazarevic, in review]
CREDOS algorithm [Joshi, Kumar, in review]
Classification based on association - add frequent items
as “meta-features” to original data set
MINDS - Anomaly and Outlier
Detection
Approach
Detecting novel attacks/intrusions by identifying them as
deviations from “normal” behavior
Goals:
Construct useful set of features for data mining algorithms
Identify novel intrusions using outlier detection schemes
Distance based techniques
Nearest neighbor approach
Mahalanobis-distance approach
Clustering based approaches
Density based schemes
Unsupervised Support Vector
Machines (SVM)
Experimental Evaluation
Publicly available data set
DARPA 1998 Intrusion Detection Evaluation Data Set
Real network data from University of Minnesota
Net-flow data collected using CISCO routers
Anomaly detection is applied 4 times a day, on a 10-minute time window
10-minute cycle: 2 million connections
SNORT (www.snort.org): open source signature-based network IDS
[Diagram components: network, net-flow data, data preprocessing, MINDS anomaly detection, anomaly scores, association pattern analysis]
DARPA 1998 Data Set
DARPA 1998 data set (prepared and managed by MIT
Lincoln Lab) includes a wide variety of intrusions
simulated in a military network environment
9 weeks of raw TCP dump data
7 weeks for training (5 million connection records)
2 weeks for testing (2 million connection records)
Connections are labeled as normal or attacks (4 main
categories of attacks - 38 attack types)
DOS - Denial of Service
Probe - e.g. port scanning
U2R - unauthorized access to gain root privileges
R2L - unauthorized remote login to machine
Two types of attacks
Bursty attacks - involve multiple network connections
Non-bursty attacks - involve single network connections
Feature construction
Three groups of features
Basic features of individual TCP connections: source
& destination IP/port, protocol, number of bytes,
duration, number of packets (used in SNORT only in stream
builder)
Time based features
For the same source (destination) IP address, number of unique destination
(source) IP addresses inside the network in last T seconds
Number of connections from source (destination) IP to the same destination
(source) port in last T seconds
Connection based features
For the same source (destination) IP address, number of unique destination
(source) IP addresses inside the network in last N connections
Number of connections from source (destination) IP to the same destination
(source) port in last N connections
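The time based features above can be sketched as follows. This is a minimal illustration, not the MINDS implementation: the record layout `(timestamp, src_ip, dst_ip, dst_port)`, the function name, and the choice to count the current connection itself are all assumptions, and the "inside the network" filter is left out for brevity.

```python
from collections import deque

def time_window_features(connections, window_seconds):
    """Hypothetical sketch: for each connection (sorted by timestamp), return
    (unique destination IPs contacted by the same source,
     connections from the same source to the same destination port)
    over the last `window_seconds` seconds, including the current connection."""
    recent = deque()
    features = []
    for ts, src, dst, port in connections:
        # Drop connections that have fallen out of the time window.
        while recent and recent[0][0] < ts - window_seconds:
            recent.popleft()
        recent.append((ts, src, dst, port))
        same_src = [c for c in recent if c[1] == src]
        unique_dsts = len({c[2] for c in same_src})
        same_port = sum(1 for c in same_src if c[3] == port)
        features.append((unique_dsts, same_port))
    return features

# Illustrative records only (addresses are made up).
conns = [
    (0.0, "10.0.0.1", "192.168.1.1", 80),
    (1.0, "10.0.0.1", "192.168.1.2", 80),
    (2.0, "10.0.0.2", "192.168.1.1", 22),
    (3.0, "10.0.0.1", "192.168.1.3", 80),
    (10.0, "10.0.0.1", "192.168.1.4", 80),
]
features = time_window_features(conns, window_seconds=5.0)
```

A source that touches many distinct destinations on one port within T seconds (the fourth record here) gets high values for both features. The connection based variant would keep the last N records instead of the last T seconds.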
MINDS Outlier Detection on DARPA’98 Data
[Figure: ROC curves for different outlier detection techniques (LOF, NN, Mahalanobis, unsupervised SVM); x-axis: False Alarm Rate, y-axis: Detection Rate]
ROC curves for bursty attacks
LOF approach is consistently better than other approaches
Unsupervised SVMs are good but only for high false alarm (FA) rate
NN approach is comparable to LOF for low FA rates, but its detection rate decreases for high FA rates
Mahalanobis-distance approach: poor due to multimodal normal behavior
ROC curves for single-connection attacks
LOF approach is superior to other outlier detection schemes
Majority of single-connection attacks are probably located close to the dense regions of the normal data
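For readers unfamiliar with ROC curves, the sketch below shows how one point on such a curve could be computed from anomaly scores and ground-truth labels. The function name, the toy scores, and the thresholds are illustrative assumptions; the evaluation code used for the slides is not shown in the source.

```python
def roc_point(scores, is_attack, threshold):
    """Return (false alarm rate, detection rate) when every connection whose
    score exceeds `threshold` is flagged. Assumes both classes are present."""
    flagged = [s > threshold for s in scores]
    attacks = sum(is_attack)
    normals = len(is_attack) - attacks
    detected = sum(1 for f, a in zip(flagged, is_attack) if f and a)
    false_alarms = sum(1 for f, a in zip(flagged, is_attack) if f and not a)
    return false_alarms / normals, detected / attacks

# Toy data: three attack connections among eight.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
is_attack = [True, True, False, True, False, False, False, False]

# Sweeping the threshold from high to low traces the curve from left to right.
curve = [roc_point(scores, is_attack, t) for t in (0.75, 0.5, 0.25)]
```

Lowering the threshold raises the detection rate at the cost of more false alarms, which is the trade-off each curve in the figure summarizes.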
Outlier Detection Recent Results (on DARPA’98 data)
Analyzing multi-connection attacks using the score
values assigned to network connections
Detection rate is measured through number of
connections that have score higher than 0.5
[Figure: connection score (0 to 1) vs. number of connections (0 to 100) for the LOF, NN and Mahalanobis-distance based approaches; low peaks are due to occasional “reset” value for the feature called “connection status”]
Recently Detected Real-life Attacks
During the past few months various intrusive/suspicious activities
were detected at the AHPCRC and at the U of Minnesota using MINDS
A sample of top ranked anomalies/attacks picked by MINDS
August 13, 2002
Detected scanning for Microsoft DS service on port 445/TCP (Ranked #1)
Reported by CERT as recent DoS attacks that need further analysis (CERT August 9, 2002)
Undetected by SNORT since the scanning was non-sequential (very slow)
[Figure: number of scanning activities on Microsoft DS service on port 445/TCP reported in the World (Source www.incidents.org)]
Recently Detected Real-life Attacks …(ctd)
A sample of top ranked anomalies/attacks picked by MINDS
August 13, 2002
Detected scanning for Oracle server (Ranked #2)
Reported by CERT, June 13, 2002
First detection of this attack type by our University
Undetected by SNORT because the scanning was hidden within another Web
scanning
August 8, 2002
Identified machine that was running Microsoft PPTP VPN server on non-standard
ports, which is a policy violation (Ranked #1)
Undetected by SNORT since the collected GRE traffic was part of the normal traffic
October 30, 2002
Identified compromised machines that were running FTP servers on non-standard
ports, which is a policy violation (Ranked #1)
Anomaly detection identified this due to huge file transfer on a non-standard port
Undetectable by SNORT due to the fact there are no signatures for these activities
Recently Detected Real-life Attacks …(ctd)
A sample of top ranked anomalies/attacks picked by MINDS
October 10, 2002
Detected several instances of slapper worm that were not identified by SNORT since
they were variations of existing worm code
Detected by MINDS anomaly detection algorithm since source and destination ports
are the same but non-standard, and slow scan-like behavior for the source port
Potentially detectable by SNORT using more general rules, but the false alarm rate
will be too high
[Figure: number of slapper worms on port 2002 reported in the World (Source www.incidents.org)]
Recently Detected Real-life Attacks …(ctd)
Top ranked anomalies/attacks picked by MINDS
October 10, 200
Detected a distributed windows networking scan from two different
source locations (Ranked #1)
Similar distributed scan from 100 machines scattered around the
World happened at University of Auckland, New Zealand, on August
8, 2002 and it was reported by CERT, Insecure.org and other
security organizations
[Figure: distributed scanning activity, attack sources vs. destination IPs]
SNORT vs. MINDS Anomaly/Outlier
SNORT has static knowledge manually updated by human
analysts
MINDS anomaly/outlier detection algorithms are adaptive
in nature and in effect include an infinite number of rules
MINDS anomaly/outlier detection algorithms can also be
effective in detecting anomalous behavior originating from
a compromised machine
SNORT vs. MINDS Anomaly/Outlier
Content-based attacks (e.g. content of the packet)
SNORT is able to detect only those attacks with known signatures
Out of scope for MINDS anomaly/outlier detection algorithms, since they do not
use the content of the packets
Scanning activities
Same source sequential destination scans
SNORT is better than MINDS anomaly/outlier detection in identifying these attacks,
since it is specifically designed for their detection
Scans with random destinations
MINDS anomaly/outlier detection algorithms discover them quicker than SNORT,
since SNORT has to increase its time window (which specifies the scanning threshold),
resulting in large memory requirements
Slow scans
MINDS anomaly/outlier detection identifies them better than SNORT, since SNORT
has to increase its time window, which increases processing requirements
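The time-window trade-off can be illustrated with a generic threshold-based scan detector. This is a hedged sketch and not SNORT's actual implementation: the function name, the window sizes, and the threshold of five probes are hypothetical values chosen only to show why a slow scan slips past a short window.

```python
def flags_scan(probe_times, window_seconds, threshold):
    """Return True if any span of `window_seconds` contains at least
    `threshold` probes from one source (generic sliding-window check)."""
    times = sorted(probe_times)
    start = 0
    for end, t in enumerate(times):
        # Advance the window start until it is within `window_seconds` of t.
        while t - times[start] > window_seconds:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False

fast_scan = [i * 0.1 for i in range(20)]   # 20 probes in about 2 seconds
slow_scan = [i * 60.0 for i in range(20)]  # 20 probes, one per minute

short_window_fast = flags_scan(fast_scan, window_seconds=3.0, threshold=5)
short_window_slow = flags_scan(slow_scan, window_seconds=3.0, threshold=5)
long_window_slow = flags_scan(slow_scan, window_seconds=600.0, threshold=5)
```

Catching the slow scan requires the longer window, which means keeping per-source state for far more traffic; that is the memory and processing cost the comparison above refers to.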
SNORT vs. MINDS Anomaly/Outlier
Policy violations (e.g. rogue and unauthorized
services)
MINDS anomaly/outlier detection algorithms are
successful in detecting policy violations, since they are
looking for unusual and suspicious network behavior
To detect these attacks SNORT has to have a rule for
each specific unauthorized activity, which causes an
increase in the number of rules and therefore in the
memory requirements
MINDS - Framework for Mining Associations
[Diagram: the anomaly detection system outputs ranked connections; connections labeled attack and normal feed the discriminating association pattern generator, which updates the knowledge base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
R1: TCP, DstPort=1863 → Attack
…
R100: TCP, DstPort=80 → Normal
Discovered Real-life Association Patterns
Rule 1: SrcIP=XXXX, DstPort=80, Protocol=TCP, Flag=SYN,
NoPackets: 3, NoBytes:120…180 (c1=256, c2 = 1)
Rule 2: SrcIP=XXXX, DstIP=YYYY, DstPort=80, Protocol=TCP,
Flag=SYN, NoPackets: 3, NoBytes: 120…180 (c1=177, c2 = 0)
At first glance, Rule 1 appears to describe a Web scan
Rule 2 indicates an attack on a specific machine
Both rules together indicate that a scan is performed first,
followed by an attack on a specific machine identified as
vulnerable by the attacker
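The slides do not define c1 and c2. The sketch below assumes they are the numbers of connections matching a pattern among the top-ranked (anomalous) connections and among the normal connections respectively; that reading, the field names, and the toy records are assumptions made for illustration only.

```python
def pattern_counts(pattern, anomalous, normal):
    """Count how many connections match every field=value pair in `pattern`
    in the anomalous set (assumed c1) and in the normal set (assumed c2)."""
    def matches(conn):
        return all(conn.get(field) == value for field, value in pattern.items())

    c1 = sum(1 for conn in anomalous if matches(conn))
    c2 = sum(1 for conn in normal if matches(conn))
    return c1, c2

# Toy connection records (hypothetical field names and values).
anomalous = [
    {"DstPort": 8888, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 8888, "Protocol": "TCP", "Flag": "ACK"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
]
normal = [
    {"DstPort": 80, "Protocol": "TCP", "Flag": "SYN"},
    {"DstPort": 80, "Protocol": "TCP", "Flag": "ACK"},
]

broad = pattern_counts({"DstPort": 8888, "Protocol": "TCP"}, anomalous, normal)
narrow = pattern_counts(
    {"DstPort": 8888, "Protocol": "TCP", "Flag": "SYN"}, anomalous, normal
)
```

Under this reading, a pattern with a large c1 and a c2 near zero is discriminating: it covers many anomalous connections and almost no normal ones.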
Discovered Real-life Association Patterns…(ctd)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP (c1=369, c2=0)
DstIP=ZZZZ, DstPort=8888, Protocol=TCP, Flag=SYN (c1=291, c2=0)
This pattern indicates an anomalously high number of TCP
connections on port 8888 involving machine ZZZZ
Follow-up analysis of connections covered by the pattern
indicates that this could be a machine running a variation of
the Kazaa file-sharing protocol
Having an unauthorized application increases the
vulnerability of the system
Discovered Real-life Association Patterns…(ctd)
SrcIP=XXXX, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=4,
NoBytes=189…200 (c1=582, c2=2)
SrcIP=XXXX, DstPort=12345, NoPackets=4, NoBytes=189…200
(c1=580, c2=3)
SrcIP=YYYY, DstPort=27374, Protocol=TCP, Flag=SYN, NoPackets=3,
NoBytes=144 (c1=694, c2=3)
……
This pattern indicates a large number of scans on ports
27374 (which is a signature for the SubSeven worm) and
12345 (which is a signature for NetBus worm)
Further analysis showed that no fewer than five machines were
scanning for one or both of these ports in any time window
Discovered Real-life Association Patterns…(ctd)
DstPort=6667, Protocol=TCP (c1=254, c2=1)
This pattern indicates an unusually large number of
connections on port 6667 detected by the anomaly detector
Port 6667 is where IRC (Internet Relay Chat) is typically run
Further analysis reveals that there are many small packets
from/to various IRC servers around the world
Although IRC traffic is not unusual, the fact that it is flagged
as anomalous is interesting
This might indicate that the IRC server has been taken down (by a
DOS attack for example) or it is a rogue IRC server (it could be
involved in some hacking activity)
Discovered Real-life Association Patterns…(ctd)
DstPort=1863, Protocol=TCP, Flag=0, NoPackets=1, NoBytes<139
(c1=498, c2=6)
DstPort=1863, Protocol=TCP, Flag=0 (c1=587, c2=6)
DstPort=1863, Protocol=TCP (c1=606, c2=8)
This pattern indicates a large number of anomalous TCP
connections on port 1863
Further analysis reveals that the remote IP block is owned
by Hotmail
Flag=0 is unusual for TCP traffic
Conclusions
Rare class predictive models improve the detection of
infrequent attack types
MINDS anomaly/outlier detection algorithms are
successful in detection of intrusions that could not be
picked by commercial “state of the art” IDS tools
(SNORT)
Slow scans and random scans
Policy violations and unauthorized activities
MINDS association patterns can be useful in creating
summaries of detected attacks and suggesting new
signatures
Future Work
On-line detection algorithms
Better characterization of “normal” behavior
Detection of distributed attacks
Insider attacks
Other applications of anomaly detection
Credit card fraud detection
Insurance fraud detection
Transient fault detection for industrial process control
Detecting individuals with rare medical syndromes (e.g. cardiac
arrhythmia)
Questions?
Distance based Outlier Detection Schemes
Nearest Neighbor (NN) approach
For each point compute the distance d_k to the k-th nearest neighbor
Outliers are points that have larger distance d_k and therefore are
located in the more sparse neighborhoods
Mahalanobis-distance based approach
Mahalanobis distance is more appropriate for computing distances
with skewed distributions
[Figure: scatter plot of a skewed point cloud with rotated axes x’ and y’ and two marked points, p1 and p2]
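Both distance based scores can be sketched in a few lines. This is a minimal two-dimensional illustration under stated assumptions: the function names and the toy data are hypothetical, the covariance is estimated from the data itself, and the code is not taken from MINDS.

```python
import math

def kth_nn_distance(point, data, k):
    """Distance from `point` to its k-th nearest neighbor in `data`."""
    dists = sorted(math.dist(point, other) for other in data if other is not point)
    return dists[k - 1]

def mahalanobis_2d(point, data):
    """Mahalanobis distance of a 2-D point from the mean of `data`,
    using the sample covariance matrix S of `data`."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] S^-1 [dx dy]^T, with S^-1 = (1/det) [[syy, -sxy], [-sxy, sxx]]
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Skewed (elongated) cloud lying roughly along the line y = x.
data = [(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9), (4, 4.1)]
on_axis = (5, 5)    # far from the mean, but along the direction of spread
off_axis = (3, 1)   # closer to the mean, but off the direction of spread

nn_far = kth_nn_distance((10, 10), data, k=2)
nn_inside = kth_nn_distance(data[2], data, k=2)
```

In this toy data the off-axis point is closer to the mean in Euclidean terms yet has the larger Mahalanobis distance, which is why the slide calls that measure more appropriate for skewed distributions.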
Density based Outlier Detection Schemes
Local Outlier Factor (LOF) approach
For each point compute the density of local neighborhood
Compute LOF of example p as the average of the ratios of the
density of example p and the density of its nearest neighbors
Outliers are points with the largest LOF value
[Figure: a dense cluster and a sparse cluster with two marked points, p1 and p2]
In the NN approach, p2 is not considered an outlier, while the LOF
approach finds both p1 and p2 as outliers
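A compact LOF computation is sketched below, following the usual formulation from [Breunig00] (reachability distance, local reachability density, then the average density ratio). It is an illustrative sketch, not the MINDS code: ties in the k-distance neighborhood are ignored, exactly k neighbors are used, duplicate points are assumed absent, and the toy layout of a dense cluster, a sparse cluster, p1 and p2 is an assumption meant to mimic the figure.

```python
import math

def lof_scores(data, k):
    """Local Outlier Factor of every point in `data` (list of coordinate tuples)."""
    n = len(data)
    # k nearest neighbors of every point as (distance, index), excluding itself.
    neighbors = []
    for i in range(n):
        ranked = sorted((math.dist(data[i], data[j]), j) for j in range(n) if j != i)
        neighbors.append(ranked[:k])
    k_dist = [nbrs[-1][0] for nbrs in neighbors]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist) for dist, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: average ratio of the neighbors' density to the point's own density.
    return [sum(lrd[j] for _, j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

dense = [(x, y) for x in range(3) for y in range(3)]                       # spacing 1
sparse = [(100 + 10 * x, 100 + 10 * y) for x in range(3) for y in range(3)]  # spacing 10
p1, p2 = (60, 60), (8, 1)   # p1 far from both clusters, p2 just outside the dense one
data = dense + sparse + [p1, p2]

scores = lof_scores(data, k=3)
cluster_scores = scores[:18]
lof_p1, lof_p2 = scores[18], scores[19]
```

In this layout p2's distance to its third neighbor (about 6.1) is smaller than the spacing inside the sparse cluster (10), so a pure k-th nearest neighbor score would not rank it above ordinary sparse-cluster points, while its LOF is well above 1.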
Unsupervised Support Vector Machines for
Outlier Detection
Unsupervised SVMs attempt to separate the entire set of
training data from the origin, i.e. to find a small region
where most of the data lies and label data points in this
region as one class
Parameters
Expected number of outliers
Variance of rbf kernel
As the variance of the rbf kernel
gets smaller, the separating
surface gets more complex
[Figure: data separated from the origin; the hyperplane is pushed away from the
origin as much as possible]
SNORT signature based Network IDS
SNORT (www.snort.org) is an open source
Network Intrusion Detection System (IDS)
based on signatures
SNORT contains anomaly detector SPADE (Statistical
Packet Anomaly Detection Engine) usually turned off due
to high false alarm rate
SNORT may be configured in one of the following modes
sniffer mode – reads the packets from the network and displays
them for you in a continuous stream on the console
packet logger mode – logs the packet to the disk
intrusion detection mode - analyzes network traffic for matches
against a user defined rule set and performs several actions based
upon what it sees.
SPADE – SNORT Anomaly Detection
SPADE is a SNORT preprocessor plugin which sends
alerts of anomalous packets through standard SNORT
reporting mechanisms (the fewer times that a particular
kind of packet has occurred in the past, the higher its
anomaly score will be)
It is a part of the SPICE (Stealthy Probing and Intrusion
Correlation Engine) project at www.silicondefense.com
SPICE consists of two parts:
SPADE, which acts as an anomaly sensor engine and reports anomalous
events to the event correlator
event correlator, which groups these events together and sends out
reports of unusual activity (e.g., portscans)
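The idea that rarer kinds of packets receive higher anomaly scores can be sketched with a simple frequency-based scorer. This is not SPADE's code or its exact formula: the class name, the use of a negative log of the smoothed relative frequency, and the (destination IP, destination port) key are assumptions chosen for illustration.

```python
import math
from collections import Counter

class RarityScorer:
    """Hypothetical scorer: the less often a kind of packet was seen, the higher its score."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def observe(self, kind):
        self.counts[kind] += 1
        self.total += 1

    def score(self, kind):
        # Add-one smoothing so never-seen kinds get a finite (and the highest) score.
        p = (self.counts[kind] + 1) / (self.total + 1)
        return -math.log2(p)

scorer = RarityScorer()
for _ in range(99):
    scorer.observe(("10.0.0.5", 80))       # common traffic to a web port
scorer.observe(("10.0.0.5", 31337))        # seen only once

common = scorer.score(("10.0.0.5", 80))
rare = scorer.score(("10.0.0.5", 31337))
unseen = scorer.score(("10.0.0.9", 23))
```

A scorer like this flags any legitimate but infrequent traffic as well, which is consistent with the high false alarm rate mentioned for SPADE.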
Recently detected real-life attacks
http://www.cert.org/current/current_activity.html#Microsoft-DS
Microsoft-DS (445/tcp) Activity
updated August 9 | added August 9
“We have received reports of widespread scanning and
possible denial of service activity targeted at the
Microsoft-DS service on port 445/tcp. We are interested
in receiving reports of this activity from sites with
detailed logs and evidence of an attack. Please send all
reports to [email protected]”