ids - ODU Computer Science

Data Mining for Intrusion
Detection: A Critical Review
Klaus Julisch
From: Applications of Data Mining in
Computer Security (Eds. D. Barbará
and S. Jajodia)
Knowledge Discovery from Databases
(KDD)
• Five steps
– (1) Understanding the application domain
– (2) Data integration and selection
– (3) Data mining
– (4) Pattern evaluation
– (5) Knowledge representation
Data Mining Meets Intrusion Detection
• IDS: Misuse detection and anomaly detection
– Misuse detection: Requires a collection of known attacks
– Anomaly detection: Requires a user or system profile
• IDS: Host-based and network-based IDS
– Host-based: Analyzes host-bound audit sources such as audit trails, system logs, or application logs
– Network-based: Analyzes packets captured on a network
• MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection), developed at Columbia University: learns classifiers that distinguish between intrusions and normal activities
– (i) Training connection records are partitioned into normal connection records and intrusion connection records
– (ii) Frequent episode rules are mined separately for the two categories of training data; the patterns found only in the intrusion data form the intrusion-only patterns
– (iii) Intrusion-only patterns are used to derive additional attributes that are indicative of intrusive behavior
– (iv) The initial training records are augmented with the new attributes
– (v) A classifier is learned that distinguishes normal records from intrusion records; this classifier, the misuse IDS, is the end product of MADAM ID
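Steps (i) to (v) can be illustrated with a minimal sketch. It rests on several assumptions that are not in the source: plain frequent attribute-value sets stand in for frequent episode rules, the derived attribute is a simple count of matched intrusion-only patterns, the classifier is a one-attribute threshold rule, and all record fields and names (service, flag, n_intrusion_patterns) are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(records, min_support):
    """Attribute-value sets (size 1 or 2) found in at least min_support of the records."""
    counts = Counter()
    for rec in records:
        items = sorted(rec.items())
        for size in (1, 2):
            counts.update(combinations(items, size))
    return {p for p, c in counts.items() if c >= min_support * len(records)}

def matches(record, pattern):
    """True if the record carries every attribute-value pair of the pattern."""
    return all(record.get(k) == v for k, v in pattern)

def augment(record, intrusion_only):
    """Steps (iii)/(iv): add a derived attribute counting matched intrusion-only patterns."""
    out = dict(record)
    out["n_intrusion_patterns"] = sum(matches(record, p) for p in intrusion_only)
    return out

def learn_threshold(normal, intrusion):
    """Step (v), simplified: choose the threshold with the fewest training errors."""
    best_t, best_err = None, None
    for t in sorted({r["n_intrusion_patterns"] for r in normal + intrusion}):
        err = sum(r["n_intrusion_patterns"] >= t for r in normal)
        err += sum(r["n_intrusion_patterns"] < t for r in intrusion)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

# Step (i): training records already partitioned into the two categories (toy data).
normal = [{"service": "http", "flag": "SF"},
          {"service": "http", "flag": "SF"},
          {"service": "smtp", "flag": "SF"}]
intrusion = [{"service": "http", "flag": "S0"},
             {"service": "http", "flag": "S0"},
             {"service": "telnet", "flag": "S0"}]

# Step (ii): mine each category separately, keep what is frequent only in intrusions.
intrusion_only = frequent_patterns(intrusion, 0.5) - frequent_patterns(normal, 0.5)

normal_aug = [augment(r, intrusion_only) for r in normal]
intrusion_aug = [augment(r, intrusion_only) for r in intrusion]
threshold = learn_threshold(normal_aug, intrusion_aug)

def is_intrusion(record):
    """Apply the learned rule to a new connection record."""
    return augment(record, intrusion_only)["n_intrusion_patterns"] >= threshold
```

The point of the sketch is the data flow: the mined patterns are not the detector themselves, they only supply extra attributes on which a classifier is then trained.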
ADAM (Audit Data Analysis and Mining)
• Network-based anomaly detection system
• Learns normal network behavior from attack-free training data and
represents it as a set of association rules---the profile
• At runtime, the records of the past δ seconds are continuously mined for
association rules that are not contained in the profile; these new rules are
sent to a classifier that separates false positives from true positives
• Its association rules are of the form: ⋀ Ai = vi (a conjunction of attribute-value pairs)
– Each association rule must have the source host and destination host and destination
port among the attributes
– Multi-level association rules have been introduced to capture coordinated and
distributed attacks
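The profile-versus-window comparison can be sketched as follows. This is a simplified illustration and not ADAM's implementation: frequent attribute-value sets stand in for association rules, the downstream classifier is omitted, and the field names src, dst, and dport as well as the support threshold are assumptions.

```python
from collections import Counter
from itertools import combinations

# Every itemset must mention source host, destination host, and destination port.
REQUIRED = {"src", "dst", "dport"}  # hypothetical field names

def frequent_itemsets(records, min_support):
    """Attribute-value sets that cover REQUIRED and occur in at least min_support of the records."""
    counts = Counter()
    for rec in records:
        items = sorted(rec.items())
        for size in range(len(REQUIRED), len(items) + 1):
            for combo in combinations(items, size):
                if REQUIRED <= {k for k, _ in combo}:
                    counts[combo] += 1
    return {s for s, c in counts.items() if c >= min_support * len(records)}

def suspicious_itemsets(window, profile, min_support):
    """Itemsets frequent in the current window but absent from the profile."""
    return frequent_itemsets(window, min_support) - profile

# Attack-free training data yields the profile.
training = [{"src": "10.0.0.1", "dst": "10.0.0.9", "dport": 80}] * 4
profile = frequent_itemsets(training, 0.4)

# Records of the past delta seconds (toy data).
window = ([{"src": "10.0.0.1", "dst": "10.0.0.9", "dport": 80}] * 2
          + [{"src": "10.0.0.7", "dst": "10.0.0.9", "dport": 22}] * 2)
suspicious = suspicious_itemsets(window, profile, 0.4)
```

In this toy window only the port 22 itemset is new relative to the profile, so it alone would be passed on for classification as a true or false positive.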
Clustering of Unlabeled ID Data
• Main focus: Training anomaly detection systems over noisy data
– The number of normal elements in the training data is assumed to be significantly larger than the number of anomalous elements
– Anomalous elements are assumed to be qualitatively different from normal ones
– Thus, anomalies appear as outliers standing out from the normal data, so explicitly modeling outliers results in anomaly detection
• Use of clustering: normal data are expected to cluster into similar groups and intrusive data into others; intrusive clusters will be small since intrusions are rare
• Real-time data is compared with the clusters to determine a classification
• A network-based anomaly detection system has been built
• In addition to the intrinsic attributes (e.g., source host, destination host, start time), connection records also include derived attributes such as the number of failed login attempts, the number of file-creation operations, and various counts and averages over temporally adjacent connection records
• Euclidean distance is used to determine similarity between connection records
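A minimal sketch of this idea, assuming a simple fixed-width clustering scheme (the source does not name the clustering algorithm). The cluster width, the size cutoff for calling a cluster normal, and the two-feature records are hypothetical; real connection records would also need feature normalization before Euclidean distance is meaningful.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fixed_width_clusters(points, width):
    """Assign each point to the nearest cluster within `width`, else start a new cluster."""
    clusters = []
    for p in points:
        nearest = min(clusters, key=lambda c: euclidean(p, c["center"]), default=None)
        if nearest is not None and euclidean(p, nearest["center"]) <= width:
            nearest["members"].append(p)
        else:
            clusters.append({"center": p, "members": [p]})
    return clusters

def label_clusters(clusters, normal_fraction):
    """Large clusters are labeled normal; small ones anomalous (intrusions are assumed rare)."""
    total = sum(len(c["members"]) for c in clusters)
    for c in clusters:
        share = len(c["members"]) / total
        c["label"] = "normal" if share >= normal_fraction else "anomalous"
    return clusters

def classify_point(point, clusters):
    """Label a new record with the label of its nearest cluster."""
    return min(clusters, key=lambda c: euclidean(point, c["center"]))["label"]

# Toy unlabeled training data: (failed logins, file creations) per connection.
training = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (1, 1), (0, 1), (1, 0), (9, 12)]
clusters = label_clusters(fixed_width_clusters(training, width=2.0), normal_fraction=0.2)
```

No labels are needed at training time: the only supervision is the assumption that normal traffic dominates, which is why the single distant record ends up in a small cluster labeled anomalous.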
Mining the Alarm Stream
• Applying data mining to alarms triggered by IDS
– (i) Model the normal alarm stream so as to henceforth raise the severity of “abnormal alarms”
– (ii) Extract predominant alarm patterns that a human expert can understand and act upon, e.g., write filters or patch a weak IDS signature
• Manganaris et al.:
– Models alarms as tuples (t, A), where t is a timestamp and A is an alarm type
– All other attributes of an alarm are ignored
– The profile of normal alarm behavior is learned as:
• Time-ordered alarm stream is partitioned into bursts
• Association rules are mined from the bursts
• This results in profile of normal alarms
– At run time various tests are carried out to test if an alarm burst
is normal
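The burst-based profile can be sketched as below. This is an illustration under stated assumptions, not the method of Manganaris et al.: bursts are split wherever the gap between consecutive alarms exceeds a hypothetical max_gap, frequent sets of alarm types stand in for association rules, and the run-time test is reduced to a single membership check.

```python
from collections import Counter
from itertools import combinations

def partition_bursts(alarms, max_gap):
    """Split a stream of (t, A) tuples into bursts separated by gaps longer than max_gap."""
    bursts, current, last_t = [], [], None
    for t, alarm_type in sorted(alarms):
        if last_t is not None and t - last_t > max_gap:
            bursts.append(current)
            current = []
        current.append(alarm_type)
        last_t = t
    if current:
        bursts.append(current)
    return bursts

def alarm_sets(burst):
    """All single alarm types and pairs of alarm types occurring in a burst."""
    types = sorted(set(burst))
    return [combo for size in (1, 2) for combo in combinations(types, size)]

def learn_profile(bursts, min_support):
    """Alarm-type sets that occur in at least min_support of the bursts."""
    counts = Counter()
    for burst in bursts:
        counts.update(alarm_sets(burst))
    return {s for s, c in counts.items() if c >= min_support * len(bursts)}

def is_normal(burst, profile):
    """A burst is treated as normal if all of its alarm-type sets are in the profile."""
    return all(s in profile for s in alarm_sets(burst))

# Toy alarm stream: (timestamp in seconds, alarm type).
history = [(0, "PING"), (1, "SCAN"), (100, "PING"), (101, "SCAN"),
           (200, "PING"), (202, "SCAN")]
bursts = partition_bursts(history, max_gap=10)
profile = learn_profile(bursts, min_support=0.5)
```

Because only the timestamp and the alarm type are modeled, two alarms of the same type from different hosts are indistinguishable to this profile.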
• Clifton and Gengo; Julisch:
– Mine historical alarm logs to find new knowledge that reduces the
future alarm load, e.g., to write filtering rules that discard false
positives
• Tools: Frequent episode rules
• Attribute-oriented induction
– Repeatedly replaces attribute values with more abstract values
» E.g., IP addresses to networks, timestamps to weekdays, and
ports to port ranges; the hierarchies are provided by the user
– Generalization merges previously distinct alarms into a
few classes, so huge alarm logs are condensed into short and
comprehensible summaries; this reduces the alarm load by 80%
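A single generalization pass can be sketched as follows. Full attribute-oriented induction repeats such steps along user-provided hierarchies until the summary is small enough; the hierarchies here (a /24 network, privileged versus non-privileged ports, weekday names) and the alarm fields are assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime
import ipaddress

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def generalize_ip(ip):
    """IP address -> enclosing /24 network (hypothetical hierarchy level)."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def generalize_port(port):
    """Port number -> coarse port range."""
    return "privileged" if port < 1024 else "non-privileged"

def generalize_time(timestamp):
    """ISO timestamp -> weekday name."""
    return WEEKDAYS[datetime.fromisoformat(timestamp).weekday()]

def summarize(alarms):
    """Count alarms per generalized (type, network, port range, weekday) class."""
    return Counter(
        (a["type"], generalize_ip(a["src"]),
         generalize_port(a["port"]), generalize_time(a["time"]))
        for a in alarms
    )

# Toy alarm log with hypothetical fields.
alarms = [
    {"type": "SYN flood", "src": "192.168.1.17", "port": 80, "time": "2024-01-01T10:15:00"},
    {"type": "SYN flood", "src": "192.168.1.42", "port": 443, "time": "2024-01-08T23:05:00"},
    {"type": "SYN flood", "src": "10.0.5.9", "port": 8080, "time": "2024-01-02T09:00:00"},
]
summary = summarize(alarms)
```

The first two alarms differ in every raw attribute except the type, yet they fall into the same generalized class, which is how a long alarm log shrinks to a handful of summary lines.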
• Isolated application of data mining techniques can be a dangerous activity, leading to the discovery of meaningless or misleading patterns
• Data mining without a proper understanding of the application domain
should be avoided
• Validation step is extremely important