Information Visualization for an Intrusion Detection System
Ching-Lung Fu
James Blustein
Daniel Silver
Overview

Research objective:
- Explore and discover factors for building a better (network-based) IDS

Initial stage of our research:
- Shortcomings of IDS
- Spatial Hypertext / visualization
- ML & UM + IDS + SH

Recent update:
- Revisit the IDS users

Problem Source

Rule-based IDS:
- Results in a network too restricted to be used, or an IDS vulnerable to new types of attacks

Machine-learning-based IDS, high error rates:
- Training data imbalance: available “real-attack” training examples are scarce
- A machine learning algorithm needs to “see” enough examples to generalize to “unseen” future examples
- Ambiguous data
- Could a human expert do better?
- Current machine learning algorithms cannot generalize better than humans

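The effect of training data imbalance can be illustrated with a minimal sketch. The record counts below are made up for illustration and are not from this research; the point is only that a detector which never flags anything can still look accurate when attacks are scarce.

```python
# Hypothetical numbers chosen only for illustration; they are not from the study.
def evaluate(labels, predictions):
    """Return (accuracy, recall on the attack class)."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    attacks = sum(1 for y in labels if y == 1)
    caught = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    accuracy = correct / len(labels)
    recall = caught / attacks if attacks else 0.0
    return accuracy, recall


# 1 = attack, 0 = normal traffic: 5 attacks among 1000 records (assumed ratio).
labels = [1] * 5 + [0] * 995
# A degenerate detector that has learned to call everything "normal".
always_normal = [0] * len(labels)

accuracy, recall = evaluate(labels, always_normal)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

Accuracy alone hides the problem here; recall on the attack class shows that no attack is caught.
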
Problem Source

High false detection rates:
- Prevent immediate response to real attacks
- Erode users’ trust
- Make the IDS unusable: most system admins now attend to the problem only after the attack, or after the damage has been done

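Why false detections undermine trust can be sketched with a short base-rate calculation using Bayes' rule. All three rates below are assumptions picked for illustration, not measurements from any IDS.

```python
def alert_precision(base_rate, detection_rate, false_alarm_rate):
    """Fraction of raised alerts that correspond to real attacks (Bayes' rule)."""
    true_alerts = base_rate * detection_rate
    false_alerts = (1.0 - base_rate) * false_alarm_rate
    total = true_alerts + false_alerts
    return true_alerts / total if total else 0.0


# Assumed values: 1 in 10,000 events is an attack,
# 99% of attacks are detected, 1% of normal events raise a false alarm.
p = alert_precision(base_rate=0.0001, detection_rate=0.99, false_alarm_rate=0.01)
print(f"{p:.2%} of alerts are real attacks")
```

Under these assumed rates, fewer than one alert in a hundred is a real attack, which is consistent with admins learning to ignore the alerts.
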
Alternative IDS

Reduce the dependence on the detection mechanism

Visual intelligence:
- Harnesses human abilities
- Keeps humans “in the loop”, contributing judgment and sharing some responsibility
- Personal involvement & empowerment

Alternative IDS

A visualization + machine learning tool could provide the answer

SH as a visualization mechanism

Information triage

What is Spatial Hypertext (SH)?
- A graphic workspace with freely manipulable objects
- Relationships are represented by color, proximity, alignment, containment, etc.
- Structure is ambiguous and implicit
- Examples in the next few pages

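How proximity can carry an implicit relationship in a spatial workspace can be sketched as follows. The `Card` structure, the `proximity_groups` helper, and the `max_gap` threshold are hypothetical names introduced for this illustration; they are not part of any particular Spatial Hypertext system.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Card:
    """One freely movable object in the workspace (hypothetical structure)."""
    label: str
    x: float
    y: float
    color: str


def proximity_groups(cards, max_gap):
    """Group cards that lie within max_gap of some member of the same group."""
    groups = []
    for card in cards:
        # Existing groups that contain at least one card close to this one.
        near = [g for g in groups
                if any(hypot(card.x - o.x, card.y - o.y) <= max_gap for o in g)]
        far = [g for g in groups if not any(g is n for n in near)]
        # Merge all nearby groups together with the new card.
        merged = [member for g in near for member in g] + [card]
        groups = far + [merged]
    return groups


cards = [
    Card("host A", 0, 0, "red"),
    Card("host B", 1, 0, "red"),
    Card("host C", 10, 10, "green"),
]
groups = proximity_groups(cards, max_gap=2.0)
print([[c.label for c in g] for g in groups])
```

No explicit link between "host A" and "host B" is ever stored; the grouping is inferred from where the user placed the objects.
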
SH – example 1
Power of Visualization example 2
An on-line example

http://www.hivegroup.com/salesforce.html
SH as a visualization mechanism (continued)
- Emerging information
- Humans have excellent visual intelligence
- Able to contain a lot of information
- Please see my poster for a new framework under development

Challenges
- Information visualization cannot be effective if the machine learning components cannot deliver accurate information
- The publicly available test datasets are not good enough
- Data ambiguity always exists
- The ML algorithms are not the bottleneck; the feature extraction processes are
- The ML algorithms may be used to “mine” the features used directly by visualization tools; human eyes then detect the anomalies

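The last point, extracting a feature and leaving anomaly detection to the eye, can be sketched as below. The feature (distinct destination ports per source), the `ports_per_source` helper, and the connection records are all assumptions made up for this illustration; a plain frequency count stands in for whatever mining step would actually be used.

```python
from collections import defaultdict


def ports_per_source(connections):
    """Count distinct destination ports contacted by each source address."""
    seen = defaultdict(set)
    for src, dst_port in connections:
        seen[src].add(dst_port)
    return {src: len(ports) for src, ports in seen.items()}


# Made-up records: (source address, destination port).
connections = [("10.0.0.5", 80), ("10.0.0.5", 443), ("10.0.0.9", 22)]
connections += [("10.0.0.7", port) for port in range(1, 41)]

features = ports_per_source(connections)
# Crude text "visualization": one bar per source, so outliers stand out to the eye.
for src, count in sorted(features.items()):
    print(f"{src:<10} {'#' * count} ({count})")
```

The code does not decide what is anomalous; it only presents the feature so that a human can spot the source with the unusually long bar.
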
Revisit the IDS users
- Most of them still rely on primitive tools
- IDSs are not trusted at all
- They respond to problems only after complaints have been made
- Many organizations refuse the visit because they do not have an IDS (“security through obscurity”)
- Some organizations simply unplug important systems from the network to avoid unnecessary exposure

Conclusion
- Improve the current ML-based IDS as a component
- Data mining on features for information visualization
- Spatial Hypertext – a hybrid approach in which information visualization complements the IDS

Questions?
Ching-Lung Fu
Dalhousie Computer Science
<[email protected]>