Testbed for Quantitative Assessment of IDS


A Testbed for Quantitative and Metrics-Based Assessment of IDS
By Farhan Mirza
60-520
1
Contents
•	Introduction
•	Intrusion Detection System
•	Air Force Evaluation Environment
•	LARIAT
•	TIDeS
•	Tests and Results
•	Conclusion
2
Core Papers
•	Gautam Singaraju, Lawrence Teo, Yuliang Zheng, "A Testbed for Quantitative Assessment of IDS using Fuzzy Logic", Laboratory of Information Integration Security and Privacy (LIISP), University of North Carolina at Charlotte / Calyptix Security Corporation, USA. In Proceedings of the Second IEEE International Information Assurance Workshop (IWIA '04)
•	P. Mell, V. Hu, R. Lippmann, J. Haines, and M. Zissman, "An overview of issues in testing intrusion detection systems", NIST Interagency Report NISTIR 7007, NIST, http://csrc.nist.gov/publications/nistir/nistir-7007.pdf, June 2003
•	E. Biermann, E. Cloete, and L. Venter, "A comparison of Intrusion Detection Systems", Computers and Security, pages 676-683, 2001
•	R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das, "The 1999 DARPA Off-line Intrusion Detection Evaluation", http://www.ll.mit.edu/IST/ideval/pubs/2000/1999EvalComputerNeworks2000.pdf
•	L. M. Rossey, R. K. Cunningham, D. J. Fried, J. C. Rabek, R. P. Lippmann, and J. W. Haines, "LARIAT: Lincoln Adaptable Real-time Information Assurance Testbed", Fourth International Workshop on Recent Advances in Intrusion Detection, 2001
•	T. G. Champion and R. S. Durst, "Air Force Intrusion Detection System Evaluation Environment", RAID Symposium, 1999
3
Introduction
•	Intrusion Detection Systems (IDSs)
  –	A major investment for a firm
  –	A common component in corporate and home networks
  –	Growing in popularity
  –	Commercial IDSs are costly
  –	A few are free, but their effectiveness is doubtful
4
Introduction (Cont..)
•	IDSs employ different technologies
•	Each claims to detect intrusions effectively in a specific test environment
•	These claims raise questions about the technologies' effectiveness and performance
•	Under scrutiny are network parameters: network bandwidth conditions, out-of-order packet sequences, etc.
•	Careful evaluation of IDSs, by varying network parameters, is needed to check their effectiveness [2]
5
IDS Testbeds
•	Testbed development: the Defense Advanced Research Projects Agency (DARPA) and the Air Force, in association with Lincoln Laboratory
•	Unavailable to the public for evaluation
•	Air Force Evaluation Environment [7]
•	Lincoln Adaptable Real-Time Information Assurance Testbed (LARIAT) [3]
6
Metrics to Quantify an IDS
•	Apart from a strong testing scenario, robust and reliable metrics are required to quantify an IDS
•	One set of metrics is suggested by the National Institute of Standards and Technology (NIST) [4]
  –	Based on quantitative analysis of the IDS by varying network parameters
  –	Legitimate and illegitimate traffic can easily be included for system testing
  –	The user should be able to customize the testbed
•	In other words, the testbed should be built with a plug-and-play architecture and be scalable
7
Air Force Evaluation Environment
•	Simulates the complexity of a MAN found at military bases
•	In theory, a top-level firewall protects the single entry point into the base MAN
•	Size and diversity are simulated using software that dynamically assigns arbitrary source protocol addresses
•	Uses two traffic generators
  –	Outside machine: runs network sessions between the model base and the simulated Internet
  –	Inside machine: runs network sessions within the model base's address space, simulating the presence of a larger network
•	The entire testbed was completely isolated in AFRL's laboratory
8
AFRL Virtual Test Network Architecture (figure)
9
AFRL Actual Physical Network
10
AFRL Traffic Generator Architecture
•	Five layers in the design:
  –	The scheduler
  –	The master controller
  –	The slave layer
  –	The automata layer
  –	The virtual networking layer
11
Full-Time Traffic Generation System Architecture (figure)
12
LARIAT – Lincoln Adaptable Real-Time Information Assurance Testbed
•	An extension of the testbed created for the DARPA 1998 and 1999 intrusion detection evaluations
•	Two design goals:
  –	Support real-time evaluations
  –	Create a deployable, configurable, and easy-to-use testbed
•	Supports automated and quantitative evaluations
•	Components generate realistic background user traffic and real network attacks, verify attack success or failure, and score IDS performance
•	Provides a graphical user interface for control and monitoring
•	Currently being exercised at four sites
13
LARIAT Experiment Steps
•	Initialize Network
•	Distribute Configuration
•	Pre-Conditions
•	Run Traffic
•	Verify and Score
•	Clean Up
14
Automated Run Sequence (figure)
15
LARIAT GUI (figure)
16
Software Components (figure)
17
Sample Attack Scenario Available with LARIAT (figure)
18
Testbed for Evaluating Intrusion Detection Systems (TIDeS)
•	A scalable architecture with rigid metrics for evaluation forms the foundation of the TIDeS framework
•	Evaluates IDSs on a common platform
•	Based on fuzzy logic
•	Users can customize the testing scenarios by adding or removing attacks from the attack database
•	Allows a set of IDSs to be compared, to determine the best IDS among them in a specific environment
19
Testbed Architecture (figure)
20
Capabilities of TIDeS
•	New protocols can be added
•	New scripts can be added
•	Default protocols: HTTP, SMTP, POP3, TELNET, FTP, and SSH
•	Depending on the scenario, data is captured for anywhere from a short period to 24/7
21
Testing Scenarios
•	Non-environment-based testing scenario
  –	Does not depend on data collected on the network
•	Tests conducted in this scenario:
  –	All-legitimate traffic testing
    ·	Launches only legitimate traffic
    ·	Network traffic is increased until the network breaks down
    ·	The number of false alarms is determined; these are classified as false positives
  –	All-illegitimate traffic testing
    ·	Launches only attacks from the attack database
    ·	Network traffic is increased until the network breaks down
    ·	Attacks not detected by the IDS are classified as false negatives
  –	Mixed traffic testing (see the sketch after this list)
    ·	Launches both legitimate and illegitimate traffic
    ·	Traffic is generated randomly, and the launched traffic is logged
    ·	Network traffic is increased until the network breaks down
    ·	The IDS output and the logged traffic profile together determine the false alarms
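To make the mixed-traffic test concrete, here is a minimal sketch of a launcher loop that randomly interleaves legitimate and attack scripts and logs every launch, so the log can later be compared against the IDS output. This is an illustration, not the TIDeS implementation; the script names, attack ratio, and log format are all assumptions.

```python
import random
import time

# Hypothetical script inventories; TIDeS itself ships 6 legitimate
# and 40 attack scripts.
LEGITIMATE_SCRIPTS = ["http_get", "smtp_send", "pop3_fetch"]
ATTACK_SCRIPTS = ["port_scan", "ftp_overflow", "telnet_brute"]

def run_mixed_traffic(duration_s=60, attack_ratio=0.3, log_path="launch.log"):
    """Randomly interleave legitimate and attack scripts, logging each launch."""
    end = time.time() + duration_s
    with open(log_path, "a") as log:
        while time.time() < end:
            if random.random() < attack_ratio:
                script, kind = random.choice(ATTACK_SCRIPTS), "ATTACK"
            else:
                script, kind = random.choice(LEGITIMATE_SCRIPTS), "LEGIT"
            # A real launcher would execute the script against a target here;
            # this sketch only records what was launched.
            log.write(f"{time.time():.3f} {kind} {script}\n")
            time.sleep(random.uniform(0.01, 0.5))
```

Diffing such a launch log against the IDS alert log is what yields the false positive and false negative counts used in the tests below.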
22
Testing Scenarios (Cont…)
•	Environment-based testing scenario
  –	Depends on traffic that has been captured from the user's network
  –	Important because the IDS evaluation is performed under actual network conditions
  –	Testing across the entire spectrum of conditions leads to an effective evaluation of IDSs
  –	The results from testing are provided to the fuzzy logic evaluation framework
23
Components of the TIDeS Architecture
•	Handler
•	Virtual Machine Emulator
•	Launcher
•	Environment Profile Generator
•	Scripts
•	Evaluation Framework
24
Handler
•	Main controller
•	An interface to the testbed
•	Provides the capability of monitoring the tests
25
Virtual Machine Emulator
•	Emulates numerous virtual machines with unique IP addresses
•	Maps an entire network onto a single computer
•	Can emulate routers, and each virtual machine can have a different OS
•	A virtual network setup is created
•	Honeyd is used [5]
26
Launcher
•	The handler activates the launcher
•	The launcher generates traffic when a control signal is received from the handler, through the agent, and on to the virtual machine emulator
•	The launcher in turn activates the scripts that generate traffic
•	The launcher then applies the environment profile
•	By accessing the different services, the scripts create the traffic on the network
27
Environment Profile Generator
•	Used to generate the environmental traffic patterns of the user's network
•	Profiles are generated from real-time conditions by analyzing the network
•	The environment profile is exported to the machine that hosts the virtual machine emulator
•	The traffic generator generates a different environment profile for each IP address (see the sketch below)
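As a rough illustration of profile generation, the sketch below aggregates captured connection records into a per-IP profile (connection counts and byte totals per destination port). The CSV input format is an assumption; the actual generator works from live packet captures such as Snoop output.

```python
import csv
from collections import defaultdict

def build_profiles(capture_csv):
    """Aggregate captured connection records (columns: src_ip, dst_port,
    bytes) into a per-IP profile: [connections, bytes] per destination port."""
    profiles = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    with open(capture_csv) as f:
        for row in csv.DictReader(f):
            entry = profiles[row["src_ip"]][int(row["dst_port"])]
            entry[0] += 1                  # one more connection
            entry[1] += int(row["bytes"])  # cumulative bytes
    return profiles

# Each per-IP profile can then be exported to the machine hosting the
# virtual machine emulator and replayed at matching rates.
```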
28
Environment Profiles in the TIDeS Framework
•	University Environment Profile
•	Stand-alone Environment Profile
•	Home Environment Profile
29
University Environment Profile
•	Number of servers used: 4
•	All servers are used in the university environment
  –	Server 1: accepts HTTP connections
  –	Server 2: interactive server that accepts SSH, TELNET, and FTP connections
  –	Server 3: one of two mail servers; accepts SMTP connections
  –	Server 4: the other mail server; accepts POP and IMAP connections
•	Both mail servers also accept SSH connections, but only for management staff
•	Servers run the Sun Solaris OS
•	Snoop, a packet-capturing application developed by Sun Microsystems, is used
•	Servers are monitored for the working-day period of each day
30
Home Environment Profile
•	Generated by monitoring a home system
•	Exposed to many attacks from the Internet for short durations
•	Typically connects using a modem over a slow connection, usually 56 kbps
•	The profile need not be monitored for a long period, and hence has a different evaluation scenario
•	Connections and data throughput are measured over a 3-hour period
31
Stand-alone Environment Profile
•	Generated by monitoring a stand-alone system
•	The system stays connected and is not disconnected for long periods of time
•	Connected via broadband
•	Vulnerable to attacks from the Internet as well as to insider attacks
•	Monitored 24 hours a day, 7 days a week
32
Scripts
•	Operating-system independent and activated by the launcher
•	Connect to the server and interact with the service on the server (a minimal sketch follows)
•	6 legitimate scripts and 40 attack scripts are used in TIDeS
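A minimal sketch of what one legitimate script might look like, using only the Python standard library so it stays OS-independent; the target host is a placeholder for a server inside the testbed, not a real TIDeS script.

```python
import http.client

def http_script(host="192.168.1.10", path="/"):
    """A legitimate-traffic script: open an HTTP connection to a testbed
    server, fetch a page, and report the status code."""
    conn = http.client.HTTPConnection(host, 80, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # consume the body so the session completes normally
        return resp.status
    finally:
        conn.close()
```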
33
A Few of the Default Attack Scripts in TIDeS (figure)
34
Evaluation Framework
•	TIDeS provides many parameters for IDS evaluation
  –	Depth: the ratio of the number of attacks detected by the system to the total number of known attacks
  –	Breadth: the number of detected attacks that fall outside the framework of the system's attack database (i.e., previously unknown attacks)
  –	False alarms: performance under stress, reliability, and accuracy of detecting individual attacks
•	Evaluation is based on error rate and network load parameters
•	The decision-making process is based on fuzzy logic and fuzzy rules
•	Performance evaluations use false positives, false negatives, and cumulative false alarms
35
Evaluation Metrics
•	Managerial and Architectural Metrics
•	Performance Metrics
•	Analytical Metrics
•	Interactivity Metrics
36
Managerial and Architectural Metrics
•	Evaluate the architectural efficiency of an IDS
•	The metrics are:
  –	Distributed Management
    ·	Determines the distribution capabilities among the different analyzers
  –	Configuration Difficulty
    ·	How well a user understands the deployment of an IDS, which enables a correct deployment
  –	Ease of Policy and License Management
    ·	Ease of setting security and intrusion detection policies, as well as the difficulty of obtaining, updating, and extending licenses
  –	Availability of Updates
    ·	Availability and cost of updates to signature and/or behavior profiles, as well as the availability and cost of product upgrades
37
Managerial and Architectural Metrics (Cont…)
  –	Adjustable Sensitivity
    ·	Ease of altering the sensitivity of the IDS at various times and for different environments, in order to balance the false positive and false negative error rates
  –	Data Storage Capacity Needs
    ·	Amount of disk space consumed for storing signature profiles, logs, and other application data
  –	Scalable Load Balancing
    ·	Measures the ability of an IDS to partition traffic into independent, balanced sensor loads, and the ability of the load-balancing subprocess to scale up and down
38
Performance Metrics
•	Measure and evaluate the parameters that impact the performance of the IDS
•	The metrics are:
  –	Observed False Positive Ratio
    ·	The ratio of alarms wrongly raised by the IDS to the total number of transactions. The False Positive Ratio is given by
    \( \%FP = \frac{\#\,\text{false positive alarms}}{\#\,\text{total transactions}} \times 100 \)  (1)
  –	False Negative Ratio
    ·	The ratio of actual attacks that are not detected by the IDS to the total number of transactions, given by
    \( \%FN = \frac{\#\,\text{undetected attacks}}{\#\,\text{total transactions}} \times 100 \)  (2)
39
Performance Metrics (Cont..)
  –	Cumulative False Alarm Rate
    ·	The weighted average of the False Positive and False Negative ratios
  –	Induced Traffic Latency
    ·	The delay measured in the arrival of packets at the target network in the presence vs. the absence of an IDS
  –	Stress Handling and Point of Breakdown
    ·	The point of breakdown of an IDS is the level of network or host traffic that results in a shutdown or malfunction of the IDS; it is measured in packets/sec or the number of simultaneous TCP streams
  –	IDS Throughput
    ·	The observed level of traffic up to which the IDS performs without dropping any packets (a sketch computing these ratios follows)
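The ratios above are straightforward to compute. The sketch below implements equations (1) and (2) and a weighted cumulative false alarm rate; the equal 0.5/0.5 weights are an assumption, since the slide only says "weighted average".

```python
def false_positive_ratio(num_fp, total_transactions):
    """Equation (1): percentage of transactions wrongly flagged as attacks."""
    return 100.0 * num_fp / total_transactions

def false_negative_ratio(num_fn, total_transactions):
    """Equation (2): percentage of transactions whose attacks went undetected."""
    return 100.0 * num_fn / total_transactions

def cumulative_false_alarm(pct_fp, pct_fn, w_fp=0.5, w_fn=0.5):
    """Weighted average of the FP and FN ratios; the weights express how
    heavily the user penalizes each error type (0.5/0.5 is an assumption)."""
    return (w_fp * pct_fp + w_fn * pct_fn) / (w_fp + w_fn)

# Example from the Results slide: 170 false alarms in 897 transactions
print(false_positive_ratio(170, 897))  # ~18.95
```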
40
Analytical Metrics
•	Depth and Breadth of the System's Detection Capability
  –	Depth: the number of attack signature patterns and/or behavior models known to the IDS
  –	Breadth: the number of attacks and intrusions recognized by the IDS that lie outside its knowledge domain
•	Reliability of Attack Detection
  –	Defined as the ratio of false positives to the total alarms raised, given by
  \( \text{Reliability} = \frac{\#\,\text{false positives}}{\#\,\text{total alarms raised}} \)  (3)
41
Analytical Metrics (Cont..)
•	Possibility of Attack
  –	Defined as the ratio of false negatives to true negatives, given by
  \( \text{Possibility of attack} = \frac{\#\,\text{false negatives}}{\#\,\text{true negatives}} \)  (4)


•	Consistency
  –	The variation in the performance (false positive and false negative measurements) of an IDS under varying network loads and traffic environments
•	Error Reporting and Recovery
  –	The extent of event notification and logging. This is again a subjective criterion requiring user discretion (a sketch computing metrics (3) and (4) follows)
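A small sketch computing the analytical metrics exactly as defined above. Consistency is rendered here as the standard deviation of error rates across loads, which is one reasonable reading of "variation", not a formula stated in the deck.

```python
from statistics import pstdev

def reliability(num_fp, total_alarms):
    """Equation (3) as stated: false positives over total alarms raised
    (a lower ratio means more of the raised alarms were genuine)."""
    return num_fp / total_alarms

def possibility_of_attack(num_fn, num_tn):
    """Equation (4): false negatives over true negatives."""
    return num_fn / num_tn

def consistency(error_rates_by_load):
    """Variation of the error rate across network loads; a smaller
    standard deviation indicates a more consistent IDS."""
    return pstdev(error_rates_by_load)
```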
42
Interactivity Metrics
•	These are again a set of subjective metrics demanding user analysis
•	The metrics are:
  –	Firewall Interaction: the ability to interact with firewall systems
  –	Router Interaction: the degree to which an IDS interacts with the router and redirects an attacker's traffic to a honeypot
  –	SNMP Interaction: the ability of an IDS to send an SNMP trap to one or more network devices in response to a detected attack
  –	User Friendliness: the ease of setting up and configuring an IDS in the user's environment
43
Fuzzy Logic Basics
•	Fuzzy Set
  –	An extension of classical set theory, used in fuzzy logic
  –	Involved in capturing, representing, and working with linguistic notions
  –	Describes objects with unclear boundaries
•	Fuzzy Systems
  –	Knowledge-based or rule-based systems, at the heart of which is a knowledge base consisting of so-called fuzzy IF-THEN rules
  –	A fuzzy IF-THEN rule is an IF-THEN statement over linguistic variables
  –	Example of a fuzzy IF-THEN rule (see the sketch below):
    IF the false alarm rate of the IDS is high,
    THEN a lesser score is awarded to the IDS
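A minimal sketch of how the example rule could be evaluated, using a shoulder-shaped membership function for "high false alarm rate". The 10%/40% breakpoints and the 0-100 score scale are illustrative assumptions, not the paper's actual fuzzy sets.

```python
def ramp_up(x, low, high):
    """Shoulder membership function: 0 below `low`, 1 above `high`,
    linear in between."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def score_ids(pct_false_alarm):
    """Apply the example rule: IF the false alarm rate is high, THEN a
    lesser score is awarded. The rule's firing strength scales the
    penalty on an assumed 0-100 score."""
    high = ramp_up(pct_false_alarm, 10.0, 40.0)  # assumed breakpoints
    return 100.0 * (1.0 - high)

print(score_ids(5.0))   # low false alarm rate  -> 100.0
print(score_ids(25.0))  # partially "high"      -> 50.0
```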
44
Fuzzy Logic with IDS
•	Fuzzy logic provides a simple non-linear logical solution to the problem of measuring IDS capabilities
•	The fuzzy set approach starts by encapsulating all available domain knowledge and organizing it into a manageable format
•	A collection of IF-THEN rules forms a suitable control and decision-making protocol
•	These rules use linguistic terms like those in the example rule above
45
IDS Testing and Evaluation – Basic Tests
Test 1: Testing for False Alarms
•	Case 1: False Positives
  –	Only legitimate traffic is launched
  –	Network load is measured as a percentage of total network bandwidth
  –	The percentage of false positive alarms is measured as per equation (1)
  –	The %FP and the average network loads during the testing phase are mapped onto their respective fuzzy sets
  –	Testing is carried out until the system breaks down (a sketch of the test loop follows)
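A sketch of the Test 1 loop under stated assumptions: the traffic-launching and alert-counting hooks are caller-supplied placeholders (not TIDeS APIs), and breakdown is signalled by an exception.

```python
class SystemBreakdown(Exception):
    """Raised when the network or the IDS can no longer keep up."""

def test_false_positives(launch_legit, count_alerts, loads=(10, 25, 50, 75)):
    """Test 1, Case 1: launch only legitimate traffic at increasing network
    loads (% of bandwidth), recording %FP per equation (1) at each step
    until the point of breakdown. `launch_legit(load)` returns the number
    of transactions sent; `count_alerts()` returns the alarms raised."""
    results = {}
    for load in loads:
        try:
            total = launch_legit(load)
            alarms = count_alerts()
        except SystemBreakdown:
            break
        results[load] = 100.0 * alarms / total
    return results  # the (%FP, load) pairs are then mapped onto fuzzy sets
```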
46
Test 1: Testing for False Alarms (Cont..)
•	Case 2: False Negatives
  –	A similar process is repeated for false negatives, with only attack traffic launched at the IDS
  –	The attacks that go undetected now become the false negatives
  –	Similar calculations are made for the false negatives, giving the output false negative performance set
•	Case 3: Cumulative False Alarms
  –	The output sets obtained in the above tests are fed back to the fuzzy evaluator to obtain a cumulative performance report for the system
  –	This process is known as forward chaining, where the fuzzy result of one test is forwarded for further evaluation
  –	The evaluation process is similar to the method discussed above, giving a precise grade for the system's error rate performance on a fuzzy scale
47
Test 2: Consistency and Reliability
•	Error consistency test
  –	The test is similar to Test 1
  –	However, the network traffic is a mixture of legitimate and attack traffic
  –	The %error in this case is measured as follows:
    \( \%\text{Error} = \frac{\#\,\text{false positives} + \#\,\text{false negatives}}{\#\,\text{total transactions}} \times 100 \)  (5)


  –	The performance of the IDS is tested at various network loads, and its consistency is checked against the results of Test 1
  –	Besides error consistency, the ratio of %FP to %FN is also measured; the percentage possibility of attack is given by
    \( \%\,\text{Possibility of attack} = \frac{\%FP}{\%FN} \)  (6)
48
Results
•	Various quantitative analyses were performed on the IDS during the testing phase with the TIDeS framework
•	Evaluations were performed on a working, well-known IDS
•	Preliminary results:
  –	The IDS generated alerts even though no illegitimate traffic was launched on the network
  –	The test launched 897 legitimate traffic transactions
  –	A total of 170 attacks were (wrongly) detected under a network load of 10% of a T1 LAN connection
  –	This indicates an 18.95% (170/897) error in the detection capabilities
49
Conclusion
•	Testing and selecting an IDS is a major challenge
•	The TIDeS testbed allows users to select the best IDS for a specific, customized environment
•	It is based on reliable and robust metrics
•	The development of traffic profiles and the evaluation framework allows TIDeS to evaluate systems in the user's own environment
•	The fuzzy logic evaluation framework can also be used to evaluate a single IDS
50
Future Work
•	The outputs of different IDSs do not conform to a standard format; this can be addressed using IDMEF
•	IDMEF converts the output of a system into an XML format; it still needs to be tested with TIDeS (a minimal sketch follows)
•	As many attacks are discovered every day, more attack scripts need to be incorporated
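For illustration, a rough sketch of wrapping a native alert in an IDMEF-style XML message. The element names loosely follow the IDMEF data model; this is not a schema-validated IDMEF document, and the analyzer and classification values are placeholders.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

def to_idmef(analyzer_id, attack_name):
    """Wrap a native IDS alert in a simplified IDMEF-style XML message."""
    msg = ET.Element("IDMEF-Message", version="1.0")
    alert = ET.SubElement(msg, "Alert")
    ET.SubElement(alert, "Analyzer", analyzerid=analyzer_id)
    created = ET.SubElement(alert, "CreateTime")
    created.text = datetime.now(timezone.utc).isoformat()
    ET.SubElement(alert, "Classification", text=attack_name)
    return ET.tostring(msg, encoding="unicode")

print(to_idmef("ids-01", "ftp_overflow"))
```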
51
References
[1] E. Biermann, E. Cloete, and L. Venter. A comparison of Intrusion Detection Systems. Computers and Security, pages 676-683, 2001
[2] C. Iheagwara and A. Blyth. Evaluation of the performance of ID systems in a switched and distributed environment. The International Journal of Computer and Telecommunications Networking, 39(2):93-112, June 2002
[3] L. M. Rossey, R. K. Cunningham, D. J. Fried, J. C. Rabek, R. P. Lippmann, and J. W. Haines. LARIAT: Lincoln Adaptable Real-time Information Assurance Testbed. Fourth International Workshop on Recent Advances in Intrusion Detection, 2001
[4] P. Mell, V. Hu, R. Lippmann, J. Haines, and M. Zissman. An overview of issues in testing intrusion detection systems. NIST Interagency Report NISTIR 7007, NIST, http://csrc.nist.gov/publications/nistir/nistir-7007.pdf, June 2003
[5] N. Provos. Honeyd – a virtual honeypot daemon (extended abstract). 10th DFN-CERT Workshop, Hamburg, Germany, February 2003. www.citi.umich.edu/u/provos/papers/honeyd-eabstract.pdf
[6] Gautam Singaraju, Lawrence Teo, Yuliang Zheng. "A Testbed for Quantitative Assessment of IDS using Fuzzy Logic". Laboratory of Information Integration Security and Privacy (LIISP), University of North Carolina at Charlotte / Calyptix Security Corporation, USA, http://www.calpytix.com. In Proceedings of the Second IEEE International Information Assurance Workshop (IWIA '04)
[7] T. G. Champion and R. S. Durst. Air Force Intrusion Detection System Evaluation Environment. RAID Symposium, 1999
[8] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das. "The 1999 DARPA Off-line Intrusion Detection Evaluation". http://www.ll.mit.edu/IST/ideval/pubs/2000/1999EvalComputerNeworks2000.pdf
52
Questions
•	Ask now, or e-mail me at [email protected]
53
Thanks!
54