Network Intrusion Detection Today and Tomorrow


Network Intrusion Detection
Applications and Research
Like Zhang
Outline
• Recent debate over NIDS
• Introduction to NIDS
• A survey of current NIDS products
• Research on anomaly NIDS
• Conclusion
Is NIDS dead?
“Hype Cycle for Information Security”
Gartner Report, 2003
• False positives and negatives
• Requiring full-time monitoring (24 hours a
day, seven days a week, 365 days a year)
• Market failure
• Will be obsolete by 2005
Current Situation
• Intrusion detection is evolving into intrusion
prevention
• New types of IDS are coming into play
(distributed IDS, application-based IDS, etc.)
• NIDS is built into firewalls and anti-virus systems,
offered as an optional plug-in for server-side programs, or
deployed as a standalone product
NIDS Techniques
• Signature-based
• Anomaly-based
• Stateful detection
• Application-level detection
Signature-based NIDS
Similar to the traditional anti-virus applications
Example:
Martin Overton, “Anti-Malware Tools: Intrusion Detection Systems”,
European Institute for Computer Anti-Virus Research (EICAR), 2005
Signature found in a W32.Netsky.p binary sample
Rules for Snort:
Anomaly Detection
• Already used by industry
--Protocol Anomaly
--Statistical/Threshold based
• In Research
--Data mining
Protocol Anomaly Detection
Based on the well established RFCs
Focus on the packet header
Example:
--All SMTP commands have a fixed maximum size. If the size exceeds
the limit, it could indicate a buffer overflow or code injection
attack
--SYN flood attack: the attacker sends SYN packets with a fake source address
--Teardrop attack: fragmented IP packets with overlapping offsets
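The SMTP length check above can be sketched as follows. This is a minimal illustration, assuming the 512-octet command-line limit (CRLF included) from RFC 5321; the function name and sample lines are hypothetical.

```python
# Assumed limit: RFC 5321 caps an SMTP command line at 512 octets, CRLF included.
SMTP_MAX_COMMAND_LINE = 512


def is_oversized_smtp_command(line: bytes, limit: int = SMTP_MAX_COMMAND_LINE) -> bool:
    """Return True when a single SMTP command line exceeds the protocol limit."""
    return len(line) > limit


# Hypothetical traffic samples: one ordinary command, one padded far past the limit.
normal = b"MAIL FROM:<alice@example.com>\r\n"
suspicious = b"HELO " + b"A" * 2000 + b"\r\n"

print(is_oversized_smtp_command(normal))
print(is_oversized_smtp_command(suspicious))
```

A real sensor would first reassemble the TCP stream and split it on CRLF before applying the check.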
Threshold based
Use training data to build a statistical
model, then select proper thresholds for the
network environment (traffic volume, TCP
packet count, IP fragment count, etc.)
-- usually used as a complementary tool
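One simple way to turn training data into a threshold is sketched below. The slide does not specify the statistical model, so the mean-plus-k-standard-deviations rule, the value k = 3, and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def learn_threshold(training_counts, k=3.0):
    """Assumed model: threshold = mean + k sample standard deviations."""
    return mean(training_counts) + k * stdev(training_counts)


def is_anomalous(observed, threshold):
    """Flag an observation that exceeds the learned threshold."""
    return observed > threshold


# Hypothetical per-minute IP fragment counts collected during a training period.
training = [10, 12, 9, 11, 10, 13, 12, 11]
threshold = learn_threshold(training)

print(is_anomalous(12, threshold))
print(is_anomalous(200, threshold))
```

The same pattern applies to any of the counters named above (traffic volume, TCP packet count), with one threshold learned per counter.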
Stateful IDS
• No practical solutions yet
• Only very simple implementations
Example:
Snort uses pattern matching across consecutive packets.
Traditional signature rules: “pattern1” “pattern1 || pattern2”
The rule can now be defined as: “pattern1.*pattern2”
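The difference between per-packet and cross-packet matching can be sketched with a regular expression. This is not Snort's implementation; it assumes the payloads arrive in order and are simply concatenated, and the sample packets are hypothetical.

```python
import re

RULE = rb"pattern1.*pattern2"


def match_per_packet(packets, rule=RULE):
    """Stateless check: the whole rule must match inside a single packet."""
    return any(re.search(rule, p, re.DOTALL) for p in packets)


def match_stream(packets, rule=RULE):
    """Stateful check: join consecutive payloads, then match across boundaries."""
    stream = b"".join(packets)
    return re.search(rule, stream, re.DOTALL) is not None


# Hypothetical attack split over two packets so that no single packet matches.
packets = [b"GET /pattern1?x=", b"1&y=pattern2 HTTP/1.0"]

print(match_per_packet(packets))
print(match_stream(packets))
```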
Application-level IDS
Focus on specific services or programs
(Web Server, Database, etc.)
Example
--Monitor all invocations of Microsoft RPCs
--Analyze HTTP requests for malicious query strings
Products:
--mod_security: an optional IDS component for Apache
Web Server
IDS Today
Products and Applications
• Snort
• McAfee Intrushield
• ISS RealSecure
• Cisco IPS
• Symantec IDS
Snort
• Open Source, since 1998
• Used by many major network security
products
• Signature-based (more than 3000)
• Simple IP header protocol anomaly
detection
• Simple stateful pattern matching
McAfee
• Profile-based anomaly detection
--Manually create profile
--Create profile by self-learning through a training period
• Uses profiles plus thresholds to defend
against DoS and DDoS
• Inspects encrypted traffic by collecting the
server-side private keys
ISS RealSecure
• About 2000 signatures
• Application-based approach
--identifying any possible exploit to the published vulnerabilities of
MS RPC, IIS, Apache, Lotus, etc.
• Additional support for P2P and instant messengers
• Virtual Prevention System
--a virtual environment to examine the execution of a file in order to
find any possible malicious behaviors
• Support for IPv6
--Detects possible backdoors that enable IPv6 on a system
(usually off)
Cisco IPS products
Protocol decoding
Threshold based property checking
Signature matching
Protocol Anomaly Detection
Checking file behaviors by intercepting all
calls to the system resources
Symantec
• Multi-step checking (protocol, vulnerability,
signature, DoS, traffic, evasion check)
• Unique feature: evasion check
e.g. the request “/index.html” can be replaced
with “/%69nd%65x.html” to evade
signature matching
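The evasion check can be illustrated by decoding percent-escapes before matching. This is a minimal sketch using Python's standard `urllib.parse.unquote`, not Symantec's implementation; the helper names are hypothetical, and a real normalizer would also have to handle double encoding and other tricks.

```python
from urllib.parse import unquote


def normalize_path(path: str) -> str:
    """Decode percent-escapes so encoded and plain requests compare equal."""
    return unquote(path)


def matches_signature(path: str, signature: str) -> bool:
    """Match the signature against the normalized path, not the raw bytes."""
    return signature in normalize_path(path)


evasive = "/%69nd%65x.html"
print("/index.html" in evasive)                   # naive match on the raw request
print(matches_signature(evasive, "/index.html"))  # match after normalization
```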
Summary of Current Products
[Feature comparison table; the per-cell marks could not be recovered from the extracted text]
Products compared: Snort, McAfee Intrushield, ISS RealSecure, Cisco IDS, Symantec IMUNE
Column labels: Signature, General, Application based, Anomaly Detection, Stateful, IPv6 Support, Behavior, Encrypted Traffic Detection
Anomaly detection entries: Profile-based, Vulnerability-based, Statistical-based, Protocol-based, Self-learning, Application specific
Challenges for NIDS
• High false positives
-- A false-positive rate of 0.1% means that one in every 1000 normal
packets is misclassified as an alert, which is about one false alert
per minute on a 100M network
• Zero-day attacks (unknown attacks)
--Most current products rely on signature-based detection, which makes it difficult to
detect new attacks.
• Poor automatic prevention
--Human intervention is required when an attack is detected
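The false-alert arithmetic in the first bullet can be written out as below. The traffic rate is a hypothetical input: at a 0.1% false-positive rate, one false alert per minute corresponds to roughly 1000 normal packets per minute, and busier links scale linearly.

```python
def false_alerts_per_minute(fp_rate: float, normal_packets_per_minute: float) -> float:
    """Expected false alerts = false-positive rate x number of normal packets."""
    return fp_rate * normal_packets_per_minute


# 0.1% false-positive rate with hypothetical traffic volumes.
print(false_alerts_per_minute(0.001, 1_000))
print(false_alerts_per_minute(0.001, 600_000))
```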
Research on Intrusion Detection
• Columbia University
--Data mining based (since 1997)
• University of California at Santa Barbara
--Service Specific (HTTP)
--Stateful IDS
• Florida Institute of Technology
--Protocol Anomaly (Statistical based)
• University of Minnesota
--MINDS (Minnesota Intrusion Detection System)
Columbia Univ. IDS
• 1997, Applied the RIPPER rule learning algorithm
to UNIX system call monitoring for malicious
event detection
• 1998, Applied the algorithm to off-line network
traffic data (clean training data)
• 2000, Applied EM and clustering algorithms to
deal with noisy datasets
• 2001, Developed a complete experimental NIDS
based on those algorithms.
• 2004, New approach to payload anomaly
detection
Implementing Procedure
Wenke Lee, Sal Stolfo, and Kui Mok., “A Data Mining Framework for Building
Intrusion Detection Models”, Proceedings of the 1999 IEEE Symposium on
Security and Privacy, Oakland, CA, May 1999
Pre-Processing: process raw packet data
Feature construction: create statistical features
Rule learning: apply the RIPPER algorithm
Pre Processing
SYN flood attack
Feature Construction
(service=http, flag=S0, dst_host=victim),
(service=http, flag=S0, dst_host=victim)
-> (service=http, flag=S0, dst_host=victim)
[0.93, 0.03, 2]
93% of the time, after two http connections with the S0
flag are made to host victim, a third similar connection is
made within 2 seconds of the first of these two;
this pattern occurs in 3% of the data
RIPPER Rules
smurf :- service=ecr_i, host_count >= 5,
host_srv_count>=5
(if the service is ICMP echo request, and there are at least 5 connections with the same
destination host, and at least 5 connections with the same service,
then it is a smurf/DOS attack)
satan :- host_REJ_%>=83%, host_diff_srv_% >=
87%
(for connections with the same destination host, if the rejection rate is at least
83%, and the percentage of different services is at least 87%, then it is a
satan/PROBING attack)
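The two learned rules can be applied to a connection record as sketched below. The record is a plain dict; the keys `host_REJ_pct` and `host_diff_srv_pct` are hypothetical renamings of `host_REJ_%` and `host_diff_srv_%`, and the sample records are invented for illustration.

```python
def classify(conn: dict) -> str:
    """Apply the smurf and satan rules from the slide to one connection record."""
    if (conn["service"] == "ecr_i"
            and conn["host_count"] >= 5
            and conn["host_srv_count"] >= 5):
        return "smurf"
    if conn["host_REJ_pct"] >= 83 and conn["host_diff_srv_pct"] >= 87:
        return "satan"
    return "normal"


# Hypothetical records with the statistical features already computed.
flood = {"service": "ecr_i", "host_count": 40, "host_srv_count": 40,
         "host_REJ_pct": 0, "host_diff_srv_pct": 0}
scan = {"service": "telnet", "host_count": 30, "host_srv_count": 1,
        "host_REJ_pct": 90, "host_diff_srv_pct": 95}
web = {"service": "http", "host_count": 2, "host_srv_count": 2,
       "host_REJ_pct": 0, "host_diff_srv_pct": 0}

print(classify(flood), classify(scan), classify(web))
```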
Experiment Results
Applied on DARPA’98 Intrusion Detection Evaluation Data Set
Payload based Approach
K. Wang, S. J. Stolfo, “Anomalous Payload-based Network
Intrusion Detection”, RAID 2004
• Construct a statistical model of all bytes in the payload
• Use the Mahalanobis distance to measure the difference
Problems:
• Clean training data is required
• False positive rate is unacceptable
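A byte-frequency model with a distance score can be sketched as follows. It assumes the simplified Mahalanobis distance, sum over bytes of |x_i - mean_i| / (std_i + alpha), with an assumed smoothing value alpha; a single model is trained here for brevity, and the training strings are hypothetical.

```python
from statistics import mean, pstdev


def byte_frequencies(data: bytes):
    """Relative frequency of each of the 256 byte values in the data."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    total = len(data) or 1
    return [c / total for c in counts]


def train(samples):
    """Per-byte mean and standard deviation of frequency over clean samples."""
    freqs = [byte_frequencies(s) for s in samples]
    means = [mean(col) for col in zip(*freqs)]
    stds = [pstdev(col) for col in zip(*freqs)]
    return means, stds


def simplified_mahalanobis(data, means, stds, alpha=0.001):
    """Assumed scoring: sum of |x_i - mean_i| / (std_i + alpha) over all bytes."""
    x = byte_frequencies(data)
    return sum(abs(xi - m) / (s + alpha) for xi, m, s in zip(x, means, stds))


# Hypothetical clean training data and two test inputs.
means, stds = train([b"GET /index.html", b"GET /about.html", b"GET /home.html"])
normal_score = simplified_mahalanobis(b"GET /index.html", means, stds)
attack_score = simplified_mahalanobis(b"\x90" * 20, means, stds)
print(normal_score < attack_score)
```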
Service Specific IDS by UCSB
Giovanni Vigna et al. at the University of California at Santa Barbara
Since 2002
• Application level
• Focuses on HTTP requests
• HTTP request analysis
• Constructs models for important fields in
the request instead of all bytes of the
payload (Columbia payload approach)
Sample Request
Request
GET /scripts/access.pl?user=johndoe&cred=admin
Properties for Detection
Request Type: e.g. GET
Request Length: e.g. Length(“GET /scripts/access.pl?user=johndoe&cred=admin”)
Payload Distribution
Request Type
Assumption:
If a rarely used request type is found, it is very likely to
initiate malicious activity
Anomaly Score:
AStype=-log2(p[type])
p[type] stands for the probability of a certain type
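The request-type score can be computed as sketched below. The probabilities are estimated from a hypothetical training set, and the `floor` value used for types never seen in training is an assumption added to avoid taking the logarithm of zero.

```python
import math
from collections import Counter


def type_probabilities(training_types):
    """Estimate p[type] as the relative frequency of each type in training."""
    counts = Counter(training_types)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def as_type(request_type, probs, floor=1e-6):
    """AStype = -log2(p[type]); unseen types get an assumed floor probability."""
    return -math.log2(probs.get(request_type, floor))


# Hypothetical training traffic: 7 GET requests and 1 POST request.
probs = type_probabilities(["GET"] * 7 + ["POST"])
print(as_type("GET", probs), as_type("POST", probs), as_type("TRACE", probs))
```

Rare types receive high scores: a type seen in one request out of eight scores 3, while a common one scores close to 0.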
Request Length
Assumption:
The request length of a given type should not vary much.
Otherwise, the variation is probably caused by an attack
(e.g. overflow)
Anomaly Score:
$AS_{len} = 1.5^{(l-\mu)/(2.5\,\sigma)}$
l is the request length; $\mu$ and $\sigma$ are the mean and standard deviation of the request length for that type
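A sketch of the length score follows. It assumes the formula reads 1.5 raised to (l - mean) / (2.5 * std), which is a reconstruction of the garbled slide text, and it assumes the training lengths are not all identical (std > 0). The training lengths are hypothetical.

```python
from statistics import mean, pstdev


def as_len(length, training_lengths):
    """Assumed form: ASlen = 1.5 ** ((l - mu) / (2.5 * sigma))."""
    mu = mean(training_lengths)
    sigma = pstdev(training_lengths)  # must be > 0 for the score to be defined
    return 1.5 ** ((length - mu) / (2.5 * sigma))


# Hypothetical request lengths seen for one request type during training.
training_lengths = [40, 50, 60]
print(as_len(50, training_lengths))   # a request of average length
print(as_len(500, training_lengths))  # an unusually long request
```

Under this form the score is 1 at the mean length and grows exponentially with the excess length, so overflow-style requests stand out sharply.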
Characters Distribution
256 ASCII Characters
Segment      0    1     2     3      4       5
ASCII Value  0    1-3   4-6   7-11   12-15   16-255
e.g. “passwd” -> “112 97 115 115 119 100”
Distributions: {0.33, 0.17, 0.17, 0.17, 0.17}
$\chi^2 = f(O_i, E_i)$ (i ranges over segments 0 to 5)
$AS_{pd} = \chi^2 \cdot (15/L)$ (L stands for the payload length)
Final Anomaly Score
AS=0.3*AStype + 0.3*ASlen+0.4*ASpd
Later Research at UCSB
Structure Inference with Markov Model
Other Properties Used
• Token Finder
checks whether the query parameter is drawn from a known set of candidates
• Attribute presence or absence
maliciously crafted requests usually ignore the order of parameters
• Access frequency
• Invocation order
• Request time interval
Experiment Results
• Tested at UCSB campus network and
Google
• False positive 0.06%
Major cons:
Limited to HTTP service
Packet Header Anomaly Detection
Packet Header Anomaly Detection (PHAD)
developed by Florida Institute of Technology since 2001
Basic Assumption:
If an event x happened n times with r different results in the
training period, the probability of novel data is r/n
Implementing
Step 1:
Assign the novel-data probability to important fields
of the packet header (protocol type, flags, etc.)
Step 2:
Add all the novel-data probabilities together as a
threshold
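The two steps can be sketched as follows. This follows only the simplified description on the slide, not the published PHAD scoring; packets are represented as dicts of header fields, and the field names and values are hypothetical.

```python
from collections import defaultdict


def train(packets):
    """Count observations (n) and distinct values (r) for each header field."""
    counts = defaultdict(int)
    values = defaultdict(set)
    for pkt in packets:
        for field, value in pkt.items():
            counts[field] += 1
            values[field].add(value)
    return counts, values


def novelty_probability(field, counts, values):
    """Step 1: estimate the probability of a novel value as r / n."""
    return len(values[field]) / counts[field]


def novel_fields(packet, values):
    """Fields of a new packet whose value never appeared in training."""
    return [f for f, v in packet.items() if v not in values.get(f, set())]


# Hypothetical training packets with two header fields.
training = [{"proto": 6, "flags": 2}, {"proto": 6, "flags": 16},
            {"proto": 17, "flags": 0}, {"proto": 6, "flags": 2}]
counts, values = train(training)

# Step 2: sum the per-field novelty probabilities into one threshold.
threshold = sum(novelty_probability(f, counts, values) for f in counts)
print(threshold, novel_fields({"proto": 1, "flags": 2}, values))
```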
MINDS
MINDS (Minnesota Intrusion Detection System)
Statistical outlier-based anomaly detection
Compared 5 outlier-based schemes:
• K-th nearest neighbor
• Nearest neighbor
• Mahalanobis-distance based
• Local Outlier Factor (LOF)
• Unsupervised SVMs
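As an illustration of the first scheme in the list, the k-th nearest neighbor outlier score is the distance from a point to its k-th closest reference point. The sketch below is not the MINDS implementation; the two-dimensional feature vectors and k = 2 are hypothetical.

```python
import math


def kth_nn_distance(point, reference, k):
    """Outlier score: Euclidean distance to the k-th nearest reference point."""
    dists = sorted(math.dist(point, r) for r in reference)
    return dists[k - 1]


# Hypothetical feature vectors of normal connections.
reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

inlier_score = kth_nn_distance((0.5, 0.5), reference, k=2)
outlier_score = kth_nn_distance((10.0, 10.0), reference, k=2)
print(inlier_score < outlier_score)
```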
Comparison Result
A. Lazarevic, et al, “A Comparative Study of Anomaly Detection Schemes in
Network Intrusion Detection”, Proceedings of the 3rd SIAM Conference on Data
Mining, San Francisco, 2003
Some Emerging Approaches
• SVMs
(unsupervised and supervised)
• PCA
• PCA + SVMs
• Neural Network
Conclusion
• Signature based approaches still play the major
part in practical IDS
• Anomaly detection has only very limited success
• New approaches are proposed every day, but
false positives and detection rate are still the
major problems
• Various mechanisms should work together for
maximum success