A Framework for Extrusion Detection
Using Machine Learning
Yan Luo and Jeffrey J.P. Tsai
Presented by: Koundinya Surepeddi
Dept. of Computer & Information Sciences
University of Delaware
CISC 879 - Machine Learning for Solving Systems Problems
Introduction
• Extrusion: unauthorized transfer of digital assets that contain classified information.
• Extrusion can be malicious or accidental:
  – theft of information
  – accidental inclusion of an unauthorized recipient on a sensitive e-mail
Why EDS?
• Problem: leaking of confidential information (lack of information security)
• Solution: EDS (extrusion detection system)
EDS VS IDS
• Extrusion detection is the reverse process of intrusion detection.
  – IDS protects the system from outside attacks.
  – EDS protects the system from inside attacks.
• IDS uses misuse detection and anomaly detection.
• EDS uses a combination of
  – misuse detection (well-known attacks -> signatures)
  – anomaly detection (system activities -> normal profiles)
  – data mining techniques
Detecting Extrusions
• Automatic process to detect extrusions:
  Raw data (user & system activities)
  -> Data mining techniques (association rule analysis, frequency analysis, …)
  -> Detection rules, proper features (existing extrusions, future extrusions)
EDS Framework
• The framework includes
  1. Data collection (produces raw data)
     a. User monitor - keyboard click events, mouse click events
     b. Process monitor - process create and process terminate events
     c. File system monitor - file create, file modify, file open/close, file read/write events
     d. Network monitor - network traffic data events
     e. Clipboard monitor - clipboard data events, copy/paste events
  2. Data analysis
  3. Extrusion detection
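As a rough sketch of what the data collection step might produce, the five monitors could emit records that share one shape. The `Event` class, its field names, the action strings, and the `collect` helper are illustrative assumptions, not part of the framework as presented.

```python
from dataclasses import dataclass, field
from enum import Enum


class Monitor(Enum):
    """The five monitors named in the data collection step."""
    USER = "user"
    PROCESS = "process"
    FILE_SYSTEM = "file_system"
    NETWORK = "network"
    CLIPBOARD = "clipboard"


@dataclass
class Event:
    """One raw-data record (hypothetical layout)."""
    timestamp: float   # seconds since an arbitrary epoch
    monitor: Monitor   # which monitor produced the record
    action: str        # e.g. "file_open", "copy", "process_create"
    details: dict = field(default_factory=dict)  # free-form payload


def collect(*streams):
    """Merge per-monitor event lists into one timestamp-ordered list."""
    merged = [event for stream in streams for event in stream]
    return sorted(merged, key=lambda event: event.timestamp)
```

For example, `collect(file_events, clipboard_events)` returns one list that the data analysis step can scan in time order.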
System diagram
Components
• Target system: the system that is being protected from extrusions.
• The target system can be
  – a personal computer,
  – a local network, or
  – a whole company's computer system.
• Data collection: collects raw data, which includes event information.
Components
• Raw data: the initial event information.
• Data analysis: analyzes raw data to generate detection rules and select proper features.
• Detection rules: automatically generated by the data analysis module and used for extrusion detection.
• Features: the proper features model the system's and user's normal profiles. The normal profiles can then be used to detect abnormal events and possible extrusions.
Components
• Detection engine:
  – Loads the detection rules from the SQL database and applies them to the target system for run-time extrusion detection.
  – Monitors the target system.
  – If the system's behavior deviates from the baseline of the normal profiles, an alarm for a possible extrusion is triggered.
• Database: stores
  – raw data,
  – detection rules,
  – proper features, and
  – the system's and user's normal profiles.
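The two jobs of the detection engine can be sketched as follows, under loud assumptions: the `detection_rules` table and its `actions` column are a hypothetical schema, SQLite stands in for whatever SQL database is used, and a simple z-score threshold stands in for the normal-profile model, which the slides do not specify.

```python
import sqlite3
from statistics import mean, stdev


def load_rules(conn):
    """Return each stored rule as a list of action names (hypothetical schema)."""
    rows = conn.execute("SELECT actions FROM detection_rules ORDER BY id")
    return [row[0].split("->") for row in rows]


def deviates(baseline, observed, threshold=3.0):
    """True if `observed` is more than `threshold` standard deviations
    away from the mean of the baseline samples (assumed z-score test)."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detection_rules (id INTEGER PRIMARY KEY, actions TEXT)")
conn.execute(
    "INSERT INTO detection_rules (actions) VALUES (?)",
    ("file_open->copy->paste->file_save",),
)
rules = load_rules(conn)

# Made-up normal profile: files copied per hour on earlier days.
normal_profile = [4, 5, 6, 5, 4, 6, 5]
alarm = deviates(normal_profile, observed=40)  # far above the baseline
```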
Technical Approach
1. Find pattern of extrusions
2. Extrusion forecasting
3. Dynamic characteristics
Pattern of Extrusions
• Step 1: Sort the recorded events by their timestamps.
• Step 2: Pre-train the system on large datasets, using
  – data mining techniques, or
  – activities pre-defined by users.
• Step 3: Apply pattern recognition techniques for real-time monitoring of the system activities. The alarm is triggered when a matching pattern is found.
Example - BINDER
• BINDER: an Extrusion-based Break-in Detector.
• Detects break-in extrusions by determining whether network connections are unrelated to user actions.
• Only processes that receive user input are allowed to make connections.
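The idea can be illustrated with a deliberately simplified sketch. It is not BINDER's actual algorithm: the tuple layouts, the function name, and the fixed `window` of seconds are assumptions made here for illustration only.

```python
def unrelated_connections(user_inputs, connections, window=5.0):
    """Flag connections made by a process that received no user input
    during the preceding `window` seconds.

    user_inputs: list of (timestamp, pid)
    connections: list of (timestamp, pid, destination)
    """
    flagged = []
    for conn_time, pid, destination in connections:
        related = any(
            input_pid == pid and 0 <= conn_time - input_time <= window
            for input_time, input_pid in user_inputs
        )
        if not related:
            flagged.append((conn_time, pid, destination))
    return flagged
```

A connection is treated as user-driven only if the same process saw keyboard or mouse input shortly before it; anything else is reported as a possible extrusion.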
Extrusion Forecasting
• We can forecast the next intrusion or extrusion activity if we
  – have enough patterns of intrusion and extrusion activities, and
  – define the detection rules correctly and completely.
• Forecasting new intrusion or extrusion activities:
  – Use partial pattern recognition or rule matching.
  – To find abnormal activities, we first need to model the normal activities.
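Partial rule matching might look like the sketch below: once enough of a rule's leading actions have been seen, the next action in the rule is reported as the forecast. The function names, the action strings, and the `warn_ratio` cutoff are assumptions for illustration.

```python
def matched_prefix(rule, actions):
    """Count how many leading rule actions appear, in order, in `actions`."""
    position = 0
    for action in actions:
        if position < len(rule) and action == rule[position]:
            position += 1
    return position


def forecast(rule, actions, warn_ratio=0.5):
    """Return the next expected action when the rule is partly, but not
    fully, matched and at least `warn_ratio` of it has been seen."""
    done = matched_prefix(rule, actions)
    if done < len(rule) and done / len(rule) >= warn_ratio:
        return rule[done]
    return None
```

With the rule `["file_open", "copy", "paste", "file_save"]` and the observed actions `["file_open", "key_press", "copy"]`, half of the rule has matched, so `forecast` returns `"paste"` as the step to watch for.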
Dynamic Mechanism
• How should event information be organized and used to detect extrusions?
• First way (rule-based detection):
  1. Define some rules and apply them to the recorded or real-time event information.
  2. If a rule is matched, the alarm is triggered.
• Second way:
  1. Organize the event information as one large dataset or many datasets.
  2. Apply data mining and pattern recognition techniques.
Rule based Detection
• A dynamic mechanism for adding rules, testing rules, and deleting rules.
• Step 1: Add specific rules to the system and run the experiments.
• Step 2: If the results are not good, delete the previous rules and add new rules. Run another experiment based on the new rules.
Rule based Detection
Template
• Sequence of actions: Action_1 -> Action_2 -> … -> Action_N
• The user can specify a sequence of actions as a detection rule.
• The system then examines the recorded system events.
• The system checks whether there is a sequence of events that matches the detection rule.
Rule based Detection
Detection Rule
• A confidential file is opened -> the content is copied to the clipboard -> the content is pasted into another file -> the other file is saved
• This detection rule has four actions, and they are performed in a specific order.
• If the detection system finds a sequence of events that matches this rule, an alarm is triggered and a response is carried out.
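A minimal sketch of checking this four-action rule against recorded events is given below. The action names in `RULE` and the `(timestamp, action)` tuple layout are assumptions; the matcher simply looks for the rule's actions in order and ignores unrelated events in between.

```python
# Hypothetical action names for the four steps of the rule.
RULE = ["file_open_confidential", "clipboard_copy", "clipboard_paste", "file_save"]


def find_match(rule, events):
    """Return the (timestamp, action) events that match the rule in
    order, or None if the rule is not fully matched."""
    if not rule:
        return None
    matched = []
    # Sort by timestamp first, since monitors may report out of order.
    for timestamp, action in sorted(events):
        if action == rule[len(matched)]:
            matched.append((timestamp, action))
            if len(matched) == len(rule):
                return matched
    return None
```

A non-`None` result is the point at which the alarm would be raised; the returned events show which recorded actions caused it.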
Rule based Detection
Timing problem:
• Sometimes the extrusion activities do not follow the detection rule step by step.
• The order of these activities may be interleaved.
• So the detection rules should take timing information into account.
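One way to take timing into account, offered only as a sketch: allow other events between the rule's actions, but require the whole sequence to finish within a time limit. The `max_span` parameter, the function name, and the `(timestamp, action)` layout are assumptions, not something the slides define.

```python
def match_within(rule, events, max_span):
    """True if the rule's actions occur in order, possibly interleaved
    with other events, and the first and last matched actions are at
    most `max_span` seconds apart.

    events: list of (timestamp, action) in any order.
    """
    if not rule:
        return False
    ordered = sorted(events)
    for i, (start, action) in enumerate(ordered):
        if action != rule[0]:
            continue
        position, end = 1, start
        # Greedily take the earliest later occurrence of each next action.
        for timestamp, later in ordered[i + 1:]:
            if position == len(rule):
                break
            if later == rule[position]:
                position += 1
                end = timestamp
        if position == len(rule) and end - start <= max_span:
            return True
    return False
```

Trying every occurrence of the first action as a starting point matters: an early, unrelated file open should not hide a complete sequence that starts later and fits inside the window.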
Conclusion
• A combined method that integrates both misuse detection and anomaly detection for automatically generating detection rules and selecting proper features.
• Extrusion detection and confidential information protection can be carried out based on the detection rules and proper features.
Queries?