Learning Classifier Systems to Intrusion Detection
Monu Bambroo
12/01/03
Outline
Problem
Motivation
Background
My Approach
EA Consideration
Results
Questions
Intrusion Detection
Problem of identifying unauthorized users
Protect the system from being compromised
System should provide
• Data confidentiality
• Data Integrity
• Data Availability
2 categories
• Anomaly Detection
Looks for unusual events in the data being monitored. Difficult to
implement.
• Misuse Detection
Data in the network is compared with a database of known signatures.
Cannot protect against unknown attacks.
Revenue loss in 2002 = $455,848,000
Intrusion Detection…
Available Approaches
• Data Mining Techniques
• Short Sequence of system calls
My Approach
• Genetic algorithm to evolve a simple set of fuzzy rules
that can solve some intrusion detection problems
Fuzzy logic Concept
• In my approach genetic algorithms can find good and simple fuzzy
rules to characterize intrusions (abnormal) and normal behavior of
network
• As the difference between normal and abnormal activities is not distinct,
but rather fuzzy, fuzzy logic is used.
Fuzzy Sets
In a fuzzy set an object can
be partially in a set
The membership degree takes
values between 0 and 1
Classic Sets
In a classic set an object is
entirely in a set or not
The membership degree takes
only 2 values, 0 or 1
• Membership function
Fuzzy sets are characterized by a continuous
membership function which maps an object to a
membership degree taking values between 0 and 1
inclusive.
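A minimal sketch of the contrast between a fuzzy and a classic membership function. The triangular shape, the function names and the breakpoints are assumptions made for illustration; the slides do not say which shapes were used.

```python
def triangular(x, left, peak, right):
    """Fuzzy membership: degree in [0, 1] for a triangular set (assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def classic(x, low, high):
    """Classic (crisp) membership: only 0 or 1."""
    return 1.0 if low <= x <= high else 0.0


# Hypothetical breakpoints for a "medium" set on a 0..10 scale.
fuzzy_degree = triangular(2.5, 0.0, 5.0, 10.0)   # partially in the set
crisp_degree = classic(2.5, 0.0, 5.0)            # entirely in the set
```

Here the same value belongs to the fuzzy set only to degree 0.5, while the classic set can only answer 0 or 1.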
System Attributes
su-attempted: su command attempted
num-root: attempted root access
num-file-creations: file creation operations
num-failed-login-in: failed login attempts
src-bytes: no. of bytes from source to destination
dst-bytes: no. of bytes from destination to source
Duration: duration of the connection
Fuzzy rules:
If condition then consequent
where ‘condition’ is a complex fuzzy expression
‘consequent’ is an atomic expression
Some Rules
If duration is high and src-bytes is high then port-scan is ‘high’
If su-attempted is high and failed-login-attempts is high
then R2L is ‘high’
If num-root is high and num-file-creations is high then
R2L is ‘high’
If src-bytes is high and su-attempted is high and
duration is high then port-scan is ‘high’
If num-root is medium and failed-login-attempts is
medium then R2L is ‘medium’
If duration is low and src-bytes is low then port-scan is
‘low’
where high, low, medium are membership functions
Using a fuzzy product inference engine, the degree of
confidence in a rule can be evaluated.
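One way to read the product inference step is that the confidence in a rule is the product of the membership degrees of its conditions. The sketch below assumes attribute values already normalized to [0, 1] and uses simple linear shapes for low, medium and high; both are assumptions for illustration, not details given in the slides.

```python
from math import prod

# Assumed membership functions over normalized inputs in [0, 1].
MEMBERSHIP = {
    "low": lambda x: 1.0 - x,
    "medium": lambda x: 1.0 - abs(2.0 * x - 1.0),
    "high": lambda x: x,
}


def rule_confidence(condition, record):
    """Product of the membership degrees of every (attribute, label) pair."""
    return prod(MEMBERSHIP[label](record[attr]) for attr, label in condition)


# "If duration is high and src-bytes is high then port-scan is 'high'"
rule = [("duration", "high"), ("src-bytes", "high")]
record = {"duration": 0.5, "src-bytes": 0.5}   # hypothetical normalized values
confidence = rule_confidence(rule, record)
```

With the product, one weakly satisfied condition pulls the whole rule's confidence down, unlike a min-based conjunction, which only tracks the weakest condition.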
Learning Classifier Systems (LCS)
Classifier systems are intended as a framework that uses genetic
algorithms to study learning in condition/action, rule-based systems
They consist of 2 parts
• Population of condition-action rules called classifiers
• Algorithm for utilizing, evaluating and improving the rules
Classifier systems address 3 basic problems in machine learning
• Parallelism and Co-ordination
• Credit Assignment
• Rule discovery
The generic architecture of an LCS
Learning classifier system…
The classifier system can be viewed as a message processing system acting on
current list of messages
More messages means more active rules
Credit assignment is handled by setting up a market situation.
Credit is accumulated by each rule as a strength (a kind of capital)
Rule discovery exploits the genetic algorithm's ability to discover and recombine
rules.
Rule strength is treated as fitness by genetic algorithms.
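The market analogy can be sketched as a simplified bucket-brigade style update: each rule that fires pays a fraction of its strength as a bid to the rule that fired before it, and the last rule in the chain receives the external reward. The function name, the bid ratio and the chain representation are assumptions for illustration, not the presenter's implementation.

```python
def bucket_brigade_step(strengths, chain, reward, bid_ratio=0.5):
    """Return updated strengths after one chain of rule firings.

    strengths: dict mapping rule name -> strength (its "capital")
    chain:     rule names in the order they fired
    reward:    external payoff given to the last rule in the chain
    """
    updated = dict(strengths)
    for i, name in enumerate(chain):
        bid = bid_ratio * updated[name]
        updated[name] -= bid                 # the rule pays its bid
        if i > 0:
            updated[chain[i - 1]] += bid     # its supplier collects it
    updated[chain[-1]] += reward             # the environment pays the last rule
    return updated


updated = bucket_brigade_step({"a": 10.0, "b": 10.0}, ["a", "b"], reward=8.0)
```

The resulting strengths are what the genetic algorithm would then read as fitness.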
Fuzzy Learning Classifier System (FLCS)
The fuzzy classifier system is a crossover between a learning classifier system and
fuzzy logic.
A learning classifier system learns rules whose clauses are strings of bits.
Each bit may represent a Boolean value for the corresponding variable. A genetic
algorithm operates on these strings to evolve a best solution.
In a fuzzy classifier system the main idea is to consider the symbols in the rule clauses
as labels associated with fuzzy sets.
The rule activation module has to select one rule in an LCS, whereas in an FLCS all the
rules whose matching degree is greater than a given threshold are triggered and the action is
computed.
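The difference in rule activation can be sketched as follows. Picking the strongest matching rule in the LCS case, and combining fired rules by a degree-weighted average in the FLCS case, are both assumptions for illustration; the slides only say that one rule is selected and that an action is computed.

```python
def lcs_activate(matches):
    """LCS: select a single rule. matches is a list of (rule_name, strength)."""
    return max(matches, key=lambda m: m[1])[0]


def flcs_activate(matches, threshold):
    """FLCS: fire every rule whose matching degree exceeds the threshold.

    matches is a list of (rule_name, degree, output). The combined action is
    the degree-weighted average of the outputs (assumed aggregation).
    """
    fired = [(degree, output) for _, degree, output in matches if degree > threshold]
    if not fired:
        return None
    total = sum(degree for degree, _ in fired)
    return sum(degree * output for degree, output in fired) / total


winner = lcs_activate([("r1", 2.0), ("r2", 5.0)])
action = flcs_activate(
    [("r1", 0.5, 1.0), ("r2", 0.5, 0.0), ("r3", 0.1, 1.0)],
    threshold=0.25,
)
```

In the FLCS call, r3 falls below the threshold and is ignored, while r1 and r2 both contribute to the computed action.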
Evolving fuzzy classifier systems
• Use of Michigan Approach
• Used genetic algorithm to generate fuzzy
classifiers for intrusion detection
• Fuzzification of input values into fuzzy
messages
• Coding of fuzzy if-then rules and fuzzy
matching
00:1111, 01:101/001
• Fuzzy matching and evaluation
• Credit Distribution Algorithm
Bucket Brigade Algorithm with appropriate
fuzzification.
EA consideration
sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)
Fitness = a*sensitivity + b*specificity,
a, b are assigned weights for each rule
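The fitness formula can be sketched directly from confusion-matrix counts. The equal default weights and the zero returned when a denominator is empty are assumptions, since the slides do not give the values of a and b.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def fitness(tp, fn, tn, fp, a=0.5, b=0.5):
    """Fitness = a * sensitivity + b * specificity (weights a, b are assumed)."""
    return a * sensitivity(tp, fn) + b * specificity(tn, fp)


# Hypothetical counts for one rule: 3 attacks caught, 1 missed,
# 1 normal connection passed, 1 false alarm.
score = fitness(tp=3, fn=1, tn=1, fp=1)
```

Raising a relative to b favors rules that catch more attacks at the cost of more false alarms.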
A prespecified number of fuzzy rules, say N, in the current
population is replaced by rules newly generated by genetic
operations.
The N worst rules, those with the smallest fitness, are removed from the current
population and N newly generated rules are added in their
place
Crossover and Mutation are used to generate new rules
Crowding is used to replace classifiers
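The replacement step can be sketched as below, assuming each rule is encoded as a list of fuzzy labels, one per attribute. The encoding, the one-point crossover, the mutation rate and the toy fitness are all assumptions for illustration, and crowding is not modelled: new rules simply take the place of the N worst.

```python
import random

LABELS = ["low", "medium", "high"]


def crossover(parent1, parent2, rng):
    """One-point crossover between two rules of equal length."""
    point = rng.randrange(1, len(parent1))
    return parent1[:point] + parent2[point:]


def mutate(rule, rng, rate=0.1):
    """Replace each label with a random one with probability `rate`."""
    return [rng.choice(LABELS) if rng.random() < rate else gene for gene in rule]


def replace_worst(population, fitness_fn, n, rng):
    """Drop the n least fit rules and add n children bred from the survivors."""
    ranked = sorted(population, key=fitness_fn, reverse=True)
    survivors = ranked[:len(ranked) - n]
    children = []
    for _ in range(n):
        parent1, parent2 = rng.sample(survivors, 2)
        children.append(mutate(crossover(parent1, parent2, rng), rng))
    return survivors + children


def toy_fitness(rule):
    """Stand-in fitness: counts 'high' labels (not the real fitness)."""
    return rule.count("high")


rng = random.Random(0)
population = [[rng.choice(LABELS) for _ in range(3)] for _ in range(6)]
new_population = replace_worst(population, toy_fitness, n=2, rng=rng)
```

The population size stays constant, and the fittest rules always survive into the next generation.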
Questions??