PHMM Applications
Mark Stamp
Applications
We consider 2 applications of PHMMs from information security
o Masquerade detection
o Malware detection
Both show some strengths of PHMMs
Both are somewhat unique
PHMMs not always a first choice
PHMM for Masquerade Detection
Lin Huang
Mark Stamp
Masquerader?
Masquerader makes unauthorized use of another user’s account
o Masquerader tries to evade detection by pretending to be the other user
Can we detect a masquerader?
o Intrusion Detection System (IDS)
We consider the special case where such an IDS is based on UNIX commands
Schonlau Dataset
Collected UNIX commands for 50 users
o 5k training commands per user, plus…
o 10k test commands per user, a mix of attack and user commands
Key provided to tell which blocks are attack and which belong to the same user
o Nominally, 100 blocks, 100 commands each
No real session start/end info provided
o This could be a problem…
Previous Work
Lots of papers use Schonlau data
Types of methods that have been used
o Information theoretic
o Text mining
o Hidden Markov Model
o Naïve Bayes
o Sequences and bioinformatics
o SVM, and Other
Information Theoretic
Schonlau originally used compression-based scheme
o The theory is that commands by same user should compress more
o By subsequent standards, poor results
Some other similar work
o But still no good results
Compression for malware detection
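A minimal sketch of the compression idea, assuming zlib as a stand-in compressor (the slides do not name one). The function name compression_score and the space-joined encoding are illustrative choices, not the original scheme. The score is the number of extra compressed bytes a test block costs when appended to a user's training commands, so a larger score suggests the block is less like that user.

```python
import zlib


def compression_score(training, block):
    """Extra compressed bytes needed for `block` once `training` is known.

    Both arguments are lists of command strings. zlib is an assumption here;
    any general-purpose compressor could be substituted.
    """
    base = len(zlib.compress(" ".join(training).encode(), 9))
    both = len(zlib.compress(" ".join(training + block).encode(), 9))
    return both - base
```

A block would be flagged as a possible attack when its score exceeds a threshold chosen from the user's own training data.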
Text Mining
Look for repetitive sequences
o Can be used to detect particular user
o Almost like a signature
PCA has also been used here
o Repetitive sequences are patterns
o PCA can find such structure
o Training cost considered high
Other ways to do “text mining”?
Hidden Markov Model
Need we say more?
HMM is one of the most popular detection strategies in this field
o Results are good
o Serves as benchmark in many studies of other techniques
We implement HMM detector and compare to PHMM
Naïve Bayes
Naïve Bayes (NB) relies on frequencies
o No sequential info used
o Very simple
o Efficient training & scoring
Discuss Naïve Bayes in later chapter
Close connection between HMM and NB
o So, not too surprising that this works
o But, surprising that it works so well
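A minimal sketch of frequency-based scoring, to show what "no sequential info" means in practice. The additive smoothing parameter alpha and the vocab_size argument are assumptions made for illustration, not values from the studies cited. Because only command counts enter the score, reordering a block leaves its score unchanged.

```python
import math
from collections import Counter


def train_nb(commands, vocab_size, alpha=0.01):
    """Return a function giving the smoothed log-probability of a command."""
    counts = Counter(commands)
    total = len(commands)

    def log_prob(cmd):
        # Additive smoothing (assumed) so unseen commands get nonzero probability.
        return math.log((counts[cmd] + alpha) / (total + alpha * vocab_size))

    return log_prob


def score_block(log_prob, block):
    """Average log-probability per command; lower means less like the user."""
    return sum(log_prob(c) for c in block) / len(block)
```

Training is a single counting pass and scoring is one lookup per command, which is why both are efficient.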
Sequences and Bioinformatics
n-gram approaches very popular
o Like HMM, also used as benchmark
Sequence alignment has been used
o Based on Smith-Waterman algorithm
o Like constructing MSA in PHMM
o Closest previous work to PHMM
We’ll compare our PHMM results to both n-gram and HMM
Support Vector Machines
Several previous studies use SVM
o SVM has nice geometric interpretation
o One of most popular in machine learning
For masquerade detection, SVM results are about same as NB
Claimed that SVM is more efficient, as compared to Naïve Bayes
o But, Naïve Bayes is very efficient
Other
Frequent and/or infrequent commands
o Neither seems to perform well
“Hybrid Bayes one step Markov” and “Hybrid multistep Markov”
o Nice names, but not so good
Non-negative Matrix Factorization
o Good results
Ensemble (combination) approaches
o Seem to offer slight improvement
Experimental Results
Again, we compare HMM and n-grams to several PHMM models
o All are tested on Schonlau dataset
o Then we generate a simulated dataset
o All tested again on simulated data
Why simulated data?
o Schonlau data has some limitations
o This will be explained later…
HMM & n-Gram ROC Curves
First, compare HMM and n-grams
HMM and n-Gram AUC
For ROC curves on previous slide…
Training PHMM
How many sequences to use?
o More sequences, better for E matrix…
o …but worse for gaps
How long should the sequences be?
o For Schonlau dataset, we have 5k training commands per user
Where to begin/end sequences?
o No good answer for Schonlau dataset
PHMM Sequences
Note that all 5k commands used in each case
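A minimal sketch of how the 5k training commands might be cut into sequences for PHMM training, assuming equal-length contiguous chunks. The function name split_into_sequences is hypothetical, and since the dataset has no session boundaries the cut points are arbitrary. With 5, 10, or 20 sequences, every command is used and only the sequence length changes.

```python
def split_into_sequences(commands, num_sequences):
    """Cut a command list into equal-length contiguous sequences.

    Any remainder is dropped when the length does not divide evenly;
    5000 commands divide evenly by 5, 10, and 20.
    """
    length = len(commands) // num_sequences
    return [commands[i * length:(i + 1) * length] for i in range(num_sequences)]
```

More sequences give more observations per column of the alignment but shorter sequences, which is the trade-off raised under Training PHMM.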
PHMM ROC Curves
ROC curves for each PHMM case
Trend?
PHMM AUC
AUC for each PHMM case
o 5, 10, and 20 sequences are best cases
HMM, n-Gram, and PHMM
Again, for Schonlau dataset
Which method is better?
HMM vs PHMM
HMM and PHMM give similar results on Schonlau dataset
Surprising that PHMM does so well
o Why? No begin/end sequence info
What if we had “better” sequences?
o PHMM could certainly do better and maybe much, much better
But how to get a better dataset?
Simulated Dataset
Generate Markov model for each user
o Gives us matrices π and A
o Based on monograph and digraph stats
Now we can generate sequences
o Use matrix π to select initial element
o Then use matrix A to generate sequence
HMM will do well on this data (why?)
PHMM might do well, or not
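The generation steps above can be sketched as follows, assuming π is estimated from single-command (monograph) frequencies and A from adjacent-pair (digraph) frequencies. The function names and the fallback to π for a command with no observed successor are illustrative assumptions, not details from the slides.

```python
import random
from collections import Counter, defaultdict


def fit_markov(commands):
    """Estimate pi (monograph stats) and A (digraph stats) for one user."""
    mono = Counter(commands)
    pi = {c: n / len(commands) for c, n in mono.items()}
    digraphs = defaultdict(Counter)
    for a, b in zip(commands, commands[1:]):
        digraphs[a][b] += 1
    A = {a: {b: n / sum(row.values()) for b, n in row.items()}
         for a, row in digraphs.items()}
    return pi, A


def generate(pi, A, length, rng):
    """Draw the first command from pi, then each next command from A."""
    def draw(dist):
        return rng.choices(list(dist), weights=list(dist.values()))[0]

    seq = [draw(pi)]
    while len(seq) < length:
        # Assumption: if a command has no observed successor, restart from pi.
        row = A.get(seq[-1]) or pi
        seq.append(draw(row))
    return seq
```

Data produced this way follows a Markov process by construction, which is one reason to expect an HMM to model it well.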
ROC Curves Simulated Data
HMM vs PHMM
Based on 5k training commands
AUC for Simulated Data
Again, based on 5k training commands
Real World Problem
Masquerade detection in real world
At first, we have little training data
o Can’t protect user until we train a model
o So, want to train as soon as possible
What is the minimum amount of data really needed?
We compare HMM and PHMM with 200, 400, and 800 training commands
Limited Training Data
Simulated data
HMM vs PHMM
Big difference when very little training data available
Limited Training Data
PHMM most impressive with very little data (especially wrt AUC_0.1)
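A minimal sketch of how AUC and a partial AUC such as AUC_0.1 can be computed from scores, assuming a higher score means "more likely attack" and assuming the partial area is divided by the false positive cutoff so that a perfect detector scores 1. The function names and that normalization are assumptions for illustration.

```python
def roc_points(attack_scores, user_scores):
    """ROC points (FPR, TPR) obtained by sweeping a threshold from high to low."""
    thresholds = sorted(set(attack_scores) | set(user_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in attack_scores) / len(attack_scores)
        fpr = sum(s >= t for s in user_scores) / len(user_scores)
        points.append((fpr, tpr))
    return points


def auc(points, max_fpr=1.0):
    """Trapezoid area under the ROC curve for FPR in [0, max_fpr], over max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at the cutoff by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area / max_fpr
```

Calling auc(points, max_fpr=0.1) restricts attention to the low false positive region, which matters most for an IDS that must not raise constant alarms.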
Limited Training Data
Same results as previous slide
Masquerade Detection Strategy
Obtain 200 commands, train PHMM
Use this PHMM until a reliable set of 800+ commands is available
Then train HMM on 800+ commands
Use HMM from then on
Gives us a reliable model with limited data, and best model with more data
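The strategy above can be sketched as a simple selection rule. The function name choose_model and the string return values are hypothetical; the 200 and 800 thresholds come from the strategy itself.

```python
def choose_model(num_commands, phmm_min=200, hmm_min=800):
    """Pick which detector to use given how many commands have been collected."""
    if num_commands >= hmm_min:
        return "HMM"    # enough data for the best model
    if num_commands >= phmm_min:
        return "PHMM"   # reliable with limited data
    return None         # too little data to train either model
```

For example, a user with 400 collected commands would still be protected by the PHMM, and the switch to HMM happens once 800 commands are available.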
Another PHMM Advantage?
PHMM might be better when attacker hijacks ongoing session
Masquerader mimics average behavior
o This is what is modeled by HMM
Harder to mimic sequential behavior
o As modeled by PHMM
o Depends on position in the sequence
This should be investigated further
PHMM for Malware Detection
Swapna Vemparala
Mark Stamp
Malware Detection
PHMM previously tested for metamorphic detection
o Based on extracted opcodes
Results not impressive
MSA has gaps and PHMM is weak
Code transposition causes problems
o And transposition is easy to implement
Opcode sequencing, not strong feature
Malware Detection 2.0
Here, again apply PHMM to malware
But what features to use?
Want feature(s) where…
o Sequence/order is critical
o Harder for malware writer to modify sequential information
What feature(s) to use?
o Not (static) opcodes
Software Birthmarks
Birthmark is inherent feature of code
o In contrast to a watermark
We consider both static and dynamic birthmarks
Static --- collected without executing
Dynamic --- execution/emulation
Examples of each?
Advantages/disadvantages of each?
This Research
Consider opcodes
o Static feature, extracted by disassembly
Also consider API calls
o Dynamic, use Buster Sandbox Analyzer
Compare HMM and PHMM for both
o Then 4 cases for each malware family
Data
Malware data from Malicia Project
Benign set of 20 Windows applications
Results: Opcode Sequences
Scatterplots and ROC curves for Security Shield
HMM Results
Results for all families, static and dynamic birthmarks
PHMM
Dynamic birthmarks --- API calls
Results
Static and dynamic HMM
And dynamic PHMM
Bottom Line
In these cases, dynamic data gives better results
o API calls better than (static) opcodes
HMM does very well on API calls…
…but PHMM can do even better
Sequential info in API calls!
Is PHMM really worth it?
References
Masquerade detection
o L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, Computers & Security, 30(8):732-747, November 2011
Malware detection
o S. Vemparala, et al., Malware detection using dynamic birthmarks, 2nd International Workshop on Security & Privacy Analytics (IWSPA 2016), co-located with ACM CODASPY 2016, March 9-11, 2016