10_PHMM_appsx


PHMM Applications
Mark Stamp
Applications
 We consider 2 applications of PHMMs from information security
o Masquerade detection
o Malware detection
 Both show some strengths of PHMMs
 Both are somewhat unique
 PHMMs not always a first choice
PHMM for Masquerade Detection
Lin Huang
Mark Stamp
Masquerader?
 Masquerader makes unauthorized use of another user’s account
o Masquerader tries to evade detection by pretending to be the other user
 Can we detect masquerader?
o Intrusion Detection System (IDS)
 We consider special case where such an IDS is based on UNIX commands
Schonlau Dataset
 Collected UNIX commands for 50 users
o 5k training commands per user, plus…
o 10k commands per user that mix attack and user commands
 A key tells which blocks are attacks and which belong to the same user
o Nominally, 100 blocks, 100 commands each
 No real session start/end info provided
o This could be a problem…
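The layout described above can be sketched in a few lines of Python. This is only an illustration: the function name, the constants, and the assumption that the 5k training commands come first in each user's stream are ours, and the command list is synthetic rather than real Schonlau data.

```python
from typing import List, Tuple

TRAIN_SIZE = 5000   # training commands per user (from the slide)
BLOCK_SIZE = 100    # commands per test block (from the slide)

def split_user_commands(commands: List[str]) -> Tuple[List[str], List[List[str]]]:
    """Split one user's command stream into training data and fixed-size test blocks.

    Assumes (hypothetically) that the training commands come first.
    """
    train = commands[:TRAIN_SIZE]
    test = commands[TRAIN_SIZE:]
    blocks = [test[i:i + BLOCK_SIZE] for i in range(0, len(test), BLOCK_SIZE)]
    return train, blocks

# Synthetic stand-in for one user's 15k commands (not real data).
commands = ["ls", "cd", "cat", "vi", "make"] * 3000
train, blocks = split_user_commands(commands)
print(len(train), len(blocks), len(blocks[0]))
```

Because the blocks are cut at fixed offsets, they carry no information about where a login session actually starts or ends, which is the limitation the slide points to.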
Previous Work
 Lots of papers use Schonlau data
 Types of methods that have been used
o Information theoretic
o Text mining
o Hidden Markov Model
o Naïve Bayes
o Sequences and bioinformatics
o SVM, and Other
Information Theoretic
 Schonlau originally used compression-based scheme
o The theory is that commands by same user should compress more
o By subsequent standards, results are poor
 Some other similar work
o But still not any good results
 Compression for malware detection
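A minimal sketch of the compression idea, assuming zlib as the compressor and "extra compressed bytes" as the score. Both choices are ours for illustration and are not necessarily what the original scheme used. A block that resembles the training data should add few bytes when appended to it.

```python
import zlib
from typing import List

def compressed_size(commands: List[str]) -> int:
    """Size in bytes of the zlib-compressed, newline-joined command list."""
    return len(zlib.compress("\n".join(commands).encode("utf-8"), 9))

def compression_score(train: List[str], block: List[str]) -> int:
    """Extra bytes needed to compress the block when it is appended to the training data.

    Lower means the block looks more like the training data.
    """
    return compressed_size(train + block) - compressed_size(train)

# Toy data (hypothetical): one block repeats the user's habits, the other does not.
train = ["ls", "cd", "cat"] * 500
same_block = ["ls", "cd", "cat"] * 33 + ["ls"]
other_block = ["cmd%d" % ((i * 7919) % 1000) for i in range(100)]

print(compression_score(train, same_block), compression_score(train, other_block))
```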
Text Mining
 Look for repetitive sequences
o Can be used to detect particular user
o Almost like a signature
 PCA has also been used here
o Repetitive sequences are patterns
o PCA can find such structure
o Training cost considered high
 Other ways to do “text mining”?
Hidden Markov Model
 Need we say more?
 HMM is one of the most popular detection strategies in this field
o Results are good
o Serves as benchmark in many studies of other techniques
 We implement HMM detector and compare to PHMM
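For reference, scoring a command block against a trained HMM is typically done with the forward algorithm. The sketch below assumes the model (pi, A, B) is already trained and that commands have been mapped to integer symbols. The toy parameters and the helper name score_block are illustrative, not taken from the study.

```python
import math
from typing import List

def hmm_log_likelihood(obs: List[int], pi: List[float],
                       A: List[List[float]], B: List[List[float]]) -> float:
    """Scaled forward algorithm: returns log P(obs | model).

    pi[i]   = probability of starting in hidden state i
    A[i][j] = probability of moving from state i to state j
    B[i][k] = probability of emitting symbol k from state i
    Assumes every observation has nonzero probability under the model.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)                  # rescale to avoid underflow
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob

def score_block(obs: List[int], pi, A, B) -> float:
    """Log-likelihood per command, so blocks of different length are comparable."""
    return hmm_log_likelihood(obs, pi, A, B) / len(obs)

# Toy 2-state, 3-symbol model (hypothetical values).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.1, 0.4, 0.5], [0.7, 0.2, 0.1]]
print(score_block([0, 1, 0, 2], pi, A, B))
```

A block whose score falls below a chosen threshold would be flagged as a possible masquerade.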
Naïve Bayes
 Naïve Bayes (NB) relies on frequencies
o No sequential info used
o Very simple
o Efficient training & scoring
 Discuss Naïve Bayes in later chapter
 Close connection between HMM and NB
o So, not too surprising that this works
o But, surprising that it works so well
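A frequency-only scorer along these lines might look like the sketch below. The smoothing constant alpha, the vocabulary size, and the function names are assumptions for illustration. Note that reordering a block does not change its score, since no sequential information is used.

```python
import math
from collections import Counter
from typing import Callable, List

def train_nb(train: List[str], vocab_size: int, alpha: float = 0.01) -> Callable[[str], float]:
    """Return a function giving the smoothed probability of each command for this user."""
    counts = Counter(train)
    total = len(train)

    def prob(cmd: str) -> float:
        # Additive smoothing so unseen commands get a small nonzero probability.
        return (counts[cmd] + alpha) / (total + alpha * vocab_size)

    return prob

def nb_score(prob: Callable[[str], float], block: List[str]) -> float:
    """Average log-probability of the commands in the block (order is ignored)."""
    return sum(math.log(prob(c)) for c in block) / len(block)

# Toy data (hypothetical).
train = ["ls"] * 60 + ["cd"] * 30 + ["vi"] * 10
prob = train_nb(train, vocab_size=6)
normal = ["ls", "cd", "ls", "vi"]
odd = ["gcc", "gdb", "make", "ls"]
print(nb_score(prob, normal), nb_score(prob, odd))
```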
Sequences and Bioinformatics
 n-gram approaches very popular
o Like HMM, also used as benchmark
 Sequence alignment has been used
o Based on Smith-Waterman algorithm
o Like constructing MSA in PHMM
o Closest previous work to PHMM
 We’ll compare our PHMM results to both n-gram and HMM
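One simple n-gram scoring variant is sketched below: the user profile is the set of n-grams seen in training, and a block is scored by the fraction of its n-grams found in that profile. This is an illustrative choice and may differ from the n-gram scheme used in the experiments.

```python
from typing import List, Set, Tuple

def ngrams(seq: List[str], n: int) -> List[Tuple[str, ...]]:
    """All length-n windows of the sequence, in order."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def train_ngram(train: List[str], n: int = 3) -> Set[Tuple[str, ...]]:
    """User profile: the set of n-grams observed in the training commands."""
    return set(ngrams(train, n))

def ngram_score(profile: Set[Tuple[str, ...]], block: List[str], n: int = 3) -> float:
    """Fraction of the block's n-grams that also occur in the profile."""
    grams = ngrams(block, n)
    if not grams:
        return 0.0
    return sum(g in profile for g in grams) / len(grams)

# Toy data (hypothetical): the two blocks use the same commands in different orders.
profile = train_ngram(["cd", "ls", "vi", "make"] * 50)
in_order = ["ls", "vi", "make", "cd", "ls"]
shuffled = ["make", "vi", "ls", "cd", "ls"]
print(ngram_score(profile, in_order), ngram_score(profile, shuffled))
```

Unlike a pure frequency method, this score changes when the same commands appear in a different order.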
Support Vector Machines
 Several previous studies use SVM
o SVM has nice geometric interpretation
o One of most popular in machine learning
 For masquerade detection, SVM results are about same as NB
 Claimed that SVM is more efficient, as compared to Naïve Bayes
o But, Naïve Bayes is very efficient
Other

 Frequent and/or infrequent commands
o Neither seems to perform well
 “Hybrid Bayes one step Markov” and “Hybrid multistep Markov”
o Nice names, but not so good
 Non-negative Matrix Factorization
o Good results
 Ensemble (combination) approaches
o Seem to offer slight improvement
Experimental Results
 Again, we compare HMM and n-grams to several PHMM models
o All are tested on Schonlau dataset
o Then we generate a simulated dataset
o All tested again on simulated data
 Why simulated data?
o Schonlau data has some limitations
o This will be explained later…
HMM & n-Gram ROC Curves
 First, compare HMM and n-grams
HMM and n-Gram AUC
 For ROC curves on previous slide…
Training PHMM
 How many sequences to use?
o More sequences, better for E matrix…
o …but worse for gaps
 How long should the sequences be?
o For Schonlau dataset, we have 5k training commands per user
 Where to begin/end sequences?
o No good answer for Schonlau dataset
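To see why more sequences help the E matrix, consider estimating emission probabilities column by column from an MSA. The sketch below is a simplification under stated assumptions: every column is treated as a match state, gaps are skipped, and add-one smoothing is applied. The function name and the toy alignment are ours.

```python
from collections import Counter
from typing import Dict, List

def emission_probs(msa: List[List[str]], alphabet: List[str],
                   gap: str = "-") -> List[Dict[str, float]]:
    """Per-column emission probabilities from an aligned set of sequences.

    Simplifying assumption: every column is a match state. Gaps are ignored
    and add-one smoothing keeps unseen symbols at nonzero probability.
    """
    E = []
    for column in zip(*msa):
        counts = Counter(sym for sym in column if sym != gap)
        total = sum(counts.values()) + len(alphabet)
        E.append({sym: (counts[sym] + 1) / total for sym in alphabet})
    return E

# Toy alignment of three command sequences (hypothetical).
msa = [
    ["ls", "cd", "-"],
    ["ls", "vi", "cd"],
    ["ls", "cd", "cd"],
]
E = emission_probs(msa, alphabet=["cd", "ls", "vi"])
print(E[0])
```

Each added sequence contributes one more count per column, so the estimates sharpen. The cost, as the slide notes, is that aligning more sequences tends to introduce more gaps.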
PHMM Sequences
 Note that all 5k commands are used in each case
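Using all 5k commands in each case means the sequence length is fixed by the number of sequences. A small sketch, assuming equal-length sequences cut at arbitrary fixed offsets (the function name and the synthetic commands are illustrative):

```python
from typing import List

def split_into_sequences(commands: List[str], num_sequences: int) -> List[List[str]]:
    """Cut the training commands into equal-length sequences.

    Boundaries are arbitrary fixed offsets, since no session start/end info
    is available. Any remainder that does not fill a sequence is dropped.
    """
    length = len(commands) // num_sequences
    return [commands[i * length:(i + 1) * length] for i in range(num_sequences)]

# Synthetic stand-in for 5k training commands (not real data).
training = ["ls", "cd", "cat", "vi", "make"] * 1000
for k in (5, 10, 20):
    seqs = split_into_sequences(training, k)
    print(k, "sequences of length", len(seqs[0]))
```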
PHMM ROC Curves
 ROC curves for each PHMM case
 Trend?
PHMM AUC
 AUC for each PHMM case
o 5, 10, and 20 sequences are best cases
HMM, n-Gram, and PHMM
 Again, for Schonlau dataset
 Which method is better?
HMM vs PHMM
 HMM and PHMM give similar results on Schonlau dataset
 Surprising that PHMM does so well
o Why? No begin/end sequence info
 What if we had “better” sequences?
o PHMM could certainly do better and maybe much, much better
 But how to get a better dataset?
Simulated Dataset
 Generate Markov model for each user
o Gives us matrices π and A
o Based on monograph and digraph stats
 Now we can generate sequences
o Use matrix π to select initial element
o Then use matrix A to generate sequence
 HMM will do well on this data (why?)
 PHMM might do well, or not
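The generation procedure above can be sketched as follows. The function names are ours, and the fallback to π when a command has no observed successor is an assumption added to keep the sketch runnable. The training commands are synthetic.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

def fit_markov(commands: List[str]) -> Tuple[Dict[str, float], Dict[str, Dict[str, float]]]:
    """Estimate pi from monograph (single-command) counts and A from digraph (pair) counts."""
    total = len(commands)
    pi = {cmd: n / total for cmd, n in Counter(commands).items()}
    digraphs: Dict[str, Counter] = defaultdict(Counter)
    for a, b in zip(commands, commands[1:]):
        digraphs[a][b] += 1
    A = {a: {b: n / sum(row.values()) for b, n in row.items()}
         for a, row in digraphs.items()}
    return pi, A

def draw(dist: Dict[str, float], rng: random.Random) -> str:
    """Sample one key of dist with probability proportional to its value."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]

def generate(pi, A, length: int, rng: random.Random) -> List[str]:
    """Select the first command from pi, then follow the rows of A."""
    seq = [draw(pi, rng)]
    while len(seq) < length:
        row = A.get(seq[-1])
        # Assumption: fall back to pi if the last command has no observed successor.
        seq.append(draw(row if row else pi, rng))
    return seq

# Synthetic stand-in for one user's training commands (not real data).
commands = ["cd", "ls", "vi", "make"] * 100
pi, A = fit_markov(commands)
seq = generate(pi, A, 50, random.Random(0))
print(seq[:8])
```

Data generated this way is itself produced by a Markov process, which suggests one answer to the "why?" above.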
ROC Curves Simulated Data
 HMM vs PHMM
 Based on 5k training commands
AUC for Simulated Data
 Again, based on 5k training commands
Real World Problem
 Masquerade detection in real world
 At first, we have little training data
o Can’t protect user until we train a model
o So, want to train as soon as possible
 Minimum data that is really needed?
 We compare HMM and PHMM with 200, 400, and 800 training commands
Limited Training Data
 Simulated data
 HMM vs PHMM
 Big difference when very little training data available
Limited Training Data
 PHMM most impressive with very little data (especially wrt AUC_0.1)
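For readers unfamiliar with the partial AUC, here is a sketch of how AUC and AUC_0.1 could be computed from block scores. It assumes that AUC_0.1 means the area under the ROC curve up to a false positive rate of 0.1, normalized so the maximum is 1. It also assumes label 1 marks an attack block and that a higher score means more suspicious. Function names and toy values are illustrative.

```python
from typing import List, Tuple

def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """(FPR, TPR) points obtained by sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == threshold:  # handle ties together
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def partial_auc(points: List[Tuple[float, float]], max_fpr: float = 1.0) -> float:
    """Trapezoid area under the ROC curve for FPR in [0, max_fpr], divided by max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:  # clip the last segment at max_fpr by linear interpolation
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area / max_fpr

# Toy scores (hypothetical): two attack blocks, two user blocks.
points = roc_points([0.9, 0.7, 0.6, 0.3], [1, 0, 1, 0])
print(partial_auc(points), partial_auc(points, max_fpr=0.1))
```

AUC_0.1 rewards detectors that catch attacks while keeping false alarms rare, which is the operating range that matters most for an IDS.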
Limited Training Data
 Same results as previous slide
Masquerade Detection Strategy
 Obtain 200 commands, train PHMM
 Use this PHMM model until a reliable set of 800+ commands is available
 Then train HMM on 800+ commands
 Use HMM from then on
 Gives us a reliable model with limited data, and best model with more data
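The switching rule in this strategy can be written down directly. The thresholds come from the slide; the function name and the "none" state for users with fewer than 200 commands are our own illustrative additions.

```python
PHMM_MIN_COMMANDS = 200   # enough to train a PHMM (from the slide)
HMM_MIN_COMMANDS = 800    # enough to train an HMM (from the slide)

def choose_model(num_commands: int) -> str:
    """Pick which model type to use, given how many reliable commands are available."""
    if num_commands >= HMM_MIN_COMMANDS:
        return "HMM"
    if num_commands >= PHMM_MIN_COMMANDS:
        return "PHMM"
    return "none"   # assumption: no model can be trained yet

for n in (100, 200, 799, 800):
    print(n, choose_model(n))
```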
Another PHMM Advantage?
 PHMM might be better when attacker hijacks ongoing session
 Masquerader mimics average behavior
o This is what is modeled by HMM
 Harder to mimic sequential behavior
o As modeled by PHMM
o Depends on position in the sequence
 This should be investigated further
PHMM for Malware Detection
Swapna Vemparala
Mark Stamp
Malware Detection
 PHMM previously tested for metamorphic detection
o Based on extracted opcodes
 Results not impressive
 MSA has gaps and PHMM is weak
 Code transposition causes problems
o And transposition is easy to implement
 Opcode sequencing, not strong feature
Malware Detection 2.0
 Here, again apply PHMM to malware
 But what features to use?
 Want feature(s) where…
o Sequence/order is critical
o Harder for malware writer to modify sequential information
 What feature(s) to use?
o Not (static) opcodes
Software Birthmarks
 Birthmark is inherent feature of code
o In contrast to a watermark
 We consider both static and dynamic birthmarks
 Static --- collected without executing
 Dynamic --- execution/emulation
 Examples of each?
 Advantages/disadvantages of each?
This Research
 Consider opcodes
o Static feature, extracted by disassembly
 Also consider API calls
o Dynamic, use Buster Sandbox Analyzer
 Compare HMM and PHMM for both
o Then 4 cases for each malware family
Data
 Malware data from Malicia Project
 Benign set of 20 Windows applications
Results: Opcode Sequences
 Scatterplots and ROC curves for Security Shield
HMM Results
 Results for all families, static and dynamic birthmarks
PHMM
 Dynamic birthmarks --- API calls
Results
 Static and dynamic HMM
 And dynamic PHMM
Bottom Line
 In these cases, dynamic data gives better results
o API calls better than (static) opcodes
 HMM does very well on API calls…
 …but PHMM can do even better
 Sequential info in API calls!
 Is PHMM really worth it?
References

 Masquerade detection
o L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, Computers & Security, 30(8):732-747, November 2011
 Malware detection
o S. Vemparala, et al., Malware detection using dynamic birthmarks, 2nd International Workshop on Security & Privacy Analytics (IWSPA 2016), co-located with ACM CODASPY 2016, March 9-11, 2016