here - Michael Goffin


Yara & Python
Malware Identification and Classification
CarolinaCon 7
Michael Goffin
@mjxg
http://www.mgoff.in
Hey sir!
Why hello there!!
• Rochester Institute of Technology
• Computer Science House
• Information Security Scientist/Engineer
What’s in store?
• Malware
• Yara
• Python
• Identification and Classification of Malware
• Showing it all off
• QQ session
Malware! Sonofa...
Methods of acquisition
• downloads
• compromised website content (e.g., images)
• attachments
• links to compromised site content
You’ve been infiltrated!
Things to note:
• You don’t know it yet, and might not for a while
• You don’t know the scope of it
• You don’t know the severity of it
But you eventually see something…
Start the cycle!
Management wants answers!
What do you do next?
• Go into a panic!
• Oh no! We should remove the known compromised host(s) from the network!
• We should assess the compromise…somehow!
• Oh geez, might be good to change passwords – let’s just have everyone do it just in case!
• We need to go through logs and other hosts for signs of lateral movement – wait, what are we looking for?
• Can we make firewall rules to block any IPs or domains?
• Do we have any AV or IDS appliances?
Most importantly
You did get a copy of the malware to analyze, right?
…Right?
Get better at data mining!
• Who is interested in this user or your company?
• What are they trying to do with this malware, and what are they exploiting?
• When did this malware come in?
• Where did it come from and where did it go to?
• Why are they after your company, or this user?
• How does this malware help them accomplish their goals?
What do we do with all the data?
Build a classification database over time!
• Identify trends
• Find commonalities
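One way such a classification database could start is a small SQLite file that records each sample and the indicators pulled from it. This is only a sketch: the table layout, column names, and the helper names open_db, add_sample, and shared_indicators are assumptions made for illustration, not something from the talk.

```python
import sqlite3

def open_db(path=":memory:"):
    """Create (or open) a small sample database. The schema is illustrative."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " sha256 TEXT PRIMARY KEY,"   # identity of the exact file
        " first_seen TEXT,"           # ISO date the sample arrived
        " source TEXT,"               # e.g. 'email', 'web download'
        " family TEXT)"               # analyst-assigned label
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS indicators ("
        " sha256 TEXT, kind TEXT, value TEXT,"
        " UNIQUE (sha256, kind, value))"
    )
    return db

def add_sample(db, sha256, first_seen, source, family, indicators):
    """Store one sample plus its (kind, value) indicator pairs."""
    db.execute("INSERT OR REPLACE INTO samples VALUES (?, ?, ?, ?)",
               (sha256, first_seen, source, family))
    db.executemany("INSERT OR IGNORE INTO indicators VALUES (?, ?, ?)",
                   [(sha256, kind, value) for kind, value in indicators])
    db.commit()

def shared_indicators(db, min_samples=2):
    """Return (kind, value, count) for indicators seen in several samples."""
    return db.execute(
        "SELECT kind, value, COUNT(DISTINCT sha256) FROM indicators"
        " GROUP BY kind, value"
        " HAVING COUNT(DISTINCT sha256) >= ?"
        " ORDER BY COUNT(DISTINCT sha256) DESC, value",
        (min_samples,)).fetchall()

# Two made-up samples that share one domain.
db = open_db()
add_sample(db, "aaa", "2011-04-01", "email", "unknown",
           [("domain", "bad.example"), ("mutex", "xyz")])
add_sample(db, "bbb", "2011-04-20", "web download", "unknown",
           [("domain", "bad.example")])
print(shared_indicators(db))
```

Querying for indicators that recur across samples is one simple way to surface the trends and commonalities mentioned above.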
Lots of action, now what?
Enter Yara
What does Yara do?
Identify and classify malware samples based on textual or binary patterns contained within those samples
MALWARE!
How does it do it?
Pretty basic:
• Search for patterns
• Use defined conditions to determine if the patterns are a positive match
• Output matching rule content for consumption
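The three steps above can be mimicked in a few lines of plain Python. To be clear, this is a toy model for intuition only and not how Yara is implemented; the rule layout (a dict with "patterns" and a "condition" callable) and the names find_patterns, evaluate, and TOY_RULES are invented for this sketch.

```python
def find_patterns(data, patterns):
    """Map each pattern name to the list of offsets where it occurs in data."""
    hits = {}
    for name, needle in patterns.items():
        offsets = []
        start = data.find(needle)
        while start != -1:
            offsets.append(start)
            start = data.find(needle, start + 1)
        hits[name] = offsets
    return hits

def evaluate(rules, data):
    """Return the names of toy rules whose condition holds for data."""
    matched = []
    for rule in rules:
        hits = find_patterns(data, rule["patterns"])   # step 1: search
        if rule["condition"](hits, len(data)):         # step 2: decide
            matched.append(rule["name"])               # step 3: report
    return matched

TOY_RULES = [
    {
        "name": "looks_like_pe_with_marker",
        "patterns": {"mz": b"MZ", "marker": b"evil-marker"},
        # Positive match: 'MZ' at offset 0 and the marker at least once.
        "condition": lambda hits, size: 0 in hits["mz"] and len(hits["marker"]) >= 1,
    },
]

sample = b"MZ\x90\x00 padding evil-marker more padding"
print(evaluate(TOY_RULES, sample))
```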
Yara and Python
Step 1:
% python
Step 2:
>>> import yara
>>> rules = yara.compile(filepath=signatures)   # signatures: path to a rules file
>>> matches = rules.match(filepath=filetoscan)  # filetoscan: path to the file to scan
Step 3:
profit
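For a version that runs without any files on disk, yara-python can also compile rule text from a string (source=) and scan an in-memory buffer (data=). The sketch below assumes the yara-python package is installed; the rule demo_marker, the marker string, and the helper scan_bytes are made up for illustration.

```python
try:
    import yara  # third-party package: yara-python (assumed installed)
except ImportError:
    yara = None

# Hypothetical rule: fires when the marker string appears anywhere.
RULE_SOURCE = r'''
rule demo_marker
{
    strings:
        $marker = "evil-marker"
    condition:
        $marker
}
'''

def scan_bytes(data, source=RULE_SOURCE):
    """Compile the rule text and return the names of rules matching data."""
    if yara is None:
        raise RuntimeError("yara-python is not installed")
    rules = yara.compile(source=source)
    return [match.rule for match in rules.match(data=data)]

if yara is not None:
    print(scan_bytes(b"header evil-marker trailer"))
```

Each item returned by match() is a match object; its rule attribute holds the name of the rule that fired.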
As the old saying goes…
If it walks like a duck…
And it quacks like a duck…
It’s probably the DHA installing backdoors and keyloggers while xfil’ing your data.
Identification
• Can we tease out specific characteristics about this piece of malware that can describe it both from a functional and fashionable perspective?
– What does it attempt to touch?
– What does it attempt to modify?
– Is this type of malware stylish?
– Etc.
Identification
• Are there any quantitative or qualitative datasets about this malware that can help further describe its nature?
– Functions used in other malware
– Code style similar to other malware
– IPs or domains used
– Specific targets (files, processes, etc.)
– End result of successful execution
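Some of these data points, such as IPs or domains used, can often be pulled straight out of a sample's raw bytes. The following is a rough sketch using regular expressions; the short TLD list and the function name extract_indicators are assumptions, and real samples frequently obfuscate or encode such strings, so treat the output as a starting point only.

```python
import re

# Dotted-quad candidates; octet range is checked afterwards.
IPV4_RE = re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Deliberately tiny, illustrative TLD list (an assumption of this sketch).
DOMAIN_RE = re.compile(
    rb"\b(?:[a-z0-9-]+\.)+(?:com|net|org|info|biz|example)\b", re.I)

def extract_indicators(data):
    """Return sorted, de-duplicated IPv4 addresses and domains found in data."""
    ips = sorted({
        m.decode() for m in IPV4_RE.findall(data)
        if all(int(octet) <= 255 for octet in m.split(b"."))
    })
    domains = sorted({m.decode().lower() for m in DOMAIN_RE.findall(data)})
    return {"ips": ips, "domains": domains}

blob = b"\x00connect 10.1.2.3\x00GET http://Update.Bad.example/a\x00 999.1.1.1"
print(extract_indicators(blob))
```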
Classification
Questions[1]:
• Does an unknown malware instance belong to a known malware family or does it constitute a novel malware strain?
• What behavioral features are discriminative for distinguishing instances of one malware family from those of other families?
– Compare these to our Identification
Strains
• Trojan
• Rootkit
• Backdoor
• Xfil
• Worms
• Ransomware
• Keylogger
Build Signatures
• Generate conditions
• Build rules for those conditions
• Compile rules into a signature set
• Develop process to scan files using those signature sets
• Generate alerts
Set human response expectations to these alerts!!
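The compile, scan, and alert steps could be wired together roughly as follows. This is a sketch under assumptions: yara-python is installed, the two one-line rules and their namespaces are invented, and the alert is just a dict; how alerts are delivered and who responds to them is left open.

```python
import os

try:
    import yara  # third-party package: yara-python (assumed installed)
except ImportError:
    yara = None

# Hypothetical rule sources, one namespace per malware strain.
SOURCES = {
    "keylogger": 'rule kl_demo { meta: family = "demo" strings: $a = "keylog" condition: $a }',
    "backdoor": 'rule bd_demo { strings: $a = "bindshell" condition: $a }',
}

def compile_signature_set(sources=SOURCES):
    """Compile several rule sources into one signature set."""
    if yara is None:
        raise RuntimeError("yara-python is not installed")
    return yara.compile(sources=sources)

def build_alert(path, match):
    """Turn one match into a plain dict a person or ticket system can consume."""
    return {
        "file": path,
        "rule": match.rule,
        "namespace": match.namespace,
        "meta": dict(match.meta),
    }

def scan_tree(rules, root):
    """Scan every file under root and yield one alert per rule match."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            for match in rules.match(filepath=path):
                yield build_alert(path, match)

# Example use (the directory name is a placeholder):
# for alert in scan_tree(compile_signature_set(), "/path/to/quarantine"):
#     print(alert)
```

Keeping the alert a plain dict makes it easy to log, email, or push into whatever process the responders already use.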
What a rule looks like
rule foo
{
    meta:
        key = "value"
    strings:
        $variable = "something"
    condition:
        logic_for_determining_positive_rule_match
}
Conditions
Some basic condition examples:
• A string or value exists
• A set of strings or values exist
• Strings or values at certain offsets exist
• The number of times a string or value occurs
• File size restriction
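Each of these condition types maps onto Yara rule syntax. The rules below are made-up illustrations (names, strings, and thresholds are all assumptions), wrapped in a Python string so they can be compiled with yara-python if it is installed.

```python
try:
    import yara  # third-party package: yara-python (assumed installed)
except ImportError:
    yara = None

CONDITION_EXAMPLES = r'''
rule string_exists        // a string or value exists
{
    strings:
        $a = "evil-marker"
    condition:
        $a
}

rule set_of_strings       // a set of strings or values exist
{
    strings:
        $s1 = "keylog"
        $s2 = "upload"
        $s3 = "persist"
    condition:
        2 of ($s1, $s2, $s3)
}

rule string_at_offset     // a string at a certain offset
{
    strings:
        $mz = "MZ"
    condition:
        $mz at 0
}

rule occurrence_count     // number of times a string occurs
{
    strings:
        $a = "beacon"
    condition:
        #a >= 3
}

rule small_file           // file size restriction
{
    strings:
        $a = "evil-marker"
    condition:
        $a and filesize < 10KB
}
'''

def matching_rules(data):
    """Return the sorted names of example rules that match data."""
    if yara is None:
        raise RuntimeError("yara-python is not installed")
    rules = yara.compile(source=CONDITION_EXAMPLES)
    return sorted(match.rule for match in rules.match(data=data))

if yara is not None:
    print(matching_rules(b"MZ beacon beacon beacon keylog upload evil-marker"))
```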
Let’s see Yara in action!
How to incorporate Yara
• Web downloads
• Web content
– urllib
• Email attachments
• Honeypots
Grab files from AV and IDS appliances to scan!
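As one example of an intake point, email attachments can be pulled out of a raw message with Python's standard email module and handed to whatever scanner is in use. In this sketch the scanner is passed in as a callable so a compiled Yara rule set can be plugged in; the names iter_attachments, scan_message, and toy_scan are hypothetical, and toy_scan merely stands in for a real scan.

```python
import email
from email.message import EmailMessage

def iter_attachments(raw_message):
    """Yield (filename, payload_bytes) for each attachment in a raw message."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        filename = part.get_filename()
        if filename:
            payload = part.get_payload(decode=True)  # undo base64 etc.
            if payload is not None:
                yield filename, payload

def scan_message(raw_message, scan):
    """Apply scan(bytes) -> list of rule names to every attachment."""
    return {name: scan(data) for name, data in iter_attachments(raw_message)}

# Build a small test message with one binary attachment.
msg = EmailMessage()
msg["Subject"] = "invoice"
msg.set_content("see attached")
msg.add_attachment(b"MZ evil-marker", maintype="application",
                   subtype="octet-stream", filename="invoice.exe")

def toy_scan(data):
    """Stand-in scanner; a real one might call rules.match(data=data)."""
    return ["demo_marker"] if b"evil-marker" in data else []

print(scan_message(msg.as_bytes(), toy_scan))
```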
Why Yara?
• Supplement to additional applications (Snort, AV, detonation chambers)
• MD5 of known malware only good if exact file is seen again
• Detect future malware with similar identifiers that AV or IDS might not catch yet
• Free
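The MD5 point is easy to demonstrate: changing a single byte of a file changes its hash, while a content pattern inside it still matches. The byte strings below are fabricated for the demonstration, and the plain substring check only stands in for a pattern-based rule.

```python
import hashlib

# A fabricated "sample" and a variant that differs by one appended byte.
original = b"MZ" + b"\x00" * 32 + b"evil-marker" + b"\x00" * 32
variant = original + b"\x01"

known_bad_md5 = {hashlib.md5(original).hexdigest()}

def md5_hit(data):
    """True only if this exact file has been seen before."""
    return hashlib.md5(data).hexdigest() in known_bad_md5

def pattern_hit(data, marker=b"evil-marker"):
    """True if the marker occurs anywhere in the data."""
    return marker in data

for label, blob in (("original", original), ("variant", variant)):
    print(label, "md5:", md5_hit(blob), "pattern:", pattern_hit(blob))
```

The hash lookup misses the variant, while the pattern check still flags it.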
The cooldown…
• http://code.google.com/p/yara-project/
Questions?
References
• [1] Learning and Classification of Malware Behavior – Rieck, Holz, Willems, Düssel, Laskov
– http://pi1.informatik.uni-mannheim.de/filepool/publications/malwareclassification-dimva08.pdf