Jarhead: Analysis and Detection of Malicious Java Applets

Johannes Schlumberger, Christopher Kruegel, Giovanni Vigna
University of California
Annual Computer Security Applications Conference (ACSAC), December 2012
Reporter: 鍾怡傑 2013/03/25
Outline
• INTRODUCTION
• BACKGROUND
  • Java applet
  • Java exploits
• JARHEAD SYSTEM OVERVIEW
• FEATURE DISCUSSION
  • Obfuscation
  • Behavior
• EVALUATION
  • Manually collected dataset
  • Wepawet dataset
• POSSIBLE EVASION
• CONCLUSIONS
INTRODUCTION
• We address the problem of malicious Java applets, a problem on the
  rise that is currently not well addressed by existing work.
• Jarhead uses static analysis and machine learning techniques to
  identify malicious Java applets.
INTRODUCTION
• Drive-by download attacks
• Social engineering attacks
INTRODUCTION
• Signature-based detection can be evaded through obfuscation
• Honeyclients need a vulnerable software combination
  • Java plugin version
  • Java version
  • Browser and OS version
BACKGROUND-Java applet
• Java bytecode + application files
• Commonly bundled as a Jar archive
• Embedded in web pages
• Executed by web browsers in a sandboxed JVM
• Optional digital signature disables the sandbox
• Developed in the 1990s for mobile code
• Superseded by CSS, JavaScript, Flash, ...
• Modern browsers still support applets
Jar-archive
Embedded in web pages
<applet code="xxxx.class" archive="test.jar"
        width="550" height="150">
  <param name="bgcolor" value="255,255,255">
  <param name="font" value="新明細體">
</applet>
Digital Signature
https://chrometerm.appspot.com
BACKGROUND-Java exploits
• Users are unaware of Java applets
• Plugins are enabled by default
• Plugins are out of date
• Multiple vulnerabilities in the JVM or the Java library
JARHEAD SYSTEM OVERVIEW
• Detector for malicious Java applets
  • Static
  • Reliable
  • Accurate
  • Fast
  • Offline
  • Robust
  • Low maintenance
• Analyzed a large number of samples
• Detected previously unknown exploits
How does Jarhead work?
1. Unpack
2. Disassemble
3. Statically extract feature set
4. Classification
5. Result
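The first step, unpacking, can be sketched in Python as follows. This is a minimal illustration and not Jarhead's actual code: it treats the Jar as a ZIP archive and separates class files from other entries by the 0xCAFEBABE magic number that starts every Java class file. The names `split_jar_entries` and `build_demo_jar` are hypothetical.

```python
import io
import zipfile

# Every compiled Java class file starts with these four bytes.
CLASS_MAGIC = b"\xca\xfe\xba\xbe"


def split_jar_entries(jar_bytes):
    """Return (class_names, other_names) for a Jar archive held in memory."""
    class_names, other_names = [], []
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            with jar.open(info) as entry:
                head = entry.read(4)
            if head == CLASS_MAGIC:
                class_names.append(info.filename)
            else:
                other_names.append(info.filename)
    return class_names, other_names


def build_demo_jar():
    """Build a tiny in-memory archive so the sketch can be run without files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        # Not a valid class, only the magic number plus a few header bytes.
        jar.writestr("Demo.class", CLASS_MAGIC + b"\x00\x00\x00\x34")
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
    return buf.getvalue()
```

The class entries found this way would then be handed to a disassembler (step 2); checking the magic number rather than the file extension is a design choice of this sketch, made so that renamed class files are still picked up.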
Why statically?
1. Partial exploits cannot be analyzed dynamically
2. Resistant to fingerprinting/evasion
3. Independent of the environment (JVM/Java version, OS, ...)
4. 100% code coverage
FEATURE DISCUSSION
• General metrics (size in bytes, ...)
• Obfuscation
  • Code metrics
  • String obfuscation
  • Active code obfuscation
• Behavior
  • Interaction with security-critical components
  • Download and execute
  • Jar content
  • Known vulnerable functions
• 42 features total
Obfuscation
Code metrics
• We collect a number of simple metrics that look at the size of an
  applet, i.e., the total number of instructions and the number of
  lines of disassembled code, its number of classes, and the number of
  functions per class.
• Cyclomatic complexity is a complexity metric for code, computed on
  the control flow graph (CFG).
• To find semantically useless code, we measure the number of dead
  local variables and the number of unused methods and functions.
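As an illustration of the cyclomatic complexity metric above, the sketch below applies McCabe's standard formula M = E - N + 2P (edges, nodes, connected components) to the CFG of a single method, where P = 1. Representing the CFG as a dictionary from basic block to successor list is an assumption made for this example, and the function names are hypothetical.

```python
def cyclomatic_complexity(cfg):
    """McCabe complexity M = E - N + 2 for one connected control flow graph.

    cfg maps each basic block label to the list of its successor labels.
    """
    nodes = set(cfg)
    for successors in cfg.values():
        nodes.update(successors)  # include blocks that only appear as targets
    edges = sum(len(successors) for successors in cfg.values())
    return edges - len(nodes) + 2


def average_complexity(cfgs):
    """Average complexity over the CFGs of several methods."""
    if not cfgs:
        return 0.0
    return sum(cyclomatic_complexity(cfg) for cfg in cfgs) / len(cfgs)


# A method with a single if/else: two paths from entry to exit.
IF_ELSE = {
    "entry": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
}
```

Straight-line code yields 1, and each additional branch or loop adds one.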
String obfuscation
• Strings are heavily used by both benign and malicious applets.
• The reason for string obfuscation is to defend against
  signature-based systems.
• For the length feature, we determine the length of the shortest and
  longest string in the pool as well as the average length of all
  strings.
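The length features described above could be computed along these lines. This is a sketch under assumptions: the input is the list of strings already extracted from the constant pool, the function name `string_pool_features` is hypothetical, and "non-ASCII string" is taken to mean a string containing at least one character outside the ASCII range.

```python
def string_pool_features(strings):
    """Length and non-ASCII statistics over a list of constant-pool strings."""
    if not strings:
        return {"count": 0, "shortest": 0, "longest": 0,
                "average": 0.0, "percent_non_ascii": 0.0}
    lengths = [len(s) for s in strings]
    # Assumption: a string counts as non-ASCII if any character is above 127.
    non_ascii = sum(1 for s in strings if any(ord(c) > 127 for c in s))
    return {
        "count": len(strings),
        "shortest": min(lengths),
        "longest": max(lengths),
        "average": sum(lengths) / len(lengths),
        "percent_non_ascii": 100.0 * non_ascii / len(strings),
    }
```

For example, `string_pool_features(["a", "abcd"])` reports a shortest length of 1, a longest length of 4, and an average of 2.5.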
Active Code Obfuscation
• To counter code analysis techniques that check for the invocation of
  known vulnerable library functions within the Java library,
  malicious applets frequently use reflection.
• To detect such activity, we count the absolute number of times
  reflection is used in the bytecode to instantiate objects and to
  call functions.
• We check whether the java.io.Serializable, java.lang.Object, or
  java.lang.Class interface is used.
• We check whether the JavaScript interface is used.
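Counting reflective calls, as described above, might look like the following sketch. It assumes javap-style disassembly lines in which each invoke instruction carries a `// Method owner.name:descriptor` comment; the slides do not say which disassembler Jarhead uses, and the list of reflective methods here is an illustrative selection of real Java reflection APIs, not the authors' list.

```python
import re
from collections import Counter

# Illustrative selection of reflection entry points (not the authors' list).
REFLECTIVE_CALLS = frozenset({
    "java/lang/Class.forName",
    "java/lang/Class.newInstance",
    "java/lang/Class.getMethod",
    "java/lang/reflect/Constructor.newInstance",
    "java/lang/reflect/Method.invoke",
})

# Captures "owner/Class.method" from a javap-style invoke line.
INVOKE_TARGET = re.compile(
    r"\binvoke\w+\b.*?//\s*(?:Interface)?Method\s+([\w/$]+\.[\w$]+)"
)


def count_reflection(disassembly_lines):
    """Count how often each known reflective method is invoked."""
    counts = Counter()
    for line in disassembly_lines:
        match = INVOKE_TARGET.search(line)
        if match and match.group(1) in REFLECTIVE_CALLS:
            counts[match.group(1)] += 1
    return counts


SAMPLE = [
    "   4: invokestatic  #2  // Method java/lang/Class.forName:(Ljava/lang/String;)Ljava/lang/Class;",
    "   9: invokevirtual #3  // Method java/lang/Class.newInstance:()Ljava/lang/Object;",
    "  14: invokevirtual #4  // Method java/io/PrintStream.println:(Ljava/lang/String;)V",
    "  20: invokevirtual #5  // Method java/lang/reflect/Method.invoke:(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;",
]
```

Summing the counter's values gives the absolute number of reflective invocations, which is the quantity the slide describes.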
Behavior
Interaction with security-critical components
• Several vulnerabilities in different versions of the Sun Java plugin
  have led to exploits that bypass the sandboxing mechanisms.
  • Runtime class
  • System class
  • ClassLoader class
Download and execute
• For a successful exploit, it is necessary to execute a file after it
  has been downloaded.
  • java.net.URL objects
  • Sockets
  • Writing files
  • Spawning a new process
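The download-and-execute indicators above can be expressed as boolean features over the class and method references found in an applet. The sketch below is an assumption-laden illustration: the input is a list of `owner/Class.method` strings, substring matching is used for simplicity (so `java/net/URL` also matches `java/net/URLConnection`), and the combined `download_and_execute` flag is introduced here for illustration and is not taken from the slides.

```python
def behavior_flags(references):
    """Derive boolean behavior features from referenced classes and methods."""
    refs = list(references)

    def mentions(needle):
        return any(needle in ref for ref in refs)

    flags = {
        "uses_url": mentions("java/net/URL"),
        "uses_socket": mentions("java/net/Socket"),
        "uses_file_outputstream": mentions("java/io/FileOutputStream"),
        "calls_execute_function": (
            mentions("java/lang/Runtime.exec")
            or mentions("java/lang/ProcessBuilder.start")
        ),
    }
    # Hypothetical combined indicator: fetch something, write it, run it.
    flags["download_and_execute"] = (
        (flags["uses_url"] or flags["uses_socket"])
        and flags["uses_file_outputstream"]
        and flags["calls_execute_function"]
    )
    return flags
```

An applet that only opens a URL sets `uses_url` but leaves the combined flag unset, which matches the observation that execution must follow the download for an exploit to succeed.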
Jar Content
• The number of files in the Jar that are not Java class files (media
  files, images, ...).
• Binary machine code in the archive (executable or library).
• The total size of the Jar archive in bytes.
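The three Jar content features above could be extracted roughly as follows. This is a hedged sketch: class files are identified by their `.class` extension, native code is detected only through the Windows PE (`MZ`) and ELF magic numbers, and the function name and feature keys are hypothetical.

```python
import io
import zipfile

# Magic numbers of Windows PE and ELF binaries (other formats are not covered).
NATIVE_MAGICS = (b"MZ", b"\x7fELF")


def jar_content_features(jar_bytes):
    """Compute simple content features for a Jar archive held in memory."""
    non_class_files = 0
    has_native_code = False
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            if not info.filename.endswith(".class"):
                non_class_files += 1
            with jar.open(info) as entry:
                head = entry.read(4)
            if head.startswith(NATIVE_MAGICS):
                has_native_code = True
    return {
        "non_class_files": non_class_files,
        "has_native_code": has_native_code,
        "jar_size_bytes": len(jar_bytes),
    }
```

Note that the manifest and any other metadata count as non-class files under this definition.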
Known vulnerable functions
• MidiSystem.getSoundbank()
• javax.management.remote.rmi.RMIConnectionImpl()
• MIDlet
• The combination of MidiSystem.getSequencer and
  Sequencer.addControllerEventListener
• javax.management.MBeanServer interface
Obfuscation features
• Cyclomatic complexity
• Semantically useless code (dead variables, unused functions, ...)
• Percentage of non-ASCII strings
• Length and number of strings
• Use of reflection
• Dynamic code loading
• Invocation of the JS interpreter
Behavioral features
• Interaction with Runtime
• Interaction with System Security Manager
• Check for extensions of the ClassLoader
• Use of URLs, FileStreams, ...
• Ability to spawn a process
• SMS-send functionality
• Calls to known vulnerable functions
Top ten features
Merit   Attribute                 Type
0.398   gets_parameters           behavior
0.266   functions_per_class       obfuscation
0.271   no_of_instructions        obfuscation
0.257   gets_runtime              behavior
0.254   lines_of_disassembly      obfuscation
0.232   uses_file_outputstream    behavior
0.22    percent_unused_methods    obfuscation
0.211   longest_string_char_cnt   obfuscation
0.202   mccabe_complexity_avg     obfuscation
0.197   calls_execute_function    behavior
EVALUATION
• Manually collected (2,854 samples)
  • Applet collection sites
    • http://echoecho.com
    • http://javaboutique.internet.com (http://www.jguru.com/)
  • Malware research community site
    • http://filex.jeek.org
  • Security site
    • http://www.malwaredomainlist.com
  • Web crawl
• Wepawet (1,551 samples)
  • https://wepawet.iseclab.org/
Manually collected dataset
• Virustotal found 1,721 (82.1%) of the files to be benign and 374
  (17.9%) to be malicious.
• Virustotal actually misclassified 61 (2.9%) applets:
  • 34 (1.6%) benign applets as malicious
  • 27 (1.3%) malicious applets as benign
• Jarhead's classifier misclassified only 11 (0.5%) samples:
  • The false positive rate was 0.2% (4 applets)
  • The false negative rate was 0.3% (7 applets)
Comparison of Jarhead and Virustotal misclassifications
             Virustotal (42 AVs)   Jarhead (10x cross-val.)
False pos.   1.6%                  0.2%
False neg.   1.3%                  0.3%
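The percentages reported for the manually collected dataset can be reproduced from the counts on the slide. The sketch below assumes the denominator is the number of labelled applets (1,721 + 374 = 2,095); the helper names are hypothetical.

```python
def percent_of_total(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Manually collected dataset, as labelled on the slide.
BENIGN, MALICIOUS = 1721, 374
TOTAL = BENIGN + MALICIOUS  # 2095 labelled applets

VIRUSTOTAL_ERRORS = {"false_pos": 34, "false_neg": 27}
JARHEAD_ERRORS = {"false_pos": 4, "false_neg": 7}


def summarize(errors, total=TOTAL):
    """Turn misclassification counts into percentages of all samples."""
    rates = {name: percent_of_total(n, total) for name, n in errors.items()}
    rates["total"] = percent_of_total(sum(errors.values()), total)
    return rates
```

`summarize(VIRUSTOTAL_ERRORS)` gives 1.6, 1.3, and 2.9 in total, and `summarize(JARHEAD_ERRORS)` gives 0.2, 0.3, and 0.5 in total, matching the slide. The rates are therefore fractions of all labelled samples; a per-class false positive rate would instead divide by the 1,721 benign applets.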
Wepawet dataset
• The authors of Wepawet provided us with 1,551 Jar files.
• Virustotal found 413 (32.4%) applets to be benign and 862 (67.6%)
  applets to be malicious.
• The original classifier misclassified 86 (6.7%) samples:
  • 59 (4.6%) malicious applets as benign
  • 27 (2.1%) benign applets as malicious
• With 10-fold cross-validation, the total misclassification count
  was 21 (1.6%):
  • The false positive rate was 0.9% (12 applets)
  • The false negative rate was 0.7% (9 applets)
Jarhead's performance on the Wepawet dataset
                  Original classifier   10x cross validated
False positives   2.1%                  0.9%
False negatives   4.6%                  0.7%
POSSIBLE EVASION
• It is possible to use the Java Native Interface (JNI) to execute
  native code on the machine. This is not covered by our analysis.
• Malicious behavior can be distributed among multiple applets within
  a single page.
• A completely new class of exploits or vulnerabilities could also
  bypass our detection.
CONCLUSIONS
• We address the quickly growing problem of malicious Java applets by
  building a detection system based on static analysis and machine
  learning.
• We also deployed our system as a plugin for the Wepawet system,
  which is publicly accessible.
• In the future, we plan to improve our results by using more
  sophisticated static analysis techniques to achieve even higher
  accuracy.
Thank you...
Any questions?