Predicting zero-day software vulnerabilities through data
PREDICTING ZERO-DAY
SOFTWARE VULNERABILITIES
THROUGH DATA MINING
Su Zhang
Department of Computing and Information Science
Kansas State University
OUTLINE
Motivation.
Related work.
Proposed approach.
Possible techniques.
Plan.
THE TREND OF VULNERABILITY NUMBERS
ZERO-DAY VULNERABILITY
What is a zero-day vulnerability?
A vulnerability that attackers discover before it is publicly disclosed.
The threat from zero-day vulnerabilities is increasing.
Many attacks are attributed to zero-day vulnerabilities.
E.g. in 2010 Microsoft confirmed a vulnerability in Internet Explorer that affected some versions released in 2001.
OUR GOAL
Risk awareness. The possibility of zero-day
vulnerability must be considered for
comprehensive risk assessment for enterprise
networks.
ENTERPRISE RISK ASSESSMENT FRAMEWORK
PROBLEM
Predict information about zero-day vulnerabilities from software configurations.
RELATED WORK
O. H. Alhazmi and Y. K. Malaiya, 2005.
Andy Ozment, 2007.
Kyle Ingols et al., 2009.
Miles A. McQueen et al., 2009.
PROPOSED APPROACH
Predict the likelihood of zero-day vulnerabilities
for specific software applications.
Data source: NVD (National Vulnerability Database).
Data available since 2002.
A rich data source that includes the preconditions and consequences of vulnerabilities. It can be used both to build our model and to validate it.
SYSTEM ARCHITECTURE
Data flow, from the scanned host to the prediction output:
Target Machine (IE, WinXP, Firefox, …)
-> Scanner (e.g. Nessus or OVAL)
-> CPE (Common Platform Enumeration)
-> Our Prediction Model
-> Output (MTTNV & CVSS Metrics)
PREDICTION MODEL
Predictor (input) data: CPE (Common Platform Enumeration).
CPE names indicate the software configuration on a host.
Predicted (output) data: MTTNV (Mean Time to Next Vulnerability) and CVSS metrics.
MTTNV indicates how likely a zero-day vulnerability is: a shorter MTTNV suggests a higher likelihood.
CVSS metrics indicate the properties of the predicted vulnerabilities.
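As a rough illustration of the predicted quantity, MTTNV can be estimated from history as the mean gap between consecutive disclosure dates for one software configuration. The sketch below assumes that simple definition; the function name `mean_time_to_next_vulnerability` and the dates are made up for illustration and are not taken from NVD or from the proposed model.

```python
from datetime import date

def mean_time_to_next_vulnerability(disclosure_dates):
    """Return the mean gap, in days, between consecutive disclosure dates.

    Assumption: MTTNV is estimated as the average inter-arrival time of
    past vulnerabilities reported for one CPE name.
    """
    ordered = sorted(disclosure_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two disclosure dates")
    gaps = [(later - earlier).days
            for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical disclosure dates for a single product (illustrative only).
example_dates = [date(2009, 1, 1), date(2009, 1, 31), date(2009, 4, 1)]
mttnv_days = mean_time_to_next_vulnerability(example_dates)
print(mttnv_days)
```

A prediction model would presumably replace this historical average with a value learned from CPE features, but the unit (time between vulnerabilities) stays the same.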
CPE (COMMON PLATFORM ENUMERATION)
What is CPE?
CPE is a structured naming scheme for information
technology systems, software, and packages.
Example (in primitive format):
cpe:/a:acme:product:1.0:update2:pro:en-us
This names the English-language Professional edition of "Acme Product 1.0 Update 2".
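To show how such a name breaks down, here is a minimal sketch that splits a CPE 2.2-style URI into its seven components (part, vendor, product, version, update, edition, language). The helper `parse_cpe_uri` is a hypothetical name, and the sketch does not handle percent-encoded characters.

```python
# Component order of a CPE 2.2 URI: cpe:/part:vendor:product:version:update:edition:language
CPE_FIELDS = ("part", "vendor", "product", "version",
              "update", "edition", "language")

def parse_cpe_uri(cpe_name):
    """Split a CPE URI into named components.

    Trailing components that are missing are returned as empty strings.
    """
    prefix = "cpe:/"
    if not cpe_name.startswith(prefix):
        raise ValueError("not a CPE URI: %r" % cpe_name)
    values = cpe_name[len(prefix):].split(":")
    if len(values) > len(CPE_FIELDS):
        raise ValueError("too many components in %r" % cpe_name)
    values += [""] * (len(CPE_FIELDS) - len(values))
    return dict(zip(CPE_FIELDS, values))

parsed = parse_cpe_uri("cpe:/a:acme:product:1.0:update2:pro:en-us")
print(parsed["vendor"], parsed["version"], parsed["edition"])
```

The part field distinguishes applications (`a`), operating systems (`o`), and hardware (`h`), so the example above names an application.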
CPE LANGUAGE
CVSS (COMMON VULNERABILITY SCORING SYSTEM)
An open framework for communicating the characteristics and impacts of IT vulnerabilities.
Metric vector:
access complexity (H, M, L)
authentication (R, NR)
confidentiality (N, P, C)
...
CVSS score: calculated from the metric vector above. It indicates the severity of a vulnerability.
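For concreteness, the score calculation can be sketched with the CVSS version 2 base equations. This assumes the v2 metric set (access vector, access complexity, authentication, and confidentiality/integrity/availability impact); the slide's abbreviated labels, such as R/NR for authentication, differ from the v2 value names used here, so the mapping between them is an assumption. The function name `cvss2_base_score` is hypothetical.

```python
# Metric weights from the CVSS version 2 base equations.
ACCESS_VECTOR = {"L": 0.395, "A": 0.646, "N": 1.0}      # local, adjacent, network
ACCESS_COMPLEXITY = {"H": 0.35, "M": 0.61, "L": 0.71}   # high, medium, low
AUTHENTICATION = {"M": 0.45, "S": 0.56, "N": 0.704}     # multiple, single, none
IMPACT = {"N": 0.0, "P": 0.275, "C": 0.660}             # none, partial, complete

def cvss2_base_score(av, ac, au, c, i, a):
    """Compute the CVSS v2 base score from the six base metric values."""
    impact = 10.41 * (1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a]))
    exploitability = (20 * ACCESS_VECTOR[av]
                      * ACCESS_COMPLEXITY[ac] * AUTHENTICATION[au])
    # The score is forced to zero when the vulnerability has no impact.
    f_impact = 0.0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f_impact, 1)

# Vector AV:N/AC:L/Au:N/C:P/I:P/A:P
print(cvss2_base_score("N", "L", "N", "P", "P", "P"))
```

Temporal and environmental metrics are left out; only the base score is shown.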
CVSS USED IN RISK ASSESSMENT
We use CVSS to derive a conditional probability: how likely a vulnerability is to be successfully exploited, given that all of its preconditions are fulfilled.
Combining these conditional probabilities with an attack graph yields a cumulative probability: an overall estimate of how likely a given machine is to be compromised.
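The combination step can be sketched for the simplest case. The code assumes that exploit steps succeed independently, that every step on a path must succeed, and that alternative paths do not share steps, so they can be combined as 1 minus the probability that all paths fail. The probabilities and the function names `path_probability` and `compromise_probability` are illustrative assumptions; this is not the algorithm used by MulVAL or any other attack-graph tool.

```python
from math import prod  # available in Python 3.8+

def path_probability(step_probabilities):
    """Probability that every exploit step on one attack path succeeds."""
    return prod(step_probabilities)

def compromise_probability(paths):
    """Probability that at least one attack path succeeds.

    Assumes the paths are independent of each other (no shared steps).
    """
    all_paths_fail = prod(1 - path_probability(p) for p in paths)
    return 1 - all_paths_fail

# Two hypothetical paths to the same machine: a two-step path and a
# one-step path, each step with conditional success probability 0.5.
paths = [[0.5, 0.5], [0.5]]
likelihood = compromise_probability(paths)
print(likelihood)
```

Real attack graphs usually contain shared preconditions and cycles, which break the independence assumption and need a more careful treatment than this sketch gives.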
POSSIBLE TECHNIQUES
Linear regression (inputs are continuous variables).
Statistical classification (inputs are discrete variables).
Maximum likelihood and least squares (for determining the parameters of our model).
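As a small example of parameter fitting, the sketch below fits a one-variable linear model by ordinary least squares using the closed-form solution. Under the usual assumption of Gaussian noise, this estimate coincides with the maximum likelihood estimate. The function name `fit_line_least_squares` and the data points are made up for illustration; the actual model would use features derived from CPE and NVD.

```python
def fit_line_least_squares(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points lying exactly on y = 2x + 1 (illustrative only).
slope, intercept = fit_line_least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)
```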
VALIDATION METHODOLOGY
Earlier years of NVD data: build our model.
Later years of NVD data: validate our model.
Criterion: estimates that account for zero-day vulnerabilities should be closer to the actual values than estimates that ignore them.
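The split and the criterion can be sketched as follows. The record layout (a dict with a `year` key), the cutoff year, the choice of mean absolute error as the distance measure, and all function names are assumptions made for illustration; the slides do not specify them.

```python
def split_by_year(records, cutoff_year):
    """Records before cutoff_year build the model; the rest validate it."""
    earlier = [r for r in records if r["year"] < cutoff_year]
    later = [r for r in records if r["year"] >= cutoff_year]
    return earlier, later

def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def passes_criterion(with_zero_day, without_zero_day, actual):
    """True if the zero-day-aware estimates are closer to the actual values."""
    return (mean_absolute_error(with_zero_day, actual)
            < mean_absolute_error(without_zero_day, actual))

# Placeholder records; only the "year" field is used by the split.
records = [{"year": y} for y in (2003, 2005, 2007, 2009)]
earlier, later = split_by_year(records, cutoff_year=2007)
print(len(earlier), len(later))
```

Splitting by time rather than at random keeps later vulnerabilities out of the training data, which matches the goal of predicting vulnerabilities that are not yet known.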
PLAN
Next phase: study data-mining tools (e.g. Support Vector Machines), then build our prediction model.
Validate the model on NVD.
Final phase:
If the previous phase yields a good model, we will incorporate its results into MulVAL.
Otherwise, we will investigate why the model falls short.
REFERENCES
[1] Andrew Buttner et al., “Common Platform Enumeration (CPE) – Specification,” 2008.
[2] NVD, http://nvd.nist.gov/home.cfm.
[3] O. H. Alhazmi et al., “Modeling the Vulnerability Discovery Process,” 2005.
[4] Omar H. Alhazmi et al., “Prediction Capabilities of Vulnerability Discovery Models,” 2006.
[5] Andy Ozment, “Improving Vulnerability Discovery Models,” 2007.
[6] R. Gopalakrishna and E. H. Spafford, “A trend analysis of vulnerabilities,” 2005.
[7] Christopher M. Bishop, “Pattern Recognition and Machine Learning,” 2006.
[8] Xinming Ou et al., “MulVAL: A logic-based network security analyzer,” 2005.
[9] Kyle Ingols et al., “Modeling Modern Network Attacks and Countermeasures Using Attack Graphs,” 2009.
[10] Miles A. McQueen et al., “Empirical Estimates and Observations of 0Day Vulnerabilities,” 2009.
[11] Alex J. Smola et al., “A Tutorial on Support Vector Regression,” 1998.
THANK YOU!
Questions & Answers