Bug Localization with Association Rule Mining
Download
Report
Transcript Bug Localization with Association Rule Mining
Bug Localization with
Machine Learning Techniques
Wujie Zheng
[email protected]
Background of Bug Localization
Software is far from bug-free
Debugging is a methodical process of finding
and correcting the bugs in a program
Manual debugging is laborious and expensive
It is possible to automatic or semi-automatic this
process
Bug localization is to find a set or a ranking of
source code locations that are likely buggy
through automatic analysis
Techniques of Bug Localization
Program Slicing
Experimental Methods
Machine Learning Methods
Program Slicing
Debugging with Program Slicing
Program Dependence Graph (PDG)
Control dependence
Data dependence
Static Slice vs. Dynamic Slice
Start from the inputs or the variables that cause the failure (but not root
cause), find all statements that may cause the failure
Static slice: all statements that may affect the value of a variable at a
program point for any arbitrary execution of the program
Dynamic slice: all statements that actually affect the value of a variable at a
program point for a particular execution of the program
Forward Slice vs. Backward Slice
Forward slice: the statements that are affected by the value of variable v at
statement s
Backward slice: the statements that affect the value of variable v at
statement s
PDG
void main() {
int sum = 0;
int i = 1;
while (i<11) {
sum = add(sum,i);
i = add(i, 1);
}
printf(“sum=%d\n”, sum);
printf(“i=%d\n”, i);
}
static int add(int a, int b) {
return (a+b);
}
In the illustration above, control dependence edges are shown in blue and data
dependence edges are shown in green. Intraprocedural dependences are shown
with solid lines while interprocedural dependences are shown with dashed lines.
Experimental Methods
Delta Debugging
Generalize and simplify some failing test case to a
minimal test case that still produces the failure
Separate the test case into two sub tests, choose
the sub test that fails, and repeat the process.
When both sub tests pass, perform another
divide-and-test again
Essentially a binary search algorithm
Delta Debugging
Machine Learning Methods
Problem Setting
A set of failing and passing test cases are
observed (test cases as the samples)
The statements are instrumented and some
traces are collected (statements as the features)
Find a set or a ranking of the statements that are
likely buggy
Machine Learning Methods
Direct correlation analysis
Build a classification model of the test cases (feature
selection)
Tarantula
LIBLIT05
Logistic regression
Decision tree
Build a behavior model of the statements from
passing test cases, and then check violations in
failing test cases
Dynamic invariants
SOBER
PPDG
Tarantula
Visualization:
Logistic regression
The logistic function
Maximize the likelihood of the training set
The regularization term
The input features are normalized first
The larger the resulting coefficient is, the
more suspicious the statement is
Dynamic invariants
Given a set of test cases, mine the implicit rules
over the variables
The templates
Invariants over any variable: x=a …
Invariants over two numeric variables: y=ax+b …
Invariants over three numeric variables: y=ax+by+c …
…
SOBER
SOBER [Liu05]
the probability density function of the evaluation
bias of P on passing runs and failing runs
respectively
The bug relevance score of P is then defined as
the difference between them
Probabilistic Program Dependence Graph
(PPDG)
Conditional Independence
Dependence Network
The state of a statement is independent of the previous
statements conditioned on its immediate parents
Allow cycles, suitable for loops in programs
Usage in debugging
The distributions of conditional probabilities are learned
from the passing runs
The lower the conditional probability of a state has, the
more suspicious the statement it belongs to is
Our Work
Belongs to the group of “Build a classification model
of the test cases (feature selection)”
Association Rule Mining
The procedure
Mine all the strong (high frequency and high confidence)
association rules X=>fail
Select the rule X’=> with highest confidence (the best
classification model of this kind)
Output the statement set X’ as the results
Problems:
It is hard to justify that such rule (only conjunction phrase) based
model is suitable
The resulting rule may contain lots of statements. Further ranking
scheme is still needed
High computational overhead
Our Work
Belongs to the group of “Build a classification model
of the test cases (feature selection)”
Feature Subset Selection
The procedure
Iteratively evaluates a candidate subset of features, then modifies
the subset and evaluates if the new subset is an improvement
over the old.
Greedy forward selection
Evaluation of the subsets requires a scoring metric that grades a
subset of features.
F-measure (i.e., harmonic average of precision and recall,
1/(1/precision+1/recall))
Problems
Too simple, just a traditional method
How to improve the work?
Adopt the association rule mining method to
mine the behavior models
Association rule mining is better for mining
frequent patterns than building classification
models
Difficulties:
Usage pattern may dominate, while failure statements
related pattern may be not frequent
If we consider relative frequent pattern for every
statements, there may be too many results to be mined
and used effectively
Consider program dependence graph may help
How to improve the work?
Consider the debugging problem of specific
bugs such as data race
Data race is an important kinds of bugs, and it is
difficult to debug
Sequential pattern mining?
Combine dynamic analysis with machine learning
Existing dynamic analysis methods can provide valuable
features, e.g., the potential data race in a certain
execution (many false positives)
How to improve the work?
…
Other topics of automated debugging
Failure classification
NLP
Failure trace clustering
NLP + Failure trace clustering
Failure replaying
Record the failures in user sites compactly and
reproduce it in developer sites
Debugging with replaying
Traverse to any point of an execution without restarting the
program
Query the trace database
Thank you!