issta04 - Computer Science and Engineering
Download
Report
Transcript issta04 - Computer Science and Engineering
Static and Dynamic Analysis
of Call Chains in Java
Atanas Rountev
Scott Kagan
Michael Gibas
Ohio State University
Call Chains
Sequence of call graph edges
<e1, e4>, <e1, e4, e3>, <e2, e3>, etc
e2
X.m()
X.n()
e1
e3
X.o()
e4
Y.p()
Analysis of Call Chains
Static analysis
Input: static call graph
Output: set of static call chains
Dynamic analysis
Input: set of static call chains
Output: run-time coverage of the chains
Uses of Call Chain Information
Software Understanding
Reverse engineering of UML sequence diagrams
Code browsing
Original motivation
Sequences of calls that may lead to a method being
executed
Software Testing
Coverage of “interesting” chains
Integration testing
Object interactions
Our Work
Defined an approach for static and dynamic
analysis of call chains in Java components
Performed an experimental evaluation of the
imprecision of static call chain analysis
Static Analysis Input
Component and static call graph
Component: set of classes
Static call graph for the component
Entry methods
X.m()
c1,X
X.o()
c2,X
c4,Y
X.n()
c3,Y
Y.p()
Static Analysis Output
Data structure which represents a set
of call chains
Based on calling context tree
Ammons et al., PLDI 1997
Nodes correspond to procedures
Edges correspond to calling relationships
Each node represents a chain
Statically built
Represents a component, not a program
Calling context forest
Example
c1,X
X.o()
c4,Y
X.m()
c2,X
X.n()
c3,Y
X.n()
c3,Y
Y.p()
Y.p()
Forest Construction
Traverse call graph
Stopping mechanism
Limit depth
Filter infeasible chains
Calls through implicit formal parameter this
Dynamic Analysis
Input: calling context forest and an
executable that contains the component
Goal: traverse forest, visit nodes
Traversal can be in different states
Obvious application: chain coverage
Start: not yet entered component
Active: currently traversing forest
Limited: tree node not present
External: left component, waiting to return
Instrument component code
Instrumentation
Generate run-time events
entered(m)
exited(m)
c is call site, X is receiver class
after(c, X)
Exiting component code
before(c, X)
m is an entry method
Entering component code
c is call site, X is receiver class
Instrumentation localized
Limitations
No exceptions are thrown
Developed solution
Single threaded execution
Static and dynamic checks to ensure
constraints
Basic Example
X.m()
c1,X
X.o()
c2,X
X.n()
Counter
1
0
Previous State
Limited
State
Start
Active
Limited
External
Limited
Active
Start
Event
entered(X.m)
before(c2,X)
before(c3,Y)
before(c5,Z)
after(c5,Z)
after(c3,Y)
after(c2,X)
exited(X.m)
Our Work
Defined an approach for static and dynamic
analysis of call chains in Java components
Performed an experimental evaluation of the
imprecision of static call chain analysis
Experimental Evaluation
Chains are statically constructed
Some chains may be infeasible
What is the degree of imprecision?
What are the causes of imprecision?
Ultimate goal: better static analysis
Analysis Implementation
Soot framework (www.sable.mcgill.ca)
Input: a component and set of tests
Output: call chains coverage
Call graph construction
Fragment analysis (Rountev et al., TSE 2004)
Placeholder main method
RTA, Andersen’s style points-to analysis
Static Analysis
Infeasibility filter, depth-limited chains
Degree of Imprecision
Open source set of tests for Java standard
library classes
Formed 6 components
Determined run-time coverage of statically
computed call chains
Average 101 methods
Were uncovered chains feasible or infeasible?
Enhanced test suite for maximal coverage
Uncovered chains from enhanced test suite were
infeasible
(4
co
)
ll a
to
r(
3)
co
ll a
to
r(
4)
da
te
(3
)
da
te
(4
de
)
ci
m
al
(3
de
)
ci
m
al
bo
(4
un
)
da
ry
bo
(3
un
)
da
ry
(4
)
zi
p
(3
)
(4
)
(3
)
400
zi
p
gz
ip
gz
ip
Number of chains
Analysis Imprecision
Feasible
Andersen's
RTA
350
300
250
200
150
100
50
0
Cause of Imprecision Experiment
Approximations made by the static analysis
if (flag) x.foo();
Abstracted semantics
All branches of non-reference conditions possible
Otherwise, same as regular Java semantics
What if infeasible chains are feasible in new
semantics?
Imprecision in any static analysis with same
approximation
For all components, 94.5% of infeasible
chains were feasible in new semantics
Future Work
Exceptions
no new states, only new instrumentation
Multi-threading
Increase data sets
Study more causes of infeasibility
Speed optimized implementation