issta04 - Computer Science and Engineering

Download Report

Transcript issta04 - Computer Science and Engineering

Static and Dynamic Analysis
of Call Chains in Java
Atanas Rountev
Scott Kagan
Michael Gibas
Ohio State University
Call Chains

Sequence of call graph edges

<e1, e4>, <e1, e4, e3>, <e2, e3>, etc
e2
X.m()
X.n()
e1
e3
X.o()
e4
Y.p()
Analysis of Call Chains

Static analysis



Input: static call graph
Output: set of static call chains
Dynamic analysis


Input: set of static call chains
Output: run-time coverage of the chains
Uses of Call Chain Information

Software Understanding

Reverse engineering of UML sequence diagrams


Code browsing


Original motivation
Sequences of calls that may lead to a method being
executed
Software Testing

Coverage of “interesting” chains


Integration testing
Object interactions
Our Work

Defined an approach for static and dynamic
analysis of call chains in Java components

Performed an experimental evaluation of the
imprecision of static call chain analysis
Static Analysis Input

Component and static call graph


Component: set of classes
Static call graph for the component

Entry methods
X.m()
c1,X
X.o()
c2,X
c4,Y
X.n()
c3,Y
Y.p()
Static Analysis Output

Data structure which represents a set
of call chains






Based on calling context tree
 Ammons et al., PLDI 1997
Nodes correspond to procedures
Edges correspond to calling relationships
Each node represents a chain
Statically built
Represents a component, not a program

Calling context forest
Example
c1,X
X.o()
c4,Y
X.m()
c2,X
X.n()
c3,Y
X.n()
c3,Y
Y.p()
Y.p()
Forest Construction


Traverse call graph
Stopping mechanism


Limit depth
Filter infeasible chains

Calls through implicit formal parameter this
Dynamic Analysis


Input: calling context forest and an
executable that contains the component
Goal: traverse forest, visit nodes


Traversal can be in different states





Obvious application: chain coverage
Start: not yet entered component
Active: currently traversing forest
Limited: tree node not present
External: left component, waiting to return
Instrument component code
Instrumentation

Generate run-time events

entered(m)



exited(m)


c is call site, X is receiver class
after(c, X)


Exiting component code
before(c, X)


m is an entry method
Entering component code
c is call site, X is receiver class
Instrumentation localized
Limitations

No exceptions are thrown



Developed solution
Single threaded execution
Static and dynamic checks to ensure
constraints
Basic Example
X.m()
c1,X
X.o()
c2,X
X.n()
Counter
1
0
Previous State
Limited
State
Start
Active
Limited
External
Limited
Active
Start
Event
entered(X.m)
before(c2,X)
before(c3,Y)
before(c5,Z)
after(c5,Z)
after(c3,Y)
after(c2,X)
exited(X.m)
Our Work

Defined an approach for static and dynamic
analysis of call chains in Java components

Performed an experimental evaluation of the
imprecision of static call chain analysis
Experimental Evaluation

Chains are statically constructed




Some chains may be infeasible
What is the degree of imprecision?
What are the causes of imprecision?
Ultimate goal: better static analysis
Analysis Implementation




Soot framework (www.sable.mcgill.ca)
Input: a component and set of tests
Output: call chains coverage
Call graph construction

Fragment analysis (Rountev et al., TSE 2004)



Placeholder main method
RTA, Andersen’s style points-to analysis
Static Analysis

Infeasibility filter, depth-limited chains
Degree of Imprecision


Open source set of tests for Java standard
library classes
Formed 6 components


Determined run-time coverage of statically
computed call chains


Average 101 methods
Were uncovered chains feasible or infeasible?
Enhanced test suite for maximal coverage

Uncovered chains from enhanced test suite were
infeasible
(4
co
)
ll a
to
r(
3)
co
ll a
to
r(
4)
da
te
(3
)
da
te
(4
de
)
ci
m
al
(3
de
)
ci
m
al
bo
(4
un
)
da
ry
bo
(3
un
)
da
ry
(4
)
zi
p
(3
)
(4
)
(3
)
400
zi
p
gz
ip
gz
ip
Number of chains
Analysis Imprecision
Feasible
Andersen's
RTA
350
300
250
200
150
100
50
0
Cause of Imprecision Experiment

Approximations made by the static analysis


if (flag) x.foo();
Abstracted semantics



All branches of non-reference conditions possible
Otherwise, same as regular Java semantics
What if infeasible chains are feasible in new
semantics?


Imprecision in any static analysis with same
approximation
For all components, 94.5% of infeasible
chains were feasible in new semantics
Future Work

Exceptions





no new states, only new instrumentation
Multi-threading
Increase data sets
Study more causes of infeasibility
Speed optimized implementation