Transcript icsm04

Precise Identification of
Side-Effect-Free Methods in Java
Atanas (Nasko) Rountev
Ohio State University
Sept 12
ICSM'04
1
Side-Effect-Free Methods

Side effects of a method: changes to the
values of memory locations




In C++: global variables, local variables on the
stack, static fields, heap objects
In Java: static fields, heap objects
Side-effect-free method: does not make
changes that are observable by its
callers
This work: static analysis of Java code
for identifying side-effect-free methods
Sept 12
ICSM'04
2
Motivation


Program comprehension
Reverse engineering of UML class
diagrams: isQuery attrbutes

Code transformations

Specifications (e.g. JML)

Assertions

Simplification for other analyses
Sept 12
ICSM'04
3
Talk Outline

Static analyses for finding methods that
are free of side effects




Works on a partial program
Parameterized by class analysis
Tracks transitive modifications of static
fields and observable heap objects
Experiments


Sept 12
How many methods are reported by the
analysis as being free of side effects?
How many are really free of side effects?
ICSM'04
4
Problem Definition



Analyzed component:
set of Java classes
Some methods are
designated as boundary
methods ( )
Could an unknown caller
of a boundary method
observe side effects?


Sept 12
Static fields
Fields of observable
objects
ICSM'04
5
Example
class WordIterator { …
private CharIterator text;
public void setText(CharIterator c) { text = c; }
public int next() { return nextPos(); }
private int nextPos() { … text.nextChar(); … } }
public char m() {
CharIterator t = new CharIterator(“abc”);
return t.nextChar(); } … }
class CharIterator { …
private int curr;
public char nextChar() { … curr++; … } … }
Sept 12
ICSM'04
6
Class Analysis

Which objects does a reference variable
point to?


Set of object names: e.g.,



Similar question for reference object fields
One name per class
One name per “new” expression
Points-to pairs


Sept 12
(x,o) : variable x points to object name o
(o1.f,o2) : field f of o1 points to o2
ICSM'04
7
Class Analysis

Flow sensitivity: track statement order

Context sensitivity: track calling context



We consider context-sensitive, flowinsensitive class analyses


Set of context abstractions
(xc,o) : variable x points to object name o
when the calling context is c
Most class analyses are in this category
Typically, whole-program analyses
Sept 12
ICSM'04
8
Artificial Main
main



Sept 12
Goal: simulate everything
that unknown callers can
do with respect to
object references
Run a whole-program
class analysis on main
and the component
Artificial variables and
statements
ICSM'04
9
Some Points-to Pairs
wi
obj1 [WordIter]
setText.thisobj1
text
ci
obj2 [CharIter]
setText.cobj1
t = new CharIter
t.nextChar()
m.tobj1
obj3 [CharIter]
nextChar.thisobj3
Sept 12
ICSM'04
10
Direct Side Effects

Observable objects: reachable from the
artificial variables in the points-to graph


In the example: obj1 and obj2
Direct side effects of a method



Sept 12
x.f = … , where x refers to an observable
object
new C, when representing an observable
object
C.f = … , where f is a static field
ICSM'04
11
Example

setText: this.text = c
setText.thisobj1

obj1
nextChar: this.curr++
nextChar.thisobj2
obj2
nextChar.thisobj3
obj3

Direct(setText,obj1) = { obj1.text }

Direct(nextChar,obj2) = { obj2.curr }
Sept 12
ICSM'04
12
Transitive Side Effects

Transitive side effects of a method


Under context c1, calls another method with
some calling context c2
For c2, the callee has side effects
private int nextPos() { … text.nextChar(); … } }
nextPos.thisobj1
obj1
text
Direct(nextChar,obj2) = { obj2.curr } obj2
Trans(nextPos,obj1) = { obj2.curr }
Sept 12
ICSM'04
13
Boundary Methods
class WordIterator { …
public void setText(CharIterator c) { … }
public int next() { … }
public char m() { … } // free of side effects
class CharIterator { …
public char nextChar() { … } … }
Mod(setText,obj1) = { obj1.text }
Mod(next,obj1) = { obj2.curr }
Mod(nextChar,obj2) = { obj2.curr }
Sept 12
ICSM'04
14
Experimental Study

Small–scale study of:



two side-effect analyses based on two class
analyses
feasibility of the analysis solution
Class analyses


Sept 12
Context-sensitive, flow-insensitive analysis
similar to the one from the example (CSA)
Rapid Type Analysis (RTA): low end of the
theoretical precision spectrum
ICSM'04
15
Experimental Study

Seven components from the standard
Java libraries



Classes that implement some functionality,
plus all their transitive server classes
Boundary methods: provide access to the
functionality
Implementation based on the Soot
framework (McGill) and the BANE
constraint solver (UC Berkeley)
Sept 12
ICSM'04
16
Results from CSA
Boundary methods
Side-effect-free methods
40
Number of methods
35
30
25
20
15
10
5
Sept 12
ICSM'04
da
ry
bo
un
be
r
nu
m
da
te
co
lla
to
r
ch
ec
ke
d
p
zi
gz
ip
0
17
Summary of Results


CSA-based analysis: a quarter of the
boundary methods are side-effect-free
Perfect precision: we proved that all
other methods did have real side effects



Manual examination of the code
The analysis did not miss any side-effectfree methods
Surprising result: RTA-based analysis
achieves almost the same precision
Sept 12
ICSM'04
18
Conclusions


It is possible to adapt many wholeprogram class analyses to find sideeffect-free methods in partial programs
Preliminary experimental results



Significant number of side-effect-free
methods
Analyses achieve very good precision
Future work: more comprehensive study
Sept 12
ICSM'04
19