자바 언어를 위한 정적 분석 (Static Analysis for Java)

Download Report

Transcript 자바 언어를 위한 정적 분석 (Static Analysis for Java)

자바 언어를 위한 정적 분석 틀
(A Framework for SBA for Java)
KAIST 프로그램 분석시스템 연구단 세미나
1999. 11. 1
창병모
숙명여대 전산학과
http://cs.sookmyung.ac.kr/~chang
1
목차
• Java Overview
• Why static analyses ?
• Static analyses
– Higher-order CFA and Class analysis
– Exception analysis
• Speed-up SBA by partitioning
• Conclusions
2
Java overview
• Object oriented
– classes (single inheritance)
– object types (interfaces)
– objects
• C-like syntax
• Static typing
– type soundness by Eisenbach in ECOOP’97
– by Nipkow in ACM POPL’98
3
Java overview
• Dynamic binding
– inheritance and overriding
– class analysis (object type inference)
• Exception handling
– uncaught exception analysis
• Automatic memory management
– escape analysis
• Concurrency
4
Why Static Analyses for Java
• Class analysis
– for call graph and fast method dispatch in compiler
• Exception analysis
– for verification and programming environments
• Escape analysis
– for efficient memory management and
synchronization optimization
5
Higher-order CFA and
Class Analyses
6
Higher-order Flow Analysis
• Call graphs
– Many program analyses rely on a call-graph.
– There is an edge (f,g) if function f calls function g.
– Call graphs are easy to compute in FORTRAN.
• Not so easy in higher-order languages
– functional (ML)
– object-oriented (Java, C++)
– pointer-based (C)
7
Higher-order Flow Analysis
• In a functional language
e1 e2
closure analysis [Shivers88]
• In an object oriented language
e.m()
class analysis
• In a pointer language
(*p)()
pointer analysis
• In each case it is unclear which function is called.
8
Class Analyses for Java
• The goal is
– to approximate the classes of the objects, to which
an expression refer
• Also gives an approximation of the call graph
9
Class Analyses for Java
• A set variable Ce for each expression e.
– class names of the objects, to which the
expression e refers
• Set up set-constraints of the form :
Ce  se
• Analysis assigns possible classes of e to Ce.
– Solution of the constraints yield the information
10
Sample Constraints for Class Analyses
Suppose e is each of the following expressions
• new C
|
Ce  {C}
• if e0 then e1 else e2
|
Ce  Ce1  Ce2
• id = e1
|
Cid  Ce1
• Method application e0.m(e1)
for each class C in [e0] with a method m(x1) = return em
C  Ce0  (Cx1  Ce1)
Ce  Cem
11
History in Class Analyses
• Constraint resolution in O(n3 ) time.
• This analysis was discovered
– by Palsberg and Schwartzbach in 1991.
– closely related to closure analysis for functional
programs (Jones,Shivers).
• Fast interprocedural class analysis
– by node(set variable) merging
– Grove and Chambers in ACM POPL’98
12
Objects and Methods in Java
class C {
int n;
void incr() { n++; }
void decr() { n--; }
}
method table
object

void C::incr(this)
{ this.n++; g}
incr
Value of n
decr
void C::decr(this)
{ this.n--; }
• Method invocation:
[[e.m(arg)]] =
object * o = [[e]]; lookup(ovtable, m)(o, [[arg]])
13
Objects and Methods in Java
• Layout of method tables attached to objects
– based on inheritance hierarchies
• Transformation of method invocations into
method lookups + calls.
• We can generate a direct call c.m using analysis
– if that set is a singleton {c} or
– if all elements in that set have the same
implementation of method m
14
Uncaught Exception Analysis
15
Exceptions in Java
• Every exception is declared as
– a subclass of “Exception” class
• Throw exceptions
throw e
• Exception handling
try { … } catch (E x) { … }
• Specify uncaught exceptions in method definition
m(...) throws … { …}
16
Uncaught Exception Analysis in JDK
• Intraprocedural analysis
– Based on programmer’s specifications.
• Not elaborate enough to
– suggest for specialized handling nor
– remove unnecessary handlers
17
Uncaught Exception Analysis
• We need an interprocedural analysis to
– estimate Java program's exception flows
– independently of the programmer's specs.
• Approximate all possible uncaught exceptions
– for every expression and every method
• Exception analysis after class analysis
18
Deriving Set Constraints
• A set variable Pe for every expression e
– class names of uncaught exceptions from e.
• Deriving set constraints of the form :
Pe  se
• Analysis assigns classes of possible
uncaught exceptions of e to Pe
19
Deriving Set Constraints
Suppose e is each of the following expressions
• id = e1
|
Pe  Pe1
• if e0 then e1 else e2
|
Pe  Pe0  Pe1  Pe2
• throw e1
|
Pe  Ce1  Pe1
• try e0 catch (c1 x1 ) e1
|
Pe  (Pe0 - {c1}*)  Pe1
• Method invocation e0.m(e1)
- Pe  Pe0  Pe1
- for each class C in Ce0 with a method m(x1) = em
C  Ce0  Pe  Pem
20
Speed up SBA by partitioning
21
Motivation
• SBA usually derives a set-constraint for every
expression
• SBA tends to be not practical for large programs
– There are too many set variables and set-constraints
for large programs.
22
Main idea
• Reduce the number of set variables and so set-
constaints
• How to reduce ?
– The basic idea is to partition a collection of set variables.
– Each set variable in set constraints is replaced by the set
variable for the block, to which it belongs
• Any disjoint partition of set variables gives a sound
approximation to the original set constraints.
23
Partitioning
• Partition of set-variables
– Let  be a set of set-variables.
– Let  / = {B1, B2, ..., Bn } be a partition of 
– Let [Xe] be the set-variable for the block, to which a
set variable Xe belongs.
• Transform a set-constraint C into C'
– Replace each set variable Xe in C by [Xe]
– Xe  se is transformed into [Xe]  se/ where se/ is
obtained by replacing each set variable Xe by [Xe]
24
Theorem on Partitioning
• Soundness theorem
– sba(C') is a sound approximation of sba(C) in that
lm(C')([Xe])  lm(C)(Xe) for every expression e.
• Proof.
– Galois insertion : (  L)  (/  L)
– set-constraints using monotone operators
25
Questions on Partitioning
• Questions ?
– If I is a model for C, then is I/ also model for C’ ?
• This is not always valid !!
• What partitions is this valid for ?
– What partitions don’t lose precision ?
26
Partition times
• Definition time
– Define a specific analysis for a language and derive
set-constraints for that.
• Set-up time
– Set up set-constrains for input programs
• Analysis time
– Find a least solution for the set-constraints
27
Set-up time Partitioning
• Partitioning at Set-up time
– After setting up set-constraints and before evaluation
– Merge set-variables based on some relation
• Partitioning at dataflow analysis [Soffa94]
– Idempotence
X=Y
– Common subexpression
X=
Y=
28
Analysis-time Partitioning
• Partitioning at Analysis time
– Merging set-variables during analysis
• Example
– Fast interprocedural class analysis [Chambers98]
– Merging set-variables which are evaluated often
during analysis
29
Definition-time Partitioning
• Partitioning at Definition time
– Partitioning at design stage of analysis
– Starting from the expression-level set constraints,
define one set variable for a block of set-variables
– We can mechanically derive set-constraints from the
set-constraints.
• Example
– Method-level analysis
– Expression-kind-level analysis
30
Method-level Exception Analysis
• Cost-Effective ?
– Too Many Set Variables for large programs
• Observations
– exceptions are sparse objects
– exceptions are usually explicit
– methods are usually explicit
31
Set Variables for Method-level Analysis
• Pf for each method f
– class names of uncaught exceptions during the call to f
• Pg for try expressions eg in
try eg catch (c1 x1) e1
• Assume that Ce represents
– classes that are ``available'' at an expression e
32
Method-level Set Constraints
Suppose each expression is in a method f
• id = e1
|
Pf  Pf
• if e0 then e1 else e2
|
Pf  Pf  Pf  Pf
• throw e1
|
Pf  Ce1  Pf
• try e0 catch (c1 x1 ) e1
|
Pf  (Pg - {c1}*)  Pf
• Method invocation e0.m(e1)
- Pf  Pf  Pf
- for each class C in [e0] with a method m(x1) = em
C  Ce0  Pf  Pc.m
33
Method-level Set Constraints
Suppose each expression is in a method f
• id = e1
|
• if e0 then e1 else e2
|
• throw e1
|
Pf  Ce1  ExnClasses
• try eg catch (c1 x1) e1
|
Pf  Pg - {c1}*
• Method invocation e0.m(e1)
- for each class C in [e0] with a method m(x1) = em
C  Ce0  Pf  Pc.m
34
Expression-kind-level Analysis
• Pf,k for each expression-kind k in a method f
– class names of uncaught exceptions during the call to f
• Pg,k for each expression-kind k in a try expression
eg
try eg catch (c1 x1) e1
• Assume that Ce represents
– classes that are ``available'' at an expression e
35
Expression-kind-level Set Constraints
Suppose each expression is in a method f
• id = e1
| Pf,ass  Pf,k1 where k1 = kind(e1)
• if e0 then e1 else e2
| Pf,if  Pf,k1  Pf,k2  Pf,k3
• throw e1
| Pf,throw  Ce1  Pf,k1
• try eg catch (c1 x1 ) e1
| Pf,try  (Pg,k0 - {c1}*)  Pf,k1
• Method invocation e0.m(e1)
- Pf,call  Pf,k0  Pf,k1
- for each class C in Ce0 with a method m(x1) = em
C  Ce0  Pf,call  Pm,km
36
Exception Analyses for Java
• Exception analysis for Java
– by Yi and Chang in ECOOP’99 Workshop
– Expression-level and Method-level
• We are currently devising
– a general framework for method-level analysis
• Jex
– A tool for a view of the exception flow
– by Robillard and Murphy in 1999
37
Applications of Exception Analysis
• A kind of program verification
– Provide programmers information on all possible
uncaught exceptions
• Can be incorporated in Java programming
environment
38
Escape Analysis
39
Escape Analysis
• Escape analysis is basically
– lifetime analysis of objects
• An object escapes a method if it is
– passed as a parameter
– returned
• Basic idea of applications:
– Basically all objects are allocated in a heap.
– If an object doesn’t escape a method(or region), it
can be allocated on stack
40
Escape Graph for Escape Analysis
• inside node
– object created inside the currently analyzed region
and accessed via inside edges.
• outside node
– object created outside the currently analyzed region
or accessed via outside edges.
• inside edge
– references created inside the current region
• outside edge
– references created outside the current region
41
Example
Class complex {
double x, y;
complex (double a, double b) { x = a; y = b;}
complex multiply(complex a) {
complex product = new complex(x*a.x - y*a.y, x*a.y+y*a.x);
return product;
}
complex add(complex a) {
complex sum = new complex(x+a.x,y+a.y);
return sum;
}
complex multiplyAdd(complex a, complex b) {
complex product = a.multiply(b);
complex sum = this.add(product);
return sum;
}
}
42
Example
Analysis Result for mutiplyAdd
a
b
this
product
sum
Inside edge
Inside node
Return value
Outside edge
Outside node
43
Escape Analysis
• Intraprocedural analysis
– Construction of escape graph following the
control-flow
• Interprocedural analysis
– For every method invocation cite, mapping
between caller and callee.
– To simulate parameter passing and returning
44
Escape Analysis
• OOPSLA’99
– Compositional Pointer and Escape Analysis for
Java Programs by J. Whaley and M. Rinard
– Escape analysis for object-oriented languages:
application to Java by B Blanchet
– Escape analysis for Java by J.D. Choi et al.
45
Conclusions
• We surveyed major analyses for Java
• A new framework for SBA for speed-up
• Further research topics
– Refining the framework and proof, ...
– static analysis in connection with verification
– analyses of Java bytecode
46