Modeling The Heap

Download Report

Transcript Modeling The Heap

Mark Marron1, Deepak Kapur2,
Manuel Hermenegildo1
1Imdea-Software
(Spain)
2University of New Mexico
1


Want to identify regions (sets of objects) that
are conceptually related
Conceptually related
• Same recursive data structure
• Stored in equivalent locations (e.g., same array)


Extract information via static analysis
Apply memory optimizations on regions
instead of over entire heap
• Region Allocation/Collection
• Region/Parallel GC
• Optimized Layout
2

Must be Dynamic
• Variable based partitions too coarse, do not
represent composition well.
• Allocation site based too imprecise, can cause
spurious grouping of objects.

Must be Repartitionable
• Want to track program splitting and merging
regions: list append, subset operations.
3

Base on storage shape graph
• Nodes represent sets of objects (or recursive data
structures), edges represent sets of pointers
• Has natural representation heap regions and
relations between them
• Efficient

Annotate nodes and edges with additional
instrumentation properties
• For region identification only need type information
4

Recursive Structures
• Group objects representing same recursive
structure, keep distinct from other recursive
structures

References
• Group objects stored in similar sets of locations
together (objects in A, in B, both A and B)

Composite Structures
• Group objects in each subcomponent, group similar
components hierarchically
5

The general approach taken to Identifying
Recursive Data Structures is well known
• Look at type information to determine which
objects may be part of a recursive structure
• Based on connectivity group these recursive objects
together

Two subtle distinctions made in this work
• Only group objects in complete recursive structure
• Ignore back pointers in computing complete
recursive structures
6
class Enode
{
Enode[] fromN;
…
}
7


The grouping of objects that are in the same
container or related composite structures is
more difficult
Given regions R, R’ when do they represent
conceptually equivalent sets of objects
• Stored in the same types of locations (variables,
collections, referred to by same object fields)
• Have same type of recursive signature (can split leaf
contents of recursive structures from internal
recursive component)
8
9


N-Body Simulation in 3-dimensions
Uses Fast Multi-Pole method with space
decomposition tree
• For nearby bodies use naive n2 algorithm
• For distant bodies compute center of mass of many
bodies and treat as single point mass
10
11
for(…) {
root = null;
makeTree();
Iterator<Body> bm = this.bodyTabRev.iterator();
while(bm.hasNext())
bm.next().hackGravity(root);
Iterator<Body> bp = this.bodyTabRev.iterator();
while(bm.hasNext())
bm.next().propUpdatedAccel();
}
12

Statically collect, space decomposition tree and all
MathVector/double[] objects (11% of GC work).
13

GC objects reachable from the acc/vel fields in
parallel with the hackGravity method (no overhead).
14

Inline Double[] into MathVector objects, 23% serial
speedup 37% memory use reduction.
15
Benchmark LOC
Analysis
Time
Analysis
Memory
Region
Ok
tsp
910
0.03s
<30 MB
Y
em3d
1103
0.09s
<30 MB
Y
voronoi
1324
0.50s
<30 MB
Y
bh
2304
0.72s
<30 MB
Y
db
1985
0.68s
<30 MB
Y
raytrace
5809
15.5s
38 MB
Y
exp
3567
152.3s
48 MB
Y
debug
15293
114.8s
122 MB
Y
16


Simple interpreter and debug environment for
large subset of Java language
14,000+ Loc (in normalized form), 90 Classes
• Additional 1500 Loc for specialized standard library
handling stubs.


Large recursive call structures, large
inheritance trees with numerous virtual
method implementations
Wide range of data structure types, extensive
use of java.util collections, heap contains
both shared and unshared structures.
17




Region Information provides excellent basis
for driving many memory optimizations and
supporting other analysis work
A simple set of heuristics (when taking into
account a few subtleties) is sufficient for
grouping memory objects
Recent work shows excellent scalability on
non-trivial programs
Further work on developing robust
infrastructure for further evaluation and
applications
18
19

Many programs (particularly OO programs)
use back pointers to parent objects
• Makes type recursive even though structure is finite
• Can lead to grouping many structures that are
conceptually distinct in the program
• Simply ignore them based on depth from roots

Similarly want to wait until structures are
finished before merging them
• Preserves analysis precision during construction of
recursive structures
• Prevents grouping of objects that have recursive
types but are used in finite heap structures
20

Using heuristics based on declared type
information and connectivity group
• Objects that make up recursive data structures
• Objects that are stored in the same sets of
containers (arrays, java.util collections)
• Objects that are in the same kind of composite
structure

Use incremental approach to identify these
structures (for efficiency in dataflow analysis)
21
exp
Heap
vars
22
exp
{+, -, *, /}
{Const}
{Var}
{Var[]}
vars
23
24
25
26
27
28

We have the core of a practical heap analysis
technique
• Performance:
 Analyze moderate size non-trivial Java programs
 Runtime on the order of 10s of seconds
 Recent work should improve scalability
• Accuracy:
 Precisely represent connectivity, sharing, shape
properties + region and dependence information
• Qualitatively Useful
 Used results in multiple optimization domains
 Want to apply tool to other problems, work on
improvements in frontend, IR and exporting results
29
Demo of the shape analysis and
benchmark code available at:
www.cs.unm.edu/~marron/software.html
exp
{+, -, *, /}
Heap
vars
31
exp
{+, -, *, /}
{Const}
Heap
{Var}
vars
32