powerpoint slides

Download Report

Transcript powerpoint slides

Data Flow Analysis for Software
Prefetching Linked Data
Structures in Java
Brendon Cahoon
Kathryn S. McKinley
Dept. of Computer Science
University of Massachusetts
Amherst, MA
Dept. of Computer Sciences
University of Texas at Austin
Austin, TX
Motivation
• Object-oriented languages are mainstream
• Key performance issues
– Same old: processor-memory gap, parallelism
 Combination of modern processors and languages
results in poor memory performance
RSIM Performance
Compiled Java (Vortex) – no GC
80
60
40
20
Busy
Data Memory Stalls
er
w
Po
3d
Em
ro
no
i
Ts
p
Vo
B
is
o
rt
H
B
ad
d
Tr
ee
Pe
rim
et
er
st
M
ea
lth
0
H
% Execution Time
100
Prefetching Arrays vs. Objects
• Most prior work concentrates on arrays
– Compilers directly prefetch any element
– Loop transformations enable effective scheduling
– Successful results using both hardware and software
• Cannot use same techniques on linked data
structures
– Objects are small and disjoint
– Access patterns are less regular and predictable
– Only know the address of directly connected objects
Software Data Prefetching for Java
Hide memory latency from linked structure traversals
• Introduced by Luk and Mowry for C programs:
– We add data flow and interprocedural analysis
– Identify pointer structures from declaration
– Find pointer chasing in loops and self recursive calls
• Challenges introduced by Java
– Dynamically allocated objects make analysis difficult
– Small methods obscure context
Outline
• Data flow analysis for identifying linked structures
– New intra and interprocedural analysis
• Greedy prefetching
• Jump-pointer prefetching
• Experimental results
Identifying Linked Structure Traversals
Loop
while (o != null) {
t = o;
…
o = t.next;
}
Recursion
method visit() {
….
if (this.next != null)
visit(this.next);
}
We define a data flow solution:
– Intraprocedural for loops
– Interprocedural for recursion
Benefits:
– Independent of program representation
– Many compilers use data flow frameworks
– May be composed with other analyses
Data Flow Analysis
• Data flow information
– Sets of tuples: <variable, field name, statement, status>
• Status values: not recurrent, possibly, recurrent
Not recurrent : initial value
Possibly : first use of a field reference
Recurrent : an object accessed in linked structure traversal
• Intraproceedural: forward, flow-sensitive, may analysis
• Interprocedural: bidirectional, context-sensitive
Analysis Examples
while (o != null) {
s1: t = o.next;
s2: o = t;
}
st Iteration
12nd
Iteration
s1: o is possibly,
not recurrent, set
set tt to
to recurrent
possibly
s2: t is recurrent,
possibly,
set o to possibly
recurrent
while (o != null) {
s1: o = o.next;
s2: o = bar();
}
1st Iteration
s1: set o to possibly
s2: set o to not recurrent
s1: o = o.next;
s2: o = o.next;
s1: set o to possibly
s2: set o to possibly
Analysis Extensions for Common Idioms
• Track objects in fields or arrays
– Class based field assignments
– Arrays are monolithic
while (e.f != null) {
o = e.f;
e.f = o.next;
o.compute();
}
while (e.hasMoreElements()) {
o = (ObjType)e.nextElement();
o.compute();
}
• Indirect recurrent objects
– Unique objects referenced by linked structures
Greedy Prefetching
• Prefetch directly connected objects
• Algorithm consists of two steps:
– Detect accesses to linked structures
– Schedule prefetches
• When object is not null
• Completely hiding latency is difficult
Greedy Prefetching Example
Doubly linked list
int sum (Dlist l) {
int s = 0;
while (l != null) {
s =+ l.data;
l = l.next;
}
return s;
}
Greedy Prefetching Example
Doubly linked list
Greedy prefetching
int sum (Dlist l) {
int s = 0;
while (l != null) {
prefetch(l.next);
s += l.data;
l = l.next;
}
return s;
}
Jump-Pointer Prefetching
•
Prefetch indirectly connected objects
– Tolerates more latency than greedy prefetching
•
Algorithm contains three steps:
– Find linked data structure traversal and creation sites
– Create jump-pointers
•
When creating or traversing the linked structure
– Schedule prefetches
•
Prefetch special jump-pointer field
Inserting Jump-Pointers at Creation Time
Void add(ObjType o) {
ListNode n = new ListNode(o);
jumpObj = jumpQueue[i];
jumpObj.jmp = n;
jumpQueue[i++%size] = n;
if (head == null) {
head = n;
} else {
tail.next = n;
}
tail = n;
}
1
2
3
jumpObj
4
5
n
Jump-Pointer Prefetching Example
Doubly linked list
Jump-pointer prefetching
int sum (Dlist l) {
int s = 0;
while (l != null) {
prefetch(l.jmp);
s += l.data;
l = l.next;
}
return s;
}
Experimental Results
• Object-oriented Olden benchmarks in Java
• Simulation using RSIM
– Out-of-order, superscalar processor
• Compile programs using Vortex
– Translate Java programs to Sparc assembly
– Contains object-oriented, traditional optimizations
– Linked structure analysis, greedy and jump-pointer
prefetching
Prefetching Performance
health
mst
perimtr treeadd
bh
bisort
tsp
N G J
N G J
voronoi
em3d power
Normalized execution time (%)
100
80
60
40
20
0
N G J
N G J
N G J
N G J
Busy
N G J
Data Memory Stalls
N G J
N G J
N G J
Prefetch Effectiveness
health
mst
perimtr treeadd
bh
bisort
tsp
G J
G J
G J
voronoi em3d power
100
Percentage of Prefetches
80
60
40
20
0
G J
G J
G J
G J
Useful
Late
Early
G J
Unnec.
G J
G J
Static Prefetch Statistics
Program
Interprocedural
Mono
Poly
Health
8
Mst
3
Perimeter
9
Treeadd
2
BH
IntraProcedural
16
Fields
1
5
8
8
10
Bisort
4
Tsp
6
14
14
1
Voronoi
Em3d
Power
4
20
4
Contributions and Future Work
• New interprocedural data flow analysis for Java
• Evaluation of prefetching on Java programs
Prefetching hides latency, but
Room for improvement
Other uses for analysis (work in progress)
– Garbage collection: prefetching, object traversal
– Prefetching arrays of objects