Design and Implementation of
the Joeq Virtual Machine
John Whaley
Stanford University
Sun Microsystems Labs
Mountain View, CA
August 26, 2003
About me
• Worked on Java VMs since JDK 1.0
– 1996: Extended AWT to support pen input
– 1997: Clean-room Java VM written in C++
– 1998: Jalapeno: designed opt compiler, …
– 1999: MIT Flex: dataflow framework, etc.
– 2000: IBM Tokyo JIT: x86 performance
– 2001: joeq virtual machine
Key Features
• Implemented in 100% Java
– Includes native methods to manipulate addresses,
memory, registers directly.
• Native vs. hosted execution
– Native: run directly on hardware
– Hosted: run on top of another VM
• Bootstrap to native via reflection
• Supports both GC and explicit deallocation
Key Features
• Compiler and program analysis framework
• Multiple languages: Java, C, C++, …
– Single intermediate representation
• Static, quasi-static, and dynamic compilation
– Single unified compiler infrastructure
• Online and offline profiling system
• M:N thread scheduler
Motivation/Purpose
• Started Ph.D. studies, needed a research
infrastructure
• Purpose:
– Try out new ideas
– Do research
– Publish papers
• Not out to:
– Compete with other VMs
– Make a shippable product
– Change the world
Other Options
• SUIF
– Written in C++
– Limited support for Java
– No dynamic compilation or runtime system
– EDG frontend: not 100% gcc compatible
• Jalapeno
– Written in Java
– Very familiar with the system
– Supports Java only
– Not available outside of IBM
Other Options
• MIT Flex compiler
– Written in Java
– Familiar with system
– Open-source GPL
– Statically-compiled Java only
• Kaffe, etc.
– Written in C
– Poor design, poor performance
Why Another VM?
• General problem with established
projects:
– Established users and code base made it
difficult to make major changes.
– Wanted to fix the design "mistakes" of
Jalapeno and MIT Flex compiler
– More productive in Java than in C++
Design Goals
• Ease of trying out new research ideas
– Implemented in Java
– Modularity.
– Lots of reusable code, use of software
patterns.
• Support Java and C/C++
– A single intermediate representation
– Support GC and explicit deallocation
Design Goals
• Support static, quasi-static, dynamic
compilation.
– Unified compiler framework.
– Compiler implemented in Java.
– Allow "maybe" responses due to
incomplete information.
– General code patching mechanism.
– Profile framework allows online/offline
profiling.
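The "maybe" responses above can be illustrated with a three-valued answer type. The following is a minimal sketch, not Joeq's actual API: the names Answer, TypeOracle, load, and isSubtype are hypothetical. It assumes a compiler query that has to give up when it reaches a class that has not been loaded yet.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical three-valued answer; the name is illustrative, not Joeq's. */
enum Answer { YES, NO, MAYBE }

/** Toy registry mapping each loaded class name to its superclass name (null for the root). */
class TypeOracle {
    private final Map<String, String> superOf = new HashMap<>();

    void load(String name, String superName) {
        superOf.put(name, superName);
    }

    /** Is sub a subtype of sup? MAYBE when the walk reaches a class that is not loaded. */
    Answer isSubtype(String sub, String sup) {
        String cur = sub;
        while (cur != null) {
            if (cur.equals(sup)) return Answer.YES;
            if (!superOf.containsKey(cur)) return Answer.MAYBE; // hierarchy unknown past here
            cur = superOf.get(cur);
        }
        return Answer.NO; // walked the whole loaded chain without finding sup
    }
}
```

A static compiler can treat MAYBE conservatively, while a dynamic compiler could emit a guard and patch the code once the missing class is loaded.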
Design Goals
• Get something up and running quickly.
– Make compiler, runtime easy to debug
– Hijack class libraries from running VM
– LGPL: can borrow code from other open-source projects
– Goal: Self-bootstrapping after one month
• Make it available for others to use.
– Documentation, etc.
Not Design Goals
• Performance leader
– An endless pit, takes a lot of effort
– Performance just needs to be “reasonable”
– Should be designed for good performance
if someone wanted to put in the effort
• 100% conformance to specification
– If programs work, that’s good enough.
– No access to good test suites, anyway.
System Overview
[Figure: block diagram of the Joeq system. Component groups: FRONT-END, COMPILER, BACK-END, INTERPRETER, DYNAMIC, MEMORY MANAGER, RUN-TIME SUPPORT.]
• Boxes labeled in the diagram:
– ELF object file; ELF binary loader; Disassemble to Quad
– SUIF file; SUIF file loader; SUIF to Quad
– Java class file; Java class file loader; Bytecode decoder; Bytecode to Quad
– Bytecode IR; Quad IR; Optimizations and analyses
– Quad backend; Bytecode backend; Bytecode/Quad interpreters
– Controller; Profiler; Profile data file
– Memory heaps; Garbage collector
– Class/member metadata; Compiled code plus metadata
– Executable code in memory; Object file data section; ELF file code section; COFF file code section
– System interface; External libraries
– Introspection, verification, type checking
– Thread scheduler, synchronization, stack walker
Consequences of 100% Java
• Implementation purity
– Self-applicable
– VM code is great for program analysis, makes a
great test suite
• Portability
– >95% of the code is system-independent
– Hosted execution
• Easier software engineering
– Exceptions, GC, software patterns, existing tools
Consequences of 100% Java
• Java is not a panacea for portability
– Hosted execution works OK on most VMs
– Native bootstrapping is horribly VM-dependent
• Internal class library changes cause Joeq to
break
– Supporting multiple JDK versions is difficult
Bootstrapping technique
• Use reflection and code analysis to determine
root set of methods and objects
• Dump the objects and code into an object file
(COFF or ELF format)
• Use a standard linker to generate an
executable
• Easy support for static and quasi-static
compilation, cross-language calls, dynamic
linking, etc.
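The first step above, discovering the set of objects to dump, can be sketched with plain Java reflection. This is a minimal illustration and not Joeq's image writer: ImageWalker, reachable, and Node are hypothetical names. It only collects reachable objects and does not analyze code or write an object file.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

class ImageWalker {
    /** Returns every object reachable from root through instance fields and array elements. */
    static Set<Object> reachable(Object root) {
        // Identity-based set: two equal() objects are still two objects in the image.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> work = new ArrayDeque<>();
        visit(root, seen, work);
        while (!work.isEmpty()) {
            Object o = work.pop();
            Class<?> c = o.getClass();
            if (c.isArray()) {
                if (c.getComponentType().isPrimitive()) continue; // no references inside
                for (int i = 0; i < Array.getLength(o); i++) visit(Array.get(o, i), seen, work);
                continue;
            }
            for (; c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.getType().isPrimitive()) continue;
                    try {
                        f.setAccessible(true);
                        visit(f.get(o), seen, work);
                    } catch (RuntimeException | IllegalAccessException e) {
                        // Field cannot be read reflectively; a real image writer
                        // would need a special case for such objects.
                    }
                }
            }
        }
        return seen;
    }

    private static void visit(Object v, Set<Object> seen, Deque<Object> work) {
        if (v != null && seen.add(v)) work.push(v);
    }
}

/** Small test type with a reference field and an array field. */
class Node {
    Node next;
    Object[] payload;
}
```

The identity set lets the walk terminate on cyclic structures. The catch block marks the place where objects that resist reflection would have to be handled by hand.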
Bootstrapping trickiness
• Custom class loaders
– Have to hijack class loader and wrap it
• Files, etc. must be reinitialized
– Some state stored in native code
• Objects created during image write
– Finalizer threads, reflection caches, character
encodings, …
• Reflection doesn’t work on all objects
– Throwable backtrace, ThreadLocal, etc.
Consequences of bootstrapping
technique
• Standard file formats very useful
– Use existing tools and debuggers
• Big startup time improvement on
applications (30x)
– Skips all of the initialization code, JIT
startup costs
• Large object files, number of relocations
cause problems with some tools.
Consequences of bootstrapping
technique
• Automatic discovery of necessary code:
time-consuming, too conservative.
• Hardwired class list: smaller and faster,
but breaks often.
• Problem: Instantiating an object means
class is initialized, which brings in class
initializer and many more objects
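The class-initialization problem above follows from standard Java semantics: creating one instance runs that class's static initializer, which can initialize further classes and allocate further objects. A small illustration, with hypothetical class names (InitLog, Registry, Widget) that are not part of Joeq:

```java
import java.util.ArrayList;
import java.util.List;

/** Records the order in which static initializers run. */
class InitLog {
    static final List<String> events = new ArrayList<>();
}

class Registry {
    static final List<Object> CACHE = new ArrayList<>();
    static {
        InitLog.events.add("Registry.<clinit>");
        // Objects that a bootstrap image would now have to include as well.
        for (int i = 0; i < 3; i++) CACHE.add(new Object());
    }
}

class Widget {
    // Initializing Widget forces Registry to initialize first.
    static final Registry REGISTRY = new Registry();
    static {
        InitLog.events.add("Widget.<clinit>");
    }
}
```

Evaluating `new Widget()` once runs the Registry initializer and then the Widget initializer, leaving three extra objects in Registry.CACHE that the caller never asked for. An automatic discovery pass has to include all of them, which is one reason it ends up conservative.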
Consequences of bootstrapping
technique
• Bootstrapping process is a major pain
– Time-consuming: reflection is inefficient
– Difficult to debug
– Process breaks with different JDK
versions, environment variables, command
line options, locales, etc.
Class library implementation
• GNU Classpath: too incompatible, too buggy
• Hijack Sun class library by class merging
– Make a “mirror” class with the same name.
– Special class loader merges the classes.
• Easy implementation of native methods.
– Native code is just normal Java code.
• Perfect compatibility, easy updates
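The effect of class merging can be modeled on a toy representation where a class is just a map from member signature to an implementation tag. This is only a sketch of the idea, not the Joeq class loader. ClassMerger and merge are hypothetical names, and the sketch assumes that a mirror member with the same signature replaces the original while new mirror members are added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of class merging: member signature -> implementation tag. */
final class ClassMerger {
    /**
     * Returns the members of the merged class without modifying the inputs.
     * Assumption: mirror members override originals with the same signature.
     */
    static Map<String, String> merge(Map<String, String> original, Map<String, String> mirror) {
        Map<String, String> merged = new LinkedHashMap<>(original);
        merged.putAll(mirror);
        return merged;
    }
}
```

In this model a library method tagged "native" is replaced by a mirror entry tagged "java", while every member the mirror does not mention is taken from the original class unchanged.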
Consequences of mirror classes
• Types don’t match, so javac complains
– Cast to java.lang.Object, then back down.
• Doesn’t work on different class libraries.
• Many changes between JDK sub-versions.
– Use a hierarchy of mirror classes
• Incompatible changes lead to many
hacks.
Multiple language support
• Joeq has support for:
– Java class files
– SUIF files
• C, C++, Fortran, …
– x86 object code
• All are translated into a single
intermediate representation, the Quad.
Quad intermediate representation
• Analyses and optimizations are instantly
applicable to all languages
• Cross-language inlining and optimization
– Elimination of JNI overhead
• Support for raw address manipulation in Java
falls out naturally
• Type-accurate garbage collection for well-behaved C/C++ programs
Quad intermediate representation
• Generic interfaces for operators
– Lots of shared code
• Types are optional
– Type analysis will construct type
information
• Doesn’t support all esoteric C/C++
features
– Computed labels, C++ nastiness, etc.
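The points about generic operator interfaces and optional types can be sketched as follows. This is a simplified model and not the real Quad classes: Operator, Move, Arith, Quad, and TypeInference are hypothetical names. Quads carry no type information here, and a small forward pass reconstructs it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical operator interface; the real operator hierarchy is richer. */
interface Operator {
    String mnemonic();
}

enum Move implements Operator {
    CONST_INT, CONST_FLOAT, COPY;
    public String mnemonic() { return "MOVE_" + name(); }
}

enum Arith implements Operator {
    ADD, MUL;
    public String mnemonic() { return name(); }
}

/** One quad: an operator, a destination register, and up to two source operands. */
final class Quad {
    final Operator op;
    final String dest, src1, src2;

    Quad(Operator op, String dest, String src1, String src2) {
        this.op = op;
        this.dest = dest;
        this.src1 = src1;
        this.src2 = src2;
    }
}

final class TypeInference {
    /** Forward pass over straight-line code assigning "int" or "float" to each destination. */
    static Map<String, String> infer(List<Quad> code) {
        Map<String, String> types = new HashMap<>();
        for (Quad q : code) {
            String t;
            if (q.op == Move.CONST_INT) {
                t = "int";
            } else if (q.op == Move.CONST_FLOAT) {
                t = "float";
            } else if (q.op == Move.COPY) {
                t = types.get(q.src1);
            } else {
                // Arithmetic: float if either operand is float, else the first operand's type.
                String a = types.get(q.src1);
                String b = types.get(q.src2);
                t = ("float".equals(a) || "float".equals(b)) ? "float" : a;
            }
            if (t != null) types.put(q.dest, t); // unknown types are simply left out
        }
        return types;
    }
}
```

Because the pass only looks at operators and operands, the same code would apply regardless of which front end produced the quads.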
Hierarchy of Operators
Memory management
• Memory management is abstracted into
different heaps
– Each heap has its own allocation/deallocation
policy
• Interface for querying garbage collection
policies
– Type-accurate, semi-accurate, conservative
– GC-safe points or at any instruction
– Thread-local allocation pools
• Working out an interface with JMTk
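The heap abstraction above might look roughly like the following. This is a sketch under stated assumptions, not Joeq's interface: Heap, Accuracy, BumpHeap, and ExplicitHeap are hypothetical names, and addresses are simulated as integer offsets rather than real memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** How precisely a collector can identify references (illustrative names). */
enum Accuracy { TYPE_ACCURATE, SEMI_ACCURATE, CONSERVATIVE }

/** Hypothetical heap interface: each heap picks its own allocation/deallocation policy. */
interface Heap {
    int allocate(int size);          // simulated address, or -1 if out of space
    void free(int address);          // may be unsupported on collector-managed heaps
    boolean supportsExplicitFree();
    Accuracy accuracy();             // policy query a collector could consult
}

/** Bump-pointer heap: cheap allocation, space reclaimed only by a collector. */
final class BumpHeap implements Heap {
    private final int capacity;
    private int top;

    BumpHeap(int capacity) { this.capacity = capacity; }

    public int allocate(int size) {
        if (size <= 0 || size > capacity - top) return -1;
        int address = top;
        top += size;
        return address;
    }

    public void free(int address) {
        throw new UnsupportedOperationException("GC-managed heap");
    }

    public boolean supportsExplicitFree() { return false; }
    public Accuracy accuracy() { return Accuracy.TYPE_ACCURATE; }
}

/** malloc/free-style heap: freed blocks are reused first-fit, one whole block at a time. */
final class ExplicitHeap implements Heap {
    private final int capacity;
    private int top;
    private final Map<Integer, Integer> live = new HashMap<>();      // address -> size
    private final TreeMap<Integer, Integer> freed = new TreeMap<>(); // address -> size

    ExplicitHeap(int capacity) { this.capacity = capacity; }

    public int allocate(int size) {
        if (size <= 0) return -1;
        for (Map.Entry<Integer, Integer> e : freed.entrySet()) {
            if (e.getValue() >= size) {
                int address = e.getKey();
                int blockSize = e.getValue();
                freed.remove(address);
                live.put(address, blockSize);
                return address;
            }
        }
        if (size > capacity - top) return -1;
        int address = top;
        top += size;
        live.put(address, size);
        return address;
    }

    public void free(int address) {
        Integer size = live.remove(address);
        if (size == null) throw new IllegalArgumentException("not a live block: " + address);
        freed.put(address, size);
    }

    public boolean supportsExplicitFree() { return true; }
    public Accuracy accuracy() { return Accuracy.CONSERVATIVE; }
}
```

Putting explicit deallocation behind the same interface as collected heaps is one way a single runtime could serve both Java and C/C++ allocation styles.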
Consequences of memory
management framework
• Debugging
– Run under hosted execution mode
– Image snapshots
– 100% type-accurate is hard
• Coordinating threads for GC
– Making a general interface is tricky
Thread scheduler
• M:N thread scheduler
– Lightweight Java threads
– Thread switch at any instruction
– Uses local thread queues and work-stealing
• Timer ticks by using setitimer interrupts
(Linux) or a separate thread (Windows)
• Thread-local information stored off of fs
register
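The local thread queues with work-stealing can be sketched as a single-threaded simulation. This is not Joeq's scheduler: WorkStealingScheduler, enqueue, and next are hypothetical names, Java threads are represented by their names, and the synchronization a real M:N scheduler needs is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Single-threaded model of per-processor run queues with work stealing. */
final class WorkStealingScheduler {
    private final List<Deque<String>> queues = new ArrayList<>();

    WorkStealingScheduler(int processors) {
        for (int i = 0; i < processors; i++) queues.add(new ArrayDeque<>());
    }

    /** Makes a (named) Java thread runnable on the given processor's local queue. */
    void enqueue(int cpu, String thread) {
        queues.get(cpu).addLast(thread);
    }

    /**
     * Picks the next thread for cpu: the head of its own queue if any, otherwise
     * one stolen from the tail of another processor's queue. Returns null when idle.
     */
    String next(int cpu) {
        String t = queues.get(cpu).pollFirst();
        if (t != null) return t;
        for (int i = 1; i < queues.size(); i++) {
            Deque<String> victim = queues.get((cpu + i) % queues.size());
            t = victim.pollLast();
            if (t != null) return t;
        }
        return null;
    }
}
```

Taking local work from the head and stolen work from the tail keeps owner and thief at opposite ends of a queue, which is the usual motivation for this layout.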
Consequences of Java thread
scheduler
• Accessing threads in a machine-independent way is not easy
• Linux pthread implementation is broken
– Lots of bugs, race conditions, inefficiencies
– Changing stack pointer is not always
supported
– Use of fs register is not always supported
• Windows support is much nicer (?)
Running an Open-Source Project
• Lots of interest, but very few people
actually follow through
• Not many people have the skills
– Of those, not many have the time
• Of those, even fewer have the perseverance
– The result is that there have only been minor
contributions by others
• Documentation, testing, file releases,
updating the web site all take time.
Running an Open-Source Project
• What’s needed:
– Nightly build scripts and regression testing
– Implementation hackers
– People interested in GC
Conclusion: What I’ve learned
• Software patterns are useful
– Joeq: 100K lines of code
• Modular design is key
– Trying out new type checker: ~2 hours
• For maximum efficiency, design the system to
be easily debuggable.
• Preemptively eliminate obvious problems.
• It’s more fun to write code when you also write
the compiler.