Java Virtual Machine
Instruction Set Architecture
Justin Dzeja
What is a Java Virtual Machine?
• JVM is an abstract computing machine
▫ Like an actual computing machine, it has an
instruction set and manipulates various memory
areas at run time
• A JVM enables a set of computer software
programs and data structures to use a virtual
machine model for the execution of other
computer programs and scripts
▫ Not just Java: compilers targeting the JVM now exist for many languages
▫ Examples include Ada, C, LISP, Python
Why a Virtual Machine?
• The Java platform was initially developed to address
the problems of building software for networked
consumer devices
• It was designed to support multiple host
architectures and to allow secure delivery of software
components
• To meet these requirements, compiled code had to
survive transport across networks, operate on any
client, and assure the client that it was safe to run
• "Write Once, Run Anywhere"
Java Timeline
• 1991 – James Gosling begins work on Java project
▫ Originally, the language is named “Oak”
• 1995 – Sun publicly announces Java; the first public
implementation, Java 1.0, follows in January 1996
• 1998 – JDK 1.1 release downloads top 2 million
• 1999 – Java 2 is released by Sun
• 2005 – Approximately 4.5 million developers use
Java technology
• 2007 – Sun makes all of Java’s core code available
under open-source distribution terms
Java Principles
• Sun set five primary goals in the creation of the
Java language:
▫ It should be "simple, object oriented, and
familiar".
▫ It should be "robust and secure".
▫ It should be "architecture neutral and portable".
▫ It should execute with "high performance".
▫ It should be "interpreted, threaded, and dynamic".
JVM Instruction Set Architecture
• Instructions
▫ A Java virtual machine instruction consists of a one-byte opcode specifying the operation to be performed,
followed by zero or more operands supplying
arguments or data that are used by the operation
▫ Operands are not required; many instructions
consist of only the opcode
▫ A one-byte opcode allows for up to 256 instructions,
but only about 200 are used in class files, leaving room
for more instructions to be added
▫ Each instruction has a mnemonic name which is
mapped to the binary one-byte opcode
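As a rough illustration of the opcode-plus-operands layout described above, the sketch below walks a byte array and prints one mnemonic per instruction. The class name TinyDisassembler and its small lookup table are assumptions made for this example. Only six opcodes are covered, using the values listed in the JVM specification's opcode table, and anything else is reported as unknown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it knows just six opcodes.
class TinyDisassembler {
    static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF; // treat the opcode byte as unsigned
            switch (opcode) {
                case 0x10: // bipush: opcode followed by one signed operand byte
                    if (pc + 1 >= code.length) {
                        throw new IllegalArgumentException("truncated bipush at " + pc);
                    }
                    out.add(pc + " bipush " + code[pc + 1]);
                    pc += 2;
                    break;
                case 0x1a: out.add(pc + " iload_0");  pc += 1; break; // no operands
                case 0x1b: out.add(pc + " iload_1");  pc += 1; break;
                case 0x3d: out.add(pc + " istore_2"); pc += 1; break;
                case 0x60: out.add(pc + " iadd");     pc += 1; break;
                case 0xac: out.add(pc + " ireturn");  pc += 1; break;
                default:
                    // Operand length is unknown here, so decoding may lose sync.
                    out.add(pc + " <unknown 0x" + Integer.toHexString(opcode) + ">");
                    pc += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] code = {0x10, 12, 0x10, 13, 0x60, (byte) 0xac};
        disassemble(code).forEach(System.out::println);
    }
}
```

The number printed before each mnemonic is the byte offset of the instruction, which is why an instruction with an operand (bipush) advances the offset by two.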
JVM Instruction Set Architecture
• Instruction Format
▫ The mnemonic operation names often include the
data type followed by the operation name
▪ iadd, ladd, fadd, dadd
▪ int, long, float, double
▫ JVM supports conversion operations that convert
from one data type to another; their names include
both the source and the target data type
▪ i2l, i2f, i2d, l2f, l2d, f2d
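To connect the mnemonics to source code, here is a small Java sketch. The class and method names are made up for this example, and the instruction named in each comment is what a Java compiler would typically emit for that expression, not a guarantee. Running `javap -c` on the compiled class shows the instructions actually chosen.

```java
// Hypothetical demo class; the mnemonics in comments are the typical
// compiler output for each expression, stated here as an assumption.
class ConversionDemo {
    static int addInts(int a, int b)             { return a + b; } // iadd
    static long addLongs(long a, long b)         { return a + b; } // ladd
    static double addDoubles(double a, double b) { return a + b; } // dadd

    static long intToLong(int i)      { return i; } // i2l (exact)
    static float intToFloat(int i)    { return i; } // i2f (may round)
    static double longToDouble(long l) { return l; } // l2d (may round)

    public static void main(String[] args) {
        System.out.println(intToLong(-1));
        // 16777217 is 2^24 + 1, which a 32-bit float cannot represent exactly.
        System.out.println(intToFloat(16777217) == 16777216f);
    }
}
```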
Operation Types
• The JVM ISA is a CISC architecture, thus having
many instructions
• They can be classified into broad groups
▫ Load and store
▫ Arithmetic and logic
▫ Type conversion
▫ Object creation and manipulation
▫ Operand stack management
▫ Control transfer
▫ Method invocation and return
Operation Types
• Load and store
▫ Used to move values between the local variable array and the operand stack
▫ iload, istore
• Arithmetic and logic
▫ JVM supports addition, subtraction, division, multiplication, remainder, negation, increment
▫ irem, idiv, iinc
• Type conversion
▫ Allows converting from one primitive data type to another
▫ i2l, i2f, i2d, l2f, l2d, f2d
• Object creation and manipulation
▫ Instantiating objects and manipulating fields
▫ new, putfield
• Operand stack management
▫ swap, dup2
• Control transfer
▫ ifeq, goto
• Method invocation and return
▫ invokespecial, areturn
JVM Data Types
• The Java virtual machine operates on two kinds
of types: primitive types and reference types
• Integral Types:
▫ Byte - 8-bit signed two's-complement integers
▫ Short - 16-bit signed two's-complement integers
▫ Int - 32-bit signed two's-complement integers
▫ Long - 64-bit signed two's-complement integers
▫ Char - 16-bit unsigned integers representing
Unicode characters
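The widths and signedness listed above can be observed directly from Java. The sketch below uses a hypothetical class name and helper methods. It shows two's-complement wraparound for byte and int and the unsigned range of char.

```java
// Hypothetical demo class illustrating the integral type ranges.
class IntegralTypesDemo {
    // Narrowing to byte keeps only the low 8 bits (two's complement).
    static byte wrapToByte(int v) { return (byte) v; }

    // char is unsigned, so widening to int never yields a negative value.
    static int charAsInt(char c) { return c; }

    // int arithmetic wraps around on overflow rather than raising an error.
    static int incrementInt(int v) { return v + 1; }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " .. " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " .. " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " .. " + Long.MAX_VALUE);
        System.out.println("char:  " + (int) Character.MIN_VALUE + " .. " + (int) Character.MAX_VALUE);
    }
}
```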
JVM Data Types
• Floating Point Types:
▫ Float - values are elements of the float value set
(typically 32-bit single-precision but may vary with
implementation)
▫ Double - values are elements of the double value
set (64-bit double-precision)
• Boolean - values true and false
▫ JVM has very little support for the Boolean data type
▫ Boolean variables in a Java program are compiled to
use values of the JVM int data type
• returnAddress - values are pointers to the opcodes of
Java virtual machine instructions
JVM Data Types
• Three kinds of reference types
▫ Class types
▫ Array types
▫ Interface types
• These reference dynamically created classes,
arrays, or interface implementations
• A reference can be set to null when it does not refer
to anything; null can be cast to any reference type
JVM Data Types
• The basic unit of size for data values in the Java
virtual machine is the word
▫ a fixed size chosen by the designer of each Java
virtual machine implementation
• The word size must be large enough to hold a
value of type byte, short, int, char, float,
returnAddress, or reference
▫ at least 32 bits
JVM Runtime Data Areas
• Since the JVM is a virtual machine, it doesn’t have
any physical registers; instead it defines various
runtime data areas that are used during
execution of a program
• One of the areas defined is the program counter
register
▫ Each thread of control has its own PC register
▫ The register is wide enough to contain a
returnAddress or a native pointer on the specific
platform
JVM Runtime Data Areas
• JVM Stack
▫ Each thread gets its own JVM stack when it is
created
▫ Stacks store frames which hold data and play a
role in method invocation and return
▫ The actual memory for a JVM stack does not need
to be contiguous
▫ The stack can be either of a fixed size or
dynamically contracted and expanded as needed
JVM Runtime Data Areas
• JVM Frames
▫ A frame is used to store data and partial results, as well as to perform
dynamic linking, return values for methods, and dispatch exceptions
▫ A new frame is created each time a method is invoked and destroyed
when the method is completed
▫ Only one frame, for the executing method, is active at any point
▫ Each frame contains a local variable array
▪ Local variables can store primitive or reference data types
▪ Variables are addressed by index, starting from zero
▪ Data types long and double occupy two consecutive local variables
▫ Frames also contain an operand stack
▪ Last-in-first-out (LIFO)
▪ JVM loads values from local variables or fields onto the stack
▪ Then JVM instructions can take operands from the stack, operate on
them, and then push the result back onto the stack
▪ The maximum operand stack depth is fixed at compile time based on the
method associated with the frame
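The indexing rule for the local variable array can be sketched as a small calculation. The helper below is hypothetical and uses single-character type codes borrowed from JVM descriptors (I, J, F, D for int, long, float, double). It assumes that parameters are assigned slots in declaration order and that an instance method keeps `this` in local variable 0.

```java
// Hypothetical helper: computes the local variable index of each parameter.
class LocalSlots {
    static int[] slotIndices(boolean isStatic, char... paramTypes) {
        int[] indices = new int[paramTypes.length];
        int next = isStatic ? 0 : 1; // instance methods hold 'this' in slot 0
        for (int i = 0; i < paramTypes.length; i++) {
            indices[i] = next;
            // long (J) and double (D) occupy two consecutive local variables
            next += (paramTypes[i] == 'J' || paramTypes[i] == 'D') ? 2 : 1;
        }
        return indices;
    }

    public static void main(String[] args) {
        // static method taking (int, long, double, int)
        System.out.println(java.util.Arrays.toString(slotIndices(true, 'I', 'J', 'D', 'I')));
    }
}
```

A long or double is addressed by the lower of its two indices, so the parameter after a long starts two slots later.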
JVM Operand Stack
• Code:
▫ iload_0 // push the int in local variable 0
▫ iload_1 // push the int in local variable 1
▫ iadd // pop two ints, add them, push result
▫ istore_2 // pop int, store into local variable 2
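The four instructions above can be traced with a toy interpreter. This is a minimal sketch, not how a real JVM is implemented: the class name MiniStackMachine is hypothetical, instructions are given as mnemonic strings instead of bytes, and only these four mnemonics are understood.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter for illustration; supports only the four mnemonics shown.
class MiniStackMachine {
    static int[] run(String[] program, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack (LIFO)
        for (String insn : program) {
            switch (insn) {
                case "iload_0":  stack.push(locals[0]); break;
                case "iload_1":  stack.push(locals[1]); break;
                case "iadd": {
                    int b = stack.pop(); // top of stack is the second operand
                    int a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case "istore_2": locals[2] = stack.pop(); break;
                default:
                    throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        String[] program = {"iload_0", "iload_1", "iadd", "istore_2"};
        int[] locals = run(program, new int[]{7, 35, 0});
        System.out.println(locals[2]);
    }
}
```

After the run the operand stack is empty again and the sum sits in local variable 2.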
JVM Runtime Data Areas
• JVM Heap
▫ The heap is a data area shared by all JVM threads
▫ Memory from the heap is allocated for instances of
classes and arrays
▫ Can be either of fixed size or dynamic
▫ Does not need to be in contiguous memory space
▫ Maintained by an automatic storage management
system or garbage collector
JVM Runtime Data Areas
• Method Area
▫ The method area is also shared among all JVM
threads
▫ It stores per-class structures
▪ such as the runtime constant pool, field and method
data, code for methods and constructors, including
the special methods used in class and instance
initialization
▫ The method area is logically part of the heap, but
depending on the implementation it may or may
not be garbage collected or compacted
JVM Runtime Data Areas
• Runtime Constant Pool
▫ The runtime constant pool is a per-class runtime
representation of the constant pool table in a class
file
▫ It contains numeric constants as well as method
and field references that are resolved at runtime
▫ This is similar to a symbol table for a conventional
programming language, although it stores a wider
range of data
JVM Addressing Modes
• JVM supports three addressing modes
▫ Immediate addressing mode
▪ Constant is part of instruction
▫ Indexed addressing mode
▪ Accessing variables from local variable array
▫ Stack addressing mode
▪ Retrieving values from operand stack using pop
JVM Method Calls
• Sample Code
int add12and13() {
return addTwo(12, 13);
}
• Compiles to
Method int add12and13()
0 aload_0 // Push local variable 0 (this)
1 bipush 12 // Push int constant 12
3 bipush 13 // Push int constant 13
5 invokevirtual #4 // Method Example.addTwo(II)I
8 ireturn // Return int on top of operand stack; it
// is the int result of addTwo()
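For anyone who wants to reproduce the listing, a complete class might look like the sketch below. The slide shows only add12and13(), so the body of addTwo() here is an assumption (plain integer addition). After compiling, `javap -c Example` prints the bytecode, though constant pool indices such as #4 may differ between compilers.

```java
// Sketch of a class that could produce bytecode like the listing above.
class Example {
    // Assumed implementation; the slide does not show this method's body.
    int addTwo(int i, int j) {
        return i + j;
    }

    // Instance method: 'this' is pushed first, then both int arguments.
    int add12and13() {
        return addTwo(12, 13);
    }

    public static void main(String[] args) {
        System.out.println(new Example().add12and13());
    }
}
```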
Design Principles
• Simplicity favors regularity
▫ Examples of this principle can be found throughout
the JVM specification
▫ Every instruction begins with a standard opcode that
is one byte in size
▫ The naming conventions for opcode mnemonics are
standard across different types of operations
• Smaller is faster
▫ Data areas such as the heap are dynamic in size
resulting in memory space saved when not in use
▫ JVM has a large instruction set, which results in a
smaller code size when converted to byte code
Design Principles
• Make the common case fast
▫ JVM includes instructions to increment variables
or to arithmetically shift values for fast execution
of common operations
• Good design demands good compromises
▫ The JVM finds a good balance between high
performance and being secure and robust
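The "common case" instructions mentioned above correspond to everyday Java constructs. In the sketch below the class name is hypothetical, and the mnemonic in each comment is what a Java compiler would typically emit, stated as an assumption and not as a guarantee; `javap -c` shows the actual output.

```java
// Hypothetical demo class; mnemonics in comments are typical compiler output.
class CommonCaseDemo {
    // Incrementing an int local usually compiles to a single iinc,
    // which updates the local variable without touching the operand stack.
    static int countUp(int n) {
        int count = 0;
        for (int i = 0; i < n; i++) {
            count++;
        }
        return count;
    }

    static int timesEight(int x)     { return x << 3; }   // ishl
    static int halveSigned(int x)    { return x >> 1; }   // ishr (keeps sign)
    static int topFourBits(int x)    { return x >>> 28; } // iushr (zero fill)

    public static void main(String[] args) {
        System.out.println(countUp(5));
        System.out.println(timesEight(5));
        System.out.println(halveSigned(-8));
        System.out.println(topFourBits(-1));
    }
}
```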
JVM Advantages/Disadvantages
• The JVM is a self-contained operating environment that
behaves as if it were a separate computer; this gives it
two main advantages
▫ System Independence – a Java application will run the
same on any JVM, regardless of the underlying system
▫ Security – since a Java program operates in a self-contained environment, there is less risk to files and
other applications
• The disadvantage is that running the virtual
machine is extra overhead on the system, which can
impair performance in some situations