Java Virtual Machine

Course Overview
PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: inside a compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation
PART III: conclusion
8 Interpretation
9 Review
Supplementary material: Java’s runtime organization and the Java Virtual Machine
The Java Virtual Machine
What This Topic is About
We look at the JVM as an example of a real-world runtime system
for a modern object-oriented programming language.
The JVM is probably the most widely used VM in the world, so
studying it gives you a better idea of what a real VM looks like.
• JVM is an abstract machine.
• What is the JVM architecture?
• What is the structure of .class files?
• How are JVM instructions executed?
• What is the role of the constant pool in dynamic linking?
Also visit this site for more complete information about the JVM:
http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html
Recap: Interpretive Compilers
Why?
A tradeoff between fast(er) compilation and a reasonable runtime
performance.
How?
Use an “intermediate language”
• more high-level than machine code => easier to compile to
• more low-level than source language => easy to implement as an
interpreter
Example: A “Java Development Kit” for machine M consists of
• a Java–>JVM compiler, expressed in M code
• a JVM interpreter, expressed in M code
Abstract Machines
An abstract machine implements an intermediate language that sits
between the high-level language (e.g. Java) and the low-level hardware
(e.g. Pentium).
Figure (layers, from high level to low level):
  Java (high level)
    | Java compiler
  JVM (.class files)
    | Java JVM interpreter or JVM JIT compiler
  Pentium (low level)
Annotation in the figure: Implemented in Java: machine independent
Abstract Machines
An abstract machine is intended specifically as a runtime system for
a particular (kind of) programming language.
• JVM is a virtual machine for Java programs.
• It directly supports object-oriented concepts such as classes,
objects, methods, method invocation etc.
• Easy to compile Java to JVM
=> 1. easy to implement compiler
2. fast compilation
• Another advantage: portability
Class Files and Class File Format
External representation (platform independent): .class files
    | load
    v
JVM internal representation (implementation dependent):
classes, objects, primitive types, arrays, strings, methods
The JVM is an abstract machine in the truest sense of the word.
The JVM specification does not give implementation details (can be
dependent on target OS/platform, performance requirements, etc.)
The JVM specification defines a machine independent “class file
format” that all JVM implementations must support.
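To make the idea of a machine-independent file format concrete: every class file begins with the same fixed header, namely the 4-byte magic number 0xCAFEBABE followed by 2-byte minor version, major version, and constant pool count fields, all big-endian. The following is only a minimal sketch that reads this header; the class ClassFileHeader and its parse method are hypothetical names chosen for illustration, not part of any JVM API.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (illustration only): reads the first fields of a class file.
final class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassFileHeader(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Parses the header from the raw bytes of a .class file.
    static ClassFileHeader parse(byte[] classFileBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classFileBytes);
        if (in.getInt() != 0xCAFEBABE) {              // u4 magic
            throw new IllegalArgumentException("not a class file: bad magic number");
        }
        int minor = in.getShort() & 0xFFFF;           // u2 minor_version
        int major = in.getShort() & 0xFFFF;           // u2 major_version
        int cpCount = in.getShort() & 0xFFFF;         // u2 constant_pool_count
        return new ClassFileHeader(minor, major, cpCount);
    }
}
```

The bytes of a real class could be obtained with, for example, java.nio.file.Files.readAllBytes on a compiled .class file. Everything after the header (the constant pool entries, fields, methods) is deliberately left out of this sketch.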
Data Types
JVM (and Java) distinguishes between two kinds of types:
Primitive types:
• boolean: boolean
• numeric integral: byte, short, int, long, char
• numeric floating point: float, double
• internal, for exception handling: returnAddress
Reference types:
• class types
• array types
• interface types
Note: Primitive types are represented directly, reference types are
represented indirectly (as pointers to array or class instances).
JVM: Runtime Data Areas
Besides OO concepts, JVM also supports multi-threading. Threads are
directly supported by the JVM.
=> Two kinds of runtime data areas:
1. shared between all threads
2. private to a single thread
Shared between all threads:
• heap (garbage collected)
• method area
Private to each thread (Thread 1, Thread 2, ...):
• pc
• Java stack
• native method stack
Java Stacks
JVM is a stack based machine, much like TAM.
JVM instructions
• implicitly take arguments from the stack top
• put their result on the top of the stack
The stack is used to
• pass arguments to methods
• return a result from a method
• store intermediate results while evaluating expressions
• store local variables
This works similarly to (but not exactly the same as) what we
previously discussed about stack-based storage allocation and routines.
Stack Frames
The Java stack consists of frames. The JVM specification does not say
exactly how the stack and frames should be implemented.
The JVM specification specifies that a stack frame has areas for:
• a pointer to the runtime constant pool
• args + local vars
• the operand stack
A new call frame is created by executing
a JVM instruction for invoking a
method (e.g. invokevirtual,
invokespecial, ...).
The operand stack is initially empty,
but grows and shrinks during execution.
Stack Frames
The role/purpose of each of the areas in a stack frame:
• pointer to constant pool: used implicitly when executing JVM
instructions that contain indexes into the constant pool (more about
this later).
• args + local vars: space where the arguments and local variables of a
method are stored. This includes a space for the receiver (this) at
position/offset 0.
• operand stack: stack for storing intermediate results during the
execution of the method. Initially it is empty. The maximum depth is
known at compile time.
Stack Frames
An implementation using registers such as SB, ST, and LB and a
dynamic link is one possible implementation.
Registers: SB (stack base), LB (base of the current frame), ST (stack top).
Frame layout, starting at LB:
  dynamic link (to previous frame on the stack)
  pointer to runtime constant pool
  args + local vars
  operand stack (ends at ST)
JVM instructions store and load (for accessing args and locals) use
addresses which are numbers from 0 to #args + #locals - 1.
JVM Interpreter
The core of a JVM interpreter is basically this:
do {
    byte opcode = fetch an opcode;
    switch (opcode) {
        case opCode1:
            fetch operands for opCode1;
            execute action for opCode1;
            break;
        case opCode2:
            fetch operands for opCode2;
            execute action for opCode2;
            break;
        case ...
    }
} while (more to do);
Instruction-set: typed instructions!
JVM instructions are explicitly typed: different opCodes for
instructions for integers, floats, arrays, reference types, etc.
This is reflected by a naming convention in the first letter of the
opCode mnemonics:
Example: different types of “load” instructions
iload   integer load
lload   long load
fload   float load
dload   double load
aload   reference-type load
Instruction set: kinds of operands
JVM instructions have three kinds of operands:
- from the top of the operand stack
- from the bytes following the opCode
- part of the opCode itself
Each instruction may have different “forms” supporting different
kinds of operands.
Example: different forms of “iload”
Assembly code     Binary instruction code layout
iload_0           26
iload_1           27
iload_2           28
iload_3           29
iload n           21 n
wide iload n      196 21 n (here n takes two bytes)
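The three kinds of operand encoding can be seen by decoding the local-variable index from each iload form in the table. This is a minimal sketch; IloadDecoder and localIndex are hypothetical names, and only the opcode values (21, 26 to 29, 196) come from the table above.

```java
// Hypothetical decoder for the iload forms (illustration only, not a JVM API).
final class IloadDecoder {
    static final int ILOAD = 21, ILOAD_0 = 26, ILOAD_3 = 29, WIDE = 196;

    // Returns the local-variable index encoded by the iload instruction at code[pc].
    static int localIndex(byte[] code, int pc) {
        int opcode = code[pc] & 0xFF;               // Java bytes are signed, opcodes are not
        if (opcode >= ILOAD_0 && opcode <= ILOAD_3) {
            return opcode - ILOAD_0;                // operand is part of the opcode itself
        }
        if (opcode == ILOAD) {
            return code[pc + 1] & 0xFF;             // one unsigned index byte follows the opcode
        }
        if (opcode == WIDE && (code[pc + 1] & 0xFF) == ILOAD) {
            // wide prefix: two index bytes follow, high byte first
            return ((code[pc + 2] & 0xFF) << 8) | (code[pc + 3] & 0xFF);
        }
        throw new IllegalArgumentException("not an iload form: opcode " + opcode);
    }
}
```

The short forms save one byte per instruction for the four most frequently used slots, while the wide form makes slots beyond 255 reachable.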
Instruction-set: accessing arguments and locals
Arguments and locals area inside a stack frame (slots 0, 1, 2, 3, ...):
args: indexes 0 .. #args - 1
locals: indexes #args .. #args + #locals - 1
Instruction examples:
iload_1, istore_1, iload_3, astore_1, aload 5, fstore_3, aload_0
• A load instruction takes something
from the args/locals area and pushes
it onto the top of the operand stack.
• A store instruction pops something
from the top of the operand stack
and places it in the args/locals area.
Instruction-set: non-local memory access
In the JVM, the contents of different “kinds” of memory can be
accessed by different kinds of instructions.
accessing locals and arguments: load and store instructions
accessing fields in objects: getfield, putfield
accessing static fields: getstatic, putstatic
Note: Static fields are a lot like global variables. They are allocated
in the “method area”, which also stores the code for methods and the
representations of classes (including method tables).
Note: getfield and putfield access memory in the heap.
Note: JVM doesn’t have anything similar to registers L1, L2, etc.
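As an illustration of the three kinds of access, here is a small Java class annotated with the instructions a Java compiler typically emits for each line. The class Counter is made up for this example, and the exact instruction sequence can differ between compilers; javap -c shows what was actually generated.

```java
// Illustrative class; comments name the instructions javac typically emits.
final class Counter {
    static int created;   // static field: stored with the class, getstatic/putstatic
    int value;            // instance field: stored in the object on the heap, getfield/putfield

    Counter() {
        created = created + 1;       // getstatic created, iconst_1, iadd, putstatic created
    }

    int increment(int step) {
        int next = value + step;     // aload_0, getfield value, iload_1, iadd, istore_2
        value = next;                // aload_0, iload_2, putfield value
        return value;                // aload_0, getfield value, ireturn
    }
}
```

Note that getfield and putfield always need the object reference on the operand stack (here pushed by aload_0), whereas getstatic and putstatic do not.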
Instruction-set: operations on numbers
Arithmetic
add: iadd, ladd, fadd, dadd
subtract: isub, lsub, fsub, dsub
multiply: imul, lmul, fmul, dmul
etc.
Conversion
i2l, i2f, i2d,
l2f, l2d, f2d,
f2i, d2i, …
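The typed arithmetic and conversion instructions correspond directly to Java operators and casts. The sketch below pairs a few Java expressions with the instruction that implements them; the class Conversions and its method names are invented for illustration.

```java
// Each method body compiles to essentially one of the instructions named in its comment.
final class Conversions {
    static int add(int a, int b)       { return a + b; }     // iadd: wraps around on overflow
    static long intToLong(int i)       { return i; }         // i2l: widening, value preserved
    static double intToDouble(int i)   { return i; }         // i2d: widening, value preserved
    static int floatToInt(float f)     { return (int) f; }   // f2i: rounds toward zero
    static int doubleToInt(double d)   { return (int) d; }   // d2i: NaN gives 0, out-of-range values saturate
}
```

For example, doubleToInt(-3.99) yields -3 and add(Integer.MAX_VALUE, 1) yields Integer.MIN_VALUE, because none of these instructions throw on overflow.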
Instruction-set …
Operand stack manipulation
pop, pop2, dup, dup2, swap, …
Control transfer
Unconditional: goto, jsr, ret, …
Conditional: ifeq, iflt, ifgt, if_icmpeq, …
Instruction-set …
Method invocation:
invokevirtual: usual instruction for calling a method on an
object.
invokeinterface: same as invokevirtual, but used
when the called method is declared in an interface (requires a
different kind of method lookup)
invokespecial: for calling things such as constructors and
superclass methods (super calls), which are not dynamically
dispatched (this instruction is also known as invokenonvirtual).
invokestatic: for calling methods that have the “static”
modifier (these methods are invoked on a class, not on an object).
Returning from methods:
return, ireturn, lreturn, areturn, freturn, …
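To connect these instructions to source code, the following sketch shows Java calls together with the invocation instruction a Java compiler typically selects for each. The types Shape, Square, and InvokeDemo are hypothetical, introduced only for this example.

```java
interface Shape {
    double area();
}

class Square implements Shape {
    private final double side;

    Square(double side) {              // body starts with invokespecial java/lang/Object.<init>
        this.side = side;
    }

    public double area() {             // ends with dreturn
        return side * side;
    }

    static Square unit() {             // new Square, dup, dconst_1, invokespecial Square.<init>
        return new Square(1.0);
    }
}

final class InvokeDemo {
    static double total(Shape shape, Square square) {
        double viaInterface = shape.area();        // invokeinterface Shape.area
        double viaClass = square.area();           // invokevirtual Square.area
        double viaStatic = Square.unit().area();   // invokestatic Square.unit, then invokevirtual
        return viaInterface + viaClass + viaStatic;
    }
}
```

The choice of instruction depends on the static type of the receiver expression, not on the object found at runtime: the same Square object is called through invokeinterface when the variable has type Shape and through invokevirtual when it has type Square.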
Instruction-set: Heap Memory Allocation
Create new class instance (object):
new
Create new array:
newarray: for creating arrays of primitive types.
anewarray: for creating arrays of reference types.
multianewarray: for creating multidimensional arrays.
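A short Java sketch showing which allocation instruction is typically generated for each kind of creation expression. The class AllocationDemo is a made-up name for this example.

```java
final class AllocationDemo {
    static int demo() {
        Object object = new Object();     // new java/lang/Object, dup, invokespecial <init>
        int[] numbers = new int[3];       // newarray int
        String[] names = new String[2];   // anewarray java/lang/String
        int[][] grid = new int[2][3];     // multianewarray [[I, 2 dimensions
        // All of these live on the garbage-collected heap; array elements start at 0 or null.
        return numbers.length + names.length + grid.length * grid[0].length;
    }
}
```

Note that new only allocates the object; the constructor runs in a separate step through invokespecial, which is why the reference is duplicated with dup first.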
Instructions and the “Constant Pool”
Many JVM instructions have operands which are indexes pointing to
an entry in the so-called constant pool.
The constant pool contains all kinds of entries that represent
“symbolic” references for “linking”. This is the way that instructions
refer to things such as classes, interfaces, fields, methods, and
constants such as string literals and numbers.
These are the kinds of constant pool entries that exist:
• Integer
• Class_info
• Float
• Fieldref_info
• Long
• Methodref_info
• Double
• InterfaceMethodref_info
• Name_and_Type_info
• String
• Utf8_info (Unicode characters)
Instructions and the “Constant Pool”
Example: We examine the getfield instruction in detail.
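Format: 180 indexbyte1 indexbyte2
indexbyte1 and indexbyte2 together form an index into the constant
pool, selecting a CONSTANT_Fieldref_info entry:
CONSTANT_Fieldref_info {
    u1 tag;
    u2 class_index;            (refers to a CONSTANT_Class_info entry)
    u2 name_and_type_index;    (refers to a CONSTANT_Name_and_Type_info entry)
}
CONSTANT_Class_info {
    u1 tag;
    u2 name_index;             (refers to a Utf8Info entry: fully qualified class name)
}
CONSTANT_Name_and_Type_info {
    u1 tag;
    u2 name_index;             (refers to a Utf8Info entry: name of field)
    u2 descriptor_index;       (refers to a Utf8Info entry: field descriptor)
}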
Instructions and the “Constant Pool”
The previous picture is rather complicated; let’s simplify it a little:
Format: 180 indexbyte1 indexbyte2
indexbyte1, indexbyte2 -> Fieldref
    Fieldref -> Class -> Utf8Info: fully qualified class name
    Fieldref -> Name_and_Type -> Utf8Info: name of field
                              -> Utf8Info: field descriptor
Instructions and the “Constant Pool”
The format of constant pool entries is part of the Java class file format.
Luckily, we have a Java assembler that allows us to write a kind of
textual assembly code and that is then transformed into a binary
.class file.
This assembler takes care of creating the constant pool entries for us.
When an instruction operand expects a constant pool entry, the
assembler allows you to enter the entry “in place” in an easy syntax.
Example:
getfield mypackage/Queue i I
Instructions and the “Constant Pool”
Fully qualified class names and descriptors in constant pool UTF8
entries.
1. Fully qualified class name: a package + class name string. Note
this uses “/” instead of “.” to separate each level along the path.
2. Descriptor: a string that defines a type for a method or field.
Java                    descriptor
boolean                 Z
int                     I
Object                  Ljava/lang/Object;
String[]                [Ljava/lang/String;
int foo(int,Object)     (ILjava/lang/Object;)I
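The descriptor rules can be written down as a small recursive function. The sketch below builds descriptors by hand from java.lang.Class objects so that the rules are visible; the class Descriptors and its methods are hypothetical helpers, not a library API.

```java
// Hypothetical helper that mirrors the descriptor table above.
final class Descriptors {
    // Field descriptor of a type.
    static String of(Class<?> type) {
        if (type == void.class)    return "V";
        if (type == boolean.class) return "Z";
        if (type == byte.class)    return "B";
        if (type == char.class)    return "C";
        if (type == short.class)   return "S";
        if (type == int.class)     return "I";
        if (type == long.class)    return "J";
        if (type == float.class)   return "F";
        if (type == double.class)  return "D";
        if (type.isArray())        return "[" + of(type.getComponentType());
        // Class or interface type: L, then the name with "/" separators, then ";"
        return "L" + type.getName().replace('.', '/') + ";";
    }

    // Method descriptor: parameter descriptors in parentheses, then the return descriptor.
    static String ofMethod(Class<?> returnType, Class<?>... parameterTypes) {
        StringBuilder sb = new StringBuilder("(");
        for (Class<?> parameter : parameterTypes) {
            sb.append(of(parameter));
        }
        return sb.append(')').append(of(returnType)).toString();
    }
}
```

For instance, Descriptors.ofMethod(int.class, int.class, Object.class) gives (ILjava/lang/Object;)I, the last row of the table. The standard library produces the same string via java.lang.invoke.MethodType.methodType(int.class, int.class, Object.class).toMethodDescriptorString().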
Linking
In general, linking is the process of resolving symbolic references in
binary files.
Most programming language implementations have what we call
“separate compilation”. Modules or files can be compiled separately
and transformed into some binary format. But since these separately
compiled files may have connections to other files, they have to be
linked.
=> The binary file is not yet executable, because it has some kind of
“symbolic links” in it that point to things (classes, methods, functions,
variables, etc.) in other files/modules.
Linking is the process of resolving these symbolic links and replacing
them by real addresses so that the code can be executed.
Loading and Linking in JVM
In JVM, loading and linking of class files happens at runtime, while
the program is running!
Classes are loaded as needed.
The constant pool contains symbolic references that need to be
resolved before a JVM instruction that uses them can be executed
(this is the equivalent of linking).
In JVM a constant pool entry is resolved the first time it is used by a
JVM instruction.
Example:
When a getfield is executed for the first time, the constant pool
entry index in the instruction can be replaced by the offset of the field.
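The idea of resolving a symbolic reference on first use and then reusing the result can be sketched in a few lines. This is a deliberately simplified model, not how any particular JVM is implemented: LazyFieldRef is a hypothetical class, an object is modelled as an int array of slots, and the loaded class is modelled as a map from field names to slot offsets.

```java
import java.util.Map;

// Hypothetical model of one Fieldref constant pool entry with lazy resolution.
final class LazyFieldRef {
    private final Map<String, Integer> fieldOffsets;  // stand-in for the loaded class: name -> offset
    private final String fieldName;                   // the symbolic reference
    private int offset = -1;                          // -1 means "not resolved yet"
    int lookups;                                      // how often the symbolic lookup actually ran

    LazyFieldRef(Map<String, Integer> fieldOffsets, String fieldName) {
        this.fieldOffsets = fieldOffsets;
        this.fieldName = fieldName;
    }

    // Models getfield: resolve on first execution, afterwards use the cached offset directly.
    int getfield(int[] objectSlots) {
        if (offset < 0) {
            Integer found = fieldOffsets.get(fieldName);
            if (found == null) {
                throw new NoSuchFieldError(fieldName);  // resolution failure surfaces at run time
            }
            offset = found;
            lookups++;
        }
        return objectSlots[offset];
    }
}
```

However many times getfield runs, the name lookup happens once; later executions only index into the object. This also illustrates why a missing field is reported when the instruction is first executed rather than when the program starts.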
Closing Example
As a closing example on the JVM, we will take a look at the
compiled code of the following simple Java class declaration.
class Factorial {
    int fac(int n) {
        int result = 1;
        for (int i=2; i<n; i++) {
            result = result * i;
        }
        return result;
    }
}
Compiling and Disassembling
% javac Factorial.java
% javap -c -verbose Factorial
Compiled from Factorial.java
class Factorial extends java.lang.Object {
Factorial();
/* Stack=1, Locals=1, Args_size=1 */
int fac(int);
/* Stack=2, Locals=4, Args_size=2 */
}
Method Factorial()
0 aload_0
1 invokespecial #1 <Method java.lang.Object()>
4 return
Compiling and Disassembling ...
Method int fac(int)
Args/locals area: 0 = this, 1 = n, 2 = result, 3 = i
The operand stack is empty on entry.

address  instruction      operand stack afterwards
 0       iconst_1         1
 1       istore_2         (empty)
 2       iconst_2         2
 3       istore_3         (empty)
 4       goto 14          (empty)
 7       iload_2          result
 8       iload_3          result i
 9       imul             result*i
10       istore_2         (empty)
11       iinc 3 1         (empty)
14       iload_3          i
15       iload_1          i n
16       if_icmplt 7      (empty)
19       iload_2          result
20       ireturn
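To see the disassembled code run, the listing can be fed to a tiny interpreter built around the fetch/switch loop shown earlier. This is a minimal sketch under simplifying assumptions: MiniJvm is a hypothetical class, it supports only the opcodes that fac uses, all values are ints, and slot 0 (this) is filled with a dummy 0 because the method never reads it. The opcode numbers are the standard JVM ones.

```java
// Minimal sketch of a bytecode interpreter; supports only the opcodes used by fac.
final class MiniJvm {
    static final int ICONST_0 = 3, ILOAD_0 = 26, ISTORE_0 = 59, IMUL = 104,
                     IINC = 132, IF_ICMPLT = 161, GOTO = 167, IRETURN = 172;

    // Bytecode of Factorial.fac(int), transcribed from the listing above.
    static final int[] FAC = {
        4,                //  0 iconst_1
        61,               //  1 istore_2
        5,                //  2 iconst_2
        62,               //  3 istore_3
        167, 0, 10,       //  4 goto 14       (branch offset +10)
        28,               //  7 iload_2
        29,               //  8 iload_3
        104,              //  9 imul
        61,               // 10 istore_2
        132, 3, 1,        // 11 iinc 3 1
        29,               // 14 iload_3
        27,               // 15 iload_1
        161, 0xFF, 0xF7,  // 16 if_icmplt 7   (branch offset -9)
        28,               // 19 iload_2
        172               // 20 ireturn
    };

    // Signed 16-bit branch offset stored in the two bytes after the opcode.
    static int branchOffset(int[] code, int pc) {
        return (short) ((code[pc + 1] << 8) | code[pc + 2]);
    }

    static int run(int[] code, int[] locals, int maxStack) {
        int[] stack = new int[maxStack];
        int sp = 0;                       // number of values on the operand stack
        int pc = 0;
        while (true) {
            int opcode = code[pc];        // fetch
            switch (opcode) {             // decode and execute
                case IMUL: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    stack[sp++] = left * right;
                    pc += 1;
                    break;
                }
                case IINC:
                    locals[code[pc + 1]] += (byte) code[pc + 2];
                    pc += 3;
                    break;
                case GOTO:
                    pc += branchOffset(code, pc);
                    break;
                case IF_ICMPLT: {
                    int right = stack[--sp];
                    int left = stack[--sp];
                    pc += (left < right) ? branchOffset(code, pc) : 3;
                    break;
                }
                case IRETURN:
                    return stack[--sp];
                default:
                    if (opcode >= ICONST_0 && opcode <= ICONST_0 + 5) {         // iconst_0..iconst_5
                        stack[sp++] = opcode - ICONST_0;
                    } else if (opcode >= ILOAD_0 && opcode <= ILOAD_0 + 3) {    // iload_0..iload_3
                        stack[sp++] = locals[opcode - ILOAD_0];
                    } else if (opcode >= ISTORE_0 && opcode <= ISTORE_0 + 3) {  // istore_0..istore_3
                        locals[opcode - ISTORE_0] = stack[--sp];
                    } else {
                        throw new IllegalStateException("unsupported opcode " + opcode);
                    }
                    pc += 1;
            }
        }
    }

    // Runs fac with locals {this, n, result, i} and the stack limit reported by javap (Stack=2).
    static int fac(int n) {
        return run(FAC, new int[] {0, n, 0, 0}, 2);
    }
}
```

With the loop condition i<n as written in the source, MiniJvm.fac(5) returns 24: the loop multiplies by 2, 3, and 4 and stops when i reaches n. An operand stack of two slots is enough, which matches the Stack=2 figure in the javap output.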
Writing Factorial in “jasmin”
Jasmin is a Java Assembler Interface. It takes ASCII descriptions
for Java classes, written in a simple assembler-like syntax and using
the Java Virtual Machine instruction set. It converts them into binary
Java class files suitable for loading into a JVM implementation.
.class package Factorial
.super java/lang/Object
.method package <init>( )V
.limit stack 50
.limit locals 1
aload_0
invokenonvirtual java/lang/Object/<init>( )V
return
.end method
Writing Factorial in “jasmin” (continued)
.method package fac(I)I
.limit stack 50
.limit locals 4
iconst_1
istore 2
iconst_2
istore 3
Label_1:
iload 3
iload 1
if_icmplt Label_4
iconst_0
goto Label_5
Label_4:
iconst_1
Label_5:
ifeq Label_2
iload 2
iload 3
imul
dup
istore 2
pop
Label_3:
iload 3
dup
iconst_1
iadd
istore 3
pop
goto Label_1
Label_2:
iload 2
ireturn
iconst_0
ireturn
.end method